[2024-06-17 21:58:39,292][12645] Saving configuration to /workspace/metta/train_dir/p2.dr4/config.json... [2024-06-17 21:58:39,308][12645] Rollout worker 0 uses device cpu [2024-06-17 21:58:39,309][12645] Rollout worker 1 uses device cpu [2024-06-17 21:58:39,309][12645] Rollout worker 2 uses device cpu [2024-06-17 21:58:39,310][12645] Rollout worker 3 uses device cpu [2024-06-17 21:58:39,310][12645] Rollout worker 4 uses device cpu [2024-06-17 21:58:39,310][12645] Rollout worker 5 uses device cpu [2024-06-17 21:58:39,310][12645] Rollout worker 6 uses device cpu [2024-06-17 21:58:39,311][12645] Rollout worker 7 uses device cpu [2024-06-17 21:58:39,311][12645] Rollout worker 8 uses device cpu [2024-06-17 21:58:39,311][12645] Rollout worker 9 uses device cpu [2024-06-17 21:58:39,311][12645] Rollout worker 10 uses device cpu [2024-06-17 21:58:39,312][12645] Rollout worker 11 uses device cpu [2024-06-17 21:58:39,312][12645] Rollout worker 12 uses device cpu [2024-06-17 21:58:39,312][12645] Rollout worker 13 uses device cpu [2024-06-17 21:58:39,313][12645] Rollout worker 14 uses device cpu [2024-06-17 21:58:39,313][12645] Rollout worker 15 uses device cpu [2024-06-17 21:58:39,313][12645] Rollout worker 16 uses device cpu [2024-06-17 21:58:39,313][12645] Rollout worker 17 uses device cpu [2024-06-17 21:58:39,313][12645] Rollout worker 18 uses device cpu [2024-06-17 21:58:39,313][12645] Rollout worker 19 uses device cpu [2024-06-17 21:58:39,314][12645] Rollout worker 20 uses device cpu [2024-06-17 21:58:39,314][12645] Rollout worker 21 uses device cpu [2024-06-17 21:58:39,314][12645] Rollout worker 22 uses device cpu [2024-06-17 21:58:39,314][12645] Rollout worker 23 uses device cpu [2024-06-17 21:58:39,314][12645] Rollout worker 24 uses device cpu [2024-06-17 21:58:39,314][12645] Rollout worker 25 uses device cpu [2024-06-17 21:58:39,315][12645] Rollout worker 26 uses device cpu [2024-06-17 21:58:39,315][12645] Rollout worker 27 uses device cpu [2024-06-17 
21:58:39,315][12645] Rollout worker 28 uses device cpu [2024-06-17 21:58:39,315][12645] Rollout worker 29 uses device cpu [2024-06-17 21:58:39,315][12645] Rollout worker 30 uses device cpu [2024-06-17 21:58:39,315][12645] Rollout worker 31 uses device cpu [2024-06-17 21:58:39,889][12645] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-17 21:58:39,889][12645] InferenceWorker_p0-w0: min num requests: 10 [2024-06-17 21:58:39,964][12645] Starting all processes... [2024-06-17 21:58:39,965][12645] Starting process learner_proc0 [2024-06-17 21:58:40,196][12645] Starting all processes... [2024-06-17 21:58:40,199][12645] Starting process inference_proc0-0 [2024-06-17 21:58:40,199][12645] Starting process rollout_proc0 [2024-06-17 21:58:40,199][12645] Starting process rollout_proc1 [2024-06-17 21:58:40,199][12645] Starting process rollout_proc2 [2024-06-17 21:58:40,199][12645] Starting process rollout_proc3 [2024-06-17 21:58:40,200][12645] Starting process rollout_proc4 [2024-06-17 21:58:40,201][12645] Starting process rollout_proc5 [2024-06-17 21:58:40,249][12645] Starting process rollout_proc6 [2024-06-17 21:58:40,250][12645] Starting process rollout_proc7 [2024-06-17 21:58:40,251][12645] Starting process rollout_proc8 [2024-06-17 21:58:40,252][12645] Starting process rollout_proc9 [2024-06-17 21:58:40,252][12645] Starting process rollout_proc10 [2024-06-17 21:58:40,254][12645] Starting process rollout_proc11 [2024-06-17 21:58:40,254][12645] Starting process rollout_proc12 [2024-06-17 21:58:40,255][12645] Starting process rollout_proc13 [2024-06-17 21:58:40,256][12645] Starting process rollout_proc14 [2024-06-17 21:58:40,256][12645] Starting process rollout_proc15 [2024-06-17 21:58:40,256][12645] Starting process rollout_proc16 [2024-06-17 21:58:40,256][12645] Starting process rollout_proc17 [2024-06-17 21:58:40,257][12645] Starting process rollout_proc18 [2024-06-17 21:58:40,257][12645] Starting process rollout_proc19 [2024-06-17 21:58:40,258][12645] 
Starting process rollout_proc20 [2024-06-17 21:58:40,258][12645] Starting process rollout_proc21 [2024-06-17 21:58:40,261][12645] Starting process rollout_proc22 [2024-06-17 21:58:40,262][12645] Starting process rollout_proc23 [2024-06-17 21:58:40,265][12645] Starting process rollout_proc24 [2024-06-17 21:58:40,265][12645] Starting process rollout_proc25 [2024-06-17 21:58:40,265][12645] Starting process rollout_proc26 [2024-06-17 21:58:40,272][12645] Starting process rollout_proc27 [2024-06-17 21:58:40,279][12645] Starting process rollout_proc28 [2024-06-17 21:58:40,286][12645] Starting process rollout_proc29 [2024-06-17 21:58:40,293][12645] Starting process rollout_proc30 [2024-06-17 21:58:40,294][12645] Starting process rollout_proc31 [2024-06-17 21:58:42,210][12932] Worker 18 uses CPU cores [18] [2024-06-17 21:58:42,472][12887] Worker 4 uses CPU cores [4] [2024-06-17 21:58:42,490][12862] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-17 21:58:42,491][12862] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-17 21:58:42,499][12862] Num visible devices: 1 [2024-06-17 21:58:42,508][12889] Worker 5 uses CPU cores [5] [2024-06-17 21:58:42,512][12862] Setting fixed seed 0 [2024-06-17 21:58:42,513][12862] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-17 21:58:42,513][12862] Initializing actor-critic model on device cuda:0 [2024-06-17 21:58:42,529][12884] Worker 2 uses CPU cores [2] [2024-06-17 21:58:42,532][12929] Worker 15 uses CPU cores [15] [2024-06-17 21:58:42,542][12883] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-17 21:58:42,543][12883] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-17 21:58:42,553][12883] Num visible devices: 1 [2024-06-17 21:58:42,561][12885] Worker 1 uses CPU cores [1] [2024-06-17 21:58:42,631][12891] Worker 8 uses CPU cores [8] [2024-06-17 21:58:42,640][13069] Worker 31 uses CPU cores 
[31] [2024-06-17 21:58:42,663][12927] Worker 11 uses CPU cores [11] [2024-06-17 21:58:42,671][13035] Worker 27 uses CPU cores [27] [2024-06-17 21:58:42,680][12937] Worker 23 uses CPU cores [23] [2024-06-17 21:58:42,685][12925] Worker 12 uses CPU cores [12] [2024-06-17 21:58:42,688][12936] Worker 22 uses CPU cores [22] [2024-06-17 21:58:42,708][12882] Worker 0 uses CPU cores [0] [2024-06-17 21:58:42,725][12892] Worker 10 uses CPU cores [10] [2024-06-17 21:58:42,744][12935] Worker 21 uses CPU cores [21] [2024-06-17 21:58:42,749][12928] Worker 14 uses CPU cores [14] [2024-06-17 21:58:42,760][12933] Worker 19 uses CPU cores [19] [2024-06-17 21:58:42,784][12967] Worker 25 uses CPU cores [25] [2024-06-17 21:58:42,848][13068] Worker 30 uses CPU cores [30] [2024-06-17 21:58:42,859][12926] Worker 13 uses CPU cores [13] [2024-06-17 21:58:42,868][13067] Worker 29 uses CPU cores [29] [2024-06-17 21:58:42,869][12893] Worker 9 uses CPU cores [9] [2024-06-17 21:58:42,888][12931] Worker 17 uses CPU cores [17] [2024-06-17 21:58:42,898][12934] Worker 20 uses CPU cores [20] [2024-06-17 21:58:42,916][12890] Worker 7 uses CPU cores [7] [2024-06-17 21:58:42,931][12886] Worker 3 uses CPU cores [3] [2024-06-17 21:58:42,941][12888] Worker 6 uses CPU cores [6] [2024-06-17 21:58:43,003][13033] Worker 26 uses CPU cores [26] [2024-06-17 21:58:43,011][12930] Worker 16 uses CPU cores [16] [2024-06-17 21:58:43,052][13034] Worker 28 uses CPU cores [28] [2024-06-17 21:58:43,086][12970] Worker 24 uses CPU cores [24] [2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,380][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) 
[2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,381][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,384][12862] RunningMeanStd input shape: (1,) [2024-06-17 21:58:43,385][12862] RunningMeanStd input shape: (1,) [2024-06-17 21:58:43,385][12862] RunningMeanStd input shape: (1,) [2024-06-17 21:58:43,385][12862] RunningMeanStd input shape: (1,) [2024-06-17 21:58:43,385][12862] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:43,425][12862] RunningMeanStd input shape: (1,) [2024-06-17 21:58:43,429][12862] Created Actor Critic model with architecture: [2024-06-17 21:58:43,429][12862] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): 
RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (clock): RunningMeanStdInPlace() (converter): RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( 
(last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-17 21:58:43,499][12862] Using optimizer [2024-06-17 21:58:43,684][12862] No checkpoints found [2024-06-17 21:58:43,685][12862] Did not load from checkpoint, starting from scratch! [2024-06-17 21:58:43,685][12862] Initialized policy 0 weights for model version 0 [2024-06-17 21:58:43,686][12862] LearnerWorker_p0 finished initialization! 
[2024-06-17 21:58:43,686][12862] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,452][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,453][12883] RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,456][12883] RunningMeanStd input shape: (1,) [2024-06-17 21:58:44,457][12883] RunningMeanStd input shape: (1,) [2024-06-17 21:58:44,457][12883] RunningMeanStd input shape: (1,) [2024-06-17 21:58:44,457][12883] RunningMeanStd input shape: (1,) [2024-06-17 21:58:44,457][12883] 
RunningMeanStd input shape: (11, 11) [2024-06-17 21:58:44,497][12883] RunningMeanStd input shape: (1,) [2024-06-17 21:58:44,520][12645] Inference worker 0-0 is ready! [2024-06-17 21:58:44,520][12645] All inference workers are ready! Signal rollout workers to start! [2024-06-17 21:58:46,994][12645] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 0. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-17 21:58:47,270][12934] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,281][12970] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,282][12926] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,289][12937] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,301][12889] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,313][12885] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,332][12967] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,348][13067] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,353][12930] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,361][12887] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,361][13034] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,375][12884] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,375][12886] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,383][12888] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,385][12936] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,390][12935] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,392][13068] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,403][12929] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,404][12933] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,406][12891] Decorrelating experience for 0 frames... 
[2024-06-17 21:58:47,433][12892] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,443][13033] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,453][12925] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,459][13069] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,457][13035] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,460][12928] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,468][12932] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,468][12893] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,475][12882] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,481][12927] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,488][12931] Decorrelating experience for 0 frames... [2024-06-17 21:58:47,494][12890] Decorrelating experience for 0 frames... [2024-06-17 21:58:48,401][12934] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,468][12970] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,470][12926] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,525][12886] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,547][13034] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,547][13067] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,559][12888] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,618][12884] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,647][12885] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,663][13068] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,664][12967] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,670][12892] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,671][12937] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,699][12882] Decorrelating experience for 256 frames... 
[2024-06-17 21:58:48,707][12889] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,712][12930] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,725][12893] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,726][12931] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,743][12891] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,744][12935] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,752][12936] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,764][12928] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,773][12887] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,776][13033] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,783][12925] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,802][12927] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,802][12932] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,803][12929] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,809][12933] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,823][13069] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,830][12890] Decorrelating experience for 256 frames... [2024-06-17 21:58:48,852][13035] Decorrelating experience for 256 frames... [2024-06-17 21:58:51,994][12645] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 748.0. Samples: 3740. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-17 21:58:56,994][12645] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 0. Throughput: 0: 31075.6. Samples: 310760. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-17 21:58:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:58:57,045][13069] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-17 21:58:57,166][12892] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-17 21:58:57,212][12926] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-17 21:58:57,246][12884] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-17 21:58:57,280][12886] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-17 21:58:57,304][12885] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-17 21:58:57,347][12893] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-17 21:58:57,366][13034] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-17 21:58:57,404][12862] Signal inference workers to stop experience collection... [2024-06-17 21:58:57,404][12925] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-17 21:58:57,409][12887] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-17 21:58:57,412][13068] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-17 21:58:57,417][12883] InferenceWorker_p0-w0: stopping experience collection [2024-06-17 21:58:57,424][12929] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-17 21:58:58,026][12862] Signal inference workers to resume experience collection... 
[2024-06-17 21:58:58,026][12883] InferenceWorker_p0-w0: resuming experience collection [2024-06-17 21:58:58,052][12889] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-17 21:58:58,279][12927] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-17 21:58:58,409][12970] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-17 21:58:58,455][13067] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-17 21:58:58,465][12891] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-17 21:58:58,528][12888] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-17 21:58:58,537][12934] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-17 21:58:58,592][12928] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-17 21:58:58,594][12931] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-17 21:58:58,595][12930] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-17 21:58:58,595][12935] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-17 21:58:58,595][13033] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-17 21:58:58,595][13035] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-17 21:58:58,613][12967] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-17 21:58:58,614][12936] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-17 21:58:58,647][12890] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-17 21:58:58,696][12932] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-17 21:58:58,696][12937] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-17 21:58:58,747][12933] Worker 19, sleep for 89.062 sec to 
decorrelate experience collection [2024-06-17 21:58:59,226][12883] Updated weights for policy 0, policy_version 10 (0.0013) [2024-06-17 21:58:59,886][12645] Heartbeat connected on Batcher_0 [2024-06-17 21:58:59,888][12645] Heartbeat connected on LearnerWorker_p0 [2024-06-17 21:58:59,902][12645] Heartbeat connected on RolloutWorker_w0 [2024-06-17 21:58:59,953][12645] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-17 21:59:01,994][12645] Fps is (10 sec: 16383.6, 60 sec: 10922.5, 300 sec: 10922.5). Total num frames: 163840. Throughput: 0: 21925.0. Samples: 328880. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-17 21:59:01,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:01,996][12862] Saving new best policy, reward=0.000! [2024-06-17 21:59:02,015][12885] Worker 1 awakens! [2024-06-17 21:59:02,020][12645] Heartbeat connected on RolloutWorker_w1 [2024-06-17 21:59:06,668][12884] Worker 2 awakens! [2024-06-17 21:59:06,673][12645] Heartbeat connected on RolloutWorker_w2 [2024-06-17 21:59:06,994][12645] Fps is (10 sec: 16384.3, 60 sec: 8192.0, 300 sec: 8192.0). Total num frames: 163840. Throughput: 0: 17032.0. Samples: 340640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-17 21:59:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:11,413][12886] Worker 3 awakens! [2024-06-17 21:59:11,423][12645] Heartbeat connected on RolloutWorker_w3 [2024-06-17 21:59:11,994][12645] Fps is (10 sec: 3276.9, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 196608. Throughput: 0: 14449.6. Samples: 361240. Policy #0 lag: (min: 0.0, avg: 1.1, max: 10.0) [2024-06-17 21:59:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:16,252][12887] Worker 4 awakens! [2024-06-17 21:59:16,258][12645] Heartbeat connected on RolloutWorker_w4 [2024-06-17 21:59:16,994][12645] Fps is (10 sec: 6553.6, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 229376. Throughput: 0: 12552.0. Samples: 376560. 
Policy #0 lag: (min: 0.0, avg: 4.4, max: 12.0) [2024-06-17 21:59:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:21,588][12889] Worker 5 awakens! [2024-06-17 21:59:21,594][12645] Heartbeat connected on RolloutWorker_w5 [2024-06-17 21:59:21,994][12645] Fps is (10 sec: 8192.0, 60 sec: 7958.0, 300 sec: 7958.0). Total num frames: 278528. Throughput: 0: 12174.9. Samples: 426120. Policy #0 lag: (min: 0.0, avg: 1.2, max: 3.0) [2024-06-17 21:59:22,001][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:26,578][12883] Updated weights for policy 0, policy_version 20 (0.0013) [2024-06-17 21:59:26,752][12888] Worker 6 awakens! [2024-06-17 21:59:26,758][12645] Heartbeat connected on RolloutWorker_w6 [2024-06-17 21:59:26,994][12645] Fps is (10 sec: 9830.4, 60 sec: 8192.0, 300 sec: 8192.0). Total num frames: 327680. Throughput: 0: 12341.0. Samples: 493640. Policy #0 lag: (min: 0.0, avg: 1.8, max: 4.0) [2024-06-17 21:59:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:31,559][12890] Worker 7 awakens! [2024-06-17 21:59:31,567][12645] Heartbeat connected on RolloutWorker_w7 [2024-06-17 21:59:31,994][12645] Fps is (10 sec: 13107.0, 60 sec: 9102.2, 300 sec: 9102.2). Total num frames: 409600. Throughput: 0: 12029.3. Samples: 541320. Policy #0 lag: (min: 0.0, avg: 2.6, max: 5.0) [2024-06-17 21:59:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:36,064][12891] Worker 8 awakens! [2024-06-17 21:59:36,070][12645] Heartbeat connected on RolloutWorker_w8 [2024-06-17 21:59:36,598][12883] Updated weights for policy 0, policy_version 30 (0.0012) [2024-06-17 21:59:36,994][12645] Fps is (10 sec: 16383.9, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 491520. Throughput: 0: 14041.8. Samples: 635620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 29.0) [2024-06-17 21:59:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 21:59:39,635][12893] Worker 9 awakens! 
[2024-06-17 21:59:39,644][12645] Heartbeat connected on RolloutWorker_w9 [2024-06-17 21:59:41,994][12645] Fps is (10 sec: 18022.5, 60 sec: 10724.1, 300 sec: 10724.1). Total num frames: 589824. Throughput: 0: 9669.8. Samples: 745900. Policy #0 lag: (min: 0.0, avg: 3.4, max: 6.0) [2024-06-17 21:59:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:44,140][12892] Worker 10 awakens! [2024-06-17 21:59:44,146][12645] Heartbeat connected on RolloutWorker_w10 [2024-06-17 21:59:45,349][12883] Updated weights for policy 0, policy_version 40 (0.0016) [2024-06-17 21:59:46,994][12645] Fps is (10 sec: 19660.6, 60 sec: 11468.8, 300 sec: 11468.8). Total num frames: 688128. Throughput: 0: 10766.7. Samples: 813380. Policy #0 lag: (min: 0.0, avg: 14.2, max: 37.0) [2024-06-17 21:59:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:49,862][12927] Worker 11 awakens! [2024-06-17 21:59:49,869][12645] Heartbeat connected on RolloutWorker_w11 [2024-06-17 21:59:51,577][12883] Updated weights for policy 0, policy_version 50 (0.0015) [2024-06-17 21:59:51,994][12645] Fps is (10 sec: 22937.5, 60 sec: 13653.3, 300 sec: 12603.1). Total num frames: 819200. Throughput: 0: 13587.5. Samples: 952080. Policy #0 lag: (min: 0.0, avg: 4.1, max: 8.0) [2024-06-17 21:59:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 21:59:53,662][12925] Worker 12 awakens! [2024-06-17 21:59:53,669][12645] Heartbeat connected on RolloutWorker_w12 [2024-06-17 21:59:56,994][12645] Fps is (10 sec: 26214.4, 60 sec: 15837.9, 300 sec: 13575.3). Total num frames: 950272. Throughput: 0: 16641.7. Samples: 1110120. Policy #0 lag: (min: 0.0, avg: 4.8, max: 9.0) [2024-06-17 21:59:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 21:59:58,250][12926] Worker 13 awakens! 
[2024-06-17 21:59:58,259][12645] Heartbeat connected on RolloutWorker_w13 [2024-06-17 21:59:58,311][12883] Updated weights for policy 0, policy_version 60 (0.0019) [2024-06-17 22:00:01,994][12645] Fps is (10 sec: 26214.3, 60 sec: 15291.8, 300 sec: 14417.9). Total num frames: 1081344. Throughput: 0: 18011.9. Samples: 1187100. Policy #0 lag: (min: 0.0, avg: 21.6, max: 60.0) [2024-06-17 22:00:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:04,319][12928] Worker 14 awakens! [2024-06-17 22:00:04,327][12645] Heartbeat connected on RolloutWorker_w14 [2024-06-17 22:00:04,766][12883] Updated weights for policy 0, policy_version 70 (0.0022) [2024-06-17 22:00:06,994][12645] Fps is (10 sec: 27852.6, 60 sec: 17749.3, 300 sec: 15360.0). Total num frames: 1228800. Throughput: 0: 20463.0. Samples: 1346960. Policy #0 lag: (min: 0.0, avg: 4.0, max: 10.0) [2024-06-17 22:00:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:07,836][12929] Worker 15 awakens! [2024-06-17 22:00:07,844][12645] Heartbeat connected on RolloutWorker_w15 [2024-06-17 22:00:10,146][12883] Updated weights for policy 0, policy_version 80 (0.0020) [2024-06-17 22:00:11,994][12645] Fps is (10 sec: 27852.6, 60 sec: 19387.7, 300 sec: 15998.5). Total num frames: 1359872. Throughput: 0: 22864.8. Samples: 1522560. Policy #0 lag: (min: 0.0, avg: 4.5, max: 10.0) [2024-06-17 22:00:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:13,692][12930] Worker 16 awakens! [2024-06-17 22:00:13,703][12645] Heartbeat connected on RolloutWorker_w16 [2024-06-17 22:00:15,942][12883] Updated weights for policy 0, policy_version 90 (0.0027) [2024-06-17 22:00:16,994][12645] Fps is (10 sec: 27852.9, 60 sec: 21299.1, 300 sec: 16748.1). Total num frames: 1507328. Throughput: 0: 23800.4. Samples: 1612340. Policy #0 lag: (min: 0.0, avg: 4.5, max: 12.0) [2024-06-17 22:00:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:18,380][12931] Worker 17 awakens! 
[2024-06-17 22:00:18,391][12645] Heartbeat connected on RolloutWorker_w17 [2024-06-17 22:00:21,278][12883] Updated weights for policy 0, policy_version 100 (0.0026) [2024-06-17 22:00:21,994][12645] Fps is (10 sec: 29491.7, 60 sec: 22937.6, 300 sec: 17418.8). Total num frames: 1654784. Throughput: 0: 25725.8. Samples: 1793280. Policy #0 lag: (min: 0.0, avg: 6.2, max: 12.0) [2024-06-17 22:00:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:23,169][12932] Worker 18 awakens! [2024-06-17 22:00:23,179][12645] Heartbeat connected on RolloutWorker_w18 [2024-06-17 22:00:26,576][12883] Updated weights for policy 0, policy_version 110 (0.0028) [2024-06-17 22:00:26,994][12645] Fps is (10 sec: 29491.2, 60 sec: 24575.9, 300 sec: 18022.4). Total num frames: 1802240. Throughput: 0: 27466.2. Samples: 1981880. Policy #0 lag: (min: 0.0, avg: 7.0, max: 13.0) [2024-06-17 22:00:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:27,908][12933] Worker 19 awakens! [2024-06-17 22:00:27,930][12645] Heartbeat connected on RolloutWorker_w19 [2024-06-17 22:00:31,211][12883] Updated weights for policy 0, policy_version 120 (0.0030) [2024-06-17 22:00:31,994][12645] Fps is (10 sec: 32767.5, 60 sec: 26214.4, 300 sec: 18880.6). Total num frames: 1982464. Throughput: 0: 27990.6. Samples: 2072960. Policy #0 lag: (min: 0.0, avg: 7.0, max: 13.0) [2024-06-17 22:00:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:00:31,995][12862] Saving new best policy, reward=0.001! [2024-06-17 22:00:32,384][12934] Worker 20 awakens! [2024-06-17 22:00:32,400][12645] Heartbeat connected on RolloutWorker_w20 [2024-06-17 22:00:36,041][12883] Updated weights for policy 0, policy_version 130 (0.0036) [2024-06-17 22:00:36,994][12645] Fps is (10 sec: 34406.7, 60 sec: 27579.7, 300 sec: 19511.9). Total num frames: 2146304. Throughput: 0: 29304.9. Samples: 2270800. 
Policy #0 lag: (min: 0.0, avg: 6.5, max: 13.0) [2024-06-17 22:00:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000131_2146304.pth... [2024-06-17 22:00:37,133][12935] Worker 21 awakens! [2024-06-17 22:00:37,146][12645] Heartbeat connected on RolloutWorker_w21 [2024-06-17 22:00:41,469][12883] Updated weights for policy 0, policy_version 140 (0.0032) [2024-06-17 22:00:41,839][12936] Worker 22 awakens! [2024-06-17 22:00:41,852][12645] Heartbeat connected on RolloutWorker_w22 [2024-06-17 22:00:41,994][12645] Fps is (10 sec: 32767.9, 60 sec: 28671.9, 300 sec: 20088.2). Total num frames: 2310144. Throughput: 0: 30242.2. Samples: 2471020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 14.0) [2024-06-17 22:00:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:46,324][12883] Updated weights for policy 0, policy_version 150 (0.0032) [2024-06-17 22:00:46,608][12937] Worker 23 awakens! [2024-06-17 22:00:46,622][12645] Heartbeat connected on RolloutWorker_w23 [2024-06-17 22:00:46,994][12645] Fps is (10 sec: 32767.9, 60 sec: 29764.3, 300 sec: 20616.5). Total num frames: 2473984. Throughput: 0: 30896.5. Samples: 2577440. Policy #0 lag: (min: 0.0, avg: 16.3, max: 150.0) [2024-06-17 22:00:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:50,528][12883] Updated weights for policy 0, policy_version 160 (0.0037) [2024-06-17 22:00:51,010][12970] Worker 24 awakens! [2024-06-17 22:00:51,025][12645] Heartbeat connected on RolloutWorker_w24 [2024-06-17 22:00:51,994][12645] Fps is (10 sec: 34407.0, 60 sec: 30583.5, 300 sec: 21233.7). Total num frames: 2654208. Throughput: 0: 32045.9. Samples: 2789020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 17.0) [2024-06-17 22:00:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:00:55,328][12883] Updated weights for policy 0, policy_version 170 (0.0027) [2024-06-17 22:00:55,901][12967] Worker 25 awakens! 
[2024-06-17 22:00:55,916][12645] Heartbeat connected on RolloutWorker_w25 [2024-06-17 22:00:56,994][12645] Fps is (10 sec: 36044.4, 60 sec: 31402.6, 300 sec: 21803.3). Total num frames: 2834432. Throughput: 0: 32987.5. Samples: 3007000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 17.0) [2024-06-17 22:00:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:00:59,055][12883] Updated weights for policy 0, policy_version 180 (0.0038) [2024-06-17 22:01:00,572][13033] Worker 26 awakens! [2024-06-17 22:01:00,588][12645] Heartbeat connected on RolloutWorker_w26 [2024-06-17 22:01:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 32495.0, 300 sec: 22452.2). Total num frames: 3031040. Throughput: 0: 33505.9. Samples: 3120100. Policy #0 lag: (min: 0.0, avg: 7.3, max: 16.0) [2024-06-17 22:01:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:01:03,400][12883] Updated weights for policy 0, policy_version 190 (0.0035) [2024-06-17 22:01:05,259][13035] Worker 27 awakens! [2024-06-17 22:01:05,274][12645] Heartbeat connected on RolloutWorker_w27 [2024-06-17 22:01:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 33041.1, 300 sec: 22937.6). Total num frames: 3211264. Throughput: 0: 34543.5. Samples: 3347740. Policy #0 lag: (min: 0.0, avg: 43.3, max: 192.0) [2024-06-17 22:01:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:01:08,545][12883] Updated weights for policy 0, policy_version 200 (0.0033) [2024-06-17 22:01:08,716][13034] Worker 28 awakens! [2024-06-17 22:01:08,730][12645] Heartbeat connected on RolloutWorker_w28 [2024-06-17 22:01:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 34406.4, 300 sec: 23615.5). Total num frames: 3424256. Throughput: 0: 35381.8. Samples: 3574060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 18.0) [2024-06-17 22:01:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:01:12,805][12883] Updated weights for policy 0, policy_version 210 (0.0042) [2024-06-17 22:01:14,492][13067] Worker 29 awakens! 
[2024-06-17 22:01:14,508][12645] Heartbeat connected on RolloutWorker_w29 [2024-06-17 22:01:16,994][12645] Fps is (10 sec: 36045.1, 60 sec: 34406.5, 300 sec: 23811.4). Total num frames: 3571712. Throughput: 0: 35926.8. Samples: 3689660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-17 22:01:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:01:17,243][12883] Updated weights for policy 0, policy_version 220 (0.0039) [2024-06-17 22:01:18,140][13068] Worker 30 awakens! [2024-06-17 22:01:18,155][12645] Heartbeat connected on RolloutWorker_w30 [2024-06-17 22:01:21,148][12883] Updated weights for policy 0, policy_version 230 (0.0043) [2024-06-17 22:01:21,994][12645] Fps is (10 sec: 37683.8, 60 sec: 35771.7, 300 sec: 24523.2). Total num frames: 3801088. Throughput: 0: 36589.8. Samples: 3917340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 22:01:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:01:22,460][13069] Worker 31 awakens! [2024-06-17 22:01:22,479][12645] Heartbeat connected on RolloutWorker_w31 [2024-06-17 22:01:25,914][12883] Updated weights for policy 0, policy_version 240 (0.0035) [2024-06-17 22:01:26,995][12645] Fps is (10 sec: 42592.2, 60 sec: 36590.1, 300 sec: 24985.4). Total num frames: 3997696. Throughput: 0: 37200.7. Samples: 4145100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-17 22:01:26,996][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:01:29,707][12883] Updated weights for policy 0, policy_version 250 (0.0038) [2024-06-17 22:01:31,996][12645] Fps is (10 sec: 37674.7, 60 sec: 36589.6, 300 sec: 25320.4). Total num frames: 4177920. Throughput: 0: 37460.9. Samples: 4263260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-17 22:01:31,996][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:01:34,206][12883] Updated weights for policy 0, policy_version 260 (0.0044) [2024-06-17 22:01:36,994][12645] Fps is (10 sec: 37688.4, 60 sec: 37137.1, 300 sec: 25732.5). 
Total num frames: 4374528. Throughput: 0: 37856.8. Samples: 4492580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-17 22:01:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:01:37,021][12862] Saving new best policy, reward=0.003! [2024-06-17 22:01:37,967][12883] Updated weights for policy 0, policy_version 270 (0.0032) [2024-06-17 22:01:41,994][12645] Fps is (10 sec: 36052.7, 60 sec: 37137.1, 300 sec: 25933.5). Total num frames: 4538368. Throughput: 0: 38109.4. Samples: 4721920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 22:01:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:01:42,900][12883] Updated weights for policy 0, policy_version 280 (0.0036) [2024-06-17 22:01:46,537][12883] Updated weights for policy 0, policy_version 290 (0.0035) [2024-06-17 22:01:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 37956.3, 300 sec: 26396.4). Total num frames: 4751360. Throughput: 0: 38054.2. Samples: 4832540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-17 22:01:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:01:51,298][12883] Updated weights for policy 0, policy_version 300 (0.0033) [2024-06-17 22:01:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 37956.3, 300 sec: 26657.2). Total num frames: 4931584. Throughput: 0: 38125.0. Samples: 5063360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 22:01:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:01:55,126][12883] Updated weights for policy 0, policy_version 310 (0.0031) [2024-06-17 22:01:56,996][12645] Fps is (10 sec: 37674.9, 60 sec: 38228.0, 300 sec: 26990.2). Total num frames: 5128192. Throughput: 0: 38128.4. Samples: 5289920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-17 22:01:56,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:01:59,722][12883] Updated weights for policy 0, policy_version 320 (0.0040) [2024-06-17 22:02:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 38502.4, 300 sec: 27390.7). Total num frames: 5341184. Throughput: 0: 38093.8. Samples: 5403880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-17 22:02:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:02:03,931][12883] Updated weights for policy 0, policy_version 330 (0.0028) [2024-06-17 22:02:06,998][12645] Fps is (10 sec: 37675.6, 60 sec: 38226.7, 300 sec: 27524.5). Total num frames: 5505024. Throughput: 0: 38067.5. Samples: 5630540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 22:02:06,998][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:02:08,200][12883] Updated weights for policy 0, policy_version 340 (0.0025) [2024-06-17 22:02:10,248][12862] Signal inference workers to stop experience collection... (50 times) [2024-06-17 22:02:10,249][12862] Signal inference workers to resume experience collection... (50 times) [2024-06-17 22:02:10,287][12883] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-17 22:02:10,288][12883] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-17 22:02:11,994][12645] Fps is (10 sec: 36044.6, 60 sec: 37956.3, 300 sec: 27812.8). Total num frames: 5701632. Throughput: 0: 38143.4. Samples: 5861500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-17 22:02:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:02:12,639][12883] Updated weights for policy 0, policy_version 350 (0.0047) [2024-06-17 22:02:16,899][12883] Updated weights for policy 0, policy_version 360 (0.0028) [2024-06-17 22:02:16,994][12645] Fps is (10 sec: 39338.2, 60 sec: 38775.4, 300 sec: 28086.9). Total num frames: 5898240. Throughput: 0: 38178.7. Samples: 5981220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 22:02:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:02:21,563][12883] Updated weights for policy 0, policy_version 370 (0.0040) [2024-06-17 22:02:21,994][12645] Fps is (10 sec: 37683.6, 60 sec: 37956.3, 300 sec: 28271.9). Total num frames: 6078464. Throughput: 0: 38054.3. Samples: 6205020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-17 22:02:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:02:25,128][12883] Updated weights for policy 0, policy_version 380 (0.0039) [2024-06-17 22:02:26,994][12645] Fps is (10 sec: 37683.1, 60 sec: 37957.1, 300 sec: 28523.0). Total num frames: 6275072. Throughput: 0: 38260.8. Samples: 6443660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-17 22:02:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:02:29,687][12883] Updated weights for policy 0, policy_version 390 (0.0042) [2024-06-17 22:02:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 37957.7, 300 sec: 28690.2). Total num frames: 6455296. Throughput: 0: 38264.6. Samples: 6554440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-17 22:02:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:02:33,991][12883] Updated weights for policy 0, policy_version 400 (0.0042) [2024-06-17 22:02:37,000][12645] Fps is (10 sec: 39295.5, 60 sec: 38225.1, 300 sec: 28991.7). Total num frames: 6668288. Throughput: 0: 38195.1. Samples: 6782400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-17 22:02:37,001][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:02:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000407_6668288.pth... [2024-06-17 22:02:38,182][12883] Updated weights for policy 0, policy_version 410 (0.0035) [2024-06-17 22:02:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 38775.5, 300 sec: 29212.3). Total num frames: 6864896. Throughput: 0: 38193.5. Samples: 7008540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 22:02:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:02:42,930][12883] Updated weights for policy 0, policy_version 420 (0.0029) [2024-06-17 22:02:46,848][12883] Updated weights for policy 0, policy_version 430 (0.0051) [2024-06-17 22:02:46,994][12645] Fps is (10 sec: 37708.2, 60 sec: 38229.3, 300 sec: 29354.7). Total num frames: 7045120. Throughput: 0: 38266.1. Samples: 7125860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-17 22:02:46,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:02:51,593][12883] Updated weights for policy 0, policy_version 440 (0.0037) [2024-06-17 22:02:51,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38229.3, 300 sec: 29491.2). Total num frames: 7225344. Throughput: 0: 38185.3. Samples: 7348720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-17 22:02:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:02:55,338][12883] Updated weights for policy 0, policy_version 450 (0.0039) [2024-06-17 22:02:56,994][12645] Fps is (10 sec: 37682.8, 60 sec: 38230.6, 300 sec: 29687.8). Total num frames: 7421952. Throughput: 0: 38292.3. Samples: 7584660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-17 22:02:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:00,296][12883] Updated weights for policy 0, policy_version 460 (0.0041) [2024-06-17 22:03:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 38229.3, 300 sec: 29941.0). Total num frames: 7634944. Throughput: 0: 38268.0. Samples: 7703280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-17 22:03:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:03,552][12883] Updated weights for policy 0, policy_version 470 (0.0028) [2024-06-17 22:03:06,994][12645] Fps is (10 sec: 37683.8, 60 sec: 38232.0, 300 sec: 29995.3). Total num frames: 7798784. Throughput: 0: 38314.1. Samples: 7929160. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-17 22:03:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:08,781][12883] Updated weights for policy 0, policy_version 480 (0.0037) [2024-06-17 22:03:11,994][12645] Fps is (10 sec: 34406.3, 60 sec: 37956.2, 300 sec: 30109.5). Total num frames: 7979008. Throughput: 0: 38020.4. Samples: 8154580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-17 22:03:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:12,493][12883] Updated weights for policy 0, policy_version 490 (0.0043) [2024-06-17 22:03:16,624][12883] Updated weights for policy 0, policy_version 500 (0.0039) [2024-06-17 22:03:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 38229.3, 300 sec: 30340.7). Total num frames: 8192000. Throughput: 0: 38160.7. Samples: 8271680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 22:03:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:20,950][12883] Updated weights for policy 0, policy_version 510 (0.0034) [2024-06-17 22:03:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 38502.3, 300 sec: 30504.0). Total num frames: 8388608. Throughput: 0: 38022.1. Samples: 8493140. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-06-17 22:03:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:26,101][12883] Updated weights for policy 0, policy_version 520 (0.0037) [2024-06-17 22:03:26,994][12645] Fps is (10 sec: 34407.0, 60 sec: 37683.3, 300 sec: 30485.9). Total num frames: 8536064. Throughput: 0: 38062.7. Samples: 8721360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-17 22:03:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:03:29,631][12883] Updated weights for policy 0, policy_version 530 (0.0041) [2024-06-17 22:03:31,994][12645] Fps is (10 sec: 36044.9, 60 sec: 38229.3, 300 sec: 30698.4). Total num frames: 8749056. Throughput: 0: 37932.5. Samples: 8832820. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 22:03:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:33,993][12883] Updated weights for policy 0, policy_version 540 (0.0044) [2024-06-17 22:03:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 37960.5, 300 sec: 30847.1). Total num frames: 8945664. Throughput: 0: 38247.2. Samples: 9069840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 22:03:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:37,800][12883] Updated weights for policy 0, policy_version 550 (0.0041) [2024-06-17 22:03:41,994][12645] Fps is (10 sec: 37683.4, 60 sec: 37683.2, 300 sec: 30935.2). Total num frames: 9125888. Throughput: 0: 37891.7. Samples: 9289780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-17 22:03:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:43,025][12883] Updated weights for policy 0, policy_version 560 (0.0042) [2024-06-17 22:03:46,157][12862] Signal inference workers to stop experience collection... (100 times) [2024-06-17 22:03:46,158][12862] Signal inference workers to resume experience collection... (100 times) [2024-06-17 22:03:46,183][12883] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-17 22:03:46,183][12883] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-17 22:03:46,614][12883] Updated weights for policy 0, policy_version 570 (0.0037) [2024-06-17 22:03:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38229.4, 300 sec: 31657.2). Total num frames: 9338880. Throughput: 0: 37913.8. Samples: 9409400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 22:03:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:51,292][12883] Updated weights for policy 0, policy_version 580 (0.0031) [2024-06-17 22:03:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 38229.3, 300 sec: 32268.1). Total num frames: 9519104. Throughput: 0: 37791.9. Samples: 9629800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-17 22:03:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:03:55,257][12883] Updated weights for policy 0, policy_version 590 (0.0046) [2024-06-17 22:03:56,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37956.4, 300 sec: 32323.7). Total num frames: 9699328. Throughput: 0: 37892.6. Samples: 9859740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-17 22:03:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:04:00,155][12883] Updated weights for policy 0, policy_version 600 (0.0026) [2024-06-17 22:04:01,999][12645] Fps is (10 sec: 36027.8, 60 sec: 37407.1, 300 sec: 32934.1). Total num frames: 9879552. Throughput: 0: 37807.2. Samples: 9973180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 22:04:01,999][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:04:03,848][12883] Updated weights for policy 0, policy_version 610 (0.0043) [2024-06-17 22:04:06,994][12645] Fps is (10 sec: 37682.5, 60 sec: 37956.2, 300 sec: 33490.0). Total num frames: 10076160. Throughput: 0: 37949.8. Samples: 10200880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-17 22:04:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:04:08,338][12883] Updated weights for policy 0, policy_version 620 (0.0037) [2024-06-17 22:04:11,994][12645] Fps is (10 sec: 39340.9, 60 sec: 38229.4, 300 sec: 34045.4). Total num frames: 10272768. Throughput: 0: 37978.2. Samples: 10430380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 22:04:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:04:12,683][12883] Updated weights for policy 0, policy_version 630 (0.0044) [2024-06-17 22:04:16,996][12645] Fps is (10 sec: 37675.0, 60 sec: 37681.9, 300 sec: 34489.4). Total num frames: 10452992. Throughput: 0: 37893.7. Samples: 10538120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 22:04:16,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:04:17,432][12883] Updated weights for policy 0, policy_version 640 (0.0039) [2024-06-17 22:04:21,363][12883] Updated weights for policy 0, policy_version 650 (0.0038) [2024-06-17 22:04:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 37956.2, 300 sec: 35045.1). Total num frames: 10665984. Throughput: 0: 37618.1. Samples: 10762660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-17 22:04:21,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:04:25,705][12883] Updated weights for policy 0, policy_version 660 (0.0042) [2024-06-17 22:04:26,994][12645] Fps is (10 sec: 37691.7, 60 sec: 38229.3, 300 sec: 35322.8). Total num frames: 10829824. Throughput: 0: 37841.3. Samples: 10992640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-17 22:04:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:04:30,025][12883] Updated weights for policy 0, policy_version 670 (0.0053) [2024-06-17 22:04:31,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37956.3, 300 sec: 35711.6). Total num frames: 11026432. Throughput: 0: 37690.2. Samples: 11105460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-17 22:04:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:04:34,730][12883] Updated weights for policy 0, policy_version 680 (0.0041) [2024-06-17 22:04:36,994][12645] Fps is (10 sec: 36044.5, 60 sec: 37410.1, 300 sec: 35933.7). Total num frames: 11190272. Throughput: 0: 37764.5. Samples: 11329200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-17 22:04:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:04:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000683_11190272.pth... 
[2024-06-17 22:04:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000131_2146304.pth [2024-06-17 22:04:38,696][12883] Updated weights for policy 0, policy_version 690 (0.0031) [2024-06-17 22:04:41,994][12645] Fps is (10 sec: 36044.5, 60 sec: 37683.1, 300 sec: 36266.9). Total num frames: 11386880. Throughput: 0: 37885.6. Samples: 11564600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 22:04:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:04:43,031][12883] Updated weights for policy 0, policy_version 700 (0.0026) [2024-06-17 22:04:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 37410.1, 300 sec: 36489.1). Total num frames: 11583488. Throughput: 0: 37850.3. Samples: 11676260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-17 22:04:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:04:47,411][12883] Updated weights for policy 0, policy_version 710 (0.0029) [2024-06-17 22:04:51,842][12883] Updated weights for policy 0, policy_version 720 (0.0040) [2024-06-17 22:04:51,996][12645] Fps is (10 sec: 40951.2, 60 sec: 37954.9, 300 sec: 36766.5). Total num frames: 11796480. Throughput: 0: 37769.3. Samples: 11900580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-17 22:04:52,005][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:04:56,156][12883] Updated weights for policy 0, policy_version 730 (0.0044) [2024-06-17 22:04:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 37956.2, 300 sec: 36933.4). Total num frames: 11976704. Throughput: 0: 37727.4. Samples: 12128120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-17 22:04:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:05:00,355][12883] Updated weights for policy 0, policy_version 740 (0.0035) [2024-06-17 22:05:01,994][12645] Fps is (10 sec: 37691.3, 60 sec: 38232.3, 300 sec: 37100.0). Total num frames: 12173312. Throughput: 0: 37967.6. Samples: 12246580. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-17 22:05:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:05:02,677][12862] Signal inference workers to stop experience collection... (150 times) [2024-06-17 22:05:02,710][12883] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-17 22:05:02,798][12862] Signal inference workers to resume experience collection... (150 times) [2024-06-17 22:05:02,798][12883] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-17 22:05:04,953][12883] Updated weights for policy 0, policy_version 750 (0.0048) [2024-06-17 22:05:06,994][12645] Fps is (10 sec: 37683.0, 60 sec: 37956.2, 300 sec: 37266.7). Total num frames: 12353536. Throughput: 0: 37930.6. Samples: 12469540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-17 22:05:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:05:09,425][12883] Updated weights for policy 0, policy_version 760 (0.0028) [2024-06-17 22:05:11,994][12645] Fps is (10 sec: 36045.0, 60 sec: 37683.1, 300 sec: 37377.7). Total num frames: 12533760. Throughput: 0: 37849.3. Samples: 12695860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-17 22:05:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:05:13,698][12883] Updated weights for policy 0, policy_version 770 (0.0034) [2024-06-17 22:05:16,995][12645] Fps is (10 sec: 39317.9, 60 sec: 38230.1, 300 sec: 37599.7). Total num frames: 12746752. Throughput: 0: 38100.5. Samples: 12820020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-17 22:05:16,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:05:17,882][12883] Updated weights for policy 0, policy_version 780 (0.0024) [2024-06-17 22:05:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 37683.2, 300 sec: 37711.0). Total num frames: 12926976. Throughput: 0: 38227.1. Samples: 13049420. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-17 22:05:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:05:22,220][12883] Updated weights for policy 0, policy_version 790 (0.0033) [2024-06-17 22:05:26,260][12883] Updated weights for policy 0, policy_version 800 (0.0046) [2024-06-17 22:05:26,994][12645] Fps is (10 sec: 36048.3, 60 sec: 37956.2, 300 sec: 37711.0). Total num frames: 13107200. Throughput: 0: 37986.2. Samples: 13273980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-17 22:05:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:05:30,897][12883] Updated weights for policy 0, policy_version 810 (0.0041) [2024-06-17 22:05:31,994][12645] Fps is (10 sec: 37683.3, 60 sec: 37956.3, 300 sec: 37822.0). Total num frames: 13303808. Throughput: 0: 38078.6. Samples: 13389800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 22:05:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:05:34,945][12883] Updated weights for policy 0, policy_version 820 (0.0037) [2024-06-17 22:05:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 38775.5, 300 sec: 37988.7). Total num frames: 13516800. Throughput: 0: 38182.8. Samples: 13618720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-17 22:05:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:05:39,577][12883] Updated weights for policy 0, policy_version 830 (0.0037) [2024-06-17 22:05:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 38502.4, 300 sec: 38044.2). Total num frames: 13697024. Throughput: 0: 37990.2. Samples: 13837680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 22:05:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:05:43,834][12883] Updated weights for policy 0, policy_version 840 (0.0044) [2024-06-17 22:05:46,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38229.3, 300 sec: 38044.2). Total num frames: 13877248. Throughput: 0: 38092.9. Samples: 13960760. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-17 22:05:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:05:48,470][12883] Updated weights for policy 0, policy_version 850 (0.0033) [2024-06-17 22:05:51,994][12645] Fps is (10 sec: 37682.9, 60 sec: 37957.6, 300 sec: 38099.7). Total num frames: 14073856. Throughput: 0: 38272.9. Samples: 14191820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-17 22:05:52,007][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:05:52,127][12883] Updated weights for policy 0, policy_version 860 (0.0028) [2024-06-17 22:05:56,518][12883] Updated weights for policy 0, policy_version 870 (0.0026) [2024-06-17 22:05:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 37956.3, 300 sec: 38044.2). Total num frames: 14254080. Throughput: 0: 38156.0. Samples: 14412880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-17 22:05:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:06:00,750][12883] Updated weights for policy 0, policy_version 880 (0.0035) [2024-06-17 22:06:01,994][12645] Fps is (10 sec: 36045.0, 60 sec: 37683.2, 300 sec: 38044.2). Total num frames: 14434304. Throughput: 0: 37955.9. Samples: 14528000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-17 22:06:01,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:06:05,819][12883] Updated weights for policy 0, policy_version 890 (0.0047) [2024-06-17 22:06:06,996][12645] Fps is (10 sec: 39312.8, 60 sec: 38228.0, 300 sec: 38043.9). Total num frames: 14647296. Throughput: 0: 38114.6. Samples: 14764660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-17 22:06:06,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:06:09,030][12883] Updated weights for policy 0, policy_version 900 (0.0042) [2024-06-17 22:06:11,994][12645] Fps is (10 sec: 37683.2, 60 sec: 37956.2, 300 sec: 38099.7). Total num frames: 14811136. Throughput: 0: 38087.1. Samples: 14987900. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 22:06:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:06:14,187][12883] Updated weights for policy 0, policy_version 910 (0.0037) [2024-06-17 22:06:16,994][12645] Fps is (10 sec: 39330.2, 60 sec: 38230.0, 300 sec: 38099.7). Total num frames: 15040512. Throughput: 0: 38019.9. Samples: 15100700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 22:06:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:06:18,378][12883] Updated weights for policy 0, policy_version 920 (0.0038) [2024-06-17 22:06:21,994][12645] Fps is (10 sec: 39321.9, 60 sec: 37956.3, 300 sec: 37988.8). Total num frames: 15204352. Throughput: 0: 37989.8. Samples: 15328260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:06:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:06:23,373][12883] Updated weights for policy 0, policy_version 930 (0.0037) [2024-06-17 22:06:26,994][12645] Fps is (10 sec: 34406.3, 60 sec: 37956.3, 300 sec: 37988.9). Total num frames: 15384576. Throughput: 0: 38236.4. Samples: 15558320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-17 22:06:26,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:06:27,126][12883] Updated weights for policy 0, policy_version 940 (0.0040) [2024-06-17 22:06:31,319][12883] Updated weights for policy 0, policy_version 950 (0.0041) [2024-06-17 22:06:31,994][12645] Fps is (10 sec: 37683.0, 60 sec: 37956.2, 300 sec: 37988.7). Total num frames: 15581184. Throughput: 0: 37935.5. Samples: 15667860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-17 22:06:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:06:33,612][12862] Signal inference workers to stop experience collection... (200 times) [2024-06-17 22:06:33,646][12883] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-17 22:06:33,675][12862] Signal inference workers to resume experience collection... 
(200 times) [2024-06-17 22:06:33,676][12883] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-17 22:06:36,099][12883] Updated weights for policy 0, policy_version 960 (0.0048) [2024-06-17 22:06:36,994][12645] Fps is (10 sec: 37684.1, 60 sec: 37410.2, 300 sec: 38044.2). Total num frames: 15761408. Throughput: 0: 37986.9. Samples: 15901220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 24.0) [2024-06-17 22:06:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:06:37,056][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000963_15777792.pth... [2024-06-17 22:06:37,107][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000407_6668288.pth [2024-06-17 22:06:40,010][12883] Updated weights for policy 0, policy_version 970 (0.0045) [2024-06-17 22:06:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 37956.3, 300 sec: 38044.2). Total num frames: 15974400. Throughput: 0: 38044.9. Samples: 16124900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 22:06:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:06:44,491][12883] Updated weights for policy 0, policy_version 980 (0.0044) [2024-06-17 22:06:46,994][12645] Fps is (10 sec: 39320.5, 60 sec: 37956.2, 300 sec: 38044.2). Total num frames: 16154624. Throughput: 0: 38094.2. Samples: 16242240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 22:06:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:06:48,573][12883] Updated weights for policy 0, policy_version 990 (0.0044) [2024-06-17 22:06:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 38229.5, 300 sec: 38100.0). Total num frames: 16367616. Throughput: 0: 37864.7. Samples: 16468480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-17 22:06:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:06:53,782][12883] Updated weights for policy 0, policy_version 1000 (0.0045) [2024-06-17 22:06:56,997][12645] Fps is (10 sec: 37670.4, 60 sec: 37954.1, 300 sec: 37932.7). Total num frames: 16531456. Throughput: 0: 37894.5. Samples: 16693280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 22:06:56,998][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:06:57,451][12883] Updated weights for policy 0, policy_version 1010 (0.0038) [2024-06-17 22:07:01,623][12883] Updated weights for policy 0, policy_version 1020 (0.0028) [2024-06-17 22:07:01,994][12645] Fps is (10 sec: 34405.9, 60 sec: 37956.3, 300 sec: 37989.2). Total num frames: 16711680. Throughput: 0: 37948.0. Samples: 16808360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-17 22:07:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:07:05,675][12883] Updated weights for policy 0, policy_version 1030 (0.0034) [2024-06-17 22:07:06,994][12645] Fps is (10 sec: 39335.4, 60 sec: 37957.7, 300 sec: 38044.2). Total num frames: 16924672. Throughput: 0: 38004.5. Samples: 17038460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:07:06,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:07:10,287][12883] Updated weights for policy 0, policy_version 1040 (0.0039) [2024-06-17 22:07:11,994][12645] Fps is (10 sec: 37683.8, 60 sec: 37956.4, 300 sec: 37933.1). Total num frames: 17088512. Throughput: 0: 37896.2. Samples: 17263640. Policy #0 lag: (min: 1.0, avg: 7.8, max: 21.0) [2024-06-17 22:07:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:07:14,182][12883] Updated weights for policy 0, policy_version 1050 (0.0026) [2024-06-17 22:07:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 37683.2, 300 sec: 38044.2). Total num frames: 17301504. Throughput: 0: 37864.0. Samples: 17371740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-17 22:07:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:07:18,848][12883] Updated weights for policy 0, policy_version 1060 (0.0061) [2024-06-17 22:07:21,993][12645] Fps is (10 sec: 40960.4, 60 sec: 38229.5, 300 sec: 38044.2). Total num frames: 17498112. Throughput: 0: 37936.9. Samples: 17608380. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-17 22:07:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:07:22,682][12883] Updated weights for policy 0, policy_version 1070 (0.0032) [2024-06-17 22:07:26,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 17661952. Throughput: 0: 37969.8. Samples: 17833540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-17 22:07:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:07:27,503][12883] Updated weights for policy 0, policy_version 1080 (0.0037) [2024-06-17 22:07:31,215][12883] Updated weights for policy 0, policy_version 1090 (0.0034) [2024-06-17 22:07:31,994][12645] Fps is (10 sec: 37682.3, 60 sec: 38229.4, 300 sec: 37989.5). Total num frames: 17874944. Throughput: 0: 37845.9. Samples: 17945300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-17 22:07:31,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:07:36,151][12883] Updated weights for policy 0, policy_version 1100 (0.0046) [2024-06-17 22:07:36,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37683.2, 300 sec: 37822.1). Total num frames: 18022400. Throughput: 0: 37901.3. Samples: 18174040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 22:07:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:07:39,694][12883] Updated weights for policy 0, policy_version 1110 (0.0040) [2024-06-17 22:07:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 18251776. Throughput: 0: 38050.6. Samples: 18405420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-17 22:07:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:07:44,873][12883] Updated weights for policy 0, policy_version 1120 (0.0039) [2024-06-17 22:07:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 37683.3, 300 sec: 37933.1). Total num frames: 18415616. Throughput: 0: 38011.2. Samples: 18518860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-17 22:07:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:07:48,418][12883] Updated weights for policy 0, policy_version 1130 (0.0040) [2024-06-17 22:07:51,994][12645] Fps is (10 sec: 36044.8, 60 sec: 37410.1, 300 sec: 37933.2). Total num frames: 18612224. Throughput: 0: 37860.5. Samples: 18742180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-17 22:07:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:07:53,501][12883] Updated weights for policy 0, policy_version 1140 (0.0044) [2024-06-17 22:07:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 37958.5, 300 sec: 37877.6). Total num frames: 18808832. Throughput: 0: 37960.9. Samples: 18971880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-17 22:07:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:07:57,275][12883] Updated weights for policy 0, policy_version 1150 (0.0042) [2024-06-17 22:08:01,994][12645] Fps is (10 sec: 37682.9, 60 sec: 37956.3, 300 sec: 37933.1). Total num frames: 18989056. Throughput: 0: 38019.1. Samples: 19082600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 22:08:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:08:02,031][12883] Updated weights for policy 0, policy_version 1160 (0.0038) [2024-06-17 22:08:05,650][12883] Updated weights for policy 0, policy_version 1170 (0.0045) [2024-06-17 22:08:06,995][12645] Fps is (10 sec: 39317.6, 60 sec: 37955.7, 300 sec: 38044.1). Total num frames: 19202048. Throughput: 0: 37822.1. Samples: 19310420. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-17 22:08:06,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:08:10,509][12883] Updated weights for policy 0, policy_version 1180 (0.0035) [2024-06-17 22:08:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38229.3, 300 sec: 37933.1). Total num frames: 19382272. Throughput: 0: 38116.0. Samples: 19548760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 22:08:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:08:14,397][12883] Updated weights for policy 0, policy_version 1190 (0.0035) [2024-06-17 22:08:15,903][12862] Signal inference workers to stop experience collection... (250 times) [2024-06-17 22:08:15,924][12883] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-17 22:08:15,958][12862] Signal inference workers to resume experience collection... (250 times) [2024-06-17 22:08:15,960][12883] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-17 22:08:16,994][12645] Fps is (10 sec: 37686.7, 60 sec: 37956.3, 300 sec: 37933.1). Total num frames: 19578880. Throughput: 0: 38102.2. Samples: 19659900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-17 22:08:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:08:19,225][12883] Updated weights for policy 0, policy_version 1200 (0.0055) [2024-06-17 22:08:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37683.1, 300 sec: 38044.2). Total num frames: 19759104. Throughput: 0: 38064.9. Samples: 19886960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-17 22:08:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:08:23,046][12883] Updated weights for policy 0, policy_version 1210 (0.0038) [2024-06-17 22:08:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 37956.3, 300 sec: 37933.1). Total num frames: 19939328. Throughput: 0: 38080.9. Samples: 20119060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 22:08:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:08:27,791][12883] Updated weights for policy 0, policy_version 1220 (0.0024) [2024-06-17 22:08:31,794][12883] Updated weights for policy 0, policy_version 1230 (0.0036) [2024-06-17 22:08:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 20152320. Throughput: 0: 38098.7. Samples: 20233300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 22:08:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:08:36,479][12883] Updated weights for policy 0, policy_version 1240 (0.0047) [2024-06-17 22:08:36,996][12645] Fps is (10 sec: 39312.8, 60 sec: 38500.9, 300 sec: 37988.4). Total num frames: 20332544. Throughput: 0: 38099.0. Samples: 20456720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 22:08:36,997][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:08:37,149][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001242_20348928.pth... [2024-06-17 22:08:37,201][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000683_11190272.pth [2024-06-17 22:08:40,638][12883] Updated weights for policy 0, policy_version 1250 (0.0035) [2024-06-17 22:08:42,000][12645] Fps is (10 sec: 37659.6, 60 sec: 37952.3, 300 sec: 37932.3). Total num frames: 20529152. Throughput: 0: 38093.8. Samples: 20686340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 22:08:42,001][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:08:44,988][12883] Updated weights for policy 0, policy_version 1260 (0.0050) [2024-06-17 22:08:46,994][12645] Fps is (10 sec: 39330.8, 60 sec: 38502.5, 300 sec: 37988.7). Total num frames: 20725760. Throughput: 0: 38306.8. Samples: 20806400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-17 22:08:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:08:49,406][12883] Updated weights for policy 0, policy_version 1270 (0.0034) [2024-06-17 22:08:51,996][12645] Fps is (10 sec: 36059.4, 60 sec: 37954.8, 300 sec: 37932.8). Total num frames: 20889600. Throughput: 0: 38219.4. Samples: 21030340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-17 22:08:51,996][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:08:53,335][12883] Updated weights for policy 0, policy_version 1280 (0.0035) [2024-06-17 22:08:56,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38229.3, 300 sec: 38044.8). Total num frames: 21102592. Throughput: 0: 37968.1. Samples: 21257320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-17 22:08:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:08:58,167][12883] Updated weights for policy 0, policy_version 1290 (0.0032) [2024-06-17 22:09:01,919][12883] Updated weights for policy 0, policy_version 1300 (0.0039) [2024-06-17 22:09:01,994][12645] Fps is (10 sec: 40969.5, 60 sec: 38502.5, 300 sec: 38044.2). Total num frames: 21299200. Throughput: 0: 38037.0. Samples: 21371560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-17 22:09:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:09:06,598][12883] Updated weights for policy 0, policy_version 1310 (0.0032) [2024-06-17 22:09:06,994][12645] Fps is (10 sec: 37683.2, 60 sec: 37956.9, 300 sec: 37988.7). Total num frames: 21479424. Throughput: 0: 38139.5. Samples: 21603240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-17 22:09:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:09:10,675][12883] Updated weights for policy 0, policy_version 1320 (0.0030) [2024-06-17 22:09:11,994][12645] Fps is (10 sec: 37682.4, 60 sec: 38229.3, 300 sec: 38044.5). Total num frames: 21676032. Throughput: 0: 37983.5. Samples: 21828320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:09:11,999][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:09:15,225][12883] Updated weights for policy 0, policy_version 1330 (0.0043) [2024-06-17 22:09:16,994][12645] Fps is (10 sec: 36044.3, 60 sec: 37683.1, 300 sec: 37877.6). Total num frames: 21839872. Throughput: 0: 37992.3. Samples: 21942960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:09:16,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:09:19,405][12883] Updated weights for policy 0, policy_version 1340 (0.0044) [2024-06-17 22:09:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 38775.5, 300 sec: 38155.3). Total num frames: 22085632. Throughput: 0: 38106.3. Samples: 22171420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 22:09:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:09:23,874][12883] Updated weights for policy 0, policy_version 1350 (0.0031) [2024-06-17 22:09:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38229.3, 300 sec: 37988.7). Total num frames: 22233088. Throughput: 0: 38040.8. Samples: 22397940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-17 22:09:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:09:28,178][12883] Updated weights for policy 0, policy_version 1360 (0.0042) [2024-06-17 22:09:31,995][12645] Fps is (10 sec: 34402.1, 60 sec: 37955.5, 300 sec: 38099.6). Total num frames: 22429696. Throughput: 0: 37885.5. Samples: 22511300. Policy #0 lag: (min: 1.0, avg: 12.6, max: 26.0) [2024-06-17 22:09:31,996][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:09:32,503][12883] Updated weights for policy 0, policy_version 1370 (0.0036) [2024-06-17 22:09:36,613][12883] Updated weights for policy 0, policy_version 1380 (0.0029) [2024-06-17 22:09:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 37957.6, 300 sec: 38044.2). Total num frames: 22609920. Throughput: 0: 38169.8. Samples: 22747900. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-17 22:09:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:09:40,535][12883] Updated weights for policy 0, policy_version 1390 (0.0040) [2024-06-17 22:09:41,994][12645] Fps is (10 sec: 36048.7, 60 sec: 37687.1, 300 sec: 37988.6). Total num frames: 22790144. Throughput: 0: 38031.4. Samples: 22968740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 22:09:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:09:45,204][12883] Updated weights for policy 0, policy_version 1400 (0.0035) [2024-06-17 22:09:45,931][12862] Signal inference workers to stop experience collection... (300 times) [2024-06-17 22:09:45,973][12883] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-17 22:09:45,984][12862] Signal inference workers to resume experience collection... (300 times) [2024-06-17 22:09:45,993][12883] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-17 22:09:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 38229.3, 300 sec: 38044.5). Total num frames: 23019520. Throughput: 0: 38021.2. Samples: 23082520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-17 22:09:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:09:49,112][12883] Updated weights for policy 0, policy_version 1410 (0.0034) [2024-06-17 22:09:51,994][12645] Fps is (10 sec: 37684.0, 60 sec: 37957.7, 300 sec: 37933.1). Total num frames: 23166976. Throughput: 0: 38142.7. Samples: 23319660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-17 22:09:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:09:53,922][12883] Updated weights for policy 0, policy_version 1420 (0.0045) [2024-06-17 22:09:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 38502.4, 300 sec: 38099.8). Total num frames: 23412736. Throughput: 0: 38024.1. Samples: 23539400. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-17 22:09:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:09:57,223][12883] Updated weights for policy 0, policy_version 1430 (0.0043) [2024-06-17 22:10:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 37683.1, 300 sec: 37988.7). Total num frames: 23560192. Throughput: 0: 37988.1. Samples: 23652420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-17 22:10:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:10:02,608][12883] Updated weights for policy 0, policy_version 1440 (0.0038) [2024-06-17 22:10:06,323][12883] Updated weights for policy 0, policy_version 1450 (0.0040) [2024-06-17 22:10:06,994][12645] Fps is (10 sec: 34406.4, 60 sec: 37956.2, 300 sec: 38044.2). Total num frames: 23756800. Throughput: 0: 37899.1. Samples: 23876880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 22:10:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:10:11,293][12883] Updated weights for policy 0, policy_version 1460 (0.0027) [2024-06-17 22:10:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 37956.4, 300 sec: 37988.8). Total num frames: 23953408. Throughput: 0: 38041.9. Samples: 24109820. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-17 22:10:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:10:15,110][12883] Updated weights for policy 0, policy_version 1470 (0.0032) [2024-06-17 22:10:16,994][12645] Fps is (10 sec: 36045.1, 60 sec: 37956.4, 300 sec: 37933.1). Total num frames: 24117248. Throughput: 0: 37911.7. Samples: 24217280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:10:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:10:19,739][12883] Updated weights for policy 0, policy_version 1480 (0.0047) [2024-06-17 22:10:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 37683.2, 300 sec: 38099.8). Total num frames: 24346624. Throughput: 0: 37714.3. Samples: 24445040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-17 22:10:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:10:23,223][12883] Updated weights for policy 0, policy_version 1490 (0.0030) [2024-06-17 22:10:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 24510464. Throughput: 0: 38170.7. Samples: 24686420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-17 22:10:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:10:28,462][12883] Updated weights for policy 0, policy_version 1500 (0.0042) [2024-06-17 22:10:31,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38230.1, 300 sec: 37988.7). Total num frames: 24723456. Throughput: 0: 37953.7. Samples: 24790440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-17 22:10:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:10:32,097][12883] Updated weights for policy 0, policy_version 1510 (0.0046) [2024-06-17 22:10:36,564][12883] Updated weights for policy 0, policy_version 1520 (0.0025) [2024-06-17 22:10:36,996][12645] Fps is (10 sec: 40950.7, 60 sec: 38501.0, 300 sec: 38043.9). Total num frames: 24920064. Throughput: 0: 37884.2. Samples: 25024540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 22:10:36,997][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:10:37,030][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001521_24920064.pth... [2024-06-17 22:10:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000000963_15777792.pth [2024-06-17 22:10:40,803][12883] Updated weights for policy 0, policy_version 1530 (0.0035) [2024-06-17 22:10:41,994][12645] Fps is (10 sec: 36045.1, 60 sec: 38229.4, 300 sec: 37988.7). Total num frames: 25083904. Throughput: 0: 38108.9. Samples: 25254300. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-17 22:10:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:10:45,628][12883] Updated weights for policy 0, policy_version 1540 (0.0043) [2024-06-17 22:10:46,994][12645] Fps is (10 sec: 37692.0, 60 sec: 37956.3, 300 sec: 38044.2). Total num frames: 25296896. Throughput: 0: 38099.2. Samples: 25366880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 19.0) [2024-06-17 22:10:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:10:49,786][12883] Updated weights for policy 0, policy_version 1550 (0.0038) [2024-06-17 22:10:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38229.3, 300 sec: 37988.7). Total num frames: 25460736. Throughput: 0: 38268.4. Samples: 25598960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-17 22:10:51,996][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:10:54,079][12883] Updated weights for policy 0, policy_version 1560 (0.0032) [2024-06-17 22:10:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 37683.2, 300 sec: 38099.8). Total num frames: 25673728. Throughput: 0: 37950.2. Samples: 25817580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-17 22:10:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:10:58,061][12883] Updated weights for policy 0, policy_version 1570 (0.0048) [2024-06-17 22:11:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37956.3, 300 sec: 37933.4). Total num frames: 25837568. Throughput: 0: 38303.1. Samples: 25940920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-17 22:11:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:11:02,632][12883] Updated weights for policy 0, policy_version 1580 (0.0040) [2024-06-17 22:11:06,612][12883] Updated weights for policy 0, policy_version 1590 (0.0036) [2024-06-17 22:11:06,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38229.3, 300 sec: 38099.7). Total num frames: 26050560. Throughput: 0: 38105.7. Samples: 26159800. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-17 22:11:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:11:11,377][12883] Updated weights for policy 0, policy_version 1600 (0.0043) [2024-06-17 22:11:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 38229.4, 300 sec: 37988.7). Total num frames: 26247168. Throughput: 0: 37972.1. Samples: 26395160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 22:11:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:11:15,460][12883] Updated weights for policy 0, policy_version 1610 (0.0044) [2024-06-17 22:11:16,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38502.3, 300 sec: 38044.2). Total num frames: 26427392. Throughput: 0: 38155.5. Samples: 26507440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-17 22:11:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:11:20,186][12883] Updated weights for policy 0, policy_version 1620 (0.0031) [2024-06-17 22:11:21,994][12645] Fps is (10 sec: 37682.5, 60 sec: 37956.2, 300 sec: 38099.7). Total num frames: 26624000. Throughput: 0: 37991.7. Samples: 26734080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-17 22:11:21,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:11:23,558][12862] Signal inference workers to stop experience collection... (350 times) [2024-06-17 22:11:23,586][12883] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-17 22:11:23,616][12862] Signal inference workers to resume experience collection... (350 times) [2024-06-17 22:11:23,618][12883] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-17 22:11:23,767][12883] Updated weights for policy 0, policy_version 1630 (0.0045) [2024-06-17 22:11:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38229.3, 300 sec: 38044.2). Total num frames: 26804224. Throughput: 0: 38090.2. Samples: 26968360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:11:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:11:28,526][12883] Updated weights for policy 0, policy_version 1640 (0.0031) [2024-06-17 22:11:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 37956.2, 300 sec: 38099.7). Total num frames: 27000832. Throughput: 0: 38031.0. Samples: 27078280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 22:11:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:11:32,478][12883] Updated weights for policy 0, policy_version 1650 (0.0045) [2024-06-17 22:11:36,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37684.7, 300 sec: 37988.7). Total num frames: 27181056. Throughput: 0: 38042.7. Samples: 27310880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 22:11:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:11:37,090][12883] Updated weights for policy 0, policy_version 1660 (0.0040) [2024-06-17 22:11:41,145][12883] Updated weights for policy 0, policy_version 1670 (0.0045) [2024-06-17 22:11:41,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38229.3, 300 sec: 38044.2). Total num frames: 27377664. Throughput: 0: 38211.5. Samples: 27537100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-17 22:11:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:11:46,056][12883] Updated weights for policy 0, policy_version 1680 (0.0039) [2024-06-17 22:11:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 37956.3, 300 sec: 37988.7). Total num frames: 27574272. Throughput: 0: 38106.3. Samples: 27655700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:11:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:11:49,388][12883] Updated weights for policy 0, policy_version 1690 (0.0045) [2024-06-17 22:11:51,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38229.4, 300 sec: 38044.7). Total num frames: 27754496. Throughput: 0: 38363.6. Samples: 27886160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-17 22:11:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:11:54,625][12883] Updated weights for policy 0, policy_version 1700 (0.0049) [2024-06-17 22:11:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 38502.4, 300 sec: 38210.8). Total num frames: 27983872. Throughput: 0: 38113.6. Samples: 28110280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-17 22:11:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:11:57,831][12883] Updated weights for policy 0, policy_version 1710 (0.0028) [2024-06-17 22:12:01,994][12645] Fps is (10 sec: 36045.0, 60 sec: 37956.3, 300 sec: 37933.1). Total num frames: 28114944. Throughput: 0: 38317.9. Samples: 28231740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-17 22:12:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:12:03,243][12883] Updated weights for policy 0, policy_version 1720 (0.0039) [2024-06-17 22:12:05,778][12883] Updated weights for policy 0, policy_version 1730 (0.0033) [2024-06-17 22:12:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38502.4, 300 sec: 38210.8). Total num frames: 28360704. Throughput: 0: 38288.5. Samples: 28457060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-17 22:12:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:12:11,994][12645] Fps is (10 sec: 37683.0, 60 sec: 37410.1, 300 sec: 37933.1). Total num frames: 28491776. Throughput: 0: 38327.2. Samples: 28693080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 22.0) [2024-06-17 22:12:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:12:12,015][12883] Updated weights for policy 0, policy_version 1740 (0.0039) [2024-06-17 22:12:15,075][12883] Updated weights for policy 0, policy_version 1750 (0.0039) [2024-06-17 22:12:16,994][12645] Fps is (10 sec: 36045.2, 60 sec: 38229.5, 300 sec: 38044.2). Total num frames: 28721152. Throughput: 0: 38216.6. Samples: 28798020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-17 22:12:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:12:20,258][12883] Updated weights for policy 0, policy_version 1760 (0.0042) [2024-06-17 22:12:21,994][12645] Fps is (10 sec: 44236.0, 60 sec: 38502.4, 300 sec: 38210.8). Total num frames: 28934144. Throughput: 0: 38176.8. Samples: 29028840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 22:12:21,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:12:23,166][12883] Updated weights for policy 0, policy_version 1770 (0.0041) [2024-06-17 22:12:26,996][12645] Fps is (10 sec: 36036.3, 60 sec: 37954.9, 300 sec: 37988.4). Total num frames: 29081600. Throughput: 0: 38376.3. Samples: 29264120. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-17 22:12:26,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:12:28,909][12883] Updated weights for policy 0, policy_version 1780 (0.0032) [2024-06-17 22:12:31,994][12645] Fps is (10 sec: 37683.8, 60 sec: 38502.5, 300 sec: 38266.4). Total num frames: 29310976. Throughput: 0: 38245.7. Samples: 29376760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-17 22:12:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:12:32,015][12883] Updated weights for policy 0, policy_version 1790 (0.0036) [2024-06-17 22:12:36,994][12645] Fps is (10 sec: 37691.4, 60 sec: 37956.2, 300 sec: 37988.6). Total num frames: 29458432. Throughput: 0: 38324.3. Samples: 29610760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-17 22:12:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:12:37,102][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001799_29474816.pth... 
[2024-06-17 22:12:37,159][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001242_20348928.pth
[2024-06-17 22:12:37,360][12883] Updated weights for policy 0, policy_version 1800 (0.0034)
[2024-06-17 22:12:40,356][12883] Updated weights for policy 0, policy_version 1810 (0.0033)
[2024-06-17 22:12:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38502.4, 300 sec: 38210.8). Total num frames: 29687808. Throughput: 0: 38212.0. Samples: 29829820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0)
[2024-06-17 22:12:41,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:12:45,810][12883] Updated weights for policy 0, policy_version 1820 (0.0026)
[2024-06-17 22:12:46,998][12645] Fps is (10 sec: 37666.5, 60 sec: 37680.3, 300 sec: 38043.6). Total num frames: 29835264. Throughput: 0: 38195.2. Samples: 29950700. Policy #0 lag: (min: 0.0, avg: 7.4, max: 20.0)
[2024-06-17 22:12:46,999][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:12:47,482][12862] Signal inference workers to stop experience collection... (400 times)
[2024-06-17 22:12:47,534][12883] InferenceWorker_p0-w0: stopping experience collection (400 times)
[2024-06-17 22:12:47,603][12862] Signal inference workers to resume experience collection... (400 times)
[2024-06-17 22:12:47,603][12883] InferenceWorker_p0-w0: resuming experience collection (400 times)
[2024-06-17 22:12:48,961][12883] Updated weights for policy 0, policy_version 1830 (0.0035)
[2024-06-17 22:12:51,996][12645] Fps is (10 sec: 34398.6, 60 sec: 37954.8, 300 sec: 38043.9). Total num frames: 30031872. Throughput: 0: 38212.3. Samples: 30176700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0)
[2024-06-17 22:12:51,997][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:12:54,402][12883] Updated weights for policy 0, policy_version 1840 (0.0029)
[2024-06-17 22:12:56,994][12645] Fps is (10 sec: 45896.1, 60 sec: 38502.5, 300 sec: 38321.9). Total num frames: 30294016. Throughput: 0: 38157.8. Samples: 30410180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 26.0)
[2024-06-17 22:12:56,994][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:12:57,589][12883] Updated weights for policy 0, policy_version 1850 (0.0037)
[2024-06-17 22:13:01,994][12645] Fps is (10 sec: 39331.0, 60 sec: 38502.4, 300 sec: 38044.3). Total num frames: 30425088. Throughput: 0: 38597.3. Samples: 30534900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0)
[2024-06-17 22:13:01,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:13:02,846][12883] Updated weights for policy 0, policy_version 1860 (0.0029)
[2024-06-17 22:13:05,759][12883] Updated weights for policy 0, policy_version 1870 (0.0033)
[2024-06-17 22:13:06,994][12645] Fps is (10 sec: 36044.4, 60 sec: 38229.3, 300 sec: 38210.8). Total num frames: 30654464. Throughput: 0: 38437.8. Samples: 30758540. Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0)
[2024-06-17 22:13:06,994][12645] Avg episode reward: [(0, '0.004')]
[2024-06-17 22:13:11,457][12883] Updated weights for policy 0, policy_version 1880 (0.0033)
[2024-06-17 22:13:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.6, 300 sec: 38155.3). Total num frames: 30834688. Throughput: 0: 38482.4. Samples: 30995740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0)
[2024-06-17 22:13:11,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:13:14,270][12883] Updated weights for policy 0, policy_version 1890 (0.0036)
[2024-06-17 22:13:16,994][12645] Fps is (10 sec: 36045.2, 60 sec: 38229.3, 300 sec: 38155.3). Total num frames: 31014912. Throughput: 0: 38412.0. Samples: 31105300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0)
[2024-06-17 22:13:16,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:13:19,688][12883] Updated weights for policy 0, policy_version 1900 (0.0031)
[2024-06-17 22:13:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 38502.5, 300 sec: 38321.9). Total num frames: 31244288. Throughput: 0: 38544.1. Samples: 31345240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0)
[2024-06-17 22:13:21,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:13:22,792][12883] Updated weights for policy 0, policy_version 1910 (0.0048)
[2024-06-17 22:13:26,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38230.8, 300 sec: 38044.2). Total num frames: 31375360. Throughput: 0: 38889.8. Samples: 31579860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0)
[2024-06-17 22:13:26,994][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:13:28,279][12883] Updated weights for policy 0, policy_version 1920 (0.0040)
[2024-06-17 22:13:31,437][12883] Updated weights for policy 0, policy_version 1930 (0.0023)
[2024-06-17 22:13:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 38775.4, 300 sec: 38322.2). Total num frames: 31637504. Throughput: 0: 38600.3. Samples: 31687540. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0)
[2024-06-17 22:13:31,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:13:36,693][12883] Updated weights for policy 0, policy_version 1940 (0.0038)
[2024-06-17 22:13:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 39048.6, 300 sec: 38211.6). Total num frames: 31801344. Throughput: 0: 38948.3. Samples: 31929280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0)
[2024-06-17 22:13:36,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:13:39,676][12883] Updated weights for policy 0, policy_version 1950 (0.0037)
[2024-06-17 22:13:41,994][12645] Fps is (10 sec: 34406.7, 60 sec: 38229.4, 300 sec: 38155.3). Total num frames: 31981568. Throughput: 0: 38948.0. Samples: 32162840. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0)
[2024-06-17 22:13:41,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:13:44,893][12883] Updated weights for policy 0, policy_version 1960 (0.0047)
[2024-06-17 22:13:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39597.6, 300 sec: 38377.7). Total num frames: 32210944. Throughput: 0: 38875.0. Samples: 32284280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 24.0)
[2024-06-17 22:13:46,994][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:13:48,230][12883] Updated weights for policy 0, policy_version 1970 (0.0038)
[2024-06-17 22:13:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39050.1, 300 sec: 38210.8). Total num frames: 32374784. Throughput: 0: 38976.1. Samples: 32512460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0)
[2024-06-17 22:13:51,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:13:53,523][12883] Updated weights for policy 0, policy_version 1980 (0.0040)
[2024-06-17 22:13:56,897][12883] Updated weights for policy 0, policy_version 1990 (0.0041)
[2024-06-17 22:13:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38502.4, 300 sec: 38321.9). Total num frames: 32604160. Throughput: 0: 38772.8. Samples: 32740520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0)
[2024-06-17 22:13:56,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:14:01,993][12645] Fps is (10 sec: 39322.1, 60 sec: 39048.6, 300 sec: 38266.4). Total num frames: 32768000. Throughput: 0: 39105.5. Samples: 32865040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0)
[2024-06-17 22:14:01,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:14:01,999][12883] Updated weights for policy 0, policy_version 2000 (0.0030)
[2024-06-17 22:14:04,698][12862] Signal inference workers to stop experience collection... (450 times)
[2024-06-17 22:14:04,744][12883] InferenceWorker_p0-w0: stopping experience collection (450 times)
[2024-06-17 22:14:04,812][12862] Signal inference workers to resume experience collection... (450 times)
[2024-06-17 22:14:04,812][12883] InferenceWorker_p0-w0: resuming experience collection (450 times)
[2024-06-17 22:14:04,953][12883] Updated weights for policy 0, policy_version 2010 (0.0038)
[2024-06-17 22:14:06,996][12645] Fps is (10 sec: 36036.7, 60 sec: 38501.0, 300 sec: 38266.1). Total num frames: 32964608. Throughput: 0: 38702.4. Samples: 33086940. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0)
[2024-06-17 22:14:06,997][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:14:10,597][12883] Updated weights for policy 0, policy_version 2020 (0.0039)
[2024-06-17 22:14:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39048.5, 300 sec: 38433.0). Total num frames: 33177600. Throughput: 0: 38700.0. Samples: 33321360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0)
[2024-06-17 22:14:11,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:14:13,858][12883] Updated weights for policy 0, policy_version 2030 (0.0034)
[2024-06-17 22:14:16,996][12645] Fps is (10 sec: 37683.1, 60 sec: 38774.0, 300 sec: 38155.0). Total num frames: 33341440. Throughput: 0: 38705.7. Samples: 33429380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0)
[2024-06-17 22:14:16,997][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:14:19,285][12883] Updated weights for policy 0, policy_version 2040 (0.0036)
[2024-06-17 22:14:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 38775.4, 300 sec: 38433.0). Total num frames: 33570816. Throughput: 0: 38647.0. Samples: 33668400. Policy #0 lag: (min: 2.0, avg: 12.0, max: 24.0)
[2024-06-17 22:14:21,995][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:14:22,036][12883] Updated weights for policy 0, policy_version 2050 (0.0044)
[2024-06-17 22:14:26,994][12645] Fps is (10 sec: 37691.3, 60 sec: 39048.5, 300 sec: 38266.5). Total num frames: 33718272. Throughput: 0: 38442.6. Samples: 33892760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0)
[2024-06-17 22:14:26,995][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:14:27,735][12883] Updated weights for policy 0, policy_version 2060 (0.0044)
[2024-06-17 22:14:31,337][12883] Updated weights for policy 0, policy_version 2070 (0.0036)
[2024-06-17 22:14:31,994][12645] Fps is (10 sec: 36045.5, 60 sec: 38229.4, 300 sec: 38377.5). Total num frames: 33931264. Throughput: 0: 38161.0. Samples: 34001520. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0)
[2024-06-17 22:14:31,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:14:36,286][12883] Updated weights for policy 0, policy_version 2080 (0.0030)
[2024-06-17 22:14:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 38775.4, 300 sec: 38433.0). Total num frames: 34127872. Throughput: 0: 38475.6. Samples: 34243860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0)
[2024-06-17 22:14:36,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:14:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002083_34127872.pth...
[2024-06-17 22:14:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001521_24920064.pth
[2024-06-17 22:14:39,353][12883] Updated weights for policy 0, policy_version 2090 (0.0040)
[2024-06-17 22:14:41,994][12645] Fps is (10 sec: 37682.8, 60 sec: 38775.4, 300 sec: 38266.4). Total num frames: 34308096. Throughput: 0: 38486.2. Samples: 34472400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0)
[2024-06-17 22:14:41,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:14:44,648][12883] Updated weights for policy 0, policy_version 2100 (0.0033)
[2024-06-17 22:14:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 38502.4, 300 sec: 38488.5). Total num frames: 34521088. Throughput: 0: 38292.7. Samples: 34588220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0)
[2024-06-17 22:14:46,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:14:48,231][12883] Updated weights for policy 0, policy_version 2110 (0.0041)
[2024-06-17 22:14:51,994][12645] Fps is (10 sec: 34406.2, 60 sec: 37956.2, 300 sec: 38099.7). Total num frames: 34652160. Throughput: 0: 38454.7. Samples: 34817320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0)
[2024-06-17 22:14:51,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:14:53,022][12883] Updated weights for policy 0, policy_version 2120 (0.0043)
[2024-06-17 22:14:56,361][12883] Updated weights for policy 0, policy_version 2130 (0.0038)
[2024-06-17 22:14:56,994][12645] Fps is (10 sec: 37683.6, 60 sec: 38229.3, 300 sec: 38433.0). Total num frames: 34897920. Throughput: 0: 38247.1. Samples: 35042480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0)
[2024-06-17 22:14:56,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:15:01,359][12883] Updated weights for policy 0, policy_version 2140 (0.0028)
[2024-06-17 22:15:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 38775.4, 300 sec: 38433.0). Total num frames: 35094528. Throughput: 0: 38609.1. Samples: 35166700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0)
[2024-06-17 22:15:01,994][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:15:04,943][12883] Updated weights for policy 0, policy_version 2150 (0.0038)
[2024-06-17 22:15:06,994][12645] Fps is (10 sec: 37683.2, 60 sec: 38503.9, 300 sec: 38377.4). Total num frames: 35274752. Throughput: 0: 38303.2. Samples: 35392040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0)
[2024-06-17 22:15:06,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:15:09,914][12883] Updated weights for policy 0, policy_version 2160 (0.0031)
[2024-06-17 22:15:11,994][12645] Fps is (10 sec: 37682.8, 60 sec: 38229.3, 300 sec: 38488.5). Total num frames: 35471360. Throughput: 0: 38429.4. Samples: 35622080. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0)
[2024-06-17 22:15:11,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:15:13,911][12883] Updated weights for policy 0, policy_version 2170 (0.0036)
[2024-06-17 22:15:16,994][12645] Fps is (10 sec: 36045.1, 60 sec: 38230.8, 300 sec: 38266.4). Total num frames: 35635200. Throughput: 0: 38644.0. Samples: 35740500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0)
[2024-06-17 22:15:16,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:15:18,129][12883] Updated weights for policy 0, policy_version 2180 (0.0042)
[2024-06-17 22:15:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 37956.4, 300 sec: 38433.0). Total num frames: 35848192. Throughput: 0: 38507.1. Samples: 35976680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0)
[2024-06-17 22:15:21,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:15:22,191][12883] Updated weights for policy 0, policy_version 2190 (0.0043)
[2024-06-17 22:15:26,718][12883] Updated weights for policy 0, policy_version 2200 (0.0041)
[2024-06-17 22:15:26,994][12645] Fps is (10 sec: 42597.4, 60 sec: 39048.5, 300 sec: 38433.0). Total num frames: 36061184. Throughput: 0: 38561.7. Samples: 36207680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0)
[2024-06-17 22:15:26,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:15:30,457][12862] Signal inference workers to stop experience collection... (500 times)
[2024-06-17 22:15:30,493][12883] InferenceWorker_p0-w0: stopping experience collection (500 times)
[2024-06-17 22:15:30,513][12862] Signal inference workers to resume experience collection... (500 times)
[2024-06-17 22:15:30,515][12883] InferenceWorker_p0-w0: resuming experience collection (500 times)
[2024-06-17 22:15:30,836][12883] Updated weights for policy 0, policy_version 2210 (0.0035)
[2024-06-17 22:15:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38229.3, 300 sec: 38322.2). Total num frames: 36225024. Throughput: 0: 38565.5. Samples: 36323660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0)
[2024-06-17 22:15:31,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:15:34,946][12883] Updated weights for policy 0, policy_version 2220 (0.0042)
[2024-06-17 22:15:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38502.3, 300 sec: 38488.5). Total num frames: 36438016. Throughput: 0: 38611.1. Samples: 36554820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0)
[2024-06-17 22:15:36,996][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:15:39,259][12883] Updated weights for policy 0, policy_version 2230 (0.0038)
[2024-06-17 22:15:41,994][12645] Fps is (10 sec: 40959.2, 60 sec: 38775.4, 300 sec: 38433.0). Total num frames: 36634624. Throughput: 0: 38661.7. Samples: 36782260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0)
[2024-06-17 22:15:41,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:15:43,731][12883] Updated weights for policy 0, policy_version 2240 (0.0035)
[2024-06-17 22:15:46,994][12645] Fps is (10 sec: 37683.8, 60 sec: 38229.4, 300 sec: 38488.5). Total num frames: 36814848. Throughput: 0: 38536.5. Samples: 36900840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0)
[2024-06-17 22:15:46,994][12645] Avg episode reward: [(0, '0.003')]
[2024-06-17 22:15:48,057][12883] Updated weights for policy 0, policy_version 2250 (0.0043)
[2024-06-17 22:15:51,761][12883] Updated weights for policy 0, policy_version 2260 (0.0048)
[2024-06-17 22:15:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.6, 300 sec: 38488.5). Total num frames: 37027840. Throughput: 0: 38660.8. Samples: 37131780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0)
[2024-06-17 22:15:51,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:15:56,265][12883] Updated weights for policy 0, policy_version 2270 (0.0037)
[2024-06-17 22:15:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 38502.4, 300 sec: 38544.0). Total num frames: 37208064. Throughput: 0: 38645.4. Samples: 37361120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0)
[2024-06-17 22:15:56,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:16:00,445][12883] Updated weights for policy 0, policy_version 2280 (0.0038)
[2024-06-17 22:16:01,994][12645] Fps is (10 sec: 36045.0, 60 sec: 38229.3, 300 sec: 38433.0). Total num frames: 37388288. Throughput: 0: 38454.1. Samples: 37470940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0)
[2024-06-17 22:16:01,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:16:04,961][12883] Updated weights for policy 0, policy_version 2290 (0.0044)
[2024-06-17 22:16:06,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38229.3, 300 sec: 38377.4). Total num frames: 37568512. Throughput: 0: 38477.7. Samples: 37708180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 23.0)
[2024-06-17 22:16:06,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:16:08,590][12883] Updated weights for policy 0, policy_version 2300 (0.0033)
[2024-06-17 22:16:11,994][12645] Fps is (10 sec: 37683.6, 60 sec: 38229.4, 300 sec: 38433.0). Total num frames: 37765120. Throughput: 0: 38669.9. Samples: 37947820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0)
[2024-06-17 22:16:11,994][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:16:13,618][12883] Updated weights for policy 0, policy_version 2310 (0.0036)
[2024-06-17 22:16:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 39321.5, 300 sec: 38544.1). Total num frames: 37994496. Throughput: 0: 38504.4. Samples: 38056360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0)
[2024-06-17 22:16:16,994][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:16:17,091][12883] Updated weights for policy 0, policy_version 2320 (0.0039)
[2024-06-17 22:16:21,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38229.3, 300 sec: 38433.0). Total num frames: 38141952. Throughput: 0: 38699.6. Samples: 38296300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0)
[2024-06-17 22:16:21,994][12645] Avg episode reward: [(0, '0.006')]
[2024-06-17 22:16:22,127][12862] Saving new best policy, reward=0.006!
[2024-06-17 22:16:22,370][12883] Updated weights for policy 0, policy_version 2330 (0.0032)
[2024-06-17 22:16:25,318][12883] Updated weights for policy 0, policy_version 2340 (0.0029)
[2024-06-17 22:16:26,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38229.4, 300 sec: 38488.5). Total num frames: 38354944. Throughput: 0: 38746.3. Samples: 38525840. Policy #0 lag: (min: 0.0, avg: 12.8, max: 22.0)
[2024-06-17 22:16:26,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:16:30,367][12883] Updated weights for policy 0, policy_version 2350 (0.0041)
[2024-06-17 22:16:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 39321.5, 300 sec: 38655.1). Total num frames: 38584320. Throughput: 0: 38849.7. Samples: 38649080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0)
[2024-06-17 22:16:31,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:16:34,331][12883] Updated weights for policy 0, policy_version 2360 (0.0034)
[2024-06-17 22:16:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38502.5, 300 sec: 38544.1). Total num frames: 38748160. Throughput: 0: 38741.9. Samples: 38875160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0)
[2024-06-17 22:16:36,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:16:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002366_38764544.pth...
[2024-06-17 22:16:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000001799_29474816.pth
[2024-06-17 22:16:39,046][12883] Updated weights for policy 0, policy_version 2370 (0.0034)
[2024-06-17 22:16:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38775.5, 300 sec: 38599.6). Total num frames: 38961152. Throughput: 0: 38712.8. Samples: 39103200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0)
[2024-06-17 22:16:41,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:16:42,601][12883] Updated weights for policy 0, policy_version 2380 (0.0054)
[2024-06-17 22:16:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38502.4, 300 sec: 38544.1). Total num frames: 39124992. Throughput: 0: 38972.1. Samples: 39224680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0)
[2024-06-17 22:16:47,000][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:16:47,418][12883] Updated weights for policy 0, policy_version 2390 (0.0042)
[2024-06-17 22:16:50,730][12883] Updated weights for policy 0, policy_version 2400 (0.0050)
[2024-06-17 22:16:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38502.4, 300 sec: 38488.5). Total num frames: 39337984. Throughput: 0: 38839.0. Samples: 39455940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0)
[2024-06-17 22:16:51,995][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:16:55,670][12883] Updated weights for policy 0, policy_version 2410 (0.0031)
[2024-06-17 22:16:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 38775.4, 300 sec: 38710.6). Total num frames: 39534592. Throughput: 0: 38790.1. Samples: 39693380. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0)
[2024-06-17 22:16:56,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:16:58,265][12862] Signal inference workers to stop experience collection... (550 times)
[2024-06-17 22:16:58,265][12862] Signal inference workers to resume experience collection... (550 times)
[2024-06-17 22:16:58,304][12883] InferenceWorker_p0-w0: stopping experience collection (550 times)
[2024-06-17 22:16:58,304][12883] InferenceWorker_p0-w0: resuming experience collection (550 times)
[2024-06-17 22:16:59,307][12883] Updated weights for policy 0, policy_version 2420 (0.0048)
[2024-06-17 22:17:01,994][12645] Fps is (10 sec: 37683.7, 60 sec: 38775.5, 300 sec: 38488.5). Total num frames: 39714816. Throughput: 0: 38830.2. Samples: 39803720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0)
[2024-06-17 22:17:01,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:17:04,370][12883] Updated weights for policy 0, policy_version 2430 (0.0031)
[2024-06-17 22:17:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39594.7, 300 sec: 38821.7). Total num frames: 39944192. Throughput: 0: 38756.4. Samples: 40040340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0)
[2024-06-17 22:17:07,000][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:17:07,289][12883] Updated weights for policy 0, policy_version 2440 (0.0034)
[2024-06-17 22:17:11,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38502.3, 300 sec: 38488.5). Total num frames: 40075264. Throughput: 0: 38909.3. Samples: 40276760. Policy #0 lag: (min: 0.0, avg: 6.9, max: 20.0)
[2024-06-17 22:17:11,994][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:17:13,119][12883] Updated weights for policy 0, policy_version 2450 (0.0043)
[2024-06-17 22:17:15,830][12883] Updated weights for policy 0, policy_version 2460 (0.0039)
[2024-06-17 22:17:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 38775.4, 300 sec: 38599.6). Total num frames: 40321024. Throughput: 0: 38541.3. Samples: 40383440. Policy #0 lag: (min: 0.0, avg: 13.0, max: 22.0)
[2024-06-17 22:17:16,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:17:21,642][12883] Updated weights for policy 0, policy_version 2470 (0.0026)
[2024-06-17 22:17:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39321.6, 300 sec: 38711.0). Total num frames: 40501248. Throughput: 0: 38855.9. Samples: 40623680. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0)
[2024-06-17 22:17:21,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:17:24,278][12883] Updated weights for policy 0, policy_version 2480 (0.0038)
[2024-06-17 22:17:26,994][12645] Fps is (10 sec: 36044.9, 60 sec: 38775.4, 300 sec: 38544.0). Total num frames: 40681472. Throughput: 0: 38815.1. Samples: 40849880. Policy #0 lag: (min: 1.0, avg: 12.3, max: 24.0)
[2024-06-17 22:17:26,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:17:30,234][12883] Updated weights for policy 0, policy_version 2490 (0.0041)
[2024-06-17 22:17:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38502.4, 300 sec: 38766.2). Total num frames: 40894464. Throughput: 0: 38743.0. Samples: 40968120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0)
[2024-06-17 22:17:31,995][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:17:33,278][12883] Updated weights for policy 0, policy_version 2500 (0.0044)
[2024-06-17 22:17:36,996][12645] Fps is (10 sec: 37675.0, 60 sec: 38501.0, 300 sec: 38543.8). Total num frames: 41058304. Throughput: 0: 38772.0. Samples: 41200760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0)
[2024-06-17 22:17:36,997][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:17:38,558][12883] Updated weights for policy 0, policy_version 2510 (0.0039)
[2024-06-17 22:17:41,182][12883] Updated weights for policy 0, policy_version 2520 (0.0040)
[2024-06-17 22:17:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39048.6, 300 sec: 38877.9). Total num frames: 41304064. Throughput: 0: 38591.7. Samples: 41430000. Policy #0 lag: (min: 2.0, avg: 9.7, max: 23.0)
[2024-06-17 22:17:41,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:17:46,946][12883] Updated weights for policy 0, policy_version 2530 (0.0027)
[2024-06-17 22:17:46,995][12645] Fps is (10 sec: 39326.0, 60 sec: 38774.7, 300 sec: 38710.8). Total num frames: 41451520. Throughput: 0: 38811.5. Samples: 41550280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0)
[2024-06-17 22:17:46,995][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:17:49,873][12883] Updated weights for policy 0, policy_version 2540 (0.0042)
[2024-06-17 22:17:51,994][12645] Fps is (10 sec: 34406.1, 60 sec: 38502.5, 300 sec: 38488.5). Total num frames: 41648128. Throughput: 0: 38595.1. Samples: 41777120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0)
[2024-06-17 22:17:51,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:17:55,270][12883] Updated weights for policy 0, policy_version 2550 (0.0047)
[2024-06-17 22:17:56,994][12645] Fps is (10 sec: 42603.3, 60 sec: 39048.6, 300 sec: 38821.7). Total num frames: 41877504. Throughput: 0: 38613.9. Samples: 42014380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0)
[2024-06-17 22:17:56,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:17:58,589][12883] Updated weights for policy 0, policy_version 2560 (0.0043)
[2024-06-17 22:18:01,996][12645] Fps is (10 sec: 39313.0, 60 sec: 38774.0, 300 sec: 38599.3). Total num frames: 42041344. Throughput: 0: 38866.1. Samples: 42132500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0)
[2024-06-17 22:18:01,997][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:18:03,842][12883] Updated weights for policy 0, policy_version 2570 (0.0039)
[2024-06-17 22:18:06,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38229.4, 300 sec: 38655.1). Total num frames: 42237952. Throughput: 0: 38560.0. Samples: 42358880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0)
[2024-06-17 22:18:06,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:18:07,206][12883] Updated weights for policy 0, policy_version 2580 (0.0042)
[2024-06-17 22:18:11,994][12645] Fps is (10 sec: 37691.2, 60 sec: 39048.5, 300 sec: 38655.1). Total num frames: 42418176. Throughput: 0: 38915.5. Samples: 42601080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0)
[2024-06-17 22:18:11,994][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:18:12,088][12883] Updated weights for policy 0, policy_version 2590 (0.0038)
[2024-06-17 22:18:15,618][12883] Updated weights for policy 0, policy_version 2600 (0.0035)
[2024-06-17 22:18:16,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38229.3, 300 sec: 38544.0). Total num frames: 42614784. Throughput: 0: 38731.0. Samples: 42711020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0)
[2024-06-17 22:18:16,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:18:20,166][12883] Updated weights for policy 0, policy_version 2610 (0.0047)
[2024-06-17 22:18:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 38775.5, 300 sec: 38821.8). Total num frames: 42827776. Throughput: 0: 38881.1. Samples: 42950320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0)
[2024-06-17 22:18:21,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:18:24,632][12883] Updated weights for policy 0, policy_version 2620 (0.0038)
[2024-06-17 22:18:26,223][12862] Signal inference workers to stop experience collection... (600 times)
[2024-06-17 22:18:26,225][12862] Signal inference workers to resume experience collection... (600 times)
[2024-06-17 22:18:26,263][12883] InferenceWorker_p0-w0: stopping experience collection (600 times)
[2024-06-17 22:18:26,263][12883] InferenceWorker_p0-w0: resuming experience collection (600 times)
[2024-06-17 22:18:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39048.5, 300 sec: 38599.6). Total num frames: 43024384. Throughput: 0: 38899.9. Samples: 43180500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0)
[2024-06-17 22:18:26,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:18:28,557][12883] Updated weights for policy 0, policy_version 2630 (0.0041)
[2024-06-17 22:18:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38775.5, 300 sec: 38710.7). Total num frames: 43220992. Throughput: 0: 38777.9. Samples: 43295240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0)
[2024-06-17 22:18:31,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:18:32,639][12883] Updated weights for policy 0, policy_version 2640 (0.0039)
[2024-06-17 22:18:36,945][12883] Updated weights for policy 0, policy_version 2650 (0.0031)
[2024-06-17 22:18:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39323.0, 300 sec: 38766.2). Total num frames: 43417600. Throughput: 0: 39024.9. Samples: 43533240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0)
[2024-06-17 22:18:36,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:18:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002650_43417600.pth...
[2024-06-17 22:18:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002083_34127872.pth
[2024-06-17 22:18:41,482][12883] Updated weights for policy 0, policy_version 2660 (0.0036)
[2024-06-17 22:18:41,994][12645] Fps is (10 sec: 39321.0, 60 sec: 38502.3, 300 sec: 38655.1). Total num frames: 43614208. Throughput: 0: 38858.6. Samples: 43763020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0)
[2024-06-17 22:18:41,995][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:18:45,362][12883] Updated weights for policy 0, policy_version 2670 (0.0036)
[2024-06-17 22:18:46,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38776.1, 300 sec: 38655.1). Total num frames: 43778048. Throughput: 0: 38782.3. Samples: 43877620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0)
[2024-06-17 22:18:46,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:18:49,566][12883] Updated weights for policy 0, policy_version 2680 (0.0036)
[2024-06-17 22:18:51,994][12645] Fps is (10 sec: 36045.5, 60 sec: 38775.6, 300 sec: 38544.1). Total num frames: 43974656. Throughput: 0: 38833.9. Samples: 44106400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0)
[2024-06-17 22:18:51,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:18:53,802][12883] Updated weights for policy 0, policy_version 2690 (0.0035)
[2024-06-17 22:18:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38229.3, 300 sec: 38655.1). Total num frames: 44171264. Throughput: 0: 38588.9. Samples: 44337580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0)
[2024-06-17 22:18:57,000][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:18:58,459][12883] Updated weights for policy 0, policy_version 2700 (0.0039)
[2024-06-17 22:19:01,971][12883] Updated weights for policy 0, policy_version 2710 (0.0051)
[2024-06-17 22:19:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39323.1, 300 sec: 38766.5). Total num frames: 44400640. Throughput: 0: 38700.6. Samples: 44452540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0)
[2024-06-17 22:19:01,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:19:06,782][12883] Updated weights for policy 0, policy_version 2720 (0.0044)
[2024-06-17 22:19:06,994][12645] Fps is (10 sec: 39322.4, 60 sec: 38775.6, 300 sec: 38599.6). Total num frames: 44564480. Throughput: 0: 38442.8. Samples: 44680240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0)
[2024-06-17 22:19:06,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:19:10,961][12883] Updated weights for policy 0, policy_version 2730 (0.0039)
[2024-06-17 22:19:11,994][12645] Fps is (10 sec: 34406.2, 60 sec: 38775.5, 300 sec: 38655.4). Total num frames: 44744704. Throughput: 0: 38478.7. Samples: 44912040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0)
[2024-06-17 22:19:11,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:19:15,379][12883] Updated weights for policy 0, policy_version 2740 (0.0048)
[2024-06-17 22:19:16,994][12645] Fps is (10 sec: 37682.2, 60 sec: 38775.5, 300 sec: 38544.0). Total num frames: 44941312. Throughput: 0: 38490.5. Samples: 45027320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0)
[2024-06-17 22:19:16,995][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:19:19,221][12883] Updated weights for policy 0, policy_version 2750 (0.0048)
[2024-06-17 22:19:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38229.3, 300 sec: 38655.2). Total num frames: 45121536. Throughput: 0: 38317.4. Samples: 45257520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0)
[2024-06-17 22:19:21,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:19:24,104][12883] Updated weights for policy 0, policy_version 2760 (0.0050)
[2024-06-17 22:19:26,995][12645] Fps is (10 sec: 39318.7, 60 sec: 38501.9, 300 sec: 38655.0). Total num frames: 45334528. Throughput: 0: 38347.3. Samples: 45488680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0)
[2024-06-17 22:19:26,995][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:19:27,975][12883] Updated weights for policy 0, policy_version 2770 (0.0040)
[2024-06-17 22:19:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 38502.3, 300 sec: 38655.1). Total num frames: 45531136. Throughput: 0: 38593.3. Samples: 45614320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0)
[2024-06-17 22:19:31,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:19:32,234][12883] Updated weights for policy 0, policy_version 2780 (0.0035)
[2024-06-17 22:19:36,309][12883] Updated weights for policy 0, policy_version 2790 (0.0037)
[2024-06-17 22:19:36,994][12645] Fps is (10 sec: 37686.3, 60 sec: 38229.3, 300 sec: 38655.1). Total num frames: 45711360. Throughput: 0: 38471.9. Samples: 45837640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0)
[2024-06-17 22:19:36,994][12645] Avg episode reward: [(0, '0.002')]
[2024-06-17 22:19:40,864][12883] Updated weights for policy 0, policy_version 2800 (0.0032)
[2024-06-17 22:19:41,994][12645] Fps is (10 sec: 36045.3, 60 sec: 37956.4, 300 sec: 38544.1). Total num frames: 45891584. Throughput: 0: 38455.7. Samples: 46068080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0)
[2024-06-17 22:19:41,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:19:45,078][12883] Updated weights for policy 0, policy_version 2810 (0.0039)
[2024-06-17 22:19:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38502.4, 300 sec: 38766.2). Total num frames: 46088192. Throughput: 0: 38408.0. Samples: 46180900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0)
[2024-06-17 22:19:46,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:19:49,527][12883] Updated weights for policy 0, policy_version 2820 (0.0032)
[2024-06-17 22:19:51,994][12645] Fps is (10 sec: 39320.9, 60 sec: 38502.3, 300 sec: 38599.6). Total num frames: 46284800. Throughput: 0: 38439.3. Samples: 46410020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0)
[2024-06-17 22:19:51,995][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:19:53,514][12883] Updated weights for policy 0, policy_version 2830 (0.0046)
[2024-06-17 22:19:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 38502.5, 300 sec: 38599.6). Total num frames: 46481408. Throughput: 0: 38530.3. Samples: 46645900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0)
[2024-06-17 22:19:56,994][12645] Avg episode reward: [(0, '0.000')]
[2024-06-17 22:19:57,904][12883] Updated weights for policy 0, policy_version 2840 (0.0034)
[2024-06-17 22:20:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 37956.2, 300 sec: 38655.1). Total num frames: 46678016. Throughput: 0: 38531.6. Samples: 46761240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0)
[2024-06-17 22:20:01,994][12645] Avg episode reward: [(0, '0.001')]
[2024-06-17 22:20:02,032][12883] Updated weights for policy 0, policy_version 2850 (0.0035)
[2024-06-17 22:20:06,455][12883] Updated weights for policy 0, policy_version 2860 (0.0035)
[2024-06-17 22:20:06,491][12862] Signal inference workers to stop experience collection... (650 times)
[2024-06-17 22:20:06,491][12862] Signal inference workers to resume experience collection... (650 times)
[2024-06-17 22:20:06,506][12883] InferenceWorker_p0-w0: stopping experience collection (650 times)
[2024-06-17 22:20:06,506][12883] InferenceWorker_p0-w0: resuming experience collection (650 times)
[2024-06-17 22:20:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 38502.3, 300 sec: 38655.1). Total num frames: 46874624. Throughput: 0: 38560.0. Samples: 46992720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0)
[2024-06-17 22:20:06,994][12645] Avg episode reward: [(0, '0.003')]
[2024-06-17 22:20:11,240][12883] Updated weights for policy 0, policy_version 2870 (0.0041)
[2024-06-17 22:20:11,994][12645] Fps is (10 sec: 34406.5, 60 sec: 37956.2, 300 sec: 38599.6). Total num frames: 47022080. Throughput: 0: 38585.6. Samples: 47225000.
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-17 22:20:11,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:20:15,005][12883] Updated weights for policy 0, policy_version 2880 (0.0031) [2024-06-17 22:20:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.6, 300 sec: 38766.2). Total num frames: 47284224. Throughput: 0: 38304.1. Samples: 47338000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 19.0) [2024-06-17 22:20:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:20:19,286][12883] Updated weights for policy 0, policy_version 2890 (0.0045) [2024-06-17 22:20:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 38502.3, 300 sec: 38544.1). Total num frames: 47431680. Throughput: 0: 38649.4. Samples: 47576860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-17 22:20:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:20:23,411][12883] Updated weights for policy 0, policy_version 2900 (0.0035) [2024-06-17 22:20:26,994][12645] Fps is (10 sec: 36044.7, 60 sec: 38503.0, 300 sec: 38710.7). Total num frames: 47644672. Throughput: 0: 38614.6. Samples: 47805740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-17 22:20:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:20:28,187][12883] Updated weights for policy 0, policy_version 2910 (0.0043) [2024-06-17 22:20:31,790][12883] Updated weights for policy 0, policy_version 2920 (0.0043) [2024-06-17 22:20:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 38502.4, 300 sec: 38655.1). Total num frames: 47841280. Throughput: 0: 38749.7. Samples: 47924640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-17 22:20:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:20:36,572][12883] Updated weights for policy 0, policy_version 2930 (0.0030) [2024-06-17 22:20:36,996][12645] Fps is (10 sec: 36036.7, 60 sec: 38227.9, 300 sec: 38543.8). Total num frames: 48005120. Throughput: 0: 38711.1. Samples: 48152100. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-17 22:20:36,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:20:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002930_48005120.pth... [2024-06-17 22:20:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002366_38764544.pth [2024-06-17 22:20:40,343][12883] Updated weights for policy 0, policy_version 2940 (0.0038) [2024-06-17 22:20:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39321.6, 300 sec: 38766.2). Total num frames: 48250880. Throughput: 0: 38538.6. Samples: 48380140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-17 22:20:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:20:45,029][12883] Updated weights for policy 0, policy_version 2950 (0.0041) [2024-06-17 22:20:46,994][12645] Fps is (10 sec: 40968.9, 60 sec: 38775.4, 300 sec: 38599.6). Total num frames: 48414720. Throughput: 0: 38747.1. Samples: 48504860. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-17 22:20:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:20:49,017][12883] Updated weights for policy 0, policy_version 2960 (0.0037) [2024-06-17 22:20:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39048.7, 300 sec: 38710.7). Total num frames: 48627712. Throughput: 0: 38592.5. Samples: 48729380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-17 22:20:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:20:53,620][12883] Updated weights for policy 0, policy_version 2970 (0.0042) [2024-06-17 22:20:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 38775.5, 300 sec: 38710.7). Total num frames: 48807936. Throughput: 0: 38753.9. Samples: 48968920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-17 22:20:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:20:57,121][12883] Updated weights for policy 0, policy_version 2980 (0.0046) [2024-06-17 22:21:01,986][12883] Updated weights for policy 0, policy_version 2990 (0.0031) [2024-06-17 22:21:01,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38502.5, 300 sec: 38710.7). Total num frames: 48988160. Throughput: 0: 38754.2. Samples: 49081940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-17 22:21:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:21:05,483][12883] Updated weights for policy 0, policy_version 3000 (0.0033) [2024-06-17 22:21:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 39321.6, 300 sec: 38877.3). Total num frames: 49233920. Throughput: 0: 38609.0. Samples: 49314260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 22:21:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:21:10,307][12883] Updated weights for policy 0, policy_version 3010 (0.0040) [2024-06-17 22:21:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39594.7, 300 sec: 38655.1). Total num frames: 49397760. Throughput: 0: 38928.0. Samples: 49557500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-17 22:21:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:21:13,955][12883] Updated weights for policy 0, policy_version 3020 (0.0039) [2024-06-17 22:21:16,994][12645] Fps is (10 sec: 36044.5, 60 sec: 38502.4, 300 sec: 38821.7). Total num frames: 49594368. Throughput: 0: 38855.6. Samples: 49673140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 22:21:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:21:18,500][12883] Updated weights for policy 0, policy_version 3030 (0.0042) [2024-06-17 22:21:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.6, 300 sec: 38710.7). Total num frames: 49774592. Throughput: 0: 38986.9. Samples: 49906420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:21:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:21:22,237][12883] Updated weights for policy 0, policy_version 3040 (0.0040) [2024-06-17 22:21:26,994][12645] Fps is (10 sec: 36044.7, 60 sec: 38502.4, 300 sec: 38544.1). Total num frames: 49954816. Throughput: 0: 39036.8. Samples: 50136800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-17 22:21:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:21:27,353][12883] Updated weights for policy 0, policy_version 3050 (0.0056) [2024-06-17 22:21:30,825][12883] Updated weights for policy 0, policy_version 3060 (0.0045) [2024-06-17 22:21:31,214][12862] Signal inference workers to stop experience collection... (700 times) [2024-06-17 22:21:31,236][12883] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-17 22:21:31,329][12862] Signal inference workers to resume experience collection... (700 times) [2024-06-17 22:21:31,330][12883] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-17 22:21:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 39048.6, 300 sec: 38766.2). Total num frames: 50184192. Throughput: 0: 38932.5. Samples: 50256820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-17 22:21:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:21:35,304][12883] Updated weights for policy 0, policy_version 3070 (0.0038) [2024-06-17 22:21:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39050.0, 300 sec: 38599.6). Total num frames: 50348032. Throughput: 0: 39021.7. Samples: 50485360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 22:21:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:21:39,375][12883] Updated weights for policy 0, policy_version 3080 (0.0042) [2024-06-17 22:21:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38502.4, 300 sec: 38766.2). Total num frames: 50561024. Throughput: 0: 38775.0. Samples: 50713800. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 22:21:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:21:44,191][12883] Updated weights for policy 0, policy_version 3090 (0.0035) [2024-06-17 22:21:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.6, 300 sec: 38710.7). Total num frames: 50757632. Throughput: 0: 38970.6. Samples: 50835620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-17 22:21:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:21:47,827][12883] Updated weights for policy 0, policy_version 3100 (0.0038) [2024-06-17 22:21:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 38502.4, 300 sec: 38655.1). Total num frames: 50937856. Throughput: 0: 38923.1. Samples: 51065800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-17 22:21:51,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:21:52,333][12883] Updated weights for policy 0, policy_version 3110 (0.0050) [2024-06-17 22:21:56,423][12883] Updated weights for policy 0, policy_version 3120 (0.0035) [2024-06-17 22:21:56,996][12645] Fps is (10 sec: 37674.8, 60 sec: 38774.0, 300 sec: 38710.4). Total num frames: 51134464. Throughput: 0: 38580.3. Samples: 51293700. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-17 22:21:56,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:22:00,890][12883] Updated weights for policy 0, policy_version 3130 (0.0046) [2024-06-17 22:22:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39048.5, 300 sec: 38599.6). Total num frames: 51331072. Throughput: 0: 38640.0. Samples: 51411940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 22:22:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:22:04,749][12883] Updated weights for policy 0, policy_version 3140 (0.0048) [2024-06-17 22:22:06,994][12645] Fps is (10 sec: 39330.9, 60 sec: 38229.4, 300 sec: 38821.8). Total num frames: 51527680. Throughput: 0: 38606.2. Samples: 51643700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-17 22:22:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:22:08,709][12883] Updated weights for policy 0, policy_version 3150 (0.0026) [2024-06-17 22:22:11,994][12645] Fps is (10 sec: 36044.9, 60 sec: 38229.3, 300 sec: 38544.1). Total num frames: 51691520. Throughput: 0: 38641.9. Samples: 51875680. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-17 22:22:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:22:13,314][12883] Updated weights for policy 0, policy_version 3160 (0.0034) [2024-06-17 22:22:16,994][12645] Fps is (10 sec: 36044.1, 60 sec: 38229.3, 300 sec: 38599.6). Total num frames: 51888128. Throughput: 0: 38447.5. Samples: 51986960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-17 22:22:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:22:17,827][12883] Updated weights for policy 0, policy_version 3170 (0.0034) [2024-06-17 22:22:21,564][12883] Updated weights for policy 0, policy_version 3180 (0.0039) [2024-06-17 22:22:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 39048.4, 300 sec: 38766.2). Total num frames: 52117504. Throughput: 0: 38613.7. Samples: 52222980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-17 22:22:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:22:25,883][12883] Updated weights for policy 0, policy_version 3190 (0.0039) [2024-06-17 22:22:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.5, 300 sec: 38655.1). Total num frames: 52297728. Throughput: 0: 38799.9. Samples: 52459800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 22:22:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:22:30,045][12883] Updated weights for policy 0, policy_version 3200 (0.0038) [2024-06-17 22:22:31,994][12645] Fps is (10 sec: 36045.5, 60 sec: 38229.4, 300 sec: 38711.0). Total num frames: 52477952. Throughput: 0: 38642.3. Samples: 52574520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-17 22:22:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:22:34,992][12883] Updated weights for policy 0, policy_version 3210 (0.0039) [2024-06-17 22:22:37,000][12645] Fps is (10 sec: 39297.3, 60 sec: 39044.5, 300 sec: 38598.8). Total num frames: 52690944. Throughput: 0: 38639.5. Samples: 52804820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:22:37,001][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:22:37,029][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003216_52690944.pth... [2024-06-17 22:22:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002650_43417600.pth [2024-06-17 22:22:38,298][12883] Updated weights for policy 0, policy_version 3220 (0.0039) [2024-06-17 22:22:41,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38229.3, 300 sec: 38655.3). Total num frames: 52854784. Throughput: 0: 38904.6. Samples: 53044320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-17 22:22:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:22:43,012][12883] Updated weights for policy 0, policy_version 3230 (0.0041) [2024-06-17 22:22:46,457][12883] Updated weights for policy 0, policy_version 3240 (0.0033) [2024-06-17 22:22:46,994][12645] Fps is (10 sec: 39346.2, 60 sec: 38775.5, 300 sec: 38766.2). Total num frames: 53084160. Throughput: 0: 38752.4. Samples: 53155800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 22:22:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:22:51,410][12883] Updated weights for policy 0, policy_version 3250 (0.0033) [2024-06-17 22:22:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 38775.4, 300 sec: 38599.6). Total num frames: 53264384. Throughput: 0: 38936.7. Samples: 53395860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-17 22:22:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:22:55,373][12883] Updated weights for policy 0, policy_version 3260 (0.0035) [2024-06-17 22:22:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39050.1, 300 sec: 38766.5). Total num frames: 53477376. Throughput: 0: 38908.5. Samples: 53626560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-17 22:22:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:22:59,870][12883] Updated weights for policy 0, policy_version 3270 (0.0039) [2024-06-17 22:23:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39048.5, 300 sec: 38766.2). Total num frames: 53673984. Throughput: 0: 39019.6. Samples: 53742840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:23:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:23:03,615][12862] Signal inference workers to stop experience collection... (750 times) [2024-06-17 22:23:03,615][12862] Signal inference workers to resume experience collection... (750 times) [2024-06-17 22:23:03,650][12883] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-17 22:23:03,650][12883] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-17 22:23:03,767][12883] Updated weights for policy 0, policy_version 3280 (0.0044) [2024-06-17 22:23:06,994][12645] Fps is (10 sec: 36044.3, 60 sec: 38502.3, 300 sec: 38710.7). Total num frames: 53837824. Throughput: 0: 38819.2. Samples: 53969840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-17 22:23:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:23:08,367][12883] Updated weights for policy 0, policy_version 3290 (0.0029) [2024-06-17 22:23:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39321.5, 300 sec: 38766.2). Total num frames: 54050816. Throughput: 0: 38748.0. Samples: 54203460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-17 22:23:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:23:12,165][12883] Updated weights for policy 0, policy_version 3300 (0.0052) [2024-06-17 22:23:16,936][12883] Updated weights for policy 0, policy_version 3310 (0.0048) [2024-06-17 22:23:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39048.5, 300 sec: 38655.1). Total num frames: 54231040. Throughput: 0: 38902.6. Samples: 54325140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 22:23:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:23:20,693][12883] Updated weights for policy 0, policy_version 3320 (0.0036) [2024-06-17 22:23:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38775.5, 300 sec: 38710.7). Total num frames: 54444032. Throughput: 0: 38895.6. Samples: 54554880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 22:23:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:23:25,559][12883] Updated weights for policy 0, policy_version 3330 (0.0036) [2024-06-17 22:23:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39048.5, 300 sec: 38710.6). Total num frames: 54640640. Throughput: 0: 38644.4. Samples: 54783320. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-17 22:23:27,003][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:23:28,638][12883] Updated weights for policy 0, policy_version 3340 (0.0027) [2024-06-17 22:23:31,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39048.5, 300 sec: 38655.1). Total num frames: 54820864. Throughput: 0: 38877.3. Samples: 54905280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-17 22:23:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:23:33,663][12883] Updated weights for policy 0, policy_version 3350 (0.0037) [2024-06-17 22:23:36,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39052.6, 300 sec: 38710.7). Total num frames: 55033856. Throughput: 0: 38757.8. Samples: 55139960. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-17 22:23:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:23:37,229][12883] Updated weights for policy 0, policy_version 3360 (0.0045) [2024-06-17 22:23:41,603][12883] Updated weights for policy 0, policy_version 3370 (0.0045) [2024-06-17 22:23:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39321.6, 300 sec: 38766.2). Total num frames: 55214080. Throughput: 0: 38930.1. Samples: 55378420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 22:23:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:23:45,354][12883] Updated weights for policy 0, policy_version 3380 (0.0026) [2024-06-17 22:23:46,995][12645] Fps is (10 sec: 36038.6, 60 sec: 38501.3, 300 sec: 38710.4). Total num frames: 55394304. Throughput: 0: 39009.2. Samples: 55498320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-17 22:23:46,996][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:23:50,285][12883] Updated weights for policy 0, policy_version 3390 (0.0038) [2024-06-17 22:23:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39321.6, 300 sec: 38821.7). Total num frames: 55623680. Throughput: 0: 39116.0. Samples: 55730060. Policy #0 lag: (min: 1.0, avg: 12.8, max: 26.0) [2024-06-17 22:23:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:23:53,885][12883] Updated weights for policy 0, policy_version 3400 (0.0036) [2024-06-17 22:23:56,994][12645] Fps is (10 sec: 40967.3, 60 sec: 38775.4, 300 sec: 38655.1). Total num frames: 55803904. Throughput: 0: 39163.7. Samples: 55965820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-17 22:23:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:23:59,052][12883] Updated weights for policy 0, policy_version 3410 (0.0033) [2024-06-17 22:24:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.5, 300 sec: 38821.7). Total num frames: 56016896. Throughput: 0: 38945.8. Samples: 56077700. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-17 22:24:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:24:02,417][12883] Updated weights for policy 0, policy_version 3420 (0.0033) [2024-06-17 22:24:06,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39048.6, 300 sec: 38766.2). Total num frames: 56180736. Throughput: 0: 39136.5. Samples: 56316020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-17 22:24:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:24:07,551][12883] Updated weights for policy 0, policy_version 3430 (0.0032) [2024-06-17 22:24:11,081][12883] Updated weights for policy 0, policy_version 3440 (0.0041) [2024-06-17 22:24:11,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39048.5, 300 sec: 38821.8). Total num frames: 56393728. Throughput: 0: 39138.2. Samples: 56544540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-17 22:24:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:24:15,555][12883] Updated weights for policy 0, policy_version 3450 (0.0050) [2024-06-17 22:24:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39048.5, 300 sec: 38821.7). Total num frames: 56573952. Throughput: 0: 38929.3. Samples: 56657100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 22:24:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:24:19,131][12883] Updated weights for policy 0, policy_version 3460 (0.0028) [2024-06-17 22:24:21,994][12645] Fps is (10 sec: 37683.6, 60 sec: 38775.5, 300 sec: 38766.3). Total num frames: 56770560. Throughput: 0: 38834.7. Samples: 56887520. Policy #0 lag: (min: 1.0, avg: 8.5, max: 18.0) [2024-06-17 22:24:22,003][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:24:23,811][12883] Updated weights for policy 0, policy_version 3470 (0.0043) [2024-06-17 22:24:26,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38502.5, 300 sec: 38710.7). Total num frames: 56950784. Throughput: 0: 38923.2. Samples: 57129960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 22:24:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:24:27,910][12883] Updated weights for policy 0, policy_version 3480 (0.0044) [2024-06-17 22:24:30,893][12862] Signal inference workers to stop experience collection... (800 times) [2024-06-17 22:24:30,933][12883] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-17 22:24:30,941][12862] Signal inference workers to resume experience collection... (800 times) [2024-06-17 22:24:30,954][12883] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-17 22:24:31,643][12883] Updated weights for policy 0, policy_version 3490 (0.0042) [2024-06-17 22:24:31,999][12645] Fps is (10 sec: 40936.2, 60 sec: 39317.8, 300 sec: 38876.5). Total num frames: 57180160. Throughput: 0: 38848.9. Samples: 57246680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 19.0) [2024-06-17 22:24:32,000][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:24:36,222][12883] Updated weights for policy 0, policy_version 3500 (0.0035) [2024-06-17 22:24:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 39321.6, 300 sec: 38988.4). Total num frames: 57393152. Throughput: 0: 39071.2. Samples: 57488260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-17 22:24:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:24:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003503_57393152.pth... [2024-06-17 22:24:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000002930_48005120.pth [2024-06-17 22:24:40,187][12883] Updated weights for policy 0, policy_version 3510 (0.0033) [2024-06-17 22:24:41,994][12645] Fps is (10 sec: 36065.8, 60 sec: 38775.5, 300 sec: 38821.8). Total num frames: 57540608. Throughput: 0: 38926.6. Samples: 57717520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 22:24:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:24:44,921][12883] Updated weights for policy 0, policy_version 3520 (0.0045) [2024-06-17 22:24:46,994][12645] Fps is (10 sec: 34406.0, 60 sec: 39049.6, 300 sec: 38821.8). Total num frames: 57737216. Throughput: 0: 38934.6. Samples: 57829760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-17 22:24:46,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:24:48,383][12883] Updated weights for policy 0, policy_version 3530 (0.0040) [2024-06-17 22:24:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 38502.4, 300 sec: 38821.7). Total num frames: 57933824. Throughput: 0: 39029.3. Samples: 58072340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-17 22:24:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:24:52,942][12883] Updated weights for policy 0, policy_version 3540 (0.0044) [2024-06-17 22:24:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39048.5, 300 sec: 38877.3). Total num frames: 58146816. Throughput: 0: 39058.3. Samples: 58302160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 25.0) [2024-06-17 22:24:56,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:24:57,072][12883] Updated weights for policy 0, policy_version 3550 (0.0034) [2024-06-17 22:25:01,262][12883] Updated weights for policy 0, policy_version 3560 (0.0034) [2024-06-17 22:25:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 58343424. Throughput: 0: 39254.8. Samples: 58423560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 22:25:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:25:05,116][12883] Updated weights for policy 0, policy_version 3570 (0.0041) [2024-06-17 22:25:06,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39048.5, 300 sec: 38988.4). Total num frames: 58523648. Throughput: 0: 39215.0. Samples: 58652200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 22:25:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:25:09,777][12883] Updated weights for policy 0, policy_version 3580 (0.0030) [2024-06-17 22:25:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38775.6, 300 sec: 38766.2). Total num frames: 58720256. Throughput: 0: 39005.8. Samples: 58885220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-17 22:25:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:25:14,043][12883] Updated weights for policy 0, policy_version 3590 (0.0039) [2024-06-17 22:25:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.6, 300 sec: 38932.8). Total num frames: 58916864. Throughput: 0: 38888.1. Samples: 58996420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-17 22:25:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:25:18,655][12883] Updated weights for policy 0, policy_version 3600 (0.0036) [2024-06-17 22:25:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39048.6, 300 sec: 38877.3). Total num frames: 59113472. Throughput: 0: 38741.4. Samples: 59231620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-17 22:25:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:25:22,275][12883] Updated weights for policy 0, policy_version 3610 (0.0033) [2024-06-17 22:25:26,807][12883] Updated weights for policy 0, policy_version 3620 (0.0032) [2024-06-17 22:25:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.6, 300 sec: 38932.8). Total num frames: 59326464. Throughput: 0: 38962.1. Samples: 59470820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-17 22:25:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:25:31,187][12883] Updated weights for policy 0, policy_version 3630 (0.0033) [2024-06-17 22:25:31,999][12645] Fps is (10 sec: 37661.2, 60 sec: 38502.4, 300 sec: 38932.4). Total num frames: 59490304. Throughput: 0: 38997.3. Samples: 59584860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-17 22:25:32,000][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:25:35,301][12883] Updated weights for policy 0, policy_version 3640 (0.0037) [2024-06-17 22:25:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 59719680. Throughput: 0: 38716.6. Samples: 59814580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:25:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:25:39,470][12883] Updated weights for policy 0, policy_version 3650 (0.0034) [2024-06-17 22:25:41,994][12645] Fps is (10 sec: 39344.6, 60 sec: 39048.6, 300 sec: 38877.3). Total num frames: 59883520. Throughput: 0: 38855.7. Samples: 60050660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-17 22:25:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:25:43,712][12883] Updated weights for policy 0, policy_version 3660 (0.0044) [2024-06-17 22:25:46,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39321.6, 300 sec: 38877.3). Total num frames: 60096512. Throughput: 0: 38744.8. Samples: 60167080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 22:25:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:25:47,885][12883] Updated weights for policy 0, policy_version 3670 (0.0039) [2024-06-17 22:25:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38775.6, 300 sec: 38821.8). Total num frames: 60260352. Throughput: 0: 38850.4. Samples: 60400460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-17 22:25:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:25:52,333][12883] Updated weights for policy 0, policy_version 3680 (0.0041) [2024-06-17 22:25:56,401][12883] Updated weights for policy 0, policy_version 3690 (0.0047) [2024-06-17 22:25:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 60473344. Throughput: 0: 38900.3. Samples: 60635740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-17 22:25:57,003][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:25:57,926][12862] Signal inference workers to stop experience collection... (850 times) [2024-06-17 22:25:57,975][12883] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-17 22:25:57,984][12862] Signal inference workers to resume experience collection... (850 times) [2024-06-17 22:25:57,987][12883] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-17 22:26:00,222][12883] Updated weights for policy 0, policy_version 3700 (0.0037) [2024-06-17 22:26:01,994][12645] Fps is (10 sec: 42597.4, 60 sec: 39048.4, 300 sec: 38821.7). Total num frames: 60686336. Throughput: 0: 39081.2. Samples: 60755080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-17 22:26:02,004][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:26:04,715][12883] Updated weights for policy 0, policy_version 3710 (0.0024) [2024-06-17 22:26:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39048.5, 300 sec: 38877.3). Total num frames: 60866560. Throughput: 0: 38966.6. Samples: 60985120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 22:26:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:26:09,239][12883] Updated weights for policy 0, policy_version 3720 (0.0039) [2024-06-17 22:26:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.4, 300 sec: 38877.3). Total num frames: 61063168. Throughput: 0: 38863.6. Samples: 61219680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-17 22:26:12,008][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:26:13,101][12883] Updated weights for policy 0, policy_version 3730 (0.0039) [2024-06-17 22:26:16,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 61243392. Throughput: 0: 38921.0. Samples: 61336080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-17 22:26:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:26:17,418][12883] Updated weights for policy 0, policy_version 3740 (0.0031) [2024-06-17 22:26:21,523][12883] Updated weights for policy 0, policy_version 3750 (0.0042) [2024-06-17 22:26:21,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38775.4, 300 sec: 38932.8). Total num frames: 61440000. Throughput: 0: 39062.5. Samples: 61572400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-17 22:26:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:26:25,698][12883] Updated weights for policy 0, policy_version 3760 (0.0043) [2024-06-17 22:26:26,996][12645] Fps is (10 sec: 39312.9, 60 sec: 38501.0, 300 sec: 38821.5). Total num frames: 61636608. Throughput: 0: 38979.3. Samples: 61804820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-17 22:26:26,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:26:29,833][12883] Updated weights for policy 0, policy_version 3770 (0.0034) [2024-06-17 22:26:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39325.4, 300 sec: 38988.4). Total num frames: 61849600. Throughput: 0: 39099.5. Samples: 61926560. Policy #0 lag: (min: 1.0, avg: 12.1, max: 23.0) [2024-06-17 22:26:31,995][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:26:33,908][12883] Updated weights for policy 0, policy_version 3780 (0.0053) [2024-06-17 22:26:36,994][12645] Fps is (10 sec: 39330.2, 60 sec: 38502.3, 300 sec: 38877.3). Total num frames: 62029824. Throughput: 0: 39188.3. Samples: 62163940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 22:26:36,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:26:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003786_62029824.pth... 
[2024-06-17 22:26:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003216_52690944.pth [2024-06-17 22:26:38,249][12883] Updated weights for policy 0, policy_version 3790 (0.0036) [2024-06-17 22:26:41,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39048.6, 300 sec: 38877.3). Total num frames: 62226432. Throughput: 0: 39033.9. Samples: 62392260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-17 22:26:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:26:42,386][12883] Updated weights for policy 0, policy_version 3800 (0.0034) [2024-06-17 22:26:46,489][12883] Updated weights for policy 0, policy_version 3810 (0.0029) [2024-06-17 22:26:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 62423040. Throughput: 0: 38955.6. Samples: 62508080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:26:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:26:50,486][12883] Updated weights for policy 0, policy_version 3820 (0.0037) [2024-06-17 22:26:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39321.6, 300 sec: 38933.1). Total num frames: 62619648. Throughput: 0: 39138.8. Samples: 62746360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 22:26:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:26:55,919][12883] Updated weights for policy 0, policy_version 3830 (0.0046) [2024-06-17 22:26:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39321.6, 300 sec: 38988.4). Total num frames: 62832640. Throughput: 0: 38875.1. Samples: 62969060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-17 22:26:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:26:59,522][12883] Updated weights for policy 0, policy_version 3840 (0.0030) [2024-06-17 22:27:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 63012864. Throughput: 0: 38998.6. Samples: 63091020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-17 22:27:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:27:03,983][12883] Updated weights for policy 0, policy_version 3850 (0.0027) [2024-06-17 22:27:06,994][12645] Fps is (10 sec: 34406.9, 60 sec: 38502.5, 300 sec: 38932.8). Total num frames: 63176704. Throughput: 0: 38851.3. Samples: 63320700. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-17 22:27:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:27:07,646][12883] Updated weights for policy 0, policy_version 3860 (0.0044) [2024-06-17 22:27:11,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38502.4, 300 sec: 38932.8). Total num frames: 63373312. Throughput: 0: 38894.8. Samples: 63555000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 22:27:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:27:12,422][12883] Updated weights for policy 0, policy_version 3870 (0.0037) [2024-06-17 22:27:15,934][12883] Updated weights for policy 0, policy_version 3880 (0.0026) [2024-06-17 22:27:16,997][12645] Fps is (10 sec: 42582.6, 60 sec: 39319.3, 300 sec: 38932.4). Total num frames: 63602688. Throughput: 0: 38782.2. Samples: 63671900. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-17 22:27:16,998][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:27:20,751][12883] Updated weights for policy 0, policy_version 3890 (0.0037) [2024-06-17 22:27:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 39594.7, 300 sec: 39043.9). Total num frames: 63815680. Throughput: 0: 38906.2. Samples: 63914720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-17 22:27:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:27:24,109][12862] Signal inference workers to stop experience collection... (900 times) [2024-06-17 22:27:24,136][12883] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-17 22:27:24,229][12862] Signal inference workers to resume experience collection... 
(900 times) [2024-06-17 22:27:24,230][12883] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-17 22:27:24,367][12883] Updated weights for policy 0, policy_version 3900 (0.0025) [2024-06-17 22:27:26,994][12645] Fps is (10 sec: 37697.2, 60 sec: 39050.0, 300 sec: 38988.4). Total num frames: 63979520. Throughput: 0: 38896.0. Samples: 64142580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 22:27:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:27:28,714][12883] Updated weights for policy 0, policy_version 3910 (0.0036) [2024-06-17 22:27:31,994][12645] Fps is (10 sec: 37680.6, 60 sec: 39048.1, 300 sec: 38989.1). Total num frames: 64192512. Throughput: 0: 38923.9. Samples: 64259680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 22:27:31,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:27:32,548][12883] Updated weights for policy 0, policy_version 3920 (0.0039) [2024-06-17 22:27:36,996][12645] Fps is (10 sec: 39312.4, 60 sec: 39047.1, 300 sec: 39043.6). Total num frames: 64372736. Throughput: 0: 38924.7. Samples: 64498060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:27:36,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:27:37,292][12883] Updated weights for policy 0, policy_version 3930 (0.0036) [2024-06-17 22:27:41,182][12883] Updated weights for policy 0, policy_version 3940 (0.0036) [2024-06-17 22:27:41,994][12645] Fps is (10 sec: 36047.8, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 64552960. Throughput: 0: 39129.5. Samples: 64729880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-17 22:27:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:27:45,453][12883] Updated weights for policy 0, policy_version 3950 (0.0050) [2024-06-17 22:27:46,994][12645] Fps is (10 sec: 39330.4, 60 sec: 39048.5, 300 sec: 38988.4). Total num frames: 64765952. Throughput: 0: 39075.1. Samples: 64849400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-17 22:27:46,996][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:27:49,736][12883] Updated weights for policy 0, policy_version 3960 (0.0038) [2024-06-17 22:27:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38775.5, 300 sec: 38877.3). Total num frames: 64946176. Throughput: 0: 39266.7. Samples: 65087700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-17 22:27:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:27:53,889][12883] Updated weights for policy 0, policy_version 3970 (0.0041) [2024-06-17 22:27:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 39048.6, 300 sec: 38988.4). Total num frames: 65175552. Throughput: 0: 39100.1. Samples: 65314500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-17 22:27:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:27:58,445][12883] Updated weights for policy 0, policy_version 3980 (0.0044) [2024-06-17 22:28:01,994][12645] Fps is (10 sec: 40958.7, 60 sec: 39048.4, 300 sec: 39043.9). Total num frames: 65355776. Throughput: 0: 39339.9. Samples: 65442060. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-17 22:28:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:28:02,403][12883] Updated weights for policy 0, policy_version 3990 (0.0036) [2024-06-17 22:28:06,591][12883] Updated weights for policy 0, policy_version 4000 (0.0040) [2024-06-17 22:28:06,994][12645] Fps is (10 sec: 36044.1, 60 sec: 39321.5, 300 sec: 38932.8). Total num frames: 65536000. Throughput: 0: 39220.8. Samples: 65679660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 22:28:06,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:28:10,362][12883] Updated weights for policy 0, policy_version 4010 (0.0036) [2024-06-17 22:28:11,994][12645] Fps is (10 sec: 40961.1, 60 sec: 39867.8, 300 sec: 39099.5). Total num frames: 65765376. Throughput: 0: 39271.1. Samples: 65909780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 22:28:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:28:14,577][12883] Updated weights for policy 0, policy_version 4020 (0.0037) [2024-06-17 22:28:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39050.9, 300 sec: 38988.4). Total num frames: 65945600. Throughput: 0: 39388.1. Samples: 66032120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-17 22:28:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:28:18,525][12883] Updated weights for policy 0, policy_version 4030 (0.0050) [2024-06-17 22:28:21,996][12645] Fps is (10 sec: 37674.7, 60 sec: 38774.1, 300 sec: 38988.1). Total num frames: 66142208. Throughput: 0: 39370.7. Samples: 66269740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 22:28:21,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:28:22,953][12883] Updated weights for policy 0, policy_version 4040 (0.0052) [2024-06-17 22:28:26,764][12883] Updated weights for policy 0, policy_version 4050 (0.0040) [2024-06-17 22:28:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39867.6, 300 sec: 39155.0). Total num frames: 66371584. Throughput: 0: 39582.9. Samples: 66511120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-17 22:28:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:28:30,854][12883] Updated weights for policy 0, policy_version 4060 (0.0026) [2024-06-17 22:28:31,994][12645] Fps is (10 sec: 39330.6, 60 sec: 39049.1, 300 sec: 38988.4). Total num frames: 66535424. Throughput: 0: 39461.4. Samples: 66625160. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-17 22:28:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:28:35,211][12883] Updated weights for policy 0, policy_version 4070 (0.0043) [2024-06-17 22:28:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39869.1, 300 sec: 39155.0). Total num frames: 66764800. Throughput: 0: 39474.8. Samples: 66864080. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 22:28:36,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:28:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004075_66764800.pth... [2024-06-17 22:28:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003503_57393152.pth [2024-06-17 22:28:39,737][12883] Updated weights for policy 0, policy_version 4080 (0.0047) [2024-06-17 22:28:41,994][12645] Fps is (10 sec: 39320.7, 60 sec: 39594.5, 300 sec: 39099.7). Total num frames: 66928640. Throughput: 0: 39632.3. Samples: 67097960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-17 22:28:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:28:43,249][12883] Updated weights for policy 0, policy_version 4090 (0.0042) [2024-06-17 22:28:46,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39321.5, 300 sec: 38988.4). Total num frames: 67125248. Throughput: 0: 39406.7. Samples: 67215360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-17 22:28:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:28:47,817][12883] Updated weights for policy 0, policy_version 4100 (0.0035) [2024-06-17 22:28:50,030][12862] Signal inference workers to stop experience collection... (950 times) [2024-06-17 22:28:50,084][12883] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-17 22:28:50,085][12862] Signal inference workers to resume experience collection... (950 times) [2024-06-17 22:28:50,101][12883] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-17 22:28:51,664][12883] Updated weights for policy 0, policy_version 4110 (0.0035) [2024-06-17 22:28:51,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39867.7, 300 sec: 39099.4). Total num frames: 67338240. Throughput: 0: 39452.6. Samples: 67455020. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 22:28:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:28:56,173][12883] Updated weights for policy 0, policy_version 4120 (0.0049) [2024-06-17 22:28:56,995][12645] Fps is (10 sec: 40954.1, 60 sec: 39320.5, 300 sec: 39043.7). Total num frames: 67534848. Throughput: 0: 39515.9. Samples: 67688060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-17 22:28:56,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:28:59,663][12883] Updated weights for policy 0, policy_version 4130 (0.0042) [2024-06-17 22:29:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39594.8, 300 sec: 39155.0). Total num frames: 67731456. Throughput: 0: 39361.3. Samples: 67803380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-17 22:29:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:29:04,360][12883] Updated weights for policy 0, policy_version 4140 (0.0043) [2024-06-17 22:29:06,994][12645] Fps is (10 sec: 37689.1, 60 sec: 39594.8, 300 sec: 39043.9). Total num frames: 67911680. Throughput: 0: 39325.0. Samples: 68039280. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-17 22:29:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 22:29:08,645][12883] Updated weights for policy 0, policy_version 4150 (0.0038) [2024-06-17 22:29:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.5, 300 sec: 39099.5). Total num frames: 68108288. Throughput: 0: 39058.4. Samples: 68268740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-17 22:29:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:29:12,850][12883] Updated weights for policy 0, policy_version 4160 (0.0040) [2024-06-17 22:29:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39321.6, 300 sec: 39099.4). Total num frames: 68304896. Throughput: 0: 39201.2. Samples: 68389220. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-17 22:29:17,003][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:29:17,179][12883] Updated weights for policy 0, policy_version 4170 (0.0043) [2024-06-17 22:29:21,428][12883] Updated weights for policy 0, policy_version 4180 (0.0058) [2024-06-17 22:29:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39323.0, 300 sec: 39155.0). Total num frames: 68501504. Throughput: 0: 38879.7. Samples: 68613660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-17 22:29:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:29:25,682][12883] Updated weights for policy 0, policy_version 4190 (0.0044) [2024-06-17 22:29:26,994][12645] Fps is (10 sec: 37683.9, 60 sec: 38502.5, 300 sec: 38989.1). Total num frames: 68681728. Throughput: 0: 38936.2. Samples: 68850080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-17 22:29:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:29:29,813][12883] Updated weights for policy 0, policy_version 4200 (0.0038) [2024-06-17 22:29:31,994][12645] Fps is (10 sec: 36044.7, 60 sec: 38775.4, 300 sec: 38877.3). Total num frames: 68861952. Throughput: 0: 38904.0. Samples: 68966040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 22:29:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:29:34,023][12883] Updated weights for policy 0, policy_version 4210 (0.0042) [2024-06-17 22:29:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 38775.6, 300 sec: 39155.0). Total num frames: 69091328. Throughput: 0: 38812.8. Samples: 69201600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 22:29:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:29:38,005][12883] Updated weights for policy 0, policy_version 4220 (0.0036) [2024-06-17 22:29:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 39048.6, 300 sec: 39099.5). Total num frames: 69271552. Throughput: 0: 38897.3. Samples: 69438380. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-17 22:29:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:29:43,019][12883] Updated weights for policy 0, policy_version 4230 (0.0036) [2024-06-17 22:29:46,222][12883] Updated weights for policy 0, policy_version 4240 (0.0034) [2024-06-17 22:29:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39321.7, 300 sec: 39155.0). Total num frames: 69484544. Throughput: 0: 38853.0. Samples: 69551760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-17 22:29:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:29:51,617][12883] Updated weights for policy 0, policy_version 4250 (0.0037) [2024-06-17 22:29:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 38502.3, 300 sec: 38988.4). Total num frames: 69648384. Throughput: 0: 38810.6. Samples: 69785760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 22:29:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:29:55,178][12883] Updated weights for policy 0, policy_version 4260 (0.0049) [2024-06-17 22:29:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 38776.5, 300 sec: 39043.9). Total num frames: 69861376. Throughput: 0: 38781.8. Samples: 70013920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-17 22:29:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:30:00,011][12883] Updated weights for policy 0, policy_version 4270 (0.0034) [2024-06-17 22:30:01,996][12645] Fps is (10 sec: 40951.3, 60 sec: 38774.1, 300 sec: 39099.2). Total num frames: 70057984. Throughput: 0: 38716.8. Samples: 70131560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-17 22:30:01,996][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:30:03,680][12883] Updated weights for policy 0, policy_version 4280 (0.0047) [2024-06-17 22:30:06,996][12645] Fps is (10 sec: 36036.6, 60 sec: 38500.9, 300 sec: 38988.1). Total num frames: 70221824. Throughput: 0: 38911.9. Samples: 70364780. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-17 22:30:06,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:30:08,360][12883] Updated weights for policy 0, policy_version 4290 (0.0046) [2024-06-17 22:30:11,792][12883] Updated weights for policy 0, policy_version 4300 (0.0048) [2024-06-17 22:30:11,994][12645] Fps is (10 sec: 39330.4, 60 sec: 39048.5, 300 sec: 39099.5). Total num frames: 70451200. Throughput: 0: 38790.6. Samples: 70595660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-17 22:30:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:30:16,346][12862] Signal inference workers to stop experience collection... (1000 times) [2024-06-17 22:30:16,356][12862] Signal inference workers to resume experience collection... (1000 times) [2024-06-17 22:30:16,383][12883] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-17 22:30:16,383][12883] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-17 22:30:16,544][12883] Updated weights for policy 0, policy_version 4310 (0.0041) [2024-06-17 22:30:16,994][12645] Fps is (10 sec: 40969.5, 60 sec: 38775.5, 300 sec: 39043.9). Total num frames: 70631424. Throughput: 0: 38949.0. Samples: 70718740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 22:30:16,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 22:30:20,476][12883] Updated weights for policy 0, policy_version 4320 (0.0040) [2024-06-17 22:30:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38775.4, 300 sec: 38988.4). Total num frames: 70828032. Throughput: 0: 38758.6. Samples: 70945740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 22:30:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:30:24,759][12883] Updated weights for policy 0, policy_version 4330 (0.0046) [2024-06-17 22:30:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39321.5, 300 sec: 39155.7). Total num frames: 71041024. Throughput: 0: 38699.5. Samples: 71179860. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 19.0) [2024-06-17 22:30:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:30:28,810][12883] Updated weights for policy 0, policy_version 4340 (0.0038) [2024-06-17 22:30:31,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39048.6, 300 sec: 38932.8). Total num frames: 71204864. Throughput: 0: 38713.8. Samples: 71293880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-17 22:30:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:30:33,233][12883] Updated weights for policy 0, policy_version 4350 (0.0034) [2024-06-17 22:30:36,782][12883] Updated weights for policy 0, policy_version 4360 (0.0032) [2024-06-17 22:30:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.5, 300 sec: 39155.0). Total num frames: 71434240. Throughput: 0: 38819.2. Samples: 71532620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-17 22:30:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:30:37,031][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004360_71434240.pth... [2024-06-17 22:30:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000003786_62029824.pth [2024-06-17 22:30:41,492][12883] Updated weights for policy 0, policy_version 4370 (0.0039) [2024-06-17 22:30:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39048.6, 300 sec: 39043.9). Total num frames: 71614464. Throughput: 0: 38863.6. Samples: 71762780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-17 22:30:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:30:45,577][12883] Updated weights for policy 0, policy_version 4380 (0.0038) [2024-06-17 22:30:46,994][12645] Fps is (10 sec: 36044.3, 60 sec: 38502.3, 300 sec: 39099.4). Total num frames: 71794688. Throughput: 0: 38686.2. Samples: 71872360. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:30:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:30:50,172][12883] Updated weights for policy 0, policy_version 4390 (0.0041) [2024-06-17 22:30:51,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38775.5, 300 sec: 38988.4). Total num frames: 71974912. Throughput: 0: 38623.3. Samples: 72102740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-17 22:30:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:30:54,275][12883] Updated weights for policy 0, policy_version 4400 (0.0037) [2024-06-17 22:30:56,994][12645] Fps is (10 sec: 37683.6, 60 sec: 38502.4, 300 sec: 38932.8). Total num frames: 72171520. Throughput: 0: 38835.5. Samples: 72343260. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-17 22:30:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:30:58,441][12883] Updated weights for policy 0, policy_version 4410 (0.0035) [2024-06-17 22:31:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 38776.8, 300 sec: 39043.9). Total num frames: 72384512. Throughput: 0: 38692.8. Samples: 72459920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 22:31:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:31:03,174][12883] Updated weights for policy 0, policy_version 4420 (0.0042) [2024-06-17 22:31:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38776.9, 300 sec: 38932.8). Total num frames: 72548352. Throughput: 0: 38675.2. Samples: 72686120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 22:31:07,000][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:31:07,230][12883] Updated weights for policy 0, policy_version 4430 (0.0039) [2024-06-17 22:31:11,139][12883] Updated weights for policy 0, policy_version 4440 (0.0029) [2024-06-17 22:31:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 38775.5, 300 sec: 39099.5). Total num frames: 72777728. Throughput: 0: 38605.5. Samples: 72917100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-17 22:31:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:31:15,227][12883] Updated weights for policy 0, policy_version 4450 (0.0045) [2024-06-17 22:31:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 38775.4, 300 sec: 39043.9). Total num frames: 72957952. Throughput: 0: 38891.4. Samples: 73044000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-17 22:31:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:31:19,498][12883] Updated weights for policy 0, policy_version 4460 (0.0050) [2024-06-17 22:31:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38775.5, 300 sec: 39044.2). Total num frames: 73154560. Throughput: 0: 38571.5. Samples: 73268340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 22:31:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:31:23,682][12883] Updated weights for policy 0, policy_version 4470 (0.0031) [2024-06-17 22:31:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38229.3, 300 sec: 38932.8). Total num frames: 73334784. Throughput: 0: 38820.8. Samples: 73509720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 22:31:26,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:31:28,053][12883] Updated weights for policy 0, policy_version 4480 (0.0038) [2024-06-17 22:31:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39048.4, 300 sec: 39043.9). Total num frames: 73547776. Throughput: 0: 38940.0. Samples: 73624660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 22:31:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:31:32,153][12883] Updated weights for policy 0, policy_version 4490 (0.0036) [2024-06-17 22:31:36,114][12883] Updated weights for policy 0, policy_version 4500 (0.0032) [2024-06-17 22:31:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 38775.4, 300 sec: 39099.4). Total num frames: 73760768. Throughput: 0: 39182.2. Samples: 73865940. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 23.0) [2024-06-17 22:31:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:31:40,892][12883] Updated weights for policy 0, policy_version 4510 (0.0045) [2024-06-17 22:31:41,994][12645] Fps is (10 sec: 37684.0, 60 sec: 38502.4, 300 sec: 38988.4). Total num frames: 73924608. Throughput: 0: 39103.6. Samples: 74102920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-17 22:31:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:31:44,539][12883] Updated weights for policy 0, policy_version 4520 (0.0031) [2024-06-17 22:31:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39321.6, 300 sec: 39099.4). Total num frames: 74153984. Throughput: 0: 39022.7. Samples: 74215940. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-17 22:31:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:31:49,430][12883] Updated weights for policy 0, policy_version 4530 (0.0031) [2024-06-17 22:31:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39321.6, 300 sec: 38988.4). Total num frames: 74334208. Throughput: 0: 39291.0. Samples: 74454220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-17 22:31:51,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:31:52,655][12883] Updated weights for policy 0, policy_version 4540 (0.0036) [2024-06-17 22:31:56,994][12645] Fps is (10 sec: 36045.2, 60 sec: 39048.5, 300 sec: 38988.4). Total num frames: 74514432. Throughput: 0: 39376.8. Samples: 74689060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-17 22:31:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:31:58,004][12883] Updated weights for policy 0, policy_version 4550 (0.0045) [2024-06-17 22:31:59,709][12862] Signal inference workers to stop experience collection... (1050 times) [2024-06-17 22:31:59,710][12862] Signal inference workers to resume experience collection... 
(1050 times) [2024-06-17 22:31:59,726][12883] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-17 22:31:59,727][12883] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-17 22:32:00,741][12883] Updated weights for policy 0, policy_version 4560 (0.0032) [2024-06-17 22:32:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39048.6, 300 sec: 39155.0). Total num frames: 74727424. Throughput: 0: 39277.5. Samples: 74811480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-17 22:32:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:32:05,971][12883] Updated weights for policy 0, policy_version 4570 (0.0042) [2024-06-17 22:32:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.6, 300 sec: 39155.0). Total num frames: 74924032. Throughput: 0: 39609.7. Samples: 75050780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-17 22:32:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:32:09,362][12883] Updated weights for policy 0, policy_version 4580 (0.0047) [2024-06-17 22:32:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39048.5, 300 sec: 39044.4). Total num frames: 75120640. Throughput: 0: 39201.9. Samples: 75273800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 22:32:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:32:14,160][12883] Updated weights for policy 0, policy_version 4590 (0.0035) [2024-06-17 22:32:16,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 38988.4). Total num frames: 75317248. Throughput: 0: 39324.2. Samples: 75394240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 22:32:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:32:17,391][12883] Updated weights for policy 0, policy_version 4600 (0.0033) [2024-06-17 22:32:21,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39048.5, 300 sec: 39043.9). Total num frames: 75497472. Throughput: 0: 39197.4. Samples: 75629820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:32:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:32:22,526][12883] Updated weights for policy 0, policy_version 4610 (0.0035) [2024-06-17 22:32:26,755][12883] Updated weights for policy 0, policy_version 4620 (0.0044) [2024-06-17 22:32:26,994][12645] Fps is (10 sec: 37682.4, 60 sec: 39321.6, 300 sec: 38988.4). Total num frames: 75694080. Throughput: 0: 39137.6. Samples: 75864120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 22:32:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:32:30,940][12883] Updated weights for policy 0, policy_version 4630 (0.0041) [2024-06-17 22:32:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39048.6, 300 sec: 39044.2). Total num frames: 75890688. Throughput: 0: 39192.9. Samples: 75979620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-17 22:32:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:32:34,668][12883] Updated weights for policy 0, policy_version 4640 (0.0041) [2024-06-17 22:32:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 39048.6, 300 sec: 39155.0). Total num frames: 76103680. Throughput: 0: 39091.7. Samples: 76213340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 22:32:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:32:37,137][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004646_76120064.pth... [2024-06-17 22:32:37,179][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004075_66764800.pth [2024-06-17 22:32:39,426][12883] Updated weights for policy 0, policy_version 4650 (0.0040) [2024-06-17 22:32:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39321.6, 300 sec: 39043.9). Total num frames: 76283904. Throughput: 0: 38985.4. Samples: 76443400. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 22:32:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:32:43,281][12883] Updated weights for policy 0, policy_version 4660 (0.0040) [2024-06-17 22:32:46,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38775.5, 300 sec: 39099.4). Total num frames: 76480512. Throughput: 0: 38947.9. Samples: 76564140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:32:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:32:47,808][12883] Updated weights for policy 0, policy_version 4670 (0.0038) [2024-06-17 22:32:51,681][12883] Updated weights for policy 0, policy_version 4680 (0.0042) [2024-06-17 22:32:52,000][12645] Fps is (10 sec: 39296.9, 60 sec: 39044.5, 300 sec: 38987.5). Total num frames: 76677120. Throughput: 0: 38771.6. Samples: 76795740. Policy #0 lag: (min: 1.0, avg: 11.8, max: 23.0) [2024-06-17 22:32:52,001][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:32:56,112][12883] Updated weights for policy 0, policy_version 4690 (0.0040) [2024-06-17 22:32:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 39043.9). Total num frames: 76873728. Throughput: 0: 38980.0. Samples: 77027900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:32:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:33:00,508][12883] Updated weights for policy 0, policy_version 4700 (0.0038) [2024-06-17 22:33:01,994][12645] Fps is (10 sec: 39346.1, 60 sec: 39048.5, 300 sec: 39099.5). Total num frames: 77070336. Throughput: 0: 38817.2. Samples: 77141020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-17 22:33:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:33:04,216][12883] Updated weights for policy 0, policy_version 4710 (0.0028) [2024-06-17 22:33:06,994][12645] Fps is (10 sec: 37682.8, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 77250560. Throughput: 0: 38755.1. Samples: 77373800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:33:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:33:08,601][12883] Updated weights for policy 0, policy_version 4720 (0.0029) [2024-06-17 22:33:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39048.5, 300 sec: 39043.9). Total num frames: 77463552. Throughput: 0: 38914.3. Samples: 77615260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-17 22:33:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:33:12,951][12883] Updated weights for policy 0, policy_version 4730 (0.0043) [2024-06-17 22:33:16,961][12883] Updated weights for policy 0, policy_version 4740 (0.0046) [2024-06-17 22:33:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39048.5, 300 sec: 39044.2). Total num frames: 77660160. Throughput: 0: 38884.9. Samples: 77729440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 22:33:16,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 22:33:20,974][12883] Updated weights for policy 0, policy_version 4750 (0.0035) [2024-06-17 22:33:21,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39321.7, 300 sec: 38932.8). Total num frames: 77856768. Throughput: 0: 38985.8. Samples: 77967700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-17 22:33:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:33:25,163][12883] Updated weights for policy 0, policy_version 4760 (0.0038) [2024-06-17 22:33:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 38775.6, 300 sec: 38932.8). Total num frames: 78020608. Throughput: 0: 39204.4. Samples: 78207600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 22:33:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:33:29,154][12883] Updated weights for policy 0, policy_version 4770 (0.0048) [2024-06-17 22:33:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39321.7, 300 sec: 38932.9). Total num frames: 78249984. Throughput: 0: 39081.0. Samples: 78322780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 22:33:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:33:33,426][12883] Updated weights for policy 0, policy_version 4780 (0.0034) [2024-06-17 22:33:36,996][12645] Fps is (10 sec: 40950.8, 60 sec: 38774.0, 300 sec: 38988.1). Total num frames: 78430208. Throughput: 0: 39333.8. Samples: 78565600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-17 22:33:36,997][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:33:37,666][12883] Updated weights for policy 0, policy_version 4790 (0.0031) [2024-06-17 22:33:41,507][12883] Updated weights for policy 0, policy_version 4800 (0.0034) [2024-06-17 22:33:41,996][12645] Fps is (10 sec: 39312.7, 60 sec: 39320.1, 300 sec: 39043.6). Total num frames: 78643200. Throughput: 0: 39067.8. Samples: 78786040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-17 22:33:41,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:33:46,152][12883] Updated weights for policy 0, policy_version 4810 (0.0029) [2024-06-17 22:33:46,994][12645] Fps is (10 sec: 40968.9, 60 sec: 39321.6, 300 sec: 38988.3). Total num frames: 78839808. Throughput: 0: 39423.1. Samples: 78915060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-17 22:33:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:33:47,758][12862] Signal inference workers to stop experience collection... (1100 times) [2024-06-17 22:33:47,795][12883] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-17 22:33:47,869][12862] Signal inference workers to resume experience collection... (1100 times) [2024-06-17 22:33:47,869][12883] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-17 22:33:50,297][12883] Updated weights for policy 0, policy_version 4820 (0.0041) [2024-06-17 22:33:51,994][12645] Fps is (10 sec: 36052.5, 60 sec: 38779.4, 300 sec: 38877.5). Total num frames: 79003648. Throughput: 0: 39278.6. Samples: 79141340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-17 22:33:51,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:33:54,955][12883] Updated weights for policy 0, policy_version 4830 (0.0037) [2024-06-17 22:33:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39594.6, 300 sec: 39043.9). Total num frames: 79249408. Throughput: 0: 39044.9. Samples: 79372280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 20.0) [2024-06-17 22:33:56,996][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:33:58,310][12883] Updated weights for policy 0, policy_version 4840 (0.0037) [2024-06-17 22:34:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 38775.4, 300 sec: 38932.8). Total num frames: 79396864. Throughput: 0: 39341.7. Samples: 79499820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-17 22:34:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:34:03,308][12883] Updated weights for policy 0, policy_version 4850 (0.0046) [2024-06-17 22:34:06,685][12883] Updated weights for policy 0, policy_version 4860 (0.0034) [2024-06-17 22:34:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39867.8, 300 sec: 39099.4). Total num frames: 79642624. Throughput: 0: 39129.8. Samples: 79728540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-17 22:34:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:34:11,446][12883] Updated weights for policy 0, policy_version 4870 (0.0031) [2024-06-17 22:34:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 38775.5, 300 sec: 38932.8). Total num frames: 79790080. Throughput: 0: 39154.6. Samples: 79969560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-17 22:34:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:34:15,260][12883] Updated weights for policy 0, policy_version 4880 (0.0025) [2024-06-17 22:34:16,994][12645] Fps is (10 sec: 36044.7, 60 sec: 39048.5, 300 sec: 38988.4). Total num frames: 80003072. Throughput: 0: 39081.3. Samples: 80081440. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-17 22:34:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:34:20,153][12883] Updated weights for policy 0, policy_version 4890 (0.0049) [2024-06-17 22:34:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 39594.6, 300 sec: 39155.0). Total num frames: 80232448. Throughput: 0: 39027.7. Samples: 80321760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-17 22:34:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:34:23,251][12883] Updated weights for policy 0, policy_version 4900 (0.0037) [2024-06-17 22:34:26,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39321.6, 300 sec: 39043.9). Total num frames: 80379904. Throughput: 0: 39550.0. Samples: 80565700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 21.0) [2024-06-17 22:34:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:34:28,057][12883] Updated weights for policy 0, policy_version 4910 (0.0030) [2024-06-17 22:34:31,096][12883] Updated weights for policy 0, policy_version 4920 (0.0039) [2024-06-17 22:34:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39594.7, 300 sec: 39099.4). Total num frames: 80625664. Throughput: 0: 39297.0. Samples: 80683420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-17 22:34:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:34:36,389][12883] Updated weights for policy 0, policy_version 4930 (0.0040) [2024-06-17 22:34:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 39596.2, 300 sec: 39099.5). Total num frames: 80805888. Throughput: 0: 39839.7. Samples: 80934120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 22:34:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:34:37,034][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004933_80822272.pth... 
[2024-06-17 22:34:37,101][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004360_71434240.pth [2024-06-17 22:34:39,590][12883] Updated weights for policy 0, policy_version 4940 (0.0037) [2024-06-17 22:34:41,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39050.0, 300 sec: 38988.4). Total num frames: 80986112. Throughput: 0: 39868.9. Samples: 81166380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 22:34:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:34:44,454][12883] Updated weights for policy 0, policy_version 4950 (0.0046) [2024-06-17 22:34:46,994][12645] Fps is (10 sec: 39320.9, 60 sec: 39321.6, 300 sec: 39155.0). Total num frames: 81199104. Throughput: 0: 39682.7. Samples: 81285540. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-17 22:34:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:34:47,606][12883] Updated weights for policy 0, policy_version 4960 (0.0035) [2024-06-17 22:34:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39594.7, 300 sec: 39043.9). Total num frames: 81379328. Throughput: 0: 39872.8. Samples: 81522820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 22:34:51,999][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:34:52,583][12883] Updated weights for policy 0, policy_version 4970 (0.0040) [2024-06-17 22:34:56,152][12883] Updated weights for policy 0, policy_version 4980 (0.0042) [2024-06-17 22:34:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.5, 300 sec: 39099.7). Total num frames: 81592320. Throughput: 0: 39445.8. Samples: 81744620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-17 22:34:57,000][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:35:01,140][12883] Updated weights for policy 0, policy_version 4990 (0.0044) [2024-06-17 22:35:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 39867.8, 300 sec: 39210.8). Total num frames: 81788928. Throughput: 0: 39666.3. Samples: 81866420. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-17 22:35:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:35:03,951][12862] Signal inference workers to stop experience collection... (1150 times) [2024-06-17 22:35:04,004][12883] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-17 22:35:04,010][12862] Signal inference workers to resume experience collection... (1150 times) [2024-06-17 22:35:04,019][12883] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-17 22:35:04,150][12883] Updated weights for policy 0, policy_version 5000 (0.0041) [2024-06-17 22:35:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 38775.5, 300 sec: 39043.9). Total num frames: 81969152. Throughput: 0: 39422.7. Samples: 82095780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-17 22:35:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:35:09,713][12883] Updated weights for policy 0, policy_version 5010 (0.0039) [2024-06-17 22:35:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.9, 300 sec: 39155.0). Total num frames: 82182144. Throughput: 0: 39269.9. Samples: 82332840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-17 22:35:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:35:12,587][12883] Updated weights for policy 0, policy_version 5020 (0.0033) [2024-06-17 22:35:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.7, 300 sec: 39155.0). Total num frames: 82378752. Throughput: 0: 39400.0. Samples: 82456420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-17 22:35:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:35:18,119][12883] Updated weights for policy 0, policy_version 5030 (0.0033) [2024-06-17 22:35:21,302][12883] Updated weights for policy 0, policy_version 5040 (0.0037) [2024-06-17 22:35:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39048.6, 300 sec: 39099.5). Total num frames: 82575360. Throughput: 0: 39004.8. Samples: 82689340. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 23.0) [2024-06-17 22:35:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:35:26,018][12883] Updated weights for policy 0, policy_version 5050 (0.0039) [2024-06-17 22:35:26,994][12645] Fps is (10 sec: 37682.6, 60 sec: 39594.6, 300 sec: 39155.0). Total num frames: 82755584. Throughput: 0: 39154.6. Samples: 82928340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:35:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:35:29,532][12883] Updated weights for policy 0, policy_version 5060 (0.0038) [2024-06-17 22:35:31,996][12645] Fps is (10 sec: 37674.8, 60 sec: 38774.0, 300 sec: 39043.6). Total num frames: 82952192. Throughput: 0: 38963.5. Samples: 83038980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 22:35:31,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:35:34,686][12883] Updated weights for policy 0, policy_version 5070 (0.0054) [2024-06-17 22:35:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39321.6, 300 sec: 39155.0). Total num frames: 83165184. Throughput: 0: 39090.3. Samples: 83281880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 22:35:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:35:38,078][12883] Updated weights for policy 0, policy_version 5080 (0.0028) [2024-06-17 22:35:41,994][12645] Fps is (10 sec: 39330.4, 60 sec: 39321.6, 300 sec: 39155.0). Total num frames: 83345408. Throughput: 0: 39481.4. Samples: 83521280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 22:35:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:35:42,555][12883] Updated weights for policy 0, policy_version 5090 (0.0035) [2024-06-17 22:35:46,179][12883] Updated weights for policy 0, policy_version 5100 (0.0032) [2024-06-17 22:35:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.7, 300 sec: 39321.6). Total num frames: 83574784. Throughput: 0: 39407.1. Samples: 83639740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-17 22:35:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:35:51,210][12883] Updated weights for policy 0, policy_version 5110 (0.0036) [2024-06-17 22:35:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39321.7, 300 sec: 39210.5). Total num frames: 83738624. Throughput: 0: 39418.3. Samples: 83869600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 22:35:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:35:54,868][12883] Updated weights for policy 0, policy_version 5120 (0.0026) [2024-06-17 22:35:56,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39048.5, 300 sec: 39155.0). Total num frames: 83935232. Throughput: 0: 39381.2. Samples: 84105000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 22:35:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:35:59,592][12883] Updated weights for policy 0, policy_version 5130 (0.0029) [2024-06-17 22:36:01,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38775.4, 300 sec: 39210.5). Total num frames: 84115456. Throughput: 0: 39195.5. Samples: 84220220. Policy #0 lag: (min: 2.0, avg: 11.0, max: 22.0) [2024-06-17 22:36:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:36:03,421][12883] Updated weights for policy 0, policy_version 5140 (0.0045) [2024-06-17 22:36:06,994][12645] Fps is (10 sec: 40960.7, 60 sec: 39594.7, 300 sec: 39210.5). Total num frames: 84344832. Throughput: 0: 39234.7. Samples: 84454900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-17 22:36:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 22:36:07,844][12883] Updated weights for policy 0, policy_version 5150 (0.0030) [2024-06-17 22:36:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39048.4, 300 sec: 39210.5). Total num frames: 84525056. Throughput: 0: 39216.9. Samples: 84693100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-17 22:36:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:36:12,180][12883] Updated weights for policy 0, policy_version 5160 (0.0038) [2024-06-17 22:36:16,135][12883] Updated weights for policy 0, policy_version 5170 (0.0035) [2024-06-17 22:36:16,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39048.6, 300 sec: 39210.5). Total num frames: 84721664. Throughput: 0: 39359.8. Samples: 84810080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-17 22:36:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:36:20,331][12883] Updated weights for policy 0, policy_version 5180 (0.0035) [2024-06-17 22:36:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39321.5, 300 sec: 39321.6). Total num frames: 84934656. Throughput: 0: 39171.4. Samples: 85044600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-17 22:36:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:36:22,079][12862] Signal inference workers to stop experience collection... (1200 times) [2024-06-17 22:36:22,080][12862] Signal inference workers to resume experience collection... (1200 times) [2024-06-17 22:36:22,129][12883] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-17 22:36:22,129][12883] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-17 22:36:24,462][12883] Updated weights for policy 0, policy_version 5190 (0.0046) [2024-06-17 22:36:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39321.7, 300 sec: 39210.5). Total num frames: 85114880. Throughput: 0: 39308.4. Samples: 85290160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 22:36:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:36:28,844][12883] Updated weights for policy 0, policy_version 5200 (0.0043) [2024-06-17 22:36:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39596.1, 300 sec: 39210.5). Total num frames: 85327872. Throughput: 0: 39227.1. Samples: 85404960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-17 22:36:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:36:32,633][12883] Updated weights for policy 0, policy_version 5210 (0.0043) [2024-06-17 22:36:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39048.4, 300 sec: 39266.0). Total num frames: 85508096. Throughput: 0: 39445.2. Samples: 85644640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-17 22:36:36,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:36:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005220_85524480.pth... [2024-06-17 22:36:37,007][12883] Updated weights for policy 0, policy_version 5220 (0.0045) [2024-06-17 22:36:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004646_76120064.pth [2024-06-17 22:36:40,693][12883] Updated weights for policy 0, policy_version 5230 (0.0039) [2024-06-17 22:36:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.6, 300 sec: 39266.1). Total num frames: 85737472. Throughput: 0: 39278.7. Samples: 85872540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 22:36:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:36:45,497][12883] Updated weights for policy 0, policy_version 5240 (0.0042) [2024-06-17 22:36:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 39321.6, 300 sec: 39321.6). Total num frames: 85934080. Throughput: 0: 39699.9. Samples: 86006720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 22:36:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:36:48,640][12883] Updated weights for policy 0, policy_version 5250 (0.0034) [2024-06-17 22:36:51,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39321.5, 300 sec: 39266.1). Total num frames: 86097920. Throughput: 0: 39630.6. Samples: 86238280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 22:36:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:36:53,565][12883] Updated weights for policy 0, policy_version 5260 (0.0034) [2024-06-17 22:36:56,730][12883] Updated weights for policy 0, policy_version 5270 (0.0041) [2024-06-17 22:36:56,995][12645] Fps is (10 sec: 40953.1, 60 sec: 40139.7, 300 sec: 39376.9). Total num frames: 86343680. Throughput: 0: 39563.9. Samples: 86473540. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-17 22:36:56,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:37:01,848][12883] Updated weights for policy 0, policy_version 5280 (0.0040) [2024-06-17 22:37:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39867.8, 300 sec: 39266.1). Total num frames: 86507520. Throughput: 0: 39780.8. Samples: 86600220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-17 22:37:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:37:04,904][12883] Updated weights for policy 0, policy_version 5290 (0.0052) [2024-06-17 22:37:06,994][12645] Fps is (10 sec: 39328.0, 60 sec: 39867.6, 300 sec: 39377.1). Total num frames: 86736896. Throughput: 0: 39798.7. Samples: 86835540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-17 22:37:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:37:10,419][12883] Updated weights for policy 0, policy_version 5300 (0.0032) [2024-06-17 22:37:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.8, 300 sec: 39321.6). Total num frames: 86917120. Throughput: 0: 39564.0. Samples: 87070540. Policy #0 lag: (min: 0.0, avg: 7.0, max: 21.0) [2024-06-17 22:37:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:37:13,230][12883] Updated weights for policy 0, policy_version 5310 (0.0046) [2024-06-17 22:37:16,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39594.6, 300 sec: 39321.6). Total num frames: 87097344. Throughput: 0: 39635.6. Samples: 87188560. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-17 22:37:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:37:18,474][12883] Updated weights for policy 0, policy_version 5320 (0.0041) [2024-06-17 22:37:21,353][12883] Updated weights for policy 0, policy_version 5330 (0.0058) [2024-06-17 22:37:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40140.9, 300 sec: 39488.2). Total num frames: 87343104. Throughput: 0: 39705.0. Samples: 87431360. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-17 22:37:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:37:26,498][12883] Updated weights for policy 0, policy_version 5340 (0.0048) [2024-06-17 22:37:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.7, 300 sec: 39321.6). Total num frames: 87490560. Throughput: 0: 39983.7. Samples: 87671800. Policy #0 lag: (min: 0.0, avg: 7.7, max: 21.0) [2024-06-17 22:37:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:37:28,861][12862] Signal inference workers to stop experience collection... (1250 times) [2024-06-17 22:37:28,879][12883] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-17 22:37:28,981][12862] Signal inference workers to resume experience collection... (1250 times) [2024-06-17 22:37:28,981][12883] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-17 22:37:29,747][12883] Updated weights for policy 0, policy_version 5350 (0.0029) [2024-06-17 22:37:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39867.8, 300 sec: 39377.1). Total num frames: 87719936. Throughput: 0: 39523.1. Samples: 87785260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-17 22:37:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:37:34,954][12883] Updated weights for policy 0, policy_version 5360 (0.0045) [2024-06-17 22:37:36,994][12645] Fps is (10 sec: 40959.2, 60 sec: 39867.7, 300 sec: 39377.1). Total num frames: 87900160. Throughput: 0: 39788.4. Samples: 88028760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-17 22:37:36,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:37:38,093][12883] Updated weights for policy 0, policy_version 5370 (0.0044) [2024-06-17 22:37:41,994][12645] Fps is (10 sec: 34406.3, 60 sec: 38775.5, 300 sec: 39266.1). Total num frames: 88064000. Throughput: 0: 39719.7. Samples: 88260860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-17 22:37:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:37:43,496][12883] Updated weights for policy 0, policy_version 5380 (0.0040) [2024-06-17 22:37:46,539][12883] Updated weights for policy 0, policy_version 5390 (0.0043) [2024-06-17 22:37:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 39867.7, 300 sec: 39489.0). Total num frames: 88326144. Throughput: 0: 39476.3. Samples: 88376660. Policy #0 lag: (min: 0.0, avg: 7.0, max: 19.0) [2024-06-17 22:37:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:37:51,654][12883] Updated weights for policy 0, policy_version 5400 (0.0024) [2024-06-17 22:37:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.7, 300 sec: 39377.1). Total num frames: 88489984. Throughput: 0: 39686.7. Samples: 88621440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-17 22:37:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:37:54,954][12883] Updated weights for policy 0, policy_version 5410 (0.0036) [2024-06-17 22:37:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39595.8, 300 sec: 39488.2). Total num frames: 88719360. Throughput: 0: 39493.8. Samples: 88847760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-17 22:37:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:37:59,951][12883] Updated weights for policy 0, policy_version 5420 (0.0047) [2024-06-17 22:38:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39594.7, 300 sec: 39432.7). Total num frames: 88883200. Throughput: 0: 39569.5. Samples: 88969180. 
Policy #0 lag: (min: 0.0, avg: 7.3, max: 20.0) [2024-06-17 22:38:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:38:03,796][12883] Updated weights for policy 0, policy_version 5430 (0.0055) [2024-06-17 22:38:06,994][12645] Fps is (10 sec: 32767.6, 60 sec: 38502.4, 300 sec: 39266.1). Total num frames: 89047040. Throughput: 0: 39119.5. Samples: 89191740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-17 22:38:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:38:08,375][12883] Updated weights for policy 0, policy_version 5440 (0.0030) [2024-06-17 22:38:11,955][12883] Updated weights for policy 0, policy_version 5450 (0.0043) [2024-06-17 22:38:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39594.6, 300 sec: 39432.7). Total num frames: 89292800. Throughput: 0: 39101.2. Samples: 89431360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-17 22:38:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:38:16,826][12883] Updated weights for policy 0, policy_version 5460 (0.0053) [2024-06-17 22:38:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39321.6, 300 sec: 39321.6). Total num frames: 89456640. Throughput: 0: 39303.1. Samples: 89553900. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-17 22:38:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:38:20,039][12883] Updated weights for policy 0, policy_version 5470 (0.0040) [2024-06-17 22:38:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39048.5, 300 sec: 39543.8). Total num frames: 89686016. Throughput: 0: 38977.9. Samples: 89782760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 22:38:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:38:25,246][12883] Updated weights for policy 0, policy_version 5480 (0.0050) [2024-06-17 22:38:27,000][12645] Fps is (10 sec: 40934.8, 60 sec: 39590.5, 300 sec: 39376.3). Total num frames: 89866240. Throughput: 0: 39188.8. Samples: 90024600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 22:38:27,001][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:38:28,463][12883] Updated weights for policy 0, policy_version 5490 (0.0044) [2024-06-17 22:38:31,994][12645] Fps is (10 sec: 36044.6, 60 sec: 38775.4, 300 sec: 39377.4). Total num frames: 90046464. Throughput: 0: 39074.2. Samples: 90135000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 22:38:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:38:33,503][12883] Updated weights for policy 0, policy_version 5500 (0.0047) [2024-06-17 22:38:36,697][12883] Updated weights for policy 0, policy_version 5510 (0.0031) [2024-06-17 22:38:36,994][12645] Fps is (10 sec: 40985.5, 60 sec: 39594.7, 300 sec: 39433.0). Total num frames: 90275840. Throughput: 0: 39012.9. Samples: 90377020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-17 22:38:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:38:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005511_90292224.pth... [2024-06-17 22:38:37,048][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000004933_80822272.pth [2024-06-17 22:38:41,576][12883] Updated weights for policy 0, policy_version 5520 (0.0053) [2024-06-17 22:38:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39867.8, 300 sec: 39377.1). Total num frames: 90456064. Throughput: 0: 39280.9. Samples: 90615400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 20.0) [2024-06-17 22:38:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:38:45,042][12883] Updated weights for policy 0, policy_version 5530 (0.0031) [2024-06-17 22:38:46,994][12645] Fps is (10 sec: 37682.9, 60 sec: 38775.4, 300 sec: 39488.2). Total num frames: 90652672. Throughput: 0: 39130.5. Samples: 90730060. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-17 22:38:46,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:38:49,496][12883] Updated weights for policy 0, policy_version 5540 (0.0036) [2024-06-17 22:38:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39321.6). Total num frames: 90849280. Throughput: 0: 39449.0. Samples: 90966940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-17 22:38:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:38:53,695][12883] Updated weights for policy 0, policy_version 5550 (0.0036) [2024-06-17 22:38:53,728][12862] Signal inference workers to stop experience collection... (1300 times) [2024-06-17 22:38:53,729][12862] Signal inference workers to resume experience collection... (1300 times) [2024-06-17 22:38:53,749][12883] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-17 22:38:53,749][12883] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-17 22:38:56,994][12645] Fps is (10 sec: 37683.9, 60 sec: 38502.4, 300 sec: 39432.7). Total num frames: 91029504. Throughput: 0: 39389.4. Samples: 91203880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 27.0) [2024-06-17 22:38:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:38:58,289][12883] Updated weights for policy 0, policy_version 5560 (0.0028) [2024-06-17 22:39:01,892][12883] Updated weights for policy 0, policy_version 5570 (0.0060) [2024-06-17 22:39:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.5, 300 sec: 39377.1). Total num frames: 91258880. Throughput: 0: 39279.1. Samples: 91321460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 22:39:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:39:06,812][12883] Updated weights for policy 0, policy_version 5580 (0.0044) [2024-06-17 22:39:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.8, 300 sec: 39488.2). Total num frames: 91439104. Throughput: 0: 39400.0. Samples: 91555760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:39:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:39:10,360][12883] Updated weights for policy 0, policy_version 5590 (0.0035) [2024-06-17 22:39:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 91635712. Throughput: 0: 39134.3. Samples: 91785400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 22:39:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 22:39:15,036][12883] Updated weights for policy 0, policy_version 5600 (0.0034) [2024-06-17 22:39:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39321.7, 300 sec: 39266.1). Total num frames: 91815936. Throughput: 0: 39495.6. Samples: 91912300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-17 22:39:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:39:18,923][12883] Updated weights for policy 0, policy_version 5610 (0.0044) [2024-06-17 22:39:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 38775.4, 300 sec: 39432.7). Total num frames: 92012544. Throughput: 0: 39024.4. Samples: 92133120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 22:39:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:39:23,207][12883] Updated weights for policy 0, policy_version 5620 (0.0046) [2024-06-17 22:39:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39325.7, 300 sec: 39321.6). Total num frames: 92225536. Throughput: 0: 39107.2. Samples: 92375220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 22:39:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:39:27,040][12883] Updated weights for policy 0, policy_version 5630 (0.0029) [2024-06-17 22:39:31,652][12883] Updated weights for policy 0, policy_version 5640 (0.0041) [2024-06-17 22:39:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39321.6, 300 sec: 39321.6). Total num frames: 92405760. Throughput: 0: 39144.1. Samples: 92491540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 22:39:32,003][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:39:35,471][12883] Updated weights for policy 0, policy_version 5650 (0.0041) [2024-06-17 22:39:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 92618752. Throughput: 0: 39138.7. Samples: 92728180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 22:39:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:39:40,015][12883] Updated weights for policy 0, policy_version 5660 (0.0040) [2024-06-17 22:39:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39321.6, 300 sec: 39377.1). Total num frames: 92815360. Throughput: 0: 39178.1. Samples: 92966900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-17 22:39:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:39:44,224][12883] Updated weights for policy 0, policy_version 5670 (0.0038) [2024-06-17 22:39:46,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39048.5, 300 sec: 39377.1). Total num frames: 92995584. Throughput: 0: 39172.9. Samples: 93084240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-17 22:39:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:39:47,931][12883] Updated weights for policy 0, policy_version 5680 (0.0031) [2024-06-17 22:39:51,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39321.7, 300 sec: 39377.2). Total num frames: 93208576. Throughput: 0: 39350.8. Samples: 93326540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 22:39:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:39:52,043][12883] Updated weights for policy 0, policy_version 5690 (0.0039) [2024-06-17 22:39:56,340][12883] Updated weights for policy 0, policy_version 5700 (0.0046) [2024-06-17 22:39:56,994][12645] Fps is (10 sec: 40961.1, 60 sec: 39594.7, 300 sec: 39377.2). Total num frames: 93405184. Throughput: 0: 39470.4. Samples: 93561560. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-17 22:39:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:40:00,737][12883] Updated weights for policy 0, policy_version 5710 (0.0056) [2024-06-17 22:40:01,994][12645] Fps is (10 sec: 37682.6, 60 sec: 38775.5, 300 sec: 39377.1). Total num frames: 93585408. Throughput: 0: 39237.7. Samples: 93678000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-17 22:40:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:40:04,409][12883] Updated weights for policy 0, policy_version 5720 (0.0040) [2024-06-17 22:40:06,994][12645] Fps is (10 sec: 40958.9, 60 sec: 39594.6, 300 sec: 39432.6). Total num frames: 93814784. Throughput: 0: 39654.6. Samples: 93917580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 22:40:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:40:08,840][12883] Updated weights for policy 0, policy_version 5730 (0.0036) [2024-06-17 22:40:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39321.6, 300 sec: 39377.1). Total num frames: 93995008. Throughput: 0: 39565.3. Samples: 94155660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-17 22:40:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:40:13,152][12883] Updated weights for policy 0, policy_version 5740 (0.0034) [2024-06-17 22:40:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39594.6, 300 sec: 39377.1). Total num frames: 94191616. Throughput: 0: 39598.1. Samples: 94273460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-17 22:40:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:40:17,320][12883] Updated weights for policy 0, policy_version 5750 (0.0032) [2024-06-17 22:40:21,248][12883] Updated weights for policy 0, policy_version 5760 (0.0044) [2024-06-17 22:40:21,556][12862] Signal inference workers to stop experience collection... 
(1350 times) [2024-06-17 22:40:21,613][12883] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-17 22:40:21,614][12862] Signal inference workers to resume experience collection... (1350 times) [2024-06-17 22:40:21,630][12883] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-17 22:40:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39867.8, 300 sec: 39488.2). Total num frames: 94404608. Throughput: 0: 39713.4. Samples: 94515280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-17 22:40:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:40:25,895][12883] Updated weights for policy 0, policy_version 5770 (0.0042) [2024-06-17 22:40:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39594.6, 300 sec: 39488.5). Total num frames: 94601216. Throughput: 0: 39742.3. Samples: 94755300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-17 22:40:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:40:29,703][12883] Updated weights for policy 0, policy_version 5780 (0.0036) [2024-06-17 22:40:31,994][12645] Fps is (10 sec: 39320.6, 60 sec: 39867.6, 300 sec: 39432.6). Total num frames: 94797824. Throughput: 0: 39784.9. Samples: 94874560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-17 22:40:31,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:40:33,668][12883] Updated weights for policy 0, policy_version 5790 (0.0052) [2024-06-17 22:40:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.7, 300 sec: 39488.2). Total num frames: 94994432. Throughput: 0: 39701.2. Samples: 95113100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-17 22:40:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:40:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005798_94994432.pth... 
[2024-06-17 22:40:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005220_85524480.pth [2024-06-17 22:40:37,730][12883] Updated weights for policy 0, policy_version 5800 (0.0045) [2024-06-17 22:40:41,521][12883] Updated weights for policy 0, policy_version 5810 (0.0041) [2024-06-17 22:40:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39594.7, 300 sec: 39377.1). Total num frames: 95191040. Throughput: 0: 39692.8. Samples: 95347740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-17 22:40:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:40:46,396][12883] Updated weights for policy 0, policy_version 5820 (0.0048) [2024-06-17 22:40:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 39543.8). Total num frames: 95404032. Throughput: 0: 39869.0. Samples: 95472100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-17 22:40:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:40:49,891][12883] Updated weights for policy 0, policy_version 5830 (0.0047) [2024-06-17 22:40:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39321.5, 300 sec: 39432.7). Total num frames: 95567872. Throughput: 0: 39649.0. Samples: 95701780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-17 22:40:52,000][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:40:54,438][12883] Updated weights for policy 0, policy_version 5840 (0.0036) [2024-06-17 22:40:56,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39594.5, 300 sec: 39543.7). Total num frames: 95780864. Throughput: 0: 39476.4. Samples: 95932100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-17 22:40:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:40:58,046][12883] Updated weights for policy 0, policy_version 5850 (0.0049) [2024-06-17 22:41:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39594.7, 300 sec: 39377.1). Total num frames: 95961088. Throughput: 0: 39654.3. Samples: 96057900. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-17 22:41:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:41:02,622][12883] Updated weights for policy 0, policy_version 5860 (0.0046) [2024-06-17 22:41:06,757][12883] Updated weights for policy 0, policy_version 5870 (0.0043) [2024-06-17 22:41:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 39488.2). Total num frames: 96174080. Throughput: 0: 39375.1. Samples: 96287160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 22:41:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:41:11,160][12883] Updated weights for policy 0, policy_version 5880 (0.0037) [2024-06-17 22:41:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.7, 300 sec: 39488.2). Total num frames: 96370688. Throughput: 0: 39164.9. Samples: 96517720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-17 22:41:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:41:15,468][12883] Updated weights for policy 0, policy_version 5890 (0.0062) [2024-06-17 22:41:16,994][12645] Fps is (10 sec: 37682.4, 60 sec: 39321.6, 300 sec: 39377.1). Total num frames: 96550912. Throughput: 0: 39240.5. Samples: 96640380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-17 22:41:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:41:19,203][12883] Updated weights for policy 0, policy_version 5900 (0.0047) [2024-06-17 22:41:21,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 96747520. Throughput: 0: 39172.4. Samples: 96875860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 22:41:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:41:23,713][12883] Updated weights for policy 0, policy_version 5910 (0.0036) [2024-06-17 22:41:26,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39321.6, 300 sec: 39432.7). Total num frames: 96960512. Throughput: 0: 39210.3. Samples: 97112200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-17 22:41:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:41:27,191][12883] Updated weights for policy 0, policy_version 5920 (0.0040) [2024-06-17 22:41:31,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38775.6, 300 sec: 39377.2). Total num frames: 97124352. Throughput: 0: 38893.7. Samples: 97222320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-17 22:41:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:41:32,998][12883] Updated weights for policy 0, policy_version 5930 (0.0042) [2024-06-17 22:41:35,547][12883] Updated weights for policy 0, policy_version 5940 (0.0040) [2024-06-17 22:41:36,996][12645] Fps is (10 sec: 40950.5, 60 sec: 39593.2, 300 sec: 39432.4). Total num frames: 97370112. Throughput: 0: 39061.6. Samples: 97459640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-17 22:41:36,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:41:41,116][12883] Updated weights for policy 0, policy_version 5950 (0.0034) [2024-06-17 22:41:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 38502.4, 300 sec: 39210.5). Total num frames: 97501184. Throughput: 0: 39389.0. Samples: 97704600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-17 22:41:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:41:44,133][12883] Updated weights for policy 0, policy_version 5960 (0.0035) [2024-06-17 22:41:46,994][12645] Fps is (10 sec: 37691.5, 60 sec: 39048.5, 300 sec: 39488.2). Total num frames: 97746944. Throughput: 0: 38973.3. Samples: 97811700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-17 22:41:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:41:49,576][12883] Updated weights for policy 0, policy_version 5970 (0.0030) [2024-06-17 22:41:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 39594.6, 300 sec: 39321.8). Total num frames: 97943552. Throughput: 0: 39321.2. Samples: 98056620. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-17 22:41:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:41:52,228][12883] Updated weights for policy 0, policy_version 5980 (0.0049) [2024-06-17 22:41:52,836][12862] Signal inference workers to stop experience collection... (1400 times) [2024-06-17 22:41:52,880][12883] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-17 22:41:52,885][12862] Signal inference workers to resume experience collection... (1400 times) [2024-06-17 22:41:52,892][12883] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-17 22:41:56,996][12645] Fps is (10 sec: 36037.0, 60 sec: 38774.1, 300 sec: 39321.3). Total num frames: 98107392. Throughput: 0: 39408.2. Samples: 98291180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 22:41:56,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:41:57,819][12883] Updated weights for policy 0, policy_version 5990 (0.0039) [2024-06-17 22:42:00,828][12883] Updated weights for policy 0, policy_version 6000 (0.0037) [2024-06-17 22:42:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39867.7, 300 sec: 39377.1). Total num frames: 98353152. Throughput: 0: 39328.5. Samples: 98410160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 22:42:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:42:05,998][12883] Updated weights for policy 0, policy_version 6010 (0.0041) [2024-06-17 22:42:06,994][12645] Fps is (10 sec: 40968.7, 60 sec: 39048.4, 300 sec: 39321.6). Total num frames: 98516992. Throughput: 0: 39455.1. Samples: 98651340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 23.0) [2024-06-17 22:42:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:42:09,130][12883] Updated weights for policy 0, policy_version 6020 (0.0029) [2024-06-17 22:42:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39321.6, 300 sec: 39432.7). Total num frames: 98729984. Throughput: 0: 39147.5. Samples: 98873840. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 22:42:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:42:14,362][12883] Updated weights for policy 0, policy_version 6030 (0.0031) [2024-06-17 22:42:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39594.8, 300 sec: 39266.1). Total num frames: 98926592. Throughput: 0: 39530.7. Samples: 99001200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-17 22:42:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:42:17,195][12883] Updated weights for policy 0, policy_version 6040 (0.0044) [2024-06-17 22:42:21,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39048.6, 300 sec: 39321.6). Total num frames: 99090432. Throughput: 0: 39555.8. Samples: 99239560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-17 22:42:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:42:22,722][12883] Updated weights for policy 0, policy_version 6050 (0.0034) [2024-06-17 22:42:25,226][12883] Updated weights for policy 0, policy_version 6060 (0.0044) [2024-06-17 22:42:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39594.6, 300 sec: 39377.1). Total num frames: 99336192. Throughput: 0: 39252.0. Samples: 99470940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-17 22:42:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:42:30,874][12883] Updated weights for policy 0, policy_version 6070 (0.0048) [2024-06-17 22:42:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.7, 300 sec: 39321.6). Total num frames: 99500032. Throughput: 0: 39652.6. Samples: 99596060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-17 22:42:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:42:33,833][12883] Updated weights for policy 0, policy_version 6080 (0.0038) [2024-06-17 22:42:36,994][12645] Fps is (10 sec: 36044.5, 60 sec: 38776.9, 300 sec: 39432.7). Total num frames: 99696640. Throughput: 0: 39326.2. Samples: 99826300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-17 22:42:36,999][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:42:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006085_99696640.pth... [2024-06-17 22:42:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005511_90292224.pth [2024-06-17 22:42:39,010][12883] Updated weights for policy 0, policy_version 6090 (0.0038) [2024-06-17 22:42:41,994][12645] Fps is (10 sec: 42597.4, 60 sec: 40413.7, 300 sec: 39321.6). Total num frames: 99926016. Throughput: 0: 39317.4. Samples: 100060380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-17 22:42:41,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:42:42,159][12883] Updated weights for policy 0, policy_version 6100 (0.0036) [2024-06-17 22:42:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39048.6, 300 sec: 39321.6). Total num frames: 100089856. Throughput: 0: 39301.4. Samples: 100178720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-17 22:42:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:42:47,314][12883] Updated weights for policy 0, policy_version 6110 (0.0037) [2024-06-17 22:42:50,762][12883] Updated weights for policy 0, policy_version 6120 (0.0040) [2024-06-17 22:42:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.7, 300 sec: 39321.6). Total num frames: 100319232. Throughput: 0: 39233.4. Samples: 100416840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 22:42:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:42:55,728][12883] Updated weights for policy 0, policy_version 6130 (0.0032) [2024-06-17 22:42:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40142.2, 300 sec: 39432.7). Total num frames: 100515840. Throughput: 0: 39426.1. Samples: 100648020. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-17 22:42:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:42:58,727][12883] Updated weights for policy 0, policy_version 6140 (0.0039) [2024-06-17 22:43:01,994][12645] Fps is (10 sec: 34406.7, 60 sec: 38502.5, 300 sec: 39377.2). Total num frames: 100663296. Throughput: 0: 39216.9. Samples: 100765960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-17 22:43:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:43:03,873][12883] Updated weights for policy 0, policy_version 6150 (0.0048) [2024-06-17 22:43:06,818][12883] Updated weights for policy 0, policy_version 6160 (0.0045) [2024-06-17 22:43:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40140.8, 300 sec: 39432.7). Total num frames: 100925440. Throughput: 0: 39316.8. Samples: 101008820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-17 22:43:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:43:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39048.5, 300 sec: 39377.1). Total num frames: 101072896. Throughput: 0: 39606.6. Samples: 101253240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-17 22:43:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:43:12,122][12883] Updated weights for policy 0, policy_version 6170 (0.0036) [2024-06-17 22:43:14,307][12862] Signal inference workers to stop experience collection... (1450 times) [2024-06-17 22:43:14,364][12883] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-17 22:43:14,421][12862] Signal inference workers to resume experience collection... (1450 times) [2024-06-17 22:43:14,422][12883] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-17 22:43:14,943][12883] Updated weights for policy 0, policy_version 6180 (0.0046) [2024-06-17 22:43:16,996][12645] Fps is (10 sec: 36037.0, 60 sec: 39320.1, 300 sec: 39321.3). Total num frames: 101285888. Throughput: 0: 39227.3. 
Samples: 101361380. Policy #0 lag: (min: 2.0, avg: 9.7, max: 20.0) [2024-06-17 22:43:16,997][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:43:20,354][12883] Updated weights for policy 0, policy_version 6190 (0.0039) [2024-06-17 22:43:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.7, 300 sec: 39378.0). Total num frames: 101482496. Throughput: 0: 39442.2. Samples: 101601200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-17 22:43:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:43:23,482][12883] Updated weights for policy 0, policy_version 6200 (0.0041) [2024-06-17 22:43:26,994][12645] Fps is (10 sec: 37691.5, 60 sec: 38775.5, 300 sec: 39377.1). Total num frames: 101662720. Throughput: 0: 39587.7. Samples: 101841820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 24.0) [2024-06-17 22:43:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:43:28,634][12883] Updated weights for policy 0, policy_version 6210 (0.0042) [2024-06-17 22:43:31,571][12883] Updated weights for policy 0, policy_version 6220 (0.0033) [2024-06-17 22:43:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40140.8, 300 sec: 39432.7). Total num frames: 101908480. Throughput: 0: 39482.7. Samples: 101955440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-17 22:43:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:43:36,818][12883] Updated weights for policy 0, policy_version 6230 (0.0044) [2024-06-17 22:43:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39594.7, 300 sec: 39377.1). Total num frames: 102072320. Throughput: 0: 39662.7. Samples: 102201660. Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-06-17 22:43:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:43:40,552][12883] Updated weights for policy 0, policy_version 6240 (0.0039) [2024-06-17 22:43:41,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39048.6, 300 sec: 39377.1). Total num frames: 102268928. Throughput: 0: 39624.5. Samples: 102431120. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 22.0) [2024-06-17 22:43:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:43:44,952][12883] Updated weights for policy 0, policy_version 6250 (0.0038) [2024-06-17 22:43:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.7, 300 sec: 39377.1). Total num frames: 102465536. Throughput: 0: 39774.2. Samples: 102555800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 22:43:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:43:48,647][12883] Updated weights for policy 0, policy_version 6260 (0.0038) [2024-06-17 22:43:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 38775.5, 300 sec: 39377.1). Total num frames: 102645760. Throughput: 0: 39660.5. Samples: 102793540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-17 22:43:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:43:53,603][12883] Updated weights for policy 0, policy_version 6270 (0.0040) [2024-06-17 22:43:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39321.6, 300 sec: 39377.1). Total num frames: 102875136. Throughput: 0: 39199.5. Samples: 103017220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-17 22:43:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:43:57,566][12883] Updated weights for policy 0, policy_version 6280 (0.0036) [2024-06-17 22:44:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39594.6, 300 sec: 39321.6). Total num frames: 103038976. Throughput: 0: 39669.9. Samples: 103146440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 22:44:01,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:44:02,130][12883] Updated weights for policy 0, policy_version 6290 (0.0041) [2024-06-17 22:44:05,461][12883] Updated weights for policy 0, policy_version 6300 (0.0035) [2024-06-17 22:44:06,996][12645] Fps is (10 sec: 37674.9, 60 sec: 38774.1, 300 sec: 39376.8). Total num frames: 103251968. Throughput: 0: 39567.9. Samples: 103381840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:44:06,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:44:10,234][12883] Updated weights for policy 0, policy_version 6310 (0.0038) [2024-06-17 22:44:11,995][12645] Fps is (10 sec: 44229.7, 60 sec: 40139.7, 300 sec: 39543.5). Total num frames: 103481344. Throughput: 0: 39484.7. Samples: 103618700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:44:11,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:44:13,995][12883] Updated weights for policy 0, policy_version 6320 (0.0047) [2024-06-17 22:44:16,996][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39432.4). Total num frames: 103645184. Throughput: 0: 39542.9. Samples: 103734960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-17 22:44:16,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:44:18,533][12883] Updated weights for policy 0, policy_version 6330 (0.0044) [2024-06-17 22:44:21,994][12645] Fps is (10 sec: 39328.0, 60 sec: 39867.7, 300 sec: 39488.2). Total num frames: 103874560. Throughput: 0: 39377.8. Samples: 103973660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-17 22:44:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:44:22,001][12883] Updated weights for policy 0, policy_version 6340 (0.0048) [2024-06-17 22:44:26,612][12883] Updated weights for policy 0, policy_version 6350 (0.0036) [2024-06-17 22:44:26,994][12645] Fps is (10 sec: 40969.4, 60 sec: 39867.8, 300 sec: 39488.2). Total num frames: 104054784. Throughput: 0: 39653.8. Samples: 104215540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 22:44:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:44:30,070][12883] Updated weights for policy 0, policy_version 6360 (0.0038) [2024-06-17 22:44:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 104251392. Throughput: 0: 39516.0. Samples: 104334020. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 20.0) [2024-06-17 22:44:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:44:34,162][12862] Signal inference workers to stop experience collection... (1500 times) [2024-06-17 22:44:34,162][12862] Signal inference workers to resume experience collection... (1500 times) [2024-06-17 22:44:34,185][12883] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-17 22:44:34,185][12883] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-17 22:44:34,617][12883] Updated weights for policy 0, policy_version 6370 (0.0034) [2024-06-17 22:44:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39867.7, 300 sec: 39488.2). Total num frames: 104464384. Throughput: 0: 39579.4. Samples: 104574620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 26.0) [2024-06-17 22:44:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:44:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006376_104464384.pth... [2024-06-17 22:44:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000005798_94994432.pth [2024-06-17 22:44:38,116][12883] Updated weights for policy 0, policy_version 6380 (0.0035) [2024-06-17 22:44:41,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39048.5, 300 sec: 39377.2). Total num frames: 104611840. Throughput: 0: 39950.2. Samples: 104814980. Policy #0 lag: (min: 1.0, avg: 10.7, max: 26.0) [2024-06-17 22:44:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:44:43,122][12883] Updated weights for policy 0, policy_version 6390 (0.0035) [2024-06-17 22:44:46,297][12883] Updated weights for policy 0, policy_version 6400 (0.0038) [2024-06-17 22:44:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.7, 300 sec: 39488.2). Total num frames: 104857600. Throughput: 0: 39528.0. Samples: 104925200. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-17 22:44:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:44:51,466][12883] Updated weights for policy 0, policy_version 6410 (0.0040) [2024-06-17 22:44:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.7, 300 sec: 39432.7). Total num frames: 105037824. Throughput: 0: 39734.8. Samples: 105169820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-17 22:44:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:44:54,385][12883] Updated weights for policy 0, policy_version 6420 (0.0032) [2024-06-17 22:44:56,994][12645] Fps is (10 sec: 37683.9, 60 sec: 39321.7, 300 sec: 39488.2). Total num frames: 105234432. Throughput: 0: 39588.2. Samples: 105400100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-17 22:44:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:44:59,739][12883] Updated weights for policy 0, policy_version 6430 (0.0040) [2024-06-17 22:45:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40414.0, 300 sec: 39488.2). Total num frames: 105463808. Throughput: 0: 39782.5. Samples: 105525080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-17 22:45:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:45:03,230][12883] Updated weights for policy 0, policy_version 6440 (0.0035) [2024-06-17 22:45:06,997][12645] Fps is (10 sec: 37671.7, 60 sec: 39321.1, 300 sec: 39376.7). Total num frames: 105611264. Throughput: 0: 39653.4. Samples: 105758180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 22.0) [2024-06-17 22:45:06,997][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:45:08,345][12883] Updated weights for policy 0, policy_version 6450 (0.0041) [2024-06-17 22:45:11,259][12883] Updated weights for policy 0, policy_version 6460 (0.0047) [2024-06-17 22:45:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39595.8, 300 sec: 39543.8). Total num frames: 105857024. Throughput: 0: 39410.2. Samples: 105989000. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-17 22:45:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:45:16,473][12883] Updated weights for policy 0, policy_version 6470 (0.0031) [2024-06-17 22:45:16,994][12645] Fps is (10 sec: 40972.5, 60 sec: 39596.2, 300 sec: 39377.1). Total num frames: 106020864. Throughput: 0: 39609.8. Samples: 106116460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 21.0) [2024-06-17 22:45:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:45:19,698][12883] Updated weights for policy 0, policy_version 6480 (0.0029) [2024-06-17 22:45:21,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39321.7, 300 sec: 39432.7). Total num frames: 106233856. Throughput: 0: 39395.7. Samples: 106347420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-17 22:45:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:45:24,560][12883] Updated weights for policy 0, policy_version 6490 (0.0027) [2024-06-17 22:45:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.8, 300 sec: 39488.2). Total num frames: 106446848. Throughput: 0: 39537.8. Samples: 106594180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:45:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:45:27,619][12883] Updated weights for policy 0, policy_version 6500 (0.0037) [2024-06-17 22:45:31,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39048.5, 300 sec: 39321.6). Total num frames: 106594304. Throughput: 0: 39711.7. Samples: 106712220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 22:45:31,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:45:32,741][12883] Updated weights for policy 0, policy_version 6510 (0.0039) [2024-06-17 22:45:36,070][12883] Updated weights for policy 0, policy_version 6520 (0.0027) [2024-06-17 22:45:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.9, 300 sec: 39543.8). Total num frames: 106856448. Throughput: 0: 39664.1. Samples: 106954700. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-17 22:45:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:45:40,820][12883] Updated weights for policy 0, policy_version 6530 (0.0038) [2024-06-17 22:45:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 40413.9, 300 sec: 39432.7). Total num frames: 107036672. Throughput: 0: 39781.2. Samples: 107190260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-17 22:45:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:45:44,386][12883] Updated weights for policy 0, policy_version 6540 (0.0049) [2024-06-17 22:45:46,994][12645] Fps is (10 sec: 34405.8, 60 sec: 39048.5, 300 sec: 39432.7). Total num frames: 107200512. Throughput: 0: 39610.5. Samples: 107307560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-17 22:45:47,003][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:45:49,230][12883] Updated weights for policy 0, policy_version 6550 (0.0034) [2024-06-17 22:45:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.8, 300 sec: 39543.8). Total num frames: 107446272. Throughput: 0: 39830.1. Samples: 107550420. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-17 22:45:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:45:52,589][12883] Updated weights for policy 0, policy_version 6560 (0.0035) [2024-06-17 22:45:56,295][12862] Signal inference workers to stop experience collection... (1550 times) [2024-06-17 22:45:56,295][12862] Signal inference workers to resume experience collection... (1550 times) [2024-06-17 22:45:56,336][12883] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-17 22:45:56,336][12883] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-17 22:45:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39594.6, 300 sec: 39488.2). Total num frames: 107610112. Throughput: 0: 40048.5. Samples: 107791180. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-17 22:45:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:45:57,550][12883] Updated weights for policy 0, policy_version 6570 (0.0032) [2024-06-17 22:46:00,607][12883] Updated weights for policy 0, policy_version 6580 (0.0025) [2024-06-17 22:46:01,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39321.5, 300 sec: 39488.2). Total num frames: 107823104. Throughput: 0: 39822.5. Samples: 107908480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-17 22:46:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:46:05,635][12883] Updated weights for policy 0, policy_version 6590 (0.0045) [2024-06-17 22:46:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40415.8, 300 sec: 39543.7). Total num frames: 108036096. Throughput: 0: 40073.3. Samples: 108150720. Policy #0 lag: (min: 0.0, avg: 13.1, max: 24.0) [2024-06-17 22:46:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:46:08,758][12883] Updated weights for policy 0, policy_version 6600 (0.0032) [2024-06-17 22:46:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39048.6, 300 sec: 39488.2). Total num frames: 108199936. Throughput: 0: 39814.6. Samples: 108385840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-17 22:46:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:46:13,982][12883] Updated weights for policy 0, policy_version 6610 (0.0038) [2024-06-17 22:46:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.8, 300 sec: 39654.8). Total num frames: 108445696. Throughput: 0: 39714.7. Samples: 108499380. Policy #0 lag: (min: 2.0, avg: 10.5, max: 25.0) [2024-06-17 22:46:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:46:17,057][12883] Updated weights for policy 0, policy_version 6620 (0.0031) [2024-06-17 22:46:21,996][12883] Updated weights for policy 0, policy_version 6630 (0.0036) [2024-06-17 22:46:21,996][12645] Fps is (10 sec: 42588.8, 60 sec: 39866.2, 300 sec: 39543.4). 
Total num frames: 108625920. Throughput: 0: 39760.6. Samples: 108744020. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-17 22:46:21,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:46:25,307][12883] Updated weights for policy 0, policy_version 6640 (0.0039) [2024-06-17 22:46:26,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39594.6, 300 sec: 39654.8). Total num frames: 108822528. Throughput: 0: 39638.2. Samples: 108973980. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-17 22:46:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:46:30,075][12883] Updated weights for policy 0, policy_version 6650 (0.0030) [2024-06-17 22:46:31,994][12645] Fps is (10 sec: 37691.7, 60 sec: 40140.8, 300 sec: 39433.0). Total num frames: 109002752. Throughput: 0: 39711.7. Samples: 109094580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-17 22:46:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:46:33,734][12883] Updated weights for policy 0, policy_version 6660 (0.0039) [2024-06-17 22:46:36,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39048.5, 300 sec: 39654.8). Total num frames: 109199360. Throughput: 0: 39606.8. Samples: 109332720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-17 22:46:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:46:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006665_109199360.pth... [2024-06-17 22:46:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006085_99696640.pth [2024-06-17 22:46:38,697][12883] Updated weights for policy 0, policy_version 6670 (0.0034) [2024-06-17 22:46:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 39867.8, 300 sec: 39599.3). Total num frames: 109428736. Throughput: 0: 39501.4. Samples: 109568740. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 22:46:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:46:42,015][12883] Updated weights for policy 0, policy_version 6680 (0.0029) [2024-06-17 22:46:46,727][12883] Updated weights for policy 0, policy_version 6690 (0.0031) [2024-06-17 22:46:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40413.9, 300 sec: 39599.3). Total num frames: 109625344. Throughput: 0: 39761.3. Samples: 109697740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-17 22:46:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:46:50,204][12883] Updated weights for policy 0, policy_version 6700 (0.0037) [2024-06-17 22:46:51,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39321.6, 300 sec: 39655.1). Total num frames: 109805568. Throughput: 0: 39487.6. Samples: 109927660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-17 22:46:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:46:54,874][12883] Updated weights for policy 0, policy_version 6710 (0.0034) [2024-06-17 22:46:56,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40140.9, 300 sec: 39543.8). Total num frames: 110018560. Throughput: 0: 39596.0. Samples: 110167660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:46:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:46:58,693][12883] Updated weights for policy 0, policy_version 6720 (0.0053) [2024-06-17 22:47:01,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39321.7, 300 sec: 39543.8). Total num frames: 110182400. Throughput: 0: 39729.8. Samples: 110287220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-17 22:47:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:47:03,290][12883] Updated weights for policy 0, policy_version 6730 (0.0051) [2024-06-17 22:47:06,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39321.7, 300 sec: 39543.8). Total num frames: 110395392. Throughput: 0: 39362.0. Samples: 110515220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-17 22:47:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:47:07,750][12883] Updated weights for policy 0, policy_version 6740 (0.0037) [2024-06-17 22:47:11,347][12883] Updated weights for policy 0, policy_version 6750 (0.0040) [2024-06-17 22:47:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40140.7, 300 sec: 39599.3). Total num frames: 110608384. Throughput: 0: 39517.3. Samples: 110752260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 20.0) [2024-06-17 22:47:11,996][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:47:15,741][12883] Updated weights for policy 0, policy_version 6760 (0.0026) [2024-06-17 22:47:16,994][12645] Fps is (10 sec: 39320.9, 60 sec: 39048.5, 300 sec: 39654.8). Total num frames: 110788608. Throughput: 0: 39666.6. Samples: 110879580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 22:47:16,998][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:47:19,503][12883] Updated weights for policy 0, policy_version 6770 (0.0044) [2024-06-17 22:47:21,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39323.1, 300 sec: 39488.2). Total num frames: 110985216. Throughput: 0: 39560.9. Samples: 111112960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 22:47:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:47:24,276][12883] Updated weights for policy 0, policy_version 6780 (0.0040) [2024-06-17 22:47:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 111181824. Throughput: 0: 39768.8. Samples: 111358340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-17 22:47:26,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-17 22:47:27,010][12862] Saving new best policy, reward=0.013! [2024-06-17 22:47:27,712][12883] Updated weights for policy 0, policy_version 6790 (0.0041) [2024-06-17 22:47:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39867.7, 300 sec: 39654.8). Total num frames: 111394816. 
Throughput: 0: 39492.5. Samples: 111474900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-17 22:47:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:47:32,152][12883] Updated weights for policy 0, policy_version 6800 (0.0048) [2024-06-17 22:47:34,457][12862] Signal inference workers to stop experience collection... (1600 times) [2024-06-17 22:47:34,502][12883] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-17 22:47:34,514][12862] Signal inference workers to resume experience collection... (1600 times) [2024-06-17 22:47:34,529][12883] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-17 22:47:35,783][12883] Updated weights for policy 0, policy_version 6810 (0.0050) [2024-06-17 22:47:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40413.7, 300 sec: 39654.8). Total num frames: 111624192. Throughput: 0: 39598.1. Samples: 111709580. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-17 22:47:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:47:40,827][12883] Updated weights for policy 0, policy_version 6820 (0.0045) [2024-06-17 22:47:41,994][12645] Fps is (10 sec: 36044.8, 60 sec: 38775.4, 300 sec: 39543.8). Total num frames: 111755264. Throughput: 0: 39667.9. Samples: 111952720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 23.0) [2024-06-17 22:47:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:47:44,091][12883] Updated weights for policy 0, policy_version 6830 (0.0023) [2024-06-17 22:47:46,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39594.7, 300 sec: 39599.3). Total num frames: 112001024. Throughput: 0: 39340.8. Samples: 112057560. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-17 22:47:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:47:49,179][12883] Updated weights for policy 0, policy_version 6840 (0.0039) [2024-06-17 22:47:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39594.6, 300 sec: 39543.8). 
Total num frames: 112181248. Throughput: 0: 39728.7. Samples: 112303020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-17 22:47:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:47:52,667][12883] Updated weights for policy 0, policy_version 6850 (0.0036) [2024-06-17 22:47:56,994][12645] Fps is (10 sec: 34406.6, 60 sec: 38775.4, 300 sec: 39599.3). Total num frames: 112345088. Throughput: 0: 39771.2. Samples: 112541960. Policy #0 lag: (min: 1.0, avg: 11.4, max: 21.0) [2024-06-17 22:47:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:47:57,344][12883] Updated weights for policy 0, policy_version 6860 (0.0040) [2024-06-17 22:48:00,643][12883] Updated weights for policy 0, policy_version 6870 (0.0033) [2024-06-17 22:48:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40140.7, 300 sec: 39543.8). Total num frames: 112590848. Throughput: 0: 39459.6. Samples: 112655260. Policy #0 lag: (min: 1.0, avg: 12.6, max: 22.0) [2024-06-17 22:48:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:48:06,053][12883] Updated weights for policy 0, policy_version 6880 (0.0045) [2024-06-17 22:48:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 112754688. Throughput: 0: 39630.7. Samples: 112896340. Policy #0 lag: (min: 1.0, avg: 12.6, max: 22.0) [2024-06-17 22:48:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:48:09,189][12883] Updated weights for policy 0, policy_version 6890 (0.0045) [2024-06-17 22:48:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39594.6, 300 sec: 39655.1). Total num frames: 112984064. Throughput: 0: 39437.3. Samples: 113133020. Policy #0 lag: (min: 1.0, avg: 8.9, max: 22.0) [2024-06-17 22:48:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:48:14,134][12883] Updated weights for policy 0, policy_version 6900 (0.0033) [2024-06-17 22:48:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.8, 300 sec: 39654.9). Total num frames: 113180672. 
Throughput: 0: 39605.0. Samples: 113257120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 22:48:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:48:17,431][12883] Updated weights for policy 0, policy_version 6910 (0.0054) [2024-06-17 22:48:21,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39321.5, 300 sec: 39599.3). Total num frames: 113344512. Throughput: 0: 39524.5. Samples: 113488180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 22:48:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:48:22,740][12883] Updated weights for policy 0, policy_version 6920 (0.0050) [2024-06-17 22:48:25,632][12883] Updated weights for policy 0, policy_version 6930 (0.0032) [2024-06-17 22:48:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.9, 300 sec: 39599.3). Total num frames: 113590272. Throughput: 0: 39256.1. Samples: 113719240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-17 22:48:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:48:31,105][12883] Updated weights for policy 0, policy_version 6940 (0.0044) [2024-06-17 22:48:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 113754112. Throughput: 0: 39709.3. Samples: 113844480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-17 22:48:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:48:33,974][12883] Updated weights for policy 0, policy_version 6950 (0.0038) [2024-06-17 22:48:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39048.7, 300 sec: 39654.8). Total num frames: 113967104. Throughput: 0: 39470.4. Samples: 114079180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-17 22:48:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:48:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006956_113967104.pth... 
[2024-06-17 22:48:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006376_104464384.pth [2024-06-17 22:48:39,329][12883] Updated weights for policy 0, policy_version 6960 (0.0043) [2024-06-17 22:48:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40414.0, 300 sec: 39710.4). Total num frames: 114180096. Throughput: 0: 39540.5. Samples: 114321280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-17 22:48:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:48:42,070][12883] Updated weights for policy 0, policy_version 6970 (0.0041) [2024-06-17 22:48:46,994][12645] Fps is (10 sec: 36045.2, 60 sec: 38775.6, 300 sec: 39599.3). Total num frames: 114327552. Throughput: 0: 39589.0. Samples: 114436760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-17 22:48:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:48:47,275][12883] Updated weights for policy 0, policy_version 6980 (0.0030) [2024-06-17 22:48:50,293][12862] Signal inference workers to stop experience collection... (1650 times) [2024-06-17 22:48:50,293][12862] Signal inference workers to resume experience collection... (1650 times) [2024-06-17 22:48:50,342][12883] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-17 22:48:50,342][12883] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-17 22:48:50,474][12883] Updated weights for policy 0, policy_version 6990 (0.0044) [2024-06-17 22:48:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39867.8, 300 sec: 39654.8). Total num frames: 114573312. Throughput: 0: 39578.6. Samples: 114677380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-17 22:48:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:48:55,236][12883] Updated weights for policy 0, policy_version 7000 (0.0047) [2024-06-17 22:48:56,994][12645] Fps is (10 sec: 40959.0, 60 sec: 39867.6, 300 sec: 39654.8). Total num frames: 114737152. Throughput: 0: 39927.1. 
Samples: 114929740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-17 22:48:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:48:58,834][12883] Updated weights for policy 0, policy_version 7010 (0.0027) [2024-06-17 22:49:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.7, 300 sec: 39710.7). Total num frames: 114966528. Throughput: 0: 39608.5. Samples: 115039500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-17 22:49:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:49:03,142][12883] Updated weights for policy 0, policy_version 7020 (0.0031) [2024-06-17 22:49:06,512][12883] Updated weights for policy 0, policy_version 7030 (0.0046) [2024-06-17 22:49:06,998][12645] Fps is (10 sec: 44218.3, 60 sec: 40410.9, 300 sec: 39654.5). Total num frames: 115179520. Throughput: 0: 40073.1. Samples: 115291640. Policy #0 lag: (min: 1.0, avg: 7.8, max: 20.0) [2024-06-17 22:49:06,999][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:49:11,447][12883] Updated weights for policy 0, policy_version 7040 (0.0038) [2024-06-17 22:49:11,994][12645] Fps is (10 sec: 39320.8, 60 sec: 39594.7, 300 sec: 39710.7). Total num frames: 115359744. Throughput: 0: 40234.5. Samples: 115529800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-17 22:49:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:49:14,867][12883] Updated weights for policy 0, policy_version 7050 (0.0034) [2024-06-17 22:49:16,994][12645] Fps is (10 sec: 39338.8, 60 sec: 39867.7, 300 sec: 39654.9). Total num frames: 115572736. Throughput: 0: 40054.3. Samples: 115646920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-17 22:49:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:49:19,448][12883] Updated weights for policy 0, policy_version 7060 (0.0048) [2024-06-17 22:49:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 39710.4). Total num frames: 115769344. Throughput: 0: 40306.6. Samples: 115892980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-17 22:49:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:49:22,862][12883] Updated weights for policy 0, policy_version 7070 (0.0046) [2024-06-17 22:49:26,994][12645] Fps is (10 sec: 39320.8, 60 sec: 39594.5, 300 sec: 39710.4). Total num frames: 115965952. Throughput: 0: 40134.0. Samples: 116127320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-17 22:49:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:49:27,401][12883] Updated weights for policy 0, policy_version 7080 (0.0055) [2024-06-17 22:49:31,043][12883] Updated weights for policy 0, policy_version 7090 (0.0035) [2024-06-17 22:49:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 39710.4). Total num frames: 116178944. Throughput: 0: 40363.9. Samples: 116253140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-17 22:49:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:49:35,585][12883] Updated weights for policy 0, policy_version 7100 (0.0040) [2024-06-17 22:49:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 116342784. Throughput: 0: 40245.3. Samples: 116488420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-17 22:49:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:49:39,303][12883] Updated weights for policy 0, policy_version 7110 (0.0039) [2024-06-17 22:49:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 116572160. Throughput: 0: 40041.5. Samples: 116731600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-17 22:49:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:49:43,901][12883] Updated weights for policy 0, policy_version 7120 (0.0036) [2024-06-17 22:49:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40686.8, 300 sec: 39765.9). Total num frames: 116768768. Throughput: 0: 40171.9. Samples: 116847240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-17 22:49:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:49:47,496][12883] Updated weights for policy 0, policy_version 7130 (0.0039) [2024-06-17 22:49:51,994][12645] Fps is (10 sec: 39320.8, 60 sec: 39867.6, 300 sec: 39765.9). Total num frames: 116965376. Throughput: 0: 39918.8. Samples: 117087820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 22:49:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:49:52,017][12883] Updated weights for policy 0, policy_version 7140 (0.0029) [2024-06-17 22:49:56,086][12883] Updated weights for policy 0, policy_version 7150 (0.0033) [2024-06-17 22:49:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40686.9, 300 sec: 39710.3). Total num frames: 117178368. Throughput: 0: 39778.2. Samples: 117319820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-17 22:49:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:50:00,494][12883] Updated weights for policy 0, policy_version 7160 (0.0044) [2024-06-17 22:50:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39867.6, 300 sec: 39821.8). Total num frames: 117358592. Throughput: 0: 39957.7. Samples: 117445020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-17 22:50:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:50:04,455][12883] Updated weights for policy 0, policy_version 7170 (0.0041) [2024-06-17 22:50:06,996][12645] Fps is (10 sec: 37675.5, 60 sec: 39596.0, 300 sec: 39654.5). Total num frames: 117555200. Throughput: 0: 39751.4. Samples: 117681880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 22:50:06,997][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:50:08,584][12883] Updated weights for policy 0, policy_version 7180 (0.0038) [2024-06-17 22:50:12,000][12645] Fps is (10 sec: 40934.7, 60 sec: 40136.7, 300 sec: 39820.6). Total num frames: 117768192. Throughput: 0: 39887.0. Samples: 117922480. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 22.0) [2024-06-17 22:50:12,001][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:50:12,446][12883] Updated weights for policy 0, policy_version 7190 (0.0031) [2024-06-17 22:50:16,781][12883] Updated weights for policy 0, policy_version 7200 (0.0045) [2024-06-17 22:50:16,996][12645] Fps is (10 sec: 40959.7, 60 sec: 39866.2, 300 sec: 39765.6). Total num frames: 117964800. Throughput: 0: 39770.9. Samples: 118042920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-17 22:50:16,997][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:50:20,665][12883] Updated weights for policy 0, policy_version 7210 (0.0033) [2024-06-17 22:50:21,994][12645] Fps is (10 sec: 39346.0, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 118161408. Throughput: 0: 39902.7. Samples: 118284040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-17 22:50:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:50:25,182][12883] Updated weights for policy 0, policy_version 7220 (0.0036) [2024-06-17 22:50:26,876][12862] Signal inference workers to stop experience collection... (1700 times) [2024-06-17 22:50:26,876][12862] Signal inference workers to resume experience collection... (1700 times) [2024-06-17 22:50:26,920][12883] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-17 22:50:26,920][12883] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-17 22:50:26,994][12645] Fps is (10 sec: 39330.1, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 118358016. Throughput: 0: 39720.7. Samples: 118519040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 22:50:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:50:28,961][12883] Updated weights for policy 0, policy_version 7230 (0.0055) [2024-06-17 22:50:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.6, 300 sec: 39654.8). Total num frames: 118554624. Throughput: 0: 39710.2. Samples: 118634200. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 22:50:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 22:50:33,159][12883] Updated weights for policy 0, policy_version 7240 (0.0045) [2024-06-17 22:50:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 39765.9). Total num frames: 118767616. Throughput: 0: 39776.9. Samples: 118877780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-17 22:50:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:50:37,070][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007250_118784000.pth... [2024-06-17 22:50:37,079][12883] Updated weights for policy 0, policy_version 7250 (0.0029) [2024-06-17 22:50:37,116][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006665_109199360.pth [2024-06-17 22:50:41,685][12883] Updated weights for policy 0, policy_version 7260 (0.0048) [2024-06-17 22:50:41,996][12645] Fps is (10 sec: 40951.0, 60 sec: 39866.2, 300 sec: 39876.7). Total num frames: 118964224. Throughput: 0: 39825.7. Samples: 119112060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-17 22:50:41,997][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:50:45,783][12883] Updated weights for policy 0, policy_version 7270 (0.0035) [2024-06-17 22:50:46,994][12645] Fps is (10 sec: 36045.3, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 119128064. Throughput: 0: 39641.9. Samples: 119228900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-17 22:50:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:50:50,077][12883] Updated weights for policy 0, policy_version 7280 (0.0034) [2024-06-17 22:50:51,994][12645] Fps is (10 sec: 39330.3, 60 sec: 39867.8, 300 sec: 39821.4). Total num frames: 119357440. Throughput: 0: 39689.0. Samples: 119467800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-17 22:50:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:50:54,049][12883] Updated weights for policy 0, policy_version 7290 (0.0035) [2024-06-17 22:50:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39321.6, 300 sec: 39710.4). Total num frames: 119537664. Throughput: 0: 39642.7. Samples: 119706160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 22:50:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:50:58,325][12883] Updated weights for policy 0, policy_version 7300 (0.0047) [2024-06-17 22:51:01,994][12645] Fps is (10 sec: 39322.3, 60 sec: 39867.9, 300 sec: 39710.4). Total num frames: 119750656. Throughput: 0: 39485.2. Samples: 119819660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 22:51:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:51:02,155][12883] Updated weights for policy 0, policy_version 7310 (0.0035) [2024-06-17 22:51:06,486][12883] Updated weights for policy 0, policy_version 7320 (0.0036) [2024-06-17 22:51:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 39869.3, 300 sec: 39821.5). Total num frames: 119947264. Throughput: 0: 39605.4. Samples: 120066280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 22:51:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:51:10,603][12883] Updated weights for policy 0, policy_version 7330 (0.0036) [2024-06-17 22:51:11,994][12645] Fps is (10 sec: 39320.9, 60 sec: 39598.7, 300 sec: 39654.8). Total num frames: 120143872. Throughput: 0: 39499.2. Samples: 120296500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-17 22:51:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:51:14,679][12883] Updated weights for policy 0, policy_version 7340 (0.0035) [2024-06-17 22:51:16,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39869.2, 300 sec: 39766.2). Total num frames: 120356864. Throughput: 0: 39676.0. Samples: 120419620. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-17 22:51:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:51:18,552][12883] Updated weights for policy 0, policy_version 7350 (0.0041) [2024-06-17 22:51:21,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39048.5, 300 sec: 39599.3). Total num frames: 120504320. Throughput: 0: 39558.6. Samples: 120657920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-17 22:51:21,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:51:23,238][12883] Updated weights for policy 0, policy_version 7360 (0.0044) [2024-06-17 22:51:26,891][12883] Updated weights for policy 0, policy_version 7370 (0.0032) [2024-06-17 22:51:26,994][12645] Fps is (10 sec: 39322.3, 60 sec: 39867.9, 300 sec: 39821.5). Total num frames: 120750080. Throughput: 0: 39656.3. Samples: 120896500. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-17 22:51:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:51:31,503][12883] Updated weights for policy 0, policy_version 7380 (0.0039) [2024-06-17 22:51:31,994][12645] Fps is (10 sec: 42599.3, 60 sec: 39594.8, 300 sec: 39765.9). Total num frames: 120930304. Throughput: 0: 39777.4. Samples: 121018880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 22:51:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:51:35,009][12883] Updated weights for policy 0, policy_version 7390 (0.0044) [2024-06-17 22:51:36,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39321.6, 300 sec: 39654.8). Total num frames: 121126912. Throughput: 0: 39715.6. Samples: 121255000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-17 22:51:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:51:39,411][12883] Updated weights for policy 0, policy_version 7400 (0.0045) [2024-06-17 22:51:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 39869.2, 300 sec: 39765.9). Total num frames: 121356288. Throughput: 0: 39553.4. Samples: 121486060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-17 22:51:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:51:43,081][12883] Updated weights for policy 0, policy_version 7410 (0.0045) [2024-06-17 22:51:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 121520128. Throughput: 0: 39868.0. Samples: 121613720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 22:51:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:51:47,391][12883] Updated weights for policy 0, policy_version 7420 (0.0039) [2024-06-17 22:51:51,850][12883] Updated weights for policy 0, policy_version 7430 (0.0031) [2024-06-17 22:51:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 121733120. Throughput: 0: 39690.2. Samples: 121852340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 22:51:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:51:55,810][12883] Updated weights for policy 0, policy_version 7440 (0.0035) [2024-06-17 22:51:56,996][12645] Fps is (10 sec: 40950.9, 60 sec: 39866.4, 300 sec: 39821.1). Total num frames: 121929728. Throughput: 0: 39720.8. Samples: 122084020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-17 22:51:56,997][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:51:59,874][12883] Updated weights for policy 0, policy_version 7450 (0.0039) [2024-06-17 22:52:01,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39321.5, 300 sec: 39710.4). Total num frames: 122109952. Throughput: 0: 39762.4. Samples: 122208920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 19.0) [2024-06-17 22:52:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:52:03,547][12862] Signal inference workers to stop experience collection... 
(1750 times) [2024-06-17 22:52:03,574][12883] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-17 22:52:03,605][12862] Signal inference workers to resume experience collection... (1750 times) [2024-06-17 22:52:03,613][12883] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-17 22:52:03,750][12883] Updated weights for policy 0, policy_version 7460 (0.0041) [2024-06-17 22:52:06,994][12645] Fps is (10 sec: 39329.9, 60 sec: 39594.6, 300 sec: 39710.4). Total num frames: 122322944. Throughput: 0: 39752.5. Samples: 122446780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-17 22:52:06,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:52:08,823][12883] Updated weights for policy 0, policy_version 7470 (0.0037) [2024-06-17 22:52:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.8, 300 sec: 39821.5). Total num frames: 122535936. Throughput: 0: 39472.4. Samples: 122672760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 22:52:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:52:12,274][12883] Updated weights for policy 0, policy_version 7480 (0.0040) [2024-06-17 22:52:16,707][12883] Updated weights for policy 0, policy_version 7490 (0.0033) [2024-06-17 22:52:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39321.7, 300 sec: 39765.9). Total num frames: 122716160. Throughput: 0: 39543.1. Samples: 122798320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-17 22:52:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:52:20,994][12883] Updated weights for policy 0, policy_version 7500 (0.0038) [2024-06-17 22:52:21,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 122896384. Throughput: 0: 39579.6. Samples: 123036080. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-17 22:52:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:52:24,862][12883] Updated weights for policy 0, policy_version 7510 (0.0040) [2024-06-17 22:52:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39321.5, 300 sec: 39710.4). Total num frames: 123109376. Throughput: 0: 39721.3. Samples: 123273520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-17 22:52:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:52:29,416][12883] Updated weights for policy 0, policy_version 7520 (0.0048) [2024-06-17 22:52:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39594.6, 300 sec: 39599.3). Total num frames: 123305984. Throughput: 0: 39473.7. Samples: 123390040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-17 22:52:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:52:32,984][12883] Updated weights for policy 0, policy_version 7530 (0.0036) [2024-06-17 22:52:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39594.6, 300 sec: 39821.4). Total num frames: 123502592. Throughput: 0: 39406.6. Samples: 123625640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-17 22:52:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:52:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007538_123502592.pth... [2024-06-17 22:52:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000006956_113967104.pth [2024-06-17 22:52:37,659][12883] Updated weights for policy 0, policy_version 7540 (0.0035) [2024-06-17 22:52:41,767][12883] Updated weights for policy 0, policy_version 7550 (0.0044) [2024-06-17 22:52:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39321.7, 300 sec: 39710.4). Total num frames: 123715584. Throughput: 0: 39574.4. Samples: 123864780. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-17 22:52:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:52:45,933][12883] Updated weights for policy 0, policy_version 7560 (0.0038) [2024-06-17 22:52:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.6, 300 sec: 39710.4). Total num frames: 123895808. Throughput: 0: 39393.7. Samples: 123981640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-17 22:52:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:52:49,723][12883] Updated weights for policy 0, policy_version 7570 (0.0035) [2024-06-17 22:52:51,994][12645] Fps is (10 sec: 39320.6, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 124108800. Throughput: 0: 39462.6. Samples: 124222600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-17 22:52:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:52:54,100][12883] Updated weights for policy 0, policy_version 7580 (0.0032) [2024-06-17 22:52:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39323.0, 300 sec: 39654.8). Total num frames: 124289024. Throughput: 0: 39580.9. Samples: 124453900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-17 22:52:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:52:57,827][12883] Updated weights for policy 0, policy_version 7590 (0.0028) [2024-06-17 22:53:01,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 124485632. Throughput: 0: 39499.9. Samples: 124575820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-17 22:53:01,999][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:53:02,395][12883] Updated weights for policy 0, policy_version 7600 (0.0036) [2024-06-17 22:53:06,338][12883] Updated weights for policy 0, policy_version 7610 (0.0042) [2024-06-17 22:53:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39654.9). Total num frames: 124682240. Throughput: 0: 39436.0. Samples: 124810700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-17 22:53:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:53:10,351][12883] Updated weights for policy 0, policy_version 7620 (0.0038) [2024-06-17 22:53:11,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39048.6, 300 sec: 39654.8). Total num frames: 124878848. Throughput: 0: 39531.7. Samples: 125052440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 22:53:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:53:14,511][12883] Updated weights for policy 0, policy_version 7630 (0.0033) [2024-06-17 22:53:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39594.7, 300 sec: 39821.5). Total num frames: 125091840. Throughput: 0: 39535.1. Samples: 125169120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-17 22:53:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:53:18,303][12883] Updated weights for policy 0, policy_version 7640 (0.0026) [2024-06-17 22:53:21,994][12645] Fps is (10 sec: 42597.4, 60 sec: 40140.7, 300 sec: 39710.3). Total num frames: 125304832. Throughput: 0: 39701.7. Samples: 125412220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-17 22:53:21,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:53:23,527][12883] Updated weights for policy 0, policy_version 7650 (0.0033) [2024-06-17 22:53:26,523][12883] Updated weights for policy 0, policy_version 7660 (0.0032) [2024-06-17 22:53:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 125517824. Throughput: 0: 39626.9. Samples: 125648000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 22:53:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:53:31,595][12883] Updated weights for policy 0, policy_version 7670 (0.0039) [2024-06-17 22:53:31,994][12645] Fps is (10 sec: 36045.4, 60 sec: 39321.6, 300 sec: 39654.8). Total num frames: 125665280. Throughput: 0: 39813.4. Samples: 125773240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-17 22:53:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:53:34,456][12862] Signal inference workers to stop experience collection... (1800 times) [2024-06-17 22:53:34,457][12862] Signal inference workers to resume experience collection... (1800 times) [2024-06-17 22:53:34,497][12883] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-17 22:53:34,497][12883] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-17 22:53:34,610][12883] Updated weights for policy 0, policy_version 7680 (0.0039) [2024-06-17 22:53:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 125894656. Throughput: 0: 39648.5. Samples: 126006780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-17 22:53:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:53:39,848][12883] Updated weights for policy 0, policy_version 7690 (0.0033) [2024-06-17 22:53:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 126091264. Throughput: 0: 39800.8. Samples: 126244940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-17 22:53:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:53:43,275][12883] Updated weights for policy 0, policy_version 7700 (0.0042) [2024-06-17 22:53:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 126287872. Throughput: 0: 39635.5. Samples: 126359420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-17 22:53:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:53:48,186][12883] Updated weights for policy 0, policy_version 7710 (0.0035) [2024-06-17 22:53:51,330][12883] Updated weights for policy 0, policy_version 7720 (0.0036) [2024-06-17 22:53:51,996][12645] Fps is (10 sec: 40951.0, 60 sec: 39866.3, 300 sec: 39876.7). Total num frames: 126500864. Throughput: 0: 39787.3. 
Samples: 126601220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 23.0) [2024-06-17 22:53:51,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:53:56,609][12883] Updated weights for policy 0, policy_version 7730 (0.0045) [2024-06-17 22:53:56,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 126648320. Throughput: 0: 39756.0. Samples: 126841460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-17 22:53:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:53:59,329][12883] Updated weights for policy 0, policy_version 7740 (0.0039) [2024-06-17 22:54:01,994][12645] Fps is (10 sec: 37691.8, 60 sec: 39867.8, 300 sec: 39655.4). Total num frames: 126877696. Throughput: 0: 39617.3. Samples: 126951900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-17 22:54:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:54:04,826][12883] Updated weights for policy 0, policy_version 7750 (0.0037) [2024-06-17 22:54:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 127074304. Throughput: 0: 39674.4. Samples: 127197560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 22:54:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:54:07,680][12883] Updated weights for policy 0, policy_version 7760 (0.0030) [2024-06-17 22:54:11,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39594.5, 300 sec: 39599.3). Total num frames: 127254528. Throughput: 0: 39660.0. Samples: 127432700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 22:54:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:54:13,003][12883] Updated weights for policy 0, policy_version 7770 (0.0039) [2024-06-17 22:54:16,125][12883] Updated weights for policy 0, policy_version 7780 (0.0031) [2024-06-17 22:54:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 127483904. Throughput: 0: 39512.4. Samples: 127551300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-17 22:54:16,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:54:21,684][12883] Updated weights for policy 0, policy_version 7790 (0.0034) [2024-06-17 22:54:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39321.7, 300 sec: 39654.8). Total num frames: 127664128. Throughput: 0: 39580.4. Samples: 127787900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-17 22:54:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:54:24,334][12883] Updated weights for policy 0, policy_version 7800 (0.0031) [2024-06-17 22:54:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39048.6, 300 sec: 39599.3). Total num frames: 127860736. Throughput: 0: 39512.5. Samples: 128023000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 23.0) [2024-06-17 22:54:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:54:29,511][12883] Updated weights for policy 0, policy_version 7810 (0.0032) [2024-06-17 22:54:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.8, 300 sec: 39821.5). Total num frames: 128090112. Throughput: 0: 39763.5. Samples: 128148780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-17 22:54:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:54:33,096][12883] Updated weights for policy 0, policy_version 7820 (0.0038) [2024-06-17 22:54:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 128253952. Throughput: 0: 39644.1. Samples: 128385120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-17 22:54:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:54:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007828_128253952.pth... 
[2024-06-17 22:54:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007250_118784000.pth [2024-06-17 22:54:37,912][12883] Updated weights for policy 0, policy_version 7830 (0.0039) [2024-06-17 22:54:41,350][12883] Updated weights for policy 0, policy_version 7840 (0.0041) [2024-06-17 22:54:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 128483328. Throughput: 0: 39446.6. Samples: 128616560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-17 22:54:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:54:46,255][12883] Updated weights for policy 0, policy_version 7850 (0.0034) [2024-06-17 22:54:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 39599.3). Total num frames: 128647168. Throughput: 0: 39651.6. Samples: 128736220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-17 22:54:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:54:49,250][12883] Updated weights for policy 0, policy_version 7860 (0.0040) [2024-06-17 22:54:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39323.1, 300 sec: 39599.3). Total num frames: 128860160. Throughput: 0: 39555.2. Samples: 128977540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-17 22:54:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:54:54,699][12883] Updated weights for policy 0, policy_version 7870 (0.0043) [2024-06-17 22:54:56,994][12645] Fps is (10 sec: 42597.2, 60 sec: 40413.7, 300 sec: 39710.4). Total num frames: 129073152. Throughput: 0: 39592.8. Samples: 129214380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-17 22:54:56,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:54:57,511][12883] Updated weights for policy 0, policy_version 7880 (0.0035) [2024-06-17 22:55:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39321.5, 300 sec: 39599.6). Total num frames: 129236992. Throughput: 0: 39584.9. Samples: 129332620. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 22:55:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:55:03,060][12883] Updated weights for policy 0, policy_version 7890 (0.0045) [2024-06-17 22:55:03,446][12862] Signal inference workers to stop experience collection... (1850 times) [2024-06-17 22:55:03,446][12862] Signal inference workers to resume experience collection... (1850 times) [2024-06-17 22:55:03,474][12883] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-17 22:55:03,475][12883] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-17 22:55:05,769][12883] Updated weights for policy 0, policy_version 7900 (0.0041) [2024-06-17 22:55:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.7, 300 sec: 39655.7). Total num frames: 129466368. Throughput: 0: 39649.8. Samples: 129572140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:55:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:55:10,997][12883] Updated weights for policy 0, policy_version 7910 (0.0031) [2024-06-17 22:55:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.8, 300 sec: 39599.6). Total num frames: 129646592. Throughput: 0: 39849.3. Samples: 129816220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-17 22:55:11,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:55:13,911][12883] Updated weights for policy 0, policy_version 7920 (0.0029) [2024-06-17 22:55:16,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39594.7, 300 sec: 39654.8). Total num frames: 129859584. Throughput: 0: 39676.1. Samples: 129934200. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-17 22:55:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:55:18,994][12883] Updated weights for policy 0, policy_version 7930 (0.0038) [2024-06-17 22:55:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.8, 300 sec: 39654.9). Total num frames: 130056192. Throughput: 0: 39539.2. Samples: 130164380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 22:55:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:55:22,303][12883] Updated weights for policy 0, policy_version 7940 (0.0032) [2024-06-17 22:55:26,930][12883] Updated weights for policy 0, policy_version 7950 (0.0034) [2024-06-17 22:55:26,996][12645] Fps is (10 sec: 39312.5, 60 sec: 39866.2, 300 sec: 39654.5). Total num frames: 130252800. Throughput: 0: 39884.3. Samples: 130411440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-17 22:55:26,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:55:30,406][12883] Updated weights for policy 0, policy_version 7960 (0.0043) [2024-06-17 22:55:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39321.6, 300 sec: 39599.3). Total num frames: 130449408. Throughput: 0: 40025.7. Samples: 130537380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-17 22:55:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:55:34,770][12883] Updated weights for policy 0, policy_version 7970 (0.0031) [2024-06-17 22:55:36,994][12645] Fps is (10 sec: 39330.0, 60 sec: 39867.7, 300 sec: 39599.6). Total num frames: 130646016. Throughput: 0: 39956.3. Samples: 130775580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-17 22:55:36,995][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:55:38,975][12883] Updated weights for policy 0, policy_version 7980 (0.0038) [2024-06-17 22:55:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39321.5, 300 sec: 39710.3). Total num frames: 130842624. Throughput: 0: 40000.0. Samples: 131014380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-17 22:55:41,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:55:43,068][12883] Updated weights for policy 0, policy_version 7990 (0.0037) [2024-06-17 22:55:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40140.7, 300 sec: 39654.8). Total num frames: 131055616. Throughput: 0: 40111.9. Samples: 131137660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-17 22:55:46,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:55:47,175][12883] Updated weights for policy 0, policy_version 8000 (0.0034) [2024-06-17 22:55:51,169][12883] Updated weights for policy 0, policy_version 8010 (0.0039) [2024-06-17 22:55:51,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 131252224. Throughput: 0: 40143.2. Samples: 131378580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-17 22:55:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:55:55,113][12883] Updated weights for policy 0, policy_version 8020 (0.0035) [2024-06-17 22:55:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40140.9, 300 sec: 39765.9). Total num frames: 131481600. Throughput: 0: 40044.9. Samples: 131618240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 22:55:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:55:59,588][12883] Updated weights for policy 0, policy_version 8030 (0.0034) [2024-06-17 22:56:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.9, 300 sec: 39654.8). Total num frames: 131645440. Throughput: 0: 40237.8. Samples: 131744900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:56:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:56:03,283][12883] Updated weights for policy 0, policy_version 8040 (0.0034) [2024-06-17 22:56:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 131858432. Throughput: 0: 40404.4. Samples: 131982580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 22:56:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:56:07,432][12883] Updated weights for policy 0, policy_version 8050 (0.0028) [2024-06-17 22:56:11,869][12883] Updated weights for policy 0, policy_version 8060 (0.0031) [2024-06-17 22:56:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40140.9, 300 sec: 39654.9). 
Total num frames: 132055040. Throughput: 0: 40256.3. Samples: 132222880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:56:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:56:15,524][12883] Updated weights for policy 0, policy_version 8070 (0.0028) [2024-06-17 22:56:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 132268032. Throughput: 0: 40017.5. Samples: 132338160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-17 22:56:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:56:19,890][12883] Updated weights for policy 0, policy_version 8080 (0.0032) [2024-06-17 22:56:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39867.7, 300 sec: 39654.8). Total num frames: 132448256. Throughput: 0: 40074.7. Samples: 132578940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 22:56:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:56:23,410][12883] Updated weights for policy 0, policy_version 8090 (0.0031) [2024-06-17 22:56:26,996][12645] Fps is (10 sec: 37675.2, 60 sec: 39867.9, 300 sec: 39710.1). Total num frames: 132644864. Throughput: 0: 40092.6. Samples: 132818620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 22:56:26,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:56:28,231][12883] Updated weights for policy 0, policy_version 8100 (0.0033) [2024-06-17 22:56:31,782][12883] Updated weights for policy 0, policy_version 8110 (0.0033) [2024-06-17 22:56:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40413.9, 300 sec: 39821.5). Total num frames: 132874240. Throughput: 0: 39904.1. Samples: 132933340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-17 22:56:31,998][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:56:36,158][12883] Updated weights for policy 0, policy_version 8120 (0.0050) [2024-06-17 22:56:36,996][12645] Fps is (10 sec: 40959.1, 60 sec: 40139.4, 300 sec: 39654.5). Total num frames: 133054464. 
Throughput: 0: 40030.0. Samples: 133180020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-17 22:56:36,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:56:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008122_133070848.pth... [2024-06-17 22:56:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007538_123502592.pth [2024-06-17 22:56:40,422][12883] Updated weights for policy 0, policy_version 8130 (0.0033) [2024-06-17 22:56:41,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40140.9, 300 sec: 39765.9). Total num frames: 133251072. Throughput: 0: 39974.7. Samples: 133417100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 22.0) [2024-06-17 22:56:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:56:44,634][12883] Updated weights for policy 0, policy_version 8140 (0.0038) [2024-06-17 22:56:46,318][12862] Signal inference workers to stop experience collection... (1900 times) [2024-06-17 22:56:46,319][12862] Signal inference workers to resume experience collection... (1900 times) [2024-06-17 22:56:46,335][12883] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-17 22:56:46,335][12883] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-17 22:56:46,994][12645] Fps is (10 sec: 39329.7, 60 sec: 39867.7, 300 sec: 39710.3). Total num frames: 133447680. Throughput: 0: 39765.6. Samples: 133534360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-17 22:56:46,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:56:48,518][12883] Updated weights for policy 0, policy_version 8150 (0.0029) [2024-06-17 22:56:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 39766.2). Total num frames: 133660672. Throughput: 0: 39796.5. Samples: 133773420. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-17 22:56:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:56:52,721][12883] Updated weights for policy 0, policy_version 8160 (0.0040) [2024-06-17 22:56:56,524][12883] Updated weights for policy 0, policy_version 8170 (0.0037) [2024-06-17 22:56:56,994][12645] Fps is (10 sec: 40961.0, 60 sec: 39594.8, 300 sec: 39821.5). Total num frames: 133857280. Throughput: 0: 39755.2. Samples: 134011860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-17 22:56:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:57:01,080][12883] Updated weights for policy 0, policy_version 8180 (0.0054) [2024-06-17 22:57:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39867.6, 300 sec: 39710.4). Total num frames: 134037504. Throughput: 0: 39865.6. Samples: 134132120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-17 22:57:01,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:57:05,144][12883] Updated weights for policy 0, policy_version 8190 (0.0038) [2024-06-17 22:57:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40140.7, 300 sec: 39765.9). Total num frames: 134266880. Throughput: 0: 39848.0. Samples: 134372100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-17 22:57:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:57:09,517][12883] Updated weights for policy 0, policy_version 8200 (0.0039) [2024-06-17 22:57:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 134430720. Throughput: 0: 39740.5. Samples: 134606860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 22:57:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:57:13,183][12883] Updated weights for policy 0, policy_version 8210 (0.0042) [2024-06-17 22:57:16,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39594.5, 300 sec: 39821.4). Total num frames: 134643712. Throughput: 0: 39837.3. Samples: 134726020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 22:57:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:57:17,695][12883] Updated weights for policy 0, policy_version 8220 (0.0046) [2024-06-17 22:57:21,611][12883] Updated weights for policy 0, policy_version 8230 (0.0037) [2024-06-17 22:57:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 40139.3, 300 sec: 39821.1). Total num frames: 134856704. Throughput: 0: 39696.8. Samples: 134966380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-17 22:57:21,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:57:25,631][12883] Updated weights for policy 0, policy_version 8240 (0.0034) [2024-06-17 22:57:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39869.0, 300 sec: 39765.9). Total num frames: 135036928. Throughput: 0: 39716.4. Samples: 135204340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-17 22:57:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:57:29,659][12883] Updated weights for policy 0, policy_version 8250 (0.0037) [2024-06-17 22:57:31,996][12645] Fps is (10 sec: 39321.8, 60 sec: 39593.2, 300 sec: 39821.2). Total num frames: 135249920. Throughput: 0: 39796.4. Samples: 135325280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-17 22:57:32,005][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:57:34,165][12883] Updated weights for policy 0, policy_version 8260 (0.0036) [2024-06-17 22:57:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39869.2, 300 sec: 39765.9). Total num frames: 135446528. Throughput: 0: 39988.0. Samples: 135572880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-17 22:57:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:57:37,510][12883] Updated weights for policy 0, policy_version 8270 (0.0035) [2024-06-17 22:57:41,994][12645] Fps is (10 sec: 37691.6, 60 sec: 39594.7, 300 sec: 39765.9). Total num frames: 135626752. Throughput: 0: 39941.7. Samples: 135809240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-17 22:57:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:57:42,297][12883] Updated weights for policy 0, policy_version 8280 (0.0047) [2024-06-17 22:57:45,866][12883] Updated weights for policy 0, policy_version 8290 (0.0038) [2024-06-17 22:57:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40140.9, 300 sec: 39821.5). Total num frames: 135856128. Throughput: 0: 39912.9. Samples: 135928200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-17 22:57:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:57:50,218][12883] Updated weights for policy 0, policy_version 8300 (0.0028) [2024-06-17 22:57:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39321.6, 300 sec: 39765.9). Total num frames: 136019968. Throughput: 0: 39768.6. Samples: 136161680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-17 22:57:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:57:54,007][12883] Updated weights for policy 0, policy_version 8310 (0.0038) [2024-06-17 22:57:56,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39594.6, 300 sec: 39821.5). Total num frames: 136232960. Throughput: 0: 39826.7. Samples: 136399060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-17 22:57:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:57:58,635][12883] Updated weights for policy 0, policy_version 8320 (0.0037) [2024-06-17 22:58:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 40413.9, 300 sec: 39932.5). Total num frames: 136462336. Throughput: 0: 39951.1. Samples: 136523820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-17 22:58:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:58:02,279][12883] Updated weights for policy 0, policy_version 8330 (0.0042) [2024-06-17 22:58:06,556][12883] Updated weights for policy 0, policy_version 8340 (0.0042) [2024-06-17 22:58:06,994][12645] Fps is (10 sec: 40959.1, 60 sec: 39594.6, 300 sec: 39876.9). 
Total num frames: 136642560. Throughput: 0: 39870.2. Samples: 136760460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-17 22:58:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:58:10,868][12883] Updated weights for policy 0, policy_version 8350 (0.0037) [2024-06-17 22:58:11,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40140.7, 300 sec: 39821.4). Total num frames: 136839168. Throughput: 0: 40040.4. Samples: 137006160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-17 22:58:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:58:14,449][12883] Updated weights for policy 0, policy_version 8360 (0.0033) [2024-06-17 22:58:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 39821.4). Total num frames: 137052160. Throughput: 0: 39832.5. Samples: 137117660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-17 22:58:16,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:58:18,848][12883] Updated weights for policy 0, policy_version 8370 (0.0042) [2024-06-17 22:58:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39869.2, 300 sec: 39765.9). Total num frames: 137248768. Throughput: 0: 39784.4. Samples: 137363180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-17 22:58:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:58:23,287][12883] Updated weights for policy 0, policy_version 8380 (0.0038) [2024-06-17 22:58:27,000][12645] Fps is (10 sec: 39297.8, 60 sec: 40136.7, 300 sec: 39931.7). Total num frames: 137445376. Throughput: 0: 39727.4. Samples: 137597220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-17 22:58:27,000][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:58:27,138][12883] Updated weights for policy 0, policy_version 8390 (0.0054) [2024-06-17 22:58:31,314][12883] Updated weights for policy 0, policy_version 8400 (0.0034) [2024-06-17 22:58:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39869.1, 300 sec: 39821.4). Total num frames: 137641984. 
Throughput: 0: 39988.3. Samples: 137727680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-17 22:58:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:58:35,353][12883] Updated weights for policy 0, policy_version 8410 (0.0040) [2024-06-17 22:58:36,994][12645] Fps is (10 sec: 39346.3, 60 sec: 39867.7, 300 sec: 39821.5). Total num frames: 137838592. Throughput: 0: 39856.0. Samples: 137955200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-17 22:58:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:58:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008413_137838592.pth... [2024-06-17 22:58:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000007828_128253952.pth [2024-06-17 22:58:37,652][12862] Signal inference workers to stop experience collection... (1950 times) [2024-06-17 22:58:37,652][12862] Signal inference workers to resume experience collection... (1950 times) [2024-06-17 22:58:37,699][12883] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-17 22:58:37,699][12883] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-17 22:58:39,583][12883] Updated weights for policy 0, policy_version 8420 (0.0030) [2024-06-17 22:58:41,993][12645] Fps is (10 sec: 39322.8, 60 sec: 40140.9, 300 sec: 39821.5). Total num frames: 138035200. Throughput: 0: 39903.2. Samples: 138194700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-17 22:58:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:58:43,886][12883] Updated weights for policy 0, policy_version 8430 (0.0033) [2024-06-17 22:58:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.7, 300 sec: 39766.2). Total num frames: 138231808. Throughput: 0: 39865.4. Samples: 138317760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-17 22:58:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:58:47,786][12883] Updated weights for policy 0, policy_version 8440 (0.0047) [2024-06-17 22:58:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 138428416. Throughput: 0: 39845.1. Samples: 138553480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-17 22:58:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 22:58:52,067][12883] Updated weights for policy 0, policy_version 8450 (0.0031) [2024-06-17 22:58:55,859][12883] Updated weights for policy 0, policy_version 8460 (0.0038) [2024-06-17 22:58:56,996][12645] Fps is (10 sec: 42588.7, 60 sec: 40412.3, 300 sec: 39932.2). Total num frames: 138657792. Throughput: 0: 39713.7. Samples: 138793360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 22:58:56,997][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:59:00,207][12883] Updated weights for policy 0, policy_version 8470 (0.0046) [2024-06-17 22:59:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39321.7, 300 sec: 39821.5). Total num frames: 138821632. Throughput: 0: 39932.6. Samples: 138914620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-17 22:59:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:59:04,075][12883] Updated weights for policy 0, policy_version 8480 (0.0043) [2024-06-17 22:59:06,994][12645] Fps is (10 sec: 36053.0, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 139018240. Throughput: 0: 39652.0. Samples: 139147520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:59:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:59:08,733][12883] Updated weights for policy 0, policy_version 8490 (0.0040) [2024-06-17 22:59:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39867.9, 300 sec: 39821.5). Total num frames: 139231232. Throughput: 0: 39900.8. Samples: 139392500. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 22:59:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:59:12,142][12883] Updated weights for policy 0, policy_version 8500 (0.0030) [2024-06-17 22:59:16,752][12883] Updated weights for policy 0, policy_version 8510 (0.0040) [2024-06-17 22:59:17,000][12645] Fps is (10 sec: 40934.4, 60 sec: 39590.7, 300 sec: 39876.2). Total num frames: 139427840. Throughput: 0: 39633.7. Samples: 139511440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-17 22:59:17,000][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 22:59:20,469][12883] Updated weights for policy 0, policy_version 8520 (0.0049) [2024-06-17 22:59:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39594.7, 300 sec: 39877.0). Total num frames: 139624448. Throughput: 0: 39678.7. Samples: 139740740. Policy #0 lag: (min: 1.0, avg: 12.7, max: 24.0) [2024-06-17 22:59:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:59:24,954][12883] Updated weights for policy 0, policy_version 8530 (0.0049) [2024-06-17 22:59:26,994][12645] Fps is (10 sec: 37706.9, 60 sec: 39325.7, 300 sec: 39710.4). Total num frames: 139804672. Throughput: 0: 39804.8. Samples: 139985920. Policy #0 lag: (min: 1.0, avg: 12.7, max: 24.0) [2024-06-17 22:59:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:59:28,951][12883] Updated weights for policy 0, policy_version 8540 (0.0043) [2024-06-17 22:59:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39594.7, 300 sec: 39877.0). Total num frames: 140017664. Throughput: 0: 39696.3. Samples: 140104100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-17 22:59:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 22:59:33,223][12883] Updated weights for policy 0, policy_version 8550 (0.0036) [2024-06-17 22:59:36,828][12883] Updated weights for policy 0, policy_version 8560 (0.0047) [2024-06-17 22:59:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40140.8, 300 sec: 39877.0). 
Total num frames: 140247040. Throughput: 0: 39918.3. Samples: 140349800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-17 22:59:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:59:41,400][12883] Updated weights for policy 0, policy_version 8570 (0.0051) [2024-06-17 22:59:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 140410880. Throughput: 0: 39751.4. Samples: 140582080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-17 22:59:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 22:59:45,034][12883] Updated weights for policy 0, policy_version 8580 (0.0035) [2024-06-17 22:59:46,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39594.6, 300 sec: 39821.4). Total num frames: 140607488. Throughput: 0: 39653.3. Samples: 140699020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 22:59:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 22:59:49,749][12883] Updated weights for policy 0, policy_version 8590 (0.0039) [2024-06-17 22:59:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.7, 300 sec: 39821.5). Total num frames: 140820480. Throughput: 0: 39823.6. Samples: 140939580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 22:59:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 22:59:53,492][12883] Updated weights for policy 0, policy_version 8600 (0.0046) [2024-06-17 22:59:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39323.1, 300 sec: 39932.5). Total num frames: 141017088. Throughput: 0: 39798.5. Samples: 141183440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 22:59:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 22:59:58,062][12883] Updated weights for policy 0, policy_version 8610 (0.0041) [2024-06-17 22:59:58,397][12862] Signal inference workers to stop experience collection... (2000 times) [2024-06-17 22:59:58,398][12862] Signal inference workers to resume experience collection... 
(2000 times) [2024-06-17 22:59:58,419][12883] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-17 22:59:58,419][12883] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-17 23:00:01,826][12883] Updated weights for policy 0, policy_version 8620 (0.0046) [2024-06-17 23:00:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.7, 300 sec: 39877.0). Total num frames: 141230080. Throughput: 0: 39781.5. Samples: 141301360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-17 23:00:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:00:06,274][12883] Updated weights for policy 0, policy_version 8630 (0.0039) [2024-06-17 23:00:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39867.6, 300 sec: 39877.0). Total num frames: 141410304. Throughput: 0: 39966.5. Samples: 141539240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-17 23:00:06,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:00:10,381][12883] Updated weights for policy 0, policy_version 8640 (0.0038) [2024-06-17 23:00:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39867.6, 300 sec: 39877.0). Total num frames: 141623296. Throughput: 0: 39781.2. Samples: 141776080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 23:00:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:00:14,303][12883] Updated weights for policy 0, policy_version 8650 (0.0045) [2024-06-17 23:00:16,996][12645] Fps is (10 sec: 40951.3, 60 sec: 39870.4, 300 sec: 39876.7). Total num frames: 141819904. Throughput: 0: 39667.0. Samples: 141889200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 23:00:16,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:00:18,357][12883] Updated weights for policy 0, policy_version 8660 (0.0030) [2024-06-17 23:00:21,994][12645] Fps is (10 sec: 37684.0, 60 sec: 39594.7, 300 sec: 39821.8). Total num frames: 142000128. Throughput: 0: 39714.7. Samples: 142136960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:00:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:00:22,567][12883] Updated weights for policy 0, policy_version 8670 (0.0042) [2024-06-17 23:00:26,994][12645] Fps is (10 sec: 37691.9, 60 sec: 39867.8, 300 sec: 39821.5). Total num frames: 142196736. Throughput: 0: 39760.0. Samples: 142371280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:00:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:00:27,027][12883] Updated weights for policy 0, policy_version 8680 (0.0042) [2024-06-17 23:00:31,065][12883] Updated weights for policy 0, policy_version 8690 (0.0049) [2024-06-17 23:00:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 142426112. Throughput: 0: 39959.5. Samples: 142497200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:00:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:00:35,139][12883] Updated weights for policy 0, policy_version 8700 (0.0038) [2024-06-17 23:00:36,995][12645] Fps is (10 sec: 40952.4, 60 sec: 39320.4, 300 sec: 39876.8). Total num frames: 142606336. Throughput: 0: 39785.5. Samples: 142730000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-17 23:00:36,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:00:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008704_142606336.pth... [2024-06-17 23:00:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008122_133070848.pth [2024-06-17 23:00:39,273][12883] Updated weights for policy 0, policy_version 8710 (0.0043) [2024-06-17 23:00:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.8, 300 sec: 39932.5). Total num frames: 142835712. Throughput: 0: 39548.0. Samples: 142963100. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 19.0) [2024-06-17 23:00:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:00:43,095][12883] Updated weights for policy 0, policy_version 8720 (0.0039) [2024-06-17 23:00:46,994][12645] Fps is (10 sec: 39328.3, 60 sec: 39867.7, 300 sec: 39821.4). Total num frames: 142999552. Throughput: 0: 39691.5. Samples: 143087480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 19.0) [2024-06-17 23:00:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:00:47,443][12883] Updated weights for policy 0, policy_version 8730 (0.0039) [2024-06-17 23:00:51,277][12883] Updated weights for policy 0, policy_version 8740 (0.0035) [2024-06-17 23:00:51,996][12645] Fps is (10 sec: 37675.1, 60 sec: 39866.3, 300 sec: 39765.6). Total num frames: 143212544. Throughput: 0: 39691.5. Samples: 143325440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-17 23:00:51,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:00:56,001][12883] Updated weights for policy 0, policy_version 8750 (0.0043) [2024-06-17 23:00:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39867.8, 300 sec: 39877.0). Total num frames: 143409152. Throughput: 0: 39603.7. Samples: 143558240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 23:00:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:00:59,377][12883] Updated weights for policy 0, policy_version 8760 (0.0033) [2024-06-17 23:01:01,994][12645] Fps is (10 sec: 36052.5, 60 sec: 39048.5, 300 sec: 39710.4). Total num frames: 143572992. Throughput: 0: 39743.7. Samples: 143677580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 23:01:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:01:04,113][12883] Updated weights for policy 0, policy_version 8770 (0.0044) [2024-06-17 23:01:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.9, 300 sec: 39821.5). Total num frames: 143802368. Throughput: 0: 39556.9. Samples: 143917020. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-17 23:01:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:01:07,690][12883] Updated weights for policy 0, policy_version 8780 (0.0034) [2024-06-17 23:01:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39321.7, 300 sec: 39710.4). Total num frames: 143982592. Throughput: 0: 39765.8. Samples: 144160740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-17 23:01:11,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:01:12,216][12883] Updated weights for policy 0, policy_version 8790 (0.0039) [2024-06-17 23:01:15,921][12883] Updated weights for policy 0, policy_version 8800 (0.0041) [2024-06-17 23:01:16,994][12645] Fps is (10 sec: 39320.7, 60 sec: 39596.1, 300 sec: 39821.4). Total num frames: 144195584. Throughput: 0: 39492.4. Samples: 144274360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-17 23:01:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:01:20,167][12883] Updated weights for policy 0, policy_version 8810 (0.0041) [2024-06-17 23:01:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.7, 300 sec: 39821.7). Total num frames: 144392192. Throughput: 0: 39740.3. Samples: 144518240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-17 23:01:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:01:24,306][12883] Updated weights for policy 0, policy_version 8820 (0.0033) [2024-06-17 23:01:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39867.7, 300 sec: 39710.4). Total num frames: 144588800. Throughput: 0: 39899.2. Samples: 144758560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:01:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:01:28,375][12883] Updated weights for policy 0, policy_version 8830 (0.0048) [2024-06-17 23:01:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39594.6, 300 sec: 39821.7). Total num frames: 144801792. Throughput: 0: 39757.3. Samples: 144876560. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:01:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:01:32,471][12883] Updated weights for policy 0, policy_version 8840 (0.0044) [2024-06-17 23:01:36,434][12883] Updated weights for policy 0, policy_version 8850 (0.0036) [2024-06-17 23:01:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40141.9, 300 sec: 39877.0). Total num frames: 145014784. Throughput: 0: 39831.2. Samples: 145117760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-17 23:01:37,003][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:01:40,619][12883] Updated weights for policy 0, policy_version 8860 (0.0034) [2024-06-17 23:01:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39048.5, 300 sec: 39765.9). Total num frames: 145178624. Throughput: 0: 39839.0. Samples: 145351000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-17 23:01:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:01:43,463][12862] Signal inference workers to stop experience collection... (2050 times) [2024-06-17 23:01:43,467][12862] Signal inference workers to resume experience collection... (2050 times) [2024-06-17 23:01:43,492][12883] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-17 23:01:43,493][12883] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-17 23:01:44,707][12883] Updated weights for policy 0, policy_version 8870 (0.0030) [2024-06-17 23:01:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 39765.9). Total num frames: 145391616. Throughput: 0: 39832.9. Samples: 145470060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-17 23:01:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:01:49,102][12883] Updated weights for policy 0, policy_version 8880 (0.0033) [2024-06-17 23:01:51,994][12645] Fps is (10 sec: 42597.0, 60 sec: 39868.9, 300 sec: 39821.4). Total num frames: 145604608. Throughput: 0: 39976.3. 
Samples: 145715980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-17 23:01:51,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:01:52,923][12883] Updated weights for policy 0, policy_version 8890 (0.0042) [2024-06-17 23:01:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.6, 300 sec: 39821.4). Total num frames: 145784832. Throughput: 0: 39733.6. Samples: 145948760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-17 23:01:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:01:57,600][12883] Updated weights for policy 0, policy_version 8900 (0.0043) [2024-06-17 23:02:00,973][12883] Updated weights for policy 0, policy_version 8910 (0.0059) [2024-06-17 23:02:01,994][12645] Fps is (10 sec: 39323.5, 60 sec: 40413.9, 300 sec: 39765.9). Total num frames: 145997824. Throughput: 0: 39873.9. Samples: 146068680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 20.0) [2024-06-17 23:02:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:02:06,247][12883] Updated weights for policy 0, policy_version 8920 (0.0042) [2024-06-17 23:02:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.5, 300 sec: 39877.0). Total num frames: 146194432. Throughput: 0: 39877.6. Samples: 146312740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-17 23:02:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:02:09,852][12883] Updated weights for policy 0, policy_version 8930 (0.0035) [2024-06-17 23:02:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 39821.5). Total num frames: 146391040. Throughput: 0: 39619.5. Samples: 146541440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-17 23:02:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:02:14,426][12883] Updated weights for policy 0, policy_version 8940 (0.0041) [2024-06-17 23:02:16,996][12645] Fps is (10 sec: 40951.4, 60 sec: 40139.4, 300 sec: 39821.5). Total num frames: 146604032. Throughput: 0: 39756.8. Samples: 146665700. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-17 23:02:16,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:02:18,095][12883] Updated weights for policy 0, policy_version 8950 (0.0040) [2024-06-17 23:02:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 146767872. Throughput: 0: 39688.1. Samples: 146903720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 23:02:21,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:02:23,013][12883] Updated weights for policy 0, policy_version 8960 (0.0030) [2024-06-17 23:02:26,043][12883] Updated weights for policy 0, policy_version 8970 (0.0035) [2024-06-17 23:02:26,994][12645] Fps is (10 sec: 37691.8, 60 sec: 39867.7, 300 sec: 39766.2). Total num frames: 146980864. Throughput: 0: 39754.4. Samples: 147139940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-17 23:02:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:02:31,031][12883] Updated weights for policy 0, policy_version 8980 (0.0046) [2024-06-17 23:02:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39594.8, 300 sec: 39765.9). Total num frames: 147177472. Throughput: 0: 39761.0. Samples: 147259300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-17 23:02:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:02:34,361][12883] Updated weights for policy 0, policy_version 8990 (0.0042) [2024-06-17 23:02:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39321.6, 300 sec: 39821.4). Total num frames: 147374080. Throughput: 0: 39562.5. Samples: 147496280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-17 23:02:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:02:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008995_147374080.pth... 
[2024-06-17 23:02:37,111][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008413_137838592.pth [2024-06-17 23:02:39,295][12883] Updated weights for policy 0, policy_version 9000 (0.0040) [2024-06-17 23:02:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40140.9, 300 sec: 39765.9). Total num frames: 147587072. Throughput: 0: 39544.9. Samples: 147728280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-17 23:02:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:02:42,683][12883] Updated weights for policy 0, policy_version 9010 (0.0045) [2024-06-17 23:02:46,996][12645] Fps is (10 sec: 37675.0, 60 sec: 39320.1, 300 sec: 39765.6). Total num frames: 147750912. Throughput: 0: 39580.2. Samples: 147849880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-17 23:02:46,997][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:02:47,437][12883] Updated weights for policy 0, policy_version 9020 (0.0041) [2024-06-17 23:02:50,860][12883] Updated weights for policy 0, policy_version 9030 (0.0038) [2024-06-17 23:02:51,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39048.8, 300 sec: 39710.4). Total num frames: 147947520. Throughput: 0: 39416.5. Samples: 148086480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-17 23:02:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:02:55,389][12883] Updated weights for policy 0, policy_version 9040 (0.0033) [2024-06-17 23:02:56,994][12645] Fps is (10 sec: 40968.9, 60 sec: 39594.7, 300 sec: 39654.8). Total num frames: 148160512. Throughput: 0: 39734.1. Samples: 148329480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-17 23:02:56,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:02:58,879][12883] Updated weights for policy 0, policy_version 9050 (0.0046) [2024-06-17 23:03:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39321.5, 300 sec: 39710.4). Total num frames: 148357120. Throughput: 0: 39595.2. Samples: 148447400. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-17 23:03:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:03:03,846][12883] Updated weights for policy 0, policy_version 9060 (0.0034) [2024-06-17 23:03:06,933][12883] Updated weights for policy 0, policy_version 9070 (0.0033) [2024-06-17 23:03:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 40140.9, 300 sec: 39877.0). Total num frames: 148602880. Throughput: 0: 39645.8. Samples: 148687780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-17 23:03:07,000][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:03:09,886][12862] Signal inference workers to stop experience collection... (2100 times) [2024-06-17 23:03:09,887][12862] Signal inference workers to resume experience collection... (2100 times) [2024-06-17 23:03:09,910][12883] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-17 23:03:09,910][12883] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-17 23:03:11,995][12645] Fps is (10 sec: 37679.0, 60 sec: 39047.8, 300 sec: 39599.2). Total num frames: 148733952. Throughput: 0: 39718.4. Samples: 148927320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-17 23:03:11,996][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:03:12,342][12883] Updated weights for policy 0, policy_version 9080 (0.0031) [2024-06-17 23:03:15,628][12883] Updated weights for policy 0, policy_version 9090 (0.0042) [2024-06-17 23:03:16,994][12645] Fps is (10 sec: 34406.0, 60 sec: 39049.9, 300 sec: 39654.8). Total num frames: 148946944. Throughput: 0: 39598.9. Samples: 149041260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-17 23:03:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:03:20,568][12883] Updated weights for policy 0, policy_version 9100 (0.0034) [2024-06-17 23:03:21,994][12645] Fps is (10 sec: 42604.0, 60 sec: 39867.8, 300 sec: 39711.2). Total num frames: 149159936. Throughput: 0: 39786.4. Samples: 149286660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-17 23:03:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:03:23,658][12883] Updated weights for policy 0, policy_version 9110 (0.0034) [2024-06-17 23:03:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.6, 300 sec: 39710.4). Total num frames: 149356544. Throughput: 0: 39911.1. Samples: 149524280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-17 23:03:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:03:28,376][12883] Updated weights for policy 0, policy_version 9120 (0.0040) [2024-06-17 23:03:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.6, 300 sec: 39710.4). Total num frames: 149553152. Throughput: 0: 39803.4. Samples: 149640940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-17 23:03:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:03:32,488][12883] Updated weights for policy 0, policy_version 9130 (0.0051) [2024-06-17 23:03:36,933][12883] Updated weights for policy 0, policy_version 9140 (0.0041) [2024-06-17 23:03:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39594.7, 300 sec: 39710.3). Total num frames: 149749760. Throughput: 0: 39660.9. Samples: 149871220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 23:03:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:03:40,852][12883] Updated weights for policy 0, policy_version 9150 (0.0039) [2024-06-17 23:03:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39321.6, 300 sec: 39710.4). Total num frames: 149946368. Throughput: 0: 39754.3. Samples: 150118420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 23:03:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:03:44,872][12883] Updated weights for policy 0, policy_version 9160 (0.0039) [2024-06-17 23:03:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40142.4, 300 sec: 39765.9). Total num frames: 150159360. Throughput: 0: 39811.7. Samples: 150238920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 23:03:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:03:48,780][12883] Updated weights for policy 0, policy_version 9170 (0.0042) [2024-06-17 23:03:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40140.9, 300 sec: 39655.1). Total num frames: 150355968. Throughput: 0: 39714.7. Samples: 150474940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 23:03:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:03:53,012][12883] Updated weights for policy 0, policy_version 9180 (0.0038) [2024-06-17 23:03:56,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 150536192. Throughput: 0: 39523.2. Samples: 150705820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 23:03:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:03:57,374][12883] Updated weights for policy 0, policy_version 9190 (0.0023) [2024-06-17 23:04:01,170][12883] Updated weights for policy 0, policy_version 9200 (0.0027) [2024-06-17 23:04:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40414.0, 300 sec: 39877.0). Total num frames: 150781952. Throughput: 0: 39639.7. Samples: 150825040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:04:01,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:04:05,352][12883] Updated weights for policy 0, policy_version 9210 (0.0030) [2024-06-17 23:04:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 39048.4, 300 sec: 39710.3). Total num frames: 150945792. Throughput: 0: 39531.4. Samples: 151065580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-17 23:04:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:04:09,381][12883] Updated weights for policy 0, policy_version 9220 (0.0047) [2024-06-17 23:04:11,994][12645] Fps is (10 sec: 36044.5, 60 sec: 40141.6, 300 sec: 39711.2). Total num frames: 151142400. Throughput: 0: 39502.2. Samples: 151301880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-17 23:04:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:04:13,839][12883] Updated weights for policy 0, policy_version 9230 (0.0047) [2024-06-17 23:04:16,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39594.7, 300 sec: 39654.8). Total num frames: 151322624. Throughput: 0: 39659.1. Samples: 151425600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:04:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:04:17,586][12883] Updated weights for policy 0, policy_version 9240 (0.0046) [2024-06-17 23:04:21,773][12883] Updated weights for policy 0, policy_version 9250 (0.0052) [2024-06-17 23:04:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.6, 300 sec: 39821.4). Total num frames: 151552000. Throughput: 0: 39987.1. Samples: 151670640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-17 23:04:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:04:25,480][12883] Updated weights for policy 0, policy_version 9260 (0.0048) [2024-06-17 23:04:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 40140.7, 300 sec: 39821.4). Total num frames: 151764992. Throughput: 0: 39785.7. Samples: 151908780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-17 23:04:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:04:30,035][12883] Updated weights for policy 0, policy_version 9270 (0.0030) [2024-06-17 23:04:31,994][12645] Fps is (10 sec: 37683.9, 60 sec: 39594.7, 300 sec: 39599.3). Total num frames: 151928832. Throughput: 0: 39842.3. Samples: 152031820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-17 23:04:31,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:04:33,655][12883] Updated weights for policy 0, policy_version 9280 (0.0036) [2024-06-17 23:04:36,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 152141824. Throughput: 0: 40015.0. Samples: 152275620. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:04:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:04:37,094][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009287_152158208.pth... [2024-06-17 23:04:37,160][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008704_142606336.pth [2024-06-17 23:04:38,103][12883] Updated weights for policy 0, policy_version 9290 (0.0034) [2024-06-17 23:04:41,617][12883] Updated weights for policy 0, policy_version 9300 (0.0050) [2024-06-17 23:04:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40413.9, 300 sec: 39877.0). Total num frames: 152371200. Throughput: 0: 40063.6. Samples: 152508680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:04:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:04:46,761][12883] Updated weights for policy 0, policy_version 9310 (0.0034) [2024-06-17 23:04:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.6, 300 sec: 39765.9). Total num frames: 152551424. Throughput: 0: 40218.5. Samples: 152634880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:04:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:04:50,020][12883] Updated weights for policy 0, policy_version 9320 (0.0030) [2024-06-17 23:04:50,057][12862] Signal inference workers to stop experience collection... (2150 times) [2024-06-17 23:04:50,057][12862] Signal inference workers to resume experience collection... (2150 times) [2024-06-17 23:04:50,067][12883] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-17 23:04:50,091][12883] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-17 23:04:51,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39867.6, 300 sec: 39765.9). Total num frames: 152748032. Throughput: 0: 40034.7. Samples: 152867140. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:04:51,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:04:54,704][12883] Updated weights for policy 0, policy_version 9330 (0.0046) [2024-06-17 23:04:56,997][12645] Fps is (10 sec: 40944.8, 60 sec: 40411.4, 300 sec: 39765.4). Total num frames: 152961024. Throughput: 0: 40228.2. Samples: 153112300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:04:56,998][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:04:58,138][12883] Updated weights for policy 0, policy_version 9340 (0.0031) [2024-06-17 23:05:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39048.4, 300 sec: 39710.4). Total num frames: 153124864. Throughput: 0: 40091.0. Samples: 153229700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:05:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:05:03,363][12883] Updated weights for policy 0, policy_version 9350 (0.0038) [2024-06-17 23:05:06,142][12883] Updated weights for policy 0, policy_version 9360 (0.0051) [2024-06-17 23:05:06,994][12645] Fps is (10 sec: 40975.2, 60 sec: 40413.9, 300 sec: 39821.5). Total num frames: 153370624. Throughput: 0: 39958.7. Samples: 153468780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:05:06,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:05:11,537][12883] Updated weights for policy 0, policy_version 9370 (0.0034) [2024-06-17 23:05:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40140.8, 300 sec: 39766.2). Total num frames: 153550848. Throughput: 0: 40198.4. Samples: 153717700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-17 23:05:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:05:14,065][12883] Updated weights for policy 0, policy_version 9380 (0.0029) [2024-06-17 23:05:17,000][12645] Fps is (10 sec: 37659.9, 60 sec: 40409.6, 300 sec: 39820.6). Total num frames: 153747456. Throughput: 0: 39974.3. Samples: 153830920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-17 23:05:17,001][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:05:19,582][12883] Updated weights for policy 0, policy_version 9390 (0.0037) [2024-06-17 23:05:21,998][12645] Fps is (10 sec: 42577.8, 60 sec: 40410.7, 300 sec: 39931.9). Total num frames: 153976832. Throughput: 0: 39918.9. Samples: 154072160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-17 23:05:21,999][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:05:22,628][12883] Updated weights for policy 0, policy_version 9400 (0.0040) [2024-06-17 23:05:26,994][12645] Fps is (10 sec: 36067.6, 60 sec: 39048.6, 300 sec: 39599.3). Total num frames: 154107904. Throughput: 0: 40170.7. Samples: 154316360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 23:05:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:05:27,605][12883] Updated weights for policy 0, policy_version 9410 (0.0047) [2024-06-17 23:05:30,883][12883] Updated weights for policy 0, policy_version 9420 (0.0029) [2024-06-17 23:05:31,996][12645] Fps is (10 sec: 40970.5, 60 sec: 40958.4, 300 sec: 39932.5). Total num frames: 154386432. Throughput: 0: 39885.2. Samples: 154429800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 23:05:31,997][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:05:35,747][12883] Updated weights for policy 0, policy_version 9430 (0.0026) [2024-06-17 23:05:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39867.8, 300 sec: 39654.8). Total num frames: 154533888. Throughput: 0: 39913.4. Samples: 154663240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 23:05:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:05:39,022][12883] Updated weights for policy 0, policy_version 9440 (0.0037) [2024-06-17 23:05:41,994][12645] Fps is (10 sec: 32775.5, 60 sec: 39048.5, 300 sec: 39710.4). Total num frames: 154714112. Throughput: 0: 39829.2. Samples: 154904460. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 23:05:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:05:44,083][12883] Updated weights for policy 0, policy_version 9450 (0.0032) [2024-06-17 23:05:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 40413.9, 300 sec: 39877.3). Total num frames: 154976256. Throughput: 0: 39911.7. Samples: 155025720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-17 23:05:46,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:05:47,063][12883] Updated weights for policy 0, policy_version 9460 (0.0035) [2024-06-17 23:05:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39594.8, 300 sec: 39710.4). Total num frames: 155123712. Throughput: 0: 40036.6. Samples: 155270420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 23:05:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:05:52,368][12883] Updated weights for policy 0, policy_version 9470 (0.0041) [2024-06-17 23:05:55,488][12883] Updated weights for policy 0, policy_version 9480 (0.0030) [2024-06-17 23:05:57,000][12645] Fps is (10 sec: 37659.6, 60 sec: 39866.1, 300 sec: 39931.7). Total num frames: 155353088. Throughput: 0: 39584.7. Samples: 155499260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 23:05:57,000][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:06:00,467][12883] Updated weights for policy 0, policy_version 9490 (0.0047) [2024-06-17 23:06:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40140.9, 300 sec: 39765.9). Total num frames: 155533312. Throughput: 0: 39895.9. Samples: 155625980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) [2024-06-17 23:06:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:06:03,796][12883] Updated weights for policy 0, policy_version 9500 (0.0043) [2024-06-17 23:06:06,996][12645] Fps is (10 sec: 36058.4, 60 sec: 39047.0, 300 sec: 39765.6). Total num frames: 155713536. Throughput: 0: 39779.9. Samples: 155862160. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-17 23:06:06,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:06:08,312][12862] Signal inference workers to stop experience collection... (2200 times) [2024-06-17 23:06:08,340][12883] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-17 23:06:08,379][12862] Signal inference workers to resume experience collection... (2200 times) [2024-06-17 23:06:08,380][12883] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-17 23:06:08,523][12883] Updated weights for policy 0, policy_version 9510 (0.0030) [2024-06-17 23:06:11,743][12883] Updated weights for policy 0, policy_version 9520 (0.0034) [2024-06-17 23:06:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40413.9, 300 sec: 39932.6). Total num frames: 155975680. Throughput: 0: 39773.8. Samples: 156106180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-17 23:06:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:06:16,635][12883] Updated weights for policy 0, policy_version 9530 (0.0042) [2024-06-17 23:06:16,994][12645] Fps is (10 sec: 44247.9, 60 sec: 40145.0, 300 sec: 39877.0). Total num frames: 156155904. Throughput: 0: 40182.9. Samples: 156237940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 23:06:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:06:19,979][12883] Updated weights for policy 0, policy_version 9540 (0.0040) [2024-06-17 23:06:21,994][12645] Fps is (10 sec: 37682.3, 60 sec: 39597.7, 300 sec: 39877.0). Total num frames: 156352512. Throughput: 0: 40246.6. Samples: 156474340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 23:06:22,003][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:06:24,582][12883] Updated weights for policy 0, policy_version 9550 (0.0024) [2024-06-17 23:06:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 39877.0). Total num frames: 156565504. Throughput: 0: 40461.4. Samples: 156725220. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-17 23:06:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:06:27,939][12883] Updated weights for policy 0, policy_version 9560 (0.0040) [2024-06-17 23:06:31,994][12645] Fps is (10 sec: 39322.3, 60 sec: 39323.1, 300 sec: 39765.9). Total num frames: 156745728. Throughput: 0: 40341.3. Samples: 156841080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-17 23:06:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:06:33,224][12883] Updated weights for policy 0, policy_version 9570 (0.0035) [2024-06-17 23:06:35,990][12883] Updated weights for policy 0, policy_version 9580 (0.0032) [2024-06-17 23:06:36,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40686.9, 300 sec: 39988.1). Total num frames: 156975104. Throughput: 0: 40163.8. Samples: 157077800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-17 23:06:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:06:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009581_156975104.pth... [2024-06-17 23:06:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000008995_147374080.pth [2024-06-17 23:06:41,155][12883] Updated weights for policy 0, policy_version 9590 (0.0048) [2024-06-17 23:06:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 39877.0). Total num frames: 157155328. Throughput: 0: 40462.1. Samples: 157319800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:06:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:06:44,655][12883] Updated weights for policy 0, policy_version 9600 (0.0037) [2024-06-17 23:06:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39594.6, 300 sec: 39821.5). Total num frames: 157351936. Throughput: 0: 40147.0. Samples: 157432600. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:06:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:06:49,554][12883] Updated weights for policy 0, policy_version 9610 (0.0038) [2024-06-17 23:06:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 39932.6). Total num frames: 157564928. Throughput: 0: 40339.6. Samples: 157677340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:06:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:06:52,647][12883] Updated weights for policy 0, policy_version 9620 (0.0034) [2024-06-17 23:06:56,996][12645] Fps is (10 sec: 39313.1, 60 sec: 39870.4, 300 sec: 39821.1). Total num frames: 157745152. Throughput: 0: 40326.4. Samples: 157920960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-17 23:06:56,997][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:06:57,602][12883] Updated weights for policy 0, policy_version 9630 (0.0044) [2024-06-17 23:07:01,316][12883] Updated weights for policy 0, policy_version 9640 (0.0029) [2024-06-17 23:07:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 39877.0). Total num frames: 157958144. Throughput: 0: 39921.7. Samples: 158034420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-17 23:07:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:07:05,582][12883] Updated weights for policy 0, policy_version 9650 (0.0040) [2024-06-17 23:07:06,994][12645] Fps is (10 sec: 40968.7, 60 sec: 40688.5, 300 sec: 39877.0). Total num frames: 158154752. Throughput: 0: 40048.0. Samples: 158276500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-17 23:07:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:07:09,323][12883] Updated weights for policy 0, policy_version 9660 (0.0036) [2024-06-17 23:07:11,996][12645] Fps is (10 sec: 39312.8, 60 sec: 39593.1, 300 sec: 39821.4). Total num frames: 158351360. Throughput: 0: 39765.1. Samples: 158514740. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-17 23:07:11,996][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:07:13,610][12883] Updated weights for policy 0, policy_version 9670 (0.0038) [2024-06-17 23:07:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.7, 300 sec: 39988.1). Total num frames: 158564352. Throughput: 0: 39797.2. Samples: 158631960. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-17 23:07:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:07:17,757][12883] Updated weights for policy 0, policy_version 9680 (0.0042) [2024-06-17 23:07:21,509][12883] Updated weights for policy 0, policy_version 9690 (0.0052) [2024-06-17 23:07:21,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 158760960. Throughput: 0: 39920.1. Samples: 158874200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 23:07:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:07:26,137][12883] Updated weights for policy 0, policy_version 9700 (0.0043) [2024-06-17 23:07:26,996][12645] Fps is (10 sec: 40951.0, 60 sec: 40139.2, 300 sec: 39987.7). Total num frames: 158973952. Throughput: 0: 39796.2. Samples: 159110720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-17 23:07:26,997][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:07:28,917][12862] Signal inference workers to stop experience collection... (2250 times) [2024-06-17 23:07:28,971][12862] Signal inference workers to resume experience collection... (2250 times) [2024-06-17 23:07:28,972][12883] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-17 23:07:28,991][12883] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-17 23:07:29,760][12883] Updated weights for policy 0, policy_version 9710 (0.0046) [2024-06-17 23:07:31,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39867.6, 300 sec: 39877.0). Total num frames: 159137792. Throughput: 0: 39865.7. 
Samples: 159226560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-17 23:07:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:07:34,262][12883] Updated weights for policy 0, policy_version 9720 (0.0042) [2024-06-17 23:07:36,994][12645] Fps is (10 sec: 37692.0, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 159350784. Throughput: 0: 39797.8. Samples: 159468240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 23:07:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:07:38,087][12883] Updated weights for policy 0, policy_version 9730 (0.0032) [2024-06-17 23:07:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.7, 300 sec: 39988.4). Total num frames: 159547392. Throughput: 0: 39757.5. Samples: 159709960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 23:07:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:07:42,451][12883] Updated weights for policy 0, policy_version 9740 (0.0032) [2024-06-17 23:07:46,068][12883] Updated weights for policy 0, policy_version 9750 (0.0044) [2024-06-17 23:07:46,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40140.8, 300 sec: 40043.6). Total num frames: 159760384. Throughput: 0: 39885.3. Samples: 159829260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:07:46,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:07:50,830][12883] Updated weights for policy 0, policy_version 9760 (0.0037) [2024-06-17 23:07:51,995][12645] Fps is (10 sec: 37678.6, 60 sec: 39320.8, 300 sec: 39876.8). Total num frames: 159924224. Throughput: 0: 39813.2. Samples: 160068140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-17 23:07:51,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:07:54,674][12883] Updated weights for policy 0, policy_version 9770 (0.0031) [2024-06-17 23:07:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39869.2, 300 sec: 39932.5). Total num frames: 160137216. Throughput: 0: 39745.1. Samples: 160303180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-17 23:07:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:07:59,683][12883] Updated weights for policy 0, policy_version 9780 (0.0034) [2024-06-17 23:08:01,994][12645] Fps is (10 sec: 44242.3, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 160366592. Throughput: 0: 39892.1. Samples: 160427100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-17 23:08:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:08:02,705][12883] Updated weights for policy 0, policy_version 9790 (0.0044) [2024-06-17 23:08:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39594.7, 300 sec: 39988.2). Total num frames: 160530432. Throughput: 0: 39750.6. Samples: 160662980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 23:08:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:08:07,609][12883] Updated weights for policy 0, policy_version 9800 (0.0044) [2024-06-17 23:08:11,312][12883] Updated weights for policy 0, policy_version 9810 (0.0041) [2024-06-17 23:08:11,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39596.1, 300 sec: 39932.5). Total num frames: 160727040. Throughput: 0: 39748.6. Samples: 160899320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-17 23:08:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:08:15,970][12883] Updated weights for policy 0, policy_version 9820 (0.0040) [2024-06-17 23:08:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 160940032. Throughput: 0: 39907.2. Samples: 161022380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 23:08:16,999][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:08:19,249][12883] Updated weights for policy 0, policy_version 9830 (0.0022) [2024-06-17 23:08:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 161136640. Throughput: 0: 39747.5. Samples: 161256880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 23:08:21,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:08:24,060][12883] Updated weights for policy 0, policy_version 9840 (0.0040) [2024-06-17 23:08:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39323.1, 300 sec: 39932.5). Total num frames: 161333248. Throughput: 0: 39620.0. Samples: 161492860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-17 23:08:26,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-17 23:08:27,035][12862] Saving new best policy, reward=0.015! [2024-06-17 23:08:28,279][12883] Updated weights for policy 0, policy_version 9850 (0.0038) [2024-06-17 23:08:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 39867.8, 300 sec: 39932.5). Total num frames: 161529856. Throughput: 0: 39668.0. Samples: 161614320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 23:08:32,000][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:08:32,136][12883] Updated weights for policy 0, policy_version 9860 (0.0036) [2024-06-17 23:08:36,446][12883] Updated weights for policy 0, policy_version 9870 (0.0054) [2024-06-17 23:08:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39594.6, 300 sec: 39932.5). Total num frames: 161726464. Throughput: 0: 39637.6. Samples: 161851780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 23:08:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:08:37,195][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009873_161759232.pth... [2024-06-17 23:08:37,256][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009287_152158208.pth [2024-06-17 23:08:40,331][12883] Updated weights for policy 0, policy_version 9880 (0.0034) [2024-06-17 23:08:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39867.6, 300 sec: 39932.5). Total num frames: 161939456. Throughput: 0: 39699.9. Samples: 162089680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:08:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:08:44,582][12883] Updated weights for policy 0, policy_version 9890 (0.0039) [2024-06-17 23:08:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39321.6, 300 sec: 39877.0). Total num frames: 162119680. Throughput: 0: 39651.1. Samples: 162211400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:08:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:08:48,416][12883] Updated weights for policy 0, policy_version 9900 (0.0034) [2024-06-17 23:08:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39868.5, 300 sec: 39932.5). Total num frames: 162316288. Throughput: 0: 39585.0. Samples: 162444300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-17 23:08:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:08:52,787][12883] Updated weights for policy 0, policy_version 9910 (0.0049) [2024-06-17 23:08:56,859][12883] Updated weights for policy 0, policy_version 9920 (0.0036) [2024-06-17 23:08:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39867.7, 300 sec: 39821.4). Total num frames: 162529280. Throughput: 0: 39589.0. Samples: 162680820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-17 23:08:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:09:01,505][12883] Updated weights for policy 0, policy_version 9930 (0.0048) [2024-06-17 23:09:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39048.6, 300 sec: 39877.0). Total num frames: 162709504. Throughput: 0: 39531.6. Samples: 162801300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-17 23:09:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:09:04,835][12883] Updated weights for policy 0, policy_version 9940 (0.0035) [2024-06-17 23:09:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.8, 300 sec: 39988.1). Total num frames: 162938880. Throughput: 0: 39549.7. Samples: 163036620. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-17 23:09:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:09:09,555][12862] Signal inference workers to stop experience collection... (2300 times) [2024-06-17 23:09:09,559][12862] Signal inference workers to resume experience collection... (2300 times) [2024-06-17 23:09:09,575][12883] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-17 23:09:09,612][12883] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-17 23:09:09,702][12883] Updated weights for policy 0, policy_version 9950 (0.0046) [2024-06-17 23:09:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.8, 300 sec: 39988.1). Total num frames: 163119104. Throughput: 0: 39674.7. Samples: 163278220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:09:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:09:12,739][12883] Updated weights for policy 0, policy_version 9960 (0.0045) [2024-06-17 23:09:16,994][12645] Fps is (10 sec: 36044.5, 60 sec: 39321.5, 300 sec: 39821.4). Total num frames: 163299328. Throughput: 0: 39648.8. Samples: 163398520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:09:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:09:17,844][12883] Updated weights for policy 0, policy_version 9970 (0.0035) [2024-06-17 23:09:20,792][12883] Updated weights for policy 0, policy_version 9980 (0.0039) [2024-06-17 23:09:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39867.6, 300 sec: 39877.0). Total num frames: 163528704. Throughput: 0: 39586.1. Samples: 163633160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-17 23:09:22,000][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:09:25,923][12883] Updated weights for policy 0, policy_version 9990 (0.0046) [2024-06-17 23:09:26,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 163708928. Throughput: 0: 39713.1. 
Samples: 163876760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-17 23:09:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:09:29,553][12883] Updated weights for policy 0, policy_version 10000 (0.0047) [2024-06-17 23:09:31,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39594.7, 300 sec: 39877.0). Total num frames: 163905536. Throughput: 0: 39517.4. Samples: 163989680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:09:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:09:34,187][12883] Updated weights for policy 0, policy_version 10010 (0.0047) [2024-06-17 23:09:36,996][12645] Fps is (10 sec: 40951.3, 60 sec: 39866.3, 300 sec: 39821.2). Total num frames: 164118528. Throughput: 0: 39789.8. Samples: 164234920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-17 23:09:36,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:09:37,605][12883] Updated weights for policy 0, policy_version 10020 (0.0041) [2024-06-17 23:09:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39321.7, 300 sec: 39821.5). Total num frames: 164298752. Throughput: 0: 39867.5. Samples: 164474860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-17 23:09:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:09:42,450][12883] Updated weights for policy 0, policy_version 10030 (0.0032) [2024-06-17 23:09:45,931][12883] Updated weights for policy 0, policy_version 10040 (0.0031) [2024-06-17 23:09:46,994][12645] Fps is (10 sec: 40968.1, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 164528128. Throughput: 0: 39927.4. Samples: 164598040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 23:09:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:09:50,738][12883] Updated weights for policy 0, policy_version 10050 (0.0035) [2024-06-17 23:09:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40140.8, 300 sec: 39877.5). Total num frames: 164724736. Throughput: 0: 40081.8. Samples: 164840300. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-17 23:09:51,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:09:53,964][12883] Updated weights for policy 0, policy_version 10060 (0.0042) [2024-06-17 23:09:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39594.6, 300 sec: 39932.5). Total num frames: 164904960. Throughput: 0: 39937.2. Samples: 165075400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-17 23:09:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:09:58,842][12883] Updated weights for policy 0, policy_version 10070 (0.0046) [2024-06-17 23:10:01,993][12645] Fps is (10 sec: 40960.8, 60 sec: 40413.9, 300 sec: 39877.0). Total num frames: 165134336. Throughput: 0: 39902.5. Samples: 165194120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-17 23:10:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:10:02,105][12883] Updated weights for policy 0, policy_version 10080 (0.0041) [2024-06-17 23:10:06,980][12883] Updated weights for policy 0, policy_version 10090 (0.0043) [2024-06-17 23:10:06,994][12645] Fps is (10 sec: 40960.9, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 165314560. Throughput: 0: 40070.4. Samples: 165436320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-17 23:10:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:10:10,625][12883] Updated weights for policy 0, policy_version 10100 (0.0043) [2024-06-17 23:10:11,994][12645] Fps is (10 sec: 37682.4, 60 sec: 39867.7, 300 sec: 39877.8). Total num frames: 165511168. Throughput: 0: 39847.9. Samples: 165669920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 23:10:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:10:15,034][12883] Updated weights for policy 0, policy_version 10110 (0.0026) [2024-06-17 23:10:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40413.9, 300 sec: 39822.1). Total num frames: 165724160. Throughput: 0: 40071.0. Samples: 165792880. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:10:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:10:18,786][12883] Updated weights for policy 0, policy_version 10120 (0.0043) [2024-06-17 23:10:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39321.7, 300 sec: 39932.5). Total num frames: 165888000. Throughput: 0: 39785.0. Samples: 166025160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:10:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:10:23,558][12883] Updated weights for policy 0, policy_version 10130 (0.0039) [2024-06-17 23:10:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.8, 300 sec: 39766.2). Total num frames: 166117376. Throughput: 0: 39698.7. Samples: 166261300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 23:10:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:10:27,020][12883] Updated weights for policy 0, policy_version 10140 (0.0034) [2024-06-17 23:10:31,663][12883] Updated weights for policy 0, policy_version 10150 (0.0036) [2024-06-17 23:10:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 166297600. Throughput: 0: 39552.0. Samples: 166377880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 23:10:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:10:35,129][12883] Updated weights for policy 0, policy_version 10160 (0.0043) [2024-06-17 23:10:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39869.1, 300 sec: 39988.1). Total num frames: 166510592. Throughput: 0: 39447.6. Samples: 166615440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 23:10:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:10:37,158][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010164_166526976.pth... 
[2024-06-17 23:10:37,201][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009581_156975104.pth [2024-06-17 23:10:39,815][12883] Updated weights for policy 0, policy_version 10170 (0.0033) [2024-06-17 23:10:41,988][12862] Signal inference workers to stop experience collection... (2350 times) [2024-06-17 23:10:41,988][12862] Signal inference workers to resume experience collection... (2350 times) [2024-06-17 23:10:41,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 166690816. Throughput: 0: 39711.7. Samples: 166862420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-17 23:10:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:10:42,017][12883] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-17 23:10:42,017][12883] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-17 23:10:43,214][12883] Updated weights for policy 0, policy_version 10180 (0.0036) [2024-06-17 23:10:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39321.7, 300 sec: 39877.0). Total num frames: 166887424. Throughput: 0: 39691.1. Samples: 166980220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-17 23:10:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:10:47,722][12883] Updated weights for policy 0, policy_version 10190 (0.0030) [2024-06-17 23:10:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 39867.8, 300 sec: 39877.8). Total num frames: 167116800. Throughput: 0: 39616.4. Samples: 167219060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-17 23:10:51,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-17 23:10:51,995][12862] Saving new best policy, reward=0.016! 
[2024-06-17 23:10:51,998][12883] Updated weights for policy 0, policy_version 10200 (0.0035) [2024-06-17 23:10:55,837][12883] Updated weights for policy 0, policy_version 10210 (0.0041) [2024-06-17 23:10:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39867.8, 300 sec: 39877.0). Total num frames: 167297024. Throughput: 0: 39807.1. Samples: 167461240. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-17 23:10:56,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-17 23:10:57,010][12862] Saving new best policy, reward=0.023! [2024-06-17 23:11:00,234][12883] Updated weights for policy 0, policy_version 10220 (0.0046) [2024-06-17 23:11:01,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39048.5, 300 sec: 39877.3). Total num frames: 167477248. Throughput: 0: 39686.4. Samples: 167578760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:11:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:11:03,945][12883] Updated weights for policy 0, policy_version 10230 (0.0042) [2024-06-17 23:11:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 167706624. Throughput: 0: 39904.9. Samples: 167820880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:11:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:11:08,353][12883] Updated weights for policy 0, policy_version 10240 (0.0047) [2024-06-17 23:11:11,996][12645] Fps is (10 sec: 44226.6, 60 sec: 40139.4, 300 sec: 39876.7). Total num frames: 167919616. Throughput: 0: 39835.8. Samples: 168054000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:11:11,997][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:11:12,224][12883] Updated weights for policy 0, policy_version 10250 (0.0035) [2024-06-17 23:11:16,762][12883] Updated weights for policy 0, policy_version 10260 (0.0038) [2024-06-17 23:11:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39594.8, 300 sec: 39821.5). Total num frames: 168099840. Throughput: 0: 40035.7. 
Samples: 168179480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:11:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:11:20,255][12883] Updated weights for policy 0, policy_version 10270 (0.0037) [2024-06-17 23:11:21,994][12645] Fps is (10 sec: 36051.7, 60 sec: 39867.5, 300 sec: 39710.3). Total num frames: 168280064. Throughput: 0: 39959.8. Samples: 168413640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:11:21,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:11:24,851][12883] Updated weights for policy 0, policy_version 10280 (0.0041) [2024-06-17 23:11:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39321.6, 300 sec: 39765.9). Total num frames: 168476672. Throughput: 0: 40070.7. Samples: 168665600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-17 23:11:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:11:28,422][12883] Updated weights for policy 0, policy_version 10290 (0.0044) [2024-06-17 23:11:31,994][12645] Fps is (10 sec: 42599.4, 60 sec: 40140.8, 300 sec: 39765.9). Total num frames: 168706048. Throughput: 0: 39954.5. Samples: 168778180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 23:11:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:11:33,322][12883] Updated weights for policy 0, policy_version 10300 (0.0032) [2024-06-17 23:11:36,746][12883] Updated weights for policy 0, policy_version 10310 (0.0038) [2024-06-17 23:11:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 168919040. Throughput: 0: 40009.3. Samples: 169019480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-17 23:11:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:11:41,299][12883] Updated weights for policy 0, policy_version 10320 (0.0039) [2024-06-17 23:11:41,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 169082880. Throughput: 0: 39942.3. Samples: 169258640. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-17 23:11:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:11:44,899][12883] Updated weights for policy 0, policy_version 10330 (0.0037) [2024-06-17 23:11:46,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.8, 300 sec: 39765.9). Total num frames: 169295872. Throughput: 0: 39905.3. Samples: 169374500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-17 23:11:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:11:49,676][12883] Updated weights for policy 0, policy_version 10340 (0.0044) [2024-06-17 23:11:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 39594.6, 300 sec: 39821.7). Total num frames: 169492480. Throughput: 0: 39969.6. Samples: 169619520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-17 23:11:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:11:53,304][12883] Updated weights for policy 0, policy_version 10350 (0.0046) [2024-06-17 23:11:56,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40139.3, 300 sec: 39821.1). Total num frames: 169705472. Throughput: 0: 40072.0. Samples: 169857240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:11:56,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:11:57,772][12883] Updated weights for policy 0, policy_version 10360 (0.0035) [2024-06-17 23:11:59,423][12862] Signal inference workers to stop experience collection... (2400 times) [2024-06-17 23:11:59,471][12883] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-17 23:11:59,535][12862] Signal inference workers to resume experience collection... (2400 times) [2024-06-17 23:11:59,535][12883] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-17 23:12:01,681][12883] Updated weights for policy 0, policy_version 10370 (0.0032) [2024-06-17 23:12:01,996][12645] Fps is (10 sec: 42589.5, 60 sec: 40685.4, 300 sec: 39876.7). Total num frames: 169918464. Throughput: 0: 39910.9. 
Samples: 169975560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:12:01,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:12:06,255][12883] Updated weights for policy 0, policy_version 10380 (0.0031) [2024-06-17 23:12:06,994][12645] Fps is (10 sec: 37691.9, 60 sec: 39594.7, 300 sec: 39766.2). Total num frames: 170082304. Throughput: 0: 39975.0. Samples: 170212500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:12:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:12:09,953][12883] Updated weights for policy 0, policy_version 10390 (0.0036) [2024-06-17 23:12:11,994][12645] Fps is (10 sec: 36052.4, 60 sec: 39323.0, 300 sec: 39710.4). Total num frames: 170278912. Throughput: 0: 39612.7. Samples: 170448180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:12:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:12:14,382][12883] Updated weights for policy 0, policy_version 10400 (0.0030) [2024-06-17 23:12:16,994][12645] Fps is (10 sec: 40959.2, 60 sec: 39867.6, 300 sec: 39765.9). Total num frames: 170491904. Throughput: 0: 39757.7. Samples: 170567280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 25.0) [2024-06-17 23:12:16,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:12:18,100][12883] Updated weights for policy 0, policy_version 10410 (0.0029) [2024-06-17 23:12:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40141.0, 300 sec: 39710.7). Total num frames: 170688512. Throughput: 0: 39747.1. Samples: 170808100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-17 23:12:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:12:22,478][12883] Updated weights for policy 0, policy_version 10420 (0.0034) [2024-06-17 23:12:26,650][12883] Updated weights for policy 0, policy_version 10430 (0.0042) [2024-06-17 23:12:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.8, 300 sec: 39821.5). Total num frames: 170885120. Throughput: 0: 39695.5. Samples: 171044940. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-17 23:12:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:12:30,736][12883] Updated weights for policy 0, policy_version 10440 (0.0044) [2024-06-17 23:12:31,996][12645] Fps is (10 sec: 37675.0, 60 sec: 39320.2, 300 sec: 39710.1). Total num frames: 171065344. Throughput: 0: 39793.1. Samples: 171165280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-17 23:12:31,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:12:34,974][12883] Updated weights for policy 0, policy_version 10450 (0.0040) [2024-06-17 23:12:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39321.6, 300 sec: 39765.9). Total num frames: 171278336. Throughput: 0: 39622.3. Samples: 171402520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-17 23:12:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:12:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010455_171294720.pth... [2024-06-17 23:12:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000009873_161759232.pth [2024-06-17 23:12:38,729][12883] Updated weights for policy 0, policy_version 10460 (0.0037) [2024-06-17 23:12:41,994][12645] Fps is (10 sec: 44246.7, 60 sec: 40413.8, 300 sec: 39821.5). Total num frames: 171507712. Throughput: 0: 39601.1. Samples: 171639200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-17 23:12:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:12:43,004][12883] Updated weights for policy 0, policy_version 10470 (0.0037) [2024-06-17 23:12:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 39867.6, 300 sec: 39877.1). Total num frames: 171687936. Throughput: 0: 39662.7. Samples: 171760300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-17 23:12:46,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:12:47,191][12883] Updated weights for policy 0, policy_version 10480 (0.0024) [2024-06-17 23:12:51,504][12883] Updated weights for policy 0, policy_version 10490 (0.0031) [2024-06-17 23:12:51,996][12645] Fps is (10 sec: 37674.6, 60 sec: 39866.3, 300 sec: 39821.1). Total num frames: 171884544. Throughput: 0: 39694.8. Samples: 171998860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-17 23:12:51,997][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:12:55,583][12883] Updated weights for policy 0, policy_version 10500 (0.0045) [2024-06-17 23:12:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39596.1, 300 sec: 39710.4). Total num frames: 172081152. Throughput: 0: 39809.7. Samples: 172239620. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-17 23:12:56,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:12:59,753][12883] Updated weights for policy 0, policy_version 10510 (0.0037) [2024-06-17 23:13:01,996][12645] Fps is (10 sec: 39321.7, 60 sec: 39321.6, 300 sec: 39821.2). Total num frames: 172277760. Throughput: 0: 39813.7. Samples: 172358980. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-17 23:13:01,996][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:13:03,810][12883] Updated weights for policy 0, policy_version 10520 (0.0032) [2024-06-17 23:13:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.6, 300 sec: 39821.5). Total num frames: 172474368. Throughput: 0: 39515.1. Samples: 172586280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-17 23:13:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:13:07,874][12883] Updated weights for policy 0, policy_version 10530 (0.0050) [2024-06-17 23:13:11,994][12645] Fps is (10 sec: 37691.2, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 172654592. Throughput: 0: 39622.1. Samples: 172827940. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-17 23:13:11,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:13:12,485][12883] Updated weights for policy 0, policy_version 10540 (0.0052) [2024-06-17 23:13:16,137][12883] Updated weights for policy 0, policy_version 10550 (0.0031) [2024-06-17 23:13:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39594.7, 300 sec: 39765.9). Total num frames: 172867584. Throughput: 0: 39446.8. Samples: 172940300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-17 23:13:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:13:21,052][12883] Updated weights for policy 0, policy_version 10560 (0.0037) [2024-06-17 23:13:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 39867.7, 300 sec: 39821.4). Total num frames: 173080576. Throughput: 0: 39685.2. Samples: 173188360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-17 23:13:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:13:24,299][12883] Updated weights for policy 0, policy_version 10570 (0.0024) [2024-06-17 23:13:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39594.7, 300 sec: 39765.9). Total num frames: 173260800. Throughput: 0: 39600.9. Samples: 173421240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-17 23:13:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:13:29,358][12883] Updated weights for policy 0, policy_version 10580 (0.0030) [2024-06-17 23:13:29,699][12862] Signal inference workers to stop experience collection... (2450 times) [2024-06-17 23:13:29,699][12862] Signal inference workers to resume experience collection... (2450 times) [2024-06-17 23:13:29,729][12883] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-17 23:13:29,730][12883] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-17 23:13:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40142.2, 300 sec: 39821.4). Total num frames: 173473792. Throughput: 0: 39700.0. 
Samples: 173546800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-17 23:13:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:13:32,832][12883] Updated weights for policy 0, policy_version 10590 (0.0055) [2024-06-17 23:13:36,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39321.5, 300 sec: 39654.8). Total num frames: 173637632. Throughput: 0: 39739.7. Samples: 173787060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-17 23:13:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:13:37,375][12883] Updated weights for policy 0, policy_version 10600 (0.0029) [2024-06-17 23:13:40,862][12883] Updated weights for policy 0, policy_version 10610 (0.0046) [2024-06-17 23:13:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 173883392. Throughput: 0: 39606.7. Samples: 174021920. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-17 23:13:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:13:45,538][12883] Updated weights for policy 0, policy_version 10620 (0.0050) [2024-06-17 23:13:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 174080000. Throughput: 0: 39785.0. Samples: 174149220. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-17 23:13:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:13:48,957][12883] Updated weights for policy 0, policy_version 10630 (0.0039) [2024-06-17 23:13:51,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39323.1, 300 sec: 39710.4). Total num frames: 174243840. Throughput: 0: 39804.5. Samples: 174377480. Policy #0 lag: (min: 0.0, avg: 7.6, max: 21.0) [2024-06-17 23:13:52,004][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:13:53,823][12883] Updated weights for policy 0, policy_version 10640 (0.0035) [2024-06-17 23:13:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39594.7, 300 sec: 39821.4). Total num frames: 174456832. Throughput: 0: 39708.9. Samples: 174614840. 
Policy #0 lag: (min: 1.0, avg: 7.2, max: 19.0) [2024-06-17 23:13:57,003][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:13:57,518][12883] Updated weights for policy 0, policy_version 10650 (0.0034) [2024-06-17 23:14:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39050.0, 300 sec: 39599.3). Total num frames: 174620672. Throughput: 0: 39812.9. Samples: 174731880. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-17 23:14:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:14:02,269][12883] Updated weights for policy 0, policy_version 10660 (0.0045) [2024-06-17 23:14:05,405][12883] Updated weights for policy 0, policy_version 10670 (0.0043) [2024-06-17 23:14:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40140.8, 300 sec: 39877.0). Total num frames: 174882816. Throughput: 0: 39650.7. Samples: 174972640. Policy #0 lag: (min: 0.0, avg: 7.4, max: 21.0) [2024-06-17 23:14:07,000][12645] Avg episode reward: [(0, '0.018')] [2024-06-17 23:14:10,622][12883] Updated weights for policy 0, policy_version 10680 (0.0039) [2024-06-17 23:14:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 39867.7, 300 sec: 39821.5). Total num frames: 175046656. Throughput: 0: 39807.9. Samples: 175212600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-17 23:14:11,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:14:13,547][12883] Updated weights for policy 0, policy_version 10690 (0.0039) [2024-06-17 23:14:16,994][12645] Fps is (10 sec: 34406.6, 60 sec: 39321.6, 300 sec: 39654.8). Total num frames: 175226880. Throughput: 0: 39465.8. Samples: 175322760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-17 23:14:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:14:18,823][12883] Updated weights for policy 0, policy_version 10700 (0.0040) [2024-06-17 23:14:21,992][12883] Updated weights for policy 0, policy_version 10710 (0.0044) [2024-06-17 23:14:21,994][12645] Fps is (10 sec: 42599.2, 60 sec: 39867.9, 300 sec: 39877.0). 
Total num frames: 175472640. Throughput: 0: 39532.2. Samples: 175566000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-17 23:14:21,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:14:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39321.5, 300 sec: 39710.4). Total num frames: 175620096. Throughput: 0: 39836.0. Samples: 175814540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 22.0) [2024-06-17 23:14:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:14:27,119][12883] Updated weights for policy 0, policy_version 10720 (0.0034) [2024-06-17 23:14:29,943][12883] Updated weights for policy 0, policy_version 10730 (0.0036) [2024-06-17 23:14:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40140.8, 300 sec: 39877.3). Total num frames: 175882240. Throughput: 0: 39381.0. Samples: 175921360. Policy #0 lag: (min: 1.0, avg: 9.2, max: 24.0) [2024-06-17 23:14:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:14:35,331][12883] Updated weights for policy 0, policy_version 10740 (0.0049) [2024-06-17 23:14:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40140.9, 300 sec: 39821.5). Total num frames: 176046080. Throughput: 0: 39965.3. Samples: 176175920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-17 23:14:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:14:37,100][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010746_176062464.pth... [2024-06-17 23:14:37,154][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010164_166526976.pth [2024-06-17 23:14:38,141][12883] Updated weights for policy 0, policy_version 10750 (0.0047) [2024-06-17 23:14:41,994][12645] Fps is (10 sec: 34406.5, 60 sec: 39048.6, 300 sec: 39654.8). Total num frames: 176226304. Throughput: 0: 39911.6. Samples: 176410860. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-17 23:14:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:14:43,425][12883] Updated weights for policy 0, policy_version 10760 (0.0038) [2024-06-17 23:14:46,080][12862] Signal inference workers to stop experience collection... (2500 times) [2024-06-17 23:14:46,080][12862] Signal inference workers to resume experience collection... (2500 times) [2024-06-17 23:14:46,128][12883] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-17 23:14:46,128][12883] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-17 23:14:46,211][12883] Updated weights for policy 0, policy_version 10770 (0.0025) [2024-06-17 23:14:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 39867.8, 300 sec: 39821.4). Total num frames: 176472064. Throughput: 0: 40015.9. Samples: 176532600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:14:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:14:51,910][12883] Updated weights for policy 0, policy_version 10780 (0.0035) [2024-06-17 23:14:51,996][12645] Fps is (10 sec: 39312.8, 60 sec: 39593.2, 300 sec: 39710.1). Total num frames: 176619520. Throughput: 0: 39991.9. Samples: 176772360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:14:51,997][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:14:54,576][12883] Updated weights for policy 0, policy_version 10790 (0.0037) [2024-06-17 23:14:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 39765.9). Total num frames: 176865280. Throughput: 0: 39754.7. Samples: 177001560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-17 23:14:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:15:00,345][12883] Updated weights for policy 0, policy_version 10800 (0.0031) [2024-06-17 23:15:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 40413.9, 300 sec: 39765.9). Total num frames: 177045504. Throughput: 0: 40156.5. 
Samples: 177129800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-17 23:15:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:15:02,898][12883] Updated weights for policy 0, policy_version 10810 (0.0053) [2024-06-17 23:15:06,996][12645] Fps is (10 sec: 37675.2, 60 sec: 39320.2, 300 sec: 39765.6). Total num frames: 177242112. Throughput: 0: 39896.2. Samples: 177361420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-17 23:15:06,996][12645] Avg episode reward: [(0, '0.013')] [2024-06-17 23:15:08,362][12883] Updated weights for policy 0, policy_version 10820 (0.0036) [2024-06-17 23:15:11,222][12883] Updated weights for policy 0, policy_version 10830 (0.0041) [2024-06-17 23:15:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 39765.9). Total num frames: 177455104. Throughput: 0: 39647.1. Samples: 177598660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:15:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:15:16,309][12883] Updated weights for policy 0, policy_version 10840 (0.0052) [2024-06-17 23:15:16,994][12645] Fps is (10 sec: 37691.2, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 177618944. Throughput: 0: 39988.8. Samples: 177720860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:15:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:15:19,818][12883] Updated weights for policy 0, policy_version 10850 (0.0036) [2024-06-17 23:15:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.6, 300 sec: 39821.4). Total num frames: 177864704. Throughput: 0: 39476.4. Samples: 177952360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-17 23:15:21,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:15:24,957][12883] Updated weights for policy 0, policy_version 10860 (0.0031) [2024-06-17 23:15:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39867.8, 300 sec: 39710.4). Total num frames: 178012160. Throughput: 0: 39642.2. Samples: 178194760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-17 23:15:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:15:28,253][12883] Updated weights for policy 0, policy_version 10870 (0.0034) [2024-06-17 23:15:31,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39048.6, 300 sec: 39710.4). Total num frames: 178225152. Throughput: 0: 39428.6. Samples: 178306880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-17 23:15:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:15:33,259][12883] Updated weights for policy 0, policy_version 10880 (0.0036) [2024-06-17 23:15:36,495][12883] Updated weights for policy 0, policy_version 10890 (0.0048) [2024-06-17 23:15:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 178421760. Throughput: 0: 39397.0. Samples: 178545140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 23:15:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:15:41,474][12883] Updated weights for policy 0, policy_version 10900 (0.0044) [2024-06-17 23:15:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 178618368. Throughput: 0: 39684.5. Samples: 178787360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 23:15:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:15:44,763][12883] Updated weights for policy 0, policy_version 10910 (0.0046) [2024-06-17 23:15:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 39594.7, 300 sec: 39765.9). Total num frames: 178847744. Throughput: 0: 39424.0. Samples: 178903880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-17 23:15:46,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:15:49,587][12883] Updated weights for policy 0, policy_version 10920 (0.0043) [2024-06-17 23:15:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40142.3, 300 sec: 39765.9). Total num frames: 179027968. Throughput: 0: 39604.2. Samples: 179143520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-17 23:15:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:15:52,842][12883] Updated weights for policy 0, policy_version 10930 (0.0043) [2024-06-17 23:15:56,994][12645] Fps is (10 sec: 36044.5, 60 sec: 39048.6, 300 sec: 39765.9). Total num frames: 179208192. Throughput: 0: 39435.6. Samples: 179373260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-17 23:15:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:15:58,016][12883] Updated weights for policy 0, policy_version 10940 (0.0038) [2024-06-17 23:16:01,122][12883] Updated weights for policy 0, policy_version 10950 (0.0038) [2024-06-17 23:16:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 39594.6, 300 sec: 39710.3). Total num frames: 179421184. Throughput: 0: 39349.3. Samples: 179491580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-17 23:16:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:16:06,205][12883] Updated weights for policy 0, policy_version 10960 (0.0032) [2024-06-17 23:16:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39596.1, 300 sec: 39655.1). Total num frames: 179617792. Throughput: 0: 39646.2. Samples: 179736440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:16:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:16:09,343][12883] Updated weights for policy 0, policy_version 10970 (0.0027) [2024-06-17 23:16:11,994][12645] Fps is (10 sec: 39322.3, 60 sec: 39321.7, 300 sec: 39710.4). Total num frames: 179814400. Throughput: 0: 39561.8. Samples: 179975040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-17 23:16:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:16:12,982][12862] Signal inference workers to stop experience collection... (2550 times) [2024-06-17 23:16:12,982][12862] Signal inference workers to resume experience collection... 
(2550 times) [2024-06-17 23:16:13,003][12883] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-17 23:16:13,008][12883] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-17 23:16:14,263][12883] Updated weights for policy 0, policy_version 10980 (0.0049) [2024-06-17 23:16:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 39821.5). Total num frames: 180027392. Throughput: 0: 39788.4. Samples: 180097360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-17 23:16:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:16:17,608][12883] Updated weights for policy 0, policy_version 10990 (0.0035) [2024-06-17 23:16:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 38775.4, 300 sec: 39710.3). Total num frames: 180191232. Throughput: 0: 39837.4. Samples: 180337820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 23:16:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:16:22,767][12883] Updated weights for policy 0, policy_version 11000 (0.0035) [2024-06-17 23:16:25,672][12883] Updated weights for policy 0, policy_version 11010 (0.0051) [2024-06-17 23:16:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.8, 300 sec: 39765.9). Total num frames: 180436992. Throughput: 0: 39701.3. Samples: 180573920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-17 23:16:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:16:30,797][12883] Updated weights for policy 0, policy_version 11020 (0.0033) [2024-06-17 23:16:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40140.7, 300 sec: 39710.4). Total num frames: 180633600. Throughput: 0: 39907.4. Samples: 180699720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-17 23:16:31,995][12645] Avg episode reward: [(0, '0.014')] [2024-06-17 23:16:33,723][12883] Updated weights for policy 0, policy_version 11030 (0.0038) [2024-06-17 23:16:36,994][12645] Fps is (10 sec: 34406.4, 60 sec: 39321.7, 300 sec: 39654.8). 
Total num frames: 180781056. Throughput: 0: 39769.8. Samples: 180933160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-17 23:16:36,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:16:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011035_180797440.pth... [2024-06-17 23:16:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010455_171294720.pth [2024-06-17 23:16:38,960][12883] Updated weights for policy 0, policy_version 11040 (0.0046) [2024-06-17 23:16:41,991][12883] Updated weights for policy 0, policy_version 11050 (0.0039) [2024-06-17 23:16:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.8, 300 sec: 39821.4). Total num frames: 181043200. Throughput: 0: 39896.4. Samples: 181168600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-17 23:16:41,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:16:47,000][12645] Fps is (10 sec: 40934.1, 60 sec: 39044.4, 300 sec: 39654.0). Total num frames: 181190656. Throughput: 0: 40157.1. Samples: 181298900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-17 23:16:47,001][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:16:47,158][12883] Updated weights for policy 0, policy_version 11060 (0.0037) [2024-06-17 23:16:49,923][12883] Updated weights for policy 0, policy_version 11070 (0.0029) [2024-06-17 23:16:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 39710.7). Total num frames: 181420032. Throughput: 0: 39855.6. Samples: 181529940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-17 23:16:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:16:55,325][12883] Updated weights for policy 0, policy_version 11080 (0.0036) [2024-06-17 23:16:56,994][12645] Fps is (10 sec: 42624.8, 60 sec: 40140.7, 300 sec: 39655.1). Total num frames: 181616640. Throughput: 0: 39851.9. Samples: 181768380. 
Policy #0 lag: (min: 1.0, avg: 6.9, max: 19.0) [2024-06-17 23:16:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:16:57,869][12883] Updated weights for policy 0, policy_version 11090 (0.0043) [2024-06-17 23:17:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39594.7, 300 sec: 39710.4). Total num frames: 181796864. Throughput: 0: 39844.4. Samples: 181890360. Policy #0 lag: (min: 1.0, avg: 6.9, max: 19.0) [2024-06-17 23:17:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:17:03,249][12883] Updated weights for policy 0, policy_version 11100 (0.0025) [2024-06-17 23:17:06,483][12883] Updated weights for policy 0, policy_version 11110 (0.0037) [2024-06-17 23:17:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 39821.5). Total num frames: 182026240. Throughput: 0: 39800.5. Samples: 182128840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 23:17:07,000][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:17:11,237][12883] Updated weights for policy 0, policy_version 11120 (0.0045) [2024-06-17 23:17:11,994][12645] Fps is (10 sec: 40957.9, 60 sec: 39867.4, 300 sec: 39710.3). Total num frames: 182206464. Throughput: 0: 39922.6. Samples: 182370460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-17 23:17:11,995][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:17:14,756][12883] Updated weights for policy 0, policy_version 11130 (0.0036) [2024-06-17 23:17:16,994][12645] Fps is (10 sec: 36044.5, 60 sec: 39321.5, 300 sec: 39654.8). Total num frames: 182386688. Throughput: 0: 39695.1. Samples: 182486000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-17 23:17:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:17:19,796][12883] Updated weights for policy 0, policy_version 11140 (0.0042) [2024-06-17 23:17:21,994][12645] Fps is (10 sec: 40962.5, 60 sec: 40414.0, 300 sec: 39765.9). Total num frames: 182616064. Throughput: 0: 39873.9. Samples: 182727480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 23:17:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:17:23,355][12883] Updated weights for policy 0, policy_version 11150 (0.0029) [2024-06-17 23:17:26,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39048.6, 300 sec: 39710.7). Total num frames: 182779904. Throughput: 0: 40000.6. Samples: 182968620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-17 23:17:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:17:27,989][12883] Updated weights for policy 0, policy_version 11160 (0.0055) [2024-06-17 23:17:28,392][12862] Signal inference workers to stop experience collection... (2600 times) [2024-06-17 23:17:28,435][12883] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-17 23:17:28,507][12862] Signal inference workers to resume experience collection... (2600 times) [2024-06-17 23:17:28,508][12883] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-17 23:17:31,464][12883] Updated weights for policy 0, policy_version 11170 (0.0036) [2024-06-17 23:17:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39867.9, 300 sec: 39821.5). Total num frames: 183025664. Throughput: 0: 39676.3. Samples: 183084080. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-17 23:17:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:17:35,920][12883] Updated weights for policy 0, policy_version 11180 (0.0037) [2024-06-17 23:17:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.9, 300 sec: 39654.8). Total num frames: 183205888. Throughput: 0: 39965.8. Samples: 183328400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 25.0) [2024-06-17 23:17:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:17:39,999][12883] Updated weights for policy 0, policy_version 11190 (0.0040) [2024-06-17 23:17:41,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39048.6, 300 sec: 39654.9). Total num frames: 183386112. Throughput: 0: 40049.5. 
Samples: 183570600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 23:17:41,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-17 23:17:44,007][12883] Updated weights for policy 0, policy_version 11200 (0.0047) [2024-06-17 23:17:46,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40691.1, 300 sec: 39821.7). Total num frames: 183631872. Throughput: 0: 40014.5. Samples: 183691020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 23:17:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:17:48,008][12883] Updated weights for policy 0, policy_version 11210 (0.0034) [2024-06-17 23:17:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 39867.7, 300 sec: 39765.9). Total num frames: 183812096. Throughput: 0: 40032.5. Samples: 183930300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:17:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:17:52,097][12883] Updated weights for policy 0, policy_version 11220 (0.0044) [2024-06-17 23:17:55,835][12883] Updated weights for policy 0, policy_version 11230 (0.0031) [2024-06-17 23:17:56,996][12645] Fps is (10 sec: 36037.2, 60 sec: 39593.3, 300 sec: 39710.4). Total num frames: 183992320. Throughput: 0: 40081.1. Samples: 184174180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:17:56,997][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:18:00,093][12883] Updated weights for policy 0, policy_version 11240 (0.0035) [2024-06-17 23:18:01,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40960.0, 300 sec: 39932.5). Total num frames: 184254464. Throughput: 0: 40155.2. Samples: 184292980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-17 23:18:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:18:04,573][12883] Updated weights for policy 0, policy_version 11250 (0.0051) [2024-06-17 23:18:06,994][12645] Fps is (10 sec: 42607.4, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 184418304. Throughput: 0: 40387.3. Samples: 184544920. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-17 23:18:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:18:08,112][12883] Updated weights for policy 0, policy_version 11260 (0.0045) [2024-06-17 23:18:11,994][12645] Fps is (10 sec: 36044.8, 60 sec: 40141.1, 300 sec: 39821.5). Total num frames: 184614912. Throughput: 0: 40191.5. Samples: 184777240. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-17 23:18:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:18:13,149][12883] Updated weights for policy 0, policy_version 11270 (0.0043) [2024-06-17 23:18:16,197][12883] Updated weights for policy 0, policy_version 11280 (0.0035) [2024-06-17 23:18:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40687.0, 300 sec: 39821.5). Total num frames: 184827904. Throughput: 0: 40386.0. Samples: 184901460. Policy #0 lag: (min: 2.0, avg: 10.7, max: 24.0) [2024-06-17 23:18:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:18:21,261][12883] Updated weights for policy 0, policy_version 11290 (0.0041) [2024-06-17 23:18:21,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 184991744. Throughput: 0: 40360.3. Samples: 185144620. Policy #0 lag: (min: 2.0, avg: 10.7, max: 24.0) [2024-06-17 23:18:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:18:24,493][12883] Updated weights for policy 0, policy_version 11300 (0.0045) [2024-06-17 23:18:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 39877.0). Total num frames: 185237504. Throughput: 0: 40215.5. Samples: 185380300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 23:18:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:18:29,545][12883] Updated weights for policy 0, policy_version 11310 (0.0035) [2024-06-17 23:18:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 40140.7, 300 sec: 39988.1). Total num frames: 185434112. Throughput: 0: 40424.6. Samples: 185510120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 23:18:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:18:32,804][12883] Updated weights for policy 0, policy_version 11320 (0.0042) [2024-06-17 23:18:36,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39867.6, 300 sec: 39710.4). Total num frames: 185597952. Throughput: 0: 40289.7. Samples: 185743340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:18:36,999][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:18:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011328_185597952.pth... [2024-06-17 23:18:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000010746_176062464.pth [2024-06-17 23:18:37,684][12883] Updated weights for policy 0, policy_version 11330 (0.0032) [2024-06-17 23:18:40,783][12883] Updated weights for policy 0, policy_version 11340 (0.0032) [2024-06-17 23:18:41,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40959.9, 300 sec: 39877.0). Total num frames: 185843712. Throughput: 0: 40080.6. Samples: 185977720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-17 23:18:42,000][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:18:45,689][12883] Updated weights for policy 0, policy_version 11350 (0.0044) [2024-06-17 23:18:46,994][12645] Fps is (10 sec: 39322.5, 60 sec: 39321.8, 300 sec: 39821.5). Total num frames: 185991168. Throughput: 0: 40220.6. Samples: 186102900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-17 23:18:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:18:49,018][12883] Updated weights for policy 0, policy_version 11360 (0.0029) [2024-06-17 23:18:52,000][12645] Fps is (10 sec: 39297.4, 60 sec: 40409.6, 300 sec: 39931.7). Total num frames: 186236928. Throughput: 0: 39786.2. Samples: 186335540. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-17 23:18:52,001][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:18:54,382][12883] Updated weights for policy 0, policy_version 11370 (0.0033) [2024-06-17 23:18:56,394][12862] Signal inference workers to stop experience collection... (2650 times) [2024-06-17 23:18:56,395][12862] Signal inference workers to resume experience collection... (2650 times) [2024-06-17 23:18:56,412][12883] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-17 23:18:56,412][12883] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-17 23:18:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40415.5, 300 sec: 39988.1). Total num frames: 186417152. Throughput: 0: 39861.5. Samples: 186571000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-17 23:18:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:18:57,157][12883] Updated weights for policy 0, policy_version 11380 (0.0035) [2024-06-17 23:19:01,994][12645] Fps is (10 sec: 36067.3, 60 sec: 39048.6, 300 sec: 39710.4). Total num frames: 186597376. Throughput: 0: 39741.4. Samples: 186689820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-17 23:19:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:19:02,765][12883] Updated weights for policy 0, policy_version 11390 (0.0029) [2024-06-17 23:19:05,210][12883] Updated weights for policy 0, policy_version 11400 (0.0033) [2024-06-17 23:19:06,994][12645] Fps is (10 sec: 40959.0, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 186826752. Throughput: 0: 39519.1. Samples: 186922980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-17 23:19:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:19:10,647][12883] Updated weights for policy 0, policy_version 11410 (0.0026) [2024-06-17 23:19:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 186990592. Throughput: 0: 39739.7. 
Samples: 187168580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-17 23:19:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:19:13,613][12883] Updated weights for policy 0, policy_version 11420 (0.0048) [2024-06-17 23:19:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39594.6, 300 sec: 39765.9). Total num frames: 187203584. Throughput: 0: 39359.4. Samples: 187281300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-17 23:19:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:19:18,469][12883] Updated weights for policy 0, policy_version 11430 (0.0036) [2024-06-17 23:19:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.9, 300 sec: 39932.5). Total num frames: 187400192. Throughput: 0: 39668.5. Samples: 187528420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-17 23:19:21,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:19:22,223][12883] Updated weights for policy 0, policy_version 11440 (0.0041) [2024-06-17 23:19:26,318][12883] Updated weights for policy 0, policy_version 11450 (0.0040) [2024-06-17 23:19:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39321.7, 300 sec: 39710.4). Total num frames: 187596800. Throughput: 0: 39849.0. Samples: 187770920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-17 23:19:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:19:30,307][12883] Updated weights for policy 0, policy_version 11460 (0.0036) [2024-06-17 23:19:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.7, 300 sec: 39932.5). Total num frames: 187826176. Throughput: 0: 39809.2. Samples: 187894320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-17 23:19:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:19:34,809][12883] Updated weights for policy 0, policy_version 11470 (0.0042) [2024-06-17 23:19:36,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39594.8, 300 sec: 39821.5). Total num frames: 187973632. Throughput: 0: 39635.4. Samples: 188118880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-17 23:19:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:19:38,908][12883] Updated weights for policy 0, policy_version 11480 (0.0041) [2024-06-17 23:19:41,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39048.5, 300 sec: 39710.4). Total num frames: 188186624. Throughput: 0: 39799.7. Samples: 188362000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-17 23:19:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:19:42,817][12883] Updated weights for policy 0, policy_version 11490 (0.0038) [2024-06-17 23:19:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40140.7, 300 sec: 39932.8). Total num frames: 188399616. Throughput: 0: 39867.5. Samples: 188483860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:19:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:19:47,156][12883] Updated weights for policy 0, policy_version 11500 (0.0032) [2024-06-17 23:19:51,065][12883] Updated weights for policy 0, policy_version 11510 (0.0044) [2024-06-17 23:19:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 39598.8, 300 sec: 39821.5). Total num frames: 188612608. Throughput: 0: 39955.2. Samples: 188720960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:19:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:19:55,219][12883] Updated weights for policy 0, policy_version 11520 (0.0036) [2024-06-17 23:19:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.6, 300 sec: 39821.5). Total num frames: 188792832. Throughput: 0: 39744.4. Samples: 188957080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-17 23:19:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:19:59,229][12883] Updated weights for policy 0, policy_version 11530 (0.0049) [2024-06-17 23:20:01,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39867.7, 300 sec: 39821.7). Total num frames: 188989440. Throughput: 0: 39782.3. Samples: 189071500. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-17 23:20:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:20:03,939][12883] Updated weights for policy 0, policy_version 11540 (0.0026) [2024-06-17 23:20:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.7, 300 sec: 39821.4). Total num frames: 189202432. Throughput: 0: 39569.3. Samples: 189309040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-17 23:20:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:20:07,361][12883] Updated weights for policy 0, policy_version 11550 (0.0048) [2024-06-17 23:20:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 189382656. Throughput: 0: 39676.1. Samples: 189556340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:20:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:20:12,020][12883] Updated weights for policy 0, policy_version 11560 (0.0037) [2024-06-17 23:20:16,157][12883] Updated weights for policy 0, policy_version 11570 (0.0049) [2024-06-17 23:20:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39867.8, 300 sec: 39765.9). Total num frames: 189595648. Throughput: 0: 39529.8. Samples: 189673160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:20:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:20:20,220][12883] Updated weights for policy 0, policy_version 11580 (0.0048) [2024-06-17 23:20:21,996][12645] Fps is (10 sec: 42588.2, 60 sec: 40139.3, 300 sec: 39987.8). Total num frames: 189808640. Throughput: 0: 39948.1. Samples: 189916640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:20:21,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:20:24,104][12883] Updated weights for policy 0, policy_version 11590 (0.0037) [2024-06-17 23:20:24,878][12862] Signal inference workers to stop experience collection... (2700 times) [2024-06-17 23:20:24,878][12862] Signal inference workers to resume experience collection... 
(2700 times) [2024-06-17 23:20:24,894][12883] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-17 23:20:24,926][12883] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-17 23:20:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39867.7, 300 sec: 39877.0). Total num frames: 189988864. Throughput: 0: 39821.9. Samples: 190153980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:20:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:20:28,859][12883] Updated weights for policy 0, policy_version 11600 (0.0033) [2024-06-17 23:20:31,994][12645] Fps is (10 sec: 39330.5, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 190201856. Throughput: 0: 39797.8. Samples: 190274760. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-17 23:20:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:20:32,135][12883] Updated weights for policy 0, policy_version 11610 (0.0031) [2024-06-17 23:20:36,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39867.7, 300 sec: 39821.5). Total num frames: 190365696. Throughput: 0: 39875.2. Samples: 190515340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 19.0) [2024-06-17 23:20:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:20:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011620_190382080.pth... [2024-06-17 23:20:37,072][12883] Updated weights for policy 0, policy_version 11620 (0.0035) [2024-06-17 23:20:37,124][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011035_180797440.pth [2024-06-17 23:20:40,637][12883] Updated weights for policy 0, policy_version 11630 (0.0039) [2024-06-17 23:20:41,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40140.8, 300 sec: 39821.4). Total num frames: 190595072. Throughput: 0: 39848.8. Samples: 190750280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-17 23:20:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:20:45,291][12883] Updated weights for policy 0, policy_version 11640 (0.0027) [2024-06-17 23:20:46,994][12645] Fps is (10 sec: 44235.8, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 190808064. Throughput: 0: 40084.4. Samples: 190875300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-17 23:20:46,998][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:20:48,661][12883] Updated weights for policy 0, policy_version 11650 (0.0037) [2024-06-17 23:20:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 39867.7, 300 sec: 39988.1). Total num frames: 191004672. Throughput: 0: 40145.8. Samples: 191115600. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-17 23:20:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:20:53,412][12883] Updated weights for policy 0, policy_version 11660 (0.0046) [2024-06-17 23:20:56,693][12883] Updated weights for policy 0, policy_version 11670 (0.0031) [2024-06-17 23:20:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.7, 300 sec: 39932.5). Total num frames: 191201280. Throughput: 0: 39918.9. Samples: 191352700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 23:20:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:21:01,651][12883] Updated weights for policy 0, policy_version 11680 (0.0031) [2024-06-17 23:21:01,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39867.8, 300 sec: 39877.0). Total num frames: 191381504. Throughput: 0: 40057.3. Samples: 191475740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 23:21:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:21:05,120][12883] Updated weights for policy 0, policy_version 11690 (0.0043) [2024-06-17 23:21:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39867.8, 300 sec: 39932.5). Total num frames: 191594496. Throughput: 0: 40025.6. Samples: 191717700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 23:21:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:21:09,602][12883] Updated weights for policy 0, policy_version 11700 (0.0033) [2024-06-17 23:21:11,996][12645] Fps is (10 sec: 44227.0, 60 sec: 40685.4, 300 sec: 39987.8). Total num frames: 191823872. Throughput: 0: 40184.3. Samples: 191962360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-17 23:21:11,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:21:12,985][12883] Updated weights for policy 0, policy_version 11710 (0.0035) [2024-06-17 23:21:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 39867.6, 300 sec: 39988.1). Total num frames: 191987712. Throughput: 0: 40299.4. Samples: 192088240. Policy #0 lag: (min: 2.0, avg: 9.2, max: 20.0) [2024-06-17 23:21:16,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:21:17,770][12883] Updated weights for policy 0, policy_version 11720 (0.0043) [2024-06-17 23:21:20,819][12883] Updated weights for policy 0, policy_version 11730 (0.0033) [2024-06-17 23:21:21,994][12645] Fps is (10 sec: 37691.3, 60 sec: 39869.2, 300 sec: 39877.0). Total num frames: 192200704. Throughput: 0: 40105.6. Samples: 192320100. Policy #0 lag: (min: 2.0, avg: 9.2, max: 20.0) [2024-06-17 23:21:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:21:25,704][12883] Updated weights for policy 0, policy_version 11740 (0.0036) [2024-06-17 23:21:26,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40140.9, 300 sec: 39877.0). Total num frames: 192397312. Throughput: 0: 40332.6. Samples: 192565240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:21:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:21:28,891][12883] Updated weights for policy 0, policy_version 11750 (0.0027) [2024-06-17 23:21:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39867.8, 300 sec: 40043.6). Total num frames: 192593920. Throughput: 0: 40185.4. Samples: 192683640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:21:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:21:33,768][12883] Updated weights for policy 0, policy_version 11760 (0.0036) [2024-06-17 23:21:36,804][12883] Updated weights for policy 0, policy_version 11770 (0.0043) [2024-06-17 23:21:37,000][12645] Fps is (10 sec: 44209.0, 60 sec: 41228.7, 300 sec: 39987.2). Total num frames: 192839680. Throughput: 0: 40275.7. Samples: 192928260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:21:37,001][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:21:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39867.8, 300 sec: 39988.9). Total num frames: 192987136. Throughput: 0: 40332.1. Samples: 193167640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:21:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:21:42,163][12883] Updated weights for policy 0, policy_version 11780 (0.0036) [2024-06-17 23:21:45,025][12883] Updated weights for policy 0, policy_version 11790 (0.0034) [2024-06-17 23:21:46,994][12645] Fps is (10 sec: 37706.9, 60 sec: 40140.9, 300 sec: 39988.1). Total num frames: 193216512. Throughput: 0: 40103.1. Samples: 193280380. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-17 23:21:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:21:50,683][12883] Updated weights for policy 0, policy_version 11800 (0.0034) [2024-06-17 23:21:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39867.7, 300 sec: 39932.5). Total num frames: 193396736. Throughput: 0: 40269.7. Samples: 193529840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-17 23:21:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:21:52,485][12862] Signal inference workers to stop experience collection... (2750 times) [2024-06-17 23:21:52,529][12862] Signal inference workers to resume experience collection... 
(2750 times) [2024-06-17 23:21:52,530][12883] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-17 23:21:52,542][12883] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-17 23:21:52,995][12883] Updated weights for policy 0, policy_version 11810 (0.0032) [2024-06-17 23:21:56,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 193576960. Throughput: 0: 40024.2. Samples: 193763360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-17 23:21:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:21:58,955][12883] Updated weights for policy 0, policy_version 11820 (0.0042) [2024-06-17 23:22:01,477][12883] Updated weights for policy 0, policy_version 11830 (0.0032) [2024-06-17 23:22:01,994][12645] Fps is (10 sec: 44237.5, 60 sec: 40960.0, 300 sec: 40043.6). Total num frames: 193839104. Throughput: 0: 39862.8. Samples: 193882060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-17 23:22:01,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:22:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.7, 300 sec: 39877.1). Total num frames: 193970176. Throughput: 0: 40225.4. Samples: 194130240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-17 23:22:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:22:07,098][12883] Updated weights for policy 0, policy_version 11840 (0.0035) [2024-06-17 23:22:09,510][12883] Updated weights for policy 0, policy_version 11850 (0.0041) [2024-06-17 23:22:11,994][12645] Fps is (10 sec: 36044.9, 60 sec: 39596.2, 300 sec: 40043.6). Total num frames: 194199552. Throughput: 0: 40021.8. Samples: 194366220. Policy #0 lag: (min: 1.0, avg: 12.6, max: 20.0) [2024-06-17 23:22:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:22:15,118][12883] Updated weights for policy 0, policy_version 11860 (0.0028) [2024-06-17 23:22:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 40687.0, 300 sec: 40043.6). 
Total num frames: 194428928. Throughput: 0: 40237.8. Samples: 194494340. Policy #0 lag: (min: 1.0, avg: 12.6, max: 20.0) [2024-06-17 23:22:16,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:22:17,707][12883] Updated weights for policy 0, policy_version 11870 (0.0053) [2024-06-17 23:22:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39594.7, 300 sec: 39988.1). Total num frames: 194576384. Throughput: 0: 40108.7. Samples: 194732900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-17 23:22:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:22:23,187][12883] Updated weights for policy 0, policy_version 11880 (0.0029) [2024-06-17 23:22:25,600][12883] Updated weights for policy 0, policy_version 11890 (0.0032) [2024-06-17 23:22:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 39988.0). Total num frames: 194822144. Throughput: 0: 39972.4. Samples: 194966400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-17 23:22:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:22:31,198][12883] Updated weights for policy 0, policy_version 11900 (0.0036) [2024-06-17 23:22:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 40413.9, 300 sec: 40043.6). Total num frames: 195018752. Throughput: 0: 40426.7. Samples: 195099580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:22:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:22:33,768][12883] Updated weights for policy 0, policy_version 11910 (0.0040) [2024-06-17 23:22:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39598.8, 300 sec: 40099.1). Total num frames: 195215360. Throughput: 0: 40163.2. Samples: 195337180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:22:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:22:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011915_195215360.pth... 
[2024-06-17 23:22:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011328_185597952.pth [2024-06-17 23:22:39,175][12883] Updated weights for policy 0, policy_version 11920 (0.0045) [2024-06-17 23:22:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40686.9, 300 sec: 39988.1). Total num frames: 195428352. Throughput: 0: 40198.2. Samples: 195572280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-17 23:22:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:22:42,264][12883] Updated weights for policy 0, policy_version 11930 (0.0033) [2024-06-17 23:22:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 195592192. Throughput: 0: 40364.9. Samples: 195698480. Policy #0 lag: (min: 1.0, avg: 8.2, max: 22.0) [2024-06-17 23:22:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:22:47,190][12883] Updated weights for policy 0, policy_version 11940 (0.0035) [2024-06-17 23:22:51,107][12883] Updated weights for policy 0, policy_version 11950 (0.0036) [2024-06-17 23:22:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40687.0, 300 sec: 40155.0). Total num frames: 195837952. Throughput: 0: 40075.9. Samples: 195933660. Policy #0 lag: (min: 1.0, avg: 8.2, max: 22.0) [2024-06-17 23:22:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:22:55,279][12883] Updated weights for policy 0, policy_version 11960 (0.0048) [2024-06-17 23:22:56,971][12862] Signal inference workers to stop experience collection... (2800 times) [2024-06-17 23:22:56,972][12862] Signal inference workers to resume experience collection... (2800 times) [2024-06-17 23:22:56,992][12883] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-17 23:22:56,992][12883] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-17 23:22:56,994][12645] Fps is (10 sec: 42597.2, 60 sec: 40686.8, 300 sec: 39877.0). Total num frames: 196018176. Throughput: 0: 40261.6. 
Samples: 196178000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-17 23:22:56,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:22:59,074][12883] Updated weights for policy 0, policy_version 11970 (0.0048) [2024-06-17 23:23:01,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39321.5, 300 sec: 39932.5). Total num frames: 196198400. Throughput: 0: 40098.6. Samples: 196298780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-17 23:23:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:23:03,304][12883] Updated weights for policy 0, policy_version 11980 (0.0035) [2024-06-17 23:23:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40959.9, 300 sec: 40043.6). Total num frames: 196427776. Throughput: 0: 40167.5. Samples: 196540440. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-17 23:23:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:23:07,051][12883] Updated weights for policy 0, policy_version 11990 (0.0041) [2024-06-17 23:23:11,711][12883] Updated weights for policy 0, policy_version 12000 (0.0041) [2024-06-17 23:23:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40413.9, 300 sec: 39988.1). Total num frames: 196624384. Throughput: 0: 40417.0. Samples: 196785160. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-17 23:23:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:23:15,133][12883] Updated weights for policy 0, policy_version 12010 (0.0047) [2024-06-17 23:23:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 196837376. Throughput: 0: 40089.3. Samples: 196903600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:23:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:23:19,551][12883] Updated weights for policy 0, policy_version 12020 (0.0043) [2024-06-17 23:23:21,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41233.0, 300 sec: 40043.6). Total num frames: 197050368. Throughput: 0: 40235.0. Samples: 197147760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:23:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:23:23,515][12883] Updated weights for policy 0, policy_version 12030 (0.0041) [2024-06-17 23:23:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 39594.8, 300 sec: 39877.0). Total num frames: 197197824. Throughput: 0: 40445.5. Samples: 197392320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:23:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:23:27,752][12883] Updated weights for policy 0, policy_version 12040 (0.0033) [2024-06-17 23:23:31,790][12883] Updated weights for policy 0, policy_version 12050 (0.0030) [2024-06-17 23:23:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 197443584. Throughput: 0: 40200.3. Samples: 197507500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:23:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:23:35,865][12883] Updated weights for policy 0, policy_version 12060 (0.0045) [2024-06-17 23:23:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 40413.9, 300 sec: 39988.1). Total num frames: 197640192. Throughput: 0: 40478.3. Samples: 197755180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-17 23:23:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:23:39,862][12883] Updated weights for policy 0, policy_version 12070 (0.0037) [2024-06-17 23:23:41,994][12645] Fps is (10 sec: 36045.4, 60 sec: 39594.8, 300 sec: 40043.6). Total num frames: 197804032. Throughput: 0: 40295.4. Samples: 197991280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-17 23:23:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:23:43,769][12883] Updated weights for policy 0, policy_version 12080 (0.0034) [2024-06-17 23:23:46,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40413.8, 300 sec: 39933.4). Total num frames: 198017024. Throughput: 0: 40191.6. Samples: 198107400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-17 23:23:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:23:48,012][12883] Updated weights for policy 0, policy_version 12090 (0.0045) [2024-06-17 23:23:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.8, 300 sec: 40043.6). Total num frames: 198230016. Throughput: 0: 40154.8. Samples: 198347400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-17 23:23:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:23:52,076][12883] Updated weights for policy 0, policy_version 12100 (0.0049) [2024-06-17 23:23:55,976][12883] Updated weights for policy 0, policy_version 12110 (0.0038) [2024-06-17 23:23:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.9, 300 sec: 40099.1). Total num frames: 198426624. Throughput: 0: 39940.8. Samples: 198582500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-17 23:23:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:24:00,280][12883] Updated weights for policy 0, policy_version 12120 (0.0044) [2024-06-17 23:24:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40413.8, 300 sec: 39988.1). Total num frames: 198623232. Throughput: 0: 40005.7. Samples: 198703860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:24:01,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:24:03,942][12883] Updated weights for policy 0, policy_version 12130 (0.0036) [2024-06-17 23:24:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39594.8, 300 sec: 40043.6). Total num frames: 198803456. Throughput: 0: 40030.8. Samples: 198949140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:24:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:24:08,514][12883] Updated weights for policy 0, policy_version 12140 (0.0048) [2024-06-17 23:24:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.6, 300 sec: 40043.6). Total num frames: 199016448. Throughput: 0: 39628.8. Samples: 199175620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:24:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:24:12,759][12883] Updated weights for policy 0, policy_version 12150 (0.0038) [2024-06-17 23:24:16,765][12883] Updated weights for policy 0, policy_version 12160 (0.0031) [2024-06-17 23:24:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 199229440. Throughput: 0: 39861.3. Samples: 199301260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:24:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:24:20,814][12883] Updated weights for policy 0, policy_version 12170 (0.0034) [2024-06-17 23:24:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39321.7, 300 sec: 40043.6). Total num frames: 199409664. Throughput: 0: 39707.1. Samples: 199542000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-17 23:24:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:24:25,199][12883] Updated weights for policy 0, policy_version 12180 (0.0038) [2024-06-17 23:24:26,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40140.8, 300 sec: 39932.5). Total num frames: 199606272. Throughput: 0: 39852.5. Samples: 199784640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-17 23:24:26,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:24:28,031][12862] Signal inference workers to stop experience collection... (2850 times) [2024-06-17 23:24:28,032][12862] Signal inference workers to resume experience collection... (2850 times) [2024-06-17 23:24:28,069][12883] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-17 23:24:28,069][12883] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-17 23:24:28,924][12883] Updated weights for policy 0, policy_version 12190 (0.0042) [2024-06-17 23:24:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39594.7, 300 sec: 40154.7). Total num frames: 199819264. Throughput: 0: 39828.4. 
Samples: 199899680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-17 23:24:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:24:33,440][12883] Updated weights for policy 0, policy_version 12200 (0.0052) [2024-06-17 23:24:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 39867.6, 300 sec: 40154.7). Total num frames: 200032256. Throughput: 0: 39939.4. Samples: 200144680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-17 23:24:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:24:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012209_200032256.pth... [2024-06-17 23:24:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011620_190382080.pth [2024-06-17 23:24:37,565][12883] Updated weights for policy 0, policy_version 12210 (0.0042) [2024-06-17 23:24:41,340][12883] Updated weights for policy 0, policy_version 12220 (0.0039) [2024-06-17 23:24:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.7, 300 sec: 40043.6). Total num frames: 200212480. Throughput: 0: 39877.2. Samples: 200376980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:24:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:24:45,733][12883] Updated weights for policy 0, policy_version 12230 (0.0044) [2024-06-17 23:24:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.7, 300 sec: 40043.6). Total num frames: 200425472. Throughput: 0: 40006.2. Samples: 200504140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:24:46,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:24:49,714][12883] Updated weights for policy 0, policy_version 12240 (0.0038) [2024-06-17 23:24:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39594.6, 300 sec: 40043.6). Total num frames: 200605696. Throughput: 0: 39893.7. Samples: 200744360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 23:24:51,996][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:24:53,971][12883] Updated weights for policy 0, policy_version 12250 (0.0036) [2024-06-17 23:24:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 200851456. Throughput: 0: 40002.7. Samples: 200975740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 23:24:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:24:57,608][12883] Updated weights for policy 0, policy_version 12260 (0.0031) [2024-06-17 23:25:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39867.8, 300 sec: 40043.6). Total num frames: 201015296. Throughput: 0: 40186.7. Samples: 201109660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-17 23:25:01,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:25:02,086][12883] Updated weights for policy 0, policy_version 12270 (0.0040) [2024-06-17 23:25:05,488][12883] Updated weights for policy 0, policy_version 12280 (0.0028) [2024-06-17 23:25:06,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 201228288. Throughput: 0: 40020.4. Samples: 201342920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:25:06,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:25:10,153][12883] Updated weights for policy 0, policy_version 12290 (0.0036) [2024-06-17 23:25:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 201441280. Throughput: 0: 40106.9. Samples: 201589460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:25:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:25:13,922][12883] Updated weights for policy 0, policy_version 12300 (0.0037) [2024-06-17 23:25:16,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.8, 300 sec: 40043.9). Total num frames: 201621504. Throughput: 0: 40207.2. Samples: 201709000. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-17 23:25:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:25:18,139][12883] Updated weights for policy 0, policy_version 12310 (0.0044) [2024-06-17 23:25:21,951][12883] Updated weights for policy 0, policy_version 12320 (0.0031) [2024-06-17 23:25:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 40210.2). Total num frames: 201850880. Throughput: 0: 40084.5. Samples: 201948480. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-17 23:25:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:25:26,107][12883] Updated weights for policy 0, policy_version 12330 (0.0042) [2024-06-17 23:25:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40140.7, 300 sec: 40043.6). Total num frames: 202014720. Throughput: 0: 40290.7. Samples: 202190060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-17 23:25:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:25:30,222][12883] Updated weights for policy 0, policy_version 12340 (0.0034) [2024-06-17 23:25:32,000][12645] Fps is (10 sec: 37659.6, 60 sec: 40136.6, 300 sec: 40209.4). Total num frames: 202227712. Throughput: 0: 40273.6. Samples: 202316700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-17 23:25:32,001][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:25:34,443][12883] Updated weights for policy 0, policy_version 12350 (0.0039) [2024-06-17 23:25:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 202440704. Throughput: 0: 40278.2. Samples: 202556880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-17 23:25:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:25:38,331][12883] Updated weights for policy 0, policy_version 12360 (0.0035) [2024-06-17 23:25:41,994][12645] Fps is (10 sec: 40986.0, 60 sec: 40414.0, 300 sec: 40099.2). Total num frames: 202637312. Throughput: 0: 40480.5. Samples: 202797360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-17 23:25:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:25:42,766][12883] Updated weights for policy 0, policy_version 12370 (0.0034) [2024-06-17 23:25:46,403][12883] Updated weights for policy 0, policy_version 12380 (0.0032) [2024-06-17 23:25:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 202850304. Throughput: 0: 40099.1. Samples: 202914120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-17 23:25:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:25:50,772][12883] Updated weights for policy 0, policy_version 12390 (0.0028) [2024-06-17 23:25:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.9, 300 sec: 40099.2). Total num frames: 203030528. Throughput: 0: 40370.2. Samples: 203159580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-17 23:25:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:25:54,144][12862] Signal inference workers to stop experience collection... (2900 times) [2024-06-17 23:25:54,144][12862] Signal inference workers to resume experience collection... (2900 times) [2024-06-17 23:25:54,163][12883] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-17 23:25:54,163][12883] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-17 23:25:54,434][12883] Updated weights for policy 0, policy_version 12400 (0.0036) [2024-06-17 23:25:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 203243520. Throughput: 0: 40155.1. Samples: 203396440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-17 23:25:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:25:58,790][12883] Updated weights for policy 0, policy_version 12410 (0.0048) [2024-06-17 23:26:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 203440128. Throughput: 0: 40392.0. 
Samples: 203526640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-17 23:26:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:26:02,490][12883] Updated weights for policy 0, policy_version 12420 (0.0045) [2024-06-17 23:26:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.9, 300 sec: 40043.9). Total num frames: 203636736. Throughput: 0: 40448.1. Samples: 203768640. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-17 23:26:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:26:07,189][12883] Updated weights for policy 0, policy_version 12430 (0.0034) [2024-06-17 23:26:10,796][12883] Updated weights for policy 0, policy_version 12440 (0.0052) [2024-06-17 23:26:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40414.0, 300 sec: 40265.8). Total num frames: 203866112. Throughput: 0: 40361.9. Samples: 204006340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-17 23:26:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:26:15,175][12883] Updated weights for policy 0, policy_version 12450 (0.0035) [2024-06-17 23:26:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40099.2). Total num frames: 204029952. Throughput: 0: 40319.9. Samples: 204130840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:26:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:26:18,928][12883] Updated weights for policy 0, policy_version 12460 (0.0030) [2024-06-17 23:26:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 204242944. Throughput: 0: 40155.6. Samples: 204363880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:26:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:26:23,219][12883] Updated weights for policy 0, policy_version 12470 (0.0033) [2024-06-17 23:26:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 204455936. Throughput: 0: 40142.6. Samples: 204603780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:26:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:26:27,028][12883] Updated weights for policy 0, policy_version 12480 (0.0033) [2024-06-17 23:26:31,463][12883] Updated weights for policy 0, policy_version 12490 (0.0041) [2024-06-17 23:26:31,997][12645] Fps is (10 sec: 39308.3, 60 sec: 40142.8, 300 sec: 39988.5). Total num frames: 204636160. Throughput: 0: 40277.0. Samples: 204726720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:26:31,998][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:26:35,051][12883] Updated weights for policy 0, policy_version 12500 (0.0035) [2024-06-17 23:26:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 204849152. Throughput: 0: 40066.6. Samples: 204962580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-17 23:26:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:26:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012503_204849152.pth... [2024-06-17 23:26:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000011915_195215360.pth [2024-06-17 23:26:39,237][12883] Updated weights for policy 0, policy_version 12510 (0.0041) [2024-06-17 23:26:41,994][12645] Fps is (10 sec: 40973.5, 60 sec: 40140.7, 300 sec: 40099.1). Total num frames: 205045760. Throughput: 0: 40317.3. Samples: 205210720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-17 23:26:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:26:43,342][12883] Updated weights for policy 0, policy_version 12520 (0.0042) [2024-06-17 23:26:46,999][12645] Fps is (10 sec: 40936.7, 60 sec: 40137.0, 300 sec: 40209.5). Total num frames: 205258752. Throughput: 0: 40119.3. Samples: 205332240. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-17 23:26:47,000][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:26:47,422][12883] Updated weights for policy 0, policy_version 12530 (0.0034) [2024-06-17 23:26:51,502][12883] Updated weights for policy 0, policy_version 12540 (0.0045) [2024-06-17 23:26:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40687.0, 300 sec: 40321.3). Total num frames: 205471744. Throughput: 0: 40137.7. Samples: 205574840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-17 23:26:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:26:55,997][12883] Updated weights for policy 0, policy_version 12550 (0.0038) [2024-06-17 23:26:56,994][12645] Fps is (10 sec: 40983.3, 60 sec: 40413.8, 300 sec: 40099.1). Total num frames: 205668352. Throughput: 0: 40174.1. Samples: 205814180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-17 23:26:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:26:59,818][12883] Updated weights for policy 0, policy_version 12560 (0.0030) [2024-06-17 23:27:01,994][12645] Fps is (10 sec: 39319.7, 60 sec: 40413.5, 300 sec: 40321.2). Total num frames: 205864960. Throughput: 0: 39995.1. Samples: 205930640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-17 23:27:01,995][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:27:03,984][12883] Updated weights for policy 0, policy_version 12570 (0.0035) [2024-06-17 23:27:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40140.7, 300 sec: 40154.7). Total num frames: 206045184. Throughput: 0: 40327.0. Samples: 206178600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 25.0) [2024-06-17 23:27:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:27:07,523][12862] Signal inference workers to stop experience collection... (2950 times) [2024-06-17 23:27:07,523][12862] Signal inference workers to resume experience collection... 
(2950 times) [2024-06-17 23:27:07,544][12883] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-17 23:27:07,544][12883] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-17 23:27:07,843][12883] Updated weights for policy 0, policy_version 12580 (0.0030) [2024-06-17 23:27:11,994][12645] Fps is (10 sec: 39323.6, 60 sec: 39867.7, 300 sec: 40099.2). Total num frames: 206258176. Throughput: 0: 40318.2. Samples: 206418100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:27:11,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:27:12,344][12883] Updated weights for policy 0, policy_version 12590 (0.0042) [2024-06-17 23:27:16,323][12883] Updated weights for policy 0, policy_version 12600 (0.0041) [2024-06-17 23:27:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 206471168. Throughput: 0: 40287.5. Samples: 206539520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:27:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:27:20,267][12883] Updated weights for policy 0, policy_version 12610 (0.0044) [2024-06-17 23:27:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.9, 300 sec: 40099.2). Total num frames: 206651392. Throughput: 0: 40431.7. Samples: 206782000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-17 23:27:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:27:24,188][12883] Updated weights for policy 0, policy_version 12620 (0.0035) [2024-06-17 23:27:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 40099.1). Total num frames: 206848000. Throughput: 0: 40399.3. Samples: 207028680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-17 23:27:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:27:28,246][12883] Updated weights for policy 0, policy_version 12630 (0.0029) [2024-06-17 23:27:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40689.3, 300 sec: 40210.2). 
Total num frames: 207077376. Throughput: 0: 40308.8. Samples: 207145900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-17 23:27:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:27:32,287][12883] Updated weights for policy 0, policy_version 12640 (0.0033) [2024-06-17 23:27:36,401][12883] Updated weights for policy 0, policy_version 12650 (0.0030) [2024-06-17 23:27:36,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 207273984. Throughput: 0: 40371.0. Samples: 207391540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-17 23:27:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:27:40,420][12883] Updated weights for policy 0, policy_version 12660 (0.0028) [2024-06-17 23:27:41,996][12645] Fps is (10 sec: 39312.5, 60 sec: 40412.4, 300 sec: 40265.4). Total num frames: 207470592. Throughput: 0: 40307.9. Samples: 207628120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 23:27:41,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:27:44,473][12883] Updated weights for policy 0, policy_version 12670 (0.0043) [2024-06-17 23:27:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40144.6, 300 sec: 40099.1). Total num frames: 207667200. Throughput: 0: 40252.8. Samples: 207742000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 23:27:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:27:48,899][12883] Updated weights for policy 0, policy_version 12680 (0.0033) [2024-06-17 23:27:51,994][12645] Fps is (10 sec: 37691.7, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 207847424. Throughput: 0: 40070.3. Samples: 207981760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-17 23:27:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:27:52,834][12883] Updated weights for policy 0, policy_version 12690 (0.0045) [2024-06-17 23:27:56,994][12883] Updated weights for policy 0, policy_version 12700 (0.0028) [2024-06-17 23:27:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 40265.8). Total num frames: 208076800. Throughput: 0: 40203.9. Samples: 208227280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-17 23:27:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:28:00,778][12883] Updated weights for policy 0, policy_version 12710 (0.0031) [2024-06-17 23:28:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40141.2, 300 sec: 40154.7). Total num frames: 208273408. Throughput: 0: 40263.6. Samples: 208351380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-17 23:28:01,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:28:05,274][12883] Updated weights for policy 0, policy_version 12720 (0.0039) [2024-06-17 23:28:07,000][12645] Fps is (10 sec: 39297.5, 60 sec: 40409.7, 300 sec: 40153.8). Total num frames: 208470016. Throughput: 0: 40196.6. Samples: 208591100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-17 23:28:07,000][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:28:08,932][12883] Updated weights for policy 0, policy_version 12730 (0.0037) [2024-06-17 23:28:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.8, 300 sec: 40099.1). Total num frames: 208666624. Throughput: 0: 40053.3. Samples: 208831080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-17 23:28:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:28:13,143][12883] Updated weights for policy 0, policy_version 12740 (0.0037) [2024-06-17 23:28:16,996][12645] Fps is (10 sec: 40976.1, 60 sec: 40139.3, 300 sec: 40098.9). Total num frames: 208879616. Throughput: 0: 40136.1. Samples: 208952120. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-17 23:28:16,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:28:17,339][12883] Updated weights for policy 0, policy_version 12750 (0.0040) [2024-06-17 23:28:21,834][12883] Updated weights for policy 0, policy_version 12760 (0.0040) [2024-06-17 23:28:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 209059840. Throughput: 0: 39954.7. Samples: 209189500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-17 23:28:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:28:25,857][12883] Updated weights for policy 0, policy_version 12770 (0.0036) [2024-06-17 23:28:26,994][12645] Fps is (10 sec: 39330.7, 60 sec: 40413.8, 300 sec: 40099.2). Total num frames: 209272832. Throughput: 0: 40002.0. Samples: 209428120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-17 23:28:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:28:29,705][12883] Updated weights for policy 0, policy_version 12780 (0.0026) [2024-06-17 23:28:31,994][12645] Fps is (10 sec: 40960.8, 60 sec: 39867.8, 300 sec: 40099.2). Total num frames: 209469440. Throughput: 0: 40175.3. Samples: 209549880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:28:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:28:33,949][12883] Updated weights for policy 0, policy_version 12790 (0.0037) [2024-06-17 23:28:36,995][12645] Fps is (10 sec: 39316.9, 60 sec: 39867.0, 300 sec: 40210.1). Total num frames: 209666048. Throughput: 0: 40080.3. Samples: 209785420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:28:36,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:28:37,141][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012798_209682432.pth... 
[2024-06-17 23:28:37,197][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012209_200032256.pth [2024-06-17 23:28:38,101][12883] Updated weights for policy 0, policy_version 12800 (0.0038) [2024-06-17 23:28:41,996][12645] Fps is (10 sec: 39312.9, 60 sec: 39867.8, 300 sec: 40154.4). Total num frames: 209862656. Throughput: 0: 40108.4. Samples: 210032240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-17 23:28:41,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:28:42,249][12883] Updated weights for policy 0, policy_version 12810 (0.0036) [2024-06-17 23:28:46,414][12883] Updated weights for policy 0, policy_version 12820 (0.0033) [2024-06-17 23:28:46,996][12645] Fps is (10 sec: 40957.2, 60 sec: 40139.7, 300 sec: 40154.4). Total num frames: 210075648. Throughput: 0: 39919.2. Samples: 210147820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-17 23:28:46,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:28:50,274][12883] Updated weights for policy 0, policy_version 12830 (0.0047) [2024-06-17 23:28:51,994][12645] Fps is (10 sec: 40968.5, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 210272256. Throughput: 0: 40025.9. Samples: 210392020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:28:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:28:54,381][12883] Updated weights for policy 0, policy_version 12840 (0.0031) [2024-06-17 23:28:56,995][12645] Fps is (10 sec: 39323.0, 60 sec: 39866.8, 300 sec: 40154.5). Total num frames: 210468864. Throughput: 0: 40003.6. Samples: 210631300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:28:56,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:28:58,345][12883] Updated weights for policy 0, policy_version 12850 (0.0030) [2024-06-17 23:29:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 210665472. Throughput: 0: 40032.2. Samples: 210753480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-17 23:29:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:29:02,395][12883] Updated weights for policy 0, policy_version 12860 (0.0037) [2024-06-17 23:29:06,201][12862] Signal inference workers to stop experience collection... (3000 times) [2024-06-17 23:29:06,251][12883] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-17 23:29:06,316][12862] Signal inference workers to resume experience collection... (3000 times) [2024-06-17 23:29:06,316][12883] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-17 23:29:06,458][12883] Updated weights for policy 0, policy_version 12870 (0.0038) [2024-06-17 23:29:06,994][12645] Fps is (10 sec: 40966.2, 60 sec: 40145.0, 300 sec: 40210.2). Total num frames: 210878464. Throughput: 0: 40169.4. Samples: 210997120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-17 23:29:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:29:10,360][12883] Updated weights for policy 0, policy_version 12880 (0.0046) [2024-06-17 23:29:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 211091456. Throughput: 0: 40203.9. Samples: 211237300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-17 23:29:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:29:14,600][12883] Updated weights for policy 0, policy_version 12890 (0.0046) [2024-06-17 23:29:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39869.3, 300 sec: 40210.2). Total num frames: 211271680. Throughput: 0: 40153.7. Samples: 211356800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-17 23:29:16,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-17 23:29:18,406][12883] Updated weights for policy 0, policy_version 12900 (0.0031) [2024-06-17 23:29:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 211468288. Throughput: 0: 40241.4. 
Samples: 211596240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:29:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:29:22,590][12883] Updated weights for policy 0, policy_version 12910 (0.0031) [2024-06-17 23:29:26,529][12883] Updated weights for policy 0, policy_version 12920 (0.0045) [2024-06-17 23:29:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 211681280. Throughput: 0: 40003.6. Samples: 211832320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:29:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:29:30,690][12883] Updated weights for policy 0, policy_version 12930 (0.0043) [2024-06-17 23:29:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 39867.6, 300 sec: 40099.1). Total num frames: 211861504. Throughput: 0: 40264.2. Samples: 211959640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-17 23:29:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:29:34,457][12883] Updated weights for policy 0, policy_version 12940 (0.0027) [2024-06-17 23:29:36,996][12645] Fps is (10 sec: 39313.0, 60 sec: 40140.1, 300 sec: 40209.9). Total num frames: 212074496. Throughput: 0: 40188.7. Samples: 212200600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-17 23:29:36,997][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:29:38,863][12883] Updated weights for policy 0, policy_version 12950 (0.0032) [2024-06-17 23:29:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 40688.4, 300 sec: 40265.8). Total num frames: 212303872. Throughput: 0: 40344.1. Samples: 212446720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-17 23:29:41,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:29:43,146][12883] Updated weights for policy 0, policy_version 12960 (0.0031) [2024-06-17 23:29:46,994][12645] Fps is (10 sec: 40969.5, 60 sec: 40142.1, 300 sec: 40265.8). Total num frames: 212484096. Throughput: 0: 40369.0. Samples: 212570080. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-17 23:29:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:29:47,102][12883] Updated weights for policy 0, policy_version 12970 (0.0040) [2024-06-17 23:29:51,137][12883] Updated weights for policy 0, policy_version 12980 (0.0037) [2024-06-17 23:29:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 212697088. Throughput: 0: 40176.4. Samples: 212805060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:29:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:29:54,979][12883] Updated weights for policy 0, policy_version 12990 (0.0035) [2024-06-17 23:29:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40414.9, 300 sec: 40265.8). Total num frames: 212893696. Throughput: 0: 40232.1. Samples: 213047740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:29:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:29:59,427][12883] Updated weights for policy 0, policy_version 13000 (0.0045) [2024-06-17 23:30:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40414.0, 300 sec: 40210.2). Total num frames: 213090304. Throughput: 0: 40369.4. Samples: 213173420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 23:30:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:30:02,820][12883] Updated weights for policy 0, policy_version 13010 (0.0042) [2024-06-17 23:30:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 213303296. Throughput: 0: 40559.2. Samples: 213421400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 23:30:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:30:07,287][12883] Updated weights for policy 0, policy_version 13020 (0.0040) [2024-06-17 23:30:10,904][12883] Updated weights for policy 0, policy_version 13030 (0.0036) [2024-06-17 23:30:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40140.8, 300 sec: 40265.7). 
Total num frames: 213499904. Throughput: 0: 40606.2. Samples: 213659600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:30:11,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:30:15,826][12883] Updated weights for policy 0, policy_version 13040 (0.0039) [2024-06-17 23:30:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.7, 300 sec: 40154.7). Total num frames: 213696512. Throughput: 0: 40463.1. Samples: 213780480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:30:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:30:18,958][12883] Updated weights for policy 0, policy_version 13050 (0.0044) [2024-06-17 23:30:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40687.0, 300 sec: 40321.3). Total num frames: 213909504. Throughput: 0: 40612.2. Samples: 214028060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-17 23:30:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:30:23,714][12883] Updated weights for policy 0, policy_version 13060 (0.0030) [2024-06-17 23:30:26,461][12862] Signal inference workers to stop experience collection... (3050 times) [2024-06-17 23:30:26,504][12883] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-17 23:30:26,514][12862] Signal inference workers to resume experience collection... (3050 times) [2024-06-17 23:30:26,524][12883] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-17 23:30:26,836][12883] Updated weights for policy 0, policy_version 13070 (0.0037) [2024-06-17 23:30:26,994][12645] Fps is (10 sec: 45875.6, 60 sec: 41233.1, 300 sec: 40433.2). Total num frames: 214155264. Throughput: 0: 40561.7. Samples: 214272000. 
Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-17 23:30:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:30:31,586][12883] Updated weights for policy 0, policy_version 13080 (0.0038) [2024-06-17 23:30:31,995][12645] Fps is (10 sec: 39316.2, 60 sec: 40686.0, 300 sec: 40210.0). Total num frames: 214302720. Throughput: 0: 40492.4. Samples: 214392300. Policy #0 lag: (min: 2.0, avg: 11.7, max: 21.0) [2024-06-17 23:30:32,000][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:30:35,259][12883] Updated weights for policy 0, policy_version 13090 (0.0052) [2024-06-17 23:30:36,994][12645] Fps is (10 sec: 36044.9, 60 sec: 40688.4, 300 sec: 40265.8). Total num frames: 214515712. Throughput: 0: 40641.3. Samples: 214633920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:30:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:30:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013093_214515712.pth... [2024-06-17 23:30:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012503_204849152.pth [2024-06-17 23:30:40,049][12883] Updated weights for policy 0, policy_version 13100 (0.0033) [2024-06-17 23:30:41,994][12645] Fps is (10 sec: 40966.0, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 214712320. Throughput: 0: 40618.7. Samples: 214875580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:30:41,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:30:43,414][12883] Updated weights for policy 0, policy_version 13110 (0.0042) [2024-06-17 23:30:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 214908928. Throughput: 0: 40458.9. Samples: 214994080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 23:30:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:30:47,999][12883] Updated weights for policy 0, policy_version 13120 (0.0043) [2024-06-17 23:30:51,881][12883] Updated weights for policy 0, policy_version 13130 (0.0049) [2024-06-17 23:30:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 215121920. Throughput: 0: 40350.2. Samples: 215237160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 23:30:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:30:56,326][12883] Updated weights for policy 0, policy_version 13140 (0.0038) [2024-06-17 23:30:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.8, 300 sec: 40265.7). Total num frames: 215318528. Throughput: 0: 40546.7. Samples: 215484200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-17 23:30:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:30:59,820][12883] Updated weights for policy 0, policy_version 13150 (0.0037) [2024-06-17 23:31:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 215515136. Throughput: 0: 40441.5. Samples: 215600340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-17 23:31:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:31:04,467][12883] Updated weights for policy 0, policy_version 13160 (0.0039) [2024-06-17 23:31:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 215711744. Throughput: 0: 40240.0. Samples: 215838860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-17 23:31:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:31:08,282][12883] Updated weights for policy 0, policy_version 13170 (0.0031) [2024-06-17 23:31:11,994][12645] Fps is (10 sec: 37682.2, 60 sec: 39867.6, 300 sec: 40210.2). Total num frames: 215891968. Throughput: 0: 40206.5. Samples: 216081300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-17 23:31:11,995][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:31:12,556][12883] Updated weights for policy 0, policy_version 13180 (0.0045) [2024-06-17 23:31:16,173][12883] Updated weights for policy 0, policy_version 13190 (0.0044) [2024-06-17 23:31:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.9, 300 sec: 40265.8). Total num frames: 216121344. Throughput: 0: 40123.9. Samples: 216197820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-17 23:31:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:31:20,610][12883] Updated weights for policy 0, policy_version 13200 (0.0040) [2024-06-17 23:31:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.7, 300 sec: 40154.7). Total num frames: 216301568. Throughput: 0: 39972.9. Samples: 216432700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-17 23:31:21,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:31:24,778][12883] Updated weights for policy 0, policy_version 13210 (0.0035) [2024-06-17 23:31:26,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39048.5, 300 sec: 40210.7). Total num frames: 216498176. Throughput: 0: 39929.7. Samples: 216672420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-17 23:31:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:31:28,700][12883] Updated weights for policy 0, policy_version 13220 (0.0040) [2024-06-17 23:31:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40141.7, 300 sec: 40210.2). Total num frames: 216711168. Throughput: 0: 39981.4. Samples: 216793240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:31:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:31:32,746][12883] Updated weights for policy 0, policy_version 13230 (0.0045) [2024-06-17 23:31:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39594.7, 300 sec: 40154.7). Total num frames: 216891392. Throughput: 0: 39759.6. Samples: 217026340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:31:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:31:37,191][12883] Updated weights for policy 0, policy_version 13240 (0.0034) [2024-06-17 23:31:40,634][12883] Updated weights for policy 0, policy_version 13250 (0.0033) [2024-06-17 23:31:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.8, 300 sec: 40211.0). Total num frames: 217120768. Throughput: 0: 39613.4. Samples: 217266800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-17 23:31:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:31:45,227][12883] Updated weights for policy 0, policy_version 13260 (0.0037) [2024-06-17 23:31:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 217300992. Throughput: 0: 39733.6. Samples: 217388360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-17 23:31:47,000][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:31:49,002][12883] Updated weights for policy 0, policy_version 13270 (0.0030) [2024-06-17 23:31:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 217513984. Throughput: 0: 39788.9. Samples: 217629360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:31:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:31:53,523][12883] Updated weights for policy 0, policy_version 13280 (0.0034) [2024-06-17 23:31:56,994][12645] Fps is (10 sec: 42599.3, 60 sec: 40140.9, 300 sec: 40210.3). Total num frames: 217726976. Throughput: 0: 39680.2. Samples: 217866900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:31:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:31:57,071][12883] Updated weights for policy 0, policy_version 13290 (0.0030) [2024-06-17 23:32:01,652][12883] Updated weights for policy 0, policy_version 13300 (0.0046) [2024-06-17 23:32:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.6, 300 sec: 40210.2). 
Total num frames: 217907200. Throughput: 0: 39909.7. Samples: 217993760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-17 23:32:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:32:05,566][12883] Updated weights for policy 0, policy_version 13310 (0.0053) [2024-06-17 23:32:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 218103808. Throughput: 0: 39925.9. Samples: 218229360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-17 23:32:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:32:10,157][12883] Updated weights for policy 0, policy_version 13320 (0.0048) [2024-06-17 23:32:10,165][12862] Signal inference workers to stop experience collection... (3100 times) [2024-06-17 23:32:10,165][12862] Signal inference workers to resume experience collection... (3100 times) [2024-06-17 23:32:10,188][12883] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-17 23:32:10,188][12883] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-17 23:32:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 218316800. Throughput: 0: 39930.7. Samples: 218469300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:32:11,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:32:13,795][12883] Updated weights for policy 0, policy_version 13330 (0.0032) [2024-06-17 23:32:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.8, 300 sec: 40210.2). Total num frames: 218513408. Throughput: 0: 40074.8. Samples: 218596600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:32:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:32:17,826][12883] Updated weights for policy 0, policy_version 13340 (0.0031) [2024-06-17 23:32:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 218710016. Throughput: 0: 40207.9. Samples: 218835700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 23:32:21,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:32:22,150][12883] Updated weights for policy 0, policy_version 13350 (0.0030) [2024-06-17 23:32:25,933][12883] Updated weights for policy 0, policy_version 13360 (0.0050) [2024-06-17 23:32:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40414.0, 300 sec: 40154.7). Total num frames: 218923008. Throughput: 0: 40157.0. Samples: 219073860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 23:32:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:32:30,471][12883] Updated weights for policy 0, policy_version 13370 (0.0027) [2024-06-17 23:32:31,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39867.8, 300 sec: 40099.2). Total num frames: 219103232. Throughput: 0: 40190.4. Samples: 219196920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-17 23:32:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:32:34,052][12883] Updated weights for policy 0, policy_version 13380 (0.0026) [2024-06-17 23:32:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.9, 300 sec: 40155.0). Total num frames: 219316224. Throughput: 0: 40102.3. Samples: 219433960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-17 23:32:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:32:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013386_219316224.pth... [2024-06-17 23:32:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000012798_209682432.pth [2024-06-17 23:32:38,572][12883] Updated weights for policy 0, policy_version 13390 (0.0040) [2024-06-17 23:32:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40140.9, 300 sec: 40210.3). Total num frames: 219529216. Throughput: 0: 40174.2. Samples: 219674740. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-17 23:32:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:32:42,010][12883] Updated weights for policy 0, policy_version 13400 (0.0029) [2024-06-17 23:32:46,817][12883] Updated weights for policy 0, policy_version 13410 (0.0040) [2024-06-17 23:32:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.9, 300 sec: 40210.2). Total num frames: 219709440. Throughput: 0: 39993.8. Samples: 219793480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-17 23:32:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:32:50,265][12883] Updated weights for policy 0, policy_version 13420 (0.0038) [2024-06-17 23:32:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40414.0, 300 sec: 40210.3). Total num frames: 219938816. Throughput: 0: 40285.3. Samples: 220042200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-17 23:32:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:32:55,240][12883] Updated weights for policy 0, policy_version 13430 (0.0052) [2024-06-17 23:32:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.6, 300 sec: 40154.7). Total num frames: 220119040. Throughput: 0: 40332.0. Samples: 220284240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-17 23:32:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:32:58,381][12883] Updated weights for policy 0, policy_version 13440 (0.0038) [2024-06-17 23:33:01,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40413.9, 300 sec: 40211.1). Total num frames: 220332032. Throughput: 0: 40103.0. Samples: 220401240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-17 23:33:01,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:33:03,111][12883] Updated weights for policy 0, policy_version 13450 (0.0033) [2024-06-17 23:33:06,621][12883] Updated weights for policy 0, policy_version 13460 (0.0045) [2024-06-17 23:33:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40686.9, 300 sec: 40265.8). 
Total num frames: 220545024. Throughput: 0: 40279.2. Samples: 220648260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-17 23:33:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:33:11,517][12883] Updated weights for policy 0, policy_version 13470 (0.0038) [2024-06-17 23:33:11,994][12645] Fps is (10 sec: 36045.2, 60 sec: 39594.8, 300 sec: 40043.9). Total num frames: 220692480. Throughput: 0: 40394.2. Samples: 220891600. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-17 23:33:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:33:14,782][12883] Updated weights for policy 0, policy_version 13480 (0.0044) [2024-06-17 23:33:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40413.7, 300 sec: 40265.8). Total num frames: 220938240. Throughput: 0: 40215.8. Samples: 221006640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-17 23:33:16,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:33:19,891][12883] Updated weights for policy 0, policy_version 13490 (0.0037) [2024-06-17 23:33:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.8, 300 sec: 40099.1). Total num frames: 221102080. Throughput: 0: 40484.4. Samples: 221255760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 23:33:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:33:22,811][12883] Updated weights for policy 0, policy_version 13500 (0.0036) [2024-06-17 23:33:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 221331456. Throughput: 0: 40322.5. Samples: 221489260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 23:33:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:33:27,765][12883] Updated weights for policy 0, policy_version 13510 (0.0034) [2024-06-17 23:33:31,124][12883] Updated weights for policy 0, policy_version 13520 (0.0037) [2024-06-17 23:33:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40686.8, 300 sec: 40265.9). 
Total num frames: 221544448. Throughput: 0: 40474.6. Samples: 221614840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-17 23:33:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:33:35,540][12883] Updated weights for policy 0, policy_version 13530 (0.0039) [2024-06-17 23:33:36,994][12645] Fps is (10 sec: 37683.8, 60 sec: 39867.7, 300 sec: 40155.0). Total num frames: 221708288. Throughput: 0: 40165.7. Samples: 221849660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-17 23:33:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:33:39,297][12883] Updated weights for policy 0, policy_version 13540 (0.0048) [2024-06-17 23:33:41,999][12645] Fps is (10 sec: 39300.0, 60 sec: 40137.0, 300 sec: 40209.7). Total num frames: 221937664. Throughput: 0: 40042.7. Samples: 222086380. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-17 23:33:42,000][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:33:43,494][12883] Updated weights for policy 0, policy_version 13550 (0.0036) [2024-06-17 23:33:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 222117888. Throughput: 0: 40154.8. Samples: 222208200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-17 23:33:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:33:47,761][12883] Updated weights for policy 0, policy_version 13560 (0.0038) [2024-06-17 23:33:48,413][12862] Signal inference workers to stop experience collection... (3150 times) [2024-06-17 23:33:48,423][12862] Signal inference workers to resume experience collection... 
(3150 times) [2024-06-17 23:33:48,456][12883] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-17 23:33:48,457][12883] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-17 23:33:51,943][12883] Updated weights for policy 0, policy_version 13570 (0.0032) [2024-06-17 23:33:51,994][12645] Fps is (10 sec: 39343.4, 60 sec: 39867.6, 300 sec: 40210.4). Total num frames: 222330880. Throughput: 0: 39872.0. Samples: 222442500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-17 23:33:51,998][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:33:56,094][12883] Updated weights for policy 0, policy_version 13580 (0.0041) [2024-06-17 23:33:57,000][12645] Fps is (10 sec: 42571.6, 60 sec: 40409.8, 300 sec: 40264.9). Total num frames: 222543872. Throughput: 0: 39774.5. Samples: 222681700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-17 23:33:57,000][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:33:59,820][12883] Updated weights for policy 0, policy_version 13590 (0.0043) [2024-06-17 23:34:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 222707712. Throughput: 0: 39938.8. Samples: 222803880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-17 23:34:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:34:04,293][12883] Updated weights for policy 0, policy_version 13600 (0.0043) [2024-06-17 23:34:06,994][12645] Fps is (10 sec: 40985.1, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 222953472. Throughput: 0: 39784.8. Samples: 223046080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-17 23:34:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:34:07,715][12883] Updated weights for policy 0, policy_version 13610 (0.0030) [2024-06-17 23:34:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 223117312. Throughput: 0: 39987.1. Samples: 223288680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:34:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:34:12,279][12883] Updated weights for policy 0, policy_version 13620 (0.0030) [2024-06-17 23:34:15,720][12883] Updated weights for policy 0, policy_version 13630 (0.0040) [2024-06-17 23:34:16,996][12645] Fps is (10 sec: 36037.0, 60 sec: 39593.3, 300 sec: 40154.4). Total num frames: 223313920. Throughput: 0: 39838.5. Samples: 223407660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:34:16,996][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:34:20,252][12883] Updated weights for policy 0, policy_version 13640 (0.0033) [2024-06-17 23:34:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40686.9, 300 sec: 40210.2). Total num frames: 223543296. Throughput: 0: 39981.3. Samples: 223648820. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-17 23:34:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:34:24,276][12883] Updated weights for policy 0, policy_version 13650 (0.0026) [2024-06-17 23:34:26,994][12645] Fps is (10 sec: 39330.1, 60 sec: 39594.7, 300 sec: 40154.7). Total num frames: 223707136. Throughput: 0: 40193.4. Samples: 223894860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-17 23:34:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:34:28,479][12883] Updated weights for policy 0, policy_version 13660 (0.0028) [2024-06-17 23:34:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.8, 300 sec: 40210.5). Total num frames: 223936512. Throughput: 0: 40011.0. Samples: 224008700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 23:34:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:34:32,559][12883] Updated weights for policy 0, policy_version 13670 (0.0033) [2024-06-17 23:34:36,633][12883] Updated weights for policy 0, policy_version 13680 (0.0039) [2024-06-17 23:34:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40413.8, 300 sec: 40099.1). 
Total num frames: 224133120. Throughput: 0: 40262.6. Samples: 224254320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 23:34:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:34:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013680_224133120.pth... [2024-06-17 23:34:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013093_214515712.pth [2024-06-17 23:34:40,628][12883] Updated weights for policy 0, policy_version 13690 (0.0030) [2024-06-17 23:34:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39871.5, 300 sec: 40154.7). Total num frames: 224329728. Throughput: 0: 40267.4. Samples: 224493480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-17 23:34:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:34:44,935][12883] Updated weights for policy 0, policy_version 13700 (0.0025) [2024-06-17 23:34:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 224542720. Throughput: 0: 40124.8. Samples: 224609500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-17 23:34:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:34:49,237][12883] Updated weights for policy 0, policy_version 13710 (0.0035) [2024-06-17 23:34:51,994][12645] Fps is (10 sec: 37682.7, 60 sec: 39594.6, 300 sec: 40043.6). Total num frames: 224706560. Throughput: 0: 40151.1. Samples: 224852880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-17 23:34:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:34:53,057][12883] Updated weights for policy 0, policy_version 13720 (0.0039) [2024-06-17 23:34:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39871.9, 300 sec: 40154.7). Total num frames: 224935936. Throughput: 0: 40009.9. Samples: 225089120. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-17 23:34:56,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:34:57,212][12883] Updated weights for policy 0, policy_version 13730 (0.0043) [2024-06-17 23:35:01,554][12883] Updated weights for policy 0, policy_version 13740 (0.0040) [2024-06-17 23:35:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 225148928. Throughput: 0: 40072.2. Samples: 225210820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 23:35:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:35:05,739][12883] Updated weights for policy 0, policy_version 13750 (0.0045) [2024-06-17 23:35:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 225329152. Throughput: 0: 40040.4. Samples: 225450640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 23:35:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:35:09,560][12883] Updated weights for policy 0, policy_version 13760 (0.0043) [2024-06-17 23:35:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 225558528. Throughput: 0: 39837.0. Samples: 225687520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-17 23:35:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:35:14,017][12883] Updated weights for policy 0, policy_version 13770 (0.0038) [2024-06-17 23:35:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40142.3, 300 sec: 40043.6). Total num frames: 225722368. Throughput: 0: 40036.0. Samples: 225810320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-17 23:35:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:35:17,366][12883] Updated weights for policy 0, policy_version 13780 (0.0036) [2024-06-17 23:35:21,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39594.6, 300 sec: 39877.0). Total num frames: 225918976. Throughput: 0: 39861.0. Samples: 226048060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-17 23:35:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:35:22,136][12883] Updated weights for policy 0, policy_version 13790 (0.0037) [2024-06-17 23:35:23,206][12862] Signal inference workers to stop experience collection... (3200 times) [2024-06-17 23:35:23,268][12883] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-17 23:35:23,329][12862] Signal inference workers to resume experience collection... (3200 times) [2024-06-17 23:35:23,329][12883] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-17 23:35:25,272][12883] Updated weights for policy 0, policy_version 13800 (0.0048) [2024-06-17 23:35:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 40099.3). Total num frames: 226131968. Throughput: 0: 40036.9. Samples: 226295140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-17 23:35:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:35:30,164][12883] Updated weights for policy 0, policy_version 13810 (0.0039) [2024-06-17 23:35:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39321.7, 300 sec: 39932.5). Total num frames: 226295808. Throughput: 0: 40105.0. Samples: 226414220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-17 23:35:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:35:33,804][12883] Updated weights for policy 0, policy_version 13820 (0.0036) [2024-06-17 23:35:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.9, 300 sec: 40099.1). Total num frames: 226541568. Throughput: 0: 39860.0. Samples: 226646580. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-17 23:35:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:35:38,228][12883] Updated weights for policy 0, policy_version 13830 (0.0031) [2024-06-17 23:35:41,750][12883] Updated weights for policy 0, policy_version 13840 (0.0034) [2024-06-17 23:35:41,994][12645] Fps is (10 sec: 45874.4, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 226754560. Throughput: 0: 39974.1. Samples: 226887960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 23:35:41,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-17 23:35:46,757][12883] Updated weights for policy 0, policy_version 13850 (0.0032) [2024-06-17 23:35:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39594.6, 300 sec: 39988.1). Total num frames: 226918400. Throughput: 0: 39987.5. Samples: 227010260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 23:35:46,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:35:49,785][12883] Updated weights for policy 0, policy_version 13860 (0.0043) [2024-06-17 23:35:51,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40413.9, 300 sec: 40043.6). Total num frames: 227131392. Throughput: 0: 39936.0. Samples: 227247760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-17 23:35:51,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:35:54,805][12883] Updated weights for policy 0, policy_version 13870 (0.0022) [2024-06-17 23:35:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39321.6, 300 sec: 39932.5). Total num frames: 227295232. Throughput: 0: 40188.9. Samples: 227496020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:35:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:35:58,331][12883] Updated weights for policy 0, policy_version 13880 (0.0041) [2024-06-17 23:36:01,994][12645] Fps is (10 sec: 40959.4, 60 sec: 39867.6, 300 sec: 40099.1). Total num frames: 227540992. Throughput: 0: 39990.6. Samples: 227609900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:36:01,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:36:03,038][12883] Updated weights for policy 0, policy_version 13890 (0.0030) [2024-06-17 23:36:06,662][12883] Updated weights for policy 0, policy_version 13900 (0.0028) [2024-06-17 23:36:06,996][12645] Fps is (10 sec: 45865.0, 60 sec: 40412.4, 300 sec: 40209.9). Total num frames: 227753984. Throughput: 0: 40051.8. Samples: 227850480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:36:06,997][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:36:11,481][12883] Updated weights for policy 0, policy_version 13910 (0.0037) [2024-06-17 23:36:11,994][12645] Fps is (10 sec: 37684.0, 60 sec: 39321.6, 300 sec: 39988.1). Total num frames: 227917824. Throughput: 0: 39970.7. Samples: 228093820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:36:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:36:14,738][12883] Updated weights for policy 0, policy_version 13920 (0.0039) [2024-06-17 23:36:16,994][12645] Fps is (10 sec: 37691.3, 60 sec: 40140.8, 300 sec: 40099.1). Total num frames: 228130816. Throughput: 0: 39843.4. Samples: 228207180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:36:16,995][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:36:19,608][12883] Updated weights for policy 0, policy_version 13930 (0.0046) [2024-06-17 23:36:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 228360192. Throughput: 0: 40073.4. Samples: 228449880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:36:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:36:22,827][12883] Updated weights for policy 0, policy_version 13940 (0.0037) [2024-06-17 23:36:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39867.7, 300 sec: 40043.6). Total num frames: 228524032. Throughput: 0: 39942.7. Samples: 228685380. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:36:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:36:27,821][12883] Updated weights for policy 0, policy_version 13950 (0.0043) [2024-06-17 23:36:31,288][12883] Updated weights for policy 0, policy_version 13960 (0.0040) [2024-06-17 23:36:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 228737024. Throughput: 0: 39862.7. Samples: 228804080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:36:31,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:36:35,674][12883] Updated weights for policy 0, policy_version 13970 (0.0040) [2024-06-17 23:36:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39594.8, 300 sec: 39988.1). Total num frames: 228917248. Throughput: 0: 40074.7. Samples: 229051120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-17 23:36:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:36:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013973_228933632.pth... [2024-06-17 23:36:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013386_219316224.pth [2024-06-17 23:36:39,344][12883] Updated weights for policy 0, policy_version 13980 (0.0036) [2024-06-17 23:36:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 229130240. Throughput: 0: 39885.7. Samples: 229290880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-17 23:36:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:36:43,645][12883] Updated weights for policy 0, policy_version 13990 (0.0031) [2024-06-17 23:36:45,161][12862] Signal inference workers to stop experience collection... (3250 times) [2024-06-17 23:36:45,161][12862] Signal inference workers to resume experience collection... 
(3250 times) [2024-06-17 23:36:45,211][12883] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-17 23:36:45,211][12883] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-17 23:36:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40413.9, 300 sec: 40099.1). Total num frames: 229343232. Throughput: 0: 40070.3. Samples: 229413060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:36:46,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:36:47,979][12883] Updated weights for policy 0, policy_version 14000 (0.0028) [2024-06-17 23:36:51,519][12883] Updated weights for policy 0, policy_version 14010 (0.0027) [2024-06-17 23:36:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40140.8, 300 sec: 40043.6). Total num frames: 229539840. Throughput: 0: 40073.2. Samples: 229653680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:36:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:36:55,968][12883] Updated weights for policy 0, policy_version 14020 (0.0038) [2024-06-17 23:36:56,994][12645] Fps is (10 sec: 36045.1, 60 sec: 40140.8, 300 sec: 39988.1). Total num frames: 229703680. Throughput: 0: 40079.5. Samples: 229897400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:36:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:36:59,799][12883] Updated weights for policy 0, policy_version 14030 (0.0033) [2024-06-17 23:37:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 229949440. Throughput: 0: 40138.3. Samples: 230013400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:37:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:37:03,982][12883] Updated weights for policy 0, policy_version 14040 (0.0029) [2024-06-17 23:37:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 39869.2, 300 sec: 40099.2). Total num frames: 230146048. Throughput: 0: 40221.3. Samples: 230259840. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-17 23:37:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:37:08,326][12883] Updated weights for policy 0, policy_version 14050 (0.0025) [2024-06-17 23:37:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 230359040. Throughput: 0: 40122.8. Samples: 230490900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-17 23:37:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:37:12,004][12883] Updated weights for policy 0, policy_version 14060 (0.0035) [2024-06-17 23:37:16,393][12883] Updated weights for policy 0, policy_version 14070 (0.0039) [2024-06-17 23:37:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40140.8, 300 sec: 40099.2). Total num frames: 230539264. Throughput: 0: 40239.6. Samples: 230614860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 27.0) [2024-06-17 23:37:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:37:20,645][12883] Updated weights for policy 0, policy_version 14080 (0.0041) [2024-06-17 23:37:21,994][12645] Fps is (10 sec: 34406.2, 60 sec: 39048.6, 300 sec: 39932.5). Total num frames: 230703104. Throughput: 0: 40161.8. Samples: 230858400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 27.0) [2024-06-17 23:37:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:37:24,426][12883] Updated weights for policy 0, policy_version 14090 (0.0039) [2024-06-17 23:37:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 230948864. Throughput: 0: 40043.0. Samples: 231092820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-17 23:37:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:37:28,770][12883] Updated weights for policy 0, policy_version 14100 (0.0040) [2024-06-17 23:37:31,994][12645] Fps is (10 sec: 44236.1, 60 sec: 40140.7, 300 sec: 40099.1). Total num frames: 231145472. Throughput: 0: 40090.6. Samples: 231217140. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-17 23:37:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:37:32,623][12883] Updated weights for policy 0, policy_version 14110 (0.0035) [2024-06-17 23:37:36,777][12883] Updated weights for policy 0, policy_version 14120 (0.0039) [2024-06-17 23:37:36,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40413.8, 300 sec: 40043.6). Total num frames: 231342080. Throughput: 0: 40053.7. Samples: 231456100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:37:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:37:40,969][12883] Updated weights for policy 0, policy_version 14130 (0.0037) [2024-06-17 23:37:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.7, 300 sec: 40099.1). Total num frames: 231538688. Throughput: 0: 40169.2. Samples: 231705020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:37:41,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:37:44,799][12883] Updated weights for policy 0, policy_version 14140 (0.0034) [2024-06-17 23:37:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 39867.8, 300 sec: 39988.1). Total num frames: 231735296. Throughput: 0: 40200.5. Samples: 231822420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:37:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:37:49,050][12883] Updated weights for policy 0, policy_version 14150 (0.0040) [2024-06-17 23:37:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40140.7, 300 sec: 40099.2). Total num frames: 231948288. Throughput: 0: 39990.2. Samples: 232059400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-17 23:37:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:37:52,695][12883] Updated weights for policy 0, policy_version 14160 (0.0042) [2024-06-17 23:37:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 39988.1). Total num frames: 232128512. Throughput: 0: 40205.2. Samples: 232300140. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-17 23:37:56,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:37:57,554][12883] Updated weights for policy 0, policy_version 14170 (0.0039) [2024-06-17 23:38:00,782][12883] Updated weights for policy 0, policy_version 14180 (0.0047) [2024-06-17 23:38:01,994][12645] Fps is (10 sec: 37683.7, 60 sec: 39594.7, 300 sec: 39932.5). Total num frames: 232325120. Throughput: 0: 40032.5. Samples: 232416320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-17 23:38:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:38:05,618][12883] Updated weights for policy 0, policy_version 14190 (0.0044) [2024-06-17 23:38:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 232538112. Throughput: 0: 40072.9. Samples: 232661680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-17 23:38:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:38:09,589][12883] Updated weights for policy 0, policy_version 14200 (0.0029) [2024-06-17 23:38:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 39867.7, 300 sec: 40043.6). Total num frames: 232751104. Throughput: 0: 40226.4. Samples: 232903000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-17 23:38:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:38:13,941][12883] Updated weights for policy 0, policy_version 14210 (0.0037) [2024-06-17 23:38:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 232947712. Throughput: 0: 40113.0. Samples: 233022220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-17 23:38:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:38:17,512][12883] Updated weights for policy 0, policy_version 14220 (0.0045) [2024-06-17 23:38:21,766][12883] Updated weights for policy 0, policy_version 14230 (0.0028) [2024-06-17 23:38:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40686.9, 300 sec: 40043.6). 
Total num frames: 233144320. Throughput: 0: 40196.9. Samples: 233264960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-17 23:38:21,995][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:38:25,383][12883] Updated weights for policy 0, policy_version 14240 (0.0037) [2024-06-17 23:38:26,996][12645] Fps is (10 sec: 39312.8, 60 sec: 39866.3, 300 sec: 39987.8). Total num frames: 233340928. Throughput: 0: 40074.1. Samples: 233508440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 25.0) [2024-06-17 23:38:26,996][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:38:30,184][12883] Updated weights for policy 0, policy_version 14250 (0.0032) [2024-06-17 23:38:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 233553920. Throughput: 0: 40123.5. Samples: 233627980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-17 23:38:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:38:33,406][12883] Updated weights for policy 0, policy_version 14260 (0.0034) [2024-06-17 23:38:36,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40140.8, 300 sec: 40044.4). Total num frames: 233750528. Throughput: 0: 40395.1. Samples: 233877180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-17 23:38:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:38:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014267_233750528.pth... [2024-06-17 23:38:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013680_224133120.pth [2024-06-17 23:38:38,150][12883] Updated weights for policy 0, policy_version 14270 (0.0039) [2024-06-17 23:38:41,477][12862] Signal inference workers to stop experience collection... (3300 times) [2024-06-17 23:38:41,478][12862] Signal inference workers to resume experience collection... 
(3300 times) [2024-06-17 23:38:41,528][12883] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-17 23:38:41,528][12883] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-17 23:38:41,607][12883] Updated weights for policy 0, policy_version 14280 (0.0028) [2024-06-17 23:38:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40414.0, 300 sec: 40154.7). Total num frames: 233963520. Throughput: 0: 40224.6. Samples: 234110240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-17 23:38:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:38:46,092][12883] Updated weights for policy 0, policy_version 14290 (0.0034) [2024-06-17 23:38:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40413.9, 300 sec: 40099.2). Total num frames: 234160128. Throughput: 0: 40491.1. Samples: 234238420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-17 23:38:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:38:49,767][12883] Updated weights for policy 0, policy_version 14300 (0.0040) [2024-06-17 23:38:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.9, 300 sec: 40044.5). Total num frames: 234356736. Throughput: 0: 40259.5. Samples: 234473360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:38:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:38:53,960][12883] Updated weights for policy 0, policy_version 14310 (0.0028) [2024-06-17 23:38:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 234569728. Throughput: 0: 40322.1. Samples: 234717500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-17 23:38:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:38:57,659][12883] Updated weights for policy 0, policy_version 14320 (0.0047) [2024-06-17 23:39:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 40043.6). Total num frames: 234766336. Throughput: 0: 40448.0. Samples: 234842380. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:39:01,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-17 23:39:02,371][12883] Updated weights for policy 0, policy_version 14330 (0.0035) [2024-06-17 23:39:06,060][12883] Updated weights for policy 0, policy_version 14340 (0.0032) [2024-06-17 23:39:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.7, 300 sec: 40154.7). Total num frames: 234962944. Throughput: 0: 40411.5. Samples: 235083480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:39:06,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:39:10,451][12883] Updated weights for policy 0, policy_version 14350 (0.0029) [2024-06-17 23:39:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.8, 300 sec: 40210.5). Total num frames: 235175936. Throughput: 0: 40426.4. Samples: 235327540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-17 23:39:11,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:39:14,071][12883] Updated weights for policy 0, policy_version 14360 (0.0027) [2024-06-17 23:39:16,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40140.8, 300 sec: 40043.6). Total num frames: 235356160. Throughput: 0: 40432.0. Samples: 235447420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-17 23:39:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:39:18,432][12883] Updated weights for policy 0, policy_version 14370 (0.0037) [2024-06-17 23:39:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 40265.8). Total num frames: 235585536. Throughput: 0: 40330.7. Samples: 235692060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-17 23:39:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:39:22,163][12883] Updated weights for policy 0, policy_version 14380 (0.0032) [2024-06-17 23:39:26,606][12883] Updated weights for policy 0, policy_version 14390 (0.0040) [2024-06-17 23:39:26,996][12645] Fps is (10 sec: 42588.9, 60 sec: 40687.0, 300 sec: 40154.4). 
Total num frames: 235782144. Throughput: 0: 40476.6. Samples: 235931780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-17 23:39:26,996][12645] Avg episode reward: [(0, '0.020')] [2024-06-17 23:39:30,203][12883] Updated weights for policy 0, policy_version 14400 (0.0040) [2024-06-17 23:39:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 235978752. Throughput: 0: 40357.8. Samples: 236054520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-17 23:39:31,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:39:34,585][12883] Updated weights for policy 0, policy_version 14410 (0.0039) [2024-06-17 23:39:36,994][12645] Fps is (10 sec: 39329.9, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 236175360. Throughput: 0: 40515.0. Samples: 236296540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-17 23:39:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:39:38,293][12883] Updated weights for policy 0, policy_version 14420 (0.0035) [2024-06-17 23:39:41,994][12645] Fps is (10 sec: 37682.6, 60 sec: 39867.6, 300 sec: 40043.6). Total num frames: 236355584. Throughput: 0: 40607.5. Samples: 236544840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 22.0) [2024-06-17 23:39:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:39:42,743][12883] Updated weights for policy 0, policy_version 14430 (0.0032) [2024-06-17 23:39:46,246][12883] Updated weights for policy 0, policy_version 14440 (0.0035) [2024-06-17 23:39:47,000][12645] Fps is (10 sec: 44209.5, 60 sec: 40955.7, 300 sec: 40376.0). Total num frames: 236617728. Throughput: 0: 40430.3. Samples: 236662000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:39:47,001][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:39:50,968][12883] Updated weights for policy 0, policy_version 14450 (0.0046) [2024-06-17 23:39:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 236781568. 
Throughput: 0: 40549.9. Samples: 236908220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:39:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:39:54,457][12883] Updated weights for policy 0, policy_version 14460 (0.0037) [2024-06-17 23:39:56,994][12645] Fps is (10 sec: 34427.5, 60 sec: 39867.7, 300 sec: 40043.6). Total num frames: 236961792. Throughput: 0: 40428.8. Samples: 237146840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-17 23:39:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:39:59,263][12883] Updated weights for policy 0, policy_version 14470 (0.0032) [2024-06-17 23:40:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 237191168. Throughput: 0: 40403.1. Samples: 237265560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-17 23:40:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:40:02,493][12883] Updated weights for policy 0, policy_version 14480 (0.0042) [2024-06-17 23:40:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40140.9, 300 sec: 40043.6). Total num frames: 237371392. Throughput: 0: 40419.1. Samples: 237510920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:40:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:40:07,291][12883] Updated weights for policy 0, policy_version 14490 (0.0030) [2024-06-17 23:40:10,424][12862] Signal inference workers to stop experience collection... (3350 times) [2024-06-17 23:40:10,424][12862] Signal inference workers to resume experience collection... (3350 times) [2024-06-17 23:40:10,461][12883] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-06-17 23:40:10,462][12883] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-06-17 23:40:10,559][12883] Updated weights for policy 0, policy_version 14500 (0.0052) [2024-06-17 23:40:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 40321.3). 
Total num frames: 237617152. Throughput: 0: 40295.3. Samples: 237744980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:40:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:40:15,447][12883] Updated weights for policy 0, policy_version 14510 (0.0032) [2024-06-17 23:40:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40686.9, 300 sec: 40265.8). Total num frames: 237797376. Throughput: 0: 40406.1. Samples: 237872800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-17 23:40:16,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:40:18,833][12883] Updated weights for policy 0, policy_version 14520 (0.0036) [2024-06-17 23:40:21,994][12645] Fps is (10 sec: 36045.1, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 237977600. Throughput: 0: 40227.7. Samples: 238106780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-17 23:40:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:40:23,697][12883] Updated weights for policy 0, policy_version 14530 (0.0030) [2024-06-17 23:40:26,693][12883] Updated weights for policy 0, policy_version 14540 (0.0036) [2024-06-17 23:40:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40688.4, 300 sec: 40432.4). Total num frames: 238223360. Throughput: 0: 40162.7. Samples: 238352160. Policy #0 lag: (min: 1.0, avg: 10.7, max: 24.0) [2024-06-17 23:40:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:40:31,817][12883] Updated weights for policy 0, policy_version 14550 (0.0033) [2024-06-17 23:40:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 238387200. Throughput: 0: 40410.1. Samples: 238480200. Policy #0 lag: (min: 0.0, avg: 13.2, max: 24.0) [2024-06-17 23:40:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:40:34,984][12883] Updated weights for policy 0, policy_version 14560 (0.0052) [2024-06-17 23:40:37,000][12645] Fps is (10 sec: 36022.4, 60 sec: 40136.7, 300 sec: 40098.3). 
Total num frames: 238583808. Throughput: 0: 40052.2. Samples: 238710820. Policy #0 lag: (min: 0.0, avg: 13.2, max: 24.0) [2024-06-17 23:40:37,001][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:40:37,125][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014563_238600192.pth... [2024-06-17 23:40:37,187][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000013973_228933632.pth [2024-06-17 23:40:40,264][12883] Updated weights for policy 0, policy_version 14570 (0.0048) [2024-06-17 23:40:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 40265.8). Total num frames: 238796800. Throughput: 0: 40224.6. Samples: 238956940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-17 23:40:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:40:43,103][12883] Updated weights for policy 0, policy_version 14580 (0.0047) [2024-06-17 23:40:46,994][12645] Fps is (10 sec: 39346.0, 60 sec: 39325.7, 300 sec: 40154.7). Total num frames: 238977024. Throughput: 0: 40284.4. Samples: 239078360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-17 23:40:46,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:40:48,326][12883] Updated weights for policy 0, policy_version 14590 (0.0039) [2024-06-17 23:40:51,286][12883] Updated weights for policy 0, policy_version 14600 (0.0028) [2024-06-17 23:40:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40686.9, 300 sec: 40432.4). Total num frames: 239222784. Throughput: 0: 40147.1. Samples: 239317540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-17 23:40:51,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:40:56,617][12883] Updated weights for policy 0, policy_version 14610 (0.0039) [2024-06-17 23:40:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 239386624. Throughput: 0: 40429.7. Samples: 239564320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-17 23:40:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:40:59,921][12883] Updated weights for policy 0, policy_version 14620 (0.0042) [2024-06-17 23:41:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40140.8, 300 sec: 40155.0). Total num frames: 239599616. Throughput: 0: 40235.7. Samples: 239683400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:41:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:41:04,380][12883] Updated weights for policy 0, policy_version 14630 (0.0035) [2024-06-17 23:41:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40960.0, 300 sec: 40376.8). Total num frames: 239828992. Throughput: 0: 40653.6. Samples: 239936200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-17 23:41:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:41:07,763][12883] Updated weights for policy 0, policy_version 14640 (0.0036) [2024-06-17 23:41:11,993][12645] Fps is (10 sec: 40960.5, 60 sec: 39867.8, 300 sec: 40265.8). Total num frames: 240009216. Throughput: 0: 40459.8. Samples: 240172840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-17 23:41:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:41:12,281][12883] Updated weights for policy 0, policy_version 14650 (0.0032) [2024-06-17 23:41:15,763][12883] Updated weights for policy 0, policy_version 14660 (0.0030) [2024-06-17 23:41:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 240222208. Throughput: 0: 40344.8. Samples: 240295720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-17 23:41:16,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:41:20,274][12883] Updated weights for policy 0, policy_version 14670 (0.0024) [2024-06-17 23:41:21,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 240402432. Throughput: 0: 40736.8. Samples: 240543720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 23:41:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:41:24,211][12883] Updated weights for policy 0, policy_version 14680 (0.0028) [2024-06-17 23:41:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 240631808. Throughput: 0: 40603.5. Samples: 240784100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 23:41:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:41:28,109][12883] Updated weights for policy 0, policy_version 14690 (0.0048) [2024-06-17 23:41:31,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40687.0, 300 sec: 40376.8). Total num frames: 240828416. Throughput: 0: 40602.8. Samples: 240905480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 23:41:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:41:32,069][12883] Updated weights for policy 0, policy_version 14700 (0.0032) [2024-06-17 23:41:36,144][12883] Updated weights for policy 0, policy_version 14710 (0.0048) [2024-06-17 23:41:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40691.2, 300 sec: 40321.3). Total num frames: 241025024. Throughput: 0: 40800.0. Samples: 241153540. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-17 23:41:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:41:39,993][12883] Updated weights for policy 0, policy_version 14720 (0.0039) [2024-06-17 23:41:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 241238016. Throughput: 0: 40609.9. Samples: 241391760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-17 23:41:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:41:43,537][12862] Signal inference workers to stop experience collection... (3400 times) [2024-06-17 23:41:43,537][12862] Signal inference workers to resume experience collection... 
(3400 times) [2024-06-17 23:41:43,551][12883] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-06-17 23:41:43,552][12883] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-06-17 23:41:44,542][12883] Updated weights for policy 0, policy_version 14730 (0.0033) [2024-06-17 23:41:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 40265.7). Total num frames: 241418240. Throughput: 0: 40683.9. Samples: 241514180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:41:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:41:47,973][12883] Updated weights for policy 0, policy_version 14740 (0.0043) [2024-06-17 23:41:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40140.7, 300 sec: 40432.4). Total num frames: 241631232. Throughput: 0: 40460.0. Samples: 241756900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:41:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:41:52,545][12883] Updated weights for policy 0, policy_version 14750 (0.0034) [2024-06-17 23:41:56,536][12883] Updated weights for policy 0, policy_version 14760 (0.0038) [2024-06-17 23:41:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 40321.3). Total num frames: 241844224. Throughput: 0: 40585.6. Samples: 241999200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-17 23:41:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:42:00,672][12883] Updated weights for policy 0, policy_version 14770 (0.0034) [2024-06-17 23:42:01,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 242024448. Throughput: 0: 40555.2. Samples: 242120700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-17 23:42:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:42:04,585][12883] Updated weights for policy 0, policy_version 14780 (0.0035) [2024-06-17 23:42:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.8, 300 sec: 40265.7). 
Total num frames: 242237440. Throughput: 0: 40350.6. Samples: 242359500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-17 23:42:06,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:42:08,724][12883] Updated weights for policy 0, policy_version 14790 (0.0033) [2024-06-17 23:42:11,994][12645] Fps is (10 sec: 42597.4, 60 sec: 40686.7, 300 sec: 40376.8). Total num frames: 242450432. Throughput: 0: 40547.8. Samples: 242608760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-17 23:42:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:42:12,614][12883] Updated weights for policy 0, policy_version 14800 (0.0038) [2024-06-17 23:42:16,733][12883] Updated weights for policy 0, policy_version 14810 (0.0032) [2024-06-17 23:42:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40413.9, 300 sec: 40487.9). Total num frames: 242647040. Throughput: 0: 40477.8. Samples: 242726980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:42:16,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:42:20,543][12883] Updated weights for policy 0, policy_version 14820 (0.0031) [2024-06-17 23:42:21,994][12645] Fps is (10 sec: 40961.1, 60 sec: 40960.0, 300 sec: 40376.9). Total num frames: 242860032. Throughput: 0: 40425.4. Samples: 242972680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:42:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:42:24,740][12883] Updated weights for policy 0, policy_version 14830 (0.0044) [2024-06-17 23:42:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40413.8, 300 sec: 40376.8). Total num frames: 243056640. Throughput: 0: 40647.9. Samples: 243220920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-17 23:42:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:42:28,607][12883] Updated weights for policy 0, policy_version 14840 (0.0038) [2024-06-17 23:42:31,994][12645] Fps is (10 sec: 37682.4, 60 sec: 40140.6, 300 sec: 40321.3). Total num frames: 243236864. 
Throughput: 0: 40460.8. Samples: 243334920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-17 23:42:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:42:33,278][12883] Updated weights for policy 0, policy_version 14850 (0.0035) [2024-06-17 23:42:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.8, 300 sec: 40376.8). Total num frames: 243449856. Throughput: 0: 40524.4. Samples: 243580500. Policy #0 lag: (min: 1.0, avg: 9.5, max: 20.0) [2024-06-17 23:42:36,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-17 23:42:37,027][12883] Updated weights for policy 0, policy_version 14860 (0.0037) [2024-06-17 23:42:37,034][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014860_243466240.pth... [2024-06-17 23:42:37,097][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014267_233750528.pth [2024-06-17 23:42:41,323][12883] Updated weights for policy 0, policy_version 14870 (0.0032) [2024-06-17 23:42:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40413.8, 300 sec: 40432.4). Total num frames: 243662848. Throughput: 0: 40588.4. Samples: 243825680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 23:42:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:42:45,060][12883] Updated weights for policy 0, policy_version 14880 (0.0056) [2024-06-17 23:42:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 40432.4). Total num frames: 243875840. Throughput: 0: 40561.7. Samples: 243945980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 23:42:46,996][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:42:49,341][12883] Updated weights for policy 0, policy_version 14890 (0.0037) [2024-06-17 23:42:51,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40414.0, 300 sec: 40432.4). Total num frames: 244056064. Throughput: 0: 40715.7. Samples: 244191700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:42:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:42:52,965][12883] Updated weights for policy 0, policy_version 14900 (0.0033) [2024-06-17 23:42:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.9, 300 sec: 40432.4). Total num frames: 244252672. Throughput: 0: 40647.8. Samples: 244437900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:42:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:42:57,252][12883] Updated weights for policy 0, policy_version 14910 (0.0044) [2024-06-17 23:43:00,965][12883] Updated weights for policy 0, policy_version 14920 (0.0040) [2024-06-17 23:43:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 40543.4). Total num frames: 244498432. Throughput: 0: 40725.2. Samples: 244559620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-17 23:43:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:43:05,321][12883] Updated weights for policy 0, policy_version 14930 (0.0049) [2024-06-17 23:43:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.9, 300 sec: 40376.8). Total num frames: 244662272. Throughput: 0: 40556.0. Samples: 244797700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-17 23:43:06,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:43:09,514][12862] Signal inference workers to stop experience collection... (3450 times) [2024-06-17 23:43:09,547][12883] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-06-17 23:43:09,574][12862] Signal inference workers to resume experience collection... (3450 times) [2024-06-17 23:43:09,575][12883] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-06-17 23:43:09,579][12883] Updated weights for policy 0, policy_version 14940 (0.0045) [2024-06-17 23:43:11,996][12645] Fps is (10 sec: 37675.3, 60 sec: 40412.6, 300 sec: 40432.1). Total num frames: 244875264. Throughput: 0: 40309.7. 
Samples: 245034940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-17 23:43:11,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:43:13,460][12883] Updated weights for policy 0, policy_version 14950 (0.0035) [2024-06-17 23:43:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40376.8). Total num frames: 245055488. Throughput: 0: 40480.2. Samples: 245156520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-17 23:43:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:43:17,511][12883] Updated weights for policy 0, policy_version 14960 (0.0036) [2024-06-17 23:43:21,525][12883] Updated weights for policy 0, policy_version 14970 (0.0035) [2024-06-17 23:43:21,994][12645] Fps is (10 sec: 40968.6, 60 sec: 40413.8, 300 sec: 40488.2). Total num frames: 245284864. Throughput: 0: 40428.9. Samples: 245399800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-17 23:43:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:43:25,402][12883] Updated weights for policy 0, policy_version 14980 (0.0035) [2024-06-17 23:43:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40414.0, 300 sec: 40432.4). Total num frames: 245481472. Throughput: 0: 40426.8. Samples: 245644880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:43:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:43:29,370][12883] Updated weights for policy 0, policy_version 14990 (0.0030) [2024-06-17 23:43:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 245678080. Throughput: 0: 40358.2. Samples: 245762100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:43:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:43:33,481][12883] Updated weights for policy 0, policy_version 15000 (0.0034) [2024-06-17 23:43:36,996][12645] Fps is (10 sec: 39312.3, 60 sec: 40412.4, 300 sec: 40376.5). Total num frames: 245874688. Throughput: 0: 40287.2. Samples: 246004720. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-17 23:43:36,997][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:43:37,394][12883] Updated weights for policy 0, policy_version 15010 (0.0028) [2024-06-17 23:43:41,582][12883] Updated weights for policy 0, policy_version 15020 (0.0041) [2024-06-17 23:43:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 246104064. Throughput: 0: 40193.7. Samples: 246246620. Policy #0 lag: (min: 1.0, avg: 11.4, max: 24.0) [2024-06-17 23:43:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:43:45,354][12883] Updated weights for policy 0, policy_version 15030 (0.0035) [2024-06-17 23:43:46,994][12645] Fps is (10 sec: 40969.0, 60 sec: 40140.8, 300 sec: 40432.4). Total num frames: 246284288. Throughput: 0: 40285.8. Samples: 246372480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:43:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:43:49,754][12883] Updated weights for policy 0, policy_version 15040 (0.0039) [2024-06-17 23:43:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40686.8, 300 sec: 40432.4). Total num frames: 246497280. Throughput: 0: 40306.6. Samples: 246611500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:43:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:43:53,619][12883] Updated weights for policy 0, policy_version 15050 (0.0048) [2024-06-17 23:43:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 246693888. Throughput: 0: 40544.7. Samples: 246859360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:43:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:43:57,873][12883] Updated weights for policy 0, policy_version 15060 (0.0038) [2024-06-17 23:44:01,648][12883] Updated weights for policy 0, policy_version 15070 (0.0032) [2024-06-17 23:44:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40140.8, 300 sec: 40487.9). 
Total num frames: 246906880. Throughput: 0: 40504.0. Samples: 246979200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:44:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:44:05,793][12883] Updated weights for policy 0, policy_version 15080 (0.0029) [2024-06-17 23:44:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40959.9, 300 sec: 40487.9). Total num frames: 247119872. Throughput: 0: 40535.1. Samples: 247223880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:44:06,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-17 23:44:09,758][12883] Updated weights for policy 0, policy_version 15090 (0.0028) [2024-06-17 23:44:11,996][12645] Fps is (10 sec: 37674.5, 60 sec: 40140.7, 300 sec: 40432.1). Total num frames: 247283712. Throughput: 0: 40545.9. Samples: 247469540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 23:44:11,997][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:44:13,906][12883] Updated weights for policy 0, policy_version 15100 (0.0043) [2024-06-17 23:44:16,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40960.0, 300 sec: 40432.4). Total num frames: 247513088. Throughput: 0: 40379.2. Samples: 247579160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-17 23:44:16,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:44:17,898][12883] Updated weights for policy 0, policy_version 15110 (0.0029) [2024-06-17 23:44:21,994][12645] Fps is (10 sec: 42608.6, 60 sec: 40414.0, 300 sec: 40432.7). Total num frames: 247709696. Throughput: 0: 40476.3. Samples: 247826060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-17 23:44:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:44:22,074][12883] Updated weights for policy 0, policy_version 15120 (0.0034) [2024-06-17 23:44:23,851][12862] Signal inference workers to stop experience collection... (3500 times) [2024-06-17 23:44:23,851][12862] Signal inference workers to resume experience collection... 
(3500 times) [2024-06-17 23:44:23,864][12883] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-06-17 23:44:23,890][12883] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-06-17 23:44:26,230][12883] Updated weights for policy 0, policy_version 15130 (0.0030) [2024-06-17 23:44:26,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40140.7, 300 sec: 40376.8). Total num frames: 247889920. Throughput: 0: 40513.8. Samples: 248069740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-17 23:44:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:44:29,975][12883] Updated weights for policy 0, policy_version 15140 (0.0034) [2024-06-17 23:44:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40960.0, 300 sec: 40543.5). Total num frames: 248135680. Throughput: 0: 40373.8. Samples: 248189300. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-17 23:44:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:44:34,233][12883] Updated weights for policy 0, policy_version 15150 (0.0025) [2024-06-17 23:44:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40142.4, 300 sec: 40432.4). Total num frames: 248283136. Throughput: 0: 40453.5. Samples: 248431900. Policy #0 lag: (min: 2.0, avg: 10.3, max: 22.0) [2024-06-17 23:44:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:44:37,067][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015155_248299520.pth... [2024-06-17 23:44:37,130][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014563_238600192.pth [2024-06-17 23:44:38,173][12883] Updated weights for policy 0, policy_version 15160 (0.0044) [2024-06-17 23:44:41,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40140.8, 300 sec: 40322.2). Total num frames: 248512512. Throughput: 0: 40259.0. Samples: 248671020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-17 23:44:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:44:42,833][12883] Updated weights for policy 0, policy_version 15170 (0.0060) [2024-06-17 23:44:46,662][12883] Updated weights for policy 0, policy_version 15180 (0.0046) [2024-06-17 23:44:46,994][12645] Fps is (10 sec: 44235.6, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 248725504. Throughput: 0: 40408.7. Samples: 248797600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-17 23:44:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:44:50,971][12883] Updated weights for policy 0, policy_version 15190 (0.0044) [2024-06-17 23:44:51,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39867.7, 300 sec: 40432.4). Total num frames: 248889344. Throughput: 0: 40333.8. Samples: 249038900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-17 23:44:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:44:54,803][12883] Updated weights for policy 0, policy_version 15200 (0.0039) [2024-06-17 23:44:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40413.8, 300 sec: 40432.4). Total num frames: 249118720. Throughput: 0: 40056.7. Samples: 249272000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:44:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:45:00,057][12883] Updated weights for policy 0, policy_version 15210 (0.0030) [2024-06-17 23:45:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 249331712. Throughput: 0: 40399.9. Samples: 249397160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:45:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:45:02,859][12883] Updated weights for policy 0, policy_version 15220 (0.0036) [2024-06-17 23:45:06,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39594.7, 300 sec: 40265.7). Total num frames: 249495552. Throughput: 0: 40182.1. Samples: 249634260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 19.0) [2024-06-17 23:45:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:45:07,733][12883] Updated weights for policy 0, policy_version 15230 (0.0045) [2024-06-17 23:45:10,831][12883] Updated weights for policy 0, policy_version 15240 (0.0034) [2024-06-17 23:45:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40961.6, 300 sec: 40487.9). Total num frames: 249741312. Throughput: 0: 40099.2. Samples: 249874200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 19.0) [2024-06-17 23:45:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:45:15,694][12883] Updated weights for policy 0, policy_version 15250 (0.0030) [2024-06-17 23:45:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40140.8, 300 sec: 40487.9). Total num frames: 249921536. Throughput: 0: 40335.6. Samples: 250004400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 23:45:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:45:18,623][12883] Updated weights for policy 0, policy_version 15260 (0.0029) [2024-06-17 23:45:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 250118144. Throughput: 0: 40135.1. Samples: 250237980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-17 23:45:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:45:23,766][12883] Updated weights for policy 0, policy_version 15270 (0.0031) [2024-06-17 23:45:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 250331136. Throughput: 0: 40275.5. Samples: 250483420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:45:26,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:45:27,214][12883] Updated weights for policy 0, policy_version 15280 (0.0044) [2024-06-17 23:45:31,809][12883] Updated weights for policy 0, policy_version 15290 (0.0029) [2024-06-17 23:45:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.7, 300 sec: 40433.2). 
Total num frames: 250511360. Throughput: 0: 40122.4. Samples: 250603100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:45:31,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:45:35,335][12883] Updated weights for policy 0, policy_version 15300 (0.0043) [2024-06-17 23:45:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 40487.9). Total num frames: 250740736. Throughput: 0: 40213.8. Samples: 250848520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:45:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:45:39,972][12883] Updated weights for policy 0, policy_version 15310 (0.0034) [2024-06-17 23:45:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 250937344. Throughput: 0: 40450.7. Samples: 251092280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 23:45:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:45:43,523][12883] Updated weights for policy 0, policy_version 15320 (0.0048) [2024-06-17 23:45:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.9, 300 sec: 40376.8). Total num frames: 251133952. Throughput: 0: 40286.2. Samples: 251210040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 23:45:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:45:47,919][12883] Updated weights for policy 0, policy_version 15330 (0.0043) [2024-06-17 23:45:49,668][12862] Signal inference workers to stop experience collection... (3550 times) [2024-06-17 23:45:49,708][12883] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-06-17 23:45:49,717][12862] Signal inference workers to resume experience collection... 
(3550 times) [2024-06-17 23:45:49,727][12883] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-06-17 23:45:51,623][12883] Updated weights for policy 0, policy_version 15340 (0.0033) [2024-06-17 23:45:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 40543.5). Total num frames: 251346944. Throughput: 0: 40398.3. Samples: 251452180. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-17 23:45:51,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:45:55,799][12883] Updated weights for policy 0, policy_version 15350 (0.0040) [2024-06-17 23:45:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 40376.8). Total num frames: 251510784. Throughput: 0: 40467.1. Samples: 251695220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 20.0) [2024-06-17 23:45:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:45:59,902][12883] Updated weights for policy 0, policy_version 15360 (0.0036) [2024-06-17 23:46:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40140.8, 300 sec: 40376.9). Total num frames: 251740160. Throughput: 0: 40207.6. Samples: 251813740. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-06-17 23:46:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:46:03,823][12883] Updated weights for policy 0, policy_version 15370 (0.0037) [2024-06-17 23:46:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 251936768. Throughput: 0: 40410.1. Samples: 252056440. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-06-17 23:46:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:46:07,845][12883] Updated weights for policy 0, policy_version 15380 (0.0045) [2024-06-17 23:46:11,968][12883] Updated weights for policy 0, policy_version 15390 (0.0038) [2024-06-17 23:46:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.8, 300 sec: 40432.4). Total num frames: 252149760. Throughput: 0: 40267.1. Samples: 252295440. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-06-17 23:46:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:46:16,501][12883] Updated weights for policy 0, policy_version 15400 (0.0034) [2024-06-17 23:46:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.8, 300 sec: 40432.4). Total num frames: 252329984. Throughput: 0: 40152.4. Samples: 252409960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:46:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:46:20,258][12883] Updated weights for policy 0, policy_version 15410 (0.0043) [2024-06-17 23:46:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40140.7, 300 sec: 40321.3). Total num frames: 252526592. Throughput: 0: 39949.4. Samples: 252646240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:46:22,003][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:46:24,780][12883] Updated weights for policy 0, policy_version 15420 (0.0041) [2024-06-17 23:46:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 40376.8). Total num frames: 252739584. Throughput: 0: 39795.2. Samples: 252883060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:46:27,003][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:46:28,565][12883] Updated weights for policy 0, policy_version 15430 (0.0057) [2024-06-17 23:46:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.7, 300 sec: 40321.3). Total num frames: 252919808. Throughput: 0: 39928.0. Samples: 253006800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:46:31,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:46:32,798][12883] Updated weights for policy 0, policy_version 15440 (0.0037) [2024-06-17 23:46:36,908][12883] Updated weights for policy 0, policy_version 15450 (0.0042) [2024-06-17 23:46:36,996][12645] Fps is (10 sec: 39312.2, 60 sec: 39866.3, 300 sec: 40321.0). Total num frames: 253132800. Throughput: 0: 39874.0. Samples: 253246600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:46:36,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:46:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015450_253132800.pth... [2024-06-17 23:46:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000014860_243466240.pth [2024-06-17 23:46:41,165][12883] Updated weights for policy 0, policy_version 15460 (0.0032) [2024-06-17 23:46:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 39867.7, 300 sec: 40376.8). Total num frames: 253329408. Throughput: 0: 39698.2. Samples: 253481640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:46:41,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:46:45,176][12883] Updated weights for policy 0, policy_version 15470 (0.0038) [2024-06-17 23:46:46,994][12645] Fps is (10 sec: 37692.2, 60 sec: 39594.7, 300 sec: 40265.8). Total num frames: 253509632. Throughput: 0: 39724.5. Samples: 253601340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-17 23:46:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:46:49,108][12883] Updated weights for policy 0, policy_version 15480 (0.0054) [2024-06-17 23:46:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39594.6, 300 sec: 40265.8). Total num frames: 253722624. Throughput: 0: 39713.7. Samples: 253843560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:46:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:46:53,308][12883] Updated weights for policy 0, policy_version 15490 (0.0051) [2024-06-17 23:46:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 253919232. Throughput: 0: 39895.2. Samples: 254090720. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:46:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:46:57,431][12883] Updated weights for policy 0, policy_version 15500 (0.0051) [2024-06-17 23:47:01,265][12883] Updated weights for policy 0, policy_version 15510 (0.0041) [2024-06-17 23:47:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 254132224. Throughput: 0: 39923.1. Samples: 254206500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:47:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:47:05,440][12883] Updated weights for policy 0, policy_version 15520 (0.0043) [2024-06-17 23:47:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40140.9, 300 sec: 40321.3). Total num frames: 254345216. Throughput: 0: 40048.5. Samples: 254448420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:47:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:47:09,681][12883] Updated weights for policy 0, policy_version 15530 (0.0032) [2024-06-17 23:47:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 39594.7, 300 sec: 40265.8). Total num frames: 254525440. Throughput: 0: 40251.9. Samples: 254694400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-17 23:47:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:47:13,455][12883] Updated weights for policy 0, policy_version 15540 (0.0031) [2024-06-17 23:47:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 40265.8). Total num frames: 254738432. Throughput: 0: 40115.6. Samples: 254812000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-17 23:47:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:47:17,573][12883] Updated weights for policy 0, policy_version 15550 (0.0040) [2024-06-17 23:47:21,523][12883] Updated weights for policy 0, policy_version 15560 (0.0043) [2024-06-17 23:47:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40413.8, 300 sec: 40321.3). 
Total num frames: 254951424. Throughput: 0: 40331.8. Samples: 255061440. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-17 23:47:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:47:25,558][12883] Updated weights for policy 0, policy_version 15570 (0.0058) [2024-06-17 23:47:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 255131648. Throughput: 0: 40345.0. Samples: 255297160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-17 23:47:26,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:47:29,601][12883] Updated weights for policy 0, policy_version 15580 (0.0040) [2024-06-17 23:47:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40686.9, 300 sec: 40376.8). Total num frames: 255361024. Throughput: 0: 40387.9. Samples: 255418800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-17 23:47:31,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:47:34,087][12883] Updated weights for policy 0, policy_version 15590 (0.0030) [2024-06-17 23:47:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39596.2, 300 sec: 40154.7). Total num frames: 255508480. Throughput: 0: 40419.7. Samples: 255662440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:47:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:47:38,130][12883] Updated weights for policy 0, policy_version 15600 (0.0046) [2024-06-17 23:47:38,877][12862] Signal inference workers to stop experience collection... (3600 times) [2024-06-17 23:47:38,933][12883] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-06-17 23:47:38,932][12862] Signal inference workers to resume experience collection... (3600 times) [2024-06-17 23:47:38,951][12883] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-06-17 23:47:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 255737856. Throughput: 0: 40248.8. Samples: 255901920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:47:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:47:42,078][12883] Updated weights for policy 0, policy_version 15610 (0.0034) [2024-06-17 23:47:46,423][12883] Updated weights for policy 0, policy_version 15620 (0.0041) [2024-06-17 23:47:46,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40413.7, 300 sec: 40265.7). Total num frames: 255934464. Throughput: 0: 40378.5. Samples: 256023540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:47:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:47:50,467][12883] Updated weights for policy 0, policy_version 15630 (0.0045) [2024-06-17 23:47:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40413.9, 300 sec: 40321.3). Total num frames: 256147456. Throughput: 0: 40350.2. Samples: 256264180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-17 23:47:51,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:47:54,542][12883] Updated weights for policy 0, policy_version 15640 (0.0040) [2024-06-17 23:47:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 256344064. Throughput: 0: 40141.4. Samples: 256500760. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-17 23:47:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:47:58,519][12883] Updated weights for policy 0, policy_version 15650 (0.0038) [2024-06-17 23:48:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40140.7, 300 sec: 40265.7). Total num frames: 256540672. Throughput: 0: 40218.5. Samples: 256621840. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-17 23:48:01,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:48:02,969][12883] Updated weights for policy 0, policy_version 15660 (0.0034) [2024-06-17 23:48:06,570][12883] Updated weights for policy 0, policy_version 15670 (0.0040) [2024-06-17 23:48:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 40266.1). 
Total num frames: 256753664. Throughput: 0: 39994.8. Samples: 256861200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:48:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:48:11,174][12883] Updated weights for policy 0, policy_version 15680 (0.0035) [2024-06-17 23:48:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.8, 300 sec: 40321.3). Total num frames: 256950272. Throughput: 0: 40245.6. Samples: 257108220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:48:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:48:14,382][12883] Updated weights for policy 0, policy_version 15690 (0.0038) [2024-06-17 23:48:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 257146880. Throughput: 0: 40121.0. Samples: 257224240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:48:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:48:18,987][12883] Updated weights for policy 0, policy_version 15700 (0.0038) [2024-06-17 23:48:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40140.8, 300 sec: 40265.7). Total num frames: 257359872. Throughput: 0: 40190.2. Samples: 257471000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-17 23:48:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:48:22,649][12883] Updated weights for policy 0, policy_version 15710 (0.0052) [2024-06-17 23:48:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 257540096. Throughput: 0: 40156.1. Samples: 257708940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-17 23:48:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:48:27,349][12883] Updated weights for policy 0, policy_version 15720 (0.0041) [2024-06-17 23:48:30,558][12883] Updated weights for policy 0, policy_version 15730 (0.0030) [2024-06-17 23:48:31,996][12645] Fps is (10 sec: 39313.0, 60 sec: 39866.3, 300 sec: 40265.8). 
Total num frames: 257753088. Throughput: 0: 40088.3. Samples: 257827600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-17 23:48:31,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:48:35,275][12883] Updated weights for policy 0, policy_version 15740 (0.0033) [2024-06-17 23:48:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 257949696. Throughput: 0: 40209.4. Samples: 258073600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-17 23:48:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:48:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015744_257949696.pth... [2024-06-17 23:48:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015155_248299520.pth [2024-06-17 23:48:38,549][12883] Updated weights for policy 0, policy_version 15750 (0.0035) [2024-06-17 23:48:41,994][12645] Fps is (10 sec: 39330.4, 60 sec: 40140.9, 300 sec: 40210.2). Total num frames: 258146304. Throughput: 0: 40277.8. Samples: 258313260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:48:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:48:43,466][12883] Updated weights for policy 0, policy_version 15760 (0.0048) [2024-06-17 23:48:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 258342912. Throughput: 0: 40288.0. Samples: 258434800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:48:46,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:48:47,374][12883] Updated weights for policy 0, policy_version 15770 (0.0044) [2024-06-17 23:48:51,643][12883] Updated weights for policy 0, policy_version 15780 (0.0030) [2024-06-17 23:48:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 258555904. Throughput: 0: 40365.3. Samples: 258677640. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:48:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:48:55,355][12883] Updated weights for policy 0, policy_version 15790 (0.0047) [2024-06-17 23:48:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 258736128. Throughput: 0: 40244.5. Samples: 258919220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-17 23:48:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:48:59,673][12883] Updated weights for policy 0, policy_version 15800 (0.0033) [2024-06-17 23:49:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40414.0, 300 sec: 40154.7). Total num frames: 258965504. Throughput: 0: 40301.7. Samples: 259037820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-17 23:49:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:49:03,497][12883] Updated weights for policy 0, policy_version 15810 (0.0046) [2024-06-17 23:49:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.7, 300 sec: 40210.5). Total num frames: 259145728. Throughput: 0: 40168.0. Samples: 259278560. Policy #0 lag: (min: 1.0, avg: 12.9, max: 24.0) [2024-06-17 23:49:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:49:07,667][12883] Updated weights for policy 0, policy_version 15820 (0.0037) [2024-06-17 23:49:11,112][12862] Signal inference workers to stop experience collection... (3650 times) [2024-06-17 23:49:11,113][12862] Signal inference workers to resume experience collection... (3650 times) [2024-06-17 23:49:11,135][12883] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-06-17 23:49:11,136][12883] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-06-17 23:49:11,421][12883] Updated weights for policy 0, policy_version 15830 (0.0033) [2024-06-17 23:49:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 259375104. Throughput: 0: 40211.9. 
Samples: 259518480. Policy #0 lag: (min: 1.0, avg: 12.9, max: 24.0) [2024-06-17 23:49:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:49:16,256][12883] Updated weights for policy 0, policy_version 15840 (0.0042) [2024-06-17 23:49:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 259571712. Throughput: 0: 40334.4. Samples: 259642560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-17 23:49:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:49:19,421][12883] Updated weights for policy 0, policy_version 15850 (0.0036) [2024-06-17 23:49:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 259751936. Throughput: 0: 40210.6. Samples: 259883080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-17 23:49:21,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:49:24,225][12883] Updated weights for policy 0, policy_version 15860 (0.0041) [2024-06-17 23:49:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40687.0, 300 sec: 40154.7). Total num frames: 259981312. Throughput: 0: 40276.9. Samples: 260125720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-17 23:49:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:49:27,499][12883] Updated weights for policy 0, policy_version 15870 (0.0034) [2024-06-17 23:49:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40142.3, 300 sec: 40265.8). Total num frames: 260161536. Throughput: 0: 40306.4. Samples: 260248580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-17 23:49:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:49:32,128][12883] Updated weights for policy 0, policy_version 15880 (0.0038) [2024-06-17 23:49:35,799][12883] Updated weights for policy 0, policy_version 15890 (0.0042) [2024-06-17 23:49:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 260374528. Throughput: 0: 40213.8. Samples: 260487260. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-17 23:49:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:49:40,117][12883] Updated weights for policy 0, policy_version 15900 (0.0041) [2024-06-17 23:49:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.8, 300 sec: 40154.7). Total num frames: 260571136. Throughput: 0: 40106.6. Samples: 260724020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-17 23:49:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:49:44,328][12883] Updated weights for policy 0, policy_version 15910 (0.0036) [2024-06-17 23:49:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40140.9, 300 sec: 40210.3). Total num frames: 260751360. Throughput: 0: 40103.2. Samples: 260842460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-17 23:49:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:49:48,407][12883] Updated weights for policy 0, policy_version 15920 (0.0038) [2024-06-17 23:49:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 260964352. Throughput: 0: 40167.2. Samples: 261086080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:49:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:49:52,327][12883] Updated weights for policy 0, policy_version 15930 (0.0036) [2024-06-17 23:49:56,963][12883] Updated weights for policy 0, policy_version 15940 (0.0038) [2024-06-17 23:49:56,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40413.8, 300 sec: 40099.1). Total num frames: 261160960. Throughput: 0: 40305.3. Samples: 261332220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-17 23:49:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:50:00,323][12883] Updated weights for policy 0, policy_version 15950 (0.0038) [2024-06-17 23:50:01,994][12645] Fps is (10 sec: 42597.4, 60 sec: 40413.7, 300 sec: 40321.3). Total num frames: 261390336. Throughput: 0: 40160.4. Samples: 261449780. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-17 23:50:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:50:04,881][12883] Updated weights for policy 0, policy_version 15960 (0.0035) [2024-06-17 23:50:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40413.7, 300 sec: 40099.1). Total num frames: 261570560. Throughput: 0: 40265.6. Samples: 261695040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-17 23:50:06,995][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:50:08,252][12883] Updated weights for policy 0, policy_version 15970 (0.0031) [2024-06-17 23:50:11,994][12645] Fps is (10 sec: 37683.9, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 261767168. Throughput: 0: 40132.8. Samples: 261931700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-17 23:50:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:50:12,841][12883] Updated weights for policy 0, policy_version 15980 (0.0037) [2024-06-17 23:50:16,238][12883] Updated weights for policy 0, policy_version 15990 (0.0048) [2024-06-17 23:50:16,994][12645] Fps is (10 sec: 40961.1, 60 sec: 40140.9, 300 sec: 40210.2). Total num frames: 261980160. Throughput: 0: 40148.0. Samples: 262055240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:50:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:50:21,439][12883] Updated weights for policy 0, policy_version 16000 (0.0050) [2024-06-17 23:50:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 40099.2). Total num frames: 262160384. Throughput: 0: 40039.6. Samples: 262289040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-17 23:50:21,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:50:24,560][12883] Updated weights for policy 0, policy_version 16010 (0.0033) [2024-06-17 23:50:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40140.7, 300 sec: 40265.7). Total num frames: 262389760. Throughput: 0: 40249.7. Samples: 262535260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:50:26,995][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:50:29,423][12883] Updated weights for policy 0, policy_version 16020 (0.0039) [2024-06-17 23:50:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40140.8, 300 sec: 40099.2). Total num frames: 262569984. Throughput: 0: 40357.2. Samples: 262658540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:50:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:50:32,883][12883] Updated weights for policy 0, policy_version 16030 (0.0045) [2024-06-17 23:50:36,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 262782976. Throughput: 0: 40235.5. Samples: 262896680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:50:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:50:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016039_262782976.pth... [2024-06-17 23:50:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015450_253132800.pth [2024-06-17 23:50:37,340][12883] Updated weights for policy 0, policy_version 16040 (0.0035) [2024-06-17 23:50:39,426][12862] Signal inference workers to stop experience collection... (3700 times) [2024-06-17 23:50:39,448][12883] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-06-17 23:50:39,485][12862] Signal inference workers to resume experience collection... (3700 times) [2024-06-17 23:50:39,485][12883] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-06-17 23:50:41,048][12883] Updated weights for policy 0, policy_version 16050 (0.0033) [2024-06-17 23:50:41,996][12645] Fps is (10 sec: 42588.9, 60 sec: 40412.4, 300 sec: 40209.9). Total num frames: 262995968. Throughput: 0: 40106.1. Samples: 263137080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:50:41,997][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:50:45,373][12883] Updated weights for policy 0, policy_version 16060 (0.0048) [2024-06-17 23:50:46,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40140.6, 300 sec: 40043.6). Total num frames: 263159808. Throughput: 0: 40256.4. Samples: 263261320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:50:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:50:49,239][12883] Updated weights for policy 0, policy_version 16070 (0.0038) [2024-06-17 23:50:51,996][12645] Fps is (10 sec: 39321.7, 60 sec: 40412.3, 300 sec: 40265.5). Total num frames: 263389184. Throughput: 0: 40096.9. Samples: 263499480. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-17 23:50:51,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:50:53,499][12883] Updated weights for policy 0, policy_version 16080 (0.0041) [2024-06-17 23:50:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40140.8, 300 sec: 40099.1). Total num frames: 263569408. Throughput: 0: 40273.7. Samples: 263744020. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-17 23:50:56,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:50:57,859][12883] Updated weights for policy 0, policy_version 16090 (0.0048) [2024-06-17 23:51:01,511][12883] Updated weights for policy 0, policy_version 16100 (0.0030) [2024-06-17 23:51:01,994][12645] Fps is (10 sec: 39330.7, 60 sec: 39867.9, 300 sec: 40154.7). Total num frames: 263782400. Throughput: 0: 40024.5. Samples: 263856340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-17 23:51:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:51:05,951][12883] Updated weights for policy 0, policy_version 16110 (0.0041) [2024-06-17 23:51:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 40687.1, 300 sec: 40210.2). Total num frames: 264011776. Throughput: 0: 40302.6. Samples: 264102660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-17 23:51:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:51:09,516][12883] Updated weights for policy 0, policy_version 16120 (0.0050) [2024-06-17 23:51:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.8, 300 sec: 40154.7). Total num frames: 264175616. Throughput: 0: 40169.0. Samples: 264342860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 23:51:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:51:13,956][12883] Updated weights for policy 0, policy_version 16130 (0.0051) [2024-06-17 23:51:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.8, 300 sec: 40265.8). Total num frames: 264404992. Throughput: 0: 40094.6. Samples: 264462800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 23:51:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:51:17,618][12883] Updated weights for policy 0, policy_version 16140 (0.0028) [2024-06-17 23:51:21,994][12645] Fps is (10 sec: 37682.9, 60 sec: 39867.7, 300 sec: 40043.6). Total num frames: 264552448. Throughput: 0: 40189.7. Samples: 264705220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-17 23:51:21,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-17 23:51:22,488][12883] Updated weights for policy 0, policy_version 16150 (0.0028) [2024-06-17 23:51:26,260][12883] Updated weights for policy 0, policy_version 16160 (0.0037) [2024-06-17 23:51:26,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 264781824. Throughput: 0: 40156.5. Samples: 264944040. Policy #0 lag: (min: 2.0, avg: 10.4, max: 25.0) [2024-06-17 23:51:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:51:30,589][12883] Updated weights for policy 0, policy_version 16170 (0.0040) [2024-06-17 23:51:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 40686.9, 300 sec: 40266.1). Total num frames: 265011200. Throughput: 0: 40218.3. Samples: 265071140. 
Policy #0 lag: (min: 2.0, avg: 10.4, max: 25.0) [2024-06-17 23:51:31,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:51:34,443][12883] Updated weights for policy 0, policy_version 16180 (0.0026) [2024-06-17 23:51:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.7, 300 sec: 40154.7). Total num frames: 265175040. Throughput: 0: 40219.8. Samples: 265309280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:51:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:51:38,707][12883] Updated weights for policy 0, policy_version 16190 (0.0040) [2024-06-17 23:51:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40142.3, 300 sec: 40321.3). Total num frames: 265404416. Throughput: 0: 39920.9. Samples: 265540460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:51:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:51:42,451][12883] Updated weights for policy 0, policy_version 16200 (0.0046) [2024-06-17 23:51:46,793][12883] Updated weights for policy 0, policy_version 16210 (0.0038) [2024-06-17 23:51:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40414.0, 300 sec: 40210.2). Total num frames: 265584640. Throughput: 0: 40227.9. Samples: 265666600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-17 23:51:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:51:50,961][12883] Updated weights for policy 0, policy_version 16220 (0.0033) [2024-06-17 23:51:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 39869.2, 300 sec: 40210.2). Total num frames: 265781248. Throughput: 0: 40127.6. Samples: 265908400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-17 23:51:52,000][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:51:54,852][12883] Updated weights for policy 0, policy_version 16230 (0.0047) [2024-06-17 23:51:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40686.9, 300 sec: 40265.7). Total num frames: 266010624. Throughput: 0: 40061.2. Samples: 266145620. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-17 23:51:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:51:58,934][12883] Updated weights for policy 0, policy_version 16240 (0.0042) [2024-06-17 23:52:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 266174464. Throughput: 0: 40268.5. Samples: 266274880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-17 23:52:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:52:02,823][12862] Signal inference workers to stop experience collection... (3750 times) [2024-06-17 23:52:02,882][12883] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-06-17 23:52:02,891][12862] Signal inference workers to resume experience collection... (3750 times) [2024-06-17 23:52:02,895][12883] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-06-17 23:52:03,027][12883] Updated weights for policy 0, policy_version 16250 (0.0039) [2024-06-17 23:52:06,988][12883] Updated weights for policy 0, policy_version 16260 (0.0034) [2024-06-17 23:52:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.8, 300 sec: 40265.8). Total num frames: 266403840. Throughput: 0: 40094.8. Samples: 266509480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-17 23:52:06,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-17 23:52:11,081][12883] Updated weights for policy 0, policy_version 16270 (0.0045) [2024-06-17 23:52:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 266600448. Throughput: 0: 40102.0. Samples: 266748620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 23:52:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:52:15,734][12883] Updated weights for policy 0, policy_version 16280 (0.0054) [2024-06-17 23:52:16,994][12645] Fps is (10 sec: 37682.3, 60 sec: 39594.6, 300 sec: 40099.1). Total num frames: 266780672. Throughput: 0: 39914.2. 
Samples: 266867280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 23:52:16,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:52:19,289][12883] Updated weights for policy 0, policy_version 16290 (0.0035) [2024-06-17 23:52:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 266993664. Throughput: 0: 39968.1. Samples: 267107840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:52:21,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-17 23:52:23,879][12883] Updated weights for policy 0, policy_version 16300 (0.0032) [2024-06-17 23:52:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40140.9, 300 sec: 40099.2). Total num frames: 267190272. Throughput: 0: 40263.1. Samples: 267352300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:52:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:52:27,647][12883] Updated weights for policy 0, policy_version 16310 (0.0035) [2024-06-17 23:52:31,842][12883] Updated weights for policy 0, policy_version 16320 (0.0050) [2024-06-17 23:52:31,996][12645] Fps is (10 sec: 39312.7, 60 sec: 39593.3, 300 sec: 40265.5). Total num frames: 267386880. Throughput: 0: 39901.6. Samples: 267462260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-17 23:52:31,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:52:35,727][12883] Updated weights for policy 0, policy_version 16330 (0.0037) [2024-06-17 23:52:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40686.8, 300 sec: 40265.8). Total num frames: 267616256. Throughput: 0: 40078.1. Samples: 267711920. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-17 23:52:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:52:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016334_267616256.pth... 
[2024-06-17 23:52:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000015744_257949696.pth [2024-06-17 23:52:39,977][12883] Updated weights for policy 0, policy_version 16340 (0.0041) [2024-06-17 23:52:41,994][12645] Fps is (10 sec: 40968.6, 60 sec: 39867.7, 300 sec: 40210.2). Total num frames: 267796480. Throughput: 0: 40114.2. Samples: 267950760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-17 23:52:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:52:43,954][12883] Updated weights for policy 0, policy_version 16350 (0.0034) [2024-06-17 23:52:46,994][12645] Fps is (10 sec: 36045.3, 60 sec: 39867.7, 300 sec: 40099.1). Total num frames: 267976704. Throughput: 0: 39838.7. Samples: 268067620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-17 23:52:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:52:47,939][12883] Updated weights for policy 0, policy_version 16360 (0.0026) [2024-06-17 23:52:51,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 268189696. Throughput: 0: 40127.6. Samples: 268315220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-17 23:52:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:52:52,039][12883] Updated weights for policy 0, policy_version 16370 (0.0030) [2024-06-17 23:52:56,188][12883] Updated weights for policy 0, policy_version 16380 (0.0039) [2024-06-17 23:52:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39594.7, 300 sec: 40154.7). Total num frames: 268386304. Throughput: 0: 40207.0. Samples: 268557940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-17 23:52:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:52:59,830][12883] Updated weights for policy 0, policy_version 16390 (0.0042) [2024-06-17 23:53:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 268599296. Throughput: 0: 40081.5. Samples: 268670940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-17 23:53:01,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:53:04,201][12883] Updated weights for policy 0, policy_version 16400 (0.0033) [2024-06-17 23:53:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 268812288. Throughput: 0: 40324.8. Samples: 268922460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-17 23:53:06,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:53:07,949][12883] Updated weights for policy 0, policy_version 16410 (0.0035) [2024-06-17 23:53:11,994][12645] Fps is (10 sec: 39320.6, 60 sec: 39867.6, 300 sec: 40154.7). Total num frames: 268992512. Throughput: 0: 40291.8. Samples: 269165440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-17 23:53:12,003][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:53:12,363][12883] Updated weights for policy 0, policy_version 16420 (0.0044) [2024-06-17 23:53:16,089][12883] Updated weights for policy 0, policy_version 16430 (0.0033) [2024-06-17 23:53:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40414.0, 300 sec: 40154.7). Total num frames: 269205504. Throughput: 0: 40422.9. Samples: 269281200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-17 23:53:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:53:20,903][12883] Updated weights for policy 0, policy_version 16440 (0.0036) [2024-06-17 23:53:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40413.8, 300 sec: 40265.7). Total num frames: 269418496. Throughput: 0: 40429.8. Samples: 269531260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 23:53:21,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:53:24,027][12883] Updated weights for policy 0, policy_version 16450 (0.0030) [2024-06-17 23:53:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 40155.0). Total num frames: 269598720. Throughput: 0: 40300.0. Samples: 269764260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-17 23:53:26,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-17 23:53:28,977][12883] Updated weights for policy 0, policy_version 16460 (0.0036) [2024-06-17 23:53:31,996][12645] Fps is (10 sec: 40951.0, 60 sec: 40686.9, 300 sec: 40265.4). Total num frames: 269828096. Throughput: 0: 40497.1. Samples: 269890080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-17 23:53:31,997][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:53:32,165][12883] Updated weights for policy 0, policy_version 16470 (0.0037) [2024-06-17 23:53:36,854][12883] Updated weights for policy 0, policy_version 16480 (0.0028) [2024-06-17 23:53:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 39867.8, 300 sec: 40210.2). Total num frames: 270008320. Throughput: 0: 40408.3. Samples: 270133600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-17 23:53:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:53:40,419][12883] Updated weights for policy 0, policy_version 16490 (0.0032) [2024-06-17 23:53:41,994][12645] Fps is (10 sec: 39330.5, 60 sec: 40413.9, 300 sec: 40265.8). Total num frames: 270221312. Throughput: 0: 40298.3. Samples: 270371360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-17 23:53:41,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:53:44,943][12883] Updated weights for policy 0, policy_version 16500 (0.0037) [2024-06-17 23:53:45,710][12862] Signal inference workers to stop experience collection... (3800 times) [2024-06-17 23:53:45,762][12883] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-06-17 23:53:45,772][12862] Signal inference workers to resume experience collection... (3800 times) [2024-06-17 23:53:45,779][12883] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-06-17 23:53:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 40210.2). Total num frames: 270417920. Throughput: 0: 40623.5. 
Samples: 270499000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-17 23:53:47,000][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:53:48,380][12883] Updated weights for policy 0, policy_version 16510 (0.0038) [2024-06-17 23:53:51,996][12645] Fps is (10 sec: 37674.7, 60 sec: 40139.2, 300 sec: 40209.9). Total num frames: 270598144. Throughput: 0: 40298.0. Samples: 270735960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-17 23:53:51,997][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:53:53,643][12883] Updated weights for policy 0, policy_version 16520 (0.0033) [2024-06-17 23:53:56,653][12883] Updated weights for policy 0, policy_version 16530 (0.0037) [2024-06-17 23:53:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40265.7). Total num frames: 270843904. Throughput: 0: 40117.9. Samples: 270970740. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-17 23:53:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:54:01,661][12883] Updated weights for policy 0, policy_version 16540 (0.0042) [2024-06-17 23:54:01,994][12645] Fps is (10 sec: 39330.4, 60 sec: 39867.7, 300 sec: 40154.7). Total num frames: 270991360. Throughput: 0: 40242.2. Samples: 271092100. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-17 23:54:01,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:54:04,641][12883] Updated weights for policy 0, policy_version 16550 (0.0041) [2024-06-17 23:54:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 271237120. Throughput: 0: 40044.5. Samples: 271333260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-17 23:54:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:54:09,691][12883] Updated weights for policy 0, policy_version 16560 (0.0033) [2024-06-17 23:54:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40413.9, 300 sec: 40154.7). Total num frames: 271417344. Throughput: 0: 40247.5. Samples: 271575400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 23:54:11,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:54:12,955][12883] Updated weights for policy 0, policy_version 16570 (0.0056) [2024-06-17 23:54:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 271613952. Throughput: 0: 40158.4. Samples: 271697120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-17 23:54:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:54:18,098][12883] Updated weights for policy 0, policy_version 16580 (0.0034) [2024-06-17 23:54:21,562][12883] Updated weights for policy 0, policy_version 16590 (0.0029) [2024-06-17 23:54:21,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 271826944. Throughput: 0: 40018.8. Samples: 271934440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-17 23:54:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:54:25,888][12883] Updated weights for policy 0, policy_version 16600 (0.0040) [2024-06-17 23:54:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 272007168. Throughput: 0: 40246.7. Samples: 272182460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-17 23:54:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-17 23:54:29,568][12883] Updated weights for policy 0, policy_version 16610 (0.0034) [2024-06-17 23:54:31,994][12645] Fps is (10 sec: 39320.7, 60 sec: 39869.1, 300 sec: 40154.7). Total num frames: 272220160. Throughput: 0: 40003.8. Samples: 272299180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-17 23:54:31,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:54:33,778][12883] Updated weights for policy 0, policy_version 16620 (0.0041) [2024-06-17 23:54:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 272433152. Throughput: 0: 40074.5. Samples: 272539220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-17 23:54:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:54:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016628_272433152.pth... [2024-06-17 23:54:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016039_262782976.pth [2024-06-17 23:54:37,572][12883] Updated weights for policy 0, policy_version 16630 (0.0040) [2024-06-17 23:54:41,994][12645] Fps is (10 sec: 39322.5, 60 sec: 39867.8, 300 sec: 40210.2). Total num frames: 272613376. Throughput: 0: 40248.6. Samples: 272781920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-17 23:54:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:54:42,089][12883] Updated weights for policy 0, policy_version 16640 (0.0055) [2024-06-17 23:54:45,602][12883] Updated weights for policy 0, policy_version 16650 (0.0046) [2024-06-17 23:54:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 272826368. Throughput: 0: 40222.2. Samples: 272902100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-17 23:54:46,995][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:54:50,359][12883] Updated weights for policy 0, policy_version 16660 (0.0031) [2024-06-17 23:54:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40142.3, 300 sec: 40154.7). Total num frames: 273006592. Throughput: 0: 40353.3. Samples: 273149160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-17 23:54:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:54:53,434][12883] Updated weights for policy 0, policy_version 16670 (0.0039) [2024-06-17 23:54:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.7, 300 sec: 40099.2). Total num frames: 273219584. Throughput: 0: 40387.2. Samples: 273392820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-17 23:54:56,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:54:58,119][12883] Updated weights for policy 0, policy_version 16680 (0.0028) [2024-06-17 23:55:01,234][12883] Updated weights for policy 0, policy_version 16690 (0.0046) [2024-06-17 23:55:01,996][12645] Fps is (10 sec: 44227.3, 60 sec: 40958.5, 300 sec: 40265.5). Total num frames: 273448960. Throughput: 0: 40434.5. Samples: 273516760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-17 23:55:01,996][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:55:06,112][12883] Updated weights for policy 0, policy_version 16700 (0.0042) [2024-06-17 23:55:06,995][12645] Fps is (10 sec: 39318.6, 60 sec: 39594.1, 300 sec: 40154.6). Total num frames: 273612800. Throughput: 0: 40425.9. Samples: 273753640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-17 23:55:06,995][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:55:09,662][12883] Updated weights for policy 0, policy_version 16710 (0.0044) [2024-06-17 23:55:11,994][12645] Fps is (10 sec: 36052.7, 60 sec: 39867.8, 300 sec: 40099.1). Total num frames: 273809408. Throughput: 0: 40345.2. Samples: 273998000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:55:11,996][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:55:12,457][12862] Signal inference workers to stop experience collection... (3850 times) [2024-06-17 23:55:12,512][12883] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-06-17 23:55:12,516][12862] Signal inference workers to resume experience collection... (3850 times) [2024-06-17 23:55:12,530][12883] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-06-17 23:55:14,759][12883] Updated weights for policy 0, policy_version 16720 (0.0025) [2024-06-17 23:55:16,994][12645] Fps is (10 sec: 44240.1, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 274055168. Throughput: 0: 40472.5. 
Samples: 274120440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-17 23:55:16,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:55:17,636][12883] Updated weights for policy 0, policy_version 16730 (0.0048) [2024-06-17 23:55:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40413.8, 300 sec: 40210.2). Total num frames: 274251776. Throughput: 0: 40557.7. Samples: 274364320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:55:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:55:22,577][12883] Updated weights for policy 0, policy_version 16740 (0.0033) [2024-06-17 23:55:25,861][12883] Updated weights for policy 0, policy_version 16750 (0.0035) [2024-06-17 23:55:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40686.8, 300 sec: 40265.8). Total num frames: 274448384. Throughput: 0: 40359.0. Samples: 274598080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:55:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:55:30,604][12883] Updated weights for policy 0, policy_version 16760 (0.0037) [2024-06-17 23:55:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.9, 300 sec: 40210.2). Total num frames: 274644992. Throughput: 0: 40489.7. Samples: 274724140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-17 23:55:31,995][12645] Avg episode reward: [(0, '0.011')] [2024-06-17 23:55:33,884][12883] Updated weights for policy 0, policy_version 16770 (0.0040) [2024-06-17 23:55:36,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39867.7, 300 sec: 40099.5). Total num frames: 274825216. Throughput: 0: 40441.4. Samples: 274969020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 23:55:36,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-17 23:55:38,484][12883] Updated weights for policy 0, policy_version 16780 (0.0047) [2024-06-17 23:55:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 40960.0, 300 sec: 40376.9). Total num frames: 275070976. Throughput: 0: 40306.8. Samples: 275206620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-17 23:55:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:55:42,002][12883] Updated weights for policy 0, policy_version 16790 (0.0026) [2024-06-17 23:55:46,369][12883] Updated weights for policy 0, policy_version 16800 (0.0038) [2024-06-17 23:55:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40686.9, 300 sec: 40266.1). Total num frames: 275267584. Throughput: 0: 40430.8. Samples: 275336060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:55:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:55:50,401][12883] Updated weights for policy 0, policy_version 16810 (0.0042) [2024-06-17 23:55:51,994][12645] Fps is (10 sec: 37682.5, 60 sec: 40686.9, 300 sec: 40265.7). Total num frames: 275447808. Throughput: 0: 40374.0. Samples: 275570440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:55:51,995][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:55:54,318][12883] Updated weights for policy 0, policy_version 16820 (0.0027) [2024-06-17 23:55:56,997][12645] Fps is (10 sec: 40947.3, 60 sec: 40957.9, 300 sec: 40320.9). Total num frames: 275677184. Throughput: 0: 40394.5. Samples: 275815880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:55:56,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:55:58,623][12883] Updated weights for policy 0, policy_version 16830 (0.0034) [2024-06-17 23:56:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40142.3, 300 sec: 40154.7). Total num frames: 275857408. Throughput: 0: 40413.1. Samples: 275939020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:56:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:56:02,938][12883] Updated weights for policy 0, policy_version 16840 (0.0036) [2024-06-17 23:56:06,994][12645] Fps is (10 sec: 37695.0, 60 sec: 40687.5, 300 sec: 40265.8). Total num frames: 276054016. Throughput: 0: 40313.0. Samples: 276178400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:56:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:56:07,153][12883] Updated weights for policy 0, policy_version 16850 (0.0040) [2024-06-17 23:56:11,060][12883] Updated weights for policy 0, policy_version 16860 (0.0039) [2024-06-17 23:56:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40686.9, 300 sec: 40154.7). Total num frames: 276250624. Throughput: 0: 40467.6. Samples: 276419120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-17 23:56:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:56:15,236][12883] Updated weights for policy 0, policy_version 16870 (0.0041) [2024-06-17 23:56:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40140.8, 300 sec: 40376.8). Total num frames: 276463616. Throughput: 0: 40430.7. Samples: 276543520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-17 23:56:16,995][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:56:19,318][12883] Updated weights for policy 0, policy_version 16880 (0.0040) [2024-06-17 23:56:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40141.0, 300 sec: 40265.8). Total num frames: 276660224. Throughput: 0: 40390.3. Samples: 276786580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:56:21,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:56:23,157][12883] Updated weights for policy 0, policy_version 16890 (0.0034) [2024-06-17 23:56:26,994][12645] Fps is (10 sec: 39322.6, 60 sec: 40140.9, 300 sec: 40154.7). Total num frames: 276856832. Throughput: 0: 40605.4. Samples: 277033860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:56:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:56:27,019][12862] Signal inference workers to stop experience collection... (3900 times) [2024-06-17 23:56:27,020][12862] Signal inference workers to resume experience collection... 
(3900 times) [2024-06-17 23:56:27,046][12883] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-06-17 23:56:27,078][12883] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-06-17 23:56:27,166][12883] Updated weights for policy 0, policy_version 16900 (0.0034) [2024-06-17 23:56:30,876][12883] Updated weights for policy 0, policy_version 16910 (0.0050) [2024-06-17 23:56:31,996][12645] Fps is (10 sec: 40950.3, 60 sec: 40412.5, 300 sec: 40321.0). Total num frames: 277069824. Throughput: 0: 40291.8. Samples: 277149280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-17 23:56:31,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:56:35,272][12883] Updated weights for policy 0, policy_version 16920 (0.0037) [2024-06-17 23:56:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40687.0, 300 sec: 40210.2). Total num frames: 277266432. Throughput: 0: 40596.6. Samples: 277397280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:56:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:56:37,081][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016924_277282816.pth... [2024-06-17 23:56:37,134][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016334_267616256.pth [2024-06-17 23:56:38,931][12883] Updated weights for policy 0, policy_version 16930 (0.0036) [2024-06-17 23:56:41,994][12645] Fps is (10 sec: 39330.8, 60 sec: 39867.7, 300 sec: 40265.8). Total num frames: 277463040. Throughput: 0: 40411.8. Samples: 277634280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-17 23:56:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:56:43,769][12883] Updated weights for policy 0, policy_version 16940 (0.0034) [2024-06-17 23:56:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40413.9, 300 sec: 40376.8). Total num frames: 277692416. Throughput: 0: 40422.2. Samples: 277758020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:56:46,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-17 23:56:47,294][12883] Updated weights for policy 0, policy_version 16950 (0.0033) [2024-06-17 23:56:51,963][12883] Updated weights for policy 0, policy_version 16960 (0.0041) [2024-06-17 23:56:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40414.0, 300 sec: 40210.2). Total num frames: 277872640. Throughput: 0: 40316.9. Samples: 277992660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-17 23:56:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:56:55,530][12883] Updated weights for policy 0, policy_version 16970 (0.0043) [2024-06-17 23:56:56,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40142.8, 300 sec: 40376.8). Total num frames: 278085632. Throughput: 0: 40213.7. Samples: 278228740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:56:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:57:00,026][12883] Updated weights for policy 0, policy_version 16980 (0.0036) [2024-06-17 23:57:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40140.7, 300 sec: 40210.2). Total num frames: 278265856. Throughput: 0: 40228.9. Samples: 278353820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:57:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:57:04,049][12883] Updated weights for policy 0, policy_version 16990 (0.0044) [2024-06-17 23:57:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 278495232. Throughput: 0: 40168.2. Samples: 278594160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-17 23:57:06,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:57:08,616][12883] Updated weights for policy 0, policy_version 17000 (0.0039) [2024-06-17 23:57:11,951][12883] Updated weights for policy 0, policy_version 17010 (0.0035) [2024-06-17 23:57:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40686.9, 300 sec: 40376.9). 
Total num frames: 278691840. Throughput: 0: 39920.3. Samples: 278830280. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-17 23:57:11,996][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:57:16,568][12883] Updated weights for policy 0, policy_version 17020 (0.0052) [2024-06-17 23:57:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.9, 300 sec: 40265.7). Total num frames: 278872064. Throughput: 0: 40030.4. Samples: 278950560. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-17 23:57:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:57:20,122][12883] Updated weights for policy 0, policy_version 17030 (0.0050) [2024-06-17 23:57:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.8, 300 sec: 40321.3). Total num frames: 279085056. Throughput: 0: 39889.3. Samples: 279192300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:57:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:57:24,583][12883] Updated weights for policy 0, policy_version 17040 (0.0035) [2024-06-17 23:57:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.8, 300 sec: 40266.1). Total num frames: 279265280. Throughput: 0: 40063.1. Samples: 279437120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:57:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:57:28,052][12883] Updated weights for policy 0, policy_version 17050 (0.0041) [2024-06-17 23:57:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40142.2, 300 sec: 40210.2). Total num frames: 279478272. Throughput: 0: 39919.4. Samples: 279554400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-17 23:57:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:57:33,058][12883] Updated weights for policy 0, policy_version 17060 (0.0052) [2024-06-17 23:57:36,940][12883] Updated weights for policy 0, policy_version 17070 (0.0044) [2024-06-17 23:57:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 40265.8). Total num frames: 279674880. 
Throughput: 0: 39984.5. Samples: 279791960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-17 23:57:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:57:40,980][12883] Updated weights for policy 0, policy_version 17080 (0.0049) [2024-06-17 23:57:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39867.7, 300 sec: 40265.8). Total num frames: 279855104. Throughput: 0: 40015.2. Samples: 280029420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-17 23:57:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:57:45,000][12883] Updated weights for policy 0, policy_version 17090 (0.0039) [2024-06-17 23:57:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39594.6, 300 sec: 40265.7). Total num frames: 280068096. Throughput: 0: 39940.9. Samples: 280151160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 23:57:46,995][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:57:49,202][12883] Updated weights for policy 0, policy_version 17100 (0.0034) [2024-06-17 23:57:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39594.6, 300 sec: 40210.2). Total num frames: 280248320. Throughput: 0: 40078.2. Samples: 280397680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 23:57:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:57:52,973][12883] Updated weights for policy 0, policy_version 17110 (0.0039) [2024-06-17 23:57:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 39594.8, 300 sec: 40210.2). Total num frames: 280461312. Throughput: 0: 40218.8. Samples: 280640120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-17 23:57:56,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:57:57,178][12883] Updated weights for policy 0, policy_version 17120 (0.0035) [2024-06-17 23:58:01,093][12883] Updated weights for policy 0, policy_version 17130 (0.0035) [2024-06-17 23:58:01,994][12645] Fps is (10 sec: 45875.5, 60 sec: 40687.0, 300 sec: 40321.3). Total num frames: 280707072. Throughput: 0: 40216.9. 
Samples: 280760320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:58:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:58:05,324][12883] Updated weights for policy 0, policy_version 17140 (0.0036) [2024-06-17 23:58:06,994][12645] Fps is (10 sec: 40959.3, 60 sec: 39594.6, 300 sec: 40265.8). Total num frames: 280870912. Throughput: 0: 40140.4. Samples: 280998620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-17 23:58:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:58:07,579][12862] Signal inference workers to stop experience collection... (3950 times) [2024-06-17 23:58:07,626][12883] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-06-17 23:58:07,689][12862] Signal inference workers to resume experience collection... (3950 times) [2024-06-17 23:58:07,689][12883] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-06-17 23:58:09,370][12883] Updated weights for policy 0, policy_version 17150 (0.0026) [2024-06-17 23:58:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.8, 300 sec: 40265.8). Total num frames: 281083904. Throughput: 0: 40041.7. Samples: 281239000. Policy #0 lag: (min: 2.0, avg: 14.1, max: 29.0) [2024-06-17 23:58:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:58:13,467][12883] Updated weights for policy 0, policy_version 17160 (0.0032) [2024-06-17 23:58:16,994][12645] Fps is (10 sec: 39322.4, 60 sec: 39867.8, 300 sec: 40154.7). Total num frames: 281264128. Throughput: 0: 40263.3. Samples: 281366240. Policy #0 lag: (min: 2.0, avg: 14.1, max: 29.0) [2024-06-17 23:58:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:58:17,373][12883] Updated weights for policy 0, policy_version 17170 (0.0043) [2024-06-17 23:58:21,591][12883] Updated weights for policy 0, policy_version 17180 (0.0044) [2024-06-17 23:58:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 281493504. 
Throughput: 0: 40377.3. Samples: 281608940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 23:58:21,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:58:25,468][12883] Updated weights for policy 0, policy_version 17190 (0.0045) [2024-06-17 23:58:26,994][12645] Fps is (10 sec: 45875.0, 60 sec: 40960.0, 300 sec: 40321.6). Total num frames: 281722880. Throughput: 0: 40382.3. Samples: 281846620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 23:58:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:58:29,527][12883] Updated weights for policy 0, policy_version 17200 (0.0051) [2024-06-17 23:58:32,000][12645] Fps is (10 sec: 39297.2, 60 sec: 40136.7, 300 sec: 40264.9). Total num frames: 281886720. Throughput: 0: 40491.8. Samples: 281973540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-17 23:58:32,000][12645] Avg episode reward: [(0, '0.002')] [2024-06-17 23:58:33,469][12883] Updated weights for policy 0, policy_version 17210 (0.0041) [2024-06-17 23:58:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40413.9, 300 sec: 40265.8). Total num frames: 282099712. Throughput: 0: 40423.6. Samples: 282216740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-17 23:58:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:58:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017218_282099712.pth... [2024-06-17 23:58:37,052][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016628_272433152.pth [2024-06-17 23:58:38,068][12883] Updated weights for policy 0, policy_version 17220 (0.0040) [2024-06-17 23:58:41,614][12883] Updated weights for policy 0, policy_version 17230 (0.0051) [2024-06-17 23:58:41,994][12645] Fps is (10 sec: 44263.9, 60 sec: 41233.0, 300 sec: 40376.8). Total num frames: 282329088. Throughput: 0: 40315.0. Samples: 282454300. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-17 23:58:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:58:46,086][12883] Updated weights for policy 0, policy_version 17240 (0.0042) [2024-06-17 23:58:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 40377.1). Total num frames: 282509312. Throughput: 0: 40479.1. Samples: 282581880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 23:58:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-17 23:58:49,555][12883] Updated weights for policy 0, policy_version 17250 (0.0047) [2024-06-17 23:58:51,994][12645] Fps is (10 sec: 37684.1, 60 sec: 40960.2, 300 sec: 40210.3). Total num frames: 282705920. Throughput: 0: 40461.1. Samples: 282819360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 23:58:52,000][12645] Avg episode reward: [(0, '0.013')] [2024-06-17 23:58:54,300][12883] Updated weights for policy 0, policy_version 17260 (0.0028) [2024-06-17 23:58:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40960.0, 300 sec: 40432.4). Total num frames: 282918912. Throughput: 0: 40740.9. Samples: 283072340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-17 23:58:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:58:57,611][12883] Updated weights for policy 0, policy_version 17270 (0.0031) [2024-06-17 23:59:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 39867.8, 300 sec: 40210.2). Total num frames: 283099136. Throughput: 0: 40504.0. Samples: 283188920. Policy #0 lag: (min: 0.0, avg: 13.2, max: 25.0) [2024-06-17 23:59:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:59:02,383][12883] Updated weights for policy 0, policy_version 17280 (0.0031) [2024-06-17 23:59:05,598][12883] Updated weights for policy 0, policy_version 17290 (0.0048) [2024-06-17 23:59:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 40376.9). Total num frames: 283328512. Throughput: 0: 40552.0. Samples: 283433780. 
Policy #0 lag: (min: 0.0, avg: 13.2, max: 25.0) [2024-06-17 23:59:06,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-17 23:59:10,375][12883] Updated weights for policy 0, policy_version 17300 (0.0028) [2024-06-17 23:59:11,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40412.3, 300 sec: 40321.0). Total num frames: 283508736. Throughput: 0: 40820.1. Samples: 283683620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-17 23:59:11,997][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:59:13,797][12883] Updated weights for policy 0, policy_version 17310 (0.0026) [2024-06-17 23:59:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40686.8, 300 sec: 40265.7). Total num frames: 283705344. Throughput: 0: 40670.0. Samples: 283803440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-17 23:59:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:59:18,604][12883] Updated weights for policy 0, policy_version 17320 (0.0036) [2024-06-17 23:59:21,742][12883] Updated weights for policy 0, policy_version 17330 (0.0040) [2024-06-17 23:59:22,000][12645] Fps is (10 sec: 42581.1, 60 sec: 40682.7, 300 sec: 40431.5). Total num frames: 283934720. Throughput: 0: 40673.9. Samples: 284047320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-17 23:59:22,001][12645] Avg episode reward: [(0, '0.003')] [2024-06-17 23:59:25,504][12862] Signal inference workers to stop experience collection... (4000 times) [2024-06-17 23:59:25,552][12883] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-06-17 23:59:25,558][12862] Signal inference workers to resume experience collection... (4000 times) [2024-06-17 23:59:25,579][12883] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-06-17 23:59:26,519][12883] Updated weights for policy 0, policy_version 17340 (0.0051) [2024-06-17 23:59:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 284114944. Throughput: 0: 40834.7. 
Samples: 284291860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-17 23:59:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-17 23:59:29,726][12883] Updated weights for policy 0, policy_version 17350 (0.0035) [2024-06-17 23:59:31,994][12645] Fps is (10 sec: 42624.9, 60 sec: 41237.3, 300 sec: 40432.4). Total num frames: 284360704. Throughput: 0: 40516.9. Samples: 284405140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-17 23:59:31,995][12645] Avg episode reward: [(0, '0.010')] [2024-06-17 23:59:34,564][12883] Updated weights for policy 0, policy_version 17360 (0.0038) [2024-06-17 23:59:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.8, 300 sec: 40376.8). Total num frames: 284524544. Throughput: 0: 40755.0. Samples: 284653340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-17 23:59:36,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-17 23:59:38,083][12883] Updated weights for policy 0, policy_version 17370 (0.0039) [2024-06-17 23:59:41,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 284721152. Throughput: 0: 40514.5. Samples: 284895500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-17 23:59:41,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-17 23:59:42,624][12883] Updated weights for policy 0, policy_version 17380 (0.0046) [2024-06-17 23:59:46,017][12883] Updated weights for policy 0, policy_version 17390 (0.0053) [2024-06-17 23:59:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 284950528. Throughput: 0: 40621.6. Samples: 285016900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-17 23:59:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-17 23:59:50,534][12883] Updated weights for policy 0, policy_version 17400 (0.0042) [2024-06-17 23:59:51,996][12645] Fps is (10 sec: 40951.4, 60 sec: 40412.3, 300 sec: 40376.6). Total num frames: 285130752. Throughput: 0: 40535.3. Samples: 285257960. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-17 23:59:51,996][12645] Avg episode reward: [(0, '0.007')] [2024-06-17 23:59:54,121][12883] Updated weights for policy 0, policy_version 17410 (0.0036) [2024-06-17 23:59:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.8, 300 sec: 40321.6). Total num frames: 285343744. Throughput: 0: 40420.6. Samples: 285502460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-17 23:59:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-17 23:59:58,457][12883] Updated weights for policy 0, policy_version 17420 (0.0035) [2024-06-18 00:00:01,994][12645] Fps is (10 sec: 42607.5, 60 sec: 40959.9, 300 sec: 40488.0). Total num frames: 285556736. Throughput: 0: 40524.4. Samples: 285627040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 00:00:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:00:02,147][12883] Updated weights for policy 0, policy_version 17430 (0.0041) [2024-06-18 00:00:06,912][12883] Updated weights for policy 0, policy_version 17440 (0.0046) [2024-06-18 00:00:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.8, 300 sec: 40432.4). Total num frames: 285736960. Throughput: 0: 40506.1. Samples: 285869840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 00:00:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:00:10,379][12883] Updated weights for policy 0, policy_version 17450 (0.0034) [2024-06-18 00:00:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40688.4, 300 sec: 40321.3). Total num frames: 285949952. Throughput: 0: 40219.6. Samples: 286101740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 00:00:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 00:00:14,820][12883] Updated weights for policy 0, policy_version 17460 (0.0037) [2024-06-18 00:00:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40687.0, 300 sec: 40321.3). Total num frames: 286146560. Throughput: 0: 40468.1. Samples: 286226200. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 00:00:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:00:18,483][12883] Updated weights for policy 0, policy_version 17470 (0.0025) [2024-06-18 00:00:21,994][12645] Fps is (10 sec: 36045.0, 60 sec: 39598.8, 300 sec: 40210.2). Total num frames: 286310400. Throughput: 0: 40269.8. Samples: 286465480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 00:00:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:00:22,932][12883] Updated weights for policy 0, policy_version 17480 (0.0043) [2024-06-18 00:00:26,284][12883] Updated weights for policy 0, policy_version 17490 (0.0043) [2024-06-18 00:00:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 40376.9). Total num frames: 286556160. Throughput: 0: 40245.4. Samples: 286706540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 00:00:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:00:30,761][12883] Updated weights for policy 0, policy_version 17500 (0.0027) [2024-06-18 00:00:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 39867.8, 300 sec: 40432.4). Total num frames: 286752768. Throughput: 0: 40452.1. Samples: 286837240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 00:00:31,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:00:34,898][12883] Updated weights for policy 0, policy_version 17510 (0.0046) [2024-06-18 00:00:36,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40140.8, 300 sec: 40210.2). Total num frames: 286932992. Throughput: 0: 40361.4. Samples: 287074140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 00:00:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:00:37,121][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017514_286949376.pth... 
[2024-06-18 00:00:37,194][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000016924_277282816.pth [2024-06-18 00:00:39,185][12883] Updated weights for policy 0, policy_version 17520 (0.0050) [2024-06-18 00:00:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 40321.3). Total num frames: 287162368. Throughput: 0: 40245.8. Samples: 287313520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-18 00:00:41,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:00:43,171][12883] Updated weights for policy 0, policy_version 17530 (0.0040) [2024-06-18 00:00:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40140.9, 300 sec: 40376.9). Total num frames: 287358976. Throughput: 0: 40233.4. Samples: 287437540. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-18 00:00:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:00:47,023][12883] Updated weights for policy 0, policy_version 17540 (0.0043) [2024-06-18 00:00:51,324][12883] Updated weights for policy 0, policy_version 17550 (0.0060) [2024-06-18 00:00:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40688.4, 300 sec: 40321.7). Total num frames: 287571968. Throughput: 0: 40223.9. Samples: 287679920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 00:00:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:00:55,171][12883] Updated weights for policy 0, policy_version 17560 (0.0036) [2024-06-18 00:00:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 287784960. Throughput: 0: 40541.4. Samples: 287926100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 00:00:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:00:59,167][12883] Updated weights for policy 0, policy_version 17570 (0.0036) [2024-06-18 00:01:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 40321.3). Total num frames: 287948800. Throughput: 0: 40454.2. Samples: 288046640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 00:01:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:01:03,296][12883] Updated weights for policy 0, policy_version 17580 (0.0040) [2024-06-18 00:01:04,886][12862] Signal inference workers to stop experience collection... (4050 times) [2024-06-18 00:01:04,886][12862] Signal inference workers to resume experience collection... (4050 times) [2024-06-18 00:01:04,903][12883] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-06-18 00:01:04,903][12883] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-06-18 00:01:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40686.9, 300 sec: 40432.4). Total num frames: 288178176. Throughput: 0: 40468.4. Samples: 288286560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 00:01:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:01:07,779][12883] Updated weights for policy 0, policy_version 17590 (0.0037) [2024-06-18 00:01:11,358][12883] Updated weights for policy 0, policy_version 17600 (0.0038) [2024-06-18 00:01:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.9, 300 sec: 40376.9). Total num frames: 288374784. Throughput: 0: 40433.3. Samples: 288526040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 00:01:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:01:15,738][12883] Updated weights for policy 0, policy_version 17610 (0.0031) [2024-06-18 00:01:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.8, 300 sec: 40321.3). Total num frames: 288555008. Throughput: 0: 40335.2. Samples: 288652320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:01:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:01:19,408][12883] Updated weights for policy 0, policy_version 17620 (0.0042) [2024-06-18 00:01:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 40487.9). Total num frames: 288800768. Throughput: 0: 40481.4. 
Samples: 288895800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:01:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:01:23,714][12883] Updated weights for policy 0, policy_version 17630 (0.0026) [2024-06-18 00:01:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.9, 300 sec: 40377.2). Total num frames: 288980992. Throughput: 0: 40759.2. Samples: 289147680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:01:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:01:27,377][12883] Updated weights for policy 0, policy_version 17640 (0.0026) [2024-06-18 00:01:31,531][12883] Updated weights for policy 0, policy_version 17650 (0.0044) [2024-06-18 00:01:31,996][12645] Fps is (10 sec: 37674.7, 60 sec: 40412.3, 300 sec: 40376.5). Total num frames: 289177600. Throughput: 0: 40627.7. Samples: 289265880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 00:01:31,997][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:01:35,263][12883] Updated weights for policy 0, policy_version 17660 (0.0029) [2024-06-18 00:01:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41506.2, 300 sec: 40543.4). Total num frames: 289423360. Throughput: 0: 40708.9. Samples: 289511820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 00:01:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:01:39,553][12883] Updated weights for policy 0, policy_version 17670 (0.0035) [2024-06-18 00:01:41,994][12645] Fps is (10 sec: 40969.3, 60 sec: 40413.9, 300 sec: 40321.3). Total num frames: 289587200. Throughput: 0: 40716.4. Samples: 289758340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 00:01:41,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:01:43,159][12883] Updated weights for policy 0, policy_version 17680 (0.0033) [2024-06-18 00:01:46,994][12645] Fps is (10 sec: 36044.5, 60 sec: 40413.8, 300 sec: 40376.8). Total num frames: 289783808. Throughput: 0: 40703.9. Samples: 289878320. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 00:01:46,995][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:01:47,722][12883] Updated weights for policy 0, policy_version 17690 (0.0037) [2024-06-18 00:01:51,080][12883] Updated weights for policy 0, policy_version 17700 (0.0055) [2024-06-18 00:01:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 40432.4). Total num frames: 290013184. Throughput: 0: 40876.5. Samples: 290126000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 00:01:51,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:01:56,188][12883] Updated weights for policy 0, policy_version 17710 (0.0042) [2024-06-18 00:01:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40413.8, 300 sec: 40487.9). Total num frames: 290209792. Throughput: 0: 41011.5. Samples: 290371560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 00:01:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:01:59,429][12883] Updated weights for policy 0, policy_version 17720 (0.0045) [2024-06-18 00:02:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.2, 300 sec: 40432.4). Total num frames: 290422784. Throughput: 0: 40818.3. Samples: 290489140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 00:02:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:02:04,105][12883] Updated weights for policy 0, policy_version 17730 (0.0037) [2024-06-18 00:02:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40686.9, 300 sec: 40432.4). Total num frames: 290619392. Throughput: 0: 40942.2. Samples: 290738200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 00:02:06,995][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:02:07,487][12883] Updated weights for policy 0, policy_version 17740 (0.0042) [2024-06-18 00:02:11,994][12645] Fps is (10 sec: 36044.7, 60 sec: 40140.9, 300 sec: 40376.9). Total num frames: 290783232. Throughput: 0: 40806.7. Samples: 290983980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 00:02:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:02:12,289][12883] Updated weights for policy 0, policy_version 17750 (0.0041) [2024-06-18 00:02:15,566][12883] Updated weights for policy 0, policy_version 17760 (0.0047) [2024-06-18 00:02:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 40543.4). Total num frames: 291045376. Throughput: 0: 40748.2. Samples: 291099460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 00:02:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:02:20,286][12883] Updated weights for policy 0, policy_version 17770 (0.0044) [2024-06-18 00:02:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 291225600. Throughput: 0: 40848.5. Samples: 291350000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 00:02:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:02:23,353][12883] Updated weights for policy 0, policy_version 17780 (0.0034) [2024-06-18 00:02:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 40413.8, 300 sec: 40432.4). Total num frames: 291405824. Throughput: 0: 40858.2. Samples: 291596960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 00:02:26,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:02:28,233][12883] Updated weights for policy 0, policy_version 17790 (0.0042) [2024-06-18 00:02:31,591][12883] Updated weights for policy 0, policy_version 17800 (0.0037) [2024-06-18 00:02:31,994][12645] Fps is (10 sec: 44235.9, 60 sec: 41507.6, 300 sec: 40654.5). Total num frames: 291667968. Throughput: 0: 40757.3. Samples: 291712400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 00:02:31,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:02:36,134][12883] Updated weights for policy 0, policy_version 17810 (0.0038) [2024-06-18 00:02:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.7, 300 sec: 40543.4). 
Total num frames: 291815424. Throughput: 0: 40675.4. Samples: 291956400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 00:02:36,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:02:37,134][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017812_291831808.pth... [2024-06-18 00:02:37,198][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017218_282099712.pth [2024-06-18 00:02:39,612][12883] Updated weights for policy 0, policy_version 17820 (0.0034) [2024-06-18 00:02:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 40599.0). Total num frames: 292044800. Throughput: 0: 40568.5. Samples: 292197140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 00:02:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:02:43,887][12862] Signal inference workers to stop experience collection... (4100 times) [2024-06-18 00:02:43,889][12862] Signal inference workers to resume experience collection... (4100 times) [2024-06-18 00:02:43,923][12883] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-06-18 00:02:43,923][12883] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-06-18 00:02:44,043][12883] Updated weights for policy 0, policy_version 17830 (0.0041) [2024-06-18 00:02:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40960.1, 300 sec: 40654.5). Total num frames: 292241408. Throughput: 0: 40853.7. Samples: 292327560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-18 00:02:46,996][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:02:47,672][12883] Updated weights for policy 0, policy_version 17840 (0.0033) [2024-06-18 00:02:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40413.9, 300 sec: 40599.0). Total num frames: 292438016. Throughput: 0: 40640.9. Samples: 292567040. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-18 00:02:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:02:52,045][12883] Updated weights for policy 0, policy_version 17850 (0.0039) [2024-06-18 00:02:55,761][12883] Updated weights for policy 0, policy_version 17860 (0.0029) [2024-06-18 00:02:56,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40959.9, 300 sec: 40543.4). Total num frames: 292667392. Throughput: 0: 40367.3. Samples: 292800520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-18 00:02:56,995][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:03:00,150][12883] Updated weights for policy 0, policy_version 17870 (0.0050) [2024-06-18 00:03:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 40487.9). Total num frames: 292814848. Throughput: 0: 40535.7. Samples: 292923560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-18 00:03:01,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:03:04,078][12883] Updated weights for policy 0, policy_version 17880 (0.0042) [2024-06-18 00:03:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 293060608. Throughput: 0: 40362.1. Samples: 293166300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-18 00:03:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:03:08,200][12883] Updated weights for policy 0, policy_version 17890 (0.0032) [2024-06-18 00:03:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41233.0, 300 sec: 40654.5). Total num frames: 293257216. Throughput: 0: 40359.6. Samples: 293413140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:03:11,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:03:12,186][12883] Updated weights for policy 0, policy_version 17900 (0.0028) [2024-06-18 00:03:16,137][12883] Updated weights for policy 0, policy_version 17910 (0.0032) [2024-06-18 00:03:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 39867.8, 300 sec: 40487.9). 
Total num frames: 293437440. Throughput: 0: 40411.6. Samples: 293530920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:03:16,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:03:20,174][12883] Updated weights for policy 0, policy_version 17920 (0.0035) [2024-06-18 00:03:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.8, 300 sec: 40487.9). Total num frames: 293666816. Throughput: 0: 40385.8. Samples: 293773760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:03:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:03:24,076][12883] Updated weights for policy 0, policy_version 17930 (0.0028) [2024-06-18 00:03:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40687.0, 300 sec: 40544.3). Total num frames: 293847040. Throughput: 0: 40628.1. Samples: 294025400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-18 00:03:26,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:03:28,241][12883] Updated weights for policy 0, policy_version 17940 (0.0033) [2024-06-18 00:03:31,994][12645] Fps is (10 sec: 39322.1, 60 sec: 39867.8, 300 sec: 40543.5). Total num frames: 294060032. Throughput: 0: 40280.9. Samples: 294140200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-18 00:03:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:03:32,350][12883] Updated weights for policy 0, policy_version 17950 (0.0036) [2024-06-18 00:03:36,209][12883] Updated weights for policy 0, policy_version 17960 (0.0032) [2024-06-18 00:03:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41233.1, 300 sec: 40543.5). Total num frames: 294289408. Throughput: 0: 40560.4. Samples: 294392260. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-18 00:03:36,996][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:03:40,423][12883] Updated weights for policy 0, policy_version 17970 (0.0038) [2024-06-18 00:03:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40487.9). 
Total num frames: 294453248. Throughput: 0: 40747.3. Samples: 294634140. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-18 00:03:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:03:44,500][12883] Updated weights for policy 0, policy_version 17980 (0.0041) [2024-06-18 00:03:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 294682624. Throughput: 0: 40632.3. Samples: 294752020. Policy #0 lag: (min: 1.0, avg: 9.2, max: 22.0) [2024-06-18 00:03:46,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:03:48,661][12883] Updated weights for policy 0, policy_version 17990 (0.0039) [2024-06-18 00:03:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40686.8, 300 sec: 40543.4). Total num frames: 294879232. Throughput: 0: 40737.3. Samples: 294999480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 00:03:51,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:03:52,639][12883] Updated weights for policy 0, policy_version 18000 (0.0047) [2024-06-18 00:03:56,483][12883] Updated weights for policy 0, policy_version 18010 (0.0043) [2024-06-18 00:03:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40414.0, 300 sec: 40654.5). Total num frames: 295092224. Throughput: 0: 40571.6. Samples: 295238860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 00:03:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:04:00,832][12862] Signal inference workers to stop experience collection... (4150 times) [2024-06-18 00:04:00,877][12883] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-06-18 00:04:00,878][12862] Signal inference workers to resume experience collection... 
(4150 times) [2024-06-18 00:04:00,887][12883] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-06-18 00:04:00,894][12883] Updated weights for policy 0, policy_version 18020 (0.0043) [2024-06-18 00:04:01,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41233.1, 300 sec: 40543.5). Total num frames: 295288832. Throughput: 0: 40650.3. Samples: 295360180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 00:04:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:04:04,716][12883] Updated weights for policy 0, policy_version 18030 (0.0032) [2024-06-18 00:04:06,994][12645] Fps is (10 sec: 37682.6, 60 sec: 40140.8, 300 sec: 40543.7). Total num frames: 295469056. Throughput: 0: 40598.2. Samples: 295600680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 00:04:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:04:08,894][12883] Updated weights for policy 0, policy_version 18040 (0.0038) [2024-06-18 00:04:11,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40140.8, 300 sec: 40543.5). Total num frames: 295665664. Throughput: 0: 40464.8. Samples: 295846320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 00:04:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:04:12,859][12883] Updated weights for policy 0, policy_version 18050 (0.0033) [2024-06-18 00:04:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 40433.2). Total num frames: 295862272. Throughput: 0: 40546.0. Samples: 295964780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 00:04:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:04:17,661][12883] Updated weights for policy 0, policy_version 18060 (0.0041) [2024-06-18 00:04:21,018][12883] Updated weights for policy 0, policy_version 18070 (0.0034) [2024-06-18 00:04:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40140.9, 300 sec: 40543.5). Total num frames: 296075264. Throughput: 0: 40291.6. Samples: 296205380. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 00:04:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:04:25,617][12883] Updated weights for policy 0, policy_version 18080 (0.0035) [2024-06-18 00:04:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40959.9, 300 sec: 40487.9). Total num frames: 296304640. Throughput: 0: 40344.8. Samples: 296449660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 00:04:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:04:29,198][12883] Updated weights for policy 0, policy_version 18090 (0.0034) [2024-06-18 00:04:31,996][12645] Fps is (10 sec: 42589.2, 60 sec: 40685.4, 300 sec: 40598.7). Total num frames: 296501248. Throughput: 0: 40360.8. Samples: 296568340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 00:04:31,996][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:04:33,407][12883] Updated weights for policy 0, policy_version 18100 (0.0039) [2024-06-18 00:04:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.8, 300 sec: 40599.0). Total num frames: 296697856. Throughput: 0: 40406.4. Samples: 296817760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 00:04:36,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:04:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018109_296697856.pth... [2024-06-18 00:04:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017514_286949376.pth [2024-06-18 00:04:37,224][12883] Updated weights for policy 0, policy_version 18110 (0.0040) [2024-06-18 00:04:41,112][12883] Updated weights for policy 0, policy_version 18120 (0.0055) [2024-06-18 00:04:41,994][12645] Fps is (10 sec: 39329.9, 60 sec: 40686.9, 300 sec: 40487.9). Total num frames: 296894464. Throughput: 0: 40411.9. Samples: 297057400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:04:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:04:45,283][12883] Updated weights for policy 0, policy_version 18130 (0.0044) [2024-06-18 00:04:47,000][12645] Fps is (10 sec: 42571.9, 60 sec: 40682.8, 300 sec: 40654.0). Total num frames: 297123840. Throughput: 0: 40683.7. Samples: 297191200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:04:47,000][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:04:49,105][12883] Updated weights for policy 0, policy_version 18140 (0.0042) [2024-06-18 00:04:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40140.9, 300 sec: 40487.9). Total num frames: 297287680. Throughput: 0: 40676.1. Samples: 297431100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:04:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 00:04:53,476][12883] Updated weights for policy 0, policy_version 18150 (0.0043) [2024-06-18 00:04:56,890][12883] Updated weights for policy 0, policy_version 18160 (0.0041) [2024-06-18 00:04:56,994][12645] Fps is (10 sec: 40985.4, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 297533440. Throughput: 0: 40550.7. Samples: 297671100. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 00:04:57,000][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:05:01,304][12883] Updated weights for policy 0, policy_version 18170 (0.0032) [2024-06-18 00:05:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 40686.9, 300 sec: 40654.5). Total num frames: 297730048. Throughput: 0: 40704.2. Samples: 297796460. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 00:05:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:05:04,587][12883] Updated weights for policy 0, policy_version 18180 (0.0043) [2024-06-18 00:05:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40687.1, 300 sec: 40543.5). Total num frames: 297910272. Throughput: 0: 40778.3. Samples: 298040400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:05:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:05:09,613][12883] Updated weights for policy 0, policy_version 18190 (0.0045) [2024-06-18 00:05:11,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41233.2, 300 sec: 40654.6). Total num frames: 298139648. Throughput: 0: 40647.3. Samples: 298278780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:05:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:05:12,480][12883] Updated weights for policy 0, policy_version 18200 (0.0034) [2024-06-18 00:05:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40687.2, 300 sec: 40654.5). Total num frames: 298303488. Throughput: 0: 40811.9. Samples: 298404780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:05:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:05:17,481][12862] Signal inference workers to stop experience collection... (4200 times) [2024-06-18 00:05:17,545][12883] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-06-18 00:05:17,601][12862] Signal inference workers to resume experience collection... (4200 times) [2024-06-18 00:05:17,601][12883] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-06-18 00:05:17,731][12883] Updated weights for policy 0, policy_version 18210 (0.0030) [2024-06-18 00:05:20,596][12883] Updated weights for policy 0, policy_version 18220 (0.0036) [2024-06-18 00:05:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 40599.0). Total num frames: 298532864. Throughput: 0: 40528.4. Samples: 298641540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 00:05:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:05:25,687][12883] Updated weights for policy 0, policy_version 18230 (0.0034) [2024-06-18 00:05:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 40687.1, 300 sec: 40654.5). Total num frames: 298745856. Throughput: 0: 40785.4. 
Samples: 298892740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 00:05:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:05:29,049][12883] Updated weights for policy 0, policy_version 18240 (0.0028) [2024-06-18 00:05:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40415.3, 300 sec: 40654.5). Total num frames: 298926080. Throughput: 0: 40599.3. Samples: 299017920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:05:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:05:33,673][12883] Updated weights for policy 0, policy_version 18250 (0.0056) [2024-06-18 00:05:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 40654.6). Total num frames: 299155456. Throughput: 0: 40556.5. Samples: 299256140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:05:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:05:37,023][12883] Updated weights for policy 0, policy_version 18260 (0.0036) [2024-06-18 00:05:41,715][12883] Updated weights for policy 0, policy_version 18270 (0.0030) [2024-06-18 00:05:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 299335680. Throughput: 0: 40668.8. Samples: 299501200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:05:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:05:45,038][12883] Updated weights for policy 0, policy_version 18280 (0.0042) [2024-06-18 00:05:46,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40418.0, 300 sec: 40599.0). Total num frames: 299548672. Throughput: 0: 40508.4. Samples: 299619340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 00:05:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:05:49,725][12883] Updated weights for policy 0, policy_version 18290 (0.0041) [2024-06-18 00:05:51,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41233.2, 300 sec: 40599.0). Total num frames: 299761664. Throughput: 0: 40656.9. Samples: 299869960. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 00:05:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:05:53,044][12883] Updated weights for policy 0, policy_version 18300 (0.0033) [2024-06-18 00:05:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 299958272. Throughput: 0: 40755.8. Samples: 300112800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 00:05:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:05:57,555][12883] Updated weights for policy 0, policy_version 18310 (0.0042) [2024-06-18 00:06:01,073][12883] Updated weights for policy 0, policy_version 18320 (0.0035) [2024-06-18 00:06:01,994][12645] Fps is (10 sec: 39320.6, 60 sec: 40413.8, 300 sec: 40599.0). Total num frames: 300154880. Throughput: 0: 40570.4. Samples: 300230460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:06:01,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:06:06,150][12883] Updated weights for policy 0, policy_version 18330 (0.0044) [2024-06-18 00:06:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 300367872. Throughput: 0: 40631.1. Samples: 300469940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:06:06,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:06:09,867][12883] Updated weights for policy 0, policy_version 18340 (0.0033) [2024-06-18 00:06:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.6, 300 sec: 40599.0). Total num frames: 300531712. Throughput: 0: 40564.8. Samples: 300718160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 00:06:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:06:14,086][12883] Updated weights for policy 0, policy_version 18350 (0.0032) [2024-06-18 00:06:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 40543.5). Total num frames: 300761088. Throughput: 0: 40397.0. Samples: 300835780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 00:06:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:06:17,862][12883] Updated weights for policy 0, policy_version 18360 (0.0052) [2024-06-18 00:06:21,903][12883] Updated weights for policy 0, policy_version 18370 (0.0030) [2024-06-18 00:06:21,999][12645] Fps is (10 sec: 44214.4, 60 sec: 40683.4, 300 sec: 40653.8). Total num frames: 300974080. Throughput: 0: 40625.1. Samples: 301084480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 00:06:21,999][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 00:06:26,040][12883] Updated weights for policy 0, policy_version 18380 (0.0033) [2024-06-18 00:06:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40686.9, 300 sec: 40710.4). Total num frames: 301187072. Throughput: 0: 40545.4. Samples: 301325740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:06:26,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:06:29,841][12883] Updated weights for policy 0, policy_version 18390 (0.0037) [2024-06-18 00:06:31,994][12645] Fps is (10 sec: 39342.1, 60 sec: 40687.0, 300 sec: 40487.9). Total num frames: 301367296. Throughput: 0: 40577.5. Samples: 301445320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:06:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:06:33,917][12883] Updated weights for policy 0, policy_version 18400 (0.0046) [2024-06-18 00:06:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40140.8, 300 sec: 40599.0). Total num frames: 301563904. Throughput: 0: 40511.5. Samples: 301692980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:06:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:06:37,095][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018407_301580288.pth... 
[2024-06-18 00:06:37,162][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000017812_291831808.pth [2024-06-18 00:06:37,714][12883] Updated weights for policy 0, policy_version 18410 (0.0034) [2024-06-18 00:06:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40414.0, 300 sec: 40599.0). Total num frames: 301760512. Throughput: 0: 40605.0. Samples: 301940020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:06:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:06:42,237][12883] Updated weights for policy 0, policy_version 18420 (0.0035) [2024-06-18 00:06:45,587][12883] Updated weights for policy 0, policy_version 18430 (0.0039) [2024-06-18 00:06:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40413.9, 300 sec: 40543.4). Total num frames: 301973504. Throughput: 0: 40691.1. Samples: 302061560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:06:46,998][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:06:50,115][12883] Updated weights for policy 0, policy_version 18440 (0.0047) [2024-06-18 00:06:51,996][12645] Fps is (10 sec: 42588.6, 60 sec: 40412.3, 300 sec: 40598.7). Total num frames: 302186496. Throughput: 0: 40858.4. Samples: 302308660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 00:06:51,997][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:06:53,945][12883] Updated weights for policy 0, policy_version 18450 (0.0035) [2024-06-18 00:06:56,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40140.9, 300 sec: 40487.9). Total num frames: 302366720. Throughput: 0: 40729.9. Samples: 302551000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 00:06:56,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:06:58,060][12883] Updated weights for policy 0, policy_version 18460 (0.0039) [2024-06-18 00:06:59,024][12862] Signal inference workers to stop experience collection... 
(4250 times) [2024-06-18 00:06:59,056][12883] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-06-18 00:06:59,083][12862] Signal inference workers to resume experience collection... (4250 times) [2024-06-18 00:06:59,084][12883] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-06-18 00:07:01,838][12883] Updated weights for policy 0, policy_version 18470 (0.0029) [2024-06-18 00:07:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 40960.1, 300 sec: 40654.5). Total num frames: 302612480. Throughput: 0: 40727.0. Samples: 302668500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 00:07:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:07:06,364][12883] Updated weights for policy 0, policy_version 18480 (0.0042) [2024-06-18 00:07:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 302792704. Throughput: 0: 40681.5. Samples: 302914940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:07:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:07:10,284][12883] Updated weights for policy 0, policy_version 18490 (0.0038) [2024-06-18 00:07:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40960.0, 300 sec: 40487.9). Total num frames: 302989312. Throughput: 0: 40745.7. Samples: 303159300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:07:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:07:14,568][12883] Updated weights for policy 0, policy_version 18500 (0.0039) [2024-06-18 00:07:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 303202304. Throughput: 0: 40742.2. Samples: 303278720. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 00:07:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:07:18,041][12883] Updated weights for policy 0, policy_version 18510 (0.0045) [2024-06-18 00:07:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40144.3, 300 sec: 40599.0). Total num frames: 303382528. Throughput: 0: 40631.2. Samples: 303521380. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 00:07:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:07:22,335][12883] Updated weights for policy 0, policy_version 18520 (0.0035) [2024-06-18 00:07:25,790][12883] Updated weights for policy 0, policy_version 18530 (0.0037) [2024-06-18 00:07:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40413.8, 300 sec: 40487.9). Total num frames: 303611904. Throughput: 0: 40592.3. Samples: 303766680. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 00:07:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 00:07:30,750][12883] Updated weights for policy 0, policy_version 18540 (0.0059) [2024-06-18 00:07:31,996][12645] Fps is (10 sec: 44226.6, 60 sec: 40958.4, 300 sec: 40709.8). Total num frames: 303824896. Throughput: 0: 40788.7. Samples: 303897140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 00:07:31,997][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:07:34,280][12883] Updated weights for policy 0, policy_version 18550 (0.0031) [2024-06-18 00:07:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40686.8, 300 sec: 40543.4). Total num frames: 304005120. Throughput: 0: 40656.1. Samples: 304138100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 00:07:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:07:38,798][12883] Updated weights for policy 0, policy_version 18560 (0.0043) [2024-06-18 00:07:41,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41233.0, 300 sec: 40654.5). Total num frames: 304234496. Throughput: 0: 40615.5. Samples: 304378700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 00:07:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:07:42,357][12883] Updated weights for policy 0, policy_version 18570 (0.0027) [2024-06-18 00:07:46,592][12883] Updated weights for policy 0, policy_version 18580 (0.0034) [2024-06-18 00:07:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 304431104. Throughput: 0: 40836.4. Samples: 304506140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:07:46,995][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:07:50,166][12883] Updated weights for policy 0, policy_version 18590 (0.0037) [2024-06-18 00:07:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40688.5, 300 sec: 40543.5). Total num frames: 304627712. Throughput: 0: 40676.5. Samples: 304745380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:07:51,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 00:07:54,593][12883] Updated weights for policy 0, policy_version 18600 (0.0027) [2024-06-18 00:07:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 40821.1). Total num frames: 304857088. Throughput: 0: 40614.8. Samples: 304986960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 00:07:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:07:58,573][12883] Updated weights for policy 0, policy_version 18610 (0.0042) [2024-06-18 00:08:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 39867.7, 300 sec: 40487.9). Total num frames: 305004544. Throughput: 0: 40939.0. Samples: 305120980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 00:08:01,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:08:02,543][12883] Updated weights for policy 0, policy_version 18620 (0.0034) [2024-06-18 00:08:06,641][12883] Updated weights for policy 0, policy_version 18630 (0.0037) [2024-06-18 00:08:06,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40687.0, 300 sec: 40599.0). 
Total num frames: 305233920. Throughput: 0: 40823.0. Samples: 305358420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 00:08:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:08:10,704][12883] Updated weights for policy 0, policy_version 18640 (0.0035) [2024-06-18 00:08:11,994][12645] Fps is (10 sec: 45876.0, 60 sec: 41233.2, 300 sec: 40765.6). Total num frames: 305463296. Throughput: 0: 40817.5. Samples: 305603460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 00:08:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:08:14,525][12883] Updated weights for policy 0, policy_version 18650 (0.0032) [2024-06-18 00:08:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.8, 300 sec: 40543.5). Total num frames: 305627136. Throughput: 0: 40599.8. Samples: 305724040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 00:08:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:08:18,670][12883] Updated weights for policy 0, policy_version 18660 (0.0032) [2024-06-18 00:08:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41233.0, 300 sec: 40710.1). Total num frames: 305856512. Throughput: 0: 40678.8. Samples: 305968640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 00:08:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:08:22,232][12862] Signal inference workers to stop experience collection... (4300 times) [2024-06-18 00:08:22,286][12883] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-06-18 00:08:22,302][12862] Signal inference workers to resume experience collection... 
(4300 times) [2024-06-18 00:08:22,302][12883] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-06-18 00:08:22,453][12883] Updated weights for policy 0, policy_version 18670 (0.0051) [2024-06-18 00:08:26,724][12883] Updated weights for policy 0, policy_version 18680 (0.0047) [2024-06-18 00:08:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 40654.5). Total num frames: 306053120. Throughput: 0: 40876.0. Samples: 306218120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:08:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:08:30,351][12883] Updated weights for policy 0, policy_version 18690 (0.0027) [2024-06-18 00:08:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40142.3, 300 sec: 40487.9). Total num frames: 306233344. Throughput: 0: 40762.3. Samples: 306340440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:08:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:08:34,766][12883] Updated weights for policy 0, policy_version 18700 (0.0048) [2024-06-18 00:08:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.2, 300 sec: 40765.6). Total num frames: 306479104. Throughput: 0: 40831.1. Samples: 306582780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 00:08:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:08:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018706_306479104.pth... [2024-06-18 00:08:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018109_296697856.pth [2024-06-18 00:08:38,789][12883] Updated weights for policy 0, policy_version 18710 (0.0049) [2024-06-18 00:08:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 40686.9, 300 sec: 40654.5). Total num frames: 306675712. Throughput: 0: 40980.9. Samples: 306831100. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 00:08:41,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:08:42,522][12883] Updated weights for policy 0, policy_version 18720 (0.0036) [2024-06-18 00:08:46,805][12883] Updated weights for policy 0, policy_version 18730 (0.0045) [2024-06-18 00:08:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 40654.6). Total num frames: 306872320. Throughput: 0: 40609.8. Samples: 306948420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 00:08:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:08:50,762][12883] Updated weights for policy 0, policy_version 18740 (0.0047) [2024-06-18 00:08:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 307085312. Throughput: 0: 40861.4. Samples: 307197180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 00:08:51,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 00:08:54,827][12883] Updated weights for policy 0, policy_version 18750 (0.0038) [2024-06-18 00:08:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 307298304. Throughput: 0: 40821.7. Samples: 307440440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 00:08:56,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:08:58,830][12883] Updated weights for policy 0, policy_version 18760 (0.0038) [2024-06-18 00:09:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 40710.1). Total num frames: 307478528. Throughput: 0: 40797.8. Samples: 307559940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:09:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:09:03,018][12883] Updated weights for policy 0, policy_version 18770 (0.0038) [2024-06-18 00:09:06,788][12883] Updated weights for policy 0, policy_version 18780 (0.0038) [2024-06-18 00:09:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 40821.2). 
Total num frames: 307707904. Throughput: 0: 40913.8. Samples: 307809760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:09:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:09:11,331][12883] Updated weights for policy 0, policy_version 18790 (0.0041) [2024-06-18 00:09:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40413.9, 300 sec: 40765.7). Total num frames: 307888128. Throughput: 0: 40712.1. Samples: 308050160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:09:11,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:09:14,777][12883] Updated weights for policy 0, policy_version 18800 (0.0046) [2024-06-18 00:09:16,996][12645] Fps is (10 sec: 39312.9, 60 sec: 41231.6, 300 sec: 40765.3). Total num frames: 308101120. Throughput: 0: 40610.9. Samples: 308168020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:09:16,996][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:09:19,427][12883] Updated weights for policy 0, policy_version 18810 (0.0040) [2024-06-18 00:09:21,996][12645] Fps is (10 sec: 40951.5, 60 sec: 40685.6, 300 sec: 40654.3). Total num frames: 308297728. Throughput: 0: 40717.4. Samples: 308415140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:09:21,996][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:09:22,650][12883] Updated weights for policy 0, policy_version 18820 (0.0034) [2024-06-18 00:09:26,994][12645] Fps is (10 sec: 37691.3, 60 sec: 40413.8, 300 sec: 40599.3). Total num frames: 308477952. Throughput: 0: 40609.3. Samples: 308658520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:09:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:09:27,335][12883] Updated weights for policy 0, policy_version 18830 (0.0043) [2024-06-18 00:09:28,742][12862] Signal inference workers to stop experience collection... (4350 times) [2024-06-18 00:09:28,742][12862] Signal inference workers to resume experience collection... 
(4350 times) [2024-06-18 00:09:28,758][12883] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-06-18 00:09:28,759][12883] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-06-18 00:09:30,352][12883] Updated weights for policy 0, policy_version 18840 (0.0044) [2024-06-18 00:09:31,994][12645] Fps is (10 sec: 39329.5, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 308690944. Throughput: 0: 40689.0. Samples: 308779420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:09:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:09:35,411][12883] Updated weights for policy 0, policy_version 18850 (0.0040) [2024-06-18 00:09:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 308903936. Throughput: 0: 40588.5. Samples: 309023660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:09:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:09:38,837][12883] Updated weights for policy 0, policy_version 18860 (0.0027) [2024-06-18 00:09:41,996][12645] Fps is (10 sec: 39312.4, 60 sec: 40139.3, 300 sec: 40544.0). Total num frames: 309084160. Throughput: 0: 40509.1. Samples: 309263440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 00:09:41,997][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:09:43,347][12883] Updated weights for policy 0, policy_version 18870 (0.0031) [2024-06-18 00:09:46,727][12883] Updated weights for policy 0, policy_version 18880 (0.0043) [2024-06-18 00:09:46,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 309329920. Throughput: 0: 40586.6. Samples: 309386340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 00:09:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:09:51,347][12883] Updated weights for policy 0, policy_version 18890 (0.0028) [2024-06-18 00:09:51,994][12645] Fps is (10 sec: 42608.1, 60 sec: 40413.8, 300 sec: 40599.0). 
Total num frames: 309510144. Throughput: 0: 40480.9. Samples: 309631400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 00:09:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:09:55,088][12883] Updated weights for policy 0, policy_version 18900 (0.0045) [2024-06-18 00:09:56,994][12645] Fps is (10 sec: 36045.3, 60 sec: 39867.7, 300 sec: 40543.5). Total num frames: 309690368. Throughput: 0: 40459.5. Samples: 309870840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 00:09:56,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:09:59,343][12883] Updated weights for policy 0, policy_version 18910 (0.0043) [2024-06-18 00:10:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 309919744. Throughput: 0: 40420.7. Samples: 309986860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 00:10:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:10:03,088][12883] Updated weights for policy 0, policy_version 18920 (0.0030) [2024-06-18 00:10:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 39867.8, 300 sec: 40543.5). Total num frames: 310099968. Throughput: 0: 40512.1. Samples: 310238100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 00:10:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:10:07,993][12883] Updated weights for policy 0, policy_version 18930 (0.0042) [2024-06-18 00:10:11,281][12883] Updated weights for policy 0, policy_version 18940 (0.0024) [2024-06-18 00:10:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 310329344. Throughput: 0: 40320.5. Samples: 310472940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 00:10:11,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:10:15,926][12883] Updated weights for policy 0, policy_version 18950 (0.0040) [2024-06-18 00:10:16,994][12645] Fps is (10 sec: 44235.9, 60 sec: 40688.4, 300 sec: 40710.1). 
Total num frames: 310542336. Throughput: 0: 40568.3. Samples: 310605000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 00:10:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:10:19,195][12883] Updated weights for policy 0, policy_version 18960 (0.0038) [2024-06-18 00:10:21,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39869.0, 300 sec: 40487.9). Total num frames: 310689792. Throughput: 0: 40550.9. Samples: 310848460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 00:10:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:10:23,961][12883] Updated weights for policy 0, policy_version 18970 (0.0051) [2024-06-18 00:10:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 310935552. Throughput: 0: 40519.0. Samples: 311086700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 00:10:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:10:27,508][12883] Updated weights for policy 0, policy_version 18980 (0.0049) [2024-06-18 00:10:31,891][12883] Updated weights for policy 0, policy_version 18990 (0.0034) [2024-06-18 00:10:31,994][12645] Fps is (10 sec: 44237.8, 60 sec: 40687.0, 300 sec: 40599.0). Total num frames: 311132160. Throughput: 0: 40740.6. Samples: 311219660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 00:10:31,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:10:35,818][12883] Updated weights for policy 0, policy_version 19000 (0.0051) [2024-06-18 00:10:36,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40140.7, 300 sec: 40599.0). Total num frames: 311312384. Throughput: 0: 40668.8. Samples: 311461500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 00:10:36,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:10:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019001_311312384.pth... 
[2024-06-18 00:10:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018407_301580288.pth [2024-06-18 00:10:39,746][12883] Updated weights for policy 0, policy_version 19010 (0.0026) [2024-06-18 00:10:40,598][12862] Signal inference workers to stop experience collection... (4400 times) [2024-06-18 00:10:40,646][12883] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-06-18 00:10:40,654][12862] Signal inference workers to resume experience collection... (4400 times) [2024-06-18 00:10:40,661][12883] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-06-18 00:10:41,995][12645] Fps is (10 sec: 42591.3, 60 sec: 41233.6, 300 sec: 40709.9). Total num frames: 311558144. Throughput: 0: 40659.4. Samples: 311700580. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 00:10:41,996][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:10:43,783][12883] Updated weights for policy 0, policy_version 19020 (0.0042) [2024-06-18 00:10:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 40140.9, 300 sec: 40599.0). Total num frames: 311738368. Throughput: 0: 40989.8. Samples: 311831400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 00:10:46,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:10:47,587][12883] Updated weights for policy 0, policy_version 19030 (0.0030) [2024-06-18 00:10:51,994][12645] Fps is (10 sec: 37689.1, 60 sec: 40413.9, 300 sec: 40599.0). Total num frames: 311934976. Throughput: 0: 40851.5. Samples: 312076420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 00:10:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:10:52,125][12883] Updated weights for policy 0, policy_version 19040 (0.0046) [2024-06-18 00:10:55,584][12883] Updated weights for policy 0, policy_version 19050 (0.0047) [2024-06-18 00:10:56,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42052.2, 300 sec: 40876.7). Total num frames: 312213504. Throughput: 0: 40868.9. 
Samples: 312312040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 00:10:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:10:59,975][12883] Updated weights for policy 0, policy_version 19060 (0.0034) [2024-06-18 00:11:01,996][12645] Fps is (10 sec: 42588.7, 60 sec: 40685.3, 300 sec: 40654.2). Total num frames: 312360960. Throughput: 0: 40934.9. Samples: 312447160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 00:11:01,997][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 00:11:03,523][12883] Updated weights for policy 0, policy_version 19070 (0.0032) [2024-06-18 00:11:06,994][12645] Fps is (10 sec: 36044.9, 60 sec: 41233.0, 300 sec: 40821.2). Total num frames: 312573952. Throughput: 0: 40782.3. Samples: 312683660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 00:11:06,994][12645] Avg episode reward: [(0, '0.038')] [2024-06-18 00:11:07,004][12862] Saving new best policy, reward=0.038! [2024-06-18 00:11:07,773][12883] Updated weights for policy 0, policy_version 19080 (0.0035) [2024-06-18 00:11:11,444][12883] Updated weights for policy 0, policy_version 19090 (0.0040) [2024-06-18 00:11:11,994][12645] Fps is (10 sec: 42608.6, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 312786944. Throughput: 0: 40994.3. Samples: 312931440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 00:11:11,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 00:11:16,327][12883] Updated weights for policy 0, policy_version 19100 (0.0028) [2024-06-18 00:11:16,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40140.9, 300 sec: 40599.7). Total num frames: 312950784. Throughput: 0: 40824.0. Samples: 313056740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-18 00:11:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:11:19,436][12883] Updated weights for policy 0, policy_version 19110 (0.0046) [2024-06-18 00:11:21,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 40710.1). 
Total num frames: 313196544. Throughput: 0: 40694.2. Samples: 313292740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-18 00:11:21,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:11:24,565][12883] Updated weights for policy 0, policy_version 19120 (0.0035) [2024-06-18 00:11:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 313376768. Throughput: 0: 41033.0. Samples: 313547000. Policy #0 lag: (min: 0.0, avg: 12.9, max: 26.0) [2024-06-18 00:11:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:11:27,442][12883] Updated weights for policy 0, policy_version 19130 (0.0029) [2024-06-18 00:11:31,994][12645] Fps is (10 sec: 36045.2, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 313556992. Throughput: 0: 40598.2. Samples: 313658320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 00:11:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:11:32,640][12883] Updated weights for policy 0, policy_version 19140 (0.0041) [2024-06-18 00:11:35,465][12883] Updated weights for policy 0, policy_version 19150 (0.0037) [2024-06-18 00:11:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 40932.2). Total num frames: 313835520. Throughput: 0: 40753.3. Samples: 313910320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 00:11:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:11:40,713][12883] Updated weights for policy 0, policy_version 19160 (0.0037) [2024-06-18 00:11:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40414.9, 300 sec: 40710.1). Total num frames: 313982976. Throughput: 0: 41181.7. Samples: 314165220. Policy #0 lag: (min: 0.0, avg: 7.3, max: 21.0) [2024-06-18 00:11:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:11:43,606][12883] Updated weights for policy 0, policy_version 19170 (0.0037) [2024-06-18 00:11:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41233.0, 300 sec: 40765.9). 
Total num frames: 314212352. Throughput: 0: 40559.0. Samples: 314272220. Policy #0 lag: (min: 0.0, avg: 7.3, max: 21.0) [2024-06-18 00:11:46,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:11:48,631][12883] Updated weights for policy 0, policy_version 19180 (0.0041) [2024-06-18 00:11:51,434][12883] Updated weights for policy 0, policy_version 19190 (0.0032) [2024-06-18 00:11:51,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41506.2, 300 sec: 40876.7). Total num frames: 314425344. Throughput: 0: 40957.4. Samples: 314526740. Policy #0 lag: (min: 0.0, avg: 7.3, max: 21.0) [2024-06-18 00:11:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:11:52,201][12862] Signal inference workers to stop experience collection... (4450 times) [2024-06-18 00:11:52,250][12883] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-06-18 00:11:52,259][12862] Signal inference workers to resume experience collection... (4450 times) [2024-06-18 00:11:52,264][12883] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-06-18 00:11:56,495][12883] Updated weights for policy 0, policy_version 19200 (0.0040) [2024-06-18 00:11:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 39594.7, 300 sec: 40599.0). Total num frames: 314589184. Throughput: 0: 41084.8. Samples: 314780260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:11:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:11:59,294][12883] Updated weights for policy 0, policy_version 19210 (0.0039) [2024-06-18 00:12:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40961.5, 300 sec: 40765.6). Total num frames: 314818560. Throughput: 0: 40861.7. Samples: 314895520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:12:01,995][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 00:12:04,341][12883] Updated weights for policy 0, policy_version 19220 (0.0038) [2024-06-18 00:12:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 315031552. Throughput: 0: 41195.6. Samples: 315146540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:12:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:12:07,505][12883] Updated weights for policy 0, policy_version 19230 (0.0031) [2024-06-18 00:12:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 315211776. Throughput: 0: 41041.3. Samples: 315393860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 00:12:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:12:12,295][12883] Updated weights for policy 0, policy_version 19240 (0.0031) [2024-06-18 00:12:15,455][12883] Updated weights for policy 0, policy_version 19250 (0.0030) [2024-06-18 00:12:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 40876.7). Total num frames: 315441152. Throughput: 0: 41164.8. Samples: 315510740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 00:12:16,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 00:12:20,162][12883] Updated weights for policy 0, policy_version 19260 (0.0033) [2024-06-18 00:12:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 315637760. Throughput: 0: 41074.1. Samples: 315758660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 00:12:21,995][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:12:23,185][12883] Updated weights for policy 0, policy_version 19270 (0.0041) [2024-06-18 00:12:26,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40686.9, 300 sec: 40654.8). Total num frames: 315817984. Throughput: 0: 40867.6. Samples: 316004260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:12:26,996][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:12:28,426][12883] Updated weights for policy 0, policy_version 19280 (0.0044) [2024-06-18 00:12:31,028][12883] Updated weights for policy 0, policy_version 19290 (0.0037) [2024-06-18 00:12:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 40876.7). Total num frames: 316063744. Throughput: 0: 41123.6. Samples: 316122780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:12:31,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:12:36,346][12883] Updated weights for policy 0, policy_version 19300 (0.0033) [2024-06-18 00:12:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 39867.8, 300 sec: 40654.5). Total num frames: 316227584. Throughput: 0: 40929.7. Samples: 316368580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 00:12:36,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:12:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019302_316243968.pth... [2024-06-18 00:12:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000018706_306479104.pth [2024-06-18 00:12:39,455][12883] Updated weights for policy 0, policy_version 19310 (0.0032) [2024-06-18 00:12:41,994][12645] Fps is (10 sec: 37682.2, 60 sec: 40959.9, 300 sec: 40710.1). Total num frames: 316440576. Throughput: 0: 40751.9. Samples: 316614100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 00:12:41,995][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:12:44,195][12883] Updated weights for policy 0, policy_version 19320 (0.0036) [2024-06-18 00:12:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 316669952. Throughput: 0: 40865.5. Samples: 316734460. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 00:12:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:12:47,367][12883] Updated weights for policy 0, policy_version 19330 (0.0025) [2024-06-18 00:12:51,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 316850176. Throughput: 0: 40842.2. Samples: 316984440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:12:51,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:12:52,081][12883] Updated weights for policy 0, policy_version 19340 (0.0041) [2024-06-18 00:12:55,646][12883] Updated weights for policy 0, policy_version 19350 (0.0040) [2024-06-18 00:12:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 40932.2). Total num frames: 317079552. Throughput: 0: 40528.9. Samples: 317217660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:12:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:13:00,085][12883] Updated weights for policy 0, policy_version 19360 (0.0033) [2024-06-18 00:13:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 317276160. Throughput: 0: 40843.6. Samples: 317348700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:13:01,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:13:03,454][12883] Updated weights for policy 0, policy_version 19370 (0.0036) [2024-06-18 00:13:06,994][12645] Fps is (10 sec: 36044.6, 60 sec: 40140.8, 300 sec: 40599.0). Total num frames: 317440000. Throughput: 0: 40649.0. Samples: 317587860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 00:13:06,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 00:13:08,503][12883] Updated weights for policy 0, policy_version 19380 (0.0043) [2024-06-18 00:13:11,358][12883] Updated weights for policy 0, policy_version 19390 (0.0037) [2024-06-18 00:13:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 40876.7). 
Total num frames: 317685760. Throughput: 0: 40447.2. Samples: 317824380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 00:13:11,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:13:16,591][12883] Updated weights for policy 0, policy_version 19400 (0.0031) [2024-06-18 00:13:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 317865984. Throughput: 0: 40716.4. Samples: 317955020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 00:13:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:13:20,074][12883] Updated weights for policy 0, policy_version 19410 (0.0044) [2024-06-18 00:13:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 318062592. Throughput: 0: 40568.8. Samples: 318194180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:13:21,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 00:13:24,559][12883] Updated weights for policy 0, policy_version 19420 (0.0043) [2024-06-18 00:13:27,000][12645] Fps is (10 sec: 42571.5, 60 sec: 41228.8, 300 sec: 40875.8). Total num frames: 318291968. Throughput: 0: 40465.2. Samples: 318435280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:13:27,000][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:13:28,010][12883] Updated weights for policy 0, policy_version 19430 (0.0040) [2024-06-18 00:13:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.7, 300 sec: 40654.5). Total num frames: 318472192. Throughput: 0: 40606.1. Samples: 318561740. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-18 00:13:31,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:13:32,960][12883] Updated weights for policy 0, policy_version 19440 (0.0034) [2024-06-18 00:13:32,996][12862] Signal inference workers to stop experience collection... 
(4500 times) [2024-06-18 00:13:33,045][12883] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-06-18 00:13:33,110][12862] Signal inference workers to resume experience collection... (4500 times) [2024-06-18 00:13:33,110][12883] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-06-18 00:13:35,931][12883] Updated weights for policy 0, policy_version 19450 (0.0033) [2024-06-18 00:13:36,994][12645] Fps is (10 sec: 39346.3, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 318685184. Throughput: 0: 40346.2. Samples: 318800020. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-18 00:13:36,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:13:40,719][12883] Updated weights for policy 0, policy_version 19460 (0.0030) [2024-06-18 00:13:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.2, 300 sec: 40821.2). Total num frames: 318914560. Throughput: 0: 40709.3. Samples: 319049580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 19.0) [2024-06-18 00:13:41,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:13:43,954][12883] Updated weights for policy 0, policy_version 19470 (0.0027) [2024-06-18 00:13:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 319111168. Throughput: 0: 40528.9. Samples: 319172500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 00:13:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:13:48,746][12883] Updated weights for policy 0, policy_version 19480 (0.0050) [2024-06-18 00:13:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 319307776. Throughput: 0: 40616.2. Samples: 319415580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 00:13:51,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:13:52,011][12883] Updated weights for policy 0, policy_version 19490 (0.0042) [2024-06-18 00:13:56,703][12883] Updated weights for policy 0, policy_version 19500 (0.0035) [2024-06-18 00:13:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 319504384. Throughput: 0: 41022.2. Samples: 319670380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 00:13:56,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 00:14:00,252][12883] Updated weights for policy 0, policy_version 19510 (0.0043) [2024-06-18 00:14:01,994][12645] Fps is (10 sec: 39320.6, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 319700992. Throughput: 0: 40795.0. Samples: 319790800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-18 00:14:01,995][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:14:04,362][12883] Updated weights for policy 0, policy_version 19520 (0.0035) [2024-06-18 00:14:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 40765.6). Total num frames: 319913984. Throughput: 0: 40848.1. Samples: 320032340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-18 00:14:06,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:14:08,597][12883] Updated weights for policy 0, policy_version 19530 (0.0034) [2024-06-18 00:14:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40686.9, 300 sec: 40765.9). Total num frames: 320126976. Throughput: 0: 40987.5. Samples: 320279460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 26.0) [2024-06-18 00:14:11,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:14:12,194][12883] Updated weights for policy 0, policy_version 19540 (0.0036) [2024-06-18 00:14:16,527][12883] Updated weights for policy 0, policy_version 19550 (0.0023) [2024-06-18 00:14:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40765.9). 
Total num frames: 320323584. Throughput: 0: 40907.6. Samples: 320402580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:14:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:14:20,223][12883] Updated weights for policy 0, policy_version 19560 (0.0029) [2024-06-18 00:14:21,995][12645] Fps is (10 sec: 39315.6, 60 sec: 40959.0, 300 sec: 40821.0). Total num frames: 320520192. Throughput: 0: 41097.7. Samples: 320649480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:14:21,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:14:24,527][12883] Updated weights for policy 0, policy_version 19570 (0.0047) [2024-06-18 00:14:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40691.1, 300 sec: 40821.1). Total num frames: 320733184. Throughput: 0: 40969.3. Samples: 320893200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 00:14:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:14:28,212][12883] Updated weights for policy 0, policy_version 19580 (0.0035) [2024-06-18 00:14:31,994][12645] Fps is (10 sec: 40965.9, 60 sec: 40960.0, 300 sec: 40765.6). Total num frames: 320929792. Throughput: 0: 40967.4. Samples: 321016040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 00:14:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:14:32,522][12883] Updated weights for policy 0, policy_version 19590 (0.0034) [2024-06-18 00:14:36,284][12883] Updated weights for policy 0, policy_version 19600 (0.0030) [2024-06-18 00:14:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 40932.5). Total num frames: 321159168. Throughput: 0: 41000.6. Samples: 321260620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 00:14:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:14:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019602_321159168.pth... 
[2024-06-18 00:14:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019001_311312384.pth [2024-06-18 00:14:40,293][12883] Updated weights for policy 0, policy_version 19610 (0.0045) [2024-06-18 00:14:41,996][12645] Fps is (10 sec: 42589.5, 60 sec: 40685.5, 300 sec: 40765.3). Total num frames: 321355776. Throughput: 0: 40833.6. Samples: 321507980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 00:14:41,996][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:14:44,020][12883] Updated weights for policy 0, policy_version 19620 (0.0034) [2024-06-18 00:14:46,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 321536000. Throughput: 0: 40833.9. Samples: 321628320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 00:14:46,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:14:48,085][12883] Updated weights for policy 0, policy_version 19630 (0.0036) [2024-06-18 00:14:51,994][12645] Fps is (10 sec: 40969.3, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 321765376. Throughput: 0: 41046.7. Samples: 321879440. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 00:14:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:14:52,016][12883] Updated weights for policy 0, policy_version 19640 (0.0036) [2024-06-18 00:14:56,061][12883] Updated weights for policy 0, policy_version 19650 (0.0046) [2024-06-18 00:14:56,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 321961984. Throughput: 0: 40913.3. Samples: 322120560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 00:14:56,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:15:00,056][12883] Updated weights for policy 0, policy_version 19660 (0.0038) [2024-06-18 00:15:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 322158592. Throughput: 0: 40912.0. Samples: 322243620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 00:15:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:15:04,254][12883] Updated weights for policy 0, policy_version 19670 (0.0034) [2024-06-18 00:15:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 322371584. Throughput: 0: 40881.3. Samples: 322489080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 00:15:06,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:15:08,537][12883] Updated weights for policy 0, policy_version 19680 (0.0044) [2024-06-18 00:15:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40687.0, 300 sec: 40765.6). Total num frames: 322568192. Throughput: 0: 40843.7. Samples: 322731160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 00:15:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:15:12,334][12883] Updated weights for policy 0, policy_version 19690 (0.0041) [2024-06-18 00:15:16,597][12883] Updated weights for policy 0, policy_version 19700 (0.0034) [2024-06-18 00:15:16,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40687.0, 300 sec: 40932.3). Total num frames: 322764800. Throughput: 0: 40793.1. Samples: 322851720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 00:15:16,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 00:15:20,390][12883] Updated weights for policy 0, policy_version 19710 (0.0029) [2024-06-18 00:15:21,999][12645] Fps is (10 sec: 40939.5, 60 sec: 40957.7, 300 sec: 40820.5). Total num frames: 322977792. Throughput: 0: 40738.7. Samples: 323094060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 00:15:21,999][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:15:22,559][12862] Signal inference workers to stop experience collection... (4550 times) [2024-06-18 00:15:22,559][12862] Signal inference workers to resume experience collection... 
(4550 times) [2024-06-18 00:15:22,577][12883] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-06-18 00:15:22,577][12883] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-06-18 00:15:24,693][12883] Updated weights for policy 0, policy_version 19720 (0.0035) [2024-06-18 00:15:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 323174400. Throughput: 0: 40766.8. Samples: 323342400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:15:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:15:28,671][12883] Updated weights for policy 0, policy_version 19730 (0.0032) [2024-06-18 00:15:31,994][12645] Fps is (10 sec: 42619.7, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 323403776. Throughput: 0: 40734.2. Samples: 323461360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:15:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:15:32,640][12883] Updated weights for policy 0, policy_version 19740 (0.0030) [2024-06-18 00:15:36,503][12883] Updated weights for policy 0, policy_version 19750 (0.0045) [2024-06-18 00:15:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40414.0, 300 sec: 40765.8). Total num frames: 323584000. Throughput: 0: 40716.8. Samples: 323711700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 00:15:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:15:40,887][12883] Updated weights for policy 0, policy_version 19760 (0.0039) [2024-06-18 00:15:41,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40688.3, 300 sec: 40876.7). Total num frames: 323796992. Throughput: 0: 40662.7. Samples: 323950380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 00:15:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:15:44,585][12883] Updated weights for policy 0, policy_version 19770 (0.0043) [2024-06-18 00:15:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.8, 300 sec: 40821.1). 
Total num frames: 323977216. Throughput: 0: 40530.5. Samples: 324067500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 00:15:46,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:15:49,092][12883] Updated weights for policy 0, policy_version 19780 (0.0029) [2024-06-18 00:15:51,994][12645] Fps is (10 sec: 37683.8, 60 sec: 40140.8, 300 sec: 40543.5). Total num frames: 324173824. Throughput: 0: 40447.3. Samples: 324309200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 00:15:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:15:52,709][12883] Updated weights for policy 0, policy_version 19790 (0.0047) [2024-06-18 00:15:56,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40414.0, 300 sec: 40765.9). Total num frames: 324386816. Throughput: 0: 40609.8. Samples: 324558600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 00:15:56,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:15:57,097][12883] Updated weights for policy 0, policy_version 19800 (0.0038) [2024-06-18 00:16:01,076][12883] Updated weights for policy 0, policy_version 19810 (0.0040) [2024-06-18 00:16:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 324616192. Throughput: 0: 40573.2. Samples: 324677520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 00:16:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:16:05,045][12883] Updated weights for policy 0, policy_version 19820 (0.0042) [2024-06-18 00:16:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40687.0, 300 sec: 40765.6). Total num frames: 324812800. Throughput: 0: 40692.0. Samples: 324925000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:16:06,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:16:09,002][12883] Updated weights for policy 0, policy_version 19830 (0.0040) [2024-06-18 00:16:11,997][12645] Fps is (10 sec: 37671.1, 60 sec: 40411.6, 300 sec: 40820.7). 
Total num frames: 324993024. Throughput: 0: 40545.1. Samples: 325167060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:16:11,998][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:16:12,978][12883] Updated weights for policy 0, policy_version 19840 (0.0051) [2024-06-18 00:16:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.8, 300 sec: 40654.6). Total num frames: 325189632. Throughput: 0: 40523.1. Samples: 325284900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:16:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:16:17,286][12883] Updated weights for policy 0, policy_version 19850 (0.0042) [2024-06-18 00:16:21,101][12883] Updated weights for policy 0, policy_version 19860 (0.0035) [2024-06-18 00:16:21,994][12645] Fps is (10 sec: 44251.4, 60 sec: 40963.4, 300 sec: 40876.7). Total num frames: 325435392. Throughput: 0: 40458.2. Samples: 325532320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-18 00:16:21,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:16:25,477][12883] Updated weights for policy 0, policy_version 19870 (0.0042) [2024-06-18 00:16:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 40876.7). Total num frames: 325615616. Throughput: 0: 40416.2. Samples: 325769100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 19.0) [2024-06-18 00:16:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:16:29,230][12883] Updated weights for policy 0, policy_version 19880 (0.0047) [2024-06-18 00:16:31,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40140.7, 300 sec: 40599.0). Total num frames: 325812224. Throughput: 0: 40447.6. Samples: 325887640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 00:16:31,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:16:33,483][12883] Updated weights for policy 0, policy_version 19890 (0.0034) [2024-06-18 00:16:36,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40413.8, 300 sec: 40765.6). Total num frames: 326008832. 
Throughput: 0: 40534.9. Samples: 326133280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 00:16:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:16:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019898_326008832.pth... [2024-06-18 00:16:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019302_316243968.pth [2024-06-18 00:16:37,605][12883] Updated weights for policy 0, policy_version 19900 (0.0028) [2024-06-18 00:16:41,519][12883] Updated weights for policy 0, policy_version 19910 (0.0031) [2024-06-18 00:16:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 326221824. Throughput: 0: 40448.8. Samples: 326378800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 00:16:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:16:45,499][12883] Updated weights for policy 0, policy_version 19920 (0.0038) [2024-06-18 00:16:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 326434816. Throughput: 0: 40589.0. Samples: 326504020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:16:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:16:49,475][12883] Updated weights for policy 0, policy_version 19930 (0.0031) [2024-06-18 00:16:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 326615040. Throughput: 0: 40516.8. Samples: 326748260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:16:51,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:16:53,614][12883] Updated weights for policy 0, policy_version 19940 (0.0032) [2024-06-18 00:16:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40686.8, 300 sec: 40710.1). Total num frames: 326828032. Throughput: 0: 40455.4. Samples: 326987420. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:16:56,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:16:57,505][12883] Updated weights for policy 0, policy_version 19950 (0.0034) [2024-06-18 00:17:01,755][12883] Updated weights for policy 0, policy_version 19960 (0.0038) [2024-06-18 00:17:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 327041024. Throughput: 0: 40601.7. Samples: 327111980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:17:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:17:05,825][12883] Updated weights for policy 0, policy_version 19970 (0.0033) [2024-06-18 00:17:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.9, 300 sec: 40710.1). Total num frames: 327221248. Throughput: 0: 40473.9. Samples: 327353640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:17:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:17:09,585][12883] Updated weights for policy 0, policy_version 19980 (0.0042) [2024-06-18 00:17:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40962.2, 300 sec: 40710.1). Total num frames: 327450624. Throughput: 0: 40558.6. Samples: 327594240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:17:11,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:17:14,010][12862] Signal inference workers to stop experience collection... (4600 times) [2024-06-18 00:17:14,020][12862] Signal inference workers to resume experience collection... (4600 times) [2024-06-18 00:17:14,024][12883] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-06-18 00:17:14,034][12883] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-06-18 00:17:14,169][12883] Updated weights for policy 0, policy_version 19990 (0.0030) [2024-06-18 00:17:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 327647232. Throughput: 0: 40753.5. 
Samples: 327721540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 00:17:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:17:17,499][12883] Updated weights for policy 0, policy_version 20000 (0.0023) [2024-06-18 00:17:21,989][12883] Updated weights for policy 0, policy_version 20010 (0.0038) [2024-06-18 00:17:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.8, 300 sec: 40765.6). Total num frames: 327843840. Throughput: 0: 40761.5. Samples: 327967540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 00:17:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:17:25,748][12883] Updated weights for policy 0, policy_version 20020 (0.0039) [2024-06-18 00:17:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 328073216. Throughput: 0: 40485.4. Samples: 328200640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 00:17:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:17:30,053][12883] Updated weights for policy 0, policy_version 20030 (0.0032) [2024-06-18 00:17:31,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40140.9, 300 sec: 40654.5). Total num frames: 328220672. Throughput: 0: 40626.3. Samples: 328332200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:17:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:17:33,745][12883] Updated weights for policy 0, policy_version 20040 (0.0037) [2024-06-18 00:17:36,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 328450048. Throughput: 0: 40378.3. Samples: 328565280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:17:36,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:17:37,930][12883] Updated weights for policy 0, policy_version 20050 (0.0028) [2024-06-18 00:17:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 40686.9, 300 sec: 40654.5). Total num frames: 328663040. Throughput: 0: 40734.3. Samples: 328820460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:17:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:17:41,995][12883] Updated weights for policy 0, policy_version 20060 (0.0033) [2024-06-18 00:17:45,833][12883] Updated weights for policy 0, policy_version 20070 (0.0037) [2024-06-18 00:17:46,996][12645] Fps is (10 sec: 39312.9, 60 sec: 40139.3, 300 sec: 40654.2). Total num frames: 328843264. Throughput: 0: 40491.8. Samples: 328934200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-18 00:17:46,997][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:17:50,138][12883] Updated weights for policy 0, policy_version 20080 (0.0039) [2024-06-18 00:17:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 40710.1). Total num frames: 329089024. Throughput: 0: 40506.1. Samples: 329176420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-18 00:17:51,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 00:17:53,808][12883] Updated weights for policy 0, policy_version 20090 (0.0038) [2024-06-18 00:17:56,994][12645] Fps is (10 sec: 39330.8, 60 sec: 40140.9, 300 sec: 40543.5). Total num frames: 329236480. Throughput: 0: 40798.8. Samples: 329430180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 00:17:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:17:58,229][12883] Updated weights for policy 0, policy_version 20100 (0.0033) [2024-06-18 00:18:01,826][12883] Updated weights for policy 0, policy_version 20110 (0.0045) [2024-06-18 00:18:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 329482240. Throughput: 0: 40428.9. Samples: 329540840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 00:18:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:18:06,287][12883] Updated weights for policy 0, policy_version 20120 (0.0041) [2024-06-18 00:18:06,994][12645] Fps is (10 sec: 45874.5, 60 sec: 41233.0, 300 sec: 40710.1). 
Total num frames: 329695232. Throughput: 0: 40574.1. Samples: 329793380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 00:18:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:18:09,953][12883] Updated weights for policy 0, policy_version 20130 (0.0042) [2024-06-18 00:18:11,519][12862] Signal inference workers to stop experience collection... (4650 times) [2024-06-18 00:18:11,519][12862] Signal inference workers to resume experience collection... (4650 times) [2024-06-18 00:18:11,559][12883] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-06-18 00:18:11,559][12883] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-06-18 00:18:11,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 329859072. Throughput: 0: 40879.0. Samples: 330040200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 00:18:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:18:14,337][12883] Updated weights for policy 0, policy_version 20140 (0.0039) [2024-06-18 00:18:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 40821.2). Total num frames: 330104832. Throughput: 0: 40614.5. Samples: 330159860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 00:18:16,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:18:17,877][12883] Updated weights for policy 0, policy_version 20150 (0.0035) [2024-06-18 00:18:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40413.9, 300 sec: 40599.9). Total num frames: 330268672. Throughput: 0: 40844.1. Samples: 330403260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 00:18:21,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 00:18:22,408][12883] Updated weights for policy 0, policy_version 20160 (0.0043) [2024-06-18 00:18:25,706][12883] Updated weights for policy 0, policy_version 20170 (0.0038) [2024-06-18 00:18:26,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40140.8, 300 sec: 40710.1). Total num frames: 330481664. Throughput: 0: 40705.0. Samples: 330652180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 00:18:26,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:18:30,374][12883] Updated weights for policy 0, policy_version 20180 (0.0053) [2024-06-18 00:18:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.0, 300 sec: 40765.6). Total num frames: 330711040. Throughput: 0: 40934.0. Samples: 330776140. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 00:18:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:18:33,473][12883] Updated weights for policy 0, policy_version 20190 (0.0036) [2024-06-18 00:18:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 330874880. Throughput: 0: 40908.5. Samples: 331017300. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 00:18:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:18:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020195_330874880.pth... [2024-06-18 00:18:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019602_321159168.pth [2024-06-18 00:18:38,548][12883] Updated weights for policy 0, policy_version 20200 (0.0042) [2024-06-18 00:18:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 40654.5). Total num frames: 331104256. Throughput: 0: 40534.1. Samples: 331254220. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-18 00:18:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:18:42,040][12883] Updated weights for policy 0, policy_version 20210 (0.0047) [2024-06-18 00:18:46,594][12883] Updated weights for policy 0, policy_version 20220 (0.0042) [2024-06-18 00:18:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41234.6, 300 sec: 40710.1). Total num frames: 331317248. Throughput: 0: 40929.8. Samples: 331382680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-18 00:18:46,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:18:49,976][12883] Updated weights for policy 0, policy_version 20230 (0.0043) [2024-06-18 00:18:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 331497472. Throughput: 0: 40559.5. Samples: 331618560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-18 00:18:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:18:54,582][12883] Updated weights for policy 0, policy_version 20240 (0.0046) [2024-06-18 00:18:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 40710.1). Total num frames: 331710464. Throughput: 0: 40379.6. Samples: 331857280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 00:18:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:18:58,577][12883] Updated weights for policy 0, policy_version 20250 (0.0040) [2024-06-18 00:19:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 39867.7, 300 sec: 40543.5). Total num frames: 331874304. Throughput: 0: 40541.4. Samples: 331984220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 00:19:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:19:02,983][12883] Updated weights for policy 0, policy_version 20260 (0.0043) [2024-06-18 00:19:06,639][12883] Updated weights for policy 0, policy_version 20270 (0.0033) [2024-06-18 00:19:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40140.8, 300 sec: 40599.0). 
Total num frames: 332103680. Throughput: 0: 40428.8. Samples: 332222560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:19:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:19:11,051][12883] Updated weights for policy 0, policy_version 20280 (0.0025) [2024-06-18 00:19:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 40960.1, 300 sec: 40654.5). Total num frames: 332316672. Throughput: 0: 40313.3. Samples: 332466280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:19:11,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:19:14,673][12883] Updated weights for policy 0, policy_version 20290 (0.0041) [2024-06-18 00:19:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40140.8, 300 sec: 40654.7). Total num frames: 332513280. Throughput: 0: 40364.0. Samples: 332592520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:19:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:19:18,863][12883] Updated weights for policy 0, policy_version 20300 (0.0026) [2024-06-18 00:19:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 40654.5). Total num frames: 332726272. Throughput: 0: 40407.9. Samples: 332835660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 00:19:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:19:22,557][12883] Updated weights for policy 0, policy_version 20310 (0.0035) [2024-06-18 00:19:26,890][12883] Updated weights for policy 0, policy_version 20320 (0.0040) [2024-06-18 00:19:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.7, 300 sec: 40654.5). Total num frames: 332922880. Throughput: 0: 40757.2. Samples: 333088300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 00:19:26,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:19:30,470][12883] Updated weights for policy 0, policy_version 20330 (0.0042) [2024-06-18 00:19:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40413.9, 300 sec: 40599.0). 
Total num frames: 333135872. Throughput: 0: 40516.5. Samples: 333205920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 00:19:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:19:34,774][12883] Updated weights for policy 0, policy_version 20340 (0.0022) [2024-06-18 00:19:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.0, 300 sec: 40654.8). Total num frames: 333348864. Throughput: 0: 40762.2. Samples: 333452860. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 00:19:36,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:19:38,744][12883] Updated weights for policy 0, policy_version 20350 (0.0039) [2024-06-18 00:19:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 333545472. Throughput: 0: 41028.4. Samples: 333703560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 00:19:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:19:42,818][12883] Updated weights for policy 0, policy_version 20360 (0.0033) [2024-06-18 00:19:45,401][12862] Signal inference workers to stop experience collection... (4700 times) [2024-06-18 00:19:45,401][12862] Signal inference workers to resume experience collection... (4700 times) [2024-06-18 00:19:45,414][12883] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-06-18 00:19:45,414][12883] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-06-18 00:19:46,703][12883] Updated weights for policy 0, policy_version 20370 (0.0046) [2024-06-18 00:19:47,000][12645] Fps is (10 sec: 39297.3, 60 sec: 40409.7, 300 sec: 40598.1). Total num frames: 333742080. Throughput: 0: 40785.5. Samples: 333819820. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 00:19:47,000][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:19:50,756][12883] Updated weights for policy 0, policy_version 20380 (0.0037) [2024-06-18 00:19:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40960.1, 300 sec: 40654.6). Total num frames: 333955072. Throughput: 0: 40997.0. Samples: 334067420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 00:19:51,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:19:54,745][12883] Updated weights for policy 0, policy_version 20390 (0.0048) [2024-06-18 00:19:57,000][12645] Fps is (10 sec: 40959.8, 60 sec: 40682.7, 300 sec: 40653.7). Total num frames: 334151680. Throughput: 0: 40921.8. Samples: 334308020. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 00:19:57,000][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:19:58,725][12883] Updated weights for policy 0, policy_version 20400 (0.0048) [2024-06-18 00:20:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 40654.6). Total num frames: 334364672. Throughput: 0: 40908.1. Samples: 334433380. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 00:20:01,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:20:02,602][12883] Updated weights for policy 0, policy_version 20410 (0.0049) [2024-06-18 00:20:06,849][12883] Updated weights for policy 0, policy_version 20420 (0.0040) [2024-06-18 00:20:06,994][12645] Fps is (10 sec: 40985.7, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 334561280. Throughput: 0: 41015.6. Samples: 334681360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 00:20:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:20:11,045][12883] Updated weights for policy 0, policy_version 20430 (0.0035) [2024-06-18 00:20:11,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40686.8, 300 sec: 40654.5). Total num frames: 334757888. Throughput: 0: 40795.6. Samples: 334924100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 00:20:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:20:14,978][12883] Updated weights for policy 0, policy_version 20440 (0.0028) [2024-06-18 00:20:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 41231.6, 300 sec: 40710.4). Total num frames: 334987264. Throughput: 0: 40925.5. Samples: 335047660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 00:20:16,997][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:20:18,937][12883] Updated weights for policy 0, policy_version 20450 (0.0035) [2024-06-18 00:20:21,994][12645] Fps is (10 sec: 42595.7, 60 sec: 40959.5, 300 sec: 40710.0). Total num frames: 335183872. Throughput: 0: 40842.0. Samples: 335290780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 00:20:21,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:20:23,033][12883] Updated weights for policy 0, policy_version 20460 (0.0034) [2024-06-18 00:20:26,994][12645] Fps is (10 sec: 37692.0, 60 sec: 40687.1, 300 sec: 40543.5). Total num frames: 335364096. Throughput: 0: 40834.4. Samples: 335541100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 00:20:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:20:27,103][12883] Updated weights for policy 0, policy_version 20470 (0.0037) [2024-06-18 00:20:31,082][12883] Updated weights for policy 0, policy_version 20480 (0.0034) [2024-06-18 00:20:31,994][12645] Fps is (10 sec: 40963.1, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 335593472. Throughput: 0: 40999.0. Samples: 335664520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 00:20:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:20:35,070][12883] Updated weights for policy 0, policy_version 20490 (0.0034) [2024-06-18 00:20:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 335806464. Throughput: 0: 40954.5. Samples: 335910380. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 00:20:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:20:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020496_335806464.pth... [2024-06-18 00:20:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000019898_326008832.pth [2024-06-18 00:20:38,909][12883] Updated weights for policy 0, policy_version 20500 (0.0043) [2024-06-18 00:20:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 335986688. Throughput: 0: 41070.2. Samples: 336155920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 00:20:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:20:42,981][12883] Updated weights for policy 0, policy_version 20510 (0.0038) [2024-06-18 00:20:46,942][12883] Updated weights for policy 0, policy_version 20520 (0.0031) [2024-06-18 00:20:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40964.2, 300 sec: 40765.6). Total num frames: 336199680. Throughput: 0: 40967.4. Samples: 336276920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 00:20:47,000][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:20:50,857][12883] Updated weights for policy 0, policy_version 20530 (0.0042) [2024-06-18 00:20:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 40765.6). Total num frames: 336412672. Throughput: 0: 41102.3. Samples: 336530960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 00:20:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:20:54,878][12883] Updated weights for policy 0, policy_version 20540 (0.0035) [2024-06-18 00:20:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40964.3, 300 sec: 40654.6). Total num frames: 336609280. Throughput: 0: 40952.6. Samples: 336766960. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 00:20:56,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:20:59,046][12883] Updated weights for policy 0, policy_version 20550 (0.0031) [2024-06-18 00:21:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 336822272. Throughput: 0: 40846.1. Samples: 336885640. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 00:21:01,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:21:02,847][12883] Updated weights for policy 0, policy_version 20560 (0.0037) [2024-06-18 00:21:06,962][12883] Updated weights for policy 0, policy_version 20570 (0.0033) [2024-06-18 00:21:06,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 40766.1). Total num frames: 337018880. Throughput: 0: 40909.9. Samples: 337131700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 00:21:06,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:21:10,629][12883] Updated weights for policy 0, policy_version 20580 (0.0034) [2024-06-18 00:21:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 337215488. Throughput: 0: 40764.3. Samples: 337375500. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 00:21:11,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:21:15,175][12883] Updated weights for policy 0, policy_version 20590 (0.0049) [2024-06-18 00:21:17,000][12645] Fps is (10 sec: 40934.7, 60 sec: 40684.2, 300 sec: 40653.7). Total num frames: 337428480. Throughput: 0: 40752.1. Samples: 337498620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 00:21:17,001][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:21:18,994][12883] Updated weights for policy 0, policy_version 20600 (0.0046) [2024-06-18 00:21:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40414.3, 300 sec: 40654.5). Total num frames: 337608704. Throughput: 0: 40716.8. Samples: 337742640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 00:21:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:21:23,177][12883] Updated weights for policy 0, policy_version 20610 (0.0030) [2024-06-18 00:21:26,994][12645] Fps is (10 sec: 39346.7, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 337821696. Throughput: 0: 40709.4. Samples: 337987840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 00:21:26,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:21:27,025][12883] Updated weights for policy 0, policy_version 20620 (0.0042) [2024-06-18 00:21:30,509][12862] Signal inference workers to stop experience collection... (4750 times) [2024-06-18 00:21:30,509][12862] Signal inference workers to resume experience collection... (4750 times) [2024-06-18 00:21:30,545][12883] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-06-18 00:21:30,545][12883] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-06-18 00:21:31,313][12883] Updated weights for policy 0, policy_version 20630 (0.0035) [2024-06-18 00:21:31,998][12645] Fps is (10 sec: 44217.4, 60 sec: 40956.9, 300 sec: 40820.5). Total num frames: 338051072. Throughput: 0: 40792.4. Samples: 338112760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 00:21:31,999][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:21:34,919][12883] Updated weights for policy 0, policy_version 20640 (0.0034) [2024-06-18 00:21:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 338231296. Throughput: 0: 40530.2. Samples: 338354820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 00:21:36,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:21:39,174][12883] Updated weights for policy 0, policy_version 20650 (0.0037) [2024-06-18 00:21:41,994][12645] Fps is (10 sec: 39339.9, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 338444288. Throughput: 0: 40698.7. 
Samples: 338598400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 00:21:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:21:43,047][12883] Updated weights for policy 0, policy_version 20660 (0.0030) [2024-06-18 00:21:46,983][12883] Updated weights for policy 0, policy_version 20670 (0.0035) [2024-06-18 00:21:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 338657280. Throughput: 0: 40933.3. Samples: 338727640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 00:21:46,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:21:51,101][12883] Updated weights for policy 0, policy_version 20680 (0.0043) [2024-06-18 00:21:51,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 338853888. Throughput: 0: 40915.5. Samples: 338972900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 00:21:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:21:55,000][12883] Updated weights for policy 0, policy_version 20690 (0.0036) [2024-06-18 00:21:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 339050496. Throughput: 0: 40838.8. Samples: 339213240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-18 00:21:56,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:21:58,961][12883] Updated weights for policy 0, policy_version 20700 (0.0042) [2024-06-18 00:22:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.7, 300 sec: 40710.1). Total num frames: 339230720. Throughput: 0: 40831.4. Samples: 339335780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-18 00:22:01,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:22:02,799][12883] Updated weights for policy 0, policy_version 20710 (0.0039) [2024-06-18 00:22:06,983][12883] Updated weights for policy 0, policy_version 20720 (0.0037) [2024-06-18 00:22:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 339476480. Throughput: 0: 40925.1. Samples: 339584260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 26.0) [2024-06-18 00:22:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:22:10,787][12883] Updated weights for policy 0, policy_version 20730 (0.0036) [2024-06-18 00:22:11,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41233.0, 300 sec: 40821.1). Total num frames: 339689472. Throughput: 0: 40797.2. Samples: 339823720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 00:22:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:22:15,002][12883] Updated weights for policy 0, policy_version 20740 (0.0028) [2024-06-18 00:22:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40691.2, 300 sec: 40765.6). Total num frames: 339869696. Throughput: 0: 40916.2. Samples: 339953800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 00:22:16,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:22:18,722][12883] Updated weights for policy 0, policy_version 20750 (0.0040) [2024-06-18 00:22:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.2, 300 sec: 40654.5). Total num frames: 340066304. Throughput: 0: 40844.4. Samples: 340192820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 00:22:21,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 00:22:23,591][12883] Updated weights for policy 0, policy_version 20760 (0.0034) [2024-06-18 00:22:26,669][12883] Updated weights for policy 0, policy_version 20770 (0.0036) [2024-06-18 00:22:26,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41504.5, 300 sec: 40987.4). 
Total num frames: 340312064. Throughput: 0: 40972.6. Samples: 340442260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 00:22:26,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:22:31,429][12883] Updated weights for policy 0, policy_version 20780 (0.0039) [2024-06-18 00:22:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40690.0, 300 sec: 40821.1). Total num frames: 340492288. Throughput: 0: 40886.1. Samples: 340567520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 00:22:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:22:34,677][12883] Updated weights for policy 0, policy_version 20790 (0.0042) [2024-06-18 00:22:36,994][12645] Fps is (10 sec: 39330.0, 60 sec: 41233.0, 300 sec: 40821.1). Total num frames: 340705280. Throughput: 0: 40839.1. Samples: 340810660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 00:22:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:22:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020795_340705280.pth... [2024-06-18 00:22:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020195_330874880.pth [2024-06-18 00:22:39,419][12883] Updated weights for policy 0, policy_version 20800 (0.0029) [2024-06-18 00:22:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40686.8, 300 sec: 40821.5). Total num frames: 340885504. Throughput: 0: 40947.9. Samples: 341055900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 00:22:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:22:42,548][12862] Signal inference workers to stop experience collection... (4800 times) [2024-06-18 00:22:42,596][12883] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-06-18 00:22:42,667][12862] Signal inference workers to resume experience collection... 
(4800 times) [2024-06-18 00:22:42,668][12883] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-06-18 00:22:42,799][12883] Updated weights for policy 0, policy_version 20810 (0.0044) [2024-06-18 00:22:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 341082112. Throughput: 0: 40770.6. Samples: 341170460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 00:22:46,995][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:22:47,530][12883] Updated weights for policy 0, policy_version 20820 (0.0043) [2024-06-18 00:22:51,209][12883] Updated weights for policy 0, policy_version 20830 (0.0031) [2024-06-18 00:22:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 341311488. Throughput: 0: 40924.4. Samples: 341425860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 00:22:51,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:22:55,336][12883] Updated weights for policy 0, policy_version 20840 (0.0035) [2024-06-18 00:22:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 341508096. Throughput: 0: 40814.2. Samples: 341660360. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-18 00:22:56,998][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:22:59,443][12883] Updated weights for policy 0, policy_version 20850 (0.0042) [2024-06-18 00:23:01,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 341688320. Throughput: 0: 40623.9. Samples: 341781880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-18 00:23:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:23:03,852][12883] Updated weights for policy 0, policy_version 20860 (0.0026) [2024-06-18 00:23:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.8, 300 sec: 40821.2). Total num frames: 341901312. Throughput: 0: 40593.2. Samples: 342019520. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-18 00:23:06,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:23:07,373][12883] Updated weights for policy 0, policy_version 20870 (0.0040) [2024-06-18 00:23:11,865][12883] Updated weights for policy 0, policy_version 20880 (0.0036) [2024-06-18 00:23:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 342097920. Throughput: 0: 40515.8. Samples: 342265380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 00:23:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:23:15,301][12883] Updated weights for policy 0, policy_version 20890 (0.0039) [2024-06-18 00:23:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 342327296. Throughput: 0: 40424.4. Samples: 342386620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 00:23:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:23:19,755][12883] Updated weights for policy 0, policy_version 20900 (0.0030) [2024-06-18 00:23:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 342523904. Throughput: 0: 40649.5. Samples: 342639880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 00:23:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:23:23,125][12883] Updated weights for policy 0, policy_version 20910 (0.0045) [2024-06-18 00:23:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40142.2, 300 sec: 40710.1). Total num frames: 342720512. Throughput: 0: 40635.1. Samples: 342884480. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-18 00:23:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:23:27,696][12883] Updated weights for policy 0, policy_version 20920 (0.0040) [2024-06-18 00:23:31,144][12883] Updated weights for policy 0, policy_version 20930 (0.0033) [2024-06-18 00:23:31,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40685.5, 300 sec: 40876.4). 
Total num frames: 342933504. Throughput: 0: 40715.3. Samples: 343002740. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-18 00:23:31,997][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:23:35,553][12883] Updated weights for policy 0, policy_version 20940 (0.0040) [2024-06-18 00:23:36,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40414.0, 300 sec: 40765.6). Total num frames: 343130112. Throughput: 0: 40434.3. Samples: 343245400. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-18 00:23:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:23:39,167][12883] Updated weights for policy 0, policy_version 20950 (0.0029) [2024-06-18 00:23:41,994][12645] Fps is (10 sec: 39330.5, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 343326720. Throughput: 0: 40797.4. Samples: 343496240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 00:23:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:23:43,491][12883] Updated weights for policy 0, policy_version 20960 (0.0033) [2024-06-18 00:23:47,000][12645] Fps is (10 sec: 42571.6, 60 sec: 41228.8, 300 sec: 40875.8). Total num frames: 343556096. Throughput: 0: 40852.6. Samples: 343620500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 00:23:47,001][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:23:47,178][12883] Updated weights for policy 0, policy_version 20970 (0.0037) [2024-06-18 00:23:51,426][12883] Updated weights for policy 0, policy_version 20980 (0.0025) [2024-06-18 00:23:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 343736320. Throughput: 0: 41070.4. Samples: 343867680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 00:23:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:23:55,170][12883] Updated weights for policy 0, policy_version 20990 (0.0052) [2024-06-18 00:23:56,994][12645] Fps is (10 sec: 39346.0, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 343949312. 
Throughput: 0: 40846.6. Samples: 344103480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 00:23:56,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:23:59,525][12883] Updated weights for policy 0, policy_version 21000 (0.0049) [2024-06-18 00:24:01,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 344145920. Throughput: 0: 40949.8. Samples: 344229360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 00:24:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:24:03,201][12883] Updated weights for policy 0, policy_version 21010 (0.0029) [2024-06-18 00:24:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 344358912. Throughput: 0: 40877.2. Samples: 344479360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:24:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:24:07,779][12883] Updated weights for policy 0, policy_version 21020 (0.0032) [2024-06-18 00:24:11,135][12883] Updated weights for policy 0, policy_version 21030 (0.0045) [2024-06-18 00:24:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41233.0, 300 sec: 40876.7). Total num frames: 344571904. Throughput: 0: 40834.7. Samples: 344722040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:24:11,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:24:15,673][12883] Updated weights for policy 0, policy_version 21040 (0.0025) [2024-06-18 00:24:16,363][12862] Signal inference workers to stop experience collection... (4850 times) [2024-06-18 00:24:16,408][12883] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-06-18 00:24:16,412][12862] Signal inference workers to resume experience collection... (4850 times) [2024-06-18 00:24:16,427][12883] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-06-18 00:24:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.8, 300 sec: 40765.6). 
Total num frames: 344752128. Throughput: 0: 40919.7. Samples: 344844040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:24:16,995][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:24:19,447][12883] Updated weights for policy 0, policy_version 21050 (0.0039) [2024-06-18 00:24:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 40821.2). Total num frames: 344965120. Throughput: 0: 41005.7. Samples: 345090660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:24:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:24:23,375][12883] Updated weights for policy 0, policy_version 21060 (0.0035) [2024-06-18 00:24:26,994][12645] Fps is (10 sec: 42599.5, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 345178112. Throughput: 0: 40856.5. Samples: 345334780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:24:26,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:24:27,199][12883] Updated weights for policy 0, policy_version 21070 (0.0029) [2024-06-18 00:24:31,430][12883] Updated weights for policy 0, policy_version 21080 (0.0032) [2024-06-18 00:24:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40961.5, 300 sec: 40821.1). Total num frames: 345391104. Throughput: 0: 40982.5. Samples: 345464460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:24:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:24:35,353][12883] Updated weights for policy 0, policy_version 21090 (0.0026) [2024-06-18 00:24:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 345587712. Throughput: 0: 40746.5. Samples: 345701280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:24:36,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 00:24:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021093_345587712.pth... 
[2024-06-18 00:24:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020496_335806464.pth [2024-06-18 00:24:39,543][12883] Updated weights for policy 0, policy_version 21100 (0.0029) [2024-06-18 00:24:41,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40960.1, 300 sec: 40822.0). Total num frames: 345784320. Throughput: 0: 41065.4. Samples: 345951420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:24:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:24:43,399][12883] Updated weights for policy 0, policy_version 21110 (0.0043) [2024-06-18 00:24:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40418.1, 300 sec: 40765.6). Total num frames: 345980928. Throughput: 0: 40923.3. Samples: 346070900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:24:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:24:47,428][12883] Updated weights for policy 0, policy_version 21120 (0.0032) [2024-06-18 00:24:51,218][12883] Updated weights for policy 0, policy_version 21130 (0.0025) [2024-06-18 00:24:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 40822.0). Total num frames: 346193920. Throughput: 0: 40725.8. Samples: 346312020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:24:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:24:56,342][12883] Updated weights for policy 0, policy_version 21140 (0.0038) [2024-06-18 00:24:56,994][12645] Fps is (10 sec: 39318.3, 60 sec: 40413.4, 300 sec: 40710.0). Total num frames: 346374144. Throughput: 0: 40831.8. Samples: 346559500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:24:56,995][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:24:59,453][12883] Updated weights for policy 0, policy_version 21150 (0.0035) [2024-06-18 00:25:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 346603520. Throughput: 0: 40652.6. Samples: 346673400. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:25:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:25:04,303][12883] Updated weights for policy 0, policy_version 21160 (0.0028) [2024-06-18 00:25:07,000][12645] Fps is (10 sec: 44212.8, 60 sec: 40955.8, 300 sec: 40875.9). Total num frames: 346816512. Throughput: 0: 40808.2. Samples: 346927280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 00:25:07,000][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:25:07,718][12883] Updated weights for policy 0, policy_version 21170 (0.0029) [2024-06-18 00:25:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40414.0, 300 sec: 40710.4). Total num frames: 346996736. Throughput: 0: 40802.3. Samples: 347170880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 00:25:11,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:25:12,111][12883] Updated weights for policy 0, policy_version 21180 (0.0034) [2024-06-18 00:25:16,000][12883] Updated weights for policy 0, policy_version 21190 (0.0036) [2024-06-18 00:25:16,994][12645] Fps is (10 sec: 39345.5, 60 sec: 40960.0, 300 sec: 40765.7). Total num frames: 347209728. Throughput: 0: 40568.0. Samples: 347290020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 00:25:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:25:19,898][12883] Updated weights for policy 0, policy_version 21200 (0.0046) [2024-06-18 00:25:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 347406336. Throughput: 0: 40740.6. Samples: 347534600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 00:25:21,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:25:23,921][12883] Updated weights for policy 0, policy_version 21210 (0.0038) [2024-06-18 00:25:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 347619328. Throughput: 0: 40576.8. Samples: 347777380. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 00:25:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:25:27,906][12883] Updated weights for policy 0, policy_version 21220 (0.0029) [2024-06-18 00:25:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 347815936. Throughput: 0: 40622.1. Samples: 347898900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 00:25:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:25:32,175][12883] Updated weights for policy 0, policy_version 21230 (0.0042) [2024-06-18 00:25:35,995][12883] Updated weights for policy 0, policy_version 21240 (0.0038) [2024-06-18 00:25:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 348012544. Throughput: 0: 40644.5. Samples: 348141020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 00:25:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:25:40,129][12883] Updated weights for policy 0, policy_version 21250 (0.0044) [2024-06-18 00:25:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40959.9, 300 sec: 40821.2). Total num frames: 348241920. Throughput: 0: 40462.0. Samples: 348380260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 00:25:41,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:25:44,078][12883] Updated weights for policy 0, policy_version 21260 (0.0038) [2024-06-18 00:25:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 348405760. Throughput: 0: 40770.7. Samples: 348508080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 00:25:46,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:25:48,250][12883] Updated weights for policy 0, policy_version 21270 (0.0040) [2024-06-18 00:25:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 348635136. Throughput: 0: 40444.6. Samples: 348747040. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 00:25:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:25:52,297][12883] Updated weights for policy 0, policy_version 21280 (0.0039) [2024-06-18 00:25:56,178][12883] Updated weights for policy 0, policy_version 21290 (0.0047) [2024-06-18 00:25:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40960.5, 300 sec: 40710.1). Total num frames: 348831744. Throughput: 0: 40599.0. Samples: 348997840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 00:25:56,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:26:00,238][12883] Updated weights for policy 0, policy_version 21300 (0.0048) [2024-06-18 00:26:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 349028352. Throughput: 0: 40627.7. Samples: 349118260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 00:26:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:26:03,566][12862] Signal inference workers to stop experience collection... (4900 times) [2024-06-18 00:26:03,607][12883] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-06-18 00:26:03,612][12862] Signal inference workers to resume experience collection... (4900 times) [2024-06-18 00:26:03,626][12883] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-06-18 00:26:04,365][12883] Updated weights for policy 0, policy_version 21310 (0.0039) [2024-06-18 00:26:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40418.0, 300 sec: 40765.6). Total num frames: 349241344. Throughput: 0: 40496.4. Samples: 349356940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:26:06,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:26:08,212][12883] Updated weights for policy 0, policy_version 21320 (0.0043) [2024-06-18 00:26:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.8, 300 sec: 40710.9). Total num frames: 349437952. Throughput: 0: 40737.7. 
Samples: 349610580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:26:11,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:26:12,496][12883] Updated weights for policy 0, policy_version 21330 (0.0028) [2024-06-18 00:26:16,207][12883] Updated weights for policy 0, policy_version 21340 (0.0034) [2024-06-18 00:26:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 349650944. Throughput: 0: 40681.8. Samples: 349729580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:26:16,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:26:20,292][12883] Updated weights for policy 0, policy_version 21350 (0.0048) [2024-06-18 00:26:21,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 349880320. Throughput: 0: 40736.5. Samples: 349974160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 00:26:21,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:26:24,094][12883] Updated weights for policy 0, policy_version 21360 (0.0047) [2024-06-18 00:26:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40413.9, 300 sec: 40655.2). Total num frames: 350044160. Throughput: 0: 41031.2. Samples: 350226660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 00:26:26,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:26:28,250][12883] Updated weights for policy 0, policy_version 21370 (0.0046) [2024-06-18 00:26:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 350273536. Throughput: 0: 40764.0. Samples: 350342460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 00:26:31,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:26:32,088][12883] Updated weights for policy 0, policy_version 21380 (0.0036) [2024-06-18 00:26:36,086][12883] Updated weights for policy 0, policy_version 21390 (0.0031) [2024-06-18 00:26:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 350470144. Throughput: 0: 40971.1. Samples: 350590740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 00:26:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:26:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021392_350486528.pth... [2024-06-18 00:26:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000020795_340705280.pth [2024-06-18 00:26:40,551][12883] Updated weights for policy 0, policy_version 21400 (0.0031) [2024-06-18 00:26:41,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 350666752. Throughput: 0: 40906.1. Samples: 350838620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 00:26:41,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:26:43,904][12883] Updated weights for policy 0, policy_version 21410 (0.0034) [2024-06-18 00:26:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41779.2, 300 sec: 40876.7). Total num frames: 350912512. Throughput: 0: 40882.2. Samples: 350957960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 00:26:46,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:26:49,283][12883] Updated weights for policy 0, policy_version 21420 (0.0049) [2024-06-18 00:26:51,944][12883] Updated weights for policy 0, policy_version 21430 (0.0034) [2024-06-18 00:26:51,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41233.2, 300 sec: 40876.7). Total num frames: 351109120. Throughput: 0: 40949.0. Samples: 351199640. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 00:26:51,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:26:56,994][12645] Fps is (10 sec: 34406.5, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 351256576. Throughput: 0: 40719.7. Samples: 351442960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 00:26:56,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:26:57,061][12883] Updated weights for policy 0, policy_version 21440 (0.0033) [2024-06-18 00:26:59,873][12883] Updated weights for policy 0, policy_version 21450 (0.0038) [2024-06-18 00:27:01,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 351485952. Throughput: 0: 40647.1. Samples: 351558700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 00:27:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:27:05,003][12883] Updated weights for policy 0, policy_version 21460 (0.0024) [2024-06-18 00:27:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40687.0, 300 sec: 40654.6). Total num frames: 351682560. Throughput: 0: 40826.2. Samples: 351811340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 00:27:06,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:27:08,171][12883] Updated weights for policy 0, policy_version 21470 (0.0039) [2024-06-18 00:27:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 351879168. Throughput: 0: 40507.9. Samples: 352049520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 00:27:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:27:13,033][12883] Updated weights for policy 0, policy_version 21480 (0.0047) [2024-06-18 00:27:14,668][12862] Signal inference workers to stop experience collection... 
(4950 times) [2024-06-18 00:27:14,703][12883] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-06-18 00:27:14,785][12862] Signal inference workers to resume experience collection... (4950 times) [2024-06-18 00:27:14,785][12883] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-06-18 00:27:16,111][12883] Updated weights for policy 0, policy_version 21490 (0.0032) [2024-06-18 00:27:16,996][12645] Fps is (10 sec: 42588.4, 60 sec: 40958.5, 300 sec: 40820.8). Total num frames: 352108544. Throughput: 0: 40673.9. Samples: 352172880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-18 00:27:16,997][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:27:20,897][12883] Updated weights for policy 0, policy_version 21500 (0.0034) [2024-06-18 00:27:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 39867.7, 300 sec: 40543.8). Total num frames: 352272384. Throughput: 0: 40680.6. Samples: 352421360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-18 00:27:21,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:27:24,417][12883] Updated weights for policy 0, policy_version 21510 (0.0042) [2024-06-18 00:27:26,994][12645] Fps is (10 sec: 39330.6, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 352501760. Throughput: 0: 40462.8. Samples: 352659440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 20.0) [2024-06-18 00:27:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:27:29,224][12883] Updated weights for policy 0, policy_version 21520 (0.0047) [2024-06-18 00:27:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 352714752. Throughput: 0: 40532.5. Samples: 352781920. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:27:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:27:32,437][12883] Updated weights for policy 0, policy_version 21530 (0.0052) [2024-06-18 00:27:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 352894976. Throughput: 0: 40539.4. Samples: 353023920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:27:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:27:37,250][12883] Updated weights for policy 0, policy_version 21540 (0.0038) [2024-06-18 00:27:40,584][12883] Updated weights for policy 0, policy_version 21550 (0.0040) [2024-06-18 00:27:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40414.0, 300 sec: 40710.1). Total num frames: 353091584. Throughput: 0: 40496.5. Samples: 353265300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:27:41,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:27:45,419][12883] Updated weights for policy 0, policy_version 21560 (0.0042) [2024-06-18 00:27:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 39867.6, 300 sec: 40654.5). Total num frames: 353304576. Throughput: 0: 40703.5. Samples: 353390360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:27:46,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:27:48,970][12883] Updated weights for policy 0, policy_version 21570 (0.0033) [2024-06-18 00:27:51,994][12645] Fps is (10 sec: 40958.9, 60 sec: 39867.6, 300 sec: 40654.5). Total num frames: 353501184. Throughput: 0: 40359.7. Samples: 353627540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:27:51,995][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:27:53,403][12883] Updated weights for policy 0, policy_version 21580 (0.0030) [2024-06-18 00:27:56,847][12883] Updated weights for policy 0, policy_version 21590 (0.0035) [2024-06-18 00:27:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41233.1, 300 sec: 40821.2). 
Total num frames: 353730560. Throughput: 0: 40500.1. Samples: 353872020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:27:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:28:01,292][12883] Updated weights for policy 0, policy_version 21600 (0.0032) [2024-06-18 00:28:01,996][12645] Fps is (10 sec: 40951.6, 60 sec: 40412.4, 300 sec: 40709.8). Total num frames: 353910784. Throughput: 0: 40483.6. Samples: 353994640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 00:28:01,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:28:04,677][12883] Updated weights for policy 0, policy_version 21610 (0.0050) [2024-06-18 00:28:06,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40413.7, 300 sec: 40710.1). Total num frames: 354107392. Throughput: 0: 40424.8. Samples: 354240480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 00:28:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:28:09,313][12883] Updated weights for policy 0, policy_version 21620 (0.0052) [2024-06-18 00:28:11,994][12645] Fps is (10 sec: 42607.4, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 354336768. Throughput: 0: 40575.4. Samples: 354485340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 00:28:11,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 00:28:12,997][12883] Updated weights for policy 0, policy_version 21630 (0.0054) [2024-06-18 00:28:16,996][12645] Fps is (10 sec: 40952.6, 60 sec: 40141.0, 300 sec: 40654.3). Total num frames: 354516992. Throughput: 0: 40517.4. Samples: 354605280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:28:16,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:28:17,503][12883] Updated weights for policy 0, policy_version 21640 (0.0038) [2024-06-18 00:28:20,792][12883] Updated weights for policy 0, policy_version 21650 (0.0037) [2024-06-18 00:28:21,998][12645] Fps is (10 sec: 40941.5, 60 sec: 41229.9, 300 sec: 40765.0). Total num frames: 354746368. 
Throughput: 0: 40622.1. Samples: 354852100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:28:21,999][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:28:25,667][12883] Updated weights for policy 0, policy_version 21660 (0.0037) [2024-06-18 00:28:26,994][12645] Fps is (10 sec: 40967.8, 60 sec: 40413.9, 300 sec: 40654.8). Total num frames: 354926592. Throughput: 0: 40739.5. Samples: 355098580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 00:28:26,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:28:28,660][12883] Updated weights for policy 0, policy_version 21670 (0.0038) [2024-06-18 00:28:31,996][12645] Fps is (10 sec: 39330.9, 60 sec: 40412.3, 300 sec: 40709.8). Total num frames: 355139584. Throughput: 0: 40690.1. Samples: 355221500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 00:28:31,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:28:33,738][12883] Updated weights for policy 0, policy_version 21680 (0.0045) [2024-06-18 00:28:36,870][12883] Updated weights for policy 0, policy_version 21690 (0.0039) [2024-06-18 00:28:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.1, 300 sec: 40821.2). Total num frames: 355368960. Throughput: 0: 40936.2. Samples: 355469660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 00:28:36,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:28:37,070][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021691_355385344.pth... [2024-06-18 00:28:37,117][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021093_345587712.pth [2024-06-18 00:28:41,638][12883] Updated weights for policy 0, policy_version 21700 (0.0044) [2024-06-18 00:28:41,994][12645] Fps is (10 sec: 42608.3, 60 sec: 41233.1, 300 sec: 40710.9). Total num frames: 355565568. Throughput: 0: 41015.2. Samples: 355717700. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 00:28:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:28:44,983][12883] Updated weights for policy 0, policy_version 21710 (0.0039) [2024-06-18 00:28:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 40765.6). Total num frames: 355762176. Throughput: 0: 40946.4. Samples: 355837140. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 00:28:46,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:28:49,325][12883] Updated weights for policy 0, policy_version 21720 (0.0045) [2024-06-18 00:28:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.2, 300 sec: 40710.1). Total num frames: 355958784. Throughput: 0: 40858.4. Samples: 356079100. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 00:28:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:28:52,930][12883] Updated weights for policy 0, policy_version 21730 (0.0037) [2024-06-18 00:28:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 356155392. Throughput: 0: 41055.6. Samples: 356332840. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 00:28:56,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:28:57,292][12883] Updated weights for policy 0, policy_version 21740 (0.0031) [2024-06-18 00:29:00,819][12883] Updated weights for policy 0, policy_version 21750 (0.0028) [2024-06-18 00:29:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40961.6, 300 sec: 40710.1). Total num frames: 356368384. Throughput: 0: 41038.6. Samples: 356451940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 00:29:01,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:29:02,758][12862] Signal inference workers to stop experience collection... (5000 times) [2024-06-18 00:29:02,758][12862] Signal inference workers to resume experience collection... 
(5000 times) [2024-06-18 00:29:02,807][12883] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-06-18 00:29:02,807][12883] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-06-18 00:29:05,590][12883] Updated weights for policy 0, policy_version 21760 (0.0050) [2024-06-18 00:29:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41233.2, 300 sec: 40710.1). Total num frames: 356581376. Throughput: 0: 41047.0. Samples: 356699020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 00:29:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:29:08,969][12883] Updated weights for policy 0, policy_version 21770 (0.0039) [2024-06-18 00:29:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 356761600. Throughput: 0: 40811.1. Samples: 356935080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 00:29:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:29:13,590][12883] Updated weights for policy 0, policy_version 21780 (0.0035) [2024-06-18 00:29:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41234.3, 300 sec: 40765.6). Total num frames: 356990976. Throughput: 0: 40866.4. Samples: 357060400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:29:16,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:29:17,335][12883] Updated weights for policy 0, policy_version 21790 (0.0034) [2024-06-18 00:29:21,731][12883] Updated weights for policy 0, policy_version 21800 (0.0039) [2024-06-18 00:29:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40417.0, 300 sec: 40654.5). Total num frames: 357171200. Throughput: 0: 40736.0. Samples: 357302780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:29:21,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:29:25,235][12883] Updated weights for policy 0, policy_version 21810 (0.0041) [2024-06-18 00:29:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 40710.1). 
Total num frames: 357400576. Throughput: 0: 40580.3. Samples: 357543820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 00:29:26,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:29:29,695][12883] Updated weights for policy 0, policy_version 21820 (0.0035) [2024-06-18 00:29:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41234.6, 300 sec: 40765.6). Total num frames: 357613568. Throughput: 0: 40771.6. Samples: 357671860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 00:29:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:29:33,233][12883] Updated weights for policy 0, policy_version 21830 (0.0034) [2024-06-18 00:29:36,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39867.6, 300 sec: 40599.0). Total num frames: 357761024. Throughput: 0: 40678.5. Samples: 357909640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 00:29:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:29:37,828][12883] Updated weights for policy 0, policy_version 21840 (0.0025) [2024-06-18 00:29:41,093][12883] Updated weights for policy 0, policy_version 21850 (0.0034) [2024-06-18 00:29:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 358006784. Throughput: 0: 40485.4. Samples: 358154680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 00:29:41,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 00:29:45,987][12883] Updated weights for policy 0, policy_version 21860 (0.0023) [2024-06-18 00:29:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 358203392. Throughput: 0: 40805.3. Samples: 358288180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 00:29:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:29:49,113][12883] Updated weights for policy 0, policy_version 21870 (0.0040) [2024-06-18 00:29:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 40765.7). 
Total num frames: 358400000. Throughput: 0: 40549.7. Samples: 358523760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 00:29:51,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 00:29:54,237][12883] Updated weights for policy 0, policy_version 21880 (0.0046) [2024-06-18 00:29:56,951][12883] Updated weights for policy 0, policy_version 21890 (0.0038) [2024-06-18 00:29:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 40821.1). Total num frames: 358645760. Throughput: 0: 40677.8. Samples: 358765580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 00:29:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:30:01,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40140.8, 300 sec: 40544.3). Total num frames: 358776832. Throughput: 0: 40597.5. Samples: 358887280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 00:30:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:30:02,210][12883] Updated weights for policy 0, policy_version 21900 (0.0032) [2024-06-18 00:30:05,088][12883] Updated weights for policy 0, policy_version 21910 (0.0030) [2024-06-18 00:30:06,993][12645] Fps is (10 sec: 40960.8, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 359055360. Throughput: 0: 40669.0. Samples: 359132880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 00:30:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:30:10,355][12883] Updated weights for policy 0, policy_version 21920 (0.0046) [2024-06-18 00:30:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 359219200. Throughput: 0: 40889.9. Samples: 359383860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 00:30:11,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:30:13,343][12883] Updated weights for policy 0, policy_version 21930 (0.0040) [2024-06-18 00:30:16,994][12645] Fps is (10 sec: 34405.6, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 359399424. 
Throughput: 0: 40625.8. Samples: 359500020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 00:30:16,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 00:30:18,134][12883] Updated weights for policy 0, policy_version 21940 (0.0028) [2024-06-18 00:30:21,287][12883] Updated weights for policy 0, policy_version 21950 (0.0036) [2024-06-18 00:30:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.0, 300 sec: 40821.1). Total num frames: 359661568. Throughput: 0: 40804.5. Samples: 359745840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 00:30:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:30:26,153][12883] Updated weights for policy 0, policy_version 21960 (0.0046) [2024-06-18 00:30:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 359825408. Throughput: 0: 40837.2. Samples: 359992360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 00:30:27,000][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:30:29,494][12883] Updated weights for policy 0, policy_version 21970 (0.0033) [2024-06-18 00:30:31,995][12645] Fps is (10 sec: 37676.7, 60 sec: 40412.7, 300 sec: 40765.4). Total num frames: 360038400. Throughput: 0: 40463.8. Samples: 360109120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 00:30:31,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:30:34,165][12883] Updated weights for policy 0, policy_version 21980 (0.0032) [2024-06-18 00:30:36,996][12645] Fps is (10 sec: 42589.2, 60 sec: 41504.7, 300 sec: 40709.8). Total num frames: 360251392. Throughput: 0: 40782.8. Samples: 360359080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 00:30:36,996][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:30:37,056][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021989_360267776.pth... 
[2024-06-18 00:30:37,115][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021392_350486528.pth [2024-06-18 00:30:37,461][12883] Updated weights for policy 0, policy_version 21990 (0.0048) [2024-06-18 00:30:42,000][12645] Fps is (10 sec: 37666.1, 60 sec: 40136.6, 300 sec: 40709.2). Total num frames: 360415232. Throughput: 0: 40657.0. Samples: 360595400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 00:30:42,001][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:30:42,579][12883] Updated weights for policy 0, policy_version 22000 (0.0039) [2024-06-18 00:30:44,476][12862] Signal inference workers to stop experience collection... (5050 times) [2024-06-18 00:30:44,476][12862] Signal inference workers to resume experience collection... (5050 times) [2024-06-18 00:30:44,505][12883] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-06-18 00:30:44,505][12883] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-06-18 00:30:46,132][12883] Updated weights for policy 0, policy_version 22010 (0.0042) [2024-06-18 00:30:46,994][12645] Fps is (10 sec: 39330.7, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 360644608. Throughput: 0: 40564.4. Samples: 360712680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 00:30:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:30:50,613][12883] Updated weights for policy 0, policy_version 22020 (0.0056) [2024-06-18 00:30:51,994][12645] Fps is (10 sec: 42624.7, 60 sec: 40686.8, 300 sec: 40710.0). Total num frames: 360841216. Throughput: 0: 40623.3. Samples: 360960940. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-18 00:30:51,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:30:54,354][12883] Updated weights for policy 0, policy_version 22030 (0.0031) [2024-06-18 00:30:56,994][12645] Fps is (10 sec: 39320.8, 60 sec: 39867.7, 300 sec: 40710.1). Total num frames: 361037824. Throughput: 0: 40255.4. 
Samples: 361195360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-18 00:30:56,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:30:59,176][12883] Updated weights for policy 0, policy_version 22040 (0.0033) [2024-06-18 00:31:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40959.9, 300 sec: 40654.5). Total num frames: 361234432. Throughput: 0: 40473.3. Samples: 361321320. Policy #0 lag: (min: 1.0, avg: 11.6, max: 21.0) [2024-06-18 00:31:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:31:02,264][12883] Updated weights for policy 0, policy_version 22050 (0.0031) [2024-06-18 00:31:06,994][12645] Fps is (10 sec: 37684.1, 60 sec: 39321.5, 300 sec: 40599.0). Total num frames: 361414656. Throughput: 0: 40464.1. Samples: 361566720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:31:06,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:31:07,147][12883] Updated weights for policy 0, policy_version 22060 (0.0047) [2024-06-18 00:31:10,411][12883] Updated weights for policy 0, policy_version 22070 (0.0036) [2024-06-18 00:31:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 361676800. Throughput: 0: 40281.8. Samples: 361805040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:31:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:31:15,168][12883] Updated weights for policy 0, policy_version 22080 (0.0040) [2024-06-18 00:31:16,994][12645] Fps is (10 sec: 44235.4, 60 sec: 40959.9, 300 sec: 40599.0). Total num frames: 361857024. Throughput: 0: 40560.9. Samples: 361934300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:31:16,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:31:18,487][12883] Updated weights for policy 0, policy_version 22090 (0.0032) [2024-06-18 00:31:21,994][12645] Fps is (10 sec: 36044.6, 60 sec: 39594.6, 300 sec: 40654.5). Total num frames: 362037248. Throughput: 0: 40302.3. Samples: 362172600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 00:31:21,995][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 00:31:23,041][12883] Updated weights for policy 0, policy_version 22100 (0.0041) [2024-06-18 00:31:26,643][12883] Updated weights for policy 0, policy_version 22110 (0.0036) [2024-06-18 00:31:26,994][12645] Fps is (10 sec: 40961.3, 60 sec: 40687.0, 300 sec: 40654.5). Total num frames: 362266624. Throughput: 0: 40690.2. Samples: 362426200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 00:31:26,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 00:31:31,019][12883] Updated weights for policy 0, policy_version 22120 (0.0030) [2024-06-18 00:31:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40415.0, 300 sec: 40654.5). Total num frames: 362463232. Throughput: 0: 40679.4. Samples: 362543260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 00:31:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:31:34,518][12883] Updated weights for policy 0, policy_version 22130 (0.0036) [2024-06-18 00:31:36,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40413.9, 300 sec: 40709.8). Total num frames: 362676224. Throughput: 0: 40561.2. Samples: 362786280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 00:31:36,996][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:31:38,700][12883] Updated weights for policy 0, policy_version 22140 (0.0038) [2024-06-18 00:31:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40964.4, 300 sec: 40543.5). Total num frames: 362872832. Throughput: 0: 40872.6. Samples: 363034620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 00:31:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:31:42,491][12883] Updated weights for policy 0, policy_version 22150 (0.0031) [2024-06-18 00:31:46,493][12883] Updated weights for policy 0, policy_version 22160 (0.0031) [2024-06-18 00:31:46,994][12645] Fps is (10 sec: 39330.3, 60 sec: 40413.8, 300 sec: 40543.4). 
Total num frames: 363069440. Throughput: 0: 40727.6. Samples: 363154060. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 00:31:46,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 00:31:50,539][12883] Updated weights for policy 0, policy_version 22170 (0.0023) [2024-06-18 00:31:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 363298816. Throughput: 0: 40769.2. Samples: 363401340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 00:31:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:31:54,671][12883] Updated weights for policy 0, policy_version 22180 (0.0027) [2024-06-18 00:31:56,996][12645] Fps is (10 sec: 42589.2, 60 sec: 40958.6, 300 sec: 40709.8). Total num frames: 363495424. Throughput: 0: 40916.3. Samples: 363646360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 00:31:56,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:31:58,428][12883] Updated weights for policy 0, policy_version 22190 (0.0037) [2024-06-18 00:32:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 363692032. Throughput: 0: 40789.1. Samples: 363769800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 00:32:01,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:32:02,978][12862] Signal inference workers to stop experience collection... (5100 times) [2024-06-18 00:32:03,013][12883] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-06-18 00:32:03,043][12862] Signal inference workers to resume experience collection... 
(5100 times) [2024-06-18 00:32:03,048][12883] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-06-18 00:32:03,052][12883] Updated weights for policy 0, policy_version 22200 (0.0041) [2024-06-18 00:32:06,650][12883] Updated weights for policy 0, policy_version 22210 (0.0038) [2024-06-18 00:32:06,994][12645] Fps is (10 sec: 40968.9, 60 sec: 41506.1, 300 sec: 40765.6). Total num frames: 363905024. Throughput: 0: 40968.5. Samples: 364016180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 00:32:06,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:32:10,973][12883] Updated weights for policy 0, policy_version 22220 (0.0042) [2024-06-18 00:32:11,996][12645] Fps is (10 sec: 40950.6, 60 sec: 40412.4, 300 sec: 40654.5). Total num frames: 364101632. Throughput: 0: 40686.3. Samples: 364257180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 00:32:11,997][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:32:14,882][12883] Updated weights for policy 0, policy_version 22230 (0.0049) [2024-06-18 00:32:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40687.1, 300 sec: 40765.6). Total num frames: 364298240. Throughput: 0: 40853.0. Samples: 364381640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 00:32:16,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:32:19,001][12883] Updated weights for policy 0, policy_version 22240 (0.0037) [2024-06-18 00:32:21,994][12645] Fps is (10 sec: 39330.0, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 364494848. Throughput: 0: 40704.1. Samples: 364617880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:32:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:32:22,858][12883] Updated weights for policy 0, policy_version 22250 (0.0024) [2024-06-18 00:32:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.8, 300 sec: 40599.0). Total num frames: 364691456. Throughput: 0: 40639.5. Samples: 364863400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:32:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:32:27,094][12883] Updated weights for policy 0, policy_version 22260 (0.0036) [2024-06-18 00:32:30,789][12883] Updated weights for policy 0, policy_version 22270 (0.0044) [2024-06-18 00:32:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40687.1, 300 sec: 40710.1). Total num frames: 364904448. Throughput: 0: 40667.2. Samples: 364984080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 00:32:31,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:32:35,306][12883] Updated weights for policy 0, policy_version 22280 (0.0028) [2024-06-18 00:32:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40688.5, 300 sec: 40765.6). Total num frames: 365117440. Throughput: 0: 40720.1. Samples: 365233740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 00:32:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:32:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022285_365117440.pth... [2024-06-18 00:32:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021691_355385344.pth [2024-06-18 00:32:38,618][12883] Updated weights for policy 0, policy_version 22290 (0.0028) [2024-06-18 00:32:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 365330432. Throughput: 0: 40532.2. Samples: 365470220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 00:32:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:32:43,248][12883] Updated weights for policy 0, policy_version 22300 (0.0048) [2024-06-18 00:32:46,973][12883] Updated weights for policy 0, policy_version 22310 (0.0037) [2024-06-18 00:32:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.1, 300 sec: 40765.7). Total num frames: 365527040. Throughput: 0: 40462.7. Samples: 365590620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 00:32:46,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:32:51,292][12883] Updated weights for policy 0, policy_version 22320 (0.0040) [2024-06-18 00:32:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40413.9, 300 sec: 40654.5). Total num frames: 365723648. Throughput: 0: 40461.8. Samples: 365836960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 00:32:51,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:32:55,066][12883] Updated weights for policy 0, policy_version 22330 (0.0034) [2024-06-18 00:32:56,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40142.3, 300 sec: 40654.8). Total num frames: 365903872. Throughput: 0: 40588.2. Samples: 366083560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 00:32:56,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:32:59,213][12883] Updated weights for policy 0, policy_version 22340 (0.0038) [2024-06-18 00:33:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.9, 300 sec: 40710.1). Total num frames: 366116864. Throughput: 0: 40481.4. Samples: 366203300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 00:33:01,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:33:03,074][12883] Updated weights for policy 0, policy_version 22350 (0.0031) [2024-06-18 00:33:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 366329856. Throughput: 0: 40666.7. Samples: 366447880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 00:33:06,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:33:07,207][12883] Updated weights for policy 0, policy_version 22360 (0.0032) [2024-06-18 00:33:11,347][12883] Updated weights for policy 0, policy_version 22370 (0.0034) [2024-06-18 00:33:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40415.3, 300 sec: 40710.3). Total num frames: 366526464. Throughput: 0: 40669.2. Samples: 366693520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 00:33:11,995][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:33:14,848][12883] Updated weights for policy 0, policy_version 22380 (0.0031) [2024-06-18 00:33:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40686.8, 300 sec: 40655.2). Total num frames: 366739456. Throughput: 0: 40694.0. Samples: 366815320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 00:33:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:33:19,443][12883] Updated weights for policy 0, policy_version 22390 (0.0037) [2024-06-18 00:33:22,000][12645] Fps is (10 sec: 40934.7, 60 sec: 40682.8, 300 sec: 40709.2). Total num frames: 366936064. Throughput: 0: 40615.7. Samples: 367061700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 00:33:22,001][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:33:22,812][12883] Updated weights for policy 0, policy_version 22400 (0.0043) [2024-06-18 00:33:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.8, 300 sec: 40654.8). Total num frames: 367132672. Throughput: 0: 40807.1. Samples: 367306540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 00:33:26,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:33:27,648][12883] Updated weights for policy 0, policy_version 22410 (0.0036) [2024-06-18 00:33:30,950][12862] Signal inference workers to stop experience collection... (5150 times) [2024-06-18 00:33:30,951][12862] Signal inference workers to resume experience collection... (5150 times) [2024-06-18 00:33:30,980][12883] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-06-18 00:33:30,980][12883] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-06-18 00:33:31,131][12883] Updated weights for policy 0, policy_version 22420 (0.0037) [2024-06-18 00:33:31,996][12645] Fps is (10 sec: 42615.3, 60 sec: 40958.4, 300 sec: 40654.2). Total num frames: 367362048. Throughput: 0: 40737.4. 
Samples: 367423900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 00:33:31,997][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:33:35,726][12883] Updated weights for policy 0, policy_version 22430 (0.0042) [2024-06-18 00:33:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40413.8, 300 sec: 40599.0). Total num frames: 367542272. Throughput: 0: 40759.9. Samples: 367671160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:33:36,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:33:39,361][12883] Updated weights for policy 0, policy_version 22440 (0.0049) [2024-06-18 00:33:41,994][12645] Fps is (10 sec: 37691.8, 60 sec: 40140.8, 300 sec: 40599.0). Total num frames: 367738880. Throughput: 0: 40551.6. Samples: 367908380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:33:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:33:43,783][12883] Updated weights for policy 0, policy_version 22450 (0.0034) [2024-06-18 00:33:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.7, 300 sec: 40654.5). Total num frames: 367951872. Throughput: 0: 40608.3. Samples: 368030680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:33:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:33:47,361][12883] Updated weights for policy 0, policy_version 22460 (0.0041) [2024-06-18 00:33:51,707][12883] Updated weights for policy 0, policy_version 22470 (0.0028) [2024-06-18 00:33:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 368164864. Throughput: 0: 40724.2. Samples: 368280460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 00:33:51,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 00:33:55,795][12883] Updated weights for policy 0, policy_version 22480 (0.0031) [2024-06-18 00:33:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40686.9, 300 sec: 40599.0). Total num frames: 368345088. Throughput: 0: 40725.4. Samples: 368526160. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 00:33:56,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:33:59,498][12883] Updated weights for policy 0, policy_version 22490 (0.0042) [2024-06-18 00:34:01,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40959.9, 300 sec: 40654.5). Total num frames: 368574464. Throughput: 0: 40651.1. Samples: 368644620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 00:34:01,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:34:04,478][12883] Updated weights for policy 0, policy_version 22500 (0.0041) [2024-06-18 00:34:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 368771072. Throughput: 0: 40605.1. Samples: 368888680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:34:06,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:34:07,494][12883] Updated weights for policy 0, policy_version 22510 (0.0045) [2024-06-18 00:34:11,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.9, 300 sec: 40543.5). Total num frames: 368951296. Throughput: 0: 40616.5. Samples: 369134280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:34:11,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:34:12,236][12883] Updated weights for policy 0, policy_version 22520 (0.0032) [2024-06-18 00:34:15,544][12883] Updated weights for policy 0, policy_version 22530 (0.0045) [2024-06-18 00:34:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 369180672. Throughput: 0: 40540.7. Samples: 369248140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:34:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:34:20,021][12883] Updated weights for policy 0, policy_version 22540 (0.0041) [2024-06-18 00:34:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40691.1, 300 sec: 40599.0). Total num frames: 369377280. Throughput: 0: 40718.6. Samples: 369503500. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 00:34:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:34:23,287][12883] Updated weights for policy 0, policy_version 22550 (0.0027) [2024-06-18 00:34:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 40599.0). Total num frames: 369590272. Throughput: 0: 40889.7. Samples: 369748420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 00:34:27,003][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 00:34:28,257][12883] Updated weights for policy 0, policy_version 22560 (0.0033) [2024-06-18 00:34:31,397][12883] Updated weights for policy 0, policy_version 22570 (0.0038) [2024-06-18 00:34:31,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40688.5, 300 sec: 40821.2). Total num frames: 369803264. Throughput: 0: 40953.1. Samples: 369873560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 00:34:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:34:36,106][12883] Updated weights for policy 0, policy_version 22580 (0.0041) [2024-06-18 00:34:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40413.9, 300 sec: 40543.4). Total num frames: 369967104. Throughput: 0: 40983.8. Samples: 370124740. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 00:34:36,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:34:37,132][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022583_369999872.pth... [2024-06-18 00:34:37,193][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000021989_360267776.pth [2024-06-18 00:34:39,364][12883] Updated weights for policy 0, policy_version 22590 (0.0041) [2024-06-18 00:34:41,994][12645] Fps is (10 sec: 40958.8, 60 sec: 41232.9, 300 sec: 40710.0). Total num frames: 370212864. Throughput: 0: 40656.3. Samples: 370355700. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 00:34:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:34:44,235][12883] Updated weights for policy 0, policy_version 22600 (0.0023) [2024-06-18 00:34:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40960.1, 300 sec: 40710.1). Total num frames: 370409472. Throughput: 0: 41033.4. Samples: 370491120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 00:34:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:34:47,270][12883] Updated weights for policy 0, policy_version 22610 (0.0038) [2024-06-18 00:34:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40413.7, 300 sec: 40487.9). Total num frames: 370589696. Throughput: 0: 41091.5. Samples: 370737800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 00:34:51,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:34:52,124][12883] Updated weights for policy 0, policy_version 22620 (0.0038) [2024-06-18 00:34:55,149][12883] Updated weights for policy 0, policy_version 22630 (0.0030) [2024-06-18 00:34:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 40876.7). Total num frames: 370835456. Throughput: 0: 40843.2. Samples: 370972220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 00:34:56,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:34:59,931][12883] Updated weights for policy 0, policy_version 22640 (0.0035) [2024-06-18 00:35:00,775][12862] Signal inference workers to stop experience collection... (5200 times) [2024-06-18 00:35:00,808][12883] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-06-18 00:35:00,889][12862] Signal inference workers to resume experience collection... (5200 times) [2024-06-18 00:35:00,889][12883] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-06-18 00:35:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40687.0, 300 sec: 40543.4). Total num frames: 371015680. Throughput: 0: 41256.0. 
Samples: 371104660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 00:35:01,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:35:03,375][12883] Updated weights for policy 0, policy_version 22650 (0.0035) [2024-06-18 00:35:06,994][12645] Fps is (10 sec: 36044.9, 60 sec: 40413.9, 300 sec: 40599.0). Total num frames: 371195904. Throughput: 0: 40797.5. Samples: 371339380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 00:35:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:35:07,986][12883] Updated weights for policy 0, policy_version 22660 (0.0042) [2024-06-18 00:35:11,386][12883] Updated weights for policy 0, policy_version 22670 (0.0034) [2024-06-18 00:35:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 40821.2). Total num frames: 371441664. Throughput: 0: 40769.0. Samples: 371583020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:35:11,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 00:35:15,742][12883] Updated weights for policy 0, policy_version 22680 (0.0025) [2024-06-18 00:35:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40686.8, 300 sec: 40543.4). Total num frames: 371621888. Throughput: 0: 40950.0. Samples: 371716320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:35:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:35:19,496][12883] Updated weights for policy 0, policy_version 22690 (0.0030) [2024-06-18 00:35:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 371834880. Throughput: 0: 40646.2. Samples: 371953820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:35:21,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:35:23,622][12883] Updated weights for policy 0, policy_version 22700 (0.0038) [2024-06-18 00:35:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 40960.0, 300 sec: 40710.3). Total num frames: 372047872. Throughput: 0: 41011.8. Samples: 372201220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 00:35:26,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 00:35:27,420][12883] Updated weights for policy 0, policy_version 22710 (0.0037) [2024-06-18 00:35:31,870][12883] Updated weights for policy 0, policy_version 22720 (0.0045) [2024-06-18 00:35:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.8, 300 sec: 40654.8). Total num frames: 372244480. Throughput: 0: 40719.0. Samples: 372323480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 00:35:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:35:35,762][12883] Updated weights for policy 0, policy_version 22730 (0.0031) [2024-06-18 00:35:36,998][12645] Fps is (10 sec: 42578.1, 60 sec: 41775.9, 300 sec: 40876.9). Total num frames: 372473856. Throughput: 0: 40734.9. Samples: 372571060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 00:35:36,999][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:35:39,979][12883] Updated weights for policy 0, policy_version 22740 (0.0030) [2024-06-18 00:35:41,994][12645] Fps is (10 sec: 42599.4, 60 sec: 40960.2, 300 sec: 40765.6). Total num frames: 372670464. Throughput: 0: 40928.0. Samples: 372813980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 00:35:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:35:43,806][12883] Updated weights for policy 0, policy_version 22750 (0.0029) [2024-06-18 00:35:46,994][12645] Fps is (10 sec: 37700.6, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 372850688. Throughput: 0: 40616.7. Samples: 372932420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 00:35:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:35:47,879][12883] Updated weights for policy 0, policy_version 22760 (0.0030) [2024-06-18 00:35:51,793][12883] Updated weights for policy 0, policy_version 22770 (0.0037) [2024-06-18 00:35:51,996][12645] Fps is (10 sec: 39312.5, 60 sec: 41231.6, 300 sec: 40765.3). 
Total num frames: 373063680. Throughput: 0: 40915.7. Samples: 373180680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 00:35:51,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:35:56,127][12883] Updated weights for policy 0, policy_version 22780 (0.0036) [2024-06-18 00:35:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40686.8, 300 sec: 40821.1). Total num frames: 373276672. Throughput: 0: 40888.7. Samples: 373423020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 00:35:56,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:35:59,881][12883] Updated weights for policy 0, policy_version 22790 (0.0045) [2024-06-18 00:36:01,994][12645] Fps is (10 sec: 40969.6, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 373473280. Throughput: 0: 40762.0. Samples: 373550600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 00:36:01,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:36:03,818][12883] Updated weights for policy 0, policy_version 22800 (0.0035) [2024-06-18 00:36:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40959.9, 300 sec: 40599.0). Total num frames: 373653504. Throughput: 0: 40780.9. Samples: 373788960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 00:36:06,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:36:07,853][12883] Updated weights for policy 0, policy_version 22810 (0.0033) [2024-06-18 00:36:11,758][12883] Updated weights for policy 0, policy_version 22820 (0.0031) [2024-06-18 00:36:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 373882880. Throughput: 0: 40870.1. Samples: 374040380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 00:36:11,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:36:15,949][12883] Updated weights for policy 0, policy_version 22830 (0.0032) [2024-06-18 00:36:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40687.1, 300 sec: 40765.6). Total num frames: 374063104. 
Throughput: 0: 40848.2. Samples: 374161640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 00:36:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:36:17,664][12862] Signal inference workers to stop experience collection... (5250 times) [2024-06-18 00:36:17,664][12862] Signal inference workers to resume experience collection... (5250 times) [2024-06-18 00:36:17,694][12883] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-06-18 00:36:17,694][12883] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-06-18 00:36:20,162][12883] Updated weights for policy 0, policy_version 22840 (0.0041) [2024-06-18 00:36:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 374292480. Throughput: 0: 40728.3. Samples: 374403640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 00:36:21,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:36:23,982][12883] Updated weights for policy 0, policy_version 22850 (0.0034) [2024-06-18 00:36:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 374489088. Throughput: 0: 40750.7. Samples: 374647760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 00:36:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:36:28,117][12883] Updated weights for policy 0, policy_version 22860 (0.0036) [2024-06-18 00:36:31,954][12883] Updated weights for policy 0, policy_version 22870 (0.0041) [2024-06-18 00:36:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40960.0, 300 sec: 40765.9). Total num frames: 374702080. Throughput: 0: 40748.4. Samples: 374766100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 00:36:31,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:36:36,108][12883] Updated weights for policy 0, policy_version 22880 (0.0045) [2024-06-18 00:36:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40690.2, 300 sec: 40821.2). 
Total num frames: 374915072. Throughput: 0: 40886.1. Samples: 375020460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 00:36:36,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:36:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022883_374915072.pth... [2024-06-18 00:36:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022285_365117440.pth [2024-06-18 00:36:40,008][12883] Updated weights for policy 0, policy_version 22890 (0.0046) [2024-06-18 00:36:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40686.8, 300 sec: 40821.1). Total num frames: 375111680. Throughput: 0: 40842.2. Samples: 375260920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 00:36:41,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 00:36:43,865][12883] Updated weights for policy 0, policy_version 22900 (0.0038) [2024-06-18 00:36:46,996][12645] Fps is (10 sec: 39312.3, 60 sec: 40958.5, 300 sec: 40709.8). Total num frames: 375308288. Throughput: 0: 40666.3. Samples: 375380680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 00:36:46,997][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:36:48,277][12883] Updated weights for policy 0, policy_version 22910 (0.0040) [2024-06-18 00:36:51,651][12883] Updated weights for policy 0, policy_version 22920 (0.0036) [2024-06-18 00:36:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40961.5, 300 sec: 40765.9). Total num frames: 375521280. Throughput: 0: 40970.7. Samples: 375632640. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 00:36:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:36:56,175][12883] Updated weights for policy 0, policy_version 22930 (0.0040) [2024-06-18 00:36:56,994][12645] Fps is (10 sec: 40969.8, 60 sec: 40687.1, 300 sec: 40765.6). Total num frames: 375717888. Throughput: 0: 40956.2. Samples: 375883400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 00:36:56,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 00:36:59,651][12883] Updated weights for policy 0, policy_version 22940 (0.0026) [2024-06-18 00:37:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 375930880. Throughput: 0: 40862.5. Samples: 376000460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 00:37:01,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 00:37:04,464][12883] Updated weights for policy 0, policy_version 22950 (0.0041) [2024-06-18 00:37:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41506.1, 300 sec: 40821.4). Total num frames: 376143872. Throughput: 0: 41028.3. Samples: 376249920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 00:37:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:37:07,390][12883] Updated weights for policy 0, policy_version 22960 (0.0029) [2024-06-18 00:37:11,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 40765.6). Total num frames: 376324096. Throughput: 0: 41202.6. Samples: 376501880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 00:37:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:37:12,435][12883] Updated weights for policy 0, policy_version 22970 (0.0032) [2024-06-18 00:37:15,282][12883] Updated weights for policy 0, policy_version 22980 (0.0046) [2024-06-18 00:37:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 40932.3). Total num frames: 376569856. Throughput: 0: 41244.1. Samples: 376622080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 00:37:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:37:20,292][12883] Updated weights for policy 0, policy_version 22990 (0.0031) [2024-06-18 00:37:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 376750080. Throughput: 0: 41151.6. Samples: 376872280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 00:37:21,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:37:23,419][12883] Updated weights for policy 0, policy_version 23000 (0.0028) [2024-06-18 00:37:26,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 376946688. Throughput: 0: 41344.5. Samples: 377121420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 00:37:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:37:28,214][12883] Updated weights for policy 0, policy_version 23010 (0.0038) [2024-06-18 00:37:31,143][12883] Updated weights for policy 0, policy_version 23020 (0.0028) [2024-06-18 00:37:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.2, 300 sec: 40876.7). Total num frames: 377176064. Throughput: 0: 41326.2. Samples: 377240260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-18 00:37:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:37:36,318][12883] Updated weights for policy 0, policy_version 23030 (0.0029) [2024-06-18 00:37:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 377372672. Throughput: 0: 41333.4. Samples: 377492640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-18 00:37:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:37:38,750][12862] Signal inference workers to stop experience collection... (5300 times) [2024-06-18 00:37:38,750][12862] Signal inference workers to resume experience collection... (5300 times) [2024-06-18 00:37:38,764][12883] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-06-18 00:37:38,764][12883] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-06-18 00:37:38,905][12883] Updated weights for policy 0, policy_version 23040 (0.0033) [2024-06-18 00:37:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 40821.1). Total num frames: 377569280. Throughput: 0: 40998.1. 
Samples: 377728320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 23.0) [2024-06-18 00:37:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:37:44,360][12883] Updated weights for policy 0, policy_version 23050 (0.0036) [2024-06-18 00:37:46,794][12883] Updated weights for policy 0, policy_version 23060 (0.0042) [2024-06-18 00:37:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41780.8, 300 sec: 40987.8). Total num frames: 377815040. Throughput: 0: 41282.7. Samples: 377858180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 00:37:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:37:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 377962496. Throughput: 0: 41151.6. Samples: 378101740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 00:37:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:37:52,252][12883] Updated weights for policy 0, policy_version 23070 (0.0035) [2024-06-18 00:37:55,089][12883] Updated weights for policy 0, policy_version 23080 (0.0045) [2024-06-18 00:37:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41232.9, 300 sec: 40932.2). Total num frames: 378191872. Throughput: 0: 41033.2. Samples: 378348380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 00:37:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:38:00,619][12883] Updated weights for policy 0, policy_version 23090 (0.0030) [2024-06-18 00:38:01,994][12645] Fps is (10 sec: 45875.6, 60 sec: 41506.2, 300 sec: 40987.8). Total num frames: 378421248. Throughput: 0: 41237.3. Samples: 378477760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 00:38:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:38:02,739][12883] Updated weights for policy 0, policy_version 23100 (0.0035) [2024-06-18 00:38:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.9, 300 sec: 40821.2). Total num frames: 378568704. Throughput: 0: 41003.9. Samples: 378717460. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 00:38:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:38:08,475][12883] Updated weights for policy 0, policy_version 23110 (0.0044) [2024-06-18 00:38:11,194][12883] Updated weights for policy 0, policy_version 23120 (0.0036) [2024-06-18 00:38:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 40932.3). Total num frames: 378814464. Throughput: 0: 40922.3. Samples: 378962920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 00:38:11,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:38:16,231][12883] Updated weights for policy 0, policy_version 23130 (0.0048) [2024-06-18 00:38:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 40686.9, 300 sec: 40933.1). Total num frames: 379011072. Throughput: 0: 41228.4. Samples: 379095540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 00:38:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:38:19,019][12883] Updated weights for policy 0, policy_version 23140 (0.0049) [2024-06-18 00:38:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41232.9, 300 sec: 40987.8). Total num frames: 379224064. Throughput: 0: 41039.9. Samples: 379339440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 00:38:21,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:38:24,194][12883] Updated weights for policy 0, policy_version 23150 (0.0050) [2024-06-18 00:38:26,826][12883] Updated weights for policy 0, policy_version 23160 (0.0042) [2024-06-18 00:38:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 40988.1). Total num frames: 379453440. Throughput: 0: 41216.3. Samples: 379583060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 00:38:26,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:38:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 379600896. Throughput: 0: 41084.0. Samples: 379706960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:38:31,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 00:38:32,107][12883] Updated weights for policy 0, policy_version 23170 (0.0050) [2024-06-18 00:38:34,771][12883] Updated weights for policy 0, policy_version 23180 (0.0045) [2024-06-18 00:38:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 379846656. Throughput: 0: 41008.5. Samples: 379947120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:38:36,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:38:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023184_379846656.pth... [2024-06-18 00:38:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022583_369999872.pth [2024-06-18 00:38:39,956][12883] Updated weights for policy 0, policy_version 23190 (0.0042) [2024-06-18 00:38:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 380043264. Throughput: 0: 40996.1. Samples: 380193200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:38:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:38:43,043][12883] Updated weights for policy 0, policy_version 23200 (0.0024) [2024-06-18 00:38:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.8, 300 sec: 40876.7). Total num frames: 380223488. Throughput: 0: 40863.5. Samples: 380316620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 00:38:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:38:48,007][12883] Updated weights for policy 0, policy_version 23210 (0.0023) [2024-06-18 00:38:51,095][12883] Updated weights for policy 0, policy_version 23220 (0.0047) [2024-06-18 00:38:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 380452864. Throughput: 0: 41039.5. Samples: 380564240. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 00:38:51,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:38:55,806][12883] Updated weights for policy 0, policy_version 23230 (0.0029) [2024-06-18 00:38:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 380649472. Throughput: 0: 41072.3. Samples: 380811180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 00:38:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:38:58,873][12862] Signal inference workers to stop experience collection... (5350 times) [2024-06-18 00:38:58,873][12862] Signal inference workers to resume experience collection... (5350 times) [2024-06-18 00:38:58,916][12883] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-06-18 00:38:58,916][12883] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-06-18 00:38:59,235][12883] Updated weights for policy 0, policy_version 23240 (0.0031) [2024-06-18 00:39:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 380846080. Throughput: 0: 40768.9. Samples: 380930140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 00:39:01,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:39:04,013][12883] Updated weights for policy 0, policy_version 23250 (0.0046) [2024-06-18 00:39:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41098.9). Total num frames: 381075456. Throughput: 0: 40690.8. Samples: 381170520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:39:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:39:07,281][12883] Updated weights for policy 0, policy_version 23260 (0.0042) [2024-06-18 00:39:11,962][12883] Updated weights for policy 0, policy_version 23270 (0.0047) [2024-06-18 00:39:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 381255680. Throughput: 0: 40838.3. 
Samples: 381420780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:39:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:39:15,288][12883] Updated weights for policy 0, policy_version 23280 (0.0030) [2024-06-18 00:39:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 381485056. Throughput: 0: 40780.0. Samples: 381542060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 00:39:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:39:19,979][12883] Updated weights for policy 0, policy_version 23290 (0.0045) [2024-06-18 00:39:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 381681664. Throughput: 0: 41063.0. Samples: 381794960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:39:21,997][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:39:23,191][12883] Updated weights for policy 0, policy_version 23300 (0.0040) [2024-06-18 00:39:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40140.9, 300 sec: 40876.7). Total num frames: 381861888. Throughput: 0: 40952.5. Samples: 382036060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:39:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:39:27,975][12883] Updated weights for policy 0, policy_version 23310 (0.0031) [2024-06-18 00:39:31,399][12883] Updated weights for policy 0, policy_version 23320 (0.0039) [2024-06-18 00:39:32,000][12645] Fps is (10 sec: 42572.1, 60 sec: 41774.9, 300 sec: 41153.5). Total num frames: 382107648. Throughput: 0: 40945.0. Samples: 382159400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:39:32,000][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:39:36,153][12883] Updated weights for policy 0, policy_version 23330 (0.0033) [2024-06-18 00:39:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 382271488. Throughput: 0: 41103.6. Samples: 382413900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:39:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:39:39,167][12883] Updated weights for policy 0, policy_version 23340 (0.0036) [2024-06-18 00:39:41,994][12645] Fps is (10 sec: 39346.4, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 382500864. Throughput: 0: 40971.2. Samples: 382654880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:39:41,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:39:44,670][12883] Updated weights for policy 0, policy_version 23350 (0.0036) [2024-06-18 00:39:46,944][12883] Updated weights for policy 0, policy_version 23360 (0.0031) [2024-06-18 00:39:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 382730240. Throughput: 0: 41244.4. Samples: 382786140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:39:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:39:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40414.0, 300 sec: 40821.2). Total num frames: 382877696. Throughput: 0: 41056.5. Samples: 383018060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 00:39:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:39:52,323][12883] Updated weights for policy 0, policy_version 23370 (0.0043) [2024-06-18 00:39:55,055][12883] Updated weights for policy 0, policy_version 23380 (0.0039) [2024-06-18 00:39:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 383123456. Throughput: 0: 40971.5. Samples: 383264500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 00:39:56,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:39:59,953][12883] Updated weights for policy 0, policy_version 23390 (0.0047) [2024-06-18 00:40:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 383320064. Throughput: 0: 41201.7. Samples: 383396140. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 00:40:01,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:40:02,901][12883] Updated weights for policy 0, policy_version 23400 (0.0033) [2024-06-18 00:40:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 383516672. Throughput: 0: 40917.4. Samples: 383636240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 00:40:06,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:40:07,807][12883] Updated weights for policy 0, policy_version 23410 (0.0043) [2024-06-18 00:40:10,694][12883] Updated weights for policy 0, policy_version 23420 (0.0032) [2024-06-18 00:40:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 383729664. Throughput: 0: 41046.7. Samples: 383883160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:40:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:40:15,562][12883] Updated weights for policy 0, policy_version 23430 (0.0041) [2024-06-18 00:40:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 383926272. Throughput: 0: 40982.6. Samples: 384003360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:40:16,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:40:19,276][12883] Updated weights for policy 0, policy_version 23440 (0.0034) [2024-06-18 00:40:20,836][12862] Signal inference workers to stop experience collection... (5400 times) [2024-06-18 00:40:20,837][12862] Signal inference workers to resume experience collection... (5400 times) [2024-06-18 00:40:20,887][12883] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-06-18 00:40:20,887][12883] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-06-18 00:40:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 384139264. Throughput: 0: 41049.8. 
Samples: 384261140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:40:21,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:40:23,444][12883] Updated weights for policy 0, policy_version 23450 (0.0043) [2024-06-18 00:40:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.0, 300 sec: 41043.3). Total num frames: 384352256. Throughput: 0: 41051.4. Samples: 384502200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:40:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:40:27,002][12883] Updated weights for policy 0, policy_version 23460 (0.0047) [2024-06-18 00:40:31,726][12883] Updated weights for policy 0, policy_version 23470 (0.0028) [2024-06-18 00:40:31,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40418.2, 300 sec: 40877.4). Total num frames: 384532480. Throughput: 0: 41000.6. Samples: 384631160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:40:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:40:35,426][12883] Updated weights for policy 0, policy_version 23480 (0.0033) [2024-06-18 00:40:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 384745472. Throughput: 0: 41342.1. Samples: 384878460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 00:40:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:40:37,115][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023484_384761856.pth... [2024-06-18 00:40:37,171][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000022883_374915072.pth [2024-06-18 00:40:39,477][12883] Updated weights for policy 0, policy_version 23490 (0.0039) [2024-06-18 00:40:41,994][12645] Fps is (10 sec: 45874.3, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 384991232. Throughput: 0: 41173.8. Samples: 385117320. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 00:40:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:40:43,608][12883] Updated weights for policy 0, policy_version 23500 (0.0047) [2024-06-18 00:40:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40413.9, 300 sec: 40988.1). Total num frames: 385155072. Throughput: 0: 41236.9. Samples: 385251800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 00:40:46,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:40:47,323][12883] Updated weights for policy 0, policy_version 23510 (0.0042) [2024-06-18 00:40:51,289][12883] Updated weights for policy 0, policy_version 23520 (0.0042) [2024-06-18 00:40:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41506.0, 300 sec: 40987.8). Total num frames: 385368064. Throughput: 0: 41201.3. Samples: 385490300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 00:40:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:40:55,078][12883] Updated weights for policy 0, policy_version 23530 (0.0046) [2024-06-18 00:40:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 385581056. Throughput: 0: 41133.2. Samples: 385734160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 00:40:56,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:40:59,326][12883] Updated weights for policy 0, policy_version 23540 (0.0038) [2024-06-18 00:41:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 385777664. Throughput: 0: 41150.6. Samples: 385855140. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 00:41:01,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 00:41:03,281][12883] Updated weights for policy 0, policy_version 23550 (0.0039) [2024-06-18 00:41:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 385974272. Throughput: 0: 40867.5. Samples: 386100180. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 00:41:06,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:41:07,303][12883] Updated weights for policy 0, policy_version 23560 (0.0036) [2024-06-18 00:41:11,086][12883] Updated weights for policy 0, policy_version 23570 (0.0043) [2024-06-18 00:41:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 386187264. Throughput: 0: 41084.1. Samples: 386350980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 00:41:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:41:15,320][12883] Updated weights for policy 0, policy_version 23580 (0.0047) [2024-06-18 00:41:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 386383872. Throughput: 0: 40970.5. Samples: 386474840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 00:41:16,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:41:19,214][12883] Updated weights for policy 0, policy_version 23590 (0.0045) [2024-06-18 00:41:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 386596864. Throughput: 0: 40843.6. Samples: 386716420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 00:41:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:41:23,500][12883] Updated weights for policy 0, policy_version 23600 (0.0038) [2024-06-18 00:41:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 386809856. Throughput: 0: 41005.9. Samples: 386962580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 00:41:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:41:27,097][12883] Updated weights for policy 0, policy_version 23610 (0.0051) [2024-06-18 00:41:31,454][12883] Updated weights for policy 0, policy_version 23620 (0.0034) [2024-06-18 00:41:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41232.9, 300 sec: 40987.7). 
Total num frames: 387006464. Throughput: 0: 40812.4. Samples: 387088360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 00:41:31,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:41:35,194][12883] Updated weights for policy 0, policy_version 23630 (0.0037) [2024-06-18 00:41:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 387219456. Throughput: 0: 40898.8. Samples: 387330740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 00:41:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:41:39,405][12883] Updated weights for policy 0, policy_version 23640 (0.0039) [2024-06-18 00:41:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40413.8, 300 sec: 41043.6). Total num frames: 387416064. Throughput: 0: 40952.8. Samples: 387577040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 00:41:41,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:41:43,061][12883] Updated weights for policy 0, policy_version 23650 (0.0034) [2024-06-18 00:41:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 387612672. Throughput: 0: 40905.8. Samples: 387695900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 00:41:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:41:47,576][12883] Updated weights for policy 0, policy_version 23660 (0.0026) [2024-06-18 00:41:51,526][12883] Updated weights for policy 0, policy_version 23670 (0.0035) [2024-06-18 00:41:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 387825664. Throughput: 0: 40951.5. Samples: 387943000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 00:41:51,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:41:54,028][12862] Signal inference workers to stop experience collection... (5450 times) [2024-06-18 00:41:54,075][12862] Signal inference workers to resume experience collection... 
(5450 times) [2024-06-18 00:41:54,075][12883] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-06-18 00:41:54,095][12883] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-06-18 00:41:55,382][12883] Updated weights for policy 0, policy_version 23680 (0.0033) [2024-06-18 00:41:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 388038656. Throughput: 0: 40837.4. Samples: 388188660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 00:41:56,994][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 00:41:59,495][12883] Updated weights for policy 0, policy_version 23690 (0.0040) [2024-06-18 00:42:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 388235264. Throughput: 0: 40858.2. Samples: 388313460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 00:42:01,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:42:03,121][12883] Updated weights for policy 0, policy_version 23700 (0.0047) [2024-06-18 00:42:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 388431872. Throughput: 0: 41007.1. Samples: 388561740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 00:42:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:42:07,623][12883] Updated weights for policy 0, policy_version 23710 (0.0039) [2024-06-18 00:42:11,302][12883] Updated weights for policy 0, policy_version 23720 (0.0038) [2024-06-18 00:42:11,996][12645] Fps is (10 sec: 40951.0, 60 sec: 40958.5, 300 sec: 40931.9). Total num frames: 388644864. Throughput: 0: 40923.7. Samples: 388804240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 00:42:11,997][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:42:15,611][12883] Updated weights for policy 0, policy_version 23730 (0.0036) [2024-06-18 00:42:16,996][12645] Fps is (10 sec: 40950.9, 60 sec: 40958.4, 300 sec: 40987.4). 
Total num frames: 388841472. Throughput: 0: 40965.6. Samples: 388931900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:42:16,997][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:42:19,196][12883] Updated weights for policy 0, policy_version 23740 (0.0035) [2024-06-18 00:42:21,996][12645] Fps is (10 sec: 40960.0, 60 sec: 40958.5, 300 sec: 41043.0). Total num frames: 389054464. Throughput: 0: 40847.8. Samples: 389168980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:42:21,996][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:42:23,474][12883] Updated weights for policy 0, policy_version 23750 (0.0038) [2024-06-18 00:42:26,994][12645] Fps is (10 sec: 40969.7, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 389251072. Throughput: 0: 41111.3. Samples: 389427040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:42:26,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 00:42:27,154][12883] Updated weights for policy 0, policy_version 23760 (0.0034) [2024-06-18 00:42:31,551][12883] Updated weights for policy 0, policy_version 23770 (0.0035) [2024-06-18 00:42:31,994][12645] Fps is (10 sec: 40968.6, 60 sec: 40960.0, 300 sec: 40987.7). Total num frames: 389464064. Throughput: 0: 41144.8. Samples: 389547420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:42:31,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:42:35,263][12883] Updated weights for policy 0, policy_version 23780 (0.0044) [2024-06-18 00:42:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 389693440. Throughput: 0: 40984.0. Samples: 389787280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:42:36,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:42:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023785_389693440.pth... 
[2024-06-18 00:42:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023184_379846656.pth [2024-06-18 00:42:39,597][12883] Updated weights for policy 0, policy_version 23790 (0.0029) [2024-06-18 00:42:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 389873664. Throughput: 0: 41036.0. Samples: 390035280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:42:41,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:42:43,240][12883] Updated weights for policy 0, policy_version 23800 (0.0033) [2024-06-18 00:42:46,996][12645] Fps is (10 sec: 37675.3, 60 sec: 40958.5, 300 sec: 41043.0). Total num frames: 390070272. Throughput: 0: 40923.4. Samples: 390155100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:42:46,996][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:42:47,585][12883] Updated weights for policy 0, policy_version 23810 (0.0031) [2024-06-18 00:42:51,271][12883] Updated weights for policy 0, policy_version 23820 (0.0036) [2024-06-18 00:42:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 390283264. Throughput: 0: 40905.5. Samples: 390402480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 00:42:51,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:42:55,671][12883] Updated weights for policy 0, policy_version 23830 (0.0029) [2024-06-18 00:42:56,994][12645] Fps is (10 sec: 40968.7, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 390479872. Throughput: 0: 41058.9. Samples: 390651800. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 00:42:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:42:59,428][12883] Updated weights for policy 0, policy_version 23840 (0.0054) [2024-06-18 00:43:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 390692864. Throughput: 0: 40790.0. Samples: 390767360. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 00:43:01,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:43:03,794][12883] Updated weights for policy 0, policy_version 23850 (0.0036) [2024-06-18 00:43:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 390889472. Throughput: 0: 40970.9. Samples: 391012580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:43:06,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:43:07,301][12883] Updated weights for policy 0, policy_version 23860 (0.0043) [2024-06-18 00:43:12,000][12645] Fps is (10 sec: 37660.0, 60 sec: 40411.2, 300 sec: 40875.8). Total num frames: 391069696. Throughput: 0: 40715.2. Samples: 391259480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:43:12,000][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:43:12,208][12883] Updated weights for policy 0, policy_version 23870 (0.0030) [2024-06-18 00:43:12,658][12862] Signal inference workers to stop experience collection... (5500 times) [2024-06-18 00:43:12,695][12883] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-06-18 00:43:12,704][12862] Signal inference workers to resume experience collection... (5500 times) [2024-06-18 00:43:12,707][12883] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-06-18 00:43:15,094][12883] Updated weights for policy 0, policy_version 23880 (0.0038) [2024-06-18 00:43:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40961.5, 300 sec: 40932.2). Total num frames: 391299072. Throughput: 0: 40648.5. Samples: 391376600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 00:43:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:43:19,895][12883] Updated weights for policy 0, policy_version 23890 (0.0033) [2024-06-18 00:43:21,994][12645] Fps is (10 sec: 45903.3, 60 sec: 41234.5, 300 sec: 40932.2). Total num frames: 391528448. Throughput: 0: 40894.2. 
Samples: 391627520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 00:43:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:43:22,999][12883] Updated weights for policy 0, policy_version 23900 (0.0029) [2024-06-18 00:43:26,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 391725056. Throughput: 0: 40815.6. Samples: 391871980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 00:43:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:43:27,856][12883] Updated weights for policy 0, policy_version 23910 (0.0036) [2024-06-18 00:43:31,107][12883] Updated weights for policy 0, policy_version 23920 (0.0032) [2024-06-18 00:43:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 391921664. Throughput: 0: 40838.3. Samples: 391992740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 00:43:31,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:43:35,691][12883] Updated weights for policy 0, policy_version 23930 (0.0040) [2024-06-18 00:43:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 392134656. Throughput: 0: 40919.0. Samples: 392243840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 00:43:36,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:43:39,086][12883] Updated weights for policy 0, policy_version 23940 (0.0032) [2024-06-18 00:43:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 392314880. Throughput: 0: 40758.7. Samples: 392485940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 00:43:41,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 00:43:43,564][12883] Updated weights for policy 0, policy_version 23950 (0.0030) [2024-06-18 00:43:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41234.6, 300 sec: 40987.8). Total num frames: 392544256. Throughput: 0: 40870.3. Samples: 392606520. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 00:43:46,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:43:47,102][12883] Updated weights for policy 0, policy_version 23960 (0.0036) [2024-06-18 00:43:51,493][12883] Updated weights for policy 0, policy_version 23970 (0.0045) [2024-06-18 00:43:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 392740864. Throughput: 0: 40833.5. Samples: 392850080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 00:43:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:43:55,187][12883] Updated weights for policy 0, policy_version 23980 (0.0033) [2024-06-18 00:43:56,994][12645] Fps is (10 sec: 37682.4, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 392921088. Throughput: 0: 40902.4. Samples: 393099840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:43:56,995][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:43:59,380][12883] Updated weights for policy 0, policy_version 23990 (0.0028) [2024-06-18 00:44:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40687.0, 300 sec: 40876.7). Total num frames: 393134080. Throughput: 0: 40989.9. Samples: 393221140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:44:01,995][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:44:03,162][12883] Updated weights for policy 0, policy_version 24000 (0.0028) [2024-06-18 00:44:06,994][12645] Fps is (10 sec: 40961.1, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 393330688. Throughput: 0: 40874.8. Samples: 393466880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 00:44:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:44:07,638][12883] Updated weights for policy 0, policy_version 24010 (0.0038) [2024-06-18 00:44:11,351][12883] Updated weights for policy 0, policy_version 24020 (0.0022) [2024-06-18 00:44:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41510.4, 300 sec: 40932.2). 
Total num frames: 393560064. Throughput: 0: 40828.8. Samples: 393709280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 00:44:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:44:15,774][12883] Updated weights for policy 0, policy_version 24030 (0.0037) [2024-06-18 00:44:16,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 393756672. Throughput: 0: 40847.2. Samples: 393830860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 00:44:16,995][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:44:19,393][12883] Updated weights for policy 0, policy_version 24040 (0.0039) [2024-06-18 00:44:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 393953280. Throughput: 0: 40733.8. Samples: 394076860. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 00:44:21,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:44:23,765][12883] Updated weights for policy 0, policy_version 24050 (0.0031) [2024-06-18 00:44:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40959.8, 300 sec: 40933.1). Total num frames: 394182656. Throughput: 0: 40730.2. Samples: 394318800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 00:44:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:44:27,513][12883] Updated weights for policy 0, policy_version 24060 (0.0046) [2024-06-18 00:44:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 394346496. Throughput: 0: 40738.7. Samples: 394439760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 00:44:31,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:44:32,053][12883] Updated weights for policy 0, policy_version 24070 (0.0037) [2024-06-18 00:44:35,620][12883] Updated weights for policy 0, policy_version 24080 (0.0040) [2024-06-18 00:44:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.8, 300 sec: 40932.2). 
Total num frames: 394575872. Throughput: 0: 40694.8. Samples: 394681360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 00:44:36,995][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:44:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024083_394575872.pth... [2024-06-18 00:44:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023484_384761856.pth [2024-06-18 00:44:40,193][12883] Updated weights for policy 0, policy_version 24090 (0.0043) [2024-06-18 00:44:42,000][12645] Fps is (10 sec: 42571.5, 60 sec: 40955.8, 300 sec: 40820.3). Total num frames: 394772480. Throughput: 0: 40645.2. Samples: 394929120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-18 00:44:42,001][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:44:43,634][12883] Updated weights for policy 0, policy_version 24100 (0.0043) [2024-06-18 00:44:46,994][12645] Fps is (10 sec: 37684.0, 60 sec: 40140.8, 300 sec: 40932.2). Total num frames: 394952704. Throughput: 0: 40592.4. Samples: 395047800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-18 00:44:47,000][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:44:48,121][12883] Updated weights for policy 0, policy_version 24110 (0.0038) [2024-06-18 00:44:51,555][12883] Updated weights for policy 0, policy_version 24120 (0.0029) [2024-06-18 00:44:51,994][12645] Fps is (10 sec: 42625.4, 60 sec: 40960.0, 300 sec: 40932.3). Total num frames: 395198464. Throughput: 0: 40588.4. Samples: 395293360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-18 00:44:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:44:56,002][12883] Updated weights for policy 0, policy_version 24130 (0.0034) [2024-06-18 00:44:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 395378688. Throughput: 0: 40755.5. Samples: 395543280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-18 00:44:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:44:58,128][12862] Signal inference workers to stop experience collection... (5550 times) [2024-06-18 00:44:58,128][12862] Signal inference workers to resume experience collection... (5550 times) [2024-06-18 00:44:58,152][12883] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-06-18 00:44:58,153][12883] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-06-18 00:44:59,686][12883] Updated weights for policy 0, policy_version 24140 (0.0034) [2024-06-18 00:45:01,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 395575296. Throughput: 0: 40714.3. Samples: 395663000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 00:45:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:45:04,174][12883] Updated weights for policy 0, policy_version 24150 (0.0028) [2024-06-18 00:45:07,000][12645] Fps is (10 sec: 40934.6, 60 sec: 40955.7, 300 sec: 40875.8). Total num frames: 395788288. Throughput: 0: 40800.5. Samples: 395913140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 00:45:07,000][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:45:07,661][12883] Updated weights for policy 0, policy_version 24160 (0.0035) [2024-06-18 00:45:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40413.9, 300 sec: 40876.7). Total num frames: 395984896. Throughput: 0: 40852.6. Samples: 396157160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 00:45:11,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:45:12,166][12883] Updated weights for policy 0, policy_version 24170 (0.0033) [2024-06-18 00:45:15,656][12883] Updated weights for policy 0, policy_version 24180 (0.0034) [2024-06-18 00:45:16,994][12645] Fps is (10 sec: 42624.5, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 396214272. Throughput: 0: 40898.9. 
Samples: 396280220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 00:45:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:45:20,229][12883] Updated weights for policy 0, policy_version 24190 (0.0040) [2024-06-18 00:45:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 396410880. Throughput: 0: 40828.2. Samples: 396518620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 00:45:21,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:45:23,674][12883] Updated weights for policy 0, policy_version 24200 (0.0038) [2024-06-18 00:45:26,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40140.9, 300 sec: 40876.7). Total num frames: 396591104. Throughput: 0: 40848.4. Samples: 396767040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 00:45:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:45:28,233][12883] Updated weights for policy 0, policy_version 24210 (0.0029) [2024-06-18 00:45:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 396804096. Throughput: 0: 40861.4. Samples: 396886560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 00:45:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:45:32,085][12883] Updated weights for policy 0, policy_version 24220 (0.0037) [2024-06-18 00:45:36,015][12883] Updated weights for policy 0, policy_version 24230 (0.0035) [2024-06-18 00:45:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40687.1, 300 sec: 40765.6). Total num frames: 397017088. Throughput: 0: 40852.9. Samples: 397131740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 00:45:36,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:45:39,998][12883] Updated weights for policy 0, policy_version 24240 (0.0041) [2024-06-18 00:45:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40691.2, 300 sec: 40876.7). Total num frames: 397213696. Throughput: 0: 40701.9. Samples: 397374860. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 00:45:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:45:43,846][12883] Updated weights for policy 0, policy_version 24250 (0.0032) [2024-06-18 00:45:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 397410304. Throughput: 0: 40648.2. Samples: 397492160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:45:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:45:48,250][12883] Updated weights for policy 0, policy_version 24260 (0.0028) [2024-06-18 00:45:51,884][12883] Updated weights for policy 0, policy_version 24270 (0.0040) [2024-06-18 00:45:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 397639680. Throughput: 0: 40585.6. Samples: 397739240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:45:51,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 00:45:56,283][12883] Updated weights for policy 0, policy_version 24280 (0.0037) [2024-06-18 00:45:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 397819904. Throughput: 0: 40554.3. Samples: 397982100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:45:56,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:46:00,423][12883] Updated weights for policy 0, policy_version 24290 (0.0041) [2024-06-18 00:46:01,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 398016512. Throughput: 0: 40495.7. Samples: 398102520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:46:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:46:04,175][12883] Updated weights for policy 0, policy_version 24300 (0.0030) [2024-06-18 00:46:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40691.2, 300 sec: 40821.2). Total num frames: 398229504. Throughput: 0: 40667.7. Samples: 398348660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 00:46:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:46:08,371][12883] Updated weights for policy 0, policy_version 24310 (0.0038) [2024-06-18 00:46:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 398426112. Throughput: 0: 40660.8. Samples: 398596780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 00:46:11,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:46:12,213][12883] Updated weights for policy 0, policy_version 24320 (0.0038) [2024-06-18 00:46:16,143][12883] Updated weights for policy 0, policy_version 24330 (0.0034) [2024-06-18 00:46:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40687.1, 300 sec: 40876.7). Total num frames: 398655488. Throughput: 0: 40612.1. Samples: 398714100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 00:46:16,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:46:20,623][12883] Updated weights for policy 0, policy_version 24340 (0.0033) [2024-06-18 00:46:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40140.9, 300 sec: 40710.1). Total num frames: 398819328. Throughput: 0: 40676.5. Samples: 398962180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 00:46:21,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:46:24,003][12862] Signal inference workers to stop experience collection... (5600 times) [2024-06-18 00:46:24,004][12862] Signal inference workers to resume experience collection... (5600 times) [2024-06-18 00:46:24,019][12883] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-06-18 00:46:24,019][12883] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-06-18 00:46:24,200][12883] Updated weights for policy 0, policy_version 24350 (0.0041) [2024-06-18 00:46:26,994][12645] Fps is (10 sec: 36044.4, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 399015936. Throughput: 0: 40602.2. 
Samples: 399201960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 00:46:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:46:28,556][12883] Updated weights for policy 0, policy_version 24360 (0.0042) [2024-06-18 00:46:31,994][12645] Fps is (10 sec: 44235.8, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 399261696. Throughput: 0: 40751.3. Samples: 399325980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 00:46:31,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:46:32,205][12883] Updated weights for policy 0, policy_version 24370 (0.0034) [2024-06-18 00:46:36,873][12883] Updated weights for policy 0, policy_version 24380 (0.0036) [2024-06-18 00:46:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40413.8, 300 sec: 40765.6). Total num frames: 399441920. Throughput: 0: 40644.5. Samples: 399568240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 00:46:36,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:46:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024381_399458304.pth... [2024-06-18 00:46:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000023785_389693440.pth [2024-06-18 00:46:40,628][12883] Updated weights for policy 0, policy_version 24390 (0.0037) [2024-06-18 00:46:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40686.8, 300 sec: 40821.2). Total num frames: 399654912. Throughput: 0: 40541.7. Samples: 399806480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 00:46:41,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:46:44,658][12883] Updated weights for policy 0, policy_version 24400 (0.0034) [2024-06-18 00:46:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 399851520. Throughput: 0: 40636.4. Samples: 399931160. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 00:46:46,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 00:46:48,685][12883] Updated weights for policy 0, policy_version 24410 (0.0036) [2024-06-18 00:46:51,994][12645] Fps is (10 sec: 37683.9, 60 sec: 39867.8, 300 sec: 40654.5). Total num frames: 400031744. Throughput: 0: 40574.7. Samples: 400174520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 00:46:51,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:46:52,574][12883] Updated weights for policy 0, policy_version 24420 (0.0035) [2024-06-18 00:46:56,870][12883] Updated weights for policy 0, policy_version 24430 (0.0034) [2024-06-18 00:46:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40686.8, 300 sec: 40765.6). Total num frames: 400261120. Throughput: 0: 40540.3. Samples: 400421100. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-18 00:46:56,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 00:47:00,795][12883] Updated weights for policy 0, policy_version 24440 (0.0037) [2024-06-18 00:47:01,994][12645] Fps is (10 sec: 44235.8, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 400474112. Throughput: 0: 40781.6. Samples: 400549280. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-18 00:47:01,995][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:47:05,018][12883] Updated weights for policy 0, policy_version 24450 (0.0029) [2024-06-18 00:47:06,994][12645] Fps is (10 sec: 40961.1, 60 sec: 40687.0, 300 sec: 40765.9). Total num frames: 400670720. Throughput: 0: 40454.2. Samples: 400782620. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-18 00:47:06,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:47:08,649][12883] Updated weights for policy 0, policy_version 24460 (0.0037) [2024-06-18 00:47:11,994][12645] Fps is (10 sec: 37684.0, 60 sec: 40414.0, 300 sec: 40710.4). Total num frames: 400850944. Throughput: 0: 40664.1. Samples: 401031840. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-18 00:47:11,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:47:13,072][12883] Updated weights for policy 0, policy_version 24470 (0.0034) [2024-06-18 00:47:16,486][12883] Updated weights for policy 0, policy_version 24480 (0.0041) [2024-06-18 00:47:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40413.8, 300 sec: 40765.9). Total num frames: 401080320. Throughput: 0: 40509.9. Samples: 401148920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-18 00:47:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:47:20,914][12883] Updated weights for policy 0, policy_version 24490 (0.0026) [2024-06-18 00:47:21,994][12645] Fps is (10 sec: 44236.0, 60 sec: 41233.0, 300 sec: 40821.1). Total num frames: 401293312. Throughput: 0: 40772.8. Samples: 401403020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-18 00:47:21,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:47:24,363][12883] Updated weights for policy 0, policy_version 24500 (0.0034) [2024-06-18 00:47:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 401473536. Throughput: 0: 40913.3. Samples: 401647580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 00:47:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:47:29,038][12883] Updated weights for policy 0, policy_version 24510 (0.0033) [2024-06-18 00:47:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40687.1, 300 sec: 40710.1). Total num frames: 401702912. Throughput: 0: 40896.5. Samples: 401771500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 00:47:31,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:47:32,182][12883] Updated weights for policy 0, policy_version 24520 (0.0029) [2024-06-18 00:47:36,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 401883136. Throughput: 0: 40987.1. Samples: 402018940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 00:47:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:47:37,179][12883] Updated weights for policy 0, policy_version 24530 (0.0042) [2024-06-18 00:47:40,222][12883] Updated weights for policy 0, policy_version 24540 (0.0041) [2024-06-18 00:47:41,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40413.8, 300 sec: 40710.4). Total num frames: 402079744. Throughput: 0: 40864.1. Samples: 402259980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 00:47:41,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 00:47:45,334][12883] Updated weights for policy 0, policy_version 24550 (0.0034) [2024-06-18 00:47:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41233.1, 300 sec: 40821.1). Total num frames: 402325504. Throughput: 0: 40859.2. Samples: 402387940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:47:46,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 00:47:48,730][12883] Updated weights for policy 0, policy_version 24560 (0.0034) [2024-06-18 00:47:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40959.9, 300 sec: 40710.1). Total num frames: 402489344. Throughput: 0: 40925.3. Samples: 402624260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:47:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:47:53,463][12883] Updated weights for policy 0, policy_version 24570 (0.0042) [2024-06-18 00:47:53,962][12862] Signal inference workers to stop experience collection... (5650 times) [2024-06-18 00:47:53,962][12862] Signal inference workers to resume experience collection... (5650 times) [2024-06-18 00:47:53,993][12883] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-06-18 00:47:53,993][12883] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-06-18 00:47:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40687.1, 300 sec: 40710.1). Total num frames: 402702336. Throughput: 0: 40846.6. 
Samples: 402869940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:47:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:47:57,062][12883] Updated weights for policy 0, policy_version 24580 (0.0039) [2024-06-18 00:48:01,396][12883] Updated weights for policy 0, policy_version 24590 (0.0032) [2024-06-18 00:48:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 402915328. Throughput: 0: 40982.6. Samples: 402993140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 00:48:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:48:04,842][12883] Updated weights for policy 0, policy_version 24600 (0.0031) [2024-06-18 00:48:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40959.9, 300 sec: 40877.6). Total num frames: 403128320. Throughput: 0: 40815.2. Samples: 403239700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 00:48:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:48:09,231][12883] Updated weights for policy 0, policy_version 24610 (0.0043) [2024-06-18 00:48:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 403308544. Throughput: 0: 40694.8. Samples: 403478840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 00:48:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:48:13,309][12883] Updated weights for policy 0, policy_version 24620 (0.0030) [2024-06-18 00:48:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40413.9, 300 sec: 40599.0). Total num frames: 403505152. Throughput: 0: 40604.4. Samples: 403598700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:48:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:48:17,257][12883] Updated weights for policy 0, policy_version 24630 (0.0040) [2024-06-18 00:48:21,128][12883] Updated weights for policy 0, policy_version 24640 (0.0051) [2024-06-18 00:48:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 403734528. Throughput: 0: 40606.5. Samples: 403846240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:48:21,998][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:48:25,110][12883] Updated weights for policy 0, policy_version 24650 (0.0038) [2024-06-18 00:48:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41233.1, 300 sec: 40765.6). Total num frames: 403947520. Throughput: 0: 40715.6. Samples: 404092180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:48:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:48:28,849][12883] Updated weights for policy 0, policy_version 24660 (0.0045) [2024-06-18 00:48:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.8, 300 sec: 40654.5). Total num frames: 404127744. Throughput: 0: 40607.5. Samples: 404215280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:48:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:48:33,053][12883] Updated weights for policy 0, policy_version 24670 (0.0048) [2024-06-18 00:48:36,874][12883] Updated weights for policy 0, policy_version 24680 (0.0034) [2024-06-18 00:48:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 40821.2). Total num frames: 404357120. Throughput: 0: 40793.4. Samples: 404459960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 00:48:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:48:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024680_404357120.pth... 
[2024-06-18 00:48:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024083_394575872.pth [2024-06-18 00:48:41,402][12883] Updated weights for policy 0, policy_version 24690 (0.0037) [2024-06-18 00:48:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40654.5). Total num frames: 404537344. Throughput: 0: 40651.5. Samples: 404699260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 00:48:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:48:45,496][12883] Updated weights for policy 0, policy_version 24700 (0.0047) [2024-06-18 00:48:46,994][12645] Fps is (10 sec: 39320.8, 60 sec: 40413.7, 300 sec: 40710.0). Total num frames: 404750336. Throughput: 0: 40552.4. Samples: 404818000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 00:48:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:48:49,303][12883] Updated weights for policy 0, policy_version 24710 (0.0036) [2024-06-18 00:48:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 40710.1). Total num frames: 404930560. Throughput: 0: 40547.6. Samples: 405064340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 00:48:51,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:48:53,575][12883] Updated weights for policy 0, policy_version 24720 (0.0030) [2024-06-18 00:48:56,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40960.0, 300 sec: 40765.6). Total num frames: 405159936. Throughput: 0: 40611.1. Samples: 405306340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 00:48:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:48:57,097][12883] Updated weights for policy 0, policy_version 24730 (0.0031) [2024-06-18 00:49:01,511][12883] Updated weights for policy 0, policy_version 24740 (0.0041) [2024-06-18 00:49:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 405356544. Throughput: 0: 40884.8. Samples: 405438520. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 00:49:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:49:05,127][12883] Updated weights for policy 0, policy_version 24750 (0.0043) [2024-06-18 00:49:06,994][12645] Fps is (10 sec: 37682.5, 60 sec: 40140.7, 300 sec: 40599.0). Total num frames: 405536768. Throughput: 0: 40717.2. Samples: 405678520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:49:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:49:09,449][12883] Updated weights for policy 0, policy_version 24760 (0.0037) [2024-06-18 00:49:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.0, 300 sec: 40765.6). Total num frames: 405782528. Throughput: 0: 40528.1. Samples: 405915940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:49:11,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:49:13,119][12883] Updated weights for policy 0, policy_version 24770 (0.0040) [2024-06-18 00:49:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 40710.1). Total num frames: 405962752. Throughput: 0: 40750.2. Samples: 406049040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:49:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:49:17,475][12883] Updated weights for policy 0, policy_version 24780 (0.0033) [2024-06-18 00:49:21,049][12883] Updated weights for policy 0, policy_version 24790 (0.0035) [2024-06-18 00:49:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 40654.5). Total num frames: 406175744. Throughput: 0: 40596.8. Samples: 406286820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 00:49:21,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:49:25,446][12883] Updated weights for policy 0, policy_version 24800 (0.0041) [2024-06-18 00:49:26,996][12645] Fps is (10 sec: 42588.8, 60 sec: 40685.4, 300 sec: 40820.8). Total num frames: 406388736. Throughput: 0: 40953.0. Samples: 406542240. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 00:49:26,997][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:49:28,905][12883] Updated weights for policy 0, policy_version 24810 (0.0036) [2024-06-18 00:49:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40687.0, 300 sec: 40654.6). Total num frames: 406568960. Throughput: 0: 41020.6. Samples: 406663920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 00:49:31,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 00:49:33,385][12883] Updated weights for policy 0, policy_version 24820 (0.0045) [2024-06-18 00:49:34,799][12862] Signal inference workers to stop experience collection... (5700 times) [2024-06-18 00:49:34,799][12862] Signal inference workers to resume experience collection... (5700 times) [2024-06-18 00:49:34,815][12883] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-06-18 00:49:34,846][12883] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-06-18 00:49:36,786][12883] Updated weights for policy 0, policy_version 24830 (0.0040) [2024-06-18 00:49:36,994][12645] Fps is (10 sec: 42608.4, 60 sec: 40960.0, 300 sec: 40822.0). Total num frames: 406814720. Throughput: 0: 40949.7. Samples: 406907080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 00:49:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:49:41,298][12883] Updated weights for policy 0, policy_version 24840 (0.0038) [2024-06-18 00:49:42,000][12645] Fps is (10 sec: 42571.6, 60 sec: 40955.8, 300 sec: 40820.3). Total num frames: 406994944. Throughput: 0: 41060.0. Samples: 407154300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 00:49:42,001][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:49:45,311][12883] Updated weights for policy 0, policy_version 24850 (0.0034) [2024-06-18 00:49:46,994][12645] Fps is (10 sec: 37682.6, 60 sec: 40687.0, 300 sec: 40654.5). Total num frames: 407191552. Throughput: 0: 40766.6. 
Samples: 407273020. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 00:49:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:49:49,643][12883] Updated weights for policy 0, policy_version 24860 (0.0029) [2024-06-18 00:49:51,994][12645] Fps is (10 sec: 44264.1, 60 sec: 41779.1, 300 sec: 40876.7). Total num frames: 407437312. Throughput: 0: 40991.2. Samples: 407523120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 00:49:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:49:53,215][12883] Updated weights for policy 0, policy_version 24870 (0.0044) [2024-06-18 00:49:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 407601152. Throughput: 0: 41217.2. Samples: 407770720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 00:49:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:49:57,737][12883] Updated weights for policy 0, policy_version 24880 (0.0038) [2024-06-18 00:50:01,061][12883] Updated weights for policy 0, policy_version 24890 (0.0033) [2024-06-18 00:50:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 40766.5). Total num frames: 407814144. Throughput: 0: 40744.0. Samples: 407882520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 00:50:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:50:05,701][12883] Updated weights for policy 0, policy_version 24900 (0.0040) [2024-06-18 00:50:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 40821.1). Total num frames: 408027136. Throughput: 0: 41202.1. Samples: 408140920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 00:50:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:50:08,907][12883] Updated weights for policy 0, policy_version 24910 (0.0036) [2024-06-18 00:50:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 40654.6). Total num frames: 408207360. Throughput: 0: 41000.8. Samples: 408387180. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 00:50:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:50:14,188][12883] Updated weights for policy 0, policy_version 24920 (0.0036) [2024-06-18 00:50:16,514][12883] Updated weights for policy 0, policy_version 24930 (0.0032) [2024-06-18 00:50:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 40821.1). Total num frames: 408453120. Throughput: 0: 41059.8. Samples: 408511620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 00:50:16,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:50:21,962][12883] Updated weights for policy 0, policy_version 24940 (0.0030) [2024-06-18 00:50:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 408616960. Throughput: 0: 41051.5. Samples: 408754400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 00:50:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:50:24,539][12883] Updated weights for policy 0, policy_version 24950 (0.0043) [2024-06-18 00:50:26,996][12645] Fps is (10 sec: 37675.4, 60 sec: 40687.0, 300 sec: 40765.3). Total num frames: 408829952. Throughput: 0: 41064.1. Samples: 409002020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 00:50:26,996][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:50:29,766][12883] Updated weights for policy 0, policy_version 24960 (0.0041) [2024-06-18 00:50:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 41779.2, 300 sec: 40876.7). Total num frames: 409075712. Throughput: 0: 41227.2. Samples: 409128240. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 00:50:31,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:50:32,600][12883] Updated weights for policy 0, policy_version 24970 (0.0035) [2024-06-18 00:50:36,994][12645] Fps is (10 sec: 40969.5, 60 sec: 40413.9, 300 sec: 40765.6). Total num frames: 409239552. Throughput: 0: 41105.5. Samples: 409372860. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 00:50:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:50:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024978_409239552.pth... [2024-06-18 00:50:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024381_399458304.pth [2024-06-18 00:50:37,378][12862] Signal inference workers to stop experience collection... (5750 times) [2024-06-18 00:50:37,413][12883] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-06-18 00:50:37,435][12862] Signal inference workers to resume experience collection... (5750 times) [2024-06-18 00:50:37,436][12883] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-06-18 00:50:37,574][12883] Updated weights for policy 0, policy_version 24980 (0.0033) [2024-06-18 00:50:40,631][12883] Updated weights for policy 0, policy_version 24990 (0.0044) [2024-06-18 00:50:41,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40964.3, 300 sec: 40821.1). Total num frames: 409452544. Throughput: 0: 41061.9. Samples: 409618500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 00:50:41,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:50:45,618][12883] Updated weights for policy 0, policy_version 25000 (0.0035) [2024-06-18 00:50:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.2, 300 sec: 40821.2). Total num frames: 409681920. Throughput: 0: 41386.7. Samples: 409744920. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 00:50:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:50:49,166][12883] Updated weights for policy 0, policy_version 25010 (0.0051) [2024-06-18 00:50:51,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40413.9, 300 sec: 40821.1). Total num frames: 409862144. Throughput: 0: 41136.5. Samples: 409992060. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 00:50:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:50:53,557][12883] Updated weights for policy 0, policy_version 25020 (0.0037) [2024-06-18 00:50:56,793][12883] Updated weights for policy 0, policy_version 25030 (0.0032) [2024-06-18 00:50:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 40932.2). Total num frames: 410091520. Throughput: 0: 41111.9. Samples: 410237220. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 00:50:56,996][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:51:01,414][12883] Updated weights for policy 0, policy_version 25040 (0.0040) [2024-06-18 00:51:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 410271744. Throughput: 0: 41049.1. Samples: 410358820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 00:51:01,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:51:04,763][12883] Updated weights for policy 0, policy_version 25050 (0.0032) [2024-06-18 00:51:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.2, 300 sec: 40932.2). Total num frames: 410501120. Throughput: 0: 41152.5. Samples: 410606260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 00:51:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:51:09,831][12883] Updated weights for policy 0, policy_version 25060 (0.0036) [2024-06-18 00:51:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 40876.7). Total num frames: 410714112. Throughput: 0: 41100.7. Samples: 410851460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 00:51:11,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:51:12,644][12883] Updated weights for policy 0, policy_version 25070 (0.0031) [2024-06-18 00:51:16,994][12645] Fps is (10 sec: 36045.2, 60 sec: 40141.0, 300 sec: 40821.2). Total num frames: 410861568. Throughput: 0: 41055.2. Samples: 410975720. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 00:51:16,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 00:51:17,437][12883] Updated weights for policy 0, policy_version 25080 (0.0044) [2024-06-18 00:51:20,907][12883] Updated weights for policy 0, policy_version 25090 (0.0044) [2024-06-18 00:51:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 411090944. Throughput: 0: 41009.2. Samples: 411218280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 00:51:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:51:25,276][12883] Updated weights for policy 0, policy_version 25100 (0.0032) [2024-06-18 00:51:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 41507.7, 300 sec: 40876.7). Total num frames: 411320320. Throughput: 0: 41024.4. Samples: 411464600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 00:51:26,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:51:28,912][12883] Updated weights for policy 0, policy_version 25110 (0.0043) [2024-06-18 00:51:31,996][12645] Fps is (10 sec: 42588.9, 60 sec: 40685.4, 300 sec: 40931.9). Total num frames: 411516928. Throughput: 0: 41035.3. Samples: 411591600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 00:51:31,997][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:51:32,898][12883] Updated weights for policy 0, policy_version 25120 (0.0041) [2024-06-18 00:51:36,763][12883] Updated weights for policy 0, policy_version 25130 (0.0056) [2024-06-18 00:51:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 411729920. Throughput: 0: 41024.0. Samples: 411838140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 00:51:37,000][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:51:40,842][12883] Updated weights for policy 0, policy_version 25140 (0.0036) [2024-06-18 00:51:41,994][12645] Fps is (10 sec: 39330.0, 60 sec: 40959.9, 300 sec: 40876.7). 
Total num frames: 411910144. Throughput: 0: 41079.9. Samples: 412085820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:51:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:51:44,883][12883] Updated weights for policy 0, policy_version 25150 (0.0033) [2024-06-18 00:51:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 412106752. Throughput: 0: 41087.9. Samples: 412207780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:51:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:51:48,739][12883] Updated weights for policy 0, policy_version 25160 (0.0035) [2024-06-18 00:51:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 412336128. Throughput: 0: 41012.3. Samples: 412451820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 00:51:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:51:53,117][12883] Updated weights for policy 0, policy_version 25170 (0.0042) [2024-06-18 00:51:56,635][12862] Signal inference workers to stop experience collection... (5800 times) [2024-06-18 00:51:56,635][12862] Signal inference workers to resume experience collection... (5800 times) [2024-06-18 00:51:56,649][12883] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-06-18 00:51:56,650][12883] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-06-18 00:51:56,779][12883] Updated weights for policy 0, policy_version 25180 (0.0029) [2024-06-18 00:51:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 40960.1, 300 sec: 40932.3). Total num frames: 412549120. Throughput: 0: 41086.7. Samples: 412700360. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 00:51:56,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 00:52:01,373][12883] Updated weights for policy 0, policy_version 25190 (0.0035) [2024-06-18 00:52:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 412729344. Throughput: 0: 41092.8. Samples: 412824900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 00:52:01,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:52:04,643][12883] Updated weights for policy 0, policy_version 25200 (0.0045) [2024-06-18 00:52:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 412958720. Throughput: 0: 41091.1. Samples: 413067380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 00:52:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:52:09,202][12883] Updated weights for policy 0, policy_version 25210 (0.0036) [2024-06-18 00:52:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 413155328. Throughput: 0: 41100.3. Samples: 413314120. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 00:52:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:52:12,837][12883] Updated weights for policy 0, policy_version 25220 (0.0039) [2024-06-18 00:52:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.0, 300 sec: 40821.2). Total num frames: 413335552. Throughput: 0: 40985.6. Samples: 413435860. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 00:52:16,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:52:17,196][12883] Updated weights for policy 0, policy_version 25230 (0.0040) [2024-06-18 00:52:21,017][12883] Updated weights for policy 0, policy_version 25240 (0.0038) [2024-06-18 00:52:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 413581312. Throughput: 0: 40900.0. Samples: 413678640. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 00:52:21,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:52:25,261][12883] Updated weights for policy 0, policy_version 25250 (0.0048) [2024-06-18 00:52:26,996][12645] Fps is (10 sec: 40950.9, 60 sec: 40412.3, 300 sec: 40820.8). Total num frames: 413745152. Throughput: 0: 40837.2. Samples: 413923580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 00:52:26,996][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 00:52:29,030][12883] Updated weights for policy 0, policy_version 25260 (0.0040) [2024-06-18 00:52:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40688.5, 300 sec: 40932.2). Total num frames: 413958144. Throughput: 0: 40699.6. Samples: 414039260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:52:31,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 00:52:33,242][12883] Updated weights for policy 0, policy_version 25270 (0.0038) [2024-06-18 00:52:36,994][12645] Fps is (10 sec: 42607.7, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 414171136. Throughput: 0: 40753.9. Samples: 414285740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:52:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:52:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025280_414187520.pth... [2024-06-18 00:52:37,008][12883] Updated weights for policy 0, policy_version 25280 (0.0050) [2024-06-18 00:52:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024680_404357120.pth [2024-06-18 00:52:41,632][12883] Updated weights for policy 0, policy_version 25290 (0.0040) [2024-06-18 00:52:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 40821.1). Total num frames: 414367744. Throughput: 0: 40855.9. Samples: 414538880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 00:52:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:52:44,787][12883] Updated weights for policy 0, policy_version 25300 (0.0031) [2024-06-18 00:52:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 414580736. Throughput: 0: 40740.4. Samples: 414658220. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 00:52:46,995][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:52:49,543][12883] Updated weights for policy 0, policy_version 25310 (0.0033) [2024-06-18 00:52:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 414793728. Throughput: 0: 40963.6. Samples: 414910740. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 00:52:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:52:52,660][12883] Updated weights for policy 0, policy_version 25320 (0.0045) [2024-06-18 00:52:57,000][12645] Fps is (10 sec: 39297.5, 60 sec: 40409.6, 300 sec: 40875.8). Total num frames: 414973952. Throughput: 0: 40911.3. Samples: 415155380. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 00:52:57,000][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:52:57,415][12883] Updated weights for policy 0, policy_version 25330 (0.0043) [2024-06-18 00:53:00,434][12883] Updated weights for policy 0, policy_version 25340 (0.0037) [2024-06-18 00:53:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 40987.8). Total num frames: 415219712. Throughput: 0: 40879.1. Samples: 415275420. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 00:53:01,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:53:05,867][12883] Updated weights for policy 0, policy_version 25350 (0.0028) [2024-06-18 00:53:06,994][12645] Fps is (10 sec: 40984.9, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 415383552. Throughput: 0: 41076.8. Samples: 415527100. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 00:53:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:53:08,274][12883] Updated weights for policy 0, policy_version 25360 (0.0047) [2024-06-18 00:53:11,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 415596544. Throughput: 0: 41009.5. Samples: 415768920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 00:53:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:53:13,673][12883] Updated weights for policy 0, policy_version 25370 (0.0031) [2024-06-18 00:53:15,713][12862] Signal inference workers to stop experience collection... (5850 times) [2024-06-18 00:53:15,713][12862] Signal inference workers to resume experience collection... (5850 times) [2024-06-18 00:53:15,728][12883] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-06-18 00:53:15,728][12883] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-06-18 00:53:16,668][12883] Updated weights for policy 0, policy_version 25380 (0.0051) [2024-06-18 00:53:16,994][12645] Fps is (10 sec: 45875.9, 60 sec: 41779.2, 300 sec: 41043.3). Total num frames: 415842304. Throughput: 0: 41132.0. Samples: 415890200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 00:53:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:53:21,409][12883] Updated weights for policy 0, policy_version 25390 (0.0048) [2024-06-18 00:53:21,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40140.8, 300 sec: 40821.2). Total num frames: 415989760. Throughput: 0: 41116.5. Samples: 416135980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 00:53:21,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:53:24,598][12883] Updated weights for policy 0, policy_version 25400 (0.0035) [2024-06-18 00:53:26,994][12645] Fps is (10 sec: 36044.8, 60 sec: 40961.5, 300 sec: 40932.2). Total num frames: 416202752. Throughput: 0: 40964.9. 
Samples: 416382300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 00:53:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:53:29,667][12883] Updated weights for policy 0, policy_version 25410 (0.0037) [2024-06-18 00:53:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 416415744. Throughput: 0: 41185.3. Samples: 416511560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 00:53:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:53:32,548][12883] Updated weights for policy 0, policy_version 25420 (0.0036) [2024-06-18 00:53:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 416612352. Throughput: 0: 41048.0. Samples: 416757900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 00:53:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:53:37,419][12883] Updated weights for policy 0, policy_version 25430 (0.0038) [2024-06-18 00:53:40,494][12883] Updated weights for policy 0, policy_version 25440 (0.0041) [2024-06-18 00:53:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 416858112. Throughput: 0: 40822.9. Samples: 416992160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 00:53:42,003][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:53:45,010][12883] Updated weights for policy 0, policy_version 25450 (0.0029) [2024-06-18 00:53:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 417038336. Throughput: 0: 41079.1. Samples: 417123980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 00:53:46,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:53:48,780][12883] Updated weights for policy 0, policy_version 25460 (0.0030) [2024-06-18 00:53:51,994][12645] Fps is (10 sec: 36044.8, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 417218560. Throughput: 0: 40923.6. Samples: 417368660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 00:53:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:53:52,954][12883] Updated weights for policy 0, policy_version 25470 (0.0038) [2024-06-18 00:53:56,474][12883] Updated weights for policy 0, policy_version 25480 (0.0030) [2024-06-18 00:53:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41510.4, 300 sec: 41043.3). Total num frames: 417464320. Throughput: 0: 40964.0. Samples: 417612300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 00:53:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:54:01,171][12883] Updated weights for policy 0, policy_version 25490 (0.0030) [2024-06-18 00:54:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 417660928. Throughput: 0: 41212.1. Samples: 417744740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 00:54:01,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:54:04,289][12883] Updated weights for policy 0, policy_version 25500 (0.0041) [2024-06-18 00:54:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 417841152. Throughput: 0: 41081.3. Samples: 417984640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 00:54:06,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 00:54:09,081][12883] Updated weights for policy 0, policy_version 25510 (0.0030) [2024-06-18 00:54:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 418086912. Throughput: 0: 41095.2. Samples: 418231580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 00:54:11,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 00:54:12,415][12883] Updated weights for policy 0, policy_version 25520 (0.0033) [2024-06-18 00:54:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40140.8, 300 sec: 40932.2). Total num frames: 418250752. Throughput: 0: 41069.5. Samples: 418359680. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 00:54:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:54:17,198][12883] Updated weights for policy 0, policy_version 25530 (0.0035) [2024-06-18 00:54:20,447][12883] Updated weights for policy 0, policy_version 25540 (0.0042) [2024-06-18 00:54:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 40988.1). Total num frames: 418480128. Throughput: 0: 40774.6. Samples: 418592760. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 00:54:21,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 00:54:25,124][12883] Updated weights for policy 0, policy_version 25550 (0.0037) [2024-06-18 00:54:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 418693120. Throughput: 0: 41114.7. Samples: 418842320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 00:54:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:54:28,571][12883] Updated weights for policy 0, policy_version 25560 (0.0028) [2024-06-18 00:54:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 418873344. Throughput: 0: 40899.0. Samples: 418964440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 00:54:31,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:54:32,946][12883] Updated weights for policy 0, policy_version 25570 (0.0037) [2024-06-18 00:54:36,515][12883] Updated weights for policy 0, policy_version 25580 (0.0041) [2024-06-18 00:54:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41044.2). Total num frames: 419102720. Throughput: 0: 40925.0. Samples: 419210280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 00:54:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:54:37,029][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025581_419119104.pth... 
[2024-06-18 00:54:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000024978_409239552.pth [2024-06-18 00:54:40,762][12883] Updated weights for policy 0, policy_version 25590 (0.0040) [2024-06-18 00:54:41,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40414.0, 300 sec: 40987.8). Total num frames: 419282944. Throughput: 0: 41127.8. Samples: 419463040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 00:54:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:54:42,550][12862] Signal inference workers to stop experience collection... (5900 times) [2024-06-18 00:54:42,551][12862] Signal inference workers to resume experience collection... (5900 times) [2024-06-18 00:54:42,595][12883] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-06-18 00:54:42,595][12883] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-06-18 00:54:44,791][12883] Updated weights for policy 0, policy_version 25600 (0.0034) [2024-06-18 00:54:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 419495936. Throughput: 0: 40870.1. Samples: 419583900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 00:54:46,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:54:48,602][12883] Updated weights for policy 0, policy_version 25610 (0.0030) [2024-06-18 00:54:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 419692544. Throughput: 0: 40938.8. Samples: 419826880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 00:54:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:54:52,994][12883] Updated weights for policy 0, policy_version 25620 (0.0049) [2024-06-18 00:54:56,946][12883] Updated weights for policy 0, policy_version 25630 (0.0035) [2024-06-18 00:54:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 419921920. Throughput: 0: 40882.1. 
Samples: 420071280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 00:54:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:55:00,896][12883] Updated weights for policy 0, policy_version 25640 (0.0033) [2024-06-18 00:55:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 420134912. Throughput: 0: 40676.4. Samples: 420190120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 00:55:01,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 00:55:05,423][12883] Updated weights for policy 0, policy_version 25650 (0.0038) [2024-06-18 00:55:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 420331520. Throughput: 0: 41067.5. Samples: 420440800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 00:55:06,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 00:55:08,907][12883] Updated weights for policy 0, policy_version 25660 (0.0033) [2024-06-18 00:55:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40413.9, 300 sec: 40876.7). Total num frames: 420511744. Throughput: 0: 40864.5. Samples: 420681220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 00:55:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:55:13,483][12883] Updated weights for policy 0, policy_version 25670 (0.0036) [2024-06-18 00:55:16,977][12883] Updated weights for policy 0, policy_version 25680 (0.0032) [2024-06-18 00:55:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 41098.9). Total num frames: 420741120. Throughput: 0: 40767.2. Samples: 420798960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 00:55:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:55:21,179][12883] Updated weights for policy 0, policy_version 25690 (0.0037) [2024-06-18 00:55:21,996][12645] Fps is (10 sec: 40950.8, 60 sec: 40685.4, 300 sec: 40987.8). Total num frames: 420921344. Throughput: 0: 40842.3. Samples: 421048280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:55:21,997][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:55:25,146][12883] Updated weights for policy 0, policy_version 25700 (0.0034) [2024-06-18 00:55:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 421150720. Throughput: 0: 40572.4. Samples: 421288800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:55:27,000][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:55:28,993][12883] Updated weights for policy 0, policy_version 25710 (0.0038) [2024-06-18 00:55:31,994][12645] Fps is (10 sec: 39330.2, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 421314560. Throughput: 0: 40616.0. Samples: 421411620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:55:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:55:32,981][12883] Updated weights for policy 0, policy_version 25720 (0.0043) [2024-06-18 00:55:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.8, 300 sec: 40987.7). Total num frames: 421543936. Throughput: 0: 40740.3. Samples: 421660200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 00:55:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:55:37,091][12883] Updated weights for policy 0, policy_version 25730 (0.0030) [2024-06-18 00:55:41,025][12883] Updated weights for policy 0, policy_version 25740 (0.0043) [2024-06-18 00:55:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40959.8, 300 sec: 40876.7). Total num frames: 421740544. Throughput: 0: 40729.7. Samples: 421904120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 00:55:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:55:44,985][12883] Updated weights for policy 0, policy_version 25750 (0.0030) [2024-06-18 00:55:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 421953536. Throughput: 0: 40699.9. Samples: 422021620. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 00:55:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:55:48,869][12883] Updated weights for policy 0, policy_version 25760 (0.0034) [2024-06-18 00:55:51,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40686.9, 300 sec: 40821.2). Total num frames: 422133760. Throughput: 0: 40672.6. Samples: 422271060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 00:55:51,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:55:53,163][12883] Updated weights for policy 0, policy_version 25770 (0.0050) [2024-06-18 00:55:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 422346752. Throughput: 0: 40802.7. Samples: 422517340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:55:56,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:55:57,246][12883] Updated weights for policy 0, policy_version 25780 (0.0034) [2024-06-18 00:56:00,695][12862] Signal inference workers to stop experience collection... (5950 times) [2024-06-18 00:56:00,728][12883] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-06-18 00:56:00,752][12862] Signal inference workers to resume experience collection... (5950 times) [2024-06-18 00:56:00,753][12883] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-06-18 00:56:01,061][12883] Updated weights for policy 0, policy_version 25790 (0.0033) [2024-06-18 00:56:02,000][12645] Fps is (10 sec: 44209.1, 60 sec: 40682.7, 300 sec: 40931.4). Total num frames: 422576128. Throughput: 0: 40992.5. Samples: 422643880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:56:02,000][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:56:05,149][12883] Updated weights for policy 0, policy_version 25800 (0.0042) [2024-06-18 00:56:06,996][12645] Fps is (10 sec: 40950.5, 60 sec: 40412.4, 300 sec: 40820.8). Total num frames: 422756352. Throughput: 0: 40709.8. 
Samples: 422880220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:56:06,997][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:56:08,968][12883] Updated weights for policy 0, policy_version 25810 (0.0026) [2024-06-18 00:56:11,994][12645] Fps is (10 sec: 37706.7, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 422952960. Throughput: 0: 40843.1. Samples: 423126740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 00:56:11,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:56:13,284][12883] Updated weights for policy 0, policy_version 25820 (0.0035) [2024-06-18 00:56:16,994][12645] Fps is (10 sec: 40969.3, 60 sec: 40413.9, 300 sec: 40932.2). Total num frames: 423165952. Throughput: 0: 40753.8. Samples: 423245540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 00:56:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:56:17,364][12883] Updated weights for policy 0, policy_version 25830 (0.0030) [2024-06-18 00:56:21,546][12883] Updated weights for policy 0, policy_version 25840 (0.0039) [2024-06-18 00:56:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41234.5, 300 sec: 40932.2). Total num frames: 423395328. Throughput: 0: 40676.9. Samples: 423490660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 00:56:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:56:25,309][12883] Updated weights for policy 0, policy_version 25850 (0.0042) [2024-06-18 00:56:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 40877.0). Total num frames: 423575552. Throughput: 0: 40592.2. Samples: 423730760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 00:56:26,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 00:56:29,519][12883] Updated weights for policy 0, policy_version 25860 (0.0035) [2024-06-18 00:56:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 423772160. Throughput: 0: 40741.4. Samples: 423854980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:56:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:56:33,690][12883] Updated weights for policy 0, policy_version 25870 (0.0033) [2024-06-18 00:56:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40414.0, 300 sec: 40876.7). Total num frames: 423968768. Throughput: 0: 40672.0. Samples: 424101300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:56:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:56:37,034][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025878_423985152.pth... [2024-06-18 00:56:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025280_414187520.pth [2024-06-18 00:56:37,670][12883] Updated weights for policy 0, policy_version 25880 (0.0045) [2024-06-18 00:56:41,509][12883] Updated weights for policy 0, policy_version 25890 (0.0026) [2024-06-18 00:56:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 424198144. Throughput: 0: 40653.7. Samples: 424346760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:56:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 00:56:45,557][12883] Updated weights for policy 0, policy_version 25900 (0.0030) [2024-06-18 00:56:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 424394752. Throughput: 0: 40658.0. Samples: 424473240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 00:56:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:56:49,288][12883] Updated weights for policy 0, policy_version 25910 (0.0031) [2024-06-18 00:56:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 424624128. Throughput: 0: 41015.3. Samples: 424725820. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:56:51,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:56:53,250][12883] Updated weights for policy 0, policy_version 25920 (0.0046) [2024-06-18 00:56:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 424804352. Throughput: 0: 40905.8. Samples: 424967500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:56:56,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:56:57,319][12883] Updated weights for policy 0, policy_version 25930 (0.0038) [2024-06-18 00:57:00,955][12883] Updated weights for policy 0, policy_version 25940 (0.0040) [2024-06-18 00:57:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40691.2, 300 sec: 40876.7). Total num frames: 425017344. Throughput: 0: 41047.1. Samples: 425092660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 00:57:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:57:05,299][12883] Updated weights for policy 0, policy_version 25950 (0.0039) [2024-06-18 00:57:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40961.5, 300 sec: 40876.7). Total num frames: 425213952. Throughput: 0: 41034.3. Samples: 425337200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 00:57:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:57:09,172][12883] Updated weights for policy 0, policy_version 25960 (0.0052) [2024-06-18 00:57:11,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 425410560. Throughput: 0: 41029.6. Samples: 425577100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 00:57:11,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:57:13,876][12883] Updated weights for policy 0, policy_version 25970 (0.0043) [2024-06-18 00:57:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 425623552. Throughput: 0: 41026.7. Samples: 425701180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 00:57:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 00:57:17,535][12883] Updated weights for policy 0, policy_version 25980 (0.0036) [2024-06-18 00:57:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40140.9, 300 sec: 40877.0). Total num frames: 425803776. Throughput: 0: 40985.4. Samples: 425945640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 00:57:21,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 00:57:22,113][12883] Updated weights for policy 0, policy_version 25990 (0.0053) [2024-06-18 00:57:25,814][12883] Updated weights for policy 0, policy_version 26000 (0.0027) [2024-06-18 00:57:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 426033152. Throughput: 0: 40803.5. Samples: 426182920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 00:57:26,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 00:57:27,098][12862] Signal inference workers to stop experience collection... (6000 times) [2024-06-18 00:57:27,125][12883] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-06-18 00:57:27,161][12862] Signal inference workers to resume experience collection... (6000 times) [2024-06-18 00:57:27,164][12883] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-06-18 00:57:30,144][12883] Updated weights for policy 0, policy_version 26010 (0.0032) [2024-06-18 00:57:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 426229760. Throughput: 0: 40784.5. Samples: 426308540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 00:57:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:57:33,700][12883] Updated weights for policy 0, policy_version 26020 (0.0049) [2024-06-18 00:57:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 426442752. Throughput: 0: 40702.3. 
Samples: 426557420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 00:57:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:57:38,150][12883] Updated weights for policy 0, policy_version 26030 (0.0047) [2024-06-18 00:57:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40413.9, 300 sec: 40821.2). Total num frames: 426622976. Throughput: 0: 40482.7. Samples: 426789220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 00:57:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:57:42,216][12883] Updated weights for policy 0, policy_version 26040 (0.0034) [2024-06-18 00:57:46,109][12883] Updated weights for policy 0, policy_version 26050 (0.0040) [2024-06-18 00:57:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40414.0, 300 sec: 40765.6). Total num frames: 426819584. Throughput: 0: 40371.6. Samples: 426909380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 00:57:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:57:50,074][12883] Updated weights for policy 0, policy_version 26060 (0.0045) [2024-06-18 00:57:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40413.9, 300 sec: 40933.1). Total num frames: 427048960. Throughput: 0: 40424.1. Samples: 427156280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 00:57:51,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 00:57:54,051][12883] Updated weights for policy 0, policy_version 26070 (0.0025) [2024-06-18 00:57:56,999][12645] Fps is (10 sec: 42575.0, 60 sec: 40683.2, 300 sec: 40764.9). Total num frames: 427245568. Throughput: 0: 40417.8. Samples: 427396120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 00:57:57,000][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:57:57,947][12883] Updated weights for policy 0, policy_version 26080 (0.0057) [2024-06-18 00:58:01,996][12645] Fps is (10 sec: 39312.7, 60 sec: 40412.3, 300 sec: 40876.4). Total num frames: 427442176. Throughput: 0: 40403.3. Samples: 427519420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 00:58:01,996][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:58:02,132][12883] Updated weights for policy 0, policy_version 26090 (0.0027) [2024-06-18 00:58:06,004][12883] Updated weights for policy 0, policy_version 26100 (0.0042) [2024-06-18 00:58:06,996][12645] Fps is (10 sec: 40973.1, 60 sec: 40685.4, 300 sec: 40876.4). Total num frames: 427655168. Throughput: 0: 40461.0. Samples: 427766480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 00:58:06,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:58:10,138][12883] Updated weights for policy 0, policy_version 26110 (0.0034) [2024-06-18 00:58:11,994][12645] Fps is (10 sec: 39330.1, 60 sec: 40413.9, 300 sec: 40654.5). Total num frames: 427835392. Throughput: 0: 40680.5. Samples: 428013540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 00:58:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:58:14,029][12883] Updated weights for policy 0, policy_version 26120 (0.0038) [2024-06-18 00:58:16,994][12645] Fps is (10 sec: 39330.2, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 428048384. Throughput: 0: 40453.3. Samples: 428128940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 00:58:16,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 00:58:18,242][12883] Updated weights for policy 0, policy_version 26130 (0.0041) [2024-06-18 00:58:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 428261376. Throughput: 0: 40411.6. Samples: 428375940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 00:58:21,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 00:58:22,083][12883] Updated weights for policy 0, policy_version 26140 (0.0046) [2024-06-18 00:58:26,383][12883] Updated weights for policy 0, policy_version 26150 (0.0029) [2024-06-18 00:58:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.9, 300 sec: 40765.6). 
Total num frames: 428441600. Throughput: 0: 40588.4. Samples: 428615700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 00:58:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:58:30,091][12883] Updated weights for policy 0, policy_version 26160 (0.0023) [2024-06-18 00:58:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 428687360. Throughput: 0: 40575.1. Samples: 428735260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-18 00:58:31,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:58:34,631][12883] Updated weights for policy 0, policy_version 26170 (0.0042) [2024-06-18 00:58:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40413.8, 300 sec: 40710.1). Total num frames: 428867584. Throughput: 0: 40634.6. Samples: 428984840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-18 00:58:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:58:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026176_428867584.pth... [2024-06-18 00:58:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025581_419119104.pth [2024-06-18 00:58:38,038][12883] Updated weights for policy 0, policy_version 26180 (0.0038) [2024-06-18 00:58:41,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40686.9, 300 sec: 40765.6). Total num frames: 429064192. Throughput: 0: 40687.1. Samples: 429226820. Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-18 00:58:41,998][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:58:42,942][12883] Updated weights for policy 0, policy_version 26190 (0.0038) [2024-06-18 00:58:46,127][12883] Updated weights for policy 0, policy_version 26200 (0.0045) [2024-06-18 00:58:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 429293568. Throughput: 0: 40866.0. Samples: 429358300. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 27.0) [2024-06-18 00:58:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:58:50,537][12883] Updated weights for policy 0, policy_version 26210 (0.0032) [2024-06-18 00:58:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40140.8, 300 sec: 40654.6). Total num frames: 429457408. Throughput: 0: 40852.3. Samples: 429604740. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-18 00:58:51,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:58:52,557][12862] Signal inference workers to stop experience collection... (6050 times) [2024-06-18 00:58:52,599][12883] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-06-18 00:58:52,676][12862] Signal inference workers to resume experience collection... (6050 times) [2024-06-18 00:58:52,676][12883] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-06-18 00:58:54,158][12883] Updated weights for policy 0, policy_version 26220 (0.0031) [2024-06-18 00:58:56,996][12645] Fps is (10 sec: 42589.0, 60 sec: 41235.3, 300 sec: 40876.4). Total num frames: 429719552. Throughput: 0: 40813.2. Samples: 429850220. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-18 00:58:56,996][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 00:58:58,742][12883] Updated weights for policy 0, policy_version 26230 (0.0041) [2024-06-18 00:59:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40961.5, 300 sec: 40876.7). Total num frames: 429899776. Throughput: 0: 41163.1. Samples: 429981280. Policy #0 lag: (min: 1.0, avg: 11.5, max: 21.0) [2024-06-18 00:59:01,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:59:02,055][12883] Updated weights for policy 0, policy_version 26240 (0.0031) [2024-06-18 00:59:06,638][12883] Updated weights for policy 0, policy_version 26250 (0.0045) [2024-06-18 00:59:06,994][12645] Fps is (10 sec: 36053.0, 60 sec: 40415.4, 300 sec: 40654.5). Total num frames: 430080000. Throughput: 0: 40992.0. 
Samples: 430220580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 00:59:06,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:59:10,158][12883] Updated weights for policy 0, policy_version 26260 (0.0048) [2024-06-18 00:59:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 40987.8). Total num frames: 430342144. Throughput: 0: 40952.8. Samples: 430458580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 00:59:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 00:59:14,478][12883] Updated weights for policy 0, policy_version 26270 (0.0023) [2024-06-18 00:59:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.9, 300 sec: 40710.1). Total num frames: 430489600. Throughput: 0: 41101.7. Samples: 430584840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 00:59:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:59:18,236][12883] Updated weights for policy 0, policy_version 26280 (0.0032) [2024-06-18 00:59:21,994][12645] Fps is (10 sec: 37682.8, 60 sec: 40959.9, 300 sec: 40765.6). Total num frames: 430718976. Throughput: 0: 40917.7. Samples: 430826140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 00:59:21,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 00:59:22,709][12883] Updated weights for policy 0, policy_version 26290 (0.0042) [2024-06-18 00:59:26,323][12883] Updated weights for policy 0, policy_version 26300 (0.0042) [2024-06-18 00:59:26,994][12645] Fps is (10 sec: 47514.3, 60 sec: 42052.3, 300 sec: 40987.8). Total num frames: 430964736. Throughput: 0: 41138.3. Samples: 431078040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 00:59:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 00:59:30,499][12883] Updated weights for policy 0, policy_version 26310 (0.0033) [2024-06-18 00:59:31,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40140.8, 300 sec: 40654.5). Total num frames: 431095808. Throughput: 0: 41029.8. Samples: 431204640. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 00:59:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 00:59:34,127][12883] Updated weights for policy 0, policy_version 26320 (0.0030) [2024-06-18 00:59:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 431341568. Throughput: 0: 40896.9. Samples: 431445100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 00:59:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 00:59:38,262][12883] Updated weights for policy 0, policy_version 26330 (0.0028) [2024-06-18 00:59:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 40821.1). Total num frames: 431538176. Throughput: 0: 41158.8. Samples: 431702280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 00:59:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:59:42,129][12883] Updated weights for policy 0, policy_version 26340 (0.0041) [2024-06-18 00:59:46,014][12883] Updated weights for policy 0, policy_version 26350 (0.0038) [2024-06-18 00:59:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 431734784. Throughput: 0: 40895.1. Samples: 431821560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 00:59:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:59:50,149][12883] Updated weights for policy 0, policy_version 26360 (0.0032) [2024-06-18 00:59:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 40876.7). Total num frames: 431980544. Throughput: 0: 41059.1. Samples: 432068240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 00:59:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 00:59:53,688][12862] Signal inference workers to stop experience collection... (6100 times) [2024-06-18 00:59:53,688][12862] Signal inference workers to resume experience collection... 
(6100 times) [2024-06-18 00:59:53,746][12883] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-06-18 00:59:53,746][12883] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-06-18 00:59:54,080][12883] Updated weights for policy 0, policy_version 26370 (0.0035) [2024-06-18 00:59:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40415.3, 300 sec: 40710.1). Total num frames: 432144384. Throughput: 0: 41340.3. Samples: 432318900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 00:59:56,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 00:59:58,217][12883] Updated weights for policy 0, policy_version 26380 (0.0034) [2024-06-18 01:00:01,827][12883] Updated weights for policy 0, policy_version 26390 (0.0031) [2024-06-18 01:00:01,995][12645] Fps is (10 sec: 39316.2, 60 sec: 41232.2, 300 sec: 40821.0). Total num frames: 432373760. Throughput: 0: 41086.8. Samples: 432433800. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-18 01:00:01,996][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:00:06,055][12883] Updated weights for policy 0, policy_version 26400 (0.0036) [2024-06-18 01:00:06,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41506.1, 300 sec: 40876.7). Total num frames: 432570368. Throughput: 0: 41406.0. Samples: 432689400. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-18 01:00:06,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 01:00:09,837][12883] Updated weights for policy 0, policy_version 26410 (0.0029) [2024-06-18 01:00:11,996][12645] Fps is (10 sec: 39318.0, 60 sec: 40412.3, 300 sec: 40765.3). Total num frames: 432766976. Throughput: 0: 41279.6. Samples: 432935720. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-18 01:00:11,997][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:00:14,267][12883] Updated weights for policy 0, policy_version 26420 (0.0042) [2024-06-18 01:00:16,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41506.1, 300 sec: 40877.0). 
Total num frames: 432979968. Throughput: 0: 41115.0. Samples: 433054820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 01:00:16,999][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:00:17,948][12883] Updated weights for policy 0, policy_version 26430 (0.0030) [2024-06-18 01:00:21,982][12883] Updated weights for policy 0, policy_version 26440 (0.0035) [2024-06-18 01:00:21,994][12645] Fps is (10 sec: 42608.1, 60 sec: 41233.1, 300 sec: 40821.1). Total num frames: 433192960. Throughput: 0: 41246.1. Samples: 433301180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 01:00:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:00:25,762][12883] Updated weights for policy 0, policy_version 26450 (0.0047) [2024-06-18 01:00:26,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40140.8, 300 sec: 40876.7). Total num frames: 433373184. Throughput: 0: 41029.5. Samples: 433548600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 01:00:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:00:29,783][12883] Updated weights for policy 0, policy_version 26460 (0.0039) [2024-06-18 01:00:31,998][12645] Fps is (10 sec: 39303.0, 60 sec: 41502.8, 300 sec: 40820.5). Total num frames: 433586176. Throughput: 0: 41074.9. Samples: 433670120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 01:00:31,999][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:00:34,038][12883] Updated weights for policy 0, policy_version 26470 (0.0032) [2024-06-18 01:00:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 433815552. Throughput: 0: 41048.8. Samples: 433915440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:00:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:00:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026478_433815552.pth... 
[2024-06-18 01:00:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000025878_423985152.pth [2024-06-18 01:00:38,290][12883] Updated weights for policy 0, policy_version 26480 (0.0033) [2024-06-18 01:00:41,829][12883] Updated weights for policy 0, policy_version 26490 (0.0046) [2024-06-18 01:00:41,994][12645] Fps is (10 sec: 42618.4, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 434012160. Throughput: 0: 41028.1. Samples: 434165160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:00:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:00:46,163][12883] Updated weights for policy 0, policy_version 26500 (0.0035) [2024-06-18 01:00:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 434208768. Throughput: 0: 41301.6. Samples: 434292320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:00:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:00:49,627][12883] Updated weights for policy 0, policy_version 26510 (0.0029) [2024-06-18 01:00:51,994][12645] Fps is (10 sec: 40957.9, 60 sec: 40686.5, 300 sec: 40932.1). Total num frames: 434421760. Throughput: 0: 41019.4. Samples: 434535300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:00:51,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:00:54,033][12883] Updated weights for policy 0, policy_version 26520 (0.0037) [2024-06-18 01:00:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 40822.0). Total num frames: 434618368. Throughput: 0: 40987.9. Samples: 434780080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:00:56,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:00:57,831][12883] Updated weights for policy 0, policy_version 26530 (0.0030) [2024-06-18 01:01:01,994][12645] Fps is (10 sec: 39323.9, 60 sec: 40687.9, 300 sec: 40877.0). Total num frames: 434814976. Throughput: 0: 41143.7. Samples: 434906280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:01:01,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:01:02,191][12883] Updated weights for policy 0, policy_version 26540 (0.0034) [2024-06-18 01:01:05,778][12883] Updated weights for policy 0, policy_version 26550 (0.0030) [2024-06-18 01:01:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 435044352. Throughput: 0: 41207.5. Samples: 435155520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:01:06,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:01:10,234][12883] Updated weights for policy 0, policy_version 26560 (0.0035) [2024-06-18 01:01:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41507.7, 300 sec: 40987.8). Total num frames: 435257344. Throughput: 0: 41021.2. Samples: 435394560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:01:11,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:01:13,721][12883] Updated weights for policy 0, policy_version 26570 (0.0048) [2024-06-18 01:01:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 435437568. Throughput: 0: 41253.9. Samples: 435526360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:01:16,995][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:01:18,037][12883] Updated weights for policy 0, policy_version 26580 (0.0034) [2024-06-18 01:01:21,910][12883] Updated weights for policy 0, policy_version 26590 (0.0031) [2024-06-18 01:01:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 435650560. Throughput: 0: 41269.7. Samples: 435772580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:01:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:01:26,247][12883] Updated weights for policy 0, policy_version 26600 (0.0031) [2024-06-18 01:01:26,993][12645] Fps is (10 sec: 44238.4, 60 sec: 41779.2, 300 sec: 41043.3). 
Total num frames: 435879936. Throughput: 0: 41180.2. Samples: 436018260. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-18 01:01:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:01:29,810][12883] Updated weights for policy 0, policy_version 26610 (0.0035) [2024-06-18 01:01:32,000][12645] Fps is (10 sec: 42572.1, 60 sec: 41505.1, 300 sec: 41042.4). Total num frames: 436076544. Throughput: 0: 41115.2. Samples: 436142760. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-18 01:01:32,001][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:01:33,973][12883] Updated weights for policy 0, policy_version 26620 (0.0045) [2024-06-18 01:01:34,648][12862] Signal inference workers to stop experience collection... (6150 times) [2024-06-18 01:01:34,700][12883] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-06-18 01:01:34,704][12862] Signal inference workers to resume experience collection... (6150 times) [2024-06-18 01:01:34,712][12883] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-06-18 01:01:36,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 436273152. Throughput: 0: 41129.8. Samples: 436386120. Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-18 01:01:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:01:37,533][12883] Updated weights for policy 0, policy_version 26630 (0.0035) [2024-06-18 01:01:41,713][12883] Updated weights for policy 0, policy_version 26640 (0.0038) [2024-06-18 01:01:41,994][12645] Fps is (10 sec: 39345.4, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 436469760. Throughput: 0: 41289.5. Samples: 436638120. 
Policy #0 lag: (min: 2.0, avg: 11.7, max: 23.0) [2024-06-18 01:01:41,995][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:01:45,821][12883] Updated weights for policy 0, policy_version 26650 (0.0046) [2024-06-18 01:01:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40960.1, 300 sec: 40821.2). Total num frames: 436666368. Throughput: 0: 41097.8. Samples: 436755680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:01:46,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:01:49,701][12883] Updated weights for policy 0, policy_version 26660 (0.0041) [2024-06-18 01:01:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.4, 300 sec: 40987.8). Total num frames: 436895744. Throughput: 0: 41002.7. Samples: 437000640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:01:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:01:53,562][12883] Updated weights for policy 0, policy_version 26670 (0.0040) [2024-06-18 01:01:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 437075968. Throughput: 0: 41233.3. Samples: 437250060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:01:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:01:57,586][12883] Updated weights for policy 0, policy_version 26680 (0.0033) [2024-06-18 01:02:01,872][12883] Updated weights for policy 0, policy_version 26690 (0.0042) [2024-06-18 01:02:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 437288960. Throughput: 0: 40899.3. Samples: 437366820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:02:01,994][12645] Avg episode reward: [(0, '0.029')] [2024-06-18 01:02:05,725][12883] Updated weights for policy 0, policy_version 26700 (0.0039) [2024-06-18 01:02:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 437501952. Throughput: 0: 40867.9. Samples: 437611640. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-18 01:02:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:02:09,890][12883] Updated weights for policy 0, policy_version 26710 (0.0043) [2024-06-18 01:02:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40413.8, 300 sec: 40876.7). Total num frames: 437682176. Throughput: 0: 40916.2. Samples: 437859500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-18 01:02:11,995][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:02:13,733][12883] Updated weights for policy 0, policy_version 26720 (0.0047) [2024-06-18 01:02:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 437911552. Throughput: 0: 40829.2. Samples: 437979820. Policy #0 lag: (min: 0.0, avg: 7.9, max: 20.0) [2024-06-18 01:02:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:02:17,977][12883] Updated weights for policy 0, policy_version 26730 (0.0032) [2024-06-18 01:02:21,953][12883] Updated weights for policy 0, policy_version 26740 (0.0033) [2024-06-18 01:02:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 438108160. Throughput: 0: 40828.4. Samples: 438223400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 01:02:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:02:25,721][12883] Updated weights for policy 0, policy_version 26750 (0.0029) [2024-06-18 01:02:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.7, 300 sec: 40932.2). Total num frames: 438304768. Throughput: 0: 40732.1. Samples: 438471060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 01:02:26,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:02:29,901][12883] Updated weights for policy 0, policy_version 26760 (0.0037) [2024-06-18 01:02:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40964.3, 300 sec: 40987.8). Total num frames: 438534144. Throughput: 0: 40824.4. Samples: 438592780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 01:02:31,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:02:33,433][12883] Updated weights for policy 0, policy_version 26770 (0.0047) [2024-06-18 01:02:36,996][12645] Fps is (10 sec: 39313.1, 60 sec: 40412.4, 300 sec: 40931.9). Total num frames: 438697984. Throughput: 0: 40981.5. Samples: 438844900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 01:02:36,996][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 01:02:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026777_438714368.pth... [2024-06-18 01:02:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026176_428867584.pth [2024-06-18 01:02:37,885][12883] Updated weights for policy 0, policy_version 26780 (0.0046) [2024-06-18 01:02:41,236][12883] Updated weights for policy 0, policy_version 26790 (0.0030) [2024-06-18 01:02:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 438943744. Throughput: 0: 40736.4. Samples: 439083200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 01:02:41,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:02:46,090][12883] Updated weights for policy 0, policy_version 26800 (0.0028) [2024-06-18 01:02:46,994][12645] Fps is (10 sec: 44245.9, 60 sec: 41232.9, 300 sec: 40987.7). Total num frames: 439140352. Throughput: 0: 41084.3. Samples: 439215620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 01:02:46,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:02:49,134][12883] Updated weights for policy 0, policy_version 26810 (0.0031) [2024-06-18 01:02:51,994][12645] Fps is (10 sec: 37683.8, 60 sec: 40413.9, 300 sec: 40933.0). Total num frames: 439320576. Throughput: 0: 41079.7. Samples: 439460220. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 01:02:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:02:54,151][12883] Updated weights for policy 0, policy_version 26820 (0.0034) [2024-06-18 01:02:56,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41506.2, 300 sec: 41099.2). Total num frames: 439566336. Throughput: 0: 40950.8. Samples: 439702280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 01:02:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:02:57,020][12883] Updated weights for policy 0, policy_version 26830 (0.0051) [2024-06-18 01:03:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.9, 300 sec: 40932.5). Total num frames: 439730176. Throughput: 0: 41202.7. Samples: 439833940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 01:03:01,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:03:02,163][12883] Updated weights for policy 0, policy_version 26840 (0.0044) [2024-06-18 01:03:04,843][12883] Updated weights for policy 0, policy_version 26850 (0.0027) [2024-06-18 01:03:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 439959552. Throughput: 0: 41152.5. Samples: 440075260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 01:03:06,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:03:10,249][12883] Updated weights for policy 0, policy_version 26860 (0.0044) [2024-06-18 01:03:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 41779.3, 300 sec: 41154.4). Total num frames: 440188928. Throughput: 0: 41188.1. Samples: 440324520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 01:03:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:03:12,852][12883] Updated weights for policy 0, policy_version 26870 (0.0039) [2024-06-18 01:03:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 440369152. Throughput: 0: 41379.2. Samples: 440454840. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 01:03:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:03:18,032][12883] Updated weights for policy 0, policy_version 26880 (0.0038) [2024-06-18 01:03:18,166][12862] Signal inference workers to stop experience collection... (6200 times) [2024-06-18 01:03:18,166][12862] Signal inference workers to resume experience collection... (6200 times) [2024-06-18 01:03:18,208][12883] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-06-18 01:03:18,208][12883] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-06-18 01:03:21,110][12883] Updated weights for policy 0, policy_version 26890 (0.0033) [2024-06-18 01:03:21,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 440582144. Throughput: 0: 41176.8. Samples: 440697760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 01:03:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:03:25,773][12883] Updated weights for policy 0, policy_version 26900 (0.0042) [2024-06-18 01:03:26,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 440795136. Throughput: 0: 41511.5. Samples: 440951220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 01:03:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:03:28,836][12883] Updated weights for policy 0, policy_version 26910 (0.0033) [2024-06-18 01:03:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 440975360. Throughput: 0: 41311.4. Samples: 441074620. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 01:03:31,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:03:33,652][12883] Updated weights for policy 0, policy_version 26920 (0.0035) [2024-06-18 01:03:36,650][12883] Updated weights for policy 0, policy_version 26930 (0.0042) [2024-06-18 01:03:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42053.8, 300 sec: 41209.9). Total num frames: 441221120. Throughput: 0: 41173.6. Samples: 441313040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 01:03:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:03:41,444][12883] Updated weights for policy 0, policy_version 26940 (0.0041) [2024-06-18 01:03:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 441401344. Throughput: 0: 41540.8. Samples: 441571620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 01:03:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:03:44,892][12883] Updated weights for policy 0, policy_version 26950 (0.0032) [2024-06-18 01:03:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41233.2, 300 sec: 41209.9). Total num frames: 441614336. Throughput: 0: 41205.8. Samples: 441688200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 01:03:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:03:49,758][12883] Updated weights for policy 0, policy_version 26960 (0.0024) [2024-06-18 01:03:51,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42050.6, 300 sec: 41098.8). Total num frames: 441843712. Throughput: 0: 41382.9. Samples: 441937580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-06-18 01:03:51,996][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:03:53,365][12883] Updated weights for policy 0, policy_version 26970 (0.0031) [2024-06-18 01:03:56,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 441991168. Throughput: 0: 41573.7. Samples: 442195340. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-06-18 01:03:57,000][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:03:57,451][12883] Updated weights for policy 0, policy_version 26980 (0.0038) [2024-06-18 01:04:01,224][12883] Updated weights for policy 0, policy_version 26990 (0.0046) [2024-06-18 01:04:01,994][12645] Fps is (10 sec: 37691.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 442220544. Throughput: 0: 41070.5. Samples: 442303020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-06-18 01:04:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:04:05,697][12883] Updated weights for policy 0, policy_version 27000 (0.0042) [2024-06-18 01:04:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 442433536. Throughput: 0: 41335.1. Samples: 442557840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 19.0) [2024-06-18 01:04:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:04:08,933][12883] Updated weights for policy 0, policy_version 27010 (0.0049) [2024-06-18 01:04:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 442630144. Throughput: 0: 40975.7. Samples: 442795120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 01:04:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:04:13,615][12883] Updated weights for policy 0, policy_version 27020 (0.0027) [2024-06-18 01:04:16,603][12883] Updated weights for policy 0, policy_version 27030 (0.0035) [2024-06-18 01:04:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 442859520. Throughput: 0: 41050.0. Samples: 442921880. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 01:04:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:04:21,531][12883] Updated weights for policy 0, policy_version 27040 (0.0046) [2024-06-18 01:04:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.8, 300 sec: 40876.7). 
Total num frames: 443023360. Throughput: 0: 41209.8. Samples: 443167480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 01:04:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:04:24,458][12883] Updated weights for policy 0, policy_version 27050 (0.0030) [2024-06-18 01:04:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 443252736. Throughput: 0: 40834.3. Samples: 443409160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:04:26,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:04:29,768][12883] Updated weights for policy 0, policy_version 27060 (0.0044) [2024-06-18 01:04:31,766][12862] Signal inference workers to stop experience collection... (6250 times) [2024-06-18 01:04:31,810][12883] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-06-18 01:04:31,819][12862] Signal inference workers to resume experience collection... (6250 times) [2024-06-18 01:04:31,833][12883] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-06-18 01:04:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.0, 300 sec: 41154.4). Total num frames: 443482112. Throughput: 0: 41132.7. Samples: 443539180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:04:31,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:04:32,343][12883] Updated weights for policy 0, policy_version 27070 (0.0044) [2024-06-18 01:04:36,994][12645] Fps is (10 sec: 37681.9, 60 sec: 40140.7, 300 sec: 40987.7). Total num frames: 443629568. Throughput: 0: 41000.0. Samples: 443782500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:04:36,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:04:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027077_443629568.pth... 
[2024-06-18 01:04:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026478_433815552.pth [2024-06-18 01:04:37,891][12883] Updated weights for policy 0, policy_version 27080 (0.0035) [2024-06-18 01:04:40,675][12883] Updated weights for policy 0, policy_version 27090 (0.0038) [2024-06-18 01:04:41,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 443858944. Throughput: 0: 40497.4. Samples: 444017720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:04:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:04:45,756][12883] Updated weights for policy 0, policy_version 27100 (0.0043) [2024-06-18 01:04:46,994][12645] Fps is (10 sec: 42599.3, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 444055552. Throughput: 0: 40913.8. Samples: 444144140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 01:04:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:04:48,867][12883] Updated weights for policy 0, policy_version 27110 (0.0030) [2024-06-18 01:04:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40142.3, 300 sec: 41043.3). Total num frames: 444252160. Throughput: 0: 40486.6. Samples: 444379740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 01:04:51,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:04:53,899][12883] Updated weights for policy 0, policy_version 27120 (0.0024) [2024-06-18 01:04:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41043.5). Total num frames: 444481536. Throughput: 0: 40671.0. Samples: 444625320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 01:04:56,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:04:57,020][12883] Updated weights for policy 0, policy_version 27130 (0.0051) [2024-06-18 01:05:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 444645376. Throughput: 0: 40616.1. Samples: 444749600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 01:05:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:05:02,026][12883] Updated weights for policy 0, policy_version 27140 (0.0039) [2024-06-18 01:05:05,105][12883] Updated weights for policy 0, policy_version 27150 (0.0033) [2024-06-18 01:05:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41099.2). Total num frames: 444891136. Throughput: 0: 40525.4. Samples: 444991120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-18 01:05:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:05:10,024][12883] Updated weights for policy 0, policy_version 27160 (0.0038) [2024-06-18 01:05:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 445087744. Throughput: 0: 40591.5. Samples: 445235780. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-18 01:05:11,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:05:13,087][12883] Updated weights for policy 0, policy_version 27170 (0.0029) [2024-06-18 01:05:16,994][12645] Fps is (10 sec: 36044.4, 60 sec: 39867.7, 300 sec: 40876.7). Total num frames: 445251584. Throughput: 0: 40359.6. Samples: 445355360. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-18 01:05:16,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:05:17,994][12883] Updated weights for policy 0, policy_version 27180 (0.0043) [2024-06-18 01:05:21,551][12883] Updated weights for policy 0, policy_version 27190 (0.0033) [2024-06-18 01:05:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 445497344. Throughput: 0: 40291.4. Samples: 445595600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:05:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:05:26,100][12883] Updated weights for policy 0, policy_version 27200 (0.0044) [2024-06-18 01:05:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40413.8, 300 sec: 40988.4). 
Total num frames: 445677568. Throughput: 0: 40400.8. Samples: 445835760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:05:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:05:29,485][12883] Updated weights for policy 0, policy_version 27210 (0.0036) [2024-06-18 01:05:31,994][12645] Fps is (10 sec: 36044.8, 60 sec: 39594.8, 300 sec: 40821.2). Total num frames: 445857792. Throughput: 0: 40164.1. Samples: 445951520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:05:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:05:34,386][12883] Updated weights for policy 0, policy_version 27220 (0.0047) [2024-06-18 01:05:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40960.3, 300 sec: 40932.2). Total num frames: 446087168. Throughput: 0: 40393.4. Samples: 446197440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:05:36,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:05:37,602][12883] Updated weights for policy 0, policy_version 27230 (0.0046) [2024-06-18 01:05:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40140.7, 300 sec: 40876.7). Total num frames: 446267392. Throughput: 0: 40443.1. Samples: 446445260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 01:05:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:05:42,432][12883] Updated weights for policy 0, policy_version 27240 (0.0036) [2024-06-18 01:05:46,274][12883] Updated weights for policy 0, policy_version 27250 (0.0038) [2024-06-18 01:05:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40414.0, 300 sec: 40876.8). Total num frames: 446480384. Throughput: 0: 40156.0. Samples: 446556620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 01:05:46,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:05:50,449][12883] Updated weights for policy 0, policy_version 27260 (0.0036) [2024-06-18 01:05:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40414.0, 300 sec: 40876.7). 
Total num frames: 446676992. Throughput: 0: 40266.8. Samples: 446803120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 01:05:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:05:54,168][12883] Updated weights for policy 0, policy_version 27270 (0.0039) [2024-06-18 01:05:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40140.9, 300 sec: 40932.2). Total num frames: 446889984. Throughput: 0: 40377.5. Samples: 447052760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 01:05:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:05:58,402][12883] Updated weights for policy 0, policy_version 27280 (0.0046) [2024-06-18 01:06:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 447102976. Throughput: 0: 40382.3. Samples: 447172560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 01:06:01,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:06:02,346][12883] Updated weights for policy 0, policy_version 27290 (0.0025) [2024-06-18 01:06:03,891][12862] Signal inference workers to stop experience collection... (6300 times) [2024-06-18 01:06:03,928][12883] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-06-18 01:06:03,941][12862] Signal inference workers to resume experience collection... (6300 times) [2024-06-18 01:06:03,956][12883] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-06-18 01:06:06,282][12883] Updated weights for policy 0, policy_version 27300 (0.0044) [2024-06-18 01:06:06,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40140.8, 300 sec: 40821.2). Total num frames: 447299584. Throughput: 0: 40593.7. Samples: 447422320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 01:06:06,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:06:10,279][12883] Updated weights for policy 0, policy_version 27310 (0.0030) [2024-06-18 01:06:11,994][12645] Fps is (10 sec: 37683.5, 60 sec: 39867.8, 300 sec: 40821.2). Total num frames: 447479808. Throughput: 0: 40735.2. Samples: 447668840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 01:06:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:06:14,205][12883] Updated weights for policy 0, policy_version 27320 (0.0029) [2024-06-18 01:06:17,000][12645] Fps is (10 sec: 42572.0, 60 sec: 41228.9, 300 sec: 40931.4). Total num frames: 447725568. Throughput: 0: 40738.3. Samples: 447785000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:06:17,001][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:06:18,379][12883] Updated weights for policy 0, policy_version 27330 (0.0036) [2024-06-18 01:06:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 40413.9, 300 sec: 40821.1). Total num frames: 447922176. Throughput: 0: 40884.8. Samples: 448037260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:06:21,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:06:22,047][12883] Updated weights for policy 0, policy_version 27340 (0.0040) [2024-06-18 01:06:26,139][12883] Updated weights for policy 0, policy_version 27350 (0.0037) [2024-06-18 01:06:26,994][12645] Fps is (10 sec: 39346.1, 60 sec: 40687.0, 300 sec: 40822.0). Total num frames: 448118784. Throughput: 0: 40840.0. Samples: 448283060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:06:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:06:29,994][12883] Updated weights for policy 0, policy_version 27360 (0.0028) [2024-06-18 01:06:31,996][12645] Fps is (10 sec: 42589.2, 60 sec: 41504.6, 300 sec: 40931.9). Total num frames: 448348160. Throughput: 0: 41156.6. Samples: 448408760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:06:31,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:06:34,717][12883] Updated weights for policy 0, policy_version 27370 (0.0040) [2024-06-18 01:06:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40687.0, 300 sec: 40876.7). Total num frames: 448528384. Throughput: 0: 41184.9. Samples: 448656440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:06:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:06:37,109][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027377_448544768.pth... [2024-06-18 01:06:37,165][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000026777_438714368.pth [2024-06-18 01:06:37,788][12883] Updated weights for policy 0, policy_version 27380 (0.0031) [2024-06-18 01:06:41,994][12645] Fps is (10 sec: 39330.1, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 448741376. Throughput: 0: 41146.1. Samples: 448904340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:06:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:06:42,499][12883] Updated weights for policy 0, policy_version 27390 (0.0033) [2024-06-18 01:06:46,374][12883] Updated weights for policy 0, policy_version 27400 (0.0036) [2024-06-18 01:06:46,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 448937984. Throughput: 0: 41189.3. Samples: 449026080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:06:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:06:50,174][12883] Updated weights for policy 0, policy_version 27410 (0.0036) [2024-06-18 01:06:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 449150976. Throughput: 0: 41162.3. Samples: 449274620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 01:06:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:06:54,419][12883] Updated weights for policy 0, policy_version 27420 (0.0027) [2024-06-18 01:06:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41232.9, 300 sec: 40932.2). Total num frames: 449363968. Throughput: 0: 41103.4. Samples: 449518500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 01:06:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:06:57,914][12883] Updated weights for policy 0, policy_version 27430 (0.0046) [2024-06-18 01:07:01,996][12645] Fps is (10 sec: 40950.6, 60 sec: 40958.5, 300 sec: 40876.4). Total num frames: 449560576. Throughput: 0: 41282.8. Samples: 449642560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 01:07:01,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:07:02,416][12883] Updated weights for policy 0, policy_version 27440 (0.0035) [2024-06-18 01:07:06,051][12883] Updated weights for policy 0, policy_version 27450 (0.0028) [2024-06-18 01:07:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 449757184. Throughput: 0: 41077.7. Samples: 449885760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 01:07:06,995][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:07:10,314][12883] Updated weights for policy 0, policy_version 27460 (0.0037) [2024-06-18 01:07:11,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41506.1, 300 sec: 40876.7). Total num frames: 449970176. Throughput: 0: 41008.9. Samples: 450128460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:07:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:07:14,020][12883] Updated weights for policy 0, policy_version 27470 (0.0036) [2024-06-18 01:07:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40691.2, 300 sec: 40876.7). Total num frames: 450166784. Throughput: 0: 40891.8. Samples: 450248800. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:07:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:07:18,336][12883] Updated weights for policy 0, policy_version 27480 (0.0039) [2024-06-18 01:07:21,947][12883] Updated weights for policy 0, policy_version 27490 (0.0040) [2024-06-18 01:07:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 450396160. Throughput: 0: 40941.3. Samples: 450498800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:07:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:07:24,062][12862] Signal inference workers to stop experience collection... (6350 times) [2024-06-18 01:07:24,115][12883] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-06-18 01:07:24,118][12862] Signal inference workers to resume experience collection... (6350 times) [2024-06-18 01:07:24,144][12883] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-06-18 01:07:26,421][12883] Updated weights for policy 0, policy_version 27500 (0.0047) [2024-06-18 01:07:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 450576384. Throughput: 0: 40891.1. Samples: 450744440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:07:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:07:29,767][12883] Updated weights for policy 0, policy_version 27510 (0.0028) [2024-06-18 01:07:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40961.5, 300 sec: 41043.6). Total num frames: 450805760. Throughput: 0: 40956.1. Samples: 450869100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 01:07:31,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:07:34,239][12883] Updated weights for policy 0, policy_version 27520 (0.0033) [2024-06-18 01:07:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41506.1, 300 sec: 40932.3). Total num frames: 451018752. Throughput: 0: 40905.4. 
Samples: 451115360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 01:07:36,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:07:37,751][12883] Updated weights for policy 0, policy_version 27530 (0.0032) [2024-06-18 01:07:41,996][12645] Fps is (10 sec: 39312.9, 60 sec: 40958.5, 300 sec: 40876.4). Total num frames: 451198976. Throughput: 0: 41058.1. Samples: 451366200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 01:07:41,996][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:07:42,180][12883] Updated weights for policy 0, policy_version 27540 (0.0030) [2024-06-18 01:07:45,664][12883] Updated weights for policy 0, policy_version 27550 (0.0038) [2024-06-18 01:07:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 451411968. Throughput: 0: 40930.6. Samples: 451484340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 01:07:46,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:07:50,446][12883] Updated weights for policy 0, policy_version 27560 (0.0045) [2024-06-18 01:07:51,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40959.9, 300 sec: 40821.1). Total num frames: 451608576. Throughput: 0: 40969.8. Samples: 451729400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:07:52,004][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:07:53,780][12883] Updated weights for policy 0, policy_version 27570 (0.0029) [2024-06-18 01:07:56,996][12645] Fps is (10 sec: 39312.2, 60 sec: 40685.5, 300 sec: 40931.9). Total num frames: 451805184. Throughput: 0: 40959.3. Samples: 451971720. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:07:57,005][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:07:58,281][12883] Updated weights for policy 0, policy_version 27580 (0.0040) [2024-06-18 01:08:01,686][12883] Updated weights for policy 0, policy_version 27590 (0.0044) [2024-06-18 01:08:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41234.7, 300 sec: 40932.2). Total num frames: 452034560. Throughput: 0: 41114.3. Samples: 452098940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:08:01,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 01:08:01,994][12862] Saving new best policy, reward=0.048! [2024-06-18 01:08:06,149][12883] Updated weights for policy 0, policy_version 27600 (0.0033) [2024-06-18 01:08:06,994][12645] Fps is (10 sec: 40969.4, 60 sec: 40960.1, 300 sec: 40765.6). Total num frames: 452214784. Throughput: 0: 41104.8. Samples: 452348520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:08:06,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 01:08:09,499][12883] Updated weights for policy 0, policy_version 27610 (0.0033) [2024-06-18 01:08:11,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 452427776. Throughput: 0: 41051.0. Samples: 452591740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:08:12,000][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:08:14,327][12883] Updated weights for policy 0, policy_version 27620 (0.0031) [2024-06-18 01:08:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41506.2, 300 sec: 40932.2). Total num frames: 452657152. Throughput: 0: 41190.3. Samples: 452722660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:08:16,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:08:17,355][12883] Updated weights for policy 0, policy_version 27630 (0.0041) [2024-06-18 01:08:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40686.9, 300 sec: 40821.2). Total num frames: 452837376. 
Throughput: 0: 41092.9. Samples: 452964540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:08:21,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:08:22,100][12883] Updated weights for policy 0, policy_version 27640 (0.0030) [2024-06-18 01:08:25,327][12883] Updated weights for policy 0, policy_version 27650 (0.0041) [2024-06-18 01:08:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 40987.7). Total num frames: 453066752. Throughput: 0: 40918.8. Samples: 453207460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 01:08:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:08:30,202][12883] Updated weights for policy 0, policy_version 27660 (0.0029) [2024-06-18 01:08:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 40821.2). Total num frames: 453263360. Throughput: 0: 41267.0. Samples: 453341360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 01:08:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:08:33,161][12883] Updated weights for policy 0, policy_version 27670 (0.0033) [2024-06-18 01:08:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 453459968. Throughput: 0: 41281.4. Samples: 453587060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 01:08:36,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:08:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027677_453459968.pth... [2024-06-18 01:08:37,084][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027077_443629568.pth [2024-06-18 01:08:37,929][12883] Updated weights for policy 0, policy_version 27680 (0.0040) [2024-06-18 01:08:41,256][12883] Updated weights for policy 0, policy_version 27690 (0.0039) [2024-06-18 01:08:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41507.7, 300 sec: 40932.2). Total num frames: 453689344. Throughput: 0: 41287.4. Samples: 453829560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 01:08:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:08:45,852][12883] Updated weights for policy 0, policy_version 27700 (0.0043) [2024-06-18 01:08:46,996][12645] Fps is (10 sec: 40950.8, 60 sec: 40958.4, 300 sec: 40765.6). Total num frames: 453869568. Throughput: 0: 41361.8. Samples: 453960320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 01:08:46,997][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:08:49,401][12883] Updated weights for policy 0, policy_version 27710 (0.0038) [2024-06-18 01:08:49,769][12862] Signal inference workers to stop experience collection... (6400 times) [2024-06-18 01:08:49,796][12883] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-06-18 01:08:49,822][12862] Signal inference workers to resume experience collection... (6400 times) [2024-06-18 01:08:49,823][12883] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-06-18 01:08:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 454066176. Throughput: 0: 41086.2. Samples: 454197400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 01:08:51,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:08:53,578][12883] Updated weights for policy 0, policy_version 27720 (0.0040) [2024-06-18 01:08:56,994][12645] Fps is (10 sec: 42608.0, 60 sec: 41507.7, 300 sec: 40932.2). Total num frames: 454295552. Throughput: 0: 41232.9. Samples: 454447220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 01:08:56,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:08:57,229][12883] Updated weights for policy 0, policy_version 27730 (0.0039) [2024-06-18 01:09:01,470][12883] Updated weights for policy 0, policy_version 27740 (0.0037) [2024-06-18 01:09:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 454492160. Throughput: 0: 41269.4. 
Samples: 454579780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 01:09:01,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:09:05,143][12883] Updated weights for policy 0, policy_version 27750 (0.0037) [2024-06-18 01:09:06,996][12645] Fps is (10 sec: 40950.8, 60 sec: 41504.6, 300 sec: 40931.9). Total num frames: 454705152. Throughput: 0: 41198.3. Samples: 454818560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 01:09:06,997][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:09:09,630][12883] Updated weights for policy 0, policy_version 27760 (0.0037) [2024-06-18 01:09:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 40876.7). Total num frames: 454918144. Throughput: 0: 41399.2. Samples: 455070420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 01:09:11,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:09:13,023][12883] Updated weights for policy 0, policy_version 27770 (0.0039) [2024-06-18 01:09:16,994][12645] Fps is (10 sec: 39330.8, 60 sec: 40687.0, 300 sec: 40932.3). Total num frames: 455098368. Throughput: 0: 41160.5. Samples: 455193580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 01:09:16,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:09:17,474][12883] Updated weights for policy 0, policy_version 27780 (0.0034) [2024-06-18 01:09:21,031][12883] Updated weights for policy 0, policy_version 27790 (0.0044) [2024-06-18 01:09:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 455327744. Throughput: 0: 41115.1. Samples: 455437240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 01:09:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:09:25,513][12883] Updated weights for policy 0, policy_version 27800 (0.0045) [2024-06-18 01:09:26,996][12645] Fps is (10 sec: 44226.6, 60 sec: 41231.6, 300 sec: 40876.4). Total num frames: 455540736. Throughput: 0: 41241.9. Samples: 455685540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 01:09:26,996][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:09:29,155][12883] Updated weights for policy 0, policy_version 27810 (0.0036) [2024-06-18 01:09:31,994][12645] Fps is (10 sec: 39322.3, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 455720960. Throughput: 0: 41089.3. Samples: 455809240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 01:09:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:09:33,421][12883] Updated weights for policy 0, policy_version 27820 (0.0040) [2024-06-18 01:09:36,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41506.2, 300 sec: 40987.8). Total num frames: 455950336. Throughput: 0: 41268.4. Samples: 456054480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 01:09:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:09:37,526][12883] Updated weights for policy 0, policy_version 27830 (0.0028) [2024-06-18 01:09:41,919][12883] Updated weights for policy 0, policy_version 27840 (0.0036) [2024-06-18 01:09:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 456130560. Throughput: 0: 41381.8. Samples: 456309400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:09:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:09:45,329][12883] Updated weights for policy 0, policy_version 27850 (0.0047) [2024-06-18 01:09:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41234.7, 300 sec: 40987.8). Total num frames: 456343552. Throughput: 0: 40991.1. Samples: 456424380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:09:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:09:49,857][12883] Updated weights for policy 0, policy_version 27860 (0.0046) [2024-06-18 01:09:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 40932.2). Total num frames: 456556544. Throughput: 0: 41057.6. Samples: 456666060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:09:51,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:09:53,688][12883] Updated weights for policy 0, policy_version 27870 (0.0037) [2024-06-18 01:09:56,994][12645] Fps is (10 sec: 40959.0, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 456753152. Throughput: 0: 41016.3. Samples: 456916160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 01:09:56,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 01:09:57,827][12883] Updated weights for policy 0, policy_version 27880 (0.0050) [2024-06-18 01:10:01,447][12883] Updated weights for policy 0, policy_version 27890 (0.0048) [2024-06-18 01:10:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41232.9, 300 sec: 40932.2). Total num frames: 456966144. Throughput: 0: 41036.7. Samples: 457040240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 01:10:01,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:10:05,574][12883] Updated weights for policy 0, policy_version 27900 (0.0032) [2024-06-18 01:10:06,994][12645] Fps is (10 sec: 44237.9, 60 sec: 41507.8, 300 sec: 41043.3). Total num frames: 457195520. Throughput: 0: 41148.6. Samples: 457288920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 01:10:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:10:09,404][12883] Updated weights for policy 0, policy_version 27910 (0.0040) [2024-06-18 01:10:11,994][12645] Fps is (10 sec: 40958.4, 60 sec: 40959.6, 300 sec: 41098.8). Total num frames: 457375744. Throughput: 0: 41029.1. Samples: 457531780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 01:10:11,995][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:10:13,319][12883] Updated weights for policy 0, policy_version 27920 (0.0045) [2024-06-18 01:10:13,961][12862] Signal inference workers to stop experience collection... 
(6450 times) [2024-06-18 01:10:14,007][12883] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-06-18 01:10:14,012][12862] Signal inference workers to resume experience collection... (6450 times) [2024-06-18 01:10:14,021][12883] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-06-18 01:10:16,994][12645] Fps is (10 sec: 36044.4, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 457555968. Throughput: 0: 40809.7. Samples: 457645680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 01:10:16,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:10:17,646][12883] Updated weights for policy 0, policy_version 27930 (0.0046) [2024-06-18 01:10:21,365][12883] Updated weights for policy 0, policy_version 27940 (0.0037) [2024-06-18 01:10:22,000][12645] Fps is (10 sec: 40936.8, 60 sec: 40955.8, 300 sec: 41042.5). Total num frames: 457785344. Throughput: 0: 40890.8. Samples: 457894820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 01:10:22,000][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:10:25,524][12883] Updated weights for policy 0, policy_version 27950 (0.0041) [2024-06-18 01:10:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40688.5, 300 sec: 41098.8). Total num frames: 457981952. Throughput: 0: 40775.6. Samples: 458144300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 01:10:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:10:29,253][12883] Updated weights for policy 0, policy_version 27960 (0.0036) [2024-06-18 01:10:31,994][12645] Fps is (10 sec: 39345.7, 60 sec: 40959.9, 300 sec: 40987.7). Total num frames: 458178560. Throughput: 0: 40954.5. Samples: 458267340. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 01:10:31,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:10:33,206][12883] Updated weights for policy 0, policy_version 27970 (0.0037) [2024-06-18 01:10:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40686.8, 300 sec: 41098.8). Total num frames: 458391552. Throughput: 0: 41098.2. Samples: 458515480. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-18 01:10:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:10:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027978_458391552.pth... [2024-06-18 01:10:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027377_448544768.pth [2024-06-18 01:10:37,394][12883] Updated weights for policy 0, policy_version 27980 (0.0032) [2024-06-18 01:10:41,120][12883] Updated weights for policy 0, policy_version 27990 (0.0036) [2024-06-18 01:10:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 458604544. Throughput: 0: 40938.7. Samples: 458758400. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-18 01:10:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:10:45,447][12883] Updated weights for policy 0, policy_version 28000 (0.0030) [2024-06-18 01:10:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 458817536. Throughput: 0: 40899.2. Samples: 458880700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 23.0) [2024-06-18 01:10:46,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:10:49,063][12883] Updated weights for policy 0, policy_version 28010 (0.0038) [2024-06-18 01:10:51,994][12645] Fps is (10 sec: 39322.5, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 458997760. Throughput: 0: 40798.7. Samples: 459124860. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 01:10:51,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:10:53,476][12883] Updated weights for policy 0, policy_version 28020 (0.0036) [2024-06-18 01:10:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 459227136. Throughput: 0: 40867.7. Samples: 459370800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 01:10:56,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:10:57,399][12883] Updated weights for policy 0, policy_version 28030 (0.0035) [2024-06-18 01:11:01,751][12883] Updated weights for policy 0, policy_version 28040 (0.0042) [2024-06-18 01:11:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 459423744. Throughput: 0: 41093.3. Samples: 459494880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 01:11:01,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:11:05,395][12883] Updated weights for policy 0, policy_version 28050 (0.0036) [2024-06-18 01:11:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 41154.4). Total num frames: 459620352. Throughput: 0: 40942.5. Samples: 459736980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 01:11:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:11:09,729][12883] Updated weights for policy 0, policy_version 28060 (0.0036) [2024-06-18 01:11:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40960.4, 300 sec: 41044.2). Total num frames: 459833344. Throughput: 0: 40882.2. Samples: 459984000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:11:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:11:13,418][12883] Updated weights for policy 0, policy_version 28070 (0.0032) [2024-06-18 01:11:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 460013568. Throughput: 0: 40752.0. Samples: 460101180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:11:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:11:17,944][12883] Updated weights for policy 0, policy_version 28080 (0.0033) [2024-06-18 01:11:21,521][12883] Updated weights for policy 0, policy_version 28090 (0.0041) [2024-06-18 01:11:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40964.2, 300 sec: 41098.9). Total num frames: 460242944. Throughput: 0: 40721.4. Samples: 460347940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:11:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:11:26,027][12883] Updated weights for policy 0, policy_version 28100 (0.0026) [2024-06-18 01:11:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 40988.1). Total num frames: 460439552. Throughput: 0: 40803.6. Samples: 460594560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:11:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:11:29,495][12883] Updated weights for policy 0, policy_version 28110 (0.0035) [2024-06-18 01:11:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 460652544. Throughput: 0: 40690.3. Samples: 460711760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 01:11:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:11:33,941][12883] Updated weights for policy 0, policy_version 28120 (0.0046) [2024-06-18 01:11:36,996][12645] Fps is (10 sec: 39313.0, 60 sec: 40685.5, 300 sec: 40987.5). Total num frames: 460832768. Throughput: 0: 40678.4. Samples: 460955480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 01:11:36,997][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:11:37,014][12862] Signal inference workers to stop experience collection... (6500 times) [2024-06-18 01:11:37,014][12862] Signal inference workers to resume experience collection... 
(6500 times) [2024-06-18 01:11:37,036][12883] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-06-18 01:11:37,037][12883] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-06-18 01:11:37,848][12883] Updated weights for policy 0, policy_version 28130 (0.0030) [2024-06-18 01:11:41,994][12645] Fps is (10 sec: 36045.2, 60 sec: 40141.0, 300 sec: 40932.3). Total num frames: 461012992. Throughput: 0: 40798.7. Samples: 461206740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 01:11:41,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:11:42,379][12883] Updated weights for policy 0, policy_version 28140 (0.0052) [2024-06-18 01:11:45,685][12883] Updated weights for policy 0, policy_version 28150 (0.0038) [2024-06-18 01:11:46,994][12645] Fps is (10 sec: 42608.1, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 461258752. Throughput: 0: 40669.9. Samples: 461325020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 01:11:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:11:50,248][12883] Updated weights for policy 0, policy_version 28160 (0.0039) [2024-06-18 01:11:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 461455360. Throughput: 0: 40865.8. Samples: 461575940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 01:11:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:11:53,641][12883] Updated weights for policy 0, policy_version 28170 (0.0038) [2024-06-18 01:11:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 40988.1). Total num frames: 461651968. Throughput: 0: 40807.5. Samples: 461820340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 01:11:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:11:58,093][12883] Updated weights for policy 0, policy_version 28180 (0.0037) [2024-06-18 01:12:01,687][12883] Updated weights for policy 0, policy_version 28190 (0.0037) [2024-06-18 01:12:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 461864960. Throughput: 0: 40915.2. Samples: 461942360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 01:12:01,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:12:05,938][12883] Updated weights for policy 0, policy_version 28200 (0.0038) [2024-06-18 01:12:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 462061568. Throughput: 0: 40919.1. Samples: 462189300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 01:12:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:12:09,545][12883] Updated weights for policy 0, policy_version 28210 (0.0035) [2024-06-18 01:12:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 462274560. Throughput: 0: 40803.2. Samples: 462430700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 01:12:11,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:12:14,370][12883] Updated weights for policy 0, policy_version 28220 (0.0039) [2024-06-18 01:12:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 462487552. Throughput: 0: 40948.9. Samples: 462554460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 01:12:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:12:17,530][12883] Updated weights for policy 0, policy_version 28230 (0.0035) [2024-06-18 01:12:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 462667776. Throughput: 0: 40865.5. Samples: 462794340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 01:12:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:12:22,539][12883] Updated weights for policy 0, policy_version 28240 (0.0025) [2024-06-18 01:12:25,714][12883] Updated weights for policy 0, policy_version 28250 (0.0043) [2024-06-18 01:12:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 462880768. Throughput: 0: 40750.9. Samples: 463040540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 01:12:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:12:30,342][12883] Updated weights for policy 0, policy_version 28260 (0.0035) [2024-06-18 01:12:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 463093760. Throughput: 0: 40860.4. Samples: 463163740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 01:12:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:12:34,154][12883] Updated weights for policy 0, policy_version 28270 (0.0042) [2024-06-18 01:12:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40961.5, 300 sec: 40988.1). Total num frames: 463290368. Throughput: 0: 40704.4. Samples: 463407640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 01:12:37,000][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:12:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028277_463290368.pth... [2024-06-18 01:12:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027677_453459968.pth [2024-06-18 01:12:38,359][12883] Updated weights for policy 0, policy_version 28280 (0.0028) [2024-06-18 01:12:41,994][12645] Fps is (10 sec: 39318.5, 60 sec: 41232.5, 300 sec: 40932.1). Total num frames: 463486976. Throughput: 0: 40622.0. Samples: 463648360. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 01:12:41,995][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:12:42,037][12883] Updated weights for policy 0, policy_version 28290 (0.0039) [2024-06-18 01:12:46,175][12883] Updated weights for policy 0, policy_version 28300 (0.0032) [2024-06-18 01:12:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 463683584. Throughput: 0: 40723.0. Samples: 463774900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 01:12:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:12:49,907][12883] Updated weights for policy 0, policy_version 28310 (0.0034) [2024-06-18 01:12:51,994][12645] Fps is (10 sec: 39324.6, 60 sec: 40413.9, 300 sec: 40932.5). Total num frames: 463880192. Throughput: 0: 40712.4. Samples: 464021360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 01:12:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:12:54,160][12883] Updated weights for policy 0, policy_version 28320 (0.0035) [2024-06-18 01:12:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 464125952. Throughput: 0: 40614.6. Samples: 464258360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 01:12:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:12:58,324][12883] Updated weights for policy 0, policy_version 28330 (0.0033) [2024-06-18 01:13:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 464306176. Throughput: 0: 40790.2. Samples: 464390020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 01:13:02,007][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:13:02,557][12883] Updated weights for policy 0, policy_version 28340 (0.0050) [2024-06-18 01:13:06,073][12883] Updated weights for policy 0, policy_version 28350 (0.0035) [2024-06-18 01:13:06,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40687.0, 300 sec: 40932.3). 
Total num frames: 464502784. Throughput: 0: 40639.7. Samples: 464623120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 01:13:06,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:13:10,375][12883] Updated weights for policy 0, policy_version 28360 (0.0046) [2024-06-18 01:13:11,577][12862] Signal inference workers to stop experience collection... (6550 times) [2024-06-18 01:13:11,625][12883] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-06-18 01:13:11,627][12862] Signal inference workers to resume experience collection... (6550 times) [2024-06-18 01:13:11,638][12883] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-06-18 01:13:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 464732160. Throughput: 0: 40824.5. Samples: 464877640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 01:13:11,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:13:13,948][12883] Updated weights for policy 0, policy_version 28370 (0.0039) [2024-06-18 01:13:16,999][12645] Fps is (10 sec: 40939.8, 60 sec: 40410.6, 300 sec: 40931.5). Total num frames: 464912384. Throughput: 0: 40792.5. Samples: 464999600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 01:13:17,004][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:13:18,633][12883] Updated weights for policy 0, policy_version 28380 (0.0032) [2024-06-18 01:13:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 465108992. Throughput: 0: 40717.3. Samples: 465239920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 01:13:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:13:22,460][12883] Updated weights for policy 0, policy_version 28390 (0.0047) [2024-06-18 01:13:26,508][12883] Updated weights for policy 0, policy_version 28400 (0.0043) [2024-06-18 01:13:26,994][12645] Fps is (10 sec: 40979.5, 60 sec: 40687.0, 300 sec: 40876.7). Total num frames: 465321984. Throughput: 0: 40968.6. Samples: 465491920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 01:13:26,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:13:30,181][12883] Updated weights for policy 0, policy_version 28410 (0.0049) [2024-06-18 01:13:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 465534976. Throughput: 0: 40862.7. Samples: 465613720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 01:13:31,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:13:34,277][12883] Updated weights for policy 0, policy_version 28420 (0.0047) [2024-06-18 01:13:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 465731584. Throughput: 0: 40907.1. Samples: 465862180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 01:13:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:13:38,443][12883] Updated weights for policy 0, policy_version 28430 (0.0042) [2024-06-18 01:13:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.4, 300 sec: 40932.5). Total num frames: 465944576. Throughput: 0: 41089.3. Samples: 466107380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 01:13:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:13:42,151][12883] Updated weights for policy 0, policy_version 28440 (0.0037) [2024-06-18 01:13:46,131][12883] Updated weights for policy 0, policy_version 28450 (0.0034) [2024-06-18 01:13:46,996][12645] Fps is (10 sec: 42588.7, 60 sec: 41231.6, 300 sec: 40987.5). 
Total num frames: 466157568. Throughput: 0: 41008.2. Samples: 466235480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 01:13:46,996][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:13:50,000][12883] Updated weights for policy 0, policy_version 28460 (0.0044) [2024-06-18 01:13:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 40876.7). Total num frames: 466354176. Throughput: 0: 41149.2. Samples: 466474840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 01:13:51,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:13:53,908][12883] Updated weights for policy 0, policy_version 28470 (0.0031) [2024-06-18 01:13:56,994][12645] Fps is (10 sec: 40969.1, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 466567168. Throughput: 0: 41106.7. Samples: 466727440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 01:13:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:13:57,560][12883] Updated weights for policy 0, policy_version 28480 (0.0040) [2024-06-18 01:14:01,846][12883] Updated weights for policy 0, policy_version 28490 (0.0043) [2024-06-18 01:14:01,999][12645] Fps is (10 sec: 42575.3, 60 sec: 41229.3, 300 sec: 40931.8). Total num frames: 466780160. Throughput: 0: 41213.6. Samples: 466854240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:14:02,000][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:14:05,315][12883] Updated weights for policy 0, policy_version 28500 (0.0040) [2024-06-18 01:14:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 466993152. Throughput: 0: 41202.1. Samples: 467094020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:14:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:14:09,838][12883] Updated weights for policy 0, policy_version 28510 (0.0036) [2024-06-18 01:14:11,994][12645] Fps is (10 sec: 40982.5, 60 sec: 40960.0, 300 sec: 40987.8). 
Total num frames: 467189760. Throughput: 0: 41305.4. Samples: 467350660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:14:11,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:14:12,995][12883] Updated weights for policy 0, policy_version 28520 (0.0036) [2024-06-18 01:14:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41236.4, 300 sec: 40876.7). Total num frames: 467386368. Throughput: 0: 41234.3. Samples: 467469260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:14:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:14:17,961][12883] Updated weights for policy 0, policy_version 28530 (0.0030) [2024-06-18 01:14:20,359][12862] Signal inference workers to stop experience collection... (6600 times) [2024-06-18 01:14:20,359][12862] Signal inference workers to resume experience collection... (6600 times) [2024-06-18 01:14:20,394][12883] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-06-18 01:14:20,394][12883] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-06-18 01:14:20,913][12883] Updated weights for policy 0, policy_version 28540 (0.0027) [2024-06-18 01:14:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 40932.5). Total num frames: 467615744. Throughput: 0: 41163.6. Samples: 467714540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-18 01:14:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:14:26,426][12883] Updated weights for policy 0, policy_version 28550 (0.0047) [2024-06-18 01:14:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 467779584. Throughput: 0: 41357.0. Samples: 467968440. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-18 01:14:26,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:14:28,976][12883] Updated weights for policy 0, policy_version 28560 (0.0031) [2024-06-18 01:14:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 41233.1, 300 sec: 40876.7). Total num frames: 468008960. Throughput: 0: 40967.3. Samples: 468078920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-18 01:14:31,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:14:34,335][12883] Updated weights for policy 0, policy_version 28570 (0.0041) [2024-06-18 01:14:36,803][12883] Updated weights for policy 0, policy_version 28580 (0.0031) [2024-06-18 01:14:36,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42052.3, 300 sec: 41098.9). Total num frames: 468254720. Throughput: 0: 41320.1. Samples: 468334240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 23.0) [2024-06-18 01:14:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:14:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028580_468254720.pth... [2024-06-18 01:14:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000027978_458391552.pth [2024-06-18 01:14:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40687.0, 300 sec: 40821.1). Total num frames: 468385792. Throughput: 0: 41413.3. Samples: 468591040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-18 01:14:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:14:42,623][12883] Updated weights for policy 0, policy_version 28590 (0.0030) [2024-06-18 01:14:45,164][12883] Updated weights for policy 0, policy_version 28600 (0.0042) [2024-06-18 01:14:46,994][12645] Fps is (10 sec: 37682.6, 60 sec: 41234.5, 300 sec: 40932.2). Total num frames: 468631552. Throughput: 0: 40913.3. Samples: 468695120. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-18 01:14:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:14:50,324][12883] Updated weights for policy 0, policy_version 28610 (0.0046) [2024-06-18 01:14:51,994][12645] Fps is (10 sec: 45875.9, 60 sec: 41506.3, 300 sec: 40987.8). Total num frames: 468844544. Throughput: 0: 41321.1. Samples: 468953460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-18 01:14:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:14:52,809][12883] Updated weights for policy 0, policy_version 28620 (0.0032) [2024-06-18 01:14:56,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 469008384. Throughput: 0: 41199.6. Samples: 469204640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 01:14:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:14:58,157][12883] Updated weights for policy 0, policy_version 28630 (0.0041) [2024-06-18 01:15:00,517][12883] Updated weights for policy 0, policy_version 28640 (0.0037) [2024-06-18 01:15:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41236.9, 300 sec: 40876.7). Total num frames: 469254144. Throughput: 0: 41191.2. Samples: 469322860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 01:15:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:15:06,047][12883] Updated weights for policy 0, policy_version 28650 (0.0041) [2024-06-18 01:15:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 40960.0, 300 sec: 40932.3). Total num frames: 469450752. Throughput: 0: 41367.4. Samples: 469576080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 01:15:06,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 01:15:08,623][12883] Updated weights for policy 0, policy_version 28660 (0.0031) [2024-06-18 01:15:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 469647360. Throughput: 0: 41118.6. Samples: 469818780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 01:15:11,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:15:13,871][12883] Updated weights for policy 0, policy_version 28670 (0.0034) [2024-06-18 01:15:16,607][12883] Updated weights for policy 0, policy_version 28680 (0.0037) [2024-06-18 01:15:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 41044.2). Total num frames: 469893120. Throughput: 0: 41422.2. Samples: 469942920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 01:15:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:15:21,699][12883] Updated weights for policy 0, policy_version 28690 (0.0035) [2024-06-18 01:15:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 470056960. Throughput: 0: 41214.5. Samples: 470188900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 01:15:21,995][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:15:24,945][12883] Updated weights for policy 0, policy_version 28700 (0.0035) [2024-06-18 01:15:26,994][12645] Fps is (10 sec: 36045.6, 60 sec: 41233.1, 300 sec: 40932.3). Total num frames: 470253568. Throughput: 0: 40922.8. Samples: 470432560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 01:15:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:15:29,722][12883] Updated weights for policy 0, policy_version 28710 (0.0034) [2024-06-18 01:15:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41043.3). Total num frames: 470499328. Throughput: 0: 41389.4. Samples: 470557640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 01:15:31,996][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:15:33,234][12883] Updated weights for policy 0, policy_version 28720 (0.0035) [2024-06-18 01:15:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 40413.7, 300 sec: 40932.2). Total num frames: 470679552. Throughput: 0: 41042.9. Samples: 470800400. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-18 01:15:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:15:37,463][12883] Updated weights for policy 0, policy_version 28730 (0.0031) [2024-06-18 01:15:41,313][12883] Updated weights for policy 0, policy_version 28740 (0.0037) [2024-06-18 01:15:41,994][12645] Fps is (10 sec: 37681.8, 60 sec: 41505.9, 300 sec: 40876.6). Total num frames: 470876160. Throughput: 0: 40922.7. Samples: 471046180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-18 01:15:41,995][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:15:43,356][12862] Signal inference workers to stop experience collection... (6650 times) [2024-06-18 01:15:43,396][12883] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-06-18 01:15:43,409][12862] Signal inference workers to resume experience collection... (6650 times) [2024-06-18 01:15:43,413][12883] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-06-18 01:15:45,324][12883] Updated weights for policy 0, policy_version 28750 (0.0044) [2024-06-18 01:15:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.7). Total num frames: 471089152. Throughput: 0: 41138.1. Samples: 471174080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-18 01:15:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:15:49,215][12883] Updated weights for policy 0, policy_version 28760 (0.0035) [2024-06-18 01:15:51,994][12645] Fps is (10 sec: 40961.3, 60 sec: 40686.8, 300 sec: 40876.7). Total num frames: 471285760. Throughput: 0: 40903.2. Samples: 471416720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-18 01:15:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:15:53,345][12883] Updated weights for policy 0, policy_version 28770 (0.0039) [2024-06-18 01:15:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 40987.8). Total num frames: 471515136. Throughput: 0: 40979.1. 
Samples: 471662840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 01:15:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:15:57,679][12883] Updated weights for policy 0, policy_version 28780 (0.0049) [2024-06-18 01:16:01,374][12883] Updated weights for policy 0, policy_version 28790 (0.0040) [2024-06-18 01:16:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 471695360. Throughput: 0: 40950.0. Samples: 471785660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 01:16:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:16:05,554][12883] Updated weights for policy 0, policy_version 28800 (0.0035) [2024-06-18 01:16:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 471908352. Throughput: 0: 40937.8. Samples: 472031100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 01:16:06,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:16:09,531][12883] Updated weights for policy 0, policy_version 28810 (0.0037) [2024-06-18 01:16:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 472121344. Throughput: 0: 40924.4. Samples: 472274160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:16:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:16:13,678][12883] Updated weights for policy 0, policy_version 28820 (0.0029) [2024-06-18 01:16:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 472317952. Throughput: 0: 40932.5. Samples: 472399600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:16:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:16:17,213][12883] Updated weights for policy 0, policy_version 28830 (0.0030) [2024-06-18 01:16:21,630][12883] Updated weights for policy 0, policy_version 28840 (0.0039) [2024-06-18 01:16:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 472530944. Throughput: 0: 41188.6. Samples: 472653880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:16:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:16:25,032][12883] Updated weights for policy 0, policy_version 28850 (0.0045) [2024-06-18 01:16:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 40987.8). Total num frames: 472743936. Throughput: 0: 41158.5. Samples: 472898300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:16:26,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:16:29,416][12883] Updated weights for policy 0, policy_version 28860 (0.0048) [2024-06-18 01:16:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 41043.6). Total num frames: 472940544. Throughput: 0: 41047.7. Samples: 473021220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 01:16:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:16:33,036][12883] Updated weights for policy 0, policy_version 28870 (0.0037) [2024-06-18 01:16:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40960.1, 300 sec: 41098.8). Total num frames: 473137152. Throughput: 0: 40990.7. Samples: 473261300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 01:16:36,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 01:16:37,068][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028879_473153536.pth... 
[2024-06-18 01:16:37,123][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028277_463290368.pth [2024-06-18 01:16:37,401][12883] Updated weights for policy 0, policy_version 28880 (0.0033) [2024-06-18 01:16:41,601][12883] Updated weights for policy 0, policy_version 28890 (0.0035) [2024-06-18 01:16:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.4, 300 sec: 40987.8). Total num frames: 473350144. Throughput: 0: 41039.2. Samples: 473509600. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 01:16:41,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 01:16:45,356][12883] Updated weights for policy 0, policy_version 28900 (0.0034) [2024-06-18 01:16:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 473563136. Throughput: 0: 41064.9. Samples: 473633580. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 01:16:46,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:16:49,406][12883] Updated weights for policy 0, policy_version 28910 (0.0034) [2024-06-18 01:16:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 473743360. Throughput: 0: 40872.2. Samples: 473870340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:16:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:16:53,643][12883] Updated weights for policy 0, policy_version 28920 (0.0026) [2024-06-18 01:16:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 473956352. Throughput: 0: 40817.6. Samples: 474110960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:16:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:16:57,774][12883] Updated weights for policy 0, policy_version 28930 (0.0034) [2024-06-18 01:17:01,723][12883] Updated weights for policy 0, policy_version 28940 (0.0037) [2024-06-18 01:17:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 41043.3). 
Total num frames: 474169344. Throughput: 0: 40915.1. Samples: 474240780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:17:01,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:17:05,542][12883] Updated weights for policy 0, policy_version 28950 (0.0039) [2024-06-18 01:17:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 474365952. Throughput: 0: 40610.1. Samples: 474481340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:17:06,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:17:09,706][12883] Updated weights for policy 0, policy_version 28960 (0.0043) [2024-06-18 01:17:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 474578944. Throughput: 0: 40645.9. Samples: 474727360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:17:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:17:12,284][12862] Signal inference workers to stop experience collection... (6700 times) [2024-06-18 01:17:12,285][12862] Signal inference workers to resume experience collection... (6700 times) [2024-06-18 01:17:12,314][12883] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-06-18 01:17:12,315][12883] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-06-18 01:17:13,402][12883] Updated weights for policy 0, policy_version 28970 (0.0037) [2024-06-18 01:17:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 474775552. Throughput: 0: 40530.0. Samples: 474845080. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:17:16,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:17:18,076][12883] Updated weights for policy 0, policy_version 28980 (0.0037) [2024-06-18 01:17:21,277][12883] Updated weights for policy 0, policy_version 28990 (0.0042) [2024-06-18 01:17:21,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41232.9, 300 sec: 41098.8). Total num frames: 475004928. Throughput: 0: 40674.1. Samples: 475091640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:17:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:17:26,060][12883] Updated weights for policy 0, policy_version 29000 (0.0037) [2024-06-18 01:17:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 475185152. Throughput: 0: 40745.8. Samples: 475343160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:17:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:17:29,359][12883] Updated weights for policy 0, policy_version 29010 (0.0027) [2024-06-18 01:17:31,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40686.8, 300 sec: 40987.8). Total num frames: 475381760. Throughput: 0: 40586.2. Samples: 475459960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:17:31,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:17:34,215][12883] Updated weights for policy 0, policy_version 29020 (0.0032) [2024-06-18 01:17:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 41043.4). Total num frames: 475594752. Throughput: 0: 40874.2. Samples: 475709680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:17:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:17:37,219][12883] Updated weights for policy 0, policy_version 29030 (0.0030) [2024-06-18 01:17:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 475774976. Throughput: 0: 41015.5. Samples: 475956660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:17:41,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:17:42,316][12883] Updated weights for policy 0, policy_version 29040 (0.0037) [2024-06-18 01:17:45,227][12883] Updated weights for policy 0, policy_version 29050 (0.0033) [2024-06-18 01:17:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40413.8, 300 sec: 41043.3). Total num frames: 475987968. Throughput: 0: 40823.5. Samples: 476077840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:17:46,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:17:50,135][12883] Updated weights for policy 0, policy_version 29060 (0.0044) [2024-06-18 01:17:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 476217344. Throughput: 0: 40992.9. Samples: 476326020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:17:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:17:53,190][12883] Updated weights for policy 0, policy_version 29070 (0.0026) [2024-06-18 01:17:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 476397568. Throughput: 0: 41216.3. Samples: 476582100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:17:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:17:58,036][12883] Updated weights for policy 0, policy_version 29080 (0.0042) [2024-06-18 01:18:01,081][12883] Updated weights for policy 0, policy_version 29090 (0.0029) [2024-06-18 01:18:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 476626944. Throughput: 0: 41249.3. Samples: 476701300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:18:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:18:05,658][12883] Updated weights for policy 0, policy_version 29100 (0.0034) [2024-06-18 01:18:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 40987.8). 
Total num frames: 476823552. Throughput: 0: 41357.4. Samples: 476952720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:18:06,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:18:09,036][12883] Updated weights for policy 0, policy_version 29110 (0.0048) [2024-06-18 01:18:11,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40960.0, 300 sec: 41099.5). Total num frames: 477036544. Throughput: 0: 41200.0. Samples: 477197160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 01:18:11,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:18:14,016][12883] Updated weights for policy 0, policy_version 29120 (0.0032) [2024-06-18 01:18:16,886][12883] Updated weights for policy 0, policy_version 29130 (0.0035) [2024-06-18 01:18:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 477265920. Throughput: 0: 41308.8. Samples: 477318860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 01:18:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:18:21,994][12645] Fps is (10 sec: 37682.4, 60 sec: 40140.8, 300 sec: 40987.8). Total num frames: 477413376. Throughput: 0: 41409.6. Samples: 477573120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 01:18:21,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:18:22,055][12883] Updated weights for policy 0, policy_version 29140 (0.0040) [2024-06-18 01:18:24,740][12883] Updated weights for policy 0, policy_version 29150 (0.0031) [2024-06-18 01:18:26,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 477642752. Throughput: 0: 41051.2. Samples: 477803960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 01:18:26,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:18:27,144][12862] Signal inference workers to stop experience collection... 
(6750 times) [2024-06-18 01:18:27,192][12883] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-06-18 01:18:27,214][12862] Signal inference workers to resume experience collection... (6750 times) [2024-06-18 01:18:27,214][12883] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-06-18 01:18:29,915][12883] Updated weights for policy 0, policy_version 29160 (0.0032) [2024-06-18 01:18:31,996][12645] Fps is (10 sec: 44227.7, 60 sec: 41231.6, 300 sec: 41098.5). Total num frames: 477855744. Throughput: 0: 41262.9. Samples: 477934760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:18:31,996][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:18:33,138][12883] Updated weights for policy 0, policy_version 29170 (0.0034) [2024-06-18 01:18:36,996][12645] Fps is (10 sec: 37674.8, 60 sec: 40412.3, 300 sec: 40931.9). Total num frames: 478019584. Throughput: 0: 41104.2. Samples: 478175800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:18:36,997][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:18:37,063][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029177_478035968.pth... [2024-06-18 01:18:37,127][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028580_468254720.pth [2024-06-18 01:18:37,719][12883] Updated weights for policy 0, policy_version 29180 (0.0048) [2024-06-18 01:18:41,284][12883] Updated weights for policy 0, policy_version 29190 (0.0030) [2024-06-18 01:18:41,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41506.2, 300 sec: 41043.6). Total num frames: 478265344. Throughput: 0: 40785.4. Samples: 478417440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:18:42,000][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:18:45,562][12883] Updated weights for policy 0, policy_version 29200 (0.0028) [2024-06-18 01:18:46,994][12645] Fps is (10 sec: 45885.7, 60 sec: 41506.2, 300 sec: 41098.9). 
Total num frames: 478478336. Throughput: 0: 41088.1. Samples: 478550260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:18:46,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:18:49,451][12883] Updated weights for policy 0, policy_version 29210 (0.0026) [2024-06-18 01:18:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 478674944. Throughput: 0: 40865.8. Samples: 478791680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:18:51,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:18:53,618][12883] Updated weights for policy 0, policy_version 29220 (0.0043) [2024-06-18 01:18:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 40988.5). Total num frames: 478871552. Throughput: 0: 40935.1. Samples: 479039240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:18:56,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:18:57,213][12883] Updated weights for policy 0, policy_version 29230 (0.0036) [2024-06-18 01:19:01,819][12883] Updated weights for policy 0, policy_version 29240 (0.0029) [2024-06-18 01:19:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 479068160. Throughput: 0: 40968.1. Samples: 479162420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:19:01,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:19:05,386][12883] Updated weights for policy 0, policy_version 29250 (0.0035) [2024-06-18 01:19:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 479297536. Throughput: 0: 40749.8. Samples: 479406860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:19:06,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:19:09,863][12883] Updated weights for policy 0, policy_version 29260 (0.0030) [2024-06-18 01:19:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40959.9, 300 sec: 41043.3). 
Total num frames: 479494144. Throughput: 0: 40976.5. Samples: 479647900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:19:11,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:19:13,187][12883] Updated weights for policy 0, policy_version 29270 (0.0035) [2024-06-18 01:19:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40413.9, 300 sec: 40932.2). Total num frames: 479690752. Throughput: 0: 40863.6. Samples: 479773540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:19:16,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:19:17,805][12883] Updated weights for policy 0, policy_version 29280 (0.0039) [2024-06-18 01:19:21,053][12883] Updated weights for policy 0, policy_version 29290 (0.0030) [2024-06-18 01:19:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41098.8). Total num frames: 479903744. Throughput: 0: 40991.0. Samples: 480020300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:19:21,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:19:25,907][12883] Updated weights for policy 0, policy_version 29300 (0.0030) [2024-06-18 01:19:26,996][12645] Fps is (10 sec: 40951.5, 60 sec: 40958.5, 300 sec: 40987.5). Total num frames: 480100352. Throughput: 0: 40978.0. Samples: 480261540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 01:19:26,996][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:19:29,045][12883] Updated weights for policy 0, policy_version 29310 (0.0041) [2024-06-18 01:19:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40688.4, 300 sec: 40821.1). Total num frames: 480296960. Throughput: 0: 40716.8. Samples: 480382520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 01:19:31,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:19:33,599][12883] Updated weights for policy 0, policy_version 29320 (0.0032) [2024-06-18 01:19:36,994][12645] Fps is (10 sec: 42607.7, 60 sec: 41780.8, 300 sec: 41154.4). 
Total num frames: 480526336. Throughput: 0: 40992.9. Samples: 480636360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 01:19:36,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:19:37,003][12883] Updated weights for policy 0, policy_version 29330 (0.0028) [2024-06-18 01:19:41,599][12883] Updated weights for policy 0, policy_version 29340 (0.0026) [2024-06-18 01:19:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 480722944. Throughput: 0: 40833.2. Samples: 480876740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 01:19:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:19:45,410][12883] Updated weights for policy 0, policy_version 29350 (0.0037) [2024-06-18 01:19:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 480919552. Throughput: 0: 40760.0. Samples: 480996620. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 01:19:46,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:19:49,497][12883] Updated weights for policy 0, policy_version 29360 (0.0036) [2024-06-18 01:19:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 481132544. Throughput: 0: 40801.8. Samples: 481242940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 01:19:51,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:19:53,714][12883] Updated weights for policy 0, policy_version 29370 (0.0048) [2024-06-18 01:19:53,919][12862] Signal inference workers to stop experience collection... (6800 times) [2024-06-18 01:19:53,953][12883] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-06-18 01:19:53,976][12862] Signal inference workers to resume experience collection... 
(6800 times) [2024-06-18 01:19:53,986][12883] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-06-18 01:19:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 481329152. Throughput: 0: 41016.5. Samples: 481493640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 01:19:56,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 01:19:57,264][12883] Updated weights for policy 0, policy_version 29380 (0.0041) [2024-06-18 01:20:01,569][12883] Updated weights for policy 0, policy_version 29390 (0.0037) [2024-06-18 01:20:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 481542144. Throughput: 0: 40834.7. Samples: 481611100. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 01:20:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:20:05,069][12883] Updated weights for policy 0, policy_version 29400 (0.0046) [2024-06-18 01:20:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40687.1, 300 sec: 40987.8). Total num frames: 481738752. Throughput: 0: 40909.3. Samples: 481861220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 01:20:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:20:09,597][12883] Updated weights for policy 0, policy_version 29410 (0.0031) [2024-06-18 01:20:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 40821.2). Total num frames: 481935360. Throughput: 0: 41090.4. Samples: 482110520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 01:20:11,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:20:13,543][12883] Updated weights for policy 0, policy_version 29420 (0.0034) [2024-06-18 01:20:16,995][12645] Fps is (10 sec: 40953.8, 60 sec: 40959.1, 300 sec: 40987.6). Total num frames: 482148352. Throughput: 0: 41104.0. Samples: 482232260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 01:20:16,996][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:20:17,495][12883] Updated weights for policy 0, policy_version 29430 (0.0028) [2024-06-18 01:20:21,455][12883] Updated weights for policy 0, policy_version 29440 (0.0044) [2024-06-18 01:20:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 482361344. Throughput: 0: 40950.2. Samples: 482479120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 01:20:21,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:20:25,287][12883] Updated weights for policy 0, policy_version 29450 (0.0023) [2024-06-18 01:20:26,994][12645] Fps is (10 sec: 40965.8, 60 sec: 40961.5, 300 sec: 40876.7). Total num frames: 482557952. Throughput: 0: 41181.3. Samples: 482729900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 01:20:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:20:29,560][12883] Updated weights for policy 0, policy_version 29460 (0.0041) [2024-06-18 01:20:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 482770944. Throughput: 0: 41215.1. Samples: 482851300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 01:20:31,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:20:33,653][12883] Updated weights for policy 0, policy_version 29470 (0.0035) [2024-06-18 01:20:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 41043.4). Total num frames: 482983936. Throughput: 0: 41109.4. Samples: 483092860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 01:20:36,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:20:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029479_482983936.pth... 
[2024-06-18 01:20:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000028879_473153536.pth [2024-06-18 01:20:37,261][12883] Updated weights for policy 0, policy_version 29480 (0.0044) [2024-06-18 01:20:41,328][12883] Updated weights for policy 0, policy_version 29490 (0.0038) [2024-06-18 01:20:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 483164160. Throughput: 0: 41097.7. Samples: 483343040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 01:20:41,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:20:44,963][12883] Updated weights for policy 0, policy_version 29500 (0.0028) [2024-06-18 01:20:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 483393536. Throughput: 0: 41144.1. Samples: 483462580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 01:20:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:20:49,062][12883] Updated weights for policy 0, policy_version 29510 (0.0047) [2024-06-18 01:20:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 483606528. Throughput: 0: 41245.2. Samples: 483717260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 01:20:51,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:20:52,946][12883] Updated weights for policy 0, policy_version 29520 (0.0032) [2024-06-18 01:20:56,960][12883] Updated weights for policy 0, policy_version 29530 (0.0038) [2024-06-18 01:20:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 483819520. Throughput: 0: 41113.8. Samples: 483960640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 01:20:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:21:01,406][12883] Updated weights for policy 0, policy_version 29540 (0.0042) [2024-06-18 01:21:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41043.3). 
Total num frames: 484016128. Throughput: 0: 41010.6. Samples: 484077680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 01:21:01,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:21:05,138][12883] Updated weights for policy 0, policy_version 29550 (0.0026) [2024-06-18 01:21:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 484196352. Throughput: 0: 40895.2. Samples: 484319400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 01:21:06,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:21:09,318][12883] Updated weights for policy 0, policy_version 29560 (0.0036) [2024-06-18 01:21:10,809][12862] Signal inference workers to stop experience collection... (6850 times) [2024-06-18 01:21:10,816][12862] Signal inference workers to resume experience collection... (6850 times) [2024-06-18 01:21:10,840][12883] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-06-18 01:21:10,840][12883] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-06-18 01:21:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 484409344. Throughput: 0: 40736.4. Samples: 484563040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 01:21:11,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:21:13,463][12883] Updated weights for policy 0, policy_version 29570 (0.0044) [2024-06-18 01:21:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40961.1, 300 sec: 40932.2). Total num frames: 484605952. Throughput: 0: 40818.3. Samples: 484688120. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 01:21:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:21:17,150][12883] Updated weights for policy 0, policy_version 29580 (0.0030) [2024-06-18 01:21:21,518][12883] Updated weights for policy 0, policy_version 29590 (0.0043) [2024-06-18 01:21:21,996][12645] Fps is (10 sec: 40951.2, 60 sec: 40958.5, 300 sec: 40931.9). Total num frames: 484818944. Throughput: 0: 41024.2. Samples: 484939040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 01:21:21,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:21:25,505][12883] Updated weights for policy 0, policy_version 29600 (0.0039) [2024-06-18 01:21:26,994][12645] Fps is (10 sec: 44235.7, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 485048320. Throughput: 0: 40671.4. Samples: 485173260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-18 01:21:26,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:21:29,934][12883] Updated weights for policy 0, policy_version 29610 (0.0028) [2024-06-18 01:21:31,994][12645] Fps is (10 sec: 39330.7, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 485212160. Throughput: 0: 40922.3. Samples: 485304080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-18 01:21:31,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:21:33,582][12883] Updated weights for policy 0, policy_version 29620 (0.0035) [2024-06-18 01:21:36,994][12645] Fps is (10 sec: 36045.3, 60 sec: 40413.9, 300 sec: 40876.7). Total num frames: 485408768. Throughput: 0: 40499.6. Samples: 485539740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-18 01:21:36,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:21:37,795][12883] Updated weights for policy 0, policy_version 29630 (0.0027) [2024-06-18 01:21:41,515][12883] Updated weights for policy 0, policy_version 29640 (0.0037) [2024-06-18 01:21:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41233.0, 300 sec: 40932.2). 
Total num frames: 485638144. Throughput: 0: 40586.2. Samples: 485787020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 01:21:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:21:45,601][12883] Updated weights for policy 0, policy_version 29650 (0.0037) [2024-06-18 01:21:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40686.8, 300 sec: 40987.7). Total num frames: 485834752. Throughput: 0: 40812.4. Samples: 485914240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 01:21:46,995][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:21:49,309][12883] Updated weights for policy 0, policy_version 29660 (0.0039) [2024-06-18 01:21:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 486047744. Throughput: 0: 40862.1. Samples: 486158200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 01:21:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:21:53,639][12883] Updated weights for policy 0, policy_version 29670 (0.0032) [2024-06-18 01:21:56,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40140.9, 300 sec: 40876.7). Total num frames: 486227968. Throughput: 0: 40761.1. Samples: 486397280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 01:21:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:21:57,812][12883] Updated weights for policy 0, policy_version 29680 (0.0028) [2024-06-18 01:22:01,365][12883] Updated weights for policy 0, policy_version 29690 (0.0031) [2024-06-18 01:22:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40413.9, 300 sec: 40932.2). Total num frames: 486440960. Throughput: 0: 40651.0. Samples: 486517420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-18 01:22:02,000][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:22:05,769][12883] Updated weights for policy 0, policy_version 29700 (0.0037) [2024-06-18 01:22:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 40932.2). 
Total num frames: 486653952. Throughput: 0: 40714.1. Samples: 486771080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-18 01:22:06,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:22:09,274][12883] Updated weights for policy 0, policy_version 29710 (0.0030) [2024-06-18 01:22:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 486850560. Throughput: 0: 40922.8. Samples: 487014780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-18 01:22:11,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:22:13,877][12883] Updated weights for policy 0, policy_version 29720 (0.0041) [2024-06-18 01:22:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41232.9, 300 sec: 40932.2). Total num frames: 487079936. Throughput: 0: 40865.7. Samples: 487143040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-18 01:22:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:22:17,238][12883] Updated weights for policy 0, policy_version 29730 (0.0024) [2024-06-18 01:22:21,544][12883] Updated weights for policy 0, policy_version 29740 (0.0043) [2024-06-18 01:22:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40961.6, 300 sec: 40987.8). Total num frames: 487276544. Throughput: 0: 41153.4. Samples: 487391640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 01:22:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:22:25,224][12883] Updated weights for policy 0, policy_version 29750 (0.0031) [2024-06-18 01:22:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40687.1, 300 sec: 41043.3). Total num frames: 487489536. Throughput: 0: 40974.0. Samples: 487630840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 01:22:26,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:22:28,511][12862] Signal inference workers to stop experience collection... (6900 times) [2024-06-18 01:22:28,512][12862] Signal inference workers to resume experience collection... 
(6900 times) [2024-06-18 01:22:28,524][12883] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-06-18 01:22:28,524][12883] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-06-18 01:22:29,346][12883] Updated weights for policy 0, policy_version 29760 (0.0031) [2024-06-18 01:22:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 487686144. Throughput: 0: 40965.1. Samples: 487757660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 01:22:31,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:22:33,160][12883] Updated weights for policy 0, policy_version 29770 (0.0035) [2024-06-18 01:22:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 487882752. Throughput: 0: 41116.6. Samples: 488008440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 01:22:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:22:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029779_487899136.pth... [2024-06-18 01:22:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029177_478035968.pth [2024-06-18 01:22:37,235][12883] Updated weights for policy 0, policy_version 29780 (0.0039) [2024-06-18 01:22:41,029][12883] Updated weights for policy 0, policy_version 29790 (0.0029) [2024-06-18 01:22:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 488095744. Throughput: 0: 41344.9. Samples: 488257800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:22:41,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:22:45,111][12883] Updated weights for policy 0, policy_version 29800 (0.0046) [2024-06-18 01:22:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 488308736. Throughput: 0: 41479.1. Samples: 488383980. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:22:46,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:22:49,035][12883] Updated weights for policy 0, policy_version 29810 (0.0038) [2024-06-18 01:22:51,996][12645] Fps is (10 sec: 39312.5, 60 sec: 40685.5, 300 sec: 40987.5). Total num frames: 488488960. Throughput: 0: 41350.3. Samples: 488631940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:22:51,997][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:22:53,122][12883] Updated weights for policy 0, policy_version 29820 (0.0047) [2024-06-18 01:22:56,936][12883] Updated weights for policy 0, policy_version 29830 (0.0042) [2024-06-18 01:22:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 41043.3). Total num frames: 488734720. Throughput: 0: 41479.9. Samples: 488881380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 01:22:56,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:23:00,963][12883] Updated weights for policy 0, policy_version 29840 (0.0051) [2024-06-18 01:23:01,994][12645] Fps is (10 sec: 44247.1, 60 sec: 41506.2, 300 sec: 41043.3). Total num frames: 488931328. Throughput: 0: 41427.3. Samples: 489007260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:23:01,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:23:05,241][12883] Updated weights for policy 0, policy_version 29850 (0.0036) [2024-06-18 01:23:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41232.9, 300 sec: 40987.7). Total num frames: 489127936. Throughput: 0: 41322.9. Samples: 489251180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:23:06,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:23:08,789][12883] Updated weights for policy 0, policy_version 29860 (0.0030) [2024-06-18 01:23:11,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41506.0, 300 sec: 40932.2). Total num frames: 489340928. Throughput: 0: 41429.6. Samples: 489495180. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:23:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:23:13,231][12883] Updated weights for policy 0, policy_version 29870 (0.0045) [2024-06-18 01:23:16,953][12883] Updated weights for policy 0, policy_version 29880 (0.0034) [2024-06-18 01:23:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 489553920. Throughput: 0: 41415.4. Samples: 489621360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:23:16,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:23:21,342][12883] Updated weights for policy 0, policy_version 29890 (0.0044) [2024-06-18 01:23:21,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 489734144. Throughput: 0: 41330.2. Samples: 489868300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 01:23:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:23:24,719][12883] Updated weights for policy 0, policy_version 29900 (0.0039) [2024-06-18 01:23:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 41043.6). Total num frames: 489963520. Throughput: 0: 41139.5. Samples: 490109080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 01:23:26,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:23:29,480][12883] Updated weights for policy 0, policy_version 29910 (0.0034) [2024-06-18 01:23:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41154.7). Total num frames: 490160128. Throughput: 0: 41159.7. Samples: 490236160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 01:23:31,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:23:32,734][12883] Updated weights for policy 0, policy_version 29920 (0.0026) [2024-06-18 01:23:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 490356736. Throughput: 0: 41111.0. Samples: 490481840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 01:23:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:23:37,123][12883] Updated weights for policy 0, policy_version 29930 (0.0034) [2024-06-18 01:23:40,670][12883] Updated weights for policy 0, policy_version 29940 (0.0034) [2024-06-18 01:23:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41506.0, 300 sec: 41043.3). Total num frames: 490586112. Throughput: 0: 41055.9. Samples: 490728900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 01:23:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:23:44,850][12883] Updated weights for policy 0, policy_version 29950 (0.0037) [2024-06-18 01:23:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 490782720. Throughput: 0: 41068.0. Samples: 490855320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 01:23:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:23:48,578][12883] Updated weights for policy 0, policy_version 29960 (0.0040) [2024-06-18 01:23:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41234.6, 300 sec: 40987.8). Total num frames: 490962944. Throughput: 0: 41065.4. Samples: 491099120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 01:23:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:23:53,233][12883] Updated weights for policy 0, policy_version 29970 (0.0045) [2024-06-18 01:23:56,516][12883] Updated weights for policy 0, policy_version 29980 (0.0027) [2024-06-18 01:23:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 491192320. Throughput: 0: 41151.7. Samples: 491347000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 01:23:56,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:24:00,999][12883] Updated weights for policy 0, policy_version 29990 (0.0034) [2024-06-18 01:24:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41232.9, 300 sec: 41043.3). 
Total num frames: 491405312. Throughput: 0: 41275.5. Samples: 491478760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:24:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:24:04,599][12883] Updated weights for policy 0, policy_version 30000 (0.0034) [2024-06-18 01:24:06,996][12645] Fps is (10 sec: 42588.8, 60 sec: 41504.7, 300 sec: 41098.5). Total num frames: 491618304. Throughput: 0: 41187.6. Samples: 491721840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:24:06,997][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:24:08,670][12883] Updated weights for policy 0, policy_version 30010 (0.0043) [2024-06-18 01:24:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 491798528. Throughput: 0: 41467.6. Samples: 491975120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:24:11,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:24:12,485][12883] Updated weights for policy 0, policy_version 30020 (0.0040) [2024-06-18 01:24:14,157][12862] Signal inference workers to stop experience collection... (6950 times) [2024-06-18 01:24:14,185][12883] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-06-18 01:24:14,210][12862] Signal inference workers to resume experience collection... (6950 times) [2024-06-18 01:24:14,211][12883] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-06-18 01:24:16,596][12883] Updated weights for policy 0, policy_version 30030 (0.0036) [2024-06-18 01:24:16,994][12645] Fps is (10 sec: 40969.6, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 492027904. Throughput: 0: 41290.7. Samples: 492094240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:24:16,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:24:20,406][12883] Updated weights for policy 0, policy_version 30040 (0.0030) [2024-06-18 01:24:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 41154.7). Total num frames: 492240896. Throughput: 0: 41325.7. Samples: 492341500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 01:24:21,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:24:24,508][12883] Updated weights for policy 0, policy_version 30050 (0.0039) [2024-06-18 01:24:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 492421120. Throughput: 0: 41289.9. Samples: 492586940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 01:24:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:24:28,368][12883] Updated weights for policy 0, policy_version 30060 (0.0046) [2024-06-18 01:24:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 492650496. Throughput: 0: 41257.7. Samples: 492711920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 01:24:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:24:32,218][12883] Updated weights for policy 0, policy_version 30070 (0.0032) [2024-06-18 01:24:36,143][12883] Updated weights for policy 0, policy_version 30080 (0.0032) [2024-06-18 01:24:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 492830720. Throughput: 0: 41328.1. Samples: 492958880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 01:24:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:24:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030080_492830720.pth... 
[2024-06-18 01:24:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029479_482983936.pth [2024-06-18 01:24:40,479][12883] Updated weights for policy 0, policy_version 30090 (0.0030) [2024-06-18 01:24:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 493043712. Throughput: 0: 41331.9. Samples: 493206940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 01:24:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:24:44,067][12883] Updated weights for policy 0, policy_version 30100 (0.0040) [2024-06-18 01:24:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41233.0, 300 sec: 41098.9). Total num frames: 493256704. Throughput: 0: 41004.0. Samples: 493323940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 01:24:46,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:24:48,507][12883] Updated weights for policy 0, policy_version 30110 (0.0034) [2024-06-18 01:24:51,879][12883] Updated weights for policy 0, policy_version 30120 (0.0037) [2024-06-18 01:24:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41209.9). Total num frames: 493486080. Throughput: 0: 41090.4. Samples: 493570820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 01:24:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:24:56,499][12883] Updated weights for policy 0, policy_version 30130 (0.0039) [2024-06-18 01:24:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41098.9). Total num frames: 493666304. Throughput: 0: 40911.9. Samples: 493816160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 01:24:56,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:25:00,483][12883] Updated weights for policy 0, policy_version 30140 (0.0028) [2024-06-18 01:25:01,994][12645] Fps is (10 sec: 36045.3, 60 sec: 40687.1, 300 sec: 41043.3). Total num frames: 493846528. Throughput: 0: 40986.7. Samples: 493938640. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 01:25:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:25:04,476][12883] Updated weights for policy 0, policy_version 30150 (0.0034) [2024-06-18 01:25:06,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40688.3, 300 sec: 41098.8). Total num frames: 494059520. Throughput: 0: 41121.2. Samples: 494191960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 01:25:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:25:08,589][12883] Updated weights for policy 0, policy_version 30160 (0.0039) [2024-06-18 01:25:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41506.1, 300 sec: 41154.6). Total num frames: 494288896. Throughput: 0: 41101.7. Samples: 494436520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 01:25:11,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:25:12,219][12883] Updated weights for policy 0, policy_version 30170 (0.0030) [2024-06-18 01:25:16,483][12883] Updated weights for policy 0, policy_version 30180 (0.0037) [2024-06-18 01:25:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 40686.8, 300 sec: 41043.3). Total num frames: 494469120. Throughput: 0: 41123.1. Samples: 494562460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 01:25:16,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:25:19,997][12883] Updated weights for policy 0, policy_version 30190 (0.0037) [2024-06-18 01:25:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 494682112. Throughput: 0: 40902.2. Samples: 494799480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 01:25:21,996][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:25:23,485][12862] Signal inference workers to stop experience collection... (7000 times) [2024-06-18 01:25:23,536][12862] Signal inference workers to resume experience collection... 
(7000 times) [2024-06-18 01:25:23,540][12883] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-06-18 01:25:23,551][12883] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-06-18 01:25:24,621][12883] Updated weights for policy 0, policy_version 30200 (0.0028) [2024-06-18 01:25:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 494878720. Throughput: 0: 40966.7. Samples: 495050440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 01:25:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:25:28,137][12883] Updated weights for policy 0, policy_version 30210 (0.0046) [2024-06-18 01:25:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 495091712. Throughput: 0: 41037.9. Samples: 495170640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 01:25:31,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 01:25:32,471][12883] Updated weights for policy 0, policy_version 30220 (0.0035) [2024-06-18 01:25:36,224][12883] Updated weights for policy 0, policy_version 30230 (0.0026) [2024-06-18 01:25:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 495321088. Throughput: 0: 41071.2. Samples: 495419020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 01:25:36,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:25:40,836][12883] Updated weights for policy 0, policy_version 30240 (0.0033) [2024-06-18 01:25:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 495517696. Throughput: 0: 41035.1. Samples: 495662740. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-18 01:25:41,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 01:25:44,385][12883] Updated weights for policy 0, policy_version 30250 (0.0045) [2024-06-18 01:25:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.1, 300 sec: 41043.3). 
Total num frames: 495714304. Throughput: 0: 40964.9. Samples: 495782060. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-18 01:25:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:25:48,828][12883] Updated weights for policy 0, policy_version 30260 (0.0030) [2024-06-18 01:25:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 495910912. Throughput: 0: 40839.2. Samples: 496029720. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-18 01:25:51,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:25:52,793][12883] Updated weights for policy 0, policy_version 30270 (0.0030) [2024-06-18 01:25:56,686][12883] Updated weights for policy 0, policy_version 30280 (0.0035) [2024-06-18 01:25:57,000][12645] Fps is (10 sec: 40934.2, 60 sec: 40955.8, 300 sec: 41042.5). Total num frames: 496123904. Throughput: 0: 41004.6. Samples: 496281980. Policy #0 lag: (min: 0.0, avg: 12.9, max: 24.0) [2024-06-18 01:25:57,000][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:26:00,683][12883] Updated weights for policy 0, policy_version 30290 (0.0033) [2024-06-18 01:26:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 496336896. Throughput: 0: 40879.7. Samples: 496402040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 01:26:01,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:26:04,690][12883] Updated weights for policy 0, policy_version 30300 (0.0043) [2024-06-18 01:26:06,994][12645] Fps is (10 sec: 40985.2, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 496533504. Throughput: 0: 41075.5. Samples: 496647880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 01:26:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:26:08,623][12883] Updated weights for policy 0, policy_version 30310 (0.0035) [2024-06-18 01:26:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 41098.8). 
Total num frames: 496730112. Throughput: 0: 40946.2. Samples: 496893020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 01:26:11,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:26:12,626][12883] Updated weights for policy 0, policy_version 30320 (0.0038) [2024-06-18 01:26:16,713][12883] Updated weights for policy 0, policy_version 30330 (0.0041) [2024-06-18 01:26:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41043.6). Total num frames: 496926720. Throughput: 0: 41104.8. Samples: 497020360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 01:26:16,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:26:20,453][12883] Updated weights for policy 0, policy_version 30340 (0.0030) [2024-06-18 01:26:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 497156096. Throughput: 0: 41108.4. Samples: 497268900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 01:26:21,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:26:24,702][12883] Updated weights for policy 0, policy_version 30350 (0.0029) [2024-06-18 01:26:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 497369088. Throughput: 0: 41178.3. Samples: 497515760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 01:26:26,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:26:28,277][12883] Updated weights for policy 0, policy_version 30360 (0.0048) [2024-06-18 01:26:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 497549312. Throughput: 0: 41425.2. Samples: 497646200. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 01:26:31,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:26:32,478][12883] Updated weights for policy 0, policy_version 30370 (0.0036) [2024-06-18 01:26:35,890][12883] Updated weights for policy 0, policy_version 30380 (0.0033) [2024-06-18 01:26:36,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 497778688. Throughput: 0: 41432.9. Samples: 497894200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 01:26:36,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:26:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030382_497778688.pth... [2024-06-18 01:26:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000029779_487899136.pth [2024-06-18 01:26:40,281][12883] Updated weights for policy 0, policy_version 30390 (0.0038) [2024-06-18 01:26:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 497975296. Throughput: 0: 41287.9. Samples: 498139680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 01:26:41,994][12645] Avg episode reward: [(0, '0.000')] [2024-06-18 01:26:43,901][12883] Updated weights for policy 0, policy_version 30400 (0.0037) [2024-06-18 01:26:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 498188288. Throughput: 0: 41369.7. Samples: 498263680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 01:26:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:26:47,880][12862] Signal inference workers to stop experience collection... (7050 times) [2024-06-18 01:26:47,880][12862] Signal inference workers to resume experience collection... 
(7050 times) [2024-06-18 01:26:47,893][12883] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-06-18 01:26:47,893][12883] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-06-18 01:26:48,018][12883] Updated weights for policy 0, policy_version 30410 (0.0031) [2024-06-18 01:26:51,978][12883] Updated weights for policy 0, policy_version 30420 (0.0033) [2024-06-18 01:26:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 498401280. Throughput: 0: 41546.4. Samples: 498517460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 01:26:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:26:55,747][12883] Updated weights for policy 0, policy_version 30430 (0.0025) [2024-06-18 01:26:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41237.4, 300 sec: 41209.9). Total num frames: 498597888. Throughput: 0: 41598.8. Samples: 498764960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 01:26:56,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:26:59,822][12883] Updated weights for policy 0, policy_version 30440 (0.0040) [2024-06-18 01:27:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 498810880. Throughput: 0: 41489.4. Samples: 498887380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 01:27:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:27:03,654][12883] Updated weights for policy 0, policy_version 30450 (0.0030) [2024-06-18 01:27:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 499023872. Throughput: 0: 41570.7. Samples: 499139580. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 01:27:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:27:07,611][12883] Updated weights for policy 0, policy_version 30460 (0.0034) [2024-06-18 01:27:11,570][12883] Updated weights for policy 0, policy_version 30470 (0.0038) [2024-06-18 01:27:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 499220480. Throughput: 0: 41564.4. Samples: 499386160. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 01:27:12,003][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:27:15,642][12883] Updated weights for policy 0, policy_version 30480 (0.0040) [2024-06-18 01:27:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 499417088. Throughput: 0: 41315.7. Samples: 499505400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 01:27:16,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 01:27:19,871][12883] Updated weights for policy 0, policy_version 30490 (0.0040) [2024-06-18 01:27:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41265.4). Total num frames: 499662848. Throughput: 0: 41232.1. Samples: 499749640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 01:27:21,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:27:23,511][12883] Updated weights for policy 0, policy_version 30500 (0.0042) [2024-06-18 01:27:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 499826688. Throughput: 0: 41352.5. Samples: 500000540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 01:27:26,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:27:27,733][12883] Updated weights for policy 0, policy_version 30510 (0.0037) [2024-06-18 01:27:31,329][12883] Updated weights for policy 0, policy_version 30520 (0.0038) [2024-06-18 01:27:31,994][12645] Fps is (10 sec: 37682.7, 60 sec: 41506.1, 300 sec: 41209.9). 
Total num frames: 500039680. Throughput: 0: 41230.1. Samples: 500119040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 01:27:31,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:27:35,742][12883] Updated weights for policy 0, policy_version 30530 (0.0042) [2024-06-18 01:27:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41506.3, 300 sec: 41265.5). Total num frames: 500269056. Throughput: 0: 41197.8. Samples: 500371360. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-18 01:27:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:27:38,981][12883] Updated weights for policy 0, policy_version 30540 (0.0040) [2024-06-18 01:27:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 500449280. Throughput: 0: 41248.0. Samples: 500621120. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-18 01:27:41,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:27:43,858][12883] Updated weights for policy 0, policy_version 30550 (0.0037) [2024-06-18 01:27:46,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41506.1, 300 sec: 41321.3). Total num frames: 500678656. Throughput: 0: 41187.8. Samples: 500740840. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-18 01:27:46,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:27:47,062][12883] Updated weights for policy 0, policy_version 30560 (0.0035) [2024-06-18 01:27:51,546][12883] Updated weights for policy 0, policy_version 30570 (0.0037) [2024-06-18 01:27:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40959.8, 300 sec: 41098.8). Total num frames: 500858880. Throughput: 0: 41255.8. Samples: 500996100. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-18 01:27:51,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:27:55,165][12883] Updated weights for policy 0, policy_version 30580 (0.0039) [2024-06-18 01:27:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.0, 300 sec: 41154.4). 
Total num frames: 501071872. Throughput: 0: 41316.8. Samples: 501245420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-18 01:27:56,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:27:59,547][12883] Updated weights for policy 0, policy_version 30590 (0.0046) [2024-06-18 01:28:01,072][12862] Signal inference workers to stop experience collection... (7100 times) [2024-06-18 01:28:01,125][12862] Signal inference workers to resume experience collection... (7100 times) [2024-06-18 01:28:01,127][12883] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-06-18 01:28:01,152][12883] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-06-18 01:28:01,994][12645] Fps is (10 sec: 44237.9, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 501301248. Throughput: 0: 41354.3. Samples: 501366340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-18 01:28:01,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:28:03,333][12883] Updated weights for policy 0, policy_version 30600 (0.0042) [2024-06-18 01:28:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 501481472. Throughput: 0: 41484.4. Samples: 501616440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-18 01:28:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:28:07,623][12883] Updated weights for policy 0, policy_version 30610 (0.0036) [2024-06-18 01:28:11,018][12883] Updated weights for policy 0, policy_version 30620 (0.0027) [2024-06-18 01:28:11,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 501678080. Throughput: 0: 41374.7. Samples: 501862400. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 22.0) [2024-06-18 01:28:11,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:28:15,368][12883] Updated weights for policy 0, policy_version 30630 (0.0033) [2024-06-18 01:28:16,997][12645] Fps is (10 sec: 44223.3, 60 sec: 41777.0, 300 sec: 41320.6). Total num frames: 501923840. Throughput: 0: 41462.6. Samples: 501984980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:28:16,997][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:28:19,039][12883] Updated weights for policy 0, policy_version 30640 (0.0043) [2024-06-18 01:28:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 502120448. Throughput: 0: 41403.0. Samples: 502234500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:28:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:28:23,368][12883] Updated weights for policy 0, policy_version 30650 (0.0041) [2024-06-18 01:28:26,819][12883] Updated weights for policy 0, policy_version 30660 (0.0032) [2024-06-18 01:28:26,996][12645] Fps is (10 sec: 40963.4, 60 sec: 41777.6, 300 sec: 41265.1). Total num frames: 502333440. Throughput: 0: 41131.7. Samples: 502472140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:28:26,997][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:28:31,594][12883] Updated weights for policy 0, policy_version 30670 (0.0030) [2024-06-18 01:28:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 502513664. Throughput: 0: 41289.8. Samples: 502598880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:28:31,995][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:28:35,479][12883] Updated weights for policy 0, policy_version 30680 (0.0039) [2024-06-18 01:28:36,994][12645] Fps is (10 sec: 37692.1, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 502710272. Throughput: 0: 41118.9. Samples: 502846440. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:28:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:28:37,047][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030684_502726656.pth... [2024-06-18 01:28:37,114][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030080_492830720.pth [2024-06-18 01:28:39,399][12883] Updated weights for policy 0, policy_version 30690 (0.0033) [2024-06-18 01:28:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 502939648. Throughput: 0: 40947.1. Samples: 503088040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:28:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:28:43,654][12883] Updated weights for policy 0, policy_version 30700 (0.0035) [2024-06-18 01:28:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 503136256. Throughput: 0: 41156.3. Samples: 503218380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:28:46,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:28:47,286][12883] Updated weights for policy 0, policy_version 30710 (0.0045) [2024-06-18 01:28:51,367][12883] Updated weights for policy 0, policy_version 30720 (0.0042) [2024-06-18 01:28:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 503332864. Throughput: 0: 41042.7. Samples: 503463360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:28:51,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:28:55,138][12883] Updated weights for policy 0, policy_version 30730 (0.0032) [2024-06-18 01:28:57,000][12645] Fps is (10 sec: 42572.1, 60 sec: 41501.8, 300 sec: 41209.1). Total num frames: 503562240. Throughput: 0: 40999.6. Samples: 503707640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 01:28:57,001][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:28:59,154][12883] Updated weights for policy 0, policy_version 30740 (0.0035) [2024-06-18 01:29:01,996][12645] Fps is (10 sec: 42590.5, 60 sec: 40958.7, 300 sec: 41154.4). Total num frames: 503758848. Throughput: 0: 41111.8. Samples: 503834960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 01:29:01,996][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:29:03,112][12883] Updated weights for policy 0, policy_version 30750 (0.0031) [2024-06-18 01:29:06,976][12883] Updated weights for policy 0, policy_version 30760 (0.0043) [2024-06-18 01:29:06,994][12645] Fps is (10 sec: 40985.8, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 503971840. Throughput: 0: 40868.1. Samples: 504073560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 01:29:06,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:29:11,189][12883] Updated weights for policy 0, policy_version 30770 (0.0035) [2024-06-18 01:29:11,994][12645] Fps is (10 sec: 40968.1, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 504168448. Throughput: 0: 41182.1. Samples: 504325240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 01:29:11,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:29:14,768][12883] Updated weights for policy 0, policy_version 30780 (0.0030) [2024-06-18 01:29:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40416.0, 300 sec: 41043.3). Total num frames: 504348672. Throughput: 0: 41117.5. Samples: 504449160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 01:29:16,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:29:17,303][12862] Signal inference workers to stop experience collection... (7150 times) [2024-06-18 01:29:17,304][12862] Signal inference workers to resume experience collection... 
(7150 times) [2024-06-18 01:29:17,344][12883] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-06-18 01:29:17,344][12883] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-06-18 01:29:18,747][12883] Updated weights for policy 0, policy_version 30790 (0.0043) [2024-06-18 01:29:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 504578048. Throughput: 0: 41096.7. Samples: 504695800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:29:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:29:22,725][12883] Updated weights for policy 0, policy_version 30800 (0.0037) [2024-06-18 01:29:26,432][12883] Updated weights for policy 0, policy_version 30810 (0.0023) [2024-06-18 01:29:26,994][12645] Fps is (10 sec: 45874.2, 60 sec: 41234.5, 300 sec: 41209.9). Total num frames: 504807424. Throughput: 0: 41211.0. Samples: 504942540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:29:26,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 01:29:31,161][12883] Updated weights for policy 0, policy_version 30820 (0.0041) [2024-06-18 01:29:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 504971264. Throughput: 0: 40980.1. Samples: 505062480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:29:31,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:29:34,438][12883] Updated weights for policy 0, policy_version 30830 (0.0036) [2024-06-18 01:29:36,996][12645] Fps is (10 sec: 40951.2, 60 sec: 41777.6, 300 sec: 41265.2). Total num frames: 505217024. Throughput: 0: 41086.4. Samples: 505312340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:29:36,997][12645] Avg episode reward: [(0, '0.031')] [2024-06-18 01:29:39,334][12883] Updated weights for policy 0, policy_version 30840 (0.0041) [2024-06-18 01:29:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 41209.9). 
Total num frames: 505413632. Throughput: 0: 41262.5. Samples: 505564200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:29:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:29:42,621][12883] Updated weights for policy 0, policy_version 30850 (0.0039) [2024-06-18 01:29:46,974][12883] Updated weights for policy 0, policy_version 30860 (0.0026) [2024-06-18 01:29:46,994][12645] Fps is (10 sec: 39330.6, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 505610240. Throughput: 0: 41250.2. Samples: 505691140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:29:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:29:50,338][12883] Updated weights for policy 0, policy_version 30870 (0.0042) [2024-06-18 01:29:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 505823232. Throughput: 0: 41472.3. Samples: 505939820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:29:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:29:54,720][12883] Updated weights for policy 0, policy_version 30880 (0.0028) [2024-06-18 01:29:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40964.3, 300 sec: 41265.5). Total num frames: 506019840. Throughput: 0: 41534.7. Samples: 506194300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:29:56,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:29:58,211][12883] Updated weights for policy 0, policy_version 30890 (0.0029) [2024-06-18 01:30:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41234.4, 300 sec: 41265.5). Total num frames: 506232832. Throughput: 0: 41379.9. Samples: 506311260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 01:30:01,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:30:02,457][12883] Updated weights for policy 0, policy_version 30900 (0.0041) [2024-06-18 01:30:06,075][12883] Updated weights for policy 0, policy_version 30910 (0.0044) [2024-06-18 01:30:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 506462208. Throughput: 0: 41455.1. Samples: 506561280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 01:30:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:30:10,666][12883] Updated weights for policy 0, policy_version 30920 (0.0031) [2024-06-18 01:30:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 506626048. Throughput: 0: 41658.7. Samples: 506817180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 01:30:11,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:30:13,835][12883] Updated weights for policy 0, policy_version 30930 (0.0026) [2024-06-18 01:30:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41321.0). Total num frames: 506871808. Throughput: 0: 41598.6. Samples: 506934420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 01:30:16,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:30:18,798][12883] Updated weights for policy 0, policy_version 30940 (0.0038) [2024-06-18 01:30:21,789][12883] Updated weights for policy 0, policy_version 30950 (0.0049) [2024-06-18 01:30:21,994][12645] Fps is (10 sec: 45876.2, 60 sec: 41779.3, 300 sec: 41376.6). Total num frames: 507084800. Throughput: 0: 41704.4. Samples: 507188940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 01:30:21,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:30:26,529][12883] Updated weights for policy 0, policy_version 30960 (0.0038) [2024-06-18 01:30:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41265.4). 
Total num frames: 507265024. Throughput: 0: 41613.4. Samples: 507436800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 01:30:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:30:29,847][12883] Updated weights for policy 0, policy_version 30970 (0.0045) [2024-06-18 01:30:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41265.5). Total num frames: 507494400. Throughput: 0: 41534.7. Samples: 507560200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 01:30:31,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:30:34,407][12883] Updated weights for policy 0, policy_version 30980 (0.0043) [2024-06-18 01:30:36,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41234.7, 300 sec: 41265.5). Total num frames: 507691008. Throughput: 0: 41576.6. Samples: 507810760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 01:30:36,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:30:37,104][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030988_507707392.pth... [2024-06-18 01:30:37,173][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030382_497778688.pth [2024-06-18 01:30:37,685][12883] Updated weights for policy 0, policy_version 30990 (0.0043) [2024-06-18 01:30:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.2, 300 sec: 41265.5). Total num frames: 507887616. Throughput: 0: 41431.5. Samples: 508058720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 01:30:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:30:42,034][12883] Updated weights for policy 0, policy_version 31000 (0.0042) [2024-06-18 01:30:45,735][12883] Updated weights for policy 0, policy_version 31010 (0.0025) [2024-06-18 01:30:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 508116992. Throughput: 0: 41517.8. Samples: 508179560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 01:30:46,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:30:49,768][12883] Updated weights for policy 0, policy_version 31020 (0.0044) [2024-06-18 01:30:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.2, 300 sec: 41266.3). Total num frames: 508297216. Throughput: 0: 41460.1. Samples: 508426980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 01:30:51,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:30:53,874][12883] Updated weights for policy 0, policy_version 31030 (0.0031) [2024-06-18 01:30:56,994][12645] Fps is (10 sec: 37682.5, 60 sec: 41232.9, 300 sec: 41209.9). Total num frames: 508493824. Throughput: 0: 41318.6. Samples: 508676520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 01:30:56,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:30:57,966][12883] Updated weights for policy 0, policy_version 31040 (0.0039) [2024-06-18 01:31:01,745][12883] Updated weights for policy 0, policy_version 31050 (0.0029) [2024-06-18 01:31:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 508723200. Throughput: 0: 41413.8. Samples: 508798040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:31:01,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:31:05,841][12883] Updated weights for policy 0, policy_version 31060 (0.0038) [2024-06-18 01:31:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 508936192. Throughput: 0: 41204.7. Samples: 509043160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:31:06,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:31:07,512][12862] Signal inference workers to stop experience collection... 
(7200 times) [2024-06-18 01:31:07,540][12883] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-06-18 01:31:07,578][12862] Signal inference workers to resume experience collection... (7200 times) [2024-06-18 01:31:07,578][12883] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-06-18 01:31:09,719][12883] Updated weights for policy 0, policy_version 31070 (0.0035) [2024-06-18 01:31:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 509132800. Throughput: 0: 41265.3. Samples: 509293740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:31:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:31:13,770][12883] Updated weights for policy 0, policy_version 31080 (0.0040) [2024-06-18 01:31:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 509329408. Throughput: 0: 41133.2. Samples: 509411200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:31:17,000][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:31:17,615][12883] Updated weights for policy 0, policy_version 31090 (0.0041) [2024-06-18 01:31:21,551][12883] Updated weights for policy 0, policy_version 31100 (0.0034) [2024-06-18 01:31:22,000][12645] Fps is (10 sec: 42572.1, 60 sec: 41228.7, 300 sec: 41320.1). Total num frames: 509558784. Throughput: 0: 41223.1. Samples: 509666060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:31:22,000][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:31:25,646][12883] Updated weights for policy 0, policy_version 31110 (0.0045) [2024-06-18 01:31:26,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 509755392. Throughput: 0: 41119.6. Samples: 509909100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:31:26,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:31:29,552][12883] Updated weights for policy 0, policy_version 31120 (0.0031) [2024-06-18 01:31:31,994][12645] Fps is (10 sec: 39346.2, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 509952000. Throughput: 0: 41187.1. Samples: 510032980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:31:31,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:31:33,838][12883] Updated weights for policy 0, policy_version 31130 (0.0034) [2024-06-18 01:31:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 510164992. Throughput: 0: 41147.1. Samples: 510278600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:31:36,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:31:37,439][12883] Updated weights for policy 0, policy_version 31140 (0.0040) [2024-06-18 01:31:41,480][12883] Updated weights for policy 0, policy_version 31150 (0.0040) [2024-06-18 01:31:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 510377984. Throughput: 0: 41069.5. Samples: 510524640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:31:41,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:31:45,801][12883] Updated weights for policy 0, policy_version 31160 (0.0035) [2024-06-18 01:31:46,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 510558208. Throughput: 0: 41190.5. Samples: 510651620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:31:46,995][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:31:49,217][12883] Updated weights for policy 0, policy_version 31170 (0.0030) [2024-06-18 01:31:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 510771200. Throughput: 0: 41259.1. Samples: 510899820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:31:51,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:31:53,507][12883] Updated weights for policy 0, policy_version 31180 (0.0032) [2024-06-18 01:31:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 511000576. Throughput: 0: 41223.1. Samples: 511148780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:31:56,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:31:57,519][12883] Updated weights for policy 0, policy_version 31190 (0.0042) [2024-06-18 01:32:01,231][12883] Updated weights for policy 0, policy_version 31200 (0.0030) [2024-06-18 01:32:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 511197184. Throughput: 0: 41476.0. Samples: 511277620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:32:01,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 01:32:05,310][12883] Updated weights for policy 0, policy_version 31210 (0.0047) [2024-06-18 01:32:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 511410176. Throughput: 0: 41096.4. Samples: 511515140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:32:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:32:09,372][12883] Updated weights for policy 0, policy_version 31220 (0.0036) [2024-06-18 01:32:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 511606784. Throughput: 0: 41248.8. Samples: 511765300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:32:11,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:32:13,044][12883] Updated weights for policy 0, policy_version 31230 (0.0042) [2024-06-18 01:32:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 511803392. Throughput: 0: 41097.4. Samples: 511882360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:32:16,994][12645] Avg episode reward: [(0, '0.031')] [2024-06-18 01:32:17,828][12883] Updated weights for policy 0, policy_version 31240 (0.0052) [2024-06-18 01:32:21,015][12883] Updated weights for policy 0, policy_version 31250 (0.0023) [2024-06-18 01:32:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41510.5, 300 sec: 41432.1). Total num frames: 512049152. Throughput: 0: 41228.5. Samples: 512133880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 01:32:21,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 01:32:25,546][12883] Updated weights for policy 0, policy_version 31260 (0.0031) [2024-06-18 01:32:26,994][12645] Fps is (10 sec: 40958.9, 60 sec: 40959.8, 300 sec: 41265.5). Total num frames: 512212992. Throughput: 0: 41631.4. Samples: 512398060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 01:32:26,995][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:32:28,882][12883] Updated weights for policy 0, policy_version 31270 (0.0030) [2024-06-18 01:32:31,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 512425984. Throughput: 0: 41232.1. Samples: 512507060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 01:32:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:32:33,426][12883] Updated weights for policy 0, policy_version 31280 (0.0036) [2024-06-18 01:32:36,858][12883] Updated weights for policy 0, policy_version 31290 (0.0047) [2024-06-18 01:32:36,994][12645] Fps is (10 sec: 44237.9, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 512655360. Throughput: 0: 41428.5. Samples: 512764100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 01:32:36,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:32:37,049][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031291_512671744.pth... 
[2024-06-18 01:32:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030684_502726656.pth [2024-06-18 01:32:41,213][12883] Updated weights for policy 0, policy_version 31300 (0.0036) [2024-06-18 01:32:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 512835584. Throughput: 0: 41481.3. Samples: 513015440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 01:32:41,995][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:32:44,090][12862] Signal inference workers to stop experience collection... (7250 times) [2024-06-18 01:32:44,090][12862] Signal inference workers to resume experience collection... (7250 times) [2024-06-18 01:32:44,106][12883] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-06-18 01:32:44,107][12883] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-06-18 01:32:44,706][12883] Updated weights for policy 0, policy_version 31310 (0.0044) [2024-06-18 01:32:46,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 513081344. Throughput: 0: 41226.6. Samples: 513132820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 01:32:46,995][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 01:32:49,340][12883] Updated weights for policy 0, policy_version 31320 (0.0033) [2024-06-18 01:32:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 513245184. Throughput: 0: 41517.3. Samples: 513383420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 01:32:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:32:52,821][12883] Updated weights for policy 0, policy_version 31330 (0.0033) [2024-06-18 01:32:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 513458176. Throughput: 0: 41395.0. Samples: 513628080. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 01:32:56,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:32:57,546][12883] Updated weights for policy 0, policy_version 31340 (0.0031) [2024-06-18 01:33:00,736][12883] Updated weights for policy 0, policy_version 31350 (0.0034) [2024-06-18 01:33:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 513687552. Throughput: 0: 41585.2. Samples: 513753700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 01:33:01,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:33:05,633][12883] Updated weights for policy 0, policy_version 31360 (0.0035) [2024-06-18 01:33:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 513867776. Throughput: 0: 41461.2. Samples: 513999640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:33:06,997][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:33:08,615][12883] Updated weights for policy 0, policy_version 31370 (0.0043) [2024-06-18 01:33:11,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40960.0, 300 sec: 41154.8). Total num frames: 514064384. Throughput: 0: 40826.9. Samples: 514235260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:33:11,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:33:13,655][12883] Updated weights for policy 0, policy_version 31380 (0.0036) [2024-06-18 01:33:16,552][12883] Updated weights for policy 0, policy_version 31390 (0.0031) [2024-06-18 01:33:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 41265.5). Total num frames: 514293760. Throughput: 0: 41195.5. Samples: 514360860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:33:16,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:33:21,741][12883] Updated weights for policy 0, policy_version 31400 (0.0029) [2024-06-18 01:33:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.9, 300 sec: 41154.7). 
Total num frames: 514473984. Throughput: 0: 41022.7. Samples: 514610120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:33:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:33:24,536][12883] Updated weights for policy 0, policy_version 31410 (0.0039) [2024-06-18 01:33:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.3, 300 sec: 41321.0). Total num frames: 514703360. Throughput: 0: 40704.5. Samples: 514847140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) [2024-06-18 01:33:26,996][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:33:29,496][12883] Updated weights for policy 0, policy_version 31420 (0.0034) [2024-06-18 01:33:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 514899968. Throughput: 0: 40890.4. Samples: 514972880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) [2024-06-18 01:33:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:33:32,783][12883] Updated weights for policy 0, policy_version 31430 (0.0032) [2024-06-18 01:33:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 515096576. Throughput: 0: 40694.7. Samples: 515214680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) [2024-06-18 01:33:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:33:37,191][12883] Updated weights for policy 0, policy_version 31440 (0.0044) [2024-06-18 01:33:40,848][12883] Updated weights for policy 0, policy_version 31450 (0.0044) [2024-06-18 01:33:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 515309568. Throughput: 0: 40577.4. Samples: 515454060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 23.0) [2024-06-18 01:33:41,998][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:33:45,280][12883] Updated weights for policy 0, policy_version 31460 (0.0030) [2024-06-18 01:33:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40140.9, 300 sec: 41209.9). Total num frames: 515489792. 
Throughput: 0: 40626.7. Samples: 515581900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 01:33:46,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:33:48,675][12883] Updated weights for policy 0, policy_version 31470 (0.0038) [2024-06-18 01:33:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40686.9, 300 sec: 41099.7). Total num frames: 515686400. Throughput: 0: 40564.4. Samples: 515825040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 01:33:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:33:53,582][12883] Updated weights for policy 0, policy_version 31480 (0.0038) [2024-06-18 01:33:56,554][12883] Updated weights for policy 0, policy_version 31490 (0.0033) [2024-06-18 01:33:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41506.2, 300 sec: 41321.3). Total num frames: 515948544. Throughput: 0: 40720.0. Samples: 516067660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 01:33:56,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:34:01,437][12883] Updated weights for policy 0, policy_version 31500 (0.0039) [2024-06-18 01:34:01,995][12645] Fps is (10 sec: 42592.9, 60 sec: 40413.0, 300 sec: 41154.2). Total num frames: 516112384. Throughput: 0: 40909.1. Samples: 516201820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 01:34:01,995][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:34:02,422][12862] Signal inference workers to stop experience collection... (7300 times) [2024-06-18 01:34:02,422][12862] Signal inference workers to resume experience collection... (7300 times) [2024-06-18 01:34:02,452][12883] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-06-18 01:34:02,452][12883] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-06-18 01:34:04,396][12883] Updated weights for policy 0, policy_version 31510 (0.0047) [2024-06-18 01:34:06,994][12645] Fps is (10 sec: 36045.1, 60 sec: 40687.0, 300 sec: 41154.4). 
Total num frames: 516308992. Throughput: 0: 40690.7. Samples: 516441200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 19.0) [2024-06-18 01:34:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:34:09,395][12883] Updated weights for policy 0, policy_version 31520 (0.0040) [2024-06-18 01:34:11,994][12645] Fps is (10 sec: 44243.0, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 516554752. Throughput: 0: 41000.9. Samples: 516692180. Policy #0 lag: (min: 1.0, avg: 9.4, max: 19.0) [2024-06-18 01:34:11,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:34:12,258][12883] Updated weights for policy 0, policy_version 31530 (0.0029) [2024-06-18 01:34:16,996][12645] Fps is (10 sec: 40950.0, 60 sec: 40412.4, 300 sec: 41154.1). Total num frames: 516718592. Throughput: 0: 41090.3. Samples: 516822040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 19.0) [2024-06-18 01:34:17,005][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:34:17,395][12883] Updated weights for policy 0, policy_version 31540 (0.0034) [2024-06-18 01:34:20,431][12883] Updated weights for policy 0, policy_version 31550 (0.0035) [2024-06-18 01:34:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 516947968. Throughput: 0: 41003.9. Samples: 517059860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 19.0) [2024-06-18 01:34:21,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:34:25,197][12883] Updated weights for policy 0, policy_version 31560 (0.0041) [2024-06-18 01:34:26,994][12645] Fps is (10 sec: 44247.2, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 517160960. Throughput: 0: 41380.4. Samples: 517316180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-18 01:34:26,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:34:28,382][12883] Updated weights for policy 0, policy_version 31570 (0.0039) [2024-06-18 01:34:31,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40687.0, 300 sec: 41099.2). Total num frames: 517341184. 
Throughput: 0: 41301.8. Samples: 517440480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-18 01:34:31,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:34:33,529][12883] Updated weights for policy 0, policy_version 31580 (0.0035) [2024-06-18 01:34:36,714][12883] Updated weights for policy 0, policy_version 31590 (0.0034) [2024-06-18 01:34:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 517570560. Throughput: 0: 41086.8. Samples: 517673940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-18 01:34:36,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:34:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031590_517570560.pth... [2024-06-18 01:34:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000030988_507707392.pth [2024-06-18 01:34:41,386][12883] Updated weights for policy 0, policy_version 31600 (0.0028) [2024-06-18 01:34:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 517767168. Throughput: 0: 41384.4. Samples: 517929960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 24.0) [2024-06-18 01:34:41,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:34:44,615][12883] Updated weights for policy 0, policy_version 31610 (0.0030) [2024-06-18 01:34:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 517963776. Throughput: 0: 41071.5. Samples: 518049980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 01:34:46,994][12645] Avg episode reward: [(0, '0.029')] [2024-06-18 01:34:49,271][12883] Updated weights for policy 0, policy_version 31620 (0.0042) [2024-06-18 01:34:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 518176768. Throughput: 0: 41143.9. Samples: 518292680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 01:34:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:34:52,832][12883] Updated weights for policy 0, policy_version 31630 (0.0031) [2024-06-18 01:34:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40413.9, 300 sec: 41154.4). Total num frames: 518373376. Throughput: 0: 41047.9. Samples: 518539340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 01:34:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:34:57,023][12883] Updated weights for policy 0, policy_version 31640 (0.0036) [2024-06-18 01:35:01,017][12883] Updated weights for policy 0, policy_version 31650 (0.0043) [2024-06-18 01:35:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 40961.0, 300 sec: 41043.3). Total num frames: 518569984. Throughput: 0: 40909.4. Samples: 518662860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 01:35:01,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:35:04,748][12883] Updated weights for policy 0, policy_version 31660 (0.0036) [2024-06-18 01:35:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 518766592. Throughput: 0: 41140.1. Samples: 518911160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:35:06,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:35:08,678][12883] Updated weights for policy 0, policy_version 31670 (0.0046) [2024-06-18 01:35:11,994][12645] Fps is (10 sec: 44236.1, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 519012352. Throughput: 0: 40972.9. Samples: 519159960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:35:11,996][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:35:12,897][12883] Updated weights for policy 0, policy_version 31680 (0.0049) [2024-06-18 01:35:16,923][12883] Updated weights for policy 0, policy_version 31690 (0.0030) [2024-06-18 01:35:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41507.8, 300 sec: 41098.8). 
Total num frames: 519208960. Throughput: 0: 41049.3. Samples: 519287700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:35:16,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:35:20,709][12883] Updated weights for policy 0, policy_version 31700 (0.0048) [2024-06-18 01:35:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 519405568. Throughput: 0: 41237.3. Samples: 519529620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:35:21,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:35:24,842][12883] Updated weights for policy 0, policy_version 31710 (0.0032) [2024-06-18 01:35:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 519634944. Throughput: 0: 41064.5. Samples: 519777860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 01:35:26,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 01:35:28,287][12862] Signal inference workers to stop experience collection... (7350 times) [2024-06-18 01:35:28,289][12862] Signal inference workers to resume experience collection... (7350 times) [2024-06-18 01:35:28,328][12883] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-06-18 01:35:28,328][12883] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-06-18 01:35:28,427][12883] Updated weights for policy 0, policy_version 31720 (0.0044) [2024-06-18 01:35:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 519815168. Throughput: 0: 41078.1. Samples: 519898500. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 01:35:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:35:32,765][12883] Updated weights for policy 0, policy_version 31730 (0.0031) [2024-06-18 01:35:36,105][12883] Updated weights for policy 0, policy_version 31740 (0.0029) [2024-06-18 01:35:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 520028160. Throughput: 0: 41224.0. Samples: 520147760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 01:35:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:35:40,566][12883] Updated weights for policy 0, policy_version 31750 (0.0028) [2024-06-18 01:35:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 520224768. Throughput: 0: 41471.7. Samples: 520405560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 01:35:41,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:35:43,796][12883] Updated weights for policy 0, policy_version 31760 (0.0047) [2024-06-18 01:35:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 520437760. Throughput: 0: 41316.3. Samples: 520522100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 01:35:46,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:35:48,536][12883] Updated weights for policy 0, policy_version 31770 (0.0035) [2024-06-18 01:35:51,446][12883] Updated weights for policy 0, policy_version 31780 (0.0029) [2024-06-18 01:35:51,994][12645] Fps is (10 sec: 45874.0, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 520683520. Throughput: 0: 41376.3. Samples: 520773100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 01:35:51,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:35:56,304][12883] Updated weights for policy 0, policy_version 31790 (0.0036) [2024-06-18 01:35:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41154.4). 
Total num frames: 520863744. Throughput: 0: 41380.1. Samples: 521022060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 01:35:56,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:35:59,351][12883] Updated weights for policy 0, policy_version 31800 (0.0026) [2024-06-18 01:36:01,996][12645] Fps is (10 sec: 37675.3, 60 sec: 41504.5, 300 sec: 41098.5). Total num frames: 521060352. Throughput: 0: 41077.5. Samples: 521136280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 01:36:01,996][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:36:04,738][12883] Updated weights for policy 0, policy_version 31810 (0.0030) [2024-06-18 01:36:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 41265.5). Total num frames: 521306112. Throughput: 0: 41417.8. Samples: 521393420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 01:36:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:36:07,383][12883] Updated weights for policy 0, policy_version 31820 (0.0038) [2024-06-18 01:36:11,994][12645] Fps is (10 sec: 39330.1, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 521453568. Throughput: 0: 41450.6. Samples: 521643140. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-06-18 01:36:11,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:36:12,825][12883] Updated weights for policy 0, policy_version 31830 (0.0042) [2024-06-18 01:36:15,245][12883] Updated weights for policy 0, policy_version 31840 (0.0033) [2024-06-18 01:36:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41155.3). Total num frames: 521699328. Throughput: 0: 41332.5. Samples: 521758460. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-06-18 01:36:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:36:20,562][12883] Updated weights for policy 0, policy_version 31850 (0.0037) [2024-06-18 01:36:21,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42052.2, 300 sec: 41265.4). Total num frames: 521928704. 
Throughput: 0: 41457.7. Samples: 522013360. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-06-18 01:36:21,998][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:36:23,432][12883] Updated weights for policy 0, policy_version 31860 (0.0040) [2024-06-18 01:36:26,994][12645] Fps is (10 sec: 36045.1, 60 sec: 40414.0, 300 sec: 41043.3). Total num frames: 522059776. Throughput: 0: 41243.5. Samples: 522261520. Policy #0 lag: (min: 0.0, avg: 13.0, max: 25.0) [2024-06-18 01:36:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:36:28,611][12883] Updated weights for policy 0, policy_version 31870 (0.0033) [2024-06-18 01:36:31,636][12883] Updated weights for policy 0, policy_version 31880 (0.0041) [2024-06-18 01:36:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 41265.5). Total num frames: 522338304. Throughput: 0: 41212.2. Samples: 522376640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-18 01:36:31,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:36:36,486][12862] Signal inference workers to stop experience collection... (7400 times) [2024-06-18 01:36:36,516][12883] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-06-18 01:36:36,549][12862] Signal inference workers to resume experience collection... (7400 times) [2024-06-18 01:36:36,552][12883] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-06-18 01:36:36,554][12883] Updated weights for policy 0, policy_version 31890 (0.0043) [2024-06-18 01:36:36,994][12645] Fps is (10 sec: 45873.5, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 522518528. Throughput: 0: 41256.3. Samples: 522629640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-18 01:36:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:36:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031893_522534912.pth... 
[2024-06-18 01:36:37,051][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031291_512671744.pth [2024-06-18 01:36:39,518][12883] Updated weights for policy 0, policy_version 31900 (0.0029) [2024-06-18 01:36:41,994][12645] Fps is (10 sec: 32767.8, 60 sec: 40686.8, 300 sec: 41043.3). Total num frames: 522665984. Throughput: 0: 41308.9. Samples: 522880960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-18 01:36:41,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:36:44,171][12883] Updated weights for policy 0, policy_version 31910 (0.0041) [2024-06-18 01:36:46,994][12645] Fps is (10 sec: 40961.4, 60 sec: 41506.3, 300 sec: 41209.9). Total num frames: 522928128. Throughput: 0: 41272.3. Samples: 522993440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 24.0) [2024-06-18 01:36:46,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:36:47,577][12883] Updated weights for policy 0, policy_version 31920 (0.0043) [2024-06-18 01:36:51,836][12883] Updated weights for policy 0, policy_version 31930 (0.0042) [2024-06-18 01:36:51,994][12645] Fps is (10 sec: 47513.1, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 523141120. Throughput: 0: 41270.2. Samples: 523250580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 01:36:51,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:36:55,459][12883] Updated weights for policy 0, policy_version 31940 (0.0037) [2024-06-18 01:36:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41098.9). Total num frames: 523321344. Throughput: 0: 41139.1. Samples: 523494400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 01:36:56,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:36:59,857][12883] Updated weights for policy 0, policy_version 31950 (0.0031) [2024-06-18 01:37:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41507.6, 300 sec: 41154.4). Total num frames: 523550720. Throughput: 0: 41371.0. Samples: 523620160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 01:37:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:37:03,373][12883] Updated weights for policy 0, policy_version 31960 (0.0035) [2024-06-18 01:37:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 523763712. Throughput: 0: 41417.4. Samples: 523877140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 01:37:06,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:37:07,777][12883] Updated weights for policy 0, policy_version 31970 (0.0029) [2024-06-18 01:37:11,053][12883] Updated weights for policy 0, policy_version 31980 (0.0045) [2024-06-18 01:37:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41265.5). Total num frames: 523976704. Throughput: 0: 41174.6. Samples: 524114380. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 01:37:11,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:37:15,719][12883] Updated weights for policy 0, policy_version 31990 (0.0045) [2024-06-18 01:37:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 524173312. Throughput: 0: 41611.9. Samples: 524249180. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 01:37:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:37:18,985][12883] Updated weights for policy 0, policy_version 32000 (0.0042) [2024-06-18 01:37:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 524386304. Throughput: 0: 41609.5. Samples: 524502060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 01:37:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:37:23,439][12883] Updated weights for policy 0, policy_version 32010 (0.0035) [2024-06-18 01:37:26,773][12883] Updated weights for policy 0, policy_version 32020 (0.0034) [2024-06-18 01:37:27,000][12645] Fps is (10 sec: 44209.7, 60 sec: 42594.0, 300 sec: 41320.2). 
Total num frames: 524615680. Throughput: 0: 41565.5. Samples: 524751660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 01:37:27,009][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:37:31,088][12883] Updated weights for policy 0, policy_version 32030 (0.0045) [2024-06-18 01:37:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 524795904. Throughput: 0: 41965.3. Samples: 524881880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:37:31,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:37:35,187][12883] Updated weights for policy 0, policy_version 32040 (0.0034) [2024-06-18 01:37:36,994][12645] Fps is (10 sec: 39346.2, 60 sec: 41506.4, 300 sec: 41265.5). Total num frames: 525008896. Throughput: 0: 41740.2. Samples: 525128880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:37:36,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:37:38,896][12883] Updated weights for policy 0, policy_version 32050 (0.0033) [2024-06-18 01:37:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41098.9). Total num frames: 525205504. Throughput: 0: 41903.5. Samples: 525380060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:37:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:37:42,976][12883] Updated weights for policy 0, policy_version 32060 (0.0031) [2024-06-18 01:37:44,732][12862] Signal inference workers to stop experience collection... (7450 times) [2024-06-18 01:37:44,732][12862] Signal inference workers to resume experience collection... 
(7450 times) [2024-06-18 01:37:44,751][12883] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-06-18 01:37:44,751][12883] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-06-18 01:37:46,656][12883] Updated weights for policy 0, policy_version 32070 (0.0042) [2024-06-18 01:37:46,994][12645] Fps is (10 sec: 42597.4, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 525434880. Throughput: 0: 41852.4. Samples: 525503520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:37:46,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:37:51,028][12883] Updated weights for policy 0, policy_version 32080 (0.0036) [2024-06-18 01:37:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 525615104. Throughput: 0: 41585.7. Samples: 525748500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:37:51,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:37:54,700][12883] Updated weights for policy 0, policy_version 32090 (0.0037) [2024-06-18 01:37:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41209.9). Total num frames: 525844480. Throughput: 0: 41833.7. Samples: 525996900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 01:37:56,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 01:37:59,141][12883] Updated weights for policy 0, policy_version 32100 (0.0050) [2024-06-18 01:38:01,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 526041088. Throughput: 0: 41626.7. Samples: 526122380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 01:38:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:38:02,394][12883] Updated weights for policy 0, policy_version 32110 (0.0039) [2024-06-18 01:38:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 526237696. Throughput: 0: 41525.4. Samples: 526370700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 01:38:06,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:38:07,363][12883] Updated weights for policy 0, policy_version 32120 (0.0041) [2024-06-18 01:38:10,390][12883] Updated weights for policy 0, policy_version 32130 (0.0032) [2024-06-18 01:38:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 526467072. Throughput: 0: 41466.1. Samples: 526617380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 01:38:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:38:15,109][12883] Updated weights for policy 0, policy_version 32140 (0.0029) [2024-06-18 01:38:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 526663680. Throughput: 0: 41532.0. Samples: 526750820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:38:16,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:38:18,045][12883] Updated weights for policy 0, policy_version 32150 (0.0036) [2024-06-18 01:38:21,996][12645] Fps is (10 sec: 39313.1, 60 sec: 41231.6, 300 sec: 41209.6). Total num frames: 526860288. Throughput: 0: 41509.8. Samples: 526996920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:38:21,996][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:38:23,043][12883] Updated weights for policy 0, policy_version 32160 (0.0031) [2024-06-18 01:38:25,875][12883] Updated weights for policy 0, policy_version 32170 (0.0029) [2024-06-18 01:38:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41237.3, 300 sec: 41321.0). Total num frames: 527089664. Throughput: 0: 41253.0. Samples: 527236440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:38:26,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:38:31,129][12883] Updated weights for policy 0, policy_version 32180 (0.0030) [2024-06-18 01:38:31,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41233.0, 300 sec: 41265.5). 
Total num frames: 527269888. Throughput: 0: 41328.1. Samples: 527363280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 01:38:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:38:33,819][12883] Updated weights for policy 0, policy_version 32190 (0.0033) [2024-06-18 01:38:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 527482880. Throughput: 0: 41483.3. Samples: 527615240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:38:36,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:38:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032196_527499264.pth... [2024-06-18 01:38:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031590_517570560.pth [2024-06-18 01:38:38,888][12883] Updated weights for policy 0, policy_version 32200 (0.0046) [2024-06-18 01:38:41,762][12883] Updated weights for policy 0, policy_version 32210 (0.0042) [2024-06-18 01:38:41,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.4, 300 sec: 41487.6). Total num frames: 527728640. Throughput: 0: 41150.3. Samples: 527848660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:38:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:38:46,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40687.0, 300 sec: 41321.0). Total num frames: 527876096. Throughput: 0: 41180.3. Samples: 527975500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:38:46,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:38:47,097][12883] Updated weights for policy 0, policy_version 32220 (0.0052) [2024-06-18 01:38:49,768][12883] Updated weights for policy 0, policy_version 32230 (0.0030) [2024-06-18 01:38:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 528105472. Throughput: 0: 41147.1. Samples: 528222320. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:38:51,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:38:54,876][12883] Updated weights for policy 0, policy_version 32240 (0.0041) [2024-06-18 01:38:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41233.1, 300 sec: 41376.7). Total num frames: 528318464. Throughput: 0: 41272.9. Samples: 528474660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:38:56,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:38:57,983][12883] Updated weights for policy 0, policy_version 32250 (0.0045) [2024-06-18 01:39:01,996][12645] Fps is (10 sec: 39312.6, 60 sec: 40958.4, 300 sec: 41320.7). Total num frames: 528498688. Throughput: 0: 40929.9. Samples: 528592760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:39:01,997][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:39:02,618][12883] Updated weights for policy 0, policy_version 32260 (0.0037) [2024-06-18 01:39:06,083][12883] Updated weights for policy 0, policy_version 32270 (0.0040) [2024-06-18 01:39:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 528728064. Throughput: 0: 40907.4. Samples: 528837660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:39:06,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:39:11,015][12883] Updated weights for policy 0, policy_version 32280 (0.0041) [2024-06-18 01:39:11,996][12645] Fps is (10 sec: 44236.8, 60 sec: 41231.5, 300 sec: 41432.1). Total num frames: 528941056. Throughput: 0: 41102.4. Samples: 529086140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:39:11,997][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:39:14,099][12883] Updated weights for policy 0, policy_version 32290 (0.0029) [2024-06-18 01:39:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 529121280. Throughput: 0: 40909.3. Samples: 529204200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:39:16,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:39:18,906][12883] Updated weights for policy 0, policy_version 32300 (0.0043) [2024-06-18 01:39:21,994][12645] Fps is (10 sec: 40969.6, 60 sec: 41507.7, 300 sec: 41321.0). Total num frames: 529350656. Throughput: 0: 40872.0. Samples: 529454480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:39:21,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:39:22,003][12883] Updated weights for policy 0, policy_version 32310 (0.0035) [2024-06-18 01:39:26,386][12862] Signal inference workers to stop experience collection... (7500 times) [2024-06-18 01:39:26,387][12862] Signal inference workers to resume experience collection... (7500 times) [2024-06-18 01:39:26,443][12883] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-06-18 01:39:26,444][12883] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-06-18 01:39:26,786][12883] Updated weights for policy 0, policy_version 32320 (0.0049) [2024-06-18 01:39:26,996][12645] Fps is (10 sec: 40951.1, 60 sec: 40685.4, 300 sec: 41320.7). Total num frames: 529530880. Throughput: 0: 41190.7. Samples: 529702340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:39:26,997][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:39:29,866][12883] Updated weights for policy 0, policy_version 32330 (0.0040) [2024-06-18 01:39:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 529760256. Throughput: 0: 41051.1. Samples: 529822800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:39:31,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:39:34,866][12883] Updated weights for policy 0, policy_version 32340 (0.0037) [2024-06-18 01:39:36,994][12645] Fps is (10 sec: 42607.9, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 529956864. Throughput: 0: 41187.1. 
Samples: 530075740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:39:36,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:39:37,752][12883] Updated weights for policy 0, policy_version 32350 (0.0037) [2024-06-18 01:39:41,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.7, 300 sec: 41265.5). Total num frames: 530137088. Throughput: 0: 41129.7. Samples: 530325500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 01:39:41,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:39:42,506][12883] Updated weights for policy 0, policy_version 32360 (0.0030) [2024-06-18 01:39:45,799][12883] Updated weights for policy 0, policy_version 32370 (0.0047) [2024-06-18 01:39:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 530366464. Throughput: 0: 41156.8. Samples: 530444720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 01:39:46,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:39:50,240][12883] Updated weights for policy 0, policy_version 32380 (0.0037) [2024-06-18 01:39:51,994][12645] Fps is (10 sec: 44237.6, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 530579456. Throughput: 0: 41261.9. Samples: 530694440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 01:39:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:39:54,115][12883] Updated weights for policy 0, policy_version 32390 (0.0035) [2024-06-18 01:39:56,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 530759680. Throughput: 0: 41195.8. Samples: 530939860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 01:39:56,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:39:58,181][12883] Updated weights for policy 0, policy_version 32400 (0.0038) [2024-06-18 01:40:01,827][12883] Updated weights for policy 0, policy_version 32410 (0.0051) [2024-06-18 01:40:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41780.7, 300 sec: 41487.6). Total num frames: 531005440. Throughput: 0: 41544.9. Samples: 531073720. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-18 01:40:01,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:40:05,892][12883] Updated weights for policy 0, policy_version 32420 (0.0039) [2024-06-18 01:40:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 531185664. Throughput: 0: 41398.5. Samples: 531317420. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-18 01:40:06,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 01:40:09,869][12883] Updated weights for policy 0, policy_version 32430 (0.0035) [2024-06-18 01:40:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41234.6, 300 sec: 41376.5). Total num frames: 531415040. Throughput: 0: 41321.2. Samples: 531561700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-18 01:40:11,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:40:13,901][12883] Updated weights for policy 0, policy_version 32440 (0.0027) [2024-06-18 01:40:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 531611648. Throughput: 0: 41419.9. Samples: 531686700. Policy #0 lag: (min: 1.0, avg: 11.4, max: 23.0) [2024-06-18 01:40:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:40:17,866][12883] Updated weights for policy 0, policy_version 32450 (0.0027) [2024-06-18 01:40:21,625][12883] Updated weights for policy 0, policy_version 32460 (0.0044) [2024-06-18 01:40:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41232.9, 300 sec: 41321.0). 
Total num frames: 531824640. Throughput: 0: 41273.7. Samples: 531933060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 01:40:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:40:26,080][12883] Updated weights for policy 0, policy_version 32470 (0.0044) [2024-06-18 01:40:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41507.6, 300 sec: 41376.5). Total num frames: 532021248. Throughput: 0: 41288.0. Samples: 532183460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 01:40:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:40:29,738][12883] Updated weights for policy 0, policy_version 32480 (0.0033) [2024-06-18 01:40:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 532234240. Throughput: 0: 41364.8. Samples: 532306140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 01:40:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:40:34,079][12883] Updated weights for policy 0, policy_version 32490 (0.0032) [2024-06-18 01:40:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 532430848. Throughput: 0: 41312.7. Samples: 532553520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 01:40:36,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:40:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032498_532447232.pth... [2024-06-18 01:40:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000031893_522534912.pth [2024-06-18 01:40:37,639][12883] Updated weights for policy 0, policy_version 32500 (0.0032) [2024-06-18 01:40:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41376.6). Total num frames: 532643840. Throughput: 0: 41353.5. Samples: 532800760. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 01:40:41,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:40:42,004][12883] Updated weights for policy 0, policy_version 32510 (0.0043) [2024-06-18 01:40:45,389][12883] Updated weights for policy 0, policy_version 32520 (0.0047) [2024-06-18 01:40:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 532840448. Throughput: 0: 41074.3. Samples: 532922060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:40:46,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 01:40:49,877][12883] Updated weights for policy 0, policy_version 32530 (0.0032) [2024-06-18 01:40:51,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41231.4, 300 sec: 41320.7). Total num frames: 533053440. Throughput: 0: 41244.2. Samples: 533173500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:40:51,996][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:40:53,136][12883] Updated weights for policy 0, policy_version 32540 (0.0042) [2024-06-18 01:40:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41265.8). Total num frames: 533233664. Throughput: 0: 41435.6. Samples: 533426300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:40:56,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:40:57,690][12883] Updated weights for policy 0, policy_version 32550 (0.0039) [2024-06-18 01:40:59,544][12862] Signal inference workers to stop experience collection... (7550 times) [2024-06-18 01:40:59,544][12862] Signal inference workers to resume experience collection... 
(7550 times) [2024-06-18 01:40:59,563][12883] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-06-18 01:40:59,592][12883] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-06-18 01:41:00,977][12883] Updated weights for policy 0, policy_version 32560 (0.0041) [2024-06-18 01:41:01,994][12645] Fps is (10 sec: 42608.1, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 533479424. Throughput: 0: 41349.0. Samples: 533547400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 01:41:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:41:05,777][12883] Updated weights for policy 0, policy_version 32570 (0.0044) [2024-06-18 01:41:06,996][12645] Fps is (10 sec: 44227.0, 60 sec: 41504.6, 300 sec: 41431.8). Total num frames: 533676032. Throughput: 0: 41341.6. Samples: 533793520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:41:06,996][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:41:08,820][12883] Updated weights for policy 0, policy_version 32580 (0.0047) [2024-06-18 01:41:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 533856256. Throughput: 0: 41301.9. Samples: 534042040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:41:11,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:41:14,192][12883] Updated weights for policy 0, policy_version 32590 (0.0036) [2024-06-18 01:41:16,644][12883] Updated weights for policy 0, policy_version 32600 (0.0036) [2024-06-18 01:41:16,994][12645] Fps is (10 sec: 45885.2, 60 sec: 42052.3, 300 sec: 41376.5). Total num frames: 534134784. Throughput: 0: 41265.8. Samples: 534163100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:41:16,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 01:41:21,768][12883] Updated weights for policy 0, policy_version 32610 (0.0027) [2024-06-18 01:41:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 41432.1). 
Total num frames: 534282240. Throughput: 0: 41503.1. Samples: 534421160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 01:41:21,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:41:24,716][12883] Updated weights for policy 0, policy_version 32621 (0.0040) [2024-06-18 01:41:26,994][12645] Fps is (10 sec: 36045.0, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 534495232. Throughput: 0: 41447.0. Samples: 534665880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 01:41:26,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:41:30,218][12883] Updated weights for policy 0, policy_version 32631 (0.0043) [2024-06-18 01:41:31,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42052.3, 300 sec: 41487.7). Total num frames: 534757376. Throughput: 0: 41611.5. Samples: 534794580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 01:41:31,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:41:32,483][12883] Updated weights for policy 0, policy_version 32641 (0.0034) [2024-06-18 01:41:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40960.1, 300 sec: 41432.1). Total num frames: 534888448. Throughput: 0: 41609.3. Samples: 535045820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 01:41:36,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:41:38,095][12883] Updated weights for policy 0, policy_version 32651 (0.0035) [2024-06-18 01:41:41,326][12883] Updated weights for policy 0, policy_version 32661 (0.0033) [2024-06-18 01:41:41,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 535134208. Throughput: 0: 41246.6. Samples: 535282400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 01:41:41,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:41:45,783][12883] Updated weights for policy 0, policy_version 32671 (0.0029) [2024-06-18 01:41:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 41376.6). 
Total num frames: 535347200. Throughput: 0: 41574.7. Samples: 535418260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 01:41:46,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:41:49,067][12883] Updated weights for policy 0, policy_version 32681 (0.0033) [2024-06-18 01:41:51,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40961.6, 300 sec: 41321.0). Total num frames: 535511040. Throughput: 0: 41639.0. Samples: 535667180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 01:41:51,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:41:53,705][12883] Updated weights for policy 0, policy_version 32691 (0.0045) [2024-06-18 01:41:56,864][12883] Updated weights for policy 0, policy_version 32701 (0.0050) [2024-06-18 01:41:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 41432.1). Total num frames: 535773184. Throughput: 0: 41416.8. Samples: 535905800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 01:41:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:42:01,935][12883] Updated weights for policy 0, policy_version 32711 (0.0042) [2024-06-18 01:42:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 535937024. Throughput: 0: 41659.6. Samples: 536037780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 01:42:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:42:04,620][12883] Updated weights for policy 0, policy_version 32721 (0.0039) [2024-06-18 01:42:06,996][12645] Fps is (10 sec: 37674.9, 60 sec: 41233.0, 300 sec: 41265.1). Total num frames: 536150016. Throughput: 0: 41293.5. Samples: 536279460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 01:42:06,996][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:42:09,903][12883] Updated weights for policy 0, policy_version 32731 (0.0037) [2024-06-18 01:42:10,774][12862] Signal inference workers to stop experience collection... 
(7600 times) [2024-06-18 01:42:10,812][12883] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-06-18 01:42:10,830][12862] Signal inference workers to resume experience collection... (7600 times) [2024-06-18 01:42:10,831][12883] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-06-18 01:42:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.2, 300 sec: 41432.1). Total num frames: 536395776. Throughput: 0: 41319.5. Samples: 536525260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 01:42:11,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:42:12,387][12883] Updated weights for policy 0, policy_version 32741 (0.0034) [2024-06-18 01:42:16,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40413.8, 300 sec: 41265.5). Total num frames: 536559616. Throughput: 0: 41287.5. Samples: 536652520. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 01:42:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:42:17,720][12883] Updated weights for policy 0, policy_version 32751 (0.0042) [2024-06-18 01:42:20,841][12883] Updated weights for policy 0, policy_version 32761 (0.0023) [2024-06-18 01:42:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 41266.3). Total num frames: 536788992. Throughput: 0: 40973.7. Samples: 536889640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 01:42:21,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:42:25,753][12883] Updated weights for policy 0, policy_version 32771 (0.0026) [2024-06-18 01:42:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 537001984. Throughput: 0: 41433.8. Samples: 537146920. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 01:42:26,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:42:28,607][12883] Updated weights for policy 0, policy_version 32781 (0.0043) [2024-06-18 01:42:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40140.8, 300 sec: 41209.9). Total num frames: 537165824. Throughput: 0: 40979.1. Samples: 537262320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 01:42:31,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:42:33,499][12883] Updated weights for policy 0, policy_version 32791 (0.0037) [2024-06-18 01:42:36,506][12883] Updated weights for policy 0, policy_version 32801 (0.0034) [2024-06-18 01:42:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 41432.1). Total num frames: 537427968. Throughput: 0: 41013.2. Samples: 537512780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 01:42:36,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:42:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032802_537427968.pth... [2024-06-18 01:42:37,058][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032196_527499264.pth [2024-06-18 01:42:41,346][12883] Updated weights for policy 0, policy_version 32811 (0.0025) [2024-06-18 01:42:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 537608192. Throughput: 0: 41248.9. Samples: 537762000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 01:42:41,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:42:44,740][12883] Updated weights for policy 0, policy_version 32821 (0.0039) [2024-06-18 01:42:46,994][12645] Fps is (10 sec: 36045.4, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 537788416. Throughput: 0: 40948.1. Samples: 537880440. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 01:42:46,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:42:49,104][12883] Updated weights for policy 0, policy_version 32831 (0.0033) [2024-06-18 01:42:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41265.5). Total num frames: 538017792. Throughput: 0: 41103.8. Samples: 538129040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 01:42:51,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:42:52,782][12883] Updated weights for policy 0, policy_version 32841 (0.0040) [2024-06-18 01:42:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 538214400. Throughput: 0: 41220.6. Samples: 538380180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 01:42:56,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:42:57,171][12883] Updated weights for policy 0, policy_version 32851 (0.0027) [2024-06-18 01:43:01,286][12883] Updated weights for policy 0, policy_version 32861 (0.0040) [2024-06-18 01:43:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 538411008. Throughput: 0: 40939.2. Samples: 538494780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 01:43:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:43:04,911][12883] Updated weights for policy 0, policy_version 32871 (0.0035) [2024-06-18 01:43:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41507.7, 300 sec: 41265.5). Total num frames: 538640384. Throughput: 0: 41269.3. Samples: 538746760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 01:43:06,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:43:09,136][12883] Updated weights for policy 0, policy_version 32881 (0.0046) [2024-06-18 01:43:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.9, 300 sec: 41209.9). Total num frames: 538820608. Throughput: 0: 41225.8. Samples: 539002080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 01:43:11,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:43:12,834][12883] Updated weights for policy 0, policy_version 32891 (0.0045) [2024-06-18 01:43:16,872][12883] Updated weights for policy 0, policy_version 32901 (0.0041) [2024-06-18 01:43:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41321.3). Total num frames: 539049984. Throughput: 0: 41208.9. Samples: 539116720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 01:43:16,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:43:21,192][12883] Updated weights for policy 0, policy_version 32911 (0.0039) [2024-06-18 01:43:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 539262976. Throughput: 0: 41125.4. Samples: 539363420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 01:43:21,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:43:25,181][12883] Updated weights for policy 0, policy_version 32921 (0.0030) [2024-06-18 01:43:26,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40686.8, 300 sec: 41265.4). Total num frames: 539443200. Throughput: 0: 41063.4. Samples: 539609860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 01:43:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:43:28,937][12883] Updated weights for policy 0, policy_version 32931 (0.0036) [2024-06-18 01:43:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 539672576. Throughput: 0: 41151.4. Samples: 539732260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 01:43:31,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:43:33,186][12883] Updated weights for policy 0, policy_version 32941 (0.0039) [2024-06-18 01:43:34,267][12862] Signal inference workers to stop experience collection... (7650 times) [2024-06-18 01:43:34,268][12862] Signal inference workers to resume experience collection... 
(7650 times) [2024-06-18 01:43:34,304][12883] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-06-18 01:43:34,305][12883] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-06-18 01:43:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40413.9, 300 sec: 41098.8). Total num frames: 539852800. Throughput: 0: 41175.6. Samples: 539981940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:43:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:43:37,049][12883] Updated weights for policy 0, policy_version 32951 (0.0047) [2024-06-18 01:43:41,037][12883] Updated weights for policy 0, policy_version 32961 (0.0042) [2024-06-18 01:43:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 540065792. Throughput: 0: 41040.4. Samples: 540227000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:43:41,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:43:45,097][12883] Updated weights for policy 0, policy_version 32971 (0.0043) [2024-06-18 01:43:46,994][12645] Fps is (10 sec: 44235.8, 60 sec: 41779.0, 300 sec: 41321.0). Total num frames: 540295168. Throughput: 0: 41192.3. Samples: 540348440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:43:46,995][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:43:48,805][12883] Updated weights for policy 0, policy_version 32981 (0.0046) [2024-06-18 01:43:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 540475392. Throughput: 0: 41196.5. Samples: 540600600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:43:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:43:52,667][12883] Updated weights for policy 0, policy_version 32991 (0.0038) [2024-06-18 01:43:56,770][12883] Updated weights for policy 0, policy_version 33001 (0.0033) [2024-06-18 01:43:56,994][12645] Fps is (10 sec: 39322.3, 60 sec: 41233.0, 300 sec: 41321.3). 
Total num frames: 540688384. Throughput: 0: 40922.6. Samples: 540843600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 01:43:56,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:44:00,935][12883] Updated weights for policy 0, policy_version 33011 (0.0048) [2024-06-18 01:44:01,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 540884992. Throughput: 0: 41055.0. Samples: 540964200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 01:44:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:44:04,729][12883] Updated weights for policy 0, policy_version 33021 (0.0039) [2024-06-18 01:44:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40686.9, 300 sec: 41154.7). Total num frames: 541081600. Throughput: 0: 41108.7. Samples: 541213320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 01:44:06,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:44:08,623][12883] Updated weights for policy 0, policy_version 33031 (0.0034) [2024-06-18 01:44:11,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 541294592. Throughput: 0: 41080.2. Samples: 541458460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 01:44:12,000][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:44:12,719][12883] Updated weights for policy 0, policy_version 33041 (0.0032) [2024-06-18 01:44:16,279][12883] Updated weights for policy 0, policy_version 33051 (0.0032) [2024-06-18 01:44:16,994][12645] Fps is (10 sec: 44235.6, 60 sec: 41232.8, 300 sec: 41265.4). Total num frames: 541523968. Throughput: 0: 41185.5. Samples: 541585620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-18 01:44:16,995][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:44:20,955][12883] Updated weights for policy 0, policy_version 33061 (0.0033) [2024-06-18 01:44:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40413.8, 300 sec: 41210.2). 
Total num frames: 541687808. Throughput: 0: 41012.8. Samples: 541827520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-18 01:44:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:44:24,370][12883] Updated weights for policy 0, policy_version 33071 (0.0023) [2024-06-18 01:44:26,994][12645] Fps is (10 sec: 37684.4, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 541900800. Throughput: 0: 40941.3. Samples: 542069360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-18 01:44:26,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:44:29,028][12883] Updated weights for policy 0, policy_version 33081 (0.0041) [2024-06-18 01:44:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 542130176. Throughput: 0: 40985.2. Samples: 542192760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-18 01:44:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:44:32,169][12883] Updated weights for policy 0, policy_version 33091 (0.0027) [2024-06-18 01:44:36,839][12883] Updated weights for policy 0, policy_version 33101 (0.0034) [2024-06-18 01:44:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 542326784. Throughput: 0: 41102.3. Samples: 542450200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-18 01:44:36,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:44:37,059][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033102_542343168.pth... [2024-06-18 01:44:37,114][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032498_532447232.pth [2024-06-18 01:44:39,973][12883] Updated weights for policy 0, policy_version 33111 (0.0048) [2024-06-18 01:44:41,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 542523392. Throughput: 0: 40918.2. Samples: 542684920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 01:44:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:44:45,127][12883] Updated weights for policy 0, policy_version 33121 (0.0035) [2024-06-18 01:44:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 40960.1, 300 sec: 41265.4). Total num frames: 542752768. Throughput: 0: 40977.4. Samples: 542808180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 01:44:46,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:44:48,594][12883] Updated weights for policy 0, policy_version 33131 (0.0034) [2024-06-18 01:44:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40413.8, 300 sec: 41154.4). Total num frames: 542900224. Throughput: 0: 40888.0. Samples: 543053280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 01:44:51,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:44:53,388][12883] Updated weights for policy 0, policy_version 33141 (0.0039) [2024-06-18 01:44:56,399][12883] Updated weights for policy 0, policy_version 33151 (0.0046) [2024-06-18 01:44:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 543145984. Throughput: 0: 40751.1. Samples: 543292260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 01:44:56,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:45:01,140][12883] Updated weights for policy 0, policy_version 33161 (0.0024) [2024-06-18 01:45:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40687.1, 300 sec: 41154.4). Total num frames: 543326208. Throughput: 0: 40896.4. Samples: 543425940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:45:01,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:45:02,384][12862] Signal inference workers to stop experience collection... (7700 times) [2024-06-18 01:45:02,433][12862] Signal inference workers to resume experience collection... 
(7700 times) [2024-06-18 01:45:02,434][12883] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-06-18 01:45:02,468][12883] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-06-18 01:45:04,086][12883] Updated weights for policy 0, policy_version 33171 (0.0036) [2024-06-18 01:45:06,994][12645] Fps is (10 sec: 36044.5, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 543506432. Throughput: 0: 40976.5. Samples: 543671460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:45:06,995][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:45:08,781][12883] Updated weights for policy 0, policy_version 33181 (0.0035) [2024-06-18 01:45:11,775][12883] Updated weights for policy 0, policy_version 33191 (0.0037) [2024-06-18 01:45:11,994][12645] Fps is (10 sec: 47512.8, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 543801344. Throughput: 0: 40922.2. Samples: 543910860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:45:11,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:45:16,818][12883] Updated weights for policy 0, policy_version 33201 (0.0051) [2024-06-18 01:45:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 40687.1, 300 sec: 41154.4). Total num frames: 543965184. Throughput: 0: 41243.4. Samples: 544048720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 01:45:16,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:45:19,949][12883] Updated weights for policy 0, policy_version 33211 (0.0041) [2024-06-18 01:45:21,994][12645] Fps is (10 sec: 36044.7, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 544161792. Throughput: 0: 40732.2. Samples: 544283160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 01:45:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:45:24,833][12883] Updated weights for policy 0, policy_version 33221 (0.0026) [2024-06-18 01:45:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 41265.5). 
Total num frames: 544407552. Throughput: 0: 41238.4. Samples: 544540640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 01:45:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:45:27,529][12883] Updated weights for policy 0, policy_version 33231 (0.0041) [2024-06-18 01:45:32,000][12645] Fps is (10 sec: 40934.9, 60 sec: 40682.6, 300 sec: 41153.5). Total num frames: 544571392. Throughput: 0: 41292.6. Samples: 544666600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 01:45:32,000][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:45:33,161][12883] Updated weights for policy 0, policy_version 33241 (0.0037) [2024-06-18 01:45:35,862][12883] Updated weights for policy 0, policy_version 33251 (0.0041) [2024-06-18 01:45:36,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41232.9, 300 sec: 41209.9). Total num frames: 544800768. Throughput: 0: 41115.9. Samples: 544903500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 01:45:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:45:40,905][12883] Updated weights for policy 0, policy_version 33261 (0.0044) [2024-06-18 01:45:41,994][12645] Fps is (10 sec: 44263.9, 60 sec: 41506.1, 300 sec: 41265.4). Total num frames: 545013760. Throughput: 0: 41285.6. Samples: 545150120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 01:45:41,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:45:43,725][12883] Updated weights for policy 0, policy_version 33271 (0.0046) [2024-06-18 01:45:46,995][12645] Fps is (10 sec: 39318.6, 60 sec: 40686.4, 300 sec: 41154.6). Total num frames: 545193984. Throughput: 0: 41087.1. Samples: 545274900. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 01:45:46,995][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:45:48,952][12883] Updated weights for policy 0, policy_version 33281 (0.0040) [2024-06-18 01:45:51,735][12883] Updated weights for policy 0, policy_version 33291 (0.0028) [2024-06-18 01:45:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 41376.5). Total num frames: 545439744. Throughput: 0: 41116.1. Samples: 545521680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 01:45:51,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:45:56,847][12883] Updated weights for policy 0, policy_version 33301 (0.0029) [2024-06-18 01:45:56,994][12645] Fps is (10 sec: 42602.1, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 545619968. Throughput: 0: 41526.3. Samples: 545779540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 01:45:56,996][12645] Avg episode reward: [(0, '0.029')] [2024-06-18 01:45:59,051][12862] Signal inference workers to stop experience collection... (7750 times) [2024-06-18 01:45:59,052][12862] Signal inference workers to resume experience collection... (7750 times) [2024-06-18 01:45:59,069][12883] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-06-18 01:45:59,103][12883] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-06-18 01:45:59,386][12883] Updated weights for policy 0, policy_version 33311 (0.0040) [2024-06-18 01:46:01,994][12645] Fps is (10 sec: 36044.5, 60 sec: 41233.0, 300 sec: 41099.2). Total num frames: 545800192. Throughput: 0: 40869.4. Samples: 545887840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 01:46:01,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:46:04,670][12883] Updated weights for policy 0, policy_version 33321 (0.0036) [2024-06-18 01:46:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41321.0). Total num frames: 546045952. Throughput: 0: 41320.0. 
Samples: 546142560. Policy #0 lag: (min: 1.0, avg: 7.9, max: 22.0) [2024-06-18 01:46:06,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:46:07,571][12883] Updated weights for policy 0, policy_version 33331 (0.0032) [2024-06-18 01:46:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40140.9, 300 sec: 40932.2). Total num frames: 546209792. Throughput: 0: 41330.2. Samples: 546400500. Policy #0 lag: (min: 1.0, avg: 7.9, max: 22.0) [2024-06-18 01:46:11,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:46:12,680][12883] Updated weights for policy 0, policy_version 33341 (0.0042) [2024-06-18 01:46:15,825][12883] Updated weights for policy 0, policy_version 33351 (0.0043) [2024-06-18 01:46:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 546455552. Throughput: 0: 41129.2. Samples: 546517160. Policy #0 lag: (min: 1.0, avg: 7.9, max: 22.0) [2024-06-18 01:46:16,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:46:20,519][12883] Updated weights for policy 0, policy_version 33361 (0.0035) [2024-06-18 01:46:21,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 546652160. Throughput: 0: 41455.6. Samples: 546769000. Policy #0 lag: (min: 1.0, avg: 7.9, max: 22.0) [2024-06-18 01:46:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:46:24,041][12883] Updated weights for policy 0, policy_version 33371 (0.0035) [2024-06-18 01:46:26,994][12645] Fps is (10 sec: 39322.4, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 546848768. Throughput: 0: 41592.2. Samples: 547021760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 01:46:26,994][12645] Avg episode reward: [(0, '0.032')] [2024-06-18 01:46:28,518][12883] Updated weights for policy 0, policy_version 33381 (0.0034) [2024-06-18 01:46:31,913][12883] Updated weights for policy 0, policy_version 33391 (0.0030) [2024-06-18 01:46:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41783.4, 300 sec: 41321.0). Total num frames: 547078144. Throughput: 0: 41514.9. Samples: 547143040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 01:46:31,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:46:36,173][12883] Updated weights for policy 0, policy_version 33401 (0.0041) [2024-06-18 01:46:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 547274752. Throughput: 0: 41629.7. Samples: 547395020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 01:46:36,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:46:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033403_547274752.pth... [2024-06-18 01:46:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000032802_537427968.pth [2024-06-18 01:46:40,179][12883] Updated weights for policy 0, policy_version 33411 (0.0044) [2024-06-18 01:46:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 547487744. Throughput: 0: 41344.0. Samples: 547640020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 01:46:41,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:46:43,887][12883] Updated weights for policy 0, policy_version 33421 (0.0034) [2024-06-18 01:46:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.7, 300 sec: 41321.0). Total num frames: 547700736. Throughput: 0: 41711.5. Samples: 547764860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 01:46:46,995][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:46:47,969][12883] Updated weights for policy 0, policy_version 33431 (0.0033) [2024-06-18 01:46:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 547880960. Throughput: 0: 41570.4. Samples: 548013220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:46:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:46:52,166][12883] Updated weights for policy 0, policy_version 33441 (0.0037) [2024-06-18 01:46:55,495][12883] Updated weights for policy 0, policy_version 33451 (0.0045) [2024-06-18 01:46:56,994][12645] Fps is (10 sec: 37683.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 548077568. Throughput: 0: 41353.7. Samples: 548261420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:46:56,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:46:59,864][12883] Updated weights for policy 0, policy_version 33461 (0.0038) [2024-06-18 01:47:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41210.2). Total num frames: 548306944. Throughput: 0: 41389.8. Samples: 548379700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:47:01,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:47:03,920][12883] Updated weights for policy 0, policy_version 33471 (0.0031) [2024-06-18 01:47:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 548503552. Throughput: 0: 41301.3. Samples: 548627560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:47:06,995][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:47:07,774][12883] Updated weights for policy 0, policy_version 33481 (0.0037) [2024-06-18 01:47:11,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 548700160. Throughput: 0: 41040.0. Samples: 548868560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:47:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:47:12,078][12883] Updated weights for policy 0, policy_version 33491 (0.0045) [2024-06-18 01:47:15,788][12883] Updated weights for policy 0, policy_version 33501 (0.0032) [2024-06-18 01:47:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 548929536. Throughput: 0: 41009.5. Samples: 548988460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:47:16,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:47:19,962][12883] Updated weights for policy 0, policy_version 33511 (0.0036) [2024-06-18 01:47:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 549126144. Throughput: 0: 41112.9. Samples: 549245100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:47:21,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:47:23,866][12883] Updated weights for policy 0, policy_version 33521 (0.0031) [2024-06-18 01:47:26,387][12862] Signal inference workers to stop experience collection... (7800 times) [2024-06-18 01:47:26,388][12862] Signal inference workers to resume experience collection... (7800 times) [2024-06-18 01:47:26,413][12883] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-06-18 01:47:26,413][12883] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-06-18 01:47:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 549339136. Throughput: 0: 41069.0. Samples: 549488120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 01:47:26,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:47:27,859][12883] Updated weights for policy 0, policy_version 33531 (0.0034) [2024-06-18 01:47:31,843][12883] Updated weights for policy 0, policy_version 33541 (0.0034) [2024-06-18 01:47:31,996][12645] Fps is (10 sec: 40951.2, 60 sec: 40958.6, 300 sec: 41043.0). Total num frames: 549535744. Throughput: 0: 40977.2. Samples: 549608920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:47:31,996][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:47:35,680][12883] Updated weights for policy 0, policy_version 33551 (0.0035) [2024-06-18 01:47:36,996][12645] Fps is (10 sec: 40950.7, 60 sec: 41231.6, 300 sec: 41154.1). Total num frames: 549748736. Throughput: 0: 41117.9. Samples: 549863620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:47:36,997][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:47:39,533][12883] Updated weights for policy 0, policy_version 33561 (0.0034) [2024-06-18 01:47:41,994][12645] Fps is (10 sec: 40968.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 549945344. Throughput: 0: 41204.4. Samples: 550115620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:47:41,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:47:43,651][12883] Updated weights for policy 0, policy_version 33571 (0.0033) [2024-06-18 01:47:46,994][12645] Fps is (10 sec: 42607.0, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 550174720. Throughput: 0: 41260.8. Samples: 550236440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:47:46,995][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:47:47,142][12883] Updated weights for policy 0, policy_version 33581 (0.0042) [2024-06-18 01:47:51,823][12883] Updated weights for policy 0, policy_version 33591 (0.0030) [2024-06-18 01:47:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.0, 300 sec: 41209.9). 
Total num frames: 550371328. Throughput: 0: 41360.9. Samples: 550488800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 01:47:51,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:47:54,870][12883] Updated weights for policy 0, policy_version 33601 (0.0035) [2024-06-18 01:47:56,994][12645] Fps is (10 sec: 39322.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 550567936. Throughput: 0: 41425.3. Samples: 550732700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 01:47:56,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:47:59,591][12883] Updated weights for policy 0, policy_version 33611 (0.0029) [2024-06-18 01:48:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 550797312. Throughput: 0: 41558.7. Samples: 550858600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 01:48:01,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:48:02,745][12883] Updated weights for policy 0, policy_version 33621 (0.0032) [2024-06-18 01:48:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 550977536. Throughput: 0: 41395.1. Samples: 551107880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 01:48:06,999][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 01:48:07,527][12883] Updated weights for policy 0, policy_version 33631 (0.0035) [2024-06-18 01:48:10,995][12883] Updated weights for policy 0, policy_version 33641 (0.0035) [2024-06-18 01:48:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 551206912. Throughput: 0: 41309.3. Samples: 551347040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 01:48:11,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:48:15,520][12883] Updated weights for policy 0, policy_version 33651 (0.0040) [2024-06-18 01:48:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 41098.8). 
Total num frames: 551387136. Throughput: 0: 41587.8. Samples: 551480280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 01:48:16,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:48:18,713][12883] Updated weights for policy 0, policy_version 33661 (0.0034) [2024-06-18 01:48:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 551600128. Throughput: 0: 41310.9. Samples: 551722520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 01:48:21,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:48:23,468][12883] Updated weights for policy 0, policy_version 33671 (0.0043) [2024-06-18 01:48:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 551829504. Throughput: 0: 41141.4. Samples: 551966980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 01:48:26,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:48:26,998][12883] Updated weights for policy 0, policy_version 33681 (0.0032) [2024-06-18 01:48:31,392][12883] Updated weights for policy 0, policy_version 33691 (0.0029) [2024-06-18 01:48:31,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41234.6, 300 sec: 41209.9). Total num frames: 552009728. Throughput: 0: 41294.9. Samples: 552094700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 01:48:31,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:48:34,871][12883] Updated weights for policy 0, policy_version 33701 (0.0045) [2024-06-18 01:48:36,994][12645] Fps is (10 sec: 37682.5, 60 sec: 40961.4, 300 sec: 41154.4). Total num frames: 552206336. Throughput: 0: 41023.5. Samples: 552334860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 01:48:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:48:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033704_552206336.pth... 
[2024-06-18 01:48:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033102_542343168.pth [2024-06-18 01:48:39,456][12883] Updated weights for policy 0, policy_version 33711 (0.0047) [2024-06-18 01:48:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 552435712. Throughput: 0: 41128.5. Samples: 552583480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 01:48:41,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:48:43,029][12883] Updated weights for policy 0, policy_version 33721 (0.0032) [2024-06-18 01:48:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40960.2, 300 sec: 41209.9). Total num frames: 552632320. Throughput: 0: 41047.5. Samples: 552705740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 01:48:46,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:48:47,196][12883] Updated weights for policy 0, policy_version 33731 (0.0034) [2024-06-18 01:48:51,458][12883] Updated weights for policy 0, policy_version 33741 (0.0040) [2024-06-18 01:48:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 552845312. Throughput: 0: 41005.3. Samples: 552953120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 01:48:51,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:48:55,096][12883] Updated weights for policy 0, policy_version 33751 (0.0039) [2024-06-18 01:48:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 553041920. Throughput: 0: 41158.3. Samples: 553199160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 01:48:56,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:48:59,725][12883] Updated weights for policy 0, policy_version 33761 (0.0031) [2024-06-18 01:49:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 553254912. Throughput: 0: 41101.8. Samples: 553329860. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:49:01,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:49:02,898][12883] Updated weights for policy 0, policy_version 33771 (0.0039) [2024-06-18 01:49:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 553451520. Throughput: 0: 41152.9. Samples: 553574400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:49:06,998][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:49:07,371][12883] Updated weights for policy 0, policy_version 33781 (0.0046) [2024-06-18 01:49:10,320][12862] Signal inference workers to stop experience collection... (7850 times) [2024-06-18 01:49:10,320][12862] Signal inference workers to resume experience collection... (7850 times) [2024-06-18 01:49:10,356][12883] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-06-18 01:49:10,356][12883] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-06-18 01:49:11,117][12883] Updated weights for policy 0, policy_version 33791 (0.0037) [2024-06-18 01:49:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41210.0). Total num frames: 553680896. Throughput: 0: 41098.2. Samples: 553816400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:49:11,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:49:15,260][12883] Updated weights for policy 0, policy_version 33801 (0.0042) [2024-06-18 01:49:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 553861120. Throughput: 0: 40905.2. Samples: 553935440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:49:16,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:49:19,200][12883] Updated weights for policy 0, policy_version 33811 (0.0037) [2024-06-18 01:49:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 554074112. Throughput: 0: 41116.1. 
Samples: 554185080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 01:49:21,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:49:23,586][12883] Updated weights for policy 0, policy_version 33821 (0.0031) [2024-06-18 01:49:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 554287104. Throughput: 0: 41170.6. Samples: 554436160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 01:49:26,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:49:26,999][12883] Updated weights for policy 0, policy_version 33831 (0.0036) [2024-06-18 01:49:31,436][12883] Updated weights for policy 0, policy_version 33841 (0.0035) [2024-06-18 01:49:31,993][12645] Fps is (10 sec: 37683.8, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 554450944. Throughput: 0: 41141.0. Samples: 554557080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 01:49:31,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:49:35,048][12883] Updated weights for policy 0, policy_version 33851 (0.0040) [2024-06-18 01:49:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 554713088. Throughput: 0: 41196.4. Samples: 554806960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 01:49:36,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:49:39,217][12883] Updated weights for policy 0, policy_version 33861 (0.0040) [2024-06-18 01:49:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 554893312. Throughput: 0: 41396.1. Samples: 555061980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 01:49:41,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:49:43,034][12883] Updated weights for policy 0, policy_version 33871 (0.0035) [2024-06-18 01:49:46,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 555089920. Throughput: 0: 41077.0. Samples: 555178320. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 01:49:46,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:49:47,058][12883] Updated weights for policy 0, policy_version 33881 (0.0039) [2024-06-18 01:49:50,759][12883] Updated weights for policy 0, policy_version 33891 (0.0036) [2024-06-18 01:49:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 555319296. Throughput: 0: 41257.9. Samples: 555431000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 01:49:51,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:49:55,064][12883] Updated weights for policy 0, policy_version 33901 (0.0039) [2024-06-18 01:49:56,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 555515904. Throughput: 0: 41449.1. Samples: 555681620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 01:49:56,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:49:58,522][12883] Updated weights for policy 0, policy_version 33911 (0.0027) [2024-06-18 01:50:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 555712512. Throughput: 0: 41431.1. Samples: 555799840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 01:50:01,995][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:50:02,850][12883] Updated weights for policy 0, policy_version 33921 (0.0040) [2024-06-18 01:50:06,242][12883] Updated weights for policy 0, policy_version 33931 (0.0040) [2024-06-18 01:50:06,994][12645] Fps is (10 sec: 44237.8, 60 sec: 41779.3, 300 sec: 41209.9). Total num frames: 555958272. Throughput: 0: 41547.2. Samples: 556054700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:50:07,000][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 01:50:10,705][12883] Updated weights for policy 0, policy_version 33941 (0.0026) [2024-06-18 01:50:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 40960.0, 300 sec: 41265.5). 
Total num frames: 556138496. Throughput: 0: 41404.0. Samples: 556299340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:50:11,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 01:50:14,399][12883] Updated weights for policy 0, policy_version 33951 (0.0047) [2024-06-18 01:50:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.2, 300 sec: 41265.5). Total num frames: 556335104. Throughput: 0: 41536.4. Samples: 556426220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:50:16,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:50:18,348][12883] Updated weights for policy 0, policy_version 33961 (0.0042) [2024-06-18 01:50:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 556548096. Throughput: 0: 41609.4. Samples: 556679380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:50:21,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:50:22,427][12883] Updated weights for policy 0, policy_version 33971 (0.0037) [2024-06-18 01:50:26,158][12883] Updated weights for policy 0, policy_version 33981 (0.0034) [2024-06-18 01:50:26,994][12645] Fps is (10 sec: 45874.6, 60 sec: 41779.1, 300 sec: 41432.9). Total num frames: 556793856. Throughput: 0: 41440.3. Samples: 556926800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:50:26,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 01:50:30,429][12883] Updated weights for policy 0, policy_version 33991 (0.0048) [2024-06-18 01:50:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 556957696. Throughput: 0: 41735.5. Samples: 557056420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:50:31,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:50:33,795][12883] Updated weights for policy 0, policy_version 34001 (0.0044) [2024-06-18 01:50:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 41265.5). 
Total num frames: 557187072. Throughput: 0: 41446.5. Samples: 557296100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:50:36,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:50:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034008_557187072.pth... [2024-06-18 01:50:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033403_547274752.pth [2024-06-18 01:50:38,607][12883] Updated weights for policy 0, policy_version 34011 (0.0036) [2024-06-18 01:50:41,084][12862] Signal inference workers to stop experience collection... (7900 times) [2024-06-18 01:50:41,084][12862] Signal inference workers to resume experience collection... (7900 times) [2024-06-18 01:50:41,111][12883] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-06-18 01:50:41,111][12883] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-06-18 01:50:41,583][12883] Updated weights for policy 0, policy_version 34021 (0.0034) [2024-06-18 01:50:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.1, 300 sec: 41432.2). Total num frames: 557416448. Throughput: 0: 41586.7. Samples: 557553020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:50:41,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:50:46,634][12883] Updated weights for policy 0, policy_version 34031 (0.0043) [2024-06-18 01:50:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 557580288. Throughput: 0: 41865.9. Samples: 557683800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 01:50:46,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:50:49,176][12883] Updated weights for policy 0, policy_version 34041 (0.0038) [2024-06-18 01:50:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 557826048. Throughput: 0: 41567.5. Samples: 557925240. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-18 01:50:51,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:50:54,613][12883] Updated weights for policy 0, policy_version 34051 (0.0042) [2024-06-18 01:50:56,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42052.4, 300 sec: 41487.6). Total num frames: 558039040. Throughput: 0: 41653.7. Samples: 558173760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-18 01:50:56,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:50:57,013][12883] Updated weights for policy 0, policy_version 34061 (0.0051) [2024-06-18 01:51:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41506.3, 300 sec: 41210.0). Total num frames: 558202880. Throughput: 0: 41534.7. Samples: 558295280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-18 01:51:01,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:51:02,384][12883] Updated weights for policy 0, policy_version 34071 (0.0042) [2024-06-18 01:51:04,983][12883] Updated weights for policy 0, policy_version 34081 (0.0040) [2024-06-18 01:51:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 558465024. Throughput: 0: 41275.1. Samples: 558536760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-18 01:51:06,994][12645] Avg episode reward: [(0, '0.029')] [2024-06-18 01:51:10,438][12883] Updated weights for policy 0, policy_version 34091 (0.0041) [2024-06-18 01:51:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 558612480. Throughput: 0: 41679.1. Samples: 558802360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-18 01:51:11,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:51:13,085][12883] Updated weights for policy 0, policy_version 34101 (0.0044) [2024-06-18 01:51:16,994][12645] Fps is (10 sec: 36044.9, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 558825472. Throughput: 0: 41189.8. Samples: 558909960. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-18 01:51:16,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:51:18,146][12883] Updated weights for policy 0, policy_version 34111 (0.0031) [2024-06-18 01:51:20,860][12883] Updated weights for policy 0, policy_version 34121 (0.0045) [2024-06-18 01:51:21,994][12645] Fps is (10 sec: 47514.2, 60 sec: 42325.4, 300 sec: 41487.6). Total num frames: 559087616. Throughput: 0: 41523.3. Samples: 559164640. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-18 01:51:21,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:51:25,991][12883] Updated weights for policy 0, policy_version 34131 (0.0036) [2024-06-18 01:51:27,000][12645] Fps is (10 sec: 39297.0, 60 sec: 40409.7, 300 sec: 41153.5). Total num frames: 559218688. Throughput: 0: 41592.6. Samples: 559424940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-18 01:51:27,000][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:51:29,105][12883] Updated weights for policy 0, policy_version 34141 (0.0040) [2024-06-18 01:51:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 559464448. Throughput: 0: 41204.0. Samples: 559537980. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-18 01:51:31,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:51:33,910][12883] Updated weights for policy 0, policy_version 34151 (0.0033) [2024-06-18 01:51:36,994][12645] Fps is (10 sec: 45903.9, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 559677440. Throughput: 0: 41358.6. Samples: 559786380. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 01:51:36,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 01:51:37,410][12883] Updated weights for policy 0, policy_version 34161 (0.0030) [2024-06-18 01:51:41,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40414.0, 300 sec: 41154.4). Total num frames: 559841280. Throughput: 0: 41347.6. Samples: 560034400. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 01:51:41,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:51:42,073][12883] Updated weights for policy 0, policy_version 34171 (0.0034) [2024-06-18 01:51:44,157][12862] Signal inference workers to stop experience collection... (7950 times) [2024-06-18 01:51:44,157][12862] Signal inference workers to resume experience collection... (7950 times) [2024-06-18 01:51:44,188][12883] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-06-18 01:51:44,188][12883] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-06-18 01:51:45,245][12883] Updated weights for policy 0, policy_version 34181 (0.0043) [2024-06-18 01:51:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 560103424. Throughput: 0: 41310.1. Samples: 560154240. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 01:51:46,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:51:49,851][12883] Updated weights for policy 0, policy_version 34191 (0.0038) [2024-06-18 01:51:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 560300032. Throughput: 0: 41700.5. Samples: 560413280. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 01:51:51,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:51:52,915][12883] Updated weights for policy 0, policy_version 34201 (0.0027) [2024-06-18 01:51:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 560480256. Throughput: 0: 41338.8. Samples: 560662600. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 01:51:56,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:51:57,797][12883] Updated weights for policy 0, policy_version 34211 (0.0032) [2024-06-18 01:52:00,598][12883] Updated weights for policy 0, policy_version 34221 (0.0042) [2024-06-18 01:52:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 41432.1). Total num frames: 560726016. Throughput: 0: 41510.6. Samples: 560777940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 01:52:01,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:52:05,594][12883] Updated weights for policy 0, policy_version 34231 (0.0042) [2024-06-18 01:52:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40413.9, 300 sec: 41321.0). Total num frames: 560889856. Throughput: 0: 41514.2. Samples: 561032780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 01:52:06,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:52:08,427][12883] Updated weights for policy 0, policy_version 34241 (0.0035) [2024-06-18 01:52:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 561119232. Throughput: 0: 41211.5. Samples: 561279200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 01:52:11,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:52:13,572][12883] Updated weights for policy 0, policy_version 34251 (0.0047) [2024-06-18 01:52:16,295][12883] Updated weights for policy 0, policy_version 34261 (0.0043) [2024-06-18 01:52:16,996][12645] Fps is (10 sec: 44226.4, 60 sec: 41777.6, 300 sec: 41376.2). Total num frames: 561332224. Throughput: 0: 41421.4. Samples: 561402040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 01:52:16,997][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 01:52:21,591][12883] Updated weights for policy 0, policy_version 34271 (0.0037) [2024-06-18 01:52:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40140.8, 300 sec: 41209.9). 
Total num frames: 561496064. Throughput: 0: 41470.7. Samples: 561652560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 01:52:21,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 01:52:24,257][12883] Updated weights for policy 0, policy_version 34281 (0.0041) [2024-06-18 01:52:27,000][12645] Fps is (10 sec: 40944.0, 60 sec: 42052.3, 300 sec: 41376.0). Total num frames: 561741824. Throughput: 0: 41363.2. Samples: 561896000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 01:52:27,000][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 01:52:29,283][12883] Updated weights for policy 0, policy_version 34291 (0.0034) [2024-06-18 01:52:31,994][12645] Fps is (10 sec: 47513.0, 60 sec: 41779.1, 300 sec: 41432.4). Total num frames: 561971200. Throughput: 0: 41559.5. Samples: 562024420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 01:52:31,995][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:52:32,173][12883] Updated weights for policy 0, policy_version 34301 (0.0049) [2024-06-18 01:52:36,994][12645] Fps is (10 sec: 39346.0, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 562135040. Throughput: 0: 41265.3. Samples: 562270220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 01:52:36,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:52:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034310_562135040.pth... [2024-06-18 01:52:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000033704_552206336.pth [2024-06-18 01:52:37,237][12883] Updated weights for policy 0, policy_version 34311 (0.0030) [2024-06-18 01:52:40,275][12883] Updated weights for policy 0, policy_version 34321 (0.0026) [2024-06-18 01:52:41,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42323.7, 300 sec: 41376.3). Total num frames: 562380800. Throughput: 0: 41188.5. Samples: 562516180. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:52:41,997][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:52:45,153][12883] Updated weights for policy 0, policy_version 34331 (0.0039) [2024-06-18 01:52:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 40959.9, 300 sec: 41321.0). Total num frames: 562561024. Throughput: 0: 41370.2. Samples: 562639600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:52:46,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:52:48,542][12883] Updated weights for policy 0, policy_version 34341 (0.0047) [2024-06-18 01:52:51,994][12645] Fps is (10 sec: 37691.6, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 562757632. Throughput: 0: 41123.9. Samples: 562883360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:52:51,999][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 01:52:53,154][12883] Updated weights for policy 0, policy_version 34351 (0.0042) [2024-06-18 01:52:56,551][12883] Updated weights for policy 0, policy_version 34361 (0.0033) [2024-06-18 01:52:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 41376.5). Total num frames: 563003392. Throughput: 0: 41037.0. Samples: 563125860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:52:56,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:53:00,836][12883] Updated weights for policy 0, policy_version 34371 (0.0045) [2024-06-18 01:53:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40960.1, 300 sec: 41376.6). Total num frames: 563183616. Throughput: 0: 41234.6. Samples: 563257500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 01:53:01,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:53:04,517][12883] Updated weights for policy 0, policy_version 34381 (0.0026) [2024-06-18 01:53:06,994][12645] Fps is (10 sec: 36044.5, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 563363840. Throughput: 0: 41151.0. Samples: 563504360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:53:06,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:53:08,397][12883] Updated weights for policy 0, policy_version 34391 (0.0026) [2024-06-18 01:53:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 563593216. Throughput: 0: 41136.3. Samples: 563746880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:53:11,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:53:12,205][12862] Signal inference workers to stop experience collection... (8000 times) [2024-06-18 01:53:12,211][12862] Signal inference workers to resume experience collection... (8000 times) [2024-06-18 01:53:12,228][12883] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-06-18 01:53:12,229][12883] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-06-18 01:53:12,359][12883] Updated weights for policy 0, policy_version 34401 (0.0045) [2024-06-18 01:53:16,518][12883] Updated weights for policy 0, policy_version 34411 (0.0037) [2024-06-18 01:53:16,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41233.1, 300 sec: 41376.2). Total num frames: 563806208. Throughput: 0: 41151.3. Samples: 563876320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:53:16,997][12645] Avg episode reward: [(0, '0.032')] [2024-06-18 01:53:20,682][12883] Updated weights for policy 0, policy_version 34421 (0.0034) [2024-06-18 01:53:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 563986432. Throughput: 0: 41071.5. Samples: 564118440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 01:53:21,996][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:53:24,744][12883] Updated weights for policy 0, policy_version 34431 (0.0031) [2024-06-18 01:53:26,994][12645] Fps is (10 sec: 40969.5, 60 sec: 41237.3, 300 sec: 41376.5). Total num frames: 564215808. Throughput: 0: 40987.0. 
Samples: 564360500. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) [2024-06-18 01:53:26,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:53:28,568][12883] Updated weights for policy 0, policy_version 34441 (0.0046) [2024-06-18 01:53:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40687.0, 300 sec: 41376.6). Total num frames: 564412416. Throughput: 0: 41058.3. Samples: 564487220. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) [2024-06-18 01:53:31,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:53:32,562][12883] Updated weights for policy 0, policy_version 34451 (0.0041) [2024-06-18 01:53:36,532][12883] Updated weights for policy 0, policy_version 34461 (0.0047) [2024-06-18 01:53:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 564609024. Throughput: 0: 41085.3. Samples: 564732200. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) [2024-06-18 01:53:36,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 01:53:40,510][12883] Updated weights for policy 0, policy_version 34471 (0.0033) [2024-06-18 01:53:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40961.6, 300 sec: 41376.5). Total num frames: 564838400. Throughput: 0: 40962.2. Samples: 564969160. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) [2024-06-18 01:53:41,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 01:53:44,946][12883] Updated weights for policy 0, policy_version 34481 (0.0031) [2024-06-18 01:53:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 565002240. Throughput: 0: 40927.5. Samples: 565099240. Policy #0 lag: (min: 2.0, avg: 12.2, max: 22.0) [2024-06-18 01:53:46,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 01:53:48,398][12883] Updated weights for policy 0, policy_version 34491 (0.0037) [2024-06-18 01:53:51,996][12645] Fps is (10 sec: 37675.9, 60 sec: 40958.7, 300 sec: 41265.2). Total num frames: 565215232. Throughput: 0: 40868.1. Samples: 565343500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:53:51,996][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:53:53,195][12883] Updated weights for policy 0, policy_version 34501 (0.0031) [2024-06-18 01:53:56,762][12883] Updated weights for policy 0, policy_version 34511 (0.0028) [2024-06-18 01:53:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 40687.0, 300 sec: 41321.0). Total num frames: 565444608. Throughput: 0: 40883.6. Samples: 565586640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:53:56,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:54:01,531][12883] Updated weights for policy 0, policy_version 34521 (0.0043) [2024-06-18 01:54:01,994][12645] Fps is (10 sec: 39329.2, 60 sec: 40413.8, 300 sec: 41209.9). Total num frames: 565608448. Throughput: 0: 40741.2. Samples: 565709580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:54:01,994][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 01:54:04,556][12883] Updated weights for policy 0, policy_version 34531 (0.0031) [2024-06-18 01:54:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 565837824. Throughput: 0: 40560.1. Samples: 565943640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 01:54:06,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:54:09,691][12883] Updated weights for policy 0, policy_version 34541 (0.0035) [2024-06-18 01:54:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 40140.8, 300 sec: 41154.4). Total num frames: 566001664. Throughput: 0: 40786.7. Samples: 566195900. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 01:54:11,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:54:12,619][12883] Updated weights for policy 0, policy_version 34551 (0.0044) [2024-06-18 01:54:16,994][12645] Fps is (10 sec: 37683.0, 60 sec: 40142.3, 300 sec: 41154.4). Total num frames: 566214656. Throughput: 0: 40456.4. Samples: 566307760. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 01:54:16,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:54:17,641][12883] Updated weights for policy 0, policy_version 34561 (0.0040) [2024-06-18 01:54:20,661][12883] Updated weights for policy 0, policy_version 34571 (0.0029) [2024-06-18 01:54:21,994][12645] Fps is (10 sec: 45874.5, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 566460416. Throughput: 0: 40527.5. Samples: 566555940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 01:54:21,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:54:25,929][12883] Updated weights for policy 0, policy_version 34581 (0.0041) [2024-06-18 01:54:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40140.7, 300 sec: 41265.4). Total num frames: 566624256. Throughput: 0: 40903.0. Samples: 566809800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 01:54:26,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:54:28,705][12883] Updated weights for policy 0, policy_version 34591 (0.0036) [2024-06-18 01:54:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40413.9, 300 sec: 41098.9). Total num frames: 566837248. Throughput: 0: 40448.9. Samples: 566919440. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 01:54:31,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:54:33,799][12883] Updated weights for policy 0, policy_version 34601 (0.0032) [2024-06-18 01:54:35,577][12862] Signal inference workers to stop experience collection... (8050 times) [2024-06-18 01:54:35,616][12883] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-06-18 01:54:35,637][12862] Signal inference workers to resume experience collection... 
(8050 times) [2024-06-18 01:54:35,639][12883] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-06-18 01:54:36,841][12883] Updated weights for policy 0, policy_version 34611 (0.0032) [2024-06-18 01:54:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 40960.0, 300 sec: 41265.4). Total num frames: 567066624. Throughput: 0: 40795.4. Samples: 567179220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 01:54:36,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:54:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034611_567066624.pth... [2024-06-18 01:54:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034008_557187072.pth [2024-06-18 01:54:41,571][12883] Updated weights for policy 0, policy_version 34621 (0.0037) [2024-06-18 01:54:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 39867.7, 300 sec: 41154.4). Total num frames: 567230464. Throughput: 0: 40934.6. Samples: 567428700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 01:54:41,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:54:44,826][12883] Updated weights for policy 0, policy_version 34631 (0.0036) [2024-06-18 01:54:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 567459840. Throughput: 0: 40752.0. Samples: 567543420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 01:54:46,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 01:54:49,422][12883] Updated weights for policy 0, policy_version 34641 (0.0038) [2024-06-18 01:54:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 40961.3, 300 sec: 41209.9). Total num frames: 567672832. Throughput: 0: 41014.1. Samples: 567789280. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 01:54:51,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:54:52,922][12883] Updated weights for policy 0, policy_version 34651 (0.0047) [2024-06-18 01:54:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 39867.7, 300 sec: 41098.9). Total num frames: 567836672. Throughput: 0: 41045.8. Samples: 568042960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:54:56,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:54:57,310][12883] Updated weights for policy 0, policy_version 34661 (0.0037) [2024-06-18 01:55:00,954][12883] Updated weights for policy 0, policy_version 34671 (0.0039) [2024-06-18 01:55:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 568098816. Throughput: 0: 41019.5. Samples: 568153640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:55:01,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:55:05,360][12883] Updated weights for policy 0, policy_version 34681 (0.0047) [2024-06-18 01:55:06,994][12645] Fps is (10 sec: 45875.1, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 568295424. Throughput: 0: 41123.7. Samples: 568406500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:55:06,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:55:09,295][12883] Updated weights for policy 0, policy_version 34691 (0.0033) [2024-06-18 01:55:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 568492032. Throughput: 0: 40757.9. Samples: 568643900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:55:11,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:55:13,395][12883] Updated weights for policy 0, policy_version 34701 (0.0053) [2024-06-18 01:55:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 568688640. Throughput: 0: 41188.0. Samples: 568772900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 01:55:16,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:55:17,273][12883] Updated weights for policy 0, policy_version 34711 (0.0042) [2024-06-18 01:55:21,160][12883] Updated weights for policy 0, policy_version 34721 (0.0036) [2024-06-18 01:55:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40140.8, 300 sec: 40932.2). Total num frames: 568868864. Throughput: 0: 40727.6. Samples: 569011960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:55:21,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:55:25,369][12883] Updated weights for policy 0, policy_version 34731 (0.0037) [2024-06-18 01:55:27,000][12645] Fps is (10 sec: 40934.5, 60 sec: 41228.9, 300 sec: 41153.5). Total num frames: 569098240. Throughput: 0: 40644.7. Samples: 569257960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:55:27,000][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 01:55:29,572][12883] Updated weights for policy 0, policy_version 34741 (0.0035) [2024-06-18 01:55:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 569327616. Throughput: 0: 40813.7. Samples: 569380040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:55:31,994][12645] Avg episode reward: [(0, '0.029')] [2024-06-18 01:55:33,559][12883] Updated weights for policy 0, policy_version 34751 (0.0046) [2024-06-18 01:55:36,994][12645] Fps is (10 sec: 40985.2, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 569507840. Throughput: 0: 40736.9. Samples: 569622440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:55:36,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:55:37,759][12883] Updated weights for policy 0, policy_version 34761 (0.0036) [2024-06-18 01:55:41,561][12883] Updated weights for policy 0, policy_version 34771 (0.0052) [2024-06-18 01:55:41,994][12645] Fps is (10 sec: 36045.3, 60 sec: 40960.1, 300 sec: 41043.3). 
Total num frames: 569688064. Throughput: 0: 40474.3. Samples: 569864300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:55:41,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:55:45,651][12883] Updated weights for policy 0, policy_version 34781 (0.0037) [2024-06-18 01:55:46,996][12645] Fps is (10 sec: 40951.0, 60 sec: 40958.5, 300 sec: 40987.5). Total num frames: 569917440. Throughput: 0: 40608.3. Samples: 569981100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:55:46,996][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:55:49,702][12883] Updated weights for policy 0, policy_version 34791 (0.0036) [2024-06-18 01:55:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40414.0, 300 sec: 40876.7). Total num frames: 570097664. Throughput: 0: 40552.1. Samples: 570231340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:55:51,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:55:53,404][12883] Updated weights for policy 0, policy_version 34801 (0.0030) [2024-06-18 01:55:56,994][12645] Fps is (10 sec: 40968.6, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 570327040. Throughput: 0: 40590.2. Samples: 570470460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:55:56,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 01:55:57,367][12883] Updated weights for policy 0, policy_version 34811 (0.0030) [2024-06-18 01:55:59,269][12862] Signal inference workers to stop experience collection... (8100 times) [2024-06-18 01:55:59,270][12862] Signal inference workers to resume experience collection... 
(8100 times) [2024-06-18 01:55:59,310][12883] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-06-18 01:55:59,310][12883] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-06-18 01:56:01,271][12883] Updated weights for policy 0, policy_version 34821 (0.0031) [2024-06-18 01:56:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40140.9, 300 sec: 40821.2). Total num frames: 570507264. Throughput: 0: 40434.7. Samples: 570592460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:56:01,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:56:05,398][12883] Updated weights for policy 0, policy_version 34831 (0.0048) [2024-06-18 01:56:06,994][12645] Fps is (10 sec: 39322.5, 60 sec: 40413.9, 300 sec: 41043.3). Total num frames: 570720256. Throughput: 0: 40512.1. Samples: 570835000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 01:56:06,994][12645] Avg episode reward: [(0, '0.002')] [2024-06-18 01:56:09,748][12883] Updated weights for policy 0, policy_version 34841 (0.0039) [2024-06-18 01:56:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40140.9, 300 sec: 40932.2). Total num frames: 570900480. Throughput: 0: 40660.4. Samples: 571087420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 01:56:11,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:56:13,030][12883] Updated weights for policy 0, policy_version 34851 (0.0038) [2024-06-18 01:56:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 571146240. Throughput: 0: 40722.2. Samples: 571212540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 01:56:16,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 01:56:17,723][12883] Updated weights for policy 0, policy_version 34861 (0.0040) [2024-06-18 01:56:21,575][12883] Updated weights for policy 0, policy_version 34871 (0.0031) [2024-06-18 01:56:21,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41233.1, 300 sec: 41099.7). 
Total num frames: 571342848. Throughput: 0: 40771.1. Samples: 571457140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 01:56:21,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:56:25,484][12883] Updated weights for policy 0, policy_version 34881 (0.0035) [2024-06-18 01:56:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40691.1, 300 sec: 40932.2). Total num frames: 571539456. Throughput: 0: 41092.7. Samples: 571713480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 01:56:27,003][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 01:56:29,542][12883] Updated weights for policy 0, policy_version 34891 (0.0040) [2024-06-18 01:56:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 571768832. Throughput: 0: 41215.3. Samples: 571835700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 01:56:31,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:56:33,158][12883] Updated weights for policy 0, policy_version 34901 (0.0037) [2024-06-18 01:56:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 571949056. Throughput: 0: 41034.1. Samples: 572077880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 01:56:36,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:56:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034909_571949056.pth... [2024-06-18 01:56:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034310_562135040.pth [2024-06-18 01:56:37,370][12883] Updated weights for policy 0, policy_version 34911 (0.0037) [2024-06-18 01:56:41,111][12883] Updated weights for policy 0, policy_version 34921 (0.0040) [2024-06-18 01:56:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 40876.7). Total num frames: 572162048. Throughput: 0: 41194.7. Samples: 572324220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 01:56:41,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:56:45,321][12883] Updated weights for policy 0, policy_version 34931 (0.0033) [2024-06-18 01:56:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40961.5, 300 sec: 40932.2). Total num frames: 572375040. Throughput: 0: 41293.2. Samples: 572450660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 01:56:46,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 01:56:48,974][12883] Updated weights for policy 0, policy_version 34941 (0.0024) [2024-06-18 01:56:51,996][12645] Fps is (10 sec: 42589.2, 60 sec: 41504.5, 300 sec: 41043.0). Total num frames: 572588032. Throughput: 0: 41334.8. Samples: 572695160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 01:56:51,996][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:56:53,351][12883] Updated weights for policy 0, policy_version 34951 (0.0035) [2024-06-18 01:56:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 40876.7). Total num frames: 572784640. Throughput: 0: 40988.8. Samples: 572931920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 01:56:56,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:56:57,571][12883] Updated weights for policy 0, policy_version 34961 (0.0057) [2024-06-18 01:57:01,566][12883] Updated weights for policy 0, policy_version 34971 (0.0046) [2024-06-18 01:57:01,994][12645] Fps is (10 sec: 37692.0, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 572964864. Throughput: 0: 41095.7. Samples: 573061840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 01:57:01,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 01:57:05,596][12883] Updated weights for policy 0, policy_version 34981 (0.0035) [2024-06-18 01:57:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 40932.3). Total num frames: 573194240. Throughput: 0: 41025.5. Samples: 573303280. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 01:57:06,994][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 01:57:07,081][12862] Saving new best policy, reward=0.054! [2024-06-18 01:57:09,297][12883] Updated weights for policy 0, policy_version 34991 (0.0044) [2024-06-18 01:57:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41779.1, 300 sec: 40932.5). Total num frames: 573407232. Throughput: 0: 40681.0. Samples: 573544120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 01:57:11,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 01:57:13,458][12883] Updated weights for policy 0, policy_version 35001 (0.0040) [2024-06-18 01:57:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 40414.0, 300 sec: 40932.2). Total num frames: 573571072. Throughput: 0: 40611.7. Samples: 573663220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 01:57:16,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 01:57:18,144][12883] Updated weights for policy 0, policy_version 35011 (0.0037) [2024-06-18 01:57:21,127][12883] Updated weights for policy 0, policy_version 35021 (0.0025) [2024-06-18 01:57:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 40933.1). Total num frames: 573816832. Throughput: 0: 40805.3. Samples: 573914120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 01:57:21,994][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 01:57:26,035][12883] Updated weights for policy 0, policy_version 35031 (0.0047) [2024-06-18 01:57:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41233.2, 300 sec: 40821.2). Total num frames: 574013440. Throughput: 0: 40741.0. Samples: 574157560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 01:57:26,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:57:28,523][12862] Signal inference workers to stop experience collection... (8150 times) [2024-06-18 01:57:28,524][12862] Signal inference workers to resume experience collection... 
(8150 times) [2024-06-18 01:57:28,563][12883] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-06-18 01:57:28,563][12883] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-06-18 01:57:29,367][12883] Updated weights for policy 0, policy_version 35041 (0.0033) [2024-06-18 01:57:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40414.0, 300 sec: 40876.7). Total num frames: 574193664. Throughput: 0: 40611.6. Samples: 574278180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 01:57:31,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 01:57:33,867][12883] Updated weights for policy 0, policy_version 35051 (0.0036) [2024-06-18 01:57:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 40821.5). Total num frames: 574423040. Throughput: 0: 40675.3. Samples: 574525460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:57:36,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:57:37,365][12883] Updated weights for policy 0, policy_version 35061 (0.0030) [2024-06-18 01:57:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40687.0, 300 sec: 40821.2). Total num frames: 574603264. Throughput: 0: 40976.9. Samples: 574775880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:57:41,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 01:57:41,996][12883] Updated weights for policy 0, policy_version 35071 (0.0038) [2024-06-18 01:57:45,249][12883] Updated weights for policy 0, policy_version 35081 (0.0029) [2024-06-18 01:57:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 574832640. Throughput: 0: 40765.6. Samples: 574896300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:57:46,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 01:57:49,820][12883] Updated weights for policy 0, policy_version 35091 (0.0037) [2024-06-18 01:57:51,994][12645] Fps is (10 sec: 44236.1, 60 sec: 40961.4, 300 sec: 40821.1). 
Total num frames: 575045632. Throughput: 0: 40987.3. Samples: 575147720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 01:57:51,994][12645] Avg episode reward: [(0, '0.004')] [2024-06-18 01:57:53,125][12883] Updated weights for policy 0, policy_version 35101 (0.0034) [2024-06-18 01:57:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 575225856. Throughput: 0: 41099.6. Samples: 575393600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 01:57:56,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 01:57:58,013][12883] Updated weights for policy 0, policy_version 35111 (0.0036) [2024-06-18 01:58:01,180][12883] Updated weights for policy 0, policy_version 35121 (0.0032) [2024-06-18 01:58:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 40987.8). Total num frames: 575455232. Throughput: 0: 41100.8. Samples: 575512760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 01:58:01,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 01:58:05,765][12883] Updated weights for policy 0, policy_version 35131 (0.0039) [2024-06-18 01:58:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 575651840. Throughput: 0: 41088.0. Samples: 575763080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 01:58:06,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:58:08,906][12883] Updated weights for policy 0, policy_version 35141 (0.0034) [2024-06-18 01:58:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 40877.0). Total num frames: 575864832. Throughput: 0: 41175.9. Samples: 576010480. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 01:58:11,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:58:13,375][12883] Updated weights for policy 0, policy_version 35151 (0.0032) [2024-06-18 01:58:16,776][12883] Updated weights for policy 0, policy_version 35161 (0.0035) [2024-06-18 01:58:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 40987.8). Total num frames: 576077824. Throughput: 0: 41409.3. Samples: 576141600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 01:58:16,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 01:58:21,402][12883] Updated weights for policy 0, policy_version 35171 (0.0042) [2024-06-18 01:58:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 40821.1). Total num frames: 576258048. Throughput: 0: 41360.0. Samples: 576386660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:58:21,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 01:58:25,081][12883] Updated weights for policy 0, policy_version 35181 (0.0034) [2024-06-18 01:58:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 576487424. Throughput: 0: 41293.7. Samples: 576634100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:58:26,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:58:29,124][12883] Updated weights for policy 0, policy_version 35191 (0.0043) [2024-06-18 01:58:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 40932.2). Total num frames: 576684032. Throughput: 0: 41369.8. Samples: 576757940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:58:31,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 01:58:32,898][12883] Updated weights for policy 0, policy_version 35201 (0.0048) [2024-06-18 01:58:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 40821.1). Total num frames: 576880640. Throughput: 0: 41269.9. Samples: 577004860. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 01:58:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:58:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035210_576880640.pth... [2024-06-18 01:58:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034611_567066624.pth [2024-06-18 01:58:37,374][12883] Updated weights for policy 0, policy_version 35211 (0.0039) [2024-06-18 01:58:40,855][12883] Updated weights for policy 0, policy_version 35221 (0.0038) [2024-06-18 01:58:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 40987.8). Total num frames: 577093632. Throughput: 0: 41220.4. Samples: 577248520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 01:58:41,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 01:58:45,353][12883] Updated weights for policy 0, policy_version 35231 (0.0047) [2024-06-18 01:58:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 40988.0). Total num frames: 577306624. Throughput: 0: 41390.1. Samples: 577375320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 01:58:46,994][12645] Avg episode reward: [(0, '0.038')] [2024-06-18 01:58:49,070][12883] Updated weights for policy 0, policy_version 35241 (0.0035) [2024-06-18 01:58:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 40876.7). Total num frames: 577503232. Throughput: 0: 41204.3. Samples: 577617280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 01:58:51,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 01:58:53,164][12883] Updated weights for policy 0, policy_version 35251 (0.0039) [2024-06-18 01:58:56,911][12883] Updated weights for policy 0, policy_version 35261 (0.0033) [2024-06-18 01:58:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 577716224. Throughput: 0: 41287.6. Samples: 577868420. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 01:58:56,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:59:00,863][12883] Updated weights for policy 0, policy_version 35271 (0.0044) [2024-06-18 01:59:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 577912832. Throughput: 0: 41271.6. Samples: 577998820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 01:59:01,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 01:59:04,843][12883] Updated weights for policy 0, policy_version 35281 (0.0027) [2024-06-18 01:59:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 578125824. Throughput: 0: 41067.6. Samples: 578234700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 01:59:07,003][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:59:09,202][12883] Updated weights for policy 0, policy_version 35291 (0.0048) [2024-06-18 01:59:09,483][12862] Signal inference workers to stop experience collection... (8200 times) [2024-06-18 01:59:09,483][12862] Signal inference workers to resume experience collection... (8200 times) [2024-06-18 01:59:09,505][12883] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-06-18 01:59:09,505][12883] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-06-18 01:59:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 578322432. Throughput: 0: 41162.3. Samples: 578486400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 01:59:12,003][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 01:59:12,651][12883] Updated weights for policy 0, policy_version 35301 (0.0038) [2024-06-18 01:59:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 578519040. Throughput: 0: 41095.6. Samples: 578607240. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 01:59:16,994][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 01:59:17,066][12883] Updated weights for policy 0, policy_version 35311 (0.0039) [2024-06-18 01:59:20,601][12883] Updated weights for policy 0, policy_version 35321 (0.0025) [2024-06-18 01:59:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 578732032. Throughput: 0: 40944.1. Samples: 578847340. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 01:59:21,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:59:24,895][12883] Updated weights for policy 0, policy_version 35331 (0.0037) [2024-06-18 01:59:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.9, 300 sec: 40987.7). Total num frames: 578928640. Throughput: 0: 41146.1. Samples: 579100100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:59:26,994][12645] Avg episode reward: [(0, '0.007')] [2024-06-18 01:59:28,638][12883] Updated weights for policy 0, policy_version 35341 (0.0045) [2024-06-18 01:59:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 579141632. Throughput: 0: 41043.7. Samples: 579222280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:59:31,994][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 01:59:32,788][12883] Updated weights for policy 0, policy_version 35351 (0.0033) [2024-06-18 01:59:36,715][12883] Updated weights for policy 0, policy_version 35361 (0.0042) [2024-06-18 01:59:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 579354624. Throughput: 0: 41106.3. Samples: 579467060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:59:36,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 01:59:40,729][12883] Updated weights for policy 0, policy_version 35371 (0.0042) [2024-06-18 01:59:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 41043.3). 
Total num frames: 579567616. Throughput: 0: 41104.4. Samples: 579718120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:59:41,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 01:59:44,285][12883] Updated weights for policy 0, policy_version 35381 (0.0031) [2024-06-18 01:59:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 579747840. Throughput: 0: 40864.3. Samples: 579837720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 01:59:46,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 01:59:48,834][12883] Updated weights for policy 0, policy_version 35391 (0.0040) [2024-06-18 01:59:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.3, 300 sec: 41209.9). Total num frames: 579993600. Throughput: 0: 41190.3. Samples: 580088260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 01:59:51,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 01:59:52,184][12883] Updated weights for policy 0, policy_version 35401 (0.0030) [2024-06-18 01:59:56,917][12883] Updated weights for policy 0, policy_version 35411 (0.0041) [2024-06-18 01:59:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 580173824. Throughput: 0: 41193.8. Samples: 580340120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 01:59:56,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 02:00:00,284][12883] Updated weights for policy 0, policy_version 35421 (0.0030) [2024-06-18 02:00:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 580370432. Throughput: 0: 41026.3. Samples: 580453420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:00:01,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:00:04,954][12883] Updated weights for policy 0, policy_version 35431 (0.0038) [2024-06-18 02:00:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41233.1, 300 sec: 41043.3). 
Total num frames: 580599808. Throughput: 0: 41308.9. Samples: 580706240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:00:06,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:00:08,160][12883] Updated weights for policy 0, policy_version 35441 (0.0039) [2024-06-18 02:00:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 580796416. Throughput: 0: 41139.7. Samples: 580951380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:00:11,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:00:12,786][12883] Updated weights for policy 0, policy_version 35451 (0.0034) [2024-06-18 02:00:16,806][12883] Updated weights for policy 0, policy_version 35461 (0.0035) [2024-06-18 02:00:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 580993024. Throughput: 0: 41236.3. Samples: 581077920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 02:00:16,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 02:00:20,653][12883] Updated weights for policy 0, policy_version 35471 (0.0041) [2024-06-18 02:00:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41044.2). Total num frames: 581206016. Throughput: 0: 41286.7. Samples: 581324960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 02:00:21,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:00:24,582][12883] Updated weights for policy 0, policy_version 35481 (0.0033) [2024-06-18 02:00:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 40932.2). Total num frames: 581402624. Throughput: 0: 41231.0. Samples: 581573520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 02:00:26,995][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 02:00:28,558][12883] Updated weights for policy 0, policy_version 35491 (0.0029) [2024-06-18 02:00:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41098.8). 
Total num frames: 581632000. Throughput: 0: 41363.1. Samples: 581699060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 02:00:31,994][12645] Avg episode reward: [(0, '0.049')] [2024-06-18 02:00:32,535][12883] Updated weights for policy 0, policy_version 35501 (0.0035) [2024-06-18 02:00:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 581795840. Throughput: 0: 41247.0. Samples: 581944380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:00:36,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 02:00:37,036][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035511_581812224.pth... [2024-06-18 02:00:37,041][12883] Updated weights for policy 0, policy_version 35511 (0.0027) [2024-06-18 02:00:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000034909_571949056.pth [2024-06-18 02:00:37,324][12862] Signal inference workers to stop experience collection... (8250 times) [2024-06-18 02:00:37,376][12862] Signal inference workers to resume experience collection... (8250 times) [2024-06-18 02:00:37,377][12883] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-06-18 02:00:37,396][12883] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-06-18 02:00:40,400][12883] Updated weights for policy 0, policy_version 35521 (0.0028) [2024-06-18 02:00:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41099.2). Total num frames: 582041600. Throughput: 0: 41135.0. Samples: 582191200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:00:41,994][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 02:00:44,884][12883] Updated weights for policy 0, policy_version 35531 (0.0034) [2024-06-18 02:00:46,994][12645] Fps is (10 sec: 45875.7, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 582254592. Throughput: 0: 41423.5. Samples: 582317480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:00:46,994][12645] Avg episode reward: [(0, '0.049')] [2024-06-18 02:00:48,224][12883] Updated weights for policy 0, policy_version 35541 (0.0030) [2024-06-18 02:00:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 582434816. Throughput: 0: 41223.2. Samples: 582561280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:00:51,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 02:00:52,659][12883] Updated weights for policy 0, policy_version 35551 (0.0023) [2024-06-18 02:00:56,155][12883] Updated weights for policy 0, policy_version 35561 (0.0039) [2024-06-18 02:00:56,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41504.6, 300 sec: 41209.6). Total num frames: 582664192. Throughput: 0: 41370.4. Samples: 582813140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:00:56,996][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 02:01:01,030][12883] Updated weights for policy 0, policy_version 35571 (0.0037) [2024-06-18 02:01:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 582877184. Throughput: 0: 41392.6. Samples: 582940580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 02:01:01,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 02:01:03,819][12883] Updated weights for policy 0, policy_version 35581 (0.0035) [2024-06-18 02:01:06,996][12645] Fps is (10 sec: 39321.3, 60 sec: 40958.5, 300 sec: 41209.6). Total num frames: 583057408. Throughput: 0: 41344.5. Samples: 583185560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 02:01:06,996][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 02:01:08,650][12883] Updated weights for policy 0, policy_version 35591 (0.0039) [2024-06-18 02:01:11,727][12883] Updated weights for policy 0, policy_version 35601 (0.0037) [2024-06-18 02:01:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41506.0, 300 sec: 41154.4). 
Total num frames: 583286784. Throughput: 0: 41240.5. Samples: 583429340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 02:01:11,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 02:01:16,400][12883] Updated weights for policy 0, policy_version 35611 (0.0039) [2024-06-18 02:01:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 583483392. Throughput: 0: 41268.9. Samples: 583556160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 02:01:16,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 02:01:19,671][12883] Updated weights for policy 0, policy_version 35621 (0.0043) [2024-06-18 02:01:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 583680000. Throughput: 0: 41312.0. Samples: 583803420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:01:21,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 02:01:24,164][12883] Updated weights for policy 0, policy_version 35631 (0.0025) [2024-06-18 02:01:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41098.8). Total num frames: 583892992. Throughput: 0: 41371.1. Samples: 584052900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:01:26,998][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 02:01:27,744][12883] Updated weights for policy 0, policy_version 35641 (0.0032) [2024-06-18 02:01:31,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 584089600. Throughput: 0: 41342.9. Samples: 584177920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:01:31,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:01:32,059][12883] Updated weights for policy 0, policy_version 35651 (0.0044) [2024-06-18 02:01:35,509][12883] Updated weights for policy 0, policy_version 35661 (0.0032) [2024-06-18 02:01:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41154.4). Total num frames: 584302592. 
Throughput: 0: 41266.0. Samples: 584418260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:01:36,994][12645] Avg episode reward: [(0, '0.036')] [2024-06-18 02:01:39,899][12883] Updated weights for policy 0, policy_version 35671 (0.0032) [2024-06-18 02:01:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 584499200. Throughput: 0: 41214.8. Samples: 584667720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:01:41,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 02:01:43,719][12883] Updated weights for policy 0, policy_version 35681 (0.0035) [2024-06-18 02:01:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41099.1). Total num frames: 584712192. Throughput: 0: 41094.0. Samples: 584789820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:01:46,994][12645] Avg episode reward: [(0, '0.001')] [2024-06-18 02:01:48,001][12883] Updated weights for policy 0, policy_version 35691 (0.0045) [2024-06-18 02:01:51,391][12883] Updated weights for policy 0, policy_version 35701 (0.0035) [2024-06-18 02:01:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 584925184. Throughput: 0: 41065.1. Samples: 585033400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:01:51,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:01:55,926][12883] Updated weights for policy 0, policy_version 35711 (0.0055) [2024-06-18 02:01:56,996][12645] Fps is (10 sec: 39313.1, 60 sec: 40686.9, 300 sec: 41154.1). Total num frames: 585105408. Throughput: 0: 41270.0. Samples: 585286580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:01:56,997][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 02:01:59,163][12883] Updated weights for policy 0, policy_version 35721 (0.0037) [2024-06-18 02:02:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 585318400. Throughput: 0: 41106.7. 
Samples: 585405960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:02:01,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 02:02:04,055][12883] Updated weights for policy 0, policy_version 35731 (0.0038) [2024-06-18 02:02:06,994][12645] Fps is (10 sec: 45885.8, 60 sec: 41780.8, 300 sec: 41209.9). Total num frames: 585564160. Throughput: 0: 41100.0. Samples: 585652920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:02:06,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 02:02:07,015][12883] Updated weights for policy 0, policy_version 35741 (0.0046) [2024-06-18 02:02:10,262][12862] Signal inference workers to stop experience collection... (8300 times) [2024-06-18 02:02:10,263][12862] Signal inference workers to resume experience collection... (8300 times) [2024-06-18 02:02:10,299][12883] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-06-18 02:02:10,299][12883] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-06-18 02:02:11,948][12883] Updated weights for policy 0, policy_version 35751 (0.0039) [2024-06-18 02:02:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 40960.0, 300 sec: 41265.4). Total num frames: 585744384. Throughput: 0: 41189.8. Samples: 585906440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 02:02:11,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:02:15,414][12883] Updated weights for policy 0, policy_version 35761 (0.0038) [2024-06-18 02:02:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 585940992. Throughput: 0: 41051.8. Samples: 586025240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 02:02:16,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:02:20,010][12883] Updated weights for policy 0, policy_version 35771 (0.0041) [2024-06-18 02:02:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 586170368. 
Throughput: 0: 41389.5. Samples: 586280780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 02:02:21,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:02:23,419][12883] Updated weights for policy 0, policy_version 35781 (0.0034) [2024-06-18 02:02:26,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 586350592. Throughput: 0: 41360.8. Samples: 586528960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 02:02:26,994][12645] Avg episode reward: [(0, '0.029')] [2024-06-18 02:02:28,003][12883] Updated weights for policy 0, policy_version 35791 (0.0033) [2024-06-18 02:02:31,135][12883] Updated weights for policy 0, policy_version 35801 (0.0038) [2024-06-18 02:02:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 586579968. Throughput: 0: 41259.6. Samples: 586646500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 02:02:31,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:02:36,083][12883] Updated weights for policy 0, policy_version 35811 (0.0038) [2024-06-18 02:02:36,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41506.3, 300 sec: 41321.0). Total num frames: 586792960. Throughput: 0: 41401.0. Samples: 586896440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 02:02:36,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 02:02:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035815_586792960.pth... [2024-06-18 02:02:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035210_576880640.pth [2024-06-18 02:02:39,733][12883] Updated weights for policy 0, policy_version 35821 (0.0039) [2024-06-18 02:02:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 586973184. Throughput: 0: 41233.6. Samples: 587142000. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 02:02:41,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:02:43,686][12883] Updated weights for policy 0, policy_version 35831 (0.0040) [2024-06-18 02:02:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 587202560. Throughput: 0: 41332.3. Samples: 587265920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 02:02:46,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 02:02:47,606][12883] Updated weights for policy 0, policy_version 35841 (0.0036) [2024-06-18 02:02:51,604][12883] Updated weights for policy 0, policy_version 35851 (0.0040) [2024-06-18 02:02:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 587399168. Throughput: 0: 41282.2. Samples: 587510620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 02:02:51,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:02:55,571][12883] Updated weights for policy 0, policy_version 35861 (0.0032) [2024-06-18 02:02:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41234.7, 300 sec: 41098.8). Total num frames: 587579392. Throughput: 0: 41134.7. Samples: 587757500. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-18 02:02:56,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 02:02:59,385][12883] Updated weights for policy 0, policy_version 35871 (0.0033) [2024-06-18 02:03:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41232.9, 300 sec: 41154.4). Total num frames: 587792384. Throughput: 0: 41114.5. Samples: 587875400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-18 02:03:01,995][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:03:03,755][12883] Updated weights for policy 0, policy_version 35881 (0.0036) [2024-06-18 02:03:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 588005376. Throughput: 0: 41060.9. Samples: 588128520. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-18 02:03:06,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 02:03:07,327][12883] Updated weights for policy 0, policy_version 35891 (0.0042) [2024-06-18 02:03:11,646][12883] Updated weights for policy 0, policy_version 35901 (0.0049) [2024-06-18 02:03:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 588201984. Throughput: 0: 40916.1. Samples: 588370180. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-18 02:03:11,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:03:15,350][12883] Updated weights for policy 0, policy_version 35911 (0.0031) [2024-06-18 02:03:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 588431360. Throughput: 0: 41024.1. Samples: 588492580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 24.0) [2024-06-18 02:03:16,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 02:03:19,674][12883] Updated weights for policy 0, policy_version 35921 (0.0045) [2024-06-18 02:03:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 588611584. Throughput: 0: 41043.6. Samples: 588743400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 02:03:21,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 02:03:23,375][12883] Updated weights for policy 0, policy_version 35931 (0.0025) [2024-06-18 02:03:26,994][12645] Fps is (10 sec: 37682.9, 60 sec: 40960.1, 300 sec: 41098.8). Total num frames: 588808192. Throughput: 0: 41008.4. Samples: 588987380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 02:03:26,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:03:27,884][12883] Updated weights for policy 0, policy_version 35941 (0.0032) [2024-06-18 02:03:31,154][12883] Updated weights for policy 0, policy_version 35951 (0.0038) [2024-06-18 02:03:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41233.1, 300 sec: 41265.5). 
Total num frames: 589053952. Throughput: 0: 40961.8. Samples: 589109200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 02:03:31,994][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 02:03:35,681][12883] Updated weights for policy 0, policy_version 35961 (0.0039) [2024-06-18 02:03:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40413.8, 300 sec: 41098.9). Total num frames: 589217792. Throughput: 0: 40964.5. Samples: 589354020. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 02:03:36,994][12645] Avg episode reward: [(0, '0.031')] [2024-06-18 02:03:39,439][12883] Updated weights for policy 0, policy_version 35971 (0.0035) [2024-06-18 02:03:40,608][12862] Signal inference workers to stop experience collection... (8350 times) [2024-06-18 02:03:40,609][12862] Signal inference workers to resume experience collection... (8350 times) [2024-06-18 02:03:40,630][12883] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-06-18 02:03:40,630][12883] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-06-18 02:03:41,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 589430784. Throughput: 0: 40925.0. Samples: 589599120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:03:41,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:03:43,777][12883] Updated weights for policy 0, policy_version 35981 (0.0038) [2024-06-18 02:03:46,994][12645] Fps is (10 sec: 44235.7, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 589660160. Throughput: 0: 41158.6. Samples: 589727540. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:03:46,995][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:03:47,228][12883] Updated weights for policy 0, policy_version 35991 (0.0035) [2024-06-18 02:03:51,858][12883] Updated weights for policy 0, policy_version 36001 (0.0043) [2024-06-18 02:03:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 589840384. Throughput: 0: 40763.4. Samples: 589962880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:03:51,994][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 02:03:55,443][12883] Updated weights for policy 0, policy_version 36011 (0.0035) [2024-06-18 02:03:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 590069760. Throughput: 0: 40854.6. Samples: 590208640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:03:56,994][12645] Avg episode reward: [(0, '0.029')] [2024-06-18 02:03:59,743][12883] Updated weights for policy 0, policy_version 36021 (0.0033) [2024-06-18 02:04:01,994][12645] Fps is (10 sec: 40960.7, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 590249984. Throughput: 0: 40948.0. Samples: 590335240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:04:01,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:04:03,364][12883] Updated weights for policy 0, policy_version 36031 (0.0023) [2024-06-18 02:04:06,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 590446592. Throughput: 0: 40781.8. Samples: 590578580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 02:04:06,994][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 02:04:07,843][12883] Updated weights for policy 0, policy_version 36041 (0.0045) [2024-06-18 02:04:11,605][12883] Updated weights for policy 0, policy_version 36051 (0.0033) [2024-06-18 02:04:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41209.9). 
Total num frames: 590675968. Throughput: 0: 40797.8. Samples: 590823280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 02:04:11,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 02:04:15,696][12883] Updated weights for policy 0, policy_version 36061 (0.0050) [2024-06-18 02:04:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40413.8, 300 sec: 41098.8). Total num frames: 590856192. Throughput: 0: 40831.6. Samples: 590946620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 02:04:16,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 02:04:19,711][12883] Updated weights for policy 0, policy_version 36071 (0.0040) [2024-06-18 02:04:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 591069184. Throughput: 0: 40860.5. Samples: 591192740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 02:04:21,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:04:24,037][12883] Updated weights for policy 0, policy_version 36081 (0.0041) [2024-06-18 02:04:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 591298560. Throughput: 0: 40729.6. Samples: 591431960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 02:04:27,000][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:04:27,654][12883] Updated weights for policy 0, policy_version 36091 (0.0044) [2024-06-18 02:04:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 40140.8, 300 sec: 41043.3). Total num frames: 591462400. Throughput: 0: 40634.8. Samples: 591556100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 02:04:31,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:04:32,275][12883] Updated weights for policy 0, policy_version 36101 (0.0034) [2024-06-18 02:04:35,675][12883] Updated weights for policy 0, policy_version 36111 (0.0036) [2024-06-18 02:04:36,994][12645] Fps is (10 sec: 36044.7, 60 sec: 40686.8, 300 sec: 40987.8). 
Total num frames: 591659008. Throughput: 0: 40800.0. Samples: 591798880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 02:04:36,994][12645] Avg episode reward: [(0, '0.009')] [2024-06-18 02:04:37,058][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036113_591675392.pth... [2024-06-18 02:04:37,114][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035511_581812224.pth [2024-06-18 02:04:40,274][12883] Updated weights for policy 0, policy_version 36121 (0.0035) [2024-06-18 02:04:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 591872000. Throughput: 0: 40696.5. Samples: 592039980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 02:04:41,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:04:43,678][12883] Updated weights for policy 0, policy_version 36131 (0.0048) [2024-06-18 02:04:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 40687.1, 300 sec: 41043.3). Total num frames: 592101376. Throughput: 0: 40640.8. Samples: 592164080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 02:04:46,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 02:04:48,026][12883] Updated weights for policy 0, policy_version 36141 (0.0032) [2024-06-18 02:04:51,650][12883] Updated weights for policy 0, policy_version 36151 (0.0028) [2024-06-18 02:04:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 592314368. Throughput: 0: 40702.6. Samples: 592410200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:04:51,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:04:55,986][12883] Updated weights for policy 0, policy_version 36161 (0.0030) [2024-06-18 02:04:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 592510976. Throughput: 0: 40670.5. Samples: 592653460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:04:56,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 02:04:59,693][12883] Updated weights for policy 0, policy_version 36171 (0.0043) [2024-06-18 02:05:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 592707584. Throughput: 0: 40688.1. Samples: 592777580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:05:01,994][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 02:05:04,145][12883] Updated weights for policy 0, policy_version 36181 (0.0027) [2024-06-18 02:05:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 592904192. Throughput: 0: 40768.7. Samples: 593027340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:05:06,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 02:05:08,051][12883] Updated weights for policy 0, policy_version 36191 (0.0030) [2024-06-18 02:05:11,987][12883] Updated weights for policy 0, policy_version 36201 (0.0038) [2024-06-18 02:05:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 40686.8, 300 sec: 41098.8). Total num frames: 593117184. Throughput: 0: 41055.5. Samples: 593279460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:05:11,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 02:05:15,971][12883] Updated weights for policy 0, policy_version 36211 (0.0039) [2024-06-18 02:05:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 593330176. Throughput: 0: 41017.9. Samples: 593401900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 02:05:16,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 02:05:19,719][12883] Updated weights for policy 0, policy_version 36221 (0.0046) [2024-06-18 02:05:21,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 593543168. Throughput: 0: 41080.2. Samples: 593647480. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 02:05:21,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:05:23,469][12862] Signal inference workers to stop experience collection... (8400 times) [2024-06-18 02:05:23,469][12862] Signal inference workers to resume experience collection... (8400 times) [2024-06-18 02:05:23,511][12883] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-06-18 02:05:23,511][12883] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-06-18 02:05:24,159][12883] Updated weights for policy 0, policy_version 36231 (0.0027) [2024-06-18 02:05:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40414.0, 300 sec: 40987.8). Total num frames: 593723392. Throughput: 0: 41384.9. Samples: 593902300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 02:05:26,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:05:27,438][12883] Updated weights for policy 0, policy_version 36241 (0.0044) [2024-06-18 02:05:31,994][12645] Fps is (10 sec: 37683.3, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 593920000. Throughput: 0: 41309.4. Samples: 594023000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 02:05:31,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:05:32,069][12883] Updated weights for policy 0, policy_version 36251 (0.0043) [2024-06-18 02:05:35,104][12883] Updated weights for policy 0, policy_version 36261 (0.0032) [2024-06-18 02:05:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42052.4, 300 sec: 41154.4). Total num frames: 594182144. Throughput: 0: 41476.1. Samples: 594276620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 02:05:36,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 02:05:39,681][12883] Updated weights for policy 0, policy_version 36271 (0.0038) [2024-06-18 02:05:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41043.3). Total num frames: 594362368. Throughput: 0: 41618.0. 
Samples: 594526260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 02:05:41,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:05:43,504][12883] Updated weights for policy 0, policy_version 36281 (0.0035) [2024-06-18 02:05:46,994][12645] Fps is (10 sec: 37682.5, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 594558976. Throughput: 0: 41492.7. Samples: 594644760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 02:05:46,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:05:47,447][12883] Updated weights for policy 0, policy_version 36291 (0.0034) [2024-06-18 02:05:51,781][12883] Updated weights for policy 0, policy_version 36301 (0.0040) [2024-06-18 02:05:51,997][12645] Fps is (10 sec: 40945.5, 60 sec: 40957.7, 300 sec: 41043.1). Total num frames: 594771968. Throughput: 0: 41497.3. Samples: 594894860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 02:05:51,998][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:05:55,566][12883] Updated weights for policy 0, policy_version 36311 (0.0027) [2024-06-18 02:05:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 594984960. Throughput: 0: 41456.1. Samples: 595144980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 02:05:56,994][12645] Avg episode reward: [(0, '0.034')] [2024-06-18 02:05:59,397][12883] Updated weights for policy 0, policy_version 36321 (0.0029) [2024-06-18 02:06:01,994][12645] Fps is (10 sec: 42612.9, 60 sec: 41506.1, 300 sec: 41154.7). Total num frames: 595197952. Throughput: 0: 41506.1. Samples: 595269680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 02:06:01,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:06:01,994][12862] Saving new best policy, reward=0.065! [2024-06-18 02:06:03,250][12883] Updated weights for policy 0, policy_version 36331 (0.0049) [2024-06-18 02:06:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 41043.3). 
Total num frames: 595394560. Throughput: 0: 41576.3. Samples: 595518420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 02:06:06,994][12645] Avg episode reward: [(0, '0.032')] [2024-06-18 02:06:07,157][12883] Updated weights for policy 0, policy_version 36341 (0.0045) [2024-06-18 02:06:10,899][12883] Updated weights for policy 0, policy_version 36351 (0.0029) [2024-06-18 02:06:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41098.8). Total num frames: 595607552. Throughput: 0: 41573.3. Samples: 595773100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 02:06:11,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:06:15,012][12883] Updated weights for policy 0, policy_version 36361 (0.0041) [2024-06-18 02:06:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 595820544. Throughput: 0: 41597.3. Samples: 595894880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 02:06:16,994][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 02:06:18,472][12883] Updated weights for policy 0, policy_version 36371 (0.0041) [2024-06-18 02:06:21,996][12645] Fps is (10 sec: 42589.0, 60 sec: 41504.6, 300 sec: 41154.1). Total num frames: 596033536. Throughput: 0: 41509.0. Samples: 596144620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 02:06:21,996][12645] Avg episode reward: [(0, '0.091')] [2024-06-18 02:06:21,997][12862] Saving new best policy, reward=0.091! [2024-06-18 02:06:22,881][12883] Updated weights for policy 0, policy_version 36381 (0.0039) [2024-06-18 02:06:26,537][12883] Updated weights for policy 0, policy_version 36391 (0.0034) [2024-06-18 02:06:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 596230144. Throughput: 0: 41689.2. Samples: 596402280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:06:26,994][12645] Avg episode reward: [(0, '0.031')] [2024-06-18 02:06:30,428][12883] Updated weights for policy 0, policy_version 36401 (0.0029) [2024-06-18 02:06:31,996][12645] Fps is (10 sec: 40959.9, 60 sec: 42050.6, 300 sec: 41154.1). Total num frames: 596443136. Throughput: 0: 41790.9. Samples: 596525440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:06:31,996][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:06:34,152][12883] Updated weights for policy 0, policy_version 36411 (0.0038) [2024-06-18 02:06:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41506.0, 300 sec: 41265.5). Total num frames: 596672512. Throughput: 0: 41773.8. Samples: 596774540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:06:36,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 02:06:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036418_596672512.pth... [2024-06-18 02:06:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000035815_586792960.pth [2024-06-18 02:06:38,387][12883] Updated weights for policy 0, policy_version 36421 (0.0034) [2024-06-18 02:06:40,590][12862] Signal inference workers to stop experience collection... (8450 times) [2024-06-18 02:06:40,590][12862] Signal inference workers to resume experience collection... (8450 times) [2024-06-18 02:06:40,635][12883] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-06-18 02:06:40,635][12883] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-06-18 02:06:41,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42052.2, 300 sec: 41265.5). Total num frames: 596885504. Throughput: 0: 41788.9. Samples: 597025480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:06:41,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 02:06:41,999][12883] Updated weights for policy 0, policy_version 36431 (0.0037) [2024-06-18 02:06:46,137][12883] Updated weights for policy 0, policy_version 36441 (0.0043) [2024-06-18 02:06:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 41209.9). Total num frames: 597082112. Throughput: 0: 41923.2. Samples: 597156220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 02:06:46,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:06:49,873][12883] Updated weights for policy 0, policy_version 36451 (0.0050) [2024-06-18 02:06:51,996][12645] Fps is (10 sec: 39311.7, 60 sec: 41779.9, 300 sec: 41265.4). Total num frames: 597278720. Throughput: 0: 41956.9. Samples: 597406580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 02:06:51,997][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:06:53,775][12883] Updated weights for policy 0, policy_version 36461 (0.0029) [2024-06-18 02:06:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 597491712. Throughput: 0: 41794.7. Samples: 597653860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 02:06:56,994][12645] Avg episode reward: [(0, '0.034')] [2024-06-18 02:06:58,059][12883] Updated weights for policy 0, policy_version 36471 (0.0045) [2024-06-18 02:07:01,418][12883] Updated weights for policy 0, policy_version 36481 (0.0046) [2024-06-18 02:07:01,994][12645] Fps is (10 sec: 42609.0, 60 sec: 41779.2, 300 sec: 41154.4). Total num frames: 597704704. Throughput: 0: 41937.3. Samples: 597782060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 02:07:01,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 02:07:05,821][12883] Updated weights for policy 0, policy_version 36491 (0.0035) [2024-06-18 02:07:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41209.9). 
Total num frames: 597901312. Throughput: 0: 41995.4. Samples: 598034320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 02:07:06,994][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 02:07:09,080][12883] Updated weights for policy 0, policy_version 36501 (0.0029) [2024-06-18 02:07:11,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41777.7, 300 sec: 41265.1). Total num frames: 598114304. Throughput: 0: 41763.3. Samples: 598281720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:07:11,997][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 02:07:13,762][12883] Updated weights for policy 0, policy_version 36511 (0.0046) [2024-06-18 02:07:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41265.5). Total num frames: 598343680. Throughput: 0: 41871.0. Samples: 598409540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:07:16,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:07:17,106][12883] Updated weights for policy 0, policy_version 36521 (0.0040) [2024-06-18 02:07:21,763][12883] Updated weights for policy 0, policy_version 36531 (0.0028) [2024-06-18 02:07:21,996][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41320.7). Total num frames: 598540288. Throughput: 0: 41874.5. Samples: 598658980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:07:21,997][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:07:24,961][12883] Updated weights for policy 0, policy_version 36541 (0.0036) [2024-06-18 02:07:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41265.5). Total num frames: 598753280. Throughput: 0: 41825.7. Samples: 598907640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:07:26,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 02:07:29,811][12883] Updated weights for policy 0, policy_version 36551 (0.0039) [2024-06-18 02:07:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42053.8, 300 sec: 41265.5). 
Total num frames: 598966272. Throughput: 0: 41772.4. Samples: 599035980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:07:32,000][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:07:32,787][12883] Updated weights for policy 0, policy_version 36561 (0.0040) [2024-06-18 02:07:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 599162880. Throughput: 0: 41821.9. Samples: 599288460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 02:07:36,994][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 02:07:37,289][12883] Updated weights for policy 0, policy_version 36571 (0.0044) [2024-06-18 02:07:40,886][12883] Updated weights for policy 0, policy_version 36581 (0.0035) [2024-06-18 02:07:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 599392256. Throughput: 0: 41695.0. Samples: 599530140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 02:07:41,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 02:07:45,081][12883] Updated weights for policy 0, policy_version 36591 (0.0043) [2024-06-18 02:07:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 599572480. Throughput: 0: 41729.4. Samples: 599659880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 02:07:46,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:07:48,806][12883] Updated weights for policy 0, policy_version 36601 (0.0031) [2024-06-18 02:07:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41780.9, 300 sec: 41376.5). Total num frames: 599785472. Throughput: 0: 41622.6. Samples: 599907340. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 02:07:51,994][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 02:07:53,055][12883] Updated weights for policy 0, policy_version 36611 (0.0035) [2024-06-18 02:07:56,680][12883] Updated weights for policy 0, policy_version 36621 (0.0033) [2024-06-18 02:07:56,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.2, 300 sec: 41487.6). Total num frames: 600031232. Throughput: 0: 41796.7. Samples: 600162480. Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) [2024-06-18 02:07:56,994][12645] Avg episode reward: [(0, '0.006')] [2024-06-18 02:08:00,773][12883] Updated weights for policy 0, policy_version 36631 (0.0040) [2024-06-18 02:08:01,999][12645] Fps is (10 sec: 42575.5, 60 sec: 41775.4, 300 sec: 41375.8). Total num frames: 600211456. Throughput: 0: 41707.4. Samples: 600286600. Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) [2024-06-18 02:08:02,000][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 02:08:04,478][12883] Updated weights for policy 0, policy_version 36641 (0.0034) [2024-06-18 02:08:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 600424448. Throughput: 0: 41707.8. Samples: 600535740. Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) [2024-06-18 02:08:06,994][12645] Avg episode reward: [(0, '0.064')] [2024-06-18 02:08:08,624][12883] Updated weights for policy 0, policy_version 36651 (0.0050) [2024-06-18 02:08:11,994][12645] Fps is (10 sec: 39342.9, 60 sec: 41507.7, 300 sec: 41265.5). Total num frames: 600604672. Throughput: 0: 41856.9. Samples: 600791200. Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) [2024-06-18 02:08:11,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 02:08:12,013][12862] Signal inference workers to stop experience collection... 
(8500 times) [2024-06-18 02:08:12,060][12883] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-06-18 02:08:12,068][12862] Signal inference workers to resume experience collection... (8500 times) [2024-06-18 02:08:12,079][12883] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-06-18 02:08:12,487][12883] Updated weights for policy 0, policy_version 36661 (0.0035) [2024-06-18 02:08:16,453][12883] Updated weights for policy 0, policy_version 36671 (0.0043) [2024-06-18 02:08:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 600817664. Throughput: 0: 41503.6. Samples: 600903640. Policy #0 lag: (min: 2.0, avg: 12.1, max: 25.0) [2024-06-18 02:08:16,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 02:08:20,410][12883] Updated weights for policy 0, policy_version 36681 (0.0043) [2024-06-18 02:08:21,996][12645] Fps is (10 sec: 44226.7, 60 sec: 41779.2, 300 sec: 41487.3). Total num frames: 601047040. Throughput: 0: 41424.6. Samples: 601152660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 02:08:21,996][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 02:08:24,567][12883] Updated weights for policy 0, policy_version 36691 (0.0051) [2024-06-18 02:08:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 601210880. Throughput: 0: 41641.8. Samples: 601404020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 02:08:26,994][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 02:08:28,501][12883] Updated weights for policy 0, policy_version 36701 (0.0033) [2024-06-18 02:08:31,994][12645] Fps is (10 sec: 40968.9, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 601456640. Throughput: 0: 41449.6. Samples: 601525120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 02:08:31,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 02:08:32,660][12883] Updated weights for policy 0, policy_version 36711 (0.0043) [2024-06-18 02:08:36,383][12883] Updated weights for policy 0, policy_version 36721 (0.0034) [2024-06-18 02:08:36,994][12645] Fps is (10 sec: 45874.5, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 601669632. Throughput: 0: 41467.4. Samples: 601773380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 02:08:36,996][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 02:08:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036723_601669632.pth... [2024-06-18 02:08:37,084][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036113_591675392.pth [2024-06-18 02:08:40,752][12883] Updated weights for policy 0, policy_version 36731 (0.0045) [2024-06-18 02:08:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 601849856. Throughput: 0: 41318.3. Samples: 602021800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 02:08:41,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 02:08:44,193][12883] Updated weights for policy 0, policy_version 36741 (0.0022) [2024-06-18 02:08:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 602079232. Throughput: 0: 41163.2. Samples: 602138720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:08:46,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 02:08:48,507][12883] Updated weights for policy 0, policy_version 36751 (0.0040) [2024-06-18 02:08:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 602275840. Throughput: 0: 41360.1. Samples: 602396940. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:08:51,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 02:08:51,997][12883] Updated weights for policy 0, policy_version 36761 (0.0029) [2024-06-18 02:08:56,167][12883] Updated weights for policy 0, policy_version 36771 (0.0033) [2024-06-18 02:08:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 40687.0, 300 sec: 41432.1). Total num frames: 602472448. Throughput: 0: 41202.7. Samples: 602645320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:08:56,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 02:08:59,879][12883] Updated weights for policy 0, policy_version 36781 (0.0035) [2024-06-18 02:09:01,996][12645] Fps is (10 sec: 44224.5, 60 sec: 41781.1, 300 sec: 41598.3). Total num frames: 602718208. Throughput: 0: 41392.1. Samples: 602766400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:09:01,997][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 02:09:04,286][12883] Updated weights for policy 0, policy_version 36791 (0.0041) [2024-06-18 02:09:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 602898432. Throughput: 0: 41542.6. Samples: 603021980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:09:06,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:09:08,014][12883] Updated weights for policy 0, policy_version 36801 (0.0040) [2024-06-18 02:09:11,994][12645] Fps is (10 sec: 37693.4, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 603095040. Throughput: 0: 41187.1. Samples: 603257440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:09:11,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 02:09:12,064][12883] Updated weights for policy 0, policy_version 36811 (0.0028) [2024-06-18 02:09:16,207][12883] Updated weights for policy 0, policy_version 36821 (0.0038) [2024-06-18 02:09:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41487.6). 
Total num frames: 603308032. Throughput: 0: 41333.9. Samples: 603385140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:09:16,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:09:19,892][12883] Updated weights for policy 0, policy_version 36831 (0.0034) [2024-06-18 02:09:21,996][12645] Fps is (10 sec: 40950.6, 60 sec: 40960.0, 300 sec: 41376.2). Total num frames: 603504640. Throughput: 0: 41396.3. Samples: 603636300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:09:21,996][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 02:09:24,052][12883] Updated weights for policy 0, policy_version 36841 (0.0037) [2024-06-18 02:09:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 41654.3). Total num frames: 603750400. Throughput: 0: 41275.6. Samples: 603879200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:09:26,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:09:28,168][12883] Updated weights for policy 0, policy_version 36851 (0.0048) [2024-06-18 02:09:31,994][12645] Fps is (10 sec: 40969.5, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 603914240. Throughput: 0: 41533.4. Samples: 604007720. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-18 02:09:31,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 02:09:32,012][12883] Updated weights for policy 0, policy_version 36861 (0.0038) [2024-06-18 02:09:35,795][12883] Updated weights for policy 0, policy_version 36871 (0.0034) [2024-06-18 02:09:36,996][12645] Fps is (10 sec: 37674.8, 60 sec: 40958.6, 300 sec: 41542.8). Total num frames: 604127232. Throughput: 0: 41304.1. Samples: 604255720. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-18 02:09:36,996][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 02:09:40,014][12883] Updated weights for policy 0, policy_version 36881 (0.0028) [2024-06-18 02:09:40,435][12862] Signal inference workers to stop experience collection... 
(8550 times) [2024-06-18 02:09:40,463][12883] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-06-18 02:09:40,493][12862] Signal inference workers to resume experience collection... (8550 times) [2024-06-18 02:09:40,493][12883] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-06-18 02:09:41,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42050.7, 300 sec: 41598.4). Total num frames: 604372992. Throughput: 0: 41173.0. Samples: 604498200. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-18 02:09:41,997][12645] Avg episode reward: [(0, '0.034')] [2024-06-18 02:09:43,680][12883] Updated weights for policy 0, policy_version 36891 (0.0027) [2024-06-18 02:09:46,994][12645] Fps is (10 sec: 39330.6, 60 sec: 40687.0, 300 sec: 41376.6). Total num frames: 604520448. Throughput: 0: 41328.7. Samples: 604626080. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-18 02:09:46,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 02:09:47,951][12883] Updated weights for policy 0, policy_version 36901 (0.0033) [2024-06-18 02:09:51,566][12883] Updated weights for policy 0, policy_version 36911 (0.0041) [2024-06-18 02:09:51,994][12645] Fps is (10 sec: 37691.9, 60 sec: 41233.0, 300 sec: 41487.7). Total num frames: 604749824. Throughput: 0: 41008.9. Samples: 604867380. Policy #0 lag: (min: 1.0, avg: 7.9, max: 21.0) [2024-06-18 02:09:51,994][12645] Avg episode reward: [(0, '0.005')] [2024-06-18 02:09:55,764][12883] Updated weights for policy 0, policy_version 36921 (0.0022) [2024-06-18 02:09:56,994][12645] Fps is (10 sec: 45874.1, 60 sec: 41779.0, 300 sec: 41598.7). Total num frames: 604979200. Throughput: 0: 41345.1. Samples: 605117980. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 02:09:56,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 02:09:59,649][12883] Updated weights for policy 0, policy_version 36931 (0.0037) [2024-06-18 02:10:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40688.7, 300 sec: 41543.2). Total num frames: 605159424. Throughput: 0: 41264.4. Samples: 605242040. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 02:10:01,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 02:10:03,572][12883] Updated weights for policy 0, policy_version 36941 (0.0035) [2024-06-18 02:10:06,994][12645] Fps is (10 sec: 39322.5, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 605372416. Throughput: 0: 41089.7. Samples: 605485240. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 02:10:06,994][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 02:10:07,388][12883] Updated weights for policy 0, policy_version 36951 (0.0039) [2024-06-18 02:10:11,445][12883] Updated weights for policy 0, policy_version 36961 (0.0034) [2024-06-18 02:10:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 605601792. Throughput: 0: 41264.8. Samples: 605736120. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 02:10:11,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 02:10:15,427][12883] Updated weights for policy 0, policy_version 36971 (0.0032) [2024-06-18 02:10:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 605798400. Throughput: 0: 41246.6. Samples: 605863820. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 02:10:16,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:10:19,359][12883] Updated weights for policy 0, policy_version 36981 (0.0052) [2024-06-18 02:10:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41507.7, 300 sec: 41598.7). Total num frames: 605995008. Throughput: 0: 41298.5. Samples: 606114060. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:10:21,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 02:10:23,328][12883] Updated weights for policy 0, policy_version 36991 (0.0037) [2024-06-18 02:10:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 606208000. Throughput: 0: 41489.2. Samples: 606365120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:10:26,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 02:10:27,011][12883] Updated weights for policy 0, policy_version 37001 (0.0033) [2024-06-18 02:10:31,210][12883] Updated weights for policy 0, policy_version 37011 (0.0034) [2024-06-18 02:10:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 606420992. Throughput: 0: 41545.2. Samples: 606495620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:10:31,994][12645] Avg episode reward: [(0, '0.063')] [2024-06-18 02:10:34,991][12883] Updated weights for policy 0, policy_version 37021 (0.0041) [2024-06-18 02:10:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41507.7, 300 sec: 41543.1). Total num frames: 606617600. Throughput: 0: 41695.5. Samples: 606743680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:10:36,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 02:10:37,030][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037026_606633984.pth... [2024-06-18 02:10:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036418_596672512.pth [2024-06-18 02:10:39,159][12883] Updated weights for policy 0, policy_version 37031 (0.0044) [2024-06-18 02:10:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 40961.6, 300 sec: 41598.7). Total num frames: 606830592. Throughput: 0: 41622.4. Samples: 606990980. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 02:10:41,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 02:10:42,744][12883] Updated weights for policy 0, policy_version 37041 (0.0031) [2024-06-18 02:10:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41543.6). Total num frames: 607027200. Throughput: 0: 41608.4. Samples: 607114420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 02:10:46,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 02:10:47,136][12883] Updated weights for policy 0, policy_version 37051 (0.0041) [2024-06-18 02:10:50,741][12883] Updated weights for policy 0, policy_version 37061 (0.0033) [2024-06-18 02:10:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 607240192. Throughput: 0: 41808.9. Samples: 607366640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 02:10:51,994][12645] Avg episode reward: [(0, '0.072')] [2024-06-18 02:10:54,912][12883] Updated weights for policy 0, policy_version 37071 (0.0051) [2024-06-18 02:10:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 607469568. Throughput: 0: 41737.4. Samples: 607614300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 02:10:56,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 02:10:58,570][12883] Updated weights for policy 0, policy_version 37081 (0.0054) [2024-06-18 02:11:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 607666176. Throughput: 0: 41734.7. Samples: 607741880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 02:11:01,994][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 02:11:02,740][12883] Updated weights for policy 0, policy_version 37091 (0.0058) [2024-06-18 02:11:06,573][12883] Updated weights for policy 0, policy_version 37101 (0.0036) [2024-06-18 02:11:06,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42050.6, 300 sec: 41653.9). 
Total num frames: 607895552. Throughput: 0: 41714.7. Samples: 607991320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 02:11:06,997][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 02:11:10,786][12883] Updated weights for policy 0, policy_version 37111 (0.0036) [2024-06-18 02:11:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 608092160. Throughput: 0: 41684.7. Samples: 608240940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 02:11:11,994][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 02:11:14,121][12883] Updated weights for policy 0, policy_version 37121 (0.0035) [2024-06-18 02:11:16,994][12645] Fps is (10 sec: 37692.0, 60 sec: 41233.1, 300 sec: 41487.9). Total num frames: 608272384. Throughput: 0: 41566.4. Samples: 608366100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 02:11:16,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:11:18,963][12883] Updated weights for policy 0, policy_version 37131 (0.0037) [2024-06-18 02:11:21,823][12883] Updated weights for policy 0, policy_version 37141 (0.0037) [2024-06-18 02:11:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 608518144. Throughput: 0: 41404.9. Samples: 608606900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 02:11:21,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 02:11:26,233][12862] Signal inference workers to stop experience collection... (8600 times) [2024-06-18 02:11:26,233][12862] Signal inference workers to resume experience collection... 
(8600 times) [2024-06-18 02:11:26,258][12883] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-06-18 02:11:26,258][12883] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-06-18 02:11:26,961][12883] Updated weights for policy 0, policy_version 37151 (0.0045) [2024-06-18 02:11:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41487.9). Total num frames: 608681984. Throughput: 0: 41789.8. Samples: 608871520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 02:11:26,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:11:29,993][12883] Updated weights for policy 0, policy_version 37161 (0.0041) [2024-06-18 02:11:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 608911360. Throughput: 0: 41675.2. Samples: 608989800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 02:11:31,994][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 02:11:34,699][12883] Updated weights for policy 0, policy_version 37171 (0.0046) [2024-06-18 02:11:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 609124352. Throughput: 0: 41581.6. Samples: 609237820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 02:11:36,995][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:11:37,842][12883] Updated weights for policy 0, policy_version 37181 (0.0031) [2024-06-18 02:11:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 609304576. Throughput: 0: 41777.4. Samples: 609494280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 02:11:41,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 02:11:42,384][12883] Updated weights for policy 0, policy_version 37191 (0.0036) [2024-06-18 02:11:45,601][12883] Updated weights for policy 0, policy_version 37201 (0.0026) [2024-06-18 02:11:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 41543.5). 
Total num frames: 609533952. Throughput: 0: 41480.5. Samples: 609608500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 02:11:46,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:11:50,225][12883] Updated weights for policy 0, policy_version 37211 (0.0042) [2024-06-18 02:11:51,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 609763328. Throughput: 0: 41662.6. Samples: 609866040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 02:11:51,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 02:11:53,981][12883] Updated weights for policy 0, policy_version 37221 (0.0034) [2024-06-18 02:11:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 609927168. Throughput: 0: 41725.4. Samples: 610118580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:11:56,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 02:11:57,005][12862] Saving new best policy, reward=0.110! [2024-06-18 02:11:58,210][12883] Updated weights for policy 0, policy_version 37231 (0.0035) [2024-06-18 02:12:01,668][12883] Updated weights for policy 0, policy_version 37241 (0.0029) [2024-06-18 02:12:01,996][12645] Fps is (10 sec: 39312.8, 60 sec: 41504.6, 300 sec: 41542.8). Total num frames: 610156544. Throughput: 0: 41559.7. Samples: 610236380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:12:01,996][12645] Avg episode reward: [(0, '0.119')] [2024-06-18 02:12:02,048][12862] Saving new best policy, reward=0.119! [2024-06-18 02:12:05,742][12883] Updated weights for policy 0, policy_version 37251 (0.0045) [2024-06-18 02:12:06,994][12645] Fps is (10 sec: 47513.6, 60 sec: 41780.8, 300 sec: 41654.5). Total num frames: 610402304. Throughput: 0: 42003.6. Samples: 610497060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:12:06,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:12:09,357][12883] Updated weights for policy 0, policy_version 37261 (0.0033) [2024-06-18 02:12:11,994][12645] Fps is (10 sec: 40969.0, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 610566144. Throughput: 0: 41534.2. Samples: 610740560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:12:11,994][12645] Avg episode reward: [(0, '0.053')] [2024-06-18 02:12:13,567][12883] Updated weights for policy 0, policy_version 37271 (0.0034) [2024-06-18 02:12:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 41543.5). Total num frames: 610795520. Throughput: 0: 41560.0. Samples: 610860000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:12:16,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 02:12:17,061][12883] Updated weights for policy 0, policy_version 37281 (0.0039) [2024-06-18 02:12:21,503][12883] Updated weights for policy 0, policy_version 37291 (0.0042) [2024-06-18 02:12:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 610992128. Throughput: 0: 41736.6. Samples: 611115960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:12:21,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 02:12:25,307][12883] Updated weights for policy 0, policy_version 37301 (0.0040) [2024-06-18 02:12:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 611188736. Throughput: 0: 41536.8. Samples: 611363440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:12:26,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:12:29,466][12883] Updated weights for policy 0, policy_version 37311 (0.0040) [2024-06-18 02:12:31,995][12645] Fps is (10 sec: 42592.4, 60 sec: 41778.2, 300 sec: 41543.0). Total num frames: 611418112. Throughput: 0: 41689.8. Samples: 611484600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:12:31,995][12645] Avg episode reward: [(0, '0.038')] [2024-06-18 02:12:32,968][12883] Updated weights for policy 0, policy_version 37321 (0.0029) [2024-06-18 02:12:36,355][12862] Signal inference workers to stop experience collection... (8650 times) [2024-06-18 02:12:36,356][12862] Signal inference workers to resume experience collection... (8650 times) [2024-06-18 02:12:36,373][12883] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-06-18 02:12:36,373][12883] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-06-18 02:12:36,964][12883] Updated weights for policy 0, policy_version 37331 (0.0045) [2024-06-18 02:12:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 41487.6). Total num frames: 611631104. Throughput: 0: 41520.9. Samples: 611734480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:12:36,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 02:12:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037331_611631104.pth... [2024-06-18 02:12:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000036723_601669632.pth [2024-06-18 02:12:41,506][12883] Updated weights for policy 0, policy_version 37341 (0.0037) [2024-06-18 02:12:41,994][12645] Fps is (10 sec: 39327.3, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 611811328. Throughput: 0: 41436.5. Samples: 611983220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:12:41,994][12645] Avg episode reward: [(0, '0.103')] [2024-06-18 02:12:44,618][12883] Updated weights for policy 0, policy_version 37351 (0.0026) [2024-06-18 02:12:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 612040704. Throughput: 0: 41485.6. Samples: 612103140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:12:46,994][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 02:12:49,072][12883] Updated weights for policy 0, policy_version 37361 (0.0041) [2024-06-18 02:12:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 612237312. Throughput: 0: 41313.5. Samples: 612356160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:12:51,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 02:12:52,390][12883] Updated weights for policy 0, policy_version 37371 (0.0030) [2024-06-18 02:12:56,989][12883] Updated weights for policy 0, policy_version 37381 (0.0030) [2024-06-18 02:12:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41488.4). Total num frames: 612450304. Throughput: 0: 41294.7. Samples: 612598820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:12:56,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 02:13:00,383][12883] Updated weights for policy 0, policy_version 37391 (0.0030) [2024-06-18 02:13:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41507.7, 300 sec: 41432.1). Total num frames: 612646912. Throughput: 0: 41517.8. Samples: 612728300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:13:01,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:13:04,676][12883] Updated weights for policy 0, policy_version 37401 (0.0039) [2024-06-18 02:13:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 41487.6). Total num frames: 612843520. Throughput: 0: 41301.8. Samples: 612974540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 02:13:06,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 02:13:08,361][12883] Updated weights for policy 0, policy_version 37411 (0.0044) [2024-06-18 02:13:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 41487.6). Total num frames: 613056512. Throughput: 0: 41239.0. Samples: 613219200. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 02:13:11,994][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 02:13:12,995][12883] Updated weights for policy 0, policy_version 37421 (0.0037) [2024-06-18 02:13:16,639][12883] Updated weights for policy 0, policy_version 37431 (0.0026) [2024-06-18 02:13:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41432.4). Total num frames: 613269504. Throughput: 0: 41384.8. Samples: 613346860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 02:13:16,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:13:20,914][12883] Updated weights for policy 0, policy_version 37441 (0.0045) [2024-06-18 02:13:21,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 613482496. Throughput: 0: 41370.3. Samples: 613596140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 02:13:21,994][12645] Avg episode reward: [(0, '0.072')] [2024-06-18 02:13:24,687][12883] Updated weights for policy 0, policy_version 37451 (0.0038) [2024-06-18 02:13:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 613695488. Throughput: 0: 41322.6. Samples: 613842740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 02:13:26,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:13:28,624][12883] Updated weights for policy 0, policy_version 37461 (0.0044) [2024-06-18 02:13:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 40960.9, 300 sec: 41376.5). Total num frames: 613875712. Throughput: 0: 41426.1. Samples: 613967320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 02:13:31,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 02:13:32,766][12883] Updated weights for policy 0, policy_version 37471 (0.0029) [2024-06-18 02:13:36,455][12883] Updated weights for policy 0, policy_version 37481 (0.0036) [2024-06-18 02:13:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41543.2). 
Total num frames: 614105088. Throughput: 0: 41232.3. Samples: 614211620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 02:13:36,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 02:13:40,736][12883] Updated weights for policy 0, policy_version 37491 (0.0034) [2024-06-18 02:13:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 614318080. Throughput: 0: 41510.5. Samples: 614466800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 02:13:41,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 02:13:44,143][12883] Updated weights for policy 0, policy_version 37501 (0.0028) [2024-06-18 02:13:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 614514688. Throughput: 0: 41431.5. Samples: 614592720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 02:13:46,994][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 02:13:48,337][12883] Updated weights for policy 0, policy_version 37511 (0.0039) [2024-06-18 02:13:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.0, 300 sec: 41543.1). Total num frames: 614727680. Throughput: 0: 41507.4. Samples: 614842380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 02:13:51,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 02:13:52,337][12862] Signal inference workers to stop experience collection... (8700 times) [2024-06-18 02:13:52,388][12862] Signal inference workers to resume experience collection... 
(8700 times) [2024-06-18 02:13:52,390][12883] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-06-18 02:13:52,396][12883] Updated weights for policy 0, policy_version 37521 (0.0045) [2024-06-18 02:13:52,407][12883] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-06-18 02:13:56,812][12883] Updated weights for policy 0, policy_version 37531 (0.0033) [2024-06-18 02:13:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 40959.9, 300 sec: 41321.4). Total num frames: 614907904. Throughput: 0: 41740.1. Samples: 615097500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:13:56,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:14:00,062][12883] Updated weights for policy 0, policy_version 37541 (0.0046) [2024-06-18 02:14:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 615153664. Throughput: 0: 41540.5. Samples: 615216180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:14:01,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:14:04,874][12883] Updated weights for policy 0, policy_version 37551 (0.0048) [2024-06-18 02:14:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 615350272. Throughput: 0: 41525.2. Samples: 615464780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:14:06,994][12645] Avg episode reward: [(0, '0.036')] [2024-06-18 02:14:07,705][12883] Updated weights for policy 0, policy_version 37561 (0.0031) [2024-06-18 02:14:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 615530496. Throughput: 0: 41640.0. Samples: 615716540. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:14:11,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:14:12,802][12883] Updated weights for policy 0, policy_version 37571 (0.0032) [2024-06-18 02:14:15,547][12883] Updated weights for policy 0, policy_version 37581 (0.0039) [2024-06-18 02:14:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41599.0). Total num frames: 615776256. Throughput: 0: 41647.7. Samples: 615841460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:14:16,994][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 02:14:20,551][12883] Updated weights for policy 0, policy_version 37591 (0.0033) [2024-06-18 02:14:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41506.0, 300 sec: 41432.1). Total num frames: 615972864. Throughput: 0: 41952.8. Samples: 616099500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:14:21,994][12645] Avg episode reward: [(0, '0.062')] [2024-06-18 02:14:23,422][12883] Updated weights for policy 0, policy_version 37601 (0.0048) [2024-06-18 02:14:26,994][12645] Fps is (10 sec: 37682.3, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 616153088. Throughput: 0: 41766.6. Samples: 616346300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:14:26,995][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:14:28,487][12883] Updated weights for policy 0, policy_version 37611 (0.0037) [2024-06-18 02:14:31,060][12883] Updated weights for policy 0, policy_version 37621 (0.0035) [2024-06-18 02:14:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 41654.5). Total num frames: 616415232. Throughput: 0: 41741.3. Samples: 616471080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:14:31,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:14:36,109][12883] Updated weights for policy 0, policy_version 37631 (0.0049) [2024-06-18 02:14:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41506.1, 300 sec: 41432.4). 
Total num frames: 616595456. Throughput: 0: 41972.9. Samples: 616731160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:14:36,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:14:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037635_616611840.pth... [2024-06-18 02:14:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037026_606633984.pth [2024-06-18 02:14:39,116][12883] Updated weights for policy 0, policy_version 37641 (0.0032) [2024-06-18 02:14:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 616808448. Throughput: 0: 41705.0. Samples: 616974220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:14:41,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 02:14:43,838][12883] Updated weights for policy 0, policy_version 37651 (0.0036) [2024-06-18 02:14:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 617021440. Throughput: 0: 41915.5. Samples: 617102380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 02:14:46,994][12645] Avg episode reward: [(0, '0.097')] [2024-06-18 02:14:47,172][12883] Updated weights for policy 0, policy_version 37661 (0.0031) [2024-06-18 02:14:51,679][12883] Updated weights for policy 0, policy_version 37671 (0.0041) [2024-06-18 02:14:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 617201664. Throughput: 0: 42016.5. Samples: 617355520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 02:14:51,994][12645] Avg episode reward: [(0, '0.067')] [2024-06-18 02:14:54,805][12883] Updated weights for policy 0, policy_version 37681 (0.0044) [2024-06-18 02:14:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 41709.8). Total num frames: 617463808. Throughput: 0: 41888.0. Samples: 617601500. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 02:14:56,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 02:14:59,310][12883] Updated weights for policy 0, policy_version 37691 (0.0042) [2024-06-18 02:15:01,994][12645] Fps is (10 sec: 45874.8, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 617660416. Throughput: 0: 42186.1. Samples: 617739840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 02:15:01,994][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 02:15:02,359][12883] Updated weights for policy 0, policy_version 37701 (0.0045) [2024-06-18 02:15:06,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 617840640. Throughput: 0: 41985.9. Samples: 617988860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 02:15:06,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 02:15:07,267][12883] Updated weights for policy 0, policy_version 37711 (0.0030) [2024-06-18 02:15:07,938][12862] Signal inference workers to stop experience collection... (8750 times) [2024-06-18 02:15:07,941][12862] Signal inference workers to resume experience collection... (8750 times) [2024-06-18 02:15:07,956][12883] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-06-18 02:15:07,956][12883] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-06-18 02:15:10,623][12883] Updated weights for policy 0, policy_version 37721 (0.0042) [2024-06-18 02:15:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 41709.8). Total num frames: 618102784. Throughput: 0: 41826.7. Samples: 618228500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 02:15:11,994][12645] Avg episode reward: [(0, '0.034')] [2024-06-18 02:15:14,901][12883] Updated weights for policy 0, policy_version 37731 (0.0039) [2024-06-18 02:15:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 618283008. Throughput: 0: 42175.6. 
Samples: 618368980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 02:15:16,994][12645] Avg episode reward: [(0, '0.053')] [2024-06-18 02:15:18,249][12883] Updated weights for policy 0, policy_version 37741 (0.0044) [2024-06-18 02:15:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 618479616. Throughput: 0: 41810.6. Samples: 618612640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 02:15:21,994][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 02:15:22,477][12883] Updated weights for policy 0, policy_version 37751 (0.0036) [2024-06-18 02:15:25,877][12883] Updated weights for policy 0, policy_version 37761 (0.0039) [2024-06-18 02:15:26,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.6, 300 sec: 41765.3). Total num frames: 618741760. Throughput: 0: 41878.0. Samples: 618858740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 02:15:26,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 02:15:30,298][12883] Updated weights for policy 0, policy_version 37771 (0.0034) [2024-06-18 02:15:31,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 618905600. Throughput: 0: 42050.0. Samples: 618994640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 02:15:31,995][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 02:15:33,722][12883] Updated weights for policy 0, policy_version 37781 (0.0032) [2024-06-18 02:15:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 619118592. Throughput: 0: 41864.3. Samples: 619239420. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 02:15:36,994][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 02:15:38,073][12883] Updated weights for policy 0, policy_version 37791 (0.0043) [2024-06-18 02:15:41,473][12883] Updated weights for policy 0, policy_version 37801 (0.0034) [2024-06-18 02:15:41,994][12645] Fps is (10 sec: 44238.0, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 619347968. Throughput: 0: 41988.0. Samples: 619490960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 02:15:41,994][12645] Avg episode reward: [(0, '0.071')] [2024-06-18 02:15:45,624][12883] Updated weights for policy 0, policy_version 37811 (0.0041) [2024-06-18 02:15:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 619528192. Throughput: 0: 41793.3. Samples: 619620540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 02:15:46,995][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 02:15:49,249][12883] Updated weights for policy 0, policy_version 37821 (0.0039) [2024-06-18 02:15:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 41709.8). Total num frames: 619773952. Throughput: 0: 41815.5. Samples: 619870560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 02:15:51,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 02:15:53,211][12883] Updated weights for policy 0, policy_version 37831 (0.0034) [2024-06-18 02:15:57,000][12645] Fps is (10 sec: 44209.7, 60 sec: 41774.9, 300 sec: 41708.9). Total num frames: 619970560. Throughput: 0: 42310.7. Samples: 620132740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 02:15:57,000][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 02:15:57,120][12883] Updated weights for policy 0, policy_version 37841 (0.0032) [2024-06-18 02:16:01,302][12883] Updated weights for policy 0, policy_version 37851 (0.0038) [2024-06-18 02:16:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41599.0). 
Total num frames: 620167168. Throughput: 0: 41746.7. Samples: 620247580. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 02:16:01,994][12645] Avg episode reward: [(0, '0.103')] [2024-06-18 02:16:05,076][12883] Updated weights for policy 0, policy_version 37861 (0.0040) [2024-06-18 02:16:06,994][12645] Fps is (10 sec: 42625.3, 60 sec: 42598.5, 300 sec: 41709.8). Total num frames: 620396544. Throughput: 0: 41847.2. Samples: 620495760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 02:16:06,994][12645] Avg episode reward: [(0, '0.049')] [2024-06-18 02:16:09,181][12883] Updated weights for policy 0, policy_version 37871 (0.0039) [2024-06-18 02:16:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 620576768. Throughput: 0: 42189.4. Samples: 620757260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 02:16:11,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 02:16:13,100][12883] Updated weights for policy 0, policy_version 37881 (0.0032) [2024-06-18 02:16:16,994][12645] Fps is (10 sec: 39320.9, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 620789760. Throughput: 0: 41879.7. Samples: 620879220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 02:16:16,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:16:17,031][12883] Updated weights for policy 0, policy_version 37891 (0.0042) [2024-06-18 02:16:20,677][12883] Updated weights for policy 0, policy_version 37901 (0.0028) [2024-06-18 02:16:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 621002752. Throughput: 0: 42068.1. Samples: 621132480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 02:16:21,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 02:16:23,709][12862] Signal inference workers to stop experience collection... 
(8800 times) [2024-06-18 02:16:23,740][12883] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-06-18 02:16:23,762][12862] Signal inference workers to resume experience collection... (8800 times) [2024-06-18 02:16:23,768][12883] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-06-18 02:16:24,829][12883] Updated weights for policy 0, policy_version 37911 (0.0032) [2024-06-18 02:16:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 621215744. Throughput: 0: 42037.3. Samples: 621382640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 02:16:26,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:16:28,845][12883] Updated weights for policy 0, policy_version 37921 (0.0037) [2024-06-18 02:16:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.5, 300 sec: 41765.3). Total num frames: 621445120. Throughput: 0: 41944.1. Samples: 621508020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 02:16:31,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 02:16:32,357][12883] Updated weights for policy 0, policy_version 37931 (0.0038) [2024-06-18 02:16:36,575][12883] Updated weights for policy 0, policy_version 37941 (0.0038) [2024-06-18 02:16:36,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 621625344. Throughput: 0: 41958.9. Samples: 621758720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 02:16:36,995][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:16:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037941_621625344.pth... [2024-06-18 02:16:37,098][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037331_611631104.pth [2024-06-18 02:16:40,330][12883] Updated weights for policy 0, policy_version 37951 (0.0028) [2024-06-18 02:16:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41765.3). 
Total num frames: 621854720. Throughput: 0: 41596.4. Samples: 622004320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 02:16:41,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:16:44,283][12883] Updated weights for policy 0, policy_version 37961 (0.0031) [2024-06-18 02:16:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 622067712. Throughput: 0: 41936.8. Samples: 622134740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 02:16:46,994][12645] Avg episode reward: [(0, '0.044')] [2024-06-18 02:16:48,084][12883] Updated weights for policy 0, policy_version 37971 (0.0040) [2024-06-18 02:16:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 622264320. Throughput: 0: 42091.1. Samples: 622389860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 02:16:51,994][12645] Avg episode reward: [(0, '0.049')] [2024-06-18 02:16:52,672][12883] Updated weights for policy 0, policy_version 37981 (0.0037) [2024-06-18 02:16:55,954][12883] Updated weights for policy 0, policy_version 37991 (0.0028) [2024-06-18 02:16:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41783.5, 300 sec: 41765.6). Total num frames: 622477312. Throughput: 0: 41723.5. Samples: 622634820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 02:16:56,994][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 02:17:00,343][12883] Updated weights for policy 0, policy_version 38001 (0.0034) [2024-06-18 02:17:01,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42050.7, 300 sec: 41653.9). Total num frames: 622690304. Throughput: 0: 41778.9. Samples: 622759360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 02:17:01,996][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 02:17:03,519][12883] Updated weights for policy 0, policy_version 38011 (0.0042) [2024-06-18 02:17:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 622919680. 
Throughput: 0: 41827.0. Samples: 623014700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 02:17:06,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:17:08,046][12883] Updated weights for policy 0, policy_version 38021 (0.0034) [2024-06-18 02:17:11,747][12883] Updated weights for policy 0, policy_version 38031 (0.0035) [2024-06-18 02:17:11,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 623099904. Throughput: 0: 41841.4. Samples: 623265500. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 02:17:11,997][12645] Avg episode reward: [(0, '0.036')] [2024-06-18 02:17:15,975][12883] Updated weights for policy 0, policy_version 38041 (0.0036) [2024-06-18 02:17:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 623312896. Throughput: 0: 41652.0. Samples: 623382360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 02:17:16,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 02:17:19,714][12883] Updated weights for policy 0, policy_version 38051 (0.0044) [2024-06-18 02:17:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 623509504. Throughput: 0: 41737.1. Samples: 623636880. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 02:17:21,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:17:23,643][12883] Updated weights for policy 0, policy_version 38061 (0.0051) [2024-06-18 02:17:26,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42050.7, 300 sec: 41765.2). Total num frames: 623738880. Throughput: 0: 41841.1. Samples: 623887260. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 02:17:26,996][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 02:17:27,327][12883] Updated weights for policy 0, policy_version 38071 (0.0033) [2024-06-18 02:17:31,459][12883] Updated weights for policy 0, policy_version 38081 (0.0027) [2024-06-18 02:17:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 623919104. Throughput: 0: 41908.1. Samples: 624020600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 02:17:31,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 02:17:35,455][12883] Updated weights for policy 0, policy_version 38091 (0.0045) [2024-06-18 02:17:36,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.4, 300 sec: 41820.8). Total num frames: 624148480. Throughput: 0: 41710.6. Samples: 624266840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:17:36,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 02:17:39,242][12883] Updated weights for policy 0, policy_version 38101 (0.0035) [2024-06-18 02:17:41,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 624377856. Throughput: 0: 41823.5. Samples: 624516880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:17:41,994][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 02:17:43,255][12883] Updated weights for policy 0, policy_version 38111 (0.0032) [2024-06-18 02:17:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 624558080. Throughput: 0: 41843.4. Samples: 624642220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:17:46,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 02:17:47,100][12883] Updated weights for policy 0, policy_version 38121 (0.0031) [2024-06-18 02:17:49,828][12862] Signal inference workers to stop experience collection... 
(8850 times) [2024-06-18 02:17:49,853][12883] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-06-18 02:17:49,941][12862] Signal inference workers to resume experience collection... (8850 times) [2024-06-18 02:17:49,941][12883] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-06-18 02:17:50,919][12883] Updated weights for policy 0, policy_version 38131 (0.0039) [2024-06-18 02:17:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 624771072. Throughput: 0: 41803.1. Samples: 624895840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:17:51,994][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 02:17:54,891][12883] Updated weights for policy 0, policy_version 38141 (0.0037) [2024-06-18 02:17:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 625016832. Throughput: 0: 41828.4. Samples: 625147780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:17:56,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 02:17:58,463][12883] Updated weights for policy 0, policy_version 38151 (0.0039) [2024-06-18 02:18:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41780.7, 300 sec: 41876.4). Total num frames: 625197056. Throughput: 0: 42200.8. Samples: 625281400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 02:18:01,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:18:02,550][12883] Updated weights for policy 0, policy_version 38161 (0.0032) [2024-06-18 02:18:05,977][12883] Updated weights for policy 0, policy_version 38171 (0.0047) [2024-06-18 02:18:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 625410048. Throughput: 0: 42043.0. Samples: 625528820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 02:18:06,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 02:18:10,411][12883] Updated weights for policy 0, policy_version 38181 (0.0043) [2024-06-18 02:18:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 625639424. Throughput: 0: 42036.3. Samples: 625778800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 02:18:11,994][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 02:18:13,665][12883] Updated weights for policy 0, policy_version 38191 (0.0035) [2024-06-18 02:18:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 625819648. Throughput: 0: 42029.8. Samples: 625911940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 02:18:16,994][12645] Avg episode reward: [(0, '0.049')] [2024-06-18 02:18:17,985][12883] Updated weights for policy 0, policy_version 38201 (0.0031) [2024-06-18 02:18:21,507][12883] Updated weights for policy 0, policy_version 38211 (0.0032) [2024-06-18 02:18:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 626049024. Throughput: 0: 42092.0. Samples: 626160980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 02:18:21,994][12645] Avg episode reward: [(0, '0.072')] [2024-06-18 02:18:25,930][12883] Updated weights for policy 0, policy_version 38221 (0.0043) [2024-06-18 02:18:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42053.8, 300 sec: 41987.5). Total num frames: 626262016. Throughput: 0: 42238.3. Samples: 626417600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 02:18:26,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 02:18:29,642][12883] Updated weights for policy 0, policy_version 38231 (0.0026) [2024-06-18 02:18:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 626442240. Throughput: 0: 42317.8. Samples: 626546520. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 02:18:31,994][12645] Avg episode reward: [(0, '0.067')] [2024-06-18 02:18:33,498][12883] Updated weights for policy 0, policy_version 38241 (0.0031) [2024-06-18 02:18:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 626688000. Throughput: 0: 42168.0. Samples: 626793400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 02:18:36,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 02:18:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038250_626688000.pth... [2024-06-18 02:18:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037635_616611840.pth [2024-06-18 02:18:37,453][12883] Updated weights for policy 0, policy_version 38251 (0.0032) [2024-06-18 02:18:41,206][12883] Updated weights for policy 0, policy_version 38261 (0.0037) [2024-06-18 02:18:41,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 626900992. Throughput: 0: 42210.7. Samples: 627047260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 02:18:41,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 02:18:45,191][12883] Updated weights for policy 0, policy_version 38271 (0.0033) [2024-06-18 02:18:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 627097600. Throughput: 0: 41942.7. Samples: 627168820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 02:18:46,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:18:49,419][12883] Updated weights for policy 0, policy_version 38281 (0.0046) [2024-06-18 02:18:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 627343360. Throughput: 0: 42180.5. Samples: 627426940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:18:51,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:18:53,232][12883] Updated weights for policy 0, policy_version 38291 (0.0034) [2024-06-18 02:18:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 627507200. Throughput: 0: 42288.8. Samples: 627681800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:18:56,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 02:18:57,045][12883] Updated weights for policy 0, policy_version 38301 (0.0037) [2024-06-18 02:19:00,897][12883] Updated weights for policy 0, policy_version 38311 (0.0038) [2024-06-18 02:19:01,994][12645] Fps is (10 sec: 39319.5, 60 sec: 42325.0, 300 sec: 41987.4). Total num frames: 627736576. Throughput: 0: 42080.3. Samples: 627805580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:19:01,995][12645] Avg episode reward: [(0, '0.082')] [2024-06-18 02:19:04,630][12883] Updated weights for policy 0, policy_version 38321 (0.0041) [2024-06-18 02:19:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 627949568. Throughput: 0: 42228.4. Samples: 628061260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:19:06,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 02:19:07,464][12862] Signal inference workers to stop experience collection... (8900 times) [2024-06-18 02:19:07,464][12862] Signal inference workers to resume experience collection... (8900 times) [2024-06-18 02:19:07,480][12883] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-06-18 02:19:07,480][12883] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-06-18 02:19:08,436][12883] Updated weights for policy 0, policy_version 38331 (0.0036) [2024-06-18 02:19:12,000][12645] Fps is (10 sec: 40936.9, 60 sec: 41774.9, 300 sec: 41931.0). Total num frames: 628146176. Throughput: 0: 42223.6. 
Samples: 628317920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:19:12,000][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 02:19:12,449][12883] Updated weights for policy 0, policy_version 38341 (0.0038) [2024-06-18 02:19:16,098][12883] Updated weights for policy 0, policy_version 38351 (0.0027) [2024-06-18 02:19:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 628375552. Throughput: 0: 42076.4. Samples: 628439960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:19:16,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 02:19:20,241][12883] Updated weights for policy 0, policy_version 38361 (0.0034) [2024-06-18 02:19:21,994][12645] Fps is (10 sec: 42624.7, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 628572160. Throughput: 0: 42219.1. Samples: 628693260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:19:21,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 02:19:23,769][12883] Updated weights for policy 0, policy_version 38371 (0.0026) [2024-06-18 02:19:26,996][12645] Fps is (10 sec: 39313.0, 60 sec: 41777.7, 300 sec: 41876.1). Total num frames: 628768768. Throughput: 0: 42182.4. Samples: 628945560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:19:26,996][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 02:19:28,170][12883] Updated weights for policy 0, policy_version 38381 (0.0044) [2024-06-18 02:19:31,555][12883] Updated weights for policy 0, policy_version 38391 (0.0031) [2024-06-18 02:19:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 628998144. Throughput: 0: 42312.0. Samples: 629072860. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:19:31,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 02:19:35,695][12883] Updated weights for policy 0, policy_version 38401 (0.0033) [2024-06-18 02:19:36,994][12645] Fps is (10 sec: 42607.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 629194752. Throughput: 0: 42156.4. Samples: 629323980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:19:36,994][12645] Avg episode reward: [(0, '0.036')] [2024-06-18 02:19:39,597][12883] Updated weights for policy 0, policy_version 38411 (0.0042) [2024-06-18 02:19:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 629407744. Throughput: 0: 41980.4. Samples: 629570920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:19:41,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 02:19:43,547][12883] Updated weights for policy 0, policy_version 38421 (0.0040) [2024-06-18 02:19:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 629620736. Throughput: 0: 42025.4. Samples: 629696700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:19:47,000][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 02:19:47,642][12883] Updated weights for policy 0, policy_version 38431 (0.0047) [2024-06-18 02:19:51,687][12883] Updated weights for policy 0, policy_version 38441 (0.0029) [2024-06-18 02:19:51,996][12645] Fps is (10 sec: 40950.9, 60 sec: 41231.5, 300 sec: 41876.1). Total num frames: 629817344. Throughput: 0: 41835.3. Samples: 629943940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:19:51,997][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 02:19:55,492][12883] Updated weights for policy 0, policy_version 38451 (0.0040) [2024-06-18 02:19:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 630046720. Throughput: 0: 41644.0. Samples: 630191640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:19:56,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 02:19:59,442][12883] Updated weights for policy 0, policy_version 38461 (0.0030) [2024-06-18 02:20:01,994][12645] Fps is (10 sec: 40969.0, 60 sec: 41506.5, 300 sec: 41987.5). Total num frames: 630226944. Throughput: 0: 41737.3. Samples: 630318140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:20:01,995][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 02:20:03,548][12883] Updated weights for policy 0, policy_version 38471 (0.0039) [2024-06-18 02:20:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 630439936. Throughput: 0: 41625.8. Samples: 630566420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:20:06,994][12645] Avg episode reward: [(0, '0.044')] [2024-06-18 02:20:07,536][12883] Updated weights for policy 0, policy_version 38481 (0.0042) [2024-06-18 02:20:11,801][12883] Updated weights for policy 0, policy_version 38491 (0.0032) [2024-06-18 02:20:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41783.5, 300 sec: 41931.9). Total num frames: 630652928. Throughput: 0: 41638.0. Samples: 630819180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:20:11,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:20:15,301][12883] Updated weights for policy 0, policy_version 38501 (0.0039) [2024-06-18 02:20:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 630882304. Throughput: 0: 41475.1. Samples: 630939240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:20:16,994][12645] Avg episode reward: [(0, '0.105')] [2024-06-18 02:20:19,481][12883] Updated weights for policy 0, policy_version 38511 (0.0033) [2024-06-18 02:20:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 631062528. Throughput: 0: 41433.9. Samples: 631188500. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:20:21,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:20:23,098][12883] Updated weights for policy 0, policy_version 38521 (0.0028) [2024-06-18 02:20:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41780.6, 300 sec: 41932.0). Total num frames: 631275520. Throughput: 0: 41554.1. Samples: 631440860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 02:20:26,994][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 02:20:27,249][12883] Updated weights for policy 0, policy_version 38531 (0.0038) [2024-06-18 02:20:30,781][12883] Updated weights for policy 0, policy_version 38541 (0.0039) [2024-06-18 02:20:31,994][12645] Fps is (10 sec: 44235.9, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 631504896. Throughput: 0: 41694.6. Samples: 631572960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 02:20:31,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 02:20:31,995][12862] Saving new best policy, reward=0.130! [2024-06-18 02:20:35,282][12883] Updated weights for policy 0, policy_version 38551 (0.0026) [2024-06-18 02:20:36,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 631685120. Throughput: 0: 41768.8. Samples: 631823440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 02:20:36,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 02:20:37,083][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038556_631701504.pth... [2024-06-18 02:20:37,145][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000037941_621625344.pth [2024-06-18 02:20:37,339][12862] Signal inference workers to stop experience collection... (8950 times) [2024-06-18 02:20:37,339][12862] Signal inference workers to resume experience collection... 
(8950 times) [2024-06-18 02:20:37,387][12883] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-06-18 02:20:37,387][12883] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-06-18 02:20:38,547][12883] Updated weights for policy 0, policy_version 38561 (0.0030) [2024-06-18 02:20:42,000][12645] Fps is (10 sec: 40934.8, 60 sec: 41774.9, 300 sec: 41986.6). Total num frames: 631914496. Throughput: 0: 41688.8. Samples: 632067900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 02:20:42,000][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 02:20:43,103][12883] Updated weights for policy 0, policy_version 38571 (0.0042) [2024-06-18 02:20:46,324][12883] Updated weights for policy 0, policy_version 38581 (0.0031) [2024-06-18 02:20:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 632127488. Throughput: 0: 41856.0. Samples: 632201660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 02:20:46,994][12645] Avg episode reward: [(0, '0.031')] [2024-06-18 02:20:50,893][12883] Updated weights for policy 0, policy_version 38591 (0.0039) [2024-06-18 02:20:51,994][12645] Fps is (10 sec: 40985.9, 60 sec: 41780.8, 300 sec: 41877.3). Total num frames: 632324096. Throughput: 0: 42024.0. Samples: 632457500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:20:51,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 02:20:53,957][12883] Updated weights for policy 0, policy_version 38601 (0.0028) [2024-06-18 02:20:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 632553472. Throughput: 0: 41804.8. Samples: 632700400. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:20:56,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 02:20:58,781][12883] Updated weights for policy 0, policy_version 38611 (0.0041) [2024-06-18 02:21:01,942][12883] Updated weights for policy 0, policy_version 38621 (0.0032) [2024-06-18 02:21:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 632766464. Throughput: 0: 42077.4. Samples: 632832720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:21:01,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 02:21:06,451][12883] Updated weights for policy 0, policy_version 38631 (0.0030) [2024-06-18 02:21:06,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 632946688. Throughput: 0: 42093.8. Samples: 633082720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:21:06,994][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 02:21:09,968][12883] Updated weights for policy 0, policy_version 38641 (0.0036) [2024-06-18 02:21:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 633192448. Throughput: 0: 42040.6. Samples: 633332680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:21:11,994][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 02:21:14,154][12883] Updated weights for policy 0, policy_version 38651 (0.0026) [2024-06-18 02:21:16,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 633405440. Throughput: 0: 42020.0. Samples: 633463860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:21:16,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:21:17,713][12883] Updated weights for policy 0, policy_version 38661 (0.0037) [2024-06-18 02:21:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 41779.0, 300 sec: 41876.4). Total num frames: 633569280. Throughput: 0: 41953.6. Samples: 633711360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:21:21,994][12645] Avg episode reward: [(0, '0.014')] [2024-06-18 02:21:22,281][12883] Updated weights for policy 0, policy_version 38671 (0.0037) [2024-06-18 02:21:25,663][12883] Updated weights for policy 0, policy_version 38681 (0.0031) [2024-06-18 02:21:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 633798656. Throughput: 0: 42047.1. Samples: 633959760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:21:26,994][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 02:21:30,250][12883] Updated weights for policy 0, policy_version 38691 (0.0036) [2024-06-18 02:21:31,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42052.4, 300 sec: 42043.1). Total num frames: 634028032. Throughput: 0: 41936.2. Samples: 634088780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:21:31,994][12645] Avg episode reward: [(0, '0.034')] [2024-06-18 02:21:33,323][12883] Updated weights for policy 0, policy_version 38701 (0.0042) [2024-06-18 02:21:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 634208256. Throughput: 0: 41707.9. Samples: 634334360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:21:36,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:21:38,053][12883] Updated weights for policy 0, policy_version 38711 (0.0040) [2024-06-18 02:21:40,933][12883] Updated weights for policy 0, policy_version 38721 (0.0029) [2024-06-18 02:21:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41783.6, 300 sec: 41876.4). Total num frames: 634421248. Throughput: 0: 41980.5. Samples: 634589520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 02:21:41,994][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 02:21:45,761][12883] Updated weights for policy 0, policy_version 38731 (0.0045) [2024-06-18 02:21:46,705][12862] Signal inference workers to stop experience collection... 
(9000 times) [2024-06-18 02:21:46,705][12862] Signal inference workers to resume experience collection... (9000 times) [2024-06-18 02:21:46,751][12883] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-06-18 02:21:46,752][12883] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-06-18 02:21:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 634650624. Throughput: 0: 41905.4. Samples: 634718460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 02:21:46,994][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 02:21:48,899][12883] Updated weights for policy 0, policy_version 38741 (0.0046) [2024-06-18 02:21:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 634830848. Throughput: 0: 41805.2. Samples: 634963960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 02:21:51,994][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 02:21:53,508][12883] Updated weights for policy 0, policy_version 38751 (0.0046) [2024-06-18 02:21:56,586][12883] Updated weights for policy 0, policy_version 38761 (0.0030) [2024-06-18 02:21:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 41987.8). Total num frames: 635076608. Throughput: 0: 41780.1. Samples: 635212780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 02:21:56,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 02:22:01,231][12883] Updated weights for policy 0, policy_version 38771 (0.0036) [2024-06-18 02:22:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 635240448. Throughput: 0: 41741.8. Samples: 635342240. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 02:22:01,994][12645] Avg episode reward: [(0, '0.063')] [2024-06-18 02:22:04,242][12883] Updated weights for policy 0, policy_version 38781 (0.0034) [2024-06-18 02:22:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 635469824. Throughput: 0: 41856.2. Samples: 635594880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:22:06,994][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 02:22:09,153][12883] Updated weights for policy 0, policy_version 38791 (0.0027) [2024-06-18 02:22:11,843][12883] Updated weights for policy 0, policy_version 38801 (0.0038) [2024-06-18 02:22:11,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 635715584. Throughput: 0: 41726.6. Samples: 635837460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:22:11,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 02:22:16,963][12883] Updated weights for policy 0, policy_version 38811 (0.0045) [2024-06-18 02:22:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 635879424. Throughput: 0: 41750.1. Samples: 635967540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:22:16,994][12645] Avg episode reward: [(0, '0.036')] [2024-06-18 02:22:20,163][12883] Updated weights for policy 0, policy_version 38821 (0.0030) [2024-06-18 02:22:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 41987.8). Total num frames: 636125184. Throughput: 0: 41961.0. Samples: 636222600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:22:21,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 02:22:24,584][12883] Updated weights for policy 0, policy_version 38831 (0.0042) [2024-06-18 02:22:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 636321792. Throughput: 0: 41829.7. Samples: 636471860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:22:26,994][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 02:22:27,982][12883] Updated weights for policy 0, policy_version 38841 (0.0036) [2024-06-18 02:22:31,994][12645] Fps is (10 sec: 36044.9, 60 sec: 40960.0, 300 sec: 41820.9). Total num frames: 636485632. Throughput: 0: 41586.6. Samples: 636589860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 02:22:31,994][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 02:22:32,904][12883] Updated weights for policy 0, policy_version 38851 (0.0050) [2024-06-18 02:22:35,893][12883] Updated weights for policy 0, policy_version 38861 (0.0033) [2024-06-18 02:22:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 636731392. Throughput: 0: 41645.8. Samples: 636838020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 02:22:36,994][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 02:22:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038863_636731392.pth... [2024-06-18 02:22:37,107][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038250_626688000.pth [2024-06-18 02:22:40,641][12883] Updated weights for policy 0, policy_version 38871 (0.0036) [2024-06-18 02:22:41,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 636944384. Throughput: 0: 41640.7. Samples: 637086620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 02:22:41,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 02:22:44,261][12883] Updated weights for policy 0, policy_version 38881 (0.0031) [2024-06-18 02:22:46,996][12645] Fps is (10 sec: 39312.6, 60 sec: 41231.5, 300 sec: 41876.1). Total num frames: 637124608. Throughput: 0: 41691.8. Samples: 637218460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 02:22:46,997][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 02:22:48,373][12883] Updated weights for policy 0, policy_version 38891 (0.0030) [2024-06-18 02:22:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 637337600. Throughput: 0: 41503.9. Samples: 637462560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 02:22:51,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 02:22:52,250][12883] Updated weights for policy 0, policy_version 38901 (0.0028) [2024-06-18 02:22:56,200][12883] Updated weights for policy 0, policy_version 38911 (0.0036) [2024-06-18 02:22:56,994][12645] Fps is (10 sec: 42608.4, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 637550592. Throughput: 0: 41672.6. Samples: 637712720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 02:22:56,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 02:22:59,959][12883] Updated weights for policy 0, policy_version 38921 (0.0028) [2024-06-18 02:23:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 637763584. Throughput: 0: 41635.2. Samples: 637841120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 02:23:01,994][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 02:23:03,484][12862] Signal inference workers to stop experience collection... (9050 times) [2024-06-18 02:23:03,544][12883] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-06-18 02:23:03,601][12862] Signal inference workers to resume experience collection... (9050 times) [2024-06-18 02:23:03,601][12883] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-06-18 02:23:04,219][12883] Updated weights for policy 0, policy_version 38931 (0.0030) [2024-06-18 02:23:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 637960192. Throughput: 0: 41422.2. 
Samples: 638086600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 02:23:06,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 02:23:07,605][12883] Updated weights for policy 0, policy_version 38941 (0.0035) [2024-06-18 02:23:11,889][12883] Updated weights for policy 0, policy_version 38951 (0.0035) [2024-06-18 02:23:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 40960.1, 300 sec: 41876.4). Total num frames: 638173184. Throughput: 0: 41603.6. Samples: 638344020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 02:23:11,994][12645] Avg episode reward: [(0, '0.034')] [2024-06-18 02:23:15,290][12883] Updated weights for policy 0, policy_version 38961 (0.0057) [2024-06-18 02:23:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 638386176. Throughput: 0: 41673.4. Samples: 638465160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 02:23:16,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:23:19,505][12883] Updated weights for policy 0, policy_version 38971 (0.0054) [2024-06-18 02:23:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 638582784. Throughput: 0: 41712.9. Samples: 638715100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:23:21,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 02:23:22,807][12883] Updated weights for policy 0, policy_version 38981 (0.0038) [2024-06-18 02:23:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 638812160. Throughput: 0: 41814.4. Samples: 638968260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:23:26,994][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 02:23:27,391][12883] Updated weights for policy 0, policy_version 38991 (0.0036) [2024-06-18 02:23:31,010][12883] Updated weights for policy 0, policy_version 39001 (0.0028) [2024-06-18 02:23:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 639008768. Throughput: 0: 41683.8. Samples: 639094140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:23:31,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 02:23:35,048][12883] Updated weights for policy 0, policy_version 39011 (0.0023) [2024-06-18 02:23:36,996][12645] Fps is (10 sec: 40950.3, 60 sec: 41504.6, 300 sec: 41765.0). Total num frames: 639221760. Throughput: 0: 41762.0. Samples: 639341940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:23:36,996][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 02:23:38,677][12883] Updated weights for policy 0, policy_version 39021 (0.0028) [2024-06-18 02:23:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40960.0, 300 sec: 41709.8). Total num frames: 639401984. Throughput: 0: 41891.4. Samples: 639597840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:23:41,994][12645] Avg episode reward: [(0, '0.123')] [2024-06-18 02:23:42,795][12883] Updated weights for policy 0, policy_version 39031 (0.0039) [2024-06-18 02:23:46,927][12883] Updated weights for policy 0, policy_version 39041 (0.0033) [2024-06-18 02:23:46,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42053.8, 300 sec: 41709.8). Total num frames: 639647744. Throughput: 0: 41677.2. Samples: 639716600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:23:46,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:23:50,741][12883] Updated weights for policy 0, policy_version 39051 (0.0035) [2024-06-18 02:23:52,000][12645] Fps is (10 sec: 44209.9, 60 sec: 41774.9, 300 sec: 41820.0). 
Total num frames: 639844352. Throughput: 0: 41760.0. Samples: 639966060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:23:52,000][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 02:23:55,021][12883] Updated weights for policy 0, policy_version 39061 (0.0034) [2024-06-18 02:23:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41506.1, 300 sec: 41709.9). Total num frames: 640040960. Throughput: 0: 41659.1. Samples: 640218680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:23:56,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 02:23:58,586][12883] Updated weights for policy 0, policy_version 39071 (0.0038) [2024-06-18 02:24:01,994][12645] Fps is (10 sec: 42625.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 640270336. Throughput: 0: 41647.1. Samples: 640339280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:24:01,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 02:24:02,586][12883] Updated weights for policy 0, policy_version 39081 (0.0029) [2024-06-18 02:24:05,941][12862] Signal inference workers to stop experience collection... (9100 times) [2024-06-18 02:24:05,942][12862] Signal inference workers to resume experience collection... (9100 times) [2024-06-18 02:24:05,988][12883] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-06-18 02:24:05,988][12883] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-06-18 02:24:06,651][12883] Updated weights for policy 0, policy_version 39091 (0.0041) [2024-06-18 02:24:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41766.2). Total num frames: 640466944. Throughput: 0: 41692.9. Samples: 640591280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 02:24:06,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:24:10,412][12883] Updated weights for policy 0, policy_version 39101 (0.0044) [2024-06-18 02:24:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 640679936. Throughput: 0: 41595.0. Samples: 640840040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 02:24:11,994][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 02:24:14,610][12883] Updated weights for policy 0, policy_version 39111 (0.0033) [2024-06-18 02:24:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 640892928. Throughput: 0: 41592.1. Samples: 640965780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 02:24:16,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 02:24:18,296][12883] Updated weights for policy 0, policy_version 39121 (0.0037) [2024-06-18 02:24:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41765.6). Total num frames: 641089536. Throughput: 0: 41597.1. Samples: 641213720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 02:24:21,994][12645] Avg episode reward: [(0, '0.029')] [2024-06-18 02:24:22,729][12883] Updated weights for policy 0, policy_version 39131 (0.0043) [2024-06-18 02:24:26,072][12883] Updated weights for policy 0, policy_version 39141 (0.0032) [2024-06-18 02:24:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 641302528. Throughput: 0: 41470.3. Samples: 641464000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 02:24:26,994][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 02:24:30,511][12883] Updated weights for policy 0, policy_version 39151 (0.0033) [2024-06-18 02:24:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 641515520. Throughput: 0: 41616.1. Samples: 641589320. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 02:24:31,994][12645] Avg episode reward: [(0, '0.018')] [2024-06-18 02:24:33,958][12883] Updated weights for policy 0, policy_version 39161 (0.0022) [2024-06-18 02:24:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41507.7, 300 sec: 41709.8). Total num frames: 641712128. Throughput: 0: 41759.6. Samples: 641844980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-18 02:24:36,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 02:24:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039167_641712128.pth... [2024-06-18 02:24:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038556_631701504.pth [2024-06-18 02:24:38,305][12883] Updated weights for policy 0, policy_version 39171 (0.0038) [2024-06-18 02:24:41,818][12883] Updated weights for policy 0, policy_version 39181 (0.0031) [2024-06-18 02:24:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 641941504. Throughput: 0: 41728.9. Samples: 642096480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-18 02:24:41,994][12645] Avg episode reward: [(0, '0.067')] [2024-06-18 02:24:46,059][12883] Updated weights for policy 0, policy_version 39191 (0.0031) [2024-06-18 02:24:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.2, 300 sec: 41765.6). Total num frames: 642138112. Throughput: 0: 41935.4. Samples: 642226380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-18 02:24:46,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 02:24:49,803][12883] Updated weights for policy 0, policy_version 39201 (0.0029) [2024-06-18 02:24:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41783.6, 300 sec: 41709.8). Total num frames: 642351104. Throughput: 0: 41804.0. Samples: 642472460. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-18 02:24:51,994][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 02:24:53,837][12883] Updated weights for policy 0, policy_version 39211 (0.0044) [2024-06-18 02:24:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 642564096. Throughput: 0: 41890.2. Samples: 642725100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 25.0) [2024-06-18 02:24:56,994][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 02:24:57,426][12883] Updated weights for policy 0, policy_version 39221 (0.0043) [2024-06-18 02:25:01,515][12883] Updated weights for policy 0, policy_version 39231 (0.0033) [2024-06-18 02:25:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 642760704. Throughput: 0: 41905.7. Samples: 642851540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 02:25:02,000][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 02:25:05,380][12883] Updated weights for policy 0, policy_version 39241 (0.0039) [2024-06-18 02:25:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 642990080. Throughput: 0: 42048.1. Samples: 643105880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 02:25:06,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:25:09,350][12883] Updated weights for policy 0, policy_version 39251 (0.0032) [2024-06-18 02:25:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 643203072. Throughput: 0: 42128.0. Samples: 643359760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 02:25:11,994][12645] Avg episode reward: [(0, '0.091')] [2024-06-18 02:25:13,054][12883] Updated weights for policy 0, policy_version 39261 (0.0025) [2024-06-18 02:25:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 643399680. Throughput: 0: 42044.9. Samples: 643481340. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 02:25:16,994][12645] Avg episode reward: [(0, '0.091')] [2024-06-18 02:25:17,056][12883] Updated weights for policy 0, policy_version 39271 (0.0040) [2024-06-18 02:25:17,960][12862] Signal inference workers to stop experience collection... (9150 times) [2024-06-18 02:25:17,964][12862] Signal inference workers to resume experience collection... (9150 times) [2024-06-18 02:25:17,993][12883] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-06-18 02:25:17,993][12883] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-06-18 02:25:20,891][12883] Updated weights for policy 0, policy_version 39281 (0.0039) [2024-06-18 02:25:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 41931.9). Total num frames: 643645440. Throughput: 0: 42010.6. Samples: 643735460. Policy #0 lag: (min: 1.0, avg: 9.6, max: 21.0) [2024-06-18 02:25:21,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 02:25:24,967][12883] Updated weights for policy 0, policy_version 39291 (0.0038) [2024-06-18 02:25:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 643825664. Throughput: 0: 42068.5. Samples: 643989560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 02:25:26,994][12645] Avg episode reward: [(0, '0.038')] [2024-06-18 02:25:28,471][12883] Updated weights for policy 0, policy_version 39301 (0.0030) [2024-06-18 02:25:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 644038656. Throughput: 0: 41779.6. Samples: 644106460. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 02:25:31,994][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 02:25:33,083][12883] Updated weights for policy 0, policy_version 39311 (0.0035) [2024-06-18 02:25:36,288][12883] Updated weights for policy 0, policy_version 39321 (0.0041) [2024-06-18 02:25:36,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 41932.8). Total num frames: 644284416. Throughput: 0: 42181.2. Samples: 644370620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 02:25:36,994][12645] Avg episode reward: [(0, '0.062')] [2024-06-18 02:25:40,726][12883] Updated weights for policy 0, policy_version 39331 (0.0028) [2024-06-18 02:25:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 644448256. Throughput: 0: 42101.4. Samples: 644619660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 02:25:41,994][12645] Avg episode reward: [(0, '0.086')] [2024-06-18 02:25:44,194][12883] Updated weights for policy 0, policy_version 39341 (0.0037) [2024-06-18 02:25:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 644661248. Throughput: 0: 42012.5. Samples: 644742100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 02:25:46,996][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 02:25:48,303][12883] Updated weights for policy 0, policy_version 39351 (0.0046) [2024-06-18 02:25:51,854][12883] Updated weights for policy 0, policy_version 39361 (0.0039) [2024-06-18 02:25:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 644890624. Throughput: 0: 42130.2. Samples: 645001740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:25:51,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 02:25:55,839][12883] Updated weights for policy 0, policy_version 39371 (0.0039) [2024-06-18 02:25:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41654.2). 
Total num frames: 645054464. Throughput: 0: 42125.2. Samples: 645255400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:25:56,995][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 02:25:59,479][12883] Updated weights for policy 0, policy_version 39381 (0.0040) [2024-06-18 02:26:01,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42323.8, 300 sec: 41876.1). Total num frames: 645300224. Throughput: 0: 42050.8. Samples: 645373720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:26:01,996][12645] Avg episode reward: [(0, '0.063')] [2024-06-18 02:26:04,092][12883] Updated weights for policy 0, policy_version 39391 (0.0030) [2024-06-18 02:26:06,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 645529600. Throughput: 0: 42110.2. Samples: 645630420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:26:06,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:26:07,150][12883] Updated weights for policy 0, policy_version 39401 (0.0037) [2024-06-18 02:26:11,708][12883] Updated weights for policy 0, policy_version 39411 (0.0038) [2024-06-18 02:26:11,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 645709824. Throughput: 0: 42080.0. Samples: 645883160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 02:26:11,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 02:26:14,986][12883] Updated weights for policy 0, policy_version 39421 (0.0027) [2024-06-18 02:26:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 645955584. Throughput: 0: 42204.4. Samples: 646005660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 02:26:16,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:26:19,504][12883] Updated weights for policy 0, policy_version 39431 (0.0034) [2024-06-18 02:26:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41820.9). 
Total num frames: 646135808. Throughput: 0: 41997.1. Samples: 646260480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 02:26:21,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 02:26:22,089][12862] Saving new best policy, reward=0.134! [2024-06-18 02:26:23,188][12883] Updated weights for policy 0, policy_version 39441 (0.0035) [2024-06-18 02:26:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 646332416. Throughput: 0: 42103.1. Samples: 646514300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 02:26:26,994][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 02:26:27,711][12883] Updated weights for policy 0, policy_version 39451 (0.0031) [2024-06-18 02:26:28,344][12862] Signal inference workers to stop experience collection... (9200 times) [2024-06-18 02:26:28,344][12862] Signal inference workers to resume experience collection... (9200 times) [2024-06-18 02:26:28,368][12883] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-06-18 02:26:28,368][12883] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-06-18 02:26:30,948][12883] Updated weights for policy 0, policy_version 39461 (0.0038) [2024-06-18 02:26:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 646578176. Throughput: 0: 42045.5. Samples: 646634140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 02:26:31,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 02:26:35,554][12883] Updated weights for policy 0, policy_version 39471 (0.0023) [2024-06-18 02:26:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 646774784. Throughput: 0: 42027.6. Samples: 646892980. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 02:26:36,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:26:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039476_646774784.pth... [2024-06-18 02:26:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000038863_636731392.pth [2024-06-18 02:26:39,039][12883] Updated weights for policy 0, policy_version 39481 (0.0037) [2024-06-18 02:26:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 646971392. Throughput: 0: 41835.7. Samples: 647138000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 02:26:41,994][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 02:26:42,995][12883] Updated weights for policy 0, policy_version 39491 (0.0028) [2024-06-18 02:26:46,705][12883] Updated weights for policy 0, policy_version 39501 (0.0032) [2024-06-18 02:26:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 647184384. Throughput: 0: 42009.7. Samples: 647264060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 02:26:46,994][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 02:26:50,771][12883] Updated weights for policy 0, policy_version 39511 (0.0026) [2024-06-18 02:26:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41709.7). Total num frames: 647380992. Throughput: 0: 41919.5. Samples: 647516800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 02:26:51,994][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 02:26:54,222][12883] Updated weights for policy 0, policy_version 39521 (0.0041) [2024-06-18 02:26:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 647593984. Throughput: 0: 41907.1. Samples: 647768980. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 02:26:56,994][12645] Avg episode reward: [(0, '0.071')] [2024-06-18 02:26:58,590][12883] Updated weights for policy 0, policy_version 39531 (0.0027) [2024-06-18 02:27:01,745][12883] Updated weights for policy 0, policy_version 39541 (0.0036) [2024-06-18 02:27:01,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42326.9, 300 sec: 41931.9). Total num frames: 647839744. Throughput: 0: 42022.7. Samples: 647896680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 02:27:01,994][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 02:27:06,636][12883] Updated weights for policy 0, policy_version 39551 (0.0039) [2024-06-18 02:27:06,997][12645] Fps is (10 sec: 42584.9, 60 sec: 41504.0, 300 sec: 41709.3). Total num frames: 648019968. Throughput: 0: 41877.9. Samples: 648145120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:27:06,997][12645] Avg episode reward: [(0, '0.003')] [2024-06-18 02:27:09,855][12883] Updated weights for policy 0, policy_version 39561 (0.0033) [2024-06-18 02:27:11,994][12645] Fps is (10 sec: 37682.9, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 648216576. Throughput: 0: 41822.1. Samples: 648396300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:27:11,994][12645] Avg episode reward: [(0, '0.036')] [2024-06-18 02:27:14,420][12883] Updated weights for policy 0, policy_version 39571 (0.0047) [2024-06-18 02:27:16,994][12645] Fps is (10 sec: 44250.4, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 648462336. Throughput: 0: 41933.6. Samples: 648521160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:27:16,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:27:17,733][12883] Updated weights for policy 0, policy_version 39581 (0.0029) [2024-06-18 02:27:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 648642560. Throughput: 0: 41932.8. Samples: 648779960. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:27:21,994][12645] Avg episode reward: [(0, '0.012')] [2024-06-18 02:27:22,062][12883] Updated weights for policy 0, policy_version 39591 (0.0033) [2024-06-18 02:27:25,254][12883] Updated weights for policy 0, policy_version 39601 (0.0030) [2024-06-18 02:27:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 648855552. Throughput: 0: 41973.7. Samples: 649026820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:27:26,994][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 02:27:29,749][12883] Updated weights for policy 0, policy_version 39611 (0.0030) [2024-06-18 02:27:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 649101312. Throughput: 0: 42048.8. Samples: 649156260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 02:27:31,994][12645] Avg episode reward: [(0, '0.034')] [2024-06-18 02:27:32,958][12883] Updated weights for policy 0, policy_version 39621 (0.0033) [2024-06-18 02:27:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 649265152. Throughput: 0: 42093.8. Samples: 649411020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 02:27:36,994][12645] Avg episode reward: [(0, '0.075')] [2024-06-18 02:27:37,690][12883] Updated weights for policy 0, policy_version 39631 (0.0036) [2024-06-18 02:27:40,647][12883] Updated weights for policy 0, policy_version 39641 (0.0032) [2024-06-18 02:27:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 649510912. Throughput: 0: 41804.4. Samples: 649650180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 02:27:41,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 02:27:41,999][12862] Saving new best policy, reward=0.171! 
[2024-06-18 02:27:45,612][12883] Updated weights for policy 0, policy_version 39651 (0.0028) [2024-06-18 02:27:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 649723904. Throughput: 0: 41978.6. Samples: 649785720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 02:27:46,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 02:27:47,729][12862] Signal inference workers to stop experience collection... (9250 times) [2024-06-18 02:27:47,730][12862] Signal inference workers to resume experience collection... (9250 times) [2024-06-18 02:27:47,775][12883] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-06-18 02:27:47,775][12883] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-06-18 02:27:48,441][12883] Updated weights for policy 0, policy_version 39661 (0.0051) [2024-06-18 02:27:51,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 41820.8). Total num frames: 649887744. Throughput: 0: 41990.5. Samples: 650034560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 02:27:51,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 02:27:53,478][12883] Updated weights for policy 0, policy_version 39671 (0.0035) [2024-06-18 02:27:56,147][12883] Updated weights for policy 0, policy_version 39681 (0.0030) [2024-06-18 02:27:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 650133504. Throughput: 0: 41913.9. Samples: 650282420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 02:27:56,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 02:28:01,108][12883] Updated weights for policy 0, policy_version 39691 (0.0039) [2024-06-18 02:28:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 650330112. Throughput: 0: 42240.2. Samples: 650421960. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 02:28:01,994][12645] Avg episode reward: [(0, '0.083')] [2024-06-18 02:28:03,998][12883] Updated weights for policy 0, policy_version 39701 (0.0037) [2024-06-18 02:28:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41781.4, 300 sec: 41876.4). Total num frames: 650526720. Throughput: 0: 41915.5. Samples: 650666160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 02:28:06,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 02:28:08,980][12883] Updated weights for policy 0, policy_version 39711 (0.0036) [2024-06-18 02:28:11,699][12883] Updated weights for policy 0, policy_version 39721 (0.0028) [2024-06-18 02:28:11,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42043.0). Total num frames: 650788864. Throughput: 0: 41851.9. Samples: 650910160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 02:28:11,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 02:28:16,936][12883] Updated weights for policy 0, policy_version 39731 (0.0042) [2024-06-18 02:28:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 650952704. Throughput: 0: 41936.5. Samples: 651043400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 02:28:16,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:28:19,822][12883] Updated weights for policy 0, policy_version 39741 (0.0035) [2024-06-18 02:28:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 651165696. Throughput: 0: 41809.0. Samples: 651292420. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 02:28:21,994][12645] Avg episode reward: [(0, '0.062')] [2024-06-18 02:28:24,805][12883] Updated weights for policy 0, policy_version 39751 (0.0028) [2024-06-18 02:28:26,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 651427840. Throughput: 0: 42040.5. Samples: 651542000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 02:28:26,994][12645] Avg episode reward: [(0, '0.083')] [2024-06-18 02:28:27,452][12883] Updated weights for policy 0, policy_version 39761 (0.0037) [2024-06-18 02:28:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 41876.7). Total num frames: 651575296. Throughput: 0: 41949.5. Samples: 651673440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 02:28:31,994][12645] Avg episode reward: [(0, '0.064')] [2024-06-18 02:28:32,553][12883] Updated weights for policy 0, policy_version 39771 (0.0033) [2024-06-18 02:28:35,673][12883] Updated weights for policy 0, policy_version 39781 (0.0038) [2024-06-18 02:28:36,996][12645] Fps is (10 sec: 37674.8, 60 sec: 42323.8, 300 sec: 42042.7). Total num frames: 651804672. Throughput: 0: 41956.1. Samples: 651922680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 02:28:36,996][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 02:28:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039783_651804672.pth... [2024-06-18 02:28:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039167_641712128.pth [2024-06-18 02:28:40,456][12883] Updated weights for policy 0, policy_version 39791 (0.0039) [2024-06-18 02:28:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 652034048. Throughput: 0: 41931.6. Samples: 652169340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 02:28:41,994][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 02:28:43,609][12883] Updated weights for policy 0, policy_version 39801 (0.0034) [2024-06-18 02:28:46,994][12645] Fps is (10 sec: 39330.5, 60 sec: 41233.1, 300 sec: 41877.3). Total num frames: 652197888. Throughput: 0: 41702.2. Samples: 652298560. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 02:28:46,994][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 02:28:48,288][12883] Updated weights for policy 0, policy_version 39811 (0.0047) [2024-06-18 02:28:49,642][12862] Signal inference workers to stop experience collection... (9300 times) [2024-06-18 02:28:49,680][12883] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-06-18 02:28:49,753][12862] Signal inference workers to resume experience collection... (9300 times) [2024-06-18 02:28:49,753][12883] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-06-18 02:28:51,278][12883] Updated weights for policy 0, policy_version 39821 (0.0037) [2024-06-18 02:28:51,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 652427264. Throughput: 0: 41694.6. Samples: 652542420. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 02:28:51,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 02:28:55,948][12883] Updated weights for policy 0, policy_version 39831 (0.0027) [2024-06-18 02:28:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 652623872. Throughput: 0: 42116.6. Samples: 652805400. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 02:28:56,994][12645] Avg episode reward: [(0, '0.070')] [2024-06-18 02:28:58,948][12883] Updated weights for policy 0, policy_version 39841 (0.0029) [2024-06-18 02:29:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 652836864. Throughput: 0: 41796.5. Samples: 652924240. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 02:29:01,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 02:29:03,806][12883] Updated weights for policy 0, policy_version 39851 (0.0027) [2024-06-18 02:29:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 653066240. Throughput: 0: 41834.6. 
Samples: 653174980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 02:29:07,007][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 02:29:07,464][12883] Updated weights for policy 0, policy_version 39861 (0.0049) [2024-06-18 02:29:11,595][12883] Updated weights for policy 0, policy_version 39871 (0.0033) [2024-06-18 02:29:11,994][12645] Fps is (10 sec: 40959.1, 60 sec: 40959.9, 300 sec: 41876.4). Total num frames: 653246464. Throughput: 0: 41889.2. Samples: 653427020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 02:29:12,003][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 02:29:15,256][12883] Updated weights for policy 0, policy_version 39881 (0.0048) [2024-06-18 02:29:16,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42323.8, 300 sec: 42042.7). Total num frames: 653492224. Throughput: 0: 41616.1. Samples: 653546260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:29:16,996][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 02:29:19,622][12883] Updated weights for policy 0, policy_version 39891 (0.0028) [2024-06-18 02:29:21,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 653688832. Throughput: 0: 41847.0. Samples: 653805700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:29:21,994][12645] Avg episode reward: [(0, '0.123')] [2024-06-18 02:29:22,950][12883] Updated weights for policy 0, policy_version 39901 (0.0034) [2024-06-18 02:29:26,993][12645] Fps is (10 sec: 39330.8, 60 sec: 40960.1, 300 sec: 41932.0). Total num frames: 653885440. Throughput: 0: 41877.3. Samples: 654053820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:29:26,994][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 02:29:27,093][12883] Updated weights for policy 0, policy_version 39911 (0.0041) [2024-06-18 02:29:30,857][12883] Updated weights for policy 0, policy_version 39921 (0.0036) [2024-06-18 02:29:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 654147584. Throughput: 0: 41877.3. Samples: 654183040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:29:31,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:29:35,346][12883] Updated weights for policy 0, policy_version 39931 (0.0038) [2024-06-18 02:29:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41780.8, 300 sec: 41931.9). Total num frames: 654311424. Throughput: 0: 42101.0. Samples: 654436960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 02:29:37,000][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:29:38,454][12883] Updated weights for policy 0, policy_version 39941 (0.0033) [2024-06-18 02:29:41,994][12645] Fps is (10 sec: 36045.0, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 654508032. Throughput: 0: 41739.5. Samples: 654683680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:29:41,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 02:29:42,844][12883] Updated weights for policy 0, policy_version 39951 (0.0023) [2024-06-18 02:29:46,556][12883] Updated weights for policy 0, policy_version 39961 (0.0036) [2024-06-18 02:29:46,999][12645] Fps is (10 sec: 42577.1, 60 sec: 42321.8, 300 sec: 41986.8). Total num frames: 654737408. Throughput: 0: 41806.0. Samples: 654805720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:29:46,999][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 02:29:50,982][12883] Updated weights for policy 0, policy_version 39971 (0.0037) [2024-06-18 02:29:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 41987.5). 
Total num frames: 654950400. Throughput: 0: 41850.6. Samples: 655058260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:29:51,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 02:29:54,346][12883] Updated weights for policy 0, policy_version 39981 (0.0034) [2024-06-18 02:29:56,994][12645] Fps is (10 sec: 42619.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 655163392. Throughput: 0: 41809.9. Samples: 655308460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:29:56,994][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 02:29:58,541][12883] Updated weights for policy 0, policy_version 39991 (0.0037) [2024-06-18 02:30:01,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 655360000. Throughput: 0: 41993.2. Samples: 655435860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:30:01,994][12645] Avg episode reward: [(0, '0.032')] [2024-06-18 02:30:02,210][12883] Updated weights for policy 0, policy_version 40001 (0.0033) [2024-06-18 02:30:06,130][12883] Updated weights for policy 0, policy_version 40011 (0.0030) [2024-06-18 02:30:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 655556608. Throughput: 0: 41922.1. Samples: 655692200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:30:06,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:30:10,036][12883] Updated weights for policy 0, policy_version 40021 (0.0049) [2024-06-18 02:30:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 655769600. Throughput: 0: 41794.0. Samples: 655934560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:30:11,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 02:30:14,018][12883] Updated weights for policy 0, policy_version 40031 (0.0034) [2024-06-18 02:30:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41507.6, 300 sec: 41820.8). 
Total num frames: 655982592. Throughput: 0: 41770.2. Samples: 656062700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:30:16,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 02:30:17,755][12883] Updated weights for policy 0, policy_version 40041 (0.0050) [2024-06-18 02:30:18,227][12862] Signal inference workers to stop experience collection... (9350 times) [2024-06-18 02:30:18,275][12883] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-06-18 02:30:18,278][12862] Signal inference workers to resume experience collection... (9350 times) [2024-06-18 02:30:18,292][12883] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-06-18 02:30:21,525][12883] Updated weights for policy 0, policy_version 40051 (0.0037) [2024-06-18 02:30:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 656195584. Throughput: 0: 41762.7. Samples: 656316280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:30:21,994][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 02:30:25,307][12883] Updated weights for policy 0, policy_version 40061 (0.0032) [2024-06-18 02:30:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 656424960. Throughput: 0: 41871.0. Samples: 656567880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:30:26,996][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 02:30:29,869][12883] Updated weights for policy 0, policy_version 40071 (0.0032) [2024-06-18 02:30:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 656605184. Throughput: 0: 41951.3. Samples: 656693320. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-18 02:30:31,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 02:30:33,293][12883] Updated weights for policy 0, policy_version 40081 (0.0033) [2024-06-18 02:30:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 656834560. Throughput: 0: 41844.6. Samples: 656941260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-18 02:30:36,994][12645] Avg episode reward: [(0, '0.008')] [2024-06-18 02:30:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040090_656834560.pth... [2024-06-18 02:30:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039476_646774784.pth [2024-06-18 02:30:37,600][12883] Updated weights for policy 0, policy_version 40091 (0.0034) [2024-06-18 02:30:41,223][12883] Updated weights for policy 0, policy_version 40101 (0.0035) [2024-06-18 02:30:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 657031168. Throughput: 0: 41860.4. Samples: 657192180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-18 02:30:41,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 02:30:45,262][12883] Updated weights for policy 0, policy_version 40111 (0.0037) [2024-06-18 02:30:46,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41782.6, 300 sec: 41876.4). Total num frames: 657244160. Throughput: 0: 41709.7. Samples: 657312800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-18 02:30:46,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 02:30:48,922][12883] Updated weights for policy 0, policy_version 40121 (0.0030) [2024-06-18 02:30:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 657457152. Throughput: 0: 41678.7. Samples: 657567740. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 22.0) [2024-06-18 02:30:51,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 02:30:53,312][12883] Updated weights for policy 0, policy_version 40131 (0.0040) [2024-06-18 02:30:56,837][12883] Updated weights for policy 0, policy_version 40141 (0.0046) [2024-06-18 02:30:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 657670144. Throughput: 0: 41687.6. Samples: 657810500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:30:56,994][12645] Avg episode reward: [(0, '0.083')] [2024-06-18 02:31:00,878][12883] Updated weights for policy 0, policy_version 40151 (0.0043) [2024-06-18 02:31:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 657850368. Throughput: 0: 41729.9. Samples: 657940540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:31:01,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 02:31:04,634][12883] Updated weights for policy 0, policy_version 40161 (0.0028) [2024-06-18 02:31:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 658063360. Throughput: 0: 41661.2. Samples: 658191040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:31:06,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:31:08,807][12883] Updated weights for policy 0, policy_version 40171 (0.0028) [2024-06-18 02:31:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 658276352. Throughput: 0: 41741.0. Samples: 658446220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:31:11,994][12645] Avg episode reward: [(0, '0.126')] [2024-06-18 02:31:12,553][12883] Updated weights for policy 0, policy_version 40181 (0.0032) [2024-06-18 02:31:16,407][12883] Updated weights for policy 0, policy_version 40191 (0.0035) [2024-06-18 02:31:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 41777.7, 300 sec: 41876.1). 
Total num frames: 658489344. Throughput: 0: 41689.1. Samples: 658569420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:31:16,997][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 02:31:17,013][12862] Saving new best policy, reward=0.189! [2024-06-18 02:31:20,391][12883] Updated weights for policy 0, policy_version 40201 (0.0036) [2024-06-18 02:31:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 658685952. Throughput: 0: 41755.0. Samples: 658820240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:31:21,994][12645] Avg episode reward: [(0, '0.049')] [2024-06-18 02:31:24,257][12883] Updated weights for policy 0, policy_version 40211 (0.0047) [2024-06-18 02:31:26,994][12645] Fps is (10 sec: 40969.2, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 658898944. Throughput: 0: 41821.8. Samples: 659074160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:31:26,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 02:31:28,369][12883] Updated weights for policy 0, policy_version 40221 (0.0028) [2024-06-18 02:31:31,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42050.7, 300 sec: 41876.1). Total num frames: 659128320. Throughput: 0: 42022.4. Samples: 659203900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:31:31,997][12645] Avg episode reward: [(0, '0.112')] [2024-06-18 02:31:32,341][12883] Updated weights for policy 0, policy_version 40231 (0.0032) [2024-06-18 02:31:36,178][12883] Updated weights for policy 0, policy_version 40241 (0.0043) [2024-06-18 02:31:36,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41777.6, 300 sec: 41931.6). Total num frames: 659341312. Throughput: 0: 41941.0. Samples: 659455180. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:31:36,997][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 02:31:40,038][12883] Updated weights for policy 0, policy_version 40251 (0.0045) [2024-06-18 02:31:41,994][12645] Fps is (10 sec: 40969.4, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 659537920. Throughput: 0: 42048.9. Samples: 659702700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:31:41,994][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 02:31:44,074][12883] Updated weights for policy 0, policy_version 40261 (0.0037) [2024-06-18 02:31:46,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 659750912. Throughput: 0: 42002.2. Samples: 659830640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:31:46,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 02:31:47,746][12883] Updated weights for policy 0, policy_version 40271 (0.0036) [2024-06-18 02:31:51,967][12883] Updated weights for policy 0, policy_version 40281 (0.0029) [2024-06-18 02:31:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 659963904. Throughput: 0: 42136.5. Samples: 660087180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:31:51,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 02:31:52,127][12862] Signal inference workers to stop experience collection... (9400 times) [2024-06-18 02:31:52,128][12862] Signal inference workers to resume experience collection... (9400 times) [2024-06-18 02:31:52,173][12883] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-06-18 02:31:52,173][12883] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-06-18 02:31:55,753][12883] Updated weights for policy 0, policy_version 40291 (0.0031) [2024-06-18 02:31:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 660160512. Throughput: 0: 42085.2. 
Samples: 660340060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:31:56,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 02:31:59,487][12883] Updated weights for policy 0, policy_version 40301 (0.0040) [2024-06-18 02:32:01,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42596.8, 300 sec: 41987.6). Total num frames: 660406272. Throughput: 0: 42123.6. Samples: 660464980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:32:01,996][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 02:32:03,451][12883] Updated weights for policy 0, policy_version 40311 (0.0033) [2024-06-18 02:32:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 660602880. Throughput: 0: 42261.3. Samples: 660722000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:32:06,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 02:32:07,067][12883] Updated weights for policy 0, policy_version 40321 (0.0039) [2024-06-18 02:32:11,158][12883] Updated weights for policy 0, policy_version 40331 (0.0031) [2024-06-18 02:32:11,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 660815872. Throughput: 0: 42140.9. Samples: 660970500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:32:11,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 02:32:15,061][12883] Updated weights for policy 0, policy_version 40341 (0.0036) [2024-06-18 02:32:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42326.8, 300 sec: 41987.5). Total num frames: 661028864. Throughput: 0: 42070.4. Samples: 661096980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 02:32:16,995][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 02:32:18,784][12883] Updated weights for policy 0, policy_version 40351 (0.0041) [2024-06-18 02:32:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 661209088. Throughput: 0: 42214.1. Samples: 661354720. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 02:32:21,994][12645] Avg episode reward: [(0, '0.112')] [2024-06-18 02:32:22,747][12883] Updated weights for policy 0, policy_version 40361 (0.0027) [2024-06-18 02:32:26,779][12883] Updated weights for policy 0, policy_version 40371 (0.0032) [2024-06-18 02:32:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 661438464. Throughput: 0: 42171.1. Samples: 661600400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 02:32:26,994][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 02:32:30,482][12883] Updated weights for policy 0, policy_version 40381 (0.0033) [2024-06-18 02:32:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42326.9, 300 sec: 42043.0). Total num frames: 661667840. Throughput: 0: 42167.1. Samples: 661728160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 02:32:31,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 02:32:34,710][12883] Updated weights for policy 0, policy_version 40391 (0.0036) [2024-06-18 02:32:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41780.8, 300 sec: 41820.9). Total num frames: 661848064. Throughput: 0: 42147.1. Samples: 661983800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 02:32:36,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 02:32:37,141][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040397_661864448.pth... [2024-06-18 02:32:37,202][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000039783_651804672.pth [2024-06-18 02:32:38,092][12883] Updated weights for policy 0, policy_version 40401 (0.0026) [2024-06-18 02:32:41,995][12645] Fps is (10 sec: 40956.3, 60 sec: 42324.7, 300 sec: 41876.3). Total num frames: 662077440. Throughput: 0: 42163.6. Samples: 662237460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:32:41,995][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 02:32:42,457][12883] Updated weights for policy 0, policy_version 40411 (0.0037) [2024-06-18 02:32:46,059][12883] Updated weights for policy 0, policy_version 40421 (0.0025) [2024-06-18 02:32:46,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 662323200. Throughput: 0: 42312.8. Samples: 662368960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:32:46,994][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 02:32:50,163][12883] Updated weights for policy 0, policy_version 40431 (0.0024) [2024-06-18 02:32:51,994][12645] Fps is (10 sec: 39325.0, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 662470656. Throughput: 0: 42072.9. Samples: 662615280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:32:51,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:32:53,640][12883] Updated weights for policy 0, policy_version 40441 (0.0044) [2024-06-18 02:32:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 662700032. Throughput: 0: 42103.6. Samples: 662865160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:32:56,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 02:32:57,947][12883] Updated weights for policy 0, policy_version 40451 (0.0039) [2024-06-18 02:33:01,503][12883] Updated weights for policy 0, policy_version 40461 (0.0034) [2024-06-18 02:33:01,994][12645] Fps is (10 sec: 47514.6, 60 sec: 42327.0, 300 sec: 42098.6). Total num frames: 662945792. Throughput: 0: 42272.7. Samples: 662999240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:33:01,994][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 02:33:05,644][12883] Updated weights for policy 0, policy_version 40471 (0.0037) [2024-06-18 02:33:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41820.9). 
Total num frames: 663126016. Throughput: 0: 42094.8. Samples: 663248980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 02:33:06,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 02:33:09,358][12883] Updated weights for policy 0, policy_version 40481 (0.0028) [2024-06-18 02:33:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 663339008. Throughput: 0: 42197.9. Samples: 663499300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 02:33:11,994][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 02:33:13,491][12883] Updated weights for policy 0, policy_version 40491 (0.0038) [2024-06-18 02:33:16,209][12862] Signal inference workers to stop experience collection... (9450 times) [2024-06-18 02:33:16,215][12862] Signal inference workers to resume experience collection... (9450 times) [2024-06-18 02:33:16,242][12883] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-06-18 02:33:16,242][12883] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-06-18 02:33:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 663552000. Throughput: 0: 42174.2. Samples: 663626000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 02:33:16,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 02:33:17,023][12883] Updated weights for policy 0, policy_version 40501 (0.0030) [2024-06-18 02:33:21,195][12883] Updated weights for policy 0, policy_version 40511 (0.0034) [2024-06-18 02:33:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 41765.3). Total num frames: 663748608. Throughput: 0: 42132.5. Samples: 663879760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 02:33:21,994][12645] Avg episode reward: [(0, '0.049')] [2024-06-18 02:33:24,688][12883] Updated weights for policy 0, policy_version 40521 (0.0037) [2024-06-18 02:33:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 663977984. Throughput: 0: 42128.4. Samples: 664133200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 02:33:26,994][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 02:33:29,080][12883] Updated weights for policy 0, policy_version 40531 (0.0034) [2024-06-18 02:33:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41987.8). Total num frames: 664190976. Throughput: 0: 42089.4. Samples: 664262980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:33:31,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 02:33:32,334][12883] Updated weights for policy 0, policy_version 40541 (0.0037) [2024-06-18 02:33:36,850][12883] Updated weights for policy 0, policy_version 40551 (0.0049) [2024-06-18 02:33:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 664387584. Throughput: 0: 41998.7. Samples: 664505220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:33:36,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 02:33:40,341][12883] Updated weights for policy 0, policy_version 40561 (0.0041) [2024-06-18 02:33:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42053.0, 300 sec: 42043.0). Total num frames: 664600576. Throughput: 0: 42139.6. Samples: 664761440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:33:41,994][12645] Avg episode reward: [(0, '0.088')] [2024-06-18 02:33:44,494][12883] Updated weights for policy 0, policy_version 40571 (0.0034) [2024-06-18 02:33:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41932.0). Total num frames: 664797184. Throughput: 0: 41926.6. Samples: 664885940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:33:46,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 02:33:47,853][12883] Updated weights for policy 0, policy_version 40581 (0.0031) [2024-06-18 02:33:51,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.9, 300 sec: 42042.7). Total num frames: 665026560. Throughput: 0: 42126.3. Samples: 665144760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:33:51,996][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 02:33:52,186][12883] Updated weights for policy 0, policy_version 40591 (0.0036) [2024-06-18 02:33:55,736][12883] Updated weights for policy 0, policy_version 40601 (0.0034) [2024-06-18 02:33:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 665239552. Throughput: 0: 42036.0. Samples: 665390920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:33:56,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 02:33:59,983][12883] Updated weights for policy 0, policy_version 40611 (0.0030) [2024-06-18 02:34:01,996][12645] Fps is (10 sec: 42598.3, 60 sec: 41777.5, 300 sec: 41987.2). Total num frames: 665452544. Throughput: 0: 42109.0. Samples: 665521000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:34:01,996][12645] Avg episode reward: [(0, '0.031')] [2024-06-18 02:34:03,533][12883] Updated weights for policy 0, policy_version 40621 (0.0027) [2024-06-18 02:34:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 665649152. Throughput: 0: 42101.8. Samples: 665774340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:34:06,994][12645] Avg episode reward: [(0, '0.064')] [2024-06-18 02:34:07,812][12883] Updated weights for policy 0, policy_version 40631 (0.0039) [2024-06-18 02:34:11,166][12883] Updated weights for policy 0, policy_version 40641 (0.0040) [2024-06-18 02:34:11,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.3, 300 sec: 41987.8). 
Total num frames: 665878528. Throughput: 0: 42037.8. Samples: 666024900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:34:11,994][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 02:34:15,559][12883] Updated weights for policy 0, policy_version 40651 (0.0027) [2024-06-18 02:34:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 666075136. Throughput: 0: 42088.0. Samples: 666156940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:34:16,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:34:19,276][12883] Updated weights for policy 0, policy_version 40661 (0.0030) [2024-06-18 02:34:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.1, 300 sec: 41987.4). Total num frames: 666271744. Throughput: 0: 42344.4. Samples: 666410720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 02:34:21,994][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 02:34:23,732][12883] Updated weights for policy 0, policy_version 40671 (0.0038) [2024-06-18 02:34:26,985][12883] Updated weights for policy 0, policy_version 40681 (0.0036) [2024-06-18 02:34:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 666517504. Throughput: 0: 42295.5. Samples: 666664740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 02:34:26,994][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 02:34:31,317][12883] Updated weights for policy 0, policy_version 40691 (0.0049) [2024-06-18 02:34:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 666714112. Throughput: 0: 42414.3. Samples: 666794580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 02:34:31,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 02:34:32,100][12862] Signal inference workers to stop experience collection... (9500 times) [2024-06-18 02:34:32,101][12862] Signal inference workers to resume experience collection... 
(9500 times) [2024-06-18 02:34:32,118][12883] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-06-18 02:34:32,119][12883] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-06-18 02:34:34,642][12883] Updated weights for policy 0, policy_version 40701 (0.0035) [2024-06-18 02:34:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 666927104. Throughput: 0: 42235.0. Samples: 667045240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 02:34:36,994][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 02:34:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040706_666927104.pth... [2024-06-18 02:34:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040090_656834560.pth [2024-06-18 02:34:38,854][12883] Updated weights for policy 0, policy_version 40711 (0.0031) [2024-06-18 02:34:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42043.7). Total num frames: 667140096. Throughput: 0: 42346.6. Samples: 667296520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 02:34:41,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 02:34:42,581][12883] Updated weights for policy 0, policy_version 40721 (0.0028) [2024-06-18 02:34:46,712][12883] Updated weights for policy 0, policy_version 40731 (0.0036) [2024-06-18 02:34:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 667336704. Throughput: 0: 42356.7. Samples: 667426960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 02:34:46,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 02:34:50,318][12883] Updated weights for policy 0, policy_version 40741 (0.0034) [2024-06-18 02:34:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42053.9, 300 sec: 41987.5). Total num frames: 667549696. Throughput: 0: 42282.6. Samples: 667677060. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:34:51,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 02:34:54,430][12883] Updated weights for policy 0, policy_version 40751 (0.0025) [2024-06-18 02:34:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 667779072. Throughput: 0: 42299.0. Samples: 667928360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:34:56,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 02:34:57,991][12883] Updated weights for policy 0, policy_version 40761 (0.0039) [2024-06-18 02:35:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42327.0, 300 sec: 42154.1). Total num frames: 667992064. Throughput: 0: 42309.8. Samples: 668060880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:35:01,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 02:35:01,997][12883] Updated weights for policy 0, policy_version 40771 (0.0033) [2024-06-18 02:35:05,488][12883] Updated weights for policy 0, policy_version 40781 (0.0040) [2024-06-18 02:35:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 668188672. Throughput: 0: 42228.5. Samples: 668311000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:35:06,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 02:35:09,544][12883] Updated weights for policy 0, policy_version 40791 (0.0039) [2024-06-18 02:35:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 668418048. Throughput: 0: 42271.5. Samples: 668566960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 02:35:11,999][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:35:13,557][12883] Updated weights for policy 0, policy_version 40801 (0.0036) [2024-06-18 02:35:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 668631040. Throughput: 0: 42236.9. Samples: 668695240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:35:16,994][12645] Avg episode reward: [(0, '0.013')] [2024-06-18 02:35:17,246][12883] Updated weights for policy 0, policy_version 40811 (0.0040) [2024-06-18 02:35:21,339][12883] Updated weights for policy 0, policy_version 40821 (0.0028) [2024-06-18 02:35:21,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42869.9, 300 sec: 42098.2). Total num frames: 668844032. Throughput: 0: 42325.5. Samples: 668949980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:35:21,997][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:35:25,043][12883] Updated weights for policy 0, policy_version 40831 (0.0029) [2024-06-18 02:35:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 669057024. Throughput: 0: 42286.7. Samples: 669199420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:35:26,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 02:35:29,057][12883] Updated weights for policy 0, policy_version 40841 (0.0041) [2024-06-18 02:35:31,994][12645] Fps is (10 sec: 39330.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 669237248. Throughput: 0: 42250.3. Samples: 669328220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:35:31,994][12645] Avg episode reward: [(0, '0.126')] [2024-06-18 02:35:32,597][12883] Updated weights for policy 0, policy_version 40851 (0.0035) [2024-06-18 02:35:36,836][12883] Updated weights for policy 0, policy_version 40861 (0.0044) [2024-06-18 02:35:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 669466624. Throughput: 0: 42426.2. Samples: 669586240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 02:35:36,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:35:40,489][12883] Updated weights for policy 0, policy_version 40871 (0.0033) [2024-06-18 02:35:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42154.1). 
Total num frames: 669679616. Throughput: 0: 42303.7. Samples: 669832020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 02:35:41,994][12645] Avg episode reward: [(0, '0.071')] [2024-06-18 02:35:44,622][12883] Updated weights for policy 0, policy_version 40881 (0.0028) [2024-06-18 02:35:46,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42323.8, 300 sec: 42098.2). Total num frames: 669876224. Throughput: 0: 42337.4. Samples: 669966160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 02:35:46,996][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 02:35:48,370][12883] Updated weights for policy 0, policy_version 40891 (0.0033) [2024-06-18 02:35:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 670089216. Throughput: 0: 42424.1. Samples: 670220080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 02:35:51,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 02:35:52,503][12883] Updated weights for policy 0, policy_version 40901 (0.0038) [2024-06-18 02:35:56,174][12883] Updated weights for policy 0, policy_version 40911 (0.0034) [2024-06-18 02:35:56,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 670318592. Throughput: 0: 42240.9. Samples: 670467800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 02:35:56,994][12645] Avg episode reward: [(0, '0.049')] [2024-06-18 02:36:00,250][12883] Updated weights for policy 0, policy_version 40921 (0.0041) [2024-06-18 02:36:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 670531584. Throughput: 0: 42393.3. Samples: 670602940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 02:36:01,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 02:36:03,728][12883] Updated weights for policy 0, policy_version 40931 (0.0029) [2024-06-18 02:36:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42154.1). 
Total num frames: 670711808. Throughput: 0: 42172.3. Samples: 670847640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 02:36:06,994][12645] Avg episode reward: [(0, '0.124')] [2024-06-18 02:36:08,359][12883] Updated weights for policy 0, policy_version 40941 (0.0033) [2024-06-18 02:36:10,224][12862] Signal inference workers to stop experience collection... (9550 times) [2024-06-18 02:36:10,264][12883] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-06-18 02:36:10,271][12862] Signal inference workers to resume experience collection... (9550 times) [2024-06-18 02:36:10,281][12883] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-06-18 02:36:11,856][12883] Updated weights for policy 0, policy_version 40951 (0.0039) [2024-06-18 02:36:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42209.9). Total num frames: 670941184. Throughput: 0: 42227.5. Samples: 671099660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:36:11,994][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 02:36:16,015][12883] Updated weights for policy 0, policy_version 40961 (0.0041) [2024-06-18 02:36:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 671170560. Throughput: 0: 42233.3. Samples: 671228720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:36:16,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 02:36:19,765][12883] Updated weights for policy 0, policy_version 40971 (0.0037) [2024-06-18 02:36:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42053.7, 300 sec: 42265.1). Total num frames: 671367168. Throughput: 0: 42015.0. Samples: 671476920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:36:21,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 02:36:23,809][12883] Updated weights for policy 0, policy_version 40981 (0.0032) [2024-06-18 02:36:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42210.0). Total num frames: 671580160. Throughput: 0: 42224.4. Samples: 671732120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:36:26,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 02:36:27,564][12883] Updated weights for policy 0, policy_version 40991 (0.0037) [2024-06-18 02:36:31,423][12883] Updated weights for policy 0, policy_version 41001 (0.0042) [2024-06-18 02:36:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42154.4). Total num frames: 671776768. Throughput: 0: 42087.4. Samples: 671860000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:36:31,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 02:36:35,240][12883] Updated weights for policy 0, policy_version 41011 (0.0047) [2024-06-18 02:36:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 671989760. Throughput: 0: 42134.7. Samples: 672116140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 02:36:36,994][12645] Avg episode reward: [(0, '0.103')] [2024-06-18 02:36:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041016_672006144.pth... [2024-06-18 02:36:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040397_661864448.pth [2024-06-18 02:36:39,422][12883] Updated weights for policy 0, policy_version 41021 (0.0041) [2024-06-18 02:36:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 672219136. Throughput: 0: 42123.6. Samples: 672363360. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 02:36:41,994][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 02:36:42,809][12883] Updated weights for policy 0, policy_version 41031 (0.0032) [2024-06-18 02:36:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42053.8, 300 sec: 42154.1). Total num frames: 672399360. Throughput: 0: 42026.2. Samples: 672494120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 02:36:46,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:36:47,010][12883] Updated weights for policy 0, policy_version 41041 (0.0023) [2024-06-18 02:36:50,938][12883] Updated weights for policy 0, policy_version 41051 (0.0038) [2024-06-18 02:36:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 672628736. Throughput: 0: 42059.2. Samples: 672740300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 02:36:51,994][12645] Avg episode reward: [(0, '0.082')] [2024-06-18 02:36:54,627][12883] Updated weights for policy 0, policy_version 41061 (0.0040) [2024-06-18 02:36:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 672841728. Throughput: 0: 42105.9. Samples: 672994420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 02:36:56,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 02:36:58,639][12883] Updated weights for policy 0, policy_version 41071 (0.0035) [2024-06-18 02:37:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 673038336. Throughput: 0: 42071.0. Samples: 673121920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:37:01,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 02:37:02,438][12883] Updated weights for policy 0, policy_version 41081 (0.0033) [2024-06-18 02:37:06,389][12883] Updated weights for policy 0, policy_version 41091 (0.0034) [2024-06-18 02:37:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42209.6). 
Total num frames: 673267712. Throughput: 0: 42109.9. Samples: 673371860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:37:06,994][12645] Avg episode reward: [(0, '0.098')] [2024-06-18 02:37:10,351][12883] Updated weights for policy 0, policy_version 41101 (0.0046) [2024-06-18 02:37:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 673480704. Throughput: 0: 41867.9. Samples: 673616180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:37:11,994][12645] Avg episode reward: [(0, '0.104')] [2024-06-18 02:37:14,127][12883] Updated weights for policy 0, policy_version 41111 (0.0025) [2024-06-18 02:37:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 42209.7). Total num frames: 673660928. Throughput: 0: 41934.3. Samples: 673747040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:37:16,994][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 02:37:18,180][12883] Updated weights for policy 0, policy_version 41121 (0.0036) [2024-06-18 02:37:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 673873920. Throughput: 0: 41863.5. Samples: 674000000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 02:37:21,994][12645] Avg episode reward: [(0, '0.083')] [2024-06-18 02:37:22,434][12883] Updated weights for policy 0, policy_version 41131 (0.0037) [2024-06-18 02:37:25,881][12883] Updated weights for policy 0, policy_version 41141 (0.0039) [2024-06-18 02:37:26,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 674119680. Throughput: 0: 41915.4. Samples: 674249560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:37:26,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 02:37:30,173][12883] Updated weights for policy 0, policy_version 41151 (0.0033) [2024-06-18 02:37:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42209.6). 
Total num frames: 674299904. Throughput: 0: 42017.3. Samples: 674384900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:37:31,994][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 02:37:33,751][12883] Updated weights for policy 0, policy_version 41161 (0.0034) [2024-06-18 02:37:34,479][12862] Signal inference workers to stop experience collection... (9600 times) [2024-06-18 02:37:34,480][12862] Signal inference workers to resume experience collection... (9600 times) [2024-06-18 02:37:34,504][12883] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-06-18 02:37:34,505][12883] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-06-18 02:37:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42209.8). Total num frames: 674529280. Throughput: 0: 41992.0. Samples: 674629940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:37:36,994][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 02:37:37,732][12883] Updated weights for policy 0, policy_version 41171 (0.0048) [2024-06-18 02:37:41,489][12883] Updated weights for policy 0, policy_version 41181 (0.0022) [2024-06-18 02:37:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 674725888. Throughput: 0: 42074.7. Samples: 674887780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:37:41,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 02:37:45,748][12883] Updated weights for policy 0, policy_version 41191 (0.0042) [2024-06-18 02:37:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 674922496. Throughput: 0: 41896.5. Samples: 675007260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:37:46,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 02:37:49,444][12883] Updated weights for policy 0, policy_version 41201 (0.0030) [2024-06-18 02:37:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 675184640. Throughput: 0: 41918.1. Samples: 675258180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:37:51,994][12645] Avg episode reward: [(0, '0.071')] [2024-06-18 02:37:53,456][12883] Updated weights for policy 0, policy_version 41211 (0.0039) [2024-06-18 02:37:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.0, 300 sec: 41987.4). Total num frames: 675332096. Throughput: 0: 42375.1. Samples: 675523060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:37:56,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:37:57,436][12883] Updated weights for policy 0, policy_version 41221 (0.0034) [2024-06-18 02:38:00,996][12883] Updated weights for policy 0, policy_version 41231 (0.0042) [2024-06-18 02:38:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 675561472. Throughput: 0: 41891.4. Samples: 675632160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:38:02,000][12645] Avg episode reward: [(0, '0.067')] [2024-06-18 02:38:05,147][12883] Updated weights for policy 0, policy_version 41241 (0.0033) [2024-06-18 02:38:06,994][12645] Fps is (10 sec: 49152.6, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 675823616. Throughput: 0: 42217.3. Samples: 675899780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:38:06,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 02:38:08,693][12883] Updated weights for policy 0, policy_version 41251 (0.0026) [2024-06-18 02:38:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 675954688. Throughput: 0: 42495.2. Samples: 676161840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:38:11,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 02:38:12,828][12883] Updated weights for policy 0, policy_version 41261 (0.0033) [2024-06-18 02:38:16,736][12883] Updated weights for policy 0, policy_version 41271 (0.0047) [2024-06-18 02:38:16,994][12645] Fps is (10 sec: 36044.3, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 676184064. Throughput: 0: 41972.7. Samples: 676273680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:38:16,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:38:20,517][12883] Updated weights for policy 0, policy_version 41281 (0.0035) [2024-06-18 02:38:21,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 676429824. Throughput: 0: 42452.4. Samples: 676540300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 02:38:21,994][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 02:38:24,229][12883] Updated weights for policy 0, policy_version 41291 (0.0040) [2024-06-18 02:38:26,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 676610048. Throughput: 0: 42447.5. Samples: 676797920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 02:38:26,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 02:38:28,087][12883] Updated weights for policy 0, policy_version 41301 (0.0043) [2024-06-18 02:38:31,617][12883] Updated weights for policy 0, policy_version 41311 (0.0026) [2024-06-18 02:38:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 676839424. Throughput: 0: 42436.1. Samples: 676916880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 02:38:31,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 02:38:35,673][12883] Updated weights for policy 0, policy_version 41321 (0.0029) [2024-06-18 02:38:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42209.6). 
Total num frames: 677052416. Throughput: 0: 42498.7. Samples: 677170620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 02:38:36,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 02:38:37,124][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041325_677068800.pth... [2024-06-18 02:38:37,176][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000040706_666927104.pth [2024-06-18 02:38:38,074][12862] Signal inference workers to stop experience collection... (9650 times) [2024-06-18 02:38:38,074][12862] Signal inference workers to resume experience collection... (9650 times) [2024-06-18 02:38:38,118][12883] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-06-18 02:38:38,118][12883] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-06-18 02:38:39,355][12883] Updated weights for policy 0, policy_version 41331 (0.0035) [2024-06-18 02:38:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 677249024. Throughput: 0: 42256.2. Samples: 677424580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 02:38:41,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 02:38:43,395][12883] Updated weights for policy 0, policy_version 41341 (0.0030) [2024-06-18 02:38:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42154.4). Total num frames: 677462016. Throughput: 0: 42540.1. Samples: 677546460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:38:46,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:38:47,547][12883] Updated weights for policy 0, policy_version 41351 (0.0037) [2024-06-18 02:38:50,992][12883] Updated weights for policy 0, policy_version 41361 (0.0042) [2024-06-18 02:38:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 677707776. Throughput: 0: 42469.8. Samples: 677810920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:38:51,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:38:55,007][12883] Updated weights for policy 0, policy_version 41371 (0.0030) [2024-06-18 02:38:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42210.0). Total num frames: 677904384. Throughput: 0: 42178.3. Samples: 678059860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:38:56,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 02:38:58,584][12883] Updated weights for policy 0, policy_version 41381 (0.0030) [2024-06-18 02:39:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 678100992. Throughput: 0: 42482.3. Samples: 678185380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:39:01,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 02:39:02,936][12883] Updated weights for policy 0, policy_version 41391 (0.0038) [2024-06-18 02:39:06,281][12883] Updated weights for policy 0, policy_version 41401 (0.0032) [2024-06-18 02:39:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 678330368. Throughput: 0: 42280.8. Samples: 678442940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:39:06,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 02:39:07,004][12862] Saving new best policy, reward=0.193! [2024-06-18 02:39:10,841][12883] Updated weights for policy 0, policy_version 41411 (0.0031) [2024-06-18 02:39:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42265.1). Total num frames: 678543360. Throughput: 0: 42078.5. Samples: 678691460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:39:11,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 02:39:13,903][12883] Updated weights for policy 0, policy_version 41421 (0.0031) [2024-06-18 02:39:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 678739968. 
Throughput: 0: 42339.9. Samples: 678822180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:39:16,994][12645] Avg episode reward: [(0, '0.031')] [2024-06-18 02:39:18,519][12883] Updated weights for policy 0, policy_version 41431 (0.0041) [2024-06-18 02:39:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 678952960. Throughput: 0: 42229.8. Samples: 679070960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:39:21,996][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 02:39:22,115][12883] Updated weights for policy 0, policy_version 41441 (0.0040) [2024-06-18 02:39:26,313][12883] Updated weights for policy 0, policy_version 41451 (0.0046) [2024-06-18 02:39:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 679165952. Throughput: 0: 42324.5. Samples: 679329180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:39:26,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 02:39:29,636][12883] Updated weights for policy 0, policy_version 41461 (0.0032) [2024-06-18 02:39:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 679362560. Throughput: 0: 42357.6. Samples: 679452560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:39:31,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 02:39:33,907][12883] Updated weights for policy 0, policy_version 41471 (0.0034) [2024-06-18 02:39:36,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42323.8, 300 sec: 42209.3). Total num frames: 679591936. Throughput: 0: 42162.3. Samples: 679708320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:39:36,996][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 02:39:37,363][12883] Updated weights for policy 0, policy_version 41481 (0.0037) [2024-06-18 02:39:41,631][12883] Updated weights for policy 0, policy_version 41491 (0.0044) [2024-06-18 02:39:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 679788544. Throughput: 0: 42252.7. Samples: 679961240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:39:41,994][12645] Avg episode reward: [(0, '0.083')] [2024-06-18 02:39:45,365][12883] Updated weights for policy 0, policy_version 41501 (0.0032) [2024-06-18 02:39:46,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 679985152. Throughput: 0: 42256.5. Samples: 680086920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:39:46,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 02:39:49,279][12883] Updated weights for policy 0, policy_version 41511 (0.0044) [2024-06-18 02:39:51,998][12645] Fps is (10 sec: 44219.9, 60 sec: 42049.5, 300 sec: 42209.1). Total num frames: 680230912. Throughput: 0: 42197.7. Samples: 680342000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:39:51,998][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 02:39:52,961][12883] Updated weights for policy 0, policy_version 41521 (0.0036) [2024-06-18 02:39:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 680427520. Throughput: 0: 42247.2. Samples: 680592580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:39:56,995][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 02:39:57,127][12883] Updated weights for policy 0, policy_version 41531 (0.0040) [2024-06-18 02:40:01,280][12883] Updated weights for policy 0, policy_version 41541 (0.0036) [2024-06-18 02:40:01,996][12645] Fps is (10 sec: 40967.1, 60 sec: 42323.8, 300 sec: 42209.3). 
Total num frames: 680640512. Throughput: 0: 42157.6. Samples: 680719360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 02:40:01,996][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 02:40:04,886][12883] Updated weights for policy 0, policy_version 41551 (0.0036) [2024-06-18 02:40:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 680853504. Throughput: 0: 42291.2. Samples: 680974060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-18 02:40:06,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:40:09,140][12883] Updated weights for policy 0, policy_version 41561 (0.0024) [2024-06-18 02:40:11,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 681066496. Throughput: 0: 42135.5. Samples: 681225280. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-18 02:40:11,994][12645] Avg episode reward: [(0, '0.072')] [2024-06-18 02:40:12,694][12883] Updated weights for policy 0, policy_version 41571 (0.0038) [2024-06-18 02:40:15,970][12862] Signal inference workers to stop experience collection... (9700 times) [2024-06-18 02:40:15,970][12862] Signal inference workers to resume experience collection... (9700 times) [2024-06-18 02:40:16,010][12883] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-06-18 02:40:16,010][12883] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-06-18 02:40:16,912][12883] Updated weights for policy 0, policy_version 41581 (0.0045) [2024-06-18 02:40:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42098.9). Total num frames: 681263104. Throughput: 0: 42093.0. Samples: 681346740. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-18 02:40:16,994][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 02:40:20,499][12883] Updated weights for policy 0, policy_version 41591 (0.0038) [2024-06-18 02:40:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 681492480. Throughput: 0: 42029.3. Samples: 681599540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-18 02:40:21,994][12645] Avg episode reward: [(0, '0.063')] [2024-06-18 02:40:24,567][12883] Updated weights for policy 0, policy_version 41601 (0.0033) [2024-06-18 02:40:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 681705472. Throughput: 0: 42141.5. Samples: 681857600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 25.0) [2024-06-18 02:40:26,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 02:40:28,366][12883] Updated weights for policy 0, policy_version 41611 (0.0034) [2024-06-18 02:40:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.5, 300 sec: 42154.1). Total num frames: 681902080. Throughput: 0: 42017.8. Samples: 681977720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:40:31,994][12645] Avg episode reward: [(0, '0.123')] [2024-06-18 02:40:32,292][12883] Updated weights for policy 0, policy_version 41621 (0.0031) [2024-06-18 02:40:36,147][12883] Updated weights for policy 0, policy_version 41631 (0.0034) [2024-06-18 02:40:36,997][12645] Fps is (10 sec: 40945.7, 60 sec: 42051.4, 300 sec: 42153.6). Total num frames: 682115072. Throughput: 0: 42078.3. Samples: 682235500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:40:36,998][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 02:40:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041633_682115072.pth... 
[2024-06-18 02:40:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041016_672006144.pth [2024-06-18 02:40:39,744][12883] Updated weights for policy 0, policy_version 41641 (0.0033) [2024-06-18 02:40:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42210.0). Total num frames: 682328064. Throughput: 0: 42113.8. Samples: 682487700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:40:41,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 02:40:43,923][12883] Updated weights for policy 0, policy_version 41651 (0.0031) [2024-06-18 02:40:46,994][12645] Fps is (10 sec: 44252.0, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 682557440. Throughput: 0: 42133.2. Samples: 682615260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:40:46,994][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 02:40:47,490][12883] Updated weights for policy 0, policy_version 41661 (0.0027) [2024-06-18 02:40:51,885][12883] Updated weights for policy 0, policy_version 41671 (0.0030) [2024-06-18 02:40:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41782.0, 300 sec: 42098.6). Total num frames: 682737664. Throughput: 0: 41993.8. Samples: 682863780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:40:51,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 02:40:55,405][12883] Updated weights for policy 0, policy_version 41681 (0.0029) [2024-06-18 02:40:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 682950656. Throughput: 0: 42007.2. Samples: 683115600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:40:56,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 02:40:59,570][12883] Updated weights for policy 0, policy_version 41691 (0.0049) [2024-06-18 02:41:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 683180032. Throughput: 0: 42177.3. Samples: 683244720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:41:01,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 02:41:03,238][12883] Updated weights for policy 0, policy_version 41701 (0.0028) [2024-06-18 02:41:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 683360256. Throughput: 0: 42180.4. Samples: 683497660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:41:06,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 02:41:07,064][12862] Saving new best policy, reward=0.209! [2024-06-18 02:41:07,390][12883] Updated weights for policy 0, policy_version 41711 (0.0024) [2024-06-18 02:41:11,038][12883] Updated weights for policy 0, policy_version 41721 (0.0036) [2024-06-18 02:41:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 683573248. Throughput: 0: 41980.8. Samples: 683746740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:41:11,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 02:41:15,339][12883] Updated weights for policy 0, policy_version 41731 (0.0034) [2024-06-18 02:41:16,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 683819008. Throughput: 0: 42151.9. Samples: 683874560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:41:16,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 02:41:18,723][12883] Updated weights for policy 0, policy_version 41741 (0.0034) [2024-06-18 02:41:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 684015616. Throughput: 0: 41999.2. Samples: 684125320. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 02:41:21,994][12645] Avg episode reward: [(0, '0.053')] [2024-06-18 02:41:23,257][12883] Updated weights for policy 0, policy_version 41751 (0.0039) [2024-06-18 02:41:26,377][12883] Updated weights for policy 0, policy_version 41761 (0.0044) [2024-06-18 02:41:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 684228608. Throughput: 0: 41946.5. Samples: 684375300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:41:26,994][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 02:41:31,065][12883] Updated weights for policy 0, policy_version 41771 (0.0033) [2024-06-18 02:41:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 684425216. Throughput: 0: 41972.9. Samples: 684504040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:41:31,994][12645] Avg episode reward: [(0, '0.104')] [2024-06-18 02:41:34,348][12883] Updated weights for policy 0, policy_version 41781 (0.0047) [2024-06-18 02:41:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42327.8, 300 sec: 42154.1). Total num frames: 684654592. Throughput: 0: 42066.2. Samples: 684756760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:41:36,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 02:41:38,758][12883] Updated weights for policy 0, policy_version 41791 (0.0035) [2024-06-18 02:41:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 684851200. Throughput: 0: 42071.6. Samples: 685008820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:41:41,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 02:41:42,120][12883] Updated weights for policy 0, policy_version 41801 (0.0047) [2024-06-18 02:41:46,354][12883] Updated weights for policy 0, policy_version 41811 (0.0029) [2024-06-18 02:41:46,994][12645] Fps is (10 sec: 39320.9, 60 sec: 41506.0, 300 sec: 42098.5). 
Total num frames: 685047808. Throughput: 0: 41887.9. Samples: 685129680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 02:41:46,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 02:41:49,972][12883] Updated weights for policy 0, policy_version 41821 (0.0034) [2024-06-18 02:41:51,160][12862] Signal inference workers to stop experience collection... (9750 times) [2024-06-18 02:41:51,160][12862] Signal inference workers to resume experience collection... (9750 times) [2024-06-18 02:41:51,173][12883] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-06-18 02:41:51,174][12883] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-06-18 02:41:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 685277184. Throughput: 0: 41962.7. Samples: 685385980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:41:51,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 02:41:54,069][12883] Updated weights for policy 0, policy_version 41831 (0.0037) [2024-06-18 02:41:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 685473792. Throughput: 0: 42104.5. Samples: 685641440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:41:56,994][12645] Avg episode reward: [(0, '0.016')] [2024-06-18 02:41:58,045][12883] Updated weights for policy 0, policy_version 41841 (0.0035) [2024-06-18 02:42:01,783][12883] Updated weights for policy 0, policy_version 41851 (0.0037) [2024-06-18 02:42:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 685703168. Throughput: 0: 41858.2. Samples: 685758180. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:42:01,994][12645] Avg episode reward: [(0, '0.138')] [2024-06-18 02:42:05,646][12883] Updated weights for policy 0, policy_version 41861 (0.0030) [2024-06-18 02:42:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 685916160. Throughput: 0: 41978.2. Samples: 686014340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:42:06,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:42:09,465][12883] Updated weights for policy 0, policy_version 41871 (0.0029) [2024-06-18 02:42:11,998][12645] Fps is (10 sec: 39306.6, 60 sec: 42049.6, 300 sec: 42153.5). Total num frames: 686096384. Throughput: 0: 42147.6. Samples: 686272100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:42:11,998][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 02:42:13,463][12883] Updated weights for policy 0, policy_version 41881 (0.0026) [2024-06-18 02:42:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 686325760. Throughput: 0: 41831.9. Samples: 686386480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:42:16,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 02:42:17,207][12883] Updated weights for policy 0, policy_version 41891 (0.0029) [2024-06-18 02:42:21,600][12883] Updated weights for policy 0, policy_version 41901 (0.0030) [2024-06-18 02:42:21,994][12645] Fps is (10 sec: 42615.1, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 686522368. Throughput: 0: 41909.8. Samples: 686642700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:42:21,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 02:42:24,970][12883] Updated weights for policy 0, policy_version 41911 (0.0040) [2024-06-18 02:42:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 686735360. Throughput: 0: 41875.9. Samples: 686893240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:42:26,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 02:42:29,285][12883] Updated weights for policy 0, policy_version 41921 (0.0033) [2024-06-18 02:42:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 686964736. Throughput: 0: 41977.5. Samples: 687018660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:42:31,994][12645] Avg episode reward: [(0, '0.126')] [2024-06-18 02:42:32,594][12883] Updated weights for policy 0, policy_version 41931 (0.0031) [2024-06-18 02:42:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 687128576. Throughput: 0: 41962.3. Samples: 687274280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:42:36,994][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 02:42:37,166][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041941_687161344.pth... [2024-06-18 02:42:37,170][12883] Updated weights for policy 0, policy_version 41941 (0.0037) [2024-06-18 02:42:37,222][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041325_677068800.pth [2024-06-18 02:42:40,398][12883] Updated weights for policy 0, policy_version 41951 (0.0027) [2024-06-18 02:42:41,994][12645] Fps is (10 sec: 37681.3, 60 sec: 41505.7, 300 sec: 42098.5). Total num frames: 687341568. Throughput: 0: 41801.3. Samples: 687522520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 02:42:41,995][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 02:42:45,179][12883] Updated weights for policy 0, policy_version 41961 (0.0030) [2024-06-18 02:42:46,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 687587328. Throughput: 0: 41995.2. Samples: 687647960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:42:46,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 02:42:48,142][12883] Updated weights for policy 0, policy_version 41971 (0.0036) [2024-06-18 02:42:51,994][12645] Fps is (10 sec: 40961.6, 60 sec: 41233.0, 300 sec: 42098.6). Total num frames: 687751168. Throughput: 0: 41930.2. Samples: 687901200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:42:51,995][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 02:42:52,771][12883] Updated weights for policy 0, policy_version 41981 (0.0043) [2024-06-18 02:42:56,128][12883] Updated weights for policy 0, policy_version 41991 (0.0032) [2024-06-18 02:42:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 687980544. Throughput: 0: 41756.9. Samples: 688151000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:42:56,994][12645] Avg episode reward: [(0, '0.070')] [2024-06-18 02:43:00,662][12883] Updated weights for policy 0, policy_version 42001 (0.0038) [2024-06-18 02:43:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 688209920. Throughput: 0: 42094.2. Samples: 688280720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:43:01,995][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 02:43:04,297][12883] Updated weights for policy 0, policy_version 42011 (0.0032) [2024-06-18 02:43:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 42154.1). Total num frames: 688390144. Throughput: 0: 41776.8. Samples: 688522660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:43:06,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:43:08,710][12883] Updated weights for policy 0, policy_version 42021 (0.0030) [2024-06-18 02:43:10,289][12862] Signal inference workers to stop experience collection... 
(9800 times) [2024-06-18 02:43:10,316][12883] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-06-18 02:43:10,400][12862] Signal inference workers to resume experience collection... (9800 times) [2024-06-18 02:43:10,400][12883] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-06-18 02:43:11,935][12883] Updated weights for policy 0, policy_version 42031 (0.0022) [2024-06-18 02:43:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42328.0, 300 sec: 42209.6). Total num frames: 688635904. Throughput: 0: 41777.7. Samples: 688773240. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 02:43:11,994][12645] Avg episode reward: [(0, '0.119')] [2024-06-18 02:43:16,690][12883] Updated weights for policy 0, policy_version 42041 (0.0034) [2024-06-18 02:43:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 688816128. Throughput: 0: 41833.8. Samples: 688901180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 02:43:16,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 02:43:19,909][12883] Updated weights for policy 0, policy_version 42051 (0.0034) [2024-06-18 02:43:21,996][12645] Fps is (10 sec: 39312.9, 60 sec: 41777.6, 300 sec: 42098.2). Total num frames: 689029120. Throughput: 0: 41624.5. Samples: 689147480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 02:43:21,997][12645] Avg episode reward: [(0, '0.098')] [2024-06-18 02:43:24,213][12883] Updated weights for policy 0, policy_version 42061 (0.0037) [2024-06-18 02:43:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 689274880. Throughput: 0: 41630.2. Samples: 689395860. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 02:43:26,994][12645] Avg episode reward: [(0, '0.123')] [2024-06-18 02:43:27,407][12883] Updated weights for policy 0, policy_version 42071 (0.0033) [2024-06-18 02:43:31,994][12645] Fps is (10 sec: 40969.8, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 689438720. Throughput: 0: 41787.2. Samples: 689528380. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 02:43:31,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:43:32,352][12883] Updated weights for policy 0, policy_version 42081 (0.0027) [2024-06-18 02:43:35,167][12883] Updated weights for policy 0, policy_version 42091 (0.0033) [2024-06-18 02:43:36,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 689651712. Throughput: 0: 41548.4. Samples: 689770880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 25.0) [2024-06-18 02:43:36,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 02:43:40,262][12883] Updated weights for policy 0, policy_version 42101 (0.0030) [2024-06-18 02:43:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.6, 300 sec: 42043.0). Total num frames: 689864704. Throughput: 0: 41685.4. Samples: 690026840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 02:43:41,994][12645] Avg episode reward: [(0, '0.036')] [2024-06-18 02:43:43,139][12883] Updated weights for policy 0, policy_version 42111 (0.0044) [2024-06-18 02:43:46,997][12645] Fps is (10 sec: 42586.5, 60 sec: 41504.1, 300 sec: 41931.5). Total num frames: 690077696. Throughput: 0: 41596.5. Samples: 690152680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 02:43:46,997][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 02:43:47,790][12883] Updated weights for policy 0, policy_version 42121 (0.0029) [2024-06-18 02:43:51,162][12883] Updated weights for policy 0, policy_version 42131 (0.0027) [2024-06-18 02:43:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42043.0). 
Total num frames: 690307072. Throughput: 0: 41766.2. Samples: 690402140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 02:43:51,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 02:43:55,642][12883] Updated weights for policy 0, policy_version 42141 (0.0035) [2024-06-18 02:43:56,994][12645] Fps is (10 sec: 42610.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 690503680. Throughput: 0: 41828.9. Samples: 690655540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 02:43:56,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 02:43:58,948][12883] Updated weights for policy 0, policy_version 42151 (0.0056) [2024-06-18 02:44:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 690683904. Throughput: 0: 41630.6. Samples: 690774560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 02:44:01,994][12645] Avg episode reward: [(0, '0.044')] [2024-06-18 02:44:03,377][12883] Updated weights for policy 0, policy_version 42161 (0.0040) [2024-06-18 02:44:06,637][12883] Updated weights for policy 0, policy_version 42171 (0.0033) [2024-06-18 02:44:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 690929664. Throughput: 0: 41855.0. Samples: 691030860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 02:44:06,994][12645] Avg episode reward: [(0, '0.105')] [2024-06-18 02:44:11,344][12883] Updated weights for policy 0, policy_version 42181 (0.0034) [2024-06-18 02:44:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 691109888. Throughput: 0: 41896.9. Samples: 691281220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 02:44:11,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 02:44:14,643][12883] Updated weights for policy 0, policy_version 42191 (0.0033) [2024-06-18 02:44:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 691306496. 
Throughput: 0: 41595.4. Samples: 691400180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 02:44:16,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 02:44:19,507][12883] Updated weights for policy 0, policy_version 42201 (0.0051) [2024-06-18 02:44:21,996][12645] Fps is (10 sec: 42588.9, 60 sec: 41779.2, 300 sec: 41931.6). Total num frames: 691535872. Throughput: 0: 41767.8. Samples: 691650520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 02:44:21,996][12645] Avg episode reward: [(0, '0.044')] [2024-06-18 02:44:22,562][12883] Updated weights for policy 0, policy_version 42211 (0.0043) [2024-06-18 02:44:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 40959.9, 300 sec: 41931.9). Total num frames: 691732480. Throughput: 0: 41855.0. Samples: 691910320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 02:44:26,995][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 02:44:27,110][12883] Updated weights for policy 0, policy_version 42221 (0.0031) [2024-06-18 02:44:30,574][12883] Updated weights for policy 0, policy_version 42231 (0.0053) [2024-06-18 02:44:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.2, 300 sec: 41932.2). Total num frames: 691961856. Throughput: 0: 41741.4. Samples: 692030920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:44:31,994][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 02:44:34,600][12883] Updated weights for policy 0, policy_version 42241 (0.0028) [2024-06-18 02:44:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 692174848. Throughput: 0: 41788.5. Samples: 692282620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:44:36,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 02:44:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042247_692174848.pth... 
[2024-06-18 02:44:37,098][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041633_682115072.pth [2024-06-18 02:44:38,598][12883] Updated weights for policy 0, policy_version 42251 (0.0047) [2024-06-18 02:44:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 692387840. Throughput: 0: 41867.9. Samples: 692539600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:44:42,000][12645] Avg episode reward: [(0, '0.067')] [2024-06-18 02:44:42,442][12883] Updated weights for policy 0, policy_version 42261 (0.0034) [2024-06-18 02:44:46,550][12883] Updated weights for policy 0, policy_version 42271 (0.0038) [2024-06-18 02:44:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42054.2, 300 sec: 41932.5). Total num frames: 692600832. Throughput: 0: 41906.7. Samples: 692660360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:44:46,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:44:50,072][12883] Updated weights for policy 0, policy_version 42281 (0.0038) [2024-06-18 02:44:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 692813824. Throughput: 0: 41909.0. Samples: 692916760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:44:51,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 02:44:54,169][12883] Updated weights for policy 0, policy_version 42291 (0.0033) [2024-06-18 02:44:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41876.7). Total num frames: 692994048. Throughput: 0: 41868.0. Samples: 693165280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:44:56,994][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 02:44:58,507][12883] Updated weights for policy 0, policy_version 42301 (0.0043) [2024-06-18 02:44:59,302][12862] Signal inference workers to stop experience collection... 
(9850 times) [2024-06-18 02:44:59,302][12862] Signal inference workers to resume experience collection... (9850 times) [2024-06-18 02:44:59,325][12883] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-06-18 02:44:59,325][12883] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-06-18 02:45:01,879][12883] Updated weights for policy 0, policy_version 42311 (0.0052) [2024-06-18 02:45:01,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 693223424. Throughput: 0: 41944.0. Samples: 693287660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:45:01,994][12645] Avg episode reward: [(0, '0.147')] [2024-06-18 02:45:06,260][12883] Updated weights for policy 0, policy_version 42321 (0.0047) [2024-06-18 02:45:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 693420032. Throughput: 0: 41980.4. Samples: 693539540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:45:06,994][12645] Avg episode reward: [(0, '0.071')] [2024-06-18 02:45:09,456][12883] Updated weights for policy 0, policy_version 42331 (0.0047) [2024-06-18 02:45:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 693633024. Throughput: 0: 41956.6. Samples: 693798360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:45:11,994][12645] Avg episode reward: [(0, '0.022')] [2024-06-18 02:45:13,990][12883] Updated weights for policy 0, policy_version 42341 (0.0038) [2024-06-18 02:45:16,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 693862400. Throughput: 0: 41989.8. Samples: 693920460. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:45:16,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:45:17,627][12883] Updated weights for policy 0, policy_version 42351 (0.0030) [2024-06-18 02:45:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41507.7, 300 sec: 41765.3). Total num frames: 694026240. Throughput: 0: 41819.1. Samples: 694164480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 02:45:21,999][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 02:45:22,025][12883] Updated weights for policy 0, policy_version 42361 (0.0031) [2024-06-18 02:45:25,636][12883] Updated weights for policy 0, policy_version 42371 (0.0032) [2024-06-18 02:45:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 694255616. Throughput: 0: 41722.2. Samples: 694417100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:45:26,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 02:45:29,657][12883] Updated weights for policy 0, policy_version 42381 (0.0043) [2024-06-18 02:45:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 41932.4). Total num frames: 694484992. Throughput: 0: 41812.0. Samples: 694541900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:45:31,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 02:45:33,407][12883] Updated weights for policy 0, policy_version 42391 (0.0030) [2024-06-18 02:45:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 694665216. Throughput: 0: 41709.2. Samples: 694793680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:45:36,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 02:45:37,567][12883] Updated weights for policy 0, policy_version 42401 (0.0027) [2024-06-18 02:45:41,143][12883] Updated weights for policy 0, policy_version 42411 (0.0035) [2024-06-18 02:45:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41820.8). 
Total num frames: 694894592. Throughput: 0: 41637.7. Samples: 695038980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:45:41,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 02:45:45,199][12883] Updated weights for policy 0, policy_version 42421 (0.0031) [2024-06-18 02:45:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 695107584. Throughput: 0: 41885.5. Samples: 695172500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:45:46,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 02:45:48,932][12883] Updated weights for policy 0, policy_version 42431 (0.0036) [2024-06-18 02:45:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 695287808. Throughput: 0: 41959.5. Samples: 695427720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 02:45:51,994][12645] Avg episode reward: [(0, '0.017')] [2024-06-18 02:45:53,253][12883] Updated weights for policy 0, policy_version 42441 (0.0045) [2024-06-18 02:45:56,813][12883] Updated weights for policy 0, policy_version 42451 (0.0041) [2024-06-18 02:45:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 695517184. Throughput: 0: 41662.1. Samples: 695673160. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 02:45:56,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:46:00,974][12883] Updated weights for policy 0, policy_version 42461 (0.0043) [2024-06-18 02:46:01,993][12645] Fps is (10 sec: 42598.8, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 695713792. Throughput: 0: 41736.6. Samples: 695798600. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 02:46:01,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:46:04,547][12883] Updated weights for policy 0, policy_version 42471 (0.0039) [2024-06-18 02:46:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41778.9, 300 sec: 41876.4). Total num frames: 695926784. 
Throughput: 0: 41930.4. Samples: 696051360. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 02:46:06,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 02:46:08,638][12883] Updated weights for policy 0, policy_version 42481 (0.0027) [2024-06-18 02:46:11,995][12645] Fps is (10 sec: 44228.9, 60 sec: 42051.1, 300 sec: 41820.6). Total num frames: 696156160. Throughput: 0: 41847.0. Samples: 696300280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 02:46:11,996][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 02:46:12,258][12883] Updated weights for policy 0, policy_version 42491 (0.0036) [2024-06-18 02:46:16,246][12883] Updated weights for policy 0, policy_version 42501 (0.0029) [2024-06-18 02:46:16,996][12645] Fps is (10 sec: 44227.9, 60 sec: 41777.6, 300 sec: 41876.1). Total num frames: 696369152. Throughput: 0: 42051.3. Samples: 696434300. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 02:46:17,005][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 02:46:20,020][12883] Updated weights for policy 0, policy_version 42511 (0.0040) [2024-06-18 02:46:21,994][12645] Fps is (10 sec: 40966.3, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 696565760. Throughput: 0: 42127.5. Samples: 696689420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:46:21,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 02:46:24,052][12883] Updated weights for policy 0, policy_version 42521 (0.0038) [2024-06-18 02:46:26,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 696778752. Throughput: 0: 42176.0. Samples: 696936900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:46:26,994][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 02:46:27,987][12883] Updated weights for policy 0, policy_version 42531 (0.0037) [2024-06-18 02:46:31,990][12883] Updated weights for policy 0, policy_version 42541 (0.0029) [2024-06-18 02:46:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 696991744. Throughput: 0: 42055.1. Samples: 697064980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:46:31,994][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 02:46:35,884][12883] Updated weights for policy 0, policy_version 42551 (0.0029) [2024-06-18 02:46:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 697204736. Throughput: 0: 41954.6. Samples: 697315680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:46:36,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 02:46:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042555_697221120.pth... [2024-06-18 02:46:37,103][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000041941_687161344.pth [2024-06-18 02:46:39,807][12883] Updated weights for policy 0, policy_version 42561 (0.0032) [2024-06-18 02:46:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 697401344. Throughput: 0: 42119.9. Samples: 697568560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:46:41,995][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 02:46:42,896][12862] Signal inference workers to stop experience collection... (9900 times) [2024-06-18 02:46:42,921][12883] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-06-18 02:46:43,006][12862] Signal inference workers to resume experience collection... 
(9900 times) [2024-06-18 02:46:43,006][12883] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-06-18 02:46:43,371][12883] Updated weights for policy 0, policy_version 42571 (0.0031) [2024-06-18 02:46:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 697597952. Throughput: 0: 42127.3. Samples: 697694340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 02:46:46,994][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 02:46:47,587][12883] Updated weights for policy 0, policy_version 42581 (0.0031) [2024-06-18 02:46:51,002][12883] Updated weights for policy 0, policy_version 42591 (0.0027) [2024-06-18 02:46:51,994][12645] Fps is (10 sec: 44238.0, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 697843712. Throughput: 0: 42265.2. Samples: 697953280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 02:46:51,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 02:46:55,340][12883] Updated weights for policy 0, policy_version 42601 (0.0039) [2024-06-18 02:46:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 698040320. Throughput: 0: 42257.5. Samples: 698201800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 02:46:56,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 02:46:59,078][12883] Updated weights for policy 0, policy_version 42611 (0.0037) [2024-06-18 02:47:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 698253312. Throughput: 0: 42138.1. Samples: 698330420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 02:47:01,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 02:47:03,074][12883] Updated weights for policy 0, policy_version 42621 (0.0028) [2024-06-18 02:47:06,904][12883] Updated weights for policy 0, policy_version 42631 (0.0033) [2024-06-18 02:47:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 41932.5). 
Total num frames: 698466304. Throughput: 0: 42013.8. Samples: 698580040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 02:47:06,994][12645] Avg episode reward: [(0, '0.075')] [2024-06-18 02:47:10,919][12883] Updated weights for policy 0, policy_version 42641 (0.0037) [2024-06-18 02:47:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.4, 300 sec: 41876.4). Total num frames: 698679296. Throughput: 0: 42225.3. Samples: 698837040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 02:47:11,994][12645] Avg episode reward: [(0, '0.062')] [2024-06-18 02:47:14,555][12883] Updated weights for policy 0, policy_version 42651 (0.0031) [2024-06-18 02:47:16,996][12645] Fps is (10 sec: 42589.5, 60 sec: 42052.3, 300 sec: 41931.6). Total num frames: 698892288. Throughput: 0: 42264.6. Samples: 698966980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:47:16,996][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 02:47:18,575][12883] Updated weights for policy 0, policy_version 42661 (0.0026) [2024-06-18 02:47:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 699105280. Throughput: 0: 42261.3. Samples: 699217440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:47:21,994][12645] Avg episode reward: [(0, '0.137')] [2024-06-18 02:47:22,172][12883] Updated weights for policy 0, policy_version 42671 (0.0040) [2024-06-18 02:47:26,159][12883] Updated weights for policy 0, policy_version 42681 (0.0023) [2024-06-18 02:47:26,998][12645] Fps is (10 sec: 40951.7, 60 sec: 42049.3, 300 sec: 41820.3). Total num frames: 699301888. Throughput: 0: 42388.6. Samples: 699476220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:47:26,998][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 02:47:29,722][12883] Updated weights for policy 0, policy_version 42691 (0.0046) [2024-06-18 02:47:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41987.5). 
Total num frames: 699514880. Throughput: 0: 42376.9. Samples: 699601300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:47:31,994][12645] Avg episode reward: [(0, '0.044')] [2024-06-18 02:47:33,839][12883] Updated weights for policy 0, policy_version 42701 (0.0041) [2024-06-18 02:47:36,994][12645] Fps is (10 sec: 44255.7, 60 sec: 42325.4, 300 sec: 42043.1). Total num frames: 699744256. Throughput: 0: 42348.8. Samples: 699858980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 02:47:36,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 02:47:37,491][12883] Updated weights for policy 0, policy_version 42711 (0.0028) [2024-06-18 02:47:41,653][12883] Updated weights for policy 0, policy_version 42721 (0.0040) [2024-06-18 02:47:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 41876.4). Total num frames: 699940864. Throughput: 0: 42436.1. Samples: 700111420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 02:47:41,994][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 02:47:45,351][12883] Updated weights for policy 0, policy_version 42731 (0.0029) [2024-06-18 02:47:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42098.6). Total num frames: 700170240. Throughput: 0: 42360.0. Samples: 700236620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 02:47:46,994][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 02:47:49,432][12883] Updated weights for policy 0, policy_version 42741 (0.0036) [2024-06-18 02:47:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 700366848. Throughput: 0: 42338.7. Samples: 700485280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 02:47:51,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 02:47:53,313][12883] Updated weights for policy 0, policy_version 42751 (0.0040) [2024-06-18 02:47:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 700579840. 
Throughput: 0: 42347.3. Samples: 700742660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 02:47:56,994][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 02:47:57,454][12883] Updated weights for policy 0, policy_version 42761 (0.0030) [2024-06-18 02:48:01,118][12883] Updated weights for policy 0, policy_version 42771 (0.0045) [2024-06-18 02:48:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 700792832. Throughput: 0: 42227.4. Samples: 700867120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 02:48:01,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 02:48:05,394][12883] Updated weights for policy 0, policy_version 42781 (0.0032) [2024-06-18 02:48:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 700989440. Throughput: 0: 42199.1. Samples: 701116400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 02:48:06,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 02:48:08,893][12883] Updated weights for policy 0, policy_version 42791 (0.0031) [2024-06-18 02:48:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 701202432. Throughput: 0: 42077.7. Samples: 701369540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:48:11,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 02:48:13,132][12883] Updated weights for policy 0, policy_version 42801 (0.0046) [2024-06-18 02:48:16,633][12883] Updated weights for policy 0, policy_version 42811 (0.0040) [2024-06-18 02:48:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42053.8, 300 sec: 41987.8). Total num frames: 701415424. Throughput: 0: 42082.7. Samples: 701495020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:48:16,994][12645] Avg episode reward: [(0, '0.038')] [2024-06-18 02:48:20,744][12883] Updated weights for policy 0, policy_version 42821 (0.0028) [2024-06-18 02:48:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 41931.9). Total num frames: 701644800. Throughput: 0: 42058.3. Samples: 701751600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:48:21,994][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 02:48:24,602][12883] Updated weights for policy 0, policy_version 42831 (0.0041) [2024-06-18 02:48:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42601.4, 300 sec: 42098.5). Total num frames: 701857792. Throughput: 0: 41973.3. Samples: 702000220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:48:26,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 02:48:28,247][12883] Updated weights for policy 0, policy_version 42841 (0.0043) [2024-06-18 02:48:31,025][12862] Signal inference workers to stop experience collection... (9950 times) [2024-06-18 02:48:31,025][12862] Signal inference workers to resume experience collection... (9950 times) [2024-06-18 02:48:31,055][12883] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-06-18 02:48:31,055][12883] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-06-18 02:48:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 702038016. Throughput: 0: 42033.2. Samples: 702128120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 02:48:31,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 02:48:32,100][12862] Saving new best policy, reward=0.363! 
[2024-06-18 02:48:32,351][12883] Updated weights for policy 0, policy_version 42851 (0.0034) [2024-06-18 02:48:35,781][12883] Updated weights for policy 0, policy_version 42861 (0.0042) [2024-06-18 02:48:37,000][12645] Fps is (10 sec: 40934.5, 60 sec: 42047.9, 300 sec: 42042.1). Total num frames: 702267392. Throughput: 0: 41940.5. Samples: 702372860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:48:37,000][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 02:48:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042863_702267392.pth... [2024-06-18 02:48:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042247_692174848.pth [2024-06-18 02:48:40,360][12883] Updated weights for policy 0, policy_version 42871 (0.0030) [2024-06-18 02:48:41,996][12645] Fps is (10 sec: 44227.3, 60 sec: 42323.7, 300 sec: 42043.1). Total num frames: 702480384. Throughput: 0: 41926.3. Samples: 702629440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:48:41,996][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 02:48:43,497][12883] Updated weights for policy 0, policy_version 42881 (0.0028) [2024-06-18 02:48:46,994][12645] Fps is (10 sec: 40985.8, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 702676992. Throughput: 0: 42118.7. Samples: 702762460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:48:46,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 02:48:47,790][12883] Updated weights for policy 0, policy_version 42891 (0.0028) [2024-06-18 02:48:50,867][12883] Updated weights for policy 0, policy_version 42901 (0.0036) [2024-06-18 02:48:51,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 702889984. Throughput: 0: 42044.5. Samples: 703008400. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:48:51,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 02:48:55,426][12883] Updated weights for policy 0, policy_version 42911 (0.0040) [2024-06-18 02:48:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 703119360. Throughput: 0: 42030.7. Samples: 703260920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:48:56,994][12645] Avg episode reward: [(0, '0.034')] [2024-06-18 02:48:58,969][12883] Updated weights for policy 0, policy_version 42921 (0.0040) [2024-06-18 02:49:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 703299584. Throughput: 0: 42068.0. Samples: 703388080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:49:01,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 02:49:03,514][12883] Updated weights for policy 0, policy_version 42931 (0.0043) [2024-06-18 02:49:06,765][12883] Updated weights for policy 0, policy_version 42941 (0.0032) [2024-06-18 02:49:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 703545344. Throughput: 0: 42019.5. Samples: 703642480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:49:06,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 02:49:11,139][12883] Updated weights for policy 0, policy_version 42951 (0.0033) [2024-06-18 02:49:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 703741952. Throughput: 0: 41978.7. Samples: 703889260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:49:11,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 02:49:14,511][12883] Updated weights for policy 0, policy_version 42961 (0.0046) [2024-06-18 02:49:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42043.3). Total num frames: 703938560. Throughput: 0: 41933.8. Samples: 704015140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:49:16,994][12645] Avg episode reward: [(0, '0.096')] [2024-06-18 02:49:18,754][12883] Updated weights for policy 0, policy_version 42971 (0.0041) [2024-06-18 02:49:21,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.7, 300 sec: 42209.3). Total num frames: 704184320. Throughput: 0: 42146.9. Samples: 704269300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:49:21,996][12645] Avg episode reward: [(0, '0.031')] [2024-06-18 02:49:22,583][12883] Updated weights for policy 0, policy_version 42981 (0.0027) [2024-06-18 02:49:26,446][12883] Updated weights for policy 0, policy_version 42991 (0.0038) [2024-06-18 02:49:26,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 704380928. Throughput: 0: 42034.7. Samples: 704521000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:49:26,996][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 02:49:30,666][12883] Updated weights for policy 0, policy_version 43001 (0.0025) [2024-06-18 02:49:31,994][12645] Fps is (10 sec: 36052.9, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 704544768. Throughput: 0: 41898.2. Samples: 704647880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:49:31,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 02:49:34,071][12883] Updated weights for policy 0, policy_version 43011 (0.0038) [2024-06-18 02:49:36,994][12645] Fps is (10 sec: 39330.3, 60 sec: 41783.6, 300 sec: 41987.5). Total num frames: 704774144. Throughput: 0: 42019.1. Samples: 704899260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:49:36,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 02:49:38,215][12883] Updated weights for policy 0, policy_version 43021 (0.0040) [2024-06-18 02:49:41,164][12862] Signal inference workers to stop experience collection... 
(10000 times) [2024-06-18 02:49:41,223][12862] Signal inference workers to resume experience collection... (10000 times) [2024-06-18 02:49:41,224][12883] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-06-18 02:49:41,240][12883] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-06-18 02:49:41,698][12883] Updated weights for policy 0, policy_version 43031 (0.0027) [2024-06-18 02:49:41,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42326.9, 300 sec: 42098.6). Total num frames: 705019904. Throughput: 0: 42038.7. Samples: 705152660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:49:41,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 02:49:46,509][12883] Updated weights for policy 0, policy_version 43041 (0.0048) [2024-06-18 02:49:46,995][12645] Fps is (10 sec: 40956.6, 60 sec: 41778.6, 300 sec: 41931.8). Total num frames: 705183744. Throughput: 0: 42122.3. Samples: 705283620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:49:46,995][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 02:49:49,390][12883] Updated weights for policy 0, policy_version 43051 (0.0040) [2024-06-18 02:49:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 705429504. Throughput: 0: 42148.4. Samples: 705539160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:49:51,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 02:49:53,938][12883] Updated weights for policy 0, policy_version 43061 (0.0029) [2024-06-18 02:49:56,996][12645] Fps is (10 sec: 47507.1, 60 sec: 42323.8, 300 sec: 42153.8). Total num frames: 705658880. Throughput: 0: 42249.0. Samples: 705790560. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 02:49:56,996][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 02:49:57,225][12883] Updated weights for policy 0, policy_version 43071 (0.0032) [2024-06-18 02:50:01,763][12883] Updated weights for policy 0, policy_version 43081 (0.0025) [2024-06-18 02:50:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 705839104. Throughput: 0: 42325.7. Samples: 705919800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:50:01,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 02:50:05,238][12883] Updated weights for policy 0, policy_version 43091 (0.0022) [2024-06-18 02:50:06,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 706068480. Throughput: 0: 42295.3. Samples: 706172500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:50:06,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 02:50:09,297][12883] Updated weights for policy 0, policy_version 43101 (0.0033) [2024-06-18 02:50:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 706281472. Throughput: 0: 42490.5. Samples: 706432980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:50:11,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 02:50:12,820][12883] Updated weights for policy 0, policy_version 43111 (0.0036) [2024-06-18 02:50:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 706478080. Throughput: 0: 42458.1. Samples: 706558500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:50:16,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 02:50:17,266][12883] Updated weights for policy 0, policy_version 43121 (0.0038) [2024-06-18 02:50:20,996][12883] Updated weights for policy 0, policy_version 43131 (0.0034) [2024-06-18 02:50:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41780.8, 300 sec: 42154.1). 
Total num frames: 706691072. Throughput: 0: 42445.0. Samples: 706809280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:50:21,994][12645] Avg episode reward: [(0, '0.082')] [2024-06-18 02:50:24,934][12883] Updated weights for policy 0, policy_version 43141 (0.0033) [2024-06-18 02:50:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42053.9, 300 sec: 42098.6). Total num frames: 706904064. Throughput: 0: 42555.2. Samples: 707067640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:50:26,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 02:50:28,651][12883] Updated weights for policy 0, policy_version 43151 (0.0030) [2024-06-18 02:50:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 707117056. Throughput: 0: 42324.4. Samples: 707188180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:50:31,994][12645] Avg episode reward: [(0, '0.015')] [2024-06-18 02:50:32,703][12883] Updated weights for policy 0, policy_version 43161 (0.0022) [2024-06-18 02:50:36,200][12883] Updated weights for policy 0, policy_version 43171 (0.0033) [2024-06-18 02:50:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 707330048. Throughput: 0: 42291.9. Samples: 707442300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:50:36,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 02:50:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043172_707330048.pth... [2024-06-18 02:50:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042555_697221120.pth [2024-06-18 02:50:40,374][12883] Updated weights for policy 0, policy_version 43181 (0.0030) [2024-06-18 02:50:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 707543040. Throughput: 0: 42377.6. Samples: 707697460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:50:41,994][12645] Avg episode reward: [(0, '0.049')] [2024-06-18 02:50:44,256][12883] Updated weights for policy 0, policy_version 43191 (0.0031) [2024-06-18 02:50:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42872.1, 300 sec: 42265.2). Total num frames: 707756032. Throughput: 0: 42284.1. Samples: 707822580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:50:46,994][12645] Avg episode reward: [(0, '0.038')] [2024-06-18 02:50:48,262][12883] Updated weights for policy 0, policy_version 43201 (0.0043) [2024-06-18 02:50:49,159][12862] Signal inference workers to stop experience collection... (10050 times) [2024-06-18 02:50:49,195][12883] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-06-18 02:50:49,206][12862] Signal inference workers to resume experience collection... (10050 times) [2024-06-18 02:50:49,216][12883] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-06-18 02:50:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 707952640. Throughput: 0: 42450.8. Samples: 708082780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 02:50:51,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 02:50:52,299][12883] Updated weights for policy 0, policy_version 43211 (0.0034) [2024-06-18 02:50:56,206][12883] Updated weights for policy 0, policy_version 43221 (0.0031) [2024-06-18 02:50:56,993][12645] Fps is (10 sec: 40960.5, 60 sec: 41780.8, 300 sec: 42209.6). Total num frames: 708165632. Throughput: 0: 42257.0. Samples: 708334540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 02:50:56,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 02:50:59,926][12883] Updated weights for policy 0, policy_version 43231 (0.0030) [2024-06-18 02:51:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 708378624. Throughput: 0: 42250.3. 
Samples: 708459760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 02:51:01,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 02:51:03,882][12883] Updated weights for policy 0, policy_version 43241 (0.0039) [2024-06-18 02:51:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 42098.8). Total num frames: 708575232. Throughput: 0: 42309.6. Samples: 708713220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 02:51:06,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 02:51:07,631][12883] Updated weights for policy 0, policy_version 43251 (0.0027) [2024-06-18 02:51:11,621][12883] Updated weights for policy 0, policy_version 43261 (0.0024) [2024-06-18 02:51:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42154.4). Total num frames: 708804608. Throughput: 0: 42253.5. Samples: 708969060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 02:51:11,994][12645] Avg episode reward: [(0, '0.064')] [2024-06-18 02:51:15,136][12883] Updated weights for policy 0, policy_version 43271 (0.0044) [2024-06-18 02:51:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 709033984. Throughput: 0: 42407.8. Samples: 709096540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 02:51:16,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 02:51:19,045][12883] Updated weights for policy 0, policy_version 43281 (0.0036) [2024-06-18 02:51:21,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 709230592. Throughput: 0: 42422.2. Samples: 709351300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:51:21,995][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 02:51:22,818][12883] Updated weights for policy 0, policy_version 43291 (0.0033) [2024-06-18 02:51:26,673][12883] Updated weights for policy 0, policy_version 43301 (0.0032) [2024-06-18 02:51:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 709443584. Throughput: 0: 42468.9. Samples: 709608560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:51:26,995][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 02:51:30,552][12883] Updated weights for policy 0, policy_version 43311 (0.0028) [2024-06-18 02:51:31,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 709689344. Throughput: 0: 42540.5. Samples: 709736900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:51:31,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:51:34,547][12883] Updated weights for policy 0, policy_version 43321 (0.0027) [2024-06-18 02:51:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 709853184. Throughput: 0: 42347.0. Samples: 709988400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:51:36,994][12645] Avg episode reward: [(0, '0.070')] [2024-06-18 02:51:38,287][12883] Updated weights for policy 0, policy_version 43331 (0.0031) [2024-06-18 02:51:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 710066176. Throughput: 0: 42370.1. Samples: 710241200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:51:41,994][12645] Avg episode reward: [(0, '0.082')] [2024-06-18 02:51:42,452][12883] Updated weights for policy 0, policy_version 43341 (0.0028) [2024-06-18 02:51:46,075][12883] Updated weights for policy 0, policy_version 43351 (0.0037) [2024-06-18 02:51:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42209.6). 
Total num frames: 710295552. Throughput: 0: 42338.2. Samples: 710364980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 02:51:46,994][12645] Avg episode reward: [(0, '0.098')] [2024-06-18 02:51:50,099][12883] Updated weights for policy 0, policy_version 43361 (0.0037) [2024-06-18 02:51:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 710508544. Throughput: 0: 42401.3. Samples: 710621280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:51:51,996][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 02:51:53,816][12883] Updated weights for policy 0, policy_version 43371 (0.0043) [2024-06-18 02:51:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 710705152. Throughput: 0: 42209.1. Samples: 710868460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:51:56,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 02:51:57,869][12883] Updated weights for policy 0, policy_version 43381 (0.0027) [2024-06-18 02:52:01,422][12883] Updated weights for policy 0, policy_version 43391 (0.0027) [2024-06-18 02:52:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 710934528. Throughput: 0: 42168.6. Samples: 710994120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:52:01,994][12645] Avg episode reward: [(0, '0.020')] [2024-06-18 02:52:05,660][12883] Updated weights for policy 0, policy_version 43401 (0.0032) [2024-06-18 02:52:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 711114752. Throughput: 0: 42154.3. Samples: 711248240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:52:06,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 02:52:09,297][12883] Updated weights for policy 0, policy_version 43411 (0.0037) [2024-06-18 02:52:09,663][12862] Signal inference workers to stop experience collection... 
(10100 times) [2024-06-18 02:52:09,663][12862] Signal inference workers to resume experience collection... (10100 times) [2024-06-18 02:52:09,689][12883] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-06-18 02:52:09,689][12883] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-06-18 02:52:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.4, 300 sec: 42154.4). Total num frames: 711327744. Throughput: 0: 42016.9. Samples: 711499320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 02:52:11,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 02:52:13,829][12883] Updated weights for policy 0, policy_version 43421 (0.0028) [2024-06-18 02:52:16,990][12883] Updated weights for policy 0, policy_version 43431 (0.0027) [2024-06-18 02:52:16,995][12645] Fps is (10 sec: 45868.4, 60 sec: 42324.4, 300 sec: 42265.0). Total num frames: 711573504. Throughput: 0: 41997.7. Samples: 711626860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 02:52:16,996][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 02:52:21,719][12883] Updated weights for policy 0, policy_version 43441 (0.0037) [2024-06-18 02:52:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42210.2). Total num frames: 711753728. Throughput: 0: 42018.7. Samples: 711879240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 02:52:21,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 02:52:24,576][12883] Updated weights for policy 0, policy_version 43451 (0.0029) [2024-06-18 02:52:26,994][12645] Fps is (10 sec: 39327.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 711966720. Throughput: 0: 42012.9. Samples: 712131780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 02:52:26,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 02:52:29,186][12883] Updated weights for policy 0, policy_version 43461 (0.0040) [2024-06-18 02:52:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 42154.1). Total num frames: 712179712. Throughput: 0: 42044.9. Samples: 712257000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 02:52:31,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 02:52:32,866][12883] Updated weights for policy 0, policy_version 43471 (0.0036) [2024-06-18 02:52:36,990][12883] Updated weights for policy 0, policy_version 43481 (0.0038) [2024-06-18 02:52:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 712392704. Throughput: 0: 41921.9. Samples: 712507760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 02:52:36,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 02:52:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043481_712392704.pth... [2024-06-18 02:52:37,060][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000042863_702267392.pth [2024-06-18 02:52:40,724][12883] Updated weights for policy 0, policy_version 43491 (0.0036) [2024-06-18 02:52:42,000][12645] Fps is (10 sec: 42572.5, 60 sec: 42320.9, 300 sec: 42153.2). Total num frames: 712605696. Throughput: 0: 41907.1. Samples: 712754540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 02:52:42,000][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 02:52:44,747][12883] Updated weights for policy 0, policy_version 43501 (0.0026) [2024-06-18 02:52:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 712802304. Throughput: 0: 42034.1. Samples: 712885660. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 02:52:46,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 02:52:48,474][12883] Updated weights for policy 0, policy_version 43511 (0.0036) [2024-06-18 02:52:51,994][12645] Fps is (10 sec: 42624.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 713031680. Throughput: 0: 42092.4. Samples: 713142400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 02:52:51,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 02:52:52,162][12883] Updated weights for policy 0, policy_version 43521 (0.0032) [2024-06-18 02:52:56,234][12883] Updated weights for policy 0, policy_version 43531 (0.0037) [2024-06-18 02:52:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 713244672. Throughput: 0: 42004.0. Samples: 713389500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 02:52:56,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 02:53:00,117][12883] Updated weights for policy 0, policy_version 43541 (0.0036) [2024-06-18 02:53:01,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41777.6, 300 sec: 42209.3). Total num frames: 713441280. Throughput: 0: 42052.2. Samples: 713519240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 02:53:01,996][12645] Avg episode reward: [(0, '0.082')] [2024-06-18 02:53:03,955][12883] Updated weights for policy 0, policy_version 43551 (0.0037) [2024-06-18 02:53:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 713654272. Throughput: 0: 42093.4. Samples: 713773440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 02:53:06,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 02:53:07,863][12883] Updated weights for policy 0, policy_version 43561 (0.0025) [2024-06-18 02:53:11,610][12883] Updated weights for policy 0, policy_version 43571 (0.0035) [2024-06-18 02:53:11,994][12645] Fps is (10 sec: 45886.0, 60 sec: 42871.6, 300 sec: 42320.7). 
Total num frames: 713900032. Throughput: 0: 42139.7. Samples: 714028060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:53:11,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 02:53:15,440][12883] Updated weights for policy 0, policy_version 43581 (0.0043) [2024-06-18 02:53:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41780.2, 300 sec: 42154.1). Total num frames: 714080256. Throughput: 0: 42291.7. Samples: 714160120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:53:16,994][12645] Avg episode reward: [(0, '0.019')] [2024-06-18 02:53:19,339][12883] Updated weights for policy 0, policy_version 43591 (0.0039) [2024-06-18 02:53:21,994][12645] Fps is (10 sec: 36044.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 714260480. Throughput: 0: 42209.7. Samples: 714407200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:53:21,994][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 02:53:23,258][12883] Updated weights for policy 0, policy_version 43601 (0.0040) [2024-06-18 02:53:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 714522624. Throughput: 0: 42349.0. Samples: 714659980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:53:26,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 02:53:26,998][12883] Updated weights for policy 0, policy_version 43611 (0.0029) [2024-06-18 02:53:30,787][12883] Updated weights for policy 0, policy_version 43621 (0.0028) [2024-06-18 02:53:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42210.5). Total num frames: 714719232. Throughput: 0: 42451.5. Samples: 714795980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:53:31,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 02:53:34,085][12862] Signal inference workers to stop experience collection... 
(10150 times) [2024-06-18 02:53:34,132][12883] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-06-18 02:53:34,141][12862] Signal inference workers to resume experience collection... (10150 times) [2024-06-18 02:53:34,151][12883] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-06-18 02:53:34,719][12883] Updated weights for policy 0, policy_version 43631 (0.0034) [2024-06-18 02:53:36,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41779.2, 300 sec: 42098.9). Total num frames: 714899456. Throughput: 0: 42213.8. Samples: 715042020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 02:53:36,994][12645] Avg episode reward: [(0, '0.072')] [2024-06-18 02:53:38,557][12883] Updated weights for policy 0, policy_version 43641 (0.0030) [2024-06-18 02:53:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42602.9, 300 sec: 42320.7). Total num frames: 715161600. Throughput: 0: 42413.4. Samples: 715298100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 02:53:41,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 02:53:42,178][12883] Updated weights for policy 0, policy_version 43651 (0.0027) [2024-06-18 02:53:46,227][12883] Updated weights for policy 0, policy_version 43661 (0.0031) [2024-06-18 02:53:46,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 715358208. Throughput: 0: 42418.6. Samples: 715427980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 02:53:46,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 02:53:50,218][12883] Updated weights for policy 0, policy_version 43671 (0.0025) [2024-06-18 02:53:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 715554816. Throughput: 0: 42375.0. Samples: 715680320. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 02:53:51,994][12645] Avg episode reward: [(0, '0.063')] [2024-06-18 02:53:54,159][12883] Updated weights for policy 0, policy_version 43681 (0.0039) [2024-06-18 02:53:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 715784192. Throughput: 0: 42410.9. Samples: 715936560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 02:53:56,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 02:53:57,876][12883] Updated weights for policy 0, policy_version 43691 (0.0034) [2024-06-18 02:54:01,743][12883] Updated weights for policy 0, policy_version 43701 (0.0025) [2024-06-18 02:54:01,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42873.0, 300 sec: 42265.1). Total num frames: 716013568. Throughput: 0: 42482.5. Samples: 716071840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 02:54:01,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 02:54:05,500][12883] Updated weights for policy 0, policy_version 43711 (0.0040) [2024-06-18 02:54:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 716177408. Throughput: 0: 42456.4. Samples: 716317740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 02:54:06,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 02:54:09,351][12883] Updated weights for policy 0, policy_version 43721 (0.0028) [2024-06-18 02:54:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 716423168. Throughput: 0: 42574.1. Samples: 716575820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 02:54:11,995][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 02:54:13,444][12883] Updated weights for policy 0, policy_version 43731 (0.0031) [2024-06-18 02:54:16,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42210.0). Total num frames: 716636160. Throughput: 0: 42486.8. Samples: 716707880. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 02:54:16,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 02:54:17,094][12883] Updated weights for policy 0, policy_version 43741 (0.0034) [2024-06-18 02:54:21,223][12883] Updated weights for policy 0, policy_version 43751 (0.0035) [2024-06-18 02:54:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42154.4). Total num frames: 716816384. Throughput: 0: 42451.9. Samples: 716952360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 02:54:21,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 02:54:24,295][12862] Signal inference workers to stop experience collection... (10200 times) [2024-06-18 02:54:24,296][12862] Signal inference workers to resume experience collection... (10200 times) [2024-06-18 02:54:24,339][12883] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-06-18 02:54:24,339][12883] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-06-18 02:54:24,757][12883] Updated weights for policy 0, policy_version 43761 (0.0028) [2024-06-18 02:54:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 717062144. Throughput: 0: 42643.1. Samples: 717217040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 02:54:26,994][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 02:54:28,846][12883] Updated weights for policy 0, policy_version 43771 (0.0027) [2024-06-18 02:54:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 717258752. Throughput: 0: 42687.9. Samples: 717348940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 02:54:31,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 02:54:32,571][12883] Updated weights for policy 0, policy_version 43781 (0.0022) [2024-06-18 02:54:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 717455360. Throughput: 0: 42461.8. 
Samples: 717591100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 02:54:36,994][12645] Avg episode reward: [(0, '0.038')] [2024-06-18 02:54:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043790_717455360.pth... [2024-06-18 02:54:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043172_707330048.pth [2024-06-18 02:54:37,235][12883] Updated weights for policy 0, policy_version 43791 (0.0044) [2024-06-18 02:54:40,413][12883] Updated weights for policy 0, policy_version 43801 (0.0032) [2024-06-18 02:54:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42431.9). Total num frames: 717701120. Throughput: 0: 42451.7. Samples: 717846880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 02:54:41,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 02:54:44,896][12883] Updated weights for policy 0, policy_version 43811 (0.0030) [2024-06-18 02:54:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 717881344. Throughput: 0: 42289.1. Samples: 717974840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 02:54:46,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 02:54:48,229][12883] Updated weights for policy 0, policy_version 43821 (0.0042) [2024-06-18 02:54:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42209.9). Total num frames: 718110720. Throughput: 0: 42256.1. Samples: 718219260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 02:54:51,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 02:54:52,798][12883] Updated weights for policy 0, policy_version 43831 (0.0035) [2024-06-18 02:54:55,825][12883] Updated weights for policy 0, policy_version 43841 (0.0033) [2024-06-18 02:54:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 718340096. Throughput: 0: 42301.3. Samples: 718479380. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 02:54:56,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 02:55:00,551][12883] Updated weights for policy 0, policy_version 43851 (0.0039) [2024-06-18 02:55:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 718520320. Throughput: 0: 42416.3. Samples: 718616620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:55:01,994][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 02:55:03,400][12883] Updated weights for policy 0, policy_version 43861 (0.0037) [2024-06-18 02:55:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 718733312. Throughput: 0: 42401.4. Samples: 718860420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:55:06,994][12645] Avg episode reward: [(0, '0.139')] [2024-06-18 02:55:08,202][12883] Updated weights for policy 0, policy_version 43871 (0.0033) [2024-06-18 02:55:11,342][12883] Updated weights for policy 0, policy_version 43881 (0.0029) [2024-06-18 02:55:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 718979072. Throughput: 0: 42110.6. Samples: 719112020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:55:11,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 02:55:15,918][12883] Updated weights for policy 0, policy_version 43891 (0.0035) [2024-06-18 02:55:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 719175680. Throughput: 0: 42106.2. Samples: 719243720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:55:16,994][12645] Avg episode reward: [(0, '0.161')] [2024-06-18 02:55:19,175][12883] Updated weights for policy 0, policy_version 43901 (0.0031) [2024-06-18 02:55:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 719388672. Throughput: 0: 42197.7. Samples: 719490000. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:55:21,994][12645] Avg episode reward: [(0, '0.161')] [2024-06-18 02:55:23,639][12883] Updated weights for policy 0, policy_version 43911 (0.0038) [2024-06-18 02:55:26,979][12883] Updated weights for policy 0, policy_version 43921 (0.0043) [2024-06-18 02:55:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 719601664. Throughput: 0: 42312.4. Samples: 719750940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 02:55:26,994][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 02:55:31,313][12883] Updated weights for policy 0, policy_version 43931 (0.0028) [2024-06-18 02:55:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 719798272. Throughput: 0: 42120.3. Samples: 719870260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-18 02:55:31,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 02:55:34,858][12883] Updated weights for policy 0, policy_version 43941 (0.0031) [2024-06-18 02:55:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 720027648. Throughput: 0: 42338.3. Samples: 720124480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-18 02:55:36,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 02:55:38,442][12862] Signal inference workers to stop experience collection... (10250 times) [2024-06-18 02:55:38,443][12862] Signal inference workers to resume experience collection... (10250 times) [2024-06-18 02:55:38,480][12883] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-06-18 02:55:38,480][12883] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-06-18 02:55:38,821][12883] Updated weights for policy 0, policy_version 43951 (0.0032) [2024-06-18 02:55:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 720224256. Throughput: 0: 42340.5. 
Samples: 720384700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-18 02:55:42,000][12645] Avg episode reward: [(0, '0.104')] [2024-06-18 02:55:42,557][12883] Updated weights for policy 0, policy_version 43961 (0.0039) [2024-06-18 02:55:46,492][12883] Updated weights for policy 0, policy_version 43971 (0.0031) [2024-06-18 02:55:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 720420864. Throughput: 0: 42059.5. Samples: 720509300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-18 02:55:46,994][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 02:55:49,935][12883] Updated weights for policy 0, policy_version 43981 (0.0028) [2024-06-18 02:55:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 720666624. Throughput: 0: 42225.4. Samples: 720760560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 20.0) [2024-06-18 02:55:51,994][12645] Avg episode reward: [(0, '0.137')] [2024-06-18 02:55:54,687][12883] Updated weights for policy 0, policy_version 43991 (0.0038) [2024-06-18 02:55:56,998][12645] Fps is (10 sec: 44219.6, 60 sec: 42049.5, 300 sec: 42320.1). Total num frames: 720863232. Throughput: 0: 42256.6. Samples: 721013740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:55:56,998][12645] Avg episode reward: [(0, '0.088')] [2024-06-18 02:55:58,219][12883] Updated weights for policy 0, policy_version 44001 (0.0041) [2024-06-18 02:56:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 721059840. Throughput: 0: 42122.2. Samples: 721139220. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:56:01,994][12645] Avg episode reward: [(0, '0.174')] [2024-06-18 02:56:02,259][12883] Updated weights for policy 0, policy_version 44011 (0.0029) [2024-06-18 02:56:05,873][12883] Updated weights for policy 0, policy_version 44021 (0.0054) [2024-06-18 02:56:06,994][12645] Fps is (10 sec: 40976.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 721272832. Throughput: 0: 42240.4. Samples: 721390820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:56:06,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 02:56:10,131][12883] Updated weights for policy 0, policy_version 44031 (0.0038) [2024-06-18 02:56:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 721485824. Throughput: 0: 42104.8. Samples: 721645660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:56:11,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 02:56:13,699][12883] Updated weights for policy 0, policy_version 44041 (0.0044) [2024-06-18 02:56:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 721698816. Throughput: 0: 42140.5. Samples: 721766580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:56:16,994][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 02:56:17,867][12883] Updated weights for policy 0, policy_version 44051 (0.0034) [2024-06-18 02:56:21,418][12883] Updated weights for policy 0, policy_version 44061 (0.0028) [2024-06-18 02:56:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 721895424. Throughput: 0: 41941.7. Samples: 722011860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 02:56:21,994][12645] Avg episode reward: [(0, '0.064')] [2024-06-18 02:56:25,644][12883] Updated weights for policy 0, policy_version 44071 (0.0027) [2024-06-18 02:56:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42098.5). 
Total num frames: 722108416. Throughput: 0: 42109.6. Samples: 722279640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 02:56:26,994][12645] Avg episode reward: [(0, '0.075')] [2024-06-18 02:56:29,070][12883] Updated weights for policy 0, policy_version 44081 (0.0039) [2024-06-18 02:56:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 722321408. Throughput: 0: 42056.6. Samples: 722401840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 02:56:31,994][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 02:56:33,153][12883] Updated weights for policy 0, policy_version 44091 (0.0023) [2024-06-18 02:56:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 42265.1). Total num frames: 722534400. Throughput: 0: 41931.0. Samples: 722647460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 02:56:36,995][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 02:56:37,106][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044101_722550784.pth... [2024-06-18 02:56:37,121][12883] Updated weights for policy 0, policy_version 44101 (0.0042) [2024-06-18 02:56:37,162][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043481_712392704.pth [2024-06-18 02:56:40,822][12883] Updated weights for policy 0, policy_version 44111 (0.0034) [2024-06-18 02:56:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 722731008. Throughput: 0: 42049.2. Samples: 722905780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 02:56:41,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 02:56:45,212][12883] Updated weights for policy 0, policy_version 44121 (0.0053) [2024-06-18 02:56:46,994][12645] Fps is (10 sec: 39322.3, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 722927616. Throughput: 0: 41817.4. Samples: 723021000. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 02:56:46,994][12645] Avg episode reward: [(0, '0.138')] [2024-06-18 02:56:47,651][12862] Signal inference workers to stop experience collection... (10300 times) [2024-06-18 02:56:47,651][12862] Signal inference workers to resume experience collection... (10300 times) [2024-06-18 02:56:47,700][12883] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-06-18 02:56:47,700][12883] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-06-18 02:56:48,855][12883] Updated weights for policy 0, policy_version 44131 (0.0032) [2024-06-18 02:56:51,996][12645] Fps is (10 sec: 44226.5, 60 sec: 41777.6, 300 sec: 42264.9). Total num frames: 723173376. Throughput: 0: 41910.0. Samples: 723276860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:56:51,996][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 02:56:52,791][12883] Updated weights for policy 0, policy_version 44141 (0.0033) [2024-06-18 02:56:56,741][12883] Updated weights for policy 0, policy_version 44151 (0.0034) [2024-06-18 02:56:56,994][12645] Fps is (10 sec: 44236.0, 60 sec: 41781.9, 300 sec: 42154.1). Total num frames: 723369984. Throughput: 0: 42003.0. Samples: 723535800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:56:56,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 02:57:00,438][12883] Updated weights for policy 0, policy_version 44161 (0.0041) [2024-06-18 02:57:01,994][12645] Fps is (10 sec: 39330.2, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 723566592. Throughput: 0: 42132.5. Samples: 723662540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:57:01,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 02:57:04,343][12883] Updated weights for policy 0, policy_version 44171 (0.0043) [2024-06-18 02:57:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 723795968. Throughput: 0: 42252.4. 
Samples: 723913220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:57:06,994][12645] Avg episode reward: [(0, '0.097')] [2024-06-18 02:57:08,258][12883] Updated weights for policy 0, policy_version 44181 (0.0029) [2024-06-18 02:57:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42154.3). Total num frames: 724008960. Throughput: 0: 41799.8. Samples: 724160620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:57:11,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 02:57:12,711][12883] Updated weights for policy 0, policy_version 44191 (0.0036) [2024-06-18 02:57:16,419][12883] Updated weights for policy 0, policy_version 44201 (0.0040) [2024-06-18 02:57:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 724189184. Throughput: 0: 41782.6. Samples: 724282060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 25.0) [2024-06-18 02:57:16,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 02:57:20,266][12883] Updated weights for policy 0, policy_version 44211 (0.0038) [2024-06-18 02:57:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 724434944. Throughput: 0: 41978.3. Samples: 724536480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:57:21,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 02:57:24,645][12883] Updated weights for policy 0, policy_version 44221 (0.0054) [2024-06-18 02:57:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 724631552. Throughput: 0: 41758.1. Samples: 724784900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:57:26,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 02:57:27,882][12883] Updated weights for policy 0, policy_version 44231 (0.0034) [2024-06-18 02:57:31,996][12645] Fps is (10 sec: 39313.1, 60 sec: 41777.6, 300 sec: 42153.8). Total num frames: 724828160. Throughput: 0: 42005.4. Samples: 724911340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:57:31,996][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 02:57:32,257][12883] Updated weights for policy 0, policy_version 44241 (0.0041) [2024-06-18 02:57:35,594][12883] Updated weights for policy 0, policy_version 44251 (0.0031) [2024-06-18 02:57:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42155.0). Total num frames: 725041152. Throughput: 0: 41990.6. Samples: 725166340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:57:36,994][12645] Avg episode reward: [(0, '0.064')] [2024-06-18 02:57:40,224][12883] Updated weights for policy 0, policy_version 44261 (0.0026) [2024-06-18 02:57:41,996][12645] Fps is (10 sec: 44236.7, 60 sec: 42323.7, 300 sec: 42264.9). Total num frames: 725270528. Throughput: 0: 41758.5. Samples: 725415020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:57:41,996][12645] Avg episode reward: [(0, '0.082')] [2024-06-18 02:57:43,402][12883] Updated weights for policy 0, policy_version 44271 (0.0039) [2024-06-18 02:57:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 725467136. Throughput: 0: 41867.5. Samples: 725546580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 02:57:46,994][12645] Avg episode reward: [(0, '0.147')] [2024-06-18 02:57:47,624][12883] Updated weights for policy 0, policy_version 44281 (0.0027) [2024-06-18 02:57:51,429][12883] Updated weights for policy 0, policy_version 44291 (0.0051) [2024-06-18 02:57:51,994][12645] Fps is (10 sec: 39330.2, 60 sec: 41507.7, 300 sec: 42098.5). Total num frames: 725663744. Throughput: 0: 41774.3. Samples: 725793060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 02:57:51,994][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 02:57:55,350][12883] Updated weights for policy 0, policy_version 44301 (0.0034) [2024-06-18 02:57:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42209.9). 
Total num frames: 725893120. Throughput: 0: 41907.4. Samples: 726046460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 02:57:56,996][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 02:57:58,855][12883] Updated weights for policy 0, policy_version 44311 (0.0040) [2024-06-18 02:58:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 726089728. Throughput: 0: 42170.7. Samples: 726179740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 02:58:01,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 02:58:03,110][12883] Updated weights for policy 0, policy_version 44321 (0.0032) [2024-06-18 02:58:06,369][12883] Updated weights for policy 0, policy_version 44331 (0.0020) [2024-06-18 02:58:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 726319104. Throughput: 0: 42206.7. Samples: 726435780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 02:58:06,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 02:58:10,794][12883] Updated weights for policy 0, policy_version 44341 (0.0031) [2024-06-18 02:58:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 726532096. Throughput: 0: 42307.5. Samples: 726688740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 02:58:11,994][12645] Avg episode reward: [(0, '0.119')] [2024-06-18 02:58:14,387][12883] Updated weights for policy 0, policy_version 44351 (0.0041) [2024-06-18 02:58:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 726728704. Throughput: 0: 42275.3. Samples: 726813640. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 02:58:16,994][12645] Avg episode reward: [(0, '0.154')] [2024-06-18 02:58:18,228][12883] Updated weights for policy 0, policy_version 44361 (0.0034) [2024-06-18 02:58:19,558][12862] Signal inference workers to stop experience collection... 
(10350 times) [2024-06-18 02:58:19,558][12862] Signal inference workers to resume experience collection... (10350 times) [2024-06-18 02:58:19,588][12883] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-06-18 02:58:19,589][12883] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-06-18 02:58:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 726958080. Throughput: 0: 42256.8. Samples: 727067900. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 02:58:21,994][12645] Avg episode reward: [(0, '0.104')] [2024-06-18 02:58:22,049][12883] Updated weights for policy 0, policy_version 44371 (0.0040) [2024-06-18 02:58:26,888][12883] Updated weights for policy 0, policy_version 44381 (0.0043) [2024-06-18 02:58:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 727138304. Throughput: 0: 42468.7. Samples: 727326020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 02:58:26,994][12645] Avg episode reward: [(0, '0.083')] [2024-06-18 02:58:29,625][12883] Updated weights for policy 0, policy_version 44391 (0.0038) [2024-06-18 02:58:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42326.8, 300 sec: 42265.1). Total num frames: 727367680. Throughput: 0: 42071.5. Samples: 727439800. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 02:58:31,994][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 02:58:34,789][12883] Updated weights for policy 0, policy_version 44401 (0.0038) [2024-06-18 02:58:36,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 727597056. Throughput: 0: 42280.9. Samples: 727695700. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 02:58:36,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 02:58:37,149][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044410_727613440.pth... 
[2024-06-18 02:58:37,201][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000043790_717455360.pth [2024-06-18 02:58:37,906][12883] Updated weights for policy 0, policy_version 44411 (0.0036) [2024-06-18 02:58:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41780.8, 300 sec: 42098.5). Total num frames: 727777280. Throughput: 0: 42346.8. Samples: 727952060. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 02:58:41,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 02:58:42,313][12883] Updated weights for policy 0, policy_version 44421 (0.0032) [2024-06-18 02:58:45,400][12883] Updated weights for policy 0, policy_version 44431 (0.0037) [2024-06-18 02:58:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 728023040. Throughput: 0: 42167.5. Samples: 728077280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 02:58:46,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 02:58:50,100][12883] Updated weights for policy 0, policy_version 44441 (0.0029) [2024-06-18 02:58:51,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 728219648. Throughput: 0: 42303.4. Samples: 728339440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 02:58:51,995][12645] Avg episode reward: [(0, '0.088')] [2024-06-18 02:58:53,062][12883] Updated weights for policy 0, policy_version 44451 (0.0039) [2024-06-18 02:58:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 728432640. Throughput: 0: 42284.9. Samples: 728591560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 02:58:56,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 02:58:57,820][12883] Updated weights for policy 0, policy_version 44461 (0.0052) [2024-06-18 02:59:00,601][12883] Updated weights for policy 0, policy_version 44471 (0.0033) [2024-06-18 02:59:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42320.7). 
Total num frames: 728662016. Throughput: 0: 42340.9. Samples: 728718980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 02:59:02,003][12645] Avg episode reward: [(0, '0.075')] [2024-06-18 02:59:05,481][12883] Updated weights for policy 0, policy_version 44481 (0.0041) [2024-06-18 02:59:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 728842240. Throughput: 0: 42414.2. Samples: 728976540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 02:59:06,994][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 02:59:08,210][12883] Updated weights for policy 0, policy_version 44491 (0.0033) [2024-06-18 02:59:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 729055232. Throughput: 0: 42140.8. Samples: 729222360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 02:59:11,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 02:59:13,218][12883] Updated weights for policy 0, policy_version 44501 (0.0030) [2024-06-18 02:59:16,625][12883] Updated weights for policy 0, policy_version 44511 (0.0035) [2024-06-18 02:59:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 729284608. Throughput: 0: 42613.1. Samples: 729357380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:59:16,994][12645] Avg episode reward: [(0, '0.119')] [2024-06-18 02:59:20,907][12883] Updated weights for policy 0, policy_version 44521 (0.0039) [2024-06-18 02:59:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 729464832. Throughput: 0: 42481.2. Samples: 729607360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:59:22,003][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 02:59:24,175][12883] Updated weights for policy 0, policy_version 44531 (0.0028) [2024-06-18 02:59:26,995][12645] Fps is (10 sec: 42592.1, 60 sec: 42870.5, 300 sec: 42209.4). Total num frames: 729710592. 
Throughput: 0: 42334.6. Samples: 729857180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:59:26,996][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 02:59:28,692][12883] Updated weights for policy 0, policy_version 44541 (0.0042) [2024-06-18 02:59:30,869][12862] Signal inference workers to stop experience collection... (10400 times) [2024-06-18 02:59:30,917][12883] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-06-18 02:59:30,989][12862] Signal inference workers to resume experience collection... (10400 times) [2024-06-18 02:59:30,989][12883] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-06-18 02:59:31,963][12883] Updated weights for policy 0, policy_version 44551 (0.0036) [2024-06-18 02:59:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 729923584. Throughput: 0: 42555.5. Samples: 729992280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:59:31,994][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 02:59:36,423][12883] Updated weights for policy 0, policy_version 44561 (0.0027) [2024-06-18 02:59:36,994][12645] Fps is (10 sec: 39326.8, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 730103808. Throughput: 0: 42293.4. Samples: 730242640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 02:59:36,994][12645] Avg episode reward: [(0, '0.205')] [2024-06-18 02:59:39,526][12883] Updated weights for policy 0, policy_version 44571 (0.0040) [2024-06-18 02:59:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 730349568. Throughput: 0: 42238.7. Samples: 730492300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:59:41,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 02:59:44,191][12883] Updated weights for policy 0, policy_version 44581 (0.0038) [2024-06-18 02:59:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42154.1). 
Total num frames: 730546176. Throughput: 0: 42322.8. Samples: 730623500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:59:46,994][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 02:59:47,232][12883] Updated weights for policy 0, policy_version 44591 (0.0035) [2024-06-18 02:59:51,763][12883] Updated weights for policy 0, policy_version 44601 (0.0045) [2024-06-18 02:59:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 730742784. Throughput: 0: 42184.0. Samples: 730874820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:59:51,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 02:59:54,869][12883] Updated weights for policy 0, policy_version 44611 (0.0038) [2024-06-18 02:59:56,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42596.8, 300 sec: 42264.9). Total num frames: 730988544. Throughput: 0: 42343.4. Samples: 731127900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 02:59:56,997][12645] Avg episode reward: [(0, '0.099')] [2024-06-18 02:59:59,533][12883] Updated weights for policy 0, policy_version 44621 (0.0039) [2024-06-18 03:00:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 731185152. Throughput: 0: 42270.2. Samples: 731259540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 03:00:01,994][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 03:00:02,757][12883] Updated weights for policy 0, policy_version 44631 (0.0026) [2024-06-18 03:00:06,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 731381760. Throughput: 0: 42237.9. Samples: 731508060. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 03:00:06,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 03:00:07,350][12883] Updated weights for policy 0, policy_version 44641 (0.0034) [2024-06-18 03:00:10,802][12883] Updated weights for policy 0, policy_version 44651 (0.0041) [2024-06-18 03:00:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 731627520. Throughput: 0: 42256.5. Samples: 731758660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:00:11,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 03:00:15,213][12883] Updated weights for policy 0, policy_version 44661 (0.0030) [2024-06-18 03:00:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 731807744. Throughput: 0: 42256.9. Samples: 731893840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:00:16,994][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 03:00:18,403][12883] Updated weights for policy 0, policy_version 44671 (0.0039) [2024-06-18 03:00:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 732020736. Throughput: 0: 42265.0. Samples: 732144560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:00:21,994][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 03:00:22,857][12883] Updated weights for policy 0, policy_version 44681 (0.0030) [2024-06-18 03:00:26,050][12883] Updated weights for policy 0, policy_version 44691 (0.0026) [2024-06-18 03:00:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42326.3, 300 sec: 42209.6). Total num frames: 732250112. Throughput: 0: 42339.6. Samples: 732397580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:00:26,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 03:00:30,381][12883] Updated weights for policy 0, policy_version 44701 (0.0030) [2024-06-18 03:00:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42043.0). 
Total num frames: 732430336. Throughput: 0: 42391.5. Samples: 732531120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:00:31,994][12645] Avg episode reward: [(0, '0.063')] [2024-06-18 03:00:33,623][12883] Updated weights for policy 0, policy_version 44711 (0.0036) [2024-06-18 03:00:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 732659712. Throughput: 0: 42425.2. Samples: 732783960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:00:36,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 03:00:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044718_732659712.pth... [2024-06-18 03:00:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044101_722550784.pth [2024-06-18 03:00:38,196][12883] Updated weights for policy 0, policy_version 44721 (0.0022) [2024-06-18 03:00:41,106][12883] Updated weights for policy 0, policy_version 44731 (0.0033) [2024-06-18 03:00:41,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 732905472. Throughput: 0: 42472.6. Samples: 733039080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:00:41,994][12645] Avg episode reward: [(0, '0.099')] [2024-06-18 03:00:45,915][12883] Updated weights for policy 0, policy_version 44741 (0.0026) [2024-06-18 03:00:46,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 733085696. Throughput: 0: 42389.9. Samples: 733167080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:00:46,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 03:00:49,226][12883] Updated weights for policy 0, policy_version 44751 (0.0035) [2024-06-18 03:00:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42210.2). Total num frames: 733315072. Throughput: 0: 42432.8. Samples: 733417540. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:00:51,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 03:00:53,447][12883] Updated weights for policy 0, policy_version 44761 (0.0026) [2024-06-18 03:00:56,723][12883] Updated weights for policy 0, policy_version 44771 (0.0028) [2024-06-18 03:00:56,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42326.8, 300 sec: 42265.2). Total num frames: 733528064. Throughput: 0: 42588.3. Samples: 733675140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:00:56,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 03:01:01,178][12883] Updated weights for policy 0, policy_version 44781 (0.0031) [2024-06-18 03:01:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 733724672. Throughput: 0: 42392.0. Samples: 733801480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:01:01,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 03:01:04,465][12883] Updated weights for policy 0, policy_version 44791 (0.0035) [2024-06-18 03:01:06,081][12862] Signal inference workers to stop experience collection... (10450 times) [2024-06-18 03:01:06,119][12883] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-06-18 03:01:06,139][12862] Signal inference workers to resume experience collection... (10450 times) [2024-06-18 03:01:06,144][12883] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-06-18 03:01:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 733954048. Throughput: 0: 42516.0. Samples: 734057780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:01:06,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 03:01:08,889][12883] Updated weights for policy 0, policy_version 44801 (0.0037) [2024-06-18 03:01:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 734167040. Throughput: 0: 42457.8. 
Samples: 734308180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:01:11,994][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 03:01:12,193][12883] Updated weights for policy 0, policy_version 44811 (0.0044) [2024-06-18 03:01:16,526][12883] Updated weights for policy 0, policy_version 44821 (0.0049) [2024-06-18 03:01:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 734380032. Throughput: 0: 42359.6. Samples: 734437300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:01:16,994][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 03:01:19,751][12883] Updated weights for policy 0, policy_version 44831 (0.0024) [2024-06-18 03:01:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 734593024. Throughput: 0: 42440.0. Samples: 734693760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:01:21,994][12645] Avg episode reward: [(0, '0.053')] [2024-06-18 03:01:24,162][12883] Updated weights for policy 0, policy_version 44841 (0.0032) [2024-06-18 03:01:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 734822400. Throughput: 0: 42488.1. Samples: 734951040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:01:26,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 03:01:27,394][12883] Updated weights for policy 0, policy_version 44851 (0.0043) [2024-06-18 03:01:31,768][12883] Updated weights for policy 0, policy_version 44861 (0.0024) [2024-06-18 03:01:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 735002624. Throughput: 0: 42554.1. Samples: 735082020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:01:31,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 03:01:35,219][12883] Updated weights for policy 0, policy_version 44871 (0.0042) [2024-06-18 03:01:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42265.1). Total num frames: 735199232. Throughput: 0: 42592.9. Samples: 735334220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:01:36,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 03:01:39,818][12883] Updated weights for policy 0, policy_version 44881 (0.0037) [2024-06-18 03:01:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 735444992. Throughput: 0: 42429.5. Samples: 735584460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:01:41,994][12645] Avg episode reward: [(0, '0.104')] [2024-06-18 03:01:43,176][12883] Updated weights for policy 0, policy_version 44891 (0.0042) [2024-06-18 03:01:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42154.4). Total num frames: 735608832. Throughput: 0: 42377.3. Samples: 735708460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:01:46,995][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 03:01:47,698][12883] Updated weights for policy 0, policy_version 44901 (0.0038) [2024-06-18 03:01:51,340][12883] Updated weights for policy 0, policy_version 44911 (0.0042) [2024-06-18 03:01:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 735854592. Throughput: 0: 42393.2. Samples: 735965480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:01:51,994][12645] Avg episode reward: [(0, '0.119')] [2024-06-18 03:01:55,364][12883] Updated weights for policy 0, policy_version 44921 (0.0028) [2024-06-18 03:01:56,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 736083968. Throughput: 0: 42437.0. Samples: 736217840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:01:56,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 03:01:59,110][12883] Updated weights for policy 0, policy_version 44931 (0.0032) [2024-06-18 03:02:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 736264192. Throughput: 0: 42635.0. Samples: 736355880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:02:01,994][12645] Avg episode reward: [(0, '0.028')] [2024-06-18 03:02:02,844][12883] Updated weights for policy 0, policy_version 44941 (0.0039) [2024-06-18 03:02:06,779][12883] Updated weights for policy 0, policy_version 44951 (0.0037) [2024-06-18 03:02:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 736493568. Throughput: 0: 42539.7. Samples: 736608040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 03:02:06,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 03:02:10,527][12883] Updated weights for policy 0, policy_version 44961 (0.0036) [2024-06-18 03:02:11,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 736722944. Throughput: 0: 42398.3. Samples: 736859060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 03:02:11,996][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 03:02:14,474][12883] Updated weights for policy 0, policy_version 44971 (0.0042) [2024-06-18 03:02:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 736886784. Throughput: 0: 42457.3. Samples: 736992600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 03:02:16,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 03:02:17,993][12883] Updated weights for policy 0, policy_version 44981 (0.0034) [2024-06-18 03:02:21,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 737116160. Throughput: 0: 42443.6. Samples: 737244180. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 03:02:21,994][12645] Avg episode reward: [(0, '0.063')] [2024-06-18 03:02:22,047][12883] Updated weights for policy 0, policy_version 44991 (0.0040) [2024-06-18 03:02:25,618][12883] Updated weights for policy 0, policy_version 45001 (0.0033) [2024-06-18 03:02:26,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 737361920. Throughput: 0: 42571.5. Samples: 737500180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 03:02:26,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 03:02:29,747][12883] Updated weights for policy 0, policy_version 45011 (0.0047) [2024-06-18 03:02:31,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 737542144. Throughput: 0: 42746.6. Samples: 737632060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 03:02:31,995][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 03:02:33,248][12883] Updated weights for policy 0, policy_version 45021 (0.0040) [2024-06-18 03:02:34,643][12862] Signal inference workers to stop experience collection... (10500 times) [2024-06-18 03:02:34,664][12883] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-06-18 03:02:34,757][12862] Signal inference workers to resume experience collection... (10500 times) [2024-06-18 03:02:34,757][12883] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-06-18 03:02:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42321.0). Total num frames: 737755136. Throughput: 0: 42503.2. Samples: 737878120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 03:02:36,994][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 03:02:37,113][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045030_737771520.pth... 
[2024-06-18 03:02:37,172][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044410_727613440.pth [2024-06-18 03:02:37,433][12883] Updated weights for policy 0, policy_version 45031 (0.0036) [2024-06-18 03:02:41,149][12883] Updated weights for policy 0, policy_version 45041 (0.0046) [2024-06-18 03:02:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 737984512. Throughput: 0: 42480.3. Samples: 738129460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 03:02:41,994][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 03:02:45,361][12883] Updated weights for policy 0, policy_version 45051 (0.0036) [2024-06-18 03:02:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 738181120. Throughput: 0: 42313.8. Samples: 738260000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 03:02:46,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 03:02:48,837][12883] Updated weights for policy 0, policy_version 45061 (0.0050) [2024-06-18 03:02:51,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42050.8, 300 sec: 42320.4). Total num frames: 738377728. Throughput: 0: 42191.7. Samples: 738506760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 03:02:51,997][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 03:02:53,476][12883] Updated weights for policy 0, policy_version 45071 (0.0036) [2024-06-18 03:02:56,321][12883] Updated weights for policy 0, policy_version 45081 (0.0026) [2024-06-18 03:02:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 738607104. Throughput: 0: 42350.5. Samples: 738764740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 03:02:56,994][12645] Avg episode reward: [(0, '0.062')] [2024-06-18 03:03:01,112][12883] Updated weights for policy 0, policy_version 45091 (0.0042) [2024-06-18 03:03:01,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 738820096. Throughput: 0: 42392.6. Samples: 738900260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:03:01,994][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 03:03:03,899][12883] Updated weights for policy 0, policy_version 45101 (0.0031) [2024-06-18 03:03:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 739016704. Throughput: 0: 42338.7. Samples: 739149420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:03:06,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 03:03:08,917][12883] Updated weights for policy 0, policy_version 45111 (0.0038) [2024-06-18 03:03:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42053.8, 300 sec: 42431.8). Total num frames: 739246080. Throughput: 0: 42192.4. Samples: 739398840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:03:11,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 03:03:12,153][12883] Updated weights for policy 0, policy_version 45121 (0.0034) [2024-06-18 03:03:16,537][12883] Updated weights for policy 0, policy_version 45131 (0.0034) [2024-06-18 03:03:17,000][12645] Fps is (10 sec: 44208.8, 60 sec: 42867.1, 300 sec: 42375.3). Total num frames: 739459072. Throughput: 0: 42274.7. Samples: 739534680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:03:17,001][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 03:03:19,819][12883] Updated weights for policy 0, policy_version 45141 (0.0038) [2024-06-18 03:03:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 739672064. Throughput: 0: 42484.0. Samples: 739789900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:03:21,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 03:03:24,461][12883] Updated weights for policy 0, policy_version 45151 (0.0043) [2024-06-18 03:03:26,994][12645] Fps is (10 sec: 44264.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 739901440. Throughput: 0: 42440.4. Samples: 740039280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:03:26,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 03:03:27,222][12883] Updated weights for policy 0, policy_version 45161 (0.0037) [2024-06-18 03:03:31,973][12883] Updated weights for policy 0, policy_version 45171 (0.0033) [2024-06-18 03:03:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 740081664. Throughput: 0: 42423.5. Samples: 740169060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:03:31,994][12645] Avg episode reward: [(0, '0.011')] [2024-06-18 03:03:35,334][12883] Updated weights for policy 0, policy_version 45181 (0.0034) [2024-06-18 03:03:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 740294656. Throughput: 0: 42535.8. Samples: 740420780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:03:36,994][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 03:03:40,039][12883] Updated weights for policy 0, policy_version 45191 (0.0034) [2024-06-18 03:03:41,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.8, 300 sec: 42375.9). Total num frames: 740524032. Throughput: 0: 42266.3. Samples: 740666820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:03:41,997][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 03:03:43,371][12883] Updated weights for policy 0, policy_version 45201 (0.0037) [2024-06-18 03:03:46,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42323.8, 300 sec: 42375.9). Total num frames: 740720640. Throughput: 0: 42269.8. Samples: 740802500. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:03:46,996][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 03:03:47,513][12883] Updated weights for policy 0, policy_version 45211 (0.0029) [2024-06-18 03:03:49,071][12862] Signal inference workers to stop experience collection... (10550 times) [2024-06-18 03:03:49,072][12862] Signal inference workers to resume experience collection... (10550 times) [2024-06-18 03:03:49,101][12883] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-06-18 03:03:49,101][12883] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-06-18 03:03:51,158][12883] Updated weights for policy 0, policy_version 45221 (0.0035) [2024-06-18 03:03:51,994][12645] Fps is (10 sec: 39330.9, 60 sec: 42327.0, 300 sec: 42320.7). Total num frames: 740917248. Throughput: 0: 42328.0. Samples: 741054180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:03:51,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 03:03:54,946][12883] Updated weights for policy 0, policy_version 45231 (0.0039) [2024-06-18 03:03:56,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 741163008. Throughput: 0: 42296.5. Samples: 741302180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:03:56,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 03:03:59,083][12883] Updated weights for policy 0, policy_version 45241 (0.0038) [2024-06-18 03:04:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 741359616. Throughput: 0: 42246.3. Samples: 741435500. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 03:04:01,994][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 03:04:02,446][12883] Updated weights for policy 0, policy_version 45251 (0.0030) [2024-06-18 03:04:06,690][12883] Updated weights for policy 0, policy_version 45261 (0.0034) [2024-06-18 03:04:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42376.3). Total num frames: 741556224. Throughput: 0: 42267.1. Samples: 741691920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 03:04:06,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 03:04:10,326][12883] Updated weights for policy 0, policy_version 45271 (0.0040) [2024-06-18 03:04:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 741801984. Throughput: 0: 42206.7. Samples: 741938580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 03:04:11,994][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 03:04:14,420][12883] Updated weights for policy 0, policy_version 45281 (0.0031) [2024-06-18 03:04:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42056.6, 300 sec: 42431.8). Total num frames: 741982208. Throughput: 0: 42220.9. Samples: 742069000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 03:04:16,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 03:04:18,139][12883] Updated weights for policy 0, policy_version 45291 (0.0027) [2024-06-18 03:04:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42320.9). Total num frames: 742195200. Throughput: 0: 42173.8. Samples: 742318600. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 03:04:21,995][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 03:04:22,341][12883] Updated weights for policy 0, policy_version 45301 (0.0030) [2024-06-18 03:04:25,846][12883] Updated weights for policy 0, policy_version 45311 (0.0037) [2024-06-18 03:04:26,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42431.8). 
Total num frames: 742440960. Throughput: 0: 42360.7. Samples: 742572960. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-18 03:04:26,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 03:04:29,847][12883] Updated weights for policy 0, policy_version 45321 (0.0025) [2024-06-18 03:04:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 742637568. Throughput: 0: 42396.6. Samples: 742710260. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-18 03:04:31,995][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 03:04:33,297][12883] Updated weights for policy 0, policy_version 45331 (0.0028) [2024-06-18 03:04:36,994][12645] Fps is (10 sec: 37684.1, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 742817792. Throughput: 0: 42271.1. Samples: 742956380. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-18 03:04:36,994][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 03:04:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045339_742834176.pth... [2024-06-18 03:04:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000044718_732659712.pth [2024-06-18 03:04:38,031][12883] Updated weights for policy 0, policy_version 45341 (0.0028) [2024-06-18 03:04:41,151][12883] Updated weights for policy 0, policy_version 45351 (0.0037) [2024-06-18 03:04:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42326.9, 300 sec: 42431.8). Total num frames: 743063552. Throughput: 0: 42345.4. Samples: 743207720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-18 03:04:41,994][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 03:04:45,613][12883] Updated weights for policy 0, policy_version 45361 (0.0034) [2024-06-18 03:04:46,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 743276544. Throughput: 0: 42464.1. Samples: 743346380. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-18 03:04:46,994][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 03:04:48,979][12883] Updated weights for policy 0, policy_version 45371 (0.0042) [2024-06-18 03:04:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42265.5). Total num frames: 743456768. Throughput: 0: 42202.6. Samples: 743591040. Policy #0 lag: (min: 1.0, avg: 9.6, max: 20.0) [2024-06-18 03:04:51,994][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 03:04:53,353][12883] Updated weights for policy 0, policy_version 45381 (0.0030) [2024-06-18 03:04:56,586][12883] Updated weights for policy 0, policy_version 45391 (0.0029) [2024-06-18 03:04:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 743686144. Throughput: 0: 42275.1. Samples: 743840960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:04:56,996][12645] Avg episode reward: [(0, '0.139')] [2024-06-18 03:05:01,088][12883] Updated weights for policy 0, policy_version 45401 (0.0031) [2024-06-18 03:05:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 743882752. Throughput: 0: 42383.5. Samples: 743976260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:05:01,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 03:05:04,313][12883] Updated weights for policy 0, policy_version 45411 (0.0023) [2024-06-18 03:05:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 744095744. Throughput: 0: 42383.5. Samples: 744225860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:05:06,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 03:05:08,662][12883] Updated weights for policy 0, policy_version 45421 (0.0028) [2024-06-18 03:05:11,933][12883] Updated weights for policy 0, policy_version 45431 (0.0032) [2024-06-18 03:05:11,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42487.3). 
Total num frames: 744341504. Throughput: 0: 42381.5. Samples: 744480120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:05:11,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 03:05:15,909][12862] Signal inference workers to stop experience collection... (10600 times) [2024-06-18 03:05:15,910][12862] Signal inference workers to resume experience collection... (10600 times) [2024-06-18 03:05:15,932][12883] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-06-18 03:05:15,936][12883] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-06-18 03:05:16,432][12883] Updated weights for policy 0, policy_version 45441 (0.0032) [2024-06-18 03:05:16,997][12645] Fps is (10 sec: 44220.9, 60 sec: 42595.8, 300 sec: 42431.2). Total num frames: 744538112. Throughput: 0: 42280.7. Samples: 744613040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:05:16,998][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 03:05:19,650][12883] Updated weights for policy 0, policy_version 45451 (0.0039) [2024-06-18 03:05:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 744751104. Throughput: 0: 42442.6. Samples: 744866300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 03:05:21,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 03:05:24,041][12883] Updated weights for policy 0, policy_version 45461 (0.0031) [2024-06-18 03:05:26,994][12645] Fps is (10 sec: 44253.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 744980480. Throughput: 0: 42503.5. Samples: 745120380. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 03:05:26,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 03:05:27,182][12883] Updated weights for policy 0, policy_version 45471 (0.0028) [2024-06-18 03:05:31,687][12883] Updated weights for policy 0, policy_version 45481 (0.0042) [2024-06-18 03:05:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 745160704. Throughput: 0: 42290.6. Samples: 745249460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 03:05:31,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 03:05:34,849][12883] Updated weights for policy 0, policy_version 45491 (0.0027) [2024-06-18 03:05:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42376.3). Total num frames: 745406464. Throughput: 0: 42516.1. Samples: 745504260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 03:05:36,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 03:05:39,590][12883] Updated weights for policy 0, policy_version 45501 (0.0035) [2024-06-18 03:05:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 745619456. Throughput: 0: 42752.9. Samples: 745764840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 03:05:41,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 03:05:42,510][12883] Updated weights for policy 0, policy_version 45511 (0.0032) [2024-06-18 03:05:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 745799680. Throughput: 0: 42364.0. Samples: 745882640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 03:05:46,994][12645] Avg episode reward: [(0, '0.067')] [2024-06-18 03:05:47,237][12883] Updated weights for policy 0, policy_version 45521 (0.0031) [2024-06-18 03:05:50,799][12883] Updated weights for policy 0, policy_version 45531 (0.0037) [2024-06-18 03:05:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42376.3). 
Total num frames: 746029056. Throughput: 0: 42529.5. Samples: 746139680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 03:05:51,994][12645] Avg episode reward: [(0, '0.026')] [2024-06-18 03:05:54,964][12883] Updated weights for policy 0, policy_version 45541 (0.0038) [2024-06-18 03:05:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 746225664. Throughput: 0: 42519.1. Samples: 746393480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 03:05:56,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 03:05:58,605][12883] Updated weights for policy 0, policy_version 45551 (0.0034) [2024-06-18 03:06:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 746438656. Throughput: 0: 42436.8. Samples: 746522540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 03:06:01,998][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 03:06:02,663][12883] Updated weights for policy 0, policy_version 45561 (0.0038) [2024-06-18 03:06:06,074][12883] Updated weights for policy 0, policy_version 45571 (0.0048) [2024-06-18 03:06:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 746668032. Throughput: 0: 42380.9. Samples: 746773440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 03:06:06,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 03:06:10,291][12883] Updated weights for policy 0, policy_version 45581 (0.0035) [2024-06-18 03:06:11,997][12645] Fps is (10 sec: 42583.3, 60 sec: 42049.7, 300 sec: 42320.2). Total num frames: 746864640. Throughput: 0: 42291.3. Samples: 747023640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 03:06:12,006][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 03:06:13,746][12883] Updated weights for policy 0, policy_version 45591 (0.0040) [2024-06-18 03:06:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42054.9, 300 sec: 42265.2). 
Total num frames: 747061248. Throughput: 0: 42337.9. Samples: 747154660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 03:06:16,994][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 03:06:17,779][12883] Updated weights for policy 0, policy_version 45601 (0.0031) [2024-06-18 03:06:21,656][12883] Updated weights for policy 0, policy_version 45611 (0.0030) [2024-06-18 03:06:21,994][12645] Fps is (10 sec: 42613.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 747290624. Throughput: 0: 42240.9. Samples: 747405100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 03:06:21,994][12645] Avg episode reward: [(0, '0.067')] [2024-06-18 03:06:25,312][12883] Updated weights for policy 0, policy_version 45621 (0.0032) [2024-06-18 03:06:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 747503616. Throughput: 0: 42205.4. Samples: 747664080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 03:06:26,994][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 03:06:29,322][12883] Updated weights for policy 0, policy_version 45631 (0.0027) [2024-06-18 03:06:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 747700224. Throughput: 0: 42429.8. Samples: 747791980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 03:06:31,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 03:06:32,241][12862] Signal inference workers to stop experience collection... (10650 times) [2024-06-18 03:06:32,241][12862] Signal inference workers to resume experience collection... 
(10650 times) [2024-06-18 03:06:32,267][12883] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-06-18 03:06:32,268][12883] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-06-18 03:06:33,054][12883] Updated weights for policy 0, policy_version 45641 (0.0036) [2024-06-18 03:06:36,947][12883] Updated weights for policy 0, policy_version 45651 (0.0025) [2024-06-18 03:06:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 747945984. Throughput: 0: 42397.7. Samples: 748047580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 03:06:36,994][12645] Avg episode reward: [(0, '0.086')] [2024-06-18 03:06:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045651_747945984.pth... [2024-06-18 03:06:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045030_737771520.pth [2024-06-18 03:06:40,615][12883] Updated weights for policy 0, policy_version 45661 (0.0036) [2024-06-18 03:06:41,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 748142592. Throughput: 0: 42398.6. Samples: 748301420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 03:06:41,994][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 03:06:45,056][12883] Updated weights for policy 0, policy_version 45671 (0.0037) [2024-06-18 03:06:46,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 748339200. Throughput: 0: 42436.4. Samples: 748432180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 03:06:46,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 03:06:48,281][12883] Updated weights for policy 0, policy_version 45681 (0.0032) [2024-06-18 03:06:51,999][12645] Fps is (10 sec: 42576.2, 60 sec: 42321.5, 300 sec: 42319.9). Total num frames: 748568576. Throughput: 0: 42431.9. Samples: 748683100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:06:51,999][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 03:06:52,832][12883] Updated weights for policy 0, policy_version 45691 (0.0033) [2024-06-18 03:06:56,255][12883] Updated weights for policy 0, policy_version 45701 (0.0022) [2024-06-18 03:06:56,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 748781568. Throughput: 0: 42645.7. Samples: 748942540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:06:56,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 03:07:00,441][12883] Updated weights for policy 0, policy_version 45711 (0.0033) [2024-06-18 03:07:01,994][12645] Fps is (10 sec: 42620.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 748994560. Throughput: 0: 42449.6. Samples: 749064900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:07:01,994][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 03:07:04,223][12883] Updated weights for policy 0, policy_version 45721 (0.0033) [2024-06-18 03:07:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42321.0). Total num frames: 749207552. Throughput: 0: 42526.2. Samples: 749318780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:07:06,994][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 03:07:08,157][12883] Updated weights for policy 0, policy_version 45731 (0.0032) [2024-06-18 03:07:11,877][12883] Updated weights for policy 0, policy_version 45741 (0.0042) [2024-06-18 03:07:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42600.9, 300 sec: 42487.3). Total num frames: 749420544. Throughput: 0: 42325.2. Samples: 749568720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:07:11,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 03:07:15,836][12883] Updated weights for policy 0, policy_version 45751 (0.0030) [2024-06-18 03:07:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42376.2). 
Total num frames: 749617152. Throughput: 0: 42306.2. Samples: 749695760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:07:16,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 03:07:20,010][12883] Updated weights for policy 0, policy_version 45761 (0.0029) [2024-06-18 03:07:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 749830144. Throughput: 0: 42162.7. Samples: 749944900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:07:21,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 03:07:23,787][12883] Updated weights for policy 0, policy_version 45771 (0.0031) [2024-06-18 03:07:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 750026752. Throughput: 0: 42191.2. Samples: 750200020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:07:26,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 03:07:27,744][12883] Updated weights for policy 0, policy_version 45781 (0.0038) [2024-06-18 03:07:31,458][12883] Updated weights for policy 0, policy_version 45791 (0.0042) [2024-06-18 03:07:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 750256128. Throughput: 0: 42024.6. Samples: 750323280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:07:32,003][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 03:07:35,771][12883] Updated weights for policy 0, policy_version 45801 (0.0042) [2024-06-18 03:07:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42265.2). Total num frames: 750452736. Throughput: 0: 42031.0. Samples: 750574280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:07:36,995][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 03:07:39,202][12883] Updated weights for policy 0, policy_version 45811 (0.0029) [2024-06-18 03:07:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42320.7). 
Total num frames: 750665728. Throughput: 0: 41733.8. Samples: 750820560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:07:41,994][12645] Avg episode reward: [(0, '0.096')] [2024-06-18 03:07:43,703][12883] Updated weights for policy 0, policy_version 45821 (0.0037) [2024-06-18 03:07:46,991][12883] Updated weights for policy 0, policy_version 45831 (0.0030) [2024-06-18 03:07:46,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 750895104. Throughput: 0: 41909.0. Samples: 750950800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 03:07:47,002][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 03:07:51,566][12883] Updated weights for policy 0, policy_version 45841 (0.0040) [2024-06-18 03:07:51,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41509.7, 300 sec: 42209.6). Total num frames: 751058944. Throughput: 0: 41929.3. Samples: 751205600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 03:07:51,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 03:07:54,670][12883] Updated weights for policy 0, policy_version 45851 (0.0052) [2024-06-18 03:07:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 751288320. Throughput: 0: 41917.5. Samples: 751455000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 03:07:56,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 03:07:59,437][12883] Updated weights for policy 0, policy_version 45861 (0.0032) [2024-06-18 03:08:01,994][12645] Fps is (10 sec: 45876.2, 60 sec: 42052.4, 300 sec: 42376.2). Total num frames: 751517696. Throughput: 0: 42008.9. Samples: 751586160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 03:08:01,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 03:08:02,487][12883] Updated weights for policy 0, policy_version 45871 (0.0035) [2024-06-18 03:08:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 42209.6). 
Total num frames: 751697920. Throughput: 0: 42049.9. Samples: 751837140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 03:08:06,994][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 03:08:07,210][12883] Updated weights for policy 0, policy_version 45881 (0.0029) [2024-06-18 03:08:10,366][12883] Updated weights for policy 0, policy_version 45891 (0.0036) [2024-06-18 03:08:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42266.1). Total num frames: 751927296. Throughput: 0: 41912.1. Samples: 752086060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 03:08:11,994][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 03:08:14,801][12883] Updated weights for policy 0, policy_version 45901 (0.0033) [2024-06-18 03:08:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 752140288. Throughput: 0: 42069.3. Samples: 752216400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:08:16,994][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 03:08:17,987][12883] Updated weights for policy 0, policy_version 45911 (0.0037) [2024-06-18 03:08:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 752353280. Throughput: 0: 42209.1. Samples: 752473680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:08:21,994][12645] Avg episode reward: [(0, '0.119')] [2024-06-18 03:08:22,431][12883] Updated weights for policy 0, policy_version 45921 (0.0024) [2024-06-18 03:08:23,669][12862] Signal inference workers to stop experience collection... (10700 times) [2024-06-18 03:08:23,669][12862] Signal inference workers to resume experience collection... 
(10700 times) [2024-06-18 03:08:23,685][12883] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-06-18 03:08:23,692][12883] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-06-18 03:08:25,707][12883] Updated weights for policy 0, policy_version 45931 (0.0040) [2024-06-18 03:08:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 752582656. Throughput: 0: 42089.6. Samples: 752714600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:08:26,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 03:08:30,163][12883] Updated weights for policy 0, policy_version 45941 (0.0034) [2024-06-18 03:08:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 752779264. Throughput: 0: 42175.4. Samples: 752848700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:08:31,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 03:08:33,451][12883] Updated weights for policy 0, policy_version 45951 (0.0042) [2024-06-18 03:08:36,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 42210.0). Total num frames: 752975872. Throughput: 0: 42123.3. Samples: 753101140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:08:36,994][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 03:08:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045958_752975872.pth... [2024-06-18 03:08:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045339_742834176.pth [2024-06-18 03:08:37,918][12883] Updated weights for policy 0, policy_version 45961 (0.0028) [2024-06-18 03:08:41,251][12883] Updated weights for policy 0, policy_version 45971 (0.0036) [2024-06-18 03:08:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42321.0). Total num frames: 753205248. Throughput: 0: 41968.8. Samples: 753343600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:08:41,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 03:08:45,669][12883] Updated weights for policy 0, policy_version 45981 (0.0031) [2024-06-18 03:08:46,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41504.6, 300 sec: 42264.8). Total num frames: 753385472. Throughput: 0: 42003.1. Samples: 753476400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 03:08:46,997][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 03:08:49,128][12883] Updated weights for policy 0, policy_version 45991 (0.0031) [2024-06-18 03:08:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 753582080. Throughput: 0: 41824.4. Samples: 753719240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 03:08:51,994][12645] Avg episode reward: [(0, '0.099')] [2024-06-18 03:08:53,640][12883] Updated weights for policy 0, policy_version 46001 (0.0035) [2024-06-18 03:08:56,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 753827840. Throughput: 0: 41948.0. Samples: 753973720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 03:08:56,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 03:08:57,034][12883] Updated weights for policy 0, policy_version 46011 (0.0030) [2024-06-18 03:09:01,604][12883] Updated weights for policy 0, policy_version 46021 (0.0032) [2024-06-18 03:09:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.0, 300 sec: 42209.6). Total num frames: 754008064. Throughput: 0: 41898.6. Samples: 754101840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 03:09:01,994][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 03:09:04,931][12883] Updated weights for policy 0, policy_version 46031 (0.0037) [2024-06-18 03:09:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 754221056. Throughput: 0: 41531.0. Samples: 754342580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 03:09:06,994][12645] Avg episode reward: [(0, '0.097')] [2024-06-18 03:09:09,463][12883] Updated weights for policy 0, policy_version 46041 (0.0038) [2024-06-18 03:09:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 754450432. Throughput: 0: 41888.5. Samples: 754599580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-18 03:09:12,006][12645] Avg episode reward: [(0, '0.062')] [2024-06-18 03:09:12,930][12883] Updated weights for policy 0, policy_version 46051 (0.0027) [2024-06-18 03:09:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 754647040. Throughput: 0: 41717.1. Samples: 754725960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:09:16,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 03:09:17,093][12883] Updated weights for policy 0, policy_version 46061 (0.0023) [2024-06-18 03:09:20,981][12883] Updated weights for policy 0, policy_version 46071 (0.0039) [2024-06-18 03:09:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 754860032. Throughput: 0: 41560.7. Samples: 754971380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:09:21,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 03:09:24,965][12883] Updated weights for policy 0, policy_version 46081 (0.0045) [2024-06-18 03:09:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.3, 300 sec: 42154.1). Total num frames: 755073024. Throughput: 0: 41870.3. Samples: 755227760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:09:26,994][12645] Avg episode reward: [(0, '0.086')] [2024-06-18 03:09:28,802][12883] Updated weights for policy 0, policy_version 46091 (0.0032) [2024-06-18 03:09:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 42154.1). Total num frames: 755253248. Throughput: 0: 41540.6. Samples: 755345640. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:09:31,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 03:09:32,809][12883] Updated weights for policy 0, policy_version 46101 (0.0039) [2024-06-18 03:09:36,929][12883] Updated weights for policy 0, policy_version 46111 (0.0029) [2024-06-18 03:09:36,996][12645] Fps is (10 sec: 40949.1, 60 sec: 41777.3, 300 sec: 42098.2). Total num frames: 755482624. Throughput: 0: 41711.8. Samples: 755596380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:09:36,997][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 03:09:40,615][12883] Updated weights for policy 0, policy_version 46121 (0.0042) [2024-06-18 03:09:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 755695616. Throughput: 0: 41741.3. Samples: 755852080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:09:41,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 03:09:44,488][12883] Updated weights for policy 0, policy_version 46131 (0.0034) [2024-06-18 03:09:46,994][12645] Fps is (10 sec: 40970.5, 60 sec: 41780.7, 300 sec: 42154.1). Total num frames: 755892224. Throughput: 0: 41560.1. Samples: 755972040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:09:46,994][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 03:09:48,659][12883] Updated weights for policy 0, policy_version 46141 (0.0042) [2024-06-18 03:09:50,194][12862] Signal inference workers to stop experience collection... (10750 times) [2024-06-18 03:09:50,195][12862] Signal inference workers to resume experience collection... (10750 times) [2024-06-18 03:09:50,218][12883] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-06-18 03:09:50,218][12883] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-06-18 03:09:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 756121600. Throughput: 0: 41764.0. 
Samples: 756221960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:09:51,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 03:09:52,635][12883] Updated weights for policy 0, policy_version 46151 (0.0030) [2024-06-18 03:09:56,503][12883] Updated weights for policy 0, policy_version 46161 (0.0032) [2024-06-18 03:09:56,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41232.9, 300 sec: 42098.5). Total num frames: 756301824. Throughput: 0: 41502.1. Samples: 756467180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:09:56,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 03:10:00,298][12883] Updated weights for policy 0, policy_version 46171 (0.0038) [2024-06-18 03:10:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 756531200. Throughput: 0: 41568.3. Samples: 756596540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:10:02,000][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 03:10:04,485][12883] Updated weights for policy 0, policy_version 46181 (0.0028) [2024-06-18 03:10:06,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 756727808. Throughput: 0: 41694.3. Samples: 756847620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 03:10:06,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 03:10:08,284][12883] Updated weights for policy 0, policy_version 46191 (0.0032) [2024-06-18 03:10:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 42043.5). Total num frames: 756940800. Throughput: 0: 41535.5. Samples: 757096860. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:10:11,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 03:10:12,833][12883] Updated weights for policy 0, policy_version 46201 (0.0037) [2024-06-18 03:10:16,050][12883] Updated weights for policy 0, policy_version 46211 (0.0037) [2024-06-18 03:10:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 757137408. Throughput: 0: 41596.1. Samples: 757217460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:10:16,994][12645] Avg episode reward: [(0, '0.165')] [2024-06-18 03:10:20,431][12883] Updated weights for policy 0, policy_version 46221 (0.0039) [2024-06-18 03:10:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 757350400. Throughput: 0: 41637.6. Samples: 757469960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:10:21,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 03:10:23,876][12883] Updated weights for policy 0, policy_version 46231 (0.0037) [2024-06-18 03:10:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41931.9). Total num frames: 757530624. Throughput: 0: 41598.2. Samples: 757724000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:10:26,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 03:10:28,067][12883] Updated weights for policy 0, policy_version 46241 (0.0032) [2024-06-18 03:10:31,487][12883] Updated weights for policy 0, policy_version 46251 (0.0033) [2024-06-18 03:10:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 757776384. Throughput: 0: 41543.6. Samples: 757841500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:10:31,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 03:10:35,877][12883] Updated weights for policy 0, policy_version 46261 (0.0025) [2024-06-18 03:10:36,994][12645] Fps is (10 sec: 45875.8, 60 sec: 41781.0, 300 sec: 41931.9). 
Total num frames: 757989376. Throughput: 0: 41726.3. Samples: 758099640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 03:10:36,994][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 03:10:37,104][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046265_758005760.pth... [2024-06-18 03:10:37,156][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045651_747945984.pth [2024-06-18 03:10:39,159][12883] Updated weights for policy 0, policy_version 46271 (0.0040) [2024-06-18 03:10:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 758169600. Throughput: 0: 41885.5. Samples: 758352020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:10:41,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 03:10:43,771][12883] Updated weights for policy 0, policy_version 46281 (0.0038) [2024-06-18 03:10:46,859][12883] Updated weights for policy 0, policy_version 46291 (0.0032) [2024-06-18 03:10:47,000][12645] Fps is (10 sec: 44208.9, 60 sec: 42321.0, 300 sec: 42042.1). Total num frames: 758431744. Throughput: 0: 41847.5. Samples: 758479940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:10:47,009][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 03:10:51,993][12883] Updated weights for policy 0, policy_version 46301 (0.0038) [2024-06-18 03:10:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 758595584. Throughput: 0: 41873.3. Samples: 758731920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:10:51,994][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 03:10:54,669][12883] Updated weights for policy 0, policy_version 46311 (0.0026) [2024-06-18 03:10:56,994][12645] Fps is (10 sec: 39346.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 758824960. Throughput: 0: 41839.1. Samples: 758979620. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:10:56,994][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 03:10:58,081][12862] Signal inference workers to stop experience collection... (10800 times) [2024-06-18 03:10:58,082][12862] Signal inference workers to resume experience collection... (10800 times) [2024-06-18 03:10:58,093][12883] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-06-18 03:10:58,093][12883] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-06-18 03:10:59,629][12883] Updated weights for policy 0, policy_version 46321 (0.0039) [2024-06-18 03:11:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 759037952. Throughput: 0: 42192.4. Samples: 759116120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:11:01,994][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 03:11:02,699][12883] Updated weights for policy 0, policy_version 46331 (0.0028) [2024-06-18 03:11:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41876.9). Total num frames: 759218176. Throughput: 0: 42163.1. Samples: 759367300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:11:06,994][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 03:11:07,275][12883] Updated weights for policy 0, policy_version 46341 (0.0032) [2024-06-18 03:11:10,669][12883] Updated weights for policy 0, policy_version 46351 (0.0031) [2024-06-18 03:11:11,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42050.7, 300 sec: 42042.7). Total num frames: 759463936. Throughput: 0: 41834.9. Samples: 759606660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:11:11,997][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 03:11:15,063][12883] Updated weights for policy 0, policy_version 46361 (0.0031) [2024-06-18 03:11:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 759644160. Throughput: 0: 42186.7. 
Samples: 759739900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:11:16,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 03:11:18,320][12883] Updated weights for policy 0, policy_version 46371 (0.0035) [2024-06-18 03:11:21,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42052.1, 300 sec: 41931.9). Total num frames: 759873536. Throughput: 0: 42045.2. Samples: 759991680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:11:21,994][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 03:11:22,730][12883] Updated weights for policy 0, policy_version 46381 (0.0033) [2024-06-18 03:11:26,126][12883] Updated weights for policy 0, policy_version 46391 (0.0055) [2024-06-18 03:11:26,994][12645] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42098.6). Total num frames: 760119296. Throughput: 0: 41942.7. Samples: 760239440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:11:26,994][12645] Avg episode reward: [(0, '0.104')] [2024-06-18 03:11:30,391][12883] Updated weights for policy 0, policy_version 46401 (0.0036) [2024-06-18 03:11:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 760283136. Throughput: 0: 42048.9. Samples: 760371880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:11:31,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 03:11:33,803][12883] Updated weights for policy 0, policy_version 46411 (0.0034) [2024-06-18 03:11:36,994][12645] Fps is (10 sec: 37682.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 760496128. Throughput: 0: 41919.6. Samples: 760618300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:11:36,994][12645] Avg episode reward: [(0, '0.098')] [2024-06-18 03:11:38,529][12883] Updated weights for policy 0, policy_version 46421 (0.0040) [2024-06-18 03:11:41,483][12883] Updated weights for policy 0, policy_version 46431 (0.0023) [2024-06-18 03:11:41,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 760725504. Throughput: 0: 41873.9. Samples: 760863940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:11:41,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 03:11:46,415][12883] Updated weights for policy 0, policy_version 46441 (0.0024) [2024-06-18 03:11:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41510.4, 300 sec: 41877.1). Total num frames: 760922112. Throughput: 0: 41765.3. Samples: 760995560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:11:46,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 03:11:49,848][12883] Updated weights for policy 0, policy_version 46451 (0.0040) [2024-06-18 03:11:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 761118720. Throughput: 0: 41739.6. Samples: 761245580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:11:51,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 03:11:54,183][12883] Updated weights for policy 0, policy_version 46461 (0.0032) [2024-06-18 03:11:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 761348096. Throughput: 0: 42060.0. Samples: 761499260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:11:56,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 03:11:57,635][12883] Updated weights for policy 0, policy_version 46471 (0.0033) [2024-06-18 03:12:01,938][12883] Updated weights for policy 0, policy_version 46481 (0.0028) [2024-06-18 03:12:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41820.9). 
Total num frames: 761544704. Throughput: 0: 41848.4. Samples: 761623080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:12:02,004][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 03:12:05,731][12883] Updated weights for policy 0, policy_version 46491 (0.0029) [2024-06-18 03:12:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 41876.4). Total num frames: 761774080. Throughput: 0: 41712.0. Samples: 761868720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:12:07,003][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 03:12:09,710][12883] Updated weights for policy 0, policy_version 46501 (0.0032) [2024-06-18 03:12:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41507.7, 300 sec: 41820.8). Total num frames: 761954304. Throughput: 0: 41859.5. Samples: 762123120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:12:11,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 03:12:13,583][12883] Updated weights for policy 0, policy_version 46511 (0.0034) [2024-06-18 03:12:16,343][12862] Signal inference workers to stop experience collection... (10850 times) [2024-06-18 03:12:16,399][12883] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-06-18 03:12:16,401][12862] Signal inference workers to resume experience collection... (10850 times) [2024-06-18 03:12:16,409][12883] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-06-18 03:12:16,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 762183680. Throughput: 0: 41533.0. Samples: 762240860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:12:16,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 03:12:17,480][12883] Updated weights for policy 0, policy_version 46521 (0.0040) [2024-06-18 03:12:21,557][12883] Updated weights for policy 0, policy_version 46531 (0.0034) [2024-06-18 03:12:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.3, 300 sec: 41820.9). Total num frames: 762363904. Throughput: 0: 41732.0. Samples: 762496240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:12:21,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 03:12:25,295][12883] Updated weights for policy 0, policy_version 46541 (0.0032) [2024-06-18 03:12:26,994][12645] Fps is (10 sec: 37682.7, 60 sec: 40686.8, 300 sec: 41709.8). Total num frames: 762560512. Throughput: 0: 41813.2. Samples: 762745540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:12:26,994][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 03:12:29,388][12883] Updated weights for policy 0, policy_version 46551 (0.0031) [2024-06-18 03:12:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 762822656. Throughput: 0: 41667.1. Samples: 762870580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:12:31,994][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 03:12:33,170][12883] Updated weights for policy 0, policy_version 46561 (0.0029) [2024-06-18 03:12:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 763002880. Throughput: 0: 41848.8. Samples: 763128780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:12:36,994][12645] Avg episode reward: [(0, '0.091')] [2024-06-18 03:12:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046570_763002880.pth... 
[2024-06-18 03:12:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000045958_752975872.pth [2024-06-18 03:12:37,562][12883] Updated weights for policy 0, policy_version 46571 (0.0041) [2024-06-18 03:12:41,499][12883] Updated weights for policy 0, policy_version 46581 (0.0031) [2024-06-18 03:12:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 763215872. Throughput: 0: 41672.3. Samples: 763374520. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-18 03:12:41,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 03:12:45,325][12883] Updated weights for policy 0, policy_version 46591 (0.0030) [2024-06-18 03:12:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 763445248. Throughput: 0: 41673.8. Samples: 763498400. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-18 03:12:46,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 03:12:49,171][12883] Updated weights for policy 0, policy_version 46601 (0.0025) [2024-06-18 03:12:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 763625472. Throughput: 0: 41802.3. Samples: 763749820. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-18 03:12:51,995][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 03:12:52,872][12883] Updated weights for policy 0, policy_version 46611 (0.0036) [2024-06-18 03:12:56,819][12883] Updated weights for policy 0, policy_version 46621 (0.0033) [2024-06-18 03:12:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 763854848. Throughput: 0: 41700.4. Samples: 763999640. 
Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-18 03:12:56,995][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 03:13:00,425][12883] Updated weights for policy 0, policy_version 46631 (0.0032) [2024-06-18 03:13:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 764051456. Throughput: 0: 41914.6. Samples: 764127020. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-18 03:13:01,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 03:13:04,413][12883] Updated weights for policy 0, policy_version 46641 (0.0032) [2024-06-18 03:13:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 764248064. Throughput: 0: 41872.0. Samples: 764380480. Policy #0 lag: (min: 2.0, avg: 11.2, max: 22.0) [2024-06-18 03:13:06,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 03:13:08,139][12883] Updated weights for policy 0, policy_version 46651 (0.0039) [2024-06-18 03:13:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 764461056. Throughput: 0: 41842.3. Samples: 764628440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 03:13:11,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 03:13:12,415][12883] Updated weights for policy 0, policy_version 46661 (0.0028) [2024-06-18 03:13:15,930][12883] Updated weights for policy 0, policy_version 46671 (0.0031) [2024-06-18 03:13:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 764690432. Throughput: 0: 41945.4. Samples: 764758120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 03:13:16,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 03:13:20,170][12883] Updated weights for policy 0, policy_version 46681 (0.0044) [2024-06-18 03:13:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 764870656. Throughput: 0: 41809.9. Samples: 765010220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 03:13:21,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 03:13:23,635][12883] Updated weights for policy 0, policy_version 46691 (0.0035) [2024-06-18 03:13:27,000][12645] Fps is (10 sec: 40934.3, 60 sec: 42321.0, 300 sec: 41764.5). Total num frames: 765100032. Throughput: 0: 41976.1. Samples: 765263700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 03:13:27,009][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 03:13:27,777][12883] Updated weights for policy 0, policy_version 46701 (0.0035) [2024-06-18 03:13:30,496][12862] Signal inference workers to stop experience collection... (10900 times) [2024-06-18 03:13:30,497][12862] Signal inference workers to resume experience collection... (10900 times) [2024-06-18 03:13:30,526][12883] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-06-18 03:13:30,526][12883] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-06-18 03:13:31,350][12883] Updated weights for policy 0, policy_version 46711 (0.0037) [2024-06-18 03:13:31,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 765345792. Throughput: 0: 42263.0. Samples: 765400240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 03:13:31,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 03:13:35,714][12883] Updated weights for policy 0, policy_version 46721 (0.0033) [2024-06-18 03:13:36,994][12645] Fps is (10 sec: 40985.3, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 765509632. Throughput: 0: 42064.5. Samples: 765642720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:13:36,994][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 03:13:39,198][12883] Updated weights for policy 0, policy_version 46731 (0.0032) [2024-06-18 03:13:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 41876.7). Total num frames: 765739008. Throughput: 0: 42217.4. 
Samples: 765899420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:13:41,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 03:13:43,547][12883] Updated weights for policy 0, policy_version 46741 (0.0038) [2024-06-18 03:13:46,929][12883] Updated weights for policy 0, policy_version 46751 (0.0035) [2024-06-18 03:13:46,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 765968384. Throughput: 0: 42225.4. Samples: 766027160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:13:46,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 03:13:51,369][12883] Updated weights for policy 0, policy_version 46761 (0.0032) [2024-06-18 03:13:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 766148608. Throughput: 0: 42247.3. Samples: 766281620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:13:51,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 03:13:54,641][12883] Updated weights for policy 0, policy_version 46771 (0.0037) [2024-06-18 03:13:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 766377984. Throughput: 0: 42153.8. Samples: 766525360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:13:56,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 03:13:59,345][12883] Updated weights for policy 0, policy_version 46781 (0.0028) [2024-06-18 03:14:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 766574592. Throughput: 0: 42217.2. Samples: 766657900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:14:01,994][12645] Avg episode reward: [(0, '0.025')] [2024-06-18 03:14:02,565][12883] Updated weights for policy 0, policy_version 46791 (0.0030) [2024-06-18 03:14:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 766771200. Throughput: 0: 42096.1. Samples: 766904540. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 03:14:06,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 03:14:07,048][12883] Updated weights for policy 0, policy_version 46801 (0.0027) [2024-06-18 03:14:10,351][12883] Updated weights for policy 0, policy_version 46811 (0.0031) [2024-06-18 03:14:11,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 767016960. Throughput: 0: 42019.6. Samples: 767154320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 03:14:11,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 03:14:14,765][12883] Updated weights for policy 0, policy_version 46821 (0.0038) [2024-06-18 03:14:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 767197184. Throughput: 0: 41888.1. Samples: 767285200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 03:14:16,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 03:14:18,280][12883] Updated weights for policy 0, policy_version 46831 (0.0029) [2024-06-18 03:14:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 767410176. Throughput: 0: 42021.4. Samples: 767533680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 03:14:21,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 03:14:22,384][12883] Updated weights for policy 0, policy_version 46841 (0.0043) [2024-06-18 03:14:26,239][12883] Updated weights for policy 0, policy_version 46851 (0.0037) [2024-06-18 03:14:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42329.7, 300 sec: 41987.5). Total num frames: 767639552. Throughput: 0: 42127.5. Samples: 767795160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 03:14:26,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 03:14:29,998][12883] Updated weights for policy 0, policy_version 46861 (0.0033) [2024-06-18 03:14:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41506.0, 300 sec: 41876.7). 
Total num frames: 767836160. Throughput: 0: 41991.3. Samples: 767916780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 03:14:31,995][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 03:14:34,136][12883] Updated weights for policy 0, policy_version 46871 (0.0054) [2024-06-18 03:14:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 768049152. Throughput: 0: 41829.0. Samples: 768163920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:14:36,994][12645] Avg episode reward: [(0, '0.094')] [2024-06-18 03:14:37,192][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046880_768081920.pth... [2024-06-18 03:14:37,239][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046265_758005760.pth [2024-06-18 03:14:38,223][12883] Updated weights for policy 0, policy_version 46881 (0.0032) [2024-06-18 03:14:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 768245760. Throughput: 0: 42133.7. Samples: 768421380. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:14:42,000][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 03:14:42,062][12883] Updated weights for policy 0, policy_version 46891 (0.0033) [2024-06-18 03:14:45,775][12883] Updated weights for policy 0, policy_version 46901 (0.0038) [2024-06-18 03:14:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 768458752. Throughput: 0: 41866.7. Samples: 768541900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:14:46,994][12645] Avg episode reward: [(0, '0.088')] [2024-06-18 03:14:49,780][12883] Updated weights for policy 0, policy_version 46911 (0.0027) [2024-06-18 03:14:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 768704512. Throughput: 0: 42107.8. Samples: 768799400. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:14:51,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 03:14:53,275][12883] Updated weights for policy 0, policy_version 46921 (0.0036) [2024-06-18 03:14:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 768884736. Throughput: 0: 42155.9. Samples: 769051340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:14:56,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 03:14:57,681][12883] Updated weights for policy 0, policy_version 46931 (0.0022) [2024-06-18 03:15:00,656][12883] Updated weights for policy 0, policy_version 46941 (0.0028) [2024-06-18 03:15:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 769081344. Throughput: 0: 41963.5. Samples: 769173560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:15:01,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 03:15:02,875][12862] Signal inference workers to stop experience collection... (10950 times) [2024-06-18 03:15:02,875][12862] Signal inference workers to resume experience collection... (10950 times) [2024-06-18 03:15:02,917][12883] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-06-18 03:15:02,917][12883] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-06-18 03:15:05,366][12883] Updated weights for policy 0, policy_version 46951 (0.0028) [2024-06-18 03:15:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 769310720. Throughput: 0: 42206.8. Samples: 769432980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:15:06,994][12645] Avg episode reward: [(0, '0.099')] [2024-06-18 03:15:08,433][12883] Updated weights for policy 0, policy_version 46961 (0.0031) [2024-06-18 03:15:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 769523712. Throughput: 0: 41882.7. 
Samples: 769679880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:15:11,994][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 03:15:13,234][12883] Updated weights for policy 0, policy_version 46971 (0.0029) [2024-06-18 03:15:16,515][12883] Updated weights for policy 0, policy_version 46981 (0.0034) [2024-06-18 03:15:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 769736704. Throughput: 0: 42058.9. Samples: 769809420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:15:16,994][12645] Avg episode reward: [(0, '0.086')] [2024-06-18 03:15:21,029][12883] Updated weights for policy 0, policy_version 46991 (0.0037) [2024-06-18 03:15:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 769933312. Throughput: 0: 42304.4. Samples: 770067620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:15:21,994][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 03:15:24,091][12883] Updated weights for policy 0, policy_version 47001 (0.0035) [2024-06-18 03:15:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 770146304. Throughput: 0: 42140.0. Samples: 770317680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:15:26,995][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 03:15:28,963][12883] Updated weights for policy 0, policy_version 47011 (0.0033) [2024-06-18 03:15:31,821][12883] Updated weights for policy 0, policy_version 47021 (0.0031) [2024-06-18 03:15:31,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42871.6, 300 sec: 42098.5). Total num frames: 770408448. Throughput: 0: 42359.1. Samples: 770448060. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 03:15:31,994][12645] Avg episode reward: [(0, '0.094')] [2024-06-18 03:15:36,653][12883] Updated weights for policy 0, policy_version 47031 (0.0031) [2024-06-18 03:15:36,998][12645] Fps is (10 sec: 42580.8, 60 sec: 42049.4, 300 sec: 42042.4). Total num frames: 770572288. Throughput: 0: 42262.0. Samples: 770701360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 03:15:36,998][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 03:15:39,458][12883] Updated weights for policy 0, policy_version 47041 (0.0029) [2024-06-18 03:15:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 41877.3). Total num frames: 770785280. Throughput: 0: 42201.0. Samples: 770950380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 03:15:41,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 03:15:44,306][12883] Updated weights for policy 0, policy_version 47051 (0.0031) [2024-06-18 03:15:46,994][12645] Fps is (10 sec: 45894.6, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 771031040. Throughput: 0: 42332.1. Samples: 771078500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 03:15:46,996][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 03:15:47,013][12862] Saving new best policy, reward=0.369! [2024-06-18 03:15:47,330][12883] Updated weights for policy 0, policy_version 47061 (0.0035) [2024-06-18 03:15:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41932.0). Total num frames: 771194880. Throughput: 0: 42202.2. Samples: 771332080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 03:15:51,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 03:15:52,024][12883] Updated weights for policy 0, policy_version 47071 (0.0037) [2024-06-18 03:15:55,293][12883] Updated weights for policy 0, policy_version 47081 (0.0032) [2024-06-18 03:15:56,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 771407872. 
Throughput: 0: 42120.5. Samples: 771575300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 03:15:56,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 03:16:00,415][12883] Updated weights for policy 0, policy_version 47091 (0.0023) [2024-06-18 03:16:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 771620864. Throughput: 0: 42090.3. Samples: 771703480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 03:16:01,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 03:16:03,328][12883] Updated weights for policy 0, policy_version 47101 (0.0032) [2024-06-18 03:16:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41876.7). Total num frames: 771817472. Throughput: 0: 41885.9. Samples: 771952480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:16:06,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 03:16:07,933][12883] Updated weights for policy 0, policy_version 47111 (0.0036) [2024-06-18 03:16:11,209][12883] Updated weights for policy 0, policy_version 47121 (0.0033) [2024-06-18 03:16:11,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 772063232. Throughput: 0: 41719.9. Samples: 772195080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:16:11,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 03:16:15,796][12883] Updated weights for policy 0, policy_version 47131 (0.0031) [2024-06-18 03:16:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 772259840. Throughput: 0: 41923.6. Samples: 772334620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:16:16,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 03:16:18,861][12883] Updated weights for policy 0, policy_version 47141 (0.0038) [2024-06-18 03:16:21,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 772456448. Throughput: 0: 41801.7. 
Samples: 772582260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:16:21,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 03:16:23,479][12883] Updated weights for policy 0, policy_version 47151 (0.0039) [2024-06-18 03:16:25,190][12862] Signal inference workers to stop experience collection... (11000 times) [2024-06-18 03:16:25,190][12862] Signal inference workers to resume experience collection... (11000 times) [2024-06-18 03:16:25,205][12883] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-06-18 03:16:25,205][12883] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-06-18 03:16:26,922][12883] Updated weights for policy 0, policy_version 47161 (0.0038) [2024-06-18 03:16:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 772685824. Throughput: 0: 41805.8. Samples: 772831640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:16:26,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 03:16:31,307][12883] Updated weights for policy 0, policy_version 47171 (0.0030) [2024-06-18 03:16:31,994][12645] Fps is (10 sec: 40959.2, 60 sec: 40959.9, 300 sec: 41931.9). Total num frames: 772866048. Throughput: 0: 41918.5. Samples: 772964840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:16:31,994][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 03:16:34,602][12883] Updated weights for policy 0, policy_version 47181 (0.0035) [2024-06-18 03:16:36,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42053.6, 300 sec: 41931.6). Total num frames: 773095424. Throughput: 0: 41934.3. Samples: 773219220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:16:36,996][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 03:16:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047186_773095424.pth... 
[2024-06-18 03:16:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046570_763002880.pth [2024-06-18 03:16:39,411][12883] Updated weights for policy 0, policy_version 47191 (0.0030) [2024-06-18 03:16:41,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 773324800. Throughput: 0: 41747.5. Samples: 773453940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:16:41,995][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 03:16:42,143][12883] Updated weights for policy 0, policy_version 47201 (0.0035) [2024-06-18 03:16:46,994][12645] Fps is (10 sec: 39329.9, 60 sec: 40959.9, 300 sec: 41931.9). Total num frames: 773488640. Throughput: 0: 41774.9. Samples: 773583360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:16:46,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 03:16:47,465][12883] Updated weights for policy 0, policy_version 47211 (0.0046) [2024-06-18 03:16:50,591][12883] Updated weights for policy 0, policy_version 47221 (0.0031) [2024-06-18 03:16:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 773734400. Throughput: 0: 41997.7. Samples: 773842380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:16:51,998][12645] Avg episode reward: [(0, '0.057')] [2024-06-18 03:16:55,118][12883] Updated weights for policy 0, policy_version 47231 (0.0038) [2024-06-18 03:16:56,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 773963776. Throughput: 0: 42115.7. Samples: 774090280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:16:56,994][12645] Avg episode reward: [(0, '0.179')] [2024-06-18 03:16:58,144][12883] Updated weights for policy 0, policy_version 47241 (0.0046) [2024-06-18 03:17:01,995][12645] Fps is (10 sec: 40954.2, 60 sec: 42051.1, 300 sec: 41931.7). Total num frames: 774144000. Throughput: 0: 41925.2. Samples: 774221320. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:17:01,996][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 03:17:02,703][12883] Updated weights for policy 0, policy_version 47251 (0.0036) [2024-06-18 03:17:05,671][12883] Updated weights for policy 0, policy_version 47261 (0.0045) [2024-06-18 03:17:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 774373376. Throughput: 0: 42058.7. Samples: 774474900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-06-18 03:17:06,994][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 03:17:10,383][12883] Updated weights for policy 0, policy_version 47271 (0.0044) [2024-06-18 03:17:11,994][12645] Fps is (10 sec: 44243.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 774586368. Throughput: 0: 42137.7. Samples: 774727840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-06-18 03:17:11,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 03:17:13,604][12883] Updated weights for policy 0, policy_version 47281 (0.0037) [2024-06-18 03:17:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 774782976. Throughput: 0: 41983.2. Samples: 774854080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-06-18 03:17:16,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 03:17:18,076][12883] Updated weights for policy 0, policy_version 47291 (0.0036) [2024-06-18 03:17:18,610][12862] Signal inference workers to stop experience collection... (11050 times) [2024-06-18 03:17:18,610][12862] Signal inference workers to resume experience collection... 
(11050 times) [2024-06-18 03:17:18,637][12883] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-06-18 03:17:18,637][12883] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-06-18 03:17:21,135][12883] Updated weights for policy 0, policy_version 47301 (0.0037) [2024-06-18 03:17:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 775012352. Throughput: 0: 42033.6. Samples: 775110640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-06-18 03:17:21,996][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 03:17:25,543][12883] Updated weights for policy 0, policy_version 47311 (0.0039) [2024-06-18 03:17:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 775208960. Throughput: 0: 42466.3. Samples: 775364920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-06-18 03:17:26,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 03:17:28,787][12883] Updated weights for policy 0, policy_version 47321 (0.0036) [2024-06-18 03:17:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 775405568. Throughput: 0: 42303.5. Samples: 775487020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 26.0) [2024-06-18 03:17:31,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 03:17:33,400][12883] Updated weights for policy 0, policy_version 47331 (0.0033) [2024-06-18 03:17:36,385][12883] Updated weights for policy 0, policy_version 47341 (0.0032) [2024-06-18 03:17:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42599.9, 300 sec: 42154.1). Total num frames: 775651328. Throughput: 0: 42283.1. Samples: 775745120. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 03:17:36,994][12645] Avg episode reward: [(0, '0.071')] [2024-06-18 03:17:41,051][12883] Updated weights for policy 0, policy_version 47351 (0.0038) [2024-06-18 03:17:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42043.0). 
Total num frames: 775847936. Throughput: 0: 42503.1. Samples: 776002920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 03:17:41,994][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 03:17:44,144][12883] Updated weights for policy 0, policy_version 47361 (0.0039) [2024-06-18 03:17:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 776044544. Throughput: 0: 42277.3. Samples: 776123740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 03:17:46,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 03:17:48,854][12883] Updated weights for policy 0, policy_version 47371 (0.0042) [2024-06-18 03:17:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 776273920. Throughput: 0: 42324.3. Samples: 776379500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 03:17:51,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 03:17:52,015][12883] Updated weights for policy 0, policy_version 47381 (0.0025) [2024-06-18 03:17:56,716][12883] Updated weights for policy 0, policy_version 47391 (0.0027) [2024-06-18 03:17:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 776470528. Throughput: 0: 42427.6. Samples: 776637080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 03:17:56,994][12645] Avg episode reward: [(0, '0.137')] [2024-06-18 03:17:59,997][12883] Updated weights for policy 0, policy_version 47401 (0.0034) [2024-06-18 03:18:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.3, 300 sec: 42154.1). Total num frames: 776683520. Throughput: 0: 42306.2. Samples: 776757860. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 03:18:01,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 03:18:04,301][12883] Updated weights for policy 0, policy_version 47411 (0.0030) [2024-06-18 03:18:06,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.7, 300 sec: 42209.3). Total num frames: 776912896. 
Throughput: 0: 42230.8. Samples: 777011120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-18 03:18:06,996][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 03:18:07,571][12883] Updated weights for policy 0, policy_version 47421 (0.0040) [2024-06-18 03:18:11,913][12883] Updated weights for policy 0, policy_version 47431 (0.0029) [2024-06-18 03:18:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 777109504. Throughput: 0: 42323.9. Samples: 777269500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-18 03:18:11,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 03:18:15,289][12883] Updated weights for policy 0, policy_version 47441 (0.0051) [2024-06-18 03:18:16,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 777338880. Throughput: 0: 42349.0. Samples: 777392720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-18 03:18:16,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 03:18:19,590][12883] Updated weights for policy 0, policy_version 47451 (0.0022) [2024-06-18 03:18:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42099.4). Total num frames: 777519104. Throughput: 0: 42247.9. Samples: 777646280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-18 03:18:22,000][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 03:18:23,241][12883] Updated weights for policy 0, policy_version 47461 (0.0031) [2024-06-18 03:18:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 777732096. Throughput: 0: 42175.1. Samples: 777900800. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-18 03:18:26,994][12645] Avg episode reward: [(0, '0.023')] [2024-06-18 03:18:27,558][12883] Updated weights for policy 0, policy_version 47471 (0.0028) [2024-06-18 03:18:30,774][12883] Updated weights for policy 0, policy_version 47481 (0.0036) [2024-06-18 03:18:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 777961472. Throughput: 0: 42289.8. Samples: 778026780. Policy #0 lag: (min: 0.0, avg: 8.4, max: 22.0) [2024-06-18 03:18:31,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 03:18:35,191][12883] Updated weights for policy 0, policy_version 47491 (0.0035) [2024-06-18 03:18:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 778158080. Throughput: 0: 42263.2. Samples: 778281340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:18:36,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 03:18:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047495_778158080.pth... [2024-06-18 03:18:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000046880_768081920.pth [2024-06-18 03:18:38,681][12883] Updated weights for policy 0, policy_version 47501 (0.0044) [2024-06-18 03:18:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 778371072. Throughput: 0: 42100.4. Samples: 778531600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:18:41,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 03:18:42,506][12862] Signal inference workers to stop experience collection... (11100 times) [2024-06-18 03:18:42,506][12862] Signal inference workers to resume experience collection... 
(11100 times) [2024-06-18 03:18:42,525][12883] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-06-18 03:18:42,525][12883] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-06-18 03:18:42,805][12883] Updated weights for policy 0, policy_version 47511 (0.0028) [2024-06-18 03:18:46,728][12883] Updated weights for policy 0, policy_version 47521 (0.0035) [2024-06-18 03:18:46,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 778584064. Throughput: 0: 42157.7. Samples: 778654960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:18:46,995][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 03:18:50,565][12883] Updated weights for policy 0, policy_version 47531 (0.0033) [2024-06-18 03:18:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 778780672. Throughput: 0: 42079.8. Samples: 778904620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:18:51,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 03:18:54,570][12883] Updated weights for policy 0, policy_version 47541 (0.0040) [2024-06-18 03:18:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 778993664. Throughput: 0: 42043.6. Samples: 779161460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:18:56,994][12645] Avg episode reward: [(0, '0.126')] [2024-06-18 03:18:58,209][12883] Updated weights for policy 0, policy_version 47551 (0.0042) [2024-06-18 03:19:01,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 779223040. Throughput: 0: 42179.7. Samples: 779290800. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:19:01,994][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 03:19:02,543][12883] Updated weights for policy 0, policy_version 47561 (0.0023) [2024-06-18 03:19:06,204][12883] Updated weights for policy 0, policy_version 47571 (0.0036) [2024-06-18 03:19:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41780.7, 300 sec: 42043.0). Total num frames: 779419648. Throughput: 0: 42155.7. Samples: 779543280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 03:19:06,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 03:19:10,304][12883] Updated weights for policy 0, policy_version 47581 (0.0040) [2024-06-18 03:19:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 779632640. Throughput: 0: 41978.1. Samples: 779789820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 03:19:11,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 03:19:14,076][12883] Updated weights for policy 0, policy_version 47591 (0.0041) [2024-06-18 03:19:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 779845632. Throughput: 0: 42037.4. Samples: 779918460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 03:19:16,994][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 03:19:17,885][12883] Updated weights for policy 0, policy_version 47601 (0.0040) [2024-06-18 03:19:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 780042240. Throughput: 0: 42130.6. Samples: 780177220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 03:19:21,994][12645] Avg episode reward: [(0, '0.137')] [2024-06-18 03:19:21,997][12883] Updated weights for policy 0, policy_version 47611 (0.0037) [2024-06-18 03:19:25,978][12883] Updated weights for policy 0, policy_version 47621 (0.0026) [2024-06-18 03:19:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42098.6). 
Total num frames: 780255232. Throughput: 0: 42132.1. Samples: 780427540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 03:19:26,994][12645] Avg episode reward: [(0, '0.161')] [2024-06-18 03:19:29,878][12883] Updated weights for policy 0, policy_version 47631 (0.0032) [2024-06-18 03:19:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 780484608. Throughput: 0: 42179.8. Samples: 780553040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 03:19:31,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 03:19:33,611][12883] Updated weights for policy 0, policy_version 47641 (0.0051) [2024-06-18 03:19:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 780664832. Throughput: 0: 42280.5. Samples: 780807240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 03:19:36,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 03:19:37,484][12883] Updated weights for policy 0, policy_version 47651 (0.0038) [2024-06-18 03:19:41,242][12883] Updated weights for policy 0, policy_version 47661 (0.0024) [2024-06-18 03:19:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 780877824. Throughput: 0: 42221.0. Samples: 781061400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 03:19:41,994][12645] Avg episode reward: [(0, '0.179')] [2024-06-18 03:19:45,490][12883] Updated weights for policy 0, policy_version 47671 (0.0037) [2024-06-18 03:19:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 781123584. Throughput: 0: 42195.0. Samples: 781189580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 03:19:46,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 03:19:49,307][12883] Updated weights for policy 0, policy_version 47681 (0.0026) [2024-06-18 03:19:51,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 781320192. 
Throughput: 0: 42155.4. Samples: 781440280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 03:19:51,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 03:19:52,987][12883] Updated weights for policy 0, policy_version 47691 (0.0031) [2024-06-18 03:19:56,910][12883] Updated weights for policy 0, policy_version 47701 (0.0037) [2024-06-18 03:19:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 781533184. Throughput: 0: 42326.2. Samples: 781694500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 03:19:56,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 03:19:59,592][12862] Signal inference workers to stop experience collection... (11150 times) [2024-06-18 03:19:59,593][12862] Signal inference workers to resume experience collection... (11150 times) [2024-06-18 03:19:59,643][12883] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-06-18 03:19:59,643][12883] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-06-18 03:20:00,510][12883] Updated weights for policy 0, policy_version 47711 (0.0033) [2024-06-18 03:20:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 781762560. Throughput: 0: 42313.8. Samples: 781822580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 03:20:01,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 03:20:04,567][12883] Updated weights for policy 0, policy_version 47721 (0.0035) [2024-06-18 03:20:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 781942784. Throughput: 0: 42253.8. Samples: 782078640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:20:06,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 03:20:08,251][12883] Updated weights for policy 0, policy_version 47731 (0.0026) [2024-06-18 03:20:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42098.5). 
Total num frames: 782155776. Throughput: 0: 42160.3. Samples: 782324760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:20:11,994][12645] Avg episode reward: [(0, '0.034')] [2024-06-18 03:20:12,241][12883] Updated weights for policy 0, policy_version 47741 (0.0037) [2024-06-18 03:20:15,927][12883] Updated weights for policy 0, policy_version 47751 (0.0030) [2024-06-18 03:20:16,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42596.8, 300 sec: 42264.9). Total num frames: 782401536. Throughput: 0: 42303.2. Samples: 782456780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:20:16,997][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 03:20:19,816][12883] Updated weights for policy 0, policy_version 47761 (0.0038) [2024-06-18 03:20:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 782581760. Throughput: 0: 42285.8. Samples: 782710100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:20:21,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 03:20:23,704][12883] Updated weights for policy 0, policy_version 47771 (0.0039) [2024-06-18 03:20:26,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 782794752. Throughput: 0: 42194.7. Samples: 782960160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:20:26,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 03:20:27,543][12883] Updated weights for policy 0, policy_version 47781 (0.0033) [2024-06-18 03:20:31,695][12883] Updated weights for policy 0, policy_version 47791 (0.0030) [2024-06-18 03:20:31,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42210.2). Total num frames: 783024128. Throughput: 0: 42098.3. Samples: 783084000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:20:31,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 03:20:35,819][12883] Updated weights for policy 0, policy_version 47801 (0.0032) [2024-06-18 03:20:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 783204352. Throughput: 0: 42179.6. Samples: 783338360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 03:20:36,994][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 03:20:37,120][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047804_783220736.pth... [2024-06-18 03:20:37,185][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047186_773095424.pth [2024-06-18 03:20:39,387][12883] Updated weights for policy 0, policy_version 47811 (0.0035) [2024-06-18 03:20:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 783417344. Throughput: 0: 41983.2. Samples: 783583740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 03:20:41,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 03:20:43,679][12883] Updated weights for policy 0, policy_version 47821 (0.0025) [2024-06-18 03:20:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 783646720. Throughput: 0: 42106.5. Samples: 783717380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 03:20:46,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 03:20:47,113][12883] Updated weights for policy 0, policy_version 47831 (0.0045) [2024-06-18 03:20:51,610][12883] Updated weights for policy 0, policy_version 47841 (0.0034) [2024-06-18 03:20:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 783843328. Throughput: 0: 41909.9. Samples: 783964580. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 03:20:51,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 03:20:54,771][12883] Updated weights for policy 0, policy_version 47851 (0.0042) [2024-06-18 03:20:54,965][12862] Signal inference workers to stop experience collection... (11200 times) [2024-06-18 03:20:54,971][12862] Signal inference workers to resume experience collection... (11200 times) [2024-06-18 03:20:54,999][12883] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-06-18 03:20:55,004][12883] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-06-18 03:20:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 784072704. Throughput: 0: 41903.6. Samples: 784210420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 03:20:56,994][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 03:20:59,450][12883] Updated weights for policy 0, policy_version 47861 (0.0037) [2024-06-18 03:21:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 784252928. Throughput: 0: 41895.5. Samples: 784341980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 03:21:01,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 03:21:02,740][12883] Updated weights for policy 0, policy_version 47871 (0.0034) [2024-06-18 03:21:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 784449536. Throughput: 0: 41797.4. Samples: 784590980. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) [2024-06-18 03:21:06,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 03:21:07,230][12883] Updated weights for policy 0, policy_version 47881 (0.0032) [2024-06-18 03:21:10,504][12883] Updated weights for policy 0, policy_version 47891 (0.0038) [2024-06-18 03:21:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 784695296. Throughput: 0: 41592.5. 
Samples: 784831820. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) [2024-06-18 03:21:11,994][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 03:21:15,104][12883] Updated weights for policy 0, policy_version 47901 (0.0038) [2024-06-18 03:21:16,994][12645] Fps is (10 sec: 42597.3, 60 sec: 41234.5, 300 sec: 42098.5). Total num frames: 784875520. Throughput: 0: 41850.4. Samples: 784967280. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) [2024-06-18 03:21:16,994][12645] Avg episode reward: [(0, '0.283')] [2024-06-18 03:21:18,352][12883] Updated weights for policy 0, policy_version 47911 (0.0039) [2024-06-18 03:21:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 785088512. Throughput: 0: 41638.4. Samples: 785212080. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) [2024-06-18 03:21:21,994][12645] Avg episode reward: [(0, '0.072')] [2024-06-18 03:21:23,518][12883] Updated weights for policy 0, policy_version 47921 (0.0036) [2024-06-18 03:21:26,262][12883] Updated weights for policy 0, policy_version 47931 (0.0031) [2024-06-18 03:21:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 785317888. Throughput: 0: 41628.8. Samples: 785457040. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) [2024-06-18 03:21:26,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 03:21:31,211][12883] Updated weights for policy 0, policy_version 47941 (0.0033) [2024-06-18 03:21:31,994][12645] Fps is (10 sec: 39320.9, 60 sec: 40959.9, 300 sec: 41987.8). Total num frames: 785481728. Throughput: 0: 41613.8. Samples: 785590000. Policy #0 lag: (min: 0.0, avg: 14.0, max: 25.0) [2024-06-18 03:21:31,994][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 03:21:34,018][12883] Updated weights for policy 0, policy_version 47951 (0.0037) [2024-06-18 03:21:36,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 785694720. Throughput: 0: 41592.8. Samples: 785836260. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) [2024-06-18 03:21:36,994][12645] Avg episode reward: [(0, '0.030')] [2024-06-18 03:21:39,387][12883] Updated weights for policy 0, policy_version 47961 (0.0047) [2024-06-18 03:21:41,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 785940480. Throughput: 0: 41534.3. Samples: 786079460. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) [2024-06-18 03:21:41,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 03:21:42,052][12883] Updated weights for policy 0, policy_version 47971 (0.0025) [2024-06-18 03:21:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41931.9). Total num frames: 786104320. Throughput: 0: 41660.3. Samples: 786216700. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) [2024-06-18 03:21:46,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 03:21:47,269][12883] Updated weights for policy 0, policy_version 47981 (0.0036) [2024-06-18 03:21:49,883][12883] Updated weights for policy 0, policy_version 47991 (0.0032) [2024-06-18 03:21:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 786350080. Throughput: 0: 41560.9. Samples: 786461220. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) [2024-06-18 03:21:51,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 03:21:54,802][12883] Updated weights for policy 0, policy_version 48001 (0.0047) [2024-06-18 03:21:56,994][12645] Fps is (10 sec: 47514.2, 60 sec: 41779.3, 300 sec: 42154.3). Total num frames: 786579456. Throughput: 0: 41821.7. Samples: 786713800. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) [2024-06-18 03:21:56,994][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 03:21:57,628][12883] Updated weights for policy 0, policy_version 48011 (0.0029) [2024-06-18 03:21:57,668][12862] Signal inference workers to stop experience collection... 
(11250 times) [2024-06-18 03:21:57,668][12862] Signal inference workers to resume experience collection... (11250 times) [2024-06-18 03:21:57,685][12883] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-06-18 03:21:57,686][12883] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-06-18 03:22:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 786726912. Throughput: 0: 41690.4. Samples: 786843340. Policy #0 lag: (min: 1.0, avg: 12.7, max: 27.0) [2024-06-18 03:22:01,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 03:22:02,656][12883] Updated weights for policy 0, policy_version 48021 (0.0044) [2024-06-18 03:22:05,254][12883] Updated weights for policy 0, policy_version 48031 (0.0035) [2024-06-18 03:22:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 786989056. Throughput: 0: 41685.3. Samples: 787087920. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-18 03:22:06,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 03:22:10,613][12883] Updated weights for policy 0, policy_version 48041 (0.0035) [2024-06-18 03:22:11,994][12645] Fps is (10 sec: 45875.5, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 787185664. Throughput: 0: 41962.8. Samples: 787345360. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-18 03:22:11,994][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 03:22:12,958][12883] Updated weights for policy 0, policy_version 48051 (0.0028) [2024-06-18 03:22:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 787365888. Throughput: 0: 41741.4. Samples: 787468360. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-18 03:22:16,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 03:22:18,059][12883] Updated weights for policy 0, policy_version 48061 (0.0029) [2024-06-18 03:22:21,287][12883] Updated weights for policy 0, policy_version 48071 (0.0027) [2024-06-18 03:22:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 787611648. Throughput: 0: 41781.3. Samples: 787716420. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-18 03:22:21,994][12645] Avg episode reward: [(0, '0.071')] [2024-06-18 03:22:25,548][12883] Updated weights for policy 0, policy_version 48081 (0.0036) [2024-06-18 03:22:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 787808256. Throughput: 0: 42114.6. Samples: 787974620. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-18 03:22:26,994][12645] Avg episode reward: [(0, '0.070')] [2024-06-18 03:22:29,454][12883] Updated weights for policy 0, policy_version 48091 (0.0043) [2024-06-18 03:22:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 788021248. Throughput: 0: 41924.9. Samples: 788103320. Policy #0 lag: (min: 2.0, avg: 10.1, max: 21.0) [2024-06-18 03:22:31,994][12645] Avg episode reward: [(0, '0.054')] [2024-06-18 03:22:33,299][12883] Updated weights for policy 0, policy_version 48101 (0.0044) [2024-06-18 03:22:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 788234240. Throughput: 0: 42080.9. Samples: 788354860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:22:36,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 03:22:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048111_788250624.pth... 
[2024-06-18 03:22:37,029][12883] Updated weights for policy 0, policy_version 48111 (0.0036) [2024-06-18 03:22:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047495_778158080.pth [2024-06-18 03:22:41,003][12883] Updated weights for policy 0, policy_version 48121 (0.0044) [2024-06-18 03:22:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 788414464. Throughput: 0: 42181.8. Samples: 788611980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:22:41,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 03:22:44,656][12883] Updated weights for policy 0, policy_version 48131 (0.0042) [2024-06-18 03:22:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 788660224. Throughput: 0: 41960.0. Samples: 788731540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:22:46,994][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 03:22:48,793][12883] Updated weights for policy 0, policy_version 48141 (0.0043) [2024-06-18 03:22:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 788873216. Throughput: 0: 42231.1. Samples: 788988320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:22:51,994][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 03:22:52,580][12883] Updated weights for policy 0, policy_version 48151 (0.0036) [2024-06-18 03:22:56,886][12883] Updated weights for policy 0, policy_version 48161 (0.0024) [2024-06-18 03:22:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 789069824. Throughput: 0: 42141.3. Samples: 789241720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:22:56,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 03:23:00,049][12883] Updated weights for policy 0, policy_version 48171 (0.0034) [2024-06-18 03:23:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 41987.8). Total num frames: 789299200. Throughput: 0: 42174.5. Samples: 789366220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:23:01,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 03:23:04,454][12883] Updated weights for policy 0, policy_version 48181 (0.0036) [2024-06-18 03:23:07,000][12645] Fps is (10 sec: 44209.1, 60 sec: 42047.9, 300 sec: 42042.1). Total num frames: 789512192. Throughput: 0: 42433.8. Samples: 789626200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 03:23:07,000][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 03:23:07,711][12883] Updated weights for policy 0, policy_version 48191 (0.0035) [2024-06-18 03:23:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 789708800. Throughput: 0: 42255.5. Samples: 789876120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 03:23:11,994][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 03:23:12,135][12883] Updated weights for policy 0, policy_version 48201 (0.0030) [2024-06-18 03:23:14,357][12862] Signal inference workers to stop experience collection... (11300 times) [2024-06-18 03:23:14,407][12883] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-06-18 03:23:14,414][12862] Signal inference workers to resume experience collection... (11300 times) [2024-06-18 03:23:14,424][12883] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-06-18 03:23:15,539][12883] Updated weights for policy 0, policy_version 48211 (0.0058) [2024-06-18 03:23:16,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 789921792. Throughput: 0: 42141.0. 
Samples: 789999660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 03:23:16,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 03:23:20,544][12883] Updated weights for policy 0, policy_version 48221 (0.0034) [2024-06-18 03:23:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 790134784. Throughput: 0: 42365.3. Samples: 790261300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 03:23:21,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 03:23:23,313][12883] Updated weights for policy 0, policy_version 48231 (0.0030) [2024-06-18 03:23:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 790347776. Throughput: 0: 42099.7. Samples: 790506460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 03:23:26,994][12645] Avg episode reward: [(0, '0.038')] [2024-06-18 03:23:28,082][12883] Updated weights for policy 0, policy_version 48241 (0.0034) [2024-06-18 03:23:31,743][12883] Updated weights for policy 0, policy_version 48251 (0.0035) [2024-06-18 03:23:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 790560768. Throughput: 0: 42348.1. Samples: 790637200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 03:23:31,994][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 03:23:36,414][12883] Updated weights for policy 0, policy_version 48261 (0.0042) [2024-06-18 03:23:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 790757376. Throughput: 0: 42332.9. Samples: 790893300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:23:36,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 03:23:39,457][12883] Updated weights for policy 0, policy_version 48271 (0.0036) [2024-06-18 03:23:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42098.6). Total num frames: 791003136. Throughput: 0: 42075.2. Samples: 791135100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:23:41,994][12645] Avg episode reward: [(0, '0.032')] [2024-06-18 03:23:44,073][12883] Updated weights for policy 0, policy_version 48281 (0.0045) [2024-06-18 03:23:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 791183360. Throughput: 0: 42313.9. Samples: 791270340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:23:46,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 03:23:47,026][12883] Updated weights for policy 0, policy_version 48291 (0.0034) [2024-06-18 03:23:51,684][12883] Updated weights for policy 0, policy_version 48301 (0.0053) [2024-06-18 03:23:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 791379968. Throughput: 0: 42053.0. Samples: 791518320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:23:51,994][12645] Avg episode reward: [(0, '0.124')] [2024-06-18 03:23:54,779][12883] Updated weights for policy 0, policy_version 48311 (0.0048) [2024-06-18 03:23:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42098.5). Total num frames: 791642112. Throughput: 0: 41987.6. Samples: 791765560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:23:56,994][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 03:23:59,306][12883] Updated weights for policy 0, policy_version 48321 (0.0028) [2024-06-18 03:24:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 791822336. Throughput: 0: 42355.0. Samples: 791905640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 03:24:01,994][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 03:24:02,469][12883] Updated weights for policy 0, policy_version 48331 (0.0049) [2024-06-18 03:24:06,724][12883] Updated weights for policy 0, policy_version 48341 (0.0038) [2024-06-18 03:24:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42056.6, 300 sec: 42043.0). 
Total num frames: 792035328. Throughput: 0: 41979.1. Samples: 792150360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 03:24:06,996][12645] Avg episode reward: [(0, '0.053')] [2024-06-18 03:24:10,559][12883] Updated weights for policy 0, policy_version 48351 (0.0031) [2024-06-18 03:24:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 792264704. Throughput: 0: 42134.6. Samples: 792402520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 03:24:11,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 03:24:14,499][12883] Updated weights for policy 0, policy_version 48361 (0.0036) [2024-06-18 03:24:16,997][12645] Fps is (10 sec: 39309.0, 60 sec: 41776.9, 300 sec: 41987.0). Total num frames: 792428544. Throughput: 0: 42215.2. Samples: 792537020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 03:24:17,004][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 03:24:17,184][12862] Signal inference workers to stop experience collection... (11350 times) [2024-06-18 03:24:17,208][12883] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-06-18 03:24:17,245][12862] Signal inference workers to resume experience collection... (11350 times) [2024-06-18 03:24:17,245][12883] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-06-18 03:24:18,060][12883] Updated weights for policy 0, policy_version 48371 (0.0033) [2024-06-18 03:24:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 792657920. Throughput: 0: 41947.0. Samples: 792780920. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 03:24:21,995][12645] Avg episode reward: [(0, '0.071')] [2024-06-18 03:24:22,066][12883] Updated weights for policy 0, policy_version 48381 (0.0031) [2024-06-18 03:24:25,732][12883] Updated weights for policy 0, policy_version 48391 (0.0032) [2024-06-18 03:24:26,994][12645] Fps is (10 sec: 47529.0, 60 sec: 42598.3, 300 sec: 42098.6). Total num frames: 792903680. Throughput: 0: 42286.2. Samples: 793037980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 03:24:26,994][12645] Avg episode reward: [(0, '0.053')] [2024-06-18 03:24:29,710][12883] Updated weights for policy 0, policy_version 48401 (0.0029) [2024-06-18 03:24:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 793067520. Throughput: 0: 42110.6. Samples: 793165320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 03:24:31,996][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 03:24:33,843][12883] Updated weights for policy 0, policy_version 48411 (0.0041) [2024-06-18 03:24:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 793313280. Throughput: 0: 41990.5. Samples: 793407900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 03:24:36,994][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 03:24:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048420_793313280.pth... [2024-06-18 03:24:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000047804_783220736.pth [2024-06-18 03:24:37,359][12883] Updated weights for policy 0, policy_version 48421 (0.0033) [2024-06-18 03:24:41,776][12883] Updated weights for policy 0, policy_version 48431 (0.0030) [2024-06-18 03:24:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 793493504. Throughput: 0: 42366.7. Samples: 793672060. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 03:24:41,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 03:24:45,022][12883] Updated weights for policy 0, policy_version 48441 (0.0032) [2024-06-18 03:24:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 793690112. Throughput: 0: 41764.1. Samples: 793785020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 03:24:46,994][12645] Avg episode reward: [(0, '0.021')] [2024-06-18 03:24:49,593][12883] Updated weights for policy 0, policy_version 48451 (0.0040) [2024-06-18 03:24:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 793935872. Throughput: 0: 42036.4. Samples: 794042000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 03:24:51,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 03:24:52,679][12883] Updated weights for policy 0, policy_version 48461 (0.0021) [2024-06-18 03:24:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 794116096. Throughput: 0: 42475.1. Samples: 794313900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 03:24:56,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 03:24:57,233][12883] Updated weights for policy 0, policy_version 48471 (0.0044) [2024-06-18 03:25:00,199][12883] Updated weights for policy 0, policy_version 48481 (0.0031) [2024-06-18 03:25:01,996][12645] Fps is (10 sec: 40951.4, 60 sec: 42050.8, 300 sec: 42042.7). Total num frames: 794345472. Throughput: 0: 42051.2. Samples: 794429280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 03:25:01,996][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 03:25:05,036][12883] Updated weights for policy 0, policy_version 48491 (0.0025) [2024-06-18 03:25:06,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 794591232. Throughput: 0: 42255.7. Samples: 794682420. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 03:25:06,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 03:25:08,107][12883] Updated weights for policy 0, policy_version 48501 (0.0036) [2024-06-18 03:25:11,994][12645] Fps is (10 sec: 39329.5, 60 sec: 41232.9, 300 sec: 41821.2). Total num frames: 794738688. Throughput: 0: 42240.3. Samples: 794938800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 03:25:11,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 03:25:12,151][12862] Signal inference workers to stop experience collection... (11400 times) [2024-06-18 03:25:12,152][12862] Signal inference workers to resume experience collection... (11400 times) [2024-06-18 03:25:12,195][12883] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-06-18 03:25:12,195][12883] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-06-18 03:25:12,930][12883] Updated weights for policy 0, policy_version 48511 (0.0043) [2024-06-18 03:25:15,917][12883] Updated weights for policy 0, policy_version 48521 (0.0030) [2024-06-18 03:25:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42873.8, 300 sec: 42098.6). Total num frames: 795000832. Throughput: 0: 41960.5. Samples: 795053540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 03:25:16,994][12645] Avg episode reward: [(0, '0.174')] [2024-06-18 03:25:20,638][12883] Updated weights for policy 0, policy_version 48531 (0.0035) [2024-06-18 03:25:21,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 795197440. Throughput: 0: 42465.2. Samples: 795318840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 03:25:21,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 03:25:23,641][12883] Updated weights for policy 0, policy_version 48541 (0.0031) [2024-06-18 03:25:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 795377664. Throughput: 0: 42290.2. 
Samples: 795575120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 03:25:26,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 03:25:28,305][12883] Updated weights for policy 0, policy_version 48551 (0.0032) [2024-06-18 03:25:31,295][12883] Updated weights for policy 0, policy_version 48561 (0.0043) [2024-06-18 03:25:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 795639808. Throughput: 0: 42428.8. Samples: 795694320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 03:25:31,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 03:25:36,021][12883] Updated weights for policy 0, policy_version 48571 (0.0038) [2024-06-18 03:25:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 795820032. Throughput: 0: 42413.4. Samples: 795950600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 03:25:36,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 03:25:39,270][12883] Updated weights for policy 0, policy_version 48581 (0.0030) [2024-06-18 03:25:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 796033024. Throughput: 0: 41841.3. Samples: 796196760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 03:25:41,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 03:25:43,927][12883] Updated weights for policy 0, policy_version 48591 (0.0031) [2024-06-18 03:25:46,802][12883] Updated weights for policy 0, policy_version 48601 (0.0027) [2024-06-18 03:25:46,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42154.1). Total num frames: 796278784. Throughput: 0: 42168.2. Samples: 796326760. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 03:25:46,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 03:25:51,697][12883] Updated weights for policy 0, policy_version 48611 (0.0033) [2024-06-18 03:25:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 796442624. Throughput: 0: 42100.0. Samples: 796576920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 03:25:51,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 03:25:54,473][12883] Updated weights for policy 0, policy_version 48621 (0.0026) [2024-06-18 03:25:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 796672000. Throughput: 0: 41987.2. Samples: 796828220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 03:25:56,994][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 03:25:59,744][12883] Updated weights for policy 0, policy_version 48631 (0.0042) [2024-06-18 03:26:01,996][12645] Fps is (10 sec: 45864.9, 60 sec: 42598.3, 300 sec: 42209.3). Total num frames: 796901376. Throughput: 0: 42380.1. Samples: 796960740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 03:26:01,997][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 03:26:02,310][12883] Updated weights for policy 0, policy_version 48641 (0.0032) [2024-06-18 03:26:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 797065216. Throughput: 0: 42063.8. Samples: 797211700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:26:06,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 03:26:07,439][12883] Updated weights for policy 0, policy_version 48651 (0.0029) [2024-06-18 03:26:10,345][12883] Updated weights for policy 0, policy_version 48661 (0.0034) [2024-06-18 03:26:11,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.6, 300 sec: 42209.7). Total num frames: 797327360. Throughput: 0: 41804.8. Samples: 797456340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:26:11,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 03:26:15,190][12883] Updated weights for policy 0, policy_version 48671 (0.0041) [2024-06-18 03:26:16,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 797523968. Throughput: 0: 42225.3. Samples: 797594460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:26:16,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 03:26:18,199][12883] Updated weights for policy 0, policy_version 48681 (0.0030) [2024-06-18 03:26:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 797720576. Throughput: 0: 42043.4. Samples: 797842560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:26:22,004][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 03:26:22,743][12883] Updated weights for policy 0, policy_version 48691 (0.0031) [2024-06-18 03:26:25,788][12883] Updated weights for policy 0, policy_version 48701 (0.0036) [2024-06-18 03:26:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 797949952. Throughput: 0: 42249.7. Samples: 798098000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:26:26,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 03:26:28,654][12862] Signal inference workers to stop experience collection... (11450 times) [2024-06-18 03:26:28,655][12862] Signal inference workers to resume experience collection... (11450 times) [2024-06-18 03:26:28,682][12883] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-06-18 03:26:28,683][12883] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-06-18 03:26:30,247][12883] Updated weights for policy 0, policy_version 48711 (0.0029) [2024-06-18 03:26:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 798146560. Throughput: 0: 42279.1. 
Samples: 798229320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:26:31,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 03:26:33,434][12883] Updated weights for policy 0, policy_version 48721 (0.0033) [2024-06-18 03:26:36,994][12645] Fps is (10 sec: 40956.8, 60 sec: 42324.7, 300 sec: 42098.4). Total num frames: 798359552. Throughput: 0: 42289.9. Samples: 798480000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:26:36,995][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 03:26:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048728_798359552.pth... [2024-06-18 03:26:37,058][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048111_788250624.pth [2024-06-18 03:26:37,995][12883] Updated weights for policy 0, policy_version 48731 (0.0026) [2024-06-18 03:26:41,760][12883] Updated weights for policy 0, policy_version 48741 (0.0033) [2024-06-18 03:26:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 798572544. Throughput: 0: 42379.6. Samples: 798735300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:26:41,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 03:26:45,625][12883] Updated weights for policy 0, policy_version 48751 (0.0034) [2024-06-18 03:26:46,994][12645] Fps is (10 sec: 42601.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 798785536. Throughput: 0: 42235.8. Samples: 798861260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:26:46,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 03:26:49,585][12883] Updated weights for policy 0, policy_version 48761 (0.0025) [2024-06-18 03:26:51,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42098.5). Total num frames: 798998528. Throughput: 0: 42368.2. Samples: 799118280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:26:51,994][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 03:26:53,179][12883] Updated weights for policy 0, policy_version 48771 (0.0026) [2024-06-18 03:26:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 799211520. Throughput: 0: 42481.3. Samples: 799368000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:26:56,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 03:26:57,341][12883] Updated weights for policy 0, policy_version 48781 (0.0042) [2024-06-18 03:27:01,014][12883] Updated weights for policy 0, policy_version 48791 (0.0026) [2024-06-18 03:27:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41780.7, 300 sec: 42098.5). Total num frames: 799408128. Throughput: 0: 42187.2. Samples: 799492880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:27:01,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 03:27:05,172][12883] Updated weights for policy 0, policy_version 48801 (0.0030) [2024-06-18 03:27:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 799621120. Throughput: 0: 42272.1. Samples: 799744800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 03:27:06,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 03:27:08,774][12883] Updated weights for policy 0, policy_version 48811 (0.0035) [2024-06-18 03:27:11,996][12645] Fps is (10 sec: 42588.9, 60 sec: 41777.6, 300 sec: 42264.8). Total num frames: 799834112. Throughput: 0: 42221.9. Samples: 799998080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 03:27:11,996][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 03:27:13,050][12883] Updated weights for policy 0, policy_version 48821 (0.0037) [2024-06-18 03:27:16,523][12883] Updated weights for policy 0, policy_version 48831 (0.0033) [2024-06-18 03:27:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42154.1). 
Total num frames: 800047104. Throughput: 0: 42076.1. Samples: 800122740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 03:27:17,000][12645] Avg episode reward: [(0, '0.215')] [2024-06-18 03:27:21,069][12883] Updated weights for policy 0, policy_version 48841 (0.0034) [2024-06-18 03:27:21,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 800276480. Throughput: 0: 42290.1. Samples: 800383020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 03:27:21,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 03:27:24,222][12883] Updated weights for policy 0, policy_version 48851 (0.0033) [2024-06-18 03:27:26,994][12645] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 800456704. Throughput: 0: 42186.9. Samples: 800633720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 03:27:26,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 03:27:28,925][12883] Updated weights for policy 0, policy_version 48861 (0.0028) [2024-06-18 03:27:31,823][12883] Updated weights for policy 0, policy_version 48871 (0.0037) [2024-06-18 03:27:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 800702464. Throughput: 0: 42175.7. Samples: 800759160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 03:27:31,994][12645] Avg episode reward: [(0, '0.029')] [2024-06-18 03:27:36,595][12883] Updated weights for policy 0, policy_version 48881 (0.0033) [2024-06-18 03:27:36,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.9, 300 sec: 42320.7). Total num frames: 800899072. Throughput: 0: 42256.6. Samples: 801019820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 03:27:36,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 03:27:39,610][12883] Updated weights for policy 0, policy_version 48891 (0.0040) [2024-06-18 03:27:42,000][12645] Fps is (10 sec: 40934.0, 60 sec: 42320.9, 300 sec: 42208.7). 
Total num frames: 801112064. Throughput: 0: 42056.9. Samples: 801260820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 03:27:42,000][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 03:27:44,280][12883] Updated weights for policy 0, policy_version 48901 (0.0040) [2024-06-18 03:27:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 801325056. Throughput: 0: 42165.8. Samples: 801390340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 03:27:46,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 03:27:47,682][12883] Updated weights for policy 0, policy_version 48911 (0.0032) [2024-06-18 03:27:51,994][12645] Fps is (10 sec: 39346.5, 60 sec: 41779.4, 300 sec: 42154.1). Total num frames: 801505280. Throughput: 0: 42155.1. Samples: 801641780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 03:27:51,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 03:27:52,021][12883] Updated weights for policy 0, policy_version 48921 (0.0027) [2024-06-18 03:27:55,498][12883] Updated weights for policy 0, policy_version 48931 (0.0033) [2024-06-18 03:27:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 801718272. Throughput: 0: 42110.0. Samples: 801892940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 03:27:56,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 03:27:59,606][12862] Signal inference workers to stop experience collection... (11500 times) [2024-06-18 03:27:59,627][12883] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-06-18 03:27:59,660][12862] Signal inference workers to resume experience collection... 
(11500 times) [2024-06-18 03:27:59,663][12883] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-06-18 03:27:59,666][12883] Updated weights for policy 0, policy_version 48941 (0.0034) [2024-06-18 03:28:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42099.4). Total num frames: 801931264. Throughput: 0: 42131.1. Samples: 802018640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 03:28:01,994][12645] Avg episode reward: [(0, '0.205')] [2024-06-18 03:28:03,297][12883] Updated weights for policy 0, policy_version 48951 (0.0035) [2024-06-18 03:28:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 802160640. Throughput: 0: 42035.6. Samples: 802274620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-18 03:28:06,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 03:28:07,128][12883] Updated weights for policy 0, policy_version 48961 (0.0040) [2024-06-18 03:28:11,273][12883] Updated weights for policy 0, policy_version 48971 (0.0039) [2024-06-18 03:28:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42053.8, 300 sec: 42154.1). Total num frames: 802357248. Throughput: 0: 42123.7. Samples: 802529280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 03:28:11,994][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 03:28:14,713][12883] Updated weights for policy 0, policy_version 48981 (0.0031) [2024-06-18 03:28:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 802570240. Throughput: 0: 42001.7. Samples: 802649240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 03:28:16,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 03:28:18,924][12883] Updated weights for policy 0, policy_version 48991 (0.0029) [2024-06-18 03:28:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 802783232. Throughput: 0: 42018.6. Samples: 802910660. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 03:28:21,994][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 03:28:22,374][12883] Updated weights for policy 0, policy_version 49001 (0.0030) [2024-06-18 03:28:26,741][12883] Updated weights for policy 0, policy_version 49011 (0.0027) [2024-06-18 03:28:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 802996224. Throughput: 0: 42175.6. Samples: 803158460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 03:28:26,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 03:28:30,379][12883] Updated weights for policy 0, policy_version 49021 (0.0021) [2024-06-18 03:28:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 803192832. Throughput: 0: 42134.6. Samples: 803286400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 03:28:31,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 03:28:34,418][12883] Updated weights for policy 0, policy_version 49031 (0.0039) [2024-06-18 03:28:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 803405824. Throughput: 0: 42230.0. Samples: 803542140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 03:28:36,994][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 03:28:37,039][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049037_803422208.pth... [2024-06-18 03:28:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048420_793313280.pth [2024-06-18 03:28:38,163][12883] Updated weights for policy 0, policy_version 49041 (0.0042) [2024-06-18 03:28:41,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42055.1, 300 sec: 42209.3). Total num frames: 803635200. Throughput: 0: 42147.3. Samples: 803789660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 03:28:41,997][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 03:28:42,333][12883] Updated weights for policy 0, policy_version 49051 (0.0024) [2024-06-18 03:28:46,183][12883] Updated weights for policy 0, policy_version 49061 (0.0047) [2024-06-18 03:28:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 803831808. Throughput: 0: 42241.0. Samples: 803919480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 03:28:46,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 03:28:50,258][12883] Updated weights for policy 0, policy_version 49071 (0.0022) [2024-06-18 03:28:51,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 804044800. Throughput: 0: 42284.4. Samples: 804177420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 03:28:51,994][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 03:28:54,229][12883] Updated weights for policy 0, policy_version 49081 (0.0032) [2024-06-18 03:28:56,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42869.9, 300 sec: 42264.8). Total num frames: 804290560. Throughput: 0: 42099.7. Samples: 804423860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 03:28:56,996][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 03:28:58,171][12883] Updated weights for policy 0, policy_version 49091 (0.0038) [2024-06-18 03:29:01,924][12883] Updated weights for policy 0, policy_version 49101 (0.0044) [2024-06-18 03:29:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 804470784. Throughput: 0: 42418.3. Samples: 804558060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 03:29:01,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 03:29:05,877][12883] Updated weights for policy 0, policy_version 49111 (0.0028) [2024-06-18 03:29:06,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42052.3, 300 sec: 42098.6). 
Total num frames: 804683776. Throughput: 0: 42313.9. Samples: 804814780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 03:29:06,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 03:29:09,456][12883] Updated weights for policy 0, policy_version 49121 (0.0035) [2024-06-18 03:29:10,302][12862] Signal inference workers to stop experience collection... (11550 times) [2024-06-18 03:29:10,303][12862] Signal inference workers to resume experience collection... (11550 times) [2024-06-18 03:29:10,337][12883] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-06-18 03:29:10,337][12883] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-06-18 03:29:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42376.7). Total num frames: 804929536. Throughput: 0: 42273.4. Samples: 805060760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 03:29:11,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 03:29:13,790][12883] Updated weights for policy 0, policy_version 49131 (0.0050) [2024-06-18 03:29:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 805109760. Throughput: 0: 42385.3. Samples: 805193740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 03:29:16,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 03:29:17,556][12883] Updated weights for policy 0, policy_version 49141 (0.0059) [2024-06-18 03:29:21,382][12883] Updated weights for policy 0, policy_version 49151 (0.0046) [2024-06-18 03:29:22,000][12645] Fps is (10 sec: 37658.5, 60 sec: 42047.7, 300 sec: 42042.1). Total num frames: 805306368. Throughput: 0: 42232.6. Samples: 805442880. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 03:29:22,000][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 03:29:25,094][12883] Updated weights for policy 0, policy_version 49161 (0.0034) [2024-06-18 03:29:26,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 805568512. Throughput: 0: 42250.9. Samples: 805690860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 03:29:26,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 03:29:27,016][12862] Saving new best policy, reward=0.390! [2024-06-18 03:29:29,143][12883] Updated weights for policy 0, policy_version 49171 (0.0039) [2024-06-18 03:29:31,994][12645] Fps is (10 sec: 42626.4, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 805732352. Throughput: 0: 42521.8. Samples: 805832960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 03:29:31,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 03:29:32,493][12883] Updated weights for policy 0, policy_version 49181 (0.0030) [2024-06-18 03:29:36,834][12883] Updated weights for policy 0, policy_version 49191 (0.0043) [2024-06-18 03:29:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 805945344. Throughput: 0: 42246.6. Samples: 806078520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 03:29:36,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 03:29:40,227][12883] Updated weights for policy 0, policy_version 49201 (0.0038) [2024-06-18 03:29:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42600.0, 300 sec: 42376.2). Total num frames: 806191104. Throughput: 0: 42290.6. Samples: 806326840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 03:29:41,994][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 03:29:44,562][12883] Updated weights for policy 0, policy_version 49211 (0.0037) [2024-06-18 03:29:47,000][12645] Fps is (10 sec: 42572.2, 60 sec: 42320.9, 300 sec: 42153.2). Total num frames: 806371328. 
Throughput: 0: 42393.2. Samples: 806466020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 03:29:47,000][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 03:29:47,953][12883] Updated weights for policy 0, policy_version 49221 (0.0031) [2024-06-18 03:29:51,999][12645] Fps is (10 sec: 39300.7, 60 sec: 42321.6, 300 sec: 42264.4). Total num frames: 806584320. Throughput: 0: 42166.1. Samples: 806712480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 03:29:51,999][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 03:29:52,151][12883] Updated weights for policy 0, policy_version 49231 (0.0022) [2024-06-18 03:29:55,636][12883] Updated weights for policy 0, policy_version 49241 (0.0044) [2024-06-18 03:29:56,994][12645] Fps is (10 sec: 45903.7, 60 sec: 42326.9, 300 sec: 42321.0). Total num frames: 806830080. Throughput: 0: 42206.2. Samples: 806960040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 03:29:56,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 03:29:59,571][12883] Updated weights for policy 0, policy_version 49251 (0.0037) [2024-06-18 03:30:01,994][12645] Fps is (10 sec: 40981.4, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 806993920. Throughput: 0: 42278.1. Samples: 807096260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 03:30:01,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 03:30:03,163][12883] Updated weights for policy 0, policy_version 49261 (0.0036) [2024-06-18 03:30:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42376.3). Total num frames: 807239680. Throughput: 0: 42516.4. Samples: 807355840. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 03:30:06,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 03:30:07,351][12883] Updated weights for policy 0, policy_version 49271 (0.0028) [2024-06-18 03:30:11,107][12883] Updated weights for policy 0, policy_version 49281 (0.0027) [2024-06-18 03:30:11,994][12645] Fps is (10 sec: 49152.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 807485440. Throughput: 0: 42490.2. Samples: 807602920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 03:30:11,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 03:30:14,120][12862] Signal inference workers to stop experience collection... (11600 times) [2024-06-18 03:30:14,154][12883] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-06-18 03:30:14,235][12862] Signal inference workers to resume experience collection... (11600 times) [2024-06-18 03:30:14,235][12883] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-06-18 03:30:14,842][12883] Updated weights for policy 0, policy_version 49291 (0.0033) [2024-06-18 03:30:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 807649280. Throughput: 0: 42237.3. Samples: 807733640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 03:30:16,994][12645] Avg episode reward: [(0, '0.105')] [2024-06-18 03:30:18,905][12883] Updated weights for policy 0, policy_version 49301 (0.0036) [2024-06-18 03:30:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42876.2, 300 sec: 42376.2). Total num frames: 807878656. Throughput: 0: 42342.3. Samples: 807983920. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 03:30:21,994][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 03:30:22,842][12883] Updated weights for policy 0, policy_version 49311 (0.0030) [2024-06-18 03:30:26,743][12883] Updated weights for policy 0, policy_version 49321 (0.0028) [2024-06-18 03:30:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 808091648. Throughput: 0: 42568.4. Samples: 808242420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 03:30:26,994][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 03:30:30,612][12883] Updated weights for policy 0, policy_version 49331 (0.0035) [2024-06-18 03:30:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 808271872. Throughput: 0: 42296.6. Samples: 808369100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 03:30:31,994][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 03:30:34,253][12883] Updated weights for policy 0, policy_version 49341 (0.0025) [2024-06-18 03:30:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 808484864. Throughput: 0: 42421.9. Samples: 808621240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 03:30:36,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 03:30:37,173][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049347_808501248.pth... [2024-06-18 03:30:37,216][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000048728_798359552.pth [2024-06-18 03:30:38,137][12883] Updated weights for policy 0, policy_version 49351 (0.0033) [2024-06-18 03:30:41,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42050.7, 300 sec: 42153.8). Total num frames: 808714240. Throughput: 0: 42536.5. Samples: 808874280. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 26.0) [2024-06-18 03:30:41,997][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 03:30:42,203][12883] Updated weights for policy 0, policy_version 49361 (0.0033) [2024-06-18 03:30:45,684][12883] Updated weights for policy 0, policy_version 49371 (0.0035) [2024-06-18 03:30:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42329.8, 300 sec: 42265.2). Total num frames: 808910848. Throughput: 0: 42341.1. Samples: 809001600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:30:46,994][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 03:30:49,797][12883] Updated weights for policy 0, policy_version 49381 (0.0042) [2024-06-18 03:30:51,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42602.2, 300 sec: 42265.2). Total num frames: 809140224. Throughput: 0: 42178.7. Samples: 809253880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:30:51,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 03:30:53,376][12883] Updated weights for policy 0, policy_version 49391 (0.0039) [2024-06-18 03:30:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42154.4). Total num frames: 809336832. Throughput: 0: 42267.2. Samples: 809504940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:30:56,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 03:30:57,509][12883] Updated weights for policy 0, policy_version 49401 (0.0038) [2024-06-18 03:31:01,293][12883] Updated weights for policy 0, policy_version 49411 (0.0034) [2024-06-18 03:31:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 809549824. Throughput: 0: 42087.6. Samples: 809627580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:31:01,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 03:31:05,359][12883] Updated weights for policy 0, policy_version 49421 (0.0034) [2024-06-18 03:31:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42154.1). 
Total num frames: 809762816. Throughput: 0: 42110.2. Samples: 809878880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:31:06,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 03:31:09,038][12883] Updated weights for policy 0, policy_version 49431 (0.0039) [2024-06-18 03:31:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 42154.1). Total num frames: 809959424. Throughput: 0: 41957.5. Samples: 810130500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:31:11,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 03:31:13,477][12883] Updated weights for policy 0, policy_version 49441 (0.0040) [2024-06-18 03:31:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 810205184. Throughput: 0: 41938.6. Samples: 810256340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-18 03:31:16,994][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 03:31:16,997][12883] Updated weights for policy 0, policy_version 49451 (0.0033) [2024-06-18 03:31:21,148][12883] Updated weights for policy 0, policy_version 49461 (0.0025) [2024-06-18 03:31:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 810385408. Throughput: 0: 41991.2. Samples: 810510840. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-18 03:31:21,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 03:31:25,055][12883] Updated weights for policy 0, policy_version 49471 (0.0036) [2024-06-18 03:31:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 810598400. Throughput: 0: 41937.1. Samples: 810761360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-18 03:31:26,994][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 03:31:28,828][12883] Updated weights for policy 0, policy_version 49481 (0.0050) [2024-06-18 03:31:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.7). 
Total num frames: 810811392. Throughput: 0: 41967.4. Samples: 810890140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-18 03:31:31,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 03:31:32,909][12883] Updated weights for policy 0, policy_version 49491 (0.0039) [2024-06-18 03:31:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 811008000. Throughput: 0: 41956.0. Samples: 811141900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-18 03:31:36,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 03:31:37,035][12883] Updated weights for policy 0, policy_version 49501 (0.0028) [2024-06-18 03:31:40,645][12883] Updated weights for policy 0, policy_version 49511 (0.0028) [2024-06-18 03:31:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41780.8, 300 sec: 42154.1). Total num frames: 811220992. Throughput: 0: 41815.1. Samples: 811386620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 24.0) [2024-06-18 03:31:41,994][12645] Avg episode reward: [(0, '0.066')] [2024-06-18 03:31:45,050][12883] Updated weights for policy 0, policy_version 49521 (0.0035) [2024-06-18 03:31:45,512][12862] Signal inference workers to stop experience collection... (11650 times) [2024-06-18 03:31:45,562][12862] Signal inference workers to resume experience collection... (11650 times) [2024-06-18 03:31:45,563][12883] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-06-18 03:31:45,591][12883] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-06-18 03:31:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 811433984. Throughput: 0: 42019.2. Samples: 811518440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 03:31:46,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 03:31:48,582][12883] Updated weights for policy 0, policy_version 49531 (0.0034) [2024-06-18 03:31:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 811630592. Throughput: 0: 41959.1. Samples: 811767040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 03:31:51,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 03:31:52,769][12883] Updated weights for policy 0, policy_version 49541 (0.0037) [2024-06-18 03:31:56,002][12883] Updated weights for policy 0, policy_version 49551 (0.0043) [2024-06-18 03:31:56,996][12645] Fps is (10 sec: 44226.2, 60 sec: 42323.7, 300 sec: 42264.8). Total num frames: 811876352. Throughput: 0: 41803.5. Samples: 812011760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 03:31:56,996][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 03:32:00,658][12883] Updated weights for policy 0, policy_version 49561 (0.0039) [2024-06-18 03:32:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 812056576. Throughput: 0: 42020.9. Samples: 812147280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 03:32:01,994][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 03:32:03,895][12883] Updated weights for policy 0, policy_version 49571 (0.0029) [2024-06-18 03:32:06,994][12645] Fps is (10 sec: 39331.1, 60 sec: 41779.3, 300 sec: 42154.4). Total num frames: 812269568. Throughput: 0: 41937.8. Samples: 812398040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 03:32:06,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 03:32:08,393][12883] Updated weights for policy 0, policy_version 49581 (0.0036) [2024-06-18 03:32:11,615][12883] Updated weights for policy 0, policy_version 49591 (0.0046) [2024-06-18 03:32:11,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42320.7). 
Total num frames: 812531712. Throughput: 0: 41884.5. Samples: 812646160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 03:32:11,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 03:32:16,062][12883] Updated weights for policy 0, policy_version 49601 (0.0046) [2024-06-18 03:32:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 812695552. Throughput: 0: 41933.7. Samples: 812777160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 03:32:16,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 03:32:19,258][12883] Updated weights for policy 0, policy_version 49611 (0.0047) [2024-06-18 03:32:21,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 812908544. Throughput: 0: 41808.8. Samples: 813023300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 03:32:21,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 03:32:23,690][12883] Updated weights for policy 0, policy_version 49621 (0.0038) [2024-06-18 03:32:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 813137920. Throughput: 0: 41978.1. Samples: 813275640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 03:32:26,994][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 03:32:27,407][12883] Updated weights for policy 0, policy_version 49631 (0.0024) [2024-06-18 03:32:31,685][12883] Updated weights for policy 0, policy_version 49641 (0.0042) [2024-06-18 03:32:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 813318144. Throughput: 0: 41910.6. Samples: 813404420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 03:32:31,994][12645] Avg episode reward: [(0, '0.031')] [2024-06-18 03:32:35,099][12883] Updated weights for policy 0, policy_version 49651 (0.0050) [2024-06-18 03:32:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42155.0). Total num frames: 813547520. 
Throughput: 0: 41842.1. Samples: 813649940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 03:32:36,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 03:32:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049655_813547520.pth... [2024-06-18 03:32:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049037_803422208.pth [2024-06-18 03:32:39,440][12883] Updated weights for policy 0, policy_version 49661 (0.0029) [2024-06-18 03:32:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 813744128. Throughput: 0: 42099.9. Samples: 813906160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 03:32:41,995][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 03:32:42,967][12883] Updated weights for policy 0, policy_version 49671 (0.0029) [2024-06-18 03:32:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.0, 300 sec: 42154.0). Total num frames: 813940736. Throughput: 0: 41918.5. Samples: 814033620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 03:32:46,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 03:32:47,394][12883] Updated weights for policy 0, policy_version 49681 (0.0037) [2024-06-18 03:32:50,742][12883] Updated weights for policy 0, policy_version 49691 (0.0030) [2024-06-18 03:32:51,998][12645] Fps is (10 sec: 42578.6, 60 sec: 42322.0, 300 sec: 42209.0). Total num frames: 814170112. Throughput: 0: 41922.7. Samples: 814284760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 03:32:51,999][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 03:32:55,139][12883] Updated weights for policy 0, policy_version 49701 (0.0050) [2024-06-18 03:32:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41507.7, 300 sec: 42154.1). Total num frames: 814366720. Throughput: 0: 42069.2. Samples: 814539280. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 03:32:56,994][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 03:32:58,521][12883] Updated weights for policy 0, policy_version 49711 (0.0047) [2024-06-18 03:33:01,994][12645] Fps is (10 sec: 39340.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 814563328. Throughput: 0: 41809.4. Samples: 814658580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 03:33:01,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 03:33:03,178][12883] Updated weights for policy 0, policy_version 49721 (0.0030) [2024-06-18 03:33:06,431][12883] Updated weights for policy 0, policy_version 49731 (0.0037) [2024-06-18 03:33:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 814825472. Throughput: 0: 41986.8. Samples: 814912700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 03:33:06,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 03:33:10,925][12883] Updated weights for policy 0, policy_version 49741 (0.0044) [2024-06-18 03:33:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 42098.6). Total num frames: 814989312. Throughput: 0: 41823.3. Samples: 815157680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 03:33:11,994][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 03:33:14,448][12883] Updated weights for policy 0, policy_version 49751 (0.0053) [2024-06-18 03:33:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 815202304. Throughput: 0: 41689.4. Samples: 815280440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 03:33:16,994][12645] Avg episode reward: [(0, '0.086')] [2024-06-18 03:33:18,533][12883] Updated weights for policy 0, policy_version 49761 (0.0038) [2024-06-18 03:33:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 815415296. Throughput: 0: 41949.6. Samples: 815537660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 03:33:21,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 03:33:22,328][12883] Updated weights for policy 0, policy_version 49771 (0.0034) [2024-06-18 03:33:22,519][12862] Signal inference workers to stop experience collection... (11700 times) [2024-06-18 03:33:22,519][12862] Signal inference workers to resume experience collection... (11700 times) [2024-06-18 03:33:22,557][12883] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-06-18 03:33:22,557][12883] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-06-18 03:33:26,197][12883] Updated weights for policy 0, policy_version 49781 (0.0039) [2024-06-18 03:33:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 815628288. Throughput: 0: 41800.9. Samples: 815787200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 03:33:26,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 03:33:30,147][12883] Updated weights for policy 0, policy_version 49791 (0.0045) [2024-06-18 03:33:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 815841280. Throughput: 0: 41719.8. Samples: 815911000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 03:33:31,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 03:33:34,441][12883] Updated weights for policy 0, policy_version 49801 (0.0043) [2024-06-18 03:33:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 42043.3). Total num frames: 816037888. Throughput: 0: 41669.7. Samples: 816159700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 03:33:36,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 03:33:37,919][12883] Updated weights for policy 0, policy_version 49811 (0.0026) [2024-06-18 03:33:41,995][12645] Fps is (10 sec: 40955.6, 60 sec: 41778.5, 300 sec: 42098.4). Total num frames: 816250880. Throughput: 0: 41663.1. 
Samples: 816414160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 03:33:41,995][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 03:33:42,151][12883] Updated weights for policy 0, policy_version 49821 (0.0032) [2024-06-18 03:33:45,689][12883] Updated weights for policy 0, policy_version 49831 (0.0030) [2024-06-18 03:33:46,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42050.9, 300 sec: 42098.2). Total num frames: 816463872. Throughput: 0: 41813.5. Samples: 816540280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 03:33:46,996][12645] Avg episode reward: [(0, '0.027')] [2024-06-18 03:33:50,055][12883] Updated weights for policy 0, policy_version 49841 (0.0045) [2024-06-18 03:33:51,996][12645] Fps is (10 sec: 42594.7, 60 sec: 41781.1, 300 sec: 41987.5). Total num frames: 816676864. Throughput: 0: 41750.7. Samples: 816791560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 03:33:51,996][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 03:33:53,743][12883] Updated weights for policy 0, policy_version 49851 (0.0035) [2024-06-18 03:33:56,996][12645] Fps is (10 sec: 42598.4, 60 sec: 42050.8, 300 sec: 42098.2). Total num frames: 816889856. Throughput: 0: 41844.6. Samples: 817040780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 03:33:56,996][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 03:33:57,769][12883] Updated weights for policy 0, policy_version 49861 (0.0031) [2024-06-18 03:34:01,516][12883] Updated weights for policy 0, policy_version 49871 (0.0037) [2024-06-18 03:34:01,994][12645] Fps is (10 sec: 40968.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 817086464. Throughput: 0: 41972.4. Samples: 817169200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 03:34:01,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 03:34:05,470][12883] Updated weights for policy 0, policy_version 49881 (0.0029) [2024-06-18 03:34:06,994][12645] Fps is (10 sec: 42607.6, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 817315840. Throughput: 0: 41827.8. Samples: 817419920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 03:34:06,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 03:34:09,787][12883] Updated weights for policy 0, policy_version 49891 (0.0039) [2024-06-18 03:34:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 817512448. Throughput: 0: 41825.4. Samples: 817669340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 03:34:11,994][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 03:34:13,192][12883] Updated weights for policy 0, policy_version 49901 (0.0035) [2024-06-18 03:34:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42099.5). Total num frames: 817725440. Throughput: 0: 41827.6. Samples: 817793240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 03:34:16,994][12645] Avg episode reward: [(0, '0.075')] [2024-06-18 03:34:17,544][12883] Updated weights for policy 0, policy_version 49911 (0.0047) [2024-06-18 03:34:20,870][12883] Updated weights for policy 0, policy_version 49921 (0.0045) [2024-06-18 03:34:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 817938432. Throughput: 0: 42018.6. Samples: 818050540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 03:34:21,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 03:34:25,179][12883] Updated weights for policy 0, policy_version 49931 (0.0030) [2024-06-18 03:34:26,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 818151424. Throughput: 0: 42022.6. Samples: 818305140. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 03:34:26,994][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 03:34:28,626][12883] Updated weights for policy 0, policy_version 49941 (0.0039) [2024-06-18 03:34:32,000][12645] Fps is (10 sec: 42572.1, 60 sec: 42047.9, 300 sec: 42097.7). Total num frames: 818364416. Throughput: 0: 41982.5. Samples: 818429660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 03:34:32,000][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 03:34:32,727][12883] Updated weights for policy 0, policy_version 49951 (0.0038) [2024-06-18 03:34:36,713][12883] Updated weights for policy 0, policy_version 49961 (0.0029) [2024-06-18 03:34:36,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 818561024. Throughput: 0: 42173.4. Samples: 818689280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 03:34:36,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 03:34:37,078][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049962_818577408.pth... [2024-06-18 03:34:37,142][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049347_808501248.pth [2024-06-18 03:34:40,410][12883] Updated weights for policy 0, policy_version 49971 (0.0030) [2024-06-18 03:34:41,994][12645] Fps is (10 sec: 37707.0, 60 sec: 41506.9, 300 sec: 41932.8). Total num frames: 818741248. Throughput: 0: 42147.0. Samples: 818937300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 03:34:41,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 03:34:44,539][12883] Updated weights for policy 0, policy_version 49981 (0.0032) [2024-06-18 03:34:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42326.9, 300 sec: 42099.3). Total num frames: 819003392. Throughput: 0: 41882.2. Samples: 819053900. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 03:34:46,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 03:34:48,355][12883] Updated weights for policy 0, policy_version 49991 (0.0034) [2024-06-18 03:34:51,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42053.5, 300 sec: 41931.9). Total num frames: 819200000. Throughput: 0: 42144.4. Samples: 819316420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 03:34:51,994][12645] Avg episode reward: [(0, '0.179')] [2024-06-18 03:34:52,234][12883] Updated weights for policy 0, policy_version 50001 (0.0034) [2024-06-18 03:34:52,472][12862] Signal inference workers to stop experience collection... (11750 times) [2024-06-18 03:34:52,472][12862] Signal inference workers to resume experience collection... (11750 times) [2024-06-18 03:34:52,495][12883] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-06-18 03:34:52,496][12883] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-06-18 03:34:55,898][12883] Updated weights for policy 0, policy_version 50011 (0.0032) [2024-06-18 03:34:56,996][12645] Fps is (10 sec: 37674.7, 60 sec: 41506.1, 300 sec: 41987.2). Total num frames: 819380224. Throughput: 0: 42196.9. Samples: 819568300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 03:34:56,996][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 03:34:59,972][12883] Updated weights for policy 0, policy_version 50021 (0.0025) [2024-06-18 03:35:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 819625984. Throughput: 0: 42261.2. Samples: 819695000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 03:35:01,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 03:35:03,662][12883] Updated weights for policy 0, policy_version 50031 (0.0032) [2024-06-18 03:35:06,996][12645] Fps is (10 sec: 44236.9, 60 sec: 41777.7, 300 sec: 41820.5). Total num frames: 819822592. Throughput: 0: 42237.9. 
Samples: 819951340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 03:35:06,997][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 03:35:07,617][12883] Updated weights for policy 0, policy_version 50041 (0.0033) [2024-06-18 03:35:11,507][12883] Updated weights for policy 0, policy_version 50051 (0.0035) [2024-06-18 03:35:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 820035584. Throughput: 0: 42087.7. Samples: 820199080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 03:35:11,994][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 03:35:15,229][12883] Updated weights for policy 0, policy_version 50061 (0.0047) [2024-06-18 03:35:16,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42325.2, 300 sec: 41987.4). Total num frames: 820264960. Throughput: 0: 42220.8. Samples: 820329340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 03:35:16,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 03:35:19,590][12883] Updated weights for policy 0, policy_version 50071 (0.0033) [2024-06-18 03:35:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 820445184. Throughput: 0: 42111.1. Samples: 820584280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 03:35:21,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 03:35:22,975][12883] Updated weights for policy 0, policy_version 50081 (0.0021) [2024-06-18 03:35:26,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 820674560. Throughput: 0: 42128.0. Samples: 820833060. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 03:35:26,994][12645] Avg episode reward: [(0, '0.154')] [2024-06-18 03:35:27,355][12883] Updated weights for policy 0, policy_version 50091 (0.0038) [2024-06-18 03:35:30,632][12883] Updated weights for policy 0, policy_version 50101 (0.0037) [2024-06-18 03:35:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42056.6, 300 sec: 42043.0). Total num frames: 820887552. Throughput: 0: 42306.7. Samples: 820957700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 03:35:31,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 03:35:35,297][12883] Updated weights for policy 0, policy_version 50111 (0.0028) [2024-06-18 03:35:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 821100544. Throughput: 0: 42187.2. Samples: 821214840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 03:35:36,994][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 03:35:38,442][12883] Updated weights for policy 0, policy_version 50121 (0.0035) [2024-06-18 03:35:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 41987.4). Total num frames: 821297152. Throughput: 0: 42168.7. Samples: 821465800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 03:35:41,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 03:35:42,940][12883] Updated weights for policy 0, policy_version 50131 (0.0044) [2024-06-18 03:35:46,127][12883] Updated weights for policy 0, policy_version 50141 (0.0033) [2024-06-18 03:35:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 821526528. Throughput: 0: 42095.7. Samples: 821589300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 03:35:46,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 03:35:51,073][12883] Updated weights for policy 0, policy_version 50151 (0.0028) [2024-06-18 03:35:51,993][12645] Fps is (10 sec: 42599.5, 60 sec: 42052.4, 300 sec: 41987.5). 
Total num frames: 821723136. Throughput: 0: 42048.0. Samples: 821843400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 03:35:51,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 03:35:53,842][12883] Updated weights for policy 0, policy_version 50161 (0.0036) [2024-06-18 03:35:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42327.0, 300 sec: 41931.9). Total num frames: 821919744. Throughput: 0: 42230.7. Samples: 822099460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 03:35:56,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 03:35:58,867][12883] Updated weights for policy 0, policy_version 50171 (0.0029) [2024-06-18 03:36:01,985][12883] Updated weights for policy 0, policy_version 50181 (0.0033) [2024-06-18 03:36:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 822165504. Throughput: 0: 41989.5. Samples: 822218860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 03:36:01,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 03:36:06,579][12883] Updated weights for policy 0, policy_version 50191 (0.0027) [2024-06-18 03:36:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41780.9, 300 sec: 41931.9). Total num frames: 822329344. Throughput: 0: 41954.3. Samples: 822472220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 03:36:06,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 03:36:09,866][12883] Updated weights for policy 0, policy_version 50201 (0.0029) [2024-06-18 03:36:11,999][12645] Fps is (10 sec: 39298.7, 60 sec: 42048.2, 300 sec: 41875.6). Total num frames: 822558720. Throughput: 0: 41949.6. Samples: 822721040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 03:36:12,000][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 03:36:14,261][12883] Updated weights for policy 0, policy_version 50211 (0.0038) [2024-06-18 03:36:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.4, 300 sec: 42043.0). 
Total num frames: 822788096. Throughput: 0: 42039.2. Samples: 822849460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 03:36:16,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 03:36:17,629][12883] Updated weights for policy 0, policy_version 50221 (0.0034) [2024-06-18 03:36:21,994][12645] Fps is (10 sec: 40984.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 822968320. Throughput: 0: 41996.0. Samples: 823104660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 03:36:21,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 03:36:22,301][12883] Updated weights for policy 0, policy_version 50231 (0.0029) [2024-06-18 03:36:25,832][12883] Updated weights for policy 0, policy_version 50241 (0.0029) [2024-06-18 03:36:26,994][12645] Fps is (10 sec: 42596.6, 60 sec: 42325.1, 300 sec: 42043.0). Total num frames: 823214080. Throughput: 0: 41820.3. Samples: 823347720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:36:26,995][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 03:36:29,845][12883] Updated weights for policy 0, policy_version 50251 (0.0040) [2024-06-18 03:36:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 823394304. Throughput: 0: 42119.5. Samples: 823484680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:36:31,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 03:36:33,543][12883] Updated weights for policy 0, policy_version 50261 (0.0039) [2024-06-18 03:36:34,660][12862] Signal inference workers to stop experience collection... (11800 times) [2024-06-18 03:36:34,661][12862] Signal inference workers to resume experience collection... 
(11800 times) [2024-06-18 03:36:34,700][12883] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-06-18 03:36:34,700][12883] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-06-18 03:36:36,994][12645] Fps is (10 sec: 37684.5, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 823590912. Throughput: 0: 41962.1. Samples: 823731700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:36:36,994][12645] Avg episode reward: [(0, '0.057')] [2024-06-18 03:36:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050268_823590912.pth... [2024-06-18 03:36:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049655_813547520.pth [2024-06-18 03:36:37,680][12883] Updated weights for policy 0, policy_version 50271 (0.0032) [2024-06-18 03:36:41,029][12883] Updated weights for policy 0, policy_version 50281 (0.0037) [2024-06-18 03:36:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 823836672. Throughput: 0: 41740.3. Samples: 823977780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:36:41,996][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 03:36:45,386][12883] Updated weights for policy 0, policy_version 50291 (0.0028) [2024-06-18 03:36:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 824033280. Throughput: 0: 42083.6. Samples: 824112620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:36:46,994][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 03:36:48,803][12883] Updated weights for policy 0, policy_version 50301 (0.0039) [2024-06-18 03:36:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41932.3). Total num frames: 824246272. Throughput: 0: 41941.7. Samples: 824359600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:36:51,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 03:36:53,352][12883] Updated weights for policy 0, policy_version 50311 (0.0045) [2024-06-18 03:36:56,468][12883] Updated weights for policy 0, policy_version 50321 (0.0034) [2024-06-18 03:36:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 824475648. Throughput: 0: 42022.3. Samples: 824611800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:36:56,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 03:37:01,093][12883] Updated weights for policy 0, policy_version 50331 (0.0035) [2024-06-18 03:37:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 824655872. Throughput: 0: 41996.3. Samples: 824739300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:37:01,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 03:37:04,336][12883] Updated weights for policy 0, policy_version 50341 (0.0034) [2024-06-18 03:37:06,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42596.8, 300 sec: 41876.1). Total num frames: 824885248. Throughput: 0: 41893.9. Samples: 824989980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:37:06,996][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 03:37:08,816][12883] Updated weights for policy 0, policy_version 50351 (0.0030) [2024-06-18 03:37:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42056.3, 300 sec: 41987.5). Total num frames: 825081856. Throughput: 0: 42094.9. Samples: 825241980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:37:11,995][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 03:37:12,217][12883] Updated weights for policy 0, policy_version 50361 (0.0033) [2024-06-18 03:37:16,603][12883] Updated weights for policy 0, policy_version 50371 (0.0039) [2024-06-18 03:37:16,994][12645] Fps is (10 sec: 40968.8, 60 sec: 41779.1, 300 sec: 41987.5). 
Total num frames: 825294848. Throughput: 0: 41783.1. Samples: 825364920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:37:16,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 03:37:20,429][12883] Updated weights for policy 0, policy_version 50381 (0.0041) [2024-06-18 03:37:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 825524224. Throughput: 0: 41971.9. Samples: 825620440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:37:21,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 03:37:24,374][12883] Updated weights for policy 0, policy_version 50391 (0.0030) [2024-06-18 03:37:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.4, 300 sec: 41987.5). Total num frames: 825704448. Throughput: 0: 42217.9. Samples: 825877580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 03:37:26,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 03:37:28,097][12883] Updated weights for policy 0, policy_version 50401 (0.0033) [2024-06-18 03:37:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 825917440. Throughput: 0: 41820.3. Samples: 825994540. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-18 03:37:32,003][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 03:37:32,185][12883] Updated weights for policy 0, policy_version 50411 (0.0036) [2024-06-18 03:37:36,116][12883] Updated weights for policy 0, policy_version 50421 (0.0034) [2024-06-18 03:37:36,996][12645] Fps is (10 sec: 45864.5, 60 sec: 42869.8, 300 sec: 42098.2). Total num frames: 826163200. Throughput: 0: 42146.3. Samples: 826256280. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-18 03:37:36,997][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 03:37:39,899][12883] Updated weights for policy 0, policy_version 50431 (0.0034) [2024-06-18 03:37:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 41779.3, 300 sec: 42043.1). 
Total num frames: 826343424. Throughput: 0: 42181.0. Samples: 826509940. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-18 03:37:41,994][12645] Avg episode reward: [(0, '0.072')] [2024-06-18 03:37:43,751][12883] Updated weights for policy 0, policy_version 50441 (0.0033) [2024-06-18 03:37:46,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.2, 300 sec: 42043.7). Total num frames: 826572800. Throughput: 0: 42065.3. Samples: 826632240. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-18 03:37:46,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 03:37:47,491][12883] Updated weights for policy 0, policy_version 50451 (0.0031) [2024-06-18 03:37:51,495][12883] Updated weights for policy 0, policy_version 50461 (0.0039) [2024-06-18 03:37:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 826769408. Throughput: 0: 42195.8. Samples: 826888700. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-18 03:37:51,994][12645] Avg episode reward: [(0, '0.037')] [2024-06-18 03:37:55,723][12883] Updated weights for policy 0, policy_version 50471 (0.0044) [2024-06-18 03:37:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 826949632. Throughput: 0: 42082.7. Samples: 827135700. Policy #0 lag: (min: 1.0, avg: 11.9, max: 21.0) [2024-06-18 03:37:56,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 03:37:59,497][12883] Updated weights for policy 0, policy_version 50481 (0.0031) [2024-06-18 03:38:01,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 827211776. Throughput: 0: 42193.3. Samples: 827263620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-18 03:38:01,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 03:38:03,308][12883] Updated weights for policy 0, policy_version 50491 (0.0030) [2024-06-18 03:38:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41507.7, 300 sec: 41987.5). 
Total num frames: 827375616. Throughput: 0: 42295.6. Samples: 827523740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-18 03:38:06,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 03:38:07,097][12862] Signal inference workers to stop experience collection... (11850 times) [2024-06-18 03:38:07,097][12862] Signal inference workers to resume experience collection... (11850 times) [2024-06-18 03:38:07,119][12883] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-06-18 03:38:07,119][12883] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-06-18 03:38:07,278][12883] Updated weights for policy 0, policy_version 50501 (0.0035) [2024-06-18 03:38:11,176][12883] Updated weights for policy 0, policy_version 50511 (0.0046) [2024-06-18 03:38:11,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 827604992. Throughput: 0: 42076.9. Samples: 827771040. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-18 03:38:11,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 03:38:14,953][12883] Updated weights for policy 0, policy_version 50521 (0.0041) [2024-06-18 03:38:16,994][12645] Fps is (10 sec: 49152.1, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 827867136. Throughput: 0: 42437.4. Samples: 827904220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-18 03:38:16,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 03:38:18,608][12883] Updated weights for policy 0, policy_version 50531 (0.0039) [2024-06-18 03:38:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 828014592. Throughput: 0: 42319.0. Samples: 828160540. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-18 03:38:21,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 03:38:22,797][12883] Updated weights for policy 0, policy_version 50541 (0.0038) [2024-06-18 03:38:26,080][12883] Updated weights for policy 0, policy_version 50551 (0.0031) [2024-06-18 03:38:26,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 828243968. Throughput: 0: 42137.1. Samples: 828406120. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-18 03:38:26,994][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 03:38:30,546][12883] Updated weights for policy 0, policy_version 50561 (0.0026) [2024-06-18 03:38:31,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 828489728. Throughput: 0: 42449.9. Samples: 828542480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 23.0) [2024-06-18 03:38:31,994][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 03:38:33,720][12883] Updated weights for policy 0, policy_version 50571 (0.0037) [2024-06-18 03:38:36,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40961.6, 300 sec: 41932.1). Total num frames: 828620800. Throughput: 0: 42329.9. Samples: 828793540. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-18 03:38:36,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 03:38:37,046][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050576_828637184.pth... [2024-06-18 03:38:37,129][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000049962_818577408.pth [2024-06-18 03:38:38,282][12883] Updated weights for policy 0, policy_version 50581 (0.0025) [2024-06-18 03:38:41,758][12883] Updated weights for policy 0, policy_version 50591 (0.0034) [2024-06-18 03:38:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42098.9). Total num frames: 828882944. Throughput: 0: 42297.3. Samples: 829039080. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-18 03:38:41,994][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 03:38:45,997][12883] Updated weights for policy 0, policy_version 50601 (0.0030) [2024-06-18 03:38:46,994][12645] Fps is (10 sec: 49151.6, 60 sec: 42325.4, 300 sec: 42154.4). Total num frames: 829112320. Throughput: 0: 42571.2. Samples: 829179320. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-18 03:38:46,994][12645] Avg episode reward: [(0, '0.091')] [2024-06-18 03:38:49,092][12883] Updated weights for policy 0, policy_version 50611 (0.0036) [2024-06-18 03:38:51,996][12645] Fps is (10 sec: 37675.2, 60 sec: 41504.7, 300 sec: 41931.9). Total num frames: 829259776. Throughput: 0: 42269.6. Samples: 829425960. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-18 03:38:51,996][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 03:38:53,614][12883] Updated weights for policy 0, policy_version 50621 (0.0047) [2024-06-18 03:38:56,794][12883] Updated weights for policy 0, policy_version 50631 (0.0035) [2024-06-18 03:38:56,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42209.6). Total num frames: 829538304. Throughput: 0: 42277.6. Samples: 829673540. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-18 03:38:57,000][12645] Avg episode reward: [(0, '0.057')] [2024-06-18 03:39:01,377][12883] Updated weights for policy 0, policy_version 50641 (0.0033) [2024-06-18 03:39:01,994][12645] Fps is (10 sec: 47523.3, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 829734912. Throughput: 0: 42443.9. Samples: 829814200. Policy #0 lag: (min: 0.0, avg: 13.0, max: 21.0) [2024-06-18 03:39:01,994][12645] Avg episode reward: [(0, '0.063')] [2024-06-18 03:39:02,304][12862] Signal inference workers to stop experience collection... (11900 times) [2024-06-18 03:39:02,304][12862] Signal inference workers to resume experience collection... 
(11900 times) [2024-06-18 03:39:02,340][12883] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-06-18 03:39:02,341][12883] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-06-18 03:39:04,311][12883] Updated weights for policy 0, policy_version 50651 (0.0031) [2024-06-18 03:39:06,994][12645] Fps is (10 sec: 37683.9, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 829915136. Throughput: 0: 42226.2. Samples: 830060720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 03:39:06,994][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 03:39:09,528][12883] Updated weights for policy 0, policy_version 50661 (0.0029) [2024-06-18 03:39:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 830177280. Throughput: 0: 42137.4. Samples: 830302300. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 03:39:11,994][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 03:39:12,223][12883] Updated weights for policy 0, policy_version 50671 (0.0032) [2024-06-18 03:39:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 830341120. Throughput: 0: 42082.2. Samples: 830436180. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 03:39:16,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 03:39:17,256][12883] Updated weights for policy 0, policy_version 50681 (0.0033) [2024-06-18 03:39:20,048][12883] Updated weights for policy 0, policy_version 50691 (0.0041) [2024-06-18 03:39:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 830570496. Throughput: 0: 42066.1. Samples: 830686520. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 03:39:21,994][12645] Avg episode reward: [(0, '0.165')] [2024-06-18 03:39:25,053][12883] Updated weights for policy 0, policy_version 50701 (0.0037) [2024-06-18 03:39:26,994][12645] Fps is (10 sec: 49151.5, 60 sec: 43144.5, 300 sec: 42266.0). 
Total num frames: 830832640. Throughput: 0: 42137.7. Samples: 830935280. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 03:39:26,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 03:39:27,966][12883] Updated weights for policy 0, policy_version 50711 (0.0035) [2024-06-18 03:39:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41232.9, 300 sec: 42043.0). Total num frames: 830963712. Throughput: 0: 41999.4. Samples: 831069300. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 03:39:31,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 03:39:32,671][12883] Updated weights for policy 0, policy_version 50721 (0.0041) [2024-06-18 03:39:35,622][12883] Updated weights for policy 0, policy_version 50731 (0.0035) [2024-06-18 03:39:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 43144.4, 300 sec: 42265.1). Total num frames: 831209472. Throughput: 0: 42142.8. Samples: 831322300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 03:39:36,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 03:39:40,385][12883] Updated weights for policy 0, policy_version 50741 (0.0047) [2024-06-18 03:39:41,994][12645] Fps is (10 sec: 47514.8, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 831438848. Throughput: 0: 42441.1. Samples: 831583380. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 03:39:41,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 03:39:43,422][12883] Updated weights for policy 0, policy_version 50751 (0.0028) [2024-06-18 03:39:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 831602688. Throughput: 0: 41983.1. Samples: 831703440. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 03:39:46,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 03:39:48,271][12883] Updated weights for policy 0, policy_version 50761 (0.0038) [2024-06-18 03:39:51,355][12883] Updated weights for policy 0, policy_version 50771 (0.0033) [2024-06-18 03:39:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43146.1, 300 sec: 42265.5). Total num frames: 831848448. Throughput: 0: 41924.1. Samples: 831947300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 03:39:51,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 03:39:56,263][12883] Updated weights for policy 0, policy_version 50781 (0.0038) [2024-06-18 03:39:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 832028672. Throughput: 0: 42451.9. Samples: 832212640. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 03:39:56,994][12645] Avg episode reward: [(0, '0.044')] [2024-06-18 03:39:59,115][12883] Updated weights for policy 0, policy_version 50791 (0.0031) [2024-06-18 03:40:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42154.4). Total num frames: 832258048. Throughput: 0: 42046.7. Samples: 832328280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 03:40:01,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 03:40:04,091][12883] Updated weights for policy 0, policy_version 50801 (0.0030) [2024-06-18 03:40:06,199][12862] Signal inference workers to stop experience collection... (11950 times) [2024-06-18 03:40:06,199][12862] Signal inference workers to resume experience collection... 
(11950 times) [2024-06-18 03:40:06,253][12883] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-06-18 03:40:06,253][12883] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-06-18 03:40:06,947][12883] Updated weights for policy 0, policy_version 50811 (0.0030) [2024-06-18 03:40:06,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 832487424. Throughput: 0: 42211.7. Samples: 832586040. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 03:40:06,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 03:40:11,994][12645] Fps is (10 sec: 37683.2, 60 sec: 40960.1, 300 sec: 41932.0). Total num frames: 832634880. Throughput: 0: 42295.3. Samples: 832838560. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) [2024-06-18 03:40:11,994][12645] Avg episode reward: [(0, '0.053')] [2024-06-18 03:40:12,009][12883] Updated weights for policy 0, policy_version 50821 (0.0029) [2024-06-18 03:40:14,757][12883] Updated weights for policy 0, policy_version 50831 (0.0031) [2024-06-18 03:40:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 832897024. Throughput: 0: 41880.1. Samples: 832953900. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) [2024-06-18 03:40:16,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 03:40:19,672][12883] Updated weights for policy 0, policy_version 50841 (0.0032) [2024-06-18 03:40:21,993][12645] Fps is (10 sec: 47513.9, 60 sec: 42325.5, 300 sec: 42154.1). Total num frames: 833110016. Throughput: 0: 42184.7. Samples: 833220600. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) [2024-06-18 03:40:21,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 03:40:22,547][12883] Updated weights for policy 0, policy_version 50851 (0.0032) [2024-06-18 03:40:26,994][12645] Fps is (10 sec: 36044.7, 60 sec: 40413.9, 300 sec: 41931.9). Total num frames: 833257472. Throughput: 0: 41826.9. Samples: 833465600. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) [2024-06-18 03:40:26,994][12645] Avg episode reward: [(0, '0.165')] [2024-06-18 03:40:27,536][12883] Updated weights for policy 0, policy_version 50861 (0.0035) [2024-06-18 03:40:30,516][12883] Updated weights for policy 0, policy_version 50871 (0.0053) [2024-06-18 03:40:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.6, 300 sec: 42154.1). Total num frames: 833536000. Throughput: 0: 41814.3. Samples: 833585080. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) [2024-06-18 03:40:31,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 03:40:35,140][12883] Updated weights for policy 0, policy_version 50881 (0.0032) [2024-06-18 03:40:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 833699840. Throughput: 0: 42019.1. Samples: 833838160. Policy #0 lag: (min: 1.0, avg: 12.7, max: 21.0) [2024-06-18 03:40:36,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 03:40:37,091][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050886_833716224.pth... [2024-06-18 03:40:37,157][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050268_823590912.pth [2024-06-18 03:40:38,387][12883] Updated weights for policy 0, policy_version 50891 (0.0037) [2024-06-18 03:40:41,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 833912832. Throughput: 0: 41638.3. Samples: 834086360. Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) [2024-06-18 03:40:41,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 03:40:43,221][12883] Updated weights for policy 0, policy_version 50901 (0.0032) [2024-06-18 03:40:46,067][12883] Updated weights for policy 0, policy_version 50911 (0.0034) [2024-06-18 03:40:46,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 834158592. Throughput: 0: 41925.2. Samples: 834214920. 
Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) [2024-06-18 03:40:46,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 03:40:50,977][12883] Updated weights for policy 0, policy_version 50921 (0.0038) [2024-06-18 03:40:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 834322432. Throughput: 0: 41855.9. Samples: 834469560. Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) [2024-06-18 03:40:51,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 03:40:53,908][12883] Updated weights for policy 0, policy_version 50931 (0.0025) [2024-06-18 03:40:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 834551808. Throughput: 0: 41658.6. Samples: 834713200. Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) [2024-06-18 03:40:56,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 03:40:58,613][12883] Updated weights for policy 0, policy_version 50941 (0.0042) [2024-06-18 03:41:01,910][12883] Updated weights for policy 0, policy_version 50951 (0.0035) [2024-06-18 03:41:01,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 834781184. Throughput: 0: 42064.6. Samples: 834846800. Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) [2024-06-18 03:41:01,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 03:41:06,373][12883] Updated weights for policy 0, policy_version 50961 (0.0031) [2024-06-18 03:41:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41988.3). Total num frames: 834945024. Throughput: 0: 41775.9. Samples: 835100520. Policy #0 lag: (min: 1.0, avg: 13.0, max: 28.0) [2024-06-18 03:41:06,994][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 03:41:07,847][12862] Signal inference workers to stop experience collection... 
(12000 times) [2024-06-18 03:41:07,900][12883] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-06-18 03:41:07,966][12862] Signal inference workers to resume experience collection... (12000 times) [2024-06-18 03:41:07,966][12883] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-06-18 03:41:09,641][12883] Updated weights for policy 0, policy_version 50971 (0.0036) [2024-06-18 03:41:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 835190784. Throughput: 0: 41670.8. Samples: 835340780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:41:11,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 03:41:14,122][12883] Updated weights for policy 0, policy_version 50981 (0.0037) [2024-06-18 03:41:16,994][12645] Fps is (10 sec: 45874.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 835403776. Throughput: 0: 42041.3. Samples: 835476940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:41:16,994][12645] Avg episode reward: [(0, '0.161')] [2024-06-18 03:41:17,624][12883] Updated weights for policy 0, policy_version 50991 (0.0030) [2024-06-18 03:41:21,949][12883] Updated weights for policy 0, policy_version 51001 (0.0034) [2024-06-18 03:41:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 835600384. Throughput: 0: 41948.0. Samples: 835725820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:41:21,994][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 03:41:25,459][12883] Updated weights for policy 0, policy_version 51011 (0.0024) [2024-06-18 03:41:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 835829760. Throughput: 0: 41918.2. Samples: 835972680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:41:26,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 03:41:29,852][12883] Updated weights for policy 0, policy_version 51021 (0.0037) [2024-06-18 03:41:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 836042752. Throughput: 0: 42020.0. Samples: 836105820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:41:31,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 03:41:33,239][12883] Updated weights for policy 0, policy_version 51031 (0.0036) [2024-06-18 03:41:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 836206592. Throughput: 0: 41899.1. Samples: 836355020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:41:36,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 03:41:37,848][12883] Updated weights for policy 0, policy_version 51041 (0.0034) [2024-06-18 03:41:41,094][12883] Updated weights for policy 0, policy_version 51051 (0.0038) [2024-06-18 03:41:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 836452352. Throughput: 0: 42046.8. Samples: 836605300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 03:41:41,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 03:41:45,528][12883] Updated weights for policy 0, policy_version 51061 (0.0024) [2024-06-18 03:41:46,994][12645] Fps is (10 sec: 45875.4, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 836665344. Throughput: 0: 41978.6. Samples: 836735840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 03:41:46,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 03:41:49,030][12883] Updated weights for policy 0, policy_version 51071 (0.0040) [2024-06-18 03:41:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 836845568. Throughput: 0: 41760.4. Samples: 836979740. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 03:41:51,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 03:41:53,351][12883] Updated weights for policy 0, policy_version 51081 (0.0027) [2024-06-18 03:41:56,627][12883] Updated weights for policy 0, policy_version 51091 (0.0032) [2024-06-18 03:41:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 837091328. Throughput: 0: 42059.9. Samples: 837233480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 03:41:56,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 03:42:01,388][12883] Updated weights for policy 0, policy_version 51101 (0.0027) [2024-06-18 03:42:01,996][12645] Fps is (10 sec: 44226.9, 60 sec: 41777.6, 300 sec: 42043.0). Total num frames: 837287936. Throughput: 0: 41930.9. Samples: 837363920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 03:42:01,996][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 03:42:04,562][12883] Updated weights for policy 0, policy_version 51111 (0.0027) [2024-06-18 03:42:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 837484544. Throughput: 0: 41970.7. Samples: 837614500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 03:42:06,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 03:42:08,922][12883] Updated weights for policy 0, policy_version 51121 (0.0038) [2024-06-18 03:42:11,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 837697536. Throughput: 0: 42152.9. Samples: 837869560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-18 03:42:11,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 03:42:12,434][12883] Updated weights for policy 0, policy_version 51131 (0.0032) [2024-06-18 03:42:14,823][12862] Signal inference workers to stop experience collection... 
(12050 times) [2024-06-18 03:42:14,824][12862] Signal inference workers to resume experience collection... (12050 times) [2024-06-18 03:42:14,852][12883] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-06-18 03:42:14,852][12883] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-06-18 03:42:16,665][12883] Updated weights for policy 0, policy_version 51141 (0.0030) [2024-06-18 03:42:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 837910528. Throughput: 0: 41988.5. Samples: 837995300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:42:16,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 03:42:19,943][12883] Updated weights for policy 0, policy_version 51151 (0.0028) [2024-06-18 03:42:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 838123520. Throughput: 0: 42087.2. Samples: 838248940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:42:21,994][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 03:42:24,176][12883] Updated weights for policy 0, policy_version 51161 (0.0027) [2024-06-18 03:42:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 838336512. Throughput: 0: 42167.1. Samples: 838502820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:42:26,994][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 03:42:27,711][12883] Updated weights for policy 0, policy_version 51171 (0.0028) [2024-06-18 03:42:31,835][12883] Updated weights for policy 0, policy_version 51181 (0.0039) [2024-06-18 03:42:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41987.8). Total num frames: 838549504. Throughput: 0: 42040.4. Samples: 838627660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:42:31,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 03:42:35,308][12883] Updated weights for policy 0, policy_version 51191 (0.0039) [2024-06-18 03:42:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 838762496. Throughput: 0: 42189.8. Samples: 838878280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:42:36,994][12645] Avg episode reward: [(0, '0.161')] [2024-06-18 03:42:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051194_838762496.pth... [2024-06-18 03:42:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050576_828637184.pth [2024-06-18 03:42:39,338][12883] Updated weights for policy 0, policy_version 51201 (0.0034) [2024-06-18 03:42:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 838975488. Throughput: 0: 42161.4. Samples: 839130740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:42:41,996][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 03:42:43,134][12883] Updated weights for policy 0, policy_version 51211 (0.0033) [2024-06-18 03:42:46,847][12883] Updated weights for policy 0, policy_version 51221 (0.0043) [2024-06-18 03:42:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 839204864. Throughput: 0: 42183.4. Samples: 839262080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 03:42:46,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 03:42:50,832][12883] Updated weights for policy 0, policy_version 51231 (0.0028) [2024-06-18 03:42:52,000][12645] Fps is (10 sec: 40934.6, 60 sec: 42320.9, 300 sec: 42153.2). Total num frames: 839385088. Throughput: 0: 42213.7. Samples: 839514380. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 03:42:52,000][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 03:42:54,454][12883] Updated weights for policy 0, policy_version 51241 (0.0044) [2024-06-18 03:42:56,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42050.7, 300 sec: 42042.7). Total num frames: 839614464. Throughput: 0: 42112.1. Samples: 839764700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 03:42:56,997][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 03:42:59,013][12883] Updated weights for policy 0, policy_version 51251 (0.0034) [2024-06-18 03:43:01,994][12645] Fps is (10 sec: 44264.6, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 839827456. Throughput: 0: 42248.5. Samples: 839896480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 03:43:01,994][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 03:43:02,287][12883] Updated weights for policy 0, policy_version 51261 (0.0032) [2024-06-18 03:43:06,938][12883] Updated weights for policy 0, policy_version 51271 (0.0040) [2024-06-18 03:43:06,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 840024064. Throughput: 0: 42092.0. Samples: 840143080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 03:43:06,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 03:43:10,603][12883] Updated weights for policy 0, policy_version 51281 (0.0029) [2024-06-18 03:43:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 840237056. Throughput: 0: 41950.5. Samples: 840390600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 03:43:11,994][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 03:43:14,763][12883] Updated weights for policy 0, policy_version 51291 (0.0020) [2024-06-18 03:43:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 840433664. Throughput: 0: 41989.0. Samples: 840517160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 03:43:16,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 03:43:18,229][12883] Updated weights for policy 0, policy_version 51301 (0.0031) [2024-06-18 03:43:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 840663040. Throughput: 0: 42103.5. Samples: 840772940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:43:21,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 03:43:22,523][12883] Updated weights for policy 0, policy_version 51311 (0.0033) [2024-06-18 03:43:26,418][12883] Updated weights for policy 0, policy_version 51321 (0.0031) [2024-06-18 03:43:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 840859648. Throughput: 0: 42025.8. Samples: 841021900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:43:26,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 03:43:30,059][12883] Updated weights for policy 0, policy_version 51331 (0.0034) [2024-06-18 03:43:32,000][12645] Fps is (10 sec: 40934.7, 60 sec: 42047.9, 300 sec: 42208.7). Total num frames: 841072640. Throughput: 0: 41815.6. Samples: 841144040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:43:32,000][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 03:43:34,034][12883] Updated weights for policy 0, policy_version 51341 (0.0045) [2024-06-18 03:43:36,886][12862] Signal inference workers to stop experience collection... (12100 times) [2024-06-18 03:43:36,920][12883] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-06-18 03:43:36,951][12862] Signal inference workers to resume experience collection... (12100 times) [2024-06-18 03:43:36,952][12883] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-06-18 03:43:36,996][12645] Fps is (10 sec: 40950.8, 60 sec: 41777.6, 300 sec: 41987.2). Total num frames: 841269248. Throughput: 0: 41941.5. 
Samples: 841401580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:43:36,996][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 03:43:37,690][12883] Updated weights for policy 0, policy_version 51351 (0.0035) [2024-06-18 03:43:41,928][12883] Updated weights for policy 0, policy_version 51361 (0.0037) [2024-06-18 03:43:41,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 841498624. Throughput: 0: 41899.0. Samples: 841650060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:43:41,996][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 03:43:45,763][12883] Updated weights for policy 0, policy_version 51371 (0.0039) [2024-06-18 03:43:46,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42052.3, 300 sec: 42265.5). Total num frames: 841728000. Throughput: 0: 41899.5. Samples: 841781960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:43:46,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 03:43:49,935][12883] Updated weights for policy 0, policy_version 51381 (0.0031) [2024-06-18 03:43:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41783.6, 300 sec: 41876.4). Total num frames: 841891840. Throughput: 0: 41878.2. Samples: 842027600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:43:51,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 03:43:54,069][12883] Updated weights for policy 0, policy_version 51391 (0.0030) [2024-06-18 03:43:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41780.8, 300 sec: 41987.5). Total num frames: 842121216. Throughput: 0: 41907.6. Samples: 842276440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:43:56,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 03:43:57,557][12883] Updated weights for policy 0, policy_version 51401 (0.0032) [2024-06-18 03:44:01,578][12883] Updated weights for policy 0, policy_version 51411 (0.0032) [2024-06-18 03:44:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 842334208. Throughput: 0: 42039.4. Samples: 842408940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:44:01,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 03:44:05,283][12883] Updated weights for policy 0, policy_version 51421 (0.0025) [2024-06-18 03:44:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 842530816. Throughput: 0: 41868.5. Samples: 842657020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:44:06,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 03:44:09,386][12883] Updated weights for policy 0, policy_version 51431 (0.0039) [2024-06-18 03:44:11,997][12645] Fps is (10 sec: 42583.3, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 842760192. Throughput: 0: 41864.2. Samples: 842905940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:44:11,998][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 03:44:13,063][12883] Updated weights for policy 0, policy_version 51441 (0.0029) [2024-06-18 03:44:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 842956800. Throughput: 0: 41998.6. Samples: 843033720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:44:16,995][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 03:44:17,006][12883] Updated weights for policy 0, policy_version 51451 (0.0025) [2024-06-18 03:44:21,285][12883] Updated weights for policy 0, policy_version 51461 (0.0037) [2024-06-18 03:44:21,994][12645] Fps is (10 sec: 39335.5, 60 sec: 41506.1, 300 sec: 41765.3). 
Total num frames: 843153408. Throughput: 0: 41859.8. Samples: 843285180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 03:44:22,003][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 03:44:24,868][12883] Updated weights for policy 0, policy_version 51471 (0.0041) [2024-06-18 03:44:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 843399168. Throughput: 0: 41805.8. Samples: 843531320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 03:44:26,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 03:44:28,998][12883] Updated weights for policy 0, policy_version 51481 (0.0044) [2024-06-18 03:44:32,000][12645] Fps is (10 sec: 40934.7, 60 sec: 41506.1, 300 sec: 41875.5). Total num frames: 843563008. Throughput: 0: 41789.7. Samples: 843662760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 03:44:32,001][12645] Avg episode reward: [(0, '0.094')] [2024-06-18 03:44:32,813][12883] Updated weights for policy 0, policy_version 51491 (0.0027) [2024-06-18 03:44:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41780.7, 300 sec: 41820.8). Total num frames: 843776000. Throughput: 0: 41876.3. Samples: 843912040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 03:44:36,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 03:44:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051500_843776000.pth... [2024-06-18 03:44:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000050886_833716224.pth [2024-06-18 03:44:37,248][12883] Updated weights for policy 0, policy_version 51501 (0.0032) [2024-06-18 03:44:40,689][12883] Updated weights for policy 0, policy_version 51511 (0.0034) [2024-06-18 03:44:41,994][12645] Fps is (10 sec: 45904.0, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 844021760. Throughput: 0: 41876.0. Samples: 844160860. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 03:44:41,994][12645] Avg episode reward: [(0, '0.137')] [2024-06-18 03:44:44,715][12883] Updated weights for policy 0, policy_version 51521 (0.0032) [2024-06-18 03:44:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 844201984. Throughput: 0: 41961.0. Samples: 844297180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 03:44:46,994][12645] Avg episode reward: [(0, '0.147')] [2024-06-18 03:44:48,374][12883] Updated weights for policy 0, policy_version 51531 (0.0029) [2024-06-18 03:44:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 844431360. Throughput: 0: 41944.4. Samples: 844544520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 03:44:51,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 03:44:52,285][12883] Updated weights for policy 0, policy_version 51541 (0.0041) [2024-06-18 03:44:56,287][12883] Updated weights for policy 0, policy_version 51551 (0.0033) [2024-06-18 03:44:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 844644352. Throughput: 0: 42173.2. Samples: 844803580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 03:44:56,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 03:45:00,167][12883] Updated weights for policy 0, policy_version 51561 (0.0028) [2024-06-18 03:45:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 844840960. Throughput: 0: 42108.5. Samples: 844928600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 03:45:01,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 03:45:04,143][12883] Updated weights for policy 0, policy_version 51571 (0.0028) [2024-06-18 03:45:05,662][12862] Signal inference workers to stop experience collection... 
(12150 times) [2024-06-18 03:45:05,663][12862] Signal inference workers to resume experience collection... (12150 times) [2024-06-18 03:45:05,680][12883] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-06-18 03:45:05,681][12883] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-06-18 03:45:06,996][12645] Fps is (10 sec: 42588.4, 60 sec: 42323.7, 300 sec: 42153.7). Total num frames: 845070336. Throughput: 0: 42019.3. Samples: 845176140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 03:45:06,997][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 03:45:08,423][12883] Updated weights for policy 0, policy_version 51581 (0.0036) [2024-06-18 03:45:11,838][12883] Updated weights for policy 0, policy_version 51591 (0.0034) [2024-06-18 03:45:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41781.7, 300 sec: 41931.9). Total num frames: 845266944. Throughput: 0: 42207.2. Samples: 845430640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 03:45:11,994][12645] Avg episode reward: [(0, '0.094')] [2024-06-18 03:45:16,158][12883] Updated weights for policy 0, policy_version 51601 (0.0044) [2024-06-18 03:45:16,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42325.3, 300 sec: 41987.4). Total num frames: 845496320. Throughput: 0: 42022.7. Samples: 845553520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 03:45:16,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 03:45:19,672][12883] Updated weights for policy 0, policy_version 51611 (0.0035) [2024-06-18 03:45:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 845692928. Throughput: 0: 42116.5. Samples: 845807280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 03:45:21,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 03:45:23,862][12883] Updated weights for policy 0, policy_version 51621 (0.0028) [2024-06-18 03:45:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 845905920. Throughput: 0: 42245.8. Samples: 846061920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 24.0) [2024-06-18 03:45:26,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 03:45:27,451][12883] Updated weights for policy 0, policy_version 51631 (0.0031) [2024-06-18 03:45:31,575][12883] Updated weights for policy 0, policy_version 51641 (0.0033) [2024-06-18 03:45:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42056.7, 300 sec: 41987.5). Total num frames: 846086144. Throughput: 0: 42014.7. Samples: 846187840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 03:45:31,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 03:45:35,290][12883] Updated weights for policy 0, policy_version 51651 (0.0032) [2024-06-18 03:45:36,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42869.9, 300 sec: 42153.8). Total num frames: 846348288. Throughput: 0: 42111.7. Samples: 846439640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 03:45:36,996][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 03:45:39,427][12883] Updated weights for policy 0, policy_version 51661 (0.0042) [2024-06-18 03:45:42,000][12645] Fps is (10 sec: 44209.1, 60 sec: 41774.8, 300 sec: 41931.1). Total num frames: 846528512. Throughput: 0: 41833.7. Samples: 846686360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 03:45:42,001][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 03:45:43,037][12883] Updated weights for policy 0, policy_version 51671 (0.0038) [2024-06-18 03:45:46,994][12645] Fps is (10 sec: 37691.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 846725120. Throughput: 0: 41803.0. Samples: 846809740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 03:45:46,994][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 03:45:47,159][12883] Updated weights for policy 0, policy_version 51681 (0.0028) [2024-06-18 03:45:50,658][12883] Updated weights for policy 0, policy_version 51691 (0.0038) [2024-06-18 03:45:51,994][12645] Fps is (10 sec: 44264.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 846970880. Throughput: 0: 42071.1. Samples: 847069240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 03:45:51,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 03:45:54,650][12883] Updated weights for policy 0, policy_version 51701 (0.0035) [2024-06-18 03:45:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.1, 300 sec: 41987.4). Total num frames: 847167488. Throughput: 0: 42018.5. Samples: 847321480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 03:45:57,003][12645] Avg episode reward: [(0, '0.239')] [2024-06-18 03:45:58,151][12883] Updated weights for policy 0, policy_version 51711 (0.0035) [2024-06-18 03:46:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 847364096. Throughput: 0: 42103.7. Samples: 847448180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 03:46:01,994][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 03:46:02,410][12883] Updated weights for policy 0, policy_version 51721 (0.0041) [2024-06-18 03:46:05,959][12883] Updated weights for policy 0, policy_version 51731 (0.0039) [2024-06-18 03:46:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42053.8, 300 sec: 42043.0). Total num frames: 847593472. Throughput: 0: 42209.3. Samples: 847706700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 03:46:06,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 03:46:10,161][12883] Updated weights for policy 0, policy_version 51741 (0.0034) [2024-06-18 03:46:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42043.0). 
Total num frames: 847806464. Throughput: 0: 42217.3. Samples: 847961700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 03:46:11,996][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 03:46:13,590][12883] Updated weights for policy 0, policy_version 51751 (0.0041) [2024-06-18 03:46:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 848003072. Throughput: 0: 42230.2. Samples: 848088200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 03:46:16,994][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 03:46:17,747][12883] Updated weights for policy 0, policy_version 51761 (0.0033) [2024-06-18 03:46:21,690][12883] Updated weights for policy 0, policy_version 51771 (0.0027) [2024-06-18 03:46:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 848232448. Throughput: 0: 42380.8. Samples: 848346680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 03:46:21,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 03:46:25,683][12883] Updated weights for policy 0, policy_version 51781 (0.0031) [2024-06-18 03:46:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 848445440. Throughput: 0: 42494.4. Samples: 848598340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 03:46:26,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 03:46:27,641][12862] Signal inference workers to stop experience collection... (12200 times) [2024-06-18 03:46:27,642][12862] Signal inference workers to resume experience collection... 
(12200 times) [2024-06-18 03:46:27,685][12883] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-06-18 03:46:27,692][12883] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-06-18 03:46:29,403][12883] Updated weights for policy 0, policy_version 51791 (0.0024) [2024-06-18 03:46:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 848658432. Throughput: 0: 42630.3. Samples: 848728100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 03:46:31,994][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 03:46:33,263][12883] Updated weights for policy 0, policy_version 51801 (0.0031) [2024-06-18 03:46:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41780.7, 300 sec: 42043.0). Total num frames: 848855040. Throughput: 0: 42414.9. Samples: 848977920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 03:46:36,995][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 03:46:37,071][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051811_848871424.pth... [2024-06-18 03:46:37,078][12883] Updated weights for policy 0, policy_version 51811 (0.0031) [2024-06-18 03:46:37,116][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051194_838762496.pth [2024-06-18 03:46:40,785][12883] Updated weights for policy 0, policy_version 51821 (0.0034) [2024-06-18 03:46:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42329.8, 300 sec: 42043.0). Total num frames: 849068032. Throughput: 0: 42406.8. Samples: 849229780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 03:46:41,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 03:46:45,019][12883] Updated weights for policy 0, policy_version 51831 (0.0025) [2024-06-18 03:46:46,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 849297408. Throughput: 0: 42568.0. Samples: 849363740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 03:46:46,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 03:46:48,669][12883] Updated weights for policy 0, policy_version 51841 (0.0038) [2024-06-18 03:46:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 849477632. Throughput: 0: 42364.9. Samples: 849613120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 03:46:51,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 03:46:52,727][12883] Updated weights for policy 0, policy_version 51851 (0.0039) [2024-06-18 03:46:56,367][12883] Updated weights for policy 0, policy_version 51861 (0.0033) [2024-06-18 03:46:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42098.9). Total num frames: 849707008. Throughput: 0: 42366.2. Samples: 849868180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 03:46:56,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 03:47:00,566][12883] Updated weights for policy 0, policy_version 51871 (0.0041) [2024-06-18 03:47:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 849920000. Throughput: 0: 42440.0. Samples: 849998000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 03:47:01,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 03:47:02,149][12862] Saving new best policy, reward=0.416! [2024-06-18 03:47:04,147][12883] Updated weights for policy 0, policy_version 51881 (0.0025) [2024-06-18 03:47:06,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42323.8, 300 sec: 42153.8). Total num frames: 850132992. Throughput: 0: 42161.5. Samples: 850244040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:47:06,997][12645] Avg episode reward: [(0, '0.215')] [2024-06-18 03:47:08,487][12883] Updated weights for policy 0, policy_version 51891 (0.0029) [2024-06-18 03:47:11,754][12883] Updated weights for policy 0, policy_version 51901 (0.0039) [2024-06-18 03:47:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 850345984. Throughput: 0: 42366.3. Samples: 850504820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:47:11,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 03:47:16,026][12883] Updated weights for policy 0, policy_version 51911 (0.0033) [2024-06-18 03:47:16,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 850542592. Throughput: 0: 42236.0. Samples: 850628720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:47:17,002][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 03:47:19,550][12883] Updated weights for policy 0, policy_version 51921 (0.0031) [2024-06-18 03:47:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 850788352. Throughput: 0: 42300.1. Samples: 850881420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:47:21,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 03:47:22,003][12862] Saving new best policy, reward=0.440! [2024-06-18 03:47:23,688][12883] Updated weights for policy 0, policy_version 51931 (0.0027) [2024-06-18 03:47:26,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 850984960. Throughput: 0: 42426.4. Samples: 851138980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:47:26,994][12645] Avg episode reward: [(0, '0.145')] [2024-06-18 03:47:27,341][12883] Updated weights for policy 0, policy_version 51941 (0.0037) [2024-06-18 03:47:31,350][12883] Updated weights for policy 0, policy_version 51951 (0.0032) [2024-06-18 03:47:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 851181568. Throughput: 0: 42182.6. Samples: 851261960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:47:31,994][12645] Avg episode reward: [(0, '0.145')] [2024-06-18 03:47:34,905][12883] Updated weights for policy 0, policy_version 51961 (0.0040) [2024-06-18 03:47:35,992][12862] Signal inference workers to stop experience collection... (12250 times) [2024-06-18 03:47:35,992][12862] Signal inference workers to resume experience collection... (12250 times) [2024-06-18 03:47:36,009][12883] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-06-18 03:47:36,009][12883] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-06-18 03:47:36,994][12645] Fps is (10 sec: 42599.6, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 851410944. Throughput: 0: 42381.5. Samples: 851520280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 03:47:36,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 03:47:39,194][12883] Updated weights for policy 0, policy_version 51971 (0.0034) [2024-06-18 03:47:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 851607552. Throughput: 0: 42355.1. Samples: 851774160. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:47:41,994][12645] Avg episode reward: [(0, '0.010')] [2024-06-18 03:47:42,773][12883] Updated weights for policy 0, policy_version 51981 (0.0025) [2024-06-18 03:47:46,789][12883] Updated weights for policy 0, policy_version 51991 (0.0037) [2024-06-18 03:47:46,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.1, 300 sec: 42155.0). Total num frames: 851820544. Throughput: 0: 42187.5. Samples: 851896440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:47:46,994][12645] Avg episode reward: [(0, '0.051')] [2024-06-18 03:47:50,418][12883] Updated weights for policy 0, policy_version 52001 (0.0038) [2024-06-18 03:47:51,995][12645] Fps is (10 sec: 44231.0, 60 sec: 42870.5, 300 sec: 42154.2). Total num frames: 852049920. Throughput: 0: 42367.4. Samples: 852150540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:47:51,996][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 03:47:54,757][12883] Updated weights for policy 0, policy_version 52011 (0.0036) [2024-06-18 03:47:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 852246528. Throughput: 0: 42354.7. Samples: 852410780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:47:56,994][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 03:47:57,979][12883] Updated weights for policy 0, policy_version 52021 (0.0024) [2024-06-18 03:48:01,994][12645] Fps is (10 sec: 40965.8, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 852459520. Throughput: 0: 42240.8. Samples: 852529560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:48:01,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 03:48:02,357][12883] Updated weights for policy 0, policy_version 52031 (0.0037) [2024-06-18 03:48:05,783][12883] Updated weights for policy 0, policy_version 52041 (0.0028) [2024-06-18 03:48:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42326.9, 300 sec: 42154.1). 
Total num frames: 852672512. Throughput: 0: 42197.2. Samples: 852780300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 03:48:06,994][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 03:48:09,860][12883] Updated weights for policy 0, policy_version 52051 (0.0033) [2024-06-18 03:48:11,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42050.7, 300 sec: 42153.8). Total num frames: 852869120. Throughput: 0: 42140.3. Samples: 853035380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:48:11,996][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 03:48:13,801][12883] Updated weights for policy 0, policy_version 52061 (0.0034) [2024-06-18 03:48:16,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 853098496. Throughput: 0: 42293.0. Samples: 853165240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:48:16,996][12645] Avg episode reward: [(0, '0.239')] [2024-06-18 03:48:17,534][12883] Updated weights for policy 0, policy_version 52071 (0.0021) [2024-06-18 03:48:21,546][12883] Updated weights for policy 0, policy_version 52081 (0.0047) [2024-06-18 03:48:21,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 853311488. Throughput: 0: 42138.5. Samples: 853416520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:48:21,994][12645] Avg episode reward: [(0, '0.228')] [2024-06-18 03:48:25,531][12883] Updated weights for policy 0, policy_version 52091 (0.0038) [2024-06-18 03:48:26,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.5, 300 sec: 42210.5). Total num frames: 853524480. Throughput: 0: 42187.2. Samples: 853672580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:48:26,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 03:48:29,346][12883] Updated weights for policy 0, policy_version 52101 (0.0039) [2024-06-18 03:48:31,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42154.4). 
Total num frames: 853704704. Throughput: 0: 42212.6. Samples: 853796000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:48:31,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 03:48:33,250][12883] Updated weights for policy 0, policy_version 52111 (0.0049) [2024-06-18 03:48:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 853934080. Throughput: 0: 42217.4. Samples: 854050260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:48:36,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 03:48:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052121_853950464.pth... [2024-06-18 03:48:37,070][12883] Updated weights for policy 0, policy_version 52121 (0.0039) [2024-06-18 03:48:37,113][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051500_843776000.pth [2024-06-18 03:48:41,187][12883] Updated weights for policy 0, policy_version 52131 (0.0036) [2024-06-18 03:48:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 854147072. Throughput: 0: 42005.4. Samples: 854301020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 03:48:41,994][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 03:48:45,090][12883] Updated weights for policy 0, policy_version 52141 (0.0030) [2024-06-18 03:48:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 854343680. Throughput: 0: 42168.1. Samples: 854427120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 03:48:46,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 03:48:48,839][12883] Updated weights for policy 0, policy_version 52151 (0.0052) [2024-06-18 03:48:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42053.2, 300 sec: 42209.6). Total num frames: 854573056. Throughput: 0: 42228.5. Samples: 854680580. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 03:48:51,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 03:48:52,764][12883] Updated weights for policy 0, policy_version 52161 (0.0031) [2024-06-18 03:48:56,522][12883] Updated weights for policy 0, policy_version 52171 (0.0038) [2024-06-18 03:48:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 854786048. Throughput: 0: 42248.8. Samples: 854936480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 03:48:56,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 03:49:00,540][12883] Updated weights for policy 0, policy_version 52181 (0.0033) [2024-06-18 03:49:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 854999040. Throughput: 0: 42159.8. Samples: 855062340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 03:49:01,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 03:49:04,274][12883] Updated weights for policy 0, policy_version 52191 (0.0033) [2024-06-18 03:49:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42210.1). Total num frames: 855212032. Throughput: 0: 42213.8. Samples: 855316140. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 03:49:06,994][12645] Avg episode reward: [(0, '0.126')] [2024-06-18 03:49:08,339][12883] Updated weights for policy 0, policy_version 52201 (0.0025) [2024-06-18 03:49:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 855408640. Throughput: 0: 42208.0. Samples: 855571940. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 03:49:11,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 03:49:12,033][12883] Updated weights for policy 0, policy_version 52211 (0.0042) [2024-06-18 03:49:13,300][12862] Signal inference workers to stop experience collection... 
(12300 times) [2024-06-18 03:49:13,301][12862] Signal inference workers to resume experience collection... (12300 times) [2024-06-18 03:49:13,315][12883] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-06-18 03:49:13,315][12883] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-06-18 03:49:16,095][12883] Updated weights for policy 0, policy_version 52221 (0.0026) [2024-06-18 03:49:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42053.9, 300 sec: 42265.2). Total num frames: 855621632. Throughput: 0: 42265.8. Samples: 855697960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:49:16,994][12645] Avg episode reward: [(0, '0.145')] [2024-06-18 03:49:19,836][12883] Updated weights for policy 0, policy_version 52231 (0.0032) [2024-06-18 03:49:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 855834624. Throughput: 0: 42237.7. Samples: 855950960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:49:21,994][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 03:49:23,750][12883] Updated weights for policy 0, policy_version 52241 (0.0028) [2024-06-18 03:49:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42266.1). Total num frames: 856031232. Throughput: 0: 42391.5. Samples: 856208640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:49:26,994][12645] Avg episode reward: [(0, '0.057')] [2024-06-18 03:49:27,768][12883] Updated weights for policy 0, policy_version 52251 (0.0042) [2024-06-18 03:49:31,385][12883] Updated weights for policy 0, policy_version 52261 (0.0031) [2024-06-18 03:49:31,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42320.4). Total num frames: 856260608. Throughput: 0: 42246.3. Samples: 856328300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:49:31,997][12645] Avg episode reward: [(0, '0.064')] [2024-06-18 03:49:35,384][12883] Updated weights for policy 0, policy_version 52271 (0.0039) [2024-06-18 03:49:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 856473600. Throughput: 0: 42292.0. Samples: 856583720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:49:36,994][12645] Avg episode reward: [(0, '0.075')] [2024-06-18 03:49:39,457][12883] Updated weights for policy 0, policy_version 52281 (0.0028) [2024-06-18 03:49:41,994][12645] Fps is (10 sec: 39330.8, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 856653824. Throughput: 0: 42336.1. Samples: 856841600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:49:41,994][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 03:49:43,112][12883] Updated weights for policy 0, policy_version 52291 (0.0031) [2024-06-18 03:49:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 856883200. Throughput: 0: 42264.9. Samples: 856964260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:49:46,995][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 03:49:47,169][12883] Updated weights for policy 0, policy_version 52301 (0.0035) [2024-06-18 03:49:50,767][12883] Updated weights for policy 0, policy_version 52311 (0.0028) [2024-06-18 03:49:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 857096192. Throughput: 0: 42396.6. Samples: 857223980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 03:49:51,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 03:49:54,670][12883] Updated weights for policy 0, policy_version 52321 (0.0050) [2024-06-18 03:49:56,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41777.5, 300 sec: 42209.3). Total num frames: 857292800. Throughput: 0: 42307.1. Samples: 857475860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 03:49:56,997][12645] Avg episode reward: [(0, '0.124')] [2024-06-18 03:49:58,680][12883] Updated weights for policy 0, policy_version 52331 (0.0033) [2024-06-18 03:50:01,994][12645] Fps is (10 sec: 44234.7, 60 sec: 42325.1, 300 sec: 42265.4). Total num frames: 857538560. Throughput: 0: 42237.3. Samples: 857598660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 03:50:01,995][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 03:50:02,811][12883] Updated weights for policy 0, policy_version 52341 (0.0039) [2024-06-18 03:50:06,428][12883] Updated weights for policy 0, policy_version 52351 (0.0045) [2024-06-18 03:50:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42052.2, 300 sec: 42265.1). Total num frames: 857735168. Throughput: 0: 42336.6. Samples: 857856120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 03:50:06,994][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 03:50:10,314][12883] Updated weights for policy 0, policy_version 52361 (0.0032) [2024-06-18 03:50:11,994][12645] Fps is (10 sec: 40962.0, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 857948160. Throughput: 0: 42177.8. Samples: 858106640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 03:50:11,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 03:50:14,295][12883] Updated weights for policy 0, policy_version 52371 (0.0031) [2024-06-18 03:50:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 858161152. Throughput: 0: 42486.5. Samples: 858240100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 03:50:16,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 03:50:17,879][12883] Updated weights for policy 0, policy_version 52381 (0.0025) [2024-06-18 03:50:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 858357760. Throughput: 0: 42460.5. Samples: 858494440. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:50:21,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 03:50:22,140][12883] Updated weights for policy 0, policy_version 52391 (0.0036) [2024-06-18 03:50:25,661][12883] Updated weights for policy 0, policy_version 52401 (0.0038) [2024-06-18 03:50:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 858587136. Throughput: 0: 42153.7. Samples: 858738520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:50:26,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 03:50:29,995][12883] Updated weights for policy 0, policy_version 52411 (0.0039) [2024-06-18 03:50:30,394][12862] Signal inference workers to stop experience collection... (12350 times) [2024-06-18 03:50:30,395][12862] Signal inference workers to resume experience collection... (12350 times) [2024-06-18 03:50:30,407][12883] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-06-18 03:50:30,407][12883] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-06-18 03:50:31,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42053.9, 300 sec: 42154.4). Total num frames: 858783744. Throughput: 0: 42385.0. Samples: 858871580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:50:31,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 03:50:33,394][12883] Updated weights for policy 0, policy_version 52421 (0.0032) [2024-06-18 03:50:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42210.5). Total num frames: 858980352. Throughput: 0: 42097.3. Samples: 859118360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:50:36,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 03:50:37,109][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052429_858996736.pth... 
[2024-06-18 03:50:37,156][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000051811_848871424.pth [2024-06-18 03:50:37,695][12883] Updated weights for policy 0, policy_version 52431 (0.0032) [2024-06-18 03:50:41,013][12883] Updated weights for policy 0, policy_version 52441 (0.0030) [2024-06-18 03:50:41,995][12645] Fps is (10 sec: 44232.7, 60 sec: 42870.8, 300 sec: 42376.1). Total num frames: 859226112. Throughput: 0: 42131.2. Samples: 859371700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:50:41,995][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 03:50:45,437][12883] Updated weights for policy 0, policy_version 52451 (0.0039) [2024-06-18 03:50:46,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.9, 300 sec: 42264.8). Total num frames: 859439104. Throughput: 0: 42348.5. Samples: 859504420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:50:46,996][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 03:50:48,662][12883] Updated weights for policy 0, policy_version 52461 (0.0028) [2024-06-18 03:50:51,996][12645] Fps is (10 sec: 40954.5, 60 sec: 42323.8, 300 sec: 42264.9). Total num frames: 859635712. Throughput: 0: 42159.0. Samples: 859753360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 03:50:51,996][12645] Avg episode reward: [(0, '0.234')] [2024-06-18 03:50:53,038][12883] Updated weights for policy 0, policy_version 52471 (0.0033) [2024-06-18 03:50:56,479][12883] Updated weights for policy 0, policy_version 52481 (0.0028) [2024-06-18 03:50:56,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42873.2, 300 sec: 42376.2). Total num frames: 859865088. Throughput: 0: 42264.4. Samples: 860008540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:50:56,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 03:51:01,158][12883] Updated weights for policy 0, policy_version 52491 (0.0031) [2024-06-18 03:51:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42052.6, 300 sec: 42265.2). Total num frames: 860061696. Throughput: 0: 42180.5. Samples: 860138220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:51:01,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 03:51:04,081][12883] Updated weights for policy 0, policy_version 52501 (0.0041) [2024-06-18 03:51:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 860274688. Throughput: 0: 42158.3. Samples: 860391560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:51:06,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 03:51:08,869][12883] Updated weights for policy 0, policy_version 52511 (0.0029) [2024-06-18 03:51:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 860487680. Throughput: 0: 42269.4. Samples: 860640640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:51:11,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 03:51:12,340][12883] Updated weights for policy 0, policy_version 52521 (0.0035) [2024-06-18 03:51:16,515][12883] Updated weights for policy 0, policy_version 52531 (0.0031) [2024-06-18 03:51:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 860684288. Throughput: 0: 42175.9. Samples: 860769500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:51:16,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 03:51:19,757][12883] Updated weights for policy 0, policy_version 52541 (0.0035) [2024-06-18 03:51:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 860913664. Throughput: 0: 42370.6. Samples: 861025040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 03:51:21,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 03:51:24,082][12883] Updated weights for policy 0, policy_version 52551 (0.0037) [2024-06-18 03:51:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 861126656. Throughput: 0: 42494.9. Samples: 861283940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 03:51:26,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 03:51:27,380][12883] Updated weights for policy 0, policy_version 52561 (0.0035) [2024-06-18 03:51:31,912][12883] Updated weights for policy 0, policy_version 52571 (0.0035) [2024-06-18 03:51:31,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42323.7, 300 sec: 42264.9). Total num frames: 861323264. Throughput: 0: 42323.6. Samples: 861408980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 03:51:31,996][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 03:51:35,237][12883] Updated weights for policy 0, policy_version 52581 (0.0033) [2024-06-18 03:51:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 861552640. Throughput: 0: 42433.6. Samples: 861662780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 03:51:36,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 03:51:39,606][12883] Updated weights for policy 0, policy_version 52591 (0.0032) [2024-06-18 03:51:41,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42325.9, 300 sec: 42265.1). Total num frames: 861765632. Throughput: 0: 42352.4. Samples: 861914400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 03:51:41,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 03:51:43,229][12883] Updated weights for policy 0, policy_version 52601 (0.0050) [2024-06-18 03:51:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42053.9, 300 sec: 42320.7). Total num frames: 861962240. Throughput: 0: 42300.5. Samples: 862041740. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 03:51:46,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 03:51:47,211][12883] Updated weights for policy 0, policy_version 52611 (0.0026) [2024-06-18 03:51:50,697][12883] Updated weights for policy 0, policy_version 52621 (0.0033) [2024-06-18 03:51:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42600.0, 300 sec: 42320.7). Total num frames: 862191616. Throughput: 0: 42315.5. Samples: 862295760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 03:51:51,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 03:51:55,170][12883] Updated weights for policy 0, policy_version 52631 (0.0038) [2024-06-18 03:51:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 862388224. Throughput: 0: 42458.3. Samples: 862551260. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 03:51:56,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 03:51:58,550][12883] Updated weights for policy 0, policy_version 52641 (0.0035) [2024-06-18 03:52:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42265.5). Total num frames: 862601216. Throughput: 0: 42289.3. Samples: 862672520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 03:52:01,994][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 03:52:03,013][12883] Updated weights for policy 0, policy_version 52651 (0.0029) [2024-06-18 03:52:03,534][12862] Signal inference workers to stop experience collection... (12400 times) [2024-06-18 03:52:03,552][12883] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-06-18 03:52:03,591][12862] Signal inference workers to resume experience collection... 
(12400 times) [2024-06-18 03:52:03,592][12883] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-06-18 03:52:06,706][12883] Updated weights for policy 0, policy_version 52661 (0.0029) [2024-06-18 03:52:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 862830592. Throughput: 0: 42414.7. Samples: 862933700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 03:52:06,994][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 03:52:10,635][12883] Updated weights for policy 0, policy_version 52671 (0.0040) [2024-06-18 03:52:11,995][12645] Fps is (10 sec: 42591.8, 60 sec: 42324.2, 300 sec: 42320.5). Total num frames: 863027200. Throughput: 0: 42101.3. Samples: 863178560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 03:52:11,996][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 03:52:14,387][12883] Updated weights for policy 0, policy_version 52681 (0.0044) [2024-06-18 03:52:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 863240192. Throughput: 0: 42247.3. Samples: 863310020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 03:52:16,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 03:52:18,395][12883] Updated weights for policy 0, policy_version 52691 (0.0036) [2024-06-18 03:52:21,994][12645] Fps is (10 sec: 40966.5, 60 sec: 42052.3, 300 sec: 42209.7). Total num frames: 863436800. Throughput: 0: 42086.3. Samples: 863556660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 03:52:21,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 03:52:22,231][12883] Updated weights for policy 0, policy_version 52701 (0.0033) [2024-06-18 03:52:26,035][12883] Updated weights for policy 0, policy_version 52711 (0.0033) [2024-06-18 03:52:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 863666176. Throughput: 0: 42019.2. Samples: 863805260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 03:52:26,994][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 03:52:30,042][12883] Updated weights for policy 0, policy_version 52721 (0.0037) [2024-06-18 03:52:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42053.8, 300 sec: 42154.1). Total num frames: 863846400. Throughput: 0: 42099.4. Samples: 863936220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:52:31,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 03:52:33,638][12883] Updated weights for policy 0, policy_version 52731 (0.0039) [2024-06-18 03:52:36,994][12645] Fps is (10 sec: 39320.9, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 864059392. Throughput: 0: 41916.7. Samples: 864182020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:52:36,994][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 03:52:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052738_864059392.pth... [2024-06-18 03:52:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052121_853950464.pth [2024-06-18 03:52:38,045][12883] Updated weights for policy 0, policy_version 52741 (0.0028) [2024-06-18 03:52:41,362][12883] Updated weights for policy 0, policy_version 52751 (0.0032) [2024-06-18 03:52:41,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 864305152. Throughput: 0: 41818.6. Samples: 864433100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:52:41,994][12645] Avg episode reward: [(0, '0.105')] [2024-06-18 03:52:45,695][12883] Updated weights for policy 0, policy_version 52761 (0.0034) [2024-06-18 03:52:46,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41779.2, 300 sec: 42098.8). Total num frames: 864468992. Throughput: 0: 42037.9. Samples: 864564220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:52:46,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 03:52:49,102][12883] Updated weights for policy 0, policy_version 52771 (0.0040) [2024-06-18 03:52:51,996][12645] Fps is (10 sec: 39312.9, 60 sec: 41777.6, 300 sec: 42209.3). Total num frames: 864698368. Throughput: 0: 41831.7. Samples: 864816220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:52:51,996][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 03:52:53,386][12883] Updated weights for policy 0, policy_version 52781 (0.0051) [2024-06-18 03:52:56,824][12883] Updated weights for policy 0, policy_version 52791 (0.0042) [2024-06-18 03:52:56,996][12645] Fps is (10 sec: 45864.3, 60 sec: 42323.7, 300 sec: 42264.8). Total num frames: 864927744. Throughput: 0: 41901.5. Samples: 865064160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:52:56,997][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 03:53:01,083][12883] Updated weights for policy 0, policy_version 52801 (0.0040) [2024-06-18 03:53:01,996][12645] Fps is (10 sec: 39321.2, 60 sec: 41504.5, 300 sec: 42098.2). Total num frames: 865091584. Throughput: 0: 41702.4. Samples: 865186720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 03:53:01,997][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 03:53:04,870][12883] Updated weights for policy 0, policy_version 52811 (0.0036) [2024-06-18 03:53:06,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41779.1, 300 sec: 42265.5). Total num frames: 865337344. Throughput: 0: 41792.8. Samples: 865437340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 03:53:06,995][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 03:53:09,047][12883] Updated weights for policy 0, policy_version 52821 (0.0044) [2024-06-18 03:53:11,994][12645] Fps is (10 sec: 44247.5, 60 sec: 41780.3, 300 sec: 42154.4). Total num frames: 865533952. Throughput: 0: 41993.8. Samples: 865694980. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 03:53:11,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 03:53:12,651][12883] Updated weights for policy 0, policy_version 52831 (0.0023) [2024-06-18 03:53:16,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 865730560. Throughput: 0: 41706.7. Samples: 865813020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 03:53:16,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 03:53:17,759][12883] Updated weights for policy 0, policy_version 52841 (0.0028) [2024-06-18 03:53:19,991][12862] Signal inference workers to stop experience collection... (12450 times) [2024-06-18 03:53:19,991][12862] Signal inference workers to resume experience collection... (12450 times) [2024-06-18 03:53:20,012][12883] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-06-18 03:53:20,012][12883] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-06-18 03:53:20,303][12883] Updated weights for policy 0, policy_version 52851 (0.0036) [2024-06-18 03:53:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 865976320. Throughput: 0: 41854.0. Samples: 866065440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 03:53:21,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 03:53:25,344][12883] Updated weights for policy 0, policy_version 52861 (0.0035) [2024-06-18 03:53:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 866156544. Throughput: 0: 42197.7. Samples: 866332000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 03:53:26,994][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 03:53:28,072][12883] Updated weights for policy 0, policy_version 52871 (0.0036) [2024-06-18 03:53:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 866353152. Throughput: 0: 42025.3. 
Samples: 866455360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 03:53:31,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 03:53:32,874][12883] Updated weights for policy 0, policy_version 52881 (0.0034) [2024-06-18 03:53:36,188][12883] Updated weights for policy 0, policy_version 52891 (0.0027) [2024-06-18 03:53:37,010][12645] Fps is (10 sec: 45799.5, 60 sec: 42586.7, 300 sec: 42262.8). Total num frames: 866615296. Throughput: 0: 41975.1. Samples: 866705700. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-18 03:53:37,011][12645] Avg episode reward: [(0, '0.137')] [2024-06-18 03:53:40,499][12883] Updated weights for policy 0, policy_version 52901 (0.0040) [2024-06-18 03:53:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 866795520. Throughput: 0: 42305.3. Samples: 866967800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-18 03:53:41,994][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 03:53:43,760][12883] Updated weights for policy 0, policy_version 52911 (0.0032) [2024-06-18 03:53:46,994][12645] Fps is (10 sec: 39385.8, 60 sec: 42325.1, 300 sec: 42154.1). Total num frames: 867008512. Throughput: 0: 42177.9. Samples: 867084640. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-18 03:53:46,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 03:53:48,552][12883] Updated weights for policy 0, policy_version 52921 (0.0030) [2024-06-18 03:53:51,542][12883] Updated weights for policy 0, policy_version 52931 (0.0030) [2024-06-18 03:53:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42327.0, 300 sec: 42209.6). Total num frames: 867237888. Throughput: 0: 42402.4. Samples: 867345440. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-18 03:53:51,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 03:53:56,095][12883] Updated weights for policy 0, policy_version 52941 (0.0032) [2024-06-18 03:53:56,994][12645] Fps is (10 sec: 42599.7, 60 sec: 41780.8, 300 sec: 42154.1). Total num frames: 867434496. Throughput: 0: 42275.5. Samples: 867597380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-18 03:53:56,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 03:53:59,611][12883] Updated weights for policy 0, policy_version 52951 (0.0026) [2024-06-18 03:54:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42873.1, 300 sec: 42209.6). Total num frames: 867663872. Throughput: 0: 42392.8. Samples: 867720700. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-18 03:54:01,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 03:54:03,893][12883] Updated weights for policy 0, policy_version 52961 (0.0037) [2024-06-18 03:54:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 867860480. Throughput: 0: 42561.2. Samples: 867980700. Policy #0 lag: (min: 0.0, avg: 12.8, max: 23.0) [2024-06-18 03:54:06,998][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 03:54:07,145][12883] Updated weights for policy 0, policy_version 52971 (0.0033) [2024-06-18 03:54:11,599][12883] Updated weights for policy 0, policy_version 52981 (0.0026) [2024-06-18 03:54:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 868057088. Throughput: 0: 42359.1. Samples: 868238160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 03:54:11,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 03:54:14,791][12883] Updated weights for policy 0, policy_version 52991 (0.0028) [2024-06-18 03:54:16,993][12645] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 868286464. Throughput: 0: 42252.5. Samples: 868356720. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 03:54:16,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 03:54:19,197][12883] Updated weights for policy 0, policy_version 53001 (0.0039) [2024-06-18 03:54:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.1, 300 sec: 42265.1). Total num frames: 868499456. Throughput: 0: 42423.5. Samples: 868614060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 03:54:21,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 03:54:22,387][12883] Updated weights for policy 0, policy_version 53011 (0.0036) [2024-06-18 03:54:26,803][12883] Updated weights for policy 0, policy_version 53021 (0.0037) [2024-06-18 03:54:26,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42154.4). Total num frames: 868696064. Throughput: 0: 42218.6. Samples: 868867640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 03:54:26,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 03:54:30,173][12883] Updated weights for policy 0, policy_version 53031 (0.0022) [2024-06-18 03:54:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 868925440. Throughput: 0: 42437.7. Samples: 868994320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 03:54:31,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 03:54:34,371][12883] Updated weights for policy 0, policy_version 53041 (0.0031) [2024-06-18 03:54:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41517.6, 300 sec: 42209.6). Total num frames: 869105664. Throughput: 0: 42310.6. Samples: 869249420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 03:54:36,994][12645] Avg episode reward: [(0, '0.138')] [2024-06-18 03:54:37,179][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053048_869138432.pth... 
[2024-06-18 03:54:37,238][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052429_858996736.pth [2024-06-18 03:54:37,441][12862] Signal inference workers to stop experience collection... (12500 times) [2024-06-18 03:54:37,441][12862] Signal inference workers to resume experience collection... (12500 times) [2024-06-18 03:54:37,474][12883] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-06-18 03:54:37,474][12883] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-06-18 03:54:37,791][12883] Updated weights for policy 0, policy_version 53051 (0.0034) [2024-06-18 03:54:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 869335040. Throughput: 0: 42143.5. Samples: 869493840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 03:54:41,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 03:54:42,014][12883] Updated weights for policy 0, policy_version 53061 (0.0036) [2024-06-18 03:54:45,390][12883] Updated weights for policy 0, policy_version 53071 (0.0024) [2024-06-18 03:54:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 869548032. Throughput: 0: 42374.6. Samples: 869627560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:54:46,994][12645] Avg episode reward: [(0, '0.147')] [2024-06-18 03:54:49,813][12883] Updated weights for policy 0, policy_version 53081 (0.0038) [2024-06-18 03:54:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42265.5). Total num frames: 869761024. Throughput: 0: 42103.2. Samples: 869875340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:54:51,996][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 03:54:54,052][12883] Updated weights for policy 0, policy_version 53091 (0.0031) [2024-06-18 03:54:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.2, 300 sec: 42043.1). Total num frames: 869941248. Throughput: 0: 41987.7. 
Samples: 870127600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:54:56,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 03:54:57,788][12883] Updated weights for policy 0, policy_version 53101 (0.0036) [2024-06-18 03:55:01,743][12883] Updated weights for policy 0, policy_version 53111 (0.0032) [2024-06-18 03:55:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 870170624. Throughput: 0: 42051.5. Samples: 870249040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:55:01,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 03:55:05,543][12883] Updated weights for policy 0, policy_version 53121 (0.0035) [2024-06-18 03:55:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 870383616. Throughput: 0: 41863.2. Samples: 870497900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:55:06,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 03:55:09,499][12883] Updated weights for policy 0, policy_version 53131 (0.0024) [2024-06-18 03:55:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 870563840. Throughput: 0: 41911.6. Samples: 870753660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:55:11,995][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 03:55:13,330][12883] Updated weights for policy 0, policy_version 53141 (0.0028) [2024-06-18 03:55:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 870809600. Throughput: 0: 41805.6. Samples: 870875580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 03:55:16,994][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 03:55:17,196][12883] Updated weights for policy 0, policy_version 53151 (0.0036) [2024-06-18 03:55:21,361][12883] Updated weights for policy 0, policy_version 53161 (0.0034) [2024-06-18 03:55:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 871022592. Throughput: 0: 41856.9. Samples: 871132980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:55:21,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 03:55:24,957][12883] Updated weights for policy 0, policy_version 53171 (0.0041) [2024-06-18 03:55:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 871202816. Throughput: 0: 41997.4. Samples: 871383720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:55:26,994][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 03:55:29,020][12883] Updated weights for policy 0, policy_version 53181 (0.0047) [2024-06-18 03:55:31,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42050.6, 300 sec: 42264.8). Total num frames: 871448576. Throughput: 0: 41814.8. Samples: 871509320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:55:31,997][12645] Avg episode reward: [(0, '0.086')] [2024-06-18 03:55:32,974][12883] Updated weights for policy 0, policy_version 53191 (0.0036) [2024-06-18 03:55:36,836][12883] Updated weights for policy 0, policy_version 53201 (0.0030) [2024-06-18 03:55:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42098.7). Total num frames: 871645184. Throughput: 0: 42019.1. Samples: 871766200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:55:36,999][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 03:55:40,554][12883] Updated weights for policy 0, policy_version 53211 (0.0029) [2024-06-18 03:55:41,994][12645] Fps is (10 sec: 39330.5, 60 sec: 41779.2, 300 sec: 42043.3). 
Total num frames: 871841792. Throughput: 0: 41828.4. Samples: 872009880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:55:41,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 03:55:44,811][12883] Updated weights for policy 0, policy_version 53221 (0.0041) [2024-06-18 03:55:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 872071168. Throughput: 0: 41951.4. Samples: 872136860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 03:55:46,994][12645] Avg episode reward: [(0, '0.143')] [2024-06-18 03:55:48,587][12883] Updated weights for policy 0, policy_version 53231 (0.0033) [2024-06-18 03:55:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 872251392. Throughput: 0: 42069.4. Samples: 872391020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 03:55:51,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 03:55:52,671][12883] Updated weights for policy 0, policy_version 53241 (0.0039) [2024-06-18 03:55:56,060][12883] Updated weights for policy 0, policy_version 53251 (0.0035) [2024-06-18 03:55:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 872464384. Throughput: 0: 41961.4. Samples: 872641920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 03:55:56,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 03:56:00,482][12883] Updated weights for policy 0, policy_version 53261 (0.0026) [2024-06-18 03:56:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 872710144. Throughput: 0: 42120.1. Samples: 872770980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 03:56:01,994][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 03:56:04,107][12883] Updated weights for policy 0, policy_version 53271 (0.0028) [2024-06-18 03:56:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 872906752. 
Throughput: 0: 42008.5. Samples: 873023360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 03:56:06,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 03:56:08,222][12883] Updated weights for policy 0, policy_version 53281 (0.0042) [2024-06-18 03:56:10,230][12862] Signal inference workers to stop experience collection... (12550 times) [2024-06-18 03:56:10,230][12862] Signal inference workers to resume experience collection... (12550 times) [2024-06-18 03:56:10,240][12883] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-06-18 03:56:10,244][12883] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-06-18 03:56:11,789][12883] Updated weights for policy 0, policy_version 53291 (0.0039) [2024-06-18 03:56:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 873119744. Throughput: 0: 41896.4. Samples: 873269060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 03:56:11,994][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 03:56:16,128][12883] Updated weights for policy 0, policy_version 53301 (0.0023) [2024-06-18 03:56:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 873332736. Throughput: 0: 41912.3. Samples: 873395280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 03:56:16,994][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 03:56:19,498][12883] Updated weights for policy 0, policy_version 53311 (0.0032) [2024-06-18 03:56:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 873512960. Throughput: 0: 41878.2. Samples: 873650720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 03:56:21,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 03:56:24,161][12883] Updated weights for policy 0, policy_version 53321 (0.0042) [2024-06-18 03:56:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42154.4). 
Total num frames: 873758720. Throughput: 0: 41748.0. Samples: 873888540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 03:56:26,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 03:56:27,546][12883] Updated weights for policy 0, policy_version 53331 (0.0035) [2024-06-18 03:56:31,970][12883] Updated weights for policy 0, policy_version 53341 (0.0029) [2024-06-18 03:56:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41507.7, 300 sec: 41987.5). Total num frames: 873938944. Throughput: 0: 41921.8. Samples: 874023340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 03:56:31,995][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 03:56:35,927][12883] Updated weights for policy 0, policy_version 53351 (0.0039) [2024-06-18 03:56:36,994][12645] Fps is (10 sec: 37682.9, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 874135552. Throughput: 0: 41823.9. Samples: 874273100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 03:56:36,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 03:56:37,114][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053354_874151936.pth... [2024-06-18 03:56:37,174][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000052738_864059392.pth [2024-06-18 03:56:39,852][12883] Updated weights for policy 0, policy_version 53361 (0.0031) [2024-06-18 03:56:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 874381312. Throughput: 0: 41609.3. Samples: 874514340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 03:56:41,994][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 03:56:43,749][12883] Updated weights for policy 0, policy_version 53371 (0.0041) [2024-06-18 03:56:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 874545152. Throughput: 0: 41852.0. Samples: 874654320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 03:56:46,994][12645] Avg episode reward: [(0, '0.138')] [2024-06-18 03:56:47,678][12883] Updated weights for policy 0, policy_version 53381 (0.0035) [2024-06-18 03:56:51,431][12883] Updated weights for policy 0, policy_version 53391 (0.0029) [2024-06-18 03:56:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 874758144. Throughput: 0: 41579.6. Samples: 874894440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 03:56:51,994][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 03:56:55,396][12883] Updated weights for policy 0, policy_version 53401 (0.0034) [2024-06-18 03:56:56,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 875020288. Throughput: 0: 41729.7. Samples: 875146900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-18 03:56:56,994][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 03:56:59,204][12883] Updated weights for policy 0, policy_version 53411 (0.0044) [2024-06-18 03:57:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 875200512. Throughput: 0: 41925.4. Samples: 875281920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-18 03:57:01,994][12645] Avg episode reward: [(0, '0.124')] [2024-06-18 03:57:03,070][12883] Updated weights for policy 0, policy_version 53421 (0.0046) [2024-06-18 03:57:06,820][12883] Updated weights for policy 0, policy_version 53431 (0.0044) [2024-06-18 03:57:06,996][12645] Fps is (10 sec: 39313.2, 60 sec: 41777.6, 300 sec: 41987.4). Total num frames: 875413504. Throughput: 0: 41695.8. Samples: 875527120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-18 03:57:06,996][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 03:57:10,852][12883] Updated weights for policy 0, policy_version 53441 (0.0035) [2024-06-18 03:57:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41987.5). 
Total num frames: 875626496. Throughput: 0: 42113.9. Samples: 875783660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-18 03:57:11,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 03:57:14,794][12883] Updated weights for policy 0, policy_version 53451 (0.0038) [2024-06-18 03:57:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 875839488. Throughput: 0: 41948.9. Samples: 875911040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-18 03:57:16,994][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 03:57:18,829][12883] Updated weights for policy 0, policy_version 53461 (0.0037) [2024-06-18 03:57:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 876036096. Throughput: 0: 41769.3. Samples: 876152720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-18 03:57:21,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 03:57:22,802][12883] Updated weights for policy 0, policy_version 53471 (0.0040) [2024-06-18 03:57:26,555][12883] Updated weights for policy 0, policy_version 53481 (0.0045) [2024-06-18 03:57:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 876249088. Throughput: 0: 42000.5. Samples: 876404360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-18 03:57:26,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 03:57:30,200][12862] Signal inference workers to stop experience collection... (12600 times) [2024-06-18 03:57:30,201][12862] Signal inference workers to resume experience collection... 
(12600 times) [2024-06-18 03:57:30,231][12883] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-06-18 03:57:30,232][12883] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-06-18 03:57:30,556][12883] Updated weights for policy 0, policy_version 53491 (0.0028) [2024-06-18 03:57:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 876462080. Throughput: 0: 41706.1. Samples: 876531100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:57:31,994][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 03:57:34,168][12883] Updated weights for policy 0, policy_version 53501 (0.0047) [2024-06-18 03:57:36,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 41876.1). Total num frames: 876658688. Throughput: 0: 41867.6. Samples: 876778580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:57:36,996][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 03:57:38,724][12883] Updated weights for policy 0, policy_version 53511 (0.0032) [2024-06-18 03:57:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 876871680. Throughput: 0: 41792.9. Samples: 877027580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:57:41,994][12645] Avg episode reward: [(0, '0.165')] [2024-06-18 03:57:42,107][12883] Updated weights for policy 0, policy_version 53521 (0.0046) [2024-06-18 03:57:46,436][12883] Updated weights for policy 0, policy_version 53531 (0.0034) [2024-06-18 03:57:46,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42052.2, 300 sec: 41932.2). Total num frames: 877068288. Throughput: 0: 41728.3. Samples: 877159700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:57:46,994][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 03:57:49,662][12883] Updated weights for policy 0, policy_version 53541 (0.0032) [2024-06-18 03:57:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 41932.3). 
Total num frames: 877297664. Throughput: 0: 41720.2. Samples: 877404440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:57:51,996][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 03:57:54,542][12883] Updated weights for policy 0, policy_version 53551 (0.0032) [2024-06-18 03:57:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.1, 300 sec: 42154.4). Total num frames: 877527040. Throughput: 0: 41594.4. Samples: 877655420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:57:56,994][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 03:57:57,559][12883] Updated weights for policy 0, policy_version 53561 (0.0032) [2024-06-18 03:58:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 877690880. Throughput: 0: 41585.2. Samples: 877782380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 03:58:01,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 03:58:02,230][12883] Updated weights for policy 0, policy_version 53571 (0.0045) [2024-06-18 03:58:05,205][12883] Updated weights for policy 0, policy_version 53581 (0.0031) [2024-06-18 03:58:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41507.6, 300 sec: 41931.9). Total num frames: 877903872. Throughput: 0: 41675.6. Samples: 878028120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 03:58:06,994][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 03:58:09,837][12883] Updated weights for policy 0, policy_version 53591 (0.0033) [2024-06-18 03:58:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 878133248. Throughput: 0: 41743.1. Samples: 878282800. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 03:58:11,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 03:58:13,054][12883] Updated weights for policy 0, policy_version 53601 (0.0035) [2024-06-18 03:58:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41820.8). 
Total num frames: 878313472. Throughput: 0: 41724.4. Samples: 878408700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 03:58:16,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 03:58:18,084][12883] Updated weights for policy 0, policy_version 53611 (0.0033) [2024-06-18 03:58:20,747][12883] Updated weights for policy 0, policy_version 53621 (0.0039) [2024-06-18 03:58:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 878559232. Throughput: 0: 41666.5. Samples: 878653480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 03:58:21,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 03:58:25,694][12883] Updated weights for policy 0, policy_version 53631 (0.0037) [2024-06-18 03:58:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41779.0, 300 sec: 42043.0). Total num frames: 878755840. Throughput: 0: 41874.1. Samples: 878911920. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 03:58:26,995][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 03:58:28,606][12883] Updated weights for policy 0, policy_version 53641 (0.0039) [2024-06-18 03:58:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41823.2). Total num frames: 878952448. Throughput: 0: 41687.2. Samples: 879035620. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 03:58:31,994][12645] Avg episode reward: [(0, '0.044')] [2024-06-18 03:58:33,441][12883] Updated weights for policy 0, policy_version 53651 (0.0022) [2024-06-18 03:58:36,377][12883] Updated weights for policy 0, policy_version 53661 (0.0031) [2024-06-18 03:58:36,996][12645] Fps is (10 sec: 44227.7, 60 sec: 42325.3, 300 sec: 42042.7). Total num frames: 879198208. Throughput: 0: 41893.0. Samples: 879289720. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 03:58:36,997][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 03:58:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053662_879198208.pth... [2024-06-18 03:58:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053048_869138432.pth [2024-06-18 03:58:41,109][12883] Updated weights for policy 0, policy_version 53671 (0.0030) [2024-06-18 03:58:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 879394816. Throughput: 0: 42128.7. Samples: 879551200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 03:58:41,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 03:58:44,119][12883] Updated weights for policy 0, policy_version 53681 (0.0049) [2024-06-18 03:58:46,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 879591424. Throughput: 0: 41937.0. Samples: 879669540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 03:58:46,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 03:58:47,360][12862] Signal inference workers to stop experience collection... (12650 times) [2024-06-18 03:58:47,370][12883] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-06-18 03:58:47,418][12862] Signal inference workers to resume experience collection... (12650 times) [2024-06-18 03:58:47,419][12883] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-06-18 03:58:48,836][12883] Updated weights for policy 0, policy_version 53691 (0.0039) [2024-06-18 03:58:51,973][12883] Updated weights for policy 0, policy_version 53701 (0.0036) [2024-06-18 03:58:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 879837184. Throughput: 0: 42152.5. Samples: 879924980. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 03:58:51,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 03:58:56,427][12883] Updated weights for policy 0, policy_version 53711 (0.0028) [2024-06-18 03:58:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 880017408. Throughput: 0: 42148.0. Samples: 880179460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 03:58:56,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 03:58:59,646][12883] Updated weights for policy 0, policy_version 53721 (0.0042) [2024-06-18 03:59:01,994][12645] Fps is (10 sec: 36044.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 880197632. Throughput: 0: 42050.7. Samples: 880300980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 03:59:01,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 03:59:04,355][12883] Updated weights for policy 0, policy_version 53731 (0.0036) [2024-06-18 03:59:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 880443392. Throughput: 0: 42250.2. Samples: 880554740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 03:59:06,994][12645] Avg episode reward: [(0, '0.091')] [2024-06-18 03:59:08,152][12883] Updated weights for policy 0, policy_version 53741 (0.0033) [2024-06-18 03:59:11,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 880640000. Throughput: 0: 42155.8. Samples: 880808920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 03:59:11,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 03:59:12,013][12883] Updated weights for policy 0, policy_version 53751 (0.0027) [2024-06-18 03:59:15,918][12883] Updated weights for policy 0, policy_version 53761 (0.0027) [2024-06-18 03:59:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 880852992. Throughput: 0: 42143.0. Samples: 880932060. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 03:59:16,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 03:59:19,703][12883] Updated weights for policy 0, policy_version 53771 (0.0031) [2024-06-18 03:59:21,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 881098752. Throughput: 0: 42181.6. Samples: 881187800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 03:59:21,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 03:59:24,101][12883] Updated weights for policy 0, policy_version 53781 (0.0029) [2024-06-18 03:59:27,000][12645] Fps is (10 sec: 44209.5, 60 sec: 42321.1, 300 sec: 41931.0). Total num frames: 881295360. Throughput: 0: 42057.2. Samples: 881444040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 03:59:27,000][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 03:59:27,286][12883] Updated weights for policy 0, policy_version 53791 (0.0042) [2024-06-18 03:59:31,917][12883] Updated weights for policy 0, policy_version 53801 (0.0038) [2024-06-18 03:59:31,998][12645] Fps is (10 sec: 37667.0, 60 sec: 42049.2, 300 sec: 41931.3). Total num frames: 881475584. Throughput: 0: 42154.1. Samples: 881566660. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 03:59:31,999][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 03:59:35,145][12883] Updated weights for policy 0, policy_version 53811 (0.0032) [2024-06-18 03:59:36,994][12645] Fps is (10 sec: 40985.9, 60 sec: 41780.8, 300 sec: 41931.9). Total num frames: 881704960. Throughput: 0: 42032.5. Samples: 881816440. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 03:59:36,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 03:59:39,536][12883] Updated weights for policy 0, policy_version 53821 (0.0027) [2024-06-18 03:59:41,994][12645] Fps is (10 sec: 44256.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 881917952. Throughput: 0: 42166.7. Samples: 882076960. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 03:59:42,000][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 03:59:42,883][12883] Updated weights for policy 0, policy_version 53831 (0.0030) [2024-06-18 03:59:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 882114560. Throughput: 0: 42151.3. Samples: 882197780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:59:46,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 03:59:47,400][12883] Updated weights for policy 0, policy_version 53841 (0.0040) [2024-06-18 03:59:50,469][12883] Updated weights for policy 0, policy_version 53851 (0.0041) [2024-06-18 03:59:51,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 882360320. Throughput: 0: 42036.2. Samples: 882446460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:59:51,996][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 03:59:55,261][12883] Updated weights for policy 0, policy_version 53861 (0.0028) [2024-06-18 03:59:56,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42052.1, 300 sec: 41931.9). Total num frames: 882540544. Throughput: 0: 42104.2. Samples: 882703620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 03:59:56,994][12645] Avg episode reward: [(0, '0.165')] [2024-06-18 03:59:58,289][12883] Updated weights for policy 0, policy_version 53871 (0.0040) [2024-06-18 04:00:01,994][12645] Fps is (10 sec: 37691.3, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 882737152. Throughput: 0: 42118.3. Samples: 882827380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 04:00:01,994][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 04:00:03,035][12883] Updated weights for policy 0, policy_version 53881 (0.0033) [2024-06-18 04:00:05,909][12883] Updated weights for policy 0, policy_version 53891 (0.0032) [2024-06-18 04:00:06,994][12645] Fps is (10 sec: 44238.1, 60 sec: 42325.4, 300 sec: 42098.6). 
Total num frames: 882982912. Throughput: 0: 42041.5. Samples: 883079660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 04:00:06,994][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 04:00:10,456][12862] Signal inference workers to stop experience collection... (12700 times) [2024-06-18 04:00:10,456][12862] Signal inference workers to resume experience collection... (12700 times) [2024-06-18 04:00:10,505][12883] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-06-18 04:00:10,505][12883] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-06-18 04:00:10,786][12883] Updated weights for policy 0, policy_version 53901 (0.0032) [2024-06-18 04:00:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 883163136. Throughput: 0: 42200.5. Samples: 883342800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 04:00:11,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 04:00:13,523][12883] Updated weights for policy 0, policy_version 53911 (0.0030) [2024-06-18 04:00:16,994][12645] Fps is (10 sec: 37682.6, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 883359744. Throughput: 0: 42150.7. Samples: 883463260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 04:00:17,000][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 04:00:18,496][12883] Updated weights for policy 0, policy_version 53921 (0.0051) [2024-06-18 04:00:21,201][12883] Updated weights for policy 0, policy_version 53931 (0.0033) [2024-06-18 04:00:21,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 883638272. Throughput: 0: 42258.1. Samples: 883718060. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:00:21,995][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 04:00:26,410][12883] Updated weights for policy 0, policy_version 53941 (0.0042) [2024-06-18 04:00:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41510.5, 300 sec: 41821.2). Total num frames: 883785728. Throughput: 0: 42159.1. Samples: 883974120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:00:26,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 04:00:28,894][12883] Updated weights for policy 0, policy_version 53951 (0.0023) [2024-06-18 04:00:31,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42328.5, 300 sec: 41931.9). Total num frames: 884015104. Throughput: 0: 42031.1. Samples: 884089180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:00:31,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 04:00:34,205][12883] Updated weights for policy 0, policy_version 53961 (0.0046) [2024-06-18 04:00:36,802][12883] Updated weights for policy 0, policy_version 53971 (0.0033) [2024-06-18 04:00:36,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 884260864. Throughput: 0: 42277.1. Samples: 884348840. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:00:36,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 04:00:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053971_884260864.pth... [2024-06-18 04:00:37,060][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053354_874151936.pth [2024-06-18 04:00:41,937][12883] Updated weights for policy 0, policy_version 53981 (0.0044) [2024-06-18 04:00:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 884424704. Throughput: 0: 42378.8. Samples: 884610660. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:00:41,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 04:00:44,574][12883] Updated weights for policy 0, policy_version 53991 (0.0027) [2024-06-18 04:00:46,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42596.8, 300 sec: 42098.2). Total num frames: 884670464. Throughput: 0: 42219.3. Samples: 884727340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:00:46,997][12645] Avg episode reward: [(0, '0.174')] [2024-06-18 04:00:49,482][12883] Updated weights for policy 0, policy_version 54001 (0.0038) [2024-06-18 04:00:51,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42326.8, 300 sec: 42154.1). Total num frames: 884899840. Throughput: 0: 42418.1. Samples: 884988480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 04:00:51,994][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 04:00:52,242][12883] Updated weights for policy 0, policy_version 54011 (0.0041) [2024-06-18 04:00:56,994][12645] Fps is (10 sec: 36053.3, 60 sec: 41506.4, 300 sec: 41765.3). Total num frames: 885030912. Throughput: 0: 42483.7. Samples: 885254560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 04:00:56,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 04:00:57,311][12883] Updated weights for policy 0, policy_version 54021 (0.0042) [2024-06-18 04:01:00,027][12883] Updated weights for policy 0, policy_version 54031 (0.0039) [2024-06-18 04:01:01,999][12645] Fps is (10 sec: 40938.3, 60 sec: 42867.7, 300 sec: 42042.2). Total num frames: 885309440. Throughput: 0: 42199.5. Samples: 885362460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 04:01:01,999][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 04:01:04,855][12883] Updated weights for policy 0, policy_version 54041 (0.0028) [2024-06-18 04:01:06,266][12862] Signal inference workers to stop experience collection... 
(12750 times) [2024-06-18 04:01:06,266][12862] Signal inference workers to resume experience collection... (12750 times) [2024-06-18 04:01:06,277][12883] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-06-18 04:01:06,277][12883] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-06-18 04:01:06,994][12645] Fps is (10 sec: 49152.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 885522432. Throughput: 0: 42466.8. Samples: 885629060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 04:01:06,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 04:01:07,667][12883] Updated weights for policy 0, policy_version 54051 (0.0030) [2024-06-18 04:01:11,994][12645] Fps is (10 sec: 39342.6, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 885702656. Throughput: 0: 42635.5. Samples: 885892720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 04:01:11,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 04:01:12,523][12883] Updated weights for policy 0, policy_version 54061 (0.0023) [2024-06-18 04:01:15,245][12883] Updated weights for policy 0, policy_version 54071 (0.0027) [2024-06-18 04:01:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 42209.6). Total num frames: 885964800. Throughput: 0: 42640.8. Samples: 886008020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 04:01:16,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 04:01:20,348][12883] Updated weights for policy 0, policy_version 54081 (0.0046) [2024-06-18 04:01:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 886145024. Throughput: 0: 42565.9. Samples: 886264300. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 25.0) [2024-06-18 04:01:21,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 04:01:23,145][12883] Updated weights for policy 0, policy_version 54091 (0.0052) [2024-06-18 04:01:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 886341632. Throughput: 0: 42450.7. Samples: 886520940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 04:01:26,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 04:01:28,051][12883] Updated weights for policy 0, policy_version 54101 (0.0042) [2024-06-18 04:01:30,964][12883] Updated weights for policy 0, policy_version 54111 (0.0029) [2024-06-18 04:01:31,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 886587392. Throughput: 0: 42641.6. Samples: 886646120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 04:01:31,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 04:01:35,619][12883] Updated weights for policy 0, policy_version 54121 (0.0022) [2024-06-18 04:01:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 886751232. Throughput: 0: 42377.4. Samples: 886895460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 04:01:36,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 04:01:39,094][12883] Updated weights for policy 0, policy_version 54131 (0.0028) [2024-06-18 04:01:42,000][12645] Fps is (10 sec: 37660.1, 60 sec: 42320.9, 300 sec: 42097.7). Total num frames: 886964224. Throughput: 0: 42030.1. Samples: 887146180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 04:01:42,001][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 04:01:43,351][12883] Updated weights for policy 0, policy_version 54141 (0.0033) [2024-06-18 04:01:46,905][12883] Updated weights for policy 0, policy_version 54151 (0.0028) [2024-06-18 04:01:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42326.9, 300 sec: 42209.6). 
Total num frames: 887209984. Throughput: 0: 42553.5. Samples: 887277140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 04:01:46,994][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 04:01:50,929][12883] Updated weights for policy 0, policy_version 54161 (0.0032) [2024-06-18 04:01:51,994][12645] Fps is (10 sec: 42624.7, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 887390208. Throughput: 0: 42190.5. Samples: 887527640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 04:01:51,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 04:01:54,828][12883] Updated weights for policy 0, policy_version 54171 (0.0038) [2024-06-18 04:01:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42098.5). Total num frames: 887619584. Throughput: 0: 41782.6. Samples: 887772940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 04:01:56,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 04:01:58,880][12883] Updated weights for policy 0, policy_version 54181 (0.0046) [2024-06-18 04:02:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42056.1, 300 sec: 42098.9). Total num frames: 887832576. Throughput: 0: 42229.4. Samples: 887908340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 04:02:01,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 04:02:02,408][12883] Updated weights for policy 0, policy_version 54191 (0.0039) [2024-06-18 04:02:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 888012800. Throughput: 0: 41812.8. Samples: 888145880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 04:02:06,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 04:02:07,186][12883] Updated weights for policy 0, policy_version 54201 (0.0033) [2024-06-18 04:02:10,075][12862] Signal inference workers to stop experience collection... (12800 times) [2024-06-18 04:02:10,076][12862] Signal inference workers to resume experience collection... 
(12800 times) [2024-06-18 04:02:10,097][12883] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-06-18 04:02:10,097][12883] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-06-18 04:02:10,238][12883] Updated weights for policy 0, policy_version 54211 (0.0024) [2024-06-18 04:02:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 888258560. Throughput: 0: 41661.3. Samples: 888395700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 04:02:11,994][12645] Avg episode reward: [(0, '0.234')] [2024-06-18 04:02:14,895][12883] Updated weights for policy 0, policy_version 54221 (0.0041) [2024-06-18 04:02:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 41232.9, 300 sec: 42043.0). Total num frames: 888438784. Throughput: 0: 41715.5. Samples: 888523320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 04:02:16,995][12645] Avg episode reward: [(0, '0.145')] [2024-06-18 04:02:18,503][12883] Updated weights for policy 0, policy_version 54231 (0.0035) [2024-06-18 04:02:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 888668160. Throughput: 0: 41732.6. Samples: 888773420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 04:02:21,994][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 04:02:22,501][12883] Updated weights for policy 0, policy_version 54241 (0.0057) [2024-06-18 04:02:26,472][12883] Updated weights for policy 0, policy_version 54251 (0.0045) [2024-06-18 04:02:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 888864768. Throughput: 0: 41810.1. Samples: 889027380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 04:02:26,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 04:02:30,217][12883] Updated weights for policy 0, policy_version 54261 (0.0033) [2024-06-18 04:02:31,994][12645] Fps is (10 sec: 40959.1, 60 sec: 41506.1, 300 sec: 42098.9). 
Total num frames: 889077760. Throughput: 0: 41606.6. Samples: 889149440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:02:31,994][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 04:02:34,637][12883] Updated weights for policy 0, policy_version 54271 (0.0037) [2024-06-18 04:02:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 889290752. Throughput: 0: 41619.1. Samples: 889400500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:02:36,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 04:02:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054278_889290752.pth... [2024-06-18 04:02:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053662_879198208.pth [2024-06-18 04:02:38,079][12883] Updated weights for policy 0, policy_version 54281 (0.0034) [2024-06-18 04:02:41,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42055.1, 300 sec: 42098.2). Total num frames: 889487360. Throughput: 0: 41904.6. Samples: 889658740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:02:41,997][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 04:02:42,393][12883] Updated weights for policy 0, policy_version 54291 (0.0028) [2024-06-18 04:02:45,674][12883] Updated weights for policy 0, policy_version 54301 (0.0032) [2024-06-18 04:02:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 889700352. Throughput: 0: 41694.1. Samples: 889784580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:02:46,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 04:02:50,094][12883] Updated weights for policy 0, policy_version 54311 (0.0040) [2024-06-18 04:02:51,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 889929728. Throughput: 0: 42003.1. Samples: 890036020. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:02:51,994][12645] Avg episode reward: [(0, '0.086')] [2024-06-18 04:02:53,550][12883] Updated weights for policy 0, policy_version 54321 (0.0034) [2024-06-18 04:02:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 890142720. Throughput: 0: 42139.5. Samples: 890291980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:02:56,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 04:02:57,718][12883] Updated weights for policy 0, policy_version 54331 (0.0031) [2024-06-18 04:03:01,168][12883] Updated weights for policy 0, policy_version 54341 (0.0036) [2024-06-18 04:03:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 890355712. Throughput: 0: 42093.1. Samples: 890417500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:03:01,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 04:03:05,591][12883] Updated weights for policy 0, policy_version 54351 (0.0027) [2024-06-18 04:03:06,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 890568704. Throughput: 0: 42144.5. Samples: 890670020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 04:03:06,997][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 04:03:08,893][12883] Updated weights for policy 0, policy_version 54361 (0.0034) [2024-06-18 04:03:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 890765312. Throughput: 0: 42045.0. Samples: 890919400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 04:03:11,994][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 04:03:13,516][12883] Updated weights for policy 0, policy_version 54371 (0.0028) [2024-06-18 04:03:16,695][12883] Updated weights for policy 0, policy_version 54381 (0.0038) [2024-06-18 04:03:16,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42598.5, 300 sec: 42154.1). 
Total num frames: 890994688. Throughput: 0: 42112.9. Samples: 891044520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 04:03:16,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 04:03:21,258][12883] Updated weights for policy 0, policy_version 54391 (0.0037) [2024-06-18 04:03:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 891174912. Throughput: 0: 42175.2. Samples: 891298380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 04:03:21,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 04:03:24,414][12883] Updated weights for policy 0, policy_version 54401 (0.0042) [2024-06-18 04:03:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 891387904. Throughput: 0: 42199.5. Samples: 891557620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 04:03:26,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 04:03:29,023][12883] Updated weights for policy 0, policy_version 54411 (0.0030) [2024-06-18 04:03:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42098.9). Total num frames: 891617280. Throughput: 0: 42096.6. Samples: 891678920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 04:03:31,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 04:03:32,105][12883] Updated weights for policy 0, policy_version 54421 (0.0038) [2024-06-18 04:03:36,780][12883] Updated weights for policy 0, policy_version 54431 (0.0035) [2024-06-18 04:03:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 891813888. Throughput: 0: 42055.5. Samples: 891928520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 04:03:36,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 04:03:38,159][12862] Signal inference workers to stop experience collection... (12850 times) [2024-06-18 04:03:38,160][12862] Signal inference workers to resume experience collection... 
(12850 times) [2024-06-18 04:03:38,180][12883] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-06-18 04:03:38,181][12883] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-06-18 04:03:39,845][12883] Updated weights for policy 0, policy_version 54441 (0.0038) [2024-06-18 04:03:41,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42325.3, 300 sec: 42153.8). Total num frames: 892026880. Throughput: 0: 42134.8. Samples: 892188140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:03:41,997][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 04:03:44,484][12883] Updated weights for policy 0, policy_version 54451 (0.0027) [2024-06-18 04:03:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 892239872. Throughput: 0: 42081.0. Samples: 892311140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:03:46,994][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 04:03:47,824][12883] Updated weights for policy 0, policy_version 54461 (0.0039) [2024-06-18 04:03:51,994][12645] Fps is (10 sec: 40968.8, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 892436480. Throughput: 0: 42025.1. Samples: 892561060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:03:51,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 04:03:52,059][12883] Updated weights for policy 0, policy_version 54471 (0.0033) [2024-06-18 04:03:55,848][12883] Updated weights for policy 0, policy_version 54481 (0.0036) [2024-06-18 04:03:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 892649472. Throughput: 0: 42245.2. Samples: 892820440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:03:56,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 04:04:00,165][12883] Updated weights for policy 0, policy_version 54491 (0.0044) [2024-06-18 04:04:01,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.3, 300 sec: 42098.6). 
Total num frames: 892862464. Throughput: 0: 42248.6. Samples: 892945700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:01,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 04:04:03,526][12883] Updated weights for policy 0, policy_version 54501 (0.0038) [2024-06-18 04:04:06,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41780.7, 300 sec: 42154.1). Total num frames: 893075456. Throughput: 0: 42171.5. Samples: 893196100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:06,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 04:04:07,941][12883] Updated weights for policy 0, policy_version 54511 (0.0039) [2024-06-18 04:04:11,422][12883] Updated weights for policy 0, policy_version 54521 (0.0038) [2024-06-18 04:04:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 893288448. Throughput: 0: 42072.0. Samples: 893450860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:11,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 04:04:15,450][12883] Updated weights for policy 0, policy_version 54531 (0.0023) [2024-06-18 04:04:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 893517824. Throughput: 0: 42221.7. Samples: 893578900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:16,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 04:04:19,158][12883] Updated weights for policy 0, policy_version 54541 (0.0030) [2024-06-18 04:04:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42099.4). Total num frames: 893714432. Throughput: 0: 42287.1. Samples: 893831440. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:21,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 04:04:22,995][12883] Updated weights for policy 0, policy_version 54551 (0.0032) [2024-06-18 04:04:26,885][12883] Updated weights for policy 0, policy_version 54561 (0.0038) [2024-06-18 04:04:26,994][12645] Fps is (10 sec: 40957.5, 60 sec: 42324.9, 300 sec: 42210.2). Total num frames: 893927424. Throughput: 0: 42130.0. Samples: 894083920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:26,995][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 04:04:31,052][12883] Updated weights for policy 0, policy_version 54571 (0.0031) [2024-06-18 04:04:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 894156800. Throughput: 0: 42219.9. Samples: 894211040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:31,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 04:04:34,601][12883] Updated weights for policy 0, policy_version 54581 (0.0034) [2024-06-18 04:04:36,994][12645] Fps is (10 sec: 39323.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 894320640. Throughput: 0: 42180.1. Samples: 894459160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:36,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 04:04:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054586_894337024.pth... [2024-06-18 04:04:37,119][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000053971_884260864.pth [2024-06-18 04:04:38,507][12883] Updated weights for policy 0, policy_version 54591 (0.0031) [2024-06-18 04:04:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 894550016. Throughput: 0: 42132.6. Samples: 894716400. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:41,994][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 04:04:42,543][12883] Updated weights for policy 0, policy_version 54601 (0.0037) [2024-06-18 04:04:45,996][12883] Updated weights for policy 0, policy_version 54611 (0.0036) [2024-06-18 04:04:46,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 42098.9). Total num frames: 894779392. Throughput: 0: 42233.4. Samples: 894846200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 04:04:46,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 04:04:50,341][12883] Updated weights for policy 0, policy_version 54621 (0.0034) [2024-06-18 04:04:51,994][12645] Fps is (10 sec: 42597.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 894976000. Throughput: 0: 42271.0. Samples: 895098300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 04:04:51,995][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 04:04:53,879][12883] Updated weights for policy 0, policy_version 54631 (0.0031) [2024-06-18 04:04:56,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.9, 300 sec: 42209.3). Total num frames: 895188992. Throughput: 0: 42073.5. Samples: 895344260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 04:04:56,996][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 04:04:58,065][12883] Updated weights for policy 0, policy_version 54641 (0.0030) [2024-06-18 04:05:01,407][12883] Updated weights for policy 0, policy_version 54651 (0.0041) [2024-06-18 04:05:01,996][12645] Fps is (10 sec: 42589.6, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 895401984. Throughput: 0: 42253.9. Samples: 895480420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 04:05:01,997][12645] Avg episode reward: [(0, '0.174')] [2024-06-18 04:05:04,900][12862] Signal inference workers to stop experience collection... 
(12900 times) [2024-06-18 04:05:04,901][12862] Signal inference workers to resume experience collection... (12900 times) [2024-06-18 04:05:04,955][12883] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-06-18 04:05:04,955][12883] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-06-18 04:05:05,802][12883] Updated weights for policy 0, policy_version 54661 (0.0026) [2024-06-18 04:05:06,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 895598592. Throughput: 0: 42264.0. Samples: 895733320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 04:05:06,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 04:05:09,235][12883] Updated weights for policy 0, policy_version 54671 (0.0034) [2024-06-18 04:05:11,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 895811584. Throughput: 0: 42305.9. Samples: 895987660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 04:05:11,994][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 04:05:13,483][12883] Updated weights for policy 0, policy_version 54681 (0.0037) [2024-06-18 04:05:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 896024576. Throughput: 0: 42210.3. Samples: 896110500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 04:05:16,994][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 04:05:17,444][12883] Updated weights for policy 0, policy_version 54691 (0.0029) [2024-06-18 04:05:21,141][12883] Updated weights for policy 0, policy_version 54701 (0.0039) [2024-06-18 04:05:21,996][12645] Fps is (10 sec: 40950.6, 60 sec: 41777.6, 300 sec: 42153.8). Total num frames: 896221184. Throughput: 0: 42109.0. Samples: 896354160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 04:05:21,997][12645] Avg episode reward: [(0, '0.083')] [2024-06-18 04:05:25,102][12883] Updated weights for policy 0, policy_version 54711 (0.0031) [2024-06-18 04:05:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.7, 300 sec: 42154.1). Total num frames: 896450560. Throughput: 0: 42093.7. Samples: 896610620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:05:26,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 04:05:28,713][12883] Updated weights for policy 0, policy_version 54721 (0.0025) [2024-06-18 04:05:31,996][12645] Fps is (10 sec: 42598.4, 60 sec: 41504.6, 300 sec: 41987.2). Total num frames: 896647168. Throughput: 0: 42136.0. Samples: 896742420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:05:31,997][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 04:05:32,820][12883] Updated weights for policy 0, policy_version 54731 (0.0026) [2024-06-18 04:05:36,247][12883] Updated weights for policy 0, policy_version 54741 (0.0033) [2024-06-18 04:05:36,993][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42209.7). Total num frames: 896876544. Throughput: 0: 41822.1. Samples: 896980280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:05:36,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 04:05:40,494][12883] Updated weights for policy 0, policy_version 54751 (0.0043) [2024-06-18 04:05:41,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.2, 300 sec: 42043.3). Total num frames: 897073152. Throughput: 0: 42087.4. Samples: 897238100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:05:41,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 04:05:44,181][12883] Updated weights for policy 0, policy_version 54761 (0.0034) [2024-06-18 04:05:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41932.0). Total num frames: 897269760. Throughput: 0: 41891.5. Samples: 897365440. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:05:46,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 04:05:48,044][12883] Updated weights for policy 0, policy_version 54771 (0.0037) [2024-06-18 04:05:51,659][12883] Updated weights for policy 0, policy_version 54781 (0.0025) [2024-06-18 04:05:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.6, 300 sec: 42376.2). Total num frames: 897531904. Throughput: 0: 42010.7. Samples: 897623800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:05:51,994][12645] Avg episode reward: [(0, '0.078')] [2024-06-18 04:05:53,487][12862] Signal inference workers to stop experience collection... (12950 times) [2024-06-18 04:05:53,543][12862] Signal inference workers to resume experience collection... (12950 times) [2024-06-18 04:05:53,548][12883] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-06-18 04:05:53,572][12883] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-06-18 04:05:55,787][12883] Updated weights for policy 0, policy_version 54791 (0.0037) [2024-06-18 04:05:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42053.8, 300 sec: 42043.8). Total num frames: 897712128. Throughput: 0: 42066.6. Samples: 897880660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 04:05:56,994][12645] Avg episode reward: [(0, '0.067')] [2024-06-18 04:05:59,249][12883] Updated weights for policy 0, policy_version 54801 (0.0035) [2024-06-18 04:06:01,993][12645] Fps is (10 sec: 36045.3, 60 sec: 41507.8, 300 sec: 41931.9). Total num frames: 897892352. Throughput: 0: 41964.5. Samples: 897998900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 04:06:01,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 04:06:03,828][12883] Updated weights for policy 0, policy_version 54811 (0.0032) [2024-06-18 04:06:06,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 898170880. Throughput: 0: 42282.9. 
Samples: 898256800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 04:06:06,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 04:06:07,008][12883] Updated weights for policy 0, policy_version 54821 (0.0024) [2024-06-18 04:06:11,597][12883] Updated weights for policy 0, policy_version 54831 (0.0040) [2024-06-18 04:06:12,000][12645] Fps is (10 sec: 45845.9, 60 sec: 42320.9, 300 sec: 41986.6). Total num frames: 898351104. Throughput: 0: 42116.8. Samples: 898506140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 04:06:12,000][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 04:06:15,775][12883] Updated weights for policy 0, policy_version 54841 (0.0033) [2024-06-18 04:06:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 898547712. Throughput: 0: 41921.2. Samples: 898628780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 04:06:16,999][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 04:06:19,555][12883] Updated weights for policy 0, policy_version 54851 (0.0030) [2024-06-18 04:06:21,993][12645] Fps is (10 sec: 42625.6, 60 sec: 42600.1, 300 sec: 42154.1). Total num frames: 898777088. Throughput: 0: 42452.9. Samples: 898890660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 04:06:21,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 04:06:21,995][12862] Saving new best policy, reward=0.447! [2024-06-18 04:06:23,994][12883] Updated weights for policy 0, policy_version 54861 (0.0026) [2024-06-18 04:06:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 898990080. Throughput: 0: 42271.1. Samples: 899140300. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 04:06:26,994][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 04:06:27,652][12883] Updated weights for policy 0, policy_version 54871 (0.0031) [2024-06-18 04:06:31,695][12883] Updated weights for policy 0, policy_version 54881 (0.0029) [2024-06-18 04:06:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42053.9, 300 sec: 42098.6). Total num frames: 899170304. Throughput: 0: 42305.7. Samples: 899269200. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-06-18 04:06:31,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 04:06:35,095][12883] Updated weights for policy 0, policy_version 54891 (0.0029) [2024-06-18 04:06:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42210.5). Total num frames: 899416064. Throughput: 0: 42260.9. Samples: 899525540. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-06-18 04:06:36,994][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 04:06:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054896_899416064.pth... [2024-06-18 04:06:37,054][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054278_889290752.pth [2024-06-18 04:06:39,135][12883] Updated weights for policy 0, policy_version 54901 (0.0034) [2024-06-18 04:06:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 899612672. Throughput: 0: 42178.7. Samples: 899778700. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-06-18 04:06:41,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 04:06:42,792][12883] Updated weights for policy 0, policy_version 54911 (0.0026) [2024-06-18 04:06:46,668][12883] Updated weights for policy 0, policy_version 54921 (0.0033) [2024-06-18 04:06:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 899825664. Throughput: 0: 42382.1. Samples: 899906100. 
Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-06-18 04:06:46,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 04:06:50,687][12883] Updated weights for policy 0, policy_version 54931 (0.0039) [2024-06-18 04:06:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 900055040. Throughput: 0: 42395.2. Samples: 900164580. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-06-18 04:06:51,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 04:06:54,351][12883] Updated weights for policy 0, policy_version 54941 (0.0033) [2024-06-18 04:06:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 900251648. Throughput: 0: 42465.5. Samples: 900416820. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-06-18 04:06:56,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 04:06:58,368][12883] Updated weights for policy 0, policy_version 54951 (0.0030) [2024-06-18 04:07:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 900464640. Throughput: 0: 42545.9. Samples: 900543340. Policy #0 lag: (min: 1.0, avg: 12.7, max: 25.0) [2024-06-18 04:07:01,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 04:07:02,227][12883] Updated weights for policy 0, policy_version 54961 (0.0048) [2024-06-18 04:07:06,171][12883] Updated weights for policy 0, policy_version 54971 (0.0050) [2024-06-18 04:07:06,996][12645] Fps is (10 sec: 42588.1, 60 sec: 41777.7, 300 sec: 42098.2). Total num frames: 900677632. Throughput: 0: 42401.7. Samples: 900798840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:07:06,997][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 04:07:09,770][12883] Updated weights for policy 0, policy_version 54981 (0.0023) [2024-06-18 04:07:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.8, 300 sec: 42209.7). Total num frames: 900890624. Throughput: 0: 42367.7. Samples: 901046840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:07:11,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 04:07:13,700][12883] Updated weights for policy 0, policy_version 54991 (0.0032) [2024-06-18 04:07:16,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 901103616. Throughput: 0: 42273.3. Samples: 901171500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:07:16,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 04:07:17,801][12883] Updated weights for policy 0, policy_version 55001 (0.0039) [2024-06-18 04:07:18,561][12862] Signal inference workers to stop experience collection... (13000 times) [2024-06-18 04:07:18,562][12862] Signal inference workers to resume experience collection... (13000 times) [2024-06-18 04:07:18,577][12883] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-06-18 04:07:18,577][12883] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-06-18 04:07:21,599][12883] Updated weights for policy 0, policy_version 55011 (0.0029) [2024-06-18 04:07:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42209.7). Total num frames: 901316608. Throughput: 0: 42389.0. Samples: 901433040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:07:21,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 04:07:25,472][12883] Updated weights for policy 0, policy_version 55021 (0.0046) [2024-06-18 04:07:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42209.7). Total num frames: 901529600. Throughput: 0: 42253.3. Samples: 901680100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:07:26,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 04:07:29,351][12883] Updated weights for policy 0, policy_version 55031 (0.0032) [2024-06-18 04:07:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 901742592. Throughput: 0: 42283.6. 
Samples: 901808860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:07:31,994][12645] Avg episode reward: [(0, '0.228')] [2024-06-18 04:07:33,395][12883] Updated weights for policy 0, policy_version 55041 (0.0032) [2024-06-18 04:07:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42210.0). Total num frames: 901939200. Throughput: 0: 42211.7. Samples: 902064100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:07:36,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 04:07:37,015][12883] Updated weights for policy 0, policy_version 55051 (0.0028) [2024-06-18 04:07:41,163][12883] Updated weights for policy 0, policy_version 55061 (0.0037) [2024-06-18 04:07:41,993][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 902152192. Throughput: 0: 42210.3. Samples: 902316280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 04:07:41,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 04:07:44,877][12883] Updated weights for policy 0, policy_version 55071 (0.0028) [2024-06-18 04:07:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 902365184. Throughput: 0: 42257.8. Samples: 902444940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 04:07:46,994][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 04:07:48,951][12883] Updated weights for policy 0, policy_version 55081 (0.0038) [2024-06-18 04:07:51,996][12645] Fps is (10 sec: 44226.2, 60 sec: 42323.8, 300 sec: 42209.3). Total num frames: 902594560. Throughput: 0: 42182.3. Samples: 902697040. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 04:07:51,996][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 04:07:52,650][12883] Updated weights for policy 0, policy_version 55091 (0.0029) [2024-06-18 04:07:56,707][12883] Updated weights for policy 0, policy_version 55101 (0.0041) [2024-06-18 04:07:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 902774784. Throughput: 0: 42381.4. Samples: 902954000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 04:07:56,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 04:08:00,373][12883] Updated weights for policy 0, policy_version 55111 (0.0038) [2024-06-18 04:08:01,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42325.3, 300 sec: 42154.4). Total num frames: 903004160. Throughput: 0: 42334.7. Samples: 903076560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 04:08:01,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 04:08:04,210][12883] Updated weights for policy 0, policy_version 55121 (0.0029) [2024-06-18 04:08:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 903200768. Throughput: 0: 42065.2. Samples: 903325980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 04:08:06,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 04:08:08,109][12883] Updated weights for policy 0, policy_version 55131 (0.0048) [2024-06-18 04:08:11,993][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 903413760. Throughput: 0: 42256.5. Samples: 903581640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 04:08:11,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 04:08:12,055][12883] Updated weights for policy 0, policy_version 55141 (0.0030) [2024-06-18 04:08:16,193][12883] Updated weights for policy 0, policy_version 55151 (0.0033) [2024-06-18 04:08:16,994][12645] Fps is (10 sec: 42595.8, 60 sec: 42051.8, 300 sec: 42209.5). 
Total num frames: 903626752. Throughput: 0: 42186.5. Samples: 903707280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:08:16,995][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 04:08:19,586][12883] Updated weights for policy 0, policy_version 55161 (0.0027) [2024-06-18 04:08:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 903839744. Throughput: 0: 42156.0. Samples: 903961120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:08:21,994][12645] Avg episode reward: [(0, '0.097')] [2024-06-18 04:08:23,646][12883] Updated weights for policy 0, policy_version 55171 (0.0045) [2024-06-18 04:08:26,994][12645] Fps is (10 sec: 44239.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 904069120. Throughput: 0: 42126.2. Samples: 904211960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:08:26,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 04:08:27,316][12883] Updated weights for policy 0, policy_version 55181 (0.0025) [2024-06-18 04:08:31,809][12883] Updated weights for policy 0, policy_version 55191 (0.0040) [2024-06-18 04:08:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 904249344. Throughput: 0: 42108.0. Samples: 904339800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:08:31,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 04:08:35,112][12883] Updated weights for policy 0, policy_version 55201 (0.0026) [2024-06-18 04:08:36,996][12645] Fps is (10 sec: 39312.2, 60 sec: 42050.6, 300 sec: 42154.1). Total num frames: 904462336. Throughput: 0: 42059.5. Samples: 904589720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:08:36,996][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 04:08:37,055][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055205_904478720.pth... 
[2024-06-18 04:08:37,130][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054586_894337024.pth [2024-06-18 04:08:39,591][12883] Updated weights for policy 0, policy_version 55211 (0.0039) [2024-06-18 04:08:41,996][12645] Fps is (10 sec: 44226.2, 60 sec: 42323.6, 300 sec: 42209.3). Total num frames: 904691712. Throughput: 0: 41796.0. Samples: 904834920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:08:41,997][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 04:08:43,007][12883] Updated weights for policy 0, policy_version 55221 (0.0028) [2024-06-18 04:08:46,996][12645] Fps is (10 sec: 42598.4, 60 sec: 42050.6, 300 sec: 42209.3). Total num frames: 904888320. Throughput: 0: 42075.1. Samples: 904970040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:08:46,996][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 04:08:47,251][12883] Updated weights for policy 0, policy_version 55231 (0.0041) [2024-06-18 04:08:47,757][12862] Signal inference workers to stop experience collection... (13050 times) [2024-06-18 04:08:47,815][12883] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-06-18 04:08:47,874][12862] Signal inference workers to resume experience collection... (13050 times) [2024-06-18 04:08:47,874][12883] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-06-18 04:08:51,127][12883] Updated weights for policy 0, policy_version 55241 (0.0029) [2024-06-18 04:08:51,996][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42209.3). Total num frames: 905101312. Throughput: 0: 42015.2. Samples: 905216760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:08:51,997][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 04:08:55,258][12883] Updated weights for policy 0, policy_version 55251 (0.0027) [2024-06-18 04:08:56,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 905314304. Throughput: 0: 41921.7. 
Samples: 905468120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:08:56,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 04:08:58,931][12883] Updated weights for policy 0, policy_version 55261 (0.0054) [2024-06-18 04:09:01,996][12645] Fps is (10 sec: 40959.6, 60 sec: 41777.5, 300 sec: 42153.8). Total num frames: 905510912. Throughput: 0: 41975.3. Samples: 905596240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:09:01,997][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 04:09:03,241][12883] Updated weights for policy 0, policy_version 55271 (0.0032) [2024-06-18 04:09:06,564][12883] Updated weights for policy 0, policy_version 55281 (0.0053) [2024-06-18 04:09:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 905740288. Throughput: 0: 41958.1. Samples: 905849240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:09:06,994][12645] Avg episode reward: [(0, '0.091')] [2024-06-18 04:09:10,796][12883] Updated weights for policy 0, policy_version 55291 (0.0037) [2024-06-18 04:09:11,993][12645] Fps is (10 sec: 44247.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 905953280. Throughput: 0: 42051.6. Samples: 906104280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:09:11,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 04:09:14,313][12883] Updated weights for policy 0, policy_version 55301 (0.0041) [2024-06-18 04:09:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.7, 300 sec: 42154.1). Total num frames: 906149888. Throughput: 0: 42137.3. Samples: 906235980. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:09:16,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 04:09:18,252][12883] Updated weights for policy 0, policy_version 55311 (0.0049) [2024-06-18 04:09:21,996][12645] Fps is (10 sec: 40950.3, 60 sec: 42050.6, 300 sec: 42153.9). Total num frames: 906362880. Throughput: 0: 42064.9. Samples: 906482640. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:09:21,997][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 04:09:22,346][12883] Updated weights for policy 0, policy_version 55321 (0.0027) [2024-06-18 04:09:25,877][12883] Updated weights for policy 0, policy_version 55331 (0.0027) [2024-06-18 04:09:26,993][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 906575872. Throughput: 0: 42386.3. Samples: 906742200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:09:26,994][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 04:09:30,004][12883] Updated weights for policy 0, policy_version 55341 (0.0037) [2024-06-18 04:09:31,993][12645] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 906788864. Throughput: 0: 42302.3. Samples: 906873540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:09:31,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 04:09:33,424][12883] Updated weights for policy 0, policy_version 55351 (0.0032) [2024-06-18 04:09:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42327.0, 300 sec: 42209.6). Total num frames: 907001856. Throughput: 0: 42443.5. Samples: 907126620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:09:36,994][12645] Avg episode reward: [(0, '0.126')] [2024-06-18 04:09:37,558][12883] Updated weights for policy 0, policy_version 55361 (0.0025) [2024-06-18 04:09:41,004][12883] Updated weights for policy 0, policy_version 55371 (0.0041) [2024-06-18 04:09:42,000][12645] Fps is (10 sec: 44208.5, 60 sec: 42322.5, 300 sec: 42208.7). Total num frames: 907231232. Throughput: 0: 42553.2. Samples: 907383280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:09:42,000][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 04:09:45,180][12883] Updated weights for policy 0, policy_version 55381 (0.0041) [2024-06-18 04:09:46,997][12645] Fps is (10 sec: 42582.3, 60 sec: 42324.3, 300 sec: 42209.1). 
Total num frames: 907427840. Throughput: 0: 42563.5. Samples: 907511660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:09:46,998][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 04:09:48,776][12883] Updated weights for policy 0, policy_version 55391 (0.0029) [2024-06-18 04:09:51,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42327.0, 300 sec: 42209.9). Total num frames: 907640832. Throughput: 0: 42618.7. Samples: 907767080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:09:51,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 04:09:52,788][12883] Updated weights for policy 0, policy_version 55401 (0.0038) [2024-06-18 04:09:56,442][12883] Updated weights for policy 0, policy_version 55411 (0.0037) [2024-06-18 04:09:56,994][12645] Fps is (10 sec: 42614.7, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 907853824. Throughput: 0: 42497.3. Samples: 908016660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:09:56,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 04:10:00,681][12883] Updated weights for policy 0, policy_version 55421 (0.0028) [2024-06-18 04:10:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42873.2, 300 sec: 42320.7). Total num frames: 908083200. Throughput: 0: 42460.4. Samples: 908146700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 04:10:01,994][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 04:10:04,151][12883] Updated weights for policy 0, policy_version 55431 (0.0033) [2024-06-18 04:10:05,212][12862] Signal inference workers to stop experience collection... (13100 times) [2024-06-18 04:10:05,260][12862] Signal inference workers to resume experience collection... 
(13100 times) [2024-06-18 04:10:05,261][12883] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-06-18 04:10:05,275][12883] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-06-18 04:10:06,995][12645] Fps is (10 sec: 40953.8, 60 sec: 42051.3, 300 sec: 42209.4). Total num frames: 908263424. Throughput: 0: 42491.9. Samples: 908394740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 04:10:06,995][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 04:10:08,315][12883] Updated weights for policy 0, policy_version 55441 (0.0024) [2024-06-18 04:10:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 908492800. Throughput: 0: 42414.6. Samples: 908650860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 04:10:11,994][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 04:10:12,289][12883] Updated weights for policy 0, policy_version 55451 (0.0043) [2024-06-18 04:10:16,397][12883] Updated weights for policy 0, policy_version 55461 (0.0028) [2024-06-18 04:10:16,997][12645] Fps is (10 sec: 44227.4, 60 sec: 42595.8, 300 sec: 42320.5). Total num frames: 908705792. Throughput: 0: 42417.4. Samples: 908782480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 04:10:16,998][12645] Avg episode reward: [(0, '0.097')] [2024-06-18 04:10:19,885][12883] Updated weights for policy 0, policy_version 55471 (0.0032) [2024-06-18 04:10:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42265.2). Total num frames: 908918784. Throughput: 0: 42395.0. Samples: 909034400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 04:10:21,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 04:10:23,918][12883] Updated weights for policy 0, policy_version 55481 (0.0032) [2024-06-18 04:10:26,994][12645] Fps is (10 sec: 42613.9, 60 sec: 42598.4, 300 sec: 42321.0). Total num frames: 909131776. Throughput: 0: 42357.1. Samples: 909289080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 04:10:26,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 04:10:27,412][12883] Updated weights for policy 0, policy_version 55491 (0.0030) [2024-06-18 04:10:31,581][12883] Updated weights for policy 0, policy_version 55501 (0.0039) [2024-06-18 04:10:31,993][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 909328384. Throughput: 0: 42316.1. Samples: 909415720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:10:31,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 04:10:35,418][12883] Updated weights for policy 0, policy_version 55511 (0.0045) [2024-06-18 04:10:36,993][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 909557760. Throughput: 0: 42210.7. Samples: 909666560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:10:36,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 04:10:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055515_909557760.pth... [2024-06-18 04:10:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000054896_899416064.pth [2024-06-18 04:10:39,392][12883] Updated weights for policy 0, policy_version 55521 (0.0032) [2024-06-18 04:10:41,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42328.2, 300 sec: 42375.9). Total num frames: 909770752. Throughput: 0: 42322.7. Samples: 909921280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:10:41,996][12645] Avg episode reward: [(0, '0.103')] [2024-06-18 04:10:42,930][12883] Updated weights for policy 0, policy_version 55531 (0.0032) [2024-06-18 04:10:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42328.0, 300 sec: 42154.1). Total num frames: 909967360. Throughput: 0: 42250.7. Samples: 910047980. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:10:46,994][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 04:10:47,284][12883] Updated weights for policy 0, policy_version 55541 (0.0035) [2024-06-18 04:10:50,685][12883] Updated weights for policy 0, policy_version 55551 (0.0043) [2024-06-18 04:10:51,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 910213120. Throughput: 0: 42344.5. Samples: 910300180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:10:51,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 04:10:54,935][12883] Updated weights for policy 0, policy_version 55561 (0.0028) [2024-06-18 04:10:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 910376960. Throughput: 0: 42401.8. Samples: 910558940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:10:56,994][12645] Avg episode reward: [(0, '0.075')] [2024-06-18 04:10:58,674][12883] Updated weights for policy 0, policy_version 55571 (0.0030) [2024-06-18 04:11:01,993][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 910589952. Throughput: 0: 42061.7. Samples: 910675100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:11:01,994][12645] Avg episode reward: [(0, '0.052')] [2024-06-18 04:11:02,687][12883] Updated weights for policy 0, policy_version 55581 (0.0031) [2024-06-18 04:11:06,259][12883] Updated weights for policy 0, policy_version 55591 (0.0031) [2024-06-18 04:11:06,993][12645] Fps is (10 sec: 45875.4, 60 sec: 42872.6, 300 sec: 42321.6). Total num frames: 910835712. Throughput: 0: 42047.7. Samples: 910926540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 04:11:06,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 04:11:10,748][12883] Updated weights for policy 0, policy_version 55601 (0.0037) [2024-06-18 04:11:11,993][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42209.7). 
Total num frames: 910999552. Throughput: 0: 42095.6. Samples: 911183380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 04:11:11,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 04:11:14,263][12883] Updated weights for policy 0, policy_version 55611 (0.0043) [2024-06-18 04:11:16,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41781.7, 300 sec: 42154.1). Total num frames: 911212544. Throughput: 0: 41973.2. Samples: 911304520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 04:11:16,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 04:11:18,397][12883] Updated weights for policy 0, policy_version 55621 (0.0032) [2024-06-18 04:11:21,993][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 911458304. Throughput: 0: 42097.4. Samples: 911560940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 04:11:21,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 04:11:21,998][12883] Updated weights for policy 0, policy_version 55631 (0.0025) [2024-06-18 04:11:26,053][12883] Updated weights for policy 0, policy_version 55641 (0.0027) [2024-06-18 04:11:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 911654912. Throughput: 0: 41974.9. Samples: 911810060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 04:11:27,003][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 04:11:29,604][12883] Updated weights for policy 0, policy_version 55651 (0.0033) [2024-06-18 04:11:32,000][12645] Fps is (10 sec: 39296.6, 60 sec: 42047.8, 300 sec: 42153.2). Total num frames: 911851520. Throughput: 0: 41946.6. Samples: 911935840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 04:11:32,000][12645] Avg episode reward: [(0, '0.069')] [2024-06-18 04:11:34,077][12883] Updated weights for policy 0, policy_version 55661 (0.0033) [2024-06-18 04:11:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42265.2). 
Total num frames: 912080896. Throughput: 0: 42005.8. Samples: 912190440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 04:11:36,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 04:11:37,487][12883] Updated weights for policy 0, policy_version 55671 (0.0039) [2024-06-18 04:11:41,641][12883] Updated weights for policy 0, policy_version 55681 (0.0038) [2024-06-18 04:11:41,993][12645] Fps is (10 sec: 42625.5, 60 sec: 41780.8, 300 sec: 42209.6). Total num frames: 912277504. Throughput: 0: 41960.1. Samples: 912447140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:11:41,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 04:11:45,103][12883] Updated weights for policy 0, policy_version 55691 (0.0038) [2024-06-18 04:11:46,994][12645] Fps is (10 sec: 40956.5, 60 sec: 42051.7, 300 sec: 42154.0). Total num frames: 912490496. Throughput: 0: 42105.4. Samples: 912569880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:11:46,995][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 04:11:48,483][12862] Signal inference workers to stop experience collection... (13150 times) [2024-06-18 04:11:48,488][12862] Signal inference workers to resume experience collection... (13150 times) [2024-06-18 04:11:48,532][12883] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-06-18 04:11:48,532][12883] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-06-18 04:11:49,148][12883] Updated weights for policy 0, policy_version 55701 (0.0043) [2024-06-18 04:11:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 912719872. Throughput: 0: 42331.5. Samples: 912831460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:11:51,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 04:11:52,619][12883] Updated weights for policy 0, policy_version 55711 (0.0047) [2024-06-18 04:11:56,857][12883] Updated weights for policy 0, policy_version 55721 (0.0039) [2024-06-18 04:11:56,994][12645] Fps is (10 sec: 44240.3, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 912932864. Throughput: 0: 42324.8. Samples: 913088000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:11:56,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 04:12:00,302][12883] Updated weights for policy 0, policy_version 55731 (0.0042) [2024-06-18 04:12:01,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42323.7, 300 sec: 42209.6). Total num frames: 913129472. Throughput: 0: 42396.6. Samples: 913212460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:12:01,996][12645] Avg episode reward: [(0, '0.239')] [2024-06-18 04:12:04,690][12883] Updated weights for policy 0, policy_version 55741 (0.0032) [2024-06-18 04:12:06,998][12645] Fps is (10 sec: 42578.2, 60 sec: 42048.9, 300 sec: 42264.5). Total num frames: 913358848. Throughput: 0: 42366.9. Samples: 913467660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:12:06,999][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 04:12:08,209][12883] Updated weights for policy 0, policy_version 55751 (0.0039) [2024-06-18 04:12:11,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 913571840. Throughput: 0: 42530.8. Samples: 913723940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:12:11,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 04:12:12,138][12883] Updated weights for policy 0, policy_version 55761 (0.0045) [2024-06-18 04:12:15,771][12883] Updated weights for policy 0, policy_version 55771 (0.0030) [2024-06-18 04:12:16,994][12645] Fps is (10 sec: 40979.5, 60 sec: 42598.4, 300 sec: 42209.6). 
Total num frames: 913768448. Throughput: 0: 42473.9. Samples: 913846900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 04:12:16,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 04:12:20,609][12883] Updated weights for policy 0, policy_version 55781 (0.0045) [2024-06-18 04:12:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 913997824. Throughput: 0: 42634.7. Samples: 914109000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 04:12:21,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 04:12:23,350][12883] Updated weights for policy 0, policy_version 55791 (0.0030) [2024-06-18 04:12:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 914194432. Throughput: 0: 42586.1. Samples: 914363520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 04:12:26,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 04:12:28,179][12883] Updated weights for policy 0, policy_version 55801 (0.0045) [2024-06-18 04:12:31,339][12883] Updated weights for policy 0, policy_version 55811 (0.0030) [2024-06-18 04:12:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42602.8, 300 sec: 42265.2). Total num frames: 914407424. Throughput: 0: 42474.5. Samples: 914481200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 04:12:31,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 04:12:35,851][12883] Updated weights for policy 0, policy_version 55821 (0.0033) [2024-06-18 04:12:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 914620416. Throughput: 0: 42337.8. Samples: 914736660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 04:12:36,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 04:12:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055825_914636800.pth... 
[2024-06-18 04:12:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055205_904478720.pth [2024-06-18 04:12:39,350][12883] Updated weights for policy 0, policy_version 55831 (0.0036) [2024-06-18 04:12:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 914817024. Throughput: 0: 42214.3. Samples: 914987640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 04:12:41,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 04:12:43,466][12883] Updated weights for policy 0, policy_version 55841 (0.0027) [2024-06-18 04:12:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.9, 300 sec: 42209.9). Total num frames: 915046400. Throughput: 0: 42307.0. Samples: 915116180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 04:12:46,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 04:12:47,168][12883] Updated weights for policy 0, policy_version 55851 (0.0044) [2024-06-18 04:12:51,091][12883] Updated weights for policy 0, policy_version 55861 (0.0046) [2024-06-18 04:12:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 915259392. Throughput: 0: 42355.7. Samples: 915373460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:12:51,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 04:12:54,983][12883] Updated weights for policy 0, policy_version 55871 (0.0035) [2024-06-18 04:12:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 915472384. Throughput: 0: 42108.4. Samples: 915618820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:12:56,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 04:12:58,988][12883] Updated weights for policy 0, policy_version 55881 (0.0038) [2024-06-18 04:13:00,588][12862] Signal inference workers to stop experience collection... 
(13200 times) [2024-06-18 04:13:00,589][12862] Signal inference workers to resume experience collection... (13200 times) [2024-06-18 04:13:00,618][12883] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-06-18 04:13:00,618][12883] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-06-18 04:13:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 915668992. Throughput: 0: 42206.3. Samples: 915746180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:13:01,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 04:13:02,699][12883] Updated weights for policy 0, policy_version 55891 (0.0024) [2024-06-18 04:13:06,844][12883] Updated weights for policy 0, policy_version 55901 (0.0031) [2024-06-18 04:13:06,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42054.0, 300 sec: 42264.8). Total num frames: 915881984. Throughput: 0: 42060.0. Samples: 916001800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:13:06,997][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 04:13:10,437][12883] Updated weights for policy 0, policy_version 55911 (0.0042) [2024-06-18 04:13:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42209.7). Total num frames: 916078592. Throughput: 0: 42049.4. Samples: 916255740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:13:11,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 04:13:14,453][12883] Updated weights for policy 0, policy_version 55921 (0.0031) [2024-06-18 04:13:16,994][12645] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 916307968. Throughput: 0: 42223.2. Samples: 916381240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:13:16,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 04:13:18,226][12883] Updated weights for policy 0, policy_version 55931 (0.0029) [2024-06-18 04:13:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 916520960. Throughput: 0: 42149.7. Samples: 916633400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:13:21,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 04:13:22,449][12883] Updated weights for policy 0, policy_version 55941 (0.0037) [2024-06-18 04:13:26,067][12883] Updated weights for policy 0, policy_version 55951 (0.0037) [2024-06-18 04:13:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 916717568. Throughput: 0: 41975.9. Samples: 916876560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:13:26,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 04:13:30,139][12883] Updated weights for policy 0, policy_version 55961 (0.0028) [2024-06-18 04:13:31,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 42265.2). Total num frames: 916930560. Throughput: 0: 41999.7. Samples: 917006260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:13:31,996][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 04:13:33,798][12883] Updated weights for policy 0, policy_version 55971 (0.0032) [2024-06-18 04:13:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42210.0). Total num frames: 917143552. Throughput: 0: 41875.5. Samples: 917257860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:13:36,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 04:13:38,107][12883] Updated weights for policy 0, policy_version 55981 (0.0032) [2024-06-18 04:13:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.2, 300 sec: 42210.0). Total num frames: 917340160. Throughput: 0: 41805.3. Samples: 917500060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:13:41,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 04:13:42,266][12883] Updated weights for policy 0, policy_version 55991 (0.0033) [2024-06-18 04:13:45,851][12883] Updated weights for policy 0, policy_version 56001 (0.0035) [2024-06-18 04:13:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42210.0). Total num frames: 917553152. Throughput: 0: 41715.6. Samples: 917623380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:13:46,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 04:13:49,943][12883] Updated weights for policy 0, policy_version 56011 (0.0027) [2024-06-18 04:13:51,996][12645] Fps is (10 sec: 42588.6, 60 sec: 41777.5, 300 sec: 42209.3). Total num frames: 917766144. Throughput: 0: 41875.1. Samples: 917886180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:13:51,996][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 04:13:53,811][12883] Updated weights for policy 0, policy_version 56021 (0.0029) [2024-06-18 04:13:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42265.5). Total num frames: 917979136. Throughput: 0: 41672.8. Samples: 918131020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:13:56,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 04:13:57,594][12883] Updated weights for policy 0, policy_version 56031 (0.0027) [2024-06-18 04:14:01,684][12883] Updated weights for policy 0, policy_version 56041 (0.0030) [2024-06-18 04:14:01,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 918192128. Throughput: 0: 41677.7. Samples: 918256740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:14:01,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 04:14:05,965][12883] Updated weights for policy 0, policy_version 56051 (0.0036) [2024-06-18 04:14:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41780.9, 300 sec: 42154.1). 
Total num frames: 918388736. Throughput: 0: 41768.5. Samples: 918512980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:14:06,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 04:14:09,430][12883] Updated weights for policy 0, policy_version 56061 (0.0042) [2024-06-18 04:14:11,993][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 918601728. Throughput: 0: 41869.9. Samples: 918760700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:14:11,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 04:14:13,688][12883] Updated weights for policy 0, policy_version 56071 (0.0032) [2024-06-18 04:14:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42210.0). Total num frames: 918814720. Throughput: 0: 41781.7. Samples: 918886340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:14:16,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 04:14:17,337][12883] Updated weights for policy 0, policy_version 56081 (0.0031) [2024-06-18 04:14:21,851][12883] Updated weights for policy 0, policy_version 56091 (0.0048) [2024-06-18 04:14:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 42098.5). Total num frames: 918994944. Throughput: 0: 41729.4. Samples: 919135680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:14:21,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 04:14:25,258][12883] Updated weights for policy 0, policy_version 56101 (0.0041) [2024-06-18 04:14:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 919224320. Throughput: 0: 41781.4. Samples: 919380220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:14:27,000][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 04:14:27,445][12862] Signal inference workers to stop experience collection... (13250 times) [2024-06-18 04:14:27,445][12862] Signal inference workers to resume experience collection... 
(13250 times) [2024-06-18 04:14:27,478][12883] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-06-18 04:14:27,479][12883] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-06-18 04:14:29,539][12883] Updated weights for policy 0, policy_version 56111 (0.0057) [2024-06-18 04:14:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41507.7, 300 sec: 42098.6). Total num frames: 919420928. Throughput: 0: 41883.6. Samples: 919508140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:14:31,994][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 04:14:33,113][12883] Updated weights for policy 0, policy_version 56121 (0.0031) [2024-06-18 04:14:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 42043.9). Total num frames: 919633920. Throughput: 0: 41442.1. Samples: 919750980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 04:14:36,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 04:14:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056130_919633920.pth... [2024-06-18 04:14:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055515_909557760.pth [2024-06-18 04:14:37,279][12883] Updated weights for policy 0, policy_version 56131 (0.0039) [2024-06-18 04:14:41,341][12883] Updated weights for policy 0, policy_version 56141 (0.0036) [2024-06-18 04:14:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42154.6). Total num frames: 919863296. Throughput: 0: 41621.8. Samples: 920004000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 04:14:41,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 04:14:45,534][12883] Updated weights for policy 0, policy_version 56151 (0.0034) [2024-06-18 04:14:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 920059904. Throughput: 0: 41548.1. Samples: 920126400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 04:14:46,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 04:14:49,107][12883] Updated weights for policy 0, policy_version 56161 (0.0034) [2024-06-18 04:14:51,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41234.7, 300 sec: 41987.5). Total num frames: 920240128. Throughput: 0: 41413.8. Samples: 920376600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 04:14:51,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 04:14:53,225][12883] Updated weights for policy 0, policy_version 56171 (0.0035) [2024-06-18 04:14:56,793][12883] Updated weights for policy 0, policy_version 56181 (0.0044) [2024-06-18 04:14:56,997][12645] Fps is (10 sec: 42582.3, 60 sec: 41776.6, 300 sec: 42042.5). Total num frames: 920485888. Throughput: 0: 41411.1. Samples: 920624360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 04:14:56,998][12645] Avg episode reward: [(0, '0.075')] [2024-06-18 04:15:00,793][12883] Updated weights for policy 0, policy_version 56191 (0.0031) [2024-06-18 04:15:01,993][12645] Fps is (10 sec: 45875.3, 60 sec: 41779.3, 300 sec: 42154.3). Total num frames: 920698880. Throughput: 0: 41506.7. Samples: 920754140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 04:15:01,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 04:15:04,461][12883] Updated weights for policy 0, policy_version 56201 (0.0036) [2024-06-18 04:15:06,994][12645] Fps is (10 sec: 37697.0, 60 sec: 41232.9, 300 sec: 41931.9). Total num frames: 920862720. Throughput: 0: 41599.4. Samples: 921007660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 04:15:06,994][12645] Avg episode reward: [(0, '0.145')] [2024-06-18 04:15:08,555][12883] Updated weights for policy 0, policy_version 56211 (0.0028) [2024-06-18 04:15:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42043.5). Total num frames: 921108480. Throughput: 0: 41756.0. Samples: 921259240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 04:15:11,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 04:15:12,113][12883] Updated weights for policy 0, policy_version 56221 (0.0037) [2024-06-18 04:15:16,364][12883] Updated weights for policy 0, policy_version 56231 (0.0039) [2024-06-18 04:15:16,993][12645] Fps is (10 sec: 45876.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 921321472. Throughput: 0: 41755.2. Samples: 921387120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 04:15:16,994][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 04:15:19,935][12883] Updated weights for policy 0, policy_version 56241 (0.0044) [2024-06-18 04:15:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 921518080. Throughput: 0: 41769.4. Samples: 921630600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 04:15:21,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 04:15:24,239][12883] Updated weights for policy 0, policy_version 56251 (0.0031) [2024-06-18 04:15:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 921731072. Throughput: 0: 41805.8. Samples: 921885260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 04:15:26,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 04:15:27,905][12883] Updated weights for policy 0, policy_version 56261 (0.0023) [2024-06-18 04:15:31,912][12883] Updated weights for policy 0, policy_version 56271 (0.0030) [2024-06-18 04:15:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41987.4). Total num frames: 921944064. Throughput: 0: 41899.1. Samples: 922011860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 04:15:31,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 04:15:35,817][12883] Updated weights for policy 0, policy_version 56281 (0.0027) [2024-06-18 04:15:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41932.2). 
Total num frames: 922140672. Throughput: 0: 41887.5. Samples: 922261540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 04:15:36,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 04:15:39,690][12883] Updated weights for policy 0, policy_version 56291 (0.0038) [2024-06-18 04:15:41,996][12645] Fps is (10 sec: 40950.8, 60 sec: 41504.5, 300 sec: 41987.1). Total num frames: 922353664. Throughput: 0: 41964.9. Samples: 922512720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 04:15:41,997][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 04:15:43,887][12883] Updated weights for policy 0, policy_version 56301 (0.0028) [2024-06-18 04:15:44,318][12862] Signal inference workers to stop experience collection... (13300 times) [2024-06-18 04:15:44,319][12862] Signal inference workers to resume experience collection... (13300 times) [2024-06-18 04:15:44,349][12883] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-06-18 04:15:44,349][12883] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-06-18 04:15:46,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41504.5, 300 sec: 41820.5). Total num frames: 922550272. Throughput: 0: 41814.2. Samples: 922635880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 04:15:46,997][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 04:15:47,659][12883] Updated weights for policy 0, policy_version 56311 (0.0041) [2024-06-18 04:15:51,682][12883] Updated weights for policy 0, policy_version 56321 (0.0027) [2024-06-18 04:15:51,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 922763264. Throughput: 0: 41768.1. Samples: 922887220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 04:15:51,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 04:15:55,533][12883] Updated weights for policy 0, policy_version 56331 (0.0040) [2024-06-18 04:15:56,994][12645] Fps is (10 sec: 42608.5, 60 sec: 41508.8, 300 sec: 41987.5). Total num frames: 922976256. Throughput: 0: 41711.6. Samples: 923136260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 04:15:56,994][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 04:15:59,344][12883] Updated weights for policy 0, policy_version 56341 (0.0037) [2024-06-18 04:16:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41820.8). Total num frames: 923172864. Throughput: 0: 41631.5. Samples: 923260540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 04:16:01,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 04:16:03,523][12883] Updated weights for policy 0, policy_version 56351 (0.0022) [2024-06-18 04:16:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 923402240. Throughput: 0: 41805.7. Samples: 923511860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 04:16:06,994][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 04:16:07,056][12883] Updated weights for policy 0, policy_version 56361 (0.0031) [2024-06-18 04:16:11,428][12883] Updated weights for policy 0, policy_version 56371 (0.0034) [2024-06-18 04:16:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 923582464. Throughput: 0: 41800.0. Samples: 923766260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 04:16:11,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 04:16:14,914][12883] Updated weights for policy 0, policy_version 56381 (0.0029) [2024-06-18 04:16:16,996][12645] Fps is (10 sec: 40950.6, 60 sec: 41504.4, 300 sec: 41876.0). Total num frames: 923811840. Throughput: 0: 41735.6. Samples: 923890060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 04:16:16,997][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 04:16:19,307][12883] Updated weights for policy 0, policy_version 56391 (0.0032) [2024-06-18 04:16:21,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 924041216. Throughput: 0: 41870.2. Samples: 924145700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 04:16:21,995][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 04:16:22,534][12883] Updated weights for policy 0, policy_version 56401 (0.0029) [2024-06-18 04:16:26,996][12645] Fps is (10 sec: 40960.4, 60 sec: 41504.5, 300 sec: 41932.5). Total num frames: 924221440. Throughput: 0: 41954.7. Samples: 924400680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 04:16:26,996][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 04:16:27,243][12883] Updated weights for policy 0, policy_version 56411 (0.0036) [2024-06-18 04:16:30,316][12883] Updated weights for policy 0, policy_version 56421 (0.0038) [2024-06-18 04:16:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 924434432. Throughput: 0: 41920.4. Samples: 924522200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 04:16:31,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 04:16:34,942][12883] Updated weights for policy 0, policy_version 56431 (0.0033) [2024-06-18 04:16:36,996][12645] Fps is (10 sec: 42598.4, 60 sec: 41777.6, 300 sec: 41931.6). Total num frames: 924647424. Throughput: 0: 41898.4. Samples: 924772740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 04:16:36,997][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 04:16:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056436_924647424.pth... 
[2024-06-18 04:16:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000055825_914636800.pth [2024-06-18 04:16:38,188][12883] Updated weights for policy 0, policy_version 56441 (0.0039) [2024-06-18 04:16:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41507.8, 300 sec: 41876.5). Total num frames: 924844032. Throughput: 0: 41931.5. Samples: 925023180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 04:16:41,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 04:16:42,938][12883] Updated weights for policy 0, policy_version 56451 (0.0023) [2024-06-18 04:16:46,561][12883] Updated weights for policy 0, policy_version 56461 (0.0036) [2024-06-18 04:16:46,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42053.8, 300 sec: 41876.4). Total num frames: 925073408. Throughput: 0: 41755.5. Samples: 925139540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 04:16:46,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 04:16:50,825][12883] Updated weights for policy 0, policy_version 56471 (0.0036) [2024-06-18 04:16:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 925286400. Throughput: 0: 41964.1. Samples: 925400240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 04:16:51,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 04:16:54,359][12883] Updated weights for policy 0, policy_version 56481 (0.0027) [2024-06-18 04:16:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41821.2). Total num frames: 925466624. Throughput: 0: 41776.9. Samples: 925646220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:16:56,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 04:16:58,583][12883] Updated weights for policy 0, policy_version 56491 (0.0033) [2024-06-18 04:17:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41821.5). Total num frames: 925696000. Throughput: 0: 41743.1. Samples: 925768400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:17:01,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 04:17:02,344][12883] Updated weights for policy 0, policy_version 56501 (0.0038) [2024-06-18 04:17:06,235][12883] Updated weights for policy 0, policy_version 56511 (0.0034) [2024-06-18 04:17:06,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 925925376. Throughput: 0: 41791.6. Samples: 926026320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:17:06,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 04:17:10,292][12883] Updated weights for policy 0, policy_version 56521 (0.0040) [2024-06-18 04:17:11,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 41820.5). Total num frames: 926105600. Throughput: 0: 41572.9. Samples: 926271460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:17:11,996][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 04:17:13,699][12862] Signal inference workers to stop experience collection... (13350 times) [2024-06-18 04:17:13,748][12883] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-06-18 04:17:13,752][12862] Signal inference workers to resume experience collection... (13350 times) [2024-06-18 04:17:13,764][12883] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-06-18 04:17:14,061][12883] Updated weights for policy 0, policy_version 56531 (0.0027) [2024-06-18 04:17:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42054.0, 300 sec: 41820.8). Total num frames: 926334976. Throughput: 0: 41599.5. Samples: 926394180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:17:16,994][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 04:17:18,183][12883] Updated weights for policy 0, policy_version 56541 (0.0039) [2024-06-18 04:17:21,993][12645] Fps is (10 sec: 39331.1, 60 sec: 40960.1, 300 sec: 41709.8). Total num frames: 926498816. Throughput: 0: 41536.9. 
Samples: 926641800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:17:21,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 04:17:22,645][12883] Updated weights for policy 0, policy_version 56551 (0.0032) [2024-06-18 04:17:25,955][12883] Updated weights for policy 0, policy_version 56561 (0.0031) [2024-06-18 04:17:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41780.7, 300 sec: 41765.3). Total num frames: 926728192. Throughput: 0: 41538.1. Samples: 926892400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:17:26,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 04:17:30,281][12883] Updated weights for policy 0, policy_version 56571 (0.0044) [2024-06-18 04:17:31,996][12645] Fps is (10 sec: 45864.1, 60 sec: 42050.6, 300 sec: 41820.5). Total num frames: 926957568. Throughput: 0: 41834.8. Samples: 927022200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:17:31,996][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 04:17:33,810][12883] Updated weights for policy 0, policy_version 56581 (0.0028) [2024-06-18 04:17:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41507.7, 300 sec: 41765.3). Total num frames: 927137792. Throughput: 0: 41590.2. Samples: 927271800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:17:36,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 04:17:37,845][12883] Updated weights for policy 0, policy_version 56591 (0.0040) [2024-06-18 04:17:41,651][12883] Updated weights for policy 0, policy_version 56601 (0.0034) [2024-06-18 04:17:41,994][12645] Fps is (10 sec: 39330.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 927350784. Throughput: 0: 41706.3. Samples: 927523000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:17:41,994][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 04:17:45,782][12883] Updated weights for policy 0, policy_version 56611 (0.0039) [2024-06-18 04:17:46,996][12645] Fps is (10 sec: 44226.7, 60 sec: 41777.7, 300 sec: 41765.0). Total num frames: 927580160. Throughput: 0: 41747.7. Samples: 927647140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:17:46,996][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 04:17:49,501][12883] Updated weights for policy 0, policy_version 56621 (0.0032) [2024-06-18 04:17:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 927776768. Throughput: 0: 41660.0. Samples: 927901020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:17:51,994][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 04:17:53,348][12883] Updated weights for policy 0, policy_version 56631 (0.0029) [2024-06-18 04:17:56,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 927989760. Throughput: 0: 41676.3. Samples: 928146800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:17:56,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 04:17:57,478][12883] Updated weights for policy 0, policy_version 56641 (0.0038) [2024-06-18 04:18:01,159][12883] Updated weights for policy 0, policy_version 56651 (0.0033) [2024-06-18 04:18:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41765.7). Total num frames: 928202752. Throughput: 0: 41869.4. Samples: 928278300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:18:01,994][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 04:18:05,325][12883] Updated weights for policy 0, policy_version 56661 (0.0037) [2024-06-18 04:18:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 928415744. Throughput: 0: 42121.7. Samples: 928537280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 04:18:06,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 04:18:08,971][12883] Updated weights for policy 0, policy_version 56671 (0.0035) [2024-06-18 04:18:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42327.0, 300 sec: 41820.8). Total num frames: 928645120. Throughput: 0: 41868.1. Samples: 928776460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 04:18:11,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 04:18:13,581][12883] Updated weights for policy 0, policy_version 56681 (0.0045) [2024-06-18 04:18:16,856][12883] Updated weights for policy 0, policy_version 56691 (0.0033) [2024-06-18 04:18:16,996][12645] Fps is (10 sec: 40950.4, 60 sec: 41504.5, 300 sec: 41709.5). Total num frames: 928825344. Throughput: 0: 41872.9. Samples: 928906480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 04:18:16,996][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 04:18:21,179][12883] Updated weights for policy 0, policy_version 56701 (0.0027) [2024-06-18 04:18:21,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42052.1, 300 sec: 41709.8). Total num frames: 929021952. Throughput: 0: 41974.2. Samples: 929160640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 04:18:21,994][12645] Avg episode reward: [(0, '0.094')] [2024-06-18 04:18:24,401][12883] Updated weights for policy 0, policy_version 56711 (0.0030) [2024-06-18 04:18:26,994][12645] Fps is (10 sec: 45886.0, 60 sec: 42598.5, 300 sec: 41876.7). Total num frames: 929284096. Throughput: 0: 41911.2. Samples: 929409000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 04:18:26,994][12645] Avg episode reward: [(0, '0.094')] [2024-06-18 04:18:28,827][12883] Updated weights for policy 0, policy_version 56721 (0.0025) [2024-06-18 04:18:31,930][12883] Updated weights for policy 0, policy_version 56731 (0.0024) [2024-06-18 04:18:31,994][12645] Fps is (10 sec: 45873.5, 60 sec: 42053.6, 300 sec: 41820.8). 
Total num frames: 929480704. Throughput: 0: 42159.1. Samples: 929544220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 04:18:31,995][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 04:18:32,424][12862] Signal inference workers to stop experience collection... (13400 times) [2024-06-18 04:18:32,424][12862] Signal inference workers to resume experience collection... (13400 times) [2024-06-18 04:18:32,438][12883] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-06-18 04:18:32,462][12883] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-06-18 04:18:36,688][12883] Updated weights for policy 0, policy_version 56741 (0.0032) [2024-06-18 04:18:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 929660928. Throughput: 0: 41995.6. Samples: 929790820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 04:18:36,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 04:18:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056742_929660928.pth... [2024-06-18 04:18:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056130_919633920.pth [2024-06-18 04:18:39,999][12883] Updated weights for policy 0, policy_version 56751 (0.0032) [2024-06-18 04:18:41,994][12645] Fps is (10 sec: 40961.4, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 929890304. Throughput: 0: 42090.2. Samples: 930040860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 04:18:41,994][12645] Avg episode reward: [(0, '0.105')] [2024-06-18 04:18:44,521][12883] Updated weights for policy 0, policy_version 56761 (0.0043) [2024-06-18 04:18:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42053.9, 300 sec: 41821.2). Total num frames: 930103296. Throughput: 0: 42136.9. Samples: 930174460. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 04:18:46,994][12645] Avg episode reward: [(0, '0.139')] [2024-06-18 04:18:47,645][12883] Updated weights for policy 0, policy_version 56771 (0.0034) [2024-06-18 04:18:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 930283520. Throughput: 0: 41884.4. Samples: 930422080. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 04:18:51,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 04:18:52,022][12883] Updated weights for policy 0, policy_version 56781 (0.0045) [2024-06-18 04:18:55,415][12883] Updated weights for policy 0, policy_version 56791 (0.0038) [2024-06-18 04:18:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 930529280. Throughput: 0: 42142.2. Samples: 930672860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 04:18:56,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 04:18:59,991][12883] Updated weights for policy 0, policy_version 56801 (0.0026) [2024-06-18 04:19:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 930709504. Throughput: 0: 42309.6. Samples: 930810320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 04:19:01,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 04:19:03,396][12883] Updated weights for policy 0, policy_version 56811 (0.0044) [2024-06-18 04:19:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 930938880. Throughput: 0: 42085.3. Samples: 931054480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 04:19:06,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 04:19:07,844][12883] Updated weights for policy 0, policy_version 56821 (0.0029) [2024-06-18 04:19:11,386][12883] Updated weights for policy 0, policy_version 56831 (0.0043) [2024-06-18 04:19:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.2, 300 sec: 41876.4). 
Total num frames: 931168256. Throughput: 0: 42039.0. Samples: 931300760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-18 04:19:11,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 04:19:15,447][12883] Updated weights for policy 0, policy_version 56841 (0.0032) [2024-06-18 04:19:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42053.8, 300 sec: 41876.4). Total num frames: 931348480. Throughput: 0: 41881.2. Samples: 931428860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 04:19:16,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 04:19:19,081][12883] Updated weights for policy 0, policy_version 56851 (0.0033) [2024-06-18 04:19:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 41820.8). Total num frames: 931561472. Throughput: 0: 41919.5. Samples: 931677200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 04:19:21,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 04:19:23,405][12883] Updated weights for policy 0, policy_version 56861 (0.0028) [2024-06-18 04:19:26,926][12883] Updated weights for policy 0, policy_version 56871 (0.0038) [2024-06-18 04:19:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 931774464. Throughput: 0: 42112.1. Samples: 931935900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 04:19:26,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 04:19:31,114][12883] Updated weights for policy 0, policy_version 56881 (0.0038) [2024-06-18 04:19:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41233.3, 300 sec: 41765.3). Total num frames: 931954688. Throughput: 0: 41755.4. Samples: 932053460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 04:19:31,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 04:19:34,606][12883] Updated weights for policy 0, policy_version 56891 (0.0037) [2024-06-18 04:19:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 932200448. 
Throughput: 0: 41764.9. Samples: 932301500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 04:19:36,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 04:19:38,874][12883] Updated weights for policy 0, policy_version 56901 (0.0034) [2024-06-18 04:19:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 932397056. Throughput: 0: 41925.9. Samples: 932559520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 04:19:41,994][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 04:19:42,349][12883] Updated weights for policy 0, policy_version 56911 (0.0033) [2024-06-18 04:19:43,430][12862] Signal inference workers to stop experience collection... (13450 times) [2024-06-18 04:19:43,460][12883] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-06-18 04:19:43,487][12862] Signal inference workers to resume experience collection... (13450 times) [2024-06-18 04:19:43,488][12883] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-06-18 04:19:46,577][12883] Updated weights for policy 0, policy_version 56921 (0.0029) [2024-06-18 04:19:46,996][12645] Fps is (10 sec: 39310.5, 60 sec: 41504.1, 300 sec: 41876.0). Total num frames: 932593664. Throughput: 0: 41510.8. Samples: 932678420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 04:19:47,005][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 04:19:50,122][12883] Updated weights for policy 0, policy_version 56931 (0.0033) [2024-06-18 04:19:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41821.4). Total num frames: 932823040. Throughput: 0: 41688.9. Samples: 932930480. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-18 04:19:51,994][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 04:19:54,285][12883] Updated weights for policy 0, policy_version 56941 (0.0029) [2024-06-18 04:19:56,994][12645] Fps is (10 sec: 40971.4, 60 sec: 41233.0, 300 sec: 41709.8). 
Total num frames: 933003264. Throughput: 0: 41772.9. Samples: 933180540. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-18 04:19:56,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 04:19:58,595][12883] Updated weights for policy 0, policy_version 56951 (0.0038) [2024-06-18 04:20:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 933232640. Throughput: 0: 41683.2. Samples: 933304600. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-18 04:20:01,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 04:20:02,127][12883] Updated weights for policy 0, policy_version 56961 (0.0033) [2024-06-18 04:20:06,353][12883] Updated weights for policy 0, policy_version 56971 (0.0027) [2024-06-18 04:20:06,996][12645] Fps is (10 sec: 42589.1, 60 sec: 41504.6, 300 sec: 41765.0). Total num frames: 933429248. Throughput: 0: 41584.2. Samples: 933548580. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-18 04:20:06,997][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 04:20:09,971][12883] Updated weights for policy 0, policy_version 56981 (0.0027) [2024-06-18 04:20:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 933642240. Throughput: 0: 41572.3. Samples: 933806660. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-18 04:20:11,994][12645] Avg episode reward: [(0, '0.138')] [2024-06-18 04:20:14,066][12883] Updated weights for policy 0, policy_version 56991 (0.0038) [2024-06-18 04:20:16,994][12645] Fps is (10 sec: 42607.9, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 933855232. Throughput: 0: 41725.0. Samples: 933931080. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-18 04:20:16,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 04:20:17,696][12883] Updated weights for policy 0, policy_version 57001 (0.0033) [2024-06-18 04:20:21,874][12883] Updated weights for policy 0, policy_version 57011 (0.0034) [2024-06-18 04:20:21,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 934068224. Throughput: 0: 41907.2. Samples: 934187320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-18 04:20:21,994][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 04:20:25,408][12883] Updated weights for policy 0, policy_version 57021 (0.0039) [2024-06-18 04:20:26,996][12645] Fps is (10 sec: 42588.8, 60 sec: 41777.6, 300 sec: 41820.5). Total num frames: 934281216. Throughput: 0: 41711.6. Samples: 934436640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 04:20:26,996][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 04:20:29,586][12883] Updated weights for policy 0, policy_version 57031 (0.0037) [2024-06-18 04:20:31,996][12645] Fps is (10 sec: 42588.3, 60 sec: 42323.8, 300 sec: 41876.1). Total num frames: 934494208. Throughput: 0: 41883.6. Samples: 934563160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 04:20:31,997][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 04:20:33,567][12883] Updated weights for policy 0, policy_version 57041 (0.0037) [2024-06-18 04:20:36,994][12645] Fps is (10 sec: 40969.2, 60 sec: 41506.1, 300 sec: 41821.2). Total num frames: 934690816. Throughput: 0: 41848.0. Samples: 934813640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 04:20:36,994][12645] Avg episode reward: [(0, '0.097')] [2024-06-18 04:20:37,027][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057049_934690816.pth... 
[2024-06-18 04:20:37,094][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056436_924647424.pth [2024-06-18 04:20:37,625][12883] Updated weights for policy 0, policy_version 57051 (0.0031) [2024-06-18 04:20:41,198][12883] Updated weights for policy 0, policy_version 57061 (0.0032) [2024-06-18 04:20:41,994][12645] Fps is (10 sec: 39330.7, 60 sec: 41506.1, 300 sec: 41821.2). Total num frames: 934887424. Throughput: 0: 41840.5. Samples: 935063360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 04:20:41,994][12645] Avg episode reward: [(0, '0.088')] [2024-06-18 04:20:45,741][12883] Updated weights for policy 0, policy_version 57071 (0.0032) [2024-06-18 04:20:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42054.3, 300 sec: 41876.4). Total num frames: 935116800. Throughput: 0: 41961.4. Samples: 935192860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 04:20:46,994][12645] Avg episode reward: [(0, '0.179')] [2024-06-18 04:20:48,685][12883] Updated weights for policy 0, policy_version 57081 (0.0033) [2024-06-18 04:20:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 935329792. Throughput: 0: 42109.3. Samples: 935443400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 04:20:51,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 04:20:53,351][12883] Updated weights for policy 0, policy_version 57091 (0.0028) [2024-06-18 04:20:56,376][12883] Updated weights for policy 0, policy_version 57101 (0.0034) [2024-06-18 04:20:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 935542784. Throughput: 0: 41864.4. Samples: 935690560. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 04:20:56,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 04:21:00,906][12883] Updated weights for policy 0, policy_version 57111 (0.0026) [2024-06-18 04:21:01,996][12645] Fps is (10 sec: 40950.4, 60 sec: 41777.6, 300 sec: 41820.5). Total num frames: 935739392. Throughput: 0: 42030.8. Samples: 935822560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:21:01,996][12645] Avg episode reward: [(0, '0.062')] [2024-06-18 04:21:04,292][12883] Updated weights for policy 0, policy_version 57121 (0.0037) [2024-06-18 04:21:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42053.9, 300 sec: 41931.9). Total num frames: 935952384. Throughput: 0: 42034.2. Samples: 936078860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:21:06,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 04:21:08,924][12883] Updated weights for policy 0, policy_version 57131 (0.0037) [2024-06-18 04:21:11,994][12645] Fps is (10 sec: 44247.3, 60 sec: 42325.4, 300 sec: 41932.3). Total num frames: 936181760. Throughput: 0: 42069.3. Samples: 936329660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:21:11,994][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 04:21:12,090][12883] Updated weights for policy 0, policy_version 57141 (0.0035) [2024-06-18 04:21:13,357][12862] Signal inference workers to stop experience collection... (13500 times) [2024-06-18 04:21:13,357][12862] Signal inference workers to resume experience collection... (13500 times) [2024-06-18 04:21:13,366][12883] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-06-18 04:21:13,378][12883] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-06-18 04:21:16,662][12883] Updated weights for policy 0, policy_version 57151 (0.0023) [2024-06-18 04:21:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 936361984. Throughput: 0: 42198.2. 
Samples: 936461980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:21:16,994][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 04:21:19,766][12883] Updated weights for policy 0, policy_version 57161 (0.0027) [2024-06-18 04:21:21,996][12645] Fps is (10 sec: 40951.9, 60 sec: 42050.9, 300 sec: 41932.0). Total num frames: 936591360. Throughput: 0: 42250.7. Samples: 936715000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:21:21,996][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 04:21:24,235][12883] Updated weights for policy 0, policy_version 57171 (0.0037) [2024-06-18 04:21:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42326.9, 300 sec: 41987.5). Total num frames: 936820736. Throughput: 0: 42238.2. Samples: 936964080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:21:26,994][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 04:21:27,734][12883] Updated weights for policy 0, policy_version 57181 (0.0039) [2024-06-18 04:21:31,994][12645] Fps is (10 sec: 40967.4, 60 sec: 41780.7, 300 sec: 41876.7). Total num frames: 937000960. Throughput: 0: 42135.9. Samples: 937088980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:21:31,994][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 04:21:32,482][12883] Updated weights for policy 0, policy_version 57191 (0.0031) [2024-06-18 04:21:35,426][12883] Updated weights for policy 0, policy_version 57201 (0.0041) [2024-06-18 04:21:36,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42042.7). Total num frames: 937246720. Throughput: 0: 42200.0. Samples: 937342500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 04:21:36,997][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 04:21:39,982][12883] Updated weights for policy 0, policy_version 57211 (0.0034) [2024-06-18 04:21:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 937410560. Throughput: 0: 42419.2. Samples: 937599420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:21:41,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 04:21:43,234][12883] Updated weights for policy 0, policy_version 57221 (0.0051) [2024-06-18 04:21:46,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 937656320. Throughput: 0: 42181.3. Samples: 937720620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:21:46,994][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 04:21:47,552][12883] Updated weights for policy 0, policy_version 57231 (0.0036) [2024-06-18 04:21:50,953][12883] Updated weights for policy 0, policy_version 57241 (0.0024) [2024-06-18 04:21:51,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 937885696. Throughput: 0: 42211.0. Samples: 937978360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:21:51,994][12645] Avg episode reward: [(0, '0.040')] [2024-06-18 04:21:55,132][12883] Updated weights for policy 0, policy_version 57251 (0.0036) [2024-06-18 04:21:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 938065920. Throughput: 0: 42399.4. Samples: 938237640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:21:56,994][12645] Avg episode reward: [(0, '0.044')] [2024-06-18 04:21:58,611][12883] Updated weights for policy 0, policy_version 57261 (0.0027) [2024-06-18 04:22:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42600.0, 300 sec: 41931.9). Total num frames: 938295296. Throughput: 0: 42123.9. Samples: 938357560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:22:01,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 04:22:02,663][12883] Updated weights for policy 0, policy_version 57271 (0.0045) [2024-06-18 04:22:06,278][12883] Updated weights for policy 0, policy_version 57281 (0.0032) [2024-06-18 04:22:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42043.3). 
Total num frames: 938508288. Throughput: 0: 42236.0. Samples: 938615540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:22:07,003][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 04:22:10,990][12883] Updated weights for policy 0, policy_version 57291 (0.0039) [2024-06-18 04:22:11,996][12645] Fps is (10 sec: 39313.0, 60 sec: 41777.5, 300 sec: 41876.1). Total num frames: 938688512. Throughput: 0: 42255.2. Samples: 938865660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 04:22:11,996][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 04:22:14,110][12883] Updated weights for policy 0, policy_version 57301 (0.0032) [2024-06-18 04:22:16,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42098.2). Total num frames: 938917888. Throughput: 0: 42169.9. Samples: 938986720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 04:22:16,997][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 04:22:18,606][12883] Updated weights for policy 0, policy_version 57311 (0.0029) [2024-06-18 04:22:20,344][12862] Signal inference workers to stop experience collection... (13550 times) [2024-06-18 04:22:20,344][12862] Signal inference workers to resume experience collection... (13550 times) [2024-06-18 04:22:20,386][12883] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-06-18 04:22:20,386][12883] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-06-18 04:22:21,993][12645] Fps is (10 sec: 42608.7, 60 sec: 42053.7, 300 sec: 41987.5). Total num frames: 939114496. Throughput: 0: 42215.6. Samples: 939242100. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 04:22:21,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 04:22:22,184][12883] Updated weights for policy 0, policy_version 57321 (0.0024) [2024-06-18 04:22:26,205][12883] Updated weights for policy 0, policy_version 57331 (0.0034) [2024-06-18 04:22:26,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 939327488. Throughput: 0: 42057.3. Samples: 939492000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 04:22:26,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 04:22:29,874][12883] Updated weights for policy 0, policy_version 57341 (0.0025) [2024-06-18 04:22:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 939540480. Throughput: 0: 42123.0. Samples: 939616160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 04:22:31,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 04:22:33,835][12883] Updated weights for policy 0, policy_version 57351 (0.0031) [2024-06-18 04:22:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41507.8, 300 sec: 41987.5). Total num frames: 939737088. Throughput: 0: 42166.8. Samples: 939875860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 04:22:36,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 04:22:37,201][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057359_939769856.pth... [2024-06-18 04:22:37,245][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000056742_929660928.pth [2024-06-18 04:22:38,107][12883] Updated weights for policy 0, policy_version 57361 (0.0038) [2024-06-18 04:22:41,506][12883] Updated weights for policy 0, policy_version 57371 (0.0036) [2024-06-18 04:22:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42043.3). Total num frames: 939982848. Throughput: 0: 41779.6. Samples: 940117720. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 04:22:41,994][12645] Avg episode reward: [(0, '0.212')] [2024-06-18 04:22:45,811][12883] Updated weights for policy 0, policy_version 57381 (0.0033) [2024-06-18 04:22:46,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 940179456. Throughput: 0: 42092.4. Samples: 940251720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 04:22:46,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 04:22:49,123][12883] Updated weights for policy 0, policy_version 57391 (0.0050) [2024-06-18 04:22:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 940376064. Throughput: 0: 41995.2. Samples: 940505320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 04:22:51,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 04:22:53,505][12883] Updated weights for policy 0, policy_version 57401 (0.0037) [2024-06-18 04:22:56,850][12883] Updated weights for policy 0, policy_version 57411 (0.0021) [2024-06-18 04:22:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 940621824. Throughput: 0: 42024.3. Samples: 940756660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 04:22:56,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 04:23:01,214][12883] Updated weights for policy 0, policy_version 57421 (0.0026) [2024-06-18 04:23:01,993][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.4, 300 sec: 41987.5). Total num frames: 940802048. Throughput: 0: 42244.1. Samples: 940887600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 04:23:01,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 04:23:04,909][12883] Updated weights for policy 0, policy_version 57431 (0.0038) [2024-06-18 04:23:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 941015040. Throughput: 0: 42123.0. Samples: 941137640. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 04:23:06,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 04:23:08,873][12883] Updated weights for policy 0, policy_version 57441 (0.0045) [2024-06-18 04:23:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42600.1, 300 sec: 42098.9). Total num frames: 941244416. Throughput: 0: 42323.6. Samples: 941396560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 04:23:11,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 04:23:12,395][12883] Updated weights for policy 0, policy_version 57451 (0.0036) [2024-06-18 04:23:16,715][12883] Updated weights for policy 0, policy_version 57461 (0.0027) [2024-06-18 04:23:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42053.9, 300 sec: 42098.6). Total num frames: 941441024. Throughput: 0: 42272.0. Samples: 941518400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 04:23:16,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 04:23:20,342][12883] Updated weights for policy 0, policy_version 57471 (0.0042) [2024-06-18 04:23:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 941670400. Throughput: 0: 42162.2. Samples: 941773160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 04:23:21,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 04:23:21,995][12862] Saving new best policy, reward=0.493! [2024-06-18 04:23:24,387][12883] Updated weights for policy 0, policy_version 57481 (0.0033) [2024-06-18 04:23:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 941850624. Throughput: 0: 42566.7. Samples: 942033220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:23:26,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 04:23:28,090][12883] Updated weights for policy 0, policy_version 57491 (0.0035) [2024-06-18 04:23:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 942080000. 
Throughput: 0: 42318.5. Samples: 942156040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:23:31,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 04:23:32,193][12883] Updated weights for policy 0, policy_version 57501 (0.0026) [2024-06-18 04:23:35,802][12883] Updated weights for policy 0, policy_version 57511 (0.0037) [2024-06-18 04:23:36,896][12862] Signal inference workers to stop experience collection... (13600 times) [2024-06-18 04:23:36,946][12883] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-06-18 04:23:36,949][12862] Signal inference workers to resume experience collection... (13600 times) [2024-06-18 04:23:36,958][12883] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-06-18 04:23:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42098.6). Total num frames: 942309376. Throughput: 0: 42399.0. Samples: 942413280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:23:36,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 04:23:39,797][12883] Updated weights for policy 0, policy_version 57521 (0.0038) [2024-06-18 04:23:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 942505984. Throughput: 0: 42516.1. Samples: 942669880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:23:41,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 04:23:43,515][12883] Updated weights for policy 0, policy_version 57531 (0.0030) [2024-06-18 04:23:46,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42323.8, 300 sec: 42153.8). Total num frames: 942718976. Throughput: 0: 42310.6. Samples: 942791680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:23:46,997][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 04:23:47,392][12883] Updated weights for policy 0, policy_version 57541 (0.0036) [2024-06-18 04:23:51,207][12883] Updated weights for policy 0, policy_version 57551 (0.0033) [2024-06-18 04:23:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 942931968. Throughput: 0: 42474.3. Samples: 943048980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:23:51,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 04:23:55,483][12883] Updated weights for policy 0, policy_version 57561 (0.0033) [2024-06-18 04:23:56,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 943144960. Throughput: 0: 42380.4. Samples: 943303680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:23:56,994][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 04:23:59,029][12883] Updated weights for policy 0, policy_version 57571 (0.0027) [2024-06-18 04:24:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42098.6). Total num frames: 943357952. Throughput: 0: 42498.3. Samples: 943430820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 04:24:01,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 04:24:02,943][12883] Updated weights for policy 0, policy_version 57581 (0.0029) [2024-06-18 04:24:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 943554560. Throughput: 0: 42394.2. Samples: 943680900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 04:24:06,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 04:24:07,391][12883] Updated weights for policy 0, policy_version 57591 (0.0038) [2024-06-18 04:24:11,437][12883] Updated weights for policy 0, policy_version 57601 (0.0035) [2024-06-18 04:24:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42154.1). 
Total num frames: 943783936. Throughput: 0: 42230.1. Samples: 943933580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 04:24:11,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 04:24:14,951][12883] Updated weights for policy 0, policy_version 57611 (0.0035) [2024-06-18 04:24:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 943996928. Throughput: 0: 42265.2. Samples: 944057980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 04:24:16,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 04:24:18,979][12883] Updated weights for policy 0, policy_version 57621 (0.0030) [2024-06-18 04:24:21,996][12645] Fps is (10 sec: 40951.3, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 944193536. Throughput: 0: 42260.6. Samples: 944315100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 04:24:21,996][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 04:24:22,570][12883] Updated weights for policy 0, policy_version 57631 (0.0041) [2024-06-18 04:24:26,565][12883] Updated weights for policy 0, policy_version 57641 (0.0035) [2024-06-18 04:24:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 944406528. Throughput: 0: 42124.9. Samples: 944565500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 04:24:26,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 04:24:30,645][12883] Updated weights for policy 0, policy_version 57651 (0.0036) [2024-06-18 04:24:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 944619520. Throughput: 0: 42160.8. Samples: 944688820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 04:24:31,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 04:24:34,356][12883] Updated weights for policy 0, policy_version 57661 (0.0044) [2024-06-18 04:24:36,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42050.7, 300 sec: 42153.8). 
Total num frames: 944832512. Throughput: 0: 42060.9. Samples: 944941820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:24:36,997][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 04:24:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057668_944832512.pth... [2024-06-18 04:24:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057049_934690816.pth [2024-06-18 04:24:38,319][12883] Updated weights for policy 0, policy_version 57671 (0.0028) [2024-06-18 04:24:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42154.5). Total num frames: 945029120. Throughput: 0: 42025.0. Samples: 945194800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:24:41,994][12645] Avg episode reward: [(0, '0.041')] [2024-06-18 04:24:42,019][12883] Updated weights for policy 0, policy_version 57681 (0.0043) [2024-06-18 04:24:46,354][12883] Updated weights for policy 0, policy_version 57691 (0.0038) [2024-06-18 04:24:46,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42053.9, 300 sec: 42098.6). Total num frames: 945242112. Throughput: 0: 41964.4. Samples: 945319220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:24:46,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 04:24:49,930][12883] Updated weights for policy 0, policy_version 57701 (0.0036) [2024-06-18 04:24:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 945455104. Throughput: 0: 42012.9. Samples: 945571480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:24:51,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 04:24:54,101][12883] Updated weights for policy 0, policy_version 57711 (0.0024) [2024-06-18 04:24:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 945668096. Throughput: 0: 42028.5. Samples: 945824860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:24:56,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 04:24:57,624][12883] Updated weights for policy 0, policy_version 57721 (0.0032) [2024-06-18 04:25:01,710][12883] Updated weights for policy 0, policy_version 57731 (0.0033) [2024-06-18 04:25:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42154.4). Total num frames: 945864704. Throughput: 0: 42036.8. Samples: 945949640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:25:01,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 04:25:05,418][12883] Updated weights for policy 0, policy_version 57741 (0.0028) [2024-06-18 04:25:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 946110464. Throughput: 0: 41930.1. Samples: 946201860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:25:06,994][12645] Avg episode reward: [(0, '0.086')] [2024-06-18 04:25:09,414][12883] Updated weights for policy 0, policy_version 57751 (0.0035) [2024-06-18 04:25:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 946290688. Throughput: 0: 41975.6. Samples: 946454400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:25:11,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 04:25:13,451][12883] Updated weights for policy 0, policy_version 57761 (0.0032) [2024-06-18 04:25:13,637][12862] Signal inference workers to stop experience collection... (13650 times) [2024-06-18 04:25:13,660][12883] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-06-18 04:25:13,698][12862] Signal inference workers to resume experience collection... (13650 times) [2024-06-18 04:25:13,698][12883] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-06-18 04:25:16,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 946487296. Throughput: 0: 41837.4. 
Samples: 946571500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:25:16,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 04:25:17,378][12883] Updated weights for policy 0, policy_version 57771 (0.0042) [2024-06-18 04:25:21,105][12883] Updated weights for policy 0, policy_version 57781 (0.0032) [2024-06-18 04:25:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42600.0, 300 sec: 42265.5). Total num frames: 946749440. Throughput: 0: 42062.2. Samples: 946834520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:25:21,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 04:25:25,072][12883] Updated weights for policy 0, policy_version 57791 (0.0028) [2024-06-18 04:25:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 946929664. Throughput: 0: 41972.8. Samples: 947083580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:25:26,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 04:25:29,065][12883] Updated weights for policy 0, policy_version 57801 (0.0039) [2024-06-18 04:25:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 947126272. Throughput: 0: 41881.4. Samples: 947203880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:25:31,994][12645] Avg episode reward: [(0, '0.205')] [2024-06-18 04:25:33,326][12883] Updated weights for policy 0, policy_version 57811 (0.0026) [2024-06-18 04:25:36,644][12883] Updated weights for policy 0, policy_version 57821 (0.0029) [2024-06-18 04:25:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41780.8, 300 sec: 42209.6). Total num frames: 947339264. Throughput: 0: 41881.8. Samples: 947456160. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:25:36,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 04:25:41,149][12883] Updated weights for policy 0, policy_version 57831 (0.0025) [2024-06-18 04:25:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 947552256. Throughput: 0: 41972.0. Samples: 947713600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:25:41,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 04:25:44,603][12883] Updated weights for policy 0, policy_version 57841 (0.0031) [2024-06-18 04:25:46,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42050.7, 300 sec: 42153.7). Total num frames: 947765248. Throughput: 0: 41903.7. Samples: 947835400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:25:46,997][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 04:25:49,042][12883] Updated weights for policy 0, policy_version 57851 (0.0046) [2024-06-18 04:25:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 947978240. Throughput: 0: 42077.8. Samples: 948095360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:25:51,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 04:25:52,146][12883] Updated weights for policy 0, policy_version 57861 (0.0041) [2024-06-18 04:25:56,649][12883] Updated weights for policy 0, policy_version 57871 (0.0036) [2024-06-18 04:25:56,994][12645] Fps is (10 sec: 40969.8, 60 sec: 41779.3, 300 sec: 42154.4). Total num frames: 948174848. Throughput: 0: 42207.6. Samples: 948353740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:25:56,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 04:25:59,791][12883] Updated weights for policy 0, policy_version 57881 (0.0047) [2024-06-18 04:26:01,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 948420608. Throughput: 0: 42325.7. Samples: 948476160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:26:01,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 04:26:04,256][12883] Updated weights for policy 0, policy_version 57891 (0.0036) [2024-06-18 04:26:06,996][12645] Fps is (10 sec: 42588.3, 60 sec: 41504.6, 300 sec: 42098.2). Total num frames: 948600832. Throughput: 0: 42148.0. Samples: 948731280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:26:06,997][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 04:26:07,618][12883] Updated weights for policy 0, policy_version 57901 (0.0045) [2024-06-18 04:26:11,996][12645] Fps is (10 sec: 37674.8, 60 sec: 41777.6, 300 sec: 42153.8). Total num frames: 948797440. Throughput: 0: 42292.0. Samples: 948986820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:26:11,997][12645] Avg episode reward: [(0, '0.081')] [2024-06-18 04:26:12,246][12883] Updated weights for policy 0, policy_version 57911 (0.0032) [2024-06-18 04:26:15,412][12883] Updated weights for policy 0, policy_version 57921 (0.0038) [2024-06-18 04:26:16,994][12645] Fps is (10 sec: 47524.2, 60 sec: 43144.5, 300 sec: 42321.0). Total num frames: 949075968. Throughput: 0: 42344.3. Samples: 949109380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:26:16,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 04:26:20,050][12883] Updated weights for policy 0, policy_version 57931 (0.0040) [2024-06-18 04:26:21,994][12645] Fps is (10 sec: 40969.6, 60 sec: 40960.0, 300 sec: 41987.5). Total num frames: 949207040. Throughput: 0: 42232.9. Samples: 949356640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:26:21,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 04:26:23,228][12883] Updated weights for policy 0, policy_version 57941 (0.0036) [2024-06-18 04:26:26,994][12645] Fps is (10 sec: 34406.6, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 949420032. Throughput: 0: 42024.9. Samples: 949604720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:26:26,994][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 04:26:27,956][12883] Updated weights for policy 0, policy_version 57951 (0.0037) [2024-06-18 04:26:31,118][12883] Updated weights for policy 0, policy_version 57961 (0.0027) [2024-06-18 04:26:31,800][12862] Signal inference workers to stop experience collection... (13700 times) [2024-06-18 04:26:31,831][12883] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-06-18 04:26:31,854][12862] Signal inference workers to resume experience collection... (13700 times) [2024-06-18 04:26:31,855][12883] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-06-18 04:26:31,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42598.4, 300 sec: 42154.4). Total num frames: 949682176. Throughput: 0: 42173.3. Samples: 949733100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:26:31,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 04:26:35,800][12883] Updated weights for policy 0, policy_version 57971 (0.0041) [2024-06-18 04:26:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 949829632. Throughput: 0: 42026.7. Samples: 949986560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:26:36,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 04:26:37,040][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057974_949846016.pth... [2024-06-18 04:26:37,103][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057359_939769856.pth [2024-06-18 04:26:39,052][12883] Updated weights for policy 0, policy_version 57981 (0.0034) [2024-06-18 04:26:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 950075392. Throughput: 0: 41576.9. Samples: 950224700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:26:41,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 04:26:44,061][12883] Updated weights for policy 0, policy_version 57991 (0.0034) [2024-06-18 04:26:46,691][12883] Updated weights for policy 0, policy_version 58001 (0.0026) [2024-06-18 04:26:46,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42326.9, 300 sec: 42098.6). Total num frames: 950304768. Throughput: 0: 41945.8. Samples: 950363720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:26:46,994][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 04:26:51,592][12883] Updated weights for policy 0, policy_version 58011 (0.0032) [2024-06-18 04:26:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 950452224. Throughput: 0: 41705.3. Samples: 950607920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:26:51,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 04:26:54,386][12883] Updated weights for policy 0, policy_version 58021 (0.0029) [2024-06-18 04:26:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 950714368. Throughput: 0: 41555.9. Samples: 950856740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:26:56,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 04:26:59,017][12883] Updated weights for policy 0, policy_version 58031 (0.0026) [2024-06-18 04:27:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 41506.3, 300 sec: 42043.0). Total num frames: 950910976. Throughput: 0: 41853.9. Samples: 950992800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 04:27:01,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 04:27:02,166][12883] Updated weights for policy 0, policy_version 58041 (0.0048) [2024-06-18 04:27:06,603][12883] Updated weights for policy 0, policy_version 58051 (0.0049) [2024-06-18 04:27:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41780.8, 300 sec: 42098.9). 
Total num frames: 951107584. Throughput: 0: 41714.7. Samples: 951233800. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 04:27:06,994][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 04:27:10,245][12883] Updated weights for policy 0, policy_version 58061 (0.0033) [2024-06-18 04:27:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42327.0, 300 sec: 42098.9). Total num frames: 951336960. Throughput: 0: 41863.6. Samples: 951488580. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 04:27:11,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 04:27:13,984][12883] Updated weights for policy 0, policy_version 58071 (0.0033) [2024-06-18 04:27:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 40960.1, 300 sec: 42098.5). Total num frames: 951533568. Throughput: 0: 41854.2. Samples: 951616540. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 04:27:16,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 04:27:17,888][12883] Updated weights for policy 0, policy_version 58081 (0.0042) [2024-06-18 04:27:21,656][12883] Updated weights for policy 0, policy_version 58091 (0.0030) [2024-06-18 04:27:21,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 951762944. Throughput: 0: 41804.1. Samples: 951867840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 04:27:21,996][12645] Avg episode reward: [(0, '0.086')] [2024-06-18 04:27:25,670][12883] Updated weights for policy 0, policy_version 58101 (0.0044) [2024-06-18 04:27:26,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 951975936. Throughput: 0: 42045.4. Samples: 952116840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 04:27:26,996][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 04:27:29,579][12883] Updated weights for policy 0, policy_version 58111 (0.0036) [2024-06-18 04:27:31,996][12645] Fps is (10 sec: 37683.3, 60 sec: 40958.4, 300 sec: 42042.7). Total num frames: 952139776. 
Throughput: 0: 41805.1. Samples: 952245040. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 04:27:31,996][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 04:27:33,241][12883] Updated weights for policy 0, policy_version 58121 (0.0029) [2024-06-18 04:27:36,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42871.4, 300 sec: 42098.5). Total num frames: 952401920. Throughput: 0: 42012.3. Samples: 952498480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 04:27:36,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 04:27:38,111][12883] Updated weights for policy 0, policy_version 58131 (0.0040) [2024-06-18 04:27:41,133][12883] Updated weights for policy 0, policy_version 58141 (0.0030) [2024-06-18 04:27:41,995][12645] Fps is (10 sec: 47518.4, 60 sec: 42324.4, 300 sec: 42153.9). Total num frames: 952614912. Throughput: 0: 42065.9. Samples: 952749760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 04:27:41,995][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 04:27:45,801][12883] Updated weights for policy 0, policy_version 58151 (0.0040) [2024-06-18 04:27:46,994][12645] Fps is (10 sec: 37683.8, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 952778752. Throughput: 0: 41886.6. Samples: 952877700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 04:27:46,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 04:27:48,930][12883] Updated weights for policy 0, policy_version 58161 (0.0032) [2024-06-18 04:27:51,996][12645] Fps is (10 sec: 40955.7, 60 sec: 42869.8, 300 sec: 42042.7). Total num frames: 953024512. Throughput: 0: 42119.6. Samples: 953129280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 04:27:51,997][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 04:27:53,711][12883] Updated weights for policy 0, policy_version 58171 (0.0031) [2024-06-18 04:27:55,750][12862] Signal inference workers to stop experience collection... 
(13750 times) [2024-06-18 04:27:55,785][12883] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-06-18 04:27:55,805][12862] Signal inference workers to resume experience collection... (13750 times) [2024-06-18 04:27:55,808][12883] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-06-18 04:27:56,765][12883] Updated weights for policy 0, policy_version 58181 (0.0033) [2024-06-18 04:27:56,996][12645] Fps is (10 sec: 45864.3, 60 sec: 42050.6, 300 sec: 42153.7). Total num frames: 953237504. Throughput: 0: 42008.9. Samples: 953379080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 04:27:56,997][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 04:28:01,479][12883] Updated weights for policy 0, policy_version 58191 (0.0039) [2024-06-18 04:28:01,994][12645] Fps is (10 sec: 37691.4, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 953401344. Throughput: 0: 41999.8. Samples: 953506540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 04:28:01,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 04:28:04,574][12883] Updated weights for policy 0, policy_version 58201 (0.0042) [2024-06-18 04:28:06,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 953663488. Throughput: 0: 42054.0. Samples: 953760180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 04:28:06,994][12645] Avg episode reward: [(0, '0.055')] [2024-06-18 04:28:09,287][12883] Updated weights for policy 0, policy_version 58211 (0.0029) [2024-06-18 04:28:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 953860096. Throughput: 0: 42236.3. Samples: 954017380. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 04:28:11,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 04:28:12,475][12883] Updated weights for policy 0, policy_version 58221 (0.0036) [2024-06-18 04:28:16,994][12645] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 954040320. Throughput: 0: 42179.9. Samples: 954143040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 04:28:16,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 04:28:17,331][12883] Updated weights for policy 0, policy_version 58231 (0.0027) [2024-06-18 04:28:20,281][12883] Updated weights for policy 0, policy_version 58241 (0.0054) [2024-06-18 04:28:22,000][12645] Fps is (10 sec: 44209.4, 60 sec: 42322.5, 300 sec: 42208.7). Total num frames: 954302464. Throughput: 0: 42095.1. Samples: 954393020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 04:28:22,001][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 04:28:25,156][12883] Updated weights for policy 0, policy_version 58251 (0.0048) [2024-06-18 04:28:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42053.8, 300 sec: 42098.5). Total num frames: 954499072. Throughput: 0: 42129.6. Samples: 954645540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 04:28:26,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 04:28:28,034][12883] Updated weights for policy 0, policy_version 58261 (0.0041) [2024-06-18 04:28:31,994][12645] Fps is (10 sec: 39346.6, 60 sec: 42600.0, 300 sec: 41987.5). Total num frames: 954695680. Throughput: 0: 41934.7. Samples: 954764760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 04:28:31,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 04:28:33,004][12883] Updated weights for policy 0, policy_version 58271 (0.0044) [2024-06-18 04:28:35,610][12883] Updated weights for policy 0, policy_version 58281 (0.0033) [2024-06-18 04:28:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42098.5). 
Total num frames: 954925056. Throughput: 0: 42029.3. Samples: 955020500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 04:28:36,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 04:28:37,138][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058285_954941440.pth... [2024-06-18 04:28:37,193][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057668_944832512.pth [2024-06-18 04:28:40,517][12883] Updated weights for policy 0, policy_version 58291 (0.0032) [2024-06-18 04:28:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41507.0, 300 sec: 41987.8). Total num frames: 955105280. Throughput: 0: 42331.1. Samples: 955283880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 04:28:41,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 04:28:43,285][12883] Updated weights for policy 0, policy_version 58301 (0.0035) [2024-06-18 04:28:46,996][12645] Fps is (10 sec: 37674.5, 60 sec: 42050.6, 300 sec: 41931.6). Total num frames: 955301888. Throughput: 0: 42076.2. Samples: 955400060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 04:28:46,996][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 04:28:48,033][12883] Updated weights for policy 0, policy_version 58311 (0.0032) [2024-06-18 04:28:51,554][12883] Updated weights for policy 0, policy_version 58321 (0.0035) [2024-06-18 04:28:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42053.9, 300 sec: 42043.0). Total num frames: 955547648. Throughput: 0: 42120.2. Samples: 955655580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 04:28:51,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 04:28:55,901][12883] Updated weights for policy 0, policy_version 58331 (0.0029) [2024-06-18 04:28:56,994][12645] Fps is (10 sec: 42608.5, 60 sec: 41507.8, 300 sec: 41931.9). Total num frames: 955727872. Throughput: 0: 42108.2. Samples: 955912240. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 04:28:56,994][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 04:28:59,212][12883] Updated weights for policy 0, policy_version 58341 (0.0039) [2024-06-18 04:29:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 955957248. Throughput: 0: 42037.7. Samples: 956034740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 04:29:01,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 04:29:03,339][12883] Updated weights for policy 0, policy_version 58351 (0.0039) [2024-06-18 04:29:06,990][12883] Updated weights for policy 0, policy_version 58361 (0.0023) [2024-06-18 04:29:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 956186624. Throughput: 0: 42227.8. Samples: 956293000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 04:29:06,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 04:29:10,729][12862] Signal inference workers to stop experience collection... (13800 times) [2024-06-18 04:29:10,729][12862] Signal inference workers to resume experience collection... (13800 times) [2024-06-18 04:29:10,760][12883] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-06-18 04:29:10,764][12883] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-06-18 04:29:11,055][12883] Updated weights for policy 0, policy_version 58371 (0.0030) [2024-06-18 04:29:11,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 956399616. Throughput: 0: 42297.0. Samples: 956548900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 04:29:11,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 04:29:14,527][12883] Updated weights for policy 0, policy_version 58381 (0.0038) [2024-06-18 04:29:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42043.3). Total num frames: 956596224. Throughput: 0: 42330.2. 
Samples: 956669620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 04:29:16,994][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 04:29:19,190][12883] Updated weights for policy 0, policy_version 58391 (0.0034) [2024-06-18 04:29:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41783.6, 300 sec: 42043.0). Total num frames: 956809216. Throughput: 0: 42291.6. Samples: 956923620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 04:29:21,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 04:29:22,161][12883] Updated weights for policy 0, policy_version 58401 (0.0040) [2024-06-18 04:29:26,740][12883] Updated weights for policy 0, policy_version 58411 (0.0030) [2024-06-18 04:29:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 957005824. Throughput: 0: 42233.3. Samples: 957184380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 04:29:26,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 04:29:30,213][12883] Updated weights for policy 0, policy_version 58421 (0.0043) [2024-06-18 04:29:31,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42596.8, 300 sec: 42098.6). Total num frames: 957251584. Throughput: 0: 42240.5. Samples: 957300880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 04:29:31,996][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 04:29:34,327][12883] Updated weights for policy 0, policy_version 58431 (0.0034) [2024-06-18 04:29:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 957448192. Throughput: 0: 42264.3. Samples: 957557480. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 04:29:36,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 04:29:37,942][12883] Updated weights for policy 0, policy_version 58441 (0.0041) [2024-06-18 04:29:41,923][12883] Updated weights for policy 0, policy_version 58451 (0.0038) [2024-06-18 04:29:41,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 957661184. Throughput: 0: 42356.7. Samples: 957818300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 04:29:41,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 04:29:45,534][12883] Updated weights for policy 0, policy_version 58461 (0.0045) [2024-06-18 04:29:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42098.6). Total num frames: 957874176. Throughput: 0: 42430.7. Samples: 957944120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 04:29:46,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 04:29:49,474][12883] Updated weights for policy 0, policy_version 58471 (0.0024) [2024-06-18 04:29:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 958087168. Throughput: 0: 42423.9. Samples: 958202080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 04:29:51,994][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 04:29:53,410][12883] Updated weights for policy 0, policy_version 58481 (0.0039) [2024-06-18 04:29:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 958300160. Throughput: 0: 42401.3. Samples: 958456960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 04:29:56,994][12645] Avg episode reward: [(0, '0.105')] [2024-06-18 04:29:57,312][12883] Updated weights for policy 0, policy_version 58491 (0.0035) [2024-06-18 04:30:00,966][12883] Updated weights for policy 0, policy_version 58501 (0.0030) [2024-06-18 04:30:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 41987.5). 
Total num frames: 958496768. Throughput: 0: 42475.1. Samples: 958581000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:30:01,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 04:30:05,171][12883] Updated weights for policy 0, policy_version 58511 (0.0022) [2024-06-18 04:30:06,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42153.8). Total num frames: 958726144. Throughput: 0: 42408.4. Samples: 958832100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:30:06,996][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 04:30:08,631][12883] Updated weights for policy 0, policy_version 58521 (0.0038) [2024-06-18 04:30:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 958922752. Throughput: 0: 42316.1. Samples: 959088600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:30:11,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 04:30:13,005][12883] Updated weights for policy 0, policy_version 58531 (0.0028) [2024-06-18 04:30:16,584][12883] Updated weights for policy 0, policy_version 58541 (0.0038) [2024-06-18 04:30:16,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 959135744. Throughput: 0: 42435.0. Samples: 959210360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:30:16,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 04:30:20,595][12883] Updated weights for policy 0, policy_version 58551 (0.0027) [2024-06-18 04:30:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 959348736. Throughput: 0: 42353.1. Samples: 959463460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:30:21,996][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 04:30:24,290][12883] Updated weights for policy 0, policy_version 58561 (0.0026) [2024-06-18 04:30:26,995][12645] Fps is (10 sec: 40953.9, 60 sec: 42324.3, 300 sec: 42098.3). 
Total num frames: 959545344. Throughput: 0: 42343.1. Samples: 959723800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:30:26,996][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 04:30:28,438][12883] Updated weights for policy 0, policy_version 58571 (0.0045) [2024-06-18 04:30:31,942][12883] Updated weights for policy 0, policy_version 58581 (0.0051) [2024-06-18 04:30:31,994][12645] Fps is (10 sec: 44245.9, 60 sec: 42326.8, 300 sec: 42209.6). Total num frames: 959791104. Throughput: 0: 42209.7. Samples: 959843560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:30:31,995][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 04:30:36,158][12883] Updated weights for policy 0, policy_version 58591 (0.0042) [2024-06-18 04:30:36,996][12645] Fps is (10 sec: 42595.4, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 959971328. Throughput: 0: 42177.5. Samples: 960100160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 04:30:36,996][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 04:30:37,028][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058593_959987712.pth... [2024-06-18 04:30:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000057974_949846016.pth [2024-06-18 04:30:39,583][12883] Updated weights for policy 0, policy_version 58601 (0.0037) [2024-06-18 04:30:41,994][12645] Fps is (10 sec: 37684.0, 60 sec: 41779.3, 300 sec: 42043.4). Total num frames: 960167936. Throughput: 0: 42246.3. Samples: 960358040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 04:30:41,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 04:30:42,400][12862] Signal inference workers to stop experience collection... (13850 times) [2024-06-18 04:30:42,400][12862] Signal inference workers to resume experience collection... 
(13850 times) [2024-06-18 04:30:42,420][12883] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-06-18 04:30:42,420][12883] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-06-18 04:30:44,425][12883] Updated weights for policy 0, policy_version 58611 (0.0029) [2024-06-18 04:30:46,996][12645] Fps is (10 sec: 44236.6, 60 sec: 42323.7, 300 sec: 42153.8). Total num frames: 960413696. Throughput: 0: 42286.8. Samples: 960484000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 04:30:46,996][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 04:30:47,441][12883] Updated weights for policy 0, policy_version 58621 (0.0036) [2024-06-18 04:30:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 960593920. Throughput: 0: 42244.3. Samples: 960733000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 04:30:51,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 04:30:52,007][12883] Updated weights for policy 0, policy_version 58631 (0.0034) [2024-06-18 04:30:55,059][12883] Updated weights for policy 0, policy_version 58641 (0.0022) [2024-06-18 04:30:56,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 960823296. Throughput: 0: 42114.2. Samples: 960983740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 04:30:56,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 04:30:59,792][12883] Updated weights for policy 0, policy_version 58651 (0.0038) [2024-06-18 04:31:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42154.4). Total num frames: 961036288. Throughput: 0: 42211.6. Samples: 961109880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 04:31:01,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 04:31:02,889][12883] Updated weights for policy 0, policy_version 58661 (0.0032) [2024-06-18 04:31:06,996][12645] Fps is (10 sec: 40950.6, 60 sec: 41779.2, 300 sec: 42154.1). 
Total num frames: 961232896. Throughput: 0: 42235.9. Samples: 961364080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 04:31:06,997][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 04:31:07,447][12883] Updated weights for policy 0, policy_version 58671 (0.0028) [2024-06-18 04:31:10,732][12883] Updated weights for policy 0, policy_version 58681 (0.0032) [2024-06-18 04:31:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 961462272. Throughput: 0: 41838.3. Samples: 961606460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 04:31:11,994][12645] Avg episode reward: [(0, '0.154')] [2024-06-18 04:31:15,131][12883] Updated weights for policy 0, policy_version 58691 (0.0035) [2024-06-18 04:31:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 961658880. Throughput: 0: 42176.1. Samples: 961741480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-18 04:31:16,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 04:31:18,721][12883] Updated weights for policy 0, policy_version 58701 (0.0030) [2024-06-18 04:31:22,000][12645] Fps is (10 sec: 40934.3, 60 sec: 42049.4, 300 sec: 42208.7). Total num frames: 961871872. Throughput: 0: 42065.1. Samples: 961993260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-18 04:31:22,001][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 04:31:23,087][12883] Updated weights for policy 0, policy_version 58711 (0.0030) [2024-06-18 04:31:26,575][12883] Updated weights for policy 0, policy_version 58721 (0.0035) [2024-06-18 04:31:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42326.3, 300 sec: 42043.0). Total num frames: 962084864. Throughput: 0: 41899.4. Samples: 962243520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-18 04:31:26,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 04:31:30,746][12883] Updated weights for policy 0, policy_version 58731 (0.0027) [2024-06-18 04:31:31,994][12645] Fps is (10 sec: 42625.3, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 962297856. Throughput: 0: 42031.0. Samples: 962375300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-18 04:31:31,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 04:31:34,714][12883] Updated weights for policy 0, policy_version 58741 (0.0031) [2024-06-18 04:31:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42599.9, 300 sec: 42209.6). Total num frames: 962527232. Throughput: 0: 42042.1. Samples: 962624900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-18 04:31:36,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 04:31:38,359][12883] Updated weights for policy 0, policy_version 58751 (0.0022) [2024-06-18 04:31:41,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42050.6, 300 sec: 41987.2). Total num frames: 962691072. Throughput: 0: 42203.2. Samples: 962882980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-18 04:31:41,996][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 04:31:42,417][12883] Updated weights for policy 0, policy_version 58761 (0.0037) [2024-06-18 04:31:46,093][12883] Updated weights for policy 0, policy_version 58771 (0.0032) [2024-06-18 04:31:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42053.8, 300 sec: 42320.7). Total num frames: 962936832. Throughput: 0: 42057.3. Samples: 963002460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 23.0) [2024-06-18 04:31:46,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 04:31:50,202][12883] Updated weights for policy 0, policy_version 58781 (0.0032) [2024-06-18 04:31:52,000][12645] Fps is (10 sec: 45856.8, 60 sec: 42594.0, 300 sec: 42153.2). Total num frames: 963149824. Throughput: 0: 42144.7. Samples: 963260760. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:31:52,000][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 04:31:53,757][12883] Updated weights for policy 0, policy_version 58791 (0.0036) [2024-06-18 04:31:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 963346432. Throughput: 0: 42495.2. Samples: 963518740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:31:56,994][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 04:31:58,143][12883] Updated weights for policy 0, policy_version 58801 (0.0029) [2024-06-18 04:31:58,559][12862] Signal inference workers to stop experience collection... (13900 times) [2024-06-18 04:31:58,616][12883] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-06-18 04:31:58,616][12862] Signal inference workers to resume experience collection... (13900 times) [2024-06-18 04:31:58,634][12883] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-06-18 04:32:01,401][12883] Updated weights for policy 0, policy_version 58811 (0.0043) [2024-06-18 04:32:01,994][12645] Fps is (10 sec: 42624.8, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 963575808. Throughput: 0: 42094.2. Samples: 963635720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:32:01,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 04:32:05,924][12883] Updated weights for policy 0, policy_version 58821 (0.0029) [2024-06-18 04:32:06,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42598.4, 300 sec: 42209.3). Total num frames: 963788800. Throughput: 0: 42158.8. Samples: 963890240. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:32:06,997][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 04:32:09,228][12883] Updated weights for policy 0, policy_version 58831 (0.0043) [2024-06-18 04:32:11,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 963952640. Throughput: 0: 42208.4. 
Samples: 964142900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:32:11,994][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 04:32:13,541][12883] Updated weights for policy 0, policy_version 58841 (0.0033) [2024-06-18 04:32:16,887][12883] Updated weights for policy 0, policy_version 58851 (0.0042) [2024-06-18 04:32:16,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42209.9). Total num frames: 964214784. Throughput: 0: 41899.1. Samples: 964260760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:32:16,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 04:32:21,237][12883] Updated weights for policy 0, policy_version 58861 (0.0036) [2024-06-18 04:32:21,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42329.8, 300 sec: 42154.4). Total num frames: 964411392. Throughput: 0: 42188.1. Samples: 964523360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:32:21,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 04:32:24,416][12883] Updated weights for policy 0, policy_version 58871 (0.0031) [2024-06-18 04:32:26,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42209.9). Total num frames: 964591616. Throughput: 0: 41953.6. Samples: 964770800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:32:26,994][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 04:32:29,010][12883] Updated weights for policy 0, policy_version 58881 (0.0032) [2024-06-18 04:32:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 964820992. Throughput: 0: 42016.5. Samples: 964893200. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-18 04:32:31,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 04:32:32,701][12883] Updated weights for policy 0, policy_version 58891 (0.0039) [2024-06-18 04:32:36,854][12883] Updated weights for policy 0, policy_version 58901 (0.0037) [2024-06-18 04:32:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 42098.7). Total num frames: 965033984. Throughput: 0: 42022.8. Samples: 965151520. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-18 04:32:36,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 04:32:37,027][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058901_965033984.pth... [2024-06-18 04:32:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058285_954941440.pth [2024-06-18 04:32:40,558][12883] Updated weights for policy 0, policy_version 58911 (0.0052) [2024-06-18 04:32:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 965230592. Throughput: 0: 41754.2. Samples: 965397680. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-18 04:32:41,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 04:32:44,558][12883] Updated weights for policy 0, policy_version 58921 (0.0038) [2024-06-18 04:32:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 965459968. Throughput: 0: 41780.5. Samples: 965515840. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-18 04:32:46,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 04:32:48,462][12883] Updated weights for policy 0, policy_version 58931 (0.0040) [2024-06-18 04:32:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41510.5, 300 sec: 42043.3). Total num frames: 965640192. Throughput: 0: 41862.2. Samples: 965773940. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-18 04:32:51,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 04:32:52,767][12883] Updated weights for policy 0, policy_version 58941 (0.0026) [2024-06-18 04:32:56,561][12883] Updated weights for policy 0, policy_version 58951 (0.0037) [2024-06-18 04:32:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 965853184. Throughput: 0: 41636.5. Samples: 966016540. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-18 04:32:56,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 04:33:00,442][12883] Updated weights for policy 0, policy_version 58961 (0.0037) [2024-06-18 04:33:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 966066176. Throughput: 0: 41873.9. Samples: 966145080. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-18 04:33:01,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 04:33:04,535][12883] Updated weights for policy 0, policy_version 58971 (0.0036) [2024-06-18 04:33:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41234.7, 300 sec: 42043.0). Total num frames: 966262784. Throughput: 0: 41530.2. Samples: 966392220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 04:33:06,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 04:33:08,437][12883] Updated weights for policy 0, policy_version 58981 (0.0028) [2024-06-18 04:33:11,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42323.9, 300 sec: 42209.3). Total num frames: 966492160. Throughput: 0: 41592.7. Samples: 966642560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 04:33:11,996][12645] Avg episode reward: [(0, '0.124')] [2024-06-18 04:33:12,126][12883] Updated weights for policy 0, policy_version 58991 (0.0026) [2024-06-18 04:33:15,997][12883] Updated weights for policy 0, policy_version 59001 (0.0027) [2024-06-18 04:33:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41932.8). 
Total num frames: 966672384. Throughput: 0: 41789.3. Samples: 966773720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 04:33:16,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 04:33:17,000][12862] Signal inference workers to stop experience collection... (13950 times) [2024-06-18 04:33:17,010][12883] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-06-18 04:33:17,055][12862] Signal inference workers to resume experience collection... (13950 times) [2024-06-18 04:33:17,056][12883] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-06-18 04:33:19,851][12883] Updated weights for policy 0, policy_version 59011 (0.0042) [2024-06-18 04:33:21,994][12645] Fps is (10 sec: 37691.5, 60 sec: 40960.0, 300 sec: 41931.9). Total num frames: 966868992. Throughput: 0: 41519.1. Samples: 967019880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 04:33:21,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 04:33:23,722][12883] Updated weights for policy 0, policy_version 59021 (0.0034) [2024-06-18 04:33:26,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 967114752. Throughput: 0: 41616.6. Samples: 967270520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 04:33:26,996][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 04:33:28,037][12883] Updated weights for policy 0, policy_version 59031 (0.0035) [2024-06-18 04:33:31,810][12883] Updated weights for policy 0, policy_version 59041 (0.0029) [2024-06-18 04:33:31,994][12645] Fps is (10 sec: 45874.6, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 967327744. Throughput: 0: 41971.0. Samples: 967404540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 04:33:31,994][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 04:33:35,555][12883] Updated weights for policy 0, policy_version 59051 (0.0042) [2024-06-18 04:33:36,996][12645] Fps is (10 sec: 40960.2, 60 sec: 41504.5, 300 sec: 42098.2). Total num frames: 967524352. Throughput: 0: 41688.5. Samples: 967650020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 04:33:36,996][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 04:33:39,527][12883] Updated weights for policy 0, policy_version 59061 (0.0029) [2024-06-18 04:33:42,000][12645] Fps is (10 sec: 42572.0, 60 sec: 42047.9, 300 sec: 42209.1). Total num frames: 967753728. Throughput: 0: 41913.3. Samples: 967902900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 04:33:42,001][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 04:33:43,462][12883] Updated weights for policy 0, policy_version 59071 (0.0046) [2024-06-18 04:33:46,994][12645] Fps is (10 sec: 44246.4, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 967966720. Throughput: 0: 41919.8. Samples: 968031480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 04:33:46,994][12645] Avg episode reward: [(0, '0.145')] [2024-06-18 04:33:47,160][12883] Updated weights for policy 0, policy_version 59081 (0.0052) [2024-06-18 04:33:51,606][12883] Updated weights for policy 0, policy_version 59091 (0.0037) [2024-06-18 04:33:51,994][12645] Fps is (10 sec: 40985.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 968163328. Throughput: 0: 42015.9. Samples: 968282940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 04:33:51,994][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 04:33:55,012][12883] Updated weights for policy 0, policy_version 59101 (0.0036) [2024-06-18 04:33:56,996][12645] Fps is (10 sec: 42590.4, 60 sec: 42324.0, 300 sec: 42153.8). Total num frames: 968392704. Throughput: 0: 41839.3. Samples: 968525320. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 04:33:56,996][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 04:33:59,517][12883] Updated weights for policy 0, policy_version 59111 (0.0044) [2024-06-18 04:34:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 968572928. Throughput: 0: 41861.8. Samples: 968657500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 04:34:01,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 04:34:02,789][12883] Updated weights for policy 0, policy_version 59121 (0.0027) [2024-06-18 04:34:07,000][12645] Fps is (10 sec: 39304.6, 60 sec: 42047.9, 300 sec: 41986.6). Total num frames: 968785920. Throughput: 0: 41980.8. Samples: 968909280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 04:34:07,000][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 04:34:07,304][12883] Updated weights for policy 0, policy_version 59131 (0.0026) [2024-06-18 04:34:10,690][12883] Updated weights for policy 0, policy_version 59141 (0.0029) [2024-06-18 04:34:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42053.7, 300 sec: 42098.5). Total num frames: 969015296. Throughput: 0: 41749.6. Samples: 969149160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 04:34:11,995][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 04:34:15,462][12883] Updated weights for policy 0, policy_version 59151 (0.0042) [2024-06-18 04:34:17,000][12645] Fps is (10 sec: 40960.0, 60 sec: 42047.9, 300 sec: 41986.6). Total num frames: 969195520. Throughput: 0: 41757.8. Samples: 969283900. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 04:34:17,000][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 04:34:18,605][12883] Updated weights for policy 0, policy_version 59161 (0.0025) [2024-06-18 04:34:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 969408512. Throughput: 0: 41733.6. Samples: 969527940. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 04:34:21,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 04:34:23,351][12883] Updated weights for policy 0, policy_version 59171 (0.0024) [2024-06-18 04:34:25,321][12862] Signal inference workers to stop experience collection... (14000 times) [2024-06-18 04:34:25,321][12862] Signal inference workers to resume experience collection... (14000 times) [2024-06-18 04:34:25,362][12883] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-06-18 04:34:25,362][12883] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-06-18 04:34:26,537][12883] Updated weights for policy 0, policy_version 59181 (0.0040) [2024-06-18 04:34:26,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42053.9, 300 sec: 41987.8). Total num frames: 969637888. Throughput: 0: 41633.3. Samples: 969776140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 04:34:26,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 04:34:31,187][12883] Updated weights for policy 0, policy_version 59191 (0.0033) [2024-06-18 04:34:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 969818112. Throughput: 0: 41682.6. Samples: 969907200. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 04:34:31,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 04:34:34,375][12883] Updated weights for policy 0, policy_version 59201 (0.0026) [2024-06-18 04:34:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42053.8, 300 sec: 41987.5). Total num frames: 970047488. Throughput: 0: 41374.7. Samples: 970144800. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 04:34:36,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 04:34:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059207_970047488.pth... 
[2024-06-18 04:34:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058593_959987712.pth [2024-06-18 04:34:39,302][12883] Updated weights for policy 0, policy_version 59211 (0.0030) [2024-06-18 04:34:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41510.5, 300 sec: 41931.9). Total num frames: 970244096. Throughput: 0: 41632.0. Samples: 970398680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 04:34:41,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 04:34:42,606][12883] Updated weights for policy 0, policy_version 59221 (0.0029) [2024-06-18 04:34:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 40960.1, 300 sec: 41820.9). Total num frames: 970424320. Throughput: 0: 41386.7. Samples: 970519900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 04:34:46,994][12645] Avg episode reward: [(0, '0.075')] [2024-06-18 04:34:47,707][12883] Updated weights for policy 0, policy_version 59231 (0.0037) [2024-06-18 04:34:50,404][12883] Updated weights for policy 0, policy_version 59241 (0.0036) [2024-06-18 04:34:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 970686464. Throughput: 0: 41343.6. Samples: 970769480. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 04:34:51,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 04:34:55,399][12883] Updated weights for policy 0, policy_version 59251 (0.0029) [2024-06-18 04:34:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 40961.4, 300 sec: 41876.4). Total num frames: 970850304. Throughput: 0: 41739.2. Samples: 971027420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:34:56,994][12645] Avg episode reward: [(0, '0.105')] [2024-06-18 04:34:58,097][12883] Updated weights for policy 0, policy_version 59261 (0.0036) [2024-06-18 04:35:01,994][12645] Fps is (10 sec: 37682.9, 60 sec: 41506.1, 300 sec: 41821.2). Total num frames: 971063296. Throughput: 0: 41347.5. Samples: 971144280. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:35:01,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 04:35:03,280][12883] Updated weights for policy 0, policy_version 59271 (0.0031) [2024-06-18 04:35:06,091][12883] Updated weights for policy 0, policy_version 59281 (0.0044) [2024-06-18 04:35:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41783.6, 300 sec: 41931.9). Total num frames: 971292672. Throughput: 0: 41561.8. Samples: 971398220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:35:06,994][12645] Avg episode reward: [(0, '0.065')] [2024-06-18 04:35:10,948][12883] Updated weights for policy 0, policy_version 59291 (0.0028) [2024-06-18 04:35:11,995][12645] Fps is (10 sec: 40954.3, 60 sec: 40959.1, 300 sec: 41820.7). Total num frames: 971472896. Throughput: 0: 41616.0. Samples: 971648920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:35:11,995][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 04:35:13,967][12883] Updated weights for policy 0, policy_version 59301 (0.0041) [2024-06-18 04:35:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41783.6, 300 sec: 41876.7). Total num frames: 971702272. Throughput: 0: 41446.8. Samples: 971772300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:35:16,994][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 04:35:18,681][12883] Updated weights for policy 0, policy_version 59311 (0.0038) [2024-06-18 04:35:21,994][12645] Fps is (10 sec: 42604.4, 60 sec: 41506.1, 300 sec: 41876.6). Total num frames: 971898880. Throughput: 0: 41867.2. Samples: 972028820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:35:21,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 04:35:22,084][12883] Updated weights for policy 0, policy_version 59321 (0.0043) [2024-06-18 04:35:26,373][12883] Updated weights for policy 0, policy_version 59331 (0.0036) [2024-06-18 04:35:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41233.2, 300 sec: 41765.3). 
Total num frames: 972111872. Throughput: 0: 41858.8. Samples: 972282320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:35:26,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 04:35:29,785][12883] Updated weights for policy 0, policy_version 59341 (0.0027) [2024-06-18 04:35:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 41932.2). Total num frames: 972341248. Throughput: 0: 41860.8. Samples: 972403640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-18 04:35:31,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 04:35:33,963][12883] Updated weights for policy 0, policy_version 59351 (0.0034) [2024-06-18 04:35:36,354][12862] Signal inference workers to stop experience collection... (14050 times) [2024-06-18 04:35:36,393][12883] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-06-18 04:35:36,401][12862] Signal inference workers to resume experience collection... (14050 times) [2024-06-18 04:35:36,414][12883] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-06-18 04:35:36,996][12645] Fps is (10 sec: 44226.5, 60 sec: 41777.7, 300 sec: 41987.1). Total num frames: 972554240. Throughput: 0: 41950.3. Samples: 972657340. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-18 04:35:36,996][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 04:35:37,458][12883] Updated weights for policy 0, policy_version 59361 (0.0032) [2024-06-18 04:35:41,989][12883] Updated weights for policy 0, policy_version 59371 (0.0037) [2024-06-18 04:35:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41765.6). Total num frames: 972734464. Throughput: 0: 41856.9. Samples: 972910980. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-18 04:35:41,994][12645] Avg episode reward: [(0, '0.098')] [2024-06-18 04:35:45,198][12883] Updated weights for policy 0, policy_version 59381 (0.0032) [2024-06-18 04:35:46,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 972980224. Throughput: 0: 41935.5. Samples: 973031380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-18 04:35:46,994][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 04:35:49,627][12883] Updated weights for policy 0, policy_version 59391 (0.0023) [2024-06-18 04:35:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 973176832. Throughput: 0: 42030.7. Samples: 973289600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-18 04:35:51,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 04:35:53,117][12883] Updated weights for policy 0, policy_version 59401 (0.0031) [2024-06-18 04:35:56,994][12645] Fps is (10 sec: 39319.7, 60 sec: 42051.8, 300 sec: 41820.8). Total num frames: 973373440. Throughput: 0: 41999.9. Samples: 973538880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-18 04:35:56,995][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 04:35:57,299][12883] Updated weights for policy 0, policy_version 59411 (0.0027) [2024-06-18 04:36:00,866][12883] Updated weights for policy 0, policy_version 59421 (0.0034) [2024-06-18 04:36:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41932.3). Total num frames: 973602816. Throughput: 0: 42076.0. Samples: 973665720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-18 04:36:01,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 04:36:05,228][12883] Updated weights for policy 0, policy_version 59431 (0.0044) [2024-06-18 04:36:06,994][12645] Fps is (10 sec: 40962.1, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 973783040. Throughput: 0: 41890.6. Samples: 973913900. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-18 04:36:06,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 04:36:08,511][12883] Updated weights for policy 0, policy_version 59441 (0.0033) [2024-06-18 04:36:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42053.2, 300 sec: 41820.9). Total num frames: 973996032. Throughput: 0: 41945.2. Samples: 974169860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 04:36:11,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 04:36:12,847][12883] Updated weights for policy 0, policy_version 59451 (0.0030) [2024-06-18 04:36:16,262][12883] Updated weights for policy 0, policy_version 59461 (0.0033) [2024-06-18 04:36:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 41932.8). Total num frames: 974241792. Throughput: 0: 42104.4. Samples: 974298340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 04:36:16,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 04:36:20,862][12883] Updated weights for policy 0, policy_version 59471 (0.0029) [2024-06-18 04:36:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 974422016. Throughput: 0: 42164.4. Samples: 974554640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 04:36:21,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 04:36:23,991][12883] Updated weights for policy 0, policy_version 59481 (0.0031) [2024-06-18 04:36:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 974651392. Throughput: 0: 42029.7. Samples: 974802320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 04:36:26,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 04:36:28,586][12883] Updated weights for policy 0, policy_version 59491 (0.0042) [2024-06-18 04:36:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 974848000. Throughput: 0: 42129.5. Samples: 974927200. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 04:36:31,994][12645] Avg episode reward: [(0, '0.098')] [2024-06-18 04:36:32,017][12883] Updated weights for policy 0, policy_version 59501 (0.0050) [2024-06-18 04:36:32,714][12862] Signal inference workers to stop experience collection... (14100 times) [2024-06-18 04:36:32,714][12862] Signal inference workers to resume experience collection... (14100 times) [2024-06-18 04:36:32,728][12883] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-06-18 04:36:32,728][12883] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-06-18 04:36:36,206][12883] Updated weights for policy 0, policy_version 59511 (0.0036) [2024-06-18 04:36:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41507.7, 300 sec: 41876.7). Total num frames: 975044608. Throughput: 0: 41953.8. Samples: 975177520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 04:36:36,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 04:36:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059512_975044608.pth... [2024-06-18 04:36:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000058901_965033984.pth [2024-06-18 04:36:39,744][12883] Updated weights for policy 0, policy_version 59521 (0.0038) [2024-06-18 04:36:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 975273984. Throughput: 0: 41930.3. Samples: 975425720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 04:36:41,994][12645] Avg episode reward: [(0, '0.123')] [2024-06-18 04:36:43,921][12883] Updated weights for policy 0, policy_version 59531 (0.0035) [2024-06-18 04:36:46,996][12645] Fps is (10 sec: 42588.4, 60 sec: 41504.6, 300 sec: 41765.9). Total num frames: 975470592. Throughput: 0: 42023.6. Samples: 975556880. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:36:46,997][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 04:36:47,297][12883] Updated weights for policy 0, policy_version 59541 (0.0033) [2024-06-18 04:36:51,530][12883] Updated weights for policy 0, policy_version 59551 (0.0028) [2024-06-18 04:36:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.1, 300 sec: 41876.4). Total num frames: 975699968. Throughput: 0: 42243.1. Samples: 975814840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:36:51,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 04:36:55,498][12883] Updated weights for policy 0, policy_version 59561 (0.0031) [2024-06-18 04:36:56,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42325.8, 300 sec: 41820.9). Total num frames: 975912960. Throughput: 0: 41823.1. Samples: 976051900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:36:56,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 04:36:59,717][12883] Updated weights for policy 0, policy_version 59571 (0.0031) [2024-06-18 04:37:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41710.1). Total num frames: 976093184. Throughput: 0: 41960.9. Samples: 976186580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:37:01,994][12645] Avg episode reward: [(0, '0.099')] [2024-06-18 04:37:03,206][12883] Updated weights for policy 0, policy_version 59581 (0.0039) [2024-06-18 04:37:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 976306176. Throughput: 0: 41858.7. Samples: 976438280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:37:06,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 04:37:07,307][12883] Updated weights for policy 0, policy_version 59591 (0.0031) [2024-06-18 04:37:10,860][12883] Updated weights for policy 0, policy_version 59601 (0.0041) [2024-06-18 04:37:11,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 41876.4). 
Total num frames: 976568320. Throughput: 0: 41793.3. Samples: 976683020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:37:11,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 04:37:15,280][12883] Updated weights for policy 0, policy_version 59611 (0.0036) [2024-06-18 04:37:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 976732160. Throughput: 0: 42187.1. Samples: 976825620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:37:16,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 04:37:18,434][12883] Updated weights for policy 0, policy_version 59621 (0.0033) [2024-06-18 04:37:21,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 976945152. Throughput: 0: 42216.0. Samples: 977077240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 04:37:21,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 04:37:22,825][12883] Updated weights for policy 0, policy_version 59631 (0.0028) [2024-06-18 04:37:26,148][12883] Updated weights for policy 0, policy_version 59641 (0.0044) [2024-06-18 04:37:26,994][12645] Fps is (10 sec: 47511.7, 60 sec: 42598.1, 300 sec: 41987.4). Total num frames: 977207296. Throughput: 0: 42158.8. Samples: 977322880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 04:37:26,995][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 04:37:30,890][12883] Updated weights for policy 0, policy_version 59651 (0.0044) [2024-06-18 04:37:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 977354752. Throughput: 0: 42218.2. Samples: 977456600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 04:37:31,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 04:37:33,883][12883] Updated weights for policy 0, policy_version 59661 (0.0028) [2024-06-18 04:37:36,994][12645] Fps is (10 sec: 37684.4, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 977584128. 
Throughput: 0: 41931.6. Samples: 977701760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 04:37:36,994][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 04:37:38,907][12883] Updated weights for policy 0, policy_version 59671 (0.0024) [2024-06-18 04:37:41,642][12883] Updated weights for policy 0, policy_version 59681 (0.0039) [2024-06-18 04:37:41,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 977829888. Throughput: 0: 42236.9. Samples: 977952560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 04:37:41,994][12645] Avg episode reward: [(0, '0.050')] [2024-06-18 04:37:46,958][12883] Updated weights for policy 0, policy_version 59691 (0.0042) [2024-06-18 04:37:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41780.8, 300 sec: 41820.8). Total num frames: 977977344. Throughput: 0: 42095.5. Samples: 978080880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 04:37:46,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 04:37:48,853][12862] Signal inference workers to stop experience collection... (14150 times) [2024-06-18 04:37:48,853][12862] Signal inference workers to resume experience collection... (14150 times) [2024-06-18 04:37:48,894][12883] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-06-18 04:37:48,894][12883] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-06-18 04:37:49,482][12883] Updated weights for policy 0, policy_version 59701 (0.0027) [2024-06-18 04:37:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 978223104. Throughput: 0: 41899.8. Samples: 978323780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 04:37:51,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 04:37:54,584][12883] Updated weights for policy 0, policy_version 59711 (0.0035) [2024-06-18 04:37:56,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42052.3, 300 sec: 41931.9). 
Total num frames: 978436096. Throughput: 0: 42363.7. Samples: 978589380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 04:37:56,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 04:37:57,269][12883] Updated weights for policy 0, policy_version 59721 (0.0034) [2024-06-18 04:38:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 978616320. Throughput: 0: 41837.3. Samples: 978708300. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 04:38:01,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 04:38:02,085][12883] Updated weights for policy 0, policy_version 59731 (0.0048) [2024-06-18 04:38:05,010][12883] Updated weights for policy 0, policy_version 59741 (0.0037) [2024-06-18 04:38:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 41932.2). Total num frames: 978862080. Throughput: 0: 41802.7. Samples: 978958360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 04:38:06,994][12645] Avg episode reward: [(0, '0.215')] [2024-06-18 04:38:09,663][12883] Updated weights for policy 0, policy_version 59751 (0.0031) [2024-06-18 04:38:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 979042304. Throughput: 0: 42289.2. Samples: 979225880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 04:38:11,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 04:38:12,691][12883] Updated weights for policy 0, policy_version 59761 (0.0028) [2024-06-18 04:38:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 979255296. Throughput: 0: 41946.2. Samples: 979344180. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 04:38:16,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 04:38:17,799][12883] Updated weights for policy 0, policy_version 59771 (0.0028) [2024-06-18 04:38:20,681][12883] Updated weights for policy 0, policy_version 59781 (0.0037) [2024-06-18 04:38:21,994][12645] Fps is (10 sec: 45873.4, 60 sec: 42598.0, 300 sec: 41987.7). Total num frames: 979501056. Throughput: 0: 42181.0. Samples: 979599920. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 04:38:21,995][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 04:38:25,424][12883] Updated weights for policy 0, policy_version 59791 (0.0040) [2024-06-18 04:38:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.3, 300 sec: 41876.4). Total num frames: 979681280. Throughput: 0: 42416.5. Samples: 979861300. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 04:38:26,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 04:38:28,328][12883] Updated weights for policy 0, policy_version 59801 (0.0034) [2024-06-18 04:38:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.0, 300 sec: 41932.2). Total num frames: 979894272. Throughput: 0: 42272.1. Samples: 979983140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 04:38:31,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 04:38:33,102][12883] Updated weights for policy 0, policy_version 59811 (0.0035) [2024-06-18 04:38:36,183][12883] Updated weights for policy 0, policy_version 59821 (0.0035) [2024-06-18 04:38:36,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42043.9). Total num frames: 980156416. Throughput: 0: 42665.8. Samples: 980243740. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) [2024-06-18 04:38:36,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 04:38:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059824_980156416.pth... 
[2024-06-18 04:38:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059207_970047488.pth [2024-06-18 04:38:40,948][12883] Updated weights for policy 0, policy_version 59831 (0.0040) [2024-06-18 04:38:41,996][12645] Fps is (10 sec: 44228.5, 60 sec: 41777.6, 300 sec: 41931.6). Total num frames: 980336640. Throughput: 0: 42357.3. Samples: 980495560. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) [2024-06-18 04:38:41,996][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 04:38:43,925][12883] Updated weights for policy 0, policy_version 59841 (0.0043) [2024-06-18 04:38:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 41932.0). Total num frames: 980533248. Throughput: 0: 42426.3. Samples: 980617480. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) [2024-06-18 04:38:46,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 04:38:48,565][12883] Updated weights for policy 0, policy_version 59851 (0.0036) [2024-06-18 04:38:51,669][12883] Updated weights for policy 0, policy_version 59861 (0.0024) [2024-06-18 04:38:51,994][12645] Fps is (10 sec: 42608.5, 60 sec: 42325.5, 300 sec: 41932.2). Total num frames: 980762624. Throughput: 0: 42636.5. Samples: 980877000. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) [2024-06-18 04:38:51,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 04:38:56,291][12883] Updated weights for policy 0, policy_version 59871 (0.0031) [2024-06-18 04:38:56,318][12862] Signal inference workers to stop experience collection... (14200 times) [2024-06-18 04:38:56,323][12862] Signal inference workers to resume experience collection... (14200 times) [2024-06-18 04:38:56,346][12883] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-06-18 04:38:56,347][12883] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-06-18 04:38:56,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42050.7, 300 sec: 41987.2). Total num frames: 980959232. Throughput: 0: 42477.1. 
Samples: 981137440. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) [2024-06-18 04:38:56,996][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 04:38:59,218][12883] Updated weights for policy 0, policy_version 59881 (0.0040) [2024-06-18 04:39:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42043.9). Total num frames: 981188608. Throughput: 0: 42610.6. Samples: 981261660. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) [2024-06-18 04:39:01,994][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 04:39:03,905][12883] Updated weights for policy 0, policy_version 59891 (0.0049) [2024-06-18 04:39:06,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 981401600. Throughput: 0: 42598.6. Samples: 981516840. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) [2024-06-18 04:39:06,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 04:39:07,490][12883] Updated weights for policy 0, policy_version 59901 (0.0037) [2024-06-18 04:39:11,764][12883] Updated weights for policy 0, policy_version 59911 (0.0036) [2024-06-18 04:39:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42043.9). Total num frames: 981598208. Throughput: 0: 42462.7. Samples: 981772120. Policy #0 lag: (min: 0.0, avg: 13.6, max: 26.0) [2024-06-18 04:39:11,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 04:39:15,076][12883] Updated weights for policy 0, policy_version 59921 (0.0048) [2024-06-18 04:39:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42154.1). Total num frames: 981843968. Throughput: 0: 42413.7. Samples: 981891740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:39:16,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 04:39:19,394][12883] Updated weights for policy 0, policy_version 59931 (0.0037) [2024-06-18 04:39:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.5, 300 sec: 41987.5). Total num frames: 982024192. Throughput: 0: 42268.9. Samples: 982145840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:39:21,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 04:39:23,025][12883] Updated weights for policy 0, policy_version 59941 (0.0025) [2024-06-18 04:39:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 982220800. Throughput: 0: 42306.1. Samples: 982399240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:39:26,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 04:39:27,059][12883] Updated weights for policy 0, policy_version 59951 (0.0040) [2024-06-18 04:39:30,897][12883] Updated weights for policy 0, policy_version 59961 (0.0032) [2024-06-18 04:39:31,994][12645] Fps is (10 sec: 45875.9, 60 sec: 43144.9, 300 sec: 42154.1). Total num frames: 982482944. Throughput: 0: 42409.8. Samples: 982525920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:39:31,994][12645] Avg episode reward: [(0, '0.143')] [2024-06-18 04:39:34,604][12883] Updated weights for policy 0, policy_version 59971 (0.0032) [2024-06-18 04:39:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 982663168. Throughput: 0: 42328.8. Samples: 982781800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:39:36,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 04:39:38,592][12883] Updated weights for policy 0, policy_version 59981 (0.0043) [2024-06-18 04:39:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42327.0, 300 sec: 42209.6). Total num frames: 982876160. Throughput: 0: 42107.9. Samples: 983032200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:39:41,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 04:39:42,099][12883] Updated weights for policy 0, policy_version 59991 (0.0038) [2024-06-18 04:39:46,147][12883] Updated weights for policy 0, policy_version 60001 (0.0035) [2024-06-18 04:39:46,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.7, 300 sec: 42042.7). 
Total num frames: 983089152. Throughput: 0: 42185.0. Samples: 983160080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:39:46,996][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 04:39:50,030][12883] Updated weights for policy 0, policy_version 60011 (0.0027) [2024-06-18 04:39:51,994][12645] Fps is (10 sec: 40956.6, 60 sec: 42051.7, 300 sec: 42154.0). Total num frames: 983285760. Throughput: 0: 42127.4. Samples: 983412600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:39:51,995][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 04:39:53,785][12883] Updated weights for policy 0, policy_version 60021 (0.0036) [2024-06-18 04:39:56,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42327.0, 300 sec: 42154.1). Total num frames: 983498752. Throughput: 0: 41992.5. Samples: 983661780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:39:56,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 04:39:57,797][12883] Updated weights for policy 0, policy_version 60031 (0.0029) [2024-06-18 04:40:01,894][12883] Updated weights for policy 0, policy_version 60041 (0.0039) [2024-06-18 04:40:01,994][12645] Fps is (10 sec: 42601.3, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 983711744. Throughput: 0: 42212.9. Samples: 983791320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:40:01,994][12645] Avg episode reward: [(0, '0.138')] [2024-06-18 04:40:05,331][12883] Updated weights for policy 0, policy_version 60051 (0.0027) [2024-06-18 04:40:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 42154.3). Total num frames: 983908352. Throughput: 0: 42002.6. Samples: 984035960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:40:06,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 04:40:09,550][12883] Updated weights for policy 0, policy_version 60061 (0.0035) [2024-06-18 04:40:11,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42323.7, 300 sec: 42153.8). 
Total num frames: 984137728. Throughput: 0: 42105.5. Samples: 984294080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:40:11,996][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 04:40:12,953][12883] Updated weights for policy 0, policy_version 60071 (0.0053) [2024-06-18 04:40:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 984334336. Throughput: 0: 42179.6. Samples: 984424000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:40:16,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 04:40:17,403][12883] Updated weights for policy 0, policy_version 60081 (0.0026) [2024-06-18 04:40:20,575][12883] Updated weights for policy 0, policy_version 60091 (0.0033) [2024-06-18 04:40:21,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 984547328. Throughput: 0: 41885.2. Samples: 984666640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:40:21,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 04:40:25,187][12883] Updated weights for policy 0, policy_version 60101 (0.0036) [2024-06-18 04:40:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 984776704. Throughput: 0: 42031.0. Samples: 984923600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 04:40:26,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 04:40:28,783][12883] Updated weights for policy 0, policy_version 60111 (0.0037) [2024-06-18 04:40:31,996][12645] Fps is (10 sec: 42589.1, 60 sec: 41504.5, 300 sec: 42098.5). Total num frames: 984973312. Throughput: 0: 41969.3. Samples: 985048700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:40:31,996][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 04:40:33,004][12883] Updated weights for policy 0, policy_version 60121 (0.0033) [2024-06-18 04:40:33,010][12862] Signal inference workers to stop experience collection... 
(14250 times) [2024-06-18 04:40:33,011][12862] Signal inference workers to resume experience collection... (14250 times) [2024-06-18 04:40:33,025][12883] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-06-18 04:40:33,025][12883] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-06-18 04:40:36,617][12883] Updated weights for policy 0, policy_version 60131 (0.0041) [2024-06-18 04:40:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 985186304. Throughput: 0: 41930.5. Samples: 985299440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:40:36,994][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 04:40:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060131_985186304.pth... [2024-06-18 04:40:37,094][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059512_975044608.pth [2024-06-18 04:40:40,821][12883] Updated weights for policy 0, policy_version 60141 (0.0044) [2024-06-18 04:40:41,996][12645] Fps is (10 sec: 40960.0, 60 sec: 41777.5, 300 sec: 42042.7). Total num frames: 985382912. Throughput: 0: 42020.5. Samples: 985552800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:40:41,997][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 04:40:44,574][12883] Updated weights for policy 0, policy_version 60151 (0.0035) [2024-06-18 04:40:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41780.8, 300 sec: 42098.5). Total num frames: 985595904. Throughput: 0: 41928.5. Samples: 985678100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:40:46,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 04:40:48,509][12883] Updated weights for policy 0, policy_version 60161 (0.0048) [2024-06-18 04:40:51,994][12645] Fps is (10 sec: 44247.3, 60 sec: 42325.9, 300 sec: 42209.7). Total num frames: 985825280. Throughput: 0: 42168.6. Samples: 985933540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:40:51,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 04:40:52,206][12883] Updated weights for policy 0, policy_version 60171 (0.0038) [2024-06-18 04:40:56,219][12883] Updated weights for policy 0, policy_version 60181 (0.0033) [2024-06-18 04:40:56,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42050.6, 300 sec: 42098.2). Total num frames: 986021888. Throughput: 0: 42020.4. Samples: 986185000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:40:56,997][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 04:41:00,198][12883] Updated weights for policy 0, policy_version 60191 (0.0033) [2024-06-18 04:41:01,996][12645] Fps is (10 sec: 40950.4, 60 sec: 42050.7, 300 sec: 42209.3). Total num frames: 986234880. Throughput: 0: 41900.9. Samples: 986309640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:41:01,996][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 04:41:03,842][12883] Updated weights for policy 0, policy_version 60201 (0.0027) [2024-06-18 04:41:06,996][12645] Fps is (10 sec: 44237.0, 60 sec: 42596.9, 300 sec: 42264.8). Total num frames: 986464256. Throughput: 0: 42234.9. Samples: 986567300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 04:41:06,996][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 04:41:07,781][12883] Updated weights for policy 0, policy_version 60211 (0.0041) [2024-06-18 04:41:11,475][12883] Updated weights for policy 0, policy_version 60221 (0.0023) [2024-06-18 04:41:11,996][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42098.2). Total num frames: 986660864. Throughput: 0: 42119.7. Samples: 986819080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 04:41:11,996][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 04:41:15,598][12883] Updated weights for policy 0, policy_version 60231 (0.0038) [2024-06-18 04:41:16,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.3, 300 sec: 42209.6). 
Total num frames: 986873856. Throughput: 0: 42250.6. Samples: 986949880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 04:41:16,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 04:41:19,110][12883] Updated weights for policy 0, policy_version 60241 (0.0035) [2024-06-18 04:41:21,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 987070464. Throughput: 0: 42190.7. Samples: 987198020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 04:41:21,994][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 04:41:23,311][12883] Updated weights for policy 0, policy_version 60251 (0.0036) [2024-06-18 04:41:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 987299840. Throughput: 0: 42310.1. Samples: 987456660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 04:41:26,995][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 04:41:27,307][12883] Updated weights for policy 0, policy_version 60261 (0.0035) [2024-06-18 04:41:31,005][12883] Updated weights for policy 0, policy_version 60271 (0.0036) [2024-06-18 04:41:31,996][12645] Fps is (10 sec: 45864.5, 60 sec: 42598.4, 300 sec: 42320.4). Total num frames: 987529216. Throughput: 0: 42445.4. Samples: 987588240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 04:41:31,996][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 04:41:34,858][12883] Updated weights for policy 0, policy_version 60281 (0.0034) [2024-06-18 04:41:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 987709440. Throughput: 0: 42279.6. Samples: 987836120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 04:41:36,994][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 04:41:38,599][12883] Updated weights for policy 0, policy_version 60291 (0.0041) [2024-06-18 04:41:41,996][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42209.6). 
Total num frames: 987922432. Throughput: 0: 42471.2. Samples: 988096200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 04:41:41,996][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 04:41:42,640][12883] Updated weights for policy 0, policy_version 60301 (0.0025) [2024-06-18 04:41:46,467][12883] Updated weights for policy 0, policy_version 60311 (0.0035) [2024-06-18 04:41:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 988151808. Throughput: 0: 42431.9. Samples: 988218980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 04:41:46,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 04:41:50,250][12883] Updated weights for policy 0, policy_version 60321 (0.0030) [2024-06-18 04:41:51,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 988364800. Throughput: 0: 42341.6. Samples: 988472580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 04:41:51,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 04:41:54,103][12883] Updated weights for policy 0, policy_version 60331 (0.0038) [2024-06-18 04:41:56,993][12645] Fps is (10 sec: 40960.6, 60 sec: 42327.0, 300 sec: 42265.2). Total num frames: 988561408. Throughput: 0: 42503.1. Samples: 988731620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 04:41:56,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 04:41:58,055][12883] Updated weights for policy 0, policy_version 60341 (0.0038) [2024-06-18 04:42:01,886][12883] Updated weights for policy 0, policy_version 60351 (0.0036) [2024-06-18 04:42:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42600.0, 300 sec: 42320.7). Total num frames: 988790784. Throughput: 0: 42284.0. Samples: 988852660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 04:42:01,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 04:42:05,639][12883] Updated weights for policy 0, policy_version 60361 (0.0038) [2024-06-18 04:42:06,996][12645] Fps is (10 sec: 42587.8, 60 sec: 42052.2, 300 sec: 42098.2). Total num frames: 988987392. Throughput: 0: 42374.2. Samples: 989104960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 04:42:06,997][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 04:42:10,212][12883] Updated weights for policy 0, policy_version 60371 (0.0036) [2024-06-18 04:42:11,358][12862] Signal inference workers to stop experience collection... (14300 times) [2024-06-18 04:42:11,359][12862] Signal inference workers to resume experience collection... (14300 times) [2024-06-18 04:42:11,387][12883] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-06-18 04:42:11,387][12883] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-06-18 04:42:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42053.8, 300 sec: 42209.6). Total num frames: 989184000. Throughput: 0: 42230.7. Samples: 989357040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 04:42:11,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 04:42:13,229][12883] Updated weights for policy 0, policy_version 60381 (0.0043) [2024-06-18 04:42:16,994][12645] Fps is (10 sec: 40969.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 989396992. Throughput: 0: 42076.8. Samples: 989481600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 04:42:16,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 04:42:17,847][12883] Updated weights for policy 0, policy_version 60391 (0.0035) [2024-06-18 04:42:21,105][12883] Updated weights for policy 0, policy_version 60401 (0.0038) [2024-06-18 04:42:22,000][12645] Fps is (10 sec: 45846.6, 60 sec: 42866.9, 300 sec: 42153.2). Total num frames: 989642752. Throughput: 0: 42155.3. 
Samples: 989733380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:42:22,000][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 04:42:25,789][12883] Updated weights for policy 0, policy_version 60411 (0.0026) [2024-06-18 04:42:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 989806592. Throughput: 0: 42144.0. Samples: 989992580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:42:26,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 04:42:28,840][12883] Updated weights for policy 0, policy_version 60421 (0.0037) [2024-06-18 04:42:31,994][12645] Fps is (10 sec: 37706.8, 60 sec: 41507.7, 300 sec: 42154.1). Total num frames: 990019584. Throughput: 0: 41928.0. Samples: 990105740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:42:31,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 04:42:33,607][12883] Updated weights for policy 0, policy_version 60431 (0.0042) [2024-06-18 04:42:36,588][12883] Updated weights for policy 0, policy_version 60441 (0.0039) [2024-06-18 04:42:36,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42871.3, 300 sec: 42209.6). Total num frames: 990281728. Throughput: 0: 41954.2. Samples: 990360520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:42:36,995][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 04:42:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060442_990281728.pth... [2024-06-18 04:42:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000059824_980156416.pth [2024-06-18 04:42:41,502][12883] Updated weights for policy 0, policy_version 60451 (0.0037) [2024-06-18 04:42:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41780.8, 300 sec: 42209.6). Total num frames: 990429184. Throughput: 0: 41808.3. Samples: 990613000. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:42:41,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 04:42:44,503][12883] Updated weights for policy 0, policy_version 60461 (0.0034) [2024-06-18 04:42:46,997][12645] Fps is (10 sec: 37671.1, 60 sec: 41776.9, 300 sec: 42153.6). Total num frames: 990658560. Throughput: 0: 41718.3. Samples: 990730120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:42:46,998][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 04:42:49,358][12883] Updated weights for policy 0, policy_version 60471 (0.0039) [2024-06-18 04:42:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 990887936. Throughput: 0: 41947.0. Samples: 990992480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:42:51,994][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 04:42:52,252][12883] Updated weights for policy 0, policy_version 60481 (0.0024) [2024-06-18 04:42:56,994][12645] Fps is (10 sec: 39334.4, 60 sec: 41506.0, 300 sec: 42154.1). Total num frames: 991051776. Throughput: 0: 41994.7. Samples: 991246800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 04:42:56,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 04:42:57,303][12883] Updated weights for policy 0, policy_version 60491 (0.0022) [2024-06-18 04:43:00,147][12883] Updated weights for policy 0, policy_version 60501 (0.0032) [2024-06-18 04:43:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 991297536. Throughput: 0: 41893.3. Samples: 991366800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:43:01,994][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 04:43:05,017][12883] Updated weights for policy 0, policy_version 60511 (0.0036) [2024-06-18 04:43:06,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42053.9, 300 sec: 42265.2). Total num frames: 991510528. Throughput: 0: 42026.8. Samples: 991624320. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:43:06,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 04:43:07,933][12883] Updated weights for policy 0, policy_version 60521 (0.0031) [2024-06-18 04:43:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 991707136. Throughput: 0: 41767.4. Samples: 991872120. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:43:11,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 04:43:12,745][12883] Updated weights for policy 0, policy_version 60531 (0.0034) [2024-06-18 04:43:15,956][12883] Updated weights for policy 0, policy_version 60541 (0.0030) [2024-06-18 04:43:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 991936512. Throughput: 0: 41999.5. Samples: 991995720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:43:16,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 04:43:20,819][12883] Updated weights for policy 0, policy_version 60551 (0.0047) [2024-06-18 04:43:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41510.4, 300 sec: 42209.6). Total num frames: 992133120. Throughput: 0: 42020.0. Samples: 992251420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:43:21,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 04:43:23,775][12883] Updated weights for policy 0, policy_version 60561 (0.0028) [2024-06-18 04:43:24,767][12862] Signal inference workers to stop experience collection... (14350 times) [2024-06-18 04:43:24,800][12883] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-06-18 04:43:24,825][12862] Signal inference workers to resume experience collection... (14350 times) [2024-06-18 04:43:24,826][12883] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-06-18 04:43:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42209.7). Total num frames: 992346112. Throughput: 0: 41833.2. 
Samples: 992495500. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:43:26,994][12645] Avg episode reward: [(0, '0.073')] [2024-06-18 04:43:28,650][12883] Updated weights for policy 0, policy_version 60571 (0.0036) [2024-06-18 04:43:31,955][12883] Updated weights for policy 0, policy_version 60581 (0.0043) [2024-06-18 04:43:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 992559104. Throughput: 0: 42025.7. Samples: 992621140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 04:43:31,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 04:43:36,621][12883] Updated weights for policy 0, policy_version 60591 (0.0042) [2024-06-18 04:43:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 42043.3). Total num frames: 992739328. Throughput: 0: 41691.9. Samples: 992868620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:43:36,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 04:43:39,658][12883] Updated weights for policy 0, policy_version 60601 (0.0029) [2024-06-18 04:43:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 992952320. Throughput: 0: 41640.5. Samples: 993120620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:43:41,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 04:43:44,180][12883] Updated weights for policy 0, policy_version 60611 (0.0045) [2024-06-18 04:43:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41781.5, 300 sec: 42043.0). Total num frames: 993165312. Throughput: 0: 41876.5. Samples: 993251240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:43:46,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 04:43:47,450][12883] Updated weights for policy 0, policy_version 60621 (0.0042) [2024-06-18 04:43:51,812][12883] Updated weights for policy 0, policy_version 60631 (0.0035) [2024-06-18 04:43:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 42098.9). Total num frames: 993378304. Throughput: 0: 41692.8. Samples: 993500500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:43:51,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 04:43:55,616][12883] Updated weights for policy 0, policy_version 60641 (0.0041) [2024-06-18 04:43:56,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42323.7, 300 sec: 42042.7). Total num frames: 993591296. Throughput: 0: 41709.5. Samples: 993749140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:43:57,004][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 04:43:59,676][12883] Updated weights for policy 0, policy_version 60651 (0.0036) [2024-06-18 04:44:01,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41504.6, 300 sec: 41987.2). Total num frames: 993787904. Throughput: 0: 41850.9. Samples: 993879100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:44:01,997][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 04:44:03,256][12883] Updated weights for policy 0, policy_version 60661 (0.0025) [2024-06-18 04:44:06,994][12645] Fps is (10 sec: 42607.9, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 994017280. Throughput: 0: 41738.3. Samples: 994129640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 04:44:06,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 04:44:07,277][12883] Updated weights for policy 0, policy_version 60671 (0.0023) [2024-06-18 04:44:11,057][12883] Updated weights for policy 0, policy_version 60681 (0.0039) [2024-06-18 04:44:11,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42052.3, 300 sec: 41987.5). 
Total num frames: 994230272. Throughput: 0: 41880.5. Samples: 994380120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) [2024-06-18 04:44:11,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 04:44:15,039][12883] Updated weights for policy 0, policy_version 60691 (0.0028) [2024-06-18 04:44:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 994426880. Throughput: 0: 41771.2. Samples: 994500840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) [2024-06-18 04:44:16,994][12645] Avg episode reward: [(0, '0.296')] [2024-06-18 04:44:18,959][12883] Updated weights for policy 0, policy_version 60701 (0.0033) [2024-06-18 04:44:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 994639872. Throughput: 0: 41953.5. Samples: 994756520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) [2024-06-18 04:44:21,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 04:44:22,753][12883] Updated weights for policy 0, policy_version 60711 (0.0030) [2024-06-18 04:44:26,608][12883] Updated weights for policy 0, policy_version 60721 (0.0039) [2024-06-18 04:44:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 994852864. Throughput: 0: 41859.6. Samples: 995004300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) [2024-06-18 04:44:26,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 04:44:30,523][12883] Updated weights for policy 0, policy_version 60731 (0.0036) [2024-06-18 04:44:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 995049472. Throughput: 0: 41909.7. Samples: 995137180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) [2024-06-18 04:44:31,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 04:44:34,271][12883] Updated weights for policy 0, policy_version 60741 (0.0038) [2024-06-18 04:44:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 41931.9). 
Total num frames: 995246080. Throughput: 0: 41924.6. Samples: 995387100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) [2024-06-18 04:44:36,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 04:44:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060746_995262464.pth... [2024-06-18 04:44:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060131_985186304.pth [2024-06-18 04:44:38,381][12883] Updated weights for policy 0, policy_version 60751 (0.0052) [2024-06-18 04:44:41,993][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42043.4). Total num frames: 995491840. Throughput: 0: 41970.2. Samples: 995637700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) [2024-06-18 04:44:41,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 04:44:42,355][12883] Updated weights for policy 0, policy_version 60761 (0.0034) [2024-06-18 04:44:46,137][12883] Updated weights for policy 0, policy_version 60771 (0.0035) [2024-06-18 04:44:46,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42098.7). Total num frames: 995704832. Throughput: 0: 41971.9. Samples: 995767740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 26.0) [2024-06-18 04:44:46,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 04:44:49,094][12862] Signal inference workers to stop experience collection... (14400 times) [2024-06-18 04:44:49,094][12862] Signal inference workers to resume experience collection... (14400 times) [2024-06-18 04:44:49,109][12883] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-06-18 04:44:49,110][12883] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-06-18 04:44:50,130][12883] Updated weights for policy 0, policy_version 60781 (0.0042) [2024-06-18 04:44:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 995885056. Throughput: 0: 41845.9. Samples: 996012700. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 04:44:51,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 04:44:54,016][12883] Updated weights for policy 0, policy_version 60791 (0.0032) [2024-06-18 04:44:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42053.9, 300 sec: 42043.0). Total num frames: 996114432. Throughput: 0: 41916.9. Samples: 996266380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 04:44:56,994][12645] Avg episode reward: [(0, '0.098')] [2024-06-18 04:44:57,710][12883] Updated weights for policy 0, policy_version 60801 (0.0033) [2024-06-18 04:45:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42053.9, 300 sec: 42043.0). Total num frames: 996311040. Throughput: 0: 42064.4. Samples: 996393740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 04:45:01,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 04:45:02,225][12883] Updated weights for policy 0, policy_version 60811 (0.0034) [2024-06-18 04:45:05,526][12883] Updated weights for policy 0, policy_version 60821 (0.0031) [2024-06-18 04:45:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 41987.8). Total num frames: 996524032. Throughput: 0: 41714.6. Samples: 996633680. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 04:45:06,994][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 04:45:09,906][12883] Updated weights for policy 0, policy_version 60831 (0.0027) [2024-06-18 04:45:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 996737024. Throughput: 0: 42033.3. Samples: 996895800. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 04:45:11,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 04:45:13,625][12883] Updated weights for policy 0, policy_version 60841 (0.0039) [2024-06-18 04:45:16,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 996966400. Throughput: 0: 41923.7. Samples: 997023840. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 04:45:16,997][12645] Avg episode reward: [(0, '0.088')] [2024-06-18 04:45:17,740][12883] Updated weights for policy 0, policy_version 60851 (0.0027) [2024-06-18 04:45:21,357][12883] Updated weights for policy 0, policy_version 60861 (0.0041) [2024-06-18 04:45:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 997146624. Throughput: 0: 41975.4. Samples: 997276000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 04:45:21,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 04:45:25,540][12883] Updated weights for policy 0, policy_version 60871 (0.0036) [2024-06-18 04:45:26,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42052.3, 300 sec: 42043.3). Total num frames: 997376000. Throughput: 0: 41953.3. Samples: 997525600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:45:26,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 04:45:29,300][12883] Updated weights for policy 0, policy_version 60881 (0.0043) [2024-06-18 04:45:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 997588992. Throughput: 0: 41888.5. Samples: 997652720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:45:31,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 04:45:33,200][12883] Updated weights for policy 0, policy_version 60891 (0.0040) [2024-06-18 04:45:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42043.3). Total num frames: 997785600. Throughput: 0: 42269.8. Samples: 997914840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:45:36,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 04:45:37,039][12883] Updated weights for policy 0, policy_version 60901 (0.0042) [2024-06-18 04:45:41,015][12883] Updated weights for policy 0, policy_version 60911 (0.0028) [2024-06-18 04:45:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42043.0). 
Total num frames: 997998592. Throughput: 0: 42131.9. Samples: 998162320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:45:41,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 04:45:45,037][12883] Updated weights for policy 0, policy_version 60921 (0.0042) [2024-06-18 04:45:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 998211584. Throughput: 0: 42098.2. Samples: 998288160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:45:46,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 04:45:48,429][12883] Updated weights for policy 0, policy_version 60931 (0.0040) [2024-06-18 04:45:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42043.4). Total num frames: 998424576. Throughput: 0: 42544.0. Samples: 998548160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:45:51,994][12645] Avg episode reward: [(0, '0.033')] [2024-06-18 04:45:52,883][12883] Updated weights for policy 0, policy_version 60941 (0.0029) [2024-06-18 04:45:56,006][12883] Updated weights for policy 0, policy_version 60951 (0.0041) [2024-06-18 04:45:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42043.3). Total num frames: 998637568. Throughput: 0: 42226.7. Samples: 998796000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:45:56,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 04:46:00,656][12883] Updated weights for policy 0, policy_version 60961 (0.0047) [2024-06-18 04:46:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 998850560. Throughput: 0: 42115.9. Samples: 998918960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:46:01,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 04:46:03,943][12883] Updated weights for policy 0, policy_version 60971 (0.0052) [2024-06-18 04:46:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41987.8). 
Total num frames: 999047168. Throughput: 0: 42165.4. Samples: 999173440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 04:46:06,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 04:46:08,434][12883] Updated weights for policy 0, policy_version 60981 (0.0032) [2024-06-18 04:46:11,971][12883] Updated weights for policy 0, policy_version 60991 (0.0034) [2024-06-18 04:46:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 999276544. Throughput: 0: 42116.7. Samples: 999420860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 04:46:11,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 04:46:16,221][12883] Updated weights for policy 0, policy_version 61001 (0.0036) [2024-06-18 04:46:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41780.8, 300 sec: 42043.0). Total num frames: 999473152. Throughput: 0: 42258.2. Samples: 999554340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 04:46:16,994][12645] Avg episode reward: [(0, '0.048')] [2024-06-18 04:46:17,146][12862] Signal inference workers to stop experience collection... (14450 times) [2024-06-18 04:46:17,146][12862] Signal inference workers to resume experience collection... (14450 times) [2024-06-18 04:46:17,159][12883] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-06-18 04:46:17,159][12883] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-06-18 04:46:19,582][12883] Updated weights for policy 0, policy_version 61011 (0.0035) [2024-06-18 04:46:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 999669760. Throughput: 0: 41945.3. Samples: 999802380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 04:46:21,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 04:46:24,313][12883] Updated weights for policy 0, policy_version 61021 (0.0035) [2024-06-18 04:46:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41932.3). Total num frames: 999899136. Throughput: 0: 41993.4. Samples: 1000052020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 04:46:26,994][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 04:46:27,555][12883] Updated weights for policy 0, policy_version 61031 (0.0037) [2024-06-18 04:46:31,797][12883] Updated weights for policy 0, policy_version 61041 (0.0039) [2024-06-18 04:46:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41987.4). Total num frames: 1000095744. Throughput: 0: 42160.0. Samples: 1000185360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 04:46:31,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 04:46:35,182][12883] Updated weights for policy 0, policy_version 61051 (0.0029) [2024-06-18 04:46:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 41932.2). Total num frames: 1000292352. Throughput: 0: 41849.2. Samples: 1000431380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 04:46:36,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 04:46:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061053_1000292352.pth... [2024-06-18 04:46:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060442_990281728.pth [2024-06-18 04:46:39,416][12883] Updated weights for policy 0, policy_version 61061 (0.0032) [2024-06-18 04:46:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1000538112. Throughput: 0: 41934.7. Samples: 1000683060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 04:46:41,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 04:46:42,874][12883] Updated weights for policy 0, policy_version 61071 (0.0033) [2024-06-18 04:46:46,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42050.7, 300 sec: 41931.6). Total num frames: 1000734720. Throughput: 0: 42253.0. Samples: 1000820440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:46:46,997][12645] Avg episode reward: [(0, '0.042')] [2024-06-18 04:46:47,253][12883] Updated weights for policy 0, policy_version 61081 (0.0037) [2024-06-18 04:46:50,657][12883] Updated weights for policy 0, policy_version 61091 (0.0032) [2024-06-18 04:46:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.1, 300 sec: 41987.4). Total num frames: 1000947712. Throughput: 0: 42093.7. Samples: 1001067660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:46:51,994][12645] Avg episode reward: [(0, '0.070')] [2024-06-18 04:46:55,020][12883] Updated weights for policy 0, policy_version 61101 (0.0044) [2024-06-18 04:46:56,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1001177088. Throughput: 0: 41972.5. Samples: 1001309620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:46:56,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 04:46:59,136][12883] Updated weights for policy 0, policy_version 61111 (0.0044) [2024-06-18 04:47:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 41876.7). Total num frames: 1001340928. Throughput: 0: 41959.6. Samples: 1001442520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:47:01,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 04:47:02,839][12883] Updated weights for policy 0, policy_version 61121 (0.0029) [2024-06-18 04:47:06,715][12883] Updated weights for policy 0, policy_version 61131 (0.0033) [2024-06-18 04:47:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1001570304. Throughput: 0: 41848.0. Samples: 1001685540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:47:06,994][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 04:47:10,643][12883] Updated weights for policy 0, policy_version 61141 (0.0033) [2024-06-18 04:47:11,708][12862] Signal inference workers to stop experience collection... (14500 times) [2024-06-18 04:47:11,708][12862] Signal inference workers to resume experience collection... (14500 times) [2024-06-18 04:47:11,726][12883] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-06-18 04:47:11,726][12883] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-06-18 04:47:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1001799680. Throughput: 0: 42058.2. Samples: 1001944640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:47:11,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 04:47:14,298][12883] Updated weights for policy 0, policy_version 61151 (0.0030) [2024-06-18 04:47:16,995][12645] Fps is (10 sec: 39314.7, 60 sec: 41504.9, 300 sec: 41766.0). Total num frames: 1001963520. Throughput: 0: 41858.9. Samples: 1002069080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 04:47:16,996][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 04:47:18,575][12883] Updated weights for policy 0, policy_version 61161 (0.0031) [2024-06-18 04:47:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 1002209280. Throughput: 0: 41876.4. 
Samples: 1002315820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 04:47:21,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 04:47:22,339][12883] Updated weights for policy 0, policy_version 61171 (0.0038) [2024-06-18 04:47:26,255][12883] Updated weights for policy 0, policy_version 61181 (0.0030) [2024-06-18 04:47:26,994][12645] Fps is (10 sec: 45883.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1002422272. Throughput: 0: 42168.9. Samples: 1002580660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 04:47:26,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 04:47:29,875][12883] Updated weights for policy 0, policy_version 61191 (0.0027) [2024-06-18 04:47:31,996][12645] Fps is (10 sec: 39314.4, 60 sec: 41777.9, 300 sec: 41765.1). Total num frames: 1002602496. Throughput: 0: 41742.1. Samples: 1002698820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 04:47:31,996][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 04:47:34,317][12883] Updated weights for policy 0, policy_version 61201 (0.0048) [2024-06-18 04:47:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 1002864640. Throughput: 0: 41776.6. Samples: 1002947600. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 04:47:36,994][12645] Avg episode reward: [(0, '0.035')] [2024-06-18 04:47:37,462][12883] Updated weights for policy 0, policy_version 61211 (0.0039) [2024-06-18 04:47:41,994][12645] Fps is (10 sec: 42606.8, 60 sec: 41506.2, 300 sec: 41932.4). Total num frames: 1003028480. Throughput: 0: 42091.3. Samples: 1003203720. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 04:47:41,994][12645] Avg episode reward: [(0, '0.091')] [2024-06-18 04:47:42,123][12883] Updated weights for policy 0, policy_version 61221 (0.0038) [2024-06-18 04:47:45,720][12883] Updated weights for policy 0, policy_version 61231 (0.0041) [2024-06-18 04:47:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41780.9, 300 sec: 41876.4). Total num frames: 1003241472. Throughput: 0: 41651.1. Samples: 1003316820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 04:47:46,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 04:47:49,836][12883] Updated weights for policy 0, policy_version 61241 (0.0032) [2024-06-18 04:47:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1003487232. Throughput: 0: 42050.2. Samples: 1003577800. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 04:47:51,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 04:47:53,308][12883] Updated weights for policy 0, policy_version 61251 (0.0034) [2024-06-18 04:47:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 1003667456. Throughput: 0: 41996.0. Samples: 1003834460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 04:47:56,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 04:47:57,602][12883] Updated weights for policy 0, policy_version 61261 (0.0030) [2024-06-18 04:48:00,875][12883] Updated weights for policy 0, policy_version 61271 (0.0042) [2024-06-18 04:48:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 1003880448. Throughput: 0: 41829.6. Samples: 1003951340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:48:01,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 04:48:05,415][12883] Updated weights for policy 0, policy_version 61281 (0.0035) [2024-06-18 04:48:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 1004093440. Throughput: 0: 42088.8. Samples: 1004209820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:48:06,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 04:48:08,452][12883] Updated weights for policy 0, policy_version 61291 (0.0040) [2024-06-18 04:48:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 1004290048. Throughput: 0: 41785.8. Samples: 1004461020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:48:11,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 04:48:13,147][12883] Updated weights for policy 0, policy_version 61301 (0.0032) [2024-06-18 04:48:16,530][12883] Updated weights for policy 0, policy_version 61311 (0.0042) [2024-06-18 04:48:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42599.6, 300 sec: 41987.5). Total num frames: 1004519424. Throughput: 0: 41924.9. Samples: 1004585360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:48:16,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 04:48:21,057][12883] Updated weights for policy 0, policy_version 61321 (0.0039) [2024-06-18 04:48:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1004716032. Throughput: 0: 42001.2. Samples: 1004837660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:48:21,994][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 04:48:24,133][12883] Updated weights for policy 0, policy_version 61331 (0.0035) [2024-06-18 04:48:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 1004912640. Throughput: 0: 42158.7. Samples: 1005100860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:48:26,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 04:48:28,705][12883] Updated weights for policy 0, policy_version 61341 (0.0048) [2024-06-18 04:48:31,866][12883] Updated weights for policy 0, policy_version 61351 (0.0034) [2024-06-18 04:48:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42872.8, 300 sec: 42154.1). Total num frames: 1005174784. Throughput: 0: 42312.8. Samples: 1005220900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 04:48:31,994][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 04:48:36,504][12883] Updated weights for policy 0, policy_version 61361 (0.0037) [2024-06-18 04:48:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 1005338624. Throughput: 0: 42100.4. Samples: 1005472320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 04:48:36,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 04:48:37,040][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061362_1005355008.pth... [2024-06-18 04:48:37,095][12862] Signal inference workers to stop experience collection... (14550 times) [2024-06-18 04:48:37,096][12862] Signal inference workers to resume experience collection... (14550 times) [2024-06-18 04:48:37,106][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000060746_995262464.pth [2024-06-18 04:48:37,109][12883] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-06-18 04:48:37,110][12883] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-06-18 04:48:39,991][12883] Updated weights for policy 0, policy_version 61371 (0.0043) [2024-06-18 04:48:41,994][12645] Fps is (10 sec: 36045.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1005535232. Throughput: 0: 42021.9. Samples: 1005725440. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 04:48:41,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 04:48:44,472][12883] Updated weights for policy 0, policy_version 61381 (0.0028) [2024-06-18 04:48:46,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42323.7, 300 sec: 42042.7). Total num frames: 1005780992. Throughput: 0: 42264.1. Samples: 1005853320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 04:48:46,997][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 04:48:47,695][12883] Updated weights for policy 0, policy_version 61391 (0.0026) [2024-06-18 04:48:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41987.8). Total num frames: 1005977600. Throughput: 0: 42202.4. Samples: 1006108920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 04:48:51,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 04:48:52,226][12883] Updated weights for policy 0, policy_version 61401 (0.0029) [2024-06-18 04:48:55,383][12883] Updated weights for policy 0, policy_version 61411 (0.0034) [2024-06-18 04:48:56,996][12645] Fps is (10 sec: 40960.0, 60 sec: 42050.7, 300 sec: 42043.0). Total num frames: 1006190592. Throughput: 0: 42264.0. Samples: 1006363000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 04:48:56,997][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 04:48:59,951][12883] Updated weights for policy 0, policy_version 61421 (0.0035) [2024-06-18 04:49:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 1006436352. Throughput: 0: 42301.8. Samples: 1006488940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 04:49:01,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 04:49:03,057][12883] Updated weights for policy 0, policy_version 61431 (0.0035) [2024-06-18 04:49:06,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1006616576. Throughput: 0: 42321.3. Samples: 1006742120. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 04:49:06,995][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 04:49:07,599][12883] Updated weights for policy 0, policy_version 61441 (0.0029) [2024-06-18 04:49:11,088][12883] Updated weights for policy 0, policy_version 61451 (0.0038) [2024-06-18 04:49:11,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 1006813184. Throughput: 0: 42067.9. Samples: 1006993920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 04:49:11,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 04:49:15,345][12883] Updated weights for policy 0, policy_version 61461 (0.0034) [2024-06-18 04:49:16,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 1007058944. Throughput: 0: 42292.0. Samples: 1007124140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:49:16,997][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 04:49:19,090][12883] Updated weights for policy 0, policy_version 61471 (0.0043) [2024-06-18 04:49:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 1007255552. Throughput: 0: 42359.0. Samples: 1007378480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:49:21,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 04:49:22,856][12883] Updated weights for policy 0, policy_version 61481 (0.0035) [2024-06-18 04:49:26,650][12883] Updated weights for policy 0, policy_version 61491 (0.0036) [2024-06-18 04:49:26,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42598.3, 300 sec: 42098.6). Total num frames: 1007468544. Throughput: 0: 42273.7. Samples: 1007627760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:49:26,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 04:49:30,570][12883] Updated weights for policy 0, policy_version 61501 (0.0029) [2024-06-18 04:49:31,994][12645] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1007681536. Throughput: 0: 42349.3. Samples: 1007758940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:49:31,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 04:49:34,363][12883] Updated weights for policy 0, policy_version 61511 (0.0036) [2024-06-18 04:49:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 41987.4). Total num frames: 1007878144. Throughput: 0: 42255.4. Samples: 1008010420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:49:36,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 04:49:38,250][12883] Updated weights for policy 0, policy_version 61521 (0.0030) [2024-06-18 04:49:41,996][12645] Fps is (10 sec: 42588.4, 60 sec: 42869.8, 300 sec: 42042.7). Total num frames: 1008107520. Throughput: 0: 42197.8. Samples: 1008261900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:49:41,996][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 04:49:42,161][12883] Updated weights for policy 0, policy_version 61531 (0.0041) [2024-06-18 04:49:45,974][12883] Updated weights for policy 0, policy_version 61541 (0.0029) [2024-06-18 04:49:46,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42327.0, 300 sec: 42154.1). Total num frames: 1008320512. Throughput: 0: 42296.9. Samples: 1008392300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:49:46,994][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 04:49:50,034][12883] Updated weights for policy 0, policy_version 61551 (0.0041) [2024-06-18 04:49:51,998][12645] Fps is (10 sec: 40949.9, 60 sec: 42322.0, 300 sec: 42042.3). Total num frames: 1008517120. Throughput: 0: 42241.4. Samples: 1008643180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 04:49:51,999][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 04:49:53,961][12883] Updated weights for policy 0, policy_version 61561 (0.0036) [2024-06-18 04:49:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42154.1). Total num frames: 1008746496. Throughput: 0: 42205.3. Samples: 1008893160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 04:49:56,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 04:49:57,841][12883] Updated weights for policy 0, policy_version 61571 (0.0044) [2024-06-18 04:50:01,731][12883] Updated weights for policy 0, policy_version 61581 (0.0039) [2024-06-18 04:50:01,994][12645] Fps is (10 sec: 42618.5, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 1008943104. Throughput: 0: 42129.7. Samples: 1009019880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 04:50:01,994][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 04:50:05,792][12883] Updated weights for policy 0, policy_version 61591 (0.0032) [2024-06-18 04:50:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 1009156096. Throughput: 0: 42076.5. Samples: 1009271920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 04:50:06,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 04:50:09,254][12883] Updated weights for policy 0, policy_version 61601 (0.0035) [2024-06-18 04:50:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 41987.8). Total num frames: 1009352704. Throughput: 0: 42119.0. Samples: 1009523120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 04:50:11,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 04:50:13,507][12883] Updated weights for policy 0, policy_version 61611 (0.0036) [2024-06-18 04:50:16,610][12862] Signal inference workers to stop experience collection... 
(14600 times) [2024-06-18 04:50:16,645][12883] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-06-18 04:50:16,672][12862] Signal inference workers to resume experience collection... (14600 times) [2024-06-18 04:50:16,673][12883] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-06-18 04:50:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 1009582080. Throughput: 0: 41941.2. Samples: 1009646300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 04:50:16,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 04:50:17,334][12883] Updated weights for policy 0, policy_version 61621 (0.0032) [2024-06-18 04:50:21,483][12883] Updated weights for policy 0, policy_version 61631 (0.0034) [2024-06-18 04:50:21,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 1009778688. Throughput: 0: 42027.3. Samples: 1009901640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 04:50:21,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 04:50:25,111][12883] Updated weights for policy 0, policy_version 61641 (0.0037) [2024-06-18 04:50:26,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42050.7, 300 sec: 42042.7). Total num frames: 1009991680. Throughput: 0: 41863.1. Samples: 1010145740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 04:50:26,996][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 04:50:29,412][12883] Updated weights for policy 0, policy_version 61651 (0.0049) [2024-06-18 04:50:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1010221056. Throughput: 0: 41871.5. Samples: 1010276520. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 04:50:31,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 04:50:32,707][12883] Updated weights for policy 0, policy_version 61661 (0.0032) [2024-06-18 04:50:36,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 1010401280. Throughput: 0: 41903.5. Samples: 1010528640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 04:50:36,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 04:50:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061670_1010401280.pth... [2024-06-18 04:50:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061053_1000292352.pth [2024-06-18 04:50:37,238][12883] Updated weights for policy 0, policy_version 61671 (0.0041) [2024-06-18 04:50:40,557][12883] Updated weights for policy 0, policy_version 61681 (0.0041) [2024-06-18 04:50:41,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42052.3, 300 sec: 42098.2). Total num frames: 1010630656. Throughput: 0: 41818.4. Samples: 1010775080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 04:50:41,996][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 04:50:45,227][12883] Updated weights for policy 0, policy_version 61691 (0.0036) [2024-06-18 04:50:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 1010843648. Throughput: 0: 41926.1. Samples: 1010906560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 04:50:46,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 04:50:48,434][12883] Updated weights for policy 0, policy_version 61701 (0.0041) [2024-06-18 04:50:51,994][12645] Fps is (10 sec: 37691.3, 60 sec: 41509.4, 300 sec: 41931.9). Total num frames: 1011007488. Throughput: 0: 41828.9. Samples: 1011154220. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 04:50:51,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 04:50:52,859][12883] Updated weights for policy 0, policy_version 61711 (0.0034) [2024-06-18 04:50:56,254][12883] Updated weights for policy 0, policy_version 61721 (0.0035) [2024-06-18 04:50:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 1011236864. Throughput: 0: 41835.7. Samples: 1011405720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 04:50:56,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 04:51:00,411][12883] Updated weights for policy 0, policy_version 61731 (0.0031) [2024-06-18 04:51:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 1011466240. Throughput: 0: 42132.9. Samples: 1011542280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 04:51:01,994][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 04:51:04,010][12883] Updated weights for policy 0, policy_version 61741 (0.0030) [2024-06-18 04:51:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 1011646464. Throughput: 0: 41853.2. Samples: 1011785040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 04:51:06,994][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 04:51:08,446][12883] Updated weights for policy 0, policy_version 61751 (0.0033) [2024-06-18 04:51:11,805][12883] Updated weights for policy 0, policy_version 61761 (0.0038) [2024-06-18 04:51:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42098.5). Total num frames: 1011892224. Throughput: 0: 41951.9. Samples: 1012033480. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-18 04:51:11,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 04:51:16,387][12883] Updated weights for policy 0, policy_version 61771 (0.0030) [2024-06-18 04:51:16,996][12645] Fps is (10 sec: 42589.0, 60 sec: 41504.6, 300 sec: 42042.7). 
Total num frames: 1012072448. Throughput: 0: 41957.9. Samples: 1012164720. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-18 04:51:16,996][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 04:51:19,972][12883] Updated weights for policy 0, policy_version 61781 (0.0044) [2024-06-18 04:51:22,001][12645] Fps is (10 sec: 40928.7, 60 sec: 42046.9, 300 sec: 42041.9). Total num frames: 1012301824. Throughput: 0: 41927.6. Samples: 1012415700. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-18 04:51:22,002][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 04:51:23,987][12883] Updated weights for policy 0, policy_version 61791 (0.0033) [2024-06-18 04:51:26,994][12645] Fps is (10 sec: 45884.5, 60 sec: 42326.7, 300 sec: 42154.1). Total num frames: 1012531200. Throughput: 0: 41877.8. Samples: 1012659500. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-18 04:51:26,995][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 04:51:27,603][12883] Updated weights for policy 0, policy_version 61801 (0.0032) [2024-06-18 04:51:31,831][12883] Updated weights for policy 0, policy_version 61811 (0.0031) [2024-06-18 04:51:31,994][12645] Fps is (10 sec: 40991.5, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 1012711424. Throughput: 0: 42051.7. Samples: 1012798880. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-18 04:51:31,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 04:51:35,229][12883] Updated weights for policy 0, policy_version 61821 (0.0053) [2024-06-18 04:51:36,994][12645] Fps is (10 sec: 40961.2, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1012940800. Throughput: 0: 42202.3. Samples: 1013053320. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-18 04:51:36,994][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 04:51:38,488][12862] Signal inference workers to stop experience collection... (14650 times) [2024-06-18 04:51:38,488][12862] Signal inference workers to resume experience collection... 
(14650 times) [2024-06-18 04:51:38,535][12883] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-06-18 04:51:38,535][12883] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-06-18 04:51:39,430][12883] Updated weights for policy 0, policy_version 61831 (0.0026) [2024-06-18 04:51:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42327.0, 300 sec: 42154.4). Total num frames: 1013170176. Throughput: 0: 42082.8. Samples: 1013299440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 23.0) [2024-06-18 04:51:41,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 04:51:42,937][12883] Updated weights for policy 0, policy_version 61841 (0.0031) [2024-06-18 04:51:46,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1013350400. Throughput: 0: 41878.1. Samples: 1013426800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:51:46,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 04:51:47,053][12883] Updated weights for policy 0, policy_version 61851 (0.0046) [2024-06-18 04:51:51,035][12883] Updated weights for policy 0, policy_version 61861 (0.0044) [2024-06-18 04:51:51,994][12645] Fps is (10 sec: 36044.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 1013530624. Throughput: 0: 41952.5. Samples: 1013672900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:51:51,994][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 04:51:54,820][12883] Updated weights for policy 0, policy_version 61871 (0.0042) [2024-06-18 04:51:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1013776384. Throughput: 0: 42124.4. Samples: 1013929080. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:51:56,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 04:51:59,075][12883] Updated weights for policy 0, policy_version 61881 (0.0039) [2024-06-18 04:52:01,994][12645] Fps is (10 sec: 44234.0, 60 sec: 41778.8, 300 sec: 42042.9). Total num frames: 1013972992. Throughput: 0: 42106.9. Samples: 1014059460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:52:01,995][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 04:52:02,549][12883] Updated weights for policy 0, policy_version 61891 (0.0033) [2024-06-18 04:52:06,929][12883] Updated weights for policy 0, policy_version 61901 (0.0038) [2024-06-18 04:52:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 1014185984. Throughput: 0: 41897.4. Samples: 1014300760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:52:06,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 04:52:10,358][12883] Updated weights for policy 0, policy_version 61911 (0.0039) [2024-06-18 04:52:11,994][12645] Fps is (10 sec: 42601.5, 60 sec: 41779.2, 300 sec: 42154.3). Total num frames: 1014398976. Throughput: 0: 42129.7. Samples: 1014555320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:52:11,994][12645] Avg episode reward: [(0, '0.165')] [2024-06-18 04:52:14,735][12883] Updated weights for policy 0, policy_version 61921 (0.0047) [2024-06-18 04:52:16,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42052.3, 300 sec: 41987.2). Total num frames: 1014595584. Throughput: 0: 41901.4. Samples: 1014684540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:52:16,996][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 04:52:17,963][12883] Updated weights for policy 0, policy_version 61931 (0.0029) [2024-06-18 04:52:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41511.4, 300 sec: 41931.9). Total num frames: 1014792192. Throughput: 0: 41752.4. Samples: 1014932180. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 04:52:21,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 04:52:22,555][12883] Updated weights for policy 0, policy_version 61941 (0.0030) [2024-06-18 04:52:26,094][12883] Updated weights for policy 0, policy_version 61951 (0.0044) [2024-06-18 04:52:26,994][12645] Fps is (10 sec: 44246.0, 60 sec: 41779.3, 300 sec: 42154.3). Total num frames: 1015037952. Throughput: 0: 41713.6. Samples: 1015176560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:52:26,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 04:52:30,517][12883] Updated weights for policy 0, policy_version 61961 (0.0033) [2024-06-18 04:52:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 1015218176. Throughput: 0: 41877.4. Samples: 1015311280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:52:31,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 04:52:32,883][12862] Signal inference workers to stop experience collection... (14700 times) [2024-06-18 04:52:32,883][12862] Signal inference workers to resume experience collection... (14700 times) [2024-06-18 04:52:32,893][12883] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-06-18 04:52:32,893][12883] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-06-18 04:52:33,801][12883] Updated weights for policy 0, policy_version 61971 (0.0023) [2024-06-18 04:52:36,995][12645] Fps is (10 sec: 39318.4, 60 sec: 41505.4, 300 sec: 42042.9). Total num frames: 1015431168. Throughput: 0: 41653.8. Samples: 1015547360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:52:36,995][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 04:52:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061977_1015431168.pth... 
[2024-06-18 04:52:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061362_1005355008.pth [2024-06-18 04:52:38,384][12883] Updated weights for policy 0, policy_version 61981 (0.0041) [2024-06-18 04:52:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 1015644160. Throughput: 0: 41541.7. Samples: 1015798460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:52:41,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 04:52:42,068][12883] Updated weights for policy 0, policy_version 61991 (0.0042) [2024-06-18 04:52:46,261][12883] Updated weights for policy 0, policy_version 62001 (0.0042) [2024-06-18 04:52:46,994][12645] Fps is (10 sec: 40963.7, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 1015840768. Throughput: 0: 41544.1. Samples: 1015928920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:52:46,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 04:52:49,701][12883] Updated weights for policy 0, policy_version 62011 (0.0040) [2024-06-18 04:52:51,995][12645] Fps is (10 sec: 40953.0, 60 sec: 42051.1, 300 sec: 41987.2). Total num frames: 1016053760. Throughput: 0: 41635.2. Samples: 1016174420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:52:51,996][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 04:52:54,109][12883] Updated weights for policy 0, policy_version 62021 (0.0036) [2024-06-18 04:52:56,996][12645] Fps is (10 sec: 44226.9, 60 sec: 41777.6, 300 sec: 42042.7). Total num frames: 1016283136. Throughput: 0: 41632.9. Samples: 1016428900. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:52:56,997][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 04:52:57,370][12883] Updated weights for policy 0, policy_version 62031 (0.0049) [2024-06-18 04:53:01,860][12883] Updated weights for policy 0, policy_version 62041 (0.0044) [2024-06-18 04:53:01,994][12645] Fps is (10 sec: 42605.8, 60 sec: 41779.6, 300 sec: 41987.5). Total num frames: 1016479744. Throughput: 0: 41573.6. Samples: 1016555260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:53:01,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 04:53:05,165][12883] Updated weights for policy 0, policy_version 62051 (0.0030) [2024-06-18 04:53:06,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 1016709120. Throughput: 0: 41701.6. Samples: 1016808760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 04:53:06,994][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 04:53:09,696][12883] Updated weights for policy 0, policy_version 62061 (0.0043) [2024-06-18 04:53:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1016905728. Throughput: 0: 41917.5. Samples: 1017062840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 04:53:11,994][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 04:53:13,240][12883] Updated weights for policy 0, policy_version 62071 (0.0033) [2024-06-18 04:53:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41780.6, 300 sec: 41987.5). Total num frames: 1017102336. Throughput: 0: 41584.3. Samples: 1017182580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 04:53:16,994][12645] Avg episode reward: [(0, '0.104')] [2024-06-18 04:53:18,106][12883] Updated weights for policy 0, policy_version 62081 (0.0041) [2024-06-18 04:53:21,083][12883] Updated weights for policy 0, policy_version 62091 (0.0025) [2024-06-18 04:53:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42098.5). 
Total num frames: 1017331712. Throughput: 0: 41868.9. Samples: 1017431420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 04:53:21,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 04:53:25,947][12883] Updated weights for policy 0, policy_version 62101 (0.0033) [2024-06-18 04:53:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41233.2, 300 sec: 41820.9). Total num frames: 1017511936. Throughput: 0: 42152.1. Samples: 1017695300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 04:53:26,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 04:53:28,846][12883] Updated weights for policy 0, policy_version 62111 (0.0035) [2024-06-18 04:53:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1017741312. Throughput: 0: 41883.2. Samples: 1017813660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 04:53:31,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 04:53:33,628][12883] Updated weights for policy 0, policy_version 62121 (0.0033) [2024-06-18 04:53:36,646][12883] Updated weights for policy 0, policy_version 62131 (0.0034) [2024-06-18 04:53:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42326.0, 300 sec: 42154.1). Total num frames: 1017970688. Throughput: 0: 42048.3. Samples: 1018066520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 04:53:36,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 04:53:41,721][12883] Updated weights for policy 0, policy_version 62141 (0.0039) [2024-06-18 04:53:41,996][12645] Fps is (10 sec: 39312.7, 60 sec: 41504.6, 300 sec: 41876.4). Total num frames: 1018134528. Throughput: 0: 42312.9. Samples: 1018332980. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 04:53:41,997][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 04:53:44,409][12883] Updated weights for policy 0, policy_version 62151 (0.0042) [2024-06-18 04:53:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42043.0). 
Total num frames: 1018380288. Throughput: 0: 41995.9. Samples: 1018445080. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 04:53:46,994][12645] Avg episode reward: [(0, '0.082')] [2024-06-18 04:53:49,648][12883] Updated weights for policy 0, policy_version 62161 (0.0047) [2024-06-18 04:53:51,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42326.5, 300 sec: 42043.3). Total num frames: 1018593280. Throughput: 0: 41934.3. Samples: 1018695800. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 04:53:51,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 04:53:52,555][12883] Updated weights for policy 0, policy_version 62171 (0.0040) [2024-06-18 04:53:56,994][12645] Fps is (10 sec: 37683.8, 60 sec: 41234.6, 300 sec: 41765.3). Total num frames: 1018757120. Throughput: 0: 41942.6. Samples: 1018950260. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 04:53:56,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 04:53:57,346][12883] Updated weights for policy 0, policy_version 62181 (0.0037) [2024-06-18 04:53:57,520][12862] Signal inference workers to stop experience collection... (14750 times) [2024-06-18 04:53:57,574][12862] Signal inference workers to resume experience collection... (14750 times) [2024-06-18 04:53:57,575][12883] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-06-18 04:53:57,594][12883] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-06-18 04:54:00,094][12883] Updated weights for policy 0, policy_version 62191 (0.0037) [2024-06-18 04:54:01,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42323.7, 300 sec: 42042.7). Total num frames: 1019019264. Throughput: 0: 41923.8. Samples: 1019069240. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 04:54:01,996][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 04:54:04,903][12883] Updated weights for policy 0, policy_version 62201 (0.0044) [2024-06-18 04:54:06,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 1019215872. Throughput: 0: 42096.9. Samples: 1019325780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 04:54:06,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 04:54:07,891][12883] Updated weights for policy 0, policy_version 62211 (0.0025) [2024-06-18 04:54:11,994][12645] Fps is (10 sec: 37691.4, 60 sec: 41506.0, 300 sec: 41821.2). Total num frames: 1019396096. Throughput: 0: 41902.5. Samples: 1019580920. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 04:54:11,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 04:54:12,657][12883] Updated weights for policy 0, policy_version 62221 (0.0052) [2024-06-18 04:54:15,692][12883] Updated weights for policy 0, policy_version 62231 (0.0042) [2024-06-18 04:54:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 1019658240. Throughput: 0: 41958.2. Samples: 1019701780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 04:54:16,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 04:54:20,177][12883] Updated weights for policy 0, policy_version 62241 (0.0033) [2024-06-18 04:54:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1019838464. Throughput: 0: 42000.9. Samples: 1019956560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:54:21,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 04:54:23,488][12883] Updated weights for policy 0, policy_version 62251 (0.0039) [2024-06-18 04:54:26,996][12645] Fps is (10 sec: 37674.7, 60 sec: 42050.6, 300 sec: 41876.1). Total num frames: 1020035072. Throughput: 0: 41630.2. Samples: 1020206340. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:54:26,996][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 04:54:27,926][12883] Updated weights for policy 0, policy_version 62261 (0.0041) [2024-06-18 04:54:31,343][12883] Updated weights for policy 0, policy_version 62271 (0.0036) [2024-06-18 04:54:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 1020264448. Throughput: 0: 41857.0. Samples: 1020328640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:54:31,995][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 04:54:31,995][12862] Saving new best policy, reward=0.517! [2024-06-18 04:54:35,901][12883] Updated weights for policy 0, policy_version 62281 (0.0039) [2024-06-18 04:54:36,994][12645] Fps is (10 sec: 42607.8, 60 sec: 41506.1, 300 sec: 41876.7). Total num frames: 1020461056. Throughput: 0: 41906.2. Samples: 1020581580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:54:36,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 04:54:37,028][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062285_1020477440.pth... [2024-06-18 04:54:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061670_1010401280.pth [2024-06-18 04:54:39,220][12883] Updated weights for policy 0, policy_version 62291 (0.0041) [2024-06-18 04:54:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42327.0, 300 sec: 41876.4). Total num frames: 1020674048. Throughput: 0: 41878.8. Samples: 1020834800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:54:41,994][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 04:54:43,565][12883] Updated weights for policy 0, policy_version 62301 (0.0030) [2024-06-18 04:54:46,996][12645] Fps is (10 sec: 42589.1, 60 sec: 41777.8, 300 sec: 41932.3). Total num frames: 1020887040. Throughput: 0: 41907.6. Samples: 1020955080. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:54:46,996][12645] Avg episode reward: [(0, '0.047')] [2024-06-18 04:54:47,181][12883] Updated weights for policy 0, policy_version 62311 (0.0048) [2024-06-18 04:54:51,383][12883] Updated weights for policy 0, policy_version 62321 (0.0033) [2024-06-18 04:54:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 1021083648. Throughput: 0: 41889.8. Samples: 1021210820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:54:51,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 04:54:55,059][12883] Updated weights for policy 0, policy_version 62331 (0.0033) [2024-06-18 04:54:56,993][12645] Fps is (10 sec: 39330.9, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 1021280256. Throughput: 0: 41746.9. Samples: 1021459520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 04:54:56,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 04:54:59,469][12883] Updated weights for policy 0, policy_version 62341 (0.0027) [2024-06-18 04:55:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41507.7, 300 sec: 41876.4). Total num frames: 1021509632. Throughput: 0: 41877.3. Samples: 1021586260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 04:55:01,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 04:55:02,942][12883] Updated weights for policy 0, policy_version 62351 (0.0040) [2024-06-18 04:55:06,994][12645] Fps is (10 sec: 42597.4, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 1021706240. Throughput: 0: 41855.0. Samples: 1021840040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 04:55:06,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 04:55:07,011][12883] Updated weights for policy 0, policy_version 62361 (0.0042) [2024-06-18 04:55:10,670][12883] Updated weights for policy 0, policy_version 62371 (0.0042) [2024-06-18 04:55:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41876.4). 
Total num frames: 1021935616. Throughput: 0: 41752.3. Samples: 1022085100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 04:55:11,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 04:55:14,806][12883] Updated weights for policy 0, policy_version 62381 (0.0051) [2024-06-18 04:55:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 1022132224. Throughput: 0: 41925.5. Samples: 1022215280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 04:55:16,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 04:55:18,527][12883] Updated weights for policy 0, policy_version 62391 (0.0039) [2024-06-18 04:55:21,993][12645] Fps is (10 sec: 37683.8, 60 sec: 41233.1, 300 sec: 41765.7). Total num frames: 1022312448. Throughput: 0: 41894.8. Samples: 1022466840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 04:55:21,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 04:55:22,632][12883] Updated weights for policy 0, policy_version 62401 (0.0022) [2024-06-18 04:55:25,137][12862] Signal inference workers to stop experience collection... (14800 times) [2024-06-18 04:55:25,174][12883] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-06-18 04:55:25,183][12862] Signal inference workers to resume experience collection... (14800 times) [2024-06-18 04:55:25,194][12883] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-06-18 04:55:26,623][12883] Updated weights for policy 0, policy_version 62411 (0.0033) [2024-06-18 04:55:26,996][12645] Fps is (10 sec: 40950.4, 60 sec: 41779.2, 300 sec: 41765.0). Total num frames: 1022541824. Throughput: 0: 41820.4. Samples: 1022716820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 04:55:26,997][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 04:55:30,498][12883] Updated weights for policy 0, policy_version 62421 (0.0037) [2024-06-18 04:55:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 1022771200. Throughput: 0: 42001.7. Samples: 1022845060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 04:55:31,994][12645] Avg episode reward: [(0, '0.154')] [2024-06-18 04:55:34,495][12883] Updated weights for policy 0, policy_version 62431 (0.0044) [2024-06-18 04:55:36,994][12645] Fps is (10 sec: 42608.4, 60 sec: 41779.3, 300 sec: 41821.2). Total num frames: 1022967808. Throughput: 0: 41792.5. Samples: 1023091480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:55:36,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 04:55:38,308][12883] Updated weights for policy 0, policy_version 62441 (0.0041) [2024-06-18 04:55:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 1023180800. Throughput: 0: 41926.1. Samples: 1023346200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:55:41,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 04:55:42,058][12883] Updated weights for policy 0, policy_version 62451 (0.0031) [2024-06-18 04:55:46,233][12883] Updated weights for policy 0, policy_version 62461 (0.0044) [2024-06-18 04:55:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42053.8, 300 sec: 42043.0). Total num frames: 1023410176. Throughput: 0: 41940.5. Samples: 1023473580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:55:46,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 04:55:49,752][12883] Updated weights for policy 0, policy_version 62471 (0.0035) [2024-06-18 04:55:51,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42050.7, 300 sec: 41931.6). Total num frames: 1023606784. Throughput: 0: 41751.3. Samples: 1023718940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:55:51,997][12645] Avg episode reward: [(0, '0.071')] [2024-06-18 04:55:53,914][12883] Updated weights for policy 0, policy_version 62481 (0.0037) [2024-06-18 04:55:56,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 1023787008. Throughput: 0: 41990.8. Samples: 1023974680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:55:56,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 04:55:57,632][12883] Updated weights for policy 0, policy_version 62491 (0.0024) [2024-06-18 04:56:01,490][12883] Updated weights for policy 0, policy_version 62501 (0.0038) [2024-06-18 04:56:01,994][12645] Fps is (10 sec: 40969.2, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1024016384. Throughput: 0: 41723.5. Samples: 1024092840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:56:01,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 04:56:05,554][12883] Updated weights for policy 0, policy_version 62511 (0.0026) [2024-06-18 04:56:06,996][12645] Fps is (10 sec: 44228.1, 60 sec: 42051.0, 300 sec: 41820.6). Total num frames: 1024229376. Throughput: 0: 41744.3. Samples: 1024345420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:56:06,996][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 04:56:09,271][12883] Updated weights for policy 0, policy_version 62521 (0.0037) [2024-06-18 04:56:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41876.7). Total num frames: 1024425984. Throughput: 0: 41818.0. Samples: 1024598540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 04:56:11,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 04:56:13,372][12883] Updated weights for policy 0, policy_version 62531 (0.0052) [2024-06-18 04:56:16,996][12645] Fps is (10 sec: 42597.1, 60 sec: 42050.7, 300 sec: 41877.2). Total num frames: 1024655360. Throughput: 0: 41672.6. Samples: 1024720420. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 04:56:16,996][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 04:56:17,217][12883] Updated weights for policy 0, policy_version 62541 (0.0033) [2024-06-18 04:56:21,232][12883] Updated weights for policy 0, policy_version 62551 (0.0039) [2024-06-18 04:56:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 1024835584. Throughput: 0: 41744.0. Samples: 1024969960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 04:56:21,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 04:56:25,079][12883] Updated weights for policy 0, policy_version 62561 (0.0041) [2024-06-18 04:56:26,996][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41876.1). Total num frames: 1025064960. Throughput: 0: 41750.3. Samples: 1025225060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 04:56:26,997][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 04:56:29,126][12883] Updated weights for policy 0, policy_version 62571 (0.0029) [2024-06-18 04:56:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 1025294336. Throughput: 0: 41727.6. Samples: 1025351320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 04:56:31,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 04:56:32,900][12883] Updated weights for policy 0, policy_version 62581 (0.0033) [2024-06-18 04:56:36,839][12883] Updated weights for policy 0, policy_version 62591 (0.0034) [2024-06-18 04:56:36,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42052.1, 300 sec: 41765.3). Total num frames: 1025490944. Throughput: 0: 41937.1. Samples: 1025606020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 04:56:36,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 04:56:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062591_1025490944.pth... 
[2024-06-18 04:56:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000061977_1015431168.pth [2024-06-18 04:56:40,566][12883] Updated weights for policy 0, policy_version 62601 (0.0040) [2024-06-18 04:56:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 1025687552. Throughput: 0: 41630.6. Samples: 1025848060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 04:56:41,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 04:56:44,702][12883] Updated weights for policy 0, policy_version 62611 (0.0038) [2024-06-18 04:56:46,996][12645] Fps is (10 sec: 40951.2, 60 sec: 41504.6, 300 sec: 41931.6). Total num frames: 1025900544. Throughput: 0: 41844.2. Samples: 1025975920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 04:56:46,996][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 04:56:48,290][12883] Updated weights for policy 0, policy_version 62621 (0.0040) [2024-06-18 04:56:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41507.7, 300 sec: 41765.3). Total num frames: 1026097152. Throughput: 0: 41897.7. Samples: 1026230740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 04:56:51,994][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 04:56:52,507][12883] Updated weights for policy 0, policy_version 62631 (0.0026) [2024-06-18 04:56:56,203][12883] Updated weights for policy 0, policy_version 62641 (0.0032) [2024-06-18 04:56:56,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42325.3, 300 sec: 41876.5). Total num frames: 1026326528. Throughput: 0: 41720.1. Samples: 1026475940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:56:56,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 04:57:00,430][12883] Updated weights for policy 0, policy_version 62651 (0.0039) [2024-06-18 04:57:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 1026539520. Throughput: 0: 42017.2. Samples: 1026611100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:57:01,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 04:57:04,267][12883] Updated weights for policy 0, policy_version 62661 (0.0029) [2024-06-18 04:57:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41780.5, 300 sec: 41820.8). Total num frames: 1026736128. Throughput: 0: 41888.9. Samples: 1026854960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:57:06,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 04:57:08,467][12883] Updated weights for policy 0, policy_version 62671 (0.0049) [2024-06-18 04:57:11,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 41876.4). Total num frames: 1026949120. Throughput: 0: 41831.6. Samples: 1027107480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:57:11,997][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 04:57:12,187][12883] Updated weights for policy 0, policy_version 62681 (0.0037) [2024-06-18 04:57:15,252][12862] Signal inference workers to stop experience collection... (14850 times) [2024-06-18 04:57:15,252][12862] Signal inference workers to resume experience collection... (14850 times) [2024-06-18 04:57:15,291][12883] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-06-18 04:57:15,292][12883] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-06-18 04:57:16,146][12883] Updated weights for policy 0, policy_version 62691 (0.0037) [2024-06-18 04:57:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41507.7, 300 sec: 41876.4). Total num frames: 1027145728. Throughput: 0: 41904.9. Samples: 1027237040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:57:16,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 04:57:20,234][12883] Updated weights for policy 0, policy_version 62701 (0.0038) [2024-06-18 04:57:21,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 1027358720. Throughput: 0: 41889.8. 
Samples: 1027491060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:57:21,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 04:57:24,122][12883] Updated weights for policy 0, policy_version 62711 (0.0042) [2024-06-18 04:57:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42053.8, 300 sec: 41931.9). Total num frames: 1027588096. Throughput: 0: 42018.6. Samples: 1027738900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:57:26,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 04:57:27,923][12883] Updated weights for policy 0, policy_version 62721 (0.0028) [2024-06-18 04:57:31,832][12883] Updated weights for policy 0, policy_version 62731 (0.0026) [2024-06-18 04:57:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41876.5). Total num frames: 1027784704. Throughput: 0: 42062.1. Samples: 1027868620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 04:57:31,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 04:57:36,017][12883] Updated weights for policy 0, policy_version 62741 (0.0047) [2024-06-18 04:57:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 1027997696. Throughput: 0: 42028.5. Samples: 1028122020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:57:36,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 04:57:39,731][12883] Updated weights for policy 0, policy_version 62751 (0.0029) [2024-06-18 04:57:41,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42050.6, 300 sec: 41931.6). Total num frames: 1028210688. Throughput: 0: 42001.8. Samples: 1028366120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:57:41,997][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 04:57:43,792][12883] Updated weights for policy 0, policy_version 62761 (0.0044) [2024-06-18 04:57:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.8, 300 sec: 41932.2). Total num frames: 1028423680. Throughput: 0: 41957.7. 
Samples: 1028499200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:57:46,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 04:57:47,463][12883] Updated weights for policy 0, policy_version 62771 (0.0034) [2024-06-18 04:57:51,601][12883] Updated weights for policy 0, policy_version 62781 (0.0035) [2024-06-18 04:57:51,994][12645] Fps is (10 sec: 39331.1, 60 sec: 41779.3, 300 sec: 41765.7). Total num frames: 1028603904. Throughput: 0: 41993.0. Samples: 1028744640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:57:51,994][12645] Avg episode reward: [(0, '0.154')] [2024-06-18 04:57:55,183][12883] Updated weights for policy 0, policy_version 62791 (0.0025) [2024-06-18 04:57:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1028849664. Throughput: 0: 41962.5. Samples: 1028995700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:57:56,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 04:57:59,166][12883] Updated weights for policy 0, policy_version 62801 (0.0026) [2024-06-18 04:58:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 1029046272. Throughput: 0: 42096.9. Samples: 1029131400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:01,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 04:58:02,899][12883] Updated weights for policy 0, policy_version 62811 (0.0032) [2024-06-18 04:58:06,587][12883] Updated weights for policy 0, policy_version 62821 (0.0035) [2024-06-18 04:58:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 1029259264. Throughput: 0: 41950.7. Samples: 1029378840. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:06,995][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 04:58:10,507][12883] Updated weights for policy 0, policy_version 62831 (0.0035) [2024-06-18 04:58:11,996][12645] Fps is (10 sec: 44228.0, 60 sec: 42325.5, 300 sec: 41987.2). Total num frames: 1029488640. Throughput: 0: 42068.8. Samples: 1029632080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:11,996][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 04:58:14,179][12883] Updated weights for policy 0, policy_version 62841 (0.0032) [2024-06-18 04:58:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 1029685248. Throughput: 0: 42003.0. Samples: 1029758760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:16,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 04:58:18,457][12883] Updated weights for policy 0, policy_version 62851 (0.0039) [2024-06-18 04:58:21,943][12883] Updated weights for policy 0, policy_version 62861 (0.0028) [2024-06-18 04:58:21,994][12645] Fps is (10 sec: 42607.1, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 1029914624. Throughput: 0: 42109.8. Samples: 1030016960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:21,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 04:58:26,339][12883] Updated weights for policy 0, policy_version 62871 (0.0036) [2024-06-18 04:58:27,000][12645] Fps is (10 sec: 44210.0, 60 sec: 42321.0, 300 sec: 41986.6). Total num frames: 1030127616. Throughput: 0: 42261.7. Samples: 1030268060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:27,000][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 04:58:28,961][12862] Signal inference workers to stop experience collection... (14900 times) [2024-06-18 04:58:28,962][12862] Signal inference workers to resume experience collection... 
(14900 times) [2024-06-18 04:58:28,981][12883] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-06-18 04:58:28,981][12883] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-06-18 04:58:29,839][12883] Updated weights for policy 0, policy_version 62881 (0.0032) [2024-06-18 04:58:31,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.8, 300 sec: 41876.1). Total num frames: 1030324224. Throughput: 0: 42135.7. Samples: 1030395400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:31,996][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 04:58:33,836][12883] Updated weights for policy 0, policy_version 62891 (0.0038) [2024-06-18 04:58:36,994][12645] Fps is (10 sec: 40985.4, 60 sec: 42325.3, 300 sec: 42043.3). Total num frames: 1030537216. Throughput: 0: 42232.3. Samples: 1030645100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:36,994][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 04:58:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062899_1030537216.pth... [2024-06-18 04:58:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062285_1020477440.pth [2024-06-18 04:58:37,562][12883] Updated weights for policy 0, policy_version 62901 (0.0028) [2024-06-18 04:58:41,422][12883] Updated weights for policy 0, policy_version 62911 (0.0041) [2024-06-18 04:58:41,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42327.0, 300 sec: 41932.0). Total num frames: 1030750208. Throughput: 0: 42202.3. Samples: 1030894800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:41,994][12645] Avg episode reward: [(0, '0.084')] [2024-06-18 04:58:45,332][12883] Updated weights for policy 0, policy_version 62921 (0.0034) [2024-06-18 04:58:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 1030963200. Throughput: 0: 42038.7. Samples: 1031023140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 04:58:46,994][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 04:58:49,308][12883] Updated weights for policy 0, policy_version 62931 (0.0049) [2024-06-18 04:58:51,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.2, 300 sec: 42043.0). Total num frames: 1031159808. Throughput: 0: 42120.8. Samples: 1031274280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 04:58:51,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 04:58:53,151][12883] Updated weights for policy 0, policy_version 62941 (0.0044) [2024-06-18 04:58:56,975][12883] Updated weights for policy 0, policy_version 62951 (0.0031) [2024-06-18 04:58:56,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42323.8, 300 sec: 41931.9). Total num frames: 1031389184. Throughput: 0: 42122.4. Samples: 1031527600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 04:58:56,996][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 04:59:00,949][12883] Updated weights for policy 0, policy_version 62961 (0.0036) [2024-06-18 04:59:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 1031569408. Throughput: 0: 42010.0. Samples: 1031649200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 04:59:01,994][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 04:59:04,702][12883] Updated weights for policy 0, policy_version 62971 (0.0042) [2024-06-18 04:59:06,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1031798784. Throughput: 0: 41939.6. Samples: 1031904240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 04:59:06,994][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 04:59:08,580][12883] Updated weights for policy 0, policy_version 62981 (0.0035) [2024-06-18 04:59:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41780.6, 300 sec: 41820.9). Total num frames: 1031995392. Throughput: 0: 42008.9. Samples: 1032158200. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 04:59:11,994][12645] Avg episode reward: [(0, '0.103')] [2024-06-18 04:59:13,070][12883] Updated weights for policy 0, policy_version 62991 (0.0033) [2024-06-18 04:59:16,811][12883] Updated weights for policy 0, policy_version 63001 (0.0033) [2024-06-18 04:59:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 1032208384. Throughput: 0: 41925.2. Samples: 1032281940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 04:59:16,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 04:59:20,763][12883] Updated weights for policy 0, policy_version 63011 (0.0036) [2024-06-18 04:59:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41932.3). Total num frames: 1032404992. Throughput: 0: 41987.2. Samples: 1032534520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 04:59:21,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 04:59:24,636][12883] Updated weights for policy 0, policy_version 63021 (0.0039) [2024-06-18 04:59:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41510.4, 300 sec: 41876.4). Total num frames: 1032617984. Throughput: 0: 42088.8. Samples: 1032788800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 04:59:26,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 04:59:28,433][12883] Updated weights for policy 0, policy_version 63031 (0.0037) [2024-06-18 04:59:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42053.9, 300 sec: 41987.5). Total num frames: 1032847360. Throughput: 0: 42061.3. Samples: 1032915900. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:59:31,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 04:59:32,260][12883] Updated weights for policy 0, policy_version 63041 (0.0042) [2024-06-18 04:59:35,997][12883] Updated weights for policy 0, policy_version 63051 (0.0038) [2024-06-18 04:59:36,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42050.7, 300 sec: 41987.1). Total num frames: 1033060352. Throughput: 0: 42038.4. Samples: 1033166100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:59:36,996][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 04:59:39,907][12883] Updated weights for policy 0, policy_version 63061 (0.0029) [2024-06-18 04:59:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 1033256960. Throughput: 0: 42154.2. Samples: 1033424440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:59:41,994][12645] Avg episode reward: [(0, '0.179')] [2024-06-18 04:59:43,971][12883] Updated weights for policy 0, policy_version 63071 (0.0047) [2024-06-18 04:59:46,996][12645] Fps is (10 sec: 40960.0, 60 sec: 41777.6, 300 sec: 41987.2). Total num frames: 1033469952. Throughput: 0: 42122.3. Samples: 1033544800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:59:46,996][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 04:59:47,867][12883] Updated weights for policy 0, policy_version 63081 (0.0034) [2024-06-18 04:59:51,883][12883] Updated weights for policy 0, policy_version 63091 (0.0045) [2024-06-18 04:59:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 1033682944. Throughput: 0: 42101.7. Samples: 1033798820. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:59:51,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 04:59:55,515][12883] Updated weights for policy 0, policy_version 63101 (0.0031) [2024-06-18 04:59:56,994][12645] Fps is (10 sec: 39330.8, 60 sec: 41234.7, 300 sec: 41876.4). Total num frames: 1033863168. Throughput: 0: 42199.2. Samples: 1034057160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 04:59:56,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 04:59:59,567][12883] Updated weights for policy 0, policy_version 63111 (0.0037) [2024-06-18 05:00:02,000][12645] Fps is (10 sec: 44209.2, 60 sec: 42593.9, 300 sec: 42097.7). Total num frames: 1034125312. Throughput: 0: 42048.4. Samples: 1034174380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:00:02,000][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 05:00:03,772][12883] Updated weights for policy 0, policy_version 63121 (0.0044) [2024-06-18 05:00:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 1034289152. Throughput: 0: 42019.9. Samples: 1034425420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:00:06,994][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 05:00:07,575][12883] Updated weights for policy 0, policy_version 63131 (0.0047) [2024-06-18 05:00:11,641][12883] Updated weights for policy 0, policy_version 63141 (0.0036) [2024-06-18 05:00:11,994][12645] Fps is (10 sec: 37706.2, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 1034502144. Throughput: 0: 41930.6. Samples: 1034675680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 05:00:11,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 05:00:15,472][12883] Updated weights for policy 0, policy_version 63151 (0.0038) [2024-06-18 05:00:16,996][12645] Fps is (10 sec: 45864.7, 60 sec: 42323.7, 300 sec: 42153.7). Total num frames: 1034747904. Throughput: 0: 41932.1. Samples: 1034802940. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 05:00:16,997][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 05:00:17,545][12862] Signal inference workers to stop experience collection... (14950 times) [2024-06-18 05:00:17,546][12862] Signal inference workers to resume experience collection... (14950 times) [2024-06-18 05:00:17,562][12883] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-06-18 05:00:17,562][12883] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-06-18 05:00:19,581][12883] Updated weights for policy 0, policy_version 63161 (0.0027) [2024-06-18 05:00:21,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41779.2, 300 sec: 41932.3). Total num frames: 1034911744. Throughput: 0: 41914.6. Samples: 1035052160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 05:00:21,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 05:00:23,169][12883] Updated weights for policy 0, policy_version 63171 (0.0053) [2024-06-18 05:00:26,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 1035141120. Throughput: 0: 41693.2. Samples: 1035300640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 05:00:26,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 05:00:27,323][12883] Updated weights for policy 0, policy_version 63181 (0.0030) [2024-06-18 05:00:31,191][12883] Updated weights for policy 0, policy_version 63191 (0.0027) [2024-06-18 05:00:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1035370496. Throughput: 0: 41922.1. Samples: 1035431200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 05:00:31,994][12645] Avg episode reward: [(0, '0.234')] [2024-06-18 05:00:35,158][12883] Updated weights for policy 0, policy_version 63201 (0.0033) [2024-06-18 05:00:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41780.8, 300 sec: 41987.5). Total num frames: 1035567104. Throughput: 0: 41984.4. 
Samples: 1035688120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 05:00:36,994][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 05:00:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063206_1035567104.pth... [2024-06-18 05:00:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062591_1025490944.pth [2024-06-18 05:00:38,963][12883] Updated weights for policy 0, policy_version 63211 (0.0036) [2024-06-18 05:00:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1035780096. Throughput: 0: 41659.4. Samples: 1035931840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 05:00:41,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 05:00:42,805][12883] Updated weights for policy 0, policy_version 63221 (0.0040) [2024-06-18 05:00:46,811][12883] Updated weights for policy 0, policy_version 63231 (0.0036) [2024-06-18 05:00:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41780.8, 300 sec: 41932.3). Total num frames: 1035976704. Throughput: 0: 41822.7. Samples: 1036056140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:00:46,994][12645] Avg episode reward: [(0, '0.097')] [2024-06-18 05:00:50,866][12883] Updated weights for policy 0, policy_version 63241 (0.0031) [2024-06-18 05:00:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1036206080. Throughput: 0: 42030.6. Samples: 1036316800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:00:51,994][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 05:00:54,551][12883] Updated weights for policy 0, policy_version 63251 (0.0032) [2024-06-18 05:00:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 1036419072. Throughput: 0: 41883.2. Samples: 1036560420. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:00:56,994][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 05:00:58,528][12883] Updated weights for policy 0, policy_version 63261 (0.0041) [2024-06-18 05:01:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41237.3, 300 sec: 41932.2). Total num frames: 1036599296. Throughput: 0: 41994.9. Samples: 1036692620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:01:01,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 05:01:02,421][12883] Updated weights for policy 0, policy_version 63271 (0.0031) [2024-06-18 05:01:06,236][12883] Updated weights for policy 0, policy_version 63281 (0.0031) [2024-06-18 05:01:06,996][12645] Fps is (10 sec: 37674.8, 60 sec: 41777.6, 300 sec: 41931.6). Total num frames: 1036795904. Throughput: 0: 41909.4. Samples: 1036938180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:01:06,996][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 05:01:10,151][12883] Updated weights for policy 0, policy_version 63291 (0.0047) [2024-06-18 05:01:11,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.5, 300 sec: 41987.8). Total num frames: 1037041664. Throughput: 0: 41950.4. Samples: 1037188400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:01:11,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 05:01:14,262][12883] Updated weights for policy 0, policy_version 63301 (0.0037) [2024-06-18 05:01:17,000][12645] Fps is (10 sec: 44218.8, 60 sec: 41503.3, 300 sec: 42042.1). Total num frames: 1037238272. Throughput: 0: 41929.2. Samples: 1037318280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:01:17,001][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 05:01:17,622][12883] Updated weights for policy 0, policy_version 63311 (0.0027) [2024-06-18 05:01:21,866][12883] Updated weights for policy 0, policy_version 63321 (0.0038) [2024-06-18 05:01:21,996][12645] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 41987.5). 
Total num frames: 1037451264. Throughput: 0: 41734.3. Samples: 1037566260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:01:21,997][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 05:01:25,687][12883] Updated weights for policy 0, policy_version 63331 (0.0030) [2024-06-18 05:01:26,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1037664256. Throughput: 0: 41767.5. Samples: 1037811380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 05:01:26,994][12645] Avg episode reward: [(0, '0.079')] [2024-06-18 05:01:29,593][12883] Updated weights for policy 0, policy_version 63341 (0.0032) [2024-06-18 05:01:31,994][12645] Fps is (10 sec: 40969.6, 60 sec: 41506.2, 300 sec: 41932.0). Total num frames: 1037860864. Throughput: 0: 41809.3. Samples: 1037937560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 05:01:31,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 05:01:33,491][12883] Updated weights for policy 0, policy_version 63351 (0.0043) [2024-06-18 05:01:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1038073856. Throughput: 0: 41593.5. Samples: 1038188500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 05:01:36,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 05:01:37,482][12883] Updated weights for policy 0, policy_version 63361 (0.0038) [2024-06-18 05:01:41,205][12883] Updated weights for policy 0, policy_version 63371 (0.0039) [2024-06-18 05:01:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 41987.8). Total num frames: 1038286848. Throughput: 0: 41780.1. Samples: 1038440520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 05:01:41,994][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 05:01:45,342][12883] Updated weights for policy 0, policy_version 63381 (0.0035) [2024-06-18 05:01:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41987.5). 
Total num frames: 1038483456. Throughput: 0: 41726.9. Samples: 1038570320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 05:01:46,994][12645] Avg episode reward: [(0, '0.154')] [2024-06-18 05:01:49,121][12883] Updated weights for policy 0, policy_version 63391 (0.0042) [2024-06-18 05:01:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1038712832. Throughput: 0: 41822.0. Samples: 1038820080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 05:01:51,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 05:01:53,301][12883] Updated weights for policy 0, policy_version 63401 (0.0044) [2024-06-18 05:01:56,994][12645] Fps is (10 sec: 44235.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1038925824. Throughput: 0: 41775.8. Samples: 1039068320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 05:01:56,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 05:01:56,995][12883] Updated weights for policy 0, policy_version 63411 (0.0036) [2024-06-18 05:02:01,103][12883] Updated weights for policy 0, policy_version 63421 (0.0024) [2024-06-18 05:02:01,994][12645] Fps is (10 sec: 39322.3, 60 sec: 41779.4, 300 sec: 41931.9). Total num frames: 1039106048. Throughput: 0: 41777.1. Samples: 1039197980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 05:02:01,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 05:02:02,619][12862] Signal inference workers to stop experience collection... (15000 times) [2024-06-18 05:02:02,620][12862] Signal inference workers to resume experience collection... 
(15000 times) [2024-06-18 05:02:02,654][12883] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-06-18 05:02:02,654][12883] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-06-18 05:02:05,307][12883] Updated weights for policy 0, policy_version 63431 (0.0033) [2024-06-18 05:02:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42326.9, 300 sec: 41987.8). Total num frames: 1039335424. Throughput: 0: 41639.4. Samples: 1039439940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 05:02:06,994][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 05:02:08,998][12883] Updated weights for policy 0, policy_version 63441 (0.0032) [2024-06-18 05:02:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 1039532032. Throughput: 0: 41760.5. Samples: 1039690600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 05:02:11,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 05:02:12,950][12883] Updated weights for policy 0, policy_version 63451 (0.0038) [2024-06-18 05:02:16,805][12883] Updated weights for policy 0, policy_version 63461 (0.0036) [2024-06-18 05:02:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41783.6, 300 sec: 41987.5). Total num frames: 1039745024. Throughput: 0: 41749.7. Samples: 1039816300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 05:02:17,007][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 05:02:20,592][12883] Updated weights for policy 0, policy_version 63471 (0.0030) [2024-06-18 05:02:21,996][12645] Fps is (10 sec: 42589.0, 60 sec: 41779.2, 300 sec: 41931.6). Total num frames: 1039958016. Throughput: 0: 41751.2. Samples: 1040067400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 05:02:21,996][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 05:02:25,099][12883] Updated weights for policy 0, policy_version 63481 (0.0045) [2024-06-18 05:02:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41506.3, 300 sec: 41932.0). Total num frames: 1040154624. Throughput: 0: 41737.8. Samples: 1040318720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 05:02:26,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 05:02:28,258][12883] Updated weights for policy 0, policy_version 63491 (0.0028) [2024-06-18 05:02:31,994][12645] Fps is (10 sec: 39330.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 1040351232. Throughput: 0: 41598.6. Samples: 1040442260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 05:02:31,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 05:02:33,075][12883] Updated weights for policy 0, policy_version 63501 (0.0037) [2024-06-18 05:02:36,184][12883] Updated weights for policy 0, policy_version 63511 (0.0037) [2024-06-18 05:02:37,000][12645] Fps is (10 sec: 44208.7, 60 sec: 42047.8, 300 sec: 41986.9). Total num frames: 1040596992. Throughput: 0: 41535.2. Samples: 1040689420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 05:02:37,001][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 05:02:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063513_1040596992.pth... [2024-06-18 05:02:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000062899_1030537216.pth [2024-06-18 05:02:40,790][12883] Updated weights for policy 0, policy_version 63521 (0.0025) [2024-06-18 05:02:42,000][12645] Fps is (10 sec: 42571.6, 60 sec: 41501.8, 300 sec: 41875.5). Total num frames: 1040777216. Throughput: 0: 41826.3. Samples: 1040950760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 05:02:42,000][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 05:02:43,938][12883] Updated weights for policy 0, policy_version 63531 (0.0038) [2024-06-18 05:02:46,993][12645] Fps is (10 sec: 37707.3, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 1040973824. Throughput: 0: 41488.9. Samples: 1041064980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:02:46,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 05:02:48,357][12883] Updated weights for policy 0, policy_version 63541 (0.0038) [2024-06-18 05:02:51,623][12883] Updated weights for policy 0, policy_version 63551 (0.0028) [2024-06-18 05:02:51,994][12645] Fps is (10 sec: 45903.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1041235968. Throughput: 0: 41849.4. Samples: 1041323160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:02:51,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 05:02:56,716][12883] Updated weights for policy 0, policy_version 63561 (0.0034) [2024-06-18 05:02:56,996][12645] Fps is (10 sec: 40950.1, 60 sec: 40958.5, 300 sec: 41820.5). Total num frames: 1041383424. Throughput: 0: 42007.7. Samples: 1041581040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:02:56,997][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 05:02:59,425][12883] Updated weights for policy 0, policy_version 63571 (0.0025) [2024-06-18 05:03:01,994][12645] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 1041612800. Throughput: 0: 41733.4. Samples: 1041694300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:03:01,994][12645] Avg episode reward: [(0, '0.088')] [2024-06-18 05:03:04,467][12883] Updated weights for policy 0, policy_version 63581 (0.0029) [2024-06-18 05:03:06,994][12645] Fps is (10 sec: 45885.7, 60 sec: 41779.3, 300 sec: 41876.7). Total num frames: 1041842176. Throughput: 0: 41855.0. Samples: 1041950780. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:03:06,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 05:03:07,512][12883] Updated weights for policy 0, policy_version 63591 (0.0042) [2024-06-18 05:03:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 1042022400. Throughput: 0: 41826.2. Samples: 1042200900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:03:11,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 05:03:12,284][12883] Updated weights for policy 0, policy_version 63601 (0.0027) [2024-06-18 05:03:15,209][12883] Updated weights for policy 0, policy_version 63611 (0.0040) [2024-06-18 05:03:16,995][12645] Fps is (10 sec: 40955.9, 60 sec: 41778.5, 300 sec: 41820.7). Total num frames: 1042251776. Throughput: 0: 41812.8. Samples: 1042323880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:03:16,995][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 05:03:18,091][12862] Signal inference workers to stop experience collection... (15050 times) [2024-06-18 05:03:18,091][12862] Signal inference workers to resume experience collection... (15050 times) [2024-06-18 05:03:18,141][12883] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-06-18 05:03:18,141][12883] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-06-18 05:03:20,096][12883] Updated weights for policy 0, policy_version 63621 (0.0038) [2024-06-18 05:03:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 41780.7, 300 sec: 41821.7). Total num frames: 1042464768. Throughput: 0: 42003.6. Samples: 1042579320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:03:21,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 05:03:23,159][12883] Updated weights for policy 0, policy_version 63631 (0.0035) [2024-06-18 05:03:26,994][12645] Fps is (10 sec: 42602.8, 60 sec: 42052.2, 300 sec: 41876.7). Total num frames: 1042677760. Throughput: 0: 41811.6. 
Samples: 1042832020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 05:03:26,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 05:03:27,686][12883] Updated weights for policy 0, policy_version 63641 (0.0035) [2024-06-18 05:03:30,699][12883] Updated weights for policy 0, policy_version 63651 (0.0037) [2024-06-18 05:03:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 41931.9). Total num frames: 1042907136. Throughput: 0: 42052.7. Samples: 1042957360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 05:03:31,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 05:03:35,508][12883] Updated weights for policy 0, policy_version 63661 (0.0045) [2024-06-18 05:03:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41783.6, 300 sec: 41876.4). Total num frames: 1043103744. Throughput: 0: 41880.5. Samples: 1043207780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 05:03:36,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 05:03:38,744][12883] Updated weights for policy 0, policy_version 63671 (0.0031) [2024-06-18 05:03:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42056.7, 300 sec: 41820.9). Total num frames: 1043300352. Throughput: 0: 41639.1. Samples: 1043454700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 05:03:41,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 05:03:43,301][12883] Updated weights for policy 0, policy_version 63681 (0.0037) [2024-06-18 05:03:46,575][12883] Updated weights for policy 0, policy_version 63691 (0.0048) [2024-06-18 05:03:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 1043513344. Throughput: 0: 41921.2. Samples: 1043580760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 05:03:46,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 05:03:51,461][12883] Updated weights for policy 0, policy_version 63701 (0.0030) [2024-06-18 05:03:51,996][12645] Fps is (10 sec: 42588.4, 60 sec: 41504.6, 300 sec: 41820.9). Total num frames: 1043726336. Throughput: 0: 41932.1. Samples: 1043837820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 05:03:51,996][12645] Avg episode reward: [(0, '0.165')] [2024-06-18 05:03:54,067][12883] Updated weights for policy 0, policy_version 63711 (0.0028) [2024-06-18 05:03:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42327.0, 300 sec: 41876.4). Total num frames: 1043922944. Throughput: 0: 41875.6. Samples: 1044085300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 05:03:56,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 05:03:59,060][12883] Updated weights for policy 0, policy_version 63721 (0.0035) [2024-06-18 05:04:01,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 1044152320. Throughput: 0: 42016.5. Samples: 1044214580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 05:04:01,994][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 05:04:02,170][12883] Updated weights for policy 0, policy_version 63731 (0.0039) [2024-06-18 05:04:06,642][12883] Updated weights for policy 0, policy_version 63741 (0.0034) [2024-06-18 05:04:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 1044348928. Throughput: 0: 42061.4. Samples: 1044472080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 05:04:06,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 05:04:09,796][12883] Updated weights for policy 0, policy_version 63751 (0.0048) [2024-06-18 05:04:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 1044561920. Throughput: 0: 41929.7. Samples: 1044718860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 05:04:11,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 05:04:14,291][12883] Updated weights for policy 0, policy_version 63761 (0.0037) [2024-06-18 05:04:16,995][12645] Fps is (10 sec: 44230.5, 60 sec: 42325.1, 300 sec: 41987.3). Total num frames: 1044791296. Throughput: 0: 42049.4. Samples: 1044849640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 05:04:16,996][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 05:04:17,539][12883] Updated weights for policy 0, policy_version 63771 (0.0026) [2024-06-18 05:04:21,994][12645] Fps is (10 sec: 40958.9, 60 sec: 41779.0, 300 sec: 41876.4). Total num frames: 1044971520. Throughput: 0: 42107.8. Samples: 1045102640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 05:04:21,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 05:04:22,054][12883] Updated weights for policy 0, policy_version 63781 (0.0034) [2024-06-18 05:04:25,138][12883] Updated weights for policy 0, policy_version 63791 (0.0040) [2024-06-18 05:04:26,994][12645] Fps is (10 sec: 40965.0, 60 sec: 42052.1, 300 sec: 41876.4). Total num frames: 1045200896. Throughput: 0: 42242.5. Samples: 1045355620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 05:04:26,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 05:04:29,987][12883] Updated weights for policy 0, policy_version 63801 (0.0029) [2024-06-18 05:04:31,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41779.2, 300 sec: 41876.7). Total num frames: 1045413888. Throughput: 0: 42378.2. Samples: 1045487780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 05:04:31,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 05:04:32,839][12883] Updated weights for policy 0, policy_version 63811 (0.0040) [2024-06-18 05:04:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 1045610496. Throughput: 0: 42302.6. Samples: 1045741340. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 05:04:36,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 05:04:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063819_1045610496.pth... [2024-06-18 05:04:37,094][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063206_1035567104.pth [2024-06-18 05:04:37,750][12883] Updated weights for policy 0, policy_version 63821 (0.0032) [2024-06-18 05:04:40,639][12883] Updated weights for policy 0, policy_version 63831 (0.0028) [2024-06-18 05:04:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41876.7). Total num frames: 1045823488. Throughput: 0: 42206.6. Samples: 1045984600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 05:04:41,994][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 05:04:43,249][12862] Signal inference workers to stop experience collection... (15100 times) [2024-06-18 05:04:43,298][12883] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-06-18 05:04:43,360][12862] Signal inference workers to resume experience collection... (15100 times) [2024-06-18 05:04:43,361][12883] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-06-18 05:04:45,526][12883] Updated weights for policy 0, policy_version 63841 (0.0028) [2024-06-18 05:04:47,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42048.0, 300 sec: 41875.5). Total num frames: 1046036480. Throughput: 0: 42210.1. Samples: 1046114300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 05:04:47,000][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 05:04:48,521][12883] Updated weights for policy 0, policy_version 63851 (0.0043) [2024-06-18 05:04:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42053.8, 300 sec: 41987.4). Total num frames: 1046249472. Throughput: 0: 42199.9. Samples: 1046371080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 05:04:51,994][12645] Avg episode reward: [(0, '0.123')] [2024-06-18 05:04:53,399][12883] Updated weights for policy 0, policy_version 63861 (0.0025) [2024-06-18 05:04:56,338][12883] Updated weights for policy 0, policy_version 63871 (0.0039) [2024-06-18 05:04:56,994][12645] Fps is (10 sec: 44263.7, 60 sec: 42598.3, 300 sec: 41877.3). Total num frames: 1046478848. Throughput: 0: 42003.8. Samples: 1046609040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 05:04:56,994][12645] Avg episode reward: [(0, '0.123')] [2024-06-18 05:05:01,058][12883] Updated weights for policy 0, policy_version 63881 (0.0041) [2024-06-18 05:05:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 1046659072. Throughput: 0: 42086.5. Samples: 1046743480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 05:05:01,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 05:05:03,994][12883] Updated weights for policy 0, policy_version 63891 (0.0033) [2024-06-18 05:05:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 1046855680. Throughput: 0: 42055.3. Samples: 1046995120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 05:05:06,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 05:05:08,731][12883] Updated weights for policy 0, policy_version 63901 (0.0029) [2024-06-18 05:05:11,888][12883] Updated weights for policy 0, policy_version 63911 (0.0027) [2024-06-18 05:05:11,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.8, 300 sec: 41931.9). Total num frames: 1047117824. Throughput: 0: 41899.3. Samples: 1047241180. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 05:05:11,997][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 05:05:16,652][12883] Updated weights for policy 0, policy_version 63921 (0.0048) [2024-06-18 05:05:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41780.1, 300 sec: 41987.5). Total num frames: 1047298048. Throughput: 0: 42017.8. Samples: 1047378580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 05:05:16,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 05:05:19,693][12883] Updated weights for policy 0, policy_version 63931 (0.0030) [2024-06-18 05:05:21,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 1047511040. Throughput: 0: 41963.8. Samples: 1047629720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 05:05:21,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 05:05:24,318][12883] Updated weights for policy 0, policy_version 63941 (0.0028) [2024-06-18 05:05:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 41931.9). Total num frames: 1047740416. Throughput: 0: 42128.0. Samples: 1047880360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 05:05:26,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 05:05:27,839][12883] Updated weights for policy 0, policy_version 63951 (0.0041) [2024-06-18 05:05:31,928][12883] Updated weights for policy 0, policy_version 63961 (0.0041) [2024-06-18 05:05:31,996][12645] Fps is (10 sec: 42589.5, 60 sec: 42050.8, 300 sec: 41931.6). Total num frames: 1047937024. Throughput: 0: 42179.8. Samples: 1048012220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 05:05:32,008][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 05:05:35,460][12883] Updated weights for policy 0, policy_version 63971 (0.0026) [2024-06-18 05:05:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 1048133632. Throughput: 0: 41935.6. Samples: 1048258180. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 05:05:36,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 05:05:39,649][12883] Updated weights for policy 0, policy_version 63981 (0.0028) [2024-06-18 05:05:41,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 1048379392. Throughput: 0: 42177.9. Samples: 1048507040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 05:05:41,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 05:05:43,221][12883] Updated weights for policy 0, policy_version 63991 (0.0035) [2024-06-18 05:05:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42056.6, 300 sec: 41876.4). Total num frames: 1048559616. Throughput: 0: 42301.8. Samples: 1048647060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 05:05:46,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 05:05:47,540][12883] Updated weights for policy 0, policy_version 64001 (0.0028) [2024-06-18 05:05:51,025][12883] Updated weights for policy 0, policy_version 64011 (0.0038) [2024-06-18 05:05:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 1048756224. Throughput: 0: 42280.9. Samples: 1048897760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 05:05:51,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 05:05:55,060][12883] Updated weights for policy 0, policy_version 64021 (0.0037) [2024-06-18 05:05:56,997][12645] Fps is (10 sec: 47499.9, 60 sec: 42596.4, 300 sec: 42153.7). Total num frames: 1049034752. Throughput: 0: 42297.2. Samples: 1049144580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 05:05:56,997][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 05:05:59,122][12883] Updated weights for policy 0, policy_version 64031 (0.0030) [2024-06-18 05:06:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41987.8). Total num frames: 1049182208. Throughput: 0: 42264.4. Samples: 1049280480. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 05:06:01,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 05:06:02,864][12883] Updated weights for policy 0, policy_version 64041 (0.0043) [2024-06-18 05:06:06,935][12883] Updated weights for policy 0, policy_version 64051 (0.0036) [2024-06-18 05:06:06,996][12645] Fps is (10 sec: 37685.6, 60 sec: 42596.8, 300 sec: 41931.6). Total num frames: 1049411584. Throughput: 0: 42140.7. Samples: 1049526140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 05:06:06,997][12645] Avg episode reward: [(0, '0.319')] [2024-06-18 05:06:09,179][12862] Signal inference workers to stop experience collection... (15150 times) [2024-06-18 05:06:09,180][12862] Signal inference workers to resume experience collection... (15150 times) [2024-06-18 05:06:09,233][12883] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-06-18 05:06:09,233][12883] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-06-18 05:06:10,691][12883] Updated weights for policy 0, policy_version 64061 (0.0035) [2024-06-18 05:06:11,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42326.9, 300 sec: 42099.5). Total num frames: 1049657344. Throughput: 0: 42164.4. Samples: 1049777760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 05:06:11,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 05:06:14,978][12883] Updated weights for policy 0, policy_version 64071 (0.0038) [2024-06-18 05:06:16,994][12645] Fps is (10 sec: 39330.7, 60 sec: 41779.3, 300 sec: 41876.7). Total num frames: 1049804800. Throughput: 0: 42126.5. Samples: 1049907820. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 05:06:16,994][12645] Avg episode reward: [(0, '0.319')] [2024-06-18 05:06:18,203][12883] Updated weights for policy 0, policy_version 64081 (0.0026) [2024-06-18 05:06:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 1050034176. Throughput: 0: 42179.6. 
Samples: 1050156260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 05:06:21,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 05:06:22,531][12883] Updated weights for policy 0, policy_version 64091 (0.0025) [2024-06-18 05:06:25,869][12883] Updated weights for policy 0, policy_version 64101 (0.0036) [2024-06-18 05:06:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1050263552. Throughput: 0: 42270.2. Samples: 1050409200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 05:06:26,994][12645] Avg episode reward: [(0, '0.234')] [2024-06-18 05:06:30,767][12883] Updated weights for policy 0, policy_version 64111 (0.0036) [2024-06-18 05:06:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41507.7, 300 sec: 41876.4). Total num frames: 1050427392. Throughput: 0: 42127.6. Samples: 1050542800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 05:06:31,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 05:06:33,491][12883] Updated weights for policy 0, policy_version 64121 (0.0031) [2024-06-18 05:06:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1050673152. Throughput: 0: 42131.9. Samples: 1050793700. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 05:06:37,003][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 05:06:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064128_1050673152.pth... [2024-06-18 05:06:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063513_1040596992.pth [2024-06-18 05:06:38,494][12883] Updated weights for policy 0, policy_version 64131 (0.0037) [2024-06-18 05:06:41,409][12883] Updated weights for policy 0, policy_version 64141 (0.0041) [2024-06-18 05:06:41,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1050902528. Throughput: 0: 42230.2. Samples: 1051044820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:06:41,994][12645] Avg episode reward: [(0, '0.056')] [2024-06-18 05:06:46,825][12883] Updated weights for policy 0, policy_version 64151 (0.0027) [2024-06-18 05:06:46,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 1051049984. Throughput: 0: 42060.9. Samples: 1051173220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:06:46,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 05:06:49,095][12883] Updated weights for policy 0, policy_version 64161 (0.0034) [2024-06-18 05:06:51,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42869.9, 300 sec: 42042.7). Total num frames: 1051328512. Throughput: 0: 42057.4. Samples: 1051418720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:06:51,996][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 05:06:54,482][12883] Updated weights for policy 0, policy_version 64171 (0.0031) [2024-06-18 05:06:56,792][12883] Updated weights for policy 0, policy_version 64181 (0.0028) [2024-06-18 05:06:56,994][12645] Fps is (10 sec: 49152.7, 60 sec: 41781.3, 300 sec: 42154.1). Total num frames: 1051541504. Throughput: 0: 42296.1. Samples: 1051681080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:06:56,994][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 05:07:01,994][12645] Fps is (10 sec: 36052.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 1051688960. Throughput: 0: 42158.1. Samples: 1051804940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:07:01,994][12645] Avg episode reward: [(0, '0.147')] [2024-06-18 05:07:02,043][12883] Updated weights for policy 0, policy_version 64191 (0.0036) [2024-06-18 05:07:04,786][12883] Updated weights for policy 0, policy_version 64201 (0.0041) [2024-06-18 05:07:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42600.0, 300 sec: 42154.1). Total num frames: 1051967488. Throughput: 0: 42189.7. Samples: 1052054800. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:07:06,995][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 05:07:09,997][12883] Updated weights for policy 0, policy_version 64211 (0.0032) [2024-06-18 05:07:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 1052131328. Throughput: 0: 42295.6. Samples: 1052312500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:07:11,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 05:07:12,011][12862] Signal inference workers to stop experience collection... (15200 times) [2024-06-18 05:07:12,011][12862] Signal inference workers to resume experience collection... (15200 times) [2024-06-18 05:07:12,056][12883] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-06-18 05:07:12,060][12883] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-06-18 05:07:12,672][12883] Updated weights for policy 0, policy_version 64221 (0.0032) [2024-06-18 05:07:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 1052344320. Throughput: 0: 41905.3. Samples: 1052428540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 05:07:16,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 05:07:17,866][12883] Updated weights for policy 0, policy_version 64231 (0.0034) [2024-06-18 05:07:20,584][12883] Updated weights for policy 0, policy_version 64241 (0.0036) [2024-06-18 05:07:22,000][12645] Fps is (10 sec: 47483.8, 60 sec: 42867.0, 300 sec: 42208.7). Total num frames: 1052606464. Throughput: 0: 41998.2. Samples: 1052683880. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) [2024-06-18 05:07:22,000][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 05:07:25,517][12883] Updated weights for policy 0, policy_version 64251 (0.0035) [2024-06-18 05:07:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 1052770304. Throughput: 0: 42241.3. 
Samples: 1052945680. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) [2024-06-18 05:07:26,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 05:07:28,443][12883] Updated weights for policy 0, policy_version 64261 (0.0031) [2024-06-18 05:07:31,994][12645] Fps is (10 sec: 37707.1, 60 sec: 42598.4, 300 sec: 41988.4). Total num frames: 1052983296. Throughput: 0: 41917.4. Samples: 1053059500. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) [2024-06-18 05:07:31,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 05:07:33,018][12883] Updated weights for policy 0, policy_version 64271 (0.0044) [2024-06-18 05:07:36,106][12883] Updated weights for policy 0, policy_version 64281 (0.0030) [2024-06-18 05:07:36,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42266.1). Total num frames: 1053245440. Throughput: 0: 42348.3. Samples: 1053324300. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) [2024-06-18 05:07:36,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 05:07:40,629][12883] Updated weights for policy 0, policy_version 64291 (0.0036) [2024-06-18 05:07:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41233.2, 300 sec: 42043.0). Total num frames: 1053376512. Throughput: 0: 42136.8. Samples: 1053577240. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) [2024-06-18 05:07:41,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 05:07:43,998][12883] Updated weights for policy 0, policy_version 64301 (0.0035) [2024-06-18 05:07:46,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42871.5, 300 sec: 41987.5). Total num frames: 1053622272. Throughput: 0: 41929.9. Samples: 1053691780. 
Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) [2024-06-18 05:07:46,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 05:07:48,206][12883] Updated weights for policy 0, policy_version 64311 (0.0035) [2024-06-18 05:07:51,673][12883] Updated weights for policy 0, policy_version 64321 (0.0045) [2024-06-18 05:07:51,994][12645] Fps is (10 sec: 49151.5, 60 sec: 42326.8, 300 sec: 42321.0). Total num frames: 1053868032. Throughput: 0: 42371.1. Samples: 1053961500. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) [2024-06-18 05:07:51,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 05:07:55,726][12883] Updated weights for policy 0, policy_version 64331 (0.0032) [2024-06-18 05:07:56,996][12645] Fps is (10 sec: 40950.4, 60 sec: 41504.5, 300 sec: 42098.2). Total num frames: 1054031872. Throughput: 0: 42207.6. Samples: 1054211940. Policy #0 lag: (min: 2.0, avg: 12.4, max: 24.0) [2024-06-18 05:07:56,997][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 05:07:59,341][12883] Updated weights for policy 0, policy_version 64341 (0.0032) [2024-06-18 05:08:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42154.1). Total num frames: 1054277632. Throughput: 0: 42231.9. Samples: 1054328980. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) [2024-06-18 05:08:01,995][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 05:08:03,586][12883] Updated weights for policy 0, policy_version 64351 (0.0038) [2024-06-18 05:08:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1054474240. Throughput: 0: 42504.9. Samples: 1054596340. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) [2024-06-18 05:08:06,994][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 05:08:07,129][12883] Updated weights for policy 0, policy_version 64361 (0.0029) [2024-06-18 05:08:11,555][12883] Updated weights for policy 0, policy_version 64371 (0.0025) [2024-06-18 05:08:11,994][12645] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42098.7). Total num frames: 1054670848. Throughput: 0: 42205.5. Samples: 1054844920. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) [2024-06-18 05:08:11,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 05:08:14,010][12862] Signal inference workers to stop experience collection... (15250 times) [2024-06-18 05:08:14,063][12883] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-06-18 05:08:14,064][12862] Signal inference workers to resume experience collection... (15250 times) [2024-06-18 05:08:14,079][12883] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-06-18 05:08:14,695][12883] Updated weights for policy 0, policy_version 64381 (0.0030) [2024-06-18 05:08:17,000][12645] Fps is (10 sec: 42572.2, 60 sec: 42593.9, 300 sec: 42153.2). Total num frames: 1054900224. Throughput: 0: 42299.4. Samples: 1054963240. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) [2024-06-18 05:08:17,001][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 05:08:19,357][12883] Updated weights for policy 0, policy_version 64391 (0.0040) [2024-06-18 05:08:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41510.4, 300 sec: 42098.5). Total num frames: 1055096832. Throughput: 0: 42302.2. Samples: 1055227900. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) [2024-06-18 05:08:21,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 05:08:22,569][12883] Updated weights for policy 0, policy_version 64401 (0.0039) [2024-06-18 05:08:26,994][12645] Fps is (10 sec: 39346.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1055293440. Throughput: 0: 42172.0. 
Samples: 1055474980. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) [2024-06-18 05:08:26,994][12645] Avg episode reward: [(0, '0.601')] [2024-06-18 05:08:27,088][12862] Saving new best policy, reward=0.601! [2024-06-18 05:08:27,090][12883] Updated weights for policy 0, policy_version 64411 (0.0035) [2024-06-18 05:08:30,351][12883] Updated weights for policy 0, policy_version 64421 (0.0037) [2024-06-18 05:08:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 1055539200. Throughput: 0: 42317.2. Samples: 1055596060. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) [2024-06-18 05:08:31,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 05:08:34,764][12883] Updated weights for policy 0, policy_version 64431 (0.0026) [2024-06-18 05:08:36,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41504.6, 300 sec: 42153.8). Total num frames: 1055735808. Throughput: 0: 42029.1. Samples: 1055852900. Policy #0 lag: (min: 1.0, avg: 12.6, max: 27.0) [2024-06-18 05:08:36,996][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 05:08:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064437_1055735808.pth... [2024-06-18 05:08:37,056][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000063819_1045610496.pth [2024-06-18 05:08:38,081][12883] Updated weights for policy 0, policy_version 64441 (0.0034) [2024-06-18 05:08:41,994][12645] Fps is (10 sec: 40958.8, 60 sec: 42871.2, 300 sec: 42154.0). Total num frames: 1055948800. Throughput: 0: 41983.2. Samples: 1056101100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 05:08:41,995][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 05:08:42,409][12883] Updated weights for policy 0, policy_version 64451 (0.0039) [2024-06-18 05:08:45,869][12883] Updated weights for policy 0, policy_version 64461 (0.0041) [2024-06-18 05:08:46,994][12645] Fps is (10 sec: 39330.4, 60 sec: 41779.1, 300 sec: 42043.3). Total num frames: 1056129024. 
Throughput: 0: 42248.5. Samples: 1056230160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 05:08:46,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 05:08:50,063][12883] Updated weights for policy 0, policy_version 64471 (0.0036) [2024-06-18 05:08:51,996][12645] Fps is (10 sec: 42590.4, 60 sec: 41777.7, 300 sec: 42209.3). Total num frames: 1056374784. Throughput: 0: 41713.6. Samples: 1056473540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 05:08:51,996][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 05:08:53,930][12883] Updated weights for policy 0, policy_version 64481 (0.0032) [2024-06-18 05:08:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42327.0, 300 sec: 42098.5). Total num frames: 1056571392. Throughput: 0: 41945.2. Samples: 1056732460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 05:08:56,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 05:08:57,671][12883] Updated weights for policy 0, policy_version 64491 (0.0025) [2024-06-18 05:09:01,952][12883] Updated weights for policy 0, policy_version 64501 (0.0035) [2024-06-18 05:09:01,994][12645] Fps is (10 sec: 40969.3, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1056784384. Throughput: 0: 41972.1. Samples: 1056851720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 05:09:01,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 05:09:05,532][12883] Updated weights for policy 0, policy_version 64511 (0.0031) [2024-06-18 05:09:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 1056980992. Throughput: 0: 41644.9. Samples: 1057101920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 05:09:06,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 05:09:09,555][12883] Updated weights for policy 0, policy_version 64521 (0.0030) [2024-06-18 05:09:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 41987.7). Total num frames: 1057177600. 
Throughput: 0: 41983.1. Samples: 1057364220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 05:09:11,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 05:09:13,121][12883] Updated weights for policy 0, policy_version 64531 (0.0026) [2024-06-18 05:09:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42056.6, 300 sec: 42209.7). Total num frames: 1057423360. Throughput: 0: 41937.4. Samples: 1057483240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 05:09:16,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 05:09:17,430][12883] Updated weights for policy 0, policy_version 64541 (0.0035) [2024-06-18 05:09:21,328][12883] Updated weights for policy 0, policy_version 64551 (0.0040) [2024-06-18 05:09:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1057636352. Throughput: 0: 41770.2. Samples: 1057732460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 05:09:21,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 05:09:24,945][12883] Updated weights for policy 0, policy_version 64561 (0.0032) [2024-06-18 05:09:26,996][12645] Fps is (10 sec: 37675.0, 60 sec: 41777.7, 300 sec: 41987.2). Total num frames: 1057800192. Throughput: 0: 42132.0. Samples: 1057997120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 05:09:26,996][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 05:09:28,185][12862] Signal inference workers to stop experience collection... (15300 times) [2024-06-18 05:09:28,186][12862] Signal inference workers to resume experience collection... (15300 times) [2024-06-18 05:09:28,202][12883] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-06-18 05:09:28,202][12883] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-06-18 05:09:29,258][12883] Updated weights for policy 0, policy_version 64571 (0.0039) [2024-06-18 05:09:31,996][12645] Fps is (10 sec: 40950.5, 60 sec: 41777.7, 300 sec: 42153.8). 
Total num frames: 1058045952. Throughput: 0: 41821.0. Samples: 1058112200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 05:09:31,996][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 05:09:32,740][12883] Updated weights for policy 0, policy_version 64581 (0.0034) [2024-06-18 05:09:36,994][12645] Fps is (10 sec: 44246.1, 60 sec: 41780.7, 300 sec: 42098.5). Total num frames: 1058242560. Throughput: 0: 42108.2. Samples: 1058368320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 05:09:36,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 05:09:37,299][12883] Updated weights for policy 0, policy_version 64591 (0.0034) [2024-06-18 05:09:40,938][12883] Updated weights for policy 0, policy_version 64601 (0.0038) [2024-06-18 05:09:41,994][12645] Fps is (10 sec: 37691.7, 60 sec: 41233.3, 300 sec: 41988.4). Total num frames: 1058422784. Throughput: 0: 41771.5. Samples: 1058612180. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 05:09:41,994][12645] Avg episode reward: [(0, '0.239')] [2024-06-18 05:09:45,267][12883] Updated weights for policy 0, policy_version 64611 (0.0033) [2024-06-18 05:09:46,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 1058684928. Throughput: 0: 41879.2. Samples: 1058736280. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 05:09:46,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 05:09:48,659][12883] Updated weights for policy 0, policy_version 64621 (0.0031) [2024-06-18 05:09:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41234.7, 300 sec: 41932.0). Total num frames: 1058848768. Throughput: 0: 42137.9. Samples: 1058998120. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 05:09:51,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 05:09:52,931][12883] Updated weights for policy 0, policy_version 64631 (0.0029) [2024-06-18 05:09:56,316][12883] Updated weights for policy 0, policy_version 64641 (0.0033) [2024-06-18 05:09:56,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 1059078144. Throughput: 0: 41768.8. Samples: 1059243820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 05:09:56,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 05:10:00,613][12883] Updated weights for policy 0, policy_version 64651 (0.0024) [2024-06-18 05:10:01,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1059323904. Throughput: 0: 42150.4. Samples: 1059380000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 05:10:01,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 05:10:04,068][12883] Updated weights for policy 0, policy_version 64661 (0.0036) [2024-06-18 05:10:06,995][12645] Fps is (10 sec: 40953.0, 60 sec: 41778.0, 300 sec: 41932.0). Total num frames: 1059487744. Throughput: 0: 42122.7. Samples: 1059628060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 05:10:06,996][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 05:10:08,342][12883] Updated weights for policy 0, policy_version 64671 (0.0036) [2024-06-18 05:10:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1059717120. Throughput: 0: 41669.6. Samples: 1059872160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 05:10:11,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 05:10:12,149][12883] Updated weights for policy 0, policy_version 64681 (0.0028) [2024-06-18 05:10:16,160][12883] Updated weights for policy 0, policy_version 64691 (0.0025) [2024-06-18 05:10:16,994][12645] Fps is (10 sec: 44244.8, 60 sec: 41779.2, 300 sec: 42098.6). 
Total num frames: 1059930112. Throughput: 0: 42112.8. Samples: 1060007180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 05:10:16,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 05:10:19,847][12883] Updated weights for policy 0, policy_version 64701 (0.0040) [2024-06-18 05:10:21,995][12645] Fps is (10 sec: 39314.4, 60 sec: 41231.8, 300 sec: 41931.7). Total num frames: 1060110336. Throughput: 0: 41939.3. Samples: 1060255660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 05:10:21,996][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 05:10:23,938][12883] Updated weights for policy 0, policy_version 64711 (0.0030) [2024-06-18 05:10:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42873.0, 300 sec: 42154.4). Total num frames: 1060372480. Throughput: 0: 42143.0. Samples: 1060508620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 05:10:26,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 05:10:27,475][12883] Updated weights for policy 0, policy_version 64721 (0.0041) [2024-06-18 05:10:31,700][12883] Updated weights for policy 0, policy_version 64731 (0.0040) [2024-06-18 05:10:31,994][12645] Fps is (10 sec: 45883.8, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 1060569088. Throughput: 0: 42270.6. Samples: 1060638460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 05:10:31,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 05:10:35,051][12883] Updated weights for policy 0, policy_version 64741 (0.0033) [2024-06-18 05:10:36,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 1060749312. Throughput: 0: 42072.3. Samples: 1060891380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 22.0) [2024-06-18 05:10:36,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 05:10:37,131][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064744_1060765696.pth... 
[2024-06-18 05:10:37,195][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064128_1050673152.pth [2024-06-18 05:10:39,339][12883] Updated weights for policy 0, policy_version 64751 (0.0029) [2024-06-18 05:10:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 1060995072. Throughput: 0: 42154.8. Samples: 1061140780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-18 05:10:41,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 05:10:43,068][12883] Updated weights for policy 0, policy_version 64761 (0.0033) [2024-06-18 05:10:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1061191680. Throughput: 0: 42156.8. Samples: 1061277060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-18 05:10:46,994][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 05:10:47,008][12883] Updated weights for policy 0, policy_version 64771 (0.0037) [2024-06-18 05:10:50,540][12883] Updated weights for policy 0, policy_version 64781 (0.0045) [2024-06-18 05:10:51,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 41821.3). Total num frames: 1061371904. Throughput: 0: 42025.7. Samples: 1061519140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-18 05:10:51,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 05:10:52,345][12862] Signal inference workers to stop experience collection... (15350 times) [2024-06-18 05:10:52,368][12883] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-06-18 05:10:52,404][12862] Signal inference workers to resume experience collection... (15350 times) [2024-06-18 05:10:52,405][12883] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-06-18 05:10:54,634][12883] Updated weights for policy 0, policy_version 64791 (0.0034) [2024-06-18 05:10:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1061617664. 
Throughput: 0: 42381.4. Samples: 1061779320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-18 05:10:56,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 05:10:58,978][12883] Updated weights for policy 0, policy_version 64801 (0.0048) [2024-06-18 05:11:01,996][12645] Fps is (10 sec: 47502.8, 60 sec: 42050.6, 300 sec: 42154.1). Total num frames: 1061847040. Throughput: 0: 42316.5. Samples: 1061911520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-18 05:11:01,996][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 05:11:02,358][12883] Updated weights for policy 0, policy_version 64811 (0.0032) [2024-06-18 05:11:06,525][12883] Updated weights for policy 0, policy_version 64821 (0.0026) [2024-06-18 05:11:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.6, 300 sec: 41931.9). Total num frames: 1062027264. Throughput: 0: 42295.9. Samples: 1062158900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-18 05:11:06,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 05:11:10,162][12883] Updated weights for policy 0, policy_version 64831 (0.0043) [2024-06-18 05:11:11,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1062256640. Throughput: 0: 42374.4. Samples: 1062415460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-18 05:11:11,994][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 05:11:14,023][12883] Updated weights for policy 0, policy_version 64841 (0.0022) [2024-06-18 05:11:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 1062469632. Throughput: 0: 42431.9. Samples: 1062547900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 26.0) [2024-06-18 05:11:16,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 05:11:17,846][12883] Updated weights for policy 0, policy_version 64851 (0.0054) [2024-06-18 05:11:21,890][12883] Updated weights for policy 0, policy_version 64861 (0.0029) [2024-06-18 05:11:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42872.7, 300 sec: 42098.5). Total num frames: 1062682624. Throughput: 0: 42344.0. Samples: 1062796860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-06-18 05:11:21,994][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 05:11:25,301][12883] Updated weights for policy 0, policy_version 64871 (0.0033) [2024-06-18 05:11:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1062879232. Throughput: 0: 42404.4. Samples: 1063048980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-06-18 05:11:26,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 05:11:29,608][12883] Updated weights for policy 0, policy_version 64881 (0.0040) [2024-06-18 05:11:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 1063092224. Throughput: 0: 42242.2. Samples: 1063177960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-06-18 05:11:31,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 05:11:33,012][12883] Updated weights for policy 0, policy_version 64891 (0.0038) [2024-06-18 05:11:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 1063321600. Throughput: 0: 42584.9. Samples: 1063435460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-06-18 05:11:36,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 05:11:37,353][12883] Updated weights for policy 0, policy_version 64901 (0.0038) [2024-06-18 05:11:40,659][12883] Updated weights for policy 0, policy_version 64911 (0.0031) [2024-06-18 05:11:41,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42320.7). 
Total num frames: 1063534592. Throughput: 0: 42297.5. Samples: 1063682720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-06-18 05:11:41,994][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 05:11:45,434][12883] Updated weights for policy 0, policy_version 64921 (0.0030) [2024-06-18 05:11:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42098.9). Total num frames: 1063747584. Throughput: 0: 42283.0. Samples: 1063814160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-06-18 05:11:46,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 05:11:48,479][12883] Updated weights for policy 0, policy_version 64931 (0.0035) [2024-06-18 05:11:51,995][12645] Fps is (10 sec: 40953.8, 60 sec: 42870.2, 300 sec: 42042.8). Total num frames: 1063944192. Throughput: 0: 42429.9. Samples: 1064068320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-06-18 05:11:51,996][12645] Avg episode reward: [(0, '0.070')] [2024-06-18 05:11:53,078][12883] Updated weights for policy 0, policy_version 64941 (0.0030) [2024-06-18 05:11:53,871][12862] Signal inference workers to stop experience collection... (15400 times) [2024-06-18 05:11:53,891][12883] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-06-18 05:11:53,926][12862] Signal inference workers to resume experience collection... (15400 times) [2024-06-18 05:11:53,929][12883] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-06-18 05:11:56,139][12883] Updated weights for policy 0, policy_version 64951 (0.0040) [2024-06-18 05:11:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1064157184. Throughput: 0: 42332.0. Samples: 1064320400. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 19.0) [2024-06-18 05:11:56,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 05:12:00,833][12883] Updated weights for policy 0, policy_version 64961 (0.0037) [2024-06-18 05:12:02,000][12645] Fps is (10 sec: 44216.8, 60 sec: 42322.5, 300 sec: 42097.7). Total num frames: 1064386560. Throughput: 0: 42330.2. Samples: 1064453020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 05:12:02,000][12645] Avg episode reward: [(0, '0.118')] [2024-06-18 05:12:04,123][12883] Updated weights for policy 0, policy_version 64971 (0.0029) [2024-06-18 05:12:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1064583168. Throughput: 0: 42401.4. Samples: 1064704920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 05:12:06,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 05:12:08,442][12883] Updated weights for policy 0, policy_version 64981 (0.0033) [2024-06-18 05:12:11,718][12883] Updated weights for policy 0, policy_version 64991 (0.0032) [2024-06-18 05:12:11,994][12645] Fps is (10 sec: 42624.5, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 1064812544. Throughput: 0: 42346.6. Samples: 1064954580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 05:12:11,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 05:12:16,352][12883] Updated weights for policy 0, policy_version 65001 (0.0050) [2024-06-18 05:12:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 41988.4). Total num frames: 1064992768. Throughput: 0: 42434.7. Samples: 1065087520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 05:12:16,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 05:12:19,274][12883] Updated weights for policy 0, policy_version 65011 (0.0030) [2024-06-18 05:12:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1065205760. Throughput: 0: 42358.7. Samples: 1065341600. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 05:12:21,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 05:12:24,075][12883] Updated weights for policy 0, policy_version 65021 (0.0029) [2024-06-18 05:12:26,766][12883] Updated weights for policy 0, policy_version 65031 (0.0044) [2024-06-18 05:12:26,994][12645] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42320.7). Total num frames: 1065467904. Throughput: 0: 42418.0. Samples: 1065591520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 05:12:26,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 05:12:31,679][12883] Updated weights for policy 0, policy_version 65041 (0.0036) [2024-06-18 05:12:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 1065664512. Throughput: 0: 42617.4. Samples: 1065731940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 05:12:31,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 05:12:34,329][12883] Updated weights for policy 0, policy_version 65051 (0.0035) [2024-06-18 05:12:36,995][12645] Fps is (10 sec: 39316.3, 60 sec: 42324.4, 300 sec: 42320.5). Total num frames: 1065861120. Throughput: 0: 42567.4. Samples: 1065983840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 05:12:36,995][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 05:12:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065055_1065861120.pth... [2024-06-18 05:12:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064437_1055735808.pth [2024-06-18 05:12:39,286][12883] Updated weights for policy 0, policy_version 65061 (0.0031) [2024-06-18 05:12:41,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1066106880. Throughput: 0: 42426.9. Samples: 1066229620. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 05:12:41,994][12645] Avg episode reward: [(0, '0.147')] [2024-06-18 05:12:42,150][12883] Updated weights for policy 0, policy_version 65071 (0.0037) [2024-06-18 05:12:46,996][12645] Fps is (10 sec: 40956.2, 60 sec: 42050.7, 300 sec: 42042.7). Total num frames: 1066270720. Throughput: 0: 42622.8. Samples: 1066370880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 05:12:46,997][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 05:12:47,213][12883] Updated weights for policy 0, policy_version 65081 (0.0048) [2024-06-18 05:12:50,041][12883] Updated weights for policy 0, policy_version 65091 (0.0033) [2024-06-18 05:12:51,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42599.6, 300 sec: 42265.5). Total num frames: 1066500096. Throughput: 0: 42479.1. Samples: 1066616480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 05:12:51,994][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 05:12:54,718][12883] Updated weights for policy 0, policy_version 65101 (0.0039) [2024-06-18 05:12:56,994][12645] Fps is (10 sec: 49163.5, 60 sec: 43417.6, 300 sec: 42320.7). Total num frames: 1066762240. Throughput: 0: 42498.8. Samples: 1066867020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 05:12:56,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 05:12:58,152][12883] Updated weights for policy 0, policy_version 65111 (0.0026) [2024-06-18 05:13:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41783.6, 300 sec: 42098.6). Total num frames: 1066893312. Throughput: 0: 42504.0. Samples: 1067000200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 05:13:01,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 05:13:02,341][12883] Updated weights for policy 0, policy_version 65121 (0.0024) [2024-06-18 05:13:05,837][12883] Updated weights for policy 0, policy_version 65131 (0.0033) [2024-06-18 05:13:06,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42265.2). 
Total num frames: 1067139072. Throughput: 0: 42463.2. Samples: 1067252440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 05:13:06,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 05:13:09,406][12862] Signal inference workers to stop experience collection... (15450 times) [2024-06-18 05:13:09,452][12883] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-06-18 05:13:09,516][12862] Signal inference workers to resume experience collection... (15450 times) [2024-06-18 05:13:09,516][12883] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-06-18 05:13:10,267][12883] Updated weights for policy 0, policy_version 65141 (0.0027) [2024-06-18 05:13:11,994][12645] Fps is (10 sec: 49151.6, 60 sec: 42871.6, 300 sec: 42321.6). Total num frames: 1067384832. Throughput: 0: 42544.5. Samples: 1067506020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 05:13:11,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 05:13:13,469][12883] Updated weights for policy 0, policy_version 65151 (0.0037) [2024-06-18 05:13:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1067548672. Throughput: 0: 42445.8. Samples: 1067642000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 05:13:16,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 05:13:17,732][12883] Updated weights for policy 0, policy_version 65161 (0.0036) [2024-06-18 05:13:20,857][12883] Updated weights for policy 0, policy_version 65171 (0.0027) [2024-06-18 05:13:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42376.2). Total num frames: 1067794432. Throughput: 0: 42449.3. Samples: 1067894000. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 05:13:21,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 05:13:25,194][12883] Updated weights for policy 0, policy_version 65181 (0.0041) [2024-06-18 05:13:26,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1068007424. Throughput: 0: 42809.9. Samples: 1068156060. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 05:13:26,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 05:13:28,294][12883] Updated weights for policy 0, policy_version 65191 (0.0040) [2024-06-18 05:13:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42265.5). Total num frames: 1068204032. Throughput: 0: 42427.5. Samples: 1068280020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 05:13:31,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 05:13:32,986][12883] Updated weights for policy 0, policy_version 65201 (0.0033) [2024-06-18 05:13:36,319][12883] Updated weights for policy 0, policy_version 65211 (0.0035) [2024-06-18 05:13:36,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42870.8, 300 sec: 42320.4). Total num frames: 1068433408. Throughput: 0: 42569.9. Samples: 1068532220. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 05:13:36,996][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 05:13:40,602][12883] Updated weights for policy 0, policy_version 65221 (0.0031) [2024-06-18 05:13:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1068630016. Throughput: 0: 42781.4. Samples: 1068792180. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 05:13:41,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 05:13:44,029][12883] Updated weights for policy 0, policy_version 65231 (0.0038) [2024-06-18 05:13:46,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42600.0, 300 sec: 42209.9). Total num frames: 1068826624. Throughput: 0: 42606.9. Samples: 1068917520. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 05:13:46,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 05:13:48,159][12883] Updated weights for policy 0, policy_version 65241 (0.0033) [2024-06-18 05:13:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1069056000. Throughput: 0: 42573.7. Samples: 1069168260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 05:13:51,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 05:13:52,083][12883] Updated weights for policy 0, policy_version 65251 (0.0037) [2024-06-18 05:13:55,737][12883] Updated weights for policy 0, policy_version 65261 (0.0033) [2024-06-18 05:13:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.1, 300 sec: 42376.2). Total num frames: 1069285376. Throughput: 0: 42569.1. Samples: 1069421640. Policy #0 lag: (min: 1.0, avg: 11.8, max: 22.0) [2024-06-18 05:13:56,994][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 05:14:00,275][12883] Updated weights for policy 0, policy_version 65271 (0.0030) [2024-06-18 05:14:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1069449216. Throughput: 0: 42334.6. Samples: 1069547060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 05:14:01,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 05:14:03,467][12883] Updated weights for policy 0, policy_version 65281 (0.0027) [2024-06-18 05:14:06,994][12645] Fps is (10 sec: 40961.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1069694976. Throughput: 0: 42381.0. Samples: 1069801140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 05:14:06,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 05:14:08,107][12883] Updated weights for policy 0, policy_version 65291 (0.0037) [2024-06-18 05:14:11,581][12883] Updated weights for policy 0, policy_version 65301 (0.0030) [2024-06-18 05:14:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1069907968. Throughput: 0: 42088.9. Samples: 1070050060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 05:14:11,994][12645] Avg episode reward: [(0, '0.139')] [2024-06-18 05:14:15,735][12883] Updated weights for policy 0, policy_version 65311 (0.0047) [2024-06-18 05:14:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1070088192. Throughput: 0: 42141.8. Samples: 1070176400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 05:14:16,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 05:14:19,536][12883] Updated weights for policy 0, policy_version 65321 (0.0033) [2024-06-18 05:14:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42432.1). Total num frames: 1070317568. Throughput: 0: 42152.0. Samples: 1070428960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 05:14:21,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 05:14:23,623][12883] Updated weights for policy 0, policy_version 65331 (0.0037) [2024-06-18 05:14:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42321.0). Total num frames: 1070530560. Throughput: 0: 42047.1. Samples: 1070684300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 05:14:26,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 05:14:27,149][12883] Updated weights for policy 0, policy_version 65341 (0.0035) [2024-06-18 05:14:31,274][12883] Updated weights for policy 0, policy_version 65351 (0.0035) [2024-06-18 05:14:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1070710784. Throughput: 0: 41961.1. Samples: 1070805760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 05:14:31,994][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 05:14:34,995][12883] Updated weights for policy 0, policy_version 65361 (0.0030) [2024-06-18 05:14:36,994][12645] Fps is (10 sec: 42595.3, 60 sec: 42053.4, 300 sec: 42487.2). Total num frames: 1070956544. Throughput: 0: 42063.4. Samples: 1071061140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 05:14:36,995][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 05:14:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065366_1070956544.pth... [2024-06-18 05:14:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000064744_1060765696.pth [2024-06-18 05:14:38,882][12883] Updated weights for policy 0, policy_version 65371 (0.0038) [2024-06-18 05:14:41,996][12645] Fps is (10 sec: 44226.3, 60 sec: 42050.6, 300 sec: 42264.8). Total num frames: 1071153152. Throughput: 0: 41973.2. Samples: 1071310520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 05:14:41,997][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 05:14:43,175][12883] Updated weights for policy 0, policy_version 65381 (0.0032) [2024-06-18 05:14:44,392][12862] Signal inference workers to stop experience collection... (15500 times) [2024-06-18 05:14:44,392][12862] Signal inference workers to resume experience collection... 
(15500 times) [2024-06-18 05:14:44,426][12883] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-06-18 05:14:44,427][12883] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-06-18 05:14:46,629][12883] Updated weights for policy 0, policy_version 65391 (0.0037) [2024-06-18 05:14:46,994][12645] Fps is (10 sec: 40963.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1071366144. Throughput: 0: 41913.7. Samples: 1071433180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 05:14:46,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 05:14:51,140][12883] Updated weights for policy 0, policy_version 65401 (0.0035) [2024-06-18 05:14:51,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1071579136. Throughput: 0: 42071.5. Samples: 1071694360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 05:14:51,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 05:14:54,601][12883] Updated weights for policy 0, policy_version 65411 (0.0031) [2024-06-18 05:14:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 42265.1). Total num frames: 1071792128. Throughput: 0: 42160.9. Samples: 1071947300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 05:14:56,994][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 05:14:58,851][12883] Updated weights for policy 0, policy_version 65421 (0.0029) [2024-06-18 05:15:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42432.0). Total num frames: 1072005120. Throughput: 0: 42234.1. Samples: 1072076940. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 05:15:01,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 05:15:02,291][12883] Updated weights for policy 0, policy_version 65431 (0.0029) [2024-06-18 05:15:06,551][12883] Updated weights for policy 0, policy_version 65441 (0.0031) [2024-06-18 05:15:07,000][12645] Fps is (10 sec: 40934.4, 60 sec: 41774.8, 300 sec: 42319.8). Total num frames: 1072201728. Throughput: 0: 42186.1. Samples: 1072327600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 05:15:07,000][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 05:15:09,929][12883] Updated weights for policy 0, policy_version 65451 (0.0039) [2024-06-18 05:15:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1072431104. Throughput: 0: 42136.4. Samples: 1072580440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 05:15:11,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 05:15:14,436][12883] Updated weights for policy 0, policy_version 65461 (0.0024) [2024-06-18 05:15:16,994][12645] Fps is (10 sec: 44264.5, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 1072644096. Throughput: 0: 42311.0. Samples: 1072709760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 05:15:16,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 05:15:17,617][12883] Updated weights for policy 0, policy_version 65471 (0.0028) [2024-06-18 05:15:21,996][12645] Fps is (10 sec: 39313.0, 60 sec: 41777.6, 300 sec: 42209.3). Total num frames: 1072824320. Throughput: 0: 42160.8. Samples: 1072958440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 05:15:21,996][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 05:15:22,151][12883] Updated weights for policy 0, policy_version 65481 (0.0035) [2024-06-18 05:15:25,500][12883] Updated weights for policy 0, policy_version 65491 (0.0028) [2024-06-18 05:15:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42320.7). 
Total num frames: 1073053696. Throughput: 0: 42344.8. Samples: 1073215940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 05:15:26,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 05:15:29,915][12883] Updated weights for policy 0, policy_version 65501 (0.0036) [2024-06-18 05:15:31,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1073283072. Throughput: 0: 42545.3. Samples: 1073347720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 05:15:31,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 05:15:33,250][12883] Updated weights for policy 0, policy_version 65511 (0.0028) [2024-06-18 05:15:36,994][12645] Fps is (10 sec: 39318.6, 60 sec: 41506.1, 300 sec: 42209.5). Total num frames: 1073446912. Throughput: 0: 42230.4. Samples: 1073594760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 05:15:36,995][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 05:15:37,437][12883] Updated weights for policy 0, policy_version 65521 (0.0043) [2024-06-18 05:15:40,984][12883] Updated weights for policy 0, policy_version 65531 (0.0037) [2024-06-18 05:15:41,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42325.4, 300 sec: 42375.9). Total num frames: 1073692672. Throughput: 0: 42273.0. Samples: 1073849680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 05:15:41,996][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 05:15:45,080][12883] Updated weights for policy 0, policy_version 65541 (0.0048) [2024-06-18 05:15:46,994][12645] Fps is (10 sec: 45878.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1073905664. Throughput: 0: 42405.0. Samples: 1073985160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 05:15:46,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 05:15:48,598][12883] Updated weights for policy 0, policy_version 65551 (0.0040) [2024-06-18 05:15:52,000][12645] Fps is (10 sec: 39305.8, 60 sec: 41774.8, 300 sec: 42264.3). 
Total num frames: 1074085888. Throughput: 0: 42325.8. Samples: 1074232260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 05:15:52,000][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 05:15:52,742][12883] Updated weights for policy 0, policy_version 65561 (0.0032) [2024-06-18 05:15:56,211][12883] Updated weights for policy 0, policy_version 65571 (0.0046) [2024-06-18 05:15:56,999][12645] Fps is (10 sec: 42575.7, 60 sec: 42321.6, 300 sec: 42320.3). Total num frames: 1074331648. Throughput: 0: 42290.1. Samples: 1074483720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 05:15:56,999][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 05:16:00,295][12883] Updated weights for policy 0, policy_version 65581 (0.0026) [2024-06-18 05:16:01,996][12645] Fps is (10 sec: 45893.5, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 1074544640. Throughput: 0: 42441.4. Samples: 1074619720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:16:01,997][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 05:16:02,659][12862] Signal inference workers to stop experience collection... (15550 times) [2024-06-18 05:16:02,659][12862] Signal inference workers to resume experience collection... (15550 times) [2024-06-18 05:16:02,702][12883] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-06-18 05:16:02,702][12883] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-06-18 05:16:03,675][12883] Updated weights for policy 0, policy_version 65591 (0.0030) [2024-06-18 05:16:06,994][12645] Fps is (10 sec: 39342.3, 60 sec: 42056.6, 300 sec: 42265.1). Total num frames: 1074724864. Throughput: 0: 42358.0. Samples: 1074864460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:16:06,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 05:16:07,949][12883] Updated weights for policy 0, policy_version 65601 (0.0036) [2024-06-18 05:16:11,846][12883] Updated weights for policy 0, policy_version 65611 (0.0029) [2024-06-18 05:16:11,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1074970624. Throughput: 0: 42272.8. Samples: 1075118220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:16:11,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 05:16:15,659][12883] Updated weights for policy 0, policy_version 65621 (0.0038) [2024-06-18 05:16:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1075183616. Throughput: 0: 42357.7. Samples: 1075253820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:16:16,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 05:16:19,414][12883] Updated weights for policy 0, policy_version 65631 (0.0031) [2024-06-18 05:16:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42599.9, 300 sec: 42376.2). Total num frames: 1075380224. Throughput: 0: 42292.2. Samples: 1075497880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:16:21,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 05:16:23,566][12883] Updated weights for policy 0, policy_version 65641 (0.0027) [2024-06-18 05:16:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1075609600. Throughput: 0: 42257.1. Samples: 1075751160. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:16:26,994][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 05:16:27,307][12883] Updated weights for policy 0, policy_version 65651 (0.0032) [2024-06-18 05:16:31,195][12883] Updated weights for policy 0, policy_version 65661 (0.0035) [2024-06-18 05:16:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 1075806208. Throughput: 0: 42154.8. Samples: 1075882140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:16:31,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 05:16:35,035][12883] Updated weights for policy 0, policy_version 65671 (0.0035) [2024-06-18 05:16:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43145.0, 300 sec: 42376.3). Total num frames: 1076035584. Throughput: 0: 42315.1. Samples: 1076136180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:16:36,994][12645] Avg episode reward: [(0, '0.234')] [2024-06-18 05:16:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065676_1076035584.pth... [2024-06-18 05:16:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065055_1065861120.pth [2024-06-18 05:16:39,074][12883] Updated weights for policy 0, policy_version 65681 (0.0029) [2024-06-18 05:16:41,994][12645] Fps is (10 sec: 44238.0, 60 sec: 42600.0, 300 sec: 42376.2). Total num frames: 1076248576. Throughput: 0: 42413.5. Samples: 1076392100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:16:41,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 05:16:42,497][12883] Updated weights for policy 0, policy_version 65691 (0.0047) [2024-06-18 05:16:46,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42050.7, 300 sec: 42320.6). Total num frames: 1076428800. Throughput: 0: 42257.3. Samples: 1076521300. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 05:16:46,997][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 05:16:47,280][12883] Updated weights for policy 0, policy_version 65701 (0.0027) [2024-06-18 05:16:50,390][12883] Updated weights for policy 0, policy_version 65711 (0.0032) [2024-06-18 05:16:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42602.8, 300 sec: 42320.7). Total num frames: 1076641792. Throughput: 0: 42477.7. Samples: 1076775960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 05:16:51,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 05:16:54,833][12883] Updated weights for policy 0, policy_version 65721 (0.0040) [2024-06-18 05:16:56,994][12645] Fps is (10 sec: 45885.7, 60 sec: 42602.2, 300 sec: 42377.1). Total num frames: 1076887552. Throughput: 0: 42477.8. Samples: 1077029720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 05:16:56,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 05:16:58,247][12883] Updated weights for policy 0, policy_version 65731 (0.0042) [2024-06-18 05:17:01,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42052.3, 300 sec: 42320.4). Total num frames: 1077067776. Throughput: 0: 42287.8. Samples: 1077156860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 05:17:01,996][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 05:17:02,393][12883] Updated weights for policy 0, policy_version 65741 (0.0046) [2024-06-18 05:17:05,966][12883] Updated weights for policy 0, policy_version 65751 (0.0036) [2024-06-18 05:17:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 1077297152. Throughput: 0: 42376.4. Samples: 1077404820. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 05:17:06,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 05:17:10,262][12883] Updated weights for policy 0, policy_version 65761 (0.0036) [2024-06-18 05:17:11,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1077510144. Throughput: 0: 42371.3. Samples: 1077657860. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 05:17:11,994][12645] Avg episode reward: [(0, '0.089')] [2024-06-18 05:17:13,688][12883] Updated weights for policy 0, policy_version 65771 (0.0037) [2024-06-18 05:17:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42376.2). Total num frames: 1077706752. Throughput: 0: 42281.6. Samples: 1077784800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 05:17:16,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 05:17:17,899][12883] Updated weights for policy 0, policy_version 65781 (0.0031) [2024-06-18 05:17:21,487][12883] Updated weights for policy 0, policy_version 65791 (0.0028) [2024-06-18 05:17:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1077919744. Throughput: 0: 42300.1. Samples: 1078039680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 05:17:21,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 05:17:25,574][12883] Updated weights for policy 0, policy_version 65801 (0.0030) [2024-06-18 05:17:26,588][12862] Signal inference workers to stop experience collection... (15600 times) [2024-06-18 05:17:26,589][12862] Signal inference workers to resume experience collection... (15600 times) [2024-06-18 05:17:26,600][12883] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-06-18 05:17:26,600][12883] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-06-18 05:17:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1078149120. Throughput: 0: 42288.9. 
Samples: 1078295100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:17:26,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 05:17:29,118][12883] Updated weights for policy 0, policy_version 65811 (0.0033) [2024-06-18 05:17:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.4, 300 sec: 42265.4). Total num frames: 1078329344. Throughput: 0: 42266.5. Samples: 1078423200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:17:31,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 05:17:33,255][12883] Updated weights for policy 0, policy_version 65821 (0.0040) [2024-06-18 05:17:36,818][12883] Updated weights for policy 0, policy_version 65831 (0.0040) [2024-06-18 05:17:36,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1078575104. Throughput: 0: 42207.9. Samples: 1078675320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:17:36,995][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 05:17:40,963][12883] Updated weights for policy 0, policy_version 65841 (0.0044) [2024-06-18 05:17:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 42432.1). Total num frames: 1078788096. Throughput: 0: 42167.6. Samples: 1078927260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:17:41,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 05:17:44,733][12883] Updated weights for policy 0, policy_version 65851 (0.0032) [2024-06-18 05:17:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 1078968320. Throughput: 0: 42197.2. Samples: 1079055640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:17:46,999][12645] Avg episode reward: [(0, '0.083')] [2024-06-18 05:17:49,046][12883] Updated weights for policy 0, policy_version 65861 (0.0037) [2024-06-18 05:17:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42154.1). Total num frames: 1079197696. Throughput: 0: 42273.1. Samples: 1079307100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:17:51,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 05:17:52,383][12883] Updated weights for policy 0, policy_version 65871 (0.0028) [2024-06-18 05:17:56,873][12883] Updated weights for policy 0, policy_version 65881 (0.0040) [2024-06-18 05:17:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1079394304. Throughput: 0: 42343.0. Samples: 1079563300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:17:56,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 05:17:59,977][12883] Updated weights for policy 0, policy_version 65891 (0.0035) [2024-06-18 05:18:01,994][12645] Fps is (10 sec: 40958.1, 60 sec: 42326.6, 300 sec: 42265.1). Total num frames: 1079607296. Throughput: 0: 42211.6. Samples: 1079684340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 05:18:01,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 05:18:04,639][12883] Updated weights for policy 0, policy_version 65901 (0.0038) [2024-06-18 05:18:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 1079803904. Throughput: 0: 42156.3. Samples: 1079936720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 05:18:06,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 05:18:07,805][12883] Updated weights for policy 0, policy_version 65911 (0.0026) [2024-06-18 05:18:11,993][12645] Fps is (10 sec: 40962.1, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1080016896. Throughput: 0: 42121.9. Samples: 1080190580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 05:18:11,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 05:18:12,284][12883] Updated weights for policy 0, policy_version 65921 (0.0033) [2024-06-18 05:18:15,971][12883] Updated weights for policy 0, policy_version 65931 (0.0033) [2024-06-18 05:18:16,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42265.2). 
Total num frames: 1080262656. Throughput: 0: 42193.4. Samples: 1080321900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 05:18:16,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 05:18:19,845][12883] Updated weights for policy 0, policy_version 65941 (0.0040) [2024-06-18 05:18:21,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1080442880. Throughput: 0: 42169.0. Samples: 1080572920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 05:18:21,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 05:18:23,693][12883] Updated weights for policy 0, policy_version 65951 (0.0035) [2024-06-18 05:18:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1080655872. Throughput: 0: 42226.7. Samples: 1080827460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 05:18:26,994][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 05:18:27,498][12883] Updated weights for policy 0, policy_version 65961 (0.0033) [2024-06-18 05:18:31,540][12883] Updated weights for policy 0, policy_version 65971 (0.0037) [2024-06-18 05:18:31,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42154.4). Total num frames: 1080868864. Throughput: 0: 42180.5. Samples: 1080953760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 05:18:31,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 05:18:35,483][12883] Updated weights for policy 0, policy_version 65981 (0.0034) [2024-06-18 05:18:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 1081065472. Throughput: 0: 42107.8. Samples: 1081201960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 05:18:36,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 05:18:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065983_1081065472.pth... 
[2024-06-18 05:18:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065366_1070956544.pth [2024-06-18 05:18:39,372][12883] Updated weights for policy 0, policy_version 65991 (0.0036) [2024-06-18 05:18:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1081294848. Throughput: 0: 42063.7. Samples: 1081456160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 05:18:41,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 05:18:42,970][12883] Updated weights for policy 0, policy_version 66001 (0.0035) [2024-06-18 05:18:45,978][12862] Signal inference workers to stop experience collection... (15650 times) [2024-06-18 05:18:46,024][12883] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-06-18 05:18:46,038][12862] Signal inference workers to resume experience collection... (15650 times) [2024-06-18 05:18:46,048][12883] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-06-18 05:18:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1081507840. Throughput: 0: 42305.3. Samples: 1081588060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-18 05:18:46,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 05:18:47,099][12883] Updated weights for policy 0, policy_version 66011 (0.0026) [2024-06-18 05:18:51,068][12883] Updated weights for policy 0, policy_version 66021 (0.0035) [2024-06-18 05:18:51,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42050.6, 300 sec: 42153.8). Total num frames: 1081720832. Throughput: 0: 42250.0. Samples: 1081838060. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-18 05:18:51,996][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 05:18:54,813][12883] Updated weights for policy 0, policy_version 66031 (0.0033) [2024-06-18 05:18:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1081933824. 
Throughput: 0: 42167.9. Samples: 1082088140. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-18 05:18:56,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 05:18:59,267][12883] Updated weights for policy 0, policy_version 66041 (0.0039) [2024-06-18 05:19:01,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42325.7, 300 sec: 42209.6). Total num frames: 1082146816. Throughput: 0: 42100.6. Samples: 1082216420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-18 05:19:01,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 05:19:02,565][12883] Updated weights for policy 0, policy_version 66051 (0.0033) [2024-06-18 05:19:06,954][12883] Updated weights for policy 0, policy_version 66061 (0.0039) [2024-06-18 05:19:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1082343424. Throughput: 0: 42051.2. Samples: 1082465220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-18 05:19:06,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 05:19:10,418][12883] Updated weights for policy 0, policy_version 66071 (0.0026) [2024-06-18 05:19:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.2, 300 sec: 42320.7). Total num frames: 1082572800. Throughput: 0: 42076.7. Samples: 1082720920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-18 05:19:11,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 05:19:14,617][12883] Updated weights for policy 0, policy_version 66081 (0.0028) [2024-06-18 05:19:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 1082753024. Throughput: 0: 42114.9. Samples: 1082848940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-18 05:19:16,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 05:19:17,970][12883] Updated weights for policy 0, policy_version 66091 (0.0036) [2024-06-18 05:19:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1082982400. 
Throughput: 0: 42295.2. Samples: 1083105240. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-18 05:19:21,994][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 05:19:22,194][12883] Updated weights for policy 0, policy_version 66101 (0.0036) [2024-06-18 05:19:25,779][12883] Updated weights for policy 0, policy_version 66111 (0.0039) [2024-06-18 05:19:26,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1083195392. Throughput: 0: 42143.6. Samples: 1083352620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:19:26,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 05:19:29,936][12883] Updated weights for policy 0, policy_version 66121 (0.0050) [2024-06-18 05:19:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.7). Total num frames: 1083408384. Throughput: 0: 42095.1. Samples: 1083482340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:19:31,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 05:19:33,840][12883] Updated weights for policy 0, policy_version 66131 (0.0032) [2024-06-18 05:19:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 1083604992. Throughput: 0: 42023.0. Samples: 1083729000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:19:36,994][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 05:19:37,654][12883] Updated weights for policy 0, policy_version 66141 (0.0050) [2024-06-18 05:19:41,662][12883] Updated weights for policy 0, policy_version 66151 (0.0036) [2024-06-18 05:19:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1083817984. Throughput: 0: 42122.6. Samples: 1083983660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:19:41,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 05:19:45,587][12883] Updated weights for policy 0, policy_version 66161 (0.0025) [2024-06-18 05:19:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 1084030976. Throughput: 0: 42030.0. Samples: 1084107780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:19:46,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 05:19:49,393][12883] Updated weights for policy 0, policy_version 66171 (0.0028) [2024-06-18 05:19:52,000][12645] Fps is (10 sec: 44209.5, 60 sec: 42322.5, 300 sec: 42264.3). Total num frames: 1084260352. Throughput: 0: 42171.0. Samples: 1084363180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:19:52,000][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 05:19:53,326][12883] Updated weights for policy 0, policy_version 66181 (0.0025) [2024-06-18 05:19:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1084456960. Throughput: 0: 42120.8. Samples: 1084616360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:19:56,994][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 05:19:57,202][12883] Updated weights for policy 0, policy_version 66191 (0.0031) [2024-06-18 05:20:01,254][12883] Updated weights for policy 0, policy_version 66201 (0.0040) [2024-06-18 05:20:01,994][12645] Fps is (10 sec: 40984.2, 60 sec: 42051.9, 300 sec: 42266.0). Total num frames: 1084669952. Throughput: 0: 41956.2. Samples: 1084736980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:20:01,995][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 05:20:04,951][12883] Updated weights for policy 0, policy_version 66211 (0.0035) [2024-06-18 05:20:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1084882944. Throughput: 0: 41995.9. Samples: 1084995060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:20:06,994][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 05:20:08,979][12883] Updated weights for policy 0, policy_version 66221 (0.0047) [2024-06-18 05:20:11,994][12645] Fps is (10 sec: 40961.7, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1085079552. Throughput: 0: 42085.3. Samples: 1085246460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:20:11,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 05:20:13,000][12883] Updated weights for policy 0, policy_version 66231 (0.0035) [2024-06-18 05:20:16,482][12883] Updated weights for policy 0, policy_version 66241 (0.0035) [2024-06-18 05:20:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42265.5). Total num frames: 1085292544. Throughput: 0: 42067.6. Samples: 1085375380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:20:16,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 05:20:18,627][12862] Signal inference workers to stop experience collection... (15700 times) [2024-06-18 05:20:18,627][12862] Signal inference workers to resume experience collection... (15700 times) [2024-06-18 05:20:18,659][12883] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-06-18 05:20:18,659][12883] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-06-18 05:20:20,749][12883] Updated weights for policy 0, policy_version 66251 (0.0034) [2024-06-18 05:20:21,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 1085521920. Throughput: 0: 42238.6. Samples: 1085629740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:20:21,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 05:20:24,713][12883] Updated weights for policy 0, policy_version 66261 (0.0031) [2024-06-18 05:20:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1085734912. Throughput: 0: 41972.5. 
Samples: 1085872420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:20:26,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 05:20:28,315][12883] Updated weights for policy 0, policy_version 66271 (0.0034) [2024-06-18 05:20:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42320.8). Total num frames: 1085931520. Throughput: 0: 42179.2. Samples: 1086005840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:20:31,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 05:20:32,744][12883] Updated weights for policy 0, policy_version 66281 (0.0041) [2024-06-18 05:20:36,203][12883] Updated weights for policy 0, policy_version 66291 (0.0027) [2024-06-18 05:20:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42209.9). Total num frames: 1086144512. Throughput: 0: 42115.6. Samples: 1086258120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:20:36,994][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 05:20:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066293_1086144512.pth... [2024-06-18 05:20:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065676_1076035584.pth [2024-06-18 05:20:40,403][12883] Updated weights for policy 0, policy_version 66301 (0.0037) [2024-06-18 05:20:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1086357504. Throughput: 0: 41991.2. Samples: 1086505960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:20:41,994][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 05:20:43,990][12883] Updated weights for policy 0, policy_version 66311 (0.0025) [2024-06-18 05:20:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42266.0). Total num frames: 1086554112. Throughput: 0: 42129.6. Samples: 1086632800. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:20:46,995][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 05:20:48,184][12883] Updated weights for policy 0, policy_version 66321 (0.0036) [2024-06-18 05:20:51,783][12883] Updated weights for policy 0, policy_version 66331 (0.0031) [2024-06-18 05:20:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41783.4, 300 sec: 42154.8). Total num frames: 1086767104. Throughput: 0: 42026.5. Samples: 1086886260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:20:51,995][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 05:20:56,028][12883] Updated weights for policy 0, policy_version 66341 (0.0025) [2024-06-18 05:20:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42154.4). Total num frames: 1086980096. Throughput: 0: 42139.5. Samples: 1087142740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:20:56,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 05:20:59,567][12883] Updated weights for policy 0, policy_version 66351 (0.0034) [2024-06-18 05:21:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1087209472. Throughput: 0: 41940.8. Samples: 1087262720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:21:01,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 05:21:03,741][12883] Updated weights for policy 0, policy_version 66361 (0.0036) [2024-06-18 05:21:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1087406080. Throughput: 0: 41882.3. Samples: 1087514440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:21:06,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 05:21:07,187][12883] Updated weights for policy 0, policy_version 66371 (0.0034) [2024-06-18 05:21:11,817][12883] Updated weights for policy 0, policy_version 66381 (0.0047) [2024-06-18 05:21:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42098.6). 
Total num frames: 1087602688. Throughput: 0: 42311.1. Samples: 1087776420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:21:11,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 05:21:14,750][12883] Updated weights for policy 0, policy_version 66391 (0.0033) [2024-06-18 05:21:16,994][12645] Fps is (10 sec: 44234.3, 60 sec: 42597.9, 300 sec: 42265.1). Total num frames: 1087848448. Throughput: 0: 41974.5. Samples: 1087894720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:21:16,995][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 05:21:19,551][12883] Updated weights for policy 0, policy_version 66401 (0.0029) [2024-06-18 05:21:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 1088028672. Throughput: 0: 41998.3. Samples: 1088148040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:21:21,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 05:21:22,490][12883] Updated weights for policy 0, policy_version 66411 (0.0029) [2024-06-18 05:21:26,994][12645] Fps is (10 sec: 37685.6, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 1088225280. Throughput: 0: 42109.9. Samples: 1088400900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:21:26,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 05:21:27,431][12883] Updated weights for policy 0, policy_version 66421 (0.0028) [2024-06-18 05:21:28,976][12862] Signal inference workers to stop experience collection... (15750 times) [2024-06-18 05:21:28,977][12862] Signal inference workers to resume experience collection... 
(15750 times) [2024-06-18 05:21:28,991][12883] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-06-18 05:21:28,992][12883] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-06-18 05:21:30,125][12883] Updated weights for policy 0, policy_version 66431 (0.0030) [2024-06-18 05:21:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1088471040. Throughput: 0: 42034.7. Samples: 1088524360. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 05:21:31,994][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 05:21:35,142][12883] Updated weights for policy 0, policy_version 66441 (0.0042) [2024-06-18 05:21:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 1088667648. Throughput: 0: 42187.3. Samples: 1088784680. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 05:21:36,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 05:21:37,959][12883] Updated weights for policy 0, policy_version 66451 (0.0033) [2024-06-18 05:21:41,996][12645] Fps is (10 sec: 39313.7, 60 sec: 41777.8, 300 sec: 42154.1). Total num frames: 1088864256. Throughput: 0: 41978.5. Samples: 1089031860. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 05:21:41,996][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 05:21:42,992][12883] Updated weights for policy 0, policy_version 66461 (0.0032) [2024-06-18 05:21:45,571][12883] Updated weights for policy 0, policy_version 66471 (0.0038) [2024-06-18 05:21:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1089077248. Throughput: 0: 42090.6. Samples: 1089156800. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 05:21:46,995][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 05:21:50,813][12883] Updated weights for policy 0, policy_version 66481 (0.0036) [2024-06-18 05:21:51,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42052.5, 300 sec: 42043.0). Total num frames: 1089290240. Throughput: 0: 42137.5. Samples: 1089410620. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 05:21:51,994][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 05:21:53,275][12883] Updated weights for policy 0, policy_version 66491 (0.0031) [2024-06-18 05:21:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42154.4). Total num frames: 1089503232. Throughput: 0: 41920.8. Samples: 1089662860. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 05:21:56,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 05:21:58,519][12883] Updated weights for policy 0, policy_version 66501 (0.0029) [2024-06-18 05:22:01,487][12883] Updated weights for policy 0, policy_version 66511 (0.0029) [2024-06-18 05:22:01,996][12645] Fps is (10 sec: 42588.4, 60 sec: 41777.7, 300 sec: 42098.2). Total num frames: 1089716224. Throughput: 0: 42058.0. Samples: 1089787400. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 05:22:01,997][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 05:22:06,119][12883] Updated weights for policy 0, policy_version 66521 (0.0029) [2024-06-18 05:22:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41987.4). Total num frames: 1089896448. Throughput: 0: 41983.4. Samples: 1090037300. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 05:22:06,995][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 05:22:09,372][12883] Updated weights for policy 0, policy_version 66531 (0.0028) [2024-06-18 05:22:11,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1090125824. Throughput: 0: 41869.7. Samples: 1090285040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:22:11,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 05:22:13,846][12883] Updated weights for policy 0, policy_version 66541 (0.0038) [2024-06-18 05:22:16,994][12645] Fps is (10 sec: 45875.7, 60 sec: 41779.6, 300 sec: 42154.1). Total num frames: 1090355200. Throughput: 0: 42006.7. Samples: 1090414660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:22:16,995][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 05:22:17,231][12883] Updated weights for policy 0, policy_version 66551 (0.0040) [2024-06-18 05:22:21,457][12883] Updated weights for policy 0, policy_version 66561 (0.0035) [2024-06-18 05:22:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1090535424. Throughput: 0: 41788.9. Samples: 1090665180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:22:21,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 05:22:24,950][12883] Updated weights for policy 0, policy_version 66571 (0.0030) [2024-06-18 05:22:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1090748416. Throughput: 0: 41844.5. Samples: 1090914780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:22:26,994][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 05:22:29,170][12883] Updated weights for policy 0, policy_version 66581 (0.0041) [2024-06-18 05:22:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1090977792. Throughput: 0: 41976.5. Samples: 1091045740. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:22:31,994][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 05:22:32,727][12883] Updated weights for policy 0, policy_version 66591 (0.0033) [2024-06-18 05:22:36,835][12883] Updated weights for policy 0, policy_version 66601 (0.0033) [2024-06-18 05:22:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 1091190784. Throughput: 0: 42027.7. Samples: 1091301880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:22:37,000][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 05:22:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066601_1091190784.pth... [2024-06-18 05:22:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000065983_1081065472.pth [2024-06-18 05:22:40,666][12883] Updated weights for policy 0, policy_version 66611 (0.0021) [2024-06-18 05:22:41,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42325.2, 300 sec: 42153.8). Total num frames: 1091403776. Throughput: 0: 41896.2. Samples: 1091548280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:22:41,997][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 05:22:44,507][12883] Updated weights for policy 0, policy_version 66621 (0.0032) [2024-06-18 05:22:46,996][12645] Fps is (10 sec: 39313.5, 60 sec: 41777.7, 300 sec: 41987.1). Total num frames: 1091584000. Throughput: 0: 41965.3. Samples: 1091675840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:22:46,997][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 05:22:48,593][12883] Updated weights for policy 0, policy_version 66631 (0.0028) [2024-06-18 05:22:51,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1091813376. Throughput: 0: 42028.1. Samples: 1091928560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 05:22:51,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 05:22:52,848][12883] Updated weights for policy 0, policy_version 66641 (0.0033) [2024-06-18 05:22:56,250][12883] Updated weights for policy 0, policy_version 66651 (0.0032) [2024-06-18 05:22:56,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1092042752. Throughput: 0: 42090.7. Samples: 1092179120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 05:22:56,999][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 05:23:00,607][12883] Updated weights for policy 0, policy_version 66661 (0.0033) [2024-06-18 05:23:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42053.8, 300 sec: 42154.1). Total num frames: 1092239360. Throughput: 0: 42093.7. Samples: 1092308880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 05:23:01,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 05:23:04,125][12883] Updated weights for policy 0, policy_version 66671 (0.0039) [2024-06-18 05:23:06,732][12862] Signal inference workers to stop experience collection... (15800 times) [2024-06-18 05:23:06,763][12883] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-06-18 05:23:06,792][12862] Signal inference workers to resume experience collection... (15800 times) [2024-06-18 05:23:06,793][12883] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-06-18 05:23:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 1092452352. Throughput: 0: 42040.0. Samples: 1092556980. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 05:23:06,994][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 05:23:08,321][12883] Updated weights for policy 0, policy_version 66681 (0.0042) [2024-06-18 05:23:11,959][12883] Updated weights for policy 0, policy_version 66691 (0.0039) [2024-06-18 05:23:11,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1092665344. Throughput: 0: 42341.4. Samples: 1092820140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 05:23:11,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 05:23:15,984][12883] Updated weights for policy 0, policy_version 66701 (0.0039) [2024-06-18 05:23:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 1092861952. Throughput: 0: 42132.0. Samples: 1092941680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 05:23:16,995][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 05:23:19,713][12883] Updated weights for policy 0, policy_version 66711 (0.0036) [2024-06-18 05:23:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 1093074944. Throughput: 0: 42039.2. Samples: 1093193640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 05:23:21,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 05:23:23,611][12883] Updated weights for policy 0, policy_version 66721 (0.0032) [2024-06-18 05:23:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 1093287936. Throughput: 0: 42329.2. Samples: 1093453000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 05:23:26,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 05:23:27,597][12883] Updated weights for policy 0, policy_version 66731 (0.0031) [2024-06-18 05:23:31,202][12883] Updated weights for policy 0, policy_version 66741 (0.0031) [2024-06-18 05:23:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1093500928. Throughput: 0: 42219.5. Samples: 1093575620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:23:31,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 05:23:35,303][12883] Updated weights for policy 0, policy_version 66751 (0.0026) [2024-06-18 05:23:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.4, 300 sec: 42098.5). Total num frames: 1093713920. Throughput: 0: 42197.4. Samples: 1093827440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:23:36,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 05:23:39,443][12883] Updated weights for policy 0, policy_version 66761 (0.0040) [2024-06-18 05:23:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.9, 300 sec: 42098.5). Total num frames: 1093926912. Throughput: 0: 42269.0. Samples: 1094081220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:23:41,994][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 05:23:42,879][12883] Updated weights for policy 0, policy_version 66771 (0.0026) [2024-06-18 05:23:46,995][12645] Fps is (10 sec: 40956.6, 60 sec: 42326.3, 300 sec: 42043.2). Total num frames: 1094123520. Throughput: 0: 42178.0. Samples: 1094206920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:23:46,995][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 05:23:47,672][12883] Updated weights for policy 0, policy_version 66781 (0.0027) [2024-06-18 05:23:50,654][12883] Updated weights for policy 0, policy_version 66791 (0.0028) [2024-06-18 05:23:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1094336512. Throughput: 0: 42215.1. Samples: 1094456660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:23:51,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 05:23:55,213][12883] Updated weights for policy 0, policy_version 66801 (0.0043) [2024-06-18 05:23:56,994][12645] Fps is (10 sec: 42601.6, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1094549504. Throughput: 0: 42132.3. Samples: 1094716100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:23:56,994][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 05:23:58,449][12883] Updated weights for policy 0, policy_version 66811 (0.0031) [2024-06-18 05:24:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1094746112. Throughput: 0: 42113.8. Samples: 1094836800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:24:01,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 05:24:02,645][12883] Updated weights for policy 0, policy_version 66821 (0.0034) [2024-06-18 05:24:06,117][12883] Updated weights for policy 0, policy_version 66831 (0.0031) [2024-06-18 05:24:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1094991872. Throughput: 0: 42089.8. Samples: 1095087680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:24:06,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 05:24:10,478][12883] Updated weights for policy 0, policy_version 66841 (0.0034) [2024-06-18 05:24:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 1095172096. Throughput: 0: 42153.8. Samples: 1095349920. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-18 05:24:11,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 05:24:13,910][12883] Updated weights for policy 0, policy_version 66851 (0.0036) [2024-06-18 05:24:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 1095401472. Throughput: 0: 41998.2. Samples: 1095465540. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-18 05:24:16,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 05:24:18,689][12883] Updated weights for policy 0, policy_version 66861 (0.0037) [2024-06-18 05:24:21,913][12883] Updated weights for policy 0, policy_version 66871 (0.0035) [2024-06-18 05:24:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 1095614464. Throughput: 0: 41994.7. Samples: 1095717200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-18 05:24:21,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 05:24:26,430][12883] Updated weights for policy 0, policy_version 66881 (0.0029) [2024-06-18 05:24:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1095811072. Throughput: 0: 42171.9. Samples: 1095978960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-18 05:24:26,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 05:24:29,474][12883] Updated weights for policy 0, policy_version 66891 (0.0053) [2024-06-18 05:24:32,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42321.0, 300 sec: 42153.2). Total num frames: 1096040448. Throughput: 0: 42021.2. Samples: 1096098100. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-18 05:24:32,000][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 05:24:34,014][12883] Updated weights for policy 0, policy_version 66901 (0.0040) [2024-06-18 05:24:34,280][12862] Signal inference workers to stop experience collection... (15850 times) [2024-06-18 05:24:34,280][12862] Signal inference workers to resume experience collection... (15850 times) [2024-06-18 05:24:34,296][12883] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-06-18 05:24:34,296][12883] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-06-18 05:24:36,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42050.7, 300 sec: 42098.2). Total num frames: 1096237056. Throughput: 0: 42174.3. Samples: 1096354600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-18 05:24:36,997][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 05:24:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066910_1096253440.pth... [2024-06-18 05:24:37,118][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066293_1086144512.pth [2024-06-18 05:24:37,470][12883] Updated weights for policy 0, policy_version 66911 (0.0023) [2024-06-18 05:24:41,659][12883] Updated weights for policy 0, policy_version 66921 (0.0026) [2024-06-18 05:24:41,994][12645] Fps is (10 sec: 40985.8, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 1096450048. Throughput: 0: 42144.6. Samples: 1096612600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-18 05:24:41,994][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 05:24:45,217][12883] Updated weights for policy 0, policy_version 66931 (0.0031) [2024-06-18 05:24:46,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.9, 300 sec: 42043.9). Total num frames: 1096663040. Throughput: 0: 42241.0. Samples: 1096737640. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-18 05:24:46,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 05:24:49,453][12883] Updated weights for policy 0, policy_version 66941 (0.0025) [2024-06-18 05:24:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1096859648. Throughput: 0: 42113.8. Samples: 1096982800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-18 05:24:51,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 05:24:53,062][12883] Updated weights for policy 0, policy_version 66951 (0.0028) [2024-06-18 05:24:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42043.1). Total num frames: 1097072640. Throughput: 0: 41968.9. Samples: 1097238520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 05:24:56,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 05:24:57,305][12883] Updated weights for policy 0, policy_version 66961 (0.0040) [2024-06-18 05:25:00,877][12883] Updated weights for policy 0, policy_version 66971 (0.0031) [2024-06-18 05:25:02,000][12645] Fps is (10 sec: 45846.7, 60 sec: 42867.1, 300 sec: 42153.2). Total num frames: 1097318400. Throughput: 0: 41956.4. Samples: 1097353840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 05:25:02,000][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 05:25:05,372][12883] Updated weights for policy 0, policy_version 66981 (0.0042) [2024-06-18 05:25:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 1097498624. Throughput: 0: 42079.9. Samples: 1097610800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 05:25:06,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 05:25:08,608][12883] Updated weights for policy 0, policy_version 66991 (0.0039) [2024-06-18 05:25:11,994][12645] Fps is (10 sec: 36067.5, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 1097678848. Throughput: 0: 42072.1. Samples: 1097872200. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 05:25:11,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 05:25:13,014][12883] Updated weights for policy 0, policy_version 67001 (0.0025) [2024-06-18 05:25:16,499][12883] Updated weights for policy 0, policy_version 67011 (0.0041) [2024-06-18 05:25:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 1097924608. Throughput: 0: 41964.5. Samples: 1097986240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 05:25:16,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 05:25:20,623][12883] Updated weights for policy 0, policy_version 67021 (0.0029) [2024-06-18 05:25:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1098137600. Throughput: 0: 41896.4. Samples: 1098239840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 05:25:21,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 05:25:24,254][12883] Updated weights for policy 0, policy_version 67031 (0.0029) [2024-06-18 05:25:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 1098317824. Throughput: 0: 41975.6. Samples: 1098501500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 05:25:26,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 05:25:28,303][12883] Updated weights for policy 0, policy_version 67041 (0.0028) [2024-06-18 05:25:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41783.6, 300 sec: 42043.0). Total num frames: 1098547200. Throughput: 0: 41809.8. Samples: 1098619080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 05:25:31,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 05:25:32,097][12883] Updated weights for policy 0, policy_version 67051 (0.0038) [2024-06-18 05:25:36,369][12883] Updated weights for policy 0, policy_version 67061 (0.0047) [2024-06-18 05:25:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42053.9, 300 sec: 42043.0). Total num frames: 1098760192. Throughput: 0: 41936.4. Samples: 1098869940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 05:25:36,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 05:25:39,791][12883] Updated weights for policy 0, policy_version 67071 (0.0036) [2024-06-18 05:25:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 1098956800. Throughput: 0: 41830.2. Samples: 1099120880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 05:25:41,994][12645] Avg episode reward: [(0, '0.076')] [2024-06-18 05:25:44,062][12883] Updated weights for policy 0, policy_version 67081 (0.0039) [2024-06-18 05:25:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1099169792. Throughput: 0: 42104.1. Samples: 1099248260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 05:25:46,994][12645] Avg episode reward: [(0, '0.088')] [2024-06-18 05:25:47,696][12883] Updated weights for policy 0, policy_version 67091 (0.0042) [2024-06-18 05:25:51,864][12883] Updated weights for policy 0, policy_version 67101 (0.0034) [2024-06-18 05:25:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1099382784. Throughput: 0: 41947.1. Samples: 1099498420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 05:25:51,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 05:25:52,976][12862] Signal inference workers to stop experience collection... 
(15900 times) [2024-06-18 05:25:53,001][12883] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-06-18 05:25:53,040][12862] Signal inference workers to resume experience collection... (15900 times) [2024-06-18 05:25:53,041][12883] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-06-18 05:25:55,791][12883] Updated weights for policy 0, policy_version 67111 (0.0031) [2024-06-18 05:25:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 1099595776. Throughput: 0: 41762.1. Samples: 1099751500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 05:25:56,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 05:26:00,120][12883] Updated weights for policy 0, policy_version 67121 (0.0043) [2024-06-18 05:26:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41237.3, 300 sec: 41987.5). Total num frames: 1099792384. Throughput: 0: 41928.8. Samples: 1099873040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 05:26:01,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 05:26:03,768][12883] Updated weights for policy 0, policy_version 67131 (0.0036) [2024-06-18 05:26:07,000][12645] Fps is (10 sec: 40934.5, 60 sec: 41774.9, 300 sec: 42042.1). Total num frames: 1100005376. Throughput: 0: 41892.0. Samples: 1100125240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 05:26:07,000][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 05:26:07,824][12883] Updated weights for policy 0, policy_version 67141 (0.0035) [2024-06-18 05:26:11,861][12883] Updated weights for policy 0, policy_version 67151 (0.0039) [2024-06-18 05:26:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.1, 300 sec: 41876.5). Total num frames: 1100201984. Throughput: 0: 41760.2. Samples: 1100380720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 05:26:11,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 05:26:15,534][12883] Updated weights for policy 0, policy_version 67161 (0.0033) [2024-06-18 05:26:16,996][12645] Fps is (10 sec: 42615.5, 60 sec: 41777.6, 300 sec: 42042.7). Total num frames: 1100431360. Throughput: 0: 41839.6. Samples: 1100501960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:26:16,997][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 05:26:19,666][12883] Updated weights for policy 0, policy_version 67171 (0.0042) [2024-06-18 05:26:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 1100644352. Throughput: 0: 42018.1. Samples: 1100760760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:26:21,995][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 05:26:23,321][12883] Updated weights for policy 0, policy_version 67181 (0.0038) [2024-06-18 05:26:26,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1100840960. Throughput: 0: 41905.3. Samples: 1101006620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:26:26,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 05:26:27,580][12883] Updated weights for policy 0, policy_version 67191 (0.0033) [2024-06-18 05:26:31,070][12883] Updated weights for policy 0, policy_version 67201 (0.0025) [2024-06-18 05:26:31,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1101053952. Throughput: 0: 41770.7. Samples: 1101127940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:26:31,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 05:26:35,315][12883] Updated weights for policy 0, policy_version 67211 (0.0034) [2024-06-18 05:26:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42098.8). Total num frames: 1101283328. Throughput: 0: 42093.5. Samples: 1101392620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:26:36,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 05:26:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067217_1101283328.pth... [2024-06-18 05:26:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066601_1091190784.pth [2024-06-18 05:26:38,710][12883] Updated weights for policy 0, policy_version 67221 (0.0044) [2024-06-18 05:26:41,994][12645] Fps is (10 sec: 40957.6, 60 sec: 41778.9, 300 sec: 41987.4). Total num frames: 1101463552. Throughput: 0: 41840.5. Samples: 1101634340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:26:41,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 05:26:42,912][12883] Updated weights for policy 0, policy_version 67231 (0.0042) [2024-06-18 05:26:46,464][12883] Updated weights for policy 0, policy_version 67241 (0.0033) [2024-06-18 05:26:46,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1101692928. Throughput: 0: 41856.9. Samples: 1101756600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:26:46,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 05:26:50,652][12883] Updated weights for policy 0, policy_version 67251 (0.0041) [2024-06-18 05:26:51,994][12645] Fps is (10 sec: 40961.9, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 1101873152. Throughput: 0: 41922.7. Samples: 1102011500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:26:51,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 05:26:54,303][12883] Updated weights for policy 0, policy_version 67261 (0.0026) [2024-06-18 05:26:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41987.8). Total num frames: 1102102528. Throughput: 0: 41709.8. Samples: 1102257660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 05:26:57,003][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 05:26:58,651][12883] Updated weights for policy 0, policy_version 67271 (0.0042) [2024-06-18 05:27:01,871][12883] Updated weights for policy 0, policy_version 67281 (0.0040) [2024-06-18 05:27:01,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1102331904. Throughput: 0: 41977.3. Samples: 1102390840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:27:01,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 05:27:06,238][12883] Updated weights for policy 0, policy_version 67291 (0.0038) [2024-06-18 05:27:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42056.7, 300 sec: 42043.0). Total num frames: 1102528512. Throughput: 0: 41882.4. Samples: 1102645460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:27:06,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 05:27:09,698][12883] Updated weights for policy 0, policy_version 67301 (0.0034) [2024-06-18 05:27:11,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1102741504. Throughput: 0: 41923.0. Samples: 1102893160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:27:11,995][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 05:27:13,959][12883] Updated weights for policy 0, policy_version 67311 (0.0042) [2024-06-18 05:27:16,994][12645] Fps is (10 sec: 40956.6, 60 sec: 41780.2, 300 sec: 42042.9). Total num frames: 1102938112. Throughput: 0: 42157.0. Samples: 1103025040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:27:16,995][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 05:27:17,335][12862] Signal inference workers to stop experience collection... 
(15950 times) [2024-06-18 05:27:17,386][12883] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-06-18 05:27:17,393][12862] Signal inference workers to resume experience collection... (15950 times) [2024-06-18 05:27:17,402][12883] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-06-18 05:27:17,684][12883] Updated weights for policy 0, policy_version 67321 (0.0039) [2024-06-18 05:27:21,700][12883] Updated weights for policy 0, policy_version 67331 (0.0030) [2024-06-18 05:27:21,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 1103151104. Throughput: 0: 41743.6. Samples: 1103271080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:27:21,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 05:27:25,737][12883] Updated weights for policy 0, policy_version 67341 (0.0031) [2024-06-18 05:27:26,994][12645] Fps is (10 sec: 42600.9, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 1103364096. Throughput: 0: 41959.4. Samples: 1103522500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:27:26,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 05:27:29,356][12883] Updated weights for policy 0, policy_version 67351 (0.0037) [2024-06-18 05:27:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41932.0). Total num frames: 1103560704. Throughput: 0: 41960.6. Samples: 1103644820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:27:31,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 05:27:33,364][12883] Updated weights for policy 0, policy_version 67361 (0.0038) [2024-06-18 05:27:36,996][12645] Fps is (10 sec: 42589.5, 60 sec: 41777.6, 300 sec: 41987.5). Total num frames: 1103790080. Throughput: 0: 41984.6. Samples: 1103900900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 05:27:37,005][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 05:27:37,308][12883] Updated weights for policy 0, policy_version 67371 (0.0033) [2024-06-18 05:27:41,155][12883] Updated weights for policy 0, policy_version 67381 (0.0030) [2024-06-18 05:27:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.6, 300 sec: 42098.9). Total num frames: 1104003072. Throughput: 0: 42016.0. Samples: 1104148380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:27:41,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 05:27:44,985][12883] Updated weights for policy 0, policy_version 67391 (0.0031) [2024-06-18 05:27:46,994][12645] Fps is (10 sec: 40968.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1104199680. Throughput: 0: 41921.6. Samples: 1104277320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:27:46,994][12645] Avg episode reward: [(0, '0.091')] [2024-06-18 05:27:49,043][12883] Updated weights for policy 0, policy_version 67401 (0.0031) [2024-06-18 05:27:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 1104412672. Throughput: 0: 41815.0. Samples: 1104527140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:27:51,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 05:27:52,818][12883] Updated weights for policy 0, policy_version 67411 (0.0025) [2024-06-18 05:27:56,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 1104625664. Throughput: 0: 41953.6. Samples: 1104781060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:27:56,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 05:27:57,001][12883] Updated weights for policy 0, policy_version 67421 (0.0042) [2024-06-18 05:28:00,709][12883] Updated weights for policy 0, policy_version 67431 (0.0045) [2024-06-18 05:28:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41987.5). 
Total num frames: 1104838656. Throughput: 0: 41903.3. Samples: 1104910660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:28:01,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 05:28:04,548][12883] Updated weights for policy 0, policy_version 67441 (0.0038) [2024-06-18 05:28:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 1105035264. Throughput: 0: 41967.0. Samples: 1105159600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:28:06,998][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 05:28:08,513][12883] Updated weights for policy 0, policy_version 67451 (0.0025) [2024-06-18 05:28:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 1105264640. Throughput: 0: 42073.0. Samples: 1105415780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:28:11,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 05:28:12,554][12883] Updated weights for policy 0, policy_version 67461 (0.0038) [2024-06-18 05:28:16,251][12883] Updated weights for policy 0, policy_version 67471 (0.0027) [2024-06-18 05:28:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.8, 300 sec: 42043.0). Total num frames: 1105477632. Throughput: 0: 42137.3. Samples: 1105541000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:28:16,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 05:28:20,109][12883] Updated weights for policy 0, policy_version 67481 (0.0044) [2024-06-18 05:28:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 1105690624. Throughput: 0: 42156.8. Samples: 1105797860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 05:28:21,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 05:28:23,971][12883] Updated weights for policy 0, policy_version 67491 (0.0032) [2024-06-18 05:28:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 41987.5). 
Total num frames: 1105887232. Throughput: 0: 42297.4. Samples: 1106051760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 05:28:26,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 05:28:27,681][12883] Updated weights for policy 0, policy_version 67501 (0.0047) [2024-06-18 05:28:31,718][12883] Updated weights for policy 0, policy_version 67511 (0.0043) [2024-06-18 05:28:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 1106100224. Throughput: 0: 42215.7. Samples: 1106177020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 05:28:31,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 05:28:35,374][12883] Updated weights for policy 0, policy_version 67521 (0.0030) [2024-06-18 05:28:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42053.7, 300 sec: 41987.4). Total num frames: 1106313216. Throughput: 0: 42344.3. Samples: 1106432640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 05:28:36,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 05:28:37,007][12862] Signal inference workers to stop experience collection... (16000 times) [2024-06-18 05:28:37,012][12862] Signal inference workers to resume experience collection... (16000 times) [2024-06-18 05:28:37,053][12883] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-06-18 05:28:37,053][12883] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-06-18 05:28:37,142][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067525_1106329600.pth... [2024-06-18 05:28:37,196][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000066910_1096253440.pth [2024-06-18 05:28:39,407][12883] Updated weights for policy 0, policy_version 67531 (0.0037) [2024-06-18 05:28:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42098.7). Total num frames: 1106542592. Throughput: 0: 42366.6. Samples: 1106687560. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 05:28:41,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 05:28:43,191][12883] Updated weights for policy 0, policy_version 67541 (0.0033) [2024-06-18 05:28:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1106739200. Throughput: 0: 42388.9. Samples: 1106818160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 05:28:46,994][12645] Avg episode reward: [(0, '0.130')] [2024-06-18 05:28:47,227][12883] Updated weights for policy 0, policy_version 67551 (0.0030) [2024-06-18 05:28:50,986][12883] Updated weights for policy 0, policy_version 67561 (0.0029) [2024-06-18 05:28:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 1106935808. Throughput: 0: 42388.0. Samples: 1107067060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 05:28:51,994][12645] Avg episode reward: [(0, '0.062')] [2024-06-18 05:28:54,919][12883] Updated weights for policy 0, policy_version 67571 (0.0046) [2024-06-18 05:28:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 1107181568. Throughput: 0: 42291.1. Samples: 1107318880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 05:28:56,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 05:28:58,721][12883] Updated weights for policy 0, policy_version 67581 (0.0022) [2024-06-18 05:29:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 1107361792. Throughput: 0: 42473.3. Samples: 1107452300. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 05:29:01,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 05:29:02,827][12883] Updated weights for policy 0, policy_version 67591 (0.0043) [2024-06-18 05:29:06,635][12883] Updated weights for policy 0, policy_version 67601 (0.0034) [2024-06-18 05:29:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 1107574784. Throughput: 0: 42320.0. Samples: 1107702260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 05:29:06,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 05:29:10,730][12883] Updated weights for policy 0, policy_version 67611 (0.0036) [2024-06-18 05:29:11,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 1107820544. Throughput: 0: 42213.2. Samples: 1107951360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 05:29:11,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 05:29:14,702][12883] Updated weights for policy 0, policy_version 67621 (0.0038) [2024-06-18 05:29:16,993][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 1107984384. Throughput: 0: 42457.5. Samples: 1108087600. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 05:29:16,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 05:29:18,444][12883] Updated weights for policy 0, policy_version 67631 (0.0034) [2024-06-18 05:29:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 1108197376. Throughput: 0: 42297.5. Samples: 1108336020. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 05:29:21,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 05:29:22,148][12883] Updated weights for policy 0, policy_version 67641 (0.0025) [2024-06-18 05:29:25,881][12883] Updated weights for policy 0, policy_version 67651 (0.0031) [2024-06-18 05:29:26,996][12645] Fps is (10 sec: 47502.1, 60 sec: 42869.8, 300 sec: 42099.1). Total num frames: 1108459520. Throughput: 0: 42311.6. Samples: 1108591680. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 05:29:26,997][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 05:29:29,808][12883] Updated weights for policy 0, policy_version 67661 (0.0041) [2024-06-18 05:29:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42043.3). Total num frames: 1108639744. Throughput: 0: 42340.9. Samples: 1108723500. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 05:29:31,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 05:29:33,469][12883] Updated weights for policy 0, policy_version 67671 (0.0028) [2024-06-18 05:29:36,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1108852736. Throughput: 0: 42341.7. Samples: 1108972440. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 05:29:36,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 05:29:37,449][12883] Updated weights for policy 0, policy_version 67681 (0.0033) [2024-06-18 05:29:41,112][12883] Updated weights for policy 0, policy_version 67691 (0.0038) [2024-06-18 05:29:41,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 1109098496. Throughput: 0: 42493.7. Samples: 1109231100. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 05:29:41,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 05:29:45,260][12883] Updated weights for policy 0, policy_version 67701 (0.0038) [2024-06-18 05:29:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 1109278720. Throughput: 0: 42355.5. Samples: 1109358300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 05:29:46,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 05:29:48,923][12883] Updated weights for policy 0, policy_version 67711 (0.0027) [2024-06-18 05:29:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 1109491712. Throughput: 0: 42339.5. Samples: 1109607540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 05:29:51,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 05:29:52,810][12883] Updated weights for policy 0, policy_version 67721 (0.0032) [2024-06-18 05:29:56,775][12883] Updated weights for policy 0, policy_version 67731 (0.0031) [2024-06-18 05:29:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41988.4). Total num frames: 1109704704. Throughput: 0: 42566.3. Samples: 1109866840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 05:29:57,003][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 05:30:00,684][12883] Updated weights for policy 0, policy_version 67741 (0.0034) [2024-06-18 05:30:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 1109917696. Throughput: 0: 42324.6. Samples: 1109992220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 05:30:01,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 05:30:04,611][12883] Updated weights for policy 0, policy_version 67751 (0.0036) [2024-06-18 05:30:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1110130688. Throughput: 0: 42337.7. Samples: 1110241220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 05:30:06,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 05:30:07,699][12862] Signal inference workers to stop experience collection... (16050 times) [2024-06-18 05:30:07,699][12862] Signal inference workers to resume experience collection... (16050 times) [2024-06-18 05:30:07,722][12883] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-06-18 05:30:07,722][12883] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-06-18 05:30:08,572][12883] Updated weights for policy 0, policy_version 67761 (0.0036) [2024-06-18 05:30:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 1110327296. Throughput: 0: 42510.1. Samples: 1110504540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 05:30:11,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 05:30:12,273][12883] Updated weights for policy 0, policy_version 67771 (0.0038) [2024-06-18 05:30:16,494][12883] Updated weights for policy 0, policy_version 67781 (0.0043) [2024-06-18 05:30:17,000][12645] Fps is (10 sec: 42571.9, 60 sec: 42866.9, 300 sec: 42097.7). Total num frames: 1110556672. Throughput: 0: 42128.9. Samples: 1110619560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 05:30:17,000][12645] Avg episode reward: [(0, '0.147')] [2024-06-18 05:30:20,095][12883] Updated weights for policy 0, policy_version 67791 (0.0024) [2024-06-18 05:30:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42265.1). Total num frames: 1110786048. Throughput: 0: 42287.9. Samples: 1110875400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 05:30:21,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 05:30:24,317][12883] Updated weights for policy 0, policy_version 67801 (0.0047) [2024-06-18 05:30:26,994][12645] Fps is (10 sec: 39346.2, 60 sec: 41507.7, 300 sec: 42043.0). Total num frames: 1110949888. Throughput: 0: 42315.7. 
Samples: 1111135300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 05:30:26,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 05:30:27,767][12883] Updated weights for policy 0, policy_version 67811 (0.0038) [2024-06-18 05:30:31,895][12883] Updated weights for policy 0, policy_version 67821 (0.0024) [2024-06-18 05:30:31,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 1111179264. Throughput: 0: 42081.1. Samples: 1111251940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 05:30:31,994][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 05:30:35,470][12883] Updated weights for policy 0, policy_version 67831 (0.0027) [2024-06-18 05:30:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1111408640. Throughput: 0: 42265.0. Samples: 1111509460. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 05:30:36,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 05:30:37,080][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067836_1111425024.pth... [2024-06-18 05:30:37,136][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067217_1101283328.pth [2024-06-18 05:30:39,381][12883] Updated weights for policy 0, policy_version 67841 (0.0033) [2024-06-18 05:30:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 1111588864. Throughput: 0: 42376.0. Samples: 1111773760. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 05:30:41,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 05:30:43,123][12883] Updated weights for policy 0, policy_version 67851 (0.0032) [2024-06-18 05:30:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1111818240. Throughput: 0: 42089.4. Samples: 1111886240. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 05:30:46,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 05:30:47,102][12883] Updated weights for policy 0, policy_version 67861 (0.0030) [2024-06-18 05:30:50,835][12883] Updated weights for policy 0, policy_version 67871 (0.0037) [2024-06-18 05:30:51,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1112064000. Throughput: 0: 42362.7. Samples: 1112147540. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 05:30:51,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 05:30:55,014][12883] Updated weights for policy 0, policy_version 67881 (0.0029) [2024-06-18 05:30:57,000][12645] Fps is (10 sec: 39297.1, 60 sec: 41774.9, 300 sec: 42097.7). Total num frames: 1112211456. Throughput: 0: 42328.0. Samples: 1112409560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 05:30:57,000][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 05:30:58,696][12883] Updated weights for policy 0, policy_version 67891 (0.0034) [2024-06-18 05:31:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42266.1). Total num frames: 1112473600. Throughput: 0: 42283.1. Samples: 1112522040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 05:31:01,994][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 05:31:02,666][12883] Updated weights for policy 0, policy_version 67901 (0.0026) [2024-06-18 05:31:06,751][12883] Updated weights for policy 0, policy_version 67911 (0.0033) [2024-06-18 05:31:06,994][12645] Fps is (10 sec: 45903.8, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1112670208. Throughput: 0: 42368.1. Samples: 1112781960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 05:31:06,994][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 05:31:10,327][12883] Updated weights for policy 0, policy_version 67921 (0.0043) [2024-06-18 05:31:11,994][12645] Fps is (10 sec: 36044.9, 60 sec: 41779.2, 300 sec: 42043.3). 
Total num frames: 1112834048. Throughput: 0: 42301.3. Samples: 1113038860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 05:31:11,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 05:31:14,501][12883] Updated weights for policy 0, policy_version 67931 (0.0033) [2024-06-18 05:31:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42329.7, 300 sec: 42209.6). Total num frames: 1113096192. Throughput: 0: 42422.6. Samples: 1113160960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 05:31:16,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 05:31:17,908][12883] Updated weights for policy 0, policy_version 67941 (0.0036) [2024-06-18 05:31:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41506.3, 300 sec: 42154.1). Total num frames: 1113276416. Throughput: 0: 42335.2. Samples: 1113414540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 05:31:21,994][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 05:31:22,191][12883] Updated weights for policy 0, policy_version 67951 (0.0026) [2024-06-18 05:31:26,362][12883] Updated weights for policy 0, policy_version 67961 (0.0037) [2024-06-18 05:31:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1113489408. Throughput: 0: 42013.4. Samples: 1113664360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 05:31:26,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 05:31:28,076][12862] Signal inference workers to stop experience collection... (16100 times) [2024-06-18 05:31:28,076][12862] Signal inference workers to resume experience collection... 
(16100 times) [2024-06-18 05:31:28,096][12883] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-06-18 05:31:28,097][12883] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-06-18 05:31:29,951][12883] Updated weights for policy 0, policy_version 67971 (0.0028) [2024-06-18 05:31:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1113718784. Throughput: 0: 42164.8. Samples: 1113783660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 05:31:32,000][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 05:31:34,033][12883] Updated weights for policy 0, policy_version 67981 (0.0035) [2024-06-18 05:31:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42209.7). Total num frames: 1113915392. Throughput: 0: 42181.2. Samples: 1114045700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 05:31:36,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 05:31:37,647][12883] Updated weights for policy 0, policy_version 67991 (0.0030) [2024-06-18 05:31:41,493][12883] Updated weights for policy 0, policy_version 68001 (0.0041) [2024-06-18 05:31:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1114128384. Throughput: 0: 41797.4. Samples: 1114290180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 05:31:41,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 05:31:45,395][12883] Updated weights for policy 0, policy_version 68011 (0.0035) [2024-06-18 05:31:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1114357760. Throughput: 0: 42212.5. Samples: 1114421600. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 05:31:46,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 05:31:49,145][12883] Updated weights for policy 0, policy_version 68021 (0.0038) [2024-06-18 05:31:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 1114554368. Throughput: 0: 42190.2. Samples: 1114680520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:31:51,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 05:31:53,483][12883] Updated weights for policy 0, policy_version 68031 (0.0031) [2024-06-18 05:31:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42602.8, 300 sec: 42154.1). Total num frames: 1114767360. Throughput: 0: 41899.1. Samples: 1114924320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:31:56,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 05:31:57,235][12883] Updated weights for policy 0, policy_version 68041 (0.0033) [2024-06-18 05:32:01,191][12883] Updated weights for policy 0, policy_version 68051 (0.0033) [2024-06-18 05:32:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1114980352. Throughput: 0: 42104.8. Samples: 1115055680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:32:01,994][12645] Avg episode reward: [(0, '0.147')] [2024-06-18 05:32:05,099][12883] Updated weights for policy 0, policy_version 68061 (0.0027) [2024-06-18 05:32:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1115176960. Throughput: 0: 42115.8. Samples: 1115309760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:32:06,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 05:32:08,732][12883] Updated weights for policy 0, policy_version 68071 (0.0034) [2024-06-18 05:32:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42154.2). Total num frames: 1115373568. Throughput: 0: 42152.7. Samples: 1115561240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:32:11,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 05:32:12,922][12883] Updated weights for policy 0, policy_version 68081 (0.0027) [2024-06-18 05:32:16,292][12883] Updated weights for policy 0, policy_version 68091 (0.0034) [2024-06-18 05:32:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1115635712. Throughput: 0: 42268.3. Samples: 1115685740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:32:16,994][12645] Avg episode reward: [(0, '0.103')] [2024-06-18 05:32:20,946][12883] Updated weights for policy 0, policy_version 68101 (0.0046) [2024-06-18 05:32:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1115815936. Throughput: 0: 42263.1. Samples: 1115947540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:32:21,994][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 05:32:23,857][12883] Updated weights for policy 0, policy_version 68111 (0.0024) [2024-06-18 05:32:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 1116028928. Throughput: 0: 42253.7. Samples: 1116191600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:32:26,994][12645] Avg episode reward: [(0, '0.094')] [2024-06-18 05:32:28,722][12883] Updated weights for policy 0, policy_version 68121 (0.0032) [2024-06-18 05:32:31,648][12883] Updated weights for policy 0, policy_version 68131 (0.0036) [2024-06-18 05:32:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42321.0). Total num frames: 1116274688. Throughput: 0: 42273.8. Samples: 1116323920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 05:32:31,994][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 05:32:36,402][12883] Updated weights for policy 0, policy_version 68141 (0.0037) [2024-06-18 05:32:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42209.6). 
Total num frames: 1116454912. Throughput: 0: 42248.1. Samples: 1116581680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 05:32:36,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 05:32:37,116][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068144_1116471296.pth... [2024-06-18 05:32:37,184][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067525_1106329600.pth [2024-06-18 05:32:39,346][12883] Updated weights for policy 0, policy_version 68151 (0.0031) [2024-06-18 05:32:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1116667904. Throughput: 0: 42302.3. Samples: 1116827920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 05:32:41,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 05:32:44,072][12883] Updated weights for policy 0, policy_version 68161 (0.0041) [2024-06-18 05:32:46,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1116897280. Throughput: 0: 42259.1. Samples: 1116957340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 05:32:46,995][12645] Avg episode reward: [(0, '0.234')] [2024-06-18 05:32:47,264][12883] Updated weights for policy 0, policy_version 68171 (0.0032) [2024-06-18 05:32:51,801][12883] Updated weights for policy 0, policy_version 68181 (0.0037) [2024-06-18 05:32:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1117077504. Throughput: 0: 42165.0. Samples: 1117207180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 05:32:51,994][12645] Avg episode reward: [(0, '0.124')] [2024-06-18 05:32:54,096][12862] Signal inference workers to stop experience collection... (16150 times) [2024-06-18 05:32:54,100][12862] Signal inference workers to resume experience collection... 
(16150 times) [2024-06-18 05:32:54,126][12883] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-06-18 05:32:54,127][12883] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-06-18 05:32:55,182][12883] Updated weights for policy 0, policy_version 68191 (0.0035) [2024-06-18 05:32:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1117306880. Throughput: 0: 41988.1. Samples: 1117450700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 05:32:56,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 05:32:59,730][12883] Updated weights for policy 0, policy_version 68201 (0.0033) [2024-06-18 05:33:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1117503488. Throughput: 0: 42289.4. Samples: 1117588760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 05:33:01,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 05:33:02,834][12883] Updated weights for policy 0, policy_version 68211 (0.0029) [2024-06-18 05:33:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 1117716480. Throughput: 0: 42021.9. Samples: 1117838520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 05:33:06,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 05:33:07,339][12883] Updated weights for policy 0, policy_version 68221 (0.0045) [2024-06-18 05:33:10,884][12883] Updated weights for policy 0, policy_version 68231 (0.0028) [2024-06-18 05:33:11,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42320.7). Total num frames: 1117962240. Throughput: 0: 41960.9. Samples: 1118079840. 
Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-18 05:33:11,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 05:33:15,066][12883] Updated weights for policy 0, policy_version 68241 (0.0035) [2024-06-18 05:33:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.4, 300 sec: 42209.6). Total num frames: 1118142464. Throughput: 0: 42048.5. Samples: 1118216100. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-18 05:33:16,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 05:33:18,569][12883] Updated weights for policy 0, policy_version 68251 (0.0039) [2024-06-18 05:33:21,994][12645] Fps is (10 sec: 36044.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1118322688. Throughput: 0: 41846.9. Samples: 1118464800. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-18 05:33:21,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 05:33:22,792][12883] Updated weights for policy 0, policy_version 68261 (0.0034) [2024-06-18 05:33:26,227][12883] Updated weights for policy 0, policy_version 68271 (0.0031) [2024-06-18 05:33:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1118584832. Throughput: 0: 41942.6. Samples: 1118715340. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-18 05:33:26,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 05:33:30,509][12883] Updated weights for policy 0, policy_version 68281 (0.0031) [2024-06-18 05:33:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 41506.2, 300 sec: 42209.7). Total num frames: 1118765056. Throughput: 0: 42315.3. Samples: 1118861520. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-18 05:33:31,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 05:33:33,840][12883] Updated weights for policy 0, policy_version 68291 (0.0043) [2024-06-18 05:33:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41779.0, 300 sec: 42098.5). Total num frames: 1118961664. Throughput: 0: 42071.0. Samples: 1119100380. 
Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-18 05:33:36,995][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 05:33:38,279][12883] Updated weights for policy 0, policy_version 68301 (0.0027) [2024-06-18 05:33:41,590][12883] Updated weights for policy 0, policy_version 68311 (0.0042) [2024-06-18 05:33:41,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 1119207424. Throughput: 0: 42262.5. Samples: 1119352520. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-18 05:33:41,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 05:33:46,135][12883] Updated weights for policy 0, policy_version 68321 (0.0040) [2024-06-18 05:33:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1119404032. Throughput: 0: 42257.0. Samples: 1119490320. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-18 05:33:46,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 05:33:49,609][12883] Updated weights for policy 0, policy_version 68331 (0.0033) [2024-06-18 05:33:51,996][12645] Fps is (10 sec: 40951.5, 60 sec: 42323.8, 300 sec: 42153.8). Total num frames: 1119617024. Throughput: 0: 41930.3. Samples: 1119725480. Policy #0 lag: (min: 1.0, avg: 13.2, max: 25.0) [2024-06-18 05:33:51,996][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 05:33:53,405][12862] Signal inference workers to stop experience collection... (16200 times) [2024-06-18 05:33:53,439][12883] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-06-18 05:33:53,450][12862] Signal inference workers to resume experience collection... (16200 times) [2024-06-18 05:33:53,460][12883] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-06-18 05:33:54,375][12883] Updated weights for policy 0, policy_version 68341 (0.0028) [2024-06-18 05:33:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1119830016. Throughput: 0: 42250.3. 
Samples: 1119981100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 05:33:56,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 05:33:57,218][12883] Updated weights for policy 0, policy_version 68351 (0.0038) [2024-06-18 05:34:01,994][12645] Fps is (10 sec: 39330.8, 60 sec: 41779.4, 300 sec: 42154.1). Total num frames: 1120010240. Throughput: 0: 42087.6. Samples: 1120110040. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 05:34:01,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 05:34:02,098][12883] Updated weights for policy 0, policy_version 68361 (0.0028) [2024-06-18 05:34:05,234][12883] Updated weights for policy 0, policy_version 68371 (0.0025) [2024-06-18 05:34:06,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 1120272384. Throughput: 0: 42251.2. Samples: 1120366100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 05:34:06,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 05:34:09,809][12883] Updated weights for policy 0, policy_version 68381 (0.0034) [2024-06-18 05:34:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1120468992. Throughput: 0: 42211.2. Samples: 1120614840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 05:34:11,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 05:34:12,925][12883] Updated weights for policy 0, policy_version 68391 (0.0046) [2024-06-18 05:34:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1120665600. Throughput: 0: 41798.2. Samples: 1120742440. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 05:34:16,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 05:34:17,571][12883] Updated weights for policy 0, policy_version 68401 (0.0027) [2024-06-18 05:34:21,049][12883] Updated weights for policy 0, policy_version 68411 (0.0033) [2024-06-18 05:34:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42209.9). Total num frames: 1120911360. Throughput: 0: 42101.9. Samples: 1120994960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 05:34:21,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 05:34:25,449][12883] Updated weights for policy 0, policy_version 68421 (0.0033) [2024-06-18 05:34:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1121091584. Throughput: 0: 42108.5. Samples: 1121247400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 05:34:26,994][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 05:34:28,649][12883] Updated weights for policy 0, policy_version 68431 (0.0039) [2024-06-18 05:34:31,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1121288192. Throughput: 0: 41737.4. Samples: 1121368500. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 05:34:31,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 05:34:33,293][12883] Updated weights for policy 0, policy_version 68441 (0.0037) [2024-06-18 05:34:36,283][12883] Updated weights for policy 0, policy_version 68451 (0.0032) [2024-06-18 05:34:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42154.1). Total num frames: 1121533952. Throughput: 0: 42273.2. Samples: 1121627680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 05:34:37,000][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 05:34:37,034][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068454_1121550336.pth... 
[2024-06-18 05:34:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000067836_1111425024.pth [2024-06-18 05:34:40,878][12883] Updated weights for policy 0, policy_version 68461 (0.0039) [2024-06-18 05:34:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1121730560. Throughput: 0: 42200.3. Samples: 1121880120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 05:34:41,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 05:34:44,033][12883] Updated weights for policy 0, policy_version 68471 (0.0026) [2024-06-18 05:34:46,996][12645] Fps is (10 sec: 37674.9, 60 sec: 41777.7, 300 sec: 42098.2). Total num frames: 1121910784. Throughput: 0: 42098.2. Samples: 1122004560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 05:34:46,996][12645] Avg episode reward: [(0, '0.087')] [2024-06-18 05:34:48,317][12883] Updated weights for policy 0, policy_version 68481 (0.0037) [2024-06-18 05:34:51,505][12883] Updated weights for policy 0, policy_version 68491 (0.0035) [2024-06-18 05:34:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 1122156544. Throughput: 0: 42227.7. Samples: 1122266340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 05:34:51,994][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 05:34:55,850][12883] Updated weights for policy 0, policy_version 68501 (0.0042) [2024-06-18 05:34:56,994][12645] Fps is (10 sec: 45885.0, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1122369536. Throughput: 0: 42355.4. Samples: 1122520840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 05:34:56,994][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 05:34:59,150][12883] Updated weights for policy 0, policy_version 68511 (0.0024) [2024-06-18 05:35:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1122566144. Throughput: 0: 42325.9. Samples: 1122647100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 05:35:01,994][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 05:35:03,374][12883] Updated weights for policy 0, policy_version 68521 (0.0038) [2024-06-18 05:35:06,328][12862] Signal inference workers to stop experience collection... (16250 times) [2024-06-18 05:35:06,328][12862] Signal inference workers to resume experience collection... (16250 times) [2024-06-18 05:35:06,371][12883] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-06-18 05:35:06,371][12883] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-06-18 05:35:06,834][12883] Updated weights for policy 0, policy_version 68531 (0.0041) [2024-06-18 05:35:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1122811904. Throughput: 0: 42474.2. Samples: 1122906300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 05:35:06,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 05:35:11,262][12883] Updated weights for policy 0, policy_version 68541 (0.0032) [2024-06-18 05:35:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42155.0). Total num frames: 1122992128. Throughput: 0: 42468.4. Samples: 1123158480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 05:35:11,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 05:35:14,593][12883] Updated weights for policy 0, policy_version 68551 (0.0042) [2024-06-18 05:35:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 1123205120. Throughput: 0: 42375.1. Samples: 1123275380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 05:35:16,994][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 05:35:19,009][12883] Updated weights for policy 0, policy_version 68561 (0.0037) [2024-06-18 05:35:21,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1123450880. Throughput: 0: 42504.6. 
Samples: 1123540380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 05:35:21,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 05:35:22,089][12883] Updated weights for policy 0, policy_version 68571 (0.0041) [2024-06-18 05:35:26,758][12883] Updated weights for policy 0, policy_version 68581 (0.0032) [2024-06-18 05:35:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1123631104. Throughput: 0: 42642.7. Samples: 1123799040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 05:35:26,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 05:35:29,618][12883] Updated weights for policy 0, policy_version 68591 (0.0023) [2024-06-18 05:35:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1123827712. Throughput: 0: 42524.9. Samples: 1123918080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 05:35:31,994][12645] Avg episode reward: [(0, '0.058')] [2024-06-18 05:35:34,470][12883] Updated weights for policy 0, policy_version 68601 (0.0039) [2024-06-18 05:35:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1124073472. Throughput: 0: 42485.4. Samples: 1124178180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 05:35:36,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 05:35:37,779][12883] Updated weights for policy 0, policy_version 68611 (0.0029) [2024-06-18 05:35:41,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1124270080. Throughput: 0: 42421.8. Samples: 1124429820. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 05:35:41,998][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 05:35:42,483][12883] Updated weights for policy 0, policy_version 68621 (0.0037) [2024-06-18 05:35:45,516][12883] Updated weights for policy 0, policy_version 68631 (0.0037) [2024-06-18 05:35:46,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42871.5, 300 sec: 42098.2). Total num frames: 1124483072. Throughput: 0: 42464.9. Samples: 1124558120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 05:35:46,996][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 05:35:49,998][12883] Updated weights for policy 0, policy_version 68641 (0.0033) [2024-06-18 05:35:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42321.6). Total num frames: 1124696064. Throughput: 0: 42478.8. Samples: 1124817840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 05:35:51,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 05:35:53,168][12883] Updated weights for policy 0, policy_version 68651 (0.0036) [2024-06-18 05:35:56,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1124909056. Throughput: 0: 42510.7. Samples: 1125071460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 05:35:56,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 05:35:57,721][12883] Updated weights for policy 0, policy_version 68661 (0.0039) [2024-06-18 05:36:00,962][12883] Updated weights for policy 0, policy_version 68671 (0.0037) [2024-06-18 05:36:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1125105664. Throughput: 0: 42822.6. Samples: 1125202400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 05:36:01,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 05:36:05,426][12883] Updated weights for policy 0, policy_version 68681 (0.0026) [2024-06-18 05:36:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42376.3). 
Total num frames: 1125335040. Throughput: 0: 42749.7. Samples: 1125464120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 05:36:06,994][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 05:36:07,372][12862] Signal inference workers to stop experience collection... (16300 times) [2024-06-18 05:36:07,372][12862] Signal inference workers to resume experience collection... (16300 times) [2024-06-18 05:36:07,390][12883] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-06-18 05:36:07,390][12883] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-06-18 05:36:08,500][12883] Updated weights for policy 0, policy_version 68691 (0.0032) [2024-06-18 05:36:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1125564416. Throughput: 0: 42602.6. Samples: 1125716160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 05:36:11,994][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 05:36:12,882][12883] Updated weights for policy 0, policy_version 68701 (0.0040) [2024-06-18 05:36:16,006][12883] Updated weights for policy 0, policy_version 68711 (0.0037) [2024-06-18 05:36:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 1125761024. Throughput: 0: 42947.3. Samples: 1125850720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 05:36:16,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 05:36:20,458][12883] Updated weights for policy 0, policy_version 68721 (0.0032) [2024-06-18 05:36:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.0, 300 sec: 42265.1). Total num frames: 1125957632. Throughput: 0: 42790.4. Samples: 1126103760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 05:36:21,995][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 05:36:23,603][12883] Updated weights for policy 0, policy_version 68731 (0.0035) [2024-06-18 05:36:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1126203392. Throughput: 0: 42717.8. Samples: 1126352120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 05:36:26,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 05:36:28,235][12883] Updated weights for policy 0, policy_version 68741 (0.0030) [2024-06-18 05:36:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.3, 300 sec: 42320.7). Total num frames: 1126400000. Throughput: 0: 42796.2. Samples: 1126483860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 05:36:31,994][12645] Avg episode reward: [(0, '0.123')] [2024-06-18 05:36:32,158][12883] Updated weights for policy 0, policy_version 68751 (0.0040) [2024-06-18 05:36:35,802][12883] Updated weights for policy 0, policy_version 68761 (0.0033) [2024-06-18 05:36:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1126612992. Throughput: 0: 42565.6. Samples: 1126733300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 05:36:36,994][12645] Avg episode reward: [(0, '0.143')] [2024-06-18 05:36:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068763_1126612992.pth... [2024-06-18 05:36:37,051][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068144_1116471296.pth [2024-06-18 05:36:40,153][12883] Updated weights for policy 0, policy_version 68771 (0.0041) [2024-06-18 05:36:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1126825984. Throughput: 0: 42636.9. Samples: 1126990120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 05:36:41,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 05:36:43,486][12883] Updated weights for policy 0, policy_version 68781 (0.0030) [2024-06-18 05:36:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42600.0, 300 sec: 42320.7). Total num frames: 1127038976. Throughput: 0: 42514.3. Samples: 1127115540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 05:36:46,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 05:36:47,899][12883] Updated weights for policy 0, policy_version 68791 (0.0040) [2024-06-18 05:36:51,119][12883] Updated weights for policy 0, policy_version 68801 (0.0044) [2024-06-18 05:36:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42376.3). Total num frames: 1127268352. Throughput: 0: 42225.8. Samples: 1127364280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 05:36:51,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 05:36:55,731][12883] Updated weights for policy 0, policy_version 68811 (0.0032) [2024-06-18 05:36:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1127448576. Throughput: 0: 42316.5. Samples: 1127620400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 05:36:56,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 05:36:58,865][12883] Updated weights for policy 0, policy_version 68821 (0.0027) [2024-06-18 05:37:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 1127677952. Throughput: 0: 42088.1. Samples: 1127744680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 05:37:01,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 05:37:03,373][12883] Updated weights for policy 0, policy_version 68831 (0.0041) [2024-06-18 05:37:06,471][12883] Updated weights for policy 0, policy_version 68841 (0.0026) [2024-06-18 05:37:06,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42869.9, 300 sec: 42487.0). Total num frames: 1127907328. Throughput: 0: 42200.3. Samples: 1128002860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 05:37:06,997][12645] Avg episode reward: [(0, '0.228')] [2024-06-18 05:37:11,106][12883] Updated weights for policy 0, policy_version 68851 (0.0034) [2024-06-18 05:37:11,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42050.8, 300 sec: 42209.3). Total num frames: 1128087552. Throughput: 0: 42352.1. Samples: 1128258060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 05:37:11,997][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 05:37:14,089][12883] Updated weights for policy 0, policy_version 68861 (0.0028) [2024-06-18 05:37:16,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1128300544. Throughput: 0: 42074.9. Samples: 1128377220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 05:37:16,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 05:37:18,712][12883] Updated weights for policy 0, policy_version 68871 (0.0022) [2024-06-18 05:37:21,801][12883] Updated weights for policy 0, policy_version 68881 (0.0047) [2024-06-18 05:37:21,994][12645] Fps is (10 sec: 45885.9, 60 sec: 43144.7, 300 sec: 42431.8). Total num frames: 1128546304. Throughput: 0: 42371.7. Samples: 1128640020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 05:37:21,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 05:37:26,618][12883] Updated weights for policy 0, policy_version 68891 (0.0038) [2024-06-18 05:37:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1128726528. Throughput: 0: 42279.2. Samples: 1128892680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:37:26,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 05:37:29,777][12883] Updated weights for policy 0, policy_version 68901 (0.0038) [2024-06-18 05:37:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1128939520. Throughput: 0: 42115.1. Samples: 1129010720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:37:31,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 05:37:34,178][12883] Updated weights for policy 0, policy_version 68911 (0.0045) [2024-06-18 05:37:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1129168896. Throughput: 0: 42402.2. Samples: 1129272380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:37:37,000][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 05:37:37,660][12883] Updated weights for policy 0, policy_version 68921 (0.0044) [2024-06-18 05:37:38,527][12862] Signal inference workers to stop experience collection... (16350 times) [2024-06-18 05:37:38,536][12862] Signal inference workers to resume experience collection... (16350 times) [2024-06-18 05:37:38,574][12883] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-06-18 05:37:38,574][12883] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-06-18 05:37:41,967][12883] Updated weights for policy 0, policy_version 68931 (0.0040) [2024-06-18 05:37:42,000][12645] Fps is (10 sec: 42571.3, 60 sec: 42320.9, 300 sec: 42264.3). Total num frames: 1129365504. Throughput: 0: 42255.0. 
Samples: 1129522140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:37:42,001][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 05:37:45,511][12883] Updated weights for policy 0, policy_version 68941 (0.0031) [2024-06-18 05:37:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1129578496. Throughput: 0: 42159.6. Samples: 1129641860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:37:46,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 05:37:49,828][12883] Updated weights for policy 0, policy_version 68951 (0.0038) [2024-06-18 05:37:51,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1129791488. Throughput: 0: 42140.2. Samples: 1129899080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:37:51,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 05:37:53,439][12883] Updated weights for policy 0, policy_version 68961 (0.0027) [2024-06-18 05:37:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1129988096. Throughput: 0: 42268.8. Samples: 1130160060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:37:56,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 05:37:57,347][12883] Updated weights for policy 0, policy_version 68971 (0.0034) [2024-06-18 05:38:01,153][12883] Updated weights for policy 0, policy_version 68981 (0.0037) [2024-06-18 05:38:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1130217472. Throughput: 0: 42443.0. Samples: 1130287160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 05:38:01,995][12645] Avg episode reward: [(0, '0.319')] [2024-06-18 05:38:04,953][12883] Updated weights for policy 0, policy_version 68991 (0.0040) [2024-06-18 05:38:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42053.9, 300 sec: 42265.2). Total num frames: 1130430464. Throughput: 0: 42115.5. 
Samples: 1130535220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 05:38:06,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 05:38:08,878][12883] Updated weights for policy 0, policy_version 69001 (0.0042) [2024-06-18 05:38:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1130627072. Throughput: 0: 42362.6. Samples: 1130799000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 05:38:11,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 05:38:12,498][12883] Updated weights for policy 0, policy_version 69011 (0.0021) [2024-06-18 05:38:16,832][12883] Updated weights for policy 0, policy_version 69021 (0.0037) [2024-06-18 05:38:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1130856448. Throughput: 0: 42433.3. Samples: 1130920220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 05:38:16,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 05:38:20,192][12883] Updated weights for policy 0, policy_version 69031 (0.0028) [2024-06-18 05:38:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 1131069440. Throughput: 0: 42398.6. Samples: 1131180320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 05:38:21,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 05:38:24,537][12883] Updated weights for policy 0, policy_version 69041 (0.0030) [2024-06-18 05:38:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1131266048. Throughput: 0: 42477.0. Samples: 1131433340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 05:38:26,994][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 05:38:27,907][12883] Updated weights for policy 0, policy_version 69051 (0.0031) [2024-06-18 05:38:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1131479040. Throughput: 0: 42657.7. 
Samples: 1131561460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 05:38:31,995][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 05:38:32,140][12883] Updated weights for policy 0, policy_version 69061 (0.0038) [2024-06-18 05:38:36,185][12883] Updated weights for policy 0, policy_version 69071 (0.0034) [2024-06-18 05:38:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1131692032. Throughput: 0: 42570.3. Samples: 1131814740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 05:38:36,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 05:38:37,091][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069074_1131708416.pth... [2024-06-18 05:38:37,167][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068454_1121550336.pth [2024-06-18 05:38:39,825][12883] Updated weights for policy 0, policy_version 69081 (0.0028) [2024-06-18 05:38:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.7, 300 sec: 42376.2). Total num frames: 1131905024. Throughput: 0: 42391.8. Samples: 1132067700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 05:38:41,994][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 05:38:43,918][12883] Updated weights for policy 0, policy_version 69091 (0.0037) [2024-06-18 05:38:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42321.0). Total num frames: 1132101632. Throughput: 0: 42265.8. Samples: 1132189120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 05:38:46,994][12645] Avg episode reward: [(0, '0.082')] [2024-06-18 05:38:47,483][12883] Updated weights for policy 0, policy_version 69101 (0.0033) [2024-06-18 05:38:51,412][12883] Updated weights for policy 0, policy_version 69111 (0.0032) [2024-06-18 05:38:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1132314624. Throughput: 0: 42397.2. Samples: 1132443100. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 05:38:51,994][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 05:38:55,647][12883] Updated weights for policy 0, policy_version 69121 (0.0029) [2024-06-18 05:38:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1132544000. Throughput: 0: 42240.1. Samples: 1132699800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 05:38:56,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 05:38:59,490][12883] Updated weights for policy 0, policy_version 69131 (0.0040) [2024-06-18 05:39:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1132756992. Throughput: 0: 42319.5. Samples: 1132824600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 05:39:01,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 05:39:03,344][12883] Updated weights for policy 0, policy_version 69141 (0.0035) [2024-06-18 05:39:05,604][12862] Signal inference workers to stop experience collection... (16400 times) [2024-06-18 05:39:05,604][12862] Signal inference workers to resume experience collection... (16400 times) [2024-06-18 05:39:05,648][12883] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-06-18 05:39:05,648][12883] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-06-18 05:39:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1132953600. Throughput: 0: 42179.2. Samples: 1133078380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 05:39:06,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 05:39:07,105][12883] Updated weights for policy 0, policy_version 69151 (0.0033) [2024-06-18 05:39:11,115][12883] Updated weights for policy 0, policy_version 69161 (0.0039) [2024-06-18 05:39:11,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 1133150208. Throughput: 0: 42138.8. 
Samples: 1133329580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 05:39:11,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 05:39:14,903][12883] Updated weights for policy 0, policy_version 69171 (0.0023) [2024-06-18 05:39:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1133395968. Throughput: 0: 42045.8. Samples: 1133453520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 05:39:16,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 05:39:18,833][12883] Updated weights for policy 0, policy_version 69181 (0.0025) [2024-06-18 05:39:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1133608960. Throughput: 0: 42177.3. Samples: 1133712720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 05:39:21,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 05:39:22,597][12883] Updated weights for policy 0, policy_version 69191 (0.0043) [2024-06-18 05:39:26,413][12883] Updated weights for policy 0, policy_version 69201 (0.0027) [2024-06-18 05:39:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1133805568. Throughput: 0: 42253.5. Samples: 1133969100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 05:39:26,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 05:39:30,165][12883] Updated weights for policy 0, policy_version 69211 (0.0042) [2024-06-18 05:39:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1134034944. Throughput: 0: 42374.6. Samples: 1134095980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:39:31,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 05:39:34,139][12883] Updated weights for policy 0, policy_version 69221 (0.0043) [2024-06-18 05:39:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1134231552. Throughput: 0: 42307.5. Samples: 1134346940. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:39:36,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 05:39:37,859][12883] Updated weights for policy 0, policy_version 69231 (0.0041) [2024-06-18 05:39:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42432.1). Total num frames: 1134428160. Throughput: 0: 42345.3. Samples: 1134605340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:39:41,994][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 05:39:42,066][12883] Updated weights for policy 0, policy_version 69241 (0.0031) [2024-06-18 05:39:45,598][12883] Updated weights for policy 0, policy_version 69251 (0.0032) [2024-06-18 05:39:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1134673920. Throughput: 0: 42287.6. Samples: 1134727540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:39:46,998][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 05:39:49,973][12883] Updated weights for policy 0, policy_version 69261 (0.0030) [2024-06-18 05:39:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1134870528. Throughput: 0: 42398.8. Samples: 1134986320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:39:51,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 05:39:53,492][12883] Updated weights for policy 0, policy_version 69271 (0.0041) [2024-06-18 05:39:56,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1135067136. Throughput: 0: 42349.8. Samples: 1135235320. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:39:56,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 05:39:57,629][12883] Updated weights for policy 0, policy_version 69281 (0.0031) [2024-06-18 05:40:01,301][12883] Updated weights for policy 0, policy_version 69291 (0.0028) [2024-06-18 05:40:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1135296512. Throughput: 0: 42500.0. Samples: 1135366020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:40:01,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 05:40:05,459][12883] Updated weights for policy 0, policy_version 69301 (0.0042) [2024-06-18 05:40:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1135509504. Throughput: 0: 42461.8. Samples: 1135623500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:40:06,994][12645] Avg episode reward: [(0, '0.154')] [2024-06-18 05:40:09,208][12883] Updated weights for policy 0, policy_version 69311 (0.0034) [2024-06-18 05:40:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42431.8). Total num frames: 1135722496. Throughput: 0: 42299.9. Samples: 1135872600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 05:40:11,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 05:40:13,369][12883] Updated weights for policy 0, policy_version 69321 (0.0041) [2024-06-18 05:40:16,896][12883] Updated weights for policy 0, policy_version 69331 (0.0038) [2024-06-18 05:40:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42265.1). Total num frames: 1135919104. Throughput: 0: 42294.1. Samples: 1135999220. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 05:40:16,995][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 05:40:20,997][12883] Updated weights for policy 0, policy_version 69341 (0.0043) [2024-06-18 05:40:21,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1136132096. Throughput: 0: 42509.6. Samples: 1136259860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 05:40:21,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 05:40:24,468][12883] Updated weights for policy 0, policy_version 69351 (0.0038) [2024-06-18 05:40:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1136361472. Throughput: 0: 42344.3. Samples: 1136510840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 05:40:26,995][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 05:40:28,550][12883] Updated weights for policy 0, policy_version 69361 (0.0045) [2024-06-18 05:40:29,816][12862] Signal inference workers to stop experience collection... (16450 times) [2024-06-18 05:40:29,817][12862] Signal inference workers to resume experience collection... (16450 times) [2024-06-18 05:40:29,834][12883] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-06-18 05:40:29,834][12883] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-06-18 05:40:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 1136558080. Throughput: 0: 42493.9. Samples: 1136639760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 05:40:31,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 05:40:32,070][12883] Updated weights for policy 0, policy_version 69371 (0.0029) [2024-06-18 05:40:36,218][12883] Updated weights for policy 0, policy_version 69381 (0.0033) [2024-06-18 05:40:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1136771072. Throughput: 0: 42217.1. 
Samples: 1136886100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 05:40:36,994][12645] Avg episode reward: [(0, '0.039')] [2024-06-18 05:40:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069383_1136771072.pth... [2024-06-18 05:40:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000068763_1126612992.pth [2024-06-18 05:40:39,663][12883] Updated weights for policy 0, policy_version 69391 (0.0031) [2024-06-18 05:40:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42376.6). Total num frames: 1136984064. Throughput: 0: 42277.6. Samples: 1137137820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 05:40:41,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 05:40:43,830][12883] Updated weights for policy 0, policy_version 69401 (0.0027) [2024-06-18 05:40:46,997][12645] Fps is (10 sec: 40945.2, 60 sec: 41776.6, 300 sec: 42320.1). Total num frames: 1137180672. Throughput: 0: 42151.2. Samples: 1137262980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 05:40:46,998][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 05:40:47,686][12883] Updated weights for policy 0, policy_version 69411 (0.0028) [2024-06-18 05:40:51,472][12883] Updated weights for policy 0, policy_version 69421 (0.0041) [2024-06-18 05:40:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1137410048. Throughput: 0: 42092.8. Samples: 1137517680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 05:40:51,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 05:40:55,557][12883] Updated weights for policy 0, policy_version 69431 (0.0047) [2024-06-18 05:40:56,996][12645] Fps is (10 sec: 44243.5, 60 sec: 42596.7, 300 sec: 42431.5). Total num frames: 1137623040. Throughput: 0: 42138.0. Samples: 1137768900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 05:40:56,997][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 05:40:59,262][12883] Updated weights for policy 0, policy_version 69441 (0.0032) [2024-06-18 05:41:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1137819648. Throughput: 0: 42085.9. Samples: 1137893080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 05:41:01,994][12645] Avg episode reward: [(0, '0.098')] [2024-06-18 05:41:03,534][12883] Updated weights for policy 0, policy_version 69451 (0.0031) [2024-06-18 05:41:06,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1138032640. Throughput: 0: 41931.1. Samples: 1138146760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 05:41:06,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 05:41:07,088][12883] Updated weights for policy 0, policy_version 69461 (0.0029) [2024-06-18 05:41:11,336][12883] Updated weights for policy 0, policy_version 69471 (0.0040) [2024-06-18 05:41:11,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42323.8, 300 sec: 42375.9). Total num frames: 1138262016. Throughput: 0: 42038.9. Samples: 1138402680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 05:41:11,996][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 05:41:14,703][12883] Updated weights for policy 0, policy_version 69481 (0.0047) [2024-06-18 05:41:16,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1138442240. Throughput: 0: 41979.4. Samples: 1138528840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 05:41:16,995][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 05:41:18,891][12883] Updated weights for policy 0, policy_version 69491 (0.0027) [2024-06-18 05:41:22,000][12645] Fps is (10 sec: 42581.1, 60 sec: 42593.9, 300 sec: 42319.8). Total num frames: 1138688000. Throughput: 0: 42088.1. Samples: 1138780320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 05:41:22,000][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 05:41:22,335][12883] Updated weights for policy 0, policy_version 69501 (0.0028) [2024-06-18 05:41:26,601][12883] Updated weights for policy 0, policy_version 69511 (0.0031) [2024-06-18 05:41:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.4, 300 sec: 42265.2). Total num frames: 1138868224. Throughput: 0: 42308.6. Samples: 1139041700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 05:41:26,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 05:41:29,847][12883] Updated weights for policy 0, policy_version 69521 (0.0033) [2024-06-18 05:41:31,994][12645] Fps is (10 sec: 39346.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1139081216. Throughput: 0: 42252.4. Samples: 1139164180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 05:41:31,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 05:41:34,172][12883] Updated weights for policy 0, policy_version 69531 (0.0035) [2024-06-18 05:41:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1139294208. Throughput: 0: 42246.2. Samples: 1139418760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 05:41:36,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 05:41:37,517][12883] Updated weights for policy 0, policy_version 69541 (0.0039) [2024-06-18 05:41:39,395][12862] Signal inference workers to stop experience collection... (16500 times) [2024-06-18 05:41:39,395][12862] Signal inference workers to resume experience collection... (16500 times) [2024-06-18 05:41:39,437][12883] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-06-18 05:41:39,438][12883] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-06-18 05:41:41,995][12645] Fps is (10 sec: 42593.9, 60 sec: 42051.6, 300 sec: 42265.0). Total num frames: 1139507200. Throughput: 0: 42288.7. 
Samples: 1139671840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-18 05:41:41,995][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 05:41:42,035][12883] Updated weights for policy 0, policy_version 69551 (0.0030) [2024-06-18 05:41:46,156][12883] Updated weights for policy 0, policy_version 69561 (0.0039) [2024-06-18 05:41:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42054.9, 300 sec: 42154.1). Total num frames: 1139703808. Throughput: 0: 42367.6. Samples: 1139799620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-18 05:41:46,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 05:41:49,711][12883] Updated weights for policy 0, policy_version 69571 (0.0023) [2024-06-18 05:41:51,994][12645] Fps is (10 sec: 42602.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1139933184. Throughput: 0: 42279.9. Samples: 1140049360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-18 05:41:51,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 05:41:53,976][12883] Updated weights for policy 0, policy_version 69581 (0.0050) [2024-06-18 05:41:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42053.8, 300 sec: 42265.2). Total num frames: 1140146176. Throughput: 0: 42224.7. Samples: 1140302700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-18 05:41:56,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 05:41:57,623][12883] Updated weights for policy 0, policy_version 69591 (0.0033) [2024-06-18 05:42:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42098.9). Total num frames: 1140326400. Throughput: 0: 42132.5. Samples: 1140424800. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-18 05:42:01,999][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 05:42:02,157][12883] Updated weights for policy 0, policy_version 69601 (0.0039) [2024-06-18 05:42:05,237][12883] Updated weights for policy 0, policy_version 69611 (0.0040) [2024-06-18 05:42:06,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42050.6, 300 sec: 42265.2). Total num frames: 1140555776. Throughput: 0: 42184.6. Samples: 1140678460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-18 05:42:06,996][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 05:42:09,719][12883] Updated weights for policy 0, policy_version 69621 (0.0033) [2024-06-18 05:42:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41780.7, 300 sec: 42265.2). Total num frames: 1140768768. Throughput: 0: 42076.8. Samples: 1140935160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-18 05:42:11,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 05:42:13,129][12883] Updated weights for policy 0, policy_version 69631 (0.0037) [2024-06-18 05:42:16,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1140981760. Throughput: 0: 42167.5. Samples: 1141061720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 23.0) [2024-06-18 05:42:16,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 05:42:17,390][12883] Updated weights for policy 0, policy_version 69641 (0.0042) [2024-06-18 05:42:20,949][12883] Updated weights for policy 0, policy_version 69651 (0.0028) [2024-06-18 05:42:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41783.5, 300 sec: 42265.2). Total num frames: 1141194752. Throughput: 0: 41947.6. Samples: 1141306400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 05:42:21,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 05:42:25,215][12883] Updated weights for policy 0, policy_version 69661 (0.0026) [2024-06-18 05:42:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42265.1). 
Total num frames: 1141407744. Throughput: 0: 42041.8. Samples: 1141563680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 05:42:26,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 05:42:28,997][12883] Updated weights for policy 0, policy_version 69671 (0.0026) [2024-06-18 05:42:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1141620736. Throughput: 0: 42074.2. Samples: 1141692960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 05:42:31,997][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 05:42:32,903][12883] Updated weights for policy 0, policy_version 69681 (0.0032) [2024-06-18 05:42:36,651][12883] Updated weights for policy 0, policy_version 69691 (0.0033) [2024-06-18 05:42:36,997][12645] Fps is (10 sec: 40945.8, 60 sec: 42049.8, 300 sec: 42210.0). Total num frames: 1141817344. Throughput: 0: 42068.7. Samples: 1141942600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 05:42:36,998][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 05:42:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069691_1141817344.pth... [2024-06-18 05:42:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069074_1131708416.pth [2024-06-18 05:42:40,661][12883] Updated weights for policy 0, policy_version 69701 (0.0033) [2024-06-18 05:42:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42326.1, 300 sec: 42265.2). Total num frames: 1142046720. Throughput: 0: 42186.7. Samples: 1142201100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 05:42:41,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 05:42:44,200][12883] Updated weights for policy 0, policy_version 69711 (0.0031) [2024-06-18 05:42:46,994][12645] Fps is (10 sec: 44250.0, 60 sec: 42598.0, 300 sec: 42265.1). Total num frames: 1142259712. Throughput: 0: 42313.3. Samples: 1142328920. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 05:42:46,995][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 05:42:48,510][12883] Updated weights for policy 0, policy_version 69721 (0.0049) [2024-06-18 05:42:49,409][12862] Signal inference workers to stop experience collection... (16550 times) [2024-06-18 05:42:49,462][12862] Signal inference workers to resume experience collection... (16550 times) [2024-06-18 05:42:49,463][12883] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-06-18 05:42:49,488][12883] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-06-18 05:42:51,890][12883] Updated weights for policy 0, policy_version 69731 (0.0033) [2024-06-18 05:42:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1142472704. Throughput: 0: 42152.7. Samples: 1142575240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 05:42:51,995][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 05:42:56,474][12883] Updated weights for policy 0, policy_version 69741 (0.0046) [2024-06-18 05:42:56,994][12645] Fps is (10 sec: 40962.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1142669312. Throughput: 0: 42314.2. Samples: 1142839300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 05:42:56,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 05:42:59,570][12883] Updated weights for policy 0, policy_version 69751 (0.0032) [2024-06-18 05:43:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42265.1). Total num frames: 1142898688. Throughput: 0: 42187.4. Samples: 1142960160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 05:43:01,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 05:43:04,129][12883] Updated weights for policy 0, policy_version 69761 (0.0029) [2024-06-18 05:43:06,994][12645] Fps is (10 sec: 42595.8, 60 sec: 42326.4, 300 sec: 42265.1). Total num frames: 1143095296. Throughput: 0: 42366.0. 
Samples: 1143212900. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:43:06,995][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 05:43:07,329][12883] Updated weights for policy 0, policy_version 69771 (0.0034) [2024-06-18 05:43:11,966][12883] Updated weights for policy 0, policy_version 69781 (0.0026) [2024-06-18 05:43:12,000][12645] Fps is (10 sec: 39297.6, 60 sec: 42047.9, 300 sec: 42153.2). Total num frames: 1143291904. Throughput: 0: 42370.6. Samples: 1143470620. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:43:12,001][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 05:43:15,198][12883] Updated weights for policy 0, policy_version 69791 (0.0036) [2024-06-18 05:43:16,994][12645] Fps is (10 sec: 44240.2, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 1143537664. Throughput: 0: 42225.0. Samples: 1143593080. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:43:16,994][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 05:43:19,582][12883] Updated weights for policy 0, policy_version 69801 (0.0028) [2024-06-18 05:43:21,994][12645] Fps is (10 sec: 44265.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1143734272. Throughput: 0: 42372.8. Samples: 1143849220. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:43:21,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 05:43:22,801][12883] Updated weights for policy 0, policy_version 69811 (0.0029) [2024-06-18 05:43:26,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1143930880. Throughput: 0: 42329.8. Samples: 1144105940. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:43:26,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 05:43:27,703][12883] Updated weights for policy 0, policy_version 69821 (0.0021) [2024-06-18 05:43:30,482][12883] Updated weights for policy 0, policy_version 69831 (0.0029) [2024-06-18 05:43:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1144176640. Throughput: 0: 42228.9. Samples: 1144229200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:43:31,994][12645] Avg episode reward: [(0, '0.094')] [2024-06-18 05:43:35,368][12883] Updated weights for policy 0, policy_version 69841 (0.0032) [2024-06-18 05:43:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42327.8, 300 sec: 42209.6). Total num frames: 1144356864. Throughput: 0: 42524.9. Samples: 1144488860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:43:36,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 05:43:38,106][12883] Updated weights for policy 0, policy_version 69851 (0.0030) [2024-06-18 05:43:41,994][12645] Fps is (10 sec: 37683.7, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1144553472. Throughput: 0: 42155.7. Samples: 1144736300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:43:41,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 05:43:43,028][12883] Updated weights for policy 0, policy_version 69861 (0.0021) [2024-06-18 05:43:45,947][12883] Updated weights for policy 0, policy_version 69871 (0.0054) [2024-06-18 05:43:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.7, 300 sec: 42320.7). Total num frames: 1144799232. Throughput: 0: 42307.2. Samples: 1144863980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:43:46,994][12645] Avg episode reward: [(0, '0.138')] [2024-06-18 05:43:50,643][12883] Updated weights for policy 0, policy_version 69881 (0.0030) [2024-06-18 05:43:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 41779.2, 300 sec: 42154.1). 
Total num frames: 1144979456. Throughput: 0: 42390.8. Samples: 1145120460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 05:43:51,998][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 05:43:53,653][12883] Updated weights for policy 0, policy_version 69891 (0.0037) [2024-06-18 05:43:57,000][12645] Fps is (10 sec: 40934.4, 60 sec: 42321.0, 300 sec: 42208.7). Total num frames: 1145208832. Throughput: 0: 42041.3. Samples: 1145362480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 05:43:57,000][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 05:43:58,476][12883] Updated weights for policy 0, policy_version 69901 (0.0038) [2024-06-18 05:43:59,052][12862] Signal inference workers to stop experience collection... (16600 times) [2024-06-18 05:43:59,052][12862] Signal inference workers to resume experience collection... (16600 times) [2024-06-18 05:43:59,081][12883] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-06-18 05:43:59,081][12883] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-06-18 05:44:01,761][12883] Updated weights for policy 0, policy_version 69911 (0.0039) [2024-06-18 05:44:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 1145421824. Throughput: 0: 42284.4. Samples: 1145495880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 05:44:01,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 05:44:06,070][12883] Updated weights for policy 0, policy_version 69921 (0.0026) [2024-06-18 05:44:06,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42325.8, 300 sec: 42320.7). Total num frames: 1145634816. Throughput: 0: 42398.1. Samples: 1145757140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 05:44:06,994][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 05:44:07,017][12862] Saving new best policy, reward=0.651! 
[2024-06-18 05:44:09,690][12883] Updated weights for policy 0, policy_version 69931 (0.0028) [2024-06-18 05:44:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42875.9, 300 sec: 42265.2). Total num frames: 1145864192. Throughput: 0: 42055.9. Samples: 1145998460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 05:44:11,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 05:44:13,956][12883] Updated weights for policy 0, policy_version 69941 (0.0035) [2024-06-18 05:44:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1146044416. Throughput: 0: 42217.7. Samples: 1146129000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 05:44:16,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 05:44:17,632][12883] Updated weights for policy 0, policy_version 69951 (0.0042) [2024-06-18 05:44:21,653][12883] Updated weights for policy 0, policy_version 69961 (0.0031) [2024-06-18 05:44:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 1146257408. Throughput: 0: 42083.1. Samples: 1146382600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 05:44:21,997][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 05:44:25,302][12883] Updated weights for policy 0, policy_version 69971 (0.0029) [2024-06-18 05:44:26,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42596.8, 300 sec: 42209.3). Total num frames: 1146486784. Throughput: 0: 42073.8. Samples: 1146629720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 05:44:26,996][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 05:44:29,247][12883] Updated weights for policy 0, policy_version 69981 (0.0046) [2024-06-18 05:44:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1146683392. Throughput: 0: 42194.6. Samples: 1146762740. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 05:44:31,999][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 05:44:33,024][12883] Updated weights for policy 0, policy_version 69991 (0.0034) [2024-06-18 05:44:36,987][12883] Updated weights for policy 0, policy_version 70001 (0.0033) [2024-06-18 05:44:36,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1146896384. Throughput: 0: 42012.6. Samples: 1147011020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 05:44:36,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 05:44:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070001_1146896384.pth... [2024-06-18 05:44:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069383_1136771072.pth [2024-06-18 05:44:40,718][12883] Updated weights for policy 0, policy_version 70011 (0.0033) [2024-06-18 05:44:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1147092992. Throughput: 0: 42295.7. Samples: 1147265520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 05:44:41,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 05:44:44,559][12883] Updated weights for policy 0, policy_version 70021 (0.0030) [2024-06-18 05:44:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1147322368. Throughput: 0: 42197.8. Samples: 1147394780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 05:44:46,994][12645] Avg episode reward: [(0, '0.239')] [2024-06-18 05:44:48,363][12883] Updated weights for policy 0, policy_version 70031 (0.0039) [2024-06-18 05:44:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 1147535360. Throughput: 0: 42020.0. Samples: 1147648040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 05:44:51,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 05:44:52,259][12883] Updated weights for policy 0, policy_version 70041 (0.0033) [2024-06-18 05:44:56,091][12883] Updated weights for policy 0, policy_version 70051 (0.0035) [2024-06-18 05:44:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.8, 300 sec: 42209.6). Total num frames: 1147748352. Throughput: 0: 42211.7. Samples: 1147897980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 05:44:56,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 05:44:59,940][12883] Updated weights for policy 0, policy_version 70061 (0.0023) [2024-06-18 05:45:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1147944960. Throughput: 0: 42076.5. Samples: 1148022440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 05:45:01,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 05:45:03,951][12883] Updated weights for policy 0, policy_version 70071 (0.0033) [2024-06-18 05:45:06,995][12645] Fps is (10 sec: 40955.4, 60 sec: 42051.5, 300 sec: 42153.9). Total num frames: 1148157952. Throughput: 0: 42072.4. Samples: 1148275900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 05:45:06,995][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 05:45:07,671][12883] Updated weights for policy 0, policy_version 70081 (0.0032) [2024-06-18 05:45:11,847][12883] Updated weights for policy 0, policy_version 70091 (0.0037) [2024-06-18 05:45:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1148370944. Throughput: 0: 42174.1. Samples: 1148527460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 05:45:11,994][12645] Avg episode reward: [(0, '0.165')] [2024-06-18 05:45:15,728][12883] Updated weights for policy 0, policy_version 70101 (0.0027) [2024-06-18 05:45:16,994][12645] Fps is (10 sec: 40964.1, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1148567552. Throughput: 0: 42092.4. Samples: 1148656900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-18 05:45:17,003][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 05:45:19,655][12883] Updated weights for policy 0, policy_version 70111 (0.0038) [2024-06-18 05:45:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1148780544. Throughput: 0: 42148.7. Samples: 1148907720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-18 05:45:21,994][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 05:45:23,526][12883] Updated weights for policy 0, policy_version 70121 (0.0032) [2024-06-18 05:45:26,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42052.3, 300 sec: 42209.3). Total num frames: 1149009920. Throughput: 0: 41938.4. Samples: 1149152840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-18 05:45:26,996][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 05:45:27,197][12883] Updated weights for policy 0, policy_version 70131 (0.0046) [2024-06-18 05:45:31,186][12883] Updated weights for policy 0, policy_version 70141 (0.0034) [2024-06-18 05:45:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1149222912. Throughput: 0: 42147.8. Samples: 1149291440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-18 05:45:31,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 05:45:33,008][12862] Signal inference workers to stop experience collection... (16650 times) [2024-06-18 05:45:33,009][12862] Signal inference workers to resume experience collection... 
(16650 times) [2024-06-18 05:45:33,050][12883] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-06-18 05:45:33,056][12883] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-06-18 05:45:35,314][12883] Updated weights for policy 0, policy_version 70151 (0.0030) [2024-06-18 05:45:36,994][12645] Fps is (10 sec: 39330.2, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 1149403136. Throughput: 0: 42030.2. Samples: 1149539400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-18 05:45:36,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 05:45:38,960][12883] Updated weights for policy 0, policy_version 70161 (0.0032) [2024-06-18 05:45:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42265.7). Total num frames: 1149648896. Throughput: 0: 41983.6. Samples: 1149787240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-18 05:45:41,994][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 05:45:42,981][12883] Updated weights for policy 0, policy_version 70171 (0.0041) [2024-06-18 05:45:46,651][12883] Updated weights for policy 0, policy_version 70181 (0.0032) [2024-06-18 05:45:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1149845504. Throughput: 0: 42214.3. Samples: 1149922080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-18 05:45:46,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 05:45:50,630][12883] Updated weights for policy 0, policy_version 70191 (0.0026) [2024-06-18 05:45:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42098.9). Total num frames: 1150042112. Throughput: 0: 42201.9. Samples: 1150174940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-18 05:45:51,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 05:45:54,452][12883] Updated weights for policy 0, policy_version 70201 (0.0032) [2024-06-18 05:45:56,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 1150287872. Throughput: 0: 42275.0. Samples: 1150429840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 05:45:56,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 05:45:58,226][12883] Updated weights for policy 0, policy_version 70211 (0.0031) [2024-06-18 05:46:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1150468096. Throughput: 0: 42285.4. Samples: 1150559740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 05:46:01,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 05:46:02,463][12883] Updated weights for policy 0, policy_version 70221 (0.0033) [2024-06-18 05:46:06,136][12883] Updated weights for policy 0, policy_version 70231 (0.0053) [2024-06-18 05:46:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42053.0, 300 sec: 42098.9). Total num frames: 1150681088. Throughput: 0: 42133.9. Samples: 1150803740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 05:46:06,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 05:46:10,286][12883] Updated weights for policy 0, policy_version 70241 (0.0026) [2024-06-18 05:46:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 1150926848. Throughput: 0: 42404.3. Samples: 1151060940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 05:46:11,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 05:46:13,811][12883] Updated weights for policy 0, policy_version 70251 (0.0037) [2024-06-18 05:46:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42099.4). Total num frames: 1151107072. Throughput: 0: 42209.5. Samples: 1151190860. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 05:46:16,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 05:46:17,698][12883] Updated weights for policy 0, policy_version 70261 (0.0037) [2024-06-18 05:46:21,437][12883] Updated weights for policy 0, policy_version 70271 (0.0024) [2024-06-18 05:46:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 1151320064. Throughput: 0: 42243.2. Samples: 1151440340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 05:46:21,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 05:46:25,495][12883] Updated weights for policy 0, policy_version 70281 (0.0040) [2024-06-18 05:46:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 1151549440. Throughput: 0: 42449.7. Samples: 1151697480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 05:46:26,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 05:46:29,463][12883] Updated weights for policy 0, policy_version 70291 (0.0025) [2024-06-18 05:46:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 1151746048. Throughput: 0: 42220.9. Samples: 1151822020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 05:46:31,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 05:46:33,113][12883] Updated weights for policy 0, policy_version 70301 (0.0037) [2024-06-18 05:46:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42209.8). Total num frames: 1151959040. Throughput: 0: 42336.1. Samples: 1152080060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 05:46:36,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 05:46:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070310_1151959040.pth... 
[2024-06-18 05:46:37,091][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000069691_1141817344.pth [2024-06-18 05:46:37,291][12883] Updated weights for policy 0, policy_version 70311 (0.0034) [2024-06-18 05:46:40,762][12883] Updated weights for policy 0, policy_version 70321 (0.0030) [2024-06-18 05:46:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1152172032. Throughput: 0: 42321.6. Samples: 1152334300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:46:41,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 05:46:44,783][12883] Updated weights for policy 0, policy_version 70331 (0.0038) [2024-06-18 05:46:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1152385024. Throughput: 0: 42374.2. Samples: 1152466580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:46:46,994][12645] Avg episode reward: [(0, '0.092')] [2024-06-18 05:46:48,375][12883] Updated weights for policy 0, policy_version 70341 (0.0032) [2024-06-18 05:46:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1152581632. Throughput: 0: 42600.5. Samples: 1152720760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:46:51,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 05:46:52,348][12883] Updated weights for policy 0, policy_version 70351 (0.0040) [2024-06-18 05:46:56,047][12883] Updated weights for policy 0, policy_version 70361 (0.0027) [2024-06-18 05:46:56,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42323.8, 300 sec: 42375.9). Total num frames: 1152827392. Throughput: 0: 42549.9. Samples: 1152975780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:46:56,997][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 05:46:59,608][12862] Signal inference workers to stop experience collection... 
(16700 times) [2024-06-18 05:46:59,641][12883] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-06-18 05:46:59,670][12862] Signal inference workers to resume experience collection... (16700 times) [2024-06-18 05:46:59,671][12883] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-06-18 05:47:00,010][12883] Updated weights for policy 0, policy_version 70371 (0.0047) [2024-06-18 05:47:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42265.5). Total num frames: 1153024000. Throughput: 0: 42589.2. Samples: 1153107380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:47:01,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 05:47:03,661][12883] Updated weights for policy 0, policy_version 70381 (0.0032) [2024-06-18 05:47:06,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 1153236992. Throughput: 0: 42528.4. Samples: 1153354120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:47:06,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 05:47:08,358][12883] Updated weights for policy 0, policy_version 70391 (0.0043) [2024-06-18 05:47:11,639][12883] Updated weights for policy 0, policy_version 70401 (0.0039) [2024-06-18 05:47:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1153466368. Throughput: 0: 42393.2. Samples: 1153605180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:47:11,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 05:47:15,969][12883] Updated weights for policy 0, policy_version 70411 (0.0042) [2024-06-18 05:47:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1153646592. Throughput: 0: 42546.6. Samples: 1153736620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:47:16,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 05:47:19,387][12883] Updated weights for policy 0, policy_version 70421 (0.0045) [2024-06-18 05:47:21,993][12645] Fps is (10 sec: 39322.7, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 1153859584. Throughput: 0: 42375.6. Samples: 1153986960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 05:47:21,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 05:47:23,976][12883] Updated weights for policy 0, policy_version 70431 (0.0033) [2024-06-18 05:47:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1154088960. Throughput: 0: 42364.3. Samples: 1154240700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:47:26,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 05:47:27,164][12883] Updated weights for policy 0, policy_version 70441 (0.0048) [2024-06-18 05:47:31,514][12883] Updated weights for policy 0, policy_version 70451 (0.0028) [2024-06-18 05:47:31,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42265.7). Total num frames: 1154285568. Throughput: 0: 42275.5. Samples: 1154368980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:47:31,994][12645] Avg episode reward: [(0, '0.212')] [2024-06-18 05:47:34,850][12883] Updated weights for policy 0, policy_version 70461 (0.0042) [2024-06-18 05:47:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1154498560. Throughput: 0: 42357.3. Samples: 1154626840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:47:36,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 05:47:39,448][12883] Updated weights for policy 0, policy_version 70471 (0.0034) [2024-06-18 05:47:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42265.3). Total num frames: 1154727936. Throughput: 0: 42358.7. Samples: 1154881820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:47:41,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 05:47:42,361][12883] Updated weights for policy 0, policy_version 70481 (0.0027) [2024-06-18 05:47:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1154924544. Throughput: 0: 42295.2. Samples: 1155010660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:47:46,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 05:47:46,996][12883] Updated weights for policy 0, policy_version 70491 (0.0041) [2024-06-18 05:47:50,273][12883] Updated weights for policy 0, policy_version 70501 (0.0034) [2024-06-18 05:47:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 1155153920. Throughput: 0: 42399.5. Samples: 1155262100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:47:51,998][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 05:47:54,680][12883] Updated weights for policy 0, policy_version 70511 (0.0041) [2024-06-18 05:47:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.8, 300 sec: 42209.6). Total num frames: 1155350528. Throughput: 0: 42626.7. Samples: 1155523380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:47:56,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 05:47:57,949][12883] Updated weights for policy 0, policy_version 70521 (0.0034) [2024-06-18 05:48:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42265.3). Total num frames: 1155563520. Throughput: 0: 42365.8. Samples: 1155643080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:01,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 05:48:02,298][12883] Updated weights for policy 0, policy_version 70531 (0.0022) [2024-06-18 05:48:05,850][12883] Updated weights for policy 0, policy_version 70541 (0.0035) [2024-06-18 05:48:06,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42377.2). Total num frames: 1155792896. Throughput: 0: 42548.4. Samples: 1155901640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:06,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 05:48:10,433][12883] Updated weights for policy 0, policy_version 70551 (0.0033) [2024-06-18 05:48:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1155973120. Throughput: 0: 42439.6. Samples: 1156150480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:11,994][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 05:48:13,523][12883] Updated weights for policy 0, policy_version 70561 (0.0027) [2024-06-18 05:48:16,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1156186112. Throughput: 0: 42284.4. Samples: 1156271780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:16,998][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 05:48:18,035][12883] Updated weights for policy 0, policy_version 70571 (0.0020) [2024-06-18 05:48:21,163][12883] Updated weights for policy 0, policy_version 70581 (0.0029) [2024-06-18 05:48:21,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.4, 300 sec: 42431.8). Total num frames: 1156448256. Throughput: 0: 42320.0. Samples: 1156531240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:21,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 05:48:25,766][12883] Updated weights for policy 0, policy_version 70591 (0.0032) [2024-06-18 05:48:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1156612096. Throughput: 0: 42392.8. Samples: 1156789500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:26,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 05:48:29,070][12883] Updated weights for policy 0, policy_version 70601 (0.0036) [2024-06-18 05:48:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1156841472. Throughput: 0: 42080.5. Samples: 1156904280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:31,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 05:48:33,353][12883] Updated weights for policy 0, policy_version 70611 (0.0025) [2024-06-18 05:48:36,672][12862] Signal inference workers to stop experience collection... (16750 times) [2024-06-18 05:48:36,672][12862] Signal inference workers to resume experience collection... (16750 times) [2024-06-18 05:48:36,728][12883] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-06-18 05:48:36,728][12883] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-06-18 05:48:36,802][12883] Updated weights for policy 0, policy_version 70621 (0.0039) [2024-06-18 05:48:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1157054464. Throughput: 0: 42331.1. Samples: 1157167000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:36,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 05:48:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070622_1157070848.pth... 
[2024-06-18 05:48:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070001_1146896384.pth [2024-06-18 05:48:41,073][12883] Updated weights for policy 0, policy_version 70631 (0.0031) [2024-06-18 05:48:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1157234688. Throughput: 0: 42106.3. Samples: 1157418160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:41,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 05:48:44,806][12883] Updated weights for policy 0, policy_version 70641 (0.0038) [2024-06-18 05:48:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1157480448. Throughput: 0: 42254.2. Samples: 1157544520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:48:46,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 05:48:48,718][12883] Updated weights for policy 0, policy_version 70651 (0.0030) [2024-06-18 05:48:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42210.5). Total num frames: 1157660672. Throughput: 0: 41965.2. Samples: 1157790080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 05:48:51,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 05:48:52,723][12883] Updated weights for policy 0, policy_version 70661 (0.0028) [2024-06-18 05:48:56,376][12883] Updated weights for policy 0, policy_version 70671 (0.0029) [2024-06-18 05:48:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1157890048. Throughput: 0: 42129.8. Samples: 1158046320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 05:48:56,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 05:49:00,712][12883] Updated weights for policy 0, policy_version 70681 (0.0032) [2024-06-18 05:49:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1158103040. Throughput: 0: 42340.1. Samples: 1158177080. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 05:49:01,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 05:49:03,935][12883] Updated weights for policy 0, policy_version 70691 (0.0037) [2024-06-18 05:49:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 1158316032. Throughput: 0: 42174.6. Samples: 1158429100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 05:49:06,995][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 05:49:08,419][12883] Updated weights for policy 0, policy_version 70701 (0.0041) [2024-06-18 05:49:11,648][12883] Updated weights for policy 0, policy_version 70711 (0.0036) [2024-06-18 05:49:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1158529024. Throughput: 0: 41941.0. Samples: 1158676840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 05:49:11,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 05:49:16,452][12883] Updated weights for policy 0, policy_version 70721 (0.0046) [2024-06-18 05:49:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1158742016. Throughput: 0: 42227.1. Samples: 1158804500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 05:49:16,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 05:49:19,511][12883] Updated weights for policy 0, policy_version 70731 (0.0044) [2024-06-18 05:49:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41506.1, 300 sec: 42209.9). Total num frames: 1158938624. Throughput: 0: 42111.9. Samples: 1159062040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 05:49:22,008][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 05:49:24,026][12883] Updated weights for policy 0, policy_version 70741 (0.0031) [2024-06-18 05:49:26,980][12883] Updated weights for policy 0, policy_version 70751 (0.0036) [2024-06-18 05:49:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42376.2). 
Total num frames: 1159184384. Throughput: 0: 42162.1. Samples: 1159315460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 05:49:26,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 05:49:31,531][12883] Updated weights for policy 0, policy_version 70761 (0.0038) [2024-06-18 05:49:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 1159348224. Throughput: 0: 42304.3. Samples: 1159448220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 05:49:31,994][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 05:49:34,631][12883] Updated weights for policy 0, policy_version 70771 (0.0031) [2024-06-18 05:49:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1159577600. Throughput: 0: 42349.4. Samples: 1159695800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 05:49:36,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 05:49:39,380][12883] Updated weights for policy 0, policy_version 70781 (0.0035) [2024-06-18 05:49:41,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1159790592. Throughput: 0: 42303.1. Samples: 1159949960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 05:49:41,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 05:49:42,492][12883] Updated weights for policy 0, policy_version 70791 (0.0039) [2024-06-18 05:49:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1159987200. Throughput: 0: 42158.8. Samples: 1160074220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 05:49:46,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 05:49:47,053][12883] Updated weights for policy 0, policy_version 70801 (0.0030) [2024-06-18 05:49:50,241][12883] Updated weights for policy 0, policy_version 70811 (0.0024) [2024-06-18 05:49:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42265.2). 
Total num frames: 1160216576. Throughput: 0: 42110.3. Samples: 1160324060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 05:49:51,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 05:49:54,772][12883] Updated weights for policy 0, policy_version 70821 (0.0027) [2024-06-18 05:49:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1160413184. Throughput: 0: 42421.3. Samples: 1160585800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 05:49:56,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 05:49:57,870][12883] Updated weights for policy 0, policy_version 70831 (0.0038) [2024-06-18 05:50:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42265.3). Total num frames: 1160626176. Throughput: 0: 42337.4. Samples: 1160709680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 05:50:01,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 05:50:02,582][12883] Updated weights for policy 0, policy_version 70841 (0.0035) [2024-06-18 05:50:05,817][12883] Updated weights for policy 0, policy_version 70851 (0.0031) [2024-06-18 05:50:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1160855552. Throughput: 0: 42207.2. Samples: 1160961360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 05:50:06,994][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 05:50:09,431][12862] Signal inference workers to stop experience collection... (16800 times) [2024-06-18 05:50:09,431][12862] Signal inference workers to resume experience collection... 
(16800 times) [2024-06-18 05:50:09,459][12883] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-06-18 05:50:09,487][12883] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-06-18 05:50:10,174][12883] Updated weights for policy 0, policy_version 70861 (0.0027) [2024-06-18 05:50:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1161052160. Throughput: 0: 42280.6. Samples: 1161218080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 05:50:11,994][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 05:50:13,604][12883] Updated weights for policy 0, policy_version 70871 (0.0028) [2024-06-18 05:50:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1161248768. Throughput: 0: 42025.0. Samples: 1161339340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 05:50:16,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 05:50:17,674][12883] Updated weights for policy 0, policy_version 70881 (0.0038) [2024-06-18 05:50:21,339][12883] Updated weights for policy 0, policy_version 70891 (0.0033) [2024-06-18 05:50:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42321.0). Total num frames: 1161494528. Throughput: 0: 42301.9. Samples: 1161599380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 05:50:21,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 05:50:25,551][12883] Updated weights for policy 0, policy_version 70901 (0.0037) [2024-06-18 05:50:26,996][12645] Fps is (10 sec: 44226.8, 60 sec: 41777.7, 300 sec: 42264.9). Total num frames: 1161691136. Throughput: 0: 42307.2. Samples: 1161853880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 05:50:26,996][12645] Avg episode reward: [(0, '0.234')] [2024-06-18 05:50:29,318][12883] Updated weights for policy 0, policy_version 70911 (0.0037) [2024-06-18 05:50:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1161887744. Throughput: 0: 42217.3. Samples: 1161974000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 05:50:31,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 05:50:33,036][12883] Updated weights for policy 0, policy_version 70921 (0.0030) [2024-06-18 05:50:36,787][12883] Updated weights for policy 0, policy_version 70931 (0.0033) [2024-06-18 05:50:37,000][12645] Fps is (10 sec: 45856.6, 60 sec: 42867.0, 300 sec: 42375.3). Total num frames: 1162149888. Throughput: 0: 42549.6. Samples: 1162239060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 05:50:37,001][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 05:50:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070932_1162149888.pth... [2024-06-18 05:50:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070310_1151959040.pth [2024-06-18 05:50:40,665][12883] Updated weights for policy 0, policy_version 70941 (0.0027) [2024-06-18 05:50:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1162330112. Throughput: 0: 42439.9. Samples: 1162495600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 05:50:41,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 05:50:44,466][12883] Updated weights for policy 0, policy_version 70951 (0.0029) [2024-06-18 05:50:46,994][12645] Fps is (10 sec: 39346.3, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1162543104. Throughput: 0: 42342.2. Samples: 1162615080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 05:50:46,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 05:50:48,325][12883] Updated weights for policy 0, policy_version 70961 (0.0032) [2024-06-18 05:50:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1162772480. Throughput: 0: 42486.2. Samples: 1162873240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 05:50:51,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 05:50:52,487][12883] Updated weights for policy 0, policy_version 70971 (0.0032) [2024-06-18 05:50:56,319][12883] Updated weights for policy 0, policy_version 70981 (0.0040) [2024-06-18 05:50:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1162985472. Throughput: 0: 42374.2. Samples: 1163124920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 05:50:56,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 05:51:00,331][12883] Updated weights for policy 0, policy_version 70991 (0.0032) [2024-06-18 05:51:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1163182080. Throughput: 0: 42488.9. Samples: 1163251340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:51:01,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 05:51:04,058][12883] Updated weights for policy 0, policy_version 71001 (0.0036) [2024-06-18 05:51:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1163411456. Throughput: 0: 42396.4. Samples: 1163507220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:51:06,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 05:51:08,108][12883] Updated weights for policy 0, policy_version 71011 (0.0040) [2024-06-18 05:51:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1163591680. Throughput: 0: 42428.8. Samples: 1163763080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:51:11,994][12645] Avg episode reward: [(0, '0.319')] [2024-06-18 05:51:12,025][12883] Updated weights for policy 0, policy_version 71021 (0.0039) [2024-06-18 05:51:16,333][12883] Updated weights for policy 0, policy_version 71031 (0.0032) [2024-06-18 05:51:16,996][12645] Fps is (10 sec: 37674.7, 60 sec: 42323.7, 300 sec: 42264.8). Total num frames: 1163788288. Throughput: 0: 42368.6. Samples: 1163880680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:51:16,996][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 05:51:19,731][12883] Updated weights for policy 0, policy_version 71041 (0.0028) [2024-06-18 05:51:21,748][12862] Signal inference workers to stop experience collection... (16850 times) [2024-06-18 05:51:21,749][12862] Signal inference workers to resume experience collection... (16850 times) [2024-06-18 05:51:21,790][12883] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-06-18 05:51:21,790][12883] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-06-18 05:51:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1164050432. Throughput: 0: 42151.6. Samples: 1164135620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:51:21,994][12645] Avg episode reward: [(0, '0.151')] [2024-06-18 05:51:24,049][12883] Updated weights for policy 0, policy_version 71051 (0.0041) [2024-06-18 05:51:26,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1164230656. Throughput: 0: 42182.2. Samples: 1164393800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:51:26,994][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 05:51:27,566][12883] Updated weights for policy 0, policy_version 71061 (0.0026) [2024-06-18 05:51:31,631][12883] Updated weights for policy 0, policy_version 71071 (0.0036) [2024-06-18 05:51:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1164443648. Throughput: 0: 42231.1. Samples: 1164515480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:51:31,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 05:51:35,283][12883] Updated weights for policy 0, policy_version 71081 (0.0027) [2024-06-18 05:51:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42056.6, 300 sec: 42376.2). Total num frames: 1164673024. Throughput: 0: 42225.7. Samples: 1164773400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:51:36,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 05:51:39,327][12883] Updated weights for policy 0, policy_version 71091 (0.0023) [2024-06-18 05:51:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1164853248. Throughput: 0: 42343.6. Samples: 1165030380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 05:51:41,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 05:51:42,956][12883] Updated weights for policy 0, policy_version 71101 (0.0040) [2024-06-18 05:51:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1165066240. Throughput: 0: 42324.9. Samples: 1165155960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 05:51:46,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 05:51:47,002][12883] Updated weights for policy 0, policy_version 71111 (0.0039) [2024-06-18 05:51:51,107][12883] Updated weights for policy 0, policy_version 71121 (0.0030) [2024-06-18 05:51:51,996][12645] Fps is (10 sec: 42588.6, 60 sec: 41777.6, 300 sec: 42209.6). 
Total num frames: 1165279232. Throughput: 0: 42250.3. Samples: 1165408580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 05:51:51,997][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 05:51:54,606][12883] Updated weights for policy 0, policy_version 71131 (0.0039) [2024-06-18 05:51:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 1165492224. Throughput: 0: 42216.9. Samples: 1165662840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 05:51:56,994][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 05:51:58,870][12883] Updated weights for policy 0, policy_version 71141 (0.0026) [2024-06-18 05:52:01,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1165705216. Throughput: 0: 42337.7. Samples: 1165785780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 05:52:01,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 05:52:02,840][12883] Updated weights for policy 0, policy_version 71151 (0.0031) [2024-06-18 05:52:06,526][12883] Updated weights for policy 0, policy_version 71161 (0.0046) [2024-06-18 05:52:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42209.7). Total num frames: 1165918208. Throughput: 0: 42363.3. Samples: 1166041960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 05:52:06,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 05:52:10,480][12883] Updated weights for policy 0, policy_version 71171 (0.0045) [2024-06-18 05:52:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1166098432. Throughput: 0: 42300.6. Samples: 1166297320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 05:52:11,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 05:52:14,379][12883] Updated weights for policy 0, policy_version 71181 (0.0031) [2024-06-18 05:52:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42599.9, 300 sec: 42320.7). 
Total num frames: 1166344192. Throughput: 0: 42254.1. Samples: 1166416920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 05:52:16,994][12645] Avg episode reward: [(0, '0.062')] [2024-06-18 05:52:18,158][12883] Updated weights for policy 0, policy_version 71191 (0.0029) [2024-06-18 05:52:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 1166540800. Throughput: 0: 42064.9. Samples: 1166666320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 05:52:21,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 05:52:22,413][12883] Updated weights for policy 0, policy_version 71201 (0.0035) [2024-06-18 05:52:25,752][12883] Updated weights for policy 0, policy_version 71211 (0.0032) [2024-06-18 05:52:26,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42050.7, 300 sec: 42264.9). Total num frames: 1166753792. Throughput: 0: 41997.0. Samples: 1166920340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 05:52:26,997][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 05:52:29,995][12883] Updated weights for policy 0, policy_version 71221 (0.0030) [2024-06-18 05:52:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1166983168. Throughput: 0: 42117.7. Samples: 1167051260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 05:52:31,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 05:52:33,431][12883] Updated weights for policy 0, policy_version 71231 (0.0035) [2024-06-18 05:52:36,994][12645] Fps is (10 sec: 42608.0, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1167179776. Throughput: 0: 42135.0. Samples: 1167304560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 05:52:36,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 05:52:37,090][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071240_1167196160.pth... 
[2024-06-18 05:52:37,150][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070622_1157070848.pth [2024-06-18 05:52:37,543][12883] Updated weights for policy 0, policy_version 71241 (0.0035) [2024-06-18 05:52:41,420][12883] Updated weights for policy 0, policy_version 71251 (0.0041) [2024-06-18 05:52:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1167392768. Throughput: 0: 41959.0. Samples: 1167551000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 05:52:41,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 05:52:45,386][12883] Updated weights for policy 0, policy_version 71261 (0.0043) [2024-06-18 05:52:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1167605760. Throughput: 0: 41987.4. Samples: 1167675220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 05:52:46,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 05:52:48,970][12883] Updated weights for policy 0, policy_version 71271 (0.0030) [2024-06-18 05:52:50,944][12862] Signal inference workers to stop experience collection... (16900 times) [2024-06-18 05:52:50,976][12883] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-06-18 05:52:51,001][12862] Signal inference workers to resume experience collection... (16900 times) [2024-06-18 05:52:51,008][12883] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-06-18 05:52:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42327.0, 300 sec: 42265.2). Total num frames: 1167818752. Throughput: 0: 41983.1. Samples: 1167931200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 05:52:51,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 05:52:53,096][12883] Updated weights for policy 0, policy_version 71281 (0.0031) [2024-06-18 05:52:56,654][12883] Updated weights for policy 0, policy_version 71291 (0.0041) [2024-06-18 05:52:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 1168031744. Throughput: 0: 41912.3. Samples: 1168183380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 05:52:56,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 05:53:01,023][12883] Updated weights for policy 0, policy_version 71301 (0.0045) [2024-06-18 05:53:01,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 1168228352. Throughput: 0: 42074.1. Samples: 1168310260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 05:53:01,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 05:53:04,540][12883] Updated weights for policy 0, policy_version 71311 (0.0048) [2024-06-18 05:53:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.2, 300 sec: 42376.2). Total num frames: 1168474112. Throughput: 0: 42248.0. Samples: 1168567480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 05:53:06,995][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 05:53:08,563][12883] Updated weights for policy 0, policy_version 71321 (0.0026) [2024-06-18 05:53:11,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1168654336. Throughput: 0: 42374.2. Samples: 1168827080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 05:53:11,994][12645] Avg episode reward: [(0, '0.111')] [2024-06-18 05:53:12,366][12883] Updated weights for policy 0, policy_version 71331 (0.0035) [2024-06-18 05:53:16,213][12883] Updated weights for policy 0, policy_version 71341 (0.0032) [2024-06-18 05:53:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42098.5). 
Total num frames: 1168867328. Throughput: 0: 42156.0. Samples: 1168948280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 05:53:16,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 05:53:20,461][12883] Updated weights for policy 0, policy_version 71351 (0.0025) [2024-06-18 05:53:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1169096704. Throughput: 0: 42258.7. Samples: 1169206200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 05:53:21,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 05:53:23,859][12883] Updated weights for policy 0, policy_version 71361 (0.0028) [2024-06-18 05:53:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 1169293312. Throughput: 0: 42410.2. Samples: 1169459460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 05:53:26,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 05:53:28,184][12883] Updated weights for policy 0, policy_version 71371 (0.0030) [2024-06-18 05:53:31,407][12883] Updated weights for policy 0, policy_version 71381 (0.0042) [2024-06-18 05:53:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1169506304. Throughput: 0: 42390.7. Samples: 1169582800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 05:53:31,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 05:53:35,884][12883] Updated weights for policy 0, policy_version 71391 (0.0039) [2024-06-18 05:53:36,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1169752064. Throughput: 0: 42564.8. Samples: 1169846620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 05:53:36,995][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 05:53:39,032][12883] Updated weights for policy 0, policy_version 71401 (0.0037) [2024-06-18 05:53:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42154.1). 
Total num frames: 1169915904. Throughput: 0: 42554.2. Samples: 1170098320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 05:53:41,994][12645] Avg episode reward: [(0, '0.212')] [2024-06-18 05:53:43,501][12883] Updated weights for policy 0, policy_version 71411 (0.0044) [2024-06-18 05:53:46,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42323.8, 300 sec: 42320.4). Total num frames: 1170145280. Throughput: 0: 42381.6. Samples: 1170217520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 05:53:46,996][12645] Avg episode reward: [(0, '0.212')] [2024-06-18 05:53:47,082][12883] Updated weights for policy 0, policy_version 71421 (0.0040) [2024-06-18 05:53:51,190][12883] Updated weights for policy 0, policy_version 71431 (0.0029) [2024-06-18 05:53:51,994][12645] Fps is (10 sec: 45876.3, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1170374656. Throughput: 0: 42514.5. Samples: 1170480620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 05:53:51,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 05:53:54,667][12883] Updated weights for policy 0, policy_version 71441 (0.0033) [2024-06-18 05:53:56,994][12645] Fps is (10 sec: 39329.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1170538496. Throughput: 0: 42237.6. Samples: 1170727780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:53:56,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 05:53:58,978][12883] Updated weights for policy 0, policy_version 71451 (0.0037) [2024-06-18 05:54:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 1170784256. Throughput: 0: 42183.6. Samples: 1170846540. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:54:01,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 05:54:02,438][12883] Updated weights for policy 0, policy_version 71461 (0.0032) [2024-06-18 05:54:06,846][12883] Updated weights for policy 0, policy_version 71471 (0.0021) [2024-06-18 05:54:06,993][12645] Fps is (10 sec: 45876.7, 60 sec: 42052.5, 300 sec: 42265.2). Total num frames: 1170997248. Throughput: 0: 42322.8. Samples: 1171110720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:54:06,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 05:54:10,079][12883] Updated weights for policy 0, policy_version 71481 (0.0028) [2024-06-18 05:54:10,658][12862] Signal inference workers to stop experience collection... (16950 times) [2024-06-18 05:54:10,658][12862] Signal inference workers to resume experience collection... (16950 times) [2024-06-18 05:54:10,682][12883] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-06-18 05:54:10,682][12883] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-06-18 05:54:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1171193856. Throughput: 0: 42296.5. Samples: 1171362800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:54:11,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 05:54:14,512][12883] Updated weights for policy 0, policy_version 71491 (0.0043) [2024-06-18 05:54:16,996][12645] Fps is (10 sec: 42588.0, 60 sec: 42596.8, 300 sec: 42320.4). Total num frames: 1171423232. Throughput: 0: 42297.9. Samples: 1171486300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:54:16,996][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 05:54:17,731][12883] Updated weights for policy 0, policy_version 71501 (0.0033) [2024-06-18 05:54:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 1171603456. Throughput: 0: 42018.3. 
Samples: 1171737440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:54:21,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 05:54:22,305][12883] Updated weights for policy 0, policy_version 71511 (0.0041) [2024-06-18 05:54:26,140][12883] Updated weights for policy 0, policy_version 71521 (0.0030) [2024-06-18 05:54:26,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1171832832. Throughput: 0: 42026.8. Samples: 1171989520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:54:26,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 05:54:30,099][12883] Updated weights for policy 0, policy_version 71531 (0.0032) [2024-06-18 05:54:31,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42323.8, 300 sec: 42264.8). Total num frames: 1172045824. Throughput: 0: 42338.2. Samples: 1172122740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:54:31,997][12645] Avg episode reward: [(0, '0.654')] [2024-06-18 05:54:32,043][12862] Saving new best policy, reward=0.654! [2024-06-18 05:54:33,862][12883] Updated weights for policy 0, policy_version 71541 (0.0038) [2024-06-18 05:54:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 1172242432. Throughput: 0: 42025.1. Samples: 1172371760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 05:54:36,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 05:54:37,000][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071548_1172242432.pth... [2024-06-18 05:54:37,051][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000070932_1162149888.pth [2024-06-18 05:54:37,723][12883] Updated weights for policy 0, policy_version 71551 (0.0028) [2024-06-18 05:54:41,460][12883] Updated weights for policy 0, policy_version 71561 (0.0034) [2024-06-18 05:54:41,994][12645] Fps is (10 sec: 40969.8, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 1172455424. 
Throughput: 0: 42182.5. Samples: 1172625980. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:54:41,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 05:54:45,591][12883] Updated weights for policy 0, policy_version 71571 (0.0025) [2024-06-18 05:54:46,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42599.9, 300 sec: 42320.7). Total num frames: 1172701184. Throughput: 0: 42594.6. Samples: 1172763300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:54:46,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 05:54:49,192][12883] Updated weights for policy 0, policy_version 71581 (0.0030) [2024-06-18 05:54:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 42209.6). Total num frames: 1172865024. Throughput: 0: 41980.7. Samples: 1172999860. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:54:51,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 05:54:53,476][12883] Updated weights for policy 0, policy_version 71591 (0.0041) [2024-06-18 05:54:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 1173094400. Throughput: 0: 42110.1. Samples: 1173257760. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:54:56,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 05:54:57,267][12883] Updated weights for policy 0, policy_version 71601 (0.0032) [2024-06-18 05:55:01,025][12883] Updated weights for policy 0, policy_version 71611 (0.0023) [2024-06-18 05:55:01,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1173340160. Throughput: 0: 42325.3. Samples: 1173390840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:55:01,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 05:55:05,132][12883] Updated weights for policy 0, policy_version 71621 (0.0029) [2024-06-18 05:55:06,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42050.6, 300 sec: 42264.8). Total num frames: 1173520384. 
Throughput: 0: 42289.0. Samples: 1173640540. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:55:06,996][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 05:55:08,624][12883] Updated weights for policy 0, policy_version 71631 (0.0045) [2024-06-18 05:55:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1173733376. Throughput: 0: 42403.1. Samples: 1173897660. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:55:11,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 05:55:12,874][12883] Updated weights for policy 0, policy_version 71641 (0.0030) [2024-06-18 05:55:16,345][12883] Updated weights for policy 0, policy_version 71651 (0.0031) [2024-06-18 05:55:16,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42326.9, 300 sec: 42265.1). Total num frames: 1173962752. Throughput: 0: 42262.1. Samples: 1174024440. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 05:55:16,995][12645] Avg episode reward: [(0, '0.095')] [2024-06-18 05:55:20,639][12883] Updated weights for policy 0, policy_version 71661 (0.0031) [2024-06-18 05:55:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42265.5). Total num frames: 1174159360. Throughput: 0: 42353.9. Samples: 1174277680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 05:55:21,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 05:55:24,133][12883] Updated weights for policy 0, policy_version 71671 (0.0045) [2024-06-18 05:55:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1174372352. Throughput: 0: 42306.0. Samples: 1174529760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 05:55:26,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 05:55:28,303][12883] Updated weights for policy 0, policy_version 71681 (0.0040) [2024-06-18 05:55:28,717][12862] Signal inference workers to stop experience collection... 
(17000 times) [2024-06-18 05:55:28,770][12862] Signal inference workers to resume experience collection... (17000 times) [2024-06-18 05:55:28,771][12883] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-06-18 05:55:28,791][12883] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-06-18 05:55:31,676][12883] Updated weights for policy 0, policy_version 71691 (0.0027) [2024-06-18 05:55:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42600.1, 300 sec: 42210.5). Total num frames: 1174601728. Throughput: 0: 42080.6. Samples: 1174656920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 05:55:31,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 05:55:36,012][12883] Updated weights for policy 0, policy_version 71701 (0.0026) [2024-06-18 05:55:36,996][12645] Fps is (10 sec: 44227.7, 60 sec: 42870.0, 300 sec: 42320.4). Total num frames: 1174814720. Throughput: 0: 42551.3. Samples: 1174914760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 05:55:36,996][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 05:55:39,442][12883] Updated weights for policy 0, policy_version 71711 (0.0033) [2024-06-18 05:55:41,994][12645] Fps is (10 sec: 39320.6, 60 sec: 42325.1, 300 sec: 42209.6). Total num frames: 1174994944. Throughput: 0: 42373.3. Samples: 1175164560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 05:55:41,994][12645] Avg episode reward: [(0, '0.631')] [2024-06-18 05:55:43,743][12883] Updated weights for policy 0, policy_version 71721 (0.0040) [2024-06-18 05:55:46,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1175224320. Throughput: 0: 42220.3. Samples: 1175290760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 05:55:46,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 05:55:47,342][12883] Updated weights for policy 0, policy_version 71731 (0.0048) [2024-06-18 05:55:51,282][12883] Updated weights for policy 0, policy_version 71741 (0.0038) [2024-06-18 05:55:51,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1175420928. Throughput: 0: 42423.0. Samples: 1175549480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 05:55:51,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 05:55:55,267][12883] Updated weights for policy 0, policy_version 71751 (0.0030) [2024-06-18 05:55:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1175633920. Throughput: 0: 42186.2. Samples: 1175796040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 05:55:56,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 05:55:59,330][12883] Updated weights for policy 0, policy_version 71761 (0.0047) [2024-06-18 05:56:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1175846912. Throughput: 0: 42266.3. Samples: 1175926420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 05:56:01,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 05:56:02,841][12883] Updated weights for policy 0, policy_version 71771 (0.0037) [2024-06-18 05:56:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42053.9, 300 sec: 42209.6). Total num frames: 1176043520. Throughput: 0: 42210.3. Samples: 1176177140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:56:06,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 05:56:07,070][12883] Updated weights for policy 0, policy_version 71781 (0.0034) [2024-06-18 05:56:10,573][12883] Updated weights for policy 0, policy_version 71791 (0.0027) [2024-06-18 05:56:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42376.5). Total num frames: 1176289280. Throughput: 0: 42120.0. Samples: 1176425160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:56:11,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 05:56:14,986][12883] Updated weights for policy 0, policy_version 71801 (0.0037) [2024-06-18 05:56:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 1176469504. Throughput: 0: 42248.0. Samples: 1176558080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:56:16,994][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 05:56:18,117][12883] Updated weights for policy 0, policy_version 71811 (0.0042) [2024-06-18 05:56:21,994][12645] Fps is (10 sec: 37683.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1176666112. Throughput: 0: 42046.1. Samples: 1176806740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:56:21,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 05:56:22,695][12883] Updated weights for policy 0, policy_version 71821 (0.0038) [2024-06-18 05:56:25,924][12883] Updated weights for policy 0, policy_version 71831 (0.0043) [2024-06-18 05:56:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42265.1). Total num frames: 1176911872. Throughput: 0: 42090.3. Samples: 1177058620. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:56:26,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 05:56:30,299][12883] Updated weights for policy 0, policy_version 71841 (0.0030) [2024-06-18 05:56:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1177108480. Throughput: 0: 42139.2. Samples: 1177187020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:56:31,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 05:56:33,661][12883] Updated weights for policy 0, policy_version 71851 (0.0033) [2024-06-18 05:56:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41780.7, 300 sec: 42265.2). Total num frames: 1177321472. Throughput: 0: 41981.7. Samples: 1177438660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:56:36,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 05:56:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071858_1177321472.pth... [2024-06-18 05:56:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071240_1167196160.pth [2024-06-18 05:56:37,958][12883] Updated weights for policy 0, policy_version 71861 (0.0039) [2024-06-18 05:56:41,519][12883] Updated weights for policy 0, policy_version 71871 (0.0047) [2024-06-18 05:56:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1177550848. Throughput: 0: 42012.8. Samples: 1177686620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:56:41,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 05:56:46,278][12883] Updated weights for policy 0, policy_version 71881 (0.0036) [2024-06-18 05:56:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42265.5). Total num frames: 1177747456. Throughput: 0: 42100.0. Samples: 1177820920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 05:56:46,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 05:56:48,092][12862] Signal inference workers to stop experience collection... (17050 times) [2024-06-18 05:56:48,092][12862] Signal inference workers to resume experience collection... (17050 times) [2024-06-18 05:56:48,133][12883] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-06-18 05:56:48,133][12883] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-06-18 05:56:49,427][12883] Updated weights for policy 0, policy_version 71891 (0.0037) [2024-06-18 05:56:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1177944064. Throughput: 0: 42094.1. Samples: 1178071380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 05:56:51,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 05:56:53,945][12883] Updated weights for policy 0, policy_version 71901 (0.0038) [2024-06-18 05:56:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1178173440. Throughput: 0: 42097.9. Samples: 1178319560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 05:56:56,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 05:56:57,053][12883] Updated weights for policy 0, policy_version 71911 (0.0033) [2024-06-18 05:57:01,557][12883] Updated weights for policy 0, policy_version 71921 (0.0032) [2024-06-18 05:57:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1178353664. Throughput: 0: 42068.8. Samples: 1178451180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 05:57:01,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 05:57:04,610][12883] Updated weights for policy 0, policy_version 71931 (0.0035) [2024-06-18 05:57:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1178583040. Throughput: 0: 42221.2. 
Samples: 1178706700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 05:57:06,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 05:57:09,211][12883] Updated weights for policy 0, policy_version 71941 (0.0031) [2024-06-18 05:57:11,995][12645] Fps is (10 sec: 47507.5, 60 sec: 42324.5, 300 sec: 42320.5). Total num frames: 1178828800. Throughput: 0: 42142.9. Samples: 1178955100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 05:57:11,995][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 05:57:12,159][12883] Updated weights for policy 0, policy_version 71951 (0.0034) [2024-06-18 05:57:16,799][12883] Updated weights for policy 0, policy_version 71961 (0.0027) [2024-06-18 05:57:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1179009024. Throughput: 0: 42160.9. Samples: 1179084260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 05:57:16,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 05:57:20,179][12883] Updated weights for policy 0, policy_version 71971 (0.0032) [2024-06-18 05:57:21,994][12645] Fps is (10 sec: 37688.4, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 1179205632. Throughput: 0: 42066.8. Samples: 1179331660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 05:57:21,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 05:57:25,025][12883] Updated weights for policy 0, policy_version 71981 (0.0045) [2024-06-18 05:57:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 1179435008. Throughput: 0: 42225.9. Samples: 1179586780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 05:57:26,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 05:57:27,856][12883] Updated weights for policy 0, policy_version 71991 (0.0035) [2024-06-18 05:57:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1179631616. Throughput: 0: 42017.2. 
Samples: 1179711700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 05:57:31,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 05:57:32,724][12883] Updated weights for policy 0, policy_version 72001 (0.0032) [2024-06-18 05:57:35,454][12883] Updated weights for policy 0, policy_version 72011 (0.0048) [2024-06-18 05:57:36,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42323.7, 300 sec: 42264.9). Total num frames: 1179860992. Throughput: 0: 41913.5. Samples: 1179957580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 05:57:36,997][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 05:57:40,280][12883] Updated weights for policy 0, policy_version 72021 (0.0042) [2024-06-18 05:57:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 1180057600. Throughput: 0: 42388.0. Samples: 1180227020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 05:57:41,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 05:57:43,479][12883] Updated weights for policy 0, policy_version 72031 (0.0036) [2024-06-18 05:57:46,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1180270592. Throughput: 0: 42072.8. Samples: 1180344460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 05:57:46,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 05:57:48,307][12883] Updated weights for policy 0, policy_version 72041 (0.0045) [2024-06-18 05:57:51,279][12883] Updated weights for policy 0, policy_version 72051 (0.0025) [2024-06-18 05:57:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1180499968. Throughput: 0: 42074.7. Samples: 1180600060. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 05:57:51,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 05:57:55,902][12883] Updated weights for policy 0, policy_version 72061 (0.0027) [2024-06-18 05:57:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 1180663808. Throughput: 0: 42292.2. Samples: 1180858200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 05:57:56,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 05:57:59,082][12883] Updated weights for policy 0, policy_version 72071 (0.0035) [2024-06-18 05:58:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 1180893184. Throughput: 0: 41928.4. Samples: 1180971040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 05:58:01,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 05:58:03,731][12883] Updated weights for policy 0, policy_version 72081 (0.0029) [2024-06-18 05:58:06,814][12883] Updated weights for policy 0, policy_version 72091 (0.0026) [2024-06-18 05:58:06,994][12645] Fps is (10 sec: 47514.5, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 1181138944. Throughput: 0: 42294.2. Samples: 1181234900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 05:58:06,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 05:58:11,415][12883] Updated weights for policy 0, policy_version 72101 (0.0040) [2024-06-18 05:58:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41233.9, 300 sec: 42154.1). Total num frames: 1181302784. Throughput: 0: 42283.9. Samples: 1181489560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 05:58:11,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 05:58:14,719][12883] Updated weights for policy 0, policy_version 72111 (0.0044) [2024-06-18 05:58:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1181532160. Throughput: 0: 42242.4. Samples: 1181612600. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 05:58:16,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 05:58:19,335][12883] Updated weights for policy 0, policy_version 72121 (0.0035) [2024-06-18 05:58:21,415][12862] Signal inference workers to stop experience collection... (17100 times) [2024-06-18 05:58:21,456][12883] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-06-18 05:58:21,488][12862] Signal inference workers to resume experience collection... (17100 times) [2024-06-18 05:58:21,492][12883] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-06-18 05:58:21,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1181777920. Throughput: 0: 42575.1. Samples: 1181873360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 05:58:21,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 05:58:22,358][12883] Updated weights for policy 0, policy_version 72131 (0.0035) [2024-06-18 05:58:26,892][12883] Updated weights for policy 0, policy_version 72141 (0.0027) [2024-06-18 05:58:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1181958144. Throughput: 0: 42268.0. Samples: 1182129080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 05:58:26,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 05:58:30,276][12883] Updated weights for policy 0, policy_version 72151 (0.0041) [2024-06-18 05:58:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 1182187520. Throughput: 0: 42322.3. Samples: 1182248960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 05:58:31,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 05:58:34,543][12883] Updated weights for policy 0, policy_version 72161 (0.0041) [2024-06-18 05:58:37,000][12645] Fps is (10 sec: 44209.3, 60 sec: 42322.5, 300 sec: 42319.8). Total num frames: 1182400512. Throughput: 0: 42416.0. 
Samples: 1182509040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 05:58:37,000][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 05:58:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072168_1182400512.pth... [2024-06-18 05:58:37,093][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071548_1172242432.pth [2024-06-18 05:58:37,801][12883] Updated weights for policy 0, policy_version 72171 (0.0026) [2024-06-18 05:58:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42154.4). Total num frames: 1182580736. Throughput: 0: 42427.3. Samples: 1182767420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 05:58:41,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 05:58:42,393][12883] Updated weights for policy 0, policy_version 72181 (0.0028) [2024-06-18 05:58:45,384][12883] Updated weights for policy 0, policy_version 72191 (0.0039) [2024-06-18 05:58:46,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 1182826496. Throughput: 0: 42537.4. Samples: 1182885220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 05:58:46,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 05:58:50,374][12883] Updated weights for policy 0, policy_version 72201 (0.0032) [2024-06-18 05:58:51,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1183039488. Throughput: 0: 42485.1. Samples: 1183146740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 05:58:52,006][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 05:58:53,512][12883] Updated weights for policy 0, policy_version 72211 (0.0042) [2024-06-18 05:58:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1183219712. Throughput: 0: 42248.9. Samples: 1183390760. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 05:58:56,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 05:58:58,212][12883] Updated weights for policy 0, policy_version 72221 (0.0029) [2024-06-18 05:59:01,219][12883] Updated weights for policy 0, policy_version 72231 (0.0035) [2024-06-18 05:59:01,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42869.9, 300 sec: 42264.8). Total num frames: 1183465472. Throughput: 0: 42335.1. Samples: 1183517780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 05:59:01,997][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 05:59:06,158][12883] Updated weights for policy 0, policy_version 72241 (0.0042) [2024-06-18 05:59:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 1183645696. Throughput: 0: 42279.0. Samples: 1183775920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 05:59:06,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 05:59:08,817][12883] Updated weights for policy 0, policy_version 72251 (0.0028) [2024-06-18 05:59:11,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42598.4, 300 sec: 42154.4). Total num frames: 1183858688. Throughput: 0: 42019.5. Samples: 1184019960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 05:59:11,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 05:59:13,906][12883] Updated weights for policy 0, policy_version 72261 (0.0032) [2024-06-18 05:59:16,827][12883] Updated weights for policy 0, policy_version 72271 (0.0027) [2024-06-18 05:59:16,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42596.8, 300 sec: 42320.4). Total num frames: 1184088064. Throughput: 0: 42289.9. Samples: 1184152100. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 05:59:16,996][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 05:59:21,532][12883] Updated weights for policy 0, policy_version 72281 (0.0031) [2024-06-18 05:59:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41233.1, 300 sec: 42098.6). Total num frames: 1184251904. Throughput: 0: 42164.1. Samples: 1184406160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 05:59:21,994][12645] Avg episode reward: [(0, '0.103')] [2024-06-18 05:59:24,634][12883] Updated weights for policy 0, policy_version 72291 (0.0032) [2024-06-18 05:59:26,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42209.9). Total num frames: 1184497664. Throughput: 0: 41817.2. Samples: 1184649200. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 05:59:26,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 05:59:29,529][12883] Updated weights for policy 0, policy_version 72301 (0.0033) [2024-06-18 05:59:31,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1184727040. Throughput: 0: 42251.6. Samples: 1184786540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 05:59:31,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 05:59:32,345][12883] Updated weights for policy 0, policy_version 72311 (0.0026) [2024-06-18 05:59:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41510.4, 300 sec: 42154.1). Total num frames: 1184890880. Throughput: 0: 42061.0. Samples: 1185039480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 05:59:36,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 05:59:37,251][12883] Updated weights for policy 0, policy_version 72321 (0.0034) [2024-06-18 05:59:40,220][12883] Updated weights for policy 0, policy_version 72331 (0.0033) [2024-06-18 05:59:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 1185153024. Throughput: 0: 41984.5. Samples: 1185280060. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 21.0) [2024-06-18 05:59:41,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 05:59:45,017][12883] Updated weights for policy 0, policy_version 72341 (0.0034) [2024-06-18 05:59:45,652][12862] Signal inference workers to stop experience collection... (17150 times) [2024-06-18 05:59:45,652][12862] Signal inference workers to resume experience collection... (17150 times) [2024-06-18 05:59:45,680][12883] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-06-18 05:59:45,680][12883] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-06-18 05:59:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1185349632. Throughput: 0: 42202.0. Samples: 1185416780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-18 05:59:46,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 05:59:47,852][12883] Updated weights for policy 0, policy_version 72351 (0.0032) [2024-06-18 05:59:51,994][12645] Fps is (10 sec: 36045.1, 60 sec: 41233.2, 300 sec: 42098.6). Total num frames: 1185513472. Throughput: 0: 42004.1. Samples: 1185666100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-18 05:59:51,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 05:59:52,593][12883] Updated weights for policy 0, policy_version 72361 (0.0034) [2024-06-18 05:59:55,714][12883] Updated weights for policy 0, policy_version 72371 (0.0028) [2024-06-18 05:59:56,995][12645] Fps is (10 sec: 44229.8, 60 sec: 42870.3, 300 sec: 42209.4). Total num frames: 1185792000. Throughput: 0: 41913.1. Samples: 1185906120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-18 05:59:56,996][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 06:00:00,533][12883] Updated weights for policy 0, policy_version 72381 (0.0029) [2024-06-18 06:00:01,996][12645] Fps is (10 sec: 45865.5, 60 sec: 41779.3, 300 sec: 42209.7). Total num frames: 1185972224. Throughput: 0: 42229.0. 
Samples: 1186052400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-18 06:00:01,996][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 06:00:03,494][12883] Updated weights for policy 0, policy_version 72391 (0.0032) [2024-06-18 06:00:06,994][12645] Fps is (10 sec: 37689.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1186168832. Throughput: 0: 42015.9. Samples: 1186296880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-18 06:00:06,994][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 06:00:08,202][12883] Updated weights for policy 0, policy_version 72401 (0.0026) [2024-06-18 06:00:11,354][12883] Updated weights for policy 0, policy_version 72411 (0.0035) [2024-06-18 06:00:11,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42598.5, 300 sec: 42209.7). Total num frames: 1186414592. Throughput: 0: 42280.1. Samples: 1186551800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-18 06:00:11,994][12645] Avg episode reward: [(0, '0.161')] [2024-06-18 06:00:15,746][12883] Updated weights for policy 0, policy_version 72421 (0.0043) [2024-06-18 06:00:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42053.9, 300 sec: 42209.6). Total num frames: 1186611200. Throughput: 0: 42243.2. Samples: 1186687480. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-18 06:00:16,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 06:00:18,854][12883] Updated weights for policy 0, policy_version 72431 (0.0035) [2024-06-18 06:00:21,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 1186824192. Throughput: 0: 42109.7. Samples: 1186934420. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-18 06:00:21,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 06:00:23,296][12883] Updated weights for policy 0, policy_version 72441 (0.0046) [2024-06-18 06:00:26,596][12883] Updated weights for policy 0, policy_version 72451 (0.0038) [2024-06-18 06:00:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1187037184. Throughput: 0: 42515.2. Samples: 1187193240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 25.0) [2024-06-18 06:00:26,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 06:00:31,190][12883] Updated weights for policy 0, policy_version 72461 (0.0047) [2024-06-18 06:00:31,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 42043.3). Total num frames: 1187217408. Throughput: 0: 42298.9. Samples: 1187320220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 06:00:31,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 06:00:34,376][12883] Updated weights for policy 0, policy_version 72471 (0.0045) [2024-06-18 06:00:37,000][12645] Fps is (10 sec: 42571.5, 60 sec: 42867.0, 300 sec: 42264.3). Total num frames: 1187463168. Throughput: 0: 42223.4. Samples: 1187566420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 06:00:37,000][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 06:00:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072477_1187463168.pth... [2024-06-18 06:00:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000071858_1177321472.pth [2024-06-18 06:00:38,911][12883] Updated weights for policy 0, policy_version 72481 (0.0033) [2024-06-18 06:00:41,909][12883] Updated weights for policy 0, policy_version 72491 (0.0045) [2024-06-18 06:00:41,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1187692544. Throughput: 0: 42612.8. Samples: 1187823620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 06:00:41,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 06:00:46,615][12883] Updated weights for policy 0, policy_version 72501 (0.0035) [2024-06-18 06:00:46,994][12645] Fps is (10 sec: 39346.3, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1187856384. Throughput: 0: 42146.8. Samples: 1187948920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 06:00:46,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 06:00:50,181][12883] Updated weights for policy 0, policy_version 72511 (0.0040) [2024-06-18 06:00:51,996][12645] Fps is (10 sec: 40950.5, 60 sec: 43142.9, 300 sec: 42264.8). Total num frames: 1188102144. Throughput: 0: 42302.8. Samples: 1188200600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 06:00:51,997][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 06:00:54,505][12883] Updated weights for policy 0, policy_version 72521 (0.0047) [2024-06-18 06:00:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41780.4, 300 sec: 42209.6). Total num frames: 1188298752. Throughput: 0: 42314.2. Samples: 1188455940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 06:00:56,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 06:00:57,888][12883] Updated weights for policy 0, policy_version 72531 (0.0027) [2024-06-18 06:01:01,981][12883] Updated weights for policy 0, policy_version 72541 (0.0032) [2024-06-18 06:01:01,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42326.8, 300 sec: 42265.2). Total num frames: 1188511744. Throughput: 0: 42119.1. Samples: 1188582840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 06:01:01,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 06:01:03,367][12862] Signal inference workers to stop experience collection... (17200 times) [2024-06-18 06:01:03,367][12862] Signal inference workers to resume experience collection... 
(17200 times) [2024-06-18 06:01:03,386][12883] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-06-18 06:01:03,386][12883] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-06-18 06:01:05,513][12883] Updated weights for policy 0, policy_version 72551 (0.0032) [2024-06-18 06:01:06,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 1188741120. Throughput: 0: 42270.2. Samples: 1188836580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 06:01:06,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 06:01:09,669][12883] Updated weights for policy 0, policy_version 72561 (0.0050) [2024-06-18 06:01:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42265.1). Total num frames: 1188937728. Throughput: 0: 42236.3. Samples: 1189093880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 06:01:11,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 06:01:13,319][12883] Updated weights for policy 0, policy_version 72571 (0.0022) [2024-06-18 06:01:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1189150720. Throughput: 0: 42208.0. Samples: 1189219580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 06:01:16,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 06:01:17,241][12883] Updated weights for policy 0, policy_version 72581 (0.0038) [2024-06-18 06:01:20,966][12883] Updated weights for policy 0, policy_version 72591 (0.0036) [2024-06-18 06:01:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1189363712. Throughput: 0: 42427.6. Samples: 1189475400. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 06:01:21,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 06:01:25,251][12883] Updated weights for policy 0, policy_version 72601 (0.0035) [2024-06-18 06:01:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1189560320. Throughput: 0: 42333.8. Samples: 1189728640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 06:01:26,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 06:01:28,770][12883] Updated weights for policy 0, policy_version 72611 (0.0040) [2024-06-18 06:01:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 1189789696. Throughput: 0: 42220.4. Samples: 1189848840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 06:01:31,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 06:01:32,767][12883] Updated weights for policy 0, policy_version 72621 (0.0032) [2024-06-18 06:01:36,365][12883] Updated weights for policy 0, policy_version 72631 (0.0039) [2024-06-18 06:01:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42329.7, 300 sec: 42209.6). Total num frames: 1190002688. Throughput: 0: 42440.3. Samples: 1190110320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 06:01:36,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 06:01:40,475][12883] Updated weights for policy 0, policy_version 72641 (0.0030) [2024-06-18 06:01:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 1190199296. Throughput: 0: 42317.2. Samples: 1190360220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 06:01:41,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 06:01:44,009][12883] Updated weights for policy 0, policy_version 72651 (0.0037) [2024-06-18 06:01:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1190395904. Throughput: 0: 42248.0. Samples: 1190484000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 06:01:46,994][12645] Avg episode reward: [(0, '0.124')] [2024-06-18 06:01:48,611][12883] Updated weights for policy 0, policy_version 72661 (0.0031) [2024-06-18 06:01:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.9, 300 sec: 42209.6). Total num frames: 1190625280. Throughput: 0: 42265.4. Samples: 1190738520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 06:01:51,994][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 06:01:52,221][12883] Updated weights for policy 0, policy_version 72671 (0.0032) [2024-06-18 06:01:56,535][12883] Updated weights for policy 0, policy_version 72681 (0.0029) [2024-06-18 06:01:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1190838272. Throughput: 0: 42241.9. Samples: 1190994760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 06:01:56,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 06:01:59,788][12883] Updated weights for policy 0, policy_version 72691 (0.0044) [2024-06-18 06:02:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1191034880. Throughput: 0: 42179.0. Samples: 1191117640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 06:02:01,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 06:02:04,263][12883] Updated weights for policy 0, policy_version 72701 (0.0038) [2024-06-18 06:02:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42154.3). Total num frames: 1191264256. Throughput: 0: 42158.2. Samples: 1191372520. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 06:02:06,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 06:02:07,654][12883] Updated weights for policy 0, policy_version 72711 (0.0031) [2024-06-18 06:02:11,868][12883] Updated weights for policy 0, policy_version 72721 (0.0029) [2024-06-18 06:02:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1191460864. Throughput: 0: 42244.2. Samples: 1191629640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 06:02:11,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 06:02:15,230][12883] Updated weights for policy 0, policy_version 72731 (0.0034) [2024-06-18 06:02:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1191690240. Throughput: 0: 42331.9. Samples: 1191753780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 06:02:16,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 06:02:19,499][12883] Updated weights for policy 0, policy_version 72741 (0.0032) [2024-06-18 06:02:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1191903232. Throughput: 0: 42336.1. Samples: 1192015440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 06:02:21,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 06:02:22,881][12883] Updated weights for policy 0, policy_version 72751 (0.0040) [2024-06-18 06:02:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1192099840. Throughput: 0: 42467.2. Samples: 1192271240. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 06:02:26,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 06:02:27,078][12883] Updated weights for policy 0, policy_version 72761 (0.0031) [2024-06-18 06:02:30,462][12883] Updated weights for policy 0, policy_version 72771 (0.0029) [2024-06-18 06:02:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42321.0). Total num frames: 1192345600. Throughput: 0: 42435.9. Samples: 1192393620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 06:02:31,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 06:02:34,799][12883] Updated weights for policy 0, policy_version 72781 (0.0022) [2024-06-18 06:02:35,855][12862] Signal inference workers to stop experience collection... (17250 times) [2024-06-18 06:02:35,855][12862] Signal inference workers to resume experience collection... (17250 times) [2024-06-18 06:02:35,897][12883] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-06-18 06:02:35,897][12883] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-06-18 06:02:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1192542208. Throughput: 0: 42633.4. Samples: 1192657020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 06:02:36,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 06:02:37,064][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072788_1192558592.pth... [2024-06-18 06:02:37,116][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072168_1182400512.pth [2024-06-18 06:02:38,338][12883] Updated weights for policy 0, policy_version 72791 (0.0035) [2024-06-18 06:02:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1192738816. Throughput: 0: 42577.8. Samples: 1192910760. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 06:02:41,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 06:02:42,319][12883] Updated weights for policy 0, policy_version 72801 (0.0030) [2024-06-18 06:02:45,870][12883] Updated weights for policy 0, policy_version 72811 (0.0023) [2024-06-18 06:02:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42320.7). Total num frames: 1192984576. Throughput: 0: 42660.5. Samples: 1193037360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 06:02:46,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 06:02:50,248][12883] Updated weights for policy 0, policy_version 72821 (0.0036) [2024-06-18 06:02:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1193181184. Throughput: 0: 42819.2. Samples: 1193299380. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 06:02:51,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 06:02:53,477][12883] Updated weights for policy 0, policy_version 72831 (0.0029) [2024-06-18 06:02:57,000][12645] Fps is (10 sec: 39296.7, 60 sec: 42320.9, 300 sec: 42319.8). Total num frames: 1193377792. Throughput: 0: 42719.9. Samples: 1193552300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 06:02:57,001][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 06:02:58,204][12883] Updated weights for policy 0, policy_version 72841 (0.0029) [2024-06-18 06:03:01,100][12883] Updated weights for policy 0, policy_version 72851 (0.0032) [2024-06-18 06:03:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42265.1). Total num frames: 1193607168. Throughput: 0: 42738.7. Samples: 1193677020. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 06:03:01,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 06:03:05,803][12883] Updated weights for policy 0, policy_version 72861 (0.0032) [2024-06-18 06:03:06,994][12645] Fps is (10 sec: 44264.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1193820160. Throughput: 0: 42618.6. Samples: 1193933280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 06:03:06,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 06:03:08,999][12883] Updated weights for policy 0, policy_version 72871 (0.0038) [2024-06-18 06:03:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1194033152. Throughput: 0: 42632.8. Samples: 1194189720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 06:03:11,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 06:03:13,332][12883] Updated weights for policy 0, policy_version 72881 (0.0035) [2024-06-18 06:03:16,506][12883] Updated weights for policy 0, policy_version 72891 (0.0031) [2024-06-18 06:03:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 1194262528. Throughput: 0: 42813.5. Samples: 1194320220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 06:03:16,994][12645] Avg episode reward: [(0, '0.239')] [2024-06-18 06:03:20,814][12883] Updated weights for policy 0, policy_version 72901 (0.0036) [2024-06-18 06:03:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1194442752. Throughput: 0: 42710.1. Samples: 1194578980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 06:03:21,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 06:03:24,285][12883] Updated weights for policy 0, policy_version 72911 (0.0030) [2024-06-18 06:03:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 1194655744. Throughput: 0: 42734.3. Samples: 1194833800. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-18 06:03:26,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 06:03:28,470][12883] Updated weights for policy 0, policy_version 72921 (0.0043) [2024-06-18 06:03:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42321.6). Total num frames: 1194885120. Throughput: 0: 42771.9. Samples: 1194962100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 06:03:31,997][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 06:03:32,363][12883] Updated weights for policy 0, policy_version 72931 (0.0023) [2024-06-18 06:03:36,109][12883] Updated weights for policy 0, policy_version 72941 (0.0039) [2024-06-18 06:03:36,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1195098112. Throughput: 0: 42615.8. Samples: 1195217100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 06:03:36,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 06:03:39,839][12883] Updated weights for policy 0, policy_version 72951 (0.0037) [2024-06-18 06:03:41,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42869.8, 300 sec: 42320.4). Total num frames: 1195311104. Throughput: 0: 42559.8. Samples: 1195467320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 06:03:41,997][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 06:03:43,872][12883] Updated weights for policy 0, policy_version 72961 (0.0039) [2024-06-18 06:03:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1195540480. Throughput: 0: 42594.8. Samples: 1195593780. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 06:03:46,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 06:03:47,347][12883] Updated weights for policy 0, policy_version 72971 (0.0032) [2024-06-18 06:03:51,636][12883] Updated weights for policy 0, policy_version 72981 (0.0031) [2024-06-18 06:03:51,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1195737088. Throughput: 0: 42870.7. Samples: 1195862460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 06:03:51,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 06:03:54,721][12883] Updated weights for policy 0, policy_version 72991 (0.0039) [2024-06-18 06:03:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42875.8, 300 sec: 42321.0). Total num frames: 1195950080. Throughput: 0: 42787.5. Samples: 1196115160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 06:03:56,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 06:03:59,215][12883] Updated weights for policy 0, policy_version 73001 (0.0034) [2024-06-18 06:04:01,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1196195840. Throughput: 0: 42669.6. Samples: 1196240360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 06:04:01,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 06:04:02,486][12883] Updated weights for policy 0, policy_version 73011 (0.0044) [2024-06-18 06:04:04,486][12862] Signal inference workers to stop experience collection... (17300 times) [2024-06-18 06:04:04,524][12883] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-06-18 06:04:04,543][12862] Signal inference workers to resume experience collection... 
(17300 times) [2024-06-18 06:04:04,544][12883] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-06-18 06:04:06,843][12883] Updated weights for policy 0, policy_version 73021 (0.0051) [2024-06-18 06:04:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1196376064. Throughput: 0: 42721.3. Samples: 1196501440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 06:04:06,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 06:04:10,281][12883] Updated weights for policy 0, policy_version 73031 (0.0036) [2024-06-18 06:04:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42321.0). Total num frames: 1196572672. Throughput: 0: 42587.8. Samples: 1196750260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 06:04:11,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 06:04:14,630][12883] Updated weights for policy 0, policy_version 73041 (0.0037) [2024-06-18 06:04:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1196818432. Throughput: 0: 42581.4. Samples: 1196878260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 06:04:16,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 06:04:17,693][12883] Updated weights for policy 0, policy_version 73051 (0.0030) [2024-06-18 06:04:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1196982272. Throughput: 0: 42539.5. Samples: 1197131380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 06:04:21,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 06:04:22,486][12883] Updated weights for policy 0, policy_version 73061 (0.0037) [2024-06-18 06:04:25,757][12883] Updated weights for policy 0, policy_version 73071 (0.0047) [2024-06-18 06:04:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42376.2). Total num frames: 1197228032. Throughput: 0: 42578.9. Samples: 1197383280. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 06:04:26,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 06:04:30,256][12883] Updated weights for policy 0, policy_version 73081 (0.0038) [2024-06-18 06:04:31,996][12645] Fps is (10 sec: 49141.8, 60 sec: 43143.0, 300 sec: 42653.6). Total num frames: 1197473792. Throughput: 0: 42785.0. Samples: 1197519200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 06:04:31,996][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 06:04:33,566][12883] Updated weights for policy 0, policy_version 73091 (0.0028) [2024-06-18 06:04:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1197637632. Throughput: 0: 42413.2. Samples: 1197771060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 06:04:36,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 06:04:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073098_1197637632.pth... [2024-06-18 06:04:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072477_1187463168.pth [2024-06-18 06:04:37,979][12883] Updated weights for policy 0, policy_version 73101 (0.0030) [2024-06-18 06:04:41,048][12883] Updated weights for policy 0, policy_version 73111 (0.0036) [2024-06-18 06:04:41,994][12645] Fps is (10 sec: 39330.5, 60 sec: 42600.1, 300 sec: 42431.8). Total num frames: 1197867008. Throughput: 0: 42334.4. Samples: 1198020200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 06:04:41,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 06:04:45,498][12883] Updated weights for policy 0, policy_version 73121 (0.0040) [2024-06-18 06:04:46,994][12645] Fps is (10 sec: 45876.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1198096384. Throughput: 0: 42550.4. Samples: 1198155120. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 06:04:46,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 06:04:48,593][12883] Updated weights for policy 0, policy_version 73131 (0.0038) [2024-06-18 06:04:51,996][12645] Fps is (10 sec: 39312.5, 60 sec: 42050.7, 300 sec: 42265.1). Total num frames: 1198260224. Throughput: 0: 42403.7. Samples: 1198409700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 06:04:51,996][12645] Avg episode reward: [(0, '0.161')] [2024-06-18 06:04:53,119][12883] Updated weights for policy 0, policy_version 73141 (0.0033) [2024-06-18 06:04:56,329][12883] Updated weights for policy 0, policy_version 73151 (0.0030) [2024-06-18 06:04:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42487.6). Total num frames: 1198505984. Throughput: 0: 42241.9. Samples: 1198651140. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-18 06:04:56,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 06:05:00,928][12883] Updated weights for policy 0, policy_version 73161 (0.0022) [2024-06-18 06:05:01,994][12645] Fps is (10 sec: 44247.2, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 1198702592. Throughput: 0: 42416.1. Samples: 1198786980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 06:05:01,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 06:05:03,982][12883] Updated weights for policy 0, policy_version 73171 (0.0045) [2024-06-18 06:05:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1198899200. Throughput: 0: 42307.6. Samples: 1199035220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 06:05:06,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 06:05:08,533][12883] Updated weights for policy 0, policy_version 73181 (0.0036) [2024-06-18 06:05:11,567][12883] Updated weights for policy 0, policy_version 73191 (0.0044) [2024-06-18 06:05:11,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42542.8). 
Total num frames: 1199161344. Throughput: 0: 42099.6. Samples: 1199277760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 06:05:11,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 06:05:16,789][12883] Updated weights for policy 0, policy_version 73201 (0.0043) [2024-06-18 06:05:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 1199325184. Throughput: 0: 42203.8. Samples: 1199418280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 06:05:16,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 06:05:18,105][12862] Signal inference workers to stop experience collection... (17350 times) [2024-06-18 06:05:18,105][12862] Signal inference workers to resume experience collection... (17350 times) [2024-06-18 06:05:18,157][12883] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-06-18 06:05:18,157][12883] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-06-18 06:05:19,876][12883] Updated weights for policy 0, policy_version 73211 (0.0037) [2024-06-18 06:05:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1199538176. Throughput: 0: 42121.8. Samples: 1199666540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 06:05:21,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 06:05:24,391][12883] Updated weights for policy 0, policy_version 73221 (0.0029) [2024-06-18 06:05:26,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 1199783936. Throughput: 0: 42116.0. Samples: 1199915520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 06:05:27,005][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 06:05:27,385][12883] Updated weights for policy 0, policy_version 73231 (0.0032) [2024-06-18 06:05:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41507.6, 300 sec: 42377.1). Total num frames: 1199964160. Throughput: 0: 42115.8. 
Samples: 1200050340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 06:05:32,000][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 06:05:32,227][12883] Updated weights for policy 0, policy_version 73241 (0.0032) [2024-06-18 06:05:35,590][12883] Updated weights for policy 0, policy_version 73251 (0.0038) [2024-06-18 06:05:36,994][12645] Fps is (10 sec: 37691.3, 60 sec: 42052.3, 300 sec: 42265.1). Total num frames: 1200160768. Throughput: 0: 41950.0. Samples: 1200297360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 06:05:36,995][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 06:05:39,888][12883] Updated weights for policy 0, policy_version 73261 (0.0033) [2024-06-18 06:05:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1200406528. Throughput: 0: 42163.9. Samples: 1200548520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 06:05:41,995][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 06:05:43,650][12883] Updated weights for policy 0, policy_version 73271 (0.0033) [2024-06-18 06:05:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 42376.6). Total num frames: 1200603136. Throughput: 0: 42071.8. Samples: 1200680220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 06:05:46,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 06:05:47,717][12883] Updated weights for policy 0, policy_version 73281 (0.0034) [2024-06-18 06:05:51,131][12883] Updated weights for policy 0, policy_version 73291 (0.0035) [2024-06-18 06:05:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42600.0, 300 sec: 42431.8). Total num frames: 1200816128. Throughput: 0: 42043.6. Samples: 1200927180. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 06:05:51,994][12645] Avg episode reward: [(0, '0.101')] [2024-06-18 06:05:55,345][12883] Updated weights for policy 0, policy_version 73301 (0.0026) [2024-06-18 06:05:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1201045504. Throughput: 0: 42423.6. Samples: 1201186820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 06:05:56,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 06:05:58,649][12883] Updated weights for policy 0, policy_version 73311 (0.0046) [2024-06-18 06:06:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1201225728. Throughput: 0: 42251.2. Samples: 1201319580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 06:06:01,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 06:06:02,879][12883] Updated weights for policy 0, policy_version 73321 (0.0030) [2024-06-18 06:06:06,410][12883] Updated weights for policy 0, policy_version 73331 (0.0036) [2024-06-18 06:06:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1201455104. Throughput: 0: 42250.3. Samples: 1201567800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 06:06:06,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 06:06:10,660][12883] Updated weights for policy 0, policy_version 73341 (0.0036) [2024-06-18 06:06:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 1201668096. Throughput: 0: 42411.1. Samples: 1201823920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 06:06:11,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 06:06:14,595][12883] Updated weights for policy 0, policy_version 73351 (0.0030) [2024-06-18 06:06:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1201864704. Throughput: 0: 42152.9. Samples: 1201947220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 06:06:16,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 06:06:18,383][12883] Updated weights for policy 0, policy_version 73361 (0.0031) [2024-06-18 06:06:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1202094080. Throughput: 0: 42383.3. Samples: 1202204600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 06:06:21,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 06:06:22,110][12883] Updated weights for policy 0, policy_version 73371 (0.0023) [2024-06-18 06:06:26,139][12883] Updated weights for policy 0, policy_version 73381 (0.0027) [2024-06-18 06:06:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42053.8, 300 sec: 42431.8). Total num frames: 1202307072. Throughput: 0: 42490.2. Samples: 1202460580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 06:06:26,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 06:06:29,669][12883] Updated weights for policy 0, policy_version 73391 (0.0030) [2024-06-18 06:06:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1202503680. Throughput: 0: 42315.6. Samples: 1202584420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:06:31,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 06:06:33,823][12862] Signal inference workers to stop experience collection... (17400 times) [2024-06-18 06:06:33,828][12862] Signal inference workers to resume experience collection... (17400 times) [2024-06-18 06:06:33,857][12883] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-06-18 06:06:33,857][12883] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-06-18 06:06:33,978][12883] Updated weights for policy 0, policy_version 73401 (0.0029) [2024-06-18 06:06:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1202733056. Throughput: 0: 42395.5. 
Samples: 1202834980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:06:37,004][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 06:06:37,149][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073410_1202749440.pth... [2024-06-18 06:06:37,214][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000072788_1192558592.pth [2024-06-18 06:06:37,773][12883] Updated weights for policy 0, policy_version 73411 (0.0041) [2024-06-18 06:06:41,715][12883] Updated weights for policy 0, policy_version 73421 (0.0032) [2024-06-18 06:06:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1202929664. Throughput: 0: 42292.9. Samples: 1203090000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:06:41,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 06:06:45,629][12883] Updated weights for policy 0, policy_version 73431 (0.0043) [2024-06-18 06:06:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1203142656. Throughput: 0: 42257.3. Samples: 1203221160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:06:46,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 06:06:49,505][12883] Updated weights for policy 0, policy_version 73441 (0.0033) [2024-06-18 06:06:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1203355648. Throughput: 0: 42123.6. Samples: 1203463360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:06:51,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 06:06:53,615][12883] Updated weights for policy 0, policy_version 73451 (0.0028) [2024-06-18 06:06:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1203568640. Throughput: 0: 42142.2. Samples: 1203720320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:06:56,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 06:06:57,176][12883] Updated weights for policy 0, policy_version 73461 (0.0034) [2024-06-18 06:07:01,334][12883] Updated weights for policy 0, policy_version 73471 (0.0047) [2024-06-18 06:07:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1203781632. Throughput: 0: 42297.4. Samples: 1203850600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:07:01,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 06:07:05,109][12883] Updated weights for policy 0, policy_version 73481 (0.0038) [2024-06-18 06:07:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1203978240. Throughput: 0: 42047.2. Samples: 1204096720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:07:06,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 06:07:09,183][12883] Updated weights for policy 0, policy_version 73491 (0.0027) [2024-06-18 06:07:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1204191232. Throughput: 0: 42024.9. Samples: 1204351700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:07:11,994][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 06:07:12,802][12883] Updated weights for policy 0, policy_version 73501 (0.0035) [2024-06-18 06:07:16,955][12883] Updated weights for policy 0, policy_version 73511 (0.0037) [2024-06-18 06:07:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1204404224. Throughput: 0: 42118.1. Samples: 1204479740. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 06:07:16,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 06:07:20,717][12883] Updated weights for policy 0, policy_version 73521 (0.0042) [2024-06-18 06:07:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1204617216. Throughput: 0: 42153.8. Samples: 1204731900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 06:07:21,995][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 06:07:25,128][12883] Updated weights for policy 0, policy_version 73531 (0.0039) [2024-06-18 06:07:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1204813824. Throughput: 0: 42180.0. Samples: 1204988100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 06:07:26,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 06:07:28,387][12883] Updated weights for policy 0, policy_version 73541 (0.0027) [2024-06-18 06:07:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1205026816. Throughput: 0: 41996.9. Samples: 1205111020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 06:07:31,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 06:07:32,687][12883] Updated weights for policy 0, policy_version 73551 (0.0026) [2024-06-18 06:07:35,998][12883] Updated weights for policy 0, policy_version 73561 (0.0032) [2024-06-18 06:07:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1205256192. Throughput: 0: 42355.5. Samples: 1205369360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 06:07:36,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 06:07:40,232][12883] Updated weights for policy 0, policy_version 73571 (0.0028) [2024-06-18 06:07:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1205452800. Throughput: 0: 42393.8. Samples: 1205628040. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 06:07:41,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 06:07:43,619][12883] Updated weights for policy 0, policy_version 73581 (0.0025) [2024-06-18 06:07:46,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1205665792. Throughput: 0: 42253.4. Samples: 1205752000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 06:07:46,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 06:07:48,009][12883] Updated weights for policy 0, policy_version 73591 (0.0032) [2024-06-18 06:07:51,306][12883] Updated weights for policy 0, policy_version 73601 (0.0022) [2024-06-18 06:07:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42432.7). Total num frames: 1205895168. Throughput: 0: 42397.3. Samples: 1206004600. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 06:07:51,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 06:07:56,351][12883] Updated weights for policy 0, policy_version 73611 (0.0031) [2024-06-18 06:07:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 42265.2). Total num frames: 1206075392. Throughput: 0: 42427.5. Samples: 1206260940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 06:07:56,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 06:07:57,444][12862] Signal inference workers to stop experience collection... (17450 times) [2024-06-18 06:07:57,445][12862] Signal inference workers to resume experience collection... (17450 times) [2024-06-18 06:07:57,485][12883] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-06-18 06:07:57,485][12883] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-06-18 06:07:59,153][12883] Updated weights for policy 0, policy_version 73621 (0.0035) [2024-06-18 06:08:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1206304768. Throughput: 0: 42065.5. 
Samples: 1206372680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 06:08:01,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 06:08:04,088][12883] Updated weights for policy 0, policy_version 73631 (0.0040) [2024-06-18 06:08:06,925][12883] Updated weights for policy 0, policy_version 73641 (0.0040) [2024-06-18 06:08:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1206534144. Throughput: 0: 42207.1. Samples: 1206631220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 06:08:06,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 06:08:11,689][12883] Updated weights for policy 0, policy_version 73651 (0.0037) [2024-06-18 06:08:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1206697984. Throughput: 0: 42244.4. Samples: 1206889100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 06:08:11,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 06:08:14,818][12883] Updated weights for policy 0, policy_version 73661 (0.0031) [2024-06-18 06:08:16,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42050.7, 300 sec: 42320.4). Total num frames: 1206927360. Throughput: 0: 42117.3. Samples: 1207006400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 06:08:16,997][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 06:08:19,415][12883] Updated weights for policy 0, policy_version 73671 (0.0028) [2024-06-18 06:08:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1207156736. Throughput: 0: 42067.6. Samples: 1207262400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 06:08:21,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 06:08:22,637][12883] Updated weights for policy 0, policy_version 73681 (0.0045) [2024-06-18 06:08:26,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1207336960. Throughput: 0: 42039.0. 
Samples: 1207519800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 06:08:26,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 06:08:27,232][12883] Updated weights for policy 0, policy_version 73691 (0.0033) [2024-06-18 06:08:30,589][12883] Updated weights for policy 0, policy_version 73701 (0.0032) [2024-06-18 06:08:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1207582720. Throughput: 0: 42045.7. Samples: 1207644060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 06:08:31,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 06:08:34,798][12883] Updated weights for policy 0, policy_version 73711 (0.0043) [2024-06-18 06:08:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42265.5). Total num frames: 1207779328. Throughput: 0: 42175.9. Samples: 1207902520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 06:08:36,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 06:08:37,047][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073718_1207795712.pth... [2024-06-18 06:08:37,113][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073098_1197637632.pth [2024-06-18 06:08:38,345][12883] Updated weights for policy 0, policy_version 73721 (0.0031) [2024-06-18 06:08:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1207975936. Throughput: 0: 42111.1. Samples: 1208155940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 06:08:41,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 06:08:42,278][12883] Updated weights for policy 0, policy_version 73731 (0.0031) [2024-06-18 06:08:46,009][12883] Updated weights for policy 0, policy_version 73741 (0.0031) [2024-06-18 06:08:46,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 1208238080. Throughput: 0: 42485.4. Samples: 1208284520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 06:08:46,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 06:08:50,099][12883] Updated weights for policy 0, policy_version 73751 (0.0041) [2024-06-18 06:08:51,999][12645] Fps is (10 sec: 42575.6, 60 sec: 41775.4, 300 sec: 42208.9). Total num frames: 1208401920. Throughput: 0: 42226.6. Samples: 1208531640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 06:08:52,000][12645] Avg episode reward: [(0, '0.024')] [2024-06-18 06:08:53,973][12883] Updated weights for policy 0, policy_version 73761 (0.0027) [2024-06-18 06:08:56,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 1208614912. Throughput: 0: 42134.3. Samples: 1208785140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 06:08:56,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 06:08:57,789][12883] Updated weights for policy 0, policy_version 73771 (0.0042) [2024-06-18 06:09:01,632][12883] Updated weights for policy 0, policy_version 73781 (0.0028) [2024-06-18 06:09:01,994][12645] Fps is (10 sec: 44261.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1208844288. Throughput: 0: 42449.4. Samples: 1208916520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 06:09:01,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 06:09:05,456][12883] Updated weights for policy 0, policy_version 73791 (0.0023) [2024-06-18 06:09:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1209040896. Throughput: 0: 42351.5. Samples: 1209168220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 06:09:06,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 06:09:09,355][12883] Updated weights for policy 0, policy_version 73801 (0.0035) [2024-06-18 06:09:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 1209270272. Throughput: 0: 42112.5. Samples: 1209414860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 06:09:11,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 06:09:13,294][12883] Updated weights for policy 0, policy_version 73811 (0.0043) [2024-06-18 06:09:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1209466880. Throughput: 0: 42426.2. Samples: 1209553240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 06:09:16,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 06:09:17,047][12883] Updated weights for policy 0, policy_version 73821 (0.0038) [2024-06-18 06:09:21,099][12883] Updated weights for policy 0, policy_version 73831 (0.0026) [2024-06-18 06:09:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1209679872. Throughput: 0: 42124.4. Samples: 1209798120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 06:09:21,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 06:09:24,877][12883] Updated weights for policy 0, policy_version 73841 (0.0038) [2024-06-18 06:09:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42209.9). Total num frames: 1209925632. Throughput: 0: 41983.6. Samples: 1210045200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 06:09:26,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 06:09:28,728][12883] Updated weights for policy 0, policy_version 73851 (0.0028) [2024-06-18 06:09:31,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42050.7, 300 sec: 42264.9). Total num frames: 1210105856. Throughput: 0: 42205.8. Samples: 1210183880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:09:31,997][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 06:09:32,352][12862] Signal inference workers to stop experience collection... (17500 times) [2024-06-18 06:09:32,352][12862] Signal inference workers to resume experience collection... 
(17500 times) [2024-06-18 06:09:32,391][12883] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-06-18 06:09:32,391][12883] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-06-18 06:09:32,495][12883] Updated weights for policy 0, policy_version 73861 (0.0047) [2024-06-18 06:09:36,303][12883] Updated weights for policy 0, policy_version 73871 (0.0041) [2024-06-18 06:09:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1210302464. Throughput: 0: 42215.2. Samples: 1210431100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:09:36,998][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 06:09:40,270][12883] Updated weights for policy 0, policy_version 73881 (0.0040) [2024-06-18 06:09:41,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 1210548224. Throughput: 0: 42148.9. Samples: 1210681840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:09:41,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 06:09:43,955][12883] Updated weights for policy 0, policy_version 73891 (0.0037) [2024-06-18 06:09:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41779.1, 300 sec: 42321.0). Total num frames: 1210744832. Throughput: 0: 42245.8. Samples: 1210817580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:09:46,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 06:09:48,020][12883] Updated weights for policy 0, policy_version 73901 (0.0038) [2024-06-18 06:09:51,614][12883] Updated weights for policy 0, policy_version 73911 (0.0033) [2024-06-18 06:09:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42602.3, 300 sec: 42209.6). Total num frames: 1210957824. Throughput: 0: 42120.5. Samples: 1211063640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:09:51,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 06:09:55,659][12883] Updated weights for policy 0, policy_version 73921 (0.0040) [2024-06-18 06:09:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1211187200. Throughput: 0: 42362.3. Samples: 1211321160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:09:56,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 06:09:59,233][12883] Updated weights for policy 0, policy_version 73931 (0.0038) [2024-06-18 06:10:01,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42052.1, 300 sec: 42265.1). Total num frames: 1211367424. Throughput: 0: 42070.6. Samples: 1211446420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:10:01,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 06:10:03,352][12883] Updated weights for policy 0, policy_version 73941 (0.0046) [2024-06-18 06:10:06,996][12645] Fps is (10 sec: 40950.4, 60 sec: 42596.8, 300 sec: 42153.8). Total num frames: 1211596800. Throughput: 0: 42341.0. Samples: 1211703560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:10:06,997][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 06:10:07,231][12883] Updated weights for policy 0, policy_version 73951 (0.0042) [2024-06-18 06:10:11,178][12883] Updated weights for policy 0, policy_version 73961 (0.0041) [2024-06-18 06:10:11,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1211809792. Throughput: 0: 42503.1. Samples: 1211957840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:10:11,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 06:10:14,932][12883] Updated weights for policy 0, policy_version 73971 (0.0038) [2024-06-18 06:10:16,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1211990016. Throughput: 0: 42106.9. Samples: 1212078600. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 06:10:16,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 06:10:19,038][12883] Updated weights for policy 0, policy_version 73981 (0.0038) [2024-06-18 06:10:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42209.9). Total num frames: 1212235776. Throughput: 0: 42348.0. Samples: 1212336760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 06:10:21,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 06:10:22,640][12883] Updated weights for policy 0, policy_version 73991 (0.0028) [2024-06-18 06:10:26,754][12883] Updated weights for policy 0, policy_version 74001 (0.0029) [2024-06-18 06:10:26,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 1212432384. Throughput: 0: 42425.0. Samples: 1212590960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 06:10:26,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 06:10:30,688][12883] Updated weights for policy 0, policy_version 74011 (0.0033) [2024-06-18 06:10:31,996][12645] Fps is (10 sec: 39313.3, 60 sec: 42052.3, 300 sec: 42264.9). Total num frames: 1212628992. Throughput: 0: 42199.2. Samples: 1212716640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 06:10:31,996][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 06:10:34,338][12883] Updated weights for policy 0, policy_version 74021 (0.0028) [2024-06-18 06:10:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 1212858368. Throughput: 0: 42364.3. Samples: 1212970040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 06:10:36,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 06:10:37,112][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074028_1212874752.pth... 
[2024-06-18 06:10:37,161][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073410_1202749440.pth [2024-06-18 06:10:38,390][12883] Updated weights for policy 0, policy_version 74031 (0.0035) [2024-06-18 06:10:41,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1213071360. Throughput: 0: 42169.2. Samples: 1213218780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 06:10:41,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 06:10:42,020][12883] Updated weights for policy 0, policy_version 74041 (0.0042) [2024-06-18 06:10:46,698][12883] Updated weights for policy 0, policy_version 74051 (0.0039) [2024-06-18 06:10:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1213267968. Throughput: 0: 42154.4. Samples: 1213343360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 06:10:46,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 06:10:49,884][12883] Updated weights for policy 0, policy_version 74061 (0.0026) [2024-06-18 06:10:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1213480960. Throughput: 0: 42131.1. Samples: 1213599360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 06:10:51,994][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 06:10:54,329][12883] Updated weights for policy 0, policy_version 74071 (0.0030) [2024-06-18 06:10:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1213710336. Throughput: 0: 42108.1. Samples: 1213852700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 25.0) [2024-06-18 06:10:56,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 06:10:57,480][12883] Updated weights for policy 0, policy_version 74081 (0.0037) [2024-06-18 06:11:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.5, 300 sec: 42154.1). Total num frames: 1213890560. Throughput: 0: 42241.5. Samples: 1213979460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:11:01,994][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 06:11:02,013][12883] Updated weights for policy 0, policy_version 74091 (0.0030) [2024-06-18 06:11:05,061][12883] Updated weights for policy 0, policy_version 74101 (0.0042) [2024-06-18 06:11:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42053.8, 300 sec: 42209.6). Total num frames: 1214119936. Throughput: 0: 42106.7. Samples: 1214231560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:11:06,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 06:11:09,853][12883] Updated weights for policy 0, policy_version 74111 (0.0027) [2024-06-18 06:11:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 1214332928. Throughput: 0: 42117.3. Samples: 1214486240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:11:11,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 06:11:13,049][12883] Updated weights for policy 0, policy_version 74121 (0.0038) [2024-06-18 06:11:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 1214513152. Throughput: 0: 42179.4. Samples: 1214614620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:11:16,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 06:11:17,639][12883] Updated weights for policy 0, policy_version 74131 (0.0039) [2024-06-18 06:11:20,836][12883] Updated weights for policy 0, policy_version 74141 (0.0032) [2024-06-18 06:11:21,996][12645] Fps is (10 sec: 40950.1, 60 sec: 41777.7, 300 sec: 42153.8). Total num frames: 1214742528. Throughput: 0: 42025.5. Samples: 1214861280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:11:21,997][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 06:11:25,395][12883] Updated weights for policy 0, policy_version 74151 (0.0027) [2024-06-18 06:11:26,392][12862] Signal inference workers to stop experience collection... 
(17550 times) [2024-06-18 06:11:26,392][12862] Signal inference workers to resume experience collection... (17550 times) [2024-06-18 06:11:26,428][12883] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-06-18 06:11:26,428][12883] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-06-18 06:11:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1214955520. Throughput: 0: 42327.6. Samples: 1215123520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:11:26,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 06:11:28,339][12883] Updated weights for policy 0, policy_version 74161 (0.0027) [2024-06-18 06:11:31,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42326.9, 300 sec: 42154.1). Total num frames: 1215168512. Throughput: 0: 42451.6. Samples: 1215253680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:11:31,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 06:11:33,279][12883] Updated weights for policy 0, policy_version 74171 (0.0027) [2024-06-18 06:11:35,924][12883] Updated weights for policy 0, policy_version 74181 (0.0023) [2024-06-18 06:11:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1215381504. Throughput: 0: 42219.4. Samples: 1215499240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:11:36,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 06:11:40,980][12883] Updated weights for policy 0, policy_version 74191 (0.0043) [2024-06-18 06:11:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1215594496. Throughput: 0: 42468.3. Samples: 1215763780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:11:41,994][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 06:11:43,566][12883] Updated weights for policy 0, policy_version 74201 (0.0034) [2024-06-18 06:11:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1215791104. Throughput: 0: 42416.3. Samples: 1215888200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 06:11:46,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 06:11:48,518][12883] Updated weights for policy 0, policy_version 74211 (0.0042) [2024-06-18 06:11:51,683][12883] Updated weights for policy 0, policy_version 74221 (0.0027) [2024-06-18 06:11:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 1216036864. Throughput: 0: 42376.0. Samples: 1216138480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 06:11:51,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 06:11:56,056][12883] Updated weights for policy 0, policy_version 74231 (0.0027) [2024-06-18 06:11:56,996][12645] Fps is (10 sec: 44227.4, 60 sec: 42050.7, 300 sec: 42209.3). Total num frames: 1216233472. Throughput: 0: 42510.3. Samples: 1216399300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 06:11:56,996][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 06:11:59,225][12883] Updated weights for policy 0, policy_version 74241 (0.0041) [2024-06-18 06:12:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 1216446464. Throughput: 0: 42402.3. Samples: 1216522720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 06:12:01,994][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 06:12:03,925][12883] Updated weights for policy 0, policy_version 74251 (0.0024) [2024-06-18 06:12:06,994][12645] Fps is (10 sec: 44246.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1216675840. Throughput: 0: 42585.2. Samples: 1216777520. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 06:12:06,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 06:12:07,368][12883] Updated weights for policy 0, policy_version 74261 (0.0037) [2024-06-18 06:12:11,494][12883] Updated weights for policy 0, policy_version 74271 (0.0027) [2024-06-18 06:12:11,993][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1216872448. Throughput: 0: 42408.2. Samples: 1217031880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 06:12:11,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 06:12:15,471][12883] Updated weights for policy 0, policy_version 74281 (0.0044) [2024-06-18 06:12:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1217085440. Throughput: 0: 42317.7. Samples: 1217157980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 06:12:16,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 06:12:19,373][12883] Updated weights for policy 0, policy_version 74291 (0.0037) [2024-06-18 06:12:21,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42873.0, 300 sec: 42376.2). Total num frames: 1217314816. Throughput: 0: 42501.3. Samples: 1217411800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 06:12:21,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 06:12:23,133][12883] Updated weights for policy 0, policy_version 74301 (0.0029) [2024-06-18 06:12:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1217495040. Throughput: 0: 42383.6. Samples: 1217671040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 06:12:26,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 06:12:27,092][12883] Updated weights for policy 0, policy_version 74311 (0.0025) [2024-06-18 06:12:30,608][12883] Updated weights for policy 0, policy_version 74321 (0.0028) [2024-06-18 06:12:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1217708032. Throughput: 0: 42392.9. Samples: 1217795880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 06:12:31,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 06:12:33,927][12862] Signal inference workers to stop experience collection... (17600 times) [2024-06-18 06:12:33,927][12862] Signal inference workers to resume experience collection... (17600 times) [2024-06-18 06:12:33,942][12883] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-06-18 06:12:33,956][12883] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-06-18 06:12:34,583][12883] Updated weights for policy 0, policy_version 74331 (0.0040) [2024-06-18 06:12:36,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1217953792. Throughput: 0: 42453.7. Samples: 1218048900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 06:12:36,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 06:12:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074338_1217953792.pth... [2024-06-18 06:12:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000073718_1207795712.pth [2024-06-18 06:12:38,495][12883] Updated weights for policy 0, policy_version 74341 (0.0037) [2024-06-18 06:12:41,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.8, 300 sec: 42320.4). Total num frames: 1218150400. Throughput: 0: 42438.2. Samples: 1218309020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 06:12:41,996][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 06:12:42,285][12883] Updated weights for policy 0, policy_version 74351 (0.0046) [2024-06-18 06:12:46,300][12883] Updated weights for policy 0, policy_version 74361 (0.0038) [2024-06-18 06:12:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1218363392. Throughput: 0: 42361.3. Samples: 1218428980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 06:12:46,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 06:12:50,152][12883] Updated weights for policy 0, policy_version 74371 (0.0040) [2024-06-18 06:12:51,996][12645] Fps is (10 sec: 44236.7, 60 sec: 42596.8, 300 sec: 42431.5). Total num frames: 1218592768. Throughput: 0: 42322.8. Samples: 1218682140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 06:12:51,997][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 06:12:53,861][12883] Updated weights for policy 0, policy_version 74381 (0.0030) [2024-06-18 06:12:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42326.8, 300 sec: 42265.1). Total num frames: 1218772992. Throughput: 0: 42556.2. Samples: 1218946920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 06:12:56,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 06:12:57,747][12883] Updated weights for policy 0, policy_version 74391 (0.0031) [2024-06-18 06:13:01,499][12883] Updated weights for policy 0, policy_version 74401 (0.0037) [2024-06-18 06:13:01,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1218985984. Throughput: 0: 42340.4. Samples: 1219063300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 06:13:01,994][12645] Avg episode reward: [(0, '0.179')] [2024-06-18 06:13:05,521][12883] Updated weights for policy 0, policy_version 74411 (0.0053) [2024-06-18 06:13:06,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 1219215360. Throughput: 0: 42413.5. Samples: 1219320500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 06:13:06,997][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 06:13:09,115][12883] Updated weights for policy 0, policy_version 74421 (0.0036) [2024-06-18 06:13:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.1, 300 sec: 42265.5). Total num frames: 1219395584. Throughput: 0: 42444.3. Samples: 1219581040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 06:13:11,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 06:13:13,263][12883] Updated weights for policy 0, policy_version 74431 (0.0050) [2024-06-18 06:13:16,754][12883] Updated weights for policy 0, policy_version 74441 (0.0042) [2024-06-18 06:13:16,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1219641344. Throughput: 0: 42324.5. Samples: 1219700480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 06:13:16,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 06:13:20,667][12883] Updated weights for policy 0, policy_version 74451 (0.0022) [2024-06-18 06:13:22,000][12645] Fps is (10 sec: 47483.8, 60 sec: 42594.0, 300 sec: 42486.4). Total num frames: 1219870720. Throughput: 0: 42480.3. Samples: 1219960780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 06:13:22,001][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 06:13:24,435][12883] Updated weights for policy 0, policy_version 74461 (0.0032) [2024-06-18 06:13:26,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1220034560. Throughput: 0: 42503.8. Samples: 1220221600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 06:13:27,003][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 06:13:28,393][12883] Updated weights for policy 0, policy_version 74471 (0.0028) [2024-06-18 06:13:31,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1220280320. Throughput: 0: 42480.9. Samples: 1220340620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 06:13:31,996][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 06:13:32,171][12883] Updated weights for policy 0, policy_version 74481 (0.0033) [2024-06-18 06:13:35,959][12883] Updated weights for policy 0, policy_version 74491 (0.0034) [2024-06-18 06:13:36,994][12645] Fps is (10 sec: 45876.3, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1220493312. Throughput: 0: 42652.0. Samples: 1220601380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 06:13:36,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 06:13:39,913][12883] Updated weights for policy 0, policy_version 74501 (0.0021) [2024-06-18 06:13:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 1220689920. Throughput: 0: 42562.3. Samples: 1220862220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 06:13:41,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 06:13:43,474][12862] Signal inference workers to stop experience collection... (17650 times) [2024-06-18 06:13:43,510][12883] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-06-18 06:13:43,536][12862] Signal inference workers to resume experience collection... (17650 times) [2024-06-18 06:13:43,537][12883] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-06-18 06:13:43,695][12883] Updated weights for policy 0, policy_version 74511 (0.0038) [2024-06-18 06:13:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42377.0). Total num frames: 1220902912. Throughput: 0: 42736.6. 
Samples: 1220986440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 06:13:46,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 06:13:47,502][12883] Updated weights for policy 0, policy_version 74521 (0.0033) [2024-06-18 06:13:51,315][12883] Updated weights for policy 0, policy_version 74531 (0.0045) [2024-06-18 06:13:51,996][12645] Fps is (10 sec: 45865.3, 60 sec: 42598.4, 300 sec: 42487.0). Total num frames: 1221148672. Throughput: 0: 42865.4. Samples: 1221249440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 06:13:51,997][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 06:13:55,596][12883] Updated weights for policy 0, policy_version 74541 (0.0036) [2024-06-18 06:13:56,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.9, 300 sec: 42320.4). Total num frames: 1221328896. Throughput: 0: 42672.1. Samples: 1221501380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 06:13:56,996][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 06:13:58,986][12883] Updated weights for policy 0, policy_version 74551 (0.0036) [2024-06-18 06:14:01,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1221558272. Throughput: 0: 42698.3. Samples: 1221621900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:01,994][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 06:14:03,312][12883] Updated weights for policy 0, policy_version 74561 (0.0029) [2024-06-18 06:14:06,638][12883] Updated weights for policy 0, policy_version 74571 (0.0041) [2024-06-18 06:14:06,994][12645] Fps is (10 sec: 47524.1, 60 sec: 43146.1, 300 sec: 42487.3). Total num frames: 1221804032. Throughput: 0: 42776.2. Samples: 1221885440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:06,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 06:14:10,993][12883] Updated weights for policy 0, policy_version 74581 (0.0034) [2024-06-18 06:14:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1221967872. Throughput: 0: 42515.6. Samples: 1222134800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:11,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 06:14:14,334][12883] Updated weights for policy 0, policy_version 74591 (0.0022) [2024-06-18 06:14:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1222180864. Throughput: 0: 42626.5. Samples: 1222258820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:16,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 06:14:19,010][12883] Updated weights for policy 0, policy_version 74601 (0.0031) [2024-06-18 06:14:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42329.8, 300 sec: 42320.7). Total num frames: 1222410240. Throughput: 0: 42606.6. Samples: 1222518680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:21,994][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 06:14:22,187][12883] Updated weights for policy 0, policy_version 74611 (0.0039) [2024-06-18 06:14:26,479][12883] Updated weights for policy 0, policy_version 74621 (0.0030) [2024-06-18 06:14:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42376.6). Total num frames: 1222606848. Throughput: 0: 42464.0. Samples: 1222773100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:26,994][12645] Avg episode reward: [(0, '0.097')] [2024-06-18 06:14:29,893][12883] Updated weights for policy 0, policy_version 74631 (0.0022) [2024-06-18 06:14:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1222836224. Throughput: 0: 42496.4. Samples: 1222898780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:31,994][12645] Avg episode reward: [(0, '0.165')] [2024-06-18 06:14:34,180][12883] Updated weights for policy 0, policy_version 74641 (0.0036) [2024-06-18 06:14:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.1, 300 sec: 42320.7). Total num frames: 1223032832. Throughput: 0: 42343.3. Samples: 1223154800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:36,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 06:14:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074648_1223032832.pth... [2024-06-18 06:14:37,097][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074028_1212874752.pth [2024-06-18 06:14:37,665][12883] Updated weights for policy 0, policy_version 74651 (0.0036) [2024-06-18 06:14:41,802][12883] Updated weights for policy 0, policy_version 74661 (0.0026) [2024-06-18 06:14:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1223245824. Throughput: 0: 42434.6. Samples: 1223410840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:41,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 06:14:45,504][12883] Updated weights for policy 0, policy_version 74671 (0.0044) [2024-06-18 06:14:46,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1223475200. Throughput: 0: 42542.2. Samples: 1223536300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 06:14:46,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 06:14:49,198][12862] Signal inference workers to stop experience collection... (17700 times) [2024-06-18 06:14:49,199][12862] Signal inference workers to resume experience collection... 
(17700 times) [2024-06-18 06:14:49,243][12883] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-06-18 06:14:49,244][12883] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-06-18 06:14:49,348][12883] Updated weights for policy 0, policy_version 74681 (0.0036) [2024-06-18 06:14:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42053.8, 300 sec: 42320.7). Total num frames: 1223671808. Throughput: 0: 42349.7. Samples: 1223791180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 06:14:51,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 06:14:53,450][12883] Updated weights for policy 0, policy_version 74691 (0.0039) [2024-06-18 06:14:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42599.9, 300 sec: 42431.8). Total num frames: 1223884800. Throughput: 0: 42444.4. Samples: 1224044800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 06:14:56,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 06:14:57,430][12883] Updated weights for policy 0, policy_version 74701 (0.0036) [2024-06-18 06:15:01,120][12883] Updated weights for policy 0, policy_version 74711 (0.0029) [2024-06-18 06:15:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42432.1). Total num frames: 1224114176. Throughput: 0: 42442.3. Samples: 1224168720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 06:15:01,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 06:15:05,213][12883] Updated weights for policy 0, policy_version 74721 (0.0036) [2024-06-18 06:15:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1224310784. Throughput: 0: 42364.4. Samples: 1224425080. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 06:15:06,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 06:15:09,086][12883] Updated weights for policy 0, policy_version 74731 (0.0031) [2024-06-18 06:15:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 1224523776. Throughput: 0: 42227.7. Samples: 1224673340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 06:15:11,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 06:15:12,942][12883] Updated weights for policy 0, policy_version 74741 (0.0038) [2024-06-18 06:15:16,673][12883] Updated weights for policy 0, policy_version 74751 (0.0037) [2024-06-18 06:15:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1224720384. Throughput: 0: 42245.2. Samples: 1224799820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 06:15:16,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 06:15:20,557][12883] Updated weights for policy 0, policy_version 74761 (0.0038) [2024-06-18 06:15:21,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42431.7). Total num frames: 1224949760. Throughput: 0: 42116.0. Samples: 1225050020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 06:15:21,994][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 06:15:24,548][12883] Updated weights for policy 0, policy_version 74771 (0.0041) [2024-06-18 06:15:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42376.6). Total num frames: 1225129984. Throughput: 0: 42085.3. Samples: 1225304680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 06:15:26,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 06:15:28,351][12883] Updated weights for policy 0, policy_version 74781 (0.0028) [2024-06-18 06:15:31,994][12645] Fps is (10 sec: 39322.5, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 1225342976. Throughput: 0: 42041.4. Samples: 1225428160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 06:15:31,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 06:15:32,282][12883] Updated weights for policy 0, policy_version 74791 (0.0030) [2024-06-18 06:15:36,177][12883] Updated weights for policy 0, policy_version 74801 (0.0038) [2024-06-18 06:15:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1225588736. Throughput: 0: 42104.5. Samples: 1225685880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 06:15:36,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 06:15:40,094][12883] Updated weights for policy 0, policy_version 74811 (0.0027) [2024-06-18 06:15:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1225768960. Throughput: 0: 42052.5. Samples: 1225937160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 06:15:41,996][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 06:15:43,702][12883] Updated weights for policy 0, policy_version 74821 (0.0034) [2024-06-18 06:15:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1225981952. Throughput: 0: 42098.2. Samples: 1226063140. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 06:15:46,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 06:15:48,069][12883] Updated weights for policy 0, policy_version 74831 (0.0036) [2024-06-18 06:15:51,437][12883] Updated weights for policy 0, policy_version 74841 (0.0031) [2024-06-18 06:15:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1226211328. Throughput: 0: 42131.2. Samples: 1226320980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 06:15:51,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 06:15:52,134][12862] Signal inference workers to stop experience collection... 
(17750 times) [2024-06-18 06:15:52,189][12883] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-06-18 06:15:52,251][12862] Signal inference workers to resume experience collection... (17750 times) [2024-06-18 06:15:52,251][12883] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-06-18 06:15:55,882][12883] Updated weights for policy 0, policy_version 74851 (0.0040) [2024-06-18 06:15:56,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 1226424320. Throughput: 0: 42238.7. Samples: 1226574080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 06:15:56,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 06:15:59,168][12883] Updated weights for policy 0, policy_version 74861 (0.0038) [2024-06-18 06:16:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41506.1, 300 sec: 42320.7). Total num frames: 1226604544. Throughput: 0: 42141.8. Samples: 1226696200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 06:16:01,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 06:16:03,658][12883] Updated weights for policy 0, policy_version 74871 (0.0036) [2024-06-18 06:16:06,989][12883] Updated weights for policy 0, policy_version 74881 (0.0033) [2024-06-18 06:16:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1226850304. Throughput: 0: 42358.8. Samples: 1226956160. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 06:16:06,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 06:16:11,274][12883] Updated weights for policy 0, policy_version 74891 (0.0029) [2024-06-18 06:16:11,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1227063296. Throughput: 0: 42331.7. Samples: 1227209600. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 06:16:11,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 06:16:14,925][12883] Updated weights for policy 0, policy_version 74901 (0.0035) [2024-06-18 06:16:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 1227259904. Throughput: 0: 42329.1. Samples: 1227332980. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 06:16:16,994][12645] Avg episode reward: [(0, '0.319')] [2024-06-18 06:16:19,002][12883] Updated weights for policy 0, policy_version 74911 (0.0028) [2024-06-18 06:16:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 1227456512. Throughput: 0: 42285.4. Samples: 1227588720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 06:16:21,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 06:16:22,615][12883] Updated weights for policy 0, policy_version 74921 (0.0030) [2024-06-18 06:16:26,717][12883] Updated weights for policy 0, policy_version 74931 (0.0039) [2024-06-18 06:16:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1227669504. Throughput: 0: 42393.8. Samples: 1227844880. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 06:16:26,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 06:16:30,591][12883] Updated weights for policy 0, policy_version 74941 (0.0038) [2024-06-18 06:16:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 1227915264. Throughput: 0: 42333.8. Samples: 1227968160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 06:16:31,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 06:16:34,471][12883] Updated weights for policy 0, policy_version 74951 (0.0029) [2024-06-18 06:16:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1228095488. Throughput: 0: 42209.8. Samples: 1228220420. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 06:16:36,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 06:16:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074958_1228111872.pth... [2024-06-18 06:16:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074338_1217953792.pth [2024-06-18 06:16:38,176][12883] Updated weights for policy 0, policy_version 74961 (0.0023) [2024-06-18 06:16:41,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1228292096. Throughput: 0: 42047.5. Samples: 1228466220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 06:16:41,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 06:16:42,445][12883] Updated weights for policy 0, policy_version 74971 (0.0024) [2024-06-18 06:16:46,250][12883] Updated weights for policy 0, policy_version 74981 (0.0054) [2024-06-18 06:16:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1228521472. Throughput: 0: 42084.1. Samples: 1228589980. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 06:16:46,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 06:16:50,359][12883] Updated weights for policy 0, policy_version 74991 (0.0037) [2024-06-18 06:16:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42321.0). Total num frames: 1228718080. Throughput: 0: 41693.4. Samples: 1228832360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 06:16:51,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 06:16:54,131][12883] Updated weights for policy 0, policy_version 75001 (0.0034) [2024-06-18 06:16:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41232.9, 300 sec: 42209.6). Total num frames: 1228898304. Throughput: 0: 41713.2. Samples: 1229086700. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 06:16:56,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 06:16:58,273][12883] Updated weights for policy 0, policy_version 75011 (0.0038) [2024-06-18 06:17:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1229127680. Throughput: 0: 41684.1. Samples: 1229208760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 06:17:01,994][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 06:17:02,057][12883] Updated weights for policy 0, policy_version 75021 (0.0024) [2024-06-18 06:17:06,039][12883] Updated weights for policy 0, policy_version 75031 (0.0031) [2024-06-18 06:17:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1229357056. Throughput: 0: 41710.1. Samples: 1229465680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:17:06,998][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 06:17:09,864][12883] Updated weights for policy 0, policy_version 75041 (0.0031) [2024-06-18 06:17:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 42265.2). Total num frames: 1229553664. Throughput: 0: 41507.5. Samples: 1229712720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:17:11,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 06:17:13,933][12883] Updated weights for policy 0, policy_version 75051 (0.0037) [2024-06-18 06:17:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1229783040. Throughput: 0: 41543.1. Samples: 1229837600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:17:16,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 06:17:17,485][12883] Updated weights for policy 0, policy_version 75061 (0.0040) [2024-06-18 06:17:21,075][12862] Signal inference workers to stop experience collection... 
(17800 times) [2024-06-18 06:17:21,076][12862] Signal inference workers to resume experience collection... (17800 times) [2024-06-18 06:17:21,088][12883] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-06-18 06:17:21,088][12883] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-06-18 06:17:21,755][12883] Updated weights for policy 0, policy_version 75071 (0.0023) [2024-06-18 06:17:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1229979648. Throughput: 0: 41710.6. Samples: 1230097400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:17:21,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 06:17:24,998][12883] Updated weights for policy 0, policy_version 75081 (0.0027) [2024-06-18 06:17:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1230209024. Throughput: 0: 41822.2. Samples: 1230348220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:17:26,994][12645] Avg episode reward: [(0, '0.124')] [2024-06-18 06:17:29,453][12883] Updated weights for policy 0, policy_version 75091 (0.0050) [2024-06-18 06:17:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41506.2, 300 sec: 42209.7). Total num frames: 1230405632. Throughput: 0: 41963.7. Samples: 1230478340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:17:31,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 06:17:32,543][12883] Updated weights for policy 0, policy_version 75101 (0.0035) [2024-06-18 06:17:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42209.9). Total num frames: 1230602240. Throughput: 0: 42252.9. Samples: 1230733740. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:17:36,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 06:17:37,086][12883] Updated weights for policy 0, policy_version 75111 (0.0037) [2024-06-18 06:17:40,304][12883] Updated weights for policy 0, policy_version 75121 (0.0046) [2024-06-18 06:17:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1230815232. Throughput: 0: 42063.6. Samples: 1230979560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:17:41,999][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 06:17:44,736][12883] Updated weights for policy 0, policy_version 75131 (0.0035) [2024-06-18 06:17:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.4, 300 sec: 42210.0). Total num frames: 1231044608. Throughput: 0: 42240.6. Samples: 1231109580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:17:46,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 06:17:48,062][12883] Updated weights for policy 0, policy_version 75141 (0.0038) [2024-06-18 06:17:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1231241216. Throughput: 0: 42112.8. Samples: 1231360760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:17:51,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 06:17:52,779][12883] Updated weights for policy 0, policy_version 75151 (0.0034) [2024-06-18 06:17:56,673][12883] Updated weights for policy 0, policy_version 75161 (0.0045) [2024-06-18 06:17:56,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1231454208. Throughput: 0: 42227.0. Samples: 1231612940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:17:56,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 06:18:00,544][12883] Updated weights for policy 0, policy_version 75171 (0.0033) [2024-06-18 06:18:01,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42210.0). 
Total num frames: 1231667200. Throughput: 0: 42305.0. Samples: 1231741320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:18:01,994][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 06:18:04,368][12883] Updated weights for policy 0, policy_version 75181 (0.0032) [2024-06-18 06:18:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 1231847424. Throughput: 0: 42051.6. Samples: 1231989720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:18:06,996][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 06:18:08,257][12883] Updated weights for policy 0, policy_version 75191 (0.0033) [2024-06-18 06:18:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1232076800. Throughput: 0: 42049.8. Samples: 1232240460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:18:11,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 06:18:12,022][12883] Updated weights for policy 0, policy_version 75201 (0.0029) [2024-06-18 06:18:15,964][12883] Updated weights for policy 0, policy_version 75211 (0.0028) [2024-06-18 06:18:16,994][12645] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 42099.5). Total num frames: 1232289792. Throughput: 0: 42121.3. Samples: 1232373800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:18:16,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 06:18:19,749][12883] Updated weights for policy 0, policy_version 75221 (0.0032) [2024-06-18 06:18:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1232502784. Throughput: 0: 42183.1. Samples: 1232631980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:18:21,994][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 06:18:23,580][12883] Updated weights for policy 0, policy_version 75231 (0.0038) [2024-06-18 06:18:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42209.6). 
Total num frames: 1232732160. Throughput: 0: 42271.6. Samples: 1232881780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:18:26,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 06:18:27,409][12883] Updated weights for policy 0, policy_version 75241 (0.0040) [2024-06-18 06:18:31,651][12883] Updated weights for policy 0, policy_version 75251 (0.0028) [2024-06-18 06:18:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1232928768. Throughput: 0: 42335.9. Samples: 1233014700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 06:18:31,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 06:18:32,696][12862] Signal inference workers to stop experience collection... (17850 times) [2024-06-18 06:18:32,744][12883] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-06-18 06:18:32,747][12862] Signal inference workers to resume experience collection... (17850 times) [2024-06-18 06:18:32,757][12883] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-06-18 06:18:35,014][12883] Updated weights for policy 0, policy_version 75261 (0.0046) [2024-06-18 06:18:36,996][12645] Fps is (10 sec: 39312.6, 60 sec: 42050.7, 300 sec: 42153.8). Total num frames: 1233125376. Throughput: 0: 42313.9. Samples: 1233264980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:18:36,996][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 06:18:37,042][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075265_1233141760.pth... [2024-06-18 06:18:37,093][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074648_1223032832.pth [2024-06-18 06:18:39,341][12883] Updated weights for policy 0, policy_version 75271 (0.0031) [2024-06-18 06:18:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 1233371136. Throughput: 0: 42285.8. Samples: 1233515800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:18:41,998][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 06:18:42,558][12883] Updated weights for policy 0, policy_version 75281 (0.0037) [2024-06-18 06:18:46,994][12645] Fps is (10 sec: 42608.1, 60 sec: 41779.1, 300 sec: 42043.3). Total num frames: 1233551360. Throughput: 0: 42435.0. Samples: 1233650900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:18:46,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 06:18:47,009][12883] Updated weights for policy 0, policy_version 75291 (0.0038) [2024-06-18 06:18:50,231][12883] Updated weights for policy 0, policy_version 75301 (0.0039) [2024-06-18 06:18:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42154.4). Total num frames: 1233764352. Throughput: 0: 42394.3. Samples: 1233897460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:18:51,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 06:18:54,716][12883] Updated weights for policy 0, policy_version 75311 (0.0046) [2024-06-18 06:18:56,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 1234010112. Throughput: 0: 42363.1. Samples: 1234146800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:18:56,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 06:18:58,032][12883] Updated weights for policy 0, policy_version 75321 (0.0046) [2024-06-18 06:19:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 1234173952. Throughput: 0: 42459.0. Samples: 1234284460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:19:01,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 06:19:02,489][12883] Updated weights for policy 0, policy_version 75331 (0.0039) [2024-06-18 06:19:05,818][12883] Updated weights for policy 0, policy_version 75341 (0.0049) [2024-06-18 06:19:06,995][12645] Fps is (10 sec: 39316.7, 60 sec: 42597.6, 300 sec: 42153.9). 
Total num frames: 1234403328. Throughput: 0: 42172.6. Samples: 1234529800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:19:06,995][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 06:19:10,287][12883] Updated weights for policy 0, policy_version 75351 (0.0028) [2024-06-18 06:19:11,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 1234649088. Throughput: 0: 42139.1. Samples: 1234778040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:19:11,994][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 06:19:14,014][12883] Updated weights for policy 0, policy_version 75361 (0.0042) [2024-06-18 06:19:17,000][12645] Fps is (10 sec: 39301.7, 60 sec: 41774.8, 300 sec: 41986.6). Total num frames: 1234796544. Throughput: 0: 42097.3. Samples: 1234909340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:19:17,001][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 06:19:17,854][12883] Updated weights for policy 0, policy_version 75371 (0.0029) [2024-06-18 06:19:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 1235025920. Throughput: 0: 41995.9. Samples: 1235154700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 06:19:21,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 06:19:22,084][12883] Updated weights for policy 0, policy_version 75381 (0.0040) [2024-06-18 06:19:25,638][12883] Updated weights for policy 0, policy_version 75391 (0.0027) [2024-06-18 06:19:26,994][12645] Fps is (10 sec: 47543.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1235271680. Throughput: 0: 41961.8. Samples: 1235404080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:19:26,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 06:19:29,846][12883] Updated weights for policy 0, policy_version 75401 (0.0031) [2024-06-18 06:19:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42043.0). 
Total num frames: 1235435520. Throughput: 0: 41800.0. Samples: 1235531900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:19:31,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 06:19:33,597][12883] Updated weights for policy 0, policy_version 75411 (0.0032) [2024-06-18 06:19:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42154.1). Total num frames: 1235681280. Throughput: 0: 41803.1. Samples: 1235778600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:19:36,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 06:19:38,156][12883] Updated weights for policy 0, policy_version 75421 (0.0025) [2024-06-18 06:19:41,472][12883] Updated weights for policy 0, policy_version 75431 (0.0034) [2024-06-18 06:19:41,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42050.8, 300 sec: 42098.2). Total num frames: 1235894272. Throughput: 0: 42021.0. Samples: 1236037840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:19:41,996][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 06:19:45,678][12883] Updated weights for policy 0, policy_version 75441 (0.0037) [2024-06-18 06:19:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 1236074496. Throughput: 0: 41791.9. Samples: 1236165100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:19:46,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 06:19:47,948][12862] Signal inference workers to stop experience collection... (17900 times) [2024-06-18 06:19:47,983][12883] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-06-18 06:19:48,005][12862] Signal inference workers to resume experience collection... 
(17900 times) [2024-06-18 06:19:48,006][12883] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-06-18 06:19:49,004][12883] Updated weights for policy 0, policy_version 75451 (0.0035) [2024-06-18 06:19:51,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1236320256. Throughput: 0: 42004.3. Samples: 1236419940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:19:51,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 06:19:53,218][12883] Updated weights for policy 0, policy_version 75461 (0.0046) [2024-06-18 06:19:56,605][12883] Updated weights for policy 0, policy_version 75471 (0.0036) [2024-06-18 06:19:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 1236516864. Throughput: 0: 42193.7. Samples: 1236676760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:19:56,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 06:20:00,725][12883] Updated weights for policy 0, policy_version 75481 (0.0040) [2024-06-18 06:20:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 1236713472. Throughput: 0: 42014.4. Samples: 1236799720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:20:01,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 06:20:04,440][12883] Updated weights for policy 0, policy_version 75491 (0.0036) [2024-06-18 06:20:07,000][12645] Fps is (10 sec: 44209.9, 60 sec: 42594.8, 300 sec: 42153.2). Total num frames: 1236959232. Throughput: 0: 42255.1. Samples: 1237056440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:20:07,000][12645] Avg episode reward: [(0, '0.215')] [2024-06-18 06:20:08,273][12883] Updated weights for policy 0, policy_version 75501 (0.0031) [2024-06-18 06:20:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1237155840. Throughput: 0: 42459.7. Samples: 1237314760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:20:11,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 06:20:12,038][12883] Updated weights for policy 0, policy_version 75511 (0.0043) [2024-06-18 06:20:15,949][12883] Updated weights for policy 0, policy_version 75521 (0.0042) [2024-06-18 06:20:16,994][12645] Fps is (10 sec: 37706.4, 60 sec: 42329.7, 300 sec: 41987.5). Total num frames: 1237336064. Throughput: 0: 42308.8. Samples: 1237435800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:20:16,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 06:20:19,928][12883] Updated weights for policy 0, policy_version 75531 (0.0035) [2024-06-18 06:20:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 1237581824. Throughput: 0: 42376.8. Samples: 1237685560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:20:21,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 06:20:23,999][12883] Updated weights for policy 0, policy_version 75541 (0.0038) [2024-06-18 06:20:26,994][12645] Fps is (10 sec: 44237.6, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1237778432. Throughput: 0: 42435.1. Samples: 1237947320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:20:26,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 06:20:27,751][12883] Updated weights for policy 0, policy_version 75551 (0.0041) [2024-06-18 06:20:31,786][12883] Updated weights for policy 0, policy_version 75561 (0.0041) [2024-06-18 06:20:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 1237991424. Throughput: 0: 42229.0. Samples: 1238065400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:20:31,996][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 06:20:35,503][12883] Updated weights for policy 0, policy_version 75571 (0.0034) [2024-06-18 06:20:36,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42209.6). 
Total num frames: 1238220800. Throughput: 0: 42206.1. Samples: 1238319220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:20:36,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 06:20:37,119][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075576_1238237184.pth... [2024-06-18 06:20:37,174][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000074958_1228111872.pth [2024-06-18 06:20:39,638][12883] Updated weights for policy 0, policy_version 75581 (0.0037) [2024-06-18 06:20:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42053.9, 300 sec: 42154.1). Total num frames: 1238417408. Throughput: 0: 42277.5. Samples: 1238579240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:20:41,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 06:20:43,115][12883] Updated weights for policy 0, policy_version 75591 (0.0033) [2024-06-18 06:20:46,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42596.9, 300 sec: 42098.2). Total num frames: 1238630400. Throughput: 0: 42208.1. Samples: 1238699180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:20:46,996][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 06:20:47,828][12883] Updated weights for policy 0, policy_version 75601 (0.0043) [2024-06-18 06:20:50,712][12883] Updated weights for policy 0, policy_version 75611 (0.0036) [2024-06-18 06:20:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1238843392. Throughput: 0: 42263.2. Samples: 1238958020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:20:51,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 06:20:55,533][12883] Updated weights for policy 0, policy_version 75621 (0.0027) [2024-06-18 06:20:56,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 1239040000. Throughput: 0: 42195.9. Samples: 1239213580. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:20:56,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 06:20:58,354][12883] Updated weights for policy 0, policy_version 75631 (0.0037) [2024-06-18 06:21:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 1239269376. Throughput: 0: 42313.9. Samples: 1239339920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:21:01,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 06:21:03,071][12883] Updated weights for policy 0, policy_version 75641 (0.0027) [2024-06-18 06:21:06,473][12883] Updated weights for policy 0, policy_version 75651 (0.0031) [2024-06-18 06:21:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41783.5, 300 sec: 42043.0). Total num frames: 1239465984. Throughput: 0: 42327.1. Samples: 1239590280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:21:06,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 06:21:08,972][12862] Signal inference workers to stop experience collection... (17950 times) [2024-06-18 06:21:08,973][12862] Signal inference workers to resume experience collection... (17950 times) [2024-06-18 06:21:09,004][12883] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-06-18 06:21:09,004][12883] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-06-18 06:21:10,787][12883] Updated weights for policy 0, policy_version 75661 (0.0038) [2024-06-18 06:21:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1239695360. Throughput: 0: 42137.2. Samples: 1239843500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:21:11,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 06:21:14,035][12883] Updated weights for policy 0, policy_version 75671 (0.0030) [2024-06-18 06:21:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 1239908352. Throughput: 0: 42445.7. 
Samples: 1239975460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:21:16,994][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 06:21:18,328][12883] Updated weights for policy 0, policy_version 75681 (0.0024) [2024-06-18 06:21:21,963][12883] Updated weights for policy 0, policy_version 75691 (0.0041) [2024-06-18 06:21:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1240121344. Throughput: 0: 42512.0. Samples: 1240232260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:21:21,995][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 06:21:26,134][12883] Updated weights for policy 0, policy_version 75701 (0.0033) [2024-06-18 06:21:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 1240334336. Throughput: 0: 42303.4. Samples: 1240482900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:21:26,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 06:21:29,947][12883] Updated weights for policy 0, policy_version 75711 (0.0032) [2024-06-18 06:21:31,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 1240530944. Throughput: 0: 42405.8. Samples: 1240607340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:21:31,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 06:21:33,723][12883] Updated weights for policy 0, policy_version 75721 (0.0035) [2024-06-18 06:21:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 1240743936. Throughput: 0: 42296.9. Samples: 1240861380. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:21:36,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 06:21:37,605][12883] Updated weights for policy 0, policy_version 75731 (0.0036) [2024-06-18 06:21:41,317][12883] Updated weights for policy 0, policy_version 75741 (0.0043) [2024-06-18 06:21:41,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 1240940544. Throughput: 0: 42148.0. Samples: 1241110240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 06:21:41,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 06:21:45,526][12883] Updated weights for policy 0, policy_version 75751 (0.0039) [2024-06-18 06:21:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42326.8, 300 sec: 42209.6). Total num frames: 1241169920. Throughput: 0: 42099.0. Samples: 1241234380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 06:21:46,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 06:21:49,237][12883] Updated weights for policy 0, policy_version 75761 (0.0044) [2024-06-18 06:21:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1241382912. Throughput: 0: 42177.4. Samples: 1241488260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 06:21:51,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 06:21:53,223][12883] Updated weights for policy 0, policy_version 75771 (0.0040) [2024-06-18 06:21:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1241579520. Throughput: 0: 42115.5. Samples: 1241738700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 06:21:56,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 06:21:57,262][12883] Updated weights for policy 0, policy_version 75781 (0.0029) [2024-06-18 06:22:01,191][12883] Updated weights for policy 0, policy_version 75791 (0.0044) [2024-06-18 06:22:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42154.1). 
Total num frames: 1241792512. Throughput: 0: 41934.3. Samples: 1241862500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 06:22:01,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 06:22:05,073][12883] Updated weights for policy 0, policy_version 75801 (0.0037) [2024-06-18 06:22:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1242005504. Throughput: 0: 41894.4. Samples: 1242117500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 06:22:06,994][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 06:22:07,073][12862] Saving new best policy, reward=0.665! [2024-06-18 06:22:09,046][12883] Updated weights for policy 0, policy_version 75811 (0.0042) [2024-06-18 06:22:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 1242218496. Throughput: 0: 41886.8. Samples: 1242367800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 06:22:11,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 06:22:12,702][12883] Updated weights for policy 0, policy_version 75821 (0.0034) [2024-06-18 06:22:16,814][12883] Updated weights for policy 0, policy_version 75831 (0.0032) [2024-06-18 06:22:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1242415104. Throughput: 0: 41936.7. Samples: 1242494500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 06:22:16,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 06:22:20,423][12883] Updated weights for policy 0, policy_version 75841 (0.0033) [2024-06-18 06:22:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 1242611712. Throughput: 0: 41879.1. Samples: 1242745940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 06:22:21,994][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 06:22:24,551][12883] Updated weights for policy 0, policy_version 75851 (0.0038) [2024-06-18 06:22:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1242841088. Throughput: 0: 42069.5. Samples: 1243003360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 06:22:26,994][12645] Avg episode reward: [(0, '0.144')] [2024-06-18 06:22:28,104][12883] Updated weights for policy 0, policy_version 75861 (0.0031) [2024-06-18 06:22:31,996][12645] Fps is (10 sec: 42588.6, 60 sec: 41777.5, 300 sec: 42153.8). Total num frames: 1243037696. Throughput: 0: 42075.7. Samples: 1243127880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 06:22:31,997][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 06:22:32,400][12883] Updated weights for policy 0, policy_version 75871 (0.0032) [2024-06-18 06:22:36,119][12883] Updated weights for policy 0, policy_version 75881 (0.0034) [2024-06-18 06:22:36,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 1243250688. Throughput: 0: 41950.6. Samples: 1243376040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 06:22:36,996][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 06:22:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075882_1243250688.pth... [2024-06-18 06:22:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075265_1233141760.pth [2024-06-18 06:22:40,197][12883] Updated weights for policy 0, policy_version 75891 (0.0042) [2024-06-18 06:22:41,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 1243463680. Throughput: 0: 42073.4. Samples: 1243632000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 06:22:41,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 06:22:43,678][12883] Updated weights for policy 0, policy_version 75901 (0.0022) [2024-06-18 06:22:45,404][12862] Signal inference workers to stop experience collection... (18000 times) [2024-06-18 06:22:45,405][12862] Signal inference workers to resume experience collection... (18000 times) [2024-06-18 06:22:45,422][12883] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-06-18 06:22:45,449][12883] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-06-18 06:22:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 1243676672. Throughput: 0: 42082.2. Samples: 1243756200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 06:22:46,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 06:22:47,869][12883] Updated weights for policy 0, policy_version 75911 (0.0054) [2024-06-18 06:22:51,856][12883] Updated weights for policy 0, policy_version 75921 (0.0028) [2024-06-18 06:22:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1243889664. Throughput: 0: 42000.8. Samples: 1244007540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 06:22:51,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 06:22:55,525][12883] Updated weights for policy 0, policy_version 75931 (0.0039) [2024-06-18 06:22:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1244119040. Throughput: 0: 42171.4. Samples: 1244265520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 06:22:56,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 06:22:59,577][12883] Updated weights for policy 0, policy_version 75941 (0.0037) [2024-06-18 06:23:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1244315648. Throughput: 0: 42085.4. 
Samples: 1244388340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 06:23:01,994][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 06:23:03,198][12883] Updated weights for policy 0, policy_version 75951 (0.0028) [2024-06-18 06:23:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1244528640. Throughput: 0: 42143.5. Samples: 1244642400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 06:23:06,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 06:23:07,194][12883] Updated weights for policy 0, policy_version 75961 (0.0030) [2024-06-18 06:23:10,919][12883] Updated weights for policy 0, policy_version 75971 (0.0025) [2024-06-18 06:23:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 1244741632. Throughput: 0: 42214.2. Samples: 1244903000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-18 06:23:11,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 06:23:14,911][12883] Updated weights for policy 0, policy_version 75981 (0.0033) [2024-06-18 06:23:16,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.9, 300 sec: 42264.8). Total num frames: 1244971008. Throughput: 0: 42279.6. Samples: 1245030460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 06:23:16,996][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 06:23:18,623][12883] Updated weights for policy 0, policy_version 75991 (0.0032) [2024-06-18 06:23:21,996][12645] Fps is (10 sec: 40950.4, 60 sec: 42323.7, 300 sec: 42098.2). Total num frames: 1245151232. Throughput: 0: 42381.1. Samples: 1245283280. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 06:23:21,997][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 06:23:22,686][12883] Updated weights for policy 0, policy_version 76001 (0.0036) [2024-06-18 06:23:26,449][12883] Updated weights for policy 0, policy_version 76011 (0.0040) [2024-06-18 06:23:26,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 1245380608. Throughput: 0: 42401.7. Samples: 1245540080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 06:23:26,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 06:23:30,487][12883] Updated weights for policy 0, policy_version 76021 (0.0039) [2024-06-18 06:23:31,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42600.1, 300 sec: 42265.5). Total num frames: 1245593600. Throughput: 0: 42449.8. Samples: 1245666440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 06:23:31,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 06:23:33,928][12883] Updated weights for policy 0, policy_version 76031 (0.0022) [2024-06-18 06:23:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1245806592. Throughput: 0: 42468.4. Samples: 1245918620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 06:23:36,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 06:23:38,197][12883] Updated weights for policy 0, policy_version 76041 (0.0027) [2024-06-18 06:23:41,521][12883] Updated weights for policy 0, policy_version 76051 (0.0031) [2024-06-18 06:23:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1246035968. Throughput: 0: 42378.4. Samples: 1246172540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 06:23:41,994][12645] Avg episode reward: [(0, '0.088')] [2024-06-18 06:23:45,796][12883] Updated weights for policy 0, policy_version 76061 (0.0051) [2024-06-18 06:23:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1246216192. Throughput: 0: 42585.0. Samples: 1246304660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 06:23:46,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 06:23:49,192][12883] Updated weights for policy 0, policy_version 76071 (0.0038) [2024-06-18 06:23:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 1246445568. Throughput: 0: 42529.0. Samples: 1246556200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 06:23:51,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 06:23:53,267][12883] Updated weights for policy 0, policy_version 76081 (0.0032) [2024-06-18 06:23:56,741][12883] Updated weights for policy 0, policy_version 76091 (0.0039) [2024-06-18 06:23:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1246674944. Throughput: 0: 42456.4. Samples: 1246813540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 06:23:56,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 06:24:01,291][12883] Updated weights for policy 0, policy_version 76101 (0.0039) [2024-06-18 06:24:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42209.8). Total num frames: 1246855168. Throughput: 0: 42648.4. Samples: 1246949540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:01,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 06:24:04,769][12883] Updated weights for policy 0, policy_version 76111 (0.0029) [2024-06-18 06:24:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 1247084544. Throughput: 0: 42469.7. Samples: 1247194320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:06,994][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 06:24:08,739][12883] Updated weights for policy 0, policy_version 76121 (0.0032) [2024-06-18 06:24:11,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42377.1). Total num frames: 1247297536. Throughput: 0: 42506.7. Samples: 1247452880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:11,994][12645] Avg episode reward: [(0, '0.096')] [2024-06-18 06:24:12,514][12883] Updated weights for policy 0, policy_version 76131 (0.0028) [2024-06-18 06:24:16,513][12883] Updated weights for policy 0, policy_version 76141 (0.0035) [2024-06-18 06:24:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1247510528. Throughput: 0: 42466.6. Samples: 1247577440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:16,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 06:24:20,262][12883] Updated weights for policy 0, policy_version 76151 (0.0027) [2024-06-18 06:24:20,487][12862] Signal inference workers to stop experience collection... (18050 times) [2024-06-18 06:24:20,513][12883] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-06-18 06:24:20,545][12862] Signal inference workers to resume experience collection... (18050 times) [2024-06-18 06:24:20,546][12883] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-06-18 06:24:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42873.1, 300 sec: 42209.6). Total num frames: 1247723520. Throughput: 0: 42560.1. Samples: 1247833820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:21,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 06:24:24,171][12883] Updated weights for policy 0, policy_version 76161 (0.0023) [2024-06-18 06:24:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1247920128. Throughput: 0: 42822.6. 
Samples: 1248099560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:26,998][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 06:24:27,736][12883] Updated weights for policy 0, policy_version 76171 (0.0023) [2024-06-18 06:24:31,863][12883] Updated weights for policy 0, policy_version 76181 (0.0047) [2024-06-18 06:24:31,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42596.8, 300 sec: 42264.8). Total num frames: 1248149504. Throughput: 0: 42468.1. Samples: 1248215820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:31,996][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 06:24:35,675][12883] Updated weights for policy 0, policy_version 76191 (0.0038) [2024-06-18 06:24:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42321.0). Total num frames: 1248378880. Throughput: 0: 42608.4. Samples: 1248473580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:37,000][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 06:24:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076195_1248378880.pth... [2024-06-18 06:24:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075576_1238237184.pth [2024-06-18 06:24:39,663][12883] Updated weights for policy 0, policy_version 76201 (0.0035) [2024-06-18 06:24:41,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1248559104. Throughput: 0: 42684.5. Samples: 1248734340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:41,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 06:24:43,147][12883] Updated weights for policy 0, policy_version 76211 (0.0030) [2024-06-18 06:24:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42265.1). Total num frames: 1248788480. Throughput: 0: 42385.2. Samples: 1248856880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 06:24:46,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 06:24:47,197][12883] Updated weights for policy 0, policy_version 76221 (0.0037) [2024-06-18 06:24:50,658][12883] Updated weights for policy 0, policy_version 76231 (0.0034) [2024-06-18 06:24:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 1249017856. Throughput: 0: 42743.6. Samples: 1249117780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 06:24:51,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 06:24:54,689][12883] Updated weights for policy 0, policy_version 76241 (0.0037) [2024-06-18 06:24:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1249214464. Throughput: 0: 42733.4. Samples: 1249375880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 06:24:56,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 06:24:58,501][12883] Updated weights for policy 0, policy_version 76251 (0.0026) [2024-06-18 06:25:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42266.0). Total num frames: 1249427456. Throughput: 0: 42678.6. Samples: 1249497980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 06:25:01,994][12645] Avg episode reward: [(0, '0.143')] [2024-06-18 06:25:02,781][12883] Updated weights for policy 0, policy_version 76261 (0.0044) [2024-06-18 06:25:06,163][12883] Updated weights for policy 0, policy_version 76271 (0.0024) [2024-06-18 06:25:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 1249656832. Throughput: 0: 42632.9. Samples: 1249752300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 06:25:06,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 06:25:10,441][12883] Updated weights for policy 0, policy_version 76281 (0.0027) [2024-06-18 06:25:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42320.7). 
Total num frames: 1249820672. Throughput: 0: 42513.4. Samples: 1250012660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 06:25:11,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 06:25:13,737][12883] Updated weights for policy 0, policy_version 76291 (0.0040) [2024-06-18 06:25:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1250066432. Throughput: 0: 42618.1. Samples: 1250133540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 06:25:16,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 06:25:18,038][12883] Updated weights for policy 0, policy_version 76301 (0.0038) [2024-06-18 06:25:21,587][12883] Updated weights for policy 0, policy_version 76311 (0.0030) [2024-06-18 06:25:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1250279424. Throughput: 0: 42535.2. Samples: 1250387660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 06:25:21,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 06:25:25,706][12883] Updated weights for policy 0, policy_version 76321 (0.0037) [2024-06-18 06:25:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1250459648. Throughput: 0: 42565.6. Samples: 1250649800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 06:25:26,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 06:25:29,355][12883] Updated weights for policy 0, policy_version 76331 (0.0024) [2024-06-18 06:25:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.9, 300 sec: 42265.2). Total num frames: 1250689024. Throughput: 0: 42461.4. Samples: 1250767640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 06:25:31,994][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 06:25:33,463][12883] Updated weights for policy 0, policy_version 76341 (0.0037) [2024-06-18 06:25:36,922][12883] Updated weights for policy 0, policy_version 76351 (0.0035) [2024-06-18 06:25:36,996][12645] Fps is (10 sec: 47503.0, 60 sec: 42596.8, 300 sec: 42431.4). Total num frames: 1250934784. Throughput: 0: 42396.0. Samples: 1251025700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 06:25:36,997][12645] Avg episode reward: [(0, '0.692')] [2024-06-18 06:25:37,013][12862] Saving new best policy, reward=0.692! [2024-06-18 06:25:41,105][12883] Updated weights for policy 0, policy_version 76361 (0.0026) [2024-06-18 06:25:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42265.5). Total num frames: 1251098624. Throughput: 0: 42446.3. Samples: 1251285960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 06:25:41,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 06:25:44,359][12883] Updated weights for policy 0, policy_version 76371 (0.0028) [2024-06-18 06:25:46,994][12645] Fps is (10 sec: 39330.5, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1251328000. Throughput: 0: 42330.2. Samples: 1251402840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 06:25:46,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 06:25:48,274][12862] Signal inference workers to stop experience collection... (18100 times) [2024-06-18 06:25:48,274][12862] Signal inference workers to resume experience collection... 
(18100 times) [2024-06-18 06:25:48,293][12883] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-06-18 06:25:48,322][12883] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-06-18 06:25:48,898][12883] Updated weights for policy 0, policy_version 76381 (0.0032) [2024-06-18 06:25:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1251557376. Throughput: 0: 42392.4. Samples: 1251659960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 06:25:51,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 06:25:52,492][12883] Updated weights for policy 0, policy_version 76391 (0.0028) [2024-06-18 06:25:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1251737600. Throughput: 0: 42306.7. Samples: 1251916460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 06:25:56,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 06:25:57,079][12883] Updated weights for policy 0, policy_version 76401 (0.0036) [2024-06-18 06:26:00,081][12883] Updated weights for policy 0, policy_version 76411 (0.0040) [2024-06-18 06:26:02,000][12645] Fps is (10 sec: 39297.4, 60 sec: 42048.0, 300 sec: 42319.8). Total num frames: 1251950592. Throughput: 0: 42287.1. Samples: 1252036720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 06:26:02,001][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 06:26:04,913][12883] Updated weights for policy 0, policy_version 76421 (0.0035) [2024-06-18 06:26:06,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1252196352. Throughput: 0: 42394.2. Samples: 1252295400. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 06:26:06,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 06:26:07,729][12883] Updated weights for policy 0, policy_version 76431 (0.0029) [2024-06-18 06:26:11,994][12645] Fps is (10 sec: 40985.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 1252360192. Throughput: 0: 42252.0. Samples: 1252551140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 06:26:11,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 06:26:12,489][12883] Updated weights for policy 0, policy_version 76441 (0.0037) [2024-06-18 06:26:15,439][12883] Updated weights for policy 0, policy_version 76451 (0.0040) [2024-06-18 06:26:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 1252589568. Throughput: 0: 42292.4. Samples: 1252670800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 06:26:16,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 06:26:20,112][12883] Updated weights for policy 0, policy_version 76461 (0.0032) [2024-06-18 06:26:21,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1252835328. Throughput: 0: 42461.7. Samples: 1252936380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:26:21,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 06:26:23,167][12883] Updated weights for policy 0, policy_version 76471 (0.0028) [2024-06-18 06:26:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1253015552. Throughput: 0: 42397.3. Samples: 1253193840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:26:26,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 06:26:27,790][12883] Updated weights for policy 0, policy_version 76481 (0.0045) [2024-06-18 06:26:30,710][12883] Updated weights for policy 0, policy_version 76491 (0.0023) [2024-06-18 06:26:31,994][12645] Fps is (10 sec: 40958.8, 60 sec: 42598.2, 300 sec: 42376.2). Total num frames: 1253244928. Throughput: 0: 42475.3. Samples: 1253314240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:26:31,995][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 06:26:35,877][12883] Updated weights for policy 0, policy_version 76501 (0.0038) [2024-06-18 06:26:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42053.8, 300 sec: 42431.8). Total num frames: 1253457920. Throughput: 0: 42497.3. Samples: 1253572340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:26:36,995][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 06:26:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076505_1253457920.pth... [2024-06-18 06:26:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000075882_1243250688.pth [2024-06-18 06:26:39,008][12883] Updated weights for policy 0, policy_version 76511 (0.0033) [2024-06-18 06:26:41,996][12645] Fps is (10 sec: 39313.9, 60 sec: 42323.7, 300 sec: 42264.9). Total num frames: 1253638144. Throughput: 0: 42401.0. Samples: 1253824600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:26:41,997][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 06:26:43,323][12883] Updated weights for policy 0, policy_version 76521 (0.0023) [2024-06-18 06:26:46,839][12883] Updated weights for policy 0, policy_version 76531 (0.0034) [2024-06-18 06:26:46,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42596.8, 300 sec: 42375.9). Total num frames: 1253883904. Throughput: 0: 42445.9. Samples: 1253946620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:26:46,996][12645] Avg episode reward: [(0, '0.063')] [2024-06-18 06:26:51,221][12883] Updated weights for policy 0, policy_version 76541 (0.0044) [2024-06-18 06:26:51,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1254096896. Throughput: 0: 42539.0. Samples: 1254209660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:26:51,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 06:26:54,485][12883] Updated weights for policy 0, policy_version 76551 (0.0031) [2024-06-18 06:26:56,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1254293504. Throughput: 0: 42409.8. Samples: 1254459580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:26:56,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 06:26:58,917][12883] Updated weights for policy 0, policy_version 76561 (0.0036) [2024-06-18 06:26:59,200][12862] Signal inference workers to stop experience collection... (18150 times) [2024-06-18 06:26:59,200][12862] Signal inference workers to resume experience collection... (18150 times) [2024-06-18 06:26:59,239][12883] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-06-18 06:26:59,239][12883] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-06-18 06:27:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42875.8, 300 sec: 42431.8). Total num frames: 1254522880. Throughput: 0: 42589.8. Samples: 1254587340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:27:01,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 06:27:02,411][12883] Updated weights for policy 0, policy_version 76571 (0.0033) [2024-06-18 06:27:06,705][12883] Updated weights for policy 0, policy_version 76581 (0.0032) [2024-06-18 06:27:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1254719488. Throughput: 0: 42528.0. 
Samples: 1254850140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 25.0) [2024-06-18 06:27:06,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 06:27:10,185][12883] Updated weights for policy 0, policy_version 76591 (0.0042) [2024-06-18 06:27:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1254948864. Throughput: 0: 42131.4. Samples: 1255089760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-06-18 06:27:11,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 06:27:14,342][12883] Updated weights for policy 0, policy_version 76601 (0.0043) [2024-06-18 06:27:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1255178240. Throughput: 0: 42487.4. Samples: 1255226160. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-06-18 06:27:16,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 06:27:17,651][12883] Updated weights for policy 0, policy_version 76611 (0.0035) [2024-06-18 06:27:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1255342080. Throughput: 0: 42541.0. Samples: 1255486680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-06-18 06:27:21,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 06:27:22,023][12883] Updated weights for policy 0, policy_version 76621 (0.0029) [2024-06-18 06:27:25,815][12883] Updated weights for policy 0, policy_version 76631 (0.0031) [2024-06-18 06:27:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 1255587840. Throughput: 0: 42298.6. Samples: 1255727940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-06-18 06:27:26,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 06:27:29,958][12883] Updated weights for policy 0, policy_version 76641 (0.0028) [2024-06-18 06:27:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.7, 300 sec: 42542.9). Total num frames: 1255800832. Throughput: 0: 42580.4. 
Samples: 1255862640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-06-18 06:27:31,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 06:27:33,318][12883] Updated weights for policy 0, policy_version 76651 (0.0030) [2024-06-18 06:27:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1255997440. Throughput: 0: 42418.3. Samples: 1256118480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-06-18 06:27:36,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 06:27:37,412][12883] Updated weights for policy 0, policy_version 76661 (0.0024) [2024-06-18 06:27:40,916][12883] Updated weights for policy 0, policy_version 76671 (0.0038) [2024-06-18 06:27:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43146.2, 300 sec: 42542.9). Total num frames: 1256226816. Throughput: 0: 42439.7. Samples: 1256369360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-06-18 06:27:41,994][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 06:27:44,981][12883] Updated weights for policy 0, policy_version 76681 (0.0041) [2024-06-18 06:27:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1256423424. Throughput: 0: 42525.4. Samples: 1256500980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-06-18 06:27:46,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 06:27:48,708][12883] Updated weights for policy 0, policy_version 76691 (0.0042) [2024-06-18 06:27:51,996][12645] Fps is (10 sec: 39312.5, 60 sec: 42050.7, 300 sec: 42375.9). Total num frames: 1256620032. Throughput: 0: 42257.0. Samples: 1256751800. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 25.0) [2024-06-18 06:27:51,997][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 06:27:52,932][12883] Updated weights for policy 0, policy_version 76701 (0.0030) [2024-06-18 06:27:56,283][12883] Updated weights for policy 0, policy_version 76711 (0.0031) [2024-06-18 06:27:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1256849408. Throughput: 0: 42474.7. Samples: 1257001120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:27:56,994][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 06:28:00,570][12883] Updated weights for policy 0, policy_version 76721 (0.0037) [2024-06-18 06:28:01,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1257062400. Throughput: 0: 42384.1. Samples: 1257133440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:28:01,994][12645] Avg episode reward: [(0, '0.228')] [2024-06-18 06:28:03,932][12883] Updated weights for policy 0, policy_version 76731 (0.0027) [2024-06-18 06:28:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1257259008. Throughput: 0: 42224.0. Samples: 1257386760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:28:06,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 06:28:08,268][12883] Updated weights for policy 0, policy_version 76741 (0.0040) [2024-06-18 06:28:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42376.6). Total num frames: 1257472000. Throughput: 0: 42359.6. Samples: 1257634120. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:28:11,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 06:28:12,055][12883] Updated weights for policy 0, policy_version 76751 (0.0033) [2024-06-18 06:28:16,066][12883] Updated weights for policy 0, policy_version 76761 (0.0034) [2024-06-18 06:28:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 1257701376. Throughput: 0: 42295.5. Samples: 1257765940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:28:16,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 06:28:19,695][12883] Updated weights for policy 0, policy_version 76771 (0.0029) [2024-06-18 06:28:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1257897984. Throughput: 0: 42189.2. Samples: 1258017000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:28:21,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 06:28:23,831][12883] Updated weights for policy 0, policy_version 76781 (0.0028) [2024-06-18 06:28:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1258127360. Throughput: 0: 42225.2. Samples: 1258269500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:28:26,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 06:28:27,218][12883] Updated weights for policy 0, policy_version 76791 (0.0034) [2024-06-18 06:28:30,796][12862] Signal inference workers to stop experience collection... (18200 times) [2024-06-18 06:28:30,828][12883] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-06-18 06:28:30,854][12862] Signal inference workers to resume experience collection... 
(18200 times) [2024-06-18 06:28:30,855][12883] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-06-18 06:28:31,377][12883] Updated weights for policy 0, policy_version 76801 (0.0023) [2024-06-18 06:28:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1258323968. Throughput: 0: 42267.1. Samples: 1258403000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:28:31,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 06:28:34,694][12883] Updated weights for policy 0, policy_version 76811 (0.0032) [2024-06-18 06:28:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1258536960. Throughput: 0: 42454.4. Samples: 1258662160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:28:37,000][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 06:28:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076815_1258536960.pth... [2024-06-18 06:28:37,093][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076195_1248378880.pth [2024-06-18 06:28:39,434][12883] Updated weights for policy 0, policy_version 76821 (0.0036) [2024-06-18 06:28:41,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1258766336. Throughput: 0: 42407.5. Samples: 1258909460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:28:41,995][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 06:28:42,443][12883] Updated weights for policy 0, policy_version 76831 (0.0030) [2024-06-18 06:28:46,881][12883] Updated weights for policy 0, policy_version 76841 (0.0034) [2024-06-18 06:28:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1258962944. Throughput: 0: 42459.9. Samples: 1259044140. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:28:46,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 06:28:50,492][12883] Updated weights for policy 0, policy_version 76851 (0.0035) [2024-06-18 06:28:51,994][12645] Fps is (10 sec: 39322.5, 60 sec: 42327.0, 300 sec: 42320.7). Total num frames: 1259159552. Throughput: 0: 42350.7. Samples: 1259292540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:28:51,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 06:28:54,712][12883] Updated weights for policy 0, policy_version 76861 (0.0027) [2024-06-18 06:28:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1259388928. Throughput: 0: 42580.8. Samples: 1259550260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:28:56,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 06:28:58,170][12883] Updated weights for policy 0, policy_version 76871 (0.0034) [2024-06-18 06:29:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1259601920. Throughput: 0: 42532.8. Samples: 1259679920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:29:01,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 06:29:02,296][12883] Updated weights for policy 0, policy_version 76881 (0.0043) [2024-06-18 06:29:05,831][12883] Updated weights for policy 0, policy_version 76891 (0.0041) [2024-06-18 06:29:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.2, 300 sec: 42431.8). Total num frames: 1259814912. Throughput: 0: 42375.4. Samples: 1259923900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:29:06,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 06:29:10,209][12883] Updated weights for policy 0, policy_version 76901 (0.0031) [2024-06-18 06:29:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1260027904. Throughput: 0: 42215.5. Samples: 1260169200. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:29:11,994][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 06:29:14,312][12883] Updated weights for policy 0, policy_version 76911 (0.0034) [2024-06-18 06:29:16,994][12645] Fps is (10 sec: 39322.5, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1260208128. Throughput: 0: 42168.0. Samples: 1260300560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:29:16,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 06:29:17,912][12883] Updated weights for policy 0, policy_version 76921 (0.0050) [2024-06-18 06:29:21,972][12883] Updated weights for policy 0, policy_version 76931 (0.0031) [2024-06-18 06:29:21,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 1260437504. Throughput: 0: 42054.4. Samples: 1260554700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:29:21,997][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 06:29:25,498][12883] Updated weights for policy 0, policy_version 76941 (0.0026) [2024-06-18 06:29:26,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42432.1). Total num frames: 1260666880. Throughput: 0: 42219.6. Samples: 1260809340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:29:26,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 06:29:29,633][12883] Updated weights for policy 0, policy_version 76951 (0.0027) [2024-06-18 06:29:31,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1260847104. Throughput: 0: 42148.5. Samples: 1260940820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:29:31,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 06:29:33,494][12883] Updated weights for policy 0, policy_version 76961 (0.0031) [2024-06-18 06:29:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1261076480. Throughput: 0: 42256.3. Samples: 1261194080. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:29:36,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 06:29:37,142][12883] Updated weights for policy 0, policy_version 76971 (0.0033) [2024-06-18 06:29:41,091][12883] Updated weights for policy 0, policy_version 76981 (0.0033) [2024-06-18 06:29:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1261289472. Throughput: 0: 42202.2. Samples: 1261449360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:29:41,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 06:29:44,820][12883] Updated weights for policy 0, policy_version 76991 (0.0031) [2024-06-18 06:29:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 1261486080. Throughput: 0: 42102.7. Samples: 1261574540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:29:46,998][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 06:29:48,638][12883] Updated weights for policy 0, policy_version 77001 (0.0037) [2024-06-18 06:29:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1261715456. Throughput: 0: 42366.0. Samples: 1261830360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:29:51,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 06:29:52,408][12883] Updated weights for policy 0, policy_version 77011 (0.0034) [2024-06-18 06:29:56,345][12883] Updated weights for policy 0, policy_version 77021 (0.0031) [2024-06-18 06:29:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1261912064. Throughput: 0: 42528.9. Samples: 1262083000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:29:56,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 06:30:00,402][12883] Updated weights for policy 0, policy_version 77031 (0.0032) [2024-06-18 06:30:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1262141440. Throughput: 0: 42519.5. Samples: 1262213940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:30:01,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 06:30:03,861][12883] Updated weights for policy 0, policy_version 77041 (0.0027) [2024-06-18 06:30:05,556][12862] Signal inference workers to stop experience collection... (18250 times) [2024-06-18 06:30:05,556][12862] Signal inference workers to resume experience collection... (18250 times) [2024-06-18 06:30:05,597][12883] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-06-18 06:30:05,598][12883] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-06-18 06:30:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.5, 300 sec: 42431.8). Total num frames: 1262338048. Throughput: 0: 42520.0. Samples: 1262468000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:30:06,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 06:30:08,153][12883] Updated weights for policy 0, policy_version 77051 (0.0032) [2024-06-18 06:30:11,882][12883] Updated weights for policy 0, policy_version 77061 (0.0044) [2024-06-18 06:30:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1262567424. Throughput: 0: 42530.8. Samples: 1262723220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:30:11,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 06:30:15,957][12883] Updated weights for policy 0, policy_version 77071 (0.0039) [2024-06-18 06:30:16,997][12645] Fps is (10 sec: 44222.9, 60 sec: 42869.3, 300 sec: 42375.8). Total num frames: 1262780416. Throughput: 0: 42510.4. 
Samples: 1262853920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:30:16,997][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 06:30:19,846][12883] Updated weights for policy 0, policy_version 77081 (0.0034) [2024-06-18 06:30:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 1262993408. Throughput: 0: 42426.3. Samples: 1263103260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:30:21,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 06:30:23,794][12883] Updated weights for policy 0, policy_version 77091 (0.0053) [2024-06-18 06:30:26,998][12645] Fps is (10 sec: 40956.7, 60 sec: 42049.6, 300 sec: 42375.7). Total num frames: 1263190016. Throughput: 0: 42359.1. Samples: 1263355680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:30:26,998][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 06:30:27,775][12883] Updated weights for policy 0, policy_version 77101 (0.0036) [2024-06-18 06:30:31,513][12883] Updated weights for policy 0, policy_version 77111 (0.0033) [2024-06-18 06:30:31,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42596.8, 300 sec: 42265.2). Total num frames: 1263403008. Throughput: 0: 42397.9. Samples: 1263482540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:30:31,996][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 06:30:35,255][12883] Updated weights for policy 0, policy_version 77121 (0.0025) [2024-06-18 06:30:36,994][12645] Fps is (10 sec: 44254.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1263632384. Throughput: 0: 42413.8. Samples: 1263738980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:30:36,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 06:30:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077126_1263632384.pth... 
[2024-06-18 06:30:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076505_1253457920.pth [2024-06-18 06:30:39,252][12883] Updated weights for policy 0, policy_version 77131 (0.0049) [2024-06-18 06:30:41,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1263828992. Throughput: 0: 42265.3. Samples: 1263984940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:30:41,994][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 06:30:43,339][12883] Updated weights for policy 0, policy_version 77141 (0.0038) [2024-06-18 06:30:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1264025600. Throughput: 0: 42197.0. Samples: 1264112800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:30:46,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 06:30:47,015][12883] Updated weights for policy 0, policy_version 77151 (0.0024) [2024-06-18 06:30:50,770][12883] Updated weights for policy 0, policy_version 77161 (0.0033) [2024-06-18 06:30:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1264254976. Throughput: 0: 42334.9. Samples: 1264373080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:30:51,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 06:30:54,704][12883] Updated weights for policy 0, policy_version 77171 (0.0041) [2024-06-18 06:30:56,996][12645] Fps is (10 sec: 45864.5, 60 sec: 42869.9, 300 sec: 42487.9). Total num frames: 1264484352. Throughput: 0: 42216.0. Samples: 1264623040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:30:56,997][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 06:30:58,238][12883] Updated weights for policy 0, policy_version 77181 (0.0028) [2024-06-18 06:31:01,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 1264664576. Throughput: 0: 42355.0. Samples: 1264759760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:31:01,994][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 06:31:02,532][12883] Updated weights for policy 0, policy_version 77191 (0.0022) [2024-06-18 06:31:05,800][12883] Updated weights for policy 0, policy_version 77201 (0.0049) [2024-06-18 06:31:06,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1264893952. Throughput: 0: 42405.8. Samples: 1265011520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:31:06,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 06:31:10,190][12883] Updated weights for policy 0, policy_version 77211 (0.0037) [2024-06-18 06:31:11,996][12645] Fps is (10 sec: 47502.5, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 1265139712. Throughput: 0: 42377.1. Samples: 1265262580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:31:11,996][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 06:31:13,716][12883] Updated weights for policy 0, policy_version 77221 (0.0035) [2024-06-18 06:31:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42054.5, 300 sec: 42265.2). Total num frames: 1265303552. Throughput: 0: 42471.5. Samples: 1265393660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:31:16,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 06:31:18,104][12883] Updated weights for policy 0, policy_version 77231 (0.0039) [2024-06-18 06:31:21,287][12883] Updated weights for policy 0, policy_version 77241 (0.0031) [2024-06-18 06:31:21,994][12645] Fps is (10 sec: 37691.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1265516544. Throughput: 0: 42294.2. Samples: 1265642220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:31:21,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 06:31:24,220][12862] Signal inference workers to stop experience collection... 
(18300 times) [2024-06-18 06:31:24,224][12862] Signal inference workers to resume experience collection... (18300 times) [2024-06-18 06:31:24,260][12883] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-06-18 06:31:24,260][12883] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-06-18 06:31:25,625][12883] Updated weights for policy 0, policy_version 77251 (0.0021) [2024-06-18 06:31:26,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42874.1, 300 sec: 42431.8). Total num frames: 1265762304. Throughput: 0: 42421.8. Samples: 1265893920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:31:26,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 06:31:28,893][12883] Updated weights for policy 0, policy_version 77261 (0.0045) [2024-06-18 06:31:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42326.9, 300 sec: 42320.7). Total num frames: 1265942528. Throughput: 0: 42602.2. Samples: 1266029900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:31:31,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 06:31:33,224][12883] Updated weights for policy 0, policy_version 77271 (0.0034) [2024-06-18 06:31:36,538][12883] Updated weights for policy 0, policy_version 77281 (0.0045) [2024-06-18 06:31:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 1266171904. Throughput: 0: 42472.6. Samples: 1266284340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:31:36,994][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 06:31:40,787][12883] Updated weights for policy 0, policy_version 77291 (0.0039) [2024-06-18 06:31:41,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42432.1). Total num frames: 1266401280. Throughput: 0: 42648.4. Samples: 1266542120. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:31:41,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 06:31:43,924][12883] Updated weights for policy 0, policy_version 77301 (0.0048) [2024-06-18 06:31:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1266581504. Throughput: 0: 42487.1. Samples: 1266671680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:31:46,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 06:31:48,362][12883] Updated weights for policy 0, policy_version 77311 (0.0033) [2024-06-18 06:31:51,418][12883] Updated weights for policy 0, policy_version 77321 (0.0034) [2024-06-18 06:31:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1266827264. Throughput: 0: 42589.8. Samples: 1266928060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:31:51,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 06:31:56,001][12883] Updated weights for policy 0, policy_version 77331 (0.0036) [2024-06-18 06:31:56,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42598.4, 300 sec: 42431.5). Total num frames: 1267040256. Throughput: 0: 42772.9. Samples: 1267187360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:31:56,996][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 06:31:59,132][12883] Updated weights for policy 0, policy_version 77341 (0.0029) [2024-06-18 06:32:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1267220480. Throughput: 0: 42751.9. Samples: 1267317500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:01,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 06:32:03,890][12883] Updated weights for policy 0, policy_version 77351 (0.0026) [2024-06-18 06:32:06,665][12883] Updated weights for policy 0, policy_version 77361 (0.0032) [2024-06-18 06:32:06,994][12645] Fps is (10 sec: 44247.1, 60 sec: 43144.6, 300 sec: 42487.4). 
Total num frames: 1267482624. Throughput: 0: 42961.4. Samples: 1267575480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:06,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 06:32:11,483][12883] Updated weights for policy 0, policy_version 77371 (0.0031) [2024-06-18 06:32:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42053.9, 300 sec: 42320.7). Total num frames: 1267662848. Throughput: 0: 43163.7. Samples: 1267836280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:11,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 06:32:14,198][12883] Updated weights for policy 0, policy_version 77381 (0.0035) [2024-06-18 06:32:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1267875840. Throughput: 0: 42838.2. Samples: 1267957620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:16,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 06:32:19,085][12883] Updated weights for policy 0, policy_version 77391 (0.0036) [2024-06-18 06:32:21,900][12883] Updated weights for policy 0, policy_version 77401 (0.0032) [2024-06-18 06:32:21,994][12645] Fps is (10 sec: 47513.8, 60 sec: 43690.7, 300 sec: 42542.9). Total num frames: 1268137984. Throughput: 0: 43075.2. Samples: 1268222720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:21,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 06:32:26,761][12883] Updated weights for policy 0, policy_version 77411 (0.0036) [2024-06-18 06:32:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1268301824. Throughput: 0: 43098.6. Samples: 1268481560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:26,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 06:32:29,670][12883] Updated weights for policy 0, policy_version 77421 (0.0027) [2024-06-18 06:32:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42487.3). 
Total num frames: 1268531200. Throughput: 0: 42888.8. Samples: 1268601680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:31,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 06:32:34,394][12883] Updated weights for policy 0, policy_version 77431 (0.0036) [2024-06-18 06:32:36,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 1268776960. Throughput: 0: 42969.9. Samples: 1268861700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:36,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 06:32:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077440_1268776960.pth... [2024-06-18 06:32:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000076815_1258536960.pth [2024-06-18 06:32:37,249][12883] Updated weights for policy 0, policy_version 77441 (0.0031) [2024-06-18 06:32:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1268940800. Throughput: 0: 43179.1. Samples: 1269130320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:41,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 06:32:42,016][12883] Updated weights for policy 0, policy_version 77451 (0.0032) [2024-06-18 06:32:44,076][12862] Signal inference workers to stop experience collection... (18350 times) [2024-06-18 06:32:44,119][12883] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-06-18 06:32:44,128][12862] Signal inference workers to resume experience collection... (18350 times) [2024-06-18 06:32:44,135][12883] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-06-18 06:32:44,939][12883] Updated weights for policy 0, policy_version 77461 (0.0046) [2024-06-18 06:32:46,994][12645] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42543.2). Total num frames: 1269170176. Throughput: 0: 42812.3. Samples: 1269244060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:46,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 06:32:49,559][12883] Updated weights for policy 0, policy_version 77471 (0.0030) [2024-06-18 06:32:51,994][12645] Fps is (10 sec: 49152.4, 60 sec: 43417.7, 300 sec: 42654.0). Total num frames: 1269432320. Throughput: 0: 42927.6. Samples: 1269507220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:51,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 06:32:52,786][12883] Updated weights for policy 0, policy_version 77481 (0.0036) [2024-06-18 06:32:57,000][12645] Fps is (10 sec: 42572.4, 60 sec: 42595.5, 300 sec: 42486.4). Total num frames: 1269596160. Throughput: 0: 43053.9. Samples: 1269773980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:32:57,001][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 06:32:57,137][12883] Updated weights for policy 0, policy_version 77491 (0.0042) [2024-06-18 06:33:00,809][12883] Updated weights for policy 0, policy_version 77501 (0.0030) [2024-06-18 06:33:01,996][12645] Fps is (10 sec: 39312.4, 60 sec: 43416.0, 300 sec: 42598.1). Total num frames: 1269825536. Throughput: 0: 42888.9. Samples: 1269887720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:33:01,997][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 06:33:05,065][12883] Updated weights for policy 0, policy_version 77511 (0.0029) [2024-06-18 06:33:06,994][12645] Fps is (10 sec: 45904.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1270054912. Throughput: 0: 42858.6. Samples: 1270151360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:33:06,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 06:33:08,547][12883] Updated weights for policy 0, policy_version 77521 (0.0040) [2024-06-18 06:33:11,994][12645] Fps is (10 sec: 37691.9, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1270202368. Throughput: 0: 42864.4. Samples: 1270410460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:33:11,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 06:33:12,837][12883] Updated weights for policy 0, policy_version 77531 (0.0032) [2024-06-18 06:33:16,277][12883] Updated weights for policy 0, policy_version 77541 (0.0029) [2024-06-18 06:33:16,996][12645] Fps is (10 sec: 40950.8, 60 sec: 43142.9, 300 sec: 42598.1). Total num frames: 1270464512. Throughput: 0: 42729.0. Samples: 1270524580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:33:16,996][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 06:33:20,564][12883] Updated weights for policy 0, policy_version 77551 (0.0029) [2024-06-18 06:33:21,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1270677504. Throughput: 0: 42741.0. Samples: 1270785040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 06:33:21,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 06:33:23,790][12883] Updated weights for policy 0, policy_version 77561 (0.0037) [2024-06-18 06:33:26,994][12645] Fps is (10 sec: 37692.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1270841344. Throughput: 0: 42477.0. Samples: 1271041780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:33:26,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 06:33:28,216][12883] Updated weights for policy 0, policy_version 77571 (0.0031) [2024-06-18 06:33:31,550][12883] Updated weights for policy 0, policy_version 77581 (0.0041) [2024-06-18 06:33:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1271103488. Throughput: 0: 42583.8. Samples: 1271160320. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:33:31,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 06:33:35,819][12883] Updated weights for policy 0, policy_version 77591 (0.0028) [2024-06-18 06:33:36,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1271316480. Throughput: 0: 42641.7. Samples: 1271426100. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:33:36,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 06:33:39,290][12883] Updated weights for policy 0, policy_version 77601 (0.0029) [2024-06-18 06:33:41,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1271496704. Throughput: 0: 42255.6. Samples: 1271675220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:33:41,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 06:33:43,672][12883] Updated weights for policy 0, policy_version 77611 (0.0027) [2024-06-18 06:33:46,743][12883] Updated weights for policy 0, policy_version 77621 (0.0038) [2024-06-18 06:33:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1271742464. Throughput: 0: 42520.2. Samples: 1271801040. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:33:46,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 06:33:51,256][12883] Updated weights for policy 0, policy_version 77631 (0.0039) [2024-06-18 06:33:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 42542.8). Total num frames: 1271939072. Throughput: 0: 42475.4. Samples: 1272062760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:33:51,995][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 06:33:54,644][12883] Updated weights for policy 0, policy_version 77641 (0.0048) [2024-06-18 06:33:56,996][12645] Fps is (10 sec: 40951.3, 60 sec: 42601.3, 300 sec: 42542.5). Total num frames: 1272152064. Throughput: 0: 42184.5. Samples: 1272308860. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:33:56,997][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 06:33:59,067][12883] Updated weights for policy 0, policy_version 77651 (0.0035) [2024-06-18 06:34:01,999][12645] Fps is (10 sec: 44211.7, 60 sec: 42595.9, 300 sec: 42597.6). Total num frames: 1272381440. Throughput: 0: 42486.4. Samples: 1272436620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:34:02,000][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 06:34:02,485][12883] Updated weights for policy 0, policy_version 77661 (0.0043) [2024-06-18 06:34:06,716][12883] Updated weights for policy 0, policy_version 77671 (0.0027) [2024-06-18 06:34:06,994][12645] Fps is (10 sec: 40969.4, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 1272561664. Throughput: 0: 42462.7. Samples: 1272695860. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 06:34:06,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 06:34:10,044][12883] Updated weights for policy 0, policy_version 77681 (0.0047) [2024-06-18 06:34:11,994][12645] Fps is (10 sec: 42623.0, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1272807424. Throughput: 0: 42238.1. Samples: 1272942500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:11,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 06:34:14,235][12883] Updated weights for policy 0, policy_version 77691 (0.0037) [2024-06-18 06:34:16,796][12862] Signal inference workers to stop experience collection... (18400 times) [2024-06-18 06:34:16,796][12862] Signal inference workers to resume experience collection... (18400 times) [2024-06-18 06:34:16,819][12883] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-06-18 06:34:16,819][12883] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-06-18 06:34:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42600.0, 300 sec: 42654.3). Total num frames: 1273020416. Throughput: 0: 42599.9. 
Samples: 1273077320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:16,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 06:34:17,768][12883] Updated weights for policy 0, policy_version 77701 (0.0031) [2024-06-18 06:34:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1273200640. Throughput: 0: 42345.8. Samples: 1273331660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:21,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 06:34:22,068][12883] Updated weights for policy 0, policy_version 77711 (0.0037) [2024-06-18 06:34:25,341][12883] Updated weights for policy 0, policy_version 77721 (0.0043) [2024-06-18 06:34:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1273446400. Throughput: 0: 42287.1. Samples: 1273578140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:26,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 06:34:29,705][12883] Updated weights for policy 0, policy_version 77731 (0.0021) [2024-06-18 06:34:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1273643008. Throughput: 0: 42526.7. Samples: 1273714740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:31,998][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 06:34:33,033][12883] Updated weights for policy 0, policy_version 77741 (0.0036) [2024-06-18 06:34:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1273856000. Throughput: 0: 42360.6. Samples: 1273968980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:36,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 06:34:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077750_1273856000.pth... 
[2024-06-18 06:34:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077126_1263632384.pth [2024-06-18 06:34:37,483][12883] Updated weights for policy 0, policy_version 77751 (0.0032) [2024-06-18 06:34:40,924][12883] Updated weights for policy 0, policy_version 77761 (0.0041) [2024-06-18 06:34:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1274085376. Throughput: 0: 42381.3. Samples: 1274215920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:41,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 06:34:45,193][12883] Updated weights for policy 0, policy_version 77771 (0.0036) [2024-06-18 06:34:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 1274265600. Throughput: 0: 42537.8. Samples: 1274350580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:46,994][12645] Avg episode reward: [(0, '0.074')] [2024-06-18 06:34:48,590][12883] Updated weights for policy 0, policy_version 77781 (0.0043) [2024-06-18 06:34:51,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1274478592. Throughput: 0: 42403.4. Samples: 1274604020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:51,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 06:34:52,756][12883] Updated weights for policy 0, policy_version 77791 (0.0038) [2024-06-18 06:34:56,447][12883] Updated weights for policy 0, policy_version 77801 (0.0035) [2024-06-18 06:34:56,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 1274724352. Throughput: 0: 42456.4. Samples: 1274853040. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 06:34:56,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 06:35:00,753][12883] Updated weights for policy 0, policy_version 77811 (0.0039) [2024-06-18 06:35:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42056.3, 300 sec: 42598.4). Total num frames: 1274904576. Throughput: 0: 42318.7. Samples: 1274981660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:35:01,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 06:35:04,143][12883] Updated weights for policy 0, policy_version 77821 (0.0024) [2024-06-18 06:35:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1275101184. Throughput: 0: 42257.3. Samples: 1275233240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:35:06,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 06:35:08,383][12883] Updated weights for policy 0, policy_version 77831 (0.0036) [2024-06-18 06:35:11,837][12883] Updated weights for policy 0, policy_version 77841 (0.0033) [2024-06-18 06:35:11,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42598.8). Total num frames: 1275346944. Throughput: 0: 42523.5. Samples: 1275491700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:35:11,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 06:35:16,205][12883] Updated weights for policy 0, policy_version 77851 (0.0036) [2024-06-18 06:35:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 1275527168. Throughput: 0: 42318.2. Samples: 1275619060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:35:16,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 06:35:19,612][12883] Updated weights for policy 0, policy_version 77861 (0.0031) [2024-06-18 06:35:21,994][12645] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42543.4). Total num frames: 1275740160. Throughput: 0: 42232.1. Samples: 1275869420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:35:21,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 06:35:23,708][12883] Updated weights for policy 0, policy_version 77871 (0.0028) [2024-06-18 06:35:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42598.7). Total num frames: 1275969536. Throughput: 0: 42386.2. Samples: 1276123300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:35:26,994][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 06:35:27,316][12883] Updated weights for policy 0, policy_version 77881 (0.0035) [2024-06-18 06:35:31,314][12883] Updated weights for policy 0, policy_version 77891 (0.0032) [2024-06-18 06:35:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1276166144. Throughput: 0: 42238.7. Samples: 1276251320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:35:31,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 06:35:35,491][12883] Updated weights for policy 0, policy_version 77901 (0.0031) [2024-06-18 06:35:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1276379136. Throughput: 0: 42228.2. Samples: 1276504280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:35:36,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 06:35:38,917][12883] Updated weights for policy 0, policy_version 77911 (0.0038) [2024-06-18 06:35:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1276608512. Throughput: 0: 42307.1. Samples: 1276756860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 06:35:41,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 06:35:42,310][12862] Signal inference workers to stop experience collection... 
(18450 times) [2024-06-18 06:35:42,339][12883] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-06-18 06:35:42,364][12862] Signal inference workers to resume experience collection... (18450 times) [2024-06-18 06:35:42,365][12883] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-06-18 06:35:43,261][12883] Updated weights for policy 0, policy_version 77921 (0.0050) [2024-06-18 06:35:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1276805120. Throughput: 0: 42373.3. Samples: 1276888460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:35:46,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 06:35:47,291][12883] Updated weights for policy 0, policy_version 77931 (0.0050) [2024-06-18 06:35:50,842][12883] Updated weights for policy 0, policy_version 77941 (0.0039) [2024-06-18 06:35:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 1277018112. Throughput: 0: 42430.2. Samples: 1277142600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:35:51,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 06:35:54,700][12883] Updated weights for policy 0, policy_version 77951 (0.0032) [2024-06-18 06:35:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 1277231104. Throughput: 0: 42372.5. Samples: 1277398460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:35:56,995][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 06:35:58,502][12883] Updated weights for policy 0, policy_version 77961 (0.0032) [2024-06-18 06:36:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1277460480. Throughput: 0: 42519.1. Samples: 1277532420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:36:01,994][12645] Avg episode reward: [(0, '0.114')] [2024-06-18 06:36:02,166][12883] Updated weights for policy 0, policy_version 77971 (0.0033) [2024-06-18 06:36:06,346][12883] Updated weights for policy 0, policy_version 77981 (0.0036) [2024-06-18 06:36:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 1277657088. Throughput: 0: 42644.7. Samples: 1277788440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:36:06,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 06:36:09,833][12883] Updated weights for policy 0, policy_version 77991 (0.0039) [2024-06-18 06:36:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1277886464. Throughput: 0: 42572.0. Samples: 1278039040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:36:11,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 06:36:14,221][12883] Updated weights for policy 0, policy_version 78001 (0.0037) [2024-06-18 06:36:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1278099456. Throughput: 0: 42549.8. Samples: 1278166060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:36:16,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 06:36:17,617][12883] Updated weights for policy 0, policy_version 78011 (0.0034) [2024-06-18 06:36:21,803][12883] Updated weights for policy 0, policy_version 78021 (0.0033) [2024-06-18 06:36:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 1278296064. Throughput: 0: 42636.2. Samples: 1278422920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:36:21,995][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 06:36:25,209][12883] Updated weights for policy 0, policy_version 78031 (0.0041) [2024-06-18 06:36:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1278525440. Throughput: 0: 42679.5. Samples: 1278677440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:36:26,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 06:36:29,634][12883] Updated weights for policy 0, policy_version 78041 (0.0026) [2024-06-18 06:36:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1278738432. Throughput: 0: 42713.2. Samples: 1278810560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 06:36:31,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 06:36:32,976][12883] Updated weights for policy 0, policy_version 78051 (0.0040) [2024-06-18 06:36:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1278935040. Throughput: 0: 42516.8. Samples: 1279055860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 06:36:36,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 06:36:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078060_1278935040.pth... [2024-06-18 06:36:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077440_1268776960.pth [2024-06-18 06:36:37,234][12883] Updated weights for policy 0, policy_version 78061 (0.0038) [2024-06-18 06:36:40,715][12883] Updated weights for policy 0, policy_version 78071 (0.0024) [2024-06-18 06:36:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1279148032. Throughput: 0: 42464.5. Samples: 1279309360. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 06:36:41,994][12645] Avg episode reward: [(0, '0.179')] [2024-06-18 06:36:44,865][12883] Updated weights for policy 0, policy_version 78081 (0.0039) [2024-06-18 06:36:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1279361024. Throughput: 0: 42358.8. Samples: 1279438560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 06:36:47,000][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 06:36:48,222][12883] Updated weights for policy 0, policy_version 78091 (0.0044) [2024-06-18 06:36:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 1279574016. Throughput: 0: 42355.2. Samples: 1279694420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 06:36:51,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 06:36:52,314][12883] Updated weights for policy 0, policy_version 78101 (0.0031) [2024-06-18 06:36:56,289][12883] Updated weights for policy 0, policy_version 78111 (0.0025) [2024-06-18 06:36:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1279770624. Throughput: 0: 42321.4. Samples: 1279943500. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 06:36:56,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 06:36:59,989][12883] Updated weights for policy 0, policy_version 78121 (0.0033) [2024-06-18 06:37:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1280000000. Throughput: 0: 42341.0. Samples: 1280071400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 06:37:01,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 06:37:03,870][12883] Updated weights for policy 0, policy_version 78131 (0.0031) [2024-06-18 06:37:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1280212992. Throughput: 0: 42440.6. Samples: 1280332740. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 06:37:06,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 06:37:07,778][12883] Updated weights for policy 0, policy_version 78141 (0.0035) [2024-06-18 06:37:09,394][12862] Signal inference workers to stop experience collection... (18500 times) [2024-06-18 06:37:09,394][12862] Signal inference workers to resume experience collection... (18500 times) [2024-06-18 06:37:09,436][12883] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-06-18 06:37:09,437][12883] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-06-18 06:37:11,587][12883] Updated weights for policy 0, policy_version 78151 (0.0030) [2024-06-18 06:37:12,000][12645] Fps is (10 sec: 42571.2, 60 sec: 42320.9, 300 sec: 42542.0). Total num frames: 1280425984. Throughput: 0: 42325.4. Samples: 1280582340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 06:37:12,000][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 06:37:15,590][12883] Updated weights for policy 0, policy_version 78161 (0.0026) [2024-06-18 06:37:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1280622592. Throughput: 0: 42220.6. Samples: 1280710480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 06:37:16,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 06:37:19,263][12883] Updated weights for policy 0, policy_version 78171 (0.0038) [2024-06-18 06:37:21,994][12645] Fps is (10 sec: 40985.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1280835584. Throughput: 0: 42474.2. Samples: 1280967200. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:37:21,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 06:37:23,305][12883] Updated weights for policy 0, policy_version 78181 (0.0041) [2024-06-18 06:37:26,915][12883] Updated weights for policy 0, policy_version 78191 (0.0030) [2024-06-18 06:37:26,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1281081344. Throughput: 0: 42440.6. Samples: 1281219180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:37:26,994][12645] Avg episode reward: [(0, '0.122')] [2024-06-18 06:37:31,313][12883] Updated weights for policy 0, policy_version 78201 (0.0044) [2024-06-18 06:37:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1281261568. Throughput: 0: 42437.3. Samples: 1281348240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:37:31,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 06:37:34,717][12883] Updated weights for policy 0, policy_version 78211 (0.0024) [2024-06-18 06:37:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1281474560. Throughput: 0: 42319.9. Samples: 1281598820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:37:36,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 06:37:39,004][12883] Updated weights for policy 0, policy_version 78221 (0.0045) [2024-06-18 06:37:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1281703936. Throughput: 0: 42472.4. Samples: 1281854760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:37:41,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 06:37:42,478][12883] Updated weights for policy 0, policy_version 78231 (0.0036) [2024-06-18 06:37:46,995][12645] Fps is (10 sec: 40953.9, 60 sec: 42051.1, 300 sec: 42209.4). Total num frames: 1281884160. Throughput: 0: 42522.9. Samples: 1281985000. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:37:46,996][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 06:37:47,498][12883] Updated weights for policy 0, policy_version 78241 (0.0028) [2024-06-18 06:37:50,181][12883] Updated weights for policy 0, policy_version 78251 (0.0045) [2024-06-18 06:37:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42432.7). Total num frames: 1282113536. Throughput: 0: 42145.3. Samples: 1282229280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:37:51,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 06:37:55,002][12883] Updated weights for policy 0, policy_version 78261 (0.0036) [2024-06-18 06:37:56,994][12645] Fps is (10 sec: 45882.7, 60 sec: 42871.5, 300 sec: 42432.1). Total num frames: 1282342912. Throughput: 0: 42312.2. Samples: 1282486120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:37:56,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 06:37:57,890][12883] Updated weights for policy 0, policy_version 78271 (0.0034) [2024-06-18 06:38:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 1282539520. Throughput: 0: 42462.0. Samples: 1282621280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:38:01,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 06:38:02,439][12883] Updated weights for policy 0, policy_version 78281 (0.0030) [2024-06-18 06:38:05,407][12883] Updated weights for policy 0, policy_version 78291 (0.0027) [2024-06-18 06:38:07,000][12645] Fps is (10 sec: 42571.5, 60 sec: 42594.0, 300 sec: 42597.5). Total num frames: 1282768896. Throughput: 0: 42367.5. Samples: 1282874000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 06:38:07,000][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 06:38:09,913][12883] Updated weights for policy 0, policy_version 78301 (0.0032) [2024-06-18 06:38:11,993][12645] Fps is (10 sec: 42599.5, 60 sec: 42329.9, 300 sec: 42376.6). 
Total num frames: 1282965504. Throughput: 0: 42501.9. Samples: 1283131760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:11,994][12645] Avg episode reward: [(0, '0.125')] [2024-06-18 06:38:12,926][12883] Updated weights for policy 0, policy_version 78311 (0.0037) [2024-06-18 06:38:16,994][12645] Fps is (10 sec: 39346.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1283162112. Throughput: 0: 42433.8. Samples: 1283257760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:16,994][12645] Avg episode reward: [(0, '0.155')] [2024-06-18 06:38:17,857][12883] Updated weights for policy 0, policy_version 78321 (0.0029) [2024-06-18 06:38:20,892][12883] Updated weights for policy 0, policy_version 78331 (0.0041) [2024-06-18 06:38:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1283407872. Throughput: 0: 42412.1. Samples: 1283507360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:21,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 06:38:25,469][12883] Updated weights for policy 0, policy_version 78341 (0.0031) [2024-06-18 06:38:26,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42050.6, 300 sec: 42375.9). Total num frames: 1283604480. Throughput: 0: 42364.5. Samples: 1283761260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:26,996][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 06:38:28,657][12883] Updated weights for policy 0, policy_version 78351 (0.0033) [2024-06-18 06:38:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1283817472. Throughput: 0: 42382.8. Samples: 1283892160. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:31,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 06:38:32,998][12883] Updated weights for policy 0, policy_version 78361 (0.0036) [2024-06-18 06:38:36,915][12883] Updated weights for policy 0, policy_version 78371 (0.0030) [2024-06-18 06:38:36,994][12645] Fps is (10 sec: 42607.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1284030464. Throughput: 0: 42467.9. Samples: 1284140340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:36,995][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 06:38:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078371_1284030464.pth... [2024-06-18 06:38:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000077750_1273856000.pth [2024-06-18 06:38:40,636][12883] Updated weights for policy 0, policy_version 78381 (0.0023) [2024-06-18 06:38:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1284227072. Throughput: 0: 42481.7. Samples: 1284397800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:41,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 06:38:44,668][12883] Updated weights for policy 0, policy_version 78391 (0.0031) [2024-06-18 06:38:45,876][12862] Signal inference workers to stop experience collection... (18550 times) [2024-06-18 06:38:45,876][12862] Signal inference workers to resume experience collection... (18550 times) [2024-06-18 06:38:45,901][12883] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-06-18 06:38:45,901][12883] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-06-18 06:38:46,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42872.6, 300 sec: 42431.8). Total num frames: 1284456448. Throughput: 0: 42336.6. Samples: 1284526420. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:46,994][12645] Avg episode reward: [(0, '0.120')] [2024-06-18 06:38:48,768][12883] Updated weights for policy 0, policy_version 78401 (0.0028) [2024-06-18 06:38:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 1284669440. Throughput: 0: 42286.2. Samples: 1284776620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:51,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 06:38:52,170][12883] Updated weights for policy 0, policy_version 78411 (0.0030) [2024-06-18 06:38:56,245][12883] Updated weights for policy 0, policy_version 78421 (0.0035) [2024-06-18 06:38:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42321.6). Total num frames: 1284866048. Throughput: 0: 42311.1. Samples: 1285035760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 06:38:56,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 06:38:59,754][12883] Updated weights for policy 0, policy_version 78431 (0.0029) [2024-06-18 06:39:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1285079040. Throughput: 0: 42287.1. Samples: 1285160680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 06:39:01,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 06:39:03,809][12883] Updated weights for policy 0, policy_version 78441 (0.0036) [2024-06-18 06:39:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42329.8, 300 sec: 42376.3). Total num frames: 1285308416. Throughput: 0: 42393.8. Samples: 1285415080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 06:39:06,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 06:39:07,539][12883] Updated weights for policy 0, policy_version 78451 (0.0028) [2024-06-18 06:39:11,568][12883] Updated weights for policy 0, policy_version 78461 (0.0035) [2024-06-18 06:39:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42320.7). 
Total num frames: 1285505024. Throughput: 0: 42455.9. Samples: 1285671680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 06:39:11,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 06:39:15,235][12883] Updated weights for policy 0, policy_version 78471 (0.0035) [2024-06-18 06:39:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1285734400. Throughput: 0: 42338.2. Samples: 1285797380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 06:39:16,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 06:39:19,231][12883] Updated weights for policy 0, policy_version 78481 (0.0034) [2024-06-18 06:39:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1285947392. Throughput: 0: 42573.4. Samples: 1286056140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 06:39:21,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 06:39:22,923][12883] Updated weights for policy 0, policy_version 78491 (0.0041) [2024-06-18 06:39:26,910][12883] Updated weights for policy 0, policy_version 78501 (0.0049) [2024-06-18 06:39:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42600.0, 300 sec: 42431.8). Total num frames: 1286160384. Throughput: 0: 42507.2. Samples: 1286310620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 06:39:26,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 06:39:30,862][12883] Updated weights for policy 0, policy_version 78511 (0.0043) [2024-06-18 06:39:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1286373376. Throughput: 0: 42347.0. Samples: 1286432040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 06:39:31,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 06:39:34,882][12883] Updated weights for policy 0, policy_version 78521 (0.0044) [2024-06-18 06:39:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42320.7). 
Total num frames: 1286569984. Throughput: 0: 42534.0. Samples: 1286690640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 06:39:36,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 06:39:38,446][12883] Updated weights for policy 0, policy_version 78531 (0.0035) [2024-06-18 06:39:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1286799360. Throughput: 0: 42353.6. Samples: 1286941680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 06:39:41,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 06:39:43,019][12883] Updated weights for policy 0, policy_version 78541 (0.0035) [2024-06-18 06:39:46,275][12883] Updated weights for policy 0, policy_version 78551 (0.0024) [2024-06-18 06:39:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1286995968. Throughput: 0: 42404.9. Samples: 1287068900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:39:46,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 06:39:50,793][12883] Updated weights for policy 0, policy_version 78561 (0.0030) [2024-06-18 06:39:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1287208960. Throughput: 0: 42586.7. Samples: 1287331480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:39:51,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 06:39:54,148][12883] Updated weights for policy 0, policy_version 78571 (0.0039) [2024-06-18 06:39:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1287438336. Throughput: 0: 42412.0. Samples: 1287580220. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:39:56,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 06:39:58,634][12883] Updated weights for policy 0, policy_version 78581 (0.0028) [2024-06-18 06:40:01,803][12883] Updated weights for policy 0, policy_version 78591 (0.0023) [2024-06-18 06:40:01,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 1287651328. Throughput: 0: 42438.3. Samples: 1287707200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:40:01,997][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 06:40:06,186][12883] Updated weights for policy 0, policy_version 78601 (0.0038) [2024-06-18 06:40:06,707][12862] Signal inference workers to stop experience collection... (18600 times) [2024-06-18 06:40:06,708][12862] Signal inference workers to resume experience collection... (18600 times) [2024-06-18 06:40:06,726][12883] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-06-18 06:40:06,726][12883] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-06-18 06:40:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1287831552. Throughput: 0: 42324.0. Samples: 1287960720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:40:06,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 06:40:09,391][12883] Updated weights for policy 0, policy_version 78611 (0.0040) [2024-06-18 06:40:11,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1288060928. Throughput: 0: 42346.5. Samples: 1288216220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:40:11,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 06:40:13,747][12883] Updated weights for policy 0, policy_version 78621 (0.0035) [2024-06-18 06:40:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1288257536. Throughput: 0: 42506.3. 
Samples: 1288344820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:40:16,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 06:40:17,478][12883] Updated weights for policy 0, policy_version 78631 (0.0025) [2024-06-18 06:40:21,599][12883] Updated weights for policy 0, policy_version 78641 (0.0036) [2024-06-18 06:40:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1288470528. Throughput: 0: 42426.5. Samples: 1288599840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:40:21,994][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 06:40:24,939][12883] Updated weights for policy 0, policy_version 78651 (0.0035) [2024-06-18 06:40:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1288699904. Throughput: 0: 42562.6. Samples: 1288857000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:40:26,994][12645] Avg episode reward: [(0, '0.150')] [2024-06-18 06:40:29,277][12883] Updated weights for policy 0, policy_version 78661 (0.0030) [2024-06-18 06:40:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1288912896. Throughput: 0: 42712.8. Samples: 1288990980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 06:40:31,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 06:40:32,605][12883] Updated weights for policy 0, policy_version 78671 (0.0041) [2024-06-18 06:40:36,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1289093120. Throughput: 0: 42402.2. Samples: 1289239580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:40:36,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 06:40:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078681_1289109504.pth... 
[2024-06-18 06:40:37,027][12883] Updated weights for policy 0, policy_version 78681 (0.0033) [2024-06-18 06:40:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078060_1278935040.pth [2024-06-18 06:40:40,327][12883] Updated weights for policy 0, policy_version 78691 (0.0031) [2024-06-18 06:40:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1289355264. Throughput: 0: 42517.2. Samples: 1289493500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:40:41,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 06:40:44,673][12883] Updated weights for policy 0, policy_version 78701 (0.0032) [2024-06-18 06:40:46,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1289535488. Throughput: 0: 42628.3. Samples: 1289625380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:40:46,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 06:40:47,959][12883] Updated weights for policy 0, policy_version 78711 (0.0051) [2024-06-18 06:40:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1289748480. Throughput: 0: 42488.4. Samples: 1289872700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:40:51,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 06:40:52,269][12883] Updated weights for policy 0, policy_version 78721 (0.0029) [2024-06-18 06:40:55,767][12883] Updated weights for policy 0, policy_version 78731 (0.0031) [2024-06-18 06:40:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1289977856. Throughput: 0: 42570.9. Samples: 1290131900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:40:56,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 06:40:59,902][12883] Updated weights for policy 0, policy_version 78741 (0.0032) [2024-06-18 06:41:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.8, 300 sec: 42431.8). Total num frames: 1290174464. Throughput: 0: 42624.8. Samples: 1290262940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:41:01,998][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 06:41:03,663][12883] Updated weights for policy 0, policy_version 78751 (0.0026) [2024-06-18 06:41:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1290387456. Throughput: 0: 42443.5. Samples: 1290509800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:41:06,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 06:41:07,678][12883] Updated weights for policy 0, policy_version 78761 (0.0032) [2024-06-18 06:41:11,322][12883] Updated weights for policy 0, policy_version 78771 (0.0043) [2024-06-18 06:41:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 1290600448. Throughput: 0: 42589.4. Samples: 1290773520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:41:11,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 06:41:12,923][12862] Signal inference workers to stop experience collection... (18650 times) [2024-06-18 06:41:12,923][12862] Signal inference workers to resume experience collection... (18650 times) [2024-06-18 06:41:12,968][12883] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-06-18 06:41:12,968][12883] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-06-18 06:41:15,520][12883] Updated weights for policy 0, policy_version 78781 (0.0030) [2024-06-18 06:41:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1290829824. Throughput: 0: 42376.4. 
Samples: 1290897920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:41:16,996][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 06:41:19,017][12883] Updated weights for policy 0, policy_version 78791 (0.0037) [2024-06-18 06:41:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1291026432. Throughput: 0: 42428.8. Samples: 1291148880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 06:41:21,999][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 06:41:23,232][12883] Updated weights for policy 0, policy_version 78801 (0.0033) [2024-06-18 06:41:26,893][12883] Updated weights for policy 0, policy_version 78811 (0.0030) [2024-06-18 06:41:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 1291239424. Throughput: 0: 42460.2. Samples: 1291404200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 06:41:26,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 06:41:30,897][12883] Updated weights for policy 0, policy_version 78821 (0.0035) [2024-06-18 06:41:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1291436032. Throughput: 0: 42422.2. Samples: 1291534380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 06:41:31,996][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 06:41:34,415][12883] Updated weights for policy 0, policy_version 78831 (0.0032) [2024-06-18 06:41:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 1291681792. Throughput: 0: 42497.8. Samples: 1291785100. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 06:41:36,996][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 06:41:38,633][12883] Updated weights for policy 0, policy_version 78841 (0.0043) [2024-06-18 06:41:41,877][12883] Updated weights for policy 0, policy_version 78851 (0.0035) [2024-06-18 06:41:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1291894784. Throughput: 0: 42499.4. Samples: 1292044380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 06:41:41,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 06:41:46,264][12883] Updated weights for policy 0, policy_version 78861 (0.0040) [2024-06-18 06:41:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1292075008. Throughput: 0: 42364.0. Samples: 1292169320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 06:41:46,994][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 06:41:49,344][12883] Updated weights for policy 0, policy_version 78871 (0.0025) [2024-06-18 06:41:51,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1292320768. Throughput: 0: 42614.4. Samples: 1292427440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 06:41:51,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 06:41:54,104][12883] Updated weights for policy 0, policy_version 78881 (0.0042) [2024-06-18 06:41:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1292533760. Throughput: 0: 42478.1. Samples: 1292685040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 06:41:56,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 06:41:57,052][12883] Updated weights for policy 0, policy_version 78891 (0.0038) [2024-06-18 06:42:01,675][12883] Updated weights for policy 0, policy_version 78901 (0.0028) [2024-06-18 06:42:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1292713984. Throughput: 0: 42538.2. Samples: 1292812140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 06:42:01,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 06:42:04,980][12883] Updated weights for policy 0, policy_version 78911 (0.0036) [2024-06-18 06:42:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42432.7). Total num frames: 1292943360. Throughput: 0: 42459.5. Samples: 1293059560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 06:42:06,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 06:42:09,220][12883] Updated weights for policy 0, policy_version 78921 (0.0038) [2024-06-18 06:42:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1293172736. Throughput: 0: 42588.7. Samples: 1293320700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:11,995][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 06:42:12,500][12883] Updated weights for policy 0, policy_version 78931 (0.0034) [2024-06-18 06:42:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1293352960. Throughput: 0: 42512.0. Samples: 1293447420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:16,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 06:42:17,173][12883] Updated weights for policy 0, policy_version 78941 (0.0039) [2024-06-18 06:42:18,405][12862] Signal inference workers to stop experience collection... 
(18700 times) [2024-06-18 06:42:18,445][12883] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-06-18 06:42:18,465][12862] Signal inference workers to resume experience collection... (18700 times) [2024-06-18 06:42:18,466][12883] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-06-18 06:42:20,062][12883] Updated weights for policy 0, policy_version 78951 (0.0029) [2024-06-18 06:42:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1293598720. Throughput: 0: 42495.1. Samples: 1293697380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:21,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 06:42:24,669][12883] Updated weights for policy 0, policy_version 78961 (0.0039) [2024-06-18 06:42:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1293811712. Throughput: 0: 42664.0. Samples: 1293964260. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:26,994][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 06:42:27,726][12883] Updated weights for policy 0, policy_version 78971 (0.0034) [2024-06-18 06:42:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1294008320. Throughput: 0: 42687.1. Samples: 1294090240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:31,994][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 06:42:32,473][12883] Updated weights for policy 0, policy_version 78981 (0.0038) [2024-06-18 06:42:35,316][12883] Updated weights for policy 0, policy_version 78991 (0.0030) [2024-06-18 06:42:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1294221312. Throughput: 0: 42519.0. Samples: 1294340800. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:36,994][12645] Avg episode reward: [(0, '0.072')] [2024-06-18 06:42:37,068][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078994_1294237696.pth... [2024-06-18 06:42:37,118][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078371_1284030464.pth [2024-06-18 06:42:40,019][12883] Updated weights for policy 0, policy_version 79001 (0.0043) [2024-06-18 06:42:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42543.1). Total num frames: 1294434304. Throughput: 0: 42600.9. Samples: 1294602080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:41,994][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 06:42:42,963][12883] Updated weights for policy 0, policy_version 79011 (0.0034) [2024-06-18 06:42:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1294647296. Throughput: 0: 42588.0. Samples: 1294728600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:46,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 06:42:47,663][12883] Updated weights for policy 0, policy_version 79021 (0.0034) [2024-06-18 06:42:50,555][12883] Updated weights for policy 0, policy_version 79031 (0.0031) [2024-06-18 06:42:51,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1294876672. Throughput: 0: 42536.6. Samples: 1294973700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:51,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 06:42:55,560][12883] Updated weights for policy 0, policy_version 79041 (0.0039) [2024-06-18 06:42:56,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.9, 300 sec: 42542.6). Total num frames: 1295089664. Throughput: 0: 42631.3. Samples: 1295239200. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 06:42:56,996][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 06:42:58,159][12883] Updated weights for policy 0, policy_version 79051 (0.0029) [2024-06-18 06:43:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42432.7). Total num frames: 1295286272. Throughput: 0: 42604.0. Samples: 1295364600. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:01,994][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 06:43:03,219][12883] Updated weights for policy 0, policy_version 79061 (0.0036) [2024-06-18 06:43:05,895][12883] Updated weights for policy 0, policy_version 79071 (0.0038) [2024-06-18 06:43:06,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1295515648. Throughput: 0: 42610.2. Samples: 1295614840. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:06,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 06:43:10,974][12883] Updated weights for policy 0, policy_version 79081 (0.0033) [2024-06-18 06:43:11,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1295728640. Throughput: 0: 42580.4. Samples: 1295880380. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:11,995][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 06:43:13,619][12883] Updated weights for policy 0, policy_version 79091 (0.0041) [2024-06-18 06:43:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1295908864. Throughput: 0: 42443.5. Samples: 1296000200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:16,994][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 06:43:18,763][12883] Updated weights for policy 0, policy_version 79101 (0.0038) [2024-06-18 06:43:21,719][12883] Updated weights for policy 0, policy_version 79111 (0.0029) [2024-06-18 06:43:21,997][12645] Fps is (10 sec: 42584.1, 60 sec: 42595.9, 300 sec: 42542.7). 
Total num frames: 1296154624. Throughput: 0: 42425.6. Samples: 1296250100. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:21,998][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 06:43:26,370][12883] Updated weights for policy 0, policy_version 79121 (0.0027) [2024-06-18 06:43:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1296351232. Throughput: 0: 42380.4. Samples: 1296509200. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:26,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 06:43:29,608][12883] Updated weights for policy 0, policy_version 79131 (0.0036) [2024-06-18 06:43:31,994][12645] Fps is (10 sec: 37697.0, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1296531456. Throughput: 0: 42270.8. Samples: 1296630780. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:31,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 06:43:33,866][12883] Updated weights for policy 0, policy_version 79141 (0.0040) [2024-06-18 06:43:34,578][12862] Signal inference workers to stop experience collection... (18750 times) [2024-06-18 06:43:34,578][12862] Signal inference workers to resume experience collection... (18750 times) [2024-06-18 06:43:34,593][12883] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-06-18 06:43:34,593][12883] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-06-18 06:43:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1296777216. Throughput: 0: 42502.6. Samples: 1296886320. 
Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:36,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 06:43:37,282][12883] Updated weights for policy 0, policy_version 79151 (0.0025) [2024-06-18 06:43:41,759][12883] Updated weights for policy 0, policy_version 79161 (0.0031) [2024-06-18 06:43:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1296973824. Throughput: 0: 42410.1. Samples: 1297147560. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:41,994][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 06:43:45,032][12883] Updated weights for policy 0, policy_version 79171 (0.0029) [2024-06-18 06:43:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1297186816. Throughput: 0: 42389.7. Samples: 1297272140. Policy #0 lag: (min: 0.0, avg: 12.9, max: 21.0) [2024-06-18 06:43:46,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 06:43:49,349][12883] Updated weights for policy 0, policy_version 79181 (0.0039) [2024-06-18 06:43:51,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1297432576. Throughput: 0: 42447.2. Samples: 1297524960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:43:51,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 06:43:52,646][12883] Updated weights for policy 0, policy_version 79191 (0.0037) [2024-06-18 06:43:56,952][12883] Updated weights for policy 0, policy_version 79201 (0.0029) [2024-06-18 06:43:56,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42325.3, 300 sec: 42542.5). Total num frames: 1297629184. Throughput: 0: 42320.7. Samples: 1297784900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:43:56,997][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 06:44:00,388][12883] Updated weights for policy 0, policy_version 79211 (0.0039) [2024-06-18 06:44:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42376.2). 
Total num frames: 1297809408. Throughput: 0: 42331.7. Samples: 1297905120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:44:01,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 06:44:04,606][12883] Updated weights for policy 0, policy_version 79221 (0.0023) [2024-06-18 06:44:06,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1298071552. Throughput: 0: 42574.9. Samples: 1298165820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:44:06,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 06:44:07,951][12883] Updated weights for policy 0, policy_version 79231 (0.0024) [2024-06-18 06:44:11,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1298268160. Throughput: 0: 42611.2. Samples: 1298426700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:44:11,994][12645] Avg episode reward: [(0, '0.060')] [2024-06-18 06:44:12,476][12883] Updated weights for policy 0, policy_version 79241 (0.0026) [2024-06-18 06:44:15,583][12883] Updated weights for policy 0, policy_version 79251 (0.0028) [2024-06-18 06:44:16,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1298464768. Throughput: 0: 42647.7. Samples: 1298549940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:44:16,994][12645] Avg episode reward: [(0, '0.090')] [2024-06-18 06:44:20,318][12883] Updated weights for policy 0, policy_version 79261 (0.0042) [2024-06-18 06:44:21,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42326.2, 300 sec: 42487.0). Total num frames: 1298694144. Throughput: 0: 42658.3. Samples: 1298806040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:44:21,996][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 06:44:23,610][12883] Updated weights for policy 0, policy_version 79271 (0.0037) [2024-06-18 06:44:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42431.8). 
Total num frames: 1298890752. Throughput: 0: 42453.7. Samples: 1299057980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:44:26,994][12645] Avg episode reward: [(0, '0.636')] [2024-06-18 06:44:28,097][12883] Updated weights for policy 0, policy_version 79281 (0.0040) [2024-06-18 06:44:31,678][12883] Updated weights for policy 0, policy_version 79291 (0.0032) [2024-06-18 06:44:32,000][12645] Fps is (10 sec: 40943.7, 60 sec: 42866.9, 300 sec: 42486.4). Total num frames: 1299103744. Throughput: 0: 42397.3. Samples: 1299180280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:44:32,000][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 06:44:35,581][12883] Updated weights for policy 0, policy_version 79301 (0.0038) [2024-06-18 06:44:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1299333120. Throughput: 0: 42493.7. Samples: 1299437180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:44:36,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 06:44:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079305_1299333120.pth... [2024-06-18 06:44:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078681_1289109504.pth [2024-06-18 06:44:39,299][12883] Updated weights for policy 0, policy_version 79311 (0.0029) [2024-06-18 06:44:41,994][12645] Fps is (10 sec: 42624.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1299529728. Throughput: 0: 42583.4. Samples: 1299701060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:44:42,000][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 06:44:43,146][12883] Updated weights for policy 0, policy_version 79321 (0.0029) [2024-06-18 06:44:46,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 1299742720. Throughput: 0: 42545.8. Samples: 1299819780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:44:46,997][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 06:44:47,218][12883] Updated weights for policy 0, policy_version 79331 (0.0026) [2024-06-18 06:44:50,832][12883] Updated weights for policy 0, policy_version 79341 (0.0028) [2024-06-18 06:44:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1299988480. Throughput: 0: 42565.7. Samples: 1300081280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:44:51,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 06:44:54,745][12883] Updated weights for policy 0, policy_version 79351 (0.0043) [2024-06-18 06:44:56,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42326.9, 300 sec: 42432.1). Total num frames: 1300168704. Throughput: 0: 42569.9. Samples: 1300342340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:44:56,994][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 06:44:58,473][12883] Updated weights for policy 0, policy_version 79361 (0.0036) [2024-06-18 06:45:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1300381696. Throughput: 0: 42534.3. Samples: 1300463980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:45:01,996][12645] Avg episode reward: [(0, '0.138')] [2024-06-18 06:45:02,571][12883] Updated weights for policy 0, policy_version 79371 (0.0043) [2024-06-18 06:45:06,091][12883] Updated weights for policy 0, policy_version 79381 (0.0034) [2024-06-18 06:45:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1300611072. Throughput: 0: 42625.6. Samples: 1300724100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:45:06,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 06:45:09,944][12883] Updated weights for policy 0, policy_version 79391 (0.0037) [2024-06-18 06:45:11,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42323.8, 300 sec: 42542.5). 
Total num frames: 1300807680. Throughput: 0: 42750.9. Samples: 1300981860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:45:11,996][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 06:45:12,576][12862] Signal inference workers to stop experience collection... (18800 times) [2024-06-18 06:45:12,616][12883] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-06-18 06:45:12,639][12862] Signal inference workers to resume experience collection... (18800 times) [2024-06-18 06:45:12,644][12883] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-06-18 06:45:13,830][12883] Updated weights for policy 0, policy_version 79401 (0.0040) [2024-06-18 06:45:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1301037056. Throughput: 0: 42804.1. Samples: 1301106200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:45:16,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 06:45:17,446][12883] Updated weights for policy 0, policy_version 79411 (0.0030) [2024-06-18 06:45:21,651][12883] Updated weights for policy 0, policy_version 79421 (0.0032) [2024-06-18 06:45:21,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 1301250048. Throughput: 0: 42875.6. Samples: 1301366580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 06:45:21,994][12645] Avg episode reward: [(0, '0.228')] [2024-06-18 06:45:24,867][12883] Updated weights for policy 0, policy_version 79431 (0.0047) [2024-06-18 06:45:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1301446656. Throughput: 0: 42742.7. Samples: 1301624480. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:45:26,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 06:45:29,072][12883] Updated weights for policy 0, policy_version 79441 (0.0031) [2024-06-18 06:45:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43149.1, 300 sec: 42709.5). Total num frames: 1301692416. Throughput: 0: 42934.2. Samples: 1301751720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:45:31,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 06:45:32,791][12883] Updated weights for policy 0, policy_version 79451 (0.0025) [2024-06-18 06:45:36,931][12883] Updated weights for policy 0, policy_version 79461 (0.0032) [2024-06-18 06:45:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1301889024. Throughput: 0: 42840.4. Samples: 1302009100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:45:36,998][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 06:45:40,335][12883] Updated weights for policy 0, policy_version 79471 (0.0037) [2024-06-18 06:45:41,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1302085632. Throughput: 0: 42832.8. Samples: 1302269820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:45:41,995][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 06:45:44,453][12883] Updated weights for policy 0, policy_version 79481 (0.0035) [2024-06-18 06:45:46,995][12645] Fps is (10 sec: 44231.2, 60 sec: 43145.2, 300 sec: 42653.8). Total num frames: 1302331392. Throughput: 0: 42826.8. Samples: 1302391240. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:45:46,996][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 06:45:48,025][12883] Updated weights for policy 0, policy_version 79491 (0.0030) [2024-06-18 06:45:51,992][12883] Updated weights for policy 0, policy_version 79501 (0.0029) [2024-06-18 06:45:51,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1302544384. Throughput: 0: 42718.7. Samples: 1302646440. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:45:51,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 06:45:55,646][12883] Updated weights for policy 0, policy_version 79511 (0.0031) [2024-06-18 06:45:56,994][12645] Fps is (10 sec: 39326.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1302724608. Throughput: 0: 42583.8. Samples: 1302898040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:45:56,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 06:45:59,549][12883] Updated weights for policy 0, policy_version 79521 (0.0040) [2024-06-18 06:46:01,993][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1302953984. Throughput: 0: 42608.6. Samples: 1303023580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:46:01,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 06:46:03,504][12883] Updated weights for policy 0, policy_version 79531 (0.0038) [2024-06-18 06:46:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1303183360. Throughput: 0: 42556.1. Samples: 1303281600. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:46:06,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 06:46:07,430][12883] Updated weights for policy 0, policy_version 79541 (0.0027) [2024-06-18 06:46:11,308][12883] Updated weights for policy 0, policy_version 79551 (0.0033) [2024-06-18 06:46:11,994][12645] Fps is (10 sec: 42597.2, 60 sec: 42873.0, 300 sec: 42542.9). Total num frames: 1303379968. Throughput: 0: 42446.1. Samples: 1303534560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 06:46:11,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 06:46:15,011][12883] Updated weights for policy 0, policy_version 79561 (0.0027) [2024-06-18 06:46:16,994][12645] Fps is (10 sec: 37682.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1303560192. Throughput: 0: 42407.8. Samples: 1303660080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 06:46:16,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 06:46:19,006][12883] Updated weights for policy 0, policy_version 79571 (0.0032) [2024-06-18 06:46:21,996][12645] Fps is (10 sec: 42589.5, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1303805952. Throughput: 0: 42300.2. Samples: 1303912700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 06:46:21,996][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 06:46:22,719][12883] Updated weights for policy 0, policy_version 79581 (0.0029) [2024-06-18 06:46:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1304002560. Throughput: 0: 42223.7. Samples: 1304169880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 06:46:26,996][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 06:46:27,071][12883] Updated weights for policy 0, policy_version 79591 (0.0035) [2024-06-18 06:46:29,361][12862] Signal inference workers to stop experience collection... 
(18850 times) [2024-06-18 06:46:29,388][12883] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-06-18 06:46:29,423][12862] Signal inference workers to resume experience collection... (18850 times) [2024-06-18 06:46:29,424][12883] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-06-18 06:46:30,273][12883] Updated weights for policy 0, policy_version 79601 (0.0030) [2024-06-18 06:46:31,994][12645] Fps is (10 sec: 39330.7, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 1304199168. Throughput: 0: 42268.0. Samples: 1304293240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 06:46:31,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 06:46:34,873][12883] Updated weights for policy 0, policy_version 79611 (0.0044) [2024-06-18 06:46:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1304428544. Throughput: 0: 42284.5. Samples: 1304549240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 06:46:36,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 06:46:37,035][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079617_1304444928.pth... [2024-06-18 06:46:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000078994_1294237696.pth [2024-06-18 06:46:37,909][12883] Updated weights for policy 0, policy_version 79621 (0.0028) [2024-06-18 06:46:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 1304625152. Throughput: 0: 42421.9. Samples: 1304807020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 06:46:41,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 06:46:42,672][12883] Updated weights for policy 0, policy_version 79631 (0.0038) [2024-06-18 06:46:45,641][12883] Updated weights for policy 0, policy_version 79641 (0.0033) [2024-06-18 06:46:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42053.2, 300 sec: 42487.3). 
Total num frames: 1304854528. Throughput: 0: 42458.6. Samples: 1304934220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 06:46:46,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 06:46:50,298][12883] Updated weights for policy 0, policy_version 79651 (0.0022) [2024-06-18 06:46:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1305067520. Throughput: 0: 42408.4. Samples: 1305189980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 06:46:51,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 06:46:53,404][12883] Updated weights for policy 0, policy_version 79661 (0.0036) [2024-06-18 06:46:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1305280512. Throughput: 0: 42470.3. Samples: 1305445720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 06:46:56,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 06:46:57,884][12883] Updated weights for policy 0, policy_version 79671 (0.0037) [2024-06-18 06:47:00,971][12883] Updated weights for policy 0, policy_version 79681 (0.0033) [2024-06-18 06:47:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 1305509888. Throughput: 0: 42539.1. Samples: 1305574340. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:01,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 06:47:05,744][12883] Updated weights for policy 0, policy_version 79691 (0.0029) [2024-06-18 06:47:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.1, 300 sec: 42542.9). Total num frames: 1305722880. Throughput: 0: 42628.6. Samples: 1305830900. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:07,000][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 06:47:08,696][12883] Updated weights for policy 0, policy_version 79701 (0.0021) [2024-06-18 06:47:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42598.4). 
Total num frames: 1305919488. Throughput: 0: 42587.8. Samples: 1306086340. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:11,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 06:47:13,512][12883] Updated weights for policy 0, policy_version 79711 (0.0033) [2024-06-18 06:47:16,413][12883] Updated weights for policy 0, policy_version 79721 (0.0030) [2024-06-18 06:47:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1306165248. Throughput: 0: 42654.9. Samples: 1306212720. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:16,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 06:47:21,056][12883] Updated weights for policy 0, policy_version 79731 (0.0028) [2024-06-18 06:47:21,994][12645] Fps is (10 sec: 45876.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 1306378240. Throughput: 0: 42917.2. Samples: 1306480520. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:21,994][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 06:47:24,075][12883] Updated weights for policy 0, policy_version 79741 (0.0036) [2024-06-18 06:47:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1306558464. Throughput: 0: 42738.2. Samples: 1306730240. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:26,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 06:47:28,685][12883] Updated weights for policy 0, policy_version 79751 (0.0027) [2024-06-18 06:47:31,746][12883] Updated weights for policy 0, policy_version 79761 (0.0046) [2024-06-18 06:47:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1306804224. Throughput: 0: 42651.4. Samples: 1306853540. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:31,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 06:47:36,282][12883] Updated weights for policy 0, policy_version 79771 (0.0034) [2024-06-18 06:47:36,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1307000832. Throughput: 0: 42816.7. Samples: 1307116740. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:36,995][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 06:47:39,560][12883] Updated weights for policy 0, policy_version 79781 (0.0035) [2024-06-18 06:47:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1307197440. Throughput: 0: 42676.0. Samples: 1307366140. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:41,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 06:47:43,893][12883] Updated weights for policy 0, policy_version 79791 (0.0034) [2024-06-18 06:47:45,935][12862] Signal inference workers to stop experience collection... (18900 times) [2024-06-18 06:47:46,000][12883] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-06-18 06:47:46,051][12862] Signal inference workers to resume experience collection... (18900 times) [2024-06-18 06:47:46,052][12883] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-06-18 06:47:46,994][12645] Fps is (10 sec: 40961.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1307410432. Throughput: 0: 42607.7. Samples: 1307491680. Policy #0 lag: (min: 1.0, avg: 11.7, max: 20.0) [2024-06-18 06:47:46,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 06:47:47,528][12883] Updated weights for policy 0, policy_version 79801 (0.0027) [2024-06-18 06:47:51,671][12883] Updated weights for policy 0, policy_version 79811 (0.0037) [2024-06-18 06:47:51,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42869.8, 300 sec: 42542.9). Total num frames: 1307639808. Throughput: 0: 42719.4. 
Samples: 1307753360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:47:51,997][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 06:47:55,270][12883] Updated weights for policy 0, policy_version 79821 (0.0034) [2024-06-18 06:47:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1307852800. Throughput: 0: 42529.2. Samples: 1308000140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:47:56,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 06:47:59,480][12883] Updated weights for policy 0, policy_version 79831 (0.0025) [2024-06-18 06:48:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1308065792. Throughput: 0: 42597.8. Samples: 1308129620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:01,998][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 06:48:02,846][12883] Updated weights for policy 0, policy_version 79841 (0.0046) [2024-06-18 06:48:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.5, 300 sec: 42487.4). Total num frames: 1308262400. Throughput: 0: 42378.7. Samples: 1308387560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:06,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 06:48:07,089][12883] Updated weights for policy 0, policy_version 79851 (0.0033) [2024-06-18 06:48:10,732][12883] Updated weights for policy 0, policy_version 79861 (0.0034) [2024-06-18 06:48:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1308491776. Throughput: 0: 42270.1. Samples: 1308632400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:11,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 06:48:14,841][12883] Updated weights for policy 0, policy_version 79871 (0.0044) [2024-06-18 06:48:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42487.8). Total num frames: 1308688384. Throughput: 0: 42549.3. 
Samples: 1308768260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:16,995][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 06:48:18,366][12883] Updated weights for policy 0, policy_version 79881 (0.0033) [2024-06-18 06:48:21,994][12645] Fps is (10 sec: 37683.8, 60 sec: 41506.2, 300 sec: 42431.8). Total num frames: 1308868608. Throughput: 0: 42297.1. Samples: 1309020100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:21,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 06:48:22,572][12883] Updated weights for policy 0, policy_version 79891 (0.0026) [2024-06-18 06:48:26,178][12883] Updated weights for policy 0, policy_version 79901 (0.0033) [2024-06-18 06:48:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1309130752. Throughput: 0: 42400.4. Samples: 1309274160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:26,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 06:48:30,254][12883] Updated weights for policy 0, policy_version 79911 (0.0035) [2024-06-18 06:48:31,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42050.8, 300 sec: 42542.5). Total num frames: 1309327360. Throughput: 0: 42579.6. Samples: 1309407860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:31,996][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 06:48:33,930][12883] Updated weights for policy 0, policy_version 79921 (0.0043) [2024-06-18 06:48:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 1309507584. Throughput: 0: 42156.2. Samples: 1309650300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:36,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 06:48:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079926_1309507584.pth... 
[2024-06-18 06:48:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079305_1299333120.pth [2024-06-18 06:48:37,988][12883] Updated weights for policy 0, policy_version 79931 (0.0032) [2024-06-18 06:48:41,661][12883] Updated weights for policy 0, policy_version 79941 (0.0036) [2024-06-18 06:48:41,995][12645] Fps is (10 sec: 42603.9, 60 sec: 42597.8, 300 sec: 42598.3). Total num frames: 1309753344. Throughput: 0: 42334.6. Samples: 1309905240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:41,995][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 06:48:45,788][12883] Updated weights for policy 0, policy_version 79951 (0.0032) [2024-06-18 06:48:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 1309966336. Throughput: 0: 42440.4. Samples: 1310039440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:46,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 06:48:49,422][12883] Updated weights for policy 0, policy_version 79961 (0.0024) [2024-06-18 06:48:51,994][12645] Fps is (10 sec: 40963.4, 60 sec: 42053.8, 300 sec: 42487.6). Total num frames: 1310162944. Throughput: 0: 42335.9. Samples: 1310292680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:51,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 06:48:53,431][12883] Updated weights for policy 0, policy_version 79971 (0.0039) [2024-06-18 06:48:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1310392320. Throughput: 0: 42455.7. Samples: 1310542900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:48:56,994][12645] Avg episode reward: [(0, '0.212')] [2024-06-18 06:48:57,053][12883] Updated weights for policy 0, policy_version 79981 (0.0039) [2024-06-18 06:49:01,177][12883] Updated weights for policy 0, policy_version 79991 (0.0045) [2024-06-18 06:49:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1310605312. Throughput: 0: 42388.2. Samples: 1310675720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:49:01,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 06:49:03,960][12862] Signal inference workers to stop experience collection... (18950 times) [2024-06-18 06:49:04,006][12883] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-06-18 06:49:04,015][12862] Signal inference workers to resume experience collection... (18950 times) [2024-06-18 06:49:04,028][12883] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-06-18 06:49:04,665][12883] Updated weights for policy 0, policy_version 80001 (0.0040) [2024-06-18 06:49:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1310801920. Throughput: 0: 42427.4. Samples: 1310929340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:49:06,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 06:49:08,812][12883] Updated weights for policy 0, policy_version 80011 (0.0028) [2024-06-18 06:49:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1311047680. Throughput: 0: 42321.8. Samples: 1311178640. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:49:11,999][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 06:49:12,346][12883] Updated weights for policy 0, policy_version 80021 (0.0042) [2024-06-18 06:49:16,395][12883] Updated weights for policy 0, policy_version 80031 (0.0028) [2024-06-18 06:49:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 1311244288. Throughput: 0: 42246.9. Samples: 1311308880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:49:16,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 06:49:20,456][12883] Updated weights for policy 0, policy_version 80041 (0.0030) [2024-06-18 06:49:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1311440896. Throughput: 0: 42568.9. Samples: 1311565900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:49:21,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 06:49:23,882][12883] Updated weights for policy 0, policy_version 80051 (0.0037) [2024-06-18 06:49:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 1311670272. Throughput: 0: 42556.4. Samples: 1311820240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 06:49:26,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 06:49:28,026][12883] Updated weights for policy 0, policy_version 80061 (0.0035) [2024-06-18 06:49:31,991][12883] Updated weights for policy 0, policy_version 80071 (0.0038) [2024-06-18 06:49:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 1311883264. Throughput: 0: 42430.3. Samples: 1311948800. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 06:49:31,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 06:49:35,924][12883] Updated weights for policy 0, policy_version 80081 (0.0033) [2024-06-18 06:49:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1312079872. Throughput: 0: 42440.0. Samples: 1312202480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 06:49:36,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 06:49:39,810][12883] Updated weights for policy 0, policy_version 80091 (0.0021) [2024-06-18 06:49:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42599.0, 300 sec: 42598.7). Total num frames: 1312309248. Throughput: 0: 42408.8. Samples: 1312451300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 06:49:41,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 06:49:44,095][12883] Updated weights for policy 0, policy_version 80101 (0.0028) [2024-06-18 06:49:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1312522240. Throughput: 0: 42439.0. Samples: 1312585480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 06:49:46,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 06:49:47,454][12883] Updated weights for policy 0, policy_version 80111 (0.0033) [2024-06-18 06:49:51,994][12645] Fps is (10 sec: 37683.9, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1312686080. Throughput: 0: 42241.1. Samples: 1312830180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 06:49:51,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 06:49:52,074][12883] Updated weights for policy 0, policy_version 80121 (0.0030) [2024-06-18 06:49:55,085][12883] Updated weights for policy 0, policy_version 80131 (0.0031) [2024-06-18 06:49:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1312948224. Throughput: 0: 42319.6. Samples: 1313083020. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 06:49:56,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 06:50:00,005][12883] Updated weights for policy 0, policy_version 80141 (0.0035) [2024-06-18 06:50:01,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1313144832. Throughput: 0: 42490.2. Samples: 1313220940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 06:50:01,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 06:50:03,164][12883] Updated weights for policy 0, policy_version 80151 (0.0050) [2024-06-18 06:50:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42487.6). Total num frames: 1313341440. Throughput: 0: 42181.4. Samples: 1313464060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 06:50:06,994][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 06:50:07,529][12883] Updated weights for policy 0, policy_version 80161 (0.0041) [2024-06-18 06:50:10,784][12883] Updated weights for policy 0, policy_version 80171 (0.0028) [2024-06-18 06:50:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1313587200. Throughput: 0: 42229.7. Samples: 1313720580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 06:50:11,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 06:50:15,091][12883] Updated weights for policy 0, policy_version 80181 (0.0038) [2024-06-18 06:50:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1313767424. Throughput: 0: 42290.3. Samples: 1313851860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:50:16,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 06:50:18,679][12883] Updated weights for policy 0, policy_version 80191 (0.0043) [2024-06-18 06:50:19,419][12862] Signal inference workers to stop experience collection... 
(19000 times) [2024-06-18 06:50:19,451][12883] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-06-18 06:50:19,537][12862] Signal inference workers to resume experience collection... (19000 times) [2024-06-18 06:50:19,537][12883] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-06-18 06:50:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1313996800. Throughput: 0: 42091.7. Samples: 1314096600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:50:21,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 06:50:22,587][12883] Updated weights for policy 0, policy_version 80201 (0.0036) [2024-06-18 06:50:26,248][12883] Updated weights for policy 0, policy_version 80211 (0.0028) [2024-06-18 06:50:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1314209792. Throughput: 0: 42373.4. Samples: 1314358100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:50:26,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 06:50:30,180][12883] Updated weights for policy 0, policy_version 80221 (0.0036) [2024-06-18 06:50:31,994][12645] Fps is (10 sec: 39321.0, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1314390016. Throughput: 0: 42141.8. Samples: 1314481860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:50:31,994][12645] Avg episode reward: [(0, '0.666')] [2024-06-18 06:50:33,926][12883] Updated weights for policy 0, policy_version 80231 (0.0042) [2024-06-18 06:50:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1314635776. Throughput: 0: 42317.6. Samples: 1314734480. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:50:36,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 06:50:37,086][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080240_1314652160.pth... 
[2024-06-18 06:50:37,132][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079617_1304444928.pth [2024-06-18 06:50:37,654][12883] Updated weights for policy 0, policy_version 80241 (0.0042) [2024-06-18 06:50:41,708][12883] Updated weights for policy 0, policy_version 80251 (0.0039) [2024-06-18 06:50:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42432.0). Total num frames: 1314848768. Throughput: 0: 42556.8. Samples: 1314998080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:50:41,994][12645] Avg episode reward: [(0, '0.174')] [2024-06-18 06:50:46,066][12883] Updated weights for policy 0, policy_version 80261 (0.0038) [2024-06-18 06:50:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 1315028992. Throughput: 0: 42325.4. Samples: 1315125580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:50:46,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 06:50:49,347][12883] Updated weights for policy 0, policy_version 80271 (0.0037) [2024-06-18 06:50:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1315241984. Throughput: 0: 42351.5. Samples: 1315369880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:50:51,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 06:50:53,766][12883] Updated weights for policy 0, policy_version 80281 (0.0027) [2024-06-18 06:50:56,795][12883] Updated weights for policy 0, policy_version 80291 (0.0034) [2024-06-18 06:50:56,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1315504128. Throughput: 0: 42464.9. Samples: 1315631500. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:50:56,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 06:51:01,546][12883] Updated weights for policy 0, policy_version 80301 (0.0032) [2024-06-18 06:51:01,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42050.7, 300 sec: 42320.4). Total num frames: 1315667968. Throughput: 0: 42456.5. Samples: 1315762500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 06:51:01,997][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 06:51:04,551][12883] Updated weights for policy 0, policy_version 80311 (0.0036) [2024-06-18 06:51:06,996][12645] Fps is (10 sec: 39313.2, 60 sec: 42596.9, 300 sec: 42431.5). Total num frames: 1315897344. Throughput: 0: 42455.6. Samples: 1316007200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:06,996][12645] Avg episode reward: [(0, '0.296')] [2024-06-18 06:51:09,379][12883] Updated weights for policy 0, policy_version 80321 (0.0023) [2024-06-18 06:51:11,994][12645] Fps is (10 sec: 44247.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1316110336. Throughput: 0: 42290.8. Samples: 1316261180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:12,000][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 06:51:12,352][12883] Updated weights for policy 0, policy_version 80331 (0.0038) [2024-06-18 06:51:16,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42052.3, 300 sec: 42321.0). Total num frames: 1316290560. Throughput: 0: 42331.7. Samples: 1316386780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:16,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 06:51:17,270][12883] Updated weights for policy 0, policy_version 80341 (0.0046) [2024-06-18 06:51:20,310][12883] Updated weights for policy 0, policy_version 80351 (0.0045) [2024-06-18 06:51:21,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 1316552704. Throughput: 0: 42315.9. Samples: 1316638700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:21,994][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 06:51:25,017][12883] Updated weights for policy 0, policy_version 80361 (0.0039) [2024-06-18 06:51:26,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1316749312. Throughput: 0: 42282.6. Samples: 1316900800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:26,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 06:51:28,082][12883] Updated weights for policy 0, policy_version 80371 (0.0032) [2024-06-18 06:51:31,994][12645] Fps is (10 sec: 36045.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1316913152. Throughput: 0: 42095.0. Samples: 1317019860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:31,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 06:51:33,163][12883] Updated weights for policy 0, policy_version 80381 (0.0043) [2024-06-18 06:51:33,636][12862] Signal inference workers to stop experience collection... (19050 times) [2024-06-18 06:51:33,636][12862] Signal inference workers to resume experience collection... (19050 times) [2024-06-18 06:51:33,677][12883] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-06-18 06:51:33,677][12883] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-06-18 06:51:35,673][12883] Updated weights for policy 0, policy_version 80391 (0.0023) [2024-06-18 06:51:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1317175296. Throughput: 0: 42326.8. Samples: 1317274580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:36,994][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 06:51:40,915][12883] Updated weights for policy 0, policy_version 80401 (0.0036) [2024-06-18 06:51:41,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1317371904. Throughput: 0: 42208.1. 
Samples: 1317530860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:41,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 06:51:43,213][12883] Updated weights for policy 0, policy_version 80411 (0.0030) [2024-06-18 06:51:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1317568512. Throughput: 0: 41996.8. Samples: 1317652260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:46,994][12645] Avg episode reward: [(0, '0.100')] [2024-06-18 06:51:48,583][12883] Updated weights for policy 0, policy_version 80421 (0.0022) [2024-06-18 06:51:50,807][12883] Updated weights for policy 0, policy_version 80431 (0.0032) [2024-06-18 06:51:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1317830656. Throughput: 0: 42297.5. Samples: 1317910500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 06:51:51,994][12645] Avg episode reward: [(0, '0.110')] [2024-06-18 06:51:56,236][12883] Updated weights for policy 0, policy_version 80441 (0.0031) [2024-06-18 06:51:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 41233.0, 300 sec: 42265.2). Total num frames: 1317978112. Throughput: 0: 42492.7. Samples: 1318173360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:51:56,995][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 06:51:58,573][12883] Updated weights for policy 0, policy_version 80451 (0.0031) [2024-06-18 06:52:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42600.0, 300 sec: 42376.3). Total num frames: 1318223872. Throughput: 0: 42151.0. Samples: 1318283580. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:52:01,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 06:52:03,861][12883] Updated weights for policy 0, policy_version 80461 (0.0027) [2024-06-18 06:52:06,607][12883] Updated weights for policy 0, policy_version 80471 (0.0045) [2024-06-18 06:52:06,994][12645] Fps is (10 sec: 49152.0, 60 sec: 42873.0, 300 sec: 42542.9). Total num frames: 1318469632. Throughput: 0: 42389.8. Samples: 1318546240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:52:06,994][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 06:52:11,782][12883] Updated weights for policy 0, policy_version 80481 (0.0032) [2024-06-18 06:52:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 1318600704. Throughput: 0: 42397.9. Samples: 1318808700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:52:11,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 06:52:14,223][12883] Updated weights for policy 0, policy_version 80491 (0.0033) [2024-06-18 06:52:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 1318846464. Throughput: 0: 42323.1. Samples: 1318924400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:52:16,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 06:52:19,421][12883] Updated weights for policy 0, policy_version 80501 (0.0050) [2024-06-18 06:52:21,923][12883] Updated weights for policy 0, policy_version 80511 (0.0045) [2024-06-18 06:52:21,994][12645] Fps is (10 sec: 49151.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1319092224. Throughput: 0: 42391.0. Samples: 1319182180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:52:21,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 06:52:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 1319239680. Throughput: 0: 42559.4. Samples: 1319446040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:52:26,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 06:52:27,131][12883] Updated weights for policy 0, policy_version 80521 (0.0034) [2024-06-18 06:52:29,374][12862] Signal inference workers to stop experience collection... (19100 times) [2024-06-18 06:52:29,419][12883] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-06-18 06:52:29,427][12862] Signal inference workers to resume experience collection... (19100 times) [2024-06-18 06:52:29,437][12883] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-06-18 06:52:29,591][12883] Updated weights for policy 0, policy_version 80531 (0.0025) [2024-06-18 06:52:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42376.3). Total num frames: 1319501824. Throughput: 0: 42297.7. Samples: 1319555660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:52:31,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 06:52:34,673][12883] Updated weights for policy 0, policy_version 80541 (0.0033) [2024-06-18 06:52:36,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1319714816. Throughput: 0: 42479.6. Samples: 1319822080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:52:36,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 06:52:37,137][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080551_1319747584.pth... [2024-06-18 06:52:37,147][12883] Updated weights for policy 0, policy_version 80551 (0.0027) [2024-06-18 06:52:37,188][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000079926_1309507584.pth [2024-06-18 06:52:41,994][12645] Fps is (10 sec: 36045.5, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 1319862272. Throughput: 0: 42351.3. Samples: 1320079160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 06:52:41,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 06:52:42,438][12883] Updated weights for policy 0, policy_version 80561 (0.0052) [2024-06-18 06:52:45,239][12883] Updated weights for policy 0, policy_version 80571 (0.0030) [2024-06-18 06:52:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42376.6). Total num frames: 1320140800. Throughput: 0: 42420.9. Samples: 1320192520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-18 06:52:46,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 06:52:50,226][12883] Updated weights for policy 0, policy_version 80581 (0.0037) [2024-06-18 06:52:51,994][12645] Fps is (10 sec: 47513.5, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 1320337408. Throughput: 0: 42350.0. Samples: 1320451980. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-18 06:52:51,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 06:52:53,195][12883] Updated weights for policy 0, policy_version 80591 (0.0034) [2024-06-18 06:52:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 1320517632. Throughput: 0: 42050.1. Samples: 1320700960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-18 06:52:56,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 06:52:58,199][12883] Updated weights for policy 0, policy_version 80601 (0.0024) [2024-06-18 06:53:01,125][12883] Updated weights for policy 0, policy_version 80611 (0.0034) [2024-06-18 06:53:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1320779776. Throughput: 0: 42213.9. Samples: 1320824020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-18 06:53:01,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 06:53:06,055][12883] Updated weights for policy 0, policy_version 80621 (0.0025) [2024-06-18 06:53:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 42265.2). 
Total num frames: 1320960000. Throughput: 0: 42307.6. Samples: 1321086020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-18 06:53:06,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 06:53:08,706][12883] Updated weights for policy 0, policy_version 80631 (0.0026) [2024-06-18 06:53:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 1321172992. Throughput: 0: 41922.3. Samples: 1321332540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-18 06:53:11,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 06:53:13,682][12883] Updated weights for policy 0, policy_version 80641 (0.0033) [2024-06-18 06:53:16,709][12883] Updated weights for policy 0, policy_version 80651 (0.0038) [2024-06-18 06:53:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1321385984. Throughput: 0: 42237.0. Samples: 1321456320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-18 06:53:16,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 06:53:21,322][12883] Updated weights for policy 0, policy_version 80661 (0.0039) [2024-06-18 06:53:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 1321582592. Throughput: 0: 42028.1. Samples: 1321713340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-18 06:53:21,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 06:53:23,074][12862] Signal inference workers to stop experience collection... (19150 times) [2024-06-18 06:53:23,075][12862] Signal inference workers to resume experience collection... 
(19150 times) [2024-06-18 06:53:23,091][12883] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-06-18 06:53:23,091][12883] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-06-18 06:53:24,462][12883] Updated weights for policy 0, policy_version 80671 (0.0038) [2024-06-18 06:53:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42210.0). Total num frames: 1321779200. Throughput: 0: 41827.1. Samples: 1321961380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 23.0) [2024-06-18 06:53:26,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 06:53:29,031][12883] Updated weights for policy 0, policy_version 80681 (0.0038) [2024-06-18 06:53:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1322024960. Throughput: 0: 42147.1. Samples: 1322089140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:53:31,994][12645] Avg episode reward: [(0, '0.160')] [2024-06-18 06:53:32,144][12883] Updated weights for policy 0, policy_version 80691 (0.0051) [2024-06-18 06:53:36,748][12883] Updated weights for policy 0, policy_version 80701 (0.0034) [2024-06-18 06:53:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.3, 300 sec: 42209.8). Total num frames: 1322205184. Throughput: 0: 42160.5. Samples: 1322349200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:53:36,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 06:53:40,194][12883] Updated weights for policy 0, policy_version 80711 (0.0025) [2024-06-18 06:53:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 1322418176. Throughput: 0: 42080.9. Samples: 1322594600. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:53:41,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 06:53:44,530][12883] Updated weights for policy 0, policy_version 80721 (0.0024) [2024-06-18 06:53:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 1322647552. Throughput: 0: 42108.4. Samples: 1322718900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:53:46,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 06:53:47,936][12883] Updated weights for policy 0, policy_version 80731 (0.0031) [2024-06-18 06:53:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1322844160. Throughput: 0: 42020.1. Samples: 1322976920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:53:51,994][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 06:53:52,055][12883] Updated weights for policy 0, policy_version 80741 (0.0030) [2024-06-18 06:53:55,565][12883] Updated weights for policy 0, policy_version 80751 (0.0033) [2024-06-18 06:53:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 1323073536. Throughput: 0: 42109.3. Samples: 1323227460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:53:56,994][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 06:53:59,665][12883] Updated weights for policy 0, policy_version 80761 (0.0045) [2024-06-18 06:54:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1323302912. Throughput: 0: 42271.2. Samples: 1323358520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:54:01,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 06:54:03,182][12883] Updated weights for policy 0, policy_version 80771 (0.0036) [2024-06-18 06:54:06,994][12645] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 1323466752. Throughput: 0: 42247.4. Samples: 1323614480. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:54:06,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 06:54:07,666][12883] Updated weights for policy 0, policy_version 80781 (0.0037) [2024-06-18 06:54:11,027][12883] Updated weights for policy 0, policy_version 80791 (0.0033) [2024-06-18 06:54:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 1323712512. Throughput: 0: 42168.8. Samples: 1323858980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:54:11,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 06:54:15,410][12883] Updated weights for policy 0, policy_version 80801 (0.0030) [2024-06-18 06:54:16,996][12645] Fps is (10 sec: 47503.7, 60 sec: 42596.8, 300 sec: 42375.9). Total num frames: 1323941888. Throughput: 0: 42329.5. Samples: 1323994060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 06:54:16,996][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 06:54:18,641][12883] Updated weights for policy 0, policy_version 80811 (0.0047) [2024-06-18 06:54:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 1324105728. Throughput: 0: 42180.8. Samples: 1324247340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:54:21,994][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 06:54:23,148][12883] Updated weights for policy 0, policy_version 80821 (0.0045) [2024-06-18 06:54:26,198][12883] Updated weights for policy 0, policy_version 80831 (0.0043) [2024-06-18 06:54:26,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42871.3, 300 sec: 42265.2). Total num frames: 1324351488. Throughput: 0: 42331.0. Samples: 1324499500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:54:26,994][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 06:54:30,679][12883] Updated weights for policy 0, policy_version 80841 (0.0033) [2024-06-18 06:54:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42320.7). 
Total num frames: 1324564480. Throughput: 0: 42591.9. Samples: 1324635540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:54:31,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 06:54:34,270][12883] Updated weights for policy 0, policy_version 80851 (0.0047) [2024-06-18 06:54:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 1324744704. Throughput: 0: 42316.8. Samples: 1324881180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:54:36,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 06:54:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080856_1324744704.pth... [2024-06-18 06:54:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080240_1314652160.pth [2024-06-18 06:54:38,456][12883] Updated weights for policy 0, policy_version 80861 (0.0026) [2024-06-18 06:54:41,846][12883] Updated weights for policy 0, policy_version 80871 (0.0033) [2024-06-18 06:54:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 1324990464. Throughput: 0: 42306.2. Samples: 1325131240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:54:41,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 06:54:46,102][12883] Updated weights for policy 0, policy_version 80881 (0.0040) [2024-06-18 06:54:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1325187072. Throughput: 0: 42317.2. Samples: 1325262800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:54:46,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 06:54:49,430][12883] Updated weights for policy 0, policy_version 80891 (0.0030) [2024-06-18 06:54:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 1325383680. Throughput: 0: 42340.5. Samples: 1325519800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:54:51,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 06:54:53,575][12883] Updated weights for policy 0, policy_version 80901 (0.0021) [2024-06-18 06:54:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1325629440. Throughput: 0: 42511.1. Samples: 1325771980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:54:56,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 06:54:57,383][12883] Updated weights for policy 0, policy_version 80911 (0.0041) [2024-06-18 06:55:01,076][12883] Updated weights for policy 0, policy_version 80921 (0.0040) [2024-06-18 06:55:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1325842432. Throughput: 0: 42532.8. Samples: 1325907940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:55:01,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 06:55:02,507][12862] Signal inference workers to stop experience collection... (19200 times) [2024-06-18 06:55:02,537][12883] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-06-18 06:55:02,572][12862] Signal inference workers to resume experience collection... (19200 times) [2024-06-18 06:55:02,573][12883] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-06-18 06:55:05,083][12883] Updated weights for policy 0, policy_version 80931 (0.0045) [2024-06-18 06:55:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 1326039040. Throughput: 0: 42509.7. Samples: 1326160280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 06:55:06,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 06:55:09,009][12883] Updated weights for policy 0, policy_version 80941 (0.0038) [2024-06-18 06:55:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1326268416. Throughput: 0: 42351.6. 
Samples: 1326405320. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:11,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 06:55:13,106][12883] Updated weights for policy 0, policy_version 80951 (0.0036) [2024-06-18 06:55:16,816][12883] Updated weights for policy 0, policy_version 80961 (0.0039) [2024-06-18 06:55:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42053.8, 300 sec: 42265.2). Total num frames: 1326465024. Throughput: 0: 42271.2. Samples: 1326537740. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:16,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 06:55:20,808][12883] Updated weights for policy 0, policy_version 80971 (0.0038) [2024-06-18 06:55:21,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42596.8, 300 sec: 42209.3). Total num frames: 1326661632. Throughput: 0: 42469.5. Samples: 1326792400. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:21,997][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 06:55:24,412][12883] Updated weights for policy 0, policy_version 80981 (0.0036) [2024-06-18 06:55:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 1326891008. Throughput: 0: 42563.3. Samples: 1327046580. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:26,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 06:55:28,345][12883] Updated weights for policy 0, policy_version 80991 (0.0033) [2024-06-18 06:55:31,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1327104000. Throughput: 0: 42568.4. Samples: 1327178380. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:31,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 06:55:32,423][12883] Updated weights for policy 0, policy_version 81001 (0.0022) [2024-06-18 06:55:36,028][12883] Updated weights for policy 0, policy_version 81011 (0.0035) [2024-06-18 06:55:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 1327316992. Throughput: 0: 42546.6. Samples: 1327434400. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:36,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 06:55:39,852][12883] Updated weights for policy 0, policy_version 81021 (0.0031) [2024-06-18 06:55:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1327529984. Throughput: 0: 42604.8. Samples: 1327689200. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:41,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 06:55:43,872][12883] Updated weights for policy 0, policy_version 81031 (0.0038) [2024-06-18 06:55:46,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42869.9, 300 sec: 42431.5). Total num frames: 1327759360. Throughput: 0: 42462.3. Samples: 1327818840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:46,997][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 06:55:47,442][12883] Updated weights for policy 0, policy_version 81041 (0.0028) [2024-06-18 06:55:51,618][12883] Updated weights for policy 0, policy_version 81051 (0.0036) [2024-06-18 06:55:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 1327955968. Throughput: 0: 42455.3. Samples: 1328070760. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:51,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 06:55:55,363][12883] Updated weights for policy 0, policy_version 81061 (0.0035) [2024-06-18 06:55:56,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42325.3, 300 sec: 42376.6). 
Total num frames: 1328168960. Throughput: 0: 42580.4. Samples: 1328321440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 20.0) [2024-06-18 06:55:56,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 06:55:59,274][12883] Updated weights for policy 0, policy_version 81071 (0.0032) [2024-06-18 06:56:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42376.6). Total num frames: 1328398336. Throughput: 0: 42570.2. Samples: 1328453400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:01,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 06:56:02,949][12883] Updated weights for policy 0, policy_version 81081 (0.0031) [2024-06-18 06:56:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42265.1). Total num frames: 1328578560. Throughput: 0: 42579.9. Samples: 1328708400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:07,004][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 06:56:07,195][12883] Updated weights for policy 0, policy_version 81091 (0.0031) [2024-06-18 06:56:10,627][12883] Updated weights for policy 0, policy_version 81101 (0.0041) [2024-06-18 06:56:11,993][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1328807936. Throughput: 0: 42503.6. Samples: 1328959240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:11,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 06:56:14,885][12883] Updated weights for policy 0, policy_version 81111 (0.0044) [2024-06-18 06:56:16,996][12645] Fps is (10 sec: 45865.3, 60 sec: 42869.9, 300 sec: 42320.4). Total num frames: 1329037312. Throughput: 0: 42527.8. Samples: 1329092220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:16,996][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 06:56:18,354][12883] Updated weights for policy 0, policy_version 81121 (0.0033) [2024-06-18 06:56:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42327.0, 300 sec: 42209.6). 
Total num frames: 1329201152. Throughput: 0: 42406.3. Samples: 1329342680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:21,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 06:56:22,575][12883] Updated weights for policy 0, policy_version 81131 (0.0036) [2024-06-18 06:56:26,515][12883] Updated weights for policy 0, policy_version 81141 (0.0035) [2024-06-18 06:56:26,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1329446912. Throughput: 0: 42431.7. Samples: 1329598620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:26,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 06:56:30,110][12862] Signal inference workers to stop experience collection... (19250 times) [2024-06-18 06:56:30,110][12862] Signal inference workers to resume experience collection... (19250 times) [2024-06-18 06:56:30,162][12883] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-06-18 06:56:30,162][12883] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-06-18 06:56:30,579][12883] Updated weights for policy 0, policy_version 81151 (0.0041) [2024-06-18 06:56:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 1329643520. Throughput: 0: 42461.3. Samples: 1329729500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:31,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 06:56:34,035][12883] Updated weights for policy 0, policy_version 81161 (0.0030) [2024-06-18 06:56:36,994][12645] Fps is (10 sec: 37682.7, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 1329823744. Throughput: 0: 42333.2. Samples: 1329975760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:36,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 06:56:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081166_1329823744.pth... 
[2024-06-18 06:56:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080551_1319747584.pth [2024-06-18 06:56:38,253][12883] Updated weights for policy 0, policy_version 81171 (0.0030) [2024-06-18 06:56:41,543][12883] Updated weights for policy 0, policy_version 81181 (0.0044) [2024-06-18 06:56:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 1330102272. Throughput: 0: 42276.2. Samples: 1330223860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:41,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 06:56:46,063][12883] Updated weights for policy 0, policy_version 81191 (0.0037) [2024-06-18 06:56:46,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42053.9, 300 sec: 42209.6). Total num frames: 1330282496. Throughput: 0: 42520.9. Samples: 1330366840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 06:56:46,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 06:56:49,175][12883] Updated weights for policy 0, policy_version 81201 (0.0034) [2024-06-18 06:56:51,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1330479104. Throughput: 0: 42227.5. Samples: 1330608640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:56:51,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 06:56:53,679][12883] Updated weights for policy 0, policy_version 81211 (0.0027) [2024-06-18 06:56:56,825][12883] Updated weights for policy 0, policy_version 81221 (0.0035) [2024-06-18 06:56:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 1330724864. Throughput: 0: 42343.4. Samples: 1330864700. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:56:56,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 06:57:01,290][12883] Updated weights for policy 0, policy_version 81231 (0.0031) [2024-06-18 06:57:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 1330905088. Throughput: 0: 42399.8. Samples: 1331000120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:57:01,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 06:57:04,815][12883] Updated weights for policy 0, policy_version 81241 (0.0032) [2024-06-18 06:57:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1331118080. Throughput: 0: 42284.4. Samples: 1331245480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:57:06,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 06:57:08,898][12883] Updated weights for policy 0, policy_version 81251 (0.0028) [2024-06-18 06:57:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1331347456. Throughput: 0: 42374.7. Samples: 1331505480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:57:11,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 06:57:12,314][12883] Updated weights for policy 0, policy_version 81261 (0.0029) [2024-06-18 06:57:16,592][12883] Updated weights for policy 0, policy_version 81271 (0.0038) [2024-06-18 06:57:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42053.9, 300 sec: 42265.2). Total num frames: 1331560448. Throughput: 0: 42392.9. Samples: 1331637180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:57:16,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 06:57:20,050][12883] Updated weights for policy 0, policy_version 81281 (0.0049) [2024-06-18 06:57:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1331773440. Throughput: 0: 42490.8. Samples: 1331887840. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:57:21,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 06:57:24,185][12883] Updated weights for policy 0, policy_version 81291 (0.0027) [2024-06-18 06:57:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 1332002816. Throughput: 0: 42737.3. Samples: 1332147040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:57:26,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 06:57:27,755][12883] Updated weights for policy 0, policy_version 81301 (0.0035) [2024-06-18 06:57:31,848][12883] Updated weights for policy 0, policy_version 81311 (0.0038) [2024-06-18 06:57:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1332199424. Throughput: 0: 42351.2. Samples: 1332272640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:57:31,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 06:57:35,306][12883] Updated weights for policy 0, policy_version 81321 (0.0027) [2024-06-18 06:57:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42542.8). Total num frames: 1332412416. Throughput: 0: 42585.0. Samples: 1332524960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 06:57:36,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 06:57:39,681][12883] Updated weights for policy 0, policy_version 81331 (0.0032) [2024-06-18 06:57:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1332625408. Throughput: 0: 42678.3. Samples: 1332785220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:57:41,994][12645] Avg episode reward: [(0, '0.117')] [2024-06-18 06:57:42,992][12883] Updated weights for policy 0, policy_version 81341 (0.0038) [2024-06-18 06:57:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1332822016. Throughput: 0: 42425.7. Samples: 1332909280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:57:46,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 06:57:47,262][12883] Updated weights for policy 0, policy_version 81351 (0.0031) [2024-06-18 06:57:50,715][12883] Updated weights for policy 0, policy_version 81361 (0.0022) [2024-06-18 06:57:51,740][12862] Signal inference workers to stop experience collection... (19300 times) [2024-06-18 06:57:51,740][12862] Signal inference workers to resume experience collection... (19300 times) [2024-06-18 06:57:51,765][12883] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-06-18 06:57:51,765][12883] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-06-18 06:57:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1333067776. Throughput: 0: 42647.1. Samples: 1333164600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:57:51,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 06:57:54,822][12883] Updated weights for policy 0, policy_version 81371 (0.0047) [2024-06-18 06:57:56,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 1333264384. Throughput: 0: 42777.3. Samples: 1333430460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:57:56,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 06:57:58,235][12883] Updated weights for policy 0, policy_version 81381 (0.0032) [2024-06-18 06:58:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1333460992. Throughput: 0: 42520.2. Samples: 1333550600. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:58:01,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 06:58:02,506][12883] Updated weights for policy 0, policy_version 81391 (0.0032) [2024-06-18 06:58:06,590][12883] Updated weights for policy 0, policy_version 81401 (0.0036) [2024-06-18 06:58:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 1333706752. Throughput: 0: 42621.7. Samples: 1333805820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:58:06,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 06:58:10,251][12883] Updated weights for policy 0, policy_version 81411 (0.0038) [2024-06-18 06:58:11,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1333903360. Throughput: 0: 42622.2. Samples: 1334065040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:58:11,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 06:58:14,157][12883] Updated weights for policy 0, policy_version 81421 (0.0034) [2024-06-18 06:58:16,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 1334083584. Throughput: 0: 42439.9. Samples: 1334182440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:58:16,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 06:58:17,986][12883] Updated weights for policy 0, policy_version 81431 (0.0037) [2024-06-18 06:58:21,755][12883] Updated weights for policy 0, policy_version 81441 (0.0047) [2024-06-18 06:58:21,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1334329344. Throughput: 0: 42674.1. Samples: 1334445300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:58:21,995][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 06:58:25,619][12883] Updated weights for policy 0, policy_version 81451 (0.0039) [2024-06-18 06:58:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42431.8). 
Total num frames: 1334542336. Throughput: 0: 42611.0. Samples: 1334702720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 06:58:26,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 06:58:29,254][12883] Updated weights for policy 0, policy_version 81461 (0.0040) [2024-06-18 06:58:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1334738944. Throughput: 0: 42621.8. Samples: 1334827260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:58:31,994][12645] Avg episode reward: [(0, '0.239')] [2024-06-18 06:58:33,129][12883] Updated weights for policy 0, policy_version 81471 (0.0029) [2024-06-18 06:58:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1334968320. Throughput: 0: 42649.9. Samples: 1335083840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:58:36,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 06:58:37,063][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081481_1334984704.pth... [2024-06-18 06:58:37,075][12883] Updated weights for policy 0, policy_version 81481 (0.0041) [2024-06-18 06:58:37,131][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000080856_1324744704.pth [2024-06-18 06:58:41,214][12883] Updated weights for policy 0, policy_version 81491 (0.0038) [2024-06-18 06:58:41,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1335197696. Throughput: 0: 42472.3. Samples: 1335341720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:58:41,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 06:58:44,715][12883] Updated weights for policy 0, policy_version 81501 (0.0031) [2024-06-18 06:58:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1335394304. Throughput: 0: 42677.8. Samples: 1335471100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:58:47,000][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 06:58:48,829][12883] Updated weights for policy 0, policy_version 81511 (0.0049) [2024-06-18 06:58:51,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1335590912. Throughput: 0: 42613.0. Samples: 1335723400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:58:51,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 06:58:52,774][12883] Updated weights for policy 0, policy_version 81521 (0.0047) [2024-06-18 06:58:56,559][12883] Updated weights for policy 0, policy_version 81531 (0.0050) [2024-06-18 06:58:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1335803904. Throughput: 0: 42494.0. Samples: 1335977280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:58:56,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 06:59:00,461][12883] Updated weights for policy 0, policy_version 81541 (0.0035) [2024-06-18 06:59:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1336033280. Throughput: 0: 42687.2. Samples: 1336103360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:59:01,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 06:59:02,717][12862] Signal inference workers to stop experience collection... (19350 times) [2024-06-18 06:59:02,765][12862] Signal inference workers to resume experience collection... (19350 times) [2024-06-18 06:59:02,766][12883] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-06-18 06:59:02,792][12883] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-06-18 06:59:04,339][12883] Updated weights for policy 0, policy_version 81551 (0.0030) [2024-06-18 06:59:06,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42323.8, 300 sec: 42487.0). Total num frames: 1336246272. Throughput: 0: 42451.8. 
Samples: 1336355720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:59:06,996][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 06:59:08,278][12883] Updated weights for policy 0, policy_version 81561 (0.0035) [2024-06-18 06:59:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 1336442880. Throughput: 0: 42479.7. Samples: 1336614300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:59:11,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 06:59:12,034][12883] Updated weights for policy 0, policy_version 81571 (0.0033) [2024-06-18 06:59:16,111][12883] Updated weights for policy 0, policy_version 81581 (0.0043) [2024-06-18 06:59:16,993][12645] Fps is (10 sec: 40969.9, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1336655872. Throughput: 0: 42554.9. Samples: 1336742220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 06:59:16,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 06:59:19,623][12883] Updated weights for policy 0, policy_version 81591 (0.0036) [2024-06-18 06:59:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1336868864. Throughput: 0: 42324.4. Samples: 1336988440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:59:21,994][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 06:59:23,887][12883] Updated weights for policy 0, policy_version 81601 (0.0031) [2024-06-18 06:59:26,995][12645] Fps is (10 sec: 44229.6, 60 sec: 42597.4, 300 sec: 42487.1). Total num frames: 1337098240. Throughput: 0: 42444.9. Samples: 1337251800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:59:26,996][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 06:59:27,431][12883] Updated weights for policy 0, policy_version 81611 (0.0036) [2024-06-18 06:59:31,738][12883] Updated weights for policy 0, policy_version 81621 (0.0027) [2024-06-18 06:59:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1337294848. Throughput: 0: 42372.6. Samples: 1337377860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:59:31,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 06:59:34,948][12883] Updated weights for policy 0, policy_version 81631 (0.0043) [2024-06-18 06:59:36,994][12645] Fps is (10 sec: 40966.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1337507840. Throughput: 0: 42287.0. Samples: 1337626320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:59:36,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 06:59:39,238][12883] Updated weights for policy 0, policy_version 81641 (0.0039) [2024-06-18 06:59:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1337737216. Throughput: 0: 42426.4. Samples: 1337886460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:59:41,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 06:59:42,483][12883] Updated weights for policy 0, policy_version 81651 (0.0026) [2024-06-18 06:59:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 1337917440. Throughput: 0: 42483.9. Samples: 1338015140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:59:46,994][12645] Avg episode reward: [(0, '0.119')] [2024-06-18 06:59:47,071][12883] Updated weights for policy 0, policy_version 81661 (0.0030) [2024-06-18 06:59:50,308][12883] Updated weights for policy 0, policy_version 81671 (0.0027) [2024-06-18 06:59:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42487.3). 
Total num frames: 1338163200. Throughput: 0: 42622.7. Samples: 1338273640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:59:51,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 06:59:54,633][12883] Updated weights for policy 0, policy_version 81681 (0.0028) [2024-06-18 06:59:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1338376192. Throughput: 0: 42607.3. Samples: 1338531640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 06:59:56,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 06:59:57,917][12883] Updated weights for policy 0, policy_version 81691 (0.0035) [2024-06-18 07:00:01,994][12645] Fps is (10 sec: 39320.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1338556416. Throughput: 0: 42435.3. Samples: 1338651820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:00:01,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 07:00:02,585][12883] Updated weights for policy 0, policy_version 81701 (0.0039) [2024-06-18 07:00:05,490][12883] Updated weights for policy 0, policy_version 81711 (0.0024) [2024-06-18 07:00:06,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 1338802176. Throughput: 0: 42704.9. Samples: 1338910160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:06,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 07:00:10,199][12883] Updated weights for policy 0, policy_version 81721 (0.0028) [2024-06-18 07:00:11,994][12645] Fps is (10 sec: 47513.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1339031552. Throughput: 0: 42597.8. Samples: 1339168640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:11,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 07:00:13,357][12883] Updated weights for policy 0, policy_version 81731 (0.0038) [2024-06-18 07:00:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42543.2). 
Total num frames: 1339211776. Throughput: 0: 42642.5. Samples: 1339296780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:16,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 07:00:17,897][12883] Updated weights for policy 0, policy_version 81741 (0.0029) [2024-06-18 07:00:20,927][12883] Updated weights for policy 0, policy_version 81751 (0.0025) [2024-06-18 07:00:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1339457536. Throughput: 0: 42784.8. Samples: 1339551640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:21,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 07:00:25,588][12883] Updated weights for policy 0, policy_version 81761 (0.0023) [2024-06-18 07:00:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42599.5, 300 sec: 42542.9). Total num frames: 1339654144. Throughput: 0: 42696.0. Samples: 1339807780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:26,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 07:00:28,460][12883] Updated weights for policy 0, policy_version 81771 (0.0026) [2024-06-18 07:00:29,877][12862] Signal inference workers to stop experience collection... (19400 times) [2024-06-18 07:00:29,878][12862] Signal inference workers to resume experience collection... (19400 times) [2024-06-18 07:00:29,917][12883] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-06-18 07:00:29,918][12883] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-06-18 07:00:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1339850752. Throughput: 0: 42768.4. Samples: 1339939720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:31,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 07:00:33,229][12883] Updated weights for policy 0, policy_version 81781 (0.0034) [2024-06-18 07:00:36,015][12883] Updated weights for policy 0, policy_version 81791 (0.0029) [2024-06-18 07:00:36,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1340080128. Throughput: 0: 42663.7. Samples: 1340193520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:36,995][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 07:00:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081792_1340080128.pth... [2024-06-18 07:00:37,054][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081166_1329823744.pth [2024-06-18 07:00:40,850][12883] Updated weights for policy 0, policy_version 81801 (0.0026) [2024-06-18 07:00:41,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 1340293120. Throughput: 0: 42710.5. Samples: 1340453600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:41,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 07:00:43,743][12883] Updated weights for policy 0, policy_version 81811 (0.0039) [2024-06-18 07:00:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1340489728. Throughput: 0: 42850.6. Samples: 1340580100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:46,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 07:00:48,427][12883] Updated weights for policy 0, policy_version 81821 (0.0029) [2024-06-18 07:00:51,528][12883] Updated weights for policy 0, policy_version 81831 (0.0046) [2024-06-18 07:00:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1340719104. Throughput: 0: 42692.0. Samples: 1340831300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:00:51,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 07:00:55,870][12883] Updated weights for policy 0, policy_version 81841 (0.0028) [2024-06-18 07:00:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 1340915712. Throughput: 0: 42988.1. Samples: 1341103100. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:00:56,994][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 07:00:59,036][12883] Updated weights for policy 0, policy_version 81851 (0.0040) [2024-06-18 07:01:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1341128704. Throughput: 0: 42740.2. Samples: 1341220080. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:01:01,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 07:01:03,262][12883] Updated weights for policy 0, policy_version 81861 (0.0039) [2024-06-18 07:01:06,587][12883] Updated weights for policy 0, policy_version 81871 (0.0032) [2024-06-18 07:01:06,994][12645] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1341390848. Throughput: 0: 42920.5. Samples: 1341483060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:01:06,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 07:01:10,791][12883] Updated weights for policy 0, policy_version 81881 (0.0042) [2024-06-18 07:01:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42432.1). Total num frames: 1341554688. Throughput: 0: 43033.8. Samples: 1341744300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:01:11,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 07:01:14,222][12883] Updated weights for policy 0, policy_version 81891 (0.0024) [2024-06-18 07:01:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1341784064. Throughput: 0: 42735.0. Samples: 1341862800. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:01:16,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 07:01:18,781][12883] Updated weights for policy 0, policy_version 81901 (0.0023) [2024-06-18 07:01:21,860][12883] Updated weights for policy 0, policy_version 81911 (0.0037) [2024-06-18 07:01:21,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1342029824. Throughput: 0: 42944.1. Samples: 1342126000. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:01:21,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 07:01:26,536][12883] Updated weights for policy 0, policy_version 81921 (0.0040) [2024-06-18 07:01:26,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1342210048. Throughput: 0: 42923.5. Samples: 1342385160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:01:26,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 07:01:29,544][12883] Updated weights for policy 0, policy_version 81931 (0.0030) [2024-06-18 07:01:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1342439424. Throughput: 0: 42793.9. Samples: 1342505820. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:01:31,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 07:01:33,931][12883] Updated weights for policy 0, policy_version 81941 (0.0035) [2024-06-18 07:01:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 1342668800. Throughput: 0: 43144.8. Samples: 1342772820. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:01:36,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 07:01:37,555][12883] Updated weights for policy 0, policy_version 81951 (0.0031) [2024-06-18 07:01:41,422][12883] Updated weights for policy 0, policy_version 81961 (0.0034) [2024-06-18 07:01:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42653.9). 
Total num frames: 1342865408. Throughput: 0: 42799.1. Samples: 1343029060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 07:01:41,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 07:01:42,642][12862] Signal inference workers to stop experience collection... (19450 times) [2024-06-18 07:01:42,642][12862] Signal inference workers to resume experience collection... (19450 times) [2024-06-18 07:01:42,661][12883] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-06-18 07:01:42,662][12883] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-06-18 07:01:44,979][12883] Updated weights for policy 0, policy_version 81971 (0.0037) [2024-06-18 07:01:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1343094784. Throughput: 0: 43001.6. Samples: 1343155160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:01:46,995][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 07:01:48,904][12883] Updated weights for policy 0, policy_version 81981 (0.0039) [2024-06-18 07:01:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1343291392. Throughput: 0: 42938.4. Samples: 1343415280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:01:51,994][12645] Avg episode reward: [(0, '0.142')] [2024-06-18 07:01:52,968][12883] Updated weights for policy 0, policy_version 81991 (0.0031) [2024-06-18 07:01:56,525][12883] Updated weights for policy 0, policy_version 82001 (0.0040) [2024-06-18 07:01:56,996][12645] Fps is (10 sec: 40951.1, 60 sec: 43142.9, 300 sec: 42709.2). Total num frames: 1343504384. Throughput: 0: 42586.3. Samples: 1343660780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:01:56,996][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 07:02:00,609][12883] Updated weights for policy 0, policy_version 82011 (0.0047) [2024-06-18 07:02:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1343717376. Throughput: 0: 42998.0. Samples: 1343797700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:02:01,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 07:02:04,599][12883] Updated weights for policy 0, policy_version 82021 (0.0028) [2024-06-18 07:02:06,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1343930368. Throughput: 0: 42860.0. Samples: 1344054700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:02:06,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 07:02:08,166][12883] Updated weights for policy 0, policy_version 82031 (0.0044) [2024-06-18 07:02:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1344143360. Throughput: 0: 42701.3. Samples: 1344306720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:02:11,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 07:02:12,412][12883] Updated weights for policy 0, policy_version 82041 (0.0032) [2024-06-18 07:02:15,982][12883] Updated weights for policy 0, policy_version 82051 (0.0033) [2024-06-18 07:02:16,993][12645] Fps is (10 sec: 42599.5, 60 sec: 42871.7, 300 sec: 42654.0). Total num frames: 1344356352. Throughput: 0: 42920.2. Samples: 1344437220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:02:16,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 07:02:20,158][12883] Updated weights for policy 0, policy_version 82061 (0.0033) [2024-06-18 07:02:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1344585728. Throughput: 0: 42753.3. Samples: 1344696720. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:02:21,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 07:02:23,438][12883] Updated weights for policy 0, policy_version 82071 (0.0038) [2024-06-18 07:02:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1344782336. Throughput: 0: 42809.3. Samples: 1344955480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:02:26,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 07:02:27,838][12883] Updated weights for policy 0, policy_version 82081 (0.0035) [2024-06-18 07:02:31,299][12883] Updated weights for policy 0, policy_version 82091 (0.0031) [2024-06-18 07:02:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1344995328. Throughput: 0: 42703.7. Samples: 1345076820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 07:02:31,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 07:02:35,342][12883] Updated weights for policy 0, policy_version 82101 (0.0035) [2024-06-18 07:02:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1345224704. Throughput: 0: 42665.7. Samples: 1345335240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:02:36,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 07:02:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082106_1345224704.pth... [2024-06-18 07:02:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081481_1334984704.pth [2024-06-18 07:02:38,832][12883] Updated weights for policy 0, policy_version 82111 (0.0042) [2024-06-18 07:02:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1345421312. Throughput: 0: 42937.7. Samples: 1345592880. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:02:41,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 07:02:42,891][12883] Updated weights for policy 0, policy_version 82121 (0.0043) [2024-06-18 07:02:46,438][12883] Updated weights for policy 0, policy_version 82131 (0.0038) [2024-06-18 07:02:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1345634304. Throughput: 0: 42656.4. Samples: 1345717240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:02:46,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 07:02:50,343][12883] Updated weights for policy 0, policy_version 82141 (0.0034) [2024-06-18 07:02:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1345847296. Throughput: 0: 42650.8. Samples: 1345973980. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:02:51,994][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 07:02:54,212][12883] Updated weights for policy 0, policy_version 82151 (0.0027) [2024-06-18 07:02:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 1346076672. Throughput: 0: 42808.8. Samples: 1346233120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:02:56,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 07:02:57,906][12883] Updated weights for policy 0, policy_version 82161 (0.0029) [2024-06-18 07:03:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1346289664. Throughput: 0: 42782.5. Samples: 1346362440. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:03:01,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 07:03:02,000][12883] Updated weights for policy 0, policy_version 82171 (0.0032) [2024-06-18 07:03:05,838][12883] Updated weights for policy 0, policy_version 82181 (0.0032) [2024-06-18 07:03:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1346486272. Throughput: 0: 42753.4. Samples: 1346620620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:03:06,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 07:03:09,521][12883] Updated weights for policy 0, policy_version 82191 (0.0030) [2024-06-18 07:03:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1346699264. Throughput: 0: 42628.8. Samples: 1346873780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:03:11,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 07:03:13,655][12883] Updated weights for policy 0, policy_version 82201 (0.0036) [2024-06-18 07:03:14,741][12862] Signal inference workers to stop experience collection... (19500 times) [2024-06-18 07:03:14,742][12862] Signal inference workers to resume experience collection... (19500 times) [2024-06-18 07:03:14,796][12883] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-06-18 07:03:14,796][12883] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-06-18 07:03:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1346928640. Throughput: 0: 42779.5. Samples: 1347001900. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:03:16,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 07:03:17,113][12883] Updated weights for policy 0, policy_version 82211 (0.0039) [2024-06-18 07:03:21,252][12883] Updated weights for policy 0, policy_version 82221 (0.0029) [2024-06-18 07:03:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1347125248. Throughput: 0: 42664.0. Samples: 1347255120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 20.0) [2024-06-18 07:03:21,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 07:03:24,763][12883] Updated weights for policy 0, policy_version 82231 (0.0037) [2024-06-18 07:03:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1347338240. Throughput: 0: 42760.9. Samples: 1347517120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:03:26,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 07:03:28,914][12883] Updated weights for policy 0, policy_version 82241 (0.0044) [2024-06-18 07:03:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1347584000. Throughput: 0: 42912.4. Samples: 1347648300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:03:31,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 07:03:32,390][12883] Updated weights for policy 0, policy_version 82251 (0.0027) [2024-06-18 07:03:36,564][12883] Updated weights for policy 0, policy_version 82261 (0.0028) [2024-06-18 07:03:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1347764224. Throughput: 0: 42817.2. Samples: 1347900760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:03:36,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 07:03:39,951][12883] Updated weights for policy 0, policy_version 82271 (0.0030) [2024-06-18 07:03:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42653.9). 
Total num frames: 1347977216. Throughput: 0: 42746.6. Samples: 1348156720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:03:41,994][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 07:03:44,291][12883] Updated weights for policy 0, policy_version 82281 (0.0030) [2024-06-18 07:03:46,994][12645] Fps is (10 sec: 47514.6, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1348239360. Throughput: 0: 42625.4. Samples: 1348280580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:03:46,994][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 07:03:47,636][12883] Updated weights for policy 0, policy_version 82291 (0.0027) [2024-06-18 07:03:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1348386816. Throughput: 0: 42421.5. Samples: 1348529600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:03:51,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 07:03:52,399][12883] Updated weights for policy 0, policy_version 82301 (0.0032) [2024-06-18 07:03:55,409][12883] Updated weights for policy 0, policy_version 82311 (0.0037) [2024-06-18 07:03:56,994][12645] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1348616192. Throughput: 0: 42407.0. Samples: 1348782100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:03:56,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 07:04:00,049][12883] Updated weights for policy 0, policy_version 82321 (0.0032) [2024-06-18 07:04:01,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 1348861952. Throughput: 0: 42530.6. Samples: 1348915780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:04:01,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 07:04:03,585][12883] Updated weights for policy 0, policy_version 82331 (0.0046) [2024-06-18 07:04:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). 
Total num frames: 1349025792. Throughput: 0: 42388.9. Samples: 1349162620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:04:06,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 07:04:07,977][12883] Updated weights for policy 0, policy_version 82341 (0.0026) [2024-06-18 07:04:11,155][12883] Updated weights for policy 0, policy_version 82351 (0.0029) [2024-06-18 07:04:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 1349255168. Throughput: 0: 42249.2. Samples: 1349418340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:04:11,995][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 07:04:15,764][12883] Updated weights for policy 0, policy_version 82361 (0.0036) [2024-06-18 07:04:16,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1349468160. Throughput: 0: 42141.7. Samples: 1349544680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:04:16,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 07:04:18,757][12883] Updated weights for policy 0, policy_version 82371 (0.0042) [2024-06-18 07:04:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42654.1). Total num frames: 1349681152. Throughput: 0: 42029.8. Samples: 1349792100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:04:21,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 07:04:23,411][12883] Updated weights for policy 0, policy_version 82381 (0.0031) [2024-06-18 07:04:26,689][12883] Updated weights for policy 0, policy_version 82391 (0.0035) [2024-06-18 07:04:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1349894144. Throughput: 0: 41929.0. Samples: 1350043520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:04:26,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 07:04:30,899][12883] Updated weights for policy 0, policy_version 82401 (0.0041) [2024-06-18 07:04:31,996][12645] Fps is (10 sec: 40951.2, 60 sec: 41777.7, 300 sec: 42653.6). Total num frames: 1350090752. Throughput: 0: 42076.9. Samples: 1350174140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:04:31,997][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 07:04:34,626][12883] Updated weights for policy 0, policy_version 82411 (0.0033) [2024-06-18 07:04:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1350320128. Throughput: 0: 42233.4. Samples: 1350430100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:04:36,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 07:04:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082417_1350320128.pth... [2024-06-18 07:04:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000081792_1340080128.pth [2024-06-18 07:04:38,462][12883] Updated weights for policy 0, policy_version 82421 (0.0024) [2024-06-18 07:04:40,769][12862] Signal inference workers to stop experience collection... (19550 times) [2024-06-18 07:04:40,819][12883] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-06-18 07:04:40,883][12862] Signal inference workers to resume experience collection... (19550 times) [2024-06-18 07:04:40,883][12883] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-06-18 07:04:41,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1350533120. Throughput: 0: 42334.6. Samples: 1350687160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:04:41,994][12645] Avg episode reward: [(0, '0.147')] [2024-06-18 07:04:42,544][12883] Updated weights for policy 0, policy_version 82431 (0.0031) [2024-06-18 07:04:46,318][12883] Updated weights for policy 0, policy_version 82441 (0.0027) [2024-06-18 07:04:46,994][12645] Fps is (10 sec: 42596.9, 60 sec: 41778.8, 300 sec: 42653.9). Total num frames: 1350746112. Throughput: 0: 42076.1. Samples: 1350809220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:04:46,995][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 07:04:50,341][12883] Updated weights for policy 0, policy_version 82451 (0.0036) [2024-06-18 07:04:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1350959104. Throughput: 0: 42239.1. Samples: 1351063380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:04:51,995][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 07:04:54,183][12883] Updated weights for policy 0, policy_version 82461 (0.0043) [2024-06-18 07:04:56,994][12645] Fps is (10 sec: 40961.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1351155712. Throughput: 0: 42377.0. Samples: 1351325300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:04:56,994][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 07:04:58,119][12883] Updated weights for policy 0, policy_version 82471 (0.0040) [2024-06-18 07:05:01,875][12883] Updated weights for policy 0, policy_version 82481 (0.0027) [2024-06-18 07:05:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1351368704. Throughput: 0: 42165.1. Samples: 1351442100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:05:01,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 07:05:05,648][12883] Updated weights for policy 0, policy_version 82491 (0.0042) [2024-06-18 07:05:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1351614464. Throughput: 0: 42486.7. Samples: 1351704000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 07:05:06,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 07:05:09,965][12883] Updated weights for policy 0, policy_version 82501 (0.0036) [2024-06-18 07:05:11,993][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.6, 300 sec: 42654.0). Total num frames: 1351794688. Throughput: 0: 42533.9. Samples: 1351957540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:11,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 07:05:13,221][12883] Updated weights for policy 0, policy_version 82511 (0.0037) [2024-06-18 07:05:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1352007680. Throughput: 0: 42310.5. Samples: 1352078020. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:17,003][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 07:05:17,625][12883] Updated weights for policy 0, policy_version 82521 (0.0038) [2024-06-18 07:05:20,702][12883] Updated weights for policy 0, policy_version 82531 (0.0028) [2024-06-18 07:05:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1352237056. Throughput: 0: 42386.9. Samples: 1352337500. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:21,994][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 07:05:25,178][12883] Updated weights for policy 0, policy_version 82541 (0.0035) [2024-06-18 07:05:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1352417280. Throughput: 0: 42391.2. Samples: 1352594760. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:26,994][12645] Avg episode reward: [(0, '0.164')] [2024-06-18 07:05:28,407][12883] Updated weights for policy 0, policy_version 82551 (0.0031) [2024-06-18 07:05:31,994][12645] Fps is (10 sec: 39320.4, 60 sec: 42326.8, 300 sec: 42542.9). Total num frames: 1352630272. Throughput: 0: 42447.8. Samples: 1352719360. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:31,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 07:05:32,834][12883] Updated weights for policy 0, policy_version 82561 (0.0053) [2024-06-18 07:05:36,118][12883] Updated weights for policy 0, policy_version 82571 (0.0032) [2024-06-18 07:05:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1352859648. Throughput: 0: 42508.9. Samples: 1352976280. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:36,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 07:05:40,425][12883] Updated weights for policy 0, policy_version 82581 (0.0043) [2024-06-18 07:05:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1353072640. Throughput: 0: 42303.1. Samples: 1353228940. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:41,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 07:05:44,047][12883] Updated weights for policy 0, policy_version 82591 (0.0037) [2024-06-18 07:05:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.5, 300 sec: 42542.8). Total num frames: 1353269248. Throughput: 0: 42526.9. Samples: 1353355820. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:46,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 07:05:48,241][12883] Updated weights for policy 0, policy_version 82601 (0.0027) [2024-06-18 07:05:51,790][12883] Updated weights for policy 0, policy_version 82611 (0.0037) [2024-06-18 07:05:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1353498624. Throughput: 0: 42387.5. Samples: 1353611440. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:51,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 07:05:55,816][12883] Updated weights for policy 0, policy_version 82621 (0.0034) [2024-06-18 07:05:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1353695232. Throughput: 0: 42249.2. Samples: 1353858760. Policy #0 lag: (min: 0.0, avg: 12.6, max: 25.0) [2024-06-18 07:05:56,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 07:05:58,935][12862] Signal inference workers to stop experience collection... (19600 times) [2024-06-18 07:05:58,935][12862] Signal inference workers to resume experience collection... (19600 times) [2024-06-18 07:05:58,945][12883] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-06-18 07:05:58,945][12883] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-06-18 07:05:59,590][12883] Updated weights for policy 0, policy_version 82631 (0.0025) [2024-06-18 07:06:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1353924608. Throughput: 0: 42376.5. Samples: 1353984960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:01,994][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 07:06:03,533][12883] Updated weights for policy 0, policy_version 82641 (0.0041) [2024-06-18 07:06:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1354121216. Throughput: 0: 42384.4. 
Samples: 1354244800. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:06,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 07:06:07,294][12883] Updated weights for policy 0, policy_version 82651 (0.0034) [2024-06-18 07:06:11,263][12883] Updated weights for policy 0, policy_version 82661 (0.0022) [2024-06-18 07:06:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.1, 300 sec: 42598.4). Total num frames: 1354350592. Throughput: 0: 42183.0. Samples: 1354493000. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:11,995][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 07:06:15,304][12883] Updated weights for policy 0, policy_version 82671 (0.0039) [2024-06-18 07:06:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1354563584. Throughput: 0: 42304.2. Samples: 1354623040. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:16,995][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 07:06:18,965][12883] Updated weights for policy 0, policy_version 82681 (0.0041) [2024-06-18 07:06:21,994][12645] Fps is (10 sec: 39322.6, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 1354743808. Throughput: 0: 42221.9. Samples: 1354876260. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:21,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 07:06:23,033][12883] Updated weights for policy 0, policy_version 82691 (0.0037) [2024-06-18 07:06:26,787][12883] Updated weights for policy 0, policy_version 82701 (0.0032) [2024-06-18 07:06:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1354973184. Throughput: 0: 42283.6. Samples: 1355131700. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:27,003][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 07:06:30,821][12883] Updated weights for policy 0, policy_version 82711 (0.0029) [2024-06-18 07:06:31,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43144.6, 300 sec: 42542.8). Total num frames: 1355218944. Throughput: 0: 42386.3. Samples: 1355263200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:31,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 07:06:34,403][12883] Updated weights for policy 0, policy_version 82721 (0.0040) [2024-06-18 07:06:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1355382784. Throughput: 0: 42304.9. Samples: 1355515160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:36,999][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 07:06:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082726_1355382784.pth... [2024-06-18 07:06:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082106_1345224704.pth [2024-06-18 07:06:38,487][12883] Updated weights for policy 0, policy_version 82731 (0.0034) [2024-06-18 07:06:41,929][12883] Updated weights for policy 0, policy_version 82741 (0.0031) [2024-06-18 07:06:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1355628544. Throughput: 0: 42512.2. Samples: 1355771820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:41,994][12645] Avg episode reward: [(0, '0.070')] [2024-06-18 07:06:46,168][12883] Updated weights for policy 0, policy_version 82751 (0.0037) [2024-06-18 07:06:46,994][12645] Fps is (10 sec: 47514.4, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 1355857920. Throughput: 0: 42570.3. Samples: 1355900620. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 20.0) [2024-06-18 07:06:46,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 07:06:49,673][12883] Updated weights for policy 0, policy_version 82761 (0.0035) [2024-06-18 07:06:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42487.6). Total num frames: 1356038144. Throughput: 0: 42505.3. Samples: 1356157540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:06:51,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 07:06:53,776][12883] Updated weights for policy 0, policy_version 82772 (0.0034) [2024-06-18 07:06:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1356267520. Throughput: 0: 42609.9. Samples: 1356410440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:06:56,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 07:06:58,415][12883] Updated weights for policy 0, policy_version 82782 (0.0029) [2024-06-18 07:07:01,568][12883] Updated weights for policy 0, policy_version 82792 (0.0034) [2024-06-18 07:07:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1356480512. Throughput: 0: 42581.5. Samples: 1356539200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:07:01,994][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 07:07:06,050][12883] Updated weights for policy 0, policy_version 82802 (0.0034) [2024-06-18 07:07:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 1356677120. Throughput: 0: 42659.7. Samples: 1356795960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:07:06,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 07:07:09,123][12883] Updated weights for policy 0, policy_version 82812 (0.0025) [2024-06-18 07:07:11,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.9, 300 sec: 42542.5). Total num frames: 1356906496. Throughput: 0: 42673.0. Samples: 1357052080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:07:11,996][12645] Avg episode reward: [(0, '0.212')] [2024-06-18 07:07:13,603][12883] Updated weights for policy 0, policy_version 82822 (0.0024) [2024-06-18 07:07:16,797][12883] Updated weights for policy 0, policy_version 82832 (0.0034) [2024-06-18 07:07:16,997][12645] Fps is (10 sec: 45860.8, 60 sec: 42869.1, 300 sec: 42542.4). Total num frames: 1357135872. Throughput: 0: 42664.4. Samples: 1357183240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:07:16,998][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 07:07:21,087][12883] Updated weights for policy 0, policy_version 82842 (0.0039) [2024-06-18 07:07:21,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1357316096. Throughput: 0: 42726.4. Samples: 1357437840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:07:21,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 07:07:24,926][12883] Updated weights for policy 0, policy_version 82852 (0.0025) [2024-06-18 07:07:26,994][12645] Fps is (10 sec: 40974.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1357545472. Throughput: 0: 42583.8. Samples: 1357688080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:07:26,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 07:07:29,266][12883] Updated weights for policy 0, policy_version 82862 (0.0042) [2024-06-18 07:07:29,284][12862] Signal inference workers to stop experience collection... (19650 times) [2024-06-18 07:07:29,284][12862] Signal inference workers to resume experience collection... (19650 times) [2024-06-18 07:07:29,302][12883] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-06-18 07:07:29,303][12883] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-06-18 07:07:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1357742080. Throughput: 0: 42688.4. 
Samples: 1357821600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:07:31,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 07:07:32,416][12883] Updated weights for policy 0, policy_version 82872 (0.0029) [2024-06-18 07:07:36,971][12883] Updated weights for policy 0, policy_version 82882 (0.0031) [2024-06-18 07:07:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1357938688. Throughput: 0: 42508.1. Samples: 1358070400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:07:36,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 07:07:39,976][12883] Updated weights for policy 0, policy_version 82892 (0.0028) [2024-06-18 07:07:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1358184448. Throughput: 0: 42497.8. Samples: 1358322840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:07:41,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 07:07:44,473][12883] Updated weights for policy 0, policy_version 82902 (0.0035) [2024-06-18 07:07:46,994][12645] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 1358364672. Throughput: 0: 42659.0. Samples: 1358458860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:07:46,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 07:07:47,753][12883] Updated weights for policy 0, policy_version 82912 (0.0034) [2024-06-18 07:07:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1358594048. Throughput: 0: 42345.4. Samples: 1358701500. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:07:51,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 07:07:51,997][12883] Updated weights for policy 0, policy_version 82922 (0.0043) [2024-06-18 07:07:55,928][12883] Updated weights for policy 0, policy_version 82932 (0.0033) [2024-06-18 07:07:56,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1358807040. Throughput: 0: 42212.6. Samples: 1358951560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:07:56,995][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 07:07:59,725][12883] Updated weights for policy 0, policy_version 82942 (0.0034) [2024-06-18 07:08:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1359003648. Throughput: 0: 42285.8. Samples: 1359085960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:08:01,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 07:08:03,634][12883] Updated weights for policy 0, policy_version 82952 (0.0029) [2024-06-18 07:08:06,994][12645] Fps is (10 sec: 40961.3, 60 sec: 42325.6, 300 sec: 42431.8). Total num frames: 1359216640. Throughput: 0: 42254.7. Samples: 1359339300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:08:06,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 07:08:07,280][12883] Updated weights for policy 0, policy_version 82962 (0.0033) [2024-06-18 07:08:11,216][12883] Updated weights for policy 0, policy_version 82972 (0.0036) [2024-06-18 07:08:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42053.8, 300 sec: 42376.2). Total num frames: 1359429632. Throughput: 0: 42395.9. Samples: 1359595900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:08:11,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 07:08:15,035][12883] Updated weights for policy 0, policy_version 82982 (0.0038) [2024-06-18 07:08:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41508.5, 300 sec: 42376.3). 
Total num frames: 1359626240. Throughput: 0: 42167.2. Samples: 1359719120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:08:16,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 07:08:18,819][12883] Updated weights for policy 0, policy_version 82992 (0.0025) [2024-06-18 07:08:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1359872000. Throughput: 0: 42328.7. Samples: 1359975200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:08:21,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 07:08:23,099][12883] Updated weights for policy 0, policy_version 83002 (0.0035) [2024-06-18 07:08:26,271][12883] Updated weights for policy 0, policy_version 83012 (0.0027) [2024-06-18 07:08:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 1360068608. Throughput: 0: 42499.6. Samples: 1360235320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 07:08:26,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 07:08:30,617][12883] Updated weights for policy 0, policy_version 83022 (0.0027) [2024-06-18 07:08:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1360281600. Throughput: 0: 42264.1. Samples: 1360360740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:08:31,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 07:08:34,373][12883] Updated weights for policy 0, policy_version 83032 (0.0034) [2024-06-18 07:08:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 1360510976. Throughput: 0: 42702.3. Samples: 1360623100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:08:36,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 07:08:37,104][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083040_1360527360.pth... 
[2024-06-18 07:08:37,155][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082417_1350320128.pth [2024-06-18 07:08:38,046][12883] Updated weights for policy 0, policy_version 83042 (0.0043) [2024-06-18 07:08:41,995][12883] Updated weights for policy 0, policy_version 83052 (0.0042) [2024-06-18 07:08:41,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42323.8, 300 sec: 42320.4). Total num frames: 1360723968. Throughput: 0: 42718.1. Samples: 1360873960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:08:41,996][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 07:08:45,633][12883] Updated weights for policy 0, policy_version 83062 (0.0045) [2024-06-18 07:08:46,792][12862] Signal inference workers to stop experience collection... (19700 times) [2024-06-18 07:08:46,796][12862] Signal inference workers to resume experience collection... (19700 times) [2024-06-18 07:08:46,816][12883] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-06-18 07:08:46,844][12883] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-06-18 07:08:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1360936960. Throughput: 0: 42642.2. Samples: 1361004860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:08:46,994][12645] Avg episode reward: [(0, '0.746')] [2024-06-18 07:08:47,002][12862] Saving new best policy, reward=0.746! [2024-06-18 07:08:49,838][12883] Updated weights for policy 0, policy_version 83072 (0.0043) [2024-06-18 07:08:51,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1361149952. Throughput: 0: 42735.4. Samples: 1361262400. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:08:51,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 07:08:53,880][12883] Updated weights for policy 0, policy_version 83082 (0.0029) [2024-06-18 07:08:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 1361346560. Throughput: 0: 42585.4. Samples: 1361512240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:08:56,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 07:08:57,447][12883] Updated weights for policy 0, policy_version 83092 (0.0045) [2024-06-18 07:09:01,453][12883] Updated weights for policy 0, policy_version 83102 (0.0038) [2024-06-18 07:09:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1361559552. Throughput: 0: 42597.8. Samples: 1361636020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:09:01,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 07:09:05,482][12883] Updated weights for policy 0, policy_version 83112 (0.0034) [2024-06-18 07:09:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1361772544. Throughput: 0: 42722.8. Samples: 1361897720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:09:06,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 07:09:09,028][12883] Updated weights for policy 0, policy_version 83122 (0.0033) [2024-06-18 07:09:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1361985536. Throughput: 0: 42485.0. Samples: 1362147140. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:09:11,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 07:09:13,093][12883] Updated weights for policy 0, policy_version 83132 (0.0045) [2024-06-18 07:09:16,650][12883] Updated weights for policy 0, policy_version 83142 (0.0028) [2024-06-18 07:09:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1362214912. Throughput: 0: 42558.2. Samples: 1362275860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 07:09:16,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 07:09:20,855][12883] Updated weights for policy 0, policy_version 83152 (0.0033) [2024-06-18 07:09:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1362411520. Throughput: 0: 42459.6. Samples: 1362533780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:09:21,996][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 07:09:24,205][12883] Updated weights for policy 0, policy_version 83162 (0.0037) [2024-06-18 07:09:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 1362624512. Throughput: 0: 42358.9. Samples: 1362780020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:09:26,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 07:09:28,807][12883] Updated weights for policy 0, policy_version 83172 (0.0042) [2024-06-18 07:09:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1362837504. Throughput: 0: 42312.9. Samples: 1362908940. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:09:31,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 07:09:32,259][12883] Updated weights for policy 0, policy_version 83182 (0.0039) [2024-06-18 07:09:36,497][12883] Updated weights for policy 0, policy_version 83192 (0.0041) [2024-06-18 07:09:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1363034112. Throughput: 0: 42320.1. Samples: 1363166800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:09:36,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 07:09:40,057][12883] Updated weights for policy 0, policy_version 83202 (0.0032) [2024-06-18 07:09:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42600.0, 300 sec: 42487.4). Total num frames: 1363279872. Throughput: 0: 42308.5. Samples: 1363416120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:09:41,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 07:09:44,285][12883] Updated weights for policy 0, policy_version 83212 (0.0025) [2024-06-18 07:09:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1363476480. Throughput: 0: 42511.9. Samples: 1363549060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:09:46,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 07:09:47,671][12883] Updated weights for policy 0, policy_version 83222 (0.0031) [2024-06-18 07:09:51,963][12883] Updated weights for policy 0, policy_version 83232 (0.0030) [2024-06-18 07:09:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1363673088. Throughput: 0: 42383.4. Samples: 1363804980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:09:51,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 07:09:55,145][12883] Updated weights for policy 0, policy_version 83242 (0.0044) [2024-06-18 07:09:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1363902464. Throughput: 0: 42402.3. Samples: 1364055240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:09:56,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 07:09:59,835][12883] Updated weights for policy 0, policy_version 83252 (0.0023) [2024-06-18 07:10:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1364115456. Throughput: 0: 42659.5. Samples: 1364195540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:10:01,996][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 07:10:02,688][12883] Updated weights for policy 0, policy_version 83262 (0.0031) [2024-06-18 07:10:06,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42325.2, 300 sec: 42431.7). Total num frames: 1364312064. Throughput: 0: 42451.5. Samples: 1364444100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 07:10:06,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 07:10:07,433][12883] Updated weights for policy 0, policy_version 83272 (0.0033) [2024-06-18 07:10:10,454][12883] Updated weights for policy 0, policy_version 83282 (0.0028) [2024-06-18 07:10:12,004][12645] Fps is (10 sec: 44191.3, 60 sec: 42864.1, 300 sec: 42541.4). Total num frames: 1364557824. Throughput: 0: 42524.1. Samples: 1364694040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:12,009][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 07:10:15,020][12883] Updated weights for policy 0, policy_version 83292 (0.0039) [2024-06-18 07:10:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1364754432. Throughput: 0: 42718.6. Samples: 1364831280. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:16,998][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 07:10:18,003][12883] Updated weights for policy 0, policy_version 83302 (0.0032) [2024-06-18 07:10:21,994][12645] Fps is (10 sec: 39362.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1364951040. Throughput: 0: 42570.2. Samples: 1365082460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:21,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 07:10:22,596][12883] Updated weights for policy 0, policy_version 83312 (0.0036) [2024-06-18 07:10:25,693][12883] Updated weights for policy 0, policy_version 83322 (0.0029) [2024-06-18 07:10:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1365196800. Throughput: 0: 42603.0. Samples: 1365333260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:26,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 07:10:30,524][12883] Updated weights for policy 0, policy_version 83332 (0.0039) [2024-06-18 07:10:30,995][12862] Signal inference workers to stop experience collection... (19750 times) [2024-06-18 07:10:30,996][12862] Signal inference workers to resume experience collection... (19750 times) [2024-06-18 07:10:31,020][12883] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-06-18 07:10:31,020][12883] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-06-18 07:10:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1365393408. Throughput: 0: 42724.1. Samples: 1365471640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:31,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 07:10:33,232][12883] Updated weights for policy 0, policy_version 83342 (0.0040) [2024-06-18 07:10:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1365606400. Throughput: 0: 42541.8. 
Samples: 1365719360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:36,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 07:10:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083350_1365606400.pth... [2024-06-18 07:10:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000082726_1355382784.pth [2024-06-18 07:10:38,398][12883] Updated weights for policy 0, policy_version 83352 (0.0042) [2024-06-18 07:10:41,283][12883] Updated weights for policy 0, policy_version 83362 (0.0035) [2024-06-18 07:10:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1365835776. Throughput: 0: 42557.2. Samples: 1365970320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:41,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 07:10:46,141][12883] Updated weights for policy 0, policy_version 83372 (0.0023) [2024-06-18 07:10:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1365999616. Throughput: 0: 42334.7. Samples: 1366100600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:46,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 07:10:48,877][12883] Updated weights for policy 0, policy_version 83382 (0.0041) [2024-06-18 07:10:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1366261760. Throughput: 0: 42530.7. Samples: 1366357980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:51,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 07:10:53,886][12883] Updated weights for policy 0, policy_version 83392 (0.0036) [2024-06-18 07:10:56,630][12883] Updated weights for policy 0, policy_version 83402 (0.0042) [2024-06-18 07:10:56,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1366474752. Throughput: 0: 42642.7. Samples: 1366612520. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:10:56,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 07:11:01,514][12883] Updated weights for policy 0, policy_version 83412 (0.0033) [2024-06-18 07:11:01,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1366638592. Throughput: 0: 42381.0. Samples: 1366738420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 20.0) [2024-06-18 07:11:01,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 07:11:04,334][12883] Updated weights for policy 0, policy_version 83422 (0.0028) [2024-06-18 07:11:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1366900736. Throughput: 0: 42518.1. Samples: 1366995780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:06,995][12645] Avg episode reward: [(0, '0.102')] [2024-06-18 07:11:09,285][12883] Updated weights for policy 0, policy_version 83432 (0.0041) [2024-06-18 07:11:11,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42332.6, 300 sec: 42487.3). Total num frames: 1367097344. Throughput: 0: 42510.3. Samples: 1367246220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:11,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 07:11:12,217][12883] Updated weights for policy 0, policy_version 83442 (0.0032) [2024-06-18 07:11:16,820][12883] Updated weights for policy 0, policy_version 83452 (0.0033) [2024-06-18 07:11:16,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1367277568. Throughput: 0: 42258.6. Samples: 1367373280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:16,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 07:11:20,098][12883] Updated weights for policy 0, policy_version 83462 (0.0028) [2024-06-18 07:11:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1367539712. Throughput: 0: 42518.7. Samples: 1367632700. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:21,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 07:11:24,375][12883] Updated weights for policy 0, policy_version 83472 (0.0035) [2024-06-18 07:11:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1367736320. Throughput: 0: 42699.6. Samples: 1367891800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:26,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 07:11:27,658][12883] Updated weights for policy 0, policy_version 83482 (0.0037) [2024-06-18 07:11:31,871][12883] Updated weights for policy 0, policy_version 83492 (0.0037) [2024-06-18 07:11:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1367932928. Throughput: 0: 42505.3. Samples: 1368013340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:31,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 07:11:35,411][12883] Updated weights for policy 0, policy_version 83502 (0.0041) [2024-06-18 07:11:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1368178688. Throughput: 0: 42621.7. Samples: 1368275960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:36,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 07:11:40,026][12883] Updated weights for policy 0, policy_version 83512 (0.0033) [2024-06-18 07:11:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1368358912. Throughput: 0: 42588.4. Samples: 1368529000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:41,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 07:11:43,098][12883] Updated weights for policy 0, policy_version 83522 (0.0032) [2024-06-18 07:11:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1368571904. Throughput: 0: 42624.3. Samples: 1368656520. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:46,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 07:11:47,697][12883] Updated weights for policy 0, policy_version 83532 (0.0038) [2024-06-18 07:11:47,863][12862] Signal inference workers to stop experience collection... (19800 times) [2024-06-18 07:11:47,863][12862] Signal inference workers to resume experience collection... (19800 times) [2024-06-18 07:11:47,897][12883] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-06-18 07:11:47,897][12883] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-06-18 07:11:50,534][12883] Updated weights for policy 0, policy_version 83542 (0.0029) [2024-06-18 07:11:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1368817664. Throughput: 0: 42693.1. Samples: 1368916960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 07:11:51,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 07:11:55,207][12883] Updated weights for policy 0, policy_version 83552 (0.0029) [2024-06-18 07:11:56,996][12645] Fps is (10 sec: 44228.8, 60 sec: 42324.0, 300 sec: 42487.0). Total num frames: 1369014272. Throughput: 0: 42904.9. Samples: 1369177020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:11:56,996][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 07:11:58,417][12883] Updated weights for policy 0, policy_version 83562 (0.0041) [2024-06-18 07:12:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1369227264. Throughput: 0: 42788.4. Samples: 1369298760. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:12:01,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 07:12:02,698][12883] Updated weights for policy 0, policy_version 83572 (0.0032) [2024-06-18 07:12:06,087][12883] Updated weights for policy 0, policy_version 83582 (0.0032) [2024-06-18 07:12:06,994][12645] Fps is (10 sec: 45883.2, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 1369473024. Throughput: 0: 42840.8. Samples: 1369560540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:12:06,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 07:12:10,276][12883] Updated weights for policy 0, policy_version 83592 (0.0038) [2024-06-18 07:12:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42376.7). Total num frames: 1369636864. Throughput: 0: 42723.1. Samples: 1369814340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:12:11,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 07:12:13,798][12883] Updated weights for policy 0, policy_version 83602 (0.0020) [2024-06-18 07:12:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1369866240. Throughput: 0: 42797.2. Samples: 1369939220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:12:16,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 07:12:18,233][12883] Updated weights for policy 0, policy_version 83612 (0.0039) [2024-06-18 07:12:21,457][12883] Updated weights for policy 0, policy_version 83622 (0.0038) [2024-06-18 07:12:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1370095616. Throughput: 0: 42603.2. Samples: 1370193100. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:12:21,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 07:12:25,765][12883] Updated weights for policy 0, policy_version 83632 (0.0031) [2024-06-18 07:12:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1370275840. Throughput: 0: 42843.6. Samples: 1370456960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:12:26,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 07:12:28,899][12883] Updated weights for policy 0, policy_version 83642 (0.0038) [2024-06-18 07:12:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1370505216. Throughput: 0: 42745.3. Samples: 1370580060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:12:32,000][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 07:12:33,263][12883] Updated weights for policy 0, policy_version 83652 (0.0034) [2024-06-18 07:12:36,670][12883] Updated weights for policy 0, policy_version 83662 (0.0027) [2024-06-18 07:12:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1370734592. Throughput: 0: 42652.0. Samples: 1370836300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:12:36,994][12645] Avg episode reward: [(0, '0.642')] [2024-06-18 07:12:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083663_1370734592.pth... [2024-06-18 07:12:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083040_1360527360.pth [2024-06-18 07:12:40,995][12883] Updated weights for policy 0, policy_version 83672 (0.0025) [2024-06-18 07:12:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1370931200. Throughput: 0: 42585.2. Samples: 1371093280. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:12:41,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 07:12:44,425][12883] Updated weights for policy 0, policy_version 83682 (0.0034) [2024-06-18 07:12:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1371144192. Throughput: 0: 42692.4. Samples: 1371219920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:12:46,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 07:12:48,621][12883] Updated weights for policy 0, policy_version 83692 (0.0040) [2024-06-18 07:12:51,831][12883] Updated weights for policy 0, policy_version 83702 (0.0026) [2024-06-18 07:12:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1371373568. Throughput: 0: 42635.6. Samples: 1371479140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:12:51,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 07:12:56,070][12883] Updated weights for policy 0, policy_version 83712 (0.0036) [2024-06-18 07:12:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42872.8, 300 sec: 42653.9). Total num frames: 1371586560. Throughput: 0: 42911.6. Samples: 1371745360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:12:56,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 07:12:59,694][12883] Updated weights for policy 0, policy_version 83722 (0.0024) [2024-06-18 07:13:01,996][12645] Fps is (10 sec: 42589.6, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1371799552. Throughput: 0: 42935.8. Samples: 1371871420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:13:01,996][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 07:13:03,963][12883] Updated weights for policy 0, policy_version 83732 (0.0037) [2024-06-18 07:13:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1372012544. Throughput: 0: 42998.2. Samples: 1372128020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:13:06,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 07:13:07,200][12883] Updated weights for policy 0, policy_version 83742 (0.0046) [2024-06-18 07:13:07,675][12862] Signal inference workers to stop experience collection... (19850 times) [2024-06-18 07:13:07,680][12862] Signal inference workers to resume experience collection... (19850 times) [2024-06-18 07:13:07,711][12883] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-06-18 07:13:07,711][12883] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-06-18 07:13:11,860][12883] Updated weights for policy 0, policy_version 83752 (0.0031) [2024-06-18 07:13:11,994][12645] Fps is (10 sec: 39330.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1372192768. Throughput: 0: 42969.4. Samples: 1372390580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:13:11,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 07:13:14,855][12883] Updated weights for policy 0, policy_version 83762 (0.0029) [2024-06-18 07:13:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1372454912. Throughput: 0: 42844.9. Samples: 1372508080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:13:16,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 07:13:19,397][12883] Updated weights for policy 0, policy_version 83772 (0.0037) [2024-06-18 07:13:21,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1372651520. Throughput: 0: 42883.6. Samples: 1372766060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:13:21,994][12645] Avg episode reward: [(0, '0.131')] [2024-06-18 07:13:22,296][12883] Updated weights for policy 0, policy_version 83782 (0.0028) [2024-06-18 07:13:26,846][12883] Updated weights for policy 0, policy_version 83792 (0.0044) [2024-06-18 07:13:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1372848128. Throughput: 0: 42892.1. Samples: 1373023420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:13:26,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 07:13:30,067][12883] Updated weights for policy 0, policy_version 83802 (0.0023) [2024-06-18 07:13:31,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1373093888. Throughput: 0: 42906.6. Samples: 1373150720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:13:31,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 07:13:34,310][12883] Updated weights for policy 0, policy_version 83812 (0.0039) [2024-06-18 07:13:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1373274112. Throughput: 0: 42919.7. Samples: 1373410520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:13:36,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 07:13:37,775][12883] Updated weights for policy 0, policy_version 83822 (0.0045) [2024-06-18 07:13:41,850][12883] Updated weights for policy 0, policy_version 83832 (0.0031) [2024-06-18 07:13:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1373519872. Throughput: 0: 42704.0. Samples: 1373667040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:13:41,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 07:13:45,374][12883] Updated weights for policy 0, policy_version 83842 (0.0028) [2024-06-18 07:13:46,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1373749248. Throughput: 0: 42771.0. Samples: 1373796020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:13:46,994][12645] Avg episode reward: [(0, '0.045')] [2024-06-18 07:13:49,443][12883] Updated weights for policy 0, policy_version 83852 (0.0052) [2024-06-18 07:13:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1373929472. Throughput: 0: 42855.2. Samples: 1374056500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:13:51,994][12645] Avg episode reward: [(0, '0.093')] [2024-06-18 07:13:53,045][12883] Updated weights for policy 0, policy_version 83862 (0.0032) [2024-06-18 07:13:56,961][12883] Updated weights for policy 0, policy_version 83872 (0.0034) [2024-06-18 07:13:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1374158848. Throughput: 0: 42770.2. Samples: 1374315240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:13:56,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 07:14:00,907][12883] Updated weights for policy 0, policy_version 83882 (0.0030) [2024-06-18 07:14:01,998][12645] Fps is (10 sec: 45854.2, 60 sec: 43142.8, 300 sec: 42764.4). Total num frames: 1374388224. Throughput: 0: 42913.9. Samples: 1374439400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:14:01,999][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 07:14:04,666][12883] Updated weights for policy 0, policy_version 83892 (0.0046) [2024-06-18 07:14:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1374568448. Throughput: 0: 42964.0. Samples: 1374699440. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:14:06,994][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 07:14:08,379][12883] Updated weights for policy 0, policy_version 83902 (0.0032) [2024-06-18 07:14:11,994][12645] Fps is (10 sec: 40978.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1374797824. Throughput: 0: 42861.3. Samples: 1374952180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:14:11,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 07:14:12,301][12883] Updated weights for policy 0, policy_version 83912 (0.0031) [2024-06-18 07:14:16,237][12883] Updated weights for policy 0, policy_version 83922 (0.0034) [2024-06-18 07:14:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1375010816. Throughput: 0: 42942.2. Samples: 1375083120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:14:16,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 07:14:20,062][12883] Updated weights for policy 0, policy_version 83932 (0.0036) [2024-06-18 07:14:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1375223808. Throughput: 0: 42844.8. Samples: 1375338540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 07:14:21,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 07:14:23,708][12883] Updated weights for policy 0, policy_version 83942 (0.0035) [2024-06-18 07:14:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1375436800. Throughput: 0: 42896.9. Samples: 1375597400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:14:27,002][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 07:14:27,613][12883] Updated weights for policy 0, policy_version 83952 (0.0040) [2024-06-18 07:14:31,179][12883] Updated weights for policy 0, policy_version 83962 (0.0054) [2024-06-18 07:14:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1375649792. Throughput: 0: 42797.7. Samples: 1375721920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:14:31,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 07:14:35,194][12883] Updated weights for policy 0, policy_version 83972 (0.0035) [2024-06-18 07:14:35,757][12862] Signal inference workers to stop experience collection... (19900 times) [2024-06-18 07:14:35,758][12862] Signal inference workers to resume experience collection... (19900 times) [2024-06-18 07:14:35,789][12883] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-06-18 07:14:35,789][12883] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-06-18 07:14:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1375879168. Throughput: 0: 42756.9. Samples: 1375980560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:14:36,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 07:14:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083977_1375879168.pth... [2024-06-18 07:14:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083350_1365606400.pth [2024-06-18 07:14:39,239][12883] Updated weights for policy 0, policy_version 83982 (0.0043) [2024-06-18 07:14:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1376075776. Throughput: 0: 42669.3. Samples: 1376235360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:14:41,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 07:14:43,155][12883] Updated weights for policy 0, policy_version 83992 (0.0034) [2024-06-18 07:14:46,737][12883] Updated weights for policy 0, policy_version 84002 (0.0052) [2024-06-18 07:14:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1376288768. Throughput: 0: 42681.2. Samples: 1376359860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:14:46,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 07:14:50,676][12883] Updated weights for policy 0, policy_version 84012 (0.0029) [2024-06-18 07:14:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1376501760. Throughput: 0: 42644.9. Samples: 1376618460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:14:51,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 07:14:54,704][12883] Updated weights for policy 0, policy_version 84022 (0.0036) [2024-06-18 07:14:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1376714752. Throughput: 0: 42805.7. Samples: 1376878440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:14:56,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 07:14:58,425][12883] Updated weights for policy 0, policy_version 84032 (0.0041) [2024-06-18 07:15:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42328.5, 300 sec: 42765.0). Total num frames: 1376927744. Throughput: 0: 42630.7. Samples: 1377001500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:15:01,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 07:15:02,323][12883] Updated weights for policy 0, policy_version 84042 (0.0023) [2024-06-18 07:15:06,300][12883] Updated weights for policy 0, policy_version 84052 (0.0029) [2024-06-18 07:15:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42711.0). Total num frames: 1377157120. Throughput: 0: 42629.8. Samples: 1377256880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:15:06,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 07:15:10,246][12883] Updated weights for policy 0, policy_version 84062 (0.0040) [2024-06-18 07:15:11,999][12645] Fps is (10 sec: 42574.5, 60 sec: 42594.3, 300 sec: 42708.7). Total num frames: 1377353728. Throughput: 0: 42588.0. Samples: 1377514100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:15:12,000][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 07:15:13,914][12883] Updated weights for policy 0, policy_version 84072 (0.0033) [2024-06-18 07:15:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1377566720. Throughput: 0: 42573.0. Samples: 1377637700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:15:16,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 07:15:17,890][12883] Updated weights for policy 0, policy_version 84082 (0.0037) [2024-06-18 07:15:21,359][12883] Updated weights for policy 0, policy_version 84092 (0.0028) [2024-06-18 07:15:21,994][12645] Fps is (10 sec: 42622.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1377779712. Throughput: 0: 42631.2. Samples: 1377898960. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:15:21,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 07:15:25,526][12883] Updated weights for policy 0, policy_version 84102 (0.0033) [2024-06-18 07:15:26,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 1377976320. Throughput: 0: 42602.3. Samples: 1378152560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:15:26,997][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 07:15:29,321][12883] Updated weights for policy 0, policy_version 84112 (0.0028) [2024-06-18 07:15:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1378222080. Throughput: 0: 42591.0. Samples: 1378276460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:15:31,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 07:15:33,109][12883] Updated weights for policy 0, policy_version 84122 (0.0038) [2024-06-18 07:15:36,822][12883] Updated weights for policy 0, policy_version 84132 (0.0031) [2024-06-18 07:15:36,994][12645] Fps is (10 sec: 44247.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1378418688. Throughput: 0: 42645.3. Samples: 1378537500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:15:36,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 07:15:40,730][12883] Updated weights for policy 0, policy_version 84142 (0.0034) [2024-06-18 07:15:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1378615296. Throughput: 0: 42711.5. Samples: 1378800460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:15:41,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 07:15:44,272][12883] Updated weights for policy 0, policy_version 84152 (0.0045) [2024-06-18 07:15:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1378861056. Throughput: 0: 42665.0. Samples: 1378921420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:15:46,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 07:15:48,278][12883] Updated weights for policy 0, policy_version 84162 (0.0026) [2024-06-18 07:15:52,000][12645] Fps is (10 sec: 45846.8, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 1379074048. Throughput: 0: 42826.1. Samples: 1379184320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:15:52,001][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 07:15:52,001][12883] Updated weights for policy 0, policy_version 84172 (0.0039) [2024-06-18 07:15:55,894][12883] Updated weights for policy 0, policy_version 84182 (0.0049) [2024-06-18 07:15:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1379270656. Throughput: 0: 42871.6. Samples: 1379443080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:15:56,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 07:15:58,692][12862] Signal inference workers to stop experience collection... (19950 times) [2024-06-18 07:15:58,692][12862] Signal inference workers to resume experience collection... (19950 times) [2024-06-18 07:15:58,739][12883] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-06-18 07:15:58,739][12883] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-06-18 07:15:59,461][12883] Updated weights for policy 0, policy_version 84192 (0.0046) [2024-06-18 07:16:01,994][12645] Fps is (10 sec: 42625.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1379500032. Throughput: 0: 42882.3. Samples: 1379567400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:16:01,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 07:16:03,404][12883] Updated weights for policy 0, policy_version 84202 (0.0032) [2024-06-18 07:16:06,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1379713024. Throughput: 0: 42864.6. 
Samples: 1379827880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-18 07:16:06,995][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 07:16:07,131][12883] Updated weights for policy 0, policy_version 84212 (0.0037) [2024-06-18 07:16:11,348][12883] Updated weights for policy 0, policy_version 84222 (0.0044) [2024-06-18 07:16:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42602.4, 300 sec: 42820.5). Total num frames: 1379909632. Throughput: 0: 42892.8. Samples: 1380082640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:11,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 07:16:15,313][12883] Updated weights for policy 0, policy_version 84232 (0.0035) [2024-06-18 07:16:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1380122624. Throughput: 0: 42855.6. Samples: 1380204960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:16,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 07:16:18,989][12883] Updated weights for policy 0, policy_version 84242 (0.0031) [2024-06-18 07:16:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1380352000. Throughput: 0: 42809.7. Samples: 1380463940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:21,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 07:16:22,878][12883] Updated weights for policy 0, policy_version 84252 (0.0045) [2024-06-18 07:16:26,749][12883] Updated weights for policy 0, policy_version 84262 (0.0030) [2024-06-18 07:16:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 1380564992. Throughput: 0: 42654.4. Samples: 1380719900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:26,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 07:16:30,465][12883] Updated weights for policy 0, policy_version 84272 (0.0049) [2024-06-18 07:16:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1380777984. Throughput: 0: 42769.7. Samples: 1380846060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:31,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 07:16:34,229][12883] Updated weights for policy 0, policy_version 84282 (0.0029) [2024-06-18 07:16:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1380974592. Throughput: 0: 42733.5. Samples: 1381107060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:36,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 07:16:37,127][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084290_1381007360.pth... [2024-06-18 07:16:37,171][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083663_1370734592.pth [2024-06-18 07:16:38,053][12883] Updated weights for policy 0, policy_version 84292 (0.0028) [2024-06-18 07:16:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1381203968. Throughput: 0: 42676.5. Samples: 1381363520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:41,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 07:16:42,244][12883] Updated weights for policy 0, policy_version 84303 (0.0031) [2024-06-18 07:16:45,951][12883] Updated weights for policy 0, policy_version 84313 (0.0028) [2024-06-18 07:16:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1381433344. Throughput: 0: 42800.3. Samples: 1381493420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:46,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 07:16:50,018][12883] Updated weights for policy 0, policy_version 84323 (0.0044) [2024-06-18 07:16:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42602.9, 300 sec: 42765.3). Total num frames: 1381629952. Throughput: 0: 42757.9. Samples: 1381751980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:51,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 07:16:53,690][12883] Updated weights for policy 0, policy_version 84333 (0.0036) [2024-06-18 07:16:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1381826560. Throughput: 0: 42750.7. Samples: 1382006420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:16:56,994][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 07:16:57,813][12883] Updated weights for policy 0, policy_version 84343 (0.0034) [2024-06-18 07:17:01,211][12883] Updated weights for policy 0, policy_version 84353 (0.0040) [2024-06-18 07:17:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1382072320. Throughput: 0: 42790.2. Samples: 1382130520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:01,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 07:17:05,358][12883] Updated weights for policy 0, policy_version 84363 (0.0031) [2024-06-18 07:17:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1382252544. Throughput: 0: 42649.4. Samples: 1382383160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:06,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 07:17:09,364][12883] Updated weights for policy 0, policy_version 84373 (0.0022) [2024-06-18 07:17:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1382465536. Throughput: 0: 42662.2. Samples: 1382639700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:11,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 07:17:13,013][12883] Updated weights for policy 0, policy_version 84383 (0.0033) [2024-06-18 07:17:14,528][12862] Signal inference workers to stop experience collection... (20000 times) [2024-06-18 07:17:14,581][12883] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-06-18 07:17:14,649][12862] Signal inference workers to resume experience collection... (20000 times) [2024-06-18 07:17:14,649][12883] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-06-18 07:17:16,928][12883] Updated weights for policy 0, policy_version 84393 (0.0038) [2024-06-18 07:17:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1382694912. Throughput: 0: 42705.3. Samples: 1382767800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:16,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 07:17:20,823][12883] Updated weights for policy 0, policy_version 84403 (0.0046) [2024-06-18 07:17:21,993][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 1382891520. Throughput: 0: 42519.7. Samples: 1383020440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:21,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 07:17:24,505][12883] Updated weights for policy 0, policy_version 84413 (0.0028) [2024-06-18 07:17:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1383104512. Throughput: 0: 42407.0. Samples: 1383271840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:26,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 07:17:28,778][12883] Updated weights for policy 0, policy_version 84423 (0.0041) [2024-06-18 07:17:31,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1383333888. Throughput: 0: 42403.2. 
Samples: 1383401560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:31,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 07:17:32,123][12883] Updated weights for policy 0, policy_version 84433 (0.0030) [2024-06-18 07:17:36,563][12883] Updated weights for policy 0, policy_version 84443 (0.0038) [2024-06-18 07:17:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1383514112. Throughput: 0: 42337.4. Samples: 1383657160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:36,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 07:17:39,781][12883] Updated weights for policy 0, policy_version 84453 (0.0039) [2024-06-18 07:17:42,000][12645] Fps is (10 sec: 42571.8, 60 sec: 42593.9, 300 sec: 42764.1). Total num frames: 1383759872. Throughput: 0: 42165.7. Samples: 1383904140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:42,001][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 07:17:44,591][12883] Updated weights for policy 0, policy_version 84463 (0.0024) [2024-06-18 07:17:46,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 1383972864. Throughput: 0: 42485.9. Samples: 1384042380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 07:17:46,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 07:17:47,628][12883] Updated weights for policy 0, policy_version 84473 (0.0046) [2024-06-18 07:17:51,994][12645] Fps is (10 sec: 39346.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1384153088. Throughput: 0: 42412.4. Samples: 1384291720. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:17:51,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 07:17:52,154][12883] Updated weights for policy 0, policy_version 84483 (0.0033) [2024-06-18 07:17:55,148][12883] Updated weights for policy 0, policy_version 84493 (0.0033) [2024-06-18 07:17:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 1384415232. Throughput: 0: 42352.5. Samples: 1384545560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:17:56,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 07:17:59,859][12883] Updated weights for policy 0, policy_version 84503 (0.0029) [2024-06-18 07:18:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1384595456. Throughput: 0: 42634.3. Samples: 1384686340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:18:01,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 07:18:02,820][12883] Updated weights for policy 0, policy_version 84513 (0.0041) [2024-06-18 07:18:06,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 1384792064. Throughput: 0: 42374.0. Samples: 1384927280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:18:06,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 07:18:07,533][12883] Updated weights for policy 0, policy_version 84523 (0.0041) [2024-06-18 07:18:10,389][12883] Updated weights for policy 0, policy_version 84533 (0.0031) [2024-06-18 07:18:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1385054208. Throughput: 0: 42471.2. Samples: 1385183040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:18:11,994][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 07:18:15,212][12883] Updated weights for policy 0, policy_version 84543 (0.0040) [2024-06-18 07:18:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 42598.4). 
Total num frames: 1385218048. Throughput: 0: 42681.5. Samples: 1385322220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:18:16,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 07:18:17,980][12883] Updated weights for policy 0, policy_version 84553 (0.0042) [2024-06-18 07:18:21,998][12645] Fps is (10 sec: 39304.9, 60 sec: 42595.3, 300 sec: 42708.9). Total num frames: 1385447424. Throughput: 0: 42531.5. Samples: 1385571260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:18:21,998][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 07:18:22,753][12883] Updated weights for policy 0, policy_version 84563 (0.0042) [2024-06-18 07:18:25,582][12883] Updated weights for policy 0, policy_version 84573 (0.0040) [2024-06-18 07:18:26,996][12645] Fps is (10 sec: 45864.2, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1385676800. Throughput: 0: 42788.2. Samples: 1385829440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:18:26,996][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 07:18:30,326][12883] Updated weights for policy 0, policy_version 84583 (0.0035) [2024-06-18 07:18:31,994][12645] Fps is (10 sec: 42616.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1385873408. Throughput: 0: 42625.7. Samples: 1385960540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:18:31,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 07:18:33,255][12883] Updated weights for policy 0, policy_version 84593 (0.0035) [2024-06-18 07:18:34,153][12862] Signal inference workers to stop experience collection... (20050 times) [2024-06-18 07:18:34,153][12862] Signal inference workers to resume experience collection... 
(20050 times) [2024-06-18 07:18:34,166][12883] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-06-18 07:18:34,167][12883] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-06-18 07:18:36,994][12645] Fps is (10 sec: 42608.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1386102784. Throughput: 0: 42664.1. Samples: 1386211600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:18:36,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 07:18:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084601_1386102784.pth... [2024-06-18 07:18:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000083977_1375879168.pth [2024-06-18 07:18:38,117][12883] Updated weights for policy 0, policy_version 84603 (0.0036) [2024-06-18 07:18:40,946][12883] Updated weights for policy 0, policy_version 84613 (0.0035) [2024-06-18 07:18:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42602.8, 300 sec: 42598.4). Total num frames: 1386315776. Throughput: 0: 42731.0. Samples: 1386468460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 07:18:41,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 07:18:45,628][12883] Updated weights for policy 0, policy_version 84623 (0.0034) [2024-06-18 07:18:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1386496000. Throughput: 0: 42476.9. Samples: 1386597800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:18:46,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 07:18:48,827][12883] Updated weights for policy 0, policy_version 84633 (0.0032) [2024-06-18 07:18:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1386741760. Throughput: 0: 42859.1. Samples: 1386855940. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:18:51,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 07:18:53,136][12883] Updated weights for policy 0, policy_version 84643 (0.0034) [2024-06-18 07:18:56,570][12883] Updated weights for policy 0, policy_version 84653 (0.0027) [2024-06-18 07:18:56,998][12645] Fps is (10 sec: 45853.2, 60 sec: 42322.0, 300 sec: 42598.4). Total num frames: 1386954752. Throughput: 0: 42776.0. Samples: 1387108160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:18:56,999][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 07:19:00,812][12883] Updated weights for policy 0, policy_version 84663 (0.0045) [2024-06-18 07:19:01,994][12645] Fps is (10 sec: 37683.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1387118592. Throughput: 0: 42567.1. Samples: 1387237740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:19:01,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 07:19:04,359][12883] Updated weights for policy 0, policy_version 84673 (0.0035) [2024-06-18 07:19:06,994][12645] Fps is (10 sec: 44257.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1387397120. Throughput: 0: 42678.2. Samples: 1387491600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:19:06,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 07:19:08,507][12883] Updated weights for policy 0, policy_version 84683 (0.0032) [2024-06-18 07:19:11,994][12645] Fps is (10 sec: 47512.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1387593728. Throughput: 0: 42719.4. Samples: 1387751720. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:19:11,995][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 07:19:12,530][12883] Updated weights for policy 0, policy_version 84693 (0.0037) [2024-06-18 07:19:16,369][12883] Updated weights for policy 0, policy_version 84703 (0.0027) [2024-06-18 07:19:16,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1387773952. Throughput: 0: 42537.4. Samples: 1387874720. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:19:16,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 07:19:19,883][12883] Updated weights for policy 0, policy_version 84713 (0.0029) [2024-06-18 07:19:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43147.5, 300 sec: 42709.5). Total num frames: 1388036096. Throughput: 0: 42832.4. Samples: 1388139060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:19:21,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 07:19:23,873][12883] Updated weights for policy 0, policy_version 84723 (0.0039) [2024-06-18 07:19:26,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 1388232704. Throughput: 0: 42765.7. Samples: 1388392920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:19:26,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 07:19:27,487][12883] Updated weights for policy 0, policy_version 84733 (0.0027) [2024-06-18 07:19:31,487][12883] Updated weights for policy 0, policy_version 84743 (0.0025) [2024-06-18 07:19:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1388429312. Throughput: 0: 42710.1. Samples: 1388519760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 24.0) [2024-06-18 07:19:31,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 07:19:34,981][12883] Updated weights for policy 0, policy_version 84753 (0.0037) [2024-06-18 07:19:35,428][12862] Signal inference workers to stop experience collection... 
(20100 times) [2024-06-18 07:19:35,428][12862] Signal inference workers to resume experience collection... (20100 times) [2024-06-18 07:19:35,447][12883] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-06-18 07:19:35,448][12883] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-06-18 07:19:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1388675072. Throughput: 0: 42726.2. Samples: 1388778620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:19:36,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 07:19:39,079][12883] Updated weights for policy 0, policy_version 84763 (0.0029) [2024-06-18 07:19:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1388888064. Throughput: 0: 42930.7. Samples: 1389039840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:19:41,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 07:19:42,639][12883] Updated weights for policy 0, policy_version 84773 (0.0041) [2024-06-18 07:19:46,859][12883] Updated weights for policy 0, policy_version 84783 (0.0038) [2024-06-18 07:19:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1389084672. Throughput: 0: 42837.2. Samples: 1389165420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:19:46,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 07:19:50,241][12883] Updated weights for policy 0, policy_version 84793 (0.0039) [2024-06-18 07:19:51,996][12645] Fps is (10 sec: 44226.9, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 1389330432. Throughput: 0: 43088.6. Samples: 1389430680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:19:51,996][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 07:19:54,649][12883] Updated weights for policy 0, policy_version 84803 (0.0033) [2024-06-18 07:19:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42874.9, 300 sec: 42709.5). Total num frames: 1389527040. Throughput: 0: 43188.2. Samples: 1389695180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:19:56,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 07:19:57,825][12883] Updated weights for policy 0, policy_version 84813 (0.0038) [2024-06-18 07:20:01,994][12645] Fps is (10 sec: 39330.2, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 1389723648. Throughput: 0: 43140.3. Samples: 1389816040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:20:01,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 07:20:02,064][12883] Updated weights for policy 0, policy_version 84823 (0.0044) [2024-06-18 07:20:05,251][12883] Updated weights for policy 0, policy_version 84833 (0.0039) [2024-06-18 07:20:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.8). Total num frames: 1389969408. Throughput: 0: 43031.2. Samples: 1390075460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:20:06,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 07:20:09,539][12883] Updated weights for policy 0, policy_version 84843 (0.0027) [2024-06-18 07:20:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1390149632. Throughput: 0: 43138.4. Samples: 1390334140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:20:11,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 07:20:12,878][12883] Updated weights for policy 0, policy_version 84853 (0.0041) [2024-06-18 07:20:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1390379008. Throughput: 0: 43177.0. Samples: 1390462720. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:20:16,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 07:20:17,051][12883] Updated weights for policy 0, policy_version 84863 (0.0030) [2024-06-18 07:20:20,476][12883] Updated weights for policy 0, policy_version 84873 (0.0048) [2024-06-18 07:20:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 1390592000. Throughput: 0: 43014.2. Samples: 1390714260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 26.0) [2024-06-18 07:20:21,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 07:20:24,572][12883] Updated weights for policy 0, policy_version 84883 (0.0038) [2024-06-18 07:20:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1390821376. Throughput: 0: 43084.4. Samples: 1390978640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:20:26,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 07:20:28,331][12883] Updated weights for policy 0, policy_version 84893 (0.0036) [2024-06-18 07:20:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1391034368. Throughput: 0: 43139.7. Samples: 1391106700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:20:31,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 07:20:32,120][12883] Updated weights for policy 0, policy_version 84903 (0.0041) [2024-06-18 07:20:36,022][12883] Updated weights for policy 0, policy_version 84913 (0.0025) [2024-06-18 07:20:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1391247360. Throughput: 0: 42927.4. Samples: 1391362320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:20:36,994][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 07:20:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084915_1391247360.pth... 
[2024-06-18 07:20:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084290_1381007360.pth [2024-06-18 07:20:39,548][12883] Updated weights for policy 0, policy_version 84923 (0.0041) [2024-06-18 07:20:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1391443968. Throughput: 0: 42788.4. Samples: 1391620660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:20:41,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 07:20:43,617][12883] Updated weights for policy 0, policy_version 84933 (0.0034) [2024-06-18 07:20:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42765.9). Total num frames: 1391689728. Throughput: 0: 42900.6. Samples: 1391746560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:20:46,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 07:20:47,338][12883] Updated weights for policy 0, policy_version 84943 (0.0031) [2024-06-18 07:20:51,138][12883] Updated weights for policy 0, policy_version 84953 (0.0029) [2024-06-18 07:20:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1391886336. Throughput: 0: 42742.2. Samples: 1391998860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:20:51,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 07:20:55,089][12883] Updated weights for policy 0, policy_version 84963 (0.0035) [2024-06-18 07:20:56,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1392082944. Throughput: 0: 42847.4. Samples: 1392262280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:20:56,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 07:20:58,830][12883] Updated weights for policy 0, policy_version 84973 (0.0037) [2024-06-18 07:21:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1392328704. Throughput: 0: 42753.3. Samples: 1392386620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:21:01,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 07:21:02,497][12883] Updated weights for policy 0, policy_version 84983 (0.0046) [2024-06-18 07:21:06,657][12883] Updated weights for policy 0, policy_version 84993 (0.0038) [2024-06-18 07:21:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1392525312. Throughput: 0: 42817.0. Samples: 1392641020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:21:06,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 07:21:07,521][12862] Signal inference workers to stop experience collection... (20150 times) [2024-06-18 07:21:07,528][12862] Signal inference workers to resume experience collection... (20150 times) [2024-06-18 07:21:07,568][12883] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-06-18 07:21:07,568][12883] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-06-18 07:21:10,010][12883] Updated weights for policy 0, policy_version 85003 (0.0033) [2024-06-18 07:21:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1392721920. Throughput: 0: 42744.5. Samples: 1392902140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:21:11,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 07:21:14,355][12883] Updated weights for policy 0, policy_version 85013 (0.0025) [2024-06-18 07:21:16,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1392967680. Throughput: 0: 42723.3. Samples: 1393029260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:21:17,000][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 07:21:17,962][12883] Updated weights for policy 0, policy_version 85023 (0.0027) [2024-06-18 07:21:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1393164288. Throughput: 0: 42816.1. 
Samples: 1393289040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:21:21,994][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 07:21:22,036][12883] Updated weights for policy 0, policy_version 85033 (0.0034) [2024-06-18 07:21:25,511][12883] Updated weights for policy 0, policy_version 85043 (0.0037) [2024-06-18 07:21:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1393377280. Throughput: 0: 42780.9. Samples: 1393545800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:21:26,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 07:21:29,618][12883] Updated weights for policy 0, policy_version 85053 (0.0031) [2024-06-18 07:21:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1393606656. Throughput: 0: 42822.1. Samples: 1393673560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:21:31,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 07:21:33,115][12883] Updated weights for policy 0, policy_version 85063 (0.0045) [2024-06-18 07:21:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1393803264. Throughput: 0: 42994.3. Samples: 1393933600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:21:36,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 07:21:37,259][12883] Updated weights for policy 0, policy_version 85073 (0.0038) [2024-06-18 07:21:41,173][12883] Updated weights for policy 0, policy_version 85083 (0.0034) [2024-06-18 07:21:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1394016256. Throughput: 0: 42728.6. Samples: 1394185060. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:21:41,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 07:21:44,865][12883] Updated weights for policy 0, policy_version 85093 (0.0026) [2024-06-18 07:21:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1394245632. Throughput: 0: 42887.1. Samples: 1394316540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:21:46,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 07:21:48,561][12883] Updated weights for policy 0, policy_version 85103 (0.0028) [2024-06-18 07:21:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1394442240. Throughput: 0: 43025.8. Samples: 1394577180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:21:51,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 07:21:52,369][12883] Updated weights for policy 0, policy_version 85113 (0.0031) [2024-06-18 07:21:56,055][12883] Updated weights for policy 0, policy_version 85123 (0.0041) [2024-06-18 07:21:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1394671616. Throughput: 0: 42725.2. Samples: 1394824780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:21:56,996][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 07:21:59,953][12883] Updated weights for policy 0, policy_version 85133 (0.0032) [2024-06-18 07:22:01,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42050.7, 300 sec: 42709.2). Total num frames: 1394851840. Throughput: 0: 42777.2. Samples: 1394954320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:22:01,996][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 07:22:03,601][12883] Updated weights for policy 0, policy_version 85143 (0.0025) [2024-06-18 07:22:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1395097600. Throughput: 0: 42864.9. Samples: 1395217960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 07:22:06,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 07:22:07,577][12883] Updated weights for policy 0, policy_version 85153 (0.0047) [2024-06-18 07:22:11,249][12883] Updated weights for policy 0, policy_version 85163 (0.0026) [2024-06-18 07:22:11,995][12645] Fps is (10 sec: 45877.8, 60 sec: 43143.3, 300 sec: 42764.8). Total num frames: 1395310592. Throughput: 0: 42686.0. Samples: 1395466740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:11,996][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 07:22:15,455][12883] Updated weights for policy 0, policy_version 85173 (0.0037) [2024-06-18 07:22:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1395507200. Throughput: 0: 42776.0. Samples: 1395598480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:16,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 07:22:18,750][12883] Updated weights for policy 0, policy_version 85183 (0.0033) [2024-06-18 07:22:21,081][12862] Signal inference workers to stop experience collection... (20200 times) [2024-06-18 07:22:21,127][12883] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-06-18 07:22:21,135][12862] Signal inference workers to resume experience collection... (20200 times) [2024-06-18 07:22:21,144][12883] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-06-18 07:22:21,994][12645] Fps is (10 sec: 42605.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1395736576. Throughput: 0: 42833.4. Samples: 1395861100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:21,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 07:22:23,020][12883] Updated weights for policy 0, policy_version 85193 (0.0030) [2024-06-18 07:22:26,520][12883] Updated weights for policy 0, policy_version 85203 (0.0039) [2024-06-18 07:22:26,996][12645] Fps is (10 sec: 45865.3, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 1395965952. Throughput: 0: 42865.8. Samples: 1396114120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:26,996][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 07:22:30,878][12883] Updated weights for policy 0, policy_version 85213 (0.0026) [2024-06-18 07:22:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1396162560. Throughput: 0: 42890.2. Samples: 1396246600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:31,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 07:22:34,644][12883] Updated weights for policy 0, policy_version 85223 (0.0027) [2024-06-18 07:22:36,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42821.5). Total num frames: 1396391936. Throughput: 0: 42852.8. Samples: 1396505560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:36,994][12645] Avg episode reward: [(0, '0.105')] [2024-06-18 07:22:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085230_1396408320.pth... [2024-06-18 07:22:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084601_1386102784.pth [2024-06-18 07:22:38,443][12883] Updated weights for policy 0, policy_version 85233 (0.0030) [2024-06-18 07:22:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1396604928. Throughput: 0: 43078.7. Samples: 1396763320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:41,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 07:22:42,362][12883] Updated weights for policy 0, policy_version 85243 (0.0043) [2024-06-18 07:22:45,935][12883] Updated weights for policy 0, policy_version 85253 (0.0038) [2024-06-18 07:22:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1396785152. Throughput: 0: 42916.8. Samples: 1396885480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:46,995][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 07:22:49,715][12883] Updated weights for policy 0, policy_version 85263 (0.0034) [2024-06-18 07:22:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1397030912. Throughput: 0: 42868.3. Samples: 1397147040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:51,996][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 07:22:53,445][12883] Updated weights for policy 0, policy_version 85273 (0.0044) [2024-06-18 07:22:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1397243904. Throughput: 0: 42960.1. Samples: 1397399880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:22:56,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 07:22:57,466][12883] Updated weights for policy 0, policy_version 85283 (0.0022) [2024-06-18 07:23:01,208][12883] Updated weights for policy 0, policy_version 85293 (0.0027) [2024-06-18 07:23:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 1397440512. Throughput: 0: 42972.5. Samples: 1397532240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 07:23:01,994][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 07:23:05,851][12883] Updated weights for policy 0, policy_version 85303 (0.0033) [2024-06-18 07:23:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). 
Total num frames: 1397653504. Throughput: 0: 42961.7. Samples: 1397794380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:06,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 07:23:08,855][12883] Updated weights for policy 0, policy_version 85313 (0.0035) [2024-06-18 07:23:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 43145.8, 300 sec: 42987.2). Total num frames: 1397899264. Throughput: 0: 42829.8. Samples: 1398041360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:11,994][12645] Avg episode reward: [(0, '0.161')] [2024-06-18 07:23:13,475][12883] Updated weights for policy 0, policy_version 85323 (0.0038) [2024-06-18 07:23:16,478][12883] Updated weights for policy 0, policy_version 85333 (0.0035) [2024-06-18 07:23:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42876.7). Total num frames: 1398095872. Throughput: 0: 42874.6. Samples: 1398175960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:16,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 07:23:20,991][12883] Updated weights for policy 0, policy_version 85343 (0.0041) [2024-06-18 07:23:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1398308864. Throughput: 0: 42782.7. Samples: 1398430780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:21,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 07:23:24,163][12883] Updated weights for policy 0, policy_version 85353 (0.0036) [2024-06-18 07:23:26,637][12862] Signal inference workers to stop experience collection... (20250 times) [2024-06-18 07:23:26,637][12862] Signal inference workers to resume experience collection... 
(20250 times) [2024-06-18 07:23:26,682][12883] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-06-18 07:23:26,682][12883] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-06-18 07:23:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.0, 300 sec: 42931.6). Total num frames: 1398538240. Throughput: 0: 42755.0. Samples: 1398687300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:26,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 07:23:28,614][12883] Updated weights for policy 0, policy_version 85363 (0.0039) [2024-06-18 07:23:31,656][12883] Updated weights for policy 0, policy_version 85373 (0.0039) [2024-06-18 07:23:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1398751232. Throughput: 0: 42835.5. Samples: 1398813080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:31,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 07:23:36,211][12883] Updated weights for policy 0, policy_version 85383 (0.0031) [2024-06-18 07:23:36,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1398947840. Throughput: 0: 42840.2. Samples: 1399074840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:36,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 07:23:39,161][12883] Updated weights for policy 0, policy_version 85393 (0.0040) [2024-06-18 07:23:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1399160832. Throughput: 0: 42797.0. Samples: 1399325740. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:41,994][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 07:23:43,944][12883] Updated weights for policy 0, policy_version 85403 (0.0029) [2024-06-18 07:23:46,755][12883] Updated weights for policy 0, policy_version 85413 (0.0038) [2024-06-18 07:23:46,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 1399406592. Throughput: 0: 42705.7. Samples: 1399454000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:46,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 07:23:51,519][12883] Updated weights for policy 0, policy_version 85423 (0.0031) [2024-06-18 07:23:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42765.7). Total num frames: 1399570432. Throughput: 0: 42431.1. Samples: 1399703780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:23:51,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 07:23:55,081][12883] Updated weights for policy 0, policy_version 85433 (0.0044) [2024-06-18 07:23:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 1399799808. Throughput: 0: 42605.7. Samples: 1399958620. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:23:56,994][12645] Avg episode reward: [(0, '0.646')] [2024-06-18 07:23:59,388][12883] Updated weights for policy 0, policy_version 85443 (0.0035) [2024-06-18 07:24:01,994][12645] Fps is (10 sec: 47513.0, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1400045568. Throughput: 0: 42454.6. Samples: 1400086420. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:24:01,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 07:24:02,773][12883] Updated weights for policy 0, policy_version 85453 (0.0030) [2024-06-18 07:24:06,919][12883] Updated weights for policy 0, policy_version 85463 (0.0040) [2024-06-18 07:24:06,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 1400225792. Throughput: 0: 42470.7. Samples: 1400342060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:24:06,996][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 07:24:10,510][12883] Updated weights for policy 0, policy_version 85473 (0.0047) [2024-06-18 07:24:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.2, 300 sec: 42931.6). Total num frames: 1400438784. Throughput: 0: 42367.2. Samples: 1400593820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:24:11,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 07:24:14,674][12883] Updated weights for policy 0, policy_version 85483 (0.0043) [2024-06-18 07:24:16,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1400668160. Throughput: 0: 42531.2. Samples: 1400726980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:24:16,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 07:24:18,159][12883] Updated weights for policy 0, policy_version 85493 (0.0027) [2024-06-18 07:24:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1400864768. Throughput: 0: 42363.3. Samples: 1400981200. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:24:21,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 07:24:22,834][12883] Updated weights for policy 0, policy_version 85503 (0.0030) [2024-06-18 07:24:26,244][12883] Updated weights for policy 0, policy_version 85513 (0.0041) [2024-06-18 07:24:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1401077760. Throughput: 0: 42297.8. Samples: 1401229140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:24:26,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 07:24:30,253][12883] Updated weights for policy 0, policy_version 85523 (0.0049) [2024-06-18 07:24:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1401290752. Throughput: 0: 42357.7. Samples: 1401360100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:24:31,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 07:24:33,728][12883] Updated weights for policy 0, policy_version 85533 (0.0044) [2024-06-18 07:24:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1401503744. Throughput: 0: 42566.7. Samples: 1401619280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:24:36,994][12645] Avg episode reward: [(0, '0.046')] [2024-06-18 07:24:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085542_1401520128.pth... [2024-06-18 07:24:37,068][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000084915_1391247360.pth [2024-06-18 07:24:37,763][12883] Updated weights for policy 0, policy_version 85543 (0.0030) [2024-06-18 07:24:39,648][12862] Signal inference workers to stop experience collection... (20300 times) [2024-06-18 07:24:39,703][12862] Signal inference workers to resume experience collection... 
(20300 times) [2024-06-18 07:24:39,703][12883] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-06-18 07:24:39,721][12883] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-06-18 07:24:41,308][12883] Updated weights for policy 0, policy_version 85553 (0.0044) [2024-06-18 07:24:41,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1401733120. Throughput: 0: 42549.8. Samples: 1401873360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-18 07:24:41,994][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 07:24:45,356][12883] Updated weights for policy 0, policy_version 85563 (0.0034) [2024-06-18 07:24:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 42598.7). Total num frames: 1401896960. Throughput: 0: 42672.6. Samples: 1402006680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:24:46,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 07:24:48,784][12883] Updated weights for policy 0, policy_version 85573 (0.0023) [2024-06-18 07:24:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1402159104. Throughput: 0: 42742.6. Samples: 1402265380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:24:51,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 07:24:53,061][12883] Updated weights for policy 0, policy_version 85583 (0.0036) [2024-06-18 07:24:56,570][12883] Updated weights for policy 0, policy_version 85593 (0.0028) [2024-06-18 07:24:56,994][12645] Fps is (10 sec: 49151.3, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1402388480. Throughput: 0: 42822.2. Samples: 1402520820. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:24:56,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 07:25:00,692][12883] Updated weights for policy 0, policy_version 85603 (0.0030) [2024-06-18 07:25:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 1402552320. Throughput: 0: 42710.7. Samples: 1402648960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:25:01,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 07:25:03,971][12883] Updated weights for policy 0, policy_version 85613 (0.0029) [2024-06-18 07:25:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42873.0, 300 sec: 42876.1). Total num frames: 1402798080. Throughput: 0: 42717.8. Samples: 1402903500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:25:06,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 07:25:08,201][12883] Updated weights for policy 0, policy_version 85623 (0.0043) [2024-06-18 07:25:11,871][12883] Updated weights for policy 0, policy_version 85633 (0.0028) [2024-06-18 07:25:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1403011072. Throughput: 0: 43101.0. Samples: 1403168680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:25:11,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 07:25:15,686][12883] Updated weights for policy 0, policy_version 85643 (0.0028) [2024-06-18 07:25:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1403207680. Throughput: 0: 42995.6. Samples: 1403294900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:25:16,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 07:25:19,489][12883] Updated weights for policy 0, policy_version 85653 (0.0033) [2024-06-18 07:25:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1403453440. Throughput: 0: 42829.8. Samples: 1403546620. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:25:21,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 07:25:23,262][12883] Updated weights for policy 0, policy_version 85663 (0.0040) [2024-06-18 07:25:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1403650048. Throughput: 0: 43024.0. Samples: 1403809440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:25:26,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 07:25:27,122][12883] Updated weights for policy 0, policy_version 85673 (0.0029) [2024-06-18 07:25:30,858][12883] Updated weights for policy 0, policy_version 85683 (0.0046) [2024-06-18 07:25:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1403863040. Throughput: 0: 42811.1. Samples: 1403933180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:25:31,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 07:25:34,813][12883] Updated weights for policy 0, policy_version 85693 (0.0031) [2024-06-18 07:25:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1404076032. Throughput: 0: 42768.5. Samples: 1404189960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 07:25:36,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 07:25:38,682][12883] Updated weights for policy 0, policy_version 85703 (0.0043) [2024-06-18 07:25:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1404272640. Throughput: 0: 42801.8. Samples: 1404446900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:25:41,994][12645] Avg episode reward: [(0, '0.096')] [2024-06-18 07:25:42,722][12883] Updated weights for policy 0, policy_version 85713 (0.0033) [2024-06-18 07:25:46,277][12883] Updated weights for policy 0, policy_version 85723 (0.0035) [2024-06-18 07:25:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43417.6, 300 sec: 42765.0). 
Total num frames: 1404502016. Throughput: 0: 42726.7. Samples: 1404571660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:25:46,994][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 07:25:50,247][12862] Signal inference workers to stop experience collection... (20350 times) [2024-06-18 07:25:50,247][12862] Signal inference workers to resume experience collection... (20350 times) [2024-06-18 07:25:50,262][12883] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-06-18 07:25:50,287][12883] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-06-18 07:25:50,393][12883] Updated weights for policy 0, policy_version 85733 (0.0032) [2024-06-18 07:25:51,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1404715008. Throughput: 0: 42788.6. Samples: 1404828980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:25:51,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 07:25:54,346][12883] Updated weights for policy 0, policy_version 85743 (0.0032) [2024-06-18 07:25:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1404928000. Throughput: 0: 42502.4. Samples: 1405081300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:25:56,994][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 07:25:58,250][12883] Updated weights for policy 0, policy_version 85753 (0.0027) [2024-06-18 07:26:01,773][12883] Updated weights for policy 0, policy_version 85763 (0.0031) [2024-06-18 07:26:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1405140992. Throughput: 0: 42519.6. Samples: 1405208280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:26:01,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 07:26:05,859][12883] Updated weights for policy 0, policy_version 85773 (0.0036) [2024-06-18 07:26:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1405370368. Throughput: 0: 42794.6. Samples: 1405472380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:26:06,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 07:26:09,724][12883] Updated weights for policy 0, policy_version 85783 (0.0047) [2024-06-18 07:26:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1405583360. Throughput: 0: 42619.6. Samples: 1405727320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:26:11,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 07:26:13,757][12883] Updated weights for policy 0, policy_version 85793 (0.0041) [2024-06-18 07:26:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1405779968. Throughput: 0: 42656.4. Samples: 1405852720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:26:16,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 07:26:17,201][12883] Updated weights for policy 0, policy_version 85803 (0.0043) [2024-06-18 07:26:21,444][12883] Updated weights for policy 0, policy_version 85813 (0.0038) [2024-06-18 07:26:21,997][12645] Fps is (10 sec: 40945.2, 60 sec: 42322.7, 300 sec: 42764.5). Total num frames: 1405992960. Throughput: 0: 42609.4. Samples: 1406107540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:26:21,998][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 07:26:24,654][12883] Updated weights for policy 0, policy_version 85823 (0.0037) [2024-06-18 07:26:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1406205952. Throughput: 0: 42518.6. Samples: 1406360240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:26:26,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 07:26:29,138][12883] Updated weights for policy 0, policy_version 85833 (0.0037) [2024-06-18 07:26:31,994][12645] Fps is (10 sec: 42614.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1406418944. Throughput: 0: 42696.9. Samples: 1406493020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:26:31,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 07:26:32,991][12883] Updated weights for policy 0, policy_version 85843 (0.0038) [2024-06-18 07:26:36,994][12645] Fps is (10 sec: 37683.7, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 1406582784. Throughput: 0: 42477.3. Samples: 1406740460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:26:36,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 07:26:37,274][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085853_1406615552.pth... [2024-06-18 07:26:37,278][12883] Updated weights for policy 0, policy_version 85853 (0.0026) [2024-06-18 07:26:37,334][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085230_1396408320.pth [2024-06-18 07:26:40,580][12883] Updated weights for policy 0, policy_version 85863 (0.0026) [2024-06-18 07:26:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1406844928. Throughput: 0: 42524.5. Samples: 1406994900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:26:41,994][12645] Avg episode reward: [(0, '0.561')] [2024-06-18 07:26:44,766][12883] Updated weights for policy 0, policy_version 85873 (0.0027) [2024-06-18 07:26:46,994][12645] Fps is (10 sec: 47512.3, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 1407057920. Throughput: 0: 42769.6. Samples: 1407132920. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:26:46,995][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 07:26:48,116][12883] Updated weights for policy 0, policy_version 85883 (0.0037) [2024-06-18 07:26:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 1407238144. Throughput: 0: 42387.1. Samples: 1407379800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:26:51,994][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 07:26:52,335][12883] Updated weights for policy 0, policy_version 85893 (0.0028) [2024-06-18 07:26:55,801][12883] Updated weights for policy 0, policy_version 85903 (0.0036) [2024-06-18 07:26:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 1407483904. Throughput: 0: 42522.2. Samples: 1407640820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:26:56,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 07:26:59,863][12883] Updated weights for policy 0, policy_version 85913 (0.0032) [2024-06-18 07:27:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1407696896. Throughput: 0: 42575.6. Samples: 1407768620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:27:01,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 07:27:03,335][12883] Updated weights for policy 0, policy_version 85923 (0.0031) [2024-06-18 07:27:04,366][12862] Signal inference workers to stop experience collection... (20400 times) [2024-06-18 07:27:04,422][12883] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-06-18 07:27:04,431][12862] Signal inference workers to resume experience collection... (20400 times) [2024-06-18 07:27:04,432][12883] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-06-18 07:27:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42654.2). Total num frames: 1407893504. Throughput: 0: 42510.0. 
Samples: 1408020340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:27:06,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 07:27:07,203][12883] Updated weights for policy 0, policy_version 85933 (0.0044) [2024-06-18 07:27:11,028][12883] Updated weights for policy 0, policy_version 85943 (0.0039) [2024-06-18 07:27:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1408122880. Throughput: 0: 42761.4. Samples: 1408284500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:27:11,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 07:27:15,099][12883] Updated weights for policy 0, policy_version 85953 (0.0032) [2024-06-18 07:27:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1408335872. Throughput: 0: 42631.0. Samples: 1408411420. Policy #0 lag: (min: 1.0, avg: 10.0, max: 22.0) [2024-06-18 07:27:16,994][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 07:27:18,895][12883] Updated weights for policy 0, policy_version 85963 (0.0035) [2024-06-18 07:27:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42601.0, 300 sec: 42654.3). Total num frames: 1408548864. Throughput: 0: 42631.1. Samples: 1408658860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:27:21,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 07:27:22,864][12883] Updated weights for policy 0, policy_version 85973 (0.0050) [2024-06-18 07:27:26,578][12883] Updated weights for policy 0, policy_version 85983 (0.0034) [2024-06-18 07:27:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1408778240. Throughput: 0: 42802.4. Samples: 1408921000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:27:26,994][12645] Avg episode reward: [(0, '0.637')] [2024-06-18 07:27:30,392][12883] Updated weights for policy 0, policy_version 85993 (0.0042) [2024-06-18 07:27:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1408991232. Throughput: 0: 42511.7. Samples: 1409045940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:27:31,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 07:27:34,199][12883] Updated weights for policy 0, policy_version 86003 (0.0021) [2024-06-18 07:27:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1409187840. Throughput: 0: 42590.2. Samples: 1409296360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:27:36,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 07:27:38,399][12883] Updated weights for policy 0, policy_version 86013 (0.0033) [2024-06-18 07:27:41,744][12883] Updated weights for policy 0, policy_version 86023 (0.0028) [2024-06-18 07:27:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1409400832. Throughput: 0: 42581.4. Samples: 1409556980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:27:41,994][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 07:27:46,076][12883] Updated weights for policy 0, policy_version 86033 (0.0058) [2024-06-18 07:27:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 1409613824. Throughput: 0: 42539.6. Samples: 1409682900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:27:46,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 07:27:49,463][12883] Updated weights for policy 0, policy_version 86043 (0.0024) [2024-06-18 07:27:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1409843200. Throughput: 0: 42562.3. Samples: 1409935640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:27:51,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 07:27:53,707][12883] Updated weights for policy 0, policy_version 86053 (0.0028) [2024-06-18 07:27:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1410039808. Throughput: 0: 42384.0. Samples: 1410191780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:27:56,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 07:27:57,135][12883] Updated weights for policy 0, policy_version 86063 (0.0040) [2024-06-18 07:28:01,249][12883] Updated weights for policy 0, policy_version 86073 (0.0029) [2024-06-18 07:28:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1410252800. Throughput: 0: 42387.6. Samples: 1410318860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:28:01,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 07:28:05,021][12883] Updated weights for policy 0, policy_version 86083 (0.0024) [2024-06-18 07:28:06,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1410465792. Throughput: 0: 42618.8. Samples: 1410576700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:28:06,994][12645] Avg episode reward: [(0, '0.228')] [2024-06-18 07:28:08,992][12883] Updated weights for policy 0, policy_version 86093 (0.0043) [2024-06-18 07:28:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1410678784. Throughput: 0: 42455.6. Samples: 1410831500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:28:11,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 07:28:12,616][12883] Updated weights for policy 0, policy_version 86103 (0.0029) [2024-06-18 07:28:16,534][12883] Updated weights for policy 0, policy_version 86113 (0.0044) [2024-06-18 07:28:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). 
Total num frames: 1410875392. Throughput: 0: 42504.6. Samples: 1410958640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:28:16,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 07:28:20,510][12883] Updated weights for policy 0, policy_version 86123 (0.0050) [2024-06-18 07:28:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1411104768. Throughput: 0: 42647.7. Samples: 1411215500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:28:21,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 07:28:24,211][12883] Updated weights for policy 0, policy_version 86133 (0.0029) [2024-06-18 07:28:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1411317760. Throughput: 0: 42591.9. Samples: 1411473620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:28:26,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 07:28:28,108][12883] Updated weights for policy 0, policy_version 86143 (0.0034) [2024-06-18 07:28:31,808][12883] Updated weights for policy 0, policy_version 86153 (0.0023) [2024-06-18 07:28:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1411530752. Throughput: 0: 42602.3. Samples: 1411600000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:28:31,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 07:28:35,651][12883] Updated weights for policy 0, policy_version 86163 (0.0033) [2024-06-18 07:28:36,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1411743744. Throughput: 0: 42725.8. Samples: 1411858400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:28:36,997][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 07:28:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086166_1411743744.pth... 
[2024-06-18 07:28:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085542_1401520128.pth [2024-06-18 07:28:39,482][12883] Updated weights for policy 0, policy_version 86173 (0.0029) [2024-06-18 07:28:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1411973120. Throughput: 0: 42659.7. Samples: 1412111460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:28:41,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 07:28:43,364][12883] Updated weights for policy 0, policy_version 86183 (0.0031) [2024-06-18 07:28:46,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1412169728. Throughput: 0: 42652.0. Samples: 1412238200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:28:46,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 07:28:47,119][12883] Updated weights for policy 0, policy_version 86193 (0.0029) [2024-06-18 07:28:49,034][12862] Signal inference workers to stop experience collection... (20450 times) [2024-06-18 07:28:49,034][12862] Signal inference workers to resume experience collection... (20450 times) [2024-06-18 07:28:49,053][12883] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-06-18 07:28:49,053][12883] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-06-18 07:28:50,902][12883] Updated weights for policy 0, policy_version 86203 (0.0032) [2024-06-18 07:28:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1412366336. Throughput: 0: 42547.5. Samples: 1412491340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:28:51,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 07:28:55,119][12883] Updated weights for policy 0, policy_version 86213 (0.0029) [2024-06-18 07:28:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1412595712. 
Throughput: 0: 42519.6. Samples: 1412744880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:28:56,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 07:28:58,695][12883] Updated weights for policy 0, policy_version 86223 (0.0039) [2024-06-18 07:29:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 1412808704. Throughput: 0: 42579.9. Samples: 1412874740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 07:29:01,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 07:29:02,850][12883] Updated weights for policy 0, policy_version 86233 (0.0037) [2024-06-18 07:29:06,650][12883] Updated weights for policy 0, policy_version 86243 (0.0033) [2024-06-18 07:29:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1413021696. Throughput: 0: 42437.7. Samples: 1413125200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:06,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 07:29:10,621][12883] Updated weights for policy 0, policy_version 86253 (0.0028) [2024-06-18 07:29:11,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1413234688. Throughput: 0: 42520.2. Samples: 1413387120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:11,996][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 07:29:14,296][12883] Updated weights for policy 0, policy_version 86263 (0.0024) [2024-06-18 07:29:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1413447680. Throughput: 0: 42519.5. Samples: 1413513380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:16,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 07:29:18,091][12883] Updated weights for policy 0, policy_version 86273 (0.0026) [2024-06-18 07:29:21,831][12883] Updated weights for policy 0, policy_version 86283 (0.0036) [2024-06-18 07:29:21,994][12645] Fps is (10 sec: 42607.4, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1413660672. Throughput: 0: 42529.6. Samples: 1413772140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:21,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 07:29:25,662][12883] Updated weights for policy 0, policy_version 86293 (0.0024) [2024-06-18 07:29:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1413873664. Throughput: 0: 42719.9. Samples: 1414033860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:26,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 07:29:29,359][12883] Updated weights for policy 0, policy_version 86303 (0.0031) [2024-06-18 07:29:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1414070272. Throughput: 0: 42649.7. Samples: 1414157440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:31,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 07:29:33,446][12883] Updated weights for policy 0, policy_version 86313 (0.0036) [2024-06-18 07:29:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 1414299648. Throughput: 0: 42636.9. Samples: 1414410000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:36,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 07:29:37,024][12883] Updated weights for policy 0, policy_version 86323 (0.0031) [2024-06-18 07:29:41,339][12883] Updated weights for policy 0, policy_version 86333 (0.0037) [2024-06-18 07:29:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1414496256. Throughput: 0: 42795.0. Samples: 1414670660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:41,994][12645] Avg episode reward: [(0, '0.129')] [2024-06-18 07:29:44,571][12883] Updated weights for policy 0, policy_version 86343 (0.0041) [2024-06-18 07:29:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1414709248. Throughput: 0: 42640.5. Samples: 1414793560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:46,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 07:29:49,159][12883] Updated weights for policy 0, policy_version 86353 (0.0023) [2024-06-18 07:29:51,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1414955008. Throughput: 0: 42693.2. Samples: 1415046400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:51,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 07:29:52,274][12883] Updated weights for policy 0, policy_version 86363 (0.0036) [2024-06-18 07:29:56,798][12883] Updated weights for policy 0, policy_version 86373 (0.0027) [2024-06-18 07:29:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1415135232. Throughput: 0: 42766.5. Samples: 1415311520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 07:29:56,994][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 07:30:00,578][12883] Updated weights for policy 0, policy_version 86383 (0.0036) [2024-06-18 07:30:01,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1415348224. Throughput: 0: 42620.9. Samples: 1415431320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:01,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 07:30:04,586][12883] Updated weights for policy 0, policy_version 86393 (0.0051) [2024-06-18 07:30:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1415593984. Throughput: 0: 42560.1. Samples: 1415687340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:06,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 07:30:08,365][12883] Updated weights for policy 0, policy_version 86403 (0.0030) [2024-06-18 07:30:11,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42325.3, 300 sec: 42598.1). Total num frames: 1415774208. Throughput: 0: 42536.6. Samples: 1415948100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:11,996][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 07:30:12,129][12883] Updated weights for policy 0, policy_version 86413 (0.0038) [2024-06-18 07:30:16,091][12883] Updated weights for policy 0, policy_version 86423 (0.0029) [2024-06-18 07:30:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1415987200. Throughput: 0: 42456.4. Samples: 1416067980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:16,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 07:30:19,919][12883] Updated weights for policy 0, policy_version 86433 (0.0023) [2024-06-18 07:30:21,546][12862] Signal inference workers to stop experience collection... 
(20500 times) [2024-06-18 07:30:21,547][12862] Signal inference workers to resume experience collection... (20500 times) [2024-06-18 07:30:21,570][12883] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-06-18 07:30:21,570][12883] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-06-18 07:30:21,994][12645] Fps is (10 sec: 45885.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1416232960. Throughput: 0: 42490.7. Samples: 1416322080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:21,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 07:30:23,587][12883] Updated weights for policy 0, policy_version 86443 (0.0033) [2024-06-18 07:30:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1416413184. Throughput: 0: 42511.5. Samples: 1416583680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:26,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 07:30:27,463][12883] Updated weights for policy 0, policy_version 86453 (0.0030) [2024-06-18 07:30:31,287][12883] Updated weights for policy 0, policy_version 86463 (0.0039) [2024-06-18 07:30:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1416626176. Throughput: 0: 42403.2. Samples: 1416701700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:31,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 07:30:35,005][12883] Updated weights for policy 0, policy_version 86473 (0.0040) [2024-06-18 07:30:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1416871936. Throughput: 0: 42663.6. Samples: 1416966260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:36,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 07:30:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086479_1416871936.pth... 
[2024-06-18 07:30:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000085853_1406615552.pth [2024-06-18 07:30:38,872][12883] Updated weights for policy 0, policy_version 86483 (0.0032) [2024-06-18 07:30:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1417052160. Throughput: 0: 42516.4. Samples: 1417224760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:41,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 07:30:42,741][12883] Updated weights for policy 0, policy_version 86493 (0.0031) [2024-06-18 07:30:46,679][12883] Updated weights for policy 0, policy_version 86503 (0.0033) [2024-06-18 07:30:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1417265152. Throughput: 0: 42639.5. Samples: 1417350100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 07:30:46,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 07:30:50,335][12883] Updated weights for policy 0, policy_version 86513 (0.0036) [2024-06-18 07:30:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1417494528. Throughput: 0: 42714.1. Samples: 1417609480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:30:51,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 07:30:54,178][12883] Updated weights for policy 0, policy_version 86523 (0.0028) [2024-06-18 07:30:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1417707520. Throughput: 0: 42710.0. Samples: 1417869960. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:30:56,994][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 07:30:57,799][12883] Updated weights for policy 0, policy_version 86533 (0.0027) [2024-06-18 07:31:01,732][12883] Updated weights for policy 0, policy_version 86543 (0.0028) [2024-06-18 07:31:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1417920512. Throughput: 0: 42847.9. Samples: 1417996140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:31:01,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 07:31:05,406][12883] Updated weights for policy 0, policy_version 86553 (0.0046) [2024-06-18 07:31:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1418133504. Throughput: 0: 42940.3. Samples: 1418254400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:31:06,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 07:31:09,634][12883] Updated weights for policy 0, policy_version 86563 (0.0033) [2024-06-18 07:31:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 1418346496. Throughput: 0: 42910.1. Samples: 1418514640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:31:11,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 07:31:12,894][12883] Updated weights for policy 0, policy_version 86573 (0.0027) [2024-06-18 07:31:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.9). Total num frames: 1418559488. Throughput: 0: 43137.3. Samples: 1418642880. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:31:16,994][12645] Avg episode reward: [(0, '0.644')] [2024-06-18 07:31:17,123][12883] Updated weights for policy 0, policy_version 86583 (0.0039) [2024-06-18 07:31:20,361][12883] Updated weights for policy 0, policy_version 86593 (0.0041) [2024-06-18 07:31:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1418772480. Throughput: 0: 42872.0. Samples: 1418895500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:31:21,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 07:31:24,823][12883] Updated weights for policy 0, policy_version 86603 (0.0024) [2024-06-18 07:31:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1419001856. Throughput: 0: 42951.2. Samples: 1419157560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:31:26,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 07:31:28,276][12883] Updated weights for policy 0, policy_version 86613 (0.0029) [2024-06-18 07:31:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1419198464. Throughput: 0: 43008.1. Samples: 1419285460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:31:31,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 07:31:32,746][12883] Updated weights for policy 0, policy_version 86623 (0.0050) [2024-06-18 07:31:35,774][12883] Updated weights for policy 0, policy_version 86633 (0.0034) [2024-06-18 07:31:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1419427840. Throughput: 0: 42861.8. Samples: 1419538260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:31:36,996][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 07:31:40,363][12883] Updated weights for policy 0, policy_version 86643 (0.0042) [2024-06-18 07:31:40,368][12862] Signal inference workers to stop experience collection... 
(20550 times) [2024-06-18 07:31:40,369][12862] Signal inference workers to resume experience collection... (20550 times) [2024-06-18 07:31:40,389][12883] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-06-18 07:31:40,389][12883] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-06-18 07:31:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1419640832. Throughput: 0: 42994.8. Samples: 1419804720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:31:41,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 07:31:43,336][12883] Updated weights for policy 0, policy_version 86653 (0.0031) [2024-06-18 07:31:46,996][12645] Fps is (10 sec: 42589.0, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 1419853824. Throughput: 0: 42923.7. Samples: 1419927800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:31:46,996][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 07:31:47,783][12883] Updated weights for policy 0, policy_version 86663 (0.0028) [2024-06-18 07:31:50,925][12883] Updated weights for policy 0, policy_version 86673 (0.0047) [2024-06-18 07:31:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1420083200. Throughput: 0: 42818.7. Samples: 1420181240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:31:51,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 07:31:55,335][12883] Updated weights for policy 0, policy_version 86683 (0.0041) [2024-06-18 07:31:56,997][12645] Fps is (10 sec: 42592.9, 60 sec: 42869.0, 300 sec: 42653.4). Total num frames: 1420279808. Throughput: 0: 42878.5. Samples: 1420444320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:31:56,998][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 07:31:58,538][12883] Updated weights for policy 0, policy_version 86693 (0.0034) [2024-06-18 07:32:01,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 1420492800. Throughput: 0: 42824.5. Samples: 1420570080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:32:02,005][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 07:32:02,795][12883] Updated weights for policy 0, policy_version 86703 (0.0044) [2024-06-18 07:32:06,149][12883] Updated weights for policy 0, policy_version 86713 (0.0039) [2024-06-18 07:32:06,994][12645] Fps is (10 sec: 45890.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1420738560. Throughput: 0: 43060.4. Samples: 1420833220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:32:06,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 07:32:10,697][12883] Updated weights for policy 0, policy_version 86723 (0.0034) [2024-06-18 07:32:11,999][12645] Fps is (10 sec: 42583.8, 60 sec: 42867.5, 300 sec: 42653.1). Total num frames: 1420918784. Throughput: 0: 43033.7. Samples: 1421094320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:32:12,000][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 07:32:14,050][12883] Updated weights for policy 0, policy_version 86733 (0.0039) [2024-06-18 07:32:16,994][12645] Fps is (10 sec: 40960.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1421148160. Throughput: 0: 42906.3. Samples: 1421216240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:32:16,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 07:32:18,418][12883] Updated weights for policy 0, policy_version 86743 (0.0027) [2024-06-18 07:32:21,534][12883] Updated weights for policy 0, policy_version 86753 (0.0030) [2024-06-18 07:32:21,994][12645] Fps is (10 sec: 47540.4, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 1421393920. Throughput: 0: 43126.7. Samples: 1421478960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:32:21,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 07:32:25,725][12883] Updated weights for policy 0, policy_version 86763 (0.0038) [2024-06-18 07:32:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1421574144. Throughput: 0: 43131.7. Samples: 1421745640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:32:26,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 07:32:29,225][12883] Updated weights for policy 0, policy_version 86773 (0.0046) [2024-06-18 07:32:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1421803520. Throughput: 0: 43110.6. Samples: 1421867680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 07:32:31,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 07:32:33,446][12883] Updated weights for policy 0, policy_version 86783 (0.0030) [2024-06-18 07:32:36,685][12883] Updated weights for policy 0, policy_version 86793 (0.0042) [2024-06-18 07:32:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1422032896. Throughput: 0: 43229.0. Samples: 1422126540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:32:36,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 07:32:37,112][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086795_1422049280.pth... 
[2024-06-18 07:32:37,162][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086166_1411743744.pth [2024-06-18 07:32:40,961][12883] Updated weights for policy 0, policy_version 86803 (0.0034) [2024-06-18 07:32:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1422229504. Throughput: 0: 43252.3. Samples: 1422390520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:32:41,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 07:32:44,453][12883] Updated weights for policy 0, policy_version 86813 (0.0039) [2024-06-18 07:32:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 1422442496. Throughput: 0: 43109.7. Samples: 1422509920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:32:46,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 07:32:48,639][12883] Updated weights for policy 0, policy_version 86823 (0.0033) [2024-06-18 07:32:51,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 1422655488. Throughput: 0: 43150.0. Samples: 1422775060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:32:51,996][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 07:32:52,023][12883] Updated weights for policy 0, policy_version 86833 (0.0029) [2024-06-18 07:32:56,111][12883] Updated weights for policy 0, policy_version 86843 (0.0035) [2024-06-18 07:32:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43420.2, 300 sec: 42820.6). Total num frames: 1422884864. Throughput: 0: 42941.0. Samples: 1423026420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:32:56,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 07:32:59,922][12883] Updated weights for policy 0, policy_version 86853 (0.0037) [2024-06-18 07:33:01,994][12645] Fps is (10 sec: 42607.8, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 1423081472. Throughput: 0: 43227.0. Samples: 1423161460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:33:01,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 07:33:02,567][12862] Signal inference workers to stop experience collection... (20600 times) [2024-06-18 07:33:02,568][12862] Signal inference workers to resume experience collection... (20600 times) [2024-06-18 07:33:02,583][12883] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-06-18 07:33:02,611][12883] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-06-18 07:33:03,666][12883] Updated weights for policy 0, policy_version 86863 (0.0032) [2024-06-18 07:33:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1423294464. Throughput: 0: 43215.2. Samples: 1423423640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:33:06,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 07:33:07,310][12883] Updated weights for policy 0, policy_version 86873 (0.0032) [2024-06-18 07:33:11,141][12883] Updated weights for policy 0, policy_version 86883 (0.0047) [2024-06-18 07:33:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43148.6, 300 sec: 42820.5). Total num frames: 1423507456. Throughput: 0: 42939.0. Samples: 1423677900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:33:11,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 07:33:14,817][12883] Updated weights for policy 0, policy_version 86893 (0.0030) [2024-06-18 07:33:16,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1423736832. Throughput: 0: 43069.7. Samples: 1423805820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:33:16,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 07:33:19,157][12883] Updated weights for policy 0, policy_version 86903 (0.0036) [2024-06-18 07:33:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1423949824. Throughput: 0: 43119.1. 
Samples: 1424066900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:33:21,994][12645] Avg episode reward: [(0, '0.296')] [2024-06-18 07:33:22,298][12883] Updated weights for policy 0, policy_version 86913 (0.0036) [2024-06-18 07:33:26,641][12883] Updated weights for policy 0, policy_version 86923 (0.0028) [2024-06-18 07:33:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1424146432. Throughput: 0: 42973.8. Samples: 1424324340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:33:26,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 07:33:30,391][12883] Updated weights for policy 0, policy_version 86933 (0.0024) [2024-06-18 07:33:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 1424375808. Throughput: 0: 43102.6. Samples: 1424449540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:33:31,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 07:33:34,196][12883] Updated weights for policy 0, policy_version 86943 (0.0031) [2024-06-18 07:33:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1424588800. Throughput: 0: 42954.7. Samples: 1424707920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:33:36,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 07:33:37,904][12883] Updated weights for policy 0, policy_version 86953 (0.0038) [2024-06-18 07:33:41,969][12883] Updated weights for policy 0, policy_version 86963 (0.0033) [2024-06-18 07:33:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1424801792. Throughput: 0: 43155.0. Samples: 1424968400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:33:41,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 07:33:45,438][12883] Updated weights for policy 0, policy_version 86973 (0.0034) [2024-06-18 07:33:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1425014784. Throughput: 0: 42884.5. Samples: 1425091260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:33:46,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 07:33:49,922][12883] Updated weights for policy 0, policy_version 86983 (0.0041) [2024-06-18 07:33:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43146.2, 300 sec: 42876.1). Total num frames: 1425244160. Throughput: 0: 42808.4. Samples: 1425350020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:33:51,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 07:33:53,023][12883] Updated weights for policy 0, policy_version 86993 (0.0032) [2024-06-18 07:33:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1425424384. Throughput: 0: 42755.2. Samples: 1425601880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:33:56,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 07:33:57,540][12883] Updated weights for policy 0, policy_version 87003 (0.0037) [2024-06-18 07:34:00,613][12883] Updated weights for policy 0, policy_version 87013 (0.0033) [2024-06-18 07:34:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1425653760. Throughput: 0: 42706.3. Samples: 1425727600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:34:01,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 07:34:05,060][12883] Updated weights for policy 0, policy_version 87023 (0.0031) [2024-06-18 07:34:06,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42869.8, 300 sec: 42820.5). Total num frames: 1425866752. Throughput: 0: 42766.7. Samples: 1425991500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:34:06,996][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 07:34:08,117][12883] Updated weights for policy 0, policy_version 87033 (0.0034) [2024-06-18 07:34:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1426079744. Throughput: 0: 42722.3. Samples: 1426246840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:34:11,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 07:34:12,644][12883] Updated weights for policy 0, policy_version 87043 (0.0031) [2024-06-18 07:34:16,114][12883] Updated weights for policy 0, policy_version 87053 (0.0021) [2024-06-18 07:34:16,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1426309120. Throughput: 0: 42904.9. Samples: 1426380260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:34:16,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 07:34:20,262][12883] Updated weights for policy 0, policy_version 87063 (0.0037) [2024-06-18 07:34:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1426522112. Throughput: 0: 42887.5. Samples: 1426637860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 07:34:21,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 07:34:24,037][12883] Updated weights for policy 0, policy_version 87073 (0.0042) [2024-06-18 07:34:24,405][12862] Signal inference workers to stop experience collection... (20650 times) [2024-06-18 07:34:24,460][12883] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-06-18 07:34:24,520][12862] Signal inference workers to resume experience collection... (20650 times) [2024-06-18 07:34:24,520][12883] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-06-18 07:34:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1426735104. Throughput: 0: 42864.8. 
Samples: 1426897320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:34:26,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 07:34:27,798][12883] Updated weights for policy 0, policy_version 87083 (0.0026) [2024-06-18 07:34:31,437][12883] Updated weights for policy 0, policy_version 87093 (0.0031) [2024-06-18 07:34:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1426948096. Throughput: 0: 43016.0. Samples: 1427026980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:34:31,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 07:34:35,345][12883] Updated weights for policy 0, policy_version 87103 (0.0038) [2024-06-18 07:34:37,000][12645] Fps is (10 sec: 44209.3, 60 sec: 43139.9, 300 sec: 42986.2). Total num frames: 1427177472. Throughput: 0: 43049.9. Samples: 1427287540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:34:37,001][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 07:34:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087108_1427177472.pth... [2024-06-18 07:34:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086479_1416871936.pth [2024-06-18 07:34:38,939][12883] Updated weights for policy 0, policy_version 87113 (0.0048) [2024-06-18 07:34:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1427390464. Throughput: 0: 43133.7. Samples: 1427542900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:34:41,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 07:34:42,871][12883] Updated weights for policy 0, policy_version 87123 (0.0042) [2024-06-18 07:34:46,457][12883] Updated weights for policy 0, policy_version 87133 (0.0042) [2024-06-18 07:34:46,994][12645] Fps is (10 sec: 40985.4, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 1427587072. Throughput: 0: 43143.5. Samples: 1427669060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:34:47,000][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 07:34:50,419][12883] Updated weights for policy 0, policy_version 87143 (0.0034) [2024-06-18 07:34:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1427816448. Throughput: 0: 43048.3. Samples: 1427928580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:34:51,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 07:34:54,008][12883] Updated weights for policy 0, policy_version 87153 (0.0029) [2024-06-18 07:34:56,994][12645] Fps is (10 sec: 44237.7, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 1428029440. Throughput: 0: 43028.5. Samples: 1428183120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:34:56,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 07:34:58,308][12883] Updated weights for policy 0, policy_version 87163 (0.0027) [2024-06-18 07:35:01,571][12883] Updated weights for policy 0, policy_version 87173 (0.0033) [2024-06-18 07:35:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1428242432. Throughput: 0: 43076.9. Samples: 1428318720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:35:01,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 07:35:05,873][12883] Updated weights for policy 0, policy_version 87183 (0.0035) [2024-06-18 07:35:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43146.1, 300 sec: 42987.5). Total num frames: 1428455424. Throughput: 0: 42957.2. Samples: 1428570940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:35:06,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 07:35:09,714][12883] Updated weights for policy 0, policy_version 87193 (0.0026) [2024-06-18 07:35:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1428652032. Throughput: 0: 42885.5. Samples: 1428827160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 07:35:11,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 07:35:13,458][12883] Updated weights for policy 0, policy_version 87203 (0.0035) [2024-06-18 07:35:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 1428865024. Throughput: 0: 42732.4. Samples: 1428949940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:35:16,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 07:35:17,276][12883] Updated weights for policy 0, policy_version 87213 (0.0044) [2024-06-18 07:35:21,073][12883] Updated weights for policy 0, policy_version 87223 (0.0037) [2024-06-18 07:35:21,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 43042.7). Total num frames: 1429110784. Throughput: 0: 42721.9. Samples: 1429209760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:35:21,994][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 07:35:25,030][12883] Updated weights for policy 0, policy_version 87233 (0.0037) [2024-06-18 07:35:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1429291008. Throughput: 0: 42692.8. Samples: 1429464080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:35:26,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 07:35:28,707][12883] Updated weights for policy 0, policy_version 87243 (0.0036) [2024-06-18 07:35:31,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1429520384. Throughput: 0: 42695.3. Samples: 1429590340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:35:31,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 07:35:32,595][12883] Updated weights for policy 0, policy_version 87253 (0.0038) [2024-06-18 07:35:36,369][12883] Updated weights for policy 0, policy_version 87263 (0.0029) [2024-06-18 07:35:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.7, 300 sec: 42931.6). 
Total num frames: 1429716992. Throughput: 0: 42660.4. Samples: 1429848300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:35:36,995][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 07:35:40,721][12883] Updated weights for policy 0, policy_version 87273 (0.0028) [2024-06-18 07:35:40,994][12862] Signal inference workers to stop experience collection... (20700 times) [2024-06-18 07:35:40,994][12862] Signal inference workers to resume experience collection... (20700 times) [2024-06-18 07:35:41,034][12883] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-06-18 07:35:41,034][12883] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-06-18 07:35:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1429946368. Throughput: 0: 42591.0. Samples: 1430099720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:35:41,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 07:35:44,489][12883] Updated weights for policy 0, policy_version 87283 (0.0042) [2024-06-18 07:35:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1430159360. Throughput: 0: 42417.8. Samples: 1430227520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:35:46,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 07:35:48,525][12883] Updated weights for policy 0, policy_version 87293 (0.0042) [2024-06-18 07:35:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1430355968. Throughput: 0: 42340.5. Samples: 1430476260. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:35:51,996][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 07:35:52,307][12883] Updated weights for policy 0, policy_version 87303 (0.0045) [2024-06-18 07:35:56,233][12883] Updated weights for policy 0, policy_version 87313 (0.0035) [2024-06-18 07:35:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1430568960. Throughput: 0: 42175.1. Samples: 1430725040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:35:56,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 07:36:00,024][12883] Updated weights for policy 0, policy_version 87323 (0.0039) [2024-06-18 07:36:01,999][12645] Fps is (10 sec: 42575.3, 60 sec: 42321.5, 300 sec: 42875.3). Total num frames: 1430781952. Throughput: 0: 42280.2. Samples: 1430852780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:36:01,999][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 07:36:03,948][12883] Updated weights for policy 0, policy_version 87333 (0.0037) [2024-06-18 07:36:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 1430978560. Throughput: 0: 42084.6. Samples: 1431103560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 07:36:06,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 07:36:07,751][12883] Updated weights for policy 0, policy_version 87343 (0.0024) [2024-06-18 07:36:11,483][12883] Updated weights for policy 0, policy_version 87353 (0.0034) [2024-06-18 07:36:11,994][12645] Fps is (10 sec: 40982.3, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1431191552. Throughput: 0: 42050.8. Samples: 1431356360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:11,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 07:36:15,397][12883] Updated weights for policy 0, policy_version 87363 (0.0033) [2024-06-18 07:36:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42876.1). 
Total num frames: 1431420928. Throughput: 0: 42198.6. Samples: 1431489280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:16,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 07:36:19,113][12883] Updated weights for policy 0, policy_version 87373 (0.0037) [2024-06-18 07:36:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 1431617536. Throughput: 0: 42098.8. Samples: 1431742740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:21,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 07:36:23,027][12883] Updated weights for policy 0, policy_version 87383 (0.0035) [2024-06-18 07:36:26,859][12883] Updated weights for policy 0, policy_version 87393 (0.0036) [2024-06-18 07:36:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1431846912. Throughput: 0: 42184.5. Samples: 1431998020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:26,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 07:36:30,511][12883] Updated weights for policy 0, policy_version 87403 (0.0035) [2024-06-18 07:36:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1432059904. Throughput: 0: 42210.7. Samples: 1432127000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:31,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 07:36:34,716][12883] Updated weights for policy 0, policy_version 87413 (0.0040) [2024-06-18 07:36:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1432240128. Throughput: 0: 42362.2. Samples: 1432382560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:36,998][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 07:36:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087418_1432256512.pth... 
[2024-06-18 07:36:37,114][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000086795_1422049280.pth [2024-06-18 07:36:38,370][12883] Updated weights for policy 0, policy_version 87423 (0.0037) [2024-06-18 07:36:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42765.4). Total num frames: 1432469504. Throughput: 0: 42461.8. Samples: 1432635820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:41,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 07:36:42,348][12883] Updated weights for policy 0, policy_version 87433 (0.0028) [2024-06-18 07:36:46,082][12883] Updated weights for policy 0, policy_version 87443 (0.0037) [2024-06-18 07:36:46,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1432715264. Throughput: 0: 42642.1. Samples: 1432771440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:46,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 07:36:49,837][12883] Updated weights for policy 0, policy_version 87453 (0.0037) [2024-06-18 07:36:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42710.0). Total num frames: 1432879104. Throughput: 0: 42608.4. Samples: 1433020940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:51,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 07:36:53,790][12883] Updated weights for policy 0, policy_version 87463 (0.0029) [2024-06-18 07:36:56,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42820.9). Total num frames: 1433124864. Throughput: 0: 42673.3. Samples: 1433276660. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 07:36:56,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 07:36:57,415][12883] Updated weights for policy 0, policy_version 87473 (0.0040) [2024-06-18 07:37:01,387][12883] Updated weights for policy 0, policy_version 87483 (0.0049) [2024-06-18 07:37:01,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42875.3, 300 sec: 42765.0). Total num frames: 1433354240. Throughput: 0: 42748.5. Samples: 1433412960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:01,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 07:37:05,558][12883] Updated weights for policy 0, policy_version 87493 (0.0046) [2024-06-18 07:37:06,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42710.3). Total num frames: 1433518080. Throughput: 0: 42579.2. Samples: 1433658800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:06,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 07:37:09,158][12883] Updated weights for policy 0, policy_version 87503 (0.0027) [2024-06-18 07:37:11,995][12645] Fps is (10 sec: 39314.7, 60 sec: 42597.1, 300 sec: 42709.2). Total num frames: 1433747456. Throughput: 0: 42490.2. Samples: 1433910160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:11,996][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 07:37:13,179][12883] Updated weights for policy 0, policy_version 87513 (0.0029) [2024-06-18 07:37:16,756][12883] Updated weights for policy 0, policy_version 87523 (0.0035) [2024-06-18 07:37:16,959][12862] Signal inference workers to stop experience collection... (20750 times) [2024-06-18 07:37:16,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1433976832. Throughput: 0: 42590.7. Samples: 1434043580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:16,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 07:37:16,997][12883] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-06-18 07:37:17,021][12862] Signal inference workers to resume experience collection... (20750 times) [2024-06-18 07:37:17,023][12883] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-06-18 07:37:20,910][12883] Updated weights for policy 0, policy_version 87533 (0.0037) [2024-06-18 07:37:21,994][12645] Fps is (10 sec: 39328.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1434140672. Throughput: 0: 42456.9. Samples: 1434293120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:21,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 07:37:24,590][12883] Updated weights for policy 0, policy_version 87543 (0.0036) [2024-06-18 07:37:26,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 1434402816. Throughput: 0: 42336.0. Samples: 1434541040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:26,996][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 07:37:28,678][12883] Updated weights for policy 0, policy_version 87553 (0.0038) [2024-06-18 07:37:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1434599424. Throughput: 0: 42393.2. Samples: 1434679140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:31,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 07:37:32,473][12883] Updated weights for policy 0, policy_version 87563 (0.0037) [2024-06-18 07:37:36,994][12645] Fps is (10 sec: 37692.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1434779648. Throughput: 0: 42229.0. Samples: 1434921240. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:36,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 07:37:37,020][12883] Updated weights for policy 0, policy_version 87573 (0.0034) [2024-06-18 07:37:40,218][12883] Updated weights for policy 0, policy_version 87583 (0.0032) [2024-06-18 07:37:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1435041792. Throughput: 0: 42200.1. Samples: 1435175660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:41,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 07:37:44,659][12883] Updated weights for policy 0, policy_version 87593 (0.0036) [2024-06-18 07:37:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 42543.2). Total num frames: 1435205632. Throughput: 0: 42051.2. Samples: 1435305260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:46,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 07:37:48,115][12883] Updated weights for policy 0, policy_version 87603 (0.0035) [2024-06-18 07:37:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1435435008. Throughput: 0: 42058.7. Samples: 1435551440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 07:37:51,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 07:37:52,556][12883] Updated weights for policy 0, policy_version 87613 (0.0045) [2024-06-18 07:37:55,825][12883] Updated weights for policy 0, policy_version 87623 (0.0024) [2024-06-18 07:37:56,994][12645] Fps is (10 sec: 49151.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1435697152. Throughput: 0: 42266.5. Samples: 1435812080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:37:56,994][12645] Avg episode reward: [(0, '0.228')] [2024-06-18 07:38:00,353][12883] Updated weights for policy 0, policy_version 87633 (0.0034) [2024-06-18 07:38:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.3, 300 sec: 42598.4). 
Total num frames: 1435860992. Throughput: 0: 42262.7. Samples: 1435945400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:01,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 07:38:03,461][12883] Updated weights for policy 0, policy_version 87643 (0.0030) [2024-06-18 07:38:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1436090368. Throughput: 0: 42247.1. Samples: 1436194240. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:06,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 07:38:08,023][12883] Updated weights for policy 0, policy_version 87653 (0.0033) [2024-06-18 07:38:11,321][12883] Updated weights for policy 0, policy_version 87663 (0.0030) [2024-06-18 07:38:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42872.7, 300 sec: 42653.9). Total num frames: 1436319744. Throughput: 0: 42506.5. Samples: 1436453740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:11,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 07:38:15,628][12883] Updated weights for policy 0, policy_version 87673 (0.0037) [2024-06-18 07:38:16,994][12645] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 1436483584. Throughput: 0: 42233.0. Samples: 1436579620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:16,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 07:38:19,074][12883] Updated weights for policy 0, policy_version 87683 (0.0036) [2024-06-18 07:38:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1436729344. Throughput: 0: 42362.3. Samples: 1436827540. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:21,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 07:38:23,426][12883] Updated weights for policy 0, policy_version 87693 (0.0038) [2024-06-18 07:38:26,794][12883] Updated weights for policy 0, policy_version 87703 (0.0049) [2024-06-18 07:38:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 1436925952. Throughput: 0: 42619.2. Samples: 1437093520. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:26,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 07:38:30,889][12883] Updated weights for policy 0, policy_version 87713 (0.0042) [2024-06-18 07:38:31,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1437138944. Throughput: 0: 42422.5. Samples: 1437214280. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:31,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 07:38:34,751][12883] Updated weights for policy 0, policy_version 87723 (0.0036) [2024-06-18 07:38:34,781][12862] Signal inference workers to stop experience collection... (20800 times) [2024-06-18 07:38:34,782][12862] Signal inference workers to resume experience collection... (20800 times) [2024-06-18 07:38:34,828][12883] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-06-18 07:38:34,828][12883] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-06-18 07:38:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1437384704. Throughput: 0: 42589.3. Samples: 1437467960. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:36,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 07:38:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087731_1437384704.pth... 
[2024-06-18 07:38:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087108_1427177472.pth [2024-06-18 07:38:38,555][12883] Updated weights for policy 0, policy_version 87733 (0.0037) [2024-06-18 07:38:41,994][12645] Fps is (10 sec: 39322.6, 60 sec: 41506.2, 300 sec: 42431.8). Total num frames: 1437532160. Throughput: 0: 42618.4. Samples: 1437729900. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:41,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 07:38:42,548][12883] Updated weights for policy 0, policy_version 87743 (0.0035) [2024-06-18 07:38:46,134][12883] Updated weights for policy 0, policy_version 87753 (0.0043) [2024-06-18 07:38:46,994][12645] Fps is (10 sec: 36044.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1437745152. Throughput: 0: 42216.4. Samples: 1437845140. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 07:38:46,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 07:38:50,070][12883] Updated weights for policy 0, policy_version 87763 (0.0041) [2024-06-18 07:38:52,000][12645] Fps is (10 sec: 49120.8, 60 sec: 43140.0, 300 sec: 42708.6). Total num frames: 1438023680. Throughput: 0: 42479.9. Samples: 1438106100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:38:52,000][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 07:38:53,999][12883] Updated weights for policy 0, policy_version 87773 (0.0023) [2024-06-18 07:38:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 41506.3, 300 sec: 42487.3). Total num frames: 1438187520. Throughput: 0: 42488.6. Samples: 1438365720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:38:56,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 07:38:57,725][12883] Updated weights for policy 0, policy_version 87783 (0.0036) [2024-06-18 07:39:01,911][12883] Updated weights for policy 0, policy_version 87793 (0.0037) [2024-06-18 07:39:01,994][12645] Fps is (10 sec: 37707.2, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 1438400512. Throughput: 0: 42376.9. Samples: 1438486580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:39:01,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 07:39:05,603][12883] Updated weights for policy 0, policy_version 87803 (0.0034) [2024-06-18 07:39:06,994][12645] Fps is (10 sec: 49151.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1438679040. Throughput: 0: 42729.7. Samples: 1438750380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:39:06,998][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 07:39:09,535][12883] Updated weights for policy 0, policy_version 87813 (0.0025) [2024-06-18 07:39:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1438842880. Throughput: 0: 42381.3. Samples: 1439000680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:39:11,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 07:39:13,197][12883] Updated weights for policy 0, policy_version 87823 (0.0029) [2024-06-18 07:39:16,994][12645] Fps is (10 sec: 36044.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1439039488. Throughput: 0: 42482.8. Samples: 1439126000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:39:16,994][12645] Avg episode reward: [(0, '0.190')] [2024-06-18 07:39:17,178][12883] Updated weights for policy 0, policy_version 87833 (0.0037) [2024-06-18 07:39:20,761][12883] Updated weights for policy 0, policy_version 87843 (0.0028) [2024-06-18 07:39:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1439285248. Throughput: 0: 42665.3. Samples: 1439387900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:39:21,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 07:39:24,938][12883] Updated weights for policy 0, policy_version 87853 (0.0035) [2024-06-18 07:39:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1439481856. Throughput: 0: 42628.4. Samples: 1439648180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:39:26,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 07:39:28,670][12883] Updated weights for policy 0, policy_version 87863 (0.0038) [2024-06-18 07:39:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42432.7). Total num frames: 1439694848. Throughput: 0: 42776.5. Samples: 1439770080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:39:31,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 07:39:32,460][12883] Updated weights for policy 0, policy_version 87873 (0.0036) [2024-06-18 07:39:36,257][12883] Updated weights for policy 0, policy_version 87883 (0.0028) [2024-06-18 07:39:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1439924224. Throughput: 0: 42776.0. Samples: 1440030760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-18 07:39:36,995][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 07:39:37,187][12862] Signal inference workers to stop experience collection... 
(20850 times) [2024-06-18 07:39:37,192][12862] Signal inference workers to resume experience collection... (20850 times) [2024-06-18 07:39:37,239][12883] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-06-18 07:39:37,239][12883] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-06-18 07:39:40,386][12883] Updated weights for policy 0, policy_version 87893 (0.0035) [2024-06-18 07:39:41,996][12645] Fps is (10 sec: 42588.8, 60 sec: 43142.8, 300 sec: 42487.0). Total num frames: 1440120832. Throughput: 0: 42648.9. Samples: 1440285020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:39:41,996][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 07:39:43,788][12883] Updated weights for policy 0, policy_version 87903 (0.0043) [2024-06-18 07:39:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 1440333824. Throughput: 0: 42642.5. Samples: 1440405500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:39:46,994][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 07:39:47,797][12883] Updated weights for policy 0, policy_version 87913 (0.0032) [2024-06-18 07:39:51,370][12883] Updated weights for policy 0, policy_version 87923 (0.0036) [2024-06-18 07:39:51,994][12645] Fps is (10 sec: 44247.2, 60 sec: 42329.8, 300 sec: 42487.3). Total num frames: 1440563200. Throughput: 0: 42557.0. Samples: 1440665440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:39:51,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 07:39:55,455][12883] Updated weights for policy 0, policy_version 87933 (0.0036) [2024-06-18 07:39:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 1440743424. Throughput: 0: 42677.3. Samples: 1440921160. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:39:56,995][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 07:39:59,031][12883] Updated weights for policy 0, policy_version 87943 (0.0033) [2024-06-18 07:40:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 1440972800. Throughput: 0: 42523.1. Samples: 1441039540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:40:01,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 07:40:03,124][12883] Updated weights for policy 0, policy_version 87953 (0.0041) [2024-06-18 07:40:06,610][12883] Updated weights for policy 0, policy_version 87963 (0.0041) [2024-06-18 07:40:06,994][12645] Fps is (10 sec: 45873.6, 60 sec: 42052.0, 300 sec: 42542.8). Total num frames: 1441202176. Throughput: 0: 42632.1. Samples: 1441306360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:40:06,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 07:40:11,055][12883] Updated weights for policy 0, policy_version 87973 (0.0033) [2024-06-18 07:40:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1441382400. Throughput: 0: 42569.8. Samples: 1441563820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:40:11,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 07:40:14,233][12883] Updated weights for policy 0, policy_version 87983 (0.0034) [2024-06-18 07:40:16,994][12645] Fps is (10 sec: 39323.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 1441595392. Throughput: 0: 42388.8. Samples: 1441677580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:40:16,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 07:40:18,941][12883] Updated weights for policy 0, policy_version 87993 (0.0036) [2024-06-18 07:40:21,904][12883] Updated weights for policy 0, policy_version 88003 (0.0041) [2024-06-18 07:40:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42542.9). 
Total num frames: 1441841152. Throughput: 0: 42439.2. Samples: 1441940520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:40:21,995][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 07:40:26,438][12883] Updated weights for policy 0, policy_version 88013 (0.0033) [2024-06-18 07:40:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1442037760. Throughput: 0: 42555.4. Samples: 1442199920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:40:26,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 07:40:29,461][12883] Updated weights for policy 0, policy_version 88023 (0.0032) [2024-06-18 07:40:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1442250752. Throughput: 0: 42538.7. Samples: 1442319740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 07:40:31,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 07:40:33,995][12883] Updated weights for policy 0, policy_version 88033 (0.0045) [2024-06-18 07:40:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1442480128. Throughput: 0: 42616.6. Samples: 1442583200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:40:36,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 07:40:37,051][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088043_1442496512.pth... [2024-06-18 07:40:37,054][12883] Updated weights for policy 0, policy_version 88043 (0.0033) [2024-06-18 07:40:37,107][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087418_1432256512.pth [2024-06-18 07:40:41,591][12883] Updated weights for policy 0, policy_version 88053 (0.0040) [2024-06-18 07:40:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42599.9, 300 sec: 42431.8). Total num frames: 1442676736. Throughput: 0: 42539.1. Samples: 1442835420. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:40:41,994][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 07:40:45,259][12883] Updated weights for policy 0, policy_version 88063 (0.0035) [2024-06-18 07:40:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1442906112. Throughput: 0: 42737.7. Samples: 1442962740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:40:46,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 07:40:49,205][12883] Updated weights for policy 0, policy_version 88073 (0.0042) [2024-06-18 07:40:52,000][12645] Fps is (10 sec: 42572.2, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 1443102720. Throughput: 0: 42499.8. Samples: 1443219100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:40:52,000][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 07:40:52,771][12883] Updated weights for policy 0, policy_version 88083 (0.0030) [2024-06-18 07:40:56,684][12883] Updated weights for policy 0, policy_version 88093 (0.0044) [2024-06-18 07:40:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42488.1). Total num frames: 1443315712. Throughput: 0: 42368.0. Samples: 1443470380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:40:56,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 07:41:00,486][12883] Updated weights for policy 0, policy_version 88103 (0.0030) [2024-06-18 07:41:01,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1443528704. Throughput: 0: 42670.2. Samples: 1443597740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:41:01,994][12645] Avg episode reward: [(0, '0.174')] [2024-06-18 07:41:04,232][12883] Updated weights for policy 0, policy_version 88113 (0.0037) [2024-06-18 07:41:05,274][12862] Signal inference workers to stop experience collection... 
(20900 times) [2024-06-18 07:41:05,328][12862] Signal inference workers to resume experience collection... (20900 times) [2024-06-18 07:41:05,329][12883] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-06-18 07:41:05,344][12883] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-06-18 07:41:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.7, 300 sec: 42542.9). Total num frames: 1443741696. Throughput: 0: 42587.7. Samples: 1443856960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:41:06,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 07:41:08,145][12883] Updated weights for policy 0, policy_version 88123 (0.0035) [2024-06-18 07:41:11,999][12645] Fps is (10 sec: 42576.6, 60 sec: 42867.8, 300 sec: 42486.6). Total num frames: 1443954688. Throughput: 0: 42452.1. Samples: 1444110480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:41:11,999][12645] Avg episode reward: [(0, '0.682')] [2024-06-18 07:41:12,128][12883] Updated weights for policy 0, policy_version 88133 (0.0026) [2024-06-18 07:41:16,112][12883] Updated weights for policy 0, policy_version 88143 (0.0038) [2024-06-18 07:41:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1444167680. Throughput: 0: 42683.6. Samples: 1444240500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:41:16,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 07:41:19,801][12883] Updated weights for policy 0, policy_version 88153 (0.0040) [2024-06-18 07:41:21,994][12645] Fps is (10 sec: 40980.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1444364288. Throughput: 0: 42533.4. Samples: 1444497200. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:41:21,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 07:41:23,553][12883] Updated weights for policy 0, policy_version 88163 (0.0036) [2024-06-18 07:41:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1444593664. Throughput: 0: 42529.8. Samples: 1444749260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 07:41:26,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 07:41:27,508][12883] Updated weights for policy 0, policy_version 88173 (0.0029) [2024-06-18 07:41:31,303][12883] Updated weights for policy 0, policy_version 88183 (0.0035) [2024-06-18 07:41:31,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1444823040. Throughput: 0: 42665.5. Samples: 1444882680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:41:31,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 07:41:35,148][12883] Updated weights for policy 0, policy_version 88193 (0.0021) [2024-06-18 07:41:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.5, 300 sec: 42487.3). Total num frames: 1445003264. Throughput: 0: 42562.4. Samples: 1445134140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:41:36,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 07:41:39,054][12883] Updated weights for policy 0, policy_version 88203 (0.0029) [2024-06-18 07:41:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1445232640. Throughput: 0: 42495.5. Samples: 1445382680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:41:41,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 07:41:42,976][12883] Updated weights for policy 0, policy_version 88213 (0.0031) [2024-06-18 07:41:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1445429248. Throughput: 0: 42688.0. Samples: 1445518700. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:41:46,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 07:41:47,025][12883] Updated weights for policy 0, policy_version 88223 (0.0025) [2024-06-18 07:41:50,689][12883] Updated weights for policy 0, policy_version 88233 (0.0033) [2024-06-18 07:41:51,994][12645] Fps is (10 sec: 37683.4, 60 sec: 41783.6, 300 sec: 42320.7). Total num frames: 1445609472. Throughput: 0: 42410.7. Samples: 1445765440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:41:51,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 07:41:54,657][12883] Updated weights for policy 0, policy_version 88243 (0.0031) [2024-06-18 07:41:56,996][12645] Fps is (10 sec: 45864.6, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 1445888000. Throughput: 0: 42399.6. Samples: 1446018340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:41:56,997][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 07:41:58,300][12883] Updated weights for policy 0, policy_version 88253 (0.0038) [2024-06-18 07:42:01,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1446084608. Throughput: 0: 42561.4. Samples: 1446155760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:42:01,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 07:42:02,319][12883] Updated weights for policy 0, policy_version 88263 (0.0037) [2024-06-18 07:42:05,881][12883] Updated weights for policy 0, policy_version 88273 (0.0042) [2024-06-18 07:42:06,994][12645] Fps is (10 sec: 37691.5, 60 sec: 42052.1, 300 sec: 42432.0). Total num frames: 1446264832. Throughput: 0: 42342.2. Samples: 1446402600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:42:06,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 07:42:09,947][12883] Updated weights for policy 0, policy_version 88283 (0.0021) [2024-06-18 07:42:11,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42875.1, 300 sec: 42542.8). 
Total num frames: 1446526976. Throughput: 0: 42430.2. Samples: 1446658620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:42:11,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 07:42:13,414][12883] Updated weights for policy 0, policy_version 88293 (0.0030) [2024-06-18 07:42:15,551][12862] Signal inference workers to stop experience collection... (20950 times) [2024-06-18 07:42:15,551][12862] Signal inference workers to resume experience collection... (20950 times) [2024-06-18 07:42:15,571][12883] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-06-18 07:42:15,571][12883] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-06-18 07:42:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1446707200. Throughput: 0: 42426.5. Samples: 1446791880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 07:42:16,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 07:42:17,758][12883] Updated weights for policy 0, policy_version 88303 (0.0027) [2024-06-18 07:42:21,577][12883] Updated weights for policy 0, policy_version 88313 (0.0034) [2024-06-18 07:42:21,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 1446920192. Throughput: 0: 42321.8. Samples: 1447038620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:42:21,994][12645] Avg episode reward: [(0, '0.097')] [2024-06-18 07:42:25,823][12883] Updated weights for policy 0, policy_version 88323 (0.0041) [2024-06-18 07:42:26,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1447165952. Throughput: 0: 42409.8. Samples: 1447291120. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:42:26,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 07:42:29,204][12883] Updated weights for policy 0, policy_version 88333 (0.0039) [2024-06-18 07:42:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1447362560. Throughput: 0: 42296.5. Samples: 1447422040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:42:31,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 07:42:33,631][12883] Updated weights for policy 0, policy_version 88343 (0.0039) [2024-06-18 07:42:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1447559168. Throughput: 0: 42443.5. Samples: 1447675400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:42:36,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 07:42:37,079][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088353_1447575552.pth... [2024-06-18 07:42:37,098][12883] Updated weights for policy 0, policy_version 88353 (0.0041) [2024-06-18 07:42:37,133][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000087731_1437384704.pth [2024-06-18 07:42:41,314][12883] Updated weights for policy 0, policy_version 88363 (0.0032) [2024-06-18 07:42:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1447788544. Throughput: 0: 42533.3. Samples: 1447932240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:42:41,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 07:42:44,863][12883] Updated weights for policy 0, policy_version 88373 (0.0045) [2024-06-18 07:42:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1448001536. Throughput: 0: 42326.2. Samples: 1448060440. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:42:46,994][12645] Avg episode reward: [(0, '0.594')] [2024-06-18 07:42:48,975][12883] Updated weights for policy 0, policy_version 88383 (0.0037) [2024-06-18 07:42:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42431.8). Total num frames: 1448214528. Throughput: 0: 42428.1. Samples: 1448311860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:42:51,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 07:42:52,422][12883] Updated weights for policy 0, policy_version 88393 (0.0036) [2024-06-18 07:42:56,968][12883] Updated weights for policy 0, policy_version 88403 (0.0046) [2024-06-18 07:42:56,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41780.7, 300 sec: 42487.3). Total num frames: 1448394752. Throughput: 0: 42455.9. Samples: 1448569140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:42:56,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 07:43:00,384][12883] Updated weights for policy 0, policy_version 88413 (0.0038) [2024-06-18 07:43:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1448624128. Throughput: 0: 42221.0. Samples: 1448691820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:43:01,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 07:43:04,478][12883] Updated weights for policy 0, policy_version 88423 (0.0031) [2024-06-18 07:43:06,994][12645] Fps is (10 sec: 45876.1, 60 sec: 43144.7, 300 sec: 42487.3). Total num frames: 1448853504. Throughput: 0: 42476.0. Samples: 1448950040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:43:06,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 07:43:08,179][12883] Updated weights for policy 0, policy_version 88433 (0.0034) [2024-06-18 07:43:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 1449033728. Throughput: 0: 42579.1. Samples: 1449207180. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 07:43:11,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 07:43:12,142][12883] Updated weights for policy 0, policy_version 88443 (0.0036) [2024-06-18 07:43:16,053][12883] Updated weights for policy 0, policy_version 88453 (0.0042) [2024-06-18 07:43:16,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1449246720. Throughput: 0: 42326.6. Samples: 1449326740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:43:16,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 07:43:20,196][12883] Updated weights for policy 0, policy_version 88463 (0.0037) [2024-06-18 07:43:21,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1449492480. Throughput: 0: 42454.3. Samples: 1449585840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:43:21,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 07:43:23,783][12883] Updated weights for policy 0, policy_version 88473 (0.0032) [2024-06-18 07:43:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 42487.4). Total num frames: 1449672704. Throughput: 0: 42512.5. Samples: 1449845300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:43:26,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 07:43:27,596][12883] Updated weights for policy 0, policy_version 88483 (0.0026) [2024-06-18 07:43:31,289][12883] Updated weights for policy 0, policy_version 88493 (0.0038) [2024-06-18 07:43:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1449902080. Throughput: 0: 42456.3. Samples: 1449970980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:43:31,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 07:43:33,600][12862] Signal inference workers to stop experience collection... 
(21000 times) [2024-06-18 07:43:33,600][12862] Signal inference workers to resume experience collection... (21000 times) [2024-06-18 07:43:33,651][12883] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-06-18 07:43:33,652][12883] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-06-18 07:43:35,149][12883] Updated weights for policy 0, policy_version 88503 (0.0042) [2024-06-18 07:43:36,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 1450131456. Throughput: 0: 42712.8. Samples: 1450233940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:43:36,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 07:43:38,909][12883] Updated weights for policy 0, policy_version 88513 (0.0041) [2024-06-18 07:43:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1450328064. Throughput: 0: 42609.8. Samples: 1450486580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:43:41,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 07:43:42,950][12883] Updated weights for policy 0, policy_version 88523 (0.0031) [2024-06-18 07:43:46,573][12883] Updated weights for policy 0, policy_version 88533 (0.0042) [2024-06-18 07:43:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42377.1). Total num frames: 1450524672. Throughput: 0: 42635.1. Samples: 1450610400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:43:46,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 07:43:50,635][12883] Updated weights for policy 0, policy_version 88543 (0.0037) [2024-06-18 07:43:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1450770432. Throughput: 0: 42765.2. Samples: 1450874480. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:43:51,994][12645] Avg episode reward: [(0, '0.234')] [2024-06-18 07:43:54,265][12883] Updated weights for policy 0, policy_version 88553 (0.0038) [2024-06-18 07:43:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1450967040. Throughput: 0: 42477.3. Samples: 1451118660. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:43:56,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 07:43:58,382][12883] Updated weights for policy 0, policy_version 88563 (0.0028) [2024-06-18 07:44:01,911][12883] Updated weights for policy 0, policy_version 88573 (0.0034) [2024-06-18 07:44:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 1451180032. Throughput: 0: 42659.1. Samples: 1451246400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:44:01,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 07:44:06,051][12883] Updated weights for policy 0, policy_version 88583 (0.0035) [2024-06-18 07:44:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1451393024. Throughput: 0: 42782.6. Samples: 1451511060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 07:44:06,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 07:44:09,481][12883] Updated weights for policy 0, policy_version 88593 (0.0041) [2024-06-18 07:44:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1451606016. Throughput: 0: 42600.4. Samples: 1451762320. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:11,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 07:44:13,725][12883] Updated weights for policy 0, policy_version 88603 (0.0028) [2024-06-18 07:44:16,950][12883] Updated weights for policy 0, policy_version 88613 (0.0040) [2024-06-18 07:44:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1451835392. Throughput: 0: 42626.2. Samples: 1451889160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:16,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 07:44:21,262][12883] Updated weights for policy 0, policy_version 88623 (0.0035) [2024-06-18 07:44:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1452032000. Throughput: 0: 42644.2. Samples: 1452152920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:21,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 07:44:24,813][12883] Updated weights for policy 0, policy_version 88633 (0.0027) [2024-06-18 07:44:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1452244992. Throughput: 0: 42508.1. Samples: 1452399440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:26,994][12645] Avg episode reward: [(0, '0.239')] [2024-06-18 07:44:29,240][12883] Updated weights for policy 0, policy_version 88643 (0.0034) [2024-06-18 07:44:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 1452457984. Throughput: 0: 42602.3. Samples: 1452527500. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:31,994][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 07:44:32,359][12883] Updated weights for policy 0, policy_version 88653 (0.0046) [2024-06-18 07:44:36,838][12883] Updated weights for policy 0, policy_version 88663 (0.0041) [2024-06-18 07:44:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42487.6). Total num frames: 1452654592. Throughput: 0: 42371.2. Samples: 1452781180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:36,994][12645] Avg episode reward: [(0, '0.170')] [2024-06-18 07:44:37,140][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088665_1452687360.pth... [2024-06-18 07:44:37,186][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088043_1442496512.pth [2024-06-18 07:44:39,998][12883] Updated weights for policy 0, policy_version 88673 (0.0048) [2024-06-18 07:44:41,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1452900352. Throughput: 0: 42619.0. Samples: 1453036520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:41,994][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 07:44:44,380][12883] Updated weights for policy 0, policy_version 88683 (0.0023) [2024-06-18 07:44:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1453096960. Throughput: 0: 42568.5. Samples: 1453161980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:46,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 07:44:47,798][12883] Updated weights for policy 0, policy_version 88693 (0.0037) [2024-06-18 07:44:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 1453293568. Throughput: 0: 42421.2. Samples: 1453420020. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:51,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 07:44:52,076][12883] Updated weights for policy 0, policy_version 88703 (0.0036) [2024-06-18 07:44:55,775][12883] Updated weights for policy 0, policy_version 88713 (0.0028) [2024-06-18 07:44:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1453522944. Throughput: 0: 42496.0. Samples: 1453674640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:44:56,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 07:44:57,620][12862] Signal inference workers to stop experience collection... (21050 times) [2024-06-18 07:44:57,620][12862] Signal inference workers to resume experience collection... (21050 times) [2024-06-18 07:44:57,652][12883] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-06-18 07:44:57,652][12883] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-06-18 07:44:59,688][12883] Updated weights for policy 0, policy_version 88723 (0.0030) [2024-06-18 07:45:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1453719552. Throughput: 0: 42531.5. Samples: 1453803080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 07:45:01,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 07:45:03,251][12883] Updated weights for policy 0, policy_version 88733 (0.0022) [2024-06-18 07:45:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1453916160. Throughput: 0: 42389.7. Samples: 1454060460. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:06,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 07:45:07,382][12883] Updated weights for policy 0, policy_version 88743 (0.0032) [2024-06-18 07:45:10,845][12883] Updated weights for policy 0, policy_version 88753 (0.0030) [2024-06-18 07:45:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1454145536. Throughput: 0: 42560.9. Samples: 1454314680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:11,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 07:45:15,140][12883] Updated weights for policy 0, policy_version 88763 (0.0039) [2024-06-18 07:45:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1454374912. Throughput: 0: 42675.4. Samples: 1454447900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:17,006][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 07:45:19,153][12883] Updated weights for policy 0, policy_version 88773 (0.0037) [2024-06-18 07:45:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1454571520. Throughput: 0: 42610.2. Samples: 1454698640. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:21,994][12645] Avg episode reward: [(0, '0.171')] [2024-06-18 07:45:22,645][12883] Updated weights for policy 0, policy_version 88783 (0.0031) [2024-06-18 07:45:26,619][12883] Updated weights for policy 0, policy_version 88793 (0.0041) [2024-06-18 07:45:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1454784512. Throughput: 0: 42757.5. Samples: 1454960600. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:26,994][12645] Avg episode reward: [(0, '0.133')] [2024-06-18 07:45:30,237][12883] Updated weights for policy 0, policy_version 88803 (0.0035) [2024-06-18 07:45:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1455030272. Throughput: 0: 42861.7. Samples: 1455090760. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:31,994][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 07:45:34,276][12883] Updated weights for policy 0, policy_version 88813 (0.0038) [2024-06-18 07:45:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1455226880. Throughput: 0: 42820.6. Samples: 1455346940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:36,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 07:45:38,017][12883] Updated weights for policy 0, policy_version 88823 (0.0030) [2024-06-18 07:45:41,826][12883] Updated weights for policy 0, policy_version 88833 (0.0040) [2024-06-18 07:45:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1455439872. Throughput: 0: 42657.4. Samples: 1455594220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:41,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 07:45:45,846][12883] Updated weights for policy 0, policy_version 88843 (0.0031) [2024-06-18 07:45:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 1455669248. Throughput: 0: 42662.6. Samples: 1455722900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:46,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 07:45:49,198][12883] Updated weights for policy 0, policy_version 88853 (0.0039) [2024-06-18 07:45:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42542.8). Total num frames: 1455865856. Throughput: 0: 42743.1. Samples: 1455983900. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-18 07:45:51,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 07:45:53,693][12883] Updated weights for policy 0, policy_version 88863 (0.0037) [2024-06-18 07:45:56,798][12883] Updated weights for policy 0, policy_version 88873 (0.0042) [2024-06-18 07:45:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1456095232. Throughput: 0: 42652.5. Samples: 1456234040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:45:56,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 07:46:01,322][12883] Updated weights for policy 0, policy_version 88883 (0.0038) [2024-06-18 07:46:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1456275456. Throughput: 0: 42589.4. Samples: 1456364420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:01,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 07:46:04,699][12883] Updated weights for policy 0, policy_version 88893 (0.0023) [2024-06-18 07:46:06,460][12862] Signal inference workers to stop experience collection... (21100 times) [2024-06-18 07:46:06,511][12883] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-06-18 07:46:06,519][12862] Signal inference workers to resume experience collection... (21100 times) [2024-06-18 07:46:06,526][12883] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-06-18 07:46:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42599.1). Total num frames: 1456521216. Throughput: 0: 42773.7. Samples: 1456623460. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:06,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 07:46:09,355][12883] Updated weights for policy 0, policy_version 88903 (0.0032) [2024-06-18 07:46:11,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1456717824. Throughput: 0: 42534.5. 
Samples: 1456874660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:11,995][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 07:46:12,265][12883] Updated weights for policy 0, policy_version 88913 (0.0027) [2024-06-18 07:46:16,763][12883] Updated weights for policy 0, policy_version 88923 (0.0027) [2024-06-18 07:46:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1456914432. Throughput: 0: 42624.9. Samples: 1457008880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:16,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 07:46:19,801][12883] Updated weights for policy 0, policy_version 88933 (0.0035) [2024-06-18 07:46:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1457127424. Throughput: 0: 42608.4. Samples: 1457264320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:21,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 07:46:24,235][12883] Updated weights for policy 0, policy_version 88943 (0.0029) [2024-06-18 07:46:26,994][12645] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1457373184. Throughput: 0: 42718.9. Samples: 1457516580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:26,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 07:46:27,796][12883] Updated weights for policy 0, policy_version 88953 (0.0028) [2024-06-18 07:46:31,841][12883] Updated weights for policy 0, policy_version 88963 (0.0050) [2024-06-18 07:46:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1457569792. Throughput: 0: 42815.3. Samples: 1457649580. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:31,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 07:46:35,320][12883] Updated weights for policy 0, policy_version 88973 (0.0041) [2024-06-18 07:46:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1457766400. Throughput: 0: 42556.5. Samples: 1457898940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:36,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 07:46:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088975_1457766400.pth... [2024-06-18 07:46:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088353_1447575552.pth [2024-06-18 07:46:39,807][12883] Updated weights for policy 0, policy_version 88983 (0.0037) [2024-06-18 07:46:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1457979392. Throughput: 0: 42608.4. Samples: 1458151420. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:41,994][12645] Avg episode reward: [(0, '0.283')] [2024-06-18 07:46:42,894][12883] Updated weights for policy 0, policy_version 88993 (0.0032) [2024-06-18 07:46:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1458192384. Throughput: 0: 42576.4. Samples: 1458280360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-18 07:46:46,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 07:46:47,676][12883] Updated weights for policy 0, policy_version 89003 (0.0032) [2024-06-18 07:46:50,719][12883] Updated weights for policy 0, policy_version 89013 (0.0042) [2024-06-18 07:46:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 1458405376. Throughput: 0: 42332.0. Samples: 1458528400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:46:51,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 07:46:55,195][12883] Updated weights for policy 0, policy_version 89023 (0.0027) [2024-06-18 07:46:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1458618368. Throughput: 0: 42518.7. Samples: 1458788000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:46:57,000][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 07:46:58,751][12883] Updated weights for policy 0, policy_version 89033 (0.0036) [2024-06-18 07:47:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1458847744. Throughput: 0: 42397.6. Samples: 1458916780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:47:01,995][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 07:47:02,754][12883] Updated weights for policy 0, policy_version 89043 (0.0025) [2024-06-18 07:47:06,610][12883] Updated weights for policy 0, policy_version 89053 (0.0025) [2024-06-18 07:47:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1459044352. Throughput: 0: 42389.3. Samples: 1459171840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:47:06,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 07:47:07,642][12862] Signal inference workers to stop experience collection... (21150 times) [2024-06-18 07:47:07,693][12883] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-06-18 07:47:07,752][12862] Signal inference workers to resume experience collection... (21150 times) [2024-06-18 07:47:07,752][12883] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-06-18 07:47:10,589][12883] Updated weights for policy 0, policy_version 89063 (0.0026) [2024-06-18 07:47:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1459273728. Throughput: 0: 42396.6. 
Samples: 1459424420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:47:11,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 07:47:14,250][12883] Updated weights for policy 0, policy_version 89073 (0.0030) [2024-06-18 07:47:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1459486720. Throughput: 0: 42289.3. Samples: 1459552600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:47:16,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 07:47:18,142][12883] Updated weights for policy 0, policy_version 89083 (0.0040) [2024-06-18 07:47:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1459683328. Throughput: 0: 42479.9. Samples: 1459810540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:47:21,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 07:47:22,234][12883] Updated weights for policy 0, policy_version 89093 (0.0042) [2024-06-18 07:47:25,826][12883] Updated weights for policy 0, policy_version 89103 (0.0043) [2024-06-18 07:47:26,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1459912704. Throughput: 0: 42382.5. Samples: 1460058640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:47:26,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 07:47:30,134][12883] Updated weights for policy 0, policy_version 89113 (0.0044) [2024-06-18 07:47:31,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1460109312. Throughput: 0: 42468.1. Samples: 1460191420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:47:31,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 07:47:33,400][12883] Updated weights for policy 0, policy_version 89123 (0.0041) [2024-06-18 07:47:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1460322304. Throughput: 0: 42565.3. 
Samples: 1460443840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:47:36,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 07:47:38,179][12883] Updated weights for policy 0, policy_version 89133 (0.0034) [2024-06-18 07:47:41,468][12883] Updated weights for policy 0, policy_version 89143 (0.0030) [2024-06-18 07:47:41,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1460551680. Throughput: 0: 42308.5. Samples: 1460691880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 07:47:41,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 07:47:45,890][12883] Updated weights for policy 0, policy_version 89153 (0.0028) [2024-06-18 07:47:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1460764672. Throughput: 0: 42385.8. Samples: 1460824140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:47:46,994][12645] Avg episode reward: [(0, '0.179')] [2024-06-18 07:47:49,051][12883] Updated weights for policy 0, policy_version 89163 (0.0040) [2024-06-18 07:47:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1460961280. Throughput: 0: 42382.2. Samples: 1461079040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:47:51,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 07:47:53,637][12883] Updated weights for policy 0, policy_version 89173 (0.0039) [2024-06-18 07:47:56,678][12883] Updated weights for policy 0, policy_version 89183 (0.0027) [2024-06-18 07:47:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1461207040. Throughput: 0: 42389.8. Samples: 1461331960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:47:57,004][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 07:48:01,363][12883] Updated weights for policy 0, policy_version 89193 (0.0035) [2024-06-18 07:48:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1461387264. Throughput: 0: 42377.7. Samples: 1461459600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:48:01,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 07:48:04,305][12883] Updated weights for policy 0, policy_version 89203 (0.0033) [2024-06-18 07:48:06,994][12645] Fps is (10 sec: 36044.1, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 1461567488. Throughput: 0: 42213.6. Samples: 1461710160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:48:06,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 07:48:09,085][12883] Updated weights for policy 0, policy_version 89213 (0.0043) [2024-06-18 07:48:10,566][12862] Signal inference workers to stop experience collection... (21200 times) [2024-06-18 07:48:10,612][12883] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-06-18 07:48:10,623][12862] Signal inference workers to resume experience collection... (21200 times) [2024-06-18 07:48:10,636][12883] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-06-18 07:48:11,835][12883] Updated weights for policy 0, policy_version 89223 (0.0030) [2024-06-18 07:48:11,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1461829632. Throughput: 0: 42211.8. Samples: 1461958260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:48:11,996][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 07:48:16,574][12883] Updated weights for policy 0, policy_version 89233 (0.0044) [2024-06-18 07:48:16,994][12645] Fps is (10 sec: 42599.4, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 1461993472. Throughput: 0: 42371.4. 
Samples: 1462098140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:48:16,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 07:48:19,403][12883] Updated weights for policy 0, policy_version 89243 (0.0035) [2024-06-18 07:48:21,994][12645] Fps is (10 sec: 37691.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1462206464. Throughput: 0: 42246.1. Samples: 1462344920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:48:21,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 07:48:24,340][12883] Updated weights for policy 0, policy_version 89253 (0.0032) [2024-06-18 07:48:26,973][12883] Updated weights for policy 0, policy_version 89263 (0.0036) [2024-06-18 07:48:26,996][12645] Fps is (10 sec: 49141.0, 60 sec: 42870.0, 300 sec: 42653.6). Total num frames: 1462484992. Throughput: 0: 42342.8. Samples: 1462597400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:48:26,997][12645] Avg episode reward: [(0, '0.059')] [2024-06-18 07:48:31,866][12883] Updated weights for policy 0, policy_version 89273 (0.0033) [2024-06-18 07:48:31,996][12645] Fps is (10 sec: 44227.6, 60 sec: 42323.7, 300 sec: 42431.5). Total num frames: 1462648832. Throughput: 0: 42541.5. Samples: 1462738600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:48:31,996][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 07:48:34,835][12883] Updated weights for policy 0, policy_version 89283 (0.0032) [2024-06-18 07:48:36,994][12645] Fps is (10 sec: 37691.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1462861824. Throughput: 0: 42444.0. Samples: 1462989020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 07:48:36,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 07:48:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089286_1462861824.pth... 
[2024-06-18 07:48:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088665_1452687360.pth [2024-06-18 07:48:39,545][12883] Updated weights for policy 0, policy_version 89293 (0.0029) [2024-06-18 07:48:41,993][12645] Fps is (10 sec: 45886.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1463107584. Throughput: 0: 42362.9. Samples: 1463238280. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:48:41,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 07:48:42,523][12883] Updated weights for policy 0, policy_version 89303 (0.0034) [2024-06-18 07:48:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 1463271424. Throughput: 0: 42499.1. Samples: 1463372060. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:48:46,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 07:48:47,573][12883] Updated weights for policy 0, policy_version 89313 (0.0030) [2024-06-18 07:48:50,551][12883] Updated weights for policy 0, policy_version 89323 (0.0027) [2024-06-18 07:48:52,000][12645] Fps is (10 sec: 40933.9, 60 sec: 42594.0, 300 sec: 42542.0). Total num frames: 1463517184. Throughput: 0: 42537.0. Samples: 1463624580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:48:52,000][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 07:48:55,233][12883] Updated weights for policy 0, policy_version 89333 (0.0049) [2024-06-18 07:48:56,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1463746560. Throughput: 0: 42575.1. Samples: 1463874040. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:48:56,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 07:48:58,256][12883] Updated weights for policy 0, policy_version 89343 (0.0037) [2024-06-18 07:49:01,994][12645] Fps is (10 sec: 39345.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1463910400. Throughput: 0: 42316.3. Samples: 1464002380. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:49:01,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 07:49:02,848][12883] Updated weights for policy 0, policy_version 89353 (0.0050) [2024-06-18 07:49:05,891][12883] Updated weights for policy 0, policy_version 89363 (0.0032) [2024-06-18 07:49:06,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 1464139776. Throughput: 0: 42458.7. Samples: 1464255560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:49:06,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 07:49:10,516][12883] Updated weights for policy 0, policy_version 89373 (0.0028) [2024-06-18 07:49:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1464369152. Throughput: 0: 42530.5. Samples: 1464511180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:49:11,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 07:49:12,290][12862] Signal inference workers to stop experience collection... (21250 times) [2024-06-18 07:49:12,290][12862] Signal inference workers to resume experience collection... (21250 times) [2024-06-18 07:49:12,307][12883] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-06-18 07:49:12,317][12883] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-06-18 07:49:13,768][12883] Updated weights for policy 0, policy_version 89383 (0.0032) [2024-06-18 07:49:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1464549376. Throughput: 0: 42330.6. Samples: 1464643380. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:49:16,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 07:49:18,054][12883] Updated weights for policy 0, policy_version 89393 (0.0036) [2024-06-18 07:49:21,233][12883] Updated weights for policy 0, policy_version 89403 (0.0043) [2024-06-18 07:49:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 1464778752. Throughput: 0: 42395.6. Samples: 1464896820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:49:21,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 07:49:25,539][12883] Updated weights for policy 0, policy_version 89413 (0.0037) [2024-06-18 07:49:26,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 1465024512. Throughput: 0: 42575.4. Samples: 1465154180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:49:26,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 07:49:28,744][12883] Updated weights for policy 0, policy_version 89423 (0.0042) [2024-06-18 07:49:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1465188352. Throughput: 0: 42561.8. Samples: 1465287340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-18 07:49:31,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 07:49:33,276][12883] Updated weights for policy 0, policy_version 89433 (0.0039) [2024-06-18 07:49:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1465417728. Throughput: 0: 42580.0. Samples: 1465540420. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:49:36,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 07:49:37,122][12883] Updated weights for policy 0, policy_version 89443 (0.0024) [2024-06-18 07:49:40,816][12883] Updated weights for policy 0, policy_version 89453 (0.0038) [2024-06-18 07:49:41,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42598.3, 300 sec: 42598.4). 
Total num frames: 1465663488. Throughput: 0: 42800.8. Samples: 1465800080. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:49:41,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 07:49:44,630][12883] Updated weights for policy 0, policy_version 89463 (0.0040) [2024-06-18 07:49:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 1465827328. Throughput: 0: 42772.2. Samples: 1465927120. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:49:46,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 07:49:48,380][12883] Updated weights for policy 0, policy_version 89473 (0.0026) [2024-06-18 07:49:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42329.6, 300 sec: 42487.3). Total num frames: 1466056704. Throughput: 0: 42686.6. Samples: 1466176460. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:49:51,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 07:49:52,287][12883] Updated weights for policy 0, policy_version 89483 (0.0032) [2024-06-18 07:49:55,968][12883] Updated weights for policy 0, policy_version 89493 (0.0030) [2024-06-18 07:49:56,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1466302464. Throughput: 0: 42816.5. Samples: 1466437920. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:49:56,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 07:49:59,786][12883] Updated weights for policy 0, policy_version 89503 (0.0035) [2024-06-18 07:50:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1466482688. Throughput: 0: 42793.3. Samples: 1466569080. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:50:01,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 07:50:03,488][12883] Updated weights for policy 0, policy_version 89513 (0.0035) [2024-06-18 07:50:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). 
Total num frames: 1466712064. Throughput: 0: 42814.9. Samples: 1466823500. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:50:06,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 07:50:07,414][12883] Updated weights for policy 0, policy_version 89523 (0.0026) [2024-06-18 07:50:11,332][12883] Updated weights for policy 0, policy_version 89533 (0.0043) [2024-06-18 07:50:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1466925056. Throughput: 0: 42874.1. Samples: 1467083520. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:50:11,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 07:50:15,033][12883] Updated weights for policy 0, policy_version 89543 (0.0034) [2024-06-18 07:50:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1467138048. Throughput: 0: 42851.6. Samples: 1467215660. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:50:16,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 07:50:18,939][12883] Updated weights for policy 0, policy_version 89553 (0.0027) [2024-06-18 07:50:21,998][12645] Fps is (10 sec: 44216.6, 60 sec: 43141.2, 300 sec: 42653.3). Total num frames: 1467367424. Throughput: 0: 42788.2. Samples: 1467466080. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:50:21,999][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 07:50:22,650][12883] Updated weights for policy 0, policy_version 89563 (0.0033) [2024-06-18 07:50:26,664][12862] Signal inference workers to stop experience collection... (21300 times) [2024-06-18 07:50:26,664][12862] Signal inference workers to resume experience collection... 
(21300 times) [2024-06-18 07:50:26,680][12883] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-06-18 07:50:26,680][12883] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-06-18 07:50:26,812][12883] Updated weights for policy 0, policy_version 89573 (0.0029) [2024-06-18 07:50:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1467564032. Throughput: 0: 42969.4. Samples: 1467733700. Policy #0 lag: (min: 1.0, avg: 8.2, max: 20.0) [2024-06-18 07:50:26,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 07:50:30,264][12883] Updated weights for policy 0, policy_version 89583 (0.0027) [2024-06-18 07:50:31,994][12645] Fps is (10 sec: 40978.5, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1467777024. Throughput: 0: 42740.7. Samples: 1467850460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:50:31,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 07:50:34,481][12883] Updated weights for policy 0, policy_version 89593 (0.0048) [2024-06-18 07:50:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1467990016. Throughput: 0: 42960.9. Samples: 1468109700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:50:37,003][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 07:50:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089599_1467990016.pth... [2024-06-18 07:50:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000088975_1457766400.pth [2024-06-18 07:50:37,902][12883] Updated weights for policy 0, policy_version 89603 (0.0031) [2024-06-18 07:50:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1468186624. Throughput: 0: 42858.7. Samples: 1468366560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:50:41,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 07:50:42,375][12883] Updated weights for policy 0, policy_version 89613 (0.0034) [2024-06-18 07:50:45,605][12883] Updated weights for policy 0, policy_version 89623 (0.0039) [2024-06-18 07:50:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1468432384. Throughput: 0: 42833.8. Samples: 1468496600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:50:47,000][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 07:50:50,019][12883] Updated weights for policy 0, policy_version 89633 (0.0032) [2024-06-18 07:50:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1468645376. Throughput: 0: 42934.7. Samples: 1468755560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:50:51,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 07:50:53,543][12883] Updated weights for policy 0, policy_version 89643 (0.0023) [2024-06-18 07:50:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1468841984. Throughput: 0: 42871.6. Samples: 1469012740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:50:56,994][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 07:50:57,611][12883] Updated weights for policy 0, policy_version 89653 (0.0037) [2024-06-18 07:51:01,088][12883] Updated weights for policy 0, policy_version 89663 (0.0043) [2024-06-18 07:51:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1469087744. Throughput: 0: 42724.0. Samples: 1469138240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:51:01,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 07:51:05,547][12883] Updated weights for policy 0, policy_version 89673 (0.0032) [2024-06-18 07:51:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1469284352. Throughput: 0: 42923.9. Samples: 1469397460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:51:06,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 07:51:08,720][12883] Updated weights for policy 0, policy_version 89683 (0.0050) [2024-06-18 07:51:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1469497344. Throughput: 0: 42635.1. Samples: 1469652280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:51:11,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 07:51:13,193][12883] Updated weights for policy 0, policy_version 89693 (0.0037) [2024-06-18 07:51:16,389][12883] Updated weights for policy 0, policy_version 89703 (0.0028) [2024-06-18 07:51:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1469726720. Throughput: 0: 42895.3. Samples: 1469780740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 07:51:16,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 07:51:20,796][12883] Updated weights for policy 0, policy_version 89713 (0.0043) [2024-06-18 07:51:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42601.7, 300 sec: 42542.9). Total num frames: 1469923328. Throughput: 0: 42999.6. Samples: 1470044680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:51:21,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 07:51:24,069][12883] Updated weights for policy 0, policy_version 89723 (0.0033) [2024-06-18 07:51:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1470152704. Throughput: 0: 42812.1. Samples: 1470293100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:51:26,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 07:51:28,380][12883] Updated weights for policy 0, policy_version 89733 (0.0045) [2024-06-18 07:51:31,483][12883] Updated weights for policy 0, policy_version 89743 (0.0029) [2024-06-18 07:51:31,996][12645] Fps is (10 sec: 42588.3, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1470349312. Throughput: 0: 42860.0. Samples: 1470425400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:51:31,997][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 07:51:35,837][12883] Updated weights for policy 0, policy_version 89753 (0.0034) [2024-06-18 07:51:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1470562304. Throughput: 0: 42926.7. Samples: 1470687260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:51:36,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 07:51:38,470][12862] Signal inference workers to stop experience collection... (21350 times) [2024-06-18 07:51:38,471][12862] Signal inference workers to resume experience collection... (21350 times) [2024-06-18 07:51:38,517][12883] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-06-18 07:51:38,517][12883] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-06-18 07:51:39,034][12883] Updated weights for policy 0, policy_version 89763 (0.0033) [2024-06-18 07:51:41,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 1470808064. Throughput: 0: 42699.4. Samples: 1470934220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:51:41,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 07:51:43,872][12883] Updated weights for policy 0, policy_version 89773 (0.0028) [2024-06-18 07:51:46,551][12883] Updated weights for policy 0, policy_version 89783 (0.0030) [2024-06-18 07:51:46,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 1471004672. Throughput: 0: 42974.3. Samples: 1471072180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:51:46,996][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 07:51:51,458][12883] Updated weights for policy 0, policy_version 89793 (0.0045) [2024-06-18 07:51:51,994][12645] Fps is (10 sec: 36044.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1471168512. Throughput: 0: 42892.4. Samples: 1471327620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:51:51,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 07:51:54,554][12883] Updated weights for policy 0, policy_version 89803 (0.0026) [2024-06-18 07:51:56,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1471447040. Throughput: 0: 42721.7. Samples: 1471574760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:51:56,999][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 07:51:59,032][12883] Updated weights for policy 0, policy_version 89813 (0.0032) [2024-06-18 07:52:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1471627264. Throughput: 0: 42935.5. Samples: 1471712840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:52:01,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 07:52:02,553][12883] Updated weights for policy 0, policy_version 89823 (0.0040) [2024-06-18 07:52:06,636][12883] Updated weights for policy 0, policy_version 89833 (0.0028) [2024-06-18 07:52:06,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1471823872. Throughput: 0: 42617.4. Samples: 1471962460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:52:06,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 07:52:09,994][12883] Updated weights for policy 0, policy_version 89843 (0.0034) [2024-06-18 07:52:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1472086016. Throughput: 0: 42855.1. Samples: 1472221580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:52:11,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 07:52:14,164][12883] Updated weights for policy 0, policy_version 89853 (0.0031) [2024-06-18 07:52:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1472282624. Throughput: 0: 43023.6. Samples: 1472361360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:52:16,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 07:52:17,441][12883] Updated weights for policy 0, policy_version 89863 (0.0032) [2024-06-18 07:52:21,732][12883] Updated weights for policy 0, policy_version 89873 (0.0033) [2024-06-18 07:52:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1472479232. Throughput: 0: 42693.6. Samples: 1472608480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:52:21,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 07:52:24,873][12883] Updated weights for policy 0, policy_version 89883 (0.0048) [2024-06-18 07:52:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1472708608. Throughput: 0: 42910.8. Samples: 1472865200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:52:26,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 07:52:30,237][12883] Updated weights for policy 0, policy_version 89893 (0.0035) [2024-06-18 07:52:31,993][12645] Fps is (10 sec: 44237.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1472921600. Throughput: 0: 42772.5. Samples: 1472996840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:52:31,994][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 07:52:32,456][12883] Updated weights for policy 0, policy_version 89903 (0.0039) [2024-06-18 07:52:34,978][12862] Signal inference workers to stop experience collection... (21400 times) [2024-06-18 07:52:34,984][12862] Signal inference workers to resume experience collection... (21400 times) [2024-06-18 07:52:35,024][12883] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-06-18 07:52:35,024][12883] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-06-18 07:52:36,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 1473118208. Throughput: 0: 42567.5. Samples: 1473243160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:52:36,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 07:52:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089912_1473118208.pth... 
[2024-06-18 07:52:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089286_1462861824.pth [2024-06-18 07:52:37,807][12883] Updated weights for policy 0, policy_version 89913 (0.0039) [2024-06-18 07:52:40,234][12883] Updated weights for policy 0, policy_version 89923 (0.0049) [2024-06-18 07:52:41,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1473331200. Throughput: 0: 42865.2. Samples: 1473503700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:52:41,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 07:52:45,359][12883] Updated weights for policy 0, policy_version 89933 (0.0032) [2024-06-18 07:52:46,996][12645] Fps is (10 sec: 45866.0, 60 sec: 42871.5, 300 sec: 42764.7). Total num frames: 1473576960. Throughput: 0: 42610.4. Samples: 1473630400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:52:46,996][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 07:52:47,850][12883] Updated weights for policy 0, policy_version 89943 (0.0037) [2024-06-18 07:52:51,994][12645] Fps is (10 sec: 44235.8, 60 sec: 43417.4, 300 sec: 42598.4). Total num frames: 1473773568. Throughput: 0: 42752.0. Samples: 1473886320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:52:51,995][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 07:52:52,849][12883] Updated weights for policy 0, policy_version 89953 (0.0041) [2024-06-18 07:52:55,805][12883] Updated weights for policy 0, policy_version 89963 (0.0023) [2024-06-18 07:52:56,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1473986560. Throughput: 0: 42714.1. Samples: 1474143720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:52:56,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 07:53:00,314][12883] Updated weights for policy 0, policy_version 89973 (0.0034) [2024-06-18 07:53:01,994][12645] Fps is (10 sec: 42599.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1474199552. Throughput: 0: 42418.2. Samples: 1474270180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:53:01,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 07:53:03,515][12883] Updated weights for policy 0, policy_version 89983 (0.0037) [2024-06-18 07:53:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 1474396160. Throughput: 0: 42630.8. Samples: 1474526860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 07:53:06,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 07:53:07,920][12883] Updated weights for policy 0, policy_version 89993 (0.0038) [2024-06-18 07:53:11,074][12883] Updated weights for policy 0, policy_version 90003 (0.0054) [2024-06-18 07:53:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1474625536. Throughput: 0: 42535.1. Samples: 1474779280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:11,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 07:53:15,820][12883] Updated weights for policy 0, policy_version 90013 (0.0036) [2024-06-18 07:53:16,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1474838528. Throughput: 0: 42521.1. Samples: 1474910300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:16,998][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 07:53:18,544][12883] Updated weights for policy 0, policy_version 90023 (0.0033) [2024-06-18 07:53:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 1475035136. Throughput: 0: 42785.1. Samples: 1475168480. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:21,994][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 07:53:23,419][12883] Updated weights for policy 0, policy_version 90033 (0.0036) [2024-06-18 07:53:26,137][12883] Updated weights for policy 0, policy_version 90043 (0.0045) [2024-06-18 07:53:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 1475264512. Throughput: 0: 42498.4. Samples: 1475416120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:26,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 07:53:30,938][12883] Updated weights for policy 0, policy_version 90053 (0.0033) [2024-06-18 07:53:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1475461120. Throughput: 0: 42673.1. Samples: 1475550600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:31,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 07:53:34,317][12883] Updated weights for policy 0, policy_version 90063 (0.0032) [2024-06-18 07:53:36,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42870.0, 300 sec: 42653.6). Total num frames: 1475690496. Throughput: 0: 42626.2. Samples: 1475804580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:36,996][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 07:53:38,782][12883] Updated weights for policy 0, policy_version 90073 (0.0028) [2024-06-18 07:53:39,183][12862] Signal inference workers to stop experience collection... (21450 times) [2024-06-18 07:53:39,240][12883] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-06-18 07:53:39,244][12862] Signal inference workers to resume experience collection... 
(21450 times) [2024-06-18 07:53:39,254][12883] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-06-18 07:53:41,976][12883] Updated weights for policy 0, policy_version 90083 (0.0036) [2024-06-18 07:53:41,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1475919872. Throughput: 0: 42573.4. Samples: 1476059520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:41,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 07:53:46,542][12883] Updated weights for policy 0, policy_version 90093 (0.0032) [2024-06-18 07:53:46,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42053.8, 300 sec: 42654.8). Total num frames: 1476100096. Throughput: 0: 42739.6. Samples: 1476193460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:46,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 07:53:49,693][12883] Updated weights for policy 0, policy_version 90103 (0.0027) [2024-06-18 07:53:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 1476329472. Throughput: 0: 42607.0. Samples: 1476444180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:51,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 07:53:54,159][12883] Updated weights for policy 0, policy_version 90113 (0.0046) [2024-06-18 07:53:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1476542464. Throughput: 0: 42623.0. Samples: 1476697320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:53:56,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 07:53:57,460][12883] Updated weights for policy 0, policy_version 90123 (0.0032) [2024-06-18 07:54:01,671][12883] Updated weights for policy 0, policy_version 90133 (0.0033) [2024-06-18 07:54:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1476739072. Throughput: 0: 42645.4. Samples: 1476829340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 07:54:01,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 07:54:05,009][12883] Updated weights for policy 0, policy_version 90143 (0.0033) [2024-06-18 07:54:07,000][12645] Fps is (10 sec: 44209.4, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 1476984832. Throughput: 0: 42528.7. Samples: 1477082540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:07,000][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 07:54:09,205][12883] Updated weights for policy 0, policy_version 90153 (0.0032) [2024-06-18 07:54:11,995][12645] Fps is (10 sec: 44230.8, 60 sec: 42597.4, 300 sec: 42820.4). Total num frames: 1477181440. Throughput: 0: 42815.5. Samples: 1477342880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:11,996][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 07:54:12,611][12883] Updated weights for policy 0, policy_version 90163 (0.0038) [2024-06-18 07:54:16,822][12883] Updated weights for policy 0, policy_version 90173 (0.0043) [2024-06-18 07:54:16,994][12645] Fps is (10 sec: 40986.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1477394432. Throughput: 0: 42574.8. Samples: 1477466460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:16,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 07:54:20,243][12883] Updated weights for policy 0, policy_version 90183 (0.0038) [2024-06-18 07:54:21,994][12645] Fps is (10 sec: 44242.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1477623808. Throughput: 0: 42572.8. Samples: 1477720260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:21,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 07:54:24,699][12883] Updated weights for policy 0, policy_version 90193 (0.0047) [2024-06-18 07:54:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1477836800. Throughput: 0: 42827.5. Samples: 1477986760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:26,998][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 07:54:27,798][12883] Updated weights for policy 0, policy_version 90203 (0.0037) [2024-06-18 07:54:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1478017024. Throughput: 0: 42641.8. Samples: 1478112340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:31,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 07:54:32,424][12883] Updated weights for policy 0, policy_version 90213 (0.0031) [2024-06-18 07:54:35,452][12883] Updated weights for policy 0, policy_version 90223 (0.0029) [2024-06-18 07:54:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 1478279168. Throughput: 0: 42727.6. Samples: 1478366920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:36,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 07:54:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090227_1478279168.pth... [2024-06-18 07:54:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089599_1467990016.pth [2024-06-18 07:54:39,836][12883] Updated weights for policy 0, policy_version 90233 (0.0041) [2024-06-18 07:54:41,682][12862] Signal inference workers to stop experience collection... (21500 times) [2024-06-18 07:54:41,683][12862] Signal inference workers to resume experience collection... (21500 times) [2024-06-18 07:54:41,727][12883] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-06-18 07:54:41,727][12883] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-06-18 07:54:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1478475776. Throughput: 0: 43161.4. Samples: 1478639580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:41,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 07:54:42,999][12883] Updated weights for policy 0, policy_version 90243 (0.0029) [2024-06-18 07:54:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1478672384. Throughput: 0: 42813.2. Samples: 1478755940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:46,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 07:54:47,518][12883] Updated weights for policy 0, policy_version 90253 (0.0026) [2024-06-18 07:54:50,542][12883] Updated weights for policy 0, policy_version 90263 (0.0038) [2024-06-18 07:54:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1478918144. Throughput: 0: 42800.6. Samples: 1479008300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:51,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 07:54:55,235][12883] Updated weights for policy 0, policy_version 90273 (0.0042) [2024-06-18 07:54:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1479081984. Throughput: 0: 42959.6. Samples: 1479276000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 07:54:56,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 07:54:58,440][12883] Updated weights for policy 0, policy_version 90283 (0.0044) [2024-06-18 07:55:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1479294976. Throughput: 0: 42804.9. Samples: 1479392680. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:01,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 07:55:02,924][12883] Updated weights for policy 0, policy_version 90293 (0.0032) [2024-06-18 07:55:06,029][12883] Updated weights for policy 0, policy_version 90303 (0.0027) [2024-06-18 07:55:06,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42875.9, 300 sec: 42820.6). Total num frames: 1479557120. Throughput: 0: 43013.3. Samples: 1479655860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:06,994][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 07:55:10,590][12883] Updated weights for policy 0, policy_version 90313 (0.0044) [2024-06-18 07:55:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42599.4, 300 sec: 42709.5). Total num frames: 1479737344. Throughput: 0: 42906.7. Samples: 1479917560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:11,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 07:55:13,584][12883] Updated weights for policy 0, policy_version 90323 (0.0032) [2024-06-18 07:55:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42654.6). Total num frames: 1479950336. Throughput: 0: 42712.4. Samples: 1480034400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:16,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 07:55:18,211][12883] Updated weights for policy 0, policy_version 90333 (0.0042) [2024-06-18 07:55:21,195][12883] Updated weights for policy 0, policy_version 90343 (0.0030) [2024-06-18 07:55:21,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1480212480. Throughput: 0: 42971.2. Samples: 1480300620. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:21,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 07:55:25,934][12883] Updated weights for policy 0, policy_version 90353 (0.0042) [2024-06-18 07:55:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1480392704. Throughput: 0: 42591.1. Samples: 1480556180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:26,994][12645] Avg episode reward: [(0, '0.637')] [2024-06-18 07:55:29,026][12883] Updated weights for policy 0, policy_version 90363 (0.0034) [2024-06-18 07:55:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1480605696. Throughput: 0: 42773.8. Samples: 1480680760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:31,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 07:55:33,489][12883] Updated weights for policy 0, policy_version 90373 (0.0038) [2024-06-18 07:55:36,756][12883] Updated weights for policy 0, policy_version 90383 (0.0040) [2024-06-18 07:55:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1480835072. Throughput: 0: 43040.9. Samples: 1480945140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:36,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 07:55:40,942][12883] Updated weights for policy 0, policy_version 90393 (0.0029) [2024-06-18 07:55:41,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 1481015296. Throughput: 0: 42930.3. Samples: 1481207960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:41,996][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 07:55:44,391][12862] Signal inference workers to stop experience collection... (21550 times) [2024-06-18 07:55:44,392][12862] Signal inference workers to resume experience collection... 
(21550 times) [2024-06-18 07:55:44,410][12883] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-06-18 07:55:44,410][12883] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-06-18 07:55:44,542][12883] Updated weights for policy 0, policy_version 90403 (0.0042) [2024-06-18 07:55:46,996][12645] Fps is (10 sec: 42588.6, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 1481261056. Throughput: 0: 42966.6. Samples: 1481326280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:46,996][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 07:55:48,606][12883] Updated weights for policy 0, policy_version 90413 (0.0023) [2024-06-18 07:55:51,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1481474048. Throughput: 0: 42965.4. Samples: 1481589300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-18 07:55:51,994][12645] Avg episode reward: [(0, '0.215')] [2024-06-18 07:55:52,022][12883] Updated weights for policy 0, policy_version 90423 (0.0035) [2024-06-18 07:55:56,329][12883] Updated weights for policy 0, policy_version 90433 (0.0040) [2024-06-18 07:55:56,994][12645] Fps is (10 sec: 40969.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1481670656. Throughput: 0: 42952.9. Samples: 1481850440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:55:56,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 07:55:59,539][12883] Updated weights for policy 0, policy_version 90443 (0.0033) [2024-06-18 07:56:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43690.6, 300 sec: 42820.6). Total num frames: 1481916416. Throughput: 0: 43058.2. Samples: 1481972020. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:01,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 07:56:03,876][12883] Updated weights for policy 0, policy_version 90453 (0.0021) [2024-06-18 07:56:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1482113024. Throughput: 0: 43049.3. Samples: 1482237840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:06,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 07:56:07,312][12883] Updated weights for policy 0, policy_version 90463 (0.0037) [2024-06-18 07:56:11,474][12883] Updated weights for policy 0, policy_version 90473 (0.0042) [2024-06-18 07:56:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1482309632. Throughput: 0: 42900.0. Samples: 1482486680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:11,994][12645] Avg episode reward: [(0, '0.145')] [2024-06-18 07:56:14,978][12883] Updated weights for policy 0, policy_version 90483 (0.0034) [2024-06-18 07:56:17,002][12645] Fps is (10 sec: 44199.7, 60 sec: 43411.6, 300 sec: 42819.3). Total num frames: 1482555392. Throughput: 0: 42983.6. Samples: 1482615380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:17,002][12645] Avg episode reward: [(0, '0.154')] [2024-06-18 07:56:19,042][12883] Updated weights for policy 0, policy_version 90493 (0.0029) [2024-06-18 07:56:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 1482719232. Throughput: 0: 42865.6. Samples: 1482874100. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:21,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 07:56:22,835][12883] Updated weights for policy 0, policy_version 90503 (0.0030) [2024-06-18 07:56:26,639][12883] Updated weights for policy 0, policy_version 90513 (0.0026) [2024-06-18 07:56:26,994][12645] Fps is (10 sec: 40994.6, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 1482964992. Throughput: 0: 42567.0. Samples: 1483123380. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:26,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 07:56:30,558][12883] Updated weights for policy 0, policy_version 90523 (0.0030) [2024-06-18 07:56:31,994][12645] Fps is (10 sec: 49153.1, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1483210752. Throughput: 0: 42908.9. Samples: 1483257080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:31,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 07:56:34,254][12883] Updated weights for policy 0, policy_version 90533 (0.0045) [2024-06-18 07:56:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1483374592. Throughput: 0: 42671.5. Samples: 1483509520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:36,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 07:56:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090538_1483374592.pth... [2024-06-18 07:56:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000089912_1473118208.pth [2024-06-18 07:56:38,190][12883] Updated weights for policy 0, policy_version 90543 (0.0043) [2024-06-18 07:56:41,996][12645] Fps is (10 sec: 39312.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1483603968. Throughput: 0: 42640.1. Samples: 1483769340. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:41,997][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 07:56:42,227][12883] Updated weights for policy 0, policy_version 90553 (0.0033) [2024-06-18 07:56:45,843][12883] Updated weights for policy 0, policy_version 90563 (0.0037) [2024-06-18 07:56:46,994][12645] Fps is (10 sec: 47514.3, 60 sec: 43146.2, 300 sec: 42987.2). Total num frames: 1483849728. Throughput: 0: 42804.5. Samples: 1483898220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 07:56:46,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 07:56:49,838][12883] Updated weights for policy 0, policy_version 90573 (0.0034) [2024-06-18 07:56:51,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1484029952. Throughput: 0: 42556.3. Samples: 1484152880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:56:51,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 07:56:53,589][12883] Updated weights for policy 0, policy_version 90583 (0.0034) [2024-06-18 07:56:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1484242944. Throughput: 0: 42849.4. Samples: 1484414900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:56:56,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 07:56:57,344][12883] Updated weights for policy 0, policy_version 90593 (0.0043) [2024-06-18 07:57:01,257][12883] Updated weights for policy 0, policy_version 90603 (0.0040) [2024-06-18 07:57:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1484488704. Throughput: 0: 42870.1. Samples: 1484544180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:57:01,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 07:57:05,030][12883] Updated weights for policy 0, policy_version 90613 (0.0039) [2024-06-18 07:57:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). 
Total num frames: 1484652544. Throughput: 0: 42674.0. Samples: 1484794420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:57:06,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 07:57:07,656][12862] Signal inference workers to stop experience collection... (21600 times) [2024-06-18 07:57:07,656][12862] Signal inference workers to resume experience collection... (21600 times) [2024-06-18 07:57:07,682][12883] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-06-18 07:57:07,682][12883] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-06-18 07:57:09,061][12883] Updated weights for policy 0, policy_version 90623 (0.0036) [2024-06-18 07:57:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1484881920. Throughput: 0: 42731.5. Samples: 1485046300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:57:11,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 07:57:12,623][12883] Updated weights for policy 0, policy_version 90633 (0.0039) [2024-06-18 07:57:16,785][12883] Updated weights for policy 0, policy_version 90643 (0.0039) [2024-06-18 07:57:16,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42604.3, 300 sec: 42820.6). Total num frames: 1485111296. Throughput: 0: 42752.8. Samples: 1485180960. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:57:16,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 07:57:20,178][12883] Updated weights for policy 0, policy_version 90653 (0.0030) [2024-06-18 07:57:21,996][12645] Fps is (10 sec: 42588.8, 60 sec: 43143.0, 300 sec: 42709.1). Total num frames: 1485307904. Throughput: 0: 42629.0. Samples: 1485427920. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:57:21,997][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 07:57:24,456][12883] Updated weights for policy 0, policy_version 90663 (0.0038) [2024-06-18 07:57:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1485537280. Throughput: 0: 42560.7. Samples: 1485684480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:57:26,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 07:57:27,836][12883] Updated weights for policy 0, policy_version 90673 (0.0025) [2024-06-18 07:57:31,994][12645] Fps is (10 sec: 40969.6, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1485717504. Throughput: 0: 42651.1. Samples: 1485817520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:57:31,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 07:57:32,126][12883] Updated weights for policy 0, policy_version 90683 (0.0033) [2024-06-18 07:57:35,490][12883] Updated weights for policy 0, policy_version 90693 (0.0037) [2024-06-18 07:57:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1485946880. Throughput: 0: 42452.5. Samples: 1486063240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:57:36,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 07:57:39,820][12883] Updated weights for policy 0, policy_version 90703 (0.0029) [2024-06-18 07:57:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42873.1, 300 sec: 42709.8). Total num frames: 1486176256. Throughput: 0: 42364.9. Samples: 1486321320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 07:57:41,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 07:57:43,285][12883] Updated weights for policy 0, policy_version 90713 (0.0036) [2024-06-18 07:57:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 1486356480. Throughput: 0: 42402.8. Samples: 1486452300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:57:46,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 07:57:47,555][12883] Updated weights for policy 0, policy_version 90723 (0.0033) [2024-06-18 07:57:50,836][12883] Updated weights for policy 0, policy_version 90733 (0.0033) [2024-06-18 07:57:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1486602240. Throughput: 0: 42397.2. Samples: 1486702300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:57:51,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 07:57:55,337][12883] Updated weights for policy 0, policy_version 90743 (0.0030) [2024-06-18 07:57:56,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1486798848. Throughput: 0: 42540.4. Samples: 1486960620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:57:56,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 07:57:58,592][12883] Updated weights for policy 0, policy_version 90753 (0.0035) [2024-06-18 07:58:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1486995456. Throughput: 0: 42371.0. Samples: 1487087660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:58:02,008][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 07:58:02,936][12883] Updated weights for policy 0, policy_version 90763 (0.0027) [2024-06-18 07:58:06,250][12883] Updated weights for policy 0, policy_version 90773 (0.0027) [2024-06-18 07:58:06,996][12645] Fps is (10 sec: 44227.0, 60 sec: 43142.8, 300 sec: 42764.7). Total num frames: 1487241216. Throughput: 0: 42578.2. Samples: 1487343940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:58:06,996][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 07:58:10,575][12883] Updated weights for policy 0, policy_version 90783 (0.0028) [2024-06-18 07:58:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1487437824. Throughput: 0: 42603.7. Samples: 1487601640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:58:11,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 07:58:14,134][12883] Updated weights for policy 0, policy_version 90793 (0.0034) [2024-06-18 07:58:16,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1487650816. Throughput: 0: 42424.8. Samples: 1487726640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:58:16,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 07:58:17,970][12883] Updated weights for policy 0, policy_version 90803 (0.0026) [2024-06-18 07:58:21,583][12862] Signal inference workers to stop experience collection... (21650 times) [2024-06-18 07:58:21,583][12862] Signal inference workers to resume experience collection... (21650 times) [2024-06-18 07:58:21,604][12883] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-06-18 07:58:21,604][12883] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-06-18 07:58:21,739][12883] Updated weights for policy 0, policy_version 90813 (0.0028) [2024-06-18 07:58:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 1487896576. Throughput: 0: 42821.8. Samples: 1487990220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:58:21,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 07:58:25,597][12883] Updated weights for policy 0, policy_version 90823 (0.0033) [2024-06-18 07:58:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1488076800. Throughput: 0: 42691.9. 
Samples: 1488242460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:58:26,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 07:58:29,368][12883] Updated weights for policy 0, policy_version 90833 (0.0044) [2024-06-18 07:58:31,993][12645] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1488289792. Throughput: 0: 42587.6. Samples: 1488368740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:58:31,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 07:58:33,489][12883] Updated weights for policy 0, policy_version 90843 (0.0033) [2024-06-18 07:58:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1488519168. Throughput: 0: 42875.6. Samples: 1488631700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 07:58:36,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 07:58:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090853_1488535552.pth... [2024-06-18 07:58:37,020][12883] Updated weights for policy 0, policy_version 90853 (0.0036) [2024-06-18 07:58:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090227_1478279168.pth [2024-06-18 07:58:41,144][12883] Updated weights for policy 0, policy_version 90863 (0.0024) [2024-06-18 07:58:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1488732160. Throughput: 0: 42733.4. Samples: 1488883620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:58:41,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 07:58:44,580][12883] Updated weights for policy 0, policy_version 90873 (0.0038) [2024-06-18 07:58:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1488928768. Throughput: 0: 42734.7. Samples: 1489010720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:58:46,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 07:58:48,630][12883] Updated weights for policy 0, policy_version 90883 (0.0036) [2024-06-18 07:58:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1489158144. Throughput: 0: 42838.1. Samples: 1489271560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:58:51,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 07:58:52,502][12883] Updated weights for policy 0, policy_version 90893 (0.0039) [2024-06-18 07:58:56,756][12883] Updated weights for policy 0, policy_version 90903 (0.0034) [2024-06-18 07:58:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1489354752. Throughput: 0: 42782.2. Samples: 1489526840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:58:56,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 07:59:00,609][12883] Updated weights for policy 0, policy_version 90913 (0.0037) [2024-06-18 07:59:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42710.4). Total num frames: 1489584128. Throughput: 0: 42755.6. Samples: 1489650640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:59:01,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 07:59:04,431][12883] Updated weights for policy 0, policy_version 90923 (0.0027) [2024-06-18 07:59:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42600.0, 300 sec: 42765.2). Total num frames: 1489797120. Throughput: 0: 42678.7. Samples: 1489910760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:59:06,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 07:59:08,340][12883] Updated weights for policy 0, policy_version 90933 (0.0039) [2024-06-18 07:59:11,985][12883] Updated weights for policy 0, policy_version 90943 (0.0032) [2024-06-18 07:59:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1490010112. Throughput: 0: 42762.2. Samples: 1490166760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:59:11,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 07:59:15,947][12883] Updated weights for policy 0, policy_version 90953 (0.0046) [2024-06-18 07:59:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1490223104. Throughput: 0: 42695.3. Samples: 1490290040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:59:16,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 07:59:19,615][12883] Updated weights for policy 0, policy_version 90963 (0.0034) [2024-06-18 07:59:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1490436096. Throughput: 0: 42658.3. Samples: 1490551320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:59:21,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 07:59:23,707][12883] Updated weights for policy 0, policy_version 90973 (0.0029) [2024-06-18 07:59:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1490649088. Throughput: 0: 42767.3. Samples: 1490808160. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:59:26,995][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 07:59:27,142][12883] Updated weights for policy 0, policy_version 90983 (0.0029) [2024-06-18 07:59:31,399][12883] Updated weights for policy 0, policy_version 90993 (0.0034) [2024-06-18 07:59:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1490862080. Throughput: 0: 42848.4. Samples: 1490938900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 07:59:31,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 07:59:35,239][12883] Updated weights for policy 0, policy_version 91003 (0.0036) [2024-06-18 07:59:36,996][12645] Fps is (10 sec: 42589.7, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 1491075072. Throughput: 0: 42744.1. Samples: 1491195140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:59:36,997][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 07:59:39,187][12883] Updated weights for policy 0, policy_version 91013 (0.0041) [2024-06-18 07:59:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1491288064. Throughput: 0: 42551.5. Samples: 1491441660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:59:41,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 07:59:42,915][12883] Updated weights for policy 0, policy_version 91023 (0.0029) [2024-06-18 07:59:44,397][12862] Signal inference workers to stop experience collection... (21700 times) [2024-06-18 07:59:44,397][12862] Signal inference workers to resume experience collection... 
(21700 times) [2024-06-18 07:59:44,411][12883] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-06-18 07:59:44,411][12883] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-06-18 07:59:46,815][12883] Updated weights for policy 0, policy_version 91033 (0.0038) [2024-06-18 07:59:46,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1491484672. Throughput: 0: 42708.0. Samples: 1491572500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:59:46,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 07:59:50,501][12883] Updated weights for policy 0, policy_version 91043 (0.0040) [2024-06-18 07:59:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1491714048. Throughput: 0: 42700.8. Samples: 1491832300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:59:51,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 07:59:54,462][12883] Updated weights for policy 0, policy_version 91053 (0.0034) [2024-06-18 07:59:56,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1491943424. Throughput: 0: 42660.8. Samples: 1492086500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 07:59:56,994][12645] Avg episode reward: [(0, '0.625')] [2024-06-18 07:59:58,017][12883] Updated weights for policy 0, policy_version 91063 (0.0035) [2024-06-18 08:00:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1492123648. Throughput: 0: 42858.8. Samples: 1492218680. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:00:01,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 08:00:02,055][12883] Updated weights for policy 0, policy_version 91073 (0.0028) [2024-06-18 08:00:05,627][12883] Updated weights for policy 0, policy_version 91083 (0.0041) [2024-06-18 08:00:06,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1492353024. Throughput: 0: 42806.3. Samples: 1492477600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:00:06,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 08:00:09,600][12883] Updated weights for policy 0, policy_version 91093 (0.0032) [2024-06-18 08:00:11,995][12645] Fps is (10 sec: 45870.8, 60 sec: 42870.8, 300 sec: 42820.4). Total num frames: 1492582400. Throughput: 0: 42725.5. Samples: 1492730840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:00:11,995][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 08:00:13,263][12883] Updated weights for policy 0, policy_version 91103 (0.0030) [2024-06-18 08:00:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 1492762624. Throughput: 0: 42730.3. Samples: 1492861760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:00:16,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 08:00:17,393][12883] Updated weights for policy 0, policy_version 91113 (0.0036) [2024-06-18 08:00:21,587][12883] Updated weights for policy 0, policy_version 91123 (0.0036) [2024-06-18 08:00:21,994][12645] Fps is (10 sec: 39325.8, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1492975616. Throughput: 0: 42776.1. Samples: 1493119960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:00:21,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 08:00:24,961][12883] Updated weights for policy 0, policy_version 91133 (0.0028) [2024-06-18 08:00:26,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1493221376. Throughput: 0: 42913.7. Samples: 1493372780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:00:26,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 08:00:29,313][12883] Updated weights for policy 0, policy_version 91143 (0.0044) [2024-06-18 08:00:31,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1493434368. Throughput: 0: 42971.0. Samples: 1493506200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:00:31,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 08:00:32,439][12883] Updated weights for policy 0, policy_version 91153 (0.0029) [2024-06-18 08:00:36,873][12883] Updated weights for policy 0, policy_version 91163 (0.0044) [2024-06-18 08:00:36,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42326.9, 300 sec: 42709.8). Total num frames: 1493614592. Throughput: 0: 42813.4. Samples: 1493758900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:00:36,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 08:00:37,061][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091164_1493630976.pth... [2024-06-18 08:00:37,093][12862] Signal inference workers to stop experience collection... (21750 times) [2024-06-18 08:00:37,093][12862] Signal inference workers to resume experience collection... 
(21750 times) [2024-06-18 08:00:37,116][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090538_1483374592.pth [2024-06-18 08:00:37,118][12883] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-06-18 08:00:37,118][12883] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-06-18 08:00:39,927][12883] Updated weights for policy 0, policy_version 91173 (0.0033) [2024-06-18 08:00:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1493843968. Throughput: 0: 42979.3. Samples: 1494020560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:00:41,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 08:00:44,373][12883] Updated weights for policy 0, policy_version 91183 (0.0044) [2024-06-18 08:00:46,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1494089728. Throughput: 0: 42990.5. Samples: 1494153260. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:00:46,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 08:00:47,550][12883] Updated weights for policy 0, policy_version 91193 (0.0035) [2024-06-18 08:00:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1494253568. Throughput: 0: 42857.8. Samples: 1494406200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:00:51,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 08:00:52,039][12883] Updated weights for policy 0, policy_version 91203 (0.0031) [2024-06-18 08:00:54,969][12883] Updated weights for policy 0, policy_version 91213 (0.0032) [2024-06-18 08:00:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1494499328. Throughput: 0: 42935.1. Samples: 1494662880. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:00:56,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 08:00:59,590][12883] Updated weights for policy 0, policy_version 91223 (0.0034) [2024-06-18 08:01:01,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1494728704. Throughput: 0: 43027.0. Samples: 1494797980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:01:01,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 08:01:02,658][12883] Updated weights for policy 0, policy_version 91233 (0.0022) [2024-06-18 08:01:06,994][12645] Fps is (10 sec: 40958.0, 60 sec: 42598.0, 300 sec: 42709.4). Total num frames: 1494908928. Throughput: 0: 42863.4. Samples: 1495048840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:01:06,994][12645] Avg episode reward: [(0, '0.180')] [2024-06-18 08:01:07,091][12883] Updated weights for policy 0, policy_version 91243 (0.0022) [2024-06-18 08:01:10,504][12883] Updated weights for policy 0, policy_version 91253 (0.0042) [2024-06-18 08:01:11,996][12645] Fps is (10 sec: 39312.8, 60 sec: 42324.4, 300 sec: 42599.3). Total num frames: 1495121920. Throughput: 0: 42869.5. Samples: 1495302000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:01:11,997][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 08:01:14,722][12883] Updated weights for policy 0, policy_version 91263 (0.0025) [2024-06-18 08:01:16,996][12645] Fps is (10 sec: 45867.4, 60 sec: 43416.0, 300 sec: 42875.8). Total num frames: 1495367680. Throughput: 0: 42849.5. Samples: 1495434520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:01:16,996][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 08:01:18,188][12883] Updated weights for policy 0, policy_version 91273 (0.0033) [2024-06-18 08:01:21,996][12645] Fps is (10 sec: 44236.9, 60 sec: 43142.8, 300 sec: 42709.1). Total num frames: 1495564288. Throughput: 0: 42778.3. Samples: 1495684020. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 23.0) [2024-06-18 08:01:21,996][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 08:01:22,267][12883] Updated weights for policy 0, policy_version 91283 (0.0036) [2024-06-18 08:01:25,365][12862] Signal inference workers to stop experience collection... (21800 times) [2024-06-18 08:01:25,414][12883] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-06-18 08:01:25,476][12862] Signal inference workers to resume experience collection... (21800 times) [2024-06-18 08:01:25,476][12883] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-06-18 08:01:25,606][12883] Updated weights for policy 0, policy_version 91293 (0.0030) [2024-06-18 08:01:26,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1495777280. Throughput: 0: 42725.8. Samples: 1495943220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:01:26,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 08:01:29,919][12883] Updated weights for policy 0, policy_version 91303 (0.0027) [2024-06-18 08:01:31,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1496023040. Throughput: 0: 42755.2. Samples: 1496077240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:01:31,994][12645] Avg episode reward: [(0, '0.234')] [2024-06-18 08:01:33,655][12883] Updated weights for policy 0, policy_version 91313 (0.0043) [2024-06-18 08:01:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 1496203264. Throughput: 0: 42680.8. Samples: 1496326840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:01:36,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 08:01:37,709][12883] Updated weights for policy 0, policy_version 91323 (0.0051) [2024-06-18 08:01:41,197][12883] Updated weights for policy 0, policy_version 91333 (0.0042) [2024-06-18 08:01:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1496432640. Throughput: 0: 42500.8. Samples: 1496575420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:01:41,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 08:01:45,587][12883] Updated weights for policy 0, policy_version 91343 (0.0031) [2024-06-18 08:01:46,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1496645632. Throughput: 0: 42497.9. Samples: 1496710380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:01:46,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 08:01:48,720][12883] Updated weights for policy 0, policy_version 91353 (0.0031) [2024-06-18 08:01:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1496825856. Throughput: 0: 42543.9. Samples: 1496963300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:01:51,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 08:01:53,073][12883] Updated weights for policy 0, policy_version 91363 (0.0042) [2024-06-18 08:01:56,284][12883] Updated weights for policy 0, policy_version 91373 (0.0035) [2024-06-18 08:01:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1497055232. Throughput: 0: 42710.6. Samples: 1497223880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:01:56,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 08:02:00,668][12883] Updated weights for policy 0, policy_version 91383 (0.0032) [2024-06-18 08:02:01,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1497284608. Throughput: 0: 42736.8. Samples: 1497357580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:02:01,994][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 08:02:03,986][12883] Updated weights for policy 0, policy_version 91393 (0.0032) [2024-06-18 08:02:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.8, 300 sec: 42709.5). Total num frames: 1497481216. Throughput: 0: 42701.2. Samples: 1497605480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:02:06,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 08:02:08,701][12883] Updated weights for policy 0, policy_version 91403 (0.0037) [2024-06-18 08:02:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42873.0, 300 sec: 42653.9). Total num frames: 1497694208. Throughput: 0: 42607.8. Samples: 1497860580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:02:11,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 08:02:12,215][12883] Updated weights for policy 0, policy_version 91413 (0.0045) [2024-06-18 08:02:16,225][12883] Updated weights for policy 0, policy_version 91423 (0.0030) [2024-06-18 08:02:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42600.0, 300 sec: 42765.4). Total num frames: 1497923584. Throughput: 0: 42442.3. Samples: 1497987140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 08:02:16,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 08:02:19,914][12883] Updated weights for policy 0, policy_version 91433 (0.0024) [2024-06-18 08:02:21,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1498136576. Throughput: 0: 42517.9. Samples: 1498240140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:02:21,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 08:02:23,827][12883] Updated weights for policy 0, policy_version 91443 (0.0043) [2024-06-18 08:02:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 1498333184. Throughput: 0: 42711.1. Samples: 1498497420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:02:26,995][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 08:02:27,363][12883] Updated weights for policy 0, policy_version 91453 (0.0033) [2024-06-18 08:02:31,322][12883] Updated weights for policy 0, policy_version 91463 (0.0039) [2024-06-18 08:02:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1498562560. Throughput: 0: 42634.6. Samples: 1498628940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:02:31,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 08:02:34,830][12883] Updated weights for policy 0, policy_version 91473 (0.0033) [2024-06-18 08:02:36,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1498775552. Throughput: 0: 42846.4. Samples: 1498891380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:02:36,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 08:02:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091478_1498775552.pth... [2024-06-18 08:02:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000090853_1488535552.pth [2024-06-18 08:02:38,903][12883] Updated weights for policy 0, policy_version 91483 (0.0031) [2024-06-18 08:02:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 1498972160. Throughput: 0: 42697.0. Samples: 1499145240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:02:41,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 08:02:42,446][12883] Updated weights for policy 0, policy_version 91493 (0.0028) [2024-06-18 08:02:46,019][12862] Signal inference workers to stop experience collection... (21850 times) [2024-06-18 08:02:46,024][12862] Signal inference workers to resume experience collection... (21850 times) [2024-06-18 08:02:46,053][12883] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-06-18 08:02:46,053][12883] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-06-18 08:02:46,389][12883] Updated weights for policy 0, policy_version 91503 (0.0044) [2024-06-18 08:02:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1499201536. Throughput: 0: 42456.8. Samples: 1499268140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:02:46,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 08:02:50,020][12883] Updated weights for policy 0, policy_version 91513 (0.0030) [2024-06-18 08:02:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1499414528. Throughput: 0: 42760.5. Samples: 1499529700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:02:51,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 08:02:54,189][12883] Updated weights for policy 0, policy_version 91523 (0.0039) [2024-06-18 08:02:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1499627520. Throughput: 0: 42761.4. Samples: 1499784840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:02:56,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 08:02:57,934][12883] Updated weights for policy 0, policy_version 91533 (0.0042) [2024-06-18 08:03:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 1499807744. Throughput: 0: 42640.8. 
Samples: 1499905980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:03:01,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 08:03:02,245][12883] Updated weights for policy 0, policy_version 91543 (0.0035) [2024-06-18 08:03:05,538][12883] Updated weights for policy 0, policy_version 91553 (0.0050) [2024-06-18 08:03:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1500053504. Throughput: 0: 42685.6. Samples: 1500161000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:03:06,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 08:03:09,810][12883] Updated weights for policy 0, policy_version 91563 (0.0026) [2024-06-18 08:03:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1500250112. Throughput: 0: 42753.0. Samples: 1500421300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 08:03:11,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 08:03:13,187][12883] Updated weights for policy 0, policy_version 91573 (0.0042) [2024-06-18 08:03:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1500463104. Throughput: 0: 42698.7. Samples: 1500550380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:03:16,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 08:03:17,277][12883] Updated weights for policy 0, policy_version 91583 (0.0031) [2024-06-18 08:03:21,129][12883] Updated weights for policy 0, policy_version 91593 (0.0021) [2024-06-18 08:03:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 1500692480. Throughput: 0: 42633.6. Samples: 1500809900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:03:21,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 08:03:24,966][12883] Updated weights for policy 0, policy_version 91603 (0.0040) [2024-06-18 08:03:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1500905472. Throughput: 0: 42719.9. Samples: 1501067640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:03:26,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 08:03:28,697][12883] Updated weights for policy 0, policy_version 91613 (0.0040) [2024-06-18 08:03:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1501118464. Throughput: 0: 42801.4. Samples: 1501194200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:03:31,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 08:03:32,479][12883] Updated weights for policy 0, policy_version 91623 (0.0032) [2024-06-18 08:03:36,467][12883] Updated weights for policy 0, policy_version 91633 (0.0045) [2024-06-18 08:03:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1501331456. Throughput: 0: 42770.7. Samples: 1501454380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:03:36,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 08:03:39,848][12883] Updated weights for policy 0, policy_version 91643 (0.0036) [2024-06-18 08:03:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1501528064. Throughput: 0: 42780.1. Samples: 1501709940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:03:41,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 08:03:44,163][12883] Updated weights for policy 0, policy_version 91653 (0.0026) [2024-06-18 08:03:46,996][12645] Fps is (10 sec: 44228.4, 60 sec: 42870.1, 300 sec: 42764.7). Total num frames: 1501773824. Throughput: 0: 42984.9. Samples: 1501840380. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:03:46,996][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 08:03:47,328][12883] Updated weights for policy 0, policy_version 91663 (0.0027) [2024-06-18 08:03:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1501954048. Throughput: 0: 42955.1. Samples: 1502093980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:03:51,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 08:03:52,034][12883] Updated weights for policy 0, policy_version 91673 (0.0033) [2024-06-18 08:03:54,926][12883] Updated weights for policy 0, policy_version 91683 (0.0031) [2024-06-18 08:03:56,994][12645] Fps is (10 sec: 40967.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1502183424. Throughput: 0: 42924.8. Samples: 1502352920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:03:56,998][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 08:03:59,626][12883] Updated weights for policy 0, policy_version 91693 (0.0025) [2024-06-18 08:04:01,994][12645] Fps is (10 sec: 47514.2, 60 sec: 43690.8, 300 sec: 42820.6). Total num frames: 1502429184. Throughput: 0: 42997.3. Samples: 1502485260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:04:01,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 08:04:02,289][12883] Updated weights for policy 0, policy_version 91703 (0.0036) [2024-06-18 08:04:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1502593024. Throughput: 0: 43056.2. Samples: 1502747420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:04:06,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 08:04:07,159][12883] Updated weights for policy 0, policy_version 91713 (0.0024) [2024-06-18 08:04:07,513][12862] Signal inference workers to stop experience collection... 
(21900 times) [2024-06-18 08:04:07,513][12862] Signal inference workers to resume experience collection... (21900 times) [2024-06-18 08:04:07,556][12883] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-06-18 08:04:07,556][12883] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-06-18 08:04:09,820][12883] Updated weights for policy 0, policy_version 91723 (0.0040) [2024-06-18 08:04:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1502822400. Throughput: 0: 42880.1. Samples: 1502997240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:11,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 08:04:14,582][12883] Updated weights for policy 0, policy_version 91733 (0.0041) [2024-06-18 08:04:16,994][12645] Fps is (10 sec: 47512.6, 60 sec: 43417.4, 300 sec: 42820.5). Total num frames: 1503068160. Throughput: 0: 42918.9. Samples: 1503125560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:16,995][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 08:04:17,846][12883] Updated weights for policy 0, policy_version 91743 (0.0031) [2024-06-18 08:04:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 1503264768. Throughput: 0: 42997.4. Samples: 1503389260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:21,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 08:04:22,140][12883] Updated weights for policy 0, policy_version 91753 (0.0039) [2024-06-18 08:04:25,547][12883] Updated weights for policy 0, policy_version 91763 (0.0028) [2024-06-18 08:04:26,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1503477760. Throughput: 0: 42934.8. Samples: 1503642000. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:26,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 08:04:29,608][12883] Updated weights for policy 0, policy_version 91773 (0.0026) [2024-06-18 08:04:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1503674368. Throughput: 0: 42965.4. Samples: 1503773740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:31,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 08:04:32,987][12883] Updated weights for policy 0, policy_version 91783 (0.0028) [2024-06-18 08:04:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1503903744. Throughput: 0: 43063.6. Samples: 1504031840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:36,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 08:04:37,078][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091792_1503920128.pth... [2024-06-18 08:04:37,128][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091164_1493630976.pth [2024-06-18 08:04:37,326][12883] Updated weights for policy 0, policy_version 91793 (0.0037) [2024-06-18 08:04:40,726][12883] Updated weights for policy 0, policy_version 91803 (0.0031) [2024-06-18 08:04:41,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1504133120. Throughput: 0: 42915.6. Samples: 1504284120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:41,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 08:04:44,836][12883] Updated weights for policy 0, policy_version 91813 (0.0026) [2024-06-18 08:04:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42599.7, 300 sec: 42765.0). Total num frames: 1504329728. Throughput: 0: 42872.3. Samples: 1504414520. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:46,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 08:04:49,008][12883] Updated weights for policy 0, policy_version 91823 (0.0041) [2024-06-18 08:04:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1504559104. Throughput: 0: 42828.3. Samples: 1504674700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:51,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 08:04:52,830][12883] Updated weights for policy 0, policy_version 91833 (0.0043) [2024-06-18 08:04:56,571][12883] Updated weights for policy 0, policy_version 91843 (0.0038) [2024-06-18 08:04:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 1504772096. Throughput: 0: 42853.9. Samples: 1504925660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:04:56,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 08:05:00,287][12883] Updated weights for policy 0, policy_version 91853 (0.0031) [2024-06-18 08:05:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1504985088. Throughput: 0: 42918.4. Samples: 1505056880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 08:05:01,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 08:05:04,347][12883] Updated weights for policy 0, policy_version 91863 (0.0031) [2024-06-18 08:05:06,997][12645] Fps is (10 sec: 40946.1, 60 sec: 43142.1, 300 sec: 42709.1). Total num frames: 1505181696. Throughput: 0: 42685.7. Samples: 1505310260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:06,998][12645] Avg episode reward: [(0, '0.616')] [2024-06-18 08:05:08,212][12883] Updated weights for policy 0, policy_version 91873 (0.0033) [2024-06-18 08:05:11,936][12883] Updated weights for policy 0, policy_version 91883 (0.0035) [2024-06-18 08:05:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.1). 
Total num frames: 1505411072. Throughput: 0: 42822.1. Samples: 1505569000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:11,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 08:05:15,781][12883] Updated weights for policy 0, policy_version 91893 (0.0040) [2024-06-18 08:05:16,994][12645] Fps is (10 sec: 44251.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1505624064. Throughput: 0: 42711.5. Samples: 1505695760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:16,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 08:05:19,597][12883] Updated weights for policy 0, policy_version 91903 (0.0023) [2024-06-18 08:05:22,000][12645] Fps is (10 sec: 40935.0, 60 sec: 42594.0, 300 sec: 42708.6). Total num frames: 1505820672. Throughput: 0: 42492.9. Samples: 1505944280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:22,000][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 08:05:23,382][12883] Updated weights for policy 0, policy_version 91913 (0.0028) [2024-06-18 08:05:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 1506033664. Throughput: 0: 42753.7. Samples: 1506208040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:26,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 08:05:27,235][12883] Updated weights for policy 0, policy_version 91923 (0.0038) [2024-06-18 08:05:30,472][12862] Signal inference workers to stop experience collection... (21950 times) [2024-06-18 08:05:30,472][12862] Signal inference workers to resume experience collection... 
(21950 times) [2024-06-18 08:05:30,520][12883] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-06-18 08:05:30,520][12883] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-06-18 08:05:31,113][12883] Updated weights for policy 0, policy_version 91933 (0.0038) [2024-06-18 08:05:31,994][12645] Fps is (10 sec: 44263.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1506263040. Throughput: 0: 42647.6. Samples: 1506333660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:31,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 08:05:34,810][12883] Updated weights for policy 0, policy_version 91943 (0.0036) [2024-06-18 08:05:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1506476032. Throughput: 0: 42407.5. Samples: 1506583040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:36,995][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 08:05:38,918][12883] Updated weights for policy 0, policy_version 91953 (0.0035) [2024-06-18 08:05:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1506689024. Throughput: 0: 42727.5. Samples: 1506848400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:41,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 08:05:42,400][12883] Updated weights for policy 0, policy_version 91963 (0.0036) [2024-06-18 08:05:46,437][12883] Updated weights for policy 0, policy_version 91973 (0.0029) [2024-06-18 08:05:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1506885632. Throughput: 0: 42550.0. Samples: 1506971640. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:46,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 08:05:50,070][12883] Updated weights for policy 0, policy_version 91983 (0.0041) [2024-06-18 08:05:51,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1507131392. Throughput: 0: 42648.3. Samples: 1507229300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:51,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 08:05:54,305][12883] Updated weights for policy 0, policy_version 91993 (0.0043) [2024-06-18 08:05:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1507328000. Throughput: 0: 42651.5. Samples: 1507488320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:05:56,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 08:05:57,671][12883] Updated weights for policy 0, policy_version 92003 (0.0037) [2024-06-18 08:06:01,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.3, 300 sec: 42765.1). Total num frames: 1507524608. Throughput: 0: 42496.5. Samples: 1507608100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 08:06:01,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 08:06:02,143][12883] Updated weights for policy 0, policy_version 92013 (0.0033) [2024-06-18 08:06:05,600][12883] Updated weights for policy 0, policy_version 92023 (0.0038) [2024-06-18 08:06:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42873.7, 300 sec: 42820.9). Total num frames: 1507753984. Throughput: 0: 42624.3. Samples: 1507862120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:06,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 08:06:09,810][12883] Updated weights for policy 0, policy_version 92033 (0.0036) [2024-06-18 08:06:11,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 1507966976. Throughput: 0: 42341.7. Samples: 1508113420. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:11,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 08:06:13,456][12883] Updated weights for policy 0, policy_version 92043 (0.0037) [2024-06-18 08:06:16,994][12645] Fps is (10 sec: 39322.5, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 1508147200. Throughput: 0: 42419.2. Samples: 1508242520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:16,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 08:06:17,675][12883] Updated weights for policy 0, policy_version 92053 (0.0030) [2024-06-18 08:06:21,252][12883] Updated weights for policy 0, policy_version 92063 (0.0032) [2024-06-18 08:06:21,996][12645] Fps is (10 sec: 42589.9, 60 sec: 42874.3, 300 sec: 42764.7). Total num frames: 1508392960. Throughput: 0: 42666.5. Samples: 1508503120. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:21,996][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 08:06:25,131][12883] Updated weights for policy 0, policy_version 92073 (0.0029) [2024-06-18 08:06:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1508605952. Throughput: 0: 42376.5. Samples: 1508755340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:26,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 08:06:28,738][12883] Updated weights for policy 0, policy_version 92083 (0.0027) [2024-06-18 08:06:31,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1508802560. Throughput: 0: 42581.2. Samples: 1508887780. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:31,994][12645] Avg episode reward: [(0, '0.580')] [2024-06-18 08:06:32,597][12883] Updated weights for policy 0, policy_version 92093 (0.0037) [2024-06-18 08:06:36,281][12883] Updated weights for policy 0, policy_version 92103 (0.0041) [2024-06-18 08:06:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1509031936. Throughput: 0: 42574.8. Samples: 1509145160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:36,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 08:06:37,024][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092105_1509048320.pth... [2024-06-18 08:06:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091478_1498775552.pth [2024-06-18 08:06:40,358][12883] Updated weights for policy 0, policy_version 92113 (0.0024) [2024-06-18 08:06:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1509228544. Throughput: 0: 42449.8. Samples: 1509398560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:41,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 08:06:43,899][12883] Updated weights for policy 0, policy_version 92123 (0.0037) [2024-06-18 08:06:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1509441536. Throughput: 0: 42593.8. Samples: 1509524820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:46,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 08:06:48,061][12862] Signal inference workers to stop experience collection... (22000 times) [2024-06-18 08:06:48,062][12862] Signal inference workers to resume experience collection... 
(22000 times) [2024-06-18 08:06:48,080][12883] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-06-18 08:06:48,110][12883] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-06-18 08:06:48,212][12883] Updated weights for policy 0, policy_version 92133 (0.0030) [2024-06-18 08:06:51,790][12883] Updated weights for policy 0, policy_version 92143 (0.0053) [2024-06-18 08:06:51,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 1509670912. Throughput: 0: 42641.1. Samples: 1509780960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:51,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 08:06:55,765][12883] Updated weights for policy 0, policy_version 92153 (0.0027) [2024-06-18 08:06:56,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1509867520. Throughput: 0: 42706.7. Samples: 1510035220. Policy #0 lag: (min: 1.0, avg: 10.4, max: 24.0) [2024-06-18 08:06:56,995][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 08:06:59,330][12883] Updated weights for policy 0, policy_version 92163 (0.0037) [2024-06-18 08:07:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1510080512. Throughput: 0: 42666.5. Samples: 1510162520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:01,994][12645] Avg episode reward: [(0, '0.637')] [2024-06-18 08:07:03,512][12883] Updated weights for policy 0, policy_version 92173 (0.0050) [2024-06-18 08:07:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1510309888. Throughput: 0: 42607.3. Samples: 1510420360. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:06,994][12645] Avg episode reward: [(0, '0.145')] [2024-06-18 08:07:07,137][12883] Updated weights for policy 0, policy_version 92183 (0.0025) [2024-06-18 08:07:11,178][12883] Updated weights for policy 0, policy_version 92193 (0.0031) [2024-06-18 08:07:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1510522880. Throughput: 0: 42618.6. Samples: 1510673180. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:11,994][12645] Avg episode reward: [(0, '0.185')] [2024-06-18 08:07:14,573][12883] Updated weights for policy 0, policy_version 92203 (0.0032) [2024-06-18 08:07:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 1510735872. Throughput: 0: 42571.4. Samples: 1510803500. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:16,994][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 08:07:18,884][12883] Updated weights for policy 0, policy_version 92213 (0.0029) [2024-06-18 08:07:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1510948864. Throughput: 0: 42488.0. Samples: 1511057120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:21,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 08:07:22,498][12883] Updated weights for policy 0, policy_version 92223 (0.0044) [2024-06-18 08:07:26,416][12883] Updated weights for policy 0, policy_version 92233 (0.0031) [2024-06-18 08:07:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1511178240. Throughput: 0: 42654.1. Samples: 1511318000. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:26,994][12645] Avg episode reward: [(0, '0.182')] [2024-06-18 08:07:30,148][12883] Updated weights for policy 0, policy_version 92243 (0.0032) [2024-06-18 08:07:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1511358464. Throughput: 0: 42671.6. Samples: 1511445040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:31,994][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 08:07:34,141][12883] Updated weights for policy 0, policy_version 92253 (0.0033) [2024-06-18 08:07:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1511604224. Throughput: 0: 42769.2. Samples: 1511705580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:36,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 08:07:37,715][12883] Updated weights for policy 0, policy_version 92263 (0.0033) [2024-06-18 08:07:41,828][12883] Updated weights for policy 0, policy_version 92273 (0.0044) [2024-06-18 08:07:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1511800832. Throughput: 0: 42798.3. Samples: 1511961140. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:41,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 08:07:45,369][12883] Updated weights for policy 0, policy_version 92283 (0.0044) [2024-06-18 08:07:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1511997440. Throughput: 0: 42682.1. Samples: 1512083220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:46,995][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 08:07:49,430][12883] Updated weights for policy 0, policy_version 92293 (0.0043) [2024-06-18 08:07:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1512243200. Throughput: 0: 42752.4. Samples: 1512344220. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 08:07:52,003][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 08:07:52,925][12883] Updated weights for policy 0, policy_version 92303 (0.0032) [2024-06-18 08:07:56,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1512439808. Throughput: 0: 42949.4. Samples: 1512605900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:07:56,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 08:07:57,083][12883] Updated weights for policy 0, policy_version 92313 (0.0027) [2024-06-18 08:08:00,509][12883] Updated weights for policy 0, policy_version 92323 (0.0027) [2024-06-18 08:08:01,996][12645] Fps is (10 sec: 40951.5, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 1512652800. Throughput: 0: 42753.1. Samples: 1512727480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:01,997][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 08:08:04,528][12883] Updated weights for policy 0, policy_version 92333 (0.0029) [2024-06-18 08:08:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1512865792. Throughput: 0: 42801.0. Samples: 1512983160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:06,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 08:08:07,217][12862] Signal inference workers to stop experience collection... (22050 times) [2024-06-18 08:08:07,262][12883] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-06-18 08:08:07,267][12862] Signal inference workers to resume experience collection... (22050 times) [2024-06-18 08:08:07,283][12883] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-06-18 08:08:08,008][12883] Updated weights for policy 0, policy_version 92343 (0.0032) [2024-06-18 08:08:11,994][12645] Fps is (10 sec: 44246.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1513095168. Throughput: 0: 42753.8. 
Samples: 1513241920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:11,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 08:08:12,464][12883] Updated weights for policy 0, policy_version 92353 (0.0028) [2024-06-18 08:08:15,872][12883] Updated weights for policy 0, policy_version 92363 (0.0040) [2024-06-18 08:08:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1513308160. Throughput: 0: 42800.4. Samples: 1513371060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:16,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 08:08:19,877][12883] Updated weights for policy 0, policy_version 92373 (0.0030) [2024-06-18 08:08:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1513504768. Throughput: 0: 42735.6. Samples: 1513628680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:21,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 08:08:23,728][12883] Updated weights for policy 0, policy_version 92383 (0.0030) [2024-06-18 08:08:27,001][12645] Fps is (10 sec: 40930.1, 60 sec: 42320.3, 300 sec: 42708.4). Total num frames: 1513717760. Throughput: 0: 42877.6. Samples: 1513890940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:27,002][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 08:08:27,388][12883] Updated weights for policy 0, policy_version 92393 (0.0035) [2024-06-18 08:08:31,269][12883] Updated weights for policy 0, policy_version 92403 (0.0035) [2024-06-18 08:08:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1513963520. Throughput: 0: 43059.2. Samples: 1514020880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:31,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 08:08:34,962][12883] Updated weights for policy 0, policy_version 92413 (0.0031) [2024-06-18 08:08:36,994][12645] Fps is (10 sec: 42628.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1514143744. Throughput: 0: 42789.8. Samples: 1514269760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:36,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 08:08:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092416_1514143744.pth... [2024-06-18 08:08:37,121][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000091792_1503920128.pth [2024-06-18 08:08:39,110][12883] Updated weights for policy 0, policy_version 92423 (0.0036) [2024-06-18 08:08:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42654.2). Total num frames: 1514356736. Throughput: 0: 42747.6. Samples: 1514529540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:41,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 08:08:42,571][12883] Updated weights for policy 0, policy_version 92433 (0.0042) [2024-06-18 08:08:46,692][12883] Updated weights for policy 0, policy_version 92443 (0.0032) [2024-06-18 08:08:46,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 1514586112. Throughput: 0: 42846.6. Samples: 1514655480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:08:46,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 08:08:50,461][12883] Updated weights for policy 0, policy_version 92453 (0.0050) [2024-06-18 08:08:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 1514782720. Throughput: 0: 42728.8. Samples: 1514905960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:08:51,994][12645] Avg episode reward: [(0, '0.671')] [2024-06-18 08:08:54,306][12883] Updated weights for policy 0, policy_version 92463 (0.0028) [2024-06-18 08:08:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1514995712. Throughput: 0: 42820.1. Samples: 1515168820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:08:56,994][12645] Avg episode reward: [(0, '0.671')] [2024-06-18 08:08:57,984][12883] Updated weights for policy 0, policy_version 92473 (0.0029) [2024-06-18 08:09:01,906][12883] Updated weights for policy 0, policy_version 92483 (0.0027) [2024-06-18 08:09:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 1515241472. Throughput: 0: 42931.5. Samples: 1515302980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:09:01,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 08:09:05,758][12883] Updated weights for policy 0, policy_version 92493 (0.0028) [2024-06-18 08:09:06,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1515438080. Throughput: 0: 42736.1. Samples: 1515551900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:09:06,996][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 08:09:09,767][12883] Updated weights for policy 0, policy_version 92503 (0.0035) [2024-06-18 08:09:11,935][12862] Signal inference workers to stop experience collection... (22100 times) [2024-06-18 08:09:11,972][12883] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-06-18 08:09:11,983][12862] Signal inference workers to resume experience collection... (22100 times) [2024-06-18 08:09:11,991][12883] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-06-18 08:09:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1515651072. Throughput: 0: 42793.9. 
Samples: 1515816360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:09:11,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 08:09:13,182][12883] Updated weights for policy 0, policy_version 92513 (0.0030) [2024-06-18 08:09:16,994][12645] Fps is (10 sec: 42607.2, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 1515864064. Throughput: 0: 42689.2. Samples: 1515941900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:09:16,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 08:09:17,286][12883] Updated weights for policy 0, policy_version 92523 (0.0036) [2024-06-18 08:09:20,878][12883] Updated weights for policy 0, policy_version 92533 (0.0027) [2024-06-18 08:09:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1516093440. Throughput: 0: 42824.6. Samples: 1516196860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:09:21,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 08:09:24,918][12883] Updated weights for policy 0, policy_version 92543 (0.0029) [2024-06-18 08:09:26,994][12645] Fps is (10 sec: 44237.7, 60 sec: 43149.8, 300 sec: 42820.6). Total num frames: 1516306432. Throughput: 0: 42835.9. Samples: 1516457160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:09:26,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 08:09:28,674][12883] Updated weights for policy 0, policy_version 92553 (0.0034) [2024-06-18 08:09:31,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1516519424. Throughput: 0: 42800.0. Samples: 1516581580. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:09:31,996][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 08:09:32,449][12883] Updated weights for policy 0, policy_version 92563 (0.0032) [2024-06-18 08:09:36,586][12883] Updated weights for policy 0, policy_version 92573 (0.0029) [2024-06-18 08:09:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 1516732416. Throughput: 0: 43088.5. Samples: 1516844940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:09:36,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 08:09:40,179][12883] Updated weights for policy 0, policy_version 92583 (0.0043) [2024-06-18 08:09:41,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1516945408. Throughput: 0: 42820.9. Samples: 1517095760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 08:09:41,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 08:09:44,424][12883] Updated weights for policy 0, policy_version 92593 (0.0033) [2024-06-18 08:09:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1517158400. Throughput: 0: 42765.3. Samples: 1517227420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:09:46,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 08:09:47,795][12883] Updated weights for policy 0, policy_version 92603 (0.0049) [2024-06-18 08:09:51,970][12883] Updated weights for policy 0, policy_version 92613 (0.0027) [2024-06-18 08:09:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1517371392. Throughput: 0: 43031.0. Samples: 1517488200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:09:51,994][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 08:09:55,541][12883] Updated weights for policy 0, policy_version 92623 (0.0029) [2024-06-18 08:09:56,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43690.6, 300 sec: 42820.5). 
Total num frames: 1517617152. Throughput: 0: 42758.2. Samples: 1517740480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:09:56,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 08:09:59,532][12883] Updated weights for policy 0, policy_version 92633 (0.0039) [2024-06-18 08:10:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.5). Total num frames: 1517797376. Throughput: 0: 43038.4. Samples: 1517878620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:10:01,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 08:10:03,058][12883] Updated weights for policy 0, policy_version 92643 (0.0026) [2024-06-18 08:10:06,962][12883] Updated weights for policy 0, policy_version 92653 (0.0021) [2024-06-18 08:10:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43146.0, 300 sec: 42765.0). Total num frames: 1518026752. Throughput: 0: 43090.9. Samples: 1518135960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:10:06,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 08:10:10,627][12883] Updated weights for policy 0, policy_version 92663 (0.0034) [2024-06-18 08:10:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1518256128. Throughput: 0: 42917.7. Samples: 1518388460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:10:11,997][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 08:10:14,457][12883] Updated weights for policy 0, policy_version 92673 (0.0028) [2024-06-18 08:10:16,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42871.6, 300 sec: 42765.9). Total num frames: 1518436352. Throughput: 0: 43136.9. Samples: 1518522640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:10:16,994][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 08:10:18,157][12883] Updated weights for policy 0, policy_version 92683 (0.0033) [2024-06-18 08:10:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42765.0). 
Total num frames: 1518649344. Throughput: 0: 42992.9. Samples: 1518779620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:10:21,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 08:10:22,507][12883] Updated weights for policy 0, policy_version 92693 (0.0035) [2024-06-18 08:10:25,840][12883] Updated weights for policy 0, policy_version 92703 (0.0040) [2024-06-18 08:10:26,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1518895104. Throughput: 0: 42883.0. Samples: 1519025500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:10:26,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 08:10:30,308][12883] Updated weights for policy 0, policy_version 92713 (0.0047) [2024-06-18 08:10:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 1519058944. Throughput: 0: 43010.2. Samples: 1519162880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:10:31,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 08:10:33,543][12883] Updated weights for policy 0, policy_version 92723 (0.0036) [2024-06-18 08:10:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1519288320. Throughput: 0: 42752.5. Samples: 1519412060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:10:36,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 08:10:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092730_1519288320.pth... [2024-06-18 08:10:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092105_1509048320.pth [2024-06-18 08:10:38,127][12883] Updated weights for policy 0, policy_version 92733 (0.0036) [2024-06-18 08:10:41,174][12862] Signal inference workers to stop experience collection... (22150 times) [2024-06-18 08:10:41,175][12862] Signal inference workers to resume experience collection... 
(22150 times) [2024-06-18 08:10:41,218][12883] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-06-18 08:10:41,218][12883] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-06-18 08:10:41,308][12883] Updated weights for policy 0, policy_version 92743 (0.0042) [2024-06-18 08:10:41,994][12645] Fps is (10 sec: 49151.9, 60 sec: 43417.6, 300 sec: 42931.7). Total num frames: 1519550464. Throughput: 0: 42705.9. Samples: 1519662240. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:10:41,994][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 08:10:46,004][12883] Updated weights for policy 0, policy_version 92753 (0.0022) [2024-06-18 08:10:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 1519714304. Throughput: 0: 42673.7. Samples: 1519798940. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:10:46,995][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 08:10:48,864][12883] Updated weights for policy 0, policy_version 92763 (0.0033) [2024-06-18 08:10:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1519943680. Throughput: 0: 42666.7. Samples: 1520055960. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:10:51,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 08:10:53,656][12883] Updated weights for policy 0, policy_version 92773 (0.0027) [2024-06-18 08:10:56,895][12883] Updated weights for policy 0, policy_version 92783 (0.0038) [2024-06-18 08:10:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1520156672. Throughput: 0: 42656.0. Samples: 1520307980. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:10:56,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 08:11:01,504][12883] Updated weights for policy 0, policy_version 92793 (0.0034) [2024-06-18 08:11:01,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1520336896. Throughput: 0: 42333.3. Samples: 1520427640. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:11:01,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 08:11:04,631][12883] Updated weights for policy 0, policy_version 92803 (0.0040) [2024-06-18 08:11:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1520599040. Throughput: 0: 42225.3. Samples: 1520679760. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:11:06,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 08:11:09,023][12883] Updated weights for policy 0, policy_version 92813 (0.0030) [2024-06-18 08:11:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 1520779264. Throughput: 0: 42713.0. Samples: 1520947580. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:11:11,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 08:11:12,359][12883] Updated weights for policy 0, policy_version 92823 (0.0045) [2024-06-18 08:11:16,625][12883] Updated weights for policy 0, policy_version 92833 (0.0033) [2024-06-18 08:11:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42654.2). Total num frames: 1520975872. Throughput: 0: 42352.4. Samples: 1521068740. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:11:16,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 08:11:19,873][12883] Updated weights for policy 0, policy_version 92843 (0.0032) [2024-06-18 08:11:21,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1521254400. Throughput: 0: 42480.4. Samples: 1521323680. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:11:21,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 08:11:24,242][12883] Updated weights for policy 0, policy_version 92853 (0.0029) [2024-06-18 08:11:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1521418240. Throughput: 0: 42932.8. Samples: 1521594220. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:11:26,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 08:11:27,445][12883] Updated weights for policy 0, policy_version 92863 (0.0035) [2024-06-18 08:11:31,994][12645] Fps is (10 sec: 36045.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1521614848. Throughput: 0: 42519.7. Samples: 1521712320. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:11:31,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 08:11:32,194][12883] Updated weights for policy 0, policy_version 92873 (0.0028) [2024-06-18 08:11:34,929][12883] Updated weights for policy 0, policy_version 92883 (0.0035) [2024-06-18 08:11:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1521876992. Throughput: 0: 42538.2. Samples: 1521970180. Policy #0 lag: (min: 1.0, avg: 12.2, max: 21.0) [2024-06-18 08:11:36,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 08:11:39,878][12883] Updated weights for policy 0, policy_version 92893 (0.0038) [2024-06-18 08:11:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 1522057216. Throughput: 0: 42888.5. Samples: 1522237960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:11:41,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 08:11:42,694][12883] Updated weights for policy 0, policy_version 92903 (0.0029) [2024-06-18 08:11:46,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1522253824. Throughput: 0: 42807.5. Samples: 1522353980. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:11:46,994][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 08:11:47,448][12883] Updated weights for policy 0, policy_version 92913 (0.0029) [2024-06-18 08:11:50,336][12862] Signal inference workers to stop experience collection... (22200 times) [2024-06-18 08:11:50,389][12862] Signal inference workers to resume experience collection... (22200 times) [2024-06-18 08:11:50,391][12883] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-06-18 08:11:50,419][12883] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-06-18 08:11:50,529][12883] Updated weights for policy 0, policy_version 92923 (0.0043) [2024-06-18 08:11:51,994][12645] Fps is (10 sec: 47513.7, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 1522532352. Throughput: 0: 43027.6. Samples: 1522616000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:11:51,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 08:11:55,047][12883] Updated weights for policy 0, policy_version 92933 (0.0023) [2024-06-18 08:11:56,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1522712576. Throughput: 0: 42939.0. Samples: 1522879840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:11:56,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 08:11:58,151][12883] Updated weights for policy 0, policy_version 92943 (0.0025) [2024-06-18 08:12:01,994][12645] Fps is (10 sec: 36044.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1522892800. Throughput: 0: 42828.4. Samples: 1522996020. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:12:01,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 08:12:02,786][12883] Updated weights for policy 0, policy_version 92953 (0.0034) [2024-06-18 08:12:05,679][12883] Updated weights for policy 0, policy_version 92963 (0.0028) [2024-06-18 08:12:06,996][12645] Fps is (10 sec: 47503.4, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 1523187712. Throughput: 0: 43093.4. Samples: 1523262980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:12:06,997][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 08:12:10,543][12883] Updated weights for policy 0, policy_version 92973 (0.0041) [2024-06-18 08:12:11,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 1523351552. Throughput: 0: 42897.5. Samples: 1523524700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:12:11,996][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 08:12:13,295][12883] Updated weights for policy 0, policy_version 92983 (0.0040) [2024-06-18 08:12:16,994][12645] Fps is (10 sec: 36052.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1523548160. Throughput: 0: 42891.0. Samples: 1523642420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:12:16,994][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 08:12:18,065][12883] Updated weights for policy 0, policy_version 92993 (0.0033) [2024-06-18 08:12:20,804][12883] Updated weights for policy 0, policy_version 93003 (0.0038) [2024-06-18 08:12:21,996][12645] Fps is (10 sec: 49151.9, 60 sec: 43143.0, 300 sec: 42931.3). Total num frames: 1523843072. Throughput: 0: 43055.7. Samples: 1523907780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:12:21,997][12645] Avg episode reward: [(0, '0.115')] [2024-06-18 08:12:25,671][12883] Updated weights for policy 0, policy_version 93013 (0.0032) [2024-06-18 08:12:26,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42820.6). 
Total num frames: 1523990528. Throughput: 0: 42977.3. Samples: 1524171940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:12:26,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 08:12:28,445][12883] Updated weights for policy 0, policy_version 93023 (0.0028) [2024-06-18 08:12:31,996][12645] Fps is (10 sec: 36044.7, 60 sec: 43142.8, 300 sec: 42709.2). Total num frames: 1524203520. Throughput: 0: 42784.0. Samples: 1524279360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:12:31,997][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 08:12:33,554][12883] Updated weights for policy 0, policy_version 93033 (0.0051) [2024-06-18 08:12:36,054][12883] Updated weights for policy 0, policy_version 93043 (0.0030) [2024-06-18 08:12:36,994][12645] Fps is (10 sec: 47513.8, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 1524465664. Throughput: 0: 42893.8. Samples: 1524546220. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:12:36,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 08:12:37,066][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093047_1524482048.pth... [2024-06-18 08:12:37,134][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092416_1514143744.pth [2024-06-18 08:12:41,096][12883] Updated weights for policy 0, policy_version 93053 (0.0034) [2024-06-18 08:12:41,996][12645] Fps is (10 sec: 40960.3, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 1524613120. Throughput: 0: 42875.8. Samples: 1524809340. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:12:41,996][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 08:12:43,811][12883] Updated weights for policy 0, policy_version 93063 (0.0033) [2024-06-18 08:12:46,994][12645] Fps is (10 sec: 39320.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1524858880. Throughput: 0: 42872.8. Samples: 1524925300. 
Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:12:46,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 08:12:48,661][12883] Updated weights for policy 0, policy_version 93073 (0.0033) [2024-06-18 08:12:49,832][12862] Signal inference workers to stop experience collection... (22250 times) [2024-06-18 08:12:49,833][12862] Signal inference workers to resume experience collection... (22250 times) [2024-06-18 08:12:49,864][12883] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-06-18 08:12:49,864][12883] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-06-18 08:12:51,361][12883] Updated weights for policy 0, policy_version 93083 (0.0036) [2024-06-18 08:12:51,994][12645] Fps is (10 sec: 49162.1, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 1525104640. Throughput: 0: 42938.9. Samples: 1525195140. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:12:51,995][12645] Avg episode reward: [(0, '0.159')] [2024-06-18 08:12:56,249][12883] Updated weights for policy 0, policy_version 93093 (0.0031) [2024-06-18 08:12:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 1525252096. Throughput: 0: 42726.0. Samples: 1525447280. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:12:56,994][12645] Avg episode reward: [(0, '0.599')] [2024-06-18 08:12:59,316][12883] Updated weights for policy 0, policy_version 93103 (0.0031) [2024-06-18 08:13:01,994][12645] Fps is (10 sec: 39322.5, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 1525497856. Throughput: 0: 42706.8. Samples: 1525564220. 
Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:13:01,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 08:13:03,805][12883] Updated weights for policy 0, policy_version 93113 (0.0023) [2024-06-18 08:13:06,909][12883] Updated weights for policy 0, policy_version 93123 (0.0034) [2024-06-18 08:13:06,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42326.9, 300 sec: 42820.6). Total num frames: 1525727232. Throughput: 0: 42859.4. Samples: 1525836360. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:13:06,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 08:13:11,409][12883] Updated weights for policy 0, policy_version 93133 (0.0029) [2024-06-18 08:13:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1525891072. Throughput: 0: 42570.1. Samples: 1526087600. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:13:11,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 08:13:14,442][12883] Updated weights for policy 0, policy_version 93143 (0.0036) [2024-06-18 08:13:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1526153216. Throughput: 0: 42857.6. Samples: 1526207860. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:13:16,995][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 08:13:19,063][12883] Updated weights for policy 0, policy_version 93153 (0.0032) [2024-06-18 08:13:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 41780.8, 300 sec: 42821.6). Total num frames: 1526349824. Throughput: 0: 42897.2. Samples: 1526476600. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:13:21,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 08:13:22,290][12883] Updated weights for policy 0, policy_version 93163 (0.0041) [2024-06-18 08:13:26,677][12883] Updated weights for policy 0, policy_version 93173 (0.0027) [2024-06-18 08:13:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.3, 300 sec: 42653.9). 
Total num frames: 1526546432. Throughput: 0: 42743.4. Samples: 1526732700. Policy #0 lag: (min: 2.0, avg: 9.4, max: 23.0) [2024-06-18 08:13:26,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 08:13:29,788][12883] Updated weights for policy 0, policy_version 93183 (0.0041) [2024-06-18 08:13:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43419.2, 300 sec: 42931.6). Total num frames: 1526808576. Throughput: 0: 42866.3. Samples: 1526854280. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:13:31,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 08:13:34,173][12883] Updated weights for policy 0, policy_version 93193 (0.0044) [2024-06-18 08:13:36,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42323.7, 300 sec: 42875.8). Total num frames: 1527005184. Throughput: 0: 42750.9. Samples: 1527119020. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:13:36,997][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 08:13:37,411][12883] Updated weights for policy 0, policy_version 93203 (0.0031) [2024-06-18 08:13:41,654][12883] Updated weights for policy 0, policy_version 93213 (0.0042) [2024-06-18 08:13:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 1527201792. Throughput: 0: 42787.2. Samples: 1527372700. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:13:41,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 08:13:45,175][12883] Updated weights for policy 0, policy_version 93223 (0.0028) [2024-06-18 08:13:46,994][12645] Fps is (10 sec: 44246.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1527447552. Throughput: 0: 42951.4. Samples: 1527497040. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:13:46,994][12645] Avg episode reward: [(0, '0.675')] [2024-06-18 08:13:49,177][12883] Updated weights for policy 0, policy_version 93233 (0.0040) [2024-06-18 08:13:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42876.1). 
Total num frames: 1527644160. Throughput: 0: 42772.9. Samples: 1527761140. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:13:51,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 08:13:52,722][12883] Updated weights for policy 0, policy_version 93243 (0.0032) [2024-06-18 08:13:56,784][12862] Signal inference workers to stop experience collection... (22300 times) [2024-06-18 08:13:56,785][12862] Signal inference workers to resume experience collection... (22300 times) [2024-06-18 08:13:56,805][12883] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-06-18 08:13:56,805][12883] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-06-18 08:13:56,936][12883] Updated weights for policy 0, policy_version 93253 (0.0036) [2024-06-18 08:13:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1527857152. Throughput: 0: 42783.0. Samples: 1528012840. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:13:56,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 08:14:00,520][12883] Updated weights for policy 0, policy_version 93263 (0.0031) [2024-06-18 08:14:02,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42867.0, 300 sec: 42820.0). Total num frames: 1528070144. Throughput: 0: 42893.3. Samples: 1528138320. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:14:02,000][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 08:14:04,584][12883] Updated weights for policy 0, policy_version 93273 (0.0041) [2024-06-18 08:14:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1528283136. Throughput: 0: 42846.6. Samples: 1528404700. 
Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:14:06,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 08:14:08,210][12883] Updated weights for policy 0, policy_version 93283 (0.0032) [2024-06-18 08:14:11,994][12645] Fps is (10 sec: 42624.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1528496128. Throughput: 0: 42524.4. Samples: 1528646300. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:14:11,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 08:14:12,158][12883] Updated weights for policy 0, policy_version 93293 (0.0039) [2024-06-18 08:14:15,875][12883] Updated weights for policy 0, policy_version 93303 (0.0027) [2024-06-18 08:14:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1528725504. Throughput: 0: 42766.8. Samples: 1528778780. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:14:16,994][12645] Avg episode reward: [(0, '0.631')] [2024-06-18 08:14:19,738][12883] Updated weights for policy 0, policy_version 93313 (0.0039) [2024-06-18 08:14:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1528889344. Throughput: 0: 42536.3. Samples: 1529033060. Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:14:21,994][12645] Avg episode reward: [(0, '0.126')] [2024-06-18 08:14:23,590][12883] Updated weights for policy 0, policy_version 93323 (0.0025) [2024-06-18 08:14:26,994][12645] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 1529135104. Throughput: 0: 42514.6. Samples: 1529285860. 
Policy #0 lag: (min: 0.0, avg: 13.5, max: 24.0) [2024-06-18 08:14:26,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 08:14:27,453][12883] Updated weights for policy 0, policy_version 93333 (0.0026) [2024-06-18 08:14:31,158][12883] Updated weights for policy 0, policy_version 93343 (0.0035) [2024-06-18 08:14:31,994][12645] Fps is (10 sec: 47514.3, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1529364480. Throughput: 0: 42695.7. Samples: 1529418340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:14:31,994][12645] Avg episode reward: [(0, '0.169')] [2024-06-18 08:14:35,258][12883] Updated weights for policy 0, policy_version 93353 (0.0027) [2024-06-18 08:14:36,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42325.3, 300 sec: 42709.2). Total num frames: 1529544704. Throughput: 0: 42527.2. Samples: 1529674960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:14:36,997][12645] Avg episode reward: [(0, '0.205')] [2024-06-18 08:14:37,151][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093357_1529561088.pth... [2024-06-18 08:14:37,207][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000092730_1519288320.pth [2024-06-18 08:14:38,715][12883] Updated weights for policy 0, policy_version 93363 (0.0032) [2024-06-18 08:14:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1529774080. Throughput: 0: 42741.9. Samples: 1529936220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:14:41,994][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 08:14:43,000][12883] Updated weights for policy 0, policy_version 93373 (0.0029) [2024-06-18 08:14:46,260][12883] Updated weights for policy 0, policy_version 93383 (0.0038) [2024-06-18 08:14:46,994][12645] Fps is (10 sec: 47523.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1530019840. Throughput: 0: 42847.2. Samples: 1530066180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:14:46,994][12645] Avg episode reward: [(0, '0.296')] [2024-06-18 08:14:50,705][12883] Updated weights for policy 0, policy_version 93393 (0.0031) [2024-06-18 08:14:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1530183680. Throughput: 0: 42647.6. Samples: 1530323840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:14:51,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 08:14:53,768][12883] Updated weights for policy 0, policy_version 93403 (0.0036) [2024-06-18 08:14:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1530413056. Throughput: 0: 42994.1. Samples: 1530581040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:14:57,000][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 08:14:58,486][12883] Updated weights for policy 0, policy_version 93413 (0.0037) [2024-06-18 08:15:01,554][12883] Updated weights for policy 0, policy_version 93423 (0.0029) [2024-06-18 08:15:01,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 1530658816. Throughput: 0: 42930.2. Samples: 1530710640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:15:01,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 08:15:06,340][12883] Updated weights for policy 0, policy_version 93433 (0.0030) [2024-06-18 08:15:06,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1530822656. Throughput: 0: 42957.0. Samples: 1530966120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:15:06,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 08:15:07,744][12862] Signal inference workers to stop experience collection... (22350 times) [2024-06-18 08:15:07,744][12862] Signal inference workers to resume experience collection... 
(22350 times) [2024-06-18 08:15:07,763][12883] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-06-18 08:15:07,763][12883] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-06-18 08:15:09,093][12883] Updated weights for policy 0, policy_version 93443 (0.0034) [2024-06-18 08:15:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1531052032. Throughput: 0: 43072.5. Samples: 1531224120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:15:11,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 08:15:13,959][12883] Updated weights for policy 0, policy_version 93453 (0.0029) [2024-06-18 08:15:16,721][12883] Updated weights for policy 0, policy_version 93463 (0.0035) [2024-06-18 08:15:16,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1531297792. Throughput: 0: 43136.3. Samples: 1531359480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:15:16,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 08:15:21,831][12883] Updated weights for policy 0, policy_version 93473 (0.0032) [2024-06-18 08:15:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1531461632. Throughput: 0: 43032.9. Samples: 1531611340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 26.0) [2024-06-18 08:15:21,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 08:15:24,244][12883] Updated weights for policy 0, policy_version 93483 (0.0030) [2024-06-18 08:15:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1531691008. Throughput: 0: 42834.6. Samples: 1531863780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:15:26,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 08:15:29,617][12883] Updated weights for policy 0, policy_version 93493 (0.0033) [2024-06-18 08:15:31,867][12883] Updated weights for policy 0, policy_version 93503 (0.0029) [2024-06-18 08:15:31,994][12645] Fps is (10 sec: 49152.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1531953152. Throughput: 0: 42940.6. Samples: 1531998500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:15:31,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 08:15:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1532084224. Throughput: 0: 42723.0. Samples: 1532246380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:15:36,994][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 08:15:37,215][12883] Updated weights for policy 0, policy_version 93513 (0.0037) [2024-06-18 08:15:39,631][12883] Updated weights for policy 0, policy_version 93523 (0.0044) [2024-06-18 08:15:41,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1532329984. Throughput: 0: 42625.9. Samples: 1532499200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:15:41,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 08:15:44,771][12883] Updated weights for policy 0, policy_version 93533 (0.0028) [2024-06-18 08:15:46,994][12645] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1532575744. Throughput: 0: 42719.9. Samples: 1532633040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:15:46,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 08:15:47,276][12883] Updated weights for policy 0, policy_version 93543 (0.0039) [2024-06-18 08:15:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1532739584. Throughput: 0: 42548.7. Samples: 1532880820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:15:51,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 08:15:52,359][12883] Updated weights for policy 0, policy_version 93553 (0.0027) [2024-06-18 08:15:55,214][12883] Updated weights for policy 0, policy_version 93563 (0.0038) [2024-06-18 08:15:56,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1532985344. Throughput: 0: 42411.8. Samples: 1533132660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:15:56,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 08:15:59,997][12883] Updated weights for policy 0, policy_version 93573 (0.0037) [2024-06-18 08:16:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1533181952. Throughput: 0: 42359.5. Samples: 1533265660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:16:01,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 08:16:03,026][12883] Updated weights for policy 0, policy_version 93583 (0.0031) [2024-06-18 08:16:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1533394944. Throughput: 0: 42186.6. Samples: 1533509740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:16:06,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 08:16:08,063][12883] Updated weights for policy 0, policy_version 93593 (0.0038) [2024-06-18 08:16:10,434][12862] Signal inference workers to stop experience collection... (22400 times) [2024-06-18 08:16:10,465][12883] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-06-18 08:16:10,490][12862] Signal inference workers to resume experience collection... 
(22400 times) [2024-06-18 08:16:10,491][12883] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-06-18 08:16:10,811][12883] Updated weights for policy 0, policy_version 93603 (0.0028) [2024-06-18 08:16:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1533624320. Throughput: 0: 42256.9. Samples: 1533765340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:16:11,996][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 08:16:15,569][12883] Updated weights for policy 0, policy_version 93613 (0.0036) [2024-06-18 08:16:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1533804544. Throughput: 0: 42189.2. Samples: 1533897020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:16:16,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 08:16:18,556][12883] Updated weights for policy 0, policy_version 93623 (0.0038) [2024-06-18 08:16:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1534033920. Throughput: 0: 42316.9. Samples: 1534150640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:16:22,000][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 08:16:23,103][12883] Updated weights for policy 0, policy_version 93633 (0.0025) [2024-06-18 08:16:26,448][12883] Updated weights for policy 0, policy_version 93643 (0.0028) [2024-06-18 08:16:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1534246912. Throughput: 0: 42294.3. Samples: 1534402440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:16:26,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 08:16:30,676][12883] Updated weights for policy 0, policy_version 93653 (0.0027) [2024-06-18 08:16:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 1534459904. Throughput: 0: 42244.1. Samples: 1534534020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:16:31,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 08:16:34,318][12883] Updated weights for policy 0, policy_version 93663 (0.0040) [2024-06-18 08:16:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 1534689280. Throughput: 0: 42412.9. Samples: 1534789400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:16:36,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 08:16:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093670_1534689280.pth... [2024-06-18 08:16:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093047_1524482048.pth [2024-06-18 08:16:38,423][12883] Updated weights for policy 0, policy_version 93673 (0.0038) [2024-06-18 08:16:41,975][12883] Updated weights for policy 0, policy_version 93683 (0.0039) [2024-06-18 08:16:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1534902272. Throughput: 0: 42655.8. Samples: 1535052160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:16:41,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 08:16:46,050][12883] Updated weights for policy 0, policy_version 93693 (0.0041) [2024-06-18 08:16:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 1535082496. Throughput: 0: 42377.3. Samples: 1535172640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:16:46,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 08:16:49,838][12883] Updated weights for policy 0, policy_version 93703 (0.0030) [2024-06-18 08:16:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1535328256. Throughput: 0: 42614.7. Samples: 1535427400. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:16:51,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 08:16:53,617][12883] Updated weights for policy 0, policy_version 93713 (0.0039) [2024-06-18 08:16:56,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.5, 300 sec: 42820.6). Total num frames: 1535524864. Throughput: 0: 42813.4. Samples: 1535691940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:16:56,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 08:16:57,514][12883] Updated weights for policy 0, policy_version 93723 (0.0030) [2024-06-18 08:17:01,078][12883] Updated weights for policy 0, policy_version 93733 (0.0028) [2024-06-18 08:17:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 1535737856. Throughput: 0: 42754.8. Samples: 1535820980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:17:01,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 08:17:05,143][12883] Updated weights for policy 0, policy_version 93743 (0.0037) [2024-06-18 08:17:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1535950848. Throughput: 0: 42786.4. Samples: 1536076020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:17:06,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 08:17:08,626][12883] Updated weights for policy 0, policy_version 93753 (0.0025) [2024-06-18 08:17:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1536180224. Throughput: 0: 42844.9. Samples: 1536330460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:17:11,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 08:17:12,748][12883] Updated weights for policy 0, policy_version 93763 (0.0029) [2024-06-18 08:17:16,813][12883] Updated weights for policy 0, policy_version 93773 (0.0043) [2024-06-18 08:17:16,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42869.9, 300 sec: 42487.3). 
Total num frames: 1536376832. Throughput: 0: 42775.2. Samples: 1536459000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 08:17:16,996][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 08:17:20,287][12883] Updated weights for policy 0, policy_version 93783 (0.0044) [2024-06-18 08:17:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1536589824. Throughput: 0: 42834.0. Samples: 1536716920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:17:21,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 08:17:24,296][12883] Updated weights for policy 0, policy_version 93793 (0.0029) [2024-06-18 08:17:26,994][12645] Fps is (10 sec: 44246.2, 60 sec: 42871.3, 300 sec: 42765.3). Total num frames: 1536819200. Throughput: 0: 42709.6. Samples: 1536974100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:17:26,994][12645] Avg episode reward: [(0, '0.061')] [2024-06-18 08:17:27,873][12883] Updated weights for policy 0, policy_version 93803 (0.0028) [2024-06-18 08:17:31,750][12883] Updated weights for policy 0, policy_version 93813 (0.0031) [2024-06-18 08:17:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1537032192. Throughput: 0: 42869.9. Samples: 1537101780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:17:31,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 08:17:35,886][12883] Updated weights for policy 0, policy_version 93823 (0.0019) [2024-06-18 08:17:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.3). Total num frames: 1537228800. Throughput: 0: 42886.7. Samples: 1537357300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:17:36,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 08:17:37,615][12862] Signal inference workers to stop experience collection... (22450 times) [2024-06-18 08:17:37,665][12862] Signal inference workers to resume experience collection... 
(22450 times) [2024-06-18 08:17:37,666][12883] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-06-18 08:17:37,683][12883] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-06-18 08:17:39,120][12883] Updated weights for policy 0, policy_version 93833 (0.0032) [2024-06-18 08:17:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1537458176. Throughput: 0: 42796.4. Samples: 1537617780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:17:41,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 08:17:43,491][12883] Updated weights for policy 0, policy_version 93843 (0.0042) [2024-06-18 08:17:46,678][12883] Updated weights for policy 0, policy_version 93853 (0.0035) [2024-06-18 08:17:46,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43417.8, 300 sec: 42654.0). Total num frames: 1537687552. Throughput: 0: 42703.6. Samples: 1537742640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:17:46,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 08:17:51,222][12883] Updated weights for policy 0, policy_version 93863 (0.0032) [2024-06-18 08:17:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1537884160. Throughput: 0: 42732.3. Samples: 1537998980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:17:51,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 08:17:54,323][12883] Updated weights for policy 0, policy_version 93873 (0.0032) [2024-06-18 08:17:56,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1538097152. Throughput: 0: 42888.7. Samples: 1538260460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:17:56,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 08:17:58,650][12883] Updated weights for policy 0, policy_version 93883 (0.0040) [2024-06-18 08:18:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1538326528. Throughput: 0: 42889.7. Samples: 1538388940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:18:01,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 08:18:02,042][12883] Updated weights for policy 0, policy_version 93893 (0.0036) [2024-06-18 08:18:06,271][12883] Updated weights for policy 0, policy_version 93903 (0.0029) [2024-06-18 08:18:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1538539520. Throughput: 0: 42950.1. Samples: 1538649680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:18:06,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 08:18:10,271][12883] Updated weights for policy 0, policy_version 93913 (0.0032) [2024-06-18 08:18:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1538719744. Throughput: 0: 42986.4. Samples: 1538908480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:18:11,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 08:18:13,830][12883] Updated weights for policy 0, policy_version 93923 (0.0022) [2024-06-18 08:18:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 1538949120. Throughput: 0: 42934.5. Samples: 1539033840. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 08:18:16,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 08:18:17,968][12883] Updated weights for policy 0, policy_version 93933 (0.0035) [2024-06-18 08:18:21,928][12883] Updated weights for policy 0, policy_version 93943 (0.0036) [2024-06-18 08:18:21,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1539162112. Throughput: 0: 42798.1. Samples: 1539283220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:18:21,995][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 08:18:25,671][12883] Updated weights for policy 0, policy_version 93953 (0.0025) [2024-06-18 08:18:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1539358720. Throughput: 0: 42851.1. Samples: 1539546080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:18:26,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 08:18:29,549][12883] Updated weights for policy 0, policy_version 93963 (0.0026) [2024-06-18 08:18:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 1539604480. Throughput: 0: 42903.0. Samples: 1539673280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:18:31,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 08:18:33,326][12883] Updated weights for policy 0, policy_version 93973 (0.0035) [2024-06-18 08:18:36,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1539801088. Throughput: 0: 42744.3. Samples: 1539922480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:18:36,995][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 08:18:37,125][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093983_1539817472.pth... 
[2024-06-18 08:18:37,132][12883] Updated weights for policy 0, policy_version 93983 (0.0036) [2024-06-18 08:18:37,190][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093357_1529561088.pth [2024-06-18 08:18:41,046][12883] Updated weights for policy 0, policy_version 93993 (0.0035) [2024-06-18 08:18:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1539997696. Throughput: 0: 42673.5. Samples: 1540180760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:18:41,994][12645] Avg episode reward: [(0, '0.239')] [2024-06-18 08:18:44,802][12883] Updated weights for policy 0, policy_version 94003 (0.0035) [2024-06-18 08:18:46,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1540243456. Throughput: 0: 42685.8. Samples: 1540309800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:18:46,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 08:18:48,695][12883] Updated weights for policy 0, policy_version 94013 (0.0038) [2024-06-18 08:18:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1540440064. Throughput: 0: 42453.4. Samples: 1540560080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:18:52,000][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 08:18:52,433][12883] Updated weights for policy 0, policy_version 94023 (0.0036) [2024-06-18 08:18:56,045][12862] Signal inference workers to stop experience collection... (22500 times) [2024-06-18 08:18:56,045][12862] Signal inference workers to resume experience collection... 
(22500 times) [2024-06-18 08:18:56,061][12883] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-06-18 08:18:56,061][12883] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-06-18 08:18:56,197][12883] Updated weights for policy 0, policy_version 94033 (0.0036) [2024-06-18 08:18:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 1540653056. Throughput: 0: 42529.8. Samples: 1540822320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:18:56,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 08:18:59,877][12883] Updated weights for policy 0, policy_version 94043 (0.0029) [2024-06-18 08:19:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1540866048. Throughput: 0: 42574.8. Samples: 1540949700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:19:01,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 08:19:03,842][12883] Updated weights for policy 0, policy_version 94053 (0.0027) [2024-06-18 08:19:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1541095424. Throughput: 0: 42653.1. Samples: 1541202600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:19:06,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 08:19:07,496][12883] Updated weights for policy 0, policy_version 94063 (0.0028) [2024-06-18 08:19:11,715][12883] Updated weights for policy 0, policy_version 94073 (0.0030) [2024-06-18 08:19:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1541292032. Throughput: 0: 42630.3. Samples: 1541464440. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 08:19:11,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 08:19:15,220][12883] Updated weights for policy 0, policy_version 94083 (0.0048) [2024-06-18 08:19:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1541505024. Throughput: 0: 42611.1. Samples: 1541590780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:19:16,994][12645] Avg episode reward: [(0, '0.666')] [2024-06-18 08:19:19,829][12883] Updated weights for policy 0, policy_version 94093 (0.0036) [2024-06-18 08:19:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1541734400. Throughput: 0: 42707.3. Samples: 1541844300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:19:21,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 08:19:22,783][12883] Updated weights for policy 0, policy_version 94103 (0.0038) [2024-06-18 08:19:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1541931008. Throughput: 0: 42804.6. Samples: 1542106980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:19:26,995][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 08:19:27,219][12883] Updated weights for policy 0, policy_version 94113 (0.0034) [2024-06-18 08:19:30,352][12883] Updated weights for policy 0, policy_version 94123 (0.0036) [2024-06-18 08:19:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 1542160384. Throughput: 0: 42668.3. Samples: 1542229880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:19:31,994][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 08:19:34,841][12883] Updated weights for policy 0, policy_version 94133 (0.0036) [2024-06-18 08:19:37,000][12645] Fps is (10 sec: 44210.1, 60 sec: 42867.1, 300 sec: 42708.6). Total num frames: 1542373376. Throughput: 0: 42939.3. Samples: 1542492620. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:19:37,000][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 08:19:37,868][12883] Updated weights for policy 0, policy_version 94143 (0.0024) [2024-06-18 08:19:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 1542569984. Throughput: 0: 42761.2. Samples: 1542746580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:19:41,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 08:19:42,664][12883] Updated weights for policy 0, policy_version 94153 (0.0042) [2024-06-18 08:19:45,482][12883] Updated weights for policy 0, policy_version 94163 (0.0025) [2024-06-18 08:19:46,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1542799360. Throughput: 0: 42741.3. Samples: 1542873060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:19:46,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 08:19:50,227][12883] Updated weights for policy 0, policy_version 94173 (0.0030) [2024-06-18 08:19:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1543012352. Throughput: 0: 42978.5. Samples: 1543136640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:19:51,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 08:19:53,117][12883] Updated weights for policy 0, policy_version 94183 (0.0038) [2024-06-18 08:19:56,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 1543225344. Throughput: 0: 42743.6. Samples: 1543388000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:19:57,005][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 08:19:58,125][12883] Updated weights for policy 0, policy_version 94193 (0.0039) [2024-06-18 08:20:00,931][12883] Updated weights for policy 0, policy_version 94203 (0.0025) [2024-06-18 08:20:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1543454720. Throughput: 0: 42727.6. Samples: 1543513520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:20:01,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 08:20:05,755][12883] Updated weights for policy 0, policy_version 94213 (0.0026) [2024-06-18 08:20:06,994][12645] Fps is (10 sec: 44246.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1543667712. Throughput: 0: 43056.3. Samples: 1543781840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:20:06,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 08:20:08,379][12883] Updated weights for policy 0, policy_version 94223 (0.0042) [2024-06-18 08:20:11,703][12862] Signal inference workers to stop experience collection... (22550 times) [2024-06-18 08:20:11,703][12862] Signal inference workers to resume experience collection... (22550 times) [2024-06-18 08:20:11,726][12883] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-06-18 08:20:11,726][12883] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-06-18 08:20:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1543864320. Throughput: 0: 42929.9. Samples: 1544038820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:11,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 08:20:13,383][12883] Updated weights for policy 0, policy_version 94233 (0.0031) [2024-06-18 08:20:16,008][12883] Updated weights for policy 0, policy_version 94243 (0.0031) [2024-06-18 08:20:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1544093696. Throughput: 0: 42823.6. Samples: 1544156940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:16,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 08:20:20,978][12883] Updated weights for policy 0, policy_version 94253 (0.0040) [2024-06-18 08:20:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1544306688. Throughput: 0: 42721.5. Samples: 1544414820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:21,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 08:20:23,958][12883] Updated weights for policy 0, policy_version 94263 (0.0034) [2024-06-18 08:20:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1544503296. Throughput: 0: 42697.9. Samples: 1544667980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:26,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 08:20:28,794][12883] Updated weights for policy 0, policy_version 94273 (0.0030) [2024-06-18 08:20:31,530][12883] Updated weights for policy 0, policy_version 94283 (0.0037) [2024-06-18 08:20:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1544732672. Throughput: 0: 42663.2. Samples: 1544792900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:31,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 08:20:36,262][12883] Updated weights for policy 0, policy_version 94293 (0.0041) [2024-06-18 08:20:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 1544945664. Throughput: 0: 42720.9. Samples: 1545059080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:36,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 08:20:37,137][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094297_1544962048.pth... [2024-06-18 08:20:37,200][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093670_1534689280.pth [2024-06-18 08:20:39,052][12883] Updated weights for policy 0, policy_version 94303 (0.0036) [2024-06-18 08:20:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1545142272. Throughput: 0: 42813.8. Samples: 1545314520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:41,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 08:20:43,831][12883] Updated weights for policy 0, policy_version 94313 (0.0034) [2024-06-18 08:20:46,580][12883] Updated weights for policy 0, policy_version 94323 (0.0039) [2024-06-18 08:20:46,995][12645] Fps is (10 sec: 44229.5, 60 sec: 43143.3, 300 sec: 42875.9). Total num frames: 1545388032. Throughput: 0: 42880.5. Samples: 1545443220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:46,996][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 08:20:51,537][12883] Updated weights for policy 0, policy_version 94333 (0.0046) [2024-06-18 08:20:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1545568256. Throughput: 0: 42664.2. Samples: 1545701720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:51,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 08:20:54,584][12883] Updated weights for policy 0, policy_version 94343 (0.0036) [2024-06-18 08:20:56,994][12645] Fps is (10 sec: 39328.4, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1545781248. Throughput: 0: 42556.5. Samples: 1545953860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:20:56,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 08:20:59,197][12883] Updated weights for policy 0, policy_version 94353 (0.0039) [2024-06-18 08:21:01,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1546027008. Throughput: 0: 42758.3. Samples: 1546081060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:21:01,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 08:21:02,189][12883] Updated weights for policy 0, policy_version 94363 (0.0036) [2024-06-18 08:21:06,910][12883] Updated weights for policy 0, policy_version 94373 (0.0035) [2024-06-18 08:21:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1546207232. Throughput: 0: 42644.8. Samples: 1546333840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:21:06,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 08:21:09,761][12883] Updated weights for policy 0, policy_version 94383 (0.0042) [2024-06-18 08:21:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1546420224. Throughput: 0: 42684.9. Samples: 1546588800. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:11,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 08:21:14,846][12883] Updated weights for policy 0, policy_version 94393 (0.0038) [2024-06-18 08:21:16,994][12645] Fps is (10 sec: 47514.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1546682368. Throughput: 0: 42801.8. Samples: 1546718980. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:16,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 08:21:17,309][12883] Updated weights for policy 0, policy_version 94403 (0.0021) [2024-06-18 08:21:20,349][12862] Signal inference workers to stop experience collection... (22600 times) [2024-06-18 08:21:20,404][12883] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-06-18 08:21:20,463][12862] Signal inference workers to resume experience collection... (22600 times) [2024-06-18 08:21:20,464][12883] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-06-18 08:21:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 1546813440. Throughput: 0: 42452.9. Samples: 1546969460. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:22,007][12645] Avg episode reward: [(0, '0.773')] [2024-06-18 08:21:22,014][12862] Saving new best policy, reward=0.773! [2024-06-18 08:21:22,504][12883] Updated weights for policy 0, policy_version 94413 (0.0026) [2024-06-18 08:21:25,223][12883] Updated weights for policy 0, policy_version 94423 (0.0032) [2024-06-18 08:21:26,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1547059200. Throughput: 0: 42411.5. Samples: 1547223040. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:26,994][12645] Avg episode reward: [(0, '0.631')] [2024-06-18 08:21:29,998][12883] Updated weights for policy 0, policy_version 94433 (0.0035) [2024-06-18 08:21:31,994][12645] Fps is (10 sec: 49152.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1547304960. Throughput: 0: 42606.1. Samples: 1547360420. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:31,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 08:21:32,781][12883] Updated weights for policy 0, policy_version 94443 (0.0034) [2024-06-18 08:21:36,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1547468800. Throughput: 0: 42586.9. Samples: 1547618140. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:36,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 08:21:37,591][12883] Updated weights for policy 0, policy_version 94453 (0.0025) [2024-06-18 08:21:40,319][12883] Updated weights for policy 0, policy_version 94463 (0.0032) [2024-06-18 08:21:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 1547714560. Throughput: 0: 42457.7. Samples: 1547864460. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:41,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 08:21:45,303][12883] Updated weights for policy 0, policy_version 94473 (0.0036) [2024-06-18 08:21:46,994][12645] Fps is (10 sec: 49153.0, 60 sec: 42872.8, 300 sec: 42820.6). Total num frames: 1547960320. Throughput: 0: 42677.4. Samples: 1548001540. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:46,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 08:21:47,712][12883] Updated weights for policy 0, policy_version 94483 (0.0040) [2024-06-18 08:21:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 1548124160. Throughput: 0: 42760.0. Samples: 1548258040. 
Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:51,995][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 08:21:52,891][12883] Updated weights for policy 0, policy_version 94493 (0.0040) [2024-06-18 08:21:55,497][12883] Updated weights for policy 0, policy_version 94503 (0.0040) [2024-06-18 08:21:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1548369920. Throughput: 0: 42630.6. Samples: 1548507180. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:21:56,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 08:22:00,657][12883] Updated weights for policy 0, policy_version 94513 (0.0045) [2024-06-18 08:22:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1548566528. Throughput: 0: 42778.2. Samples: 1548644000. Policy #0 lag: (min: 2.0, avg: 10.1, max: 22.0) [2024-06-18 08:22:01,999][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 08:22:03,725][12883] Updated weights for policy 0, policy_version 94523 (0.0038) [2024-06-18 08:22:06,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1548746752. Throughput: 0: 42791.5. Samples: 1548895080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:07,008][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 08:22:08,278][12883] Updated weights for policy 0, policy_version 94533 (0.0032) [2024-06-18 08:22:11,357][12883] Updated weights for policy 0, policy_version 94543 (0.0035) [2024-06-18 08:22:11,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 1549025280. Throughput: 0: 42753.6. Samples: 1549146960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:11,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 08:22:16,061][12883] Updated weights for policy 0, policy_version 94553 (0.0028) [2024-06-18 08:22:16,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1549205504. Throughput: 0: 42784.0. Samples: 1549285700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:16,995][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 08:22:18,926][12883] Updated weights for policy 0, policy_version 94563 (0.0043) [2024-06-18 08:22:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1549402112. Throughput: 0: 42600.5. Samples: 1549535160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:21,995][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 08:22:23,633][12883] Updated weights for policy 0, policy_version 94573 (0.0035) [2024-06-18 08:22:26,403][12883] Updated weights for policy 0, policy_version 94583 (0.0054) [2024-06-18 08:22:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1549664256. Throughput: 0: 42644.4. Samples: 1549783460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:26,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 08:22:31,323][12883] Updated weights for policy 0, policy_version 94593 (0.0027) [2024-06-18 08:22:31,772][12862] Signal inference workers to stop experience collection... (22650 times) [2024-06-18 08:22:31,772][12862] Signal inference workers to resume experience collection... (22650 times) [2024-06-18 08:22:31,824][12883] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-06-18 08:22:31,824][12883] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-06-18 08:22:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1549844480. Throughput: 0: 42708.3. 
Samples: 1549923420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:31,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 08:22:34,293][12883] Updated weights for policy 0, policy_version 94603 (0.0027) [2024-06-18 08:22:36,998][12645] Fps is (10 sec: 39303.4, 60 sec: 43141.2, 300 sec: 42708.8). Total num frames: 1550057472. Throughput: 0: 42670.3. Samples: 1550178400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:36,999][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 08:22:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094608_1550057472.pth... [2024-06-18 08:22:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000093983_1539817472.pth [2024-06-18 08:22:38,782][12883] Updated weights for policy 0, policy_version 94613 (0.0042) [2024-06-18 08:22:41,913][12883] Updated weights for policy 0, policy_version 94623 (0.0043) [2024-06-18 08:22:42,000][12645] Fps is (10 sec: 45847.1, 60 sec: 43140.1, 300 sec: 42764.1). Total num frames: 1550303232. Throughput: 0: 42661.7. Samples: 1550427220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:42,001][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 08:22:46,423][12883] Updated weights for policy 0, policy_version 94633 (0.0033) [2024-06-18 08:22:46,994][12645] Fps is (10 sec: 42618.6, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1550483456. Throughput: 0: 42608.4. Samples: 1550561380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:46,994][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 08:22:49,579][12883] Updated weights for policy 0, policy_version 94643 (0.0036) [2024-06-18 08:22:51,994][12645] Fps is (10 sec: 37705.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1550680064. Throughput: 0: 42636.3. Samples: 1550813720. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:51,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 08:22:54,425][12883] Updated weights for policy 0, policy_version 94653 (0.0027) [2024-06-18 08:22:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1550942208. Throughput: 0: 42691.2. Samples: 1551068060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:22:56,994][12645] Avg episode reward: [(0, '0.119')] [2024-06-18 08:22:57,550][12883] Updated weights for policy 0, policy_version 94663 (0.0044) [2024-06-18 08:23:01,836][12883] Updated weights for policy 0, policy_version 94673 (0.0032) [2024-06-18 08:23:01,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1551122432. Throughput: 0: 42663.1. Samples: 1551205540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:23:01,994][12645] Avg episode reward: [(0, '0.109')] [2024-06-18 08:23:05,033][12883] Updated weights for policy 0, policy_version 94683 (0.0036) [2024-06-18 08:23:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1551335424. Throughput: 0: 42710.2. Samples: 1551457120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:06,994][12645] Avg episode reward: [(0, '0.167')] [2024-06-18 08:23:09,735][12883] Updated weights for policy 0, policy_version 94693 (0.0035) [2024-06-18 08:23:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1551564800. Throughput: 0: 42803.7. Samples: 1551709620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:11,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 08:23:12,598][12883] Updated weights for policy 0, policy_version 94703 (0.0042) [2024-06-18 08:23:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1551761408. Throughput: 0: 42710.7. Samples: 1551845400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:16,994][12645] Avg episode reward: [(0, '0.648')] [2024-06-18 08:23:17,352][12883] Updated weights for policy 0, policy_version 94713 (0.0031) [2024-06-18 08:23:20,246][12883] Updated weights for policy 0, policy_version 94723 (0.0031) [2024-06-18 08:23:21,998][12645] Fps is (10 sec: 42581.4, 60 sec: 43141.7, 300 sec: 42820.0). Total num frames: 1551990784. Throughput: 0: 42620.2. Samples: 1552096280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:21,998][12645] Avg episode reward: [(0, '0.692')] [2024-06-18 08:23:24,920][12883] Updated weights for policy 0, policy_version 94733 (0.0036) [2024-06-18 08:23:26,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 1552203776. Throughput: 0: 42883.0. Samples: 1552356700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:26,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 08:23:27,925][12883] Updated weights for policy 0, policy_version 94743 (0.0035) [2024-06-18 08:23:31,994][12645] Fps is (10 sec: 40976.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1552400384. Throughput: 0: 42793.3. Samples: 1552487080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:31,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 08:23:32,348][12883] Updated weights for policy 0, policy_version 94753 (0.0034) [2024-06-18 08:23:35,612][12883] Updated weights for policy 0, policy_version 94763 (0.0042) [2024-06-18 08:23:36,994][12645] Fps is (10 sec: 42599.9, 60 sec: 42874.9, 300 sec: 42820.6). Total num frames: 1552629760. Throughput: 0: 42842.1. Samples: 1552741600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:36,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 08:23:39,957][12883] Updated weights for policy 0, policy_version 94773 (0.0033) [2024-06-18 08:23:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42056.6, 300 sec: 42653.9). 
Total num frames: 1552826368. Throughput: 0: 42971.6. Samples: 1553001780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:41,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 08:23:43,441][12883] Updated weights for policy 0, policy_version 94783 (0.0031) [2024-06-18 08:23:46,151][12862] Signal inference workers to stop experience collection... (22700 times) [2024-06-18 08:23:46,152][12862] Signal inference workers to resume experience collection... (22700 times) [2024-06-18 08:23:46,194][12883] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-06-18 08:23:46,194][12883] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-06-18 08:23:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1553039360. Throughput: 0: 42757.9. Samples: 1553129640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:46,994][12645] Avg episode reward: [(0, '0.561')] [2024-06-18 08:23:47,681][12883] Updated weights for policy 0, policy_version 94793 (0.0023) [2024-06-18 08:23:51,119][12883] Updated weights for policy 0, policy_version 94803 (0.0030) [2024-06-18 08:23:51,994][12645] Fps is (10 sec: 45875.6, 60 sec: 43417.9, 300 sec: 42820.6). Total num frames: 1553285120. Throughput: 0: 42785.5. Samples: 1553382460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:51,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 08:23:55,667][12883] Updated weights for policy 0, policy_version 94813 (0.0023) [2024-06-18 08:23:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1553481728. Throughput: 0: 42808.9. Samples: 1553636020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 08:23:56,994][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 08:23:58,848][12883] Updated weights for policy 0, policy_version 94823 (0.0042) [2024-06-18 08:24:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1553678336. Throughput: 0: 42629.0. Samples: 1553763700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:01,994][12645] Avg episode reward: [(0, '0.077')] [2024-06-18 08:24:03,208][12883] Updated weights for policy 0, policy_version 94833 (0.0036) [2024-06-18 08:24:06,537][12883] Updated weights for policy 0, policy_version 94843 (0.0034) [2024-06-18 08:24:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1553924096. Throughput: 0: 42818.9. Samples: 1554022960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:06,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 08:24:10,711][12883] Updated weights for policy 0, policy_version 94853 (0.0033) [2024-06-18 08:24:11,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 1554137088. Throughput: 0: 42704.8. Samples: 1554278500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:11,997][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 08:24:13,930][12883] Updated weights for policy 0, policy_version 94863 (0.0040) [2024-06-18 08:24:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1554317312. Throughput: 0: 42758.7. Samples: 1554411220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:16,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 08:24:18,177][12883] Updated weights for policy 0, policy_version 94873 (0.0040) [2024-06-18 08:24:21,866][12883] Updated weights for policy 0, policy_version 94883 (0.0040) [2024-06-18 08:24:21,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42874.4, 300 sec: 42820.6). 
Total num frames: 1554563072. Throughput: 0: 42709.8. Samples: 1554663540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:21,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 08:24:25,658][12883] Updated weights for policy 0, policy_version 94893 (0.0029) [2024-06-18 08:24:26,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 1554776064. Throughput: 0: 42715.1. Samples: 1554923960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:26,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 08:24:29,331][12883] Updated weights for policy 0, policy_version 94903 (0.0029) [2024-06-18 08:24:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42710.4). Total num frames: 1554972672. Throughput: 0: 42730.7. Samples: 1555052520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:31,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 08:24:33,161][12883] Updated weights for policy 0, policy_version 94913 (0.0036) [2024-06-18 08:24:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1555202048. Throughput: 0: 42956.3. Samples: 1555315500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:36,994][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 08:24:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094922_1555202048.pth... [2024-06-18 08:24:37,103][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094297_1544962048.pth [2024-06-18 08:24:37,247][12883] Updated weights for policy 0, policy_version 94923 (0.0028) [2024-06-18 08:24:40,713][12883] Updated weights for policy 0, policy_version 94933 (0.0034) [2024-06-18 08:24:41,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1555415040. Throughput: 0: 43064.8. Samples: 1555573940. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:41,994][12645] Avg episode reward: [(0, '0.729')] [2024-06-18 08:24:44,694][12883] Updated weights for policy 0, policy_version 94943 (0.0031) [2024-06-18 08:24:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1555611648. Throughput: 0: 43039.1. Samples: 1555700460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:46,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 08:24:48,299][12883] Updated weights for policy 0, policy_version 94953 (0.0021) [2024-06-18 08:24:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 1555841024. Throughput: 0: 43003.2. Samples: 1555958100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:51,994][12645] Avg episode reward: [(0, '0.586')] [2024-06-18 08:24:52,586][12883] Updated weights for policy 0, policy_version 94963 (0.0028) [2024-06-18 08:24:56,613][12883] Updated weights for policy 0, policy_version 94973 (0.0028) [2024-06-18 08:24:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1556070400. Throughput: 0: 43051.0. Samples: 1556215700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 08:24:56,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 08:25:00,150][12883] Updated weights for policy 0, policy_version 94983 (0.0021) [2024-06-18 08:25:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1556250624. Throughput: 0: 42828.5. Samples: 1556338500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:02,000][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 08:25:04,035][12883] Updated weights for policy 0, policy_version 94993 (0.0027) [2024-06-18 08:25:04,619][12862] Signal inference workers to stop experience collection... 
(22750 times) [2024-06-18 08:25:04,619][12862] Signal inference workers to resume experience collection... (22750 times) [2024-06-18 08:25:04,663][12883] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-06-18 08:25:04,663][12883] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-06-18 08:25:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1556496384. Throughput: 0: 43078.2. Samples: 1556602060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:06,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 08:25:07,686][12883] Updated weights for policy 0, policy_version 95003 (0.0043) [2024-06-18 08:25:11,553][12883] Updated weights for policy 0, policy_version 95013 (0.0029) [2024-06-18 08:25:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 1556709376. Throughput: 0: 42948.0. Samples: 1556856620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:11,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 08:25:15,243][12883] Updated weights for policy 0, policy_version 95023 (0.0040) [2024-06-18 08:25:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1556889600. Throughput: 0: 42917.3. Samples: 1556983800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:16,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 08:25:19,395][12883] Updated weights for policy 0, policy_version 95033 (0.0039) [2024-06-18 08:25:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1557135360. Throughput: 0: 42698.7. Samples: 1557236940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:21,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 08:25:23,219][12883] Updated weights for policy 0, policy_version 95043 (0.0030) [2024-06-18 08:25:26,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 1557331968. Throughput: 0: 42742.3. Samples: 1557497440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:26,997][12645] Avg episode reward: [(0, '0.677')] [2024-06-18 08:25:27,080][12883] Updated weights for policy 0, policy_version 95053 (0.0030) [2024-06-18 08:25:30,680][12883] Updated weights for policy 0, policy_version 95063 (0.0040) [2024-06-18 08:25:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1557544960. Throughput: 0: 42710.2. Samples: 1557622420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:31,994][12645] Avg episode reward: [(0, '0.704')] [2024-06-18 08:25:34,435][12883] Updated weights for policy 0, policy_version 95073 (0.0046) [2024-06-18 08:25:36,994][12645] Fps is (10 sec: 45885.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1557790720. Throughput: 0: 42808.4. Samples: 1557884480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:36,994][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 08:25:38,829][12883] Updated weights for policy 0, policy_version 95083 (0.0040) [2024-06-18 08:25:41,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42709.7). Total num frames: 1557987328. Throughput: 0: 42907.0. Samples: 1558146520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:41,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 08:25:42,181][12883] Updated weights for policy 0, policy_version 95093 (0.0030) [2024-06-18 08:25:46,275][12883] Updated weights for policy 0, policy_version 95103 (0.0038) [2024-06-18 08:25:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1558200320. Throughput: 0: 43023.9. Samples: 1558274580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:46,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 08:25:49,632][12883] Updated weights for policy 0, policy_version 95113 (0.0026) [2024-06-18 08:25:51,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1558413312. Throughput: 0: 42924.9. Samples: 1558533680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 08:25:51,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 08:25:53,589][12883] Updated weights for policy 0, policy_version 95123 (0.0032) [2024-06-18 08:25:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1558626304. Throughput: 0: 43031.9. Samples: 1558793060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:25:56,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 08:25:57,319][12883] Updated weights for policy 0, policy_version 95133 (0.0039) [2024-06-18 08:26:01,152][12883] Updated weights for policy 0, policy_version 95143 (0.0041) [2024-06-18 08:26:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1558839296. Throughput: 0: 43018.6. Samples: 1558919640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:01,994][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 08:26:04,891][12883] Updated weights for policy 0, policy_version 95153 (0.0045) [2024-06-18 08:26:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42876.1). 
Total num frames: 1559068672. Throughput: 0: 43191.2. Samples: 1559180540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:06,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 08:26:08,864][12883] Updated weights for policy 0, policy_version 95163 (0.0034) [2024-06-18 08:26:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1559265280. Throughput: 0: 43074.7. Samples: 1559435700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:11,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 08:26:12,748][12883] Updated weights for policy 0, policy_version 95173 (0.0038) [2024-06-18 08:26:16,506][12883] Updated weights for policy 0, policy_version 95183 (0.0032) [2024-06-18 08:26:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42931.7). Total num frames: 1559478272. Throughput: 0: 43208.4. Samples: 1559566800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:16,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 08:26:20,374][12883] Updated weights for policy 0, policy_version 95193 (0.0030) [2024-06-18 08:26:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1559707648. Throughput: 0: 43124.6. Samples: 1559825080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:21,994][12645] Avg episode reward: [(0, '0.228')] [2024-06-18 08:26:24,310][12883] Updated weights for policy 0, policy_version 95203 (0.0051) [2024-06-18 08:26:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 1559920640. Throughput: 0: 42995.2. Samples: 1560081300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:26,995][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 08:26:28,081][12883] Updated weights for policy 0, policy_version 95213 (0.0032) [2024-06-18 08:26:31,812][12883] Updated weights for policy 0, policy_version 95223 (0.0027) [2024-06-18 08:26:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1560133632. Throughput: 0: 42908.9. Samples: 1560205480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:31,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 08:26:35,811][12883] Updated weights for policy 0, policy_version 95233 (0.0032) [2024-06-18 08:26:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1560346624. Throughput: 0: 42964.7. Samples: 1560467100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:36,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 08:26:37,136][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095237_1560363008.pth... [2024-06-18 08:26:37,188][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094608_1550057472.pth [2024-06-18 08:26:39,486][12883] Updated weights for policy 0, policy_version 95243 (0.0029) [2024-06-18 08:26:40,219][12862] Signal inference workers to stop experience collection... (22800 times) [2024-06-18 08:26:40,267][12883] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-06-18 08:26:40,275][12862] Signal inference workers to resume experience collection... (22800 times) [2024-06-18 08:26:40,284][12883] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-06-18 08:26:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 1560576000. Throughput: 0: 42838.8. Samples: 1560720800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:41,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 08:26:43,739][12883] Updated weights for policy 0, policy_version 95253 (0.0045) [2024-06-18 08:26:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1560772608. Throughput: 0: 42876.3. Samples: 1560849080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:46,995][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 08:26:47,135][12883] Updated weights for policy 0, policy_version 95263 (0.0038) [2024-06-18 08:26:51,293][12883] Updated weights for policy 0, policy_version 95273 (0.0038) [2024-06-18 08:26:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1560985600. Throughput: 0: 42759.5. Samples: 1561104720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 08:26:51,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 08:26:54,879][12883] Updated weights for policy 0, policy_version 95283 (0.0031) [2024-06-18 08:26:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1561198592. Throughput: 0: 42774.5. Samples: 1561360560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:26:56,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 08:26:59,135][12883] Updated weights for policy 0, policy_version 95293 (0.0037) [2024-06-18 08:27:01,996][12645] Fps is (10 sec: 44226.5, 60 sec: 43142.9, 300 sec: 42986.9). Total num frames: 1561427968. Throughput: 0: 42756.1. Samples: 1561490920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:01,997][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 08:27:02,437][12883] Updated weights for policy 0, policy_version 95303 (0.0037) [2024-06-18 08:27:06,645][12883] Updated weights for policy 0, policy_version 95313 (0.0039) [2024-06-18 08:27:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). 
Total num frames: 1561624576. Throughput: 0: 42806.5. Samples: 1561751380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:06,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 08:27:10,194][12883] Updated weights for policy 0, policy_version 95323 (0.0035) [2024-06-18 08:27:11,994][12645] Fps is (10 sec: 42608.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1561853952. Throughput: 0: 42726.0. Samples: 1562003960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:11,994][12645] Avg episode reward: [(0, '0.080')] [2024-06-18 08:27:14,127][12883] Updated weights for policy 0, policy_version 95333 (0.0030) [2024-06-18 08:27:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1562066944. Throughput: 0: 42849.8. Samples: 1562133720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:16,996][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 08:27:17,785][12883] Updated weights for policy 0, policy_version 95343 (0.0042) [2024-06-18 08:27:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1562247168. Throughput: 0: 42612.6. Samples: 1562384660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:21,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 08:27:22,109][12883] Updated weights for policy 0, policy_version 95353 (0.0029) [2024-06-18 08:27:25,542][12883] Updated weights for policy 0, policy_version 95363 (0.0034) [2024-06-18 08:27:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1562476544. Throughput: 0: 42658.2. Samples: 1562640420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:26,994][12645] Avg episode reward: [(0, '0.672')] [2024-06-18 08:27:29,680][12883] Updated weights for policy 0, policy_version 95373 (0.0035) [2024-06-18 08:27:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.7). 
Total num frames: 1562673152. Throughput: 0: 42701.1. Samples: 1562770620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:31,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 08:27:33,289][12883] Updated weights for policy 0, policy_version 95383 (0.0040) [2024-06-18 08:27:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 1562902528. Throughput: 0: 42566.0. Samples: 1563020200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:36,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 08:27:37,272][12883] Updated weights for policy 0, policy_version 95393 (0.0031) [2024-06-18 08:27:40,898][12883] Updated weights for policy 0, policy_version 95403 (0.0031) [2024-06-18 08:27:41,993][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1563115520. Throughput: 0: 42646.4. Samples: 1563279640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:41,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 08:27:45,037][12883] Updated weights for policy 0, policy_version 95413 (0.0043) [2024-06-18 08:27:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1563328512. Throughput: 0: 42720.4. Samples: 1563413240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 08:27:46,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 08:27:48,439][12883] Updated weights for policy 0, policy_version 95423 (0.0033) [2024-06-18 08:27:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1563541504. Throughput: 0: 42468.9. Samples: 1563662480. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:27:51,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 08:27:52,790][12883] Updated weights for policy 0, policy_version 95433 (0.0031) [2024-06-18 08:27:56,401][12883] Updated weights for policy 0, policy_version 95443 (0.0027) [2024-06-18 08:27:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1563754496. Throughput: 0: 42580.4. Samples: 1563920080. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:27:56,994][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 08:28:00,398][12883] Updated weights for policy 0, policy_version 95453 (0.0030) [2024-06-18 08:28:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42053.9, 300 sec: 42765.0). Total num frames: 1563951104. Throughput: 0: 42577.4. Samples: 1564049700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:01,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 08:28:03,874][12883] Updated weights for policy 0, policy_version 95463 (0.0043) [2024-06-18 08:28:06,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1564196864. Throughput: 0: 42640.7. Samples: 1564303500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:06,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 08:28:07,847][12883] Updated weights for policy 0, policy_version 95473 (0.0029) [2024-06-18 08:28:07,863][12862] Signal inference workers to stop experience collection... (22850 times) [2024-06-18 08:28:07,863][12862] Signal inference workers to resume experience collection... 
(22850 times) [2024-06-18 08:28:07,905][12883] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-06-18 08:28:07,906][12883] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-06-18 08:28:11,476][12883] Updated weights for policy 0, policy_version 95483 (0.0033) [2024-06-18 08:28:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 1564393472. Throughput: 0: 42781.2. Samples: 1564565580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:11,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 08:28:15,376][12883] Updated weights for policy 0, policy_version 95493 (0.0033) [2024-06-18 08:28:16,996][12645] Fps is (10 sec: 40951.5, 60 sec: 42323.8, 300 sec: 42765.3). Total num frames: 1564606464. Throughput: 0: 42778.3. Samples: 1564695740. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:16,997][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 08:28:19,163][12883] Updated weights for policy 0, policy_version 95503 (0.0043) [2024-06-18 08:28:21,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1564835840. Throughput: 0: 42873.5. Samples: 1564949500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:21,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 08:28:22,875][12883] Updated weights for policy 0, policy_version 95513 (0.0030) [2024-06-18 08:28:26,733][12883] Updated weights for policy 0, policy_version 95523 (0.0035) [2024-06-18 08:28:26,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1565048832. Throughput: 0: 42992.7. Samples: 1565214320. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:26,994][12645] Avg episode reward: [(0, '0.188')] [2024-06-18 08:28:30,464][12883] Updated weights for policy 0, policy_version 95533 (0.0037) [2024-06-18 08:28:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1565261824. Throughput: 0: 42808.0. Samples: 1565339600. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:31,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 08:28:34,280][12883] Updated weights for policy 0, policy_version 95543 (0.0037) [2024-06-18 08:28:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1565491200. Throughput: 0: 42969.8. Samples: 1565596120. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:36,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 08:28:37,117][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095551_1565507584.pth... [2024-06-18 08:28:37,170][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000094922_1555202048.pth [2024-06-18 08:28:38,146][12883] Updated weights for policy 0, policy_version 95553 (0.0025) [2024-06-18 08:28:41,903][12883] Updated weights for policy 0, policy_version 95563 (0.0035) [2024-06-18 08:28:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1565704192. Throughput: 0: 43018.1. Samples: 1565855900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:41,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 08:28:45,566][12883] Updated weights for policy 0, policy_version 95573 (0.0032) [2024-06-18 08:28:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1565900800. Throughput: 0: 42933.7. Samples: 1565981720. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 23.0) [2024-06-18 08:28:46,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 08:28:49,480][12883] Updated weights for policy 0, policy_version 95583 (0.0033) [2024-06-18 08:28:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1566130176. Throughput: 0: 43079.2. Samples: 1566242060. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:28:51,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 08:28:53,537][12883] Updated weights for policy 0, policy_version 95593 (0.0041) [2024-06-18 08:28:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1566343168. Throughput: 0: 42951.7. Samples: 1566498400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:28:56,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 08:28:57,093][12883] Updated weights for policy 0, policy_version 95603 (0.0042) [2024-06-18 08:29:01,612][12883] Updated weights for policy 0, policy_version 95613 (0.0027) [2024-06-18 08:29:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1566539776. Throughput: 0: 42873.3. Samples: 1566624940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:29:01,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 08:29:04,605][12883] Updated weights for policy 0, policy_version 95623 (0.0031) [2024-06-18 08:29:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1566769152. Throughput: 0: 43004.3. Samples: 1566884700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:29:06,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 08:29:09,229][12883] Updated weights for policy 0, policy_version 95633 (0.0028) [2024-06-18 08:29:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1566982144. Throughput: 0: 42870.7. Samples: 1567143500. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:29:11,996][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 08:29:12,218][12883] Updated weights for policy 0, policy_version 95643 (0.0028) [2024-06-18 08:29:16,830][12883] Updated weights for policy 0, policy_version 95653 (0.0037) [2024-06-18 08:29:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43146.1, 300 sec: 42820.5). Total num frames: 1567195136. Throughput: 0: 42961.2. Samples: 1567272860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:29:16,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 08:29:19,867][12883] Updated weights for policy 0, policy_version 95663 (0.0034) [2024-06-18 08:29:21,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 1567408128. Throughput: 0: 42745.9. Samples: 1567519780. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:29:21,996][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 08:29:24,333][12883] Updated weights for policy 0, policy_version 95673 (0.0042) [2024-06-18 08:29:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1567621120. Throughput: 0: 42913.8. Samples: 1567787020. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:29:26,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 08:29:27,644][12883] Updated weights for policy 0, policy_version 95683 (0.0050) [2024-06-18 08:29:31,983][12883] Updated weights for policy 0, policy_version 95693 (0.0032) [2024-06-18 08:29:31,996][12645] Fps is (10 sec: 42598.4, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 1567834112. Throughput: 0: 42873.5. Samples: 1567911120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:29:31,996][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 08:29:35,294][12883] Updated weights for policy 0, policy_version 95703 (0.0037) [2024-06-18 08:29:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.6). 
Total num frames: 1568047104. Throughput: 0: 42659.6. Samples: 1568161740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:29:36,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 08:29:37,190][12862] Signal inference workers to stop experience collection... (22900 times) [2024-06-18 08:29:37,238][12883] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-06-18 08:29:37,248][12862] Signal inference workers to resume experience collection... (22900 times) [2024-06-18 08:29:37,260][12883] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-06-18 08:29:39,566][12883] Updated weights for policy 0, policy_version 95713 (0.0030) [2024-06-18 08:29:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1568243712. Throughput: 0: 42854.7. Samples: 1568426860. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 08:29:41,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 08:29:43,001][12883] Updated weights for policy 0, policy_version 95723 (0.0028) [2024-06-18 08:29:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1568473088. Throughput: 0: 42760.2. Samples: 1568549160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:29:46,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 08:29:47,139][12883] Updated weights for policy 0, policy_version 95733 (0.0036) [2024-06-18 08:29:50,670][12883] Updated weights for policy 0, policy_version 95743 (0.0041) [2024-06-18 08:29:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1568702464. Throughput: 0: 42650.2. Samples: 1568803960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:29:51,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 08:29:54,918][12883] Updated weights for policy 0, policy_version 95753 (0.0037) [2024-06-18 08:29:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1568899072. Throughput: 0: 42702.2. Samples: 1569065100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:29:56,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 08:29:58,221][12883] Updated weights for policy 0, policy_version 95763 (0.0033) [2024-06-18 08:30:02,000][12645] Fps is (10 sec: 40934.6, 60 sec: 42866.9, 300 sec: 42764.1). Total num frames: 1569112064. Throughput: 0: 42591.5. Samples: 1569189740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:30:02,001][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 08:30:03,096][12883] Updated weights for policy 0, policy_version 95773 (0.0044) [2024-06-18 08:30:06,045][12883] Updated weights for policy 0, policy_version 95783 (0.0030) [2024-06-18 08:30:06,996][12645] Fps is (10 sec: 45864.9, 60 sec: 43143.0, 300 sec: 42875.8). Total num frames: 1569357824. Throughput: 0: 42860.5. Samples: 1569448500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:30:06,997][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 08:30:10,781][12883] Updated weights for policy 0, policy_version 95793 (0.0030) [2024-06-18 08:30:11,994][12645] Fps is (10 sec: 44264.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1569554432. Throughput: 0: 42733.0. Samples: 1569710000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:30:11,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 08:30:13,511][12883] Updated weights for policy 0, policy_version 95803 (0.0032) [2024-06-18 08:30:16,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1569767424. Throughput: 0: 42696.8. Samples: 1569832380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:30:16,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 08:30:18,319][12883] Updated weights for policy 0, policy_version 95813 (0.0033) [2024-06-18 08:30:21,015][12883] Updated weights for policy 0, policy_version 95823 (0.0035) [2024-06-18 08:30:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43146.1, 300 sec: 42932.0). Total num frames: 1569996800. Throughput: 0: 42861.3. Samples: 1570090500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:30:21,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 08:30:25,887][12883] Updated weights for policy 0, policy_version 95833 (0.0043) [2024-06-18 08:30:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1570177024. Throughput: 0: 42820.9. Samples: 1570353800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:30:26,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 08:30:28,873][12883] Updated weights for policy 0, policy_version 95843 (0.0026) [2024-06-18 08:30:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 1570422784. Throughput: 0: 42791.3. Samples: 1570474760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:30:31,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 08:30:33,432][12883] Updated weights for policy 0, policy_version 95853 (0.0037) [2024-06-18 08:30:36,464][12883] Updated weights for policy 0, policy_version 95863 (0.0044) [2024-06-18 08:30:37,000][12645] Fps is (10 sec: 45846.3, 60 sec: 43140.0, 300 sec: 42875.2). Total num frames: 1570635776. Throughput: 0: 42861.2. Samples: 1570732980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:30:37,001][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 08:30:37,091][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095865_1570652160.pth... 
[2024-06-18 08:30:37,153][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095237_1560363008.pth [2024-06-18 08:30:41,643][12883] Updated weights for policy 0, policy_version 95873 (0.0041) [2024-06-18 08:30:41,994][12645] Fps is (10 sec: 36044.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1570783232. Throughput: 0: 42934.7. Samples: 1570997160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:30:41,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 08:30:44,235][12883] Updated weights for policy 0, policy_version 95883 (0.0028) [2024-06-18 08:30:46,996][12645] Fps is (10 sec: 40976.5, 60 sec: 42870.0, 300 sec: 42820.2). Total num frames: 1571045376. Throughput: 0: 42602.5. Samples: 1571106680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:30:46,997][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 08:30:49,336][12883] Updated weights for policy 0, policy_version 95893 (0.0033) [2024-06-18 08:30:51,922][12883] Updated weights for policy 0, policy_version 95903 (0.0031) [2024-06-18 08:30:51,994][12645] Fps is (10 sec: 49151.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1571274752. Throughput: 0: 42622.6. Samples: 1571366420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:30:51,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 08:30:56,996][12645] Fps is (10 sec: 37683.1, 60 sec: 42050.7, 300 sec: 42653.6). Total num frames: 1571422208. Throughput: 0: 42645.0. Samples: 1571629120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:30:56,996][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 08:30:57,086][12883] Updated weights for policy 0, policy_version 95913 (0.0034) [2024-06-18 08:30:59,531][12883] Updated weights for policy 0, policy_version 95923 (0.0027) [2024-06-18 08:31:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 1571684352. Throughput: 0: 42511.1. Samples: 1571745380. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:31:01,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 08:31:04,657][12883] Updated weights for policy 0, policy_version 95933 (0.0032) [2024-06-18 08:31:06,321][12862] Signal inference workers to stop experience collection... (22950 times) [2024-06-18 08:31:06,321][12862] Signal inference workers to resume experience collection... (22950 times) [2024-06-18 08:31:06,332][12883] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-06-18 08:31:06,332][12883] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-06-18 08:31:06,994][12645] Fps is (10 sec: 49163.3, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 1571913728. Throughput: 0: 42755.2. Samples: 1572014480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:31:06,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 08:31:07,100][12883] Updated weights for policy 0, policy_version 95943 (0.0040) [2024-06-18 08:31:11,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1572077568. Throughput: 0: 42671.5. Samples: 1572274020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:31:11,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 08:31:12,247][12883] Updated weights for policy 0, policy_version 95953 (0.0031) [2024-06-18 08:31:15,040][12883] Updated weights for policy 0, policy_version 95963 (0.0041) [2024-06-18 08:31:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1572323328. Throughput: 0: 42570.6. Samples: 1572390440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:31:16,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 08:31:19,882][12883] Updated weights for policy 0, policy_version 95973 (0.0031) [2024-06-18 08:31:21,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1572552704. Throughput: 0: 42727.7. 
Samples: 1572655460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:31:22,005][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 08:31:22,665][12883] Updated weights for policy 0, policy_version 95983 (0.0032) [2024-06-18 08:31:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1572716544. Throughput: 0: 42491.5. Samples: 1572909280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:31:26,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 08:31:27,559][12883] Updated weights for policy 0, policy_version 95993 (0.0034) [2024-06-18 08:31:30,242][12883] Updated weights for policy 0, policy_version 96003 (0.0038) [2024-06-18 08:31:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1572962304. Throughput: 0: 42691.4. Samples: 1573027700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:31:31,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 08:31:35,117][12883] Updated weights for policy 0, policy_version 96013 (0.0037) [2024-06-18 08:31:36,994][12645] Fps is (10 sec: 47513.8, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 1573191680. Throughput: 0: 42878.2. Samples: 1573295940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 08:31:36,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 08:31:37,792][12883] Updated weights for policy 0, policy_version 96023 (0.0023) [2024-06-18 08:31:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1573371904. Throughput: 0: 42666.5. Samples: 1573549020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:31:41,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 08:31:42,556][12883] Updated weights for policy 0, policy_version 96033 (0.0029) [2024-06-18 08:31:45,481][12883] Updated weights for policy 0, policy_version 96043 (0.0028) [2024-06-18 08:31:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42873.0, 300 sec: 42820.5). Total num frames: 1573617664. Throughput: 0: 42895.0. Samples: 1573675660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:31:46,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 08:31:50,562][12883] Updated weights for policy 0, policy_version 96053 (0.0033) [2024-06-18 08:31:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1573814272. Throughput: 0: 42780.8. Samples: 1573939620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:31:51,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 08:31:53,162][12883] Updated weights for policy 0, policy_version 96063 (0.0029) [2024-06-18 08:31:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43419.2, 300 sec: 42709.8). Total num frames: 1574027264. Throughput: 0: 42551.2. Samples: 1574188820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:31:56,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 08:31:58,159][12883] Updated weights for policy 0, policy_version 96073 (0.0029) [2024-06-18 08:32:00,764][12883] Updated weights for policy 0, policy_version 96083 (0.0032) [2024-06-18 08:32:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1574256640. Throughput: 0: 42807.4. Samples: 1574316780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:32:01,994][12645] Avg episode reward: [(0, '0.680')] [2024-06-18 08:32:05,772][12883] Updated weights for policy 0, policy_version 96093 (0.0035) [2024-06-18 08:32:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1574469632. Throughput: 0: 42895.3. Samples: 1574585740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:32:06,994][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 08:32:08,254][12883] Updated weights for policy 0, policy_version 96103 (0.0033) [2024-06-18 08:32:10,180][12862] Signal inference workers to stop experience collection... (23000 times) [2024-06-18 08:32:10,180][12862] Signal inference workers to resume experience collection... (23000 times) [2024-06-18 08:32:10,196][12883] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-06-18 08:32:10,197][12883] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-06-18 08:32:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 1574682624. Throughput: 0: 42758.3. Samples: 1574833400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:32:11,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 08:32:13,241][12883] Updated weights for policy 0, policy_version 96113 (0.0037) [2024-06-18 08:32:15,835][12883] Updated weights for policy 0, policy_version 96123 (0.0034) [2024-06-18 08:32:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1574895616. Throughput: 0: 42949.4. Samples: 1574960420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:32:16,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 08:32:20,795][12883] Updated weights for policy 0, policy_version 96133 (0.0037) [2024-06-18 08:32:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1575092224. Throughput: 0: 42711.1. 
Samples: 1575217940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:32:21,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 08:32:23,789][12883] Updated weights for policy 0, policy_version 96143 (0.0029) [2024-06-18 08:32:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1575305216. Throughput: 0: 42731.1. Samples: 1575471920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:32:26,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 08:32:28,495][12883] Updated weights for policy 0, policy_version 96153 (0.0035) [2024-06-18 08:32:31,560][12883] Updated weights for policy 0, policy_version 96163 (0.0038) [2024-06-18 08:32:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1575550976. Throughput: 0: 42758.3. Samples: 1575599780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:32:31,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 08:32:36,042][12883] Updated weights for policy 0, policy_version 96173 (0.0044) [2024-06-18 08:32:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1575731200. Throughput: 0: 42626.8. Samples: 1575857820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 08:32:36,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 08:32:37,077][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096176_1575747584.pth... [2024-06-18 08:32:37,141][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095551_1565507584.pth [2024-06-18 08:32:39,293][12883] Updated weights for policy 0, policy_version 96183 (0.0028) [2024-06-18 08:32:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1575944192. Throughput: 0: 42588.9. Samples: 1576105320. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:32:41,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 08:32:43,743][12883] Updated weights for policy 0, policy_version 96193 (0.0032) [2024-06-18 08:32:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1576173568. Throughput: 0: 42686.0. Samples: 1576237640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:32:46,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 08:32:47,103][12883] Updated weights for policy 0, policy_version 96203 (0.0040) [2024-06-18 08:32:51,458][12883] Updated weights for policy 0, policy_version 96213 (0.0041) [2024-06-18 08:32:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1576370176. Throughput: 0: 42497.7. Samples: 1576498140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:32:51,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 08:32:54,826][12883] Updated weights for policy 0, policy_version 96223 (0.0035) [2024-06-18 08:32:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1576583168. Throughput: 0: 42623.5. Samples: 1576751460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:32:56,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 08:32:59,310][12883] Updated weights for policy 0, policy_version 96233 (0.0041) [2024-06-18 08:33:01,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1576828928. Throughput: 0: 42553.4. Samples: 1576875320. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:33:01,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 08:33:02,441][12883] Updated weights for policy 0, policy_version 96243 (0.0034) [2024-06-18 08:33:06,838][12883] Updated weights for policy 0, policy_version 96253 (0.0044) [2024-06-18 08:33:06,996][12645] Fps is (10 sec: 42588.2, 60 sec: 42323.6, 300 sec: 42764.7). Total num frames: 1577009152. Throughput: 0: 42588.0. Samples: 1577134500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:33:06,997][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 08:33:10,255][12883] Updated weights for policy 0, policy_version 96263 (0.0032) [2024-06-18 08:33:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 1577222144. Throughput: 0: 42672.6. Samples: 1577392180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:33:11,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 08:33:14,455][12883] Updated weights for policy 0, policy_version 96273 (0.0032) [2024-06-18 08:33:16,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1577467904. Throughput: 0: 42689.5. Samples: 1577520820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:33:16,995][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 08:33:17,881][12883] Updated weights for policy 0, policy_version 96283 (0.0042) [2024-06-18 08:33:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1577648128. Throughput: 0: 42713.4. Samples: 1577779920. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:33:21,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 08:33:22,051][12883] Updated weights for policy 0, policy_version 96293 (0.0031) [2024-06-18 08:33:25,695][12883] Updated weights for policy 0, policy_version 96303 (0.0031) [2024-06-18 08:33:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1577877504. Throughput: 0: 42765.7. Samples: 1578029780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:33:26,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 08:33:27,872][12862] Signal inference workers to stop experience collection... (23050 times) [2024-06-18 08:33:27,928][12883] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-06-18 08:33:27,928][12862] Signal inference workers to resume experience collection... (23050 times) [2024-06-18 08:33:27,940][12883] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-06-18 08:33:29,698][12883] Updated weights for policy 0, policy_version 96313 (0.0043) [2024-06-18 08:33:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1578106880. Throughput: 0: 42779.0. Samples: 1578162700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:33:31,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 08:33:33,395][12883] Updated weights for policy 0, policy_version 96323 (0.0034) [2024-06-18 08:33:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1578287104. Throughput: 0: 42735.0. Samples: 1578421220. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 08:33:36,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 08:33:37,425][12883] Updated weights for policy 0, policy_version 96333 (0.0031) [2024-06-18 08:33:41,122][12883] Updated weights for policy 0, policy_version 96343 (0.0034) [2024-06-18 08:33:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1578516480. Throughput: 0: 42704.0. Samples: 1578673140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:33:41,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 08:33:45,095][12883] Updated weights for policy 0, policy_version 96353 (0.0045) [2024-06-18 08:33:46,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 1578762240. Throughput: 0: 42824.8. Samples: 1578802440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:33:46,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 08:33:48,665][12883] Updated weights for policy 0, policy_version 96363 (0.0033) [2024-06-18 08:33:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1578926080. Throughput: 0: 42851.9. Samples: 1579062740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:33:51,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 08:33:52,722][12883] Updated weights for policy 0, policy_version 96373 (0.0026) [2024-06-18 08:33:56,446][12883] Updated weights for policy 0, policy_version 96383 (0.0039) [2024-06-18 08:33:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1579155456. Throughput: 0: 42640.8. Samples: 1579311020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:33:56,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 08:34:00,491][12883] Updated weights for policy 0, policy_version 96393 (0.0032) [2024-06-18 08:34:01,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42765.0). 
Total num frames: 1579384832. Throughput: 0: 42743.8. Samples: 1579444280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:34:01,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 08:34:04,026][12883] Updated weights for policy 0, policy_version 96403 (0.0038) [2024-06-18 08:34:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1579581440. Throughput: 0: 42703.5. Samples: 1579701580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:34:06,994][12645] Avg episode reward: [(0, '0.189')] [2024-06-18 08:34:08,097][12883] Updated weights for policy 0, policy_version 96413 (0.0038) [2024-06-18 08:34:11,649][12883] Updated weights for policy 0, policy_version 96423 (0.0041) [2024-06-18 08:34:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1579810816. Throughput: 0: 42716.5. Samples: 1579952020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:34:11,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 08:34:15,761][12883] Updated weights for policy 0, policy_version 96433 (0.0043) [2024-06-18 08:34:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42765.3). Total num frames: 1580023808. Throughput: 0: 42604.4. Samples: 1580079900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:34:16,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 08:34:19,334][12883] Updated weights for policy 0, policy_version 96443 (0.0042) [2024-06-18 08:34:21,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 1580204032. Throughput: 0: 42588.1. Samples: 1580337780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:34:21,996][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 08:34:23,327][12883] Updated weights for policy 0, policy_version 96453 (0.0028) [2024-06-18 08:34:26,768][12883] Updated weights for policy 0, policy_version 96463 (0.0032) [2024-06-18 08:34:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 1580449792. Throughput: 0: 42697.3. Samples: 1580594520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:34:26,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 08:34:30,874][12883] Updated weights for policy 0, policy_version 96473 (0.0031) [2024-06-18 08:34:31,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1580662784. Throughput: 0: 42804.8. Samples: 1580728660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:34:31,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 08:34:34,325][12883] Updated weights for policy 0, policy_version 96483 (0.0031) [2024-06-18 08:34:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1580843008. Throughput: 0: 42615.2. Samples: 1580980420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:34:36,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 08:34:37,097][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096488_1580859392.pth... [2024-06-18 08:34:37,163][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000095865_1570652160.pth [2024-06-18 08:34:38,916][12883] Updated weights for policy 0, policy_version 96493 (0.0029) [2024-06-18 08:34:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1581088768. Throughput: 0: 42676.7. Samples: 1581231480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:34:41,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 08:34:42,126][12883] Updated weights for policy 0, policy_version 96503 (0.0034) [2024-06-18 08:34:46,546][12883] Updated weights for policy 0, policy_version 96513 (0.0036) [2024-06-18 08:34:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1581301760. Throughput: 0: 42673.3. Samples: 1581364580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:34:46,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 08:34:49,691][12883] Updated weights for policy 0, policy_version 96523 (0.0036) [2024-06-18 08:34:50,872][12862] Signal inference workers to stop experience collection... (23100 times) [2024-06-18 08:34:50,872][12862] Signal inference workers to resume experience collection... (23100 times) [2024-06-18 08:34:50,910][12883] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-06-18 08:34:50,910][12883] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-06-18 08:34:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1581498368. Throughput: 0: 42646.2. Samples: 1581620660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:34:51,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 08:34:54,120][12883] Updated weights for policy 0, policy_version 96533 (0.0027) [2024-06-18 08:34:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.9). Total num frames: 1581727744. Throughput: 0: 42781.2. Samples: 1581877180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:34:56,994][12645] Avg episode reward: [(0, '0.676')] [2024-06-18 08:34:57,521][12883] Updated weights for policy 0, policy_version 96543 (0.0035) [2024-06-18 08:35:01,604][12883] Updated weights for policy 0, policy_version 96553 (0.0039) [2024-06-18 08:35:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1581940736. Throughput: 0: 42815.7. Samples: 1582006600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:35:01,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 08:35:05,494][12883] Updated weights for policy 0, policy_version 96563 (0.0040) [2024-06-18 08:35:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1582120960. Throughput: 0: 42535.4. Samples: 1582251780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:35:06,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 08:35:09,553][12883] Updated weights for policy 0, policy_version 96573 (0.0045) [2024-06-18 08:35:11,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1582350336. Throughput: 0: 42499.1. Samples: 1582506980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:35:11,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 08:35:13,380][12883] Updated weights for policy 0, policy_version 96583 (0.0023) [2024-06-18 08:35:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1582563328. Throughput: 0: 42448.5. Samples: 1582638840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:35:16,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 08:35:17,100][12883] Updated weights for policy 0, policy_version 96593 (0.0037) [2024-06-18 08:35:21,175][12883] Updated weights for policy 0, policy_version 96603 (0.0035) [2024-06-18 08:35:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 1582776320. Throughput: 0: 42476.9. Samples: 1582891880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:35:21,994][12645] Avg episode reward: [(0, '0.647')] [2024-06-18 08:35:24,783][12883] Updated weights for policy 0, policy_version 96613 (0.0040) [2024-06-18 08:35:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1582989312. Throughput: 0: 42497.0. Samples: 1583143840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:35:26,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 08:35:29,077][12883] Updated weights for policy 0, policy_version 96623 (0.0036) [2024-06-18 08:35:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42599.3). Total num frames: 1583202304. Throughput: 0: 42498.7. Samples: 1583277020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:35:31,994][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 08:35:32,650][12883] Updated weights for policy 0, policy_version 96633 (0.0030) [2024-06-18 08:35:36,784][12883] Updated weights for policy 0, policy_version 96643 (0.0038) [2024-06-18 08:35:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1583398912. Throughput: 0: 42388.0. Samples: 1583528120. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:35:36,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 08:35:40,266][12883] Updated weights for policy 0, policy_version 96653 (0.0036) [2024-06-18 08:35:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42654.3). Total num frames: 1583628288. Throughput: 0: 42247.7. Samples: 1583778320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:35:41,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 08:35:44,902][12883] Updated weights for policy 0, policy_version 96663 (0.0037) [2024-06-18 08:35:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1583841280. Throughput: 0: 42317.3. Samples: 1583910880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:35:46,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 08:35:47,742][12883] Updated weights for policy 0, policy_version 96673 (0.0041) [2024-06-18 08:35:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 1584021504. Throughput: 0: 42410.0. Samples: 1584160220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:35:51,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 08:35:52,631][12883] Updated weights for policy 0, policy_version 96683 (0.0023) [2024-06-18 08:35:55,531][12883] Updated weights for policy 0, policy_version 96693 (0.0026) [2024-06-18 08:35:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1584267264. Throughput: 0: 42393.4. Samples: 1584414680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:35:56,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 08:36:00,163][12883] Updated weights for policy 0, policy_version 96703 (0.0026) [2024-06-18 08:36:01,994][12645] Fps is (10 sec: 47512.6, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1584496640. Throughput: 0: 42515.0. Samples: 1584552020. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:36:01,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 08:36:03,006][12883] Updated weights for policy 0, policy_version 96713 (0.0032) [2024-06-18 08:36:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1584676864. Throughput: 0: 42511.1. Samples: 1584804880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:36:06,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 08:36:07,747][12883] Updated weights for policy 0, policy_version 96723 (0.0028) [2024-06-18 08:36:08,231][12862] Signal inference workers to stop experience collection... (23150 times) [2024-06-18 08:36:08,236][12862] Signal inference workers to resume experience collection... (23150 times) [2024-06-18 08:36:08,282][12883] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-06-18 08:36:08,283][12883] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-06-18 08:36:10,768][12883] Updated weights for policy 0, policy_version 96733 (0.0031) [2024-06-18 08:36:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1584906240. Throughput: 0: 42550.7. Samples: 1585058620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:36:11,994][12645] Avg episode reward: [(0, '0.174')] [2024-06-18 08:36:15,311][12883] Updated weights for policy 0, policy_version 96743 (0.0037) [2024-06-18 08:36:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1585102848. Throughput: 0: 42735.6. Samples: 1585200120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:36:16,994][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 08:36:18,257][12883] Updated weights for policy 0, policy_version 96753 (0.0039) [2024-06-18 08:36:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1585315840. Throughput: 0: 42707.7. 
Samples: 1585449960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:36:21,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 08:36:22,864][12883] Updated weights for policy 0, policy_version 96763 (0.0028) [2024-06-18 08:36:25,806][12883] Updated weights for policy 0, policy_version 96773 (0.0033) [2024-06-18 08:36:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1585561600. Throughput: 0: 42817.4. Samples: 1585705100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:36:26,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 08:36:30,372][12883] Updated weights for policy 0, policy_version 96783 (0.0034) [2024-06-18 08:36:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1585774592. Throughput: 0: 42860.8. Samples: 1585839620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 24.0) [2024-06-18 08:36:31,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 08:36:33,553][12883] Updated weights for policy 0, policy_version 96793 (0.0036) [2024-06-18 08:36:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1585971200. Throughput: 0: 42983.4. Samples: 1586094480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:36:36,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 08:36:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096800_1585971200.pth... [2024-06-18 08:36:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096176_1575747584.pth [2024-06-18 08:36:38,049][12883] Updated weights for policy 0, policy_version 96803 (0.0031) [2024-06-18 08:36:41,491][12883] Updated weights for policy 0, policy_version 96813 (0.0034) [2024-06-18 08:36:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1586216960. Throughput: 0: 43036.5. Samples: 1586351320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:36:41,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 08:36:45,663][12883] Updated weights for policy 0, policy_version 96823 (0.0023) [2024-06-18 08:36:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1586413568. Throughput: 0: 43027.3. Samples: 1586488240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:36:46,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 08:36:49,126][12883] Updated weights for policy 0, policy_version 96833 (0.0034) [2024-06-18 08:36:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1586610176. Throughput: 0: 42981.3. Samples: 1586739040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:36:51,999][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 08:36:53,314][12883] Updated weights for policy 0, policy_version 96843 (0.0037) [2024-06-18 08:36:56,662][12883] Updated weights for policy 0, policy_version 96853 (0.0046) [2024-06-18 08:36:56,994][12645] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1586855936. Throughput: 0: 43059.9. Samples: 1586996320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:36:56,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 08:37:01,055][12883] Updated weights for policy 0, policy_version 96863 (0.0037) [2024-06-18 08:37:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1587052544. Throughput: 0: 42843.5. Samples: 1587128080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:37:01,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 08:37:04,172][12883] Updated weights for policy 0, policy_version 96873 (0.0024) [2024-06-18 08:37:06,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1587249152. Throughput: 0: 42919.1. Samples: 1587381320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:37:06,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 08:37:08,631][12883] Updated weights for policy 0, policy_version 96883 (0.0027) [2024-06-18 08:37:11,697][12883] Updated weights for policy 0, policy_version 96893 (0.0029) [2024-06-18 08:37:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1587494912. Throughput: 0: 42826.1. Samples: 1587632280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:37:11,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 08:37:16,183][12883] Updated weights for policy 0, policy_version 96903 (0.0025) [2024-06-18 08:37:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1587675136. Throughput: 0: 42786.3. Samples: 1587765000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:37:16,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 08:37:19,405][12883] Updated weights for policy 0, policy_version 96913 (0.0034) [2024-06-18 08:37:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1587904512. Throughput: 0: 42741.0. Samples: 1588017820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:37:21,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 08:37:23,858][12883] Updated weights for policy 0, policy_version 96923 (0.0026) [2024-06-18 08:37:26,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1588150272. Throughput: 0: 42783.9. Samples: 1588276600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 08:37:26,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 08:37:26,998][12883] Updated weights for policy 0, policy_version 96933 (0.0032) [2024-06-18 08:37:31,895][12883] Updated weights for policy 0, policy_version 96943 (0.0041) [2024-06-18 08:37:31,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1588314112. Throughput: 0: 42611.3. Samples: 1588405760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:37:31,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 08:37:34,754][12883] Updated weights for policy 0, policy_version 96953 (0.0031) [2024-06-18 08:37:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1588559872. Throughput: 0: 42681.4. Samples: 1588659700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:37:36,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 08:37:39,506][12883] Updated weights for policy 0, policy_version 96963 (0.0037) [2024-06-18 08:37:39,856][12862] Signal inference workers to stop experience collection... (23200 times) [2024-06-18 08:37:39,888][12883] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-06-18 08:37:39,909][12862] Signal inference workers to resume experience collection... (23200 times) [2024-06-18 08:37:39,913][12883] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-06-18 08:37:41,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1588772864. Throughput: 0: 42607.3. Samples: 1588913640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:37:41,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 08:37:42,408][12883] Updated weights for policy 0, policy_version 96973 (0.0032) [2024-06-18 08:37:46,959][12883] Updated weights for policy 0, policy_version 96983 (0.0027) [2024-06-18 08:37:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1588969472. Throughput: 0: 42568.9. Samples: 1589043680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:37:46,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 08:37:49,934][12883] Updated weights for policy 0, policy_version 96993 (0.0040) [2024-06-18 08:37:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1589182464. Throughput: 0: 42617.7. Samples: 1589299120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:37:51,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 08:37:54,857][12883] Updated weights for policy 0, policy_version 97003 (0.0049) [2024-06-18 08:37:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1589395456. Throughput: 0: 42708.2. Samples: 1589554140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:37:56,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 08:37:57,964][12883] Updated weights for policy 0, policy_version 97013 (0.0052) [2024-06-18 08:38:01,995][12645] Fps is (10 sec: 40953.7, 60 sec: 42324.2, 300 sec: 42654.1). Total num frames: 1589592064. Throughput: 0: 42632.7. Samples: 1589683540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:38:01,996][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 08:38:02,486][12883] Updated weights for policy 0, policy_version 97023 (0.0037) [2024-06-18 08:38:05,785][12883] Updated weights for policy 0, policy_version 97033 (0.0036) [2024-06-18 08:38:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). 
Total num frames: 1589821440. Throughput: 0: 42560.4. Samples: 1589933040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:38:06,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 08:38:10,263][12883] Updated weights for policy 0, policy_version 97043 (0.0039) [2024-06-18 08:38:11,994][12645] Fps is (10 sec: 44243.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1590034432. Throughput: 0: 42492.9. Samples: 1590188780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:38:11,998][12645] Avg episode reward: [(0, '0.212')] [2024-06-18 08:38:13,574][12883] Updated weights for policy 0, policy_version 97053 (0.0043) [2024-06-18 08:38:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1590231040. Throughput: 0: 42461.9. Samples: 1590316540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:38:16,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 08:38:17,770][12883] Updated weights for policy 0, policy_version 97063 (0.0034) [2024-06-18 08:38:21,160][12883] Updated weights for policy 0, policy_version 97073 (0.0036) [2024-06-18 08:38:21,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 1590476800. Throughput: 0: 42386.8. Samples: 1590567200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:38:21,997][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 08:38:25,295][12883] Updated weights for policy 0, policy_version 97083 (0.0032) [2024-06-18 08:38:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1590673408. Throughput: 0: 42414.9. Samples: 1590822320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 08:38:26,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 08:38:28,842][12883] Updated weights for policy 0, policy_version 97093 (0.0033) [2024-06-18 08:38:31,994][12645] Fps is (10 sec: 37691.5, 60 sec: 42325.4, 300 sec: 42598.4). 
Total num frames: 1590853632. Throughput: 0: 42334.2. Samples: 1590948720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:38:31,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 08:38:33,244][12883] Updated weights for policy 0, policy_version 97103 (0.0029) [2024-06-18 08:38:36,784][12883] Updated weights for policy 0, policy_version 97113 (0.0030) [2024-06-18 08:38:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1591099392. Throughput: 0: 42404.9. Samples: 1591207340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:38:36,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 08:38:37,131][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097114_1591115776.pth... [2024-06-18 08:38:37,178][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096488_1580859392.pth [2024-06-18 08:38:40,967][12883] Updated weights for policy 0, policy_version 97123 (0.0041) [2024-06-18 08:38:41,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1591312384. Throughput: 0: 42383.5. Samples: 1591461400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:38:41,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 08:38:44,353][12883] Updated weights for policy 0, policy_version 97133 (0.0040) [2024-06-18 08:38:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1591508992. Throughput: 0: 42337.6. Samples: 1591588660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:38:46,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 08:38:48,740][12883] Updated weights for policy 0, policy_version 97143 (0.0036) [2024-06-18 08:38:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1591738368. Throughput: 0: 42578.7. Samples: 1591849080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:38:51,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 08:38:52,026][12883] Updated weights for policy 0, policy_version 97153 (0.0031) [2024-06-18 08:38:56,321][12883] Updated weights for policy 0, policy_version 97163 (0.0029) [2024-06-18 08:38:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1591951360. Throughput: 0: 42568.0. Samples: 1592104340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:38:57,002][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 08:38:59,635][12883] Updated weights for policy 0, policy_version 97173 (0.0035) [2024-06-18 08:39:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42326.4, 300 sec: 42542.9). Total num frames: 1592131584. Throughput: 0: 42490.6. Samples: 1592228620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:39:02,004][12645] Avg episode reward: [(0, '0.704')] [2024-06-18 08:39:03,914][12883] Updated weights for policy 0, policy_version 97183 (0.0039) [2024-06-18 08:39:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1592377344. Throughput: 0: 42624.3. Samples: 1592485200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:39:06,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 08:39:07,000][12862] Signal inference workers to stop experience collection... (23250 times) [2024-06-18 08:39:07,004][12862] Signal inference workers to resume experience collection... 
(23250 times) [2024-06-18 08:39:07,038][12883] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-06-18 08:39:07,038][12883] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-06-18 08:39:07,303][12883] Updated weights for policy 0, policy_version 97193 (0.0032) [2024-06-18 08:39:11,751][12883] Updated weights for policy 0, policy_version 97203 (0.0033) [2024-06-18 08:39:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1592590336. Throughput: 0: 42719.7. Samples: 1592744700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:39:11,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 08:39:14,971][12883] Updated weights for policy 0, policy_version 97213 (0.0040) [2024-06-18 08:39:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 1592770560. Throughput: 0: 42696.4. Samples: 1592870060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:39:16,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 08:39:19,460][12883] Updated weights for policy 0, policy_version 97223 (0.0040) [2024-06-18 08:39:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 1593016320. Throughput: 0: 42557.7. Samples: 1593122440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:39:21,995][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 08:39:22,977][12883] Updated weights for policy 0, policy_version 97233 (0.0043) [2024-06-18 08:39:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1593212928. Throughput: 0: 42793.3. Samples: 1593387100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 08:39:26,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 08:39:27,114][12883] Updated weights for policy 0, policy_version 97243 (0.0028) [2024-06-18 08:39:30,675][12883] Updated weights for policy 0, policy_version 97253 (0.0028) [2024-06-18 08:39:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1593425920. Throughput: 0: 42687.6. Samples: 1593509600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:39:31,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 08:39:34,530][12883] Updated weights for policy 0, policy_version 97263 (0.0036) [2024-06-18 08:39:36,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1593671680. Throughput: 0: 42628.0. Samples: 1593767340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:39:36,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 08:39:38,377][12883] Updated weights for policy 0, policy_version 97273 (0.0036) [2024-06-18 08:39:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1593851904. Throughput: 0: 42711.7. Samples: 1594026360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:39:41,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 08:39:42,206][12883] Updated weights for policy 0, policy_version 97283 (0.0022) [2024-06-18 08:39:45,997][12883] Updated weights for policy 0, policy_version 97293 (0.0028) [2024-06-18 08:39:46,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 1594064896. Throughput: 0: 42663.9. Samples: 1594148500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:39:46,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 08:39:49,746][12883] Updated weights for policy 0, policy_version 97303 (0.0042) [2024-06-18 08:39:51,994][12645] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1594327040. Throughput: 0: 42837.3. Samples: 1594412880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:39:51,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 08:39:53,480][12883] Updated weights for policy 0, policy_version 97313 (0.0028) [2024-06-18 08:39:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1594507264. Throughput: 0: 42831.9. Samples: 1594672140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:39:56,994][12645] Avg episode reward: [(0, '0.666')] [2024-06-18 08:39:57,434][12883] Updated weights for policy 0, policy_version 97323 (0.0024) [2024-06-18 08:40:00,939][12883] Updated weights for policy 0, policy_version 97333 (0.0039) [2024-06-18 08:40:01,994][12645] Fps is (10 sec: 39320.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1594720256. Throughput: 0: 42788.3. Samples: 1594795540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:40:01,995][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 08:40:05,362][12883] Updated weights for policy 0, policy_version 97343 (0.0028) [2024-06-18 08:40:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1594949632. Throughput: 0: 42913.4. Samples: 1595053540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:40:06,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 08:40:08,416][12883] Updated weights for policy 0, policy_version 97353 (0.0028) [2024-06-18 08:40:11,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1595146240. Throughput: 0: 42839.1. Samples: 1595314860. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:40:11,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 08:40:13,001][12883] Updated weights for policy 0, policy_version 97363 (0.0038) [2024-06-18 08:40:16,423][12883] Updated weights for policy 0, policy_version 97373 (0.0034) [2024-06-18 08:40:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1595375616. Throughput: 0: 42876.8. Samples: 1595439060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:40:16,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 08:40:20,335][12883] Updated weights for policy 0, policy_version 97383 (0.0039) [2024-06-18 08:40:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1595572224. Throughput: 0: 43005.3. Samples: 1595702580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:40:21,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 08:40:23,922][12883] Updated weights for policy 0, policy_version 97393 (0.0058) [2024-06-18 08:40:27,000][12645] Fps is (10 sec: 42571.6, 60 sec: 43140.0, 300 sec: 42708.6). Total num frames: 1595801600. Throughput: 0: 43071.2. Samples: 1595964840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 08:40:27,000][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 08:40:27,879][12883] Updated weights for policy 0, policy_version 97403 (0.0030) [2024-06-18 08:40:31,377][12883] Updated weights for policy 0, policy_version 97413 (0.0048) [2024-06-18 08:40:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1596030976. Throughput: 0: 43183.7. Samples: 1596091760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:40:31,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 08:40:35,717][12883] Updated weights for policy 0, policy_version 97423 (0.0029) [2024-06-18 08:40:36,996][12645] Fps is (10 sec: 42615.5, 60 sec: 42596.8, 300 sec: 42709.1). 
Total num frames: 1596227584. Throughput: 0: 42978.7. Samples: 1596347020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:40:36,996][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 08:40:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097426_1596227584.pth... [2024-06-18 08:40:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000096800_1585971200.pth [2024-06-18 08:40:38,998][12883] Updated weights for policy 0, policy_version 97433 (0.0036) [2024-06-18 08:40:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 1596440576. Throughput: 0: 42958.2. Samples: 1596605260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:40:41,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 08:40:43,439][12883] Updated weights for policy 0, policy_version 97443 (0.0038) [2024-06-18 08:40:46,917][12883] Updated weights for policy 0, policy_version 97453 (0.0038) [2024-06-18 08:40:46,994][12645] Fps is (10 sec: 44246.3, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1596669952. Throughput: 0: 42962.3. Samples: 1596728840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:40:46,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 08:40:51,090][12883] Updated weights for policy 0, policy_version 97463 (0.0037) [2024-06-18 08:40:51,879][12862] Signal inference workers to stop experience collection... (23300 times) [2024-06-18 08:40:51,912][12883] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-06-18 08:40:51,936][12862] Signal inference workers to resume experience collection... (23300 times) [2024-06-18 08:40:51,937][12883] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-06-18 08:40:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1596850176. Throughput: 0: 42837.4. Samples: 1596981220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:40:51,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 08:40:54,729][12883] Updated weights for policy 0, policy_version 97473 (0.0045) [2024-06-18 08:40:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1597079552. Throughput: 0: 42772.5. Samples: 1597239620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:40:56,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 08:40:58,676][12883] Updated weights for policy 0, policy_version 97483 (0.0038) [2024-06-18 08:41:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1597292544. Throughput: 0: 42855.1. Samples: 1597367540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:41:01,994][12645] Avg episode reward: [(0, '0.586')] [2024-06-18 08:41:02,228][12883] Updated weights for policy 0, policy_version 97493 (0.0051) [2024-06-18 08:41:06,348][12883] Updated weights for policy 0, policy_version 97503 (0.0027) [2024-06-18 08:41:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1597489152. Throughput: 0: 42662.3. Samples: 1597622380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:41:06,994][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 08:41:10,029][12883] Updated weights for policy 0, policy_version 97513 (0.0049) [2024-06-18 08:41:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1597702144. Throughput: 0: 42552.6. Samples: 1597879440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:41:11,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 08:41:14,009][12883] Updated weights for policy 0, policy_version 97523 (0.0029) [2024-06-18 08:41:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1597931520. Throughput: 0: 42506.6. Samples: 1598004560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:41:16,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 08:41:17,621][12883] Updated weights for policy 0, policy_version 97533 (0.0037) [2024-06-18 08:41:21,565][12883] Updated weights for policy 0, policy_version 97543 (0.0035) [2024-06-18 08:41:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1598144512. Throughput: 0: 42585.7. Samples: 1598263280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:41:21,994][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 08:41:25,198][12883] Updated weights for policy 0, policy_version 97553 (0.0026) [2024-06-18 08:41:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42329.6, 300 sec: 42598.4). Total num frames: 1598341120. Throughput: 0: 42507.5. Samples: 1598518100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 08:41:26,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 08:41:29,393][12883] Updated weights for policy 0, policy_version 97563 (0.0040) [2024-06-18 08:41:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1598570496. Throughput: 0: 42572.5. Samples: 1598644600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:41:31,994][12645] Avg episode reward: [(0, '0.174')] [2024-06-18 08:41:32,805][12883] Updated weights for policy 0, policy_version 97573 (0.0039) [2024-06-18 08:41:36,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 1598783488. Throughput: 0: 42833.0. Samples: 1598908700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:41:36,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 08:41:37,039][12883] Updated weights for policy 0, policy_version 97583 (0.0041) [2024-06-18 08:41:40,437][12883] Updated weights for policy 0, policy_version 97593 (0.0036) [2024-06-18 08:41:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). 
Total num frames: 1598980096. Throughput: 0: 42628.0. Samples: 1599157880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:41:41,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 08:41:44,587][12883] Updated weights for policy 0, policy_version 97603 (0.0031) [2024-06-18 08:41:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 1599209472. Throughput: 0: 42539.6. Samples: 1599281820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:41:46,994][12645] Avg episode reward: [(0, '0.625')] [2024-06-18 08:41:48,170][12883] Updated weights for policy 0, policy_version 97613 (0.0023) [2024-06-18 08:41:52,000][12645] Fps is (10 sec: 44209.2, 60 sec: 42867.0, 300 sec: 42597.5). Total num frames: 1599422464. Throughput: 0: 42740.8. Samples: 1599545980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:41:52,001][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 08:41:52,261][12883] Updated weights for policy 0, policy_version 97623 (0.0032) [2024-06-18 08:41:56,110][12883] Updated weights for policy 0, policy_version 97633 (0.0035) [2024-06-18 08:41:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1599619072. Throughput: 0: 42509.0. Samples: 1599792340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:41:56,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 08:41:59,876][12883] Updated weights for policy 0, policy_version 97643 (0.0028) [2024-06-18 08:42:01,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1599864832. Throughput: 0: 42585.4. Samples: 1599920900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:42:01,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 08:42:04,012][12883] Updated weights for policy 0, policy_version 97653 (0.0034) [2024-06-18 08:42:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). 
Total num frames: 1600045056. Throughput: 0: 42541.3. Samples: 1600177640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:42:06,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 08:42:07,719][12883] Updated weights for policy 0, policy_version 97663 (0.0024) [2024-06-18 08:42:11,812][12883] Updated weights for policy 0, policy_version 97673 (0.0048) [2024-06-18 08:42:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1600274432. Throughput: 0: 42385.9. Samples: 1600425460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:42:11,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 08:42:15,452][12883] Updated weights for policy 0, policy_version 97683 (0.0039) [2024-06-18 08:42:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1600487424. Throughput: 0: 42545.0. Samples: 1600559120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:42:16,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 08:42:19,422][12862] Signal inference workers to stop experience collection... (23350 times) [2024-06-18 08:42:19,459][12883] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-06-18 08:42:19,470][12862] Signal inference workers to resume experience collection... (23350 times) [2024-06-18 08:42:19,482][12883] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-06-18 08:42:19,598][12883] Updated weights for policy 0, policy_version 97693 (0.0047) [2024-06-18 08:42:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1600667648. Throughput: 0: 42321.7. Samples: 1600813180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 08:42:21,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 08:42:23,187][12883] Updated weights for policy 0, policy_version 97703 (0.0029) [2024-06-18 08:42:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1600897024. Throughput: 0: 42233.7. Samples: 1601058400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:42:26,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 08:42:28,006][12883] Updated weights for policy 0, policy_version 97713 (0.0041) [2024-06-18 08:42:30,980][12883] Updated weights for policy 0, policy_version 97723 (0.0047) [2024-06-18 08:42:31,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1601142784. Throughput: 0: 42487.8. Samples: 1601193780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:42:31,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 08:42:35,694][12883] Updated weights for policy 0, policy_version 97733 (0.0034) [2024-06-18 08:42:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1601323008. Throughput: 0: 42383.2. Samples: 1601452960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:42:36,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 08:42:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097737_1601323008.pth... [2024-06-18 08:42:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097114_1591115776.pth [2024-06-18 08:42:38,636][12883] Updated weights for policy 0, policy_version 97743 (0.0032) [2024-06-18 08:42:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1601552384. Throughput: 0: 42366.6. Samples: 1601698840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:42:41,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 08:42:43,299][12883] Updated weights for policy 0, policy_version 97753 (0.0033) [2024-06-18 08:42:46,556][12883] Updated weights for policy 0, policy_version 97763 (0.0037) [2024-06-18 08:42:46,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1601781760. Throughput: 0: 42506.7. Samples: 1601833700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:42:46,994][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 08:42:50,867][12883] Updated weights for policy 0, policy_version 97773 (0.0026) [2024-06-18 08:42:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42056.7, 300 sec: 42542.9). Total num frames: 1601945600. Throughput: 0: 42560.1. Samples: 1602092840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:42:51,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 08:42:54,137][12883] Updated weights for policy 0, policy_version 97783 (0.0039) [2024-06-18 08:42:56,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42765.2). Total num frames: 1602207744. Throughput: 0: 42610.6. Samples: 1602342940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:42:56,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 08:42:58,439][12883] Updated weights for policy 0, policy_version 97793 (0.0032) [2024-06-18 08:43:01,789][12883] Updated weights for policy 0, policy_version 97803 (0.0033) [2024-06-18 08:43:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1602404352. Throughput: 0: 42518.7. Samples: 1602472460. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:43:01,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 08:43:06,137][12883] Updated weights for policy 0, policy_version 97813 (0.0035) [2024-06-18 08:43:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1602600960. Throughput: 0: 42581.3. Samples: 1602729340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:43:06,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 08:43:09,625][12883] Updated weights for policy 0, policy_version 97823 (0.0037) [2024-06-18 08:43:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1602830336. Throughput: 0: 42630.8. Samples: 1602976780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:43:11,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 08:43:13,717][12883] Updated weights for policy 0, policy_version 97833 (0.0050) [2024-06-18 08:43:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1603026944. Throughput: 0: 42492.5. Samples: 1603105940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:43:16,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 08:43:17,447][12883] Updated weights for policy 0, policy_version 97843 (0.0061) [2024-06-18 08:43:21,360][12883] Updated weights for policy 0, policy_version 97853 (0.0039) [2024-06-18 08:43:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1603239936. Throughput: 0: 42366.6. Samples: 1603359460. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 08:43:21,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 08:43:24,873][12883] Updated weights for policy 0, policy_version 97863 (0.0032) [2024-06-18 08:43:26,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1603452928. Throughput: 0: 42552.2. Samples: 1603613680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:43:26,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 08:43:28,955][12883] Updated weights for policy 0, policy_version 97873 (0.0033) [2024-06-18 08:43:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 1603649536. Throughput: 0: 42479.6. Samples: 1603745280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:43:31,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 08:43:32,685][12883] Updated weights for policy 0, policy_version 97883 (0.0045) [2024-06-18 08:43:36,508][12883] Updated weights for policy 0, policy_version 97893 (0.0038) [2024-06-18 08:43:36,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1603878912. Throughput: 0: 42408.2. Samples: 1604001220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:43:36,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 08:43:40,684][12883] Updated weights for policy 0, policy_version 97903 (0.0029) [2024-06-18 08:43:41,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1604108288. Throughput: 0: 42253.0. Samples: 1604244320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:43:41,994][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 08:43:44,143][12862] Signal inference workers to stop experience collection... (23400 times) [2024-06-18 08:43:44,143][12862] Signal inference workers to resume experience collection... (23400 times) [2024-06-18 08:43:44,187][12883] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-06-18 08:43:44,188][12883] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-06-18 08:43:44,282][12883] Updated weights for policy 0, policy_version 97913 (0.0030) [2024-06-18 08:43:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41506.0, 300 sec: 42487.3). Total num frames: 1604272128. Throughput: 0: 42262.1. 
Samples: 1604374260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:43:46,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 08:43:48,289][12883] Updated weights for policy 0, policy_version 97923 (0.0031) [2024-06-18 08:43:51,970][12883] Updated weights for policy 0, policy_version 97933 (0.0041) [2024-06-18 08:43:51,996][12645] Fps is (10 sec: 42588.7, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 1604534272. Throughput: 0: 42269.9. Samples: 1604631580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:43:51,996][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 08:43:56,055][12883] Updated weights for policy 0, policy_version 97943 (0.0041) [2024-06-18 08:43:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1604730880. Throughput: 0: 42374.5. Samples: 1604883640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:43:56,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 08:44:00,001][12883] Updated weights for policy 0, policy_version 97953 (0.0037) [2024-06-18 08:44:01,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1604927488. Throughput: 0: 42292.4. Samples: 1605009100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:44:01,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 08:44:03,893][12883] Updated weights for policy 0, policy_version 97963 (0.0040) [2024-06-18 08:44:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1605140480. Throughput: 0: 42218.7. Samples: 1605259300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:44:06,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 08:44:07,751][12883] Updated weights for policy 0, policy_version 97973 (0.0035) [2024-06-18 08:44:11,665][12883] Updated weights for policy 0, policy_version 97983 (0.0031) [2024-06-18 08:44:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1605353472. Throughput: 0: 42299.9. Samples: 1605517180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:44:11,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 08:44:15,411][12883] Updated weights for policy 0, policy_version 97993 (0.0032) [2024-06-18 08:44:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1605566464. Throughput: 0: 42205.3. Samples: 1605644520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:44:16,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 08:44:19,283][12883] Updated weights for policy 0, policy_version 98003 (0.0023) [2024-06-18 08:44:21,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 1605795840. Throughput: 0: 42208.7. Samples: 1605900700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 08:44:21,996][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 08:44:23,126][12883] Updated weights for policy 0, policy_version 98013 (0.0042) [2024-06-18 08:44:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1605992448. Throughput: 0: 42444.0. Samples: 1606154300. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:44:26,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 08:44:27,001][12883] Updated weights for policy 0, policy_version 98023 (0.0040) [2024-06-18 08:44:31,080][12883] Updated weights for policy 0, policy_version 98033 (0.0038) [2024-06-18 08:44:31,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1606221824. Throughput: 0: 42399.1. Samples: 1606282220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:44:31,996][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 08:44:34,627][12883] Updated weights for policy 0, policy_version 98043 (0.0039) [2024-06-18 08:44:36,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 1606418432. Throughput: 0: 42268.9. Samples: 1606533680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:44:36,997][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 08:44:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098048_1606418432.pth... [2024-06-18 08:44:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097426_1596227584.pth [2024-06-18 08:44:38,806][12883] Updated weights for policy 0, policy_version 98053 (0.0036) [2024-06-18 08:44:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1606631424. Throughput: 0: 42347.3. Samples: 1606789260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:44:41,994][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 08:44:42,146][12883] Updated weights for policy 0, policy_version 98063 (0.0028) [2024-06-18 08:44:46,341][12883] Updated weights for policy 0, policy_version 98073 (0.0045) [2024-06-18 08:44:46,994][12645] Fps is (10 sec: 44246.3, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 1606860800. Throughput: 0: 42482.2. Samples: 1606920800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:44:46,994][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 08:44:49,962][12883] Updated weights for policy 0, policy_version 98083 (0.0031) [2024-06-18 08:44:51,994][12645] Fps is (10 sec: 39320.7, 60 sec: 41507.6, 300 sec: 42431.8). Total num frames: 1607024640. Throughput: 0: 42394.6. Samples: 1607167060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:44:51,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 08:44:54,080][12883] Updated weights for policy 0, policy_version 98093 (0.0040) [2024-06-18 08:44:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1607270400. Throughput: 0: 42471.0. Samples: 1607428380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:44:56,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 08:44:57,733][12883] Updated weights for policy 0, policy_version 98103 (0.0035) [2024-06-18 08:45:01,920][12883] Updated weights for policy 0, policy_version 98113 (0.0043) [2024-06-18 08:45:01,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1607483392. Throughput: 0: 42514.1. Samples: 1607557660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:45:02,000][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 08:45:05,480][12883] Updated weights for policy 0, policy_version 98123 (0.0046) [2024-06-18 08:45:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1607680000. Throughput: 0: 42324.6. Samples: 1607805220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:45:06,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 08:45:09,310][12883] Updated weights for policy 0, policy_version 98133 (0.0033) [2024-06-18 08:45:11,612][12862] Signal inference workers to stop experience collection... 
(23450 times) [2024-06-18 08:45:11,644][12883] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-06-18 08:45:11,670][12862] Signal inference workers to resume experience collection... (23450 times) [2024-06-18 08:45:11,671][12883] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-06-18 08:45:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1607909376. Throughput: 0: 42562.2. Samples: 1608069600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:45:11,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 08:45:13,015][12883] Updated weights for policy 0, policy_version 98143 (0.0033) [2024-06-18 08:45:16,849][12883] Updated weights for policy 0, policy_version 98153 (0.0038) [2024-06-18 08:45:16,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1608138752. Throughput: 0: 42680.9. Samples: 1608202860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:45:16,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 08:45:20,596][12883] Updated weights for policy 0, policy_version 98163 (0.0040) [2024-06-18 08:45:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42053.8, 300 sec: 42432.7). Total num frames: 1608318976. Throughput: 0: 42626.1. Samples: 1608451760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 08:45:21,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 08:45:24,483][12883] Updated weights for policy 0, policy_version 98173 (0.0028) [2024-06-18 08:45:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1608548352. Throughput: 0: 42706.5. Samples: 1608711060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:45:26,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 08:45:28,326][12883] Updated weights for policy 0, policy_version 98183 (0.0031) [2024-06-18 08:45:31,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 1608761344. Throughput: 0: 42700.7. Samples: 1608842320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:45:31,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 08:45:32,306][12883] Updated weights for policy 0, policy_version 98193 (0.0032) [2024-06-18 08:45:35,892][12883] Updated weights for policy 0, policy_version 98203 (0.0036) [2024-06-18 08:45:36,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 1608974336. Throughput: 0: 42848.1. Samples: 1609095220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:45:36,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 08:45:40,035][12883] Updated weights for policy 0, policy_version 98213 (0.0041) [2024-06-18 08:45:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1609170944. Throughput: 0: 42598.4. Samples: 1609345300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:45:41,994][12645] Avg episode reward: [(0, '0.296')] [2024-06-18 08:45:43,730][12883] Updated weights for policy 0, policy_version 98223 (0.0024) [2024-06-18 08:45:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1609400320. Throughput: 0: 42437.3. Samples: 1609467340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:45:46,994][12645] Avg episode reward: [(0, '0.734')] [2024-06-18 08:45:47,782][12883] Updated weights for policy 0, policy_version 98233 (0.0035) [2024-06-18 08:45:51,909][12883] Updated weights for policy 0, policy_version 98243 (0.0036) [2024-06-18 08:45:51,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 1609613312. Throughput: 0: 42685.9. Samples: 1609726080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:45:51,994][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 08:45:55,371][12883] Updated weights for policy 0, policy_version 98253 (0.0043) [2024-06-18 08:45:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1609809920. Throughput: 0: 42356.0. Samples: 1609975620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:45:56,994][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 08:45:59,485][12883] Updated weights for policy 0, policy_version 98263 (0.0033) [2024-06-18 08:46:01,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1610022912. Throughput: 0: 42219.7. Samples: 1610102740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:46:01,994][12645] Avg episode reward: [(0, '0.154')] [2024-06-18 08:46:03,018][12883] Updated weights for policy 0, policy_version 98273 (0.0046) [2024-06-18 08:46:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1610252288. Throughput: 0: 42559.7. Samples: 1610366940. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:46:06,994][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 08:46:06,998][12883] Updated weights for policy 0, policy_version 98283 (0.0037) [2024-06-18 08:46:10,886][12883] Updated weights for policy 0, policy_version 98293 (0.0024) [2024-06-18 08:46:11,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1610465280. Throughput: 0: 42267.1. Samples: 1610613080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:46:11,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 08:46:14,621][12883] Updated weights for policy 0, policy_version 98303 (0.0035) [2024-06-18 08:46:17,000][12645] Fps is (10 sec: 39296.7, 60 sec: 41774.9, 300 sec: 42375.3). Total num frames: 1610645504. Throughput: 0: 42189.1. Samples: 1610741100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:46:17,001][12645] Avg episode reward: [(0, '0.648')] [2024-06-18 08:46:18,482][12883] Updated weights for policy 0, policy_version 98313 (0.0031) [2024-06-18 08:46:21,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1610891264. Throughput: 0: 42376.9. Samples: 1611002180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 08:46:21,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 08:46:22,314][12883] Updated weights for policy 0, policy_version 98323 (0.0023) [2024-06-18 08:46:24,993][12862] Signal inference workers to stop experience collection... (23500 times) [2024-06-18 08:46:25,032][12883] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-06-18 08:46:25,041][12862] Signal inference workers to resume experience collection... 
(23500 times) [2024-06-18 08:46:25,049][12883] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-06-18 08:46:26,171][12883] Updated weights for policy 0, policy_version 98333 (0.0041) [2024-06-18 08:46:26,994][12645] Fps is (10 sec: 45903.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1611104256. Throughput: 0: 42327.9. Samples: 1611250060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:46:26,994][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 08:46:29,916][12883] Updated weights for policy 0, policy_version 98343 (0.0048) [2024-06-18 08:46:32,000][12645] Fps is (10 sec: 40934.1, 60 sec: 42320.8, 300 sec: 42430.9). Total num frames: 1611300864. Throughput: 0: 42437.2. Samples: 1611377280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:46:32,001][12645] Avg episode reward: [(0, '0.662')] [2024-06-18 08:46:34,153][12883] Updated weights for policy 0, policy_version 98353 (0.0035) [2024-06-18 08:46:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1611530240. Throughput: 0: 42568.1. Samples: 1611641640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:46:36,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 08:46:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098361_1611546624.pth... [2024-06-18 08:46:37,057][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000097737_1601323008.pth [2024-06-18 08:46:38,189][12883] Updated weights for policy 0, policy_version 98363 (0.0035) [2024-06-18 08:46:41,832][12883] Updated weights for policy 0, policy_version 98373 (0.0028) [2024-06-18 08:46:41,994][12645] Fps is (10 sec: 44264.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1611743232. Throughput: 0: 42656.0. Samples: 1611895140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:46:41,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 08:46:45,635][12883] Updated weights for policy 0, policy_version 98383 (0.0028) [2024-06-18 08:46:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42488.2). Total num frames: 1611956224. Throughput: 0: 42634.1. Samples: 1612021280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:46:46,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 08:46:49,381][12883] Updated weights for policy 0, policy_version 98393 (0.0038) [2024-06-18 08:46:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1612169216. Throughput: 0: 42543.4. Samples: 1612281400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:46:51,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 08:46:53,513][12883] Updated weights for policy 0, policy_version 98403 (0.0039) [2024-06-18 08:46:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 1612382208. Throughput: 0: 42770.4. Samples: 1612537740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:46:56,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 08:46:57,016][12883] Updated weights for policy 0, policy_version 98413 (0.0040) [2024-06-18 08:47:01,087][12883] Updated weights for policy 0, policy_version 98423 (0.0035) [2024-06-18 08:47:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1612595200. Throughput: 0: 42753.5. Samples: 1612664740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:47:01,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 08:47:05,287][12883] Updated weights for policy 0, policy_version 98433 (0.0032) [2024-06-18 08:47:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1612808192. Throughput: 0: 42653.4. Samples: 1612921580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:47:06,994][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 08:47:08,792][12883] Updated weights for policy 0, policy_version 98443 (0.0033) [2024-06-18 08:47:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1613021184. Throughput: 0: 42718.6. Samples: 1613172400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:47:11,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 08:47:13,000][12883] Updated weights for policy 0, policy_version 98453 (0.0029) [2024-06-18 08:47:16,550][12883] Updated weights for policy 0, policy_version 98463 (0.0033) [2024-06-18 08:47:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42876.0, 300 sec: 42542.9). Total num frames: 1613217792. Throughput: 0: 42776.3. Samples: 1613301940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:47:16,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 08:47:20,671][12883] Updated weights for policy 0, policy_version 98473 (0.0031) [2024-06-18 08:47:21,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1613463552. Throughput: 0: 42698.2. Samples: 1613563060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 08:47:21,994][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 08:47:24,144][12883] Updated weights for policy 0, policy_version 98483 (0.0039) [2024-06-18 08:47:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1613676544. Throughput: 0: 42591.6. Samples: 1613811760. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:47:26,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 08:47:28,466][12883] Updated weights for policy 0, policy_version 98493 (0.0047) [2024-06-18 08:47:31,660][12883] Updated weights for policy 0, policy_version 98503 (0.0038) [2024-06-18 08:47:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42876.0, 300 sec: 42542.9). Total num frames: 1613873152. Throughput: 0: 42763.5. Samples: 1613945640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:47:31,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 08:47:32,885][12862] Signal inference workers to stop experience collection... (23550 times) [2024-06-18 08:47:32,885][12862] Signal inference workers to resume experience collection... (23550 times) [2024-06-18 08:47:32,937][12883] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-06-18 08:47:32,937][12883] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-06-18 08:47:35,996][12883] Updated weights for policy 0, policy_version 98513 (0.0038) [2024-06-18 08:47:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1614069760. Throughput: 0: 42788.5. Samples: 1614206880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:47:36,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 08:47:39,183][12883] Updated weights for policy 0, policy_version 98523 (0.0035) [2024-06-18 08:47:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1614315520. Throughput: 0: 42767.6. Samples: 1614462280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:47:41,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 08:47:43,702][12883] Updated weights for policy 0, policy_version 98533 (0.0032) [2024-06-18 08:47:46,734][12883] Updated weights for policy 0, policy_version 98543 (0.0029) [2024-06-18 08:47:46,996][12645] Fps is (10 sec: 45864.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1614528512. Throughput: 0: 42838.3. Samples: 1614592560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:47:46,996][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 08:47:51,478][12883] Updated weights for policy 0, policy_version 98553 (0.0036) [2024-06-18 08:47:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1614708736. Throughput: 0: 42737.7. Samples: 1614844780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:47:51,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 08:47:54,505][12883] Updated weights for policy 0, policy_version 98563 (0.0031) [2024-06-18 08:47:56,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1614938112. Throughput: 0: 42827.2. Samples: 1615099620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:47:56,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 08:47:59,054][12883] Updated weights for policy 0, policy_version 98573 (0.0031) [2024-06-18 08:48:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1615151104. Throughput: 0: 42960.4. Samples: 1615235160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:48:01,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 08:48:02,186][12883] Updated weights for policy 0, policy_version 98583 (0.0031) [2024-06-18 08:48:06,619][12883] Updated weights for policy 0, policy_version 98593 (0.0031) [2024-06-18 08:48:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 1615347712. Throughput: 0: 42770.5. Samples: 1615487740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:48:06,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 08:48:09,767][12883] Updated weights for policy 0, policy_version 98603 (0.0029) [2024-06-18 08:48:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1615577088. Throughput: 0: 42727.4. Samples: 1615734500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:48:11,994][12645] Avg episode reward: [(0, '0.622')] [2024-06-18 08:48:14,067][12883] Updated weights for policy 0, policy_version 98613 (0.0040) [2024-06-18 08:48:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1615790080. Throughput: 0: 42669.8. Samples: 1615865780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:48:16,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 08:48:17,390][12883] Updated weights for policy 0, policy_version 98623 (0.0041) [2024-06-18 08:48:21,732][12883] Updated weights for policy 0, policy_version 98633 (0.0038) [2024-06-18 08:48:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1616003072. Throughput: 0: 42551.5. Samples: 1616121700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 08:48:21,995][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 08:48:25,027][12883] Updated weights for policy 0, policy_version 98643 (0.0026) [2024-06-18 08:48:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1616216064. Throughput: 0: 42670.6. Samples: 1616382460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:48:26,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 08:48:29,221][12883] Updated weights for policy 0, policy_version 98653 (0.0024) [2024-06-18 08:48:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1616445440. Throughput: 0: 42623.4. Samples: 1616510520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:48:31,994][12645] Avg episode reward: [(0, '0.555')] [2024-06-18 08:48:32,786][12883] Updated weights for policy 0, policy_version 98663 (0.0033) [2024-06-18 08:48:36,764][12883] Updated weights for policy 0, policy_version 98673 (0.0041) [2024-06-18 08:48:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1616658432. Throughput: 0: 42715.5. Samples: 1616766980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:48:36,994][12645] Avg episode reward: [(0, '0.673')] [2024-06-18 08:48:37,113][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098674_1616674816.pth... [2024-06-18 08:48:37,170][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098048_1606418432.pth [2024-06-18 08:48:40,469][12883] Updated weights for policy 0, policy_version 98683 (0.0023) [2024-06-18 08:48:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1616871424. Throughput: 0: 42772.1. Samples: 1617024360. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:48:41,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 08:48:44,313][12883] Updated weights for policy 0, policy_version 98693 (0.0038) [2024-06-18 08:48:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42543.2). Total num frames: 1617084416. Throughput: 0: 42586.7. Samples: 1617151560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:48:46,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 08:48:48,245][12883] Updated weights for policy 0, policy_version 98703 (0.0026) [2024-06-18 08:48:51,949][12883] Updated weights for policy 0, policy_version 98713 (0.0032) [2024-06-18 08:48:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 1617313792. Throughput: 0: 42905.3. Samples: 1617418480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:48:51,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 08:48:53,932][12862] Signal inference workers to stop experience collection... (23600 times) [2024-06-18 08:48:53,978][12883] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-06-18 08:48:53,986][12862] Signal inference workers to resume experience collection... (23600 times) [2024-06-18 08:48:53,992][12883] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-06-18 08:48:55,851][12883] Updated weights for policy 0, policy_version 98723 (0.0029) [2024-06-18 08:48:57,000][12645] Fps is (10 sec: 42571.3, 60 sec: 42867.0, 300 sec: 42653.0). Total num frames: 1617510400. Throughput: 0: 42995.0. Samples: 1617669540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:48:57,009][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 08:48:59,608][12883] Updated weights for policy 0, policy_version 98733 (0.0032) [2024-06-18 08:49:01,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1617739776. Throughput: 0: 42947.6. 
Samples: 1617798420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:49:01,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 08:49:03,464][12883] Updated weights for policy 0, policy_version 98743 (0.0034) [2024-06-18 08:49:06,994][12645] Fps is (10 sec: 44264.6, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 1617952768. Throughput: 0: 43121.8. Samples: 1618062180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:49:06,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 08:49:07,111][12883] Updated weights for policy 0, policy_version 98753 (0.0035) [2024-06-18 08:49:11,098][12883] Updated weights for policy 0, policy_version 98763 (0.0025) [2024-06-18 08:49:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.7, 300 sec: 42653.9). Total num frames: 1618149376. Throughput: 0: 42992.6. Samples: 1618317120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:49:11,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 08:49:14,803][12883] Updated weights for policy 0, policy_version 98773 (0.0025) [2024-06-18 08:49:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1618345984. Throughput: 0: 42861.8. Samples: 1618439300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:49:16,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 08:49:19,181][12883] Updated weights for policy 0, policy_version 98783 (0.0036) [2024-06-18 08:49:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1618591744. Throughput: 0: 42823.1. Samples: 1618694020. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 08:49:21,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 08:49:22,304][12883] Updated weights for policy 0, policy_version 98793 (0.0029) [2024-06-18 08:49:26,756][12883] Updated weights for policy 0, policy_version 98803 (0.0042) [2024-06-18 08:49:26,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1618804736. Throughput: 0: 43030.1. Samples: 1618960720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:49:26,994][12645] Avg episode reward: [(0, '0.646')] [2024-06-18 08:49:29,849][12883] Updated weights for policy 0, policy_version 98813 (0.0040) [2024-06-18 08:49:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1619001344. Throughput: 0: 42997.3. Samples: 1619086440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:49:31,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 08:49:34,282][12883] Updated weights for policy 0, policy_version 98823 (0.0039) [2024-06-18 08:49:36,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1619230720. Throughput: 0: 42911.7. Samples: 1619349500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:49:36,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 08:49:37,418][12883] Updated weights for policy 0, policy_version 98833 (0.0031) [2024-06-18 08:49:41,629][12883] Updated weights for policy 0, policy_version 98843 (0.0043) [2024-06-18 08:49:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1619443712. Throughput: 0: 43067.7. Samples: 1619607320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:49:41,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 08:49:45,082][12883] Updated weights for policy 0, policy_version 98853 (0.0046) [2024-06-18 08:49:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). 
Total num frames: 1619640320. Throughput: 0: 43056.4. Samples: 1619735960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:49:46,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 08:49:49,747][12883] Updated weights for policy 0, policy_version 98863 (0.0027) [2024-06-18 08:49:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1619886080. Throughput: 0: 42866.2. Samples: 1619991160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:49:51,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 08:49:52,949][12883] Updated weights for policy 0, policy_version 98873 (0.0028) [2024-06-18 08:49:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 1620082688. Throughput: 0: 42905.7. Samples: 1620247880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:49:56,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 08:49:57,087][12883] Updated weights for policy 0, policy_version 98883 (0.0032) [2024-06-18 08:50:00,469][12883] Updated weights for policy 0, policy_version 98893 (0.0041) [2024-06-18 08:50:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1620279296. Throughput: 0: 42940.4. Samples: 1620371620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:50:01,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 08:50:04,716][12883] Updated weights for policy 0, policy_version 98903 (0.0033) [2024-06-18 08:50:06,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1620541440. Throughput: 0: 43140.4. Samples: 1620635340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:50:06,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 08:50:08,039][12883] Updated weights for policy 0, policy_version 98913 (0.0029) [2024-06-18 08:50:11,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42709.5). 
Total num frames: 1620738048. Throughput: 0: 42800.5. Samples: 1620886740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:50:11,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 08:50:12,218][12883] Updated weights for policy 0, policy_version 98923 (0.0027) [2024-06-18 08:50:16,045][12883] Updated weights for policy 0, policy_version 98933 (0.0045) [2024-06-18 08:50:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1620934656. Throughput: 0: 42889.6. Samples: 1621016480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:50:16,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 08:50:19,743][12883] Updated weights for policy 0, policy_version 98943 (0.0040) [2024-06-18 08:50:20,660][12862] Signal inference workers to stop experience collection... (23650 times) [2024-06-18 08:50:20,664][12862] Signal inference workers to resume experience collection... (23650 times) [2024-06-18 08:50:20,682][12883] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-06-18 08:50:20,682][12883] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-06-18 08:50:21,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 1621164032. Throughput: 0: 42893.4. Samples: 1621279800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 08:50:21,996][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 08:50:23,730][12883] Updated weights for policy 0, policy_version 98953 (0.0031) [2024-06-18 08:50:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1621360640. Throughput: 0: 42749.3. Samples: 1621531040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:50:26,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 08:50:27,705][12883] Updated weights for policy 0, policy_version 98963 (0.0044) [2024-06-18 08:50:31,301][12883] Updated weights for policy 0, policy_version 98973 (0.0027) [2024-06-18 08:50:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1621590016. Throughput: 0: 42782.7. Samples: 1621661180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:50:31,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 08:50:35,268][12883] Updated weights for policy 0, policy_version 98983 (0.0027) [2024-06-18 08:50:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1621819392. Throughput: 0: 42874.6. Samples: 1621920520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:50:36,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 08:50:37,123][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098989_1621835776.pth... [2024-06-18 08:50:37,171][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098361_1611546624.pth [2024-06-18 08:50:38,911][12883] Updated weights for policy 0, policy_version 98993 (0.0030) [2024-06-18 08:50:41,996][12645] Fps is (10 sec: 44226.7, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 1622032384. Throughput: 0: 42962.2. Samples: 1622181280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:50:41,997][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 08:50:42,769][12883] Updated weights for policy 0, policy_version 99003 (0.0033) [2024-06-18 08:50:46,382][12883] Updated weights for policy 0, policy_version 99013 (0.0027) [2024-06-18 08:50:46,994][12645] Fps is (10 sec: 42599.3, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1622245376. Throughput: 0: 43109.4. Samples: 1622311540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:50:46,994][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 08:50:50,268][12883] Updated weights for policy 0, policy_version 99023 (0.0042) [2024-06-18 08:50:51,994][12645] Fps is (10 sec: 44247.0, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1622474752. Throughput: 0: 43006.8. Samples: 1622570640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:50:51,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 08:50:54,046][12883] Updated weights for policy 0, policy_version 99033 (0.0028) [2024-06-18 08:50:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 1622687744. Throughput: 0: 43281.4. Samples: 1622834400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:50:56,994][12645] Avg episode reward: [(0, '0.161')] [2024-06-18 08:50:57,814][12883] Updated weights for policy 0, policy_version 99043 (0.0039) [2024-06-18 08:51:01,640][12883] Updated weights for policy 0, policy_version 99053 (0.0031) [2024-06-18 08:51:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 42876.1). Total num frames: 1622900736. Throughput: 0: 43159.7. Samples: 1622958660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:51:01,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 08:51:05,343][12883] Updated weights for policy 0, policy_version 99063 (0.0032) [2024-06-18 08:51:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1623130112. Throughput: 0: 43139.8. Samples: 1623221000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:51:06,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 08:51:09,203][12883] Updated weights for policy 0, policy_version 99073 (0.0022) [2024-06-18 08:51:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42988.1). Total num frames: 1623326720. Throughput: 0: 43402.3. Samples: 1623484140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:51:11,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 08:51:12,756][12883] Updated weights for policy 0, policy_version 99083 (0.0045) [2024-06-18 08:51:16,814][12883] Updated weights for policy 0, policy_version 99093 (0.0035) [2024-06-18 08:51:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 1623556096. Throughput: 0: 43208.8. Samples: 1623605580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:51:16,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 08:51:20,202][12883] Updated weights for policy 0, policy_version 99103 (0.0027) [2024-06-18 08:51:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43692.3, 300 sec: 42987.2). Total num frames: 1623785472. Throughput: 0: 43328.6. Samples: 1623870300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 08:51:21,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 08:51:24,224][12883] Updated weights for policy 0, policy_version 99113 (0.0044) [2024-06-18 08:51:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43690.7, 300 sec: 42988.1). Total num frames: 1623982080. Throughput: 0: 43412.8. Samples: 1624134760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:51:26,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 08:51:28,005][12862] Signal inference workers to stop experience collection... (23700 times) [2024-06-18 08:51:28,052][12883] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-06-18 08:51:28,052][12862] Signal inference workers to resume experience collection... 
(23700 times) [2024-06-18 08:51:28,063][12883] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-06-18 08:51:28,214][12883] Updated weights for policy 0, policy_version 99123 (0.0028) [2024-06-18 08:51:31,689][12883] Updated weights for policy 0, policy_version 99133 (0.0038) [2024-06-18 08:51:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 1624195072. Throughput: 0: 43229.7. Samples: 1624256880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:51:31,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 08:51:35,673][12883] Updated weights for policy 0, policy_version 99143 (0.0024) [2024-06-18 08:51:36,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43690.7, 300 sec: 43042.7). Total num frames: 1624440832. Throughput: 0: 43403.4. Samples: 1624523800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:51:36,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 08:51:39,578][12883] Updated weights for policy 0, policy_version 99153 (0.0041) [2024-06-18 08:51:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43419.2, 300 sec: 42987.2). Total num frames: 1624637440. Throughput: 0: 43290.2. Samples: 1624782460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:51:41,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 08:51:43,250][12883] Updated weights for policy 0, policy_version 99163 (0.0035) [2024-06-18 08:51:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1624834048. Throughput: 0: 43239.5. Samples: 1624904440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:51:46,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 08:51:47,085][12883] Updated weights for policy 0, policy_version 99173 (0.0029) [2024-06-18 08:51:50,801][12883] Updated weights for policy 0, policy_version 99183 (0.0042) [2024-06-18 08:51:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42987.2). 
Total num frames: 1625063424. Throughput: 0: 43294.8. Samples: 1625169260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:51:51,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 08:51:54,836][12883] Updated weights for policy 0, policy_version 99193 (0.0032) [2024-06-18 08:51:56,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42869.9, 300 sec: 42931.3). Total num frames: 1625260032. Throughput: 0: 43137.9. Samples: 1625425440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:51:56,996][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 08:51:58,583][12883] Updated weights for policy 0, policy_version 99203 (0.0033) [2024-06-18 08:52:01,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42931.6). Total num frames: 1625473024. Throughput: 0: 43137.3. Samples: 1625546760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:52:01,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 08:52:02,432][12883] Updated weights for policy 0, policy_version 99213 (0.0031) [2024-06-18 08:52:06,085][12883] Updated weights for policy 0, policy_version 99223 (0.0033) [2024-06-18 08:52:06,994][12645] Fps is (10 sec: 44246.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1625702400. Throughput: 0: 43172.7. Samples: 1625813080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:52:06,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 08:52:09,908][12883] Updated weights for policy 0, policy_version 99233 (0.0036) [2024-06-18 08:52:11,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1625915392. Throughput: 0: 42944.9. Samples: 1626067280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:52:11,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 08:52:13,753][12883] Updated weights for policy 0, policy_version 99243 (0.0043) [2024-06-18 08:52:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42876.1). 
Total num frames: 1626112000. Throughput: 0: 42975.5. Samples: 1626190780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:52:16,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 08:52:17,994][12883] Updated weights for policy 0, policy_version 99253 (0.0032) [2024-06-18 08:52:21,447][12883] Updated weights for policy 0, policy_version 99263 (0.0036) [2024-06-18 08:52:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1626341376. Throughput: 0: 42744.1. Samples: 1626447280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 08:52:21,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 08:52:25,559][12883] Updated weights for policy 0, policy_version 99273 (0.0029) [2024-06-18 08:52:26,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1626554368. Throughput: 0: 42776.5. Samples: 1626707400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:52:26,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 08:52:29,421][12883] Updated weights for policy 0, policy_version 99283 (0.0036) [2024-06-18 08:52:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 1626750976. Throughput: 0: 42834.1. Samples: 1626831980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:52:31,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 08:52:33,228][12883] Updated weights for policy 0, policy_version 99293 (0.0040) [2024-06-18 08:52:36,881][12883] Updated weights for policy 0, policy_version 99303 (0.0030) [2024-06-18 08:52:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42931.6). Total num frames: 1626980352. Throughput: 0: 42604.7. Samples: 1627086480. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:52:36,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 08:52:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099303_1626980352.pth... [2024-06-18 08:52:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098674_1616674816.pth [2024-06-18 08:52:40,914][12883] Updated weights for policy 0, policy_version 99313 (0.0042) [2024-06-18 08:52:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42932.0). Total num frames: 1627193344. Throughput: 0: 42670.1. Samples: 1627345500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:52:41,994][12645] Avg episode reward: [(0, '0.283')] [2024-06-18 08:52:44,509][12883] Updated weights for policy 0, policy_version 99323 (0.0031) [2024-06-18 08:52:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1627389952. Throughput: 0: 42713.0. Samples: 1627468840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:52:46,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 08:52:48,751][12883] Updated weights for policy 0, policy_version 99333 (0.0035) [2024-06-18 08:52:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 1627619328. Throughput: 0: 42565.9. Samples: 1627728540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:52:51,994][12645] Avg episode reward: [(0, '0.658')] [2024-06-18 08:52:52,174][12883] Updated weights for policy 0, policy_version 99343 (0.0037) [2024-06-18 08:52:56,352][12883] Updated weights for policy 0, policy_version 99353 (0.0031) [2024-06-18 08:52:56,572][12862] Signal inference workers to stop experience collection... (23750 times) [2024-06-18 08:52:56,625][12862] Signal inference workers to resume experience collection... 
(23750 times) [2024-06-18 08:52:56,626][12883] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-06-18 08:52:56,643][12883] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-06-18 08:52:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42873.1, 300 sec: 42987.2). Total num frames: 1627832320. Throughput: 0: 42721.8. Samples: 1627989760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:52:56,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 08:52:59,791][12883] Updated weights for policy 0, policy_version 99363 (0.0028) [2024-06-18 08:53:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 43042.7). Total num frames: 1628045312. Throughput: 0: 42809.4. Samples: 1628117200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:53:01,994][12645] Avg episode reward: [(0, '0.586')] [2024-06-18 08:53:03,943][12883] Updated weights for policy 0, policy_version 99373 (0.0033) [2024-06-18 08:53:06,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1628274688. Throughput: 0: 42853.2. Samples: 1628375680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:53:06,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 08:53:07,402][12883] Updated weights for policy 0, policy_version 99383 (0.0041) [2024-06-18 08:53:11,718][12883] Updated weights for policy 0, policy_version 99393 (0.0039) [2024-06-18 08:53:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42987.2). Total num frames: 1628471296. Throughput: 0: 42929.2. Samples: 1628639220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:53:11,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 08:53:15,019][12883] Updated weights for policy 0, policy_version 99403 (0.0027) [2024-06-18 08:53:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1628684288. Throughput: 0: 42850.3. Samples: 1628760240. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:53:16,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 08:53:19,355][12883] Updated weights for policy 0, policy_version 99413 (0.0034) [2024-06-18 08:53:21,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1628913664. Throughput: 0: 42809.1. Samples: 1629012880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 23.0) [2024-06-18 08:53:21,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 08:53:22,907][12883] Updated weights for policy 0, policy_version 99423 (0.0039) [2024-06-18 08:53:26,997][12645] Fps is (10 sec: 39306.9, 60 sec: 42049.6, 300 sec: 42820.0). Total num frames: 1629077504. Throughput: 0: 42907.1. Samples: 1629276480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:53:26,998][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 08:53:27,386][12883] Updated weights for policy 0, policy_version 99433 (0.0047) [2024-06-18 08:53:30,239][12883] Updated weights for policy 0, policy_version 99443 (0.0028) [2024-06-18 08:53:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 1629323264. Throughput: 0: 42843.7. Samples: 1629396800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:53:31,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 08:53:34,931][12883] Updated weights for policy 0, policy_version 99453 (0.0033) [2024-06-18 08:53:36,994][12645] Fps is (10 sec: 47530.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1629552640. Throughput: 0: 42805.7. Samples: 1629654800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:53:36,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 08:53:37,898][12883] Updated weights for policy 0, policy_version 99463 (0.0027) [2024-06-18 08:53:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1629732864. Throughput: 0: 42908.9. Samples: 1629920660. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:53:41,994][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 08:53:42,408][12883] Updated weights for policy 0, policy_version 99473 (0.0036) [2024-06-18 08:53:45,675][12883] Updated weights for policy 0, policy_version 99483 (0.0032) [2024-06-18 08:53:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1629962240. Throughput: 0: 42878.1. Samples: 1630046720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:53:46,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 08:53:49,858][12883] Updated weights for policy 0, policy_version 99493 (0.0037) [2024-06-18 08:53:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42988.1). Total num frames: 1630191616. Throughput: 0: 42907.2. Samples: 1630306500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:53:51,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 08:53:53,460][12883] Updated weights for policy 0, policy_version 99503 (0.0042) [2024-06-18 08:53:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1630388224. Throughput: 0: 42701.3. Samples: 1630560780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:53:56,994][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 08:53:57,356][12883] Updated weights for policy 0, policy_version 99513 (0.0035) [2024-06-18 08:54:01,292][12883] Updated weights for policy 0, policy_version 99523 (0.0041) [2024-06-18 08:54:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1630617600. Throughput: 0: 42832.9. Samples: 1630687720. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:54:01,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 08:54:04,933][12883] Updated weights for policy 0, policy_version 99533 (0.0038) [2024-06-18 08:54:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1630830592. Throughput: 0: 43031.9. Samples: 1630949320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:54:06,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 08:54:08,921][12883] Updated weights for policy 0, policy_version 99543 (0.0029) [2024-06-18 08:54:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 43042.7). Total num frames: 1631043584. Throughput: 0: 42888.4. Samples: 1631206300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:54:11,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 08:54:12,374][12883] Updated weights for policy 0, policy_version 99553 (0.0027) [2024-06-18 08:54:16,269][12883] Updated weights for policy 0, policy_version 99563 (0.0038) [2024-06-18 08:54:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1631256576. Throughput: 0: 43186.9. Samples: 1631340220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:54:16,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 08:54:19,856][12883] Updated weights for policy 0, policy_version 99573 (0.0039) [2024-06-18 08:54:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1631485952. Throughput: 0: 43106.3. Samples: 1631594580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-18 08:54:21,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 08:54:23,695][12883] Updated weights for policy 0, policy_version 99583 (0.0027) [2024-06-18 08:54:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43420.3, 300 sec: 42987.2). Total num frames: 1631682560. Throughput: 0: 42871.1. Samples: 1631849860. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:54:26,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 08:54:27,920][12883] Updated weights for policy 0, policy_version 99593 (0.0035) [2024-06-18 08:54:28,011][12862] Signal inference workers to stop experience collection... (23800 times) [2024-06-18 08:54:28,059][12862] Signal inference workers to resume experience collection... (23800 times) [2024-06-18 08:54:28,071][12883] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-06-18 08:54:28,106][12883] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-06-18 08:54:31,185][12883] Updated weights for policy 0, policy_version 99603 (0.0033) [2024-06-18 08:54:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1631895552. Throughput: 0: 42954.4. Samples: 1631979660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:54:31,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 08:54:35,398][12883] Updated weights for policy 0, policy_version 99613 (0.0036) [2024-06-18 08:54:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42987.2). Total num frames: 1632124928. Throughput: 0: 42846.2. Samples: 1632234580. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:54:36,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 08:54:37,100][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099618_1632141312.pth... [2024-06-18 08:54:37,159][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000098989_1621835776.pth [2024-06-18 08:54:38,761][12883] Updated weights for policy 0, policy_version 99623 (0.0045) [2024-06-18 08:54:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1632321536. Throughput: 0: 42946.7. Samples: 1632493380. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:54:41,994][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 08:54:43,142][12883] Updated weights for policy 0, policy_version 99633 (0.0037) [2024-06-18 08:54:46,350][12883] Updated weights for policy 0, policy_version 99643 (0.0028) [2024-06-18 08:54:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1632550912. Throughput: 0: 42867.2. Samples: 1632616740. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:54:46,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 08:54:50,687][12883] Updated weights for policy 0, policy_version 99653 (0.0040) [2024-06-18 08:54:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42987.1). Total num frames: 1632763904. Throughput: 0: 42854.1. Samples: 1632877760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:54:51,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 08:54:53,911][12883] Updated weights for policy 0, policy_version 99663 (0.0033) [2024-06-18 08:54:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1632960512. Throughput: 0: 42912.9. Samples: 1633137380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:54:56,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 08:54:58,267][12883] Updated weights for policy 0, policy_version 99673 (0.0035) [2024-06-18 08:55:01,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1633189888. Throughput: 0: 42676.2. Samples: 1633260640. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:55:01,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 08:55:02,229][12883] Updated weights for policy 0, policy_version 99683 (0.0036) [2024-06-18 08:55:05,799][12883] Updated weights for policy 0, policy_version 99693 (0.0039) [2024-06-18 08:55:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1633402880. Throughput: 0: 42712.9. Samples: 1633516660. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:55:06,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 08:55:09,996][12883] Updated weights for policy 0, policy_version 99703 (0.0037) [2024-06-18 08:55:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 1633599488. Throughput: 0: 42700.9. Samples: 1633771400. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:55:11,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 08:55:13,473][12883] Updated weights for policy 0, policy_version 99713 (0.0028) [2024-06-18 08:55:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42987.5). Total num frames: 1633845248. Throughput: 0: 42604.8. Samples: 1633896880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:55:16,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 08:55:17,831][12883] Updated weights for policy 0, policy_version 99723 (0.0043) [2024-06-18 08:55:21,062][12883] Updated weights for policy 0, policy_version 99733 (0.0038) [2024-06-18 08:55:21,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42869.9, 300 sec: 43042.4). Total num frames: 1634058240. Throughput: 0: 42684.1. Samples: 1634155460. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 08:55:21,996][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 08:55:25,508][12883] Updated weights for policy 0, policy_version 99743 (0.0046) [2024-06-18 08:55:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1634254848. Throughput: 0: 42735.1. Samples: 1634416460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:55:26,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 08:55:28,707][12883] Updated weights for policy 0, policy_version 99753 (0.0035) [2024-06-18 08:55:31,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1634467840. Throughput: 0: 42794.6. Samples: 1634542500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:55:31,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 08:55:33,118][12883] Updated weights for policy 0, policy_version 99763 (0.0040) [2024-06-18 08:55:36,462][12883] Updated weights for policy 0, policy_version 99773 (0.0036) [2024-06-18 08:55:36,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 1634697216. Throughput: 0: 42685.1. Samples: 1634798580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:55:36,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 08:55:40,834][12883] Updated weights for policy 0, policy_version 99783 (0.0041) [2024-06-18 08:55:41,376][12862] Signal inference workers to stop experience collection... (23850 times) [2024-06-18 08:55:41,376][12862] Signal inference workers to resume experience collection... (23850 times) [2024-06-18 08:55:41,422][12883] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-06-18 08:55:41,422][12883] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-06-18 08:55:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1634893824. Throughput: 0: 42523.0. 
Samples: 1635050920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:55:41,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 08:55:44,400][12883] Updated weights for policy 0, policy_version 99793 (0.0046) [2024-06-18 08:55:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1635123200. Throughput: 0: 42620.8. Samples: 1635178580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:55:46,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 08:55:48,705][12883] Updated weights for policy 0, policy_version 99803 (0.0037) [2024-06-18 08:55:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1635319808. Throughput: 0: 42533.4. Samples: 1635430660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:55:51,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 08:55:52,027][12883] Updated weights for policy 0, policy_version 99813 (0.0036) [2024-06-18 08:55:56,481][12883] Updated weights for policy 0, policy_version 99823 (0.0039) [2024-06-18 08:55:56,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1635500032. Throughput: 0: 42579.1. Samples: 1635687460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:55:56,994][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 08:55:59,634][12883] Updated weights for policy 0, policy_version 99833 (0.0036) [2024-06-18 08:56:01,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1635745792. Throughput: 0: 42505.0. Samples: 1635809700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:56:01,997][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 08:56:04,373][12883] Updated weights for policy 0, policy_version 99843 (0.0032) [2024-06-18 08:56:06,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1635958784. Throughput: 0: 42569.3. 
Samples: 1636070980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:56:06,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 08:56:07,261][12883] Updated weights for policy 0, policy_version 99853 (0.0039) [2024-06-18 08:56:11,975][12883] Updated weights for policy 0, policy_version 99863 (0.0030) [2024-06-18 08:56:11,994][12645] Fps is (10 sec: 40968.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1636155392. Throughput: 0: 42524.0. Samples: 1636330040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:56:11,995][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 08:56:14,985][12883] Updated weights for policy 0, policy_version 99873 (0.0034) [2024-06-18 08:56:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1636384768. Throughput: 0: 42417.8. Samples: 1636451300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:56:16,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 08:56:19,626][12883] Updated weights for policy 0, policy_version 99883 (0.0050) [2024-06-18 08:56:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 1636597760. Throughput: 0: 42376.4. Samples: 1636705520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 08:56:21,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 08:56:22,716][12883] Updated weights for policy 0, policy_version 99893 (0.0039) [2024-06-18 08:56:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1636794368. Throughput: 0: 42664.0. Samples: 1636970800. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:56:26,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 08:56:27,125][12883] Updated weights for policy 0, policy_version 99903 (0.0040) [2024-06-18 08:56:30,911][12883] Updated weights for policy 0, policy_version 99913 (0.0033) [2024-06-18 08:56:31,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1637023744. Throughput: 0: 42430.3. Samples: 1637088040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:56:31,997][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 08:56:34,824][12883] Updated weights for policy 0, policy_version 99923 (0.0032) [2024-06-18 08:56:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1637236736. Throughput: 0: 42529.7. Samples: 1637344500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:56:36,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 08:56:37,126][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099930_1637253120.pth... [2024-06-18 08:56:37,188][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099303_1626980352.pth [2024-06-18 08:56:38,575][12883] Updated weights for policy 0, policy_version 99933 (0.0036) [2024-06-18 08:56:41,996][12645] Fps is (10 sec: 40960.3, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 1637433344. Throughput: 0: 42576.2. Samples: 1637603480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:56:41,997][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 08:56:42,858][12883] Updated weights for policy 0, policy_version 99943 (0.0030) [2024-06-18 08:56:46,255][12883] Updated weights for policy 0, policy_version 99953 (0.0033) [2024-06-18 08:56:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1637662720. Throughput: 0: 42655.4. Samples: 1637729100. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:56:46,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 08:56:50,597][12883] Updated weights for policy 0, policy_version 99963 (0.0028) [2024-06-18 08:56:51,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 1637842944. Throughput: 0: 42620.8. Samples: 1637988920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:56:51,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 08:56:53,907][12883] Updated weights for policy 0, policy_version 99973 (0.0032) [2024-06-18 08:56:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1638072320. Throughput: 0: 42384.1. Samples: 1638237320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:56:56,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 08:56:58,167][12883] Updated weights for policy 0, policy_version 99983 (0.0027) [2024-06-18 08:57:01,398][12883] Updated weights for policy 0, policy_version 99993 (0.0028) [2024-06-18 08:57:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42326.8, 300 sec: 42653.9). Total num frames: 1638285312. Throughput: 0: 42624.3. Samples: 1638369400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:57:01,995][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 08:57:05,699][12883] Updated weights for policy 0, policy_version 100003 (0.0030) [2024-06-18 08:57:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1638498304. Throughput: 0: 42694.2. Samples: 1638626760. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:57:06,994][12645] Avg episode reward: [(0, '0.586')] [2024-06-18 08:57:08,778][12883] Updated weights for policy 0, policy_version 100013 (0.0039) [2024-06-18 08:57:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1638727680. Throughput: 0: 42420.8. Samples: 1638879740. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:57:11,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 08:57:13,252][12883] Updated weights for policy 0, policy_version 100023 (0.0039) [2024-06-18 08:57:16,310][12883] Updated weights for policy 0, policy_version 100033 (0.0035) [2024-06-18 08:57:16,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 1638940672. Throughput: 0: 42783.6. Samples: 1639013300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:57:16,997][12645] Avg episode reward: [(0, '0.690')] [2024-06-18 08:57:19,026][12862] Signal inference workers to stop experience collection... (23900 times) [2024-06-18 08:57:19,078][12883] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-06-18 08:57:19,138][12862] Signal inference workers to resume experience collection... (23900 times) [2024-06-18 08:57:19,138][12883] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-06-18 08:57:20,760][12883] Updated weights for policy 0, policy_version 100043 (0.0045) [2024-06-18 08:57:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1639137280. Throughput: 0: 42714.6. Samples: 1639266660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-18 08:57:21,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 08:57:24,573][12883] Updated weights for policy 0, policy_version 100053 (0.0024) [2024-06-18 08:57:26,994][12645] Fps is (10 sec: 44246.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1639383040. Throughput: 0: 42688.8. Samples: 1639524380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:57:26,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 08:57:28,736][12883] Updated weights for policy 0, policy_version 100063 (0.0024) [2024-06-18 08:57:31,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42598.4, 300 sec: 42709.2). Total num frames: 1639579648. 
Throughput: 0: 42856.2. Samples: 1639657720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:57:31,996][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 08:57:32,189][12883] Updated weights for policy 0, policy_version 100073 (0.0037) [2024-06-18 08:57:36,244][12883] Updated weights for policy 0, policy_version 100083 (0.0045) [2024-06-18 08:57:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1639792640. Throughput: 0: 42788.9. Samples: 1639914420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:57:36,994][12645] Avg episode reward: [(0, '0.121')] [2024-06-18 08:57:39,769][12883] Updated weights for policy 0, policy_version 100093 (0.0037) [2024-06-18 08:57:41,994][12645] Fps is (10 sec: 44247.1, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 1640022016. Throughput: 0: 42885.5. Samples: 1640167160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:57:41,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 08:57:43,819][12883] Updated weights for policy 0, policy_version 100103 (0.0036) [2024-06-18 08:57:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1640235008. Throughput: 0: 42975.7. Samples: 1640303300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:57:46,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 08:57:47,181][12883] Updated weights for policy 0, policy_version 100113 (0.0025) [2024-06-18 08:57:51,208][12883] Updated weights for policy 0, policy_version 100123 (0.0032) [2024-06-18 08:57:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1640448000. Throughput: 0: 43002.8. Samples: 1640561880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:57:51,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 08:57:54,899][12883] Updated weights for policy 0, policy_version 100133 (0.0045) [2024-06-18 08:57:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1640677376. Throughput: 0: 43039.1. Samples: 1640816500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:57:56,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 08:57:58,574][12883] Updated weights for policy 0, policy_version 100143 (0.0033) [2024-06-18 08:58:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1640873984. Throughput: 0: 43009.7. Samples: 1640948640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:58:01,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 08:58:02,419][12883] Updated weights for policy 0, policy_version 100153 (0.0041) [2024-06-18 08:58:06,129][12883] Updated weights for policy 0, policy_version 100163 (0.0039) [2024-06-18 08:58:06,993][12645] Fps is (10 sec: 40961.2, 60 sec: 43144.7, 300 sec: 42765.1). Total num frames: 1641086976. Throughput: 0: 43080.2. Samples: 1641205260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:58:06,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 08:58:10,310][12883] Updated weights for policy 0, policy_version 100173 (0.0035) [2024-06-18 08:58:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1641299968. Throughput: 0: 42939.9. Samples: 1641456680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:58:11,995][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 08:58:13,786][12883] Updated weights for policy 0, policy_version 100183 (0.0041) [2024-06-18 08:58:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 1641496576. Throughput: 0: 42827.9. Samples: 1641584880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:58:16,994][12645] Avg episode reward: [(0, '0.671')] [2024-06-18 08:58:17,875][12883] Updated weights for policy 0, policy_version 100193 (0.0033) [2024-06-18 08:58:21,721][12883] Updated weights for policy 0, policy_version 100203 (0.0033) [2024-06-18 08:58:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42876.6). Total num frames: 1641725952. Throughput: 0: 42729.2. Samples: 1641837240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 08:58:21,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 08:58:25,474][12883] Updated weights for policy 0, policy_version 100213 (0.0034) [2024-06-18 08:58:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1641938944. Throughput: 0: 42807.1. Samples: 1642093480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:58:26,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 08:58:29,158][12883] Updated weights for policy 0, policy_version 100223 (0.0031) [2024-06-18 08:58:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 1642119168. Throughput: 0: 42701.3. Samples: 1642224860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:58:31,995][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 08:58:33,006][12883] Updated weights for policy 0, policy_version 100233 (0.0036) [2024-06-18 08:58:36,491][12862] Signal inference workers to stop experience collection... (23950 times) [2024-06-18 08:58:36,492][12862] Signal inference workers to resume experience collection... 
(23950 times) [2024-06-18 08:58:36,513][12883] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-06-18 08:58:36,513][12883] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-06-18 08:58:36,639][12883] Updated weights for policy 0, policy_version 100243 (0.0036) [2024-06-18 08:58:36,994][12645] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1642381312. Throughput: 0: 42635.4. Samples: 1642480480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:58:36,994][12645] Avg episode reward: [(0, '0.639')] [2024-06-18 08:58:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100243_1642381312.pth... [2024-06-18 08:58:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099618_1632141312.pth [2024-06-18 08:58:40,930][12883] Updated weights for policy 0, policy_version 100253 (0.0032) [2024-06-18 08:58:41,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1642594304. Throughput: 0: 42721.0. Samples: 1642738940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:58:41,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 08:58:44,198][12883] Updated weights for policy 0, policy_version 100263 (0.0036) [2024-06-18 08:58:46,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1642774528. Throughput: 0: 42677.0. Samples: 1642869100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:58:46,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 08:58:48,526][12883] Updated weights for policy 0, policy_version 100273 (0.0030) [2024-06-18 08:58:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1643020288. Throughput: 0: 42672.4. Samples: 1643125520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:58:51,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 08:58:52,015][12883] Updated weights for policy 0, policy_version 100283 (0.0041) [2024-06-18 08:58:56,332][12883] Updated weights for policy 0, policy_version 100293 (0.0042) [2024-06-18 08:58:56,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1643233280. Throughput: 0: 42947.5. Samples: 1643389320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:58:56,995][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 08:58:59,638][12883] Updated weights for policy 0, policy_version 100303 (0.0032) [2024-06-18 08:59:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1643413504. Throughput: 0: 42879.2. Samples: 1643514440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:59:01,994][12645] Avg episode reward: [(0, '0.654')] [2024-06-18 08:59:03,966][12883] Updated weights for policy 0, policy_version 100313 (0.0029) [2024-06-18 08:59:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1643659264. Throughput: 0: 42943.7. Samples: 1643769700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:59:06,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 08:59:07,227][12883] Updated weights for policy 0, policy_version 100323 (0.0028) [2024-06-18 08:59:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1643839488. Throughput: 0: 43008.4. Samples: 1644028860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:59:11,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 08:59:12,012][12883] Updated weights for policy 0, policy_version 100333 (0.0034) [2024-06-18 08:59:14,707][12883] Updated weights for policy 0, policy_version 100343 (0.0035) [2024-06-18 08:59:17,000][12645] Fps is (10 sec: 40934.7, 60 sec: 42867.0, 300 sec: 42653.0). Total num frames: 1644068864. Throughput: 0: 42727.0. Samples: 1644147840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:59:17,000][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 08:59:19,514][12883] Updated weights for policy 0, policy_version 100353 (0.0033) [2024-06-18 08:59:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1644298240. Throughput: 0: 42834.7. Samples: 1644408040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 08:59:21,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 08:59:22,330][12883] Updated weights for policy 0, policy_version 100363 (0.0023) [2024-06-18 08:59:26,909][12883] Updated weights for policy 0, policy_version 100373 (0.0046) [2024-06-18 08:59:26,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1644511232. Throughput: 0: 42935.9. Samples: 1644671060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 08:59:26,998][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 08:59:30,055][12883] Updated weights for policy 0, policy_version 100383 (0.0036) [2024-06-18 08:59:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 1644724224. Throughput: 0: 42731.4. Samples: 1644792020. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 08:59:31,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 08:59:34,706][12883] Updated weights for policy 0, policy_version 100393 (0.0028) [2024-06-18 08:59:37,000][12645] Fps is (10 sec: 44209.3, 60 sec: 42867.1, 300 sec: 42819.7). Total num frames: 1644953600. Throughput: 0: 42966.4. Samples: 1645059280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 08:59:37,001][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 08:59:37,830][12883] Updated weights for policy 0, policy_version 100403 (0.0026) [2024-06-18 08:59:41,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 1645150208. Throughput: 0: 42795.3. Samples: 1645315200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 08:59:41,996][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 08:59:42,523][12883] Updated weights for policy 0, policy_version 100413 (0.0039) [2024-06-18 08:59:45,274][12883] Updated weights for policy 0, policy_version 100423 (0.0029) [2024-06-18 08:59:46,994][12645] Fps is (10 sec: 40985.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1645363200. Throughput: 0: 42738.1. Samples: 1645437660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 08:59:46,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 08:59:50,165][12883] Updated weights for policy 0, policy_version 100433 (0.0045) [2024-06-18 08:59:51,994][12645] Fps is (10 sec: 45885.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1645608960. Throughput: 0: 43013.0. Samples: 1645705280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 08:59:51,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 08:59:53,298][12883] Updated weights for policy 0, policy_version 100443 (0.0048) [2024-06-18 08:59:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1645789184. Throughput: 0: 42988.8. Samples: 1645963360. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 08:59:56,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 08:59:57,608][12883] Updated weights for policy 0, policy_version 100453 (0.0041) [2024-06-18 09:00:00,947][12883] Updated weights for policy 0, policy_version 100463 (0.0031) [2024-06-18 09:00:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1646018560. Throughput: 0: 43064.1. Samples: 1646085460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 09:00:01,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 09:00:05,051][12883] Updated weights for policy 0, policy_version 100473 (0.0036) [2024-06-18 09:00:06,713][12862] Signal inference workers to stop experience collection... (24000 times) [2024-06-18 09:00:06,713][12862] Signal inference workers to resume experience collection... (24000 times) [2024-06-18 09:00:06,723][12883] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-06-18 09:00:06,723][12883] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-06-18 09:00:06,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 1646264320. Throughput: 0: 43183.7. Samples: 1646351300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 09:00:06,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 09:00:08,549][12883] Updated weights for policy 0, policy_version 100483 (0.0043) [2024-06-18 09:00:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1646444544. Throughput: 0: 42919.5. Samples: 1646602440. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 09:00:11,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 09:00:12,720][12883] Updated weights for policy 0, policy_version 100493 (0.0035) [2024-06-18 09:00:16,443][12883] Updated weights for policy 0, policy_version 100503 (0.0033) [2024-06-18 09:00:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 43149.1, 300 sec: 42709.8). Total num frames: 1646657536. Throughput: 0: 43022.8. Samples: 1646728040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 09:00:16,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 09:00:20,237][12883] Updated weights for policy 0, policy_version 100513 (0.0047) [2024-06-18 09:00:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1646886912. Throughput: 0: 42765.1. Samples: 1646983440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 09:00:21,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 09:00:24,186][12883] Updated weights for policy 0, policy_version 100523 (0.0035) [2024-06-18 09:00:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1647083520. Throughput: 0: 42992.3. Samples: 1647249760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:00:26,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 09:00:27,979][12883] Updated weights for policy 0, policy_version 100533 (0.0030) [2024-06-18 09:00:31,639][12883] Updated weights for policy 0, policy_version 100543 (0.0040) [2024-06-18 09:00:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1647296512. Throughput: 0: 43000.1. Samples: 1647372660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:00:31,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 09:00:35,575][12883] Updated weights for policy 0, policy_version 100553 (0.0026) [2024-06-18 09:00:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42875.9, 300 sec: 42820.6). Total num frames: 1647525888. Throughput: 0: 42839.4. Samples: 1647633060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:00:36,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 09:00:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100558_1647542272.pth... [2024-06-18 09:00:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000099930_1637253120.pth [2024-06-18 09:00:39,155][12883] Updated weights for policy 0, policy_version 100563 (0.0036) [2024-06-18 09:00:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 1647706112. Throughput: 0: 42949.8. Samples: 1647896100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:00:41,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 09:00:43,158][12883] Updated weights for policy 0, policy_version 100573 (0.0041) [2024-06-18 09:00:46,715][12883] Updated weights for policy 0, policy_version 100583 (0.0053) [2024-06-18 09:00:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1647951872. Throughput: 0: 42837.8. Samples: 1648013160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:00:46,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 09:00:50,654][12883] Updated weights for policy 0, policy_version 100593 (0.0041) [2024-06-18 09:00:51,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.8, 300 sec: 42931.3). Total num frames: 1648164864. Throughput: 0: 42722.7. Samples: 1648273920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:00:51,997][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 09:00:54,755][12883] Updated weights for policy 0, policy_version 100603 (0.0026) [2024-06-18 09:00:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 1648361472. Throughput: 0: 43018.2. Samples: 1648538260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:00:56,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 09:00:58,272][12883] Updated weights for policy 0, policy_version 100613 (0.0028) [2024-06-18 09:01:01,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1648590848. Throughput: 0: 42973.2. Samples: 1648661840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:01:01,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 09:01:02,337][12883] Updated weights for policy 0, policy_version 100623 (0.0040) [2024-06-18 09:01:06,058][12883] Updated weights for policy 0, policy_version 100633 (0.0037) [2024-06-18 09:01:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42931.7). Total num frames: 1648820224. Throughput: 0: 43064.8. Samples: 1648921360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:01:06,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 09:01:09,819][12883] Updated weights for policy 0, policy_version 100643 (0.0040) [2024-06-18 09:01:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1649000448. Throughput: 0: 42954.1. Samples: 1649182700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:01:11,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 09:01:13,910][12883] Updated weights for policy 0, policy_version 100653 (0.0033) [2024-06-18 09:01:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1649246208. Throughput: 0: 42970.1. Samples: 1649306320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:01:16,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 09:01:17,287][12883] Updated weights for policy 0, policy_version 100663 (0.0039) [2024-06-18 09:01:21,429][12883] Updated weights for policy 0, policy_version 100673 (0.0037) [2024-06-18 09:01:21,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 1649459200. Throughput: 0: 43088.2. Samples: 1649572020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 09:01:21,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 09:01:24,934][12883] Updated weights for policy 0, policy_version 100683 (0.0043) [2024-06-18 09:01:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 1649639424. Throughput: 0: 42825.8. Samples: 1649823260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:01:26,994][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 09:01:28,951][12883] Updated weights for policy 0, policy_version 100693 (0.0037) [2024-06-18 09:01:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 1649901568. Throughput: 0: 43036.4. Samples: 1649949800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:01:31,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 09:01:32,429][12883] Updated weights for policy 0, policy_version 100703 (0.0027) [2024-06-18 09:01:36,595][12883] Updated weights for policy 0, policy_version 100713 (0.0037) [2024-06-18 09:01:36,996][12645] Fps is (10 sec: 45864.7, 60 sec: 42869.9, 300 sec: 42931.6). Total num frames: 1650098176. Throughput: 0: 43142.7. Samples: 1650215340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:01:36,997][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 09:01:37,637][12862] Signal inference workers to stop experience collection... 
(24050 times) [2024-06-18 09:01:37,638][12862] Signal inference workers to resume experience collection... (24050 times) [2024-06-18 09:01:37,679][12883] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-06-18 09:01:37,680][12883] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-06-18 09:01:39,949][12883] Updated weights for policy 0, policy_version 100723 (0.0034) [2024-06-18 09:01:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1650294784. Throughput: 0: 42948.0. Samples: 1650470920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:01:41,994][12645] Avg episode reward: [(0, '0.611')] [2024-06-18 09:01:44,412][12883] Updated weights for policy 0, policy_version 100733 (0.0042) [2024-06-18 09:01:46,994][12645] Fps is (10 sec: 44247.0, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 1650540544. Throughput: 0: 43001.0. Samples: 1650596880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:01:46,994][12645] Avg episode reward: [(0, '0.645')] [2024-06-18 09:01:47,571][12883] Updated weights for policy 0, policy_version 100743 (0.0046) [2024-06-18 09:01:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 1650720768. Throughput: 0: 43048.5. Samples: 1650858540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:01:51,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 09:01:52,039][12883] Updated weights for policy 0, policy_version 100753 (0.0023) [2024-06-18 09:01:55,420][12883] Updated weights for policy 0, policy_version 100763 (0.0046) [2024-06-18 09:01:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 1650950144. Throughput: 0: 42821.0. Samples: 1651109640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:01:56,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 09:01:59,747][12883] Updated weights for policy 0, policy_version 100773 (0.0037) [2024-06-18 09:02:01,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 1651179520. Throughput: 0: 42955.9. Samples: 1651239340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:01,995][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 09:02:03,206][12883] Updated weights for policy 0, policy_version 100783 (0.0036) [2024-06-18 09:02:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1651359744. Throughput: 0: 42772.7. Samples: 1651496800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:06,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 09:02:07,553][12883] Updated weights for policy 0, policy_version 100793 (0.0026) [2024-06-18 09:02:10,787][12883] Updated weights for policy 0, policy_version 100803 (0.0042) [2024-06-18 09:02:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43417.6, 300 sec: 42931.9). Total num frames: 1651605504. Throughput: 0: 42694.5. Samples: 1651744520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:11,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 09:02:15,322][12883] Updated weights for policy 0, policy_version 100813 (0.0030) [2024-06-18 09:02:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1651818496. Throughput: 0: 42875.1. Samples: 1651879180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:16,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 09:02:18,567][12883] Updated weights for policy 0, policy_version 100823 (0.0046) [2024-06-18 09:02:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1651998720. Throughput: 0: 42519.9. Samples: 1652128640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:21,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 09:02:23,117][12883] Updated weights for policy 0, policy_version 100833 (0.0039) [2024-06-18 09:02:26,211][12883] Updated weights for policy 0, policy_version 100843 (0.0032) [2024-06-18 09:02:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42931.9). Total num frames: 1652244480. Throughput: 0: 42453.3. Samples: 1652381320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:26,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 09:02:30,853][12883] Updated weights for policy 0, policy_version 100853 (0.0028) [2024-06-18 09:02:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1652441088. Throughput: 0: 42530.6. Samples: 1652510760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:31,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 09:02:34,163][12883] Updated weights for policy 0, policy_version 100863 (0.0032) [2024-06-18 09:02:36,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42053.9, 300 sec: 42709.5). Total num frames: 1652621312. Throughput: 0: 42358.7. Samples: 1652764680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:36,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 09:02:37,062][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100869_1652637696.pth... [2024-06-18 09:02:37,123][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100243_1642381312.pth [2024-06-18 09:02:38,407][12883] Updated weights for policy 0, policy_version 100873 (0.0031) [2024-06-18 09:02:41,865][12883] Updated weights for policy 0, policy_version 100883 (0.0027) [2024-06-18 09:02:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1652867072. Throughput: 0: 42463.5. Samples: 1653020500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:41,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 09:02:45,990][12883] Updated weights for policy 0, policy_version 100893 (0.0039) [2024-06-18 09:02:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1653063680. Throughput: 0: 42467.2. Samples: 1653150360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:46,994][12645] Avg episode reward: [(0, '0.798')] [2024-06-18 09:02:47,015][12862] Saving new best policy, reward=0.798! [2024-06-18 09:02:49,501][12883] Updated weights for policy 0, policy_version 100903 (0.0040) [2024-06-18 09:02:51,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1653260288. Throughput: 0: 42165.9. Samples: 1653394260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:51,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 09:02:53,867][12883] Updated weights for policy 0, policy_version 100913 (0.0036) [2024-06-18 09:02:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1653489664. Throughput: 0: 42445.4. Samples: 1653654560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:02:56,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 09:02:57,381][12883] Updated weights for policy 0, policy_version 100923 (0.0037) [2024-06-18 09:03:01,406][12883] Updated weights for policy 0, policy_version 100933 (0.0024) [2024-06-18 09:03:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 1653702656. Throughput: 0: 42291.6. Samples: 1653782300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:03:01,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 09:03:02,369][12862] Signal inference workers to stop experience collection... 
(24100 times) [2024-06-18 09:03:02,408][12883] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-06-18 09:03:02,427][12862] Signal inference workers to resume experience collection... (24100 times) [2024-06-18 09:03:02,430][12883] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-06-18 09:03:05,007][12883] Updated weights for policy 0, policy_version 100943 (0.0035) [2024-06-18 09:03:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1653915648. Throughput: 0: 42344.0. Samples: 1654034120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:03:06,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 09:03:09,025][12883] Updated weights for policy 0, policy_version 100953 (0.0029) [2024-06-18 09:03:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42820.6). Total num frames: 1654128640. Throughput: 0: 42441.0. Samples: 1654291160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:03:11,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 09:03:12,630][12883] Updated weights for policy 0, policy_version 100963 (0.0036) [2024-06-18 09:03:16,579][12883] Updated weights for policy 0, policy_version 100973 (0.0027) [2024-06-18 09:03:16,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1654341632. Throughput: 0: 42466.5. Samples: 1654421760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:03:16,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 09:03:20,159][12883] Updated weights for policy 0, policy_version 100983 (0.0028) [2024-06-18 09:03:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1654571008. Throughput: 0: 42413.2. Samples: 1654673280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 09:03:21,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 09:03:24,597][12883] Updated weights for policy 0, policy_version 100993 (0.0023) [2024-06-18 09:03:26,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42987.2). Total num frames: 1654800384. Throughput: 0: 42649.7. Samples: 1654939740. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:03:26,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 09:03:27,859][12883] Updated weights for policy 0, policy_version 101003 (0.0035) [2024-06-18 09:03:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 1654964224. Throughput: 0: 42566.2. Samples: 1655065840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:03:31,994][12645] Avg episode reward: [(0, '0.283')] [2024-06-18 09:03:32,302][12883] Updated weights for policy 0, policy_version 101013 (0.0036) [2024-06-18 09:03:35,462][12883] Updated weights for policy 0, policy_version 101023 (0.0040) [2024-06-18 09:03:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1655226368. Throughput: 0: 42767.4. Samples: 1655318800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:03:36,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 09:03:39,901][12883] Updated weights for policy 0, policy_version 101033 (0.0039) [2024-06-18 09:03:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1655422976. Throughput: 0: 42899.2. Samples: 1655585020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:03:41,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 09:03:42,866][12883] Updated weights for policy 0, policy_version 101043 (0.0029) [2024-06-18 09:03:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1655619584. Throughput: 0: 42760.4. Samples: 1655706520. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:03:46,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 09:03:47,553][12883] Updated weights for policy 0, policy_version 101053 (0.0035) [2024-06-18 09:03:50,374][12883] Updated weights for policy 0, policy_version 101063 (0.0030) [2024-06-18 09:03:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1655848960. Throughput: 0: 42818.9. Samples: 1655960980. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:03:51,995][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 09:03:55,018][12883] Updated weights for policy 0, policy_version 101073 (0.0030) [2024-06-18 09:03:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1656045568. Throughput: 0: 43138.1. Samples: 1656232380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:03:56,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 09:03:58,291][12883] Updated weights for policy 0, policy_version 101083 (0.0028) [2024-06-18 09:04:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1656274944. Throughput: 0: 42911.2. Samples: 1656352760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:04:01,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 09:04:02,566][12883] Updated weights for policy 0, policy_version 101093 (0.0025) [2024-06-18 09:04:05,925][12883] Updated weights for policy 0, policy_version 101103 (0.0027) [2024-06-18 09:04:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 1656504320. Throughput: 0: 43113.8. Samples: 1656613400. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:04:06,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 09:04:10,237][12883] Updated weights for policy 0, policy_version 101113 (0.0025) [2024-06-18 09:04:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42821.5). Total num frames: 1656700928. Throughput: 0: 43057.0. Samples: 1656877300. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:04:11,994][12645] Avg episode reward: [(0, '0.661')] [2024-06-18 09:04:13,567][12883] Updated weights for policy 0, policy_version 101123 (0.0043) [2024-06-18 09:04:16,996][12645] Fps is (10 sec: 42589.2, 60 sec: 43143.0, 300 sec: 42820.2). Total num frames: 1656930304. Throughput: 0: 42950.8. Samples: 1656998720. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:04:16,997][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 09:04:17,819][12883] Updated weights for policy 0, policy_version 101133 (0.0040) [2024-06-18 09:04:19,178][12862] Signal inference workers to stop experience collection... (24150 times) [2024-06-18 09:04:19,179][12862] Signal inference workers to resume experience collection... (24150 times) [2024-06-18 09:04:19,212][12883] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-06-18 09:04:19,213][12883] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-06-18 09:04:21,129][12883] Updated weights for policy 0, policy_version 101143 (0.0037) [2024-06-18 09:04:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1657159680. Throughput: 0: 43139.2. Samples: 1657260060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-18 09:04:21,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 09:04:25,511][12883] Updated weights for policy 0, policy_version 101153 (0.0033) [2024-06-18 09:04:26,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1657356288. Throughput: 0: 43025.3. 
Samples: 1657521160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:04:26,994][12645] Avg episode reward: [(0, '0.625')] [2024-06-18 09:04:28,795][12883] Updated weights for policy 0, policy_version 101163 (0.0029) [2024-06-18 09:04:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43690.6, 300 sec: 42821.5). Total num frames: 1657585664. Throughput: 0: 43002.7. Samples: 1657641640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:04:31,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 09:04:32,922][12883] Updated weights for policy 0, policy_version 101173 (0.0036) [2024-06-18 09:04:36,485][12883] Updated weights for policy 0, policy_version 101183 (0.0026) [2024-06-18 09:04:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42876.4). Total num frames: 1657798656. Throughput: 0: 43139.3. Samples: 1657902240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:04:36,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 09:04:37,102][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101185_1657815040.pth... [2024-06-18 09:04:37,163][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100558_1647542272.pth [2024-06-18 09:04:41,017][12883] Updated weights for policy 0, policy_version 101193 (0.0028) [2024-06-18 09:04:41,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1657978880. Throughput: 0: 42889.8. Samples: 1658162420. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:04:41,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 09:04:44,302][12883] Updated weights for policy 0, policy_version 101203 (0.0041) [2024-06-18 09:04:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43690.7, 300 sec: 42820.5). Total num frames: 1658241024. Throughput: 0: 42851.1. Samples: 1658281060. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:04:47,007][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 09:04:48,596][12883] Updated weights for policy 0, policy_version 101213 (0.0045) [2024-06-18 09:04:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1658421248. Throughput: 0: 42932.9. Samples: 1658545380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:04:51,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 09:04:52,031][12883] Updated weights for policy 0, policy_version 101223 (0.0033) [2024-06-18 09:04:56,196][12883] Updated weights for policy 0, policy_version 101233 (0.0043) [2024-06-18 09:04:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1658617856. Throughput: 0: 42778.1. Samples: 1658802320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:04:56,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 09:04:59,699][12883] Updated weights for policy 0, policy_version 101243 (0.0033) [2024-06-18 09:05:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1658880000. Throughput: 0: 42859.0. Samples: 1658927280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:05:01,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 09:05:03,649][12883] Updated weights for policy 0, policy_version 101253 (0.0049) [2024-06-18 09:05:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1659043840. Throughput: 0: 42819.1. Samples: 1659186920. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:05:06,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 09:05:07,375][12883] Updated weights for policy 0, policy_version 101263 (0.0030) [2024-06-18 09:05:11,677][12883] Updated weights for policy 0, policy_version 101273 (0.0044) [2024-06-18 09:05:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1659273216. Throughput: 0: 42842.2. Samples: 1659449060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:05:11,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 09:05:14,863][12883] Updated weights for policy 0, policy_version 101283 (0.0031) [2024-06-18 09:05:16,994][12645] Fps is (10 sec: 47513.1, 60 sec: 43146.0, 300 sec: 42820.5). Total num frames: 1659518976. Throughput: 0: 43000.4. Samples: 1659576660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:05:16,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 09:05:19,030][12883] Updated weights for policy 0, policy_version 101293 (0.0038) [2024-06-18 09:05:21,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 1659699200. Throughput: 0: 42895.7. Samples: 1659832640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:05:21,996][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 09:05:22,551][12883] Updated weights for policy 0, policy_version 101303 (0.0028) [2024-06-18 09:05:26,559][12883] Updated weights for policy 0, policy_version 101313 (0.0026) [2024-06-18 09:05:26,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1659912192. Throughput: 0: 42871.7. Samples: 1660091640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 09:05:26,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 09:05:28,025][12862] Signal inference workers to stop experience collection... 
(24200 times) [2024-06-18 09:05:28,064][12883] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-06-18 09:05:28,096][12862] Signal inference workers to resume experience collection... (24200 times) [2024-06-18 09:05:28,100][12883] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-06-18 09:05:30,301][12883] Updated weights for policy 0, policy_version 101323 (0.0030) [2024-06-18 09:05:31,994][12645] Fps is (10 sec: 45885.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1660157952. Throughput: 0: 43095.5. Samples: 1660220360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:05:31,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 09:05:34,159][12883] Updated weights for policy 0, policy_version 101333 (0.0036) [2024-06-18 09:05:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1660354560. Throughput: 0: 42887.1. Samples: 1660475300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:05:36,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 09:05:37,845][12883] Updated weights for policy 0, policy_version 101343 (0.0043) [2024-06-18 09:05:41,856][12883] Updated weights for policy 0, policy_version 101353 (0.0030) [2024-06-18 09:05:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 1660567552. Throughput: 0: 43034.7. Samples: 1660738880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:05:41,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 09:05:45,442][12883] Updated weights for policy 0, policy_version 101363 (0.0036) [2024-06-18 09:05:46,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 1660813312. Throughput: 0: 42993.9. Samples: 1660862000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:05:46,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 09:05:49,816][12883] Updated weights for policy 0, policy_version 101373 (0.0033) [2024-06-18 09:05:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1660993536. Throughput: 0: 42927.2. Samples: 1661118640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:05:51,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 09:05:53,167][12883] Updated weights for policy 0, policy_version 101383 (0.0029) [2024-06-18 09:05:56,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1661190144. Throughput: 0: 42863.5. Samples: 1661377920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:05:56,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 09:05:57,390][12883] Updated weights for policy 0, policy_version 101393 (0.0035) [2024-06-18 09:06:00,761][12883] Updated weights for policy 0, policy_version 101403 (0.0036) [2024-06-18 09:06:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1661435904. Throughput: 0: 42748.0. Samples: 1661500320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:06:02,003][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 09:06:05,191][12883] Updated weights for policy 0, policy_version 101413 (0.0044) [2024-06-18 09:06:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1661648896. Throughput: 0: 42756.4. Samples: 1661756580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:06:06,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 09:06:08,782][12883] Updated weights for policy 0, policy_version 101423 (0.0033) [2024-06-18 09:06:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1661845504. Throughput: 0: 42673.8. Samples: 1662011960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:06:11,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 09:06:12,900][12883] Updated weights for policy 0, policy_version 101433 (0.0033) [2024-06-18 09:06:16,507][12883] Updated weights for policy 0, policy_version 101443 (0.0038) [2024-06-18 09:06:16,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1662042112. Throughput: 0: 42534.2. Samples: 1662134400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:06:16,998][12645] Avg episode reward: [(0, '0.700')] [2024-06-18 09:06:20,603][12883] Updated weights for policy 0, policy_version 101453 (0.0036) [2024-06-18 09:06:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 1662287872. Throughput: 0: 42652.9. Samples: 1662394680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:06:21,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 09:06:24,494][12883] Updated weights for policy 0, policy_version 101463 (0.0036) [2024-06-18 09:06:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1662484480. Throughput: 0: 42318.2. Samples: 1662643200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 09:06:26,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 09:06:28,114][12883] Updated weights for policy 0, policy_version 101473 (0.0032) [2024-06-18 09:06:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42654.3). Total num frames: 1662681088. Throughput: 0: 42311.5. Samples: 1662766020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:06:31,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 09:06:32,092][12883] Updated weights for policy 0, policy_version 101483 (0.0035) [2024-06-18 09:06:35,761][12883] Updated weights for policy 0, policy_version 101493 (0.0028) [2024-06-18 09:06:36,996][12645] Fps is (10 sec: 45864.6, 60 sec: 43142.9, 300 sec: 42875.8). Total num frames: 1662943232. Throughput: 0: 42471.6. Samples: 1663029960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:06:36,997][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 09:06:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101498_1662943232.pth... [2024-06-18 09:06:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000100869_1652637696.pth [2024-06-18 09:06:39,773][12883] Updated weights for policy 0, policy_version 101503 (0.0034) [2024-06-18 09:06:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1663107072. Throughput: 0: 42472.1. Samples: 1663289160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:06:41,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 09:06:43,202][12883] Updated weights for policy 0, policy_version 101513 (0.0031) [2024-06-18 09:06:43,668][12862] Signal inference workers to stop experience collection... (24250 times) [2024-06-18 09:06:43,669][12862] Signal inference workers to resume experience collection... (24250 times) [2024-06-18 09:06:43,713][12883] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-06-18 09:06:43,713][12883] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-06-18 09:06:46,994][12645] Fps is (10 sec: 37692.1, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1663320064. Throughput: 0: 42451.8. Samples: 1663410640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:06:46,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 09:06:47,276][12883] Updated weights for policy 0, policy_version 101523 (0.0030) [2024-06-18 09:06:50,746][12883] Updated weights for policy 0, policy_version 101533 (0.0034) [2024-06-18 09:06:51,994][12645] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1663582208. Throughput: 0: 42690.7. Samples: 1663677660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:06:51,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 09:06:55,347][12883] Updated weights for policy 0, policy_version 101543 (0.0037) [2024-06-18 09:06:56,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1663762432. Throughput: 0: 42692.8. Samples: 1663933140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:06:56,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 09:06:58,356][12883] Updated weights for policy 0, policy_version 101553 (0.0046) [2024-06-18 09:07:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1663975424. Throughput: 0: 42709.4. Samples: 1664056320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:07:01,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 09:07:02,923][12883] Updated weights for policy 0, policy_version 101563 (0.0026) [2024-06-18 09:07:06,271][12883] Updated weights for policy 0, policy_version 101573 (0.0021) [2024-06-18 09:07:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1664204800. Throughput: 0: 42687.6. Samples: 1664315620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:07:06,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 09:07:10,362][12883] Updated weights for policy 0, policy_version 101583 (0.0042) [2024-06-18 09:07:11,995][12645] Fps is (10 sec: 44231.1, 60 sec: 42870.5, 300 sec: 42709.3). Total num frames: 1664417792. Throughput: 0: 42919.6. Samples: 1664574640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:07:11,995][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 09:07:13,700][12883] Updated weights for policy 0, policy_version 101593 (0.0038) [2024-06-18 09:07:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1664647168. Throughput: 0: 43039.5. Samples: 1664702800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:07:16,994][12645] Avg episode reward: [(0, '0.594')] [2024-06-18 09:07:17,799][12883] Updated weights for policy 0, policy_version 101603 (0.0026) [2024-06-18 09:07:21,315][12883] Updated weights for policy 0, policy_version 101613 (0.0028) [2024-06-18 09:07:21,994][12645] Fps is (10 sec: 42604.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1664843776. Throughput: 0: 42943.1. Samples: 1664962300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:07:21,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 09:07:25,262][12883] Updated weights for policy 0, policy_version 101623 (0.0036) [2024-06-18 09:07:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1665073152. Throughput: 0: 42987.9. Samples: 1665223620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 09:07:26,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 09:07:28,797][12883] Updated weights for policy 0, policy_version 101633 (0.0039) [2024-06-18 09:07:31,996][12645] Fps is (10 sec: 44226.3, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 1665286144. Throughput: 0: 43072.0. Samples: 1665348980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:07:31,997][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 09:07:32,705][12883] Updated weights for policy 0, policy_version 101643 (0.0041) [2024-06-18 09:07:36,247][12883] Updated weights for policy 0, policy_version 101653 (0.0033) [2024-06-18 09:07:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42820.5). Total num frames: 1665499136. Throughput: 0: 42948.8. Samples: 1665610360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:07:36,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 09:07:40,167][12883] Updated weights for policy 0, policy_version 101663 (0.0036) [2024-06-18 09:07:41,994][12645] Fps is (10 sec: 40969.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1665695744. Throughput: 0: 43016.1. Samples: 1665868860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:07:41,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 09:07:44,038][12883] Updated weights for policy 0, policy_version 101673 (0.0028) [2024-06-18 09:07:46,996][12645] Fps is (10 sec: 42589.0, 60 sec: 43415.9, 300 sec: 42931.3). Total num frames: 1665925120. Throughput: 0: 43119.2. Samples: 1665996780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:07:46,996][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 09:07:47,647][12883] Updated weights for policy 0, policy_version 101683 (0.0033) [2024-06-18 09:07:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 1666121728. Throughput: 0: 43143.8. Samples: 1666257100. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:07:51,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 09:07:52,159][12883] Updated weights for policy 0, policy_version 101693 (0.0027) [2024-06-18 09:07:55,343][12883] Updated weights for policy 0, policy_version 101703 (0.0027) [2024-06-18 09:07:56,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1666334720. Throughput: 0: 43019.3. Samples: 1666510460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:07:56,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 09:07:59,879][12883] Updated weights for policy 0, policy_version 101713 (0.0030) [2024-06-18 09:08:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1666564096. Throughput: 0: 43018.3. Samples: 1666638620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:08:01,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 09:08:02,927][12883] Updated weights for policy 0, policy_version 101723 (0.0032) [2024-06-18 09:08:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1666760704. Throughput: 0: 42833.3. Samples: 1666889800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:08:06,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 09:08:07,554][12883] Updated weights for policy 0, policy_version 101733 (0.0030) [2024-06-18 09:08:07,649][12862] Signal inference workers to stop experience collection... (24300 times) [2024-06-18 09:08:07,701][12883] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-06-18 09:08:07,706][12862] Signal inference workers to resume experience collection... 
(24300 times) [2024-06-18 09:08:07,716][12883] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-06-18 09:08:11,247][12883] Updated weights for policy 0, policy_version 101743 (0.0039) [2024-06-18 09:08:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42599.3, 300 sec: 42820.6). Total num frames: 1666973696. Throughput: 0: 42673.8. Samples: 1667143940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:08:11,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 09:08:15,160][12883] Updated weights for policy 0, policy_version 101753 (0.0027) [2024-06-18 09:08:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1667203072. Throughput: 0: 42812.8. Samples: 1667275460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:08:16,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 09:08:18,817][12883] Updated weights for policy 0, policy_version 101763 (0.0045) [2024-06-18 09:08:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1667399680. Throughput: 0: 42672.5. Samples: 1667530620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:08:22,000][12645] Avg episode reward: [(0, '0.647')] [2024-06-18 09:08:22,722][12883] Updated weights for policy 0, policy_version 101773 (0.0032) [2024-06-18 09:08:26,343][12883] Updated weights for policy 0, policy_version 101783 (0.0027) [2024-06-18 09:08:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1667612672. Throughput: 0: 42620.3. Samples: 1667786780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:08:26,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 09:08:30,471][12883] Updated weights for policy 0, policy_version 101793 (0.0035) [2024-06-18 09:08:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42600.1, 300 sec: 42765.1). Total num frames: 1667842048. Throughput: 0: 42675.1. Samples: 1667917060. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:08:31,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 09:08:33,897][12883] Updated weights for policy 0, policy_version 101803 (0.0033) [2024-06-18 09:08:36,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1668055040. Throughput: 0: 42657.1. Samples: 1668176660. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:08:36,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 09:08:37,038][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101811_1668071424.pth... [2024-06-18 09:08:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101185_1657815040.pth [2024-06-18 09:08:38,267][12883] Updated weights for policy 0, policy_version 101813 (0.0032) [2024-06-18 09:08:41,646][12883] Updated weights for policy 0, policy_version 101823 (0.0029) [2024-06-18 09:08:41,995][12645] Fps is (10 sec: 42591.0, 60 sec: 42870.3, 300 sec: 42875.9). Total num frames: 1668268032. Throughput: 0: 42574.1. Samples: 1668426360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:08:41,996][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 09:08:45,861][12883] Updated weights for policy 0, policy_version 101833 (0.0042) [2024-06-18 09:08:46,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42599.9, 300 sec: 42820.6). Total num frames: 1668481024. Throughput: 0: 42628.3. Samples: 1668556900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:08:46,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 09:08:49,312][12883] Updated weights for policy 0, policy_version 101843 (0.0046) [2024-06-18 09:08:51,994][12645] Fps is (10 sec: 44244.4, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 1668710400. Throughput: 0: 42871.6. Samples: 1668819020. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:08:51,994][12645] Avg episode reward: [(0, '0.717')] [2024-06-18 09:08:53,342][12883] Updated weights for policy 0, policy_version 101853 (0.0041) [2024-06-18 09:08:56,995][12645] Fps is (10 sec: 42593.2, 60 sec: 42870.6, 300 sec: 42820.4). Total num frames: 1668907008. Throughput: 0: 42893.0. Samples: 1669074180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:08:56,995][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 09:08:57,082][12883] Updated weights for policy 0, policy_version 101863 (0.0032) [2024-06-18 09:09:00,739][12883] Updated weights for policy 0, policy_version 101873 (0.0032) [2024-06-18 09:09:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1669136384. Throughput: 0: 42726.8. Samples: 1669198160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:09:01,994][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 09:09:04,703][12883] Updated weights for policy 0, policy_version 101883 (0.0024) [2024-06-18 09:09:06,994][12645] Fps is (10 sec: 44243.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1669349376. Throughput: 0: 42984.5. Samples: 1669464920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:09:06,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 09:09:08,205][12883] Updated weights for policy 0, policy_version 101893 (0.0040) [2024-06-18 09:09:09,080][12862] Signal inference workers to stop experience collection... (24350 times) [2024-06-18 09:09:09,080][12862] Signal inference workers to resume experience collection... (24350 times) [2024-06-18 09:09:09,114][12883] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-06-18 09:09:09,114][12883] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-06-18 09:09:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42765.4). Total num frames: 1669545984. 
Throughput: 0: 43050.0. Samples: 1669724020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:09:11,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 09:09:12,253][12883] Updated weights for policy 0, policy_version 101903 (0.0044) [2024-06-18 09:09:15,849][12883] Updated weights for policy 0, policy_version 101913 (0.0040) [2024-06-18 09:09:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1669775360. Throughput: 0: 42926.1. Samples: 1669848740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:09:16,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 09:09:19,689][12883] Updated weights for policy 0, policy_version 101923 (0.0029) [2024-06-18 09:09:21,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 1670021120. Throughput: 0: 42977.2. Samples: 1670110640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:09:21,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 09:09:23,319][12883] Updated weights for policy 0, policy_version 101933 (0.0034) [2024-06-18 09:09:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1670217728. Throughput: 0: 43267.7. Samples: 1670373340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-18 09:09:26,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 09:09:27,239][12883] Updated weights for policy 0, policy_version 101943 (0.0035) [2024-06-18 09:09:31,063][12883] Updated weights for policy 0, policy_version 101953 (0.0030) [2024-06-18 09:09:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 43144.3, 300 sec: 42820.5). Total num frames: 1670430720. Throughput: 0: 43071.5. Samples: 1670495120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:09:31,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 09:09:34,822][12883] Updated weights for policy 0, policy_version 101963 (0.0027) [2024-06-18 09:09:36,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43690.6, 300 sec: 43042.7). Total num frames: 1670676480. Throughput: 0: 43139.4. Samples: 1670760300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:09:36,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 09:09:38,755][12883] Updated weights for policy 0, policy_version 101973 (0.0036) [2024-06-18 09:09:41,996][12645] Fps is (10 sec: 42589.7, 60 sec: 43144.1, 300 sec: 42764.7). Total num frames: 1670856704. Throughput: 0: 43300.9. Samples: 1671022760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:09:41,996][12645] Avg episode reward: [(0, '0.621')] [2024-06-18 09:09:42,725][12883] Updated weights for policy 0, policy_version 101983 (0.0036) [2024-06-18 09:09:46,423][12883] Updated weights for policy 0, policy_version 101993 (0.0030) [2024-06-18 09:09:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 1671086080. Throughput: 0: 43245.3. Samples: 1671144200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:09:46,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 09:09:50,238][12883] Updated weights for policy 0, policy_version 102003 (0.0037) [2024-06-18 09:09:51,994][12645] Fps is (10 sec: 44246.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1671299072. Throughput: 0: 43126.7. Samples: 1671405620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:09:51,994][12645] Avg episode reward: [(0, '0.197')] [2024-06-18 09:09:53,914][12883] Updated weights for policy 0, policy_version 102013 (0.0032) [2024-06-18 09:09:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43145.4, 300 sec: 42765.0). Total num frames: 1671495680. Throughput: 0: 43162.1. Samples: 1671666320. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:09:56,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 09:09:57,769][12883] Updated weights for policy 0, policy_version 102023 (0.0025) [2024-06-18 09:10:01,438][12883] Updated weights for policy 0, policy_version 102033 (0.0030) [2024-06-18 09:10:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 1671725056. Throughput: 0: 43182.1. Samples: 1671791940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:10:01,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 09:10:05,774][12883] Updated weights for policy 0, policy_version 102043 (0.0041) [2024-06-18 09:10:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 1671954432. Throughput: 0: 43223.6. Samples: 1672055700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:10:06,994][12645] Avg episode reward: [(0, '0.677')] [2024-06-18 09:10:09,145][12883] Updated weights for policy 0, policy_version 102053 (0.0026) [2024-06-18 09:10:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1672118272. Throughput: 0: 43035.9. Samples: 1672309960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:10:11,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 09:10:13,280][12883] Updated weights for policy 0, policy_version 102063 (0.0029) [2024-06-18 09:10:16,678][12883] Updated weights for policy 0, policy_version 102073 (0.0047) [2024-06-18 09:10:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42987.5). Total num frames: 1672380416. Throughput: 0: 43029.1. Samples: 1672431420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:10:16,994][12645] Avg episode reward: [(0, '0.201')] [2024-06-18 09:10:20,853][12883] Updated weights for policy 0, policy_version 102083 (0.0038) [2024-06-18 09:10:21,994][12645] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1672593408. Throughput: 0: 43037.0. Samples: 1672696960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:10:21,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 09:10:24,370][12883] Updated weights for policy 0, policy_version 102093 (0.0032) [2024-06-18 09:10:26,996][12645] Fps is (10 sec: 39312.6, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1672773632. Throughput: 0: 42846.2. Samples: 1672950840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:10:26,997][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 09:10:28,474][12883] Updated weights for policy 0, policy_version 102103 (0.0026) [2024-06-18 09:10:31,916][12883] Updated weights for policy 0, policy_version 102113 (0.0031) [2024-06-18 09:10:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1673019392. Throughput: 0: 42919.4. Samples: 1673075580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:10:31,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 09:10:35,921][12883] Updated weights for policy 0, policy_version 102123 (0.0031) [2024-06-18 09:10:36,419][12862] Signal inference workers to stop experience collection... (24400 times) [2024-06-18 09:10:36,420][12862] Signal inference workers to resume experience collection... (24400 times) [2024-06-18 09:10:36,463][12883] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-06-18 09:10:36,463][12883] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-06-18 09:10:36,994][12645] Fps is (10 sec: 44247.3, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1673216000. 
Throughput: 0: 42993.4. Samples: 1673340320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:10:36,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 09:10:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102126_1673232384.pth... [2024-06-18 09:10:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101498_1662943232.pth [2024-06-18 09:10:39,525][12883] Updated weights for policy 0, policy_version 102133 (0.0047) [2024-06-18 09:10:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 1673428992. Throughput: 0: 42793.8. Samples: 1673592040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:10:41,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 09:10:43,849][12883] Updated weights for policy 0, policy_version 102143 (0.0035) [2024-06-18 09:10:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1673658368. Throughput: 0: 42829.0. Samples: 1673719240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:10:46,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 09:10:47,238][12883] Updated weights for policy 0, policy_version 102153 (0.0034) [2024-06-18 09:10:51,695][12883] Updated weights for policy 0, policy_version 102163 (0.0031) [2024-06-18 09:10:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 1673871360. Throughput: 0: 42735.6. Samples: 1673978800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:10:51,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 09:10:54,897][12883] Updated weights for policy 0, policy_version 102173 (0.0031) [2024-06-18 09:10:56,996][12645] Fps is (10 sec: 39312.6, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1674051584. Throughput: 0: 42692.7. Samples: 1674231220. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:10:56,997][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 09:10:59,229][12883] Updated weights for policy 0, policy_version 102183 (0.0040) [2024-06-18 09:11:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1674297344. Throughput: 0: 42821.2. Samples: 1674358380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:11:01,995][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 09:11:02,474][12883] Updated weights for policy 0, policy_version 102193 (0.0031) [2024-06-18 09:11:06,771][12883] Updated weights for policy 0, policy_version 102203 (0.0030) [2024-06-18 09:11:06,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1674493952. Throughput: 0: 42746.7. Samples: 1674620560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:11:06,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 09:11:10,601][12883] Updated weights for policy 0, policy_version 102213 (0.0046) [2024-06-18 09:11:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1674706944. Throughput: 0: 42656.2. Samples: 1674870280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:11:11,994][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 09:11:14,518][12883] Updated weights for policy 0, policy_version 102223 (0.0035) [2024-06-18 09:11:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1674936320. Throughput: 0: 42775.7. Samples: 1675000480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:11:16,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 09:11:18,113][12883] Updated weights for policy 0, policy_version 102233 (0.0047) [2024-06-18 09:11:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.2, 300 sec: 42820.5). Total num frames: 1675116544. Throughput: 0: 42580.8. Samples: 1675256460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:11:21,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 09:11:22,174][12883] Updated weights for policy 0, policy_version 102243 (0.0031) [2024-06-18 09:11:25,936][12883] Updated weights for policy 0, policy_version 102253 (0.0035) [2024-06-18 09:11:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 1675345920. Throughput: 0: 42493.3. Samples: 1675504240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 09:11:26,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 09:11:29,767][12883] Updated weights for policy 0, policy_version 102263 (0.0035) [2024-06-18 09:11:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.6, 300 sec: 42820.9). Total num frames: 1675575296. Throughput: 0: 42627.1. Samples: 1675637460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:11:31,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 09:11:33,691][12883] Updated weights for policy 0, policy_version 102273 (0.0028) [2024-06-18 09:11:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 1675771904. Throughput: 0: 42541.7. Samples: 1675893180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:11:36,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 09:11:37,498][12883] Updated weights for policy 0, policy_version 102283 (0.0031) [2024-06-18 09:11:41,602][12883] Updated weights for policy 0, policy_version 102293 (0.0042) [2024-06-18 09:11:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1675984896. Throughput: 0: 42540.4. Samples: 1676145440. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:11:41,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 09:11:45,273][12883] Updated weights for policy 0, policy_version 102303 (0.0045) [2024-06-18 09:11:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1676214272. Throughput: 0: 42561.8. Samples: 1676273660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:11:46,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 09:11:49,204][12883] Updated weights for policy 0, policy_version 102313 (0.0040) [2024-06-18 09:11:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 1676394496. Throughput: 0: 42388.0. Samples: 1676528020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:11:51,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 09:11:52,964][12883] Updated weights for policy 0, policy_version 102323 (0.0038) [2024-06-18 09:11:56,733][12883] Updated weights for policy 0, policy_version 102333 (0.0044) [2024-06-18 09:11:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42873.0, 300 sec: 42876.1). Total num frames: 1676623872. Throughput: 0: 42563.2. Samples: 1676785620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:11:56,994][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 09:12:00,526][12883] Updated weights for policy 0, policy_version 102343 (0.0041) [2024-06-18 09:12:01,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1676853248. Throughput: 0: 42579.1. Samples: 1676916540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:12:01,994][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 09:12:02,078][12862] Signal inference workers to stop experience collection... (24450 times) [2024-06-18 09:12:02,078][12862] Signal inference workers to resume experience collection... 
(24450 times) [2024-06-18 09:12:02,097][12883] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-06-18 09:12:02,098][12883] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-06-18 09:12:04,507][12883] Updated weights for policy 0, policy_version 102353 (0.0030) [2024-06-18 09:12:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42765.2). Total num frames: 1677033472. Throughput: 0: 42544.0. Samples: 1677170940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:12:06,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 09:12:07,980][12883] Updated weights for policy 0, policy_version 102363 (0.0028) [2024-06-18 09:12:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1677246464. Throughput: 0: 42771.0. Samples: 1677428940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:12:11,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 09:12:12,218][12883] Updated weights for policy 0, policy_version 102373 (0.0027) [2024-06-18 09:12:15,611][12883] Updated weights for policy 0, policy_version 102383 (0.0035) [2024-06-18 09:12:16,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1677492224. Throughput: 0: 42724.0. Samples: 1677560040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:12:16,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 09:12:19,818][12883] Updated weights for policy 0, policy_version 102393 (0.0033) [2024-06-18 09:12:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1677656064. Throughput: 0: 42600.9. Samples: 1677810220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:12:21,994][12645] Avg episode reward: [(0, '0.829')] [2024-06-18 09:12:22,039][12862] Saving new best policy, reward=0.829! 
[2024-06-18 09:12:23,312][12883] Updated weights for policy 0, policy_version 102403 (0.0034) [2024-06-18 09:12:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 1677885440. Throughput: 0: 42736.4. Samples: 1678068580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 09:12:26,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 09:12:27,749][12883] Updated weights for policy 0, policy_version 102413 (0.0023) [2024-06-18 09:12:30,900][12883] Updated weights for policy 0, policy_version 102423 (0.0031) [2024-06-18 09:12:31,994][12645] Fps is (10 sec: 47514.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1678131200. Throughput: 0: 42821.5. Samples: 1678200620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:12:31,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 09:12:35,224][12883] Updated weights for policy 0, policy_version 102433 (0.0036) [2024-06-18 09:12:36,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1678311424. Throughput: 0: 42846.9. Samples: 1678456140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:12:36,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 09:12:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102436_1678311424.pth... [2024-06-18 09:12:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000101811_1668071424.pth [2024-06-18 09:12:38,602][12883] Updated weights for policy 0, policy_version 102443 (0.0021) [2024-06-18 09:12:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 1678540800. Throughput: 0: 42897.9. Samples: 1678716020. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:12:41,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 09:12:42,884][12883] Updated weights for policy 0, policy_version 102453 (0.0042) [2024-06-18 09:12:46,078][12883] Updated weights for policy 0, policy_version 102463 (0.0035) [2024-06-18 09:12:46,994][12645] Fps is (10 sec: 47514.4, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 1678786560. Throughput: 0: 42828.1. Samples: 1678843800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:12:46,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 09:12:50,485][12883] Updated weights for policy 0, policy_version 102473 (0.0036) [2024-06-18 09:12:51,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 1678966784. Throughput: 0: 42767.9. Samples: 1679095500. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:12:51,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 09:12:53,720][12883] Updated weights for policy 0, policy_version 102483 (0.0031) [2024-06-18 09:12:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1679179776. Throughput: 0: 42805.3. Samples: 1679355180. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:12:56,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 09:12:58,105][12883] Updated weights for policy 0, policy_version 102493 (0.0033) [2024-06-18 09:13:01,331][12883] Updated weights for policy 0, policy_version 102503 (0.0030) [2024-06-18 09:13:01,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1679409152. Throughput: 0: 42812.4. Samples: 1679486600. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:13:01,994][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 09:13:05,842][12883] Updated weights for policy 0, policy_version 102513 (0.0031) [2024-06-18 09:13:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1679605760. Throughput: 0: 42944.9. Samples: 1679742740. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:13:06,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 09:13:09,007][12883] Updated weights for policy 0, policy_version 102523 (0.0039) [2024-06-18 09:13:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1679818752. Throughput: 0: 42792.0. Samples: 1679994220. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:13:11,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 09:13:13,557][12883] Updated weights for policy 0, policy_version 102533 (0.0031) [2024-06-18 09:13:16,641][12883] Updated weights for policy 0, policy_version 102543 (0.0034) [2024-06-18 09:13:16,996][12645] Fps is (10 sec: 45865.3, 60 sec: 42869.8, 300 sec: 42931.3). Total num frames: 1680064512. Throughput: 0: 42741.3. Samples: 1680124080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:13:16,996][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 09:13:21,427][12883] Updated weights for policy 0, policy_version 102553 (0.0028) [2024-06-18 09:13:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1680261120. Throughput: 0: 42651.6. Samples: 1680375460. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:13:22,008][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 09:13:23,965][12862] Signal inference workers to stop experience collection... (24500 times) [2024-06-18 09:13:23,966][12862] Signal inference workers to resume experience collection... 
(24500 times) [2024-06-18 09:13:24,010][12883] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-06-18 09:13:24,010][12883] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-06-18 09:13:24,286][12883] Updated weights for policy 0, policy_version 102563 (0.0032) [2024-06-18 09:13:26,994][12645] Fps is (10 sec: 40969.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1680474112. Throughput: 0: 42520.4. Samples: 1680629440. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 09:13:26,996][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 09:13:29,142][12883] Updated weights for policy 0, policy_version 102573 (0.0044) [2024-06-18 09:13:31,877][12883] Updated weights for policy 0, policy_version 102583 (0.0034) [2024-06-18 09:13:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1680719872. Throughput: 0: 42568.4. Samples: 1680759380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:13:31,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 09:13:36,675][12883] Updated weights for policy 0, policy_version 102593 (0.0025) [2024-06-18 09:13:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.2). Total num frames: 1680883712. Throughput: 0: 42664.0. Samples: 1681015380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:13:36,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 09:13:39,650][12883] Updated weights for policy 0, policy_version 102603 (0.0046) [2024-06-18 09:13:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1681129472. Throughput: 0: 42469.9. Samples: 1681266320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:13:41,994][12645] Avg episode reward: [(0, '0.594')] [2024-06-18 09:13:44,256][12883] Updated weights for policy 0, policy_version 102613 (0.0033) [2024-06-18 09:13:46,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1681342464. Throughput: 0: 42543.6. Samples: 1681401060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:13:46,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 09:13:47,491][12883] Updated weights for policy 0, policy_version 102623 (0.0028) [2024-06-18 09:13:51,903][12883] Updated weights for policy 0, policy_version 102633 (0.0035) [2024-06-18 09:13:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.6, 300 sec: 42820.7). Total num frames: 1681539072. Throughput: 0: 42516.0. Samples: 1681655960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:13:51,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 09:13:55,125][12883] Updated weights for policy 0, policy_version 102643 (0.0045) [2024-06-18 09:13:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1681768448. Throughput: 0: 42451.6. Samples: 1681904540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:13:56,994][12645] Avg episode reward: [(0, '0.187')] [2024-06-18 09:14:00,078][12883] Updated weights for policy 0, policy_version 102653 (0.0030) [2024-06-18 09:14:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1681965056. Throughput: 0: 42587.3. Samples: 1682040420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:14:01,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 09:14:02,774][12883] Updated weights for policy 0, policy_version 102663 (0.0041) [2024-06-18 09:14:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1682161664. Throughput: 0: 42655.6. Samples: 1682294960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:14:06,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 09:14:07,698][12883] Updated weights for policy 0, policy_version 102673 (0.0036) [2024-06-18 09:14:10,654][12883] Updated weights for policy 0, policy_version 102683 (0.0030) [2024-06-18 09:14:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 1682423808. Throughput: 0: 42605.7. Samples: 1682546700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:14:11,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 09:14:15,231][12883] Updated weights for policy 0, policy_version 102693 (0.0031) [2024-06-18 09:14:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1682604032. Throughput: 0: 42748.9. Samples: 1682683080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:14:16,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 09:14:18,329][12883] Updated weights for policy 0, policy_version 102703 (0.0022) [2024-06-18 09:14:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1682817024. Throughput: 0: 42582.8. Samples: 1682931600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:14:21,996][12645] Avg episode reward: [(0, '0.134')] [2024-06-18 09:14:22,783][12883] Updated weights for policy 0, policy_version 102713 (0.0024) [2024-06-18 09:14:26,098][12883] Updated weights for policy 0, policy_version 102723 (0.0040) [2024-06-18 09:14:26,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1683062784. Throughput: 0: 42752.0. Samples: 1683190160. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 09:14:26,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 09:14:30,397][12883] Updated weights for policy 0, policy_version 102733 (0.0042) [2024-06-18 09:14:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1683226624. Throughput: 0: 42679.0. Samples: 1683321620. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:14:32,003][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 09:14:34,092][12883] Updated weights for policy 0, policy_version 102743 (0.0039) [2024-06-18 09:14:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 1683472384. Throughput: 0: 42638.7. Samples: 1683574700. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:14:36,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 09:14:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102751_1683472384.pth... [2024-06-18 09:14:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102126_1673232384.pth [2024-06-18 09:14:38,053][12883] Updated weights for policy 0, policy_version 102753 (0.0029) [2024-06-18 09:14:40,077][12862] Signal inference workers to stop experience collection... (24550 times) [2024-06-18 09:14:40,077][12862] Signal inference workers to resume experience collection... (24550 times) [2024-06-18 09:14:40,101][12883] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-06-18 09:14:40,101][12883] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-06-18 09:14:41,692][12883] Updated weights for policy 0, policy_version 102763 (0.0036) [2024-06-18 09:14:41,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1683685376. Throughput: 0: 42834.3. Samples: 1683832080. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:14:41,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 09:14:45,622][12883] Updated weights for policy 0, policy_version 102773 (0.0034) [2024-06-18 09:14:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1683881984. Throughput: 0: 42613.5. Samples: 1683958020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:14:46,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 09:14:49,208][12883] Updated weights for policy 0, policy_version 102783 (0.0053) [2024-06-18 09:14:51,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1684127744. Throughput: 0: 42604.3. Samples: 1684212160. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:14:51,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 09:14:53,170][12883] Updated weights for policy 0, policy_version 102793 (0.0026) [2024-06-18 09:14:56,855][12883] Updated weights for policy 0, policy_version 102803 (0.0029) [2024-06-18 09:14:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1684324352. Throughput: 0: 42901.9. Samples: 1684477280. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:14:56,994][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 09:15:00,739][12883] Updated weights for policy 0, policy_version 102813 (0.0039) [2024-06-18 09:15:01,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1684520960. Throughput: 0: 42619.6. Samples: 1684600960. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:15:01,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 09:15:04,700][12883] Updated weights for policy 0, policy_version 102823 (0.0035) [2024-06-18 09:15:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1684766720. Throughput: 0: 42730.3. Samples: 1684854460. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:15:06,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 09:15:08,638][12883] Updated weights for policy 0, policy_version 102833 (0.0029) [2024-06-18 09:15:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1684946944. Throughput: 0: 42892.0. Samples: 1685120300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:15:11,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 09:15:12,197][12883] Updated weights for policy 0, policy_version 102843 (0.0041) [2024-06-18 09:15:16,269][12883] Updated weights for policy 0, policy_version 102853 (0.0027) [2024-06-18 09:15:16,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1685176320. Throughput: 0: 42770.7. Samples: 1685246300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:15:16,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 09:15:20,126][12883] Updated weights for policy 0, policy_version 102863 (0.0031) [2024-06-18 09:15:21,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42869.9, 300 sec: 42765.0). Total num frames: 1685389312. Throughput: 0: 42712.5. Samples: 1685496860. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:15:21,996][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 09:15:23,797][12883] Updated weights for policy 0, policy_version 102873 (0.0031) [2024-06-18 09:15:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1685585920. Throughput: 0: 42730.2. Samples: 1685754940. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 09:15:26,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 09:15:27,713][12883] Updated weights for policy 0, policy_version 102883 (0.0046) [2024-06-18 09:15:31,383][12883] Updated weights for policy 0, policy_version 102893 (0.0029) [2024-06-18 09:15:31,996][12645] Fps is (10 sec: 44237.0, 60 sec: 43416.0, 300 sec: 42764.7). Total num frames: 1685831680. Throughput: 0: 42715.7. Samples: 1685880320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:15:31,996][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 09:15:35,427][12883] Updated weights for policy 0, policy_version 102903 (0.0035) [2024-06-18 09:15:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1686028288. Throughput: 0: 42800.7. Samples: 1686138180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:15:36,994][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 09:15:38,928][12883] Updated weights for policy 0, policy_version 102913 (0.0027) [2024-06-18 09:15:41,994][12645] Fps is (10 sec: 39330.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1686224896. Throughput: 0: 42649.4. Samples: 1686396500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:15:41,994][12645] Avg episode reward: [(0, '0.148')] [2024-06-18 09:15:43,083][12883] Updated weights for policy 0, policy_version 102923 (0.0032) [2024-06-18 09:15:43,577][12862] Signal inference workers to stop experience collection... (24600 times) [2024-06-18 09:15:43,577][12862] Signal inference workers to resume experience collection... 
(24600 times) [2024-06-18 09:15:43,607][12883] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-06-18 09:15:43,608][12883] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-06-18 09:15:46,916][12883] Updated weights for policy 0, policy_version 102933 (0.0041) [2024-06-18 09:15:46,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1686454272. Throughput: 0: 42753.4. Samples: 1686524960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:15:46,996][12645] Avg episode reward: [(0, '0.261')] [2024-06-18 09:15:50,984][12883] Updated weights for policy 0, policy_version 102943 (0.0028) [2024-06-18 09:15:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42765.3). Total num frames: 1686667264. Throughput: 0: 42820.3. Samples: 1686781380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:15:51,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 09:15:54,509][12883] Updated weights for policy 0, policy_version 102953 (0.0039) [2024-06-18 09:15:56,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1686863872. Throughput: 0: 42535.5. Samples: 1687034400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:15:56,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 09:15:58,523][12883] Updated weights for policy 0, policy_version 102963 (0.0033) [2024-06-18 09:16:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1687109632. Throughput: 0: 42648.5. Samples: 1687165480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:16:01,994][12883] Updated weights for policy 0, policy_version 102973 (0.0033) [2024-06-18 09:16:01,994][12645] Avg episode reward: [(0, '0.601')] [2024-06-18 09:16:05,866][12883] Updated weights for policy 0, policy_version 102983 (0.0029) [2024-06-18 09:16:06,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42050.6, 300 sec: 42653.6). Total num frames: 1687289856. Throughput: 0: 42843.1. Samples: 1687424800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:16:06,996][12645] Avg episode reward: [(0, '0.767')] [2024-06-18 09:16:09,370][12883] Updated weights for policy 0, policy_version 102993 (0.0028) [2024-06-18 09:16:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1687519232. Throughput: 0: 42803.0. Samples: 1687681080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:16:11,994][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 09:16:13,611][12883] Updated weights for policy 0, policy_version 103003 (0.0024) [2024-06-18 09:16:16,994][12645] Fps is (10 sec: 44246.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1687732224. Throughput: 0: 42880.7. Samples: 1687809860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:16:16,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 09:16:17,260][12883] Updated weights for policy 0, policy_version 103013 (0.0040) [2024-06-18 09:16:21,381][12883] Updated weights for policy 0, policy_version 103023 (0.0028) [2024-06-18 09:16:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1687945216. Throughput: 0: 42944.8. Samples: 1688070700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:16:21,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 09:16:24,867][12883] Updated weights for policy 0, policy_version 103033 (0.0027) [2024-06-18 09:16:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1688158208. Throughput: 0: 42766.5. Samples: 1688321000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:16:26,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 09:16:29,135][12883] Updated weights for policy 0, policy_version 103043 (0.0039) [2024-06-18 09:16:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 1688371200. Throughput: 0: 42892.7. Samples: 1688455040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 09:16:32,000][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 09:16:32,556][12883] Updated weights for policy 0, policy_version 103053 (0.0037) [2024-06-18 09:16:36,835][12883] Updated weights for policy 0, policy_version 103063 (0.0033) [2024-06-18 09:16:36,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 1688584192. Throughput: 0: 42827.6. Samples: 1688708720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:16:36,997][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 09:16:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103063_1688584192.pth... [2024-06-18 09:16:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102436_1678311424.pth [2024-06-18 09:16:40,191][12883] Updated weights for policy 0, policy_version 103073 (0.0039) [2024-06-18 09:16:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1688813568. Throughput: 0: 42707.4. Samples: 1688956240. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:16:41,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 09:16:44,552][12883] Updated weights for policy 0, policy_version 103083 (0.0044) [2024-06-18 09:16:46,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42326.8, 300 sec: 42709.4). Total num frames: 1688993792. Throughput: 0: 42694.1. Samples: 1689086720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:16:46,995][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 09:16:47,783][12883] Updated weights for policy 0, policy_version 103093 (0.0035) [2024-06-18 09:16:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1689223168. Throughput: 0: 42590.5. Samples: 1689341280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:16:51,994][12645] Avg episode reward: [(0, '0.668')] [2024-06-18 09:16:52,114][12883] Updated weights for policy 0, policy_version 103103 (0.0030) [2024-06-18 09:16:55,697][12883] Updated weights for policy 0, policy_version 103113 (0.0044) [2024-06-18 09:16:56,994][12645] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1689436160. Throughput: 0: 42475.2. Samples: 1689592460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:16:56,994][12645] Avg episode reward: [(0, '0.633')] [2024-06-18 09:16:59,811][12883] Updated weights for policy 0, policy_version 103123 (0.0033) [2024-06-18 09:17:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1689632768. Throughput: 0: 42636.9. Samples: 1689728520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:17:01,995][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 09:17:03,365][12883] Updated weights for policy 0, policy_version 103133 (0.0039) [2024-06-18 09:17:06,995][12645] Fps is (10 sec: 42590.9, 60 sec: 42871.9, 300 sec: 42764.8). Total num frames: 1689862144. Throughput: 0: 42325.5. Samples: 1689975420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:17:06,996][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 09:17:07,492][12883] Updated weights for policy 0, policy_version 103143 (0.0032) [2024-06-18 09:17:11,206][12883] Updated weights for policy 0, policy_version 103153 (0.0047) [2024-06-18 09:17:11,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1690075136. Throughput: 0: 42411.7. Samples: 1690229520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:17:11,994][12645] Avg episode reward: [(0, '0.083')] [2024-06-18 09:17:14,812][12862] Signal inference workers to stop experience collection... (24650 times) [2024-06-18 09:17:14,867][12883] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-06-18 09:17:14,870][12862] Signal inference workers to resume experience collection... (24650 times) [2024-06-18 09:17:14,881][12883] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-06-18 09:17:15,191][12883] Updated weights for policy 0, policy_version 103163 (0.0030) [2024-06-18 09:17:16,996][12645] Fps is (10 sec: 40957.7, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 1690271744. Throughput: 0: 42318.8. Samples: 1690359480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:17:16,996][12645] Avg episode reward: [(0, '0.203')] [2024-06-18 09:17:18,839][12883] Updated weights for policy 0, policy_version 103173 (0.0034) [2024-06-18 09:17:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1690501120. Throughput: 0: 42351.1. Samples: 1690614420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:17:21,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 09:17:22,957][12883] Updated weights for policy 0, policy_version 103183 (0.0032) [2024-06-18 09:17:26,536][12883] Updated weights for policy 0, policy_version 103193 (0.0042) [2024-06-18 09:17:26,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1690714112. Throughput: 0: 42384.5. Samples: 1690863540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:17:26,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 09:17:30,543][12883] Updated weights for policy 0, policy_version 103203 (0.0025) [2024-06-18 09:17:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1690910720. Throughput: 0: 42373.1. Samples: 1690993500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 09:17:31,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 09:17:34,661][12883] Updated weights for policy 0, policy_version 103213 (0.0031) [2024-06-18 09:17:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 1691140096. Throughput: 0: 42469.4. Samples: 1691252400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:17:36,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 09:17:38,130][12883] Updated weights for policy 0, policy_version 103223 (0.0036) [2024-06-18 09:17:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1691353088. Throughput: 0: 42577.8. Samples: 1691508460. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:17:41,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 09:17:42,328][12883] Updated weights for policy 0, policy_version 103233 (0.0024) [2024-06-18 09:17:46,093][12883] Updated weights for policy 0, policy_version 103243 (0.0036) [2024-06-18 09:17:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1691549696. Throughput: 0: 42553.4. Samples: 1691643420. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:17:46,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 09:17:49,992][12883] Updated weights for policy 0, policy_version 103253 (0.0032) [2024-06-18 09:17:51,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 1691779072. Throughput: 0: 42538.2. Samples: 1691889660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:17:51,997][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 09:17:54,153][12883] Updated weights for policy 0, policy_version 103263 (0.0032) [2024-06-18 09:17:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1691992064. Throughput: 0: 42579.9. Samples: 1692145620. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:17:56,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 09:17:57,495][12883] Updated weights for policy 0, policy_version 103273 (0.0033) [2024-06-18 09:18:01,626][12883] Updated weights for policy 0, policy_version 103283 (0.0032) [2024-06-18 09:18:01,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 1692188672. Throughput: 0: 42771.5. Samples: 1692284100. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:18:01,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 09:18:05,158][12883] Updated weights for policy 0, policy_version 103293 (0.0042) [2024-06-18 09:18:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42326.5, 300 sec: 42653.9). Total num frames: 1692401664. Throughput: 0: 42679.6. Samples: 1692535000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:18:06,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 09:18:09,121][12883] Updated weights for policy 0, policy_version 103303 (0.0033) [2024-06-18 09:18:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 1692631040. Throughput: 0: 42752.5. Samples: 1692787400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:18:11,994][12645] Avg episode reward: [(0, '0.108')] [2024-06-18 09:18:12,733][12883] Updated weights for policy 0, policy_version 103313 (0.0024) [2024-06-18 09:18:16,592][12883] Updated weights for policy 0, policy_version 103323 (0.0037) [2024-06-18 09:18:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 1692844032. Throughput: 0: 42825.8. Samples: 1692920660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:18:16,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 09:18:20,444][12883] Updated weights for policy 0, policy_version 103333 (0.0040) [2024-06-18 09:18:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1693057024. Throughput: 0: 42685.9. Samples: 1693173360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:18:21,996][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 09:18:24,237][12883] Updated weights for policy 0, policy_version 103343 (0.0038) [2024-06-18 09:18:26,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 1693286400. Throughput: 0: 42740.4. Samples: 1693431880. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:18:26,997][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 09:18:28,186][12883] Updated weights for policy 0, policy_version 103353 (0.0031) [2024-06-18 09:18:31,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1693483008. Throughput: 0: 42629.8. Samples: 1693561760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 09:18:31,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 09:18:32,119][12883] Updated weights for policy 0, policy_version 103363 (0.0051) [2024-06-18 09:18:36,229][12883] Updated weights for policy 0, policy_version 103373 (0.0045) [2024-06-18 09:18:36,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1693696000. Throughput: 0: 42866.5. Samples: 1693818560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:18:36,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 09:18:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103375_1693696000.pth... [2024-06-18 09:18:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000102751_1683472384.pth [2024-06-18 09:18:39,639][12883] Updated weights for policy 0, policy_version 103383 (0.0028) [2024-06-18 09:18:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1693925376. Throughput: 0: 42757.7. Samples: 1694069720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:18:41,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 09:18:43,758][12883] Updated weights for policy 0, policy_version 103393 (0.0034) [2024-06-18 09:18:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1694121984. Throughput: 0: 42611.9. Samples: 1694201640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:18:46,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 09:18:47,270][12883] Updated weights for policy 0, policy_version 103403 (0.0027) [2024-06-18 09:18:49,504][12862] Signal inference workers to stop experience collection... (24700 times) [2024-06-18 09:18:49,537][12883] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-06-18 09:18:49,562][12862] Signal inference workers to resume experience collection... (24700 times) [2024-06-18 09:18:49,563][12883] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-06-18 09:18:51,299][12883] Updated weights for policy 0, policy_version 103413 (0.0041) [2024-06-18 09:18:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 1694334976. Throughput: 0: 42645.3. Samples: 1694454040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:18:51,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 09:18:54,817][12883] Updated weights for policy 0, policy_version 103423 (0.0035) [2024-06-18 09:18:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1694564352. Throughput: 0: 42795.2. Samples: 1694713180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:18:56,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 09:18:58,992][12883] Updated weights for policy 0, policy_version 103433 (0.0033) [2024-06-18 09:19:01,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1694777344. Throughput: 0: 42745.6. Samples: 1694844220. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:19:01,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 09:19:02,415][12883] Updated weights for policy 0, policy_version 103443 (0.0028) [2024-06-18 09:19:06,525][12883] Updated weights for policy 0, policy_version 103453 (0.0036) [2024-06-18 09:19:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1694973952. Throughput: 0: 42758.1. Samples: 1695097380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:19:06,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 09:19:10,258][12883] Updated weights for policy 0, policy_version 103463 (0.0034) [2024-06-18 09:19:11,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1695203328. Throughput: 0: 42637.8. Samples: 1695350480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:19:11,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 09:19:14,478][12883] Updated weights for policy 0, policy_version 103473 (0.0025) [2024-06-18 09:19:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1695399936. Throughput: 0: 42691.6. Samples: 1695482880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:19:16,994][12645] Avg episode reward: [(0, '0.068')] [2024-06-18 09:19:17,852][12883] Updated weights for policy 0, policy_version 103483 (0.0032) [2024-06-18 09:19:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 1695612928. Throughput: 0: 42635.2. Samples: 1695737140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:19:21,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 09:19:22,014][12883] Updated weights for policy 0, policy_version 103493 (0.0048) [2024-06-18 09:19:26,077][12883] Updated weights for policy 0, policy_version 103503 (0.0030) [2024-06-18 09:19:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1695842304. Throughput: 0: 42704.5. Samples: 1695991420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:19:26,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 09:19:29,899][12883] Updated weights for policy 0, policy_version 103513 (0.0039) [2024-06-18 09:19:31,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 1696038912. Throughput: 0: 42735.6. Samples: 1696124840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:19:31,996][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 09:19:33,452][12883] Updated weights for policy 0, policy_version 103523 (0.0025) [2024-06-18 09:19:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1696251904. Throughput: 0: 42816.9. Samples: 1696380800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:19:36,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 09:19:37,474][12883] Updated weights for policy 0, policy_version 103533 (0.0035) [2024-06-18 09:19:41,075][12883] Updated weights for policy 0, policy_version 103543 (0.0036) [2024-06-18 09:19:41,994][12645] Fps is (10 sec: 44246.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1696481280. Throughput: 0: 42651.8. Samples: 1696632520. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:19:41,995][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 09:19:45,113][12883] Updated weights for policy 0, policy_version 103553 (0.0030) [2024-06-18 09:19:46,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1696710656. Throughput: 0: 42770.8. Samples: 1696768900. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:19:46,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 09:19:48,518][12883] Updated weights for policy 0, policy_version 103563 (0.0030) [2024-06-18 09:19:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1696907264. Throughput: 0: 42881.7. Samples: 1697027060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:19:51,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 09:19:52,827][12883] Updated weights for policy 0, policy_version 103573 (0.0037) [2024-06-18 09:19:56,171][12883] Updated weights for policy 0, policy_version 103583 (0.0038) [2024-06-18 09:19:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1697120256. Throughput: 0: 42792.4. Samples: 1697276140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:19:56,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 09:20:00,415][12883] Updated weights for policy 0, policy_version 103593 (0.0034) [2024-06-18 09:20:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1697349632. Throughput: 0: 42806.6. Samples: 1697409180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:20:01,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 09:20:03,859][12883] Updated weights for policy 0, policy_version 103603 (0.0029) [2024-06-18 09:20:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1697546240. Throughput: 0: 42877.7. Samples: 1697666640. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:20:06,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 09:20:08,004][12883] Updated weights for policy 0, policy_version 103613 (0.0040) [2024-06-18 09:20:11,679][12883] Updated weights for policy 0, policy_version 103623 (0.0036) [2024-06-18 09:20:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1697775616. Throughput: 0: 42797.5. Samples: 1697917300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:20:11,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 09:20:15,672][12883] Updated weights for policy 0, policy_version 103633 (0.0040) [2024-06-18 09:20:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 1697972224. Throughput: 0: 42704.8. Samples: 1698046460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:20:16,994][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 09:20:17,375][12862] Signal inference workers to stop experience collection... (24750 times) [2024-06-18 09:20:17,375][12862] Signal inference workers to resume experience collection... (24750 times) [2024-06-18 09:20:17,420][12883] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-06-18 09:20:17,420][12883] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-06-18 09:20:19,353][12883] Updated weights for policy 0, policy_version 103643 (0.0039) [2024-06-18 09:20:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1698168832. Throughput: 0: 42565.2. Samples: 1698296240. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:20:21,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 09:20:23,776][12883] Updated weights for policy 0, policy_version 103653 (0.0049) [2024-06-18 09:20:26,992][12883] Updated weights for policy 0, policy_version 103663 (0.0042) [2024-06-18 09:20:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 1698414592. Throughput: 0: 42581.4. Samples: 1698548680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:20:26,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 09:20:31,374][12883] Updated weights for policy 0, policy_version 103673 (0.0034) [2024-06-18 09:20:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 1698611200. Throughput: 0: 42410.3. Samples: 1698677360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 22.0) [2024-06-18 09:20:31,994][12645] Avg episode reward: [(0, '0.296')] [2024-06-18 09:20:34,619][12883] Updated weights for policy 0, policy_version 103683 (0.0033) [2024-06-18 09:20:36,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1698791424. Throughput: 0: 42298.7. Samples: 1698930500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:20:36,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 09:20:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103686_1698791424.pth... [2024-06-18 09:20:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103063_1688584192.pth [2024-06-18 09:20:39,059][12883] Updated weights for policy 0, policy_version 103693 (0.0042) [2024-06-18 09:20:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42654.2). Total num frames: 1699037184. Throughput: 0: 42359.9. Samples: 1699182340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:20:41,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 09:20:42,366][12883] Updated weights for policy 0, policy_version 103703 (0.0038) [2024-06-18 09:20:46,681][12883] Updated weights for policy 0, policy_version 103713 (0.0033) [2024-06-18 09:20:46,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1699266560. Throughput: 0: 42434.3. Samples: 1699318720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:20:46,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 09:20:49,956][12883] Updated weights for policy 0, policy_version 103723 (0.0033) [2024-06-18 09:20:51,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1699430400. Throughput: 0: 42231.6. Samples: 1699567060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:20:51,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 09:20:54,278][12883] Updated weights for policy 0, policy_version 103733 (0.0037) [2024-06-18 09:20:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1699676160. Throughput: 0: 42456.4. Samples: 1699827840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:20:56,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 09:20:57,740][12883] Updated weights for policy 0, policy_version 103743 (0.0041) [2024-06-18 09:21:01,883][12883] Updated weights for policy 0, policy_version 103753 (0.0037) [2024-06-18 09:21:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 1699889152. Throughput: 0: 42474.2. Samples: 1699957800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:21:01,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 09:21:05,639][12883] Updated weights for policy 0, policy_version 103763 (0.0027) [2024-06-18 09:21:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1700085760. Throughput: 0: 42424.0. Samples: 1700205320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:21:06,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 09:21:09,721][12883] Updated weights for policy 0, policy_version 103773 (0.0032) [2024-06-18 09:21:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1700298752. Throughput: 0: 42472.5. Samples: 1700459940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:21:11,994][12645] Avg episode reward: [(0, '0.779')] [2024-06-18 09:21:13,384][12883] Updated weights for policy 0, policy_version 103783 (0.0042) [2024-06-18 09:21:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1700495360. Throughput: 0: 42408.1. Samples: 1700585720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:21:16,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 09:21:17,474][12883] Updated weights for policy 0, policy_version 103793 (0.0035) [2024-06-18 09:21:20,905][12883] Updated weights for policy 0, policy_version 103803 (0.0042) [2024-06-18 09:21:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1700741120. Throughput: 0: 42473.7. Samples: 1700841820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:21:21,998][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 09:21:25,034][12883] Updated weights for policy 0, policy_version 103813 (0.0031) [2024-06-18 09:21:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1700954112. Throughput: 0: 42386.8. Samples: 1701089740. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:21:26,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 09:21:28,850][12883] Updated weights for policy 0, policy_version 103823 (0.0034) [2024-06-18 09:21:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 1701134336. Throughput: 0: 42350.7. Samples: 1701224500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 09:21:31,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 09:21:32,785][12883] Updated weights for policy 0, policy_version 103833 (0.0035) [2024-06-18 09:21:36,401][12883] Updated weights for policy 0, policy_version 103843 (0.0034) [2024-06-18 09:21:36,671][12862] Signal inference workers to stop experience collection... (24800 times) [2024-06-18 09:21:36,672][12862] Signal inference workers to resume experience collection... (24800 times) [2024-06-18 09:21:36,716][12883] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-06-18 09:21:36,716][12883] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-06-18 09:21:36,996][12645] Fps is (10 sec: 44227.0, 60 sec: 43416.0, 300 sec: 42653.6). Total num frames: 1701396480. Throughput: 0: 42682.8. Samples: 1701487880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:21:36,997][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 09:21:40,702][12883] Updated weights for policy 0, policy_version 103853 (0.0040) [2024-06-18 09:21:41,996][12645] Fps is (10 sec: 45864.7, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 1701593088. Throughput: 0: 42405.0. Samples: 1701736160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:21:41,997][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 09:21:44,144][12883] Updated weights for policy 0, policy_version 103863 (0.0051) [2024-06-18 09:21:46,994][12645] Fps is (10 sec: 37691.8, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 1701773312. 
Throughput: 0: 42212.0. Samples: 1701857340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:21:46,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 09:21:48,523][12883] Updated weights for policy 0, policy_version 103873 (0.0031) [2024-06-18 09:21:51,963][12883] Updated weights for policy 0, policy_version 103883 (0.0030) [2024-06-18 09:21:51,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1702019072. Throughput: 0: 42563.2. Samples: 1702120660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:21:51,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 09:21:56,087][12883] Updated weights for policy 0, policy_version 103893 (0.0036) [2024-06-18 09:21:56,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1702232064. Throughput: 0: 42573.6. Samples: 1702375760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:21:56,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 09:21:59,592][12883] Updated weights for policy 0, policy_version 103903 (0.0035) [2024-06-18 09:22:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42543.1). Total num frames: 1702412288. Throughput: 0: 42567.5. Samples: 1702501260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:22:01,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 09:22:03,587][12883] Updated weights for policy 0, policy_version 103913 (0.0028) [2024-06-18 09:22:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1702641664. Throughput: 0: 42542.2. Samples: 1702756220. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:22:06,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 09:22:07,374][12883] Updated weights for policy 0, policy_version 103923 (0.0031) [2024-06-18 09:22:11,532][12883] Updated weights for policy 0, policy_version 103933 (0.0033) [2024-06-18 09:22:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1702854656. Throughput: 0: 42719.6. Samples: 1703012120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:22:11,994][12645] Avg episode reward: [(0, '0.224')] [2024-06-18 09:22:15,094][12883] Updated weights for policy 0, policy_version 103943 (0.0033) [2024-06-18 09:22:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1703051264. Throughput: 0: 42583.5. Samples: 1703140760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:22:16,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 09:22:19,122][12883] Updated weights for policy 0, policy_version 103953 (0.0024) [2024-06-18 09:22:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1703280640. Throughput: 0: 42457.3. Samples: 1703398360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:22:21,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 09:22:22,819][12883] Updated weights for policy 0, policy_version 103963 (0.0029) [2024-06-18 09:22:26,668][12883] Updated weights for policy 0, policy_version 103973 (0.0035) [2024-06-18 09:22:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1703493632. Throughput: 0: 42591.8. Samples: 1703652700. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:22:26,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 09:22:30,511][12883] Updated weights for policy 0, policy_version 103983 (0.0040) [2024-06-18 09:22:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1703706624. Throughput: 0: 42791.5. Samples: 1703782960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 09:22:31,994][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 09:22:34,526][12883] Updated weights for policy 0, policy_version 103993 (0.0038) [2024-06-18 09:22:36,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 1703936000. Throughput: 0: 42646.7. Samples: 1704039860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:22:36,997][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 09:22:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104000_1703936000.pth... [2024-06-18 09:22:37,084][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103375_1693696000.pth [2024-06-18 09:22:38,253][12883] Updated weights for policy 0, policy_version 104003 (0.0037) [2024-06-18 09:22:41,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42325.4, 300 sec: 42653.6). Total num frames: 1704132608. Throughput: 0: 42626.5. Samples: 1704294040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:22:41,996][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 09:22:42,083][12883] Updated weights for policy 0, policy_version 104013 (0.0028) [2024-06-18 09:22:45,827][12883] Updated weights for policy 0, policy_version 104023 (0.0042) [2024-06-18 09:22:46,994][12645] Fps is (10 sec: 42608.0, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 1704361984. Throughput: 0: 42616.4. Samples: 1704419000. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:22:46,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 09:22:49,737][12883] Updated weights for policy 0, policy_version 104033 (0.0034) [2024-06-18 09:22:51,996][12645] Fps is (10 sec: 42598.6, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 1704558592. Throughput: 0: 42786.9. Samples: 1704681720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:22:51,996][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 09:22:53,649][12883] Updated weights for policy 0, policy_version 104043 (0.0036) [2024-06-18 09:22:56,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 1704771584. Throughput: 0: 42610.7. Samples: 1704929700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:22:56,996][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 09:22:57,375][12883] Updated weights for policy 0, policy_version 104053 (0.0036) [2024-06-18 09:23:01,269][12883] Updated weights for policy 0, policy_version 104063 (0.0023) [2024-06-18 09:23:01,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1704984576. Throughput: 0: 42649.4. Samples: 1705059980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:23:01,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 09:23:04,967][12883] Updated weights for policy 0, policy_version 104073 (0.0031) [2024-06-18 09:23:06,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1705197568. Throughput: 0: 42646.6. Samples: 1705317460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:23:06,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 09:23:07,028][12862] Signal inference workers to stop experience collection... (24850 times) [2024-06-18 09:23:07,028][12862] Signal inference workers to resume experience collection... 
(24850 times) [2024-06-18 09:23:07,048][12883] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-06-18 09:23:07,048][12883] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-06-18 09:23:08,783][12883] Updated weights for policy 0, policy_version 104083 (0.0041) [2024-06-18 09:23:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1705410560. Throughput: 0: 42583.3. Samples: 1705568940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:23:11,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 09:23:12,633][12883] Updated weights for policy 0, policy_version 104093 (0.0027) [2024-06-18 09:23:16,333][12883] Updated weights for policy 0, policy_version 104103 (0.0042) [2024-06-18 09:23:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 1705623552. Throughput: 0: 42588.9. Samples: 1705699460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:23:16,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 09:23:20,431][12883] Updated weights for policy 0, policy_version 104113 (0.0041) [2024-06-18 09:23:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 1705820160. Throughput: 0: 42491.1. Samples: 1705951860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:23:21,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 09:23:24,420][12883] Updated weights for policy 0, policy_version 104123 (0.0042) [2024-06-18 09:23:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1706065920. Throughput: 0: 42552.8. Samples: 1706208820. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:23:26,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 09:23:28,396][12883] Updated weights for policy 0, policy_version 104133 (0.0036) [2024-06-18 09:23:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1706262528. Throughput: 0: 42699.6. Samples: 1706340480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 09:23:31,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 09:23:32,162][12883] Updated weights for policy 0, policy_version 104143 (0.0051) [2024-06-18 09:23:36,127][12883] Updated weights for policy 0, policy_version 104153 (0.0032) [2024-06-18 09:23:36,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 1706459136. Throughput: 0: 42484.3. Samples: 1706593420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:23:36,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 09:23:39,686][12883] Updated weights for policy 0, policy_version 104163 (0.0031) [2024-06-18 09:23:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 1706688512. Throughput: 0: 42722.6. Samples: 1706852120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:23:41,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 09:23:43,981][12883] Updated weights for policy 0, policy_version 104173 (0.0031) [2024-06-18 09:23:46,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1706901504. Throughput: 0: 42604.2. Samples: 1706977180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:23:46,995][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 09:23:47,846][12883] Updated weights for policy 0, policy_version 104183 (0.0025) [2024-06-18 09:23:51,707][12883] Updated weights for policy 0, policy_version 104193 (0.0037) [2024-06-18 09:23:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42326.8, 300 sec: 42487.3). Total num frames: 1707098112. Throughput: 0: 42532.8. Samples: 1707231440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:23:51,994][12645] Avg episode reward: [(0, '0.624')] [2024-06-18 09:23:55,277][12883] Updated weights for policy 0, policy_version 104203 (0.0043) [2024-06-18 09:23:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 1707343872. Throughput: 0: 42640.8. Samples: 1707487780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:23:56,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 09:23:59,328][12883] Updated weights for policy 0, policy_version 104213 (0.0037) [2024-06-18 09:24:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1707540480. Throughput: 0: 42646.3. Samples: 1707618540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:24:01,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 09:24:02,716][12883] Updated weights for policy 0, policy_version 104223 (0.0045) [2024-06-18 09:24:06,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1707720704. Throughput: 0: 42803.1. Samples: 1707878000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:24:06,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 09:24:07,244][12883] Updated weights for policy 0, policy_version 104233 (0.0029) [2024-06-18 09:24:10,350][12883] Updated weights for policy 0, policy_version 104243 (0.0033) [2024-06-18 09:24:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1707982848. Throughput: 0: 42633.7. Samples: 1708127340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:24:11,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 09:24:14,698][12862] Signal inference workers to stop experience collection... (24900 times) [2024-06-18 09:24:14,699][12862] Signal inference workers to resume experience collection... (24900 times) [2024-06-18 09:24:14,712][12883] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-06-18 09:24:14,740][12883] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-06-18 09:24:14,857][12883] Updated weights for policy 0, policy_version 104253 (0.0029) [2024-06-18 09:24:16,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1708179456. Throughput: 0: 42803.9. Samples: 1708266660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:24:16,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 09:24:18,321][12883] Updated weights for policy 0, policy_version 104263 (0.0031) [2024-06-18 09:24:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1708376064. Throughput: 0: 42566.5. Samples: 1708508920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:24:21,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 09:24:22,428][12883] Updated weights for policy 0, policy_version 104273 (0.0046) [2024-06-18 09:24:25,818][12883] Updated weights for policy 0, policy_version 104283 (0.0039) [2024-06-18 09:24:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1708621824. Throughput: 0: 42539.1. Samples: 1708766380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:24:26,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 09:24:29,915][12883] Updated weights for policy 0, policy_version 104293 (0.0033) [2024-06-18 09:24:31,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1708802048. Throughput: 0: 42749.2. Samples: 1708900880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-18 09:24:31,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 09:24:33,417][12883] Updated weights for policy 0, policy_version 104303 (0.0029) [2024-06-18 09:24:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1709031424. Throughput: 0: 42562.2. Samples: 1709146740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:24:36,995][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 09:24:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104311_1709031424.pth... [2024-06-18 09:24:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000103686_1698791424.pth [2024-06-18 09:24:37,510][12883] Updated weights for policy 0, policy_version 104313 (0.0031) [2024-06-18 09:24:41,238][12883] Updated weights for policy 0, policy_version 104323 (0.0034) [2024-06-18 09:24:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1709260800. Throughput: 0: 42566.3. Samples: 1709403260. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:24:41,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 09:24:45,237][12883] Updated weights for policy 0, policy_version 104333 (0.0031) [2024-06-18 09:24:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1709457408. Throughput: 0: 42585.3. Samples: 1709534880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:24:46,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 09:24:48,957][12883] Updated weights for policy 0, policy_version 104343 (0.0035) [2024-06-18 09:24:51,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 1709670400. Throughput: 0: 42411.6. Samples: 1709786620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:24:51,997][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 09:24:53,008][12883] Updated weights for policy 0, policy_version 104353 (0.0029) [2024-06-18 09:24:56,537][12883] Updated weights for policy 0, policy_version 104363 (0.0028) [2024-06-18 09:24:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1709883392. Throughput: 0: 42602.2. Samples: 1710044440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:24:56,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 09:25:00,570][12883] Updated weights for policy 0, policy_version 104373 (0.0039) [2024-06-18 09:25:01,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1710096384. Throughput: 0: 42451.2. Samples: 1710176960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:25:01,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 09:25:04,338][12883] Updated weights for policy 0, policy_version 104383 (0.0038) [2024-06-18 09:25:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.5, 300 sec: 42542.8). Total num frames: 1710325760. Throughput: 0: 42778.3. Samples: 1710433940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:25:06,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 09:25:08,233][12883] Updated weights for policy 0, policy_version 104393 (0.0035) [2024-06-18 09:25:11,926][12883] Updated weights for policy 0, policy_version 104403 (0.0043) [2024-06-18 09:25:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1710538752. Throughput: 0: 42797.7. Samples: 1710692280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:25:11,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 09:25:15,744][12883] Updated weights for policy 0, policy_version 104413 (0.0040) [2024-06-18 09:25:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1710751744. Throughput: 0: 42687.1. Samples: 1710821900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:25:16,996][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 09:25:19,574][12883] Updated weights for policy 0, policy_version 104423 (0.0031) [2024-06-18 09:25:21,994][12645] Fps is (10 sec: 44237.5, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 1710981120. Throughput: 0: 42884.6. Samples: 1711076540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:25:21,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 09:25:23,590][12883] Updated weights for policy 0, policy_version 104433 (0.0042) [2024-06-18 09:25:26,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1711161344. Throughput: 0: 42927.0. Samples: 1711334980. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:25:26,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 09:25:27,438][12883] Updated weights for policy 0, policy_version 104443 (0.0039) [2024-06-18 09:25:31,113][12883] Updated weights for policy 0, policy_version 104453 (0.0036) [2024-06-18 09:25:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1711390720. Throughput: 0: 42681.8. Samples: 1711455560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:25:31,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 09:25:35,198][12883] Updated weights for policy 0, policy_version 104463 (0.0036) [2024-06-18 09:25:36,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1711620096. Throughput: 0: 42774.9. Samples: 1711711400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 09:25:36,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 09:25:38,999][12883] Updated weights for policy 0, policy_version 104473 (0.0050) [2024-06-18 09:25:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1711783936. Throughput: 0: 42740.4. Samples: 1711967760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:25:41,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 09:25:42,882][12862] Signal inference workers to stop experience collection... (24950 times) [2024-06-18 09:25:42,928][12883] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-06-18 09:25:42,931][12862] Signal inference workers to resume experience collection... 
(24950 times) [2024-06-18 09:25:42,943][12883] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-06-18 09:25:43,071][12883] Updated weights for policy 0, policy_version 104483 (0.0044) [2024-06-18 09:25:46,547][12883] Updated weights for policy 0, policy_version 104493 (0.0037) [2024-06-18 09:25:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1712029696. Throughput: 0: 42372.0. Samples: 1712083700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:25:46,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 09:25:50,716][12883] Updated weights for policy 0, policy_version 104503 (0.0036) [2024-06-18 09:25:51,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 1712259072. Throughput: 0: 42523.0. Samples: 1712347480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:25:51,994][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 09:25:54,100][12883] Updated weights for policy 0, policy_version 104513 (0.0040) [2024-06-18 09:25:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1712406528. Throughput: 0: 42628.9. Samples: 1712610580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:25:56,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 09:25:58,471][12883] Updated weights for policy 0, policy_version 104523 (0.0026) [2024-06-18 09:26:01,813][12883] Updated weights for policy 0, policy_version 104533 (0.0030) [2024-06-18 09:26:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1712668672. Throughput: 0: 42369.7. Samples: 1712728440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:26:01,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 09:26:06,076][12883] Updated weights for policy 0, policy_version 104543 (0.0028) [2024-06-18 09:26:06,994][12645] Fps is (10 sec: 47514.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1712881664. Throughput: 0: 42636.5. Samples: 1712995180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:26:06,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 09:26:09,371][12883] Updated weights for policy 0, policy_version 104553 (0.0044) [2024-06-18 09:26:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 1713061888. Throughput: 0: 42561.0. Samples: 1713250220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:26:11,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 09:26:13,773][12883] Updated weights for policy 0, policy_version 104563 (0.0040) [2024-06-18 09:26:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 1713307648. Throughput: 0: 42513.0. Samples: 1713368640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:26:16,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 09:26:17,018][12883] Updated weights for policy 0, policy_version 104573 (0.0041) [2024-06-18 09:26:21,541][12883] Updated weights for policy 0, policy_version 104583 (0.0033) [2024-06-18 09:26:21,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1713520640. Throughput: 0: 42590.9. Samples: 1713627980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:26:21,994][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 09:26:25,038][12883] Updated weights for policy 0, policy_version 104593 (0.0038) [2024-06-18 09:26:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1713700864. Throughput: 0: 42552.1. Samples: 1713882600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:26:26,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 09:26:29,269][12883] Updated weights for policy 0, policy_version 104603 (0.0028) [2024-06-18 09:26:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 1713946624. Throughput: 0: 42758.3. Samples: 1714007820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:26:31,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 09:26:32,674][12883] Updated weights for policy 0, policy_version 104613 (0.0038) [2024-06-18 09:26:36,780][12883] Updated weights for policy 0, policy_version 104623 (0.0036) [2024-06-18 09:26:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 1714143232. Throughput: 0: 42689.4. Samples: 1714268500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 09:26:36,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 09:26:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104623_1714143232.pth... [2024-06-18 09:26:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104000_1703936000.pth [2024-06-18 09:26:40,431][12883] Updated weights for policy 0, policy_version 104633 (0.0041) [2024-06-18 09:26:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1714356224. Throughput: 0: 42479.6. Samples: 1714522160. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:26:41,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 09:26:44,339][12883] Updated weights for policy 0, policy_version 104643 (0.0030) [2024-06-18 09:26:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1714585600. Throughput: 0: 42740.4. Samples: 1714651760. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:26:46,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 09:26:47,836][12883] Updated weights for policy 0, policy_version 104653 (0.0039) [2024-06-18 09:26:51,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1714782208. Throughput: 0: 42559.1. Samples: 1714910340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:26:51,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 09:26:52,015][12883] Updated weights for policy 0, policy_version 104663 (0.0038) [2024-06-18 09:26:55,448][12883] Updated weights for policy 0, policy_version 104673 (0.0034) [2024-06-18 09:26:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 1714995200. Throughput: 0: 42545.9. Samples: 1715164780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:26:56,994][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 09:26:59,605][12883] Updated weights for policy 0, policy_version 104683 (0.0037) [2024-06-18 09:27:01,016][12862] Signal inference workers to stop experience collection... (25000 times) [2024-06-18 09:27:01,027][12883] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-06-18 09:27:01,074][12862] Signal inference workers to resume experience collection... (25000 times) [2024-06-18 09:27:01,074][12883] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-06-18 09:27:01,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1715240960. Throughput: 0: 42803.0. Samples: 1715294780. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:27:01,996][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 09:27:03,059][12883] Updated weights for policy 0, policy_version 104693 (0.0029) [2024-06-18 09:27:06,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1715421184. 
Throughput: 0: 42813.2. Samples: 1715554580. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:27:06,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 09:27:07,325][12883] Updated weights for policy 0, policy_version 104703 (0.0036) [2024-06-18 09:27:10,859][12883] Updated weights for policy 0, policy_version 104713 (0.0028) [2024-06-18 09:27:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1715650560. Throughput: 0: 42711.4. Samples: 1715804620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:27:11,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 09:27:14,926][12883] Updated weights for policy 0, policy_version 104723 (0.0028) [2024-06-18 09:27:16,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1715879936. Throughput: 0: 42880.8. Samples: 1715937460. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:27:16,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 09:27:18,406][12883] Updated weights for policy 0, policy_version 104733 (0.0041) [2024-06-18 09:27:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1716076544. Throughput: 0: 42788.5. Samples: 1716193980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:27:21,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 09:27:22,632][12883] Updated weights for policy 0, policy_version 104743 (0.0038) [2024-06-18 09:27:26,275][12883] Updated weights for policy 0, policy_version 104753 (0.0038) [2024-06-18 09:27:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1716289536. Throughput: 0: 42708.2. Samples: 1716444020. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:27:26,994][12645] Avg episode reward: [(0, '0.755')] [2024-06-18 09:27:30,381][12883] Updated weights for policy 0, policy_version 104763 (0.0048) [2024-06-18 09:27:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1716486144. Throughput: 0: 42738.5. Samples: 1716575000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:27:31,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 09:27:33,827][12883] Updated weights for policy 0, policy_version 104773 (0.0036) [2024-06-18 09:27:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 1716715520. Throughput: 0: 42690.1. Samples: 1716831400. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 09:27:36,998][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 09:27:37,914][12883] Updated weights for policy 0, policy_version 104783 (0.0030) [2024-06-18 09:27:41,307][12883] Updated weights for policy 0, policy_version 104793 (0.0032) [2024-06-18 09:27:41,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1716944896. Throughput: 0: 42634.1. Samples: 1717083320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:27:41,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 09:27:45,944][12883] Updated weights for policy 0, policy_version 104803 (0.0027) [2024-06-18 09:27:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 1717125120. Throughput: 0: 42720.0. Samples: 1717217180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:27:46,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 09:27:48,896][12883] Updated weights for policy 0, policy_version 104813 (0.0042) [2024-06-18 09:27:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 1717338112. Throughput: 0: 42665.4. Samples: 1717474520. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:27:51,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 09:27:53,373][12883] Updated weights for policy 0, policy_version 104823 (0.0029) [2024-06-18 09:27:56,589][12883] Updated weights for policy 0, policy_version 104833 (0.0023) [2024-06-18 09:27:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1717583872. Throughput: 0: 42774.3. Samples: 1717729460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:27:56,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 09:28:00,964][12883] Updated weights for policy 0, policy_version 104843 (0.0031) [2024-06-18 09:28:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1717780480. Throughput: 0: 42855.6. Samples: 1717865960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:28:01,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 09:28:04,262][12883] Updated weights for policy 0, policy_version 104853 (0.0033) [2024-06-18 09:28:06,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1717993472. Throughput: 0: 42671.0. Samples: 1718114180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:28:06,994][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 09:28:09,303][12883] Updated weights for policy 0, policy_version 104863 (0.0030) [2024-06-18 09:28:11,851][12883] Updated weights for policy 0, policy_version 104873 (0.0031) [2024-06-18 09:28:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1718239232. Throughput: 0: 42696.8. Samples: 1718365380. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:28:11,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 09:28:16,792][12883] Updated weights for policy 0, policy_version 104883 (0.0043) [2024-06-18 09:28:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1718403072. Throughput: 0: 42718.7. Samples: 1718497340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:28:16,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 09:28:18,301][12862] Signal inference workers to stop experience collection... (25050 times) [2024-06-18 09:28:18,342][12883] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-06-18 09:28:18,351][12862] Signal inference workers to resume experience collection... (25050 times) [2024-06-18 09:28:18,361][12883] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-06-18 09:28:19,903][12883] Updated weights for policy 0, policy_version 104893 (0.0035) [2024-06-18 09:28:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1718648832. Throughput: 0: 42620.8. Samples: 1718749340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:28:21,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 09:28:24,354][12883] Updated weights for policy 0, policy_version 104903 (0.0027) [2024-06-18 09:28:26,996][12645] Fps is (10 sec: 45865.1, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 1718861824. Throughput: 0: 42864.5. Samples: 1719012320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:28:27,008][12645] Avg episode reward: [(0, '0.684')] [2024-06-18 09:28:27,440][12883] Updated weights for policy 0, policy_version 104913 (0.0034) [2024-06-18 09:28:31,921][12883] Updated weights for policy 0, policy_version 104923 (0.0044) [2024-06-18 09:28:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1719058432. Throughput: 0: 42697.3. 
Samples: 1719138560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:28:31,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 09:28:35,014][12883] Updated weights for policy 0, policy_version 104933 (0.0039) [2024-06-18 09:28:36,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1719287808. Throughput: 0: 42577.4. Samples: 1719390500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 09:28:36,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 09:28:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104937_1719287808.pth... [2024-06-18 09:28:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104311_1709031424.pth [2024-06-18 09:28:39,331][12883] Updated weights for policy 0, policy_version 104943 (0.0033) [2024-06-18 09:28:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1719500800. Throughput: 0: 42835.1. Samples: 1719657040. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:28:41,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 09:28:42,777][12883] Updated weights for policy 0, policy_version 104953 (0.0036) [2024-06-18 09:28:46,823][12883] Updated weights for policy 0, policy_version 104963 (0.0030) [2024-06-18 09:28:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1719713792. Throughput: 0: 42694.7. Samples: 1719787220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:28:46,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 09:28:50,407][12883] Updated weights for policy 0, policy_version 104973 (0.0047) [2024-06-18 09:28:51,996][12645] Fps is (10 sec: 42588.8, 60 sec: 43143.0, 300 sec: 42653.6). Total num frames: 1719926784. Throughput: 0: 42745.5. Samples: 1720037820. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:28:51,997][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 09:28:54,749][12883] Updated weights for policy 0, policy_version 104983 (0.0042) [2024-06-18 09:28:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1720139776. Throughput: 0: 42888.1. Samples: 1720295340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:28:56,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 09:28:58,083][12883] Updated weights for policy 0, policy_version 104993 (0.0043) [2024-06-18 09:29:01,994][12645] Fps is (10 sec: 42607.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1720352768. Throughput: 0: 42772.4. Samples: 1720422100. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:29:01,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 09:29:02,293][12883] Updated weights for policy 0, policy_version 105003 (0.0024) [2024-06-18 09:29:05,767][12883] Updated weights for policy 0, policy_version 105013 (0.0035) [2024-06-18 09:29:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 1720582144. Throughput: 0: 42880.2. Samples: 1720678940. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:29:06,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 09:29:09,922][12883] Updated weights for policy 0, policy_version 105023 (0.0034) [2024-06-18 09:29:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1720762368. Throughput: 0: 42693.7. Samples: 1720933440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:29:11,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 09:29:13,494][12883] Updated weights for policy 0, policy_version 105033 (0.0038) [2024-06-18 09:29:17,000][12645] Fps is (10 sec: 40933.7, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 1720991744. Throughput: 0: 42667.0. Samples: 1721058840. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:29:17,001][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 09:29:17,563][12883] Updated weights for policy 0, policy_version 105043 (0.0023) [2024-06-18 09:29:21,139][12883] Updated weights for policy 0, policy_version 105053 (0.0034) [2024-06-18 09:29:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1721221120. Throughput: 0: 42847.0. Samples: 1721318620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:29:22,000][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 09:29:25,416][12883] Updated weights for policy 0, policy_version 105063 (0.0041) [2024-06-18 09:29:26,994][12645] Fps is (10 sec: 42625.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1721417728. Throughput: 0: 42649.8. Samples: 1721576280. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:29:26,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 09:29:28,737][12883] Updated weights for policy 0, policy_version 105073 (0.0036) [2024-06-18 09:29:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1721630720. Throughput: 0: 42536.4. Samples: 1721701360. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:29:31,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 09:29:32,963][12883] Updated weights for policy 0, policy_version 105083 (0.0038) [2024-06-18 09:29:33,533][12862] Signal inference workers to stop experience collection... (25100 times) [2024-06-18 09:29:33,587][12883] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-06-18 09:29:33,595][12862] Signal inference workers to resume experience collection... 
(25100 times) [2024-06-18 09:29:33,600][12883] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-06-18 09:29:36,767][12883] Updated weights for policy 0, policy_version 105093 (0.0032) [2024-06-18 09:29:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1721843712. Throughput: 0: 42733.7. Samples: 1721960740. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-18 09:29:36,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 09:29:40,491][12883] Updated weights for policy 0, policy_version 105103 (0.0029) [2024-06-18 09:29:41,998][12645] Fps is (10 sec: 42579.8, 60 sec: 42595.3, 300 sec: 42708.9). Total num frames: 1722056704. Throughput: 0: 42750.0. Samples: 1722219280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:29:41,998][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 09:29:44,337][12883] Updated weights for policy 0, policy_version 105113 (0.0040) [2024-06-18 09:29:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 1722269696. Throughput: 0: 42800.1. Samples: 1722348100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:29:46,994][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 09:29:48,195][12883] Updated weights for policy 0, policy_version 105123 (0.0028) [2024-06-18 09:29:51,920][12883] Updated weights for policy 0, policy_version 105133 (0.0058) [2024-06-18 09:29:51,994][12645] Fps is (10 sec: 44256.3, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 1722499072. Throughput: 0: 42751.9. Samples: 1722602780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:29:51,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 09:29:55,806][12883] Updated weights for policy 0, policy_version 105143 (0.0036) [2024-06-18 09:29:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1722712064. Throughput: 0: 42920.9. Samples: 1722864880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:29:56,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 09:29:59,428][12883] Updated weights for policy 0, policy_version 105153 (0.0032) [2024-06-18 09:30:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1722925056. Throughput: 0: 43003.3. Samples: 1722993720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:30:01,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 09:30:03,371][12883] Updated weights for policy 0, policy_version 105163 (0.0022) [2024-06-18 09:30:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1723138048. Throughput: 0: 43004.5. Samples: 1723253820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:30:06,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 09:30:07,121][12883] Updated weights for policy 0, policy_version 105173 (0.0038) [2024-06-18 09:30:10,956][12883] Updated weights for policy 0, policy_version 105183 (0.0045) [2024-06-18 09:30:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1723334656. Throughput: 0: 42982.1. Samples: 1723510480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:30:11,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 09:30:14,811][12883] Updated weights for policy 0, policy_version 105193 (0.0027) [2024-06-18 09:30:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42875.9, 300 sec: 42653.9). Total num frames: 1723564032. Throughput: 0: 43023.5. Samples: 1723637420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:30:16,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 09:30:18,573][12883] Updated weights for policy 0, policy_version 105203 (0.0029) [2024-06-18 09:30:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1723777024. Throughput: 0: 42941.4. Samples: 1723893100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:30:21,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 09:30:22,446][12883] Updated weights for policy 0, policy_version 105213 (0.0040) [2024-06-18 09:30:26,166][12883] Updated weights for policy 0, policy_version 105223 (0.0036) [2024-06-18 09:30:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1723990016. Throughput: 0: 42943.2. Samples: 1724151540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:30:26,994][12645] Avg episode reward: [(0, '0.780')] [2024-06-18 09:30:30,060][12883] Updated weights for policy 0, policy_version 105233 (0.0030) [2024-06-18 09:30:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1724203008. Throughput: 0: 42935.5. Samples: 1724280200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:30:31,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 09:30:33,934][12883] Updated weights for policy 0, policy_version 105243 (0.0031) [2024-06-18 09:30:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1724432384. Throughput: 0: 43096.8. Samples: 1724542140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 09:30:36,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 09:30:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105251_1724432384.pth... [2024-06-18 09:30:37,048][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104623_1714143232.pth [2024-06-18 09:30:37,596][12883] Updated weights for policy 0, policy_version 105253 (0.0034) [2024-06-18 09:30:41,528][12883] Updated weights for policy 0, policy_version 105263 (0.0035) [2024-06-18 09:30:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42874.5, 300 sec: 42709.5). Total num frames: 1724628992. Throughput: 0: 42731.5. Samples: 1724787800. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:30:41,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 09:30:44,065][12862] Signal inference workers to stop experience collection... (25150 times) [2024-06-18 09:30:44,065][12862] Signal inference workers to resume experience collection... (25150 times) [2024-06-18 09:30:44,100][12883] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-06-18 09:30:44,100][12883] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-06-18 09:30:45,201][12883] Updated weights for policy 0, policy_version 105273 (0.0039) [2024-06-18 09:30:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1724841984. Throughput: 0: 42756.6. Samples: 1724917760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:30:46,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 09:30:49,341][12883] Updated weights for policy 0, policy_version 105283 (0.0032) [2024-06-18 09:30:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1725071360. Throughput: 0: 42752.4. Samples: 1725177680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:30:51,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 09:30:52,984][12883] Updated weights for policy 0, policy_version 105293 (0.0036) [2024-06-18 09:30:56,895][12883] Updated weights for policy 0, policy_version 105303 (0.0029) [2024-06-18 09:30:57,000][12645] Fps is (10 sec: 44209.0, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 1725284352. Throughput: 0: 42672.8. Samples: 1725431020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:30:57,000][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 09:31:00,739][12883] Updated weights for policy 0, policy_version 105313 (0.0032) [2024-06-18 09:31:01,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1725464576. 
Throughput: 0: 42648.6. Samples: 1725556600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:31:01,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 09:31:04,932][12883] Updated weights for policy 0, policy_version 105323 (0.0044) [2024-06-18 09:31:06,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1725693952. Throughput: 0: 42684.0. Samples: 1725813880. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:31:06,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 09:31:08,146][12883] Updated weights for policy 0, policy_version 105333 (0.0030) [2024-06-18 09:31:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1725906944. Throughput: 0: 42733.4. Samples: 1726074540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:31:11,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 09:31:12,360][12883] Updated weights for policy 0, policy_version 105343 (0.0033) [2024-06-18 09:31:15,847][12883] Updated weights for policy 0, policy_version 105353 (0.0033) [2024-06-18 09:31:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1726119936. Throughput: 0: 42768.5. Samples: 1726204780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:31:16,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 09:31:20,062][12883] Updated weights for policy 0, policy_version 105363 (0.0036) [2024-06-18 09:31:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1726332928. Throughput: 0: 42535.6. Samples: 1726456240. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:31:21,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 09:31:23,523][12883] Updated weights for policy 0, policy_version 105373 (0.0035) [2024-06-18 09:31:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1726545920. 
Throughput: 0: 42712.1. Samples: 1726709840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:31:26,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 09:31:27,653][12883] Updated weights for policy 0, policy_version 105383 (0.0036) [2024-06-18 09:31:31,550][12883] Updated weights for policy 0, policy_version 105393 (0.0036) [2024-06-18 09:31:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1726775296. Throughput: 0: 42799.2. Samples: 1726843720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:31:31,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 09:31:35,142][12883] Updated weights for policy 0, policy_version 105403 (0.0032) [2024-06-18 09:31:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1726971904. Throughput: 0: 42656.1. Samples: 1727097200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 09:31:36,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 09:31:39,331][12883] Updated weights for policy 0, policy_version 105413 (0.0033) [2024-06-18 09:31:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1727201280. Throughput: 0: 42517.9. Samples: 1727344060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:31:41,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 09:31:42,870][12883] Updated weights for policy 0, policy_version 105423 (0.0027) [2024-06-18 09:31:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1727414272. Throughput: 0: 42715.1. Samples: 1727478780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:31:46,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 09:31:46,994][12883] Updated weights for policy 0, policy_version 105433 (0.0038) [2024-06-18 09:31:50,952][12883] Updated weights for policy 0, policy_version 105443 (0.0047) [2024-06-18 09:31:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1727610880. Throughput: 0: 42679.9. Samples: 1727734480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:31:51,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 09:31:54,682][12883] Updated weights for policy 0, policy_version 105453 (0.0042) [2024-06-18 09:31:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 1727856640. Throughput: 0: 42402.6. Samples: 1727982660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:31:56,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 09:31:58,627][12883] Updated weights for policy 0, policy_version 105463 (0.0021) [2024-06-18 09:32:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1728053248. Throughput: 0: 42656.9. Samples: 1728124340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:32:01,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 09:32:02,307][12883] Updated weights for policy 0, policy_version 105473 (0.0035) [2024-06-18 09:32:06,071][12883] Updated weights for policy 0, policy_version 105483 (0.0038) [2024-06-18 09:32:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1728249856. Throughput: 0: 42677.7. Samples: 1728376740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:32:06,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 09:32:09,882][12883] Updated weights for policy 0, policy_version 105493 (0.0033) [2024-06-18 09:32:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1728479232. Throughput: 0: 42793.3. Samples: 1728635540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:32:11,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 09:32:13,604][12883] Updated weights for policy 0, policy_version 105503 (0.0032) [2024-06-18 09:32:16,306][12862] Signal inference workers to stop experience collection... (25200 times) [2024-06-18 09:32:16,306][12862] Signal inference workers to resume experience collection... (25200 times) [2024-06-18 09:32:16,322][12883] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-06-18 09:32:16,322][12883] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-06-18 09:32:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1728692224. Throughput: 0: 42753.2. Samples: 1728767620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:32:16,995][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 09:32:17,527][12883] Updated weights for policy 0, policy_version 105513 (0.0038) [2024-06-18 09:32:21,302][12883] Updated weights for policy 0, policy_version 105523 (0.0035) [2024-06-18 09:32:21,995][12645] Fps is (10 sec: 42594.5, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 1728905216. Throughput: 0: 42729.8. Samples: 1729020080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:32:21,995][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 09:32:25,088][12883] Updated weights for policy 0, policy_version 105533 (0.0032) [2024-06-18 09:32:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1729118208. 
Throughput: 0: 42985.3. Samples: 1729278400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:32:26,995][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 09:32:28,953][12883] Updated weights for policy 0, policy_version 105543 (0.0031) [2024-06-18 09:32:31,996][12645] Fps is (10 sec: 42592.8, 60 sec: 42596.7, 300 sec: 42764.7). Total num frames: 1729331200. Throughput: 0: 42717.9. Samples: 1729401180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:32:31,997][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 09:32:32,641][12883] Updated weights for policy 0, policy_version 105553 (0.0033) [2024-06-18 09:32:36,641][12883] Updated weights for policy 0, policy_version 105563 (0.0039) [2024-06-18 09:32:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1729560576. Throughput: 0: 42703.1. Samples: 1729656120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 09:32:36,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 09:32:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105564_1729560576.pth... [2024-06-18 09:32:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000104937_1719287808.pth [2024-06-18 09:32:40,319][12883] Updated weights for policy 0, policy_version 105573 (0.0040) [2024-06-18 09:32:41,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1729757184. Throughput: 0: 43012.6. Samples: 1729918220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:32:41,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 09:32:44,209][12883] Updated weights for policy 0, policy_version 105583 (0.0038) [2024-06-18 09:32:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1729953792. Throughput: 0: 42571.9. Samples: 1730040080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:32:46,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 09:32:48,330][12883] Updated weights for policy 0, policy_version 105593 (0.0028) [2024-06-18 09:32:51,881][12883] Updated weights for policy 0, policy_version 105603 (0.0028) [2024-06-18 09:32:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 1730199552. Throughput: 0: 42666.4. Samples: 1730296720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:32:51,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 09:32:56,233][12883] Updated weights for policy 0, policy_version 105613 (0.0037) [2024-06-18 09:32:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1730396160. Throughput: 0: 42665.2. Samples: 1730555480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:32:56,995][12645] Avg episode reward: [(0, '0.618')] [2024-06-18 09:32:59,649][12883] Updated weights for policy 0, policy_version 105623 (0.0035) [2024-06-18 09:33:01,997][12645] Fps is (10 sec: 37671.3, 60 sec: 42050.1, 300 sec: 42653.5). Total num frames: 1730576384. Throughput: 0: 42385.1. Samples: 1730675080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:33:01,997][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 09:33:03,826][12883] Updated weights for policy 0, policy_version 105633 (0.0036) [2024-06-18 09:33:06,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1730822144. Throughput: 0: 42409.4. Samples: 1730928560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:33:06,996][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 09:33:07,243][12883] Updated weights for policy 0, policy_version 105643 (0.0035) [2024-06-18 09:33:11,943][12883] Updated weights for policy 0, policy_version 105653 (0.0039) [2024-06-18 09:33:11,994][12645] Fps is (10 sec: 44250.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1731018752. Throughput: 0: 42415.2. Samples: 1731187080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:33:11,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 09:33:15,201][12883] Updated weights for policy 0, policy_version 105663 (0.0025) [2024-06-18 09:33:16,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1731231744. Throughput: 0: 42358.9. Samples: 1731307240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:33:16,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 09:33:19,514][12883] Updated weights for policy 0, policy_version 105673 (0.0034) [2024-06-18 09:33:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42599.1, 300 sec: 42709.8). Total num frames: 1731461120. Throughput: 0: 42425.9. Samples: 1731565280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:33:21,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 09:33:22,666][12883] Updated weights for policy 0, policy_version 105683 (0.0025) [2024-06-18 09:33:26,109][12862] Signal inference workers to stop experience collection... (25250 times) [2024-06-18 09:33:26,110][12862] Signal inference workers to resume experience collection... (25250 times) [2024-06-18 09:33:26,140][12883] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-06-18 09:33:26,140][12883] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-06-18 09:33:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 1731641344. Throughput: 0: 42328.0. 
Samples: 1731822980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:33:26,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 09:33:27,238][12883] Updated weights for policy 0, policy_version 105693 (0.0030) [2024-06-18 09:33:30,646][12883] Updated weights for policy 0, policy_version 105703 (0.0026) [2024-06-18 09:33:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1731870720. Throughput: 0: 42373.4. Samples: 1731946880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:33:31,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 09:33:34,839][12883] Updated weights for policy 0, policy_version 105713 (0.0035) [2024-06-18 09:33:36,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1732083712. Throughput: 0: 42309.9. Samples: 1732200680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:33:36,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 09:33:38,305][12883] Updated weights for policy 0, policy_version 105723 (0.0029) [2024-06-18 09:33:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 1732280320. Throughput: 0: 42398.3. Samples: 1732463400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 09:33:41,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 09:33:42,506][12883] Updated weights for policy 0, policy_version 105733 (0.0028) [2024-06-18 09:33:46,094][12883] Updated weights for policy 0, policy_version 105743 (0.0030) [2024-06-18 09:33:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42709.8). Total num frames: 1732526080. Throughput: 0: 42519.3. Samples: 1732588320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:33:46,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 09:33:50,057][12883] Updated weights for policy 0, policy_version 105753 (0.0026) [2024-06-18 09:33:51,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1732739072. Throughput: 0: 42431.4. Samples: 1732837880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:33:51,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 09:33:53,898][12883] Updated weights for policy 0, policy_version 105763 (0.0032) [2024-06-18 09:33:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1732919296. Throughput: 0: 42776.8. Samples: 1733112040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:33:56,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 09:33:57,731][12883] Updated weights for policy 0, policy_version 105773 (0.0024) [2024-06-18 09:34:01,347][12883] Updated weights for policy 0, policy_version 105783 (0.0022) [2024-06-18 09:34:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43146.8, 300 sec: 42653.9). Total num frames: 1733165056. Throughput: 0: 42651.6. Samples: 1733226560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:34:01,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 09:34:05,359][12883] Updated weights for policy 0, policy_version 105793 (0.0036) [2024-06-18 09:34:06,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42600.1, 300 sec: 42765.0). Total num frames: 1733378048. Throughput: 0: 42653.8. Samples: 1733484700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:34:06,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 09:34:09,018][12883] Updated weights for policy 0, policy_version 105803 (0.0046) [2024-06-18 09:34:11,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42543.8). Total num frames: 1733541888. Throughput: 0: 42886.7. Samples: 1733752880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:34:11,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 09:34:12,886][12883] Updated weights for policy 0, policy_version 105813 (0.0040) [2024-06-18 09:34:16,611][12883] Updated weights for policy 0, policy_version 105823 (0.0035) [2024-06-18 09:34:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1733804032. Throughput: 0: 42731.0. Samples: 1733869780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:34:16,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 09:34:20,671][12883] Updated weights for policy 0, policy_version 105833 (0.0040) [2024-06-18 09:34:21,994][12645] Fps is (10 sec: 49151.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1734033408. Throughput: 0: 42870.3. Samples: 1734129840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:34:21,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 09:34:24,362][12883] Updated weights for policy 0, policy_version 105843 (0.0037) [2024-06-18 09:34:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1734197248. Throughput: 0: 42807.6. Samples: 1734389740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:34:26,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 09:34:28,255][12883] Updated weights for policy 0, policy_version 105853 (0.0023) [2024-06-18 09:34:30,146][12862] Signal inference workers to stop experience collection... (25300 times) [2024-06-18 09:34:30,146][12862] Signal inference workers to resume experience collection... (25300 times) [2024-06-18 09:34:30,169][12883] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-06-18 09:34:30,169][12883] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-06-18 09:34:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1734443008. Throughput: 0: 42686.4. 
Samples: 1734509200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:34:31,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 09:34:32,103][12883] Updated weights for policy 0, policy_version 105863 (0.0038) [2024-06-18 09:34:35,952][12883] Updated weights for policy 0, policy_version 105873 (0.0037) [2024-06-18 09:34:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.6, 300 sec: 42710.1). Total num frames: 1734656000. Throughput: 0: 42917.8. Samples: 1734769180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:34:36,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 09:34:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105875_1734656000.pth... [2024-06-18 09:34:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105251_1724432384.pth [2024-06-18 09:34:39,713][12883] Updated weights for policy 0, policy_version 105883 (0.0030) [2024-06-18 09:34:41,994][12645] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1734819840. Throughput: 0: 42436.5. Samples: 1735021680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 09:34:41,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 09:34:43,837][12883] Updated weights for policy 0, policy_version 105893 (0.0023) [2024-06-18 09:34:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1735081984. Throughput: 0: 42536.1. Samples: 1735140680. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:34:46,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 09:34:47,438][12883] Updated weights for policy 0, policy_version 105903 (0.0030) [2024-06-18 09:34:51,706][12883] Updated weights for policy 0, policy_version 105913 (0.0026) [2024-06-18 09:34:51,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1735294976. Throughput: 0: 42675.0. Samples: 1735405080. 
Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:34:51,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 09:34:55,115][12883] Updated weights for policy 0, policy_version 105923 (0.0030) [2024-06-18 09:34:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1735475200. Throughput: 0: 42268.5. Samples: 1735654960. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:34:56,994][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 09:34:59,388][12883] Updated weights for policy 0, policy_version 105933 (0.0049) [2024-06-18 09:35:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1735704576. Throughput: 0: 42305.1. Samples: 1735773500. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:35:01,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 09:35:03,033][12883] Updated weights for policy 0, policy_version 105943 (0.0052) [2024-06-18 09:35:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1735901184. Throughput: 0: 42422.9. Samples: 1736038860. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:35:06,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 09:35:07,181][12883] Updated weights for policy 0, policy_version 105953 (0.0028) [2024-06-18 09:35:11,163][12883] Updated weights for policy 0, policy_version 105963 (0.0032) [2024-06-18 09:35:11,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1736097792. Throughput: 0: 42063.0. Samples: 1736282580. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:35:11,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 09:35:15,000][12883] Updated weights for policy 0, policy_version 105973 (0.0028) [2024-06-18 09:35:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1736343552. Throughput: 0: 42205.3. Samples: 1736408440. 
Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:35:16,994][12645] Avg episode reward: [(0, '0.691')] [2024-06-18 09:35:18,781][12883] Updated weights for policy 0, policy_version 105983 (0.0034) [2024-06-18 09:35:21,994][12645] Fps is (10 sec: 40960.9, 60 sec: 41233.2, 300 sec: 42431.8). Total num frames: 1736507392. Throughput: 0: 42132.5. Samples: 1736665140. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:35:21,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 09:35:22,883][12883] Updated weights for policy 0, policy_version 105993 (0.0030) [2024-06-18 09:35:26,492][12883] Updated weights for policy 0, policy_version 106003 (0.0024) [2024-06-18 09:35:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1736753152. Throughput: 0: 41787.6. Samples: 1736902120. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:35:26,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 09:35:30,649][12883] Updated weights for policy 0, policy_version 106013 (0.0046) [2024-06-18 09:35:31,994][12645] Fps is (10 sec: 49151.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1736998912. Throughput: 0: 42242.7. Samples: 1737041600. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:35:31,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 09:35:34,128][12883] Updated weights for policy 0, policy_version 106023 (0.0026) [2024-06-18 09:35:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 41233.1, 300 sec: 42376.3). Total num frames: 1737129984. Throughput: 0: 42055.1. Samples: 1737297560. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:35:36,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 09:35:37,256][12862] Signal inference workers to stop experience collection... (25350 times) [2024-06-18 09:35:37,256][12862] Signal inference workers to resume experience collection... 
(25350 times) [2024-06-18 09:35:37,268][12883] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-06-18 09:35:37,268][12883] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-06-18 09:35:38,287][12883] Updated weights for policy 0, policy_version 106033 (0.0029) [2024-06-18 09:35:41,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1737375744. Throughput: 0: 42024.4. Samples: 1737546060. Policy #0 lag: (min: 0.0, avg: 6.9, max: 21.0) [2024-06-18 09:35:41,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 09:35:42,346][12883] Updated weights for policy 0, policy_version 106043 (0.0039) [2024-06-18 09:35:45,899][12883] Updated weights for policy 0, policy_version 106053 (0.0032) [2024-06-18 09:35:46,994][12645] Fps is (10 sec: 50790.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1737637888. Throughput: 0: 42486.2. Samples: 1737685380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:35:46,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 09:35:50,035][12883] Updated weights for policy 0, policy_version 106063 (0.0027) [2024-06-18 09:35:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 42321.6). Total num frames: 1737768960. Throughput: 0: 42060.4. Samples: 1737931580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:35:51,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 09:35:53,892][12883] Updated weights for policy 0, policy_version 106073 (0.0025) [2024-06-18 09:35:56,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1738014720. Throughput: 0: 42225.0. Samples: 1738182700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:35:56,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 09:35:57,679][12883] Updated weights for policy 0, policy_version 106083 (0.0035) [2024-06-18 09:36:01,468][12883] Updated weights for policy 0, policy_version 106093 (0.0037) [2024-06-18 09:36:01,994][12645] Fps is (10 sec: 49152.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1738260480. Throughput: 0: 42470.7. Samples: 1738319620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:36:01,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 09:36:05,294][12883] Updated weights for policy 0, policy_version 106103 (0.0048) [2024-06-18 09:36:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1738440704. Throughput: 0: 42281.6. Samples: 1738567820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:36:06,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 09:36:09,358][12883] Updated weights for policy 0, policy_version 106113 (0.0038) [2024-06-18 09:36:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1738670080. Throughput: 0: 42589.7. Samples: 1738818660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:36:11,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 09:36:12,973][12883] Updated weights for policy 0, policy_version 106123 (0.0037) [2024-06-18 09:36:16,918][12883] Updated weights for policy 0, policy_version 106133 (0.0035) [2024-06-18 09:36:16,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1738883072. Throughput: 0: 42480.5. Samples: 1738953220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:36:16,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 09:36:20,784][12883] Updated weights for policy 0, policy_version 106143 (0.0029) [2024-06-18 09:36:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1739079680. Throughput: 0: 42512.4. Samples: 1739210620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:36:21,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 09:36:24,381][12883] Updated weights for policy 0, policy_version 106153 (0.0036) [2024-06-18 09:36:26,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1739325440. Throughput: 0: 42667.0. Samples: 1739466080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:36:26,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 09:36:28,393][12883] Updated weights for policy 0, policy_version 106163 (0.0027) [2024-06-18 09:36:31,829][12862] Signal inference workers to stop experience collection... (25400 times) [2024-06-18 09:36:31,830][12862] Signal inference workers to resume experience collection... (25400 times) [2024-06-18 09:36:31,847][12883] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-06-18 09:36:31,847][12883] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-06-18 09:36:31,982][12883] Updated weights for policy 0, policy_version 106173 (0.0040) [2024-06-18 09:36:31,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1739538432. Throughput: 0: 42477.7. Samples: 1739596880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:36:31,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 09:36:35,983][12883] Updated weights for policy 0, policy_version 106183 (0.0035) [2024-06-18 09:36:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42487.3). Total num frames: 1739735040. Throughput: 0: 42673.3. 
Samples: 1739851880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:36:36,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 09:36:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106185_1739735040.pth... [2024-06-18 09:36:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105564_1729560576.pth [2024-06-18 09:36:39,677][12883] Updated weights for policy 0, policy_version 106193 (0.0027) [2024-06-18 09:36:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1739964416. Throughput: 0: 42652.0. Samples: 1740102040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:36:41,994][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 09:36:44,087][12883] Updated weights for policy 0, policy_version 106203 (0.0041) [2024-06-18 09:36:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1740161024. Throughput: 0: 42531.6. Samples: 1740233540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:36:46,994][12645] Avg episode reward: [(0, '0.799')] [2024-06-18 09:36:47,280][12883] Updated weights for policy 0, policy_version 106213 (0.0036) [2024-06-18 09:36:51,727][12883] Updated weights for policy 0, policy_version 106223 (0.0033) [2024-06-18 09:36:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 43144.5, 300 sec: 42376.3). Total num frames: 1740357632. Throughput: 0: 42749.0. Samples: 1740491520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:36:51,994][12645] Avg episode reward: [(0, '0.799')] [2024-06-18 09:36:55,096][12883] Updated weights for policy 0, policy_version 106233 (0.0040) [2024-06-18 09:36:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1740603392. Throughput: 0: 42641.9. Samples: 1740737540. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:36:56,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 09:36:59,246][12883] Updated weights for policy 0, policy_version 106243 (0.0029) [2024-06-18 09:37:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1740783616. Throughput: 0: 42605.7. Samples: 1740870480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:37:01,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 09:37:03,009][12883] Updated weights for policy 0, policy_version 106253 (0.0047) [2024-06-18 09:37:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1740996608. Throughput: 0: 42518.3. Samples: 1741123940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:37:06,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 09:37:07,140][12883] Updated weights for policy 0, policy_version 106263 (0.0052) [2024-06-18 09:37:10,652][12883] Updated weights for policy 0, policy_version 106273 (0.0036) [2024-06-18 09:37:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1741225984. Throughput: 0: 42393.8. Samples: 1741373800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:37:11,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 09:37:14,877][12883] Updated weights for policy 0, policy_version 106283 (0.0038) [2024-06-18 09:37:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42487.4). Total num frames: 1741438976. Throughput: 0: 42438.2. Samples: 1741506600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:37:17,000][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 09:37:18,216][12883] Updated weights for policy 0, policy_version 106293 (0.0028) [2024-06-18 09:37:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1741635584. Throughput: 0: 42383.0. Samples: 1741759120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:37:21,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 09:37:22,493][12883] Updated weights for policy 0, policy_version 106303 (0.0039) [2024-06-18 09:37:25,726][12883] Updated weights for policy 0, policy_version 106313 (0.0035) [2024-06-18 09:37:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42487.6). Total num frames: 1741864960. Throughput: 0: 42560.0. Samples: 1742017240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:37:26,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 09:37:29,973][12883] Updated weights for policy 0, policy_version 106323 (0.0037) [2024-06-18 09:37:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1742077952. Throughput: 0: 42511.0. Samples: 1742146540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:37:31,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 09:37:33,385][12883] Updated weights for policy 0, policy_version 106333 (0.0039) [2024-06-18 09:37:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1742274560. Throughput: 0: 42577.8. Samples: 1742407520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:37:36,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 09:37:37,541][12883] Updated weights for policy 0, policy_version 106343 (0.0031) [2024-06-18 09:37:41,292][12883] Updated weights for policy 0, policy_version 106353 (0.0033) [2024-06-18 09:37:41,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1742503936. Throughput: 0: 42551.2. Samples: 1742652340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 09:37:41,994][12645] Avg episode reward: [(0, '0.622')] [2024-06-18 09:37:45,508][12883] Updated weights for policy 0, policy_version 106363 (0.0024) [2024-06-18 09:37:46,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42375.9). Total num frames: 1742700544. Throughput: 0: 42513.0. Samples: 1742783660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:37:46,996][12645] Avg episode reward: [(0, '0.722')] [2024-06-18 09:37:48,839][12883] Updated weights for policy 0, policy_version 106373 (0.0026) [2024-06-18 09:37:51,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 1742913536. Throughput: 0: 42517.7. Samples: 1743037240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:37:51,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 09:37:53,077][12883] Updated weights for policy 0, policy_version 106383 (0.0029) [2024-06-18 09:37:56,894][12883] Updated weights for policy 0, policy_version 106393 (0.0032) [2024-06-18 09:37:56,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42325.3, 300 sec: 42598.8). Total num frames: 1743142912. Throughput: 0: 42609.9. Samples: 1743291240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:37:56,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 09:38:00,661][12883] Updated weights for policy 0, policy_version 106403 (0.0035) [2024-06-18 09:38:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42487.6). Total num frames: 1743355904. Throughput: 0: 42551.6. Samples: 1743421420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:38:01,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 09:38:04,292][12862] Signal inference workers to stop experience collection... (25450 times) [2024-06-18 09:38:04,293][12862] Signal inference workers to resume experience collection... 
(25450 times) [2024-06-18 09:38:04,309][12883] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-06-18 09:38:04,309][12883] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-06-18 09:38:04,670][12883] Updated weights for policy 0, policy_version 106413 (0.0023) [2024-06-18 09:38:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1743568896. Throughput: 0: 42560.6. Samples: 1743674340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:38:06,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 09:38:08,255][12883] Updated weights for policy 0, policy_version 106423 (0.0030) [2024-06-18 09:38:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1743765504. Throughput: 0: 42596.9. Samples: 1743934100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:38:11,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 09:38:12,226][12883] Updated weights for policy 0, policy_version 106433 (0.0030) [2024-06-18 09:38:16,177][12883] Updated weights for policy 0, policy_version 106443 (0.0028) [2024-06-18 09:38:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1743994880. Throughput: 0: 42514.7. Samples: 1744059700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:38:16,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 09:38:19,954][12883] Updated weights for policy 0, policy_version 106453 (0.0032) [2024-06-18 09:38:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1744207872. Throughput: 0: 42407.0. Samples: 1744315840. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:38:21,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 09:38:23,831][12883] Updated weights for policy 0, policy_version 106463 (0.0033) [2024-06-18 09:38:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1744420864. Throughput: 0: 42727.4. Samples: 1744575080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:38:26,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 09:38:27,581][12883] Updated weights for policy 0, policy_version 106473 (0.0038) [2024-06-18 09:38:31,638][12883] Updated weights for policy 0, policy_version 106483 (0.0034) [2024-06-18 09:38:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1744650240. Throughput: 0: 42627.1. Samples: 1744701780. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:38:31,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 09:38:35,293][12883] Updated weights for policy 0, policy_version 106493 (0.0048) [2024-06-18 09:38:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1744830464. Throughput: 0: 42550.3. Samples: 1744952000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:38:36,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 09:38:37,117][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106497_1744846848.pth... [2024-06-18 09:38:37,185][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000105875_1734656000.pth [2024-06-18 09:38:39,255][12883] Updated weights for policy 0, policy_version 106503 (0.0039) [2024-06-18 09:38:41,994][12645] Fps is (10 sec: 39319.3, 60 sec: 42324.9, 300 sec: 42431.7). Total num frames: 1745043456. Throughput: 0: 42640.8. Samples: 1745210100. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-18 09:38:41,995][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 09:38:43,064][12883] Updated weights for policy 0, policy_version 106513 (0.0037) [2024-06-18 09:38:46,940][12883] Updated weights for policy 0, policy_version 106523 (0.0044) [2024-06-18 09:38:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42873.0, 300 sec: 42487.3). Total num frames: 1745272832. Throughput: 0: 42597.8. Samples: 1745338320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:38:46,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 09:38:50,889][12883] Updated weights for policy 0, policy_version 106533 (0.0038) [2024-06-18 09:38:51,994][12645] Fps is (10 sec: 42600.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1745469440. Throughput: 0: 42610.6. Samples: 1745591820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:38:51,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 09:38:54,547][12883] Updated weights for policy 0, policy_version 106543 (0.0036) [2024-06-18 09:38:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1745682432. Throughput: 0: 42450.1. Samples: 1745844360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:38:56,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 09:38:58,729][12883] Updated weights for policy 0, policy_version 106553 (0.0031) [2024-06-18 09:39:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1745911808. Throughput: 0: 42513.0. Samples: 1745972780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:39:01,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 09:39:02,160][12883] Updated weights for policy 0, policy_version 106563 (0.0031) [2024-06-18 09:39:06,578][12883] Updated weights for policy 0, policy_version 106573 (0.0023) [2024-06-18 09:39:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 1746092032. Throughput: 0: 42423.4. Samples: 1746224900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:39:06,995][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 09:39:10,042][12883] Updated weights for policy 0, policy_version 106583 (0.0042) [2024-06-18 09:39:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1746321408. Throughput: 0: 42242.2. Samples: 1746475980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:39:11,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 09:39:14,614][12883] Updated weights for policy 0, policy_version 106593 (0.0044) [2024-06-18 09:39:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 1746534400. Throughput: 0: 42388.8. Samples: 1746609280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:39:17,003][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 09:39:18,162][12883] Updated weights for policy 0, policy_version 106603 (0.0036) [2024-06-18 09:39:21,994][12645] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 1746714624. Throughput: 0: 42439.9. Samples: 1746861800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:39:21,995][12645] Avg episode reward: [(0, '0.113')] [2024-06-18 09:39:22,449][12883] Updated weights for policy 0, policy_version 106613 (0.0030) [2024-06-18 09:39:22,538][12862] Signal inference workers to stop experience collection... 
(25500 times) [2024-06-18 09:39:22,596][12883] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-06-18 09:39:22,653][12862] Signal inference workers to resume experience collection... (25500 times) [2024-06-18 09:39:22,653][12883] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-06-18 09:39:25,698][12883] Updated weights for policy 0, policy_version 106623 (0.0043) [2024-06-18 09:39:26,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1746960384. Throughput: 0: 42181.5. Samples: 1747108240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:39:26,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 09:39:30,184][12883] Updated weights for policy 0, policy_version 106633 (0.0045) [2024-06-18 09:39:31,994][12645] Fps is (10 sec: 44237.7, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 1747156992. Throughput: 0: 42499.3. Samples: 1747250780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:39:31,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 09:39:33,327][12883] Updated weights for policy 0, policy_version 106643 (0.0038) [2024-06-18 09:39:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1747353600. Throughput: 0: 42195.0. Samples: 1747490600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:39:36,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 09:39:37,856][12883] Updated weights for policy 0, policy_version 106653 (0.0051) [2024-06-18 09:39:40,890][12883] Updated weights for policy 0, policy_version 106663 (0.0038) [2024-06-18 09:39:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.8, 300 sec: 42431.8). Total num frames: 1747599360. Throughput: 0: 42133.0. Samples: 1747740340. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 09:39:41,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 09:39:45,434][12883] Updated weights for policy 0, policy_version 106673 (0.0031) [2024-06-18 09:39:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 1747795968. Throughput: 0: 42401.3. Samples: 1747880840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:39:46,994][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 09:39:48,500][12883] Updated weights for policy 0, policy_version 106683 (0.0031) [2024-06-18 09:39:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1748008960. Throughput: 0: 42390.8. Samples: 1748132480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:39:51,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 09:39:53,067][12883] Updated weights for policy 0, policy_version 106693 (0.0036) [2024-06-18 09:39:55,966][12883] Updated weights for policy 0, policy_version 106703 (0.0033) [2024-06-18 09:39:56,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1748254720. Throughput: 0: 42397.3. Samples: 1748383860. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:39:56,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 09:40:00,861][12883] Updated weights for policy 0, policy_version 106713 (0.0038) [2024-06-18 09:40:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1748451328. Throughput: 0: 42499.2. Samples: 1748521740. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:40:01,994][12645] Avg episode reward: [(0, '0.198')] [2024-06-18 09:40:03,753][12883] Updated weights for policy 0, policy_version 106723 (0.0042) [2024-06-18 09:40:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1748664320. Throughput: 0: 42546.8. Samples: 1748776400. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:40:06,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 09:40:08,506][12883] Updated weights for policy 0, policy_version 106733 (0.0036) [2024-06-18 09:40:11,365][12883] Updated weights for policy 0, policy_version 106743 (0.0030) [2024-06-18 09:40:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1748893696. Throughput: 0: 42737.8. Samples: 1749031440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:40:11,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 09:40:16,262][12883] Updated weights for policy 0, policy_version 106753 (0.0028) [2024-06-18 09:40:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1749073920. Throughput: 0: 42497.7. Samples: 1749163180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:40:16,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 09:40:18,933][12883] Updated weights for policy 0, policy_version 106763 (0.0026) [2024-06-18 09:40:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1749286912. Throughput: 0: 42877.4. Samples: 1749420080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:40:22,004][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 09:40:23,895][12883] Updated weights for policy 0, policy_version 106773 (0.0024) [2024-06-18 09:40:26,446][12883] Updated weights for policy 0, policy_version 106783 (0.0033) [2024-06-18 09:40:26,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1749549056. Throughput: 0: 42949.9. Samples: 1749673080. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:40:26,994][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 09:40:31,275][12883] Updated weights for policy 0, policy_version 106793 (0.0032) [2024-06-18 09:40:32,000][12645] Fps is (10 sec: 44209.3, 60 sec: 42866.9, 300 sec: 42708.6). Total num frames: 1749729280. Throughput: 0: 43003.8. Samples: 1749816280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:40:32,001][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 09:40:34,074][12883] Updated weights for policy 0, policy_version 106803 (0.0039) [2024-06-18 09:40:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1749942272. Throughput: 0: 43011.5. Samples: 1750068000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:40:36,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 09:40:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106808_1749942272.pth... [2024-06-18 09:40:37,099][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106185_1739735040.pth [2024-06-18 09:40:38,698][12862] Signal inference workers to stop experience collection... (25550 times) [2024-06-18 09:40:38,698][12862] Signal inference workers to resume experience collection... (25550 times) [2024-06-18 09:40:38,734][12883] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-06-18 09:40:38,734][12883] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-06-18 09:40:38,852][12883] Updated weights for policy 0, policy_version 106813 (0.0039) [2024-06-18 09:40:41,614][12883] Updated weights for policy 0, policy_version 106823 (0.0026) [2024-06-18 09:40:41,994][12645] Fps is (10 sec: 47543.5, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 1750204416. Throughput: 0: 43071.2. Samples: 1750322060. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 22.0) [2024-06-18 09:40:41,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 09:40:46,576][12883] Updated weights for policy 0, policy_version 106833 (0.0041) [2024-06-18 09:40:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1750368256. Throughput: 0: 42968.6. Samples: 1750455320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:40:46,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 09:40:49,344][12883] Updated weights for policy 0, policy_version 106843 (0.0032) [2024-06-18 09:40:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1750597632. Throughput: 0: 42948.0. Samples: 1750709060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:40:51,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 09:40:54,136][12883] Updated weights for policy 0, policy_version 106853 (0.0046) [2024-06-18 09:40:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1750827008. Throughput: 0: 43000.3. Samples: 1750966460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:40:56,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 09:40:57,310][12883] Updated weights for policy 0, policy_version 106863 (0.0033) [2024-06-18 09:41:01,583][12883] Updated weights for policy 0, policy_version 106873 (0.0046) [2024-06-18 09:41:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1751007232. Throughput: 0: 43042.1. Samples: 1751100080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:01,995][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 09:41:04,865][12883] Updated weights for policy 0, policy_version 106883 (0.0038) [2024-06-18 09:41:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1751236608. Throughput: 0: 42946.6. Samples: 1751352680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:06,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 09:41:09,075][12883] Updated weights for policy 0, policy_version 106893 (0.0033) [2024-06-18 09:41:11,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1751465984. Throughput: 0: 43130.1. Samples: 1751613940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:11,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 09:41:12,361][12883] Updated weights for policy 0, policy_version 106903 (0.0026) [2024-06-18 09:41:16,587][12883] Updated weights for policy 0, policy_version 106913 (0.0025) [2024-06-18 09:41:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1751662592. Throughput: 0: 42927.3. Samples: 1751747740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:16,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 09:41:19,874][12883] Updated weights for policy 0, policy_version 106923 (0.0026) [2024-06-18 09:41:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 1751875584. Throughput: 0: 43041.8. Samples: 1752004880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:21,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 09:41:24,430][12883] Updated weights for policy 0, policy_version 106933 (0.0034) [2024-06-18 09:41:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1752121344. Throughput: 0: 43072.4. Samples: 1752260320. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:26,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 09:41:27,560][12883] Updated weights for policy 0, policy_version 106943 (0.0036) [2024-06-18 09:41:31,974][12883] Updated weights for policy 0, policy_version 106953 (0.0027) [2024-06-18 09:41:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43148.9, 300 sec: 42653.9). Total num frames: 1752317952. Throughput: 0: 43053.1. Samples: 1752392720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:31,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 09:41:35,287][12883] Updated weights for policy 0, policy_version 106963 (0.0045) [2024-06-18 09:41:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1752530944. Throughput: 0: 43055.1. Samples: 1752646540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:36,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 09:41:39,637][12883] Updated weights for policy 0, policy_version 106973 (0.0032) [2024-06-18 09:41:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1752760320. Throughput: 0: 42952.8. Samples: 1752899340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:41,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 09:41:43,208][12883] Updated weights for policy 0, policy_version 106983 (0.0033) [2024-06-18 09:41:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1752940544. Throughput: 0: 43003.2. Samples: 1753035220. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 09:41:46,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 09:41:47,247][12883] Updated weights for policy 0, policy_version 106993 (0.0031) [2024-06-18 09:41:50,711][12883] Updated weights for policy 0, policy_version 107003 (0.0034) [2024-06-18 09:41:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1753169920. Throughput: 0: 42963.2. Samples: 1753286020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:41:51,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 09:41:54,940][12883] Updated weights for policy 0, policy_version 107013 (0.0033) [2024-06-18 09:41:56,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1753415680. Throughput: 0: 42855.0. Samples: 1753542420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:41:56,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 09:41:58,162][12883] Updated weights for policy 0, policy_version 107023 (0.0032) [2024-06-18 09:42:01,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 1753595904. Throughput: 0: 42869.8. Samples: 1753676880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:01,994][12645] Avg episode reward: [(0, '0.561')] [2024-06-18 09:42:02,524][12883] Updated weights for policy 0, policy_version 107033 (0.0031) [2024-06-18 09:42:03,979][12862] Signal inference workers to stop experience collection... (25600 times) [2024-06-18 09:42:03,980][12862] Signal inference workers to resume experience collection... 
(25600 times) [2024-06-18 09:42:04,004][12883] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-06-18 09:42:04,004][12883] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-06-18 09:42:05,958][12883] Updated weights for policy 0, policy_version 107043 (0.0034) [2024-06-18 09:42:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1753808896. Throughput: 0: 42689.3. Samples: 1753925900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:06,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 09:42:10,402][12883] Updated weights for policy 0, policy_version 107053 (0.0031) [2024-06-18 09:42:11,998][12645] Fps is (10 sec: 44216.4, 60 sec: 42868.2, 300 sec: 42708.8). Total num frames: 1754038272. Throughput: 0: 42674.3. Samples: 1754180860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:11,999][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 09:42:13,545][12883] Updated weights for policy 0, policy_version 107063 (0.0033) [2024-06-18 09:42:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1754234880. Throughput: 0: 42727.3. Samples: 1754315440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:16,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 09:42:17,897][12883] Updated weights for policy 0, policy_version 107073 (0.0039) [2024-06-18 09:42:21,198][12883] Updated weights for policy 0, policy_version 107083 (0.0039) [2024-06-18 09:42:21,994][12645] Fps is (10 sec: 40978.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1754447872. Throughput: 0: 42807.6. Samples: 1754572880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:21,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 09:42:25,495][12883] Updated weights for policy 0, policy_version 107093 (0.0028) [2024-06-18 09:42:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1754693632. Throughput: 0: 42749.4. Samples: 1754823060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:26,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 09:42:28,823][12883] Updated weights for policy 0, policy_version 107103 (0.0028) [2024-06-18 09:42:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1754857472. Throughput: 0: 42742.3. Samples: 1754958620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:31,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 09:42:33,162][12883] Updated weights for policy 0, policy_version 107113 (0.0046) [2024-06-18 09:42:36,548][12883] Updated weights for policy 0, policy_version 107123 (0.0030) [2024-06-18 09:42:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1755103232. Throughput: 0: 42826.8. Samples: 1755213220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:36,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 09:42:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107123_1755103232.pth... [2024-06-18 09:42:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106497_1744846848.pth [2024-06-18 09:42:40,950][12883] Updated weights for policy 0, policy_version 107133 (0.0040) [2024-06-18 09:42:41,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 1755332608. Throughput: 0: 42730.6. Samples: 1755465300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:41,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 09:42:44,235][12883] Updated weights for policy 0, policy_version 107143 (0.0031) [2024-06-18 09:42:46,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 1755512832. Throughput: 0: 42603.1. Samples: 1755594120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 09:42:46,996][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 09:42:48,776][12883] Updated weights for policy 0, policy_version 107153 (0.0031) [2024-06-18 09:42:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1755758592. Throughput: 0: 42814.3. Samples: 1755852540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:42:51,995][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 09:42:51,995][12883] Updated weights for policy 0, policy_version 107163 (0.0045) [2024-06-18 09:42:56,260][12883] Updated weights for policy 0, policy_version 107173 (0.0045) [2024-06-18 09:42:56,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1755971584. Throughput: 0: 42848.3. Samples: 1756108840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:42:56,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 09:42:59,634][12883] Updated weights for policy 0, policy_version 107183 (0.0028) [2024-06-18 09:43:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1756168192. Throughput: 0: 42739.9. Samples: 1756238740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:01,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 09:43:03,749][12883] Updated weights for policy 0, policy_version 107193 (0.0038) [2024-06-18 09:43:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1756381184. Throughput: 0: 42659.9. Samples: 1756492580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:06,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 09:43:07,311][12883] Updated weights for policy 0, policy_version 107203 (0.0029) [2024-06-18 09:43:11,561][12883] Updated weights for policy 0, policy_version 107213 (0.0035) [2024-06-18 09:43:11,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42600.0, 300 sec: 42709.2). Total num frames: 1756594176. Throughput: 0: 42745.8. Samples: 1756746720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:11,997][12645] Avg episode reward: [(0, '0.687')] [2024-06-18 09:43:15,294][12883] Updated weights for policy 0, policy_version 107223 (0.0027) [2024-06-18 09:43:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1756790784. Throughput: 0: 42502.7. Samples: 1756871240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:16,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 09:43:19,057][12883] Updated weights for policy 0, policy_version 107233 (0.0041) [2024-06-18 09:43:21,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1757020160. Throughput: 0: 42502.1. Samples: 1757125820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:21,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 09:43:23,169][12883] Updated weights for policy 0, policy_version 107243 (0.0044) [2024-06-18 09:43:26,851][12883] Updated weights for policy 0, policy_version 107253 (0.0040) [2024-06-18 09:43:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1757249536. Throughput: 0: 42515.6. Samples: 1757378500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:26,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 09:43:31,043][12883] Updated weights for policy 0, policy_version 107263 (0.0031) [2024-06-18 09:43:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1757429760. Throughput: 0: 42452.8. Samples: 1757504400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:31,994][12645] Avg episode reward: [(0, '0.697')] [2024-06-18 09:43:32,917][12862] Signal inference workers to stop experience collection... (25650 times) [2024-06-18 09:43:32,917][12862] Signal inference workers to resume experience collection... (25650 times) [2024-06-18 09:43:32,937][12883] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-06-18 09:43:32,965][12883] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-06-18 09:43:34,563][12883] Updated weights for policy 0, policy_version 107273 (0.0028) [2024-06-18 09:43:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 1757642752. Throughput: 0: 42368.0. Samples: 1757759100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:36,994][12645] Avg episode reward: [(0, '0.697')] [2024-06-18 09:43:38,760][12883] Updated weights for policy 0, policy_version 107283 (0.0039) [2024-06-18 09:43:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1757872128. Throughput: 0: 42521.3. Samples: 1758022300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:41,994][12645] Avg episode reward: [(0, '0.699')] [2024-06-18 09:43:42,244][12883] Updated weights for policy 0, policy_version 107293 (0.0038) [2024-06-18 09:43:46,464][12883] Updated weights for policy 0, policy_version 107303 (0.0027) [2024-06-18 09:43:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1758068736. Throughput: 0: 42486.2. 
Samples: 1758150620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 09:43:46,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 09:43:49,822][12883] Updated weights for policy 0, policy_version 107313 (0.0038) [2024-06-18 09:43:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1758281728. Throughput: 0: 42355.5. Samples: 1758398580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:43:51,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 09:43:54,209][12883] Updated weights for policy 0, policy_version 107323 (0.0023) [2024-06-18 09:43:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1758511104. Throughput: 0: 42328.8. Samples: 1758651420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:43:56,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 09:43:57,506][12883] Updated weights for policy 0, policy_version 107333 (0.0024) [2024-06-18 09:44:01,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42050.7, 300 sec: 42709.2). Total num frames: 1758691328. Throughput: 0: 42455.7. Samples: 1758781840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:01,996][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 09:44:02,023][12883] Updated weights for policy 0, policy_version 107343 (0.0041) [2024-06-18 09:44:05,609][12883] Updated weights for policy 0, policy_version 107353 (0.0048) [2024-06-18 09:44:06,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 1758920704. Throughput: 0: 42530.4. Samples: 1759039780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:06,996][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 09:44:09,581][12883] Updated weights for policy 0, policy_version 107363 (0.0046) [2024-06-18 09:44:11,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 1759133696. Throughput: 0: 42516.9. 
Samples: 1759291760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:11,994][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 09:44:13,116][12883] Updated weights for policy 0, policy_version 107373 (0.0036) [2024-06-18 09:44:16,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1759330304. Throughput: 0: 42576.1. Samples: 1759420320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:16,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 09:44:17,362][12883] Updated weights for policy 0, policy_version 107383 (0.0040) [2024-06-18 09:44:20,711][12883] Updated weights for policy 0, policy_version 107393 (0.0038) [2024-06-18 09:44:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1759576064. Throughput: 0: 42657.3. Samples: 1759678680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:21,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 09:44:24,907][12883] Updated weights for policy 0, policy_version 107403 (0.0032) [2024-06-18 09:44:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1759772672. Throughput: 0: 42521.0. Samples: 1759935740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:26,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 09:44:28,243][12883] Updated weights for policy 0, policy_version 107413 (0.0038) [2024-06-18 09:44:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1759969280. Throughput: 0: 42465.4. Samples: 1760061560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:31,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 09:44:32,532][12883] Updated weights for policy 0, policy_version 107423 (0.0037) [2024-06-18 09:44:35,987][12883] Updated weights for policy 0, policy_version 107433 (0.0039) [2024-06-18 09:44:36,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1760231424. Throughput: 0: 42752.0. Samples: 1760322420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:36,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 09:44:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107436_1760231424.pth... [2024-06-18 09:44:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000106808_1749942272.pth [2024-06-18 09:44:40,331][12883] Updated weights for policy 0, policy_version 107443 (0.0027) [2024-06-18 09:44:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1760428032. Throughput: 0: 42793.3. Samples: 1760577120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:41,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 09:44:43,647][12883] Updated weights for policy 0, policy_version 107453 (0.0034) [2024-06-18 09:44:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1760624640. Throughput: 0: 42683.8. Samples: 1760702520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 09:44:46,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 09:44:48,126][12883] Updated weights for policy 0, policy_version 107463 (0.0038) [2024-06-18 09:44:51,368][12883] Updated weights for policy 0, policy_version 107473 (0.0032) [2024-06-18 09:44:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1760854016. Throughput: 0: 42755.6. Samples: 1760963680. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:44:51,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 09:44:56,043][12883] Updated weights for policy 0, policy_version 107483 (0.0027) [2024-06-18 09:44:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1761050624. Throughput: 0: 42748.9. Samples: 1761215460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:44:56,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 09:44:59,134][12883] Updated weights for policy 0, policy_version 107493 (0.0047) [2024-06-18 09:45:01,994][12645] Fps is (10 sec: 40957.4, 60 sec: 42872.7, 300 sec: 42709.4). Total num frames: 1761263616. Throughput: 0: 42639.9. Samples: 1761339140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:01,995][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 09:45:03,799][12862] Signal inference workers to stop experience collection... (25700 times) [2024-06-18 09:45:03,857][12883] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-06-18 09:45:03,864][12862] Signal inference workers to resume experience collection... (25700 times) [2024-06-18 09:45:03,880][12883] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-06-18 09:45:03,883][12883] Updated weights for policy 0, policy_version 107503 (0.0036) [2024-06-18 09:45:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 1761476608. Throughput: 0: 42473.5. Samples: 1761589980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:06,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 09:45:07,092][12883] Updated weights for policy 0, policy_version 107513 (0.0024) [2024-06-18 09:45:11,767][12883] Updated weights for policy 0, policy_version 107523 (0.0037) [2024-06-18 09:45:11,994][12645] Fps is (10 sec: 40961.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1761673216. 
Throughput: 0: 42649.6. Samples: 1761854980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:11,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 09:45:14,616][12883] Updated weights for policy 0, policy_version 107533 (0.0036) [2024-06-18 09:45:16,998][12645] Fps is (10 sec: 42579.8, 60 sec: 42868.4, 300 sec: 42764.4). Total num frames: 1761902592. Throughput: 0: 42517.3. Samples: 1761975020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:16,999][12645] Avg episode reward: [(0, '0.741')] [2024-06-18 09:45:19,351][12883] Updated weights for policy 0, policy_version 107543 (0.0033) [2024-06-18 09:45:21,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1762131968. Throughput: 0: 42411.6. Samples: 1762230940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:21,994][12645] Avg episode reward: [(0, '0.720')] [2024-06-18 09:45:22,459][12883] Updated weights for policy 0, policy_version 107553 (0.0031) [2024-06-18 09:45:26,868][12883] Updated weights for policy 0, policy_version 107563 (0.0039) [2024-06-18 09:45:26,994][12645] Fps is (10 sec: 40977.5, 60 sec: 42325.3, 300 sec: 42654.8). Total num frames: 1762312192. Throughput: 0: 42607.6. Samples: 1762494460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:26,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 09:45:30,057][12883] Updated weights for policy 0, policy_version 107573 (0.0027) [2024-06-18 09:45:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1762525184. Throughput: 0: 42464.5. Samples: 1762613420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:31,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 09:45:34,356][12883] Updated weights for policy 0, policy_version 107583 (0.0037) [2024-06-18 09:45:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1762770944. 
Throughput: 0: 42442.0. Samples: 1762873580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:36,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 09:45:37,594][12883] Updated weights for policy 0, policy_version 107593 (0.0034) [2024-06-18 09:45:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1762951168. Throughput: 0: 42580.4. Samples: 1763131580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:41,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 09:45:42,453][12883] Updated weights for policy 0, policy_version 107603 (0.0049) [2024-06-18 09:45:45,093][12883] Updated weights for policy 0, policy_version 107613 (0.0042) [2024-06-18 09:45:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1763180544. Throughput: 0: 42492.0. Samples: 1763251260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 09:45:46,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 09:45:49,980][12883] Updated weights for policy 0, policy_version 107623 (0.0032) [2024-06-18 09:45:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1763409920. Throughput: 0: 42808.7. Samples: 1763516380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:45:51,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 09:45:52,625][12883] Updated weights for policy 0, policy_version 107633 (0.0031) [2024-06-18 09:45:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1763590144. Throughput: 0: 42728.0. Samples: 1763777740. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:45:56,994][12645] Avg episode reward: [(0, '0.296')] [2024-06-18 09:45:57,463][12883] Updated weights for policy 0, policy_version 107643 (0.0032) [2024-06-18 09:46:00,412][12883] Updated weights for policy 0, policy_version 107653 (0.0033) [2024-06-18 09:46:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.8, 300 sec: 42709.5). Total num frames: 1763835904. Throughput: 0: 42741.3. Samples: 1763898200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:01,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 09:46:04,961][12883] Updated weights for policy 0, policy_version 107663 (0.0032) [2024-06-18 09:46:06,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1764065280. Throughput: 0: 42924.5. Samples: 1764162540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:06,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 09:46:08,122][12883] Updated weights for policy 0, policy_version 107673 (0.0030) [2024-06-18 09:46:11,996][12645] Fps is (10 sec: 42589.2, 60 sec: 43143.0, 300 sec: 42709.2). Total num frames: 1764261888. Throughput: 0: 42769.4. Samples: 1764419180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:11,996][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 09:46:12,715][12883] Updated weights for policy 0, policy_version 107683 (0.0038) [2024-06-18 09:46:14,875][12862] Signal inference workers to stop experience collection... (25750 times) [2024-06-18 09:46:14,875][12862] Signal inference workers to resume experience collection... 
(25750 times) [2024-06-18 09:46:14,908][12883] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-06-18 09:46:14,908][12883] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-06-18 09:46:15,580][12883] Updated weights for policy 0, policy_version 107693 (0.0034) [2024-06-18 09:46:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42874.5, 300 sec: 42709.5). Total num frames: 1764474880. Throughput: 0: 43012.4. Samples: 1764548980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:16,996][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 09:46:20,110][12883] Updated weights for policy 0, policy_version 107703 (0.0034) [2024-06-18 09:46:21,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1764687872. Throughput: 0: 42959.6. Samples: 1764806760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:21,996][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 09:46:23,215][12883] Updated weights for policy 0, policy_version 107713 (0.0039) [2024-06-18 09:46:26,995][12645] Fps is (10 sec: 42595.0, 60 sec: 43143.9, 300 sec: 42653.8). Total num frames: 1764900864. Throughput: 0: 42909.8. Samples: 1765062560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:26,995][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 09:46:27,630][12883] Updated weights for policy 0, policy_version 107723 (0.0036) [2024-06-18 09:46:30,977][12883] Updated weights for policy 0, policy_version 107733 (0.0030) [2024-06-18 09:46:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1765113856. Throughput: 0: 43078.2. Samples: 1765189780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:31,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 09:46:35,313][12883] Updated weights for policy 0, policy_version 107743 (0.0025) [2024-06-18 09:46:36,994][12645] Fps is (10 sec: 44240.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1765343232. Throughput: 0: 42987.2. Samples: 1765450800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:36,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 09:46:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107748_1765343232.pth... [2024-06-18 09:46:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107123_1755103232.pth [2024-06-18 09:46:38,497][12883] Updated weights for policy 0, policy_version 107753 (0.0029) [2024-06-18 09:46:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1765539840. Throughput: 0: 42882.2. Samples: 1765707440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:41,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 09:46:43,092][12883] Updated weights for policy 0, policy_version 107763 (0.0034) [2024-06-18 09:46:46,373][12883] Updated weights for policy 0, policy_version 107773 (0.0045) [2024-06-18 09:46:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1765769216. Throughput: 0: 43050.7. Samples: 1765835480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 09:46:46,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 09:46:50,685][12883] Updated weights for policy 0, policy_version 107783 (0.0036) [2024-06-18 09:46:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1765965824. Throughput: 0: 42829.3. Samples: 1766089860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:46:51,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 09:46:53,891][12883] Updated weights for policy 0, policy_version 107793 (0.0023) [2024-06-18 09:46:56,994][12645] Fps is (10 sec: 39320.6, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1766162432. Throughput: 0: 42767.7. Samples: 1766343640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:46:56,995][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 09:46:58,243][12883] Updated weights for policy 0, policy_version 107803 (0.0023) [2024-06-18 09:47:01,503][12883] Updated weights for policy 0, policy_version 107813 (0.0042) [2024-06-18 09:47:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1766408192. Throughput: 0: 42701.4. Samples: 1766470540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:01,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 09:47:06,058][12883] Updated weights for policy 0, policy_version 107823 (0.0031) [2024-06-18 09:47:06,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.3, 300 sec: 42654.6). Total num frames: 1766621184. Throughput: 0: 42700.8. Samples: 1766728300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:06,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 09:47:09,055][12883] Updated weights for policy 0, policy_version 107833 (0.0033) [2024-06-18 09:47:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 1766817792. Throughput: 0: 42699.0. Samples: 1766983980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:11,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 09:47:13,652][12883] Updated weights for policy 0, policy_version 107843 (0.0034) [2024-06-18 09:47:16,949][12883] Updated weights for policy 0, policy_version 107853 (0.0031) [2024-06-18 09:47:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1767063552. Throughput: 0: 42635.6. Samples: 1767108380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:16,994][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 09:47:21,259][12883] Updated weights for policy 0, policy_version 107863 (0.0031) [2024-06-18 09:47:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1767260160. Throughput: 0: 42716.0. Samples: 1767373020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:21,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 09:47:24,367][12883] Updated weights for policy 0, policy_version 107873 (0.0041) [2024-06-18 09:47:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42599.0, 300 sec: 42709.5). Total num frames: 1767456768. Throughput: 0: 42634.2. Samples: 1767625980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:26,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 09:47:29,045][12883] Updated weights for policy 0, policy_version 107883 (0.0044) [2024-06-18 09:47:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1767702528. Throughput: 0: 42517.7. Samples: 1767748780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:31,996][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 09:47:32,290][12883] Updated weights for policy 0, policy_version 107893 (0.0031) [2024-06-18 09:47:36,405][12862] Signal inference workers to stop experience collection... 
(25800 times) [2024-06-18 09:47:36,405][12862] Signal inference workers to resume experience collection... (25800 times) [2024-06-18 09:47:36,421][12883] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-06-18 09:47:36,421][12883] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-06-18 09:47:36,547][12883] Updated weights for policy 0, policy_version 107903 (0.0038) [2024-06-18 09:47:36,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1767899136. Throughput: 0: 42660.0. Samples: 1768009660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:36,997][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 09:47:40,190][12883] Updated weights for policy 0, policy_version 107913 (0.0034) [2024-06-18 09:47:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1768095744. Throughput: 0: 42809.6. Samples: 1768270060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:41,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 09:47:44,368][12883] Updated weights for policy 0, policy_version 107923 (0.0033) [2024-06-18 09:47:46,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1768341504. Throughput: 0: 42672.8. Samples: 1768390820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:47,003][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 09:47:47,795][12883] Updated weights for policy 0, policy_version 107933 (0.0031) [2024-06-18 09:47:51,976][12883] Updated weights for policy 0, policy_version 107943 (0.0034) [2024-06-18 09:47:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1768538112. Throughput: 0: 42671.6. Samples: 1768648520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 09:47:51,994][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 09:47:55,503][12883] Updated weights for policy 0, policy_version 107953 (0.0027) [2024-06-18 09:47:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1768734720. Throughput: 0: 42795.1. Samples: 1768909760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:47:56,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 09:47:59,693][12883] Updated weights for policy 0, policy_version 107963 (0.0036) [2024-06-18 09:48:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1768996864. Throughput: 0: 42828.8. Samples: 1769035680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:01,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 09:48:03,066][12883] Updated weights for policy 0, policy_version 107973 (0.0049) [2024-06-18 09:48:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 1769160704. Throughput: 0: 42566.6. Samples: 1769288520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:06,996][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 09:48:07,510][12883] Updated weights for policy 0, policy_version 107983 (0.0031) [2024-06-18 09:48:10,955][12883] Updated weights for policy 0, policy_version 107993 (0.0027) [2024-06-18 09:48:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1769390080. Throughput: 0: 42610.2. Samples: 1769543440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:11,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 09:48:15,163][12883] Updated weights for policy 0, policy_version 108003 (0.0032) [2024-06-18 09:48:16,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1769619456. Throughput: 0: 42727.2. Samples: 1769671500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:16,994][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 09:48:18,639][12883] Updated weights for policy 0, policy_version 108013 (0.0035) [2024-06-18 09:48:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1769799680. Throughput: 0: 42777.3. Samples: 1769934540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:21,994][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 09:48:22,665][12883] Updated weights for policy 0, policy_version 108023 (0.0037) [2024-06-18 09:48:26,260][12883] Updated weights for policy 0, policy_version 108033 (0.0050) [2024-06-18 09:48:26,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1770045440. Throughput: 0: 42597.6. Samples: 1770186960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:26,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 09:48:30,199][12883] Updated weights for policy 0, policy_version 108043 (0.0028) [2024-06-18 09:48:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1770258432. Throughput: 0: 42899.1. Samples: 1770321280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:31,994][12645] Avg episode reward: [(0, '0.757')] [2024-06-18 09:48:33,837][12883] Updated weights for policy 0, policy_version 108053 (0.0031) [2024-06-18 09:48:36,996][12645] Fps is (10 sec: 40951.3, 60 sec: 42598.4, 300 sec: 42653.6). Total num frames: 1770455040. Throughput: 0: 42872.1. Samples: 1770577860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:36,996][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 09:48:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108060_1770455040.pth... 
[2024-06-18 09:48:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107436_1760231424.pth [2024-06-18 09:48:38,127][12883] Updated weights for policy 0, policy_version 108063 (0.0043) [2024-06-18 09:48:41,560][12883] Updated weights for policy 0, policy_version 108073 (0.0032) [2024-06-18 09:48:41,995][12645] Fps is (10 sec: 40955.0, 60 sec: 42870.5, 300 sec: 42709.3). Total num frames: 1770668032. Throughput: 0: 42671.8. Samples: 1770830040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:41,996][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 09:48:45,779][12883] Updated weights for policy 0, policy_version 108083 (0.0039) [2024-06-18 09:48:46,994][12645] Fps is (10 sec: 44246.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1770897408. Throughput: 0: 42709.4. Samples: 1770957600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:46,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 09:48:49,240][12883] Updated weights for policy 0, policy_version 108093 (0.0028) [2024-06-18 09:48:51,994][12645] Fps is (10 sec: 44242.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1771110400. Throughput: 0: 42827.2. Samples: 1771215740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 09:48:51,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 09:48:53,785][12883] Updated weights for policy 0, policy_version 108103 (0.0025) [2024-06-18 09:48:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1771290624. Throughput: 0: 42906.8. Samples: 1771474240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:48:56,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 09:48:57,159][12883] Updated weights for policy 0, policy_version 108113 (0.0031) [2024-06-18 09:49:01,276][12883] Updated weights for policy 0, policy_version 108123 (0.0035) [2024-06-18 09:49:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 1771520000. Throughput: 0: 42751.9. Samples: 1771595340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:01,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 09:49:04,423][12862] Signal inference workers to stop experience collection... (25850 times) [2024-06-18 09:49:04,424][12862] Signal inference workers to resume experience collection... (25850 times) [2024-06-18 09:49:04,465][12883] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-06-18 09:49:04,465][12883] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-06-18 09:49:04,701][12883] Updated weights for policy 0, policy_version 108133 (0.0027) [2024-06-18 09:49:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1771749376. Throughput: 0: 42651.5. Samples: 1771853860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:06,994][12645] Avg episode reward: [(0, '0.677')] [2024-06-18 09:49:08,922][12883] Updated weights for policy 0, policy_version 108143 (0.0045) [2024-06-18 09:49:11,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 1771945984. Throughput: 0: 42761.0. Samples: 1772111300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:11,997][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 09:49:12,270][12883] Updated weights for policy 0, policy_version 108153 (0.0021) [2024-06-18 09:49:16,515][12883] Updated weights for policy 0, policy_version 108163 (0.0036) [2024-06-18 09:49:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1772158976. Throughput: 0: 42505.9. Samples: 1772234040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:16,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 09:49:19,840][12883] Updated weights for policy 0, policy_version 108173 (0.0029) [2024-06-18 09:49:21,994][12645] Fps is (10 sec: 44247.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1772388352. Throughput: 0: 42480.4. Samples: 1772489380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:21,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 09:49:24,289][12883] Updated weights for policy 0, policy_version 108183 (0.0037) [2024-06-18 09:49:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1772584960. Throughput: 0: 42622.2. Samples: 1772747980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:26,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 09:49:27,928][12883] Updated weights for policy 0, policy_version 108193 (0.0038) [2024-06-18 09:49:31,860][12883] Updated weights for policy 0, policy_version 108203 (0.0035) [2024-06-18 09:49:31,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1772797952. Throughput: 0: 42549.3. Samples: 1772872320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:31,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 09:49:35,817][12883] Updated weights for policy 0, policy_version 108213 (0.0032) [2024-06-18 09:49:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 1773027328. Throughput: 0: 42678.1. Samples: 1773136260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:36,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 09:49:39,506][12883] Updated weights for policy 0, policy_version 108223 (0.0039) [2024-06-18 09:49:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42599.3, 300 sec: 42709.5). Total num frames: 1773223936. Throughput: 0: 42510.2. Samples: 1773387200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:41,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 09:49:43,563][12883] Updated weights for policy 0, policy_version 108233 (0.0024) [2024-06-18 09:49:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1773436928. Throughput: 0: 42560.9. Samples: 1773510580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:46,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 09:49:47,212][12883] Updated weights for policy 0, policy_version 108243 (0.0030) [2024-06-18 09:49:51,284][12883] Updated weights for policy 0, policy_version 108253 (0.0037) [2024-06-18 09:49:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1773649920. Throughput: 0: 42452.9. Samples: 1773764240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 09:49:51,994][12645] Avg episode reward: [(0, '0.184')] [2024-06-18 09:49:54,899][12883] Updated weights for policy 0, policy_version 108263 (0.0035) [2024-06-18 09:49:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 1773846528. Throughput: 0: 42404.8. Samples: 1774019420. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:49:56,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 09:49:58,879][12883] Updated weights for policy 0, policy_version 108273 (0.0029) [2024-06-18 09:50:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1774075904. Throughput: 0: 42414.1. Samples: 1774142680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:01,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 09:50:02,408][12883] Updated weights for policy 0, policy_version 108283 (0.0038) [2024-06-18 09:50:06,551][12883] Updated weights for policy 0, policy_version 108293 (0.0036) [2024-06-18 09:50:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1774288896. Throughput: 0: 42497.5. Samples: 1774401780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:06,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 09:50:10,249][12883] Updated weights for policy 0, policy_version 108303 (0.0032) [2024-06-18 09:50:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42654.5). Total num frames: 1774485504. Throughput: 0: 42579.0. Samples: 1774664040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:11,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 09:50:14,258][12883] Updated weights for policy 0, policy_version 108313 (0.0023) [2024-06-18 09:50:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1774714880. Throughput: 0: 42534.3. Samples: 1774786360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:16,994][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 09:50:17,946][12883] Updated weights for policy 0, policy_version 108323 (0.0037) [2024-06-18 09:50:21,904][12883] Updated weights for policy 0, policy_version 108333 (0.0039) [2024-06-18 09:50:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1774927872. Throughput: 0: 42348.5. Samples: 1775041940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:21,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 09:50:23,321][12862] Signal inference workers to stop experience collection... (25900 times) [2024-06-18 09:50:23,321][12862] Signal inference workers to resume experience collection... (25900 times) [2024-06-18 09:50:23,343][12883] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-06-18 09:50:23,343][12883] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-06-18 09:50:26,207][12883] Updated weights for policy 0, policy_version 108343 (0.0022) [2024-06-18 09:50:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1775124480. Throughput: 0: 42522.6. Samples: 1775300720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:26,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 09:50:29,700][12883] Updated weights for policy 0, policy_version 108353 (0.0036) [2024-06-18 09:50:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1775370240. Throughput: 0: 42589.4. Samples: 1775427100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:31,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 09:50:33,811][12883] Updated weights for policy 0, policy_version 108363 (0.0026) [2024-06-18 09:50:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 1775534080. 
Throughput: 0: 42603.2. Samples: 1775681380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:36,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 09:50:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108371_1775550464.pth... [2024-06-18 09:50:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000107748_1765343232.pth [2024-06-18 09:50:37,398][12883] Updated weights for policy 0, policy_version 108373 (0.0027) [2024-06-18 09:50:41,407][12883] Updated weights for policy 0, policy_version 108383 (0.0029) [2024-06-18 09:50:41,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1775779840. Throughput: 0: 42637.3. Samples: 1775938100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:41,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 09:50:44,993][12883] Updated weights for policy 0, policy_version 108393 (0.0045) [2024-06-18 09:50:46,994][12645] Fps is (10 sec: 49151.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1776025600. Throughput: 0: 42704.0. Samples: 1776064360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:46,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 09:50:49,106][12883] Updated weights for policy 0, policy_version 108403 (0.0039) [2024-06-18 09:50:51,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1776173056. Throughput: 0: 42517.9. Samples: 1776315080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 09:50:51,996][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 09:50:52,869][12883] Updated weights for policy 0, policy_version 108413 (0.0033) [2024-06-18 09:50:56,851][12883] Updated weights for policy 0, policy_version 108423 (0.0034) [2024-06-18 09:50:56,997][12645] Fps is (10 sec: 37669.5, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 1776402432. Throughput: 0: 42359.7. 
Samples: 1776570380. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:50:56,998][12645] Avg episode reward: [(0, '0.215')] [2024-06-18 09:51:00,378][12883] Updated weights for policy 0, policy_version 108433 (0.0031) [2024-06-18 09:51:01,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1776648192. Throughput: 0: 42643.0. Samples: 1776705300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:01,994][12645] Avg episode reward: [(0, '0.194')] [2024-06-18 09:51:04,413][12883] Updated weights for policy 0, policy_version 108443 (0.0031) [2024-06-18 09:51:06,994][12645] Fps is (10 sec: 40975.4, 60 sec: 42052.4, 300 sec: 42543.2). Total num frames: 1776812032. Throughput: 0: 42569.9. Samples: 1776957580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:06,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 09:51:08,394][12883] Updated weights for policy 0, policy_version 108453 (0.0037) [2024-06-18 09:51:11,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1777041408. Throughput: 0: 42388.1. Samples: 1777208180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:11,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 09:51:12,042][12883] Updated weights for policy 0, policy_version 108463 (0.0034) [2024-06-18 09:51:16,033][12883] Updated weights for policy 0, policy_version 108473 (0.0027) [2024-06-18 09:51:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1777270784. Throughput: 0: 42508.9. Samples: 1777340000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:16,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 09:51:19,646][12883] Updated weights for policy 0, policy_version 108483 (0.0030) [2024-06-18 09:51:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42543.0). Total num frames: 1777451008. Throughput: 0: 42435.5. 
Samples: 1777590980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:21,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 09:51:23,575][12883] Updated weights for policy 0, policy_version 108493 (0.0043) [2024-06-18 09:51:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1777696768. Throughput: 0: 42310.3. Samples: 1777842060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:26,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 09:51:27,395][12883] Updated weights for policy 0, policy_version 108503 (0.0035) [2024-06-18 09:51:31,238][12883] Updated weights for policy 0, policy_version 108513 (0.0046) [2024-06-18 09:51:31,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 1777909760. Throughput: 0: 42517.6. Samples: 1777977740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:31,996][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 09:51:34,955][12883] Updated weights for policy 0, policy_version 108523 (0.0029) [2024-06-18 09:51:36,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42596.7, 300 sec: 42542.5). Total num frames: 1778089984. Throughput: 0: 42585.9. Samples: 1778231540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:36,996][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 09:51:37,525][12862] Signal inference workers to stop experience collection... (25950 times) [2024-06-18 09:51:37,580][12862] Signal inference workers to resume experience collection... (25950 times) [2024-06-18 09:51:37,581][12883] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-06-18 09:51:37,602][12883] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-06-18 09:51:38,815][12883] Updated weights for policy 0, policy_version 108533 (0.0041) [2024-06-18 09:51:41,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1778335744. 
Throughput: 0: 42543.6. Samples: 1778484680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:41,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 09:51:42,402][12883] Updated weights for policy 0, policy_version 108543 (0.0022) [2024-06-18 09:51:46,498][12883] Updated weights for policy 0, policy_version 108553 (0.0035) [2024-06-18 09:51:46,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1778548736. Throughput: 0: 42590.8. Samples: 1778621880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:47,000][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 09:51:50,376][12883] Updated weights for policy 0, policy_version 108563 (0.0041) [2024-06-18 09:51:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1778745344. Throughput: 0: 42438.2. Samples: 1778867300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 09:51:51,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 09:51:54,253][12883] Updated weights for policy 0, policy_version 108573 (0.0043) [2024-06-18 09:51:57,000][12645] Fps is (10 sec: 42571.4, 60 sec: 42869.6, 300 sec: 42597.5). Total num frames: 1778974720. Throughput: 0: 42441.1. Samples: 1779118300. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:51:57,001][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 09:51:57,911][12883] Updated weights for policy 0, policy_version 108583 (0.0031) [2024-06-18 09:52:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 1779171328. Throughput: 0: 42540.9. Samples: 1779254340. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:01,994][12645] Avg episode reward: [(0, '0.123')] [2024-06-18 09:52:02,018][12883] Updated weights for policy 0, policy_version 108593 (0.0031) [2024-06-18 09:52:05,765][12883] Updated weights for policy 0, policy_version 108603 (0.0039) [2024-06-18 09:52:06,994][12645] Fps is (10 sec: 40986.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1779384320. Throughput: 0: 42556.9. Samples: 1779506040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:06,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 09:52:09,676][12883] Updated weights for policy 0, policy_version 108613 (0.0027) [2024-06-18 09:52:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1779613696. Throughput: 0: 42608.5. Samples: 1779759440. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:11,994][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 09:52:13,334][12883] Updated weights for policy 0, policy_version 108623 (0.0047) [2024-06-18 09:52:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1779810304. Throughput: 0: 42498.4. Samples: 1779890080. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:16,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 09:52:17,428][12883] Updated weights for policy 0, policy_version 108633 (0.0044) [2024-06-18 09:52:20,831][12883] Updated weights for policy 0, policy_version 108643 (0.0030) [2024-06-18 09:52:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1780039680. Throughput: 0: 42684.7. Samples: 1780152260. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:21,995][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 09:52:25,079][12883] Updated weights for policy 0, policy_version 108653 (0.0023) [2024-06-18 09:52:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1780252672. Throughput: 0: 42817.7. Samples: 1780411480. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:26,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 09:52:28,727][12883] Updated weights for policy 0, policy_version 108663 (0.0035) [2024-06-18 09:52:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42873.1, 300 sec: 42654.3). Total num frames: 1780482048. Throughput: 0: 42462.8. Samples: 1780532700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:31,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 09:52:32,838][12883] Updated weights for policy 0, policy_version 108673 (0.0038) [2024-06-18 09:52:36,249][12883] Updated weights for policy 0, policy_version 108683 (0.0051) [2024-06-18 09:52:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 1780662272. Throughput: 0: 42653.3. Samples: 1780786700. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:36,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 09:52:37,003][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108683_1780662272.pth... [2024-06-18 09:52:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108060_1770455040.pth [2024-06-18 09:52:40,563][12883] Updated weights for policy 0, policy_version 108693 (0.0032) [2024-06-18 09:52:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1780875264. Throughput: 0: 42935.5. Samples: 1781050120. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:41,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 09:52:43,797][12883] Updated weights for policy 0, policy_version 108703 (0.0026) [2024-06-18 09:52:46,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1781104640. Throughput: 0: 42644.7. Samples: 1781173360. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:46,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 09:52:47,772][12862] Signal inference workers to stop experience collection... (26000 times) [2024-06-18 09:52:47,824][12883] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-06-18 09:52:47,887][12862] Signal inference workers to resume experience collection... (26000 times) [2024-06-18 09:52:47,887][12883] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-06-18 09:52:48,022][12883] Updated weights for policy 0, policy_version 108713 (0.0046) [2024-06-18 09:52:51,389][12883] Updated weights for policy 0, policy_version 108723 (0.0028) [2024-06-18 09:52:51,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 1781317632. Throughput: 0: 42709.9. Samples: 1781428080. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:51,997][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 09:52:55,566][12883] Updated weights for policy 0, policy_version 108733 (0.0038) [2024-06-18 09:52:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42329.8, 300 sec: 42431.8). Total num frames: 1781514240. Throughput: 0: 42840.4. Samples: 1781687260. Policy #0 lag: (min: 1.0, avg: 11.7, max: 23.0) [2024-06-18 09:52:56,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 09:52:59,191][12883] Updated weights for policy 0, policy_version 108743 (0.0041) [2024-06-18 09:53:01,996][12645] Fps is (10 sec: 42598.3, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 1781743616. 
Throughput: 0: 42666.8. Samples: 1781810180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:01,997][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 09:53:03,052][12883] Updated weights for policy 0, policy_version 108753 (0.0042) [2024-06-18 09:53:06,724][12883] Updated weights for policy 0, policy_version 108763 (0.0022) [2024-06-18 09:53:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1781972992. Throughput: 0: 42664.0. Samples: 1782072140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:06,994][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 09:53:10,587][12883] Updated weights for policy 0, policy_version 108773 (0.0030) [2024-06-18 09:53:11,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 1782153216. Throughput: 0: 42569.2. Samples: 1782327100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:11,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 09:53:14,810][12883] Updated weights for policy 0, policy_version 108783 (0.0028) [2024-06-18 09:53:16,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1782382592. Throughput: 0: 42660.9. Samples: 1782452540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:16,997][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 09:53:18,208][12883] Updated weights for policy 0, policy_version 108793 (0.0036) [2024-06-18 09:53:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1782595584. Throughput: 0: 42688.9. Samples: 1782707700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:21,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 09:53:22,370][12883] Updated weights for policy 0, policy_version 108803 (0.0037) [2024-06-18 09:53:26,089][12883] Updated weights for policy 0, policy_version 108813 (0.0039) [2024-06-18 09:53:26,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1782792192. Throughput: 0: 42540.8. Samples: 1782964460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:26,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 09:53:30,329][12883] Updated weights for policy 0, policy_version 108823 (0.0037) [2024-06-18 09:53:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 1783021568. Throughput: 0: 42651.8. Samples: 1783092680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:31,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 09:53:33,923][12883] Updated weights for policy 0, policy_version 108833 (0.0038) [2024-06-18 09:53:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.6). Total num frames: 1783234560. Throughput: 0: 42770.1. Samples: 1783352640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:36,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 09:53:37,886][12883] Updated weights for policy 0, policy_version 108843 (0.0024) [2024-06-18 09:53:41,667][12883] Updated weights for policy 0, policy_version 108853 (0.0039) [2024-06-18 09:53:41,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 1783447552. Throughput: 0: 42596.6. Samples: 1783604200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:41,996][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 09:53:45,639][12883] Updated weights for policy 0, policy_version 108863 (0.0040) [2024-06-18 09:53:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1783660544. Throughput: 0: 42696.4. Samples: 1783731420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:46,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 09:53:49,552][12883] Updated weights for policy 0, policy_version 108873 (0.0032) [2024-06-18 09:53:51,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 1783857152. Throughput: 0: 42550.2. Samples: 1783986900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:51,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 09:53:53,247][12883] Updated weights for policy 0, policy_version 108883 (0.0044) [2024-06-18 09:53:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1784070144. Throughput: 0: 42517.5. Samples: 1784240380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 09:53:56,994][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 09:53:57,295][12883] Updated weights for policy 0, policy_version 108893 (0.0037) [2024-06-18 09:54:01,184][12883] Updated weights for policy 0, policy_version 108903 (0.0032) [2024-06-18 09:54:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42600.1, 300 sec: 42542.9). Total num frames: 1784299520. Throughput: 0: 42627.1. Samples: 1784370660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:01,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 09:54:04,886][12883] Updated weights for policy 0, policy_version 108913 (0.0037) [2024-06-18 09:54:06,996][12645] Fps is (10 sec: 44226.2, 60 sec: 42323.8, 300 sec: 42598.4). Total num frames: 1784512512. Throughput: 0: 42547.2. Samples: 1784622420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:06,997][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 09:54:08,822][12883] Updated weights for policy 0, policy_version 108923 (0.0027) [2024-06-18 09:54:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1784725504. Throughput: 0: 42517.8. Samples: 1784877760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:11,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 09:54:12,368][12862] Signal inference workers to stop experience collection... (26050 times) [2024-06-18 09:54:12,421][12883] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-06-18 09:54:12,420][12862] Signal inference workers to resume experience collection... (26050 times) [2024-06-18 09:54:12,446][12883] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-06-18 09:54:12,563][12883] Updated weights for policy 0, policy_version 108933 (0.0031) [2024-06-18 09:54:16,607][12883] Updated weights for policy 0, policy_version 108943 (0.0026) [2024-06-18 09:54:16,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1784922112. Throughput: 0: 42651.0. Samples: 1785011980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:16,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 09:54:20,394][12883] Updated weights for policy 0, policy_version 108953 (0.0046) [2024-06-18 09:54:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1785135104. Throughput: 0: 42444.5. Samples: 1785262640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:21,995][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 09:54:24,312][12883] Updated weights for policy 0, policy_version 108963 (0.0037) [2024-06-18 09:54:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1785364480. 
Throughput: 0: 42423.8. Samples: 1785513180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:26,995][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 09:54:28,072][12883] Updated weights for policy 0, policy_version 108973 (0.0036) [2024-06-18 09:54:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1785544704. Throughput: 0: 42571.6. Samples: 1785647140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:31,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 09:54:32,202][12883] Updated weights for policy 0, policy_version 108983 (0.0027) [2024-06-18 09:54:35,750][12883] Updated weights for policy 0, policy_version 108993 (0.0030) [2024-06-18 09:54:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1785774080. Throughput: 0: 42592.9. Samples: 1785903580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:36,994][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 09:54:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108996_1785790464.pth... [2024-06-18 09:54:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108371_1775550464.pth [2024-06-18 09:54:40,001][12883] Updated weights for policy 0, policy_version 109003 (0.0035) [2024-06-18 09:54:41,997][12645] Fps is (10 sec: 45861.8, 60 sec: 42597.9, 300 sec: 42598.0). Total num frames: 1786003456. Throughput: 0: 42474.9. Samples: 1786151880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:41,997][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 09:54:43,649][12883] Updated weights for policy 0, policy_version 109013 (0.0031) [2024-06-18 09:54:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1786216448. Throughput: 0: 42441.2. Samples: 1786280520. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:46,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 09:54:47,539][12883] Updated weights for policy 0, policy_version 109023 (0.0028) [2024-06-18 09:54:51,098][12883] Updated weights for policy 0, policy_version 109033 (0.0034) [2024-06-18 09:54:51,994][12645] Fps is (10 sec: 42610.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1786429440. Throughput: 0: 42694.6. Samples: 1786543580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:51,994][12645] Avg episode reward: [(0, '0.586')] [2024-06-18 09:54:54,942][12883] Updated weights for policy 0, policy_version 109043 (0.0039) [2024-06-18 09:54:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1786642432. Throughput: 0: 42643.4. Samples: 1786796720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 09:54:56,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 09:54:58,882][12883] Updated weights for policy 0, policy_version 109053 (0.0038) [2024-06-18 09:55:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1786839040. Throughput: 0: 42391.7. Samples: 1786919600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:01,994][12645] Avg episode reward: [(0, '0.613')] [2024-06-18 09:55:02,506][12883] Updated weights for policy 0, policy_version 109063 (0.0038) [2024-06-18 09:55:06,383][12883] Updated weights for policy 0, policy_version 109073 (0.0045) [2024-06-18 09:55:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 1787084800. Throughput: 0: 42602.3. Samples: 1787179740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:06,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 09:55:10,461][12883] Updated weights for policy 0, policy_version 109083 (0.0042) [2024-06-18 09:55:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1787265024. Throughput: 0: 42733.4. Samples: 1787436180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:11,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 09:55:13,970][12883] Updated weights for policy 0, policy_version 109093 (0.0026) [2024-06-18 09:55:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1787494400. Throughput: 0: 42565.3. Samples: 1787562580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:16,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 09:55:17,923][12883] Updated weights for policy 0, policy_version 109103 (0.0029) [2024-06-18 09:55:21,807][12883] Updated weights for policy 0, policy_version 109113 (0.0031) [2024-06-18 09:55:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1787707392. Throughput: 0: 42592.0. Samples: 1787820220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:21,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 09:55:25,636][12883] Updated weights for policy 0, policy_version 109123 (0.0035) [2024-06-18 09:55:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1787904000. Throughput: 0: 42872.0. Samples: 1788081000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:26,994][12645] Avg episode reward: [(0, '0.132')] [2024-06-18 09:55:29,322][12883] Updated weights for policy 0, policy_version 109133 (0.0037) [2024-06-18 09:55:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1788133376. Throughput: 0: 42819.7. Samples: 1788207400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:31,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 09:55:33,462][12862] Signal inference workers to stop experience collection... (26100 times) [2024-06-18 09:55:33,462][12862] Signal inference workers to resume experience collection... (26100 times) [2024-06-18 09:55:33,472][12883] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-06-18 09:55:33,472][12883] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-06-18 09:55:33,613][12883] Updated weights for policy 0, policy_version 109143 (0.0022) [2024-06-18 09:55:36,942][12883] Updated weights for policy 0, policy_version 109153 (0.0023) [2024-06-18 09:55:36,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1788362752. Throughput: 0: 42716.0. Samples: 1788465800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:36,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 09:55:41,263][12883] Updated weights for policy 0, policy_version 109163 (0.0033) [2024-06-18 09:55:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42600.4, 300 sec: 42487.3). Total num frames: 1788559360. Throughput: 0: 42990.2. Samples: 1788731280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:41,995][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 09:55:44,717][12883] Updated weights for policy 0, policy_version 109173 (0.0042) [2024-06-18 09:55:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1788788736. Throughput: 0: 43007.5. Samples: 1788854940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:46,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 09:55:48,804][12883] Updated weights for policy 0, policy_version 109183 (0.0040) [2024-06-18 09:55:51,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 1789001728. 
Throughput: 0: 42797.7. Samples: 1789105640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:51,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 09:55:52,388][12883] Updated weights for policy 0, policy_version 109193 (0.0039) [2024-06-18 09:55:56,586][12883] Updated weights for policy 0, policy_version 109203 (0.0026) [2024-06-18 09:55:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1789181952. Throughput: 0: 42898.6. Samples: 1789366620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 09:55:56,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 09:56:00,110][12883] Updated weights for policy 0, policy_version 109213 (0.0032) [2024-06-18 09:56:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1789411328. Throughput: 0: 42861.8. Samples: 1789491360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:01,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 09:56:04,144][12883] Updated weights for policy 0, policy_version 109223 (0.0026) [2024-06-18 09:56:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1789624320. Throughput: 0: 42805.7. Samples: 1789746480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:06,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 09:56:07,638][12883] Updated weights for policy 0, policy_version 109233 (0.0028) [2024-06-18 09:56:11,846][12883] Updated weights for policy 0, policy_version 109243 (0.0036) [2024-06-18 09:56:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1789837312. Throughput: 0: 42677.8. Samples: 1790001500. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:11,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 09:56:15,666][12883] Updated weights for policy 0, policy_version 109253 (0.0032) [2024-06-18 09:56:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1790050304. Throughput: 0: 42575.9. Samples: 1790123320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:16,994][12645] Avg episode reward: [(0, '0.613')] [2024-06-18 09:56:19,529][12883] Updated weights for policy 0, policy_version 109263 (0.0042) [2024-06-18 09:56:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1790279680. Throughput: 0: 42634.1. Samples: 1790384340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:21,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 09:56:23,206][12883] Updated weights for policy 0, policy_version 109273 (0.0024) [2024-06-18 09:56:26,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42869.9, 300 sec: 42598.4). Total num frames: 1790476288. Throughput: 0: 42426.9. Samples: 1790640580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:26,996][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 09:56:27,151][12883] Updated weights for policy 0, policy_version 109283 (0.0028) [2024-06-18 09:56:30,818][12883] Updated weights for policy 0, policy_version 109293 (0.0035) [2024-06-18 09:56:31,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42596.7, 300 sec: 42709.5). Total num frames: 1790689280. Throughput: 0: 42431.1. Samples: 1790764440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:31,997][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 09:56:34,934][12883] Updated weights for policy 0, policy_version 109303 (0.0031) [2024-06-18 09:56:36,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1790902272. Throughput: 0: 42688.5. Samples: 1791026620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:36,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 09:56:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109308_1790902272.pth... [2024-06-18 09:56:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108683_1780662272.pth [2024-06-18 09:56:38,396][12883] Updated weights for policy 0, policy_version 109313 (0.0024) [2024-06-18 09:56:41,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1791115264. Throughput: 0: 42426.2. Samples: 1791275800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:41,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 09:56:42,727][12883] Updated weights for policy 0, policy_version 109323 (0.0033) [2024-06-18 09:56:45,862][12883] Updated weights for policy 0, policy_version 109333 (0.0021) [2024-06-18 09:56:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1791344640. Throughput: 0: 42601.3. Samples: 1791408420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:46,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 09:56:50,030][12862] Signal inference workers to stop experience collection... (26150 times) [2024-06-18 09:56:50,031][12862] Signal inference workers to resume experience collection... (26150 times) [2024-06-18 09:56:50,071][12883] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-06-18 09:56:50,071][12883] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-06-18 09:56:50,179][12883] Updated weights for policy 0, policy_version 109343 (0.0044) [2024-06-18 09:56:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42543.8). Total num frames: 1791524864. Throughput: 0: 42742.8. Samples: 1791669900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:51,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 09:56:53,471][12883] Updated weights for policy 0, policy_version 109353 (0.0024) [2024-06-18 09:56:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1791770624. Throughput: 0: 42689.5. Samples: 1791922520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 09:56:56,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 09:56:57,797][12883] Updated weights for policy 0, policy_version 109363 (0.0044) [2024-06-18 09:57:01,389][12883] Updated weights for policy 0, policy_version 109373 (0.0033) [2024-06-18 09:57:01,996][12645] Fps is (10 sec: 47502.6, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 1792000000. Throughput: 0: 42986.8. Samples: 1792057820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:02,005][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 09:57:05,542][12883] Updated weights for policy 0, policy_version 109383 (0.0034) [2024-06-18 09:57:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1792163840. Throughput: 0: 42774.8. Samples: 1792309200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:06,995][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 09:57:08,957][12883] Updated weights for policy 0, policy_version 109393 (0.0047) [2024-06-18 09:57:11,994][12645] Fps is (10 sec: 42608.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1792425984. Throughput: 0: 42686.6. Samples: 1792561380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:11,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 09:57:13,347][12883] Updated weights for policy 0, policy_version 109403 (0.0048) [2024-06-18 09:57:16,639][12883] Updated weights for policy 0, policy_version 109413 (0.0041) [2024-06-18 09:57:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1792622592. Throughput: 0: 42922.3. Samples: 1792695840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:16,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 09:57:20,997][12883] Updated weights for policy 0, policy_version 109423 (0.0038) [2024-06-18 09:57:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1792802816. Throughput: 0: 42731.5. Samples: 1792949540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:22,003][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 09:57:24,379][12883] Updated weights for policy 0, policy_version 109433 (0.0040) [2024-06-18 09:57:26,996][12645] Fps is (10 sec: 44226.6, 60 sec: 43144.5, 300 sec: 42653.6). Total num frames: 1793064960. Throughput: 0: 42798.4. Samples: 1793201820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:27,005][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 09:57:28,829][12883] Updated weights for policy 0, policy_version 109443 (0.0032) [2024-06-18 09:57:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 1793261568. Throughput: 0: 42864.5. Samples: 1793337320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:31,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 09:57:32,052][12883] Updated weights for policy 0, policy_version 109453 (0.0041) [2024-06-18 09:57:36,246][12883] Updated weights for policy 0, policy_version 109463 (0.0024) [2024-06-18 09:57:36,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1793458176. Throughput: 0: 42760.4. Samples: 1793594120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:36,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 09:57:39,546][12883] Updated weights for policy 0, policy_version 109473 (0.0033) [2024-06-18 09:57:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1793687552. Throughput: 0: 42774.1. Samples: 1793847360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:41,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 09:57:43,709][12883] Updated weights for policy 0, policy_version 109483 (0.0045) [2024-06-18 09:57:46,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1793916928. Throughput: 0: 42836.8. Samples: 1793985380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:46,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 09:57:47,174][12883] Updated weights for policy 0, policy_version 109493 (0.0030) [2024-06-18 09:57:51,252][12883] Updated weights for policy 0, policy_version 109503 (0.0035) [2024-06-18 09:57:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1794113536. Throughput: 0: 42947.5. Samples: 1794241840. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:51,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 09:57:54,852][12883] Updated weights for policy 0, policy_version 109513 (0.0032) [2024-06-18 09:57:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1794342912. Throughput: 0: 42943.6. Samples: 1794493840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 09:57:56,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 09:57:59,066][12883] Updated weights for policy 0, policy_version 109523 (0.0031) [2024-06-18 09:58:01,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42599.9, 300 sec: 42653.9). Total num frames: 1794555904. Throughput: 0: 42874.4. Samples: 1794625200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:01,995][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 09:58:02,683][12883] Updated weights for policy 0, policy_version 109533 (0.0029) [2024-06-18 09:58:06,837][12883] Updated weights for policy 0, policy_version 109543 (0.0031) [2024-06-18 09:58:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1794752512. Throughput: 0: 42964.9. Samples: 1794882960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:06,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 09:58:10,362][12883] Updated weights for policy 0, policy_version 109553 (0.0026) [2024-06-18 09:58:11,994][12645] Fps is (10 sec: 42599.7, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1794981888. Throughput: 0: 43122.2. Samples: 1795142220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:11,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 09:58:14,497][12883] Updated weights for policy 0, policy_version 109563 (0.0042) [2024-06-18 09:58:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1795194880. Throughput: 0: 43036.7. Samples: 1795273980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:16,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 09:58:17,828][12883] Updated weights for policy 0, policy_version 109573 (0.0048) [2024-06-18 09:58:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1795375104. Throughput: 0: 42797.9. Samples: 1795520020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:21,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 09:58:22,259][12883] Updated weights for policy 0, policy_version 109583 (0.0041) [2024-06-18 09:58:25,433][12883] Updated weights for policy 0, policy_version 109593 (0.0040) [2024-06-18 09:58:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42599.9, 300 sec: 42709.4). Total num frames: 1795620864. Throughput: 0: 42916.8. Samples: 1795778620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:26,994][12645] Avg episode reward: [(0, '0.599')] [2024-06-18 09:58:29,721][12883] Updated weights for policy 0, policy_version 109603 (0.0044) [2024-06-18 09:58:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1795833856. Throughput: 0: 42771.6. Samples: 1795910100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:31,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 09:58:33,442][12883] Updated weights for policy 0, policy_version 109613 (0.0022) [2024-06-18 09:58:35,215][12862] Signal inference workers to stop experience collection... (26200 times) [2024-06-18 09:58:35,237][12883] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-06-18 09:58:35,277][12862] Signal inference workers to resume experience collection... (26200 times) [2024-06-18 09:58:35,277][12883] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-06-18 09:58:36,996][12645] Fps is (10 sec: 42589.5, 60 sec: 43142.9, 300 sec: 42709.5). Total num frames: 1796046848. 
Throughput: 0: 42634.7. Samples: 1796160500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:36,997][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 09:58:37,024][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109622_1796046848.pth... [2024-06-18 09:58:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000108996_1785790464.pth [2024-06-18 09:58:37,269][12883] Updated weights for policy 0, policy_version 109623 (0.0032) [2024-06-18 09:58:41,029][12883] Updated weights for policy 0, policy_version 109633 (0.0038) [2024-06-18 09:58:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1796259840. Throughput: 0: 42643.6. Samples: 1796412800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:41,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 09:58:45,024][12883] Updated weights for policy 0, policy_version 109643 (0.0033) [2024-06-18 09:58:46,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1796472832. Throughput: 0: 42735.2. Samples: 1796548280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:46,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 09:58:48,684][12883] Updated weights for policy 0, policy_version 109653 (0.0025) [2024-06-18 09:58:51,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 1796669440. Throughput: 0: 42660.4. Samples: 1796802680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:51,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 09:58:52,729][12883] Updated weights for policy 0, policy_version 109663 (0.0038) [2024-06-18 09:58:56,439][12883] Updated weights for policy 0, policy_version 109673 (0.0029) [2024-06-18 09:58:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1796898816. Throughput: 0: 42567.1. 
Samples: 1797057740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:58:56,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 09:59:00,646][12883] Updated weights for policy 0, policy_version 109683 (0.0037) [2024-06-18 09:59:01,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.6, 300 sec: 42709.8). Total num frames: 1797111808. Throughput: 0: 42556.2. Samples: 1797189000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 09:59:01,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 09:59:04,003][12883] Updated weights for policy 0, policy_version 109693 (0.0035) [2024-06-18 09:59:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1797324800. Throughput: 0: 42789.7. Samples: 1797445560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:06,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 09:59:08,089][12883] Updated weights for policy 0, policy_version 109703 (0.0047) [2024-06-18 09:59:11,826][12883] Updated weights for policy 0, policy_version 109713 (0.0032) [2024-06-18 09:59:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1797537792. Throughput: 0: 42889.4. Samples: 1797708640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:11,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 09:59:15,919][12883] Updated weights for policy 0, policy_version 109723 (0.0035) [2024-06-18 09:59:16,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 1797767168. Throughput: 0: 42774.3. Samples: 1797835040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:16,997][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 09:59:19,315][12883] Updated weights for policy 0, policy_version 109733 (0.0031) [2024-06-18 09:59:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1797980160. Throughput: 0: 42853.6. 
Samples: 1798088820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:21,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 09:59:23,509][12883] Updated weights for policy 0, policy_version 109743 (0.0031) [2024-06-18 09:59:26,843][12883] Updated weights for policy 0, policy_version 109753 (0.0030) [2024-06-18 09:59:26,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 1798193152. Throughput: 0: 42952.0. Samples: 1798345640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:26,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 09:59:31,033][12883] Updated weights for policy 0, policy_version 109763 (0.0039) [2024-06-18 09:59:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1798389760. Throughput: 0: 42709.0. Samples: 1798470180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:31,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 09:59:34,210][12883] Updated weights for policy 0, policy_version 109773 (0.0040) [2024-06-18 09:59:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43146.1, 300 sec: 42821.0). Total num frames: 1798635520. Throughput: 0: 42995.1. Samples: 1798737460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:36,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 09:59:38,660][12883] Updated weights for policy 0, policy_version 109783 (0.0037) [2024-06-18 09:59:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1798832128. Throughput: 0: 42993.3. Samples: 1798992440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:41,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 09:59:42,170][12883] Updated weights for policy 0, policy_version 109793 (0.0034) [2024-06-18 09:59:46,468][12883] Updated weights for policy 0, policy_version 109803 (0.0026) [2024-06-18 09:59:46,647][12862] Signal inference workers to stop experience collection... (26250 times) [2024-06-18 09:59:46,677][12883] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-06-18 09:59:46,707][12862] Signal inference workers to resume experience collection... (26250 times) [2024-06-18 09:59:46,712][12883] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-06-18 09:59:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1799045120. Throughput: 0: 42911.4. Samples: 1799120020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:46,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 09:59:49,711][12883] Updated weights for policy 0, policy_version 109813 (0.0024) [2024-06-18 09:59:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1799258112. Throughput: 0: 42884.4. Samples: 1799375360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:51,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 09:59:54,017][12883] Updated weights for policy 0, policy_version 109823 (0.0037) [2024-06-18 09:59:56,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 1799471104. Throughput: 0: 42874.4. Samples: 1799638080. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 09:59:56,996][12645] Avg episode reward: [(0, '0.735')] [2024-06-18 09:59:57,288][12883] Updated weights for policy 0, policy_version 109833 (0.0044) [2024-06-18 10:00:01,620][12883] Updated weights for policy 0, policy_version 109843 (0.0043) [2024-06-18 10:00:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1799684096. Throughput: 0: 42932.8. Samples: 1799766920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 10:00:01,994][12645] Avg episode reward: [(0, '0.802')] [2024-06-18 10:00:04,882][12883] Updated weights for policy 0, policy_version 109853 (0.0036) [2024-06-18 10:00:06,994][12645] Fps is (10 sec: 44246.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1799913472. Throughput: 0: 42931.9. Samples: 1800020760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:06,994][12645] Avg episode reward: [(0, '0.657')] [2024-06-18 10:00:09,176][12883] Updated weights for policy 0, policy_version 109863 (0.0022) [2024-06-18 10:00:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1800126464. Throughput: 0: 43053.3. Samples: 1800283040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:11,994][12645] Avg episode reward: [(0, '0.687')] [2024-06-18 10:00:12,604][12883] Updated weights for policy 0, policy_version 109873 (0.0031) [2024-06-18 10:00:16,902][12883] Updated weights for policy 0, policy_version 109883 (0.0027) [2024-06-18 10:00:16,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1800323072. Throughput: 0: 43177.4. Samples: 1800413160. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:16,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 10:00:20,022][12883] Updated weights for policy 0, policy_version 109893 (0.0024) [2024-06-18 10:00:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 1800568832. Throughput: 0: 42966.9. Samples: 1800670960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:21,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 10:00:24,482][12883] Updated weights for policy 0, policy_version 109903 (0.0034) [2024-06-18 10:00:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1800749056. Throughput: 0: 43131.6. Samples: 1800933360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:26,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 10:00:28,011][12883] Updated weights for policy 0, policy_version 109913 (0.0030) [2024-06-18 10:00:31,961][12883] Updated weights for policy 0, policy_version 109923 (0.0033) [2024-06-18 10:00:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1800978432. Throughput: 0: 42975.2. Samples: 1801053900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:31,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 10:00:35,584][12883] Updated weights for policy 0, policy_version 109933 (0.0030) [2024-06-18 10:00:36,994][12645] Fps is (10 sec: 47513.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 1801224192. Throughput: 0: 43198.2. Samples: 1801319280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:36,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 10:00:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109938_1801224192.pth... 
[2024-06-18 10:00:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109308_1790902272.pth [2024-06-18 10:00:39,371][12883] Updated weights for policy 0, policy_version 109943 (0.0030) [2024-06-18 10:00:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1801404416. Throughput: 0: 43193.3. Samples: 1801581680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:41,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 10:00:43,040][12883] Updated weights for policy 0, policy_version 109953 (0.0040) [2024-06-18 10:00:46,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1801617408. Throughput: 0: 43033.0. Samples: 1801703400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:46,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 10:00:47,064][12883] Updated weights for policy 0, policy_version 109963 (0.0033) [2024-06-18 10:00:50,499][12883] Updated weights for policy 0, policy_version 109973 (0.0031) [2024-06-18 10:00:51,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43417.7, 300 sec: 42987.2). Total num frames: 1801863168. Throughput: 0: 43154.4. Samples: 1801962700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:51,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 10:00:54,769][12883] Updated weights for policy 0, policy_version 109983 (0.0037) [2024-06-18 10:00:56,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 1802059776. Throughput: 0: 43121.2. Samples: 1802223500. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:00:56,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 10:00:57,933][12883] Updated weights for policy 0, policy_version 109993 (0.0033) [2024-06-18 10:01:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1802272768. Throughput: 0: 42996.8. Samples: 1802348020. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 10:01:01,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 10:01:02,293][12883] Updated weights for policy 0, policy_version 110003 (0.0034) [2024-06-18 10:01:05,651][12883] Updated weights for policy 0, policy_version 110013 (0.0033) [2024-06-18 10:01:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1802502144. Throughput: 0: 43080.7. Samples: 1802609600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:06,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 10:01:07,661][12862] Signal inference workers to stop experience collection... (26300 times) [2024-06-18 10:01:07,662][12862] Signal inference workers to resume experience collection... (26300 times) [2024-06-18 10:01:07,706][12883] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-06-18 10:01:07,706][12883] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-06-18 10:01:09,753][12883] Updated weights for policy 0, policy_version 110023 (0.0047) [2024-06-18 10:01:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 1802682368. Throughput: 0: 43171.5. Samples: 1802876080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:11,994][12645] Avg episode reward: [(0, '0.183')] [2024-06-18 10:01:13,199][12883] Updated weights for policy 0, policy_version 110033 (0.0028) [2024-06-18 10:01:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1802911744. Throughput: 0: 43200.8. Samples: 1802997940. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:16,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 10:01:17,331][12883] Updated weights for policy 0, policy_version 110043 (0.0020) [2024-06-18 10:01:20,704][12883] Updated weights for policy 0, policy_version 110053 (0.0037) [2024-06-18 10:01:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 1803141120. Throughput: 0: 43087.7. Samples: 1803258220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:21,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 10:01:24,781][12883] Updated weights for policy 0, policy_version 110063 (0.0035) [2024-06-18 10:01:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43417.7, 300 sec: 42932.0). Total num frames: 1803354112. Throughput: 0: 43198.2. Samples: 1803525600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:26,994][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 10:01:28,342][12883] Updated weights for policy 0, policy_version 110073 (0.0043) [2024-06-18 10:01:31,996][12645] Fps is (10 sec: 42588.9, 60 sec: 43142.9, 300 sec: 42931.3). Total num frames: 1803567104. Throughput: 0: 43283.6. Samples: 1803651260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:31,997][12645] Avg episode reward: [(0, '0.153')] [2024-06-18 10:01:32,483][12883] Updated weights for policy 0, policy_version 110083 (0.0036) [2024-06-18 10:01:35,768][12883] Updated weights for policy 0, policy_version 110093 (0.0030) [2024-06-18 10:01:36,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1803796480. Throughput: 0: 43285.7. Samples: 1803910560. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:36,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 10:01:39,945][12883] Updated weights for policy 0, policy_version 110103 (0.0036) [2024-06-18 10:01:41,994][12645] Fps is (10 sec: 42607.7, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1803993088. Throughput: 0: 43236.9. Samples: 1804169160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:41,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 10:01:43,352][12883] Updated weights for policy 0, policy_version 110113 (0.0045) [2024-06-18 10:01:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1804206080. Throughput: 0: 43244.0. Samples: 1804294000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:47,000][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 10:01:47,676][12883] Updated weights for policy 0, policy_version 110123 (0.0041) [2024-06-18 10:01:50,935][12883] Updated weights for policy 0, policy_version 110133 (0.0029) [2024-06-18 10:01:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1804451840. Throughput: 0: 43154.7. Samples: 1804551560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:51,995][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 10:01:55,243][12883] Updated weights for policy 0, policy_version 110143 (0.0032) [2024-06-18 10:01:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1804632064. Throughput: 0: 43115.5. Samples: 1804816280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:01:56,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 10:01:58,508][12883] Updated weights for policy 0, policy_version 110153 (0.0047) [2024-06-18 10:02:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 43042.7). Total num frames: 1804861440. Throughput: 0: 43174.5. Samples: 1804940800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 10:02:01,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 10:02:02,856][12883] Updated weights for policy 0, policy_version 110163 (0.0033) [2024-06-18 10:02:06,191][12883] Updated weights for policy 0, policy_version 110173 (0.0036) [2024-06-18 10:02:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1805074432. Throughput: 0: 43075.6. Samples: 1805196620. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:06,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 10:02:11,152][12883] Updated weights for policy 0, policy_version 110183 (0.0036) [2024-06-18 10:02:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 1805271040. Throughput: 0: 42910.6. Samples: 1805456580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:11,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 10:02:14,275][12883] Updated weights for policy 0, policy_version 110193 (0.0046) [2024-06-18 10:02:16,104][12862] Signal inference workers to stop experience collection... (26350 times) [2024-06-18 10:02:16,129][12883] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-06-18 10:02:16,159][12862] Signal inference workers to resume experience collection... (26350 times) [2024-06-18 10:02:16,160][12883] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-06-18 10:02:16,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 43098.2). Total num frames: 1805516800. Throughput: 0: 42735.4. Samples: 1805574260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:16,994][12645] Avg episode reward: [(0, '0.283')] [2024-06-18 10:02:18,838][12883] Updated weights for policy 0, policy_version 110203 (0.0047) [2024-06-18 10:02:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 1805713408. 
Throughput: 0: 42598.8. Samples: 1805827500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:21,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 10:02:22,027][12883] Updated weights for policy 0, policy_version 110213 (0.0026) [2024-06-18 10:02:26,498][12883] Updated weights for policy 0, policy_version 110223 (0.0044) [2024-06-18 10:02:26,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1805910016. Throughput: 0: 42565.5. Samples: 1806084600. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:26,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 10:02:29,860][12883] Updated weights for policy 0, policy_version 110233 (0.0036) [2024-06-18 10:02:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43419.1, 300 sec: 43098.2). Total num frames: 1806172160. Throughput: 0: 42552.8. Samples: 1806208880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:31,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 10:02:34,073][12883] Updated weights for policy 0, policy_version 110243 (0.0028) [2024-06-18 10:02:36,994][12645] Fps is (10 sec: 42597.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 1806336000. Throughput: 0: 42553.2. Samples: 1806466460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:36,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 10:02:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110250_1806336000.pth... [2024-06-18 10:02:37,053][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109622_1796046848.pth [2024-06-18 10:02:37,522][12883] Updated weights for policy 0, policy_version 110253 (0.0029) [2024-06-18 10:02:41,563][12883] Updated weights for policy 0, policy_version 110263 (0.0031) [2024-06-18 10:02:41,996][12645] Fps is (10 sec: 37675.1, 60 sec: 42596.9, 300 sec: 42820.2). Total num frames: 1806548992. Throughput: 0: 42244.6. 
Samples: 1806717380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:41,996][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 10:02:45,431][12883] Updated weights for policy 0, policy_version 110273 (0.0027) [2024-06-18 10:02:46,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 1806794752. Throughput: 0: 42392.5. Samples: 1806848460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:46,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 10:02:49,210][12883] Updated weights for policy 0, policy_version 110283 (0.0044) [2024-06-18 10:02:51,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42052.3, 300 sec: 42820.5). Total num frames: 1806974976. Throughput: 0: 42380.4. Samples: 1807103740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:51,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 10:02:53,387][12883] Updated weights for policy 0, policy_version 110293 (0.0027) [2024-06-18 10:02:56,817][12883] Updated weights for policy 0, policy_version 110303 (0.0024) [2024-06-18 10:02:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1807204352. Throughput: 0: 42168.0. Samples: 1807354140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:02:56,994][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 10:03:01,071][12883] Updated weights for policy 0, policy_version 110313 (0.0033) [2024-06-18 10:03:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 1807433728. Throughput: 0: 42571.6. Samples: 1807489980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-18 10:03:01,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 10:03:05,092][12883] Updated weights for policy 0, policy_version 110323 (0.0042) [2024-06-18 10:03:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 1807613952. Throughput: 0: 42563.5. 
Samples: 1807742860. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:07,003][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 10:03:08,882][12883] Updated weights for policy 0, policy_version 110333 (0.0030) [2024-06-18 10:03:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1807826944. Throughput: 0: 42394.2. Samples: 1807992340. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:11,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 10:03:12,717][12883] Updated weights for policy 0, policy_version 110343 (0.0027) [2024-06-18 10:03:16,632][12883] Updated weights for policy 0, policy_version 110353 (0.0044) [2024-06-18 10:03:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42987.2). Total num frames: 1808056320. Throughput: 0: 42545.8. Samples: 1808123440. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:16,994][12645] Avg episode reward: [(0, '0.622')] [2024-06-18 10:03:20,304][12883] Updated weights for policy 0, policy_version 110363 (0.0037) [2024-06-18 10:03:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1808269312. Throughput: 0: 42448.2. Samples: 1808376620. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:21,994][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 10:03:23,991][12862] Signal inference workers to stop experience collection... (26400 times) [2024-06-18 10:03:24,039][12883] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-06-18 10:03:24,042][12862] Signal inference workers to resume experience collection... (26400 times) [2024-06-18 10:03:24,060][12883] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-06-18 10:03:24,187][12883] Updated weights for policy 0, policy_version 110373 (0.0026) [2024-06-18 10:03:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1808482304. 
Throughput: 0: 42518.6. Samples: 1808630620. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:26,994][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 10:03:27,883][12883] Updated weights for policy 0, policy_version 110383 (0.0029) [2024-06-18 10:03:31,865][12883] Updated weights for policy 0, policy_version 110393 (0.0031) [2024-06-18 10:03:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42820.9). Total num frames: 1808678912. Throughput: 0: 42461.0. Samples: 1808759200. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:31,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 10:03:35,349][12883] Updated weights for policy 0, policy_version 110403 (0.0043) [2024-06-18 10:03:36,996][12645] Fps is (10 sec: 44227.0, 60 sec: 43143.1, 300 sec: 42931.3). Total num frames: 1808924672. Throughput: 0: 42529.0. Samples: 1809017640. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:36,996][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 10:03:39,811][12883] Updated weights for policy 0, policy_version 110413 (0.0033) [2024-06-18 10:03:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42876.1). Total num frames: 1809121280. Throughput: 0: 42466.2. Samples: 1809265120. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:41,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 10:03:43,193][12883] Updated weights for policy 0, policy_version 110423 (0.0031) [2024-06-18 10:03:46,994][12645] Fps is (10 sec: 37692.0, 60 sec: 41779.3, 300 sec: 42820.6). Total num frames: 1809301504. Throughput: 0: 42313.5. Samples: 1809394080. 
Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:46,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 10:03:47,514][12883] Updated weights for policy 0, policy_version 110433 (0.0045) [2024-06-18 10:03:51,025][12883] Updated weights for policy 0, policy_version 110443 (0.0045) [2024-06-18 10:03:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1809547264. Throughput: 0: 42342.7. Samples: 1809648280. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:51,994][12645] Avg episode reward: [(0, '0.657')] [2024-06-18 10:03:55,141][12883] Updated weights for policy 0, policy_version 110453 (0.0037) [2024-06-18 10:03:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1809760256. Throughput: 0: 42379.5. Samples: 1809899420. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:03:56,994][12645] Avg episode reward: [(0, '0.657')] [2024-06-18 10:03:58,623][12883] Updated weights for policy 0, policy_version 110463 (0.0020) [2024-06-18 10:04:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 1809940480. Throughput: 0: 42312.4. Samples: 1810027500. Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:04:01,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 10:04:02,711][12883] Updated weights for policy 0, policy_version 110473 (0.0029) [2024-06-18 10:04:06,152][12883] Updated weights for policy 0, policy_version 110483 (0.0040) [2024-06-18 10:04:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 1810202624. Throughput: 0: 42535.5. Samples: 1810290720. 
Policy #0 lag: (min: 2.0, avg: 11.2, max: 23.0) [2024-06-18 10:04:06,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 10:04:10,170][12883] Updated weights for policy 0, policy_version 110493 (0.0033) [2024-06-18 10:04:11,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 1810399232. Throughput: 0: 42622.6. Samples: 1810548640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:11,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 10:04:13,596][12883] Updated weights for policy 0, policy_version 110503 (0.0031) [2024-06-18 10:04:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1810595840. Throughput: 0: 42432.3. Samples: 1810668660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:16,995][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 10:04:17,956][12883] Updated weights for policy 0, policy_version 110513 (0.0038) [2024-06-18 10:04:21,081][12883] Updated weights for policy 0, policy_version 110523 (0.0032) [2024-06-18 10:04:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1810841600. Throughput: 0: 42667.5. Samples: 1810937580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:21,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 10:04:25,303][12883] Updated weights for policy 0, policy_version 110533 (0.0025) [2024-06-18 10:04:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 1811021824. Throughput: 0: 42930.1. Samples: 1811196980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:26,994][12645] Avg episode reward: [(0, '0.564')] [2024-06-18 10:04:28,743][12883] Updated weights for policy 0, policy_version 110543 (0.0039) [2024-06-18 10:04:31,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1811218432. Throughput: 0: 42681.2. Samples: 1811314740. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:31,996][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 10:04:32,943][12883] Updated weights for policy 0, policy_version 110553 (0.0044) [2024-06-18 10:04:36,719][12883] Updated weights for policy 0, policy_version 110563 (0.0027) [2024-06-18 10:04:36,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42325.3, 300 sec: 42820.2). Total num frames: 1811464192. Throughput: 0: 42745.0. Samples: 1811571900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:36,997][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 10:04:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110563_1811464192.pth... [2024-06-18 10:04:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000109938_1801224192.pth [2024-06-18 10:04:40,859][12883] Updated weights for policy 0, policy_version 110573 (0.0024) [2024-06-18 10:04:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1811660800. Throughput: 0: 42957.4. Samples: 1811832500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:41,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 10:04:43,035][12862] Signal inference workers to stop experience collection... (26450 times) [2024-06-18 10:04:43,035][12862] Signal inference workers to resume experience collection... (26450 times) [2024-06-18 10:04:43,076][12883] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-06-18 10:04:43,076][12883] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-06-18 10:04:44,473][12883] Updated weights for policy 0, policy_version 110583 (0.0029) [2024-06-18 10:04:46,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1811873792. Throughput: 0: 42812.4. Samples: 1811954060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:46,994][12645] Avg episode reward: [(0, '0.636')] [2024-06-18 10:04:48,858][12883] Updated weights for policy 0, policy_version 110593 (0.0028) [2024-06-18 10:04:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42820.9). Total num frames: 1812103168. Throughput: 0: 42716.1. Samples: 1812212940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:51,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 10:04:52,003][12883] Updated weights for policy 0, policy_version 110603 (0.0040) [2024-06-18 10:04:56,415][12883] Updated weights for policy 0, policy_version 110613 (0.0033) [2024-06-18 10:04:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 1812316160. Throughput: 0: 42795.2. Samples: 1812474420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:04:56,994][12645] Avg episode reward: [(0, '0.667')] [2024-06-18 10:04:59,522][12883] Updated weights for policy 0, policy_version 110623 (0.0029) [2024-06-18 10:05:01,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1812512768. Throughput: 0: 42842.7. Samples: 1812596580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:05:01,994][12645] Avg episode reward: [(0, '0.685')] [2024-06-18 10:05:03,923][12883] Updated weights for policy 0, policy_version 110633 (0.0026) [2024-06-18 10:05:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1812742144. Throughput: 0: 42727.9. Samples: 1812860340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:05:06,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 10:05:07,243][12883] Updated weights for policy 0, policy_version 110643 (0.0040) [2024-06-18 10:05:11,576][12883] Updated weights for policy 0, policy_version 110653 (0.0027) [2024-06-18 10:05:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1812955136. Throughput: 0: 42600.5. Samples: 1813114000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:11,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 10:05:15,066][12883] Updated weights for policy 0, policy_version 110663 (0.0039) [2024-06-18 10:05:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.4). Total num frames: 1813168128. Throughput: 0: 42766.2. Samples: 1813239220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:16,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 10:05:19,076][12883] Updated weights for policy 0, policy_version 110673 (0.0029) [2024-06-18 10:05:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 1813381120. Throughput: 0: 42845.2. Samples: 1813499840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:21,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 10:05:22,762][12883] Updated weights for policy 0, policy_version 110683 (0.0027) [2024-06-18 10:05:26,789][12883] Updated weights for policy 0, policy_version 110693 (0.0033) [2024-06-18 10:05:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1813594112. Throughput: 0: 42811.1. Samples: 1813759000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:26,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 10:05:30,455][12883] Updated weights for policy 0, policy_version 110703 (0.0041) [2024-06-18 10:05:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1813823488. Throughput: 0: 42880.9. Samples: 1813883700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:31,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 10:05:34,421][12883] Updated weights for policy 0, policy_version 110713 (0.0025) [2024-06-18 10:05:36,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 1814036480. Throughput: 0: 42932.9. Samples: 1814145020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:36,996][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 10:05:37,944][12883] Updated weights for policy 0, policy_version 110723 (0.0040) [2024-06-18 10:05:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1814233088. Throughput: 0: 42848.9. Samples: 1814402620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:41,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 10:05:42,031][12883] Updated weights for policy 0, policy_version 110733 (0.0033) [2024-06-18 10:05:45,535][12883] Updated weights for policy 0, policy_version 110743 (0.0036) [2024-06-18 10:05:46,994][12645] Fps is (10 sec: 42607.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1814462464. Throughput: 0: 42889.4. Samples: 1814526600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:46,994][12645] Avg episode reward: [(0, '0.208')] [2024-06-18 10:05:49,505][12883] Updated weights for policy 0, policy_version 110753 (0.0031) [2024-06-18 10:05:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1814691840. Throughput: 0: 42871.7. Samples: 1814789560. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:51,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 10:05:53,122][12883] Updated weights for policy 0, policy_version 110763 (0.0023) [2024-06-18 10:05:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1814888448. Throughput: 0: 42933.5. Samples: 1815046000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:05:56,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 10:05:57,088][12883] Updated weights for policy 0, policy_version 110773 (0.0041) [2024-06-18 10:06:00,727][12883] Updated weights for policy 0, policy_version 110783 (0.0044) [2024-06-18 10:06:02,000][12645] Fps is (10 sec: 40933.9, 60 sec: 43140.1, 300 sec: 42708.6). Total num frames: 1815101440. Throughput: 0: 42886.5. Samples: 1815169380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:06:02,001][12645] Avg episode reward: [(0, '0.788')] [2024-06-18 10:06:04,528][12862] Signal inference workers to stop experience collection... (26500 times) [2024-06-18 10:06:04,528][12862] Signal inference workers to resume experience collection... (26500 times) [2024-06-18 10:06:04,561][12883] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-06-18 10:06:04,562][12883] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-06-18 10:06:04,683][12883] Updated weights for policy 0, policy_version 110793 (0.0038) [2024-06-18 10:06:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1815314432. Throughput: 0: 42927.7. Samples: 1815431580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 10:06:06,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 10:06:08,266][12883] Updated weights for policy 0, policy_version 110803 (0.0031) [2024-06-18 10:06:11,994][12645] Fps is (10 sec: 42625.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1815527424. 
Throughput: 0: 42993.0. Samples: 1815693680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:11,994][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 10:06:12,314][12883] Updated weights for policy 0, policy_version 110813 (0.0029) [2024-06-18 10:06:15,883][12883] Updated weights for policy 0, policy_version 110823 (0.0039) [2024-06-18 10:06:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1815756800. Throughput: 0: 43030.3. Samples: 1815820060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:16,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 10:06:20,182][12883] Updated weights for policy 0, policy_version 110833 (0.0038) [2024-06-18 10:06:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1815969792. Throughput: 0: 42903.9. Samples: 1816075600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:21,996][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 10:06:23,527][12883] Updated weights for policy 0, policy_version 110843 (0.0037) [2024-06-18 10:06:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1816166400. Throughput: 0: 42862.7. Samples: 1816331440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:26,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 10:06:28,023][12883] Updated weights for policy 0, policy_version 110853 (0.0038) [2024-06-18 10:06:31,184][12883] Updated weights for policy 0, policy_version 110863 (0.0036) [2024-06-18 10:06:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1816412160. Throughput: 0: 43044.9. Samples: 1816463620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:31,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 10:06:35,666][12883] Updated weights for policy 0, policy_version 110873 (0.0040) [2024-06-18 10:06:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1816592384. Throughput: 0: 42864.4. Samples: 1816718460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:36,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 10:06:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110876_1816592384.pth... [2024-06-18 10:06:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110250_1806336000.pth [2024-06-18 10:06:38,838][12883] Updated weights for policy 0, policy_version 110883 (0.0036) [2024-06-18 10:06:41,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1816805376. Throughput: 0: 42903.9. Samples: 1816976780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:41,997][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 10:06:43,382][12883] Updated weights for policy 0, policy_version 110893 (0.0024) [2024-06-18 10:06:46,494][12883] Updated weights for policy 0, policy_version 110903 (0.0037) [2024-06-18 10:06:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1817051136. Throughput: 0: 43082.4. Samples: 1817107820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:46,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 10:06:51,095][12883] Updated weights for policy 0, policy_version 110913 (0.0034) [2024-06-18 10:06:51,994][12645] Fps is (10 sec: 42608.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1817231360. Throughput: 0: 42845.3. Samples: 1817359620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:51,994][12645] Avg episode reward: [(0, '0.156')] [2024-06-18 10:06:54,236][12883] Updated weights for policy 0, policy_version 110923 (0.0031) [2024-06-18 10:06:56,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1817444352. Throughput: 0: 42739.9. Samples: 1817616980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:06:56,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 10:06:58,698][12883] Updated weights for policy 0, policy_version 110933 (0.0036) [2024-06-18 10:07:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 1817673728. Throughput: 0: 42685.4. Samples: 1817740900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:07:01,994][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 10:07:02,098][12883] Updated weights for policy 0, policy_version 110943 (0.0037) [2024-06-18 10:07:06,208][12883] Updated weights for policy 0, policy_version 110953 (0.0036) [2024-06-18 10:07:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1817886720. Throughput: 0: 42807.6. Samples: 1818001940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 10:07:06,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 10:07:09,903][12883] Updated weights for policy 0, policy_version 110963 (0.0034) [2024-06-18 10:07:11,993][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1818083328. Throughput: 0: 42833.4. Samples: 1818258940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:11,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 10:07:14,183][12883] Updated weights for policy 0, policy_version 110973 (0.0036) [2024-06-18 10:07:16,999][12645] Fps is (10 sec: 44215.3, 60 sec: 42868.0, 300 sec: 42764.3). Total num frames: 1818329088. Throughput: 0: 42589.7. Samples: 1818380360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:16,999][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 10:07:17,401][12883] Updated weights for policy 0, policy_version 110983 (0.0033) [2024-06-18 10:07:21,892][12883] Updated weights for policy 0, policy_version 110993 (0.0041) [2024-06-18 10:07:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1818509312. Throughput: 0: 42731.6. Samples: 1818641380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:21,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 10:07:24,938][12883] Updated weights for policy 0, policy_version 111003 (0.0036) [2024-06-18 10:07:26,994][12645] Fps is (10 sec: 39340.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1818722304. Throughput: 0: 42619.4. Samples: 1818894560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:26,997][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 10:07:29,521][12883] Updated weights for policy 0, policy_version 111013 (0.0040) [2024-06-18 10:07:29,897][12862] Signal inference workers to stop experience collection... (26550 times) [2024-06-18 10:07:29,897][12862] Signal inference workers to resume experience collection... (26550 times) [2024-06-18 10:07:29,946][12883] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-06-18 10:07:29,947][12883] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-06-18 10:07:31,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1818951680. Throughput: 0: 42545.3. Samples: 1819022360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:31,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 10:07:33,031][12883] Updated weights for policy 0, policy_version 111023 (0.0038) [2024-06-18 10:07:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 1819131904. 
Throughput: 0: 42656.4. Samples: 1819279160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:36,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 10:07:37,318][12883] Updated weights for policy 0, policy_version 111033 (0.0033) [2024-06-18 10:07:40,623][12883] Updated weights for policy 0, policy_version 111043 (0.0027) [2024-06-18 10:07:41,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 1819377664. Throughput: 0: 42506.3. Samples: 1819529760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:41,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 10:07:44,967][12883] Updated weights for policy 0, policy_version 111053 (0.0042) [2024-06-18 10:07:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1819590656. Throughput: 0: 42771.2. Samples: 1819665600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:46,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 10:07:48,025][12883] Updated weights for policy 0, policy_version 111063 (0.0027) [2024-06-18 10:07:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1819787264. Throughput: 0: 42591.1. Samples: 1819918540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:51,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 10:07:52,467][12883] Updated weights for policy 0, policy_version 111073 (0.0033) [2024-06-18 10:07:56,073][12883] Updated weights for policy 0, policy_version 111083 (0.0031) [2024-06-18 10:07:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1820033024. Throughput: 0: 42532.3. Samples: 1820172900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:07:56,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 10:08:00,091][12883] Updated weights for policy 0, policy_version 111093 (0.0034) [2024-06-18 10:08:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1820229632. Throughput: 0: 42824.7. Samples: 1820307260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:08:01,996][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 10:08:03,599][12883] Updated weights for policy 0, policy_version 111103 (0.0036) [2024-06-18 10:08:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1820426240. Throughput: 0: 42627.5. Samples: 1820559620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:08:06,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 10:08:07,684][12883] Updated weights for policy 0, policy_version 111113 (0.0040) [2024-06-18 10:08:11,429][12883] Updated weights for policy 0, policy_version 111123 (0.0021) [2024-06-18 10:08:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1820672000. Throughput: 0: 42469.8. Samples: 1820805700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 10:08:11,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 10:08:15,279][12883] Updated weights for policy 0, policy_version 111133 (0.0033) [2024-06-18 10:08:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42328.8, 300 sec: 42709.5). Total num frames: 1820868608. Throughput: 0: 42670.0. Samples: 1820942500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:08:16,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 10:08:18,963][12883] Updated weights for policy 0, policy_version 111143 (0.0046) [2024-06-18 10:08:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1821048832. Throughput: 0: 42560.9. Samples: 1821194400. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:08:21,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 10:08:23,068][12883] Updated weights for policy 0, policy_version 111153 (0.0037) [2024-06-18 10:08:26,858][12883] Updated weights for policy 0, policy_version 111163 (0.0041) [2024-06-18 10:08:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1821294592. Throughput: 0: 42673.3. Samples: 1821450060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:08:26,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 10:08:30,731][12883] Updated weights for policy 0, policy_version 111173 (0.0030) [2024-06-18 10:08:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1821507584. Throughput: 0: 42594.6. Samples: 1821582360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:08:31,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 10:08:34,496][12883] Updated weights for policy 0, policy_version 111183 (0.0035) [2024-06-18 10:08:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1821704192. Throughput: 0: 42556.1. Samples: 1821833560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:08:36,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 10:08:37,086][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111189_1821720576.pth... [2024-06-18 10:08:37,134][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110563_1811464192.pth [2024-06-18 10:08:38,646][12883] Updated weights for policy 0, policy_version 111193 (0.0022) [2024-06-18 10:08:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1821933568. Throughput: 0: 42538.7. Samples: 1822087140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:08:41,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 10:08:42,316][12883] Updated weights for policy 0, policy_version 111203 (0.0032) [2024-06-18 10:08:46,499][12883] Updated weights for policy 0, policy_version 111213 (0.0022) [2024-06-18 10:08:47,000][12645] Fps is (10 sec: 44209.1, 60 sec: 42593.9, 300 sec: 42708.6). Total num frames: 1822146560. Throughput: 0: 42472.3. Samples: 1822218780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:08:47,000][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 10:08:49,949][12883] Updated weights for policy 0, policy_version 111223 (0.0027) [2024-06-18 10:08:51,995][12645] Fps is (10 sec: 42593.0, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 1822359552. Throughput: 0: 42522.9. Samples: 1822473200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:08:51,995][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 10:08:54,210][12883] Updated weights for policy 0, policy_version 111233 (0.0033) [2024-06-18 10:08:57,000][12645] Fps is (10 sec: 42597.6, 60 sec: 42320.8, 300 sec: 42819.6). Total num frames: 1822572544. Throughput: 0: 42704.6. Samples: 1822727680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:08:57,001][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 10:08:57,869][12883] Updated weights for policy 0, policy_version 111243 (0.0032) [2024-06-18 10:09:01,988][12883] Updated weights for policy 0, policy_version 111253 (0.0037) [2024-06-18 10:09:01,994][12645] Fps is (10 sec: 40964.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1822769152. Throughput: 0: 42544.8. Samples: 1822857020. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:09:01,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 10:09:05,525][12883] Updated weights for policy 0, policy_version 111263 (0.0042) [2024-06-18 10:09:06,994][12645] Fps is (10 sec: 44264.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1823014912. Throughput: 0: 42636.8. Samples: 1823113060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:09:06,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 10:09:09,523][12862] Signal inference workers to stop experience collection... (26600 times) [2024-06-18 10:09:09,524][12862] Signal inference workers to resume experience collection... (26600 times) [2024-06-18 10:09:09,535][12883] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-06-18 10:09:09,547][12883] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-06-18 10:09:09,687][12883] Updated weights for policy 0, policy_version 111273 (0.0042) [2024-06-18 10:09:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1823227904. Throughput: 0: 42490.7. Samples: 1823362140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 10:09:11,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 10:09:13,222][12883] Updated weights for policy 0, policy_version 111283 (0.0025) [2024-06-18 10:09:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1823391744. Throughput: 0: 42440.4. Samples: 1823492180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:09:16,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 10:09:17,335][12883] Updated weights for policy 0, policy_version 111293 (0.0024) [2024-06-18 10:09:20,841][12883] Updated weights for policy 0, policy_version 111303 (0.0036) [2024-06-18 10:09:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1823653888. 
Throughput: 0: 42647.0. Samples: 1823752680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:09:21,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 10:09:25,004][12883] Updated weights for policy 0, policy_version 111313 (0.0035) [2024-06-18 10:09:26,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 1823866880. Throughput: 0: 42587.7. Samples: 1824003600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:09:26,995][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 10:09:28,582][12883] Updated weights for policy 0, policy_version 111323 (0.0031) [2024-06-18 10:09:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 1824047104. Throughput: 0: 42496.1. Samples: 1824130840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:09:31,994][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 10:09:32,711][12883] Updated weights for policy 0, policy_version 111333 (0.0037) [2024-06-18 10:09:36,359][12883] Updated weights for policy 0, policy_version 111343 (0.0032) [2024-06-18 10:09:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1824276480. Throughput: 0: 42575.3. Samples: 1824389040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:09:36,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 10:09:40,511][12883] Updated weights for policy 0, policy_version 111353 (0.0037) [2024-06-18 10:09:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1824489472. Throughput: 0: 42392.4. Samples: 1824635060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:09:41,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 10:09:44,161][12883] Updated weights for policy 0, policy_version 111363 (0.0051) [2024-06-18 10:09:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42329.6, 300 sec: 42653.9). Total num frames: 1824686080. 
Throughput: 0: 42549.7. Samples: 1824771760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:09:46,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 10:09:48,007][12883] Updated weights for policy 0, policy_version 111373 (0.0038) [2024-06-18 10:09:51,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42053.1, 300 sec: 42598.4). Total num frames: 1824882688. Throughput: 0: 42394.6. Samples: 1825020820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:09:51,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 10:09:52,130][12883] Updated weights for policy 0, policy_version 111383 (0.0036) [2024-06-18 10:09:55,958][12883] Updated weights for policy 0, policy_version 111393 (0.0024) [2024-06-18 10:09:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 1825128448. Throughput: 0: 42584.8. Samples: 1825278460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:09:56,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 10:09:59,733][12883] Updated weights for policy 0, policy_version 111403 (0.0029) [2024-06-18 10:10:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1825325056. Throughput: 0: 42620.0. Samples: 1825410080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:10:01,995][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 10:10:03,372][12883] Updated weights for policy 0, policy_version 111413 (0.0034) [2024-06-18 10:10:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1825538048. Throughput: 0: 42427.1. Samples: 1825661900. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:10:06,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 10:10:07,142][12883] Updated weights for policy 0, policy_version 111423 (0.0034) [2024-06-18 10:10:10,866][12883] Updated weights for policy 0, policy_version 111433 (0.0041) [2024-06-18 10:10:11,351][12862] Signal inference workers to stop experience collection... (26650 times) [2024-06-18 10:10:11,351][12862] Signal inference workers to resume experience collection... (26650 times) [2024-06-18 10:10:11,372][12883] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-06-18 10:10:11,372][12883] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-06-18 10:10:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1825767424. Throughput: 0: 42638.4. Samples: 1825922320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:10:11,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 10:10:14,658][12883] Updated weights for policy 0, policy_version 111443 (0.0028) [2024-06-18 10:10:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1825980416. Throughput: 0: 42763.4. Samples: 1826055200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:10:16,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 10:10:18,632][12883] Updated weights for policy 0, policy_version 111453 (0.0037) [2024-06-18 10:10:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1826177024. Throughput: 0: 42551.3. Samples: 1826303840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:10:21,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 10:10:22,238][12883] Updated weights for policy 0, policy_version 111463 (0.0037) [2024-06-18 10:10:26,207][12883] Updated weights for policy 0, policy_version 111473 (0.0038) [2024-06-18 10:10:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1826406400. Throughput: 0: 42847.3. Samples: 1826563200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:10:26,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 10:10:29,772][12883] Updated weights for policy 0, policy_version 111483 (0.0040) [2024-06-18 10:10:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1826586624. Throughput: 0: 42575.3. Samples: 1826687640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:10:31,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 10:10:34,305][12883] Updated weights for policy 0, policy_version 111493 (0.0033) [2024-06-18 10:10:37,000][12645] Fps is (10 sec: 42572.3, 60 sec: 42594.0, 300 sec: 42708.6). Total num frames: 1826832384. Throughput: 0: 42610.6. Samples: 1826938560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:10:37,001][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 10:10:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111501_1826832384.pth... [2024-06-18 10:10:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000110876_1816592384.pth [2024-06-18 10:10:37,774][12883] Updated weights for policy 0, policy_version 111503 (0.0027) [2024-06-18 10:10:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1827012608. Throughput: 0: 42540.5. Samples: 1827192780. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:10:41,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 10:10:42,083][12883] Updated weights for policy 0, policy_version 111513 (0.0038) [2024-06-18 10:10:45,319][12883] Updated weights for policy 0, policy_version 111523 (0.0027) [2024-06-18 10:10:46,994][12645] Fps is (10 sec: 40985.2, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1827241984. Throughput: 0: 42363.9. Samples: 1827316460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:10:46,994][12645] Avg episode reward: [(0, '0.748')] [2024-06-18 10:10:49,843][12883] Updated weights for policy 0, policy_version 111533 (0.0030) [2024-06-18 10:10:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1827471360. Throughput: 0: 42524.9. Samples: 1827575520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:10:51,994][12645] Avg episode reward: [(0, '0.749')] [2024-06-18 10:10:52,973][12883] Updated weights for policy 0, policy_version 111543 (0.0023) [2024-06-18 10:10:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42543.8). Total num frames: 1827651584. Throughput: 0: 42386.3. Samples: 1827829700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:10:56,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 10:10:57,646][12883] Updated weights for policy 0, policy_version 111553 (0.0043) [2024-06-18 10:11:01,079][12883] Updated weights for policy 0, policy_version 111563 (0.0036) [2024-06-18 10:11:01,995][12645] Fps is (10 sec: 42591.6, 60 sec: 42870.4, 300 sec: 42653.7). Total num frames: 1827897344. Throughput: 0: 42038.1. Samples: 1827946980. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:11:01,996][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 10:11:05,326][12883] Updated weights for policy 0, policy_version 111573 (0.0045) [2024-06-18 10:11:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1828093952. Throughput: 0: 42266.2. Samples: 1828205820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:11:06,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 10:11:08,841][12883] Updated weights for policy 0, policy_version 111583 (0.0034) [2024-06-18 10:11:11,994][12645] Fps is (10 sec: 39327.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1828290560. Throughput: 0: 42031.7. Samples: 1828454620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:11:11,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 10:11:12,963][12883] Updated weights for policy 0, policy_version 111593 (0.0028) [2024-06-18 10:11:16,440][12883] Updated weights for policy 0, policy_version 111603 (0.0040) [2024-06-18 10:11:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1828536320. Throughput: 0: 42132.8. Samples: 1828583620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:11:16,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 10:11:20,566][12883] Updated weights for policy 0, policy_version 111613 (0.0034) [2024-06-18 10:11:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1828700160. Throughput: 0: 42220.6. Samples: 1828838220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:11:21,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 10:11:24,073][12883] Updated weights for policy 0, policy_version 111623 (0.0038) [2024-06-18 10:11:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1828929536. Throughput: 0: 42206.7. Samples: 1829092080. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:11:26,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 10:11:28,599][12883] Updated weights for policy 0, policy_version 111633 (0.0034) [2024-06-18 10:11:31,527][12862] Signal inference workers to stop experience collection... (26700 times) [2024-06-18 10:11:31,527][12862] Signal inference workers to resume experience collection... (26700 times) [2024-06-18 10:11:31,552][12883] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-06-18 10:11:31,552][12883] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-06-18 10:11:31,666][12883] Updated weights for policy 0, policy_version 111643 (0.0030) [2024-06-18 10:11:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1829158912. Throughput: 0: 42314.7. Samples: 1829220620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:11:31,994][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 10:11:36,055][12883] Updated weights for policy 0, policy_version 111653 (0.0034) [2024-06-18 10:11:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42056.7, 300 sec: 42543.2). Total num frames: 1829355520. Throughput: 0: 42335.1. Samples: 1829480600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:11:36,994][12645] Avg episode reward: [(0, '0.681')] [2024-06-18 10:11:39,765][12883] Updated weights for policy 0, policy_version 111663 (0.0044) [2024-06-18 10:11:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1829568512. Throughput: 0: 42286.5. Samples: 1829732600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:11:41,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 10:11:43,609][12883] Updated weights for policy 0, policy_version 111673 (0.0043) [2024-06-18 10:11:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1829797888. 
Throughput: 0: 42515.6. Samples: 1829860120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:11:46,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 10:11:47,302][12883] Updated weights for policy 0, policy_version 111683 (0.0030) [2024-06-18 10:11:51,316][12883] Updated weights for policy 0, policy_version 111693 (0.0044) [2024-06-18 10:11:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1829994496. Throughput: 0: 42537.3. Samples: 1830120000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:11:51,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 10:11:54,804][12883] Updated weights for policy 0, policy_version 111703 (0.0031) [2024-06-18 10:11:56,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1830191104. Throughput: 0: 42695.6. Samples: 1830375920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:11:56,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 10:11:59,047][12883] Updated weights for policy 0, policy_version 111713 (0.0028) [2024-06-18 10:12:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42599.5, 300 sec: 42598.4). Total num frames: 1830453248. Throughput: 0: 42664.9. Samples: 1830503540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:12:01,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 10:12:02,191][12883] Updated weights for policy 0, policy_version 111723 (0.0036) [2024-06-18 10:12:06,695][12883] Updated weights for policy 0, policy_version 111733 (0.0042) [2024-06-18 10:12:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 1830633472. Throughput: 0: 42732.8. Samples: 1830761200. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:12:06,994][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 10:12:09,780][12883] Updated weights for policy 0, policy_version 111743 (0.0028) [2024-06-18 10:12:11,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42432.5). Total num frames: 1830846464. Throughput: 0: 42788.5. Samples: 1831017560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:12:11,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 10:12:14,411][12883] Updated weights for policy 0, policy_version 111753 (0.0033) [2024-06-18 10:12:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1831075840. Throughput: 0: 42731.2. Samples: 1831143520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 10:12:16,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 10:12:17,495][12883] Updated weights for policy 0, policy_version 111763 (0.0043) [2024-06-18 10:12:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1831288832. Throughput: 0: 42657.8. Samples: 1831400200. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:12:21,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 10:12:21,999][12883] Updated weights for policy 0, policy_version 111773 (0.0033) [2024-06-18 10:12:25,770][12883] Updated weights for policy 0, policy_version 111783 (0.0038) [2024-06-18 10:12:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1831501824. Throughput: 0: 42851.7. Samples: 1831660920. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:12:26,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 10:12:29,472][12883] Updated weights for policy 0, policy_version 111793 (0.0036) [2024-06-18 10:12:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1831731200. Throughput: 0: 42729.5. Samples: 1831782940. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:12:31,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 10:12:33,330][12883] Updated weights for policy 0, policy_version 111803 (0.0023) [2024-06-18 10:12:36,994][12645] Fps is (10 sec: 42597.2, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1831927808. Throughput: 0: 42770.5. Samples: 1832044680. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:12:36,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 10:12:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111812_1831927808.pth... [2024-06-18 10:12:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111189_1821720576.pth [2024-06-18 10:12:37,256][12883] Updated weights for policy 0, policy_version 111813 (0.0041) [2024-06-18 10:12:40,983][12883] Updated weights for policy 0, policy_version 111823 (0.0027) [2024-06-18 10:12:42,000][12645] Fps is (10 sec: 40934.1, 60 sec: 42867.1, 300 sec: 42541.9). Total num frames: 1832140800. Throughput: 0: 42809.5. Samples: 1832302620. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:12:42,000][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 10:12:44,868][12883] Updated weights for policy 0, policy_version 111833 (0.0036) [2024-06-18 10:12:46,994][12645] Fps is (10 sec: 42599.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1832353792. Throughput: 0: 42737.9. Samples: 1832426740. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:12:46,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 10:12:48,763][12883] Updated weights for policy 0, policy_version 111843 (0.0033) [2024-06-18 10:12:51,994][12645] Fps is (10 sec: 44263.9, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1832583168. Throughput: 0: 42840.4. Samples: 1832689020. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:12:51,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 10:12:52,545][12883] Updated weights for policy 0, policy_version 111853 (0.0042) [2024-06-18 10:12:56,422][12883] Updated weights for policy 0, policy_version 111863 (0.0037) [2024-06-18 10:12:56,995][12645] Fps is (10 sec: 42591.2, 60 sec: 43143.3, 300 sec: 42542.6). Total num frames: 1832779776. Throughput: 0: 42663.3. Samples: 1832937480. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:12:56,996][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 10:13:00,414][12883] Updated weights for policy 0, policy_version 111873 (0.0034) [2024-06-18 10:13:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1832992768. Throughput: 0: 42603.0. Samples: 1833060660. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:13:01,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 10:13:04,237][12883] Updated weights for policy 0, policy_version 111883 (0.0034) [2024-06-18 10:13:06,994][12645] Fps is (10 sec: 42605.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1833205760. Throughput: 0: 42643.5. Samples: 1833319160. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:13:06,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 10:13:08,120][12883] Updated weights for policy 0, policy_version 111893 (0.0047) [2024-06-18 10:13:08,465][12862] Signal inference workers to stop experience collection... (26750 times) [2024-06-18 10:13:08,507][12883] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-06-18 10:13:08,531][12862] Signal inference workers to resume experience collection... 
(26750 times) [2024-06-18 10:13:08,536][12883] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-06-18 10:13:11,822][12883] Updated weights for policy 0, policy_version 111903 (0.0028) [2024-06-18 10:13:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1833418752. Throughput: 0: 42547.1. Samples: 1833575540. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:13:11,994][12645] Avg episode reward: [(0, '0.678')] [2024-06-18 10:13:15,654][12883] Updated weights for policy 0, policy_version 111913 (0.0037) [2024-06-18 10:13:17,000][12645] Fps is (10 sec: 42571.2, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 1833631744. Throughput: 0: 42723.2. Samples: 1833705760. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-18 10:13:17,009][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 10:13:19,212][12883] Updated weights for policy 0, policy_version 111923 (0.0024) [2024-06-18 10:13:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1833844736. Throughput: 0: 42632.6. Samples: 1833963140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:13:21,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 10:13:23,250][12883] Updated weights for policy 0, policy_version 111933 (0.0029) [2024-06-18 10:13:26,994][12645] Fps is (10 sec: 42625.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1834057728. Throughput: 0: 42566.4. Samples: 1834217840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:13:26,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 10:13:27,058][12883] Updated weights for policy 0, policy_version 111943 (0.0032) [2024-06-18 10:13:30,967][12883] Updated weights for policy 0, policy_version 111953 (0.0026) [2024-06-18 10:13:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1834287104. Throughput: 0: 42731.4. Samples: 1834349660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:13:31,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 10:13:34,598][12883] Updated weights for policy 0, policy_version 111963 (0.0038) [2024-06-18 10:13:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1834483712. Throughput: 0: 42544.1. Samples: 1834603500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:13:36,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 10:13:38,655][12883] Updated weights for policy 0, policy_version 111973 (0.0034) [2024-06-18 10:13:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42602.9, 300 sec: 42543.8). Total num frames: 1834696704. Throughput: 0: 42695.9. Samples: 1834858720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:13:41,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 10:13:42,470][12883] Updated weights for policy 0, policy_version 111983 (0.0034) [2024-06-18 10:13:46,374][12883] Updated weights for policy 0, policy_version 111993 (0.0028) [2024-06-18 10:13:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42598.6). Total num frames: 1834926080. Throughput: 0: 42834.1. Samples: 1834988200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:13:46,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 10:13:50,242][12883] Updated weights for policy 0, policy_version 112003 (0.0031) [2024-06-18 10:13:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.4, 300 sec: 42488.2). Total num frames: 1835106304. Throughput: 0: 42598.6. Samples: 1835236100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:13:51,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 10:13:54,035][12883] Updated weights for policy 0, policy_version 112013 (0.0033) [2024-06-18 10:13:56,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42599.6, 300 sec: 42598.4). Total num frames: 1835335680. Throughput: 0: 42755.1. Samples: 1835499520. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:13:56,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 10:13:57,695][12883] Updated weights for policy 0, policy_version 112023 (0.0034) [2024-06-18 10:14:01,470][12883] Updated weights for policy 0, policy_version 112033 (0.0027) [2024-06-18 10:14:01,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1835565056. Throughput: 0: 42831.8. Samples: 1835632920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:14:01,994][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 10:14:05,275][12883] Updated weights for policy 0, policy_version 112043 (0.0024) [2024-06-18 10:14:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1835761664. Throughput: 0: 42772.4. Samples: 1835887900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:14:06,995][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 10:14:08,938][12883] Updated weights for policy 0, policy_version 112053 (0.0026) [2024-06-18 10:14:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1835974656. Throughput: 0: 42847.9. Samples: 1836146000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:14:11,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 10:14:12,838][12883] Updated weights for policy 0, policy_version 112063 (0.0026) [2024-06-18 10:14:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42602.9, 300 sec: 42487.3). Total num frames: 1836187648. Throughput: 0: 42797.4. Samples: 1836275540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:14:16,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 10:14:17,048][12883] Updated weights for policy 0, policy_version 112073 (0.0041) [2024-06-18 10:14:20,478][12883] Updated weights for policy 0, policy_version 112083 (0.0039) [2024-06-18 10:14:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 1836417024. Throughput: 0: 42749.9. Samples: 1836527240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:14:21,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 10:14:24,849][12883] Updated weights for policy 0, policy_version 112093 (0.0035) [2024-06-18 10:14:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1836613632. Throughput: 0: 42756.8. Samples: 1836782780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:14:26,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 10:14:28,135][12883] Updated weights for policy 0, policy_version 112103 (0.0039) [2024-06-18 10:14:31,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 1836843008. Throughput: 0: 42719.4. Samples: 1836910660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:14:31,996][12645] Avg episode reward: [(0, '0.580')] [2024-06-18 10:14:32,359][12883] Updated weights for policy 0, policy_version 112113 (0.0035) [2024-06-18 10:14:35,044][12862] Signal inference workers to stop experience collection... (26800 times) [2024-06-18 10:14:35,044][12862] Signal inference workers to resume experience collection... 
(26800 times) [2024-06-18 10:14:35,075][12883] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-06-18 10:14:35,075][12883] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-06-18 10:14:35,697][12883] Updated weights for policy 0, policy_version 112123 (0.0035) [2024-06-18 10:14:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1837056000. Throughput: 0: 42836.5. Samples: 1837163740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:14:36,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 10:14:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112125_1837056000.pth... [2024-06-18 10:14:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111501_1826832384.pth [2024-06-18 10:14:39,916][12883] Updated weights for policy 0, policy_version 112133 (0.0042) [2024-06-18 10:14:41,996][12645] Fps is (10 sec: 40960.1, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1837252608. Throughput: 0: 42866.3. Samples: 1837428600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:14:41,996][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 10:14:43,252][12883] Updated weights for policy 0, policy_version 112143 (0.0024) [2024-06-18 10:14:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1837481984. Throughput: 0: 42756.4. Samples: 1837556960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:14:46,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 10:14:47,439][12883] Updated weights for policy 0, policy_version 112153 (0.0040) [2024-06-18 10:14:51,421][12883] Updated weights for policy 0, policy_version 112163 (0.0047) [2024-06-18 10:14:51,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1837694976. Throughput: 0: 42620.6. Samples: 1837805820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:14:51,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 10:14:55,318][12883] Updated weights for policy 0, policy_version 112173 (0.0039) [2024-06-18 10:14:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1837891584. Throughput: 0: 42485.2. Samples: 1838057840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:14:56,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 10:14:58,991][12883] Updated weights for policy 0, policy_version 112183 (0.0037) [2024-06-18 10:15:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1838104576. Throughput: 0: 42512.4. Samples: 1838188600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:15:01,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 10:15:03,190][12883] Updated weights for policy 0, policy_version 112193 (0.0041) [2024-06-18 10:15:06,612][12883] Updated weights for policy 0, policy_version 112203 (0.0039) [2024-06-18 10:15:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1838333952. Throughput: 0: 42488.3. Samples: 1838439220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:15:06,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 10:15:11,079][12883] Updated weights for policy 0, policy_version 112213 (0.0041) [2024-06-18 10:15:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1838530560. Throughput: 0: 42497.4. Samples: 1838695160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:15:11,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 10:15:14,293][12883] Updated weights for policy 0, policy_version 112223 (0.0039) [2024-06-18 10:15:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1838743552. Throughput: 0: 42481.6. Samples: 1838822240. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 10:15:16,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 10:15:18,543][12883] Updated weights for policy 0, policy_version 112233 (0.0031) [2024-06-18 10:15:21,925][12883] Updated weights for policy 0, policy_version 112243 (0.0035) [2024-06-18 10:15:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 1838989312. Throughput: 0: 42743.0. Samples: 1839087180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:15:21,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 10:15:25,993][12883] Updated weights for policy 0, policy_version 112253 (0.0040) [2024-06-18 10:15:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1839185920. Throughput: 0: 42491.0. Samples: 1839340600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:15:26,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 10:15:29,906][12883] Updated weights for policy 0, policy_version 112263 (0.0039) [2024-06-18 10:15:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42326.9, 300 sec: 42543.8). Total num frames: 1839382528. Throughput: 0: 42331.2. Samples: 1839461860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:15:31,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 10:15:33,803][12883] Updated weights for policy 0, policy_version 112273 (0.0040) [2024-06-18 10:15:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1839611904. Throughput: 0: 42604.4. Samples: 1839723020. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:15:36,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 10:15:37,792][12883] Updated weights for policy 0, policy_version 112283 (0.0041) [2024-06-18 10:15:41,482][12883] Updated weights for policy 0, policy_version 112293 (0.0026) [2024-06-18 10:15:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 1839824896. Throughput: 0: 42545.9. Samples: 1839972400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:15:41,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 10:15:45,411][12883] Updated weights for policy 0, policy_version 112303 (0.0043) [2024-06-18 10:15:46,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1840037888. Throughput: 0: 42522.7. Samples: 1840102220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:15:46,997][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 10:15:49,080][12883] Updated weights for policy 0, policy_version 112313 (0.0033) [2024-06-18 10:15:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1840250880. Throughput: 0: 42791.1. Samples: 1840364820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:15:51,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 10:15:53,038][12883] Updated weights for policy 0, policy_version 112323 (0.0042) [2024-06-18 10:15:56,723][12883] Updated weights for policy 0, policy_version 112333 (0.0031) [2024-06-18 10:15:56,994][12645] Fps is (10 sec: 44247.2, 60 sec: 43144.6, 300 sec: 42654.2). Total num frames: 1840480256. Throughput: 0: 42643.1. Samples: 1840614100. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:15:56,994][12645] Avg episode reward: [(0, '0.555')] [2024-06-18 10:15:59,244][12862] Signal inference workers to stop experience collection... 
(26850 times) [2024-06-18 10:15:59,244][12862] Signal inference workers to resume experience collection... (26850 times) [2024-06-18 10:15:59,298][12883] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-06-18 10:15:59,299][12883] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-06-18 10:16:00,796][12883] Updated weights for policy 0, policy_version 112343 (0.0028) [2024-06-18 10:16:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1840660480. Throughput: 0: 42639.2. Samples: 1840741000. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:16:01,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 10:16:04,278][12883] Updated weights for policy 0, policy_version 112353 (0.0027) [2024-06-18 10:16:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1840873472. Throughput: 0: 42570.6. Samples: 1841002860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:16:06,998][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 10:16:08,405][12883] Updated weights for policy 0, policy_version 112363 (0.0036) [2024-06-18 10:16:11,938][12883] Updated weights for policy 0, policy_version 112373 (0.0032) [2024-06-18 10:16:11,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 1841119232. Throughput: 0: 42479.5. Samples: 1841252180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:16:11,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 10:16:16,051][12883] Updated weights for policy 0, policy_version 112383 (0.0041) [2024-06-18 10:16:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1841315840. Throughput: 0: 42804.0. Samples: 1841388040. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:16:16,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 10:16:19,455][12883] Updated weights for policy 0, policy_version 112393 (0.0031) [2024-06-18 10:16:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1841512448. Throughput: 0: 42515.6. Samples: 1841636220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:16:21,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 10:16:23,739][12883] Updated weights for policy 0, policy_version 112403 (0.0023) [2024-06-18 10:16:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1841741824. Throughput: 0: 42566.6. Samples: 1841887900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:16:26,994][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 10:16:27,615][12883] Updated weights for policy 0, policy_version 112413 (0.0034) [2024-06-18 10:16:31,553][12883] Updated weights for policy 0, policy_version 112423 (0.0038) [2024-06-18 10:16:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1841954816. Throughput: 0: 42719.9. Samples: 1842024520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:16:31,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 10:16:35,155][12883] Updated weights for policy 0, policy_version 112433 (0.0032) [2024-06-18 10:16:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1842135040. Throughput: 0: 42473.9. Samples: 1842276140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:16:36,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 10:16:37,100][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112436_1842151424.pth... 
[2024-06-18 10:16:37,145][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000111812_1831927808.pth [2024-06-18 10:16:39,102][12883] Updated weights for policy 0, policy_version 112443 (0.0027) [2024-06-18 10:16:41,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1842380800. Throughput: 0: 42592.9. Samples: 1842530780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:16:41,994][12645] Avg episode reward: [(0, '0.668')] [2024-06-18 10:16:42,878][12883] Updated weights for policy 0, policy_version 112453 (0.0037) [2024-06-18 10:16:46,882][12883] Updated weights for policy 0, policy_version 112463 (0.0036) [2024-06-18 10:16:46,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42600.1, 300 sec: 42709.5). Total num frames: 1842593792. Throughput: 0: 42784.1. Samples: 1842666280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:16:46,994][12645] Avg episode reward: [(0, '0.675')] [2024-06-18 10:16:50,914][12883] Updated weights for policy 0, policy_version 112473 (0.0034) [2024-06-18 10:16:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1842790400. Throughput: 0: 42566.3. Samples: 1842918340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:16:51,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 10:16:54,497][12883] Updated weights for policy 0, policy_version 112483 (0.0039) [2024-06-18 10:16:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1843036160. Throughput: 0: 42638.2. Samples: 1843170900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:16:56,994][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 10:16:58,624][12883] Updated weights for policy 0, policy_version 112493 (0.0035) [2024-06-18 10:17:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1843232768. Throughput: 0: 42545.7. Samples: 1843302600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:17:01,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 10:17:02,131][12883] Updated weights for policy 0, policy_version 112503 (0.0041) [2024-06-18 10:17:06,323][12883] Updated weights for policy 0, policy_version 112513 (0.0030) [2024-06-18 10:17:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1843445760. Throughput: 0: 42612.0. Samples: 1843553760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:17:06,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 10:17:09,313][12862] Signal inference workers to stop experience collection... (26900 times) [2024-06-18 10:17:09,368][12883] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-06-18 10:17:09,373][12862] Signal inference workers to resume experience collection... (26900 times) [2024-06-18 10:17:09,381][12883] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-06-18 10:17:09,840][12883] Updated weights for policy 0, policy_version 112523 (0.0037) [2024-06-18 10:17:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1843658752. Throughput: 0: 42696.5. Samples: 1843809240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:17:11,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 10:17:13,831][12883] Updated weights for policy 0, policy_version 112533 (0.0028) [2024-06-18 10:17:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1843871744. Throughput: 0: 42481.4. Samples: 1843936180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:17:16,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 10:17:17,452][12883] Updated weights for policy 0, policy_version 112543 (0.0032) [2024-06-18 10:17:21,665][12883] Updated weights for policy 0, policy_version 112553 (0.0031) [2024-06-18 10:17:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1844068352. Throughput: 0: 42487.2. Samples: 1844188060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 10:17:21,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 10:17:25,273][12883] Updated weights for policy 0, policy_version 112563 (0.0043) [2024-06-18 10:17:26,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 1844281344. Throughput: 0: 42553.3. Samples: 1844445780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:17:26,997][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 10:17:29,298][12883] Updated weights for policy 0, policy_version 112573 (0.0034) [2024-06-18 10:17:31,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1844494336. Throughput: 0: 42405.6. Samples: 1844574540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:17:31,995][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 10:17:33,034][12883] Updated weights for policy 0, policy_version 112583 (0.0037) [2024-06-18 10:17:36,837][12883] Updated weights for policy 0, policy_version 112593 (0.0038) [2024-06-18 10:17:36,994][12645] Fps is (10 sec: 44246.8, 60 sec: 43144.5, 300 sec: 42654.8). Total num frames: 1844723712. Throughput: 0: 42450.7. Samples: 1844828620. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:17:36,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 10:17:41,016][12883] Updated weights for policy 0, policy_version 112603 (0.0034) [2024-06-18 10:17:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1844936704. Throughput: 0: 42486.6. Samples: 1845082800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:17:41,994][12645] Avg episode reward: [(0, '0.706')] [2024-06-18 10:17:44,645][12883] Updated weights for policy 0, policy_version 112613 (0.0038) [2024-06-18 10:17:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1845133312. Throughput: 0: 42504.5. Samples: 1845215300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:17:46,994][12645] Avg episode reward: [(0, '0.820')] [2024-06-18 10:17:48,602][12883] Updated weights for policy 0, policy_version 112623 (0.0039) [2024-06-18 10:17:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 1845346304. Throughput: 0: 42499.6. Samples: 1845466240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:17:51,994][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 10:17:52,469][12883] Updated weights for policy 0, policy_version 112633 (0.0031) [2024-06-18 10:17:56,321][12883] Updated weights for policy 0, policy_version 112643 (0.0037) [2024-06-18 10:17:56,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 1845575680. Throughput: 0: 42565.8. Samples: 1845724800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:17:56,996][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 10:18:00,323][12883] Updated weights for policy 0, policy_version 112653 (0.0036) [2024-06-18 10:18:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1845772288. Throughput: 0: 42585.4. Samples: 1845852520. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:18:01,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 10:18:04,042][12883] Updated weights for policy 0, policy_version 112663 (0.0036) [2024-06-18 10:18:06,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1845985280. Throughput: 0: 42817.1. Samples: 1846114840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:18:06,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 10:18:07,926][12883] Updated weights for policy 0, policy_version 112673 (0.0028) [2024-06-18 10:18:11,687][12883] Updated weights for policy 0, policy_version 112683 (0.0032) [2024-06-18 10:18:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42599.3). Total num frames: 1846198272. Throughput: 0: 42772.9. Samples: 1846370460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:18:11,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 10:18:15,518][12883] Updated weights for policy 0, policy_version 112693 (0.0033) [2024-06-18 10:18:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1846411264. Throughput: 0: 42746.7. Samples: 1846498140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:18:16,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 10:18:19,191][12883] Updated weights for policy 0, policy_version 112703 (0.0031) [2024-06-18 10:18:21,240][12862] Signal inference workers to stop experience collection... (26950 times) [2024-06-18 10:18:21,240][12862] Signal inference workers to resume experience collection... (26950 times) [2024-06-18 10:18:21,284][12883] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-06-18 10:18:21,285][12883] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-06-18 10:18:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 1846640640. Throughput: 0: 42818.6. 
Samples: 1846755460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 10:18:21,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 10:18:23,513][12883] Updated weights for policy 0, policy_version 112713 (0.0038) [2024-06-18 10:18:26,774][12883] Updated weights for policy 0, policy_version 112723 (0.0033) [2024-06-18 10:18:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 1846853632. Throughput: 0: 42851.1. Samples: 1847011100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:18:26,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 10:18:31,095][12883] Updated weights for policy 0, policy_version 112733 (0.0042) [2024-06-18 10:18:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1847066624. Throughput: 0: 42764.4. Samples: 1847139700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:18:31,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 10:18:34,421][12883] Updated weights for policy 0, policy_version 112743 (0.0046) [2024-06-18 10:18:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1847279616. Throughput: 0: 42948.4. Samples: 1847398920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:18:36,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 10:18:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112749_1847279616.pth... [2024-06-18 10:18:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112125_1837056000.pth [2024-06-18 10:18:38,657][12883] Updated weights for policy 0, policy_version 112753 (0.0047) [2024-06-18 10:18:41,932][12883] Updated weights for policy 0, policy_version 112763 (0.0041) [2024-06-18 10:18:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1847508992. Throughput: 0: 42898.1. Samples: 1847655120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:18:41,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 10:18:46,198][12883] Updated weights for policy 0, policy_version 112773 (0.0034) [2024-06-18 10:18:46,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1847705600. Throughput: 0: 42927.6. Samples: 1847784260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:18:46,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 10:18:49,646][12883] Updated weights for policy 0, policy_version 112783 (0.0025) [2024-06-18 10:18:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1847918592. Throughput: 0: 42791.5. Samples: 1848040460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:18:51,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 10:18:53,697][12883] Updated weights for policy 0, policy_version 112793 (0.0037) [2024-06-18 10:18:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 1848147968. Throughput: 0: 42689.8. Samples: 1848291500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:18:56,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 10:18:57,271][12883] Updated weights for policy 0, policy_version 112803 (0.0033) [2024-06-18 10:19:01,357][12883] Updated weights for policy 0, policy_version 112813 (0.0035) [2024-06-18 10:19:01,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 1848344576. Throughput: 0: 42861.5. Samples: 1848426900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:19:01,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 10:19:04,807][12883] Updated weights for policy 0, policy_version 112823 (0.0046) [2024-06-18 10:19:06,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1848541184. Throughput: 0: 42829.3. Samples: 1848682780. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:19:06,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 10:19:08,863][12883] Updated weights for policy 0, policy_version 112833 (0.0032) [2024-06-18 10:19:11,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1848786944. Throughput: 0: 42827.6. Samples: 1848938340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:19:11,996][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 10:19:12,376][12883] Updated weights for policy 0, policy_version 112843 (0.0033) [2024-06-18 10:19:16,922][12883] Updated weights for policy 0, policy_version 112853 (0.0034) [2024-06-18 10:19:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1848983552. Throughput: 0: 43034.3. Samples: 1849076240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:19:16,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 10:19:20,060][12883] Updated weights for policy 0, policy_version 112863 (0.0031) [2024-06-18 10:19:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1849196544. Throughput: 0: 42834.2. Samples: 1849326460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:19:21,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 10:19:24,447][12883] Updated weights for policy 0, policy_version 112873 (0.0041) [2024-06-18 10:19:26,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 1849442304. Throughput: 0: 42853.9. Samples: 1849583540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:19:26,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 10:19:27,932][12883] Updated weights for policy 0, policy_version 112883 (0.0033) [2024-06-18 10:19:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1849622528. Throughput: 0: 42954.5. Samples: 1849717220. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:19:31,995][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 10:19:32,185][12883] Updated weights for policy 0, policy_version 112893 (0.0034) [2024-06-18 10:19:35,925][12883] Updated weights for policy 0, policy_version 112903 (0.0038) [2024-06-18 10:19:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1849851904. Throughput: 0: 42869.5. Samples: 1849969580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:19:36,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 10:19:39,842][12883] Updated weights for policy 0, policy_version 112913 (0.0036) [2024-06-18 10:19:41,532][12862] Signal inference workers to stop experience collection... (27000 times) [2024-06-18 10:19:41,532][12862] Signal inference workers to resume experience collection... (27000 times) [2024-06-18 10:19:41,548][12883] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-06-18 10:19:41,548][12883] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-06-18 10:19:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1850081280. Throughput: 0: 42850.6. Samples: 1850219780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:19:41,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 10:19:43,650][12883] Updated weights for policy 0, policy_version 112923 (0.0031) [2024-06-18 10:19:46,999][12645] Fps is (10 sec: 40939.2, 60 sec: 42594.8, 300 sec: 42597.7). Total num frames: 1850261504. Throughput: 0: 42768.0. Samples: 1850351680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:19:46,999][12645] Avg episode reward: [(0, '0.717')] [2024-06-18 10:19:47,444][12883] Updated weights for policy 0, policy_version 112933 (0.0057) [2024-06-18 10:19:51,434][12883] Updated weights for policy 0, policy_version 112943 (0.0034) [2024-06-18 10:19:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1850490880. Throughput: 0: 42613.8. Samples: 1850600400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:19:51,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 10:19:55,146][12883] Updated weights for policy 0, policy_version 112953 (0.0039) [2024-06-18 10:19:56,994][12645] Fps is (10 sec: 45897.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1850720256. Throughput: 0: 42541.7. Samples: 1850852720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:19:56,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 10:19:58,977][12883] Updated weights for policy 0, policy_version 112963 (0.0029) [2024-06-18 10:20:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1850900480. Throughput: 0: 42345.3. Samples: 1850981780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:20:01,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 10:20:02,799][12883] Updated weights for policy 0, policy_version 112973 (0.0042) [2024-06-18 10:20:06,586][12883] Updated weights for policy 0, policy_version 112983 (0.0032) [2024-06-18 10:20:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1851129856. Throughput: 0: 42498.4. Samples: 1851238880. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:20:06,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 10:20:10,565][12883] Updated weights for policy 0, policy_version 112993 (0.0027) [2024-06-18 10:20:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1851342848. Throughput: 0: 42503.5. Samples: 1851496200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:20:11,995][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 10:20:14,162][12883] Updated weights for policy 0, policy_version 113003 (0.0023) [2024-06-18 10:20:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1851539456. Throughput: 0: 42417.4. Samples: 1851626000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:20:16,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 10:20:18,292][12883] Updated weights for policy 0, policy_version 113013 (0.0030) [2024-06-18 10:20:21,734][12883] Updated weights for policy 0, policy_version 113023 (0.0035) [2024-06-18 10:20:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1851768832. Throughput: 0: 42447.9. Samples: 1851879740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:20:21,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 10:20:25,730][12883] Updated weights for policy 0, policy_version 113033 (0.0030) [2024-06-18 10:20:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1851981824. Throughput: 0: 42618.3. Samples: 1852137600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:20:26,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 10:20:29,516][12883] Updated weights for policy 0, policy_version 113043 (0.0039) [2024-06-18 10:20:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1852178432. Throughput: 0: 42503.7. Samples: 1852264140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:20:31,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 10:20:33,428][12883] Updated weights for policy 0, policy_version 113053 (0.0038) [2024-06-18 10:20:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1852407808. Throughput: 0: 42683.2. Samples: 1852521140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:20:36,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 10:20:37,055][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113063_1852424192.pth... [2024-06-18 10:20:37,061][12883] Updated weights for policy 0, policy_version 113063 (0.0033) [2024-06-18 10:20:37,121][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112436_1842151424.pth [2024-06-18 10:20:40,940][12883] Updated weights for policy 0, policy_version 113073 (0.0047) [2024-06-18 10:20:41,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 1852620800. Throughput: 0: 42655.2. Samples: 1852772200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:20:41,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 10:20:44,792][12883] Updated weights for policy 0, policy_version 113083 (0.0034) [2024-06-18 10:20:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42601.9, 300 sec: 42598.4). Total num frames: 1852817408. Throughput: 0: 42723.4. Samples: 1852904340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:20:46,994][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 10:20:48,821][12883] Updated weights for policy 0, policy_version 113093 (0.0034) [2024-06-18 10:20:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1853046784. Throughput: 0: 42551.1. Samples: 1853153680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:20:51,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 10:20:52,438][12883] Updated weights for policy 0, policy_version 113103 (0.0035) [2024-06-18 10:20:56,688][12883] Updated weights for policy 0, policy_version 113113 (0.0031) [2024-06-18 10:20:56,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1853259776. Throughput: 0: 42531.2. Samples: 1853410100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:20:56,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 10:20:59,968][12883] Updated weights for policy 0, policy_version 113123 (0.0031) [2024-06-18 10:21:00,700][12862] Signal inference workers to stop experience collection... (27050 times) [2024-06-18 10:21:00,700][12862] Signal inference workers to resume experience collection... (27050 times) [2024-06-18 10:21:00,749][12883] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-06-18 10:21:00,749][12883] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-06-18 10:21:01,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1853456384. Throughput: 0: 42387.5. Samples: 1853533440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:21:01,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 10:21:04,239][12883] Updated weights for policy 0, policy_version 113133 (0.0048) [2024-06-18 10:21:06,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1853669376. Throughput: 0: 42505.3. Samples: 1853792480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:21:06,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 10:21:07,746][12883] Updated weights for policy 0, policy_version 113143 (0.0033) [2024-06-18 10:21:11,982][12883] Updated weights for policy 0, policy_version 113153 (0.0030) [2024-06-18 10:21:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1853898752. Throughput: 0: 42556.4. Samples: 1854052640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:21:11,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 10:21:15,395][12883] Updated weights for policy 0, policy_version 113163 (0.0024) [2024-06-18 10:21:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1854111744. Throughput: 0: 42579.2. Samples: 1854180200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:21:16,994][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 10:21:19,707][12883] Updated weights for policy 0, policy_version 113173 (0.0032) [2024-06-18 10:21:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1854308352. Throughput: 0: 42335.5. Samples: 1854426240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:21:21,996][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 10:21:23,076][12883] Updated weights for policy 0, policy_version 113183 (0.0038) [2024-06-18 10:21:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1854504960. Throughput: 0: 42639.4. Samples: 1854690980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 10:21:26,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 10:21:27,567][12883] Updated weights for policy 0, policy_version 113193 (0.0046) [2024-06-18 10:21:31,051][12883] Updated weights for policy 0, policy_version 113203 (0.0025) [2024-06-18 10:21:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1854750720. Throughput: 0: 42387.7. Samples: 1854811780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:21:31,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 10:21:35,282][12883] Updated weights for policy 0, policy_version 113213 (0.0047) [2024-06-18 10:21:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1854947328. Throughput: 0: 42470.2. Samples: 1855064840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:21:36,994][12645] Avg episode reward: [(0, '0.138')] [2024-06-18 10:21:38,880][12883] Updated weights for policy 0, policy_version 113223 (0.0041) [2024-06-18 10:21:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1855160320. Throughput: 0: 42543.5. Samples: 1855324560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:21:41,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 10:21:42,926][12883] Updated weights for policy 0, policy_version 113233 (0.0041) [2024-06-18 10:21:46,619][12883] Updated weights for policy 0, policy_version 113243 (0.0043) [2024-06-18 10:21:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1855389696. Throughput: 0: 42688.0. Samples: 1855454400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:21:46,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 10:21:50,624][12883] Updated weights for policy 0, policy_version 113253 (0.0030) [2024-06-18 10:21:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 1855586304. Throughput: 0: 42624.4. Samples: 1855710580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:21:51,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 10:21:54,245][12883] Updated weights for policy 0, policy_version 113263 (0.0049) [2024-06-18 10:21:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1855799296. Throughput: 0: 42504.8. Samples: 1855965360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:21:56,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 10:21:58,225][12883] Updated weights for policy 0, policy_version 113273 (0.0033) [2024-06-18 10:22:00,417][12862] Signal inference workers to stop experience collection... (27100 times) [2024-06-18 10:22:00,456][12883] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-06-18 10:22:00,465][12862] Signal inference workers to resume experience collection... (27100 times) [2024-06-18 10:22:00,476][12883] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-06-18 10:22:01,971][12883] Updated weights for policy 0, policy_version 113283 (0.0030) [2024-06-18 10:22:01,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1856028672. Throughput: 0: 42497.1. Samples: 1856092660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:22:01,996][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 10:22:05,803][12883] Updated weights for policy 0, policy_version 113293 (0.0038) [2024-06-18 10:22:06,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 1856241664. 
Throughput: 0: 42842.3. Samples: 1856354140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:22:06,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 10:22:09,595][12883] Updated weights for policy 0, policy_version 113303 (0.0025) [2024-06-18 10:22:11,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1856438272. Throughput: 0: 42609.4. Samples: 1856608400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:22:11,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 10:22:13,442][12883] Updated weights for policy 0, policy_version 113313 (0.0046) [2024-06-18 10:22:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1856634880. Throughput: 0: 42742.6. Samples: 1856735200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:22:16,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 10:22:17,337][12883] Updated weights for policy 0, policy_version 113323 (0.0039) [2024-06-18 10:22:21,139][12883] Updated weights for policy 0, policy_version 113333 (0.0043) [2024-06-18 10:22:21,996][12645] Fps is (10 sec: 45865.0, 60 sec: 43142.9, 300 sec: 42765.0). Total num frames: 1856897024. Throughput: 0: 42868.9. Samples: 1856994040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:22:21,997][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 10:22:24,999][12883] Updated weights for policy 0, policy_version 113343 (0.0021) [2024-06-18 10:22:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1857077248. Throughput: 0: 42834.1. Samples: 1857252100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 10:22:26,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 10:22:28,654][12883] Updated weights for policy 0, policy_version 113353 (0.0044) [2024-06-18 10:22:31,994][12645] Fps is (10 sec: 39330.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1857290240. 
Throughput: 0: 42654.7. Samples: 1857373860. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:22:31,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 10:22:32,588][12883] Updated weights for policy 0, policy_version 113363 (0.0028) [2024-06-18 10:22:36,284][12883] Updated weights for policy 0, policy_version 113373 (0.0027) [2024-06-18 10:22:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1857536000. Throughput: 0: 42749.3. Samples: 1857634300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:22:36,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 10:22:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113375_1857536000.pth... [2024-06-18 10:22:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000112749_1847279616.pth [2024-06-18 10:22:40,311][12883] Updated weights for policy 0, policy_version 113383 (0.0036) [2024-06-18 10:22:41,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1857732608. Throughput: 0: 42869.1. Samples: 1857894460. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:22:41,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 10:22:43,736][12883] Updated weights for policy 0, policy_version 113393 (0.0032) [2024-06-18 10:22:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1857945600. Throughput: 0: 42797.2. Samples: 1858018440. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:22:46,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 10:22:47,800][12883] Updated weights for policy 0, policy_version 113403 (0.0036) [2024-06-18 10:22:51,404][12883] Updated weights for policy 0, policy_version 113413 (0.0038) [2024-06-18 10:22:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.7, 300 sec: 42709.8). Total num frames: 1858174976. Throughput: 0: 42854.2. 
Samples: 1858282580. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:22:51,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 10:22:55,814][12883] Updated weights for policy 0, policy_version 113423 (0.0047) [2024-06-18 10:22:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1858371584. Throughput: 0: 42637.0. Samples: 1858527060. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:22:56,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 10:22:59,035][12883] Updated weights for policy 0, policy_version 113433 (0.0030) [2024-06-18 10:23:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1858568192. Throughput: 0: 42684.4. Samples: 1858656000. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:23:01,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 10:23:03,292][12883] Updated weights for policy 0, policy_version 113443 (0.0027) [2024-06-18 10:23:06,630][12862] Signal inference workers to stop experience collection... (27150 times) [2024-06-18 10:23:06,686][12862] Signal inference workers to resume experience collection... (27150 times) [2024-06-18 10:23:06,688][12883] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-06-18 10:23:06,709][12883] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-06-18 10:23:06,823][12883] Updated weights for policy 0, policy_version 113453 (0.0029) [2024-06-18 10:23:06,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1858813952. Throughput: 0: 42733.5. Samples: 1858916960. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:23:06,995][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 10:23:11,193][12883] Updated weights for policy 0, policy_version 113463 (0.0032) [2024-06-18 10:23:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1858994176. 
Throughput: 0: 42693.9. Samples: 1859173320. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:23:11,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 10:23:14,386][12883] Updated weights for policy 0, policy_version 113473 (0.0031) [2024-06-18 10:23:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1859207168. Throughput: 0: 42686.6. Samples: 1859294760. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:23:16,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 10:23:18,819][12883] Updated weights for policy 0, policy_version 113483 (0.0032) [2024-06-18 10:23:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1859452928. Throughput: 0: 42699.2. Samples: 1859555760. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:23:21,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 10:23:22,046][12883] Updated weights for policy 0, policy_version 113493 (0.0038) [2024-06-18 10:23:26,310][12883] Updated weights for policy 0, policy_version 113503 (0.0028) [2024-06-18 10:23:26,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1859649536. Throughput: 0: 42629.3. Samples: 1859812780. Policy #0 lag: (min: 1.0, avg: 11.5, max: 22.0) [2024-06-18 10:23:26,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 10:23:29,630][12883] Updated weights for policy 0, policy_version 113513 (0.0027) [2024-06-18 10:23:31,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1859862528. Throughput: 0: 42650.8. Samples: 1859937820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:23:31,996][12645] Avg episode reward: [(0, '0.561')] [2024-06-18 10:23:33,709][12883] Updated weights for policy 0, policy_version 113523 (0.0041) [2024-06-18 10:23:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 1860091904. 
Throughput: 0: 42730.2. Samples: 1860205440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:23:36,994][12645] Avg episode reward: [(0, '0.668')] [2024-06-18 10:23:37,441][12883] Updated weights for policy 0, policy_version 113533 (0.0028) [2024-06-18 10:23:41,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1860288512. Throughput: 0: 42993.2. Samples: 1860461760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:23:41,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 10:23:42,001][12883] Updated weights for policy 0, policy_version 113543 (0.0041) [2024-06-18 10:23:44,932][12883] Updated weights for policy 0, policy_version 113553 (0.0042) [2024-06-18 10:23:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1860517888. Throughput: 0: 42795.5. Samples: 1860581800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:23:46,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 10:23:49,474][12883] Updated weights for policy 0, policy_version 113563 (0.0034) [2024-06-18 10:23:51,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42596.7, 300 sec: 42653.6). Total num frames: 1860730880. Throughput: 0: 42842.0. Samples: 1860844940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:23:51,996][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 10:23:52,907][12883] Updated weights for policy 0, policy_version 113573 (0.0022) [2024-06-18 10:23:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1860911104. Throughput: 0: 42856.9. Samples: 1861101880. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:23:56,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 10:23:57,177][12883] Updated weights for policy 0, policy_version 113583 (0.0029) [2024-06-18 10:24:00,316][12883] Updated weights for policy 0, policy_version 113593 (0.0045) [2024-06-18 10:24:01,994][12645] Fps is (10 sec: 42607.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1861156864. Throughput: 0: 42872.1. Samples: 1861224000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:24:01,994][12645] Avg episode reward: [(0, '0.689')] [2024-06-18 10:24:04,811][12883] Updated weights for policy 0, policy_version 113603 (0.0032) [2024-06-18 10:24:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1861369856. Throughput: 0: 42734.2. Samples: 1861478800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:24:06,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 10:24:08,165][12883] Updated weights for policy 0, policy_version 113613 (0.0026) [2024-06-18 10:24:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1861566464. Throughput: 0: 42757.7. Samples: 1861736880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:24:11,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 10:24:12,320][12883] Updated weights for policy 0, policy_version 113623 (0.0040) [2024-06-18 10:24:15,717][12883] Updated weights for policy 0, policy_version 113633 (0.0036) [2024-06-18 10:24:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 1861795840. Throughput: 0: 42712.4. Samples: 1861859780. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:24:16,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 10:24:20,417][12883] Updated weights for policy 0, policy_version 113643 (0.0028) [2024-06-18 10:24:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1862008832. Throughput: 0: 42545.6. Samples: 1862120000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:24:21,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 10:24:23,379][12883] Updated weights for policy 0, policy_version 113653 (0.0032) [2024-06-18 10:24:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1862205440. Throughput: 0: 42512.5. Samples: 1862374820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 10:24:26,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 10:24:27,989][12883] Updated weights for policy 0, policy_version 113663 (0.0037) [2024-06-18 10:24:31,194][12883] Updated weights for policy 0, policy_version 113673 (0.0043) [2024-06-18 10:24:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 1862451200. Throughput: 0: 42648.6. Samples: 1862500980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:24:31,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 10:24:35,548][12883] Updated weights for policy 0, policy_version 113683 (0.0037) [2024-06-18 10:24:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 1862631424. Throughput: 0: 42402.9. Samples: 1862752980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:24:36,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 10:24:37,151][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113687_1862647808.pth... 
[2024-06-18 10:24:37,216][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113063_1852424192.pth [2024-06-18 10:24:38,603][12862] Signal inference workers to stop experience collection... (27200 times) [2024-06-18 10:24:38,603][12862] Signal inference workers to resume experience collection... (27200 times) [2024-06-18 10:24:38,647][12883] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-06-18 10:24:38,647][12883] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-06-18 10:24:39,039][12883] Updated weights for policy 0, policy_version 113693 (0.0044) [2024-06-18 10:24:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42654.7). Total num frames: 1862844416. Throughput: 0: 42496.1. Samples: 1863014200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:24:41,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 10:24:43,438][12883] Updated weights for policy 0, policy_version 113703 (0.0037) [2024-06-18 10:24:46,632][12883] Updated weights for policy 0, policy_version 113713 (0.0039) [2024-06-18 10:24:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1863090176. Throughput: 0: 42516.9. Samples: 1863137260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:24:46,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 10:24:50,999][12883] Updated weights for policy 0, policy_version 113723 (0.0032) [2024-06-18 10:24:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 1863286784. Throughput: 0: 42622.7. Samples: 1863396820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:24:51,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 10:24:54,605][12883] Updated weights for policy 0, policy_version 113733 (0.0028) [2024-06-18 10:24:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1863483392. 
Throughput: 0: 42666.8. Samples: 1863656880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:24:56,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 10:24:58,533][12883] Updated weights for policy 0, policy_version 113743 (0.0039) [2024-06-18 10:25:01,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 1863696384. Throughput: 0: 42509.8. Samples: 1863772820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:25:01,996][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 10:25:02,386][12883] Updated weights for policy 0, policy_version 113753 (0.0049) [2024-06-18 10:25:06,061][12883] Updated weights for policy 0, policy_version 113763 (0.0047) [2024-06-18 10:25:06,996][12645] Fps is (10 sec: 45864.7, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 1863942144. Throughput: 0: 42428.6. Samples: 1864029380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:25:06,997][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 10:25:10,502][12883] Updated weights for policy 0, policy_version 113773 (0.0037) [2024-06-18 10:25:11,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1864105984. Throughput: 0: 42596.9. Samples: 1864291680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:25:11,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 10:25:13,628][12883] Updated weights for policy 0, policy_version 113783 (0.0041) [2024-06-18 10:25:16,994][12645] Fps is (10 sec: 39330.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1864335360. Throughput: 0: 42397.2. Samples: 1864408860. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:25:16,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 10:25:18,157][12883] Updated weights for policy 0, policy_version 113793 (0.0033) [2024-06-18 10:25:21,265][12883] Updated weights for policy 0, policy_version 113803 (0.0035) [2024-06-18 10:25:21,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1864581120. Throughput: 0: 42648.1. Samples: 1864672140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:25:21,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 10:25:25,642][12883] Updated weights for policy 0, policy_version 113813 (0.0031) [2024-06-18 10:25:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1864744960. Throughput: 0: 42523.8. Samples: 1864927780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:25:26,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 10:25:29,122][12883] Updated weights for policy 0, policy_version 113823 (0.0035) [2024-06-18 10:25:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1864990720. Throughput: 0: 42517.9. Samples: 1865050560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 26.0) [2024-06-18 10:25:31,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 10:25:33,183][12883] Updated weights for policy 0, policy_version 113833 (0.0038) [2024-06-18 10:25:36,711][12883] Updated weights for policy 0, policy_version 113843 (0.0038) [2024-06-18 10:25:36,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1865220096. Throughput: 0: 42663.8. Samples: 1865316700. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:25:36,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 10:25:40,737][12883] Updated weights for policy 0, policy_version 113853 (0.0041) [2024-06-18 10:25:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 1865400320. Throughput: 0: 42643.5. Samples: 1865575840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:25:41,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 10:25:44,409][12883] Updated weights for policy 0, policy_version 113863 (0.0030) [2024-06-18 10:25:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1865629696. Throughput: 0: 42756.2. Samples: 1865696760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:25:46,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 10:25:48,271][12883] Updated weights for policy 0, policy_version 113873 (0.0030) [2024-06-18 10:25:51,971][12883] Updated weights for policy 0, policy_version 113883 (0.0037) [2024-06-18 10:25:51,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1865859072. Throughput: 0: 42994.6. Samples: 1865964040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:25:51,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 10:25:52,063][12862] Signal inference workers to stop experience collection... (27250 times) [2024-06-18 10:25:52,110][12883] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-06-18 10:25:52,121][12862] Signal inference workers to resume experience collection... (27250 times) [2024-06-18 10:25:52,140][12883] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-06-18 10:25:56,282][12883] Updated weights for policy 0, policy_version 113893 (0.0035) [2024-06-18 10:25:56,996][12645] Fps is (10 sec: 39313.3, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 1866022912. 
Throughput: 0: 42735.2. Samples: 1866214860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:25:56,997][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 10:25:59,570][12883] Updated weights for policy 0, policy_version 113903 (0.0044) [2024-06-18 10:26:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42873.2, 300 sec: 42709.5). Total num frames: 1866268672. Throughput: 0: 42852.6. Samples: 1866337220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:26:01,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 10:26:03,835][12883] Updated weights for policy 0, policy_version 113913 (0.0027) [2024-06-18 10:26:06,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 1866481664. Throughput: 0: 42918.2. Samples: 1866603460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:26:06,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 10:26:07,311][12883] Updated weights for policy 0, policy_version 113923 (0.0050) [2024-06-18 10:26:11,666][12883] Updated weights for policy 0, policy_version 113933 (0.0033) [2024-06-18 10:26:11,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1866678272. Throughput: 0: 42958.3. Samples: 1866860900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:26:11,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 10:26:14,962][12883] Updated weights for policy 0, policy_version 113943 (0.0030) [2024-06-18 10:26:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1866924032. Throughput: 0: 42952.4. Samples: 1866983420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:26:16,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 10:26:19,392][12883] Updated weights for policy 0, policy_version 113953 (0.0041) [2024-06-18 10:26:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1867120640. 
Throughput: 0: 42700.6. Samples: 1867238220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:26:21,994][12645] Avg episode reward: [(0, '0.754')] [2024-06-18 10:26:22,823][12883] Updated weights for policy 0, policy_version 113963 (0.0051) [2024-06-18 10:26:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1867317248. Throughput: 0: 42731.0. Samples: 1867498740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:26:26,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 10:26:27,160][12883] Updated weights for policy 0, policy_version 113973 (0.0040) [2024-06-18 10:26:30,462][12883] Updated weights for policy 0, policy_version 113983 (0.0042) [2024-06-18 10:26:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1867563008. Throughput: 0: 42825.8. Samples: 1867623920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 10:26:31,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 10:26:34,865][12883] Updated weights for policy 0, policy_version 113993 (0.0039) [2024-06-18 10:26:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1867759616. Throughput: 0: 42638.6. Samples: 1867882780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:26:36,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 10:26:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114000_1867776000.pth... [2024-06-18 10:26:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113375_1857536000.pth [2024-06-18 10:26:38,089][12883] Updated weights for policy 0, policy_version 114003 (0.0042) [2024-06-18 10:26:41,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1867956224. Throughput: 0: 42766.7. Samples: 1868139360. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:26:41,997][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 10:26:42,587][12883] Updated weights for policy 0, policy_version 114013 (0.0027) [2024-06-18 10:26:46,106][12883] Updated weights for policy 0, policy_version 114023 (0.0035) [2024-06-18 10:26:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1868201984. Throughput: 0: 42812.7. Samples: 1868263800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:26:46,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 10:26:50,141][12883] Updated weights for policy 0, policy_version 114033 (0.0032) [2024-06-18 10:26:51,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1868398592. Throughput: 0: 42660.9. Samples: 1868523200. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:26:51,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 10:26:53,732][12883] Updated weights for policy 0, policy_version 114043 (0.0036) [2024-06-18 10:26:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43146.2, 300 sec: 42654.3). Total num frames: 1868611584. Throughput: 0: 42577.9. Samples: 1868776900. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:26:56,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 10:26:57,572][12883] Updated weights for policy 0, policy_version 114053 (0.0034) [2024-06-18 10:27:01,233][12883] Updated weights for policy 0, policy_version 114063 (0.0024) [2024-06-18 10:27:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1868857344. Throughput: 0: 42797.3. Samples: 1868909300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:27:01,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 10:27:04,959][12862] Signal inference workers to stop experience collection... 
(27300 times) [2024-06-18 10:27:04,959][12862] Signal inference workers to resume experience collection... (27300 times) [2024-06-18 10:27:05,002][12883] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-06-18 10:27:05,002][12883] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-06-18 10:27:05,090][12883] Updated weights for policy 0, policy_version 114073 (0.0034) [2024-06-18 10:27:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1869037568. Throughput: 0: 42846.5. Samples: 1869166320. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:27:06,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 10:27:08,967][12883] Updated weights for policy 0, policy_version 114083 (0.0030) [2024-06-18 10:27:11,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 1869250560. Throughput: 0: 42669.1. Samples: 1869418940. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:27:11,996][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 10:27:12,899][12883] Updated weights for policy 0, policy_version 114093 (0.0041) [2024-06-18 10:27:16,569][12883] Updated weights for policy 0, policy_version 114103 (0.0035) [2024-06-18 10:27:16,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1869479936. Throughput: 0: 42798.8. Samples: 1869549860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:27:16,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 10:27:20,539][12883] Updated weights for policy 0, policy_version 114113 (0.0042) [2024-06-18 10:27:21,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1869692928. Throughput: 0: 42705.0. Samples: 1869804500. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:27:21,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 10:27:24,146][12883] Updated weights for policy 0, policy_version 114123 (0.0030) [2024-06-18 10:27:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1869905920. Throughput: 0: 42759.9. Samples: 1870063460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:27:26,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 10:27:28,382][12883] Updated weights for policy 0, policy_version 114133 (0.0044) [2024-06-18 10:27:31,795][12883] Updated weights for policy 0, policy_version 114143 (0.0030) [2024-06-18 10:27:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1870118912. Throughput: 0: 42850.8. Samples: 1870192080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 10:27:31,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 10:27:35,773][12883] Updated weights for policy 0, policy_version 114153 (0.0034) [2024-06-18 10:27:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1870331904. Throughput: 0: 42805.4. Samples: 1870449440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:27:36,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 10:27:39,585][12883] Updated weights for policy 0, policy_version 114163 (0.0035) [2024-06-18 10:27:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 1870544896. Throughput: 0: 42926.7. Samples: 1870708600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:27:41,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 10:27:43,116][12883] Updated weights for policy 0, policy_version 114173 (0.0028) [2024-06-18 10:27:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1870757888. Throughput: 0: 42819.5. Samples: 1870836180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:27:46,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 10:27:47,119][12883] Updated weights for policy 0, policy_version 114183 (0.0033) [2024-06-18 10:27:50,611][12883] Updated weights for policy 0, policy_version 114193 (0.0040) [2024-06-18 10:27:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1870987264. Throughput: 0: 42769.9. Samples: 1871090960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:27:51,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 10:27:54,753][12883] Updated weights for policy 0, policy_version 114203 (0.0033) [2024-06-18 10:27:56,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 1871183872. Throughput: 0: 42862.7. Samples: 1871347760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:27:56,996][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 10:27:58,170][12883] Updated weights for policy 0, policy_version 114213 (0.0037) [2024-06-18 10:28:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1871396864. Throughput: 0: 42968.0. Samples: 1871483420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:28:01,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 10:28:02,375][12883] Updated weights for policy 0, policy_version 114223 (0.0032) [2024-06-18 10:28:05,765][12883] Updated weights for policy 0, policy_version 114233 (0.0029) [2024-06-18 10:28:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1871626240. Throughput: 0: 43016.4. Samples: 1871740240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:28:06,994][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 10:28:10,050][12883] Updated weights for policy 0, policy_version 114243 (0.0030) [2024-06-18 10:28:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43146.1, 300 sec: 42820.6). Total num frames: 1871839232. Throughput: 0: 42841.7. Samples: 1871991340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:28:11,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 10:28:13,377][12883] Updated weights for policy 0, policy_version 114253 (0.0030) [2024-06-18 10:28:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1872052224. Throughput: 0: 42913.3. Samples: 1872123180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:28:16,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 10:28:17,505][12883] Updated weights for policy 0, policy_version 114263 (0.0036) [2024-06-18 10:28:21,213][12883] Updated weights for policy 0, policy_version 114273 (0.0042) [2024-06-18 10:28:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1872281600. Throughput: 0: 42942.6. Samples: 1872381860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:28:21,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 10:28:25,222][12883] Updated weights for policy 0, policy_version 114283 (0.0042) [2024-06-18 10:28:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 1872494592. Throughput: 0: 42846.6. Samples: 1872636700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:28:26,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 10:28:28,772][12883] Updated weights for policy 0, policy_version 114293 (0.0035) [2024-06-18 10:28:31,143][12862] Signal inference workers to stop experience collection... 
(27350 times) [2024-06-18 10:28:31,143][12862] Signal inference workers to resume experience collection... (27350 times) [2024-06-18 10:28:31,191][12883] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-06-18 10:28:31,191][12883] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-06-18 10:28:31,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 1872691200. Throughput: 0: 42912.6. Samples: 1872767340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 10:28:31,996][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 10:28:32,763][12883] Updated weights for policy 0, policy_version 114303 (0.0027) [2024-06-18 10:28:36,310][12883] Updated weights for policy 0, policy_version 114313 (0.0026) [2024-06-18 10:28:37,000][12645] Fps is (10 sec: 40934.6, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 1872904192. Throughput: 0: 42916.7. Samples: 1873022480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:28:37,000][12645] Avg episode reward: [(0, '0.775')] [2024-06-18 10:28:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114313_1872904192.pth... [2024-06-18 10:28:37,108][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000113687_1862647808.pth [2024-06-18 10:28:40,427][12883] Updated weights for policy 0, policy_version 114323 (0.0040) [2024-06-18 10:28:41,994][12645] Fps is (10 sec: 44246.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1873133568. Throughput: 0: 43020.3. Samples: 1873283580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:28:41,994][12645] Avg episode reward: [(0, '0.753')] [2024-06-18 10:28:44,034][12883] Updated weights for policy 0, policy_version 114333 (0.0039) [2024-06-18 10:28:46,994][12645] Fps is (10 sec: 40985.6, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 1873313792. Throughput: 0: 42963.0. Samples: 1873416760. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:28:46,994][12645] Avg episode reward: [(0, '0.786')] [2024-06-18 10:28:48,207][12883] Updated weights for policy 0, policy_version 114343 (0.0033) [2024-06-18 10:28:51,554][12883] Updated weights for policy 0, policy_version 114353 (0.0027) [2024-06-18 10:28:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1873559552. Throughput: 0: 42839.7. Samples: 1873668020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:28:51,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 10:28:55,829][12883] Updated weights for policy 0, policy_version 114363 (0.0040) [2024-06-18 10:28:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 1873756160. Throughput: 0: 43076.4. Samples: 1873929780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:28:56,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 10:28:59,057][12883] Updated weights for policy 0, policy_version 114373 (0.0037) [2024-06-18 10:29:01,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 1873969152. Throughput: 0: 42881.9. Samples: 1874052960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:29:01,996][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 10:29:03,506][12883] Updated weights for policy 0, policy_version 114383 (0.0035) [2024-06-18 10:29:06,555][12883] Updated weights for policy 0, policy_version 114393 (0.0046) [2024-06-18 10:29:06,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 1874231296. Throughput: 0: 42981.0. Samples: 1874316000. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:29:06,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 10:29:11,167][12883] Updated weights for policy 0, policy_version 114403 (0.0038) [2024-06-18 10:29:11,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1874395136. Throughput: 0: 43106.8. Samples: 1874576500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:29:11,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 10:29:14,334][12883] Updated weights for policy 0, policy_version 114413 (0.0037) [2024-06-18 10:29:16,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1874608128. Throughput: 0: 42857.3. Samples: 1874695820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:29:16,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 10:29:18,754][12883] Updated weights for policy 0, policy_version 114423 (0.0040) [2024-06-18 10:29:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 1874853888. Throughput: 0: 43050.0. Samples: 1874959460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:29:21,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 10:29:22,051][12883] Updated weights for policy 0, policy_version 114433 (0.0032) [2024-06-18 10:29:26,486][12883] Updated weights for policy 0, policy_version 114443 (0.0034) [2024-06-18 10:29:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1875050496. Throughput: 0: 43067.6. Samples: 1875221620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:29:26,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 10:29:29,443][12883] Updated weights for policy 0, policy_version 114453 (0.0037) [2024-06-18 10:29:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1875247104. Throughput: 0: 42813.4. Samples: 1875343360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:29:31,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 10:29:34,146][12883] Updated weights for policy 0, policy_version 114463 (0.0027) [2024-06-18 10:29:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43422.2, 300 sec: 42931.6). Total num frames: 1875509248. Throughput: 0: 43017.8. Samples: 1875603820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 10:29:36,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 10:29:37,029][12883] Updated weights for policy 0, policy_version 114473 (0.0036) [2024-06-18 10:29:37,497][12862] Signal inference workers to stop experience collection... (27400 times) [2024-06-18 10:29:37,497][12862] Signal inference workers to resume experience collection... (27400 times) [2024-06-18 10:29:37,540][12883] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-06-18 10:29:37,540][12883] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-06-18 10:29:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1875673088. Throughput: 0: 43118.7. Samples: 1875870120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:29:41,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 10:29:42,048][12883] Updated weights for policy 0, policy_version 114483 (0.0033) [2024-06-18 10:29:44,452][12883] Updated weights for policy 0, policy_version 114493 (0.0040) [2024-06-18 10:29:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1875902464. Throughput: 0: 42911.5. Samples: 1875983880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:29:46,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 10:29:49,470][12883] Updated weights for policy 0, policy_version 114503 (0.0031) [2024-06-18 10:29:51,994][12645] Fps is (10 sec: 49152.4, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 1876164608. 
Throughput: 0: 42996.5. Samples: 1876250840. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:29:51,994][12645] Avg episode reward: [(0, '0.764')] [2024-06-18 10:29:52,368][12883] Updated weights for policy 0, policy_version 114513 (0.0028) [2024-06-18 10:29:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 1876328448. Throughput: 0: 42977.2. Samples: 1876510480. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:29:56,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 10:29:57,014][12883] Updated weights for policy 0, policy_version 114523 (0.0032) [2024-06-18 10:29:59,962][12883] Updated weights for policy 0, policy_version 114533 (0.0033) [2024-06-18 10:30:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43146.1, 300 sec: 42765.3). Total num frames: 1876557824. Throughput: 0: 42964.4. Samples: 1876629220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:30:01,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 10:30:04,604][12883] Updated weights for policy 0, policy_version 114543 (0.0027) [2024-06-18 10:30:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42987.2). Total num frames: 1876787200. Throughput: 0: 42993.4. Samples: 1876894160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:30:06,994][12645] Avg episode reward: [(0, '0.849')] [2024-06-18 10:30:07,092][12862] Saving new best policy, reward=0.849! [2024-06-18 10:30:07,532][12883] Updated weights for policy 0, policy_version 114553 (0.0036) [2024-06-18 10:30:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 1876983808. Throughput: 0: 42942.6. Samples: 1877154040. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:30:11,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 10:30:12,327][12883] Updated weights for policy 0, policy_version 114563 (0.0038) [2024-06-18 10:30:15,276][12883] Updated weights for policy 0, policy_version 114573 (0.0047) [2024-06-18 10:30:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 1877213184. Throughput: 0: 42930.2. Samples: 1877275220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:30:16,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 10:30:20,177][12883] Updated weights for policy 0, policy_version 114583 (0.0039) [2024-06-18 10:30:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 1877409792. Throughput: 0: 42882.2. Samples: 1877533520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:30:21,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 10:30:22,966][12883] Updated weights for policy 0, policy_version 114593 (0.0028) [2024-06-18 10:30:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1877606400. Throughput: 0: 42636.4. Samples: 1877788760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:30:26,994][12645] Avg episode reward: [(0, '0.728')] [2024-06-18 10:30:27,949][12883] Updated weights for policy 0, policy_version 114603 (0.0034) [2024-06-18 10:30:30,589][12883] Updated weights for policy 0, policy_version 114613 (0.0041) [2024-06-18 10:30:31,996][12645] Fps is (10 sec: 45864.6, 60 sec: 43689.0, 300 sec: 42875.8). Total num frames: 1877868544. Throughput: 0: 42874.7. Samples: 1877913340. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:30:31,997][12645] Avg episode reward: [(0, '0.654')] [2024-06-18 10:30:35,543][12883] Updated weights for policy 0, policy_version 114623 (0.0036) [2024-06-18 10:30:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42876.1). Total num frames: 1878048768. Throughput: 0: 42671.0. Samples: 1878171040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-18 10:30:36,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 10:30:37,015][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114627_1878048768.pth... [2024-06-18 10:30:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114000_1867776000.pth [2024-06-18 10:30:38,539][12883] Updated weights for policy 0, policy_version 114633 (0.0042) [2024-06-18 10:30:41,994][12645] Fps is (10 sec: 37691.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1878245376. Throughput: 0: 42661.3. Samples: 1878430240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:30:41,994][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 10:30:43,317][12883] Updated weights for policy 0, policy_version 114643 (0.0039) [2024-06-18 10:30:46,128][12883] Updated weights for policy 0, policy_version 114653 (0.0035) [2024-06-18 10:30:47,000][12645] Fps is (10 sec: 47484.0, 60 sec: 43686.0, 300 sec: 42930.7). Total num frames: 1878523904. Throughput: 0: 42756.6. Samples: 1878553540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:30:47,001][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 10:30:51,010][12883] Updated weights for policy 0, policy_version 114663 (0.0033) [2024-06-18 10:30:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42932.0). Total num frames: 1878687744. Throughput: 0: 42680.8. Samples: 1878814800. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:30:51,994][12645] Avg episode reward: [(0, '0.618')] [2024-06-18 10:30:53,987][12883] Updated weights for policy 0, policy_version 114673 (0.0042) [2024-06-18 10:30:56,994][12645] Fps is (10 sec: 37706.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1878900736. Throughput: 0: 42515.6. Samples: 1879067240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:30:56,994][12645] Avg episode reward: [(0, '0.735')] [2024-06-18 10:30:58,658][12883] Updated weights for policy 0, policy_version 114683 (0.0041) [2024-06-18 10:31:00,324][12862] Signal inference workers to stop experience collection... (27450 times) [2024-06-18 10:31:00,359][12883] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-06-18 10:31:00,379][12862] Signal inference workers to resume experience collection... (27450 times) [2024-06-18 10:31:00,380][12883] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-06-18 10:31:01,595][12883] Updated weights for policy 0, policy_version 114693 (0.0038) [2024-06-18 10:31:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 1879130112. Throughput: 0: 42575.5. Samples: 1879191120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:31:01,995][12645] Avg episode reward: [(0, '0.762')] [2024-06-18 10:31:06,652][12883] Updated weights for policy 0, policy_version 114703 (0.0035) [2024-06-18 10:31:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 1879310336. Throughput: 0: 42634.6. Samples: 1879452080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:31:06,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 10:31:09,121][12883] Updated weights for policy 0, policy_version 114713 (0.0034) [2024-06-18 10:31:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1879539712. 
Throughput: 0: 42548.1. Samples: 1879703420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:31:11,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 10:31:14,501][12883] Updated weights for policy 0, policy_version 114723 (0.0045) [2024-06-18 10:31:16,793][12883] Updated weights for policy 0, policy_version 114733 (0.0029) [2024-06-18 10:31:16,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 1879785472. Throughput: 0: 42691.4. Samples: 1879834360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:31:16,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 10:31:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 1879932928. Throughput: 0: 42624.1. Samples: 1880089120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:31:21,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 10:31:22,031][12883] Updated weights for policy 0, policy_version 114743 (0.0031) [2024-06-18 10:31:24,674][12883] Updated weights for policy 0, policy_version 114753 (0.0033) [2024-06-18 10:31:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1880195072. Throughput: 0: 42465.8. Samples: 1880341200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:31:26,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 10:31:29,570][12883] Updated weights for policy 0, policy_version 114763 (0.0030) [2024-06-18 10:31:31,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42326.9, 300 sec: 42876.1). Total num frames: 1880408064. Throughput: 0: 42711.3. Samples: 1880475280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:31:31,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 10:31:32,310][12883] Updated weights for policy 0, policy_version 114773 (0.0037) [2024-06-18 10:31:36,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42820.9). Total num frames: 1880588288. 
Throughput: 0: 42521.8. Samples: 1880728280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 10:31:36,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 10:31:37,129][12883] Updated weights for policy 0, policy_version 114783 (0.0044) [2024-06-18 10:31:39,912][12883] Updated weights for policy 0, policy_version 114793 (0.0040) [2024-06-18 10:31:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 1880850432. Throughput: 0: 42528.1. Samples: 1880981000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:31:41,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 10:31:44,635][12883] Updated weights for policy 0, policy_version 114803 (0.0032) [2024-06-18 10:31:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42056.8, 300 sec: 42876.1). Total num frames: 1881047040. Throughput: 0: 42733.5. Samples: 1881114120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:31:46,994][12645] Avg episode reward: [(0, '0.707')] [2024-06-18 10:31:47,887][12883] Updated weights for policy 0, policy_version 114813 (0.0029) [2024-06-18 10:31:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1881243648. Throughput: 0: 42562.7. Samples: 1881367400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:31:51,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 10:31:52,271][12883] Updated weights for policy 0, policy_version 114823 (0.0028) [2024-06-18 10:31:55,509][12883] Updated weights for policy 0, policy_version 114833 (0.0040) [2024-06-18 10:31:56,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1881456640. Throughput: 0: 42655.0. Samples: 1881622900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:31:56,994][12645] Avg episode reward: [(0, '0.319')] [2024-06-18 10:32:00,143][12883] Updated weights for policy 0, policy_version 114843 (0.0046) [2024-06-18 10:32:01,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1881686016. Throughput: 0: 42729.3. Samples: 1881757180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:32:02,003][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 10:32:03,091][12883] Updated weights for policy 0, policy_version 114853 (0.0027) [2024-06-18 10:32:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42765.4). Total num frames: 1881866240. Throughput: 0: 42687.2. Samples: 1882010040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:32:06,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 10:32:07,637][12883] Updated weights for policy 0, policy_version 114863 (0.0031) [2024-06-18 10:32:07,654][12862] Signal inference workers to stop experience collection... (27500 times) [2024-06-18 10:32:07,655][12862] Signal inference workers to resume experience collection... (27500 times) [2024-06-18 10:32:07,673][12883] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-06-18 10:32:07,673][12883] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-06-18 10:32:10,662][12883] Updated weights for policy 0, policy_version 114873 (0.0028) [2024-06-18 10:32:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1882112000. Throughput: 0: 42753.3. Samples: 1882265100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:32:11,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 10:32:15,174][12883] Updated weights for policy 0, policy_version 114883 (0.0033) [2024-06-18 10:32:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 1882324992. 
Throughput: 0: 42773.8. Samples: 1882400100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:32:17,002][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 10:32:18,393][12883] Updated weights for policy 0, policy_version 114893 (0.0039) [2024-06-18 10:32:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1882521600. Throughput: 0: 42700.7. Samples: 1882649820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:32:21,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 10:32:22,825][12883] Updated weights for policy 0, policy_version 114903 (0.0038) [2024-06-18 10:32:26,157][12883] Updated weights for policy 0, policy_version 114913 (0.0041) [2024-06-18 10:32:26,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 1882750976. Throughput: 0: 42702.0. Samples: 1882902600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:32:26,994][12645] Avg episode reward: [(0, '0.671')] [2024-06-18 10:32:30,622][12883] Updated weights for policy 0, policy_version 114923 (0.0026) [2024-06-18 10:32:31,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1882963968. Throughput: 0: 42725.7. Samples: 1883036780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:32:31,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 10:32:33,980][12883] Updated weights for policy 0, policy_version 114933 (0.0027) [2024-06-18 10:32:36,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1883160576. Throughput: 0: 42752.9. Samples: 1883291280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 10:32:36,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 10:32:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114939_1883160576.pth... 
[2024-06-18 10:32:37,052][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114313_1872904192.pth [2024-06-18 10:32:38,312][12883] Updated weights for policy 0, policy_version 114943 (0.0032) [2024-06-18 10:32:41,504][12883] Updated weights for policy 0, policy_version 114953 (0.0036) [2024-06-18 10:32:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 1883406336. Throughput: 0: 42624.1. Samples: 1883540980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:32:41,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 10:32:46,038][12883] Updated weights for policy 0, policy_version 114963 (0.0031) [2024-06-18 10:32:46,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1883619328. Throughput: 0: 42725.0. Samples: 1883679800. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:32:46,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 10:32:49,347][12883] Updated weights for policy 0, policy_version 114973 (0.0030) [2024-06-18 10:32:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 1883783168. Throughput: 0: 42589.6. Samples: 1883926580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:32:51,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 10:32:53,553][12883] Updated weights for policy 0, policy_version 114983 (0.0036) [2024-06-18 10:32:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1884028928. Throughput: 0: 42523.5. Samples: 1884178660. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:32:56,997][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 10:32:57,079][12883] Updated weights for policy 0, policy_version 114993 (0.0032) [2024-06-18 10:33:01,348][12883] Updated weights for policy 0, policy_version 115003 (0.0034) [2024-06-18 10:33:01,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1884241920. Throughput: 0: 42491.1. Samples: 1884312200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:33:01,994][12645] Avg episode reward: [(0, '0.706')] [2024-06-18 10:33:04,722][12883] Updated weights for policy 0, policy_version 115013 (0.0033) [2024-06-18 10:33:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1884438528. Throughput: 0: 42432.5. Samples: 1884559280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:33:06,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 10:33:08,966][12883] Updated weights for policy 0, policy_version 115023 (0.0026) [2024-06-18 10:33:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1884667904. Throughput: 0: 42685.9. Samples: 1884823460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:33:11,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 10:33:12,485][12883] Updated weights for policy 0, policy_version 115033 (0.0036) [2024-06-18 10:33:16,605][12883] Updated weights for policy 0, policy_version 115043 (0.0029) [2024-06-18 10:33:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1884897280. Throughput: 0: 42527.9. Samples: 1884950540. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:33:16,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 10:33:20,179][12883] Updated weights for policy 0, policy_version 115053 (0.0027) [2024-06-18 10:33:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1885077504. Throughput: 0: 42419.0. Samples: 1885200140. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:33:21,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 10:33:23,457][12862] Signal inference workers to stop experience collection... (27550 times) [2024-06-18 10:33:23,457][12862] Signal inference workers to resume experience collection... (27550 times) [2024-06-18 10:33:23,469][12883] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-06-18 10:33:23,502][12883] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-06-18 10:33:24,182][12883] Updated weights for policy 0, policy_version 115063 (0.0022) [2024-06-18 10:33:26,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.6, 300 sec: 42765.4). Total num frames: 1885306880. Throughput: 0: 42642.3. Samples: 1885459880. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:33:26,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 10:33:27,826][12883] Updated weights for policy 0, policy_version 115073 (0.0027) [2024-06-18 10:33:31,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 1885503488. Throughput: 0: 42498.3. Samples: 1885592220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:33:31,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 10:33:32,073][12883] Updated weights for policy 0, policy_version 115083 (0.0037) [2024-06-18 10:33:35,555][12883] Updated weights for policy 0, policy_version 115093 (0.0039) [2024-06-18 10:33:36,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1885732864. 
Throughput: 0: 42593.4. Samples: 1885843280. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 10:33:36,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 10:33:39,832][12883] Updated weights for policy 0, policy_version 115103 (0.0036) [2024-06-18 10:33:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1885962240. Throughput: 0: 42680.9. Samples: 1886099300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:33:41,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 10:33:43,185][12883] Updated weights for policy 0, policy_version 115113 (0.0031) [2024-06-18 10:33:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1886142464. Throughput: 0: 42565.3. Samples: 1886227640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:33:46,994][12645] Avg episode reward: [(0, '0.070')] [2024-06-18 10:33:47,290][12883] Updated weights for policy 0, policy_version 115123 (0.0032) [2024-06-18 10:33:50,847][12883] Updated weights for policy 0, policy_version 115133 (0.0029) [2024-06-18 10:33:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1886388224. Throughput: 0: 42864.6. Samples: 1886488180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:33:51,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 10:33:54,993][12883] Updated weights for policy 0, policy_version 115143 (0.0021) [2024-06-18 10:33:56,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42820.9). Total num frames: 1886601216. Throughput: 0: 42587.0. Samples: 1886739880. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:33:56,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 10:33:58,423][12883] Updated weights for policy 0, policy_version 115153 (0.0028) [2024-06-18 10:34:01,996][12645] Fps is (10 sec: 40950.3, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 1886797824. 
Throughput: 0: 42620.2. Samples: 1886868540. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:34:01,997][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 10:34:02,619][12883] Updated weights for policy 0, policy_version 115163 (0.0037) [2024-06-18 10:34:06,119][12883] Updated weights for policy 0, policy_version 115173 (0.0025) [2024-06-18 10:34:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 1887027200. Throughput: 0: 42786.2. Samples: 1887125520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:34:06,995][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 10:34:10,347][12883] Updated weights for policy 0, policy_version 115183 (0.0028) [2024-06-18 10:34:11,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1887223808. Throughput: 0: 42868.8. Samples: 1887388980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:34:11,994][12645] Avg episode reward: [(0, '0.146')] [2024-06-18 10:34:13,911][12883] Updated weights for policy 0, policy_version 115193 (0.0028) [2024-06-18 10:34:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1887436800. Throughput: 0: 42587.9. Samples: 1887508680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:34:16,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 10:34:18,033][12883] Updated weights for policy 0, policy_version 115203 (0.0039) [2024-06-18 10:34:21,542][12883] Updated weights for policy 0, policy_version 115213 (0.0030) [2024-06-18 10:34:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1887666176. Throughput: 0: 42804.4. Samples: 1887769480. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:34:21,994][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 10:34:25,594][12883] Updated weights for policy 0, policy_version 115223 (0.0023) [2024-06-18 10:34:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1887862784. Throughput: 0: 42980.4. Samples: 1888033420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:34:26,994][12645] Avg episode reward: [(0, '0.693')] [2024-06-18 10:34:28,993][12883] Updated weights for policy 0, policy_version 115233 (0.0032) [2024-06-18 10:34:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1888092160. Throughput: 0: 42882.3. Samples: 1888157340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:34:31,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 10:34:33,135][12883] Updated weights for policy 0, policy_version 115243 (0.0050) [2024-06-18 10:34:36,654][12883] Updated weights for policy 0, policy_version 115253 (0.0030) [2024-06-18 10:34:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1888305152. Throughput: 0: 42790.5. Samples: 1888413760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:34:36,995][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 10:34:37,096][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115254_1888321536.pth... [2024-06-18 10:34:37,145][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114627_1878048768.pth [2024-06-18 10:34:40,743][12883] Updated weights for policy 0, policy_version 115263 (0.0032) [2024-06-18 10:34:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1888501760. Throughput: 0: 42967.3. Samples: 1888673400. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 10:34:41,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 10:34:44,185][12883] Updated weights for policy 0, policy_version 115273 (0.0038) [2024-06-18 10:34:46,526][12862] Signal inference workers to stop experience collection... (27600 times) [2024-06-18 10:34:46,526][12862] Signal inference workers to resume experience collection... (27600 times) [2024-06-18 10:34:46,565][12883] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-06-18 10:34:46,565][12883] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-06-18 10:34:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1888731136. Throughput: 0: 42876.9. Samples: 1888797900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:34:46,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 10:34:48,246][12883] Updated weights for policy 0, policy_version 115283 (0.0035) [2024-06-18 10:34:51,824][12883] Updated weights for policy 0, policy_version 115293 (0.0025) [2024-06-18 10:34:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1888960512. Throughput: 0: 43010.0. Samples: 1889060960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:34:51,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 10:34:56,300][12883] Updated weights for policy 0, policy_version 115303 (0.0044) [2024-06-18 10:34:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1889157120. Throughput: 0: 42850.6. Samples: 1889317260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:34:56,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 10:34:59,834][12883] Updated weights for policy 0, policy_version 115313 (0.0038) [2024-06-18 10:35:01,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42872.9, 300 sec: 42653.9). Total num frames: 1889370112. 
Throughput: 0: 42895.4. Samples: 1889438980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:35:01,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 10:35:04,090][12883] Updated weights for policy 0, policy_version 115323 (0.0028) [2024-06-18 10:35:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1889599488. Throughput: 0: 42949.9. Samples: 1889702220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:35:06,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 10:35:07,433][12883] Updated weights for policy 0, policy_version 115333 (0.0034) [2024-06-18 10:35:11,543][12883] Updated weights for policy 0, policy_version 115343 (0.0045) [2024-06-18 10:35:11,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1889779712. Throughput: 0: 42772.5. Samples: 1889958180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:35:11,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 10:35:15,073][12883] Updated weights for policy 0, policy_version 115353 (0.0037) [2024-06-18 10:35:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1890025472. Throughput: 0: 42793.6. Samples: 1890083060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:35:16,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 10:35:18,968][12883] Updated weights for policy 0, policy_version 115363 (0.0034) [2024-06-18 10:35:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1890222080. Throughput: 0: 42965.0. Samples: 1890347180. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:35:21,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 10:35:22,646][12883] Updated weights for policy 0, policy_version 115373 (0.0039) [2024-06-18 10:35:26,503][12883] Updated weights for policy 0, policy_version 115383 (0.0029) [2024-06-18 10:35:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 1890451456. Throughput: 0: 42736.9. Samples: 1890596560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:35:26,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 10:35:30,362][12883] Updated weights for policy 0, policy_version 115393 (0.0028) [2024-06-18 10:35:31,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1890680832. Throughput: 0: 42973.8. Samples: 1890731720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:35:31,994][12645] Avg episode reward: [(0, '0.564')] [2024-06-18 10:35:34,102][12883] Updated weights for policy 0, policy_version 115403 (0.0038) [2024-06-18 10:35:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1890861056. Throughput: 0: 42945.7. Samples: 1890993520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:35:36,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 10:35:38,153][12883] Updated weights for policy 0, policy_version 115413 (0.0033) [2024-06-18 10:35:41,731][12883] Updated weights for policy 0, policy_version 115423 (0.0032) [2024-06-18 10:35:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43417.6, 300 sec: 42654.9). Total num frames: 1891106816. Throughput: 0: 42930.8. Samples: 1891249140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:35:41,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 10:35:45,801][12883] Updated weights for policy 0, policy_version 115433 (0.0038) [2024-06-18 10:35:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1891303424. Throughput: 0: 43066.8. Samples: 1891376980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:35:46,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 10:35:49,345][12883] Updated weights for policy 0, policy_version 115443 (0.0034) [2024-06-18 10:35:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1891516416. Throughput: 0: 43001.4. Samples: 1891637280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:35:51,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 10:35:53,227][12883] Updated weights for policy 0, policy_version 115453 (0.0037) [2024-06-18 10:35:56,807][12883] Updated weights for policy 0, policy_version 115463 (0.0023) [2024-06-18 10:35:56,996][12645] Fps is (10 sec: 44227.2, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 1891745792. Throughput: 0: 42946.7. Samples: 1891890880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:35:56,996][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 10:36:01,089][12883] Updated weights for policy 0, policy_version 115473 (0.0028) [2024-06-18 10:36:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1891942400. Throughput: 0: 43068.1. Samples: 1892021120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:36:01,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 10:36:04,336][12883] Updated weights for policy 0, policy_version 115483 (0.0028) [2024-06-18 10:36:06,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1892139008. Throughput: 0: 42788.1. Samples: 1892272640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:36:06,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 10:36:07,463][12862] Signal inference workers to stop experience collection... (27650 times) [2024-06-18 10:36:07,463][12862] Signal inference workers to resume experience collection... (27650 times) [2024-06-18 10:36:07,487][12883] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-06-18 10:36:07,488][12883] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-06-18 10:36:08,753][12883] Updated weights for policy 0, policy_version 115493 (0.0037) [2024-06-18 10:36:11,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 1892401152. Throughput: 0: 42803.2. Samples: 1892522700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:36:11,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 10:36:11,998][12883] Updated weights for policy 0, policy_version 115503 (0.0037) [2024-06-18 10:36:16,450][12883] Updated weights for policy 0, policy_version 115513 (0.0039) [2024-06-18 10:36:16,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1892597760. Throughput: 0: 42944.8. Samples: 1892664240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:36:16,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 10:36:19,779][12883] Updated weights for policy 0, policy_version 115523 (0.0027) [2024-06-18 10:36:21,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1892777984. Throughput: 0: 42524.4. Samples: 1892907120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:36:21,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 10:36:24,126][12883] Updated weights for policy 0, policy_version 115533 (0.0027) [2024-06-18 10:36:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1893023744. Throughput: 0: 42623.5. 
Samples: 1893167200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:36:26,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 10:36:27,331][12883] Updated weights for policy 0, policy_version 115543 (0.0038) [2024-06-18 10:36:31,660][12883] Updated weights for policy 0, policy_version 115553 (0.0032) [2024-06-18 10:36:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1893236736. Throughput: 0: 42725.0. Samples: 1893299600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:36:31,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 10:36:35,664][12883] Updated weights for policy 0, policy_version 115563 (0.0031) [2024-06-18 10:36:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1893433344. Throughput: 0: 42593.6. Samples: 1893554000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:36:36,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 10:36:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115566_1893433344.pth... [2024-06-18 10:36:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000114939_1883160576.pth [2024-06-18 10:36:39,237][12883] Updated weights for policy 0, policy_version 115573 (0.0023) [2024-06-18 10:36:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1893662720. Throughput: 0: 42522.6. Samples: 1893804300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 10:36:41,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 10:36:43,146][12883] Updated weights for policy 0, policy_version 115583 (0.0033) [2024-06-18 10:36:46,968][12883] Updated weights for policy 0, policy_version 115593 (0.0044) [2024-06-18 10:36:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1893875712. Throughput: 0: 42542.2. Samples: 1893935520. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:36:46,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 10:36:50,667][12883] Updated weights for policy 0, policy_version 115603 (0.0032) [2024-06-18 10:36:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1894072320. Throughput: 0: 42534.1. Samples: 1894186680. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:36:51,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 10:36:54,634][12883] Updated weights for policy 0, policy_version 115613 (0.0035) [2024-06-18 10:36:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 1894301696. Throughput: 0: 42745.7. Samples: 1894446260. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:36:56,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 10:36:58,141][12883] Updated weights for policy 0, policy_version 115623 (0.0041) [2024-06-18 10:37:01,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42596.8, 300 sec: 42820.2). Total num frames: 1894498304. Throughput: 0: 42504.1. Samples: 1894577020. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:01,996][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 10:37:02,415][12883] Updated weights for policy 0, policy_version 115633 (0.0045) [2024-06-18 10:37:05,881][12883] Updated weights for policy 0, policy_version 115643 (0.0031) [2024-06-18 10:37:06,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 1894711296. Throughput: 0: 42767.4. Samples: 1894831660. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:06,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 10:37:10,283][12883] Updated weights for policy 0, policy_version 115653 (0.0034) [2024-06-18 10:37:11,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1894940672. Throughput: 0: 42736.4. Samples: 1895090340. 
Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:11,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 10:37:13,745][12883] Updated weights for policy 0, policy_version 115663 (0.0033) [2024-06-18 10:37:16,996][12645] Fps is (10 sec: 42589.7, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 1895137280. Throughput: 0: 42700.0. Samples: 1895221200. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:16,996][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 10:37:17,822][12883] Updated weights for policy 0, policy_version 115673 (0.0046) [2024-06-18 10:37:21,968][12883] Updated weights for policy 0, policy_version 115683 (0.0030) [2024-06-18 10:37:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1895350272. Throughput: 0: 42494.8. Samples: 1895466260. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:21,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 10:37:25,142][12862] Signal inference workers to stop experience collection... (27700 times) [2024-06-18 10:37:25,142][12862] Signal inference workers to resume experience collection... (27700 times) [2024-06-18 10:37:25,168][12883] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-06-18 10:37:25,168][12883] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-06-18 10:37:25,439][12883] Updated weights for policy 0, policy_version 115693 (0.0043) [2024-06-18 10:37:26,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1895563264. Throughput: 0: 42653.7. Samples: 1895723720. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:26,994][12645] Avg episode reward: [(0, '0.283')] [2024-06-18 10:37:29,844][12883] Updated weights for policy 0, policy_version 115703 (0.0037) [2024-06-18 10:37:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1895759872. Throughput: 0: 42540.9. 
Samples: 1895849860. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:31,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 10:37:33,625][12883] Updated weights for policy 0, policy_version 115713 (0.0034) [2024-06-18 10:37:36,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 1895989248. Throughput: 0: 42350.4. Samples: 1896092540. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:36,996][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 10:37:37,487][12883] Updated weights for policy 0, policy_version 115723 (0.0041) [2024-06-18 10:37:41,382][12883] Updated weights for policy 0, policy_version 115733 (0.0033) [2024-06-18 10:37:41,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 1896202240. Throughput: 0: 42411.2. Samples: 1896354860. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:41,996][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 10:37:44,951][12883] Updated weights for policy 0, policy_version 115743 (0.0031) [2024-06-18 10:37:46,994][12645] Fps is (10 sec: 39330.4, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1896382464. Throughput: 0: 42283.9. Samples: 1896479700. Policy #0 lag: (min: 1.0, avg: 8.1, max: 21.0) [2024-06-18 10:37:46,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 10:37:49,112][12883] Updated weights for policy 0, policy_version 115753 (0.0041) [2024-06-18 10:37:51,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1896628224. Throughput: 0: 42219.7. Samples: 1896731540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:37:51,994][12645] Avg episode reward: [(0, '0.663')] [2024-06-18 10:37:52,567][12883] Updated weights for policy 0, policy_version 115763 (0.0042) [2024-06-18 10:37:56,894][12883] Updated weights for policy 0, policy_version 115773 (0.0029) [2024-06-18 10:37:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1896824832. Throughput: 0: 42274.3. Samples: 1896992680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:37:56,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 10:38:00,086][12883] Updated weights for policy 0, policy_version 115783 (0.0032) [2024-06-18 10:38:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 1897037824. Throughput: 0: 42142.1. Samples: 1897117500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:01,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 10:38:04,419][12883] Updated weights for policy 0, policy_version 115793 (0.0033) [2024-06-18 10:38:06,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1897283584. Throughput: 0: 42357.2. Samples: 1897372340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:06,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 10:38:07,607][12883] Updated weights for policy 0, policy_version 115803 (0.0042) [2024-06-18 10:38:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1897463808. Throughput: 0: 42564.0. Samples: 1897639100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:11,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 10:38:12,086][12883] Updated weights for policy 0, policy_version 115813 (0.0031) [2024-06-18 10:38:15,324][12883] Updated weights for policy 0, policy_version 115823 (0.0047) [2024-06-18 10:38:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42326.8, 300 sec: 42709.5). Total num frames: 1897676800. Throughput: 0: 42327.5. Samples: 1897754600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:16,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 10:38:19,657][12883] Updated weights for policy 0, policy_version 115833 (0.0038) [2024-06-18 10:38:21,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1897938944. Throughput: 0: 42671.5. Samples: 1898012660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:21,994][12645] Avg episode reward: [(0, '0.719')] [2024-06-18 10:38:23,766][12883] Updated weights for policy 0, policy_version 115843 (0.0032) [2024-06-18 10:38:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1898102784. Throughput: 0: 42572.3. Samples: 1898270520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:26,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 10:38:27,502][12883] Updated weights for policy 0, policy_version 115853 (0.0042) [2024-06-18 10:38:31,357][12883] Updated weights for policy 0, policy_version 115863 (0.0032) [2024-06-18 10:38:31,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1898315776. Throughput: 0: 42552.5. Samples: 1898394560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:31,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 10:38:35,066][12883] Updated weights for policy 0, policy_version 115873 (0.0040) [2024-06-18 10:38:35,793][12862] Signal inference workers to stop experience collection... 
(27750 times) [2024-06-18 10:38:35,847][12883] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-06-18 10:38:35,916][12862] Signal inference workers to resume experience collection... (27750 times) [2024-06-18 10:38:35,916][12883] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-06-18 10:38:36,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43146.2, 300 sec: 42765.0). Total num frames: 1898577920. Throughput: 0: 42817.4. Samples: 1898658320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:36,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 10:38:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115880_1898577920.pth... [2024-06-18 10:38:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115254_1888321536.pth [2024-06-18 10:38:39,063][12883] Updated weights for policy 0, policy_version 115883 (0.0031) [2024-06-18 10:38:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 1898741760. Throughput: 0: 42807.0. Samples: 1898919000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:41,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 10:38:42,883][12883] Updated weights for policy 0, policy_version 115893 (0.0024) [2024-06-18 10:38:46,691][12883] Updated weights for policy 0, policy_version 115903 (0.0024) [2024-06-18 10:38:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1898954752. Throughput: 0: 42645.7. Samples: 1899036560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 10:38:46,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 10:38:50,433][12883] Updated weights for policy 0, policy_version 115913 (0.0041) [2024-06-18 10:38:51,994][12645] Fps is (10 sec: 47514.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1899216896. Throughput: 0: 42983.7. Samples: 1899306600. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:38:51,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 10:38:54,785][12883] Updated weights for policy 0, policy_version 115923 (0.0037) [2024-06-18 10:38:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1899397120. Throughput: 0: 42789.4. Samples: 1899564620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:38:56,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 10:38:58,020][12883] Updated weights for policy 0, policy_version 115933 (0.0037) [2024-06-18 10:39:01,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1899593728. Throughput: 0: 42821.0. Samples: 1899681540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:01,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 10:39:02,503][12883] Updated weights for policy 0, policy_version 115943 (0.0027) [2024-06-18 10:39:05,609][12883] Updated weights for policy 0, policy_version 115953 (0.0036) [2024-06-18 10:39:06,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 1899855872. Throughput: 0: 42944.7. Samples: 1899945180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:06,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 10:39:09,979][12883] Updated weights for policy 0, policy_version 115963 (0.0037) [2024-06-18 10:39:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1900019712. Throughput: 0: 43003.2. Samples: 1900205660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:11,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 10:39:13,464][12883] Updated weights for policy 0, policy_version 115973 (0.0029) [2024-06-18 10:39:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1900249088. Throughput: 0: 42889.7. Samples: 1900324600. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:16,994][12645] Avg episode reward: [(0, '0.642')] [2024-06-18 10:39:17,669][12883] Updated weights for policy 0, policy_version 115983 (0.0036) [2024-06-18 10:39:21,047][12883] Updated weights for policy 0, policy_version 115993 (0.0029) [2024-06-18 10:39:21,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1900494848. Throughput: 0: 42914.3. Samples: 1900589460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:21,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 10:39:25,283][12883] Updated weights for policy 0, policy_version 116003 (0.0031) [2024-06-18 10:39:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1900675072. Throughput: 0: 42868.5. Samples: 1900848080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:26,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 10:39:28,462][12862] Signal inference workers to stop experience collection... (27800 times) [2024-06-18 10:39:28,495][12883] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-06-18 10:39:28,515][12862] Signal inference workers to resume experience collection... (27800 times) [2024-06-18 10:39:28,517][12883] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-06-18 10:39:28,911][12883] Updated weights for policy 0, policy_version 116013 (0.0031) [2024-06-18 10:39:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1900904448. Throughput: 0: 42787.5. Samples: 1900962000. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:31,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 10:39:33,191][12883] Updated weights for policy 0, policy_version 116023 (0.0031) [2024-06-18 10:39:36,413][12883] Updated weights for policy 0, policy_version 116033 (0.0044) [2024-06-18 10:39:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1901117440. Throughput: 0: 42712.2. Samples: 1901228660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:36,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 10:39:40,733][12883] Updated weights for policy 0, policy_version 116043 (0.0038) [2024-06-18 10:39:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1901297664. Throughput: 0: 42801.3. Samples: 1901490680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:41,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 10:39:44,059][12883] Updated weights for policy 0, policy_version 116053 (0.0043) [2024-06-18 10:39:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1901543424. Throughput: 0: 42795.4. Samples: 1901607340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 10:39:46,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 10:39:48,307][12883] Updated weights for policy 0, policy_version 116063 (0.0024) [2024-06-18 10:39:51,744][12883] Updated weights for policy 0, policy_version 116073 (0.0047) [2024-06-18 10:39:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1901756416. Throughput: 0: 42664.2. Samples: 1901865060. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:39:51,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 10:39:55,748][12883] Updated weights for policy 0, policy_version 116083 (0.0040) [2024-06-18 10:39:56,994][12645] Fps is (10 sec: 37684.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1901920256. Throughput: 0: 42577.9. Samples: 1902121660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:39:56,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 10:39:59,346][12883] Updated weights for policy 0, policy_version 116093 (0.0031) [2024-06-18 10:40:01,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 1902198784. Throughput: 0: 42627.1. Samples: 1902242820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:01,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 10:40:03,246][12883] Updated weights for policy 0, policy_version 116103 (0.0035) [2024-06-18 10:40:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 1902379008. Throughput: 0: 42684.4. Samples: 1902510260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:06,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 10:40:07,021][12883] Updated weights for policy 0, policy_version 116113 (0.0032) [2024-06-18 10:40:10,858][12883] Updated weights for policy 0, policy_version 116123 (0.0026) [2024-06-18 10:40:11,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1902575616. Throughput: 0: 42452.0. Samples: 1902758420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:11,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 10:40:14,676][12883] Updated weights for policy 0, policy_version 116133 (0.0027) [2024-06-18 10:40:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 1902837760. Throughput: 0: 42677.9. Samples: 1902882500. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:16,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 10:40:18,322][12883] Updated weights for policy 0, policy_version 116143 (0.0032) [2024-06-18 10:40:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 1903001600. Throughput: 0: 42602.0. Samples: 1903145740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:21,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 10:40:22,652][12883] Updated weights for policy 0, policy_version 116153 (0.0035) [2024-06-18 10:40:24,090][12862] Signal inference workers to stop experience collection... (27850 times) [2024-06-18 10:40:24,139][12883] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-06-18 10:40:24,146][12862] Signal inference workers to resume experience collection... (27850 times) [2024-06-18 10:40:24,162][12883] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-06-18 10:40:26,022][12883] Updated weights for policy 0, policy_version 116163 (0.0030) [2024-06-18 10:40:26,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1903214592. Throughput: 0: 42215.0. Samples: 1903390360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:26,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 10:40:30,235][12883] Updated weights for policy 0, policy_version 116173 (0.0021) [2024-06-18 10:40:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1903460352. Throughput: 0: 42524.6. Samples: 1903520940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:31,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 10:40:34,198][12883] Updated weights for policy 0, policy_version 116183 (0.0022) [2024-06-18 10:40:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 1903624192. Throughput: 0: 42496.4. 
Samples: 1903777400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:36,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 10:40:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116189_1903640576.pth... [2024-06-18 10:40:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115566_1893433344.pth [2024-06-18 10:40:37,899][12883] Updated weights for policy 0, policy_version 116193 (0.0033) [2024-06-18 10:40:41,887][12883] Updated weights for policy 0, policy_version 116203 (0.0027) [2024-06-18 10:40:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1903869952. Throughput: 0: 42389.1. Samples: 1904029180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:41,995][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 10:40:45,512][12883] Updated weights for policy 0, policy_version 116213 (0.0041) [2024-06-18 10:40:46,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1904099328. Throughput: 0: 42651.2. Samples: 1904162120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 10:40:46,994][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 10:40:49,479][12883] Updated weights for policy 0, policy_version 116223 (0.0037) [2024-06-18 10:40:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.0, 300 sec: 42432.1). Total num frames: 1904263168. Throughput: 0: 42241.2. Samples: 1904411120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:40:51,995][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 10:40:53,322][12883] Updated weights for policy 0, policy_version 116233 (0.0038) [2024-06-18 10:40:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 1904492544. Throughput: 0: 42168.3. Samples: 1904656000. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:40:56,994][12645] Avg episode reward: [(0, '0.713')] [2024-06-18 10:40:57,630][12883] Updated weights for policy 0, policy_version 116243 (0.0037) [2024-06-18 10:41:01,103][12883] Updated weights for policy 0, policy_version 116253 (0.0029) [2024-06-18 10:41:01,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1904738304. Throughput: 0: 42458.1. Samples: 1904793120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:01,994][12645] Avg episode reward: [(0, '0.682')] [2024-06-18 10:41:05,208][12883] Updated weights for policy 0, policy_version 116263 (0.0026) [2024-06-18 10:41:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 1904902144. Throughput: 0: 42247.5. Samples: 1905046880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:06,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 10:41:08,687][12883] Updated weights for policy 0, policy_version 116273 (0.0036) [2024-06-18 10:41:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1905147904. Throughput: 0: 42261.0. Samples: 1905292100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:11,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 10:41:13,209][12883] Updated weights for policy 0, policy_version 116283 (0.0046) [2024-06-18 10:41:16,435][12883] Updated weights for policy 0, policy_version 116293 (0.0030) [2024-06-18 10:41:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1905360896. Throughput: 0: 42376.4. Samples: 1905427880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:16,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 10:41:20,780][12883] Updated weights for policy 0, policy_version 116303 (0.0037) [2024-06-18 10:41:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1905541120. Throughput: 0: 42329.8. Samples: 1905682240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:21,994][12645] Avg episode reward: [(0, '0.670')] [2024-06-18 10:41:24,085][12883] Updated weights for policy 0, policy_version 116313 (0.0031) [2024-06-18 10:41:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 1905786880. Throughput: 0: 42445.0. Samples: 1905939200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:26,994][12645] Avg episode reward: [(0, '0.621')] [2024-06-18 10:41:28,423][12883] Updated weights for policy 0, policy_version 116323 (0.0028) [2024-06-18 10:41:31,671][12883] Updated weights for policy 0, policy_version 116333 (0.0036) [2024-06-18 10:41:31,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1905999872. Throughput: 0: 42328.9. Samples: 1906066920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:31,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 10:41:36,143][12883] Updated weights for policy 0, policy_version 116343 (0.0029) [2024-06-18 10:41:36,607][12862] Signal inference workers to stop experience collection... (27900 times) [2024-06-18 10:41:36,607][12862] Signal inference workers to resume experience collection... (27900 times) [2024-06-18 10:41:36,649][12883] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-06-18 10:41:36,649][12883] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-06-18 10:41:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 1906212864. Throughput: 0: 42550.3. 
Samples: 1906325880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:36,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 10:41:39,268][12883] Updated weights for policy 0, policy_version 116353 (0.0028) [2024-06-18 10:41:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1906425856. Throughput: 0: 42744.4. Samples: 1906579500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:41,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 10:41:43,936][12883] Updated weights for policy 0, policy_version 116363 (0.0036) [2024-06-18 10:41:46,778][12883] Updated weights for policy 0, policy_version 116373 (0.0045) [2024-06-18 10:41:46,997][12645] Fps is (10 sec: 44223.3, 60 sec: 42596.2, 300 sec: 42653.5). Total num frames: 1906655232. Throughput: 0: 42639.3. Samples: 1906712020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:46,997][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 10:41:51,555][12883] Updated weights for policy 0, policy_version 116383 (0.0033) [2024-06-18 10:41:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.7, 300 sec: 42542.9). Total num frames: 1906851840. Throughput: 0: 42659.5. Samples: 1906966560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 10:41:51,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 10:41:54,635][12883] Updated weights for policy 0, policy_version 116393 (0.0033) [2024-06-18 10:41:56,994][12645] Fps is (10 sec: 40972.7, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 1907064832. Throughput: 0: 42834.6. Samples: 1907219660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:41:56,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 10:41:59,009][12883] Updated weights for policy 0, policy_version 116403 (0.0031) [2024-06-18 10:42:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1907294208. Throughput: 0: 42876.5. 
Samples: 1907357320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:01,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 10:42:02,134][12883] Updated weights for policy 0, policy_version 116413 (0.0024) [2024-06-18 10:42:06,561][12883] Updated weights for policy 0, policy_version 116423 (0.0045) [2024-06-18 10:42:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1907490816. Throughput: 0: 42992.8. Samples: 1907616920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:06,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 10:42:09,827][12883] Updated weights for policy 0, policy_version 116433 (0.0022) [2024-06-18 10:42:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 1907720192. Throughput: 0: 42768.0. Samples: 1907863760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:11,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 10:42:14,057][12883] Updated weights for policy 0, policy_version 116443 (0.0025) [2024-06-18 10:42:16,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1907916800. Throughput: 0: 42997.8. Samples: 1908001820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:16,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 10:42:17,715][12883] Updated weights for policy 0, policy_version 116453 (0.0027) [2024-06-18 10:42:21,519][12883] Updated weights for policy 0, policy_version 116463 (0.0042) [2024-06-18 10:42:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1908129792. Throughput: 0: 42981.4. Samples: 1908260040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:21,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 10:42:25,210][12883] Updated weights for policy 0, policy_version 116473 (0.0032) [2024-06-18 10:42:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1908359168. Throughput: 0: 42856.6. Samples: 1908508040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:26,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 10:42:29,293][12883] Updated weights for policy 0, policy_version 116483 (0.0037) [2024-06-18 10:42:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 1908555776. Throughput: 0: 42940.8. Samples: 1908644220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:31,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 10:42:32,747][12883] Updated weights for policy 0, policy_version 116493 (0.0029) [2024-06-18 10:42:36,597][12883] Updated weights for policy 0, policy_version 116503 (0.0032) [2024-06-18 10:42:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 1908785152. Throughput: 0: 43205.8. Samples: 1908910820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:36,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 10:42:37,137][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116505_1908817920.pth... [2024-06-18 10:42:37,195][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000115880_1898577920.pth [2024-06-18 10:42:40,219][12883] Updated weights for policy 0, policy_version 116513 (0.0026) [2024-06-18 10:42:41,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1909014528. Throughput: 0: 43240.4. Samples: 1909165480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:41,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 10:42:44,306][12883] Updated weights for policy 0, policy_version 116523 (0.0038) [2024-06-18 10:42:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42327.6, 300 sec: 42598.4). Total num frames: 1909194752. Throughput: 0: 42952.0. Samples: 1909290160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:46,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 10:42:47,916][12883] Updated weights for policy 0, policy_version 116533 (0.0027) [2024-06-18 10:42:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1909424128. Throughput: 0: 42999.7. Samples: 1909551900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 10:42:51,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 10:42:52,019][12883] Updated weights for policy 0, policy_version 116543 (0.0038) [2024-06-18 10:42:55,567][12883] Updated weights for policy 0, policy_version 116553 (0.0033) [2024-06-18 10:42:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1909653504. Throughput: 0: 43107.6. Samples: 1909803600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:42:56,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 10:42:59,478][12862] Signal inference workers to stop experience collection... (27950 times) [2024-06-18 10:42:59,519][12883] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-06-18 10:42:59,524][12862] Signal inference workers to resume experience collection... (27950 times) [2024-06-18 10:42:59,540][12883] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-06-18 10:42:59,543][12883] Updated weights for policy 0, policy_version 116563 (0.0032) [2024-06-18 10:43:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1909850112. 
Throughput: 0: 42850.1. Samples: 1909930080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:01,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 10:43:03,662][12883] Updated weights for policy 0, policy_version 116573 (0.0035) [2024-06-18 10:43:06,951][12883] Updated weights for policy 0, policy_version 116583 (0.0033) [2024-06-18 10:43:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 1910095872. Throughput: 0: 42958.6. Samples: 1910193180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:06,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 10:43:11,264][12883] Updated weights for policy 0, policy_version 116593 (0.0044) [2024-06-18 10:43:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1910308864. Throughput: 0: 43051.6. Samples: 1910445360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:11,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 10:43:14,586][12883] Updated weights for policy 0, policy_version 116603 (0.0031) [2024-06-18 10:43:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1910489088. Throughput: 0: 42950.2. Samples: 1910576980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:16,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 10:43:18,846][12883] Updated weights for policy 0, policy_version 116613 (0.0025) [2024-06-18 10:43:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1910718464. Throughput: 0: 42724.9. Samples: 1910833440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:21,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 10:43:22,346][12883] Updated weights for policy 0, policy_version 116623 (0.0028) [2024-06-18 10:43:26,248][12883] Updated weights for policy 0, policy_version 116633 (0.0041) [2024-06-18 10:43:26,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1910964224. Throughput: 0: 42735.6. Samples: 1911088580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:26,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 10:43:29,914][12883] Updated weights for policy 0, policy_version 116643 (0.0023) [2024-06-18 10:43:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 1911128064. Throughput: 0: 42921.2. Samples: 1911221620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:31,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 10:43:33,937][12883] Updated weights for policy 0, policy_version 116653 (0.0034) [2024-06-18 10:43:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1911373824. Throughput: 0: 42875.5. Samples: 1911481300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:36,994][12645] Avg episode reward: [(0, '0.608')] [2024-06-18 10:43:37,653][12883] Updated weights for policy 0, policy_version 116663 (0.0043) [2024-06-18 10:43:41,433][12883] Updated weights for policy 0, policy_version 116673 (0.0038) [2024-06-18 10:43:41,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1911603200. Throughput: 0: 42900.4. Samples: 1911734120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:41,994][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 10:43:45,162][12883] Updated weights for policy 0, policy_version 116683 (0.0032) [2024-06-18 10:43:46,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1911767040. Throughput: 0: 43024.1. Samples: 1911866160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:46,994][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 10:43:49,083][12883] Updated weights for policy 0, policy_version 116693 (0.0026) [2024-06-18 10:43:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 1912029184. Throughput: 0: 43019.9. Samples: 1912129080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 26.0) [2024-06-18 10:43:51,994][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 10:43:52,758][12883] Updated weights for policy 0, policy_version 116703 (0.0043) [2024-06-18 10:43:56,672][12883] Updated weights for policy 0, policy_version 116713 (0.0032) [2024-06-18 10:43:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1912225792. Throughput: 0: 43115.0. Samples: 1912385540. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:43:56,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 10:44:00,584][12883] Updated weights for policy 0, policy_version 116723 (0.0049) [2024-06-18 10:44:01,997][12645] Fps is (10 sec: 39309.8, 60 sec: 42869.3, 300 sec: 42598.0). Total num frames: 1912422400. Throughput: 0: 42939.2. Samples: 1912509380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:01,997][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 10:44:04,234][12883] Updated weights for policy 0, policy_version 116733 (0.0036) [2024-06-18 10:44:07,000][12645] Fps is (10 sec: 44209.4, 60 sec: 42867.1, 300 sec: 42875.2). Total num frames: 1912668160. Throughput: 0: 43124.2. Samples: 1912774300. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:07,000][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 10:44:08,234][12883] Updated weights for policy 0, policy_version 116743 (0.0038) [2024-06-18 10:44:11,951][12883] Updated weights for policy 0, policy_version 116753 (0.0036) [2024-06-18 10:44:11,994][12645] Fps is (10 sec: 45889.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1912881152. Throughput: 0: 43113.3. Samples: 1913028680. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:11,994][12645] Avg episode reward: [(0, '0.697')] [2024-06-18 10:44:15,804][12883] Updated weights for policy 0, policy_version 116763 (0.0032) [2024-06-18 10:44:16,994][12645] Fps is (10 sec: 40985.3, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1913077760. Throughput: 0: 42938.2. Samples: 1913153840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:16,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 10:44:19,517][12883] Updated weights for policy 0, policy_version 116773 (0.0044) [2024-06-18 10:44:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43417.4, 300 sec: 42876.1). Total num frames: 1913323520. Throughput: 0: 43077.2. Samples: 1913419780. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:21,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 10:44:23,294][12883] Updated weights for policy 0, policy_version 116783 (0.0033) [2024-06-18 10:44:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1913503744. Throughput: 0: 43091.2. Samples: 1913673220. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:26,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 10:44:27,476][12883] Updated weights for policy 0, policy_version 116793 (0.0032) [2024-06-18 10:44:29,637][12862] Signal inference workers to stop experience collection... 
(28000 times) [2024-06-18 10:44:29,638][12862] Signal inference workers to resume experience collection... (28000 times) [2024-06-18 10:44:29,671][12883] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-06-18 10:44:29,671][12883] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-06-18 10:44:30,752][12883] Updated weights for policy 0, policy_version 116803 (0.0033) [2024-06-18 10:44:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1913733120. Throughput: 0: 42958.1. Samples: 1913799280. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:31,996][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 10:44:35,380][12883] Updated weights for policy 0, policy_version 116813 (0.0026) [2024-06-18 10:44:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1913929728. Throughput: 0: 42852.6. Samples: 1914057440. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:36,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 10:44:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116817_1913929728.pth... [2024-06-18 10:44:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116189_1903640576.pth [2024-06-18 10:44:38,423][12883] Updated weights for policy 0, policy_version 116823 (0.0034) [2024-06-18 10:44:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1914142720. Throughput: 0: 42673.3. Samples: 1914305840. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:41,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 10:44:43,276][12883] Updated weights for policy 0, policy_version 116833 (0.0029) [2024-06-18 10:44:46,126][12883] Updated weights for policy 0, policy_version 116843 (0.0047) [2024-06-18 10:44:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42765.0). 
Total num frames: 1914372096. Throughput: 0: 42855.9. Samples: 1914437760. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:46,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 10:44:50,999][12883] Updated weights for policy 0, policy_version 116853 (0.0036) [2024-06-18 10:44:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 1914568704. Throughput: 0: 42716.1. Samples: 1914696260. Policy #0 lag: (min: 1.0, avg: 9.1, max: 22.0) [2024-06-18 10:44:51,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 10:44:53,750][12883] Updated weights for policy 0, policy_version 116863 (0.0057) [2024-06-18 10:44:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1914781696. Throughput: 0: 42663.1. Samples: 1914948520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:44:56,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 10:44:58,456][12883] Updated weights for policy 0, policy_version 116873 (0.0036) [2024-06-18 10:45:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42873.6, 300 sec: 42765.0). Total num frames: 1914994688. Throughput: 0: 42719.9. Samples: 1915076240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:01,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 10:45:02,137][12883] Updated weights for policy 0, policy_version 116883 (0.0023) [2024-06-18 10:45:06,076][12883] Updated weights for policy 0, policy_version 116893 (0.0036) [2024-06-18 10:45:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42329.7, 300 sec: 42820.5). Total num frames: 1915207680. Throughput: 0: 42522.3. Samples: 1915333280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:06,994][12645] Avg episode reward: [(0, '0.662')] [2024-06-18 10:45:10,279][12883] Updated weights for policy 0, policy_version 116903 (0.0038) [2024-06-18 10:45:11,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42653.9). 
Total num frames: 1915420672. Throughput: 0: 42437.9. Samples: 1915582920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:11,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 10:45:14,292][12883] Updated weights for policy 0, policy_version 116913 (0.0038) [2024-06-18 10:45:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 1915633664. Throughput: 0: 42544.9. Samples: 1915713800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:16,994][12645] Avg episode reward: [(0, '0.296')] [2024-06-18 10:45:17,857][12883] Updated weights for policy 0, policy_version 116923 (0.0037) [2024-06-18 10:45:21,900][12883] Updated weights for policy 0, policy_version 116933 (0.0033) [2024-06-18 10:45:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 1915830272. Throughput: 0: 42264.5. Samples: 1915959340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:21,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 10:45:25,789][12883] Updated weights for policy 0, policy_version 116943 (0.0036) [2024-06-18 10:45:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1916043264. Throughput: 0: 42350.7. Samples: 1916211620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:26,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 10:45:29,555][12883] Updated weights for policy 0, policy_version 116953 (0.0027) [2024-06-18 10:45:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 1916289024. Throughput: 0: 42294.7. Samples: 1916341020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:31,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 10:45:33,314][12883] Updated weights for policy 0, policy_version 116963 (0.0035) [2024-06-18 10:45:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42654.0). 
Total num frames: 1916452864. Throughput: 0: 42157.4. Samples: 1916593340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:36,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 10:45:37,283][12883] Updated weights for policy 0, policy_version 116973 (0.0053) [2024-06-18 10:45:38,963][12862] Signal inference workers to stop experience collection... (28050 times) [2024-06-18 10:45:38,963][12862] Signal inference workers to resume experience collection... (28050 times) [2024-06-18 10:45:38,986][12883] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-06-18 10:45:38,987][12883] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-06-18 10:45:40,834][12883] Updated weights for policy 0, policy_version 116983 (0.0027) [2024-06-18 10:45:41,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1916665856. Throughput: 0: 42336.5. Samples: 1916853660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:41,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 10:45:45,125][12883] Updated weights for policy 0, policy_version 116993 (0.0030) [2024-06-18 10:45:46,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 1916928000. Throughput: 0: 42226.0. Samples: 1916976400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:46,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 10:45:49,004][12883] Updated weights for policy 0, policy_version 117003 (0.0033) [2024-06-18 10:45:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 1917091840. Throughput: 0: 42174.2. Samples: 1917231120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:51,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 10:45:52,888][12883] Updated weights for policy 0, policy_version 117013 (0.0033) [2024-06-18 10:45:56,600][12883] Updated weights for policy 0, policy_version 117023 (0.0049) [2024-06-18 10:45:56,998][12645] Fps is (10 sec: 37665.8, 60 sec: 42049.1, 300 sec: 42597.8). Total num frames: 1917304832. Throughput: 0: 42251.2. Samples: 1917484420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-18 10:45:56,999][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 10:46:00,554][12883] Updated weights for policy 0, policy_version 117033 (0.0044) [2024-06-18 10:46:01,994][12645] Fps is (10 sec: 49152.6, 60 sec: 43144.7, 300 sec: 42987.2). Total num frames: 1917583360. Throughput: 0: 42214.8. Samples: 1917613460. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:01,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 10:46:04,183][12883] Updated weights for policy 0, policy_version 117043 (0.0036) [2024-06-18 10:46:06,994][12645] Fps is (10 sec: 44256.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1917747200. Throughput: 0: 42420.0. Samples: 1917868240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:06,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 10:46:08,240][12883] Updated weights for policy 0, policy_version 117053 (0.0033) [2024-06-18 10:46:11,890][12883] Updated weights for policy 0, policy_version 117063 (0.0032) [2024-06-18 10:46:11,996][12645] Fps is (10 sec: 37674.4, 60 sec: 42323.7, 300 sec: 42709.2). Total num frames: 1917960192. Throughput: 0: 42515.7. Samples: 1918124920. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:11,996][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 10:46:15,725][12883] Updated weights for policy 0, policy_version 117073 (0.0024) [2024-06-18 10:46:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1918205952. Throughput: 0: 42464.0. Samples: 1918251900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:16,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 10:46:19,901][12883] Updated weights for policy 0, policy_version 117083 (0.0039) [2024-06-18 10:46:21,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1918369792. Throughput: 0: 42505.7. Samples: 1918506100. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:21,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 10:46:23,541][12883] Updated weights for policy 0, policy_version 117093 (0.0040) [2024-06-18 10:46:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1918599168. Throughput: 0: 42373.3. Samples: 1918760460. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:26,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 10:46:27,407][12883] Updated weights for policy 0, policy_version 117103 (0.0033) [2024-06-18 10:46:31,033][12883] Updated weights for policy 0, policy_version 117113 (0.0030) [2024-06-18 10:46:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1918828544. Throughput: 0: 42592.7. Samples: 1918893080. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:31,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 10:46:34,879][12883] Updated weights for policy 0, policy_version 117123 (0.0038) [2024-06-18 10:46:36,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 1919008768. Throughput: 0: 42504.3. Samples: 1919143820. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:36,995][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 10:46:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117127_1919008768.pth... [2024-06-18 10:46:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116505_1908817920.pth [2024-06-18 10:46:38,822][12883] Updated weights for policy 0, policy_version 117133 (0.0037) [2024-06-18 10:46:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.4). Total num frames: 1919238144. Throughput: 0: 42575.4. Samples: 1919400120. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:41,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 10:46:42,346][12883] Updated weights for policy 0, policy_version 117143 (0.0032) [2024-06-18 10:46:46,554][12883] Updated weights for policy 0, policy_version 117153 (0.0035) [2024-06-18 10:46:46,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 1919467520. Throughput: 0: 42620.3. Samples: 1919531380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:46,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 10:46:50,059][12883] Updated weights for policy 0, policy_version 117163 (0.0037) [2024-06-18 10:46:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1919664128. Throughput: 0: 42664.4. Samples: 1919788140. Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:51,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 10:46:54,062][12883] Updated weights for policy 0, policy_version 117173 (0.0032) [2024-06-18 10:46:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43147.7, 300 sec: 42709.5). Total num frames: 1919893504. Throughput: 0: 42633.6. Samples: 1920043340. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 23.0) [2024-06-18 10:46:56,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 10:46:57,657][12883] Updated weights for policy 0, policy_version 117183 (0.0042) [2024-06-18 10:46:58,623][12862] Signal inference workers to stop experience collection... (28100 times) [2024-06-18 10:46:58,624][12862] Signal inference workers to resume experience collection... (28100 times) [2024-06-18 10:46:58,683][12883] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-06-18 10:46:58,683][12883] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-06-18 10:47:01,545][12883] Updated weights for policy 0, policy_version 117193 (0.0036) [2024-06-18 10:47:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 1920106496. Throughput: 0: 42802.7. Samples: 1920178020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:01,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 10:47:05,257][12883] Updated weights for policy 0, policy_version 117203 (0.0042) [2024-06-18 10:47:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1920303104. Throughput: 0: 42803.1. Samples: 1920432240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:06,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 10:47:09,043][12883] Updated weights for policy 0, policy_version 117213 (0.0026) [2024-06-18 10:47:11,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43146.0, 300 sec: 42820.5). Total num frames: 1920548864. Throughput: 0: 42795.8. Samples: 1920686280. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:11,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 10:47:12,727][12883] Updated weights for policy 0, policy_version 117223 (0.0037) [2024-06-18 10:47:16,859][12883] Updated weights for policy 0, policy_version 117233 (0.0032) [2024-06-18 10:47:16,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1920745472. Throughput: 0: 42989.0. Samples: 1920827580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:16,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 10:47:20,648][12883] Updated weights for policy 0, policy_version 117243 (0.0030) [2024-06-18 10:47:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1920958464. Throughput: 0: 43021.5. Samples: 1921079780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:21,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 10:47:24,635][12883] Updated weights for policy 0, policy_version 117253 (0.0028) [2024-06-18 10:47:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 1921204224. Throughput: 0: 42816.9. Samples: 1921326880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:26,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 10:47:28,153][12883] Updated weights for policy 0, policy_version 117263 (0.0033) [2024-06-18 10:47:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1921384448. Throughput: 0: 42977.0. Samples: 1921465340. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:31,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 10:47:32,011][12883] Updated weights for policy 0, policy_version 117273 (0.0042) [2024-06-18 10:47:35,749][12883] Updated weights for policy 0, policy_version 117283 (0.0043) [2024-06-18 10:47:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 1921597440. Throughput: 0: 42899.1. Samples: 1921718600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:36,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 10:47:39,753][12883] Updated weights for policy 0, policy_version 117293 (0.0045) [2024-06-18 10:47:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 1921826816. Throughput: 0: 42910.4. Samples: 1921974300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:41,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 10:47:43,346][12883] Updated weights for policy 0, policy_version 117303 (0.0027) [2024-06-18 10:47:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1922007040. Throughput: 0: 42802.7. Samples: 1922104140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:46,994][12645] Avg episode reward: [(0, '0.778')] [2024-06-18 10:47:47,576][12883] Updated weights for policy 0, policy_version 117313 (0.0036) [2024-06-18 10:47:50,943][12883] Updated weights for policy 0, policy_version 117323 (0.0040) [2024-06-18 10:47:51,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1922220032. Throughput: 0: 42796.5. Samples: 1922358080. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:51,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 10:47:55,127][12883] Updated weights for policy 0, policy_version 117333 (0.0030) [2024-06-18 10:47:56,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1922465792. Throughput: 0: 42745.9. Samples: 1922609840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 10:47:56,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 10:47:58,448][12883] Updated weights for policy 0, policy_version 117343 (0.0033) [2024-06-18 10:48:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1922662400. Throughput: 0: 42685.2. Samples: 1922748420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:01,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 10:48:02,703][12883] Updated weights for policy 0, policy_version 117353 (0.0035) [2024-06-18 10:48:06,414][12883] Updated weights for policy 0, policy_version 117363 (0.0044) [2024-06-18 10:48:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1922875392. Throughput: 0: 42629.4. Samples: 1922998100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:06,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 10:48:10,362][12883] Updated weights for policy 0, policy_version 117373 (0.0040) [2024-06-18 10:48:11,629][12862] Signal inference workers to stop experience collection... (28150 times) [2024-06-18 10:48:11,669][12883] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-06-18 10:48:11,688][12862] Signal inference workers to resume experience collection... (28150 times) [2024-06-18 10:48:11,691][12883] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-06-18 10:48:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 1923104768. Throughput: 0: 42801.4. 
Samples: 1923252940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:11,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 10:48:14,134][12883] Updated weights for policy 0, policy_version 117383 (0.0028) [2024-06-18 10:48:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1923301376. Throughput: 0: 42517.3. Samples: 1923378620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:16,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 10:48:18,146][12883] Updated weights for policy 0, policy_version 117393 (0.0033) [2024-06-18 10:48:21,870][12883] Updated weights for policy 0, policy_version 117403 (0.0026) [2024-06-18 10:48:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1923530752. Throughput: 0: 42524.5. Samples: 1923632200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:21,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 10:48:25,940][12883] Updated weights for policy 0, policy_version 117413 (0.0028) [2024-06-18 10:48:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1923743744. Throughput: 0: 42547.9. Samples: 1923888960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:26,996][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 10:48:29,474][12883] Updated weights for policy 0, policy_version 117423 (0.0023) [2024-06-18 10:48:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1923923968. Throughput: 0: 42434.2. Samples: 1924013680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:31,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 10:48:33,847][12883] Updated weights for policy 0, policy_version 117433 (0.0033) [2024-06-18 10:48:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1924169728. Throughput: 0: 42566.3. 
Samples: 1924273560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:36,994][12645] Avg episode reward: [(0, '0.613')] [2024-06-18 10:48:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117443_1924186112.pth... [2024-06-18 10:48:37,033][12883] Updated weights for policy 0, policy_version 117443 (0.0044) [2024-06-18 10:48:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000116817_1913929728.pth [2024-06-18 10:48:41,447][12883] Updated weights for policy 0, policy_version 117453 (0.0035) [2024-06-18 10:48:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1924382720. Throughput: 0: 42702.4. Samples: 1924531440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:41,994][12645] Avg episode reward: [(0, '0.613')] [2024-06-18 10:48:44,844][12883] Updated weights for policy 0, policy_version 117463 (0.0030) [2024-06-18 10:48:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1924579328. Throughput: 0: 42441.4. Samples: 1924658280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:46,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 10:48:49,018][12883] Updated weights for policy 0, policy_version 117473 (0.0026) [2024-06-18 10:48:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1924808704. Throughput: 0: 42661.8. Samples: 1924917880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:51,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 10:48:52,459][12883] Updated weights for policy 0, policy_version 117483 (0.0036) [2024-06-18 10:48:56,867][12883] Updated weights for policy 0, policy_version 117493 (0.0039) [2024-06-18 10:48:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42654.4). Total num frames: 1925005312. Throughput: 0: 42715.9. Samples: 1925175160. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:48:56,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 10:48:59,963][12883] Updated weights for policy 0, policy_version 117503 (0.0036) [2024-06-18 10:49:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42543.8). Total num frames: 1925218304. Throughput: 0: 42680.5. Samples: 1925299240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 10:49:01,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 10:49:04,313][12883] Updated weights for policy 0, policy_version 117513 (0.0031) [2024-06-18 10:49:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1925447680. Throughput: 0: 42816.7. Samples: 1925558960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:06,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 10:49:07,959][12883] Updated weights for policy 0, policy_version 117523 (0.0036) [2024-06-18 10:49:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1925627904. Throughput: 0: 42762.8. Samples: 1925813280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:11,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 10:49:12,312][12883] Updated weights for policy 0, policy_version 117533 (0.0033) [2024-06-18 10:49:15,923][12883] Updated weights for policy 0, policy_version 117543 (0.0029) [2024-06-18 10:49:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 1925840896. Throughput: 0: 42662.6. Samples: 1925933500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:16,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 10:49:19,999][12883] Updated weights for policy 0, policy_version 117553 (0.0034) [2024-06-18 10:49:21,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1926103040. Throughput: 0: 42655.2. Samples: 1926193040. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:21,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 10:49:23,505][12883] Updated weights for policy 0, policy_version 117563 (0.0042) [2024-06-18 10:49:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1926283264. Throughput: 0: 42665.1. Samples: 1926451380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:26,994][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 10:49:27,658][12883] Updated weights for policy 0, policy_version 117573 (0.0040) [2024-06-18 10:49:31,517][12883] Updated weights for policy 0, policy_version 117583 (0.0030) [2024-06-18 10:49:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1926496256. Throughput: 0: 42469.7. Samples: 1926569420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:31,995][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 10:49:35,420][12883] Updated weights for policy 0, policy_version 117593 (0.0035) [2024-06-18 10:49:36,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1926742016. Throughput: 0: 42482.2. Samples: 1926829580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:36,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 10:49:39,026][12883] Updated weights for policy 0, policy_version 117603 (0.0048) [2024-06-18 10:49:41,546][12862] Signal inference workers to stop experience collection... (28200 times) [2024-06-18 10:49:41,547][12862] Signal inference workers to resume experience collection... (28200 times) [2024-06-18 10:49:41,574][12883] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-06-18 10:49:41,575][12883] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-06-18 10:49:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1926922240. Throughput: 0: 42438.3. 
Samples: 1927084880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:41,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 10:49:43,232][12883] Updated weights for policy 0, policy_version 117613 (0.0025) [2024-06-18 10:49:46,515][12883] Updated weights for policy 0, policy_version 117623 (0.0036) [2024-06-18 10:49:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1927135232. Throughput: 0: 42328.0. Samples: 1927204000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:46,994][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 10:49:51,132][12883] Updated weights for policy 0, policy_version 117633 (0.0030) [2024-06-18 10:49:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1927364608. Throughput: 0: 42511.2. Samples: 1927471960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:51,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 10:49:54,130][12883] Updated weights for policy 0, policy_version 117643 (0.0033) [2024-06-18 10:49:56,997][12645] Fps is (10 sec: 40947.6, 60 sec: 42323.3, 300 sec: 42542.5). Total num frames: 1927544832. Throughput: 0: 42318.5. Samples: 1927717740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:49:56,997][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 10:49:59,058][12883] Updated weights for policy 0, policy_version 117653 (0.0042) [2024-06-18 10:50:01,588][12883] Updated weights for policy 0, policy_version 117663 (0.0040) [2024-06-18 10:50:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1927790592. Throughput: 0: 42492.3. Samples: 1927845660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 10:50:01,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 10:50:06,589][12883] Updated weights for policy 0, policy_version 117673 (0.0042) [2024-06-18 10:50:06,996][12645] Fps is (10 sec: 42601.3, 60 sec: 42050.7, 300 sec: 42542.5). Total num frames: 1927970816. Throughput: 0: 42613.8. Samples: 1928110760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:06,997][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 10:50:09,053][12883] Updated weights for policy 0, policy_version 117683 (0.0046) [2024-06-18 10:50:11,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 1928200192. Throughput: 0: 42393.6. Samples: 1928359180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:11,996][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 10:50:14,166][12883] Updated weights for policy 0, policy_version 117693 (0.0034) [2024-06-18 10:50:16,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 1928413184. Throughput: 0: 42754.6. Samples: 1928493380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:16,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 10:50:17,418][12883] Updated weights for policy 0, policy_version 117703 (0.0030) [2024-06-18 10:50:21,944][12883] Updated weights for policy 0, policy_version 117713 (0.0031) [2024-06-18 10:50:21,994][12645] Fps is (10 sec: 40968.7, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 1928609792. Throughput: 0: 42584.4. Samples: 1928745880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:21,995][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 10:50:25,131][12883] Updated weights for policy 0, policy_version 117723 (0.0037) [2024-06-18 10:50:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 1928839168. Throughput: 0: 42510.5. Samples: 1928997860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:26,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 10:50:29,846][12883] Updated weights for policy 0, policy_version 117733 (0.0026) [2024-06-18 10:50:31,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1929068544. Throughput: 0: 42812.8. Samples: 1929130580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:31,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 10:50:32,760][12883] Updated weights for policy 0, policy_version 117743 (0.0038) [2024-06-18 10:50:36,996][12645] Fps is (10 sec: 39313.1, 60 sec: 41504.6, 300 sec: 42598.1). Total num frames: 1929232384. Throughput: 0: 42480.4. Samples: 1929383680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:36,997][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 10:50:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117751_1929232384.pth... [2024-06-18 10:50:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117127_1919008768.pth [2024-06-18 10:50:37,390][12883] Updated weights for policy 0, policy_version 117753 (0.0045) [2024-06-18 10:50:40,447][12883] Updated weights for policy 0, policy_version 117763 (0.0033) [2024-06-18 10:50:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 1929478144. Throughput: 0: 42613.0. Samples: 1929635200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:41,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 10:50:44,980][12862] Signal inference workers to stop experience collection... (28250 times) [2024-06-18 10:50:44,981][12862] Signal inference workers to resume experience collection... 
(28250 times) [2024-06-18 10:50:45,014][12883] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-06-18 10:50:45,014][12883] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-06-18 10:50:45,142][12883] Updated weights for policy 0, policy_version 117773 (0.0033) [2024-06-18 10:50:46,994][12645] Fps is (10 sec: 49162.7, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 1929723904. Throughput: 0: 42745.7. Samples: 1929769220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:46,994][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 10:50:47,987][12883] Updated weights for policy 0, policy_version 117783 (0.0023) [2024-06-18 10:50:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42599.1). Total num frames: 1929871360. Throughput: 0: 42378.2. Samples: 1930017680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:51,994][12645] Avg episode reward: [(0, '0.191')] [2024-06-18 10:50:52,805][12883] Updated weights for policy 0, policy_version 117793 (0.0038) [2024-06-18 10:50:55,948][12883] Updated weights for policy 0, policy_version 117803 (0.0043) [2024-06-18 10:50:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43146.5, 300 sec: 42542.8). Total num frames: 1930133504. Throughput: 0: 42428.2. Samples: 1930268360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:50:56,995][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 10:51:00,246][12883] Updated weights for policy 0, policy_version 117813 (0.0040) [2024-06-18 10:51:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1930330112. Throughput: 0: 42509.9. Samples: 1930406320. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 10:51:01,994][12645] Avg episode reward: [(0, '0.672')] [2024-06-18 10:51:03,466][12883] Updated weights for policy 0, policy_version 117823 (0.0026) [2024-06-18 10:51:06,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42600.0, 300 sec: 42598.7). Total num frames: 1930526720. Throughput: 0: 42502.8. Samples: 1930658500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:06,994][12645] Avg episode reward: [(0, '0.723')] [2024-06-18 10:51:08,218][12883] Updated weights for policy 0, policy_version 117833 (0.0035) [2024-06-18 10:51:11,005][12883] Updated weights for policy 0, policy_version 117843 (0.0024) [2024-06-18 10:51:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 1930788864. Throughput: 0: 42547.6. Samples: 1930912500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:11,995][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 10:51:15,768][12883] Updated weights for policy 0, policy_version 117853 (0.0043) [2024-06-18 10:51:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1930985472. Throughput: 0: 42643.9. Samples: 1931049560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:16,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 10:51:18,650][12883] Updated weights for policy 0, policy_version 117863 (0.0042) [2024-06-18 10:51:21,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1931165696. Throughput: 0: 42489.7. Samples: 1931295620. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:21,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 10:51:23,499][12883] Updated weights for policy 0, policy_version 117873 (0.0041) [2024-06-18 10:51:26,340][12883] Updated weights for policy 0, policy_version 117883 (0.0021) [2024-06-18 10:51:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1931427840. Throughput: 0: 42615.6. Samples: 1931552900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:26,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 10:51:31,258][12883] Updated weights for policy 0, policy_version 117893 (0.0053) [2024-06-18 10:51:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 1931608064. Throughput: 0: 42748.5. Samples: 1931692900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:31,995][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 10:51:34,085][12883] Updated weights for policy 0, policy_version 117903 (0.0035) [2024-06-18 10:51:36,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 1931804672. Throughput: 0: 42547.9. Samples: 1931932340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:36,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 10:51:38,793][12883] Updated weights for policy 0, policy_version 117913 (0.0038) [2024-06-18 10:51:41,797][12883] Updated weights for policy 0, policy_version 117923 (0.0035) [2024-06-18 10:51:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1932066816. Throughput: 0: 42788.1. Samples: 1932193820. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:41,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 10:51:46,354][12883] Updated weights for policy 0, policy_version 117933 (0.0046) [2024-06-18 10:51:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1932247040. Throughput: 0: 42695.9. Samples: 1932327640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:46,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 10:51:49,395][12883] Updated weights for policy 0, policy_version 117943 (0.0035) [2024-06-18 10:51:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1932460032. Throughput: 0: 42490.2. Samples: 1932570560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:51,994][12645] Avg episode reward: [(0, '0.205')] [2024-06-18 10:51:53,728][12862] Signal inference workers to stop experience collection... (28300 times) [2024-06-18 10:51:53,771][12883] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-06-18 10:51:53,793][12862] Signal inference workers to resume experience collection... (28300 times) [2024-06-18 10:51:53,794][12883] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-06-18 10:51:54,225][12883] Updated weights for policy 0, policy_version 117953 (0.0039) [2024-06-18 10:51:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 1932673024. Throughput: 0: 42585.4. Samples: 1932828840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:51:56,994][12645] Avg episode reward: [(0, '0.866')] [2024-06-18 10:51:57,128][12862] Saving new best policy, reward=0.866! [2024-06-18 10:51:57,501][12883] Updated weights for policy 0, policy_version 117963 (0.0041) [2024-06-18 10:52:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1932853248. Throughput: 0: 42281.0. 
Samples: 1932952200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 10:52:01,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 10:52:02,086][12883] Updated weights for policy 0, policy_version 117973 (0.0039) [2024-06-18 10:52:04,987][12883] Updated weights for policy 0, policy_version 117983 (0.0045) [2024-06-18 10:52:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1933115392. Throughput: 0: 42315.6. Samples: 1933199820. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:06,994][12645] Avg episode reward: [(0, '0.807')] [2024-06-18 10:52:09,643][12883] Updated weights for policy 0, policy_version 117993 (0.0032) [2024-06-18 10:52:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1933312000. Throughput: 0: 42420.9. Samples: 1933461840. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:11,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 10:52:12,783][12883] Updated weights for policy 0, policy_version 118003 (0.0028) [2024-06-18 10:52:16,994][12645] Fps is (10 sec: 37682.7, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 1933492224. Throughput: 0: 42049.3. Samples: 1933585120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:16,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 10:52:17,419][12883] Updated weights for policy 0, policy_version 118013 (0.0042) [2024-06-18 10:52:20,308][12883] Updated weights for policy 0, policy_version 118023 (0.0036) [2024-06-18 10:52:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1933754368. Throughput: 0: 42423.5. Samples: 1933841400. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:21,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 10:52:25,123][12883] Updated weights for policy 0, policy_version 118033 (0.0030) [2024-06-18 10:52:26,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1933950976. Throughput: 0: 42454.2. Samples: 1934104260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:26,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 10:52:27,885][12883] Updated weights for policy 0, policy_version 118043 (0.0041) [2024-06-18 10:52:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1934131200. Throughput: 0: 42138.7. Samples: 1934223880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:31,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 10:52:32,917][12883] Updated weights for policy 0, policy_version 118053 (0.0038) [2024-06-18 10:52:35,858][12883] Updated weights for policy 0, policy_version 118063 (0.0028) [2024-06-18 10:52:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 1934393344. Throughput: 0: 42514.7. Samples: 1934483720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:36,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 10:52:37,030][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118067_1934409728.pth... [2024-06-18 10:52:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117443_1924186112.pth [2024-06-18 10:52:40,663][12883] Updated weights for policy 0, policy_version 118073 (0.0037) [2024-06-18 10:52:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 1934573568. Throughput: 0: 42610.7. Samples: 1934746320. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:41,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 10:52:43,761][12883] Updated weights for policy 0, policy_version 118083 (0.0028) [2024-06-18 10:52:46,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1934786560. Throughput: 0: 42372.8. Samples: 1934858980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:46,994][12645] Avg episode reward: [(0, '0.179')] [2024-06-18 10:52:48,499][12883] Updated weights for policy 0, policy_version 118093 (0.0042) [2024-06-18 10:52:51,494][12883] Updated weights for policy 0, policy_version 118103 (0.0032) [2024-06-18 10:52:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1935032320. Throughput: 0: 42703.2. Samples: 1935121460. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:51,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 10:52:56,034][12883] Updated weights for policy 0, policy_version 118113 (0.0034) [2024-06-18 10:52:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 1935212544. Throughput: 0: 42588.9. Samples: 1935378340. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:52:56,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 10:52:59,023][12883] Updated weights for policy 0, policy_version 118123 (0.0024) [2024-06-18 10:53:01,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1935409152. Throughput: 0: 42521.1. Samples: 1935498560. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:53:01,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 10:53:03,581][12883] Updated weights for policy 0, policy_version 118133 (0.0035) [2024-06-18 10:53:06,636][12883] Updated weights for policy 0, policy_version 118143 (0.0024) [2024-06-18 10:53:06,676][12862] Signal inference workers to stop experience collection... (28350 times) [2024-06-18 10:53:06,736][12883] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-06-18 10:53:06,792][12862] Signal inference workers to resume experience collection... (28350 times) [2024-06-18 10:53:06,792][12883] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-06-18 10:53:06,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1935687680. Throughput: 0: 42745.0. Samples: 1935764920. Policy #0 lag: (min: 0.0, avg: 12.7, max: 24.0) [2024-06-18 10:53:06,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 10:53:11,258][12883] Updated weights for policy 0, policy_version 118153 (0.0035) [2024-06-18 10:53:11,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 1935835136. Throughput: 0: 42471.9. Samples: 1936015500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:11,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 10:53:14,587][12883] Updated weights for policy 0, policy_version 118163 (0.0043) [2024-06-18 10:53:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1936064512. Throughput: 0: 42458.6. Samples: 1936134520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:16,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 10:53:18,917][12883] Updated weights for policy 0, policy_version 118173 (0.0037) [2024-06-18 10:53:21,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 1936293888. 
Throughput: 0: 42616.4. Samples: 1936401460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:21,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 10:53:22,158][12883] Updated weights for policy 0, policy_version 118183 (0.0030) [2024-06-18 10:53:26,819][12883] Updated weights for policy 0, policy_version 118193 (0.0033) [2024-06-18 10:53:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 1936474112. Throughput: 0: 42399.1. Samples: 1936654280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:26,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 10:53:29,775][12883] Updated weights for policy 0, policy_version 118203 (0.0030) [2024-06-18 10:53:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1936703488. Throughput: 0: 42588.0. Samples: 1936775440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:31,994][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 10:53:34,554][12883] Updated weights for policy 0, policy_version 118213 (0.0046) [2024-06-18 10:53:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 1936916480. Throughput: 0: 42493.2. Samples: 1937033660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:36,994][12645] Avg episode reward: [(0, '0.285')] [2024-06-18 10:53:37,499][12883] Updated weights for policy 0, policy_version 118223 (0.0029) [2024-06-18 10:53:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 1937096704. Throughput: 0: 42472.5. Samples: 1937289600. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:41,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 10:53:42,158][12883] Updated weights for policy 0, policy_version 118233 (0.0032) [2024-06-18 10:53:45,173][12883] Updated weights for policy 0, policy_version 118243 (0.0038) [2024-06-18 10:53:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1937342464. Throughput: 0: 42608.8. Samples: 1937415960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:46,994][12645] Avg episode reward: [(0, '0.728')] [2024-06-18 10:53:49,795][12883] Updated weights for policy 0, policy_version 118253 (0.0040) [2024-06-18 10:53:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 1937539072. Throughput: 0: 42277.2. Samples: 1937667400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:51,994][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 10:53:52,921][12883] Updated weights for policy 0, policy_version 118263 (0.0040) [2024-06-18 10:53:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1937752064. Throughput: 0: 42462.0. Samples: 1937926280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:53:56,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 10:53:57,417][12883] Updated weights for policy 0, policy_version 118273 (0.0040) [2024-06-18 10:54:00,588][12883] Updated weights for policy 0, policy_version 118283 (0.0045) [2024-06-18 10:54:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1937981440. Throughput: 0: 42667.6. Samples: 1938054560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:54:01,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 10:54:05,006][12883] Updated weights for policy 0, policy_version 118293 (0.0029) [2024-06-18 10:54:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 42542.8). Total num frames: 1938178048. Throughput: 0: 42446.6. Samples: 1938311560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 10:54:06,994][12645] Avg episode reward: [(0, '0.152')] [2024-06-18 10:54:08,321][12883] Updated weights for policy 0, policy_version 118303 (0.0036) [2024-06-18 10:54:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1938391040. Throughput: 0: 42439.6. Samples: 1938564060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:11,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 10:54:12,647][12883] Updated weights for policy 0, policy_version 118313 (0.0053) [2024-06-18 10:54:15,113][12862] Signal inference workers to stop experience collection... (28400 times) [2024-06-18 10:54:15,165][12862] Signal inference workers to resume experience collection... (28400 times) [2024-06-18 10:54:15,168][12883] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-06-18 10:54:15,183][12883] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-06-18 10:54:16,091][12883] Updated weights for policy 0, policy_version 118323 (0.0036) [2024-06-18 10:54:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1938636800. Throughput: 0: 42651.1. Samples: 1938694740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:16,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 10:54:20,159][12883] Updated weights for policy 0, policy_version 118333 (0.0037) [2024-06-18 10:54:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 1938800640. 
Throughput: 0: 42469.8. Samples: 1938944800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:21,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 10:54:23,828][12883] Updated weights for policy 0, policy_version 118343 (0.0027) [2024-06-18 10:54:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 1939046400. Throughput: 0: 42320.4. Samples: 1939194020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:26,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 10:54:28,648][12883] Updated weights for policy 0, policy_version 118353 (0.0047) [2024-06-18 10:54:31,396][12883] Updated weights for policy 0, policy_version 118363 (0.0044) [2024-06-18 10:54:31,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1939275776. Throughput: 0: 42513.8. Samples: 1939329080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:31,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 10:54:36,150][12883] Updated weights for policy 0, policy_version 118373 (0.0027) [2024-06-18 10:54:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1939456000. Throughput: 0: 42444.0. Samples: 1939577380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:36,994][12645] Avg episode reward: [(0, '0.657')] [2024-06-18 10:54:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118375_1939456000.pth... [2024-06-18 10:54:37,080][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000117751_1929232384.pth [2024-06-18 10:54:39,386][12883] Updated weights for policy 0, policy_version 118383 (0.0036) [2024-06-18 10:54:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 1939668992. Throughput: 0: 42288.9. Samples: 1939829280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:41,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 10:54:43,749][12883] Updated weights for policy 0, policy_version 118393 (0.0032) [2024-06-18 10:54:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1939898368. Throughput: 0: 42169.7. Samples: 1939952200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:46,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 10:54:47,145][12883] Updated weights for policy 0, policy_version 118403 (0.0038) [2024-06-18 10:54:51,302][12883] Updated weights for policy 0, policy_version 118413 (0.0029) [2024-06-18 10:54:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42598.8). Total num frames: 1940111360. Throughput: 0: 42240.9. Samples: 1940212400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:51,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 10:54:55,370][12883] Updated weights for policy 0, policy_version 118423 (0.0026) [2024-06-18 10:54:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 1940291584. Throughput: 0: 42287.5. Samples: 1940467000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:54:56,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 10:54:58,827][12883] Updated weights for policy 0, policy_version 118433 (0.0033) [2024-06-18 10:55:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42543.2). Total num frames: 1940520960. Throughput: 0: 42150.2. Samples: 1940591500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:55:01,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 10:55:02,852][12883] Updated weights for policy 0, policy_version 118443 (0.0028) [2024-06-18 10:55:06,450][12883] Updated weights for policy 0, policy_version 118453 (0.0041) [2024-06-18 10:55:06,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 1940750336. Throughput: 0: 42413.9. Samples: 1940853420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 10:55:06,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 10:55:10,861][12883] Updated weights for policy 0, policy_version 118463 (0.0036) [2024-06-18 10:55:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1940946944. Throughput: 0: 42461.0. Samples: 1941104760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:11,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 10:55:14,375][12883] Updated weights for policy 0, policy_version 118473 (0.0041) [2024-06-18 10:55:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1941176320. Throughput: 0: 42212.4. Samples: 1941228640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:16,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 10:55:18,428][12883] Updated weights for policy 0, policy_version 118483 (0.0032) [2024-06-18 10:55:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1941356544. Throughput: 0: 42381.0. Samples: 1941484520. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:21,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 10:55:22,467][12883] Updated weights for policy 0, policy_version 118493 (0.0027) [2024-06-18 10:55:25,888][12883] Updated weights for policy 0, policy_version 118503 (0.0028) [2024-06-18 10:55:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 1941602304. Throughput: 0: 42440.4. Samples: 1941739100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:26,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 10:55:30,262][12883] Updated weights for policy 0, policy_version 118513 (0.0032) [2024-06-18 10:55:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 1941798912. Throughput: 0: 42600.4. Samples: 1941869220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:31,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 10:55:33,855][12883] Updated weights for policy 0, policy_version 118523 (0.0029) [2024-06-18 10:55:36,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1941995520. Throughput: 0: 42441.3. Samples: 1942122260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:36,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 10:55:38,073][12883] Updated weights for policy 0, policy_version 118533 (0.0037) [2024-06-18 10:55:41,431][12883] Updated weights for policy 0, policy_version 118543 (0.0046) [2024-06-18 10:55:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.3, 300 sec: 42431.8). Total num frames: 1942241280. Throughput: 0: 42332.4. Samples: 1942371960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:41,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 10:55:45,635][12883] Updated weights for policy 0, policy_version 118553 (0.0041) [2024-06-18 10:55:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 1942421504. Throughput: 0: 42552.0. Samples: 1942506340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:46,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 10:55:48,926][12883] Updated weights for policy 0, policy_version 118563 (0.0042) [2024-06-18 10:55:51,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 1942634496. Throughput: 0: 42413.3. Samples: 1942762020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:51,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 10:55:53,146][12883] Updated weights for policy 0, policy_version 118573 (0.0050) [2024-06-18 10:55:56,614][12862] Signal inference workers to stop experience collection... (28450 times) [2024-06-18 10:55:56,614][12862] Signal inference workers to resume experience collection... (28450 times) [2024-06-18 10:55:56,623][12883] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-06-18 10:55:56,624][12883] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-06-18 10:55:56,767][12883] Updated weights for policy 0, policy_version 118583 (0.0042) [2024-06-18 10:55:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1942863872. Throughput: 0: 42470.6. Samples: 1943015940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:55:56,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 10:56:00,847][12883] Updated weights for policy 0, policy_version 118593 (0.0028) [2024-06-18 10:56:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1943060480. 
Throughput: 0: 42635.6. Samples: 1943147240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:56:01,994][12645] Avg episode reward: [(0, '0.267')] [2024-06-18 10:56:04,334][12883] Updated weights for policy 0, policy_version 118603 (0.0024) [2024-06-18 10:56:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 1943273472. Throughput: 0: 42384.9. Samples: 1943391840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:56:06,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 10:56:08,459][12883] Updated weights for policy 0, policy_version 118613 (0.0036) [2024-06-18 10:56:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 1943486464. Throughput: 0: 42324.0. Samples: 1943643680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 10:56:11,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 10:56:12,139][12883] Updated weights for policy 0, policy_version 118623 (0.0029) [2024-06-18 10:56:16,812][12883] Updated weights for policy 0, policy_version 118633 (0.0031) [2024-06-18 10:56:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1943699456. Throughput: 0: 42310.7. Samples: 1943773200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:56:16,998][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 10:56:19,815][12883] Updated weights for policy 0, policy_version 118643 (0.0043) [2024-06-18 10:56:21,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 1943912448. Throughput: 0: 42138.6. Samples: 1944018500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:56:21,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 10:56:24,462][12883] Updated weights for policy 0, policy_version 118653 (0.0036) [2024-06-18 10:56:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1944141824. 
Throughput: 0: 42362.8. Samples: 1944278280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:56:26,994][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 10:56:27,547][12883] Updated weights for policy 0, policy_version 118663 (0.0031) [2024-06-18 10:56:31,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 1944322048. Throughput: 0: 42265.4. Samples: 1944408280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:56:31,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 10:56:32,052][12883] Updated weights for policy 0, policy_version 118673 (0.0030) [2024-06-18 10:56:35,357][12883] Updated weights for policy 0, policy_version 118683 (0.0037) [2024-06-18 10:56:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 1944567808. Throughput: 0: 42324.8. Samples: 1944666640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:56:36,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 10:56:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118687_1944567808.pth... [2024-06-18 10:56:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118067_1934409728.pth [2024-06-18 10:56:39,659][12883] Updated weights for policy 0, policy_version 118693 (0.0038) [2024-06-18 10:56:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1944764416. Throughput: 0: 42437.4. Samples: 1944925620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:56:41,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 10:56:43,222][12883] Updated weights for policy 0, policy_version 118703 (0.0031) [2024-06-18 10:56:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 1944961024. Throughput: 0: 42290.5. Samples: 1945050320. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:56:46,995][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 10:56:47,501][12883] Updated weights for policy 0, policy_version 118713 (0.0044) [2024-06-18 10:56:51,115][12883] Updated weights for policy 0, policy_version 118723 (0.0034) [2024-06-18 10:56:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 1945190400. Throughput: 0: 42655.6. Samples: 1945311340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:56:51,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 10:56:54,915][12883] Updated weights for policy 0, policy_version 118733 (0.0027) [2024-06-18 10:56:56,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1945403392. Throughput: 0: 42661.7. Samples: 1945563460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:56:56,994][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 10:56:58,749][12883] Updated weights for policy 0, policy_version 118743 (0.0034) [2024-06-18 10:57:01,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 1945600000. Throughput: 0: 42699.1. Samples: 1945694660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:57:01,994][12645] Avg episode reward: [(0, '0.688')] [2024-06-18 10:57:02,726][12883] Updated weights for policy 0, policy_version 118753 (0.0033) [2024-06-18 10:57:06,324][12883] Updated weights for policy 0, policy_version 118763 (0.0042) [2024-06-18 10:57:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 1945845760. Throughput: 0: 42970.3. Samples: 1945952160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:57:06,994][12645] Avg episode reward: [(0, '0.689')] [2024-06-18 10:57:10,317][12883] Updated weights for policy 0, policy_version 118773 (0.0035) [2024-06-18 10:57:11,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1946058752. Throughput: 0: 42822.1. Samples: 1946205280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 10:57:11,994][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 10:57:14,203][12883] Updated weights for policy 0, policy_version 118783 (0.0029) [2024-06-18 10:57:14,499][12862] Signal inference workers to stop experience collection... (28500 times) [2024-06-18 10:57:14,499][12862] Signal inference workers to resume experience collection... (28500 times) [2024-06-18 10:57:14,547][12883] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-06-18 10:57:14,547][12883] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-06-18 10:57:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 1946255360. Throughput: 0: 42856.0. Samples: 1946336800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:57:16,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 10:57:17,872][12883] Updated weights for policy 0, policy_version 118793 (0.0036) [2024-06-18 10:57:21,862][12883] Updated weights for policy 0, policy_version 118803 (0.0036) [2024-06-18 10:57:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 1946468352. Throughput: 0: 42849.0. Samples: 1946594840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:57:21,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 10:57:25,441][12883] Updated weights for policy 0, policy_version 118813 (0.0034) [2024-06-18 10:57:26,996][12645] Fps is (10 sec: 45861.9, 60 sec: 42869.4, 300 sec: 42653.5). Total num frames: 1946714112. 
Throughput: 0: 42777.4. Samples: 1946850720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:57:26,997][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 10:57:29,529][12883] Updated weights for policy 0, policy_version 118823 (0.0035) [2024-06-18 10:57:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 1946910720. Throughput: 0: 43022.9. Samples: 1946986340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:57:31,994][12645] Avg episode reward: [(0, '0.268')] [2024-06-18 10:57:32,863][12883] Updated weights for policy 0, policy_version 118833 (0.0036) [2024-06-18 10:57:36,994][12645] Fps is (10 sec: 37693.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 1947090944. Throughput: 0: 42867.4. Samples: 1947240380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:57:36,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 10:57:37,211][12883] Updated weights for policy 0, policy_version 118843 (0.0037) [2024-06-18 10:57:40,430][12883] Updated weights for policy 0, policy_version 118853 (0.0037) [2024-06-18 10:57:41,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 1947336704. Throughput: 0: 42880.1. Samples: 1947493160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:57:41,996][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 10:57:44,817][12883] Updated weights for policy 0, policy_version 118863 (0.0043) [2024-06-18 10:57:46,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43417.7, 300 sec: 42487.3). Total num frames: 1947566080. Throughput: 0: 43112.1. Samples: 1947634700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:57:46,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 10:57:47,913][12883] Updated weights for policy 0, policy_version 118873 (0.0043) [2024-06-18 10:57:51,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 1947729920. 
Throughput: 0: 42809.9. Samples: 1947878600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:57:51,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 10:57:52,499][12883] Updated weights for policy 0, policy_version 118883 (0.0030) [2024-06-18 10:57:55,550][12883] Updated weights for policy 0, policy_version 118893 (0.0039) [2024-06-18 10:57:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1947992064. Throughput: 0: 42866.3. Samples: 1948134260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:57:57,003][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 10:58:00,183][12883] Updated weights for policy 0, policy_version 118903 (0.0044) [2024-06-18 10:58:01,994][12645] Fps is (10 sec: 49151.0, 60 sec: 43690.6, 300 sec: 42487.3). Total num frames: 1948221440. Throughput: 0: 42994.9. Samples: 1948271580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:58:01,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 10:58:03,167][12883] Updated weights for policy 0, policy_version 118913 (0.0030) [2024-06-18 10:58:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1948401664. Throughput: 0: 42895.5. Samples: 1948525140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:58:06,994][12645] Avg episode reward: [(0, '0.236')] [2024-06-18 10:58:07,624][12883] Updated weights for policy 0, policy_version 118923 (0.0047) [2024-06-18 10:58:11,040][12883] Updated weights for policy 0, policy_version 118933 (0.0036) [2024-06-18 10:58:11,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1948647424. Throughput: 0: 42793.8. Samples: 1948776320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 10:58:11,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 10:58:15,249][12883] Updated weights for policy 0, policy_version 118943 (0.0032) [2024-06-18 10:58:16,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 1948844032. Throughput: 0: 42756.3. Samples: 1948910380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:58:16,994][12645] Avg episode reward: [(0, '0.710')] [2024-06-18 10:58:18,638][12883] Updated weights for policy 0, policy_version 118953 (0.0049) [2024-06-18 10:58:21,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1949024256. Throughput: 0: 42784.9. Samples: 1949165700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:58:21,994][12645] Avg episode reward: [(0, '0.229')] [2024-06-18 10:58:23,163][12883] Updated weights for policy 0, policy_version 118963 (0.0023) [2024-06-18 10:58:26,210][12862] Signal inference workers to stop experience collection... (28550 times) [2024-06-18 10:58:26,210][12862] Signal inference workers to resume experience collection... (28550 times) [2024-06-18 10:58:26,214][12883] Updated weights for policy 0, policy_version 118973 (0.0036) [2024-06-18 10:58:26,228][12883] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-06-18 10:58:26,228][12883] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-06-18 10:58:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.4, 300 sec: 42653.9). Total num frames: 1949286400. Throughput: 0: 42602.5. Samples: 1949410180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:58:26,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 10:58:31,045][12883] Updated weights for policy 0, policy_version 118983 (0.0042) [2024-06-18 10:58:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1949466624. Throughput: 0: 42605.9. 
Samples: 1949551960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:58:31,994][12645] Avg episode reward: [(0, '0.652')] [2024-06-18 10:58:33,807][12883] Updated weights for policy 0, policy_version 118993 (0.0034) [2024-06-18 10:58:36,994][12645] Fps is (10 sec: 39321.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1949679616. Throughput: 0: 42787.0. Samples: 1949804020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:58:36,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 10:58:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118999_1949679616.pth... [2024-06-18 10:58:37,053][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118375_1939456000.pth [2024-06-18 10:58:38,613][12883] Updated weights for policy 0, policy_version 119003 (0.0045) [2024-06-18 10:58:41,521][12883] Updated weights for policy 0, policy_version 119013 (0.0038) [2024-06-18 10:58:41,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 1949925376. Throughput: 0: 42628.0. Samples: 1950052520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:58:41,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 10:58:46,270][12883] Updated weights for policy 0, policy_version 119023 (0.0034) [2024-06-18 10:58:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 1950121984. Throughput: 0: 42611.2. Samples: 1950189080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:58:46,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 10:58:49,362][12883] Updated weights for policy 0, policy_version 119033 (0.0044) [2024-06-18 10:58:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 1950318592. Throughput: 0: 42535.5. Samples: 1950439240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:58:51,995][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 10:58:53,918][12883] Updated weights for policy 0, policy_version 119043 (0.0037) [2024-06-18 10:58:56,993][12883] Updated weights for policy 0, policy_version 119053 (0.0041) [2024-06-18 10:58:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1950564352. Throughput: 0: 42692.0. Samples: 1950697460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:58:56,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 10:59:01,434][12883] Updated weights for policy 0, policy_version 119063 (0.0030) [2024-06-18 10:59:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1950777344. Throughput: 0: 42770.7. Samples: 1950835060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:59:01,998][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 10:59:04,620][12883] Updated weights for policy 0, policy_version 119073 (0.0040) [2024-06-18 10:59:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1950957568. Throughput: 0: 42668.4. Samples: 1951085780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:59:06,994][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 10:59:08,850][12883] Updated weights for policy 0, policy_version 119083 (0.0029) [2024-06-18 10:59:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1951203328. Throughput: 0: 42926.4. Samples: 1951341860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 10:59:11,994][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 10:59:12,199][12883] Updated weights for policy 0, policy_version 119093 (0.0035) [2024-06-18 10:59:16,536][12883] Updated weights for policy 0, policy_version 119103 (0.0030) [2024-06-18 10:59:16,996][12645] Fps is (10 sec: 45866.8, 60 sec: 42870.2, 300 sec: 42764.8). Total num frames: 1951416320. Throughput: 0: 42862.1. Samples: 1951480840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 10:59:16,996][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 10:59:19,826][12883] Updated weights for policy 0, policy_version 119113 (0.0037) [2024-06-18 10:59:21,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1951596544. Throughput: 0: 42713.0. Samples: 1951726100. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 10:59:21,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 10:59:24,206][12883] Updated weights for policy 0, policy_version 119123 (0.0024) [2024-06-18 10:59:26,994][12645] Fps is (10 sec: 42606.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 1951842304. Throughput: 0: 42902.2. Samples: 1951983120. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 10:59:26,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 10:59:27,471][12883] Updated weights for policy 0, policy_version 119133 (0.0041) [2024-06-18 10:59:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1952022528. Throughput: 0: 42895.7. Samples: 1952119380. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 10:59:31,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 10:59:32,031][12883] Updated weights for policy 0, policy_version 119143 (0.0037) [2024-06-18 10:59:35,060][12883] Updated weights for policy 0, policy_version 119153 (0.0027) [2024-06-18 10:59:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1952251904. Throughput: 0: 42738.3. Samples: 1952362460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 10:59:36,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 10:59:39,652][12883] Updated weights for policy 0, policy_version 119163 (0.0047) [2024-06-18 10:59:41,690][12862] Signal inference workers to stop experience collection... (28600 times) [2024-06-18 10:59:41,696][12862] Signal inference workers to resume experience collection... (28600 times) [2024-06-18 10:59:41,748][12883] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-06-18 10:59:41,748][12883] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-06-18 10:59:41,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1952481280. Throughput: 0: 42888.4. Samples: 1952627440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 10:59:41,994][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 10:59:42,926][12883] Updated weights for policy 0, policy_version 119173 (0.0038) [2024-06-18 10:59:46,996][12645] Fps is (10 sec: 40950.3, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 1952661504. Throughput: 0: 42684.1. Samples: 1952755940. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 10:59:46,997][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 10:59:47,194][12883] Updated weights for policy 0, policy_version 119183 (0.0036) [2024-06-18 10:59:50,423][12883] Updated weights for policy 0, policy_version 119193 (0.0029) [2024-06-18 10:59:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 1952907264. Throughput: 0: 42712.6. Samples: 1953007840. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 10:59:51,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 10:59:55,006][12883] Updated weights for policy 0, policy_version 119203 (0.0036) [2024-06-18 10:59:56,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 1953103872. Throughput: 0: 42992.3. Samples: 1953276520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 10:59:56,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 10:59:57,879][12883] Updated weights for policy 0, policy_version 119213 (0.0035) [2024-06-18 11:00:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1953316864. Throughput: 0: 42616.9. Samples: 1953398520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 11:00:01,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 11:00:02,666][12883] Updated weights for policy 0, policy_version 119223 (0.0041) [2024-06-18 11:00:05,397][12883] Updated weights for policy 0, policy_version 119233 (0.0038) [2024-06-18 11:00:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 1953562624. Throughput: 0: 42713.7. Samples: 1953648220. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 11:00:06,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 11:00:10,257][12883] Updated weights for policy 0, policy_version 119243 (0.0045) [2024-06-18 11:00:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1953759232. Throughput: 0: 43013.4. Samples: 1953918720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 11:00:11,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 11:00:13,350][12883] Updated weights for policy 0, policy_version 119253 (0.0027) [2024-06-18 11:00:16,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42326.5, 300 sec: 42709.4). Total num frames: 1953955840. Throughput: 0: 42658.4. Samples: 1954039020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-18 11:00:16,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 11:00:17,809][12883] Updated weights for policy 0, policy_version 119263 (0.0043) [2024-06-18 11:00:20,838][12883] Updated weights for policy 0, policy_version 119273 (0.0033) [2024-06-18 11:00:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43690.7, 300 sec: 42765.0). Total num frames: 1954217984. Throughput: 0: 43026.7. Samples: 1954298660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:00:21,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 11:00:25,574][12883] Updated weights for policy 0, policy_version 119283 (0.0033) [2024-06-18 11:00:26,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1954381824. Throughput: 0: 42911.6. Samples: 1954558460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:00:26,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 11:00:28,490][12883] Updated weights for policy 0, policy_version 119293 (0.0043) [2024-06-18 11:00:31,994][12645] Fps is (10 sec: 34405.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 1954562048. Throughput: 0: 42718.1. Samples: 1954678160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:00:31,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 11:00:33,203][12883] Updated weights for policy 0, policy_version 119303 (0.0032) [2024-06-18 11:00:36,283][12883] Updated weights for policy 0, policy_version 119313 (0.0027) [2024-06-18 11:00:36,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 1954856960. Throughput: 0: 42807.5. Samples: 1954934180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:00:36,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 11:00:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119315_1954856960.pth... [2024-06-18 11:00:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118687_1944567808.pth [2024-06-18 11:00:40,905][12883] Updated weights for policy 0, policy_version 119323 (0.0039) [2024-06-18 11:00:41,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1955020800. Throughput: 0: 42440.1. Samples: 1955186320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:00:41,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 11:00:44,287][12883] Updated weights for policy 0, policy_version 119333 (0.0032) [2024-06-18 11:00:46,994][12645] Fps is (10 sec: 36045.0, 60 sec: 42600.1, 300 sec: 42653.9). Total num frames: 1955217408. Throughput: 0: 42470.7. Samples: 1955309700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:00:47,000][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 11:00:48,395][12883] Updated weights for policy 0, policy_version 119343 (0.0041) [2024-06-18 11:00:50,885][12862] Signal inference workers to stop experience collection... (28650 times) [2024-06-18 11:00:50,885][12862] Signal inference workers to resume experience collection... 
(28650 times) [2024-06-18 11:00:50,930][12883] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-06-18 11:00:50,930][12883] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-06-18 11:00:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1955463168. Throughput: 0: 42790.3. Samples: 1955573780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:00:51,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 11:00:52,078][12883] Updated weights for policy 0, policy_version 119353 (0.0030) [2024-06-18 11:00:56,108][12883] Updated weights for policy 0, policy_version 119363 (0.0034) [2024-06-18 11:00:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 1955676160. Throughput: 0: 42366.1. Samples: 1955825200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:00:56,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 11:00:59,699][12883] Updated weights for policy 0, policy_version 119373 (0.0034) [2024-06-18 11:01:01,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1955872768. Throughput: 0: 42571.6. Samples: 1955954740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:01:01,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 11:01:03,933][12883] Updated weights for policy 0, policy_version 119383 (0.0039) [2024-06-18 11:01:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1956102144. Throughput: 0: 42544.4. Samples: 1956213160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:01:06,994][12645] Avg episode reward: [(0, '0.690')] [2024-06-18 11:01:07,518][12883] Updated weights for policy 0, policy_version 119393 (0.0028) [2024-06-18 11:01:11,475][12883] Updated weights for policy 0, policy_version 119403 (0.0038) [2024-06-18 11:01:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1956315136. Throughput: 0: 42454.7. Samples: 1956468920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:01:11,994][12645] Avg episode reward: [(0, '0.734')] [2024-06-18 11:01:15,314][12883] Updated weights for policy 0, policy_version 119413 (0.0028) [2024-06-18 11:01:16,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 1956511744. Throughput: 0: 42585.0. Samples: 1956594580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 11:01:16,997][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 11:01:19,218][12883] Updated weights for policy 0, policy_version 119423 (0.0039) [2024-06-18 11:01:21,996][12645] Fps is (10 sec: 40951.0, 60 sec: 41777.6, 300 sec: 42653.6). Total num frames: 1956724736. Throughput: 0: 42401.0. Samples: 1956842320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:01:21,996][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 11:01:23,047][12883] Updated weights for policy 0, policy_version 119433 (0.0023) [2024-06-18 11:01:26,975][12883] Updated weights for policy 0, policy_version 119443 (0.0036) [2024-06-18 11:01:26,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 1956954112. Throughput: 0: 42527.0. Samples: 1957100040. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:01:26,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 11:01:30,903][12883] Updated weights for policy 0, policy_version 119453 (0.0039) [2024-06-18 11:01:31,994][12645] Fps is (10 sec: 42608.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 1957150720. Throughput: 0: 42591.1. Samples: 1957226300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:01:31,994][12645] Avg episode reward: [(0, '0.691')] [2024-06-18 11:01:34,589][12883] Updated weights for policy 0, policy_version 119463 (0.0036) [2024-06-18 11:01:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1957363712. Throughput: 0: 42334.6. Samples: 1957478840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:01:36,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 11:01:38,518][12883] Updated weights for policy 0, policy_version 119473 (0.0039) [2024-06-18 11:01:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1957593088. Throughput: 0: 42465.4. Samples: 1957736140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:01:41,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 11:01:42,437][12883] Updated weights for policy 0, policy_version 119483 (0.0027) [2024-06-18 11:01:46,166][12883] Updated weights for policy 0, policy_version 119493 (0.0042) [2024-06-18 11:01:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1957789696. Throughput: 0: 42353.3. Samples: 1957860640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:01:46,994][12645] Avg episode reward: [(0, '0.206')] [2024-06-18 11:01:50,361][12883] Updated weights for policy 0, policy_version 119503 (0.0037) [2024-06-18 11:01:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1958002688. Throughput: 0: 42304.5. Samples: 1958116860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:01:51,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 11:01:53,782][12883] Updated weights for policy 0, policy_version 119513 (0.0029) [2024-06-18 11:01:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1958215680. Throughput: 0: 42443.1. Samples: 1958378860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:01:56,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 11:01:57,952][12883] Updated weights for policy 0, policy_version 119523 (0.0045) [2024-06-18 11:02:01,443][12883] Updated weights for policy 0, policy_version 119533 (0.0036) [2024-06-18 11:02:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1958428672. Throughput: 0: 42397.7. Samples: 1958502380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:02:01,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 11:02:05,471][12883] Updated weights for policy 0, policy_version 119543 (0.0034) [2024-06-18 11:02:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1958641664. Throughput: 0: 42536.3. Samples: 1958756360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:02:06,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 11:02:09,793][12883] Updated weights for policy 0, policy_version 119553 (0.0042) [2024-06-18 11:02:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 1958838272. Throughput: 0: 42531.7. Samples: 1959013960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:02:11,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 11:02:13,106][12883] Updated weights for policy 0, policy_version 119563 (0.0037) [2024-06-18 11:02:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1959067648. Throughput: 0: 42445.7. Samples: 1959136360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 11:02:16,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 11:02:17,349][12883] Updated weights for policy 0, policy_version 119573 (0.0041) [2024-06-18 11:02:20,168][12862] Signal inference workers to stop experience collection... (28700 times) [2024-06-18 11:02:20,169][12862] Signal inference workers to resume experience collection... (28700 times) [2024-06-18 11:02:20,191][12883] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-06-18 11:02:20,191][12883] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-06-18 11:02:20,783][12883] Updated weights for policy 0, policy_version 119583 (0.0029) [2024-06-18 11:02:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42599.9, 300 sec: 42598.8). Total num frames: 1959280640. Throughput: 0: 42657.3. Samples: 1959398420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:02:21,996][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 11:02:24,847][12883] Updated weights for policy 0, policy_version 119593 (0.0031) [2024-06-18 11:02:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1959477248. Throughput: 0: 42670.6. Samples: 1959656320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:02:26,994][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 11:02:28,510][12883] Updated weights for policy 0, policy_version 119603 (0.0032) [2024-06-18 11:02:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1959706624. Throughput: 0: 42657.3. Samples: 1959780220. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:02:31,997][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 11:02:32,789][12883] Updated weights for policy 0, policy_version 119613 (0.0030) [2024-06-18 11:02:36,199][12883] Updated weights for policy 0, policy_version 119623 (0.0027) [2024-06-18 11:02:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 1959936000. Throughput: 0: 42714.6. Samples: 1960039020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:02:36,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 11:02:37,030][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119625_1959936000.pth... [2024-06-18 11:02:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000118999_1949679616.pth [2024-06-18 11:02:40,446][12883] Updated weights for policy 0, policy_version 119633 (0.0036) [2024-06-18 11:02:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 1960116224. Throughput: 0: 42569.3. Samples: 1960294480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:02:41,995][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 11:02:44,082][12883] Updated weights for policy 0, policy_version 119643 (0.0032) [2024-06-18 11:02:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 1960361984. Throughput: 0: 42501.4. Samples: 1960414940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:02:46,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 11:02:48,172][12883] Updated weights for policy 0, policy_version 119653 (0.0040) [2024-06-18 11:02:51,926][12883] Updated weights for policy 0, policy_version 119663 (0.0025) [2024-06-18 11:02:51,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1960558592. Throughput: 0: 42766.8. Samples: 1960680860. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:02:51,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 11:02:55,697][12883] Updated weights for policy 0, policy_version 119673 (0.0027) [2024-06-18 11:02:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1960771584. Throughput: 0: 42712.0. Samples: 1960936000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:02:56,994][12645] Avg episode reward: [(0, '0.215')] [2024-06-18 11:02:59,518][12883] Updated weights for policy 0, policy_version 119683 (0.0049) [2024-06-18 11:03:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1961000960. Throughput: 0: 42732.0. Samples: 1961059300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:03:01,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 11:03:03,529][12883] Updated weights for policy 0, policy_version 119693 (0.0037) [2024-06-18 11:03:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1961181184. Throughput: 0: 42568.5. Samples: 1961314000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:03:06,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 11:03:07,228][12883] Updated weights for policy 0, policy_version 119703 (0.0041) [2024-06-18 11:03:11,105][12883] Updated weights for policy 0, policy_version 119713 (0.0033) [2024-06-18 11:03:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1961394176. Throughput: 0: 42595.1. Samples: 1961573100. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:03:11,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 11:03:14,858][12883] Updated weights for policy 0, policy_version 119723 (0.0042) [2024-06-18 11:03:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1961639936. Throughput: 0: 42627.1. Samples: 1961698440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:03:16,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 11:03:18,659][12883] Updated weights for policy 0, policy_version 119733 (0.0033) [2024-06-18 11:03:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1961820160. Throughput: 0: 42702.2. Samples: 1961960620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:03:21,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 11:03:22,339][12883] Updated weights for policy 0, policy_version 119743 (0.0025) [2024-06-18 11:03:26,504][12883] Updated weights for policy 0, policy_version 119753 (0.0038) [2024-06-18 11:03:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1962033152. Throughput: 0: 42546.2. Samples: 1962209060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:03:26,998][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 11:03:30,374][12883] Updated weights for policy 0, policy_version 119763 (0.0037) [2024-06-18 11:03:31,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 1962278912. Throughput: 0: 42773.3. Samples: 1962339740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:03:31,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 11:03:34,004][12883] Updated weights for policy 0, policy_version 119773 (0.0030) [2024-06-18 11:03:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 1962442752. Throughput: 0: 42727.4. Samples: 1962603600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:03:36,994][12645] Avg episode reward: [(0, '0.691')] [2024-06-18 11:03:37,813][12883] Updated weights for policy 0, policy_version 119783 (0.0034) [2024-06-18 11:03:39,830][12862] Signal inference workers to stop experience collection... 
(28750 times) [2024-06-18 11:03:39,830][12862] Signal inference workers to resume experience collection... (28750 times) [2024-06-18 11:03:39,873][12883] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-06-18 11:03:39,873][12883] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-06-18 11:03:41,767][12883] Updated weights for policy 0, policy_version 119793 (0.0040) [2024-06-18 11:03:41,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1962688512. Throughput: 0: 42563.4. Samples: 1962851360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:03:41,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 11:03:45,535][12883] Updated weights for policy 0, policy_version 119803 (0.0035) [2024-06-18 11:03:46,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1962917888. Throughput: 0: 42783.5. Samples: 1962984560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:03:46,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 11:03:49,284][12883] Updated weights for policy 0, policy_version 119813 (0.0035) [2024-06-18 11:03:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1963098112. Throughput: 0: 42889.8. Samples: 1963244040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:03:51,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 11:03:53,212][12883] Updated weights for policy 0, policy_version 119823 (0.0031) [2024-06-18 11:03:56,854][12883] Updated weights for policy 0, policy_version 119833 (0.0026) [2024-06-18 11:03:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1963343872. Throughput: 0: 42704.4. Samples: 1963494800. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:03:56,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 11:04:00,859][12883] Updated weights for policy 0, policy_version 119843 (0.0030) [2024-06-18 11:04:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1963556864. Throughput: 0: 42863.1. Samples: 1963627280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:04:01,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 11:04:04,486][12883] Updated weights for policy 0, policy_version 119853 (0.0030) [2024-06-18 11:04:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 1963737088. Throughput: 0: 42753.7. Samples: 1963884540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:04:06,994][12645] Avg episode reward: [(0, '0.669')] [2024-06-18 11:04:08,517][12883] Updated weights for policy 0, policy_version 119863 (0.0033) [2024-06-18 11:04:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 1963982848. Throughput: 0: 42844.5. Samples: 1964137060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:04:11,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 11:04:12,082][12883] Updated weights for policy 0, policy_version 119873 (0.0022) [2024-06-18 11:04:16,011][12883] Updated weights for policy 0, policy_version 119883 (0.0038) [2024-06-18 11:04:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 1964195840. Throughput: 0: 42854.1. Samples: 1964268180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:04:16,994][12645] Avg episode reward: [(0, '0.564')] [2024-06-18 11:04:20,046][12883] Updated weights for policy 0, policy_version 119893 (0.0034) [2024-06-18 11:04:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1964392448. Throughput: 0: 42633.9. Samples: 1964522120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:04:21,994][12645] Avg episode reward: [(0, '0.319')] [2024-06-18 11:04:23,894][12883] Updated weights for policy 0, policy_version 119903 (0.0030) [2024-06-18 11:04:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1964621824. Throughput: 0: 42632.1. Samples: 1964769800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:04:26,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 11:04:27,811][12883] Updated weights for policy 0, policy_version 119913 (0.0053) [2024-06-18 11:04:31,585][12883] Updated weights for policy 0, policy_version 119923 (0.0043) [2024-06-18 11:04:31,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1964851200. Throughput: 0: 42706.3. Samples: 1964906340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:04:31,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 11:04:35,443][12883] Updated weights for policy 0, policy_version 119933 (0.0041) [2024-06-18 11:04:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 1965031424. Throughput: 0: 42521.7. Samples: 1965157520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:04:36,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 11:04:37,108][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119937_1965047808.pth... [2024-06-18 11:04:37,167][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119315_1954856960.pth [2024-06-18 11:04:39,207][12883] Updated weights for policy 0, policy_version 119943 (0.0036) [2024-06-18 11:04:41,995][12645] Fps is (10 sec: 40952.7, 60 sec: 42870.2, 300 sec: 42709.5). Total num frames: 1965260800. Throughput: 0: 42510.7. Samples: 1965407860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:04:41,996][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 11:04:43,582][12883] Updated weights for policy 0, policy_version 119953 (0.0044) [2024-06-18 11:04:46,812][12883] Updated weights for policy 0, policy_version 119963 (0.0029) [2024-06-18 11:04:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1965490176. Throughput: 0: 42590.7. Samples: 1965543860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:04:46,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 11:04:51,066][12883] Updated weights for policy 0, policy_version 119973 (0.0037) [2024-06-18 11:04:51,994][12645] Fps is (10 sec: 40967.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1965670400. Throughput: 0: 42705.4. Samples: 1965806280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:04:51,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 11:04:52,200][12862] Signal inference workers to stop experience collection... (28800 times) [2024-06-18 11:04:52,200][12862] Signal inference workers to resume experience collection... (28800 times) [2024-06-18 11:04:52,237][12883] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-06-18 11:04:52,237][12883] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-06-18 11:04:54,279][12883] Updated weights for policy 0, policy_version 119983 (0.0026) [2024-06-18 11:04:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1965916160. Throughput: 0: 42734.2. Samples: 1966060100. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:04:56,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 11:04:58,569][12883] Updated weights for policy 0, policy_version 119993 (0.0037) [2024-06-18 11:05:01,762][12883] Updated weights for policy 0, policy_version 120003 (0.0034) [2024-06-18 11:05:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1966129152. Throughput: 0: 42651.6. Samples: 1966187500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:05:01,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 11:05:06,068][12883] Updated weights for policy 0, policy_version 120013 (0.0029) [2024-06-18 11:05:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1966342144. Throughput: 0: 42780.3. Samples: 1966447240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:05:06,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 11:05:09,541][12883] Updated weights for policy 0, policy_version 120023 (0.0044) [2024-06-18 11:05:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 1966538752. Throughput: 0: 42896.3. Samples: 1966700140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:05:11,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 11:05:13,678][12883] Updated weights for policy 0, policy_version 120033 (0.0024) [2024-06-18 11:05:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 1966751744. Throughput: 0: 42707.0. Samples: 1966828160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:05:16,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 11:05:17,293][12883] Updated weights for policy 0, policy_version 120043 (0.0029) [2024-06-18 11:05:21,399][12883] Updated weights for policy 0, policy_version 120053 (0.0038) [2024-06-18 11:05:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 1966981120. Throughput: 0: 42891.1. Samples: 1967087620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 24.0) [2024-06-18 11:05:21,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 11:05:24,791][12883] Updated weights for policy 0, policy_version 120063 (0.0047) [2024-06-18 11:05:26,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1967177728. Throughput: 0: 43021.3. Samples: 1967343740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:05:26,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 11:05:29,139][12883] Updated weights for policy 0, policy_version 120073 (0.0028) [2024-06-18 11:05:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 1967390720. Throughput: 0: 42801.8. Samples: 1967469940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:05:31,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 11:05:32,404][12883] Updated weights for policy 0, policy_version 120083 (0.0032) [2024-06-18 11:05:36,655][12883] Updated weights for policy 0, policy_version 120093 (0.0026) [2024-06-18 11:05:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1967620096. Throughput: 0: 42798.1. Samples: 1967732200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:05:36,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 11:05:39,838][12883] Updated weights for policy 0, policy_version 120103 (0.0023) [2024-06-18 11:05:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42599.7, 300 sec: 42709.5). Total num frames: 1967816704. Throughput: 0: 42911.1. Samples: 1967991100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:05:41,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 11:05:44,342][12883] Updated weights for policy 0, policy_version 120113 (0.0023) [2024-06-18 11:05:46,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1968029696. Throughput: 0: 42891.1. Samples: 1968117600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:05:46,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 11:05:47,596][12883] Updated weights for policy 0, policy_version 120123 (0.0032) [2024-06-18 11:05:51,847][12883] Updated weights for policy 0, policy_version 120133 (0.0031) [2024-06-18 11:05:52,000][12645] Fps is (10 sec: 44209.2, 60 sec: 43140.0, 300 sec: 42653.0). Total num frames: 1968259072. Throughput: 0: 42791.4. Samples: 1968373120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:05:52,001][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 11:05:55,167][12883] Updated weights for policy 0, policy_version 120143 (0.0030) [2024-06-18 11:05:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 1968455680. Throughput: 0: 42836.1. Samples: 1968627760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:05:56,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 11:05:59,623][12883] Updated weights for policy 0, policy_version 120153 (0.0032) [2024-06-18 11:06:01,983][12862] Signal inference workers to stop experience collection... 
(28850 times) [2024-06-18 11:06:01,994][12645] Fps is (10 sec: 40986.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 1968668672. Throughput: 0: 42847.3. Samples: 1968756280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:06:01,994][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 11:06:02,025][12883] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-06-18 11:06:02,106][12862] Signal inference workers to resume experience collection... (28850 times) [2024-06-18 11:06:02,106][12883] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-06-18 11:06:03,342][12883] Updated weights for policy 0, policy_version 120163 (0.0029) [2024-06-18 11:06:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1968898048. Throughput: 0: 42928.1. Samples: 1969019380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:06:06,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 11:06:07,087][12883] Updated weights for policy 0, policy_version 120173 (0.0038) [2024-06-18 11:06:10,943][12883] Updated weights for policy 0, policy_version 120183 (0.0032) [2024-06-18 11:06:12,000][12645] Fps is (10 sec: 44208.7, 60 sec: 42867.0, 300 sec: 42708.9). Total num frames: 1969111040. Throughput: 0: 42543.0. Samples: 1969258440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:06:12,001][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 11:06:15,169][12883] Updated weights for policy 0, policy_version 120193 (0.0037) [2024-06-18 11:06:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.8). Total num frames: 1969324032. Throughput: 0: 42746.3. Samples: 1969393520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:06:16,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 11:06:18,502][12883] Updated weights for policy 0, policy_version 120203 (0.0028) [2024-06-18 11:06:21,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1969537024. Throughput: 0: 42792.5. Samples: 1969657860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:06:21,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 11:06:22,827][12883] Updated weights for policy 0, policy_version 120213 (0.0027) [2024-06-18 11:06:26,004][12883] Updated weights for policy 0, policy_version 120223 (0.0030) [2024-06-18 11:06:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1969766400. Throughput: 0: 42488.4. Samples: 1969903080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 11:06:26,994][12645] Avg episode reward: [(0, '0.210')] [2024-06-18 11:06:30,275][12883] Updated weights for policy 0, policy_version 120233 (0.0037) [2024-06-18 11:06:31,996][12645] Fps is (10 sec: 44226.9, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 1969979392. Throughput: 0: 42734.6. Samples: 1970040760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:06:31,997][12645] Avg episode reward: [(0, '0.687')] [2024-06-18 11:06:33,698][12883] Updated weights for policy 0, policy_version 120243 (0.0026) [2024-06-18 11:06:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1970176000. Throughput: 0: 42921.6. Samples: 1970304320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:06:36,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 11:06:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120250_1970176000.pth... 
[2024-06-18 11:06:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119625_1959936000.pth [2024-06-18 11:06:37,777][12883] Updated weights for policy 0, policy_version 120253 (0.0035) [2024-06-18 11:06:41,275][12883] Updated weights for policy 0, policy_version 120263 (0.0038) [2024-06-18 11:06:41,994][12645] Fps is (10 sec: 44247.2, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 1970421760. Throughput: 0: 42740.1. Samples: 1970551060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:06:41,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 11:06:45,495][12883] Updated weights for policy 0, policy_version 120273 (0.0038) [2024-06-18 11:06:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 1970618368. Throughput: 0: 42846.1. Samples: 1970684360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:06:46,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 11:06:49,080][12883] Updated weights for policy 0, policy_version 120283 (0.0046) [2024-06-18 11:06:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 1970831360. Throughput: 0: 42763.1. Samples: 1970943720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:06:51,994][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 11:06:53,297][12883] Updated weights for policy 0, policy_version 120293 (0.0035) [2024-06-18 11:06:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1971027968. Throughput: 0: 43019.3. Samples: 1971194040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:06:56,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 11:06:57,073][12883] Updated weights for policy 0, policy_version 120303 (0.0036) [2024-06-18 11:07:00,821][12883] Updated weights for policy 0, policy_version 120313 (0.0041) [2024-06-18 11:07:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1971273728. Throughput: 0: 42865.8. Samples: 1971322480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:07:01,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 11:07:04,617][12883] Updated weights for policy 0, policy_version 120323 (0.0028) [2024-06-18 11:07:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1971453952. Throughput: 0: 42801.8. Samples: 1971583940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:07:06,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 11:07:08,366][12883] Updated weights for policy 0, policy_version 120333 (0.0037) [2024-06-18 11:07:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 1971666944. Throughput: 0: 43021.5. Samples: 1971839040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:07:11,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 11:07:12,228][12883] Updated weights for policy 0, policy_version 120343 (0.0032) [2024-06-18 11:07:15,953][12883] Updated weights for policy 0, policy_version 120353 (0.0040) [2024-06-18 11:07:16,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 1971912704. Throughput: 0: 42836.3. Samples: 1971968300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:07:16,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 11:07:19,940][12883] Updated weights for policy 0, policy_version 120363 (0.0045) [2024-06-18 11:07:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1972092928. Throughput: 0: 42613.7. Samples: 1972221940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:07:21,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 11:07:23,334][12862] Signal inference workers to stop experience collection... (28900 times) [2024-06-18 11:07:23,334][12862] Signal inference workers to resume experience collection... (28900 times) [2024-06-18 11:07:23,351][12883] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-06-18 11:07:23,351][12883] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-06-18 11:07:23,477][12883] Updated weights for policy 0, policy_version 120373 (0.0025) [2024-06-18 11:07:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1972322304. Throughput: 0: 42842.1. Samples: 1972478960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 11:07:26,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 11:07:27,521][12883] Updated weights for policy 0, policy_version 120383 (0.0035) [2024-06-18 11:07:30,930][12883] Updated weights for policy 0, policy_version 120393 (0.0031) [2024-06-18 11:07:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 1972535296. Throughput: 0: 42985.0. Samples: 1972618680. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:07:31,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 11:07:35,518][12883] Updated weights for policy 0, policy_version 120403 (0.0033) [2024-06-18 11:07:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1972731904. Throughput: 0: 42870.3. 
Samples: 1972872880. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:07:36,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 11:07:38,466][12883] Updated weights for policy 0, policy_version 120413 (0.0036) [2024-06-18 11:07:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1972977664. Throughput: 0: 42937.8. Samples: 1973126240. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:07:41,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 11:07:43,212][12883] Updated weights for policy 0, policy_version 120423 (0.0040) [2024-06-18 11:07:46,298][12883] Updated weights for policy 0, policy_version 120433 (0.0029) [2024-06-18 11:07:46,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 1973207040. Throughput: 0: 43059.1. Samples: 1973260140. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:07:46,994][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 11:07:50,774][12883] Updated weights for policy 0, policy_version 120443 (0.0045) [2024-06-18 11:07:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1973370880. Throughput: 0: 42939.1. Samples: 1973516200. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:07:51,994][12645] Avg episode reward: [(0, '0.637')] [2024-06-18 11:07:53,783][12883] Updated weights for policy 0, policy_version 120453 (0.0031) [2024-06-18 11:07:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1973633024. Throughput: 0: 42799.9. Samples: 1973765040. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:07:56,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 11:07:58,475][12883] Updated weights for policy 0, policy_version 120463 (0.0042) [2024-06-18 11:08:01,492][12883] Updated weights for policy 0, policy_version 120473 (0.0035) [2024-06-18 11:08:01,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 1973829632. Throughput: 0: 43045.0. Samples: 1973905320. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:08:01,994][12645] Avg episode reward: [(0, '0.163')] [2024-06-18 11:08:06,214][12883] Updated weights for policy 0, policy_version 120483 (0.0037) [2024-06-18 11:08:06,996][12645] Fps is (10 sec: 36036.7, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 1973993472. Throughput: 0: 43063.6. Samples: 1974159900. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:08:06,997][12645] Avg episode reward: [(0, '0.157')] [2024-06-18 11:08:09,213][12883] Updated weights for policy 0, policy_version 120493 (0.0031) [2024-06-18 11:08:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 1974272000. Throughput: 0: 42886.8. Samples: 1974408860. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:08:11,994][12645] Avg episode reward: [(0, '0.244')] [2024-06-18 11:08:13,936][12883] Updated weights for policy 0, policy_version 120503 (0.0032) [2024-06-18 11:08:16,994][12645] Fps is (10 sec: 47524.6, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 1974468608. Throughput: 0: 42893.3. Samples: 1974548880. 
Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:08:16,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 11:08:17,041][12883] Updated weights for policy 0, policy_version 120513 (0.0051) [2024-06-18 11:08:21,763][12883] Updated weights for policy 0, policy_version 120523 (0.0030) [2024-06-18 11:08:21,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1974648832. Throughput: 0: 42682.6. Samples: 1974793600. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:08:21,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 11:08:24,758][12883] Updated weights for policy 0, policy_version 120533 (0.0029) [2024-06-18 11:08:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 1974910976. Throughput: 0: 42560.4. Samples: 1975041460. Policy #0 lag: (min: 1.0, avg: 8.6, max: 20.0) [2024-06-18 11:08:26,994][12645] Avg episode reward: [(0, '0.639')] [2024-06-18 11:08:29,420][12883] Updated weights for policy 0, policy_version 120543 (0.0040) [2024-06-18 11:08:31,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 1975107584. Throughput: 0: 42540.1. Samples: 1975174440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:08:31,994][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 11:08:32,367][12883] Updated weights for policy 0, policy_version 120553 (0.0033) [2024-06-18 11:08:36,946][12883] Updated weights for policy 0, policy_version 120563 (0.0028) [2024-06-18 11:08:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1975304192. Throughput: 0: 42350.3. Samples: 1975421960. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:08:36,994][12645] Avg episode reward: [(0, '0.773')] [2024-06-18 11:08:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120563_1975304192.pth... 
[2024-06-18 11:08:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000119937_1965047808.pth [2024-06-18 11:08:39,770][12862] Signal inference workers to stop experience collection... (28950 times) [2024-06-18 11:08:39,806][12883] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-06-18 11:08:39,833][12862] Signal inference workers to resume experience collection... (28950 times) [2024-06-18 11:08:39,834][12883] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-06-18 11:08:39,974][12883] Updated weights for policy 0, policy_version 120573 (0.0038) [2024-06-18 11:08:41,995][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 1975549952. Throughput: 0: 42499.1. Samples: 1975677500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:08:41,996][12645] Avg episode reward: [(0, '0.673')] [2024-06-18 11:08:45,123][12883] Updated weights for policy 0, policy_version 120583 (0.0036) [2024-06-18 11:08:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 1975713792. Throughput: 0: 42371.1. Samples: 1975812020. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:08:46,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 11:08:47,744][12883] Updated weights for policy 0, policy_version 120593 (0.0026) [2024-06-18 11:08:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1975926784. Throughput: 0: 42109.6. Samples: 1976054740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:08:51,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 11:08:52,823][12883] Updated weights for policy 0, policy_version 120603 (0.0029) [2024-06-18 11:08:55,698][12883] Updated weights for policy 0, policy_version 120613 (0.0046) [2024-06-18 11:08:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1976172544. 
Throughput: 0: 42222.6. Samples: 1976308880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:08:56,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 11:09:00,455][12883] Updated weights for policy 0, policy_version 120623 (0.0033) [2024-06-18 11:09:01,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 1976336384. Throughput: 0: 42232.5. Samples: 1976449340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:09:01,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 11:09:03,372][12883] Updated weights for policy 0, policy_version 120633 (0.0031) [2024-06-18 11:09:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 1976582144. Throughput: 0: 42315.1. Samples: 1976697780. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:09:06,994][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 11:09:08,024][12883] Updated weights for policy 0, policy_version 120643 (0.0032) [2024-06-18 11:09:10,924][12883] Updated weights for policy 0, policy_version 120653 (0.0035) [2024-06-18 11:09:11,994][12645] Fps is (10 sec: 49151.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 1976827904. Throughput: 0: 42286.7. Samples: 1976944360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:09:11,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 11:09:15,875][12883] Updated weights for policy 0, policy_version 120663 (0.0031) [2024-06-18 11:09:16,994][12645] Fps is (10 sec: 39318.8, 60 sec: 41778.6, 300 sec: 42653.8). Total num frames: 1976975360. Throughput: 0: 42322.3. Samples: 1977078980. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:09:16,995][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 11:09:18,639][12883] Updated weights for policy 0, policy_version 120673 (0.0028) [2024-06-18 11:09:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1977237504. 
Throughput: 0: 42414.2. Samples: 1977330600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:09:21,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 11:09:23,466][12883] Updated weights for policy 0, policy_version 120683 (0.0033) [2024-06-18 11:09:26,557][12883] Updated weights for policy 0, policy_version 120693 (0.0028) [2024-06-18 11:09:26,994][12645] Fps is (10 sec: 47517.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1977450496. Throughput: 0: 42490.7. Samples: 1977589580. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:09:26,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 11:09:31,676][12883] Updated weights for policy 0, policy_version 120703 (0.0047) [2024-06-18 11:09:31,996][12645] Fps is (10 sec: 37675.2, 60 sec: 41777.6, 300 sec: 42653.6). Total num frames: 1977614336. Throughput: 0: 42332.6. Samples: 1977717080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 11:09:31,996][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 11:09:34,249][12883] Updated weights for policy 0, policy_version 120713 (0.0038) [2024-06-18 11:09:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 1977876480. Throughput: 0: 42437.4. Samples: 1977964420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:09:36,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 11:09:39,348][12883] Updated weights for policy 0, policy_version 120723 (0.0036) [2024-06-18 11:09:40,668][12862] Signal inference workers to stop experience collection... (29000 times) [2024-06-18 11:09:40,668][12862] Signal inference workers to resume experience collection... 
(29000 times) [2024-06-18 11:09:40,695][12883] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-06-18 11:09:40,695][12883] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-06-18 11:09:41,936][12883] Updated weights for policy 0, policy_version 120733 (0.0029) [2024-06-18 11:09:41,994][12645] Fps is (10 sec: 47524.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1978089472. Throughput: 0: 42633.4. Samples: 1978227380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:09:41,994][12645] Avg episode reward: [(0, '0.713')] [2024-06-18 11:09:46,994][12645] Fps is (10 sec: 36045.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1978236928. Throughput: 0: 42233.3. Samples: 1978349840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:09:46,994][12645] Avg episode reward: [(0, '0.654')] [2024-06-18 11:09:47,018][12883] Updated weights for policy 0, policy_version 120743 (0.0027) [2024-06-18 11:09:49,775][12883] Updated weights for policy 0, policy_version 120753 (0.0027) [2024-06-18 11:09:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 1978499072. Throughput: 0: 42161.0. Samples: 1978595020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:09:51,994][12645] Avg episode reward: [(0, '0.697')] [2024-06-18 11:09:55,079][12883] Updated weights for policy 0, policy_version 120763 (0.0038) [2024-06-18 11:09:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 1978695680. Throughput: 0: 42656.5. Samples: 1978863900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:09:56,994][12645] Avg episode reward: [(0, '0.699')] [2024-06-18 11:09:57,431][12883] Updated weights for policy 0, policy_version 120773 (0.0029) [2024-06-18 11:10:01,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1978875904. Throughput: 0: 42270.9. Samples: 1978981140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:10:01,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 11:10:02,719][12883] Updated weights for policy 0, policy_version 120783 (0.0035) [2024-06-18 11:10:04,994][12883] Updated weights for policy 0, policy_version 120793 (0.0035) [2024-06-18 11:10:06,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 1979154432. Throughput: 0: 42347.5. Samples: 1979236240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:10:06,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 11:10:10,411][12883] Updated weights for policy 0, policy_version 120803 (0.0034) [2024-06-18 11:10:11,994][12645] Fps is (10 sec: 45875.1, 60 sec: 41779.1, 300 sec: 42654.0). Total num frames: 1979334656. Throughput: 0: 42529.3. Samples: 1979503400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:10:11,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 11:10:12,574][12883] Updated weights for policy 0, policy_version 120813 (0.0030) [2024-06-18 11:10:16,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.9, 300 sec: 42542.9). Total num frames: 1979531264. Throughput: 0: 42481.6. Samples: 1979628660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:10:16,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 11:10:17,930][12883] Updated weights for policy 0, policy_version 120823 (0.0034) [2024-06-18 11:10:20,128][12883] Updated weights for policy 0, policy_version 120833 (0.0026) [2024-06-18 11:10:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1979777024. Throughput: 0: 42595.1. Samples: 1979881200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:10:21,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 11:10:25,414][12883] Updated weights for policy 0, policy_version 120843 (0.0043) [2024-06-18 11:10:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 1979973632. Throughput: 0: 42733.2. Samples: 1980150380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:10:26,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 11:10:27,748][12883] Updated weights for policy 0, policy_version 120853 (0.0033) [2024-06-18 11:10:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 1980170240. Throughput: 0: 42631.1. Samples: 1980268240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 24.0) [2024-06-18 11:10:31,994][12645] Avg episode reward: [(0, '0.646')] [2024-06-18 11:10:32,943][12883] Updated weights for policy 0, policy_version 120863 (0.0027) [2024-06-18 11:10:35,349][12883] Updated weights for policy 0, policy_version 120873 (0.0030) [2024-06-18 11:10:36,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 1980432384. Throughput: 0: 42960.0. Samples: 1980528220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:10:36,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 11:10:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120876_1980432384.pth... [2024-06-18 11:10:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120250_1970176000.pth [2024-06-18 11:10:40,559][12883] Updated weights for policy 0, policy_version 120883 (0.0046) [2024-06-18 11:10:41,667][12862] Signal inference workers to stop experience collection... (29050 times) [2024-06-18 11:10:41,667][12862] Signal inference workers to resume experience collection... 
(29050 times) [2024-06-18 11:10:41,701][12883] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-06-18 11:10:41,701][12883] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-06-18 11:10:41,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 1980628992. Throughput: 0: 42862.0. Samples: 1980792700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:10:41,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 11:10:43,051][12883] Updated weights for policy 0, policy_version 120893 (0.0026) [2024-06-18 11:10:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 43144.5, 300 sec: 42599.3). Total num frames: 1980825600. Throughput: 0: 43005.8. Samples: 1980916400. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:10:46,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 11:10:48,123][12883] Updated weights for policy 0, policy_version 120903 (0.0033) [2024-06-18 11:10:50,689][12883] Updated weights for policy 0, policy_version 120913 (0.0035) [2024-06-18 11:10:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 1981071360. Throughput: 0: 42919.5. Samples: 1981167620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:10:51,994][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 11:10:55,713][12883] Updated weights for policy 0, policy_version 120923 (0.0031) [2024-06-18 11:10:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1981267968. Throughput: 0: 42929.0. Samples: 1981435200. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:10:56,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 11:10:58,091][12883] Updated weights for policy 0, policy_version 120933 (0.0034) [2024-06-18 11:11:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 1981480960. Throughput: 0: 42825.3. Samples: 1981555800. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:11:01,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 11:11:03,314][12883] Updated weights for policy 0, policy_version 120943 (0.0038) [2024-06-18 11:11:06,314][12883] Updated weights for policy 0, policy_version 120953 (0.0039) [2024-06-18 11:11:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42765.9). Total num frames: 1981726720. Throughput: 0: 42846.3. Samples: 1981809280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:11:06,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 11:11:11,000][12883] Updated weights for policy 0, policy_version 120963 (0.0044) [2024-06-18 11:11:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1981906944. Throughput: 0: 42742.6. Samples: 1982073800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:11:11,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 11:11:13,980][12883] Updated weights for policy 0, policy_version 120973 (0.0052) [2024-06-18 11:11:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 1982119936. Throughput: 0: 42768.4. Samples: 1982192820. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:11:16,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 11:11:18,669][12883] Updated weights for policy 0, policy_version 120983 (0.0041) [2024-06-18 11:11:21,717][12883] Updated weights for policy 0, policy_version 120993 (0.0027) [2024-06-18 11:11:21,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 1982365696. Throughput: 0: 42681.3. Samples: 1982448880. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:11:21,994][12645] Avg episode reward: [(0, '0.212')] [2024-06-18 11:11:26,161][12883] Updated weights for policy 0, policy_version 121003 (0.0035) [2024-06-18 11:11:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 1982562304. Throughput: 0: 42707.7. Samples: 1982714540. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:11:26,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 11:11:29,489][12883] Updated weights for policy 0, policy_version 121013 (0.0033) [2024-06-18 11:11:31,994][12645] Fps is (10 sec: 39320.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 1982758912. Throughput: 0: 42611.9. Samples: 1982833940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 11:11:31,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 11:11:33,661][12883] Updated weights for policy 0, policy_version 121023 (0.0027) [2024-06-18 11:11:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1982988288. Throughput: 0: 42882.8. Samples: 1983097340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:11:36,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 11:11:37,011][12883] Updated weights for policy 0, policy_version 121033 (0.0033) [2024-06-18 11:11:41,531][12883] Updated weights for policy 0, policy_version 121043 (0.0038) [2024-06-18 11:11:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 1983168512. Throughput: 0: 42560.8. Samples: 1983350440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:11:41,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 11:11:44,933][12883] Updated weights for policy 0, policy_version 121053 (0.0039) [2024-06-18 11:11:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1983397888. Throughput: 0: 42607.6. Samples: 1983473140. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:11:46,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 11:11:49,138][12883] Updated weights for policy 0, policy_version 121063 (0.0039) [2024-06-18 11:11:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1983627264. Throughput: 0: 42734.6. Samples: 1983732340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:11:51,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 11:11:52,591][12883] Updated weights for policy 0, policy_version 121073 (0.0030) [2024-06-18 11:11:56,712][12883] Updated weights for policy 0, policy_version 121083 (0.0037) [2024-06-18 11:11:56,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 1983823872. Throughput: 0: 42508.7. Samples: 1983986700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:11:56,994][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 11:12:00,497][12883] Updated weights for policy 0, policy_version 121093 (0.0046) [2024-06-18 11:12:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1984036864. Throughput: 0: 42657.3. Samples: 1984112400. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:12:01,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 11:12:03,190][12862] Signal inference workers to stop experience collection... (29100 times) [2024-06-18 11:12:03,222][12883] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-06-18 11:12:03,247][12862] Signal inference workers to resume experience collection... (29100 times) [2024-06-18 11:12:03,248][12883] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-06-18 11:12:04,638][12883] Updated weights for policy 0, policy_version 121103 (0.0033) [2024-06-18 11:12:06,994][12645] Fps is (10 sec: 44238.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 1984266240. 
Throughput: 0: 42631.5. Samples: 1984367300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:12:06,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 11:12:08,307][12883] Updated weights for policy 0, policy_version 121113 (0.0027) [2024-06-18 11:12:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1984446464. Throughput: 0: 42385.3. Samples: 1984621880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:12:11,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 11:12:12,476][12883] Updated weights for policy 0, policy_version 121123 (0.0034) [2024-06-18 11:12:15,826][12883] Updated weights for policy 0, policy_version 121133 (0.0034) [2024-06-18 11:12:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1984675840. Throughput: 0: 42457.0. Samples: 1984744500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:12:16,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 11:12:20,032][12883] Updated weights for policy 0, policy_version 121143 (0.0053) [2024-06-18 11:12:21,995][12645] Fps is (10 sec: 44229.7, 60 sec: 42051.0, 300 sec: 42598.2). Total num frames: 1984888832. Throughput: 0: 42420.2. Samples: 1985006320. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:12:21,996][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 11:12:23,391][12883] Updated weights for policy 0, policy_version 121153 (0.0037) [2024-06-18 11:12:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1985101824. Throughput: 0: 42413.8. Samples: 1985259060. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:12:26,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 11:12:27,457][12883] Updated weights for policy 0, policy_version 121163 (0.0032) [2024-06-18 11:12:31,089][12883] Updated weights for policy 0, policy_version 121173 (0.0037) [2024-06-18 11:12:31,994][12645] Fps is (10 sec: 42606.2, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 1985314816. Throughput: 0: 42507.7. Samples: 1985385980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:12:31,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 11:12:34,823][12883] Updated weights for policy 0, policy_version 121183 (0.0029) [2024-06-18 11:12:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 1985511424. Throughput: 0: 42433.8. Samples: 1985641860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-18 11:12:36,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 11:12:37,040][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121187_1985527808.pth... [2024-06-18 11:12:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120563_1975304192.pth [2024-06-18 11:12:38,761][12883] Updated weights for policy 0, policy_version 121193 (0.0042) [2024-06-18 11:12:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 1985740800. Throughput: 0: 42459.9. Samples: 1985897380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:12:41,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 11:12:42,898][12883] Updated weights for policy 0, policy_version 121203 (0.0033) [2024-06-18 11:12:46,449][12883] Updated weights for policy 0, policy_version 121213 (0.0025) [2024-06-18 11:12:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1985953792. Throughput: 0: 42613.8. Samples: 1986030020. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:12:46,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 11:12:50,535][12883] Updated weights for policy 0, policy_version 121223 (0.0023) [2024-06-18 11:12:52,000][12645] Fps is (10 sec: 42571.4, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 1986166784. Throughput: 0: 42607.4. Samples: 1986284900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:12:52,001][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 11:12:54,438][12883] Updated weights for policy 0, policy_version 121233 (0.0024) [2024-06-18 11:12:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 1986412544. Throughput: 0: 42673.8. Samples: 1986542200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:12:56,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 11:12:58,223][12883] Updated weights for policy 0, policy_version 121243 (0.0034) [2024-06-18 11:13:01,994][12645] Fps is (10 sec: 42625.6, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 1986592768. Throughput: 0: 42752.1. Samples: 1986668340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:13:01,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 11:13:02,033][12883] Updated weights for policy 0, policy_version 121253 (0.0039) [2024-06-18 11:13:05,793][12883] Updated weights for policy 0, policy_version 121263 (0.0036) [2024-06-18 11:13:06,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 1986805760. Throughput: 0: 42635.9. Samples: 1986924860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:13:06,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 11:13:09,703][12883] Updated weights for policy 0, policy_version 121273 (0.0029) [2024-06-18 11:13:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 1987018752. Throughput: 0: 42876.8. Samples: 1987188520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:13:11,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 11:13:13,291][12883] Updated weights for policy 0, policy_version 121283 (0.0037) [2024-06-18 11:13:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 1987231744. Throughput: 0: 42800.9. Samples: 1987312120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:13:16,996][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 11:13:17,338][12883] Updated weights for policy 0, policy_version 121293 (0.0040) [2024-06-18 11:13:20,849][12883] Updated weights for policy 0, policy_version 121303 (0.0032) [2024-06-18 11:13:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42872.7, 300 sec: 42542.9). Total num frames: 1987461120. Throughput: 0: 42752.4. Samples: 1987565720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:13:21,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 11:13:25,036][12883] Updated weights for policy 0, policy_version 121313 (0.0033) [2024-06-18 11:13:26,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1987674112. Throughput: 0: 42883.5. Samples: 1987827140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:13:26,994][12645] Avg episode reward: [(0, '0.691')] [2024-06-18 11:13:28,439][12883] Updated weights for policy 0, policy_version 121323 (0.0032) [2024-06-18 11:13:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 1987870720. Throughput: 0: 42605.3. Samples: 1987947260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:13:31,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 11:13:32,867][12883] Updated weights for policy 0, policy_version 121333 (0.0033) [2024-06-18 11:13:35,893][12883] Updated weights for policy 0, policy_version 121343 (0.0036) [2024-06-18 11:13:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 1988116480. Throughput: 0: 42623.6. Samples: 1988202700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 11:13:36,994][12645] Avg episode reward: [(0, '0.719')] [2024-06-18 11:13:40,515][12883] Updated weights for policy 0, policy_version 121353 (0.0032) [2024-06-18 11:13:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 1988296704. Throughput: 0: 42773.5. Samples: 1988467000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:13:41,994][12645] Avg episode reward: [(0, '0.719')] [2024-06-18 11:13:42,088][12862] Signal inference workers to stop experience collection... (29150 times) [2024-06-18 11:13:42,088][12862] Signal inference workers to resume experience collection... (29150 times) [2024-06-18 11:13:42,118][12883] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-06-18 11:13:42,118][12883] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-06-18 11:13:43,623][12883] Updated weights for policy 0, policy_version 121363 (0.0028) [2024-06-18 11:13:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 1988493312. Throughput: 0: 42664.8. Samples: 1988588260. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:13:46,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 11:13:48,394][12883] Updated weights for policy 0, policy_version 121373 (0.0037) [2024-06-18 11:13:51,086][12883] Updated weights for policy 0, policy_version 121383 (0.0032) [2024-06-18 11:13:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43149.0, 300 sec: 42653.9). Total num frames: 1988755456. Throughput: 0: 42735.5. Samples: 1988847960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:13:51,996][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 11:13:56,068][12883] Updated weights for policy 0, policy_version 121393 (0.0034) [2024-06-18 11:13:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 1988952064. Throughput: 0: 42608.9. Samples: 1989105920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:13:56,994][12645] Avg episode reward: [(0, '0.642')] [2024-06-18 11:13:58,685][12883] Updated weights for policy 0, policy_version 121403 (0.0043) [2024-06-18 11:14:01,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 1989148672. Throughput: 0: 42626.4. Samples: 1989230220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:14:01,994][12645] Avg episode reward: [(0, '0.756')] [2024-06-18 11:14:03,830][12883] Updated weights for policy 0, policy_version 121413 (0.0039) [2024-06-18 11:14:06,928][12883] Updated weights for policy 0, policy_version 121423 (0.0034) [2024-06-18 11:14:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 1989394432. Throughput: 0: 42631.5. Samples: 1989484140. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:14:06,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 11:14:11,617][12883] Updated weights for policy 0, policy_version 121433 (0.0040) [2024-06-18 11:14:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.1). Total num frames: 1989591040. Throughput: 0: 42589.3. Samples: 1989743660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:14:11,994][12645] Avg episode reward: [(0, '0.682')] [2024-06-18 11:14:14,577][12883] Updated weights for policy 0, policy_version 121443 (0.0033) [2024-06-18 11:14:16,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 1989771264. Throughput: 0: 42581.0. Samples: 1989863400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:14:16,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 11:14:19,127][12883] Updated weights for policy 0, policy_version 121453 (0.0033) [2024-06-18 11:14:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 1990033408. Throughput: 0: 42713.0. Samples: 1990124780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:14:21,994][12645] Avg episode reward: [(0, '0.336')] [2024-06-18 11:14:22,250][12883] Updated weights for policy 0, policy_version 121463 (0.0044) [2024-06-18 11:14:26,816][12883] Updated weights for policy 0, policy_version 121473 (0.0032) [2024-06-18 11:14:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 1990230016. Throughput: 0: 42620.4. Samples: 1990384920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:14:26,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 11:14:29,946][12883] Updated weights for policy 0, policy_version 121483 (0.0031) [2024-06-18 11:14:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 1990426624. Throughput: 0: 42716.1. Samples: 1990510480. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:14:31,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 11:14:34,334][12883] Updated weights for policy 0, policy_version 121493 (0.0027) [2024-06-18 11:14:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 1990672384. Throughput: 0: 42736.1. Samples: 1990771080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 11:14:36,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 11:14:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121501_1990672384.pth... [2024-06-18 11:14:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000120876_1980432384.pth [2024-06-18 11:14:37,565][12883] Updated weights for policy 0, policy_version 121503 (0.0037) [2024-06-18 11:14:41,931][12883] Updated weights for policy 0, policy_version 121513 (0.0024) [2024-06-18 11:14:41,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 1990868992. Throughput: 0: 42887.0. Samples: 1991035840. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:14:41,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 11:14:45,127][12883] Updated weights for policy 0, policy_version 121523 (0.0035) [2024-06-18 11:14:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 1991065600. Throughput: 0: 42717.1. Samples: 1991152480. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:14:46,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 11:14:49,623][12883] Updated weights for policy 0, policy_version 121533 (0.0048) [2024-06-18 11:14:51,206][12862] Signal inference workers to stop experience collection... (29200 times) [2024-06-18 11:14:51,206][12862] Signal inference workers to resume experience collection... 
(29200 times) [2024-06-18 11:14:51,251][12883] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-06-18 11:14:51,251][12883] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-06-18 11:14:51,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 1991311360. Throughput: 0: 42743.1. Samples: 1991407580. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:14:51,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 11:14:52,849][12883] Updated weights for policy 0, policy_version 121543 (0.0030) [2024-06-18 11:14:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 1991491584. Throughput: 0: 42740.9. Samples: 1991667000. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:14:56,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 11:14:57,307][12883] Updated weights for policy 0, policy_version 121553 (0.0044) [2024-06-18 11:15:00,443][12883] Updated weights for policy 0, policy_version 121563 (0.0048) [2024-06-18 11:15:01,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 1991720960. Throughput: 0: 42769.8. Samples: 1991788040. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:15:01,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 11:15:05,059][12883] Updated weights for policy 0, policy_version 121573 (0.0037) [2024-06-18 11:15:06,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 1991966720. Throughput: 0: 42789.3. Samples: 1992050300. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:15:06,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 11:15:08,254][12883] Updated weights for policy 0, policy_version 121583 (0.0022) [2024-06-18 11:15:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 1992130560. Throughput: 0: 42695.5. Samples: 1992306220. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:15:11,994][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 11:15:13,305][12883] Updated weights for policy 0, policy_version 121593 (0.0033) [2024-06-18 11:15:15,898][12883] Updated weights for policy 0, policy_version 121603 (0.0039) [2024-06-18 11:15:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 1992359936. Throughput: 0: 42514.6. Samples: 1992423640. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:15:16,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 11:15:21,062][12883] Updated weights for policy 0, policy_version 121613 (0.0028) [2024-06-18 11:15:21,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 1992589312. Throughput: 0: 42698.3. Samples: 1992692600. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:15:21,997][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 11:15:23,955][12883] Updated weights for policy 0, policy_version 121623 (0.0029) [2024-06-18 11:15:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 1992785920. Throughput: 0: 42437.0. Samples: 1992945500. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:15:26,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 11:15:28,580][12883] Updated weights for policy 0, policy_version 121633 (0.0039) [2024-06-18 11:15:31,525][12883] Updated weights for policy 0, policy_version 121643 (0.0039) [2024-06-18 11:15:31,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 1992998912. Throughput: 0: 42588.4. Samples: 1993068960. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:15:31,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 11:15:36,114][12883] Updated weights for policy 0, policy_version 121653 (0.0031) [2024-06-18 11:15:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 1993211904. Throughput: 0: 42690.7. Samples: 1993328660. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:15:36,994][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 11:15:39,341][12883] Updated weights for policy 0, policy_version 121663 (0.0033) [2024-06-18 11:15:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1993424896. Throughput: 0: 42535.5. Samples: 1993581100. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-18 11:15:41,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 11:15:43,629][12883] Updated weights for policy 0, policy_version 121673 (0.0036) [2024-06-18 11:15:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 1993637888. Throughput: 0: 42806.5. Samples: 1993714340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:15:46,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 11:15:47,371][12883] Updated weights for policy 0, policy_version 121683 (0.0033) [2024-06-18 11:15:51,090][12883] Updated weights for policy 0, policy_version 121693 (0.0029) [2024-06-18 11:15:51,997][12645] Fps is (10 sec: 42582.8, 60 sec: 42322.7, 300 sec: 42653.4). Total num frames: 1993850880. Throughput: 0: 42773.4. Samples: 1993975260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:15:51,998][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 11:15:54,902][12883] Updated weights for policy 0, policy_version 121703 (0.0038) [2024-06-18 11:15:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1994080256. Throughput: 0: 42627.9. Samples: 1994224480. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:15:56,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 11:15:58,681][12883] Updated weights for policy 0, policy_version 121713 (0.0033) [2024-06-18 11:15:59,299][12862] Signal inference workers to stop experience collection... (29250 times) [2024-06-18 11:15:59,307][12862] Signal inference workers to resume experience collection... (29250 times) [2024-06-18 11:15:59,353][12883] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-06-18 11:15:59,353][12883] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-06-18 11:16:01,994][12645] Fps is (10 sec: 42614.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 1994276864. Throughput: 0: 42953.4. Samples: 1994356540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:16:01,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 11:16:02,780][12883] Updated weights for policy 0, policy_version 121723 (0.0027) [2024-06-18 11:16:06,467][12883] Updated weights for policy 0, policy_version 121733 (0.0030) [2024-06-18 11:16:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 1994473472. Throughput: 0: 42738.1. Samples: 1994615720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:16:06,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 11:16:10,395][12883] Updated weights for policy 0, policy_version 121743 (0.0031) [2024-06-18 11:16:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 1994719232. Throughput: 0: 42645.8. Samples: 1994864560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:16:11,994][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 11:16:14,162][12883] Updated weights for policy 0, policy_version 121753 (0.0037) [2024-06-18 11:16:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 1994915840. Throughput: 0: 42850.6. 
Samples: 1994997240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:16:16,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 11:16:17,944][12883] Updated weights for policy 0, policy_version 121763 (0.0037) [2024-06-18 11:16:21,612][12883] Updated weights for policy 0, policy_version 121773 (0.0037) [2024-06-18 11:16:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 1995128832. Throughput: 0: 42723.1. Samples: 1995251200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:16:21,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 11:16:25,593][12883] Updated weights for policy 0, policy_version 121783 (0.0034) [2024-06-18 11:16:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 1995358208. Throughput: 0: 42715.7. Samples: 1995503300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:16:26,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 11:16:29,330][12883] Updated weights for policy 0, policy_version 121793 (0.0036) [2024-06-18 11:16:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1995554816. Throughput: 0: 42694.4. Samples: 1995635580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:16:31,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 11:16:33,460][12883] Updated weights for policy 0, policy_version 121803 (0.0029) [2024-06-18 11:16:36,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 1995767808. Throughput: 0: 42498.3. Samples: 1995887620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:16:36,996][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 11:16:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121812_1995767808.pth... 
[2024-06-18 11:16:37,095][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121187_1985527808.pth [2024-06-18 11:16:37,271][12883] Updated weights for policy 0, policy_version 121813 (0.0041) [2024-06-18 11:16:40,985][12883] Updated weights for policy 0, policy_version 121823 (0.0023) [2024-06-18 11:16:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 1995980800. Throughput: 0: 42693.5. Samples: 1996145680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 11:16:41,994][12645] Avg episode reward: [(0, '0.670')] [2024-06-18 11:16:44,968][12883] Updated weights for policy 0, policy_version 121833 (0.0036) [2024-06-18 11:16:46,994][12645] Fps is (10 sec: 42607.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 1996193792. Throughput: 0: 42786.1. Samples: 1996281920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:16:46,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 11:16:48,353][12883] Updated weights for policy 0, policy_version 121843 (0.0032) [2024-06-18 11:16:51,996][12645] Fps is (10 sec: 42588.4, 60 sec: 42599.5, 300 sec: 42653.6). Total num frames: 1996406784. Throughput: 0: 42519.7. Samples: 1996529200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:16:51,997][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 11:16:52,529][12883] Updated weights for policy 0, policy_version 121853 (0.0026) [2024-06-18 11:16:55,994][12883] Updated weights for policy 0, policy_version 121863 (0.0035) [2024-06-18 11:16:56,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 1996636160. Throughput: 0: 42684.1. Samples: 1996785340. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:16:56,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 11:17:00,480][12883] Updated weights for policy 0, policy_version 121873 (0.0028) [2024-06-18 11:17:01,996][12645] Fps is (10 sec: 44236.8, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 1996849152. Throughput: 0: 42650.8. Samples: 1996916620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:17:01,997][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 11:17:03,695][12883] Updated weights for policy 0, policy_version 121883 (0.0044) [2024-06-18 11:17:06,994][12645] Fps is (10 sec: 42597.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 1997062144. Throughput: 0: 42538.1. Samples: 1997165420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:17:06,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 11:17:08,101][12883] Updated weights for policy 0, policy_version 121893 (0.0047) [2024-06-18 11:17:11,441][12883] Updated weights for policy 0, policy_version 121903 (0.0040) [2024-06-18 11:17:11,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 1997258752. Throughput: 0: 42704.0. Samples: 1997424980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:17:11,994][12645] Avg episode reward: [(0, '0.242')] [2024-06-18 11:17:15,998][12883] Updated weights for policy 0, policy_version 121913 (0.0042) [2024-06-18 11:17:16,996][12645] Fps is (10 sec: 40951.2, 60 sec: 42596.8, 300 sec: 42653.9). Total num frames: 1997471744. Throughput: 0: 42566.7. Samples: 1997551180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:17:16,997][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 11:17:19,422][12883] Updated weights for policy 0, policy_version 121923 (0.0031) [2024-06-18 11:17:20,435][12862] Signal inference workers to stop experience collection... 
(29300 times) [2024-06-18 11:17:20,435][12862] Signal inference workers to resume experience collection... (29300 times) [2024-06-18 11:17:20,469][12883] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-06-18 11:17:20,469][12883] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-06-18 11:17:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1997701120. Throughput: 0: 42540.7. Samples: 1997801860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:17:21,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 11:17:23,621][12883] Updated weights for policy 0, policy_version 121933 (0.0031) [2024-06-18 11:17:26,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1997897728. Throughput: 0: 42623.9. Samples: 1998063760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:17:26,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 11:17:27,017][12883] Updated weights for policy 0, policy_version 121943 (0.0041) [2024-06-18 11:17:31,088][12883] Updated weights for policy 0, policy_version 121953 (0.0041) [2024-06-18 11:17:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 1998094336. Throughput: 0: 42366.8. Samples: 1998188420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:17:31,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 11:17:34,637][12883] Updated weights for policy 0, policy_version 121963 (0.0045) [2024-06-18 11:17:36,994][12645] Fps is (10 sec: 45874.2, 60 sec: 43146.0, 300 sec: 42765.0). Total num frames: 1998356480. Throughput: 0: 42631.7. Samples: 1998447540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:17:36,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 11:17:38,549][12883] Updated weights for policy 0, policy_version 121973 (0.0045) [2024-06-18 11:17:41,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 1998553088. Throughput: 0: 42600.8. Samples: 1998702380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:17:41,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 11:17:42,221][12883] Updated weights for policy 0, policy_version 121983 (0.0041) [2024-06-18 11:17:46,314][12883] Updated weights for policy 0, policy_version 121993 (0.0040) [2024-06-18 11:17:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42654.8). Total num frames: 1998749696. Throughput: 0: 42444.7. Samples: 1998826540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:17:46,994][12645] Avg episode reward: [(0, '0.622')] [2024-06-18 11:17:50,232][12883] Updated weights for policy 0, policy_version 122003 (0.0041) [2024-06-18 11:17:51,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42871.5, 300 sec: 42598.1). Total num frames: 1998979072. Throughput: 0: 42628.2. Samples: 1999083780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:17:51,996][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 11:17:53,959][12883] Updated weights for policy 0, policy_version 122013 (0.0032) [2024-06-18 11:17:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 1999192064. Throughput: 0: 42552.4. Samples: 1999339840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:17:56,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 11:17:57,922][12883] Updated weights for policy 0, policy_version 122023 (0.0029) [2024-06-18 11:18:01,691][12883] Updated weights for policy 0, policy_version 122033 (0.0040) [2024-06-18 11:18:01,994][12645] Fps is (10 sec: 40969.5, 60 sec: 42327.0, 300 sec: 42653.9). Total num frames: 1999388672. Throughput: 0: 42532.5. Samples: 1999465040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:01,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 11:18:05,775][12883] Updated weights for policy 0, policy_version 122043 (0.0037) [2024-06-18 11:18:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 1999634432. Throughput: 0: 42777.4. Samples: 1999726840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:06,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 11:18:09,308][12883] Updated weights for policy 0, policy_version 122053 (0.0036) [2024-06-18 11:18:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 1999814656. Throughput: 0: 42687.6. Samples: 1999984700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:11,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 11:18:13,301][12883] Updated weights for policy 0, policy_version 122063 (0.0027) [2024-06-18 11:18:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 2000027648. Throughput: 0: 42540.1. Samples: 2000102720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:16,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 11:18:17,034][12883] Updated weights for policy 0, policy_version 122073 (0.0035) [2024-06-18 11:18:21,085][12883] Updated weights for policy 0, policy_version 122083 (0.0035) [2024-06-18 11:18:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2000257024. Throughput: 0: 42541.0. Samples: 2000361880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:21,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 11:18:24,782][12883] Updated weights for policy 0, policy_version 122093 (0.0036) [2024-06-18 11:18:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2000453632. Throughput: 0: 42516.5. Samples: 2000615620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:26,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 11:18:28,652][12883] Updated weights for policy 0, policy_version 122103 (0.0039) [2024-06-18 11:18:31,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2000650240. Throughput: 0: 42468.5. Samples: 2000737620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:31,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 11:18:32,417][12883] Updated weights for policy 0, policy_version 122113 (0.0034) [2024-06-18 11:18:36,138][12883] Updated weights for policy 0, policy_version 122123 (0.0027) [2024-06-18 11:18:36,996][12645] Fps is (10 sec: 42590.0, 60 sec: 42051.0, 300 sec: 42653.6). Total num frames: 2000879616. Throughput: 0: 42486.0. Samples: 2000995640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:36,996][12645] Avg episode reward: [(0, '0.650')] [2024-06-18 11:18:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122124_2000879616.pth... 
[2024-06-18 11:18:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121501_1990672384.pth [2024-06-18 11:18:40,205][12883] Updated weights for policy 0, policy_version 122133 (0.0032) [2024-06-18 11:18:41,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2001076224. Throughput: 0: 42615.4. Samples: 2001257540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:41,994][12645] Avg episode reward: [(0, '0.636')] [2024-06-18 11:18:44,162][12883] Updated weights for policy 0, policy_version 122143 (0.0035) [2024-06-18 11:18:46,994][12645] Fps is (10 sec: 42606.3, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2001305600. Throughput: 0: 42624.2. Samples: 2001383140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:18:46,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 11:18:47,912][12883] Updated weights for policy 0, policy_version 122153 (0.0038) [2024-06-18 11:18:51,697][12883] Updated weights for policy 0, policy_version 122163 (0.0033) [2024-06-18 11:18:51,994][12645] Fps is (10 sec: 45876.4, 60 sec: 42600.1, 300 sec: 42654.0). Total num frames: 2001534976. Throughput: 0: 42482.3. Samples: 2001638540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:18:51,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 11:18:55,569][12883] Updated weights for policy 0, policy_version 122173 (0.0028) [2024-06-18 11:18:56,994][12645] Fps is (10 sec: 40961.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2001715200. Throughput: 0: 42447.6. Samples: 2001894840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:18:56,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 11:18:59,242][12883] Updated weights for policy 0, policy_version 122183 (0.0027) [2024-06-18 11:19:01,995][12645] Fps is (10 sec: 39317.1, 60 sec: 42324.5, 300 sec: 42487.2). Total num frames: 2001928192. Throughput: 0: 42522.9. Samples: 2002016300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:01,995][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 11:19:03,373][12883] Updated weights for policy 0, policy_version 122193 (0.0035) [2024-06-18 11:19:05,444][12862] Signal inference workers to stop experience collection... (29350 times) [2024-06-18 11:19:05,444][12862] Signal inference workers to resume experience collection... (29350 times) [2024-06-18 11:19:05,481][12883] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-06-18 11:19:05,481][12883] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-06-18 11:19:06,801][12883] Updated weights for policy 0, policy_version 122203 (0.0035) [2024-06-18 11:19:06,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2002173952. Throughput: 0: 42558.2. Samples: 2002277000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:06,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 11:19:11,102][12883] Updated weights for policy 0, policy_version 122213 (0.0040) [2024-06-18 11:19:11,994][12645] Fps is (10 sec: 44241.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2002370560. Throughput: 0: 42456.5. Samples: 2002526160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:11,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 11:19:14,710][12883] Updated weights for policy 0, policy_version 122223 (0.0022) [2024-06-18 11:19:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2002583552. Throughput: 0: 42588.3. Samples: 2002654100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:16,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 11:19:18,862][12883] Updated weights for policy 0, policy_version 122233 (0.0037) [2024-06-18 11:19:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 2002780160. 
Throughput: 0: 42598.3. Samples: 2002912480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:21,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 11:19:22,527][12883] Updated weights for policy 0, policy_version 122243 (0.0027) [2024-06-18 11:19:26,460][12883] Updated weights for policy 0, policy_version 122253 (0.0033) [2024-06-18 11:19:26,995][12645] Fps is (10 sec: 44229.5, 60 sec: 42870.3, 300 sec: 42709.2). Total num frames: 2003025920. Throughput: 0: 42448.8. Samples: 2003167800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:26,996][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 11:19:30,058][12883] Updated weights for policy 0, policy_version 122263 (0.0058) [2024-06-18 11:19:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2003222528. Throughput: 0: 42592.1. Samples: 2003299780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:31,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 11:19:34,043][12883] Updated weights for policy 0, policy_version 122273 (0.0023) [2024-06-18 11:19:36,994][12645] Fps is (10 sec: 40966.7, 60 sec: 42599.8, 300 sec: 42598.4). Total num frames: 2003435520. Throughput: 0: 42630.5. Samples: 2003556920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:36,994][12645] Avg episode reward: [(0, '0.158')] [2024-06-18 11:19:37,650][12883] Updated weights for policy 0, policy_version 122283 (0.0028) [2024-06-18 11:19:41,698][12883] Updated weights for policy 0, policy_version 122293 (0.0042) [2024-06-18 11:19:41,999][12645] Fps is (10 sec: 42576.7, 60 sec: 42867.9, 300 sec: 42653.2). Total num frames: 2003648512. Throughput: 0: 42509.7. Samples: 2003808000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:41,999][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 11:19:45,632][12883] Updated weights for policy 0, policy_version 122303 (0.0033) [2024-06-18 11:19:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2003861504. Throughput: 0: 42727.7. Samples: 2003939000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 11:19:46,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 11:19:49,381][12883] Updated weights for policy 0, policy_version 122313 (0.0036) [2024-06-18 11:19:51,994][12645] Fps is (10 sec: 40981.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2004058112. Throughput: 0: 42569.9. Samples: 2004192640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:19:51,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 11:19:53,121][12883] Updated weights for policy 0, policy_version 122323 (0.0036) [2024-06-18 11:19:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2004287488. Throughput: 0: 42774.3. Samples: 2004451000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:19:56,994][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 11:19:57,037][12883] Updated weights for policy 0, policy_version 122333 (0.0041) [2024-06-18 11:20:00,824][12883] Updated weights for policy 0, policy_version 122343 (0.0036) [2024-06-18 11:20:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42872.2, 300 sec: 42487.3). Total num frames: 2004500480. Throughput: 0: 42856.0. Samples: 2004582620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:01,994][12645] Avg episode reward: [(0, '0.748')] [2024-06-18 11:20:04,725][12883] Updated weights for policy 0, policy_version 122353 (0.0038) [2024-06-18 11:20:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2004713472. Throughput: 0: 42666.8. Samples: 2004832480. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:06,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 11:20:08,407][12883] Updated weights for policy 0, policy_version 122363 (0.0033) [2024-06-18 11:20:12,000][12645] Fps is (10 sec: 42572.0, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 2004926464. Throughput: 0: 42753.5. Samples: 2005091900. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:12,000][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 11:20:12,313][12883] Updated weights for policy 0, policy_version 122373 (0.0028) [2024-06-18 11:20:16,024][12883] Updated weights for policy 0, policy_version 122383 (0.0047) [2024-06-18 11:20:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2005155840. Throughput: 0: 42740.9. Samples: 2005223120. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:16,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 11:20:19,890][12883] Updated weights for policy 0, policy_version 122393 (0.0037) [2024-06-18 11:20:21,994][12645] Fps is (10 sec: 40984.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2005336064. Throughput: 0: 42527.9. Samples: 2005470680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:21,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 11:20:23,765][12883] Updated weights for policy 0, policy_version 122403 (0.0040) [2024-06-18 11:20:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42326.5, 300 sec: 42598.4). Total num frames: 2005565440. Throughput: 0: 42692.8. Samples: 2005728960. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:26,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 11:20:27,753][12883] Updated weights for policy 0, policy_version 122413 (0.0029) [2024-06-18 11:20:31,557][12883] Updated weights for policy 0, policy_version 122423 (0.0044) [2024-06-18 11:20:31,998][12645] Fps is (10 sec: 45858.0, 60 sec: 42868.7, 300 sec: 42653.4). Total num frames: 2005794816. Throughput: 0: 42568.7. Samples: 2005854760. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:31,998][12645] Avg episode reward: [(0, '0.701')] [2024-06-18 11:20:35,384][12883] Updated weights for policy 0, policy_version 122433 (0.0028) [2024-06-18 11:20:36,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2005991424. Throughput: 0: 42508.5. Samples: 2006105620. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:36,997][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 11:20:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122436_2005991424.pth... [2024-06-18 11:20:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000121812_1995767808.pth [2024-06-18 11:20:39,381][12883] Updated weights for policy 0, policy_version 122443 (0.0041) [2024-06-18 11:20:41,994][12645] Fps is (10 sec: 40976.1, 60 sec: 42602.1, 300 sec: 42598.4). Total num frames: 2006204416. Throughput: 0: 42451.9. Samples: 2006361340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:41,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 11:20:43,735][12883] Updated weights for policy 0, policy_version 122453 (0.0035) [2024-06-18 11:20:43,771][12862] Signal inference workers to stop experience collection... (29400 times) [2024-06-18 11:20:43,772][12862] Signal inference workers to resume experience collection... 
(29400 times) [2024-06-18 11:20:43,816][12883] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-06-18 11:20:43,816][12883] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-06-18 11:20:46,996][12645] Fps is (10 sec: 42598.5, 60 sec: 42596.8, 300 sec: 42598.6). Total num frames: 2006417408. Throughput: 0: 42430.3. Samples: 2006492080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-18 11:20:46,996][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 11:20:47,073][12883] Updated weights for policy 0, policy_version 122463 (0.0042) [2024-06-18 11:20:51,368][12883] Updated weights for policy 0, policy_version 122473 (0.0043) [2024-06-18 11:20:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2006614016. Throughput: 0: 42449.4. Samples: 2006742700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:20:51,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 11:20:54,762][12883] Updated weights for policy 0, policy_version 122483 (0.0036) [2024-06-18 11:20:56,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2006843392. Throughput: 0: 42239.2. Samples: 2006992400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:20:56,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 11:20:59,090][12883] Updated weights for policy 0, policy_version 122493 (0.0032) [2024-06-18 11:21:01,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2007056384. Throughput: 0: 42128.9. Samples: 2007118920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:01,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 11:21:02,661][12883] Updated weights for policy 0, policy_version 122503 (0.0048) [2024-06-18 11:21:06,760][12883] Updated weights for policy 0, policy_version 122513 (0.0047) [2024-06-18 11:21:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2007252992. Throughput: 0: 42360.6. Samples: 2007376900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:06,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 11:21:10,401][12883] Updated weights for policy 0, policy_version 122523 (0.0039) [2024-06-18 11:21:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42329.8, 300 sec: 42542.9). Total num frames: 2007465984. Throughput: 0: 42157.9. Samples: 2007626060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:11,994][12645] Avg episode reward: [(0, '0.657')] [2024-06-18 11:21:14,690][12883] Updated weights for policy 0, policy_version 122533 (0.0035) [2024-06-18 11:21:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2007695360. Throughput: 0: 42256.2. Samples: 2007756120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:16,994][12645] Avg episode reward: [(0, '0.607')] [2024-06-18 11:21:18,076][12883] Updated weights for policy 0, policy_version 122543 (0.0039) [2024-06-18 11:21:21,995][12645] Fps is (10 sec: 42592.4, 60 sec: 42597.6, 300 sec: 42487.1). Total num frames: 2007891968. Throughput: 0: 42371.1. Samples: 2008012280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:21,996][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 11:21:22,207][12883] Updated weights for policy 0, policy_version 122553 (0.0032) [2024-06-18 11:21:25,702][12883] Updated weights for policy 0, policy_version 122563 (0.0034) [2024-06-18 11:21:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2008104960. Throughput: 0: 42243.9. Samples: 2008262320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:26,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 11:21:30,013][12883] Updated weights for policy 0, policy_version 122573 (0.0032) [2024-06-18 11:21:31,994][12645] Fps is (10 sec: 42603.7, 60 sec: 42054.9, 300 sec: 42543.2). Total num frames: 2008317952. Throughput: 0: 42208.2. Samples: 2008391360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:31,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 11:21:33,321][12883] Updated weights for policy 0, policy_version 122583 (0.0038) [2024-06-18 11:21:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 2008514560. Throughput: 0: 42304.9. Samples: 2008646420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:36,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 11:21:37,743][12883] Updated weights for policy 0, policy_version 122593 (0.0028) [2024-06-18 11:21:41,693][12883] Updated weights for policy 0, policy_version 122603 (0.0034) [2024-06-18 11:21:41,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2008743936. Throughput: 0: 42367.2. Samples: 2008898920. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:41,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 11:21:45,529][12883] Updated weights for policy 0, policy_version 122613 (0.0037) [2024-06-18 11:21:46,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42326.9, 300 sec: 42543.2). Total num frames: 2008956928. Throughput: 0: 42423.5. Samples: 2009027980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:46,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 11:21:49,143][12883] Updated weights for policy 0, policy_version 122623 (0.0037) [2024-06-18 11:21:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 2009137152. Throughput: 0: 42384.4. Samples: 2009284200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 11:21:51,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 11:21:53,118][12883] Updated weights for policy 0, policy_version 122633 (0.0030) [2024-06-18 11:21:56,649][12883] Updated weights for policy 0, policy_version 122643 (0.0038) [2024-06-18 11:21:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2009382912. Throughput: 0: 42461.7. Samples: 2009536840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:21:56,994][12645] Avg episode reward: [(0, '0.594')] [2024-06-18 11:22:01,250][12883] Updated weights for policy 0, policy_version 122653 (0.0029) [2024-06-18 11:22:02,000][12645] Fps is (10 sec: 45846.4, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 2009595904. Throughput: 0: 42500.7. Samples: 2009668920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:02,001][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 11:22:04,303][12883] Updated weights for policy 0, policy_version 122663 (0.0029) [2024-06-18 11:22:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2009792512. Throughput: 0: 42346.5. Samples: 2009917820. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:06,996][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 11:22:08,849][12883] Updated weights for policy 0, policy_version 122673 (0.0040) [2024-06-18 11:22:11,847][12862] Signal inference workers to stop experience collection... (29450 times) [2024-06-18 11:22:11,847][12862] Signal inference workers to resume experience collection... (29450 times) [2024-06-18 11:22:11,891][12883] InferenceWorker_p0-w0: stopping experience collection (29450 times) [2024-06-18 11:22:11,891][12883] InferenceWorker_p0-w0: resuming experience collection (29450 times) [2024-06-18 11:22:11,989][12883] Updated weights for policy 0, policy_version 122683 (0.0032) [2024-06-18 11:22:11,994][12645] Fps is (10 sec: 44264.9, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2010038272. Throughput: 0: 42332.5. Samples: 2010167280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:11,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 11:22:16,390][12883] Updated weights for policy 0, policy_version 122693 (0.0046) [2024-06-18 11:22:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2010234880. Throughput: 0: 42493.4. Samples: 2010303560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:16,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 11:22:19,982][12883] Updated weights for policy 0, policy_version 122703 (0.0031) [2024-06-18 11:22:21,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42326.4, 300 sec: 42487.3). Total num frames: 2010431488. Throughput: 0: 42311.6. Samples: 2010550440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:21,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 11:22:24,225][12883] Updated weights for policy 0, policy_version 122713 (0.0029) [2024-06-18 11:22:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2010660864. Throughput: 0: 42512.8. 
Samples: 2010812000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:26,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 11:22:27,816][12883] Updated weights for policy 0, policy_version 122723 (0.0046) [2024-06-18 11:22:31,810][12883] Updated weights for policy 0, policy_version 122733 (0.0034) [2024-06-18 11:22:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 2010857472. Throughput: 0: 42606.3. Samples: 2010945260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:31,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 11:22:35,341][12883] Updated weights for policy 0, policy_version 122743 (0.0030) [2024-06-18 11:22:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2011070464. Throughput: 0: 42554.1. Samples: 2011199140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:36,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 11:22:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122746_2011070464.pth... [2024-06-18 11:22:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122124_2000879616.pth [2024-06-18 11:22:39,271][12883] Updated weights for policy 0, policy_version 122753 (0.0038) [2024-06-18 11:22:41,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2011316224. Throughput: 0: 42560.9. Samples: 2011452080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:41,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 11:22:42,929][12883] Updated weights for policy 0, policy_version 122763 (0.0027) [2024-06-18 11:22:46,943][12883] Updated weights for policy 0, policy_version 122773 (0.0027) [2024-06-18 11:22:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2011512832. Throughput: 0: 42666.8. Samples: 2011588660. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:46,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 11:22:51,101][12883] Updated weights for policy 0, policy_version 122783 (0.0043) [2024-06-18 11:22:52,000][12645] Fps is (10 sec: 40934.9, 60 sec: 43140.1, 300 sec: 42486.4). Total num frames: 2011725824. Throughput: 0: 42816.8. Samples: 2011844840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:52,000][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 11:22:54,428][12883] Updated weights for policy 0, policy_version 122793 (0.0049) [2024-06-18 11:22:56,998][12645] Fps is (10 sec: 44216.3, 60 sec: 42868.2, 300 sec: 42597.7). Total num frames: 2011955200. Throughput: 0: 42834.2. Samples: 2012095020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:22:56,999][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 11:22:58,474][12883] Updated weights for policy 0, policy_version 122803 (0.0048) [2024-06-18 11:23:01,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42602.9, 300 sec: 42431.8). Total num frames: 2012151808. Throughput: 0: 42815.6. Samples: 2012230260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:01,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 11:23:02,132][12883] Updated weights for policy 0, policy_version 122813 (0.0047) [2024-06-18 11:23:05,958][12883] Updated weights for policy 0, policy_version 122823 (0.0032) [2024-06-18 11:23:06,994][12645] Fps is (10 sec: 40979.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2012364800. Throughput: 0: 43105.7. Samples: 2012490200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:06,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 11:23:09,593][12883] Updated weights for policy 0, policy_version 122833 (0.0036) [2024-06-18 11:23:11,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2012610560. Throughput: 0: 42913.6. Samples: 2012743120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:11,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 11:23:13,527][12883] Updated weights for policy 0, policy_version 122843 (0.0044) [2024-06-18 11:23:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2012790784. Throughput: 0: 42912.4. Samples: 2012876320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:16,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 11:23:17,157][12883] Updated weights for policy 0, policy_version 122853 (0.0028) [2024-06-18 11:23:21,261][12883] Updated weights for policy 0, policy_version 122863 (0.0031) [2024-06-18 11:23:21,994][12645] Fps is (10 sec: 40960.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2013020160. Throughput: 0: 42899.7. Samples: 2013129620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:21,994][12645] Avg episode reward: [(0, '0.555')] [2024-06-18 11:23:24,680][12862] Signal inference workers to stop experience collection... (29500 times) [2024-06-18 11:23:24,680][12862] Signal inference workers to resume experience collection... (29500 times) [2024-06-18 11:23:24,721][12883] InferenceWorker_p0-w0: stopping experience collection (29500 times) [2024-06-18 11:23:24,721][12883] InferenceWorker_p0-w0: resuming experience collection (29500 times) [2024-06-18 11:23:24,832][12883] Updated weights for policy 0, policy_version 122873 (0.0045) [2024-06-18 11:23:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2013249536. Throughput: 0: 42905.8. Samples: 2013382840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:26,994][12645] Avg episode reward: [(0, '0.698')] [2024-06-18 11:23:28,746][12883] Updated weights for policy 0, policy_version 122883 (0.0040) [2024-06-18 11:23:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42543.1). Total num frames: 2013429760. Throughput: 0: 42860.0. 
Samples: 2013517360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:31,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 11:23:32,397][12883] Updated weights for policy 0, policy_version 122893 (0.0038) [2024-06-18 11:23:36,412][12883] Updated weights for policy 0, policy_version 122903 (0.0040) [2024-06-18 11:23:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 2013659136. Throughput: 0: 42581.9. Samples: 2013760760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:36,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 11:23:40,570][12883] Updated weights for policy 0, policy_version 122913 (0.0035) [2024-06-18 11:23:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2013872128. Throughput: 0: 42735.6. Samples: 2014017920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:41,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 11:23:44,004][12883] Updated weights for policy 0, policy_version 122923 (0.0042) [2024-06-18 11:23:46,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2014052352. Throughput: 0: 42508.8. Samples: 2014143160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:46,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 11:23:48,114][12883] Updated weights for policy 0, policy_version 122933 (0.0031) [2024-06-18 11:23:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42602.9, 300 sec: 42598.4). Total num frames: 2014281728. Throughput: 0: 42411.2. Samples: 2014398700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 11:23:51,994][12645] Avg episode reward: [(0, '0.695')] [2024-06-18 11:23:52,026][12883] Updated weights for policy 0, policy_version 122943 (0.0030) [2024-06-18 11:23:55,952][12883] Updated weights for policy 0, policy_version 122953 (0.0031) [2024-06-18 11:23:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42328.7, 300 sec: 42598.6). Total num frames: 2014494720. Throughput: 0: 42434.8. Samples: 2014652680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:23:56,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 11:23:59,709][12883] Updated weights for policy 0, policy_version 122963 (0.0031) [2024-06-18 11:24:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 2014674944. Throughput: 0: 42340.9. Samples: 2014781660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:01,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 11:24:03,469][12883] Updated weights for policy 0, policy_version 122973 (0.0029) [2024-06-18 11:24:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2014920704. Throughput: 0: 42395.1. Samples: 2015037400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:06,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 11:24:07,254][12883] Updated weights for policy 0, policy_version 122983 (0.0033) [2024-06-18 11:24:11,349][12883] Updated weights for policy 0, policy_version 122993 (0.0026) [2024-06-18 11:24:11,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2015150080. Throughput: 0: 42353.4. Samples: 2015288740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:11,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 11:24:15,367][12883] Updated weights for policy 0, policy_version 123003 (0.0024) [2024-06-18 11:24:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2015330304. Throughput: 0: 42223.7. Samples: 2015417420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:16,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 11:24:19,197][12883] Updated weights for policy 0, policy_version 123013 (0.0027) [2024-06-18 11:24:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42596.8, 300 sec: 42542.8). Total num frames: 2015576064. Throughput: 0: 42476.9. Samples: 2015672320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:21,996][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 11:24:23,102][12883] Updated weights for policy 0, policy_version 123023 (0.0030) [2024-06-18 11:24:26,860][12883] Updated weights for policy 0, policy_version 123033 (0.0036) [2024-06-18 11:24:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2015772672. Throughput: 0: 42412.8. Samples: 2015926500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:26,999][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 11:24:30,654][12883] Updated weights for policy 0, policy_version 123043 (0.0036) [2024-06-18 11:24:31,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2015969280. Throughput: 0: 42287.6. Samples: 2016046100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:31,994][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 11:24:34,618][12883] Updated weights for policy 0, policy_version 123053 (0.0031) [2024-06-18 11:24:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42599.1). Total num frames: 2016215040. Throughput: 0: 42311.5. Samples: 2016302720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:36,994][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 11:24:37,111][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123061_2016231424.pth... [2024-06-18 11:24:37,161][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122436_2005991424.pth [2024-06-18 11:24:38,659][12883] Updated weights for policy 0, policy_version 123063 (0.0034) [2024-06-18 11:24:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2016395264. Throughput: 0: 42377.8. Samples: 2016559680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:41,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 11:24:42,523][12883] Updated weights for policy 0, policy_version 123073 (0.0032) [2024-06-18 11:24:46,266][12883] Updated weights for policy 0, policy_version 123083 (0.0031) [2024-06-18 11:24:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2016591872. Throughput: 0: 42292.0. Samples: 2016684800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:46,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 11:24:50,143][12883] Updated weights for policy 0, policy_version 123093 (0.0030) [2024-06-18 11:24:50,769][12862] Signal inference workers to stop experience collection... (29550 times) [2024-06-18 11:24:50,770][12862] Signal inference workers to resume experience collection... (29550 times) [2024-06-18 11:24:50,782][12883] InferenceWorker_p0-w0: stopping experience collection (29550 times) [2024-06-18 11:24:50,782][12883] InferenceWorker_p0-w0: resuming experience collection (29550 times) [2024-06-18 11:24:51,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 2016821248. Throughput: 0: 42245.8. Samples: 2016938560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:51,997][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 11:24:53,968][12883] Updated weights for policy 0, policy_version 123103 (0.0043) [2024-06-18 11:24:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2017034240. Throughput: 0: 42486.3. Samples: 2017200620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 11:24:56,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 11:24:57,753][12883] Updated weights for policy 0, policy_version 123113 (0.0035) [2024-06-18 11:25:01,486][12883] Updated weights for policy 0, policy_version 123123 (0.0037) [2024-06-18 11:25:01,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2017247232. Throughput: 0: 42444.0. Samples: 2017327400. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:01,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 11:25:05,398][12883] Updated weights for policy 0, policy_version 123133 (0.0040) [2024-06-18 11:25:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42488.2). Total num frames: 2017460224. Throughput: 0: 42420.7. Samples: 2017581160. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:06,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 11:25:09,026][12883] Updated weights for policy 0, policy_version 123143 (0.0029) [2024-06-18 11:25:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2017673216. Throughput: 0: 42588.0. Samples: 2017842960. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:11,995][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 11:25:13,131][12883] Updated weights for policy 0, policy_version 123153 (0.0026) [2024-06-18 11:25:16,636][12883] Updated weights for policy 0, policy_version 123163 (0.0043) [2024-06-18 11:25:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2017902592. Throughput: 0: 42655.2. Samples: 2017965580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:16,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 11:25:20,699][12883] Updated weights for policy 0, policy_version 123173 (0.0028) [2024-06-18 11:25:21,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42327.0, 300 sec: 42542.9). Total num frames: 2018115584. Throughput: 0: 42724.5. Samples: 2018225320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:21,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 11:25:24,483][12883] Updated weights for policy 0, policy_version 123183 (0.0033) [2024-06-18 11:25:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.4, 300 sec: 42376.8). Total num frames: 2018295808. Throughput: 0: 42800.5. Samples: 2018485700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:26,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 11:25:28,230][12883] Updated weights for policy 0, policy_version 123193 (0.0031) [2024-06-18 11:25:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 2018541568. Throughput: 0: 42726.2. Samples: 2018607480. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:31,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 11:25:32,203][12883] Updated weights for policy 0, policy_version 123203 (0.0042) [2024-06-18 11:25:35,869][12883] Updated weights for policy 0, policy_version 123213 (0.0043) [2024-06-18 11:25:36,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2018770944. Throughput: 0: 42933.2. Samples: 2018870460. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:36,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 11:25:39,799][12883] Updated weights for policy 0, policy_version 123223 (0.0038) [2024-06-18 11:25:41,994][12645] Fps is (10 sec: 39320.3, 60 sec: 42325.0, 300 sec: 42432.1). Total num frames: 2018934784. Throughput: 0: 42874.2. Samples: 2019129980. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:41,995][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 11:25:43,573][12883] Updated weights for policy 0, policy_version 123233 (0.0036) [2024-06-18 11:25:47,000][12645] Fps is (10 sec: 40934.8, 60 sec: 43140.0, 300 sec: 42597.5). Total num frames: 2019180544. Throughput: 0: 42732.7. Samples: 2019250640. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:47,000][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 11:25:47,562][12883] Updated weights for policy 0, policy_version 123243 (0.0033) [2024-06-18 11:25:51,444][12883] Updated weights for policy 0, policy_version 123253 (0.0029) [2024-06-18 11:25:51,996][12645] Fps is (10 sec: 45866.7, 60 sec: 42871.5, 300 sec: 42542.5). Total num frames: 2019393536. Throughput: 0: 42791.7. Samples: 2019506880. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:51,996][12645] Avg episode reward: [(0, '0.580')] [2024-06-18 11:25:55,434][12883] Updated weights for policy 0, policy_version 123263 (0.0034) [2024-06-18 11:25:56,994][12645] Fps is (10 sec: 40985.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2019590144. Throughput: 0: 42605.4. Samples: 2019760200. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 11:25:56,994][12645] Avg episode reward: [(0, '0.654')] [2024-06-18 11:25:57,958][12862] Signal inference workers to stop experience collection... (29600 times) [2024-06-18 11:25:57,990][12883] InferenceWorker_p0-w0: stopping experience collection (29600 times) [2024-06-18 11:25:58,014][12862] Signal inference workers to resume experience collection... (29600 times) [2024-06-18 11:25:58,015][12883] InferenceWorker_p0-w0: resuming experience collection (29600 times) [2024-06-18 11:25:59,092][12883] Updated weights for policy 0, policy_version 123273 (0.0029) [2024-06-18 11:26:01,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2019819520. Throughput: 0: 42637.3. Samples: 2019884260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:01,994][12645] Avg episode reward: [(0, '0.740')] [2024-06-18 11:26:03,162][12883] Updated weights for policy 0, policy_version 123283 (0.0040) [2024-06-18 11:26:06,985][12883] Updated weights for policy 0, policy_version 123293 (0.0044) [2024-06-18 11:26:06,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2020032512. Throughput: 0: 42676.9. Samples: 2020145780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:06,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 11:26:10,723][12883] Updated weights for policy 0, policy_version 123303 (0.0034) [2024-06-18 11:26:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 2020229120. 
Throughput: 0: 42375.6. Samples: 2020392600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:11,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 11:26:14,803][12883] Updated weights for policy 0, policy_version 123313 (0.0039) [2024-06-18 11:26:16,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42596.8, 300 sec: 42598.3). Total num frames: 2020458496. Throughput: 0: 42619.7. Samples: 2020525460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:16,996][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 11:26:18,242][12883] Updated weights for policy 0, policy_version 123323 (0.0035) [2024-06-18 11:26:21,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2020638720. Throughput: 0: 42359.6. Samples: 2020776640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:21,998][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 11:26:22,471][12883] Updated weights for policy 0, policy_version 123333 (0.0042) [2024-06-18 11:26:25,931][12883] Updated weights for policy 0, policy_version 123343 (0.0038) [2024-06-18 11:26:26,994][12645] Fps is (10 sec: 39330.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2020851712. Throughput: 0: 42153.3. Samples: 2021026860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:26,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 11:26:30,222][12883] Updated weights for policy 0, policy_version 123353 (0.0028) [2024-06-18 11:26:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2021097472. Throughput: 0: 42382.7. Samples: 2021157600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:31,994][12645] Avg episode reward: [(0, '0.714')] [2024-06-18 11:26:33,501][12883] Updated weights for policy 0, policy_version 123363 (0.0033) [2024-06-18 11:26:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 2021294080. 
Throughput: 0: 42434.0. Samples: 2021416320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:36,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 11:26:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123370_2021294080.pth... [2024-06-18 11:26:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000122746_2011070464.pth [2024-06-18 11:26:38,110][12883] Updated weights for policy 0, policy_version 123373 (0.0043) [2024-06-18 11:26:41,370][12883] Updated weights for policy 0, policy_version 123383 (0.0038) [2024-06-18 11:26:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.8, 300 sec: 42542.9). Total num frames: 2021507072. Throughput: 0: 42155.7. Samples: 2021657200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:41,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 11:26:46,145][12883] Updated weights for policy 0, policy_version 123393 (0.0035) [2024-06-18 11:26:46,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42602.6, 300 sec: 42709.4). Total num frames: 2021736448. Throughput: 0: 42430.4. Samples: 2021793640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:46,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 11:26:48,934][12883] Updated weights for policy 0, policy_version 123403 (0.0036) [2024-06-18 11:26:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42053.9, 300 sec: 42487.3). Total num frames: 2021916672. Throughput: 0: 42391.5. Samples: 2022053400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:51,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 11:26:53,710][12883] Updated weights for policy 0, policy_version 123413 (0.0032) [2024-06-18 11:26:56,406][12883] Updated weights for policy 0, policy_version 123423 (0.0026) [2024-06-18 11:26:56,998][12645] Fps is (10 sec: 42581.8, 60 sec: 42868.5, 300 sec: 42598.7). Total num frames: 2022162432. Throughput: 0: 42274.6. 
Samples: 2022295140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:26:56,998][12645] Avg episode reward: [(0, '0.644')] [2024-06-18 11:27:01,457][12883] Updated weights for policy 0, policy_version 123433 (0.0033) [2024-06-18 11:27:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2022359040. Throughput: 0: 42431.8. Samples: 2022434800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 11:27:01,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 11:27:04,715][12883] Updated weights for policy 0, policy_version 123443 (0.0039) [2024-06-18 11:27:06,997][12645] Fps is (10 sec: 39324.2, 60 sec: 42049.7, 300 sec: 42431.3). Total num frames: 2022555648. Throughput: 0: 42341.1. Samples: 2022682140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:06,998][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 11:27:09,255][12883] Updated weights for policy 0, policy_version 123453 (0.0029) [2024-06-18 11:27:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 2022785024. Throughput: 0: 42251.8. Samples: 2022928200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:11,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 11:27:12,371][12883] Updated weights for policy 0, policy_version 123463 (0.0034) [2024-06-18 11:27:16,988][12883] Updated weights for policy 0, policy_version 123473 (0.0036) [2024-06-18 11:27:16,994][12645] Fps is (10 sec: 42613.6, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 2022981632. Throughput: 0: 42393.4. Samples: 2023065300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:16,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 11:27:17,300][12862] Signal inference workers to stop experience collection... (29650 times) [2024-06-18 11:27:17,300][12862] Signal inference workers to resume experience collection... 
(29650 times) [2024-06-18 11:27:17,331][12883] InferenceWorker_p0-w0: stopping experience collection (29650 times) [2024-06-18 11:27:17,331][12883] InferenceWorker_p0-w0: resuming experience collection (29650 times) [2024-06-18 11:27:19,815][12883] Updated weights for policy 0, policy_version 123483 (0.0026) [2024-06-18 11:27:21,996][12645] Fps is (10 sec: 40951.5, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 2023194624. Throughput: 0: 42378.9. Samples: 2023323460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:21,996][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 11:27:24,551][12883] Updated weights for policy 0, policy_version 123493 (0.0027) [2024-06-18 11:27:26,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2023440384. Throughput: 0: 42607.4. Samples: 2023574540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:26,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 11:27:27,303][12883] Updated weights for policy 0, policy_version 123503 (0.0041) [2024-06-18 11:27:31,994][12645] Fps is (10 sec: 40969.1, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2023604224. Throughput: 0: 42672.3. Samples: 2023713880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:31,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 11:27:32,217][12883] Updated weights for policy 0, policy_version 123513 (0.0046) [2024-06-18 11:27:34,848][12883] Updated weights for policy 0, policy_version 123523 (0.0042) [2024-06-18 11:27:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2023849984. Throughput: 0: 42477.7. Samples: 2023964900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:36,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 11:27:39,868][12883] Updated weights for policy 0, policy_version 123533 (0.0038) [2024-06-18 11:27:41,994][12645] Fps is (10 sec: 49151.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2024095744. Throughput: 0: 42696.4. Samples: 2024216300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:41,994][12645] Avg episode reward: [(0, '0.162')] [2024-06-18 11:27:43,152][12883] Updated weights for policy 0, policy_version 123543 (0.0037) [2024-06-18 11:27:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 41779.4, 300 sec: 42432.7). Total num frames: 2024243200. Throughput: 0: 42415.1. Samples: 2024343480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:46,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 11:27:47,750][12883] Updated weights for policy 0, policy_version 123553 (0.0034) [2024-06-18 11:27:50,742][12883] Updated weights for policy 0, policy_version 123563 (0.0032) [2024-06-18 11:27:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42543.5). Total num frames: 2024505344. Throughput: 0: 42649.9. Samples: 2024601240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:51,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 11:27:55,297][12883] Updated weights for policy 0, policy_version 123573 (0.0039) [2024-06-18 11:27:56,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42328.3, 300 sec: 42542.9). Total num frames: 2024701952. Throughput: 0: 42894.9. Samples: 2024858460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:27:56,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 11:27:58,695][12883] Updated weights for policy 0, policy_version 123583 (0.0047) [2024-06-18 11:28:01,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2024898560. Throughput: 0: 42652.9. Samples: 2024984680. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 11:28:01,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 11:28:02,883][12883] Updated weights for policy 0, policy_version 123593 (0.0034) [2024-06-18 11:28:06,181][12883] Updated weights for policy 0, policy_version 123603 (0.0025) [2024-06-18 11:28:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43147.1, 300 sec: 42487.3). Total num frames: 2025144320. Throughput: 0: 42607.9. Samples: 2025240720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:06,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 11:28:10,519][12883] Updated weights for policy 0, policy_version 123613 (0.0024) [2024-06-18 11:28:11,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2025324544. Throughput: 0: 42734.1. Samples: 2025497580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:11,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 11:28:13,759][12883] Updated weights for policy 0, policy_version 123623 (0.0034) [2024-06-18 11:28:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2025553920. Throughput: 0: 42414.2. Samples: 2025622520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:16,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 11:28:18,030][12883] Updated weights for policy 0, policy_version 123633 (0.0031) [2024-06-18 11:28:21,901][12883] Updated weights for policy 0, policy_version 123643 (0.0031) [2024-06-18 11:28:21,996][12645] Fps is (10 sec: 44227.8, 60 sec: 42871.5, 300 sec: 42431.5). Total num frames: 2025766912. Throughput: 0: 42502.8. Samples: 2025877620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:21,996][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 11:28:24,329][12862] Signal inference workers to stop experience collection... 
(29700 times) [2024-06-18 11:28:24,329][12862] Signal inference workers to resume experience collection... (29700 times) [2024-06-18 11:28:24,372][12883] InferenceWorker_p0-w0: stopping experience collection (29700 times) [2024-06-18 11:28:24,372][12883] InferenceWorker_p0-w0: resuming experience collection (29700 times) [2024-06-18 11:28:25,644][12883] Updated weights for policy 0, policy_version 123653 (0.0028) [2024-06-18 11:28:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2025963520. Throughput: 0: 42632.9. Samples: 2026134780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:26,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 11:28:29,462][12883] Updated weights for policy 0, policy_version 123663 (0.0038) [2024-06-18 11:28:31,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2026176512. Throughput: 0: 42606.7. Samples: 2026260780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:31,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 11:28:33,341][12883] Updated weights for policy 0, policy_version 123673 (0.0054) [2024-06-18 11:28:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2026405888. Throughput: 0: 42677.5. Samples: 2026521720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:36,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 11:28:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123683_2026422272.pth... [2024-06-18 11:28:37,019][12883] Updated weights for policy 0, policy_version 123683 (0.0029) [2024-06-18 11:28:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123061_2016231424.pth [2024-06-18 11:28:40,928][12883] Updated weights for policy 0, policy_version 123693 (0.0038) [2024-06-18 11:28:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42598.4). 
Total num frames: 2026618880. Throughput: 0: 42577.3. Samples: 2026774440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:41,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 11:28:44,534][12883] Updated weights for policy 0, policy_version 123703 (0.0039) [2024-06-18 11:28:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2026815488. Throughput: 0: 42632.8. Samples: 2026903160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:46,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 11:28:48,630][12883] Updated weights for policy 0, policy_version 123713 (0.0037) [2024-06-18 11:28:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2027044864. Throughput: 0: 42757.0. Samples: 2027164780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:51,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 11:28:52,331][12883] Updated weights for policy 0, policy_version 123723 (0.0031) [2024-06-18 11:28:56,156][12883] Updated weights for policy 0, policy_version 123733 (0.0026) [2024-06-18 11:28:56,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2027274240. Throughput: 0: 42736.7. Samples: 2027420720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:28:56,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 11:29:00,430][12883] Updated weights for policy 0, policy_version 123743 (0.0034) [2024-06-18 11:29:01,994][12645] Fps is (10 sec: 42595.2, 60 sec: 42871.0, 300 sec: 42542.8). Total num frames: 2027470848. Throughput: 0: 42883.8. Samples: 2027552320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 11:29:01,995][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 11:29:03,696][12883] Updated weights for policy 0, policy_version 123753 (0.0043) [2024-06-18 11:29:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.3). 
Total num frames: 2027683840. Throughput: 0: 42895.1. Samples: 2027807800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:06,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 11:29:08,063][12883] Updated weights for policy 0, policy_version 123763 (0.0048) [2024-06-18 11:29:11,619][12883] Updated weights for policy 0, policy_version 123773 (0.0051) [2024-06-18 11:29:11,994][12645] Fps is (10 sec: 42600.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2027896832. Throughput: 0: 42737.7. Samples: 2028057980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:11,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 11:29:16,130][12883] Updated weights for policy 0, policy_version 123783 (0.0033) [2024-06-18 11:29:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 2028109824. Throughput: 0: 42704.5. Samples: 2028182480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:16,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 11:29:19,215][12883] Updated weights for policy 0, policy_version 123793 (0.0036) [2024-06-18 11:29:21,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2028339200. Throughput: 0: 42643.5. Samples: 2028440680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:21,994][12645] Avg episode reward: [(0, '0.608')] [2024-06-18 11:29:23,524][12883] Updated weights for policy 0, policy_version 123803 (0.0033) [2024-06-18 11:29:26,757][12883] Updated weights for policy 0, policy_version 123813 (0.0039) [2024-06-18 11:29:26,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2028552192. Throughput: 0: 42662.7. Samples: 2028694260. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:26,994][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 11:29:30,931][12883] Updated weights for policy 0, policy_version 123823 (0.0030) [2024-06-18 11:29:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2028748800. Throughput: 0: 42800.0. Samples: 2028829160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:31,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 11:29:34,475][12883] Updated weights for policy 0, policy_version 123833 (0.0040) [2024-06-18 11:29:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2028945408. Throughput: 0: 42663.5. Samples: 2029084640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:36,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 11:29:38,650][12883] Updated weights for policy 0, policy_version 123843 (0.0030) [2024-06-18 11:29:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2029174784. Throughput: 0: 42493.8. Samples: 2029332940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:41,994][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 11:29:42,441][12883] Updated weights for policy 0, policy_version 123853 (0.0043) [2024-06-18 11:29:46,710][12883] Updated weights for policy 0, policy_version 123863 (0.0046) [2024-06-18 11:29:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2029387776. Throughput: 0: 42449.5. Samples: 2029462520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:46,994][12645] Avg episode reward: [(0, '0.684')] [2024-06-18 11:29:49,965][12883] Updated weights for policy 0, policy_version 123873 (0.0036) [2024-06-18 11:29:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2029584384. Throughput: 0: 42383.1. Samples: 2029715040. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:51,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 11:29:54,353][12883] Updated weights for policy 0, policy_version 123883 (0.0041) [2024-06-18 11:29:54,620][12862] Signal inference workers to stop experience collection... (29750 times) [2024-06-18 11:29:54,620][12862] Signal inference workers to resume experience collection... (29750 times) [2024-06-18 11:29:54,649][12883] InferenceWorker_p0-w0: stopping experience collection (29750 times) [2024-06-18 11:29:54,649][12883] InferenceWorker_p0-w0: resuming experience collection (29750 times) [2024-06-18 11:29:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2029830144. Throughput: 0: 42471.7. Samples: 2029969200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:29:56,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 11:29:57,519][12883] Updated weights for policy 0, policy_version 123893 (0.0031) [2024-06-18 11:30:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.8, 300 sec: 42542.9). Total num frames: 2030010368. Throughput: 0: 42531.6. Samples: 2030096400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:30:01,994][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 11:30:02,103][12883] Updated weights for policy 0, policy_version 123903 (0.0034) [2024-06-18 11:30:05,223][12883] Updated weights for policy 0, policy_version 123913 (0.0035) [2024-06-18 11:30:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 2030223360. Throughput: 0: 42404.4. Samples: 2030348880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 11:30:06,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 11:30:09,795][12883] Updated weights for policy 0, policy_version 123923 (0.0048) [2024-06-18 11:30:11,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2030469120. 
Throughput: 0: 42506.1. Samples: 2030607040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:11,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 11:30:13,238][12883] Updated weights for policy 0, policy_version 123933 (0.0046) [2024-06-18 11:30:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2030649344. Throughput: 0: 42372.0. Samples: 2030735900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:16,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 11:30:17,374][12883] Updated weights for policy 0, policy_version 123943 (0.0027) [2024-06-18 11:30:20,921][12883] Updated weights for policy 0, policy_version 123953 (0.0047) [2024-06-18 11:30:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2030878720. Throughput: 0: 42377.2. Samples: 2030991620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:21,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 11:30:24,966][12883] Updated weights for policy 0, policy_version 123963 (0.0024) [2024-06-18 11:30:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2031091712. Throughput: 0: 42495.5. Samples: 2031245240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:26,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 11:30:28,708][12883] Updated weights for policy 0, policy_version 123973 (0.0028) [2024-06-18 11:30:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2031304704. Throughput: 0: 42472.0. Samples: 2031373760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:31,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 11:30:32,582][12883] Updated weights for policy 0, policy_version 123983 (0.0029) [2024-06-18 11:30:36,267][12883] Updated weights for policy 0, policy_version 123993 (0.0043) [2024-06-18 11:30:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.5). Total num frames: 2031501312. Throughput: 0: 42532.0. Samples: 2031628980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:36,994][12645] Avg episode reward: [(0, '0.718')] [2024-06-18 11:30:37,097][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123994_2031517696.pth... [2024-06-18 11:30:37,158][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123370_2021294080.pth [2024-06-18 11:30:40,704][12883] Updated weights for policy 0, policy_version 124003 (0.0035) [2024-06-18 11:30:41,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42543.7). Total num frames: 2031730688. Throughput: 0: 42491.3. Samples: 2031881320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:41,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 11:30:44,234][12883] Updated weights for policy 0, policy_version 124013 (0.0038) [2024-06-18 11:30:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 2031943680. Throughput: 0: 42621.3. Samples: 2032014360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:46,994][12645] Avg episode reward: [(0, '0.319')] [2024-06-18 11:30:48,171][12883] Updated weights for policy 0, policy_version 124023 (0.0030) [2024-06-18 11:30:51,902][12883] Updated weights for policy 0, policy_version 124033 (0.0052) [2024-06-18 11:30:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2032156672. Throughput: 0: 42535.1. Samples: 2032262960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:51,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 11:30:55,693][12883] Updated weights for policy 0, policy_version 124043 (0.0036) [2024-06-18 11:30:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2032369664. Throughput: 0: 42621.9. Samples: 2032525020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:30:56,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 11:30:59,409][12883] Updated weights for policy 0, policy_version 124053 (0.0033) [2024-06-18 11:31:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2032566272. Throughput: 0: 42578.7. Samples: 2032651940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:31:01,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 11:31:03,462][12883] Updated weights for policy 0, policy_version 124063 (0.0031) [2024-06-18 11:31:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2032795648. Throughput: 0: 42502.7. Samples: 2032904240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 11:31:06,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 11:31:07,297][12883] Updated weights for policy 0, policy_version 124073 (0.0027) [2024-06-18 11:31:11,288][12883] Updated weights for policy 0, policy_version 124083 (0.0034) [2024-06-18 11:31:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42487.6). Total num frames: 2032992256. Throughput: 0: 42616.1. Samples: 2033162960. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:11,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 11:31:15,029][12883] Updated weights for policy 0, policy_version 124093 (0.0034) [2024-06-18 11:31:15,283][12862] Signal inference workers to stop experience collection... 
(29800 times) [2024-06-18 11:31:15,283][12862] Signal inference workers to resume experience collection... (29800 times) [2024-06-18 11:31:15,307][12883] InferenceWorker_p0-w0: stopping experience collection (29800 times) [2024-06-18 11:31:15,307][12883] InferenceWorker_p0-w0: resuming experience collection (29800 times) [2024-06-18 11:31:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2033205248. Throughput: 0: 42638.6. Samples: 2033292500. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:16,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 11:31:18,816][12883] Updated weights for policy 0, policy_version 124103 (0.0032) [2024-06-18 11:31:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2033434624. Throughput: 0: 42503.8. Samples: 2033541660. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:21,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 11:31:22,646][12883] Updated weights for policy 0, policy_version 124113 (0.0031) [2024-06-18 11:31:26,420][12883] Updated weights for policy 0, policy_version 124123 (0.0037) [2024-06-18 11:31:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2033647616. Throughput: 0: 42618.0. Samples: 2033799120. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:26,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 11:31:30,298][12883] Updated weights for policy 0, policy_version 124133 (0.0026) [2024-06-18 11:31:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2033844224. Throughput: 0: 42588.3. Samples: 2033930840. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:31,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 11:31:33,982][12883] Updated weights for policy 0, policy_version 124143 (0.0036) [2024-06-18 11:31:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2034073600. Throughput: 0: 42679.5. Samples: 2034183540. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:36,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 11:31:38,067][12883] Updated weights for policy 0, policy_version 124153 (0.0026) [2024-06-18 11:31:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42487.4). Total num frames: 2034270208. Throughput: 0: 42412.8. Samples: 2034433600. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:41,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 11:31:42,194][12883] Updated weights for policy 0, policy_version 124163 (0.0027) [2024-06-18 11:31:46,070][12883] Updated weights for policy 0, policy_version 124173 (0.0023) [2024-06-18 11:31:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2034483200. Throughput: 0: 42425.3. Samples: 2034561080. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:46,994][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 11:31:49,731][12883] Updated weights for policy 0, policy_version 124183 (0.0046) [2024-06-18 11:31:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42543.5). Total num frames: 2034712576. Throughput: 0: 42542.1. Samples: 2034818640. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:51,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 11:31:53,850][12883] Updated weights for policy 0, policy_version 124193 (0.0030) [2024-06-18 11:31:56,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2034909184. Throughput: 0: 42344.5. Samples: 2035068560. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:31:56,996][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 11:31:57,374][12883] Updated weights for policy 0, policy_version 124203 (0.0028) [2024-06-18 11:32:01,822][12883] Updated weights for policy 0, policy_version 124213 (0.0038) [2024-06-18 11:32:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 2035122176. Throughput: 0: 42308.0. Samples: 2035196360. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:32:01,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 11:32:04,942][12883] Updated weights for policy 0, policy_version 124223 (0.0025) [2024-06-18 11:32:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2035351552. Throughput: 0: 42709.0. Samples: 2035463560. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:32:06,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 11:32:09,299][12883] Updated weights for policy 0, policy_version 124233 (0.0049) [2024-06-18 11:32:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2035564544. Throughput: 0: 42435.0. Samples: 2035708700. Policy #0 lag: (min: 2.0, avg: 9.8, max: 21.0) [2024-06-18 11:32:11,994][12645] Avg episode reward: [(0, '0.707')] [2024-06-18 11:32:12,542][12883] Updated weights for policy 0, policy_version 124243 (0.0028) [2024-06-18 11:32:16,837][12883] Updated weights for policy 0, policy_version 124253 (0.0032) [2024-06-18 11:32:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 2035761152. Throughput: 0: 42463.6. Samples: 2035841700. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:32:16,994][12645] Avg episode reward: [(0, '0.686')] [2024-06-18 11:32:20,209][12883] Updated weights for policy 0, policy_version 124263 (0.0044) [2024-06-18 11:32:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2035974144. Throughput: 0: 42537.4. Samples: 2036097720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:32:21,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 11:32:24,334][12883] Updated weights for policy 0, policy_version 124273 (0.0041) [2024-06-18 11:32:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2036203520. Throughput: 0: 42702.6. Samples: 2036355220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:32:26,995][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 11:32:27,868][12883] Updated weights for policy 0, policy_version 124283 (0.0042) [2024-06-18 11:32:31,959][12883] Updated weights for policy 0, policy_version 124293 (0.0030) [2024-06-18 11:32:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2036416512. Throughput: 0: 42729.9. Samples: 2036483920. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:32:31,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 11:32:35,634][12883] Updated weights for policy 0, policy_version 124303 (0.0031) [2024-06-18 11:32:36,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2036613120. Throughput: 0: 42587.7. Samples: 2036735080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:32:36,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 11:32:37,077][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124306_2036629504.pth... 
[2024-06-18 11:32:37,128][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123683_2026422272.pth [2024-06-18 11:32:39,628][12883] Updated weights for policy 0, policy_version 124313 (0.0043) [2024-06-18 11:32:41,301][12862] Signal inference workers to stop experience collection... (29850 times) [2024-06-18 11:32:41,302][12862] Signal inference workers to resume experience collection... (29850 times) [2024-06-18 11:32:41,348][12883] InferenceWorker_p0-w0: stopping experience collection (29850 times) [2024-06-18 11:32:41,348][12883] InferenceWorker_p0-w0: resuming experience collection (29850 times) [2024-06-18 11:32:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2036842496. Throughput: 0: 42720.3. Samples: 2036990880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:32:41,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 11:32:43,259][12883] Updated weights for policy 0, policy_version 124323 (0.0029) [2024-06-18 11:32:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2037039104. Throughput: 0: 42709.9. Samples: 2037118300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:32:46,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 11:32:47,221][12883] Updated weights for policy 0, policy_version 124333 (0.0034) [2024-06-18 11:32:50,868][12883] Updated weights for policy 0, policy_version 124343 (0.0033) [2024-06-18 11:32:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2037252096. Throughput: 0: 42484.0. Samples: 2037375340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:32:51,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 11:32:54,894][12883] Updated weights for policy 0, policy_version 124353 (0.0025) [2024-06-18 11:32:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 2037497856. 
Throughput: 0: 42711.6. Samples: 2037630720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:32:56,996][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 11:32:58,507][12883] Updated weights for policy 0, policy_version 124363 (0.0035) [2024-06-18 11:33:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2037661696. Throughput: 0: 42712.8. Samples: 2037763780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:33:01,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 11:33:02,852][12883] Updated weights for policy 0, policy_version 124373 (0.0034) [2024-06-18 11:33:06,157][12883] Updated weights for policy 0, policy_version 124383 (0.0036) [2024-06-18 11:33:06,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2037891072. Throughput: 0: 42571.2. Samples: 2038013420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:33:06,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 11:33:10,686][12883] Updated weights for policy 0, policy_version 124393 (0.0026) [2024-06-18 11:33:11,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2038120448. Throughput: 0: 42366.4. Samples: 2038261700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 11:33:11,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 11:33:13,903][12883] Updated weights for policy 0, policy_version 124403 (0.0026) [2024-06-18 11:33:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42487.7). Total num frames: 2038300672. Throughput: 0: 42480.0. Samples: 2038395520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:33:16,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 11:33:18,626][12883] Updated weights for policy 0, policy_version 124413 (0.0032) [2024-06-18 11:33:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2038530048. 
Throughput: 0: 42370.2. Samples: 2038641740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:33:21,994][12645] Avg episode reward: [(0, '0.586')] [2024-06-18 11:33:22,059][12883] Updated weights for policy 0, policy_version 124423 (0.0038) [2024-06-18 11:33:26,313][12883] Updated weights for policy 0, policy_version 124433 (0.0045) [2024-06-18 11:33:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2038743040. Throughput: 0: 42447.2. Samples: 2038901000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:33:26,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 11:33:29,561][12883] Updated weights for policy 0, policy_version 124443 (0.0024) [2024-06-18 11:33:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 2038923264. Throughput: 0: 42431.5. Samples: 2039027720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:33:31,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 11:33:33,957][12883] Updated weights for policy 0, policy_version 124453 (0.0036) [2024-06-18 11:33:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2039169024. Throughput: 0: 42447.6. Samples: 2039285480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:33:36,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 11:33:37,202][12883] Updated weights for policy 0, policy_version 124463 (0.0036) [2024-06-18 11:33:41,511][12883] Updated weights for policy 0, policy_version 124473 (0.0026) [2024-06-18 11:33:41,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2039398400. Throughput: 0: 42519.5. Samples: 2039544100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:33:41,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 11:33:44,793][12883] Updated weights for policy 0, policy_version 124483 (0.0044) [2024-06-18 11:33:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2039578624. Throughput: 0: 42424.9. Samples: 2039672900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:33:46,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 11:33:49,163][12883] Updated weights for policy 0, policy_version 124493 (0.0045) [2024-06-18 11:33:50,259][12862] Signal inference workers to stop experience collection... (29900 times) [2024-06-18 11:33:50,259][12862] Signal inference workers to resume experience collection... (29900 times) [2024-06-18 11:33:50,297][12883] InferenceWorker_p0-w0: stopping experience collection (29900 times) [2024-06-18 11:33:50,297][12883] InferenceWorker_p0-w0: resuming experience collection (29900 times) [2024-06-18 11:33:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2039824384. Throughput: 0: 42507.1. Samples: 2039926240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:33:51,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 11:33:52,290][12883] Updated weights for policy 0, policy_version 124503 (0.0036) [2024-06-18 11:33:56,770][12883] Updated weights for policy 0, policy_version 124513 (0.0042) [2024-06-18 11:33:56,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42598.5). Total num frames: 2040037376. Throughput: 0: 42794.3. Samples: 2040187440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:33:56,994][12645] Avg episode reward: [(0, '0.613')] [2024-06-18 11:33:59,885][12883] Updated weights for policy 0, policy_version 124523 (0.0033) [2024-06-18 11:34:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2040217600. 
Throughput: 0: 42552.3. Samples: 2040310380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:34:01,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 11:34:04,518][12883] Updated weights for policy 0, policy_version 124533 (0.0050) [2024-06-18 11:34:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2040463360. Throughput: 0: 42635.5. Samples: 2040560340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:34:06,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 11:34:07,502][12883] Updated weights for policy 0, policy_version 124543 (0.0040) [2024-06-18 11:34:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2040643584. Throughput: 0: 42648.9. Samples: 2040820200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:34:11,994][12645] Avg episode reward: [(0, '0.639')] [2024-06-18 11:34:12,385][12883] Updated weights for policy 0, policy_version 124553 (0.0032) [2024-06-18 11:34:15,601][12883] Updated weights for policy 0, policy_version 124563 (0.0027) [2024-06-18 11:34:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2040856576. Throughput: 0: 42571.6. Samples: 2040943440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 11:34:16,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 11:34:19,997][12883] Updated weights for policy 0, policy_version 124573 (0.0031) [2024-06-18 11:34:21,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 2041085952. Throughput: 0: 42545.9. Samples: 2041200140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:34:21,997][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 11:34:23,383][12883] Updated weights for policy 0, policy_version 124583 (0.0030) [2024-06-18 11:34:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2041282560. 
Throughput: 0: 42522.3. Samples: 2041457600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:34:26,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 11:34:28,026][12883] Updated weights for policy 0, policy_version 124593 (0.0033) [2024-06-18 11:34:30,968][12883] Updated weights for policy 0, policy_version 124603 (0.0045) [2024-06-18 11:34:31,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2041495552. Throughput: 0: 42422.2. Samples: 2041581900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:34:31,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 11:34:35,514][12883] Updated weights for policy 0, policy_version 124613 (0.0037) [2024-06-18 11:34:36,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2041741312. Throughput: 0: 42540.8. Samples: 2041840580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:34:36,994][12645] Avg episode reward: [(0, '0.773')] [2024-06-18 11:34:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124618_2041741312.pth... [2024-06-18 11:34:37,057][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000123994_2031517696.pth [2024-06-18 11:34:38,640][12883] Updated weights for policy 0, policy_version 124623 (0.0038) [2024-06-18 11:34:41,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2041937920. Throughput: 0: 42415.1. Samples: 2042096220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:34:41,997][12645] Avg episode reward: [(0, '0.773')] [2024-06-18 11:34:43,494][12883] Updated weights for policy 0, policy_version 124633 (0.0033) [2024-06-18 11:34:46,825][12883] Updated weights for policy 0, policy_version 124643 (0.0030) [2024-06-18 11:34:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2042150912. Throughput: 0: 42503.2. 
Samples: 2042223020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:34:46,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 11:34:51,301][12883] Updated weights for policy 0, policy_version 124653 (0.0040) [2024-06-18 11:34:51,994][12645] Fps is (10 sec: 40969.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 2042347520. Throughput: 0: 42671.3. Samples: 2042480540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:34:51,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 11:34:54,447][12883] Updated weights for policy 0, policy_version 124663 (0.0043) [2024-06-18 11:34:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2042576896. Throughput: 0: 42497.2. Samples: 2042732580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:34:56,994][12645] Avg episode reward: [(0, '0.772')] [2024-06-18 11:34:58,751][12883] Updated weights for policy 0, policy_version 124673 (0.0036) [2024-06-18 11:35:01,978][12883] Updated weights for policy 0, policy_version 124683 (0.0035) [2024-06-18 11:35:01,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2042806272. Throughput: 0: 42737.7. Samples: 2042866640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:35:01,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 11:35:06,086][12883] Updated weights for policy 0, policy_version 124693 (0.0027) [2024-06-18 11:35:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 2042986496. Throughput: 0: 42718.2. Samples: 2043122360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:35:06,994][12645] Avg episode reward: [(0, '0.698')] [2024-06-18 11:35:09,479][12883] Updated weights for policy 0, policy_version 124703 (0.0038) [2024-06-18 11:35:11,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2043215872. Throughput: 0: 42660.9. 
Samples: 2043377340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:35:11,994][12645] Avg episode reward: [(0, '0.677')] [2024-06-18 11:35:13,537][12883] Updated weights for policy 0, policy_version 124713 (0.0024) [2024-06-18 11:35:16,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2043445248. Throughput: 0: 42909.3. Samples: 2043512820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 11:35:16,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 11:35:17,058][12883] Updated weights for policy 0, policy_version 124723 (0.0028) [2024-06-18 11:35:19,560][12862] Signal inference workers to stop experience collection... (29950 times) [2024-06-18 11:35:19,592][12883] InferenceWorker_p0-w0: stopping experience collection (29950 times) [2024-06-18 11:35:19,612][12862] Signal inference workers to resume experience collection... (29950 times) [2024-06-18 11:35:19,616][12883] InferenceWorker_p0-w0: resuming experience collection (29950 times) [2024-06-18 11:35:21,074][12883] Updated weights for policy 0, policy_version 124733 (0.0043) [2024-06-18 11:35:22,000][12645] Fps is (10 sec: 42571.4, 60 sec: 42595.6, 300 sec: 42542.0). Total num frames: 2043641856. Throughput: 0: 42701.7. Samples: 2043762420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:35:22,000][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 11:35:24,835][12883] Updated weights for policy 0, policy_version 124743 (0.0038) [2024-06-18 11:35:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2043871232. Throughput: 0: 42692.0. Samples: 2044017260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:35:26,994][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 11:35:28,900][12883] Updated weights for policy 0, policy_version 124753 (0.0036) [2024-06-18 11:35:31,994][12645] Fps is (10 sec: 45903.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2044100608. 
Throughput: 0: 42821.8. Samples: 2044150000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:35:31,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 11:35:32,809][12883] Updated weights for policy 0, policy_version 124763 (0.0036) [2024-06-18 11:35:36,394][12883] Updated weights for policy 0, policy_version 124773 (0.0028) [2024-06-18 11:35:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2044280832. Throughput: 0: 42680.7. Samples: 2044401180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:35:36,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 11:35:40,832][12883] Updated weights for policy 0, policy_version 124783 (0.0037) [2024-06-18 11:35:41,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42871.5, 300 sec: 42598.1). Total num frames: 2044510208. Throughput: 0: 42794.0. Samples: 2044658400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:35:41,996][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 11:35:44,016][12883] Updated weights for policy 0, policy_version 124793 (0.0040) [2024-06-18 11:35:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2044690432. Throughput: 0: 42687.6. Samples: 2044787580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:35:46,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 11:35:48,423][12883] Updated weights for policy 0, policy_version 124803 (0.0032) [2024-06-18 11:35:51,545][12883] Updated weights for policy 0, policy_version 124813 (0.0031) [2024-06-18 11:35:51,994][12645] Fps is (10 sec: 42607.8, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2044936192. Throughput: 0: 42595.0. Samples: 2045039140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:35:51,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 11:35:56,266][12883] Updated weights for policy 0, policy_version 124823 (0.0034) [2024-06-18 11:35:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2045132800. Throughput: 0: 42662.1. Samples: 2045297140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:35:56,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 11:35:59,151][12883] Updated weights for policy 0, policy_version 124833 (0.0028) [2024-06-18 11:36:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2045329408. Throughput: 0: 42503.5. Samples: 2045425480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:36:01,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 11:36:03,910][12883] Updated weights for policy 0, policy_version 124843 (0.0035) [2024-06-18 11:36:06,755][12883] Updated weights for policy 0, policy_version 124853 (0.0026) [2024-06-18 11:36:06,994][12645] Fps is (10 sec: 45874.3, 60 sec: 43417.4, 300 sec: 42709.4). Total num frames: 2045591552. Throughput: 0: 42483.0. Samples: 2045673900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:36:06,994][12645] Avg episode reward: [(0, '0.672')] [2024-06-18 11:36:11,559][12883] Updated weights for policy 0, policy_version 124863 (0.0029) [2024-06-18 11:36:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2045771776. Throughput: 0: 42570.6. Samples: 2045932940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:36:11,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 11:36:14,873][12883] Updated weights for policy 0, policy_version 124873 (0.0029) [2024-06-18 11:36:16,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2045984768. Throughput: 0: 42298.5. Samples: 2046053440. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 11:36:16,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 11:36:19,315][12883] Updated weights for policy 0, policy_version 124883 (0.0033) [2024-06-18 11:36:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42876.0, 300 sec: 42598.4). Total num frames: 2046214144. Throughput: 0: 42533.9. Samples: 2046315200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:36:21,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 11:36:22,570][12883] Updated weights for policy 0, policy_version 124893 (0.0040) [2024-06-18 11:36:26,928][12883] Updated weights for policy 0, policy_version 124903 (0.0046) [2024-06-18 11:36:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2046410752. Throughput: 0: 42589.6. Samples: 2046574840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:36:26,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 11:36:30,210][12883] Updated weights for policy 0, policy_version 124913 (0.0035) [2024-06-18 11:36:31,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2046623744. Throughput: 0: 42347.5. Samples: 2046693220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:36:31,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 11:36:33,827][12862] Signal inference workers to stop experience collection... (30000 times) [2024-06-18 11:36:33,827][12862] Signal inference workers to resume experience collection... (30000 times) [2024-06-18 11:36:33,846][12883] InferenceWorker_p0-w0: stopping experience collection (30000 times) [2024-06-18 11:36:33,846][12883] InferenceWorker_p0-w0: resuming experience collection (30000 times) [2024-06-18 11:36:34,578][12883] Updated weights for policy 0, policy_version 124923 (0.0046) [2024-06-18 11:36:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2046836736. 
Throughput: 0: 42636.4. Samples: 2046957780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:36:36,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 11:36:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124930_2046853120.pth... [2024-06-18 11:36:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124306_2036629504.pth [2024-06-18 11:36:37,861][12883] Updated weights for policy 0, policy_version 124933 (0.0022) [2024-06-18 11:36:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 2047049728. Throughput: 0: 42650.2. Samples: 2047216400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:36:41,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 11:36:42,171][12883] Updated weights for policy 0, policy_version 124943 (0.0033) [2024-06-18 11:36:45,564][12883] Updated weights for policy 0, policy_version 124953 (0.0034) [2024-06-18 11:36:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2047279104. Throughput: 0: 42592.0. Samples: 2047342120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:36:46,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 11:36:49,714][12883] Updated weights for policy 0, policy_version 124963 (0.0025) [2024-06-18 11:36:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2047475712. Throughput: 0: 42775.8. Samples: 2047598800. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:36:51,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 11:36:53,061][12883] Updated weights for policy 0, policy_version 124973 (0.0035) [2024-06-18 11:36:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2047672320. Throughput: 0: 42640.0. Samples: 2047851740. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:36:56,994][12645] Avg episode reward: [(0, '0.621')] [2024-06-18 11:36:57,762][12883] Updated weights for policy 0, policy_version 124983 (0.0036) [2024-06-18 11:37:00,812][12883] Updated weights for policy 0, policy_version 124993 (0.0030) [2024-06-18 11:37:01,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2047918080. Throughput: 0: 42807.2. Samples: 2047979760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:01,994][12645] Avg episode reward: [(0, '0.767')] [2024-06-18 11:37:05,762][12883] Updated weights for policy 0, policy_version 125003 (0.0034) [2024-06-18 11:37:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.4, 300 sec: 42487.3). Total num frames: 2048098304. Throughput: 0: 42751.5. Samples: 2048239020. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:06,994][12645] Avg episode reward: [(0, '0.692')] [2024-06-18 11:37:08,488][12883] Updated weights for policy 0, policy_version 125013 (0.0044) [2024-06-18 11:37:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2048327680. Throughput: 0: 42628.0. Samples: 2048493100. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:11,994][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 11:37:13,458][12883] Updated weights for policy 0, policy_version 125023 (0.0036) [2024-06-18 11:37:16,030][12883] Updated weights for policy 0, policy_version 125033 (0.0035) [2024-06-18 11:37:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2048557056. Throughput: 0: 42825.0. Samples: 2048620340. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:16,994][12645] Avg episode reward: [(0, '0.699')] [2024-06-18 11:37:21,135][12883] Updated weights for policy 0, policy_version 125043 (0.0035) [2024-06-18 11:37:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 2048753664. Throughput: 0: 42871.4. Samples: 2048887000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:21,994][12645] Avg episode reward: [(0, '0.629')] [2024-06-18 11:37:23,579][12883] Updated weights for policy 0, policy_version 125053 (0.0039) [2024-06-18 11:37:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2048983040. Throughput: 0: 42635.9. Samples: 2049135020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:26,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 11:37:28,620][12883] Updated weights for policy 0, policy_version 125063 (0.0031) [2024-06-18 11:37:31,412][12883] Updated weights for policy 0, policy_version 125073 (0.0033) [2024-06-18 11:37:31,996][12645] Fps is (10 sec: 47503.8, 60 sec: 43416.1, 300 sec: 42764.7). Total num frames: 2049228800. Throughput: 0: 42756.6. Samples: 2049266260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:31,996][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 11:37:36,101][12883] Updated weights for policy 0, policy_version 125083 (0.0033) [2024-06-18 11:37:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2049392640. Throughput: 0: 42893.1. Samples: 2049529000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:36,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 11:37:39,104][12883] Updated weights for policy 0, policy_version 125093 (0.0034) [2024-06-18 11:37:41,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2049622016. Throughput: 0: 42882.7. Samples: 2049781460. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:41,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 11:37:43,728][12883] Updated weights for policy 0, policy_version 125103 (0.0027) [2024-06-18 11:37:46,738][12883] Updated weights for policy 0, policy_version 125113 (0.0043) [2024-06-18 11:37:46,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2049867776. Throughput: 0: 42968.9. Samples: 2049913360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:46,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 11:37:51,266][12883] Updated weights for policy 0, policy_version 125123 (0.0031) [2024-06-18 11:37:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 2050048000. Throughput: 0: 42939.9. Samples: 2050171320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:51,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 11:37:54,226][12883] Updated weights for policy 0, policy_version 125133 (0.0043) [2024-06-18 11:37:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2050260992. Throughput: 0: 42992.9. Samples: 2050427780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:37:56,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 11:37:57,439][12862] Signal inference workers to stop experience collection... (30050 times) [2024-06-18 11:37:57,440][12862] Signal inference workers to resume experience collection... 
(30050 times) [2024-06-18 11:37:57,455][12883] InferenceWorker_p0-w0: stopping experience collection (30050 times) [2024-06-18 11:37:57,455][12883] InferenceWorker_p0-w0: resuming experience collection (30050 times) [2024-06-18 11:37:58,722][12883] Updated weights for policy 0, policy_version 125143 (0.0039) [2024-06-18 11:38:01,781][12883] Updated weights for policy 0, policy_version 125153 (0.0039) [2024-06-18 11:38:01,994][12645] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2050506752. Throughput: 0: 43011.6. Samples: 2050555860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:38:01,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 11:38:06,310][12883] Updated weights for policy 0, policy_version 125163 (0.0026) [2024-06-18 11:38:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2050686976. Throughput: 0: 42859.7. Samples: 2050815680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:38:06,996][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 11:38:09,471][12883] Updated weights for policy 0, policy_version 125173 (0.0025) [2024-06-18 11:38:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2050899968. Throughput: 0: 42924.1. Samples: 2051066600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:38:11,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 11:38:14,136][12883] Updated weights for policy 0, policy_version 125183 (0.0041) [2024-06-18 11:38:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2051112960. Throughput: 0: 42897.3. Samples: 2051196540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:38:16,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 11:38:17,301][12883] Updated weights for policy 0, policy_version 125193 (0.0029) [2024-06-18 11:38:21,661][12883] Updated weights for policy 0, policy_version 125203 (0.0039) [2024-06-18 11:38:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2051325952. Throughput: 0: 42786.3. Samples: 2051454380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:38:21,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 11:38:24,967][12883] Updated weights for policy 0, policy_version 125213 (0.0030) [2024-06-18 11:38:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2051555328. Throughput: 0: 42742.7. Samples: 2051704880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:38:26,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 11:38:29,368][12883] Updated weights for policy 0, policy_version 125223 (0.0037) [2024-06-18 11:38:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 2051751936. Throughput: 0: 42843.1. Samples: 2051841300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:38:31,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 11:38:32,763][12883] Updated weights for policy 0, policy_version 125233 (0.0040) [2024-06-18 11:38:36,930][12883] Updated weights for policy 0, policy_version 125243 (0.0039) [2024-06-18 11:38:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 2051981312. Throughput: 0: 42791.2. Samples: 2052096920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:38:36,994][12645] Avg episode reward: [(0, '0.250')] [2024-06-18 11:38:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125243_2051981312.pth... 
[2024-06-18 11:38:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124618_2041741312.pth [2024-06-18 11:38:40,474][12883] Updated weights for policy 0, policy_version 125253 (0.0037) [2024-06-18 11:38:41,995][12645] Fps is (10 sec: 45867.7, 60 sec: 43143.4, 300 sec: 42820.3). Total num frames: 2052210688. Throughput: 0: 42627.8. Samples: 2052346100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:38:41,996][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 11:38:44,596][12883] Updated weights for policy 0, policy_version 125263 (0.0049) [2024-06-18 11:38:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2052390912. Throughput: 0: 42639.5. Samples: 2052474640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:38:46,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 11:38:48,406][12883] Updated weights for policy 0, policy_version 125273 (0.0028) [2024-06-18 11:38:51,994][12645] Fps is (10 sec: 40966.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2052620288. Throughput: 0: 42603.9. Samples: 2052732860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:38:51,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 11:38:52,194][12883] Updated weights for policy 0, policy_version 125283 (0.0028) [2024-06-18 11:38:55,889][12883] Updated weights for policy 0, policy_version 125293 (0.0036) [2024-06-18 11:38:56,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 2052833280. Throughput: 0: 42574.8. Samples: 2052982560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:38:56,997][12645] Avg episode reward: [(0, '0.767')] [2024-06-18 11:39:00,018][12883] Updated weights for policy 0, policy_version 125303 (0.0043) [2024-06-18 11:39:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2053029888. Throughput: 0: 42596.8. Samples: 2053113400. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:39:01,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 11:39:03,375][12883] Updated weights for policy 0, policy_version 125313 (0.0038) [2024-06-18 11:39:06,996][12645] Fps is (10 sec: 42598.4, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 2053259264. Throughput: 0: 42460.5. Samples: 2053365200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:39:06,997][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 11:39:07,810][12883] Updated weights for policy 0, policy_version 125323 (0.0035) [2024-06-18 11:39:11,347][12883] Updated weights for policy 0, policy_version 125333 (0.0033) [2024-06-18 11:39:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2053455872. Throughput: 0: 42613.8. Samples: 2053622500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:39:11,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 11:39:15,081][12862] Signal inference workers to stop experience collection... (30100 times) [2024-06-18 11:39:15,081][12862] Signal inference workers to resume experience collection... (30100 times) [2024-06-18 11:39:15,097][12883] InferenceWorker_p0-w0: stopping experience collection (30100 times) [2024-06-18 11:39:15,097][12883] InferenceWorker_p0-w0: resuming experience collection (30100 times) [2024-06-18 11:39:15,225][12883] Updated weights for policy 0, policy_version 125343 (0.0035) [2024-06-18 11:39:16,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 2053652480. Throughput: 0: 42425.0. Samples: 2053750420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:39:16,994][12645] Avg episode reward: [(0, '0.662')] [2024-06-18 11:39:18,998][12883] Updated weights for policy 0, policy_version 125353 (0.0035) [2024-06-18 11:39:21,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2053914624. Throughput: 0: 42591.5. 
Samples: 2054013540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 11:39:21,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 11:39:23,233][12883] Updated weights for policy 0, policy_version 125363 (0.0031) [2024-06-18 11:39:26,877][12883] Updated weights for policy 0, policy_version 125373 (0.0042) [2024-06-18 11:39:26,994][12645] Fps is (10 sec: 45874.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2054111232. Throughput: 0: 42689.8. Samples: 2054267080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:39:26,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 11:39:30,932][12883] Updated weights for policy 0, policy_version 125383 (0.0032) [2024-06-18 11:39:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2054324224. Throughput: 0: 42649.0. Samples: 2054393840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:39:31,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 11:39:34,386][12883] Updated weights for policy 0, policy_version 125393 (0.0030) [2024-06-18 11:39:36,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.4). Total num frames: 2054553600. Throughput: 0: 42724.1. Samples: 2054655440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:39:36,994][12645] Avg episode reward: [(0, '0.176')] [2024-06-18 11:39:38,618][12883] Updated weights for policy 0, policy_version 125403 (0.0029) [2024-06-18 11:39:41,915][12883] Updated weights for policy 0, policy_version 125413 (0.0046) [2024-06-18 11:39:41,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42599.5, 300 sec: 42765.0). Total num frames: 2054766592. Throughput: 0: 42832.7. Samples: 2054909940. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:39:41,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 11:39:46,330][12883] Updated weights for policy 0, policy_version 125423 (0.0032) [2024-06-18 11:39:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2054946816. Throughput: 0: 42737.8. Samples: 2055036600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:39:46,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 11:39:49,458][12883] Updated weights for policy 0, policy_version 125433 (0.0031) [2024-06-18 11:39:51,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2055176192. Throughput: 0: 42797.8. Samples: 2055291000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:39:51,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 11:39:53,935][12883] Updated weights for policy 0, policy_version 125443 (0.0028) [2024-06-18 11:39:56,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2055405568. Throughput: 0: 42818.3. Samples: 2055549320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:39:56,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 11:39:57,346][12883] Updated weights for policy 0, policy_version 125453 (0.0045) [2024-06-18 11:40:01,347][12883] Updated weights for policy 0, policy_version 125463 (0.0035) [2024-06-18 11:40:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2055602176. Throughput: 0: 42800.3. Samples: 2055676440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:40:01,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 11:40:04,902][12883] Updated weights for policy 0, policy_version 125473 (0.0029) [2024-06-18 11:40:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 2055831552. Throughput: 0: 42674.2. Samples: 2055933880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:40:06,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 11:40:09,037][12883] Updated weights for policy 0, policy_version 125483 (0.0032) [2024-06-18 11:40:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2056044544. Throughput: 0: 42849.8. Samples: 2056195320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:40:11,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 11:40:12,495][12883] Updated weights for policy 0, policy_version 125493 (0.0033) [2024-06-18 11:40:16,750][12883] Updated weights for policy 0, policy_version 125503 (0.0035) [2024-06-18 11:40:16,999][12645] Fps is (10 sec: 40940.1, 60 sec: 43140.9, 300 sec: 42709.7). Total num frames: 2056241152. Throughput: 0: 42708.6. Samples: 2056315940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:40:16,999][12645] Avg episode reward: [(0, '0.652')] [2024-06-18 11:40:20,234][12883] Updated weights for policy 0, policy_version 125513 (0.0026) [2024-06-18 11:40:21,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2056454144. Throughput: 0: 42484.5. Samples: 2056567240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:40:21,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 11:40:24,698][12883] Updated weights for policy 0, policy_version 125523 (0.0029) [2024-06-18 11:40:26,994][12645] Fps is (10 sec: 42618.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2056667136. Throughput: 0: 42563.0. Samples: 2056825280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 11:40:26,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 11:40:27,895][12883] Updated weights for policy 0, policy_version 125533 (0.0033) [2024-06-18 11:40:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2056880128. Throughput: 0: 42571.6. Samples: 2056952320. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:40:31,994][12645] Avg episode reward: [(0, '0.211')] [2024-06-18 11:40:32,311][12883] Updated weights for policy 0, policy_version 125543 (0.0049) [2024-06-18 11:40:35,983][12883] Updated weights for policy 0, policy_version 125553 (0.0028) [2024-06-18 11:40:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42654.2). Total num frames: 2057093120. Throughput: 0: 42609.1. Samples: 2057208420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:40:36,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 11:40:37,126][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125556_2057109504.pth... [2024-06-18 11:40:37,184][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000124930_2046853120.pth [2024-06-18 11:40:40,238][12883] Updated weights for policy 0, policy_version 125563 (0.0037) [2024-06-18 11:40:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2057306112. Throughput: 0: 42592.9. Samples: 2057466000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:40:41,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 11:40:43,728][12883] Updated weights for policy 0, policy_version 125573 (0.0030) [2024-06-18 11:40:45,791][12862] Signal inference workers to stop experience collection... (30150 times) [2024-06-18 11:40:45,791][12862] Signal inference workers to resume experience collection... (30150 times) [2024-06-18 11:40:45,833][12883] InferenceWorker_p0-w0: stopping experience collection (30150 times) [2024-06-18 11:40:45,833][12883] InferenceWorker_p0-w0: resuming experience collection (30150 times) [2024-06-18 11:40:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2057519104. Throughput: 0: 42531.6. Samples: 2057590360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:40:46,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 11:40:47,827][12883] Updated weights for policy 0, policy_version 125583 (0.0027) [2024-06-18 11:40:51,412][12883] Updated weights for policy 0, policy_version 125593 (0.0037) [2024-06-18 11:40:51,995][12645] Fps is (10 sec: 42593.5, 60 sec: 42597.5, 300 sec: 42709.3). Total num frames: 2057732096. Throughput: 0: 42476.8. Samples: 2057845380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:40:51,995][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 11:40:55,432][12883] Updated weights for policy 0, policy_version 125603 (0.0038) [2024-06-18 11:40:56,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2057928704. Throughput: 0: 42451.5. Samples: 2058105640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:40:56,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 11:40:59,335][12883] Updated weights for policy 0, policy_version 125613 (0.0034) [2024-06-18 11:41:01,994][12645] Fps is (10 sec: 44241.5, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2058174464. Throughput: 0: 42517.9. Samples: 2058229040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:41:01,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 11:41:03,052][12883] Updated weights for policy 0, policy_version 125623 (0.0039) [2024-06-18 11:41:06,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2058371072. Throughput: 0: 42591.0. Samples: 2058483840. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:41:06,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 11:41:06,995][12883] Updated weights for policy 0, policy_version 125633 (0.0039) [2024-06-18 11:41:10,619][12883] Updated weights for policy 0, policy_version 125643 (0.0030) [2024-06-18 11:41:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2058584064. Throughput: 0: 42537.9. Samples: 2058739480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:41:11,995][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 11:41:14,641][12883] Updated weights for policy 0, policy_version 125653 (0.0043) [2024-06-18 11:41:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42875.1, 300 sec: 42709.5). Total num frames: 2058813440. Throughput: 0: 42556.5. Samples: 2058867360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:41:16,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 11:41:18,443][12883] Updated weights for policy 0, policy_version 125663 (0.0038) [2024-06-18 11:41:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2058993664. Throughput: 0: 42437.5. Samples: 2059118100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:41:21,994][12645] Avg episode reward: [(0, '0.723')] [2024-06-18 11:41:22,493][12883] Updated weights for policy 0, policy_version 125673 (0.0038) [2024-06-18 11:41:26,093][12883] Updated weights for policy 0, policy_version 125683 (0.0034) [2024-06-18 11:41:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2059223040. Throughput: 0: 42376.4. Samples: 2059372940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 11:41:26,994][12645] Avg episode reward: [(0, '0.864')] [2024-06-18 11:41:30,058][12883] Updated weights for policy 0, policy_version 125693 (0.0038) [2024-06-18 11:41:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2059419648. Throughput: 0: 42567.1. Samples: 2059505880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:41:31,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 11:41:33,682][12883] Updated weights for policy 0, policy_version 125703 (0.0035) [2024-06-18 11:41:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2059632640. Throughput: 0: 42312.1. Samples: 2059749380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:41:36,996][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 11:41:38,283][12883] Updated weights for policy 0, policy_version 125713 (0.0027) [2024-06-18 11:41:41,348][12883] Updated weights for policy 0, policy_version 125723 (0.0032) [2024-06-18 11:41:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2059845632. Throughput: 0: 42279.3. Samples: 2060008200. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:41:41,995][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 11:41:45,963][12883] Updated weights for policy 0, policy_version 125733 (0.0033) [2024-06-18 11:41:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2060058624. Throughput: 0: 42522.0. Samples: 2060142520. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:41:46,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 11:41:48,912][12883] Updated weights for policy 0, policy_version 125743 (0.0036) [2024-06-18 11:41:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42599.2, 300 sec: 42765.0). Total num frames: 2060288000. Throughput: 0: 42528.9. Samples: 2060397640. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:41:51,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 11:41:53,503][12883] Updated weights for policy 0, policy_version 125753 (0.0030) [2024-06-18 11:41:56,493][12883] Updated weights for policy 0, policy_version 125763 (0.0039) [2024-06-18 11:41:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2060500992. Throughput: 0: 42553.4. Samples: 2060654380. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:41:56,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 11:42:01,043][12883] Updated weights for policy 0, policy_version 125773 (0.0033) [2024-06-18 11:42:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2060713984. Throughput: 0: 42611.5. Samples: 2060784880. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:42:01,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 11:42:04,369][12883] Updated weights for policy 0, policy_version 125783 (0.0033) [2024-06-18 11:42:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2060943360. Throughput: 0: 42804.9. Samples: 2061044320. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:42:06,994][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 11:42:08,477][12883] Updated weights for policy 0, policy_version 125793 (0.0033) [2024-06-18 11:42:10,915][12862] Signal inference workers to stop experience collection... (30200 times) [2024-06-18 11:42:10,915][12862] Signal inference workers to resume experience collection... (30200 times) [2024-06-18 11:42:10,945][12883] InferenceWorker_p0-w0: stopping experience collection (30200 times) [2024-06-18 11:42:10,945][12883] InferenceWorker_p0-w0: resuming experience collection (30200 times) [2024-06-18 11:42:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2061139968. Throughput: 0: 42833.9. 
Samples: 2061300460. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:42:11,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 11:42:12,013][12883] Updated weights for policy 0, policy_version 125803 (0.0039) [2024-06-18 11:42:16,032][12883] Updated weights for policy 0, policy_version 125813 (0.0029) [2024-06-18 11:42:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2061352960. Throughput: 0: 42696.5. Samples: 2061427220. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:42:16,994][12645] Avg episode reward: [(0, '0.555')] [2024-06-18 11:42:19,551][12883] Updated weights for policy 0, policy_version 125823 (0.0039) [2024-06-18 11:42:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2061598720. Throughput: 0: 43116.5. Samples: 2061689620. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:42:21,994][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 11:42:23,643][12883] Updated weights for policy 0, policy_version 125833 (0.0028) [2024-06-18 11:42:26,929][12883] Updated weights for policy 0, policy_version 125843 (0.0040) [2024-06-18 11:42:26,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42654.2). Total num frames: 2061811712. Throughput: 0: 43060.8. Samples: 2061945940. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:42:26,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 11:42:31,269][12883] Updated weights for policy 0, policy_version 125853 (0.0042) [2024-06-18 11:42:31,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2061975552. Throughput: 0: 42844.4. Samples: 2062070520. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 11:42:31,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 11:42:34,618][12883] Updated weights for policy 0, policy_version 125863 (0.0033) [2024-06-18 11:42:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2062221312. Throughput: 0: 42999.4. Samples: 2062332620. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:42:36,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 11:42:37,080][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125869_2062237696.pth... [2024-06-18 11:42:37,137][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125243_2051981312.pth [2024-06-18 11:42:39,135][12883] Updated weights for policy 0, policy_version 125873 (0.0030) [2024-06-18 11:42:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2062434304. Throughput: 0: 42855.2. Samples: 2062582860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:42:41,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 11:42:42,340][12883] Updated weights for policy 0, policy_version 125883 (0.0030) [2024-06-18 11:42:46,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2062614528. Throughput: 0: 42754.7. Samples: 2062708840. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:42:46,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 11:42:47,139][12883] Updated weights for policy 0, policy_version 125893 (0.0031) [2024-06-18 11:42:50,291][12883] Updated weights for policy 0, policy_version 125903 (0.0027) [2024-06-18 11:42:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2062843904. Throughput: 0: 42736.9. Samples: 2062967480. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:42:51,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 11:42:54,724][12883] Updated weights for policy 0, policy_version 125913 (0.0038) [2024-06-18 11:42:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2063073280. Throughput: 0: 42728.7. Samples: 2063223260. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:42:56,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 11:42:58,316][12883] Updated weights for policy 0, policy_version 125923 (0.0036) [2024-06-18 11:43:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2063269888. Throughput: 0: 42729.2. Samples: 2063350040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:43:01,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 11:43:02,262][12883] Updated weights for policy 0, policy_version 125933 (0.0027) [2024-06-18 11:43:06,020][12883] Updated weights for policy 0, policy_version 125943 (0.0031) [2024-06-18 11:43:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2063499264. Throughput: 0: 42657.8. Samples: 2063609220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:43:06,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 11:43:09,837][12883] Updated weights for policy 0, policy_version 125953 (0.0031) [2024-06-18 11:43:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2063728640. Throughput: 0: 42631.3. Samples: 2063864340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:43:11,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 11:43:13,536][12883] Updated weights for policy 0, policy_version 125963 (0.0040) [2024-06-18 11:43:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2063892480. Throughput: 0: 42715.1. Samples: 2063992700. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:43:16,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 11:43:17,525][12883] Updated weights for policy 0, policy_version 125973 (0.0036) [2024-06-18 11:43:20,967][12862] Signal inference workers to stop experience collection... (30250 times) [2024-06-18 11:43:20,997][12883] InferenceWorker_p0-w0: stopping experience collection (30250 times) [2024-06-18 11:43:21,019][12862] Signal inference workers to resume experience collection... (30250 times) [2024-06-18 11:43:21,020][12883] InferenceWorker_p0-w0: resuming experience collection (30250 times) [2024-06-18 11:43:21,154][12883] Updated weights for policy 0, policy_version 125983 (0.0030) [2024-06-18 11:43:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2064138240. Throughput: 0: 42707.2. Samples: 2064254440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:43:22,006][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 11:43:25,540][12883] Updated weights for policy 0, policy_version 125993 (0.0031) [2024-06-18 11:43:26,994][12645] Fps is (10 sec: 47513.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2064367616. Throughput: 0: 42631.9. Samples: 2064501300. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:43:26,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 11:43:28,778][12883] Updated weights for policy 0, policy_version 126003 (0.0037) [2024-06-18 11:43:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2064547840. Throughput: 0: 42740.3. Samples: 2064632160. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-18 11:43:31,995][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 11:43:33,121][12883] Updated weights for policy 0, policy_version 126013 (0.0048) [2024-06-18 11:43:36,387][12883] Updated weights for policy 0, policy_version 126023 (0.0036) [2024-06-18 11:43:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42654.2). Total num frames: 2064793600. Throughput: 0: 42800.0. Samples: 2064893480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:43:36,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 11:43:40,702][12883] Updated weights for policy 0, policy_version 126033 (0.0034) [2024-06-18 11:43:41,996][12645] Fps is (10 sec: 44227.3, 60 sec: 42596.7, 300 sec: 42709.1). Total num frames: 2064990208. Throughput: 0: 42844.1. Samples: 2065151340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:43:41,997][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 11:43:44,046][12883] Updated weights for policy 0, policy_version 126043 (0.0028) [2024-06-18 11:43:46,994][12645] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2065203200. Throughput: 0: 42802.2. Samples: 2065276140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:43:46,994][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 11:43:48,178][12883] Updated weights for policy 0, policy_version 126053 (0.0033) [2024-06-18 11:43:51,689][12883] Updated weights for policy 0, policy_version 126063 (0.0040) [2024-06-18 11:43:52,000][12645] Fps is (10 sec: 42581.6, 60 sec: 42867.0, 300 sec: 42653.4). Total num frames: 2065416192. Throughput: 0: 42747.0. Samples: 2065533100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:43:52,001][12645] Avg episode reward: [(0, '0.631')] [2024-06-18 11:43:55,906][12883] Updated weights for policy 0, policy_version 126073 (0.0040) [2024-06-18 11:43:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2065629184. Throughput: 0: 42799.4. Samples: 2065790320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:43:56,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 11:43:59,130][12883] Updated weights for policy 0, policy_version 126083 (0.0028) [2024-06-18 11:44:01,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 2065842176. Throughput: 0: 42727.6. Samples: 2065915440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:44:01,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 11:44:03,573][12883] Updated weights for policy 0, policy_version 126093 (0.0045) [2024-06-18 11:44:06,766][12883] Updated weights for policy 0, policy_version 126103 (0.0027) [2024-06-18 11:44:06,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2066071552. Throughput: 0: 42656.5. Samples: 2066173980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:44:06,994][12645] Avg episode reward: [(0, '0.637')] [2024-06-18 11:44:11,125][12883] Updated weights for policy 0, policy_version 126113 (0.0038) [2024-06-18 11:44:11,994][12645] Fps is (10 sec: 42595.9, 60 sec: 42324.9, 300 sec: 42764.9). Total num frames: 2066268160. Throughput: 0: 42792.4. Samples: 2066426980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:44:11,995][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 11:44:14,779][12883] Updated weights for policy 0, policy_version 126123 (0.0038) [2024-06-18 11:44:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2066481152. Throughput: 0: 42795.2. Samples: 2066557940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:44:16,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 11:44:18,758][12883] Updated weights for policy 0, policy_version 126133 (0.0040) [2024-06-18 11:44:21,994][12645] Fps is (10 sec: 42600.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2066694144. Throughput: 0: 42687.5. Samples: 2066814420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:44:21,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 11:44:22,325][12883] Updated weights for policy 0, policy_version 126143 (0.0026) [2024-06-18 11:44:26,170][12883] Updated weights for policy 0, policy_version 126153 (0.0037) [2024-06-18 11:44:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2066907136. Throughput: 0: 42789.8. Samples: 2067076780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:44:26,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 11:44:29,678][12883] Updated weights for policy 0, policy_version 126163 (0.0031) [2024-06-18 11:44:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 2067136512. Throughput: 0: 42911.7. Samples: 2067207160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:44:31,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 11:44:33,838][12883] Updated weights for policy 0, policy_version 126173 (0.0034) [2024-06-18 11:44:36,996][12645] Fps is (10 sec: 45865.1, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2067365888. Throughput: 0: 43074.6. Samples: 2067471280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 11:44:36,996][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 11:44:37,070][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126183_2067382272.pth... 
[2024-06-18 11:44:37,073][12883] Updated weights for policy 0, policy_version 126183 (0.0029) [2024-06-18 11:44:37,121][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125556_2057109504.pth [2024-06-18 11:44:37,867][12862] Signal inference workers to stop experience collection... (30300 times) [2024-06-18 11:44:37,867][12862] Signal inference workers to resume experience collection... (30300 times) [2024-06-18 11:44:37,912][12883] InferenceWorker_p0-w0: stopping experience collection (30300 times) [2024-06-18 11:44:37,912][12883] InferenceWorker_p0-w0: resuming experience collection (30300 times) [2024-06-18 11:44:41,335][12883] Updated weights for policy 0, policy_version 126193 (0.0034) [2024-06-18 11:44:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2067546112. Throughput: 0: 43062.3. Samples: 2067728120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:44:41,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 11:44:44,717][12883] Updated weights for policy 0, policy_version 126203 (0.0023) [2024-06-18 11:44:46,994][12645] Fps is (10 sec: 40968.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2067775488. Throughput: 0: 43143.4. Samples: 2067856900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:44:46,994][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 11:44:48,767][12883] Updated weights for policy 0, policy_version 126213 (0.0032) [2024-06-18 11:44:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43149.0, 300 sec: 42709.5). Total num frames: 2068004864. Throughput: 0: 43084.0. Samples: 2068112760. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:44:51,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 11:44:52,448][12883] Updated weights for policy 0, policy_version 126223 (0.0038) [2024-06-18 11:44:56,253][12883] Updated weights for policy 0, policy_version 126233 (0.0024) [2024-06-18 11:44:56,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2068201472. Throughput: 0: 43133.1. Samples: 2068367940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:44:56,994][12645] Avg episode reward: [(0, '0.709')] [2024-06-18 11:44:59,923][12883] Updated weights for policy 0, policy_version 126243 (0.0033) [2024-06-18 11:45:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2068430848. Throughput: 0: 43163.9. Samples: 2068500320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:45:01,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 11:45:04,326][12883] Updated weights for policy 0, policy_version 126253 (0.0040) [2024-06-18 11:45:06,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 2068643840. Throughput: 0: 43178.3. Samples: 2068757540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:45:06,997][12645] Avg episode reward: [(0, '0.669')] [2024-06-18 11:45:07,562][12883] Updated weights for policy 0, policy_version 126263 (0.0040) [2024-06-18 11:45:11,887][12883] Updated weights for policy 0, policy_version 126273 (0.0044) [2024-06-18 11:45:11,995][12645] Fps is (10 sec: 42591.3, 60 sec: 43143.7, 300 sec: 42765.5). Total num frames: 2068856832. Throughput: 0: 43155.1. Samples: 2069018840. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:45:11,996][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 11:45:15,448][12883] Updated weights for policy 0, policy_version 126283 (0.0033) [2024-06-18 11:45:16,994][12645] Fps is (10 sec: 42608.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2069069824. Throughput: 0: 43062.2. Samples: 2069144960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:45:16,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 11:45:19,467][12883] Updated weights for policy 0, policy_version 126293 (0.0026) [2024-06-18 11:45:21,994][12645] Fps is (10 sec: 44244.7, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2069299200. Throughput: 0: 42934.9. Samples: 2069403260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:45:21,996][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 11:45:23,011][12883] Updated weights for policy 0, policy_version 126303 (0.0023) [2024-06-18 11:45:26,994][12645] Fps is (10 sec: 40959.0, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2069479424. Throughput: 0: 42829.6. Samples: 2069655460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:45:26,995][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 11:45:27,413][12883] Updated weights for policy 0, policy_version 126313 (0.0039) [2024-06-18 11:45:30,822][12883] Updated weights for policy 0, policy_version 126323 (0.0029) [2024-06-18 11:45:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2069708800. Throughput: 0: 42730.3. Samples: 2069779760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:45:31,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 11:45:35,013][12883] Updated weights for policy 0, policy_version 126333 (0.0041) [2024-06-18 11:45:36,994][12645] Fps is (10 sec: 45876.3, 60 sec: 42873.0, 300 sec: 42820.6). Total num frames: 2069938176. Throughput: 0: 42888.0. Samples: 2070042720. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-18 11:45:36,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 11:45:38,458][12883] Updated weights for policy 0, policy_version 126343 (0.0037) [2024-06-18 11:45:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2070118400. Throughput: 0: 42906.5. Samples: 2070298740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:45:41,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 11:45:42,762][12883] Updated weights for policy 0, policy_version 126353 (0.0025) [2024-06-18 11:45:46,014][12883] Updated weights for policy 0, policy_version 126363 (0.0032) [2024-06-18 11:45:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.2). Total num frames: 2070347776. Throughput: 0: 42697.5. Samples: 2070421700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:45:46,994][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 11:45:50,294][12883] Updated weights for policy 0, policy_version 126373 (0.0025) [2024-06-18 11:45:51,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2070577152. Throughput: 0: 42730.6. Samples: 2070680320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:45:51,994][12645] Avg episode reward: [(0, '0.668')] [2024-06-18 11:45:54,111][12883] Updated weights for policy 0, policy_version 126383 (0.0039) [2024-06-18 11:45:56,999][12645] Fps is (10 sec: 40937.1, 60 sec: 42594.4, 300 sec: 42653.2). Total num frames: 2070757376. Throughput: 0: 42635.6. Samples: 2070937600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:45:57,000][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 11:45:58,252][12883] Updated weights for policy 0, policy_version 126393 (0.0039) [2024-06-18 11:46:01,718][12883] Updated weights for policy 0, policy_version 126403 (0.0034) [2024-06-18 11:46:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2071003136. Throughput: 0: 42540.4. Samples: 2071059280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:46:01,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 11:46:05,820][12883] Updated weights for policy 0, policy_version 126413 (0.0022) [2024-06-18 11:46:06,994][12645] Fps is (10 sec: 42622.0, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 2071183360. Throughput: 0: 42644.5. Samples: 2071322260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:46:06,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 11:46:09,319][12883] Updated weights for policy 0, policy_version 126423 (0.0030) [2024-06-18 11:46:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42326.7, 300 sec: 42653.9). Total num frames: 2071396352. Throughput: 0: 42602.5. Samples: 2071572560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:46:11,994][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 11:46:13,510][12883] Updated weights for policy 0, policy_version 126433 (0.0027) [2024-06-18 11:46:16,849][12883] Updated weights for policy 0, policy_version 126443 (0.0031) [2024-06-18 11:46:16,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 2071642112. Throughput: 0: 42751.5. Samples: 2071703580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:46:16,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 11:46:21,202][12883] Updated weights for policy 0, policy_version 126453 (0.0031) [2024-06-18 11:46:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 2071805952. Throughput: 0: 42550.6. Samples: 2071957500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:46:21,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 11:46:22,551][12862] Signal inference workers to stop experience collection... (30350 times) [2024-06-18 11:46:22,599][12883] InferenceWorker_p0-w0: stopping experience collection (30350 times) [2024-06-18 11:46:22,611][12862] Signal inference workers to resume experience collection... (30350 times) [2024-06-18 11:46:22,624][12883] InferenceWorker_p0-w0: resuming experience collection (30350 times) [2024-06-18 11:46:24,440][12883] Updated weights for policy 0, policy_version 126463 (0.0027) [2024-06-18 11:46:26,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42869.9, 300 sec: 42820.2). Total num frames: 2072051712. Throughput: 0: 42608.1. Samples: 2072216200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:46:26,996][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 11:46:28,837][12883] Updated weights for policy 0, policy_version 126473 (0.0037) [2024-06-18 11:46:31,990][12883] Updated weights for policy 0, policy_version 126483 (0.0038) [2024-06-18 11:46:31,994][12645] Fps is (10 sec: 49152.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2072297472. Throughput: 0: 42745.8. Samples: 2072345260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:46:31,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 11:46:36,640][12883] Updated weights for policy 0, policy_version 126493 (0.0028) [2024-06-18 11:46:36,994][12645] Fps is (10 sec: 40969.7, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2072461312. 
Throughput: 0: 42560.9. Samples: 2072595560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:46:36,994][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 11:46:37,108][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126494_2072477696.pth... [2024-06-18 11:46:37,169][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000125869_2062237696.pth [2024-06-18 11:46:40,331][12883] Updated weights for policy 0, policy_version 126503 (0.0038) [2024-06-18 11:46:41,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2072674304. Throughput: 0: 42431.8. Samples: 2072846800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 11:46:41,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 11:46:44,577][12883] Updated weights for policy 0, policy_version 126513 (0.0048) [2024-06-18 11:46:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2072920064. Throughput: 0: 42638.1. Samples: 2072978000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:46:46,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 11:46:47,871][12883] Updated weights for policy 0, policy_version 126523 (0.0043) [2024-06-18 11:46:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2073100288. Throughput: 0: 42475.9. Samples: 2073233680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:46:51,996][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 11:46:52,081][12883] Updated weights for policy 0, policy_version 126533 (0.0022) [2024-06-18 11:46:55,372][12883] Updated weights for policy 0, policy_version 126543 (0.0039) [2024-06-18 11:46:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42602.3, 300 sec: 42709.5). Total num frames: 2073313280. Throughput: 0: 42535.5. Samples: 2073486660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:46:56,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 11:46:59,573][12883] Updated weights for policy 0, policy_version 126553 (0.0037) [2024-06-18 11:47:02,000][12645] Fps is (10 sec: 45847.0, 60 sec: 42594.0, 300 sec: 42764.1). Total num frames: 2073559040. Throughput: 0: 42467.1. Samples: 2073614860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:47:02,000][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 11:47:03,082][12883] Updated weights for policy 0, policy_version 126563 (0.0033) [2024-06-18 11:47:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2073739264. Throughput: 0: 42471.6. Samples: 2073868720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:47:06,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 11:47:07,152][12883] Updated weights for policy 0, policy_version 126573 (0.0042) [2024-06-18 11:47:11,148][12883] Updated weights for policy 0, policy_version 126583 (0.0033) [2024-06-18 11:47:11,994][12645] Fps is (10 sec: 39345.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2073952256. Throughput: 0: 42371.9. Samples: 2074122840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:47:11,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 11:47:14,767][12883] Updated weights for policy 0, policy_version 126593 (0.0031) [2024-06-18 11:47:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2074181632. Throughput: 0: 42391.0. Samples: 2074252860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:47:17,000][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 11:47:18,876][12883] Updated weights for policy 0, policy_version 126603 (0.0029) [2024-06-18 11:47:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2074378240. Throughput: 0: 42527.5. Samples: 2074509300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:47:21,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 11:47:22,412][12883] Updated weights for policy 0, policy_version 126613 (0.0036) [2024-06-18 11:47:26,432][12883] Updated weights for policy 0, policy_version 126623 (0.0027) [2024-06-18 11:47:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2074591232. Throughput: 0: 42584.6. Samples: 2074763100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:47:26,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 11:47:30,537][12883] Updated weights for policy 0, policy_version 126633 (0.0028) [2024-06-18 11:47:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2074820608. Throughput: 0: 42553.4. Samples: 2074892900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:47:31,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 11:47:33,990][12883] Updated weights for policy 0, policy_version 126643 (0.0036) [2024-06-18 11:47:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2075017216. Throughput: 0: 42580.6. Samples: 2075149800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:47:36,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 11:47:38,071][12883] Updated weights for policy 0, policy_version 126653 (0.0032) [2024-06-18 11:47:41,590][12883] Updated weights for policy 0, policy_version 126663 (0.0038) [2024-06-18 11:47:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2075246592. Throughput: 0: 42566.7. Samples: 2075402160. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 11:47:41,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 11:47:45,801][12883] Updated weights for policy 0, policy_version 126673 (0.0038) [2024-06-18 11:47:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2075475968. Throughput: 0: 42711.6. Samples: 2075536620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:47:46,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 11:47:49,158][12883] Updated weights for policy 0, policy_version 126683 (0.0036) [2024-06-18 11:47:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2075656192. Throughput: 0: 42753.0. Samples: 2075792600. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:47:51,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 11:47:53,719][12883] Updated weights for policy 0, policy_version 126693 (0.0035) [2024-06-18 11:47:56,807][12883] Updated weights for policy 0, policy_version 126703 (0.0031) [2024-06-18 11:47:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2075901952. Throughput: 0: 42777.4. Samples: 2076047820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:47:56,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 11:47:59,428][12862] Signal inference workers to stop experience collection... (30400 times) [2024-06-18 11:47:59,432][12862] Signal inference workers to resume experience collection... (30400 times) [2024-06-18 11:47:59,471][12883] InferenceWorker_p0-w0: stopping experience collection (30400 times) [2024-06-18 11:47:59,471][12883] InferenceWorker_p0-w0: resuming experience collection (30400 times) [2024-06-18 11:48:01,445][12883] Updated weights for policy 0, policy_version 126713 (0.0025) [2024-06-18 11:48:02,000][12645] Fps is (10 sec: 44209.0, 60 sec: 42325.3, 300 sec: 42708.6). Total num frames: 2076098560. Throughput: 0: 42719.4. 
Samples: 2076175500. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:48:02,001][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 11:48:04,583][12883] Updated weights for policy 0, policy_version 126723 (0.0035) [2024-06-18 11:48:06,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2076278784. Throughput: 0: 42633.7. Samples: 2076427820. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:48:06,994][12645] Avg episode reward: [(0, '0.841')] [2024-06-18 11:48:08,976][12883] Updated weights for policy 0, policy_version 126733 (0.0026) [2024-06-18 11:48:11,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2076524544. Throughput: 0: 42726.6. Samples: 2076685800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:48:11,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 11:48:12,306][12883] Updated weights for policy 0, policy_version 126743 (0.0038) [2024-06-18 11:48:16,410][12883] Updated weights for policy 0, policy_version 126753 (0.0033) [2024-06-18 11:48:16,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2076737536. Throughput: 0: 42742.2. Samples: 2076816300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:48:16,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 11:48:20,390][12883] Updated weights for policy 0, policy_version 126763 (0.0032) [2024-06-18 11:48:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2076950528. Throughput: 0: 42716.8. Samples: 2077072060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:48:21,995][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 11:48:24,211][12883] Updated weights for policy 0, policy_version 126773 (0.0028) [2024-06-18 11:48:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2077163520. Throughput: 0: 42621.7. 
Samples: 2077320140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:48:26,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 11:48:28,250][12883] Updated weights for policy 0, policy_version 126783 (0.0048) [2024-06-18 11:48:31,964][12883] Updated weights for policy 0, policy_version 126793 (0.0037) [2024-06-18 11:48:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2077376512. Throughput: 0: 42528.4. Samples: 2077450400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:48:31,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 11:48:35,846][12883] Updated weights for policy 0, policy_version 126803 (0.0041) [2024-06-18 11:48:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 2077605888. Throughput: 0: 42742.1. Samples: 2077716000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:48:36,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 11:48:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126807_2077605888.pth... [2024-06-18 11:48:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126183_2067382272.pth [2024-06-18 11:48:39,438][12883] Updated weights for policy 0, policy_version 126813 (0.0035) [2024-06-18 11:48:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2077818880. Throughput: 0: 42601.8. Samples: 2077964900. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 11:48:42,000][12645] Avg episode reward: [(0, '0.909')] [2024-06-18 11:48:42,000][12862] Saving new best policy, reward=0.909! [2024-06-18 11:48:43,424][12883] Updated weights for policy 0, policy_version 126823 (0.0037) [2024-06-18 11:48:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 2078015488. Throughput: 0: 42596.2. Samples: 2078092060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:48:46,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 11:48:47,046][12883] Updated weights for policy 0, policy_version 126833 (0.0027) [2024-06-18 11:48:51,229][12883] Updated weights for policy 0, policy_version 126843 (0.0021) [2024-06-18 11:48:51,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2078228480. Throughput: 0: 42756.9. Samples: 2078351880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:48:51,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 11:48:54,500][12883] Updated weights for policy 0, policy_version 126853 (0.0038) [2024-06-18 11:48:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2078441472. Throughput: 0: 42536.4. Samples: 2078599940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:48:56,994][12645] Avg episode reward: [(0, '0.611')] [2024-06-18 11:48:58,788][12883] Updated weights for policy 0, policy_version 126863 (0.0040) [2024-06-18 11:49:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42875.9, 300 sec: 42709.5). Total num frames: 2078670848. Throughput: 0: 42554.7. Samples: 2078731260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:01,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 11:49:02,555][12883] Updated weights for policy 0, policy_version 126873 (0.0030) [2024-06-18 11:49:06,584][12883] Updated weights for policy 0, policy_version 126883 (0.0041) [2024-06-18 11:49:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2078867456. Throughput: 0: 42668.4. Samples: 2078992140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:06,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 11:49:10,318][12883] Updated weights for policy 0, policy_version 126893 (0.0034) [2024-06-18 11:49:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2079096832. Throughput: 0: 42669.3. Samples: 2079240260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:11,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 11:49:14,237][12883] Updated weights for policy 0, policy_version 126903 (0.0033) [2024-06-18 11:49:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2079293440. Throughput: 0: 42783.3. Samples: 2079375640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:16,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 11:49:17,765][12883] Updated weights for policy 0, policy_version 126913 (0.0038) [2024-06-18 11:49:21,752][12883] Updated weights for policy 0, policy_version 126923 (0.0030) [2024-06-18 11:49:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2079522816. Throughput: 0: 42532.4. Samples: 2079629960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:21,995][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 11:49:25,461][12883] Updated weights for policy 0, policy_version 126933 (0.0029) [2024-06-18 11:49:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2079735808. Throughput: 0: 42649.8. Samples: 2079884140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:26,994][12645] Avg episode reward: [(0, '0.709')] [2024-06-18 11:49:29,367][12883] Updated weights for policy 0, policy_version 126943 (0.0048) [2024-06-18 11:49:30,977][12862] Signal inference workers to stop experience collection... 
(30450 times) [2024-06-18 11:49:31,015][12883] InferenceWorker_p0-w0: stopping experience collection (30450 times) [2024-06-18 11:49:31,023][12862] Signal inference workers to resume experience collection... (30450 times) [2024-06-18 11:49:31,029][12883] InferenceWorker_p0-w0: resuming experience collection (30450 times) [2024-06-18 11:49:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42654.3). Total num frames: 2079948800. Throughput: 0: 42715.1. Samples: 2080014240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:31,994][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 11:49:33,442][12883] Updated weights for policy 0, policy_version 126953 (0.0036) [2024-06-18 11:49:36,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42323.8, 300 sec: 42709.1). Total num frames: 2080145408. Throughput: 0: 42709.5. Samples: 2080273900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:36,996][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 11:49:37,200][12883] Updated weights for policy 0, policy_version 126963 (0.0040) [2024-06-18 11:49:41,099][12883] Updated weights for policy 0, policy_version 126973 (0.0033) [2024-06-18 11:49:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2080374784. Throughput: 0: 42889.0. Samples: 2080529940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:41,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 11:49:44,742][12883] Updated weights for policy 0, policy_version 126983 (0.0030) [2024-06-18 11:49:46,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2080587776. Throughput: 0: 42912.9. Samples: 2080662340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 11:49:46,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 11:49:48,783][12883] Updated weights for policy 0, policy_version 126993 (0.0029) [2024-06-18 11:49:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2080784384. Throughput: 0: 42800.6. Samples: 2080918160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:49:51,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 11:49:52,301][12883] Updated weights for policy 0, policy_version 127003 (0.0042) [2024-06-18 11:49:56,418][12883] Updated weights for policy 0, policy_version 127013 (0.0028) [2024-06-18 11:49:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2081013760. Throughput: 0: 42903.1. Samples: 2081170900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:49:56,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 11:49:59,748][12883] Updated weights for policy 0, policy_version 127023 (0.0034) [2024-06-18 11:50:01,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 2081226752. Throughput: 0: 42760.3. Samples: 2081299860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:01,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 11:50:04,053][12883] Updated weights for policy 0, policy_version 127033 (0.0047) [2024-06-18 11:50:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 2081439744. Throughput: 0: 42774.7. Samples: 2081554820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:06,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 11:50:07,801][12883] Updated weights for policy 0, policy_version 127043 (0.0044) [2024-06-18 11:50:11,514][12883] Updated weights for policy 0, policy_version 127053 (0.0040) [2024-06-18 11:50:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2081636352. Throughput: 0: 42742.2. Samples: 2081807540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:11,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 11:50:15,490][12883] Updated weights for policy 0, policy_version 127063 (0.0029) [2024-06-18 11:50:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2081849344. Throughput: 0: 42618.9. Samples: 2081932100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:16,995][12645] Avg episode reward: [(0, '0.645')] [2024-06-18 11:50:19,085][12883] Updated weights for policy 0, policy_version 127073 (0.0026) [2024-06-18 11:50:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2082078720. Throughput: 0: 42662.1. Samples: 2082193600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:21,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 11:50:22,985][12883] Updated weights for policy 0, policy_version 127083 (0.0042) [2024-06-18 11:50:26,710][12883] Updated weights for policy 0, policy_version 127093 (0.0038) [2024-06-18 11:50:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2082291712. Throughput: 0: 42400.0. Samples: 2082437940. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:26,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 11:50:30,683][12883] Updated weights for policy 0, policy_version 127103 (0.0028) [2024-06-18 11:50:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2082504704. Throughput: 0: 42321.7. Samples: 2082566820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:31,994][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 11:50:34,508][12883] Updated weights for policy 0, policy_version 127113 (0.0032) [2024-06-18 11:50:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2082717696. Throughput: 0: 42535.1. Samples: 2082832240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:36,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 11:50:37,129][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127120_2082734080.pth... [2024-06-18 11:50:37,181][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126494_2072477696.pth [2024-06-18 11:50:38,101][12883] Updated weights for policy 0, policy_version 127123 (0.0033) [2024-06-18 11:50:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2082930688. Throughput: 0: 42558.3. Samples: 2083086020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:41,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 11:50:42,104][12883] Updated weights for policy 0, policy_version 127133 (0.0030) [2024-06-18 11:50:45,999][12883] Updated weights for policy 0, policy_version 127143 (0.0047) [2024-06-18 11:50:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2083160064. Throughput: 0: 42628.2. Samples: 2083218120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 11:50:46,994][12645] Avg episode reward: [(0, '0.611')] [2024-06-18 11:50:49,742][12883] Updated weights for policy 0, policy_version 127153 (0.0029) [2024-06-18 11:50:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42654.7). Total num frames: 2083340288. Throughput: 0: 42595.5. Samples: 2083471620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:50:51,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 11:50:53,879][12883] Updated weights for policy 0, policy_version 127163 (0.0028) [2024-06-18 11:50:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2083569664. Throughput: 0: 42695.9. Samples: 2083728860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:50:56,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 11:50:57,264][12883] Updated weights for policy 0, policy_version 127173 (0.0037) [2024-06-18 11:50:59,078][12862] Signal inference workers to stop experience collection... (30500 times) [2024-06-18 11:50:59,079][12862] Signal inference workers to resume experience collection... (30500 times) [2024-06-18 11:50:59,101][12883] InferenceWorker_p0-w0: stopping experience collection (30500 times) [2024-06-18 11:50:59,132][12883] InferenceWorker_p0-w0: resuming experience collection (30500 times) [2024-06-18 11:51:01,389][12883] Updated weights for policy 0, policy_version 127183 (0.0031) [2024-06-18 11:51:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2083799040. Throughput: 0: 42905.9. Samples: 2083862860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:01,994][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 11:51:04,867][12883] Updated weights for policy 0, policy_version 127193 (0.0030) [2024-06-18 11:51:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2083979264. Throughput: 0: 42808.9. 
Samples: 2084120000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:06,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 11:51:09,028][12883] Updated weights for policy 0, policy_version 127203 (0.0034) [2024-06-18 11:51:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2084225024. Throughput: 0: 42905.6. Samples: 2084368700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:11,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 11:51:13,006][12883] Updated weights for policy 0, policy_version 127213 (0.0034) [2024-06-18 11:51:16,539][12883] Updated weights for policy 0, policy_version 127223 (0.0037) [2024-06-18 11:51:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2084438016. Throughput: 0: 43051.6. Samples: 2084504140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:16,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 11:51:20,991][12883] Updated weights for policy 0, policy_version 127233 (0.0041) [2024-06-18 11:51:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 2084634624. Throughput: 0: 42785.2. Samples: 2084757580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:21,995][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 11:51:24,547][12883] Updated weights for policy 0, policy_version 127243 (0.0034) [2024-06-18 11:51:26,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2084864000. Throughput: 0: 42544.2. Samples: 2085000520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:26,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 11:51:28,562][12883] Updated weights for policy 0, policy_version 127253 (0.0028) [2024-06-18 11:51:31,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 2085060608. Throughput: 0: 42718.7. 
Samples: 2085140560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:31,996][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 11:51:32,228][12883] Updated weights for policy 0, policy_version 127263 (0.0033) [2024-06-18 11:51:36,164][12883] Updated weights for policy 0, policy_version 127273 (0.0033) [2024-06-18 11:51:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2085273600. Throughput: 0: 42768.0. Samples: 2085396180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:36,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 11:51:40,071][12883] Updated weights for policy 0, policy_version 127283 (0.0029) [2024-06-18 11:51:41,994][12645] Fps is (10 sec: 45884.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2085519360. Throughput: 0: 42456.8. Samples: 2085639420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:41,994][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 11:51:43,754][12883] Updated weights for policy 0, policy_version 127293 (0.0033) [2024-06-18 11:51:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 2085683200. Throughput: 0: 42584.8. Samples: 2085779180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:46,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 11:51:47,748][12883] Updated weights for policy 0, policy_version 127303 (0.0038) [2024-06-18 11:51:51,298][12883] Updated weights for policy 0, policy_version 127313 (0.0028) [2024-06-18 11:51:51,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2085912576. Throughput: 0: 42603.2. Samples: 2086037240. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 11:51:51,997][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 11:51:55,281][12883] Updated weights for policy 0, policy_version 127323 (0.0028) [2024-06-18 11:51:56,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42710.4). Total num frames: 2086158336. Throughput: 0: 42684.0. Samples: 2086289480. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:51:56,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 11:51:58,957][12883] Updated weights for policy 0, policy_version 127333 (0.0025) [2024-06-18 11:52:01,996][12645] Fps is (10 sec: 40960.1, 60 sec: 42050.7, 300 sec: 42653.6). Total num frames: 2086322176. Throughput: 0: 42776.0. Samples: 2086429160. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:01,996][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 11:52:02,907][12883] Updated weights for policy 0, policy_version 127343 (0.0050) [2024-06-18 11:52:06,600][12883] Updated weights for policy 0, policy_version 127353 (0.0034) [2024-06-18 11:52:06,996][12645] Fps is (10 sec: 39313.3, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2086551552. Throughput: 0: 42602.4. Samples: 2086674780. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:06,996][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 11:52:10,538][12883] Updated weights for policy 0, policy_version 127363 (0.0036) [2024-06-18 11:52:11,404][12862] Signal inference workers to stop experience collection... (30550 times) [2024-06-18 11:52:11,405][12862] Signal inference workers to resume experience collection... (30550 times) [2024-06-18 11:52:11,432][12883] InferenceWorker_p0-w0: stopping experience collection (30550 times) [2024-06-18 11:52:11,432][12883] InferenceWorker_p0-w0: resuming experience collection (30550 times) [2024-06-18 11:52:11,994][12645] Fps is (10 sec: 47524.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2086797312. Throughput: 0: 42948.6. 
Samples: 2086933200. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:11,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 11:52:14,250][12883] Updated weights for policy 0, policy_version 127373 (0.0039) [2024-06-18 11:52:16,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2086961152. Throughput: 0: 42776.7. Samples: 2087065420. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:16,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 11:52:18,153][12883] Updated weights for policy 0, policy_version 127383 (0.0034) [2024-06-18 11:52:21,965][12883] Updated weights for policy 0, policy_version 127393 (0.0041) [2024-06-18 11:52:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2087206912. Throughput: 0: 42617.0. Samples: 2087313940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:21,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 11:52:25,866][12883] Updated weights for policy 0, policy_version 127403 (0.0036) [2024-06-18 11:52:26,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2087419904. Throughput: 0: 43006.7. Samples: 2087574720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:26,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 11:52:29,687][12883] Updated weights for policy 0, policy_version 127413 (0.0031) [2024-06-18 11:52:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 2087600128. Throughput: 0: 42870.3. Samples: 2087708340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:31,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 11:52:33,698][12883] Updated weights for policy 0, policy_version 127423 (0.0042) [2024-06-18 11:52:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2087845888. Throughput: 0: 42655.9. 
Samples: 2087956660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:36,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 11:52:37,132][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127433_2087862272.pth... [2024-06-18 11:52:37,141][12883] Updated weights for policy 0, policy_version 127433 (0.0034) [2024-06-18 11:52:37,184][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000126807_2077605888.pth [2024-06-18 11:52:41,344][12883] Updated weights for policy 0, policy_version 127443 (0.0047) [2024-06-18 11:52:41,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 2088058880. Throughput: 0: 42801.5. Samples: 2088215540. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:41,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 11:52:44,758][12883] Updated weights for policy 0, policy_version 127453 (0.0033) [2024-06-18 11:52:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2088255488. Throughput: 0: 42537.2. Samples: 2088343240. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:46,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 11:52:48,854][12883] Updated weights for policy 0, policy_version 127463 (0.0038) [2024-06-18 11:52:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 2088484864. Throughput: 0: 42729.2. Samples: 2088597500. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 11:52:51,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 11:52:52,450][12883] Updated weights for policy 0, policy_version 127473 (0.0043) [2024-06-18 11:52:56,509][12883] Updated weights for policy 0, policy_version 127483 (0.0031) [2024-06-18 11:52:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42654.8). Total num frames: 2088681472. Throughput: 0: 42671.1. Samples: 2088853400. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:52:56,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 11:53:00,051][12883] Updated weights for policy 0, policy_version 127493 (0.0028) [2024-06-18 11:53:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43146.1, 300 sec: 42820.6). Total num frames: 2088910848. Throughput: 0: 42582.3. Samples: 2088981620. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:01,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 11:53:04,127][12883] Updated weights for policy 0, policy_version 127503 (0.0045) [2024-06-18 11:53:06,995][12645] Fps is (10 sec: 45870.5, 60 sec: 43145.4, 300 sec: 42764.9). Total num frames: 2089140224. Throughput: 0: 42711.0. Samples: 2089235980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:06,995][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 11:53:07,653][12883] Updated weights for policy 0, policy_version 127513 (0.0037) [2024-06-18 11:53:11,869][12883] Updated weights for policy 0, policy_version 127523 (0.0034) [2024-06-18 11:53:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2089336832. Throughput: 0: 42680.9. Samples: 2089495360. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:11,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 11:53:15,687][12883] Updated weights for policy 0, policy_version 127533 (0.0029) [2024-06-18 11:53:16,996][12645] Fps is (10 sec: 42593.2, 60 sec: 43416.0, 300 sec: 42764.7). Total num frames: 2089566208. Throughput: 0: 42516.6. Samples: 2089621680. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:16,997][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 11:53:19,537][12883] Updated weights for policy 0, policy_version 127543 (0.0037) [2024-06-18 11:53:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2089762816. Throughput: 0: 42658.5. Samples: 2089876300. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:21,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 11:53:23,324][12883] Updated weights for policy 0, policy_version 127553 (0.0044) [2024-06-18 11:53:26,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2089959424. Throughput: 0: 42503.1. Samples: 2090128180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:26,994][12645] Avg episode reward: [(0, '0.233')] [2024-06-18 11:53:27,492][12883] Updated weights for policy 0, policy_version 127563 (0.0041) [2024-06-18 11:53:31,095][12883] Updated weights for policy 0, policy_version 127573 (0.0037) [2024-06-18 11:53:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2090188800. Throughput: 0: 42603.6. Samples: 2090260400. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:31,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 11:53:35,045][12883] Updated weights for policy 0, policy_version 127583 (0.0028) [2024-06-18 11:53:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2090401792. Throughput: 0: 42659.6. Samples: 2090517180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:36,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 11:53:38,839][12883] Updated weights for policy 0, policy_version 127593 (0.0040) [2024-06-18 11:53:40,861][12862] Signal inference workers to stop experience collection... (30600 times) [2024-06-18 11:53:40,917][12883] InferenceWorker_p0-w0: stopping experience collection (30600 times) [2024-06-18 11:53:40,981][12862] Signal inference workers to resume experience collection... (30600 times) [2024-06-18 11:53:40,982][12883] InferenceWorker_p0-w0: resuming experience collection (30600 times) [2024-06-18 11:53:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2090614784. Throughput: 0: 42573.8. 
Samples: 2090769220. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:41,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 11:53:42,875][12883] Updated weights for policy 0, policy_version 127603 (0.0031) [2024-06-18 11:53:46,377][12883] Updated weights for policy 0, policy_version 127613 (0.0040) [2024-06-18 11:53:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2090827776. Throughput: 0: 42535.2. Samples: 2090895700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:46,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 11:53:50,579][12883] Updated weights for policy 0, policy_version 127623 (0.0021) [2024-06-18 11:53:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2091057152. Throughput: 0: 42770.3. Samples: 2091160600. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:51,994][12645] Avg episode reward: [(0, '0.677')] [2024-06-18 11:53:53,936][12883] Updated weights for policy 0, policy_version 127633 (0.0033) [2024-06-18 11:53:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2091253760. Throughput: 0: 42596.0. Samples: 2091412180. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-18 11:53:56,994][12645] Avg episode reward: [(0, '0.673')] [2024-06-18 11:53:58,203][12883] Updated weights for policy 0, policy_version 127643 (0.0039) [2024-06-18 11:54:01,599][12883] Updated weights for policy 0, policy_version 127653 (0.0039) [2024-06-18 11:54:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2091466752. Throughput: 0: 42615.9. Samples: 2091539300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:01,994][12645] Avg episode reward: [(0, '0.678')] [2024-06-18 11:54:05,579][12883] Updated weights for policy 0, policy_version 127663 (0.0039) [2024-06-18 11:54:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42326.1, 300 sec: 42653.9). Total num frames: 2091679744. Throughput: 0: 42734.4. Samples: 2091799340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:06,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 11:54:09,334][12883] Updated weights for policy 0, policy_version 127673 (0.0048) [2024-06-18 11:54:11,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 2091892736. Throughput: 0: 42688.1. Samples: 2092049240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:12,005][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 11:54:13,493][12883] Updated weights for policy 0, policy_version 127683 (0.0036) [2024-06-18 11:54:16,922][12883] Updated weights for policy 0, policy_version 127693 (0.0040) [2024-06-18 11:54:16,994][12645] Fps is (10 sec: 44233.6, 60 sec: 42599.5, 300 sec: 42709.4). Total num frames: 2092122112. Throughput: 0: 42565.1. Samples: 2092175860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:16,995][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 11:54:21,478][12883] Updated weights for policy 0, policy_version 127703 (0.0029) [2024-06-18 11:54:21,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 2092318720. Throughput: 0: 42744.1. Samples: 2092440660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:21,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 11:54:24,755][12883] Updated weights for policy 0, policy_version 127713 (0.0046) [2024-06-18 11:54:26,994][12645] Fps is (10 sec: 40963.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2092531712. Throughput: 0: 42922.7. Samples: 2092700740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:26,994][12645] Avg episode reward: [(0, '0.633')] [2024-06-18 11:54:28,923][12883] Updated weights for policy 0, policy_version 127723 (0.0034) [2024-06-18 11:54:31,996][12645] Fps is (10 sec: 42588.3, 60 sec: 42596.8, 300 sec: 42709.5). Total num frames: 2092744704. Throughput: 0: 42800.4. Samples: 2092821820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:31,997][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 11:54:32,245][12883] Updated weights for policy 0, policy_version 127733 (0.0028) [2024-06-18 11:54:36,621][12883] Updated weights for policy 0, policy_version 127743 (0.0033) [2024-06-18 11:54:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2092974080. Throughput: 0: 42760.1. Samples: 2093084800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:36,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 11:54:37,136][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127746_2092990464.pth... [2024-06-18 11:54:37,201][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127120_2082734080.pth [2024-06-18 11:54:39,977][12883] Updated weights for policy 0, policy_version 127753 (0.0039) [2024-06-18 11:54:41,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2093154304. Throughput: 0: 42959.6. Samples: 2093345360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:41,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 11:54:44,199][12883] Updated weights for policy 0, policy_version 127763 (0.0037) [2024-06-18 11:54:46,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2093400064. Throughput: 0: 42841.7. Samples: 2093467180. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:46,994][12645] Avg episode reward: [(0, '0.128')] [2024-06-18 11:54:47,497][12883] Updated weights for policy 0, policy_version 127773 (0.0031) [2024-06-18 11:54:51,717][12883] Updated weights for policy 0, policy_version 127783 (0.0037) [2024-06-18 11:54:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2093613056. Throughput: 0: 42855.5. Samples: 2093727840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:51,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 11:54:55,238][12883] Updated weights for policy 0, policy_version 127793 (0.0045) [2024-06-18 11:54:57,000][12645] Fps is (10 sec: 40934.9, 60 sec: 42594.0, 300 sec: 42653.1). Total num frames: 2093809664. Throughput: 0: 42980.2. Samples: 2093983520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 11:54:57,000][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 11:54:59,286][12883] Updated weights for policy 0, policy_version 127803 (0.0033) [2024-06-18 11:55:01,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2094039040. Throughput: 0: 42991.4. Samples: 2094110540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:01,996][12645] Avg episode reward: [(0, '0.642')] [2024-06-18 11:55:02,838][12883] Updated weights for policy 0, policy_version 127813 (0.0049) [2024-06-18 11:55:06,037][12862] Signal inference workers to stop experience collection... (30650 times) [2024-06-18 11:55:06,037][12862] Signal inference workers to resume experience collection... (30650 times) [2024-06-18 11:55:06,104][12883] InferenceWorker_p0-w0: stopping experience collection (30650 times) [2024-06-18 11:55:06,104][12883] InferenceWorker_p0-w0: resuming experience collection (30650 times) [2024-06-18 11:55:06,994][12645] Fps is (10 sec: 42625.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2094235648. 
Throughput: 0: 42884.4. Samples: 2094370460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:06,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 11:55:07,112][12883] Updated weights for policy 0, policy_version 127823 (0.0044) [2024-06-18 11:55:10,531][12883] Updated weights for policy 0, policy_version 127833 (0.0032) [2024-06-18 11:55:12,000][12645] Fps is (10 sec: 40943.5, 60 sec: 42595.5, 300 sec: 42708.6). Total num frames: 2094448640. Throughput: 0: 42555.8. Samples: 2094616020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:12,001][12645] Avg episode reward: [(0, '0.283')] [2024-06-18 11:55:14,875][12883] Updated weights for policy 0, policy_version 127843 (0.0033) [2024-06-18 11:55:16,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42871.9, 300 sec: 42765.0). Total num frames: 2094694400. Throughput: 0: 42842.0. Samples: 2094749620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:16,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 11:55:18,171][12883] Updated weights for policy 0, policy_version 127853 (0.0030) [2024-06-18 11:55:21,994][12645] Fps is (10 sec: 40985.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2094858240. Throughput: 0: 42739.5. Samples: 2095008080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:21,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 11:55:22,443][12883] Updated weights for policy 0, policy_version 127863 (0.0046) [2024-06-18 11:55:26,088][12883] Updated weights for policy 0, policy_version 127873 (0.0042) [2024-06-18 11:55:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2095104000. Throughput: 0: 42349.8. Samples: 2095251100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:26,994][12645] Avg episode reward: [(0, '0.636')] [2024-06-18 11:55:30,262][12883] Updated weights for policy 0, policy_version 127883 (0.0051) [2024-06-18 11:55:31,994][12645] Fps is (10 sec: 49151.4, 60 sec: 43419.1, 300 sec: 42820.5). Total num frames: 2095349760. Throughput: 0: 42769.8. Samples: 2095391820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:32,000][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 11:55:33,556][12883] Updated weights for policy 0, policy_version 127893 (0.0029) [2024-06-18 11:55:36,994][12645] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 2095480832. Throughput: 0: 42714.3. Samples: 2095649980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:36,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 11:55:37,976][12883] Updated weights for policy 0, policy_version 127903 (0.0045) [2024-06-18 11:55:40,994][12883] Updated weights for policy 0, policy_version 127913 (0.0032) [2024-06-18 11:55:42,000][12645] Fps is (10 sec: 40935.0, 60 sec: 43413.1, 300 sec: 42708.6). Total num frames: 2095759360. Throughput: 0: 42629.8. Samples: 2095901860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:42,000][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 11:55:45,689][12883] Updated weights for policy 0, policy_version 127923 (0.0042) [2024-06-18 11:55:46,994][12645] Fps is (10 sec: 52428.9, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2096005120. Throughput: 0: 42965.3. Samples: 2096043880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:46,994][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 11:55:48,657][12883] Updated weights for policy 0, policy_version 127933 (0.0033) [2024-06-18 11:55:51,994][12645] Fps is (10 sec: 37706.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2096136192. Throughput: 0: 42673.7. Samples: 2096290780. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:51,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 11:55:53,361][12883] Updated weights for policy 0, policy_version 127943 (0.0036) [2024-06-18 11:55:56,207][12883] Updated weights for policy 0, policy_version 127953 (0.0037) [2024-06-18 11:55:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 43422.0, 300 sec: 42765.0). Total num frames: 2096414720. Throughput: 0: 42937.0. Samples: 2096547920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:55:56,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 11:56:00,852][12883] Updated weights for policy 0, policy_version 127963 (0.0040) [2024-06-18 11:56:01,994][12645] Fps is (10 sec: 49151.8, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 2096627712. Throughput: 0: 43034.7. Samples: 2096686180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 11:56:01,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 11:56:03,816][12883] Updated weights for policy 0, policy_version 127973 (0.0038) [2024-06-18 11:56:06,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2096791552. Throughput: 0: 42824.0. Samples: 2096935160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:06,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 11:56:07,660][12862] Signal inference workers to stop experience collection... (30700 times) [2024-06-18 11:56:07,661][12862] Signal inference workers to resume experience collection... 
(30700 times) [2024-06-18 11:56:07,679][12883] InferenceWorker_p0-w0: stopping experience collection (30700 times) [2024-06-18 11:56:07,682][12883] InferenceWorker_p0-w0: resuming experience collection (30700 times) [2024-06-18 11:56:08,396][12883] Updated weights for policy 0, policy_version 127983 (0.0041) [2024-06-18 11:56:11,494][12883] Updated weights for policy 0, policy_version 127993 (0.0031) [2024-06-18 11:56:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43422.1, 300 sec: 42765.0). Total num frames: 2097053696. Throughput: 0: 43090.7. Samples: 2097190180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:11,994][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 11:56:16,031][12883] Updated weights for policy 0, policy_version 128003 (0.0039) [2024-06-18 11:56:16,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2097250304. Throughput: 0: 43137.9. Samples: 2097333020. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:16,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 11:56:19,031][12883] Updated weights for policy 0, policy_version 128013 (0.0038) [2024-06-18 11:56:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2097446912. Throughput: 0: 42860.5. Samples: 2097578700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:21,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 11:56:23,617][12883] Updated weights for policy 0, policy_version 128023 (0.0041) [2024-06-18 11:56:26,505][12883] Updated weights for policy 0, policy_version 128033 (0.0032) [2024-06-18 11:56:26,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42876.4). Total num frames: 2097709056. Throughput: 0: 42932.9. Samples: 2097833580. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:27,000][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 11:56:31,133][12883] Updated weights for policy 0, policy_version 128043 (0.0029) [2024-06-18 11:56:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2097905664. Throughput: 0: 42930.2. Samples: 2097975740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:31,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 11:56:34,171][12883] Updated weights for policy 0, policy_version 128053 (0.0031) [2024-06-18 11:56:36,994][12645] Fps is (10 sec: 39322.3, 60 sec: 43690.7, 300 sec: 42654.0). Total num frames: 2098102272. Throughput: 0: 43016.9. Samples: 2098226540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:36,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 11:56:37,025][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128058_2098102272.pth... [2024-06-18 11:56:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127433_2087862272.pth [2024-06-18 11:56:38,958][12883] Updated weights for policy 0, policy_version 128063 (0.0029) [2024-06-18 11:56:41,934][12883] Updated weights for policy 0, policy_version 128073 (0.0031) [2024-06-18 11:56:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43149.0, 300 sec: 42931.6). Total num frames: 2098348032. Throughput: 0: 43036.1. Samples: 2098484540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:42,000][12645] Avg episode reward: [(0, '0.636')] [2024-06-18 11:56:46,656][12883] Updated weights for policy 0, policy_version 128083 (0.0031) [2024-06-18 11:56:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42709.8). Total num frames: 2098511872. Throughput: 0: 42969.9. Samples: 2098619820. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:46,994][12645] Avg episode reward: [(0, '0.276')] [2024-06-18 11:56:49,493][12883] Updated weights for policy 0, policy_version 128093 (0.0037) [2024-06-18 11:56:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43417.5, 300 sec: 42654.0). Total num frames: 2098741248. Throughput: 0: 42843.5. Samples: 2098863120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:51,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 11:56:54,536][12883] Updated weights for policy 0, policy_version 128103 (0.0033) [2024-06-18 11:56:56,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42871.5, 300 sec: 42931.9). Total num frames: 2098987008. Throughput: 0: 43045.2. Samples: 2099127220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:56:56,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 11:56:57,020][12883] Updated weights for policy 0, policy_version 128113 (0.0033) [2024-06-18 11:57:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42709.8). Total num frames: 2099150848. Throughput: 0: 42759.2. Samples: 2099257180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-18 11:57:01,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 11:57:02,014][12883] Updated weights for policy 0, policy_version 128123 (0.0038) [2024-06-18 11:57:04,552][12883] Updated weights for policy 0, policy_version 128133 (0.0044) [2024-06-18 11:57:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2099396608. Throughput: 0: 42809.7. Samples: 2099505140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:06,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 11:57:09,777][12883] Updated weights for policy 0, policy_version 128143 (0.0027) [2024-06-18 11:57:11,323][12862] Signal inference workers to stop experience collection... 
(30750 times) [2024-06-18 11:57:11,324][12862] Signal inference workers to resume experience collection... (30750 times) [2024-06-18 11:57:11,343][12883] InferenceWorker_p0-w0: stopping experience collection (30750 times) [2024-06-18 11:57:11,344][12883] InferenceWorker_p0-w0: resuming experience collection (30750 times) [2024-06-18 11:57:11,994][12645] Fps is (10 sec: 49152.2, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2099642368. Throughput: 0: 42796.6. Samples: 2099759420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:11,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 11:57:12,752][12883] Updated weights for policy 0, policy_version 128153 (0.0038) [2024-06-18 11:57:16,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2099789824. Throughput: 0: 42615.2. Samples: 2099893420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:16,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 11:57:17,456][12883] Updated weights for policy 0, policy_version 128163 (0.0033) [2024-06-18 11:57:20,193][12883] Updated weights for policy 0, policy_version 128173 (0.0032) [2024-06-18 11:57:21,994][12645] Fps is (10 sec: 39321.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2100035584. Throughput: 0: 42707.9. Samples: 2100148400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:21,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 11:57:25,134][12883] Updated weights for policy 0, policy_version 128183 (0.0040) [2024-06-18 11:57:26,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42598.5, 300 sec: 42931.6). Total num frames: 2100264960. Throughput: 0: 42588.4. Samples: 2100401020. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:26,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 11:57:27,895][12883] Updated weights for policy 0, policy_version 128193 (0.0041) [2024-06-18 11:57:31,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2100428800. Throughput: 0: 42453.8. Samples: 2100530240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:31,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 11:57:32,810][12883] Updated weights for policy 0, policy_version 128203 (0.0029) [2024-06-18 11:57:35,628][12883] Updated weights for policy 0, policy_version 128213 (0.0036) [2024-06-18 11:57:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2100674560. Throughput: 0: 42704.2. Samples: 2100784800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:36,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 11:57:40,298][12883] Updated weights for policy 0, policy_version 128223 (0.0029) [2024-06-18 11:57:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2100887552. Throughput: 0: 42655.2. Samples: 2101046700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:41,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 11:57:43,276][12883] Updated weights for policy 0, policy_version 128233 (0.0035) [2024-06-18 11:57:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2101084160. Throughput: 0: 42572.4. Samples: 2101172940. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:46,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 11:57:47,769][12883] Updated weights for policy 0, policy_version 128243 (0.0040) [2024-06-18 11:57:51,082][12883] Updated weights for policy 0, policy_version 128253 (0.0028) [2024-06-18 11:57:51,994][12645] Fps is (10 sec: 42596.7, 60 sec: 42871.2, 300 sec: 42820.5). Total num frames: 2101313536. Throughput: 0: 42703.2. Samples: 2101426800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:51,995][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 11:57:55,197][12883] Updated weights for policy 0, policy_version 128263 (0.0027) [2024-06-18 11:57:56,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 2101526528. Throughput: 0: 42938.2. Samples: 2101691640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:57:56,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 11:57:58,851][12883] Updated weights for policy 0, policy_version 128273 (0.0032) [2024-06-18 11:58:01,994][12645] Fps is (10 sec: 42600.0, 60 sec: 43144.5, 300 sec: 42709.6). Total num frames: 2101739520. Throughput: 0: 42681.7. Samples: 2101814100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 11:58:01,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 11:58:03,064][12883] Updated weights for policy 0, policy_version 128283 (0.0031) [2024-06-18 11:58:06,402][12883] Updated weights for policy 0, policy_version 128293 (0.0037) [2024-06-18 11:58:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2101968896. Throughput: 0: 42777.5. Samples: 2102073380. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:06,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 11:58:10,656][12883] Updated weights for policy 0, policy_version 128303 (0.0031) [2024-06-18 11:58:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42709.8). Total num frames: 2102165504. Throughput: 0: 42943.1. Samples: 2102333460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:11,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 11:58:14,098][12883] Updated weights for policy 0, policy_version 128313 (0.0038) [2024-06-18 11:58:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 2102394880. Throughput: 0: 42806.6. Samples: 2102456540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:16,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 11:58:18,083][12883] Updated weights for policy 0, policy_version 128323 (0.0029) [2024-06-18 11:58:21,751][12883] Updated weights for policy 0, policy_version 128333 (0.0025) [2024-06-18 11:58:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2102624256. Throughput: 0: 42982.1. Samples: 2102719000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:21,994][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 11:58:25,583][12883] Updated weights for policy 0, policy_version 128343 (0.0038) [2024-06-18 11:58:26,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 2102804480. Throughput: 0: 42854.7. Samples: 2102975260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:26,997][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 11:58:29,324][12883] Updated weights for policy 0, policy_version 128353 (0.0033) [2024-06-18 11:58:31,346][12862] Signal inference workers to stop experience collection... 
(30800 times) [2024-06-18 11:58:31,352][12862] Signal inference workers to resume experience collection... (30800 times) [2024-06-18 11:58:31,395][12883] InferenceWorker_p0-w0: stopping experience collection (30800 times) [2024-06-18 11:58:31,396][12883] InferenceWorker_p0-w0: resuming experience collection (30800 times) [2024-06-18 11:58:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2103033856. Throughput: 0: 42869.0. Samples: 2103102040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:31,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 11:58:33,076][12883] Updated weights for policy 0, policy_version 128363 (0.0041) [2024-06-18 11:58:36,978][12883] Updated weights for policy 0, policy_version 128373 (0.0037) [2024-06-18 11:58:36,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 2103263232. Throughput: 0: 43063.0. Samples: 2103364620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:36,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 11:58:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128373_2103263232.pth... [2024-06-18 11:58:37,060][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000127746_2092990464.pth [2024-06-18 11:58:40,648][12883] Updated weights for policy 0, policy_version 128383 (0.0026) [2024-06-18 11:58:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2103459840. Throughput: 0: 42792.7. Samples: 2103617320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:41,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 11:58:44,581][12883] Updated weights for policy 0, policy_version 128393 (0.0027) [2024-06-18 11:58:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2103672832. Throughput: 0: 42755.0. Samples: 2103738080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:46,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 11:58:49,160][12883] Updated weights for policy 0, policy_version 128403 (0.0034) [2024-06-18 11:58:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.8, 300 sec: 42876.1). Total num frames: 2103902208. Throughput: 0: 42759.0. Samples: 2103997540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:51,996][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 11:58:52,525][12883] Updated weights for policy 0, policy_version 128413 (0.0024) [2024-06-18 11:58:56,690][12883] Updated weights for policy 0, policy_version 128423 (0.0032) [2024-06-18 11:58:56,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2104082432. Throughput: 0: 42587.2. Samples: 2104249880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:58:56,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 11:59:00,338][12883] Updated weights for policy 0, policy_version 128433 (0.0027) [2024-06-18 11:59:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2104311808. Throughput: 0: 42593.3. Samples: 2104373240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:59:01,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 11:59:04,323][12883] Updated weights for policy 0, policy_version 128443 (0.0041) [2024-06-18 11:59:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 2104508416. Throughput: 0: 42526.3. Samples: 2104632680. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 11:59:06,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 11:59:07,869][12883] Updated weights for policy 0, policy_version 128453 (0.0028) [2024-06-18 11:59:11,887][12883] Updated weights for policy 0, policy_version 128463 (0.0028) [2024-06-18 11:59:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 2104737792. Throughput: 0: 42456.4. Samples: 2104885700. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:11,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 11:59:16,029][12883] Updated weights for policy 0, policy_version 128473 (0.0030) [2024-06-18 11:59:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2104950784. Throughput: 0: 42561.7. Samples: 2105017320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:16,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 11:59:19,442][12883] Updated weights for policy 0, policy_version 128483 (0.0044) [2024-06-18 11:59:21,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2105147392. Throughput: 0: 42367.2. Samples: 2105271140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:21,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 11:59:23,652][12883] Updated weights for policy 0, policy_version 128493 (0.0032) [2024-06-18 11:59:26,991][12883] Updated weights for policy 0, policy_version 128503 (0.0027) [2024-06-18 11:59:26,996][12645] Fps is (10 sec: 44227.4, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 2105393152. Throughput: 0: 42490.5. Samples: 2105529480. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:26,996][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 11:59:31,208][12883] Updated weights for policy 0, policy_version 128513 (0.0034) [2024-06-18 11:59:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2105589760. Throughput: 0: 42689.5. Samples: 2105659100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:31,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 11:59:34,980][12883] Updated weights for policy 0, policy_version 128523 (0.0030) [2024-06-18 11:59:36,994][12645] Fps is (10 sec: 40968.5, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2105802752. Throughput: 0: 42628.0. Samples: 2105915800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:36,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 11:59:38,703][12883] Updated weights for policy 0, policy_version 128533 (0.0033) [2024-06-18 11:59:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2106032128. Throughput: 0: 42643.9. Samples: 2106168860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:41,994][12645] Avg episode reward: [(0, '0.689')] [2024-06-18 11:59:42,491][12883] Updated weights for policy 0, policy_version 128543 (0.0037) [2024-06-18 11:59:46,608][12883] Updated weights for policy 0, policy_version 128553 (0.0034) [2024-06-18 11:59:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2106228736. Throughput: 0: 42828.0. Samples: 2106300500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:46,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 11:59:49,869][12883] Updated weights for policy 0, policy_version 128563 (0.0032) [2024-06-18 11:59:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42877.0). Total num frames: 2106458112. Throughput: 0: 42852.4. Samples: 2106561040. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:51,994][12645] Avg episode reward: [(0, '0.812')] [2024-06-18 11:59:54,143][12883] Updated weights for policy 0, policy_version 128573 (0.0033) [2024-06-18 11:59:56,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 2106671104. Throughput: 0: 42940.1. Samples: 2106818000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 11:59:56,994][12645] Avg episode reward: [(0, '0.694')] [2024-06-18 11:59:58,077][12883] Updated weights for policy 0, policy_version 128583 (0.0035) [2024-06-18 12:00:01,753][12883] Updated weights for policy 0, policy_version 128593 (0.0045) [2024-06-18 12:00:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2106884096. Throughput: 0: 42940.5. Samples: 2106949640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:00:01,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 12:00:03,656][12862] Signal inference workers to stop experience collection... (30850 times) [2024-06-18 12:00:03,657][12862] Signal inference workers to resume experience collection... (30850 times) [2024-06-18 12:00:03,702][12883] InferenceWorker_p0-w0: stopping experience collection (30850 times) [2024-06-18 12:00:03,702][12883] InferenceWorker_p0-w0: resuming experience collection (30850 times) [2024-06-18 12:00:05,578][12883] Updated weights for policy 0, policy_version 128603 (0.0037) [2024-06-18 12:00:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 2107097088. Throughput: 0: 43018.2. Samples: 2107206960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:00:06,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 12:00:09,328][12883] Updated weights for policy 0, policy_version 128613 (0.0041) [2024-06-18 12:00:11,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42869.9, 300 sec: 42764.7). Total num frames: 2107310080. 
Throughput: 0: 42819.9. Samples: 2107456380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:11,996][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 12:00:13,233][12883] Updated weights for policy 0, policy_version 128623 (0.0043) [2024-06-18 12:00:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2107506688. Throughput: 0: 42806.7. Samples: 2107585400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:16,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 12:00:17,041][12883] Updated weights for policy 0, policy_version 128633 (0.0027) [2024-06-18 12:00:21,022][12883] Updated weights for policy 0, policy_version 128643 (0.0030) [2024-06-18 12:00:21,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2107719680. Throughput: 0: 42795.6. Samples: 2107841600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:21,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 12:00:24,666][12883] Updated weights for policy 0, policy_version 128653 (0.0046) [2024-06-18 12:00:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 2107965440. Throughput: 0: 42720.1. Samples: 2108091260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:26,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 12:00:28,909][12883] Updated weights for policy 0, policy_version 128663 (0.0033) [2024-06-18 12:00:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2108145664. Throughput: 0: 42628.1. Samples: 2108218760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:31,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 12:00:32,579][12883] Updated weights for policy 0, policy_version 128673 (0.0035) [2024-06-18 12:00:36,475][12883] Updated weights for policy 0, policy_version 128683 (0.0030) [2024-06-18 12:00:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42710.4). Total num frames: 2108358656. Throughput: 0: 42663.0. Samples: 2108480880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:36,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 12:00:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128684_2108358656.pth... [2024-06-18 12:00:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128058_2098102272.pth [2024-06-18 12:00:40,243][12883] Updated weights for policy 0, policy_version 128693 (0.0030) [2024-06-18 12:00:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2108588032. Throughput: 0: 42478.1. Samples: 2108729520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:41,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 12:00:44,048][12883] Updated weights for policy 0, policy_version 128703 (0.0041) [2024-06-18 12:00:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2108784640. Throughput: 0: 42671.5. Samples: 2108869860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:46,994][12645] Avg episode reward: [(0, '0.718')] [2024-06-18 12:00:47,747][12883] Updated weights for policy 0, policy_version 128713 (0.0026) [2024-06-18 12:00:51,603][12883] Updated weights for policy 0, policy_version 128723 (0.0033) [2024-06-18 12:00:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2108997632. Throughput: 0: 42569.0. Samples: 2109122560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:51,994][12645] Avg episode reward: [(0, '0.732')] [2024-06-18 12:00:55,327][12883] Updated weights for policy 0, policy_version 128733 (0.0040) [2024-06-18 12:00:56,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2109243392. Throughput: 0: 42575.9. Samples: 2109372200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:00:56,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 12:00:59,088][12883] Updated weights for policy 0, policy_version 128743 (0.0032) [2024-06-18 12:01:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2109423616. Throughput: 0: 42749.3. Samples: 2109509120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:01:01,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 12:01:02,901][12883] Updated weights for policy 0, policy_version 128753 (0.0034) [2024-06-18 12:01:06,600][12883] Updated weights for policy 0, policy_version 128763 (0.0041) [2024-06-18 12:01:06,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 2109652992. Throughput: 0: 42769.9. Samples: 2109766340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:01:07,005][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 12:01:10,530][12883] Updated weights for policy 0, policy_version 128773 (0.0049) [2024-06-18 12:01:11,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 2109882368. Throughput: 0: 42834.3. Samples: 2110018800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 12:01:11,994][12645] Avg episode reward: [(0, '0.729')] [2024-06-18 12:01:14,473][12883] Updated weights for policy 0, policy_version 128783 (0.0041) [2024-06-18 12:01:16,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2110062592. Throughput: 0: 42972.4. Samples: 2110152520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:01:16,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 12:01:18,095][12883] Updated weights for policy 0, policy_version 128793 (0.0038) [2024-06-18 12:01:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2110291968. Throughput: 0: 42844.5. Samples: 2110408880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:01:21,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 12:01:22,007][12883] Updated weights for policy 0, policy_version 128803 (0.0031) [2024-06-18 12:01:26,082][12883] Updated weights for policy 0, policy_version 128813 (0.0032) [2024-06-18 12:01:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2110521344. Throughput: 0: 42884.8. Samples: 2110659340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:01:26,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 12:01:29,554][12883] Updated weights for policy 0, policy_version 128823 (0.0026) [2024-06-18 12:01:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2110701568. Throughput: 0: 42614.6. Samples: 2110787520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:01:31,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 12:01:33,592][12883] Updated weights for policy 0, policy_version 128833 (0.0040) [2024-06-18 12:01:36,982][12883] Updated weights for policy 0, policy_version 128843 (0.0051) [2024-06-18 12:01:36,996][12645] Fps is (10 sec: 44227.0, 60 sec: 43416.0, 300 sec: 42764.7). Total num frames: 2110963712. Throughput: 0: 42862.1. Samples: 2111051460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:01:36,997][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 12:01:41,415][12883] Updated weights for policy 0, policy_version 128853 (0.0049) [2024-06-18 12:01:41,435][12862] Signal inference workers to stop experience collection... (30900 times) [2024-06-18 12:01:41,436][12862] Signal inference workers to resume experience collection... (30900 times) [2024-06-18 12:01:41,454][12883] InferenceWorker_p0-w0: stopping experience collection (30900 times) [2024-06-18 12:01:41,454][12883] InferenceWorker_p0-w0: resuming experience collection (30900 times) [2024-06-18 12:01:42,000][12645] Fps is (10 sec: 47484.5, 60 sec: 43140.1, 300 sec: 42930.7). Total num frames: 2111176704. Throughput: 0: 42982.9. Samples: 2111306700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:01:42,000][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 12:01:44,538][12883] Updated weights for policy 0, policy_version 128863 (0.0034) [2024-06-18 12:01:46,994][12645] Fps is (10 sec: 37691.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2111340544. Throughput: 0: 42647.0. Samples: 2111428240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:01:46,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 12:01:49,066][12883] Updated weights for policy 0, policy_version 128873 (0.0036) [2024-06-18 12:01:51,994][12645] Fps is (10 sec: 40985.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2111586304. Throughput: 0: 42668.4. Samples: 2111686320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:01:52,003][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 12:01:52,447][12883] Updated weights for policy 0, policy_version 128883 (0.0028) [2024-06-18 12:01:56,703][12883] Updated weights for policy 0, policy_version 128893 (0.0027) [2024-06-18 12:01:56,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2111799296. 
Throughput: 0: 42987.5. Samples: 2111953240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:01:56,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 12:02:00,161][12883] Updated weights for policy 0, policy_version 128903 (0.0031) [2024-06-18 12:02:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2111979520. Throughput: 0: 42708.9. Samples: 2112074420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:02:01,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 12:02:04,634][12883] Updated weights for policy 0, policy_version 128913 (0.0044) [2024-06-18 12:02:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 2112241664. Throughput: 0: 42692.5. Samples: 2112330040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:02:06,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 12:02:07,615][12883] Updated weights for policy 0, policy_version 128923 (0.0031) [2024-06-18 12:02:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 2112405504. Throughput: 0: 43012.6. Samples: 2112594900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:02:11,994][12645] Avg episode reward: [(0, '0.645')] [2024-06-18 12:02:12,478][12883] Updated weights for policy 0, policy_version 128933 (0.0030) [2024-06-18 12:02:15,154][12883] Updated weights for policy 0, policy_version 128943 (0.0039) [2024-06-18 12:02:16,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2112634880. Throughput: 0: 42695.5. Samples: 2112708820. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:02:16,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 12:02:20,362][12883] Updated weights for policy 0, policy_version 128953 (0.0034) [2024-06-18 12:02:21,994][12645] Fps is (10 sec: 47512.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2112880640. 
Throughput: 0: 42620.3. Samples: 2112969280. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:02:21,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 12:02:22,913][12883] Updated weights for policy 0, policy_version 128963 (0.0030) [2024-06-18 12:02:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2113044480. Throughput: 0: 42714.3. Samples: 2113228580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:02:26,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 12:02:27,944][12883] Updated weights for policy 0, policy_version 128973 (0.0046) [2024-06-18 12:02:30,445][12883] Updated weights for policy 0, policy_version 128983 (0.0040) [2024-06-18 12:02:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2113273856. Throughput: 0: 42563.5. Samples: 2113343600. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:02:31,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 12:02:35,583][12883] Updated weights for policy 0, policy_version 128993 (0.0030) [2024-06-18 12:02:36,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2113503232. Throughput: 0: 42695.6. Samples: 2113607620. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:02:36,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 12:02:37,160][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129000_2113536000.pth... [2024-06-18 12:02:37,220][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128373_2103263232.pth [2024-06-18 12:02:38,500][12883] Updated weights for policy 0, policy_version 129003 (0.0021) [2024-06-18 12:02:41,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41510.4, 300 sec: 42653.9). Total num frames: 2113667072. Throughput: 0: 42437.6. Samples: 2113862940. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:02:41,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 12:02:43,260][12883] Updated weights for policy 0, policy_version 129013 (0.0036) [2024-06-18 12:02:46,013][12883] Updated weights for policy 0, policy_version 129023 (0.0034) [2024-06-18 12:02:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42765.1). Total num frames: 2113929216. Throughput: 0: 42368.5. Samples: 2113981000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:02:46,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 12:02:50,879][12883] Updated weights for policy 0, policy_version 129033 (0.0040) [2024-06-18 12:02:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2114125824. Throughput: 0: 42515.0. Samples: 2114243220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:02:51,996][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 12:02:53,931][12883] Updated weights for policy 0, policy_version 129043 (0.0036) [2024-06-18 12:02:56,994][12645] Fps is (10 sec: 37682.1, 60 sec: 41778.9, 300 sec: 42598.4). Total num frames: 2114306048. Throughput: 0: 42265.0. Samples: 2114496840. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:02:56,995][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 12:02:58,745][12883] Updated weights for policy 0, policy_version 129053 (0.0049) [2024-06-18 12:03:01,675][12883] Updated weights for policy 0, policy_version 129063 (0.0027) [2024-06-18 12:03:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2114568192. Throughput: 0: 42432.7. Samples: 2114618280. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:03:01,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 12:03:06,430][12883] Updated weights for policy 0, policy_version 129073 (0.0037) [2024-06-18 12:03:06,998][12645] Fps is (10 sec: 45858.3, 60 sec: 42049.4, 300 sec: 42708.9). Total num frames: 2114764800. Throughput: 0: 42394.1. Samples: 2114877180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:03:06,998][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 12:03:08,063][12862] Signal inference workers to stop experience collection... (30950 times) [2024-06-18 12:03:08,063][12862] Signal inference workers to resume experience collection... (30950 times) [2024-06-18 12:03:08,081][12883] InferenceWorker_p0-w0: stopping experience collection (30950 times) [2024-06-18 12:03:08,081][12883] InferenceWorker_p0-w0: resuming experience collection (30950 times) [2024-06-18 12:03:09,363][12883] Updated weights for policy 0, policy_version 129083 (0.0039) [2024-06-18 12:03:11,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2114945024. Throughput: 0: 42125.8. Samples: 2115124240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:03:11,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 12:03:14,226][12883] Updated weights for policy 0, policy_version 129093 (0.0038) [2024-06-18 12:03:16,993][12645] Fps is (10 sec: 44255.2, 60 sec: 42871.7, 300 sec: 42654.0). Total num frames: 2115207168. Throughput: 0: 42303.4. Samples: 2115247240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 12:03:16,994][12645] Avg episode reward: [(0, '0.681')] [2024-06-18 12:03:17,106][12883] Updated weights for policy 0, policy_version 129103 (0.0035) [2024-06-18 12:03:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 42598.7). Total num frames: 2115371008. Throughput: 0: 42137.3. Samples: 2115503800. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:03:21,994][12645] Avg episode reward: [(0, '0.747')] [2024-06-18 12:03:22,095][12883] Updated weights for policy 0, policy_version 129113 (0.0037) [2024-06-18 12:03:25,284][12883] Updated weights for policy 0, policy_version 129123 (0.0038) [2024-06-18 12:03:26,995][12645] Fps is (10 sec: 37676.9, 60 sec: 42324.3, 300 sec: 42542.6). Total num frames: 2115584000. Throughput: 0: 41992.4. Samples: 2115752660. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:03:26,996][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 12:03:29,814][12883] Updated weights for policy 0, policy_version 129133 (0.0047) [2024-06-18 12:03:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2115813376. Throughput: 0: 42321.7. Samples: 2115885480. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:03:31,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 12:03:33,042][12883] Updated weights for policy 0, policy_version 129143 (0.0030) [2024-06-18 12:03:36,995][12645] Fps is (10 sec: 42598.7, 60 sec: 41778.1, 300 sec: 42542.7). Total num frames: 2116009984. Throughput: 0: 42070.7. Samples: 2116136460. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:03:36,996][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 12:03:37,396][12883] Updated weights for policy 0, policy_version 129153 (0.0042) [2024-06-18 12:03:40,725][12883] Updated weights for policy 0, policy_version 129163 (0.0036) [2024-06-18 12:03:41,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2116239360. Throughput: 0: 42009.3. Samples: 2116387240. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:03:41,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 12:03:45,312][12883] Updated weights for policy 0, policy_version 129173 (0.0029) [2024-06-18 12:03:46,994][12645] Fps is (10 sec: 44243.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2116452352. Throughput: 0: 42224.8. Samples: 2116518400. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:03:46,994][12645] Avg episode reward: [(0, '0.599')] [2024-06-18 12:03:48,650][12883] Updated weights for policy 0, policy_version 129183 (0.0024) [2024-06-18 12:03:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2116648960. Throughput: 0: 42090.9. Samples: 2116771100. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:03:51,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 12:03:52,875][12883] Updated weights for policy 0, policy_version 129193 (0.0039) [2024-06-18 12:03:56,426][12883] Updated weights for policy 0, policy_version 129203 (0.0031) [2024-06-18 12:03:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.8, 300 sec: 42598.4). Total num frames: 2116878336. Throughput: 0: 42221.1. Samples: 2117024180. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:03:56,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 12:04:00,634][12883] Updated weights for policy 0, policy_version 129213 (0.0037) [2024-06-18 12:04:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2117074944. Throughput: 0: 42311.0. Samples: 2117151240. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:04:01,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 12:04:04,342][12883] Updated weights for policy 0, policy_version 129223 (0.0028) [2024-06-18 12:04:06,994][12645] Fps is (10 sec: 39320.8, 60 sec: 41782.0, 300 sec: 42487.3). Total num frames: 2117271552. Throughput: 0: 42245.7. Samples: 2117404860. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:04:06,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 12:04:08,386][12883] Updated weights for policy 0, policy_version 129233 (0.0029) [2024-06-18 12:04:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2117500928. Throughput: 0: 42525.9. Samples: 2117666260. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:04:11,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 12:04:12,219][12883] Updated weights for policy 0, policy_version 129243 (0.0044) [2024-06-18 12:04:16,023][12883] Updated weights for policy 0, policy_version 129253 (0.0041) [2024-06-18 12:04:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 2117730304. Throughput: 0: 42403.6. Samples: 2117793640. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-18 12:04:17,007][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 12:04:19,817][12883] Updated weights for policy 0, policy_version 129263 (0.0049) [2024-06-18 12:04:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2117926912. Throughput: 0: 42461.4. Samples: 2118047160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:04:21,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 12:04:23,559][12883] Updated weights for policy 0, policy_version 129273 (0.0025) [2024-06-18 12:04:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42599.5, 300 sec: 42542.8). Total num frames: 2118139904. Throughput: 0: 42679.5. Samples: 2118307820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:04:26,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 12:04:27,342][12883] Updated weights for policy 0, policy_version 129283 (0.0031) [2024-06-18 12:04:31,151][12862] Signal inference workers to stop experience collection... 
(31000 times) [2024-06-18 12:04:31,151][12862] Signal inference workers to resume experience collection... (31000 times) [2024-06-18 12:04:31,153][12883] Updated weights for policy 0, policy_version 129293 (0.0021) [2024-06-18 12:04:31,167][12883] InferenceWorker_p0-w0: stopping experience collection (31000 times) [2024-06-18 12:04:31,182][12883] InferenceWorker_p0-w0: resuming experience collection (31000 times) [2024-06-18 12:04:31,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42596.9, 300 sec: 42598.1). Total num frames: 2118369280. Throughput: 0: 42663.7. Samples: 2118438360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:04:31,997][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 12:04:34,969][12883] Updated weights for policy 0, policy_version 129303 (0.0050) [2024-06-18 12:04:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42872.4, 300 sec: 42542.9). Total num frames: 2118582272. Throughput: 0: 42663.0. Samples: 2118690940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:04:36,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 12:04:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129308_2118582272.pth... [2024-06-18 12:04:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000128684_2108358656.pth [2024-06-18 12:04:38,879][12883] Updated weights for policy 0, policy_version 129313 (0.0033) [2024-06-18 12:04:41,994][12645] Fps is (10 sec: 42607.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2118795264. Throughput: 0: 42716.2. Samples: 2118946420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:04:41,994][12645] Avg episode reward: [(0, '0.683')] [2024-06-18 12:04:42,743][12883] Updated weights for policy 0, policy_version 129323 (0.0035) [2024-06-18 12:04:46,575][12883] Updated weights for policy 0, policy_version 129333 (0.0037) [2024-06-18 12:04:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.3). 
Total num frames: 2118991872. Throughput: 0: 42763.8. Samples: 2119075620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:04:46,994][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 12:04:50,489][12883] Updated weights for policy 0, policy_version 129343 (0.0033) [2024-06-18 12:04:51,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2119221248. Throughput: 0: 42893.9. Samples: 2119335080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:04:51,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 12:04:54,410][12883] Updated weights for policy 0, policy_version 129353 (0.0037) [2024-06-18 12:04:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2119434240. Throughput: 0: 42779.9. Samples: 2119591360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:04:56,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 12:04:57,990][12883] Updated weights for policy 0, policy_version 129363 (0.0034) [2024-06-18 12:05:01,812][12883] Updated weights for policy 0, policy_version 129373 (0.0033) [2024-06-18 12:05:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2119647232. Throughput: 0: 42845.1. Samples: 2119721660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:05:01,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 12:05:05,946][12883] Updated weights for policy 0, policy_version 129383 (0.0046) [2024-06-18 12:05:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42487.6). Total num frames: 2119843840. Throughput: 0: 42951.8. Samples: 2119980000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:05:06,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 12:05:09,382][12883] Updated weights for policy 0, policy_version 129393 (0.0027) [2024-06-18 12:05:11,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42598.4). 
Total num frames: 2120073216. Throughput: 0: 42712.4. Samples: 2120229880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:05:11,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 12:05:13,666][12883] Updated weights for policy 0, policy_version 129403 (0.0033) [2024-06-18 12:05:16,935][12883] Updated weights for policy 0, policy_version 129413 (0.0029) [2024-06-18 12:05:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2120302592. Throughput: 0: 42769.2. Samples: 2120362880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 12:05:16,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 12:05:21,321][12883] Updated weights for policy 0, policy_version 129423 (0.0044) [2024-06-18 12:05:21,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2120499200. Throughput: 0: 42855.7. Samples: 2120619440. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:05:21,994][12645] Avg episode reward: [(0, '0.555')] [2024-06-18 12:05:24,497][12883] Updated weights for policy 0, policy_version 129433 (0.0030) [2024-06-18 12:05:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2120712192. Throughput: 0: 42793.5. Samples: 2120872120. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:05:26,994][12645] Avg episode reward: [(0, '0.613')] [2024-06-18 12:05:28,847][12883] Updated weights for policy 0, policy_version 129443 (0.0024) [2024-06-18 12:05:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42599.9, 300 sec: 42598.4). Total num frames: 2120925184. Throughput: 0: 42687.2. Samples: 2120996540. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:05:31,994][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 12:05:32,314][12883] Updated weights for policy 0, policy_version 129453 (0.0034) [2024-06-18 12:05:36,391][12883] Updated weights for policy 0, policy_version 129463 (0.0034) [2024-06-18 12:05:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2121154560. Throughput: 0: 42714.2. Samples: 2121257220. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:05:36,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 12:05:39,840][12862] Signal inference workers to stop experience collection... (31050 times) [2024-06-18 12:05:39,880][12883] InferenceWorker_p0-w0: stopping experience collection (31050 times) [2024-06-18 12:05:39,897][12862] Signal inference workers to resume experience collection... (31050 times) [2024-06-18 12:05:39,908][12883] InferenceWorker_p0-w0: resuming experience collection (31050 times) [2024-06-18 12:05:39,911][12883] Updated weights for policy 0, policy_version 129473 (0.0031) [2024-06-18 12:05:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2121334784. Throughput: 0: 42704.0. Samples: 2121513040. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:05:41,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 12:05:44,068][12883] Updated weights for policy 0, policy_version 129483 (0.0036) [2024-06-18 12:05:47,000][12645] Fps is (10 sec: 42571.6, 60 sec: 43140.1, 300 sec: 42653.0). Total num frames: 2121580544. Throughput: 0: 42626.5. Samples: 2121640120. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:05:47,000][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 12:05:47,526][12883] Updated weights for policy 0, policy_version 129493 (0.0033) [2024-06-18 12:05:51,766][12883] Updated weights for policy 0, policy_version 129503 (0.0040) [2024-06-18 12:05:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2121777152. Throughput: 0: 42613.0. Samples: 2121897580. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:05:51,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 12:05:55,247][12883] Updated weights for policy 0, policy_version 129513 (0.0028) [2024-06-18 12:05:56,994][12645] Fps is (10 sec: 40985.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2121990144. Throughput: 0: 42612.9. Samples: 2122147460. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:05:56,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 12:05:59,545][12883] Updated weights for policy 0, policy_version 129523 (0.0036) [2024-06-18 12:06:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.7). Total num frames: 2122186752. Throughput: 0: 42489.4. Samples: 2122274900. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:06:01,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 12:06:03,011][12883] Updated weights for policy 0, policy_version 129533 (0.0031) [2024-06-18 12:06:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 2122399744. Throughput: 0: 42394.6. Samples: 2122527200. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:06:06,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 12:06:07,393][12883] Updated weights for policy 0, policy_version 129543 (0.0037) [2024-06-18 12:06:10,821][12883] Updated weights for policy 0, policy_version 129553 (0.0043) [2024-06-18 12:06:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2122629120. Throughput: 0: 42288.0. Samples: 2122775080. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:06:11,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 12:06:15,116][12883] Updated weights for policy 0, policy_version 129563 (0.0040) [2024-06-18 12:06:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 2122825728. Throughput: 0: 42489.9. Samples: 2122908580. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:06:16,994][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 12:06:18,578][12883] Updated weights for policy 0, policy_version 129573 (0.0048) [2024-06-18 12:06:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2123055104. Throughput: 0: 42410.2. Samples: 2123165680. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 12:06:21,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 12:06:22,683][12883] Updated weights for policy 0, policy_version 129583 (0.0026) [2024-06-18 12:06:26,282][12883] Updated weights for policy 0, policy_version 129593 (0.0042) [2024-06-18 12:06:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2123268096. Throughput: 0: 42332.1. Samples: 2123417980. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:06:26,994][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 12:06:30,263][12883] Updated weights for policy 0, policy_version 129603 (0.0027) [2024-06-18 12:06:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42376.6). Total num frames: 2123464704. Throughput: 0: 42489.4. Samples: 2123551880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:06:31,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 12:06:33,908][12883] Updated weights for policy 0, policy_version 129613 (0.0028) [2024-06-18 12:06:36,994][12645] Fps is (10 sec: 42597.1, 60 sec: 42325.2, 300 sec: 42432.7). Total num frames: 2123694080. Throughput: 0: 42389.2. Samples: 2123805100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:06:36,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 12:06:37,029][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129620_2123694080.pth... [2024-06-18 12:06:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129000_2113536000.pth [2024-06-18 12:06:37,982][12883] Updated weights for policy 0, policy_version 129623 (0.0022) [2024-06-18 12:06:41,594][12883] Updated weights for policy 0, policy_version 129633 (0.0041) [2024-06-18 12:06:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2123907072. Throughput: 0: 42384.5. Samples: 2124054760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:06:42,000][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 12:06:45,771][12883] Updated weights for policy 0, policy_version 129643 (0.0048) [2024-06-18 12:06:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42056.6, 300 sec: 42431.8). Total num frames: 2124103680. Throughput: 0: 42459.5. Samples: 2124185580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:06:46,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 12:06:49,117][12883] Updated weights for policy 0, policy_version 129653 (0.0044) [2024-06-18 12:06:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2124333056. Throughput: 0: 42594.8. Samples: 2124443960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:06:51,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 12:06:53,203][12883] Updated weights for policy 0, policy_version 129663 (0.0032) [2024-06-18 12:06:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2124546048. Throughput: 0: 42830.5. Samples: 2124702460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:06:56,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 12:06:57,267][12883] Updated weights for policy 0, policy_version 129673 (0.0038) [2024-06-18 12:07:01,203][12883] Updated weights for policy 0, policy_version 129683 (0.0029) [2024-06-18 12:07:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2124759040. Throughput: 0: 42673.3. Samples: 2124828880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:07:01,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 12:07:03,494][12862] Signal inference workers to stop experience collection... (31100 times) [2024-06-18 12:07:03,495][12862] Signal inference workers to resume experience collection... (31100 times) [2024-06-18 12:07:03,536][12883] InferenceWorker_p0-w0: stopping experience collection (31100 times) [2024-06-18 12:07:03,536][12883] InferenceWorker_p0-w0: resuming experience collection (31100 times) [2024-06-18 12:07:04,847][12883] Updated weights for policy 0, policy_version 129693 (0.0039) [2024-06-18 12:07:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2124988416. 
Throughput: 0: 42685.1. Samples: 2125086520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:07:07,003][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 12:07:08,802][12883] Updated weights for policy 0, policy_version 129703 (0.0037) [2024-06-18 12:07:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2125185024. Throughput: 0: 42781.2. Samples: 2125343140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:07:11,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 12:07:12,423][12883] Updated weights for policy 0, policy_version 129713 (0.0030) [2024-06-18 12:07:16,442][12883] Updated weights for policy 0, policy_version 129723 (0.0040) [2024-06-18 12:07:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 2125398016. Throughput: 0: 42601.3. Samples: 2125468940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:07:17,003][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 12:07:20,047][12883] Updated weights for policy 0, policy_version 129733 (0.0040) [2024-06-18 12:07:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2125627392. Throughput: 0: 42679.3. Samples: 2125725660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:07:22,008][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 12:07:23,930][12883] Updated weights for policy 0, policy_version 129743 (0.0034) [2024-06-18 12:07:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2125840384. Throughput: 0: 42998.2. Samples: 2125989680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:07:26,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 12:07:27,547][12883] Updated weights for policy 0, policy_version 129753 (0.0039) [2024-06-18 12:07:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2126020608. 
Throughput: 0: 42844.9. Samples: 2126113600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:07:31,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 12:07:32,078][12883] Updated weights for policy 0, policy_version 129763 (0.0030) [2024-06-18 12:07:35,432][12883] Updated weights for policy 0, policy_version 129773 (0.0024) [2024-06-18 12:07:36,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2126249984. Throughput: 0: 42841.1. Samples: 2126371820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:07:36,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 12:07:39,763][12883] Updated weights for policy 0, policy_version 129783 (0.0043) [2024-06-18 12:07:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2126462976. Throughput: 0: 42854.0. Samples: 2126630880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:07:41,994][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 12:07:43,031][12883] Updated weights for policy 0, policy_version 129793 (0.0034) [2024-06-18 12:07:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2126675968. Throughput: 0: 42967.9. Samples: 2126762440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:07:46,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 12:07:47,346][12883] Updated weights for policy 0, policy_version 129803 (0.0043) [2024-06-18 12:07:50,654][12883] Updated weights for policy 0, policy_version 129813 (0.0035) [2024-06-18 12:07:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2126905344. Throughput: 0: 42952.1. Samples: 2127019360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:07:51,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 12:07:54,924][12883] Updated weights for policy 0, policy_version 129823 (0.0030) [2024-06-18 12:07:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 2127101952. Throughput: 0: 42878.8. Samples: 2127272680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:07:56,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 12:07:58,294][12883] Updated weights for policy 0, policy_version 129833 (0.0035) [2024-06-18 12:08:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42543.4). Total num frames: 2127314944. Throughput: 0: 42844.9. Samples: 2127396960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:08:01,994][12645] Avg episode reward: [(0, '0.806')] [2024-06-18 12:08:02,608][12883] Updated weights for policy 0, policy_version 129843 (0.0046) [2024-06-18 12:08:06,049][12883] Updated weights for policy 0, policy_version 129853 (0.0042) [2024-06-18 12:08:06,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2127560704. Throughput: 0: 42966.1. Samples: 2127659140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:08:06,994][12645] Avg episode reward: [(0, '0.817')] [2024-06-18 12:08:10,240][12883] Updated weights for policy 0, policy_version 129863 (0.0042) [2024-06-18 12:08:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2127757312. Throughput: 0: 42669.3. Samples: 2127909800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:08:11,994][12645] Avg episode reward: [(0, '0.673')] [2024-06-18 12:08:13,736][12883] Updated weights for policy 0, policy_version 129873 (0.0039) [2024-06-18 12:08:16,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2127970304. Throughput: 0: 42791.1. Samples: 2128039200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:08:16,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 12:08:17,697][12883] Updated weights for policy 0, policy_version 129883 (0.0031) [2024-06-18 12:08:21,314][12883] Updated weights for policy 0, policy_version 129893 (0.0045) [2024-06-18 12:08:21,995][12645] Fps is (10 sec: 44232.6, 60 sec: 42870.8, 300 sec: 42765.1). Total num frames: 2128199680. Throughput: 0: 42820.6. Samples: 2128298780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:08:21,995][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 12:08:25,155][12883] Updated weights for policy 0, policy_version 129903 (0.0032) [2024-06-18 12:08:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2128396288. Throughput: 0: 42647.9. Samples: 2128550040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 12:08:26,994][12645] Avg episode reward: [(0, '0.639')] [2024-06-18 12:08:29,164][12883] Updated weights for policy 0, policy_version 129913 (0.0033) [2024-06-18 12:08:31,994][12645] Fps is (10 sec: 39325.3, 60 sec: 42871.5, 300 sec: 42654.2). Total num frames: 2128592896. Throughput: 0: 42627.6. Samples: 2128680680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:08:31,994][12645] Avg episode reward: [(0, '0.639')] [2024-06-18 12:08:32,853][12883] Updated weights for policy 0, policy_version 129923 (0.0038) [2024-06-18 12:08:36,725][12883] Updated weights for policy 0, policy_version 129933 (0.0027) [2024-06-18 12:08:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2128822272. Throughput: 0: 42705.4. Samples: 2128941100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:08:36,994][12645] Avg episode reward: [(0, '0.657')] [2024-06-18 12:08:37,228][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129935_2128855040.pth... 
[2024-06-18 12:08:37,284][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129308_2118582272.pth [2024-06-18 12:08:40,829][12883] Updated weights for policy 0, policy_version 129943 (0.0035) [2024-06-18 12:08:41,968][12862] Signal inference workers to stop experience collection... (31150 times) [2024-06-18 12:08:41,968][12862] Signal inference workers to resume experience collection... (31150 times) [2024-06-18 12:08:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2129051648. Throughput: 0: 42698.2. Samples: 2129194100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:08:41,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 12:08:42,012][12883] InferenceWorker_p0-w0: stopping experience collection (31150 times) [2024-06-18 12:08:42,012][12883] InferenceWorker_p0-w0: resuming experience collection (31150 times) [2024-06-18 12:08:44,371][12883] Updated weights for policy 0, policy_version 129953 (0.0030) [2024-06-18 12:08:46,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2129248256. Throughput: 0: 42820.3. Samples: 2129323880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:08:46,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 12:08:48,524][12883] Updated weights for policy 0, policy_version 129963 (0.0040) [2024-06-18 12:08:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2129461248. Throughput: 0: 42805.9. Samples: 2129585400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:08:51,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 12:08:52,013][12883] Updated weights for policy 0, policy_version 129973 (0.0038) [2024-06-18 12:08:56,097][12883] Updated weights for policy 0, policy_version 129983 (0.0043) [2024-06-18 12:08:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2129674240. 
Throughput: 0: 42990.7. Samples: 2129844380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:08:56,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 12:08:59,563][12883] Updated weights for policy 0, policy_version 129993 (0.0033) [2024-06-18 12:09:01,998][12645] Fps is (10 sec: 44216.1, 60 sec: 43141.2, 300 sec: 42819.9). Total num frames: 2129903616. Throughput: 0: 42926.2. Samples: 2129971080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:09:01,999][12645] Avg episode reward: [(0, '0.670')] [2024-06-18 12:09:03,538][12883] Updated weights for policy 0, policy_version 130003 (0.0038) [2024-06-18 12:09:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 2130116608. Throughput: 0: 42864.6. Samples: 2130227640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:09:06,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 12:09:07,132][12883] Updated weights for policy 0, policy_version 130013 (0.0047) [2024-06-18 12:09:11,762][12883] Updated weights for policy 0, policy_version 130023 (0.0034) [2024-06-18 12:09:11,996][12645] Fps is (10 sec: 39331.4, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 2130296832. Throughput: 0: 43084.1. Samples: 2130488920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:09:11,996][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 12:09:14,694][12883] Updated weights for policy 0, policy_version 130033 (0.0040) [2024-06-18 12:09:16,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2130542592. Throughput: 0: 42798.1. Samples: 2130606600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:09:16,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 12:09:19,261][12883] Updated weights for policy 0, policy_version 130043 (0.0047) [2024-06-18 12:09:21,994][12645] Fps is (10 sec: 47524.4, 60 sec: 42872.2, 300 sec: 42820.6). Total num frames: 2130771968. 
Throughput: 0: 42853.8. Samples: 2130869520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:09:21,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 12:09:22,182][12883] Updated weights for policy 0, policy_version 130053 (0.0031) [2024-06-18 12:09:26,777][12883] Updated weights for policy 0, policy_version 130063 (0.0044) [2024-06-18 12:09:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42654.2). Total num frames: 2130952192. Throughput: 0: 43051.4. Samples: 2131131420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:09:26,994][12645] Avg episode reward: [(0, '0.727')] [2024-06-18 12:09:29,784][12883] Updated weights for policy 0, policy_version 130073 (0.0041) [2024-06-18 12:09:31,994][12645] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2131181568. Throughput: 0: 42856.4. Samples: 2131252420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:09:31,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 12:09:34,251][12883] Updated weights for policy 0, policy_version 130083 (0.0031) [2024-06-18 12:09:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2131394560. Throughput: 0: 42899.5. Samples: 2131515880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:09:36,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 12:09:37,535][12883] Updated weights for policy 0, policy_version 130093 (0.0036) [2024-06-18 12:09:41,699][12883] Updated weights for policy 0, policy_version 130103 (0.0033) [2024-06-18 12:09:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2131607552. Throughput: 0: 42857.2. Samples: 2131772960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:09:41,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 12:09:45,355][12883] Updated weights for policy 0, policy_version 130113 (0.0032) [2024-06-18 12:09:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2131820544. Throughput: 0: 42732.1. Samples: 2131893820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:09:46,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 12:09:49,817][12883] Updated weights for policy 0, policy_version 130123 (0.0029) [2024-06-18 12:09:51,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2132049920. Throughput: 0: 42892.0. Samples: 2132157780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:09:51,994][12645] Avg episode reward: [(0, '0.713')] [2024-06-18 12:09:53,188][12883] Updated weights for policy 0, policy_version 130133 (0.0031) [2024-06-18 12:09:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2132246528. Throughput: 0: 42800.8. Samples: 2132414860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:09:56,994][12645] Avg episode reward: [(0, '0.715')] [2024-06-18 12:09:57,396][12883] Updated weights for policy 0, policy_version 130143 (0.0033) [2024-06-18 12:09:58,600][12862] Signal inference workers to stop experience collection... (31200 times) [2024-06-18 12:09:58,632][12883] InferenceWorker_p0-w0: stopping experience collection (31200 times) [2024-06-18 12:09:58,656][12862] Signal inference workers to resume experience collection... (31200 times) [2024-06-18 12:09:58,664][12883] InferenceWorker_p0-w0: resuming experience collection (31200 times) [2024-06-18 12:10:00,688][12883] Updated weights for policy 0, policy_version 130153 (0.0041) [2024-06-18 12:10:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42601.7, 300 sec: 42765.0). Total num frames: 2132459520. 
Throughput: 0: 42834.7. Samples: 2132534160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:10:01,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 12:10:04,893][12883] Updated weights for policy 0, policy_version 130163 (0.0031) [2024-06-18 12:10:06,996][12645] Fps is (10 sec: 45865.1, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 2132705280. Throughput: 0: 42888.1. Samples: 2132799580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:10:06,996][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 12:10:08,256][12883] Updated weights for policy 0, policy_version 130173 (0.0028) [2024-06-18 12:10:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2132869120. Throughput: 0: 42713.0. Samples: 2133053500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:10:11,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 12:10:12,611][12883] Updated weights for policy 0, policy_version 130183 (0.0037) [2024-06-18 12:10:15,832][12883] Updated weights for policy 0, policy_version 130193 (0.0051) [2024-06-18 12:10:16,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2133114880. Throughput: 0: 42864.1. Samples: 2133181300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:10:16,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 12:10:20,172][12883] Updated weights for policy 0, policy_version 130203 (0.0046) [2024-06-18 12:10:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2133311488. Throughput: 0: 42745.9. Samples: 2133439440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:10:21,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 12:10:23,639][12883] Updated weights for policy 0, policy_version 130213 (0.0019) [2024-06-18 12:10:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2133524480. 
Throughput: 0: 42554.2. Samples: 2133687900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:10:26,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 12:10:27,892][12883] Updated weights for policy 0, policy_version 130223 (0.0030) [2024-06-18 12:10:31,249][12883] Updated weights for policy 0, policy_version 130233 (0.0036) [2024-06-18 12:10:31,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2133753856. Throughput: 0: 42756.4. Samples: 2133817860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 12:10:31,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 12:10:35,531][12883] Updated weights for policy 0, policy_version 130243 (0.0027) [2024-06-18 12:10:36,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2133966848. Throughput: 0: 42542.2. Samples: 2134072180. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:10:36,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 12:10:37,011][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130247_2133966848.pth... [2024-06-18 12:10:37,060][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129620_2123694080.pth [2024-06-18 12:10:38,846][12883] Updated weights for policy 0, policy_version 130253 (0.0033) [2024-06-18 12:10:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 2134179840. Throughput: 0: 42550.6. Samples: 2134329640. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:10:41,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 12:10:43,198][12883] Updated weights for policy 0, policy_version 130263 (0.0047) [2024-06-18 12:10:46,801][12883] Updated weights for policy 0, policy_version 130273 (0.0041) [2024-06-18 12:10:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2134409216. Throughput: 0: 42710.4. 
Samples: 2134456120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:10:46,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 12:10:50,827][12883] Updated weights for policy 0, policy_version 130283 (0.0027) [2024-06-18 12:10:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2134605824. Throughput: 0: 42525.1. Samples: 2134713120. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:10:51,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 12:10:54,365][12883] Updated weights for policy 0, policy_version 130293 (0.0035) [2024-06-18 12:10:56,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2134802432. Throughput: 0: 42515.9. Samples: 2134966720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:10:56,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 12:10:58,479][12883] Updated weights for policy 0, policy_version 130303 (0.0029) [2024-06-18 12:10:59,099][12862] Signal inference workers to stop experience collection... (31250 times) [2024-06-18 12:10:59,100][12862] Signal inference workers to resume experience collection... (31250 times) [2024-06-18 12:10:59,119][12883] InferenceWorker_p0-w0: stopping experience collection (31250 times) [2024-06-18 12:10:59,119][12883] InferenceWorker_p0-w0: resuming experience collection (31250 times) [2024-06-18 12:11:01,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2135031808. Throughput: 0: 42562.2. Samples: 2135096600. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:11:01,994][12645] Avg episode reward: [(0, '0.258')] [2024-06-18 12:11:02,123][12883] Updated weights for policy 0, policy_version 130313 (0.0035) [2024-06-18 12:11:06,163][12883] Updated weights for policy 0, policy_version 130323 (0.0042) [2024-06-18 12:11:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 2135244800. 
Throughput: 0: 42527.9. Samples: 2135353200. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:11:07,003][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 12:11:09,772][12883] Updated weights for policy 0, policy_version 130333 (0.0039) [2024-06-18 12:11:11,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2135441408. Throughput: 0: 42633.7. Samples: 2135606420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:11:11,994][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 12:11:14,214][12883] Updated weights for policy 0, policy_version 130343 (0.0039) [2024-06-18 12:11:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2135670784. Throughput: 0: 42534.2. Samples: 2135731900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:11:16,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 12:11:17,459][12883] Updated weights for policy 0, policy_version 130353 (0.0036) [2024-06-18 12:11:22,000][12645] Fps is (10 sec: 40934.7, 60 sec: 42320.9, 300 sec: 42653.0). Total num frames: 2135851008. Throughput: 0: 42478.9. Samples: 2135984000. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:11:22,001][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 12:11:22,213][12883] Updated weights for policy 0, policy_version 130363 (0.0040) [2024-06-18 12:11:25,118][12883] Updated weights for policy 0, policy_version 130373 (0.0037) [2024-06-18 12:11:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2136080384. Throughput: 0: 42346.2. Samples: 2136235220. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:11:26,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 12:11:29,804][12883] Updated weights for policy 0, policy_version 130383 (0.0039) [2024-06-18 12:11:31,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2136293376. 
Throughput: 0: 42491.4. Samples: 2136368240. Policy #0 lag: (min: 1.0, avg: 10.1, max: 21.0) [2024-06-18 12:11:31,994][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 12:11:32,725][12883] Updated weights for policy 0, policy_version 130393 (0.0038) [2024-06-18 12:11:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 2136489984. Throughput: 0: 42336.4. Samples: 2136618260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:11:36,994][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 12:11:37,313][12883] Updated weights for policy 0, policy_version 130403 (0.0030) [2024-06-18 12:11:41,449][12883] Updated weights for policy 0, policy_version 130413 (0.0055) [2024-06-18 12:11:42,000][12645] Fps is (10 sec: 42572.2, 60 sec: 42321.0, 300 sec: 42764.1). Total num frames: 2136719360. Throughput: 0: 42507.0. Samples: 2136879800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:11:42,001][12645] Avg episode reward: [(0, '0.219')] [2024-06-18 12:11:45,010][12883] Updated weights for policy 0, policy_version 130423 (0.0030) [2024-06-18 12:11:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2136932352. Throughput: 0: 42465.3. Samples: 2137007540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:11:46,996][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 12:11:49,006][12883] Updated weights for policy 0, policy_version 130433 (0.0029) [2024-06-18 12:11:51,994][12645] Fps is (10 sec: 42624.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2137145344. Throughput: 0: 42373.7. Samples: 2137260020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:11:51,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 12:11:52,775][12883] Updated weights for policy 0, policy_version 130443 (0.0023) [2024-06-18 12:11:56,572][12883] Updated weights for policy 0, policy_version 130453 (0.0030) [2024-06-18 12:11:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2137374720. Throughput: 0: 42477.4. Samples: 2137517900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:11:56,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 12:12:00,479][12883] Updated weights for policy 0, policy_version 130463 (0.0034) [2024-06-18 12:12:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2137571328. Throughput: 0: 42522.3. Samples: 2137645400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:12:01,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 12:12:04,206][12883] Updated weights for policy 0, policy_version 130473 (0.0034) [2024-06-18 12:12:06,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2137784320. Throughput: 0: 42659.7. Samples: 2137903420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:12:06,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 12:12:07,954][12862] Signal inference workers to stop experience collection... (31300 times) [2024-06-18 12:12:08,005][12862] Signal inference workers to resume experience collection... 
(31300 times) [2024-06-18 12:12:08,006][12883] InferenceWorker_p0-w0: stopping experience collection (31300 times) [2024-06-18 12:12:08,013][12883] Updated weights for policy 0, policy_version 130483 (0.0031) [2024-06-18 12:12:08,021][12883] InferenceWorker_p0-w0: resuming experience collection (31300 times) [2024-06-18 12:12:11,758][12883] Updated weights for policy 0, policy_version 130493 (0.0024) [2024-06-18 12:12:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2138013696. Throughput: 0: 42795.7. Samples: 2138161020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:12:11,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 12:12:15,418][12883] Updated weights for policy 0, policy_version 130503 (0.0033) [2024-06-18 12:12:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2138210304. Throughput: 0: 42742.6. Samples: 2138291660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:12:16,994][12645] Avg episode reward: [(0, '0.621')] [2024-06-18 12:12:19,251][12883] Updated weights for policy 0, policy_version 130513 (0.0029) [2024-06-18 12:12:21,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42874.4, 300 sec: 42653.6). Total num frames: 2138423296. Throughput: 0: 42821.1. Samples: 2138545300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:12:21,997][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 12:12:23,026][12883] Updated weights for policy 0, policy_version 130523 (0.0029) [2024-06-18 12:12:26,799][12883] Updated weights for policy 0, policy_version 130533 (0.0019) [2024-06-18 12:12:26,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2138652672. Throughput: 0: 42841.6. Samples: 2138807400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:12:26,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 12:12:30,567][12883] Updated weights for policy 0, policy_version 130543 (0.0042) [2024-06-18 12:12:31,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2138849280. Throughput: 0: 42866.7. Samples: 2138936540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:12:31,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 12:12:34,343][12883] Updated weights for policy 0, policy_version 130553 (0.0031) [2024-06-18 12:12:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2139078656. Throughput: 0: 42933.9. Samples: 2139192040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 12:12:36,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 12:12:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130559_2139078656.pth... [2024-06-18 12:12:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000129935_2128855040.pth [2024-06-18 12:12:38,623][12883] Updated weights for policy 0, policy_version 130563 (0.0031) [2024-06-18 12:12:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42875.9, 300 sec: 42765.0). Total num frames: 2139291648. Throughput: 0: 42858.3. Samples: 2139446520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:12:41,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 12:12:42,352][12883] Updated weights for policy 0, policy_version 130573 (0.0036) [2024-06-18 12:12:46,245][12883] Updated weights for policy 0, policy_version 130583 (0.0042) [2024-06-18 12:12:47,000][12645] Fps is (10 sec: 40934.3, 60 sec: 42594.0, 300 sec: 42653.0). Total num frames: 2139488256. Throughput: 0: 42855.3. Samples: 2139574160. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:12:47,001][12645] Avg episode reward: [(0, '0.181')] [2024-06-18 12:12:50,060][12883] Updated weights for policy 0, policy_version 130593 (0.0041) [2024-06-18 12:12:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2139701248. Throughput: 0: 42649.0. Samples: 2139822620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:12:51,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 12:12:53,918][12883] Updated weights for policy 0, policy_version 130603 (0.0038) [2024-06-18 12:12:56,996][12645] Fps is (10 sec: 42615.6, 60 sec: 42323.8, 300 sec: 42709.2). Total num frames: 2139914240. Throughput: 0: 42773.5. Samples: 2140085920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:12:56,996][12645] Avg episode reward: [(0, '0.740')] [2024-06-18 12:12:57,705][12883] Updated weights for policy 0, policy_version 130613 (0.0032) [2024-06-18 12:13:01,558][12883] Updated weights for policy 0, policy_version 130623 (0.0033) [2024-06-18 12:13:01,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2140143616. Throughput: 0: 42630.3. Samples: 2140210020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:13:01,994][12645] Avg episode reward: [(0, '0.660')] [2024-06-18 12:13:05,626][12883] Updated weights for policy 0, policy_version 130633 (0.0027) [2024-06-18 12:13:06,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2140340224. Throughput: 0: 42628.3. Samples: 2140463480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:13:06,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 12:13:09,127][12883] Updated weights for policy 0, policy_version 130643 (0.0037) [2024-06-18 12:13:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2140553216. Throughput: 0: 42545.2. Samples: 2140721940. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:13:11,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 12:13:13,513][12883] Updated weights for policy 0, policy_version 130653 (0.0038) [2024-06-18 12:13:16,649][12883] Updated weights for policy 0, policy_version 130663 (0.0034) [2024-06-18 12:13:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42709.6). Total num frames: 2140798976. Throughput: 0: 42415.0. Samples: 2140845220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:13:16,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 12:13:21,053][12883] Updated weights for policy 0, policy_version 130673 (0.0048) [2024-06-18 12:13:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2140995584. Throughput: 0: 42588.1. Samples: 2141108500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:13:21,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 12:13:24,166][12883] Updated weights for policy 0, policy_version 130683 (0.0040) [2024-06-18 12:13:26,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2141192192. Throughput: 0: 42588.9. Samples: 2141363020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:13:26,994][12645] Avg episode reward: [(0, '0.694')] [2024-06-18 12:13:28,598][12883] Updated weights for policy 0, policy_version 130693 (0.0033) [2024-06-18 12:13:31,819][12883] Updated weights for policy 0, policy_version 130703 (0.0052) [2024-06-18 12:13:31,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2141437952. Throughput: 0: 42508.5. Samples: 2141486780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:13:31,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 12:13:34,967][12862] Signal inference workers to stop experience collection... 
(31350 times) [2024-06-18 12:13:34,968][12862] Signal inference workers to resume experience collection... (31350 times) [2024-06-18 12:13:35,015][12883] InferenceWorker_p0-w0: stopping experience collection (31350 times) [2024-06-18 12:13:35,016][12883] InferenceWorker_p0-w0: resuming experience collection (31350 times) [2024-06-18 12:13:36,368][12883] Updated weights for policy 0, policy_version 130713 (0.0038) [2024-06-18 12:13:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2141634560. Throughput: 0: 42681.2. Samples: 2141743280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 12:13:36,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 12:13:39,491][12883] Updated weights for policy 0, policy_version 130723 (0.0024) [2024-06-18 12:13:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2141831168. Throughput: 0: 42564.4. Samples: 2142001220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:13:41,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 12:13:44,090][12883] Updated weights for policy 0, policy_version 130733 (0.0028) [2024-06-18 12:13:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43149.0, 300 sec: 42765.0). Total num frames: 2142076928. Throughput: 0: 42508.4. Samples: 2142122900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:13:47,000][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 12:13:47,749][12883] Updated weights for policy 0, policy_version 130743 (0.0038) [2024-06-18 12:13:51,581][12883] Updated weights for policy 0, policy_version 130753 (0.0033) [2024-06-18 12:13:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2142273536. Throughput: 0: 42641.0. Samples: 2142382320. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:13:51,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 12:13:55,290][12883] Updated weights for policy 0, policy_version 130763 (0.0034) [2024-06-18 12:13:56,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42326.9, 300 sec: 42543.5). Total num frames: 2142453760. Throughput: 0: 42565.8. Samples: 2142637400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:13:56,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 12:13:59,190][12883] Updated weights for policy 0, policy_version 130773 (0.0039) [2024-06-18 12:14:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2142699520. Throughput: 0: 42549.4. Samples: 2142759940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:14:01,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 12:14:03,048][12883] Updated weights for policy 0, policy_version 130783 (0.0034) [2024-06-18 12:14:06,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 2142896128. Throughput: 0: 42442.3. Samples: 2143018400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:14:06,994][12645] Avg episode reward: [(0, '0.173')] [2024-06-18 12:14:07,010][12883] Updated weights for policy 0, policy_version 130793 (0.0044) [2024-06-18 12:14:10,775][12883] Updated weights for policy 0, policy_version 130803 (0.0039) [2024-06-18 12:14:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2143092736. Throughput: 0: 42471.2. Samples: 2143274220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:14:11,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 12:14:14,563][12883] Updated weights for policy 0, policy_version 130813 (0.0033) [2024-06-18 12:14:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2143322112. Throughput: 0: 42470.3. Samples: 2143397940. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:14:16,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 12:14:18,428][12883] Updated weights for policy 0, policy_version 130823 (0.0033) [2024-06-18 12:14:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2143535104. Throughput: 0: 42581.8. Samples: 2143659460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:14:21,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 12:14:22,389][12883] Updated weights for policy 0, policy_version 130833 (0.0032) [2024-06-18 12:14:26,004][12883] Updated weights for policy 0, policy_version 130843 (0.0036) [2024-06-18 12:14:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2143748096. Throughput: 0: 42481.7. Samples: 2143912900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:14:26,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 12:14:29,947][12883] Updated weights for policy 0, policy_version 130853 (0.0037) [2024-06-18 12:14:31,995][12645] Fps is (10 sec: 44230.4, 60 sec: 42324.3, 300 sec: 42653.7). Total num frames: 2143977472. Throughput: 0: 42611.1. Samples: 2144040460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:14:31,996][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 12:14:33,855][12883] Updated weights for policy 0, policy_version 130863 (0.0038) [2024-06-18 12:14:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2144174080. Throughput: 0: 42563.1. Samples: 2144297660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:14:36,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 12:14:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130870_2144174080.pth... 
[2024-06-18 12:14:37,092][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130247_2133966848.pth [2024-06-18 12:14:37,675][12883] Updated weights for policy 0, policy_version 130873 (0.0027) [2024-06-18 12:14:41,690][12883] Updated weights for policy 0, policy_version 130883 (0.0034) [2024-06-18 12:14:41,994][12645] Fps is (10 sec: 40966.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2144387072. Throughput: 0: 42560.5. Samples: 2144552620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 12:14:41,994][12645] Avg episode reward: [(0, '0.140')] [2024-06-18 12:14:45,299][12883] Updated weights for policy 0, policy_version 130893 (0.0024) [2024-06-18 12:14:46,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2144616448. Throughput: 0: 42658.1. Samples: 2144679560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:14:46,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 12:14:49,422][12883] Updated weights for policy 0, policy_version 130903 (0.0033) [2024-06-18 12:14:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2144813056. Throughput: 0: 42668.7. Samples: 2144938500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:14:51,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 12:14:53,005][12883] Updated weights for policy 0, policy_version 130913 (0.0049) [2024-06-18 12:14:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2145026048. Throughput: 0: 42385.3. Samples: 2145181560. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:14:56,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 12:14:57,594][12883] Updated weights for policy 0, policy_version 130923 (0.0042) [2024-06-18 12:15:00,806][12883] Updated weights for policy 0, policy_version 130933 (0.0035) [2024-06-18 12:15:01,442][12862] Signal inference workers to stop experience collection... (31400 times) [2024-06-18 12:15:01,480][12883] InferenceWorker_p0-w0: stopping experience collection (31400 times) [2024-06-18 12:15:01,501][12862] Signal inference workers to resume experience collection... (31400 times) [2024-06-18 12:15:01,504][12883] InferenceWorker_p0-w0: resuming experience collection (31400 times) [2024-06-18 12:15:01,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 2145255424. Throughput: 0: 42609.8. Samples: 2145315380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:15:01,994][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 12:15:05,136][12883] Updated weights for policy 0, policy_version 130943 (0.0038) [2024-06-18 12:15:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.1, 300 sec: 42598.4). Total num frames: 2145435648. Throughput: 0: 42509.7. Samples: 2145572400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:15:06,994][12645] Avg episode reward: [(0, '0.716')] [2024-06-18 12:15:08,605][12883] Updated weights for policy 0, policy_version 130953 (0.0036) [2024-06-18 12:15:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2145665024. Throughput: 0: 42315.3. Samples: 2145817080. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:15:11,994][12645] Avg episode reward: [(0, '0.738')] [2024-06-18 12:15:12,699][12883] Updated weights for policy 0, policy_version 130963 (0.0036) [2024-06-18 12:15:16,124][12883] Updated weights for policy 0, policy_version 130973 (0.0033) [2024-06-18 12:15:16,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2145894400. Throughput: 0: 42554.3. Samples: 2145955340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:15:16,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 12:15:20,218][12883] Updated weights for policy 0, policy_version 130983 (0.0034) [2024-06-18 12:15:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2146074624. Throughput: 0: 42467.1. Samples: 2146208680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:15:21,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 12:15:23,895][12883] Updated weights for policy 0, policy_version 130993 (0.0044) [2024-06-18 12:15:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2146320384. Throughput: 0: 42336.8. Samples: 2146457780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:15:26,996][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 12:15:28,074][12883] Updated weights for policy 0, policy_version 131003 (0.0042) [2024-06-18 12:15:31,587][12883] Updated weights for policy 0, policy_version 131013 (0.0036) [2024-06-18 12:15:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42326.4, 300 sec: 42542.8). Total num frames: 2146516992. Throughput: 0: 42559.2. Samples: 2146594720. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:15:31,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 12:15:35,759][12883] Updated weights for policy 0, policy_version 131023 (0.0039) [2024-06-18 12:15:36,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 2146713600. Throughput: 0: 42453.5. Samples: 2146848900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:15:36,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 12:15:39,173][12883] Updated weights for policy 0, policy_version 131033 (0.0026) [2024-06-18 12:15:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2146959360. Throughput: 0: 42610.1. Samples: 2147099020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 22.0) [2024-06-18 12:15:41,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 12:15:43,540][12883] Updated weights for policy 0, policy_version 131043 (0.0031) [2024-06-18 12:15:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2147155968. Throughput: 0: 42583.0. Samples: 2147231620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:15:46,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 12:15:47,434][12883] Updated weights for policy 0, policy_version 131053 (0.0033) [2024-06-18 12:15:51,057][12883] Updated weights for policy 0, policy_version 131063 (0.0055) [2024-06-18 12:15:51,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2147352576. Throughput: 0: 42363.8. Samples: 2147478860. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:15:51,997][12645] Avg episode reward: [(0, '0.691')] [2024-06-18 12:15:54,946][12883] Updated weights for policy 0, policy_version 131073 (0.0033) [2024-06-18 12:15:56,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2147581952. Throughput: 0: 42648.0. Samples: 2147736240. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:15:56,994][12645] Avg episode reward: [(0, '0.649')] [2024-06-18 12:15:58,949][12883] Updated weights for policy 0, policy_version 131083 (0.0037) [2024-06-18 12:16:01,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2147794944. Throughput: 0: 42396.0. Samples: 2147863160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:01,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 12:16:02,802][12883] Updated weights for policy 0, policy_version 131093 (0.0050) [2024-06-18 12:16:06,501][12883] Updated weights for policy 0, policy_version 131103 (0.0038) [2024-06-18 12:16:07,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42867.2, 300 sec: 42597.5). Total num frames: 2148007936. Throughput: 0: 42443.5. Samples: 2148118900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:07,000][12645] Avg episode reward: [(0, '0.564')] [2024-06-18 12:16:10,366][12883] Updated weights for policy 0, policy_version 131113 (0.0032) [2024-06-18 12:16:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2148220928. Throughput: 0: 42785.8. Samples: 2148383140. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:11,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 12:16:14,261][12883] Updated weights for policy 0, policy_version 131123 (0.0045) [2024-06-18 12:16:16,994][12645] Fps is (10 sec: 42624.7, 60 sec: 42325.3, 300 sec: 42654.8). Total num frames: 2148433920. Throughput: 0: 42533.8. Samples: 2148508740. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:16,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 12:16:17,929][12883] Updated weights for policy 0, policy_version 131133 (0.0035) [2024-06-18 12:16:21,781][12883] Updated weights for policy 0, policy_version 131143 (0.0034) [2024-06-18 12:16:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2148646912. Throughput: 0: 42529.2. Samples: 2148762720. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:21,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 12:16:25,453][12883] Updated weights for policy 0, policy_version 131153 (0.0033) [2024-06-18 12:16:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2148859904. Throughput: 0: 42745.3. Samples: 2149022560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:26,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 12:16:29,235][12883] Updated weights for policy 0, policy_version 131163 (0.0042) [2024-06-18 12:16:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2149056512. Throughput: 0: 42650.7. Samples: 2149150900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:31,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 12:16:32,612][12862] Signal inference workers to stop experience collection... (31450 times) [2024-06-18 12:16:32,662][12862] Signal inference workers to resume experience collection... (31450 times) [2024-06-18 12:16:32,663][12883] InferenceWorker_p0-w0: stopping experience collection (31450 times) [2024-06-18 12:16:32,688][12883] InferenceWorker_p0-w0: resuming experience collection (31450 times) [2024-06-18 12:16:33,179][12883] Updated weights for policy 0, policy_version 131173 (0.0033) [2024-06-18 12:16:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42599.3). Total num frames: 2149285888. 
Throughput: 0: 42891.0. Samples: 2149408860. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:36,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 12:16:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131182_2149285888.pth... [2024-06-18 12:16:37,062][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130559_2139078656.pth [2024-06-18 12:16:37,328][12883] Updated weights for policy 0, policy_version 131183 (0.0044) [2024-06-18 12:16:40,784][12883] Updated weights for policy 0, policy_version 131193 (0.0033) [2024-06-18 12:16:41,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2149515264. Throughput: 0: 42766.9. Samples: 2149660760. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:41,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 12:16:44,809][12883] Updated weights for policy 0, policy_version 131203 (0.0039) [2024-06-18 12:16:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2149695488. Throughput: 0: 42804.0. Samples: 2149789340. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-18 12:16:46,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 12:16:48,572][12883] Updated weights for policy 0, policy_version 131213 (0.0030) [2024-06-18 12:16:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42873.0, 300 sec: 42542.9). Total num frames: 2149924864. Throughput: 0: 42899.1. Samples: 2150049100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:16:51,995][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 12:16:52,272][12883] Updated weights for policy 0, policy_version 131223 (0.0032) [2024-06-18 12:16:56,203][12883] Updated weights for policy 0, policy_version 131233 (0.0037) [2024-06-18 12:16:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2150154240. Throughput: 0: 42766.8. 
Samples: 2150307640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:16:56,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 12:16:59,856][12883] Updated weights for policy 0, policy_version 131243 (0.0044) [2024-06-18 12:17:01,996][12645] Fps is (10 sec: 44227.6, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2150367232. Throughput: 0: 42917.0. Samples: 2150440100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:01,996][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 12:17:03,675][12883] Updated weights for policy 0, policy_version 131253 (0.0033) [2024-06-18 12:17:06,994][12645] Fps is (10 sec: 40958.9, 60 sec: 42602.7, 300 sec: 42542.8). Total num frames: 2150563840. Throughput: 0: 42944.3. Samples: 2150695220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:06,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 12:17:07,550][12883] Updated weights for policy 0, policy_version 131263 (0.0034) [2024-06-18 12:17:11,200][12883] Updated weights for policy 0, policy_version 131273 (0.0036) [2024-06-18 12:17:11,994][12645] Fps is (10 sec: 44246.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2150809600. Throughput: 0: 42726.3. Samples: 2150945240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:11,994][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 12:17:15,157][12883] Updated weights for policy 0, policy_version 131283 (0.0034) [2024-06-18 12:17:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 2150989824. Throughput: 0: 42924.7. Samples: 2151082520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:16,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 12:17:18,873][12883] Updated weights for policy 0, policy_version 131293 (0.0041) [2024-06-18 12:17:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2151202816. Throughput: 0: 42795.2. 
Samples: 2151334640. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:21,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 12:17:22,735][12883] Updated weights for policy 0, policy_version 131303 (0.0028) [2024-06-18 12:17:26,579][12883] Updated weights for policy 0, policy_version 131313 (0.0027) [2024-06-18 12:17:26,994][12645] Fps is (10 sec: 47514.2, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2151464960. Throughput: 0: 42884.1. Samples: 2151590540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:26,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 12:17:30,659][12883] Updated weights for policy 0, policy_version 131323 (0.0034) [2024-06-18 12:17:31,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2151628800. Throughput: 0: 42963.4. Samples: 2151722700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:31,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 12:17:34,448][12883] Updated weights for policy 0, policy_version 131333 (0.0026) [2024-06-18 12:17:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2151841792. Throughput: 0: 42605.4. Samples: 2151966340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:36,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 12:17:38,345][12883] Updated weights for policy 0, policy_version 131343 (0.0032) [2024-06-18 12:17:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42654.8). Total num frames: 2152071168. Throughput: 0: 42612.8. Samples: 2152225220. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:41,996][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 12:17:42,006][12883] Updated weights for policy 0, policy_version 131353 (0.0031) [2024-06-18 12:17:46,163][12883] Updated weights for policy 0, policy_version 131363 (0.0023) [2024-06-18 12:17:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2152284160. Throughput: 0: 42642.5. Samples: 2152358920. Policy #0 lag: (min: 0.0, avg: 8.8, max: 20.0) [2024-06-18 12:17:46,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 12:17:49,598][12883] Updated weights for policy 0, policy_version 131373 (0.0027) [2024-06-18 12:17:51,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42870.0, 300 sec: 42653.9). Total num frames: 2152497152. Throughput: 0: 42451.8. Samples: 2152605640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:17:51,997][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 12:17:53,738][12883] Updated weights for policy 0, policy_version 131383 (0.0025) [2024-06-18 12:17:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2152710144. Throughput: 0: 42701.0. Samples: 2152866780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:17:56,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 12:17:57,214][12883] Updated weights for policy 0, policy_version 131393 (0.0036) [2024-06-18 12:18:01,542][12883] Updated weights for policy 0, policy_version 131403 (0.0036) [2024-06-18 12:18:01,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 2152906752. Throughput: 0: 42488.7. Samples: 2152994500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:01,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 12:18:04,648][12862] Signal inference workers to stop experience collection... 
(31500 times) [2024-06-18 12:18:04,682][12883] InferenceWorker_p0-w0: stopping experience collection (31500 times) [2024-06-18 12:18:04,696][12862] Signal inference workers to resume experience collection... (31500 times) [2024-06-18 12:18:04,706][12883] InferenceWorker_p0-w0: resuming experience collection (31500 times) [2024-06-18 12:18:04,831][12883] Updated weights for policy 0, policy_version 131413 (0.0043) [2024-06-18 12:18:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2153136128. Throughput: 0: 42480.0. Samples: 2153246240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:06,994][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 12:18:09,167][12883] Updated weights for policy 0, policy_version 131423 (0.0032) [2024-06-18 12:18:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2153349120. Throughput: 0: 42636.0. Samples: 2153509160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:11,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 12:18:12,430][12883] Updated weights for policy 0, policy_version 131433 (0.0023) [2024-06-18 12:18:16,789][12883] Updated weights for policy 0, policy_version 131443 (0.0028) [2024-06-18 12:18:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2153562112. Throughput: 0: 42511.2. Samples: 2153635700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:16,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 12:18:20,351][12883] Updated weights for policy 0, policy_version 131453 (0.0035) [2024-06-18 12:18:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2153775104. Throughput: 0: 42605.7. Samples: 2153883600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:21,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 12:18:24,452][12883] Updated weights for policy 0, policy_version 131463 (0.0028) [2024-06-18 12:18:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2153971712. Throughput: 0: 42636.4. Samples: 2154143860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:26,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 12:18:28,010][12883] Updated weights for policy 0, policy_version 131473 (0.0045) [2024-06-18 12:18:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2154201088. Throughput: 0: 42472.9. Samples: 2154270200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:31,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 12:18:32,553][12883] Updated weights for policy 0, policy_version 131483 (0.0041) [2024-06-18 12:18:36,037][12883] Updated weights for policy 0, policy_version 131493 (0.0039) [2024-06-18 12:18:36,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2154414080. Throughput: 0: 42679.1. Samples: 2154526100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:36,994][12645] Avg episode reward: [(0, '0.149')] [2024-06-18 12:18:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131496_2154430464.pth... [2024-06-18 12:18:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000130870_2144174080.pth [2024-06-18 12:18:40,260][12883] Updated weights for policy 0, policy_version 131503 (0.0033) [2024-06-18 12:18:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2154627072. Throughput: 0: 42541.7. Samples: 2154781160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:41,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 12:18:43,718][12883] Updated weights for policy 0, policy_version 131513 (0.0053) [2024-06-18 12:18:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2154840064. Throughput: 0: 42522.2. Samples: 2154908000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:18:46,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 12:18:47,886][12883] Updated weights for policy 0, policy_version 131523 (0.0030) [2024-06-18 12:18:51,283][12883] Updated weights for policy 0, policy_version 131533 (0.0045) [2024-06-18 12:18:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 2155069440. Throughput: 0: 42784.0. Samples: 2155171520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:18:51,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 12:18:55,426][12883] Updated weights for policy 0, policy_version 131543 (0.0024) [2024-06-18 12:18:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2155249664. Throughput: 0: 42649.3. Samples: 2155428380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:18:56,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 12:18:58,862][12883] Updated weights for policy 0, policy_version 131553 (0.0029) [2024-06-18 12:19:01,994][12645] Fps is (10 sec: 42597.5, 60 sec: 43144.3, 300 sec: 42709.4). Total num frames: 2155495424. Throughput: 0: 42489.2. Samples: 2155547720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:01,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 12:19:02,946][12883] Updated weights for policy 0, policy_version 131563 (0.0032) [2024-06-18 12:19:06,771][12883] Updated weights for policy 0, policy_version 131573 (0.0034) [2024-06-18 12:19:06,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2155708416. Throughput: 0: 42709.0. Samples: 2155805500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:06,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 12:19:10,408][12862] Signal inference workers to stop experience collection... (31550 times) [2024-06-18 12:19:10,408][12862] Signal inference workers to resume experience collection... (31550 times) [2024-06-18 12:19:10,434][12883] InferenceWorker_p0-w0: stopping experience collection (31550 times) [2024-06-18 12:19:10,434][12883] InferenceWorker_p0-w0: resuming experience collection (31550 times) [2024-06-18 12:19:10,555][12883] Updated weights for policy 0, policy_version 131583 (0.0023) [2024-06-18 12:19:11,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2155905024. Throughput: 0: 42685.0. Samples: 2156064680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:11,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 12:19:14,298][12883] Updated weights for policy 0, policy_version 131593 (0.0025) [2024-06-18 12:19:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2156134400. Throughput: 0: 42764.9. Samples: 2156194620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:16,994][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 12:19:18,098][12883] Updated weights for policy 0, policy_version 131603 (0.0035) [2024-06-18 12:19:21,955][12883] Updated weights for policy 0, policy_version 131613 (0.0031) [2024-06-18 12:19:21,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2156347392. Throughput: 0: 42805.8. Samples: 2156452460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:21,997][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 12:19:26,152][12883] Updated weights for policy 0, policy_version 131623 (0.0050) [2024-06-18 12:19:26,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42869.9, 300 sec: 42598.3). Total num frames: 2156544000. Throughput: 0: 42867.6. Samples: 2156710300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:26,997][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 12:19:29,572][12883] Updated weights for policy 0, policy_version 131633 (0.0037) [2024-06-18 12:19:31,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2156756992. Throughput: 0: 42765.7. Samples: 2156832460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:31,994][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 12:19:33,774][12883] Updated weights for policy 0, policy_version 131643 (0.0037) [2024-06-18 12:19:36,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2156969984. Throughput: 0: 42507.1. Samples: 2157084340. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:36,994][12645] Avg episode reward: [(0, '0.697')] [2024-06-18 12:19:37,337][12883] Updated weights for policy 0, policy_version 131653 (0.0033) [2024-06-18 12:19:41,389][12883] Updated weights for policy 0, policy_version 131663 (0.0039) [2024-06-18 12:19:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2157166592. Throughput: 0: 42431.1. Samples: 2157337780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:41,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 12:19:45,238][12883] Updated weights for policy 0, policy_version 131673 (0.0030) [2024-06-18 12:19:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2157395968. Throughput: 0: 42693.1. Samples: 2157468900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:46,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 12:19:49,341][12883] Updated weights for policy 0, policy_version 131683 (0.0037) [2024-06-18 12:19:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2157608960. Throughput: 0: 42579.0. Samples: 2157721560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:19:51,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 12:19:52,974][12883] Updated weights for policy 0, policy_version 131693 (0.0041) [2024-06-18 12:19:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2157805568. Throughput: 0: 42553.3. Samples: 2157979580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:19:56,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 12:19:57,355][12883] Updated weights for policy 0, policy_version 131703 (0.0037) [2024-06-18 12:20:00,577][12883] Updated weights for policy 0, policy_version 131713 (0.0039) [2024-06-18 12:20:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2158034944. Throughput: 0: 42401.2. Samples: 2158102680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:01,994][12645] Avg episode reward: [(0, '0.715')] [2024-06-18 12:20:05,012][12883] Updated weights for policy 0, policy_version 131723 (0.0045) [2024-06-18 12:20:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2158264320. Throughput: 0: 42339.8. Samples: 2158357660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:06,996][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 12:20:08,399][12883] Updated weights for policy 0, policy_version 131733 (0.0027) [2024-06-18 12:20:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2158444544. Throughput: 0: 42465.3. Samples: 2158621140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:11,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 12:20:12,514][12883] Updated weights for policy 0, policy_version 131743 (0.0040) [2024-06-18 12:20:16,189][12883] Updated weights for policy 0, policy_version 131753 (0.0026) [2024-06-18 12:20:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2158673920. Throughput: 0: 42388.4. Samples: 2158739940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:16,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 12:20:20,076][12883] Updated weights for policy 0, policy_version 131763 (0.0044) [2024-06-18 12:20:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 2158886912. Throughput: 0: 42506.3. Samples: 2158997120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:21,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 12:20:24,209][12883] Updated weights for policy 0, policy_version 131773 (0.0032) [2024-06-18 12:20:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42326.9, 300 sec: 42598.4). Total num frames: 2159083520. Throughput: 0: 42753.4. Samples: 2159261680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:26,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 12:20:27,637][12883] Updated weights for policy 0, policy_version 131783 (0.0040) [2024-06-18 12:20:31,719][12883] Updated weights for policy 0, policy_version 131793 (0.0037) [2024-06-18 12:20:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2159296512. Throughput: 0: 42566.2. Samples: 2159384380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:31,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 12:20:35,224][12883] Updated weights for policy 0, policy_version 131803 (0.0031) [2024-06-18 12:20:36,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2159542272. Throughput: 0: 42736.5. Samples: 2159644700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:36,994][12645] Avg episode reward: [(0, '0.195')] [2024-06-18 12:20:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131808_2159542272.pth... 
[2024-06-18 12:20:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131182_2149285888.pth [2024-06-18 12:20:39,191][12883] Updated weights for policy 0, policy_version 131813 (0.0034) [2024-06-18 12:20:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2159738880. Throughput: 0: 42706.3. Samples: 2159901360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:41,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 12:20:42,868][12883] Updated weights for policy 0, policy_version 131823 (0.0027) [2024-06-18 12:20:46,578][12883] Updated weights for policy 0, policy_version 131833 (0.0026) [2024-06-18 12:20:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 2159951872. Throughput: 0: 42788.1. Samples: 2160028140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:46,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 12:20:49,779][12862] Signal inference workers to stop experience collection... (31600 times) [2024-06-18 12:20:49,779][12862] Signal inference workers to resume experience collection... (31600 times) [2024-06-18 12:20:49,798][12883] InferenceWorker_p0-w0: stopping experience collection (31600 times) [2024-06-18 12:20:49,798][12883] InferenceWorker_p0-w0: resuming experience collection (31600 times) [2024-06-18 12:20:50,674][12883] Updated weights for policy 0, policy_version 131843 (0.0037) [2024-06-18 12:20:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2160164864. Throughput: 0: 42828.0. Samples: 2160284920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:51,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 12:20:54,039][12883] Updated weights for policy 0, policy_version 131853 (0.0034) [2024-06-18 12:20:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2160361472. 
Throughput: 0: 42648.0. Samples: 2160540300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:20:56,994][12645] Avg episode reward: [(0, '0.177')] [2024-06-18 12:20:58,212][12883] Updated weights for policy 0, policy_version 131863 (0.0044) [2024-06-18 12:21:01,636][12883] Updated weights for policy 0, policy_version 131873 (0.0024) [2024-06-18 12:21:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42710.4). Total num frames: 2160607232. Throughput: 0: 42905.0. Samples: 2160670660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:01,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 12:21:05,781][12883] Updated weights for policy 0, policy_version 131883 (0.0031) [2024-06-18 12:21:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2160803840. Throughput: 0: 42815.2. Samples: 2160923800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:06,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 12:21:09,225][12883] Updated weights for policy 0, policy_version 131893 (0.0038) [2024-06-18 12:21:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2161016832. Throughput: 0: 42576.9. Samples: 2161177640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:12,003][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 12:21:13,487][12883] Updated weights for policy 0, policy_version 131903 (0.0028) [2024-06-18 12:21:16,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2161229824. Throughput: 0: 42715.0. Samples: 2161306560. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:17,004][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 12:21:17,375][12883] Updated weights for policy 0, policy_version 131913 (0.0030) [2024-06-18 12:21:21,517][12883] Updated weights for policy 0, policy_version 131923 (0.0033) [2024-06-18 12:21:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2161426432. Throughput: 0: 42482.2. Samples: 2161556400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:21,994][12645] Avg episode reward: [(0, '0.699')] [2024-06-18 12:21:24,891][12883] Updated weights for policy 0, policy_version 131933 (0.0037) [2024-06-18 12:21:26,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2161639424. Throughput: 0: 42518.2. Samples: 2161814680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:26,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 12:21:29,072][12883] Updated weights for policy 0, policy_version 131943 (0.0030) [2024-06-18 12:21:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2161868800. Throughput: 0: 42518.7. Samples: 2161941480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:31,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 12:21:32,519][12883] Updated weights for policy 0, policy_version 131953 (0.0031) [2024-06-18 12:21:36,715][12883] Updated weights for policy 0, policy_version 131963 (0.0037) [2024-06-18 12:21:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2162081792. Throughput: 0: 42606.2. Samples: 2162202200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:36,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 12:21:40,447][12883] Updated weights for policy 0, policy_version 131973 (0.0045) [2024-06-18 12:21:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2162294784. Throughput: 0: 42580.5. Samples: 2162456420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:41,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 12:21:44,397][12883] Updated weights for policy 0, policy_version 131983 (0.0029) [2024-06-18 12:21:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2162507776. Throughput: 0: 42443.0. Samples: 2162580600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:46,995][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 12:21:48,025][12883] Updated weights for policy 0, policy_version 131993 (0.0042) [2024-06-18 12:21:51,997][12645] Fps is (10 sec: 40954.5, 60 sec: 42324.4, 300 sec: 42542.7). Total num frames: 2162704384. Throughput: 0: 42576.0. Samples: 2162839780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:51,998][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 12:21:52,376][12883] Updated weights for policy 0, policy_version 132003 (0.0044) [2024-06-18 12:21:55,631][12883] Updated weights for policy 0, policy_version 132013 (0.0038) [2024-06-18 12:21:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2162933760. Throughput: 0: 42392.9. Samples: 2163085320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 12:21:56,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 12:22:00,081][12883] Updated weights for policy 0, policy_version 132023 (0.0034) [2024-06-18 12:22:01,994][12645] Fps is (10 sec: 42604.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2163130368. Throughput: 0: 42555.2. Samples: 2163221540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:01,994][12645] Avg episode reward: [(0, '0.166')] [2024-06-18 12:22:03,310][12883] Updated weights for policy 0, policy_version 132033 (0.0025) [2024-06-18 12:22:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2163359744. Throughput: 0: 42536.0. Samples: 2163470520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:06,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 12:22:07,721][12883] Updated weights for policy 0, policy_version 132043 (0.0037) [2024-06-18 12:22:11,367][12883] Updated weights for policy 0, policy_version 132053 (0.0036) [2024-06-18 12:22:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2163589120. Throughput: 0: 42499.6. Samples: 2163727160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:11,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 12:22:15,327][12883] Updated weights for policy 0, policy_version 132063 (0.0034) [2024-06-18 12:22:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2163769344. Throughput: 0: 42527.6. Samples: 2163855220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:16,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 12:22:18,999][12883] Updated weights for policy 0, policy_version 132073 (0.0029) [2024-06-18 12:22:21,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2163998720. Throughput: 0: 42462.3. Samples: 2164113000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:21,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 12:22:22,842][12883] Updated weights for policy 0, policy_version 132083 (0.0029) [2024-06-18 12:22:23,836][12862] Signal inference workers to stop experience collection... 
(31650 times) [2024-06-18 12:22:23,845][12862] Signal inference workers to resume experience collection... (31650 times) [2024-06-18 12:22:23,877][12883] InferenceWorker_p0-w0: stopping experience collection (31650 times) [2024-06-18 12:22:23,877][12883] InferenceWorker_p0-w0: resuming experience collection (31650 times) [2024-06-18 12:22:26,892][12883] Updated weights for policy 0, policy_version 132093 (0.0038) [2024-06-18 12:22:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2164211712. Throughput: 0: 42552.0. Samples: 2164371260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:26,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 12:22:30,817][12883] Updated weights for policy 0, policy_version 132103 (0.0040) [2024-06-18 12:22:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2164424704. Throughput: 0: 42598.8. Samples: 2164497540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:31,994][12645] Avg episode reward: [(0, '0.812')] [2024-06-18 12:22:34,412][12883] Updated weights for policy 0, policy_version 132113 (0.0040) [2024-06-18 12:22:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2164654080. Throughput: 0: 42547.8. Samples: 2164754380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:36,994][12645] Avg episode reward: [(0, '0.732')] [2024-06-18 12:22:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132120_2164654080.pth... [2024-06-18 12:22:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131496_2154430464.pth [2024-06-18 12:22:38,310][12883] Updated weights for policy 0, policy_version 132123 (0.0031) [2024-06-18 12:22:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2164834304. Throughput: 0: 42906.3. Samples: 2165016100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:41,994][12645] Avg episode reward: [(0, '0.738')] [2024-06-18 12:22:42,139][12883] Updated weights for policy 0, policy_version 132133 (0.0032) [2024-06-18 12:22:45,836][12883] Updated weights for policy 0, policy_version 132143 (0.0040) [2024-06-18 12:22:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2165063680. Throughput: 0: 42672.8. Samples: 2165141820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:46,994][12645] Avg episode reward: [(0, '0.740')] [2024-06-18 12:22:49,938][12883] Updated weights for policy 0, policy_version 132153 (0.0041) [2024-06-18 12:22:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42872.5, 300 sec: 42598.4). Total num frames: 2165276672. Throughput: 0: 42673.8. Samples: 2165390840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:51,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 12:22:53,360][12883] Updated weights for policy 0, policy_version 132163 (0.0025) [2024-06-18 12:22:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2165473280. Throughput: 0: 42837.6. Samples: 2165654860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:22:56,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 12:22:57,581][12883] Updated weights for policy 0, policy_version 132173 (0.0047) [2024-06-18 12:23:00,903][12883] Updated weights for policy 0, policy_version 132183 (0.0031) [2024-06-18 12:23:01,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2165702656. Throughput: 0: 42797.6. Samples: 2165781120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 12:23:01,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 12:23:05,458][12883] Updated weights for policy 0, policy_version 132193 (0.0037) [2024-06-18 12:23:06,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2165932032. Throughput: 0: 42651.2. Samples: 2166032300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:06,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 12:23:08,620][12883] Updated weights for policy 0, policy_version 132203 (0.0046) [2024-06-18 12:23:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2166095872. Throughput: 0: 42586.6. Samples: 2166287660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:11,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 12:23:13,197][12883] Updated weights for policy 0, policy_version 132213 (0.0038) [2024-06-18 12:23:16,671][12883] Updated weights for policy 0, policy_version 132223 (0.0033) [2024-06-18 12:23:16,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2166341632. Throughput: 0: 42470.1. Samples: 2166408700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:16,994][12645] Avg episode reward: [(0, '0.686')] [2024-06-18 12:23:20,958][12883] Updated weights for policy 0, policy_version 132233 (0.0036) [2024-06-18 12:23:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2166554624. Throughput: 0: 42605.8. Samples: 2166671640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:21,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 12:23:24,227][12883] Updated weights for policy 0, policy_version 132243 (0.0043) [2024-06-18 12:23:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2166751232. Throughput: 0: 42475.9. Samples: 2166927520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:26,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 12:23:28,507][12883] Updated weights for policy 0, policy_version 132253 (0.0028) [2024-06-18 12:23:31,783][12883] Updated weights for policy 0, policy_version 132263 (0.0032) [2024-06-18 12:23:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2166996992. Throughput: 0: 42539.5. Samples: 2167056100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:31,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 12:23:36,129][12883] Updated weights for policy 0, policy_version 132273 (0.0043) [2024-06-18 12:23:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 2167193600. Throughput: 0: 42660.0. Samples: 2167310540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:36,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 12:23:39,558][12883] Updated weights for policy 0, policy_version 132283 (0.0027) [2024-06-18 12:23:41,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2167390208. Throughput: 0: 42536.6. Samples: 2167569000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:41,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 12:23:43,730][12883] Updated weights for policy 0, policy_version 132293 (0.0029) [2024-06-18 12:23:46,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2167635968. Throughput: 0: 42467.0. Samples: 2167692140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:47,000][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 12:23:47,151][12883] Updated weights for policy 0, policy_version 132303 (0.0029) [2024-06-18 12:23:51,438][12883] Updated weights for policy 0, policy_version 132313 (0.0056) [2024-06-18 12:23:51,996][12645] Fps is (10 sec: 44226.5, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2167832576. Throughput: 0: 42593.4. Samples: 2167949100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:51,997][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 12:23:54,886][12883] Updated weights for policy 0, policy_version 132323 (0.0033) [2024-06-18 12:23:55,806][12862] Signal inference workers to stop experience collection... (31700 times) [2024-06-18 12:23:55,806][12862] Signal inference workers to resume experience collection... (31700 times) [2024-06-18 12:23:55,819][12883] InferenceWorker_p0-w0: stopping experience collection (31700 times) [2024-06-18 12:23:55,844][12883] InferenceWorker_p0-w0: resuming experience collection (31700 times) [2024-06-18 12:23:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2168029184. Throughput: 0: 42553.2. Samples: 2168202560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:23:56,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 12:23:59,401][12883] Updated weights for policy 0, policy_version 132333 (0.0031) [2024-06-18 12:24:01,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2168258560. Throughput: 0: 42662.4. Samples: 2168328500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:24:01,994][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 12:24:02,715][12883] Updated weights for policy 0, policy_version 132343 (0.0032) [2024-06-18 12:24:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2168471552. 
Throughput: 0: 42482.3. Samples: 2168583340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:06,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 12:24:07,006][12883] Updated weights for policy 0, policy_version 132353 (0.0031) [2024-06-18 12:24:10,511][12883] Updated weights for policy 0, policy_version 132363 (0.0038) [2024-06-18 12:24:11,996][12645] Fps is (10 sec: 42588.7, 60 sec: 43142.9, 300 sec: 42542.5). Total num frames: 2168684544. Throughput: 0: 42273.1. Samples: 2168829900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:11,996][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 12:24:14,790][12883] Updated weights for policy 0, policy_version 132373 (0.0031) [2024-06-18 12:24:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2168881152. Throughput: 0: 42415.6. Samples: 2168964800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:16,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 12:24:18,173][12883] Updated weights for policy 0, policy_version 132383 (0.0030) [2024-06-18 12:24:21,994][12645] Fps is (10 sec: 39330.2, 60 sec: 42052.3, 300 sec: 42487.6). Total num frames: 2169077760. Throughput: 0: 42502.1. Samples: 2169223140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:21,994][12645] Avg episode reward: [(0, '0.650')] [2024-06-18 12:24:22,445][12883] Updated weights for policy 0, policy_version 132393 (0.0034) [2024-06-18 12:24:26,058][12883] Updated weights for policy 0, policy_version 132403 (0.0035) [2024-06-18 12:24:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2169323520. Throughput: 0: 42205.7. Samples: 2169468260. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:26,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 12:24:30,559][12883] Updated weights for policy 0, policy_version 132413 (0.0032) [2024-06-18 12:24:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2169520128. Throughput: 0: 42467.3. Samples: 2169603160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:31,994][12645] Avg episode reward: [(0, '0.706')] [2024-06-18 12:24:33,970][12883] Updated weights for policy 0, policy_version 132423 (0.0027) [2024-06-18 12:24:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2169733120. Throughput: 0: 42490.0. Samples: 2169861060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:36,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 12:24:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132430_2169733120.pth... [2024-06-18 12:24:37,045][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000131808_2159542272.pth [2024-06-18 12:24:37,971][12883] Updated weights for policy 0, policy_version 132433 (0.0033) [2024-06-18 12:24:41,562][12883] Updated weights for policy 0, policy_version 132443 (0.0026) [2024-06-18 12:24:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2169962496. Throughput: 0: 42441.1. Samples: 2170112400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:41,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 12:24:45,818][12883] Updated weights for policy 0, policy_version 132453 (0.0034) [2024-06-18 12:24:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 2170175488. Throughput: 0: 42488.4. Samples: 2170240480. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:46,994][12645] Avg episode reward: [(0, '0.796')] [2024-06-18 12:24:49,160][12883] Updated weights for policy 0, policy_version 132463 (0.0030) [2024-06-18 12:24:51,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42053.9, 300 sec: 42542.9). Total num frames: 2170355712. Throughput: 0: 42463.1. Samples: 2170494180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:51,994][12645] Avg episode reward: [(0, '0.710')] [2024-06-18 12:24:53,348][12883] Updated weights for policy 0, policy_version 132473 (0.0044) [2024-06-18 12:24:56,882][12883] Updated weights for policy 0, policy_version 132483 (0.0028) [2024-06-18 12:24:56,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2170601472. Throughput: 0: 42591.3. Samples: 2170746420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:24:56,995][12645] Avg episode reward: [(0, '0.705')] [2024-06-18 12:25:01,034][12883] Updated weights for policy 0, policy_version 132493 (0.0031) [2024-06-18 12:25:01,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2170814464. Throughput: 0: 42466.7. Samples: 2170875800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:25:01,994][12645] Avg episode reward: [(0, '0.714')] [2024-06-18 12:25:04,402][12883] Updated weights for policy 0, policy_version 132503 (0.0033) [2024-06-18 12:25:06,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2171011072. Throughput: 0: 42367.7. Samples: 2171129680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 12:25:06,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 12:25:08,610][12883] Updated weights for policy 0, policy_version 132513 (0.0051) [2024-06-18 12:25:09,935][12862] Signal inference workers to stop experience collection... 
(31750 times) [2024-06-18 12:25:09,970][12883] InferenceWorker_p0-w0: stopping experience collection (31750 times) [2024-06-18 12:25:09,982][12862] Signal inference workers to resume experience collection... (31750 times) [2024-06-18 12:25:09,988][12883] InferenceWorker_p0-w0: resuming experience collection (31750 times) [2024-06-18 12:25:11,998][12645] Fps is (10 sec: 42581.0, 60 sec: 42597.1, 300 sec: 42597.8). Total num frames: 2171240448. Throughput: 0: 42650.4. Samples: 2171387700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:11,998][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 12:25:12,757][12883] Updated weights for policy 0, policy_version 132523 (0.0038) [2024-06-18 12:25:16,669][12883] Updated weights for policy 0, policy_version 132533 (0.0037) [2024-06-18 12:25:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2171453440. Throughput: 0: 42514.3. Samples: 2171516300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:16,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 12:25:20,388][12883] Updated weights for policy 0, policy_version 132543 (0.0046) [2024-06-18 12:25:21,994][12645] Fps is (10 sec: 39338.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2171633664. Throughput: 0: 42303.7. Samples: 2171764720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:21,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 12:25:24,367][12883] Updated weights for policy 0, policy_version 132553 (0.0042) [2024-06-18 12:25:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2171879424. Throughput: 0: 42364.4. Samples: 2172018800. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:26,994][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 12:25:28,075][12883] Updated weights for policy 0, policy_version 132563 (0.0029) [2024-06-18 12:25:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2172059648. Throughput: 0: 42470.7. Samples: 2172151660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:31,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 12:25:32,204][12883] Updated weights for policy 0, policy_version 132573 (0.0029) [2024-06-18 12:25:35,832][12883] Updated weights for policy 0, policy_version 132583 (0.0034) [2024-06-18 12:25:36,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2172272640. Throughput: 0: 42393.2. Samples: 2172401880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:36,994][12645] Avg episode reward: [(0, '0.676')] [2024-06-18 12:25:39,859][12883] Updated weights for policy 0, policy_version 132593 (0.0034) [2024-06-18 12:25:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 2172485632. Throughput: 0: 42341.3. Samples: 2172651780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:41,995][12645] Avg episode reward: [(0, '0.789')] [2024-06-18 12:25:43,600][12883] Updated weights for policy 0, policy_version 132603 (0.0028) [2024-06-18 12:25:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2172698624. Throughput: 0: 42297.8. Samples: 2172779200. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:46,996][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 12:25:47,579][12883] Updated weights for policy 0, policy_version 132613 (0.0037) [2024-06-18 12:25:51,204][12883] Updated weights for policy 0, policy_version 132623 (0.0030) [2024-06-18 12:25:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2172911616. Throughput: 0: 42299.8. Samples: 2173033180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:51,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 12:25:55,194][12883] Updated weights for policy 0, policy_version 132633 (0.0040) [2024-06-18 12:25:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 2173124608. Throughput: 0: 42214.6. Samples: 2173287180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:25:56,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 12:25:58,922][12883] Updated weights for policy 0, policy_version 132643 (0.0038) [2024-06-18 12:26:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2173337600. Throughput: 0: 42181.2. Samples: 2173414460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:26:01,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 12:26:02,686][12883] Updated weights for policy 0, policy_version 132653 (0.0025) [2024-06-18 12:26:06,654][12883] Updated weights for policy 0, policy_version 132663 (0.0036) [2024-06-18 12:26:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2173550592. Throughput: 0: 42309.8. Samples: 2173668660. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 12:26:06,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 12:26:10,349][12883] Updated weights for policy 0, policy_version 132673 (0.0036) [2024-06-18 12:26:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42055.2, 300 sec: 42487.3). Total num frames: 2173763584. Throughput: 0: 42292.9. Samples: 2173921980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:11,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 12:26:14,372][12883] Updated weights for policy 0, policy_version 132683 (0.0034) [2024-06-18 12:26:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2173960192. Throughput: 0: 42092.9. Samples: 2174045840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:16,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 12:26:18,477][12883] Updated weights for policy 0, policy_version 132693 (0.0040) [2024-06-18 12:26:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2174189568. Throughput: 0: 42249.4. Samples: 2174303100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:21,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 12:26:22,358][12883] Updated weights for policy 0, policy_version 132703 (0.0030) [2024-06-18 12:26:26,178][12883] Updated weights for policy 0, policy_version 132713 (0.0033) [2024-06-18 12:26:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2174402560. Throughput: 0: 42205.5. Samples: 2174551020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:26,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 12:26:30,059][12883] Updated weights for policy 0, policy_version 132723 (0.0029) [2024-06-18 12:26:31,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42050.7, 300 sec: 42375.9). Total num frames: 2174582784. Throughput: 0: 42156.6. Samples: 2174676340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:31,996][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 12:26:34,106][12883] Updated weights for policy 0, policy_version 132733 (0.0027) [2024-06-18 12:26:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2174812160. Throughput: 0: 42166.0. Samples: 2174930640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:36,994][12645] Avg episode reward: [(0, '0.681')] [2024-06-18 12:26:37,090][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132741_2174828544.pth... [2024-06-18 12:26:37,135][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132120_2164654080.pth [2024-06-18 12:26:37,754][12883] Updated weights for policy 0, policy_version 132743 (0.0023) [2024-06-18 12:26:41,727][12883] Updated weights for policy 0, policy_version 132753 (0.0040) [2024-06-18 12:26:41,996][12645] Fps is (10 sec: 45875.3, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 2175041536. Throughput: 0: 42295.2. Samples: 2175190560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:41,997][12645] Avg episode reward: [(0, '0.633')] [2024-06-18 12:26:45,135][12862] Signal inference workers to stop experience collection... (31800 times) [2024-06-18 12:26:45,136][12862] Signal inference workers to resume experience collection... (31800 times) [2024-06-18 12:26:45,155][12883] InferenceWorker_p0-w0: stopping experience collection (31800 times) [2024-06-18 12:26:45,155][12883] InferenceWorker_p0-w0: resuming experience collection (31800 times) [2024-06-18 12:26:45,288][12883] Updated weights for policy 0, policy_version 132763 (0.0037) [2024-06-18 12:26:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42487.5). Total num frames: 2175238144. Throughput: 0: 42276.5. Samples: 2175316900. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:46,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 12:26:49,203][12883] Updated weights for policy 0, policy_version 132773 (0.0028) [2024-06-18 12:26:51,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2175467520. Throughput: 0: 42456.7. Samples: 2175579220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:51,994][12645] Avg episode reward: [(0, '0.296')] [2024-06-18 12:26:52,830][12883] Updated weights for policy 0, policy_version 132783 (0.0030) [2024-06-18 12:26:56,766][12883] Updated weights for policy 0, policy_version 132793 (0.0044) [2024-06-18 12:26:56,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2175680512. Throughput: 0: 42507.8. Samples: 2175834840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:26:56,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 12:27:00,605][12883] Updated weights for policy 0, policy_version 132803 (0.0033) [2024-06-18 12:27:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2175877120. Throughput: 0: 42580.3. Samples: 2175961960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:27:01,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 12:27:04,527][12883] Updated weights for policy 0, policy_version 132813 (0.0043) [2024-06-18 12:27:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2176106496. Throughput: 0: 42634.1. Samples: 2176221640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:27:06,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 12:27:08,542][12883] Updated weights for policy 0, policy_version 132823 (0.0025) [2024-06-18 12:27:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2176319488. Throughput: 0: 42807.5. Samples: 2176477360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 12:27:11,998][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 12:27:12,103][12883] Updated weights for policy 0, policy_version 132833 (0.0036) [2024-06-18 12:27:16,044][12883] Updated weights for policy 0, policy_version 132843 (0.0034) [2024-06-18 12:27:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2176548864. Throughput: 0: 42941.7. Samples: 2176608620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:27:16,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 12:27:19,531][12883] Updated weights for policy 0, policy_version 132853 (0.0044) [2024-06-18 12:27:21,994][12645] Fps is (10 sec: 42595.6, 60 sec: 42597.9, 300 sec: 42487.2). Total num frames: 2176745472. Throughput: 0: 42937.9. Samples: 2176862880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:27:21,995][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 12:27:23,681][12883] Updated weights for policy 0, policy_version 132863 (0.0030) [2024-06-18 12:27:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2176974848. Throughput: 0: 42830.0. Samples: 2177117820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:27:26,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 12:27:27,287][12883] Updated weights for policy 0, policy_version 132873 (0.0040) [2024-06-18 12:27:31,318][12883] Updated weights for policy 0, policy_version 132883 (0.0030) [2024-06-18 12:27:31,994][12645] Fps is (10 sec: 42601.4, 60 sec: 43146.1, 300 sec: 42431.8). Total num frames: 2177171456. Throughput: 0: 42941.7. Samples: 2177249280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:27:31,994][12645] Avg episode reward: [(0, '0.106')] [2024-06-18 12:27:35,391][12883] Updated weights for policy 0, policy_version 132893 (0.0036) [2024-06-18 12:27:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2177400832. Throughput: 0: 42810.7. Samples: 2177505700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:27:36,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 12:27:39,233][12883] Updated weights for policy 0, policy_version 132903 (0.0055) [2024-06-18 12:27:42,000][12645] Fps is (10 sec: 42571.9, 60 sec: 42595.6, 300 sec: 42486.4). Total num frames: 2177597440. Throughput: 0: 42670.2. Samples: 2177755260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:27:42,001][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 12:27:42,865][12883] Updated weights for policy 0, policy_version 132913 (0.0041) [2024-06-18 12:27:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2177810432. Throughput: 0: 42838.7. Samples: 2177889700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:27:46,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 12:27:46,998][12883] Updated weights for policy 0, policy_version 132923 (0.0037) [2024-06-18 12:27:50,996][12883] Updated weights for policy 0, policy_version 132933 (0.0028) [2024-06-18 12:27:51,995][12645] Fps is (10 sec: 42620.1, 60 sec: 42597.6, 300 sec: 42542.7). Total num frames: 2178023424. Throughput: 0: 42766.5. Samples: 2178146180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:27:51,995][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 12:27:54,676][12883] Updated weights for policy 0, policy_version 132943 (0.0026) [2024-06-18 12:27:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2178236416. Throughput: 0: 42553.8. Samples: 2178392280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:27:56,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 12:27:58,611][12883] Updated weights for policy 0, policy_version 132953 (0.0025) [2024-06-18 12:28:01,994][12645] Fps is (10 sec: 40964.9, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 2178433024. Throughput: 0: 42560.1. Samples: 2178523820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:28:02,000][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 12:28:02,434][12883] Updated weights for policy 0, policy_version 132963 (0.0032) [2024-06-18 12:28:06,151][12883] Updated weights for policy 0, policy_version 132973 (0.0045) [2024-06-18 12:28:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2178678784. Throughput: 0: 42599.4. Samples: 2178779820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:28:06,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 12:28:09,931][12883] Updated weights for policy 0, policy_version 132983 (0.0044) [2024-06-18 12:28:10,611][12862] Signal inference workers to stop experience collection... (31850 times) [2024-06-18 12:28:10,665][12883] InferenceWorker_p0-w0: stopping experience collection (31850 times) [2024-06-18 12:28:10,668][12862] Signal inference workers to resume experience collection... (31850 times) [2024-06-18 12:28:10,678][12883] InferenceWorker_p0-w0: resuming experience collection (31850 times) [2024-06-18 12:28:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2178875392. Throughput: 0: 42565.9. Samples: 2179033280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 12:28:11,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 12:28:13,710][12883] Updated weights for policy 0, policy_version 132993 (0.0028) [2024-06-18 12:28:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2179088384. 
Throughput: 0: 42554.7. Samples: 2179164240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:28:16,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 12:28:17,432][12883] Updated weights for policy 0, policy_version 133003 (0.0040) [2024-06-18 12:28:21,211][12883] Updated weights for policy 0, policy_version 133013 (0.0028) [2024-06-18 12:28:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.9, 300 sec: 42542.9). Total num frames: 2179301376. Throughput: 0: 42546.3. Samples: 2179420280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:28:21,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 12:28:25,223][12883] Updated weights for policy 0, policy_version 133023 (0.0027) [2024-06-18 12:28:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2179530752. Throughput: 0: 42756.1. Samples: 2179679020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:28:26,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 12:28:28,784][12883] Updated weights for policy 0, policy_version 133033 (0.0035) [2024-06-18 12:28:31,996][12645] Fps is (10 sec: 42587.3, 60 sec: 42596.6, 300 sec: 42486.9). Total num frames: 2179727360. Throughput: 0: 42571.3. Samples: 2179805520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:28:31,997][12645] Avg episode reward: [(0, '0.699')] [2024-06-18 12:28:32,755][12883] Updated weights for policy 0, policy_version 133043 (0.0029) [2024-06-18 12:28:36,578][12883] Updated weights for policy 0, policy_version 133053 (0.0042) [2024-06-18 12:28:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2179940352. Throughput: 0: 42521.1. Samples: 2180059580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:28:36,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 12:28:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133054_2179956736.pth... 
[2024-06-18 12:28:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132430_2169733120.pth [2024-06-18 12:28:40,449][12883] Updated weights for policy 0, policy_version 133063 (0.0026) [2024-06-18 12:28:41,994][12645] Fps is (10 sec: 42609.5, 60 sec: 42602.8, 300 sec: 42431.8). Total num frames: 2180153344. Throughput: 0: 42829.3. Samples: 2180319600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:28:41,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 12:28:44,115][12883] Updated weights for policy 0, policy_version 133073 (0.0033) [2024-06-18 12:28:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 2180366336. Throughput: 0: 42709.4. Samples: 2180445740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:28:46,994][12645] Avg episode reward: [(0, '0.714')] [2024-06-18 12:28:48,117][12883] Updated weights for policy 0, policy_version 133083 (0.0037) [2024-06-18 12:28:51,941][12883] Updated weights for policy 0, policy_version 133093 (0.0026) [2024-06-18 12:28:51,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42870.7, 300 sec: 42598.1). Total num frames: 2180595712. Throughput: 0: 42761.0. Samples: 2180704160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:28:51,996][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 12:28:55,625][12883] Updated weights for policy 0, policy_version 133103 (0.0039) [2024-06-18 12:28:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2180792320. Throughput: 0: 42872.8. Samples: 2180962560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:28:56,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 12:28:59,427][12883] Updated weights for policy 0, policy_version 133113 (0.0033) [2024-06-18 12:29:01,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2181021696. Throughput: 0: 42636.9. Samples: 2181082900. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:29:01,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 12:29:03,550][12883] Updated weights for policy 0, policy_version 133123 (0.0039) [2024-06-18 12:29:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 2181234688. Throughput: 0: 42778.7. Samples: 2181345320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:29:06,994][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 12:29:07,030][12883] Updated weights for policy 0, policy_version 133133 (0.0031) [2024-06-18 12:29:11,089][12883] Updated weights for policy 0, policy_version 133143 (0.0031) [2024-06-18 12:29:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2181447680. Throughput: 0: 42577.8. Samples: 2181595020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:29:11,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 12:29:15,239][12883] Updated weights for policy 0, policy_version 133153 (0.0039) [2024-06-18 12:29:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2181644288. Throughput: 0: 42724.8. Samples: 2181728020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:29:16,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 12:29:18,562][12883] Updated weights for policy 0, policy_version 133163 (0.0027) [2024-06-18 12:29:21,999][12645] Fps is (10 sec: 42577.7, 60 sec: 42867.9, 300 sec: 42542.2). Total num frames: 2181873664. Throughput: 0: 42770.4. Samples: 2181984460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:29:21,999][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 12:29:22,813][12883] Updated weights for policy 0, policy_version 133173 (0.0036) [2024-06-18 12:29:26,148][12883] Updated weights for policy 0, policy_version 133183 (0.0038) [2024-06-18 12:29:26,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2182086656. Throughput: 0: 42612.8. Samples: 2182237180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:29:26,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 12:29:30,685][12883] Updated weights for policy 0, policy_version 133193 (0.0022) [2024-06-18 12:29:31,994][12645] Fps is (10 sec: 40979.8, 60 sec: 42600.2, 300 sec: 42542.9). Total num frames: 2182283264. Throughput: 0: 42751.8. Samples: 2182369580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:29:31,994][12645] Avg episode reward: [(0, '0.707')] [2024-06-18 12:29:34,066][12883] Updated weights for policy 0, policy_version 133203 (0.0031) [2024-06-18 12:29:37,000][12645] Fps is (10 sec: 42572.3, 60 sec: 42867.0, 300 sec: 42542.0). Total num frames: 2182512640. Throughput: 0: 42562.9. Samples: 2182619660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:29:37,000][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 12:29:38,357][12883] Updated weights for policy 0, policy_version 133213 (0.0038) [2024-06-18 12:29:41,915][12883] Updated weights for policy 0, policy_version 133223 (0.0032) [2024-06-18 12:29:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2182725632. Throughput: 0: 42474.3. Samples: 2182873900. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:29:41,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 12:29:46,357][12883] Updated weights for policy 0, policy_version 133233 (0.0035) [2024-06-18 12:29:46,994][12645] Fps is (10 sec: 40985.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2182922240. Throughput: 0: 42618.6. Samples: 2183000740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:29:46,994][12645] Avg episode reward: [(0, '0.240')] [2024-06-18 12:29:49,707][12883] Updated weights for policy 0, policy_version 133243 (0.0032) [2024-06-18 12:29:49,748][12862] Signal inference workers to stop experience collection... (31900 times) [2024-06-18 12:29:49,748][12862] Signal inference workers to resume experience collection... (31900 times) [2024-06-18 12:29:49,775][12883] InferenceWorker_p0-w0: stopping experience collection (31900 times) [2024-06-18 12:29:49,775][12883] InferenceWorker_p0-w0: resuming experience collection (31900 times) [2024-06-18 12:29:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 2183151616. Throughput: 0: 42399.6. Samples: 2183253300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:29:51,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 12:29:54,206][12883] Updated weights for policy 0, policy_version 133253 (0.0036) [2024-06-18 12:29:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2183364608. Throughput: 0: 42555.1. Samples: 2183510000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:29:56,994][12645] Avg episode reward: [(0, '0.663')] [2024-06-18 12:29:57,242][12883] Updated weights for policy 0, policy_version 133263 (0.0030) [2024-06-18 12:30:01,938][12883] Updated weights for policy 0, policy_version 133273 (0.0030) [2024-06-18 12:30:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2183544832. 
Throughput: 0: 42445.2. Samples: 2183638060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:30:01,995][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 12:30:04,826][12883] Updated weights for policy 0, policy_version 133283 (0.0028) [2024-06-18 12:30:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42543.5). Total num frames: 2183790592. Throughput: 0: 42292.6. Samples: 2183887420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:30:06,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 12:30:09,656][12883] Updated weights for policy 0, policy_version 133293 (0.0038) [2024-06-18 12:30:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2184003584. Throughput: 0: 42393.4. Samples: 2184144880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:30:11,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 12:30:12,833][12883] Updated weights for policy 0, policy_version 133303 (0.0043) [2024-06-18 12:30:16,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2184183808. Throughput: 0: 42295.6. Samples: 2184272880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 12:30:16,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 12:30:17,282][12883] Updated weights for policy 0, policy_version 133313 (0.0035) [2024-06-18 12:30:20,359][12883] Updated weights for policy 0, policy_version 133323 (0.0024) [2024-06-18 12:30:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42328.8, 300 sec: 42487.3). Total num frames: 2184413184. Throughput: 0: 42363.1. Samples: 2184525740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:30:21,994][12645] Avg episode reward: [(0, '0.265')] [2024-06-18 12:30:24,995][12883] Updated weights for policy 0, policy_version 133333 (0.0030) [2024-06-18 12:30:26,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2184642560. 
Throughput: 0: 42294.3. Samples: 2184777140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:30:26,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 12:30:28,021][12883] Updated weights for policy 0, policy_version 133343 (0.0039) [2024-06-18 12:30:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2184822784. Throughput: 0: 42471.2. Samples: 2184911940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:30:31,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 12:30:32,571][12883] Updated weights for policy 0, policy_version 133353 (0.0033) [2024-06-18 12:30:35,916][12883] Updated weights for policy 0, policy_version 133363 (0.0033) [2024-06-18 12:30:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42329.7, 300 sec: 42598.4). Total num frames: 2185052160. Throughput: 0: 42642.2. Samples: 2185172200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:30:36,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 12:30:37,050][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133366_2185068544.pth... [2024-06-18 12:30:37,111][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000132741_2174828544.pth [2024-06-18 12:30:40,409][12883] Updated weights for policy 0, policy_version 133373 (0.0043) [2024-06-18 12:30:41,997][12645] Fps is (10 sec: 45860.1, 60 sec: 42596.1, 300 sec: 42653.5). Total num frames: 2185281536. Throughput: 0: 42465.9. Samples: 2185421100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:30:41,998][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 12:30:43,526][12883] Updated weights for policy 0, policy_version 133383 (0.0037) [2024-06-18 12:30:46,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42323.8, 300 sec: 42542.6). Total num frames: 2185461760. Throughput: 0: 42562.8. Samples: 2185553480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:30:46,996][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 12:30:47,900][12883] Updated weights for policy 0, policy_version 133393 (0.0028) [2024-06-18 12:30:51,037][12883] Updated weights for policy 0, policy_version 133403 (0.0040) [2024-06-18 12:30:51,994][12645] Fps is (10 sec: 40973.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2185691136. Throughput: 0: 42741.3. Samples: 2185810780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:30:51,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 12:30:55,433][12883] Updated weights for policy 0, policy_version 133413 (0.0033) [2024-06-18 12:30:55,891][12862] Signal inference workers to stop experience collection... (31950 times) [2024-06-18 12:30:55,891][12862] Signal inference workers to resume experience collection... (31950 times) [2024-06-18 12:30:55,932][12883] InferenceWorker_p0-w0: stopping experience collection (31950 times) [2024-06-18 12:30:55,932][12883] InferenceWorker_p0-w0: resuming experience collection (31950 times) [2024-06-18 12:30:56,994][12645] Fps is (10 sec: 45885.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2185920512. Throughput: 0: 42748.9. Samples: 2186068580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:30:56,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 12:30:58,486][12883] Updated weights for policy 0, policy_version 133423 (0.0037) [2024-06-18 12:31:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2186117120. Throughput: 0: 42766.7. Samples: 2186197380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:31:01,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 12:31:02,960][12883] Updated weights for policy 0, policy_version 133433 (0.0026) [2024-06-18 12:31:06,203][12883] Updated weights for policy 0, policy_version 133443 (0.0027) [2024-06-18 12:31:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2186346496. Throughput: 0: 42700.8. Samples: 2186447280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:31:06,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 12:31:10,594][12883] Updated weights for policy 0, policy_version 133453 (0.0022) [2024-06-18 12:31:11,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2186559488. Throughput: 0: 42896.9. Samples: 2186707500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:31:11,994][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 12:31:14,419][12883] Updated weights for policy 0, policy_version 133463 (0.0048) [2024-06-18 12:31:16,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2186756096. Throughput: 0: 42669.8. Samples: 2186832080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:31:16,994][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 12:31:18,189][12883] Updated weights for policy 0, policy_version 133473 (0.0033) [2024-06-18 12:31:21,874][12883] Updated weights for policy 0, policy_version 133483 (0.0035) [2024-06-18 12:31:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2186985472. Throughput: 0: 42628.0. Samples: 2187090460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 12:31:21,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 12:31:26,295][12883] Updated weights for policy 0, policy_version 133493 (0.0031) [2024-06-18 12:31:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42709.8). Total num frames: 2187182080. Throughput: 0: 42912.9. Samples: 2187352040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:31:26,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 12:31:29,461][12883] Updated weights for policy 0, policy_version 133503 (0.0041) [2024-06-18 12:31:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2187395072. Throughput: 0: 42599.4. Samples: 2187470360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:31:31,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 12:31:34,014][12883] Updated weights for policy 0, policy_version 133513 (0.0028) [2024-06-18 12:31:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2187624448. Throughput: 0: 42560.0. Samples: 2187725980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:31:36,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 12:31:37,158][12883] Updated weights for policy 0, policy_version 133523 (0.0041) [2024-06-18 12:31:41,621][12883] Updated weights for policy 0, policy_version 133533 (0.0036) [2024-06-18 12:31:41,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42327.7, 300 sec: 42653.9). Total num frames: 2187821056. Throughput: 0: 42620.9. Samples: 2187986520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:31:41,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 12:31:44,783][12883] Updated weights for policy 0, policy_version 133543 (0.0033) [2024-06-18 12:31:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42873.0, 300 sec: 42598.4). Total num frames: 2188034048. Throughput: 0: 42480.8. Samples: 2188109020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:31:46,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 12:31:49,285][12883] Updated weights for policy 0, policy_version 133553 (0.0027) [2024-06-18 12:31:51,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2188263424. Throughput: 0: 42625.5. Samples: 2188365520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:31:51,996][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 12:31:52,581][12883] Updated weights for policy 0, policy_version 133563 (0.0034) [2024-06-18 12:31:56,797][12883] Updated weights for policy 0, policy_version 133573 (0.0034) [2024-06-18 12:31:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2188460032. Throughput: 0: 42718.6. Samples: 2188629840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:31:56,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 12:32:00,247][12883] Updated weights for policy 0, policy_version 133583 (0.0040) [2024-06-18 12:32:02,000][12645] Fps is (10 sec: 40943.6, 60 sec: 42594.0, 300 sec: 42597.5). Total num frames: 2188673024. Throughput: 0: 42624.3. Samples: 2188750440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:32:02,001][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 12:32:04,690][12883] Updated weights for policy 0, policy_version 133593 (0.0032) [2024-06-18 12:32:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2188902400. Throughput: 0: 42674.2. Samples: 2189010800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:32:06,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 12:32:07,631][12883] Updated weights for policy 0, policy_version 133603 (0.0039) [2024-06-18 12:32:11,996][12645] Fps is (10 sec: 40976.3, 60 sec: 42050.6, 300 sec: 42487.0). Total num frames: 2189082624. Throughput: 0: 42605.0. Samples: 2189269360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:32:11,997][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 12:32:12,488][12883] Updated weights for policy 0, policy_version 133613 (0.0039) [2024-06-18 12:32:15,487][12883] Updated weights for policy 0, policy_version 133623 (0.0034) [2024-06-18 12:32:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.5). Total num frames: 2189312000. Throughput: 0: 42589.8. Samples: 2189386900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:32:16,994][12645] Avg episode reward: [(0, '0.637')] [2024-06-18 12:32:20,129][12883] Updated weights for policy 0, policy_version 133633 (0.0042) [2024-06-18 12:32:21,994][12645] Fps is (10 sec: 47524.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2189557760. Throughput: 0: 42651.6. Samples: 2189645300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 12:32:21,994][12645] Avg episode reward: [(0, '0.800')] [2024-06-18 12:32:22,875][12883] Updated weights for policy 0, policy_version 133643 (0.0034) [2024-06-18 12:32:26,995][12645] Fps is (10 sec: 42594.9, 60 sec: 42597.8, 300 sec: 42598.3). Total num frames: 2189737984. Throughput: 0: 42736.0. Samples: 2189909680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:32:26,995][12645] Avg episode reward: [(0, '0.834')] [2024-06-18 12:32:27,947][12883] Updated weights for policy 0, policy_version 133653 (0.0025) [2024-06-18 12:32:30,309][12883] Updated weights for policy 0, policy_version 133663 (0.0026) [2024-06-18 12:32:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2189950976. Throughput: 0: 42579.6. Samples: 2190025100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:32:31,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 12:32:35,653][12883] Updated weights for policy 0, policy_version 133673 (0.0045) [2024-06-18 12:32:36,382][12862] Signal inference workers to stop experience collection... (32000 times) [2024-06-18 12:32:36,383][12862] Signal inference workers to resume experience collection... (32000 times) [2024-06-18 12:32:36,403][12883] InferenceWorker_p0-w0: stopping experience collection (32000 times) [2024-06-18 12:32:36,403][12883] InferenceWorker_p0-w0: resuming experience collection (32000 times) [2024-06-18 12:32:36,994][12645] Fps is (10 sec: 44240.7, 60 sec: 42598.4, 300 sec: 42654.8). Total num frames: 2190180352. Throughput: 0: 42926.1. Samples: 2190297100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:32:36,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 12:32:37,090][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133679_2190196736.pth... [2024-06-18 12:32:37,160][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133054_2179956736.pth [2024-06-18 12:32:37,888][12883] Updated weights for policy 0, policy_version 133683 (0.0038) [2024-06-18 12:32:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2190376960. Throughput: 0: 42564.3. Samples: 2190545240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:32:41,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 12:32:43,291][12883] Updated weights for policy 0, policy_version 133693 (0.0027) [2024-06-18 12:32:45,613][12883] Updated weights for policy 0, policy_version 133703 (0.0029) [2024-06-18 12:32:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.6). Total num frames: 2190589952. Throughput: 0: 42705.0. Samples: 2190671900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:32:46,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 12:32:50,920][12883] Updated weights for policy 0, policy_version 133713 (0.0027) [2024-06-18 12:32:51,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42327.0, 300 sec: 42598.4). Total num frames: 2190802944. Throughput: 0: 42759.6. Samples: 2190934980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:32:51,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 12:32:53,298][12883] Updated weights for policy 0, policy_version 133723 (0.0032) [2024-06-18 12:32:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2191015936. Throughput: 0: 42517.7. Samples: 2191182560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:32:56,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 12:32:58,770][12883] Updated weights for policy 0, policy_version 133733 (0.0044) [2024-06-18 12:33:01,660][12883] Updated weights for policy 0, policy_version 133743 (0.0038) [2024-06-18 12:33:01,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42875.8, 300 sec: 42598.4). Total num frames: 2191245312. Throughput: 0: 42677.3. Samples: 2191307380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:33:01,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 12:33:06,303][12883] Updated weights for policy 0, policy_version 133753 (0.0032) [2024-06-18 12:33:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2191441920. Throughput: 0: 42725.8. Samples: 2191567960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:33:06,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 12:33:09,464][12883] Updated weights for policy 0, policy_version 133763 (0.0042) [2024-06-18 12:33:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2191654912. Throughput: 0: 42414.2. Samples: 2191818280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:33:11,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 12:33:13,839][12883] Updated weights for policy 0, policy_version 133773 (0.0027) [2024-06-18 12:33:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2191884288. Throughput: 0: 42709.4. Samples: 2191947020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:33:16,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 12:33:17,022][12883] Updated weights for policy 0, policy_version 133783 (0.0033) [2024-06-18 12:33:21,715][12883] Updated weights for policy 0, policy_version 133793 (0.0031) [2024-06-18 12:33:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2192080896. Throughput: 0: 42489.0. Samples: 2192209100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:33:21,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 12:33:24,675][12883] Updated weights for policy 0, policy_version 133803 (0.0041) [2024-06-18 12:33:26,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42326.0, 300 sec: 42543.2). Total num frames: 2192277504. Throughput: 0: 42534.8. Samples: 2192459300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 12:33:26,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 12:33:29,464][12883] Updated weights for policy 0, policy_version 133813 (0.0032) [2024-06-18 12:33:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2192523264. Throughput: 0: 42634.8. Samples: 2192590460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:33:31,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 12:33:32,684][12883] Updated weights for policy 0, policy_version 133823 (0.0036) [2024-06-18 12:33:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2192703488. Throughput: 0: 42513.2. Samples: 2192848080. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:33:36,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 12:33:37,121][12883] Updated weights for policy 0, policy_version 133833 (0.0033) [2024-06-18 12:33:40,525][12883] Updated weights for policy 0, policy_version 133843 (0.0036) [2024-06-18 12:33:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2192932864. Throughput: 0: 42441.3. Samples: 2193092420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:33:41,994][12645] Avg episode reward: [(0, '0.759')] [2024-06-18 12:33:44,783][12883] Updated weights for policy 0, policy_version 133853 (0.0036) [2024-06-18 12:33:46,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 2193178624. Throughput: 0: 42701.4. Samples: 2193228940. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:33:46,994][12645] Avg episode reward: [(0, '0.796')] [2024-06-18 12:33:48,142][12883] Updated weights for policy 0, policy_version 133863 (0.0034) [2024-06-18 12:33:48,370][12862] Signal inference workers to stop experience collection... (32050 times) [2024-06-18 12:33:48,371][12862] Signal inference workers to resume experience collection... (32050 times) [2024-06-18 12:33:48,411][12883] InferenceWorker_p0-w0: stopping experience collection (32050 times) [2024-06-18 12:33:48,411][12883] InferenceWorker_p0-w0: resuming experience collection (32050 times) [2024-06-18 12:33:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 2193342464. Throughput: 0: 42690.1. Samples: 2193489020. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:33:51,994][12645] Avg episode reward: [(0, '0.660')] [2024-06-18 12:33:52,296][12883] Updated weights for policy 0, policy_version 133873 (0.0041) [2024-06-18 12:33:55,615][12883] Updated weights for policy 0, policy_version 133883 (0.0037) [2024-06-18 12:33:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2193588224. Throughput: 0: 42547.6. Samples: 2193732920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:33:56,994][12645] Avg episode reward: [(0, '0.644')] [2024-06-18 12:33:59,957][12883] Updated weights for policy 0, policy_version 133893 (0.0037) [2024-06-18 12:34:01,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2193801216. Throughput: 0: 42651.7. Samples: 2193866340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:34:01,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 12:34:03,182][12883] Updated weights for policy 0, policy_version 133903 (0.0027) [2024-06-18 12:34:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2193981440. Throughput: 0: 42647.4. Samples: 2194128240. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:34:06,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 12:34:07,865][12883] Updated weights for policy 0, policy_version 133913 (0.0044) [2024-06-18 12:34:10,920][12883] Updated weights for policy 0, policy_version 133923 (0.0030) [2024-06-18 12:34:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2194210816. Throughput: 0: 42428.1. Samples: 2194368560. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:34:11,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 12:34:15,534][12883] Updated weights for policy 0, policy_version 133933 (0.0028) [2024-06-18 12:34:16,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42654.6). Total num frames: 2194456576. Throughput: 0: 42509.6. Samples: 2194503400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:34:16,994][12645] Avg episode reward: [(0, '0.739')] [2024-06-18 12:34:18,654][12883] Updated weights for policy 0, policy_version 133943 (0.0038) [2024-06-18 12:34:21,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2194604032. Throughput: 0: 42347.9. Samples: 2194753740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:34:21,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 12:34:23,386][12883] Updated weights for policy 0, policy_version 133953 (0.0041) [2024-06-18 12:34:26,390][12883] Updated weights for policy 0, policy_version 133963 (0.0033) [2024-06-18 12:34:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2194849792. Throughput: 0: 42410.7. Samples: 2195000900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 12:34:26,994][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 12:34:31,029][12883] Updated weights for policy 0, policy_version 133973 (0.0035) [2024-06-18 12:34:31,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42598.3, 300 sec: 42599.3). Total num frames: 2195079168. Throughput: 0: 42481.7. Samples: 2195140620. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:34:31,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 12:34:34,566][12883] Updated weights for policy 0, policy_version 133983 (0.0023) [2024-06-18 12:34:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2195243008. Throughput: 0: 42154.8. Samples: 2195385980. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:34:36,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 12:34:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133987_2195243008.pth... [2024-06-18 12:34:37,056][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133366_2185068544.pth [2024-06-18 12:34:38,651][12883] Updated weights for policy 0, policy_version 133993 (0.0042) [2024-06-18 12:34:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2195488768. Throughput: 0: 42388.0. Samples: 2195640380. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:34:41,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 12:34:42,310][12883] Updated weights for policy 0, policy_version 134003 (0.0032) [2024-06-18 12:34:46,213][12883] Updated weights for policy 0, policy_version 134013 (0.0030) [2024-06-18 12:34:46,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2195718144. Throughput: 0: 42487.9. Samples: 2195778300. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:34:46,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 12:34:49,805][12883] Updated weights for policy 0, policy_version 134023 (0.0037) [2024-06-18 12:34:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2195881984. Throughput: 0: 42034.2. Samples: 2196019780. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:34:51,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 12:34:53,919][12883] Updated weights for policy 0, policy_version 134033 (0.0036) [2024-06-18 12:34:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2196127744. Throughput: 0: 42468.2. Samples: 2196279640. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:34:56,995][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 12:34:57,281][12883] Updated weights for policy 0, policy_version 134043 (0.0037) [2024-06-18 12:35:00,516][12862] Signal inference workers to stop experience collection... (32100 times) [2024-06-18 12:35:00,517][12862] Signal inference workers to resume experience collection... (32100 times) [2024-06-18 12:35:00,527][12883] InferenceWorker_p0-w0: stopping experience collection (32100 times) [2024-06-18 12:35:00,528][12883] InferenceWorker_p0-w0: resuming experience collection (32100 times) [2024-06-18 12:35:01,459][12883] Updated weights for policy 0, policy_version 134053 (0.0037) [2024-06-18 12:35:02,000][12645] Fps is (10 sec: 45846.3, 60 sec: 42320.8, 300 sec: 42541.9). Total num frames: 2196340736. Throughput: 0: 42399.0. Samples: 2196411620. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:35:02,001][12645] Avg episode reward: [(0, '0.740')] [2024-06-18 12:35:05,211][12883] Updated weights for policy 0, policy_version 134063 (0.0031) [2024-06-18 12:35:06,996][12645] Fps is (10 sec: 39313.3, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 2196520960. Throughput: 0: 42360.6. Samples: 2196660060. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:35:06,996][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 12:35:09,100][12883] Updated weights for policy 0, policy_version 134073 (0.0030) [2024-06-18 12:35:11,994][12645] Fps is (10 sec: 42625.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2196766720. Throughput: 0: 42540.4. Samples: 2196915220. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:35:11,994][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 12:35:12,924][12883] Updated weights for policy 0, policy_version 134083 (0.0033) [2024-06-18 12:35:16,851][12883] Updated weights for policy 0, policy_version 134093 (0.0034) [2024-06-18 12:35:16,994][12645] Fps is (10 sec: 45884.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2196979712. Throughput: 0: 42451.1. Samples: 2197050920. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:35:16,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 12:35:20,716][12883] Updated weights for policy 0, policy_version 134103 (0.0043) [2024-06-18 12:35:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2197176320. Throughput: 0: 42576.8. Samples: 2197301940. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:35:21,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 12:35:24,473][12883] Updated weights for policy 0, policy_version 134113 (0.0048) [2024-06-18 12:35:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2197422080. Throughput: 0: 42509.6. Samples: 2197553320. Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:35:26,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 12:35:28,399][12883] Updated weights for policy 0, policy_version 134123 (0.0041) [2024-06-18 12:35:31,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2197602304. Throughput: 0: 42423.2. Samples: 2197687340. 
Policy #0 lag: (min: 2.0, avg: 9.7, max: 22.0) [2024-06-18 12:35:31,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 12:35:32,264][12883] Updated weights for policy 0, policy_version 134133 (0.0031) [2024-06-18 12:35:35,875][12883] Updated weights for policy 0, policy_version 134143 (0.0024) [2024-06-18 12:35:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42543.3). Total num frames: 2197831680. Throughput: 0: 42742.2. Samples: 2197943180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:35:36,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 12:35:39,830][12883] Updated weights for policy 0, policy_version 134153 (0.0032) [2024-06-18 12:35:41,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2198061056. Throughput: 0: 42574.4. Samples: 2198195480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:35:41,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 12:35:43,826][12883] Updated weights for policy 0, policy_version 134163 (0.0041) [2024-06-18 12:35:46,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42050.7, 300 sec: 42542.5). Total num frames: 2198241280. Throughput: 0: 42482.5. Samples: 2198323160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:35:46,997][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 12:35:47,561][12883] Updated weights for policy 0, policy_version 134173 (0.0028) [2024-06-18 12:35:51,811][12883] Updated weights for policy 0, policy_version 134183 (0.0031) [2024-06-18 12:35:51,994][12645] Fps is (10 sec: 40959.3, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 2198470656. Throughput: 0: 42648.3. Samples: 2198579140. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:35:51,994][12645] Avg episode reward: [(0, '0.314')] [2024-06-18 12:35:55,174][12883] Updated weights for policy 0, policy_version 134193 (0.0047) [2024-06-18 12:35:56,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2198683648. Throughput: 0: 42621.4. Samples: 2198833180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:35:56,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 12:35:59,497][12883] Updated weights for policy 0, policy_version 134203 (0.0036) [2024-06-18 12:36:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42329.8, 300 sec: 42487.3). Total num frames: 2198880256. Throughput: 0: 42509.5. Samples: 2198963840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:36:01,994][12645] Avg episode reward: [(0, '0.127')] [2024-06-18 12:36:03,293][12883] Updated weights for policy 0, policy_version 134213 (0.0043) [2024-06-18 12:36:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42873.0, 300 sec: 42487.3). Total num frames: 2199093248. Throughput: 0: 42561.7. Samples: 2199217220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:36:06,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 12:36:07,075][12883] Updated weights for policy 0, policy_version 134223 (0.0028) [2024-06-18 12:36:10,945][12883] Updated weights for policy 0, policy_version 134233 (0.0044) [2024-06-18 12:36:11,097][12862] Signal inference workers to stop experience collection... (32150 times) [2024-06-18 12:36:11,097][12862] Signal inference workers to resume experience collection... (32150 times) [2024-06-18 12:36:11,107][12883] InferenceWorker_p0-w0: stopping experience collection (32150 times) [2024-06-18 12:36:11,108][12883] InferenceWorker_p0-w0: resuming experience collection (32150 times) [2024-06-18 12:36:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2199322624. 
Throughput: 0: 42521.4. Samples: 2199466780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:36:11,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 12:36:14,831][12883] Updated weights for policy 0, policy_version 134243 (0.0023) [2024-06-18 12:36:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2199519232. Throughput: 0: 42508.8. Samples: 2199600240. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:36:16,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 12:36:18,489][12883] Updated weights for policy 0, policy_version 134253 (0.0037) [2024-06-18 12:36:21,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2199732224. Throughput: 0: 42403.6. Samples: 2199851340. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:36:21,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 12:36:22,482][12883] Updated weights for policy 0, policy_version 134263 (0.0039) [2024-06-18 12:36:26,120][12883] Updated weights for policy 0, policy_version 134273 (0.0031) [2024-06-18 12:36:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2199961600. Throughput: 0: 42546.6. Samples: 2200110080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:36:26,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 12:36:30,202][12883] Updated weights for policy 0, policy_version 134283 (0.0030) [2024-06-18 12:36:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2200158208. Throughput: 0: 42543.9. Samples: 2200237540. Policy #0 lag: (min: 0.0, avg: 12.5, max: 25.0) [2024-06-18 12:36:31,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 12:36:33,983][12883] Updated weights for policy 0, policy_version 134293 (0.0039) [2024-06-18 12:36:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2200371200. 
Throughput: 0: 42476.5. Samples: 2200490580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:36:37,003][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 12:36:37,042][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134301_2200387584.pth... [2024-06-18 12:36:37,091][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133679_2190196736.pth [2024-06-18 12:36:38,474][12883] Updated weights for policy 0, policy_version 134303 (0.0029) [2024-06-18 12:36:41,822][12883] Updated weights for policy 0, policy_version 134313 (0.0030) [2024-06-18 12:36:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 2200584192. Throughput: 0: 42513.6. Samples: 2200746300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:36:41,994][12645] Avg episode reward: [(0, '0.618')] [2024-06-18 12:36:46,357][12883] Updated weights for policy 0, policy_version 134323 (0.0021) [2024-06-18 12:36:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.0, 300 sec: 42543.2). Total num frames: 2200813568. Throughput: 0: 42387.5. Samples: 2200871280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:36:46,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 12:36:49,473][12883] Updated weights for policy 0, policy_version 134333 (0.0031) [2024-06-18 12:36:51,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2201026560. Throughput: 0: 42439.2. Samples: 2201126980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:36:51,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 12:36:53,882][12883] Updated weights for policy 0, policy_version 134343 (0.0028) [2024-06-18 12:36:56,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42543.8). Total num frames: 2201223168. Throughput: 0: 42640.5. Samples: 2201385600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:36:56,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 12:36:57,061][12883] Updated weights for policy 0, policy_version 134353 (0.0047) [2024-06-18 12:37:01,443][12883] Updated weights for policy 0, policy_version 134363 (0.0028) [2024-06-18 12:37:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2201436160. Throughput: 0: 42520.5. Samples: 2201513660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:37:01,994][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 12:37:04,779][12883] Updated weights for policy 0, policy_version 134373 (0.0033) [2024-06-18 12:37:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2201649152. Throughput: 0: 42395.5. Samples: 2201759140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:37:06,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 12:37:09,083][12883] Updated weights for policy 0, policy_version 134383 (0.0037) [2024-06-18 12:37:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2201862144. Throughput: 0: 42496.5. Samples: 2202022420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:37:11,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 12:37:12,320][12883] Updated weights for policy 0, policy_version 134393 (0.0026) [2024-06-18 12:37:16,658][12883] Updated weights for policy 0, policy_version 134403 (0.0039) [2024-06-18 12:37:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 2202058752. Throughput: 0: 42367.5. Samples: 2202144080. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:37:16,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 12:37:19,864][12883] Updated weights for policy 0, policy_version 134413 (0.0041) [2024-06-18 12:37:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.5). Total num frames: 2202304512. Throughput: 0: 42457.4. Samples: 2202401160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:37:21,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 12:37:23,053][12862] Signal inference workers to stop experience collection... (32200 times) [2024-06-18 12:37:23,081][12883] InferenceWorker_p0-w0: stopping experience collection (32200 times) [2024-06-18 12:37:23,100][12862] Signal inference workers to resume experience collection... (32200 times) [2024-06-18 12:37:23,109][12883] InferenceWorker_p0-w0: resuming experience collection (32200 times) [2024-06-18 12:37:24,238][12883] Updated weights for policy 0, policy_version 134423 (0.0036) [2024-06-18 12:37:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2202501120. Throughput: 0: 42632.2. Samples: 2202664740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:37:26,994][12645] Avg episode reward: [(0, '0.230')] [2024-06-18 12:37:27,681][12883] Updated weights for policy 0, policy_version 134433 (0.0022) [2024-06-18 12:37:31,832][12883] Updated weights for policy 0, policy_version 134443 (0.0046) [2024-06-18 12:37:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2202714112. Throughput: 0: 42496.9. Samples: 2202783640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:37:31,994][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 12:37:35,248][12883] Updated weights for policy 0, policy_version 134453 (0.0032) [2024-06-18 12:37:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2202927104. 
Throughput: 0: 42578.2. Samples: 2203043000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 12:37:36,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 12:37:39,839][12883] Updated weights for policy 0, policy_version 134463 (0.0037) [2024-06-18 12:37:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2203140096. Throughput: 0: 42550.7. Samples: 2203300380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:37:41,994][12645] Avg episode reward: [(0, '0.746')] [2024-06-18 12:37:42,812][12883] Updated weights for policy 0, policy_version 134473 (0.0037) [2024-06-18 12:37:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2203336704. Throughput: 0: 42427.0. Samples: 2203422880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:37:46,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 12:37:47,475][12883] Updated weights for policy 0, policy_version 134483 (0.0028) [2024-06-18 12:37:50,824][12883] Updated weights for policy 0, policy_version 134493 (0.0029) [2024-06-18 12:37:51,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2203582464. Throughput: 0: 42635.3. Samples: 2203677820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:37:51,996][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 12:37:55,120][12883] Updated weights for policy 0, policy_version 134503 (0.0030) [2024-06-18 12:37:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2203762688. Throughput: 0: 42573.2. Samples: 2203938220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:37:56,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 12:37:58,523][12883] Updated weights for policy 0, policy_version 134513 (0.0036) [2024-06-18 12:38:01,994][12645] Fps is (10 sec: 39330.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2203975680. 
Throughput: 0: 42550.7. Samples: 2204058860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:38:01,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 12:38:02,693][12883] Updated weights for policy 0, policy_version 134523 (0.0032) [2024-06-18 12:38:05,946][12883] Updated weights for policy 0, policy_version 134533 (0.0055) [2024-06-18 12:38:06,998][12645] Fps is (10 sec: 45856.1, 60 sec: 42868.5, 300 sec: 42597.8). Total num frames: 2204221440. Throughput: 0: 42599.5. Samples: 2204318320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:38:06,998][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 12:38:10,461][12883] Updated weights for policy 0, policy_version 134543 (0.0051) [2024-06-18 12:38:12,000][12645] Fps is (10 sec: 42571.8, 60 sec: 42320.9, 300 sec: 42430.9). Total num frames: 2204401664. Throughput: 0: 42541.1. Samples: 2204579360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:38:12,000][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 12:38:13,800][12883] Updated weights for policy 0, policy_version 134553 (0.0042) [2024-06-18 12:38:16,994][12645] Fps is (10 sec: 40977.3, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2204631040. Throughput: 0: 42541.8. Samples: 2204698020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:38:16,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 12:38:18,033][12883] Updated weights for policy 0, policy_version 134563 (0.0031) [2024-06-18 12:38:21,434][12883] Updated weights for policy 0, policy_version 134573 (0.0036) [2024-06-18 12:38:21,994][12645] Fps is (10 sec: 47543.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2204876800. Throughput: 0: 42638.3. Samples: 2204961720. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:38:21,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 12:38:25,922][12883] Updated weights for policy 0, policy_version 134583 (0.0026) [2024-06-18 12:38:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2205057024. Throughput: 0: 42524.8. Samples: 2205214000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:38:26,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 12:38:29,086][12883] Updated weights for policy 0, policy_version 134593 (0.0047) [2024-06-18 12:38:31,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2205270016. Throughput: 0: 42573.9. Samples: 2205338700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:38:31,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 12:38:33,558][12883] Updated weights for policy 0, policy_version 134603 (0.0033) [2024-06-18 12:38:35,551][12862] Signal inference workers to stop experience collection... (32250 times) [2024-06-18 12:38:35,552][12862] Signal inference workers to resume experience collection... (32250 times) [2024-06-18 12:38:35,567][12883] InferenceWorker_p0-w0: stopping experience collection (32250 times) [2024-06-18 12:38:35,567][12883] InferenceWorker_p0-w0: resuming experience collection (32250 times) [2024-06-18 12:38:36,732][12883] Updated weights for policy 0, policy_version 134613 (0.0029) [2024-06-18 12:38:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2205499392. Throughput: 0: 42795.4. Samples: 2205603520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:38:36,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 12:38:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134613_2205499392.pth... 
[2024-06-18 12:38:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000133987_2195243008.pth [2024-06-18 12:38:41,204][12883] Updated weights for policy 0, policy_version 134623 (0.0033) [2024-06-18 12:38:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2205712384. Throughput: 0: 42670.8. Samples: 2205858400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:38:41,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 12:38:44,890][12883] Updated weights for policy 0, policy_version 134633 (0.0041) [2024-06-18 12:38:47,008][12645] Fps is (10 sec: 40902.6, 60 sec: 42861.5, 300 sec: 42596.4). Total num frames: 2205908992. Throughput: 0: 42757.7. Samples: 2205983560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:38:47,008][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 12:38:48,884][12883] Updated weights for policy 0, policy_version 134643 (0.0042) [2024-06-18 12:38:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 2206121984. Throughput: 0: 42731.1. Samples: 2206241040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:38:51,995][12645] Avg episode reward: [(0, '0.613')] [2024-06-18 12:38:52,410][12883] Updated weights for policy 0, policy_version 134653 (0.0038) [2024-06-18 12:38:56,670][12883] Updated weights for policy 0, policy_version 134663 (0.0031) [2024-06-18 12:38:56,994][12645] Fps is (10 sec: 42658.9, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 2206334976. Throughput: 0: 42709.1. Samples: 2206501000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:38:56,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 12:39:00,065][12883] Updated weights for policy 0, policy_version 134673 (0.0049) [2024-06-18 12:39:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2206564352. Throughput: 0: 42814.2. Samples: 2206624660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:39:01,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 12:39:04,351][12883] Updated weights for policy 0, policy_version 134683 (0.0039) [2024-06-18 12:39:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42328.4, 300 sec: 42542.9). Total num frames: 2206760960. Throughput: 0: 42616.0. Samples: 2206879440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:39:06,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 12:39:07,774][12883] Updated weights for policy 0, policy_version 134693 (0.0031) [2024-06-18 12:39:11,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42602.9, 300 sec: 42376.3). Total num frames: 2206957568. Throughput: 0: 42713.5. Samples: 2207136100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:39:11,994][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 12:39:12,023][12883] Updated weights for policy 0, policy_version 134703 (0.0027) [2024-06-18 12:39:15,666][12883] Updated weights for policy 0, policy_version 134713 (0.0044) [2024-06-18 12:39:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2207203328. Throughput: 0: 42678.7. Samples: 2207259240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:39:16,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 12:39:19,650][12883] Updated weights for policy 0, policy_version 134723 (0.0035) [2024-06-18 12:39:21,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42052.1, 300 sec: 42542.8). Total num frames: 2207399936. Throughput: 0: 42595.9. Samples: 2207520340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:39:21,995][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 12:39:23,415][12883] Updated weights for policy 0, policy_version 134733 (0.0028) [2024-06-18 12:39:26,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2207596544. Throughput: 0: 42558.7. Samples: 2207773540. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:39:26,994][12645] Avg episode reward: [(0, '0.725')] [2024-06-18 12:39:27,498][12883] Updated weights for policy 0, policy_version 134743 (0.0022) [2024-06-18 12:39:31,030][12883] Updated weights for policy 0, policy_version 134753 (0.0043) [2024-06-18 12:39:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2207842304. Throughput: 0: 42532.8. Samples: 2207896940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:39:31,994][12645] Avg episode reward: [(0, '0.243')] [2024-06-18 12:39:35,329][12883] Updated weights for policy 0, policy_version 134763 (0.0037) [2024-06-18 12:39:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2208038912. Throughput: 0: 42586.3. Samples: 2208157420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:39:36,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 12:39:38,439][12883] Updated weights for policy 0, policy_version 134773 (0.0033) [2024-06-18 12:39:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2208235520. Throughput: 0: 42429.7. Samples: 2208410340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:39:41,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 12:39:42,888][12883] Updated weights for policy 0, policy_version 134783 (0.0045) [2024-06-18 12:39:46,461][12883] Updated weights for policy 0, policy_version 134793 (0.0029) [2024-06-18 12:39:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42881.5, 300 sec: 42709.5). Total num frames: 2208481280. Throughput: 0: 42426.7. Samples: 2208533860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:39:46,994][12645] Avg episode reward: [(0, '0.702')] [2024-06-18 12:39:50,413][12883] Updated weights for policy 0, policy_version 134803 (0.0030) [2024-06-18 12:39:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2208677888. Throughput: 0: 42487.0. Samples: 2208791360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:39:51,995][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 12:39:54,385][12883] Updated weights for policy 0, policy_version 134813 (0.0032) [2024-06-18 12:39:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42543.8). Total num frames: 2208890880. Throughput: 0: 42641.7. Samples: 2209054980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:39:56,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 12:39:57,929][12883] Updated weights for policy 0, policy_version 134823 (0.0028) [2024-06-18 12:40:01,959][12883] Updated weights for policy 0, policy_version 134833 (0.0030) [2024-06-18 12:40:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42654.3). Total num frames: 2209103872. Throughput: 0: 42710.7. Samples: 2209181220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:01,994][12645] Avg episode reward: [(0, '0.611')] [2024-06-18 12:40:05,504][12883] Updated weights for policy 0, policy_version 134843 (0.0050) [2024-06-18 12:40:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 2209316864. Throughput: 0: 42648.4. Samples: 2209439520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:06,995][12645] Avg episode reward: [(0, '0.611')] [2024-06-18 12:40:09,501][12883] Updated weights for policy 0, policy_version 134853 (0.0029) [2024-06-18 12:40:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2209546240. Throughput: 0: 42816.9. Samples: 2209700300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:11,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 12:40:13,194][12883] Updated weights for policy 0, policy_version 134863 (0.0029) [2024-06-18 12:40:15,260][12862] Signal inference workers to stop experience collection... (32300 times) [2024-06-18 12:40:15,261][12862] Signal inference workers to resume experience collection... (32300 times) [2024-06-18 12:40:15,277][12883] InferenceWorker_p0-w0: stopping experience collection (32300 times) [2024-06-18 12:40:15,277][12883] InferenceWorker_p0-w0: resuming experience collection (32300 times) [2024-06-18 12:40:16,973][12883] Updated weights for policy 0, policy_version 134873 (0.0033) [2024-06-18 12:40:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2209759232. Throughput: 0: 42857.4. Samples: 2209825520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:16,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 12:40:20,951][12883] Updated weights for policy 0, policy_version 134883 (0.0028) [2024-06-18 12:40:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2209972224. Throughput: 0: 42851.9. Samples: 2210085760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:21,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 12:40:24,673][12883] Updated weights for policy 0, policy_version 134893 (0.0033) [2024-06-18 12:40:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2210201600. Throughput: 0: 42912.9. Samples: 2210341420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:26,994][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 12:40:28,413][12883] Updated weights for policy 0, policy_version 134903 (0.0039) [2024-06-18 12:40:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2210398208. 
Throughput: 0: 43064.9. Samples: 2210471780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:31,994][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 12:40:32,176][12883] Updated weights for policy 0, policy_version 134913 (0.0037) [2024-06-18 12:40:35,870][12883] Updated weights for policy 0, policy_version 134923 (0.0037) [2024-06-18 12:40:36,997][12645] Fps is (10 sec: 40945.5, 60 sec: 42868.9, 300 sec: 42542.3). Total num frames: 2210611200. Throughput: 0: 43182.8. Samples: 2210734740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:37,006][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 12:40:37,085][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134926_2210627584.pth... [2024-06-18 12:40:37,154][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134301_2200387584.pth [2024-06-18 12:40:39,942][12883] Updated weights for policy 0, policy_version 134933 (0.0028) [2024-06-18 12:40:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42654.3). Total num frames: 2210824192. Throughput: 0: 42918.7. Samples: 2210986320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:41,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 12:40:43,439][12883] Updated weights for policy 0, policy_version 134943 (0.0035) [2024-06-18 12:40:46,994][12645] Fps is (10 sec: 40975.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2211020800. Throughput: 0: 42872.8. Samples: 2211110500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 12:40:46,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 12:40:47,714][12883] Updated weights for policy 0, policy_version 134953 (0.0031) [2024-06-18 12:40:51,061][12883] Updated weights for policy 0, policy_version 134963 (0.0031) [2024-06-18 12:40:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2211266560. Throughput: 0: 42759.7. 
Samples: 2211363700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:40:51,996][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 12:40:55,453][12883] Updated weights for policy 0, policy_version 134973 (0.0025) [2024-06-18 12:40:56,994][12645] Fps is (10 sec: 45874.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2211479552. Throughput: 0: 42710.5. Samples: 2211622280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:40:56,997][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 12:40:58,773][12883] Updated weights for policy 0, policy_version 134983 (0.0038) [2024-06-18 12:41:01,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2211659776. Throughput: 0: 42825.3. Samples: 2211752660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:01,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 12:41:03,310][12883] Updated weights for policy 0, policy_version 134993 (0.0038) [2024-06-18 12:41:06,371][12883] Updated weights for policy 0, policy_version 135003 (0.0030) [2024-06-18 12:41:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 2211905536. Throughput: 0: 42701.8. Samples: 2212007340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:06,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 12:41:10,797][12883] Updated weights for policy 0, policy_version 135013 (0.0031) [2024-06-18 12:41:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2212118528. Throughput: 0: 42777.4. Samples: 2212266400. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:11,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 12:41:13,906][12883] Updated weights for policy 0, policy_version 135023 (0.0034) [2024-06-18 12:41:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2212298752. Throughput: 0: 42728.9. 
Samples: 2212394580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:16,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 12:41:18,490][12883] Updated weights for policy 0, policy_version 135033 (0.0024) [2024-06-18 12:41:21,318][12883] Updated weights for policy 0, policy_version 135043 (0.0023) [2024-06-18 12:41:21,996][12645] Fps is (10 sec: 44226.7, 60 sec: 43142.9, 300 sec: 42709.1). Total num frames: 2212560896. Throughput: 0: 42677.7. Samples: 2212655180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:21,997][12645] Avg episode reward: [(0, '0.700')] [2024-06-18 12:41:26,194][12883] Updated weights for policy 0, policy_version 135053 (0.0047) [2024-06-18 12:41:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2212757504. Throughput: 0: 42811.1. Samples: 2212912820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:26,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 12:41:29,272][12883] Updated weights for policy 0, policy_version 135063 (0.0028) [2024-06-18 12:41:31,994][12645] Fps is (10 sec: 39330.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2212954112. Throughput: 0: 42711.6. Samples: 2213032520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:31,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 12:41:33,935][12883] Updated weights for policy 0, policy_version 135073 (0.0029) [2024-06-18 12:41:36,895][12883] Updated weights for policy 0, policy_version 135083 (0.0036) [2024-06-18 12:41:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43147.1, 300 sec: 42765.0). Total num frames: 2213199872. Throughput: 0: 42868.9. Samples: 2213292800. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:36,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 12:41:41,512][12883] Updated weights for policy 0, policy_version 135093 (0.0036) [2024-06-18 12:41:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2213380096. Throughput: 0: 42926.7. Samples: 2213553980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:41,999][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 12:41:44,445][12883] Updated weights for policy 0, policy_version 135103 (0.0036) [2024-06-18 12:41:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2213609472. Throughput: 0: 42720.1. Samples: 2213675060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 12:41:46,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 12:41:49,026][12883] Updated weights for policy 0, policy_version 135113 (0.0037) [2024-06-18 12:41:51,820][12862] Signal inference workers to stop experience collection... (32350 times) [2024-06-18 12:41:51,820][12862] Signal inference workers to resume experience collection... (32350 times) [2024-06-18 12:41:51,863][12883] InferenceWorker_p0-w0: stopping experience collection (32350 times) [2024-06-18 12:41:51,864][12883] InferenceWorker_p0-w0: resuming experience collection (32350 times) [2024-06-18 12:41:51,952][12883] Updated weights for policy 0, policy_version 135123 (0.0027) [2024-06-18 12:41:51,994][12645] Fps is (10 sec: 47514.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2213855232. Throughput: 0: 43004.1. Samples: 2213942520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:41:51,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 12:41:56,994][12645] Fps is (10 sec: 39320.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2214002688. Throughput: 0: 42884.2. Samples: 2214196200. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:41:56,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 12:41:57,161][12883] Updated weights for policy 0, policy_version 135133 (0.0037) [2024-06-18 12:41:59,590][12883] Updated weights for policy 0, policy_version 135143 (0.0034) [2024-06-18 12:42:01,994][12645] Fps is (10 sec: 39321.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2214248448. Throughput: 0: 42673.3. Samples: 2214314880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:01,994][12645] Avg episode reward: [(0, '0.135')] [2024-06-18 12:42:04,600][12883] Updated weights for policy 0, policy_version 135153 (0.0036) [2024-06-18 12:42:06,994][12645] Fps is (10 sec: 45876.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2214461440. Throughput: 0: 42796.0. Samples: 2214580900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:06,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 12:42:07,423][12883] Updated weights for policy 0, policy_version 135163 (0.0028) [2024-06-18 12:42:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2214658048. Throughput: 0: 42854.7. Samples: 2214841280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:11,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 12:42:12,170][12883] Updated weights for policy 0, policy_version 135173 (0.0033) [2024-06-18 12:42:15,123][12883] Updated weights for policy 0, policy_version 135183 (0.0037) [2024-06-18 12:42:16,996][12645] Fps is (10 sec: 42588.4, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 2214887424. Throughput: 0: 42956.9. Samples: 2214965680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:16,997][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 12:42:19,762][12883] Updated weights for policy 0, policy_version 135193 (0.0031) [2024-06-18 12:42:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42599.9, 300 sec: 42765.0). Total num frames: 2215116800. Throughput: 0: 42952.4. Samples: 2215225660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:21,994][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 12:42:22,648][12883] Updated weights for policy 0, policy_version 135203 (0.0026) [2024-06-18 12:42:26,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2215297024. Throughput: 0: 42981.3. Samples: 2215488140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:26,994][12645] Avg episode reward: [(0, '0.843')] [2024-06-18 12:42:27,315][12883] Updated weights for policy 0, policy_version 135213 (0.0035) [2024-06-18 12:42:30,489][12883] Updated weights for policy 0, policy_version 135223 (0.0043) [2024-06-18 12:42:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2215542784. Throughput: 0: 42918.5. Samples: 2215606400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:31,994][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 12:42:35,225][12883] Updated weights for policy 0, policy_version 135233 (0.0028) [2024-06-18 12:42:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2215739392. Throughput: 0: 42779.9. Samples: 2215867620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:36,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 12:42:37,138][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135239_2215755776.pth... 
[2024-06-18 12:42:37,183][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134613_2205499392.pth [2024-06-18 12:42:38,281][12883] Updated weights for policy 0, policy_version 135243 (0.0030) [2024-06-18 12:42:41,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2215936000. Throughput: 0: 42784.2. Samples: 2216121480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:41,994][12645] Avg episode reward: [(0, '0.719')] [2024-06-18 12:42:42,785][12883] Updated weights for policy 0, policy_version 135253 (0.0045) [2024-06-18 12:42:46,026][12883] Updated weights for policy 0, policy_version 135263 (0.0028) [2024-06-18 12:42:46,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 2216198144. Throughput: 0: 42922.3. Samples: 2216246380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:46,994][12645] Avg episode reward: [(0, '0.616')] [2024-06-18 12:42:50,459][12883] Updated weights for policy 0, policy_version 135273 (0.0037) [2024-06-18 12:42:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2216378368. Throughput: 0: 42756.5. Samples: 2216504940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 12:42:52,002][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 12:42:53,690][12883] Updated weights for policy 0, policy_version 135283 (0.0037) [2024-06-18 12:42:56,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 2216574976. Throughput: 0: 42640.5. Samples: 2216760100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:42:56,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 12:42:58,134][12883] Updated weights for policy 0, policy_version 135293 (0.0034) [2024-06-18 12:43:01,120][12862] Signal inference workers to stop experience collection... 
(32400 times) [2024-06-18 12:43:01,120][12862] Signal inference workers to resume experience collection... (32400 times) [2024-06-18 12:43:01,137][12883] InferenceWorker_p0-w0: stopping experience collection (32400 times) [2024-06-18 12:43:01,167][12883] InferenceWorker_p0-w0: resuming experience collection (32400 times) [2024-06-18 12:43:01,258][12883] Updated weights for policy 0, policy_version 135303 (0.0028) [2024-06-18 12:43:01,994][12645] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 42765.6). Total num frames: 2216837120. Throughput: 0: 42758.5. Samples: 2216889720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:01,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 12:43:05,755][12883] Updated weights for policy 0, policy_version 135313 (0.0032) [2024-06-18 12:43:06,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.2, 300 sec: 42765.9). Total num frames: 2217017344. Throughput: 0: 42686.6. Samples: 2217146560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:06,995][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 12:43:08,910][12883] Updated weights for policy 0, policy_version 135323 (0.0040) [2024-06-18 12:43:11,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2217230336. Throughput: 0: 42441.4. Samples: 2217398000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:11,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 12:43:13,575][12883] Updated weights for policy 0, policy_version 135333 (0.0034) [2024-06-18 12:43:16,602][12883] Updated weights for policy 0, policy_version 135343 (0.0030) [2024-06-18 12:43:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 2217476096. Throughput: 0: 42697.4. Samples: 2217527780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:16,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 12:43:21,059][12883] Updated weights for policy 0, policy_version 135353 (0.0033) [2024-06-18 12:43:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2217656320. Throughput: 0: 42662.2. Samples: 2217787420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:21,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 12:43:24,263][12883] Updated weights for policy 0, policy_version 135363 (0.0042) [2024-06-18 12:43:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2217885696. Throughput: 0: 42451.4. Samples: 2218031800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:26,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 12:43:29,421][12883] Updated weights for policy 0, policy_version 135373 (0.0033) [2024-06-18 12:43:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 2218082304. Throughput: 0: 42713.4. Samples: 2218168480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:31,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 12:43:32,151][12883] Updated weights for policy 0, policy_version 135383 (0.0045) [2024-06-18 12:43:36,994][12645] Fps is (10 sec: 37681.4, 60 sec: 42051.8, 300 sec: 42542.8). Total num frames: 2218262528. Throughput: 0: 42478.9. Samples: 2218416520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:36,995][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 12:43:37,071][12883] Updated weights for policy 0, policy_version 135393 (0.0038) [2024-06-18 12:43:39,957][12883] Updated weights for policy 0, policy_version 135403 (0.0032) [2024-06-18 12:43:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42767.1). Total num frames: 2218524672. Throughput: 0: 42362.6. Samples: 2218666420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:41,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 12:43:44,777][12883] Updated weights for policy 0, policy_version 135413 (0.0036) [2024-06-18 12:43:46,994][12645] Fps is (10 sec: 45877.8, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2218721280. Throughput: 0: 42505.3. Samples: 2218802460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:46,994][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 12:43:47,685][12883] Updated weights for policy 0, policy_version 135423 (0.0040) [2024-06-18 12:43:51,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 2218901504. Throughput: 0: 42286.7. Samples: 2219049460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:43:51,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 12:43:52,471][12883] Updated weights for policy 0, policy_version 135433 (0.0043) [2024-06-18 12:43:55,381][12883] Updated weights for policy 0, policy_version 135443 (0.0036) [2024-06-18 12:43:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2219163648. Throughput: 0: 42234.2. Samples: 2219298540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:43:56,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 12:44:00,142][12883] Updated weights for policy 0, policy_version 135453 (0.0034) [2024-06-18 12:44:01,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2219360256. Throughput: 0: 42406.3. Samples: 2219436060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:01,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 12:44:03,014][12883] Updated weights for policy 0, policy_version 135463 (0.0041) [2024-06-18 12:44:06,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2219540480. Throughput: 0: 42147.4. Samples: 2219684060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:06,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 12:44:08,131][12883] Updated weights for policy 0, policy_version 135473 (0.0035) [2024-06-18 12:44:10,790][12883] Updated weights for policy 0, policy_version 135483 (0.0027) [2024-06-18 12:44:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2219802624. Throughput: 0: 42216.2. Samples: 2219931520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:11,994][12645] Avg episode reward: [(0, '0.214')] [2024-06-18 12:44:15,813][12883] Updated weights for policy 0, policy_version 135493 (0.0051) [2024-06-18 12:44:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 2219982848. Throughput: 0: 42143.0. Samples: 2220064920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:16,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 12:44:18,388][12883] Updated weights for policy 0, policy_version 135503 (0.0032) [2024-06-18 12:44:21,996][12645] Fps is (10 sec: 39312.8, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 2220195840. Throughput: 0: 42138.0. Samples: 2220312800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:21,996][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 12:44:23,486][12883] Updated weights for policy 0, policy_version 135513 (0.0028) [2024-06-18 12:44:24,445][12862] Signal inference workers to stop experience collection... (32450 times) [2024-06-18 12:44:24,497][12862] Signal inference workers to resume experience collection... 
(32450 times) [2024-06-18 12:44:24,498][12883] InferenceWorker_p0-w0: stopping experience collection (32450 times) [2024-06-18 12:44:24,513][12883] InferenceWorker_p0-w0: resuming experience collection (32450 times) [2024-06-18 12:44:26,056][12883] Updated weights for policy 0, policy_version 135523 (0.0043) [2024-06-18 12:44:26,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2220441600. Throughput: 0: 42248.0. Samples: 2220567580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:26,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 12:44:31,094][12883] Updated weights for policy 0, policy_version 135533 (0.0036) [2024-06-18 12:44:31,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2220621824. Throughput: 0: 42240.5. Samples: 2220703280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:31,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 12:44:34,156][12883] Updated weights for policy 0, policy_version 135543 (0.0033) [2024-06-18 12:44:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.8, 300 sec: 42653.9). Total num frames: 2220818432. Throughput: 0: 42274.7. Samples: 2220951820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:36,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 12:44:37,026][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135549_2220834816.pth... [2024-06-18 12:44:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000134926_2210627584.pth [2024-06-18 12:44:38,797][12883] Updated weights for policy 0, policy_version 135553 (0.0038) [2024-06-18 12:44:41,743][12883] Updated weights for policy 0, policy_version 135563 (0.0033) [2024-06-18 12:44:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2221080576. Throughput: 0: 42384.5. Samples: 2221205840. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:41,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 12:44:46,413][12883] Updated weights for policy 0, policy_version 135573 (0.0043) [2024-06-18 12:44:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2221260800. Throughput: 0: 42371.4. Samples: 2221342780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:46,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 12:44:49,294][12883] Updated weights for policy 0, policy_version 135583 (0.0034) [2024-06-18 12:44:51,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2221473792. Throughput: 0: 42405.7. Samples: 2221592320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:51,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 12:44:54,003][12883] Updated weights for policy 0, policy_version 135593 (0.0043) [2024-06-18 12:44:56,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42323.8, 300 sec: 42709.1). Total num frames: 2221703168. Throughput: 0: 42568.1. Samples: 2221847180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 12:44:56,996][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 12:44:57,128][12883] Updated weights for policy 0, policy_version 135603 (0.0028) [2024-06-18 12:45:01,537][12883] Updated weights for policy 0, policy_version 135613 (0.0042) [2024-06-18 12:45:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2221899776. Throughput: 0: 42528.0. Samples: 2221978680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:01,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 12:45:04,564][12883] Updated weights for policy 0, policy_version 135623 (0.0038) [2024-06-18 12:45:07,000][12645] Fps is (10 sec: 39305.6, 60 sec: 42594.0, 300 sec: 42541.9). Total num frames: 2222096384. Throughput: 0: 42556.6. Samples: 2222228020. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:07,001][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 12:45:09,542][12883] Updated weights for policy 0, policy_version 135633 (0.0029) [2024-06-18 12:45:11,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2222358528. Throughput: 0: 42639.7. Samples: 2222486360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:11,994][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 12:45:12,135][12883] Updated weights for policy 0, policy_version 135643 (0.0027) [2024-06-18 12:45:16,994][12645] Fps is (10 sec: 42625.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2222522368. Throughput: 0: 42636.5. Samples: 2222621920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:16,994][12645] Avg episode reward: [(0, '0.661')] [2024-06-18 12:45:17,009][12883] Updated weights for policy 0, policy_version 135653 (0.0028) [2024-06-18 12:45:19,737][12883] Updated weights for policy 0, policy_version 135663 (0.0026) [2024-06-18 12:45:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 2222751744. Throughput: 0: 42642.2. Samples: 2222870720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:21,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 12:45:24,737][12883] Updated weights for policy 0, policy_version 135673 (0.0036) [2024-06-18 12:45:26,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2222997504. Throughput: 0: 42729.7. Samples: 2223128680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:27,000][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 12:45:27,384][12883] Updated weights for policy 0, policy_version 135683 (0.0035) [2024-06-18 12:45:27,986][12862] Signal inference workers to stop experience collection... 
(32500 times) [2024-06-18 12:45:27,986][12862] Signal inference workers to resume experience collection... (32500 times) [2024-06-18 12:45:27,997][12883] InferenceWorker_p0-w0: stopping experience collection (32500 times) [2024-06-18 12:45:28,013][12883] InferenceWorker_p0-w0: resuming experience collection (32500 times) [2024-06-18 12:45:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42598.9). Total num frames: 2223177728. Throughput: 0: 42697.4. Samples: 2223264160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:31,994][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 12:45:32,177][12883] Updated weights for policy 0, policy_version 135693 (0.0035) [2024-06-18 12:45:35,067][12883] Updated weights for policy 0, policy_version 135703 (0.0029) [2024-06-18 12:45:36,996][12645] Fps is (10 sec: 40951.1, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 2223407104. Throughput: 0: 42618.9. Samples: 2223510260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:36,996][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 12:45:39,665][12883] Updated weights for policy 0, policy_version 135713 (0.0034) [2024-06-18 12:45:41,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2223620096. Throughput: 0: 42888.4. Samples: 2223777060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:41,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 12:45:42,893][12883] Updated weights for policy 0, policy_version 135723 (0.0033) [2024-06-18 12:45:46,994][12645] Fps is (10 sec: 40966.5, 60 sec: 42598.0, 300 sec: 42542.8). Total num frames: 2223816704. Throughput: 0: 42679.9. Samples: 2223899300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:46,995][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 12:45:47,418][12883] Updated weights for policy 0, policy_version 135733 (0.0027) [2024-06-18 12:45:50,483][12883] Updated weights for policy 0, policy_version 135743 (0.0033) [2024-06-18 12:45:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2224046080. Throughput: 0: 42711.3. Samples: 2224149760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:51,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 12:45:55,021][12883] Updated weights for policy 0, policy_version 135753 (0.0040) [2024-06-18 12:45:56,994][12645] Fps is (10 sec: 44239.2, 60 sec: 42599.9, 300 sec: 42709.5). Total num frames: 2224259072. Throughput: 0: 42902.0. Samples: 2224416960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 23.0) [2024-06-18 12:45:56,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 12:45:58,115][12883] Updated weights for policy 0, policy_version 135763 (0.0040) [2024-06-18 12:46:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2224455680. Throughput: 0: 42658.6. Samples: 2224541560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:01,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 12:46:02,846][12883] Updated weights for policy 0, policy_version 135773 (0.0032) [2024-06-18 12:46:05,665][12883] Updated weights for policy 0, policy_version 135783 (0.0027) [2024-06-18 12:46:06,996][12645] Fps is (10 sec: 44227.6, 60 sec: 43420.6, 300 sec: 42653.6). Total num frames: 2224701440. Throughput: 0: 42792.1. Samples: 2224796460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:06,997][12645] Avg episode reward: [(0, '0.217')] [2024-06-18 12:46:10,472][12883] Updated weights for policy 0, policy_version 135793 (0.0035) [2024-06-18 12:46:11,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2224881664. Throughput: 0: 42945.0. Samples: 2225061200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:11,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 12:46:13,234][12883] Updated weights for policy 0, policy_version 135803 (0.0032) [2024-06-18 12:46:16,994][12645] Fps is (10 sec: 37691.5, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 2225078272. Throughput: 0: 42714.3. Samples: 2225186300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:16,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 12:46:17,890][12883] Updated weights for policy 0, policy_version 135813 (0.0028) [2024-06-18 12:46:20,879][12883] Updated weights for policy 0, policy_version 135823 (0.0030) [2024-06-18 12:46:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2225340416. Throughput: 0: 42929.8. Samples: 2225442000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:21,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 12:46:25,744][12883] Updated weights for policy 0, policy_version 135833 (0.0046) [2024-06-18 12:46:26,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2225537024. Throughput: 0: 42766.6. Samples: 2225701560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:26,994][12645] Avg episode reward: [(0, '0.624')] [2024-06-18 12:46:29,121][12883] Updated weights for policy 0, policy_version 135843 (0.0036) [2024-06-18 12:46:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2225733632. Throughput: 0: 42837.1. Samples: 2225826940. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:31,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 12:46:33,181][12883] Updated weights for policy 0, policy_version 135853 (0.0025) [2024-06-18 12:46:36,625][12883] Updated weights for policy 0, policy_version 135863 (0.0034) [2024-06-18 12:46:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 2225995776. Throughput: 0: 43030.2. Samples: 2226086120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:36,995][12645] Avg episode reward: [(0, '0.686')] [2024-06-18 12:46:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135864_2225995776.pth... [2024-06-18 12:46:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135239_2215755776.pth [2024-06-18 12:46:40,959][12883] Updated weights for policy 0, policy_version 135873 (0.0037) [2024-06-18 12:46:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2226192384. Throughput: 0: 42807.6. Samples: 2226343300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:41,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 12:46:44,208][12883] Updated weights for policy 0, policy_version 135883 (0.0036) [2024-06-18 12:46:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.9, 300 sec: 42487.3). Total num frames: 2226388992. Throughput: 0: 42834.6. Samples: 2226469120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:46,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 12:46:48,783][12883] Updated weights for policy 0, policy_version 135893 (0.0039) [2024-06-18 12:46:51,948][12883] Updated weights for policy 0, policy_version 135903 (0.0037) [2024-06-18 12:46:51,994][12645] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2226634752. Throughput: 0: 42892.0. Samples: 2226726500. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:51,994][12645] Avg episode reward: [(0, '0.680')] [2024-06-18 12:46:56,349][12883] Updated weights for policy 0, policy_version 135913 (0.0024) [2024-06-18 12:46:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2226831360. Throughput: 0: 42758.7. Samples: 2226985340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:46:56,994][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 12:46:58,201][12862] Signal inference workers to stop experience collection... (32550 times) [2024-06-18 12:46:58,202][12862] Signal inference workers to resume experience collection... (32550 times) [2024-06-18 12:46:58,249][12883] InferenceWorker_p0-w0: stopping experience collection (32550 times) [2024-06-18 12:46:58,249][12883] InferenceWorker_p0-w0: resuming experience collection (32550 times) [2024-06-18 12:46:59,498][12883] Updated weights for policy 0, policy_version 135923 (0.0031) [2024-06-18 12:47:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2227044352. Throughput: 0: 42698.3. Samples: 2227107720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 12:47:01,994][12645] Avg episode reward: [(0, '0.661')] [2024-06-18 12:47:04,162][12883] Updated weights for policy 0, policy_version 135933 (0.0038) [2024-06-18 12:47:06,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 2227273728. Throughput: 0: 42784.0. Samples: 2227367280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:06,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 12:47:07,565][12883] Updated weights for policy 0, policy_version 135943 (0.0026) [2024-06-18 12:47:11,721][12883] Updated weights for policy 0, policy_version 135953 (0.0044) [2024-06-18 12:47:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42654.3). Total num frames: 2227470336. 
Throughput: 0: 42811.9. Samples: 2227628100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:11,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 12:47:15,329][12883] Updated weights for policy 0, policy_version 135963 (0.0032) [2024-06-18 12:47:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 2227683328. Throughput: 0: 42820.9. Samples: 2227753880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:16,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 12:47:19,281][12883] Updated weights for policy 0, policy_version 135973 (0.0042) [2024-06-18 12:47:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2227912704. Throughput: 0: 42730.2. Samples: 2228008980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:21,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 12:47:22,941][12883] Updated weights for policy 0, policy_version 135983 (0.0045) [2024-06-18 12:47:26,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42596.8, 300 sec: 42542.6). Total num frames: 2228092928. Throughput: 0: 42729.9. Samples: 2228266240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:26,997][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 12:47:27,287][12883] Updated weights for policy 0, policy_version 135993 (0.0028) [2024-06-18 12:47:30,631][12883] Updated weights for policy 0, policy_version 136003 (0.0037) [2024-06-18 12:47:31,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 2228305920. Throughput: 0: 42513.9. Samples: 2228382340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:31,997][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 12:47:34,874][12883] Updated weights for policy 0, policy_version 136013 (0.0035) [2024-06-18 12:47:36,994][12645] Fps is (10 sec: 47524.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2228568064. 
Throughput: 0: 42588.3. Samples: 2228642980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:36,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 12:47:38,127][12883] Updated weights for policy 0, policy_version 136023 (0.0039) [2024-06-18 12:47:41,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2228731904. Throughput: 0: 42744.1. Samples: 2228908820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:41,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 12:47:42,431][12883] Updated weights for policy 0, policy_version 136033 (0.0032) [2024-06-18 12:47:46,081][12883] Updated weights for policy 0, policy_version 136043 (0.0032) [2024-06-18 12:47:46,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2228944896. Throughput: 0: 42578.2. Samples: 2229023740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:46,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 12:47:50,002][12883] Updated weights for policy 0, policy_version 136053 (0.0032) [2024-06-18 12:47:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2229190656. Throughput: 0: 42566.1. Samples: 2229282760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:51,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 12:47:53,695][12883] Updated weights for policy 0, policy_version 136063 (0.0024) [2024-06-18 12:47:56,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 2229370880. Throughput: 0: 42663.3. Samples: 2229548040. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:47:56,997][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 12:47:57,621][12883] Updated weights for policy 0, policy_version 136073 (0.0055) [2024-06-18 12:48:01,507][12883] Updated weights for policy 0, policy_version 136083 (0.0028) [2024-06-18 12:48:01,993][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2229583872. Throughput: 0: 42497.0. Samples: 2229666240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 12:48:01,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 12:48:05,555][12883] Updated weights for policy 0, policy_version 136093 (0.0033) [2024-06-18 12:48:06,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2229829632. Throughput: 0: 42605.4. Samples: 2229926220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:06,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 12:48:09,376][12883] Updated weights for policy 0, policy_version 136103 (0.0027) [2024-06-18 12:48:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2230009856. Throughput: 0: 42572.8. Samples: 2230181920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:11,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 12:48:13,229][12883] Updated weights for policy 0, policy_version 136113 (0.0049) [2024-06-18 12:48:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2230222848. Throughput: 0: 42690.1. Samples: 2230303300. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:16,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 12:48:17,006][12883] Updated weights for policy 0, policy_version 136123 (0.0043) [2024-06-18 12:48:20,788][12883] Updated weights for policy 0, policy_version 136133 (0.0028) [2024-06-18 12:48:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2230452224. Throughput: 0: 42635.1. Samples: 2230561560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:21,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 12:48:25,144][12883] Updated weights for policy 0, policy_version 136143 (0.0033) [2024-06-18 12:48:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42599.9, 300 sec: 42598.4). Total num frames: 2230648832. Throughput: 0: 42559.8. Samples: 2230824020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:26,994][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 12:48:28,533][12862] Signal inference workers to stop experience collection... (32600 times) [2024-06-18 12:48:28,533][12862] Signal inference workers to resume experience collection... (32600 times) [2024-06-18 12:48:28,534][12883] Updated weights for policy 0, policy_version 136153 (0.0030) [2024-06-18 12:48:28,588][12883] InferenceWorker_p0-w0: stopping experience collection (32600 times) [2024-06-18 12:48:28,589][12883] InferenceWorker_p0-w0: resuming experience collection (32600 times) [2024-06-18 12:48:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42599.9, 300 sec: 42709.6). Total num frames: 2230861824. Throughput: 0: 42668.7. Samples: 2230943840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:31,994][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 12:48:32,635][12883] Updated weights for policy 0, policy_version 136163 (0.0032) [2024-06-18 12:48:36,069][12883] Updated weights for policy 0, policy_version 136173 (0.0030) [2024-06-18 12:48:36,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2231107584. Throughput: 0: 42754.7. Samples: 2231206720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:36,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 12:48:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136176_2231107584.pth... [2024-06-18 12:48:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135549_2220834816.pth [2024-06-18 12:48:40,176][12883] Updated weights for policy 0, policy_version 136183 (0.0023) [2024-06-18 12:48:41,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2231271424. Throughput: 0: 42670.2. Samples: 2231468100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:41,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 12:48:43,640][12883] Updated weights for policy 0, policy_version 136193 (0.0041) [2024-06-18 12:48:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2231500800. Throughput: 0: 42552.7. Samples: 2231581120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:46,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 12:48:48,022][12883] Updated weights for policy 0, policy_version 136203 (0.0026) [2024-06-18 12:48:51,377][12883] Updated weights for policy 0, policy_version 136213 (0.0038) [2024-06-18 12:48:51,994][12645] Fps is (10 sec: 47512.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2231746560. Throughput: 0: 42542.6. Samples: 2231840640. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:51,996][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 12:48:56,100][12883] Updated weights for policy 0, policy_version 136223 (0.0029) [2024-06-18 12:48:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.9, 300 sec: 42542.8). Total num frames: 2231910400. Throughput: 0: 42507.9. Samples: 2232094780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:48:56,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 12:48:59,002][12883] Updated weights for policy 0, policy_version 136233 (0.0031) [2024-06-18 12:49:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2232156160. Throughput: 0: 42439.2. Samples: 2232213060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:49:01,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 12:49:03,823][12883] Updated weights for policy 0, policy_version 136243 (0.0027) [2024-06-18 12:49:06,586][12883] Updated weights for policy 0, policy_version 136253 (0.0045) [2024-06-18 12:49:06,996][12645] Fps is (10 sec: 45865.3, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 2232369152. Throughput: 0: 42569.5. Samples: 2232477280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 12:49:06,997][12645] Avg episode reward: [(0, '0.633')] [2024-06-18 12:49:11,519][12883] Updated weights for policy 0, policy_version 136263 (0.0041) [2024-06-18 12:49:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2232532992. Throughput: 0: 42375.3. Samples: 2232730900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:11,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 12:49:14,192][12883] Updated weights for policy 0, policy_version 136273 (0.0031) [2024-06-18 12:49:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2232795136. Throughput: 0: 42394.8. Samples: 2232851600. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:16,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 12:49:19,326][12883] Updated weights for policy 0, policy_version 136283 (0.0032) [2024-06-18 12:49:21,830][12883] Updated weights for policy 0, policy_version 136293 (0.0031) [2024-06-18 12:49:21,994][12645] Fps is (10 sec: 49151.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2233024512. Throughput: 0: 42441.3. Samples: 2233116580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:21,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 12:49:26,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2233171968. Throughput: 0: 42403.1. Samples: 2233376240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:26,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 12:49:27,019][12883] Updated weights for policy 0, policy_version 136303 (0.0037) [2024-06-18 12:49:29,050][12862] Signal inference workers to stop experience collection... (32650 times) [2024-06-18 12:49:29,096][12883] InferenceWorker_p0-w0: stopping experience collection (32650 times) [2024-06-18 12:49:29,097][12862] Signal inference workers to resume experience collection... (32650 times) [2024-06-18 12:49:29,115][12883] InferenceWorker_p0-w0: resuming experience collection (32650 times) [2024-06-18 12:49:29,703][12883] Updated weights for policy 0, policy_version 136313 (0.0039) [2024-06-18 12:49:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2233450496. Throughput: 0: 42526.7. Samples: 2233494820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:31,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 12:49:34,761][12883] Updated weights for policy 0, policy_version 136323 (0.0039) [2024-06-18 12:49:36,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2233647104. 
Throughput: 0: 42627.7. Samples: 2233758880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:36,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 12:49:37,256][12883] Updated weights for policy 0, policy_version 136333 (0.0043) [2024-06-18 12:49:41,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2233827328. Throughput: 0: 42613.7. Samples: 2234012400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:41,994][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 12:49:42,414][12883] Updated weights for policy 0, policy_version 136343 (0.0046) [2024-06-18 12:49:45,090][12883] Updated weights for policy 0, policy_version 136353 (0.0028) [2024-06-18 12:49:46,996][12645] Fps is (10 sec: 44226.7, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 2234089472. Throughput: 0: 42643.2. Samples: 2234132100. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:46,997][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 12:49:49,940][12883] Updated weights for policy 0, policy_version 136363 (0.0022) [2024-06-18 12:49:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 42543.2). Total num frames: 2234253312. Throughput: 0: 42509.7. Samples: 2234390120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:51,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 12:49:52,841][12883] Updated weights for policy 0, policy_version 136373 (0.0049) [2024-06-18 12:49:56,994][12645] Fps is (10 sec: 37691.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2234466304. Throughput: 0: 42552.5. Samples: 2234645760. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:49:56,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 12:49:57,487][12883] Updated weights for policy 0, policy_version 136383 (0.0036) [2024-06-18 12:50:00,798][12883] Updated weights for policy 0, policy_version 136393 (0.0034) [2024-06-18 12:50:01,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 2234728448. Throughput: 0: 42750.5. Samples: 2234775380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:50:01,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 12:50:04,965][12883] Updated weights for policy 0, policy_version 136403 (0.0026) [2024-06-18 12:50:06,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42326.8, 300 sec: 42542.8). Total num frames: 2234908672. Throughput: 0: 42545.7. Samples: 2235031140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:50:06,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 12:50:08,505][12883] Updated weights for policy 0, policy_version 136413 (0.0031) [2024-06-18 12:50:11,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2235105280. Throughput: 0: 42550.2. Samples: 2235291000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 12:50:11,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 12:50:12,599][12883] Updated weights for policy 0, policy_version 136423 (0.0030) [2024-06-18 12:50:16,099][12883] Updated weights for policy 0, policy_version 136433 (0.0030) [2024-06-18 12:50:16,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2235367424. Throughput: 0: 42746.6. Samples: 2235418420. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:50:16,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 12:50:20,520][12883] Updated weights for policy 0, policy_version 136443 (0.0028) [2024-06-18 12:50:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 2235531264. Throughput: 0: 42499.1. Samples: 2235671340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:50:21,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 12:50:23,698][12883] Updated weights for policy 0, policy_version 136453 (0.0037) [2024-06-18 12:50:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 2235760640. Throughput: 0: 42494.4. Samples: 2235924640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:50:26,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 12:50:28,141][12883] Updated weights for policy 0, policy_version 136463 (0.0035) [2024-06-18 12:50:31,372][12883] Updated weights for policy 0, policy_version 136473 (0.0035) [2024-06-18 12:50:31,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42709.8). Total num frames: 2236006400. Throughput: 0: 42801.7. Samples: 2236058080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:50:31,994][12645] Avg episode reward: [(0, '0.586')] [2024-06-18 12:50:35,881][12883] Updated weights for policy 0, policy_version 136483 (0.0039) [2024-06-18 12:50:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2236186624. Throughput: 0: 42884.9. Samples: 2236319940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:50:36,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 12:50:37,064][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136487_2236203008.pth... 
[2024-06-18 12:50:37,122][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000135864_2225995776.pth [2024-06-18 12:50:37,866][12862] Signal inference workers to stop experience collection... (32700 times) [2024-06-18 12:50:37,916][12883] InferenceWorker_p0-w0: stopping experience collection (32700 times) [2024-06-18 12:50:37,924][12862] Signal inference workers to resume experience collection... (32700 times) [2024-06-18 12:50:37,928][12883] InferenceWorker_p0-w0: resuming experience collection (32700 times) [2024-06-18 12:50:39,088][12883] Updated weights for policy 0, policy_version 136493 (0.0029) [2024-06-18 12:50:41,995][12645] Fps is (10 sec: 40953.5, 60 sec: 43143.5, 300 sec: 42709.4). Total num frames: 2236416000. Throughput: 0: 42688.3. Samples: 2236566800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:50:41,996][12645] Avg episode reward: [(0, '0.232')] [2024-06-18 12:50:43,516][12883] Updated weights for policy 0, policy_version 136503 (0.0037) [2024-06-18 12:50:46,872][12883] Updated weights for policy 0, policy_version 136513 (0.0030) [2024-06-18 12:50:46,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42326.9, 300 sec: 42653.9). Total num frames: 2236628992. Throughput: 0: 42677.9. Samples: 2236695880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:50:46,994][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 12:50:51,092][12883] Updated weights for policy 0, policy_version 136523 (0.0028) [2024-06-18 12:50:51,994][12645] Fps is (10 sec: 40966.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2236825600. Throughput: 0: 42767.2. Samples: 2236955660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:50:51,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 12:50:54,500][12883] Updated weights for policy 0, policy_version 136533 (0.0032) [2024-06-18 12:50:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2237054976. 
Throughput: 0: 42599.1. Samples: 2237207960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:50:56,994][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 12:50:58,547][12883] Updated weights for policy 0, policy_version 136543 (0.0029) [2024-06-18 12:51:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42598.7). Total num frames: 2237267968. Throughput: 0: 42775.7. Samples: 2237343320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:51:01,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 12:51:02,099][12883] Updated weights for policy 0, policy_version 136553 (0.0030) [2024-06-18 12:51:05,963][12883] Updated weights for policy 0, policy_version 136563 (0.0041) [2024-06-18 12:51:06,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 2237480960. Throughput: 0: 42890.2. Samples: 2237601500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:51:06,996][12645] Avg episode reward: [(0, '0.237')] [2024-06-18 12:51:09,691][12883] Updated weights for policy 0, policy_version 136573 (0.0031) [2024-06-18 12:51:12,000][12645] Fps is (10 sec: 42571.6, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 2237693952. Throughput: 0: 42851.8. Samples: 2237853240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 12:51:12,000][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 12:51:13,463][12883] Updated weights for policy 0, policy_version 136583 (0.0034) [2024-06-18 12:51:16,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2237906944. Throughput: 0: 42733.7. Samples: 2237981100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:51:16,994][12645] Avg episode reward: [(0, '0.658')] [2024-06-18 12:51:17,355][12883] Updated weights for policy 0, policy_version 136593 (0.0034) [2024-06-18 12:51:21,097][12883] Updated weights for policy 0, policy_version 136603 (0.0034) [2024-06-18 12:51:21,994][12645] Fps is (10 sec: 42625.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2238119936. Throughput: 0: 42736.5. Samples: 2238243080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:51:21,994][12645] Avg episode reward: [(0, '0.724')] [2024-06-18 12:51:24,896][12883] Updated weights for policy 0, policy_version 136613 (0.0047) [2024-06-18 12:51:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2238316544. Throughput: 0: 42873.9. Samples: 2238496060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:51:26,999][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 12:51:28,737][12883] Updated weights for policy 0, policy_version 136623 (0.0042) [2024-06-18 12:51:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2238562304. Throughput: 0: 42828.5. Samples: 2238623160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:51:31,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 12:51:32,699][12883] Updated weights for policy 0, policy_version 136633 (0.0028) [2024-06-18 12:51:36,734][12883] Updated weights for policy 0, policy_version 136643 (0.0035) [2024-06-18 12:51:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 2238775296. Throughput: 0: 42779.1. Samples: 2238880720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:51:36,994][12645] Avg episode reward: [(0, '0.688')] [2024-06-18 12:51:40,243][12883] Updated weights for policy 0, policy_version 136653 (0.0035) [2024-06-18 12:51:41,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42599.4, 300 sec: 42653.9). Total num frames: 2238971904. Throughput: 0: 42745.2. Samples: 2239131500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:51:41,994][12645] Avg episode reward: [(0, '0.633')] [2024-06-18 12:51:44,421][12883] Updated weights for policy 0, policy_version 136663 (0.0034) [2024-06-18 12:51:47,000][12645] Fps is (10 sec: 42571.6, 60 sec: 42867.0, 300 sec: 42597.5). Total num frames: 2239201280. Throughput: 0: 42673.5. Samples: 2239263900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:51:47,000][12645] Avg episode reward: [(0, '0.700')] [2024-06-18 12:51:47,890][12883] Updated weights for policy 0, policy_version 136673 (0.0030) [2024-06-18 12:51:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2239381504. Throughput: 0: 42691.0. Samples: 2239522500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:51:51,994][12645] Avg episode reward: [(0, '0.769')] [2024-06-18 12:51:52,346][12883] Updated weights for policy 0, policy_version 136683 (0.0035) [2024-06-18 12:51:55,514][12883] Updated weights for policy 0, policy_version 136693 (0.0039) [2024-06-18 12:51:56,994][12645] Fps is (10 sec: 40985.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2239610880. Throughput: 0: 42742.8. Samples: 2239776400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:51:56,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 12:51:59,785][12883] Updated weights for policy 0, policy_version 136703 (0.0039) [2024-06-18 12:52:01,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2239856640. Throughput: 0: 42934.8. Samples: 2239913160. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:52:01,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 12:52:03,212][12883] Updated weights for policy 0, policy_version 136713 (0.0053) [2024-06-18 12:52:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2240036864. Throughput: 0: 42690.2. Samples: 2240164140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:52:06,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 12:52:07,322][12883] Updated weights for policy 0, policy_version 136723 (0.0032) [2024-06-18 12:52:10,715][12883] Updated weights for policy 0, policy_version 136733 (0.0042) [2024-06-18 12:52:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42876.0, 300 sec: 42654.0). Total num frames: 2240266240. Throughput: 0: 42739.7. Samples: 2240419340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:52:11,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 12:52:14,853][12883] Updated weights for policy 0, policy_version 136743 (0.0029) [2024-06-18 12:52:16,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2240479232. Throughput: 0: 42868.8. Samples: 2240552260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 12:52:16,994][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 12:52:18,550][12883] Updated weights for policy 0, policy_version 136753 (0.0031) [2024-06-18 12:52:21,042][12862] Signal inference workers to stop experience collection... (32750 times) [2024-06-18 12:52:21,087][12883] InferenceWorker_p0-w0: stopping experience collection (32750 times) [2024-06-18 12:52:21,097][12862] Signal inference workers to resume experience collection... (32750 times) [2024-06-18 12:52:21,111][12883] InferenceWorker_p0-w0: resuming experience collection (32750 times) [2024-06-18 12:52:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2240675840. 
Throughput: 0: 42936.4. Samples: 2240812860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:52:21,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 12:52:22,910][12883] Updated weights for policy 0, policy_version 136763 (0.0030) [2024-06-18 12:52:26,067][12883] Updated weights for policy 0, policy_version 136773 (0.0036) [2024-06-18 12:52:26,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 42820.9). Total num frames: 2240937984. Throughput: 0: 42963.2. Samples: 2241064840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:52:26,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 12:52:30,458][12883] Updated weights for policy 0, policy_version 136783 (0.0030) [2024-06-18 12:52:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2241118208. Throughput: 0: 42982.4. Samples: 2241197840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:52:31,994][12645] Avg episode reward: [(0, '0.681')] [2024-06-18 12:52:33,916][12883] Updated weights for policy 0, policy_version 136793 (0.0037) [2024-06-18 12:52:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2241331200. Throughput: 0: 42861.3. Samples: 2241451260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:52:36,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 12:52:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136800_2241331200.pth... [2024-06-18 12:52:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136176_2231107584.pth [2024-06-18 12:52:37,998][12883] Updated weights for policy 0, policy_version 136803 (0.0046) [2024-06-18 12:52:41,466][12883] Updated weights for policy 0, policy_version 136813 (0.0031) [2024-06-18 12:52:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 2241576960. Throughput: 0: 42803.1. 
Samples: 2241702540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:52:41,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 12:52:45,884][12883] Updated weights for policy 0, policy_version 136823 (0.0031) [2024-06-18 12:52:46,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42601.3, 300 sec: 42598.1). Total num frames: 2241757184. Throughput: 0: 42812.1. Samples: 2241839800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:52:46,996][12645] Avg episode reward: [(0, '0.283')] [2024-06-18 12:52:48,993][12883] Updated weights for policy 0, policy_version 136833 (0.0041) [2024-06-18 12:52:51,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2241953792. Throughput: 0: 42781.7. Samples: 2242089320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:52:51,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 12:52:53,502][12883] Updated weights for policy 0, policy_version 136843 (0.0025) [2024-06-18 12:52:56,526][12883] Updated weights for policy 0, policy_version 136853 (0.0033) [2024-06-18 12:52:56,994][12645] Fps is (10 sec: 45885.2, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 2242215936. Throughput: 0: 42769.2. Samples: 2242343960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:52:56,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 12:53:01,085][12883] Updated weights for policy 0, policy_version 136863 (0.0036) [2024-06-18 12:53:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2242396160. Throughput: 0: 42920.5. Samples: 2242483680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:53:01,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 12:53:04,187][12883] Updated weights for policy 0, policy_version 136873 (0.0040) [2024-06-18 12:53:06,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2242609152. Throughput: 0: 42695.4. 
Samples: 2242734160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:53:06,994][12645] Avg episode reward: [(0, '0.648')] [2024-06-18 12:53:08,890][12883] Updated weights for policy 0, policy_version 136883 (0.0033) [2024-06-18 12:53:11,815][12883] Updated weights for policy 0, policy_version 136893 (0.0036) [2024-06-18 12:53:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2242854912. Throughput: 0: 42679.2. Samples: 2242985400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:53:11,994][12645] Avg episode reward: [(0, '0.648')] [2024-06-18 12:53:16,582][12883] Updated weights for policy 0, policy_version 136903 (0.0035) [2024-06-18 12:53:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2243051520. Throughput: 0: 42660.5. Samples: 2243117560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 12:53:16,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 12:53:19,450][12883] Updated weights for policy 0, policy_version 136913 (0.0033) [2024-06-18 12:53:21,994][12645] Fps is (10 sec: 40959.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2243264512. Throughput: 0: 42618.1. Samples: 2243369080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:53:21,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 12:53:24,444][12883] Updated weights for policy 0, policy_version 136923 (0.0023) [2024-06-18 12:53:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2243493888. Throughput: 0: 42704.4. Samples: 2243624240. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:53:26,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 12:53:27,105][12883] Updated weights for policy 0, policy_version 136933 (0.0041) [2024-06-18 12:53:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2243657728. Throughput: 0: 42459.3. 
Samples: 2243750380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:53:31,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 12:53:32,392][12883] Updated weights for policy 0, policy_version 136943 (0.0026) [2024-06-18 12:53:34,931][12883] Updated weights for policy 0, policy_version 136953 (0.0033) [2024-06-18 12:53:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2243903488. Throughput: 0: 42547.5. Samples: 2244003960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:53:36,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 12:53:39,875][12883] Updated weights for policy 0, policy_version 136963 (0.0037) [2024-06-18 12:53:41,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 2244100096. Throughput: 0: 42614.8. Samples: 2244261620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:53:41,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 12:53:43,021][12883] Updated weights for policy 0, policy_version 136973 (0.0044) [2024-06-18 12:53:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42326.8, 300 sec: 42542.9). Total num frames: 2244296704. Throughput: 0: 42274.6. Samples: 2244386040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:53:46,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 12:53:47,442][12883] Updated weights for policy 0, policy_version 136983 (0.0034) [2024-06-18 12:53:47,828][12862] Signal inference workers to stop experience collection... (32800 times) [2024-06-18 12:53:47,828][12862] Signal inference workers to resume experience collection... 
(32800 times) [2024-06-18 12:53:47,863][12883] InferenceWorker_p0-w0: stopping experience collection (32800 times) [2024-06-18 12:53:47,868][12883] InferenceWorker_p0-w0: resuming experience collection (32800 times) [2024-06-18 12:53:50,738][12883] Updated weights for policy 0, policy_version 136993 (0.0038) [2024-06-18 12:53:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2244526080. Throughput: 0: 42420.6. Samples: 2244643080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:53:51,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 12:53:55,142][12883] Updated weights for policy 0, policy_version 137003 (0.0021) [2024-06-18 12:53:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2244722688. Throughput: 0: 42631.0. Samples: 2244903800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:53:56,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 12:53:58,317][12883] Updated weights for policy 0, policy_version 137013 (0.0040) [2024-06-18 12:54:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2244952064. Throughput: 0: 42457.8. Samples: 2245028160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:54:01,999][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 12:54:02,641][12883] Updated weights for policy 0, policy_version 137023 (0.0029) [2024-06-18 12:54:06,350][12883] Updated weights for policy 0, policy_version 137033 (0.0028) [2024-06-18 12:54:06,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2245165056. Throughput: 0: 42636.2. Samples: 2245287700. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:54:06,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 12:54:10,218][12883] Updated weights for policy 0, policy_version 137043 (0.0036) [2024-06-18 12:54:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2245378048. Throughput: 0: 42674.2. Samples: 2245544580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:54:11,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 12:54:13,806][12883] Updated weights for policy 0, policy_version 137053 (0.0031) [2024-06-18 12:54:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2245591040. Throughput: 0: 42612.1. Samples: 2245667920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:54:16,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 12:54:17,692][12883] Updated weights for policy 0, policy_version 137063 (0.0022) [2024-06-18 12:54:21,370][12883] Updated weights for policy 0, policy_version 137073 (0.0025) [2024-06-18 12:54:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2245820416. Throughput: 0: 42897.4. Samples: 2245934340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-18 12:54:21,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 12:54:25,447][12883] Updated weights for policy 0, policy_version 137083 (0.0027) [2024-06-18 12:54:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 2246017024. Throughput: 0: 42828.4. Samples: 2246188900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:54:26,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 12:54:28,991][12883] Updated weights for policy 0, policy_version 137093 (0.0032) [2024-06-18 12:54:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2246246400. Throughput: 0: 42793.5. Samples: 2246311740. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:54:31,994][12645] Avg episode reward: [(0, '0.172')] [2024-06-18 12:54:33,116][12883] Updated weights for policy 0, policy_version 137103 (0.0034) [2024-06-18 12:54:36,709][12883] Updated weights for policy 0, policy_version 137113 (0.0032) [2024-06-18 12:54:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2246459392. Throughput: 0: 42944.4. Samples: 2246575580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:54:36,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 12:54:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137113_2246459392.pth... [2024-06-18 12:54:37,098][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136487_2236203008.pth [2024-06-18 12:54:40,626][12883] Updated weights for policy 0, policy_version 137123 (0.0037) [2024-06-18 12:54:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.7). Total num frames: 2246656000. Throughput: 0: 42772.9. Samples: 2246828580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:54:41,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 12:54:44,306][12883] Updated weights for policy 0, policy_version 137133 (0.0031) [2024-06-18 12:54:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2246901760. Throughput: 0: 42865.7. Samples: 2246957120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:54:46,996][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 12:54:48,109][12883] Updated weights for policy 0, policy_version 137143 (0.0039) [2024-06-18 12:54:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2247098368. Throughput: 0: 42945.2. Samples: 2247220240. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:54:51,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 12:54:52,325][12883] Updated weights for policy 0, policy_version 137153 (0.0033) [2024-06-18 12:54:56,044][12883] Updated weights for policy 0, policy_version 137163 (0.0040) [2024-06-18 12:54:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2247311360. Throughput: 0: 42761.3. Samples: 2247468840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:54:56,998][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 12:55:00,038][12883] Updated weights for policy 0, policy_version 137173 (0.0044) [2024-06-18 12:55:01,995][12645] Fps is (10 sec: 44231.7, 60 sec: 43143.7, 300 sec: 42820.4). Total num frames: 2247540736. Throughput: 0: 42883.3. Samples: 2247597720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:55:01,996][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 12:55:03,624][12883] Updated weights for policy 0, policy_version 137183 (0.0038) [2024-06-18 12:55:06,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2247720960. Throughput: 0: 42665.8. Samples: 2247854300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:55:06,996][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 12:55:07,656][12883] Updated weights for policy 0, policy_version 137193 (0.0036) [2024-06-18 12:55:11,100][12862] Signal inference workers to stop experience collection... (32850 times) [2024-06-18 12:55:11,100][12862] Signal inference workers to resume experience collection... 
(32850 times) [2024-06-18 12:55:11,127][12883] InferenceWorker_p0-w0: stopping experience collection (32850 times) [2024-06-18 12:55:11,127][12883] InferenceWorker_p0-w0: resuming experience collection (32850 times) [2024-06-18 12:55:11,233][12883] Updated weights for policy 0, policy_version 137203 (0.0040) [2024-06-18 12:55:11,994][12645] Fps is (10 sec: 40964.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2247950336. Throughput: 0: 42429.7. Samples: 2248098240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:55:11,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 12:55:15,403][12883] Updated weights for policy 0, policy_version 137213 (0.0037) [2024-06-18 12:55:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2248163328. Throughput: 0: 42703.5. Samples: 2248233400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:55:16,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 12:55:18,986][12883] Updated weights for policy 0, policy_version 137223 (0.0038) [2024-06-18 12:55:21,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 2248376320. Throughput: 0: 42598.3. Samples: 2248492600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 12:55:21,996][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 12:55:22,977][12883] Updated weights for policy 0, policy_version 137233 (0.0032) [2024-06-18 12:55:26,741][12883] Updated weights for policy 0, policy_version 137243 (0.0035) [2024-06-18 12:55:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2248589312. Throughput: 0: 42606.1. Samples: 2248745860. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:55:26,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 12:55:30,911][12883] Updated weights for policy 0, policy_version 137253 (0.0032) [2024-06-18 12:55:31,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2248802304. Throughput: 0: 42577.4. Samples: 2248873100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:55:31,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 12:55:34,530][12883] Updated weights for policy 0, policy_version 137263 (0.0045) [2024-06-18 12:55:36,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42869.9, 300 sec: 42764.9). Total num frames: 2249031680. Throughput: 0: 42468.6. Samples: 2249131420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:55:36,996][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 12:55:38,563][12883] Updated weights for policy 0, policy_version 137273 (0.0026) [2024-06-18 12:55:41,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2249228288. Throughput: 0: 42500.2. Samples: 2249381440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:55:41,997][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 12:55:42,230][12883] Updated weights for policy 0, policy_version 137283 (0.0029) [2024-06-18 12:55:46,179][12883] Updated weights for policy 0, policy_version 137293 (0.0025) [2024-06-18 12:55:46,996][12645] Fps is (10 sec: 40960.0, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 2249441280. Throughput: 0: 42498.6. Samples: 2249510200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:55:46,997][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 12:55:50,009][12883] Updated weights for policy 0, policy_version 137303 (0.0037) [2024-06-18 12:55:51,994][12645] Fps is (10 sec: 39330.4, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2249621504. Throughput: 0: 42491.5. Samples: 2249766420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:55:51,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 12:55:53,861][12883] Updated weights for policy 0, policy_version 137313 (0.0038) [2024-06-18 12:55:56,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2249867264. Throughput: 0: 42651.5. Samples: 2250017560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:55:56,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 12:55:57,587][12883] Updated weights for policy 0, policy_version 137323 (0.0041) [2024-06-18 12:56:01,491][12883] Updated weights for policy 0, policy_version 137333 (0.0024) [2024-06-18 12:56:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42326.2, 300 sec: 42709.8). Total num frames: 2250080256. Throughput: 0: 42610.6. Samples: 2250150880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:56:01,994][12645] Avg episode reward: [(0, '0.599')] [2024-06-18 12:56:05,119][12883] Updated weights for policy 0, policy_version 137343 (0.0034) [2024-06-18 12:56:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42654.8). Total num frames: 2250276864. Throughput: 0: 42460.2. Samples: 2250403220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:56:06,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 12:56:09,207][12883] Updated weights for policy 0, policy_version 137353 (0.0034) [2024-06-18 12:56:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2250506240. Throughput: 0: 42473.1. Samples: 2250657140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:56:11,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 12:56:13,019][12883] Updated weights for policy 0, policy_version 137363 (0.0026) [2024-06-18 12:56:16,786][12883] Updated weights for policy 0, policy_version 137373 (0.0031) [2024-06-18 12:56:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2250719232. Throughput: 0: 42592.3. Samples: 2250789760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:56:16,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 12:56:20,664][12883] Updated weights for policy 0, policy_version 137383 (0.0022) [2024-06-18 12:56:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42599.9, 300 sec: 42765.0). Total num frames: 2250932224. Throughput: 0: 42670.9. Samples: 2251051520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:56:21,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 12:56:24,263][12883] Updated weights for policy 0, policy_version 137393 (0.0035) [2024-06-18 12:56:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2251161600. Throughput: 0: 42835.0. Samples: 2251308920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 12:56:26,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 12:56:28,311][12883] Updated weights for policy 0, policy_version 137403 (0.0032) [2024-06-18 12:56:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2251358208. Throughput: 0: 42901.7. Samples: 2251440680. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:56:31,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 12:56:32,035][12883] Updated weights for policy 0, policy_version 137413 (0.0027) [2024-06-18 12:56:36,036][12883] Updated weights for policy 0, policy_version 137423 (0.0028) [2024-06-18 12:56:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 2251571200. Throughput: 0: 42853.8. Samples: 2251694840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:56:36,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 12:56:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137425_2251571200.pth... [2024-06-18 12:56:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000136800_2241331200.pth [2024-06-18 12:56:39,594][12883] Updated weights for policy 0, policy_version 137433 (0.0032) [2024-06-18 12:56:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42873.0, 300 sec: 42710.4). Total num frames: 2251800576. Throughput: 0: 42876.0. Samples: 2251946980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:56:41,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 12:56:43,910][12883] Updated weights for policy 0, policy_version 137443 (0.0026) [2024-06-18 12:56:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42873.0, 300 sec: 42820.6). Total num frames: 2252013568. Throughput: 0: 42827.5. Samples: 2252078120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:56:46,996][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 12:56:47,115][12883] Updated weights for policy 0, policy_version 137453 (0.0041) [2024-06-18 12:56:51,438][12883] Updated weights for policy 0, policy_version 137463 (0.0040) [2024-06-18 12:56:52,000][12645] Fps is (10 sec: 40934.7, 60 sec: 43140.0, 300 sec: 42708.6). Total num frames: 2252210176. Throughput: 0: 42872.8. Samples: 2252332760. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:56:52,001][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 12:56:54,552][12862] Signal inference workers to stop experience collection... (32900 times) [2024-06-18 12:56:54,584][12883] InferenceWorker_p0-w0: stopping experience collection (32900 times) [2024-06-18 12:56:54,611][12862] Signal inference workers to resume experience collection... (32900 times) [2024-06-18 12:56:54,612][12883] InferenceWorker_p0-w0: resuming experience collection (32900 times) [2024-06-18 12:56:54,750][12883] Updated weights for policy 0, policy_version 137473 (0.0039) [2024-06-18 12:56:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2252439552. Throughput: 0: 42976.8. Samples: 2252591100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:56:56,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 12:56:59,028][12883] Updated weights for policy 0, policy_version 137483 (0.0051) [2024-06-18 12:57:01,994][12645] Fps is (10 sec: 42625.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2252636160. Throughput: 0: 42881.9. Samples: 2252719440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:57:01,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 12:57:02,828][12883] Updated weights for policy 0, policy_version 137493 (0.0027) [2024-06-18 12:57:06,464][12883] Updated weights for policy 0, policy_version 137503 (0.0041) [2024-06-18 12:57:06,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2252849152. Throughput: 0: 42622.1. Samples: 2252969520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:57:06,995][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 12:57:10,441][12883] Updated weights for policy 0, policy_version 137513 (0.0035) [2024-06-18 12:57:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2253062144. Throughput: 0: 42613.7. 
Samples: 2253226540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:57:11,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 12:57:14,098][12883] Updated weights for policy 0, policy_version 137523 (0.0034) [2024-06-18 12:57:16,994][12645] Fps is (10 sec: 42599.8, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 2253275136. Throughput: 0: 42473.4. Samples: 2253351980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:57:16,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 12:57:18,125][12883] Updated weights for policy 0, policy_version 137533 (0.0031) [2024-06-18 12:57:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2253488128. Throughput: 0: 42438.6. Samples: 2253604580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:57:21,998][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 12:57:22,429][12883] Updated weights for policy 0, policy_version 137543 (0.0032) [2024-06-18 12:57:25,842][12883] Updated weights for policy 0, policy_version 137553 (0.0033) [2024-06-18 12:57:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2253717504. Throughput: 0: 42429.8. Samples: 2253856320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 12:57:26,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 12:57:30,280][12883] Updated weights for policy 0, policy_version 137563 (0.0037) [2024-06-18 12:57:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2253897728. Throughput: 0: 42396.5. Samples: 2253985960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:57:31,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 12:57:33,518][12883] Updated weights for policy 0, policy_version 137573 (0.0042) [2024-06-18 12:57:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2254127104. Throughput: 0: 42316.7. 
Samples: 2254236740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:57:36,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 12:57:38,038][12883] Updated weights for policy 0, policy_version 137583 (0.0043) [2024-06-18 12:57:41,772][12883] Updated weights for policy 0, policy_version 137593 (0.0027) [2024-06-18 12:57:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2254340096. Throughput: 0: 42295.2. Samples: 2254494380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:57:41,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 12:57:45,729][12883] Updated weights for policy 0, policy_version 137603 (0.0061) [2024-06-18 12:57:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2254536704. Throughput: 0: 42203.5. Samples: 2254618600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:57:46,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 12:57:49,361][12883] Updated weights for policy 0, policy_version 137613 (0.0029) [2024-06-18 12:57:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42602.9, 300 sec: 42542.9). Total num frames: 2254766080. Throughput: 0: 42367.8. Samples: 2254876060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:57:51,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 12:57:53,479][12883] Updated weights for policy 0, policy_version 137623 (0.0035) [2024-06-18 12:57:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2254962688. Throughput: 0: 42401.8. Samples: 2255134620. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:57:56,994][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 12:57:57,033][12883] Updated weights for policy 0, policy_version 137633 (0.0034) [2024-06-18 12:58:01,039][12883] Updated weights for policy 0, policy_version 137643 (0.0019) [2024-06-18 12:58:01,997][12645] Fps is (10 sec: 42582.8, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 2255192064. Throughput: 0: 42283.6. Samples: 2255254900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:58:01,998][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 12:58:04,489][12883] Updated weights for policy 0, policy_version 137653 (0.0038) [2024-06-18 12:58:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2255421440. Throughput: 0: 42447.9. Samples: 2255514740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:58:06,994][12645] Avg episode reward: [(0, '0.659')] [2024-06-18 12:58:08,812][12883] Updated weights for policy 0, policy_version 137663 (0.0029) [2024-06-18 12:58:11,994][12645] Fps is (10 sec: 40975.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2255601664. Throughput: 0: 42616.6. Samples: 2255774060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:58:11,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 12:58:12,185][12883] Updated weights for policy 0, policy_version 137673 (0.0043) [2024-06-18 12:58:14,038][12862] Signal inference workers to stop experience collection... (32950 times) [2024-06-18 12:58:14,087][12862] Signal inference workers to resume experience collection... 
(32950 times) [2024-06-18 12:58:14,088][12883] InferenceWorker_p0-w0: stopping experience collection (32950 times) [2024-06-18 12:58:14,101][12883] InferenceWorker_p0-w0: resuming experience collection (32950 times) [2024-06-18 12:58:16,633][12883] Updated weights for policy 0, policy_version 137683 (0.0028) [2024-06-18 12:58:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 2255831040. Throughput: 0: 42325.7. Samples: 2255890620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:58:16,994][12645] Avg episode reward: [(0, '0.714')] [2024-06-18 12:58:19,882][12883] Updated weights for policy 0, policy_version 137693 (0.0046) [2024-06-18 12:58:21,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2256060416. Throughput: 0: 42593.7. Samples: 2256153460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:58:21,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 12:58:24,300][12883] Updated weights for policy 0, policy_version 137703 (0.0028) [2024-06-18 12:58:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2256240640. Throughput: 0: 42634.1. Samples: 2256412920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:58:26,995][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 12:58:27,911][12883] Updated weights for policy 0, policy_version 137713 (0.0035) [2024-06-18 12:58:31,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2256437248. Throughput: 0: 42479.2. Samples: 2256530160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 12:58:31,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 12:58:32,116][12883] Updated weights for policy 0, policy_version 137723 (0.0028) [2024-06-18 12:58:35,725][12883] Updated weights for policy 0, policy_version 137733 (0.0041) [2024-06-18 12:58:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2256683008. Throughput: 0: 42559.9. Samples: 2256791260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:58:36,994][12645] Avg episode reward: [(0, '0.594')] [2024-06-18 12:58:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137738_2256699392.pth... [2024-06-18 12:58:37,084][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137113_2246459392.pth [2024-06-18 12:58:39,900][12883] Updated weights for policy 0, policy_version 137743 (0.0028) [2024-06-18 12:58:41,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2256879616. Throughput: 0: 42461.7. Samples: 2257045400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:58:41,994][12645] Avg episode reward: [(0, '0.886')] [2024-06-18 12:58:43,473][12883] Updated weights for policy 0, policy_version 137753 (0.0032) [2024-06-18 12:58:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2257076224. Throughput: 0: 42458.0. Samples: 2257165360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:58:46,994][12645] Avg episode reward: [(0, '0.599')] [2024-06-18 12:58:47,440][12883] Updated weights for policy 0, policy_version 137763 (0.0035) [2024-06-18 12:58:51,004][12883] Updated weights for policy 0, policy_version 137773 (0.0037) [2024-06-18 12:58:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2257321984. Throughput: 0: 42541.3. Samples: 2257429100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:58:51,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 12:58:54,995][12883] Updated weights for policy 0, policy_version 137783 (0.0036) [2024-06-18 12:58:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2257534976. Throughput: 0: 42576.4. Samples: 2257690000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:58:56,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 12:58:58,428][12883] Updated weights for policy 0, policy_version 137793 (0.0035) [2024-06-18 12:59:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42327.8, 300 sec: 42598.4). Total num frames: 2257731584. Throughput: 0: 42632.4. Samples: 2257809080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:59:01,994][12645] Avg episode reward: [(0, '0.687')] [2024-06-18 12:59:02,421][12883] Updated weights for policy 0, policy_version 137803 (0.0032) [2024-06-18 12:59:06,273][12883] Updated weights for policy 0, policy_version 137813 (0.0037) [2024-06-18 12:59:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2257960960. Throughput: 0: 42640.8. Samples: 2258072300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:59:06,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 12:59:09,894][12883] Updated weights for policy 0, policy_version 137823 (0.0038) [2024-06-18 12:59:11,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2258157568. Throughput: 0: 42579.3. Samples: 2258328980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:59:11,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 12:59:13,884][12883] Updated weights for policy 0, policy_version 137833 (0.0036) [2024-06-18 12:59:16,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2258386944. Throughput: 0: 42751.6. Samples: 2258454080. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:59:16,997][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 12:59:17,505][12883] Updated weights for policy 0, policy_version 137843 (0.0029) [2024-06-18 12:59:21,492][12883] Updated weights for policy 0, policy_version 137853 (0.0032) [2024-06-18 12:59:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2258583552. Throughput: 0: 42633.4. Samples: 2258709760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:59:21,994][12645] Avg episode reward: [(0, '0.213')] [2024-06-18 12:59:22,835][12862] Signal inference workers to stop experience collection... (33000 times) [2024-06-18 12:59:22,836][12862] Signal inference workers to resume experience collection... (33000 times) [2024-06-18 12:59:22,848][12883] InferenceWorker_p0-w0: stopping experience collection (33000 times) [2024-06-18 12:59:22,848][12883] InferenceWorker_p0-w0: resuming experience collection (33000 times) [2024-06-18 12:59:25,007][12883] Updated weights for policy 0, policy_version 137863 (0.0030) [2024-06-18 12:59:26,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2258796544. Throughput: 0: 42667.5. Samples: 2258965440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:59:26,994][12645] Avg episode reward: [(0, '0.639')] [2024-06-18 12:59:29,410][12883] Updated weights for policy 0, policy_version 137873 (0.0029) [2024-06-18 12:59:31,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2259009536. Throughput: 0: 42747.1. Samples: 2259088980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:59:31,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 12:59:32,926][12883] Updated weights for policy 0, policy_version 137883 (0.0024) [2024-06-18 12:59:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2259222528. Throughput: 0: 42565.0. 
Samples: 2259344520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 12:59:36,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 12:59:37,049][12883] Updated weights for policy 0, policy_version 137893 (0.0038) [2024-06-18 12:59:40,885][12883] Updated weights for policy 0, policy_version 137903 (0.0043) [2024-06-18 12:59:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2259435520. Throughput: 0: 42349.7. Samples: 2259595740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:59:41,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 12:59:44,685][12883] Updated weights for policy 0, policy_version 137913 (0.0033) [2024-06-18 12:59:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2259648512. Throughput: 0: 42585.4. Samples: 2259725420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:59:46,994][12645] Avg episode reward: [(0, '0.302')] [2024-06-18 12:59:48,808][12883] Updated weights for policy 0, policy_version 137923 (0.0036) [2024-06-18 12:59:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2259861504. Throughput: 0: 42340.0. Samples: 2259977600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:59:51,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 12:59:52,434][12883] Updated weights for policy 0, policy_version 137933 (0.0027) [2024-06-18 12:59:56,706][12883] Updated weights for policy 0, policy_version 137943 (0.0034) [2024-06-18 12:59:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.5). Total num frames: 2260074496. Throughput: 0: 42326.6. Samples: 2260233680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 12:59:56,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 13:00:00,253][12883] Updated weights for policy 0, policy_version 137953 (0.0041) [2024-06-18 13:00:01,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2260271104. Throughput: 0: 42315.6. Samples: 2260358180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 13:00:01,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 13:00:04,352][12883] Updated weights for policy 0, policy_version 137963 (0.0031) [2024-06-18 13:00:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2260500480. Throughput: 0: 42300.9. Samples: 2260613300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 13:00:06,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 13:00:07,852][12883] Updated weights for policy 0, policy_version 137973 (0.0034) [2024-06-18 13:00:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2260697088. Throughput: 0: 42366.9. Samples: 2260871940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 13:00:11,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 13:00:12,013][12883] Updated weights for policy 0, policy_version 137983 (0.0030) [2024-06-18 13:00:15,520][12883] Updated weights for policy 0, policy_version 137993 (0.0047) [2024-06-18 13:00:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42053.8, 300 sec: 42487.6). Total num frames: 2260910080. Throughput: 0: 42253.3. Samples: 2260990380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 13:00:16,994][12645] Avg episode reward: [(0, '0.564')] [2024-06-18 13:00:19,900][12883] Updated weights for policy 0, policy_version 138003 (0.0049) [2024-06-18 13:00:21,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42596.8, 300 sec: 42542.6). Total num frames: 2261139456. Throughput: 0: 42272.2. Samples: 2261246860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 13:00:21,996][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 13:00:23,398][12883] Updated weights for policy 0, policy_version 138013 (0.0027) [2024-06-18 13:00:26,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42050.8, 300 sec: 42431.5). Total num frames: 2261319680. Throughput: 0: 42402.0. Samples: 2261503920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 13:00:26,996][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 13:00:27,590][12883] Updated weights for policy 0, policy_version 138023 (0.0038) [2024-06-18 13:00:31,061][12883] Updated weights for policy 0, policy_version 138033 (0.0043) [2024-06-18 13:00:31,994][12645] Fps is (10 sec: 42607.3, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2261565440. Throughput: 0: 42234.5. Samples: 2261625980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 13:00:31,994][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 13:00:35,240][12883] Updated weights for policy 0, policy_version 138043 (0.0043) [2024-06-18 13:00:36,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 2261778432. Throughput: 0: 42434.8. Samples: 2261887160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 13:00:36,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 13:00:37,029][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138049_2261794816.pth... [2024-06-18 13:00:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137425_2251571200.pth [2024-06-18 13:00:38,910][12883] Updated weights for policy 0, policy_version 138053 (0.0028) [2024-06-18 13:00:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2261975040. Throughput: 0: 42236.7. Samples: 2262134340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:00:41,994][12645] Avg episode reward: [(0, '0.742')] [2024-06-18 13:00:42,883][12883] Updated weights for policy 0, policy_version 138063 (0.0034) [2024-06-18 13:00:46,586][12883] Updated weights for policy 0, policy_version 138073 (0.0045) [2024-06-18 13:00:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2262188032. Throughput: 0: 42327.0. Samples: 2262262900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:00:46,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 13:00:50,442][12883] Updated weights for policy 0, policy_version 138083 (0.0025) [2024-06-18 13:00:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2262401024. Throughput: 0: 42345.7. Samples: 2262518860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:00:51,994][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 13:00:52,854][12862] Signal inference workers to stop experience collection... (33050 times) [2024-06-18 13:00:52,855][12862] Signal inference workers to resume experience collection... (33050 times) [2024-06-18 13:00:52,884][12883] InferenceWorker_p0-w0: stopping experience collection (33050 times) [2024-06-18 13:00:52,885][12883] InferenceWorker_p0-w0: resuming experience collection (33050 times) [2024-06-18 13:00:54,480][12883] Updated weights for policy 0, policy_version 138093 (0.0037) [2024-06-18 13:00:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2262614016. Throughput: 0: 42340.8. Samples: 2262777280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:00:56,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 13:00:58,642][12883] Updated weights for policy 0, policy_version 138103 (0.0035) [2024-06-18 13:01:01,952][12883] Updated weights for policy 0, policy_version 138113 (0.0025) [2024-06-18 13:01:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2262843392. Throughput: 0: 42476.5. Samples: 2262901820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:01:01,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 13:01:06,168][12883] Updated weights for policy 0, policy_version 138123 (0.0034) [2024-06-18 13:01:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2263056384. Throughput: 0: 42526.5. Samples: 2263160460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:01:06,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 13:01:09,796][12883] Updated weights for policy 0, policy_version 138133 (0.0043) [2024-06-18 13:01:11,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2263236608. Throughput: 0: 42487.5. Samples: 2263415760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:01:11,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 13:01:13,903][12883] Updated weights for policy 0, policy_version 138143 (0.0027) [2024-06-18 13:01:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2263482368. Throughput: 0: 42522.3. Samples: 2263539480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:01:16,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 13:01:17,569][12883] Updated weights for policy 0, policy_version 138153 (0.0036) [2024-06-18 13:01:21,424][12883] Updated weights for policy 0, policy_version 138163 (0.0034) [2024-06-18 13:01:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42326.8, 300 sec: 42431.8). Total num frames: 2263678976. Throughput: 0: 42451.4. Samples: 2263797480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:01:21,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 13:01:25,500][12883] Updated weights for policy 0, policy_version 138173 (0.0038) [2024-06-18 13:01:27,000][12645] Fps is (10 sec: 39297.2, 60 sec: 42595.6, 300 sec: 42430.9). Total num frames: 2263875584. Throughput: 0: 42607.5. Samples: 2264051940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:01:27,000][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 13:01:29,080][12883] Updated weights for policy 0, policy_version 138183 (0.0036) [2024-06-18 13:01:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2264121344. Throughput: 0: 42622.2. Samples: 2264180900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:01:31,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 13:01:32,948][12883] Updated weights for policy 0, policy_version 138193 (0.0035) [2024-06-18 13:01:36,909][12883] Updated weights for policy 0, policy_version 138203 (0.0044) [2024-06-18 13:01:36,994][12645] Fps is (10 sec: 44264.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2264317952. Throughput: 0: 42648.9. Samples: 2264438060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:01:36,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 13:01:40,510][12883] Updated weights for policy 0, policy_version 138213 (0.0030) [2024-06-18 13:01:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 2264530944. Throughput: 0: 42552.5. Samples: 2264692140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:01:41,994][12645] Avg episode reward: [(0, '0.659')] [2024-06-18 13:01:44,462][12883] Updated weights for policy 0, policy_version 138223 (0.0033) [2024-06-18 13:01:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42543.8). Total num frames: 2264760320. Throughput: 0: 42629.3. Samples: 2264820140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:01:46,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 13:01:48,256][12883] Updated weights for policy 0, policy_version 138233 (0.0027) [2024-06-18 13:01:51,997][12645] Fps is (10 sec: 40944.8, 60 sec: 42322.7, 300 sec: 42375.7). Total num frames: 2264940544. Throughput: 0: 42603.2. Samples: 2265077760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:01:51,998][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 13:01:52,303][12883] Updated weights for policy 0, policy_version 138243 (0.0025) [2024-06-18 13:01:55,685][12883] Updated weights for policy 0, policy_version 138253 (0.0030) [2024-06-18 13:01:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2265169920. Throughput: 0: 42520.3. Samples: 2265329180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:01:56,994][12645] Avg episode reward: [(0, '0.649')] [2024-06-18 13:01:59,974][12883] Updated weights for policy 0, policy_version 138263 (0.0043) [2024-06-18 13:02:01,994][12645] Fps is (10 sec: 45892.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2265399296. Throughput: 0: 42837.4. Samples: 2265467160. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:02:01,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 13:02:03,795][12883] Updated weights for policy 0, policy_version 138273 (0.0034) [2024-06-18 13:02:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2265579520. Throughput: 0: 42685.4. Samples: 2265718320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:02:06,994][12645] Avg episode reward: [(0, '0.741')] [2024-06-18 13:02:07,739][12883] Updated weights for policy 0, policy_version 138283 (0.0033) [2024-06-18 13:02:11,161][12883] Updated weights for policy 0, policy_version 138293 (0.0051) [2024-06-18 13:02:11,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2265825280. Throughput: 0: 42630.7. Samples: 2265970060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:02:11,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 13:02:15,225][12862] Signal inference workers to stop experience collection... (33100 times) [2024-06-18 13:02:15,226][12862] Signal inference workers to resume experience collection... (33100 times) [2024-06-18 13:02:15,252][12883] InferenceWorker_p0-w0: stopping experience collection (33100 times) [2024-06-18 13:02:15,252][12883] InferenceWorker_p0-w0: resuming experience collection (33100 times) [2024-06-18 13:02:15,391][12883] Updated weights for policy 0, policy_version 138303 (0.0043) [2024-06-18 13:02:16,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2266054656. Throughput: 0: 42732.0. Samples: 2266103840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:02:16,996][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 13:02:18,615][12883] Updated weights for policy 0, policy_version 138313 (0.0035) [2024-06-18 13:02:21,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 2266202112. Throughput: 0: 42684.5. 
Samples: 2266358860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:02:21,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 13:02:22,899][12883] Updated weights for policy 0, policy_version 138323 (0.0022) [2024-06-18 13:02:26,050][12883] Updated weights for policy 0, policy_version 138333 (0.0025) [2024-06-18 13:02:26,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42875.8, 300 sec: 42542.8). Total num frames: 2266447872. Throughput: 0: 42640.3. Samples: 2266610960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:02:26,994][12645] Avg episode reward: [(0, '0.291')] [2024-06-18 13:02:30,526][12883] Updated weights for policy 0, policy_version 138343 (0.0027) [2024-06-18 13:02:31,994][12645] Fps is (10 sec: 47514.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2266677248. Throughput: 0: 42861.0. Samples: 2266748880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:02:31,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 13:02:34,313][12883] Updated weights for policy 0, policy_version 138353 (0.0030) [2024-06-18 13:02:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2266857472. Throughput: 0: 42696.4. Samples: 2266998940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:02:36,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 13:02:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138358_2266857472.pth... [2024-06-18 13:02:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000137738_2256699392.pth [2024-06-18 13:02:38,054][12883] Updated weights for policy 0, policy_version 138363 (0.0022) [2024-06-18 13:02:41,943][12883] Updated weights for policy 0, policy_version 138373 (0.0038) [2024-06-18 13:02:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2267103232. Throughput: 0: 42815.2. Samples: 2267255860. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 13:02:41,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 13:02:46,201][12883] Updated weights for policy 0, policy_version 138383 (0.0035) [2024-06-18 13:02:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2267299840. Throughput: 0: 42694.7. Samples: 2267388420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:02:46,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 13:02:49,580][12883] Updated weights for policy 0, policy_version 138393 (0.0030) [2024-06-18 13:02:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42874.1, 300 sec: 42542.9). Total num frames: 2267512832. Throughput: 0: 42733.9. Samples: 2267641340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:02:51,994][12645] Avg episode reward: [(0, '0.650')] [2024-06-18 13:02:53,755][12883] Updated weights for policy 0, policy_version 138403 (0.0032) [2024-06-18 13:02:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.8). Total num frames: 2267725824. Throughput: 0: 42726.8. Samples: 2267892760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:02:56,994][12645] Avg episode reward: [(0, '0.668')] [2024-06-18 13:02:57,255][12883] Updated weights for policy 0, policy_version 138413 (0.0039) [2024-06-18 13:03:01,221][12883] Updated weights for policy 0, policy_version 138423 (0.0027) [2024-06-18 13:03:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2267938816. Throughput: 0: 42666.8. Samples: 2268023840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:01,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 13:03:04,940][12883] Updated weights for policy 0, policy_version 138433 (0.0032) [2024-06-18 13:03:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42542.8). Total num frames: 2268151808. Throughput: 0: 42665.8. Samples: 2268278820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:06,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 13:03:09,010][12883] Updated weights for policy 0, policy_version 138443 (0.0023) [2024-06-18 13:03:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2268381184. Throughput: 0: 42496.6. Samples: 2268523300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:11,994][12645] Avg episode reward: [(0, '0.666')] [2024-06-18 13:03:12,799][12883] Updated weights for policy 0, policy_version 138453 (0.0029) [2024-06-18 13:03:16,880][12883] Updated weights for policy 0, policy_version 138463 (0.0031) [2024-06-18 13:03:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2268577792. Throughput: 0: 42387.1. Samples: 2268656300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:16,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 13:03:20,415][12883] Updated weights for policy 0, policy_version 138473 (0.0035) [2024-06-18 13:03:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2268774400. Throughput: 0: 42387.6. Samples: 2268906380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:21,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 13:03:24,566][12883] Updated weights for policy 0, policy_version 138483 (0.0031) [2024-06-18 13:03:26,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2269020160. Throughput: 0: 42218.2. Samples: 2269155680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:26,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 13:03:27,933][12883] Updated weights for policy 0, policy_version 138493 (0.0035) [2024-06-18 13:03:31,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2269216768. Throughput: 0: 42377.0. Samples: 2269295380. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:31,994][12645] Avg episode reward: [(0, '0.629')] [2024-06-18 13:03:32,099][12883] Updated weights for policy 0, policy_version 138503 (0.0033) [2024-06-18 13:03:35,553][12883] Updated weights for policy 0, policy_version 138513 (0.0040) [2024-06-18 13:03:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2269413376. Throughput: 0: 42257.7. Samples: 2269542940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:36,994][12645] Avg episode reward: [(0, '0.652')] [2024-06-18 13:03:39,642][12883] Updated weights for policy 0, policy_version 138523 (0.0032) [2024-06-18 13:03:41,454][12862] Signal inference workers to stop experience collection... (33150 times) [2024-06-18 13:03:41,459][12862] Signal inference workers to resume experience collection... (33150 times) [2024-06-18 13:03:41,495][12883] InferenceWorker_p0-w0: stopping experience collection (33150 times) [2024-06-18 13:03:41,495][12883] InferenceWorker_p0-w0: resuming experience collection (33150 times) [2024-06-18 13:03:41,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2269675520. Throughput: 0: 42343.5. Samples: 2269798220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:41,994][12645] Avg episode reward: [(0, '0.695')] [2024-06-18 13:03:43,323][12883] Updated weights for policy 0, policy_version 138533 (0.0022) [2024-06-18 13:03:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2269855744. Throughput: 0: 42500.4. Samples: 2269936360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:03:46,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 13:03:47,222][12883] Updated weights for policy 0, policy_version 138543 (0.0028) [2024-06-18 13:03:51,413][12883] Updated weights for policy 0, policy_version 138553 (0.0029) [2024-06-18 13:03:51,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2270068736. Throughput: 0: 42389.4. Samples: 2270186340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:03:51,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 13:03:55,578][12883] Updated weights for policy 0, policy_version 138563 (0.0035) [2024-06-18 13:03:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2270314496. Throughput: 0: 42618.6. Samples: 2270441140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:03:56,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 13:03:59,188][12883] Updated weights for policy 0, policy_version 138573 (0.0042) [2024-06-18 13:04:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42376.3). Total num frames: 2270461952. Throughput: 0: 42481.7. Samples: 2270567980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:01,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 13:04:03,177][12883] Updated weights for policy 0, policy_version 138583 (0.0029) [2024-06-18 13:04:06,685][12883] Updated weights for policy 0, policy_version 138593 (0.0025) [2024-06-18 13:04:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2270707712. Throughput: 0: 42487.1. Samples: 2270818300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:06,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 13:04:10,727][12883] Updated weights for policy 0, policy_version 138603 (0.0045) [2024-06-18 13:04:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.2, 300 sec: 42487.6). Total num frames: 2270920704. Throughput: 0: 42780.0. Samples: 2271080780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:11,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 13:04:14,353][12883] Updated weights for policy 0, policy_version 138613 (0.0036) [2024-06-18 13:04:16,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 2271100928. Throughput: 0: 42494.1. Samples: 2271207620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:16,994][12645] Avg episode reward: [(0, '0.701')] [2024-06-18 13:04:18,241][12883] Updated weights for policy 0, policy_version 138623 (0.0037) [2024-06-18 13:04:21,934][12883] Updated weights for policy 0, policy_version 138633 (0.0042) [2024-06-18 13:04:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2271363072. Throughput: 0: 42608.4. Samples: 2271460320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:21,994][12645] Avg episode reward: [(0, '0.644')] [2024-06-18 13:04:25,854][12883] Updated weights for policy 0, policy_version 138643 (0.0030) [2024-06-18 13:04:26,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2271576064. Throughput: 0: 42799.9. Samples: 2271724220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:26,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 13:04:29,612][12883] Updated weights for policy 0, policy_version 138653 (0.0035) [2024-06-18 13:04:31,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2271756288. Throughput: 0: 42525.0. Samples: 2271849980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:31,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 13:04:33,637][12883] Updated weights for policy 0, policy_version 138663 (0.0035) [2024-06-18 13:04:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2271985664. Throughput: 0: 42581.2. Samples: 2272102500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:36,994][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 13:04:37,191][12883] Updated weights for policy 0, policy_version 138673 (0.0039) [2024-06-18 13:04:37,193][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138673_2272018432.pth... [2024-06-18 13:04:37,275][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138049_2261794816.pth [2024-06-18 13:04:41,640][12883] Updated weights for policy 0, policy_version 138683 (0.0028) [2024-06-18 13:04:41,637][12862] Signal inference workers to stop experience collection... (33200 times) [2024-06-18 13:04:41,646][12862] Signal inference workers to resume experience collection... (33200 times) [2024-06-18 13:04:41,660][12883] InferenceWorker_p0-w0: stopping experience collection (33200 times) [2024-06-18 13:04:41,660][12883] InferenceWorker_p0-w0: resuming experience collection (33200 times) [2024-06-18 13:04:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2272215040. Throughput: 0: 42750.3. Samples: 2272364900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:41,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 13:04:44,792][12883] Updated weights for policy 0, policy_version 138693 (0.0022) [2024-06-18 13:04:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2272411648. Throughput: 0: 42593.7. Samples: 2272484700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:04:46,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 13:04:49,434][12883] Updated weights for policy 0, policy_version 138703 (0.0034) [2024-06-18 13:04:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2272624640. Throughput: 0: 42586.7. Samples: 2272734700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:04:51,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 13:04:52,785][12883] Updated weights for policy 0, policy_version 138713 (0.0031) [2024-06-18 13:04:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 2272821248. Throughput: 0: 42647.7. Samples: 2272999920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:04:56,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 13:04:57,015][12883] Updated weights for policy 0, policy_version 138723 (0.0026) [2024-06-18 13:05:00,356][12883] Updated weights for policy 0, policy_version 138733 (0.0041) [2024-06-18 13:05:01,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 2273050624. Throughput: 0: 42549.3. Samples: 2273122340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:02,000][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 13:05:04,817][12883] Updated weights for policy 0, policy_version 138743 (0.0027) [2024-06-18 13:05:06,994][12645] Fps is (10 sec: 44235.1, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 2273263616. Throughput: 0: 42494.9. Samples: 2273372600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:07,000][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 13:05:08,434][12883] Updated weights for policy 0, policy_version 138753 (0.0029) [2024-06-18 13:05:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2273460224. Throughput: 0: 42595.3. Samples: 2273641000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:11,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 13:05:12,234][12883] Updated weights for policy 0, policy_version 138763 (0.0028) [2024-06-18 13:05:15,962][12883] Updated weights for policy 0, policy_version 138773 (0.0035) [2024-06-18 13:05:16,994][12645] Fps is (10 sec: 42599.9, 60 sec: 43144.6, 300 sec: 42543.2). Total num frames: 2273689600. Throughput: 0: 42441.8. Samples: 2273759860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:16,994][12645] Avg episode reward: [(0, '0.624')] [2024-06-18 13:05:19,916][12883] Updated weights for policy 0, policy_version 138783 (0.0032) [2024-06-18 13:05:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2273902592. Throughput: 0: 42440.0. Samples: 2274012300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:21,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 13:05:23,599][12883] Updated weights for policy 0, policy_version 138793 (0.0033) [2024-06-18 13:05:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 2274099200. Throughput: 0: 42455.6. Samples: 2274275400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:26,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 13:05:27,384][12883] Updated weights for policy 0, policy_version 138803 (0.0026) [2024-06-18 13:05:31,168][12883] Updated weights for policy 0, policy_version 138813 (0.0030) [2024-06-18 13:05:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2274344960. Throughput: 0: 42584.0. Samples: 2274400980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:31,994][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 13:05:34,872][12883] Updated weights for policy 0, policy_version 138823 (0.0036) [2024-06-18 13:05:36,526][12862] Signal inference workers to stop experience collection... (33250 times) [2024-06-18 13:05:36,527][12862] Signal inference workers to resume experience collection... (33250 times) [2024-06-18 13:05:36,568][12883] InferenceWorker_p0-w0: stopping experience collection (33250 times) [2024-06-18 13:05:36,568][12883] InferenceWorker_p0-w0: resuming experience collection (33250 times) [2024-06-18 13:05:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2274557952. Throughput: 0: 42756.0. Samples: 2274658720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:36,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 13:05:38,755][12883] Updated weights for policy 0, policy_version 138833 (0.0038) [2024-06-18 13:05:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2274754560. Throughput: 0: 42716.8. Samples: 2274922180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:41,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 13:05:42,599][12883] Updated weights for policy 0, policy_version 138843 (0.0048) [2024-06-18 13:05:46,420][12883] Updated weights for policy 0, policy_version 138853 (0.0026) [2024-06-18 13:05:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2274983936. Throughput: 0: 42681.4. Samples: 2275043000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:46,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 13:05:50,121][12883] Updated weights for policy 0, policy_version 138863 (0.0036) [2024-06-18 13:05:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2275180544. 
Throughput: 0: 42796.7. Samples: 2275298440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 13:05:51,994][12645] Avg episode reward: [(0, '0.729')] [2024-06-18 13:05:54,143][12883] Updated weights for policy 0, policy_version 138873 (0.0033) [2024-06-18 13:05:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2275393536. Throughput: 0: 42614.6. Samples: 2275558660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:05:56,994][12645] Avg episode reward: [(0, '0.772')] [2024-06-18 13:05:58,203][12883] Updated weights for policy 0, policy_version 138883 (0.0039) [2024-06-18 13:06:01,913][12883] Updated weights for policy 0, policy_version 138893 (0.0053) [2024-06-18 13:06:01,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 2275622912. Throughput: 0: 42760.0. Samples: 2275684160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:01,997][12645] Avg episode reward: [(0, '0.758')] [2024-06-18 13:06:05,811][12883] Updated weights for policy 0, policy_version 138903 (0.0032) [2024-06-18 13:06:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 2275819520. Throughput: 0: 42763.5. Samples: 2275936660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:06,994][12645] Avg episode reward: [(0, '0.758')] [2024-06-18 13:06:09,715][12883] Updated weights for policy 0, policy_version 138913 (0.0032) [2024-06-18 13:06:11,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2276032512. Throughput: 0: 42672.8. Samples: 2276195680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:11,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 13:06:13,419][12883] Updated weights for policy 0, policy_version 138923 (0.0045) [2024-06-18 13:06:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2276245504. 
Throughput: 0: 42601.4. Samples: 2276318040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:16,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 13:06:17,362][12883] Updated weights for policy 0, policy_version 138933 (0.0040) [2024-06-18 13:06:20,939][12883] Updated weights for policy 0, policy_version 138943 (0.0038) [2024-06-18 13:06:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42654.8). Total num frames: 2276458496. Throughput: 0: 42576.3. Samples: 2276574660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:21,994][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 13:06:25,033][12883] Updated weights for policy 0, policy_version 138953 (0.0033) [2024-06-18 13:06:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2276655104. Throughput: 0: 42386.2. Samples: 2276829560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:26,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 13:06:28,787][12883] Updated weights for policy 0, policy_version 138963 (0.0036) [2024-06-18 13:06:31,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2276884480. Throughput: 0: 42573.9. Samples: 2276958820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:31,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 13:06:32,960][12883] Updated weights for policy 0, policy_version 138973 (0.0035) [2024-06-18 13:06:36,486][12883] Updated weights for policy 0, policy_version 138983 (0.0027) [2024-06-18 13:06:37,000][12645] Fps is (10 sec: 44209.1, 60 sec: 42320.9, 300 sec: 42597.5). Total num frames: 2277097472. Throughput: 0: 42469.7. Samples: 2277209840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:37,000][12645] Avg episode reward: [(0, '0.641')] [2024-06-18 13:06:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138983_2277097472.pth... 
[2024-06-18 13:06:37,065][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138358_2266857472.pth [2024-06-18 13:06:40,419][12883] Updated weights for policy 0, policy_version 138993 (0.0039) [2024-06-18 13:06:41,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2277310464. Throughput: 0: 42369.2. Samples: 2277465280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:41,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 13:06:44,112][12883] Updated weights for policy 0, policy_version 139003 (0.0039) [2024-06-18 13:06:46,994][12645] Fps is (10 sec: 42625.2, 60 sec: 42325.4, 300 sec: 42654.5). Total num frames: 2277523456. Throughput: 0: 42349.7. Samples: 2277589800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:46,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 13:06:48,085][12883] Updated weights for policy 0, policy_version 139013 (0.0042) [2024-06-18 13:06:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2277736448. Throughput: 0: 42394.3. Samples: 2277844400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:51,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 13:06:52,222][12883] Updated weights for policy 0, policy_version 139023 (0.0022) [2024-06-18 13:06:56,328][12883] Updated weights for policy 0, policy_version 139033 (0.0041) [2024-06-18 13:06:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2277933056. Throughput: 0: 42333.8. Samples: 2278100700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 13:06:56,994][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 13:06:59,889][12883] Updated weights for policy 0, policy_version 139043 (0.0041) [2024-06-18 13:07:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 41780.8, 300 sec: 42542.9). Total num frames: 2278129664. Throughput: 0: 42349.5. Samples: 2278223760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:01,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 13:07:03,935][12883] Updated weights for policy 0, policy_version 139053 (0.0036) [2024-06-18 13:07:05,861][12862] Signal inference workers to stop experience collection... (33300 times) [2024-06-18 13:07:05,904][12883] InferenceWorker_p0-w0: stopping experience collection (33300 times) [2024-06-18 13:07:05,920][12862] Signal inference workers to resume experience collection... (33300 times) [2024-06-18 13:07:05,932][12883] InferenceWorker_p0-w0: resuming experience collection (33300 times) [2024-06-18 13:07:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2278391808. Throughput: 0: 42397.9. Samples: 2278482560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:06,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 13:07:07,581][12883] Updated weights for policy 0, policy_version 139063 (0.0036) [2024-06-18 13:07:11,646][12883] Updated weights for policy 0, policy_version 139073 (0.0030) [2024-06-18 13:07:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2278572032. Throughput: 0: 42377.3. Samples: 2278736540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:11,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 13:07:15,323][12883] Updated weights for policy 0, policy_version 139083 (0.0026) [2024-06-18 13:07:16,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2278768640. Throughput: 0: 42302.1. Samples: 2278862420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:16,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 13:07:19,508][12883] Updated weights for policy 0, policy_version 139093 (0.0032) [2024-06-18 13:07:21,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2279014400. Throughput: 0: 42508.5. 
Samples: 2279122460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:21,994][12645] Avg episode reward: [(0, '0.103')] [2024-06-18 13:07:22,932][12883] Updated weights for policy 0, policy_version 139103 (0.0033) [2024-06-18 13:07:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2279211008. Throughput: 0: 42523.2. Samples: 2279378820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:26,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 13:07:27,104][12883] Updated weights for policy 0, policy_version 139113 (0.0038) [2024-06-18 13:07:30,642][12883] Updated weights for policy 0, policy_version 139123 (0.0035) [2024-06-18 13:07:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2279407616. Throughput: 0: 42450.6. Samples: 2279500080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:31,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 13:07:34,946][12883] Updated weights for policy 0, policy_version 139133 (0.0037) [2024-06-18 13:07:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42875.9, 300 sec: 42598.4). Total num frames: 2279669760. Throughput: 0: 42526.6. Samples: 2279758100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:36,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 13:07:38,486][12883] Updated weights for policy 0, policy_version 139143 (0.0043) [2024-06-18 13:07:41,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2279849984. Throughput: 0: 42561.4. Samples: 2280016060. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:41,996][12645] Avg episode reward: [(0, '0.624')] [2024-06-18 13:07:42,737][12883] Updated weights for policy 0, policy_version 139153 (0.0031) [2024-06-18 13:07:46,251][12883] Updated weights for policy 0, policy_version 139163 (0.0038) [2024-06-18 13:07:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2280062976. Throughput: 0: 42571.1. Samples: 2280139460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:46,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 13:07:50,413][12883] Updated weights for policy 0, policy_version 139173 (0.0039) [2024-06-18 13:07:51,994][12645] Fps is (10 sec: 44246.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2280292352. Throughput: 0: 42602.2. Samples: 2280399660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:51,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 13:07:54,071][12883] Updated weights for policy 0, policy_version 139183 (0.0028) [2024-06-18 13:07:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2280488960. Throughput: 0: 42672.9. Samples: 2280656820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 13:07:56,994][12645] Avg episode reward: [(0, '0.652')] [2024-06-18 13:07:58,032][12883] Updated weights for policy 0, policy_version 139193 (0.0027) [2024-06-18 13:08:01,663][12883] Updated weights for policy 0, policy_version 139203 (0.0034) [2024-06-18 13:08:01,995][12645] Fps is (10 sec: 40955.7, 60 sec: 42870.6, 300 sec: 42542.7). Total num frames: 2280701952. Throughput: 0: 42619.9. Samples: 2280780360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:01,995][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 13:08:05,822][12883] Updated weights for policy 0, policy_version 139213 (0.0028) [2024-06-18 13:08:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2280914944. Throughput: 0: 42594.6. Samples: 2281039220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:06,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 13:08:09,227][12883] Updated weights for policy 0, policy_version 139223 (0.0029) [2024-06-18 13:08:11,994][12645] Fps is (10 sec: 44241.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2281144320. Throughput: 0: 42627.5. Samples: 2281297060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:11,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 13:08:13,329][12883] Updated weights for policy 0, policy_version 139233 (0.0028) [2024-06-18 13:08:16,855][12883] Updated weights for policy 0, policy_version 139243 (0.0033) [2024-06-18 13:08:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2281357312. Throughput: 0: 42741.2. Samples: 2281423440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:16,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 13:08:20,875][12883] Updated weights for policy 0, policy_version 139253 (0.0030) [2024-06-18 13:08:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2281553920. Throughput: 0: 42732.4. Samples: 2281681060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:21,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 13:08:24,450][12883] Updated weights for policy 0, policy_version 139263 (0.0027) [2024-06-18 13:08:26,994][12645] Fps is (10 sec: 44237.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2281799680. Throughput: 0: 42725.8. Samples: 2281938620. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:26,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 13:08:28,589][12883] Updated weights for policy 0, policy_version 139273 (0.0030) [2024-06-18 13:08:31,981][12883] Updated weights for policy 0, policy_version 139283 (0.0035) [2024-06-18 13:08:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2282012672. Throughput: 0: 42892.3. Samples: 2282069620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:31,995][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 13:08:36,219][12883] Updated weights for policy 0, policy_version 139293 (0.0035) [2024-06-18 13:08:36,996][12645] Fps is (10 sec: 39312.3, 60 sec: 42050.7, 300 sec: 42431.5). Total num frames: 2282192896. Throughput: 0: 42705.9. Samples: 2282321520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:36,997][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 13:08:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139295_2282209280.pth... [2024-06-18 13:08:37,105][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138673_2272018432.pth [2024-06-18 13:08:39,915][12883] Updated weights for policy 0, policy_version 139303 (0.0035) [2024-06-18 13:08:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2282422272. Throughput: 0: 42641.4. Samples: 2282575680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:41,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 13:08:43,847][12883] Updated weights for policy 0, policy_version 139313 (0.0024) [2024-06-18 13:08:45,215][12862] Signal inference workers to stop experience collection... (33350 times) [2024-06-18 13:08:45,215][12862] Signal inference workers to resume experience collection... 
(33350 times) [2024-06-18 13:08:45,229][12883] InferenceWorker_p0-w0: stopping experience collection (33350 times) [2024-06-18 13:08:45,229][12883] InferenceWorker_p0-w0: resuming experience collection (33350 times) [2024-06-18 13:08:46,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2282635264. Throughput: 0: 42759.3. Samples: 2282704480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:46,994][12645] Avg episode reward: [(0, '0.679')] [2024-06-18 13:08:47,474][12883] Updated weights for policy 0, policy_version 139323 (0.0046) [2024-06-18 13:08:51,569][12883] Updated weights for policy 0, policy_version 139333 (0.0031) [2024-06-18 13:08:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2282831872. Throughput: 0: 42603.6. Samples: 2282956380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:51,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 13:08:55,182][12883] Updated weights for policy 0, policy_version 139343 (0.0025) [2024-06-18 13:08:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2283061248. Throughput: 0: 42514.8. Samples: 2283210220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:08:56,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 13:08:59,230][12883] Updated weights for policy 0, policy_version 139353 (0.0027) [2024-06-18 13:09:01,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43145.4, 300 sec: 42654.0). Total num frames: 2283290624. Throughput: 0: 42561.1. Samples: 2283338680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:09:01,994][12645] Avg episode reward: [(0, '0.704')] [2024-06-18 13:09:02,999][12883] Updated weights for policy 0, policy_version 139363 (0.0039) [2024-06-18 13:09:06,827][12883] Updated weights for policy 0, policy_version 139373 (0.0038) [2024-06-18 13:09:06,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2283487232. Throughput: 0: 42608.0. Samples: 2283598420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:06,994][12645] Avg episode reward: [(0, '0.699')] [2024-06-18 13:09:10,678][12883] Updated weights for policy 0, policy_version 139383 (0.0032) [2024-06-18 13:09:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2283700224. Throughput: 0: 42492.4. Samples: 2283850780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:11,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 13:09:14,522][12883] Updated weights for policy 0, policy_version 139393 (0.0030) [2024-06-18 13:09:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2283896832. Throughput: 0: 42347.2. Samples: 2283975240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:16,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 13:09:18,273][12883] Updated weights for policy 0, policy_version 139403 (0.0035) [2024-06-18 13:09:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2284109824. Throughput: 0: 42531.0. Samples: 2284235320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:21,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 13:09:22,159][12883] Updated weights for policy 0, policy_version 139413 (0.0036) [2024-06-18 13:09:25,796][12883] Updated weights for policy 0, policy_version 139423 (0.0031) [2024-06-18 13:09:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2284339200. Throughput: 0: 42489.7. Samples: 2284487720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:26,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 13:09:29,785][12883] Updated weights for policy 0, policy_version 139433 (0.0031) [2024-06-18 13:09:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2284535808. Throughput: 0: 42571.1. Samples: 2284620180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:31,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 13:09:33,366][12883] Updated weights for policy 0, policy_version 139443 (0.0032) [2024-06-18 13:09:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 2284748800. Throughput: 0: 42632.1. Samples: 2284874820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:36,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 13:09:37,557][12883] Updated weights for policy 0, policy_version 139453 (0.0027) [2024-06-18 13:09:41,047][12883] Updated weights for policy 0, policy_version 139463 (0.0025) [2024-06-18 13:09:41,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2284994560. Throughput: 0: 42578.7. Samples: 2285126260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:41,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 13:09:45,080][12883] Updated weights for policy 0, policy_version 139473 (0.0033) [2024-06-18 13:09:47,000][12645] Fps is (10 sec: 44208.6, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 2285191168. Throughput: 0: 42879.7. Samples: 2285268540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:47,001][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 13:09:48,793][12883] Updated weights for policy 0, policy_version 139483 (0.0028) [2024-06-18 13:09:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2285387776. Throughput: 0: 42735.2. Samples: 2285521500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:51,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 13:09:52,979][12883] Updated weights for policy 0, policy_version 139493 (0.0034) [2024-06-18 13:09:56,440][12862] Signal inference workers to stop experience collection... (33400 times) [2024-06-18 13:09:56,440][12862] Signal inference workers to resume experience collection... (33400 times) [2024-06-18 13:09:56,486][12883] InferenceWorker_p0-w0: stopping experience collection (33400 times) [2024-06-18 13:09:56,486][12883] InferenceWorker_p0-w0: resuming experience collection (33400 times) [2024-06-18 13:09:56,575][12883] Updated weights for policy 0, policy_version 139503 (0.0022) [2024-06-18 13:09:56,994][12645] Fps is (10 sec: 45903.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2285649920. Throughput: 0: 42855.4. Samples: 2285779280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:09:56,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 13:10:00,545][12883] Updated weights for policy 0, policy_version 139513 (0.0033) [2024-06-18 13:10:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2285813760. 
Throughput: 0: 43004.4. Samples: 2285910440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 13:10:01,995][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 13:10:04,279][12883] Updated weights for policy 0, policy_version 139523 (0.0039) [2024-06-18 13:10:06,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2286043136. Throughput: 0: 42877.9. Samples: 2286164820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:06,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 13:10:08,293][12883] Updated weights for policy 0, policy_version 139533 (0.0036) [2024-06-18 13:10:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2286256128. Throughput: 0: 42890.7. Samples: 2286417800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:11,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 13:10:12,049][12883] Updated weights for policy 0, policy_version 139543 (0.0048) [2024-06-18 13:10:15,980][12883] Updated weights for policy 0, policy_version 139553 (0.0033) [2024-06-18 13:10:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2286469120. Throughput: 0: 42837.4. Samples: 2286547860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:16,994][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 13:10:19,642][12883] Updated weights for policy 0, policy_version 139563 (0.0034) [2024-06-18 13:10:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2286698496. Throughput: 0: 42908.4. Samples: 2286805700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:21,994][12645] Avg episode reward: [(0, '0.616')] [2024-06-18 13:10:23,648][12883] Updated weights for policy 0, policy_version 139573 (0.0028) [2024-06-18 13:10:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2286895104. 
Throughput: 0: 43053.7. Samples: 2287063680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:26,994][12645] Avg episode reward: [(0, '0.678')] [2024-06-18 13:10:27,226][12883] Updated weights for policy 0, policy_version 139583 (0.0041) [2024-06-18 13:10:31,513][12883] Updated weights for policy 0, policy_version 139593 (0.0041) [2024-06-18 13:10:31,996][12645] Fps is (10 sec: 40950.7, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 2287108096. Throughput: 0: 42654.0. Samples: 2287187800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:31,997][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 13:10:35,103][12883] Updated weights for policy 0, policy_version 139603 (0.0033) [2024-06-18 13:10:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2287337472. Throughput: 0: 42741.2. Samples: 2287444860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:36,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 13:10:37,142][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139609_2287353856.pth... [2024-06-18 13:10:37,199][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000138983_2277097472.pth [2024-06-18 13:10:39,035][12883] Updated weights for policy 0, policy_version 139613 (0.0028) [2024-06-18 13:10:41,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2287550464. Throughput: 0: 42662.2. Samples: 2287699080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:41,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 13:10:43,282][12883] Updated weights for policy 0, policy_version 139623 (0.0032) [2024-06-18 13:10:46,882][12883] Updated weights for policy 0, policy_version 139633 (0.0029) [2024-06-18 13:10:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42602.8, 300 sec: 42598.4). Total num frames: 2287747072. Throughput: 0: 42700.9. 
Samples: 2287831980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:46,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 13:10:51,042][12883] Updated weights for policy 0, policy_version 139643 (0.0027) [2024-06-18 13:10:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2287960064. Throughput: 0: 42872.9. Samples: 2288094100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:51,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 13:10:54,312][12883] Updated weights for policy 0, policy_version 139653 (0.0045) [2024-06-18 13:10:56,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42654.3). Total num frames: 2288205824. Throughput: 0: 42664.9. Samples: 2288337720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:10:56,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 13:10:58,623][12883] Updated weights for policy 0, policy_version 139663 (0.0033) [2024-06-18 13:11:01,841][12883] Updated weights for policy 0, policy_version 139673 (0.0036) [2024-06-18 13:11:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2288402432. Throughput: 0: 42711.5. Samples: 2288469880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:11:01,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 13:11:06,176][12883] Updated weights for policy 0, policy_version 139683 (0.0029) [2024-06-18 13:11:06,996][12645] Fps is (10 sec: 39310.7, 60 sec: 42596.4, 300 sec: 42598.0). Total num frames: 2288599040. Throughput: 0: 42737.8. Samples: 2288729020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:11:06,997][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 13:11:09,526][12883] Updated weights for policy 0, policy_version 139693 (0.0032) [2024-06-18 13:11:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2288828416. Throughput: 0: 42529.8. 
Samples: 2288977520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:11,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 13:11:13,797][12883] Updated weights for policy 0, policy_version 139703 (0.0031) [2024-06-18 13:11:16,994][12645] Fps is (10 sec: 42610.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2289025024. Throughput: 0: 42703.4. Samples: 2289109360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:16,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 13:11:17,423][12883] Updated weights for policy 0, policy_version 139713 (0.0038) [2024-06-18 13:11:17,800][12862] Signal inference workers to stop experience collection... (33450 times) [2024-06-18 13:11:17,800][12862] Signal inference workers to resume experience collection... (33450 times) [2024-06-18 13:11:17,812][12883] InferenceWorker_p0-w0: stopping experience collection (33450 times) [2024-06-18 13:11:17,812][12883] InferenceWorker_p0-w0: resuming experience collection (33450 times) [2024-06-18 13:11:21,395][12883] Updated weights for policy 0, policy_version 139723 (0.0040) [2024-06-18 13:11:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2289238016. Throughput: 0: 42620.6. Samples: 2289362780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:21,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 13:11:24,992][12883] Updated weights for policy 0, policy_version 139733 (0.0025) [2024-06-18 13:11:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2289467392. Throughput: 0: 42651.5. Samples: 2289618400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:26,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 13:11:29,003][12883] Updated weights for policy 0, policy_version 139743 (0.0040) [2024-06-18 13:11:31,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42873.0, 300 sec: 42654.8). Total num frames: 2289680384. 
Throughput: 0: 42513.2. Samples: 2289745080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:31,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 13:11:32,573][12883] Updated weights for policy 0, policy_version 139753 (0.0035) [2024-06-18 13:11:36,782][12883] Updated weights for policy 0, policy_version 139763 (0.0056) [2024-06-18 13:11:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2289876992. Throughput: 0: 42368.9. Samples: 2290000700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:36,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 13:11:40,162][12883] Updated weights for policy 0, policy_version 139773 (0.0027) [2024-06-18 13:11:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2290106368. Throughput: 0: 42638.3. Samples: 2290256440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:41,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 13:11:44,495][12883] Updated weights for policy 0, policy_version 139783 (0.0030) [2024-06-18 13:11:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2290319360. Throughput: 0: 42580.9. Samples: 2290386020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:46,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 13:11:47,802][12883] Updated weights for policy 0, policy_version 139793 (0.0036) [2024-06-18 13:11:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2290499584. Throughput: 0: 42408.9. Samples: 2290637300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:51,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 13:11:52,236][12883] Updated weights for policy 0, policy_version 139803 (0.0037) [2024-06-18 13:11:55,349][12883] Updated weights for policy 0, policy_version 139813 (0.0037) [2024-06-18 13:11:57,000][12645] Fps is (10 sec: 42572.0, 60 sec: 42320.9, 300 sec: 42764.1). Total num frames: 2290745344. Throughput: 0: 42496.7. Samples: 2290890140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:11:57,001][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 13:11:59,896][12883] Updated weights for policy 0, policy_version 139823 (0.0040) [2024-06-18 13:12:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2290925568. Throughput: 0: 42520.9. Samples: 2291022800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:12:01,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 13:12:03,430][12883] Updated weights for policy 0, policy_version 139833 (0.0032) [2024-06-18 13:12:06,996][12645] Fps is (10 sec: 40976.4, 60 sec: 42598.8, 300 sec: 42653.6). Total num frames: 2291154944. Throughput: 0: 42494.7. Samples: 2291275140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:12:06,996][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 13:12:07,686][12883] Updated weights for policy 0, policy_version 139843 (0.0030) [2024-06-18 13:12:10,999][12883] Updated weights for policy 0, policy_version 139853 (0.0031) [2024-06-18 13:12:11,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2291384320. Throughput: 0: 42485.9. Samples: 2291530260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:12:11,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 13:12:15,314][12883] Updated weights for policy 0, policy_version 139863 (0.0031) [2024-06-18 13:12:16,995][12645] Fps is (10 sec: 40964.0, 60 sec: 42324.5, 300 sec: 42542.7). Total num frames: 2291564544. Throughput: 0: 42554.5. Samples: 2291660080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:12:16,995][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 13:12:18,543][12883] Updated weights for policy 0, policy_version 139873 (0.0040) [2024-06-18 13:12:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2291793920. Throughput: 0: 42551.6. Samples: 2291915520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:12:21,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 13:12:22,825][12883] Updated weights for policy 0, policy_version 139883 (0.0040) [2024-06-18 13:12:26,223][12883] Updated weights for policy 0, policy_version 139893 (0.0033) [2024-06-18 13:12:26,996][12645] Fps is (10 sec: 45870.6, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 2292023296. Throughput: 0: 42527.6. Samples: 2292170280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:12:26,996][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 13:12:30,514][12883] Updated weights for policy 0, policy_version 139903 (0.0038) [2024-06-18 13:12:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2292219904. Throughput: 0: 42486.2. Samples: 2292297900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:12:31,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 13:12:34,103][12883] Updated weights for policy 0, policy_version 139913 (0.0022) [2024-06-18 13:12:36,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42654.3). Total num frames: 2292432896. Throughput: 0: 42638.5. Samples: 2292556040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:12:36,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 13:12:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139920_2292449280.pth... [2024-06-18 13:12:37,051][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139295_2282209280.pth [2024-06-18 13:12:38,010][12883] Updated weights for policy 0, policy_version 139923 (0.0033) [2024-06-18 13:12:41,712][12883] Updated weights for policy 0, policy_version 139933 (0.0032) [2024-06-18 13:12:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2292678656. Throughput: 0: 42769.0. Samples: 2292814480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:12:41,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 13:12:45,543][12883] Updated weights for policy 0, policy_version 139943 (0.0037) [2024-06-18 13:12:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2292875264. Throughput: 0: 42706.3. Samples: 2292944580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:12:46,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 13:12:49,406][12883] Updated weights for policy 0, policy_version 139953 (0.0032) [2024-06-18 13:12:51,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2293088256. Throughput: 0: 42716.8. Samples: 2293197300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:12:51,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 13:12:53,071][12883] Updated weights for policy 0, policy_version 139963 (0.0051) [2024-06-18 13:12:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42602.8, 300 sec: 42709.6). Total num frames: 2293301248. Throughput: 0: 42965.3. Samples: 2293463700. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:12:56,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 13:12:57,283][12883] Updated weights for policy 0, policy_version 139973 (0.0039) [2024-06-18 13:12:57,902][12862] Signal inference workers to stop experience collection... (33500 times) [2024-06-18 13:12:57,902][12862] Signal inference workers to resume experience collection... (33500 times) [2024-06-18 13:12:57,922][12883] InferenceWorker_p0-w0: stopping experience collection (33500 times) [2024-06-18 13:12:57,922][12883] InferenceWorker_p0-w0: resuming experience collection (33500 times) [2024-06-18 13:13:00,787][12883] Updated weights for policy 0, policy_version 139983 (0.0035) [2024-06-18 13:13:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2293514240. Throughput: 0: 42874.9. Samples: 2293589400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:13:01,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 13:13:04,806][12883] Updated weights for policy 0, policy_version 139993 (0.0035) [2024-06-18 13:13:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42873.1, 300 sec: 42654.0). Total num frames: 2293727232. Throughput: 0: 42873.3. Samples: 2293844820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:13:06,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 13:13:08,248][12883] Updated weights for policy 0, policy_version 140003 (0.0034) [2024-06-18 13:13:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2293923840. Throughput: 0: 43163.0. Samples: 2294112520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:13:11,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 13:13:12,602][12883] Updated weights for policy 0, policy_version 140013 (0.0051) [2024-06-18 13:13:16,161][12883] Updated weights for policy 0, policy_version 140023 (0.0023) [2024-06-18 13:13:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43145.5, 300 sec: 42709.5). Total num frames: 2294153216. Throughput: 0: 42928.2. Samples: 2294229660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:13:16,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 13:13:20,150][12883] Updated weights for policy 0, policy_version 140033 (0.0023) [2024-06-18 13:13:21,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2294382592. Throughput: 0: 42892.1. Samples: 2294486180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:13:21,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 13:13:24,003][12883] Updated weights for policy 0, policy_version 140043 (0.0035) [2024-06-18 13:13:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2294579200. Throughput: 0: 42968.9. Samples: 2294748080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:13:26,994][12645] Avg episode reward: [(0, '0.654')] [2024-06-18 13:13:27,683][12883] Updated weights for policy 0, policy_version 140053 (0.0033) [2024-06-18 13:13:31,440][12883] Updated weights for policy 0, policy_version 140063 (0.0032) [2024-06-18 13:13:31,996][12645] Fps is (10 sec: 42589.0, 60 sec: 43143.0, 300 sec: 42765.0). Total num frames: 2294808576. Throughput: 0: 42916.9. Samples: 2294875940. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:13:31,997][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 13:13:35,805][12883] Updated weights for policy 0, policy_version 140073 (0.0041) [2024-06-18 13:13:36,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2295021568. Throughput: 0: 43078.5. Samples: 2295135840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:13:36,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 13:13:39,062][12883] Updated weights for policy 0, policy_version 140083 (0.0028) [2024-06-18 13:13:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2295218176. Throughput: 0: 42696.1. Samples: 2295385020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:13:41,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 13:13:43,393][12883] Updated weights for policy 0, policy_version 140093 (0.0025) [2024-06-18 13:13:46,768][12883] Updated weights for policy 0, policy_version 140103 (0.0041) [2024-06-18 13:13:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2295463936. Throughput: 0: 42612.4. Samples: 2295506960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:13:46,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 13:13:51,176][12883] Updated weights for policy 0, policy_version 140113 (0.0036) [2024-06-18 13:13:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2295644160. Throughput: 0: 42791.6. Samples: 2295770440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:13:51,994][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 13:13:54,311][12883] Updated weights for policy 0, policy_version 140123 (0.0023) [2024-06-18 13:13:56,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2295840768. Throughput: 0: 42502.7. Samples: 2296025140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:13:56,994][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 13:13:58,748][12883] Updated weights for policy 0, policy_version 140133 (0.0042) [2024-06-18 13:14:01,950][12883] Updated weights for policy 0, policy_version 140143 (0.0033) [2024-06-18 13:14:01,996][12645] Fps is (10 sec: 45864.6, 60 sec: 43143.0, 300 sec: 42764.7). Total num frames: 2296102912. Throughput: 0: 42581.3. Samples: 2296145920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:14:01,996][12645] Avg episode reward: [(0, '0.613')] [2024-06-18 13:14:06,347][12883] Updated weights for policy 0, policy_version 140153 (0.0038) [2024-06-18 13:14:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2296283136. Throughput: 0: 42686.7. Samples: 2296407080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:14:06,994][12645] Avg episode reward: [(0, '0.735')] [2024-06-18 13:14:09,506][12883] Updated weights for policy 0, policy_version 140163 (0.0035) [2024-06-18 13:14:11,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2296496128. Throughput: 0: 42513.8. Samples: 2296661200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:14:11,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 13:14:12,814][12862] Signal inference workers to stop experience collection... (33550 times) [2024-06-18 13:14:12,841][12883] InferenceWorker_p0-w0: stopping experience collection (33550 times) [2024-06-18 13:14:12,876][12862] Signal inference workers to resume experience collection... (33550 times) [2024-06-18 13:14:12,878][12883] InferenceWorker_p0-w0: resuming experience collection (33550 times) [2024-06-18 13:14:14,029][12883] Updated weights for policy 0, policy_version 140173 (0.0041) [2024-06-18 13:14:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2296741888. 
Throughput: 0: 42559.9. Samples: 2296791040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:14:16,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 13:14:17,292][12883] Updated weights for policy 0, policy_version 140183 (0.0043) [2024-06-18 13:14:21,692][12883] Updated weights for policy 0, policy_version 140193 (0.0047) [2024-06-18 13:14:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2296922112. Throughput: 0: 42483.6. Samples: 2297047600. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:14:21,994][12645] Avg episode reward: [(0, '0.287')] [2024-06-18 13:14:25,291][12883] Updated weights for policy 0, policy_version 140203 (0.0032) [2024-06-18 13:14:26,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2297118720. Throughput: 0: 42548.0. Samples: 2297299680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:14:26,994][12645] Avg episode reward: [(0, '0.797')] [2024-06-18 13:14:29,211][12883] Updated weights for policy 0, policy_version 140213 (0.0026) [2024-06-18 13:14:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42873.1, 300 sec: 42820.5). Total num frames: 2297380864. Throughput: 0: 42686.3. Samples: 2297427840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:14:31,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 13:14:32,885][12883] Updated weights for policy 0, policy_version 140223 (0.0042) [2024-06-18 13:14:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2297561088. Throughput: 0: 42647.1. Samples: 2297689560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:14:36,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 13:14:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140232_2297561088.pth... 
[2024-06-18 13:14:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139609_2287353856.pth [2024-06-18 13:14:37,364][12883] Updated weights for policy 0, policy_version 140233 (0.0035) [2024-06-18 13:14:40,401][12883] Updated weights for policy 0, policy_version 140243 (0.0035) [2024-06-18 13:14:41,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42654.9). Total num frames: 2297774080. Throughput: 0: 42631.6. Samples: 2297943560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:14:41,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 13:14:45,002][12883] Updated weights for policy 0, policy_version 140253 (0.0032) [2024-06-18 13:14:46,994][12645] Fps is (10 sec: 47513.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2298036224. Throughput: 0: 42957.7. Samples: 2298078920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:14:46,994][12645] Avg episode reward: [(0, '0.659')] [2024-06-18 13:14:47,850][12883] Updated weights for policy 0, policy_version 140263 (0.0035) [2024-06-18 13:14:52,000][12645] Fps is (10 sec: 42571.5, 60 sec: 42593.9, 300 sec: 42542.0). Total num frames: 2298200064. Throughput: 0: 42868.7. Samples: 2298336440. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:14:52,001][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 13:14:52,421][12883] Updated weights for policy 0, policy_version 140273 (0.0025) [2024-06-18 13:14:55,743][12883] Updated weights for policy 0, policy_version 140283 (0.0033) [2024-06-18 13:14:56,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2298413056. Throughput: 0: 42754.6. Samples: 2298585160. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:14:56,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 13:14:59,988][12883] Updated weights for policy 0, policy_version 140293 (0.0052) [2024-06-18 13:15:01,994][12645] Fps is (10 sec: 45903.9, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 2298658816. Throughput: 0: 42828.0. Samples: 2298718300. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:15:01,994][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 13:15:03,820][12883] Updated weights for policy 0, policy_version 140303 (0.0032) [2024-06-18 13:15:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2298839040. Throughput: 0: 42749.3. Samples: 2298971320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:15:06,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 13:15:07,635][12883] Updated weights for policy 0, policy_version 140313 (0.0034) [2024-06-18 13:15:11,439][12883] Updated weights for policy 0, policy_version 140323 (0.0028) [2024-06-18 13:15:11,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 2299068416. Throughput: 0: 42807.4. Samples: 2299226020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:15:12,003][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 13:15:15,404][12883] Updated weights for policy 0, policy_version 140333 (0.0044) [2024-06-18 13:15:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2299281408. Throughput: 0: 42830.3. Samples: 2299355200. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-18 13:15:16,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 13:15:19,066][12883] Updated weights for policy 0, policy_version 140343 (0.0032) [2024-06-18 13:15:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2299478016. Throughput: 0: 42657.7. Samples: 2299609160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:15:21,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 13:15:23,082][12883] Updated weights for policy 0, policy_version 140353 (0.0033) [2024-06-18 13:15:26,481][12883] Updated weights for policy 0, policy_version 140363 (0.0035) [2024-06-18 13:15:26,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 2299707392. Throughput: 0: 42693.6. Samples: 2299864780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:15:26,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 13:15:30,900][12883] Updated weights for policy 0, policy_version 140373 (0.0030) [2024-06-18 13:15:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 2299920384. Throughput: 0: 42671.1. Samples: 2299999120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:15:31,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 13:15:32,063][12862] Signal inference workers to stop experience collection... (33600 times) [2024-06-18 13:15:32,064][12862] Signal inference workers to resume experience collection... (33600 times) [2024-06-18 13:15:32,089][12883] InferenceWorker_p0-w0: stopping experience collection (33600 times) [2024-06-18 13:15:32,089][12883] InferenceWorker_p0-w0: resuming experience collection (33600 times) [2024-06-18 13:15:34,126][12883] Updated weights for policy 0, policy_version 140383 (0.0033) [2024-06-18 13:15:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2300116992. Throughput: 0: 42479.6. Samples: 2300247760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:15:36,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 13:15:38,604][12883] Updated weights for policy 0, policy_version 140393 (0.0035) [2024-06-18 13:15:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2300346368. 
Throughput: 0: 42623.9. Samples: 2300503240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:15:41,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 13:15:42,281][12883] Updated weights for policy 0, policy_version 140403 (0.0035) [2024-06-18 13:15:46,377][12883] Updated weights for policy 0, policy_version 140413 (0.0033) [2024-06-18 13:15:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2300559360. Throughput: 0: 42591.6. Samples: 2300634920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:15:46,994][12645] Avg episode reward: [(0, '0.780')] [2024-06-18 13:15:49,909][12883] Updated weights for policy 0, policy_version 140423 (0.0032) [2024-06-18 13:15:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42875.9, 300 sec: 42598.4). Total num frames: 2300772352. Throughput: 0: 42513.3. Samples: 2300884420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:15:51,994][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 13:15:54,019][12883] Updated weights for policy 0, policy_version 140433 (0.0033) [2024-06-18 13:15:56,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2300985344. Throughput: 0: 42694.1. Samples: 2301147340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:15:56,996][12645] Avg episode reward: [(0, '0.646')] [2024-06-18 13:15:57,511][12883] Updated weights for policy 0, policy_version 140443 (0.0029) [2024-06-18 13:16:01,636][12883] Updated weights for policy 0, policy_version 140453 (0.0031) [2024-06-18 13:16:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.9). Total num frames: 2301198336. Throughput: 0: 42664.4. Samples: 2301275100. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:16:01,994][12645] Avg episode reward: [(0, '0.692')] [2024-06-18 13:16:05,259][12883] Updated weights for policy 0, policy_version 140463 (0.0028) [2024-06-18 13:16:06,994][12645] Fps is (10 sec: 44246.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2301427712. Throughput: 0: 42731.2. Samples: 2301532060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:16:06,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 13:16:09,172][12883] Updated weights for policy 0, policy_version 140473 (0.0029) [2024-06-18 13:16:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2301640704. Throughput: 0: 42626.8. Samples: 2301782980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:16:11,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 13:16:12,996][12883] Updated weights for policy 0, policy_version 140483 (0.0045) [2024-06-18 13:16:16,881][12883] Updated weights for policy 0, policy_version 140493 (0.0033) [2024-06-18 13:16:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2301837312. Throughput: 0: 42471.1. Samples: 2301910320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:16:16,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 13:16:20,913][12883] Updated weights for policy 0, policy_version 140503 (0.0037) [2024-06-18 13:16:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2302050304. Throughput: 0: 42764.9. Samples: 2302172180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:16:21,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 13:16:24,833][12883] Updated weights for policy 0, policy_version 140513 (0.0041) [2024-06-18 13:16:26,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2302279680. Throughput: 0: 42482.2. Samples: 2302414940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:16:26,994][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 13:16:28,650][12883] Updated weights for policy 0, policy_version 140523 (0.0033) [2024-06-18 13:16:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2302443520. Throughput: 0: 42447.6. Samples: 2302545060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:16:31,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 13:16:32,862][12883] Updated weights for policy 0, policy_version 140533 (0.0035) [2024-06-18 13:16:36,517][12883] Updated weights for policy 0, policy_version 140543 (0.0031) [2024-06-18 13:16:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2302672896. Throughput: 0: 42513.8. Samples: 2302797540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:16:36,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 13:16:37,093][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140545_2302689280.pth... [2024-06-18 13:16:37,153][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000139920_2292449280.pth [2024-06-18 13:16:40,625][12883] Updated weights for policy 0, policy_version 140553 (0.0038) [2024-06-18 13:16:41,310][12862] Signal inference workers to stop experience collection... (33650 times) [2024-06-18 13:16:41,360][12883] InferenceWorker_p0-w0: stopping experience collection (33650 times) [2024-06-18 13:16:41,429][12862] Signal inference workers to resume experience collection... (33650 times) [2024-06-18 13:16:41,429][12883] InferenceWorker_p0-w0: resuming experience collection (33650 times) [2024-06-18 13:16:41,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2302918656. Throughput: 0: 42182.6. Samples: 2303045460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:16:41,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 13:16:44,401][12883] Updated weights for policy 0, policy_version 140563 (0.0034) [2024-06-18 13:16:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2303098880. Throughput: 0: 42304.4. Samples: 2303178800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:16:46,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 13:16:48,341][12883] Updated weights for policy 0, policy_version 140573 (0.0034) [2024-06-18 13:16:51,996][12645] Fps is (10 sec: 37674.5, 60 sec: 42050.8, 300 sec: 42543.4). Total num frames: 2303295488. Throughput: 0: 42040.2. Samples: 2303423960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:16:51,997][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 13:16:52,048][12883] Updated weights for policy 0, policy_version 140583 (0.0034) [2024-06-18 13:16:55,885][12883] Updated weights for policy 0, policy_version 140593 (0.0036) [2024-06-18 13:16:56,995][12645] Fps is (10 sec: 42592.9, 60 sec: 42325.9, 300 sec: 42709.3). Total num frames: 2303524864. Throughput: 0: 42263.1. Samples: 2303684880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:16:56,996][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 13:16:59,852][12883] Updated weights for policy 0, policy_version 140603 (0.0030) [2024-06-18 13:17:01,996][12645] Fps is (10 sec: 45875.2, 60 sec: 42596.8, 300 sec: 42709.5). Total num frames: 2303754240. Throughput: 0: 42250.8. Samples: 2303811700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:17:01,996][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 13:17:03,863][12883] Updated weights for policy 0, policy_version 140613 (0.0028) [2024-06-18 13:17:06,997][12645] Fps is (10 sec: 40952.4, 60 sec: 41777.0, 300 sec: 42542.4). Total num frames: 2303934464. Throughput: 0: 41860.2. Samples: 2304056020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:17:06,997][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 13:17:07,413][12883] Updated weights for policy 0, policy_version 140623 (0.0034) [2024-06-18 13:17:11,763][12883] Updated weights for policy 0, policy_version 140633 (0.0031) [2024-06-18 13:17:11,994][12645] Fps is (10 sec: 39330.1, 60 sec: 41779.1, 300 sec: 42654.1). Total num frames: 2304147456. Throughput: 0: 42406.1. Samples: 2304323220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:17:11,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 13:17:15,019][12883] Updated weights for policy 0, policy_version 140643 (0.0027) [2024-06-18 13:17:16,994][12645] Fps is (10 sec: 42611.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2304360448. Throughput: 0: 42243.0. Samples: 2304446000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:17:16,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 13:17:19,357][12883] Updated weights for policy 0, policy_version 140653 (0.0038) [2024-06-18 13:17:21,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42050.7, 300 sec: 42542.9). Total num frames: 2304573440. Throughput: 0: 42155.2. Samples: 2304694620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:17:21,997][12645] Avg episode reward: [(0, '0.282')] [2024-06-18 13:17:22,621][12883] Updated weights for policy 0, policy_version 140663 (0.0036) [2024-06-18 13:17:26,889][12883] Updated weights for policy 0, policy_version 140673 (0.0024) [2024-06-18 13:17:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2304786432. Throughput: 0: 42519.0. Samples: 2304958820. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:17:26,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 13:17:30,154][12883] Updated weights for policy 0, policy_version 140683 (0.0032) [2024-06-18 13:17:31,994][12645] Fps is (10 sec: 42608.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2304999424. Throughput: 0: 42287.2. Samples: 2305081720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:17:31,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 13:17:34,482][12883] Updated weights for policy 0, policy_version 140693 (0.0039) [2024-06-18 13:17:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2305228800. Throughput: 0: 42487.4. Samples: 2305335800. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:17:36,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 13:17:38,014][12883] Updated weights for policy 0, policy_version 140703 (0.0040) [2024-06-18 13:17:41,997][12645] Fps is (10 sec: 42583.7, 60 sec: 41776.8, 300 sec: 42542.4). Total num frames: 2305425408. Throughput: 0: 42470.5. Samples: 2305596140. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:17:41,998][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 13:17:42,172][12883] Updated weights for policy 0, policy_version 140713 (0.0026) [2024-06-18 13:17:45,630][12883] Updated weights for policy 0, policy_version 140723 (0.0027) [2024-06-18 13:17:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2305654784. Throughput: 0: 42500.0. Samples: 2305724100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:17:46,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 13:17:48,052][12862] Signal inference workers to stop experience collection... (33700 times) [2024-06-18 13:17:48,053][12862] Signal inference workers to resume experience collection... 
(33700 times) [2024-06-18 13:17:48,075][12883] InferenceWorker_p0-w0: stopping experience collection (33700 times) [2024-06-18 13:17:48,075][12883] InferenceWorker_p0-w0: resuming experience collection (33700 times) [2024-06-18 13:17:49,598][12883] Updated weights for policy 0, policy_version 140733 (0.0036) [2024-06-18 13:17:51,994][12645] Fps is (10 sec: 42612.6, 60 sec: 42600.0, 300 sec: 42542.9). Total num frames: 2305851392. Throughput: 0: 42674.5. Samples: 2305976240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:17:51,994][12645] Avg episode reward: [(0, '0.735')] [2024-06-18 13:17:53,516][12883] Updated weights for policy 0, policy_version 140743 (0.0027) [2024-06-18 13:17:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42326.3, 300 sec: 42542.9). Total num frames: 2306064384. Throughput: 0: 42621.9. Samples: 2306241200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:17:56,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 13:17:57,283][12883] Updated weights for policy 0, policy_version 140753 (0.0031) [2024-06-18 13:18:00,936][12883] Updated weights for policy 0, policy_version 140763 (0.0040) [2024-06-18 13:18:01,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42599.9, 300 sec: 42653.9). Total num frames: 2306310144. Throughput: 0: 42723.5. Samples: 2306368560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:18:01,994][12645] Avg episode reward: [(0, '0.207')] [2024-06-18 13:18:04,934][12883] Updated weights for policy 0, policy_version 140773 (0.0032) [2024-06-18 13:18:06,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42873.7, 300 sec: 42653.9). Total num frames: 2306506752. Throughput: 0: 42787.5. Samples: 2306619960. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:18:06,994][12645] Avg episode reward: [(0, '0.631')] [2024-06-18 13:18:08,703][12883] Updated weights for policy 0, policy_version 140783 (0.0052) [2024-06-18 13:18:11,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2306703360. Throughput: 0: 42774.6. Samples: 2306883680. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:18:11,994][12645] Avg episode reward: [(0, '0.670')] [2024-06-18 13:18:12,552][12883] Updated weights for policy 0, policy_version 140793 (0.0043) [2024-06-18 13:18:16,305][12883] Updated weights for policy 0, policy_version 140803 (0.0036) [2024-06-18 13:18:16,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2306932736. Throughput: 0: 42757.8. Samples: 2307005820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:18:16,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 13:18:20,112][12883] Updated weights for policy 0, policy_version 140813 (0.0037) [2024-06-18 13:18:21,994][12645] Fps is (10 sec: 45875.9, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 2307162112. Throughput: 0: 42771.2. Samples: 2307260500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:18:21,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 13:18:24,294][12883] Updated weights for policy 0, policy_version 140823 (0.0036) [2024-06-18 13:18:26,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42487.7). Total num frames: 2307342336. Throughput: 0: 42886.4. Samples: 2307525880. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 13:18:26,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 13:18:27,747][12883] Updated weights for policy 0, policy_version 140833 (0.0035) [2024-06-18 13:18:31,866][12883] Updated weights for policy 0, policy_version 140843 (0.0039) [2024-06-18 13:18:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2307571712. Throughput: 0: 42703.5. Samples: 2307645760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:18:31,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 13:18:35,507][12883] Updated weights for policy 0, policy_version 140853 (0.0031) [2024-06-18 13:18:36,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2307801088. Throughput: 0: 42844.4. Samples: 2307904240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:18:36,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 13:18:37,146][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140858_2307817472.pth... [2024-06-18 13:18:37,198][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140232_2297561088.pth [2024-06-18 13:18:39,328][12883] Updated weights for policy 0, policy_version 140863 (0.0041) [2024-06-18 13:18:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42873.9, 300 sec: 42487.3). Total num frames: 2307997696. Throughput: 0: 42732.9. Samples: 2308164180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:18:41,994][12645] Avg episode reward: [(0, '0.714')] [2024-06-18 13:18:43,233][12883] Updated weights for policy 0, policy_version 140873 (0.0036) [2024-06-18 13:18:46,833][12883] Updated weights for policy 0, policy_version 140883 (0.0031) [2024-06-18 13:18:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2308227072. Throughput: 0: 42640.5. Samples: 2308287380. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:18:46,994][12645] Avg episode reward: [(0, '0.624')] [2024-06-18 13:18:48,003][12862] Signal inference workers to stop experience collection... (33750 times) [2024-06-18 13:18:48,003][12862] Signal inference workers to resume experience collection... (33750 times) [2024-06-18 13:18:48,021][12883] InferenceWorker_p0-w0: stopping experience collection (33750 times) [2024-06-18 13:18:48,021][12883] InferenceWorker_p0-w0: resuming experience collection (33750 times) [2024-06-18 13:18:50,957][12883] Updated weights for policy 0, policy_version 140893 (0.0026) [2024-06-18 13:18:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2308440064. Throughput: 0: 42908.1. Samples: 2308550820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:18:51,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 13:18:54,658][12883] Updated weights for policy 0, policy_version 140903 (0.0037) [2024-06-18 13:18:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42487.6). Total num frames: 2308636672. Throughput: 0: 42732.9. Samples: 2308806660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:18:56,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 13:18:58,694][12883] Updated weights for policy 0, policy_version 140913 (0.0035) [2024-06-18 13:19:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2308866048. Throughput: 0: 42822.6. Samples: 2308932840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:19:01,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 13:19:02,291][12883] Updated weights for policy 0, policy_version 140923 (0.0037) [2024-06-18 13:19:06,112][12883] Updated weights for policy 0, policy_version 140933 (0.0026) [2024-06-18 13:19:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 2309095424. Throughput: 0: 43005.2. 
Samples: 2309195740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:19:06,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 13:19:09,828][12883] Updated weights for policy 0, policy_version 140943 (0.0030) [2024-06-18 13:19:11,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 2309275648. Throughput: 0: 42887.6. Samples: 2309455820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:19:11,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 13:19:13,632][12883] Updated weights for policy 0, policy_version 140953 (0.0040) [2024-06-18 13:19:17,000][12645] Fps is (10 sec: 40934.6, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 2309505024. Throughput: 0: 42854.4. Samples: 2309574480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:19:17,001][12645] Avg episode reward: [(0, '0.650')] [2024-06-18 13:19:17,509][12883] Updated weights for policy 0, policy_version 140963 (0.0049) [2024-06-18 13:19:21,150][12883] Updated weights for policy 0, policy_version 140973 (0.0045) [2024-06-18 13:19:21,994][12645] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2309750784. Throughput: 0: 42999.6. Samples: 2309839220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:19:21,994][12645] Avg episode reward: [(0, '0.659')] [2024-06-18 13:19:25,138][12883] Updated weights for policy 0, policy_version 140983 (0.0043) [2024-06-18 13:19:26,994][12645] Fps is (10 sec: 40986.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2309914624. Throughput: 0: 42919.5. Samples: 2310095560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:19:26,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 13:19:28,703][12883] Updated weights for policy 0, policy_version 140993 (0.0035) [2024-06-18 13:19:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2310144000. Throughput: 0: 42962.3. 
Samples: 2310220680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 13:19:31,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 13:19:32,877][12883] Updated weights for policy 0, policy_version 141003 (0.0038) [2024-06-18 13:19:36,438][12883] Updated weights for policy 0, policy_version 141013 (0.0029) [2024-06-18 13:19:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2310373376. Throughput: 0: 42919.6. Samples: 2310482200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:19:36,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 13:19:40,746][12883] Updated weights for policy 0, policy_version 141023 (0.0028) [2024-06-18 13:19:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2310569984. Throughput: 0: 42988.1. Samples: 2310741120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:19:41,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 13:19:43,813][12883] Updated weights for policy 0, policy_version 141033 (0.0033) [2024-06-18 13:19:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42710.4). Total num frames: 2310799360. Throughput: 0: 43134.7. Samples: 2310873900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:19:46,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 13:19:48,256][12883] Updated weights for policy 0, policy_version 141043 (0.0036) [2024-06-18 13:19:51,752][12883] Updated weights for policy 0, policy_version 141053 (0.0041) [2024-06-18 13:19:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2311028736. Throughput: 0: 42998.0. Samples: 2311130640. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:19:51,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 13:19:55,889][12883] Updated weights for policy 0, policy_version 141063 (0.0042) [2024-06-18 13:19:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2311225344. Throughput: 0: 42972.8. Samples: 2311389600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:19:56,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 13:19:59,181][12883] Updated weights for policy 0, policy_version 141073 (0.0028) [2024-06-18 13:20:01,783][12862] Signal inference workers to stop experience collection... (33800 times) [2024-06-18 13:20:01,784][12862] Signal inference workers to resume experience collection... (33800 times) [2024-06-18 13:20:01,834][12883] InferenceWorker_p0-w0: stopping experience collection (33800 times) [2024-06-18 13:20:01,834][12883] InferenceWorker_p0-w0: resuming experience collection (33800 times) [2024-06-18 13:20:01,994][12645] Fps is (10 sec: 42597.1, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2311454720. Throughput: 0: 43168.9. Samples: 2311516820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:20:01,994][12645] Avg episode reward: [(0, '0.284')] [2024-06-18 13:20:03,410][12883] Updated weights for policy 0, policy_version 141083 (0.0037) [2024-06-18 13:20:06,770][12883] Updated weights for policy 0, policy_version 141093 (0.0028) [2024-06-18 13:20:07,000][12645] Fps is (10 sec: 44209.2, 60 sec: 42867.1, 300 sec: 42708.6). Total num frames: 2311667712. Throughput: 0: 43071.8. Samples: 2311777720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:20:07,000][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 13:20:11,150][12883] Updated weights for policy 0, policy_version 141103 (0.0037) [2024-06-18 13:20:11,994][12645] Fps is (10 sec: 40961.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2311864320. Throughput: 0: 43069.3. 
Samples: 2312033680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:20:11,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 13:20:14,307][12883] Updated weights for policy 0, policy_version 141113 (0.0025) [2024-06-18 13:20:16,994][12645] Fps is (10 sec: 44264.3, 60 sec: 43422.1, 300 sec: 42820.6). Total num frames: 2312110080. Throughput: 0: 43015.4. Samples: 2312156380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:20:16,994][12645] Avg episode reward: [(0, '0.669')] [2024-06-18 13:20:19,023][12883] Updated weights for policy 0, policy_version 141123 (0.0031) [2024-06-18 13:20:21,981][12883] Updated weights for policy 0, policy_version 141133 (0.0033) [2024-06-18 13:20:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2312323072. Throughput: 0: 43065.3. Samples: 2312420140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:20:21,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 13:20:26,536][12883] Updated weights for policy 0, policy_version 141143 (0.0046) [2024-06-18 13:20:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2312519680. Throughput: 0: 43155.5. Samples: 2312683120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:20:26,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 13:20:29,758][12883] Updated weights for policy 0, policy_version 141153 (0.0022) [2024-06-18 13:20:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2312749056. Throughput: 0: 42978.8. Samples: 2312807940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 13:20:31,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 13:20:34,117][12883] Updated weights for policy 0, policy_version 141163 (0.0043) [2024-06-18 13:20:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2312929280. Throughput: 0: 42922.2. 
Samples: 2313062140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:20:36,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 13:20:37,040][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141171_2312945664.pth... [2024-06-18 13:20:37,112][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140545_2302689280.pth [2024-06-18 13:20:37,387][12883] Updated weights for policy 0, policy_version 141173 (0.0034) [2024-06-18 13:20:41,694][12883] Updated weights for policy 0, policy_version 141183 (0.0035) [2024-06-18 13:20:41,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2313142272. Throughput: 0: 42935.9. Samples: 2313321720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:20:41,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 13:20:44,941][12883] Updated weights for policy 0, policy_version 141193 (0.0040) [2024-06-18 13:20:46,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2313388032. Throughput: 0: 42914.4. Samples: 2313447960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:20:46,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 13:20:49,285][12883] Updated weights for policy 0, policy_version 141203 (0.0032) [2024-06-18 13:20:51,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 2313568256. Throughput: 0: 42842.4. Samples: 2313705360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:20:51,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 13:20:52,891][12883] Updated weights for policy 0, policy_version 141213 (0.0029) [2024-06-18 13:20:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2313781248. Throughput: 0: 42928.3. Samples: 2313965460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:20:56,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 13:20:57,271][12883] Updated weights for policy 0, policy_version 141223 (0.0030) [2024-06-18 13:21:00,433][12883] Updated weights for policy 0, policy_version 141233 (0.0032) [2024-06-18 13:21:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2314027008. Throughput: 0: 42877.4. Samples: 2314085860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:21:01,994][12645] Avg episode reward: [(0, '0.699')] [2024-06-18 13:21:04,761][12883] Updated weights for policy 0, policy_version 141243 (0.0037) [2024-06-18 13:21:06,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42602.9, 300 sec: 42653.9). Total num frames: 2314223616. Throughput: 0: 42817.0. Samples: 2314346900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:21:06,994][12645] Avg episode reward: [(0, '0.733')] [2024-06-18 13:21:08,107][12883] Updated weights for policy 0, policy_version 141253 (0.0032) [2024-06-18 13:21:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2314436608. Throughput: 0: 42724.5. Samples: 2314605720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:21:11,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 13:21:12,256][12883] Updated weights for policy 0, policy_version 141263 (0.0032) [2024-06-18 13:21:15,597][12883] Updated weights for policy 0, policy_version 141273 (0.0035) [2024-06-18 13:21:16,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2314649600. Throughput: 0: 42809.5. Samples: 2314734380. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:21:16,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 13:21:19,643][12883] Updated weights for policy 0, policy_version 141283 (0.0033) [2024-06-18 13:21:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2314878976. Throughput: 0: 42928.5. Samples: 2314993920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:21:21,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 13:21:23,555][12883] Updated weights for policy 0, policy_version 141293 (0.0041) [2024-06-18 13:21:24,676][12862] Signal inference workers to stop experience collection... (33850 times) [2024-06-18 13:21:24,712][12883] InferenceWorker_p0-w0: stopping experience collection (33850 times) [2024-06-18 13:21:24,722][12862] Signal inference workers to resume experience collection... (33850 times) [2024-06-18 13:21:24,727][12883] InferenceWorker_p0-w0: resuming experience collection (33850 times) [2024-06-18 13:21:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2315075584. Throughput: 0: 42914.3. Samples: 2315252860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:21:26,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 13:21:27,367][12883] Updated weights for policy 0, policy_version 141303 (0.0021) [2024-06-18 13:21:31,221][12883] Updated weights for policy 0, policy_version 141313 (0.0030) [2024-06-18 13:21:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2315304960. Throughput: 0: 42873.8. Samples: 2315377280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:21:31,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 13:21:35,103][12883] Updated weights for policy 0, policy_version 141323 (0.0036) [2024-06-18 13:21:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2315517952. 
Throughput: 0: 42847.9. Samples: 2315633520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 13:21:36,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 13:21:38,926][12883] Updated weights for policy 0, policy_version 141333 (0.0031) [2024-06-18 13:21:41,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42870.0, 300 sec: 42764.7). Total num frames: 2315714560. Throughput: 0: 42724.7. Samples: 2315888160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:21:41,996][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 13:21:42,751][12883] Updated weights for policy 0, policy_version 141343 (0.0028) [2024-06-18 13:21:46,548][12883] Updated weights for policy 0, policy_version 141353 (0.0027) [2024-06-18 13:21:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42876.4). Total num frames: 2315943936. Throughput: 0: 42914.1. Samples: 2316017000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:21:46,994][12645] Avg episode reward: [(0, '0.673')] [2024-06-18 13:21:50,387][12883] Updated weights for policy 0, policy_version 141363 (0.0033) [2024-06-18 13:21:51,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43417.6, 300 sec: 42876.3). Total num frames: 2316173312. Throughput: 0: 42907.9. Samples: 2316277760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:21:51,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 13:21:54,074][12883] Updated weights for policy 0, policy_version 141373 (0.0031) [2024-06-18 13:21:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2316369920. Throughput: 0: 42942.2. Samples: 2316538120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:21:56,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 13:21:57,869][12883] Updated weights for policy 0, policy_version 141383 (0.0039) [2024-06-18 13:22:01,583][12883] Updated weights for policy 0, policy_version 141393 (0.0030) [2024-06-18 13:22:01,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42932.1). Total num frames: 2316599296. Throughput: 0: 42844.7. Samples: 2316662380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:22:01,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 13:22:05,654][12883] Updated weights for policy 0, policy_version 141403 (0.0040) [2024-06-18 13:22:06,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2316812288. Throughput: 0: 42978.2. Samples: 2316927940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:22:06,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 13:22:09,071][12883] Updated weights for policy 0, policy_version 141413 (0.0040) [2024-06-18 13:22:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2317025280. Throughput: 0: 42862.7. Samples: 2317181680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:22:11,994][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 13:22:13,302][12883] Updated weights for policy 0, policy_version 141423 (0.0034) [2024-06-18 13:22:16,675][12883] Updated weights for policy 0, policy_version 141433 (0.0029) [2024-06-18 13:22:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42987.5). Total num frames: 2317254656. Throughput: 0: 42988.9. Samples: 2317311780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:22:16,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 13:22:20,898][12883] Updated weights for policy 0, policy_version 141443 (0.0037) [2024-06-18 13:22:21,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42869.9, 300 sec: 42931.3). Total num frames: 2317451264. Throughput: 0: 42970.8. Samples: 2317567300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:22:21,996][12645] Avg episode reward: [(0, '0.644')] [2024-06-18 13:22:24,372][12883] Updated weights for policy 0, policy_version 141453 (0.0039) [2024-06-18 13:22:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2317647872. Throughput: 0: 43051.8. Samples: 2317825400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:22:26,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 13:22:28,649][12883] Updated weights for policy 0, policy_version 141463 (0.0042) [2024-06-18 13:22:31,966][12883] Updated weights for policy 0, policy_version 141473 (0.0021) [2024-06-18 13:22:31,994][12645] Fps is (10 sec: 44246.9, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2317893632. Throughput: 0: 42980.6. Samples: 2317951120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:22:31,994][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 13:22:36,264][12883] Updated weights for policy 0, policy_version 141483 (0.0040) [2024-06-18 13:22:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42932.1). Total num frames: 2318090240. Throughput: 0: 42929.7. Samples: 2318209600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:22:36,994][12645] Avg episode reward: [(0, '0.723')] [2024-06-18 13:22:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141485_2318090240.pth... 
[2024-06-18 13:22:37,058][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000140858_2307817472.pth [2024-06-18 13:22:39,599][12883] Updated weights for policy 0, policy_version 141493 (0.0031) [2024-06-18 13:22:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43146.1, 300 sec: 42876.1). Total num frames: 2318303232. Throughput: 0: 42775.1. Samples: 2318463000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 13:22:41,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 13:22:43,831][12883] Updated weights for policy 0, policy_version 141503 (0.0028) [2024-06-18 13:22:46,165][12862] Signal inference workers to stop experience collection... (33900 times) [2024-06-18 13:22:46,165][12862] Signal inference workers to resume experience collection... (33900 times) [2024-06-18 13:22:46,187][12883] InferenceWorker_p0-w0: stopping experience collection (33900 times) [2024-06-18 13:22:46,214][12883] InferenceWorker_p0-w0: resuming experience collection (33900 times) [2024-06-18 13:22:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2318516224. Throughput: 0: 42812.7. Samples: 2318588960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:22:46,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 13:22:47,382][12883] Updated weights for policy 0, policy_version 141513 (0.0035) [2024-06-18 13:22:51,510][12883] Updated weights for policy 0, policy_version 141523 (0.0036) [2024-06-18 13:22:51,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.9, 300 sec: 42931.3). Total num frames: 2318729216. Throughput: 0: 42672.2. Samples: 2318848280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:22:51,996][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 13:22:54,991][12883] Updated weights for policy 0, policy_version 141533 (0.0034) [2024-06-18 13:22:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2318942208. 
Throughput: 0: 42588.4. Samples: 2319098160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:22:57,003][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 13:22:59,183][12883] Updated weights for policy 0, policy_version 141543 (0.0039) [2024-06-18 13:23:01,994][12645] Fps is (10 sec: 44247.0, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2319171584. Throughput: 0: 42709.9. Samples: 2319233720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:23:01,994][12645] Avg episode reward: [(0, '0.561')] [2024-06-18 13:23:02,630][12883] Updated weights for policy 0, policy_version 141553 (0.0028) [2024-06-18 13:23:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2319351808. Throughput: 0: 42715.4. Samples: 2319489400. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:23:06,994][12645] Avg episode reward: [(0, '0.648')] [2024-06-18 13:23:07,201][12883] Updated weights for policy 0, policy_version 141563 (0.0045) [2024-06-18 13:23:10,370][12883] Updated weights for policy 0, policy_version 141573 (0.0024) [2024-06-18 13:23:11,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2319581184. Throughput: 0: 42412.4. Samples: 2319733960. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:23:11,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 13:23:14,926][12883] Updated weights for policy 0, policy_version 141583 (0.0038) [2024-06-18 13:23:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2319810560. Throughput: 0: 42632.9. Samples: 2319869600. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:23:16,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 13:23:17,832][12883] Updated weights for policy 0, policy_version 141593 (0.0037) [2024-06-18 13:23:21,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42053.8, 300 sec: 42820.5). Total num frames: 2319974400. 
Throughput: 0: 42432.4. Samples: 2320119060. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:23:21,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 13:23:22,644][12883] Updated weights for policy 0, policy_version 141603 (0.0039) [2024-06-18 13:23:25,771][12883] Updated weights for policy 0, policy_version 141613 (0.0026) [2024-06-18 13:23:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2320236544. Throughput: 0: 42253.7. Samples: 2320364420. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:23:26,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 13:23:30,120][12883] Updated weights for policy 0, policy_version 141623 (0.0028) [2024-06-18 13:23:31,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2320433152. Throughput: 0: 42674.0. Samples: 2320509280. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:23:31,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 13:23:33,339][12883] Updated weights for policy 0, policy_version 141633 (0.0021) [2024-06-18 13:23:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2320613376. Throughput: 0: 42354.9. Samples: 2320754160. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:23:37,000][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 13:23:37,878][12883] Updated weights for policy 0, policy_version 141643 (0.0026) [2024-06-18 13:23:41,333][12883] Updated weights for policy 0, policy_version 141653 (0.0037) [2024-06-18 13:23:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2320891904. Throughput: 0: 42363.6. Samples: 2321004520. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-18 13:23:41,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 13:23:45,476][12883] Updated weights for policy 0, policy_version 141663 (0.0036) [2024-06-18 13:23:46,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 2321055744. Throughput: 0: 42546.2. Samples: 2321148400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:23:46,996][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 13:23:48,949][12883] Updated weights for policy 0, policy_version 141673 (0.0034) [2024-06-18 13:23:51,994][12645] Fps is (10 sec: 36044.7, 60 sec: 42053.7, 300 sec: 42765.0). Total num frames: 2321252352. Throughput: 0: 42171.1. Samples: 2321387100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:23:51,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 13:23:53,576][12883] Updated weights for policy 0, policy_version 141683 (0.0047) [2024-06-18 13:23:55,595][12862] Signal inference workers to stop experience collection... (33950 times) [2024-06-18 13:23:55,595][12862] Signal inference workers to resume experience collection... (33950 times) [2024-06-18 13:23:55,643][12883] InferenceWorker_p0-w0: stopping experience collection (33950 times) [2024-06-18 13:23:55,648][12883] InferenceWorker_p0-w0: resuming experience collection (33950 times) [2024-06-18 13:23:56,605][12883] Updated weights for policy 0, policy_version 141693 (0.0046) [2024-06-18 13:23:56,994][12645] Fps is (10 sec: 45885.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2321514496. Throughput: 0: 42426.8. Samples: 2321643160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:23:56,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 13:24:01,101][12883] Updated weights for policy 0, policy_version 141703 (0.0033) [2024-06-18 13:24:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 2321678336. Throughput: 0: 42452.5. 
Samples: 2321779960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:01,994][12645] Avg episode reward: [(0, '0.647')] [2024-06-18 13:24:04,263][12883] Updated weights for policy 0, policy_version 141713 (0.0030) [2024-06-18 13:24:06,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2321907712. Throughput: 0: 42408.4. Samples: 2322027440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:06,994][12645] Avg episode reward: [(0, '0.647')] [2024-06-18 13:24:08,708][12883] Updated weights for policy 0, policy_version 141723 (0.0041) [2024-06-18 13:24:11,966][12883] Updated weights for policy 0, policy_version 141733 (0.0038) [2024-06-18 13:24:11,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42877.0). Total num frames: 2322153472. Throughput: 0: 42751.2. Samples: 2322288220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:11,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 13:24:16,139][12883] Updated weights for policy 0, policy_version 141743 (0.0030) [2024-06-18 13:24:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2322333696. Throughput: 0: 42416.3. Samples: 2322418020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:16,994][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 13:24:19,752][12883] Updated weights for policy 0, policy_version 141753 (0.0038) [2024-06-18 13:24:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2322563072. Throughput: 0: 42513.4. Samples: 2322667260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:21,994][12645] Avg episode reward: [(0, '0.192')] [2024-06-18 13:24:23,865][12883] Updated weights for policy 0, policy_version 141763 (0.0025) [2024-06-18 13:24:26,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2322776064. Throughput: 0: 42873.8. 
Samples: 2322933840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:26,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 13:24:27,434][12883] Updated weights for policy 0, policy_version 141773 (0.0030) [2024-06-18 13:24:31,280][12883] Updated weights for policy 0, policy_version 141783 (0.0042) [2024-06-18 13:24:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2322972672. Throughput: 0: 42427.4. Samples: 2323057540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:31,994][12645] Avg episode reward: [(0, '0.669')] [2024-06-18 13:24:35,137][12883] Updated weights for policy 0, policy_version 141793 (0.0027) [2024-06-18 13:24:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2323202048. Throughput: 0: 42804.0. Samples: 2323313280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:36,994][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 13:24:37,097][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141798_2323218432.pth... [2024-06-18 13:24:37,156][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141171_2312945664.pth [2024-06-18 13:24:38,777][12883] Updated weights for policy 0, policy_version 141803 (0.0032) [2024-06-18 13:24:41,994][12645] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 2323398656. Throughput: 0: 43026.7. Samples: 2323579360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:41,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 13:24:42,846][12883] Updated weights for policy 0, policy_version 141813 (0.0035) [2024-06-18 13:24:46,904][12883] Updated weights for policy 0, policy_version 141823 (0.0035) [2024-06-18 13:24:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2323628032. Throughput: 0: 42633.7. Samples: 2323698480. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-18 13:24:46,994][12645] Avg episode reward: [(0, '0.797')] [2024-06-18 13:24:50,421][12883] Updated weights for policy 0, policy_version 141833 (0.0038) [2024-06-18 13:24:51,994][12645] Fps is (10 sec: 45874.8, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2323857408. Throughput: 0: 42915.6. Samples: 2323958640. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:24:51,998][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 13:24:54,333][12883] Updated weights for policy 0, policy_version 141843 (0.0027) [2024-06-18 13:24:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 2324037632. Throughput: 0: 42946.6. Samples: 2324220820. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:24:56,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 13:24:58,285][12883] Updated weights for policy 0, policy_version 141853 (0.0033) [2024-06-18 13:25:01,883][12883] Updated weights for policy 0, policy_version 141863 (0.0035) [2024-06-18 13:25:02,000][12645] Fps is (10 sec: 42571.8, 60 sec: 43413.0, 300 sec: 42765.0). Total num frames: 2324283392. Throughput: 0: 42700.4. Samples: 2324339800. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:02,001][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 13:25:05,876][12883] Updated weights for policy 0, policy_version 141873 (0.0027) [2024-06-18 13:25:06,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2324496384. Throughput: 0: 43041.2. Samples: 2324604120. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:06,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 13:25:08,589][12862] Signal inference workers to stop experience collection... (34000 times) [2024-06-18 13:25:08,589][12862] Signal inference workers to resume experience collection... 
(34000 times) [2024-06-18 13:25:08,625][12883] InferenceWorker_p0-w0: stopping experience collection (34000 times) [2024-06-18 13:25:08,625][12883] InferenceWorker_p0-w0: resuming experience collection (34000 times) [2024-06-18 13:25:09,527][12883] Updated weights for policy 0, policy_version 141883 (0.0038) [2024-06-18 13:25:11,994][12645] Fps is (10 sec: 40984.1, 60 sec: 42325.0, 300 sec: 42653.9). Total num frames: 2324692992. Throughput: 0: 42881.0. Samples: 2324863500. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:11,995][12645] Avg episode reward: [(0, '0.754')] [2024-06-18 13:25:13,644][12883] Updated weights for policy 0, policy_version 141893 (0.0036) [2024-06-18 13:25:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2324922368. Throughput: 0: 42809.3. Samples: 2324983960. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:16,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 13:25:17,088][12883] Updated weights for policy 0, policy_version 141903 (0.0042) [2024-06-18 13:25:21,718][12883] Updated weights for policy 0, policy_version 141913 (0.0035) [2024-06-18 13:25:21,994][12645] Fps is (10 sec: 42600.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2325118976. Throughput: 0: 42760.9. Samples: 2325237520. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:21,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 13:25:24,945][12883] Updated weights for policy 0, policy_version 141923 (0.0038) [2024-06-18 13:25:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2325331968. Throughput: 0: 42590.2. Samples: 2325495920. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:26,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 13:25:29,298][12883] Updated weights for policy 0, policy_version 141933 (0.0038) [2024-06-18 13:25:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2325544960. Throughput: 0: 42803.4. Samples: 2325624640. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:31,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 13:25:32,393][12883] Updated weights for policy 0, policy_version 141943 (0.0036) [2024-06-18 13:25:36,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2325741568. Throughput: 0: 42581.8. Samples: 2325874820. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:36,994][12645] Avg episode reward: [(0, '0.784')] [2024-06-18 13:25:37,047][12883] Updated weights for policy 0, policy_version 141953 (0.0029) [2024-06-18 13:25:40,210][12883] Updated weights for policy 0, policy_version 141963 (0.0027) [2024-06-18 13:25:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2325954560. Throughput: 0: 42456.8. Samples: 2326131380. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:41,994][12645] Avg episode reward: [(0, '0.695')] [2024-06-18 13:25:44,696][12883] Updated weights for policy 0, policy_version 141973 (0.0038) [2024-06-18 13:25:46,998][12645] Fps is (10 sec: 45853.0, 60 sec: 42868.0, 300 sec: 42819.9). Total num frames: 2326200320. Throughput: 0: 42756.5. Samples: 2326263780. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:46,999][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 13:25:47,655][12883] Updated weights for policy 0, policy_version 141983 (0.0028) [2024-06-18 13:25:51,996][12645] Fps is (10 sec: 42589.2, 60 sec: 42050.7, 300 sec: 42709.2). Total num frames: 2326380544. Throughput: 0: 42498.4. Samples: 2326516640. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 13:25:51,996][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 13:25:52,332][12883] Updated weights for policy 0, policy_version 141993 (0.0041) [2024-06-18 13:25:55,327][12883] Updated weights for policy 0, policy_version 142003 (0.0037) [2024-06-18 13:25:56,994][12645] Fps is (10 sec: 40979.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2326609920. Throughput: 0: 42494.5. Samples: 2326775740. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:25:56,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 13:25:59,866][12883] Updated weights for policy 0, policy_version 142013 (0.0029) [2024-06-18 13:26:01,994][12645] Fps is (10 sec: 45885.7, 60 sec: 42602.9, 300 sec: 42765.0). Total num frames: 2326839296. Throughput: 0: 42876.1. Samples: 2326913380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:01,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 13:26:02,873][12883] Updated weights for policy 0, policy_version 142023 (0.0061) [2024-06-18 13:26:06,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2327035904. Throughput: 0: 42832.7. Samples: 2327165000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:06,994][12645] Avg episode reward: [(0, '0.689')] [2024-06-18 13:26:07,371][12883] Updated weights for policy 0, policy_version 142033 (0.0025) [2024-06-18 13:26:10,857][12883] Updated weights for policy 0, policy_version 142043 (0.0031) [2024-06-18 13:26:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 2327265280. Throughput: 0: 42676.4. Samples: 2327416360. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:11,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 13:26:15,397][12883] Updated weights for policy 0, policy_version 142053 (0.0038) [2024-06-18 13:26:16,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2327478272. Throughput: 0: 42789.8. Samples: 2327550180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:16,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 13:26:18,406][12883] Updated weights for policy 0, policy_version 142063 (0.0038) [2024-06-18 13:26:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2327691264. Throughput: 0: 43031.5. Samples: 2327811240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:21,996][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 13:26:22,838][12883] Updated weights for policy 0, policy_version 142073 (0.0042) [2024-06-18 13:26:26,060][12883] Updated weights for policy 0, policy_version 142083 (0.0034) [2024-06-18 13:26:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2327920640. Throughput: 0: 42816.2. Samples: 2328058100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:26,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 13:26:30,831][12883] Updated weights for policy 0, policy_version 142093 (0.0031) [2024-06-18 13:26:31,861][12862] Signal inference workers to stop experience collection... (34050 times) [2024-06-18 13:26:31,902][12883] InferenceWorker_p0-w0: stopping experience collection (34050 times) [2024-06-18 13:26:31,918][12862] Signal inference workers to resume experience collection... (34050 times) [2024-06-18 13:26:31,924][12883] InferenceWorker_p0-w0: resuming experience collection (34050 times) [2024-06-18 13:26:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2328117248. Throughput: 0: 42824.1. 
Samples: 2328190660. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:31,996][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 13:26:33,810][12883] Updated weights for policy 0, policy_version 142103 (0.0035) [2024-06-18 13:26:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2328330240. Throughput: 0: 42981.7. Samples: 2328450720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:36,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 13:26:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142110_2328330240.pth... [2024-06-18 13:26:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141485_2318090240.pth [2024-06-18 13:26:38,261][12883] Updated weights for policy 0, policy_version 142113 (0.0029) [2024-06-18 13:26:41,524][12883] Updated weights for policy 0, policy_version 142123 (0.0046) [2024-06-18 13:26:41,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2328559616. Throughput: 0: 42751.2. Samples: 2328699540. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:41,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 13:26:46,017][12883] Updated weights for policy 0, policy_version 142133 (0.0038) [2024-06-18 13:26:46,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42328.7, 300 sec: 42598.4). Total num frames: 2328739840. Throughput: 0: 42587.9. Samples: 2328829840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:46,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 13:26:49,171][12883] Updated weights for policy 0, policy_version 142143 (0.0044) [2024-06-18 13:26:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43146.1, 300 sec: 42709.5). Total num frames: 2328969216. Throughput: 0: 42749.4. Samples: 2329088720. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-18 13:26:51,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 13:26:53,599][12883] Updated weights for policy 0, policy_version 142153 (0.0040) [2024-06-18 13:26:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2329182208. Throughput: 0: 42867.1. Samples: 2329345380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:26:56,998][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 13:26:57,040][12883] Updated weights for policy 0, policy_version 142163 (0.0042) [2024-06-18 13:27:01,190][12883] Updated weights for policy 0, policy_version 142173 (0.0031) [2024-06-18 13:27:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2329395200. Throughput: 0: 42762.8. Samples: 2329474500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:01,994][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 13:27:04,730][12883] Updated weights for policy 0, policy_version 142183 (0.0031) [2024-06-18 13:27:06,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2329608192. Throughput: 0: 42690.3. Samples: 2329732300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:06,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 13:27:08,754][12883] Updated weights for policy 0, policy_version 142193 (0.0043) [2024-06-18 13:27:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2329821184. Throughput: 0: 42723.1. Samples: 2329980640. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:11,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 13:27:12,415][12883] Updated weights for policy 0, policy_version 142203 (0.0032) [2024-06-18 13:27:16,404][12883] Updated weights for policy 0, policy_version 142213 (0.0034) [2024-06-18 13:27:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2330017792. Throughput: 0: 42670.3. Samples: 2330110820. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:16,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 13:27:20,402][12883] Updated weights for policy 0, policy_version 142223 (0.0034) [2024-06-18 13:27:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2330263552. Throughput: 0: 42761.8. Samples: 2330375000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:21,994][12645] Avg episode reward: [(0, '0.636')] [2024-06-18 13:27:24,222][12883] Updated weights for policy 0, policy_version 142233 (0.0022) [2024-06-18 13:27:26,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2330492928. Throughput: 0: 42813.4. Samples: 2330626140. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:26,994][12645] Avg episode reward: [(0, '0.616')] [2024-06-18 13:27:27,907][12883] Updated weights for policy 0, policy_version 142243 (0.0037) [2024-06-18 13:27:31,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2330656768. Throughput: 0: 42679.6. Samples: 2330750420. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:31,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 13:27:32,022][12883] Updated weights for policy 0, policy_version 142253 (0.0037) [2024-06-18 13:27:35,763][12883] Updated weights for policy 0, policy_version 142263 (0.0036) [2024-06-18 13:27:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2330918912. Throughput: 0: 42677.9. Samples: 2331009220. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:37,004][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 13:27:39,641][12883] Updated weights for policy 0, policy_version 142273 (0.0025) [2024-06-18 13:27:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2331115520. Throughput: 0: 42606.3. Samples: 2331262660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:41,994][12645] Avg episode reward: [(0, '0.235')] [2024-06-18 13:27:43,310][12883] Updated weights for policy 0, policy_version 142283 (0.0022) [2024-06-18 13:27:46,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2331295744. Throughput: 0: 42596.4. Samples: 2331391340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:46,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 13:27:47,539][12883] Updated weights for policy 0, policy_version 142293 (0.0036) [2024-06-18 13:27:50,907][12883] Updated weights for policy 0, policy_version 142303 (0.0031) [2024-06-18 13:27:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2331525120. Throughput: 0: 42609.3. Samples: 2331649720. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:51,994][12645] Avg episode reward: [(0, '0.669')] [2024-06-18 13:27:55,099][12883] Updated weights for policy 0, policy_version 142313 (0.0023) [2024-06-18 13:27:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2331738112. Throughput: 0: 42676.0. Samples: 2331901060. Policy #0 lag: (min: 1.0, avg: 11.1, max: 20.0) [2024-06-18 13:27:56,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 13:27:58,424][12883] Updated weights for policy 0, policy_version 142323 (0.0033) [2024-06-18 13:27:59,729][12862] Signal inference workers to stop experience collection... (34100 times) [2024-06-18 13:27:59,782][12862] Signal inference workers to resume experience collection... (34100 times) [2024-06-18 13:27:59,783][12883] InferenceWorker_p0-w0: stopping experience collection (34100 times) [2024-06-18 13:27:59,815][12883] InferenceWorker_p0-w0: resuming experience collection (34100 times) [2024-06-18 13:28:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2331951104. Throughput: 0: 42706.8. Samples: 2332032620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:01,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 13:28:02,595][12883] Updated weights for policy 0, policy_version 142333 (0.0031) [2024-06-18 13:28:06,061][12883] Updated weights for policy 0, policy_version 142343 (0.0036) [2024-06-18 13:28:06,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 2332180480. Throughput: 0: 42469.4. Samples: 2332286220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:06,996][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 13:28:10,549][12883] Updated weights for policy 0, policy_version 142353 (0.0033) [2024-06-18 13:28:11,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2332393472. 
Throughput: 0: 42567.4. Samples: 2332541680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:11,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 13:28:13,649][12883] Updated weights for policy 0, policy_version 142363 (0.0039) [2024-06-18 13:28:16,994][12645] Fps is (10 sec: 40968.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2332590080. Throughput: 0: 42704.8. Samples: 2332672140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:16,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 13:28:18,299][12883] Updated weights for policy 0, policy_version 142373 (0.0040) [2024-06-18 13:28:21,328][12883] Updated weights for policy 0, policy_version 142383 (0.0037) [2024-06-18 13:28:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2332819456. Throughput: 0: 42590.6. Samples: 2332925800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:21,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 13:28:25,850][12883] Updated weights for policy 0, policy_version 142393 (0.0028) [2024-06-18 13:28:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 2333016064. Throughput: 0: 42665.2. Samples: 2333182600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:26,994][12645] Avg episode reward: [(0, '0.245')] [2024-06-18 13:28:29,071][12883] Updated weights for policy 0, policy_version 142403 (0.0040) [2024-06-18 13:28:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2333212672. Throughput: 0: 42611.1. Samples: 2333308840. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:31,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 13:28:33,402][12883] Updated weights for policy 0, policy_version 142413 (0.0034) [2024-06-18 13:28:36,923][12883] Updated weights for policy 0, policy_version 142423 (0.0034) [2024-06-18 13:28:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2333458432. Throughput: 0: 42571.5. Samples: 2333565440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:36,994][12645] Avg episode reward: [(0, '0.699')] [2024-06-18 13:28:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142423_2333458432.pth... [2024-06-18 13:28:37,054][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000141798_2323218432.pth [2024-06-18 13:28:40,984][12883] Updated weights for policy 0, policy_version 142433 (0.0043) [2024-06-18 13:28:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 2333671424. Throughput: 0: 42609.8. Samples: 2333818500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:41,994][12645] Avg episode reward: [(0, '0.652')] [2024-06-18 13:28:44,421][12883] Updated weights for policy 0, policy_version 142443 (0.0031) [2024-06-18 13:28:46,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2333851648. Throughput: 0: 42561.6. Samples: 2333947900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:46,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 13:28:48,594][12883] Updated weights for policy 0, policy_version 142453 (0.0024) [2024-06-18 13:28:51,942][12883] Updated weights for policy 0, policy_version 142463 (0.0031) [2024-06-18 13:28:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2334113792. Throughput: 0: 42680.8. Samples: 2334206760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:51,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 13:28:56,353][12883] Updated weights for policy 0, policy_version 142473 (0.0038) [2024-06-18 13:28:56,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2334310400. Throughput: 0: 42691.7. Samples: 2334462800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 13:28:56,994][12645] Avg episode reward: [(0, '0.672')] [2024-06-18 13:28:59,531][12883] Updated weights for policy 0, policy_version 142483 (0.0041) [2024-06-18 13:29:01,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2334507008. Throughput: 0: 42556.0. Samples: 2334587160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:01,994][12645] Avg episode reward: [(0, '0.917')] [2024-06-18 13:29:01,996][12862] Saving new best policy, reward=0.917! [2024-06-18 13:29:03,925][12883] Updated weights for policy 0, policy_version 142493 (0.0026) [2024-06-18 13:29:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2334752768. Throughput: 0: 42748.1. Samples: 2334849460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:06,994][12645] Avg episode reward: [(0, '0.649')] [2024-06-18 13:29:07,717][12883] Updated weights for policy 0, policy_version 142503 (0.0048) [2024-06-18 13:29:11,515][12883] Updated weights for policy 0, policy_version 142513 (0.0027) [2024-06-18 13:29:11,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2334949376. Throughput: 0: 42636.6. Samples: 2335101240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:11,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 13:29:15,395][12883] Updated weights for policy 0, policy_version 142523 (0.0040) [2024-06-18 13:29:16,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42653.9). 
Total num frames: 2335145984. Throughput: 0: 42655.4. Samples: 2335228340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:16,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 13:29:19,346][12883] Updated weights for policy 0, policy_version 142533 (0.0037) [2024-06-18 13:29:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2335375360. Throughput: 0: 42690.4. Samples: 2335486500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:21,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 13:29:22,835][12883] Updated weights for policy 0, policy_version 142543 (0.0040) [2024-06-18 13:29:26,986][12883] Updated weights for policy 0, policy_version 142553 (0.0030) [2024-06-18 13:29:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2335588352. Throughput: 0: 42825.5. Samples: 2335745660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:26,994][12645] Avg episode reward: [(0, '0.645')] [2024-06-18 13:29:30,705][12883] Updated weights for policy 0, policy_version 142563 (0.0037) [2024-06-18 13:29:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2335801344. Throughput: 0: 42621.0. Samples: 2335865840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:31,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 13:29:34,628][12883] Updated weights for policy 0, policy_version 142573 (0.0040) [2024-06-18 13:29:36,994][12645] Fps is (10 sec: 42599.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2336014336. Throughput: 0: 42632.9. Samples: 2336125240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:36,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 13:29:38,293][12883] Updated weights for policy 0, policy_version 142583 (0.0023) [2024-06-18 13:29:40,453][12862] Signal inference workers to stop experience collection... 
(34150 times) [2024-06-18 13:29:40,454][12862] Signal inference workers to resume experience collection... (34150 times) [2024-06-18 13:29:40,466][12883] InferenceWorker_p0-w0: stopping experience collection (34150 times) [2024-06-18 13:29:40,485][12883] InferenceWorker_p0-w0: resuming experience collection (34150 times) [2024-06-18 13:29:41,995][12645] Fps is (10 sec: 42592.9, 60 sec: 42597.4, 300 sec: 42709.3). Total num frames: 2336227328. Throughput: 0: 42522.7. Samples: 2336376380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:41,996][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 13:29:42,586][12883] Updated weights for policy 0, policy_version 142593 (0.0022) [2024-06-18 13:29:45,954][12883] Updated weights for policy 0, policy_version 142603 (0.0044) [2024-06-18 13:29:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2336440320. Throughput: 0: 42648.0. Samples: 2336506320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:46,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 13:29:50,462][12883] Updated weights for policy 0, policy_version 142613 (0.0033) [2024-06-18 13:29:51,994][12645] Fps is (10 sec: 40965.5, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2336636928. Throughput: 0: 42575.5. Samples: 2336765360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:51,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 13:29:53,596][12883] Updated weights for policy 0, policy_version 142623 (0.0025) [2024-06-18 13:29:57,000][12645] Fps is (10 sec: 40934.7, 60 sec: 42320.9, 300 sec: 42598.4). Total num frames: 2336849920. Throughput: 0: 42501.2. Samples: 2337014060. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:29:57,001][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 13:29:58,144][12883] Updated weights for policy 0, policy_version 142633 (0.0033) [2024-06-18 13:30:01,454][12883] Updated weights for policy 0, policy_version 142643 (0.0028) [2024-06-18 13:30:01,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2337079296. Throughput: 0: 42568.8. Samples: 2337143940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:30:01,995][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 13:30:05,857][12883] Updated weights for policy 0, policy_version 142653 (0.0038) [2024-06-18 13:30:06,994][12645] Fps is (10 sec: 42625.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 2337275904. Throughput: 0: 42528.4. Samples: 2337400280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:06,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 13:30:09,256][12883] Updated weights for policy 0, policy_version 142663 (0.0032) [2024-06-18 13:30:11,996][12645] Fps is (10 sec: 42589.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2337505280. Throughput: 0: 42348.3. Samples: 2337651420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:11,997][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 13:30:13,604][12883] Updated weights for policy 0, policy_version 142673 (0.0038) [2024-06-18 13:30:16,877][12883] Updated weights for policy 0, policy_version 142683 (0.0040) [2024-06-18 13:30:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2337718272. Throughput: 0: 42610.2. Samples: 2337783300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:16,994][12645] Avg episode reward: [(0, '0.664')] [2024-06-18 13:30:21,142][12883] Updated weights for policy 0, policy_version 142693 (0.0034) [2024-06-18 13:30:21,996][12645] Fps is (10 sec: 40960.1, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 2337914880. Throughput: 0: 42514.3. Samples: 2338038480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:21,996][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 13:30:24,478][12883] Updated weights for policy 0, policy_version 142703 (0.0037) [2024-06-18 13:30:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2338160640. Throughput: 0: 42611.9. Samples: 2338293860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:26,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 13:30:28,556][12883] Updated weights for policy 0, policy_version 142713 (0.0031) [2024-06-18 13:30:31,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2338357248. Throughput: 0: 42723.2. Samples: 2338428860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:31,994][12645] Avg episode reward: [(0, '0.695')] [2024-06-18 13:30:32,031][12883] Updated weights for policy 0, policy_version 142723 (0.0032) [2024-06-18 13:30:36,402][12883] Updated weights for policy 0, policy_version 142733 (0.0031) [2024-06-18 13:30:36,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2338553856. Throughput: 0: 42668.4. Samples: 2338685440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:36,994][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 13:30:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142735_2338570240.pth... 
[2024-06-18 13:30:37,130][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142110_2328330240.pth [2024-06-18 13:30:39,745][12883] Updated weights for policy 0, policy_version 142743 (0.0027) [2024-06-18 13:30:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42599.3, 300 sec: 42654.6). Total num frames: 2338783232. Throughput: 0: 42789.0. Samples: 2338939300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:41,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 13:30:44,032][12883] Updated weights for policy 0, policy_version 142753 (0.0039) [2024-06-18 13:30:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 2338996224. Throughput: 0: 42688.2. Samples: 2339064900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:46,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 13:30:47,571][12883] Updated weights for policy 0, policy_version 142763 (0.0031) [2024-06-18 13:30:51,858][12883] Updated weights for policy 0, policy_version 142773 (0.0033) [2024-06-18 13:30:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2339192832. Throughput: 0: 42716.7. Samples: 2339322540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:51,994][12645] Avg episode reward: [(0, '0.848')] [2024-06-18 13:30:55,156][12883] Updated weights for policy 0, policy_version 142783 (0.0030) [2024-06-18 13:30:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43149.0, 300 sec: 42709.5). Total num frames: 2339438592. Throughput: 0: 42747.9. Samples: 2339574980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:30:56,994][12645] Avg episode reward: [(0, '0.739')] [2024-06-18 13:30:59,362][12883] Updated weights for policy 0, policy_version 142793 (0.0038) [2024-06-18 13:30:59,595][12862] Signal inference workers to stop experience collection... 
(34200 times) [2024-06-18 13:30:59,633][12883] InferenceWorker_p0-w0: stopping experience collection (34200 times) [2024-06-18 13:30:59,642][12862] Signal inference workers to resume experience collection... (34200 times) [2024-06-18 13:30:59,653][12883] InferenceWorker_p0-w0: resuming experience collection (34200 times) [2024-06-18 13:31:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2339618816. Throughput: 0: 42843.0. Samples: 2339711240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:31:01,995][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 13:31:02,700][12883] Updated weights for policy 0, policy_version 142803 (0.0030) [2024-06-18 13:31:06,834][12883] Updated weights for policy 0, policy_version 142813 (0.0030) [2024-06-18 13:31:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2339848192. Throughput: 0: 42851.5. Samples: 2339966700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 13:31:06,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 13:31:10,904][12883] Updated weights for policy 0, policy_version 142823 (0.0044) [2024-06-18 13:31:11,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2340077568. Throughput: 0: 42767.6. Samples: 2340218400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:11,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 13:31:14,453][12883] Updated weights for policy 0, policy_version 142833 (0.0039) [2024-06-18 13:31:16,996][12645] Fps is (10 sec: 40950.5, 60 sec: 42323.7, 300 sec: 42598.1). Total num frames: 2340257792. Throughput: 0: 42656.1. Samples: 2340348480. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:16,997][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 13:31:18,423][12883] Updated weights for policy 0, policy_version 142843 (0.0028) [2024-06-18 13:31:21,958][12883] Updated weights for policy 0, policy_version 142853 (0.0039) [2024-06-18 13:31:21,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 2340503552. Throughput: 0: 42522.2. Samples: 2340598940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:21,994][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 13:31:26,083][12883] Updated weights for policy 0, policy_version 142863 (0.0029) [2024-06-18 13:31:26,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2340700160. Throughput: 0: 42716.9. Samples: 2340861560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:26,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 13:31:29,623][12883] Updated weights for policy 0, policy_version 142873 (0.0032) [2024-06-18 13:31:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2340913152. Throughput: 0: 42818.7. Samples: 2340991740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:31,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 13:31:33,813][12883] Updated weights for policy 0, policy_version 142883 (0.0035) [2024-06-18 13:31:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2341142528. Throughput: 0: 42777.4. Samples: 2341247520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:36,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 13:31:37,109][12883] Updated weights for policy 0, policy_version 142893 (0.0042) [2024-06-18 13:31:41,297][12883] Updated weights for policy 0, policy_version 142903 (0.0040) [2024-06-18 13:31:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2341322752. Throughput: 0: 42912.0. Samples: 2341506020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:41,994][12645] Avg episode reward: [(0, '0.647')] [2024-06-18 13:31:44,927][12883] Updated weights for policy 0, policy_version 142913 (0.0037) [2024-06-18 13:31:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2341552128. Throughput: 0: 42634.7. Samples: 2341629800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:46,994][12645] Avg episode reward: [(0, '0.299')] [2024-06-18 13:31:49,076][12883] Updated weights for policy 0, policy_version 142923 (0.0037) [2024-06-18 13:31:51,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2341781504. Throughput: 0: 42699.9. Samples: 2341888200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:51,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 13:31:52,529][12883] Updated weights for policy 0, policy_version 142933 (0.0035) [2024-06-18 13:31:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2341961728. Throughput: 0: 42831.9. Samples: 2342145840. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:31:56,995][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 13:31:57,083][12883] Updated weights for policy 0, policy_version 142943 (0.0031) [2024-06-18 13:32:00,269][12883] Updated weights for policy 0, policy_version 142953 (0.0029) [2024-06-18 13:32:01,996][12645] Fps is (10 sec: 42589.1, 60 sec: 43143.0, 300 sec: 42709.1). Total num frames: 2342207488. Throughput: 0: 42722.7. Samples: 2342271000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:32:01,996][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 13:32:04,634][12883] Updated weights for policy 0, policy_version 142963 (0.0039) [2024-06-18 13:32:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2342387712. Throughput: 0: 42868.0. Samples: 2342528000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:32:06,995][12645] Avg episode reward: [(0, '0.607')] [2024-06-18 13:32:08,119][12883] Updated weights for policy 0, policy_version 142973 (0.0029) [2024-06-18 13:32:11,994][12645] Fps is (10 sec: 39330.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2342600704. Throughput: 0: 42619.5. Samples: 2342779440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 13:32:11,994][12645] Avg episode reward: [(0, '0.657')] [2024-06-18 13:32:12,209][12883] Updated weights for policy 0, policy_version 142983 (0.0027) [2024-06-18 13:32:15,729][12862] Signal inference workers to stop experience collection... (34250 times) [2024-06-18 13:32:15,779][12883] InferenceWorker_p0-w0: stopping experience collection (34250 times) [2024-06-18 13:32:15,788][12862] Signal inference workers to resume experience collection... 
(34250 times) [2024-06-18 13:32:15,796][12883] InferenceWorker_p0-w0: resuming experience collection (34250 times) [2024-06-18 13:32:15,921][12883] Updated weights for policy 0, policy_version 142993 (0.0029) [2024-06-18 13:32:16,995][12645] Fps is (10 sec: 45867.5, 60 sec: 43144.9, 300 sec: 42653.7). Total num frames: 2342846464. Throughput: 0: 42608.1. Samples: 2342909180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:32:16,996][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 13:32:19,862][12883] Updated weights for policy 0, policy_version 143003 (0.0033) [2024-06-18 13:32:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2343043072. Throughput: 0: 42648.9. Samples: 2343166720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:32:21,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 13:32:23,633][12883] Updated weights for policy 0, policy_version 143013 (0.0043) [2024-06-18 13:32:26,994][12645] Fps is (10 sec: 39328.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2343239680. Throughput: 0: 42367.8. Samples: 2343412580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:32:26,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 13:32:27,758][12883] Updated weights for policy 0, policy_version 143023 (0.0049) [2024-06-18 13:32:31,454][12883] Updated weights for policy 0, policy_version 143033 (0.0025) [2024-06-18 13:32:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2343469056. Throughput: 0: 42518.8. Samples: 2343543140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:32:31,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 13:32:35,852][12883] Updated weights for policy 0, policy_version 143043 (0.0044) [2024-06-18 13:32:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 2343665664. Throughput: 0: 42356.4. Samples: 2343794240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:32:36,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 13:32:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143046_2343665664.pth... [2024-06-18 13:32:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142423_2333458432.pth [2024-06-18 13:32:39,510][12883] Updated weights for policy 0, policy_version 143053 (0.0029) [2024-06-18 13:32:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2343895040. Throughput: 0: 42094.8. Samples: 2344040100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:32:41,994][12645] Avg episode reward: [(0, '0.664')] [2024-06-18 13:32:43,426][12883] Updated weights for policy 0, policy_version 143063 (0.0041) [2024-06-18 13:32:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2344091648. Throughput: 0: 42268.3. Samples: 2344172980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:32:46,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 13:32:47,236][12883] Updated weights for policy 0, policy_version 143073 (0.0029) [2024-06-18 13:32:51,318][12883] Updated weights for policy 0, policy_version 143083 (0.0033) [2024-06-18 13:32:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2344321024. Throughput: 0: 42214.3. Samples: 2344427640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:32:51,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 13:32:55,039][12883] Updated weights for policy 0, policy_version 143093 (0.0031) [2024-06-18 13:32:56,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2344534016. Throughput: 0: 42152.5. Samples: 2344676300. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:32:56,995][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 13:32:58,842][12883] Updated weights for policy 0, policy_version 143103 (0.0042) [2024-06-18 13:33:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42053.8, 300 sec: 42543.2). Total num frames: 2344730624. Throughput: 0: 42159.8. Samples: 2344806300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:33:01,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 13:33:02,626][12883] Updated weights for policy 0, policy_version 143113 (0.0031) [2024-06-18 13:33:06,217][12883] Updated weights for policy 0, policy_version 143123 (0.0031) [2024-06-18 13:33:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2344927232. Throughput: 0: 42213.8. Samples: 2345066340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:33:06,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 13:33:10,306][12883] Updated weights for policy 0, policy_version 143133 (0.0046) [2024-06-18 13:33:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2345172992. Throughput: 0: 42381.0. Samples: 2345319720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 13:33:11,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 13:33:13,890][12883] Updated weights for policy 0, policy_version 143143 (0.0035) [2024-06-18 13:33:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 41780.4, 300 sec: 42487.3). Total num frames: 2345353216. Throughput: 0: 42334.2. Samples: 2345448180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:33:16,995][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 13:33:18,022][12883] Updated weights for policy 0, policy_version 143153 (0.0039) [2024-06-18 13:33:21,537][12883] Updated weights for policy 0, policy_version 143163 (0.0034) [2024-06-18 13:33:22,000][12645] Fps is (10 sec: 40934.2, 60 sec: 42320.9, 300 sec: 42597.5). Total num frames: 2345582592. Throughput: 0: 42328.9. Samples: 2345699300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:33:22,000][12645] Avg episode reward: [(0, '0.676')] [2024-06-18 13:33:25,574][12883] Updated weights for policy 0, policy_version 143173 (0.0029) [2024-06-18 13:33:26,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2345811968. Throughput: 0: 42595.5. Samples: 2345956900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:33:26,994][12645] Avg episode reward: [(0, '0.676')] [2024-06-18 13:33:29,113][12883] Updated weights for policy 0, policy_version 143183 (0.0035) [2024-06-18 13:33:31,994][12645] Fps is (10 sec: 42624.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2346008576. Throughput: 0: 42580.9. Samples: 2346089120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:33:31,994][12645] Avg episode reward: [(0, '0.621')] [2024-06-18 13:33:33,302][12883] Updated weights for policy 0, policy_version 143193 (0.0034) [2024-06-18 13:33:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2346221568. Throughput: 0: 42495.9. Samples: 2346339960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:33:36,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 13:33:37,174][12883] Updated weights for policy 0, policy_version 143203 (0.0027) [2024-06-18 13:33:37,916][12862] Signal inference workers to stop experience collection... 
(34300 times) [2024-06-18 13:33:37,916][12862] Signal inference workers to resume experience collection... (34300 times) [2024-06-18 13:33:37,968][12883] InferenceWorker_p0-w0: stopping experience collection (34300 times) [2024-06-18 13:33:37,968][12883] InferenceWorker_p0-w0: resuming experience collection (34300 times) [2024-06-18 13:33:41,216][12883] Updated weights for policy 0, policy_version 143213 (0.0032) [2024-06-18 13:33:41,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2346434560. Throughput: 0: 42581.8. Samples: 2346592480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:33:41,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 13:33:44,870][12883] Updated weights for policy 0, policy_version 143223 (0.0032) [2024-06-18 13:33:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2346631168. Throughput: 0: 42510.2. Samples: 2346719260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:33:46,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 13:33:49,054][12883] Updated weights for policy 0, policy_version 143233 (0.0040) [2024-06-18 13:33:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 2346860544. Throughput: 0: 42411.4. Samples: 2346974860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:33:51,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 13:33:52,669][12883] Updated weights for policy 0, policy_version 143243 (0.0027) [2024-06-18 13:33:56,516][12883] Updated weights for policy 0, policy_version 143253 (0.0040) [2024-06-18 13:33:56,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2347089920. Throughput: 0: 42629.2. Samples: 2347238040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:33:56,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 13:34:00,409][12883] Updated weights for policy 0, policy_version 143263 (0.0053) [2024-06-18 13:34:01,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2347270144. Throughput: 0: 42545.8. Samples: 2347362740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:34:01,994][12645] Avg episode reward: [(0, '0.248')] [2024-06-18 13:34:04,038][12883] Updated weights for policy 0, policy_version 143273 (0.0051) [2024-06-18 13:34:06,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2347515904. Throughput: 0: 42615.5. Samples: 2347616740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:34:06,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 13:34:08,014][12883] Updated weights for policy 0, policy_version 143283 (0.0028) [2024-06-18 13:34:11,850][12883] Updated weights for policy 0, policy_version 143293 (0.0034) [2024-06-18 13:34:11,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2347712512. Throughput: 0: 42549.4. Samples: 2347871620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:34:11,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 13:34:15,582][12883] Updated weights for policy 0, policy_version 143303 (0.0032) [2024-06-18 13:34:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2347925504. Throughput: 0: 42329.7. Samples: 2347993960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 13:34:16,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 13:34:19,512][12883] Updated weights for policy 0, policy_version 143313 (0.0041) [2024-06-18 13:34:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42876.0, 300 sec: 42598.4). Total num frames: 2348154880. Throughput: 0: 42549.4. Samples: 2348254680. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:34:21,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 13:34:23,615][12883] Updated weights for policy 0, policy_version 143323 (0.0046) [2024-06-18 13:34:26,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2348335104. Throughput: 0: 42615.6. Samples: 2348510180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:34:26,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 13:34:27,445][12883] Updated weights for policy 0, policy_version 143333 (0.0032) [2024-06-18 13:34:31,206][12883] Updated weights for policy 0, policy_version 143343 (0.0034) [2024-06-18 13:34:31,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2348564480. Throughput: 0: 42473.9. Samples: 2348630680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:34:31,997][12645] Avg episode reward: [(0, '0.397')] [2024-06-18 13:34:35,113][12883] Updated weights for policy 0, policy_version 143353 (0.0035) [2024-06-18 13:34:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42543.1). Total num frames: 2348777472. Throughput: 0: 42557.5. Samples: 2348889940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:34:36,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 13:34:37,018][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143359_2348793856.pth... [2024-06-18 13:34:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000142735_2338570240.pth [2024-06-18 13:34:38,704][12883] Updated weights for policy 0, policy_version 143363 (0.0032) [2024-06-18 13:34:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2348974080. Throughput: 0: 42371.6. Samples: 2349144760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:34:41,994][12645] Avg episode reward: [(0, '0.633')] [2024-06-18 13:34:43,042][12883] Updated weights for policy 0, policy_version 143373 (0.0038) [2024-06-18 13:34:46,633][12883] Updated weights for policy 0, policy_version 143383 (0.0039) [2024-06-18 13:34:46,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2349187072. Throughput: 0: 42284.6. Samples: 2349265640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:34:46,996][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 13:34:50,675][12883] Updated weights for policy 0, policy_version 143393 (0.0044) [2024-06-18 13:34:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42654.8). Total num frames: 2349432832. Throughput: 0: 42364.1. Samples: 2349523120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:34:51,994][12645] Avg episode reward: [(0, '0.757')] [2024-06-18 13:34:54,301][12883] Updated weights for policy 0, policy_version 143403 (0.0032) [2024-06-18 13:34:56,994][12645] Fps is (10 sec: 40968.7, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 2349596672. Throughput: 0: 42379.8. Samples: 2349778720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:34:56,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 13:34:58,270][12883] Updated weights for policy 0, policy_version 143413 (0.0038) [2024-06-18 13:35:01,876][12883] Updated weights for policy 0, policy_version 143423 (0.0038) [2024-06-18 13:35:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2349842432. Throughput: 0: 42281.8. Samples: 2349896640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:35:01,994][12645] Avg episode reward: [(0, '0.621')] [2024-06-18 13:35:06,189][12883] Updated weights for policy 0, policy_version 143433 (0.0054) [2024-06-18 13:35:06,994][12645] Fps is (10 sec: 45875.8, 60 sec: 42325.5, 300 sec: 42543.2). Total num frames: 2350055424. Throughput: 0: 42306.7. Samples: 2350158480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:35:06,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 13:35:09,796][12883] Updated weights for policy 0, policy_version 143443 (0.0026) [2024-06-18 13:35:11,996][12645] Fps is (10 sec: 39313.2, 60 sec: 42050.6, 300 sec: 42431.5). Total num frames: 2350235648. Throughput: 0: 42264.9. Samples: 2350412200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:35:11,997][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 13:35:13,966][12883] Updated weights for policy 0, policy_version 143453 (0.0038) [2024-06-18 13:35:15,214][12862] Signal inference workers to stop experience collection... (34350 times) [2024-06-18 13:35:15,215][12862] Signal inference workers to resume experience collection... (34350 times) [2024-06-18 13:35:15,239][12883] InferenceWorker_p0-w0: stopping experience collection (34350 times) [2024-06-18 13:35:15,240][12883] InferenceWorker_p0-w0: resuming experience collection (34350 times) [2024-06-18 13:35:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 2350465024. Throughput: 0: 42310.1. Samples: 2350534540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:35:16,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 13:35:17,583][12883] Updated weights for policy 0, policy_version 143463 (0.0034) [2024-06-18 13:35:21,519][12883] Updated weights for policy 0, policy_version 143473 (0.0046) [2024-06-18 13:35:21,994][12645] Fps is (10 sec: 45885.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2350694400. 
Throughput: 0: 42383.0. Samples: 2350797180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 13:35:21,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 13:35:25,163][12883] Updated weights for policy 0, policy_version 143483 (0.0032) [2024-06-18 13:35:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2350874624. Throughput: 0: 42400.0. Samples: 2351052760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:35:26,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 13:35:29,070][12883] Updated weights for policy 0, policy_version 143493 (0.0036) [2024-06-18 13:35:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2351120384. Throughput: 0: 42483.9. Samples: 2351177320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:35:31,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 13:35:32,666][12883] Updated weights for policy 0, policy_version 143503 (0.0038) [2024-06-18 13:35:36,728][12883] Updated weights for policy 0, policy_version 143513 (0.0033) [2024-06-18 13:35:36,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2351333376. Throughput: 0: 42701.3. Samples: 2351444680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:35:37,000][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 13:35:40,815][12883] Updated weights for policy 0, policy_version 143523 (0.0032) [2024-06-18 13:35:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2351529984. Throughput: 0: 42544.5. Samples: 2351693220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:35:41,994][12645] Avg episode reward: [(0, '0.611')] [2024-06-18 13:35:44,240][12883] Updated weights for policy 0, policy_version 143533 (0.0037) [2024-06-18 13:35:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2351759360. 
Throughput: 0: 42690.3. Samples: 2351817700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:35:46,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 13:35:48,531][12883] Updated weights for policy 0, policy_version 143543 (0.0038) [2024-06-18 13:35:51,784][12883] Updated weights for policy 0, policy_version 143553 (0.0031) [2024-06-18 13:35:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2351972352. Throughput: 0: 42761.7. Samples: 2352082760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:35:51,994][12645] Avg episode reward: [(0, '0.178')] [2024-06-18 13:35:56,173][12883] Updated weights for policy 0, policy_version 143563 (0.0042) [2024-06-18 13:35:56,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 2352152576. Throughput: 0: 42735.6. Samples: 2352335300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:35:56,996][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 13:35:59,425][12883] Updated weights for policy 0, policy_version 143573 (0.0032) [2024-06-18 13:36:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2352381952. Throughput: 0: 42571.5. Samples: 2352450260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:36:01,994][12645] Avg episode reward: [(0, '0.691')] [2024-06-18 13:36:04,317][12883] Updated weights for policy 0, policy_version 143583 (0.0031) [2024-06-18 13:36:06,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2352611328. Throughput: 0: 42613.9. Samples: 2352714800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:36:06,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 13:36:07,041][12883] Updated weights for policy 0, policy_version 143593 (0.0034) [2024-06-18 13:36:11,892][12883] Updated weights for policy 0, policy_version 143603 (0.0037) [2024-06-18 13:36:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42600.0, 300 sec: 42487.7). Total num frames: 2352791552. Throughput: 0: 42608.5. Samples: 2352970140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:36:11,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 13:36:14,641][12883] Updated weights for policy 0, policy_version 143613 (0.0032) [2024-06-18 13:36:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 2353004544. Throughput: 0: 42572.1. Samples: 2353093060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:36:16,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 13:36:19,307][12883] Updated weights for policy 0, policy_version 143623 (0.0033) [2024-06-18 13:36:21,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2353250304. Throughput: 0: 42467.1. Samples: 2353355700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 13:36:21,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 13:36:22,341][12883] Updated weights for policy 0, policy_version 143633 (0.0039) [2024-06-18 13:36:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2353430528. Throughput: 0: 42597.0. Samples: 2353610080. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:36:26,994][12645] Avg episode reward: [(0, '0.631')] [2024-06-18 13:36:27,049][12883] Updated weights for policy 0, policy_version 143643 (0.0032) [2024-06-18 13:36:29,963][12883] Updated weights for policy 0, policy_version 143653 (0.0024) [2024-06-18 13:36:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 2353659904. Throughput: 0: 42595.0. Samples: 2353734480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:36:31,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 13:36:34,927][12883] Updated weights for policy 0, policy_version 143663 (0.0031) [2024-06-18 13:36:36,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2353889280. Throughput: 0: 42497.9. Samples: 2353995160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:36:36,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 13:36:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143670_2353889280.pth... [2024-06-18 13:36:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143046_2343665664.pth [2024-06-18 13:36:37,825][12862] Signal inference workers to stop experience collection... (34400 times) [2024-06-18 13:36:37,825][12862] Signal inference workers to resume experience collection... (34400 times) [2024-06-18 13:36:37,872][12883] InferenceWorker_p0-w0: stopping experience collection (34400 times) [2024-06-18 13:36:37,873][12883] InferenceWorker_p0-w0: resuming experience collection (34400 times) [2024-06-18 13:36:37,959][12883] Updated weights for policy 0, policy_version 143673 (0.0023) [2024-06-18 13:36:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2354085888. Throughput: 0: 42680.8. Samples: 2354255840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:36:41,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 13:36:42,379][12883] Updated weights for policy 0, policy_version 143683 (0.0033) [2024-06-18 13:36:45,556][12883] Updated weights for policy 0, policy_version 143693 (0.0039) [2024-06-18 13:36:46,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2354315264. Throughput: 0: 42859.5. Samples: 2354378940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:36:46,994][12645] Avg episode reward: [(0, '0.642')] [2024-06-18 13:36:49,925][12883] Updated weights for policy 0, policy_version 143703 (0.0042) [2024-06-18 13:36:52,000][12645] Fps is (10 sec: 44208.9, 60 sec: 42594.0, 300 sec: 42597.5). Total num frames: 2354528256. Throughput: 0: 42669.1. Samples: 2354635180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:36:52,001][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 13:36:53,164][12883] Updated weights for policy 0, policy_version 143713 (0.0030) [2024-06-18 13:36:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42873.0, 300 sec: 42432.1). Total num frames: 2354724864. Throughput: 0: 42827.1. Samples: 2354897360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:36:56,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 13:36:57,333][12883] Updated weights for policy 0, policy_version 143723 (0.0053) [2024-06-18 13:37:01,051][12883] Updated weights for policy 0, policy_version 143733 (0.0033) [2024-06-18 13:37:01,994][12645] Fps is (10 sec: 42625.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2354954240. Throughput: 0: 42962.1. Samples: 2355026360. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:37:01,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 13:37:05,055][12883] Updated weights for policy 0, policy_version 143743 (0.0035) [2024-06-18 13:37:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 2355150848. Throughput: 0: 42816.4. Samples: 2355282440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:37:06,995][12645] Avg episode reward: [(0, '0.580')] [2024-06-18 13:37:08,607][12883] Updated weights for policy 0, policy_version 143753 (0.0034) [2024-06-18 13:37:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42432.0). Total num frames: 2355363840. Throughput: 0: 42826.2. Samples: 2355537260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:37:11,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 13:37:12,773][12883] Updated weights for policy 0, policy_version 143763 (0.0030) [2024-06-18 13:37:16,312][12883] Updated weights for policy 0, policy_version 143773 (0.0034) [2024-06-18 13:37:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2355593216. Throughput: 0: 42865.8. Samples: 2355663440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:37:16,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 13:37:20,441][12883] Updated weights for policy 0, policy_version 143783 (0.0032) [2024-06-18 13:37:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2355822592. Throughput: 0: 42819.6. Samples: 2355922040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:37:21,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 13:37:23,959][12883] Updated weights for policy 0, policy_version 143793 (0.0039) [2024-06-18 13:37:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 2356019200. Throughput: 0: 42695.4. Samples: 2356177140. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 13:37:26,994][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 13:37:28,132][12883] Updated weights for policy 0, policy_version 143803 (0.0033) [2024-06-18 13:37:31,582][12883] Updated weights for policy 0, policy_version 143813 (0.0022) [2024-06-18 13:37:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2356248576. Throughput: 0: 42726.8. Samples: 2356301640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:37:31,994][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 13:37:35,630][12883] Updated weights for policy 0, policy_version 143823 (0.0033) [2024-06-18 13:37:36,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2356477952. Throughput: 0: 42962.3. Samples: 2356568220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:37:36,994][12645] Avg episode reward: [(0, '0.789')] [2024-06-18 13:37:39,120][12883] Updated weights for policy 0, policy_version 143833 (0.0030) [2024-06-18 13:37:41,998][12645] Fps is (10 sec: 39303.4, 60 sec: 42595.1, 300 sec: 42542.2). Total num frames: 2356641792. Throughput: 0: 42626.7. Samples: 2356815760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:37:41,999][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 13:37:42,096][12862] Signal inference workers to stop experience collection... (34450 times) [2024-06-18 13:37:42,152][12862] Signal inference workers to resume experience collection... 
(34450 times) [2024-06-18 13:37:42,152][12883] InferenceWorker_p0-w0: stopping experience collection (34450 times) [2024-06-18 13:37:42,173][12883] InferenceWorker_p0-w0: resuming experience collection (34450 times) [2024-06-18 13:37:43,244][12883] Updated weights for policy 0, policy_version 143843 (0.0026) [2024-06-18 13:37:46,710][12883] Updated weights for policy 0, policy_version 143853 (0.0028) [2024-06-18 13:37:46,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2356887552. Throughput: 0: 42599.1. Samples: 2356943320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:37:46,994][12645] Avg episode reward: [(0, '0.718')] [2024-06-18 13:37:50,798][12883] Updated weights for policy 0, policy_version 143863 (0.0037) [2024-06-18 13:37:51,994][12645] Fps is (10 sec: 44257.6, 60 sec: 42602.9, 300 sec: 42542.9). Total num frames: 2357084160. Throughput: 0: 42701.6. Samples: 2357204000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:37:51,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 13:37:54,558][12883] Updated weights for policy 0, policy_version 143873 (0.0033) [2024-06-18 13:37:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2357297152. Throughput: 0: 42763.0. Samples: 2357461600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:37:56,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 13:37:58,850][12883] Updated weights for policy 0, policy_version 143883 (0.0038) [2024-06-18 13:38:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2357510144. Throughput: 0: 42711.6. Samples: 2357585460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:38:01,994][12645] Avg episode reward: [(0, '0.561')] [2024-06-18 13:38:02,202][12883] Updated weights for policy 0, policy_version 143893 (0.0039) [2024-06-18 13:38:06,381][12883] Updated weights for policy 0, policy_version 143903 (0.0032) [2024-06-18 13:38:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.7, 300 sec: 42542.9). Total num frames: 2357723136. Throughput: 0: 42828.4. Samples: 2357849320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:38:06,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 13:38:09,845][12883] Updated weights for policy 0, policy_version 143913 (0.0035) [2024-06-18 13:38:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2357936128. Throughput: 0: 42789.0. Samples: 2358102640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:38:11,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 13:38:13,942][12883] Updated weights for policy 0, policy_version 143923 (0.0035) [2024-06-18 13:38:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.9). Total num frames: 2358165504. Throughput: 0: 42800.5. Samples: 2358227660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:38:16,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 13:38:17,458][12883] Updated weights for policy 0, policy_version 143933 (0.0030) [2024-06-18 13:38:21,709][12883] Updated weights for policy 0, policy_version 143943 (0.0031) [2024-06-18 13:38:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2358362112. Throughput: 0: 42721.0. Samples: 2358490660. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:38:21,994][12645] Avg episode reward: [(0, '0.238')] [2024-06-18 13:38:25,350][12883] Updated weights for policy 0, policy_version 143953 (0.0038) [2024-06-18 13:38:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2358591488. Throughput: 0: 42928.5. Samples: 2358747340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:38:26,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 13:38:29,206][12883] Updated weights for policy 0, policy_version 143963 (0.0031) [2024-06-18 13:38:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2358820864. Throughput: 0: 42984.1. Samples: 2358877600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 13:38:31,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 13:38:32,905][12883] Updated weights for policy 0, policy_version 143973 (0.0037) [2024-06-18 13:38:36,866][12883] Updated weights for policy 0, policy_version 143983 (0.0032) [2024-06-18 13:38:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 2359017472. Throughput: 0: 42840.5. Samples: 2359131820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:38:36,994][12645] Avg episode reward: [(0, '0.641')] [2024-06-18 13:38:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143983_2359017472.pth... [2024-06-18 13:38:37,064][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143359_2348793856.pth [2024-06-18 13:38:40,482][12883] Updated weights for policy 0, policy_version 143993 (0.0031) [2024-06-18 13:38:41,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43147.9, 300 sec: 42709.5). Total num frames: 2359230464. Throughput: 0: 42833.0. Samples: 2359389080. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:38:41,994][12645] Avg episode reward: [(0, '0.641')] [2024-06-18 13:38:44,449][12883] Updated weights for policy 0, policy_version 144003 (0.0025) [2024-06-18 13:38:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2359459840. Throughput: 0: 42958.1. Samples: 2359518580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:38:46,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 13:38:48,142][12883] Updated weights for policy 0, policy_version 144013 (0.0039) [2024-06-18 13:38:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2359656448. Throughput: 0: 42753.4. Samples: 2359773220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:38:51,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 13:38:52,134][12883] Updated weights for policy 0, policy_version 144023 (0.0042) [2024-06-18 13:38:56,166][12883] Updated weights for policy 0, policy_version 144033 (0.0033) [2024-06-18 13:38:56,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2359869440. Throughput: 0: 42716.1. Samples: 2360024860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:38:56,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 13:38:59,889][12883] Updated weights for policy 0, policy_version 144043 (0.0045) [2024-06-18 13:39:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2360082432. Throughput: 0: 42797.3. Samples: 2360153540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:39:01,994][12645] Avg episode reward: [(0, '0.713')] [2024-06-18 13:39:03,712][12883] Updated weights for policy 0, policy_version 144053 (0.0040) [2024-06-18 13:39:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2360295424. Throughput: 0: 42608.9. Samples: 2360408060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:39:06,996][12645] Avg episode reward: [(0, '0.380')] [2024-06-18 13:39:07,381][12883] Updated weights for policy 0, policy_version 144063 (0.0032) [2024-06-18 13:39:11,440][12883] Updated weights for policy 0, policy_version 144073 (0.0043) [2024-06-18 13:39:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2360524800. Throughput: 0: 42553.4. Samples: 2360662240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:39:11,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 13:39:15,345][12883] Updated weights for policy 0, policy_version 144083 (0.0054) [2024-06-18 13:39:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2360705024. Throughput: 0: 42416.4. Samples: 2360786340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:39:16,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 13:39:19,349][12883] Updated weights for policy 0, policy_version 144093 (0.0023) [2024-06-18 13:39:19,631][12862] Signal inference workers to stop experience collection... (34500 times) [2024-06-18 13:39:19,679][12883] InferenceWorker_p0-w0: stopping experience collection (34500 times) [2024-06-18 13:39:19,684][12862] Signal inference workers to resume experience collection... (34500 times) [2024-06-18 13:39:19,693][12883] InferenceWorker_p0-w0: resuming experience collection (34500 times) [2024-06-18 13:39:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2360934400. Throughput: 0: 42518.7. Samples: 2361045160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:39:21,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 13:39:22,806][12883] Updated weights for policy 0, policy_version 144103 (0.0027) [2024-06-18 13:39:26,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2361131008. 
Throughput: 0: 42588.1. Samples: 2361305540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:39:26,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 13:39:27,055][12883] Updated weights for policy 0, policy_version 144113 (0.0030) [2024-06-18 13:39:30,581][12883] Updated weights for policy 0, policy_version 144123 (0.0026) [2024-06-18 13:39:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2361360384. Throughput: 0: 42424.9. Samples: 2361427700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:39:31,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 13:39:34,499][12883] Updated weights for policy 0, policy_version 144133 (0.0033) [2024-06-18 13:39:36,996][12645] Fps is (10 sec: 45864.4, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2361589760. Throughput: 0: 42580.0. Samples: 2361689420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:39:36,997][12645] Avg episode reward: [(0, '0.722')] [2024-06-18 13:39:38,137][12883] Updated weights for policy 0, policy_version 144143 (0.0039) [2024-06-18 13:39:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2361769984. Throughput: 0: 42696.4. Samples: 2361946200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:39:41,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 13:39:42,210][12883] Updated weights for policy 0, policy_version 144153 (0.0033) [2024-06-18 13:39:45,755][12883] Updated weights for policy 0, policy_version 144163 (0.0040) [2024-06-18 13:39:46,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2361999360. Throughput: 0: 42580.3. Samples: 2362069660. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:39:46,994][12645] Avg episode reward: [(0, '0.660')] [2024-06-18 13:39:49,730][12883] Updated weights for policy 0, policy_version 144173 (0.0043) [2024-06-18 13:39:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2362212352. Throughput: 0: 42697.0. Samples: 2362329420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:39:51,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 13:39:53,327][12883] Updated weights for policy 0, policy_version 144183 (0.0026) [2024-06-18 13:39:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 2362425344. Throughput: 0: 42758.0. Samples: 2362586360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:39:57,008][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 13:39:57,707][12883] Updated weights for policy 0, policy_version 144193 (0.0026) [2024-06-18 13:40:00,929][12883] Updated weights for policy 0, policy_version 144203 (0.0028) [2024-06-18 13:40:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2362638336. Throughput: 0: 42838.8. Samples: 2362714080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:40:01,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 13:40:05,213][12883] Updated weights for policy 0, policy_version 144213 (0.0029) [2024-06-18 13:40:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.9). Total num frames: 2362867712. Throughput: 0: 42791.4. Samples: 2362970780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:40:06,994][12645] Avg episode reward: [(0, '0.709')] [2024-06-18 13:40:08,755][12883] Updated weights for policy 0, policy_version 144223 (0.0025) [2024-06-18 13:40:11,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2363047936. Throughput: 0: 42737.2. Samples: 2363228720. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:40:11,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 13:40:12,850][12883] Updated weights for policy 0, policy_version 144233 (0.0048) [2024-06-18 13:40:16,272][12883] Updated weights for policy 0, policy_version 144243 (0.0036) [2024-06-18 13:40:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2363293696. Throughput: 0: 42813.8. Samples: 2363354320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:40:16,994][12645] Avg episode reward: [(0, '0.711')] [2024-06-18 13:40:20,484][12883] Updated weights for policy 0, policy_version 144253 (0.0024) [2024-06-18 13:40:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2363490304. Throughput: 0: 42648.3. Samples: 2363608500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:40:21,994][12645] Avg episode reward: [(0, '0.193')] [2024-06-18 13:40:23,927][12883] Updated weights for policy 0, policy_version 144263 (0.0040) [2024-06-18 13:40:26,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2363670528. Throughput: 0: 42769.7. Samples: 2363870840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:40:26,994][12645] Avg episode reward: [(0, '0.705')] [2024-06-18 13:40:28,574][12883] Updated weights for policy 0, policy_version 144273 (0.0031) [2024-06-18 13:40:31,417][12883] Updated weights for policy 0, policy_version 144283 (0.0042) [2024-06-18 13:40:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2363932672. Throughput: 0: 42689.5. Samples: 2363990680. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:40:31,994][12645] Avg episode reward: [(0, '0.805')] [2024-06-18 13:40:36,031][12883] Updated weights for policy 0, policy_version 144293 (0.0024) [2024-06-18 13:40:36,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42327.0, 300 sec: 42709.5). Total num frames: 2364129280. Throughput: 0: 42695.1. Samples: 2364250700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 13:40:36,994][12645] Avg episode reward: [(0, '0.622')] [2024-06-18 13:40:37,073][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144296_2364145664.pth... [2024-06-18 13:40:37,117][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143670_2353889280.pth [2024-06-18 13:40:39,348][12883] Updated weights for policy 0, policy_version 144303 (0.0048) [2024-06-18 13:40:41,996][12645] Fps is (10 sec: 37674.9, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 2364309504. Throughput: 0: 42726.0. Samples: 2364509120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:40:41,996][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 13:40:43,586][12883] Updated weights for policy 0, policy_version 144313 (0.0027) [2024-06-18 13:40:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2364571648. Throughput: 0: 42663.1. Samples: 2364633920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:40:46,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 13:40:47,048][12883] Updated weights for policy 0, policy_version 144323 (0.0036) [2024-06-18 13:40:51,415][12883] Updated weights for policy 0, policy_version 144333 (0.0039) [2024-06-18 13:40:51,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 2364768256. Throughput: 0: 42747.6. Samples: 2364894420. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:40:51,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 13:40:54,625][12862] Signal inference workers to stop experience collection... (34550 times) [2024-06-18 13:40:54,626][12862] Signal inference workers to resume experience collection... (34550 times) [2024-06-18 13:40:54,674][12883] InferenceWorker_p0-w0: stopping experience collection (34550 times) [2024-06-18 13:40:54,674][12883] InferenceWorker_p0-w0: resuming experience collection (34550 times) [2024-06-18 13:40:54,769][12883] Updated weights for policy 0, policy_version 144343 (0.0037) [2024-06-18 13:40:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2364964864. Throughput: 0: 42612.3. Samples: 2365146280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:40:56,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 13:40:59,067][12883] Updated weights for policy 0, policy_version 144353 (0.0038) [2024-06-18 13:41:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2365210624. Throughput: 0: 42744.1. Samples: 2365277800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:41:01,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 13:41:02,227][12883] Updated weights for policy 0, policy_version 144363 (0.0028) [2024-06-18 13:41:06,527][12883] Updated weights for policy 0, policy_version 144373 (0.0024) [2024-06-18 13:41:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2365407232. Throughput: 0: 42821.8. Samples: 2365535480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:41:06,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 13:41:10,457][12883] Updated weights for policy 0, policy_version 144383 (0.0021) [2024-06-18 13:41:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2365620224. 
Throughput: 0: 42631.1. Samples: 2365789240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:41:11,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 13:41:14,165][12883] Updated weights for policy 0, policy_version 144393 (0.0035) [2024-06-18 13:41:16,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2365865984. Throughput: 0: 42832.7. Samples: 2365918160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:41:16,995][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 13:41:18,051][12883] Updated weights for policy 0, policy_version 144403 (0.0031) [2024-06-18 13:41:21,764][12883] Updated weights for policy 0, policy_version 144413 (0.0037) [2024-06-18 13:41:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2366062592. Throughput: 0: 42809.7. Samples: 2366177140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:41:21,994][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 13:41:25,695][12883] Updated weights for policy 0, policy_version 144423 (0.0028) [2024-06-18 13:41:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2366275584. Throughput: 0: 42630.0. Samples: 2366427380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:41:26,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 13:41:29,646][12883] Updated weights for policy 0, policy_version 144433 (0.0042) [2024-06-18 13:41:31,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2366504960. Throughput: 0: 42715.6. Samples: 2366556120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:41:31,994][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 13:41:33,350][12883] Updated weights for policy 0, policy_version 144443 (0.0031) [2024-06-18 13:41:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2366685184. 
Throughput: 0: 42642.2. Samples: 2366813320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:41:36,994][12645] Avg episode reward: [(0, '0.657')] [2024-06-18 13:41:37,469][12883] Updated weights for policy 0, policy_version 144453 (0.0033) [2024-06-18 13:41:41,004][12883] Updated weights for policy 0, policy_version 144463 (0.0031) [2024-06-18 13:41:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43419.2, 300 sec: 42709.5). Total num frames: 2366914560. Throughput: 0: 42669.4. Samples: 2367066400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 21.0) [2024-06-18 13:41:41,994][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 13:41:45,090][12883] Updated weights for policy 0, policy_version 144473 (0.0035) [2024-06-18 13:41:46,995][12645] Fps is (10 sec: 42594.6, 60 sec: 42324.7, 300 sec: 42654.7). Total num frames: 2367111168. Throughput: 0: 42577.7. Samples: 2367193840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:41:46,995][12645] Avg episode reward: [(0, '0.680')] [2024-06-18 13:41:48,593][12883] Updated weights for policy 0, policy_version 144483 (0.0032) [2024-06-18 13:41:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2367340544. Throughput: 0: 42585.3. Samples: 2367451820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:41:51,994][12645] Avg episode reward: [(0, '0.702')] [2024-06-18 13:41:52,766][12883] Updated weights for policy 0, policy_version 144493 (0.0029) [2024-06-18 13:41:56,467][12883] Updated weights for policy 0, policy_version 144503 (0.0038) [2024-06-18 13:41:56,994][12645] Fps is (10 sec: 42602.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2367537152. Throughput: 0: 42424.8. Samples: 2367698360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:41:56,994][12645] Avg episode reward: [(0, '0.748')] [2024-06-18 13:42:00,503][12883] Updated weights for policy 0, policy_version 144513 (0.0038) [2024-06-18 13:42:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2367750144. Throughput: 0: 42472.1. Samples: 2367829400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:42:01,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 13:42:04,378][12883] Updated weights for policy 0, policy_version 144523 (0.0037) [2024-06-18 13:42:06,996][12645] Fps is (10 sec: 42589.6, 60 sec: 42596.9, 300 sec: 42709.2). Total num frames: 2367963136. Throughput: 0: 42354.0. Samples: 2368083160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:42:06,996][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 13:42:08,227][12883] Updated weights for policy 0, policy_version 144533 (0.0042) [2024-06-18 13:42:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2368159744. Throughput: 0: 42294.3. Samples: 2368330620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:42:11,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 13:42:12,186][12883] Updated weights for policy 0, policy_version 144543 (0.0037) [2024-06-18 13:42:16,030][12883] Updated weights for policy 0, policy_version 144553 (0.0035) [2024-06-18 13:42:16,994][12645] Fps is (10 sec: 42607.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2368389120. Throughput: 0: 42300.8. Samples: 2368459660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:42:16,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 13:42:18,262][12862] Signal inference workers to stop experience collection... 
(34600 times) [2024-06-18 13:42:18,300][12883] InferenceWorker_p0-w0: stopping experience collection (34600 times) [2024-06-18 13:42:18,322][12862] Signal inference workers to resume experience collection... (34600 times) [2024-06-18 13:42:18,323][12883] InferenceWorker_p0-w0: resuming experience collection (34600 times) [2024-06-18 13:42:20,371][12883] Updated weights for policy 0, policy_version 144563 (0.0027) [2024-06-18 13:42:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 2368585728. Throughput: 0: 42202.3. Samples: 2368712420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:42:21,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 13:42:23,635][12883] Updated weights for policy 0, policy_version 144573 (0.0035) [2024-06-18 13:42:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2368815104. Throughput: 0: 42200.9. Samples: 2368965440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:42:26,994][12645] Avg episode reward: [(0, '0.672')] [2024-06-18 13:42:28,082][12883] Updated weights for policy 0, policy_version 144583 (0.0024) [2024-06-18 13:42:31,198][12883] Updated weights for policy 0, policy_version 144593 (0.0040) [2024-06-18 13:42:31,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2369044480. Throughput: 0: 42344.3. Samples: 2369099300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:42:31,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 13:42:35,654][12883] Updated weights for policy 0, policy_version 144603 (0.0027) [2024-06-18 13:42:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42599.1). Total num frames: 2369208320. Throughput: 0: 42167.1. Samples: 2369349340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:42:36,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 13:42:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144605_2369208320.pth... [2024-06-18 13:42:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000143983_2359017472.pth [2024-06-18 13:42:38,811][12883] Updated weights for policy 0, policy_version 144613 (0.0031) [2024-06-18 13:42:41,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2369437696. Throughput: 0: 42210.4. Samples: 2369597820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 13:42:41,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 13:42:43,302][12883] Updated weights for policy 0, policy_version 144623 (0.0032) [2024-06-18 13:42:46,630][12883] Updated weights for policy 0, policy_version 144633 (0.0027) [2024-06-18 13:42:46,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.9, 300 sec: 42653.9). Total num frames: 2369667072. Throughput: 0: 42243.5. Samples: 2369730360. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:42:46,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 13:42:50,882][12883] Updated weights for policy 0, policy_version 144643 (0.0038) [2024-06-18 13:42:51,996][12645] Fps is (10 sec: 40950.6, 60 sec: 41777.7, 300 sec: 42542.6). Total num frames: 2369847296. Throughput: 0: 42196.8. Samples: 2369982020. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:42:51,996][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 13:42:54,270][12883] Updated weights for policy 0, policy_version 144653 (0.0043) [2024-06-18 13:42:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2370060288. Throughput: 0: 42340.4. Samples: 2370235940. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:42:56,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 13:42:58,572][12883] Updated weights for policy 0, policy_version 144663 (0.0045) [2024-06-18 13:43:01,994][12645] Fps is (10 sec: 45885.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2370306048. Throughput: 0: 42225.1. Samples: 2370359780. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:01,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 13:43:02,083][12883] Updated weights for policy 0, policy_version 144673 (0.0035) [2024-06-18 13:43:06,206][12883] Updated weights for policy 0, policy_version 144683 (0.0043) [2024-06-18 13:43:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 2370502656. Throughput: 0: 42262.9. Samples: 2370614260. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:06,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 13:43:10,247][12883] Updated weights for policy 0, policy_version 144693 (0.0035) [2024-06-18 13:43:11,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2370699264. Throughput: 0: 42255.1. Samples: 2370866920. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:11,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 13:43:14,098][12883] Updated weights for policy 0, policy_version 144703 (0.0031) [2024-06-18 13:43:16,996][12645] Fps is (10 sec: 44227.4, 60 sec: 42596.9, 300 sec: 42653.6). Total num frames: 2370945024. Throughput: 0: 42103.4. Samples: 2370994040. 
Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:16,996][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 13:43:17,741][12883] Updated weights for policy 0, policy_version 144713 (0.0030) [2024-06-18 13:43:21,985][12883] Updated weights for policy 0, policy_version 144723 (0.0039) [2024-06-18 13:43:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2371141632. Throughput: 0: 42361.9. Samples: 2371255620. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:22,008][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 13:43:24,137][12862] Signal inference workers to stop experience collection... (34650 times) [2024-06-18 13:43:24,192][12883] InferenceWorker_p0-w0: stopping experience collection (34650 times) [2024-06-18 13:43:24,252][12862] Signal inference workers to resume experience collection... (34650 times) [2024-06-18 13:43:24,252][12883] InferenceWorker_p0-w0: resuming experience collection (34650 times) [2024-06-18 13:43:25,583][12883] Updated weights for policy 0, policy_version 144733 (0.0028) [2024-06-18 13:43:26,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2371354624. Throughput: 0: 42473.7. Samples: 2371509140. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:26,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 13:43:29,729][12883] Updated weights for policy 0, policy_version 144743 (0.0031) [2024-06-18 13:43:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2371567616. Throughput: 0: 42566.0. Samples: 2371645820. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:31,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 13:43:33,185][12883] Updated weights for policy 0, policy_version 144753 (0.0033) [2024-06-18 13:43:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2371764224. 
Throughput: 0: 42570.6. Samples: 2371897600. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:36,994][12645] Avg episode reward: [(0, '0.688')] [2024-06-18 13:43:37,530][12883] Updated weights for policy 0, policy_version 144763 (0.0036) [2024-06-18 13:43:40,779][12883] Updated weights for policy 0, policy_version 144773 (0.0030) [2024-06-18 13:43:41,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 2372009984. Throughput: 0: 42479.4. Samples: 2372147520. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:41,995][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 13:43:45,226][12883] Updated weights for policy 0, policy_version 144783 (0.0040) [2024-06-18 13:43:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2372206592. Throughput: 0: 42783.9. Samples: 2372285060. Policy #0 lag: (min: 1.0, avg: 11.8, max: 26.0) [2024-06-18 13:43:46,994][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 13:43:48,175][12883] Updated weights for policy 0, policy_version 144793 (0.0036) [2024-06-18 13:43:51,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42873.0, 300 sec: 42542.8). Total num frames: 2372419584. Throughput: 0: 42820.9. Samples: 2372541200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:43:51,994][12645] Avg episode reward: [(0, '0.199')] [2024-06-18 13:43:52,755][12883] Updated weights for policy 0, policy_version 144803 (0.0036) [2024-06-18 13:43:55,600][12883] Updated weights for policy 0, policy_version 144813 (0.0029) [2024-06-18 13:43:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2372648960. Throughput: 0: 42928.5. Samples: 2372798700. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:43:56,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 13:44:00,357][12883] Updated weights for policy 0, policy_version 144823 (0.0032) [2024-06-18 13:44:01,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42596.7, 300 sec: 42598.1). Total num frames: 2372861952. Throughput: 0: 43093.7. Samples: 2372933260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:01,997][12645] Avg episode reward: [(0, '0.247')] [2024-06-18 13:44:03,311][12883] Updated weights for policy 0, policy_version 144833 (0.0033) [2024-06-18 13:44:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2373074944. Throughput: 0: 43007.0. Samples: 2373190940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:06,994][12645] Avg episode reward: [(0, '0.331')] [2024-06-18 13:44:07,852][12883] Updated weights for policy 0, policy_version 144843 (0.0037) [2024-06-18 13:44:10,974][12883] Updated weights for policy 0, policy_version 144853 (0.0054) [2024-06-18 13:44:11,994][12645] Fps is (10 sec: 44246.2, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2373304320. Throughput: 0: 43016.2. Samples: 2373444880. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:12,000][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 13:44:15,396][12883] Updated weights for policy 0, policy_version 144863 (0.0041) [2024-06-18 13:44:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42599.8, 300 sec: 42598.4). Total num frames: 2373500928. Throughput: 0: 42873.5. Samples: 2373575140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:16,995][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 13:44:18,613][12883] Updated weights for policy 0, policy_version 144873 (0.0040) [2024-06-18 13:44:21,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2373697536. Throughput: 0: 43005.2. Samples: 2373832840. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:21,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 13:44:23,366][12883] Updated weights for policy 0, policy_version 144883 (0.0029) [2024-06-18 13:44:26,321][12883] Updated weights for policy 0, policy_version 144893 (0.0029) [2024-06-18 13:44:26,994][12645] Fps is (10 sec: 45876.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2373959680. Throughput: 0: 42918.4. Samples: 2374078840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:26,994][12645] Avg episode reward: [(0, '0.594')] [2024-06-18 13:44:30,995][12883] Updated weights for policy 0, policy_version 144903 (0.0033) [2024-06-18 13:44:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42487.7). Total num frames: 2374123520. Throughput: 0: 42869.8. Samples: 2374214200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:31,994][12645] Avg episode reward: [(0, '0.594')] [2024-06-18 13:44:33,943][12883] Updated weights for policy 0, policy_version 144913 (0.0036) [2024-06-18 13:44:36,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2374352896. Throughput: 0: 42796.9. Samples: 2374467060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:36,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 13:44:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144919_2374352896.pth... [2024-06-18 13:44:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144296_2364145664.pth [2024-06-18 13:44:38,743][12883] Updated weights for policy 0, policy_version 144923 (0.0033) [2024-06-18 13:44:41,801][12883] Updated weights for policy 0, policy_version 144933 (0.0028) [2024-06-18 13:44:41,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2374582272. Throughput: 0: 42759.0. Samples: 2374722860. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:41,994][12645] Avg episode reward: [(0, '0.621')] [2024-06-18 13:44:46,159][12883] Updated weights for policy 0, policy_version 144943 (0.0026) [2024-06-18 13:44:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2374778880. Throughput: 0: 42710.2. Samples: 2374855120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:46,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 13:44:49,565][12883] Updated weights for policy 0, policy_version 144953 (0.0029) [2024-06-18 13:44:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2374991872. Throughput: 0: 42596.9. Samples: 2375107800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 13:44:51,995][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 13:44:54,065][12883] Updated weights for policy 0, policy_version 144963 (0.0023) [2024-06-18 13:44:55,917][12862] Signal inference workers to stop experience collection... (34700 times) [2024-06-18 13:44:55,917][12862] Signal inference workers to resume experience collection... (34700 times) [2024-06-18 13:44:55,949][12883] InferenceWorker_p0-w0: stopping experience collection (34700 times) [2024-06-18 13:44:55,949][12883] InferenceWorker_p0-w0: resuming experience collection (34700 times) [2024-06-18 13:44:56,996][12645] Fps is (10 sec: 44227.1, 60 sec: 42869.8, 300 sec: 42653.6). Total num frames: 2375221248. Throughput: 0: 42629.2. Samples: 2375363280. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:44:56,996][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 13:44:57,122][12883] Updated weights for policy 0, policy_version 144973 (0.0030) [2024-06-18 13:45:01,775][12883] Updated weights for policy 0, policy_version 144983 (0.0036) [2024-06-18 13:45:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42326.9, 300 sec: 42487.3). Total num frames: 2375401472. 
Throughput: 0: 42663.2. Samples: 2375494980. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:01,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 13:45:04,685][12883] Updated weights for policy 0, policy_version 144993 (0.0039) [2024-06-18 13:45:06,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2375630848. Throughput: 0: 42556.4. Samples: 2375747880. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:06,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 13:45:09,543][12883] Updated weights for policy 0, policy_version 145003 (0.0045) [2024-06-18 13:45:11,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 2375860224. Throughput: 0: 42750.7. Samples: 2376002620. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:11,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 13:45:12,521][12883] Updated weights for policy 0, policy_version 145013 (0.0044) [2024-06-18 13:45:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2376040448. Throughput: 0: 42726.6. Samples: 2376136900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:16,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 13:45:17,121][12883] Updated weights for policy 0, policy_version 145023 (0.0038) [2024-06-18 13:45:20,136][12883] Updated weights for policy 0, policy_version 145033 (0.0031) [2024-06-18 13:45:21,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2376269824. Throughput: 0: 42591.9. Samples: 2376383700. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:21,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 13:45:24,663][12883] Updated weights for policy 0, policy_version 145043 (0.0034) [2024-06-18 13:45:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2376499200. 
Throughput: 0: 42877.7. Samples: 2376652360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:26,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 13:45:27,695][12883] Updated weights for policy 0, policy_version 145053 (0.0031) [2024-06-18 13:45:31,994][12645] Fps is (10 sec: 42599.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2376695808. Throughput: 0: 42794.4. Samples: 2376780860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:31,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 13:45:32,119][12883] Updated weights for policy 0, policy_version 145063 (0.0042) [2024-06-18 13:45:35,334][12883] Updated weights for policy 0, policy_version 145073 (0.0029) [2024-06-18 13:45:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2376925184. Throughput: 0: 42713.3. Samples: 2377029900. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:36,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 13:45:39,570][12883] Updated weights for policy 0, policy_version 145083 (0.0030) [2024-06-18 13:45:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2377138176. Throughput: 0: 43045.8. Samples: 2377300240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:41,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 13:45:42,760][12883] Updated weights for policy 0, policy_version 145093 (0.0038) [2024-06-18 13:45:46,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 2377351168. Throughput: 0: 42868.6. Samples: 2377424060. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:46,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 13:45:47,104][12883] Updated weights for policy 0, policy_version 145103 (0.0035) [2024-06-18 13:45:50,703][12883] Updated weights for policy 0, policy_version 145113 (0.0040) [2024-06-18 13:45:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2377580544. Throughput: 0: 42972.9. Samples: 2377681660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-18 13:45:51,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 13:45:54,918][12883] Updated weights for policy 0, policy_version 145123 (0.0031) [2024-06-18 13:45:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42600.1, 300 sec: 42598.4). Total num frames: 2377777152. Throughput: 0: 43187.6. Samples: 2377946060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:45:56,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 13:45:58,261][12883] Updated weights for policy 0, policy_version 145133 (0.0047) [2024-06-18 13:46:00,579][12862] Signal inference workers to stop experience collection... (34750 times) [2024-06-18 13:46:00,580][12862] Signal inference workers to resume experience collection... (34750 times) [2024-06-18 13:46:00,591][12883] InferenceWorker_p0-w0: stopping experience collection (34750 times) [2024-06-18 13:46:00,621][12883] InferenceWorker_p0-w0: resuming experience collection (34750 times) [2024-06-18 13:46:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2377990144. Throughput: 0: 42945.2. Samples: 2378069440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:01,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 13:46:02,522][12883] Updated weights for policy 0, policy_version 145143 (0.0023) [2024-06-18 13:46:05,891][12883] Updated weights for policy 0, policy_version 145153 (0.0045) [2024-06-18 13:46:06,994][12645] Fps is (10 sec: 45874.1, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2378235904. Throughput: 0: 43276.0. Samples: 2378331120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:06,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 13:46:10,017][12883] Updated weights for policy 0, policy_version 145163 (0.0021) [2024-06-18 13:46:11,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2378416128. Throughput: 0: 42932.9. Samples: 2378584340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:11,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 13:46:13,694][12883] Updated weights for policy 0, policy_version 145173 (0.0038) [2024-06-18 13:46:16,994][12645] Fps is (10 sec: 39322.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2378629120. Throughput: 0: 42758.2. Samples: 2378704980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:16,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 13:46:17,489][12883] Updated weights for policy 0, policy_version 145183 (0.0029) [2024-06-18 13:46:21,157][12883] Updated weights for policy 0, policy_version 145193 (0.0030) [2024-06-18 13:46:21,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43417.8, 300 sec: 42709.5). Total num frames: 2378874880. Throughput: 0: 43179.8. Samples: 2378972980. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:21,994][12645] Avg episode reward: [(0, '0.283')] [2024-06-18 13:46:25,254][12883] Updated weights for policy 0, policy_version 145203 (0.0031) [2024-06-18 13:46:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2379055104. Throughput: 0: 42925.8. Samples: 2379231900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:26,994][12645] Avg episode reward: [(0, '0.580')] [2024-06-18 13:46:28,759][12883] Updated weights for policy 0, policy_version 145213 (0.0042) [2024-06-18 13:46:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2379284480. Throughput: 0: 42864.9. Samples: 2379352980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:31,994][12645] Avg episode reward: [(0, '0.802')] [2024-06-18 13:46:32,848][12883] Updated weights for policy 0, policy_version 145223 (0.0039) [2024-06-18 13:46:36,277][12883] Updated weights for policy 0, policy_version 145233 (0.0035) [2024-06-18 13:46:36,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 2379530240. Throughput: 0: 43064.1. Samples: 2379619540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:36,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 13:46:37,106][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145236_2379546624.pth... [2024-06-18 13:46:37,154][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144605_2369208320.pth [2024-06-18 13:46:41,050][12883] Updated weights for policy 0, policy_version 145243 (0.0036) [2024-06-18 13:46:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 2379694080. Throughput: 0: 42811.5. Samples: 2379872580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:41,994][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 13:46:43,988][12883] Updated weights for policy 0, policy_version 145253 (0.0037) [2024-06-18 13:46:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2379923456. Throughput: 0: 42756.2. Samples: 2379993460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:46,994][12645] Avg episode reward: [(0, '0.085')] [2024-06-18 13:46:48,605][12883] Updated weights for policy 0, policy_version 145263 (0.0032) [2024-06-18 13:46:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2380136448. Throughput: 0: 42758.8. Samples: 2380255260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:51,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 13:46:52,090][12883] Updated weights for policy 0, policy_version 145273 (0.0037) [2024-06-18 13:46:56,193][12883] Updated weights for policy 0, policy_version 145283 (0.0032) [2024-06-18 13:46:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2380333056. Throughput: 0: 42888.9. Samples: 2380514340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 13:46:56,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 13:46:59,722][12883] Updated weights for policy 0, policy_version 145293 (0.0035) [2024-06-18 13:47:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2380562432. Throughput: 0: 42960.8. Samples: 2380638220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:01,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 13:47:04,157][12883] Updated weights for policy 0, policy_version 145303 (0.0045) [2024-06-18 13:47:06,437][12862] Signal inference workers to stop experience collection... 
(34800 times) [2024-06-18 13:47:06,444][12862] Signal inference workers to resume experience collection... (34800 times) [2024-06-18 13:47:06,487][12883] InferenceWorker_p0-w0: stopping experience collection (34800 times) [2024-06-18 13:47:06,487][12883] InferenceWorker_p0-w0: resuming experience collection (34800 times) [2024-06-18 13:47:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2380775424. Throughput: 0: 42649.7. Samples: 2380892220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:06,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 13:47:07,396][12883] Updated weights for policy 0, policy_version 145313 (0.0043) [2024-06-18 13:47:11,851][12883] Updated weights for policy 0, policy_version 145323 (0.0032) [2024-06-18 13:47:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2380972032. Throughput: 0: 42557.2. Samples: 2381146980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:11,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 13:47:15,397][12883] Updated weights for policy 0, policy_version 145333 (0.0041) [2024-06-18 13:47:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2381217792. Throughput: 0: 42606.2. Samples: 2381270260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:16,994][12645] Avg episode reward: [(0, '0.733')] [2024-06-18 13:47:19,462][12883] Updated weights for policy 0, policy_version 145343 (0.0048) [2024-06-18 13:47:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2381398016. Throughput: 0: 42401.3. Samples: 2381527600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:21,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 13:47:23,055][12883] Updated weights for policy 0, policy_version 145353 (0.0028) [2024-06-18 13:47:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2381611008. Throughput: 0: 42410.1. Samples: 2381781040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:26,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 13:47:27,174][12883] Updated weights for policy 0, policy_version 145363 (0.0044) [2024-06-18 13:47:30,650][12883] Updated weights for policy 0, policy_version 145373 (0.0040) [2024-06-18 13:47:31,994][12645] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2381873152. Throughput: 0: 42600.3. Samples: 2381910480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:31,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 13:47:34,889][12883] Updated weights for policy 0, policy_version 145383 (0.0041) [2024-06-18 13:47:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42709.5). Total num frames: 2382036992. Throughput: 0: 42415.6. Samples: 2382163960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:36,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 13:47:38,591][12883] Updated weights for policy 0, policy_version 145393 (0.0042) [2024-06-18 13:47:41,994][12645] Fps is (10 sec: 36044.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2382233600. Throughput: 0: 42176.4. Samples: 2382412280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:41,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 13:47:42,837][12883] Updated weights for policy 0, policy_version 145403 (0.0035) [2024-06-18 13:47:46,282][12883] Updated weights for policy 0, policy_version 145413 (0.0042) [2024-06-18 13:47:46,998][12645] Fps is (10 sec: 44219.2, 60 sec: 42595.6, 300 sec: 42820.3). Total num frames: 2382479360. Throughput: 0: 42208.4. Samples: 2382537760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:46,998][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 13:47:50,771][12883] Updated weights for policy 0, policy_version 145423 (0.0037) [2024-06-18 13:47:51,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42050.7, 300 sec: 42709.2). Total num frames: 2382659584. Throughput: 0: 42314.4. Samples: 2382796460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:51,996][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 13:47:54,047][12883] Updated weights for policy 0, policy_version 145433 (0.0028) [2024-06-18 13:47:56,994][12645] Fps is (10 sec: 40975.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2382888960. Throughput: 0: 42246.5. Samples: 2383048080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 13:47:56,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 13:47:58,406][12883] Updated weights for policy 0, policy_version 145443 (0.0037) [2024-06-18 13:48:01,734][12883] Updated weights for policy 0, policy_version 145453 (0.0024) [2024-06-18 13:48:01,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2383118336. Throughput: 0: 42399.5. Samples: 2383178240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:01,998][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 13:48:06,054][12883] Updated weights for policy 0, policy_version 145463 (0.0023) [2024-06-18 13:48:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2383314944. Throughput: 0: 42472.0. Samples: 2383438840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:06,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 13:48:09,266][12883] Updated weights for policy 0, policy_version 145473 (0.0035) [2024-06-18 13:48:11,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2383511552. Throughput: 0: 42341.0. Samples: 2383686380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:11,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 13:48:13,671][12883] Updated weights for policy 0, policy_version 145483 (0.0034) [2024-06-18 13:48:16,979][12883] Updated weights for policy 0, policy_version 145493 (0.0033) [2024-06-18 13:48:16,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42323.7, 300 sec: 42764.7). Total num frames: 2383757312. Throughput: 0: 42289.0. Samples: 2383813580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:16,997][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 13:48:21,379][12883] Updated weights for policy 0, policy_version 145503 (0.0037) [2024-06-18 13:48:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2383953920. Throughput: 0: 42433.4. Samples: 2384073460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:21,994][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 13:48:24,529][12883] Updated weights for policy 0, policy_version 145513 (0.0039) [2024-06-18 13:48:26,996][12645] Fps is (10 sec: 40960.2, 60 sec: 42596.9, 300 sec: 42709.1). Total num frames: 2384166912. Throughput: 0: 42426.8. Samples: 2384321580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:26,996][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 13:48:29,162][12883] Updated weights for policy 0, policy_version 145523 (0.0030) [2024-06-18 13:48:30,668][12862] Signal inference workers to stop experience collection... (34850 times) [2024-06-18 13:48:30,669][12862] Signal inference workers to resume experience collection... (34850 times) [2024-06-18 13:48:30,689][12883] InferenceWorker_p0-w0: stopping experience collection (34850 times) [2024-06-18 13:48:30,690][12883] InferenceWorker_p0-w0: resuming experience collection (34850 times) [2024-06-18 13:48:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 41779.3, 300 sec: 42765.0). Total num frames: 2384379904. Throughput: 0: 42501.9. Samples: 2384450180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:31,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 13:48:32,244][12883] Updated weights for policy 0, policy_version 145533 (0.0028) [2024-06-18 13:48:36,782][12883] Updated weights for policy 0, policy_version 145543 (0.0033) [2024-06-18 13:48:36,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2384592896. Throughput: 0: 42520.8. Samples: 2384709800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:36,994][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 13:48:37,104][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145545_2384609280.pth... [2024-06-18 13:48:37,167][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000144919_2374352896.pth [2024-06-18 13:48:39,816][12883] Updated weights for policy 0, policy_version 145553 (0.0041) [2024-06-18 13:48:41,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2384822272. Throughput: 0: 42465.4. Samples: 2384959020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:42,000][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 13:48:44,525][12883] Updated weights for policy 0, policy_version 145563 (0.0037) [2024-06-18 13:48:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42328.1, 300 sec: 42709.5). Total num frames: 2385018880. Throughput: 0: 42620.4. Samples: 2385096160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:46,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 13:48:47,403][12883] Updated weights for policy 0, policy_version 145573 (0.0054) [2024-06-18 13:48:51,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2385215488. Throughput: 0: 42420.5. Samples: 2385347760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:51,994][12645] Avg episode reward: [(0, '0.209')] [2024-06-18 13:48:52,099][12883] Updated weights for policy 0, policy_version 145583 (0.0036) [2024-06-18 13:48:55,296][12883] Updated weights for policy 0, policy_version 145593 (0.0045) [2024-06-18 13:48:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2385461248. Throughput: 0: 42602.5. Samples: 2385603500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:48:56,994][12645] Avg episode reward: [(0, '0.107')] [2024-06-18 13:48:59,727][12883] Updated weights for policy 0, policy_version 145603 (0.0040) [2024-06-18 13:49:01,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2385657856. Throughput: 0: 42787.0. Samples: 2385738900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 13:49:01,998][12645] Avg episode reward: [(0, '0.202')] [2024-06-18 13:49:03,020][12883] Updated weights for policy 0, policy_version 145613 (0.0034) [2024-06-18 13:49:06,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 42487.4). Total num frames: 2385838080. Throughput: 0: 42505.3. Samples: 2385986200. 
Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:06,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 13:49:07,532][12883] Updated weights for policy 0, policy_version 145623 (0.0024) [2024-06-18 13:49:10,659][12883] Updated weights for policy 0, policy_version 145633 (0.0033) [2024-06-18 13:49:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2386083840. Throughput: 0: 42740.7. Samples: 2386244820. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:11,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 13:49:15,370][12883] Updated weights for policy 0, policy_version 145643 (0.0035) [2024-06-18 13:49:16,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 2386296832. Throughput: 0: 42918.6. Samples: 2386381520. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:16,994][12645] Avg episode reward: [(0, '0.350')] [2024-06-18 13:49:18,198][12883] Updated weights for policy 0, policy_version 145653 (0.0032) [2024-06-18 13:49:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2386493440. Throughput: 0: 42559.2. Samples: 2386624960. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:21,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 13:49:23,103][12883] Updated weights for policy 0, policy_version 145663 (0.0030) [2024-06-18 13:49:25,708][12883] Updated weights for policy 0, policy_version 145673 (0.0023) [2024-06-18 13:49:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42873.0, 300 sec: 42765.0). Total num frames: 2386739200. Throughput: 0: 42771.0. Samples: 2386883720. 
Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:26,994][12645] Avg episode reward: [(0, '0.251')] [2024-06-18 13:49:30,838][12883] Updated weights for policy 0, policy_version 145683 (0.0032) [2024-06-18 13:49:31,065][12862] Signal inference workers to stop experience collection... (34900 times) [2024-06-18 13:49:31,071][12862] Signal inference workers to resume experience collection... (34900 times) [2024-06-18 13:49:31,096][12883] InferenceWorker_p0-w0: stopping experience collection (34900 times) [2024-06-18 13:49:31,096][12883] InferenceWorker_p0-w0: resuming experience collection (34900 times) [2024-06-18 13:49:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2386919424. Throughput: 0: 42670.7. Samples: 2387016340. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:31,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 13:49:33,267][12883] Updated weights for policy 0, policy_version 145693 (0.0027) [2024-06-18 13:49:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2387148800. Throughput: 0: 42708.0. Samples: 2387269620. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:36,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 13:49:38,299][12883] Updated weights for policy 0, policy_version 145703 (0.0025) [2024-06-18 13:49:41,211][12883] Updated weights for policy 0, policy_version 145713 (0.0032) [2024-06-18 13:49:41,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2387394560. Throughput: 0: 42705.8. Samples: 2387525260. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:41,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 13:49:45,858][12883] Updated weights for policy 0, policy_version 145723 (0.0034) [2024-06-18 13:49:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2387574784. 
Throughput: 0: 42692.0. Samples: 2387660040. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:46,994][12645] Avg episode reward: [(0, '0.763')] [2024-06-18 13:49:48,699][12883] Updated weights for policy 0, policy_version 145733 (0.0042) [2024-06-18 13:49:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42654.3). Total num frames: 2387804160. Throughput: 0: 42851.1. Samples: 2387914500. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:51,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 13:49:53,493][12883] Updated weights for policy 0, policy_version 145743 (0.0034) [2024-06-18 13:49:56,397][12883] Updated weights for policy 0, policy_version 145753 (0.0028) [2024-06-18 13:49:56,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2388033536. Throughput: 0: 42804.2. Samples: 2388171000. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:49:56,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 13:50:01,174][12883] Updated weights for policy 0, policy_version 145763 (0.0028) [2024-06-18 13:50:01,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2388213760. Throughput: 0: 42644.2. Samples: 2388300600. Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:50:01,996][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 13:50:04,066][12883] Updated weights for policy 0, policy_version 145773 (0.0036) [2024-06-18 13:50:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2388426752. Throughput: 0: 42767.9. Samples: 2388549520. 
Policy #0 lag: (min: 2.0, avg: 12.4, max: 22.0) [2024-06-18 13:50:06,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 13:50:08,691][12883] Updated weights for policy 0, policy_version 145783 (0.0036) [2024-06-18 13:50:11,955][12883] Updated weights for policy 0, policy_version 145793 (0.0026) [2024-06-18 13:50:11,994][12645] Fps is (10 sec: 45885.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2388672512. Throughput: 0: 42745.4. Samples: 2388807260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:11,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 13:50:16,243][12883] Updated weights for policy 0, policy_version 145803 (0.0031) [2024-06-18 13:50:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2388852736. Throughput: 0: 42769.3. Samples: 2388940960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:16,994][12645] Avg episode reward: [(0, '0.687')] [2024-06-18 13:50:19,856][12883] Updated weights for policy 0, policy_version 145813 (0.0034) [2024-06-18 13:50:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2389098496. Throughput: 0: 42777.7. Samples: 2389194620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:21,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 13:50:23,883][12883] Updated weights for policy 0, policy_version 145823 (0.0033) [2024-06-18 13:50:26,999][12645] Fps is (10 sec: 44211.7, 60 sec: 42594.5, 300 sec: 42708.6). Total num frames: 2389295104. Throughput: 0: 42808.4. Samples: 2389451880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:27,000][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 13:50:27,504][12883] Updated weights for policy 0, policy_version 145833 (0.0040) [2024-06-18 13:50:31,383][12883] Updated weights for policy 0, policy_version 145843 (0.0029) [2024-06-18 13:50:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2389491712. Throughput: 0: 42701.7. Samples: 2389581620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:31,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 13:50:35,075][12883] Updated weights for policy 0, policy_version 145853 (0.0045) [2024-06-18 13:50:36,994][12645] Fps is (10 sec: 45901.0, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2389753856. Throughput: 0: 42736.4. Samples: 2389837640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:36,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 13:50:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145859_2389753856.pth... [2024-06-18 13:50:37,071][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145236_2379546624.pth [2024-06-18 13:50:39,365][12883] Updated weights for policy 0, policy_version 145863 (0.0042) [2024-06-18 13:50:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2389950464. Throughput: 0: 42887.4. Samples: 2390100940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:41,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 13:50:42,687][12883] Updated weights for policy 0, policy_version 145873 (0.0035) [2024-06-18 13:50:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2390130688. Throughput: 0: 42738.7. Samples: 2390223740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:46,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 13:50:47,025][12883] Updated weights for policy 0, policy_version 145883 (0.0047) [2024-06-18 13:50:50,383][12883] Updated weights for policy 0, policy_version 145893 (0.0036) [2024-06-18 13:50:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2390376448. Throughput: 0: 42985.2. Samples: 2390483860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:51,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 13:50:54,925][12883] Updated weights for policy 0, policy_version 145903 (0.0033) [2024-06-18 13:50:56,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2390589440. Throughput: 0: 42937.3. Samples: 2390739440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:50:56,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 13:50:58,053][12883] Updated weights for policy 0, policy_version 145913 (0.0041) [2024-06-18 13:50:58,484][12862] Signal inference workers to stop experience collection... (34950 times) [2024-06-18 13:50:58,488][12862] Signal inference workers to resume experience collection... (34950 times) [2024-06-18 13:50:58,505][12883] InferenceWorker_p0-w0: stopping experience collection (34950 times) [2024-06-18 13:50:58,505][12883] InferenceWorker_p0-w0: resuming experience collection (34950 times) [2024-06-18 13:51:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42873.0, 300 sec: 42542.9). Total num frames: 2390786048. Throughput: 0: 42766.1. Samples: 2390865440. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:51:01,994][12645] Avg episode reward: [(0, '0.709')] [2024-06-18 13:51:02,481][12883] Updated weights for policy 0, policy_version 145923 (0.0032) [2024-06-18 13:51:05,693][12883] Updated weights for policy 0, policy_version 145933 (0.0043) [2024-06-18 13:51:06,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2391015424. Throughput: 0: 42817.8. Samples: 2391121420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 13:51:06,994][12645] Avg episode reward: [(0, '0.256')] [2024-06-18 13:51:10,212][12883] Updated weights for policy 0, policy_version 145943 (0.0035) [2024-06-18 13:51:11,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2391228416. Throughput: 0: 42854.8. Samples: 2391380100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:11,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 13:51:13,356][12883] Updated weights for policy 0, policy_version 145953 (0.0030) [2024-06-18 13:51:16,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2391408640. Throughput: 0: 42745.3. Samples: 2391505160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:16,994][12645] Avg episode reward: [(0, '0.822')] [2024-06-18 13:51:17,719][12883] Updated weights for policy 0, policy_version 145963 (0.0030) [2024-06-18 13:51:20,954][12883] Updated weights for policy 0, policy_version 145973 (0.0046) [2024-06-18 13:51:21,995][12645] Fps is (10 sec: 42591.3, 60 sec: 42597.3, 300 sec: 42709.2). Total num frames: 2391654400. Throughput: 0: 42868.7. Samples: 2391766800. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:21,996][12645] Avg episode reward: [(0, '0.822')] [2024-06-18 13:51:25,243][12883] Updated weights for policy 0, policy_version 145983 (0.0027) [2024-06-18 13:51:26,994][12645] Fps is (10 sec: 47514.2, 60 sec: 43148.6, 300 sec: 42709.5). Total num frames: 2391883776. Throughput: 0: 42864.5. Samples: 2392029840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:26,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 13:51:28,603][12883] Updated weights for policy 0, policy_version 145993 (0.0022) [2024-06-18 13:51:31,994][12645] Fps is (10 sec: 40966.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2392064000. Throughput: 0: 43023.0. Samples: 2392159780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:31,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 13:51:32,733][12883] Updated weights for policy 0, policy_version 146003 (0.0031) [2024-06-18 13:51:36,349][12883] Updated weights for policy 0, policy_version 146013 (0.0030) [2024-06-18 13:51:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2392293376. Throughput: 0: 42860.6. Samples: 2392412580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:36,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 13:51:40,227][12883] Updated weights for policy 0, policy_version 146023 (0.0027) [2024-06-18 13:51:41,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2392522752. Throughput: 0: 43001.8. Samples: 2392674520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:41,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 13:51:43,937][12883] Updated weights for policy 0, policy_version 146033 (0.0029) [2024-06-18 13:51:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2392719360. Throughput: 0: 43048.9. Samples: 2392802640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:46,994][12645] Avg episode reward: [(0, '0.647')] [2024-06-18 13:51:47,947][12883] Updated weights for policy 0, policy_version 146043 (0.0037) [2024-06-18 13:51:51,757][12883] Updated weights for policy 0, policy_version 146053 (0.0032) [2024-06-18 13:51:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2392932352. Throughput: 0: 42921.6. Samples: 2393052900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:51,995][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 13:51:55,440][12883] Updated weights for policy 0, policy_version 146063 (0.0037) [2024-06-18 13:51:56,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2393178112. Throughput: 0: 42883.0. Samples: 2393309840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:51:56,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 13:51:59,395][12883] Updated weights for policy 0, policy_version 146073 (0.0026) [2024-06-18 13:52:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2393358336. Throughput: 0: 43083.1. Samples: 2393443900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:52:01,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 13:52:03,407][12883] Updated weights for policy 0, policy_version 146083 (0.0035) [2024-06-18 13:52:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2393571328. Throughput: 0: 42756.6. Samples: 2393690780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:52:06,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 13:52:07,350][12883] Updated weights for policy 0, policy_version 146093 (0.0029) [2024-06-18 13:52:11,221][12883] Updated weights for policy 0, policy_version 146103 (0.0030) [2024-06-18 13:52:11,994][12645] Fps is (10 sec: 45876.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2393817088. Throughput: 0: 42618.7. Samples: 2393947680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 13:52:11,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 13:52:14,994][12883] Updated weights for policy 0, policy_version 146113 (0.0032) [2024-06-18 13:52:16,609][12862] Signal inference workers to stop experience collection... (35000 times) [2024-06-18 13:52:16,609][12862] Signal inference workers to resume experience collection... (35000 times) [2024-06-18 13:52:16,632][12883] InferenceWorker_p0-w0: stopping experience collection (35000 times) [2024-06-18 13:52:16,632][12883] InferenceWorker_p0-w0: resuming experience collection (35000 times) [2024-06-18 13:52:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2393997312. Throughput: 0: 42766.7. Samples: 2394084280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:52:16,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 13:52:18,692][12883] Updated weights for policy 0, policy_version 146123 (0.0040) [2024-06-18 13:52:21,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42599.5, 300 sec: 42709.5). Total num frames: 2394210304. Throughput: 0: 42593.3. Samples: 2394329280. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:52:21,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 13:52:22,670][12883] Updated weights for policy 0, policy_version 146133 (0.0026) [2024-06-18 13:52:26,199][12883] Updated weights for policy 0, policy_version 146143 (0.0048) [2024-06-18 13:52:26,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2394456064. Throughput: 0: 42543.6. Samples: 2394588980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:52:26,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 13:52:30,239][12883] Updated weights for policy 0, policy_version 146153 (0.0039) [2024-06-18 13:52:31,994][12645] Fps is (10 sec: 42597.1, 60 sec: 42871.2, 300 sec: 42709.4). Total num frames: 2394636288. Throughput: 0: 42693.1. Samples: 2394723840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:52:31,995][12645] Avg episode reward: [(0, '0.599')] [2024-06-18 13:52:33,780][12883] Updated weights for policy 0, policy_version 146163 (0.0028) [2024-06-18 13:52:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2394865664. Throughput: 0: 42714.3. Samples: 2394975040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:52:36,994][12645] Avg episode reward: [(0, '0.611')] [2024-06-18 13:52:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146171_2394865664.pth... [2024-06-18 13:52:37,070][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145545_2384609280.pth [2024-06-18 13:52:37,775][12883] Updated weights for policy 0, policy_version 146173 (0.0032) [2024-06-18 13:52:41,228][12883] Updated weights for policy 0, policy_version 146183 (0.0024) [2024-06-18 13:52:41,994][12645] Fps is (10 sec: 45876.5, 60 sec: 42871.4, 300 sec: 42765.6). Total num frames: 2395095040. Throughput: 0: 42760.0. Samples: 2395234040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:52:41,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 13:52:45,577][12883] Updated weights for policy 0, policy_version 146193 (0.0035) [2024-06-18 13:52:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 2395275264. Throughput: 0: 42579.7. Samples: 2395359980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:52:46,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 13:52:49,576][12883] Updated weights for policy 0, policy_version 146203 (0.0042) [2024-06-18 13:52:51,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2395521024. Throughput: 0: 42556.9. Samples: 2395605840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:52:51,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 13:52:53,252][12883] Updated weights for policy 0, policy_version 146213 (0.0034) [2024-06-18 13:52:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2395701248. Throughput: 0: 42725.7. Samples: 2395870340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:52:56,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 13:52:57,352][12883] Updated weights for policy 0, policy_version 146223 (0.0042) [2024-06-18 13:53:00,829][12883] Updated weights for policy 0, policy_version 146233 (0.0028) [2024-06-18 13:53:01,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2395897856. Throughput: 0: 42350.1. Samples: 2395990040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:53:01,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 13:53:04,963][12883] Updated weights for policy 0, policy_version 146243 (0.0036) [2024-06-18 13:53:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2396160000. Throughput: 0: 42656.4. Samples: 2396248820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:53:07,003][12645] Avg episode reward: [(0, '0.754')] [2024-06-18 13:53:08,421][12883] Updated weights for policy 0, policy_version 146253 (0.0041) [2024-06-18 13:53:11,996][12645] Fps is (10 sec: 44227.3, 60 sec: 42050.7, 300 sec: 42653.9). Total num frames: 2396340224. Throughput: 0: 42827.6. Samples: 2396516320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:53:11,997][12645] Avg episode reward: [(0, '0.661')] [2024-06-18 13:53:12,621][12883] Updated weights for policy 0, policy_version 146263 (0.0031) [2024-06-18 13:53:15,997][12883] Updated weights for policy 0, policy_version 146273 (0.0031) [2024-06-18 13:53:16,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 2396553216. Throughput: 0: 42485.1. Samples: 2396635660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 13:53:16,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 13:53:20,227][12883] Updated weights for policy 0, policy_version 146283 (0.0030) [2024-06-18 13:53:21,994][12645] Fps is (10 sec: 45885.2, 60 sec: 43144.5, 300 sec: 42820.9). Total num frames: 2396798976. Throughput: 0: 42616.0. Samples: 2396892760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:53:21,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 13:53:23,467][12883] Updated weights for policy 0, policy_version 146293 (0.0036) [2024-06-18 13:53:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 2396962816. Throughput: 0: 42750.6. Samples: 2397157820. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:53:26,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 13:53:27,855][12883] Updated weights for policy 0, policy_version 146303 (0.0032) [2024-06-18 13:53:31,168][12883] Updated weights for policy 0, policy_version 146313 (0.0032) [2024-06-18 13:53:31,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.7, 300 sec: 42709.5). Total num frames: 2397192192. Throughput: 0: 42397.4. Samples: 2397267860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:53:31,994][12645] Avg episode reward: [(0, '0.718')] [2024-06-18 13:53:35,228][12862] Signal inference workers to stop experience collection... (35050 times) [2024-06-18 13:53:35,264][12883] InferenceWorker_p0-w0: stopping experience collection (35050 times) [2024-06-18 13:53:35,282][12862] Signal inference workers to resume experience collection... (35050 times) [2024-06-18 13:53:35,283][12883] InferenceWorker_p0-w0: resuming experience collection (35050 times) [2024-06-18 13:53:35,427][12883] Updated weights for policy 0, policy_version 146323 (0.0047) [2024-06-18 13:53:36,996][12645] Fps is (10 sec: 47504.8, 60 sec: 42870.1, 300 sec: 42764.7). Total num frames: 2397437952. Throughput: 0: 42809.8. Samples: 2397532360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:53:36,996][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 13:53:38,671][12883] Updated weights for policy 0, policy_version 146333 (0.0029) [2024-06-18 13:53:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 42653.9). Total num frames: 2397601792. Throughput: 0: 42713.0. Samples: 2397792420. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:53:41,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 13:53:43,445][12883] Updated weights for policy 0, policy_version 146343 (0.0038) [2024-06-18 13:53:46,587][12883] Updated weights for policy 0, policy_version 146353 (0.0028) [2024-06-18 13:53:46,994][12645] Fps is (10 sec: 40968.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2397847552. Throughput: 0: 42724.1. Samples: 2397912620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:53:46,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 13:53:50,879][12883] Updated weights for policy 0, policy_version 146363 (0.0037) [2024-06-18 13:53:51,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2398076928. Throughput: 0: 42887.6. Samples: 2398178760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:53:51,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 13:53:54,159][12883] Updated weights for policy 0, policy_version 146373 (0.0038) [2024-06-18 13:53:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2398240768. Throughput: 0: 42623.0. Samples: 2398434260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:53:56,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 13:53:58,425][12883] Updated weights for policy 0, policy_version 146383 (0.0026) [2024-06-18 13:54:01,663][12883] Updated weights for policy 0, policy_version 146393 (0.0029) [2024-06-18 13:54:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2398502912. Throughput: 0: 42679.2. Samples: 2398556220. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:54:01,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 13:54:05,983][12883] Updated weights for policy 0, policy_version 146403 (0.0037) [2024-06-18 13:54:06,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2398715904. Throughput: 0: 42806.7. Samples: 2398819060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:54:06,994][12645] Avg episode reward: [(0, '0.669')] [2024-06-18 13:54:09,306][12883] Updated weights for policy 0, policy_version 146413 (0.0028) [2024-06-18 13:54:11,996][12645] Fps is (10 sec: 37674.8, 60 sec: 42325.3, 300 sec: 42653.6). Total num frames: 2398879744. Throughput: 0: 42720.2. Samples: 2399080320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:54:11,996][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 13:54:13,678][12883] Updated weights for policy 0, policy_version 146423 (0.0035) [2024-06-18 13:54:16,969][12883] Updated weights for policy 0, policy_version 146433 (0.0038) [2024-06-18 13:54:16,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42931.6). Total num frames: 2399158272. Throughput: 0: 42958.5. Samples: 2399201000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 13:54:16,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 13:54:21,284][12883] Updated weights for policy 0, policy_version 146443 (0.0046) [2024-06-18 13:54:21,993][12645] Fps is (10 sec: 47525.0, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 2399354880. Throughput: 0: 42854.5. Samples: 2399460720. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:54:21,994][12645] Avg episode reward: [(0, '0.692')] [2024-06-18 13:54:24,867][12883] Updated weights for policy 0, policy_version 146453 (0.0028) [2024-06-18 13:54:26,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2399535104. Throughput: 0: 42748.3. Samples: 2399716100. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:54:26,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 13:54:28,888][12883] Updated weights for policy 0, policy_version 146463 (0.0039) [2024-06-18 13:54:31,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2399764480. Throughput: 0: 42762.2. Samples: 2399836920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:54:31,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 13:54:32,431][12883] Updated weights for policy 0, policy_version 146473 (0.0028) [2024-06-18 13:54:36,417][12883] Updated weights for policy 0, policy_version 146483 (0.0034) [2024-06-18 13:54:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42599.7, 300 sec: 42709.5). Total num frames: 2399993856. Throughput: 0: 42744.8. Samples: 2400102280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:54:36,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 13:54:37,001][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146485_2400010240.pth... [2024-06-18 13:54:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000145859_2389753856.pth [2024-06-18 13:54:40,375][12883] Updated weights for policy 0, policy_version 146493 (0.0026) [2024-06-18 13:54:42,000][12645] Fps is (10 sec: 42571.6, 60 sec: 43140.0, 300 sec: 42764.1). Total num frames: 2400190464. Throughput: 0: 42756.3. Samples: 2400358560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:54:42,001][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 13:54:44,150][12883] Updated weights for policy 0, policy_version 146503 (0.0027) [2024-06-18 13:54:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2400419840. Throughput: 0: 42825.8. Samples: 2400483380. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:54:46,994][12645] Avg episode reward: [(0, '0.255')] [2024-06-18 13:54:48,019][12883] Updated weights for policy 0, policy_version 146513 (0.0032) [2024-06-18 13:54:51,901][12883] Updated weights for policy 0, policy_version 146523 (0.0031) [2024-06-18 13:54:51,994][12645] Fps is (10 sec: 44264.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2400632832. Throughput: 0: 42829.4. Samples: 2400746380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:54:51,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 13:54:55,754][12883] Updated weights for policy 0, policy_version 146533 (0.0040) [2024-06-18 13:54:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2400829440. Throughput: 0: 42620.8. Samples: 2400998160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:54:56,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 13:54:59,350][12862] Signal inference workers to stop experience collection... (35100 times) [2024-06-18 13:54:59,378][12883] InferenceWorker_p0-w0: stopping experience collection (35100 times) [2024-06-18 13:54:59,416][12862] Signal inference workers to resume experience collection... (35100 times) [2024-06-18 13:54:59,419][12883] InferenceWorker_p0-w0: resuming experience collection (35100 times) [2024-06-18 13:54:59,563][12883] Updated weights for policy 0, policy_version 146543 (0.0034) [2024-06-18 13:55:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2401058816. Throughput: 0: 42742.2. Samples: 2401124400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:55:01,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 13:55:03,367][12883] Updated weights for policy 0, policy_version 146553 (0.0030) [2024-06-18 13:55:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2401271808. 
Throughput: 0: 42750.9. Samples: 2401384520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:55:06,994][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 13:55:07,116][12883] Updated weights for policy 0, policy_version 146563 (0.0032) [2024-06-18 13:55:11,137][12883] Updated weights for policy 0, policy_version 146573 (0.0041) [2024-06-18 13:55:12,000][12645] Fps is (10 sec: 42571.9, 60 sec: 43414.6, 300 sec: 42819.6). Total num frames: 2401484800. Throughput: 0: 42571.0. Samples: 2401632060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:55:12,001][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 13:55:15,034][12883] Updated weights for policy 0, policy_version 146583 (0.0027) [2024-06-18 13:55:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2401697792. Throughput: 0: 42806.1. Samples: 2401763200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:55:16,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 13:55:18,943][12883] Updated weights for policy 0, policy_version 146593 (0.0029) [2024-06-18 13:55:21,994][12645] Fps is (10 sec: 42625.6, 60 sec: 42598.3, 300 sec: 42765.8). Total num frames: 2401910784. Throughput: 0: 42658.4. Samples: 2402021900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 23.0) [2024-06-18 13:55:21,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 13:55:22,812][12883] Updated weights for policy 0, policy_version 146603 (0.0032) [2024-06-18 13:55:26,652][12883] Updated weights for policy 0, policy_version 146613 (0.0031) [2024-06-18 13:55:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2402107392. Throughput: 0: 42545.0. Samples: 2402272820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:55:26,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 13:55:30,348][12883] Updated weights for policy 0, policy_version 146623 (0.0046) [2024-06-18 13:55:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2402320384. Throughput: 0: 42534.2. Samples: 2402397420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:55:31,994][12645] Avg episode reward: [(0, '0.580')] [2024-06-18 13:55:34,654][12883] Updated weights for policy 0, policy_version 146633 (0.0035) [2024-06-18 13:55:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2402549760. Throughput: 0: 42439.0. Samples: 2402656140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:55:36,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 13:55:38,239][12883] Updated weights for policy 0, policy_version 146643 (0.0027) [2024-06-18 13:55:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42329.7, 300 sec: 42709.4). Total num frames: 2402729984. Throughput: 0: 42567.9. Samples: 2402913720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:55:41,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 13:55:42,315][12883] Updated weights for policy 0, policy_version 146653 (0.0032) [2024-06-18 13:55:45,689][12883] Updated weights for policy 0, policy_version 146663 (0.0033) [2024-06-18 13:55:46,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42596.8, 300 sec: 42709.2). Total num frames: 2402975744. Throughput: 0: 42531.3. Samples: 2403038400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:55:46,996][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 13:55:49,881][12883] Updated weights for policy 0, policy_version 146673 (0.0032) [2024-06-18 13:55:51,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2403172352. Throughput: 0: 42580.0. Samples: 2403300620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:55:51,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 13:55:53,393][12883] Updated weights for policy 0, policy_version 146683 (0.0023) [2024-06-18 13:55:56,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2403385344. Throughput: 0: 42674.8. Samples: 2403552160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:55:56,994][12645] Avg episode reward: [(0, '0.297')] [2024-06-18 13:55:57,358][12883] Updated weights for policy 0, policy_version 146693 (0.0030) [2024-06-18 13:56:00,808][12883] Updated weights for policy 0, policy_version 146703 (0.0035) [2024-06-18 13:56:01,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2403631104. Throughput: 0: 42681.4. Samples: 2403683860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:56:01,997][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 13:56:05,552][12883] Updated weights for policy 0, policy_version 146713 (0.0028) [2024-06-18 13:56:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2403827712. Throughput: 0: 42798.1. Samples: 2403947820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:56:06,994][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 13:56:08,646][12883] Updated weights for policy 0, policy_version 146723 (0.0028) [2024-06-18 13:56:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42602.9, 300 sec: 42820.6). Total num frames: 2404040704. Throughput: 0: 42806.7. Samples: 2404199120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:56:11,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 13:56:12,991][12883] Updated weights for policy 0, policy_version 146733 (0.0046) [2024-06-18 13:56:16,365][12883] Updated weights for policy 0, policy_version 146743 (0.0028) [2024-06-18 13:56:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42765.3). Total num frames: 2404270080. Throughput: 0: 42962.7. Samples: 2404330740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:56:16,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 13:56:20,392][12883] Updated weights for policy 0, policy_version 146753 (0.0047) [2024-06-18 13:56:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2404450304. Throughput: 0: 42965.0. Samples: 2404589560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:56:21,994][12645] Avg episode reward: [(0, '0.359')] [2024-06-18 13:56:24,008][12883] Updated weights for policy 0, policy_version 146763 (0.0036) [2024-06-18 13:56:26,994][12645] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2404679680. Throughput: 0: 42845.3. Samples: 2404841760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 13:56:26,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 13:56:27,770][12883] Updated weights for policy 0, policy_version 146773 (0.0035) [2024-06-18 13:56:30,900][12862] Signal inference workers to stop experience collection... (35150 times) [2024-06-18 13:56:30,948][12883] InferenceWorker_p0-w0: stopping experience collection (35150 times) [2024-06-18 13:56:30,957][12862] Signal inference workers to resume experience collection... 
(35150 times) [2024-06-18 13:56:30,965][12883] InferenceWorker_p0-w0: resuming experience collection (35150 times) [2024-06-18 13:56:31,578][12883] Updated weights for policy 0, policy_version 146783 (0.0028) [2024-06-18 13:56:31,994][12645] Fps is (10 sec: 47513.0, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 2404925440. Throughput: 0: 42966.5. Samples: 2404971800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:56:31,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 13:56:35,371][12883] Updated weights for policy 0, policy_version 146793 (0.0043) [2024-06-18 13:56:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2405105664. Throughput: 0: 42759.1. Samples: 2405224780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:56:36,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 13:56:37,113][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146797_2405122048.pth... [2024-06-18 13:56:37,162][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146171_2394865664.pth [2024-06-18 13:56:39,058][12883] Updated weights for policy 0, policy_version 146803 (0.0028) [2024-06-18 13:56:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2405318656. Throughput: 0: 43031.2. Samples: 2405488560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:56:41,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 13:56:43,159][12883] Updated weights for policy 0, policy_version 146813 (0.0030) [2024-06-18 13:56:46,706][12883] Updated weights for policy 0, policy_version 146823 (0.0036) [2024-06-18 13:56:46,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43146.0, 300 sec: 42820.6). Total num frames: 2405564416. Throughput: 0: 42849.3. Samples: 2405612080. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:56:46,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 13:56:50,929][12883] Updated weights for policy 0, policy_version 146833 (0.0037) [2024-06-18 13:56:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2405744640. Throughput: 0: 42800.1. Samples: 2405873820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:56:51,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 13:56:54,382][12883] Updated weights for policy 0, policy_version 146843 (0.0028) [2024-06-18 13:56:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2405957632. Throughput: 0: 42780.9. Samples: 2406124260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:56:56,994][12645] Avg episode reward: [(0, '0.611')] [2024-06-18 13:56:58,718][12883] Updated weights for policy 0, policy_version 146853 (0.0028) [2024-06-18 13:57:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2406187008. Throughput: 0: 42734.6. Samples: 2406253800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:57:01,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 13:57:02,046][12883] Updated weights for policy 0, policy_version 146863 (0.0023) [2024-06-18 13:57:06,251][12883] Updated weights for policy 0, policy_version 146873 (0.0025) [2024-06-18 13:57:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2406383616. Throughput: 0: 42666.1. Samples: 2406509540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:57:06,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 13:57:09,845][12883] Updated weights for policy 0, policy_version 146883 (0.0035) [2024-06-18 13:57:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2406612992. Throughput: 0: 42670.7. Samples: 2406761940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:57:11,996][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 13:57:13,801][12883] Updated weights for policy 0, policy_version 146893 (0.0031) [2024-06-18 13:57:17,000][12645] Fps is (10 sec: 42572.1, 60 sec: 42320.9, 300 sec: 42708.6). Total num frames: 2406809600. Throughput: 0: 42727.0. Samples: 2406894780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:57:17,001][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 13:57:17,730][12883] Updated weights for policy 0, policy_version 146903 (0.0035) [2024-06-18 13:57:21,383][12883] Updated weights for policy 0, policy_version 146913 (0.0031) [2024-06-18 13:57:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2407022592. Throughput: 0: 42733.2. Samples: 2407147780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:57:21,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 13:57:25,386][12883] Updated weights for policy 0, policy_version 146923 (0.0030) [2024-06-18 13:57:26,998][12645] Fps is (10 sec: 44243.2, 60 sec: 42868.1, 300 sec: 42764.4). Total num frames: 2407251968. Throughput: 0: 42416.8. Samples: 2407397520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:57:26,999][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 13:57:29,483][12883] Updated weights for policy 0, policy_version 146933 (0.0032) [2024-06-18 13:57:31,994][12645] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2407432192. Throughput: 0: 42620.0. Samples: 2407529980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 13:57:31,995][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 13:57:32,190][12862] Signal inference workers to stop experience collection... (35200 times) [2024-06-18 13:57:32,194][12862] Signal inference workers to resume experience collection... 
(35200 times) [2024-06-18 13:57:32,242][12883] InferenceWorker_p0-w0: stopping experience collection (35200 times) [2024-06-18 13:57:32,242][12883] InferenceWorker_p0-w0: resuming experience collection (35200 times) [2024-06-18 13:57:32,947][12883] Updated weights for policy 0, policy_version 146943 (0.0035) [2024-06-18 13:57:36,994][12645] Fps is (10 sec: 40979.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2407661568. Throughput: 0: 42493.9. Samples: 2407786040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:57:36,994][12645] Avg episode reward: [(0, '0.315')] [2024-06-18 13:57:37,052][12883] Updated weights for policy 0, policy_version 146953 (0.0032) [2024-06-18 13:57:40,685][12883] Updated weights for policy 0, policy_version 146963 (0.0040) [2024-06-18 13:57:41,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2407890944. Throughput: 0: 42450.3. Samples: 2408034520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:57:41,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 13:57:44,738][12883] Updated weights for policy 0, policy_version 146973 (0.0031) [2024-06-18 13:57:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 2408071168. Throughput: 0: 42631.2. Samples: 2408172200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:57:46,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 13:57:48,258][12883] Updated weights for policy 0, policy_version 146983 (0.0045) [2024-06-18 13:57:51,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2408284160. Throughput: 0: 42594.7. Samples: 2408426300. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:57:51,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 13:57:52,443][12883] Updated weights for policy 0, policy_version 146993 (0.0038) [2024-06-18 13:57:55,887][12883] Updated weights for policy 0, policy_version 147003 (0.0031) [2024-06-18 13:57:56,994][12645] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2408546304. Throughput: 0: 42426.1. Samples: 2408671120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:57:56,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 13:58:00,070][12883] Updated weights for policy 0, policy_version 147013 (0.0034) [2024-06-18 13:58:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2408710144. Throughput: 0: 42490.4. Samples: 2408806580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:58:01,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 13:58:03,720][12883] Updated weights for policy 0, policy_version 147023 (0.0046) [2024-06-18 13:58:06,995][12645] Fps is (10 sec: 39315.1, 60 sec: 42597.2, 300 sec: 42709.5). Total num frames: 2408939520. Throughput: 0: 42411.7. Samples: 2409056380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:58:06,996][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 13:58:07,710][12883] Updated weights for policy 0, policy_version 147033 (0.0041) [2024-06-18 13:58:11,511][12883] Updated weights for policy 0, policy_version 147043 (0.0032) [2024-06-18 13:58:11,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2409168896. Throughput: 0: 42572.1. Samples: 2409313060. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:58:11,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 13:58:15,456][12883] Updated weights for policy 0, policy_version 147053 (0.0027) [2024-06-18 13:58:16,994][12645] Fps is (10 sec: 42605.9, 60 sec: 42602.8, 300 sec: 42598.4). Total num frames: 2409365504. Throughput: 0: 42630.7. Samples: 2409448360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:58:16,994][12645] Avg episode reward: [(0, '0.729')] [2024-06-18 13:58:19,364][12883] Updated weights for policy 0, policy_version 147063 (0.0035) [2024-06-18 13:58:21,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2409562112. Throughput: 0: 42556.0. Samples: 2409701060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:58:21,994][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 13:58:22,981][12883] Updated weights for policy 0, policy_version 147073 (0.0041) [2024-06-18 13:58:26,805][12883] Updated weights for policy 0, policy_version 147083 (0.0040) [2024-06-18 13:58:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42601.9, 300 sec: 42765.0). Total num frames: 2409807872. Throughput: 0: 42838.2. Samples: 2409962240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:58:26,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 13:58:30,506][12883] Updated weights for policy 0, policy_version 147093 (0.0045) [2024-06-18 13:58:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2410004480. Throughput: 0: 42699.0. Samples: 2410093660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 13:58:31,994][12645] Avg episode reward: [(0, '0.223')] [2024-06-18 13:58:34,411][12883] Updated weights for policy 0, policy_version 147103 (0.0027) [2024-06-18 13:58:36,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2410217472. Throughput: 0: 42638.7. Samples: 2410345040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:58:36,994][12645] Avg episode reward: [(0, '0.231')] [2024-06-18 13:58:37,070][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147109_2410233856.pth... [2024-06-18 13:58:37,129][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146485_2400010240.pth [2024-06-18 13:58:38,195][12883] Updated weights for policy 0, policy_version 147113 (0.0027) [2024-06-18 13:58:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2410446848. Throughput: 0: 42902.3. Samples: 2410601720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:58:41,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 13:58:42,346][12883] Updated weights for policy 0, policy_version 147123 (0.0032) [2024-06-18 13:58:45,936][12883] Updated weights for policy 0, policy_version 147133 (0.0036) [2024-06-18 13:58:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2410643456. Throughput: 0: 42639.1. Samples: 2410725340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:58:46,994][12645] Avg episode reward: [(0, '0.313')] [2024-06-18 13:58:49,953][12883] Updated weights for policy 0, policy_version 147143 (0.0036) [2024-06-18 13:58:51,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2410872832. Throughput: 0: 42899.1. Samples: 2410986760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:58:51,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 13:58:52,708][12862] Signal inference workers to stop experience collection... (35250 times) [2024-06-18 13:58:52,708][12862] Signal inference workers to resume experience collection... 
(35250 times) [2024-06-18 13:58:52,742][12883] InferenceWorker_p0-w0: stopping experience collection (35250 times) [2024-06-18 13:58:52,742][12883] InferenceWorker_p0-w0: resuming experience collection (35250 times) [2024-06-18 13:58:53,664][12883] Updated weights for policy 0, policy_version 147153 (0.0030) [2024-06-18 13:58:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 2411085824. Throughput: 0: 42813.9. Samples: 2411239680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:58:56,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 13:58:57,497][12883] Updated weights for policy 0, policy_version 147163 (0.0039) [2024-06-18 13:59:01,666][12883] Updated weights for policy 0, policy_version 147173 (0.0038) [2024-06-18 13:59:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2411282432. Throughput: 0: 42698.2. Samples: 2411369780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:59:01,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 13:59:05,032][12883] Updated weights for policy 0, policy_version 147183 (0.0033) [2024-06-18 13:59:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42872.7, 300 sec: 42820.9). Total num frames: 2411511808. Throughput: 0: 42774.1. Samples: 2411625900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:59:06,999][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 13:59:09,308][12883] Updated weights for policy 0, policy_version 147193 (0.0042) [2024-06-18 13:59:11,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2411741184. Throughput: 0: 42740.3. Samples: 2411885560. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:59:11,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 13:59:12,733][12883] Updated weights for policy 0, policy_version 147203 (0.0046) [2024-06-18 13:59:16,815][12883] Updated weights for policy 0, policy_version 147213 (0.0031) [2024-06-18 13:59:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2411937792. Throughput: 0: 42737.8. Samples: 2412016860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:59:16,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 13:59:20,272][12883] Updated weights for policy 0, policy_version 147223 (0.0037) [2024-06-18 13:59:21,994][12645] Fps is (10 sec: 42599.2, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 2412167168. Throughput: 0: 42865.4. Samples: 2412273980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:59:21,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 13:59:24,382][12883] Updated weights for policy 0, policy_version 147233 (0.0041) [2024-06-18 13:59:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2412380160. Throughput: 0: 42869.3. Samples: 2412530840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:59:26,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 13:59:28,047][12883] Updated weights for policy 0, policy_version 147243 (0.0028) [2024-06-18 13:59:31,997][12883] Updated weights for policy 0, policy_version 147253 (0.0042) [2024-06-18 13:59:31,998][12645] Fps is (10 sec: 42580.5, 60 sec: 43141.6, 300 sec: 42708.9). Total num frames: 2412593152. Throughput: 0: 42922.2. Samples: 2412657020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:59:31,998][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 13:59:35,757][12883] Updated weights for policy 0, policy_version 147263 (0.0039) [2024-06-18 13:59:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 2412789760. Throughput: 0: 42710.1. Samples: 2412908720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 13:59:36,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 13:59:39,896][12883] Updated weights for policy 0, policy_version 147273 (0.0026) [2024-06-18 13:59:41,994][12645] Fps is (10 sec: 42616.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2413019136. Throughput: 0: 42797.8. Samples: 2413165580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:59:41,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 13:59:43,487][12883] Updated weights for policy 0, policy_version 147283 (0.0039) [2024-06-18 13:59:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2413215744. Throughput: 0: 42681.7. Samples: 2413290460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:59:46,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 13:59:47,822][12883] Updated weights for policy 0, policy_version 147293 (0.0038) [2024-06-18 13:59:51,215][12883] Updated weights for policy 0, policy_version 147303 (0.0035) [2024-06-18 13:59:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2413428736. Throughput: 0: 42589.9. Samples: 2413542440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:59:51,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 13:59:55,359][12883] Updated weights for policy 0, policy_version 147313 (0.0034) [2024-06-18 13:59:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2413625344. Throughput: 0: 42659.5. Samples: 2413805240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 13:59:56,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 13:59:58,776][12883] Updated weights for policy 0, policy_version 147323 (0.0031) [2024-06-18 14:00:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2413854720. Throughput: 0: 42456.8. Samples: 2413927420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 14:00:01,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 14:00:03,243][12883] Updated weights for policy 0, policy_version 147333 (0.0039) [2024-06-18 14:00:06,671][12883] Updated weights for policy 0, policy_version 147343 (0.0046) [2024-06-18 14:00:07,000][12645] Fps is (10 sec: 44209.9, 60 sec: 42594.0, 300 sec: 42654.0). Total num frames: 2414067712. Throughput: 0: 42434.9. Samples: 2414183820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 14:00:07,001][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 14:00:11,001][12883] Updated weights for policy 0, policy_version 147353 (0.0031) [2024-06-18 14:00:11,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 2414280704. Throughput: 0: 42360.2. Samples: 2414437040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 14:00:11,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 14:00:14,318][12883] Updated weights for policy 0, policy_version 147363 (0.0035) [2024-06-18 14:00:16,994][12645] Fps is (10 sec: 44264.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2414510080. Throughput: 0: 42401.7. Samples: 2414564920. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 14:00:16,994][12645] Avg episode reward: [(0, '0.725')] [2024-06-18 14:00:18,746][12883] Updated weights for policy 0, policy_version 147373 (0.0032) [2024-06-18 14:00:21,990][12883] Updated weights for policy 0, policy_version 147383 (0.0034) [2024-06-18 14:00:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2414723072. Throughput: 0: 42483.7. Samples: 2414820480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 14:00:21,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 14:00:26,286][12883] Updated weights for policy 0, policy_version 147393 (0.0042) [2024-06-18 14:00:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 2414903296. Throughput: 0: 42497.8. Samples: 2415077980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 14:00:26,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 14:00:29,744][12883] Updated weights for policy 0, policy_version 147403 (0.0035) [2024-06-18 14:00:31,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42055.1, 300 sec: 42598.4). Total num frames: 2415116288. Throughput: 0: 42501.8. Samples: 2415203040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 14:00:31,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 14:00:33,730][12883] Updated weights for policy 0, policy_version 147413 (0.0028) [2024-06-18 14:00:36,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2415362048. Throughput: 0: 42631.0. Samples: 2415460840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 14:00:36,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 14:00:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147422_2415362048.pth... 
[2024-06-18 14:00:37,076][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000146797_2405122048.pth [2024-06-18 14:00:37,424][12883] Updated weights for policy 0, policy_version 147423 (0.0022) [2024-06-18 14:00:41,406][12883] Updated weights for policy 0, policy_version 147433 (0.0035) [2024-06-18 14:00:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 2415542272. Throughput: 0: 42473.5. Samples: 2415716540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 14:00:41,994][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 14:00:44,943][12883] Updated weights for policy 0, policy_version 147443 (0.0038) [2024-06-18 14:00:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2415771648. Throughput: 0: 42652.2. Samples: 2415846760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:00:46,994][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 14:00:49,152][12883] Updated weights for policy 0, policy_version 147453 (0.0032) [2024-06-18 14:00:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2415984640. Throughput: 0: 42601.5. Samples: 2416100620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:00:51,994][12645] Avg episode reward: [(0, '0.555')] [2024-06-18 14:00:53,062][12883] Updated weights for policy 0, policy_version 147463 (0.0032) [2024-06-18 14:00:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2416181248. Throughput: 0: 42670.6. Samples: 2416357220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:00:56,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 14:00:57,141][12883] Updated weights for policy 0, policy_version 147473 (0.0036) [2024-06-18 14:00:59,099][12862] Signal inference workers to stop experience collection... 
(35300 times) [2024-06-18 14:00:59,099][12862] Signal inference workers to resume experience collection... (35300 times) [2024-06-18 14:00:59,121][12883] InferenceWorker_p0-w0: stopping experience collection (35300 times) [2024-06-18 14:00:59,121][12883] InferenceWorker_p0-w0: resuming experience collection (35300 times) [2024-06-18 14:01:00,659][12883] Updated weights for policy 0, policy_version 147483 (0.0030) [2024-06-18 14:01:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2416410624. Throughput: 0: 42624.0. Samples: 2416483000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:01:01,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 14:01:04,691][12883] Updated weights for policy 0, policy_version 147493 (0.0039) [2024-06-18 14:01:06,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42875.9, 300 sec: 42709.5). Total num frames: 2416640000. Throughput: 0: 42574.2. Samples: 2416736320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:01:06,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 14:01:08,558][12883] Updated weights for policy 0, policy_version 147503 (0.0043) [2024-06-18 14:01:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2416820224. Throughput: 0: 42710.6. Samples: 2416999960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:01:11,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 14:01:12,335][12883] Updated weights for policy 0, policy_version 147513 (0.0039) [2024-06-18 14:01:15,904][12883] Updated weights for policy 0, policy_version 147523 (0.0035) [2024-06-18 14:01:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2417065984. Throughput: 0: 42729.0. Samples: 2417125840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:01:16,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 14:01:19,808][12883] Updated weights for policy 0, policy_version 147533 (0.0033) [2024-06-18 14:01:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2417278976. Throughput: 0: 42816.0. Samples: 2417387560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:01:21,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 14:01:23,445][12883] Updated weights for policy 0, policy_version 147543 (0.0041) [2024-06-18 14:01:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2417475584. Throughput: 0: 42763.5. Samples: 2417640900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:01:26,994][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 14:01:27,411][12883] Updated weights for policy 0, policy_version 147553 (0.0031) [2024-06-18 14:01:31,051][12883] Updated weights for policy 0, policy_version 147563 (0.0031) [2024-06-18 14:01:31,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2417704960. Throughput: 0: 42662.1. Samples: 2417766560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:01:31,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 14:01:35,057][12883] Updated weights for policy 0, policy_version 147573 (0.0039) [2024-06-18 14:01:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2417917952. Throughput: 0: 42754.6. Samples: 2418024580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:01:36,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 14:01:38,858][12883] Updated weights for policy 0, policy_version 147583 (0.0031) [2024-06-18 14:01:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2418114560. Throughput: 0: 42689.4. Samples: 2418278240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 14:01:41,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 14:01:42,593][12883] Updated weights for policy 0, policy_version 147593 (0.0042) [2024-06-18 14:01:46,603][12883] Updated weights for policy 0, policy_version 147603 (0.0036) [2024-06-18 14:01:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2418327552. Throughput: 0: 42830.6. Samples: 2418410380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:01:46,994][12645] Avg episode reward: [(0, '0.504')] [2024-06-18 14:01:50,540][12883] Updated weights for policy 0, policy_version 147613 (0.0040) [2024-06-18 14:01:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2418540544. Throughput: 0: 42801.0. Samples: 2418662360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:01:51,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 14:01:54,254][12883] Updated weights for policy 0, policy_version 147623 (0.0030) [2024-06-18 14:01:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2418737152. Throughput: 0: 42640.0. Samples: 2418918760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:01:56,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 14:01:58,253][12883] Updated weights for policy 0, policy_version 147633 (0.0043) [2024-06-18 14:02:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2418966528. Throughput: 0: 42648.4. Samples: 2419045020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:01,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 14:02:02,184][12883] Updated weights for policy 0, policy_version 147643 (0.0037) [2024-06-18 14:02:06,039][12883] Updated weights for policy 0, policy_version 147653 (0.0025) [2024-06-18 14:02:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2419179520. Throughput: 0: 42508.9. Samples: 2419300460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:06,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 14:02:09,709][12883] Updated weights for policy 0, policy_version 147663 (0.0034) [2024-06-18 14:02:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42543.7). Total num frames: 2419359744. Throughput: 0: 42513.3. Samples: 2419554000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:11,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 14:02:13,862][12883] Updated weights for policy 0, policy_version 147673 (0.0030) [2024-06-18 14:02:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2419621888. Throughput: 0: 42509.9. Samples: 2419679500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:16,996][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 14:02:17,284][12883] Updated weights for policy 0, policy_version 147683 (0.0030) [2024-06-18 14:02:21,460][12883] Updated weights for policy 0, policy_version 147693 (0.0030) [2024-06-18 14:02:21,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 42599.1). Total num frames: 2419818496. Throughput: 0: 42508.1. Samples: 2419937440. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:21,994][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 14:02:24,775][12883] Updated weights for policy 0, policy_version 147703 (0.0037) [2024-06-18 14:02:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2420015104. Throughput: 0: 42599.4. Samples: 2420195220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:26,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 14:02:27,816][12862] Signal inference workers to stop experience collection... (35350 times) [2024-06-18 14:02:27,864][12883] InferenceWorker_p0-w0: stopping experience collection (35350 times) [2024-06-18 14:02:27,876][12862] Signal inference workers to resume experience collection... (35350 times) [2024-06-18 14:02:27,886][12883] InferenceWorker_p0-w0: resuming experience collection (35350 times) [2024-06-18 14:02:29,487][12883] Updated weights for policy 0, policy_version 147713 (0.0032) [2024-06-18 14:02:31,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42596.9, 300 sec: 42709.1). Total num frames: 2420260864. Throughput: 0: 42538.3. Samples: 2420324700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:31,997][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 14:02:32,302][12883] Updated weights for policy 0, policy_version 147723 (0.0038) [2024-06-18 14:02:36,907][12883] Updated weights for policy 0, policy_version 147733 (0.0023) [2024-06-18 14:02:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2420457472. Throughput: 0: 42723.5. Samples: 2420584920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:36,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 14:02:37,171][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147735_2420490240.pth... 
[2024-06-18 14:02:37,224][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147109_2410233856.pth [2024-06-18 14:02:40,346][12883] Updated weights for policy 0, policy_version 147743 (0.0032) [2024-06-18 14:02:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2420670464. Throughput: 0: 42793.8. Samples: 2420844480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:41,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 14:02:44,338][12883] Updated weights for policy 0, policy_version 147753 (0.0036) [2024-06-18 14:02:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2420899840. Throughput: 0: 42752.4. Samples: 2420968880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-18 14:02:46,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 14:02:47,757][12883] Updated weights for policy 0, policy_version 147763 (0.0034) [2024-06-18 14:02:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2421096448. Throughput: 0: 42934.7. Samples: 2421232520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:02:51,994][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 14:02:52,029][12883] Updated weights for policy 0, policy_version 147773 (0.0042) [2024-06-18 14:02:55,447][12883] Updated weights for policy 0, policy_version 147783 (0.0046) [2024-06-18 14:02:57,000][12645] Fps is (10 sec: 42571.8, 60 sec: 43140.1, 300 sec: 42764.1). Total num frames: 2421325824. Throughput: 0: 42915.5. Samples: 2421485460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:02:57,000][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 14:02:59,783][12883] Updated weights for policy 0, policy_version 147793 (0.0039) [2024-06-18 14:03:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2421555200. Throughput: 0: 43085.9. Samples: 2421618360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:01,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 14:03:02,831][12883] Updated weights for policy 0, policy_version 147803 (0.0024) [2024-06-18 14:03:06,998][12645] Fps is (10 sec: 40966.5, 60 sec: 42595.1, 300 sec: 42597.7). Total num frames: 2421735424. Throughput: 0: 43071.5. Samples: 2421875860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:06,999][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 14:03:07,199][12883] Updated weights for policy 0, policy_version 147813 (0.0033) [2024-06-18 14:03:10,735][12883] Updated weights for policy 0, policy_version 147823 (0.0040) [2024-06-18 14:03:11,994][12645] Fps is (10 sec: 40959.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2421964800. Throughput: 0: 43044.0. Samples: 2422132200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:11,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 14:03:14,764][12883] Updated weights for policy 0, policy_version 147833 (0.0035) [2024-06-18 14:03:16,994][12645] Fps is (10 sec: 45896.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2422194176. Throughput: 0: 42948.8. Samples: 2422257300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:16,994][12645] Avg episode reward: [(0, '0.286')] [2024-06-18 14:03:18,203][12883] Updated weights for policy 0, policy_version 147843 (0.0032) [2024-06-18 14:03:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2422390784. Throughput: 0: 42927.5. Samples: 2422516660. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:21,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 14:03:22,344][12883] Updated weights for policy 0, policy_version 147853 (0.0028) [2024-06-18 14:03:25,944][12883] Updated weights for policy 0, policy_version 147863 (0.0033) [2024-06-18 14:03:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2422603776. Throughput: 0: 42850.6. Samples: 2422772760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:26,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 14:03:30,114][12883] Updated weights for policy 0, policy_version 147873 (0.0033) [2024-06-18 14:03:31,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 2422849536. Throughput: 0: 42986.2. Samples: 2422903260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:31,994][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 14:03:33,629][12883] Updated weights for policy 0, policy_version 147883 (0.0039) [2024-06-18 14:03:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2423029760. Throughput: 0: 42716.3. Samples: 2423154760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:36,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 14:03:37,630][12883] Updated weights for policy 0, policy_version 147893 (0.0029) [2024-06-18 14:03:41,872][12883] Updated weights for policy 0, policy_version 147903 (0.0033) [2024-06-18 14:03:41,908][12862] Signal inference workers to stop experience collection... (35400 times) [2024-06-18 14:03:41,908][12862] Signal inference workers to resume experience collection... 
(35400 times) [2024-06-18 14:03:41,936][12883] InferenceWorker_p0-w0: stopping experience collection (35400 times) [2024-06-18 14:03:41,936][12883] InferenceWorker_p0-w0: resuming experience collection (35400 times) [2024-06-18 14:03:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2423242752. Throughput: 0: 42875.8. Samples: 2423414600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:41,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 14:03:45,453][12883] Updated weights for policy 0, policy_version 147913 (0.0037) [2024-06-18 14:03:46,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2423472128. Throughput: 0: 42626.2. Samples: 2423536540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:46,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 14:03:49,426][12883] Updated weights for policy 0, policy_version 147923 (0.0028) [2024-06-18 14:03:51,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2423668736. Throughput: 0: 42558.1. Samples: 2423790780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:03:51,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 14:03:53,392][12883] Updated weights for policy 0, policy_version 147933 (0.0037) [2024-06-18 14:03:56,983][12883] Updated weights for policy 0, policy_version 147943 (0.0037) [2024-06-18 14:03:56,993][12645] Fps is (10 sec: 42598.6, 60 sec: 42876.0, 300 sec: 42765.0). Total num frames: 2423898112. Throughput: 0: 42698.9. Samples: 2424053640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:03:56,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 14:04:01,051][12883] Updated weights for policy 0, policy_version 147953 (0.0039) [2024-06-18 14:04:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2424111104. Throughput: 0: 42723.2. Samples: 2424179840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:01,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 14:04:04,500][12883] Updated weights for policy 0, policy_version 147963 (0.0044) [2024-06-18 14:04:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42874.9, 300 sec: 42598.4). Total num frames: 2424307712. Throughput: 0: 42550.8. Samples: 2424431440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:06,994][12645] Avg episode reward: [(0, '0.685')] [2024-06-18 14:04:08,567][12883] Updated weights for policy 0, policy_version 147973 (0.0036) [2024-06-18 14:04:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2424520704. Throughput: 0: 42577.5. Samples: 2424688740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:11,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 14:04:12,469][12883] Updated weights for policy 0, policy_version 147983 (0.0030) [2024-06-18 14:04:16,202][12883] Updated weights for policy 0, policy_version 147993 (0.0046) [2024-06-18 14:04:16,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2424750080. Throughput: 0: 42542.7. Samples: 2424817680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:16,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 14:04:19,985][12883] Updated weights for policy 0, policy_version 148003 (0.0032) [2024-06-18 14:04:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2424946688. Throughput: 0: 42679.7. Samples: 2425075340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:21,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 14:04:23,815][12883] Updated weights for policy 0, policy_version 148013 (0.0028) [2024-06-18 14:04:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42654.5). Total num frames: 2425176064. Throughput: 0: 42558.2. Samples: 2425329720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:26,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 14:04:28,138][12883] Updated weights for policy 0, policy_version 148023 (0.0040) [2024-06-18 14:04:31,308][12883] Updated weights for policy 0, policy_version 148033 (0.0030) [2024-06-18 14:04:31,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2425389056. Throughput: 0: 42760.4. Samples: 2425460760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:31,994][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 14:04:35,624][12883] Updated weights for policy 0, policy_version 148043 (0.0048) [2024-06-18 14:04:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2425585664. Throughput: 0: 42857.8. Samples: 2425719380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:36,994][12645] Avg episode reward: [(0, '0.642')] [2024-06-18 14:04:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148047_2425602048.pth... [2024-06-18 14:04:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147422_2415362048.pth [2024-06-18 14:04:39,063][12883] Updated weights for policy 0, policy_version 148053 (0.0038) [2024-06-18 14:04:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2425815040. Throughput: 0: 42507.8. Samples: 2425966500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:41,994][12645] Avg episode reward: [(0, '0.679')] [2024-06-18 14:04:43,174][12883] Updated weights for policy 0, policy_version 148063 (0.0030) [2024-06-18 14:04:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2426028032. Throughput: 0: 42697.8. Samples: 2426101240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:46,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 14:04:46,999][12883] Updated weights for policy 0, policy_version 148073 (0.0033) [2024-06-18 14:04:50,671][12883] Updated weights for policy 0, policy_version 148083 (0.0041) [2024-06-18 14:04:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2426224640. Throughput: 0: 42818.6. Samples: 2426358280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 14:04:51,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 14:04:54,554][12883] Updated weights for policy 0, policy_version 148093 (0.0024) [2024-06-18 14:04:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2426470400. Throughput: 0: 42701.7. Samples: 2426610320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:04:56,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 14:04:58,313][12883] Updated weights for policy 0, policy_version 148103 (0.0038) [2024-06-18 14:05:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.8). Total num frames: 2426650624. Throughput: 0: 42796.9. Samples: 2426743540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:01,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 14:05:02,280][12883] Updated weights for policy 0, policy_version 148113 (0.0035) [2024-06-18 14:05:05,851][12883] Updated weights for policy 0, policy_version 148123 (0.0027) [2024-06-18 14:05:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.2, 300 sec: 42653.9). Total num frames: 2426863616. Throughput: 0: 42679.4. Samples: 2426995920. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:06,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 14:05:09,869][12883] Updated weights for policy 0, policy_version 148133 (0.0032) [2024-06-18 14:05:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2427092992. Throughput: 0: 42587.1. Samples: 2427246140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:11,994][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 14:05:13,482][12883] Updated weights for policy 0, policy_version 148143 (0.0033) [2024-06-18 14:05:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2427305984. Throughput: 0: 42608.4. Samples: 2427378140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:16,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 14:05:17,431][12862] Signal inference workers to stop experience collection... (35450 times) [2024-06-18 14:05:17,431][12862] Signal inference workers to resume experience collection... (35450 times) [2024-06-18 14:05:17,468][12883] InferenceWorker_p0-w0: stopping experience collection (35450 times) [2024-06-18 14:05:17,468][12883] InferenceWorker_p0-w0: resuming experience collection (35450 times) [2024-06-18 14:05:17,573][12883] Updated weights for policy 0, policy_version 148153 (0.0031) [2024-06-18 14:05:21,162][12883] Updated weights for policy 0, policy_version 148163 (0.0034) [2024-06-18 14:05:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 2427518976. Throughput: 0: 42421.3. Samples: 2427628340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:21,994][12645] Avg episode reward: [(0, '0.755')] [2024-06-18 14:05:25,667][12883] Updated weights for policy 0, policy_version 148173 (0.0034) [2024-06-18 14:05:26,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2427731968. 
Throughput: 0: 42586.4. Samples: 2427882880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:26,994][12645] Avg episode reward: [(0, '0.779')] [2024-06-18 14:05:28,833][12883] Updated weights for policy 0, policy_version 148183 (0.0036) [2024-06-18 14:05:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2427944960. Throughput: 0: 42488.8. Samples: 2428013240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:31,994][12645] Avg episode reward: [(0, '0.876')] [2024-06-18 14:05:33,138][12883] Updated weights for policy 0, policy_version 148193 (0.0028) [2024-06-18 14:05:36,636][12883] Updated weights for policy 0, policy_version 148203 (0.0040) [2024-06-18 14:05:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2428157952. Throughput: 0: 42400.9. Samples: 2428266320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:36,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 14:05:41,056][12883] Updated weights for policy 0, policy_version 148213 (0.0032) [2024-06-18 14:05:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2428370944. Throughput: 0: 42523.1. Samples: 2428523860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:41,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 14:05:44,190][12883] Updated weights for policy 0, policy_version 148223 (0.0024) [2024-06-18 14:05:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2428583936. Throughput: 0: 42391.5. Samples: 2428651160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:46,995][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 14:05:48,670][12883] Updated weights for policy 0, policy_version 148233 (0.0029) [2024-06-18 14:05:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2428796928. 
Throughput: 0: 42443.1. Samples: 2428905860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:51,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 14:05:52,195][12883] Updated weights for policy 0, policy_version 148243 (0.0037) [2024-06-18 14:05:56,271][12883] Updated weights for policy 0, policy_version 148253 (0.0036) [2024-06-18 14:05:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2429009920. Throughput: 0: 42554.6. Samples: 2429161100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 14:05:56,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 14:05:59,702][12883] Updated weights for policy 0, policy_version 148263 (0.0039) [2024-06-18 14:06:01,994][12645] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2429190144. Throughput: 0: 42427.3. Samples: 2429287360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:01,994][12645] Avg episode reward: [(0, '0.768')] [2024-06-18 14:06:03,942][12883] Updated weights for policy 0, policy_version 148273 (0.0038) [2024-06-18 14:06:06,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2429419520. Throughput: 0: 42509.5. Samples: 2429541260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:06,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 14:06:07,539][12883] Updated weights for policy 0, policy_version 148283 (0.0028) [2024-06-18 14:06:11,786][12883] Updated weights for policy 0, policy_version 148293 (0.0048) [2024-06-18 14:06:11,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2429632512. Throughput: 0: 42538.5. Samples: 2429797120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:11,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 14:06:15,267][12883] Updated weights for policy 0, policy_version 148303 (0.0033) [2024-06-18 14:06:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2429845504. Throughput: 0: 42409.7. Samples: 2429921680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:16,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 14:06:19,466][12883] Updated weights for policy 0, policy_version 148313 (0.0028) [2024-06-18 14:06:21,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 2430058496. Throughput: 0: 42550.8. Samples: 2430181200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:21,996][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 14:06:22,887][12883] Updated weights for policy 0, policy_version 148323 (0.0040) [2024-06-18 14:06:26,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 2430255104. Throughput: 0: 42369.4. Samples: 2430430480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:26,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 14:06:27,725][12883] Updated weights for policy 0, policy_version 148333 (0.0057) [2024-06-18 14:06:30,516][12883] Updated weights for policy 0, policy_version 148343 (0.0033) [2024-06-18 14:06:31,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2430500864. Throughput: 0: 42317.3. Samples: 2430555440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:31,994][12645] Avg episode reward: [(0, '0.650')] [2024-06-18 14:06:35,523][12883] Updated weights for policy 0, policy_version 148353 (0.0032) [2024-06-18 14:06:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2430681088. Throughput: 0: 42322.7. Samples: 2430810380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:36,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 14:06:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148357_2430681088.pth... [2024-06-18 14:06:37,085][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000147735_2420490240.pth [2024-06-18 14:06:38,412][12883] Updated weights for policy 0, policy_version 148363 (0.0041) [2024-06-18 14:06:41,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2430894080. Throughput: 0: 42196.1. Samples: 2431059920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:41,994][12645] Avg episode reward: [(0, '0.658')] [2024-06-18 14:06:43,247][12883] Updated weights for policy 0, policy_version 148373 (0.0026) [2024-06-18 14:06:46,367][12883] Updated weights for policy 0, policy_version 148383 (0.0026) [2024-06-18 14:06:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2431123456. Throughput: 0: 42298.1. Samples: 2431190780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:46,994][12645] Avg episode reward: [(0, '0.667')] [2024-06-18 14:06:50,871][12883] Updated weights for policy 0, policy_version 148393 (0.0037) [2024-06-18 14:06:51,344][12862] Signal inference workers to stop experience collection... (35500 times) [2024-06-18 14:06:51,344][12862] Signal inference workers to resume experience collection... (35500 times) [2024-06-18 14:06:51,357][12883] InferenceWorker_p0-w0: stopping experience collection (35500 times) [2024-06-18 14:06:51,389][12883] InferenceWorker_p0-w0: resuming experience collection (35500 times) [2024-06-18 14:06:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2431320064. Throughput: 0: 42347.9. Samples: 2431446920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:51,994][12645] Avg episode reward: [(0, '0.222')] [2024-06-18 14:06:53,960][12883] Updated weights for policy 0, policy_version 148403 (0.0031) [2024-06-18 14:06:56,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 2431516672. Throughput: 0: 42188.9. Samples: 2431695620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:06:56,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 14:06:58,532][12883] Updated weights for policy 0, policy_version 148413 (0.0038) [2024-06-18 14:07:01,613][12883] Updated weights for policy 0, policy_version 148423 (0.0037) [2024-06-18 14:07:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2431762432. Throughput: 0: 42141.5. Samples: 2431818040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:07:01,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 14:07:06,284][12883] Updated weights for policy 0, policy_version 148433 (0.0039) [2024-06-18 14:07:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2431959040. Throughput: 0: 42136.2. Samples: 2432077240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:06,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 14:07:09,681][12883] Updated weights for policy 0, policy_version 148443 (0.0032) [2024-06-18 14:07:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2432155648. Throughput: 0: 42206.6. Samples: 2432329780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:11,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 14:07:14,001][12883] Updated weights for policy 0, policy_version 148453 (0.0030) [2024-06-18 14:07:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2432385024. Throughput: 0: 42288.4. Samples: 2432458420. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:16,994][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 14:07:17,416][12883] Updated weights for policy 0, policy_version 148463 (0.0039) [2024-06-18 14:07:21,665][12883] Updated weights for policy 0, policy_version 148473 (0.0035) [2024-06-18 14:07:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 2432598016. Throughput: 0: 42357.9. Samples: 2432716480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:21,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 14:07:25,249][12883] Updated weights for policy 0, policy_version 148483 (0.0025) [2024-06-18 14:07:26,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2432794624. Throughput: 0: 42416.4. Samples: 2432968660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:26,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 14:07:29,314][12883] Updated weights for policy 0, policy_version 148493 (0.0029) [2024-06-18 14:07:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2433024000. Throughput: 0: 42463.1. Samples: 2433101620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:31,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 14:07:32,887][12883] Updated weights for policy 0, policy_version 148503 (0.0023) [2024-06-18 14:07:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2433220608. Throughput: 0: 42268.0. Samples: 2433348980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:36,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 14:07:37,159][12883] Updated weights for policy 0, policy_version 148513 (0.0034) [2024-06-18 14:07:40,638][12883] Updated weights for policy 0, policy_version 148523 (0.0036) [2024-06-18 14:07:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2433449984. Throughput: 0: 42278.2. Samples: 2433598140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:41,994][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 14:07:44,885][12883] Updated weights for policy 0, policy_version 148533 (0.0045) [2024-06-18 14:07:46,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2433662976. Throughput: 0: 42514.6. Samples: 2433731200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:46,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 14:07:48,392][12883] Updated weights for policy 0, policy_version 148543 (0.0042) [2024-06-18 14:07:51,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42543.8). Total num frames: 2433875968. Throughput: 0: 42374.3. Samples: 2433984080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:51,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 14:07:52,427][12883] Updated weights for policy 0, policy_version 148553 (0.0036) [2024-06-18 14:07:56,023][12883] Updated weights for policy 0, policy_version 148563 (0.0040) [2024-06-18 14:07:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2434088960. Throughput: 0: 42340.9. Samples: 2434235120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:07:56,994][12645] Avg episode reward: [(0, '0.608')] [2024-06-18 14:08:00,051][12883] Updated weights for policy 0, policy_version 148573 (0.0030) [2024-06-18 14:08:01,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42599.1). Total num frames: 2434301952. Throughput: 0: 42440.5. Samples: 2434368240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:08:01,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 14:08:03,911][12883] Updated weights for policy 0, policy_version 148583 (0.0033) [2024-06-18 14:08:06,996][12645] Fps is (10 sec: 42588.9, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2434514944. Throughput: 0: 42313.3. Samples: 2434620680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:08:06,997][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 14:08:08,210][12883] Updated weights for policy 0, policy_version 148593 (0.0040) [2024-06-18 14:08:11,727][12883] Updated weights for policy 0, policy_version 148603 (0.0036) [2024-06-18 14:08:11,997][12645] Fps is (10 sec: 40947.1, 60 sec: 42596.2, 300 sec: 42431.3). Total num frames: 2434711552. Throughput: 0: 42181.0. Samples: 2434866940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:11,997][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 14:08:15,882][12883] Updated weights for policy 0, policy_version 148613 (0.0041) [2024-06-18 14:08:16,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2434924544. Throughput: 0: 42001.8. Samples: 2434991700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:16,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 14:08:19,951][12883] Updated weights for policy 0, policy_version 148623 (0.0027) [2024-06-18 14:08:21,994][12645] Fps is (10 sec: 42611.5, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2435137536. Throughput: 0: 42255.0. Samples: 2435250460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:21,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 14:08:23,552][12883] Updated weights for policy 0, policy_version 148633 (0.0034) [2024-06-18 14:08:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 2435334144. Throughput: 0: 42332.6. Samples: 2435503100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:26,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 14:08:27,607][12883] Updated weights for policy 0, policy_version 148643 (0.0031) [2024-06-18 14:08:28,933][12862] Signal inference workers to stop experience collection... (35550 times) [2024-06-18 14:08:28,933][12862] Signal inference workers to resume experience collection... (35550 times) [2024-06-18 14:08:28,956][12883] InferenceWorker_p0-w0: stopping experience collection (35550 times) [2024-06-18 14:08:28,957][12883] InferenceWorker_p0-w0: resuming experience collection (35550 times) [2024-06-18 14:08:31,151][12883] Updated weights for policy 0, policy_version 148653 (0.0027) [2024-06-18 14:08:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2435563520. Throughput: 0: 42015.0. Samples: 2435621880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:31,994][12645] Avg episode reward: [(0, '0.435')] [2024-06-18 14:08:35,365][12883] Updated weights for policy 0, policy_version 148663 (0.0038) [2024-06-18 14:08:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2435776512. Throughput: 0: 42220.5. Samples: 2435884000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:36,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 14:08:37,108][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148669_2435792896.pth... 
[2024-06-18 14:08:37,173][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148047_2425602048.pth [2024-06-18 14:08:38,775][12883] Updated weights for policy 0, policy_version 148673 (0.0042) [2024-06-18 14:08:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 2435973120. Throughput: 0: 42187.6. Samples: 2436133560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:41,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 14:08:43,069][12883] Updated weights for policy 0, policy_version 148683 (0.0033) [2024-06-18 14:08:46,481][12883] Updated weights for policy 0, policy_version 148693 (0.0031) [2024-06-18 14:08:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2436202496. Throughput: 0: 42086.7. Samples: 2436262140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:46,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 14:08:50,874][12883] Updated weights for policy 0, policy_version 148703 (0.0023) [2024-06-18 14:08:51,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2436415488. Throughput: 0: 42302.6. Samples: 2436524200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:51,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 14:08:54,077][12883] Updated weights for policy 0, policy_version 148713 (0.0037) [2024-06-18 14:08:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 2436595712. Throughput: 0: 42397.2. Samples: 2436774680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:08:56,995][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 14:08:58,722][12883] Updated weights for policy 0, policy_version 148723 (0.0025) [2024-06-18 14:09:01,787][12883] Updated weights for policy 0, policy_version 148733 (0.0028) [2024-06-18 14:09:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2436841472. Throughput: 0: 42390.7. Samples: 2436899280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:09:01,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 14:09:06,318][12883] Updated weights for policy 0, policy_version 148743 (0.0049) [2024-06-18 14:09:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 2437038080. Throughput: 0: 42377.0. Samples: 2437157420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:09:06,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 14:09:09,667][12883] Updated weights for policy 0, policy_version 148753 (0.0034) [2024-06-18 14:09:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42327.6, 300 sec: 42376.3). Total num frames: 2437251072. Throughput: 0: 42502.2. Samples: 2437415700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:11,994][12645] Avg episode reward: [(0, '0.196')] [2024-06-18 14:09:13,898][12883] Updated weights for policy 0, policy_version 148763 (0.0033) [2024-06-18 14:09:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2437464064. Throughput: 0: 42560.1. Samples: 2437537080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:16,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 14:09:17,347][12883] Updated weights for policy 0, policy_version 148773 (0.0042) [2024-06-18 14:09:21,663][12883] Updated weights for policy 0, policy_version 148783 (0.0028) [2024-06-18 14:09:21,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 2437660672. Throughput: 0: 42405.6. Samples: 2437792260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:21,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 14:09:24,845][12883] Updated weights for policy 0, policy_version 148793 (0.0034) [2024-06-18 14:09:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 2437890048. Throughput: 0: 42508.1. Samples: 2438046420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:26,994][12645] Avg episode reward: [(0, '0.281')] [2024-06-18 14:09:29,365][12883] Updated weights for policy 0, policy_version 148803 (0.0045) [2024-06-18 14:09:31,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2438103040. Throughput: 0: 42526.7. Samples: 2438175840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:31,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 14:09:32,445][12883] Updated weights for policy 0, policy_version 148813 (0.0035) [2024-06-18 14:09:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 2438299648. Throughput: 0: 42364.1. Samples: 2438430580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:36,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 14:09:37,094][12883] Updated weights for policy 0, policy_version 148823 (0.0032) [2024-06-18 14:09:39,994][12883] Updated weights for policy 0, policy_version 148833 (0.0033) [2024-06-18 14:09:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 2438529024. Throughput: 0: 42571.1. Samples: 2438690380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:41,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 14:09:44,688][12883] Updated weights for policy 0, policy_version 148843 (0.0030) [2024-06-18 14:09:46,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2438758400. Throughput: 0: 42747.5. Samples: 2438822920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:46,994][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 14:09:47,634][12883] Updated weights for policy 0, policy_version 148853 (0.0034) [2024-06-18 14:09:49,639][12862] Signal inference workers to stop experience collection... (35600 times) [2024-06-18 14:09:49,681][12883] InferenceWorker_p0-w0: stopping experience collection (35600 times) [2024-06-18 14:09:49,692][12862] Signal inference workers to resume experience collection... (35600 times) [2024-06-18 14:09:49,702][12883] InferenceWorker_p0-w0: resuming experience collection (35600 times) [2024-06-18 14:09:51,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 2438938624. Throughput: 0: 42605.8. Samples: 2439074680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:51,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 14:09:52,431][12883] Updated weights for policy 0, policy_version 148863 (0.0029) [2024-06-18 14:09:55,705][12883] Updated weights for policy 0, policy_version 148873 (0.0039) [2024-06-18 14:09:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2439168000. Throughput: 0: 42452.0. Samples: 2439326040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:09:56,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 14:10:00,119][12883] Updated weights for policy 0, policy_version 148883 (0.0046) [2024-06-18 14:10:01,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 2439380992. Throughput: 0: 42750.8. Samples: 2439460960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:10:01,996][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 14:10:03,260][12883] Updated weights for policy 0, policy_version 148893 (0.0044) [2024-06-18 14:10:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 2439593984. Throughput: 0: 42678.4. Samples: 2439712780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:10:06,994][12645] Avg episode reward: [(0, '0.652')] [2024-06-18 14:10:07,918][12883] Updated weights for policy 0, policy_version 148903 (0.0040) [2024-06-18 14:10:10,810][12883] Updated weights for policy 0, policy_version 148913 (0.0040) [2024-06-18 14:10:11,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 2439806976. Throughput: 0: 42723.9. Samples: 2439969000. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 14:10:11,994][12645] Avg episode reward: [(0, '0.646')] [2024-06-18 14:10:15,539][12883] Updated weights for policy 0, policy_version 148923 (0.0034) [2024-06-18 14:10:17,000][12645] Fps is (10 sec: 44208.6, 60 sec: 42867.0, 300 sec: 42430.9). Total num frames: 2440036352. Throughput: 0: 42782.4. Samples: 2440101320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:10:17,001][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 14:10:18,477][12883] Updated weights for policy 0, policy_version 148933 (0.0038) [2024-06-18 14:10:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 2440249344. Throughput: 0: 42836.8. Samples: 2440358240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:10:21,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 14:10:23,164][12883] Updated weights for policy 0, policy_version 148943 (0.0032) [2024-06-18 14:10:26,216][12883] Updated weights for policy 0, policy_version 148953 (0.0035) [2024-06-18 14:10:26,994][12645] Fps is (10 sec: 42625.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2440462336. Throughput: 0: 42681.0. Samples: 2440611020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:10:26,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 14:10:30,640][12883] Updated weights for policy 0, policy_version 148963 (0.0036) [2024-06-18 14:10:31,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42596.8, 300 sec: 42375.9). Total num frames: 2440658944. Throughput: 0: 42752.6. Samples: 2440746880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:10:31,997][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 14:10:33,780][12883] Updated weights for policy 0, policy_version 148973 (0.0036) [2024-06-18 14:10:36,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43417.5, 300 sec: 42487.3). Total num frames: 2440904704. Throughput: 0: 42850.5. Samples: 2441002960. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:10:36,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 14:10:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148981_2440904704.pth... [2024-06-18 14:10:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148357_2430681088.pth [2024-06-18 14:10:38,575][12883] Updated weights for policy 0, policy_version 148983 (0.0031) [2024-06-18 14:10:41,238][12883] Updated weights for policy 0, policy_version 148993 (0.0029) [2024-06-18 14:10:41,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2441101312. Throughput: 0: 42927.9. Samples: 2441257800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:10:41,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 14:10:46,201][12883] Updated weights for policy 0, policy_version 149003 (0.0031) [2024-06-18 14:10:46,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 2441297920. Throughput: 0: 42802.1. Samples: 2441386960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:10:46,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 14:10:49,323][12883] Updated weights for policy 0, policy_version 149013 (0.0030) [2024-06-18 14:10:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42487.3). Total num frames: 2441543680. Throughput: 0: 42938.9. Samples: 2441645040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:10:51,995][12645] Avg episode reward: [(0, '0.696')] [2024-06-18 14:10:54,043][12883] Updated weights for policy 0, policy_version 149023 (0.0034) [2024-06-18 14:10:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2441740288. Throughput: 0: 43019.6. Samples: 2441904880. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:10:56,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 14:10:57,015][12883] Updated weights for policy 0, policy_version 149033 (0.0023) [2024-06-18 14:11:01,556][12883] Updated weights for policy 0, policy_version 149043 (0.0027) [2024-06-18 14:11:01,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 2441953280. Throughput: 0: 42901.6. Samples: 2442031620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:11:01,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 14:11:04,458][12883] Updated weights for policy 0, policy_version 149053 (0.0025) [2024-06-18 14:11:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2442182656. Throughput: 0: 43072.0. Samples: 2442296480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:11:06,994][12645] Avg episode reward: [(0, '0.280')] [2024-06-18 14:11:09,024][12883] Updated weights for policy 0, policy_version 149063 (0.0041) [2024-06-18 14:11:09,130][12862] Signal inference workers to stop experience collection... (35650 times) [2024-06-18 14:11:09,179][12883] InferenceWorker_p0-w0: stopping experience collection (35650 times) [2024-06-18 14:11:09,184][12862] Signal inference workers to resume experience collection... (35650 times) [2024-06-18 14:11:09,188][12883] InferenceWorker_p0-w0: resuming experience collection (35650 times) [2024-06-18 14:11:11,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 2442395648. Throughput: 0: 43103.6. Samples: 2442550680. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:11:11,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 14:11:12,035][12883] Updated weights for policy 0, policy_version 149073 (0.0031) [2024-06-18 14:11:16,614][12883] Updated weights for policy 0, policy_version 149083 (0.0040) [2024-06-18 14:11:16,999][12645] Fps is (10 sec: 42575.5, 60 sec: 42872.1, 300 sec: 42542.4). Total num frames: 2442608640. Throughput: 0: 42883.3. Samples: 2442676760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 14:11:17,000][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 14:11:19,919][12883] Updated weights for policy 0, policy_version 149093 (0.0032) [2024-06-18 14:11:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2442821632. Throughput: 0: 42916.1. Samples: 2442934180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:11:21,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 14:11:24,204][12883] Updated weights for policy 0, policy_version 149103 (0.0036) [2024-06-18 14:11:26,994][12645] Fps is (10 sec: 40981.3, 60 sec: 42598.2, 300 sec: 42431.8). Total num frames: 2443018240. Throughput: 0: 42991.0. Samples: 2443192400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:11:26,994][12645] Avg episode reward: [(0, '0.670')] [2024-06-18 14:11:27,528][12883] Updated weights for policy 0, policy_version 149113 (0.0037) [2024-06-18 14:11:31,625][12883] Updated weights for policy 0, policy_version 149123 (0.0038) [2024-06-18 14:11:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42873.1, 300 sec: 42542.9). Total num frames: 2443231232. Throughput: 0: 43008.0. Samples: 2443322320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:11:31,994][12645] Avg episode reward: [(0, '0.795')] [2024-06-18 14:11:35,016][12883] Updated weights for policy 0, policy_version 149133 (0.0030) [2024-06-18 14:11:37,000][12645] Fps is (10 sec: 44210.4, 60 sec: 42594.1, 300 sec: 42597.5). Total num frames: 2443460608. Throughput: 0: 42884.9. Samples: 2443575120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:11:37,000][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 14:11:38,987][12883] Updated weights for policy 0, policy_version 149143 (0.0027) [2024-06-18 14:11:41,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2443673600. Throughput: 0: 42865.6. Samples: 2443833840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:11:41,994][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 14:11:42,570][12883] Updated weights for policy 0, policy_version 149153 (0.0041) [2024-06-18 14:11:46,994][12645] Fps is (10 sec: 40985.7, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2443870208. Throughput: 0: 42928.1. Samples: 2443963380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:11:46,994][12645] Avg episode reward: [(0, '0.784')] [2024-06-18 14:11:47,010][12883] Updated weights for policy 0, policy_version 149163 (0.0035) [2024-06-18 14:11:50,006][12883] Updated weights for policy 0, policy_version 149173 (0.0027) [2024-06-18 14:11:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2444099584. Throughput: 0: 42646.6. Samples: 2444215580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:11:51,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 14:11:54,630][12883] Updated weights for policy 0, policy_version 149183 (0.0036) [2024-06-18 14:11:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2444328960. Throughput: 0: 42763.5. Samples: 2444475040. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:11:56,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 14:11:57,670][12883] Updated weights for policy 0, policy_version 149193 (0.0036) [2024-06-18 14:12:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2444525568. Throughput: 0: 42790.4. Samples: 2444602100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:12:01,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 14:12:02,176][12883] Updated weights for policy 0, policy_version 149203 (0.0035) [2024-06-18 14:12:05,780][12883] Updated weights for policy 0, policy_version 149213 (0.0042) [2024-06-18 14:12:07,000][12645] Fps is (10 sec: 42571.8, 60 sec: 42867.0, 300 sec: 42708.6). Total num frames: 2444754944. Throughput: 0: 42738.9. Samples: 2444857700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:12:07,000][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 14:12:10,281][12883] Updated weights for policy 0, policy_version 149223 (0.0026) [2024-06-18 14:12:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2444967936. Throughput: 0: 42620.2. Samples: 2445110300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:12:11,994][12645] Avg episode reward: [(0, '0.607')] [2024-06-18 14:12:13,554][12883] Updated weights for policy 0, policy_version 149233 (0.0025) [2024-06-18 14:12:16,994][12645] Fps is (10 sec: 39346.5, 60 sec: 42329.2, 300 sec: 42542.9). Total num frames: 2445148160. Throughput: 0: 42625.0. Samples: 2445240440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:12:16,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 14:12:17,774][12883] Updated weights for policy 0, policy_version 149243 (0.0026) [2024-06-18 14:12:21,297][12883] Updated weights for policy 0, policy_version 149253 (0.0027) [2024-06-18 14:12:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2445393920. Throughput: 0: 42716.9. Samples: 2445497120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 14:12:21,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 14:12:25,286][12883] Updated weights for policy 0, policy_version 149263 (0.0036) [2024-06-18 14:12:26,109][12862] Signal inference workers to stop experience collection... (35700 times) [2024-06-18 14:12:26,109][12862] Signal inference workers to resume experience collection... (35700 times) [2024-06-18 14:12:26,119][12883] InferenceWorker_p0-w0: stopping experience collection (35700 times) [2024-06-18 14:12:26,120][12883] InferenceWorker_p0-w0: resuming experience collection (35700 times) [2024-06-18 14:12:26,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 2445590528. Throughput: 0: 42687.8. Samples: 2445754780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:12:26,994][12645] Avg episode reward: [(0, '0.464')] [2024-06-18 14:12:28,865][12883] Updated weights for policy 0, policy_version 149273 (0.0031) [2024-06-18 14:12:31,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2445803520. Throughput: 0: 42654.9. Samples: 2445882860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:12:31,994][12645] Avg episode reward: [(0, '0.377')] [2024-06-18 14:12:32,861][12883] Updated weights for policy 0, policy_version 149283 (0.0038) [2024-06-18 14:12:36,427][12883] Updated weights for policy 0, policy_version 149293 (0.0033) [2024-06-18 14:12:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42875.9, 300 sec: 42654.0). Total num frames: 2446032896. Throughput: 0: 42774.8. Samples: 2446140440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:12:36,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 14:12:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149295_2446049280.pth... [2024-06-18 14:12:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148669_2435792896.pth [2024-06-18 14:12:40,394][12883] Updated weights for policy 0, policy_version 149303 (0.0027) [2024-06-18 14:12:42,000][12645] Fps is (10 sec: 44209.7, 60 sec: 42867.1, 300 sec: 42653.0). Total num frames: 2446245888. Throughput: 0: 42879.9. Samples: 2446404900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:12:42,000][12645] Avg episode reward: [(0, '0.141')] [2024-06-18 14:12:43,956][12883] Updated weights for policy 0, policy_version 149313 (0.0043) [2024-06-18 14:12:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2446458880. Throughput: 0: 42809.7. Samples: 2446528540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:12:46,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 14:12:48,537][12883] Updated weights for policy 0, policy_version 149323 (0.0028) [2024-06-18 14:12:51,595][12883] Updated weights for policy 0, policy_version 149333 (0.0035) [2024-06-18 14:12:51,996][12645] Fps is (10 sec: 42615.4, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2446671872. Throughput: 0: 42818.5. Samples: 2446784360. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:12:51,997][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 14:12:56,114][12883] Updated weights for policy 0, policy_version 149343 (0.0036) [2024-06-18 14:12:56,996][12645] Fps is (10 sec: 42587.3, 60 sec: 42596.5, 300 sec: 42653.6). Total num frames: 2446884864. Throughput: 0: 42990.7. Samples: 2447045000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:12:56,997][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 14:12:59,212][12883] Updated weights for policy 0, policy_version 149353 (0.0044) [2024-06-18 14:13:01,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2447097856. Throughput: 0: 42833.6. Samples: 2447167960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:13:01,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 14:13:03,664][12883] Updated weights for policy 0, policy_version 149363 (0.0039) [2024-06-18 14:13:06,994][12645] Fps is (10 sec: 42610.0, 60 sec: 42602.9, 300 sec: 42709.9). Total num frames: 2447310848. Throughput: 0: 42760.5. Samples: 2447421340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:13:06,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 14:13:07,052][12883] Updated weights for policy 0, policy_version 149373 (0.0039) [2024-06-18 14:13:11,277][12883] Updated weights for policy 0, policy_version 149383 (0.0032) [2024-06-18 14:13:11,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2447523840. Throughput: 0: 42787.5. Samples: 2447680220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:13:11,994][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 14:13:14,932][12883] Updated weights for policy 0, policy_version 149393 (0.0042) [2024-06-18 14:13:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2447736832. Throughput: 0: 42796.9. Samples: 2447808720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:13:16,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 14:13:18,801][12883] Updated weights for policy 0, policy_version 149403 (0.0041) [2024-06-18 14:13:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2447949824. Throughput: 0: 42643.9. Samples: 2448059420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 14:13:21,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 14:13:22,967][12883] Updated weights for policy 0, policy_version 149413 (0.0035) [2024-06-18 14:13:26,310][12883] Updated weights for policy 0, policy_version 149423 (0.0045) [2024-06-18 14:13:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2448162816. Throughput: 0: 42488.9. Samples: 2448316640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:13:26,994][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 14:13:30,719][12883] Updated weights for policy 0, policy_version 149433 (0.0044) [2024-06-18 14:13:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2448359424. Throughput: 0: 42518.2. Samples: 2448441860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:13:31,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 14:13:34,296][12883] Updated weights for policy 0, policy_version 149443 (0.0038) [2024-06-18 14:13:36,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2448572416. Throughput: 0: 42442.2. Samples: 2448694160. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:13:36,994][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 14:13:38,319][12883] Updated weights for policy 0, policy_version 149453 (0.0032) [2024-06-18 14:13:41,794][12883] Updated weights for policy 0, policy_version 149463 (0.0043) [2024-06-18 14:13:41,998][12645] Fps is (10 sec: 44216.2, 60 sec: 42599.4, 300 sec: 42708.8). Total num frames: 2448801792. Throughput: 0: 42438.5. Samples: 2448954820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:13:41,999][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 14:13:45,965][12883] Updated weights for policy 0, policy_version 149473 (0.0037) [2024-06-18 14:13:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2448998400. Throughput: 0: 42594.3. Samples: 2449084700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:13:46,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 14:13:49,438][12883] Updated weights for policy 0, policy_version 149483 (0.0033) [2024-06-18 14:13:51,994][12645] Fps is (10 sec: 40979.4, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 2449211392. Throughput: 0: 42468.8. Samples: 2449332440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:13:51,994][12645] Avg episode reward: [(0, '0.625')] [2024-06-18 14:13:53,709][12883] Updated weights for policy 0, policy_version 149493 (0.0037) [2024-06-18 14:13:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42327.3, 300 sec: 42653.9). Total num frames: 2449424384. Throughput: 0: 42412.9. Samples: 2449588800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:13:56,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 14:13:57,376][12883] Updated weights for policy 0, policy_version 149503 (0.0044) [2024-06-18 14:14:01,541][12883] Updated weights for policy 0, policy_version 149513 (0.0041) [2024-06-18 14:14:01,994][12645] Fps is (10 sec: 42595.5, 60 sec: 42324.9, 300 sec: 42709.4). Total num frames: 2449637376. Throughput: 0: 42424.7. Samples: 2449717860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:14:01,995][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 14:14:04,979][12883] Updated weights for policy 0, policy_version 149523 (0.0034) [2024-06-18 14:14:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2449866752. Throughput: 0: 42355.1. Samples: 2449965400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:14:06,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 14:14:09,363][12883] Updated weights for policy 0, policy_version 149533 (0.0032) [2024-06-18 14:14:11,994][12645] Fps is (10 sec: 42601.5, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2450063360. Throughput: 0: 42593.0. Samples: 2450233320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:14:11,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 14:14:12,637][12883] Updated weights for policy 0, policy_version 149543 (0.0028) [2024-06-18 14:14:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2450259968. Throughput: 0: 42514.3. Samples: 2450355000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:14:16,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 14:14:17,095][12883] Updated weights for policy 0, policy_version 149553 (0.0042) [2024-06-18 14:14:20,339][12883] Updated weights for policy 0, policy_version 149563 (0.0030) [2024-06-18 14:14:20,578][12862] Signal inference workers to stop experience collection... (35750 times) [2024-06-18 14:14:20,579][12862] Signal inference workers to resume experience collection... (35750 times) [2024-06-18 14:14:20,613][12883] InferenceWorker_p0-w0: stopping experience collection (35750 times) [2024-06-18 14:14:20,613][12883] InferenceWorker_p0-w0: resuming experience collection (35750 times) [2024-06-18 14:14:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2450505728. Throughput: 0: 42522.1. Samples: 2450607660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:14:21,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 14:14:24,732][12883] Updated weights for policy 0, policy_version 149573 (0.0033) [2024-06-18 14:14:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 2450685952. Throughput: 0: 42509.4. Samples: 2450867540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:14:26,994][12645] Avg episode reward: [(0, '0.631')] [2024-06-18 14:14:27,944][12883] Updated weights for policy 0, policy_version 149583 (0.0037) [2024-06-18 14:14:31,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2450898944. Throughput: 0: 42353.3. Samples: 2450990600. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:14:31,994][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 14:14:32,514][12883] Updated weights for policy 0, policy_version 149593 (0.0038) [2024-06-18 14:14:35,562][12883] Updated weights for policy 0, policy_version 149603 (0.0034) [2024-06-18 14:14:36,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 2451161088. Throughput: 0: 42609.8. Samples: 2451249880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:14:36,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 14:14:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149607_2451161088.pth... [2024-06-18 14:14:37,094][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000148981_2440904704.pth [2024-06-18 14:14:40,135][12883] Updated weights for policy 0, policy_version 149613 (0.0023) [2024-06-18 14:14:41,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42055.7, 300 sec: 42598.4). Total num frames: 2451324928. Throughput: 0: 42597.9. Samples: 2451505700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:14:41,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 14:14:43,413][12883] Updated weights for policy 0, policy_version 149623 (0.0039) [2024-06-18 14:14:46,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2451537920. Throughput: 0: 42303.3. Samples: 2451621480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:14:46,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 14:14:48,396][12883] Updated weights for policy 0, policy_version 149633 (0.0038) [2024-06-18 14:14:51,297][12883] Updated weights for policy 0, policy_version 149643 (0.0033) [2024-06-18 14:14:51,994][12645] Fps is (10 sec: 47512.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2451800064. Throughput: 0: 42504.0. Samples: 2451878080. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:14:51,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 14:14:56,323][12883] Updated weights for policy 0, policy_version 149653 (0.0034) [2024-06-18 14:14:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 2451963904. Throughput: 0: 42290.2. Samples: 2452136380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:14:56,994][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 14:14:58,904][12883] Updated weights for policy 0, policy_version 149663 (0.0027) [2024-06-18 14:15:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.9, 300 sec: 42709.5). Total num frames: 2452193280. Throughput: 0: 42130.2. Samples: 2452250860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:15:01,994][12645] Avg episode reward: [(0, '0.667')] [2024-06-18 14:15:03,872][12883] Updated weights for policy 0, policy_version 149673 (0.0028) [2024-06-18 14:15:06,618][12883] Updated weights for policy 0, policy_version 149683 (0.0042) [2024-06-18 14:15:06,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2452422656. Throughput: 0: 42428.8. Samples: 2452516960. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:15:06,994][12645] Avg episode reward: [(0, '0.775')] [2024-06-18 14:15:11,528][12883] Updated weights for policy 0, policy_version 149693 (0.0025) [2024-06-18 14:15:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42543.8). Total num frames: 2452586496. Throughput: 0: 42462.1. Samples: 2452778340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:15:11,994][12645] Avg episode reward: [(0, '0.787')] [2024-06-18 14:15:14,320][12883] Updated weights for policy 0, policy_version 149703 (0.0032) [2024-06-18 14:15:16,994][12645] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2452848640. Throughput: 0: 42277.0. Samples: 2452893060. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:15:16,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 14:15:19,176][12883] Updated weights for policy 0, policy_version 149713 (0.0050) [2024-06-18 14:15:21,993][12645] Fps is (10 sec: 47514.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2453061632. Throughput: 0: 42395.3. Samples: 2453157660. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:15:21,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 14:15:22,002][12883] Updated weights for policy 0, policy_version 149723 (0.0041) [2024-06-18 14:15:26,802][12883] Updated weights for policy 0, policy_version 149733 (0.0033) [2024-06-18 14:15:26,994][12645] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 2453225472. Throughput: 0: 42512.3. Samples: 2453418760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:15:26,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 14:15:29,549][12883] Updated weights for policy 0, policy_version 149743 (0.0028) [2024-06-18 14:15:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2453471232. Throughput: 0: 42464.1. Samples: 2453532360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-18 14:15:31,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 14:15:34,654][12883] Updated weights for policy 0, policy_version 149753 (0.0033) [2024-06-18 14:15:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2453684224. Throughput: 0: 42553.0. Samples: 2453792960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:15:36,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 14:15:37,558][12883] Updated weights for policy 0, policy_version 149763 (0.0034) [2024-06-18 14:15:41,994][12645] Fps is (10 sec: 37682.7, 60 sec: 42052.1, 300 sec: 42542.9). Total num frames: 2453848064. Throughput: 0: 42428.8. Samples: 2454045680. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:15:41,994][12645] Avg episode reward: [(0, '0.730')] [2024-06-18 14:15:42,373][12883] Updated weights for policy 0, policy_version 149773 (0.0027) [2024-06-18 14:15:45,404][12883] Updated weights for policy 0, policy_version 149783 (0.0022) [2024-06-18 14:15:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2454110208. Throughput: 0: 42478.3. Samples: 2454162380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:15:46,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 14:15:49,989][12883] Updated weights for policy 0, policy_version 149793 (0.0039) [2024-06-18 14:15:51,123][12862] Signal inference workers to stop experience collection... (35800 times) [2024-06-18 14:15:51,153][12883] InferenceWorker_p0-w0: stopping experience collection (35800 times) [2024-06-18 14:15:51,171][12862] Signal inference workers to resume experience collection... (35800 times) [2024-06-18 14:15:51,173][12883] InferenceWorker_p0-w0: resuming experience collection (35800 times) [2024-06-18 14:15:52,000][12645] Fps is (10 sec: 45846.6, 60 sec: 41774.9, 300 sec: 42597.5). Total num frames: 2454306816. Throughput: 0: 42448.8. Samples: 2454427420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:15:52,001][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 14:15:53,080][12883] Updated weights for policy 0, policy_version 149803 (0.0036) [2024-06-18 14:15:56,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2454503424. Throughput: 0: 42282.7. Samples: 2454681060. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:15:56,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 14:15:57,531][12883] Updated weights for policy 0, policy_version 149813 (0.0038) [2024-06-18 14:16:00,748][12883] Updated weights for policy 0, policy_version 149823 (0.0035) [2024-06-18 14:16:01,994][12645] Fps is (10 sec: 42625.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2454732800. Throughput: 0: 42597.3. Samples: 2454809940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:16:01,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 14:16:05,031][12883] Updated weights for policy 0, policy_version 149833 (0.0038) [2024-06-18 14:16:06,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 2454945792. Throughput: 0: 42349.5. Samples: 2455063400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:16:06,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 14:16:08,282][12883] Updated weights for policy 0, policy_version 149843 (0.0033) [2024-06-18 14:16:11,999][12645] Fps is (10 sec: 40937.2, 60 sec: 42594.5, 300 sec: 42487.3). Total num frames: 2455142400. Throughput: 0: 42121.9. Samples: 2455314480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:16:12,000][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 14:16:12,607][12883] Updated weights for policy 0, policy_version 149853 (0.0046) [2024-06-18 14:16:15,859][12883] Updated weights for policy 0, policy_version 149863 (0.0032) [2024-06-18 14:16:16,994][12645] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2455355392. Throughput: 0: 42535.5. Samples: 2455446460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:16:16,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 14:16:20,517][12883] Updated weights for policy 0, policy_version 149873 (0.0035) [2024-06-18 14:16:21,994][12645] Fps is (10 sec: 42621.8, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 2455568384. Throughput: 0: 42358.6. Samples: 2455699100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:16:21,994][12645] Avg episode reward: [(0, '0.225')] [2024-06-18 14:16:23,843][12883] Updated weights for policy 0, policy_version 149883 (0.0032) [2024-06-18 14:16:27,000][12645] Fps is (10 sec: 42571.4, 60 sec: 42594.0, 300 sec: 42542.0). Total num frames: 2455781376. Throughput: 0: 42453.3. Samples: 2455956340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:16:27,000][12645] Avg episode reward: [(0, '0.241')] [2024-06-18 14:16:28,205][12883] Updated weights for policy 0, policy_version 149893 (0.0026) [2024-06-18 14:16:31,395][12883] Updated weights for policy 0, policy_version 149903 (0.0037) [2024-06-18 14:16:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42543.7). Total num frames: 2456010752. Throughput: 0: 42650.6. Samples: 2456081660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:16:31,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 14:16:36,070][12883] Updated weights for policy 0, policy_version 149913 (0.0029) [2024-06-18 14:16:36,994][12645] Fps is (10 sec: 42625.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2456207360. Throughput: 0: 42502.0. Samples: 2456339740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:16:36,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 14:16:37,122][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149916_2456223744.pth... 
[2024-06-18 14:16:37,183][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149295_2446049280.pth [2024-06-18 14:16:39,008][12883] Updated weights for policy 0, policy_version 149923 (0.0037) [2024-06-18 14:16:41,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 2456420352. Throughput: 0: 42343.3. Samples: 2456586600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:16:41,996][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 14:16:43,897][12883] Updated weights for policy 0, policy_version 149933 (0.0029) [2024-06-18 14:16:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2456649728. Throughput: 0: 42348.9. Samples: 2456715640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:16:46,994][12645] Avg episode reward: [(0, '0.763')] [2024-06-18 14:16:47,091][12883] Updated weights for policy 0, policy_version 149943 (0.0031) [2024-06-18 14:16:51,913][12883] Updated weights for policy 0, policy_version 149953 (0.0037) [2024-06-18 14:16:51,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42056.6, 300 sec: 42376.2). Total num frames: 2456829952. Throughput: 0: 42243.1. Samples: 2456964340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:16:51,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 14:16:54,629][12862] Signal inference workers to stop experience collection... (35850 times) [2024-06-18 14:16:54,630][12862] Signal inference workers to resume experience collection... (35850 times) [2024-06-18 14:16:54,656][12883] InferenceWorker_p0-w0: stopping experience collection (35850 times) [2024-06-18 14:16:54,656][12883] InferenceWorker_p0-w0: resuming experience collection (35850 times) [2024-06-18 14:16:54,770][12883] Updated weights for policy 0, policy_version 149963 (0.0028) [2024-06-18 14:16:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2457059328. 
Throughput: 0: 42512.9. Samples: 2457227320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:16:56,994][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 14:16:59,412][12883] Updated weights for policy 0, policy_version 149973 (0.0035) [2024-06-18 14:17:01,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42488.2). Total num frames: 2457288704. Throughput: 0: 42491.9. Samples: 2457358600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:17:01,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 14:17:02,321][12883] Updated weights for policy 0, policy_version 149983 (0.0037) [2024-06-18 14:17:06,855][12883] Updated weights for policy 0, policy_version 149993 (0.0030) [2024-06-18 14:17:06,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2457485312. Throughput: 0: 42589.7. Samples: 2457615640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:17:06,995][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 14:17:10,282][12883] Updated weights for policy 0, policy_version 150003 (0.0037) [2024-06-18 14:17:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42602.3, 300 sec: 42542.9). Total num frames: 2457698304. Throughput: 0: 42506.4. Samples: 2457868860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:17:11,994][12645] Avg episode reward: [(0, '0.642')] [2024-06-18 14:17:14,599][12883] Updated weights for policy 0, policy_version 150013 (0.0038) [2024-06-18 14:17:16,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2457927680. Throughput: 0: 42646.7. Samples: 2458000760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:17:16,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 14:17:17,996][12883] Updated weights for policy 0, policy_version 150023 (0.0032) [2024-06-18 14:17:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2458124288. 
Throughput: 0: 42554.6. Samples: 2458254700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:17:21,996][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 14:17:22,046][12883] Updated weights for policy 0, policy_version 150033 (0.0029) [2024-06-18 14:17:25,507][12883] Updated weights for policy 0, policy_version 150043 (0.0031) [2024-06-18 14:17:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42876.0, 300 sec: 42542.9). Total num frames: 2458353664. Throughput: 0: 42900.4. Samples: 2458517020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:17:26,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 14:17:29,600][12883] Updated weights for policy 0, policy_version 150053 (0.0037) [2024-06-18 14:17:31,996][12645] Fps is (10 sec: 45865.2, 60 sec: 42869.9, 300 sec: 42542.5). Total num frames: 2458583040. Throughput: 0: 42896.9. Samples: 2458646100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:17:31,996][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 14:17:33,124][12883] Updated weights for policy 0, policy_version 150063 (0.0040) [2024-06-18 14:17:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42432.7). Total num frames: 2458763264. Throughput: 0: 43056.5. Samples: 2458901880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 14:17:36,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 14:17:37,259][12883] Updated weights for policy 0, policy_version 150073 (0.0033) [2024-06-18 14:17:40,906][12883] Updated weights for policy 0, policy_version 150083 (0.0045) [2024-06-18 14:17:41,994][12645] Fps is (10 sec: 40968.8, 60 sec: 42873.0, 300 sec: 42487.3). Total num frames: 2458992640. Throughput: 0: 42846.5. Samples: 2459155420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:17:41,994][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 14:17:44,824][12883] Updated weights for policy 0, policy_version 150093 (0.0045) [2024-06-18 14:17:46,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43144.5, 300 sec: 42598.7). Total num frames: 2459238400. Throughput: 0: 42832.4. Samples: 2459286060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:17:46,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 14:17:48,774][12883] Updated weights for policy 0, policy_version 150103 (0.0035) [2024-06-18 14:17:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42487.7). Total num frames: 2459418624. Throughput: 0: 42746.6. Samples: 2459539240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:17:51,994][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 14:17:52,429][12883] Updated weights for policy 0, policy_version 150113 (0.0034) [2024-06-18 14:17:56,825][12883] Updated weights for policy 0, policy_version 150123 (0.0039) [2024-06-18 14:17:56,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42869.8, 300 sec: 42487.0). Total num frames: 2459631616. Throughput: 0: 42800.5. Samples: 2459794980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:17:56,996][12645] Avg episode reward: [(0, '0.270')] [2024-06-18 14:18:00,481][12883] Updated weights for policy 0, policy_version 150133 (0.0037) [2024-06-18 14:18:01,994][12645] Fps is (10 sec: 45875.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2459877376. Throughput: 0: 42624.4. Samples: 2459918860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:18:01,999][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 14:18:04,450][12883] Updated weights for policy 0, policy_version 150143 (0.0036) [2024-06-18 14:18:06,994][12645] Fps is (10 sec: 42607.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2460057600. Throughput: 0: 42819.1. Samples: 2460181560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:18:06,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 14:18:08,096][12883] Updated weights for policy 0, policy_version 150153 (0.0031) [2024-06-18 14:18:11,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2460254208. Throughput: 0: 42642.6. Samples: 2460435940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:18:11,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 14:18:12,199][12883] Updated weights for policy 0, policy_version 150163 (0.0038) [2024-06-18 14:18:12,668][12862] Signal inference workers to stop experience collection... (35900 times) [2024-06-18 14:18:12,669][12862] Signal inference workers to resume experience collection... (35900 times) [2024-06-18 14:18:12,698][12883] InferenceWorker_p0-w0: stopping experience collection (35900 times) [2024-06-18 14:18:12,699][12883] InferenceWorker_p0-w0: resuming experience collection (35900 times) [2024-06-18 14:18:15,833][12883] Updated weights for policy 0, policy_version 150173 (0.0029) [2024-06-18 14:18:16,994][12645] Fps is (10 sec: 47513.9, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 2460532736. Throughput: 0: 42511.0. Samples: 2460559000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:18:16,994][12645] Avg episode reward: [(0, '0.400')] [2024-06-18 14:18:20,122][12883] Updated weights for policy 0, policy_version 150183 (0.0046) [2024-06-18 14:18:21,994][12645] Fps is (10 sec: 45874.4, 60 sec: 43144.4, 300 sec: 42542.9). Total num frames: 2460712960. Throughput: 0: 42630.9. Samples: 2460820280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:18:21,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 14:18:23,428][12883] Updated weights for policy 0, policy_version 150193 (0.0026) [2024-06-18 14:18:26,994][12645] Fps is (10 sec: 36045.2, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 2460893184. 
Throughput: 0: 42651.7. Samples: 2461074740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:18:26,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 14:18:27,754][12883] Updated weights for policy 0, policy_version 150203 (0.0034) [2024-06-18 14:18:31,177][12883] Updated weights for policy 0, policy_version 150213 (0.0036) [2024-06-18 14:18:31,994][12645] Fps is (10 sec: 45876.2, 60 sec: 43146.2, 300 sec: 42709.5). Total num frames: 2461171712. Throughput: 0: 42509.8. Samples: 2461199000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:18:31,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 14:18:35,567][12883] Updated weights for policy 0, policy_version 150223 (0.0033) [2024-06-18 14:18:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42432.5). Total num frames: 2461319168. Throughput: 0: 42695.8. Samples: 2461460540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:18:36,994][12645] Avg episode reward: [(0, '0.768')] [2024-06-18 14:18:37,022][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150227_2461319168.pth... [2024-06-18 14:18:37,072][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149607_2451161088.pth [2024-06-18 14:18:38,843][12883] Updated weights for policy 0, policy_version 150233 (0.0034) [2024-06-18 14:18:41,993][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 2461548544. Throughput: 0: 42400.0. Samples: 2461702880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 14:18:41,994][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 14:18:43,354][12883] Updated weights for policy 0, policy_version 150243 (0.0029) [2024-06-18 14:18:46,386][12883] Updated weights for policy 0, policy_version 150253 (0.0039) [2024-06-18 14:18:46,996][12645] Fps is (10 sec: 45864.5, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 2461777920. Throughput: 0: 42689.9. 
Samples: 2461840000. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:18:46,996][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 14:18:50,919][12883] Updated weights for policy 0, policy_version 150263 (0.0044) [2024-06-18 14:18:51,994][12645] Fps is (10 sec: 39320.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2461941760. Throughput: 0: 42533.7. Samples: 2462095580. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:18:51,994][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 14:18:53,866][12883] Updated weights for policy 0, policy_version 150273 (0.0027) [2024-06-18 14:18:56,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42873.1, 300 sec: 42598.5). Total num frames: 2462203904. Throughput: 0: 42333.8. Samples: 2462340960. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:18:56,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 14:18:58,840][12883] Updated weights for policy 0, policy_version 150283 (0.0044) [2024-06-18 14:19:01,483][12883] Updated weights for policy 0, policy_version 150293 (0.0024) [2024-06-18 14:19:01,993][12645] Fps is (10 sec: 47515.0, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2462416896. Throughput: 0: 42726.8. Samples: 2462481700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:01,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 14:19:03,148][12862] Signal inference workers to stop experience collection... (35950 times) [2024-06-18 14:19:03,192][12883] InferenceWorker_p0-w0: stopping experience collection (35950 times) [2024-06-18 14:19:03,197][12862] Signal inference workers to resume experience collection... (35950 times) [2024-06-18 14:19:03,203][12883] InferenceWorker_p0-w0: resuming experience collection (35950 times) [2024-06-18 14:19:06,603][12883] Updated weights for policy 0, policy_version 150303 (0.0031) [2024-06-18 14:19:06,996][12645] Fps is (10 sec: 37674.7, 60 sec: 42050.8, 300 sec: 42431.5). Total num frames: 2462580736. 
Throughput: 0: 42572.3. Samples: 2462736120. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:06,997][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 14:19:09,097][12883] Updated weights for policy 0, policy_version 150313 (0.0026) [2024-06-18 14:19:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2462842880. Throughput: 0: 42394.6. Samples: 2462982500. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:11,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 14:19:14,121][12883] Updated weights for policy 0, policy_version 150323 (0.0028) [2024-06-18 14:19:16,786][12883] Updated weights for policy 0, policy_version 150333 (0.0038) [2024-06-18 14:19:16,994][12645] Fps is (10 sec: 49162.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2463072256. Throughput: 0: 42870.5. Samples: 2463128180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:16,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 14:19:21,727][12883] Updated weights for policy 0, policy_version 150343 (0.0030) [2024-06-18 14:19:21,994][12645] Fps is (10 sec: 37683.5, 60 sec: 41779.4, 300 sec: 42487.3). Total num frames: 2463219712. Throughput: 0: 42622.7. Samples: 2463378560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:21,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 14:19:24,438][12883] Updated weights for policy 0, policy_version 150353 (0.0033) [2024-06-18 14:19:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2463498240. Throughput: 0: 42645.6. Samples: 2463621940. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:26,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 14:19:29,371][12883] Updated weights for policy 0, policy_version 150363 (0.0037) [2024-06-18 14:19:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 2463678464. 
Throughput: 0: 42754.2. Samples: 2463763840. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:31,994][12645] Avg episode reward: [(0, '0.688')] [2024-06-18 14:19:32,222][12883] Updated weights for policy 0, policy_version 150373 (0.0032) [2024-06-18 14:19:37,000][12645] Fps is (10 sec: 36022.6, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 2463858688. Throughput: 0: 42442.2. Samples: 2464005740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:37,000][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 14:19:37,092][12883] Updated weights for policy 0, policy_version 150383 (0.0036) [2024-06-18 14:19:39,859][12883] Updated weights for policy 0, policy_version 150393 (0.0030) [2024-06-18 14:19:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2464120832. Throughput: 0: 42612.9. Samples: 2464258540. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:41,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 14:19:44,699][12883] Updated weights for policy 0, policy_version 150403 (0.0025) [2024-06-18 14:19:46,994][12645] Fps is (10 sec: 47543.6, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 2464333824. Throughput: 0: 42665.2. Samples: 2464401640. Policy #0 lag: (min: 1.0, avg: 9.0, max: 23.0) [2024-06-18 14:19:46,994][12645] Avg episode reward: [(0, '0.607')] [2024-06-18 14:19:47,568][12883] Updated weights for policy 0, policy_version 150413 (0.0033) [2024-06-18 14:19:51,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2464497664. Throughput: 0: 42564.7. Samples: 2464651440. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:19:51,994][12645] Avg episode reward: [(0, '0.742')] [2024-06-18 14:19:52,312][12883] Updated weights for policy 0, policy_version 150423 (0.0042) [2024-06-18 14:19:55,044][12883] Updated weights for policy 0, policy_version 150433 (0.0028) [2024-06-18 14:19:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2464759808. Throughput: 0: 42671.5. Samples: 2464902720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:19:56,994][12645] Avg episode reward: [(0, '0.649')] [2024-06-18 14:20:00,055][12883] Updated weights for policy 0, policy_version 150443 (0.0023) [2024-06-18 14:20:00,679][12862] Signal inference workers to stop experience collection... (36000 times) [2024-06-18 14:20:00,679][12862] Signal inference workers to resume experience collection... (36000 times) [2024-06-18 14:20:00,723][12883] InferenceWorker_p0-w0: stopping experience collection (36000 times) [2024-06-18 14:20:00,724][12883] InferenceWorker_p0-w0: resuming experience collection (36000 times) [2024-06-18 14:20:01,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2464972800. Throughput: 0: 42522.8. Samples: 2465041700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:01,994][12645] Avg episode reward: [(0, '0.705')] [2024-06-18 14:20:02,559][12883] Updated weights for policy 0, policy_version 150453 (0.0027) [2024-06-18 14:20:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43146.2, 300 sec: 42654.0). Total num frames: 2465169408. Throughput: 0: 42561.7. Samples: 2465293840. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:06,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 14:20:07,719][12883] Updated weights for policy 0, policy_version 150463 (0.0030) [2024-06-18 14:20:10,492][12883] Updated weights for policy 0, policy_version 150473 (0.0049) [2024-06-18 14:20:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2465398784. Throughput: 0: 42741.0. Samples: 2465545280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:11,994][12645] Avg episode reward: [(0, '0.555')] [2024-06-18 14:20:15,353][12883] Updated weights for policy 0, policy_version 150483 (0.0042) [2024-06-18 14:20:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2465595392. Throughput: 0: 42477.6. Samples: 2465675340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:16,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 14:20:18,189][12883] Updated weights for policy 0, policy_version 150493 (0.0037) [2024-06-18 14:20:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2465808384. Throughput: 0: 42777.9. Samples: 2465930480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:21,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 14:20:23,056][12883] Updated weights for policy 0, policy_version 150503 (0.0035) [2024-06-18 14:20:25,866][12883] Updated weights for policy 0, policy_version 150513 (0.0030) [2024-06-18 14:20:26,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2466054144. Throughput: 0: 42696.8. Samples: 2466179900. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:26,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 14:20:30,842][12883] Updated weights for policy 0, policy_version 150523 (0.0032) [2024-06-18 14:20:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2466234368. Throughput: 0: 42490.7. Samples: 2466313720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:31,994][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 14:20:33,570][12883] Updated weights for policy 0, policy_version 150533 (0.0039) [2024-06-18 14:20:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43422.1, 300 sec: 42765.0). Total num frames: 2466463744. Throughput: 0: 42670.7. Samples: 2466571620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:36,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 14:20:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150541_2466463744.pth... [2024-06-18 14:20:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000149916_2456223744.pth [2024-06-18 14:20:38,552][12883] Updated weights for policy 0, policy_version 150543 (0.0042) [2024-06-18 14:20:41,346][12883] Updated weights for policy 0, policy_version 150553 (0.0032) [2024-06-18 14:20:41,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2466693120. Throughput: 0: 42578.2. Samples: 2466818740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:41,994][12645] Avg episode reward: [(0, '0.717')] [2024-06-18 14:20:46,338][12883] Updated weights for policy 0, policy_version 150563 (0.0044) [2024-06-18 14:20:46,996][12645] Fps is (10 sec: 39313.0, 60 sec: 42050.7, 300 sec: 42543.4). Total num frames: 2466856960. Throughput: 0: 42306.8. Samples: 2466945600. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 14:20:46,996][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 14:20:49,262][12883] Updated weights for policy 0, policy_version 150573 (0.0036) [2024-06-18 14:20:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2467086336. Throughput: 0: 42422.6. Samples: 2467202860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:20:51,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 14:20:53,978][12883] Updated weights for policy 0, policy_version 150583 (0.0029) [2024-06-18 14:20:56,787][12883] Updated weights for policy 0, policy_version 150593 (0.0051) [2024-06-18 14:20:56,994][12645] Fps is (10 sec: 45885.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2467315712. Throughput: 0: 42411.1. Samples: 2467453780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:20:56,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 14:21:01,828][12883] Updated weights for policy 0, policy_version 150603 (0.0045) [2024-06-18 14:21:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 2467479552. Throughput: 0: 42587.2. Samples: 2467591760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:01,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 14:21:04,393][12883] Updated weights for policy 0, policy_version 150613 (0.0031) [2024-06-18 14:21:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42654.7). Total num frames: 2467725312. Throughput: 0: 42543.1. Samples: 2467844920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:06,994][12645] Avg episode reward: [(0, '0.303')] [2024-06-18 14:21:09,326][12883] Updated weights for policy 0, policy_version 150623 (0.0037) [2024-06-18 14:21:10,883][12862] Signal inference workers to stop experience collection... 
(36050 times) [2024-06-18 14:21:10,934][12862] Signal inference workers to resume experience collection... (36050 times) [2024-06-18 14:21:10,935][12883] InferenceWorker_p0-w0: stopping experience collection (36050 times) [2024-06-18 14:21:10,950][12883] InferenceWorker_p0-w0: resuming experience collection (36050 times) [2024-06-18 14:21:11,895][12883] Updated weights for policy 0, policy_version 150633 (0.0021) [2024-06-18 14:21:11,994][12645] Fps is (10 sec: 49152.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2467971072. Throughput: 0: 42715.6. Samples: 2468102100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:11,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 14:21:16,922][12883] Updated weights for policy 0, policy_version 150643 (0.0034) [2024-06-18 14:21:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2468134912. Throughput: 0: 42715.1. Samples: 2468235900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:16,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 14:21:19,417][12883] Updated weights for policy 0, policy_version 150653 (0.0032) [2024-06-18 14:21:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42710.4). Total num frames: 2468380672. Throughput: 0: 42513.3. Samples: 2468484720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:21,995][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 14:21:24,487][12883] Updated weights for policy 0, policy_version 150663 (0.0032) [2024-06-18 14:21:26,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2468610048. Throughput: 0: 42719.6. Samples: 2468741120. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:26,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 14:21:27,070][12883] Updated weights for policy 0, policy_version 150673 (0.0027) [2024-06-18 14:21:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2468773888. Throughput: 0: 42768.8. Samples: 2468870100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:32,000][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 14:21:32,219][12883] Updated weights for policy 0, policy_version 150683 (0.0037) [2024-06-18 14:21:34,926][12883] Updated weights for policy 0, policy_version 150693 (0.0042) [2024-06-18 14:21:36,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2469036032. Throughput: 0: 42740.8. Samples: 2469126200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:36,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 14:21:39,724][12883] Updated weights for policy 0, policy_version 150703 (0.0036) [2024-06-18 14:21:41,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2469232640. Throughput: 0: 42932.6. Samples: 2469385740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:41,994][12645] Avg episode reward: [(0, '0.249')] [2024-06-18 14:21:42,917][12883] Updated weights for policy 0, policy_version 150713 (0.0046) [2024-06-18 14:21:46,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42599.9, 300 sec: 42653.9). Total num frames: 2469412864. Throughput: 0: 42620.8. Samples: 2469509700. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:46,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 14:21:47,583][12883] Updated weights for policy 0, policy_version 150723 (0.0038) [2024-06-18 14:21:50,648][12883] Updated weights for policy 0, policy_version 150733 (0.0038) [2024-06-18 14:21:51,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2469658624. Throughput: 0: 42787.1. Samples: 2469770340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 14:21:51,994][12645] Avg episode reward: [(0, '0.636')] [2024-06-18 14:21:54,978][12883] Updated weights for policy 0, policy_version 150743 (0.0031) [2024-06-18 14:21:56,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2469871616. Throughput: 0: 42776.5. Samples: 2470027040. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:21:56,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 14:21:58,571][12883] Updated weights for policy 0, policy_version 150753 (0.0041) [2024-06-18 14:22:01,996][12645] Fps is (10 sec: 39313.4, 60 sec: 42870.0, 300 sec: 42598.1). Total num frames: 2470051840. Throughput: 0: 42654.4. Samples: 2470155440. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:01,996][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 14:22:02,843][12883] Updated weights for policy 0, policy_version 150763 (0.0032) [2024-06-18 14:22:05,993][12883] Updated weights for policy 0, policy_version 150773 (0.0038) [2024-06-18 14:22:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2470297600. Throughput: 0: 42861.4. Samples: 2470413480. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:06,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 14:22:10,437][12883] Updated weights for policy 0, policy_version 150783 (0.0025) [2024-06-18 14:22:11,994][12645] Fps is (10 sec: 45884.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2470510592. Throughput: 0: 42739.4. Samples: 2470664400. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:11,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 14:22:13,369][12862] Signal inference workers to stop experience collection... (36100 times) [2024-06-18 14:22:13,370][12862] Signal inference workers to resume experience collection... (36100 times) [2024-06-18 14:22:13,380][12883] InferenceWorker_p0-w0: stopping experience collection (36100 times) [2024-06-18 14:22:13,380][12883] InferenceWorker_p0-w0: resuming experience collection (36100 times) [2024-06-18 14:22:13,518][12883] Updated weights for policy 0, policy_version 150793 (0.0031) [2024-06-18 14:22:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2470707200. Throughput: 0: 42802.7. Samples: 2470796220. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:16,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 14:22:17,907][12883] Updated weights for policy 0, policy_version 150803 (0.0035) [2024-06-18 14:22:21,699][12883] Updated weights for policy 0, policy_version 150813 (0.0032) [2024-06-18 14:22:21,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2470936576. Throughput: 0: 42797.9. Samples: 2471052100. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:21,994][12645] Avg episode reward: [(0, '0.713')] [2024-06-18 14:22:25,397][12883] Updated weights for policy 0, policy_version 150823 (0.0036) [2024-06-18 14:22:26,994][12645] Fps is (10 sec: 45873.9, 60 sec: 42598.2, 300 sec: 42654.2). Total num frames: 2471165952. Throughput: 0: 42740.6. 
Samples: 2471309080. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:26,994][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 14:22:29,347][12883] Updated weights for policy 0, policy_version 150833 (0.0030) [2024-06-18 14:22:31,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2471362560. Throughput: 0: 42817.4. Samples: 2471436480. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:31,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 14:22:33,236][12883] Updated weights for policy 0, policy_version 150843 (0.0035) [2024-06-18 14:22:36,843][12883] Updated weights for policy 0, policy_version 150853 (0.0042) [2024-06-18 14:22:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2471575552. Throughput: 0: 42761.4. Samples: 2471694600. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:36,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 14:22:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150853_2471575552.pth... [2024-06-18 14:22:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150227_2461319168.pth [2024-06-18 14:22:40,938][12883] Updated weights for policy 0, policy_version 150863 (0.0032) [2024-06-18 14:22:41,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42869.8, 300 sec: 42598.1). Total num frames: 2471804928. Throughput: 0: 42784.0. Samples: 2471952420. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:41,996][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 14:22:44,499][12883] Updated weights for policy 0, policy_version 150873 (0.0031) [2024-06-18 14:22:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2472001536. Throughput: 0: 42736.2. Samples: 2472078480. 
Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:46,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 14:22:48,479][12883] Updated weights for policy 0, policy_version 150883 (0.0045) [2024-06-18 14:22:51,994][12645] Fps is (10 sec: 39330.5, 60 sec: 42325.4, 300 sec: 42598.7). Total num frames: 2472198144. Throughput: 0: 42766.2. Samples: 2472337960. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:51,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 14:22:52,169][12883] Updated weights for policy 0, policy_version 150893 (0.0037) [2024-06-18 14:22:56,089][12883] Updated weights for policy 0, policy_version 150903 (0.0030) [2024-06-18 14:22:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2472427520. Throughput: 0: 42782.7. Samples: 2472589620. Policy #0 lag: (min: 1.0, avg: 8.4, max: 21.0) [2024-06-18 14:22:56,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 14:23:00,039][12883] Updated weights for policy 0, policy_version 150913 (0.0037) [2024-06-18 14:23:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 2472640512. Throughput: 0: 42816.8. Samples: 2472722980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:01,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 14:23:03,658][12883] Updated weights for policy 0, policy_version 150923 (0.0041) [2024-06-18 14:23:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2472853504. Throughput: 0: 42689.7. Samples: 2472973140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:06,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 14:23:07,650][12883] Updated weights for policy 0, policy_version 150933 (0.0027) [2024-06-18 14:23:11,277][12883] Updated weights for policy 0, policy_version 150943 (0.0047) [2024-06-18 14:23:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2473082880. Throughput: 0: 42692.5. Samples: 2473230240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:11,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 14:23:15,311][12883] Updated weights for policy 0, policy_version 150953 (0.0038) [2024-06-18 14:23:17,000][12645] Fps is (10 sec: 42572.1, 60 sec: 42866.9, 300 sec: 42597.5). Total num frames: 2473279488. Throughput: 0: 42734.5. Samples: 2473359800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:17,001][12645] Avg episode reward: [(0, '0.649')] [2024-06-18 14:23:18,850][12883] Updated weights for policy 0, policy_version 150963 (0.0051) [2024-06-18 14:23:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2473492480. Throughput: 0: 42579.6. Samples: 2473610680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:21,994][12645] Avg episode reward: [(0, '0.789')] [2024-06-18 14:23:22,825][12862] Signal inference workers to stop experience collection... (36150 times) [2024-06-18 14:23:22,826][12862] Signal inference workers to resume experience collection... 
(36150 times) [2024-06-18 14:23:22,852][12883] InferenceWorker_p0-w0: stopping experience collection (36150 times) [2024-06-18 14:23:22,852][12883] InferenceWorker_p0-w0: resuming experience collection (36150 times) [2024-06-18 14:23:23,003][12883] Updated weights for policy 0, policy_version 150973 (0.0018) [2024-06-18 14:23:26,902][12883] Updated weights for policy 0, policy_version 150983 (0.0032) [2024-06-18 14:23:26,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 2473705472. Throughput: 0: 42572.3. Samples: 2473868080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:26,994][12645] Avg episode reward: [(0, '0.662')] [2024-06-18 14:23:30,710][12883] Updated weights for policy 0, policy_version 150993 (0.0027) [2024-06-18 14:23:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2473918464. Throughput: 0: 42485.0. Samples: 2473990300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:31,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 14:23:34,606][12883] Updated weights for policy 0, policy_version 151003 (0.0025) [2024-06-18 14:23:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2474131456. Throughput: 0: 42376.0. Samples: 2474244880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:36,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 14:23:38,234][12883] Updated weights for policy 0, policy_version 151013 (0.0034) [2024-06-18 14:23:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42053.9, 300 sec: 42543.2). Total num frames: 2474328064. Throughput: 0: 42557.9. Samples: 2474504720. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:41,994][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 14:23:42,318][12883] Updated weights for policy 0, policy_version 151023 (0.0042) [2024-06-18 14:23:46,258][12883] Updated weights for policy 0, policy_version 151033 (0.0034) [2024-06-18 14:23:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2474557440. Throughput: 0: 42409.2. Samples: 2474631400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:46,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 14:23:49,900][12883] Updated weights for policy 0, policy_version 151043 (0.0031) [2024-06-18 14:23:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2474770432. Throughput: 0: 42339.2. Samples: 2474878400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:51,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 14:23:53,913][12883] Updated weights for policy 0, policy_version 151053 (0.0027) [2024-06-18 14:23:56,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42596.8, 300 sec: 42598.0). Total num frames: 2474983424. Throughput: 0: 42279.3. Samples: 2475132900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:23:56,997][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 14:23:57,584][12883] Updated weights for policy 0, policy_version 151063 (0.0038) [2024-06-18 14:24:01,689][12883] Updated weights for policy 0, policy_version 151073 (0.0041) [2024-06-18 14:24:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 2475180032. Throughput: 0: 42196.5. Samples: 2475258380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 14:24:01,994][12645] Avg episode reward: [(0, '0.702')] [2024-06-18 14:24:05,222][12883] Updated weights for policy 0, policy_version 151083 (0.0031) [2024-06-18 14:24:06,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2475409408. Throughput: 0: 42314.2. Samples: 2475514820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:06,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 14:24:09,340][12883] Updated weights for policy 0, policy_version 151093 (0.0041) [2024-06-18 14:24:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2475622400. Throughput: 0: 42429.7. Samples: 2475777420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:11,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 14:24:12,967][12883] Updated weights for policy 0, policy_version 151103 (0.0032) [2024-06-18 14:24:16,960][12883] Updated weights for policy 0, policy_version 151113 (0.0030) [2024-06-18 14:24:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42602.8, 300 sec: 42765.0). Total num frames: 2475835392. Throughput: 0: 42436.4. Samples: 2475899940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:16,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 14:24:20,442][12883] Updated weights for policy 0, policy_version 151123 (0.0035) [2024-06-18 14:24:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2476048384. Throughput: 0: 42415.0. Samples: 2476153560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:21,995][12645] Avg episode reward: [(0, '0.274')] [2024-06-18 14:24:24,458][12883] Updated weights for policy 0, policy_version 151133 (0.0027) [2024-06-18 14:24:26,999][12645] Fps is (10 sec: 42574.9, 60 sec: 42594.4, 300 sec: 42653.1). Total num frames: 2476261376. Throughput: 0: 42476.9. Samples: 2476416420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:27,000][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 14:24:28,113][12883] Updated weights for policy 0, policy_version 151143 (0.0034) [2024-06-18 14:24:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42765.9). Total num frames: 2476474368. Throughput: 0: 42358.9. Samples: 2476537540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:31,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 14:24:32,088][12883] Updated weights for policy 0, policy_version 151153 (0.0038) [2024-06-18 14:24:35,728][12883] Updated weights for policy 0, policy_version 151163 (0.0031) [2024-06-18 14:24:36,994][12645] Fps is (10 sec: 40983.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2476670976. Throughput: 0: 42588.5. Samples: 2476794880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:36,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 14:24:37,169][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151166_2476703744.pth... [2024-06-18 14:24:37,226][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150541_2466463744.pth [2024-06-18 14:24:39,933][12883] Updated weights for policy 0, policy_version 151173 (0.0036) [2024-06-18 14:24:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2476883968. Throughput: 0: 42779.9. Samples: 2477057900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:41,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 14:24:43,627][12883] Updated weights for policy 0, policy_version 151183 (0.0048) [2024-06-18 14:24:44,352][12862] Signal inference workers to stop experience collection... (36200 times) [2024-06-18 14:24:44,352][12862] Signal inference workers to resume experience collection... 
(36200 times) [2024-06-18 14:24:44,369][12883] InferenceWorker_p0-w0: stopping experience collection (36200 times) [2024-06-18 14:24:44,369][12883] InferenceWorker_p0-w0: resuming experience collection (36200 times) [2024-06-18 14:24:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2477113344. Throughput: 0: 42738.2. Samples: 2477181600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:46,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 14:24:47,374][12883] Updated weights for policy 0, policy_version 151193 (0.0028) [2024-06-18 14:24:51,162][12883] Updated weights for policy 0, policy_version 151203 (0.0041) [2024-06-18 14:24:51,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2477326336. Throughput: 0: 42721.8. Samples: 2477437300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:51,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 14:24:55,215][12883] Updated weights for policy 0, policy_version 151213 (0.0027) [2024-06-18 14:24:56,996][12645] Fps is (10 sec: 42589.4, 60 sec: 42598.4, 300 sec: 42598.1). Total num frames: 2477539328. Throughput: 0: 42735.7. Samples: 2477700620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:24:56,996][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 14:24:58,776][12883] Updated weights for policy 0, policy_version 151223 (0.0031) [2024-06-18 14:25:01,998][12645] Fps is (10 sec: 42581.4, 60 sec: 42868.7, 300 sec: 42653.4). Total num frames: 2477752320. Throughput: 0: 42692.7. Samples: 2477821280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 14:25:01,998][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 14:25:03,038][12883] Updated weights for policy 0, policy_version 151233 (0.0029) [2024-06-18 14:25:06,470][12883] Updated weights for policy 0, policy_version 151243 (0.0033) [2024-06-18 14:25:06,996][12645] Fps is (10 sec: 42598.3, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2477965312. Throughput: 0: 42786.8. Samples: 2478079060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:06,996][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 14:25:11,050][12883] Updated weights for policy 0, policy_version 151253 (0.0045) [2024-06-18 14:25:11,996][12645] Fps is (10 sec: 40967.1, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 2478161920. Throughput: 0: 42691.2. Samples: 2478337380. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:11,997][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 14:25:14,259][12883] Updated weights for policy 0, policy_version 151263 (0.0027) [2024-06-18 14:25:17,000][12645] Fps is (10 sec: 44218.9, 60 sec: 42867.0, 300 sec: 42708.6). Total num frames: 2478407680. Throughput: 0: 42785.9. Samples: 2478463180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:17,000][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 14:25:18,810][12883] Updated weights for policy 0, policy_version 151273 (0.0032) [2024-06-18 14:25:21,881][12883] Updated weights for policy 0, policy_version 151283 (0.0047) [2024-06-18 14:25:21,998][12645] Fps is (10 sec: 45867.7, 60 sec: 42868.7, 300 sec: 42597.8). Total num frames: 2478620672. Throughput: 0: 42778.5. Samples: 2478720080. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:21,998][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 14:25:26,710][12883] Updated weights for policy 0, policy_version 151293 (0.0038) [2024-06-18 14:25:26,994][12645] Fps is (10 sec: 39346.3, 60 sec: 42329.3, 300 sec: 42598.4). Total num frames: 2478800896. Throughput: 0: 42852.4. Samples: 2478986260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:26,994][12645] Avg episode reward: [(0, '0.601')] [2024-06-18 14:25:29,802][12883] Updated weights for policy 0, policy_version 151303 (0.0032) [2024-06-18 14:25:31,994][12645] Fps is (10 sec: 44254.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2479063040. Throughput: 0: 42793.5. Samples: 2479107300. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:31,994][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 14:25:34,380][12883] Updated weights for policy 0, policy_version 151313 (0.0034) [2024-06-18 14:25:36,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2479259648. Throughput: 0: 42777.3. Samples: 2479362280. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:36,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 14:25:37,286][12883] Updated weights for policy 0, policy_version 151323 (0.0036) [2024-06-18 14:25:41,918][12883] Updated weights for policy 0, policy_version 151333 (0.0037) [2024-06-18 14:25:41,994][12645] Fps is (10 sec: 37683.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2479439872. Throughput: 0: 43043.5. Samples: 2479637480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:41,994][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 14:25:44,732][12883] Updated weights for policy 0, policy_version 151343 (0.0039) [2024-06-18 14:25:46,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2479702016. Throughput: 0: 42923.8. Samples: 2479752680. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:46,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 14:25:49,468][12883] Updated weights for policy 0, policy_version 151353 (0.0033) [2024-06-18 14:25:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2479898624. Throughput: 0: 43030.1. Samples: 2480015320. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:51,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 14:25:52,531][12883] Updated weights for policy 0, policy_version 151363 (0.0026) [2024-06-18 14:25:56,994][12645] Fps is (10 sec: 36044.7, 60 sec: 42053.8, 300 sec: 42653.9). Total num frames: 2480062464. Throughput: 0: 43075.5. Samples: 2480275680. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:25:56,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 14:25:57,283][12883] Updated weights for policy 0, policy_version 151373 (0.0045) [2024-06-18 14:26:00,125][12883] Updated weights for policy 0, policy_version 151383 (0.0024) [2024-06-18 14:26:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43420.5, 300 sec: 42820.6). Total num frames: 2480357376. Throughput: 0: 42881.1. Samples: 2480392560. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:26:01,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 14:26:04,947][12883] Updated weights for policy 0, policy_version 151393 (0.0029) [2024-06-18 14:26:05,204][12862] Signal inference workers to stop experience collection... (36250 times) [2024-06-18 14:26:05,257][12862] Signal inference workers to resume experience collection... (36250 times) [2024-06-18 14:26:05,258][12883] InferenceWorker_p0-w0: stopping experience collection (36250 times) [2024-06-18 14:26:05,271][12883] InferenceWorker_p0-w0: resuming experience collection (36250 times) [2024-06-18 14:26:06,994][12645] Fps is (10 sec: 49152.1, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 2480553984. 
Throughput: 0: 42949.0. Samples: 2480652620. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 14:26:06,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 14:26:07,629][12883] Updated weights for policy 0, policy_version 151403 (0.0041) [2024-06-18 14:26:11,994][12645] Fps is (10 sec: 36044.8, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 2480717824. Throughput: 0: 42944.0. Samples: 2480918740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:11,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 14:26:12,394][12883] Updated weights for policy 0, policy_version 151413 (0.0028) [2024-06-18 14:26:15,256][12883] Updated weights for policy 0, policy_version 151423 (0.0031) [2024-06-18 14:26:16,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42874.3, 300 sec: 42709.2). Total num frames: 2480979968. Throughput: 0: 42847.5. Samples: 2481035540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:16,997][12645] Avg episode reward: [(0, '0.634')] [2024-06-18 14:26:20,064][12883] Updated weights for policy 0, policy_version 151433 (0.0042) [2024-06-18 14:26:21,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42874.3, 300 sec: 42653.9). Total num frames: 2481192960. Throughput: 0: 42831.3. Samples: 2481289680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:21,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 14:26:22,869][12883] Updated weights for policy 0, policy_version 151443 (0.0046) [2024-06-18 14:26:26,994][12645] Fps is (10 sec: 39330.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2481373184. Throughput: 0: 42454.3. Samples: 2481547920. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:26,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 14:26:27,610][12883] Updated weights for policy 0, policy_version 151453 (0.0040) [2024-06-18 14:26:30,772][12883] Updated weights for policy 0, policy_version 151463 (0.0030) [2024-06-18 14:26:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2481602560. Throughput: 0: 42551.5. Samples: 2481667500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:31,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 14:26:35,065][12883] Updated weights for policy 0, policy_version 151473 (0.0032) [2024-06-18 14:26:36,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2481815552. Throughput: 0: 42604.9. Samples: 2481932540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:36,994][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 14:26:37,139][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151480_2481848320.pth... [2024-06-18 14:26:37,191][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000150853_2471575552.pth [2024-06-18 14:26:38,716][12883] Updated weights for policy 0, policy_version 151483 (0.0028) [2024-06-18 14:26:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2482028544. Throughput: 0: 42429.7. Samples: 2482185020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:41,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 14:26:42,635][12883] Updated weights for policy 0, policy_version 151493 (0.0037) [2024-06-18 14:26:46,319][12883] Updated weights for policy 0, policy_version 151503 (0.0023) [2024-06-18 14:26:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2482257920. Throughput: 0: 42627.1. Samples: 2482310780. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:46,995][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 14:26:50,472][12883] Updated weights for policy 0, policy_version 151513 (0.0041) [2024-06-18 14:26:51,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2482470912. Throughput: 0: 42494.2. Samples: 2482564860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:51,994][12645] Avg episode reward: [(0, '0.636')] [2024-06-18 14:26:54,047][12883] Updated weights for policy 0, policy_version 151523 (0.0036) [2024-06-18 14:26:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 43144.5, 300 sec: 42709.8). Total num frames: 2482651136. Throughput: 0: 42324.0. Samples: 2482823320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:26:56,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 14:26:58,370][12883] Updated weights for policy 0, policy_version 151533 (0.0034) [2024-06-18 14:27:01,562][12883] Updated weights for policy 0, policy_version 151543 (0.0026) [2024-06-18 14:27:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2482880512. Throughput: 0: 42485.2. Samples: 2482947280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:27:01,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 14:27:05,899][12883] Updated weights for policy 0, policy_version 151553 (0.0028) [2024-06-18 14:27:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2483109888. Throughput: 0: 42638.5. Samples: 2483208420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:27:06,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 14:27:09,363][12883] Updated weights for policy 0, policy_version 151563 (0.0041) [2024-06-18 14:27:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2483290112. Throughput: 0: 42556.4. Samples: 2483462960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 25.0) [2024-06-18 14:27:11,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 14:27:12,655][12862] Signal inference workers to stop experience collection... (36300 times) [2024-06-18 14:27:12,655][12862] Signal inference workers to resume experience collection... (36300 times) [2024-06-18 14:27:12,673][12883] InferenceWorker_p0-w0: stopping experience collection (36300 times) [2024-06-18 14:27:12,673][12883] InferenceWorker_p0-w0: resuming experience collection (36300 times) [2024-06-18 14:27:13,480][12883] Updated weights for policy 0, policy_version 151573 (0.0039) [2024-06-18 14:27:16,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42053.9, 300 sec: 42598.4). Total num frames: 2483503104. Throughput: 0: 42523.6. Samples: 2483581060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:27:16,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 14:27:17,289][12883] Updated weights for policy 0, policy_version 151583 (0.0034) [2024-06-18 14:27:21,132][12883] Updated weights for policy 0, policy_version 151593 (0.0043) [2024-06-18 14:27:21,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2483716096. Throughput: 0: 42453.8. Samples: 2483842960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:27:21,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 14:27:24,836][12883] Updated weights for policy 0, policy_version 151603 (0.0042) [2024-06-18 14:27:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2483912704. Throughput: 0: 42463.6. Samples: 2484095880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:27:26,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 14:27:28,777][12883] Updated weights for policy 0, policy_version 151613 (0.0041) [2024-06-18 14:27:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2484142080. 
Throughput: 0: 42426.3. Samples: 2484219960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:27:31,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 14:27:32,784][12883] Updated weights for policy 0, policy_version 151623 (0.0034) [2024-06-18 14:27:36,420][12883] Updated weights for policy 0, policy_version 151633 (0.0039) [2024-06-18 14:27:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 2484355072. Throughput: 0: 42540.5. Samples: 2484479180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:27:36,994][12645] Avg episode reward: [(0, '0.043')] [2024-06-18 14:27:40,547][12883] Updated weights for policy 0, policy_version 151643 (0.0040) [2024-06-18 14:27:41,994][12645] Fps is (10 sec: 42597.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2484568064. Throughput: 0: 42409.7. Samples: 2484731760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:27:41,994][12645] Avg episode reward: [(0, '0.057')] [2024-06-18 14:27:43,998][12883] Updated weights for policy 0, policy_version 151653 (0.0046) [2024-06-18 14:27:46,994][12645] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 2484764672. Throughput: 0: 42542.2. Samples: 2484861680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:27:46,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 14:27:48,257][12883] Updated weights for policy 0, policy_version 151663 (0.0029) [2024-06-18 14:27:51,650][12883] Updated weights for policy 0, policy_version 151673 (0.0048) [2024-06-18 14:27:51,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2485010432. Throughput: 0: 42476.1. Samples: 2485119840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:27:51,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 14:27:55,903][12883] Updated weights for policy 0, policy_version 151683 (0.0043) [2024-06-18 14:27:56,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2485223424. Throughput: 0: 42383.9. Samples: 2485370240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:27:56,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 14:27:59,287][12883] Updated weights for policy 0, policy_version 151693 (0.0027) [2024-06-18 14:28:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2485420032. Throughput: 0: 42636.0. Samples: 2485499680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:28:01,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 14:28:03,447][12883] Updated weights for policy 0, policy_version 151703 (0.0034) [2024-06-18 14:28:06,934][12883] Updated weights for policy 0, policy_version 151713 (0.0029) [2024-06-18 14:28:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2485665792. Throughput: 0: 42631.5. Samples: 2485761380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:28:06,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 14:28:11,270][12883] Updated weights for policy 0, policy_version 151723 (0.0042) [2024-06-18 14:28:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42654.8). Total num frames: 2485862400. Throughput: 0: 42565.4. Samples: 2486011320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:28:11,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 14:28:14,846][12883] Updated weights for policy 0, policy_version 151733 (0.0031) [2024-06-18 14:28:16,994][12645] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2486042624. Throughput: 0: 42585.6. Samples: 2486136320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 19.0) [2024-06-18 14:28:16,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 14:28:18,753][12883] Updated weights for policy 0, policy_version 151743 (0.0042) [2024-06-18 14:28:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2486288384. Throughput: 0: 42520.3. Samples: 2486392600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:28:21,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 14:28:22,601][12883] Updated weights for policy 0, policy_version 151753 (0.0044) [2024-06-18 14:28:26,747][12883] Updated weights for policy 0, policy_version 151763 (0.0033) [2024-06-18 14:28:26,994][12645] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2486501376. Throughput: 0: 42730.8. Samples: 2486654640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:28:26,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 14:28:30,198][12883] Updated weights for policy 0, policy_version 151773 (0.0038) [2024-06-18 14:28:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2486697984. Throughput: 0: 42595.3. Samples: 2486778460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:28:31,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 14:28:34,370][12883] Updated weights for policy 0, policy_version 151783 (0.0033) [2024-06-18 14:28:36,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42869.8, 300 sec: 42709.1). Total num frames: 2486927360. Throughput: 0: 42554.3. Samples: 2487034880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:28:36,996][12645] Avg episode reward: [(0, '0.376')] [2024-06-18 14:28:37,019][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151790_2486927360.pth... 
[2024-06-18 14:28:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151166_2476703744.pth [2024-06-18 14:28:37,823][12883] Updated weights for policy 0, policy_version 151793 (0.0035) [2024-06-18 14:28:41,897][12883] Updated weights for policy 0, policy_version 151803 (0.0042) [2024-06-18 14:28:41,998][12645] Fps is (10 sec: 44218.8, 60 sec: 42868.8, 300 sec: 42653.4). Total num frames: 2487140352. Throughput: 0: 42746.1. Samples: 2487293980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:28:41,998][12645] Avg episode reward: [(0, '0.300')] [2024-06-18 14:28:45,707][12883] Updated weights for policy 0, policy_version 151813 (0.0039) [2024-06-18 14:28:46,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2487336960. Throughput: 0: 42583.1. Samples: 2487415920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:28:46,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 14:28:49,778][12883] Updated weights for policy 0, policy_version 151823 (0.0023) [2024-06-18 14:28:49,807][12862] Signal inference workers to stop experience collection... (36350 times) [2024-06-18 14:28:49,807][12862] Signal inference workers to resume experience collection... (36350 times) [2024-06-18 14:28:49,825][12883] InferenceWorker_p0-w0: stopping experience collection (36350 times) [2024-06-18 14:28:49,825][12883] InferenceWorker_p0-w0: resuming experience collection (36350 times) [2024-06-18 14:28:51,994][12645] Fps is (10 sec: 40976.4, 60 sec: 42325.3, 300 sec: 42598.7). Total num frames: 2487549952. Throughput: 0: 42381.8. Samples: 2487668560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:28:51,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 14:28:53,441][12883] Updated weights for policy 0, policy_version 151833 (0.0023) [2024-06-18 14:28:56,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42050.8, 300 sec: 42598.1). Total num frames: 2487746560. 
Throughput: 0: 42618.3. Samples: 2487929240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:28:56,996][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 14:28:57,550][12883] Updated weights for policy 0, policy_version 151843 (0.0031) [2024-06-18 14:29:01,093][12883] Updated weights for policy 0, policy_version 151853 (0.0042) [2024-06-18 14:29:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2487975936. Throughput: 0: 42560.6. Samples: 2488051540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:29:01,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 14:29:05,109][12883] Updated weights for policy 0, policy_version 151863 (0.0027) [2024-06-18 14:29:06,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2488188928. Throughput: 0: 42484.9. Samples: 2488304420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:29:06,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 14:29:09,015][12883] Updated weights for policy 0, policy_version 151873 (0.0033) [2024-06-18 14:29:11,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2488401920. Throughput: 0: 42575.5. Samples: 2488570540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:29:11,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 14:29:12,847][12883] Updated weights for policy 0, policy_version 151883 (0.0034) [2024-06-18 14:29:16,704][12883] Updated weights for policy 0, policy_version 151893 (0.0034) [2024-06-18 14:29:16,996][12645] Fps is (10 sec: 45865.4, 60 sec: 43416.1, 300 sec: 42709.2). Total num frames: 2488647680. Throughput: 0: 42436.9. Samples: 2488688220. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:29:16,996][12645] Avg episode reward: [(0, '0.361')] [2024-06-18 14:29:20,736][12883] Updated weights for policy 0, policy_version 151903 (0.0045) [2024-06-18 14:29:21,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.8). Total num frames: 2488844288. Throughput: 0: 42631.5. Samples: 2488953200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:29:21,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 14:29:24,133][12883] Updated weights for policy 0, policy_version 151913 (0.0030) [2024-06-18 14:29:26,994][12645] Fps is (10 sec: 37691.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2489024512. Throughput: 0: 42469.6. Samples: 2489204940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:29:26,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 14:29:28,355][12883] Updated weights for policy 0, policy_version 151923 (0.0022) [2024-06-18 14:29:31,745][12883] Updated weights for policy 0, policy_version 151933 (0.0040) [2024-06-18 14:29:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2489286656. Throughput: 0: 42587.9. Samples: 2489332380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:29:31,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 14:29:36,070][12883] Updated weights for policy 0, policy_version 151943 (0.0026) [2024-06-18 14:29:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2489483264. Throughput: 0: 42734.2. Samples: 2489591600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:29:36,994][12645] Avg episode reward: [(0, '0.719')] [2024-06-18 14:29:39,427][12883] Updated weights for policy 0, policy_version 151953 (0.0034) [2024-06-18 14:29:41,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42328.1, 300 sec: 42598.4). Total num frames: 2489679872. Throughput: 0: 42674.6. Samples: 2489849500. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:29:41,994][12645] Avg episode reward: [(0, '0.769')] [2024-06-18 14:29:43,870][12883] Updated weights for policy 0, policy_version 151963 (0.0027) [2024-06-18 14:29:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2489909248. Throughput: 0: 42612.4. Samples: 2489969100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:29:46,994][12645] Avg episode reward: [(0, '0.698')] [2024-06-18 14:29:47,346][12883] Updated weights for policy 0, policy_version 151973 (0.0032) [2024-06-18 14:29:51,499][12883] Updated weights for policy 0, policy_version 151983 (0.0027) [2024-06-18 14:29:51,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42654.2). Total num frames: 2490122240. Throughput: 0: 42825.3. Samples: 2490231560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:29:51,994][12645] Avg episode reward: [(0, '0.698')] [2024-06-18 14:29:55,085][12883] Updated weights for policy 0, policy_version 151993 (0.0027) [2024-06-18 14:29:56,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42600.1, 300 sec: 42543.5). Total num frames: 2490302464. Throughput: 0: 42633.5. Samples: 2490489040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:29:56,994][12645] Avg episode reward: [(0, '0.642')] [2024-06-18 14:29:59,097][12883] Updated weights for policy 0, policy_version 152003 (0.0045) [2024-06-18 14:30:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42654.3). Total num frames: 2490548224. Throughput: 0: 42673.1. Samples: 2490608420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:30:01,994][12645] Avg episode reward: [(0, '0.613')] [2024-06-18 14:30:02,680][12883] Updated weights for policy 0, policy_version 152013 (0.0048) [2024-06-18 14:30:06,822][12883] Updated weights for policy 0, policy_version 152023 (0.0046) [2024-06-18 14:30:06,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2490744832. Throughput: 0: 42489.7. Samples: 2490865240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:30:06,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 14:30:07,152][12862] Signal inference workers to stop experience collection... (36400 times) [2024-06-18 14:30:07,206][12862] Signal inference workers to resume experience collection... (36400 times) [2024-06-18 14:30:07,208][12883] InferenceWorker_p0-w0: stopping experience collection (36400 times) [2024-06-18 14:30:07,231][12883] InferenceWorker_p0-w0: resuming experience collection (36400 times) [2024-06-18 14:30:10,391][12883] Updated weights for policy 0, policy_version 152033 (0.0042) [2024-06-18 14:30:11,994][12645] Fps is (10 sec: 39322.3, 60 sec: 42325.5, 300 sec: 42488.2). Total num frames: 2490941440. Throughput: 0: 42516.0. Samples: 2491118160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:30:11,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 14:30:14,389][12883] Updated weights for policy 0, policy_version 152043 (0.0026) [2024-06-18 14:30:17,000][12645] Fps is (10 sec: 42570.6, 60 sec: 42049.2, 300 sec: 42542.5). Total num frames: 2491170816. Throughput: 0: 42517.5. Samples: 2491245940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:30:17,001][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 14:30:18,063][12883] Updated weights for policy 0, policy_version 152053 (0.0032) [2024-06-18 14:30:21,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2491367424. 
Throughput: 0: 42514.2. Samples: 2491504740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:30:21,994][12645] Avg episode reward: [(0, '0.413')] [2024-06-18 14:30:22,394][12883] Updated weights for policy 0, policy_version 152063 (0.0028) [2024-06-18 14:30:25,623][12883] Updated weights for policy 0, policy_version 152073 (0.0025) [2024-06-18 14:30:26,994][12645] Fps is (10 sec: 42626.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2491596800. Throughput: 0: 42492.8. Samples: 2491761680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:30:26,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 14:30:29,954][12883] Updated weights for policy 0, policy_version 152083 (0.0042) [2024-06-18 14:30:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2491809792. Throughput: 0: 42726.6. Samples: 2491891800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:30:31,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 14:30:33,534][12883] Updated weights for policy 0, policy_version 152093 (0.0029) [2024-06-18 14:30:36,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2492022784. Throughput: 0: 42522.7. Samples: 2492145080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:30:36,994][12645] Avg episode reward: [(0, '0.677')] [2024-06-18 14:30:37,000][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152101_2492022784.pth... [2024-06-18 14:30:37,057][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151480_2481848320.pth [2024-06-18 14:30:37,518][12883] Updated weights for policy 0, policy_version 152103 (0.0043) [2024-06-18 14:30:41,355][12883] Updated weights for policy 0, policy_version 152113 (0.0033) [2024-06-18 14:30:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2492235776. Throughput: 0: 42234.1. 
Samples: 2492389580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:30:41,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 14:30:45,313][12883] Updated weights for policy 0, policy_version 152123 (0.0037) [2024-06-18 14:30:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2492448768. Throughput: 0: 42488.1. Samples: 2492520380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:30:46,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 14:30:49,067][12883] Updated weights for policy 0, policy_version 152133 (0.0042) [2024-06-18 14:30:51,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 2492645376. Throughput: 0: 42456.6. Samples: 2492775780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:30:51,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 14:30:52,873][12883] Updated weights for policy 0, policy_version 152143 (0.0033) [2024-06-18 14:30:56,566][12883] Updated weights for policy 0, policy_version 152153 (0.0030) [2024-06-18 14:30:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 2492874752. Throughput: 0: 42497.2. Samples: 2493030540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:30:56,994][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 14:31:00,419][12883] Updated weights for policy 0, policy_version 152163 (0.0032) [2024-06-18 14:31:01,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2493087744. Throughput: 0: 42604.8. Samples: 2493162880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:31:01,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 14:31:04,165][12883] Updated weights for policy 0, policy_version 152173 (0.0032) [2024-06-18 14:31:06,996][12645] Fps is (10 sec: 42588.8, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2493300736. Throughput: 0: 42410.4. 
Samples: 2493413300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:31:06,996][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 14:31:08,593][12883] Updated weights for policy 0, policy_version 152183 (0.0044) [2024-06-18 14:31:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42432.1). Total num frames: 2493497344. Throughput: 0: 42310.3. Samples: 2493665640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:31:11,994][12645] Avg episode reward: [(0, '0.668')] [2024-06-18 14:31:12,410][12883] Updated weights for policy 0, policy_version 152193 (0.0042) [2024-06-18 14:31:16,148][12883] Updated weights for policy 0, policy_version 152203 (0.0028) [2024-06-18 14:31:16,994][12645] Fps is (10 sec: 42608.5, 60 sec: 42603.1, 300 sec: 42487.3). Total num frames: 2493726720. Throughput: 0: 42329.9. Samples: 2493796640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:31:16,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 14:31:20,204][12883] Updated weights for policy 0, policy_version 152213 (0.0030) [2024-06-18 14:31:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2493923328. Throughput: 0: 42327.2. Samples: 2494049800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:31:21,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 14:31:23,750][12883] Updated weights for policy 0, policy_version 152223 (0.0040) [2024-06-18 14:31:26,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2494136320. Throughput: 0: 42597.3. Samples: 2494306460. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-18 14:31:26,994][12645] Avg episode reward: [(0, '0.743')] [2024-06-18 14:31:27,931][12883] Updated weights for policy 0, policy_version 152233 (0.0028) [2024-06-18 14:31:31,423][12883] Updated weights for policy 0, policy_version 152243 (0.0041) [2024-06-18 14:31:31,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2494365696. Throughput: 0: 42614.8. Samples: 2494438040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:31:31,994][12645] Avg episode reward: [(0, '0.709')] [2024-06-18 14:31:35,554][12883] Updated weights for policy 0, policy_version 152253 (0.0041) [2024-06-18 14:31:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2494562304. Throughput: 0: 42598.8. Samples: 2494692740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:31:36,995][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 14:31:39,140][12883] Updated weights for policy 0, policy_version 152263 (0.0042) [2024-06-18 14:31:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2494791680. Throughput: 0: 42468.1. Samples: 2494941600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:31:41,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 14:31:43,273][12883] Updated weights for policy 0, policy_version 152273 (0.0046) [2024-06-18 14:31:43,699][12862] Signal inference workers to stop experience collection... (36450 times) [2024-06-18 14:31:43,699][12862] Signal inference workers to resume experience collection... 
(36450 times) [2024-06-18 14:31:43,747][12883] InferenceWorker_p0-w0: stopping experience collection (36450 times) [2024-06-18 14:31:43,747][12883] InferenceWorker_p0-w0: resuming experience collection (36450 times) [2024-06-18 14:31:46,735][12883] Updated weights for policy 0, policy_version 152283 (0.0024) [2024-06-18 14:31:46,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2495004672. Throughput: 0: 42477.0. Samples: 2495074340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:31:46,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 14:31:50,751][12883] Updated weights for policy 0, policy_version 152293 (0.0031) [2024-06-18 14:31:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2495201280. Throughput: 0: 42643.6. Samples: 2495332160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:31:51,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 14:31:54,356][12883] Updated weights for policy 0, policy_version 152303 (0.0032) [2024-06-18 14:31:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2495447040. Throughput: 0: 42672.0. Samples: 2495585880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:31:56,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 14:31:58,237][12883] Updated weights for policy 0, policy_version 152313 (0.0039) [2024-06-18 14:32:01,980][12883] Updated weights for policy 0, policy_version 152323 (0.0039) [2024-06-18 14:32:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2495660032. Throughput: 0: 42764.0. Samples: 2495721020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:32:01,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 14:32:06,170][12883] Updated weights for policy 0, policy_version 152333 (0.0034) [2024-06-18 14:32:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2495856640. Throughput: 0: 42766.6. Samples: 2495974300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:32:06,994][12645] Avg episode reward: [(0, '0.105')] [2024-06-18 14:32:10,184][12883] Updated weights for policy 0, policy_version 152343 (0.0043) [2024-06-18 14:32:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2496069632. Throughput: 0: 42520.2. Samples: 2496219860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:32:11,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 14:32:14,187][12883] Updated weights for policy 0, policy_version 152353 (0.0033) [2024-06-18 14:32:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2496282624. Throughput: 0: 42534.9. Samples: 2496352120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:32:16,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 14:32:17,758][12883] Updated weights for policy 0, policy_version 152363 (0.0025) [2024-06-18 14:32:21,713][12883] Updated weights for policy 0, policy_version 152373 (0.0038) [2024-06-18 14:32:21,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2496479232. Throughput: 0: 42477.9. Samples: 2496604240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:32:22,003][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 14:32:25,321][12883] Updated weights for policy 0, policy_version 152383 (0.0028) [2024-06-18 14:32:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2496708608. Throughput: 0: 42671.4. Samples: 2496861820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:32:26,994][12645] Avg episode reward: [(0, '0.813')] [2024-06-18 14:32:29,621][12883] Updated weights for policy 0, policy_version 152393 (0.0038) [2024-06-18 14:32:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2496921600. Throughput: 0: 42626.2. Samples: 2496992520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 14:32:31,994][12645] Avg episode reward: [(0, '0.671')] [2024-06-18 14:32:32,909][12883] Updated weights for policy 0, policy_version 152403 (0.0034) [2024-06-18 14:32:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 2497118208. Throughput: 0: 42482.6. Samples: 2497243880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:32:36,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 14:32:37,090][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152413_2497134592.pth... [2024-06-18 14:32:37,099][12883] Updated weights for policy 0, policy_version 152413 (0.0034) [2024-06-18 14:32:37,142][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000151790_2486927360.pth [2024-06-18 14:32:40,578][12883] Updated weights for policy 0, policy_version 152423 (0.0033) [2024-06-18 14:32:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2497331200. Throughput: 0: 42505.4. Samples: 2497498620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:32:41,996][12645] Avg episode reward: [(0, '0.689')] [2024-06-18 14:32:44,775][12883] Updated weights for policy 0, policy_version 152433 (0.0028) [2024-06-18 14:32:46,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2497576960. Throughput: 0: 42402.9. Samples: 2497629160. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:32:46,994][12645] Avg episode reward: [(0, '0.672')] [2024-06-18 14:32:48,543][12883] Updated weights for policy 0, policy_version 152443 (0.0032) [2024-06-18 14:32:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2497773568. Throughput: 0: 42459.2. Samples: 2497884960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:32:51,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 14:32:52,182][12883] Updated weights for policy 0, policy_version 152453 (0.0031) [2024-06-18 14:32:56,240][12883] Updated weights for policy 0, policy_version 152463 (0.0037) [2024-06-18 14:32:56,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2497986560. Throughput: 0: 42679.1. Samples: 2498140420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:32:56,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 14:32:59,788][12883] Updated weights for policy 0, policy_version 152473 (0.0034) [2024-06-18 14:33:01,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2498199552. Throughput: 0: 42574.2. Samples: 2498267960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:33:01,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 14:33:03,849][12883] Updated weights for policy 0, policy_version 152483 (0.0033) [2024-06-18 14:33:06,036][12862] Signal inference workers to stop experience collection... (36500 times) [2024-06-18 14:33:06,037][12862] Signal inference workers to resume experience collection... (36500 times) [2024-06-18 14:33:06,072][12883] InferenceWorker_p0-w0: stopping experience collection (36500 times) [2024-06-18 14:33:06,072][12883] InferenceWorker_p0-w0: resuming experience collection (36500 times) [2024-06-18 14:33:06,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2498412544. 
Throughput: 0: 42676.4. Samples: 2498524680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:33:06,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 14:33:07,606][12883] Updated weights for policy 0, policy_version 152493 (0.0028) [2024-06-18 14:33:11,450][12883] Updated weights for policy 0, policy_version 152503 (0.0033) [2024-06-18 14:33:12,000][12645] Fps is (10 sec: 42572.1, 60 sec: 42593.9, 300 sec: 42653.0). Total num frames: 2498625536. Throughput: 0: 42618.9. Samples: 2498779940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:33:12,001][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 14:33:15,253][12883] Updated weights for policy 0, policy_version 152513 (0.0046) [2024-06-18 14:33:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2498838528. Throughput: 0: 42617.9. Samples: 2498910320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:33:16,994][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 14:33:18,978][12883] Updated weights for policy 0, policy_version 152523 (0.0033) [2024-06-18 14:33:21,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2499051520. Throughput: 0: 42799.0. Samples: 2499169840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:33:21,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 14:33:22,901][12883] Updated weights for policy 0, policy_version 152533 (0.0029) [2024-06-18 14:33:26,589][12883] Updated weights for policy 0, policy_version 152543 (0.0033) [2024-06-18 14:33:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2499264512. Throughput: 0: 42821.8. Samples: 2499425600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:33:26,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 14:33:30,359][12883] Updated weights for policy 0, policy_version 152553 (0.0029) [2024-06-18 14:33:31,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 2499477504. Throughput: 0: 42899.3. Samples: 2499559620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:33:31,994][12645] Avg episode reward: [(0, '0.662')] [2024-06-18 14:33:34,723][12883] Updated weights for policy 0, policy_version 152563 (0.0025) [2024-06-18 14:33:36,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42487.9). Total num frames: 2499674112. Throughput: 0: 42832.8. Samples: 2499812440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 14:33:36,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 14:33:38,081][12883] Updated weights for policy 0, policy_version 152573 (0.0038) [2024-06-18 14:33:41,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2499903488. Throughput: 0: 42917.2. Samples: 2500071700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:33:41,995][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 14:33:42,186][12883] Updated weights for policy 0, policy_version 152583 (0.0030) [2024-06-18 14:33:45,629][12883] Updated weights for policy 0, policy_version 152593 (0.0036) [2024-06-18 14:33:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2500132864. Throughput: 0: 43084.5. Samples: 2500206760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:33:46,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 14:33:49,755][12883] Updated weights for policy 0, policy_version 152603 (0.0040) [2024-06-18 14:33:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2500329472. Throughput: 0: 42952.0. Samples: 2500457520. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:33:51,994][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 14:33:53,245][12883] Updated weights for policy 0, policy_version 152613 (0.0028) [2024-06-18 14:33:56,995][12645] Fps is (10 sec: 42592.4, 60 sec: 42870.4, 300 sec: 42653.7). Total num frames: 2500558848. Throughput: 0: 43017.9. Samples: 2500715540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:33:56,996][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 14:33:57,314][12883] Updated weights for policy 0, policy_version 152623 (0.0032) [2024-06-18 14:34:00,782][12883] Updated weights for policy 0, policy_version 152633 (0.0039) [2024-06-18 14:34:01,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42871.7, 300 sec: 42654.0). Total num frames: 2500771840. Throughput: 0: 43077.3. Samples: 2500848800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:34:01,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 14:34:04,654][12883] Updated weights for policy 0, policy_version 152643 (0.0033) [2024-06-18 14:34:06,998][12645] Fps is (10 sec: 40948.8, 60 sec: 42595.4, 300 sec: 42597.8). Total num frames: 2500968448. Throughput: 0: 43094.2. Samples: 2501109260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:34:06,998][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 14:34:08,416][12883] Updated weights for policy 0, policy_version 152653 (0.0038) [2024-06-18 14:34:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42876.0, 300 sec: 42543.2). Total num frames: 2501197824. Throughput: 0: 43014.7. Samples: 2501361260. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:34:11,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 14:34:12,149][12883] Updated weights for policy 0, policy_version 152663 (0.0022) [2024-06-18 14:34:16,026][12883] Updated weights for policy 0, policy_version 152673 (0.0035) [2024-06-18 14:34:16,994][12645] Fps is (10 sec: 45894.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2501427200. Throughput: 0: 43034.6. Samples: 2501496180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:34:16,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 14:34:19,536][12883] Updated weights for policy 0, policy_version 152683 (0.0045) [2024-06-18 14:34:21,998][12645] Fps is (10 sec: 42579.2, 60 sec: 42868.3, 300 sec: 42708.8). Total num frames: 2501623808. Throughput: 0: 43177.6. Samples: 2501755620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:34:21,998][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 14:34:23,397][12883] Updated weights for policy 0, policy_version 152693 (0.0038) [2024-06-18 14:34:25,939][12862] Signal inference workers to stop experience collection... (36550 times) [2024-06-18 14:34:25,944][12862] Signal inference workers to resume experience collection... (36550 times) [2024-06-18 14:34:25,989][12883] InferenceWorker_p0-w0: stopping experience collection (36550 times) [2024-06-18 14:34:25,992][12883] InferenceWorker_p0-w0: resuming experience collection (36550 times) [2024-06-18 14:34:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2501853184. Throughput: 0: 42957.0. Samples: 2502004760. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:34:26,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 14:34:28,103][12883] Updated weights for policy 0, policy_version 152703 (0.0035) [2024-06-18 14:34:31,290][12883] Updated weights for policy 0, policy_version 152713 (0.0038) [2024-06-18 14:34:31,994][12645] Fps is (10 sec: 45895.1, 60 sec: 43417.5, 300 sec: 42709.5). Total num frames: 2502082560. Throughput: 0: 43036.0. Samples: 2502143380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:34:31,994][12645] Avg episode reward: [(0, '0.676')] [2024-06-18 14:34:35,646][12883] Updated weights for policy 0, policy_version 152723 (0.0038) [2024-06-18 14:34:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2502262784. Throughput: 0: 43098.3. Samples: 2502396940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 14:34:36,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 14:34:37,002][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152726_2502262784.pth... [2024-06-18 14:34:37,077][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152101_2492022784.pth [2024-06-18 14:34:39,083][12883] Updated weights for policy 0, policy_version 152733 (0.0038) [2024-06-18 14:34:41,996][12645] Fps is (10 sec: 42590.9, 60 sec: 43416.3, 300 sec: 42709.2). Total num frames: 2502508544. Throughput: 0: 42937.4. Samples: 2502647740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:34:41,996][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 14:34:43,093][12883] Updated weights for policy 0, policy_version 152743 (0.0038) [2024-06-18 14:34:46,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2502688768. Throughput: 0: 42887.9. Samples: 2502778760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:34:46,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 14:34:47,059][12883] Updated weights for policy 0, policy_version 152753 (0.0040) [2024-06-18 14:34:50,818][12883] Updated weights for policy 0, policy_version 152763 (0.0031) [2024-06-18 14:34:51,994][12645] Fps is (10 sec: 39328.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2502901760. Throughput: 0: 42753.3. Samples: 2503032980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:34:52,003][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 14:34:54,654][12883] Updated weights for policy 0, policy_version 152773 (0.0035) [2024-06-18 14:34:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43145.6, 300 sec: 42709.5). Total num frames: 2503147520. Throughput: 0: 42815.5. Samples: 2503287960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:34:56,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 14:34:58,295][12883] Updated weights for policy 0, policy_version 152783 (0.0029) [2024-06-18 14:35:01,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2503344128. Throughput: 0: 42788.0. Samples: 2503421640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:35:01,994][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 14:35:02,439][12883] Updated weights for policy 0, policy_version 152793 (0.0029) [2024-06-18 14:35:05,786][12883] Updated weights for policy 0, policy_version 152803 (0.0031) [2024-06-18 14:35:06,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42874.5, 300 sec: 42709.5). Total num frames: 2503540736. Throughput: 0: 42674.5. Samples: 2503675780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:35:06,994][12645] Avg episode reward: [(0, '0.295')] [2024-06-18 14:35:10,229][12883] Updated weights for policy 0, policy_version 152813 (0.0034) [2024-06-18 14:35:11,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42766.0). Total num frames: 2503786496. Throughput: 0: 42666.7. Samples: 2503924760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:35:11,994][12645] Avg episode reward: [(0, '0.636')] [2024-06-18 14:35:13,919][12883] Updated weights for policy 0, policy_version 152823 (0.0029) [2024-06-18 14:35:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2503983104. Throughput: 0: 42645.4. Samples: 2504062420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:35:16,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 14:35:17,778][12883] Updated weights for policy 0, policy_version 152833 (0.0037) [2024-06-18 14:35:21,578][12883] Updated weights for policy 0, policy_version 152843 (0.0037) [2024-06-18 14:35:21,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42601.5, 300 sec: 42654.0). Total num frames: 2504179712. Throughput: 0: 42571.6. Samples: 2504312660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:35:21,994][12645] Avg episode reward: [(0, '0.524')] [2024-06-18 14:35:25,487][12883] Updated weights for policy 0, policy_version 152853 (0.0033) [2024-06-18 14:35:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2504425472. Throughput: 0: 42652.0. Samples: 2504567000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:35:26,994][12645] Avg episode reward: [(0, '0.663')] [2024-06-18 14:35:29,202][12883] Updated weights for policy 0, policy_version 152863 (0.0027) [2024-06-18 14:35:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 2504622080. Throughput: 0: 42729.3. Samples: 2504701580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:35:31,994][12645] Avg episode reward: [(0, '0.690')] [2024-06-18 14:35:33,107][12883] Updated weights for policy 0, policy_version 152873 (0.0029) [2024-06-18 14:35:36,806][12883] Updated weights for policy 0, policy_version 152883 (0.0026) [2024-06-18 14:35:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2504835072. Throughput: 0: 42708.1. Samples: 2504954840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:35:37,000][12645] Avg episode reward: [(0, '0.826')] [2024-06-18 14:35:40,915][12883] Updated weights for policy 0, policy_version 152893 (0.0038) [2024-06-18 14:35:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42599.7, 300 sec: 42765.0). Total num frames: 2505064448. Throughput: 0: 42744.4. Samples: 2505211460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 14:35:41,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 14:35:42,078][12862] Signal inference workers to stop experience collection... (36600 times) [2024-06-18 14:35:42,078][12862] Signal inference workers to resume experience collection... (36600 times) [2024-06-18 14:35:42,105][12883] InferenceWorker_p0-w0: stopping experience collection (36600 times) [2024-06-18 14:35:42,105][12883] InferenceWorker_p0-w0: resuming experience collection (36600 times) [2024-06-18 14:35:44,428][12883] Updated weights for policy 0, policy_version 152903 (0.0041) [2024-06-18 14:35:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2505261056. Throughput: 0: 42631.5. Samples: 2505340060. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:35:46,994][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 14:35:48,566][12883] Updated weights for policy 0, policy_version 152913 (0.0048) [2024-06-18 14:35:51,839][12883] Updated weights for policy 0, policy_version 152923 (0.0040) [2024-06-18 14:35:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2505490432. Throughput: 0: 42627.4. Samples: 2505594020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:35:51,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 14:35:56,216][12883] Updated weights for policy 0, policy_version 152933 (0.0032) [2024-06-18 14:35:56,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2505719808. Throughput: 0: 43032.0. Samples: 2505861200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:35:56,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 14:35:59,299][12883] Updated weights for policy 0, policy_version 152943 (0.0030) [2024-06-18 14:36:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 2505900032. Throughput: 0: 42761.7. Samples: 2505986700. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:01,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 14:36:03,756][12883] Updated weights for policy 0, policy_version 152953 (0.0037) [2024-06-18 14:36:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2506129408. Throughput: 0: 42831.5. Samples: 2506240080. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:06,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 14:36:07,242][12883] Updated weights for policy 0, policy_version 152963 (0.0032) [2024-06-18 14:36:11,538][12883] Updated weights for policy 0, policy_version 152973 (0.0035) [2024-06-18 14:36:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2506342400. Throughput: 0: 43106.2. Samples: 2506506780. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:11,995][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 14:36:14,735][12883] Updated weights for policy 0, policy_version 152983 (0.0046) [2024-06-18 14:36:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2506555392. Throughput: 0: 42832.8. Samples: 2506629060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:16,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 14:36:19,219][12883] Updated weights for policy 0, policy_version 152993 (0.0029) [2024-06-18 14:36:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2506784768. Throughput: 0: 42882.1. Samples: 2506884540. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:21,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 14:36:22,240][12883] Updated weights for policy 0, policy_version 153003 (0.0038) [2024-06-18 14:36:26,727][12883] Updated weights for policy 0, policy_version 153013 (0.0041) [2024-06-18 14:36:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2506964992. Throughput: 0: 43123.7. Samples: 2507152020. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:26,994][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 14:36:29,767][12883] Updated weights for policy 0, policy_version 153023 (0.0042) [2024-06-18 14:36:31,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42869.8, 300 sec: 42820.3). Total num frames: 2507194368. Throughput: 0: 42923.6. Samples: 2507271720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:31,996][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 14:36:34,500][12883] Updated weights for policy 0, policy_version 153033 (0.0027) [2024-06-18 14:36:36,994][12645] Fps is (10 sec: 47513.3, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2507440128. Throughput: 0: 42959.7. Samples: 2507527200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:36,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 14:36:37,007][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153042_2507440128.pth... [2024-06-18 14:36:37,086][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152413_2497134592.pth [2024-06-18 14:36:37,260][12883] Updated weights for policy 0, policy_version 153043 (0.0038) [2024-06-18 14:36:41,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2507603968. Throughput: 0: 42969.3. Samples: 2507794820. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:41,994][12645] Avg episode reward: [(0, '0.701')] [2024-06-18 14:36:42,035][12883] Updated weights for policy 0, policy_version 153053 (0.0032) [2024-06-18 14:36:45,400][12883] Updated weights for policy 0, policy_version 153063 (0.0032) [2024-06-18 14:36:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2507833344. Throughput: 0: 42754.7. Samples: 2507910660. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-18 14:36:46,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 14:36:49,655][12883] Updated weights for policy 0, policy_version 153073 (0.0026) [2024-06-18 14:36:49,904][12862] Signal inference workers to stop experience collection... (36650 times) [2024-06-18 14:36:49,905][12862] Signal inference workers to resume experience collection... (36650 times) [2024-06-18 14:36:49,950][12883] InferenceWorker_p0-w0: stopping experience collection (36650 times) [2024-06-18 14:36:49,951][12883] InferenceWorker_p0-w0: resuming experience collection (36650 times) [2024-06-18 14:36:51,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2508079104. Throughput: 0: 42883.6. Samples: 2508169840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:36:51,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 14:36:53,050][12883] Updated weights for policy 0, policy_version 153083 (0.0032) [2024-06-18 14:36:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2508259328. Throughput: 0: 42856.9. Samples: 2508435340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:36:56,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 14:36:57,197][12883] Updated weights for policy 0, policy_version 153093 (0.0051) [2024-06-18 14:37:00,543][12883] Updated weights for policy 0, policy_version 153103 (0.0036) [2024-06-18 14:37:01,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2508472320. Throughput: 0: 42771.1. Samples: 2508553760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:01,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 14:37:04,832][12883] Updated weights for policy 0, policy_version 153113 (0.0026) [2024-06-18 14:37:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2508718080. 
Throughput: 0: 42931.1. Samples: 2508816440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:06,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 14:37:08,048][12883] Updated weights for policy 0, policy_version 153123 (0.0027) [2024-06-18 14:37:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2508898304. Throughput: 0: 42719.0. Samples: 2509074380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:11,999][12645] Avg episode reward: [(0, '0.666')] [2024-06-18 14:37:12,354][12883] Updated weights for policy 0, policy_version 153133 (0.0039) [2024-06-18 14:37:16,134][12883] Updated weights for policy 0, policy_version 153143 (0.0034) [2024-06-18 14:37:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2509127680. Throughput: 0: 42728.9. Samples: 2509194420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:16,994][12645] Avg episode reward: [(0, '0.676')] [2024-06-18 14:37:19,976][12883] Updated weights for policy 0, policy_version 153153 (0.0032) [2024-06-18 14:37:21,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2509357056. Throughput: 0: 42858.7. Samples: 2509455840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:21,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 14:37:23,730][12883] Updated weights for policy 0, policy_version 153163 (0.0026) [2024-06-18 14:37:26,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2509537280. Throughput: 0: 42793.3. Samples: 2509720520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:26,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 14:37:27,703][12883] Updated weights for policy 0, policy_version 153173 (0.0027) [2024-06-18 14:37:31,256][12883] Updated weights for policy 0, policy_version 153183 (0.0037) [2024-06-18 14:37:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 2509783040. Throughput: 0: 42865.4. Samples: 2509839600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:31,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 14:37:35,398][12883] Updated weights for policy 0, policy_version 153193 (0.0039) [2024-06-18 14:37:36,994][12645] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42987.2). Total num frames: 2510012416. Throughput: 0: 42863.9. Samples: 2510098720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:36,995][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 14:37:38,753][12883] Updated weights for policy 0, policy_version 153203 (0.0041) [2024-06-18 14:37:41,995][12645] Fps is (10 sec: 37678.9, 60 sec: 42597.6, 300 sec: 42653.8). Total num frames: 2510159872. Throughput: 0: 42727.5. Samples: 2510358120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:41,995][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 14:37:43,047][12883] Updated weights for policy 0, policy_version 153213 (0.0046) [2024-06-18 14:37:46,633][12883] Updated weights for policy 0, policy_version 153223 (0.0042) [2024-06-18 14:37:46,994][12645] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2510405632. Throughput: 0: 42717.4. Samples: 2510476040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:46,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 14:37:51,014][12883] Updated weights for policy 0, policy_version 153233 (0.0041) [2024-06-18 14:37:51,995][12645] Fps is (10 sec: 49149.9, 60 sec: 42870.4, 300 sec: 42931.4). Total num frames: 2510651392. Throughput: 0: 42774.6. Samples: 2510741360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 23.0) [2024-06-18 14:37:51,996][12645] Avg episode reward: [(0, '0.676')] [2024-06-18 14:37:54,152][12883] Updated weights for policy 0, policy_version 153243 (0.0040) [2024-06-18 14:37:56,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2510815232. Throughput: 0: 42735.2. Samples: 2510997460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:37:56,994][12645] Avg episode reward: [(0, '0.685')] [2024-06-18 14:37:58,604][12883] Updated weights for policy 0, policy_version 153253 (0.0046) [2024-06-18 14:38:01,660][12883] Updated weights for policy 0, policy_version 153263 (0.0032) [2024-06-18 14:38:01,994][12645] Fps is (10 sec: 40966.4, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2511060992. Throughput: 0: 42709.3. Samples: 2511116340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:01,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 14:38:06,339][12883] Updated weights for policy 0, policy_version 153273 (0.0043) [2024-06-18 14:38:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42877.0). Total num frames: 2511273984. Throughput: 0: 42792.4. Samples: 2511381500. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:06,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 14:38:09,291][12883] Updated weights for policy 0, policy_version 153283 (0.0028) [2024-06-18 14:38:11,994][12645] Fps is (10 sec: 37682.3, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 2511437824. Throughput: 0: 42462.1. Samples: 2511631320. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:11,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 14:38:13,971][12883] Updated weights for policy 0, policy_version 153293 (0.0029) [2024-06-18 14:38:14,231][12862] Signal inference workers to stop experience collection... (36700 times) [2024-06-18 14:38:14,267][12883] InferenceWorker_p0-w0: stopping experience collection (36700 times) [2024-06-18 14:38:14,285][12862] Signal inference workers to resume experience collection... (36700 times) [2024-06-18 14:38:14,291][12883] InferenceWorker_p0-w0: resuming experience collection (36700 times) [2024-06-18 14:38:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2511699968. Throughput: 0: 42530.2. Samples: 2511753460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:16,994][12645] Avg episode reward: [(0, '0.633')] [2024-06-18 14:38:17,048][12883] Updated weights for policy 0, policy_version 153303 (0.0037) [2024-06-18 14:38:21,715][12883] Updated weights for policy 0, policy_version 153313 (0.0028) [2024-06-18 14:38:21,994][12645] Fps is (10 sec: 45876.5, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2511896576. Throughput: 0: 42681.1. Samples: 2512019360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:21,994][12645] Avg episode reward: [(0, '0.809')] [2024-06-18 14:38:25,074][12883] Updated weights for policy 0, policy_version 153323 (0.0035) [2024-06-18 14:38:26,995][12645] Fps is (10 sec: 39316.1, 60 sec: 42597.5, 300 sec: 42764.8). Total num frames: 2512093184. Throughput: 0: 42320.2. Samples: 2512262540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:26,996][12645] Avg episode reward: [(0, '0.704')] [2024-06-18 14:38:29,601][12883] Updated weights for policy 0, policy_version 153333 (0.0031) [2024-06-18 14:38:31,994][12645] Fps is (10 sec: 42597.0, 60 sec: 42325.1, 300 sec: 42876.1). Total num frames: 2512322560. 
Throughput: 0: 42508.2. Samples: 2512388920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:31,994][12645] Avg episode reward: [(0, '0.700')] [2024-06-18 14:38:32,845][12883] Updated weights for policy 0, policy_version 153343 (0.0053) [2024-06-18 14:38:36,994][12645] Fps is (10 sec: 40965.0, 60 sec: 41506.1, 300 sec: 42709.5). Total num frames: 2512502784. Throughput: 0: 42440.4. Samples: 2512651120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:36,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 14:38:37,120][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153352_2512519168.pth... [2024-06-18 14:38:37,184][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000152726_2502262784.pth [2024-06-18 14:38:37,348][12883] Updated weights for policy 0, policy_version 153353 (0.0037) [2024-06-18 14:38:40,571][12883] Updated weights for policy 0, policy_version 153363 (0.0038) [2024-06-18 14:38:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42872.2, 300 sec: 42709.5). Total num frames: 2512732160. Throughput: 0: 41917.7. Samples: 2512883760. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:41,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 14:38:45,274][12883] Updated weights for policy 0, policy_version 153373 (0.0039) [2024-06-18 14:38:46,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2512977920. Throughput: 0: 42297.3. Samples: 2513019720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:46,994][12645] Avg episode reward: [(0, '0.664')] [2024-06-18 14:38:48,295][12883] Updated weights for policy 0, policy_version 153383 (0.0031) [2024-06-18 14:38:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 40961.1, 300 sec: 42543.1). Total num frames: 2513108992. Throughput: 0: 42025.8. Samples: 2513272660. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 21.0) [2024-06-18 14:38:51,994][12645] Avg episode reward: [(0, '0.529')] [2024-06-18 14:38:52,925][12883] Updated weights for policy 0, policy_version 153393 (0.0038) [2024-06-18 14:38:55,856][12883] Updated weights for policy 0, policy_version 153403 (0.0035) [2024-06-18 14:38:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2513387520. Throughput: 0: 41951.7. Samples: 2513519140. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:38:56,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 14:39:00,607][12883] Updated weights for policy 0, policy_version 153413 (0.0037) [2024-06-18 14:39:01,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42052.3, 300 sec: 42765.6). Total num frames: 2513584128. Throughput: 0: 42345.4. Samples: 2513659000. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:01,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 14:39:03,546][12883] Updated weights for policy 0, policy_version 153423 (0.0027) [2024-06-18 14:39:06,994][12645] Fps is (10 sec: 34406.5, 60 sec: 40960.0, 300 sec: 42487.3). Total num frames: 2513731584. Throughput: 0: 41906.2. Samples: 2513905140. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:06,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 14:39:08,020][12862] Signal inference workers to stop experience collection... (36750 times) [2024-06-18 14:39:08,020][12862] Signal inference workers to resume experience collection... 
(36750 times) [2024-06-18 14:39:08,055][12883] InferenceWorker_p0-w0: stopping experience collection (36750 times) [2024-06-18 14:39:08,060][12883] InferenceWorker_p0-w0: resuming experience collection (36750 times) [2024-06-18 14:39:08,346][12883] Updated weights for policy 0, policy_version 153433 (0.0031) [2024-06-18 14:39:11,311][12883] Updated weights for policy 0, policy_version 153443 (0.0042) [2024-06-18 14:39:11,996][12645] Fps is (10 sec: 44226.5, 60 sec: 43143.1, 300 sec: 42709.1). Total num frames: 2514026496. Throughput: 0: 42045.4. Samples: 2514154620. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:11,996][12645] Avg episode reward: [(0, '0.200')] [2024-06-18 14:39:15,972][12883] Updated weights for policy 0, policy_version 153453 (0.0029) [2024-06-18 14:39:16,994][12645] Fps is (10 sec: 49151.6, 60 sec: 42052.2, 300 sec: 42710.1). Total num frames: 2514223104. Throughput: 0: 42494.4. Samples: 2514301160. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:16,994][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 14:39:18,915][12883] Updated weights for policy 0, policy_version 153463 (0.0040) [2024-06-18 14:39:21,994][12645] Fps is (10 sec: 36052.7, 60 sec: 41506.1, 300 sec: 42487.3). Total num frames: 2514386944. Throughput: 0: 42157.9. Samples: 2514548220. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:21,994][12645] Avg episode reward: [(0, '0.664')] [2024-06-18 14:39:23,662][12883] Updated weights for policy 0, policy_version 153473 (0.0033) [2024-06-18 14:39:26,808][12883] Updated weights for policy 0, policy_version 153483 (0.0033) [2024-06-18 14:39:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42872.5, 300 sec: 42654.0). Total num frames: 2514665472. Throughput: 0: 42584.6. Samples: 2514800060. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:26,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 14:39:31,285][12883] Updated weights for policy 0, policy_version 153493 (0.0041) [2024-06-18 14:39:31,994][12645] Fps is (10 sec: 49151.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2514878464. Throughput: 0: 42736.9. Samples: 2514942880. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:31,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 14:39:34,418][12883] Updated weights for policy 0, policy_version 153503 (0.0034) [2024-06-18 14:39:36,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42325.4, 300 sec: 42487.6). Total num frames: 2515042304. Throughput: 0: 42481.2. Samples: 2515184320. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:36,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 14:39:39,051][12883] Updated weights for policy 0, policy_version 153513 (0.0028) [2024-06-18 14:39:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2515304448. Throughput: 0: 42581.3. Samples: 2515435300. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:41,994][12645] Avg episode reward: [(0, '0.681')] [2024-06-18 14:39:42,493][12883] Updated weights for policy 0, policy_version 153523 (0.0030) [2024-06-18 14:39:46,618][12883] Updated weights for policy 0, policy_version 153533 (0.0040) [2024-06-18 14:39:46,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2515501056. Throughput: 0: 42618.6. Samples: 2515576840. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:46,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 14:39:50,211][12883] Updated weights for policy 0, policy_version 153543 (0.0042) [2024-06-18 14:39:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2515681280. Throughput: 0: 42670.3. Samples: 2515825300. 
Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:51,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 14:39:54,429][12883] Updated weights for policy 0, policy_version 153553 (0.0028) [2024-06-18 14:39:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2515943424. Throughput: 0: 42709.7. Samples: 2516076460. Policy #0 lag: (min: 1.0, avg: 12.1, max: 21.0) [2024-06-18 14:39:56,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 14:39:57,686][12883] Updated weights for policy 0, policy_version 153563 (0.0033) [2024-06-18 14:40:01,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2516123648. Throughput: 0: 42654.2. Samples: 2516220600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:01,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 14:40:02,034][12883] Updated weights for policy 0, policy_version 153573 (0.0035) [2024-06-18 14:40:05,230][12883] Updated weights for policy 0, policy_version 153583 (0.0037) [2024-06-18 14:40:06,994][12645] Fps is (10 sec: 39321.3, 60 sec: 43417.5, 300 sec: 42542.8). Total num frames: 2516336640. Throughput: 0: 42678.6. Samples: 2516468760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:06,994][12645] Avg episode reward: [(0, '0.358')] [2024-06-18 14:40:09,755][12883] Updated weights for policy 0, policy_version 153593 (0.0035) [2024-06-18 14:40:10,888][12862] Signal inference workers to stop experience collection... (36800 times) [2024-06-18 14:40:10,938][12883] InferenceWorker_p0-w0: stopping experience collection (36800 times) [2024-06-18 14:40:10,943][12862] Signal inference workers to resume experience collection... (36800 times) [2024-06-18 14:40:10,960][12883] InferenceWorker_p0-w0: resuming experience collection (36800 times) [2024-06-18 14:40:11,994][12645] Fps is (10 sec: 47514.1, 60 sec: 42873.1, 300 sec: 42765.0). Total num frames: 2516598784. 
Throughput: 0: 42612.8. Samples: 2516717640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:11,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 14:40:12,764][12883] Updated weights for policy 0, policy_version 153603 (0.0040) [2024-06-18 14:40:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2516762624. Throughput: 0: 42609.4. Samples: 2516860300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:16,994][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 14:40:17,327][12883] Updated weights for policy 0, policy_version 153613 (0.0039) [2024-06-18 14:40:20,388][12883] Updated weights for policy 0, policy_version 153623 (0.0032) [2024-06-18 14:40:21,994][12645] Fps is (10 sec: 37683.1, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 2516975616. Throughput: 0: 42681.4. Samples: 2517104980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:21,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 14:40:24,878][12883] Updated weights for policy 0, policy_version 153633 (0.0028) [2024-06-18 14:40:26,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2517237760. Throughput: 0: 42821.4. Samples: 2517362260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:26,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 14:40:27,956][12883] Updated weights for policy 0, policy_version 153643 (0.0025) [2024-06-18 14:40:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2517417984. Throughput: 0: 42772.8. Samples: 2517501620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:31,994][12645] Avg episode reward: [(0, '0.561')] [2024-06-18 14:40:32,471][12883] Updated weights for policy 0, policy_version 153653 (0.0035) [2024-06-18 14:40:35,549][12883] Updated weights for policy 0, policy_version 153663 (0.0036) [2024-06-18 14:40:36,994][12645] Fps is (10 sec: 39321.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2517630976. Throughput: 0: 42618.5. Samples: 2517743140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:36,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 14:40:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153664_2517630976.pth... [2024-06-18 14:40:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153042_2507440128.pth [2024-06-18 14:40:40,158][12883] Updated weights for policy 0, policy_version 153673 (0.0037) [2024-06-18 14:40:41,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2517860352. Throughput: 0: 42812.9. Samples: 2518003040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:41,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 14:40:43,579][12883] Updated weights for policy 0, policy_version 153683 (0.0046) [2024-06-18 14:40:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2518056960. Throughput: 0: 42543.7. Samples: 2518135060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:47,002][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 14:40:47,970][12883] Updated weights for policy 0, policy_version 153693 (0.0039) [2024-06-18 14:40:51,185][12883] Updated weights for policy 0, policy_version 153703 (0.0038) [2024-06-18 14:40:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2518269952. Throughput: 0: 42434.7. Samples: 2518378320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:52,006][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 14:40:55,543][12883] Updated weights for policy 0, policy_version 153713 (0.0041) [2024-06-18 14:40:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2518482944. Throughput: 0: 42803.9. Samples: 2518643820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:40:56,994][12645] Avg episode reward: [(0, '0.733')] [2024-06-18 14:40:59,525][12883] Updated weights for policy 0, policy_version 153723 (0.0024) [2024-06-18 14:41:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2518695936. Throughput: 0: 42544.4. Samples: 2518774800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:41:01,994][12645] Avg episode reward: [(0, '0.738')] [2024-06-18 14:41:03,071][12883] Updated weights for policy 0, policy_version 153733 (0.0034) [2024-06-18 14:41:06,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2518908928. Throughput: 0: 42595.2. Samples: 2519021760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:06,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 14:41:07,120][12883] Updated weights for policy 0, policy_version 153743 (0.0030) [2024-06-18 14:41:11,002][12883] Updated weights for policy 0, policy_version 153753 (0.0036) [2024-06-18 14:41:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2519121920. Throughput: 0: 42783.1. Samples: 2519287500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:11,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 14:41:14,663][12883] Updated weights for policy 0, policy_version 153763 (0.0037) [2024-06-18 14:41:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2519334912. Throughput: 0: 42517.4. Samples: 2519414900. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:16,994][12645] Avg episode reward: [(0, '0.650')] [2024-06-18 14:41:18,765][12883] Updated weights for policy 0, policy_version 153773 (0.0037) [2024-06-18 14:41:19,978][12862] Signal inference workers to stop experience collection... (36850 times) [2024-06-18 14:41:19,979][12862] Signal inference workers to resume experience collection... (36850 times) [2024-06-18 14:41:19,995][12883] InferenceWorker_p0-w0: stopping experience collection (36850 times) [2024-06-18 14:41:19,995][12883] InferenceWorker_p0-w0: resuming experience collection (36850 times) [2024-06-18 14:41:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2519564288. Throughput: 0: 42785.4. Samples: 2519668480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:21,994][12645] Avg episode reward: [(0, '0.599')] [2024-06-18 14:41:22,129][12883] Updated weights for policy 0, policy_version 153783 (0.0035) [2024-06-18 14:41:26,742][12883] Updated weights for policy 0, policy_version 153793 (0.0033) [2024-06-18 14:41:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42598.7). Total num frames: 2519760896. Throughput: 0: 42893.8. Samples: 2519933260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:26,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 14:41:29,749][12883] Updated weights for policy 0, policy_version 153803 (0.0030) [2024-06-18 14:41:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2519973888. Throughput: 0: 42645.8. Samples: 2520054120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:31,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 14:41:34,513][12883] Updated weights for policy 0, policy_version 153813 (0.0041) [2024-06-18 14:41:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2520203264. 
Throughput: 0: 42859.2. Samples: 2520306980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:36,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 14:41:38,179][12883] Updated weights for policy 0, policy_version 153823 (0.0037) [2024-06-18 14:41:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2520383488. Throughput: 0: 42871.3. Samples: 2520573020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:41,994][12645] Avg episode reward: [(0, '0.116')] [2024-06-18 14:41:42,063][12883] Updated weights for policy 0, policy_version 153833 (0.0027) [2024-06-18 14:41:45,657][12883] Updated weights for policy 0, policy_version 153843 (0.0031) [2024-06-18 14:41:46,995][12645] Fps is (10 sec: 40953.1, 60 sec: 42597.2, 300 sec: 42487.1). Total num frames: 2520612864. Throughput: 0: 42590.0. Samples: 2520691420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:46,996][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 14:41:49,451][12883] Updated weights for policy 0, policy_version 153853 (0.0037) [2024-06-18 14:41:51,994][12645] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2520858624. Throughput: 0: 42855.1. Samples: 2520950240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:51,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 14:41:53,145][12883] Updated weights for policy 0, policy_version 153863 (0.0031) [2024-06-18 14:41:56,994][12645] Fps is (10 sec: 42605.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2521038848. Throughput: 0: 42683.8. Samples: 2521208280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:41:56,994][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 14:41:57,112][12883] Updated weights for policy 0, policy_version 153873 (0.0037) [2024-06-18 14:42:00,607][12883] Updated weights for policy 0, policy_version 153883 (0.0032) [2024-06-18 14:42:01,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2521251840. Throughput: 0: 42549.3. Samples: 2521329620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:42:01,994][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 14:42:04,836][12883] Updated weights for policy 0, policy_version 153893 (0.0048) [2024-06-18 14:42:06,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2521481216. Throughput: 0: 42673.9. Samples: 2521588800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 14:42:06,994][12645] Avg episode reward: [(0, '0.691')] [2024-06-18 14:42:08,143][12883] Updated weights for policy 0, policy_version 153903 (0.0043) [2024-06-18 14:42:11,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2521661440. Throughput: 0: 42532.0. Samples: 2521847200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:11,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 14:42:12,458][12883] Updated weights for policy 0, policy_version 153913 (0.0034) [2024-06-18 14:42:15,971][12883] Updated weights for policy 0, policy_version 153923 (0.0035) [2024-06-18 14:42:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2521907200. Throughput: 0: 42633.7. Samples: 2521972640. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:16,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 14:42:20,047][12883] Updated weights for policy 0, policy_version 153933 (0.0026) [2024-06-18 14:42:21,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2522136576. Throughput: 0: 42906.6. Samples: 2522237780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:21,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 14:42:23,503][12883] Updated weights for policy 0, policy_version 153943 (0.0029) [2024-06-18 14:42:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2522316800. Throughput: 0: 42671.0. Samples: 2522493220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:26,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 14:42:27,642][12883] Updated weights for policy 0, policy_version 153953 (0.0038) [2024-06-18 14:42:29,756][12862] Signal inference workers to stop experience collection... (36900 times) [2024-06-18 14:42:29,757][12862] Signal inference workers to resume experience collection... (36900 times) [2024-06-18 14:42:29,781][12883] InferenceWorker_p0-w0: stopping experience collection (36900 times) [2024-06-18 14:42:29,781][12883] InferenceWorker_p0-w0: resuming experience collection (36900 times) [2024-06-18 14:42:31,267][12883] Updated weights for policy 0, policy_version 153963 (0.0035) [2024-06-18 14:42:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2522529792. Throughput: 0: 42806.5. Samples: 2522617640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:31,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 14:42:35,656][12883] Updated weights for policy 0, policy_version 153973 (0.0048) [2024-06-18 14:42:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.6). Total num frames: 2522759168. 
Throughput: 0: 42914.2. Samples: 2522881380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:36,994][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 14:42:37,021][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153978_2522775552.pth... [2024-06-18 14:42:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153352_2512519168.pth [2024-06-18 14:42:38,901][12883] Updated weights for policy 0, policy_version 153983 (0.0042) [2024-06-18 14:42:41,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2522972160. Throughput: 0: 42759.6. Samples: 2523132460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:41,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 14:42:43,302][12883] Updated weights for policy 0, policy_version 153993 (0.0047) [2024-06-18 14:42:46,799][12883] Updated weights for policy 0, policy_version 154003 (0.0038) [2024-06-18 14:42:46,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42872.6, 300 sec: 42487.5). Total num frames: 2523185152. Throughput: 0: 42926.7. Samples: 2523261320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:46,994][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 14:42:50,844][12883] Updated weights for policy 0, policy_version 154013 (0.0026) [2024-06-18 14:42:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2523398144. Throughput: 0: 42881.2. Samples: 2523518460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:51,994][12645] Avg episode reward: [(0, '0.676')] [2024-06-18 14:42:54,499][12883] Updated weights for policy 0, policy_version 154023 (0.0027) [2024-06-18 14:42:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2523594752. Throughput: 0: 42730.7. Samples: 2523770080. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:42:56,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 14:42:58,327][12883] Updated weights for policy 0, policy_version 154033 (0.0041) [2024-06-18 14:43:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2523807744. Throughput: 0: 42780.0. Samples: 2523897740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:43:01,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 14:43:02,193][12883] Updated weights for policy 0, policy_version 154043 (0.0031) [2024-06-18 14:43:05,861][12883] Updated weights for policy 0, policy_version 154053 (0.0031) [2024-06-18 14:43:06,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 2524037120. Throughput: 0: 42615.0. Samples: 2524155460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 14:43:06,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 14:43:09,914][12883] Updated weights for policy 0, policy_version 154063 (0.0029) [2024-06-18 14:43:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2524233728. Throughput: 0: 42725.4. Samples: 2524415860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:11,994][12645] Avg episode reward: [(0, '0.807')] [2024-06-18 14:43:13,421][12883] Updated weights for policy 0, policy_version 154073 (0.0042) [2024-06-18 14:43:16,994][12645] Fps is (10 sec: 42599.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2524463104. Throughput: 0: 42696.1. Samples: 2524538960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:16,994][12645] Avg episode reward: [(0, '0.788')] [2024-06-18 14:43:17,543][12883] Updated weights for policy 0, policy_version 154083 (0.0041) [2024-06-18 14:43:20,857][12883] Updated weights for policy 0, policy_version 154093 (0.0040) [2024-06-18 14:43:21,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 2524692480. Throughput: 0: 42575.1. Samples: 2524797260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:21,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 14:43:25,474][12883] Updated weights for policy 0, policy_version 154103 (0.0035) [2024-06-18 14:43:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.6, 300 sec: 42598.5). Total num frames: 2524889088. Throughput: 0: 42865.5. Samples: 2525061400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:26,994][12645] Avg episode reward: [(0, '0.637')] [2024-06-18 14:43:28,296][12883] Updated weights for policy 0, policy_version 154113 (0.0035) [2024-06-18 14:43:31,996][12645] Fps is (10 sec: 42589.0, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 2525118464. Throughput: 0: 42821.9. Samples: 2525188400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:31,996][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 14:43:33,019][12883] Updated weights for policy 0, policy_version 154123 (0.0031) [2024-06-18 14:43:35,077][12862] Signal inference workers to stop experience collection... (36950 times) [2024-06-18 14:43:35,077][12862] Signal inference workers to resume experience collection... 
(36950 times) [2024-06-18 14:43:35,124][12883] InferenceWorker_p0-w0: stopping experience collection (36950 times) [2024-06-18 14:43:35,124][12883] InferenceWorker_p0-w0: resuming experience collection (36950 times) [2024-06-18 14:43:35,863][12883] Updated weights for policy 0, policy_version 154133 (0.0030) [2024-06-18 14:43:36,997][12645] Fps is (10 sec: 42583.2, 60 sec: 42596.0, 300 sec: 42653.5). Total num frames: 2525315072. Throughput: 0: 42839.9. Samples: 2525446400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:36,998][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 14:43:40,536][12883] Updated weights for policy 0, policy_version 154143 (0.0031) [2024-06-18 14:43:41,994][12645] Fps is (10 sec: 42608.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2525544448. Throughput: 0: 43044.9. Samples: 2525707100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:41,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 14:43:43,662][12883] Updated weights for policy 0, policy_version 154153 (0.0031) [2024-06-18 14:43:46,994][12645] Fps is (10 sec: 44251.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2525757440. Throughput: 0: 43083.1. Samples: 2525836480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:46,994][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 14:43:48,145][12883] Updated weights for policy 0, policy_version 154163 (0.0036) [2024-06-18 14:43:51,846][12883] Updated weights for policy 0, policy_version 154173 (0.0037) [2024-06-18 14:43:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2525970432. Throughput: 0: 43012.1. Samples: 2526091000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:51,994][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 14:43:55,608][12883] Updated weights for policy 0, policy_version 154183 (0.0038) [2024-06-18 14:43:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2526183424. Throughput: 0: 43039.5. Samples: 2526352640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:43:56,994][12645] Avg episode reward: [(0, '0.645')] [2024-06-18 14:43:59,333][12883] Updated weights for policy 0, policy_version 154193 (0.0053) [2024-06-18 14:44:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2526412800. Throughput: 0: 43141.6. Samples: 2526480340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:44:01,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 14:44:03,268][12883] Updated weights for policy 0, policy_version 154203 (0.0036) [2024-06-18 14:44:06,707][12883] Updated weights for policy 0, policy_version 154213 (0.0034) [2024-06-18 14:44:06,994][12645] Fps is (10 sec: 44237.7, 60 sec: 43144.7, 300 sec: 42709.8). Total num frames: 2526625792. Throughput: 0: 43145.5. Samples: 2526738800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:44:06,994][12645] Avg episode reward: [(0, '0.675')] [2024-06-18 14:44:11,087][12883] Updated weights for policy 0, policy_version 154223 (0.0025) [2024-06-18 14:44:11,995][12645] Fps is (10 sec: 42591.9, 60 sec: 43416.5, 300 sec: 42764.8). Total num frames: 2526838784. Throughput: 0: 43081.0. Samples: 2527000120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 14:44:11,996][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 14:44:14,274][12883] Updated weights for policy 0, policy_version 154233 (0.0029) [2024-06-18 14:44:16,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 2527068160. Throughput: 0: 43011.9. Samples: 2527123840. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:44:16,994][12645] Avg episode reward: [(0, '0.567')] [2024-06-18 14:44:18,672][12883] Updated weights for policy 0, policy_version 154243 (0.0027) [2024-06-18 14:44:21,922][12883] Updated weights for policy 0, policy_version 154253 (0.0032) [2024-06-18 14:44:21,994][12645] Fps is (10 sec: 44243.7, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2527281152. Throughput: 0: 43102.4. Samples: 2527385860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:44:21,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 14:44:26,139][12883] Updated weights for policy 0, policy_version 154263 (0.0034) [2024-06-18 14:44:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 43417.4, 300 sec: 42765.0). Total num frames: 2527494144. Throughput: 0: 43127.4. Samples: 2527647840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:44:26,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 14:44:29,502][12883] Updated weights for policy 0, policy_version 154273 (0.0033) [2024-06-18 14:44:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 2527707136. Throughput: 0: 43141.4. Samples: 2527777840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:44:31,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 14:44:33,637][12883] Updated weights for policy 0, policy_version 154283 (0.0031) [2024-06-18 14:44:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43146.9, 300 sec: 42709.5). Total num frames: 2527903744. Throughput: 0: 43202.6. Samples: 2528035120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:44:36,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 14:44:37,094][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154292_2527920128.pth... 
[2024-06-18 14:44:37,157][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153664_2517630976.pth [2024-06-18 14:44:37,301][12883] Updated weights for policy 0, policy_version 154293 (0.0048) [2024-06-18 14:44:41,279][12883] Updated weights for policy 0, policy_version 154303 (0.0035) [2024-06-18 14:44:41,997][12645] Fps is (10 sec: 42582.1, 60 sec: 43141.8, 300 sec: 42820.0). Total num frames: 2528133120. Throughput: 0: 43111.1. Samples: 2528292800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:44:41,998][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 14:44:44,840][12883] Updated weights for policy 0, policy_version 154313 (0.0033) [2024-06-18 14:44:46,994][12645] Fps is (10 sec: 45875.8, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 2528362496. Throughput: 0: 43191.1. Samples: 2528423940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:44:46,998][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 14:44:48,831][12883] Updated weights for policy 0, policy_version 154323 (0.0040) [2024-06-18 14:44:51,994][12645] Fps is (10 sec: 40975.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2528542720. Throughput: 0: 42976.8. Samples: 2528672760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:44:51,994][12645] Avg episode reward: [(0, '0.321')] [2024-06-18 14:44:52,558][12883] Updated weights for policy 0, policy_version 154333 (0.0034) [2024-06-18 14:44:56,616][12883] Updated weights for policy 0, policy_version 154343 (0.0030) [2024-06-18 14:44:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2528755712. Throughput: 0: 42890.8. Samples: 2528930140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:44:56,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 14:45:00,221][12883] Updated weights for policy 0, policy_version 154353 (0.0032) [2024-06-18 14:45:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2528985088. Throughput: 0: 42998.8. Samples: 2529058780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:45:01,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 14:45:04,413][12883] Updated weights for policy 0, policy_version 154363 (0.0032) [2024-06-18 14:45:04,977][12862] Signal inference workers to stop experience collection... (37000 times) [2024-06-18 14:45:05,031][12862] Signal inference workers to resume experience collection... (37000 times) [2024-06-18 14:45:05,031][12883] InferenceWorker_p0-w0: stopping experience collection (37000 times) [2024-06-18 14:45:05,058][12883] InferenceWorker_p0-w0: resuming experience collection (37000 times) [2024-06-18 14:45:06,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 2529198080. Throughput: 0: 42789.7. Samples: 2529311400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:45:06,994][12645] Avg episode reward: [(0, '0.769')] [2024-06-18 14:45:07,993][12883] Updated weights for policy 0, policy_version 154373 (0.0050) [2024-06-18 14:45:11,902][12883] Updated weights for policy 0, policy_version 154383 (0.0023) [2024-06-18 14:45:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42872.6, 300 sec: 42876.1). Total num frames: 2529411072. Throughput: 0: 42747.7. Samples: 2529571480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:45:11,994][12645] Avg episode reward: [(0, '0.683')] [2024-06-18 14:45:15,575][12883] Updated weights for policy 0, policy_version 154393 (0.0024) [2024-06-18 14:45:16,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2529624064. 
Throughput: 0: 42676.9. Samples: 2529698300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 14:45:16,994][12645] Avg episode reward: [(0, '0.730')] [2024-06-18 14:45:19,580][12883] Updated weights for policy 0, policy_version 154403 (0.0030) [2024-06-18 14:45:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2529837056. Throughput: 0: 42636.7. Samples: 2529953760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:45:21,994][12645] Avg episode reward: [(0, '0.586')] [2024-06-18 14:45:23,448][12883] Updated weights for policy 0, policy_version 154413 (0.0044) [2024-06-18 14:45:26,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2530050048. Throughput: 0: 42701.7. Samples: 2530214220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:45:26,995][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 14:45:27,154][12883] Updated weights for policy 0, policy_version 154423 (0.0045) [2024-06-18 14:45:30,822][12883] Updated weights for policy 0, policy_version 154433 (0.0049) [2024-06-18 14:45:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2530263040. Throughput: 0: 42612.6. Samples: 2530341500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:45:31,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 14:45:34,802][12883] Updated weights for policy 0, policy_version 154443 (0.0037) [2024-06-18 14:45:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2530476032. Throughput: 0: 42703.2. Samples: 2530594400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:45:36,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 14:45:38,866][12883] Updated weights for policy 0, policy_version 154453 (0.0041) [2024-06-18 14:45:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42328.1, 300 sec: 42765.0). Total num frames: 2530672640. 
Throughput: 0: 42699.2. Samples: 2530851600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:45:41,994][12645] Avg episode reward: [(0, '0.253')] [2024-06-18 14:45:42,467][12883] Updated weights for policy 0, policy_version 154463 (0.0027) [2024-06-18 14:45:46,547][12883] Updated weights for policy 0, policy_version 154473 (0.0032) [2024-06-18 14:45:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2530902016. Throughput: 0: 42585.3. Samples: 2530975120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:45:46,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 14:45:50,177][12883] Updated weights for policy 0, policy_version 154483 (0.0027) [2024-06-18 14:45:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2531115008. Throughput: 0: 42754.9. Samples: 2531235360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:45:51,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 14:45:54,057][12883] Updated weights for policy 0, policy_version 154493 (0.0027) [2024-06-18 14:45:56,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2531328000. Throughput: 0: 42791.9. Samples: 2531497120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:45:56,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 14:45:57,714][12883] Updated weights for policy 0, policy_version 154503 (0.0030) [2024-06-18 14:46:01,503][12883] Updated weights for policy 0, policy_version 154513 (0.0033) [2024-06-18 14:46:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2531540992. Throughput: 0: 42817.8. Samples: 2531625100. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:46:01,994][12645] Avg episode reward: [(0, '0.260')] [2024-06-18 14:46:05,565][12883] Updated weights for policy 0, policy_version 154523 (0.0034) [2024-06-18 14:46:06,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2531753984. Throughput: 0: 42748.4. Samples: 2531877440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:46:06,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 14:46:09,463][12883] Updated weights for policy 0, policy_version 154533 (0.0033) [2024-06-18 14:46:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2531950592. Throughput: 0: 42748.6. Samples: 2532137900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:46:11,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 14:46:13,153][12883] Updated weights for policy 0, policy_version 154543 (0.0037) [2024-06-18 14:46:16,933][12883] Updated weights for policy 0, policy_version 154553 (0.0032) [2024-06-18 14:46:16,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2532196352. Throughput: 0: 42731.3. Samples: 2532264420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:46:16,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 14:46:20,760][12883] Updated weights for policy 0, policy_version 154563 (0.0040) [2024-06-18 14:46:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2532409344. Throughput: 0: 42851.6. Samples: 2532522720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-18 14:46:21,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 14:46:24,448][12883] Updated weights for policy 0, policy_version 154573 (0.0033) [2024-06-18 14:46:26,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 2532605952. Throughput: 0: 42796.0. Samples: 2532777420. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:46:26,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 14:46:28,442][12883] Updated weights for policy 0, policy_version 154583 (0.0026) [2024-06-18 14:46:30,418][12862] Signal inference workers to stop experience collection... (37050 times) [2024-06-18 14:46:30,471][12883] InferenceWorker_p0-w0: stopping experience collection (37050 times) [2024-06-18 14:46:30,533][12862] Signal inference workers to resume experience collection... (37050 times) [2024-06-18 14:46:30,534][12883] InferenceWorker_p0-w0: resuming experience collection (37050 times) [2024-06-18 14:46:31,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42869.8, 300 sec: 42820.2). Total num frames: 2532835328. Throughput: 0: 43000.9. Samples: 2532910260. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:46:31,996][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 14:46:32,345][12883] Updated weights for policy 0, policy_version 154593 (0.0031) [2024-06-18 14:46:36,123][12883] Updated weights for policy 0, policy_version 154603 (0.0026) [2024-06-18 14:46:36,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2533048320. Throughput: 0: 42895.4. Samples: 2533165660. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:46:36,996][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 14:46:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154605_2533048320.pth... [2024-06-18 14:46:37,063][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000153978_2522775552.pth [2024-06-18 14:46:39,963][12883] Updated weights for policy 0, policy_version 154613 (0.0031) [2024-06-18 14:46:41,996][12645] Fps is (10 sec: 44236.9, 60 sec: 43415.9, 300 sec: 42931.6). Total num frames: 2533277696. Throughput: 0: 42770.8. Samples: 2533421900. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:46:41,996][12645] Avg episode reward: [(0, '0.465')] [2024-06-18 14:46:43,675][12883] Updated weights for policy 0, policy_version 154623 (0.0033) [2024-06-18 14:46:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2533474304. Throughput: 0: 42795.1. Samples: 2533550880. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:46:46,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 14:46:47,615][12883] Updated weights for policy 0, policy_version 154633 (0.0049) [2024-06-18 14:46:51,346][12883] Updated weights for policy 0, policy_version 154643 (0.0031) [2024-06-18 14:46:51,997][12645] Fps is (10 sec: 40956.8, 60 sec: 42869.3, 300 sec: 42875.7). Total num frames: 2533687296. Throughput: 0: 42821.1. Samples: 2533804520. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:46:51,997][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 14:46:55,506][12883] Updated weights for policy 0, policy_version 154653 (0.0035) [2024-06-18 14:46:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2533900288. Throughput: 0: 42662.7. Samples: 2534057720. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:46:56,994][12645] Avg episode reward: [(0, '0.729')] [2024-06-18 14:46:58,994][12883] Updated weights for policy 0, policy_version 154663 (0.0037) [2024-06-18 14:47:01,994][12645] Fps is (10 sec: 42611.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2534113280. Throughput: 0: 42718.9. Samples: 2534186760. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:47:01,994][12645] Avg episode reward: [(0, '0.751')] [2024-06-18 14:47:03,147][12883] Updated weights for policy 0, policy_version 154673 (0.0028) [2024-06-18 14:47:06,606][12883] Updated weights for policy 0, policy_version 154683 (0.0044) [2024-06-18 14:47:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 2534342656. Throughput: 0: 42589.2. Samples: 2534439240. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:47:06,994][12645] Avg episode reward: [(0, '0.779')] [2024-06-18 14:47:10,805][12883] Updated weights for policy 0, policy_version 154693 (0.0034) [2024-06-18 14:47:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2534539264. Throughput: 0: 42604.3. Samples: 2534694620. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:47:11,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 14:47:14,224][12883] Updated weights for policy 0, policy_version 154703 (0.0038) [2024-06-18 14:47:16,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2534752256. Throughput: 0: 42496.6. Samples: 2534822520. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:47:16,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 14:47:18,291][12883] Updated weights for policy 0, policy_version 154713 (0.0024) [2024-06-18 14:47:21,993][12883] Updated weights for policy 0, policy_version 154723 (0.0039) [2024-06-18 14:47:21,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2534981632. Throughput: 0: 42590.8. Samples: 2535082240. 
Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:47:21,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 14:47:26,121][12883] Updated weights for policy 0, policy_version 154733 (0.0024) [2024-06-18 14:47:26,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2535161856. Throughput: 0: 42443.0. Samples: 2535331740. Policy #0 lag: (min: 1.0, avg: 12.6, max: 24.0) [2024-06-18 14:47:26,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 14:47:29,889][12883] Updated weights for policy 0, policy_version 154743 (0.0041) [2024-06-18 14:47:31,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42598.4, 300 sec: 42820.2). Total num frames: 2535391232. Throughput: 0: 42336.0. Samples: 2535456100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:47:31,996][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 14:47:34,120][12883] Updated weights for policy 0, policy_version 154753 (0.0032) [2024-06-18 14:47:36,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2535604224. Throughput: 0: 42377.0. Samples: 2535711360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:47:36,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 14:47:37,487][12883] Updated weights for policy 0, policy_version 154763 (0.0033) [2024-06-18 14:47:41,801][12883] Updated weights for policy 0, policy_version 154773 (0.0040) [2024-06-18 14:47:41,994][12645] Fps is (10 sec: 40969.6, 60 sec: 42053.9, 300 sec: 42765.0). Total num frames: 2535800832. Throughput: 0: 42529.4. Samples: 2535971540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:47:41,994][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 14:47:44,964][12883] Updated weights for policy 0, policy_version 154783 (0.0041) [2024-06-18 14:47:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2536030208. Throughput: 0: 42474.5. Samples: 2536098120. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:47:46,994][12645] Avg episode reward: [(0, '0.306')] [2024-06-18 14:47:49,573][12883] Updated weights for policy 0, policy_version 154793 (0.0036) [2024-06-18 14:47:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42600.5, 300 sec: 42876.1). Total num frames: 2536243200. Throughput: 0: 42643.2. Samples: 2536358180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:47:51,998][12645] Avg episode reward: [(0, '0.338')] [2024-06-18 14:47:52,494][12883] Updated weights for policy 0, policy_version 154803 (0.0042) [2024-06-18 14:47:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2536423424. Throughput: 0: 42724.2. Samples: 2536617200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:47:56,994][12645] Avg episode reward: [(0, '0.662')] [2024-06-18 14:47:57,151][12883] Updated weights for policy 0, policy_version 154813 (0.0031) [2024-06-18 14:48:00,254][12883] Updated weights for policy 0, policy_version 154823 (0.0037) [2024-06-18 14:48:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2536685568. Throughput: 0: 42629.6. Samples: 2536740840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:48:01,994][12645] Avg episode reward: [(0, '0.412')] [2024-06-18 14:48:04,702][12883] Updated weights for policy 0, policy_version 154833 (0.0034) [2024-06-18 14:48:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 2536865792. Throughput: 0: 42576.0. Samples: 2536998160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:48:06,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 14:48:08,418][12883] Updated weights for policy 0, policy_version 154843 (0.0026) [2024-06-18 14:48:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2537078784. Throughput: 0: 42700.4. Samples: 2537253260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:48:11,994][12645] Avg episode reward: [(0, '0.721')] [2024-06-18 14:48:12,271][12862] Signal inference workers to stop experience collection... (37100 times) [2024-06-18 14:48:12,320][12883] InferenceWorker_p0-w0: stopping experience collection (37100 times) [2024-06-18 14:48:12,329][12862] Signal inference workers to resume experience collection... (37100 times) [2024-06-18 14:48:12,340][12883] InferenceWorker_p0-w0: resuming experience collection (37100 times) [2024-06-18 14:48:12,472][12883] Updated weights for policy 0, policy_version 154853 (0.0036) [2024-06-18 14:48:15,887][12883] Updated weights for policy 0, policy_version 154863 (0.0031) [2024-06-18 14:48:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2537324544. Throughput: 0: 42921.7. Samples: 2537387480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:48:16,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 14:48:20,424][12883] Updated weights for policy 0, policy_version 154873 (0.0028) [2024-06-18 14:48:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2537521152. Throughput: 0: 43038.2. Samples: 2537648080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:48:22,000][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 14:48:23,518][12883] Updated weights for policy 0, policy_version 154883 (0.0030) [2024-06-18 14:48:26,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42709.8). Total num frames: 2537717760. Throughput: 0: 42915.4. Samples: 2537902740. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 14:48:26,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 14:48:28,131][12883] Updated weights for policy 0, policy_version 154893 (0.0037) [2024-06-18 14:48:31,060][12883] Updated weights for policy 0, policy_version 154903 (0.0033) [2024-06-18 14:48:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42873.1, 300 sec: 42876.6). Total num frames: 2537963520. Throughput: 0: 42952.6. Samples: 2538030980. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:48:31,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 14:48:35,596][12883] Updated weights for policy 0, policy_version 154913 (0.0033) [2024-06-18 14:48:36,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2538176512. Throughput: 0: 43025.4. Samples: 2538294320. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:48:36,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 14:48:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154919_2538192896.pth... [2024-06-18 14:48:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154292_2527920128.pth [2024-06-18 14:48:38,610][12883] Updated weights for policy 0, policy_version 154923 (0.0029) [2024-06-18 14:48:41,994][12645] Fps is (10 sec: 40956.5, 60 sec: 42870.8, 300 sec: 42764.9). Total num frames: 2538373120. Throughput: 0: 42779.6. Samples: 2538542320. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:48:41,995][12645] Avg episode reward: [(0, '0.401')] [2024-06-18 14:48:43,025][12883] Updated weights for policy 0, policy_version 154933 (0.0024) [2024-06-18 14:48:46,386][12883] Updated weights for policy 0, policy_version 154943 (0.0034) [2024-06-18 14:48:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2538602496. Throughput: 0: 42832.0. Samples: 2538668280. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:48:46,994][12645] Avg episode reward: [(0, '0.438')] [2024-06-18 14:48:51,039][12883] Updated weights for policy 0, policy_version 154953 (0.0041) [2024-06-18 14:48:51,994][12645] Fps is (10 sec: 44240.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2538815488. Throughput: 0: 42976.5. Samples: 2538932100. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:48:51,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 14:48:54,195][12883] Updated weights for policy 0, policy_version 154963 (0.0036) [2024-06-18 14:48:56,997][12645] Fps is (10 sec: 42583.1, 60 sec: 43415.0, 300 sec: 42764.5). Total num frames: 2539028480. Throughput: 0: 42684.2. Samples: 2539174200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:48:56,998][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 14:48:58,648][12883] Updated weights for policy 0, policy_version 154973 (0.0034) [2024-06-18 14:49:01,808][12883] Updated weights for policy 0, policy_version 154983 (0.0034) [2024-06-18 14:49:01,995][12645] Fps is (10 sec: 42592.0, 60 sec: 42597.3, 300 sec: 42764.8). Total num frames: 2539241472. Throughput: 0: 42608.0. Samples: 2539304900. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:49:01,996][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 14:49:06,096][12883] Updated weights for policy 0, policy_version 154993 (0.0033) [2024-06-18 14:49:06,994][12645] Fps is (10 sec: 42613.5, 60 sec: 43144.6, 300 sec: 42765.3). Total num frames: 2539454464. Throughput: 0: 42691.6. Samples: 2539569200. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:49:06,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 14:49:09,492][12883] Updated weights for policy 0, policy_version 155003 (0.0042) [2024-06-18 14:49:11,994][12645] Fps is (10 sec: 44243.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2539683840. Throughput: 0: 42509.0. Samples: 2539815640. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:49:11,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 14:49:13,640][12883] Updated weights for policy 0, policy_version 155013 (0.0037) [2024-06-18 14:49:15,318][12862] Signal inference workers to stop experience collection... (37150 times) [2024-06-18 14:49:15,319][12862] Signal inference workers to resume experience collection... (37150 times) [2024-06-18 14:49:15,342][12883] InferenceWorker_p0-w0: stopping experience collection (37150 times) [2024-06-18 14:49:15,342][12883] InferenceWorker_p0-w0: resuming experience collection (37150 times) [2024-06-18 14:49:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2539880448. Throughput: 0: 42684.4. Samples: 2539951780. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:49:16,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 14:49:17,271][12883] Updated weights for policy 0, policy_version 155023 (0.0030) [2024-06-18 14:49:21,247][12883] Updated weights for policy 0, policy_version 155033 (0.0027) [2024-06-18 14:49:21,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2540077056. Throughput: 0: 42733.8. Samples: 2540217440. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:49:21,996][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 14:49:24,750][12883] Updated weights for policy 0, policy_version 155043 (0.0032) [2024-06-18 14:49:26,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43690.7, 300 sec: 42820.6). Total num frames: 2540339200. Throughput: 0: 42650.9. Samples: 2540461580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:49:26,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 14:49:28,937][12883] Updated weights for policy 0, policy_version 155053 (0.0036) [2024-06-18 14:49:31,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 2540519424. Throughput: 0: 42869.8. 
Samples: 2540597420. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-18 14:49:31,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 14:49:32,304][12883] Updated weights for policy 0, policy_version 155063 (0.0044) [2024-06-18 14:49:36,427][12883] Updated weights for policy 0, policy_version 155073 (0.0037) [2024-06-18 14:49:36,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42654.5). Total num frames: 2540716032. Throughput: 0: 42670.6. Samples: 2540852280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:49:36,994][12645] Avg episode reward: [(0, '0.686')] [2024-06-18 14:49:39,788][12883] Updated weights for policy 0, policy_version 155083 (0.0037) [2024-06-18 14:49:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43145.1, 300 sec: 42709.5). Total num frames: 2540961792. Throughput: 0: 42913.6. Samples: 2541105160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:49:41,994][12645] Avg episode reward: [(0, '0.658')] [2024-06-18 14:49:44,007][12883] Updated weights for policy 0, policy_version 155093 (0.0047) [2024-06-18 14:49:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2541174784. Throughput: 0: 42938.2. Samples: 2541237060. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:49:46,994][12645] Avg episode reward: [(0, '0.687')] [2024-06-18 14:49:48,243][12883] Updated weights for policy 0, policy_version 155103 (0.0036) [2024-06-18 14:49:51,943][12883] Updated weights for policy 0, policy_version 155113 (0.0037) [2024-06-18 14:49:51,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.8, 300 sec: 42764.7). Total num frames: 2541371392. Throughput: 0: 42605.4. Samples: 2541486540. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:49:51,996][12645] Avg episode reward: [(0, '0.652')] [2024-06-18 14:49:55,774][12883] Updated weights for policy 0, policy_version 155123 (0.0034) [2024-06-18 14:49:56,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42600.9, 300 sec: 42709.5). Total num frames: 2541584384. Throughput: 0: 42918.7. Samples: 2541746980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:49:56,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 14:49:59,750][12883] Updated weights for policy 0, policy_version 155133 (0.0033) [2024-06-18 14:50:01,996][12645] Fps is (10 sec: 44236.9, 60 sec: 42870.9, 300 sec: 42764.7). Total num frames: 2541813760. Throughput: 0: 42831.6. Samples: 2541879300. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:50:01,996][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 14:50:03,369][12883] Updated weights for policy 0, policy_version 155143 (0.0045) [2024-06-18 14:50:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2541993984. Throughput: 0: 42396.7. Samples: 2542125200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:50:06,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 14:50:07,464][12883] Updated weights for policy 0, policy_version 155153 (0.0028) [2024-06-18 14:50:10,966][12883] Updated weights for policy 0, policy_version 155163 (0.0032) [2024-06-18 14:50:11,996][12645] Fps is (10 sec: 40959.8, 60 sec: 42323.7, 300 sec: 42709.1). Total num frames: 2542223360. Throughput: 0: 42696.1. Samples: 2542383000. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:50:11,996][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 14:50:15,155][12883] Updated weights for policy 0, policy_version 155173 (0.0036) [2024-06-18 14:50:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2542436352. Throughput: 0: 42539.9. Samples: 2542511720. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:50:16,995][12645] Avg episode reward: [(0, '0.320')] [2024-06-18 14:50:18,989][12883] Updated weights for policy 0, policy_version 155183 (0.0035) [2024-06-18 14:50:21,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2542649344. Throughput: 0: 42466.2. Samples: 2542763260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:50:21,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 14:50:22,691][12883] Updated weights for policy 0, policy_version 155193 (0.0037) [2024-06-18 14:50:26,734][12883] Updated weights for policy 0, policy_version 155203 (0.0042) [2024-06-18 14:50:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2542862336. Throughput: 0: 42518.1. Samples: 2543018480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:50:26,994][12645] Avg episode reward: [(0, '0.688')] [2024-06-18 14:50:30,466][12883] Updated weights for policy 0, policy_version 155213 (0.0031) [2024-06-18 14:50:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2543058944. Throughput: 0: 42357.9. Samples: 2543143160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:50:31,994][12645] Avg episode reward: [(0, '0.716')] [2024-06-18 14:50:34,429][12883] Updated weights for policy 0, policy_version 155223 (0.0043) [2024-06-18 14:50:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 2543304704. Throughput: 0: 42440.8. Samples: 2543396280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 14:50:36,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 14:50:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155231_2543304704.pth... 
[2024-06-18 14:50:37,057][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154605_2533048320.pth [2024-06-18 14:50:38,108][12883] Updated weights for policy 0, policy_version 155233 (0.0041) [2024-06-18 14:50:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2543468544. Throughput: 0: 42299.1. Samples: 2543650440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:50:41,994][12645] Avg episode reward: [(0, '0.685')] [2024-06-18 14:50:42,339][12883] Updated weights for policy 0, policy_version 155243 (0.0049) [2024-06-18 14:50:45,793][12883] Updated weights for policy 0, policy_version 155253 (0.0033) [2024-06-18 14:50:46,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 2543697920. Throughput: 0: 42058.5. Samples: 2543771840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:50:46,994][12645] Avg episode reward: [(0, '0.702')] [2024-06-18 14:50:49,914][12883] Updated weights for policy 0, policy_version 155263 (0.0045) [2024-06-18 14:50:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2543927296. Throughput: 0: 42417.4. Samples: 2544033980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:50:51,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 14:50:53,611][12883] Updated weights for policy 0, policy_version 155273 (0.0026) [2024-06-18 14:50:56,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42050.7, 300 sec: 42598.1). Total num frames: 2544107520. Throughput: 0: 42483.6. Samples: 2544294760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:50:56,996][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 14:50:57,349][12862] Signal inference workers to stop experience collection... 
(37200 times) [2024-06-18 14:50:57,377][12883] InferenceWorker_p0-w0: stopping experience collection (37200 times) [2024-06-18 14:50:57,398][12862] Signal inference workers to resume experience collection... (37200 times) [2024-06-18 14:50:57,399][12883] InferenceWorker_p0-w0: resuming experience collection (37200 times) [2024-06-18 14:50:57,550][12883] Updated weights for policy 0, policy_version 155283 (0.0037) [2024-06-18 14:51:01,278][12883] Updated weights for policy 0, policy_version 155293 (0.0033) [2024-06-18 14:51:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42326.9, 300 sec: 42709.5). Total num frames: 2544353280. Throughput: 0: 42245.4. Samples: 2544412760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:51:01,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 14:51:05,105][12883] Updated weights for policy 0, policy_version 155303 (0.0028) [2024-06-18 14:51:06,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2544566272. Throughput: 0: 42503.1. Samples: 2544675900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:51:06,994][12645] Avg episode reward: [(0, '0.678')] [2024-06-18 14:51:08,730][12883] Updated weights for policy 0, policy_version 155313 (0.0036) [2024-06-18 14:51:11,994][12645] Fps is (10 sec: 37682.8, 60 sec: 41780.7, 300 sec: 42487.3). Total num frames: 2544730112. Throughput: 0: 42381.3. Samples: 2544925640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:51:11,995][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 14:51:12,923][12883] Updated weights for policy 0, policy_version 155323 (0.0043) [2024-06-18 14:51:16,412][12883] Updated weights for policy 0, policy_version 155333 (0.0047) [2024-06-18 14:51:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2544992256. Throughput: 0: 42386.1. Samples: 2545050540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:51:16,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 14:51:20,755][12883] Updated weights for policy 0, policy_version 155343 (0.0039) [2024-06-18 14:51:21,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2545172480. Throughput: 0: 42427.1. Samples: 2545305500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:51:21,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 14:51:24,189][12883] Updated weights for policy 0, policy_version 155353 (0.0032) [2024-06-18 14:51:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42543.2). Total num frames: 2545385472. Throughput: 0: 42447.0. Samples: 2545560560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:51:26,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 14:51:28,735][12883] Updated weights for policy 0, policy_version 155363 (0.0033) [2024-06-18 14:51:31,693][12883] Updated weights for policy 0, policy_version 155373 (0.0036) [2024-06-18 14:51:31,994][12645] Fps is (10 sec: 45872.4, 60 sec: 42871.1, 300 sec: 42653.9). Total num frames: 2545631232. Throughput: 0: 42653.3. Samples: 2545691260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:51:31,995][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 14:51:36,535][12883] Updated weights for policy 0, policy_version 155383 (0.0036) [2024-06-18 14:51:36,994][12645] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42487.6). Total num frames: 2545811456. Throughput: 0: 42538.2. Samples: 2545948200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:51:36,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 14:51:39,304][12883] Updated weights for policy 0, policy_version 155393 (0.0036) [2024-06-18 14:51:41,994][12645] Fps is (10 sec: 40962.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2546040832. Throughput: 0: 42308.4. Samples: 2546198540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 14:51:41,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 14:51:44,104][12883] Updated weights for policy 0, policy_version 155403 (0.0024) [2024-06-18 14:51:46,793][12883] Updated weights for policy 0, policy_version 155413 (0.0040) [2024-06-18 14:51:46,994][12645] Fps is (10 sec: 47513.8, 60 sec: 43144.6, 300 sec: 42709.9). Total num frames: 2546286592. Throughput: 0: 42590.3. Samples: 2546329320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:51:47,000][12645] Avg episode reward: [(0, '0.699')] [2024-06-18 14:51:51,805][12883] Updated weights for policy 0, policy_version 155423 (0.0039) [2024-06-18 14:51:51,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2546450432. Throughput: 0: 42550.7. Samples: 2546590680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:51:51,994][12645] Avg episode reward: [(0, '0.696')] [2024-06-18 14:51:54,377][12883] Updated weights for policy 0, policy_version 155433 (0.0037) [2024-06-18 14:51:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 2546696192. Throughput: 0: 42510.4. Samples: 2546838600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:51:56,994][12645] Avg episode reward: [(0, '0.452')] [2024-06-18 14:51:59,478][12883] Updated weights for policy 0, policy_version 155443 (0.0030) [2024-06-18 14:52:01,994][12645] Fps is (10 sec: 47512.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2546925568. Throughput: 0: 42672.3. Samples: 2546970800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:52:01,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 14:52:02,434][12883] Updated weights for policy 0, policy_version 155453 (0.0033) [2024-06-18 14:52:06,996][12645] Fps is (10 sec: 39312.6, 60 sec: 42050.7, 300 sec: 42542.6). Total num frames: 2547089408. Throughput: 0: 42752.9. Samples: 2547229480. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:52:06,996][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 14:52:07,190][12883] Updated weights for policy 0, policy_version 155463 (0.0029) [2024-06-18 14:52:09,981][12883] Updated weights for policy 0, policy_version 155473 (0.0031) [2024-06-18 14:52:11,994][12645] Fps is (10 sec: 39322.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2547318784. Throughput: 0: 42600.6. Samples: 2547477580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:52:11,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 14:52:14,891][12883] Updated weights for policy 0, policy_version 155483 (0.0041) [2024-06-18 14:52:16,742][12862] Signal inference workers to stop experience collection... (37250 times) [2024-06-18 14:52:16,743][12862] Signal inference workers to resume experience collection... (37250 times) [2024-06-18 14:52:16,786][12883] InferenceWorker_p0-w0: stopping experience collection (37250 times) [2024-06-18 14:52:16,786][12883] InferenceWorker_p0-w0: resuming experience collection (37250 times) [2024-06-18 14:52:16,994][12645] Fps is (10 sec: 47524.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2547564544. Throughput: 0: 42712.0. Samples: 2547613280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:52:16,994][12645] Avg episode reward: [(0, '0.564')] [2024-06-18 14:52:17,542][12883] Updated weights for policy 0, policy_version 155493 (0.0036) [2024-06-18 14:52:21,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2547728384. Throughput: 0: 42669.3. Samples: 2547868320. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:52:21,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 14:52:22,437][12883] Updated weights for policy 0, policy_version 155503 (0.0026) [2024-06-18 14:52:25,477][12883] Updated weights for policy 0, policy_version 155513 (0.0032) [2024-06-18 14:52:27,000][12645] Fps is (10 sec: 40934.3, 60 sec: 43140.1, 300 sec: 42653.4). Total num frames: 2547974144. Throughput: 0: 42428.2. Samples: 2548108080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:52:27,001][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 14:52:29,959][12883] Updated weights for policy 0, policy_version 155523 (0.0034) [2024-06-18 14:52:31,994][12645] Fps is (10 sec: 47513.9, 60 sec: 42871.9, 300 sec: 42709.5). Total num frames: 2548203520. Throughput: 0: 42670.3. Samples: 2548249480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:52:31,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 14:52:33,170][12883] Updated weights for policy 0, policy_version 155533 (0.0027) [2024-06-18 14:52:36,994][12645] Fps is (10 sec: 40985.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2548383744. Throughput: 0: 42679.5. Samples: 2548511260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:52:36,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 14:52:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155541_2548383744.pth... [2024-06-18 14:52:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000154919_2538192896.pth [2024-06-18 14:52:37,623][12883] Updated weights for policy 0, policy_version 155543 (0.0043) [2024-06-18 14:52:40,922][12883] Updated weights for policy 0, policy_version 155553 (0.0031) [2024-06-18 14:52:41,994][12645] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2548613120. Throughput: 0: 42567.8. Samples: 2548754160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 14:52:41,995][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 14:52:45,273][12883] Updated weights for policy 0, policy_version 155563 (0.0035) [2024-06-18 14:52:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2548826112. Throughput: 0: 42645.5. Samples: 2548889840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:52:46,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 14:52:48,746][12883] Updated weights for policy 0, policy_version 155573 (0.0037) [2024-06-18 14:52:51,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2549006336. Throughput: 0: 42614.1. Samples: 2549147020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:52:51,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 14:52:52,803][12883] Updated weights for policy 0, policy_version 155583 (0.0039) [2024-06-18 14:52:56,529][12883] Updated weights for policy 0, policy_version 155593 (0.0037) [2024-06-18 14:52:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2549252096. Throughput: 0: 42538.7. Samples: 2549391820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:52:56,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 14:53:00,482][12883] Updated weights for policy 0, policy_version 155603 (0.0032) [2024-06-18 14:53:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 2549465088. Throughput: 0: 42530.3. Samples: 2549527140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:01,994][12645] Avg episode reward: [(0, '0.450')] [2024-06-18 14:53:04,414][12883] Updated weights for policy 0, policy_version 155613 (0.0030) [2024-06-18 14:53:06,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2549645312. Throughput: 0: 42485.8. Samples: 2549780180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:06,994][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 14:53:08,532][12883] Updated weights for policy 0, policy_version 155623 (0.0036) [2024-06-18 14:53:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2549874688. Throughput: 0: 42603.3. Samples: 2550024960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:11,994][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 14:53:12,031][12883] Updated weights for policy 0, policy_version 155633 (0.0045) [2024-06-18 14:53:16,394][12883] Updated weights for policy 0, policy_version 155643 (0.0046) [2024-06-18 14:53:16,996][12645] Fps is (10 sec: 45865.0, 60 sec: 42323.8, 300 sec: 42653.6). Total num frames: 2550104064. Throughput: 0: 42420.5. Samples: 2550158500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:16,996][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 14:53:19,574][12883] Updated weights for policy 0, policy_version 155653 (0.0030) [2024-06-18 14:53:21,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2550267904. Throughput: 0: 42180.0. Samples: 2550409360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:21,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 14:53:24,047][12883] Updated weights for policy 0, policy_version 155663 (0.0025) [2024-06-18 14:53:26,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42602.9, 300 sec: 42598.4). Total num frames: 2550530048. Throughput: 0: 42489.5. Samples: 2550666180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:26,994][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 14:53:27,122][12883] Updated weights for policy 0, policy_version 155673 (0.0031) [2024-06-18 14:53:31,652][12883] Updated weights for policy 0, policy_version 155683 (0.0040) [2024-06-18 14:53:31,658][12862] Signal inference workers to stop experience collection... (37300 times) [2024-06-18 14:53:31,659][12862] Signal inference workers to resume experience collection... (37300 times) [2024-06-18 14:53:31,675][12883] InferenceWorker_p0-w0: stopping experience collection (37300 times) [2024-06-18 14:53:31,675][12883] InferenceWorker_p0-w0: resuming experience collection (37300 times) [2024-06-18 14:53:31,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2550743040. Throughput: 0: 42498.2. Samples: 2550802260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:31,994][12645] Avg episode reward: [(0, '0.724')] [2024-06-18 14:53:34,783][12883] Updated weights for policy 0, policy_version 155693 (0.0032) [2024-06-18 14:53:36,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42543.0). Total num frames: 2550923264. Throughput: 0: 42241.3. Samples: 2551047880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:36,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 14:53:39,224][12883] Updated weights for policy 0, policy_version 155703 (0.0042) [2024-06-18 14:53:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2551136256. Throughput: 0: 42463.4. Samples: 2551302680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:41,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 14:53:42,718][12883] Updated weights for policy 0, policy_version 155713 (0.0049) [2024-06-18 14:53:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2551365632. 
Throughput: 0: 42314.3. Samples: 2551431280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 14:53:46,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 14:53:46,995][12883] Updated weights for policy 0, policy_version 155723 (0.0033) [2024-06-18 14:53:50,370][12883] Updated weights for policy 0, policy_version 155733 (0.0027) [2024-06-18 14:53:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.8). Total num frames: 2551562240. Throughput: 0: 42317.8. Samples: 2551684480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:53:51,994][12645] Avg episode reward: [(0, '0.371')] [2024-06-18 14:53:54,627][12883] Updated weights for policy 0, policy_version 155743 (0.0043) [2024-06-18 14:53:56,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42543.1). Total num frames: 2551791616. Throughput: 0: 42713.3. Samples: 2551947060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:53:56,994][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 14:53:58,054][12883] Updated weights for policy 0, policy_version 155753 (0.0040) [2024-06-18 14:54:01,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2551988224. Throughput: 0: 42680.0. Samples: 2552079000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:01,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 14:54:02,235][12883] Updated weights for policy 0, policy_version 155763 (0.0037) [2024-06-18 14:54:05,775][12883] Updated weights for policy 0, policy_version 155773 (0.0025) [2024-06-18 14:54:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2552217600. Throughput: 0: 42686.3. Samples: 2552330240. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:06,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 14:54:09,914][12883] Updated weights for policy 0, policy_version 155783 (0.0053) [2024-06-18 14:54:11,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2552430592. Throughput: 0: 42653.7. Samples: 2552585600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:11,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 14:54:13,663][12883] Updated weights for policy 0, policy_version 155793 (0.0023) [2024-06-18 14:54:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42053.9, 300 sec: 42543.2). Total num frames: 2552627200. Throughput: 0: 42414.3. Samples: 2552710900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:16,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 14:54:17,627][12883] Updated weights for policy 0, policy_version 155803 (0.0022) [2024-06-18 14:54:21,272][12883] Updated weights for policy 0, policy_version 155813 (0.0032) [2024-06-18 14:54:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 2552856576. Throughput: 0: 42603.5. Samples: 2552965040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:21,994][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 14:54:25,103][12883] Updated weights for policy 0, policy_version 155823 (0.0029) [2024-06-18 14:54:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2553069568. Throughput: 0: 42720.5. Samples: 2553225100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:26,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 14:54:28,855][12883] Updated weights for policy 0, policy_version 155833 (0.0044) [2024-06-18 14:54:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 2553249792. Throughput: 0: 42701.6. Samples: 2553352860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:31,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 14:54:32,946][12883] Updated weights for policy 0, policy_version 155843 (0.0031) [2024-06-18 14:54:36,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 2553479168. Throughput: 0: 42620.6. Samples: 2553602400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:36,994][12645] Avg episode reward: [(0, '0.702')] [2024-06-18 14:54:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155853_2553495552.pth... [2024-06-18 14:54:37,022][12883] Updated weights for policy 0, policy_version 155853 (0.0033) [2024-06-18 14:54:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155231_2543304704.pth [2024-06-18 14:54:40,946][12883] Updated weights for policy 0, policy_version 155863 (0.0027) [2024-06-18 14:54:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2553708544. Throughput: 0: 42371.9. Samples: 2553853800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:41,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 14:54:45,024][12883] Updated weights for policy 0, policy_version 155873 (0.0046) [2024-06-18 14:54:46,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42487.6). Total num frames: 2553905152. Throughput: 0: 42372.7. Samples: 2553985780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:46,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 14:54:48,544][12883] Updated weights for policy 0, policy_version 155883 (0.0031) [2024-06-18 14:54:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2554134528. Throughput: 0: 42556.3. Samples: 2554245280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 14:54:51,994][12645] Avg episode reward: [(0, '0.608')] [2024-06-18 14:54:52,658][12883] Updated weights for policy 0, policy_version 155893 (0.0039) [2024-06-18 14:54:56,009][12883] Updated weights for policy 0, policy_version 155903 (0.0033) [2024-06-18 14:54:56,618][12862] Signal inference workers to stop experience collection... (37350 times) [2024-06-18 14:54:56,618][12862] Signal inference workers to resume experience collection... (37350 times) [2024-06-18 14:54:56,662][12883] InferenceWorker_p0-w0: stopping experience collection (37350 times) [2024-06-18 14:54:56,662][12883] InferenceWorker_p0-w0: resuming experience collection (37350 times) [2024-06-18 14:54:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42487.6). Total num frames: 2554347520. Throughput: 0: 42507.1. Samples: 2554498420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:54:56,999][12645] Avg episode reward: [(0, '0.559')] [2024-06-18 14:55:00,067][12883] Updated weights for policy 0, policy_version 155913 (0.0036) [2024-06-18 14:55:02,000][12645] Fps is (10 sec: 40935.7, 60 sec: 42594.1, 300 sec: 42542.0). Total num frames: 2554544128. Throughput: 0: 42523.2. Samples: 2554624700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:02,000][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 14:55:04,195][12883] Updated weights for policy 0, policy_version 155923 (0.0043) [2024-06-18 14:55:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2554757120. Throughput: 0: 42486.7. Samples: 2554876940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:06,994][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 14:55:07,747][12883] Updated weights for policy 0, policy_version 155933 (0.0025) [2024-06-18 14:55:11,958][12883] Updated weights for policy 0, policy_version 155943 (0.0042) [2024-06-18 14:55:11,994][12645] Fps is (10 sec: 42624.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2554970112. Throughput: 0: 42336.1. Samples: 2555130220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:11,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 14:55:15,331][12883] Updated weights for policy 0, policy_version 155953 (0.0029) [2024-06-18 14:55:16,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2555183104. Throughput: 0: 42381.7. Samples: 2555260040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:16,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 14:55:19,381][12883] Updated weights for policy 0, policy_version 155963 (0.0037) [2024-06-18 14:55:21,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2555379712. Throughput: 0: 42439.8. Samples: 2555512200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:21,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 14:55:23,214][12883] Updated weights for policy 0, policy_version 155973 (0.0031) [2024-06-18 14:55:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2555609088. Throughput: 0: 42576.0. Samples: 2555769720. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:26,994][12645] Avg episode reward: [(0, '0.543')] [2024-06-18 14:55:27,237][12883] Updated weights for policy 0, policy_version 155983 (0.0032) [2024-06-18 14:55:31,038][12883] Updated weights for policy 0, policy_version 155993 (0.0024) [2024-06-18 14:55:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2555822080. Throughput: 0: 42436.1. Samples: 2555895400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:31,994][12645] Avg episode reward: [(0, '0.275')] [2024-06-18 14:55:35,013][12883] Updated weights for policy 0, policy_version 156003 (0.0027) [2024-06-18 14:55:36,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42323.7, 300 sec: 42542.5). Total num frames: 2556018688. Throughput: 0: 42309.1. Samples: 2556149280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:36,996][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 14:55:38,719][12883] Updated weights for policy 0, policy_version 156013 (0.0029) [2024-06-18 14:55:41,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2556248064. Throughput: 0: 42284.6. Samples: 2556401320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:41,996][12645] Avg episode reward: [(0, '0.607')] [2024-06-18 14:55:42,805][12883] Updated weights for policy 0, policy_version 156023 (0.0036) [2024-06-18 14:55:46,324][12883] Updated weights for policy 0, policy_version 156033 (0.0039) [2024-06-18 14:55:46,994][12645] Fps is (10 sec: 44246.6, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2556461056. Throughput: 0: 42366.1. Samples: 2556530920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:46,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 14:55:50,445][12883] Updated weights for policy 0, policy_version 156043 (0.0032) [2024-06-18 14:55:51,994][12645] Fps is (10 sec: 40968.9, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 2556657664. Throughput: 0: 42399.1. Samples: 2556784900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:51,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 14:55:53,954][12883] Updated weights for policy 0, policy_version 156053 (0.0035) [2024-06-18 14:55:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2556887040. Throughput: 0: 42511.1. Samples: 2557043220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 14:55:56,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 14:55:58,086][12883] Updated weights for policy 0, policy_version 156063 (0.0030) [2024-06-18 14:56:01,674][12883] Updated weights for policy 0, policy_version 156073 (0.0047) [2024-06-18 14:56:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42875.7, 300 sec: 42542.9). Total num frames: 2557116416. Throughput: 0: 42473.9. Samples: 2557171360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:01,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 14:56:06,074][12883] Updated weights for policy 0, policy_version 156083 (0.0030) [2024-06-18 14:56:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2557296640. Throughput: 0: 42602.3. Samples: 2557429300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:06,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 14:56:09,244][12883] Updated weights for policy 0, policy_version 156093 (0.0033) [2024-06-18 14:56:11,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42869.8, 300 sec: 42542.6). Total num frames: 2557542400. Throughput: 0: 42459.3. Samples: 2557680480. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:11,996][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 14:56:13,675][12883] Updated weights for policy 0, policy_version 156103 (0.0042) [2024-06-18 14:56:16,930][12883] Updated weights for policy 0, policy_version 156113 (0.0026) [2024-06-18 14:56:16,994][12645] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2557755392. Throughput: 0: 42509.7. Samples: 2557808340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:17,000][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 14:56:21,397][12883] Updated weights for policy 0, policy_version 156123 (0.0033) [2024-06-18 14:56:21,994][12645] Fps is (10 sec: 37691.8, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 2557919232. Throughput: 0: 42512.9. Samples: 2558062260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:21,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 14:56:24,247][12862] Signal inference workers to stop experience collection... (37400 times) [2024-06-18 14:56:24,247][12862] Signal inference workers to resume experience collection... (37400 times) [2024-06-18 14:56:24,316][12883] InferenceWorker_p0-w0: stopping experience collection (37400 times) [2024-06-18 14:56:24,320][12883] InferenceWorker_p0-w0: resuming experience collection (37400 times) [2024-06-18 14:56:24,799][12883] Updated weights for policy 0, policy_version 156133 (0.0041) [2024-06-18 14:56:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 2558164992. Throughput: 0: 42618.9. Samples: 2558319080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:26,994][12645] Avg episode reward: [(0, '0.439')] [2024-06-18 14:56:29,110][12883] Updated weights for policy 0, policy_version 156143 (0.0028) [2024-06-18 14:56:31,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2558377984. Throughput: 0: 42695.1. 
Samples: 2558452200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:31,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 14:56:32,584][12883] Updated weights for policy 0, policy_version 156153 (0.0030) [2024-06-18 14:56:36,708][12883] Updated weights for policy 0, policy_version 156163 (0.0033) [2024-06-18 14:56:36,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42600.1, 300 sec: 42487.3). Total num frames: 2558574592. Throughput: 0: 42660.2. Samples: 2558704600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:36,994][12645] Avg episode reward: [(0, '0.800')] [2024-06-18 14:56:37,088][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156164_2558590976.pth... [2024-06-18 14:56:37,152][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155541_2548383744.pth [2024-06-18 14:56:40,347][12883] Updated weights for policy 0, policy_version 156173 (0.0035) [2024-06-18 14:56:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 2558820352. Throughput: 0: 42514.2. Samples: 2558956360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:41,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 14:56:44,431][12883] Updated weights for policy 0, policy_version 156183 (0.0027) [2024-06-18 14:56:46,996][12645] Fps is (10 sec: 44226.4, 60 sec: 42596.8, 300 sec: 42598.1). Total num frames: 2559016960. Throughput: 0: 42655.2. Samples: 2559090940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:46,997][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 14:56:48,030][12883] Updated weights for policy 0, policy_version 156193 (0.0031) [2024-06-18 14:56:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2559213568. Throughput: 0: 42559.1. Samples: 2559344460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:51,994][12645] Avg episode reward: [(0, '0.273')] [2024-06-18 14:56:52,155][12883] Updated weights for policy 0, policy_version 156203 (0.0038) [2024-06-18 14:56:55,769][12883] Updated weights for policy 0, policy_version 156213 (0.0041) [2024-06-18 14:56:56,996][12645] Fps is (10 sec: 44236.9, 60 sec: 42869.9, 300 sec: 42487.0). Total num frames: 2559459328. Throughput: 0: 42516.4. Samples: 2559593720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:56:56,996][12645] Avg episode reward: [(0, '0.279')] [2024-06-18 14:56:59,833][12883] Updated weights for policy 0, policy_version 156223 (0.0031) [2024-06-18 14:57:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 2559639552. Throughput: 0: 42556.2. Samples: 2559723360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 14:57:01,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 14:57:03,419][12883] Updated weights for policy 0, policy_version 156233 (0.0029) [2024-06-18 14:57:06,996][12645] Fps is (10 sec: 39321.5, 60 sec: 42596.8, 300 sec: 42487.0). Total num frames: 2559852544. Throughput: 0: 42590.7. Samples: 2559978940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:06,996][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 14:57:07,350][12883] Updated weights for policy 0, policy_version 156243 (0.0032) [2024-06-18 14:57:11,293][12883] Updated weights for policy 0, policy_version 156253 (0.0037) [2024-06-18 14:57:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42326.9, 300 sec: 42431.8). Total num frames: 2560081920. Throughput: 0: 42426.8. Samples: 2560228280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:11,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 14:57:14,977][12883] Updated weights for policy 0, policy_version 156263 (0.0034) [2024-06-18 14:57:16,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 2560278528. Throughput: 0: 42366.7. Samples: 2560358700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:16,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 14:57:18,945][12883] Updated weights for policy 0, policy_version 156273 (0.0036) [2024-06-18 14:57:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42432.7). Total num frames: 2560491520. Throughput: 0: 42379.9. Samples: 2560611700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:21,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 14:57:23,267][12883] Updated weights for policy 0, policy_version 156283 (0.0027) [2024-06-18 14:57:26,807][12883] Updated weights for policy 0, policy_version 156293 (0.0030) [2024-06-18 14:57:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 2560704512. Throughput: 0: 42332.1. Samples: 2560861300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:26,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 14:57:30,759][12883] Updated weights for policy 0, policy_version 156303 (0.0037) [2024-06-18 14:57:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2560917504. Throughput: 0: 42165.6. Samples: 2560988300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:31,994][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 14:57:34,541][12883] Updated weights for policy 0, policy_version 156313 (0.0036) [2024-06-18 14:57:36,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 2561114112. Throughput: 0: 42114.7. Samples: 2561239620. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:36,997][12645] Avg episode reward: [(0, '0.616')] [2024-06-18 14:57:38,480][12883] Updated weights for policy 0, policy_version 156323 (0.0033) [2024-06-18 14:57:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 2561327104. Throughput: 0: 42208.3. Samples: 2561493000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:41,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 14:57:42,217][12883] Updated weights for policy 0, policy_version 156333 (0.0042) [2024-06-18 14:57:46,241][12883] Updated weights for policy 0, policy_version 156343 (0.0025) [2024-06-18 14:57:46,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42326.8, 300 sec: 42542.8). Total num frames: 2561556480. Throughput: 0: 42149.2. Samples: 2561620080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:46,994][12645] Avg episode reward: [(0, '0.304')] [2024-06-18 14:57:48,759][12862] Signal inference workers to stop experience collection... (37450 times) [2024-06-18 14:57:48,796][12883] InferenceWorker_p0-w0: stopping experience collection (37450 times) [2024-06-18 14:57:48,811][12862] Signal inference workers to resume experience collection... (37450 times) [2024-06-18 14:57:48,821][12883] InferenceWorker_p0-w0: resuming experience collection (37450 times) [2024-06-18 14:57:49,734][12883] Updated weights for policy 0, policy_version 156353 (0.0032) [2024-06-18 14:57:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2561769472. Throughput: 0: 42125.6. Samples: 2561874500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:51,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 14:57:54,038][12883] Updated weights for policy 0, policy_version 156363 (0.0027) [2024-06-18 14:57:56,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42053.9, 300 sec: 42431.8). Total num frames: 2561982464. Throughput: 0: 42362.2. 
Samples: 2562134580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:57:56,994][12645] Avg episode reward: [(0, '0.325')] [2024-06-18 14:57:57,191][12883] Updated weights for policy 0, policy_version 156373 (0.0042) [2024-06-18 14:58:01,665][12883] Updated weights for policy 0, policy_version 156383 (0.0035) [2024-06-18 14:58:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2562195456. Throughput: 0: 42354.6. Samples: 2562264660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:01,994][12645] Avg episode reward: [(0, '0.627')] [2024-06-18 14:58:05,135][12883] Updated weights for policy 0, policy_version 156393 (0.0053) [2024-06-18 14:58:06,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42326.9, 300 sec: 42431.8). Total num frames: 2562392064. Throughput: 0: 42282.2. Samples: 2562514400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:06,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 14:58:09,259][12883] Updated weights for policy 0, policy_version 156403 (0.0029) [2024-06-18 14:58:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42432.1). Total num frames: 2562621440. Throughput: 0: 42438.5. Samples: 2562771040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:11,996][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 14:58:12,834][12883] Updated weights for policy 0, policy_version 156413 (0.0032) [2024-06-18 14:58:16,921][12883] Updated weights for policy 0, policy_version 156423 (0.0033) [2024-06-18 14:58:16,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2562834432. Throughput: 0: 42408.6. Samples: 2562896680. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:16,994][12645] Avg episode reward: [(0, '0.508')] [2024-06-18 14:58:20,542][12883] Updated weights for policy 0, policy_version 156433 (0.0037) [2024-06-18 14:58:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 2563031040. Throughput: 0: 42410.1. Samples: 2563148080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:21,995][12645] Avg episode reward: [(0, '0.679')] [2024-06-18 14:58:24,923][12883] Updated weights for policy 0, policy_version 156443 (0.0048) [2024-06-18 14:58:26,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 2563227648. Throughput: 0: 42592.4. Samples: 2563409660. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:26,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 14:58:28,324][12883] Updated weights for policy 0, policy_version 156453 (0.0029) [2024-06-18 14:58:31,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2563457024. Throughput: 0: 42484.6. Samples: 2563531880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:31,994][12645] Avg episode reward: [(0, '0.474')] [2024-06-18 14:58:32,524][12883] Updated weights for policy 0, policy_version 156463 (0.0041) [2024-06-18 14:58:36,100][12883] Updated weights for policy 0, policy_version 156473 (0.0026) [2024-06-18 14:58:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2563670016. Throughput: 0: 42456.5. Samples: 2563785040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:36,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 14:58:37,078][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156475_2563686400.pth... 
[2024-06-18 14:58:37,135][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000155853_2553495552.pth [2024-06-18 14:58:40,265][12883] Updated weights for policy 0, policy_version 156483 (0.0041) [2024-06-18 14:58:41,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2563883008. Throughput: 0: 42504.7. Samples: 2564047300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:41,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 14:58:43,580][12883] Updated weights for policy 0, policy_version 156493 (0.0035) [2024-06-18 14:58:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2564096000. Throughput: 0: 42411.6. Samples: 2564173180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:46,994][12645] Avg episode reward: [(0, '0.396')] [2024-06-18 14:58:47,783][12883] Updated weights for policy 0, policy_version 156503 (0.0037) [2024-06-18 14:58:51,307][12883] Updated weights for policy 0, policy_version 156513 (0.0039) [2024-06-18 14:58:51,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2564308992. Throughput: 0: 42589.4. Samples: 2564430920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:51,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 14:58:55,455][12883] Updated weights for policy 0, policy_version 156523 (0.0043) [2024-06-18 14:58:56,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 2564538368. Throughput: 0: 42515.1. Samples: 2564684220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:58:56,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 14:58:59,547][12883] Updated weights for policy 0, policy_version 156533 (0.0037) [2024-06-18 14:59:01,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 2564718592. Throughput: 0: 42661.2. Samples: 2564816440. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:59:01,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 14:59:02,990][12883] Updated weights for policy 0, policy_version 156543 (0.0037) [2024-06-18 14:59:06,996][12645] Fps is (10 sec: 40950.9, 60 sec: 42596.8, 300 sec: 42431.5). Total num frames: 2564947968. Throughput: 0: 42631.3. Samples: 2565066580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 14:59:06,997][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 14:59:07,121][12883] Updated weights for policy 0, policy_version 156553 (0.0030) [2024-06-18 14:59:07,934][12862] Signal inference workers to stop experience collection... (37500 times) [2024-06-18 14:59:07,935][12862] Signal inference workers to resume experience collection... (37500 times) [2024-06-18 14:59:07,967][12883] InferenceWorker_p0-w0: stopping experience collection (37500 times) [2024-06-18 14:59:07,967][12883] InferenceWorker_p0-w0: resuming experience collection (37500 times) [2024-06-18 14:59:11,027][12883] Updated weights for policy 0, policy_version 156563 (0.0039) [2024-06-18 14:59:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 2565177344. Throughput: 0: 42452.9. Samples: 2565320040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:11,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 14:59:15,678][12883] Updated weights for policy 0, policy_version 156574 (0.0031) [2024-06-18 14:59:16,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 2565373952. Throughput: 0: 42680.3. Samples: 2565452500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:16,994][12645] Avg episode reward: [(0, '0.601')] [2024-06-18 14:59:19,068][12883] Updated weights for policy 0, policy_version 156584 (0.0035) [2024-06-18 14:59:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 2565586944. 
Throughput: 0: 42626.7. Samples: 2565703240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:21,994][12645] Avg episode reward: [(0, '0.601')] [2024-06-18 14:59:23,388][12883] Updated weights for policy 0, policy_version 156594 (0.0050) [2024-06-18 14:59:26,798][12883] Updated weights for policy 0, policy_version 156604 (0.0036) [2024-06-18 14:59:26,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2565816320. Throughput: 0: 42476.6. Samples: 2565958740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:26,994][12645] Avg episode reward: [(0, '0.492')] [2024-06-18 14:59:31,134][12883] Updated weights for policy 0, policy_version 156614 (0.0038) [2024-06-18 14:59:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2565996544. Throughput: 0: 42412.5. Samples: 2566081740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:31,994][12645] Avg episode reward: [(0, '0.386')] [2024-06-18 14:59:34,476][12883] Updated weights for policy 0, policy_version 156624 (0.0032) [2024-06-18 14:59:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2566225920. Throughput: 0: 42271.1. Samples: 2566333120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:36,994][12645] Avg episode reward: [(0, '0.318')] [2024-06-18 14:59:38,815][12883] Updated weights for policy 0, policy_version 156634 (0.0028) [2024-06-18 14:59:41,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 2566422528. Throughput: 0: 42412.1. Samples: 2566592760. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:41,994][12645] Avg episode reward: [(0, '0.168')] [2024-06-18 14:59:42,268][12883] Updated weights for policy 0, policy_version 156644 (0.0036) [2024-06-18 14:59:46,420][12883] Updated weights for policy 0, policy_version 156654 (0.0038) [2024-06-18 14:59:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 2566635520. Throughput: 0: 42153.0. Samples: 2566713320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:46,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 14:59:50,122][12883] Updated weights for policy 0, policy_version 156664 (0.0040) [2024-06-18 14:59:51,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2566881280. Throughput: 0: 42256.3. Samples: 2566968020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:51,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 14:59:54,043][12883] Updated weights for policy 0, policy_version 156674 (0.0032) [2024-06-18 14:59:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42432.6). Total num frames: 2567061504. Throughput: 0: 42424.8. Samples: 2567229160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 14:59:56,994][12645] Avg episode reward: [(0, '0.408')] [2024-06-18 14:59:57,985][12883] Updated weights for policy 0, policy_version 156684 (0.0047) [2024-06-18 15:00:01,649][12883] Updated weights for policy 0, policy_version 156694 (0.0032) [2024-06-18 15:00:01,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2567274496. Throughput: 0: 42144.0. Samples: 2567348980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 15:00:01,995][12645] Avg episode reward: [(0, '0.345')] [2024-06-18 15:00:05,640][12883] Updated weights for policy 0, policy_version 156704 (0.0041) [2024-06-18 15:00:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42600.0, 300 sec: 42487.3). Total num frames: 2567503872. Throughput: 0: 42485.3. Samples: 2567615080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 15:00:06,994][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 15:00:09,542][12883] Updated weights for policy 0, policy_version 156714 (0.0023) [2024-06-18 15:00:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2567700480. Throughput: 0: 42435.5. Samples: 2567868340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 15:00:11,994][12645] Avg episode reward: [(0, '0.601')] [2024-06-18 15:00:13,235][12883] Updated weights for policy 0, policy_version 156724 (0.0035) [2024-06-18 15:00:16,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2567913472. Throughput: 0: 42382.1. Samples: 2567988940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:00:16,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 15:00:17,418][12883] Updated weights for policy 0, policy_version 156734 (0.0055) [2024-06-18 15:00:20,936][12883] Updated weights for policy 0, policy_version 156744 (0.0042) [2024-06-18 15:00:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2568159232. Throughput: 0: 42695.0. Samples: 2568254400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:00:21,994][12645] Avg episode reward: [(0, '0.671')] [2024-06-18 15:00:24,946][12883] Updated weights for policy 0, policy_version 156754 (0.0049) [2024-06-18 15:00:26,994][12645] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 2568323072. Throughput: 0: 42669.7. Samples: 2568512900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:00:26,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 15:00:28,664][12883] Updated weights for policy 0, policy_version 156764 (0.0043) [2024-06-18 15:00:31,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42487.6). Total num frames: 2568552448. Throughput: 0: 42616.3. Samples: 2568631060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:00:31,995][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 15:00:32,783][12883] Updated weights for policy 0, policy_version 156774 (0.0041) [2024-06-18 15:00:36,083][12883] Updated weights for policy 0, policy_version 156784 (0.0044) [2024-06-18 15:00:36,981][12862] Signal inference workers to stop experience collection... (37550 times) [2024-06-18 15:00:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2568781824. Throughput: 0: 42756.0. Samples: 2568892040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:00:36,994][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 15:00:37,026][12883] InferenceWorker_p0-w0: stopping experience collection (37550 times) [2024-06-18 15:00:37,033][12862] Signal inference workers to resume experience collection... (37550 times) [2024-06-18 15:00:37,043][12883] InferenceWorker_p0-w0: resuming experience collection (37550 times) [2024-06-18 15:00:37,168][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156788_2568814592.pth... [2024-06-18 15:00:37,210][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156164_2558590976.pth [2024-06-18 15:00:40,363][12883] Updated weights for policy 0, policy_version 156794 (0.0043) [2024-06-18 15:00:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 2568978432. Throughput: 0: 42726.3. Samples: 2569151840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:00:41,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 15:00:43,869][12883] Updated weights for policy 0, policy_version 156804 (0.0027) [2024-06-18 15:00:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2569207808. Throughput: 0: 42585.9. Samples: 2569265340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:00:46,994][12645] Avg episode reward: [(0, '0.710')] [2024-06-18 15:00:47,945][12883] Updated weights for policy 0, policy_version 156814 (0.0032) [2024-06-18 15:00:51,525][12883] Updated weights for policy 0, policy_version 156824 (0.0028) [2024-06-18 15:00:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2569420800. Throughput: 0: 42576.8. Samples: 2569531040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:00:51,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 15:00:55,467][12883] Updated weights for policy 0, policy_version 156834 (0.0038) [2024-06-18 15:00:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 2569601024. Throughput: 0: 42706.3. Samples: 2569790120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:00:56,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 15:00:59,429][12883] Updated weights for policy 0, policy_version 156844 (0.0022) [2024-06-18 15:01:01,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2569846784. Throughput: 0: 42739.7. Samples: 2569912220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:01:01,994][12645] Avg episode reward: [(0, '0.646')] [2024-06-18 15:01:03,061][12883] Updated weights for policy 0, policy_version 156854 (0.0035) [2024-06-18 15:01:06,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 2570043392. Throughput: 0: 42447.2. Samples: 2570164520. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:01:06,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 15:01:07,309][12883] Updated weights for policy 0, policy_version 156864 (0.0048) [2024-06-18 15:01:10,762][12883] Updated weights for policy 0, policy_version 156874 (0.0034) [2024-06-18 15:01:11,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 2570240000. Throughput: 0: 42404.9. Samples: 2570421120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:01:11,994][12645] Avg episode reward: [(0, '0.616')] [2024-06-18 15:01:14,934][12883] Updated weights for policy 0, policy_version 156884 (0.0047) [2024-06-18 15:01:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2570502144. Throughput: 0: 42647.7. Samples: 2570550200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 20.0) [2024-06-18 15:01:16,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 15:01:18,452][12883] Updated weights for policy 0, policy_version 156894 (0.0031) [2024-06-18 15:01:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 2570682368. Throughput: 0: 42647.6. Samples: 2570811180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:01:21,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 15:01:22,558][12883] Updated weights for policy 0, policy_version 156904 (0.0030) [2024-06-18 15:01:26,174][12883] Updated weights for policy 0, policy_version 156914 (0.0033) [2024-06-18 15:01:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 2570895360. Throughput: 0: 42590.7. Samples: 2571068420. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:01:26,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 15:01:30,067][12883] Updated weights for policy 0, policy_version 156924 (0.0036) [2024-06-18 15:01:31,994][12645] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2571141120. Throughput: 0: 42904.9. Samples: 2571196060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:01:31,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 15:01:34,030][12883] Updated weights for policy 0, policy_version 156934 (0.0030) [2024-06-18 15:01:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2571337728. Throughput: 0: 42885.0. Samples: 2571460860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:01:36,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 15:01:37,506][12883] Updated weights for policy 0, policy_version 156944 (0.0030) [2024-06-18 15:01:41,801][12883] Updated weights for policy 0, policy_version 156954 (0.0035) [2024-06-18 15:01:41,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42487.6). Total num frames: 2571550720. Throughput: 0: 42677.7. Samples: 2571710620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:01:41,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 15:01:43,308][12862] Signal inference workers to stop experience collection... (37600 times) [2024-06-18 15:01:43,308][12862] Signal inference workers to resume experience collection... (37600 times) [2024-06-18 15:01:43,322][12883] InferenceWorker_p0-w0: stopping experience collection (37600 times) [2024-06-18 15:01:43,322][12883] InferenceWorker_p0-w0: resuming experience collection (37600 times) [2024-06-18 15:01:45,227][12883] Updated weights for policy 0, policy_version 156964 (0.0040) [2024-06-18 15:01:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2571780096. Throughput: 0: 42788.4. 
Samples: 2571837700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:01:46,994][12645] Avg episode reward: [(0, '0.700')] [2024-06-18 15:01:49,332][12883] Updated weights for policy 0, policy_version 156974 (0.0033) [2024-06-18 15:01:51,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42432.1). Total num frames: 2571976704. Throughput: 0: 42928.0. Samples: 2572096280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:01:51,994][12645] Avg episode reward: [(0, '0.580')] [2024-06-18 15:01:52,832][12883] Updated weights for policy 0, policy_version 156984 (0.0042) [2024-06-18 15:01:56,874][12883] Updated weights for policy 0, policy_version 156994 (0.0037) [2024-06-18 15:01:56,994][12645] Fps is (10 sec: 40959.3, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2572189696. Throughput: 0: 42875.5. Samples: 2572350520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:01:56,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 15:02:00,665][12883] Updated weights for policy 0, policy_version 157004 (0.0031) [2024-06-18 15:02:01,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42869.8, 300 sec: 42598.4). Total num frames: 2572419072. Throughput: 0: 42866.3. Samples: 2572479280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:02:01,997][12645] Avg episode reward: [(0, '0.246')] [2024-06-18 15:02:04,320][12883] Updated weights for policy 0, policy_version 157014 (0.0031) [2024-06-18 15:02:06,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2572615680. Throughput: 0: 42811.1. Samples: 2572737680. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:02:06,995][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 15:02:08,544][12883] Updated weights for policy 0, policy_version 157024 (0.0033) [2024-06-18 15:02:11,931][12883] Updated weights for policy 0, policy_version 157034 (0.0022) [2024-06-18 15:02:11,994][12645] Fps is (10 sec: 42607.9, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 2572845056. Throughput: 0: 42807.5. Samples: 2572994760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:02:11,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 15:02:15,978][12883] Updated weights for policy 0, policy_version 157044 (0.0035) [2024-06-18 15:02:16,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2573074432. Throughput: 0: 42943.1. Samples: 2573128500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:02:16,994][12645] Avg episode reward: [(0, '0.616')] [2024-06-18 15:02:19,619][12883] Updated weights for policy 0, policy_version 157054 (0.0036) [2024-06-18 15:02:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2573254656. Throughput: 0: 42693.4. Samples: 2573382060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 15:02:21,994][12645] Avg episode reward: [(0, '0.680')] [2024-06-18 15:02:23,417][12883] Updated weights for policy 0, policy_version 157064 (0.0039) [2024-06-18 15:02:27,000][12645] Fps is (10 sec: 37659.8, 60 sec: 42594.0, 300 sec: 42486.4). Total num frames: 2573451264. Throughput: 0: 42854.1. Samples: 2573639320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:02:27,001][12645] Avg episode reward: [(0, '0.727')] [2024-06-18 15:02:27,485][12883] Updated weights for policy 0, policy_version 157074 (0.0036) [2024-06-18 15:02:30,835][12883] Updated weights for policy 0, policy_version 157084 (0.0023) [2024-06-18 15:02:31,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2573713408. Throughput: 0: 42903.2. Samples: 2573768340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:02:31,994][12645] Avg episode reward: [(0, '0.727')] [2024-06-18 15:02:34,965][12883] Updated weights for policy 0, policy_version 157094 (0.0030) [2024-06-18 15:02:36,994][12645] Fps is (10 sec: 45903.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2573910016. Throughput: 0: 42989.2. Samples: 2574030800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:02:36,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 15:02:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157099_2573910016.pth... [2024-06-18 15:02:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156475_2563686400.pth [2024-06-18 15:02:38,317][12883] Updated weights for policy 0, policy_version 157104 (0.0039) [2024-06-18 15:02:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2574106624. Throughput: 0: 43126.9. Samples: 2574291220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:02:41,994][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 15:02:42,430][12883] Updated weights for policy 0, policy_version 157114 (0.0036) [2024-06-18 15:02:45,900][12883] Updated weights for policy 0, policy_version 157124 (0.0037) [2024-06-18 15:02:46,999][12645] Fps is (10 sec: 44212.8, 60 sec: 42867.5, 300 sec: 42653.1). Total num frames: 2574352384. Throughput: 0: 43078.6. Samples: 2574417960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:02:47,000][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 15:02:50,152][12883] Updated weights for policy 0, policy_version 157134 (0.0036) [2024-06-18 15:02:51,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2574548992. Throughput: 0: 43039.9. Samples: 2574674480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:02:51,995][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 15:02:53,469][12883] Updated weights for policy 0, policy_version 157144 (0.0043) [2024-06-18 15:02:57,000][12645] Fps is (10 sec: 40957.2, 60 sec: 42867.1, 300 sec: 42597.5). Total num frames: 2574761984. Throughput: 0: 43113.1. Samples: 2574935120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:02:57,000][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 15:02:58,136][12883] Updated weights for policy 0, policy_version 157154 (0.0030) [2024-06-18 15:03:00,874][12862] Signal inference workers to stop experience collection... (37650 times) [2024-06-18 15:03:00,874][12862] Signal inference workers to resume experience collection... (37650 times) [2024-06-18 15:03:00,931][12883] InferenceWorker_p0-w0: stopping experience collection (37650 times) [2024-06-18 15:03:00,931][12883] InferenceWorker_p0-w0: resuming experience collection (37650 times) [2024-06-18 15:03:01,024][12883] Updated weights for policy 0, policy_version 157164 (0.0044) [2024-06-18 15:03:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 2574991360. Throughput: 0: 42951.9. Samples: 2575061340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:03:01,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 15:03:05,692][12883] Updated weights for policy 0, policy_version 157174 (0.0032) [2024-06-18 15:03:06,994][12645] Fps is (10 sec: 42625.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2575187968. 
Throughput: 0: 43114.6. Samples: 2575322220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:03:06,994][12645] Avg episode reward: [(0, '0.732')] [2024-06-18 15:03:08,627][12883] Updated weights for policy 0, policy_version 157184 (0.0033) [2024-06-18 15:03:11,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2575400960. Throughput: 0: 43053.6. Samples: 2575576460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:03:11,994][12645] Avg episode reward: [(0, '0.696')] [2024-06-18 15:03:13,247][12883] Updated weights for policy 0, policy_version 157194 (0.0026) [2024-06-18 15:03:16,986][12883] Updated weights for policy 0, policy_version 157204 (0.0038) [2024-06-18 15:03:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2575630336. Throughput: 0: 43005.6. Samples: 2575703600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:03:16,994][12645] Avg episode reward: [(0, '0.668')] [2024-06-18 15:03:20,748][12883] Updated weights for policy 0, policy_version 157214 (0.0025) [2024-06-18 15:03:21,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2575843328. Throughput: 0: 42981.1. Samples: 2575964940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:03:21,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 15:03:24,511][12883] Updated weights for policy 0, policy_version 157224 (0.0027) [2024-06-18 15:03:26,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43422.2, 300 sec: 42709.5). Total num frames: 2576056320. Throughput: 0: 42969.8. Samples: 2576224860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:03:26,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 15:03:28,316][12883] Updated weights for policy 0, policy_version 157234 (0.0042) [2024-06-18 15:03:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2576269312. 
Throughput: 0: 43035.6. Samples: 2576354320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:03:31,994][12645] Avg episode reward: [(0, '0.666')] [2024-06-18 15:03:32,123][12883] Updated weights for policy 0, policy_version 157244 (0.0034) [2024-06-18 15:03:36,274][12883] Updated weights for policy 0, policy_version 157254 (0.0046) [2024-06-18 15:03:36,996][12645] Fps is (10 sec: 42588.5, 60 sec: 42870.0, 300 sec: 42709.2). Total num frames: 2576482304. Throughput: 0: 42948.7. Samples: 2576607260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:03:36,996][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 15:03:39,803][12883] Updated weights for policy 0, policy_version 157264 (0.0033) [2024-06-18 15:03:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2576711680. Throughput: 0: 42806.0. Samples: 2576861120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:03:41,994][12645] Avg episode reward: [(0, '0.747')] [2024-06-18 15:03:43,857][12883] Updated weights for policy 0, policy_version 157274 (0.0031) [2024-06-18 15:03:46,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42329.3, 300 sec: 42653.9). Total num frames: 2576891904. Throughput: 0: 42862.4. Samples: 2576990140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:03:46,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 15:03:47,585][12883] Updated weights for policy 0, policy_version 157284 (0.0037) [2024-06-18 15:03:51,512][12883] Updated weights for policy 0, policy_version 157294 (0.0024) [2024-06-18 15:03:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2577137664. Throughput: 0: 42749.7. Samples: 2577245960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:03:51,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 15:03:55,185][12883] Updated weights for policy 0, policy_version 157304 (0.0045) [2024-06-18 15:03:56,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43149.1, 300 sec: 42820.6). Total num frames: 2577350656. Throughput: 0: 42625.8. Samples: 2577494620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:03:56,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 15:03:59,379][12883] Updated weights for policy 0, policy_version 157314 (0.0032) [2024-06-18 15:04:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2577530880. Throughput: 0: 42700.0. Samples: 2577625100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:04:01,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 15:04:02,806][12883] Updated weights for policy 0, policy_version 157324 (0.0025) [2024-06-18 15:04:06,813][12862] Signal inference workers to stop experience collection... (37700 times) [2024-06-18 15:04:06,845][12883] InferenceWorker_p0-w0: stopping experience collection (37700 times) [2024-06-18 15:04:06,879][12862] Signal inference workers to resume experience collection... (37700 times) [2024-06-18 15:04:06,880][12883] InferenceWorker_p0-w0: resuming experience collection (37700 times) [2024-06-18 15:04:06,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2577743872. Throughput: 0: 42664.0. Samples: 2577884820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:04:06,994][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 15:04:07,017][12883] Updated weights for policy 0, policy_version 157334 (0.0032) [2024-06-18 15:04:10,632][12883] Updated weights for policy 0, policy_version 157344 (0.0025) [2024-06-18 15:04:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2577989632. Throughput: 0: 42351.9. 
Samples: 2578130700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:04:11,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 15:04:14,860][12883] Updated weights for policy 0, policy_version 157354 (0.0035) [2024-06-18 15:04:16,998][12645] Fps is (10 sec: 44216.3, 60 sec: 42595.2, 300 sec: 42708.8). Total num frames: 2578186240. Throughput: 0: 42487.2. Samples: 2578266440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:04:16,999][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 15:04:18,349][12883] Updated weights for policy 0, policy_version 157364 (0.0034) [2024-06-18 15:04:21,995][12645] Fps is (10 sec: 39317.2, 60 sec: 42324.5, 300 sec: 42598.2). Total num frames: 2578382848. Throughput: 0: 42388.2. Samples: 2578514680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:04:21,995][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 15:04:22,626][12883] Updated weights for policy 0, policy_version 157374 (0.0050) [2024-06-18 15:04:25,924][12883] Updated weights for policy 0, policy_version 157384 (0.0032) [2024-06-18 15:04:26,994][12645] Fps is (10 sec: 42617.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2578612224. Throughput: 0: 42528.8. Samples: 2578774920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:04:26,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 15:04:29,988][12883] Updated weights for policy 0, policy_version 157394 (0.0028) [2024-06-18 15:04:31,994][12645] Fps is (10 sec: 44241.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2578825216. Throughput: 0: 42634.1. Samples: 2578908680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 15:04:31,994][12645] Avg episode reward: [(0, '0.722')] [2024-06-18 15:04:33,513][12883] Updated weights for policy 0, policy_version 157404 (0.0038) [2024-06-18 15:04:36,996][12645] Fps is (10 sec: 40951.0, 60 sec: 42325.3, 300 sec: 42709.1). Total num frames: 2579021824. Throughput: 0: 42459.3. 
Samples: 2579156720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:04:36,997][12645] Avg episode reward: [(0, '0.679')] [2024-06-18 15:04:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157411_2579021824.pth... [2024-06-18 15:04:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000156788_2568814592.pth [2024-06-18 15:04:37,576][12883] Updated weights for policy 0, policy_version 157414 (0.0027) [2024-06-18 15:04:41,449][12883] Updated weights for policy 0, policy_version 157424 (0.0035) [2024-06-18 15:04:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2579267584. Throughput: 0: 42705.3. Samples: 2579416360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:04:41,994][12645] Avg episode reward: [(0, '0.728')] [2024-06-18 15:04:45,245][12883] Updated weights for policy 0, policy_version 157434 (0.0043) [2024-06-18 15:04:46,994][12645] Fps is (10 sec: 44247.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2579464192. Throughput: 0: 42812.1. Samples: 2579551640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:04:46,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 15:04:49,068][12883] Updated weights for policy 0, policy_version 157444 (0.0031) [2024-06-18 15:04:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2579677184. Throughput: 0: 42581.7. Samples: 2579801000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:04:51,995][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 15:04:52,759][12883] Updated weights for policy 0, policy_version 157454 (0.0033) [2024-06-18 15:04:56,617][12883] Updated weights for policy 0, policy_version 157464 (0.0045) [2024-06-18 15:04:56,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2579890176. Throughput: 0: 42814.3. Samples: 2580057340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:04:56,994][12645] Avg episode reward: [(0, '0.263')] [2024-06-18 15:05:00,243][12883] Updated weights for policy 0, policy_version 157474 (0.0030) [2024-06-18 15:05:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2580086784. Throughput: 0: 42711.9. Samples: 2580188280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:05:01,994][12645] Avg episode reward: [(0, '0.227')] [2024-06-18 15:05:04,138][12883] Updated weights for policy 0, policy_version 157484 (0.0033) [2024-06-18 15:05:06,994][12645] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2580332544. Throughput: 0: 42792.0. Samples: 2580440280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:05:06,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 15:05:08,071][12883] Updated weights for policy 0, policy_version 157494 (0.0041) [2024-06-18 15:05:11,754][12883] Updated weights for policy 0, policy_version 157504 (0.0032) [2024-06-18 15:05:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2580545536. Throughput: 0: 42654.3. Samples: 2580694360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:05:11,994][12645] Avg episode reward: [(0, '0.695')] [2024-06-18 15:05:15,632][12883] Updated weights for policy 0, policy_version 157514 (0.0027) [2024-06-18 15:05:16,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42601.7, 300 sec: 42654.0). Total num frames: 2580742144. Throughput: 0: 42492.1. Samples: 2580820820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:05:16,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 15:05:19,584][12883] Updated weights for policy 0, policy_version 157524 (0.0034) [2024-06-18 15:05:21,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43145.2, 300 sec: 42876.1). Total num frames: 2580971520. Throughput: 0: 42731.8. Samples: 2581079560. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:05:21,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 15:05:23,439][12883] Updated weights for policy 0, policy_version 157534 (0.0030) [2024-06-18 15:05:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2581168128. Throughput: 0: 42724.9. Samples: 2581338980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:05:26,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 15:05:27,164][12883] Updated weights for policy 0, policy_version 157544 (0.0038) [2024-06-18 15:05:31,355][12883] Updated weights for policy 0, policy_version 157554 (0.0031) [2024-06-18 15:05:31,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2581381120. Throughput: 0: 42420.8. Samples: 2581460580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:05:31,994][12645] Avg episode reward: [(0, '0.749')] [2024-06-18 15:05:34,249][12862] Signal inference workers to stop experience collection... (37750 times) [2024-06-18 15:05:34,249][12862] Signal inference workers to resume experience collection... (37750 times) [2024-06-18 15:05:34,284][12883] InferenceWorker_p0-w0: stopping experience collection (37750 times) [2024-06-18 15:05:34,285][12883] InferenceWorker_p0-w0: resuming experience collection (37750 times) [2024-06-18 15:05:34,730][12883] Updated weights for policy 0, policy_version 157564 (0.0039) [2024-06-18 15:05:36,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43419.2, 300 sec: 42876.1). Total num frames: 2581626880. Throughput: 0: 42629.8. Samples: 2581719340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:05:36,994][12645] Avg episode reward: [(0, '0.633')] [2024-06-18 15:05:38,984][12883] Updated weights for policy 0, policy_version 157574 (0.0034) [2024-06-18 15:05:41,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2581807104. Throughput: 0: 42657.8. 
Samples: 2581976940. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:05:41,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 15:05:42,613][12883] Updated weights for policy 0, policy_version 157584 (0.0052) [2024-06-18 15:05:46,747][12883] Updated weights for policy 0, policy_version 157594 (0.0034) [2024-06-18 15:05:46,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2582020096. Throughput: 0: 42486.2. Samples: 2582100160. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:05:47,000][12645] Avg episode reward: [(0, '0.288')] [2024-06-18 15:05:50,325][12883] Updated weights for policy 0, policy_version 157604 (0.0031) [2024-06-18 15:05:52,000][12645] Fps is (10 sec: 44208.7, 60 sec: 42867.1, 300 sec: 42875.2). Total num frames: 2582249472. Throughput: 0: 42545.8. Samples: 2582355100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:05:52,001][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 15:05:54,299][12883] Updated weights for policy 0, policy_version 157614 (0.0038) [2024-06-18 15:05:56,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2582462464. Throughput: 0: 42678.7. Samples: 2582614900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:05:56,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 15:05:58,011][12883] Updated weights for policy 0, policy_version 157624 (0.0028) [2024-06-18 15:06:01,968][12883] Updated weights for policy 0, policy_version 157634 (0.0028) [2024-06-18 15:06:01,996][12645] Fps is (10 sec: 42615.4, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 2582675456. Throughput: 0: 42717.4. Samples: 2582743200. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:06:01,997][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 15:06:05,845][12883] Updated weights for policy 0, policy_version 157644 (0.0036) [2024-06-18 15:06:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2582872064. Throughput: 0: 42602.4. Samples: 2582996660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:06:06,999][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 15:06:09,829][12883] Updated weights for policy 0, policy_version 157654 (0.0036) [2024-06-18 15:06:11,999][12645] Fps is (10 sec: 40948.5, 60 sec: 42321.8, 300 sec: 42653.2). Total num frames: 2583085056. Throughput: 0: 42528.5. Samples: 2583252980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:06:11,999][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 15:06:13,501][12883] Updated weights for policy 0, policy_version 157664 (0.0022) [2024-06-18 15:06:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2583298048. Throughput: 0: 42639.6. Samples: 2583379360. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:06:16,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 15:06:17,328][12883] Updated weights for policy 0, policy_version 157674 (0.0040) [2024-06-18 15:06:21,088][12883] Updated weights for policy 0, policy_version 157684 (0.0037) [2024-06-18 15:06:21,994][12645] Fps is (10 sec: 42620.2, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 2583511040. Throughput: 0: 42578.8. Samples: 2583635380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:06:21,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 15:06:25,045][12883] Updated weights for policy 0, policy_version 157694 (0.0027) [2024-06-18 15:06:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2583724032. Throughput: 0: 42538.5. Samples: 2583891180. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:06:26,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 15:06:28,854][12883] Updated weights for policy 0, policy_version 157704 (0.0046) [2024-06-18 15:06:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2583937024. Throughput: 0: 42638.7. Samples: 2584018900. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:06:31,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 15:06:32,582][12883] Updated weights for policy 0, policy_version 157714 (0.0033) [2024-06-18 15:06:36,440][12883] Updated weights for policy 0, policy_version 157724 (0.0040) [2024-06-18 15:06:36,999][12645] Fps is (10 sec: 44215.1, 60 sec: 42321.9, 300 sec: 42764.3). Total num frames: 2584166400. Throughput: 0: 42714.6. Samples: 2584277200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-18 15:06:36,999][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 15:06:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157725_2584166400.pth... [2024-06-18 15:06:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157099_2573910016.pth [2024-06-18 15:06:40,968][12883] Updated weights for policy 0, policy_version 157734 (0.0036) [2024-06-18 15:06:41,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2584363008. Throughput: 0: 42641.8. Samples: 2584533780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:06:41,994][12645] Avg episode reward: [(0, '0.544')] [2024-06-18 15:06:43,971][12883] Updated weights for policy 0, policy_version 157744 (0.0038) [2024-06-18 15:06:46,994][12645] Fps is (10 sec: 42619.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2584592384. Throughput: 0: 42504.4. Samples: 2584655800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:06:46,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 15:06:48,400][12883] Updated weights for policy 0, policy_version 157754 (0.0039) [2024-06-18 15:06:49,508][12862] Signal inference workers to stop experience collection... (37800 times) [2024-06-18 15:06:49,511][12862] Signal inference workers to resume experience collection... (37800 times) [2024-06-18 15:06:49,558][12883] InferenceWorker_p0-w0: stopping experience collection (37800 times) [2024-06-18 15:06:49,564][12883] InferenceWorker_p0-w0: resuming experience collection (37800 times) [2024-06-18 15:06:51,538][12883] Updated weights for policy 0, policy_version 157764 (0.0042) [2024-06-18 15:06:51,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42875.8, 300 sec: 42820.6). Total num frames: 2584821760. Throughput: 0: 42680.3. Samples: 2584917280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:06:51,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 15:06:55,866][12883] Updated weights for policy 0, policy_version 157774 (0.0030) [2024-06-18 15:06:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 2585001984. Throughput: 0: 42773.7. Samples: 2585177580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:06:56,994][12645] Avg episode reward: [(0, '0.310')] [2024-06-18 15:06:59,214][12883] Updated weights for policy 0, policy_version 157784 (0.0028) [2024-06-18 15:07:01,996][12645] Fps is (10 sec: 42589.8, 60 sec: 42871.5, 300 sec: 42820.2). Total num frames: 2585247744. Throughput: 0: 42658.3. Samples: 2585299080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:07:01,997][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 15:07:03,747][12883] Updated weights for policy 0, policy_version 157794 (0.0030) [2024-06-18 15:07:06,832][12883] Updated weights for policy 0, policy_version 157804 (0.0045) [2024-06-18 15:07:06,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2585460736. Throughput: 0: 42736.4. Samples: 2585558520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:07:06,994][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 15:07:11,698][12883] Updated weights for policy 0, policy_version 157814 (0.0032) [2024-06-18 15:07:11,996][12645] Fps is (10 sec: 39321.5, 60 sec: 42600.4, 300 sec: 42598.1). Total num frames: 2585640960. Throughput: 0: 42865.0. Samples: 2585820200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:07:11,996][12645] Avg episode reward: [(0, '0.843')] [2024-06-18 15:07:14,534][12883] Updated weights for policy 0, policy_version 157824 (0.0031) [2024-06-18 15:07:16,996][12645] Fps is (10 sec: 42588.9, 60 sec: 43142.9, 300 sec: 42820.2). Total num frames: 2585886720. Throughput: 0: 42640.9. Samples: 2585937840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:07:16,997][12645] Avg episode reward: [(0, '0.843')] [2024-06-18 15:07:19,592][12883] Updated weights for policy 0, policy_version 157834 (0.0030) [2024-06-18 15:07:21,994][12645] Fps is (10 sec: 45885.3, 60 sec: 43144.5, 300 sec: 42877.0). Total num frames: 2586099712. Throughput: 0: 42714.0. Samples: 2586199120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:07:21,994][12645] Avg episode reward: [(0, '0.644')] [2024-06-18 15:07:22,386][12883] Updated weights for policy 0, policy_version 157844 (0.0032) [2024-06-18 15:07:26,994][12645] Fps is (10 sec: 37691.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2586263552. Throughput: 0: 42796.0. Samples: 2586459600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:07:26,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 15:07:27,278][12883] Updated weights for policy 0, policy_version 157854 (0.0029) [2024-06-18 15:07:29,902][12883] Updated weights for policy 0, policy_version 157864 (0.0028) [2024-06-18 15:07:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2586525696. Throughput: 0: 42775.1. Samples: 2586580680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:07:31,994][12645] Avg episode reward: [(0, '0.264')] [2024-06-18 15:07:34,734][12883] Updated weights for policy 0, policy_version 157874 (0.0035) [2024-06-18 15:07:36,994][12645] Fps is (10 sec: 47513.5, 60 sec: 42875.0, 300 sec: 42820.5). Total num frames: 2586738688. Throughput: 0: 42812.1. Samples: 2586843820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:07:36,994][12645] Avg episode reward: [(0, '0.645')] [2024-06-18 15:07:37,942][12883] Updated weights for policy 0, policy_version 157884 (0.0032) [2024-06-18 15:07:41,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42543.7). Total num frames: 2586902528. Throughput: 0: 42602.3. Samples: 2587094680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 15:07:41,994][12645] Avg episode reward: [(0, '0.591')] [2024-06-18 15:07:42,355][12883] Updated weights for policy 0, policy_version 157894 (0.0032) [2024-06-18 15:07:45,522][12883] Updated weights for policy 0, policy_version 157904 (0.0035) [2024-06-18 15:07:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2587164672. Throughput: 0: 42660.7. Samples: 2587218720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:07:46,994][12645] Avg episode reward: [(0, '0.693')] [2024-06-18 15:07:50,310][12883] Updated weights for policy 0, policy_version 157914 (0.0031) [2024-06-18 15:07:51,994][12645] Fps is (10 sec: 45874.3, 60 sec: 42325.4, 300 sec: 42710.4). Total num frames: 2587361280. Throughput: 0: 42738.6. Samples: 2587481760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:07:51,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 15:07:53,193][12883] Updated weights for policy 0, policy_version 157924 (0.0030) [2024-06-18 15:07:56,994][12645] Fps is (10 sec: 37683.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2587541504. Throughput: 0: 42608.4. Samples: 2587737480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:07:56,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 15:07:57,863][12883] Updated weights for policy 0, policy_version 157934 (0.0032) [2024-06-18 15:08:01,003][12883] Updated weights for policy 0, policy_version 157944 (0.0030) [2024-06-18 15:08:01,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 2587803648. Throughput: 0: 42693.3. Samples: 2587858940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:01,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 15:08:05,433][12883] Updated weights for policy 0, policy_version 157954 (0.0031) [2024-06-18 15:08:05,441][12862] Signal inference workers to stop experience collection... (37850 times) [2024-06-18 15:08:05,441][12862] Signal inference workers to resume experience collection... (37850 times) [2024-06-18 15:08:05,451][12883] InferenceWorker_p0-w0: stopping experience collection (37850 times) [2024-06-18 15:08:05,463][12883] InferenceWorker_p0-w0: resuming experience collection (37850 times) [2024-06-18 15:08:06,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2588000256. 
Throughput: 0: 42686.2. Samples: 2588120000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:06,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 15:08:08,531][12883] Updated weights for policy 0, policy_version 157964 (0.0031) [2024-06-18 15:08:11,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2588196864. Throughput: 0: 42435.6. Samples: 2588369200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:11,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 15:08:12,907][12883] Updated weights for policy 0, policy_version 157974 (0.0032) [2024-06-18 15:08:16,521][12883] Updated weights for policy 0, policy_version 157984 (0.0028) [2024-06-18 15:08:16,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2588442624. Throughput: 0: 42682.6. Samples: 2588501400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:16,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 15:08:20,453][12883] Updated weights for policy 0, policy_version 157994 (0.0039) [2024-06-18 15:08:21,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2588639232. Throughput: 0: 42649.3. Samples: 2588763040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:21,995][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 15:08:23,989][12883] Updated weights for policy 0, policy_version 158004 (0.0031) [2024-06-18 15:08:26,996][12645] Fps is (10 sec: 42588.9, 60 sec: 43416.0, 300 sec: 42709.1). Total num frames: 2588868608. Throughput: 0: 42720.4. Samples: 2589017200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:26,997][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 15:08:28,185][12883] Updated weights for policy 0, policy_version 158014 (0.0037) [2024-06-18 15:08:31,619][12883] Updated weights for policy 0, policy_version 158024 (0.0035) [2024-06-18 15:08:31,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2589097984. Throughput: 0: 42947.6. Samples: 2589151360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:31,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 15:08:35,605][12883] Updated weights for policy 0, policy_version 158034 (0.0035) [2024-06-18 15:08:36,994][12645] Fps is (10 sec: 42608.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2589294592. Throughput: 0: 42884.1. Samples: 2589411540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:36,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 15:08:37,004][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158038_2589294592.pth... [2024-06-18 15:08:37,049][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157411_2579021824.pth [2024-06-18 15:08:39,200][12883] Updated weights for policy 0, policy_version 158044 (0.0042) [2024-06-18 15:08:41,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2589507584. Throughput: 0: 42825.8. Samples: 2589664640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:41,994][12645] Avg episode reward: [(0, '0.580')] [2024-06-18 15:08:43,074][12883] Updated weights for policy 0, policy_version 158054 (0.0036) [2024-06-18 15:08:46,712][12883] Updated weights for policy 0, policy_version 158064 (0.0033) [2024-06-18 15:08:46,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2589736960. Throughput: 0: 43040.3. Samples: 2589795760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 15:08:46,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 15:08:50,469][12883] Updated weights for policy 0, policy_version 158074 (0.0042) [2024-06-18 15:08:51,994][12645] Fps is (10 sec: 42597.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2589933568. Throughput: 0: 42820.4. Samples: 2590046920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:08:51,995][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 15:08:54,390][12883] Updated weights for policy 0, policy_version 158084 (0.0041) [2024-06-18 15:08:56,994][12645] Fps is (10 sec: 40960.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 2590146560. Throughput: 0: 43072.9. Samples: 2590307480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:08:56,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 15:08:58,050][12883] Updated weights for policy 0, policy_version 158094 (0.0032) [2024-06-18 15:09:01,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2590343168. Throughput: 0: 42983.6. Samples: 2590435660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:01,994][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 15:09:02,344][12883] Updated weights for policy 0, policy_version 158104 (0.0032) [2024-06-18 15:09:05,628][12883] Updated weights for policy 0, policy_version 158114 (0.0042) [2024-06-18 15:09:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2590572544. Throughput: 0: 42642.6. Samples: 2590681960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:06,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 15:09:10,262][12883] Updated weights for policy 0, policy_version 158124 (0.0035) [2024-06-18 15:09:11,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42654.6). Total num frames: 2590769152. Throughput: 0: 42711.0. Samples: 2590939100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:11,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 15:09:13,469][12883] Updated weights for policy 0, policy_version 158134 (0.0028) [2024-06-18 15:09:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42709.6). Total num frames: 2590982144. Throughput: 0: 42515.5. Samples: 2591064560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:16,994][12645] Avg episode reward: [(0, '0.711')] [2024-06-18 15:09:18,059][12883] Updated weights for policy 0, policy_version 158144 (0.0033) [2024-06-18 15:09:21,042][12883] Updated weights for policy 0, policy_version 158154 (0.0031) [2024-06-18 15:09:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2591211520. Throughput: 0: 42397.7. Samples: 2591319440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:21,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 15:09:25,831][12883] Updated weights for policy 0, policy_version 158164 (0.0036) [2024-06-18 15:09:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 2591408128. Throughput: 0: 42569.3. Samples: 2591580260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:26,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 15:09:28,807][12883] Updated weights for policy 0, policy_version 158174 (0.0041) [2024-06-18 15:09:31,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 2591637504. Throughput: 0: 42332.1. Samples: 2591700700. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:31,994][12645] Avg episode reward: [(0, '0.373')] [2024-06-18 15:09:33,366][12883] Updated weights for policy 0, policy_version 158184 (0.0037) [2024-06-18 15:09:36,515][12883] Updated weights for policy 0, policy_version 158194 (0.0038) [2024-06-18 15:09:36,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2591866880. Throughput: 0: 42565.4. Samples: 2591962360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:36,994][12645] Avg episode reward: [(0, '0.586')] [2024-06-18 15:09:39,068][12862] Signal inference workers to stop experience collection... (37900 times) [2024-06-18 15:09:39,068][12862] Signal inference workers to resume experience collection... (37900 times) [2024-06-18 15:09:39,118][12883] InferenceWorker_p0-w0: stopping experience collection (37900 times) [2024-06-18 15:09:39,118][12883] InferenceWorker_p0-w0: resuming experience collection (37900 times) [2024-06-18 15:09:41,080][12883] Updated weights for policy 0, policy_version 158204 (0.0045) [2024-06-18 15:09:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2592047104. Throughput: 0: 42486.1. Samples: 2592219360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:41,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 15:09:44,211][12883] Updated weights for policy 0, policy_version 158214 (0.0027) [2024-06-18 15:09:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2592276480. Throughput: 0: 42285.7. Samples: 2592338520. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:46,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 15:09:49,006][12883] Updated weights for policy 0, policy_version 158224 (0.0038) [2024-06-18 15:09:51,755][12883] Updated weights for policy 0, policy_version 158234 (0.0032) [2024-06-18 15:09:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2592505856. Throughput: 0: 42620.5. Samples: 2592599880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 15:09:51,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 15:09:56,954][12883] Updated weights for policy 0, policy_version 158244 (0.0027) [2024-06-18 15:09:56,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 2592669696. Throughput: 0: 42633.3. Samples: 2592857600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:09:56,994][12645] Avg episode reward: [(0, '0.352')] [2024-06-18 15:09:59,466][12883] Updated weights for policy 0, policy_version 158254 (0.0035) [2024-06-18 15:10:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2592899072. Throughput: 0: 42351.1. Samples: 2592970360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:02,003][12645] Avg episode reward: [(0, '0.707')] [2024-06-18 15:10:04,507][12883] Updated weights for policy 0, policy_version 158264 (0.0030) [2024-06-18 15:10:06,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2593144832. Throughput: 0: 42737.2. Samples: 2593242620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:06,994][12645] Avg episode reward: [(0, '0.788')] [2024-06-18 15:10:07,148][12883] Updated weights for policy 0, policy_version 158274 (0.0029) [2024-06-18 15:10:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2593308672. Throughput: 0: 42463.5. Samples: 2593491120. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:11,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 15:10:12,103][12883] Updated weights for policy 0, policy_version 158284 (0.0043) [2024-06-18 15:10:15,509][12883] Updated weights for policy 0, policy_version 158294 (0.0035) [2024-06-18 15:10:16,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2593554432. Throughput: 0: 42374.8. Samples: 2593607560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:16,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 15:10:19,754][12883] Updated weights for policy 0, policy_version 158304 (0.0028) [2024-06-18 15:10:21,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2593767424. Throughput: 0: 42609.9. Samples: 2593879800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:21,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 15:10:23,069][12883] Updated weights for policy 0, policy_version 158314 (0.0049) [2024-06-18 15:10:26,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2593947648. Throughput: 0: 42539.2. Samples: 2594133620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:26,994][12645] Avg episode reward: [(0, '0.266')] [2024-06-18 15:10:27,460][12883] Updated weights for policy 0, policy_version 158324 (0.0041) [2024-06-18 15:10:30,481][12883] Updated weights for policy 0, policy_version 158334 (0.0031) [2024-06-18 15:10:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2594193408. Throughput: 0: 42478.7. Samples: 2594250060. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:31,994][12645] Avg episode reward: [(0, '0.682')] [2024-06-18 15:10:35,111][12883] Updated weights for policy 0, policy_version 158344 (0.0043) [2024-06-18 15:10:36,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 2594390016. Throughput: 0: 42575.2. Samples: 2594515760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:36,994][12645] Avg episode reward: [(0, '0.635')] [2024-06-18 15:10:37,064][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158350_2594406400.pth... [2024-06-18 15:10:37,109][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000157725_2584166400.pth [2024-06-18 15:10:38,285][12883] Updated weights for policy 0, policy_version 158354 (0.0027) [2024-06-18 15:10:39,464][12862] Signal inference workers to stop experience collection... (37950 times) [2024-06-18 15:10:39,464][12862] Signal inference workers to resume experience collection... (37950 times) [2024-06-18 15:10:39,511][12883] InferenceWorker_p0-w0: stopping experience collection (37950 times) [2024-06-18 15:10:39,512][12883] InferenceWorker_p0-w0: resuming experience collection (37950 times) [2024-06-18 15:10:41,996][12645] Fps is (10 sec: 39313.0, 60 sec: 42323.8, 300 sec: 42598.1). Total num frames: 2594586624. Throughput: 0: 42355.3. Samples: 2594763680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:41,996][12645] Avg episode reward: [(0, '0.654')] [2024-06-18 15:10:43,181][12883] Updated weights for policy 0, policy_version 158364 (0.0034) [2024-06-18 15:10:46,100][12883] Updated weights for policy 0, policy_version 158374 (0.0035) [2024-06-18 15:10:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.9). Total num frames: 2594832384. Throughput: 0: 42726.8. Samples: 2594893060. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:46,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 15:10:50,687][12883] Updated weights for policy 0, policy_version 158384 (0.0031) [2024-06-18 15:10:51,994][12645] Fps is (10 sec: 42607.6, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 2595012608. Throughput: 0: 42402.2. Samples: 2595150720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:51,994][12645] Avg episode reward: [(0, '0.271')] [2024-06-18 15:10:53,728][12883] Updated weights for policy 0, policy_version 158394 (0.0032) [2024-06-18 15:10:56,994][12645] Fps is (10 sec: 39319.4, 60 sec: 42598.1, 300 sec: 42543.1). Total num frames: 2595225600. Throughput: 0: 42314.6. Samples: 2595395300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-18 15:10:56,995][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 15:10:58,652][12883] Updated weights for policy 0, policy_version 158404 (0.0038) [2024-06-18 15:11:01,500][12883] Updated weights for policy 0, policy_version 158414 (0.0031) [2024-06-18 15:11:01,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2595471360. Throughput: 0: 42725.6. Samples: 2595530220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:01,994][12645] Avg episode reward: [(0, '0.309')] [2024-06-18 15:11:06,402][12883] Updated weights for policy 0, policy_version 158424 (0.0033) [2024-06-18 15:11:06,994][12645] Fps is (10 sec: 42600.6, 60 sec: 41779.3, 300 sec: 42599.1). Total num frames: 2595651584. Throughput: 0: 42333.7. Samples: 2595784820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:06,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 15:11:09,221][12883] Updated weights for policy 0, policy_version 158434 (0.0031) [2024-06-18 15:11:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2595880960. Throughput: 0: 42238.6. Samples: 2596034360. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:11,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 15:11:13,926][12883] Updated weights for policy 0, policy_version 158444 (0.0030) [2024-06-18 15:11:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2596093952. Throughput: 0: 42652.5. Samples: 2596169420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:16,994][12645] Avg episode reward: [(0, '0.218')] [2024-06-18 15:11:17,057][12883] Updated weights for policy 0, policy_version 158454 (0.0041) [2024-06-18 15:11:21,638][12883] Updated weights for policy 0, policy_version 158464 (0.0040) [2024-06-18 15:11:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2596290560. Throughput: 0: 42391.0. Samples: 2596423360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:21,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 15:11:24,984][12883] Updated weights for policy 0, policy_version 158474 (0.0041) [2024-06-18 15:11:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2596519936. Throughput: 0: 42295.8. Samples: 2596666900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:26,994][12645] Avg episode reward: [(0, '0.695')] [2024-06-18 15:11:29,379][12883] Updated weights for policy 0, policy_version 158484 (0.0024) [2024-06-18 15:11:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42599.1). Total num frames: 2596732928. Throughput: 0: 42510.2. Samples: 2596806020. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:31,994][12645] Avg episode reward: [(0, '0.685')] [2024-06-18 15:11:32,525][12883] Updated weights for policy 0, policy_version 158494 (0.0037) [2024-06-18 15:11:36,938][12883] Updated weights for policy 0, policy_version 158504 (0.0027) [2024-06-18 15:11:36,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2596929536. Throughput: 0: 42392.2. Samples: 2597058360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:36,994][12645] Avg episode reward: [(0, '0.737')] [2024-06-18 15:11:40,431][12883] Updated weights for policy 0, policy_version 158514 (0.0043) [2024-06-18 15:11:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43146.2, 300 sec: 42653.9). Total num frames: 2597175296. Throughput: 0: 42391.6. Samples: 2597302900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:41,994][12645] Avg episode reward: [(0, '0.671')] [2024-06-18 15:11:44,658][12883] Updated weights for policy 0, policy_version 158524 (0.0027) [2024-06-18 15:11:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2597371904. Throughput: 0: 42416.1. Samples: 2597438940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:46,994][12645] Avg episode reward: [(0, '0.700')] [2024-06-18 15:11:48,087][12883] Updated weights for policy 0, policy_version 158534 (0.0028) [2024-06-18 15:11:51,994][12645] Fps is (10 sec: 36044.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2597535744. Throughput: 0: 42232.4. Samples: 2597685280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:51,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 15:11:52,337][12883] Updated weights for policy 0, policy_version 158544 (0.0032) [2024-06-18 15:11:55,892][12883] Updated weights for policy 0, policy_version 158554 (0.0033) [2024-06-18 15:11:56,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.8, 300 sec: 42487.6). Total num frames: 2597781504. Throughput: 0: 42312.1. Samples: 2597938400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 15:11:56,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 15:12:00,349][12883] Updated weights for policy 0, policy_version 158564 (0.0041) [2024-06-18 15:12:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2597994496. Throughput: 0: 42367.1. Samples: 2598075940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:01,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 15:12:03,601][12883] Updated weights for policy 0, policy_version 158574 (0.0033) [2024-06-18 15:12:06,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.8, 300 sec: 42542.9). Total num frames: 2598191104. Throughput: 0: 42258.4. Samples: 2598325080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:06,996][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 15:12:07,903][12883] Updated weights for policy 0, policy_version 158584 (0.0046) [2024-06-18 15:12:09,748][12862] Signal inference workers to stop experience collection... (38000 times) [2024-06-18 15:12:09,748][12862] Signal inference workers to resume experience collection... 
(38000 times) [2024-06-18 15:12:09,768][12883] InferenceWorker_p0-w0: stopping experience collection (38000 times) [2024-06-18 15:12:09,769][12883] InferenceWorker_p0-w0: resuming experience collection (38000 times) [2024-06-18 15:12:11,085][12883] Updated weights for policy 0, policy_version 158594 (0.0029) [2024-06-18 15:12:11,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42543.2). Total num frames: 2598436864. Throughput: 0: 42535.6. Samples: 2598581000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:11,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 15:12:15,700][12883] Updated weights for policy 0, policy_version 158604 (0.0040) [2024-06-18 15:12:16,994][12645] Fps is (10 sec: 44246.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2598633472. Throughput: 0: 42434.2. Samples: 2598715560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:16,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 15:12:18,718][12883] Updated weights for policy 0, policy_version 158614 (0.0028) [2024-06-18 15:12:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2598830080. Throughput: 0: 42347.1. Samples: 2598963980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:21,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 15:12:23,191][12883] Updated weights for policy 0, policy_version 158624 (0.0039) [2024-06-18 15:12:26,356][12883] Updated weights for policy 0, policy_version 158634 (0.0034) [2024-06-18 15:12:26,998][12645] Fps is (10 sec: 44216.3, 60 sec: 42595.1, 300 sec: 42542.2). Total num frames: 2599075840. Throughput: 0: 42641.3. Samples: 2599221960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:26,999][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 15:12:30,897][12883] Updated weights for policy 0, policy_version 158644 (0.0048) [2024-06-18 15:12:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2599272448. Throughput: 0: 42499.9. Samples: 2599351440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:31,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 15:12:34,079][12883] Updated weights for policy 0, policy_version 158654 (0.0042) [2024-06-18 15:12:36,994][12645] Fps is (10 sec: 39339.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2599469056. Throughput: 0: 42593.8. Samples: 2599602000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:36,994][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 15:12:37,017][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158659_2599469056.pth... [2024-06-18 15:12:37,083][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158038_2589294592.pth [2024-06-18 15:12:38,374][12883] Updated weights for policy 0, policy_version 158664 (0.0029) [2024-06-18 15:12:41,965][12883] Updated weights for policy 0, policy_version 158674 (0.0037) [2024-06-18 15:12:41,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2599714816. Throughput: 0: 42571.5. Samples: 2599854120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:41,994][12645] Avg episode reward: [(0, '0.723')] [2024-06-18 15:12:45,936][12883] Updated weights for policy 0, policy_version 158684 (0.0040) [2024-06-18 15:12:46,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2599911424. Throughput: 0: 42537.7. Samples: 2599990140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:46,994][12645] Avg episode reward: [(0, '0.803')] [2024-06-18 15:12:49,555][12883] Updated weights for policy 0, policy_version 158694 (0.0034) [2024-06-18 15:12:51,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2600108032. Throughput: 0: 42455.9. Samples: 2600235500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:51,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 15:12:53,904][12883] Updated weights for policy 0, policy_version 158704 (0.0034) [2024-06-18 15:12:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2600353792. Throughput: 0: 42637.8. Samples: 2600499700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:12:56,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 15:12:57,087][12883] Updated weights for policy 0, policy_version 158714 (0.0034) [2024-06-18 15:13:01,482][12883] Updated weights for policy 0, policy_version 158724 (0.0026) [2024-06-18 15:13:01,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2600566784. Throughput: 0: 42531.1. Samples: 2600629460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:13:01,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 15:13:04,856][12883] Updated weights for policy 0, policy_version 158734 (0.0025) [2024-06-18 15:13:06,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42599.9, 300 sec: 42542.8). Total num frames: 2600747008. Throughput: 0: 42652.7. Samples: 2600883360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:06,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 15:13:08,889][12883] Updated weights for policy 0, policy_version 158744 (0.0030) [2024-06-18 15:13:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2600992768. Throughput: 0: 42679.6. Samples: 2601142340. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:11,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 15:13:12,391][12883] Updated weights for policy 0, policy_version 158754 (0.0045) [2024-06-18 15:13:16,554][12883] Updated weights for policy 0, policy_version 158764 (0.0037) [2024-06-18 15:13:16,994][12645] Fps is (10 sec: 47514.1, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 2601222144. Throughput: 0: 42847.7. Samples: 2601279580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:16,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 15:13:19,999][12883] Updated weights for policy 0, policy_version 158774 (0.0032) [2024-06-18 15:13:22,000][12645] Fps is (10 sec: 39297.0, 60 sec: 42593.9, 300 sec: 42431.2). Total num frames: 2601385984. Throughput: 0: 42813.2. Samples: 2601528860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:22,000][12645] Avg episode reward: [(0, '0.366')] [2024-06-18 15:13:24,277][12883] Updated weights for policy 0, policy_version 158784 (0.0048) [2024-06-18 15:13:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42601.7, 300 sec: 42487.3). Total num frames: 2601631744. Throughput: 0: 42867.5. Samples: 2601783160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:26,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 15:13:27,783][12883] Updated weights for policy 0, policy_version 158794 (0.0035) [2024-06-18 15:13:31,916][12883] Updated weights for policy 0, policy_version 158804 (0.0027) [2024-06-18 15:13:31,994][12645] Fps is (10 sec: 45904.2, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 2601844736. Throughput: 0: 42923.7. Samples: 2601921700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:31,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 15:13:35,351][12883] Updated weights for policy 0, policy_version 158814 (0.0027) [2024-06-18 15:13:36,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2602041344. Throughput: 0: 42882.2. Samples: 2602165200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:36,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 15:13:39,683][12883] Updated weights for policy 0, policy_version 158824 (0.0033) [2024-06-18 15:13:41,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 2602270720. Throughput: 0: 42768.3. Samples: 2602424360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:41,996][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 15:13:42,532][12862] Signal inference workers to stop experience collection... (38050 times) [2024-06-18 15:13:42,533][12862] Signal inference workers to resume experience collection... (38050 times) [2024-06-18 15:13:42,575][12883] InferenceWorker_p0-w0: stopping experience collection (38050 times) [2024-06-18 15:13:42,575][12883] InferenceWorker_p0-w0: resuming experience collection (38050 times) [2024-06-18 15:13:42,871][12883] Updated weights for policy 0, policy_version 158834 (0.0028) [2024-06-18 15:13:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 2602467328. Throughput: 0: 42881.8. Samples: 2602559140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:46,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 15:13:47,270][12883] Updated weights for policy 0, policy_version 158844 (0.0041) [2024-06-18 15:13:50,737][12883] Updated weights for policy 0, policy_version 158854 (0.0044) [2024-06-18 15:13:51,994][12645] Fps is (10 sec: 42607.1, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 2602696704. 
Throughput: 0: 42714.8. Samples: 2602805520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:51,994][12645] Avg episode reward: [(0, '0.645')] [2024-06-18 15:13:55,357][12883] Updated weights for policy 0, policy_version 158864 (0.0040) [2024-06-18 15:13:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2602909696. Throughput: 0: 42624.4. Samples: 2603060440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:13:56,994][12645] Avg episode reward: [(0, '0.767')] [2024-06-18 15:13:58,440][12883] Updated weights for policy 0, policy_version 158874 (0.0021) [2024-06-18 15:14:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 2603106304. Throughput: 0: 42532.1. Samples: 2603193520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:14:01,994][12645] Avg episode reward: [(0, '0.762')] [2024-06-18 15:14:02,852][12883] Updated weights for policy 0, policy_version 158884 (0.0039) [2024-06-18 15:14:06,133][12883] Updated weights for policy 0, policy_version 158894 (0.0037) [2024-06-18 15:14:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2603335680. Throughput: 0: 42672.1. Samples: 2603448840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-18 15:14:07,003][12645] Avg episode reward: [(0, '0.392')] [2024-06-18 15:14:10,468][12883] Updated weights for policy 0, policy_version 158904 (0.0029) [2024-06-18 15:14:11,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2603548672. Throughput: 0: 42627.1. Samples: 2603701380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:11,994][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 15:14:13,631][12883] Updated weights for policy 0, policy_version 158914 (0.0034) [2024-06-18 15:14:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2603745280. 
Throughput: 0: 42538.2. Samples: 2603835920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:16,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 15:14:17,955][12883] Updated weights for policy 0, policy_version 158924 (0.0039) [2024-06-18 15:14:21,312][12883] Updated weights for policy 0, policy_version 158934 (0.0029) [2024-06-18 15:14:21,994][12645] Fps is (10 sec: 44236.8, 60 sec: 43422.1, 300 sec: 42653.9). Total num frames: 2603991040. Throughput: 0: 42828.0. Samples: 2604092460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:21,994][12645] Avg episode reward: [(0, '0.733')] [2024-06-18 15:14:25,924][12883] Updated weights for policy 0, policy_version 158944 (0.0026) [2024-06-18 15:14:26,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2604204032. Throughput: 0: 42706.3. Samples: 2604346060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:26,994][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 15:14:29,001][12883] Updated weights for policy 0, policy_version 158954 (0.0041) [2024-06-18 15:14:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2604384256. Throughput: 0: 42496.5. Samples: 2604471480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:31,994][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 15:14:33,567][12883] Updated weights for policy 0, policy_version 158964 (0.0045) [2024-06-18 15:14:36,537][12883] Updated weights for policy 0, policy_version 158974 (0.0037) [2024-06-18 15:14:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2604630016. Throughput: 0: 42680.0. Samples: 2604726120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:36,994][12645] Avg episode reward: [(0, '0.406')] [2024-06-18 15:14:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158974_2604630016.pth... 
[2024-06-18 15:14:37,079][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158350_2594406400.pth [2024-06-18 15:14:41,299][12883] Updated weights for policy 0, policy_version 158984 (0.0028) [2024-06-18 15:14:41,994][12645] Fps is (10 sec: 44235.6, 60 sec: 42599.7, 300 sec: 42542.8). Total num frames: 2604826624. Throughput: 0: 42730.1. Samples: 2604983300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:41,994][12645] Avg episode reward: [(0, '0.294')] [2024-06-18 15:14:44,155][12883] Updated weights for policy 0, policy_version 158994 (0.0031) [2024-06-18 15:14:46,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2605023232. Throughput: 0: 42460.0. Samples: 2605104220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:46,994][12645] Avg episode reward: [(0, '0.670')] [2024-06-18 15:14:48,934][12883] Updated weights for policy 0, policy_version 159004 (0.0045) [2024-06-18 15:14:51,755][12883] Updated weights for policy 0, policy_version 159014 (0.0050) [2024-06-18 15:14:51,996][12645] Fps is (10 sec: 45865.8, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 2605285376. Throughput: 0: 42549.5. Samples: 2605363660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:51,996][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 15:14:54,368][12862] Signal inference workers to stop experience collection... (38100 times) [2024-06-18 15:14:54,372][12862] Signal inference workers to resume experience collection... (38100 times) [2024-06-18 15:14:54,416][12883] InferenceWorker_p0-w0: stopping experience collection (38100 times) [2024-06-18 15:14:54,420][12883] InferenceWorker_p0-w0: resuming experience collection (38100 times) [2024-06-18 15:14:56,819][12883] Updated weights for policy 0, policy_version 159024 (0.0026) [2024-06-18 15:14:56,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2605465600. 
Throughput: 0: 42781.2. Samples: 2605626540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:14:56,994][12645] Avg episode reward: [(0, '0.390')] [2024-06-18 15:14:59,685][12883] Updated weights for policy 0, policy_version 159034 (0.0026) [2024-06-18 15:15:01,994][12645] Fps is (10 sec: 37691.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2605662208. Throughput: 0: 42473.4. Samples: 2605747220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:15:01,994][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 15:15:04,530][12883] Updated weights for policy 0, policy_version 159044 (0.0026) [2024-06-18 15:15:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2605924352. Throughput: 0: 42753.2. Samples: 2606016360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:15:06,994][12645] Avg episode reward: [(0, '0.621')] [2024-06-18 15:15:07,587][12883] Updated weights for policy 0, policy_version 159054 (0.0030) [2024-06-18 15:15:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 2606088192. Throughput: 0: 42745.7. Samples: 2606269620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 15:15:11,994][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 15:15:12,234][12883] Updated weights for policy 0, policy_version 159064 (0.0029) [2024-06-18 15:15:14,992][12883] Updated weights for policy 0, policy_version 159074 (0.0033) [2024-06-18 15:15:16,996][12645] Fps is (10 sec: 39313.0, 60 sec: 42869.8, 300 sec: 42542.5). Total num frames: 2606317568. Throughput: 0: 42584.0. Samples: 2606387860. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:15:16,997][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 15:15:19,718][12883] Updated weights for policy 0, policy_version 159084 (0.0046) [2024-06-18 15:15:21,994][12645] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2606546944. 
Throughput: 0: 42838.4. Samples: 2606653840. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:15:21,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 15:15:22,639][12883] Updated weights for policy 0, policy_version 159094 (0.0040) [2024-06-18 15:15:26,994][12645] Fps is (10 sec: 42608.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2606743552. Throughput: 0: 42958.8. Samples: 2606916440. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:15:26,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 15:15:27,218][12883] Updated weights for policy 0, policy_version 159104 (0.0023) [2024-06-18 15:15:29,958][12883] Updated weights for policy 0, policy_version 159114 (0.0025) [2024-06-18 15:15:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2606972928. Throughput: 0: 42955.0. Samples: 2607037200. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:15:31,994][12645] Avg episode reward: [(0, '0.362')] [2024-06-18 15:15:34,852][12883] Updated weights for policy 0, policy_version 159124 (0.0046) [2024-06-18 15:15:36,994][12645] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42765.3). Total num frames: 2607202304. Throughput: 0: 43026.0. Samples: 2607299740. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:15:36,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 15:15:37,896][12883] Updated weights for policy 0, policy_version 159134 (0.0033) [2024-06-18 15:15:41,996][12645] Fps is (10 sec: 39313.1, 60 sec: 42323.9, 300 sec: 42487.0). Total num frames: 2607366144. Throughput: 0: 42937.5. Samples: 2607558820. 
Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:15:41,996][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 15:15:42,385][12883] Updated weights for policy 0, policy_version 159144 (0.0039) [2024-06-18 15:15:45,744][12883] Updated weights for policy 0, policy_version 159154 (0.0043) [2024-06-18 15:15:46,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2607611904. Throughput: 0: 42887.0. Samples: 2607677140. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:15:46,994][12645] Avg episode reward: [(0, '0.389')] [2024-06-18 15:15:50,077][12883] Updated weights for policy 0, policy_version 159164 (0.0038) [2024-06-18 15:15:51,994][12645] Fps is (10 sec: 44246.8, 60 sec: 42053.8, 300 sec: 42654.0). Total num frames: 2607808512. Throughput: 0: 42625.0. Samples: 2607934480. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:15:51,994][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 15:15:53,387][12883] Updated weights for policy 0, policy_version 159174 (0.0042) [2024-06-18 15:15:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2608005120. Throughput: 0: 42673.0. Samples: 2608189900. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:15:56,994][12645] Avg episode reward: [(0, '0.625')] [2024-06-18 15:15:57,849][12883] Updated weights for policy 0, policy_version 159184 (0.0028) [2024-06-18 15:16:00,972][12883] Updated weights for policy 0, policy_version 159194 (0.0037) [2024-06-18 15:16:01,994][12645] Fps is (10 sec: 45874.7, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 2608267264. Throughput: 0: 42871.4. Samples: 2608316980. 
Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:16:01,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 15:16:05,775][12883] Updated weights for policy 0, policy_version 159204 (0.0032) [2024-06-18 15:16:06,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2608447488. Throughput: 0: 42720.7. Samples: 2608576280. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:16:06,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 15:16:08,818][12883] Updated weights for policy 0, policy_version 159214 (0.0036) [2024-06-18 15:16:11,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 2608644096. Throughput: 0: 42535.1. Samples: 2608830520. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:16:11,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 15:16:13,548][12883] Updated weights for policy 0, policy_version 159224 (0.0036) [2024-06-18 15:16:16,467][12883] Updated weights for policy 0, policy_version 159234 (0.0047) [2024-06-18 15:16:16,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43146.1, 300 sec: 42765.0). Total num frames: 2608906240. Throughput: 0: 42643.5. Samples: 2608956160. Policy #0 lag: (min: 2.0, avg: 12.0, max: 22.0) [2024-06-18 15:16:16,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 15:16:21,042][12862] Signal inference workers to stop experience collection... (38150 times) [2024-06-18 15:16:21,043][12862] Signal inference workers to resume experience collection... (38150 times) [2024-06-18 15:16:21,072][12883] InferenceWorker_p0-w0: stopping experience collection (38150 times) [2024-06-18 15:16:21,073][12883] InferenceWorker_p0-w0: resuming experience collection (38150 times) [2024-06-18 15:16:21,180][12883] Updated weights for policy 0, policy_version 159244 (0.0037) [2024-06-18 15:16:21,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2609086464. 
Throughput: 0: 42613.8. Samples: 2609217360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:16:21,994][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 15:16:24,625][12883] Updated weights for policy 0, policy_version 159254 (0.0036) [2024-06-18 15:16:26,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2609299456. Throughput: 0: 42516.3. Samples: 2609471960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:16:26,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 15:16:29,082][12883] Updated weights for policy 0, policy_version 159264 (0.0037) [2024-06-18 15:16:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 2609528832. Throughput: 0: 42640.8. Samples: 2609595980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:16:31,994][12645] Avg episode reward: [(0, '0.440')] [2024-06-18 15:16:32,254][12883] Updated weights for policy 0, policy_version 159274 (0.0036) [2024-06-18 15:16:36,818][12883] Updated weights for policy 0, policy_version 159284 (0.0028) [2024-06-18 15:16:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 2609725440. Throughput: 0: 42588.8. Samples: 2609850980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:16:36,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 15:16:37,009][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159285_2609725440.pth... [2024-06-18 15:16:37,067][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158659_2599469056.pth [2024-06-18 15:16:39,698][12883] Updated weights for policy 0, policy_version 159294 (0.0027) [2024-06-18 15:16:41,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42873.1, 300 sec: 42598.4). Total num frames: 2609938432. Throughput: 0: 42586.7. Samples: 2610106300. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:16:41,994][12645] Avg episode reward: [(0, '0.644')] [2024-06-18 15:16:44,380][12883] Updated weights for policy 0, policy_version 159304 (0.0029) [2024-06-18 15:16:46,994][12645] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2610184192. Throughput: 0: 42722.2. Samples: 2610239480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:16:46,995][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 15:16:47,164][12883] Updated weights for policy 0, policy_version 159314 (0.0027) [2024-06-18 15:16:51,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2610348032. Throughput: 0: 42575.1. Samples: 2610492160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:16:51,994][12645] Avg episode reward: [(0, '0.660')] [2024-06-18 15:16:52,277][12883] Updated weights for policy 0, policy_version 159324 (0.0040) [2024-06-18 15:16:55,083][12883] Updated weights for policy 0, policy_version 159334 (0.0048) [2024-06-18 15:16:56,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2610593792. Throughput: 0: 42663.4. Samples: 2610750380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:16:56,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 15:16:59,787][12883] Updated weights for policy 0, policy_version 159344 (0.0027) [2024-06-18 15:17:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 2610806784. Throughput: 0: 42759.6. Samples: 2610880340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:17:01,994][12645] Avg episode reward: [(0, '0.136')] [2024-06-18 15:17:02,706][12883] Updated weights for policy 0, policy_version 159354 (0.0030) [2024-06-18 15:17:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2611003392. Throughput: 0: 42586.2. Samples: 2611133740. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:17:06,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 15:17:07,236][12883] Updated weights for policy 0, policy_version 159364 (0.0037) [2024-06-18 15:17:10,442][12883] Updated weights for policy 0, policy_version 159374 (0.0034) [2024-06-18 15:17:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2611232768. Throughput: 0: 42672.6. Samples: 2611392220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:17:11,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 15:17:14,859][12883] Updated weights for policy 0, policy_version 159384 (0.0028) [2024-06-18 15:17:16,994][12645] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2611445760. Throughput: 0: 42835.3. Samples: 2611523560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:17:16,994][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 15:17:17,989][12883] Updated weights for policy 0, policy_version 159394 (0.0063) [2024-06-18 15:17:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42599.1). Total num frames: 2611642368. Throughput: 0: 42694.8. Samples: 2611772240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 15:17:21,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 15:17:22,690][12883] Updated weights for policy 0, policy_version 159404 (0.0033) [2024-06-18 15:17:25,669][12883] Updated weights for policy 0, policy_version 159414 (0.0036) [2024-06-18 15:17:26,996][12645] Fps is (10 sec: 42587.9, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 2611871744. Throughput: 0: 42671.5. Samples: 2612026620. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:17:26,997][12645] Avg episode reward: [(0, '0.562')] [2024-06-18 15:17:30,304][12883] Updated weights for policy 0, policy_version 159424 (0.0035) [2024-06-18 15:17:31,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2612101120. Throughput: 0: 42787.5. Samples: 2612164920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:17:31,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 15:17:33,353][12883] Updated weights for policy 0, policy_version 159434 (0.0026) [2024-06-18 15:17:36,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2612281344. Throughput: 0: 42611.0. Samples: 2612409660. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:17:36,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 15:17:37,850][12883] Updated weights for policy 0, policy_version 159444 (0.0027) [2024-06-18 15:17:41,277][12883] Updated weights for policy 0, policy_version 159454 (0.0032) [2024-06-18 15:17:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2612510720. Throughput: 0: 42475.3. Samples: 2612661760. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:17:41,994][12645] Avg episode reward: [(0, '0.499')] [2024-06-18 15:17:44,892][12862] Signal inference workers to stop experience collection... (38200 times) [2024-06-18 15:17:44,946][12862] Signal inference workers to resume experience collection... (38200 times) [2024-06-18 15:17:44,946][12883] InferenceWorker_p0-w0: stopping experience collection (38200 times) [2024-06-18 15:17:44,961][12883] InferenceWorker_p0-w0: resuming experience collection (38200 times) [2024-06-18 15:17:45,665][12883] Updated weights for policy 0, policy_version 159464 (0.0043) [2024-06-18 15:17:46,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2612723712. Throughput: 0: 42560.5. 
Samples: 2612795560. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:17:46,994][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 15:17:48,989][12883] Updated weights for policy 0, policy_version 159474 (0.0035) [2024-06-18 15:17:51,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2612920320. Throughput: 0: 42614.7. Samples: 2613051400. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:17:51,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 15:17:53,605][12883] Updated weights for policy 0, policy_version 159484 (0.0033) [2024-06-18 15:17:56,675][12883] Updated weights for policy 0, policy_version 159494 (0.0022) [2024-06-18 15:17:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2613166080. Throughput: 0: 42485.7. Samples: 2613304080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:17:56,994][12645] Avg episode reward: [(0, '0.311')] [2024-06-18 15:18:01,171][12883] Updated weights for policy 0, policy_version 159504 (0.0031) [2024-06-18 15:18:01,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42596.9, 300 sec: 42764.7). Total num frames: 2613362688. Throughput: 0: 42554.3. Samples: 2613438600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:18:01,996][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 15:18:04,332][12883] Updated weights for policy 0, policy_version 159514 (0.0045) [2024-06-18 15:18:06,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2613559296. Throughput: 0: 42596.3. Samples: 2613689080. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:18:06,994][12645] Avg episode reward: [(0, '0.477')] [2024-06-18 15:18:08,776][12883] Updated weights for policy 0, policy_version 159524 (0.0038) [2024-06-18 15:18:11,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2613788672. Throughput: 0: 42560.4. 
Samples: 2613941740. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:18:11,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 15:18:12,131][12883] Updated weights for policy 0, policy_version 159534 (0.0040) [2024-06-18 15:18:16,198][12883] Updated weights for policy 0, policy_version 159544 (0.0033) [2024-06-18 15:18:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 2614001664. Throughput: 0: 42432.5. Samples: 2614074380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:18:16,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 15:18:19,941][12883] Updated weights for policy 0, policy_version 159554 (0.0031) [2024-06-18 15:18:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2614214656. Throughput: 0: 42623.1. Samples: 2614327700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:18:21,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 15:18:23,777][12883] Updated weights for policy 0, policy_version 159564 (0.0042) [2024-06-18 15:18:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42327.1, 300 sec: 42598.4). Total num frames: 2614411264. Throughput: 0: 42663.1. Samples: 2614581600. Policy #0 lag: (min: 1.0, avg: 9.0, max: 22.0) [2024-06-18 15:18:26,994][12645] Avg episode reward: [(0, '0.580')] [2024-06-18 15:18:27,645][12883] Updated weights for policy 0, policy_version 159574 (0.0034) [2024-06-18 15:18:31,348][12883] Updated weights for policy 0, policy_version 159584 (0.0025) [2024-06-18 15:18:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2614640640. Throughput: 0: 42601.8. Samples: 2614712640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:18:31,994][12645] Avg episode reward: [(0, '0.639')] [2024-06-18 15:18:35,532][12883] Updated weights for policy 0, policy_version 159594 (0.0037) [2024-06-18 15:18:36,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2614837248. Throughput: 0: 42421.7. Samples: 2614960380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:18:36,994][12645] Avg episode reward: [(0, '0.673')] [2024-06-18 15:18:37,020][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159597_2614837248.pth... [2024-06-18 15:18:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000158974_2604630016.pth [2024-06-18 15:18:39,113][12883] Updated weights for policy 0, policy_version 159604 (0.0034) [2024-06-18 15:18:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2615066624. Throughput: 0: 42327.5. Samples: 2615208820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:18:41,994][12645] Avg episode reward: [(0, '0.583')] [2024-06-18 15:18:43,254][12883] Updated weights for policy 0, policy_version 159614 (0.0035) [2024-06-18 15:18:46,901][12883] Updated weights for policy 0, policy_version 159624 (0.0043) [2024-06-18 15:18:46,996][12645] Fps is (10 sec: 44227.5, 60 sec: 42596.8, 300 sec: 42653.6). Total num frames: 2615279616. Throughput: 0: 42353.3. Samples: 2615344500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:18:46,996][12645] Avg episode reward: [(0, '0.560')] [2024-06-18 15:18:50,768][12883] Updated weights for policy 0, policy_version 159634 (0.0033) [2024-06-18 15:18:51,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2615476224. Throughput: 0: 42385.9. Samples: 2615596440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:18:51,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 15:18:54,762][12883] Updated weights for policy 0, policy_version 159644 (0.0035) [2024-06-18 15:18:56,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2615705600. Throughput: 0: 42393.8. Samples: 2615849460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:18:56,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 15:18:58,447][12883] Updated weights for policy 0, policy_version 159654 (0.0036) [2024-06-18 15:19:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42053.8, 300 sec: 42542.9). Total num frames: 2615885824. Throughput: 0: 42344.8. Samples: 2615979900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:19:01,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 15:19:02,446][12883] Updated weights for policy 0, policy_version 159664 (0.0039) [2024-06-18 15:19:06,307][12883] Updated weights for policy 0, policy_version 159674 (0.0042) [2024-06-18 15:19:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2616115200. Throughput: 0: 42361.8. Samples: 2616233980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:19:06,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 15:19:09,257][12862] Signal inference workers to stop experience collection... (38250 times) [2024-06-18 15:19:09,258][12862] Signal inference workers to resume experience collection... (38250 times) [2024-06-18 15:19:09,268][12883] InferenceWorker_p0-w0: stopping experience collection (38250 times) [2024-06-18 15:19:09,268][12883] InferenceWorker_p0-w0: resuming experience collection (38250 times) [2024-06-18 15:19:10,367][12883] Updated weights for policy 0, policy_version 159684 (0.0032) [2024-06-18 15:19:11,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2616344576. 
Throughput: 0: 42252.3. Samples: 2616482960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:19:11,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 15:19:14,095][12883] Updated weights for policy 0, policy_version 159694 (0.0027) [2024-06-18 15:19:16,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2616524800. Throughput: 0: 42265.7. Samples: 2616614600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:19:16,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 15:19:17,871][12883] Updated weights for policy 0, policy_version 159704 (0.0028) [2024-06-18 15:19:21,644][12883] Updated weights for policy 0, policy_version 159714 (0.0038) [2024-06-18 15:19:21,998][12645] Fps is (10 sec: 42580.4, 60 sec: 42595.4, 300 sec: 42597.8). Total num frames: 2616770560. Throughput: 0: 42481.4. Samples: 2616872220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:19:21,998][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 15:19:25,611][12883] Updated weights for policy 0, policy_version 159724 (0.0028) [2024-06-18 15:19:26,994][12645] Fps is (10 sec: 45876.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2616983552. Throughput: 0: 42551.2. Samples: 2617123620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 15:19:26,994][12645] Avg episode reward: [(0, '0.619')] [2024-06-18 15:19:29,455][12883] Updated weights for policy 0, policy_version 159734 (0.0039) [2024-06-18 15:19:31,996][12645] Fps is (10 sec: 39329.6, 60 sec: 42050.7, 300 sec: 42487.0). Total num frames: 2617163776. Throughput: 0: 42335.6. Samples: 2617249600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:19:31,997][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 15:19:33,320][12883] Updated weights for policy 0, policy_version 159744 (0.0032) [2024-06-18 15:19:36,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2617393152. 
Throughput: 0: 42445.7. Samples: 2617506500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:19:36,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 15:19:37,454][12883] Updated weights for policy 0, policy_version 159754 (0.0038) [2024-06-18 15:19:41,149][12883] Updated weights for policy 0, policy_version 159764 (0.0040) [2024-06-18 15:19:41,994][12645] Fps is (10 sec: 45885.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2617622528. Throughput: 0: 42392.5. Samples: 2617757120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:19:41,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 15:19:44,986][12883] Updated weights for policy 0, policy_version 159774 (0.0035) [2024-06-18 15:19:46,994][12645] Fps is (10 sec: 39322.4, 60 sec: 41780.8, 300 sec: 42376.6). Total num frames: 2617786368. Throughput: 0: 42406.8. Samples: 2617888200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:19:46,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 15:19:48,754][12883] Updated weights for policy 0, policy_version 159784 (0.0039) [2024-06-18 15:19:51,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2618032128. Throughput: 0: 42489.8. Samples: 2618146020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:19:51,994][12645] Avg episode reward: [(0, '0.404')] [2024-06-18 15:19:52,500][12883] Updated weights for policy 0, policy_version 159794 (0.0023) [2024-06-18 15:19:56,435][12883] Updated weights for policy 0, policy_version 159804 (0.0034) [2024-06-18 15:19:56,994][12645] Fps is (10 sec: 47512.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2618261504. Throughput: 0: 42736.0. Samples: 2618406080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:19:56,994][12645] Avg episode reward: [(0, '0.262')] [2024-06-18 15:20:00,093][12883] Updated weights for policy 0, policy_version 159814 (0.0045) [2024-06-18 15:20:01,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 2618458112. Throughput: 0: 42733.4. Samples: 2618537600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:20:01,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 15:20:04,302][12883] Updated weights for policy 0, policy_version 159824 (0.0036) [2024-06-18 15:20:06,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2618671104. Throughput: 0: 42568.4. Samples: 2618787620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:20:06,994][12645] Avg episode reward: [(0, '0.399')] [2024-06-18 15:20:07,664][12883] Updated weights for policy 0, policy_version 159834 (0.0030) [2024-06-18 15:20:10,488][12862] Signal inference workers to stop experience collection... (38300 times) [2024-06-18 15:20:10,541][12862] Signal inference workers to resume experience collection... (38300 times) [2024-06-18 15:20:10,546][12883] InferenceWorker_p0-w0: stopping experience collection (38300 times) [2024-06-18 15:20:10,568][12883] InferenceWorker_p0-w0: resuming experience collection (38300 times) [2024-06-18 15:20:11,832][12883] Updated weights for policy 0, policy_version 159844 (0.0036) [2024-06-18 15:20:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42654.3). Total num frames: 2618900480. Throughput: 0: 42812.7. Samples: 2619050200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:20:11,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 15:20:15,191][12883] Updated weights for policy 0, policy_version 159854 (0.0032) [2024-06-18 15:20:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2619097088. Throughput: 0: 42793.6. 
Samples: 2619175220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:20:16,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 15:20:19,389][12883] Updated weights for policy 0, policy_version 159864 (0.0027) [2024-06-18 15:20:21,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42601.4, 300 sec: 42653.9). Total num frames: 2619326464. Throughput: 0: 42698.7. Samples: 2619427940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:20:21,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 15:20:22,837][12883] Updated weights for policy 0, policy_version 159874 (0.0032) [2024-06-18 15:20:26,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2619523072. Throughput: 0: 43042.3. Samples: 2619694020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:20:26,994][12645] Avg episode reward: [(0, '0.570')] [2024-06-18 15:20:27,007][12883] Updated weights for policy 0, policy_version 159884 (0.0030) [2024-06-18 15:20:30,783][12883] Updated weights for policy 0, policy_version 159894 (0.0033) [2024-06-18 15:20:31,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42873.1, 300 sec: 42487.3). Total num frames: 2619736064. Throughput: 0: 42828.3. Samples: 2619815480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 15:20:31,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 15:20:34,592][12883] Updated weights for policy 0, policy_version 159904 (0.0038) [2024-06-18 15:20:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.8). Total num frames: 2619965440. Throughput: 0: 42694.7. Samples: 2620067280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:20:36,994][12645] Avg episode reward: [(0, '0.252')] [2024-06-18 15:20:37,041][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159911_2619981824.pth... 
[2024-06-18 15:20:37,089][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159285_2609725440.pth [2024-06-18 15:20:38,851][12883] Updated weights for policy 0, policy_version 159914 (0.0026) [2024-06-18 15:20:41,996][12645] Fps is (10 sec: 44227.9, 60 sec: 42597.0, 300 sec: 42598.1). Total num frames: 2620178432. Throughput: 0: 42775.0. Samples: 2620331040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:20:41,996][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 15:20:42,191][12883] Updated weights for policy 0, policy_version 159924 (0.0031) [2024-06-18 15:20:46,583][12883] Updated weights for policy 0, policy_version 159934 (0.0027) [2024-06-18 15:20:46,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 2620375040. Throughput: 0: 42592.0. Samples: 2620454240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:20:46,994][12645] Avg episode reward: [(0, '0.648')] [2024-06-18 15:20:50,014][12883] Updated weights for policy 0, policy_version 159944 (0.0037) [2024-06-18 15:20:51,994][12645] Fps is (10 sec: 40968.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2620588032. Throughput: 0: 42647.7. Samples: 2620706760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:20:51,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 15:20:54,123][12883] Updated weights for policy 0, policy_version 159954 (0.0034) [2024-06-18 15:20:56,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2620801024. Throughput: 0: 42705.8. Samples: 2620971960. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:20:56,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 15:20:57,574][12883] Updated weights for policy 0, policy_version 159964 (0.0031) [2024-06-18 15:21:01,706][12883] Updated weights for policy 0, policy_version 159974 (0.0040) [2024-06-18 15:21:01,996][12645] Fps is (10 sec: 44226.7, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2621030400. Throughput: 0: 42624.2. Samples: 2621093400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:21:01,996][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 15:21:05,641][12883] Updated weights for policy 0, policy_version 159984 (0.0040) [2024-06-18 15:21:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2621243392. Throughput: 0: 42858.3. Samples: 2621356560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:21:06,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 15:21:09,273][12883] Updated weights for policy 0, policy_version 159994 (0.0043) [2024-06-18 15:21:11,994][12645] Fps is (10 sec: 40969.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2621440000. Throughput: 0: 42559.9. Samples: 2621609220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:21:11,994][12645] Avg episode reward: [(0, '0.564')] [2024-06-18 15:21:13,181][12883] Updated weights for policy 0, policy_version 160004 (0.0053) [2024-06-18 15:21:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2621652992. Throughput: 0: 42567.5. Samples: 2621731020. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:21:16,994][12645] Avg episode reward: [(0, '0.608')] [2024-06-18 15:21:17,306][12883] Updated weights for policy 0, policy_version 160014 (0.0041) [2024-06-18 15:21:20,863][12883] Updated weights for policy 0, policy_version 160024 (0.0030) [2024-06-18 15:21:21,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2621898752. Throughput: 0: 42766.6. Samples: 2621991780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:21:21,994][12645] Avg episode reward: [(0, '0.608')] [2024-06-18 15:21:24,668][12883] Updated weights for policy 0, policy_version 160034 (0.0039) [2024-06-18 15:21:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2622078976. Throughput: 0: 42732.1. Samples: 2622253900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:21:26,994][12645] Avg episode reward: [(0, '0.862')] [2024-06-18 15:21:28,379][12883] Updated weights for policy 0, policy_version 160044 (0.0040) [2024-06-18 15:21:31,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2622308352. Throughput: 0: 42718.2. Samples: 2622376560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:21:31,994][12645] Avg episode reward: [(0, '0.686')] [2024-06-18 15:21:32,004][12862] Signal inference workers to stop experience collection... (38350 times) [2024-06-18 15:21:32,005][12862] Signal inference workers to resume experience collection... 
(38350 times) [2024-06-18 15:21:32,027][12883] InferenceWorker_p0-w0: stopping experience collection (38350 times) [2024-06-18 15:21:32,027][12883] InferenceWorker_p0-w0: resuming experience collection (38350 times) [2024-06-18 15:21:32,151][12883] Updated weights for policy 0, policy_version 160054 (0.0027) [2024-06-18 15:21:36,203][12883] Updated weights for policy 0, policy_version 160064 (0.0046) [2024-06-18 15:21:36,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2622537728. Throughput: 0: 42982.2. Samples: 2622640960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 15:21:36,994][12645] Avg episode reward: [(0, '0.652')] [2024-06-18 15:21:39,938][12883] Updated weights for policy 0, policy_version 160074 (0.0028) [2024-06-18 15:21:41,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42326.7, 300 sec: 42487.3). Total num frames: 2622717952. Throughput: 0: 42882.6. Samples: 2622901680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:21:41,994][12645] Avg episode reward: [(0, '0.681')] [2024-06-18 15:21:43,694][12883] Updated weights for policy 0, policy_version 160084 (0.0027) [2024-06-18 15:21:46,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2622963712. Throughput: 0: 42902.1. Samples: 2623023900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:21:46,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 15:21:47,706][12883] Updated weights for policy 0, policy_version 160094 (0.0040) [2024-06-18 15:21:51,218][12883] Updated weights for policy 0, policy_version 160104 (0.0036) [2024-06-18 15:21:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2623176704. Throughput: 0: 42827.0. Samples: 2623283780. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:21:51,994][12645] Avg episode reward: [(0, '0.675')] [2024-06-18 15:21:55,525][12883] Updated weights for policy 0, policy_version 160114 (0.0029) [2024-06-18 15:21:56,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 2623356928. Throughput: 0: 42903.4. Samples: 2623539880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:21:56,994][12645] Avg episode reward: [(0, '0.649')] [2024-06-18 15:21:58,847][12883] Updated weights for policy 0, policy_version 160124 (0.0027) [2024-06-18 15:22:01,996][12645] Fps is (10 sec: 42589.0, 60 sec: 42871.4, 300 sec: 42709.2). Total num frames: 2623602688. Throughput: 0: 42858.3. Samples: 2623659740. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:22:01,996][12645] Avg episode reward: [(0, '0.555')] [2024-06-18 15:22:03,363][12883] Updated weights for policy 0, policy_version 160134 (0.0037) [2024-06-18 15:22:06,530][12883] Updated weights for policy 0, policy_version 160144 (0.0039) [2024-06-18 15:22:06,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2623815680. Throughput: 0: 42953.5. Samples: 2623924680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:22:06,994][12645] Avg episode reward: [(0, '0.555')] [2024-06-18 15:22:11,249][12883] Updated weights for policy 0, policy_version 160154 (0.0041) [2024-06-18 15:22:11,994][12645] Fps is (10 sec: 40969.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2624012288. Throughput: 0: 42856.9. Samples: 2624182460. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:22:11,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 15:22:14,313][12883] Updated weights for policy 0, policy_version 160164 (0.0042) [2024-06-18 15:22:16,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2624241664. Throughput: 0: 42755.5. Samples: 2624300560. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:22:16,995][12645] Avg episode reward: [(0, '0.535')] [2024-06-18 15:22:18,892][12883] Updated weights for policy 0, policy_version 160174 (0.0036) [2024-06-18 15:22:21,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42543.2). Total num frames: 2624421888. Throughput: 0: 42601.3. Samples: 2624558020. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:22:21,994][12645] Avg episode reward: [(0, '0.478')] [2024-06-18 15:22:22,248][12883] Updated weights for policy 0, policy_version 160184 (0.0039) [2024-06-18 15:22:26,759][12883] Updated weights for policy 0, policy_version 160194 (0.0043) [2024-06-18 15:22:26,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2624618496. Throughput: 0: 42546.7. Samples: 2624816280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:22:26,996][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 15:22:29,794][12883] Updated weights for policy 0, policy_version 160204 (0.0036) [2024-06-18 15:22:31,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2624880640. Throughput: 0: 42569.4. Samples: 2624939520. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:22:31,994][12645] Avg episode reward: [(0, '0.601')] [2024-06-18 15:22:34,375][12883] Updated weights for policy 0, policy_version 160214 (0.0032) [2024-06-18 15:22:36,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2625077248. Throughput: 0: 42433.3. Samples: 2625193280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:22:37,000][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 15:22:37,149][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160223_2625093632.pth... 
[2024-06-18 15:22:37,207][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159597_2614837248.pth [2024-06-18 15:22:37,525][12883] Updated weights for policy 0, policy_version 160224 (0.0038) [2024-06-18 15:22:41,924][12883] Updated weights for policy 0, policy_version 160234 (0.0044) [2024-06-18 15:22:41,994][12645] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2625273856. Throughput: 0: 42494.3. Samples: 2625452120. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-18 15:22:41,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 15:22:45,092][12883] Updated weights for policy 0, policy_version 160244 (0.0031) [2024-06-18 15:22:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2625503232. Throughput: 0: 42604.0. Samples: 2625576820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:22:46,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 15:22:49,468][12883] Updated weights for policy 0, policy_version 160254 (0.0036) [2024-06-18 15:22:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2625716224. Throughput: 0: 42519.9. Samples: 2625838080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:22:51,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 15:22:52,790][12883] Updated weights for policy 0, policy_version 160264 (0.0031) [2024-06-18 15:22:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42543.2). Total num frames: 2625912832. Throughput: 0: 42335.1. Samples: 2626087540. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:22:56,994][12645] Avg episode reward: [(0, '0.683')] [2024-06-18 15:22:57,486][12883] Updated weights for policy 0, policy_version 160274 (0.0027) [2024-06-18 15:23:00,437][12883] Updated weights for policy 0, policy_version 160284 (0.0032) [2024-06-18 15:23:01,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42600.0, 300 sec: 42709.5). Total num frames: 2626158592. Throughput: 0: 42556.5. Samples: 2626215600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:01,994][12645] Avg episode reward: [(0, '0.683')] [2024-06-18 15:23:05,203][12883] Updated weights for policy 0, policy_version 160294 (0.0033) [2024-06-18 15:23:06,207][12862] Signal inference workers to stop experience collection... (38400 times) [2024-06-18 15:23:06,207][12862] Signal inference workers to resume experience collection... (38400 times) [2024-06-18 15:23:06,245][12883] InferenceWorker_p0-w0: stopping experience collection (38400 times) [2024-06-18 15:23:06,245][12883] InferenceWorker_p0-w0: resuming experience collection (38400 times) [2024-06-18 15:23:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2626355200. Throughput: 0: 42531.2. Samples: 2626471920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:06,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 15:23:08,393][12883] Updated weights for policy 0, policy_version 160304 (0.0035) [2024-06-18 15:23:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2626568192. Throughput: 0: 42316.0. Samples: 2626720500. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:11,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 15:23:12,794][12883] Updated weights for policy 0, policy_version 160314 (0.0032) [2024-06-18 15:23:16,070][12883] Updated weights for policy 0, policy_version 160324 (0.0033) [2024-06-18 15:23:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2626781184. Throughput: 0: 42425.8. Samples: 2626848680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:16,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 15:23:20,203][12883] Updated weights for policy 0, policy_version 160334 (0.0029) [2024-06-18 15:23:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2626961408. Throughput: 0: 42615.6. Samples: 2627110980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:21,994][12645] Avg episode reward: [(0, '0.605')] [2024-06-18 15:23:23,810][12883] Updated weights for policy 0, policy_version 160344 (0.0033) [2024-06-18 15:23:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 2627223552. Throughput: 0: 42320.4. Samples: 2627356540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:26,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 15:23:28,438][12883] Updated weights for policy 0, policy_version 160354 (0.0047) [2024-06-18 15:23:31,670][12883] Updated weights for policy 0, policy_version 160364 (0.0023) [2024-06-18 15:23:31,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42323.7, 300 sec: 42653.6). Total num frames: 2627420160. Throughput: 0: 42573.4. Samples: 2627492720. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:31,996][12645] Avg episode reward: [(0, '0.654')] [2024-06-18 15:23:36,044][12883] Updated weights for policy 0, policy_version 160374 (0.0030) [2024-06-18 15:23:36,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2627600384. Throughput: 0: 42299.1. Samples: 2627741540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:36,994][12645] Avg episode reward: [(0, '0.623')] [2024-06-18 15:23:39,299][12883] Updated weights for policy 0, policy_version 160384 (0.0035) [2024-06-18 15:23:41,994][12645] Fps is (10 sec: 42607.8, 60 sec: 42871.5, 300 sec: 42598.7). Total num frames: 2627846144. Throughput: 0: 42283.6. Samples: 2627990300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:41,994][12645] Avg episode reward: [(0, '0.410')] [2024-06-18 15:23:43,550][12883] Updated weights for policy 0, policy_version 160394 (0.0040) [2024-06-18 15:23:46,913][12883] Updated weights for policy 0, policy_version 160404 (0.0029) [2024-06-18 15:23:46,994][12645] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2628059136. Throughput: 0: 42493.4. Samples: 2628127800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 15:23:46,994][12645] Avg episode reward: [(0, '0.732')] [2024-06-18 15:23:51,440][12883] Updated weights for policy 0, policy_version 160414 (0.0033) [2024-06-18 15:23:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 2628239360. Throughput: 0: 42307.0. Samples: 2628375740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:23:51,994][12645] Avg episode reward: [(0, '0.664')] [2024-06-18 15:23:54,592][12883] Updated weights for policy 0, policy_version 160424 (0.0033) [2024-06-18 15:23:56,996][12645] Fps is (10 sec: 42588.6, 60 sec: 42869.9, 300 sec: 42709.2). Total num frames: 2628485120. Throughput: 0: 42438.4. Samples: 2628630320. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:23:56,996][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 15:23:59,043][12883] Updated weights for policy 0, policy_version 160434 (0.0034) [2024-06-18 15:24:01,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2628698112. Throughput: 0: 42702.1. Samples: 2628770280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:01,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 15:24:02,159][12883] Updated weights for policy 0, policy_version 160444 (0.0034) [2024-06-18 15:24:06,582][12883] Updated weights for policy 0, policy_version 160454 (0.0038) [2024-06-18 15:24:06,994][12645] Fps is (10 sec: 40969.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2628894720. Throughput: 0: 42438.7. Samples: 2629020720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:06,994][12645] Avg episode reward: [(0, '0.697')] [2024-06-18 15:24:09,747][12883] Updated weights for policy 0, policy_version 160464 (0.0032) [2024-06-18 15:24:11,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2629140480. Throughput: 0: 42582.2. Samples: 2629272740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:11,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 15:24:14,126][12883] Updated weights for policy 0, policy_version 160474 (0.0031) [2024-06-18 15:24:16,086][12862] Signal inference workers to stop experience collection... (38450 times) [2024-06-18 15:24:16,128][12883] InferenceWorker_p0-w0: stopping experience collection (38450 times) [2024-06-18 15:24:16,198][12862] Signal inference workers to resume experience collection... (38450 times) [2024-06-18 15:24:16,199][12883] InferenceWorker_p0-w0: resuming experience collection (38450 times) [2024-06-18 15:24:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42599.0). Total num frames: 2629337088. 
Throughput: 0: 42451.8. Samples: 2629402960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:16,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 15:24:17,444][12883] Updated weights for policy 0, policy_version 160484 (0.0038) [2024-06-18 15:24:21,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2629517312. Throughput: 0: 42550.7. Samples: 2629656320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:21,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 15:24:22,150][12883] Updated weights for policy 0, policy_version 160494 (0.0046) [2024-06-18 15:24:25,276][12883] Updated weights for policy 0, policy_version 160504 (0.0037) [2024-06-18 15:24:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42709.8). Total num frames: 2629763072. Throughput: 0: 42526.3. Samples: 2629903980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:26,994][12645] Avg episode reward: [(0, '0.443')] [2024-06-18 15:24:29,910][12883] Updated weights for policy 0, policy_version 160514 (0.0041) [2024-06-18 15:24:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42053.8, 300 sec: 42542.9). Total num frames: 2629943296. Throughput: 0: 42557.3. Samples: 2630042880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:31,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 15:24:33,015][12883] Updated weights for policy 0, policy_version 160524 (0.0037) [2024-06-18 15:24:36,994][12645] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 2630139904. Throughput: 0: 42471.9. Samples: 2630286980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:36,995][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 15:24:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160531_2630139904.pth... 
[2024-06-18 15:24:37,090][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000159911_2619981824.pth [2024-06-18 15:24:37,631][12883] Updated weights for policy 0, policy_version 160534 (0.0032) [2024-06-18 15:24:40,840][12883] Updated weights for policy 0, policy_version 160544 (0.0036) [2024-06-18 15:24:41,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2630418432. Throughput: 0: 42465.2. Samples: 2630541160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:41,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 15:24:45,539][12883] Updated weights for policy 0, policy_version 160554 (0.0037) [2024-06-18 15:24:46,994][12645] Fps is (10 sec: 44237.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2630582272. Throughput: 0: 42453.9. Samples: 2630680700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:46,994][12645] Avg episode reward: [(0, '0.269')] [2024-06-18 15:24:48,498][12883] Updated weights for policy 0, policy_version 160564 (0.0040) [2024-06-18 15:24:51,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2630795264. Throughput: 0: 42355.0. Samples: 2630926700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 15:24:51,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 15:24:53,339][12883] Updated weights for policy 0, policy_version 160574 (0.0031) [2024-06-18 15:24:55,999][12883] Updated weights for policy 0, policy_version 160584 (0.0037) [2024-06-18 15:24:56,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42600.0, 300 sec: 42653.9). Total num frames: 2631041024. Throughput: 0: 42437.4. Samples: 2631182420. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:24:56,994][12645] Avg episode reward: [(0, '0.779')] [2024-06-18 15:25:00,970][12883] Updated weights for policy 0, policy_version 160594 (0.0031) [2024-06-18 15:25:01,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2631237632. Throughput: 0: 42517.4. Samples: 2631316240. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:01,994][12645] Avg episode reward: [(0, '0.771')] [2024-06-18 15:25:04,037][12883] Updated weights for policy 0, policy_version 160604 (0.0035) [2024-06-18 15:25:06,995][12645] Fps is (10 sec: 40953.8, 60 sec: 42597.3, 300 sec: 42542.7). Total num frames: 2631450624. Throughput: 0: 42486.2. Samples: 2631568260. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:06,996][12645] Avg episode reward: [(0, '0.775')] [2024-06-18 15:25:08,493][12883] Updated weights for policy 0, policy_version 160614 (0.0030) [2024-06-18 15:25:11,588][12883] Updated weights for policy 0, policy_version 160624 (0.0027) [2024-06-18 15:25:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2631680000. Throughput: 0: 42577.2. Samples: 2631819960. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:11,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 15:25:16,170][12883] Updated weights for policy 0, policy_version 160634 (0.0046) [2024-06-18 15:25:17,000][12645] Fps is (10 sec: 40940.5, 60 sec: 42047.9, 300 sec: 42486.4). Total num frames: 2631860224. Throughput: 0: 42479.9. Samples: 2631954740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:17,000][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 15:25:19,369][12883] Updated weights for policy 0, policy_version 160644 (0.0046) [2024-06-18 15:25:21,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42869.9, 300 sec: 42598.1). Total num frames: 2632089600. Throughput: 0: 42474.0. Samples: 2632198400. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:21,996][12645] Avg episode reward: [(0, '0.804')] [2024-06-18 15:25:23,693][12883] Updated weights for policy 0, policy_version 160654 (0.0037) [2024-06-18 15:25:26,869][12883] Updated weights for policy 0, policy_version 160664 (0.0044) [2024-06-18 15:25:26,994][12645] Fps is (10 sec: 45903.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2632318976. Throughput: 0: 42700.4. Samples: 2632462680. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:26,994][12645] Avg episode reward: [(0, '0.678')] [2024-06-18 15:25:31,263][12883] Updated weights for policy 0, policy_version 160674 (0.0030) [2024-06-18 15:25:31,994][12645] Fps is (10 sec: 42607.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2632515584. Throughput: 0: 42435.0. Samples: 2632590280. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:31,996][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 15:25:33,297][12862] Signal inference workers to stop experience collection... (38500 times) [2024-06-18 15:25:33,333][12883] InferenceWorker_p0-w0: stopping experience collection (38500 times) [2024-06-18 15:25:33,356][12862] Signal inference workers to resume experience collection... (38500 times) [2024-06-18 15:25:33,357][12883] InferenceWorker_p0-w0: resuming experience collection (38500 times) [2024-06-18 15:25:34,637][12883] Updated weights for policy 0, policy_version 160684 (0.0037) [2024-06-18 15:25:36,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43417.7, 300 sec: 42598.7). Total num frames: 2632744960. Throughput: 0: 42492.0. Samples: 2632838840. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:36,994][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 15:25:39,167][12883] Updated weights for policy 0, policy_version 160694 (0.0030) [2024-06-18 15:25:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2632941568. Throughput: 0: 42610.6. 
Samples: 2633099900. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:41,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 15:25:42,391][12883] Updated weights for policy 0, policy_version 160704 (0.0029) [2024-06-18 15:25:46,870][12883] Updated weights for policy 0, policy_version 160714 (0.0039) [2024-06-18 15:25:46,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2633138176. Throughput: 0: 42421.8. Samples: 2633225220. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:46,994][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 15:25:50,131][12883] Updated weights for policy 0, policy_version 160724 (0.0028) [2024-06-18 15:25:52,000][12645] Fps is (10 sec: 45846.8, 60 sec: 43413.1, 300 sec: 42708.6). Total num frames: 2633400320. Throughput: 0: 42537.8. Samples: 2633482660. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:52,000][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 15:25:54,448][12883] Updated weights for policy 0, policy_version 160734 (0.0033) [2024-06-18 15:25:56,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42598.7). Total num frames: 2633596928. Throughput: 0: 42777.8. Samples: 2633744960. Policy #0 lag: (min: 0.0, avg: 7.8, max: 20.0) [2024-06-18 15:25:56,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 15:25:57,756][12883] Updated weights for policy 0, policy_version 160744 (0.0044) [2024-06-18 15:26:01,994][12645] Fps is (10 sec: 37707.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2633777152. Throughput: 0: 42568.7. Samples: 2633870060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:01,994][12645] Avg episode reward: [(0, '0.646')] [2024-06-18 15:26:02,115][12883] Updated weights for policy 0, policy_version 160754 (0.0036) [2024-06-18 15:26:05,415][12883] Updated weights for policy 0, policy_version 160764 (0.0041) [2024-06-18 15:26:06,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42872.6, 300 sec: 42653.9). Total num frames: 2634022912. Throughput: 0: 42763.0. Samples: 2634122640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:06,994][12645] Avg episode reward: [(0, '0.685')] [2024-06-18 15:26:09,764][12883] Updated weights for policy 0, policy_version 160774 (0.0032) [2024-06-18 15:26:11,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2634219520. Throughput: 0: 42770.7. Samples: 2634387360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:11,994][12645] Avg episode reward: [(0, '0.695')] [2024-06-18 15:26:13,089][12883] Updated weights for policy 0, policy_version 160784 (0.0041) [2024-06-18 15:26:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42876.0, 300 sec: 42487.3). Total num frames: 2634432512. Throughput: 0: 42654.3. Samples: 2634509720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:16,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 15:26:17,431][12883] Updated weights for policy 0, policy_version 160794 (0.0024) [2024-06-18 15:26:20,799][12883] Updated weights for policy 0, policy_version 160804 (0.0031) [2024-06-18 15:26:21,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42873.0, 300 sec: 42653.9). Total num frames: 2634661888. Throughput: 0: 42821.4. Samples: 2634765800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:21,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 15:26:25,250][12883] Updated weights for policy 0, policy_version 160814 (0.0045) [2024-06-18 15:26:26,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 2634858496. Throughput: 0: 42803.7. Samples: 2635026060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:26,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 15:26:28,542][12883] Updated weights for policy 0, policy_version 160824 (0.0033) [2024-06-18 15:26:31,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2635071488. Throughput: 0: 42680.3. Samples: 2635145840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:31,994][12645] Avg episode reward: [(0, '0.697')] [2024-06-18 15:26:32,829][12883] Updated weights for policy 0, policy_version 160834 (0.0049) [2024-06-18 15:26:36,214][12883] Updated weights for policy 0, policy_version 160844 (0.0040) [2024-06-18 15:26:36,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2635284480. Throughput: 0: 42695.6. Samples: 2635403700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:36,994][12645] Avg episode reward: [(0, '0.666')] [2024-06-18 15:26:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160845_2635284480.pth... [2024-06-18 15:26:37,098][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160223_2625093632.pth [2024-06-18 15:26:40,470][12883] Updated weights for policy 0, policy_version 160854 (0.0049) [2024-06-18 15:26:41,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2635497472. Throughput: 0: 42471.5. Samples: 2635656180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:41,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 15:26:44,003][12883] Updated weights for policy 0, policy_version 160864 (0.0029) [2024-06-18 15:26:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 2635710464. Throughput: 0: 42512.0. Samples: 2635783100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:46,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 15:26:47,994][12883] Updated weights for policy 0, policy_version 160874 (0.0025) [2024-06-18 15:26:51,595][12883] Updated weights for policy 0, policy_version 160884 (0.0043) [2024-06-18 15:26:51,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42329.6, 300 sec: 42653.9). Total num frames: 2635939840. Throughput: 0: 42720.8. Samples: 2636045080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:51,994][12645] Avg episode reward: [(0, '0.716')] [2024-06-18 15:26:55,898][12883] Updated weights for policy 0, policy_version 160894 (0.0031) [2024-06-18 15:26:56,996][12645] Fps is (10 sec: 42588.0, 60 sec: 42323.6, 300 sec: 42487.3). Total num frames: 2636136448. Throughput: 0: 42460.9. Samples: 2636298200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 15:26:56,996][12645] Avg episode reward: [(0, '0.653')] [2024-06-18 15:26:59,338][12883] Updated weights for policy 0, policy_version 160904 (0.0040) [2024-06-18 15:27:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 2636365824. Throughput: 0: 42610.5. Samples: 2636427200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:01,994][12645] Avg episode reward: [(0, '0.548')] [2024-06-18 15:27:03,576][12883] Updated weights for policy 0, policy_version 160914 (0.0044) [2024-06-18 15:27:06,994][12645] Fps is (10 sec: 42609.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2636562432. Throughput: 0: 42634.7. Samples: 2636684360. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:06,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 15:27:07,086][12883] Updated weights for policy 0, policy_version 160924 (0.0038) [2024-06-18 15:27:07,089][12862] Signal inference workers to stop experience collection... (38550 times) [2024-06-18 15:27:07,089][12862] Signal inference workers to resume experience collection... (38550 times) [2024-06-18 15:27:07,124][12883] InferenceWorker_p0-w0: stopping experience collection (38550 times) [2024-06-18 15:27:07,124][12883] InferenceWorker_p0-w0: resuming experience collection (38550 times) [2024-06-18 15:27:11,023][12883] Updated weights for policy 0, policy_version 160934 (0.0042) [2024-06-18 15:27:11,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2636791808. Throughput: 0: 42516.4. Samples: 2636939300. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:11,994][12645] Avg episode reward: [(0, '0.355')] [2024-06-18 15:27:14,660][12883] Updated weights for policy 0, policy_version 160944 (0.0051) [2024-06-18 15:27:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2637004800. Throughput: 0: 42701.9. Samples: 2637067420. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:16,994][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 15:27:18,560][12883] Updated weights for policy 0, policy_version 160954 (0.0046) [2024-06-18 15:27:21,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2637201408. Throughput: 0: 42589.5. Samples: 2637320220. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:21,994][12645] Avg episode reward: [(0, '0.711')] [2024-06-18 15:27:22,389][12883] Updated weights for policy 0, policy_version 160964 (0.0048) [2024-06-18 15:27:26,559][12883] Updated weights for policy 0, policy_version 160974 (0.0034) [2024-06-18 15:27:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 2637430784. Throughput: 0: 42863.2. Samples: 2637585020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:26,994][12645] Avg episode reward: [(0, '0.643')] [2024-06-18 15:27:30,105][12883] Updated weights for policy 0, policy_version 160984 (0.0034) [2024-06-18 15:27:31,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2637643776. Throughput: 0: 42846.7. Samples: 2637711200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:31,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 15:27:34,062][12883] Updated weights for policy 0, policy_version 160994 (0.0037) [2024-06-18 15:27:36,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2637856768. Throughput: 0: 42606.7. Samples: 2637962380. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:36,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 15:27:37,826][12883] Updated weights for policy 0, policy_version 161004 (0.0024) [2024-06-18 15:27:41,636][12883] Updated weights for policy 0, policy_version 161014 (0.0035) [2024-06-18 15:27:41,995][12645] Fps is (10 sec: 42592.9, 60 sec: 42870.6, 300 sec: 42598.2). Total num frames: 2638069760. Throughput: 0: 42870.5. Samples: 2638227320. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:41,995][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 15:27:45,537][12883] Updated weights for policy 0, policy_version 161024 (0.0028) [2024-06-18 15:27:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2638282752. Throughput: 0: 42752.1. Samples: 2638351040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:46,994][12645] Avg episode reward: [(0, '0.680')] [2024-06-18 15:27:49,058][12883] Updated weights for policy 0, policy_version 161034 (0.0039) [2024-06-18 15:27:51,994][12645] Fps is (10 sec: 42603.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2638495744. Throughput: 0: 42794.2. Samples: 2638610100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:51,994][12645] Avg episode reward: [(0, '0.680')] [2024-06-18 15:27:53,195][12883] Updated weights for policy 0, policy_version 161044 (0.0028) [2024-06-18 15:27:56,629][12883] Updated weights for policy 0, policy_version 161054 (0.0034) [2024-06-18 15:27:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43146.3, 300 sec: 42598.4). Total num frames: 2638725120. Throughput: 0: 42852.0. Samples: 2638867640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:27:56,994][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 15:28:00,878][12883] Updated weights for policy 0, policy_version 161064 (0.0030) [2024-06-18 15:28:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2638921728. Throughput: 0: 42872.0. Samples: 2638996660. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-18 15:28:01,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 15:28:04,041][12883] Updated weights for policy 0, policy_version 161074 (0.0028) [2024-06-18 15:28:06,994][12645] Fps is (10 sec: 42597.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 2639151104. Throughput: 0: 43110.9. Samples: 2639260220. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:06,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 15:28:08,418][12883] Updated weights for policy 0, policy_version 161084 (0.0035) [2024-06-18 15:28:11,899][12883] Updated weights for policy 0, policy_version 161094 (0.0025) [2024-06-18 15:28:11,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42869.9, 300 sec: 42653.6). Total num frames: 2639364096. Throughput: 0: 42960.0. Samples: 2639518320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:11,997][12645] Avg episode reward: [(0, '0.381')] [2024-06-18 15:28:16,167][12883] Updated weights for policy 0, policy_version 161104 (0.0031) [2024-06-18 15:28:16,994][12645] Fps is (10 sec: 40961.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2639560704. Throughput: 0: 43006.6. Samples: 2639646500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:16,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 15:28:19,431][12883] Updated weights for policy 0, policy_version 161114 (0.0033) [2024-06-18 15:28:21,994][12645] Fps is (10 sec: 44246.7, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 2639806464. Throughput: 0: 43266.3. Samples: 2639909360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:21,994][12645] Avg episode reward: [(0, '0.486')] [2024-06-18 15:28:23,731][12883] Updated weights for policy 0, policy_version 161124 (0.0031) [2024-06-18 15:28:26,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42654.3). Total num frames: 2640003072. Throughput: 0: 43117.2. Samples: 2640167540. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:26,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 15:28:27,119][12883] Updated weights for policy 0, policy_version 161134 (0.0028) [2024-06-18 15:28:31,141][12883] Updated weights for policy 0, policy_version 161144 (0.0043) [2024-06-18 15:28:31,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2640216064. Throughput: 0: 43260.9. Samples: 2640297780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:31,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 15:28:34,490][12883] Updated weights for policy 0, policy_version 161154 (0.0044) [2024-06-18 15:28:35,592][12862] Signal inference workers to stop experience collection... (38600 times) [2024-06-18 15:28:35,593][12862] Signal inference workers to resume experience collection... (38600 times) [2024-06-18 15:28:35,637][12883] InferenceWorker_p0-w0: stopping experience collection (38600 times) [2024-06-18 15:28:35,637][12883] InferenceWorker_p0-w0: resuming experience collection (38600 times) [2024-06-18 15:28:37,000][12645] Fps is (10 sec: 45846.3, 60 sec: 43413.1, 300 sec: 42764.1). Total num frames: 2640461824. Throughput: 0: 43245.9. Samples: 2640556440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:37,001][12645] Avg episode reward: [(0, '0.405')] [2024-06-18 15:28:37,023][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161161_2640461824.pth... [2024-06-18 15:28:37,082][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160531_2630139904.pth [2024-06-18 15:28:38,696][12883] Updated weights for policy 0, policy_version 161164 (0.0045) [2024-06-18 15:28:41,980][12883] Updated weights for policy 0, policy_version 161174 (0.0030) [2024-06-18 15:28:41,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43418.4, 300 sec: 42765.0). Total num frames: 2640674816. Throughput: 0: 43249.7. Samples: 2640813880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:41,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 15:28:46,186][12883] Updated weights for policy 0, policy_version 161184 (0.0041) [2024-06-18 15:28:47,000][12645] Fps is (10 sec: 39321.9, 60 sec: 42867.1, 300 sec: 42764.1). Total num frames: 2640855040. Throughput: 0: 43152.7. Samples: 2640938800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:47,000][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 15:28:49,714][12883] Updated weights for policy 0, policy_version 161194 (0.0033) [2024-06-18 15:28:51,994][12645] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42709.8). Total num frames: 2641084416. Throughput: 0: 43068.7. Samples: 2641198300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:51,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 15:28:54,010][12883] Updated weights for policy 0, policy_version 161204 (0.0029) [2024-06-18 15:28:56,997][12645] Fps is (10 sec: 45886.5, 60 sec: 43141.8, 300 sec: 42764.5). Total num frames: 2641313792. Throughput: 0: 43113.7. Samples: 2641458500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:28:56,998][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 15:28:57,317][12883] Updated weights for policy 0, policy_version 161214 (0.0028) [2024-06-18 15:29:01,394][12883] Updated weights for policy 0, policy_version 161224 (0.0037) [2024-06-18 15:29:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2641510400. Throughput: 0: 43113.2. Samples: 2641586600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:29:01,995][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 15:29:05,437][12883] Updated weights for policy 0, policy_version 161234 (0.0042) [2024-06-18 15:29:06,994][12645] Fps is (10 sec: 42614.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2641739776. Throughput: 0: 43083.2. Samples: 2641848100. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 15:29:06,994][12645] Avg episode reward: [(0, '0.618')] [2024-06-18 15:29:09,199][12883] Updated weights for policy 0, policy_version 161244 (0.0033) [2024-06-18 15:29:11,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42873.1, 300 sec: 42709.5). Total num frames: 2641936384. Throughput: 0: 42880.9. Samples: 2642097180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:11,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 15:29:12,981][12883] Updated weights for policy 0, policy_version 161254 (0.0027) [2024-06-18 15:29:16,994][12645] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2642132992. Throughput: 0: 42868.9. Samples: 2642226880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:16,994][12645] Avg episode reward: [(0, '0.430')] [2024-06-18 15:29:17,211][12883] Updated weights for policy 0, policy_version 161264 (0.0048) [2024-06-18 15:29:20,742][12883] Updated weights for policy 0, policy_version 161274 (0.0037) [2024-06-18 15:29:21,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 2642362368. Throughput: 0: 42719.6. Samples: 2642478560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:21,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 15:29:24,706][12883] Updated weights for policy 0, policy_version 161284 (0.0038) [2024-06-18 15:29:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2642575360. Throughput: 0: 42710.6. Samples: 2642735860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:26,994][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 15:29:28,317][12883] Updated weights for policy 0, policy_version 161294 (0.0026) [2024-06-18 15:29:31,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2642771968. Throughput: 0: 42809.9. Samples: 2642864980. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:31,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 15:29:32,455][12883] Updated weights for policy 0, policy_version 161304 (0.0028) [2024-06-18 15:29:35,951][12883] Updated weights for policy 0, policy_version 161314 (0.0036) [2024-06-18 15:29:36,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42329.6, 300 sec: 42653.9). Total num frames: 2643001344. Throughput: 0: 42673.5. Samples: 2643118620. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:36,994][12645] Avg episode reward: [(0, '0.394')] [2024-06-18 15:29:39,943][12883] Updated weights for policy 0, policy_version 161324 (0.0034) [2024-06-18 15:29:41,993][12645] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 2643197952. Throughput: 0: 42774.4. Samples: 2643383180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:41,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 15:29:43,361][12883] Updated weights for policy 0, policy_version 161334 (0.0038) [2024-06-18 15:29:46,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42875.9, 300 sec: 42820.6). Total num frames: 2643427328. Throughput: 0: 42608.0. Samples: 2643503960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:46,994][12645] Avg episode reward: [(0, '0.581')] [2024-06-18 15:29:47,908][12883] Updated weights for policy 0, policy_version 161344 (0.0031) [2024-06-18 15:29:50,847][12883] Updated weights for policy 0, policy_version 161354 (0.0036) [2024-06-18 15:29:51,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2643656704. Throughput: 0: 42383.6. Samples: 2643755360. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:51,994][12645] Avg episode reward: [(0, '0.385')] [2024-06-18 15:29:55,535][12883] Updated weights for policy 0, policy_version 161364 (0.0035) [2024-06-18 15:29:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42327.9, 300 sec: 42765.0). Total num frames: 2643853312. Throughput: 0: 42803.0. Samples: 2644023320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:29:56,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 15:29:58,705][12883] Updated weights for policy 0, policy_version 161374 (0.0027) [2024-06-18 15:30:01,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.2). Total num frames: 2644066304. Throughput: 0: 42611.3. Samples: 2644144380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:30:01,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 15:30:03,066][12883] Updated weights for policy 0, policy_version 161384 (0.0030) [2024-06-18 15:30:06,494][12883] Updated weights for policy 0, policy_version 161394 (0.0037) [2024-06-18 15:30:06,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2644295680. Throughput: 0: 42774.4. Samples: 2644403400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:30:06,994][12645] Avg episode reward: [(0, '0.494')] [2024-06-18 15:30:10,719][12883] Updated weights for policy 0, policy_version 161404 (0.0033) [2024-06-18 15:30:11,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42821.5). Total num frames: 2644492288. Throughput: 0: 42765.0. Samples: 2644660280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 15:30:11,994][12645] Avg episode reward: [(0, '0.652')] [2024-06-18 15:30:14,147][12883] Updated weights for policy 0, policy_version 161414 (0.0043) [2024-06-18 15:30:14,152][12862] Signal inference workers to stop experience collection... 
(38650 times) [2024-06-18 15:30:14,153][12862] Signal inference workers to resume experience collection... (38650 times) [2024-06-18 15:30:14,196][12883] InferenceWorker_p0-w0: stopping experience collection (38650 times) [2024-06-18 15:30:14,196][12883] InferenceWorker_p0-w0: resuming experience collection (38650 times) [2024-06-18 15:30:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42765.3). Total num frames: 2644705280. Throughput: 0: 42686.3. Samples: 2644785860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:30:16,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 15:30:18,191][12883] Updated weights for policy 0, policy_version 161424 (0.0030) [2024-06-18 15:30:21,625][12883] Updated weights for policy 0, policy_version 161434 (0.0037) [2024-06-18 15:30:21,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2644951040. Throughput: 0: 42881.6. Samples: 2645048280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:30:21,994][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 15:30:25,918][12883] Updated weights for policy 0, policy_version 161444 (0.0031) [2024-06-18 15:30:26,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2645131264. Throughput: 0: 42832.2. Samples: 2645310640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:30:26,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 15:30:29,200][12883] Updated weights for policy 0, policy_version 161454 (0.0039) [2024-06-18 15:30:31,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2645344256. Throughput: 0: 42771.7. Samples: 2645428680. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:30:31,994][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 15:30:33,630][12883] Updated weights for policy 0, policy_version 161464 (0.0027) [2024-06-18 15:30:36,859][12883] Updated weights for policy 0, policy_version 161474 (0.0024) [2024-06-18 15:30:36,994][12645] Fps is (10 sec: 47514.5, 60 sec: 43417.8, 300 sec: 42931.6). Total num frames: 2645606400. Throughput: 0: 42979.9. Samples: 2645689460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:30:36,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 15:30:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161475_2645606400.pth... [2024-06-18 15:30:37,073][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000160845_2635284480.pth [2024-06-18 15:30:41,383][12883] Updated weights for policy 0, policy_version 161484 (0.0033) [2024-06-18 15:30:41,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 2645770240. Throughput: 0: 42769.4. Samples: 2645947940. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:30:41,994][12645] Avg episode reward: [(0, '0.687')] [2024-06-18 15:30:44,520][12883] Updated weights for policy 0, policy_version 161494 (0.0028) [2024-06-18 15:30:46,994][12645] Fps is (10 sec: 37683.0, 60 sec: 42598.4, 300 sec: 42654.8). Total num frames: 2645983232. Throughput: 0: 42760.8. Samples: 2646068620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:30:46,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 15:30:49,088][12883] Updated weights for policy 0, policy_version 161504 (0.0042) [2024-06-18 15:30:51,994][12645] Fps is (10 sec: 45876.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2646228992. Throughput: 0: 42925.0. Samples: 2646335020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:30:51,994][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 15:30:52,127][12883] Updated weights for policy 0, policy_version 161514 (0.0035) [2024-06-18 15:30:56,607][12883] Updated weights for policy 0, policy_version 161524 (0.0023) [2024-06-18 15:30:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2646425600. Throughput: 0: 42992.8. Samples: 2646594960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:30:56,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 15:30:59,720][12883] Updated weights for policy 0, policy_version 161534 (0.0022) [2024-06-18 15:31:01,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2646638592. Throughput: 0: 42923.9. Samples: 2646717440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:31:01,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 15:31:04,271][12883] Updated weights for policy 0, policy_version 161544 (0.0028) [2024-06-18 15:31:06,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2646884352. Throughput: 0: 42836.0. Samples: 2646975900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:31:06,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 15:31:07,380][12883] Updated weights for policy 0, policy_version 161554 (0.0034) [2024-06-18 15:31:11,947][12883] Updated weights for policy 0, policy_version 161564 (0.0028) [2024-06-18 15:31:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2647064576. Throughput: 0: 42847.6. Samples: 2647238780. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:31:11,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 15:31:15,341][12883] Updated weights for policy 0, policy_version 161574 (0.0046) [2024-06-18 15:31:16,996][12645] Fps is (10 sec: 39312.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2647277568. Throughput: 0: 42740.9. Samples: 2647352120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 15:31:16,996][12645] Avg episode reward: [(0, '0.212')] [2024-06-18 15:31:19,716][12883] Updated weights for policy 0, policy_version 161584 (0.0037) [2024-06-18 15:31:21,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2647523328. Throughput: 0: 42805.7. Samples: 2647615720. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:31:21,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 15:31:22,925][12862] Signal inference workers to stop experience collection... (38700 times) [2024-06-18 15:31:22,951][12883] InferenceWorker_p0-w0: stopping experience collection (38700 times) [2024-06-18 15:31:23,035][12862] Signal inference workers to resume experience collection... (38700 times) [2024-06-18 15:31:23,035][12883] InferenceWorker_p0-w0: resuming experience collection (38700 times) [2024-06-18 15:31:23,037][12883] Updated weights for policy 0, policy_version 161594 (0.0027) [2024-06-18 15:31:26,994][12645] Fps is (10 sec: 39330.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2647670784. Throughput: 0: 42710.3. Samples: 2647869900. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:31:26,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 15:31:27,431][12883] Updated weights for policy 0, policy_version 161604 (0.0033) [2024-06-18 15:31:30,550][12883] Updated weights for policy 0, policy_version 161614 (0.0050) [2024-06-18 15:31:31,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2647916544. 
Throughput: 0: 42627.6. Samples: 2647986860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:31:31,994][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 15:31:35,138][12883] Updated weights for policy 0, policy_version 161624 (0.0037) [2024-06-18 15:31:36,994][12645] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2648162304. Throughput: 0: 42629.3. Samples: 2648253340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:31:36,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 15:31:38,153][12883] Updated weights for policy 0, policy_version 161634 (0.0026) [2024-06-18 15:31:41,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2648326144. Throughput: 0: 42575.2. Samples: 2648510840. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:31:41,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 15:31:42,748][12883] Updated weights for policy 0, policy_version 161644 (0.0035) [2024-06-18 15:31:46,081][12883] Updated weights for policy 0, policy_version 161654 (0.0040) [2024-06-18 15:31:46,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2648555520. Throughput: 0: 42484.9. Samples: 2648629260. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:31:46,994][12645] Avg episode reward: [(0, '0.910')] [2024-06-18 15:31:50,729][12883] Updated weights for policy 0, policy_version 161664 (0.0036) [2024-06-18 15:31:51,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42932.0). Total num frames: 2648801280. Throughput: 0: 42593.8. Samples: 2648892620. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:31:51,994][12645] Avg episode reward: [(0, '0.645')] [2024-06-18 15:31:53,896][12883] Updated weights for policy 0, policy_version 161674 (0.0027) [2024-06-18 15:31:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2648965120. 
Throughput: 0: 42380.5. Samples: 2649145900. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:31:56,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 15:31:58,463][12883] Updated weights for policy 0, policy_version 161684 (0.0024) [2024-06-18 15:32:01,559][12883] Updated weights for policy 0, policy_version 161694 (0.0022) [2024-06-18 15:32:01,994][12645] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2649194496. Throughput: 0: 42411.9. Samples: 2649260560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:32:01,994][12645] Avg episode reward: [(0, '0.324')] [2024-06-18 15:32:06,034][12883] Updated weights for policy 0, policy_version 161704 (0.0032) [2024-06-18 15:32:06,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2649423872. Throughput: 0: 42480.1. Samples: 2649527320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:32:06,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 15:32:09,129][12883] Updated weights for policy 0, policy_version 161714 (0.0044) [2024-06-18 15:32:11,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2649604096. Throughput: 0: 42594.3. Samples: 2649786640. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:32:11,994][12645] Avg episode reward: [(0, '0.429')] [2024-06-18 15:32:13,745][12883] Updated weights for policy 0, policy_version 161724 (0.0035) [2024-06-18 15:32:16,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42600.0, 300 sec: 42820.6). Total num frames: 2649833472. Throughput: 0: 42549.4. Samples: 2649901580. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:32:16,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 15:32:17,073][12883] Updated weights for policy 0, policy_version 161734 (0.0032) [2024-06-18 15:32:21,415][12883] Updated weights for policy 0, policy_version 161744 (0.0040) [2024-06-18 15:32:21,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2650046464. Throughput: 0: 42471.6. Samples: 2650164560. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-18 15:32:21,994][12645] Avg episode reward: [(0, '0.498')] [2024-06-18 15:32:24,583][12883] Updated weights for policy 0, policy_version 161754 (0.0031) [2024-06-18 15:32:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2650226688. Throughput: 0: 42461.7. Samples: 2650421620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:32:26,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 15:32:29,223][12883] Updated weights for policy 0, policy_version 161764 (0.0031) [2024-06-18 15:32:29,249][12862] Signal inference workers to stop experience collection... (38750 times) [2024-06-18 15:32:29,250][12862] Signal inference workers to resume experience collection... (38750 times) [2024-06-18 15:32:29,286][12883] InferenceWorker_p0-w0: stopping experience collection (38750 times) [2024-06-18 15:32:29,286][12883] InferenceWorker_p0-w0: resuming experience collection (38750 times) [2024-06-18 15:32:32,000][12645] Fps is (10 sec: 44208.6, 60 sec: 42866.9, 300 sec: 42819.7). Total num frames: 2650488832. Throughput: 0: 42448.3. Samples: 2650539700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:32:32,001][12645] Avg episode reward: [(0, '0.254')] [2024-06-18 15:32:32,715][12883] Updated weights for policy 0, policy_version 161774 (0.0036) [2024-06-18 15:32:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 42654.1). Total num frames: 2650652672. 
Throughput: 0: 42453.3. Samples: 2650803020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:32:36,994][12645] Avg episode reward: [(0, '0.556')] [2024-06-18 15:32:37,012][12883] Updated weights for policy 0, policy_version 161784 (0.0046) [2024-06-18 15:32:37,145][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161785_2650685440.pth... [2024-06-18 15:32:37,215][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161161_2640461824.pth [2024-06-18 15:32:40,290][12883] Updated weights for policy 0, policy_version 161794 (0.0034) [2024-06-18 15:32:41,994][12645] Fps is (10 sec: 37706.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2650865664. Throughput: 0: 42365.7. Samples: 2651052360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:32:41,994][12645] Avg episode reward: [(0, '0.569')] [2024-06-18 15:32:44,726][12883] Updated weights for policy 0, policy_version 161804 (0.0029) [2024-06-18 15:32:46,994][12645] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2651127808. Throughput: 0: 42626.1. Samples: 2651178740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:32:46,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 15:32:47,735][12883] Updated weights for policy 0, policy_version 161814 (0.0033) [2024-06-18 15:32:51,994][12645] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 42598.4). Total num frames: 2651291648. Throughput: 0: 42517.8. Samples: 2651440620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:32:51,994][12645] Avg episode reward: [(0, '0.628')] [2024-06-18 15:32:52,254][12883] Updated weights for policy 0, policy_version 161824 (0.0035) [2024-06-18 15:32:55,473][12883] Updated weights for policy 0, policy_version 161834 (0.0033) [2024-06-18 15:32:56,994][12645] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2651504640. Throughput: 0: 42254.1. 
Samples: 2651688080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:32:56,994][12645] Avg episode reward: [(0, '0.426')] [2024-06-18 15:32:59,911][12883] Updated weights for policy 0, policy_version 161844 (0.0031) [2024-06-18 15:33:01,996][12645] Fps is (10 sec: 47502.7, 60 sec: 42869.8, 300 sec: 42764.7). Total num frames: 2651766784. Throughput: 0: 42524.9. Samples: 2651815300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:33:01,997][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 15:33:03,312][12883] Updated weights for policy 0, policy_version 161854 (0.0047) [2024-06-18 15:33:06,994][12645] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 42543.2). Total num frames: 2651914240. Throughput: 0: 42351.6. Samples: 2652070380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:33:06,994][12645] Avg episode reward: [(0, '0.622')] [2024-06-18 15:33:07,923][12883] Updated weights for policy 0, policy_version 161864 (0.0037) [2024-06-18 15:33:11,076][12883] Updated weights for policy 0, policy_version 161874 (0.0032) [2024-06-18 15:33:11,994][12645] Fps is (10 sec: 39330.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2652160000. Throughput: 0: 42092.5. Samples: 2652315780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:33:11,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 15:33:15,186][12862] Signal inference workers to stop experience collection... (38800 times) [2024-06-18 15:33:15,233][12883] InferenceWorker_p0-w0: stopping experience collection (38800 times) [2024-06-18 15:33:15,244][12862] Signal inference workers to resume experience collection... (38800 times) [2024-06-18 15:33:15,251][12883] InferenceWorker_p0-w0: resuming experience collection (38800 times) [2024-06-18 15:33:15,572][12883] Updated weights for policy 0, policy_version 161884 (0.0038) [2024-06-18 15:33:16,996][12645] Fps is (10 sec: 49140.4, 60 sec: 42869.8, 300 sec: 42709.2). Total num frames: 2652405760. 
Throughput: 0: 42459.4. Samples: 2652450200. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:33:16,997][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 15:33:18,568][12883] Updated weights for policy 0, policy_version 161894 (0.0026) [2024-06-18 15:33:22,000][12645] Fps is (10 sec: 40934.2, 60 sec: 42047.8, 300 sec: 42597.5). Total num frames: 2652569600. Throughput: 0: 42303.9. Samples: 2652706960. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:33:22,009][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 15:33:23,130][12883] Updated weights for policy 0, policy_version 161904 (0.0034) [2024-06-18 15:33:26,994][12645] Fps is (10 sec: 37691.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2652782592. Throughput: 0: 42395.6. Samples: 2652960160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 15:33:26,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 15:33:27,207][12883] Updated weights for policy 0, policy_version 161914 (0.0028) [2024-06-18 15:33:30,672][12883] Updated weights for policy 0, policy_version 161924 (0.0034) [2024-06-18 15:33:31,994][12645] Fps is (10 sec: 45903.5, 60 sec: 42329.7, 300 sec: 42599.3). Total num frames: 2653028352. Throughput: 0: 42719.1. Samples: 2653101100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:33:31,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 15:33:34,856][12883] Updated weights for policy 0, policy_version 161934 (0.0026) [2024-06-18 15:33:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 2653192192. Throughput: 0: 42541.8. Samples: 2653355000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:33:36,994][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 15:33:38,277][12883] Updated weights for policy 0, policy_version 161944 (0.0024) [2024-06-18 15:33:41,996][12645] Fps is (10 sec: 40951.3, 60 sec: 42870.0, 300 sec: 42654.5). Total num frames: 2653437952. 
Throughput: 0: 42528.2. Samples: 2653601940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:33:41,997][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 15:33:42,347][12883] Updated weights for policy 0, policy_version 161954 (0.0039) [2024-06-18 15:33:45,846][12883] Updated weights for policy 0, policy_version 161964 (0.0026) [2024-06-18 15:33:46,994][12645] Fps is (10 sec: 49151.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2653683712. Throughput: 0: 42920.8. Samples: 2653746640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:33:46,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 15:33:49,770][12883] Updated weights for policy 0, policy_version 161974 (0.0048) [2024-06-18 15:33:51,994][12645] Fps is (10 sec: 39329.8, 60 sec: 42325.2, 300 sec: 42432.3). Total num frames: 2653831168. Throughput: 0: 42770.0. Samples: 2653995040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:33:51,994][12645] Avg episode reward: [(0, '0.333')] [2024-06-18 15:33:53,499][12883] Updated weights for policy 0, policy_version 161984 (0.0038) [2024-06-18 15:33:56,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2654093312. Throughput: 0: 42747.4. Samples: 2654239420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:33:56,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 15:33:57,349][12883] Updated weights for policy 0, policy_version 161994 (0.0034) [2024-06-18 15:34:01,365][12883] Updated weights for policy 0, policy_version 162004 (0.0040) [2024-06-18 15:34:01,997][12645] Fps is (10 sec: 47496.3, 60 sec: 42324.3, 300 sec: 42597.8). Total num frames: 2654306304. Throughput: 0: 42911.4. Samples: 2654381280. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:34:01,998][12645] Avg episode reward: [(0, '0.175')] [2024-06-18 15:34:04,845][12883] Updated weights for policy 0, policy_version 162014 (0.0030) [2024-06-18 15:34:06,994][12645] Fps is (10 sec: 37683.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 2654470144. Throughput: 0: 42635.2. Samples: 2654625280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:34:06,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 15:34:08,278][12862] Signal inference workers to stop experience collection... (38850 times) [2024-06-18 15:34:08,317][12883] InferenceWorker_p0-w0: stopping experience collection (38850 times) [2024-06-18 15:34:08,324][12862] Signal inference workers to resume experience collection... (38850 times) [2024-06-18 15:34:08,333][12883] InferenceWorker_p0-w0: resuming experience collection (38850 times) [2024-06-18 15:34:08,948][12883] Updated weights for policy 0, policy_version 162024 (0.0042) [2024-06-18 15:34:11,994][12645] Fps is (10 sec: 44252.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2654748672. Throughput: 0: 42633.3. Samples: 2654878660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:34:11,994][12645] Avg episode reward: [(0, '0.631')] [2024-06-18 15:34:12,627][12883] Updated weights for policy 0, policy_version 162034 (0.0033) [2024-06-18 15:34:16,419][12883] Updated weights for policy 0, policy_version 162044 (0.0034) [2024-06-18 15:34:16,994][12645] Fps is (10 sec: 47514.0, 60 sec: 42326.9, 300 sec: 42654.0). Total num frames: 2654945280. Throughput: 0: 42633.4. Samples: 2655019600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:34:16,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 15:34:20,479][12883] Updated weights for policy 0, policy_version 162054 (0.0035) [2024-06-18 15:34:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42602.8, 300 sec: 42542.9). Total num frames: 2655125504. Throughput: 0: 42423.4. 
Samples: 2655264060. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:34:21,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 15:34:24,090][12883] Updated weights for policy 0, policy_version 162064 (0.0041) [2024-06-18 15:34:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 2655371264. Throughput: 0: 42561.7. Samples: 2655517120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:34:26,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 15:34:28,315][12883] Updated weights for policy 0, policy_version 162074 (0.0041) [2024-06-18 15:34:31,994][12645] Fps is (10 sec: 44237.7, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 2655567872. Throughput: 0: 42321.4. Samples: 2655651100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 15:34:31,994][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 15:34:32,073][12883] Updated weights for policy 0, policy_version 162084 (0.0034) [2024-06-18 15:34:35,975][12883] Updated weights for policy 0, policy_version 162094 (0.0026) [2024-06-18 15:34:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2655764480. Throughput: 0: 42288.2. Samples: 2655898000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:34:36,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 15:34:37,074][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162096_2655780864.pth... [2024-06-18 15:34:37,133][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161475_2645606400.pth [2024-06-18 15:34:39,866][12883] Updated weights for policy 0, policy_version 162104 (0.0027) [2024-06-18 15:34:41,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42600.0, 300 sec: 42598.4). Total num frames: 2655993856. Throughput: 0: 42567.7. Samples: 2656154960. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:34:41,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 15:34:43,532][12883] Updated weights for policy 0, policy_version 162114 (0.0033) [2024-06-18 15:34:46,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42050.7, 300 sec: 42542.5). Total num frames: 2656206848. Throughput: 0: 42336.1. Samples: 2656286340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:34:46,996][12645] Avg episode reward: [(0, '0.476')] [2024-06-18 15:34:47,591][12883] Updated weights for policy 0, policy_version 162124 (0.0026) [2024-06-18 15:34:51,120][12883] Updated weights for policy 0, policy_version 162134 (0.0035) [2024-06-18 15:34:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 2656419840. Throughput: 0: 42461.3. Samples: 2656536040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:34:51,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 15:34:55,336][12883] Updated weights for policy 0, policy_version 162144 (0.0035) [2024-06-18 15:34:56,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2656632832. Throughput: 0: 42572.1. Samples: 2656794400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:34:56,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 15:34:58,680][12883] Updated weights for policy 0, policy_version 162154 (0.0026) [2024-06-18 15:35:01,996][12645] Fps is (10 sec: 42589.1, 60 sec: 42326.4, 300 sec: 42542.5). Total num frames: 2656845824. Throughput: 0: 42293.9. Samples: 2656922920. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:35:01,996][12645] Avg episode reward: [(0, '0.329')] [2024-06-18 15:35:03,070][12883] Updated weights for policy 0, policy_version 162164 (0.0029) [2024-06-18 15:35:06,599][12883] Updated weights for policy 0, policy_version 162174 (0.0032) [2024-06-18 15:35:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42653.9). Total num frames: 2657075200. Throughput: 0: 42481.9. Samples: 2657175740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:35:06,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 15:35:10,871][12883] Updated weights for policy 0, policy_version 162184 (0.0027) [2024-06-18 15:35:11,994][12645] Fps is (10 sec: 40969.2, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 2657255424. Throughput: 0: 42481.3. Samples: 2657428780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:35:11,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 15:35:14,235][12883] Updated weights for policy 0, policy_version 162194 (0.0023) [2024-06-18 15:35:16,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2657484800. Throughput: 0: 42214.6. Samples: 2657550760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:35:16,994][12645] Avg episode reward: [(0, '0.777')] [2024-06-18 15:35:18,844][12883] Updated weights for policy 0, policy_version 162204 (0.0042) [2024-06-18 15:35:21,823][12883] Updated weights for policy 0, policy_version 162214 (0.0026) [2024-06-18 15:35:21,994][12645] Fps is (10 sec: 47513.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 2657730560. Throughput: 0: 42499.4. Samples: 2657810480. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:35:21,994][12645] Avg episode reward: [(0, '0.631')] [2024-06-18 15:35:26,404][12883] Updated weights for policy 0, policy_version 162224 (0.0048) [2024-06-18 15:35:26,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2657910784. Throughput: 0: 42494.5. Samples: 2658067220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:35:26,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 15:35:29,900][12883] Updated weights for policy 0, policy_version 162234 (0.0037) [2024-06-18 15:35:31,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 2658107392. Throughput: 0: 42379.5. Samples: 2658193320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:35:31,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 15:35:34,093][12883] Updated weights for policy 0, policy_version 162244 (0.0031) [2024-06-18 15:35:36,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2658336768. Throughput: 0: 42460.4. Samples: 2658446760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 15:35:36,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 15:35:37,385][12883] Updated weights for policy 0, policy_version 162254 (0.0034) [2024-06-18 15:35:41,597][12862] Signal inference workers to stop experience collection... (38900 times) [2024-06-18 15:35:41,598][12862] Signal inference workers to resume experience collection... (38900 times) [2024-06-18 15:35:41,641][12883] InferenceWorker_p0-w0: stopping experience collection (38900 times) [2024-06-18 15:35:41,641][12883] InferenceWorker_p0-w0: resuming experience collection (38900 times) [2024-06-18 15:35:41,731][12883] Updated weights for policy 0, policy_version 162264 (0.0032) [2024-06-18 15:35:41,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2658549760. 
Throughput: 0: 42504.0. Samples: 2658707080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:35:41,994][12645] Avg episode reward: [(0, '0.534')] [2024-06-18 15:35:44,870][12883] Updated weights for policy 0, policy_version 162274 (0.0029) [2024-06-18 15:35:46,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42327.0, 300 sec: 42431.8). Total num frames: 2658746368. Throughput: 0: 42330.2. Samples: 2658827680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:35:46,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 15:35:49,473][12883] Updated weights for policy 0, policy_version 162284 (0.0030) [2024-06-18 15:35:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2658992128. Throughput: 0: 42628.5. Samples: 2659094020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:35:51,994][12645] Avg episode reward: [(0, '0.447')] [2024-06-18 15:35:52,363][12883] Updated weights for policy 0, policy_version 162294 (0.0026) [2024-06-18 15:35:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2659172352. Throughput: 0: 42633.3. Samples: 2659347280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:35:56,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 15:35:57,254][12883] Updated weights for policy 0, policy_version 162304 (0.0049) [2024-06-18 15:36:00,213][12883] Updated weights for policy 0, policy_version 162314 (0.0043) [2024-06-18 15:36:01,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42599.9, 300 sec: 42431.8). Total num frames: 2659401728. Throughput: 0: 42608.8. Samples: 2659468160. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:36:01,994][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 15:36:04,938][12883] Updated weights for policy 0, policy_version 162324 (0.0034) [2024-06-18 15:36:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2659631104. 
Throughput: 0: 42810.3. Samples: 2659736940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:36:06,994][12645] Avg episode reward: [(0, '0.616')] [2024-06-18 15:36:07,878][12883] Updated weights for policy 0, policy_version 162334 (0.0040) [2024-06-18 15:36:11,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.6). Total num frames: 2659811328. Throughput: 0: 42823.0. Samples: 2659994260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:36:11,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 15:36:12,523][12883] Updated weights for policy 0, policy_version 162344 (0.0027) [2024-06-18 15:36:15,433][12883] Updated weights for policy 0, policy_version 162354 (0.0026) [2024-06-18 15:36:16,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 2660040704. Throughput: 0: 42691.0. Samples: 2660114420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:36:16,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 15:36:20,181][12883] Updated weights for policy 0, policy_version 162364 (0.0035) [2024-06-18 15:36:21,994][12645] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2660270080. Throughput: 0: 42840.9. Samples: 2660374600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:36:21,994][12645] Avg episode reward: [(0, '0.720')] [2024-06-18 15:36:23,167][12883] Updated weights for policy 0, policy_version 162374 (0.0041) [2024-06-18 15:36:26,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2660450304. Throughput: 0: 42540.8. Samples: 2660621420. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:36:26,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 15:36:28,291][12883] Updated weights for policy 0, policy_version 162384 (0.0035) [2024-06-18 15:36:30,750][12883] Updated weights for policy 0, policy_version 162394 (0.0033) [2024-06-18 15:36:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 2660679680. Throughput: 0: 42661.2. Samples: 2660747440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:36:31,994][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 15:36:35,722][12883] Updated weights for policy 0, policy_version 162404 (0.0039) [2024-06-18 15:36:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 2660876288. Throughput: 0: 42526.9. Samples: 2661007740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 15:36:36,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 15:36:37,053][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162408_2660892672.pth... [2024-06-18 15:36:37,124][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000161785_2650685440.pth [2024-06-18 15:36:38,453][12883] Updated weights for policy 0, policy_version 162414 (0.0033) [2024-06-18 15:36:41,995][12645] Fps is (10 sec: 40955.5, 60 sec: 42324.6, 300 sec: 42487.2). Total num frames: 2661089280. Throughput: 0: 42541.6. Samples: 2661261700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:36:41,995][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 15:36:43,528][12883] Updated weights for policy 0, policy_version 162424 (0.0051) [2024-06-18 15:36:46,063][12883] Updated weights for policy 0, policy_version 162434 (0.0031) [2024-06-18 15:36:46,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 2661335040. Throughput: 0: 42712.4. Samples: 2661390220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:36:46,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 15:36:51,162][12883] Updated weights for policy 0, policy_version 162444 (0.0028) [2024-06-18 15:36:51,994][12645] Fps is (10 sec: 42603.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2661515264. Throughput: 0: 42452.5. Samples: 2661647300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:36:51,994][12645] Avg episode reward: [(0, '0.512')] [2024-06-18 15:36:52,535][12862] Signal inference workers to stop experience collection... (38950 times) [2024-06-18 15:36:52,587][12883] InferenceWorker_p0-w0: stopping experience collection (38950 times) [2024-06-18 15:36:52,594][12862] Signal inference workers to resume experience collection... (38950 times) [2024-06-18 15:36:52,601][12883] InferenceWorker_p0-w0: resuming experience collection (38950 times) [2024-06-18 15:36:53,730][12883] Updated weights for policy 0, policy_version 162454 (0.0029) [2024-06-18 15:36:56,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 2661728256. Throughput: 0: 42315.2. Samples: 2661898440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:36:56,994][12645] Avg episode reward: [(0, '0.622')] [2024-06-18 15:36:58,827][12883] Updated weights for policy 0, policy_version 162464 (0.0037) [2024-06-18 15:37:01,478][12883] Updated weights for policy 0, policy_version 162474 (0.0036) [2024-06-18 15:37:01,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 2661974016. Throughput: 0: 42581.7. Samples: 2662030600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:37:01,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 15:37:06,536][12883] Updated weights for policy 0, policy_version 162484 (0.0035) [2024-06-18 15:37:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 2662154240. Throughput: 0: 42524.0. 
Samples: 2662288180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:37:06,994][12645] Avg episode reward: [(0, '0.308')] [2024-06-18 15:37:09,578][12883] Updated weights for policy 0, policy_version 162494 (0.0034) [2024-06-18 15:37:11,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 2662383616. Throughput: 0: 42571.6. Samples: 2662537140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:37:11,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 15:37:14,532][12883] Updated weights for policy 0, policy_version 162504 (0.0036) [2024-06-18 15:37:16,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2662612992. Throughput: 0: 42758.3. Samples: 2662671560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:37:16,994][12645] Avg episode reward: [(0, '0.594')] [2024-06-18 15:37:17,241][12883] Updated weights for policy 0, policy_version 162514 (0.0032) [2024-06-18 15:37:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 2662776832. Throughput: 0: 42611.3. Samples: 2662925240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:37:21,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 15:37:22,094][12883] Updated weights for policy 0, policy_version 162524 (0.0030) [2024-06-18 15:37:25,022][12883] Updated weights for policy 0, policy_version 162534 (0.0030) [2024-06-18 15:37:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42488.2). Total num frames: 2663022592. Throughput: 0: 42562.9. Samples: 2663176980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:37:26,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 15:37:29,834][12883] Updated weights for policy 0, policy_version 162544 (0.0041) [2024-06-18 15:37:31,994][12645] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2663235584. Throughput: 0: 42633.8. 
Samples: 2663308740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:37:31,994][12645] Avg episode reward: [(0, '0.370')] [2024-06-18 15:37:32,848][12883] Updated weights for policy 0, policy_version 162554 (0.0033) [2024-06-18 15:37:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2663415808. Throughput: 0: 42471.5. Samples: 2663558520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:37:36,994][12645] Avg episode reward: [(0, '0.451')] [2024-06-18 15:37:37,383][12883] Updated weights for policy 0, policy_version 162564 (0.0036) [2024-06-18 15:37:40,654][12883] Updated weights for policy 0, policy_version 162574 (0.0039) [2024-06-18 15:37:41,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42872.2, 300 sec: 42487.3). Total num frames: 2663661568. Throughput: 0: 42302.2. Samples: 2663802040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-18 15:37:41,995][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 15:37:45,443][12883] Updated weights for policy 0, policy_version 162584 (0.0032) [2024-06-18 15:37:46,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2663874560. Throughput: 0: 42466.7. Samples: 2663941600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:37:46,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 15:37:48,474][12883] Updated weights for policy 0, policy_version 162594 (0.0027) [2024-06-18 15:37:51,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2664038400. Throughput: 0: 42170.7. Samples: 2664185860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:37:51,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 15:37:53,118][12883] Updated weights for policy 0, policy_version 162604 (0.0038) [2024-06-18 15:37:56,395][12883] Updated weights for policy 0, policy_version 162614 (0.0031) [2024-06-18 15:37:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42487.7). Total num frames: 2664300544. Throughput: 0: 42092.1. Samples: 2664431280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:37:56,994][12645] Avg episode reward: [(0, '0.538')] [2024-06-18 15:38:00,886][12883] Updated weights for policy 0, policy_version 162624 (0.0033) [2024-06-18 15:38:01,994][12645] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 2664480768. Throughput: 0: 42201.7. Samples: 2664570640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:01,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 15:38:03,923][12883] Updated weights for policy 0, policy_version 162634 (0.0034) [2024-06-18 15:38:07,000][12645] Fps is (10 sec: 39296.9, 60 sec: 42320.9, 300 sec: 42486.4). Total num frames: 2664693760. Throughput: 0: 42054.1. Samples: 2664817940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:07,000][12645] Avg episode reward: [(0, '0.298')] [2024-06-18 15:38:08,600][12883] Updated weights for policy 0, policy_version 162644 (0.0026) [2024-06-18 15:38:11,670][12883] Updated weights for policy 0, policy_version 162654 (0.0045) [2024-06-18 15:38:11,994][12645] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42543.2). Total num frames: 2664955904. Throughput: 0: 42162.1. Samples: 2665074280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:11,994][12645] Avg episode reward: [(0, '0.422')] [2024-06-18 15:38:16,159][12883] Updated weights for policy 0, policy_version 162664 (0.0033) [2024-06-18 15:38:16,994][12645] Fps is (10 sec: 42625.3, 60 sec: 41779.2, 300 sec: 42543.8). Total num frames: 2665119744. Throughput: 0: 42161.4. Samples: 2665206000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:16,994][12645] Avg episode reward: [(0, '0.597')] [2024-06-18 15:38:18,697][12862] Signal inference workers to stop experience collection... (39000 times) [2024-06-18 15:38:18,697][12862] Signal inference workers to resume experience collection... (39000 times) [2024-06-18 15:38:18,730][12883] InferenceWorker_p0-w0: stopping experience collection (39000 times) [2024-06-18 15:38:18,730][12883] InferenceWorker_p0-w0: resuming experience collection (39000 times) [2024-06-18 15:38:19,383][12883] Updated weights for policy 0, policy_version 162674 (0.0040) [2024-06-18 15:38:21,994][12645] Fps is (10 sec: 37683.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2665332736. Throughput: 0: 42220.0. Samples: 2665458420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:21,994][12645] Avg episode reward: [(0, '0.785')] [2024-06-18 15:38:23,822][12883] Updated weights for policy 0, policy_version 162684 (0.0029) [2024-06-18 15:38:26,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2665562112. Throughput: 0: 42469.4. Samples: 2665713160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:26,994][12645] Avg episode reward: [(0, '0.785')] [2024-06-18 15:38:27,024][12883] Updated weights for policy 0, policy_version 162694 (0.0034) [2024-06-18 15:38:31,692][12883] Updated weights for policy 0, policy_version 162704 (0.0037) [2024-06-18 15:38:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 2665758720. 
Throughput: 0: 42340.1. Samples: 2665846900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:31,994][12645] Avg episode reward: [(0, '0.676')] [2024-06-18 15:38:34,665][12883] Updated weights for policy 0, policy_version 162714 (0.0033) [2024-06-18 15:38:36,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42543.2). Total num frames: 2665988096. Throughput: 0: 42425.8. Samples: 2666095020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:36,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 15:38:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162719_2665988096.pth... [2024-06-18 15:38:37,055][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162096_2655780864.pth [2024-06-18 15:38:39,294][12883] Updated weights for policy 0, policy_version 162724 (0.0033) [2024-06-18 15:38:41,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 2666201088. Throughput: 0: 42735.1. Samples: 2666354360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:41,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 15:38:42,666][12883] Updated weights for policy 0, policy_version 162734 (0.0026) [2024-06-18 15:38:46,891][12883] Updated weights for policy 0, policy_version 162744 (0.0030) [2024-06-18 15:38:46,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 2666397696. Throughput: 0: 42335.8. Samples: 2666475760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:46,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 15:38:50,550][12883] Updated weights for policy 0, policy_version 162754 (0.0032) [2024-06-18 15:38:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42542.9). Total num frames: 2666643456. Throughput: 0: 42499.7. Samples: 2666730160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:51,994][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 15:38:54,607][12883] Updated weights for policy 0, policy_version 162764 (0.0042) [2024-06-18 15:38:56,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42432.3). Total num frames: 2666823680. Throughput: 0: 42550.3. Samples: 2666989040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:38:56,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 15:38:58,386][12883] Updated weights for policy 0, policy_version 162774 (0.0037) [2024-06-18 15:39:01,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2667036672. Throughput: 0: 42313.7. Samples: 2667110120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:01,994][12645] Avg episode reward: [(0, '0.663')] [2024-06-18 15:39:02,063][12883] Updated weights for policy 0, policy_version 162784 (0.0029) [2024-06-18 15:39:06,083][12883] Updated weights for policy 0, policy_version 162794 (0.0029) [2024-06-18 15:39:06,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43149.0, 300 sec: 42487.3). Total num frames: 2667282432. Throughput: 0: 42461.3. Samples: 2667369180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:06,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 15:39:09,839][12883] Updated weights for policy 0, policy_version 162804 (0.0039) [2024-06-18 15:39:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 2667462656. Throughput: 0: 42605.4. Samples: 2667630400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:11,995][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 15:39:13,605][12883] Updated weights for policy 0, policy_version 162814 (0.0042) [2024-06-18 15:39:16,994][12645] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2667659264. Throughput: 0: 42311.1. Samples: 2667750900. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:16,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 15:39:17,385][12883] Updated weights for policy 0, policy_version 162824 (0.0042) [2024-06-18 15:39:21,078][12883] Updated weights for policy 0, policy_version 162834 (0.0023) [2024-06-18 15:39:21,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 2667921408. Throughput: 0: 42540.4. Samples: 2668009340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:21,994][12645] Avg episode reward: [(0, '0.622')] [2024-06-18 15:39:25,211][12883] Updated weights for policy 0, policy_version 162844 (0.0037) [2024-06-18 15:39:26,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 2668101632. Throughput: 0: 42618.1. Samples: 2668272180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:26,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 15:39:28,951][12883] Updated weights for policy 0, policy_version 162854 (0.0022) [2024-06-18 15:39:31,996][12645] Fps is (10 sec: 39312.9, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2668314624. Throughput: 0: 42519.4. Samples: 2668389220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:31,996][12645] Avg episode reward: [(0, '0.351')] [2024-06-18 15:39:32,923][12883] Updated weights for policy 0, policy_version 162864 (0.0036) [2024-06-18 15:39:36,594][12883] Updated weights for policy 0, policy_version 162874 (0.0037) [2024-06-18 15:39:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2668544000. Throughput: 0: 42775.1. Samples: 2668655040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:36,994][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 15:39:37,384][12862] Signal inference workers to stop experience collection... 
(39050 times) [2024-06-18 15:39:37,423][12883] InferenceWorker_p0-w0: stopping experience collection (39050 times) [2024-06-18 15:39:37,442][12862] Signal inference workers to resume experience collection... (39050 times) [2024-06-18 15:39:37,444][12883] InferenceWorker_p0-w0: resuming experience collection (39050 times) [2024-06-18 15:39:40,475][12883] Updated weights for policy 0, policy_version 162884 (0.0039) [2024-06-18 15:39:41,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42325.3, 300 sec: 42487.6). Total num frames: 2668740608. Throughput: 0: 42642.6. Samples: 2668907960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:41,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 15:39:44,150][12883] Updated weights for policy 0, policy_version 162894 (0.0042) [2024-06-18 15:39:46,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42596.9, 300 sec: 42487.0). Total num frames: 2668953600. Throughput: 0: 42701.4. Samples: 2669031780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:46,996][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 15:39:48,219][12883] Updated weights for policy 0, policy_version 162904 (0.0040) [2024-06-18 15:39:51,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 2669166592. Throughput: 0: 42838.1. Samples: 2669296900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 15:39:51,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 15:39:52,041][12883] Updated weights for policy 0, policy_version 162914 (0.0035) [2024-06-18 15:39:55,738][12883] Updated weights for policy 0, policy_version 162924 (0.0032) [2024-06-18 15:39:56,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42598.4, 300 sec: 42487.6). Total num frames: 2669379584. Throughput: 0: 42740.5. Samples: 2669553720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:39:56,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 15:39:59,567][12883] Updated weights for policy 0, policy_version 162934 (0.0039) [2024-06-18 15:40:01,996][12645] Fps is (10 sec: 44227.3, 60 sec: 42869.9, 300 sec: 42487.0). Total num frames: 2669608960. Throughput: 0: 42848.0. Samples: 2669679160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:01,997][12645] Avg episode reward: [(0, '0.488')] [2024-06-18 15:40:03,336][12883] Updated weights for policy 0, policy_version 162944 (0.0030) [2024-06-18 15:40:06,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2669821952. Throughput: 0: 42905.7. Samples: 2669940100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:06,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 15:40:07,130][12883] Updated weights for policy 0, policy_version 162954 (0.0042) [2024-06-18 15:40:10,908][12883] Updated weights for policy 0, policy_version 162964 (0.0035) [2024-06-18 15:40:11,994][12645] Fps is (10 sec: 42608.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 2670034944. Throughput: 0: 42765.9. Samples: 2670196640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:11,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 15:40:14,920][12883] Updated weights for policy 0, policy_version 162974 (0.0035) [2024-06-18 15:40:16,993][12645] Fps is (10 sec: 42599.5, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 2670247936. Throughput: 0: 42977.8. Samples: 2670323120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:16,994][12645] Avg episode reward: [(0, '0.204')] [2024-06-18 15:40:18,501][12883] Updated weights for policy 0, policy_version 162984 (0.0041) [2024-06-18 15:40:21,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2670460928. Throughput: 0: 42810.6. Samples: 2670581520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:21,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 15:40:22,693][12883] Updated weights for policy 0, policy_version 162994 (0.0043) [2024-06-18 15:40:26,010][12883] Updated weights for policy 0, policy_version 163004 (0.0028) [2024-06-18 15:40:26,994][12645] Fps is (10 sec: 44235.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2670690304. Throughput: 0: 42919.9. Samples: 2670839360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:26,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 15:40:30,466][12883] Updated weights for policy 0, policy_version 163014 (0.0036) [2024-06-18 15:40:31,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43146.1, 300 sec: 42598.4). Total num frames: 2670903296. Throughput: 0: 43039.5. Samples: 2670968460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:31,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 15:40:34,046][12883] Updated weights for policy 0, policy_version 163024 (0.0041) [2024-06-18 15:40:36,996][12645] Fps is (10 sec: 40951.1, 60 sec: 42596.8, 300 sec: 42542.5). Total num frames: 2671099904. Throughput: 0: 42694.8. Samples: 2671218260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:36,996][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 15:40:37,014][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163031_2671099904.pth... [2024-06-18 15:40:37,087][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162408_2660892672.pth [2024-06-18 15:40:38,106][12883] Updated weights for policy 0, policy_version 163034 (0.0039) [2024-06-18 15:40:41,675][12883] Updated weights for policy 0, policy_version 163044 (0.0034) [2024-06-18 15:40:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2671329280. Throughput: 0: 42725.8. Samples: 2671476380. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:41,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 15:40:45,841][12883] Updated weights for policy 0, policy_version 163054 (0.0037) [2024-06-18 15:40:46,994][12645] Fps is (10 sec: 44247.2, 60 sec: 43146.2, 300 sec: 42542.9). Total num frames: 2671542272. Throughput: 0: 42827.1. Samples: 2671606280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:46,994][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 15:40:49,264][12883] Updated weights for policy 0, policy_version 163064 (0.0040) [2024-06-18 15:40:51,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 2671755264. Throughput: 0: 42686.9. Samples: 2671861000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:51,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 15:40:53,318][12883] Updated weights for policy 0, policy_version 163074 (0.0043) [2024-06-18 15:40:56,866][12883] Updated weights for policy 0, policy_version 163084 (0.0033) [2024-06-18 15:40:56,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 2671968256. Throughput: 0: 42802.7. Samples: 2672122760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 15:40:56,994][12645] Avg episode reward: [(0, '0.407')] [2024-06-18 15:41:01,033][12883] Updated weights for policy 0, policy_version 163094 (0.0033) [2024-06-18 15:41:02,000][12645] Fps is (10 sec: 42571.3, 60 sec: 42868.6, 300 sec: 42542.0). Total num frames: 2672181248. Throughput: 0: 42695.3. Samples: 2672244680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:02,001][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 15:41:04,461][12862] Signal inference workers to stop experience collection... 
(39100 times) [2024-06-18 15:41:04,495][12883] InferenceWorker_p0-w0: stopping experience collection (39100 times) [2024-06-18 15:41:04,519][12862] Signal inference workers to resume experience collection... (39100 times) [2024-06-18 15:41:04,520][12883] InferenceWorker_p0-w0: resuming experience collection (39100 times) [2024-06-18 15:41:04,658][12883] Updated weights for policy 0, policy_version 163104 (0.0035) [2024-06-18 15:41:06,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2672394240. Throughput: 0: 42728.3. Samples: 2672504300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:06,994][12645] Avg episode reward: [(0, '0.828')] [2024-06-18 15:41:08,557][12883] Updated weights for policy 0, policy_version 163114 (0.0041) [2024-06-18 15:41:11,994][12645] Fps is (10 sec: 39346.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 2672574464. Throughput: 0: 42728.0. Samples: 2672762120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:11,994][12645] Avg episode reward: [(0, '0.648')] [2024-06-18 15:41:12,354][12883] Updated weights for policy 0, policy_version 163124 (0.0034) [2024-06-18 15:41:16,133][12883] Updated weights for policy 0, policy_version 163134 (0.0039) [2024-06-18 15:41:16,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 2672820224. Throughput: 0: 42694.6. Samples: 2672889720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:16,994][12645] Avg episode reward: [(0, '0.787')] [2024-06-18 15:41:20,257][12883] Updated weights for policy 0, policy_version 163144 (0.0040) [2024-06-18 15:41:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 2673016832. Throughput: 0: 42810.4. Samples: 2673144640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:21,994][12645] Avg episode reward: [(0, '0.803')] [2024-06-18 15:41:23,756][12883] Updated weights for policy 0, policy_version 163154 (0.0042) [2024-06-18 15:41:26,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2673229824. Throughput: 0: 42799.9. Samples: 2673402380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:26,994][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 15:41:27,774][12883] Updated weights for policy 0, policy_version 163164 (0.0033) [2024-06-18 15:41:31,355][12883] Updated weights for policy 0, policy_version 163174 (0.0026) [2024-06-18 15:41:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2673459200. Throughput: 0: 42657.6. Samples: 2673525880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:31,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 15:41:35,659][12883] Updated weights for policy 0, policy_version 163184 (0.0040) [2024-06-18 15:41:36,994][12645] Fps is (10 sec: 45875.9, 60 sec: 43146.2, 300 sec: 42709.7). Total num frames: 2673688576. Throughput: 0: 42914.6. Samples: 2673792160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:36,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 15:41:39,352][12883] Updated weights for policy 0, policy_version 163194 (0.0026) [2024-06-18 15:41:41,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 2673885184. Throughput: 0: 42811.6. Samples: 2674049280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:41,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 15:41:43,076][12883] Updated weights for policy 0, policy_version 163204 (0.0036) [2024-06-18 15:41:46,808][12883] Updated weights for policy 0, policy_version 163214 (0.0033) [2024-06-18 15:41:46,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2674114560. Throughput: 0: 42841.9. Samples: 2674172300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:46,994][12645] Avg episode reward: [(0, '0.444')] [2024-06-18 15:41:50,580][12883] Updated weights for policy 0, policy_version 163224 (0.0043) [2024-06-18 15:41:51,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2674327552. Throughput: 0: 42885.8. Samples: 2674434160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:51,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 15:41:54,441][12883] Updated weights for policy 0, policy_version 163234 (0.0031) [2024-06-18 15:41:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2674540544. Throughput: 0: 43043.5. Samples: 2674699080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:41:56,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 15:41:58,073][12883] Updated weights for policy 0, policy_version 163244 (0.0032) [2024-06-18 15:42:01,868][12883] Updated weights for policy 0, policy_version 163254 (0.0036) [2024-06-18 15:42:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 2674753536. Throughput: 0: 42906.4. Samples: 2674820500. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:01,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 15:42:05,564][12883] Updated weights for policy 0, policy_version 163264 (0.0026) [2024-06-18 15:42:06,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2674966528. Throughput: 0: 42992.1. Samples: 2675079280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:06,994][12645] Avg episode reward: [(0, '0.577')] [2024-06-18 15:42:09,439][12883] Updated weights for policy 0, policy_version 163274 (0.0036) [2024-06-18 15:42:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 2675179520. Throughput: 0: 43041.0. Samples: 2675339220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:11,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 15:42:13,139][12883] Updated weights for policy 0, policy_version 163284 (0.0025) [2024-06-18 15:42:16,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2675392512. Throughput: 0: 43156.1. Samples: 2675467900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:16,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 15:42:17,343][12883] Updated weights for policy 0, policy_version 163294 (0.0033) [2024-06-18 15:42:21,056][12883] Updated weights for policy 0, policy_version 163304 (0.0024) [2024-06-18 15:42:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2675605504. Throughput: 0: 42913.2. Samples: 2675723260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:21,994][12645] Avg episode reward: [(0, '0.648')] [2024-06-18 15:42:24,946][12883] Updated weights for policy 0, policy_version 163314 (0.0023) [2024-06-18 15:42:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 2675818496. Throughput: 0: 42806.5. Samples: 2675975580. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:26,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 15:42:29,055][12883] Updated weights for policy 0, policy_version 163324 (0.0038) [2024-06-18 15:42:31,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2676047872. Throughput: 0: 42941.8. Samples: 2676104680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:31,994][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 15:42:32,534][12862] Signal inference workers to stop experience collection... (39150 times) [2024-06-18 15:42:32,565][12883] InferenceWorker_p0-w0: stopping experience collection (39150 times) [2024-06-18 15:42:32,585][12862] Signal inference workers to resume experience collection... (39150 times) [2024-06-18 15:42:32,599][12883] InferenceWorker_p0-w0: resuming experience collection (39150 times) [2024-06-18 15:42:32,606][12883] Updated weights for policy 0, policy_version 163334 (0.0041) [2024-06-18 15:42:36,661][12883] Updated weights for policy 0, policy_version 163344 (0.0033) [2024-06-18 15:42:36,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2676260864. Throughput: 0: 43029.8. Samples: 2676370500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:36,994][12645] Avg episode reward: [(0, '0.417')] [2024-06-18 15:42:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163346_2676260864.pth... [2024-06-18 15:42:37,054][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000162719_2665988096.pth [2024-06-18 15:42:40,231][12883] Updated weights for policy 0, policy_version 163354 (0.0033) [2024-06-18 15:42:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2676457472. Throughput: 0: 42726.4. Samples: 2676621760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:41,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 15:42:44,239][12883] Updated weights for policy 0, policy_version 163364 (0.0028) [2024-06-18 15:42:46,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2676670464. Throughput: 0: 42796.0. Samples: 2676746320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:46,994][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 15:42:47,719][12883] Updated weights for policy 0, policy_version 163374 (0.0039) [2024-06-18 15:42:51,841][12883] Updated weights for policy 0, policy_version 163384 (0.0031) [2024-06-18 15:42:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2676883456. Throughput: 0: 42805.0. Samples: 2677005500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:51,994][12645] Avg episode reward: [(0, '0.393')] [2024-06-18 15:42:55,582][12883] Updated weights for policy 0, policy_version 163394 (0.0038) [2024-06-18 15:42:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2677112832. Throughput: 0: 42654.5. Samples: 2677258680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:42:56,994][12645] Avg episode reward: [(0, '0.533')] [2024-06-18 15:42:59,339][12883] Updated weights for policy 0, policy_version 163404 (0.0042) [2024-06-18 15:43:01,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42765.9). Total num frames: 2677309440. Throughput: 0: 42627.5. Samples: 2677386140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:43:01,994][12645] Avg episode reward: [(0, '0.305')] [2024-06-18 15:43:03,389][12883] Updated weights for policy 0, policy_version 163414 (0.0037) [2024-06-18 15:43:06,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2677522432. Throughput: 0: 42797.3. Samples: 2677649140. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 15:43:06,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 15:43:07,133][12883] Updated weights for policy 0, policy_version 163424 (0.0035) [2024-06-18 15:43:10,997][12883] Updated weights for policy 0, policy_version 163434 (0.0032) [2024-06-18 15:43:11,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2677735424. Throughput: 0: 42734.4. Samples: 2677898620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:11,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 15:43:14,689][12883] Updated weights for policy 0, policy_version 163444 (0.0035) [2024-06-18 15:43:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2677964800. Throughput: 0: 42758.2. Samples: 2678028800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:16,994][12645] Avg episode reward: [(0, '0.347')] [2024-06-18 15:43:18,687][12883] Updated weights for policy 0, policy_version 163454 (0.0027) [2024-06-18 15:43:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2678177792. Throughput: 0: 42641.0. Samples: 2678289340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:21,994][12645] Avg episode reward: [(0, '0.414')] [2024-06-18 15:43:22,438][12883] Updated weights for policy 0, policy_version 163464 (0.0035) [2024-06-18 15:43:26,178][12883] Updated weights for policy 0, policy_version 163474 (0.0032) [2024-06-18 15:43:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 2678390784. Throughput: 0: 42699.1. Samples: 2678543220. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:26,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 15:43:30,072][12883] Updated weights for policy 0, policy_version 163484 (0.0043) [2024-06-18 15:43:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2678603776. Throughput: 0: 42813.8. Samples: 2678672940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:31,994][12645] Avg episode reward: [(0, '0.307')] [2024-06-18 15:43:33,747][12883] Updated weights for policy 0, policy_version 163494 (0.0039) [2024-06-18 15:43:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2678800384. Throughput: 0: 42826.1. Samples: 2678932680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:36,994][12645] Avg episode reward: [(0, '0.546')] [2024-06-18 15:43:37,629][12883] Updated weights for policy 0, policy_version 163504 (0.0030) [2024-06-18 15:43:41,658][12883] Updated weights for policy 0, policy_version 163514 (0.0029) [2024-06-18 15:43:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2679029760. Throughput: 0: 42953.0. Samples: 2679191560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:41,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 15:43:45,368][12883] Updated weights for policy 0, policy_version 163524 (0.0026) [2024-06-18 15:43:46,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2679242752. Throughput: 0: 42933.9. Samples: 2679318160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:46,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 15:43:49,311][12883] Updated weights for policy 0, policy_version 163534 (0.0033) [2024-06-18 15:43:51,993][12645] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2679455744. Throughput: 0: 42727.3. Samples: 2679571860. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:51,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 15:43:53,078][12883] Updated weights for policy 0, policy_version 163544 (0.0025) [2024-06-18 15:43:56,908][12883] Updated weights for policy 0, policy_version 163554 (0.0040) [2024-06-18 15:43:56,996][12645] Fps is (10 sec: 42588.7, 60 sec: 42596.9, 300 sec: 42820.2). Total num frames: 2679668736. Throughput: 0: 42988.4. Samples: 2679833200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:43:56,997][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 15:44:00,903][12883] Updated weights for policy 0, policy_version 163564 (0.0035) [2024-06-18 15:44:01,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 2679881728. Throughput: 0: 42851.7. Samples: 2679957120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:44:01,994][12645] Avg episode reward: [(0, '0.398')] [2024-06-18 15:44:04,473][12883] Updated weights for policy 0, policy_version 163574 (0.0036) [2024-06-18 15:44:06,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2680111104. Throughput: 0: 42881.6. Samples: 2680219020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:44:06,994][12645] Avg episode reward: [(0, '0.384')] [2024-06-18 15:44:08,537][12883] Updated weights for policy 0, policy_version 163584 (0.0037) [2024-06-18 15:44:11,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2680307712. Throughput: 0: 42942.7. Samples: 2680475640. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 15:44:11,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 15:44:12,027][12883] Updated weights for policy 0, policy_version 163594 (0.0035) [2024-06-18 15:44:16,200][12883] Updated weights for policy 0, policy_version 163604 (0.0032) [2024-06-18 15:44:16,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2680520704. Throughput: 0: 42840.8. Samples: 2680600780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:44:16,994][12645] Avg episode reward: [(0, '0.630')] [2024-06-18 15:44:19,543][12883] Updated weights for policy 0, policy_version 163614 (0.0035) [2024-06-18 15:44:21,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2680766464. Throughput: 0: 42836.9. Samples: 2680860340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:44:21,994][12645] Avg episode reward: [(0, '0.601')] [2024-06-18 15:44:23,661][12883] Updated weights for policy 0, policy_version 163624 (0.0039) [2024-06-18 15:44:26,640][12862] Signal inference workers to stop experience collection... (39200 times) [2024-06-18 15:44:26,640][12862] Signal inference workers to resume experience collection... (39200 times) [2024-06-18 15:44:26,655][12883] InferenceWorker_p0-w0: stopping experience collection (39200 times) [2024-06-18 15:44:26,664][12883] InferenceWorker_p0-w0: resuming experience collection (39200 times) [2024-06-18 15:44:26,996][12645] Fps is (10 sec: 44226.8, 60 sec: 42869.8, 300 sec: 42876.1). Total num frames: 2680963072. Throughput: 0: 42785.8. Samples: 2681117020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:44:26,996][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 15:44:27,125][12883] Updated weights for policy 0, policy_version 163634 (0.0035) [2024-06-18 15:44:31,312][12883] Updated weights for policy 0, policy_version 163644 (0.0042) [2024-06-18 15:44:31,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2681159680. Throughput: 0: 42636.5. Samples: 2681236800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:44:31,994][12645] Avg episode reward: [(0, '0.516')] [2024-06-18 15:44:34,700][12883] Updated weights for policy 0, policy_version 163654 (0.0033) [2024-06-18 15:44:36,994][12645] Fps is (10 sec: 44246.6, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2681405440. Throughput: 0: 42663.4. Samples: 2681491720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:44:36,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 15:44:37,005][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163660_2681405440.pth... [2024-06-18 15:44:37,061][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163031_2671099904.pth [2024-06-18 15:44:39,091][12883] Updated weights for policy 0, policy_version 163664 (0.0047) [2024-06-18 15:44:41,994][12645] Fps is (10 sec: 44235.7, 60 sec: 42871.3, 300 sec: 42876.4). Total num frames: 2681602048. Throughput: 0: 42730.9. Samples: 2681756000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:44:42,000][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 15:44:42,734][12883] Updated weights for policy 0, policy_version 163674 (0.0036) [2024-06-18 15:44:46,994][12645] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2681782272. Throughput: 0: 42700.0. Samples: 2681878620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:44:46,994][12645] Avg episode reward: [(0, '0.571')] [2024-06-18 15:44:47,022][12883] Updated weights for policy 0, policy_version 163684 (0.0024) [2024-06-18 15:44:50,238][12883] Updated weights for policy 0, policy_version 163694 (0.0034) [2024-06-18 15:44:51,996][12645] Fps is (10 sec: 44227.6, 60 sec: 43142.8, 300 sec: 42931.3). Total num frames: 2682044416. Throughput: 0: 42601.5. Samples: 2682136180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:44:51,997][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 15:44:54,666][12883] Updated weights for policy 0, policy_version 163704 (0.0043) [2024-06-18 15:44:56,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42600.0, 300 sec: 42765.3). Total num frames: 2682224640. Throughput: 0: 42712.4. Samples: 2682397700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:44:56,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 15:44:57,882][12883] Updated weights for policy 0, policy_version 163714 (0.0042) [2024-06-18 15:45:01,994][12645] Fps is (10 sec: 37691.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2682421248. Throughput: 0: 42651.2. Samples: 2682520080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:45:01,998][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 15:45:02,245][12883] Updated weights for policy 0, policy_version 163724 (0.0033) [2024-06-18 15:45:05,443][12883] Updated weights for policy 0, policy_version 163734 (0.0028) [2024-06-18 15:45:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2682683392. Throughput: 0: 42592.5. Samples: 2682777000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:45:06,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 15:45:09,793][12883] Updated weights for policy 0, policy_version 163744 (0.0036) [2024-06-18 15:45:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2682863616. Throughput: 0: 42796.8. Samples: 2683042780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:45:11,994][12645] Avg episode reward: [(0, '0.851')] [2024-06-18 15:45:13,057][12883] Updated weights for policy 0, policy_version 163754 (0.0030) [2024-06-18 15:45:17,000][12645] Fps is (10 sec: 37659.6, 60 sec: 42320.9, 300 sec: 42708.6). Total num frames: 2683060224. Throughput: 0: 42785.5. Samples: 2683162420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 15:45:17,000][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 15:45:17,396][12883] Updated weights for policy 0, policy_version 163764 (0.0028) [2024-06-18 15:45:20,818][12883] Updated weights for policy 0, policy_version 163774 (0.0035) [2024-06-18 15:45:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2683322368. Throughput: 0: 42741.4. Samples: 2683415080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:45:21,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 15:45:25,436][12883] Updated weights for policy 0, policy_version 163784 (0.0024) [2024-06-18 15:45:26,994][12645] Fps is (10 sec: 42625.0, 60 sec: 42053.9, 300 sec: 42653.9). Total num frames: 2683486208. Throughput: 0: 42642.8. Samples: 2683674920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:45:26,994][12645] Avg episode reward: [(0, '0.216')] [2024-06-18 15:45:28,590][12883] Updated weights for policy 0, policy_version 163794 (0.0037) [2024-06-18 15:45:30,880][12862] Signal inference workers to stop experience collection... 
(39250 times) [2024-06-18 15:45:30,937][12883] InferenceWorker_p0-w0: stopping experience collection (39250 times) [2024-06-18 15:45:30,937][12862] Signal inference workers to resume experience collection... (39250 times) [2024-06-18 15:45:30,963][12883] InferenceWorker_p0-w0: resuming experience collection (39250 times) [2024-06-18 15:45:31,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42765.3). Total num frames: 2683715584. Throughput: 0: 42555.0. Samples: 2683793600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:45:31,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 15:45:33,006][12883] Updated weights for policy 0, policy_version 163804 (0.0035) [2024-06-18 15:45:36,459][12883] Updated weights for policy 0, policy_version 163814 (0.0033) [2024-06-18 15:45:36,994][12645] Fps is (10 sec: 47513.0, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2683961344. Throughput: 0: 42639.3. Samples: 2684054860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:45:36,994][12645] Avg episode reward: [(0, '0.449')] [2024-06-18 15:45:40,808][12883] Updated weights for policy 0, policy_version 163824 (0.0032) [2024-06-18 15:45:41,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 2684125184. Throughput: 0: 42415.2. Samples: 2684306380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:45:41,994][12645] Avg episode reward: [(0, '0.466')] [2024-06-18 15:45:44,129][12883] Updated weights for policy 0, policy_version 163834 (0.0028) [2024-06-18 15:45:46,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 2684370944. Throughput: 0: 42378.1. Samples: 2684427100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:45:46,994][12645] Avg episode reward: [(0, '0.348')] [2024-06-18 15:45:48,479][12883] Updated weights for policy 0, policy_version 163844 (0.0037) [2024-06-18 15:45:51,945][12883] Updated weights for policy 0, policy_version 163854 (0.0034) [2024-06-18 15:45:51,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42327.0, 300 sec: 42765.0). Total num frames: 2684583936. Throughput: 0: 42477.8. Samples: 2684688500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:45:51,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 15:45:56,171][12883] Updated weights for policy 0, policy_version 163864 (0.0035) [2024-06-18 15:45:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42710.4). Total num frames: 2684780544. Throughput: 0: 42238.1. Samples: 2684943500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:45:56,994][12645] Avg episode reward: [(0, '0.515')] [2024-06-18 15:45:59,736][12883] Updated weights for policy 0, policy_version 163874 (0.0039) [2024-06-18 15:46:01,994][12645] Fps is (10 sec: 42598.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2685009920. Throughput: 0: 42436.6. Samples: 2685071800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:46:01,994][12645] Avg episode reward: [(0, '0.673')] [2024-06-18 15:46:03,659][12883] Updated weights for policy 0, policy_version 163884 (0.0035) [2024-06-18 15:46:06,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42820.6). Total num frames: 2685206528. Throughput: 0: 42641.3. Samples: 2685333940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:46:06,994][12645] Avg episode reward: [(0, '0.484')] [2024-06-18 15:46:07,407][12883] Updated weights for policy 0, policy_version 163894 (0.0037) [2024-06-18 15:46:11,373][12883] Updated weights for policy 0, policy_version 163904 (0.0025) [2024-06-18 15:46:11,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2685435904. Throughput: 0: 42417.8. Samples: 2685583720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:46:11,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 15:46:14,939][12883] Updated weights for policy 0, policy_version 163914 (0.0033) [2024-06-18 15:46:16,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43149.0, 300 sec: 42820.6). Total num frames: 2685648896. Throughput: 0: 42695.1. Samples: 2685714880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:46:16,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 15:46:18,889][12883] Updated weights for policy 0, policy_version 163924 (0.0021) [2024-06-18 15:46:21,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 2685845504. Throughput: 0: 42514.4. Samples: 2685968000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-18 15:46:21,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 15:46:22,617][12883] Updated weights for policy 0, policy_version 163934 (0.0042) [2024-06-18 15:46:26,479][12883] Updated weights for policy 0, policy_version 163944 (0.0030) [2024-06-18 15:46:26,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2686058496. Throughput: 0: 42598.2. Samples: 2686223300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:46:26,994][12645] Avg episode reward: [(0, '0.375')] [2024-06-18 15:46:30,271][12883] Updated weights for policy 0, policy_version 163954 (0.0028) [2024-06-18 15:46:31,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2686287872. Throughput: 0: 42822.7. Samples: 2686354120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:46:31,994][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 15:46:34,515][12883] Updated weights for policy 0, policy_version 163964 (0.0035) [2024-06-18 15:46:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2686500864. Throughput: 0: 42629.7. Samples: 2686606840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:46:36,994][12645] Avg episode reward: [(0, '0.596')] [2024-06-18 15:46:37,006][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163971_2686500864.pth... [2024-06-18 15:46:37,066][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163346_2676260864.pth [2024-06-18 15:46:37,979][12883] Updated weights for policy 0, policy_version 163974 (0.0031) [2024-06-18 15:46:41,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 2686697472. Throughput: 0: 42602.0. Samples: 2686860580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:46:41,994][12645] Avg episode reward: [(0, '0.278')] [2024-06-18 15:46:42,184][12883] Updated weights for policy 0, policy_version 163984 (0.0030) [2024-06-18 15:46:45,739][12883] Updated weights for policy 0, policy_version 163994 (0.0030) [2024-06-18 15:46:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2686910464. Throughput: 0: 42527.0. Samples: 2686985520. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:46:46,994][12645] Avg episode reward: [(0, '0.600')] [2024-06-18 15:46:50,051][12883] Updated weights for policy 0, policy_version 164004 (0.0030) [2024-06-18 15:46:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2687123456. Throughput: 0: 42422.1. Samples: 2687242940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:46:51,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 15:46:53,344][12883] Updated weights for policy 0, policy_version 164014 (0.0043) [2024-06-18 15:46:56,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2687336448. Throughput: 0: 42556.0. Samples: 2687498740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:46:56,994][12645] Avg episode reward: [(0, '0.357')] [2024-06-18 15:46:57,664][12883] Updated weights for policy 0, policy_version 164024 (0.0040) [2024-06-18 15:47:01,054][12883] Updated weights for policy 0, policy_version 164034 (0.0040) [2024-06-18 15:47:01,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 2687549440. Throughput: 0: 42336.8. Samples: 2687620040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:47:01,994][12645] Avg episode reward: [(0, '0.445')] [2024-06-18 15:47:05,252][12883] Updated weights for policy 0, policy_version 164044 (0.0037) [2024-06-18 15:47:06,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2687762432. Throughput: 0: 42410.1. Samples: 2687876460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:47:06,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 15:47:08,020][12862] Signal inference workers to stop experience collection... 
(39300 times) [2024-06-18 15:47:08,071][12883] InferenceWorker_p0-w0: stopping experience collection (39300 times) [2024-06-18 15:47:08,135][12862] Signal inference workers to resume experience collection... (39300 times) [2024-06-18 15:47:08,135][12883] InferenceWorker_p0-w0: resuming experience collection (39300 times) [2024-06-18 15:47:08,801][12883] Updated weights for policy 0, policy_version 164054 (0.0028) [2024-06-18 15:47:11,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2687975424. Throughput: 0: 42567.9. Samples: 2688138860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:47:11,994][12645] Avg episode reward: [(0, '0.442')] [2024-06-18 15:47:13,397][12883] Updated weights for policy 0, policy_version 164064 (0.0034) [2024-06-18 15:47:16,470][12883] Updated weights for policy 0, policy_version 164074 (0.0040) [2024-06-18 15:47:16,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2688204800. Throughput: 0: 42452.6. Samples: 2688264480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:47:16,994][12645] Avg episode reward: [(0, '0.454')] [2024-06-18 15:47:20,985][12883] Updated weights for policy 0, policy_version 164084 (0.0034) [2024-06-18 15:47:21,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2688417792. Throughput: 0: 42616.0. Samples: 2688524560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:47:21,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 15:47:24,431][12883] Updated weights for policy 0, policy_version 164094 (0.0048) [2024-06-18 15:47:26,995][12645] Fps is (10 sec: 42591.9, 60 sec: 42870.3, 300 sec: 42653.7). Total num frames: 2688630784. Throughput: 0: 42632.3. Samples: 2688779100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 15:47:27,000][12645] Avg episode reward: [(0, '0.688')] [2024-06-18 15:47:28,586][12883] Updated weights for policy 0, policy_version 164104 (0.0044) [2024-06-18 15:47:31,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2688843776. Throughput: 0: 42569.4. Samples: 2688901140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:47:31,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 15:47:31,991][12883] Updated weights for policy 0, policy_version 164114 (0.0030) [2024-06-18 15:47:36,099][12883] Updated weights for policy 0, policy_version 164124 (0.0029) [2024-06-18 15:47:36,994][12645] Fps is (10 sec: 40966.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2689040384. Throughput: 0: 42630.3. Samples: 2689161300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:47:36,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 15:47:39,686][12883] Updated weights for policy 0, policy_version 164134 (0.0038) [2024-06-18 15:47:41,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2689253376. Throughput: 0: 42579.0. Samples: 2689414800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:47:41,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 15:47:43,751][12883] Updated weights for policy 0, policy_version 164144 (0.0029) [2024-06-18 15:47:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2689466368. Throughput: 0: 42710.7. Samples: 2689542020. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:47:47,000][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 15:47:47,929][12883] Updated weights for policy 0, policy_version 164154 (0.0037) [2024-06-18 15:47:51,454][12883] Updated weights for policy 0, policy_version 164164 (0.0026) [2024-06-18 15:47:51,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2689679360. Throughput: 0: 42663.5. Samples: 2689796320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:47:51,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 15:47:55,706][12883] Updated weights for policy 0, policy_version 164174 (0.0030) [2024-06-18 15:47:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 2689875968. Throughput: 0: 42548.0. Samples: 2690053520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:47:56,994][12645] Avg episode reward: [(0, '0.799')] [2024-06-18 15:47:59,248][12883] Updated weights for policy 0, policy_version 164184 (0.0037) [2024-06-18 15:48:01,994][12645] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2690105344. Throughput: 0: 42467.6. Samples: 2690175520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:48:01,994][12645] Avg episode reward: [(0, '0.786')] [2024-06-18 15:48:03,270][12883] Updated weights for policy 0, policy_version 164194 (0.0023) [2024-06-18 15:48:06,825][12883] Updated weights for policy 0, policy_version 164204 (0.0028) [2024-06-18 15:48:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 2690318336. Throughput: 0: 42524.6. Samples: 2690438160. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:48:06,994][12645] Avg episode reward: [(0, '0.574')] [2024-06-18 15:48:10,832][12883] Updated weights for policy 0, policy_version 164214 (0.0040) [2024-06-18 15:48:11,996][12645] Fps is (10 sec: 40950.6, 60 sec: 42323.8, 300 sec: 42542.5). Total num frames: 2690514944. Throughput: 0: 42623.8. Samples: 2690697200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:48:11,996][12645] Avg episode reward: [(0, '0.732')] [2024-06-18 15:48:14,467][12883] Updated weights for policy 0, policy_version 164224 (0.0036) [2024-06-18 15:48:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2690760704. Throughput: 0: 42540.9. Samples: 2690815480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:48:16,994][12645] Avg episode reward: [(0, '0.810')] [2024-06-18 15:48:18,827][12883] Updated weights for policy 0, policy_version 164234 (0.0024) [2024-06-18 15:48:21,990][12883] Updated weights for policy 0, policy_version 164244 (0.0032) [2024-06-18 15:48:21,994][12645] Fps is (10 sec: 45885.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2690973696. Throughput: 0: 42688.9. Samples: 2691082300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:48:21,994][12645] Avg episode reward: [(0, '0.814')] [2024-06-18 15:48:26,492][12883] Updated weights for policy 0, policy_version 164254 (0.0032) [2024-06-18 15:48:26,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42053.3, 300 sec: 42542.8). Total num frames: 2691153920. Throughput: 0: 42850.7. Samples: 2691343080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:48:26,994][12645] Avg episode reward: [(0, '0.818')] [2024-06-18 15:48:29,553][12883] Updated weights for policy 0, policy_version 164264 (0.0044) [2024-06-18 15:48:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2691432448. Throughput: 0: 42739.1. Samples: 2691465280. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:48:31,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 15:48:33,912][12883] Updated weights for policy 0, policy_version 164274 (0.0028) [2024-06-18 15:48:36,906][12862] Signal inference workers to stop experience collection... (39350 times) [2024-06-18 15:48:36,952][12883] InferenceWorker_p0-w0: stopping experience collection (39350 times) [2024-06-18 15:48:36,960][12862] Signal inference workers to resume experience collection... (39350 times) [2024-06-18 15:48:36,966][12883] InferenceWorker_p0-w0: resuming experience collection (39350 times) [2024-06-18 15:48:36,994][12645] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2691596288. Throughput: 0: 42870.4. Samples: 2691725480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:48:36,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 15:48:37,088][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164283_2691612672.pth... [2024-06-18 15:48:37,156][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163660_2681405440.pth [2024-06-18 15:48:37,307][12883] Updated weights for policy 0, policy_version 164284 (0.0029) [2024-06-18 15:48:41,386][12883] Updated weights for policy 0, policy_version 164294 (0.0031) [2024-06-18 15:48:41,994][12645] Fps is (10 sec: 36044.5, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 2691792896. Throughput: 0: 42811.4. Samples: 2691980040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:48:41,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 15:48:44,826][12883] Updated weights for policy 0, policy_version 164304 (0.0023) [2024-06-18 15:48:46,994][12645] Fps is (10 sec: 45874.1, 60 sec: 43144.5, 300 sec: 42709.4). Total num frames: 2692055040. Throughput: 0: 42989.6. Samples: 2692110060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:48:46,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 15:48:48,872][12883] Updated weights for policy 0, policy_version 164314 (0.0033) [2024-06-18 15:48:51,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 2692235264. Throughput: 0: 42999.5. Samples: 2692373140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:48:51,994][12645] Avg episode reward: [(0, '0.502')] [2024-06-18 15:48:52,478][12883] Updated weights for policy 0, policy_version 164324 (0.0027) [2024-06-18 15:48:56,544][12883] Updated weights for policy 0, policy_version 164334 (0.0034) [2024-06-18 15:48:56,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2692448256. Throughput: 0: 42787.4. Samples: 2692622540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:48:56,994][12645] Avg episode reward: [(0, '0.341')] [2024-06-18 15:49:00,061][12883] Updated weights for policy 0, policy_version 164344 (0.0042) [2024-06-18 15:49:01,996][12645] Fps is (10 sec: 45864.8, 60 sec: 43142.9, 300 sec: 42653.6). Total num frames: 2692694016. Throughput: 0: 43000.1. Samples: 2692750580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:49:01,996][12645] Avg episode reward: [(0, '0.289')] [2024-06-18 15:49:04,202][12883] Updated weights for policy 0, policy_version 164354 (0.0031) [2024-06-18 15:49:06,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2692890624. Throughput: 0: 43040.1. Samples: 2693019100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:49:06,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 15:49:07,520][12883] Updated weights for policy 0, policy_version 164364 (0.0033) [2024-06-18 15:49:11,912][12883] Updated weights for policy 0, policy_version 164374 (0.0040) [2024-06-18 15:49:11,994][12645] Fps is (10 sec: 40969.0, 60 sec: 43146.1, 300 sec: 42653.9). Total num frames: 2693103616. Throughput: 0: 42672.9. Samples: 2693263360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:49:11,994][12645] Avg episode reward: [(0, '0.620')] [2024-06-18 15:49:15,268][12883] Updated weights for policy 0, policy_version 164384 (0.0044) [2024-06-18 15:49:16,994][12645] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2693332992. Throughput: 0: 42832.4. Samples: 2693392740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:49:16,994][12645] Avg episode reward: [(0, '0.750')] [2024-06-18 15:49:19,514][12883] Updated weights for policy 0, policy_version 164394 (0.0032) [2024-06-18 15:49:21,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42543.2). Total num frames: 2693513216. Throughput: 0: 42925.7. Samples: 2693657140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:49:21,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 15:49:23,195][12883] Updated weights for policy 0, policy_version 164404 (0.0034) [2024-06-18 15:49:26,994][12645] Fps is (10 sec: 40960.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 2693742592. Throughput: 0: 42746.4. Samples: 2693903620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:49:26,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 15:49:27,708][12883] Updated weights for policy 0, policy_version 164414 (0.0026) [2024-06-18 15:49:30,658][12883] Updated weights for policy 0, policy_version 164424 (0.0037) [2024-06-18 15:49:31,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2693971968. Throughput: 0: 42747.3. Samples: 2694033680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:49:31,994][12645] Avg episode reward: [(0, '0.646')] [2024-06-18 15:49:35,380][12883] Updated weights for policy 0, policy_version 164434 (0.0034) [2024-06-18 15:49:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2694168576. Throughput: 0: 42675.1. Samples: 2694293520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:49:36,994][12645] Avg episode reward: [(0, '0.621')] [2024-06-18 15:49:38,278][12883] Updated weights for policy 0, policy_version 164444 (0.0038) [2024-06-18 15:49:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2694381568. Throughput: 0: 42723.1. Samples: 2694545080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:49:41,994][12645] Avg episode reward: [(0, '0.674')] [2024-06-18 15:49:42,877][12883] Updated weights for policy 0, policy_version 164454 (0.0040) [2024-06-18 15:49:44,209][12862] Signal inference workers to stop experience collection... (39400 times) [2024-06-18 15:49:44,209][12862] Signal inference workers to resume experience collection... 
(39400 times) [2024-06-18 15:49:44,253][12883] InferenceWorker_p0-w0: stopping experience collection (39400 times) [2024-06-18 15:49:44,254][12883] InferenceWorker_p0-w0: resuming experience collection (39400 times) [2024-06-18 15:49:46,238][12883] Updated weights for policy 0, policy_version 164464 (0.0037) [2024-06-18 15:49:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42598.7). Total num frames: 2694610944. Throughput: 0: 42791.9. Samples: 2694676120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:49:46,994][12645] Avg episode reward: [(0, '0.312')] [2024-06-18 15:49:50,470][12883] Updated weights for policy 0, policy_version 164474 (0.0027) [2024-06-18 15:49:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2694791168. Throughput: 0: 42528.8. Samples: 2694932900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:49:51,994][12645] Avg episode reward: [(0, '0.317')] [2024-06-18 15:49:53,691][12883] Updated weights for policy 0, policy_version 164484 (0.0029) [2024-06-18 15:49:56,996][12645] Fps is (10 sec: 40950.8, 60 sec: 42869.9, 300 sec: 42709.1). Total num frames: 2695020544. Throughput: 0: 42794.8. Samples: 2695189220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:49:56,996][12645] Avg episode reward: [(0, '0.342')] [2024-06-18 15:49:58,294][12883] Updated weights for policy 0, policy_version 164494 (0.0037) [2024-06-18 15:50:01,435][12883] Updated weights for policy 0, policy_version 164504 (0.0035) [2024-06-18 15:50:01,996][12645] Fps is (10 sec: 45865.1, 60 sec: 42598.4, 300 sec: 42598.1). Total num frames: 2695249920. Throughput: 0: 42851.8. Samples: 2695321160. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:50:01,997][12645] Avg episode reward: [(0, '0.378')] [2024-06-18 15:50:06,358][12883] Updated weights for policy 0, policy_version 164514 (0.0038) [2024-06-18 15:50:06,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2695430144. Throughput: 0: 42579.0. Samples: 2695573200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:50:06,994][12645] Avg episode reward: [(0, '0.485')] [2024-06-18 15:50:09,314][12883] Updated weights for policy 0, policy_version 164524 (0.0047) [2024-06-18 15:50:12,000][12645] Fps is (10 sec: 42581.4, 60 sec: 42867.1, 300 sec: 42765.0). Total num frames: 2695675904. Throughput: 0: 42648.3. Samples: 2695823060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:50:12,000][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 15:50:13,826][12883] Updated weights for policy 0, policy_version 164534 (0.0030) [2024-06-18 15:50:16,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2695872512. Throughput: 0: 42790.2. Samples: 2695959240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:50:16,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 15:50:17,076][12883] Updated weights for policy 0, policy_version 164544 (0.0046) [2024-06-18 15:50:21,319][12883] Updated weights for policy 0, policy_version 164554 (0.0031) [2024-06-18 15:50:21,994][12645] Fps is (10 sec: 39345.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2696069120. Throughput: 0: 42560.8. Samples: 2696208760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:50:21,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 15:50:24,533][12883] Updated weights for policy 0, policy_version 164564 (0.0027) [2024-06-18 15:50:26,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2696314880. Throughput: 0: 42563.6. Samples: 2696460440. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:50:26,994][12645] Avg episode reward: [(0, '0.598')] [2024-06-18 15:50:29,316][12883] Updated weights for policy 0, policy_version 164574 (0.0043) [2024-06-18 15:50:31,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 2696511488. Throughput: 0: 42613.7. Samples: 2696593740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:50:31,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 15:50:32,229][12883] Updated weights for policy 0, policy_version 164584 (0.0042) [2024-06-18 15:50:36,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2696708096. Throughput: 0: 42492.9. Samples: 2696845080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:50:36,994][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 15:50:36,999][12883] Updated weights for policy 0, policy_version 164594 (0.0036) [2024-06-18 15:50:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164594_2696708096.pth... [2024-06-18 15:50:37,078][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000163971_2686500864.pth [2024-06-18 15:50:39,857][12883] Updated weights for policy 0, policy_version 164604 (0.0054) [2024-06-18 15:50:42,000][12645] Fps is (10 sec: 42571.7, 60 sec: 42593.9, 300 sec: 42597.5). Total num frames: 2696937472. Throughput: 0: 42474.8. Samples: 2697100760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-18 15:50:42,001][12645] Avg episode reward: [(0, '0.473')] [2024-06-18 15:50:44,389][12883] Updated weights for policy 0, policy_version 164614 (0.0030) [2024-06-18 15:50:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2697166848. Throughput: 0: 42367.4. Samples: 2697227600. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:50:46,994][12645] Avg episode reward: [(0, '0.368')] [2024-06-18 15:50:47,888][12883] Updated weights for policy 0, policy_version 164624 (0.0041) [2024-06-18 15:50:51,777][12883] Updated weights for policy 0, policy_version 164634 (0.0035) [2024-06-18 15:50:51,994][12645] Fps is (10 sec: 42625.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 2697363456. Throughput: 0: 42522.3. Samples: 2697486700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:50:51,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 15:50:55,484][12883] Updated weights for policy 0, policy_version 164644 (0.0040) [2024-06-18 15:50:56,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42873.1, 300 sec: 42653.9). Total num frames: 2697592832. Throughput: 0: 42638.4. Samples: 2697741520. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:50:56,994][12645] Avg episode reward: [(0, '0.322')] [2024-06-18 15:50:58,008][12862] Signal inference workers to stop experience collection... (39450 times) [2024-06-18 15:50:58,008][12862] Signal inference workers to resume experience collection... (39450 times) [2024-06-18 15:50:58,058][12883] InferenceWorker_p0-w0: stopping experience collection (39450 times) [2024-06-18 15:50:58,058][12883] InferenceWorker_p0-w0: resuming experience collection (39450 times) [2024-06-18 15:50:59,329][12883] Updated weights for policy 0, policy_version 164654 (0.0031) [2024-06-18 15:51:01,996][12645] Fps is (10 sec: 44227.2, 60 sec: 42598.4, 300 sec: 42709.2). Total num frames: 2697805824. Throughput: 0: 42508.1. Samples: 2697872200. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:01,996][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 15:51:03,152][12883] Updated weights for policy 0, policy_version 164664 (0.0033) [2024-06-18 15:51:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 2698002432. 
Throughput: 0: 42736.1. Samples: 2698131880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:06,994][12645] Avg episode reward: [(0, '0.741')] [2024-06-18 15:51:07,112][12883] Updated weights for policy 0, policy_version 164674 (0.0038) [2024-06-18 15:51:10,869][12883] Updated weights for policy 0, policy_version 164684 (0.0038) [2024-06-18 15:51:11,994][12645] Fps is (10 sec: 44246.9, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 2698248192. Throughput: 0: 42632.5. Samples: 2698378900. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:11,994][12645] Avg episode reward: [(0, '0.607')] [2024-06-18 15:51:15,017][12883] Updated weights for policy 0, policy_version 164694 (0.0023) [2024-06-18 15:51:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2698444800. Throughput: 0: 42710.2. Samples: 2698515700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:16,994][12645] Avg episode reward: [(0, '0.472')] [2024-06-18 15:51:18,289][12883] Updated weights for policy 0, policy_version 164704 (0.0023) [2024-06-18 15:51:21,994][12645] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2698625024. Throughput: 0: 42908.6. Samples: 2698775960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:21,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 15:51:22,681][12883] Updated weights for policy 0, policy_version 164714 (0.0030) [2024-06-18 15:51:25,773][12883] Updated weights for policy 0, policy_version 164724 (0.0042) [2024-06-18 15:51:26,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2698887168. Throughput: 0: 42806.4. Samples: 2699026780. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:26,994][12645] Avg episode reward: [(0, '0.557')] [2024-06-18 15:51:30,093][12883] Updated weights for policy 0, policy_version 164734 (0.0028) [2024-06-18 15:51:31,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2699083776. Throughput: 0: 43009.9. Samples: 2699163040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:31,994][12645] Avg episode reward: [(0, '0.589')] [2024-06-18 15:51:33,200][12883] Updated weights for policy 0, policy_version 164744 (0.0022) [2024-06-18 15:51:36,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2699296768. Throughput: 0: 42983.6. Samples: 2699420960. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:36,994][12645] Avg episode reward: [(0, '0.714')] [2024-06-18 15:51:37,478][12883] Updated weights for policy 0, policy_version 164754 (0.0032) [2024-06-18 15:51:40,862][12883] Updated weights for policy 0, policy_version 164764 (0.0029) [2024-06-18 15:51:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43149.1, 300 sec: 42765.0). Total num frames: 2699526144. Throughput: 0: 42885.8. Samples: 2699671380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:41,994][12645] Avg episode reward: [(0, '0.755')] [2024-06-18 15:51:45,237][12883] Updated weights for policy 0, policy_version 164774 (0.0040) [2024-06-18 15:51:46,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2699739136. Throughput: 0: 43067.9. Samples: 2699810160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 15:51:46,994][12645] Avg episode reward: [(0, '0.818')] [2024-06-18 15:51:48,688][12883] Updated weights for policy 0, policy_version 164784 (0.0042) [2024-06-18 15:51:51,994][12645] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 2699919360. Throughput: 0: 42768.3. Samples: 2700056460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:51:51,994][12645] Avg episode reward: [(0, '0.379')] [2024-06-18 15:51:52,885][12883] Updated weights for policy 0, policy_version 164794 (0.0025) [2024-06-18 15:51:56,338][12883] Updated weights for policy 0, policy_version 164804 (0.0035) [2024-06-18 15:51:56,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2700165120. Throughput: 0: 42902.1. Samples: 2700309500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:51:56,999][12645] Avg episode reward: [(0, '0.364')] [2024-06-18 15:52:00,346][12883] Updated weights for policy 0, policy_version 164814 (0.0029) [2024-06-18 15:52:01,994][12645] Fps is (10 sec: 47514.7, 60 sec: 43146.2, 300 sec: 42820.6). Total num frames: 2700394496. Throughput: 0: 43083.7. Samples: 2700454460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:01,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 15:52:04,142][12883] Updated weights for policy 0, policy_version 164824 (0.0042) [2024-06-18 15:52:06,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2700574720. Throughput: 0: 42877.3. Samples: 2700705440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:06,994][12645] Avg episode reward: [(0, '0.521')] [2024-06-18 15:52:08,253][12883] Updated weights for policy 0, policy_version 164834 (0.0037) [2024-06-18 15:52:11,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 2700787712. Throughput: 0: 43072.0. Samples: 2700965020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:11,994][12645] Avg episode reward: [(0, '0.540')] [2024-06-18 15:52:12,011][12883] Updated weights for policy 0, policy_version 164844 (0.0042) [2024-06-18 15:52:15,768][12883] Updated weights for policy 0, policy_version 164854 (0.0035) [2024-06-18 15:52:16,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2701033472. Throughput: 0: 42931.6. Samples: 2701094960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:16,994][12645] Avg episode reward: [(0, '0.561')] [2024-06-18 15:52:17,288][12862] Signal inference workers to stop experience collection... (39500 times) [2024-06-18 15:52:17,321][12883] InferenceWorker_p0-w0: stopping experience collection (39500 times) [2024-06-18 15:52:17,344][12862] Signal inference workers to resume experience collection... (39500 times) [2024-06-18 15:52:17,348][12883] InferenceWorker_p0-w0: resuming experience collection (39500 times) [2024-06-18 15:52:19,543][12883] Updated weights for policy 0, policy_version 164864 (0.0034) [2024-06-18 15:52:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42654.2). Total num frames: 2701213696. Throughput: 0: 42986.6. Samples: 2701355360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:21,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 15:52:23,083][12883] Updated weights for policy 0, policy_version 164874 (0.0041) [2024-06-18 15:52:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2701459456. Throughput: 0: 43036.5. Samples: 2701608020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:26,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 15:52:26,997][12883] Updated weights for policy 0, policy_version 164884 (0.0040) [2024-06-18 15:52:30,599][12883] Updated weights for policy 0, policy_version 164894 (0.0027) [2024-06-18 15:52:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 2701672448. Throughput: 0: 42952.8. Samples: 2701743040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:31,995][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 15:52:34,464][12883] Updated weights for policy 0, policy_version 164904 (0.0029) [2024-06-18 15:52:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2701869056. Throughput: 0: 43231.3. Samples: 2702001860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:36,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 15:52:37,012][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164910_2701885440.pth... [2024-06-18 15:52:37,059][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164283_2691612672.pth [2024-06-18 15:52:38,435][12883] Updated weights for policy 0, policy_version 164914 (0.0026) [2024-06-18 15:52:41,971][12883] Updated weights for policy 0, policy_version 164924 (0.0025) [2024-06-18 15:52:41,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2702114816. Throughput: 0: 43205.8. Samples: 2702253760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:41,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 15:52:46,039][12883] Updated weights for policy 0, policy_version 164934 (0.0023) [2024-06-18 15:52:46,994][12645] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2702311424. Throughput: 0: 42942.5. Samples: 2702386880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 15:52:46,994][12645] Avg episode reward: [(0, '0.572')] [2024-06-18 15:52:49,496][12883] Updated weights for policy 0, policy_version 164944 (0.0031) [2024-06-18 15:52:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 2702524416. Throughput: 0: 43059.1. Samples: 2702643100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:52:51,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 15:52:53,486][12883] Updated weights for policy 0, policy_version 164954 (0.0033) [2024-06-18 15:52:56,994][12645] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2702753792. Throughput: 0: 42914.7. Samples: 2702896180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:52:56,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 15:52:57,091][12883] Updated weights for policy 0, policy_version 164964 (0.0032) [2024-06-18 15:53:01,335][12883] Updated weights for policy 0, policy_version 164974 (0.0032) [2024-06-18 15:53:01,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2702950400. Throughput: 0: 42922.1. Samples: 2703026460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:01,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 15:53:04,607][12883] Updated weights for policy 0, policy_version 164984 (0.0037) [2024-06-18 15:53:06,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 2703163392. Throughput: 0: 42865.4. Samples: 2703284300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:06,994][12645] Avg episode reward: [(0, '0.525')] [2024-06-18 15:53:08,911][12883] Updated weights for policy 0, policy_version 164994 (0.0027) [2024-06-18 15:53:11,996][12645] Fps is (10 sec: 45864.2, 60 sec: 43688.9, 300 sec: 42875.7). Total num frames: 2703409152. Throughput: 0: 42947.4. Samples: 2703540760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:11,997][12645] Avg episode reward: [(0, '0.607')] [2024-06-18 15:53:12,526][12883] Updated weights for policy 0, policy_version 165004 (0.0032) [2024-06-18 15:53:16,653][12883] Updated weights for policy 0, policy_version 165014 (0.0038) [2024-06-18 15:53:16,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2703605760. Throughput: 0: 42844.0. Samples: 2703671020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:16,994][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 15:53:19,949][12883] Updated weights for policy 0, policy_version 165024 (0.0040) [2024-06-18 15:53:21,994][12645] Fps is (10 sec: 40969.0, 60 sec: 43417.4, 300 sec: 42931.6). Total num frames: 2703818752. Throughput: 0: 42823.3. Samples: 2703928920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:21,995][12645] Avg episode reward: [(0, '0.539')] [2024-06-18 15:53:24,414][12883] Updated weights for policy 0, policy_version 165034 (0.0029) [2024-06-18 15:53:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2704048128. Throughput: 0: 42942.3. Samples: 2704186160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:26,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 15:53:27,619][12883] Updated weights for policy 0, policy_version 165044 (0.0030) [2024-06-18 15:53:31,994][12645] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2704228352. Throughput: 0: 42712.5. Samples: 2704308940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:31,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 15:53:32,239][12883] Updated weights for policy 0, policy_version 165054 (0.0047) [2024-06-18 15:53:33,736][12862] Signal inference workers to stop experience collection... 
(39550 times) [2024-06-18 15:53:33,790][12883] InferenceWorker_p0-w0: stopping experience collection (39550 times) [2024-06-18 15:53:33,854][12862] Signal inference workers to resume experience collection... (39550 times) [2024-06-18 15:53:33,854][12883] InferenceWorker_p0-w0: resuming experience collection (39550 times) [2024-06-18 15:53:35,176][12883] Updated weights for policy 0, policy_version 165064 (0.0034) [2024-06-18 15:53:36,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2704441344. Throughput: 0: 42697.9. Samples: 2704564500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:36,994][12645] Avg episode reward: [(0, '0.664')] [2024-06-18 15:53:39,795][12883] Updated weights for policy 0, policy_version 165074 (0.0036) [2024-06-18 15:53:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 2704654336. Throughput: 0: 42897.2. Samples: 2704826560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:41,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 15:53:43,283][12883] Updated weights for policy 0, policy_version 165084 (0.0035) [2024-06-18 15:53:46,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2704867328. Throughput: 0: 42823.5. Samples: 2704953520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:46,994][12645] Avg episode reward: [(0, '0.328')] [2024-06-18 15:53:47,460][12883] Updated weights for policy 0, policy_version 165094 (0.0044) [2024-06-18 15:53:50,908][12883] Updated weights for policy 0, policy_version 165104 (0.0036) [2024-06-18 15:53:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2705096704. Throughput: 0: 42681.7. Samples: 2705204980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 15:53:51,994][12645] Avg episode reward: [(0, '0.481')] [2024-06-18 15:53:54,866][12883] Updated weights for policy 0, policy_version 165114 (0.0032) [2024-06-18 15:53:56,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.8). Total num frames: 2705293312. Throughput: 0: 42825.8. Samples: 2705467820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:53:56,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 15:53:58,373][12883] Updated weights for policy 0, policy_version 165124 (0.0023) [2024-06-18 15:54:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2705506304. Throughput: 0: 42675.3. Samples: 2705591400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:01,994][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 15:54:02,452][12883] Updated weights for policy 0, policy_version 165134 (0.0030) [2024-06-18 15:54:06,095][12883] Updated weights for policy 0, policy_version 165144 (0.0032) [2024-06-18 15:54:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2705735680. Throughput: 0: 42701.6. Samples: 2705850480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:06,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 15:54:10,402][12883] Updated weights for policy 0, policy_version 165154 (0.0033) [2024-06-18 15:54:11,994][12645] Fps is (10 sec: 40960.0, 60 sec: 41780.9, 300 sec: 42654.0). Total num frames: 2705915904. Throughput: 0: 42719.1. Samples: 2706108520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:11,994][12645] Avg episode reward: [(0, '0.353')] [2024-06-18 15:54:13,696][12883] Updated weights for policy 0, policy_version 165164 (0.0029) [2024-06-18 15:54:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2706161664. Throughput: 0: 42787.1. Samples: 2706234360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:16,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 15:54:17,954][12883] Updated weights for policy 0, policy_version 165174 (0.0040) [2024-06-18 15:54:21,274][12883] Updated weights for policy 0, policy_version 165184 (0.0029) [2024-06-18 15:54:21,994][12645] Fps is (10 sec: 47512.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2706391040. Throughput: 0: 42922.9. Samples: 2706496040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:21,994][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 15:54:25,625][12883] Updated weights for policy 0, policy_version 165194 (0.0045) [2024-06-18 15:54:26,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 2706571264. Throughput: 0: 42792.9. Samples: 2706752240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:26,994][12645] Avg episode reward: [(0, '0.335')] [2024-06-18 15:54:28,977][12883] Updated weights for policy 0, policy_version 165204 (0.0034) [2024-06-18 15:54:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2706817024. Throughput: 0: 42829.4. Samples: 2706880840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:31,994][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 15:54:33,248][12883] Updated weights for policy 0, policy_version 165214 (0.0037) [2024-06-18 15:54:36,659][12883] Updated weights for policy 0, policy_version 165224 (0.0028) [2024-06-18 15:54:36,994][12645] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2707030016. Throughput: 0: 43014.7. Samples: 2707140640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:36,994][12645] Avg episode reward: [(0, '0.690')] [2024-06-18 15:54:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165225_2707046400.pth... 
[2024-06-18 15:54:37,069][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164594_2696708096.pth [2024-06-18 15:54:40,852][12883] Updated weights for policy 0, policy_version 165234 (0.0033) [2024-06-18 15:54:41,994][12645] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2707210240. Throughput: 0: 42975.9. Samples: 2707401740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:41,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 15:54:44,457][12883] Updated weights for policy 0, policy_version 165244 (0.0045) [2024-06-18 15:54:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2707456000. Throughput: 0: 42953.7. Samples: 2707524320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:46,994][12645] Avg episode reward: [(0, '0.639')] [2024-06-18 15:54:48,649][12883] Updated weights for policy 0, policy_version 165254 (0.0044) [2024-06-18 15:54:51,576][12862] Signal inference workers to stop experience collection... (39600 times) [2024-06-18 15:54:51,576][12862] Signal inference workers to resume experience collection... (39600 times) [2024-06-18 15:54:51,616][12883] InferenceWorker_p0-w0: stopping experience collection (39600 times) [2024-06-18 15:54:51,616][12883] InferenceWorker_p0-w0: resuming experience collection (39600 times) [2024-06-18 15:54:51,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42876.4). Total num frames: 2707668992. Throughput: 0: 42886.1. Samples: 2707780360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:51,994][12645] Avg episode reward: [(0, '0.542')] [2024-06-18 15:54:52,064][12883] Updated weights for policy 0, policy_version 165264 (0.0037) [2024-06-18 15:54:56,276][12883] Updated weights for policy 0, policy_version 165274 (0.0027) [2024-06-18 15:54:56,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 2707865600. 
Throughput: 0: 42858.6. Samples: 2708037160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 15:54:56,994][12645] Avg episode reward: [(0, '0.272')] [2024-06-18 15:55:00,088][12883] Updated weights for policy 0, policy_version 165284 (0.0043) [2024-06-18 15:55:01,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43417.5, 300 sec: 42987.2). Total num frames: 2708111360. Throughput: 0: 42938.7. Samples: 2708166600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:01,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 15:55:03,724][12883] Updated weights for policy 0, policy_version 165294 (0.0036) [2024-06-18 15:55:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42821.5). Total num frames: 2708307968. Throughput: 0: 42968.0. Samples: 2708429600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:06,994][12645] Avg episode reward: [(0, '0.186')] [2024-06-18 15:55:07,455][12883] Updated weights for policy 0, policy_version 165304 (0.0039) [2024-06-18 15:55:11,226][12883] Updated weights for policy 0, policy_version 165314 (0.0036) [2024-06-18 15:55:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2708520960. Throughput: 0: 42983.6. Samples: 2708686500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:11,994][12645] Avg episode reward: [(0, '0.343')] [2024-06-18 15:55:15,127][12883] Updated weights for policy 0, policy_version 165324 (0.0032) [2024-06-18 15:55:16,994][12645] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42987.2). Total num frames: 2708750336. Throughput: 0: 42982.8. Samples: 2708815060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:16,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 15:55:18,823][12883] Updated weights for policy 0, policy_version 165334 (0.0032) [2024-06-18 15:55:21,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2708946944. 
Throughput: 0: 42988.0. Samples: 2709075100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:21,994][12645] Avg episode reward: [(0, '0.718')] [2024-06-18 15:55:22,675][12883] Updated weights for policy 0, policy_version 165344 (0.0028) [2024-06-18 15:55:26,387][12883] Updated weights for policy 0, policy_version 165354 (0.0026) [2024-06-18 15:55:26,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2709176320. Throughput: 0: 42808.5. Samples: 2709328120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:26,994][12645] Avg episode reward: [(0, '0.739')] [2024-06-18 15:55:30,311][12883] Updated weights for policy 0, policy_version 165364 (0.0029) [2024-06-18 15:55:31,996][12645] Fps is (10 sec: 44227.0, 60 sec: 42869.9, 300 sec: 42986.9). Total num frames: 2709389312. Throughput: 0: 42928.5. Samples: 2709456200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:31,997][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 15:55:33,874][12883] Updated weights for policy 0, policy_version 165374 (0.0039) [2024-06-18 15:55:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42932.5). Total num frames: 2709602304. Throughput: 0: 43088.9. Samples: 2709719360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:36,994][12645] Avg episode reward: [(0, '0.618')] [2024-06-18 15:55:38,229][12883] Updated weights for policy 0, policy_version 165384 (0.0037) [2024-06-18 15:55:41,994][12645] Fps is (10 sec: 40969.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2709798912. Throughput: 0: 42940.9. Samples: 2709969500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:41,994][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 15:55:42,016][12883] Updated weights for policy 0, policy_version 165394 (0.0048) [2024-06-18 15:55:46,005][12883] Updated weights for policy 0, policy_version 165404 (0.0043) [2024-06-18 15:55:46,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2710028288. Throughput: 0: 42968.9. Samples: 2710100200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:46,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 15:55:49,678][12883] Updated weights for policy 0, policy_version 165414 (0.0036) [2024-06-18 15:55:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2710224896. Throughput: 0: 42909.4. Samples: 2710360520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:51,994][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 15:55:53,654][12883] Updated weights for policy 0, policy_version 165424 (0.0034) [2024-06-18 15:55:56,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.4). Total num frames: 2710454272. Throughput: 0: 42722.8. Samples: 2710609020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:55:56,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 15:55:57,503][12883] Updated weights for policy 0, policy_version 165434 (0.0042) [2024-06-18 15:56:01,353][12883] Updated weights for policy 0, policy_version 165444 (0.0043) [2024-06-18 15:56:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 2710667264. Throughput: 0: 42868.4. Samples: 2710744140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 15:56:01,994][12645] Avg episode reward: [(0, '0.503')] [2024-06-18 15:56:05,007][12883] Updated weights for policy 0, policy_version 165454 (0.0026) [2024-06-18 15:56:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2710863872. Throughput: 0: 42751.1. Samples: 2710998900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:06,994][12645] Avg episode reward: [(0, '0.456')] [2024-06-18 15:56:09,042][12883] Updated weights for policy 0, policy_version 165464 (0.0032) [2024-06-18 15:56:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2711109632. Throughput: 0: 42677.8. Samples: 2711248620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:11,994][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 15:56:12,595][12883] Updated weights for policy 0, policy_version 165474 (0.0039) [2024-06-18 15:56:16,633][12883] Updated weights for policy 0, policy_version 165484 (0.0033) [2024-06-18 15:56:16,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 43042.7). Total num frames: 2711322624. Throughput: 0: 42782.1. Samples: 2711381300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:16,994][12645] Avg episode reward: [(0, '0.587')] [2024-06-18 15:56:17,532][12862] Signal inference workers to stop experience collection... (39650 times) [2024-06-18 15:56:17,571][12883] InferenceWorker_p0-w0: stopping experience collection (39650 times) [2024-06-18 15:56:17,591][12862] Signal inference workers to resume experience collection... (39650 times) [2024-06-18 15:56:17,593][12883] InferenceWorker_p0-w0: resuming experience collection (39650 times) [2024-06-18 15:56:20,738][12883] Updated weights for policy 0, policy_version 165494 (0.0038) [2024-06-18 15:56:21,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2711502848. 
Throughput: 0: 42602.4. Samples: 2711636460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:21,994][12645] Avg episode reward: [(0, '0.708')] [2024-06-18 15:56:24,205][12883] Updated weights for policy 0, policy_version 165504 (0.0027) [2024-06-18 15:56:26,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2711732224. Throughput: 0: 42539.0. Samples: 2711883760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:26,994][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 15:56:28,357][12883] Updated weights for policy 0, policy_version 165514 (0.0035) [2024-06-18 15:56:31,757][12883] Updated weights for policy 0, policy_version 165524 (0.0033) [2024-06-18 15:56:31,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42873.1, 300 sec: 42931.6). Total num frames: 2711961600. Throughput: 0: 42644.0. Samples: 2712019180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:31,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 15:56:35,871][12883] Updated weights for policy 0, policy_version 165534 (0.0036) [2024-06-18 15:56:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2712141824. Throughput: 0: 42579.0. Samples: 2712276580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:36,994][12645] Avg episode reward: [(0, '0.606')] [2024-06-18 15:56:37,016][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165536_2712141824.pth... [2024-06-18 15:56:37,081][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000164910_2701885440.pth [2024-06-18 15:56:39,508][12883] Updated weights for policy 0, policy_version 165544 (0.0038) [2024-06-18 15:56:41,994][12645] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2712387584. Throughput: 0: 42608.9. Samples: 2712526420. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:41,994][12645] Avg episode reward: [(0, '0.738')] [2024-06-18 15:56:43,401][12883] Updated weights for policy 0, policy_version 165554 (0.0031) [2024-06-18 15:56:46,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 2712584192. Throughput: 0: 42642.6. Samples: 2712663060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:46,994][12645] Avg episode reward: [(0, '0.729')] [2024-06-18 15:56:47,129][12883] Updated weights for policy 0, policy_version 165564 (0.0042) [2024-06-18 15:56:51,003][12883] Updated weights for policy 0, policy_version 165574 (0.0029) [2024-06-18 15:56:51,994][12645] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2712780800. Throughput: 0: 42645.4. Samples: 2712917940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:51,994][12645] Avg episode reward: [(0, '0.509')] [2024-06-18 15:56:54,765][12883] Updated weights for policy 0, policy_version 165584 (0.0028) [2024-06-18 15:56:56,994][12645] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2713026560. Throughput: 0: 42570.6. Samples: 2713164300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:56:56,994][12645] Avg episode reward: [(0, '0.688')] [2024-06-18 15:56:58,591][12883] Updated weights for policy 0, policy_version 165594 (0.0042) [2024-06-18 15:57:01,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 2713223168. Throughput: 0: 42695.9. Samples: 2713302620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:57:01,995][12645] Avg episode reward: [(0, '0.411')] [2024-06-18 15:57:02,444][12883] Updated weights for policy 0, policy_version 165604 (0.0041) [2024-06-18 15:57:06,232][12883] Updated weights for policy 0, policy_version 165614 (0.0031) [2024-06-18 15:57:06,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2713436160. Throughput: 0: 42844.4. Samples: 2713564460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-18 15:57:06,994][12645] Avg episode reward: [(0, '0.547')] [2024-06-18 15:57:10,153][12883] Updated weights for policy 0, policy_version 165624 (0.0026) [2024-06-18 15:57:10,671][12862] Signal inference workers to stop experience collection... (39700 times) [2024-06-18 15:57:10,672][12862] Signal inference workers to resume experience collection... (39700 times) [2024-06-18 15:57:10,724][12883] InferenceWorker_p0-w0: stopping experience collection (39700 times) [2024-06-18 15:57:10,724][12883] InferenceWorker_p0-w0: resuming experience collection (39700 times) [2024-06-18 15:57:11,994][12645] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2713681920. Throughput: 0: 42740.9. Samples: 2713807100. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:11,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 15:57:14,082][12883] Updated weights for policy 0, policy_version 165634 (0.0040) [2024-06-18 15:57:16,994][12645] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42765.0). Total num frames: 2713829376. Throughput: 0: 42662.2. Samples: 2713938980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:16,994][12645] Avg episode reward: [(0, '0.663')] [2024-06-18 15:57:17,776][12883] Updated weights for policy 0, policy_version 165644 (0.0029) [2024-06-18 15:57:21,994][12645] Fps is (10 sec: 37684.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2714058752. 
Throughput: 0: 42649.1. Samples: 2714195780. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:21,994][12645] Avg episode reward: [(0, '0.459')] [2024-06-18 15:57:22,027][12883] Updated weights for policy 0, policy_version 165654 (0.0029) [2024-06-18 15:57:25,448][12883] Updated weights for policy 0, policy_version 165664 (0.0036) [2024-06-18 15:57:26,994][12645] Fps is (10 sec: 49151.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2714320896. Throughput: 0: 42698.6. Samples: 2714447860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:26,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 15:57:29,363][12883] Updated weights for policy 0, policy_version 165674 (0.0024) [2024-06-18 15:57:31,994][12645] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42765.0). Total num frames: 2714484736. Throughput: 0: 42654.3. Samples: 2714582500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:31,994][12645] Avg episode reward: [(0, '0.349')] [2024-06-18 15:57:32,877][12883] Updated weights for policy 0, policy_version 165684 (0.0048) [2024-06-18 15:57:36,904][12883] Updated weights for policy 0, policy_version 165694 (0.0039) [2024-06-18 15:57:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2714730496. Throughput: 0: 42775.0. Samples: 2714842820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:36,996][12645] Avg episode reward: [(0, '0.363')] [2024-06-18 15:57:40,619][12883] Updated weights for policy 0, policy_version 165704 (0.0029) [2024-06-18 15:57:41,994][12645] Fps is (10 sec: 47513.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2714959872. Throughput: 0: 42890.2. Samples: 2715094360. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:41,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 15:57:44,584][12883] Updated weights for policy 0, policy_version 165714 (0.0028) [2024-06-18 15:57:46,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2715140096. Throughput: 0: 42806.4. Samples: 2715228900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:46,994][12645] Avg episode reward: [(0, '0.520')] [2024-06-18 15:57:48,081][12883] Updated weights for policy 0, policy_version 165724 (0.0035) [2024-06-18 15:57:51,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2715369472. Throughput: 0: 42640.8. Samples: 2715483300. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:51,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 15:57:52,092][12883] Updated weights for policy 0, policy_version 165734 (0.0031) [2024-06-18 15:57:55,579][12883] Updated weights for policy 0, policy_version 165744 (0.0033) [2024-06-18 15:57:56,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2715598848. Throughput: 0: 42902.7. Samples: 2715737720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:57:56,996][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 15:57:59,825][12883] Updated weights for policy 0, policy_version 165754 (0.0036) [2024-06-18 15:58:01,998][12645] Fps is (10 sec: 40943.2, 60 sec: 42595.6, 300 sec: 42764.4). Total num frames: 2715779072. Throughput: 0: 42914.7. Samples: 2715870320. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:58:02,004][12645] Avg episode reward: [(0, '0.733')] [2024-06-18 15:58:03,224][12862] Signal inference workers to stop experience collection... 
(39750 times) [2024-06-18 15:58:03,270][12883] InferenceWorker_p0-w0: stopping experience collection (39750 times) [2024-06-18 15:58:03,277][12862] Signal inference workers to resume experience collection... (39750 times) [2024-06-18 15:58:03,291][12883] InferenceWorker_p0-w0: resuming experience collection (39750 times) [2024-06-18 15:58:03,407][12883] Updated weights for policy 0, policy_version 165764 (0.0030) [2024-06-18 15:58:06,994][12645] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42709.8). Total num frames: 2716008448. Throughput: 0: 42826.0. Samples: 2716122960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:58:06,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 15:58:07,457][12883] Updated weights for policy 0, policy_version 165774 (0.0033) [2024-06-18 15:58:11,445][12883] Updated weights for policy 0, policy_version 165784 (0.0029) [2024-06-18 15:58:11,994][12645] Fps is (10 sec: 44255.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2716221440. Throughput: 0: 42853.0. Samples: 2716376240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 22.0) [2024-06-18 15:58:11,994][12645] Avg episode reward: [(0, '0.647')] [2024-06-18 15:58:15,055][12883] Updated weights for policy 0, policy_version 165794 (0.0036) [2024-06-18 15:58:16,994][12645] Fps is (10 sec: 40960.4, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2716418048. Throughput: 0: 42838.2. Samples: 2716510220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:58:16,994][12645] Avg episode reward: [(0, '0.647')] [2024-06-18 15:58:18,855][12883] Updated weights for policy 0, policy_version 165804 (0.0036) [2024-06-18 15:58:21,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2716647424. Throughput: 0: 42696.9. Samples: 2716764180. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:58:21,994][12645] Avg episode reward: [(0, '0.601')] [2024-06-18 15:58:22,748][12883] Updated weights for policy 0, policy_version 165814 (0.0031) [2024-06-18 15:58:26,372][12883] Updated weights for policy 0, policy_version 165824 (0.0040) [2024-06-18 15:58:26,996][12645] Fps is (10 sec: 45864.8, 60 sec: 42596.8, 300 sec: 42875.8). Total num frames: 2716876800. Throughput: 0: 42776.1. Samples: 2717019380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:58:26,996][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 15:58:30,479][12883] Updated weights for policy 0, policy_version 165834 (0.0035) [2024-06-18 15:58:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 2717057024. Throughput: 0: 42660.5. Samples: 2717148620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:58:31,994][12645] Avg episode reward: [(0, '0.475')] [2024-06-18 15:58:34,421][12883] Updated weights for policy 0, policy_version 165844 (0.0036) [2024-06-18 15:58:36,994][12645] Fps is (10 sec: 42608.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2717302784. Throughput: 0: 42784.1. Samples: 2717408580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:58:36,994][12645] Avg episode reward: [(0, '0.221')] [2024-06-18 15:58:37,071][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165852_2717319168.pth... [2024-06-18 15:58:37,147][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165225_2707046400.pth [2024-06-18 15:58:38,515][12883] Updated weights for policy 0, policy_version 165854 (0.0024) [2024-06-18 15:58:41,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2717499392. Throughput: 0: 42600.0. Samples: 2717654720. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:58:41,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 15:58:42,263][12883] Updated weights for policy 0, policy_version 165864 (0.0033) [2024-06-18 15:58:46,050][12883] Updated weights for policy 0, policy_version 165874 (0.0030) [2024-06-18 15:58:47,000][12645] Fps is (10 sec: 40933.8, 60 sec: 42867.0, 300 sec: 42764.1). Total num frames: 2717712384. Throughput: 0: 42475.3. Samples: 2717781800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:58:47,001][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 15:58:49,919][12883] Updated weights for policy 0, policy_version 165884 (0.0028) [2024-06-18 15:58:51,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2717941760. Throughput: 0: 42580.5. Samples: 2718039080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:58:51,994][12645] Avg episode reward: [(0, '0.501')] [2024-06-18 15:58:53,662][12883] Updated weights for policy 0, policy_version 165894 (0.0036) [2024-06-18 15:58:56,994][12645] Fps is (10 sec: 44264.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2718154752. Throughput: 0: 42704.8. Samples: 2718297960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:58:56,994][12645] Avg episode reward: [(0, '0.458')] [2024-06-18 15:58:57,496][12883] Updated weights for policy 0, policy_version 165904 (0.0035) [2024-06-18 15:59:01,319][12883] Updated weights for policy 0, policy_version 165914 (0.0035) [2024-06-18 15:59:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42874.5, 300 sec: 42765.0). Total num frames: 2718351360. Throughput: 0: 42490.7. Samples: 2718422300. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:59:01,994][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 15:59:05,394][12883] Updated weights for policy 0, policy_version 165924 (0.0044) [2024-06-18 15:59:06,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2718580736. Throughput: 0: 42637.8. Samples: 2718682880. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:59:06,994][12645] Avg episode reward: [(0, '0.759')] [2024-06-18 15:59:08,829][12883] Updated weights for policy 0, policy_version 165934 (0.0041) [2024-06-18 15:59:11,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2718793728. Throughput: 0: 42607.9. Samples: 2718936640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:59:11,994][12645] Avg episode reward: [(0, '0.645')] [2024-06-18 15:59:12,928][12883] Updated weights for policy 0, policy_version 165944 (0.0030) [2024-06-18 15:59:16,578][12883] Updated weights for policy 0, policy_version 165954 (0.0027) [2024-06-18 15:59:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2719006720. Throughput: 0: 42571.4. Samples: 2719064340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 15:59:16,994][12645] Avg episode reward: [(0, '0.479')] [2024-06-18 15:59:20,636][12883] Updated weights for policy 0, policy_version 165964 (0.0046) [2024-06-18 15:59:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2719203328. Throughput: 0: 42595.1. Samples: 2719325360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:59:21,994][12645] Avg episode reward: [(0, '0.259')] [2024-06-18 15:59:24,217][12883] Updated weights for policy 0, policy_version 165974 (0.0037) [2024-06-18 15:59:26,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42600.0, 300 sec: 42765.0). Total num frames: 2719432704. Throughput: 0: 42762.2. Samples: 2719579020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:59:26,994][12645] Avg episode reward: [(0, '0.448')] [2024-06-18 15:59:28,687][12883] Updated weights for policy 0, policy_version 165984 (0.0035) [2024-06-18 15:59:29,612][12862] Signal inference workers to stop experience collection... (39800 times) [2024-06-18 15:59:29,612][12862] Signal inference workers to resume experience collection... (39800 times) [2024-06-18 15:59:29,659][12883] InferenceWorker_p0-w0: stopping experience collection (39800 times) [2024-06-18 15:59:29,660][12883] InferenceWorker_p0-w0: resuming experience collection (39800 times) [2024-06-18 15:59:31,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2719629312. Throughput: 0: 42864.2. Samples: 2719710420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:59:31,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 15:59:32,149][12883] Updated weights for policy 0, policy_version 165994 (0.0029) [2024-06-18 15:59:36,214][12883] Updated weights for policy 0, policy_version 166004 (0.0027) [2024-06-18 15:59:36,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2719842304. Throughput: 0: 42695.6. Samples: 2719960380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:59:36,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 15:59:39,791][12883] Updated weights for policy 0, policy_version 166014 (0.0025) [2024-06-18 15:59:41,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2720071680. Throughput: 0: 42795.6. Samples: 2720223760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:59:41,994][12645] Avg episode reward: [(0, '0.505')] [2024-06-18 15:59:43,694][12883] Updated weights for policy 0, policy_version 166024 (0.0028) [2024-06-18 15:59:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42602.9, 300 sec: 42709.5). Total num frames: 2720268288. 
Throughput: 0: 42889.3. Samples: 2720352320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:59:46,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 15:59:47,313][12883] Updated weights for policy 0, policy_version 166034 (0.0040) [2024-06-18 15:59:51,326][12883] Updated weights for policy 0, policy_version 166044 (0.0041) [2024-06-18 15:59:51,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2720497664. Throughput: 0: 42830.7. Samples: 2720610260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:59:51,994][12645] Avg episode reward: [(0, '0.563')] [2024-06-18 15:59:55,083][12883] Updated weights for policy 0, policy_version 166054 (0.0026) [2024-06-18 15:59:56,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2720710656. Throughput: 0: 42851.7. Samples: 2720864960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 15:59:56,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 15:59:58,960][12883] Updated weights for policy 0, policy_version 166064 (0.0041) [2024-06-18 16:00:01,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2720907264. Throughput: 0: 42920.2. Samples: 2720995740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:01,994][12645] Avg episode reward: [(0, '0.290')] [2024-06-18 16:00:02,868][12883] Updated weights for policy 0, policy_version 166074 (0.0022) [2024-06-18 16:00:06,971][12883] Updated weights for policy 0, policy_version 166084 (0.0030) [2024-06-18 16:00:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2721120256. Throughput: 0: 42874.6. Samples: 2721254720. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:06,994][12645] Avg episode reward: [(0, '0.418')] [2024-06-18 16:00:10,352][12883] Updated weights for policy 0, policy_version 166094 (0.0034) [2024-06-18 16:00:11,994][12645] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2721366016. Throughput: 0: 42899.2. Samples: 2721509480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:11,994][12645] Avg episode reward: [(0, '0.487')] [2024-06-18 16:00:14,552][12883] Updated weights for policy 0, policy_version 166104 (0.0042) [2024-06-18 16:00:16,994][12645] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2721562624. Throughput: 0: 42909.2. Samples: 2721641340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:16,994][12645] Avg episode reward: [(0, '0.632')] [2024-06-18 16:00:18,063][12883] Updated weights for policy 0, policy_version 166114 (0.0039) [2024-06-18 16:00:21,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2721759232. Throughput: 0: 43000.1. Samples: 2721895380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:21,994][12645] Avg episode reward: [(0, '0.671')] [2024-06-18 16:00:22,039][12883] Updated weights for policy 0, policy_version 166124 (0.0028) [2024-06-18 16:00:25,597][12883] Updated weights for policy 0, policy_version 166134 (0.0047) [2024-06-18 16:00:26,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 2722004992. Throughput: 0: 42921.2. Samples: 2722155220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:26,999][12645] Avg episode reward: [(0, '0.666')] [2024-06-18 16:00:29,541][12883] Updated weights for policy 0, policy_version 166144 (0.0033) [2024-06-18 16:00:31,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2722201600. Throughput: 0: 42923.1. Samples: 2722283860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:31,994][12645] Avg episode reward: [(0, '0.462')] [2024-06-18 16:00:33,026][12883] Updated weights for policy 0, policy_version 166154 (0.0031) [2024-06-18 16:00:36,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2722414592. Throughput: 0: 42840.4. Samples: 2722538080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:36,994][12645] Avg episode reward: [(0, '0.391')] [2024-06-18 16:00:37,091][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166164_2722430976.pth... [2024-06-18 16:00:37,095][12883] Updated weights for policy 0, policy_version 166164 (0.0040) [2024-06-18 16:00:37,144][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165536_2712141824.pth [2024-06-18 16:00:40,607][12883] Updated weights for policy 0, policy_version 166174 (0.0039) [2024-06-18 16:00:41,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2722643968. Throughput: 0: 42971.1. Samples: 2722798660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:41,994][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 16:00:42,347][12862] Signal inference workers to stop experience collection... (39850 times) [2024-06-18 16:00:42,397][12883] InferenceWorker_p0-w0: stopping experience collection (39850 times) [2024-06-18 16:00:42,462][12862] Signal inference workers to resume experience collection... (39850 times) [2024-06-18 16:00:42,462][12883] InferenceWorker_p0-w0: resuming experience collection (39850 times) [2024-06-18 16:00:44,632][12883] Updated weights for policy 0, policy_version 166184 (0.0035) [2024-06-18 16:00:46,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2722840576. Throughput: 0: 42943.5. Samples: 2722928200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:46,994][12645] Avg episode reward: [(0, '0.316')] [2024-06-18 16:00:48,084][12883] Updated weights for policy 0, policy_version 166194 (0.0036) [2024-06-18 16:00:51,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2723053568. Throughput: 0: 42646.7. Samples: 2723173820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:51,994][12645] Avg episode reward: [(0, '0.500')] [2024-06-18 16:00:52,658][12883] Updated weights for policy 0, policy_version 166204 (0.0038) [2024-06-18 16:00:56,054][12883] Updated weights for policy 0, policy_version 166214 (0.0026) [2024-06-18 16:00:56,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2723282944. Throughput: 0: 42705.7. Samples: 2723431240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:00:56,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 16:01:00,474][12883] Updated weights for policy 0, policy_version 166224 (0.0026) [2024-06-18 16:01:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 2723495936. Throughput: 0: 42863.0. Samples: 2723570160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:01:01,994][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 16:01:03,514][12883] Updated weights for policy 0, policy_version 166234 (0.0040) [2024-06-18 16:01:06,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 2723692544. Throughput: 0: 42777.6. Samples: 2723820380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:01:06,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 16:01:08,088][12883] Updated weights for policy 0, policy_version 166244 (0.0034) [2024-06-18 16:01:11,069][12883] Updated weights for policy 0, policy_version 166254 (0.0047) [2024-06-18 16:01:11,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2723921920. Throughput: 0: 42652.4. Samples: 2724074580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:01:11,994][12645] Avg episode reward: [(0, '0.624')] [2024-06-18 16:01:15,697][12883] Updated weights for policy 0, policy_version 166264 (0.0036) [2024-06-18 16:01:16,994][12645] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 2724134912. Throughput: 0: 42716.5. Samples: 2724206100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:01:17,000][12645] Avg episode reward: [(0, '0.467')] [2024-06-18 16:01:18,719][12883] Updated weights for policy 0, policy_version 166274 (0.0026) [2024-06-18 16:01:21,994][12645] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2724331520. Throughput: 0: 42625.4. Samples: 2724456220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:01:21,994][12645] Avg episode reward: [(0, '0.428')] [2024-06-18 16:01:23,354][12883] Updated weights for policy 0, policy_version 166284 (0.0038) [2024-06-18 16:01:26,515][12883] Updated weights for policy 0, policy_version 166294 (0.0027) [2024-06-18 16:01:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2724577280. Throughput: 0: 42533.6. Samples: 2724712680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 16:01:26,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 16:01:30,964][12883] Updated weights for policy 0, policy_version 166304 (0.0031) [2024-06-18 16:01:31,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2724757504. Throughput: 0: 42570.3. Samples: 2724843860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:01:31,994][12645] Avg episode reward: [(0, '0.545')] [2024-06-18 16:01:34,232][12883] Updated weights for policy 0, policy_version 166314 (0.0026) [2024-06-18 16:01:36,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2724986880. Throughput: 0: 42718.1. Samples: 2725096140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:01:36,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 16:01:38,951][12883] Updated weights for policy 0, policy_version 166324 (0.0041) [2024-06-18 16:01:41,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2725199872. Throughput: 0: 42728.5. Samples: 2725354020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:01:41,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 16:01:42,002][12883] Updated weights for policy 0, policy_version 166334 (0.0037) [2024-06-18 16:01:46,695][12883] Updated weights for policy 0, policy_version 166344 (0.0031) [2024-06-18 16:01:46,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2725396480. Throughput: 0: 42489.7. Samples: 2725482200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:01:46,994][12645] Avg episode reward: [(0, '0.584')] [2024-06-18 16:01:50,007][12883] Updated weights for policy 0, policy_version 166354 (0.0035) [2024-06-18 16:01:51,994][12645] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2725625856. Throughput: 0: 42635.2. Samples: 2725738960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:01:51,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 16:01:54,385][12883] Updated weights for policy 0, policy_version 166364 (0.0042) [2024-06-18 16:01:56,994][12645] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2725855232. Throughput: 0: 42631.9. Samples: 2725993020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:01:56,994][12645] Avg episode reward: [(0, '0.723')] [2024-06-18 16:01:57,750][12883] Updated weights for policy 0, policy_version 166374 (0.0045) [2024-06-18 16:02:01,893][12883] Updated weights for policy 0, policy_version 166384 (0.0027) [2024-06-18 16:02:01,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 2726035456. Throughput: 0: 42451.9. Samples: 2726116440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:02:01,994][12645] Avg episode reward: [(0, '0.421')] [2024-06-18 16:02:05,454][12862] Signal inference workers to stop experience collection... (39900 times) [2024-06-18 16:02:05,500][12883] InferenceWorker_p0-w0: stopping experience collection (39900 times) [2024-06-18 16:02:05,504][12862] Signal inference workers to resume experience collection... (39900 times) [2024-06-18 16:02:05,512][12883] InferenceWorker_p0-w0: resuming experience collection (39900 times) [2024-06-18 16:02:05,516][12883] Updated weights for policy 0, policy_version 166394 (0.0036) [2024-06-18 16:02:06,994][12645] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 2726264832. Throughput: 0: 42696.4. Samples: 2726377560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:02:06,994][12645] Avg episode reward: [(0, '0.334')] [2024-06-18 16:02:09,499][12883] Updated weights for policy 0, policy_version 166404 (0.0028) [2024-06-18 16:02:11,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 2726477824. 
Throughput: 0: 42620.5. Samples: 2726630600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:02:11,994][12645] Avg episode reward: [(0, '0.721')] [2024-06-18 16:02:13,135][12883] Updated weights for policy 0, policy_version 166414 (0.0032) [2024-06-18 16:02:16,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 2726658048. Throughput: 0: 42462.6. Samples: 2726754680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:02:16,994][12645] Avg episode reward: [(0, '0.425')] [2024-06-18 16:02:17,400][12883] Updated weights for policy 0, policy_version 166424 (0.0034) [2024-06-18 16:02:20,734][12883] Updated weights for policy 0, policy_version 166434 (0.0035) [2024-06-18 16:02:21,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2726920192. Throughput: 0: 42502.7. Samples: 2727008760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:02:21,994][12645] Avg episode reward: [(0, '0.330')] [2024-06-18 16:02:25,154][12883] Updated weights for policy 0, policy_version 166444 (0.0030) [2024-06-18 16:02:26,994][12645] Fps is (10 sec: 45874.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2727116800. Throughput: 0: 42391.0. Samples: 2727261620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:02:26,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 16:02:28,432][12883] Updated weights for policy 0, policy_version 166454 (0.0041) [2024-06-18 16:02:31,994][12645] Fps is (10 sec: 37682.6, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2727297024. Throughput: 0: 42294.9. Samples: 2727385480. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:02:31,994][12645] Avg episode reward: [(0, '0.480')] [2024-06-18 16:02:33,037][12883] Updated weights for policy 0, policy_version 166464 (0.0034) [2024-06-18 16:02:36,053][12883] Updated weights for policy 0, policy_version 166474 (0.0037) [2024-06-18 16:02:36,994][12645] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2727542784. Throughput: 0: 42409.5. Samples: 2727647380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:02:36,994][12645] Avg episode reward: [(0, '0.573')] [2024-06-18 16:02:37,133][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166477_2727559168.pth... [2024-06-18 16:02:37,190][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000165852_2717319168.pth [2024-06-18 16:02:40,611][12883] Updated weights for policy 0, policy_version 166484 (0.0042) [2024-06-18 16:02:41,994][12645] Fps is (10 sec: 45876.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2727755776. Throughput: 0: 42394.4. Samples: 2727900760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:02:41,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 16:02:44,220][12883] Updated weights for policy 0, policy_version 166494 (0.0031) [2024-06-18 16:02:46,994][12645] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2727952384. Throughput: 0: 42418.7. Samples: 2728025280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:02:46,997][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 16:02:48,269][12883] Updated weights for policy 0, policy_version 166504 (0.0037) [2024-06-18 16:02:51,658][12883] Updated weights for policy 0, policy_version 166514 (0.0021) [2024-06-18 16:02:51,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2728181760. Throughput: 0: 42367.0. Samples: 2728284080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:02:51,994][12645] Avg episode reward: [(0, '0.339')] [2024-06-18 16:02:55,847][12883] Updated weights for policy 0, policy_version 166524 (0.0039) [2024-06-18 16:02:57,000][12645] Fps is (10 sec: 40936.3, 60 sec: 41775.2, 300 sec: 42653.7). Total num frames: 2728361984. Throughput: 0: 42501.2. Samples: 2728543400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:02:57,000][12645] Avg episode reward: [(0, '0.553')] [2024-06-18 16:02:59,053][12883] Updated weights for policy 0, policy_version 166534 (0.0033) [2024-06-18 16:03:01,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 2728591360. Throughput: 0: 42434.2. Samples: 2728664220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:03:01,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 16:03:03,562][12883] Updated weights for policy 0, policy_version 166544 (0.0036) [2024-06-18 16:03:06,980][12883] Updated weights for policy 0, policy_version 166554 (0.0033) [2024-06-18 16:03:06,994][12645] Fps is (10 sec: 45902.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2728820736. Throughput: 0: 42450.7. Samples: 2728919040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:03:06,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 16:03:11,593][12883] Updated weights for policy 0, policy_version 166564 (0.0037) [2024-06-18 16:03:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 2729000960. Throughput: 0: 42579.2. Samples: 2729177680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:03:11,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 16:03:14,633][12883] Updated weights for policy 0, policy_version 166574 (0.0028) [2024-06-18 16:03:16,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2729213952. Throughput: 0: 42514.5. Samples: 2729298620. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:03:16,994][12645] Avg episode reward: [(0, '0.723')] [2024-06-18 16:03:19,183][12883] Updated weights for policy 0, policy_version 166584 (0.0034) [2024-06-18 16:03:21,994][12645] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 2729459712. Throughput: 0: 42464.3. Samples: 2729558280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:03:21,994][12645] Avg episode reward: [(0, '0.590')] [2024-06-18 16:03:22,513][12883] Updated weights for policy 0, policy_version 166594 (0.0023) [2024-06-18 16:03:25,927][12862] Signal inference workers to stop experience collection... (39950 times) [2024-06-18 16:03:25,973][12883] InferenceWorker_p0-w0: stopping experience collection (39950 times) [2024-06-18 16:03:25,984][12862] Signal inference workers to resume experience collection... (39950 times) [2024-06-18 16:03:26,000][12883] InferenceWorker_p0-w0: resuming experience collection (39950 times) [2024-06-18 16:03:26,654][12883] Updated weights for policy 0, policy_version 166604 (0.0035) [2024-06-18 16:03:26,994][12645] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 2729656320. Throughput: 0: 42645.3. Samples: 2729819800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:03:26,994][12645] Avg episode reward: [(0, '0.409')] [2024-06-18 16:03:29,980][12883] Updated weights for policy 0, policy_version 166614 (0.0031) [2024-06-18 16:03:31,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 2729869312. Throughput: 0: 42641.8. Samples: 2729944160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:03:31,995][12645] Avg episode reward: [(0, '0.424')] [2024-06-18 16:03:34,272][12883] Updated weights for policy 0, policy_version 166624 (0.0037) [2024-06-18 16:03:36,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2730098688. 
Throughput: 0: 42603.7. Samples: 2730201240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 16:03:36,994][12645] Avg episode reward: [(0, '0.748')] [2024-06-18 16:03:37,538][12883] Updated weights for policy 0, policy_version 166634 (0.0026) [2024-06-18 16:03:41,892][12883] Updated weights for policy 0, policy_version 166644 (0.0039) [2024-06-18 16:03:41,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42654.9). Total num frames: 2730295296. Throughput: 0: 42653.6. Samples: 2730462560. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:03:41,994][12645] Avg episode reward: [(0, '0.690')] [2024-06-18 16:03:45,084][12883] Updated weights for policy 0, policy_version 166654 (0.0035) [2024-06-18 16:03:46,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2730508288. Throughput: 0: 42663.1. Samples: 2730584060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:03:46,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 16:03:49,744][12883] Updated weights for policy 0, policy_version 166664 (0.0042) [2024-06-18 16:03:51,999][12645] Fps is (10 sec: 44213.9, 60 sec: 42594.8, 300 sec: 42653.2). Total num frames: 2730737664. Throughput: 0: 42713.8. Samples: 2730841380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:03:51,999][12645] Avg episode reward: [(0, '0.614')] [2024-06-18 16:03:52,804][12883] Updated weights for policy 0, policy_version 166674 (0.0036) [2024-06-18 16:03:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42875.6, 300 sec: 42653.9). Total num frames: 2730934272. Throughput: 0: 42840.8. Samples: 2731105520. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:03:56,994][12645] Avg episode reward: [(0, '0.718')] [2024-06-18 16:03:57,163][12883] Updated weights for policy 0, policy_version 166684 (0.0042) [2024-06-18 16:04:00,340][12883] Updated weights for policy 0, policy_version 166694 (0.0031) [2024-06-18 16:04:01,994][12645] Fps is (10 sec: 40981.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 2731147264. Throughput: 0: 42881.3. Samples: 2731228280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:04:01,994][12645] Avg episode reward: [(0, '0.526')] [2024-06-18 16:04:04,616][12883] Updated weights for policy 0, policy_version 166704 (0.0052) [2024-06-18 16:04:06,994][12645] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2731393024. Throughput: 0: 42930.6. Samples: 2731490160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:04:06,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 16:04:07,945][12883] Updated weights for policy 0, policy_version 166714 (0.0042) [2024-06-18 16:04:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 2731573248. Throughput: 0: 42889.3. Samples: 2731749820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:04:11,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 16:04:12,598][12883] Updated weights for policy 0, policy_version 166724 (0.0043) [2024-06-18 16:04:15,528][12883] Updated weights for policy 0, policy_version 166734 (0.0034) [2024-06-18 16:04:16,994][12645] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2731802624. Throughput: 0: 42832.0. Samples: 2731871600. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:04:16,994][12645] Avg episode reward: [(0, '0.277')] [2024-06-18 16:04:20,201][12883] Updated weights for policy 0, policy_version 166744 (0.0037) [2024-06-18 16:04:21,994][12645] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2732032000. Throughput: 0: 42976.8. Samples: 2732135200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:04:21,994][12645] Avg episode reward: [(0, '0.327')] [2024-06-18 16:04:22,997][12883] Updated weights for policy 0, policy_version 166754 (0.0029) [2024-06-18 16:04:26,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2732228608. Throughput: 0: 42956.4. Samples: 2732395600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:04:26,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 16:04:27,844][12883] Updated weights for policy 0, policy_version 166764 (0.0036) [2024-06-18 16:04:30,493][12883] Updated weights for policy 0, policy_version 166774 (0.0030) [2024-06-18 16:04:31,994][12645] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2732457984. Throughput: 0: 42955.2. Samples: 2732517040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:04:31,994][12645] Avg episode reward: [(0, '0.665')] [2024-06-18 16:04:35,942][12883] Updated weights for policy 0, policy_version 166784 (0.0037) [2024-06-18 16:04:36,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2732670976. Throughput: 0: 43155.6. Samples: 2732783160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:04:36,994][12645] Avg episode reward: [(0, '0.549')] [2024-06-18 16:04:37,101][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166790_2732687360.pth... 
[2024-06-18 16:04:37,158][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166164_2722430976.pth [2024-06-18 16:04:38,429][12883] Updated weights for policy 0, policy_version 166794 (0.0044) [2024-06-18 16:04:41,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2732851200. Throughput: 0: 42921.0. Samples: 2733036960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:04:41,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 16:04:42,131][12862] Signal inference workers to stop experience collection... (40000 times) [2024-06-18 16:04:42,186][12883] InferenceWorker_p0-w0: stopping experience collection (40000 times) [2024-06-18 16:04:42,192][12862] Signal inference workers to resume experience collection... (40000 times) [2024-06-18 16:04:42,208][12883] InferenceWorker_p0-w0: resuming experience collection (40000 times) [2024-06-18 16:04:43,396][12883] Updated weights for policy 0, policy_version 166804 (0.0029) [2024-06-18 16:04:46,260][12883] Updated weights for policy 0, policy_version 166814 (0.0045) [2024-06-18 16:04:46,994][12645] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2733096960. Throughput: 0: 42842.1. Samples: 2733156180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:04:46,994][12645] Avg episode reward: [(0, '0.354')] [2024-06-18 16:04:51,122][12883] Updated weights for policy 0, policy_version 166824 (0.0038) [2024-06-18 16:04:51,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42602.0, 300 sec: 42653.9). Total num frames: 2733293568. Throughput: 0: 42881.8. Samples: 2733419840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:04:51,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 16:04:53,939][12883] Updated weights for policy 0, policy_version 166834 (0.0037) [2024-06-18 16:04:56,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2733506560. 
Throughput: 0: 42789.7. Samples: 2733675360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:04:56,994][12645] Avg episode reward: [(0, '0.374')] [2024-06-18 16:04:58,627][12883] Updated weights for policy 0, policy_version 166844 (0.0028) [2024-06-18 16:05:01,690][12883] Updated weights for policy 0, policy_version 166854 (0.0024) [2024-06-18 16:05:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 2733752320. Throughput: 0: 42969.8. Samples: 2733805240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:02,000][12645] Avg episode reward: [(0, '0.403')] [2024-06-18 16:05:06,088][12883] Updated weights for policy 0, policy_version 166864 (0.0028) [2024-06-18 16:05:06,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2733932544. Throughput: 0: 42857.4. Samples: 2734063780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:06,994][12645] Avg episode reward: [(0, '0.482')] [2024-06-18 16:05:09,161][12883] Updated weights for policy 0, policy_version 166874 (0.0033) [2024-06-18 16:05:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 2734161920. Throughput: 0: 42833.8. Samples: 2734323120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:11,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 16:05:13,714][12883] Updated weights for policy 0, policy_version 166884 (0.0036) [2024-06-18 16:05:16,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2734374912. Throughput: 0: 42979.9. Samples: 2734451140. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:16,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 16:05:17,277][12883] Updated weights for policy 0, policy_version 166894 (0.0038) [2024-06-18 16:05:21,207][12883] Updated weights for policy 0, policy_version 166904 (0.0045) [2024-06-18 16:05:21,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2734587904. Throughput: 0: 42866.6. Samples: 2734712160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:21,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 16:05:24,861][12883] Updated weights for policy 0, policy_version 166914 (0.0044) [2024-06-18 16:05:26,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2734800896. Throughput: 0: 42875.9. Samples: 2734966380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:26,996][12645] Avg episode reward: [(0, '0.257')] [2024-06-18 16:05:28,821][12883] Updated weights for policy 0, policy_version 166924 (0.0040) [2024-06-18 16:05:31,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 2735013888. Throughput: 0: 43055.1. Samples: 2735093660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:31,994][12645] Avg episode reward: [(0, '0.732')] [2024-06-18 16:05:33,090][12883] Updated weights for policy 0, policy_version 166934 (0.0040) [2024-06-18 16:05:36,563][12883] Updated weights for policy 0, policy_version 166944 (0.0040) [2024-06-18 16:05:36,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2735226880. Throughput: 0: 42784.0. Samples: 2735345120. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:36,994][12645] Avg episode reward: [(0, '0.369')] [2024-06-18 16:05:40,762][12883] Updated weights for policy 0, policy_version 166954 (0.0030) [2024-06-18 16:05:41,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.3, 300 sec: 42709.4). Total num frames: 2735439872. Throughput: 0: 42761.6. Samples: 2735599640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:41,995][12645] Avg episode reward: [(0, '0.226')] [2024-06-18 16:05:44,254][12883] Updated weights for policy 0, policy_version 166964 (0.0044) [2024-06-18 16:05:46,994][12645] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2735652864. Throughput: 0: 42668.1. Samples: 2735725300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 24.0) [2024-06-18 16:05:46,994][12645] Avg episode reward: [(0, '0.292')] [2024-06-18 16:05:48,287][12883] Updated weights for policy 0, policy_version 166974 (0.0037) [2024-06-18 16:05:51,873][12883] Updated weights for policy 0, policy_version 166984 (0.0032) [2024-06-18 16:05:51,994][12645] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2735865856. Throughput: 0: 42539.5. Samples: 2735978060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:05:51,994][12645] Avg episode reward: [(0, '0.518')] [2024-06-18 16:05:56,304][12883] Updated weights for policy 0, policy_version 166994 (0.0035) [2024-06-18 16:05:56,994][12645] Fps is (10 sec: 40959.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2736062464. Throughput: 0: 42572.7. Samples: 2736238900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:05:56,994][12645] Avg episode reward: [(0, '0.537')] [2024-06-18 16:05:59,592][12883] Updated weights for policy 0, policy_version 167004 (0.0047) [2024-06-18 16:06:01,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2736308224. Throughput: 0: 42487.1. Samples: 2736363060. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:01,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 16:06:03,931][12883] Updated weights for policy 0, policy_version 167014 (0.0032) [2024-06-18 16:06:06,994][12645] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2736488448. Throughput: 0: 42312.5. Samples: 2736616220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:06,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 16:06:07,521][12883] Updated weights for policy 0, policy_version 167024 (0.0044) [2024-06-18 16:06:11,671][12883] Updated weights for policy 0, policy_version 167034 (0.0034) [2024-06-18 16:06:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2736717824. Throughput: 0: 42484.5. Samples: 2736878180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:11,994][12645] Avg episode reward: [(0, '0.599')] [2024-06-18 16:06:13,180][12862] Signal inference workers to stop experience collection... (40050 times) [2024-06-18 16:06:13,180][12862] Signal inference workers to resume experience collection... (40050 times) [2024-06-18 16:06:13,202][12883] InferenceWorker_p0-w0: stopping experience collection (40050 times) [2024-06-18 16:06:13,203][12883] InferenceWorker_p0-w0: resuming experience collection (40050 times) [2024-06-18 16:06:14,985][12883] Updated weights for policy 0, policy_version 167044 (0.0037) [2024-06-18 16:06:16,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2736914432. Throughput: 0: 42377.9. Samples: 2737000660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:16,994][12645] Avg episode reward: [(0, '0.578')] [2024-06-18 16:06:19,095][12883] Updated weights for policy 0, policy_version 167054 (0.0023) [2024-06-18 16:06:21,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 2737143808. 
Throughput: 0: 42525.8. Samples: 2737258780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:21,994][12645] Avg episode reward: [(0, '0.646')] [2024-06-18 16:06:22,824][12883] Updated weights for policy 0, policy_version 167064 (0.0055) [2024-06-18 16:06:26,580][12883] Updated weights for policy 0, policy_version 167074 (0.0028) [2024-06-18 16:06:26,996][12645] Fps is (10 sec: 44226.6, 60 sec: 42596.8, 300 sec: 42709.1). Total num frames: 2737356800. Throughput: 0: 42589.7. Samples: 2737516260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:26,996][12645] Avg episode reward: [(0, '0.507')] [2024-06-18 16:06:30,358][12883] Updated weights for policy 0, policy_version 167084 (0.0023) [2024-06-18 16:06:31,994][12645] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 2737553408. Throughput: 0: 42621.7. Samples: 2737643280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:31,994][12645] Avg episode reward: [(0, '0.365')] [2024-06-18 16:06:34,165][12883] Updated weights for policy 0, policy_version 167094 (0.0036) [2024-06-18 16:06:36,994][12645] Fps is (10 sec: 42607.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2737782784. Throughput: 0: 42757.3. Samples: 2737902140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:36,994][12645] Avg episode reward: [(0, '0.517')] [2024-06-18 16:06:37,010][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167101_2737782784.pth... [2024-06-18 16:06:37,088][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166477_2727559168.pth [2024-06-18 16:06:37,976][12883] Updated weights for policy 0, policy_version 167104 (0.0032) [2024-06-18 16:06:41,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 2737979392. Throughput: 0: 42500.5. Samples: 2738151420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:41,994][12645] Avg episode reward: [(0, '0.455')] [2024-06-18 16:06:42,234][12883] Updated weights for policy 0, policy_version 167114 (0.0036) [2024-06-18 16:06:45,566][12883] Updated weights for policy 0, policy_version 167124 (0.0036) [2024-06-18 16:06:46,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 2738192384. Throughput: 0: 42584.9. Samples: 2738279380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:46,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 16:06:50,022][12883] Updated weights for policy 0, policy_version 167134 (0.0032) [2024-06-18 16:06:51,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 2738405376. Throughput: 0: 42608.8. Samples: 2738533620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 16:06:51,994][12645] Avg episode reward: [(0, '0.593')] [2024-06-18 16:06:53,169][12883] Updated weights for policy 0, policy_version 167144 (0.0028) [2024-06-18 16:06:56,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 2738618368. Throughput: 0: 42482.1. Samples: 2738789880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:06:56,994][12645] Avg episode reward: [(0, '0.514')] [2024-06-18 16:06:57,551][12883] Updated weights for policy 0, policy_version 167154 (0.0034) [2024-06-18 16:07:00,793][12883] Updated weights for policy 0, policy_version 167164 (0.0031) [2024-06-18 16:07:01,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 2738847744. Throughput: 0: 42725.8. Samples: 2738923320. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:01,994][12645] Avg episode reward: [(0, '0.490')] [2024-06-18 16:07:05,102][12883] Updated weights for policy 0, policy_version 167174 (0.0034) [2024-06-18 16:07:06,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2739060736. Throughput: 0: 42651.5. Samples: 2739178100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:06,996][12645] Avg episode reward: [(0, '0.419')] [2024-06-18 16:07:08,361][12883] Updated weights for policy 0, policy_version 167184 (0.0038) [2024-06-18 16:07:11,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2739273728. Throughput: 0: 42639.8. Samples: 2739434960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:11,994][12645] Avg episode reward: [(0, '0.434')] [2024-06-18 16:07:12,965][12883] Updated weights for policy 0, policy_version 167194 (0.0037) [2024-06-18 16:07:16,107][12883] Updated weights for policy 0, policy_version 167204 (0.0026) [2024-06-18 16:07:16,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 2739486720. Throughput: 0: 42728.3. Samples: 2739566060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:16,994][12645] Avg episode reward: [(0, '0.332')] [2024-06-18 16:07:20,413][12883] Updated weights for policy 0, policy_version 167214 (0.0037) [2024-06-18 16:07:21,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 2739699712. Throughput: 0: 42583.6. Samples: 2739818400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:21,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 16:07:23,779][12883] Updated weights for policy 0, policy_version 167224 (0.0028) [2024-06-18 16:07:26,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42599.9, 300 sec: 42765.0). Total num frames: 2739912704. Throughput: 0: 42834.6. Samples: 2740078980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:26,994][12645] Avg episode reward: [(0, '0.457')] [2024-06-18 16:07:27,993][12883] Updated weights for policy 0, policy_version 167234 (0.0050) [2024-06-18 16:07:31,429][12883] Updated weights for policy 0, policy_version 167244 (0.0038) [2024-06-18 16:07:31,994][12645] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2740142080. Throughput: 0: 42749.7. Samples: 2740203120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:31,994][12645] Avg episode reward: [(0, '0.471')] [2024-06-18 16:07:35,608][12883] Updated weights for policy 0, policy_version 167254 (0.0038) [2024-06-18 16:07:36,994][12645] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2740355072. Throughput: 0: 42905.9. Samples: 2740464380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:36,994][12645] Avg episode reward: [(0, '0.356')] [2024-06-18 16:07:38,263][12862] Signal inference workers to stop experience collection... (40100 times) [2024-06-18 16:07:38,293][12883] InferenceWorker_p0-w0: stopping experience collection (40100 times) [2024-06-18 16:07:38,319][12862] Signal inference workers to resume experience collection... (40100 times) [2024-06-18 16:07:38,320][12883] InferenceWorker_p0-w0: resuming experience collection (40100 times) [2024-06-18 16:07:39,231][12883] Updated weights for policy 0, policy_version 167264 (0.0027) [2024-06-18 16:07:41,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2740551680. Throughput: 0: 42829.0. Samples: 2740717180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:41,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 16:07:43,235][12883] Updated weights for policy 0, policy_version 167274 (0.0036) [2024-06-18 16:07:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 2740764672. 
Throughput: 0: 42611.0. Samples: 2740840820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:46,994][12645] Avg episode reward: [(0, '0.367')] [2024-06-18 16:07:47,155][12883] Updated weights for policy 0, policy_version 167284 (0.0034) [2024-06-18 16:07:50,785][12883] Updated weights for policy 0, policy_version 167294 (0.0031) [2024-06-18 16:07:51,994][12645] Fps is (10 sec: 45875.1, 60 sec: 43417.6, 300 sec: 42876.9). Total num frames: 2741010432. Throughput: 0: 42878.7. Samples: 2741107640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:51,994][12645] Avg episode reward: [(0, '0.293')] [2024-06-18 16:07:54,727][12883] Updated weights for policy 0, policy_version 167304 (0.0036) [2024-06-18 16:07:56,994][12645] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 2741207040. Throughput: 0: 42947.1. Samples: 2741367580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:07:56,996][12645] Avg episode reward: [(0, '0.603')] [2024-06-18 16:07:58,208][12883] Updated weights for policy 0, policy_version 167314 (0.0042) [2024-06-18 16:08:01,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2741420032. Throughput: 0: 42715.7. Samples: 2741488260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:01,994][12645] Avg episode reward: [(0, '0.681')] [2024-06-18 16:08:02,108][12883] Updated weights for policy 0, policy_version 167324 (0.0048) [2024-06-18 16:08:05,727][12883] Updated weights for policy 0, policy_version 167334 (0.0049) [2024-06-18 16:08:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2741665792. Throughput: 0: 43020.0. Samples: 2741754300. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:06,994][12645] Avg episode reward: [(0, '0.530')] [2024-06-18 16:08:09,869][12883] Updated weights for policy 0, policy_version 167344 (0.0036) [2024-06-18 16:08:11,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2741829632. Throughput: 0: 43056.5. Samples: 2742016520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:11,994][12645] Avg episode reward: [(0, '0.682')] [2024-06-18 16:08:13,260][12883] Updated weights for policy 0, policy_version 167354 (0.0032) [2024-06-18 16:08:16,994][12645] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2742059008. Throughput: 0: 43011.9. Samples: 2742138660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:16,995][12645] Avg episode reward: [(0, '0.510')] [2024-06-18 16:08:17,469][12883] Updated weights for policy 0, policy_version 167364 (0.0033) [2024-06-18 16:08:20,834][12883] Updated weights for policy 0, policy_version 167374 (0.0035) [2024-06-18 16:08:21,994][12645] Fps is (10 sec: 49151.4, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 2742321152. Throughput: 0: 43120.7. Samples: 2742404820. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:21,994][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 16:08:25,243][12883] Updated weights for policy 0, policy_version 167384 (0.0043) [2024-06-18 16:08:26,994][12645] Fps is (10 sec: 42599.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2742484992. Throughput: 0: 43184.0. Samples: 2742660460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:26,994][12645] Avg episode reward: [(0, '0.497')] [2024-06-18 16:08:28,478][12883] Updated weights for policy 0, policy_version 167394 (0.0029) [2024-06-18 16:08:31,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2742714368. Throughput: 0: 43126.2. Samples: 2742781500. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:31,994][12645] Avg episode reward: [(0, '0.416')] [2024-06-18 16:08:33,181][12883] Updated weights for policy 0, policy_version 167404 (0.0041) [2024-06-18 16:08:36,095][12883] Updated weights for policy 0, policy_version 167414 (0.0023) [2024-06-18 16:08:36,994][12645] Fps is (10 sec: 47513.6, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2742960128. Throughput: 0: 43175.6. Samples: 2743050540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:36,994][12645] Avg episode reward: [(0, '0.617')] [2024-06-18 16:08:37,065][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167418_2742976512.pth... [2024-06-18 16:08:37,121][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000166790_2732687360.pth [2024-06-18 16:08:40,640][12883] Updated weights for policy 0, policy_version 167424 (0.0038) [2024-06-18 16:08:41,994][12645] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 2743107584. Throughput: 0: 43189.9. Samples: 2743311120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:41,994][12645] Avg episode reward: [(0, '0.463')] [2024-06-18 16:08:43,840][12862] Signal inference workers to stop experience collection... (40150 times) [2024-06-18 16:08:43,840][12862] Signal inference workers to resume experience collection... (40150 times) [2024-06-18 16:08:43,845][12883] Updated weights for policy 0, policy_version 167434 (0.0037) [2024-06-18 16:08:43,867][12883] InferenceWorker_p0-w0: stopping experience collection (40150 times) [2024-06-18 16:08:43,867][12883] InferenceWorker_p0-w0: resuming experience collection (40150 times) [2024-06-18 16:08:46,994][12645] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42765.8). Total num frames: 2743353344. Throughput: 0: 43144.9. Samples: 2743429780. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:47,003][12645] Avg episode reward: [(0, '0.496')] [2024-06-18 16:08:48,277][12883] Updated weights for policy 0, policy_version 167444 (0.0037) [2024-06-18 16:08:51,412][12883] Updated weights for policy 0, policy_version 167454 (0.0028) [2024-06-18 16:08:51,994][12645] Fps is (10 sec: 49152.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2743599104. Throughput: 0: 43080.6. Samples: 2743692920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:51,994][12645] Avg episode reward: [(0, '0.360')] [2024-06-18 16:08:55,794][12883] Updated weights for policy 0, policy_version 167464 (0.0034) [2024-06-18 16:08:56,994][12645] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 2743762944. Throughput: 0: 42894.3. Samples: 2743946760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:08:56,994][12645] Avg episode reward: [(0, '0.469')] [2024-06-18 16:08:59,348][12883] Updated weights for policy 0, policy_version 167474 (0.0040) [2024-06-18 16:09:01,996][12645] Fps is (10 sec: 40950.3, 60 sec: 43142.9, 300 sec: 42764.7). Total num frames: 2744008704. Throughput: 0: 42958.1. Samples: 2744071860. Policy #0 lag: (min: 0.0, avg: 8.3, max: 20.0) [2024-06-18 16:09:01,997][12645] Avg episode reward: [(0, '0.541')] [2024-06-18 16:09:03,383][12883] Updated weights for policy 0, policy_version 167484 (0.0031) [2024-06-18 16:09:06,932][12883] Updated weights for policy 0, policy_version 167494 (0.0041) [2024-06-18 16:09:06,994][12645] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2744221696. Throughput: 0: 42834.0. Samples: 2744332340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:06,994][12645] Avg episode reward: [(0, '0.565')] [2024-06-18 16:09:10,997][12883] Updated weights for policy 0, policy_version 167504 (0.0031) [2024-06-18 16:09:11,994][12645] Fps is (10 sec: 40969.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 2744418304. Throughput: 0: 42929.8. Samples: 2744592300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:11,994][12645] Avg episode reward: [(0, '0.656')] [2024-06-18 16:09:14,508][12883] Updated weights for policy 0, policy_version 167514 (0.0042) [2024-06-18 16:09:16,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 2744664064. Throughput: 0: 43007.1. Samples: 2744716820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:16,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 16:09:18,902][12883] Updated weights for policy 0, policy_version 167524 (0.0038) [2024-06-18 16:09:21,996][12645] Fps is (10 sec: 44226.9, 60 sec: 42323.8, 300 sec: 42820.2). Total num frames: 2744860672. Throughput: 0: 42675.7. Samples: 2744971040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:21,996][12645] Avg episode reward: [(0, '0.427')] [2024-06-18 16:09:22,591][12883] Updated weights for policy 0, policy_version 167534 (0.0036) [2024-06-18 16:09:26,553][12883] Updated weights for policy 0, policy_version 167544 (0.0036) [2024-06-18 16:09:26,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 2745057280. Throughput: 0: 42669.2. Samples: 2745231240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:26,994][12645] Avg episode reward: [(0, '0.415')] [2024-06-18 16:09:30,222][12883] Updated weights for policy 0, policy_version 167554 (0.0043) [2024-06-18 16:09:31,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2745303040. Throughput: 0: 42864.4. Samples: 2745358680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:31,994][12645] Avg episode reward: [(0, '0.531')] [2024-06-18 16:09:34,255][12883] Updated weights for policy 0, policy_version 167564 (0.0038) [2024-06-18 16:09:36,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2745499648. Throughput: 0: 42770.5. Samples: 2745617600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:36,994][12645] Avg episode reward: [(0, '0.607')] [2024-06-18 16:09:37,677][12883] Updated weights for policy 0, policy_version 167574 (0.0032) [2024-06-18 16:09:41,899][12883] Updated weights for policy 0, policy_version 167584 (0.0027) [2024-06-18 16:09:41,994][12645] Fps is (10 sec: 39321.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 2745696256. Throughput: 0: 42955.9. Samples: 2745879780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:41,994][12645] Avg episode reward: [(0, '0.592')] [2024-06-18 16:09:45,210][12883] Updated weights for policy 0, policy_version 167594 (0.0042) [2024-06-18 16:09:46,994][12645] Fps is (10 sec: 45875.2, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2745958400. Throughput: 0: 42948.3. Samples: 2746004440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:46,994][12645] Avg episode reward: [(0, '0.732')] [2024-06-18 16:09:49,509][12883] Updated weights for policy 0, policy_version 167604 (0.0038) [2024-06-18 16:09:51,994][12645] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 2746138624. Throughput: 0: 42923.9. Samples: 2746263920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:51,994][12645] Avg episode reward: [(0, '0.709')] [2024-06-18 16:09:53,198][12883] Updated weights for policy 0, policy_version 167614 (0.0032) [2024-06-18 16:09:56,994][12645] Fps is (10 sec: 39321.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2746351616. Throughput: 0: 42830.6. Samples: 2746519680. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:09:56,994][12645] Avg episode reward: [(0, '0.680')] [2024-06-18 16:09:56,998][12883] Updated weights for policy 0, policy_version 167624 (0.0029) [2024-06-18 16:10:00,745][12883] Updated weights for policy 0, policy_version 167634 (0.0040) [2024-06-18 16:10:01,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43146.2, 300 sec: 42931.6). Total num frames: 2746597376. Throughput: 0: 42953.8. Samples: 2746649740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:10:01,994][12645] Avg episode reward: [(0, '0.680')] [2024-06-18 16:10:04,572][12883] Updated weights for policy 0, policy_version 167644 (0.0043) [2024-06-18 16:10:06,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2746793984. Throughput: 0: 42991.4. Samples: 2746905560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:06,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 16:10:08,297][12883] Updated weights for policy 0, policy_version 167654 (0.0026) [2024-06-18 16:10:11,994][12645] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2746990592. Throughput: 0: 42840.6. Samples: 2747159060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:11,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 16:10:12,148][12883] Updated weights for policy 0, policy_version 167664 (0.0036) [2024-06-18 16:10:15,747][12862] Signal inference workers to stop experience collection... (40200 times) [2024-06-18 16:10:15,752][12862] Signal inference workers to resume experience collection... 
(40200 times) [2024-06-18 16:10:15,776][12883] InferenceWorker_p0-w0: stopping experience collection (40200 times) [2024-06-18 16:10:15,776][12883] InferenceWorker_p0-w0: resuming experience collection (40200 times) [2024-06-18 16:10:15,897][12883] Updated weights for policy 0, policy_version 167674 (0.0019) [2024-06-18 16:10:16,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2747219968. Throughput: 0: 42836.6. Samples: 2747286320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:16,994][12645] Avg episode reward: [(0, '0.655')] [2024-06-18 16:10:19,733][12883] Updated weights for policy 0, policy_version 167684 (0.0033) [2024-06-18 16:10:21,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42873.0, 300 sec: 42820.5). Total num frames: 2747432960. Throughput: 0: 42827.1. Samples: 2747544820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:21,994][12645] Avg episode reward: [(0, '0.625')] [2024-06-18 16:10:23,421][12883] Updated weights for policy 0, policy_version 167694 (0.0034) [2024-06-18 16:10:26,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 2747645952. Throughput: 0: 42712.0. Samples: 2747801820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:26,994][12645] Avg episode reward: [(0, '0.595')] [2024-06-18 16:10:27,724][12883] Updated weights for policy 0, policy_version 167704 (0.0039) [2024-06-18 16:10:31,224][12883] Updated weights for policy 0, policy_version 167714 (0.0028) [2024-06-18 16:10:31,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2747875328. Throughput: 0: 42731.6. Samples: 2747927360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:31,994][12645] Avg episode reward: [(0, '0.582')] [2024-06-18 16:10:35,345][12883] Updated weights for policy 0, policy_version 167724 (0.0037) [2024-06-18 16:10:36,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2748071936. Throughput: 0: 42784.0. Samples: 2748189200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:36,994][12645] Avg episode reward: [(0, '0.344')] [2024-06-18 16:10:37,075][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167730_2748088320.pth... [2024-06-18 16:10:37,139][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167101_2737782784.pth [2024-06-18 16:10:38,827][12883] Updated weights for policy 0, policy_version 167734 (0.0035) [2024-06-18 16:10:41,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2748284928. Throughput: 0: 42716.5. Samples: 2748441920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:41,994][12645] Avg episode reward: [(0, '0.651')] [2024-06-18 16:10:42,951][12883] Updated weights for policy 0, policy_version 167744 (0.0037) [2024-06-18 16:10:46,739][12883] Updated weights for policy 0, policy_version 167754 (0.0036) [2024-06-18 16:10:46,994][12645] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2748497920. Throughput: 0: 42617.8. Samples: 2748567540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:46,994][12645] Avg episode reward: [(0, '0.522')] [2024-06-18 16:10:50,466][12883] Updated weights for policy 0, policy_version 167764 (0.0032) [2024-06-18 16:10:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2748727296. Throughput: 0: 42767.2. Samples: 2748830080. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:51,994][12645] Avg episode reward: [(0, '0.220')] [2024-06-18 16:10:54,550][12883] Updated weights for policy 0, policy_version 167774 (0.0033) [2024-06-18 16:10:56,994][12645] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2748923904. Throughput: 0: 42827.4. Samples: 2749086300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:10:56,994][12645] Avg episode reward: [(0, '0.579')] [2024-06-18 16:10:58,043][12883] Updated weights for policy 0, policy_version 167784 (0.0038) [2024-06-18 16:11:01,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 2749136896. Throughput: 0: 42731.6. Samples: 2749209240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:11:01,994][12645] Avg episode reward: [(0, '0.718')] [2024-06-18 16:11:01,999][12883] Updated weights for policy 0, policy_version 167794 (0.0029) [2024-06-18 16:11:05,390][12883] Updated weights for policy 0, policy_version 167804 (0.0041) [2024-06-18 16:11:06,994][12645] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 2749349888. Throughput: 0: 42831.1. Samples: 2749472220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:11:06,994][12645] Avg episode reward: [(0, '0.551')] [2024-06-18 16:11:09,431][12883] Updated weights for policy 0, policy_version 167814 (0.0033) [2024-06-18 16:11:11,994][12645] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2749562880. Throughput: 0: 42856.8. Samples: 2749730380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:11,994][12645] Avg episode reward: [(0, '0.552')] [2024-06-18 16:11:13,140][12883] Updated weights for policy 0, policy_version 167824 (0.0038) [2024-06-18 16:11:16,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 2749775872. Throughput: 0: 42975.9. Samples: 2749861280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:16,994][12645] Avg episode reward: [(0, '0.523')] [2024-06-18 16:11:17,050][12883] Updated weights for policy 0, policy_version 167834 (0.0043) [2024-06-18 16:11:20,788][12883] Updated weights for policy 0, policy_version 167844 (0.0040) [2024-06-18 16:11:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.4). Total num frames: 2750005248. Throughput: 0: 42774.6. Samples: 2750114060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:21,994][12645] Avg episode reward: [(0, '0.301')] [2024-06-18 16:11:25,140][12883] Updated weights for policy 0, policy_version 167854 (0.0030) [2024-06-18 16:11:26,994][12645] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2750218240. Throughput: 0: 42716.8. Samples: 2750364180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:26,994][12645] Avg episode reward: [(0, '0.644')] [2024-06-18 16:11:28,394][12883] Updated weights for policy 0, policy_version 167864 (0.0029) [2024-06-18 16:11:31,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 2750414848. Throughput: 0: 42807.1. Samples: 2750493860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:31,994][12645] Avg episode reward: [(0, '0.431')] [2024-06-18 16:11:32,898][12883] Updated weights for policy 0, policy_version 167874 (0.0034) [2024-06-18 16:11:36,003][12883] Updated weights for policy 0, policy_version 167884 (0.0045) [2024-06-18 16:11:36,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 2750644224. Throughput: 0: 42713.7. Samples: 2750752200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:36,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 16:11:37,853][12862] Signal inference workers to stop experience collection... 
(40250 times) [2024-06-18 16:11:37,853][12862] Signal inference workers to resume experience collection... (40250 times) [2024-06-18 16:11:37,900][12883] InferenceWorker_p0-w0: stopping experience collection (40250 times) [2024-06-18 16:11:37,900][12883] InferenceWorker_p0-w0: resuming experience collection (40250 times) [2024-06-18 16:11:40,506][12883] Updated weights for policy 0, policy_version 167894 (0.0032) [2024-06-18 16:11:41,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42931.7). Total num frames: 2750857216. Throughput: 0: 42722.9. Samples: 2751008820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:41,994][12645] Avg episode reward: [(0, '0.550')] [2024-06-18 16:11:44,111][12883] Updated weights for policy 0, policy_version 167904 (0.0026) [2024-06-18 16:11:46,994][12645] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 2751053824. Throughput: 0: 42835.4. Samples: 2751136840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:46,994][12645] Avg episode reward: [(0, '0.609')] [2024-06-18 16:11:48,095][12883] Updated weights for policy 0, policy_version 167914 (0.0027) [2024-06-18 16:11:51,658][12883] Updated weights for policy 0, policy_version 167924 (0.0034) [2024-06-18 16:11:51,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42931.7). Total num frames: 2751283200. Throughput: 0: 42750.4. Samples: 2751395980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:51,994][12645] Avg episode reward: [(0, '0.453')] [2024-06-18 16:11:55,703][12883] Updated weights for policy 0, policy_version 167934 (0.0031) [2024-06-18 16:11:56,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2751496192. Throughput: 0: 42834.6. Samples: 2751657940. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:11:56,994][12645] Avg episode reward: [(0, '0.489')] [2024-06-18 16:11:59,274][12883] Updated weights for policy 0, policy_version 167944 (0.0046) [2024-06-18 16:12:01,994][12645] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2751709184. Throughput: 0: 42675.7. Samples: 2751781680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:12:02,003][12645] Avg episode reward: [(0, '0.506')] [2024-06-18 16:12:03,232][12883] Updated weights for policy 0, policy_version 167954 (0.0028) [2024-06-18 16:12:06,994][12645] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2751905792. Throughput: 0: 42732.0. Samples: 2752037000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:12:06,994][12645] Avg episode reward: [(0, '0.470')] [2024-06-18 16:12:07,161][12883] Updated weights for policy 0, policy_version 167964 (0.0041) [2024-06-18 16:12:10,974][12883] Updated weights for policy 0, policy_version 167974 (0.0036) [2024-06-18 16:12:11,994][12645] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2752118784. Throughput: 0: 42993.9. Samples: 2752298900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:12:11,994][12645] Avg episode reward: [(0, '0.585')] [2024-06-18 16:12:14,810][12883] Updated weights for policy 0, policy_version 167984 (0.0025) [2024-06-18 16:12:16,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 2752348160. Throughput: 0: 42922.2. Samples: 2752425360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 16:12:16,994][12645] Avg episode reward: [(0, '0.340')] [2024-06-18 16:12:18,719][12883] Updated weights for policy 0, policy_version 167994 (0.0025) [2024-06-18 16:12:21,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 2752561152. Throughput: 0: 42779.2. Samples: 2752677260. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:12:21,994][12645] Avg episode reward: [(0, '0.491')] [2024-06-18 16:12:22,157][12883] Updated weights for policy 0, policy_version 168004 (0.0038) [2024-06-18 16:12:26,191][12883] Updated weights for policy 0, policy_version 168014 (0.0023) [2024-06-18 16:12:26,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2752774144. Throughput: 0: 42927.0. Samples: 2752940540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:12:26,994][12645] Avg episode reward: [(0, '0.749')] [2024-06-18 16:12:30,056][12883] Updated weights for policy 0, policy_version 168024 (0.0040) [2024-06-18 16:12:31,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2753003520. Throughput: 0: 42930.7. Samples: 2753068720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:12:31,994][12645] Avg episode reward: [(0, '0.688')] [2024-06-18 16:12:33,642][12883] Updated weights for policy 0, policy_version 168034 (0.0037) [2024-06-18 16:12:36,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 2753216512. Throughput: 0: 42963.9. Samples: 2753329360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:12:36,994][12645] Avg episode reward: [(0, '0.615')] [2024-06-18 16:12:37,013][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168043_2753216512.pth... [2024-06-18 16:12:37,075][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167418_2742976512.pth [2024-06-18 16:12:37,507][12883] Updated weights for policy 0, policy_version 168044 (0.0043) [2024-06-18 16:12:41,751][12883] Updated weights for policy 0, policy_version 168054 (0.0034) [2024-06-18 16:12:41,995][12645] Fps is (10 sec: 40952.8, 60 sec: 42597.1, 300 sec: 42875.8). Total num frames: 2753413120. Throughput: 0: 42941.0. Samples: 2753590360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:12:41,996][12645] Avg episode reward: [(0, '0.532')] [2024-06-18 16:12:45,155][12883] Updated weights for policy 0, policy_version 168064 (0.0043) [2024-06-18 16:12:46,994][12645] Fps is (10 sec: 44236.7, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 2753658880. Throughput: 0: 42974.6. Samples: 2753715540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:12:46,999][12645] Avg episode reward: [(0, '0.683')] [2024-06-18 16:12:49,248][12883] Updated weights for policy 0, policy_version 168074 (0.0043) [2024-06-18 16:12:50,924][12862] Signal inference workers to stop experience collection... (40300 times) [2024-06-18 16:12:50,924][12862] Signal inference workers to resume experience collection... (40300 times) [2024-06-18 16:12:50,937][12883] InferenceWorker_p0-w0: stopping experience collection (40300 times) [2024-06-18 16:12:50,938][12883] InferenceWorker_p0-w0: resuming experience collection (40300 times) [2024-06-18 16:12:51,994][12645] Fps is (10 sec: 42606.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2753839104. Throughput: 0: 42885.0. Samples: 2753966820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:12:51,994][12645] Avg episode reward: [(0, '0.608')] [2024-06-18 16:12:52,632][12883] Updated weights for policy 0, policy_version 168084 (0.0027) [2024-06-18 16:12:56,855][12883] Updated weights for policy 0, policy_version 168094 (0.0035) [2024-06-18 16:12:56,994][12645] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2754052096. Throughput: 0: 42835.0. Samples: 2754226480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:12:56,994][12645] Avg episode reward: [(0, '0.640')] [2024-06-18 16:13:00,173][12883] Updated weights for policy 0, policy_version 168104 (0.0035) [2024-06-18 16:13:02,000][12645] Fps is (10 sec: 44209.0, 60 sec: 42867.1, 300 sec: 42764.1). Total num frames: 2754281472. 
Throughput: 0: 42831.0. Samples: 2754353020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:13:02,000][12645] Avg episode reward: [(0, '0.528')] [2024-06-18 16:13:04,395][12883] Updated weights for policy 0, policy_version 168114 (0.0030) [2024-06-18 16:13:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 2754494464. Throughput: 0: 42841.2. Samples: 2754605120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:13:06,994][12645] Avg episode reward: [(0, '0.527')] [2024-06-18 16:13:08,139][12883] Updated weights for policy 0, policy_version 168124 (0.0029) [2024-06-18 16:13:11,994][12645] Fps is (10 sec: 40985.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 2754691072. Throughput: 0: 42778.8. Samples: 2754865580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:13:11,994][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 16:13:12,451][12883] Updated weights for policy 0, policy_version 168134 (0.0036) [2024-06-18 16:13:15,541][12883] Updated weights for policy 0, policy_version 168144 (0.0036) [2024-06-18 16:13:16,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 2754920448. Throughput: 0: 42820.5. Samples: 2754995640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:13:16,994][12645] Avg episode reward: [(0, '0.483')] [2024-06-18 16:13:20,003][12883] Updated weights for policy 0, policy_version 168154 (0.0028) [2024-06-18 16:13:21,994][12645] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 2755133440. Throughput: 0: 42786.2. Samples: 2755254740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 16:13:21,994][12645] Avg episode reward: [(0, '0.446')] [2024-06-18 16:13:22,936][12883] Updated weights for policy 0, policy_version 168164 (0.0034) [2024-06-18 16:13:26,994][12645] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 2755346432. 
Throughput: 0: 42648.3. Samples: 2755509460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:13:26,994][12645] Avg episode reward: [(0, '0.558')] [2024-06-18 16:13:27,617][12883] Updated weights for policy 0, policy_version 168174 (0.0028) [2024-06-18 16:13:30,439][12883] Updated weights for policy 0, policy_version 168184 (0.0027) [2024-06-18 16:13:31,994][12645] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 2755559424. Throughput: 0: 42757.1. Samples: 2755639600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:13:31,994][12645] Avg episode reward: [(0, '0.576')] [2024-06-18 16:13:35,061][12883] Updated weights for policy 0, policy_version 168194 (0.0038) [2024-06-18 16:13:36,994][12645] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2755788800. Throughput: 0: 42949.7. Samples: 2755899560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:13:36,994][12645] Avg episode reward: [(0, '0.737')] [2024-06-18 16:13:38,033][12883] Updated weights for policy 0, policy_version 168204 (0.0028) [2024-06-18 16:13:41,994][12645] Fps is (10 sec: 40959.6, 60 sec: 42599.7, 300 sec: 42765.0). Total num frames: 2755969024. Throughput: 0: 42872.1. Samples: 2756155720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:13:41,994][12645] Avg episode reward: [(0, '0.747')] [2024-06-18 16:13:42,770][12883] Updated weights for policy 0, policy_version 168214 (0.0036) [2024-06-18 16:13:45,877][12883] Updated weights for policy 0, policy_version 168224 (0.0030) [2024-06-18 16:13:46,994][12645] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 2756214784. Throughput: 0: 42803.6. Samples: 2756278920. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:13:46,994][12645] Avg episode reward: [(0, '0.566')] [2024-06-18 16:13:50,457][12883] Updated weights for policy 0, policy_version 168234 (0.0055) [2024-06-18 16:13:51,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2756427776. Throughput: 0: 42957.9. Samples: 2756538220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:13:51,994][12645] Avg episode reward: [(0, '0.420')] [2024-06-18 16:13:53,568][12883] Updated weights for policy 0, policy_version 168244 (0.0032) [2024-06-18 16:13:56,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.3). Total num frames: 2756624384. Throughput: 0: 42860.8. Samples: 2756794320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:13:56,994][12645] Avg episode reward: [(0, '0.388')] [2024-06-18 16:13:58,286][12883] Updated weights for policy 0, policy_version 168254 (0.0036) [2024-06-18 16:14:01,358][12883] Updated weights for policy 0, policy_version 168264 (0.0045) [2024-06-18 16:14:01,994][12645] Fps is (10 sec: 44236.4, 60 sec: 43149.0, 300 sec: 42876.1). Total num frames: 2756870144. Throughput: 0: 42678.6. Samples: 2756916180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:14:01,994][12645] Avg episode reward: [(0, '0.432')] [2024-06-18 16:14:05,804][12883] Updated weights for policy 0, policy_version 168274 (0.0033) [2024-06-18 16:14:06,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2757066752. Throughput: 0: 42837.4. Samples: 2757182420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:14:06,994][12645] Avg episode reward: [(0, '0.437')] [2024-06-18 16:14:08,854][12883] Updated weights for policy 0, policy_version 168284 (0.0043) [2024-06-18 16:14:11,994][12645] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2757263360. Throughput: 0: 42755.6. Samples: 2757433460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:14:11,994][12645] Avg episode reward: [(0, '0.644')] [2024-06-18 16:14:13,388][12883] Updated weights for policy 0, policy_version 168294 (0.0030) [2024-06-18 16:14:16,523][12883] Updated weights for policy 0, policy_version 168304 (0.0039) [2024-06-18 16:14:16,996][12645] Fps is (10 sec: 44226.9, 60 sec: 43142.9, 300 sec: 42876.1). Total num frames: 2757509120. Throughput: 0: 42704.0. Samples: 2757561380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:14:16,996][12645] Avg episode reward: [(0, '0.461')] [2024-06-18 16:14:20,843][12883] Updated weights for policy 0, policy_version 168314 (0.0028) [2024-06-18 16:14:21,994][12645] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 2757722112. Throughput: 0: 42887.1. Samples: 2757829480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:14:21,994][12645] Avg episode reward: [(0, '0.346')] [2024-06-18 16:14:24,143][12883] Updated weights for policy 0, policy_version 168324 (0.0037) [2024-06-18 16:14:26,994][12645] Fps is (10 sec: 37692.0, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 2757885952. Throughput: 0: 42887.6. Samples: 2758085660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 16:14:26,994][12645] Avg episode reward: [(0, '0.326')] [2024-06-18 16:14:28,273][12862] Signal inference workers to stop experience collection... (40350 times) [2024-06-18 16:14:28,273][12862] Signal inference workers to resume experience collection... 
(40350 times) [2024-06-18 16:14:28,315][12883] InferenceWorker_p0-w0: stopping experience collection (40350 times) [2024-06-18 16:14:28,315][12883] InferenceWorker_p0-w0: resuming experience collection (40350 times) [2024-06-18 16:14:28,411][12883] Updated weights for policy 0, policy_version 168334 (0.0042) [2024-06-18 16:14:31,838][12883] Updated weights for policy 0, policy_version 168344 (0.0043) [2024-06-18 16:14:31,994][12645] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2758148096. Throughput: 0: 42858.8. Samples: 2758207560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:14:31,994][12645] Avg episode reward: [(0, '0.641')] [2024-06-18 16:14:35,908][12883] Updated weights for policy 0, policy_version 168354 (0.0032) [2024-06-18 16:14:36,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42820.5). Total num frames: 2758328320. Throughput: 0: 42816.8. Samples: 2758464980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:14:36,994][12645] Avg episode reward: [(0, '0.682')] [2024-06-18 16:14:37,060][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168356_2758344704.pth... [2024-06-18 16:14:37,108][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000167730_2748088320.pth [2024-06-18 16:14:39,456][12883] Updated weights for policy 0, policy_version 168364 (0.0041) [2024-06-18 16:14:41,994][12645] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 2758541312. Throughput: 0: 42702.2. Samples: 2758715920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:14:41,994][12645] Avg episode reward: [(0, '0.765')] [2024-06-18 16:14:43,663][12883] Updated weights for policy 0, policy_version 168374 (0.0030) [2024-06-18 16:14:46,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2758754304. Throughput: 0: 42956.5. Samples: 2758849220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:14:46,994][12645] Avg episode reward: [(0, '0.395')] [2024-06-18 16:14:47,485][12883] Updated weights for policy 0, policy_version 168384 (0.0028) [2024-06-18 16:14:51,118][12883] Updated weights for policy 0, policy_version 168394 (0.0032) [2024-06-18 16:14:51,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 2758983680. Throughput: 0: 42624.0. Samples: 2759100500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:14:51,994][12645] Avg episode reward: [(0, '0.382')] [2024-06-18 16:14:55,230][12883] Updated weights for policy 0, policy_version 168404 (0.0026) [2024-06-18 16:14:56,994][12645] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 2759196672. Throughput: 0: 42715.5. Samples: 2759355660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:14:56,998][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 16:14:59,130][12883] Updated weights for policy 0, policy_version 168414 (0.0028) [2024-06-18 16:15:01,996][12645] Fps is (10 sec: 42589.3, 60 sec: 42323.8, 300 sec: 42764.7). Total num frames: 2759409664. Throughput: 0: 42892.0. Samples: 2759491520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:15:01,997][12645] Avg episode reward: [(0, '0.493')] [2024-06-18 16:15:02,713][12883] Updated weights for policy 0, policy_version 168424 (0.0042) [2024-06-18 16:15:06,636][12883] Updated weights for policy 0, policy_version 168434 (0.0032) [2024-06-18 16:15:06,994][12645] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2759622656. Throughput: 0: 42600.4. Samples: 2759746500. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:15:06,994][12645] Avg episode reward: [(0, '0.495')] [2024-06-18 16:15:10,315][12883] Updated weights for policy 0, policy_version 168444 (0.0027) [2024-06-18 16:15:11,994][12645] Fps is (10 sec: 44246.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2759852032. Throughput: 0: 42583.5. Samples: 2760001920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:15:11,994][12645] Avg episode reward: [(0, '0.610')] [2024-06-18 16:15:14,110][12883] Updated weights for policy 0, policy_version 168454 (0.0036) [2024-06-18 16:15:16,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42326.9, 300 sec: 42765.0). Total num frames: 2760048640. Throughput: 0: 42794.2. Samples: 2760133300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:15:16,994][12645] Avg episode reward: [(0, '0.647')] [2024-06-18 16:15:17,879][12883] Updated weights for policy 0, policy_version 168464 (0.0038) [2024-06-18 16:15:21,827][12883] Updated weights for policy 0, policy_version 168474 (0.0028) [2024-06-18 16:15:21,994][12645] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 2760278016. Throughput: 0: 42724.5. Samples: 2760387580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:15:21,999][12645] Avg episode reward: [(0, '0.604')] [2024-06-18 16:15:25,542][12883] Updated weights for policy 0, policy_version 168484 (0.0025) [2024-06-18 16:15:26,994][12645] Fps is (10 sec: 42597.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 2760474624. Throughput: 0: 42920.4. Samples: 2760647340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:15:26,994][12645] Avg episode reward: [(0, '0.769')] [2024-06-18 16:15:29,354][12883] Updated weights for policy 0, policy_version 168494 (0.0028) [2024-06-18 16:15:31,994][12645] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 2760687616. Throughput: 0: 42793.0. Samples: 2760774900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:15:31,994][12645] Avg episode reward: [(0, '0.702')] [2024-06-18 16:15:33,431][12883] Updated weights for policy 0, policy_version 168504 (0.0040) [2024-06-18 16:15:36,994][12645] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2760916992. Throughput: 0: 42896.4. Samples: 2761030840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:15:37,003][12645] Avg episode reward: [(0, '0.423')] [2024-06-18 16:15:37,251][12883] Updated weights for policy 0, policy_version 168514 (0.0030) [2024-06-18 16:15:40,901][12883] Updated weights for policy 0, policy_version 168524 (0.0036) [2024-06-18 16:15:41,994][12645] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 2761129984. Throughput: 0: 43016.5. Samples: 2761291400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:15:41,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 16:15:44,733][12883] Updated weights for policy 0, policy_version 168534 (0.0036) [2024-06-18 16:15:46,995][12645] Fps is (10 sec: 42594.4, 60 sec: 43143.7, 300 sec: 42764.9). Total num frames: 2761342976. Throughput: 0: 42830.9. Samples: 2761418860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:15:46,995][12645] Avg episode reward: [(0, '0.519')] [2024-06-18 16:15:48,552][12883] Updated weights for policy 0, policy_version 168544 (0.0028) [2024-06-18 16:15:51,999][12645] Fps is (10 sec: 44213.1, 60 sec: 43140.7, 300 sec: 42875.3). Total num frames: 2761572352. Throughput: 0: 42810.9. Samples: 2761673220. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:15:51,999][12645] Avg episode reward: [(0, '0.711')] [2024-06-18 16:15:52,185][12883] Updated weights for policy 0, policy_version 168554 (0.0032) [2024-06-18 16:15:56,196][12883] Updated weights for policy 0, policy_version 168564 (0.0032) [2024-06-18 16:15:56,994][12645] Fps is (10 sec: 44241.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 2761785344. Throughput: 0: 43069.8. Samples: 2761940060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:15:56,994][12645] Avg episode reward: [(0, '0.602')] [2024-06-18 16:15:59,742][12883] Updated weights for policy 0, policy_version 168574 (0.0032) [2024-06-18 16:16:01,994][12645] Fps is (10 sec: 40982.3, 60 sec: 42873.1, 300 sec: 42820.6). Total num frames: 2761981952. Throughput: 0: 42995.6. Samples: 2762068100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:16:01,994][12645] Avg episode reward: [(0, '0.402')] [2024-06-18 16:16:03,899][12883] Updated weights for policy 0, policy_version 168584 (0.0036) [2024-06-18 16:16:06,994][12645] Fps is (10 sec: 44236.5, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 2762227712. Throughput: 0: 43018.2. Samples: 2762323400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:16:06,994][12645] Avg episode reward: [(0, '0.387')] [2024-06-18 16:16:07,283][12883] Updated weights for policy 0, policy_version 168594 (0.0027) [2024-06-18 16:16:08,210][12862] Signal inference workers to stop experience collection... (40400 times) [2024-06-18 16:16:08,244][12883] InferenceWorker_p0-w0: stopping experience collection (40400 times) [2024-06-18 16:16:08,321][12862] Signal inference workers to resume experience collection... 
(40400 times) [2024-06-18 16:16:08,321][12883] InferenceWorker_p0-w0: resuming experience collection (40400 times) [2024-06-18 16:16:11,784][12883] Updated weights for policy 0, policy_version 168604 (0.0037) [2024-06-18 16:16:11,994][12645] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2762424320. Throughput: 0: 43146.4. Samples: 2762588920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:16:11,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 16:16:14,873][12883] Updated weights for policy 0, policy_version 168614 (0.0039) [2024-06-18 16:16:16,994][12645] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 2762637312. Throughput: 0: 43006.1. Samples: 2762710180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:16:16,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 16:16:19,338][12883] Updated weights for policy 0, policy_version 168624 (0.0040) [2024-06-18 16:16:21,994][12645] Fps is (10 sec: 45875.4, 60 sec: 43417.7, 300 sec: 42931.7). Total num frames: 2762883072. Throughput: 0: 42967.3. Samples: 2762964360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:16:21,994][12645] Avg episode reward: [(0, '0.686')] [2024-06-18 16:16:22,902][12883] Updated weights for policy 0, policy_version 168634 (0.0034) [2024-06-18 16:16:26,994][12645] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 2763046912. Throughput: 0: 43246.7. Samples: 2763237500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:16:26,994][12645] Avg episode reward: [(0, '0.618')] [2024-06-18 16:16:27,015][12883] Updated weights for policy 0, policy_version 168644 (0.0033) [2024-06-18 16:16:30,294][12883] Updated weights for policy 0, policy_version 168654 (0.0031) [2024-06-18 16:16:31,994][12645] Fps is (10 sec: 40959.5, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 2763292672. Throughput: 0: 43126.7. Samples: 2763359520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:16:31,994][12645] Avg episode reward: [(0, '0.511')] [2024-06-18 16:16:34,745][12883] Updated weights for policy 0, policy_version 168664 (0.0037) [2024-06-18 16:16:36,994][12645] Fps is (10 sec: 47513.0, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 2763522048. Throughput: 0: 43280.6. Samples: 2763620620. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:16:36,994][12645] Avg episode reward: [(0, '0.575')] [2024-06-18 16:16:37,008][12862] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168672_2763522048.pth... [2024-06-18 16:16:37,074][12862] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168043_2753216512.pth [2024-06-18 16:16:38,075][12883] Updated weights for policy 0, policy_version 168674 (0.0035) [2024-06-18 16:16:41,994][12645] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 2763702272. Throughput: 0: 43135.1. Samples: 2763881140. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:16:41,994][12645] Avg episode reward: [(0, '0.323')] [2024-06-18 16:16:42,193][12883] Updated weights for policy 0, policy_version 168684 (0.0040) [2024-06-18 16:16:45,630][12883] Updated weights for policy 0, policy_version 168694 (0.0039) [2024-06-18 16:16:46,996][12645] Fps is (10 sec: 42589.4, 60 sec: 43416.7, 300 sec: 42931.3). Total num frames: 2763948032. Throughput: 0: 43017.3. Samples: 2764003980. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:16:46,997][12645] Avg episode reward: [(0, '0.337')] [2024-06-18 16:16:49,697][12883] Updated weights for policy 0, policy_version 168704 (0.0044) [2024-06-18 16:16:51,994][12645] Fps is (10 sec: 44237.0, 60 sec: 42875.4, 300 sec: 42876.1). Total num frames: 2764144640. Throughput: 0: 43096.5. Samples: 2764262740. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:16:51,994][12645] Avg episode reward: [(0, '0.372')] [2024-06-18 16:16:53,258][12883] Updated weights for policy 0, policy_version 168714 (0.0027) [2024-06-18 16:16:56,994][12645] Fps is (10 sec: 40969.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 2764357632. Throughput: 0: 42938.1. Samples: 2764521140. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:16:56,994][12645] Avg episode reward: [(0, '0.433')] [2024-06-18 16:16:57,567][12883] Updated weights for policy 0, policy_version 168724 (0.0041) [2024-06-18 16:17:00,869][12883] Updated weights for policy 0, policy_version 168734 (0.0036) [2024-06-18 16:17:01,996][12645] Fps is (10 sec: 44226.7, 60 sec: 43415.9, 300 sec: 42986.9). Total num frames: 2764587008. Throughput: 0: 42957.5. Samples: 2764643360. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:17:01,997][12645] Avg episode reward: [(0, '0.638')] [2024-06-18 16:17:05,065][12883] Updated weights for policy 0, policy_version 168744 (0.0039) [2024-06-18 16:17:06,994][12645] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 2764800000. Throughput: 0: 43143.9. Samples: 2764905840. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:17:06,994][12645] Avg episode reward: [(0, '0.588')] [2024-06-18 16:17:08,276][12883] Updated weights for policy 0, policy_version 168754 (0.0032) [2024-06-18 16:17:11,994][12645] Fps is (10 sec: 39330.6, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 2764980224. Throughput: 0: 42722.7. Samples: 2765160020. 
Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:17:11,994][12645] Avg episode reward: [(0, '0.561')] [2024-06-18 16:17:12,532][12883] Updated weights for policy 0, policy_version 168764 (0.0036) [2024-06-18 16:17:15,842][12883] Updated weights for policy 0, policy_version 168774 (0.0050) [2024-06-18 16:17:16,994][12645] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 2765225984. Throughput: 0: 42742.2. Samples: 2765282920. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:17:16,994][12645] Avg episode reward: [(0, '0.740')] [2024-06-18 16:17:20,832][12883] Updated weights for policy 0, policy_version 168784 (0.0037) [2024-06-18 16:17:21,994][12645] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2765422592. Throughput: 0: 42778.8. Samples: 2765545660. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:17:21,994][12645] Avg episode reward: [(0, '0.568')] [2024-06-18 16:17:23,450][12883] Updated weights for policy 0, policy_version 168794 (0.0039) [2024-06-18 16:17:26,994][12645] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 2765619200. Throughput: 0: 42625.8. Samples: 2765799300. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:17:26,994][12645] Avg episode reward: [(0, '0.441')] [2024-06-18 16:17:28,449][12883] Updated weights for policy 0, policy_version 168804 (0.0030) [2024-06-18 16:17:28,515][12862] Signal inference workers to stop experience collection... (40450 times) [2024-06-18 16:17:28,562][12883] InferenceWorker_p0-w0: stopping experience collection (40450 times) [2024-06-18 16:17:28,636][12862] Signal inference workers to resume experience collection... 
(40450 times) [2024-06-18 16:17:28,637][12883] InferenceWorker_p0-w0: resuming experience collection (40450 times) [2024-06-18 16:17:31,048][12883] Updated weights for policy 0, policy_version 168814 (0.0033) [2024-06-18 16:17:31,994][12645] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 2765848576. Throughput: 0: 42693.8. Samples: 2765925100. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:17:31,994][12645] Avg episode reward: [(0, '0.729')] [2024-06-18 16:17:36,016][12883] Updated weights for policy 0, policy_version 168824 (0.0032) [2024-06-18 16:17:36,994][12645] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 42820.8). Total num frames: 2766045184. Throughput: 0: 42667.6. Samples: 2766182780. Policy #0 lag: (min: 0.0, avg: 13.0, max: 24.0) [2024-06-18 16:17:36,994][12645] Avg episode reward: [(0, '0.468')] [2024-06-18 16:17:38,894][12883] Updated weights for policy 0, policy_version 168834 (0.0044) [2024-06-18 16:17:41,994][12645] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 2766274560. Throughput: 0: 42448.8. Samples: 2766431340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:17:41,994][12645] Avg episode reward: [(0, '0.513')] [2024-06-18 16:17:44,136][12883] Updated weights for policy 0, policy_version 168844 (0.0030) [2024-06-18 16:17:46,688][12883] Updated weights for policy 0, policy_version 168854 (0.0031) [2024-06-18 16:17:46,994][12645] Fps is (10 sec: 45874.9, 60 sec: 42600.0, 300 sec: 42931.6). Total num frames: 2766503936. Throughput: 0: 42624.8. Samples: 2766561380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:17:46,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 16:17:51,749][12883] Updated weights for policy 0, policy_version 168864 (0.0040) [2024-06-18 16:17:51,994][12645] Fps is (10 sec: 40960.9, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 2766684160. Throughput: 0: 42411.2. Samples: 2766814340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:17:51,994][12645] Avg episode reward: [(0, '0.383')] [2024-06-18 16:17:54,571][12883] Updated weights for policy 0, policy_version 168874 (0.0031) [2024-06-18 16:17:56,994][12645] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 42765.9). Total num frames: 2766897152. Throughput: 0: 42326.3. Samples: 2767064700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:17:56,994][12645] Avg episode reward: [(0, '0.554')] [2024-06-18 16:17:59,285][12883] Updated weights for policy 0, policy_version 168884 (0.0035) [2024-06-18 16:18:01,994][12645] Fps is (10 sec: 45875.3, 60 sec: 42600.0, 300 sec: 42876.1). Total num frames: 2767142912. Throughput: 0: 42450.8. Samples: 2767193200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:18:01,994][12645] Avg episode reward: [(0, '0.810')] [2024-06-18 16:18:02,094][12883] Updated weights for policy 0, policy_version 168894 (0.0033) [2024-06-18 16:18:06,859][12883] Updated weights for policy 0, policy_version 168904 (0.0028) [2024-06-18 16:18:06,994][12645] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 2767339520. Throughput: 0: 42406.2. Samples: 2767453940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:18:06,994][12645] Avg episode reward: [(0, '0.712')] [2024-06-18 16:18:09,684][12883] Updated weights for policy 0, policy_version 168914 (0.0036) [2024-06-18 16:18:11,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 2767552512. Throughput: 0: 42296.9. Samples: 2767702660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:18:11,994][12645] Avg episode reward: [(0, '0.626')] [2024-06-18 16:18:14,416][12883] Updated weights for policy 0, policy_version 168924 (0.0045) [2024-06-18 16:18:17,000][12645] Fps is (10 sec: 45846.6, 60 sec: 42867.0, 300 sec: 42930.7). Total num frames: 2767798272. Throughput: 0: 42490.0. Samples: 2767837420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:18:17,000][12645] Avg episode reward: [(0, '0.536')] [2024-06-18 16:18:17,494][12883] Updated weights for policy 0, policy_version 168934 (0.0032) [2024-06-18 16:18:21,994][12645] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 2767962112. Throughput: 0: 42429.3. Samples: 2768092100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:18:21,994][12645] Avg episode reward: [(0, '0.436')] [2024-06-18 16:18:22,142][12883] Updated weights for policy 0, policy_version 168944 (0.0029) [2024-06-18 16:18:25,419][12883] Updated weights for policy 0, policy_version 168954 (0.0031) [2024-06-18 16:18:26,994][12645] Fps is (10 sec: 40986.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 2768207872. Throughput: 0: 42434.9. Samples: 2768340900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:18:26,994][12645] Avg episode reward: [(0, '0.460')] [2024-06-18 16:18:30,148][12883] Updated weights for policy 0, policy_version 168964 (0.0030) [2024-06-18 16:18:31,994][12645] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 2768404480. Throughput: 0: 42600.9. Samples: 2768478420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:18:31,994][12645] Avg episode reward: [(0, '0.612')] [2024-06-18 16:23:32,974][16381] Saving configuration to /workspace/metta/train_dir/p2.dr4/config.json... 
[2024-06-18 16:23:32,991][16381] Rollout worker 0 uses device cpu [2024-06-18 16:23:32,991][16381] Rollout worker 1 uses device cpu [2024-06-18 16:23:32,991][16381] Rollout worker 2 uses device cpu [2024-06-18 16:23:32,991][16381] Rollout worker 3 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 4 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 5 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 6 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 7 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 8 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 9 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 10 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 11 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 12 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 13 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 14 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 15 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 16 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 17 uses device cpu [2024-06-18 16:23:32,992][16381] Rollout worker 18 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 19 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 20 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 21 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 22 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 23 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 24 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 25 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 26 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 27 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 28 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 29 uses device cpu 
[2024-06-18 16:23:32,993][16381] Rollout worker 30 uses device cpu [2024-06-18 16:23:32,993][16381] Rollout worker 31 uses device cpu [2024-06-18 16:23:33,572][16381] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:23:33,572][16381] InferenceWorker_p0-w0: min num requests: 10 [2024-06-18 16:23:33,662][16381] Starting all processes... [2024-06-18 16:23:33,663][16381] Starting process learner_proc0 [2024-06-18 16:23:33,892][16381] Starting all processes... [2024-06-18 16:23:33,895][16381] Starting process inference_proc0-0 [2024-06-18 16:23:33,896][16381] Starting process rollout_proc0 [2024-06-18 16:23:33,896][16381] Starting process rollout_proc1 [2024-06-18 16:23:33,897][16381] Starting process rollout_proc2 [2024-06-18 16:23:33,898][16381] Starting process rollout_proc3 [2024-06-18 16:23:33,899][16381] Starting process rollout_proc4 [2024-06-18 16:23:33,899][16381] Starting process rollout_proc5 [2024-06-18 16:23:33,899][16381] Starting process rollout_proc6 [2024-06-18 16:23:33,899][16381] Starting process rollout_proc7 [2024-06-18 16:23:33,899][16381] Starting process rollout_proc8 [2024-06-18 16:23:33,899][16381] Starting process rollout_proc9 [2024-06-18 16:23:33,902][16381] Starting process rollout_proc10 [2024-06-18 16:23:33,902][16381] Starting process rollout_proc11 [2024-06-18 16:23:33,903][16381] Starting process rollout_proc12 [2024-06-18 16:23:33,903][16381] Starting process rollout_proc13 [2024-06-18 16:23:33,954][16381] Starting process rollout_proc14 [2024-06-18 16:23:33,955][16381] Starting process rollout_proc15 [2024-06-18 16:23:33,955][16381] Starting process rollout_proc16 [2024-06-18 16:23:33,955][16381] Starting process rollout_proc17 [2024-06-18 16:23:33,955][16381] Starting process rollout_proc18 [2024-06-18 16:23:33,956][16381] Starting process rollout_proc19 [2024-06-18 16:23:33,959][16381] Starting process rollout_proc20 [2024-06-18 16:23:33,959][16381] Starting process rollout_proc21 [2024-06-18 
16:23:33,959][16381] Starting process rollout_proc22 [2024-06-18 16:23:33,960][16381] Starting process rollout_proc23 [2024-06-18 16:23:33,960][16381] Starting process rollout_proc24 [2024-06-18 16:23:33,961][16381] Starting process rollout_proc25 [2024-06-18 16:23:33,962][16381] Starting process rollout_proc26 [2024-06-18 16:23:33,964][16381] Starting process rollout_proc27 [2024-06-18 16:23:33,965][16381] Starting process rollout_proc28 [2024-06-18 16:23:33,969][16381] Starting process rollout_proc29 [2024-06-18 16:23:33,970][16381] Starting process rollout_proc30 [2024-06-18 16:23:33,972][16381] Starting process rollout_proc31 [2024-06-18 16:23:36,076][16614] Worker 0 uses CPU cores [0] [2024-06-18 16:23:36,100][16613] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:23:36,100][16613] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-18 16:23:36,110][16613] Num visible devices: 1 [2024-06-18 16:23:36,128][16593] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:23:36,128][16593] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-18 16:23:36,140][16593] Num visible devices: 1 [2024-06-18 16:23:36,156][16593] Setting fixed seed 0 [2024-06-18 16:23:36,157][16593] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:23:36,157][16593] Initializing actor-critic model on device cuda:0 [2024-06-18 16:23:36,204][16627] Worker 13 uses CPU cores [13] [2024-06-18 16:23:36,207][16625] Worker 11 uses CPU cores [11] [2024-06-18 16:23:36,207][16622] Worker 9 uses CPU cores [9] [2024-06-18 16:23:36,232][16677] Worker 19 uses CPU cores [19] [2024-06-18 16:23:36,252][16797] Worker 27 uses CPU cores [27] [2024-06-18 16:23:36,264][16621] Worker 6 uses CPU cores [6] [2024-06-18 16:23:36,280][16782] Worker 24 uses CPU cores [24] [2024-06-18 16:23:36,287][16762] Worker 23 uses CPU cores [23] [2024-06-18 16:23:36,299][16629] 
Worker 15 uses CPU cores [15] [2024-06-18 16:23:36,316][16727] Worker 20 uses CPU cores [20] [2024-06-18 16:23:36,332][16796] Worker 28 uses CPU cores [28] [2024-06-18 16:23:36,344][16662] Worker 16 uses CPU cores [16] [2024-06-18 16:23:36,352][16616] Worker 2 uses CPU cores [2] [2024-06-18 16:23:36,368][16628] Worker 14 uses CPU cores [14] [2024-06-18 16:23:36,376][16794] Worker 25 uses CPU cores [25] [2024-06-18 16:23:36,381][16626] Worker 12 uses CPU cores [12] [2024-06-18 16:23:36,400][16799] Worker 31 uses CPU cores [31] [2024-06-18 16:23:36,415][16795] Worker 26 uses CPU cores [26] [2024-06-18 16:23:36,434][16623] Worker 8 uses CPU cores [8] [2024-06-18 16:23:36,436][16661] Worker 18 uses CPU cores [18] [2024-06-18 16:23:36,439][16620] Worker 7 uses CPU cores [7] [2024-06-18 16:23:36,450][16615] Worker 1 uses CPU cores [1] [2024-06-18 16:23:36,456][16618] Worker 5 uses CPU cores [5] [2024-06-18 16:23:36,492][16624] Worker 10 uses CPU cores [10] [2024-06-18 16:23:36,497][16695] Worker 17 uses CPU cores [17] [2024-06-18 16:23:36,499][16619] Worker 4 uses CPU cores [4] [2024-06-18 16:23:36,519][16800] Worker 30 uses CPU cores [30] [2024-06-18 16:23:36,532][16761] Worker 22 uses CPU cores [22] [2024-06-18 16:23:36,538][16798] Worker 29 uses CPU cores [29] [2024-06-18 16:23:36,546][16617] Worker 3 uses CPU cores [3] [2024-06-18 16:23:36,598][16732] Worker 21 uses CPU cores [21] [2024-06-18 16:23:37,089][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,089][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] 
RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,090][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,093][16593] RunningMeanStd input shape: (1,) [2024-06-18 16:23:37,094][16593] RunningMeanStd input shape: (1,) [2024-06-18 16:23:37,094][16593] RunningMeanStd input shape: (1,) [2024-06-18 16:23:37,094][16593] RunningMeanStd input shape: (1,) [2024-06-18 16:23:37,094][16593] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:37,134][16593] RunningMeanStd input shape: (1,) [2024-06-18 16:23:37,138][16593] Created Actor Critic model with architecture: [2024-06-18 16:23:37,138][16593] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (clock): RunningMeanStdInPlace() (converter): 
RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, 
out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-18 16:23:37,196][16593] Using optimizer [2024-06-18 16:23:37,381][16593] Loading state from checkpoint /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168672_2763522048.pth... [2024-06-18 16:23:37,395][16593] Loading model from checkpoint [2024-06-18 16:23:37,397][16593] Loaded experiment state at self.train_step=168672, self.env_steps=2763522048 [2024-06-18 16:23:37,397][16593] Initialized policy 0 weights for model version 168672 [2024-06-18 16:23:37,398][16593] LearnerWorker_p0 finished initialization! 
[2024-06-18 16:23:37,399][16593] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:23:38,132][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,133][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,134][16613] RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,136][16613] RunningMeanStd input shape: (1,) [2024-06-18 16:23:38,137][16613] RunningMeanStd input shape: (1,) [2024-06-18 16:23:38,137][16613] RunningMeanStd input shape: (1,) [2024-06-18 16:23:38,137][16613] RunningMeanStd input shape: (1,) [2024-06-18 16:23:38,137][16613] 
RunningMeanStd input shape: (11, 11) [2024-06-18 16:23:38,175][16613] RunningMeanStd input shape: (1,) [2024-06-18 16:23:38,197][16381] Inference worker 0-0 is ready! [2024-06-18 16:23:38,197][16381] All inference workers are ready! Signal rollout workers to start! [2024-06-18 16:23:40,646][16381] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 2763522048. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 16:23:40,902][16727] Decorrelating experience for 0 frames... [2024-06-18 16:23:40,922][16695] Decorrelating experience for 0 frames... [2024-06-18 16:23:40,964][16799] Decorrelating experience for 0 frames... [2024-06-18 16:23:40,983][16762] Decorrelating experience for 0 frames... [2024-06-18 16:23:40,987][16614] Decorrelating experience for 0 frames... [2024-06-18 16:23:40,991][16626] Decorrelating experience for 0 frames... [2024-06-18 16:23:40,995][16615] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,017][16795] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,022][16662] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,023][16798] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,031][16618] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,036][16797] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,051][16796] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,052][16761] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,060][16622] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,065][16677] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,069][16782] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,080][16623] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,081][16621] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,085][16629] Decorrelating experience for 0 frames... 
[2024-06-18 16:23:41,089][16794] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,095][16800] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,096][16661] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,121][16620] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,126][16627] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,128][16732] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,131][16628] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,136][16625] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,144][16624] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,152][16619] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,158][16616] Decorrelating experience for 0 frames... [2024-06-18 16:23:41,167][16617] Decorrelating experience for 0 frames... [2024-06-18 16:23:42,066][16727] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,153][16626] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,158][16795] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,191][16622] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,206][16662] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,247][16695] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,279][16677] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,281][16618] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,304][16614] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,306][16615] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,307][16619] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,319][16762] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,319][16620] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,333][16799] Decorrelating experience for 256 frames... 
[2024-06-18 16:23:42,350][16796] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,356][16782] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,366][16628] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,381][16623] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,386][16798] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,412][16797] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,413][16800] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,439][16794] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,462][16621] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,464][16661] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,466][16629] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,468][16624] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,483][16761] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,485][16625] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,498][16627] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,514][16616] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,522][16732] Decorrelating experience for 256 frames... [2024-06-18 16:23:42,572][16617] Decorrelating experience for 256 frames... [2024-06-18 16:23:45,646][16381] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 2763522048. Throughput: 0: 1024.0. Samples: 5120. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 16:23:49,867][16622] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-18 16:23:49,946][16626] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-18 16:23:49,965][16620] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-18 16:23:49,992][16618] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-18 16:23:50,027][16615] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-18 16:23:50,027][16628] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-18 16:23:50,107][16795] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-18 16:23:50,123][16629] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-18 16:23:50,132][16623] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-18 16:23:50,139][16593] Signal inference workers to stop experience collection... [2024-06-18 16:23:50,148][16627] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-18 16:23:50,149][16613] InferenceWorker_p0-w0: stopping experience collection [2024-06-18 16:23:50,154][16662] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-18 16:23:50,646][16381] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 2763522048. Throughput: 0: 31482.1. Samples: 314820. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 16:23:50,646][16381] Avg episode reward: [(0, '0.000')] [2024-06-18 16:23:50,698][16593] Signal inference workers to resume experience collection... 
[2024-06-18 16:23:50,699][16613] InferenceWorker_p0-w0: resuming experience collection [2024-06-18 16:23:50,734][16677] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-18 16:23:51,017][16619] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-18 16:23:51,069][16796] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-18 16:23:51,201][16695] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-18 16:23:51,217][16621] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-18 16:23:51,231][16625] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-18 16:23:51,256][16727] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-18 16:23:51,275][16624] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-18 16:23:51,278][16762] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-18 16:23:51,304][16616] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-18 16:23:51,325][16782] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-18 16:23:51,344][16617] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-18 16:23:51,412][16798] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-18 16:23:51,503][16794] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-18 16:23:51,532][16800] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-18 16:23:51,559][16797] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-18 16:23:51,849][16761] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-18 16:23:51,904][16732] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-18 16:23:51,940][16799] Worker 31, sleep for 145.312 sec to 
decorrelate experience collection [2024-06-18 16:23:52,087][16613] Updated weights for policy 0, policy_version 168682 (0.0012) [2024-06-18 16:23:52,597][16661] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-18 16:23:53,568][16381] Heartbeat connected on Batcher_0 [2024-06-18 16:23:53,569][16381] Heartbeat connected on LearnerWorker_p0 [2024-06-18 16:23:53,575][16381] Heartbeat connected on RolloutWorker_w0 [2024-06-18 16:23:53,616][16381] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-18 16:23:54,738][16615] Worker 1 awakens! [2024-06-18 16:23:54,743][16381] Heartbeat connected on RolloutWorker_w1 [2024-06-18 16:23:55,646][16381] Fps is (10 sec: 16384.0, 60 sec: 10922.7, 300 sec: 10922.7). Total num frames: 2763685888. Throughput: 0: 22005.3. Samples: 330080. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:23:55,646][16381] Avg episode reward: [(0, '0.000')] [2024-06-18 16:24:00,646][16381] Fps is (10 sec: 18022.4, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 2763702272. Throughput: 0: 17134.0. Samples: 342680. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:00,653][16381] Avg episode reward: [(0, '0.000')] [2024-06-18 16:24:00,724][16616] Worker 2 awakens! [2024-06-18 16:24:00,730][16381] Heartbeat connected on RolloutWorker_w2 [2024-06-18 16:24:05,476][16617] Worker 3 awakens! [2024-06-18 16:24:05,481][16381] Heartbeat connected on RolloutWorker_w3 [2024-06-18 16:24:05,646][16381] Fps is (10 sec: 3276.8, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 2763718656. Throughput: 0: 14538.4. Samples: 363460. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:05,647][16381] Avg episode reward: [(0, '0.000')] [2024-06-18 16:24:09,804][16619] Worker 4 awakens! [2024-06-18 16:24:09,810][16381] Heartbeat connected on RolloutWorker_w4 [2024-06-18 16:24:10,646][16381] Fps is (10 sec: 4915.2, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 2763751424. Throughput: 0: 12623.4. 
Samples: 378700. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:10,646][16381] Avg episode reward: [(0, '0.000')] [2024-06-18 16:24:13,506][16618] Worker 5 awakens! [2024-06-18 16:24:13,513][16381] Heartbeat connected on RolloutWorker_w5 [2024-06-18 16:24:15,646][16381] Fps is (10 sec: 9830.5, 60 sec: 8426.1, 300 sec: 8426.1). Total num frames: 2763816960. Throughput: 0: 12682.9. Samples: 443900. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:15,653][16381] Avg episode reward: [(0, '0.000')] [2024-06-18 16:24:17,852][16613] Updated weights for policy 0, policy_version 168692 (0.0013) [2024-06-18 16:24:19,440][16621] Worker 6 awakens! [2024-06-18 16:24:19,445][16381] Heartbeat connected on RolloutWorker_w6 [2024-06-18 16:24:20,646][16381] Fps is (10 sec: 13107.1, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 2763882496. Throughput: 0: 13323.5. Samples: 532940. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:20,646][16381] Avg episode reward: [(0, '0.000')] [2024-06-18 16:24:22,876][16620] Worker 7 awakens! [2024-06-18 16:24:22,884][16381] Heartbeat connected on RolloutWorker_w7 [2024-06-18 16:24:25,646][16381] Fps is (10 sec: 16383.8, 60 sec: 10194.5, 300 sec: 10194.5). Total num frames: 2763980800. Throughput: 0: 13044.9. Samples: 587020. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:25,646][16381] Avg episode reward: [(0, '0.021')] [2024-06-18 16:24:26,729][16613] Updated weights for policy 0, policy_version 168702 (0.0012) [2024-06-18 16:24:27,732][16623] Worker 8 awakens! [2024-06-18 16:24:27,737][16381] Heartbeat connected on RolloutWorker_w8 [2024-06-18 16:24:30,646][16381] Fps is (10 sec: 19660.6, 60 sec: 11141.1, 300 sec: 11141.1). Total num frames: 2764079104. Throughput: 0: 15643.5. Samples: 709080. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:30,653][16381] Avg episode reward: [(0, '0.021')] [2024-06-18 16:24:32,155][16622] Worker 9 awakens! 
[2024-06-18 16:24:32,163][16381] Heartbeat connected on RolloutWorker_w9 [2024-06-18 16:24:33,564][16613] Updated weights for policy 0, policy_version 168712 (0.0017) [2024-06-18 16:24:35,646][16381] Fps is (10 sec: 21299.1, 60 sec: 12213.5, 300 sec: 12213.5). Total num frames: 2764193792. Throughput: 0: 12037.3. Samples: 856500. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:35,647][16381] Avg episode reward: [(0, '0.197')] [2024-06-18 16:24:38,248][16624] Worker 10 awakens! [2024-06-18 16:24:38,255][16381] Heartbeat connected on RolloutWorker_w10 [2024-06-18 16:24:39,866][16613] Updated weights for policy 0, policy_version 168722 (0.0013) [2024-06-18 16:24:40,646][16381] Fps is (10 sec: 29491.1, 60 sec: 14199.4, 300 sec: 14199.4). Total num frames: 2764374016. Throughput: 0: 13573.7. Samples: 940900. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:40,647][16381] Avg episode reward: [(0, '0.244')] [2024-06-18 16:24:42,892][16625] Worker 11 awakens! [2024-06-18 16:24:42,901][16381] Heartbeat connected on RolloutWorker_w11 [2024-06-18 16:24:45,646][16381] Fps is (10 sec: 29491.5, 60 sec: 16110.9, 300 sec: 14871.6). Total num frames: 2764488704. Throughput: 0: 17060.9. Samples: 1110420. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:45,646][16381] Avg episode reward: [(0, '0.368')] [2024-06-18 16:24:45,755][16613] Updated weights for policy 0, policy_version 168732 (0.0015) [2024-06-18 16:24:46,206][16626] Worker 12 awakens! [2024-06-18 16:24:46,212][16381] Heartbeat connected on RolloutWorker_w12 [2024-06-18 16:24:50,388][16613] Updated weights for policy 0, policy_version 168742 (0.0016) [2024-06-18 16:24:50,646][16381] Fps is (10 sec: 29491.2, 60 sec: 19114.6, 300 sec: 16384.0). Total num frames: 2764668928. Throughput: 0: 20828.0. Samples: 1300720. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:50,647][16381] Avg episode reward: [(0, '0.368')] [2024-06-18 16:24:51,106][16627] Worker 13 awakens! 
[2024-06-18 16:24:51,115][16381] Heartbeat connected on RolloutWorker_w13 [2024-06-18 16:24:55,364][16613] Updated weights for policy 0, policy_version 168752 (0.0017) [2024-06-18 16:24:55,646][16381] Fps is (10 sec: 34406.2, 60 sec: 19114.6, 300 sec: 17476.2). Total num frames: 2764832768. Throughput: 0: 22617.2. Samples: 1396480. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:24:55,647][16381] Avg episode reward: [(0, '0.345')] [2024-06-18 16:24:55,752][16628] Worker 14 awakens! [2024-06-18 16:24:55,761][16381] Heartbeat connected on RolloutWorker_w14 [2024-06-18 16:25:00,260][16613] Updated weights for policy 0, policy_version 168762 (0.0019) [2024-06-18 16:25:00,536][16629] Worker 15 awakens! [2024-06-18 16:25:00,544][16381] Heartbeat connected on RolloutWorker_w15 [2024-06-18 16:25:00,646][16381] Fps is (10 sec: 32768.3, 60 sec: 21572.2, 300 sec: 18432.0). Total num frames: 2764996608. Throughput: 0: 25579.1. Samples: 1594960. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:00,646][16381] Avg episode reward: [(0, '0.402')] [2024-06-18 16:25:05,236][16613] Updated weights for policy 0, policy_version 168772 (0.0031) [2024-06-18 16:25:05,254][16662] Worker 16 awakens! [2024-06-18 16:25:05,264][16381] Heartbeat connected on RolloutWorker_w16 [2024-06-18 16:25:05,646][16381] Fps is (10 sec: 34406.4, 60 sec: 24302.9, 300 sec: 19468.0). Total num frames: 2765176832. Throughput: 0: 27895.9. Samples: 1788260. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:05,647][16381] Avg episode reward: [(0, '0.334')] [2024-06-18 16:25:10,299][16613] Updated weights for policy 0, policy_version 168782 (0.0026) [2024-06-18 16:25:10,646][16381] Fps is (10 sec: 34406.1, 60 sec: 26487.4, 300 sec: 20206.9). Total num frames: 2765340672. Throughput: 0: 29197.3. Samples: 1900900. 
Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:10,647][16381] Avg episode reward: [(0, '0.358')] [2024-06-18 16:25:10,992][16695] Worker 17 awakens! [2024-06-18 16:25:11,003][16381] Heartbeat connected on RolloutWorker_w17 [2024-06-18 16:25:15,032][16613] Updated weights for policy 0, policy_version 168792 (0.0031) [2024-06-18 16:25:15,646][16381] Fps is (10 sec: 32768.3, 60 sec: 28125.9, 300 sec: 20868.0). Total num frames: 2765504512. Throughput: 0: 31122.3. Samples: 2109580. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:15,646][16381] Avg episode reward: [(0, '0.205')] [2024-06-18 16:25:17,072][16661] Worker 18 awakens! [2024-06-18 16:25:17,084][16381] Heartbeat connected on RolloutWorker_w18 [2024-06-18 16:25:19,669][16613] Updated weights for policy 0, policy_version 168802 (0.0036) [2024-06-18 16:25:19,896][16677] Worker 19 awakens! [2024-06-18 16:25:19,909][16381] Heartbeat connected on RolloutWorker_w19 [2024-06-18 16:25:20,646][16381] Fps is (10 sec: 34406.6, 60 sec: 30037.3, 300 sec: 21626.9). Total num frames: 2765684736. Throughput: 0: 32548.5. Samples: 2321180. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:20,647][16381] Avg episode reward: [(0, '0.288')] [2024-06-18 16:25:24,597][16613] Updated weights for policy 0, policy_version 168812 (0.0033) [2024-06-18 16:25:25,106][16727] Worker 20 awakens! [2024-06-18 16:25:25,118][16381] Heartbeat connected on RolloutWorker_w20 [2024-06-18 16:25:25,646][16381] Fps is (10 sec: 36044.5, 60 sec: 31402.7, 300 sec: 22313.4). Total num frames: 2765864960. Throughput: 0: 33126.7. Samples: 2431600. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:25,647][16381] Avg episode reward: [(0, '0.583')] [2024-06-18 16:25:28,288][16613] Updated weights for policy 0, policy_version 168822 (0.0022) [2024-06-18 16:25:30,441][16732] Worker 21 awakens! 
[2024-06-18 16:25:30,455][16381] Heartbeat connected on RolloutWorker_w21 [2024-06-18 16:25:30,646][16381] Fps is (10 sec: 37682.8, 60 sec: 33041.0, 300 sec: 23086.5). Total num frames: 2766061568. Throughput: 0: 34267.5. Samples: 2652460. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:30,647][16381] Avg episode reward: [(0, '0.595')] [2024-06-18 16:25:30,662][16593] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168827_2766061568.pth... [2024-06-18 16:25:30,726][16593] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168356_2758344704.pth [2024-06-18 16:25:32,920][16613] Updated weights for policy 0, policy_version 168832 (0.0034) [2024-06-18 16:25:35,072][16761] Worker 22 awakens! [2024-06-18 16:25:35,085][16381] Heartbeat connected on RolloutWorker_w22 [2024-06-18 16:25:35,646][16381] Fps is (10 sec: 36044.6, 60 sec: 33860.3, 300 sec: 23507.5). Total num frames: 2766225408. Throughput: 0: 35151.5. Samples: 2882540. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:35,647][16381] Avg episode reward: [(0, '0.571')] [2024-06-18 16:25:36,827][16613] Updated weights for policy 0, policy_version 168842 (0.0033) [2024-06-18 16:25:39,188][16762] Worker 23 awakens! [2024-06-18 16:25:39,202][16381] Heartbeat connected on RolloutWorker_w23 [2024-06-18 16:25:40,646][16381] Fps is (10 sec: 37683.3, 60 sec: 34406.4, 300 sec: 24302.9). Total num frames: 2766438400. Throughput: 0: 35594.7. Samples: 2998240. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:40,647][16381] Avg episode reward: [(0, '0.480')] [2024-06-18 16:25:41,248][16613] Updated weights for policy 0, policy_version 168852 (0.0031) [2024-06-18 16:25:43,926][16782] Worker 24 awakens! 
[2024-06-18 16:25:43,939][16381] Heartbeat connected on RolloutWorker_w24 [2024-06-18 16:25:45,529][16613] Updated weights for policy 0, policy_version 168862 (0.0026) [2024-06-18 16:25:45,646][16381] Fps is (10 sec: 40960.5, 60 sec: 35771.8, 300 sec: 24903.7). Total num frames: 2766635008. Throughput: 0: 36415.6. Samples: 3233660. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:45,654][16381] Avg episode reward: [(0, '0.408')] [2024-06-18 16:25:48,792][16794] Worker 25 awakens! [2024-06-18 16:25:48,808][16381] Heartbeat connected on RolloutWorker_w25 [2024-06-18 16:25:49,216][16613] Updated weights for policy 0, policy_version 168872 (0.0039) [2024-06-18 16:25:50,646][16381] Fps is (10 sec: 40959.7, 60 sec: 36317.8, 300 sec: 25584.2). Total num frames: 2766848000. Throughput: 0: 37384.8. Samples: 3470580. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:50,647][16381] Avg episode reward: [(0, '0.476')] [2024-06-18 16:25:52,082][16795] Worker 26 awakens! [2024-06-18 16:25:52,096][16381] Heartbeat connected on RolloutWorker_w26 [2024-06-18 16:25:53,568][16613] Updated weights for policy 0, policy_version 168882 (0.0036) [2024-06-18 16:25:55,646][16381] Fps is (10 sec: 39321.0, 60 sec: 36590.9, 300 sec: 25971.6). Total num frames: 2767028224. Throughput: 0: 37675.1. Samples: 3596280. Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:25:55,647][16381] Avg episode reward: [(0, '0.470')] [2024-06-18 16:25:57,132][16613] Updated weights for policy 0, policy_version 168892 (0.0031) [2024-06-18 16:25:58,220][16797] Worker 27 awakens! [2024-06-18 16:25:58,236][16381] Heartbeat connected on RolloutWorker_w27 [2024-06-18 16:26:00,646][16381] Fps is (10 sec: 40960.5, 60 sec: 37683.2, 300 sec: 26682.5). Total num frames: 2767257600. Throughput: 0: 38445.3. Samples: 3839620. 
Policy #0 lag: (min: 0.0, avg: 28.5, max: 80.0) [2024-06-18 16:26:00,654][16381] Avg episode reward: [(0, '0.290')] [2024-06-18 16:26:02,110][16613] Updated weights for policy 0, policy_version 168902 (0.0026) [2024-06-18 16:26:02,416][16796] Worker 28 awakens! [2024-06-18 16:26:02,428][16381] Heartbeat connected on RolloutWorker_w28 [2024-06-18 16:26:05,646][16381] Fps is (10 sec: 39322.1, 60 sec: 37410.2, 300 sec: 26892.4). Total num frames: 2767421440. Throughput: 0: 39240.0. Samples: 4086980. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:05,646][16381] Avg episode reward: [(0, '0.514')] [2024-06-18 16:26:05,958][16613] Updated weights for policy 0, policy_version 168912 (0.0027) [2024-06-18 16:26:07,452][16798] Worker 29 awakens! [2024-06-18 16:26:07,465][16381] Heartbeat connected on RolloutWorker_w29 [2024-06-18 16:26:09,758][16613] Updated weights for policy 0, policy_version 168922 (0.0039) [2024-06-18 16:26:10,646][16381] Fps is (10 sec: 40959.6, 60 sec: 38775.4, 300 sec: 27634.3). Total num frames: 2767667200. Throughput: 0: 39392.8. Samples: 4204280. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:10,647][16381] Avg episode reward: [(0, '0.532')] [2024-06-18 16:26:12,258][16800] Worker 30 awakens! [2024-06-18 16:26:12,273][16381] Heartbeat connected on RolloutWorker_w30 [2024-06-18 16:26:13,703][16613] Updated weights for policy 0, policy_version 168932 (0.0041) [2024-06-18 16:26:15,646][16381] Fps is (10 sec: 44236.3, 60 sec: 39321.5, 300 sec: 28011.3). Total num frames: 2767863808. Throughput: 0: 39982.7. Samples: 4451680. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:15,652][16381] Avg episode reward: [(0, '0.330')] [2024-06-18 16:26:17,352][16799] Worker 31 awakens! 
[2024-06-18 16:26:17,368][16381] Heartbeat connected on RolloutWorker_w31 [2024-06-18 16:26:17,593][16613] Updated weights for policy 0, policy_version 168942 (0.0026) [2024-06-18 16:26:20,646][16381] Fps is (10 sec: 39322.3, 60 sec: 39594.7, 300 sec: 28364.8). Total num frames: 2768060416. Throughput: 0: 40548.1. Samples: 4707200. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:20,646][16381] Avg episode reward: [(0, '0.330')] [2024-06-18 16:26:21,599][16613] Updated weights for policy 0, policy_version 168952 (0.0033) [2024-06-18 16:26:24,811][16593] Signal inference workers to stop experience collection... (50 times) [2024-06-18 16:26:24,867][16593] Signal inference workers to resume experience collection... (50 times) [2024-06-18 16:26:24,869][16613] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-18 16:26:24,884][16613] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-18 16:26:25,532][16613] Updated weights for policy 0, policy_version 168962 (0.0037) [2024-06-18 16:26:25,646][16381] Fps is (10 sec: 40960.7, 60 sec: 40140.9, 300 sec: 28796.1). Total num frames: 2768273408. Throughput: 0: 40632.1. Samples: 4826680. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:25,646][16381] Avg episode reward: [(0, '0.213')] [2024-06-18 16:26:29,428][16613] Updated weights for policy 0, policy_version 168972 (0.0028) [2024-06-18 16:26:30,646][16381] Fps is (10 sec: 42598.2, 60 sec: 40413.9, 300 sec: 29202.1). Total num frames: 2768486400. Throughput: 0: 41173.3. Samples: 5086460. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:30,647][16381] Avg episode reward: [(0, '0.216')] [2024-06-18 16:26:33,183][16613] Updated weights for policy 0, policy_version 168982 (0.0034) [2024-06-18 16:26:35,646][16381] Fps is (10 sec: 42597.9, 60 sec: 41233.1, 300 sec: 29584.8). Total num frames: 2768699392. Throughput: 0: 41361.9. Samples: 5331860. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:35,647][16381] Avg episode reward: [(0, '0.418')] [2024-06-18 16:26:37,370][16613] Updated weights for policy 0, policy_version 168992 (0.0042) [2024-06-18 16:26:40,646][16381] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 29855.3). Total num frames: 2768896000. Throughput: 0: 41512.1. Samples: 5464320. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:40,650][16381] Avg episode reward: [(0, '0.720')] [2024-06-18 16:26:40,993][16613] Updated weights for policy 0, policy_version 169002 (0.0036) [2024-06-18 16:26:45,162][16613] Updated weights for policy 0, policy_version 169012 (0.0037) [2024-06-18 16:26:45,646][16381] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 30199.7). Total num frames: 2769108992. Throughput: 0: 41675.2. Samples: 5715000. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:45,646][16381] Avg episode reward: [(0, '0.765')] [2024-06-18 16:26:48,758][16613] Updated weights for policy 0, policy_version 169022 (0.0036) [2024-06-18 16:26:50,646][16381] Fps is (10 sec: 44236.6, 60 sec: 41506.2, 300 sec: 30612.2). Total num frames: 2769338368. Throughput: 0: 41678.6. Samples: 5962520. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:50,647][16381] Avg episode reward: [(0, '0.765')] [2024-06-18 16:26:52,888][16613] Updated weights for policy 0, policy_version 169032 (0.0043) [2024-06-18 16:26:55,649][16381] Fps is (10 sec: 40949.5, 60 sec: 41504.5, 300 sec: 30751.1). Total num frames: 2769518592. Throughput: 0: 42001.4. Samples: 6094440. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:26:55,649][16381] Avg episode reward: [(0, '0.736')] [2024-06-18 16:26:56,560][16613] Updated weights for policy 0, policy_version 169042 (0.0039) [2024-06-18 16:27:00,447][16613] Updated weights for policy 0, policy_version 169052 (0.0023) [2024-06-18 16:27:00,646][16381] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 31129.6). 
Total num frames: 2769747968. Throughput: 0: 42119.5. Samples: 6347060. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:27:00,647][16381] Avg episode reward: [(0, '0.587')] [2024-06-18 16:27:04,330][16613] Updated weights for policy 0, policy_version 169062 (0.0028) [2024-06-18 16:27:05,646][16381] Fps is (10 sec: 47525.0, 60 sec: 42871.4, 300 sec: 31569.2). Total num frames: 2769993728. Throughput: 0: 41965.7. Samples: 6595660. Policy #0 lag: (min: 0.0, avg: 7.7, max: 19.0) [2024-06-18 16:27:05,647][16381] Avg episode reward: [(0, '0.346')] [2024-06-18 16:27:08,231][16613] Updated weights for policy 0, policy_version 169072 (0.0046) [2024-06-18 16:27:10,646][16381] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 31675.7). Total num frames: 2770173952. Throughput: 0: 42167.9. Samples: 6724240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:10,651][16381] Avg episode reward: [(0, '0.402')] [2024-06-18 16:27:12,171][16613] Updated weights for policy 0, policy_version 169082 (0.0038) [2024-06-18 16:27:15,646][16381] Fps is (10 sec: 37683.5, 60 sec: 41779.3, 300 sec: 31853.5). Total num frames: 2770370560. Throughput: 0: 41995.6. Samples: 6976260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:15,646][16381] Avg episode reward: [(0, '0.542')] [2024-06-18 16:27:15,886][16613] Updated weights for policy 0, policy_version 169092 (0.0026) [2024-06-18 16:27:19,903][16613] Updated weights for policy 0, policy_version 169102 (0.0036) [2024-06-18 16:27:20,646][16381] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 32172.2). Total num frames: 2770599936. Throughput: 0: 42149.9. Samples: 7228600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:20,646][16381] Avg episode reward: [(0, '0.403')] [2024-06-18 16:27:23,628][16613] Updated weights for policy 0, policy_version 169112 (0.0052) [2024-06-18 16:27:25,646][16381] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 32258.3). 
Total num frames: 2770780160. Throughput: 0: 41945.8. Samples: 7351880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:25,647][16381] Avg episode reward: [(0, '0.255')] [2024-06-18 16:27:27,662][16613] Updated weights for policy 0, policy_version 169122 (0.0039) [2024-06-18 16:27:30,646][16381] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 32554.3). Total num frames: 2771009536. Throughput: 0: 41873.7. Samples: 7599320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:30,647][16381] Avg episode reward: [(0, '0.291')] [2024-06-18 16:27:30,797][16593] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000169130_2771025920.pth... [2024-06-18 16:27:30,865][16593] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168672_2763522048.pth [2024-06-18 16:27:32,048][16613] Updated weights for policy 0, policy_version 169132 (0.0035) [2024-06-18 16:27:35,646][16381] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 32698.3). Total num frames: 2771206144. Throughput: 0: 41953.4. Samples: 7850420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:35,647][16381] Avg episode reward: [(0, '0.461')] [2024-06-18 16:27:35,669][16613] Updated weights for policy 0, policy_version 169142 (0.0034) [2024-06-18 16:27:39,799][16613] Updated weights for policy 0, policy_version 169152 (0.0031) [2024-06-18 16:27:40,646][16381] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 32836.2). Total num frames: 2771402752. Throughput: 0: 41814.2. Samples: 7975980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:40,647][16381] Avg episode reward: [(0, '0.534')] [2024-06-18 16:27:43,511][16613] Updated weights for policy 0, policy_version 169162 (0.0028) [2024-06-18 16:27:44,871][16593] Signal inference workers to stop experience collection... (100 times) [2024-06-18 16:27:44,871][16593] Signal inference workers to resume experience collection... 
(100 times) [2024-06-18 16:27:44,894][16613] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-18 16:27:44,895][16613] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-18 16:27:45,646][16381] Fps is (10 sec: 44237.0, 60 sec: 42325.2, 300 sec: 33169.2). Total num frames: 2771648512. Throughput: 0: 41672.9. Samples: 8222340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:45,648][16381] Avg episode reward: [(0, '0.495')] [2024-06-18 16:27:47,526][16613] Updated weights for policy 0, policy_version 169172 (0.0048) [2024-06-18 16:27:50,646][16381] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 33226.7). Total num frames: 2771828736. Throughput: 0: 41727.2. Samples: 8473380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:50,647][16381] Avg episode reward: [(0, '0.653')] [2024-06-18 16:27:51,399][16613] Updated weights for policy 0, policy_version 169182 (0.0037) [2024-06-18 16:27:55,278][16613] Updated weights for policy 0, policy_version 169192 (0.0028) [2024-06-18 16:27:55,646][16381] Fps is (10 sec: 39321.9, 60 sec: 42054.0, 300 sec: 33410.5). Total num frames: 2772041728. Throughput: 0: 41551.6. Samples: 8594060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:27:55,646][16381] Avg episode reward: [(0, '0.591')] [2024-06-18 16:27:59,492][16613] Updated weights for policy 0, policy_version 169202 (0.0028) [2024-06-18 16:28:00,646][16381] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 33650.2). Total num frames: 2772271104. Throughput: 0: 41545.8. Samples: 8845820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:28:00,647][16381] Avg episode reward: [(0, '0.724')] [2024-06-18 16:28:03,027][16613] Updated weights for policy 0, policy_version 169212 (0.0036) [2024-06-18 16:28:05,646][16381] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 33695.4). Total num frames: 2772451328. Throughput: 0: 41583.5. Samples: 9099860. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:28:05,647][16381] Avg episode reward: [(0, '0.694')] [2024-06-18 16:28:07,111][16613] Updated weights for policy 0, policy_version 169222 (0.0046) [2024-06-18 16:28:10,646][16381] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 33920.9). Total num frames: 2772680704. Throughput: 0: 41557.2. Samples: 9221960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 16:28:10,647][16381] Avg episode reward: [(0, '0.652')] [2024-06-18 16:28:10,903][16613] Updated weights for policy 0, policy_version 169232 (0.0043) [2024-06-18 16:28:14,825][16613] Updated weights for policy 0, policy_version 169242 (0.0036) [2024-06-18 16:28:15,646][16381] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 34019.1). Total num frames: 2772877312. Throughput: 0: 41812.1. Samples: 9480860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:28:15,647][16381] Avg episode reward: [(0, '0.536')] [2024-06-18 16:28:18,856][16613] Updated weights for policy 0, policy_version 169252 (0.0043) [2024-06-18 16:28:20,646][16381] Fps is (10 sec: 39321.9, 60 sec: 41233.0, 300 sec: 34113.8). Total num frames: 2773073920. Throughput: 0: 41715.6. Samples: 9727620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:28:20,647][16381] Avg episode reward: [(0, '0.499')] [2024-06-18 16:28:23,169][16613] Updated weights for policy 0, policy_version 169262 (0.0036) [2024-06-18 16:28:25,646][16381] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 34320.2). Total num frames: 2773303296. Throughput: 0: 41721.5. Samples: 9853440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:28:25,646][16381] Avg episode reward: [(0, '0.464')] [2024-06-18 16:28:26,614][16613] Updated weights for policy 0, policy_version 169272 (0.0030) [2024-06-18 16:28:30,646][16381] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 34406.4). Total num frames: 2773499904. Throughput: 0: 41889.8. Samples: 10107380. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:28:30,647][16381] Avg episode reward: [(0, '0.464')] [2024-06-18 16:28:30,837][16613] Updated weights for policy 0, policy_version 169282 (0.0040) [2024-06-18 16:28:34,458][16613] Updated weights for policy 0, policy_version 169292 (0.0041) [2024-06-18 16:28:35,646][16381] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 34489.7). Total num frames: 2773696512. Throughput: 0: 41759.9. Samples: 10352580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:28:35,647][16381] Avg episode reward: [(0, '0.550')] [2024-06-18 16:28:38,722][16613] Updated weights for policy 0, policy_version 169302 (0.0051) [2024-06-18 16:28:40,646][16381] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 35322.8). Total num frames: 2773942272. Throughput: 0: 42017.3. Samples: 10484840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:28:40,646][16381] Avg episode reward: [(0, '0.478')] [2024-06-18 16:28:42,387][16613] Updated weights for policy 0, policy_version 169312 (0.0035) [2024-06-18 16:28:45,646][16381] Fps is (10 sec: 39322.0, 60 sec: 40687.0, 300 sec: 35822.6). Total num frames: 2774089728. Throughput: 0: 41859.6. Samples: 10729500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:28:45,646][16381] Avg episode reward: [(0, '0.493')] [2024-06-18 16:28:46,662][16613] Updated weights for policy 0, policy_version 169322 (0.0027) [2024-06-18 16:28:50,407][16613] Updated weights for policy 0, policy_version 169332 (0.0027) [2024-06-18 16:28:50,646][16381] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 36100.3). Total num frames: 2774335488. Throughput: 0: 41701.3. Samples: 10976420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:28:50,646][16381] Avg episode reward: [(0, '0.509')] [2024-06-18 16:28:54,463][16613] Updated weights for policy 0, policy_version 169342 (0.0039) [2024-06-18 16:28:55,646][16381] Fps is (10 sec: 47513.4, 60 sec: 42052.2, 300 sec: 36822.3). 
Total num frames: 2774564864. Throughput: 0: 41904.6. Samples: 11107660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:28:55,647][16381] Avg episode reward: [(0, '0.589')] [2024-06-18 16:28:58,270][16613] Updated weights for policy 0, policy_version 169352 (0.0043) [2024-06-18 16:29:00,646][16381] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 37377.7). Total num frames: 2774745088. Throughput: 0: 41589.2. Samples: 11352380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:29:00,647][16381] Avg episode reward: [(0, '0.532')] [2024-06-18 16:29:02,277][16613] Updated weights for policy 0, policy_version 169362 (0.0039) [2024-06-18 16:29:05,646][16381] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 38044.2). Total num frames: 2774974464. Throughput: 0: 41575.2. Samples: 11598500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:29:05,646][16381] Avg episode reward: [(0, '0.640')] [2024-06-18 16:29:06,074][16613] Updated weights for policy 0, policy_version 169372 (0.0039) [2024-06-18 16:29:10,538][16613] Updated weights for policy 0, policy_version 169382 (0.0040) [2024-06-18 16:29:10,646][16381] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 38433.0). Total num frames: 2775154688. Throughput: 0: 41564.8. Samples: 11723860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:29:10,647][16381] Avg episode reward: [(0, '0.749')] [2024-06-18 16:29:10,875][16593] Signal inference workers to stop experience collection... (150 times) [2024-06-18 16:29:10,927][16613] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-18 16:29:10,939][16593] Signal inference workers to resume experience collection... 
(150 times) [2024-06-18 16:29:10,940][16613] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-18 16:29:13,878][16613] Updated weights for policy 0, policy_version 169392 (0.0030) [2024-06-18 16:29:15,646][16381] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 38988.3). Total num frames: 2775384064. Throughput: 0: 41456.8. Samples: 11972940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 16:29:15,656][16381] Avg episode reward: [(0, '0.732')] [2024-06-18 16:29:18,270][16613] Updated weights for policy 0, policy_version 169402 (0.0036) [2024-06-18 16:29:20,648][16381] Fps is (10 sec: 42590.8, 60 sec: 41777.9, 300 sec: 39321.4). Total num frames: 2775580672. Throughput: 0: 41505.5. Samples: 12220400. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:29:20,649][16381] Avg episode reward: [(0, '0.677')] [2024-06-18 16:29:22,027][16613] Updated weights for policy 0, policy_version 169412 (0.0037) [2024-06-18 16:29:25,646][16381] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 39654.8). Total num frames: 2775777280. Throughput: 0: 41363.9. Samples: 12346220. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:29:25,647][16381] Avg episode reward: [(0, '0.583')] [2024-06-18 16:29:26,129][16613] Updated weights for policy 0, policy_version 169422 (0.0036) [2024-06-18 16:29:29,842][16613] Updated weights for policy 0, policy_version 169432 (0.0037) [2024-06-18 16:29:30,646][16381] Fps is (10 sec: 42606.5, 60 sec: 41779.2, 300 sec: 40043.6). Total num frames: 2776006656. Throughput: 0: 41500.9. Samples: 12597040. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:29:30,648][16381] Avg episode reward: [(0, '0.682')] [2024-06-18 16:29:30,699][16593] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000169435_2776023040.pth... 
[2024-06-18 16:29:30,745][16593] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000168827_2766061568.pth [2024-06-18 16:29:33,996][16613] Updated weights for policy 0, policy_version 169442 (0.0028) [2024-06-18 16:29:35,646][16381] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 40154.7). Total num frames: 2776219648. Throughput: 0: 41428.0. Samples: 12840680. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:29:35,647][16381] Avg episode reward: [(0, '0.494')] [2024-06-18 16:29:37,762][16613] Updated weights for policy 0, policy_version 169452 (0.0032) [2024-06-18 16:29:40,646][16381] Fps is (10 sec: 39321.0, 60 sec: 40959.9, 300 sec: 40376.8). Total num frames: 2776399872. Throughput: 0: 41335.0. Samples: 12967740. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:29:40,647][16381] Avg episode reward: [(0, '0.653')] [2024-06-18 16:29:42,195][16613] Updated weights for policy 0, policy_version 169462 (0.0035) [2024-06-18 16:29:45,646][16381] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 40487.9). Total num frames: 2776612864. Throughput: 0: 41437.5. Samples: 13217060. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:29:45,646][16381] Avg episode reward: [(0, '0.534')] [2024-06-18 16:29:45,672][16613] Updated weights for policy 0, policy_version 169472 (0.0039) [2024-06-18 16:29:49,998][16613] Updated weights for policy 0, policy_version 169482 (0.0036) [2024-06-18 16:29:50,646][16381] Fps is (10 sec: 42598.9, 60 sec: 41506.1, 300 sec: 40654.5). Total num frames: 2776825856. Throughput: 0: 41658.6. Samples: 13473140. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:29:50,646][16381] Avg episode reward: [(0, '0.546')] [2024-06-18 16:29:53,397][16613] Updated weights for policy 0, policy_version 169492 (0.0043) [2024-06-18 16:29:55,646][16381] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 40821.1). Total num frames: 2777038848. Throughput: 0: 41604.4. Samples: 13596060. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:29:55,647][16381] Avg episode reward: [(0, '0.587')] [2024-06-18 16:29:57,827][16613] Updated weights for policy 0, policy_version 169502 (0.0052) [2024-06-18 16:30:00,646][16381] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 40932.2). Total num frames: 2777251840. Throughput: 0: 41560.8. Samples: 13843180. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:30:00,647][16381] Avg episode reward: [(0, '0.509')] [2024-06-18 16:30:01,088][16613] Updated weights for policy 0, policy_version 169512 (0.0043) [2024-06-18 16:30:05,543][16613] Updated weights for policy 0, policy_version 169522 (0.0024) [2024-06-18 16:30:05,646][16381] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 2777448448. Throughput: 0: 41799.0. Samples: 14101280. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:30:05,647][16381] Avg episode reward: [(0, '0.466')] [2024-06-18 16:30:09,404][16613] Updated weights for policy 0, policy_version 169532 (0.0032) [2024-06-18 16:30:10,646][16381] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 41265.4). Total num frames: 2777677824. Throughput: 0: 41626.6. Samples: 14219420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:30:10,647][16381] Avg episode reward: [(0, '0.637')] [2024-06-18 16:30:13,248][16613] Updated weights for policy 0, policy_version 169542 (0.0039) [2024-06-18 16:30:37,805][18875] Saving configuration to /workspace/metta/train_dir/p2.dr4/config.json... 
[2024-06-18 16:30:37,821][18875] Rollout worker 0 uses device cpu [2024-06-18 16:30:37,822][18875] Rollout worker 1 uses device cpu [2024-06-18 16:30:37,822][18875] Rollout worker 2 uses device cpu [2024-06-18 16:30:37,822][18875] Rollout worker 3 uses device cpu [2024-06-18 16:30:37,822][18875] Rollout worker 4 uses device cpu [2024-06-18 16:30:37,822][18875] Rollout worker 5 uses device cpu [2024-06-18 16:30:37,822][18875] Rollout worker 6 uses device cpu [2024-06-18 16:30:37,822][18875] Rollout worker 7 uses device cpu [2024-06-18 16:30:37,822][18875] Rollout worker 8 uses device cpu [2024-06-18 16:30:37,823][18875] Rollout worker 9 uses device cpu [2024-06-18 16:30:37,823][18875] Rollout worker 10 uses device cpu [2024-06-18 16:30:37,823][18875] Rollout worker 11 uses device cpu [2024-06-18 16:30:37,823][18875] Rollout worker 12 uses device cpu [2024-06-18 16:30:37,823][18875] Rollout worker 13 uses device cpu [2024-06-18 16:30:37,823][18875] Rollout worker 14 uses device cpu [2024-06-18 16:30:37,823][18875] Rollout worker 15 uses device cpu [2024-06-18 16:30:37,823][18875] Rollout worker 16 uses device cpu [2024-06-18 16:30:37,823][18875] Rollout worker 17 uses device cpu [2024-06-18 16:30:37,824][18875] Rollout worker 18 uses device cpu [2024-06-18 16:30:37,824][18875] Rollout worker 19 uses device cpu [2024-06-18 16:30:37,824][18875] Rollout worker 20 uses device cpu [2024-06-18 16:30:37,824][18875] Rollout worker 21 uses device cpu [2024-06-18 16:30:37,824][18875] Rollout worker 22 uses device cpu [2024-06-18 16:30:37,824][18875] Rollout worker 23 uses device cpu [2024-06-18 16:30:37,824][18875] Rollout worker 24 uses device cpu [2024-06-18 16:30:37,824][18875] Rollout worker 25 uses device cpu [2024-06-18 16:30:37,825][18875] Rollout worker 26 uses device cpu [2024-06-18 16:30:37,825][18875] Rollout worker 27 uses device cpu [2024-06-18 16:30:37,825][18875] Rollout worker 28 uses device cpu [2024-06-18 16:30:37,825][18875] Rollout worker 29 uses device cpu 
[2024-06-18 16:30:37,825][18875] Rollout worker 30 uses device cpu [2024-06-18 16:30:37,825][18875] Rollout worker 31 uses device cpu [2024-06-18 16:30:38,405][18875] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:30:38,405][18875] InferenceWorker_p0-w0: min num requests: 10 [2024-06-18 16:30:38,482][18875] Starting all processes... [2024-06-18 16:30:38,482][18875] Starting process learner_proc0 [2024-06-18 16:30:38,713][18875] Starting all processes... [2024-06-18 16:30:38,716][18875] Starting process inference_proc0-0 [2024-06-18 16:30:38,716][18875] Starting process rollout_proc0 [2024-06-18 16:30:38,718][18875] Starting process rollout_proc1 [2024-06-18 16:30:38,718][18875] Starting process rollout_proc2 [2024-06-18 16:30:38,718][18875] Starting process rollout_proc3 [2024-06-18 16:30:38,719][18875] Starting process rollout_proc4 [2024-06-18 16:30:38,719][18875] Starting process rollout_proc5 [2024-06-18 16:30:38,719][18875] Starting process rollout_proc6 [2024-06-18 16:30:38,719][18875] Starting process rollout_proc7 [2024-06-18 16:30:38,720][18875] Starting process rollout_proc8 [2024-06-18 16:30:38,721][18875] Starting process rollout_proc9 [2024-06-18 16:30:38,721][18875] Starting process rollout_proc10 [2024-06-18 16:30:38,721][18875] Starting process rollout_proc11 [2024-06-18 16:30:38,721][18875] Starting process rollout_proc12 [2024-06-18 16:30:38,783][18875] Starting process rollout_proc13 [2024-06-18 16:30:38,783][18875] Starting process rollout_proc14 [2024-06-18 16:30:38,798][18875] Starting process rollout_proc16 [2024-06-18 16:30:38,798][18875] Starting process rollout_proc17 [2024-06-18 16:30:38,798][18875] Starting process rollout_proc18 [2024-06-18 16:30:38,784][18875] Starting process rollout_proc15 [2024-06-18 16:30:38,800][18875] Starting process rollout_proc19 [2024-06-18 16:30:38,800][18875] Starting process rollout_proc20 [2024-06-18 16:30:38,801][18875] Starting process rollout_proc21 [2024-06-18 
16:30:38,802][18875] Starting process rollout_proc22 [2024-06-18 16:30:38,807][18875] Starting process rollout_proc23 [2024-06-18 16:30:38,812][18875] Starting process rollout_proc24 [2024-06-18 16:30:38,813][18875] Starting process rollout_proc25 [2024-06-18 16:30:38,830][18875] Starting process rollout_proc26 [2024-06-18 16:30:38,840][18875] Starting process rollout_proc27 [2024-06-18 16:30:38,840][18875] Starting process rollout_proc28 [2024-06-18 16:30:38,849][18875] Starting process rollout_proc29 [2024-06-18 16:30:38,850][18875] Starting process rollout_proc30 [2024-06-18 16:30:38,857][18875] Starting process rollout_proc31 [2024-06-18 16:30:40,922][19087] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:30:40,922][19087] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-18 16:30:40,931][19087] Num visible devices: 1 [2024-06-18 16:30:40,944][19109] Worker 0 uses CPU cores [0] [2024-06-18 16:30:40,945][19087] Setting fixed seed 0 [2024-06-18 16:30:40,946][19087] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:30:40,952][19087] Initializing actor-critic model on device cuda:0 [2024-06-18 16:30:40,952][19107] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:30:40,953][19107] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-18 16:30:40,962][19107] Num visible devices: 1 [2024-06-18 16:30:41,000][19230] Worker 31 uses CPU cores [31] [2024-06-18 16:30:41,020][19217] Worker 18 uses CPU cores [18] [2024-06-18 16:30:41,051][19212] Worker 11 uses CPU cores [11] [2024-06-18 16:30:41,056][19173] Worker 4 uses CPU cores [4] [2024-06-18 16:30:41,087][19141] Worker 3 uses CPU cores [3] [2024-06-18 16:30:41,111][19210] Worker 9 uses CPU cores [9] [2024-06-18 16:30:41,144][19213] Worker 12 uses CPU cores [12] [2024-06-18 16:30:41,144][19206] Worker 6 uses CPU cores [6] [2024-06-18 16:30:41,155][19229] Worker 
28 uses CPU cores [28] [2024-06-18 16:30:41,172][19225] Worker 21 uses CPU cores [21] [2024-06-18 16:30:41,176][19205] Worker 2 uses CPU cores [2] [2024-06-18 16:30:41,187][19108] Worker 1 uses CPU cores [1] [2024-06-18 16:30:41,211][19216] Worker 17 uses CPU cores [17] [2024-06-18 16:30:41,264][19228] Worker 24 uses CPU cores [24] [2024-06-18 16:30:41,316][19222] Worker 22 uses CPU cores [22] [2024-06-18 16:30:41,334][19207] Worker 7 uses CPU cores [7] [2024-06-18 16:30:41,335][19215] Worker 13 uses CPU cores [13] [2024-06-18 16:30:41,347][19211] Worker 10 uses CPU cores [10] [2024-06-18 16:30:41,361][19209] Worker 5 uses CPU cores [5] [2024-06-18 16:30:41,372][19224] Worker 23 uses CPU cores [23] [2024-06-18 16:30:41,376][19220] Worker 20 uses CPU cores [20] [2024-06-18 16:30:41,382][19221] Worker 19 uses CPU cores [19] [2024-06-18 16:30:41,387][19208] Worker 8 uses CPU cores [8] [2024-06-18 16:30:41,387][19214] Worker 16 uses CPU cores [16] [2024-06-18 16:30:41,411][19227] Worker 27 uses CPU cores [27] [2024-06-18 16:30:41,412][19218] Worker 14 uses CPU cores [14] [2024-06-18 16:30:41,422][19232] Worker 29 uses CPU cores [29] [2024-06-18 16:30:41,462][19226] Worker 26 uses CPU cores [26] [2024-06-18 16:30:41,500][19223] Worker 25 uses CPU cores [25] [2024-06-18 16:30:41,511][19219] Worker 15 uses CPU cores [15] [2024-06-18 16:30:41,515][19231] Worker 30 uses CPU cores [30] [2024-06-18 16:30:41,855][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] 
RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,856][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,859][19087] RunningMeanStd input shape: (1,) [2024-06-18 16:30:41,860][19087] RunningMeanStd input shape: (1,) [2024-06-18 16:30:41,860][19087] RunningMeanStd input shape: (1,) [2024-06-18 16:30:41,860][19087] RunningMeanStd input shape: (1,) [2024-06-18 16:30:41,860][19087] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:41,900][19087] RunningMeanStd input shape: (1,) [2024-06-18 16:30:41,904][19087] Created Actor Critic model with architecture: [2024-06-18 16:30:41,904][19087] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (clock): RunningMeanStdInPlace() (converter): 
RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, 
out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-18 16:30:41,962][19087] Using optimizer [2024-06-18 16:30:42,148][19087] Loading state from checkpoint /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000169435_2776023040.pth... [2024-06-18 16:30:42,161][19087] Loading model from checkpoint [2024-06-18 16:30:42,163][19087] Loaded experiment state at self.train_step=169435, self.env_steps=2776023040 [2024-06-18 16:30:42,163][19087] Initialized policy 0 weights for model version 169435 [2024-06-18 16:30:42,164][19087] LearnerWorker_p0 finished initialization! 
[2024-06-18 16:30:42,164][19087] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,894][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,895][19107] RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,898][19107] RunningMeanStd input shape: (1,) [2024-06-18 16:30:42,898][19107] RunningMeanStd input shape: (1,) [2024-06-18 16:30:42,898][19107] RunningMeanStd input shape: (1,) [2024-06-18 16:30:42,899][19107] RunningMeanStd input shape: (1,) [2024-06-18 16:30:42,899][19107] 
RunningMeanStd input shape: (11, 11) [2024-06-18 16:30:42,937][19107] RunningMeanStd input shape: (1,) [2024-06-18 16:30:42,959][18875] Inference worker 0-0 is ready! [2024-06-18 16:30:42,959][18875] All inference workers are ready! Signal rollout workers to start! [2024-06-18 16:30:45,500][18875] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 2776023040. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 16:30:45,701][19225] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,717][19228] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,726][19213] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,733][19232] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,768][19223] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,771][19222] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,775][19209] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,778][19212] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,778][19206] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,779][19210] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,781][19224] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,801][19231] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,803][19109] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,813][19230] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,817][19207] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,820][19217] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,834][19211] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,835][19208] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,845][19205] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,862][19229] Decorrelating experience for 0 frames... 
[2024-06-18 16:30:45,867][19227] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,876][19226] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,877][19214] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,882][19216] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,887][19141] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,894][19218] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,894][19108] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,897][19220] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,924][19219] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,925][19173] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,926][19215] Decorrelating experience for 0 frames... [2024-06-18 16:30:45,943][19221] Decorrelating experience for 0 frames... [2024-06-18 16:30:46,916][19228] Decorrelating experience for 256 frames... [2024-06-18 16:30:46,938][19224] Decorrelating experience for 256 frames... [2024-06-18 16:30:46,976][19209] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,000][19225] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,002][19210] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,007][19232] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,008][19212] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,009][19206] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,016][19207] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,042][19213] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,058][19223] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,066][19226] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,066][19231] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,079][19220] Decorrelating experience for 256 frames... 
[2024-06-18 16:30:47,105][19219] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,113][19109] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,126][19229] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,153][19173] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,159][19216] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,161][19214] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,166][19208] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,166][19230] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,185][19215] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,210][19222] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,229][19218] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,236][19217] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,238][19221] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,245][19205] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,248][19141] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,256][19211] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,271][19108] Decorrelating experience for 256 frames... [2024-06-18 16:30:47,357][19227] Decorrelating experience for 256 frames... [2024-06-18 16:30:50,500][18875] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 2776023040. Throughput: 0: 996.0. Samples: 4980. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 16:30:54,570][19210] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-18 16:30:54,582][19212] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-18 16:30:54,638][19219] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-18 16:30:54,723][19208] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-18 16:30:54,737][19209] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-18 16:30:54,769][19218] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-18 16:30:54,770][19207] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-18 16:30:54,790][19215] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-18 16:30:54,791][19213] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-18 16:30:54,820][19211] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-18 16:30:54,822][19206] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-18 16:30:54,852][19228] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-18 16:30:54,855][19226] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-18 16:30:54,864][19224] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-18 16:30:54,871][19087] Signal inference workers to stop experience collection... 
[2024-06-18 16:30:54,878][19107] InferenceWorker_p0-w0: stopping experience collection [2024-06-18 16:30:54,886][19216] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-18 16:30:54,891][19108] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-18 16:30:54,892][19220] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-18 16:30:55,426][19087] Signal inference workers to resume experience collection... [2024-06-18 16:30:55,426][19107] InferenceWorker_p0-w0: resuming experience collection [2024-06-18 16:30:55,447][19205] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-18 16:30:55,456][19141] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-18 16:30:55,500][18875] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 2776039424. Throughput: 0: 32082.0. Samples: 320820. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:30:55,673][19223] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-18 16:30:55,991][19173] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-18 16:30:56,061][19225] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-18 16:30:56,062][19232] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-18 16:30:56,211][19229] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-18 16:30:56,230][19214] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-18 16:30:56,236][19231] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-18 16:30:56,277][19217] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-18 16:30:56,330][19221] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-18 16:30:56,505][19222] Worker 22, sleep for 103.125 sec to decorrelate experience 
collection [2024-06-18 16:30:56,778][19230] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-18 16:30:56,906][19227] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-18 16:30:56,923][19107] Updated weights for policy 0, policy_version 169445 (0.0014) [2024-06-18 16:30:58,402][18875] Heartbeat connected on Batcher_0 [2024-06-18 16:30:58,404][18875] Heartbeat connected on LearnerWorker_p0 [2024-06-18 16:30:58,419][18875] Heartbeat connected on RolloutWorker_w0 [2024-06-18 16:30:58,471][18875] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-18 16:30:59,602][19108] Worker 1 awakens! [2024-06-18 16:30:59,620][18875] Heartbeat connected on RolloutWorker_w1 [2024-06-18 16:31:00,500][18875] Fps is (10 sec: 16384.2, 60 sec: 10922.7, 300 sec: 10922.7). Total num frames: 2776186880. Throughput: 0: 22033.5. Samples: 330500. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:00,501][18875] Avg episode reward: [(0, '0.000')] [2024-06-18 16:31:04,868][19205] Worker 2 awakens! [2024-06-18 16:31:04,873][18875] Heartbeat connected on RolloutWorker_w2 [2024-06-18 16:31:05,500][18875] Fps is (10 sec: 16384.1, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 2776203264. Throughput: 0: 17159.1. Samples: 343180. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:05,501][18875] Avg episode reward: [(0, '0.000')] [2024-06-18 16:31:09,588][19141] Worker 3 awakens! [2024-06-18 16:31:09,593][18875] Heartbeat connected on RolloutWorker_w3 [2024-06-18 16:31:10,500][18875] Fps is (10 sec: 3276.8, 60 sec: 7864.4, 300 sec: 7864.4). Total num frames: 2776219648. Throughput: 0: 14616.9. Samples: 365420. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:10,500][18875] Avg episode reward: [(0, '0.000')] [2024-06-18 16:31:14,794][19173] Worker 4 awakens! 
[2024-06-18 16:31:14,806][18875] Heartbeat connected on RolloutWorker_w4 [2024-06-18 16:31:15,500][18875] Fps is (10 sec: 4915.2, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 2776252416. Throughput: 0: 12698.0. Samples: 380940. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:15,501][18875] Avg episode reward: [(0, '0.000')] [2024-06-18 16:31:18,272][19209] Worker 5 awakens! [2024-06-18 16:31:18,280][18875] Heartbeat connected on RolloutWorker_w5 [2024-06-18 16:31:20,500][18875] Fps is (10 sec: 9830.4, 60 sec: 8426.1, 300 sec: 8426.1). Total num frames: 2776317952. Throughput: 0: 12709.2. Samples: 444820. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:20,500][18875] Avg episode reward: [(0, '0.000')] [2024-06-18 16:31:22,354][19107] Updated weights for policy 0, policy_version 169455 (0.0017) [2024-06-18 16:31:23,044][19206] Worker 6 awakens! [2024-06-18 16:31:23,049][18875] Heartbeat connected on RolloutWorker_w6 [2024-06-18 16:31:25,500][18875] Fps is (10 sec: 14745.6, 60 sec: 9420.8, 300 sec: 9420.8). Total num frames: 2776399872. Throughput: 0: 13386.0. Samples: 535440. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:25,501][18875] Avg episode reward: [(0, '0.005')] [2024-06-18 16:31:27,594][19207] Worker 7 awakens! [2024-06-18 16:31:27,600][18875] Heartbeat connected on RolloutWorker_w7 [2024-06-18 16:31:30,500][18875] Fps is (10 sec: 16384.0, 60 sec: 10194.5, 300 sec: 10194.5). Total num frames: 2776481792. Throughput: 0: 13148.0. Samples: 591660. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:30,500][18875] Avg episode reward: [(0, '0.004')] [2024-06-18 16:31:31,105][19107] Updated weights for policy 0, policy_version 169465 (0.0013) [2024-06-18 16:31:32,225][19208] Worker 8 awakens! [2024-06-18 16:31:32,230][18875] Heartbeat connected on RolloutWorker_w8 [2024-06-18 16:31:35,500][18875] Fps is (10 sec: 19660.7, 60 sec: 11468.8, 300 sec: 11468.8). Total num frames: 2776596480. 
Throughput: 0: 15801.4. Samples: 716040. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:35,501][18875] Avg episode reward: [(0, '0.004')] [2024-06-18 16:31:36,856][19210] Worker 9 awakens! [2024-06-18 16:31:36,863][18875] Heartbeat connected on RolloutWorker_w9 [2024-06-18 16:31:38,835][19107] Updated weights for policy 0, policy_version 169475 (0.0012) [2024-06-18 16:31:40,500][18875] Fps is (10 sec: 24576.0, 60 sec: 12809.3, 300 sec: 12809.3). Total num frames: 2776727552. Throughput: 0: 11876.9. Samples: 855280. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:40,501][18875] Avg episode reward: [(0, '0.052')] [2024-06-18 16:31:41,796][19211] Worker 10 awakens! [2024-06-18 16:31:41,802][18875] Heartbeat connected on RolloutWorker_w10 [2024-06-18 16:31:44,571][19107] Updated weights for policy 0, policy_version 169485 (0.0014) [2024-06-18 16:31:45,500][18875] Fps is (10 sec: 26214.2, 60 sec: 13926.4, 300 sec: 13926.4). Total num frames: 2776858624. Throughput: 0: 13526.2. Samples: 939180. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:45,501][18875] Avg episode reward: [(0, '0.079')] [2024-06-18 16:31:46,244][19212] Worker 11 awakens! [2024-06-18 16:31:46,253][18875] Heartbeat connected on RolloutWorker_w11 [2024-06-18 16:31:50,340][19107] Updated weights for policy 0, policy_version 169495 (0.0017) [2024-06-18 16:31:50,500][18875] Fps is (10 sec: 27852.7, 60 sec: 16384.0, 300 sec: 15123.7). Total num frames: 2777006080. Throughput: 0: 17171.1. Samples: 1115880. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:50,501][18875] Avg episode reward: [(0, '0.225')] [2024-06-18 16:31:51,140][19213] Worker 12 awakens! [2024-06-18 16:31:51,147][18875] Heartbeat connected on RolloutWorker_w12 [2024-06-18 16:31:54,901][19107] Updated weights for policy 0, policy_version 169505 (0.0018) [2024-06-18 16:31:55,500][18875] Fps is (10 sec: 32768.0, 60 sec: 19114.6, 300 sec: 16618.0). Total num frames: 2777186304. 
Throughput: 0: 20808.8. Samples: 1301820. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 16:31:55,501][18875] Avg episode reward: [(0, '0.200')] [2024-06-18 16:31:55,824][19215] Worker 13 awakens! [2024-06-18 16:31:55,833][18875] Heartbeat connected on RolloutWorker_w13 [2024-06-18 16:32:00,025][19107] Updated weights for policy 0, policy_version 169515 (0.0017) [2024-06-18 16:32:00,496][19218] Worker 14 awakens! [2024-06-18 16:32:00,500][18875] Fps is (10 sec: 34406.2, 60 sec: 19387.7, 300 sec: 17694.7). Total num frames: 2777350144. Throughput: 0: 22740.0. Samples: 1404240. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:00,501][18875] Avg episode reward: [(0, '0.136')] [2024-06-18 16:32:00,510][18875] Heartbeat connected on RolloutWorker_w14 [2024-06-18 16:32:04,744][19107] Updated weights for policy 0, policy_version 169525 (0.0021) [2024-06-18 16:32:05,048][19219] Worker 15 awakens! [2024-06-18 16:32:05,056][18875] Heartbeat connected on RolloutWorker_w15 [2024-06-18 16:32:05,500][18875] Fps is (10 sec: 32768.1, 60 sec: 21845.3, 300 sec: 18636.8). Total num frames: 2777513984. Throughput: 0: 25813.7. Samples: 1606440. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:05,501][18875] Avg episode reward: [(0, '0.305')] [2024-06-18 16:32:09,641][19107] Updated weights for policy 0, policy_version 169535 (0.0021) [2024-06-18 16:32:10,500][18875] Fps is (10 sec: 32768.0, 60 sec: 24302.9, 300 sec: 19468.0). Total num frames: 2777677824. Throughput: 0: 28152.8. Samples: 1802320. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:10,501][18875] Avg episode reward: [(0, '0.305')] [2024-06-18 16:32:11,328][19214] Worker 16 awakens! [2024-06-18 16:32:11,339][18875] Heartbeat connected on RolloutWorker_w16 [2024-06-18 16:32:14,672][19216] Worker 17 awakens! 
[2024-06-18 16:32:14,684][18875] Heartbeat connected on RolloutWorker_w17 [2024-06-18 16:32:14,855][19107] Updated weights for policy 0, policy_version 169545 (0.0032) [2024-06-18 16:32:15,500][18875] Fps is (10 sec: 32768.0, 60 sec: 26487.4, 300 sec: 20206.9). Total num frames: 2777841664. Throughput: 0: 29195.5. Samples: 1905460. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:15,501][18875] Avg episode reward: [(0, '0.330')] [2024-06-18 16:32:19,374][19107] Updated weights for policy 0, policy_version 169555 (0.0029) [2024-06-18 16:32:20,500][18875] Fps is (10 sec: 34406.5, 60 sec: 28398.9, 300 sec: 21040.5). Total num frames: 2778021888. Throughput: 0: 31082.2. Samples: 2114740. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:20,501][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 16:32:20,756][19217] Worker 18 awakens! [2024-06-18 16:32:20,767][18875] Heartbeat connected on RolloutWorker_w18 [2024-06-18 16:32:24,242][19107] Updated weights for policy 0, policy_version 169565 (0.0029) [2024-06-18 16:32:25,494][19221] Worker 19 awakens! [2024-06-18 16:32:25,500][18875] Fps is (10 sec: 36045.0, 60 sec: 30037.3, 300 sec: 21790.7). Total num frames: 2778202112. Throughput: 0: 32799.1. Samples: 2331240. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:25,501][18875] Avg episode reward: [(0, '0.476')] [2024-06-18 16:32:25,510][18875] Heartbeat connected on RolloutWorker_w19 [2024-06-18 16:32:28,427][19107] Updated weights for policy 0, policy_version 169575 (0.0028) [2024-06-18 16:32:28,742][19220] Worker 20 awakens! [2024-06-18 16:32:28,755][18875] Heartbeat connected on RolloutWorker_w20 [2024-06-18 16:32:30,500][18875] Fps is (10 sec: 34406.5, 60 sec: 31402.6, 300 sec: 22313.5). Total num frames: 2778365952. Throughput: 0: 33310.3. Samples: 2438140. 
Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:30,501][18875] Avg episode reward: [(0, '0.476')] [2024-06-18 16:32:33,208][19107] Updated weights for policy 0, policy_version 169585 (0.0030) [2024-06-18 16:32:34,596][19225] Worker 21 awakens! [2024-06-18 16:32:34,609][18875] Heartbeat connected on RolloutWorker_w21 [2024-06-18 16:32:35,500][18875] Fps is (10 sec: 36044.7, 60 sec: 32768.0, 300 sec: 23086.5). Total num frames: 2778562560. Throughput: 0: 34411.1. Samples: 2664380. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:35,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 16:32:35,511][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000169591_2778578944.pth... [2024-06-18 16:32:35,559][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000169130_2771025920.pth [2024-06-18 16:32:37,290][19107] Updated weights for policy 0, policy_version 169595 (0.0037) [2024-06-18 16:32:39,730][19222] Worker 22 awakens! [2024-06-18 16:32:39,743][18875] Heartbeat connected on RolloutWorker_w22 [2024-06-18 16:32:40,500][18875] Fps is (10 sec: 39321.0, 60 sec: 33860.1, 300 sec: 23792.4). Total num frames: 2778759168. Throughput: 0: 35308.4. Samples: 2890700. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:40,501][18875] Avg episode reward: [(0, '0.476')] [2024-06-18 16:32:41,384][19107] Updated weights for policy 0, policy_version 169605 (0.0030) [2024-06-18 16:32:42,780][19224] Worker 23 awakens! [2024-06-18 16:32:42,793][18875] Heartbeat connected on RolloutWorker_w23 [2024-06-18 16:32:45,500][18875] Fps is (10 sec: 37683.0, 60 sec: 34679.5, 300 sec: 24302.9). Total num frames: 2778939392. Throughput: 0: 35624.4. Samples: 3007340. 
Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:45,501][18875] Avg episode reward: [(0, '0.574')] [2024-06-18 16:32:46,129][19107] Updated weights for policy 0, policy_version 169615 (0.0028) [2024-06-18 16:32:47,456][19228] Worker 24 awakens! [2024-06-18 16:32:47,471][18875] Heartbeat connected on RolloutWorker_w24 [2024-06-18 16:32:49,896][19107] Updated weights for policy 0, policy_version 169625 (0.0030) [2024-06-18 16:32:50,501][18875] Fps is (10 sec: 39317.6, 60 sec: 35771.0, 300 sec: 25034.5). Total num frames: 2779152384. Throughput: 0: 36363.5. Samples: 3242840. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:50,502][18875] Avg episode reward: [(0, '0.423')] [2024-06-18 16:32:52,938][19223] Worker 25 awakens! [2024-06-18 16:32:52,955][18875] Heartbeat connected on RolloutWorker_w25 [2024-06-18 16:32:53,986][19107] Updated weights for policy 0, policy_version 169635 (0.0035) [2024-06-18 16:32:55,500][18875] Fps is (10 sec: 40959.9, 60 sec: 36044.8, 300 sec: 25584.2). Total num frames: 2779348992. Throughput: 0: 37282.6. Samples: 3480040. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:32:55,501][18875] Avg episode reward: [(0, '0.266')] [2024-06-18 16:32:56,830][19226] Worker 26 awakens! [2024-06-18 16:32:56,846][18875] Heartbeat connected on RolloutWorker_w26 [2024-06-18 16:32:58,046][19107] Updated weights for policy 0, policy_version 169645 (0.0026) [2024-06-18 16:33:00,500][18875] Fps is (10 sec: 40964.4, 60 sec: 36864.0, 300 sec: 26214.4). Total num frames: 2779561984. Throughput: 0: 37699.5. Samples: 3601940. Policy #0 lag: (min: 0.0, avg: 5.3, max: 9.0) [2024-06-18 16:33:00,501][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 16:33:02,275][19107] Updated weights for policy 0, policy_version 169655 (0.0036) [2024-06-18 16:33:03,568][19227] Worker 27 awakens! 
[2024-06-18 16:33:03,584][18875] Heartbeat connected on RolloutWorker_w27 [2024-06-18 16:33:05,500][18875] Fps is (10 sec: 40960.6, 60 sec: 37410.2, 300 sec: 26682.5). Total num frames: 2779758592. Throughput: 0: 38500.5. Samples: 3847260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:05,500][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 16:33:06,261][19107] Updated weights for policy 0, policy_version 169665 (0.0037) [2024-06-18 16:33:07,501][19229] Worker 28 awakens! [2024-06-18 16:33:07,517][18875] Heartbeat connected on RolloutWorker_w28 [2024-06-18 16:33:10,373][19107] Updated weights for policy 0, policy_version 169675 (0.0033) [2024-06-18 16:33:10,500][18875] Fps is (10 sec: 39322.2, 60 sec: 37956.3, 300 sec: 27118.4). Total num frames: 2779955200. Throughput: 0: 39076.0. Samples: 4089660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:10,501][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 16:33:12,100][19232] Worker 29 awakens! [2024-06-18 16:33:12,116][18875] Heartbeat connected on RolloutWorker_w29 [2024-06-18 16:33:14,325][19107] Updated weights for policy 0, policy_version 169685 (0.0027) [2024-06-18 16:33:15,500][18875] Fps is (10 sec: 40959.5, 60 sec: 38775.5, 300 sec: 27634.3). Total num frames: 2780168192. Throughput: 0: 39515.1. Samples: 4216320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:15,501][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 16:33:16,951][19231] Worker 30 awakens! [2024-06-18 16:33:16,966][18875] Heartbeat connected on RolloutWorker_w30 [2024-06-18 16:33:18,087][19107] Updated weights for policy 0, policy_version 169695 (0.0032) [2024-06-18 16:33:20,504][18875] Fps is (10 sec: 40944.6, 60 sec: 39046.1, 300 sec: 28010.7). Total num frames: 2780364800. Throughput: 0: 40028.7. Samples: 4465820. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:20,505][18875] Avg episode reward: [(0, '0.475')] [2024-06-18 16:33:21,951][19107] Updated weights for policy 0, policy_version 169705 (0.0037) [2024-06-18 16:33:22,173][19230] Worker 31 awakens! [2024-06-18 16:33:22,186][18875] Heartbeat connected on RolloutWorker_w31 [2024-06-18 16:33:25,500][18875] Fps is (10 sec: 40960.1, 60 sec: 39594.6, 300 sec: 28467.2). Total num frames: 2780577792. Throughput: 0: 40515.7. Samples: 4713900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:25,501][18875] Avg episode reward: [(0, '0.346')] [2024-06-18 16:33:25,747][19107] Updated weights for policy 0, policy_version 169715 (0.0043) [2024-06-18 16:33:27,083][19087] Signal inference workers to stop experience collection... (50 times) [2024-06-18 16:33:27,088][19087] Signal inference workers to resume experience collection... (50 times) [2024-06-18 16:33:27,120][19107] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-18 16:33:27,120][19107] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-18 16:33:29,700][19107] Updated weights for policy 0, policy_version 169725 (0.0039) [2024-06-18 16:33:30,500][18875] Fps is (10 sec: 42614.2, 60 sec: 40413.9, 300 sec: 28895.4). Total num frames: 2780790784. Throughput: 0: 40788.1. Samples: 4842800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:30,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 16:33:33,581][19107] Updated weights for policy 0, policy_version 169735 (0.0027) [2024-06-18 16:33:35,500][18875] Fps is (10 sec: 42598.3, 60 sec: 40686.9, 300 sec: 29298.4). Total num frames: 2781003776. Throughput: 0: 41199.3. Samples: 5096760. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:35,501][18875] Avg episode reward: [(0, '0.332')] [2024-06-18 16:33:37,235][19107] Updated weights for policy 0, policy_version 169745 (0.0025) [2024-06-18 16:33:40,500][18875] Fps is (10 sec: 40959.8, 60 sec: 40687.0, 300 sec: 29584.8). Total num frames: 2781200384. Throughput: 0: 41499.1. Samples: 5347500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:40,501][18875] Avg episode reward: [(0, '0.355')] [2024-06-18 16:33:41,759][19107] Updated weights for policy 0, policy_version 169755 (0.0031) [2024-06-18 16:33:45,145][19107] Updated weights for policy 0, policy_version 169765 (0.0039) [2024-06-18 16:33:45,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 30128.4). Total num frames: 2781446144. Throughput: 0: 41673.8. Samples: 5477260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:45,501][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 16:33:49,539][19107] Updated weights for policy 0, policy_version 169775 (0.0032) [2024-06-18 16:33:50,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40960.8, 300 sec: 30199.7). Total num frames: 2781609984. Throughput: 0: 41753.3. Samples: 5726160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:50,501][18875] Avg episode reward: [(0, '0.303')] [2024-06-18 16:33:52,980][19107] Updated weights for policy 0, policy_version 169785 (0.0035) [2024-06-18 16:33:55,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 30698.5). Total num frames: 2781855744. Throughput: 0: 41812.5. Samples: 5971220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:33:55,500][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 16:33:57,293][19107] Updated weights for policy 0, policy_version 169795 (0.0037) [2024-06-18 16:34:00,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41506.3, 300 sec: 30919.6). Total num frames: 2782052352. Throughput: 0: 41919.7. Samples: 6102700. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:34:00,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 16:34:00,863][19107] Updated weights for policy 0, policy_version 169805 (0.0033) [2024-06-18 16:34:05,197][19107] Updated weights for policy 0, policy_version 169815 (0.0033) [2024-06-18 16:34:05,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 31129.6). Total num frames: 2782248960. Throughput: 0: 41911.5. Samples: 6351680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 18.0) [2024-06-18 16:34:05,501][18875] Avg episode reward: [(0, '0.666')] [2024-06-18 16:34:08,913][19107] Updated weights for policy 0, policy_version 169825 (0.0028) [2024-06-18 16:34:10,504][18875] Fps is (10 sec: 44220.5, 60 sec: 42322.7, 300 sec: 31568.6). Total num frames: 2782494720. Throughput: 0: 41882.4. Samples: 6598760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:10,505][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 16:34:12,967][19107] Updated weights for policy 0, policy_version 169835 (0.0036) [2024-06-18 16:34:15,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 31675.7). Total num frames: 2782674944. Throughput: 0: 41818.6. Samples: 6724640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:15,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 16:34:16,812][19107] Updated weights for policy 0, policy_version 169845 (0.0049) [2024-06-18 16:34:20,500][18875] Fps is (10 sec: 37696.6, 60 sec: 41781.7, 300 sec: 31853.5). Total num frames: 2782871552. Throughput: 0: 41673.8. Samples: 6972080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:20,501][18875] Avg episode reward: [(0, '0.739')] [2024-06-18 16:34:20,940][19107] Updated weights for policy 0, policy_version 169855 (0.0027) [2024-06-18 16:34:24,777][19107] Updated weights for policy 0, policy_version 169865 (0.0036) [2024-06-18 16:34:25,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 32097.7). 
Total num frames: 2783084544. Throughput: 0: 41591.5. Samples: 7219120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:25,506][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 16:34:29,310][19107] Updated weights for policy 0, policy_version 169875 (0.0039) [2024-06-18 16:34:30,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 32258.3). Total num frames: 2783281152. Throughput: 0: 41460.5. Samples: 7342980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:30,501][18875] Avg episode reward: [(0, '0.257')] [2024-06-18 16:34:32,809][19107] Updated weights for policy 0, policy_version 169885 (0.0035) [2024-06-18 16:34:35,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 32483.1). Total num frames: 2783494144. Throughput: 0: 41465.8. Samples: 7592120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:35,501][18875] Avg episode reward: [(0, '0.165')] [2024-06-18 16:34:35,554][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000169892_2783510528.pth... [2024-06-18 16:34:35,603][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000169435_2776023040.pth [2024-06-18 16:34:36,990][19107] Updated weights for policy 0, policy_version 169895 (0.0037) [2024-06-18 16:34:40,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 32698.3). Total num frames: 2783707136. Throughput: 0: 41749.3. Samples: 7849940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:40,501][18875] Avg episode reward: [(0, '0.245')] [2024-06-18 16:34:40,508][19107] Updated weights for policy 0, policy_version 169905 (0.0034) [2024-06-18 16:34:44,995][19107] Updated weights for policy 0, policy_version 169915 (0.0035) [2024-06-18 16:34:45,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40960.1, 300 sec: 32836.3). Total num frames: 2783903744. Throughput: 0: 41500.4. Samples: 7970220. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:45,501][18875] Avg episode reward: [(0, '0.336')] [2024-06-18 16:34:48,422][19107] Updated weights for policy 0, policy_version 169925 (0.0035) [2024-06-18 16:34:50,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 33169.3). Total num frames: 2784149504. Throughput: 0: 41547.1. Samples: 8221300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:50,501][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 16:34:52,784][19107] Updated weights for policy 0, policy_version 169935 (0.0042) [2024-06-18 16:34:55,504][18875] Fps is (10 sec: 44220.2, 60 sec: 41503.5, 300 sec: 33291.8). Total num frames: 2784346112. Throughput: 0: 41586.2. Samples: 8470140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:34:55,505][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 16:34:56,486][19107] Updated weights for policy 0, policy_version 169945 (0.0031) [2024-06-18 16:35:00,500][18875] Fps is (10 sec: 37683.0, 60 sec: 41233.0, 300 sec: 33346.3). Total num frames: 2784526336. Throughput: 0: 41417.8. Samples: 8588440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:35:00,501][18875] Avg episode reward: [(0, '0.678')] [2024-06-18 16:35:00,582][19107] Updated weights for policy 0, policy_version 169955 (0.0035) [2024-06-18 16:35:01,800][19087] Signal inference workers to stop experience collection... (100 times) [2024-06-18 16:35:01,800][19087] Signal inference workers to resume experience collection... (100 times) [2024-06-18 16:35:01,830][19107] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-18 16:35:01,831][19107] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-18 16:35:04,127][19107] Updated weights for policy 0, policy_version 169965 (0.0036) [2024-06-18 16:35:05,500][18875] Fps is (10 sec: 40975.4, 60 sec: 41779.2, 300 sec: 33587.2). Total num frames: 2784755712. Throughput: 0: 41682.3. 
Samples: 8847780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:35:05,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 16:35:08,654][19107] Updated weights for policy 0, policy_version 169975 (0.0034) [2024-06-18 16:35:10,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41235.6, 300 sec: 33757.2). Total num frames: 2784968704. Throughput: 0: 41657.5. Samples: 9093700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-18 16:35:10,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 16:35:11,826][19107] Updated weights for policy 0, policy_version 169985 (0.0041) [2024-06-18 16:35:15,500][18875] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 33920.9). Total num frames: 2785181696. Throughput: 0: 41657.1. Samples: 9217560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:35:15,501][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 16:35:16,251][19107] Updated weights for policy 0, policy_version 169995 (0.0033) [2024-06-18 16:35:19,856][19107] Updated weights for policy 0, policy_version 170005 (0.0047) [2024-06-18 16:35:20,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 34078.7). Total num frames: 2785394688. Throughput: 0: 41759.5. Samples: 9471300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:35:20,508][18875] Avg episode reward: [(0, '0.685')] [2024-06-18 16:35:23,988][19107] Updated weights for policy 0, policy_version 170015 (0.0033) [2024-06-18 16:35:25,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 34113.8). Total num frames: 2785574912. Throughput: 0: 41690.1. Samples: 9726000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:35:25,501][18875] Avg episode reward: [(0, '0.695')] [2024-06-18 16:35:27,650][19107] Updated weights for policy 0, policy_version 170025 (0.0039) [2024-06-18 16:35:30,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 34262.7). Total num frames: 2785787904. Throughput: 0: 41671.8. Samples: 9845460. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:35:30,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 16:35:31,641][19107] Updated weights for policy 0, policy_version 170035 (0.0040) [2024-06-18 16:35:35,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 34406.4). Total num frames: 2786000896. Throughput: 0: 41615.8. Samples: 10094020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:35:35,504][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 16:35:35,663][19107] Updated weights for policy 0, policy_version 170045 (0.0028) [2024-06-18 16:35:39,441][19107] Updated weights for policy 0, policy_version 170055 (0.0028) [2024-06-18 16:35:40,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 34489.7). Total num frames: 2786197504. Throughput: 0: 41647.4. Samples: 10344120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:35:40,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 16:35:43,590][19107] Updated weights for policy 0, policy_version 170065 (0.0036) [2024-06-18 16:35:45,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 35267.3). Total num frames: 2786426880. Throughput: 0: 41679.5. Samples: 10464020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:35:45,501][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 16:35:47,708][19107] Updated weights for policy 0, policy_version 170075 (0.0054) [2024-06-18 16:35:50,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 35878.2). Total num frames: 2786623488. Throughput: 0: 41544.4. Samples: 10717280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:35:50,501][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 16:35:51,327][19107] Updated weights for policy 0, policy_version 170085 (0.0039) [2024-06-18 16:35:55,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40962.5, 300 sec: 35989.3). Total num frames: 2786803712. Throughput: 0: 41564.4. Samples: 10964100. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:35:55,501][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 16:35:56,120][19107] Updated weights for policy 0, policy_version 170095 (0.0030) [2024-06-18 16:35:59,234][19107] Updated weights for policy 0, policy_version 170105 (0.0031) [2024-06-18 16:36:00,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 36766.8). Total num frames: 2787049472. Throughput: 0: 41534.9. Samples: 11086620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:36:00,500][18875] Avg episode reward: [(0, '0.423')] [2024-06-18 16:36:03,976][19107] Updated weights for policy 0, policy_version 170115 (0.0042) [2024-06-18 16:36:05,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41506.1, 300 sec: 37377.7). Total num frames: 2787246080. Throughput: 0: 41565.4. Samples: 11341740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:36:05,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 16:36:07,036][19107] Updated weights for policy 0, policy_version 170125 (0.0036) [2024-06-18 16:36:10,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 37933.1). Total num frames: 2787442688. Throughput: 0: 41317.4. Samples: 11585280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:36:10,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 16:36:11,974][19107] Updated weights for policy 0, policy_version 170135 (0.0052) [2024-06-18 16:36:14,766][19107] Updated weights for policy 0, policy_version 170145 (0.0035) [2024-06-18 16:36:15,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.2, 300 sec: 38488.5). Total num frames: 2787672064. Throughput: 0: 41517.9. Samples: 11713760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 16:36:15,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 16:36:19,820][19107] Updated weights for policy 0, policy_version 170155 (0.0028) [2024-06-18 16:36:20,500][18875] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 38821.7). 
Total num frames: 2787852288. Throughput: 0: 41572.1. Samples: 11964760. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:36:20,502][18875] Avg episode reward: [(0, '0.527')] [2024-06-18 16:36:21,823][19087] Signal inference workers to stop experience collection... (150 times) [2024-06-18 16:36:21,862][19107] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-18 16:36:21,883][19087] Signal inference workers to resume experience collection... (150 times) [2024-06-18 16:36:21,884][19107] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-18 16:36:22,590][19107] Updated weights for policy 0, policy_version 170165 (0.0044) [2024-06-18 16:36:25,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 39321.6). Total num frames: 2788081664. Throughput: 0: 41421.7. Samples: 12208100. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:36:25,501][18875] Avg episode reward: [(0, '0.590')] [2024-06-18 16:36:27,715][19107] Updated weights for policy 0, policy_version 170175 (0.0024) [2024-06-18 16:36:30,397][19107] Updated weights for policy 0, policy_version 170185 (0.0038) [2024-06-18 16:36:30,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42052.4, 300 sec: 39710.4). Total num frames: 2788311040. Throughput: 0: 41639.7. Samples: 12337800. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:36:30,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 16:36:35,500][18875] Fps is (10 sec: 37683.7, 60 sec: 40960.1, 300 sec: 39765.9). Total num frames: 2788458496. Throughput: 0: 41517.4. Samples: 12585560. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:36:35,501][18875] Avg episode reward: [(0, '0.368')] [2024-06-18 16:36:35,565][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000170195_2788474880.pth... 
[2024-06-18 16:36:35,581][19107] Updated weights for policy 0, policy_version 170195 (0.0042) [2024-06-18 16:36:35,631][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000169591_2778578944.pth [2024-06-18 16:36:38,105][19107] Updated weights for policy 0, policy_version 170205 (0.0058) [2024-06-18 16:36:40,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 40210.2). Total num frames: 2788720640. Throughput: 0: 41262.1. Samples: 12820900. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:36:40,501][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 16:36:43,559][19107] Updated weights for policy 0, policy_version 170215 (0.0024) [2024-06-18 16:36:45,500][18875] Fps is (10 sec: 45875.3, 60 sec: 41506.2, 300 sec: 40376.8). Total num frames: 2788917248. Throughput: 0: 41717.7. Samples: 12963920. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:36:45,501][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 16:36:45,891][19107] Updated weights for policy 0, policy_version 170225 (0.0028) [2024-06-18 16:36:50,500][18875] Fps is (10 sec: 34406.4, 60 sec: 40686.9, 300 sec: 40265.8). Total num frames: 2789064704. Throughput: 0: 41368.4. Samples: 13203320. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:36:50,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 16:36:51,583][19107] Updated weights for policy 0, policy_version 170235 (0.0033) [2024-06-18 16:36:54,257][19107] Updated weights for policy 0, policy_version 170245 (0.0035) [2024-06-18 16:36:55,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 40654.5). Total num frames: 2789343232. Throughput: 0: 41222.2. Samples: 13440280. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:36:55,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 16:36:59,391][19107] Updated weights for policy 0, policy_version 170255 (0.0035) [2024-06-18 16:37:00,500][18875] Fps is (10 sec: 44236.8, 60 sec: 40959.9, 300 sec: 40654.5). Total num frames: 2789507072. Throughput: 0: 41394.2. Samples: 13576500. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:37:00,501][18875] Avg episode reward: [(0, '0.346')] [2024-06-18 16:37:01,942][19107] Updated weights for policy 0, policy_version 170265 (0.0031) [2024-06-18 16:37:05,500][18875] Fps is (10 sec: 36045.2, 60 sec: 40960.0, 300 sec: 40765.6). Total num frames: 2789703680. Throughput: 0: 41048.5. Samples: 13811940. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:37:05,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 16:37:07,488][19107] Updated weights for policy 0, policy_version 170275 (0.0035) [2024-06-18 16:37:10,173][19107] Updated weights for policy 0, policy_version 170285 (0.0037) [2024-06-18 16:37:10,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 41043.3). Total num frames: 2789949440. Throughput: 0: 41195.1. Samples: 14061880. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:37:10,501][18875] Avg episode reward: [(0, '0.762')] [2024-06-18 16:37:15,416][19107] Updated weights for policy 0, policy_version 170295 (0.0033) [2024-06-18 16:37:15,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 2790113280. Throughput: 0: 41172.8. Samples: 14190580. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:37:15,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 16:37:17,896][19107] Updated weights for policy 0, policy_version 170305 (0.0044) [2024-06-18 16:37:20,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 2790359040. Throughput: 0: 41060.4. Samples: 14433280. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-18 16:37:20,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 16:37:23,267][19107] Updated weights for policy 0, policy_version 170315 (0.0025) [2024-06-18 16:37:25,500][18875] Fps is (10 sec: 45874.8, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 2790572032. Throughput: 0: 41383.9. Samples: 14683180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:37:25,501][18875] Avg episode reward: [(0, '0.627')] [2024-06-18 16:37:25,928][19107] Updated weights for policy 0, policy_version 170325 (0.0034) [2024-06-18 16:37:30,500][18875] Fps is (10 sec: 34406.7, 60 sec: 39867.8, 300 sec: 41154.4). Total num frames: 2790703104. Throughput: 0: 40849.8. Samples: 14802160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:37:30,500][18875] Avg episode reward: [(0, '0.759')] [2024-06-18 16:37:31,394][19107] Updated weights for policy 0, policy_version 170335 (0.0037) [2024-06-18 16:37:33,822][19107] Updated weights for policy 0, policy_version 170345 (0.0028) [2024-06-18 16:37:35,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 41487.7). Total num frames: 2790998016. Throughput: 0: 40992.1. Samples: 15047960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:37:35,500][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 16:37:39,297][19107] Updated weights for policy 0, policy_version 170355 (0.0029) [2024-06-18 16:37:40,500][18875] Fps is (10 sec: 45874.9, 60 sec: 40687.0, 300 sec: 41432.1). Total num frames: 2791161856. Throughput: 0: 41621.9. Samples: 15313260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:37:40,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 16:37:41,632][19107] Updated weights for policy 0, policy_version 170365 (0.0042) [2024-06-18 16:37:45,500][18875] Fps is (10 sec: 36044.6, 60 sec: 40686.9, 300 sec: 41376.7). Total num frames: 2791358464. Throughput: 0: 41112.1. Samples: 15426540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:37:45,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 16:37:46,263][19087] Signal inference workers to stop experience collection... (200 times) [2024-06-18 16:37:46,300][19107] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-18 16:37:46,378][19087] Signal inference workers to resume experience collection... (200 times) [2024-06-18 16:37:46,378][19107] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-18 16:37:47,088][19107] Updated weights for policy 0, policy_version 170375 (0.0045) [2024-06-18 16:37:49,511][19107] Updated weights for policy 0, policy_version 170385 (0.0040) [2024-06-18 16:37:50,500][18875] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 41654.2). Total num frames: 2791636992. Throughput: 0: 41527.9. Samples: 15680700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:37:50,501][18875] Avg episode reward: [(0, '0.488')] [2024-06-18 16:37:54,891][19107] Updated weights for policy 0, policy_version 170395 (0.0029) [2024-06-18 16:37:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40413.9, 300 sec: 41376.5). Total num frames: 2791768064. Throughput: 0: 41607.1. Samples: 15934200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:37:55,501][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 16:37:57,537][19107] Updated weights for policy 0, policy_version 170405 (0.0035) [2024-06-18 16:38:00,500][18875] Fps is (10 sec: 37682.9, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 2792013824. Throughput: 0: 41156.8. Samples: 16042640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:38:00,501][18875] Avg episode reward: [(0, '0.762')] [2024-06-18 16:38:03,067][19107] Updated weights for policy 0, policy_version 170415 (0.0032) [2024-06-18 16:38:05,500][18875] Fps is (10 sec: 45875.5, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 2792226816. Throughput: 0: 41524.9. 
Samples: 16301900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:38:05,501][18875] Avg episode reward: [(0, '0.290')] [2024-06-18 16:38:05,505][19107] Updated weights for policy 0, policy_version 170425 (0.0031) [2024-06-18 16:38:10,500][18875] Fps is (10 sec: 36045.8, 60 sec: 40414.0, 300 sec: 41376.6). Total num frames: 2792374272. Throughput: 0: 41602.5. Samples: 16555280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:38:10,500][18875] Avg episode reward: [(0, '0.645')] [2024-06-18 16:38:10,813][19107] Updated weights for policy 0, policy_version 170435 (0.0033) [2024-06-18 16:38:13,326][19107] Updated weights for policy 0, policy_version 170445 (0.0042) [2024-06-18 16:38:15,501][18875] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41599.2). Total num frames: 2792636416. Throughput: 0: 41476.6. Samples: 16668620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:38:15,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 16:38:18,791][19107] Updated weights for policy 0, policy_version 170455 (0.0034) [2024-06-18 16:38:20,500][18875] Fps is (10 sec: 45875.2, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 2792833024. Throughput: 0: 41773.8. Samples: 16927780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:38:20,500][18875] Avg episode reward: [(0, '0.744')] [2024-06-18 16:38:21,077][19107] Updated weights for policy 0, policy_version 170465 (0.0038) [2024-06-18 16:38:25,500][18875] Fps is (10 sec: 36045.9, 60 sec: 40414.0, 300 sec: 41376.6). Total num frames: 2792996864. Throughput: 0: 41498.7. Samples: 17180700. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 20.0) [2024-06-18 16:38:25,500][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 16:38:26,702][19107] Updated weights for policy 0, policy_version 170475 (0.0031) [2024-06-18 16:38:29,102][19107] Updated weights for policy 0, policy_version 170485 (0.0024) [2024-06-18 16:38:30,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 41598.7). Total num frames: 2793275392. Throughput: 0: 41605.8. Samples: 17298800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:38:30,501][18875] Avg episode reward: [(0, '0.721')] [2024-06-18 16:38:34,386][19107] Updated weights for policy 0, policy_version 170495 (0.0038) [2024-06-18 16:38:35,500][18875] Fps is (10 sec: 44236.3, 60 sec: 40686.9, 300 sec: 41487.6). Total num frames: 2793439232. Throughput: 0: 41656.5. Samples: 17555240. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:38:35,501][18875] Avg episode reward: [(0, '0.787')] [2024-06-18 16:38:35,586][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000170499_2793455616.pth... [2024-06-18 16:38:35,658][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000169892_2783510528.pth [2024-06-18 16:38:37,161][19107] Updated weights for policy 0, policy_version 170505 (0.0035) [2024-06-18 16:38:40,504][18875] Fps is (10 sec: 37669.5, 60 sec: 41503.6, 300 sec: 41376.0). Total num frames: 2793652224. Throughput: 0: 41323.4. Samples: 17793900. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:38:40,505][18875] Avg episode reward: [(0, '0.645')] [2024-06-18 16:38:42,259][19107] Updated weights for policy 0, policy_version 170515 (0.0048) [2024-06-18 16:38:45,099][19107] Updated weights for policy 0, policy_version 170525 (0.0042) [2024-06-18 16:38:45,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 2793881600. Throughput: 0: 41665.6. Samples: 17917580. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:38:45,500][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 16:38:45,699][19087] Signal inference workers to stop experience collection... (250 times) [2024-06-18 16:38:45,751][19107] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-18 16:38:45,817][19087] Signal inference workers to resume experience collection... (250 times) [2024-06-18 16:38:45,818][19107] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-18 16:38:50,009][19107] Updated weights for policy 0, policy_version 170535 (0.0037) [2024-06-18 16:38:50,500][18875] Fps is (10 sec: 40974.5, 60 sec: 40413.9, 300 sec: 41376.5). Total num frames: 2794061824. Throughput: 0: 41531.0. Samples: 18170800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:38:50,501][18875] Avg episode reward: [(0, '0.518')] [2024-06-18 16:38:53,087][19107] Updated weights for policy 0, policy_version 170545 (0.0030) [2024-06-18 16:38:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 2794291200. Throughput: 0: 41184.8. Samples: 18408600. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:38:55,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 16:38:58,203][19107] Updated weights for policy 0, policy_version 170555 (0.0043) [2024-06-18 16:39:00,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41233.2, 300 sec: 41487.6). Total num frames: 2794487808. Throughput: 0: 41684.2. Samples: 18544400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:39:00,501][18875] Avg episode reward: [(0, '0.642')] [2024-06-18 16:39:01,198][19107] Updated weights for policy 0, policy_version 170565 (0.0054) [2024-06-18 16:39:05,500][18875] Fps is (10 sec: 37682.4, 60 sec: 40686.8, 300 sec: 41265.9). Total num frames: 2794668032. Throughput: 0: 41380.6. Samples: 18789920. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:39:05,501][18875] Avg episode reward: [(0, '0.479')] [2024-06-18 16:39:05,960][19107] Updated weights for policy 0, policy_version 170575 (0.0034) [2024-06-18 16:39:09,189][19107] Updated weights for policy 0, policy_version 170585 (0.0034) [2024-06-18 16:39:10,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 41543.2). Total num frames: 2794930176. Throughput: 0: 41166.6. Samples: 19033200. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:39:10,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 16:39:13,569][19107] Updated weights for policy 0, policy_version 170595 (0.0031) [2024-06-18 16:39:15,501][18875] Fps is (10 sec: 45875.1, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 2795126784. Throughput: 0: 41624.7. Samples: 19171920. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:39:15,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 16:39:17,001][19107] Updated weights for policy 0, policy_version 170605 (0.0049) [2024-06-18 16:39:20,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 2795323392. Throughput: 0: 41361.0. Samples: 19416480. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:39:20,500][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 16:39:21,495][19107] Updated weights for policy 0, policy_version 170615 (0.0038) [2024-06-18 16:39:24,731][19107] Updated weights for policy 0, policy_version 170625 (0.0032) [2024-06-18 16:39:25,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42598.3, 300 sec: 41598.7). Total num frames: 2795552768. Throughput: 0: 41662.0. Samples: 19668540. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:39:25,501][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 16:39:29,208][19107] Updated weights for policy 0, policy_version 170635 (0.0038) [2024-06-18 16:39:30,500][18875] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 41487.6). 
Total num frames: 2795732992. Throughput: 0: 41738.1. Samples: 19795800. Policy #0 lag: (min: 0.0, avg: 12.6, max: 22.0) [2024-06-18 16:39:30,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 16:39:33,030][19107] Updated weights for policy 0, policy_version 170645 (0.0044) [2024-06-18 16:39:35,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 2795945984. Throughput: 0: 41452.1. Samples: 20036140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:39:35,501][18875] Avg episode reward: [(0, '0.260')] [2024-06-18 16:39:36,886][19107] Updated weights for policy 0, policy_version 170655 (0.0038) [2024-06-18 16:39:40,504][18875] Fps is (10 sec: 40945.1, 60 sec: 41506.1, 300 sec: 41487.1). Total num frames: 2796142592. Throughput: 0: 41757.5. Samples: 20287840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:39:40,505][18875] Avg episode reward: [(0, '0.441')] [2024-06-18 16:39:41,089][19107] Updated weights for policy 0, policy_version 170665 (0.0051) [2024-06-18 16:39:44,631][19107] Updated weights for policy 0, policy_version 170675 (0.0031) [2024-06-18 16:39:45,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 2796371968. Throughput: 0: 41549.8. Samples: 20414140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:39:45,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 16:39:48,815][19107] Updated weights for policy 0, policy_version 170685 (0.0039) [2024-06-18 16:39:50,500][18875] Fps is (10 sec: 44253.4, 60 sec: 42052.4, 300 sec: 41488.2). Total num frames: 2796584960. Throughput: 0: 41665.6. Samples: 20664860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:39:50,500][18875] Avg episode reward: [(0, '0.357')] [2024-06-18 16:39:52,438][19107] Updated weights for policy 0, policy_version 170695 (0.0047) [2024-06-18 16:39:55,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41543.2). 
Total num frames: 2796781568. Throughput: 0: 41710.7. Samples: 20910180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:39:55,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 16:39:56,837][19107] Updated weights for policy 0, policy_version 170705 (0.0044) [2024-06-18 16:40:00,266][19107] Updated weights for policy 0, policy_version 170715 (0.0035) [2024-06-18 16:40:00,501][18875] Fps is (10 sec: 40958.8, 60 sec: 41779.0, 300 sec: 41487.6). Total num frames: 2796994560. Throughput: 0: 41307.1. Samples: 21030740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:40:00,501][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 16:40:04,637][19107] Updated weights for policy 0, policy_version 170725 (0.0043) [2024-06-18 16:40:05,500][18875] Fps is (10 sec: 37682.9, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 2797158400. Throughput: 0: 41261.2. Samples: 21273240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:40:05,501][18875] Avg episode reward: [(0, '0.642')] [2024-06-18 16:40:06,506][19087] Signal inference workers to stop experience collection... (300 times) [2024-06-18 16:40:06,560][19107] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-18 16:40:06,567][19087] Signal inference workers to resume experience collection... (300 times) [2024-06-18 16:40:06,582][19107] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-18 16:40:08,044][19107] Updated weights for policy 0, policy_version 170735 (0.0036) [2024-06-18 16:40:10,500][18875] Fps is (10 sec: 39322.6, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 2797387776. Throughput: 0: 41157.9. Samples: 21520640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:40:10,501][18875] Avg episode reward: [(0, '0.678')] [2024-06-18 16:40:12,473][19107] Updated weights for policy 0, policy_version 170745 (0.0043) [2024-06-18 16:40:15,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41233.3, 300 sec: 41376.6). Total num frames: 2797600768. Throughput: 0: 41165.0. Samples: 21648220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:40:15,500][18875] Avg episode reward: [(0, '0.829')] [2024-06-18 16:40:15,859][19107] Updated weights for policy 0, policy_version 170755 (0.0039) [2024-06-18 16:40:20,500][18875] Fps is (10 sec: 39321.2, 60 sec: 40959.9, 300 sec: 41376.5). Total num frames: 2797780992. Throughput: 0: 41223.5. Samples: 21891200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:40:20,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 16:40:21,300][19107] Updated weights for policy 0, policy_version 170765 (0.0044) [2024-06-18 16:40:24,243][19107] Updated weights for policy 0, policy_version 170775 (0.0026) [2024-06-18 16:40:25,500][18875] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 41376.6). Total num frames: 2797993984. Throughput: 0: 40929.6. Samples: 22129520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:40:25,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 16:40:29,140][19107] Updated weights for policy 0, policy_version 170785 (0.0043) [2024-06-18 16:40:30,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 2798190592. Throughput: 0: 40806.6. Samples: 22250440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:40:30,501][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 16:40:32,151][19107] Updated weights for policy 0, policy_version 170795 (0.0045) [2024-06-18 16:40:35,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 2798419968. Throughput: 0: 40851.9. Samples: 22503200. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:40:35,501][18875] Avg episode reward: [(0, '0.724')] [2024-06-18 16:40:35,511][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000170802_2798419968.pth... [2024-06-18 16:40:35,562][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000170195_2788474880.pth [2024-06-18 16:40:37,017][19107] Updated weights for policy 0, policy_version 170805 (0.0028) [2024-06-18 16:40:40,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41235.6, 300 sec: 41321.0). Total num frames: 2798616576. Throughput: 0: 40825.4. Samples: 22747320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:40:40,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 16:40:40,601][19107] Updated weights for policy 0, policy_version 170815 (0.0048) [2024-06-18 16:40:44,842][19107] Updated weights for policy 0, policy_version 170825 (0.0035) [2024-06-18 16:40:45,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 2798813184. Throughput: 0: 40970.4. Samples: 22874400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:40:45,501][18875] Avg episode reward: [(0, '0.376')] [2024-06-18 16:40:48,482][19107] Updated weights for policy 0, policy_version 170835 (0.0035) [2024-06-18 16:40:50,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 41432.1). Total num frames: 2799026176. Throughput: 0: 40964.9. Samples: 23116660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:40:50,501][18875] Avg episode reward: [(0, '0.239')] [2024-06-18 16:40:52,989][19107] Updated weights for policy 0, policy_version 170845 (0.0035) [2024-06-18 16:40:55,500][18875] Fps is (10 sec: 42597.9, 60 sec: 40959.9, 300 sec: 41321.0). Total num frames: 2799239168. Throughput: 0: 41101.2. Samples: 23370200. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:40:55,501][18875] Avg episode reward: [(0, '0.239')] [2024-06-18 16:40:56,598][19107] Updated weights for policy 0, policy_version 170855 (0.0028) [2024-06-18 16:41:00,500][18875] Fps is (10 sec: 39321.1, 60 sec: 40413.9, 300 sec: 41265.4). Total num frames: 2799419392. Throughput: 0: 40918.0. Samples: 23489540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:41:00,501][18875] Avg episode reward: [(0, '0.343')] [2024-06-18 16:41:01,006][19107] Updated weights for policy 0, policy_version 170865 (0.0050) [2024-06-18 16:41:04,562][19107] Updated weights for policy 0, policy_version 170875 (0.0036) [2024-06-18 16:41:05,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 2799665152. Throughput: 0: 41080.9. Samples: 23739840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:41:05,501][18875] Avg episode reward: [(0, '0.343')] [2024-06-18 16:41:08,887][19107] Updated weights for policy 0, policy_version 170885 (0.0040) [2024-06-18 16:41:10,500][18875] Fps is (10 sec: 42598.5, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 2799845376. Throughput: 0: 41238.5. Samples: 23985260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:41:10,501][18875] Avg episode reward: [(0, '0.361')] [2024-06-18 16:41:12,548][19107] Updated weights for policy 0, policy_version 170895 (0.0031) [2024-06-18 16:41:15,504][18875] Fps is (10 sec: 39307.5, 60 sec: 40957.5, 300 sec: 41376.0). Total num frames: 2800058368. Throughput: 0: 41276.3. Samples: 24108020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:41:15,505][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 16:41:16,887][19107] Updated weights for policy 0, policy_version 170905 (0.0040) [2024-06-18 16:41:20,504][18875] Fps is (10 sec: 40945.6, 60 sec: 41230.6, 300 sec: 41265.0). Total num frames: 2800254976. Throughput: 0: 41036.7. Samples: 24350000. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:41:20,505][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 16:41:20,708][19107] Updated weights for policy 0, policy_version 170915 (0.0038) [2024-06-18 16:41:24,710][19107] Updated weights for policy 0, policy_version 170925 (0.0034) [2024-06-18 16:41:25,504][18875] Fps is (10 sec: 40960.0, 60 sec: 41230.5, 300 sec: 41209.4). Total num frames: 2800467968. Throughput: 0: 41207.3. Samples: 24601800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:41:25,505][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 16:41:28,627][19107] Updated weights for policy 0, policy_version 170935 (0.0038) [2024-06-18 16:41:30,500][18875] Fps is (10 sec: 44252.7, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 2800697344. Throughput: 0: 41178.6. Samples: 24727440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:41:30,504][18875] Avg episode reward: [(0, '0.224')] [2024-06-18 16:41:32,472][19107] Updated weights for policy 0, policy_version 170945 (0.0038) [2024-06-18 16:41:35,500][18875] Fps is (10 sec: 39336.2, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 2800861184. Throughput: 0: 41248.5. Samples: 24972840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:41:35,500][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 16:41:36,483][19107] Updated weights for policy 0, policy_version 170955 (0.0038) [2024-06-18 16:41:40,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 2801074176. Throughput: 0: 41036.1. Samples: 25216820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-18 16:41:40,501][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 16:41:40,587][19107] Updated weights for policy 0, policy_version 170965 (0.0046) [2024-06-18 16:41:44,407][19107] Updated weights for policy 0, policy_version 170975 (0.0035) [2024-06-18 16:41:45,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 41487.6). 
Total num frames: 2801303552. Throughput: 0: 41062.8. Samples: 25337360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:41:45,501][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 16:41:48,518][19107] Updated weights for policy 0, policy_version 170985 (0.0037) [2024-06-18 16:41:50,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 2801500160. Throughput: 0: 41019.1. Samples: 25585700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:41:50,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 16:41:52,600][19107] Updated weights for policy 0, policy_version 170995 (0.0042) [2024-06-18 16:41:55,500][18875] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 2801696768. Throughput: 0: 41049.8. Samples: 25832500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:41:55,509][18875] Avg episode reward: [(0, '0.733')] [2024-06-18 16:41:56,313][19087] Signal inference workers to stop experience collection... (350 times) [2024-06-18 16:41:56,313][19087] Signal inference workers to resume experience collection... (350 times) [2024-06-18 16:41:56,323][19107] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-18 16:41:56,323][19107] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-18 16:41:56,466][19107] Updated weights for policy 0, policy_version 171005 (0.0029) [2024-06-18 16:42:00,385][19107] Updated weights for policy 0, policy_version 171015 (0.0032) [2024-06-18 16:42:00,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 2801909760. Throughput: 0: 41055.7. Samples: 25955380. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:00,509][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 16:42:04,442][19107] Updated weights for policy 0, policy_version 171025 (0.0039) [2024-06-18 16:42:05,500][18875] Fps is (10 sec: 40960.7, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 2802106368. Throughput: 0: 41204.2. Samples: 26204040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:05,501][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 16:42:08,351][19107] Updated weights for policy 0, policy_version 171035 (0.0041) [2024-06-18 16:42:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 2802319360. Throughput: 0: 41047.7. Samples: 26448800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:10,501][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 16:42:12,274][19107] Updated weights for policy 0, policy_version 171045 (0.0035) [2024-06-18 16:42:15,500][18875] Fps is (10 sec: 39321.8, 60 sec: 40689.4, 300 sec: 41154.4). Total num frames: 2802499584. Throughput: 0: 41003.2. Samples: 26572580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:15,500][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 16:42:16,296][19107] Updated weights for policy 0, policy_version 171055 (0.0046) [2024-06-18 16:42:20,007][19107] Updated weights for policy 0, policy_version 171065 (0.0029) [2024-06-18 16:42:20,501][18875] Fps is (10 sec: 40957.3, 60 sec: 41235.0, 300 sec: 41209.8). Total num frames: 2802728960. Throughput: 0: 41024.1. Samples: 26818960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:20,502][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 16:42:24,513][19107] Updated weights for policy 0, policy_version 171075 (0.0046) [2024-06-18 16:42:25,500][18875] Fps is (10 sec: 44236.1, 60 sec: 41235.5, 300 sec: 41487.6). Total num frames: 2802941952. Throughput: 0: 41037.7. Samples: 27063520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:25,501][18875] Avg episode reward: [(0, '0.621')] [2024-06-18 16:42:28,279][19107] Updated weights for policy 0, policy_version 171085 (0.0045) [2024-06-18 16:42:30,500][18875] Fps is (10 sec: 37686.4, 60 sec: 40140.9, 300 sec: 41043.3). Total num frames: 2803105792. Throughput: 0: 41158.3. Samples: 27189480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:30,501][18875] Avg episode reward: [(0, '0.371')] [2024-06-18 16:42:32,321][19107] Updated weights for policy 0, policy_version 171095 (0.0038) [2024-06-18 16:42:35,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.0, 300 sec: 41376.5). Total num frames: 2803367936. Throughput: 0: 41086.6. Samples: 27434600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:35,501][18875] Avg episode reward: [(0, '0.183')] [2024-06-18 16:42:35,509][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000171104_2803367936.pth... [2024-06-18 16:42:35,571][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000170499_2793455616.pth [2024-06-18 16:42:35,936][19107] Updated weights for policy 0, policy_version 171105 (0.0031) [2024-06-18 16:42:40,166][19107] Updated weights for policy 0, policy_version 171115 (0.0025) [2024-06-18 16:42:40,501][18875] Fps is (10 sec: 44235.5, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 2803548160. Throughput: 0: 41127.9. Samples: 27683260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:40,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 16:42:44,310][19107] Updated weights for policy 0, policy_version 171125 (0.0041) [2024-06-18 16:42:45,500][18875] Fps is (10 sec: 36044.7, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 2803728384. Throughput: 0: 41036.8. Samples: 27802040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:42:45,501][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 16:42:48,286][19107] Updated weights for policy 0, policy_version 171135 (0.0033) [2024-06-18 16:42:50,504][18875] Fps is (10 sec: 42584.1, 60 sec: 41230.6, 300 sec: 41376.1). Total num frames: 2803974144. Throughput: 0: 41077.6. Samples: 28052680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:42:50,505][18875] Avg episode reward: [(0, '0.661')] [2024-06-18 16:42:52,583][19107] Updated weights for policy 0, policy_version 171145 (0.0040) [2024-06-18 16:42:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2804154368. Throughput: 0: 41192.4. Samples: 28302460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:42:55,501][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 16:42:56,051][19107] Updated weights for policy 0, policy_version 171155 (0.0034) [2024-06-18 16:43:00,355][19107] Updated weights for policy 0, policy_version 171165 (0.0028) [2024-06-18 16:43:00,500][18875] Fps is (10 sec: 39335.5, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2804367360. Throughput: 0: 41166.5. Samples: 28425080. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:00,501][18875] Avg episode reward: [(0, '0.422')] [2024-06-18 16:43:03,819][19107] Updated weights for policy 0, policy_version 171175 (0.0045) [2024-06-18 16:43:05,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41506.0, 300 sec: 41432.0). Total num frames: 2804596736. Throughput: 0: 41308.2. Samples: 28677800. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:05,501][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 16:43:08,107][19107] Updated weights for policy 0, policy_version 171185 (0.0043) [2024-06-18 16:43:10,500][18875] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 2804776960. Throughput: 0: 41470.8. Samples: 28929700. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:10,500][18875] Avg episode reward: [(0, '0.820')] [2024-06-18 16:43:12,062][19107] Updated weights for policy 0, policy_version 171195 (0.0037) [2024-06-18 16:43:15,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 2804989952. Throughput: 0: 41403.0. Samples: 29052620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:15,501][18875] Avg episode reward: [(0, '0.767')] [2024-06-18 16:43:15,903][19107] Updated weights for policy 0, policy_version 171205 (0.0040) [2024-06-18 16:43:19,798][19107] Updated weights for policy 0, policy_version 171215 (0.0052) [2024-06-18 16:43:20,500][18875] Fps is (10 sec: 42597.4, 60 sec: 41233.5, 300 sec: 41376.5). Total num frames: 2805202944. Throughput: 0: 41480.5. Samples: 29301220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:20,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 16:43:23,594][19107] Updated weights for policy 0, policy_version 171225 (0.0025) [2024-06-18 16:43:25,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2805415936. Throughput: 0: 41510.0. Samples: 29551200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:25,501][18875] Avg episode reward: [(0, '0.328')] [2024-06-18 16:43:27,668][19107] Updated weights for policy 0, policy_version 171235 (0.0033) [2024-06-18 16:43:30,504][18875] Fps is (10 sec: 44221.5, 60 sec: 42322.8, 300 sec: 41376.0). Total num frames: 2805645312. Throughput: 0: 41710.2. Samples: 29679140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:30,505][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 16:43:31,158][19107] Updated weights for policy 0, policy_version 171245 (0.0033) [2024-06-18 16:43:35,404][19107] Updated weights for policy 0, policy_version 171255 (0.0030) [2024-06-18 16:43:35,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41233.1, 300 sec: 41321.5). 
Total num frames: 2805841920. Throughput: 0: 41780.6. Samples: 29932660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:35,501][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 16:43:37,168][19087] Signal inference workers to stop experience collection... (400 times) [2024-06-18 16:43:37,220][19107] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-18 16:43:37,229][19087] Signal inference workers to resume experience collection... (400 times) [2024-06-18 16:43:37,236][19107] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-18 16:43:39,498][19107] Updated weights for policy 0, policy_version 171265 (0.0047) [2024-06-18 16:43:40,500][18875] Fps is (10 sec: 37696.7, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 2806022144. Throughput: 0: 41696.1. Samples: 30178780. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:40,501][18875] Avg episode reward: [(0, '0.284')] [2024-06-18 16:43:43,525][19107] Updated weights for policy 0, policy_version 171275 (0.0036) [2024-06-18 16:43:45,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 41376.5). Total num frames: 2806267904. Throughput: 0: 41722.2. Samples: 30302580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:45,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 16:43:47,212][19107] Updated weights for policy 0, policy_version 171285 (0.0042) [2024-06-18 16:43:50,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41235.6, 300 sec: 41209.9). Total num frames: 2806448128. Throughput: 0: 41537.9. Samples: 30547000. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 20.0) [2024-06-18 16:43:50,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 16:43:51,507][19107] Updated weights for policy 0, policy_version 171295 (0.0029) [2024-06-18 16:43:55,125][19107] Updated weights for policy 0, policy_version 171305 (0.0027) [2024-06-18 16:43:55,503][18875] Fps is (10 sec: 39313.0, 60 sec: 41777.7, 300 sec: 41265.1). Total num frames: 2806661120. Throughput: 0: 41516.9. Samples: 30798060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:43:55,503][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 16:43:59,126][19107] Updated weights for policy 0, policy_version 171315 (0.0038) [2024-06-18 16:44:00,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 2806890496. Throughput: 0: 41585.8. Samples: 30923980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:00,501][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 16:44:03,569][19107] Updated weights for policy 0, policy_version 171325 (0.0037) [2024-06-18 16:44:05,504][18875] Fps is (10 sec: 42592.6, 60 sec: 41503.7, 300 sec: 41209.4). Total num frames: 2807087104. Throughput: 0: 41640.3. Samples: 31175180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:05,505][18875] Avg episode reward: [(0, '0.309')] [2024-06-18 16:44:07,079][19107] Updated weights for policy 0, policy_version 171335 (0.0043) [2024-06-18 16:44:10,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 2807283712. Throughput: 0: 41380.3. Samples: 31413320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:10,502][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 16:44:11,336][19107] Updated weights for policy 0, policy_version 171345 (0.0032) [2024-06-18 16:44:14,997][19107] Updated weights for policy 0, policy_version 171355 (0.0029) [2024-06-18 16:44:15,500][18875] Fps is (10 sec: 42613.2, 60 sec: 42052.2, 300 sec: 41321.0). 
Total num frames: 2807513088. Throughput: 0: 41365.4. Samples: 31540440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:15,501][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 16:44:19,104][19107] Updated weights for policy 0, policy_version 171365 (0.0040) [2024-06-18 16:44:20,504][18875] Fps is (10 sec: 39307.6, 60 sec: 41230.7, 300 sec: 41098.3). Total num frames: 2807676928. Throughput: 0: 41247.0. Samples: 31788920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:20,505][18875] Avg episode reward: [(0, '0.501')] [2024-06-18 16:44:22,881][19107] Updated weights for policy 0, policy_version 171375 (0.0041) [2024-06-18 16:44:25,500][18875] Fps is (10 sec: 37683.5, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 2807889920. Throughput: 0: 41335.5. Samples: 32038880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:25,501][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 16:44:26,743][19107] Updated weights for policy 0, policy_version 171385 (0.0030) [2024-06-18 16:44:30,500][18875] Fps is (10 sec: 44252.9, 60 sec: 41235.5, 300 sec: 41265.5). Total num frames: 2808119296. Throughput: 0: 41443.6. Samples: 32167540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:30,501][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 16:44:30,692][19107] Updated weights for policy 0, policy_version 171395 (0.0043) [2024-06-18 16:44:34,498][19107] Updated weights for policy 0, policy_version 171405 (0.0051) [2024-06-18 16:44:35,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41266.0). Total num frames: 2808315904. Throughput: 0: 41509.2. Samples: 32414920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:35,501][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 16:44:35,545][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000171406_2808315904.pth... 
[2024-06-18 16:44:35,602][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000170802_2798419968.pth [2024-06-18 16:44:38,432][19107] Updated weights for policy 0, policy_version 171415 (0.0047) [2024-06-18 16:44:40,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2808512512. Throughput: 0: 41366.1. Samples: 32659440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:40,501][18875] Avg episode reward: [(0, '0.442')] [2024-06-18 16:44:42,489][19107] Updated weights for policy 0, policy_version 171425 (0.0039) [2024-06-18 16:44:45,504][18875] Fps is (10 sec: 42583.2, 60 sec: 41230.6, 300 sec: 41209.4). Total num frames: 2808741888. Throughput: 0: 41314.0. Samples: 32783260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:45,505][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 16:44:46,385][19107] Updated weights for policy 0, policy_version 171435 (0.0040) [2024-06-18 16:44:50,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 2808938496. Throughput: 0: 41254.5. Samples: 33031480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:50,501][18875] Avg episode reward: [(0, '0.729')] [2024-06-18 16:44:50,623][19107] Updated weights for policy 0, policy_version 171445 (0.0035) [2024-06-18 16:44:54,371][19107] Updated weights for policy 0, policy_version 171455 (0.0037) [2024-06-18 16:44:55,500][18875] Fps is (10 sec: 42614.0, 60 sec: 41780.8, 300 sec: 41265.5). Total num frames: 2809167872. Throughput: 0: 41197.0. Samples: 33267180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 16:44:55,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 16:44:58,792][19107] Updated weights for policy 0, policy_version 171465 (0.0030) [2024-06-18 16:45:00,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 2809348096. Throughput: 0: 41318.4. Samples: 33399760. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:00,501][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 16:45:02,113][19107] Updated weights for policy 0, policy_version 171475 (0.0037) [2024-06-18 16:45:05,500][18875] Fps is (10 sec: 37683.3, 60 sec: 40962.5, 300 sec: 41209.9). Total num frames: 2809544704. Throughput: 0: 41231.8. Samples: 33644200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:05,501][18875] Avg episode reward: [(0, '0.457')] [2024-06-18 16:45:06,747][19107] Updated weights for policy 0, policy_version 171485 (0.0043) [2024-06-18 16:45:10,036][19107] Updated weights for policy 0, policy_version 171495 (0.0051) [2024-06-18 16:45:10,500][18875] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 2809790464. Throughput: 0: 41046.2. Samples: 33885960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:10,504][18875] Avg episode reward: [(0, '0.476')] [2024-06-18 16:45:13,247][19087] Signal inference workers to stop experience collection... (450 times) [2024-06-18 16:45:13,248][19087] Signal inference workers to resume experience collection... (450 times) [2024-06-18 16:45:13,263][19107] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-18 16:45:13,276][19107] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-18 16:45:14,705][19107] Updated weights for policy 0, policy_version 171505 (0.0039) [2024-06-18 16:45:15,500][18875] Fps is (10 sec: 39320.7, 60 sec: 40413.8, 300 sec: 41209.9). Total num frames: 2809937920. Throughput: 0: 41150.1. Samples: 34019300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:15,501][18875] Avg episode reward: [(0, '0.646')] [2024-06-18 16:45:18,006][19107] Updated weights for policy 0, policy_version 171515 (0.0028) [2024-06-18 16:45:20,500][18875] Fps is (10 sec: 36044.4, 60 sec: 41235.5, 300 sec: 41209.9). Total num frames: 2810150912. Throughput: 0: 41034.1. 
Samples: 34261460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:20,501][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 16:45:22,655][19107] Updated weights for policy 0, policy_version 171525 (0.0044) [2024-06-18 16:45:25,500][18875] Fps is (10 sec: 45876.4, 60 sec: 41779.3, 300 sec: 41376.6). Total num frames: 2810396672. Throughput: 0: 41144.5. Samples: 34510940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:25,500][18875] Avg episode reward: [(0, '0.217')] [2024-06-18 16:45:25,938][19107] Updated weights for policy 0, policy_version 171535 (0.0028) [2024-06-18 16:45:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 2810576896. Throughput: 0: 41189.0. Samples: 34636620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:30,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 16:45:30,596][19107] Updated weights for policy 0, policy_version 171545 (0.0038) [2024-06-18 16:45:34,086][19107] Updated weights for policy 0, policy_version 171555 (0.0039) [2024-06-18 16:45:35,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 2810806272. Throughput: 0: 41076.4. Samples: 34879920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:35,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 16:45:38,749][19107] Updated weights for policy 0, policy_version 171565 (0.0041) [2024-06-18 16:45:40,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 2811002880. Throughput: 0: 41417.3. Samples: 35130960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:40,503][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 16:45:41,902][19107] Updated weights for policy 0, policy_version 171575 (0.0033) [2024-06-18 16:45:45,500][18875] Fps is (10 sec: 39321.9, 60 sec: 40962.5, 300 sec: 41265.5). Total num frames: 2811199488. Throughput: 0: 41064.9. Samples: 35247680. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:45,500][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 16:45:46,575][19107] Updated weights for policy 0, policy_version 171585 (0.0037) [2024-06-18 16:45:49,714][19107] Updated weights for policy 0, policy_version 171595 (0.0032) [2024-06-18 16:45:50,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 2811428864. Throughput: 0: 41298.6. Samples: 35502640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:50,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 16:45:54,327][19107] Updated weights for policy 0, policy_version 171605 (0.0043) [2024-06-18 16:45:55,500][18875] Fps is (10 sec: 39321.4, 60 sec: 40413.9, 300 sec: 41265.5). Total num frames: 2811592704. Throughput: 0: 41443.6. Samples: 35750920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:45:55,501][18875] Avg episode reward: [(0, '0.786')] [2024-06-18 16:45:57,712][19107] Updated weights for policy 0, policy_version 171615 (0.0032) [2024-06-18 16:46:00,500][18875] Fps is (10 sec: 37683.0, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 2811805696. Throughput: 0: 41053.0. Samples: 35866680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-18 16:46:00,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 16:46:02,126][19107] Updated weights for policy 0, policy_version 171625 (0.0033) [2024-06-18 16:46:05,500][18875] Fps is (10 sec: 44235.9, 60 sec: 41506.0, 300 sec: 41321.0). Total num frames: 2812035072. Throughput: 0: 41351.6. Samples: 36122280. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:05,501][18875] Avg episode reward: [(0, '0.307')] [2024-06-18 16:46:05,981][19107] Updated weights for policy 0, policy_version 171635 (0.0030) [2024-06-18 16:46:10,498][19107] Updated weights for policy 0, policy_version 171645 (0.0037) [2024-06-18 16:46:10,500][18875] Fps is (10 sec: 42598.1, 60 sec: 40686.9, 300 sec: 41266.0). 
Total num frames: 2812231680. Throughput: 0: 41274.5. Samples: 36368300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:10,501][18875] Avg episode reward: [(0, '0.442')] [2024-06-18 16:46:14,154][19107] Updated weights for policy 0, policy_version 171655 (0.0035) [2024-06-18 16:46:15,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41266.0). Total num frames: 2812428288. Throughput: 0: 41048.1. Samples: 36483780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:15,501][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 16:46:18,162][19107] Updated weights for policy 0, policy_version 171665 (0.0034) [2024-06-18 16:46:20,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 41321.5). Total num frames: 2812657664. Throughput: 0: 41364.5. Samples: 36741320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:20,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 16:46:22,130][19107] Updated weights for policy 0, policy_version 171675 (0.0036) [2024-06-18 16:46:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 40686.8, 300 sec: 41154.4). Total num frames: 2812837888. Throughput: 0: 41185.7. Samples: 36984320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:25,501][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 16:46:26,145][19107] Updated weights for policy 0, policy_version 171685 (0.0029) [2024-06-18 16:46:29,920][19107] Updated weights for policy 0, policy_version 171695 (0.0037) [2024-06-18 16:46:30,504][18875] Fps is (10 sec: 40945.1, 60 sec: 41503.7, 300 sec: 41376.0). Total num frames: 2813067264. Throughput: 0: 41319.7. Samples: 37107220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:30,504][18875] Avg episode reward: [(0, '0.324')] [2024-06-18 16:46:33,138][19087] Signal inference workers to stop experience collection... 
(500 times) [2024-06-18 16:46:33,190][19107] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-18 16:46:33,259][19087] Signal inference workers to resume experience collection... (500 times) [2024-06-18 16:46:33,259][19107] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-18 16:46:33,929][19107] Updated weights for policy 0, policy_version 171705 (0.0034) [2024-06-18 16:46:35,500][18875] Fps is (10 sec: 40960.7, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 2813247488. Throughput: 0: 41087.6. Samples: 37351580. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:35,500][18875] Avg episode reward: [(0, '0.343')] [2024-06-18 16:46:35,580][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000171708_2813263872.pth... [2024-06-18 16:46:35,623][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000171104_2803367936.pth [2024-06-18 16:46:37,953][19107] Updated weights for policy 0, policy_version 171715 (0.0034) [2024-06-18 16:46:40,500][18875] Fps is (10 sec: 40974.6, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 2813476864. Throughput: 0: 41047.9. Samples: 37598080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:40,501][18875] Avg episode reward: [(0, '0.723')] [2024-06-18 16:46:42,533][19107] Updated weights for policy 0, policy_version 171725 (0.0044) [2024-06-18 16:46:45,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41232.9, 300 sec: 41265.5). Total num frames: 2813673472. Throughput: 0: 41316.0. Samples: 37725900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:45,501][18875] Avg episode reward: [(0, '0.387')] [2024-06-18 16:46:45,963][19107] Updated weights for policy 0, policy_version 171735 (0.0038) [2024-06-18 16:46:50,338][19107] Updated weights for policy 0, policy_version 171745 (0.0044) [2024-06-18 16:46:50,500][18875] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 41265.5). 
Total num frames: 2813870080. Throughput: 0: 41083.2. Samples: 37971020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:50,501][18875] Avg episode reward: [(0, '0.341')] [2024-06-18 16:46:53,791][19107] Updated weights for policy 0, policy_version 171755 (0.0048) [2024-06-18 16:46:55,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41321.0). Total num frames: 2814099456. Throughput: 0: 40987.1. Samples: 38212720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:46:55,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 16:46:58,160][19107] Updated weights for policy 0, policy_version 171765 (0.0030) [2024-06-18 16:47:00,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 2814279680. Throughput: 0: 41190.7. Samples: 38337360. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:47:00,501][18875] Avg episode reward: [(0, '0.375')] [2024-06-18 16:47:01,763][19107] Updated weights for policy 0, policy_version 171775 (0.0032) [2024-06-18 16:47:05,500][18875] Fps is (10 sec: 39322.0, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 2814492672. Throughput: 0: 40867.5. Samples: 38580360. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-18 16:47:05,503][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 16:47:06,640][19107] Updated weights for policy 0, policy_version 171785 (0.0040) [2024-06-18 16:47:09,673][19107] Updated weights for policy 0, policy_version 171795 (0.0034) [2024-06-18 16:47:10,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 2814705664. Throughput: 0: 40882.7. Samples: 38824040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:10,501][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 16:47:14,711][19107] Updated weights for policy 0, policy_version 171805 (0.0036) [2024-06-18 16:47:15,501][18875] Fps is (10 sec: 39321.0, 60 sec: 40959.9, 300 sec: 41210.0). 
Total num frames: 2814885888. Throughput: 0: 40977.8. Samples: 38951080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:15,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 16:47:18,108][19107] Updated weights for policy 0, policy_version 171815 (0.0034) [2024-06-18 16:47:20,500][18875] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 2815115264. Throughput: 0: 40931.5. Samples: 39193500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:20,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 16:47:22,515][19107] Updated weights for policy 0, policy_version 171825 (0.0053) [2024-06-18 16:47:25,500][18875] Fps is (10 sec: 40960.5, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 2815295488. Throughput: 0: 41086.2. Samples: 39446960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:25,501][18875] Avg episode reward: [(0, '0.418')] [2024-06-18 16:47:26,127][19107] Updated weights for policy 0, policy_version 171835 (0.0037) [2024-06-18 16:47:30,462][19107] Updated weights for policy 0, policy_version 171845 (0.0038) [2024-06-18 16:47:30,504][18875] Fps is (10 sec: 39307.4, 60 sec: 40686.9, 300 sec: 41153.9). Total num frames: 2815508480. Throughput: 0: 40904.4. Samples: 39566740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:30,504][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 16:47:34,099][19107] Updated weights for policy 0, policy_version 171855 (0.0040) [2024-06-18 16:47:35,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 2815721472. Throughput: 0: 40985.9. Samples: 39815380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:35,500][18875] Avg episode reward: [(0, '0.619')] [2024-06-18 16:47:38,460][19107] Updated weights for policy 0, policy_version 171865 (0.0042) [2024-06-18 16:47:40,500][18875] Fps is (10 sec: 40974.6, 60 sec: 40686.9, 300 sec: 41321.0). 
Total num frames: 2815918080. Throughput: 0: 41151.7. Samples: 40064540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:40,503][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 16:47:41,834][19107] Updated weights for policy 0, policy_version 171875 (0.0037) [2024-06-18 16:47:45,500][18875] Fps is (10 sec: 39320.9, 60 sec: 40686.9, 300 sec: 41154.9). Total num frames: 2816114688. Throughput: 0: 41059.1. Samples: 40185020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:45,501][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 16:47:46,520][19107] Updated weights for policy 0, policy_version 171885 (0.0045) [2024-06-18 16:47:49,824][19107] Updated weights for policy 0, policy_version 171895 (0.0037) [2024-06-18 16:47:50,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 2816344064. Throughput: 0: 41141.4. Samples: 40431720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:50,501][18875] Avg episode reward: [(0, '0.332')] [2024-06-18 16:47:54,279][19107] Updated weights for policy 0, policy_version 171905 (0.0034) [2024-06-18 16:47:55,504][18875] Fps is (10 sec: 42583.4, 60 sec: 40684.6, 300 sec: 41265.0). Total num frames: 2816540672. Throughput: 0: 41343.0. Samples: 40684620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:47:55,504][18875] Avg episode reward: [(0, '0.235')] [2024-06-18 16:47:57,754][19107] Updated weights for policy 0, policy_version 171915 (0.0034) [2024-06-18 16:48:00,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 2816753664. Throughput: 0: 41173.0. Samples: 40803860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:48:00,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 16:48:02,042][19107] Updated weights for policy 0, policy_version 171925 (0.0031) [2024-06-18 16:48:04,131][19087] Signal inference workers to stop experience collection... 
(550 times) [2024-06-18 16:48:04,131][19087] Signal inference workers to resume experience collection... (550 times) [2024-06-18 16:48:04,179][19107] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-18 16:48:04,180][19107] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-18 16:48:05,500][18875] Fps is (10 sec: 42614.2, 60 sec: 41233.2, 300 sec: 41321.0). Total num frames: 2816966656. Throughput: 0: 41267.6. Samples: 41050540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:48:05,500][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 16:48:05,637][19107] Updated weights for policy 0, policy_version 171935 (0.0039) [2024-06-18 16:48:09,890][19107] Updated weights for policy 0, policy_version 171945 (0.0034) [2024-06-18 16:48:10,500][18875] Fps is (10 sec: 39322.1, 60 sec: 40687.0, 300 sec: 41209.9). Total num frames: 2817146880. Throughput: 0: 41190.8. Samples: 41300540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:48:10,501][18875] Avg episode reward: [(0, '0.340')] [2024-06-18 16:48:13,537][19107] Updated weights for policy 0, policy_version 171955 (0.0027) [2024-06-18 16:48:15,500][18875] Fps is (10 sec: 42597.6, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 2817392640. Throughput: 0: 41217.9. Samples: 41421400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 16:48:15,501][18875] Avg episode reward: [(0, '0.268')] [2024-06-18 16:48:17,874][19107] Updated weights for policy 0, policy_version 171965 (0.0043) [2024-06-18 16:48:20,504][18875] Fps is (10 sec: 44220.6, 60 sec: 41230.6, 300 sec: 41265.0). Total num frames: 2817589248. Throughput: 0: 41321.5. Samples: 41675000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:48:20,504][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 16:48:21,793][19107] Updated weights for policy 0, policy_version 171975 (0.0049) [2024-06-18 16:48:25,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41233.1, 300 sec: 41099.3). Total num frames: 2817769472. Throughput: 0: 41407.6. Samples: 41927880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:48:25,501][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 16:48:25,661][19107] Updated weights for policy 0, policy_version 171985 (0.0035) [2024-06-18 16:48:29,637][19107] Updated weights for policy 0, policy_version 171995 (0.0041) [2024-06-18 16:48:30,500][18875] Fps is (10 sec: 44253.0, 60 sec: 42054.8, 300 sec: 41321.0). Total num frames: 2818031616. Throughput: 0: 41515.3. Samples: 42053200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:48:30,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 16:48:33,557][19107] Updated weights for policy 0, policy_version 172005 (0.0023) [2024-06-18 16:48:35,500][18875] Fps is (10 sec: 44236.0, 60 sec: 41506.0, 300 sec: 41321.0). Total num frames: 2818211840. Throughput: 0: 41454.0. Samples: 42297160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:48:35,501][18875] Avg episode reward: [(0, '0.391')] [2024-06-18 16:48:35,531][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000172010_2818211840.pth... [2024-06-18 16:48:35,583][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000171406_2808315904.pth [2024-06-18 16:48:37,396][19107] Updated weights for policy 0, policy_version 172015 (0.0034) [2024-06-18 16:48:40,500][18875] Fps is (10 sec: 36044.1, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2818392064. Throughput: 0: 41301.9. Samples: 42543060. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:48:40,501][18875] Avg episode reward: [(0, '0.665')] [2024-06-18 16:48:41,428][19107] Updated weights for policy 0, policy_version 172025 (0.0038) [2024-06-18 16:48:45,190][19107] Updated weights for policy 0, policy_version 172035 (0.0029) [2024-06-18 16:48:45,500][18875] Fps is (10 sec: 42599.5, 60 sec: 42052.4, 300 sec: 41321.0). Total num frames: 2818637824. Throughput: 0: 41397.9. Samples: 42666760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:48:45,500][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 16:48:49,317][19107] Updated weights for policy 0, policy_version 172045 (0.0032) [2024-06-18 16:48:50,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41232.9, 300 sec: 41210.2). Total num frames: 2818818048. Throughput: 0: 41493.6. Samples: 42917760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:48:50,501][18875] Avg episode reward: [(0, '0.334')] [2024-06-18 16:48:53,320][19107] Updated weights for policy 0, policy_version 172055 (0.0047) [2024-06-18 16:48:55,500][18875] Fps is (10 sec: 39320.7, 60 sec: 41508.5, 300 sec: 41154.4). Total num frames: 2819031040. Throughput: 0: 41410.9. Samples: 43164040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:48:55,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 16:48:57,219][19107] Updated weights for policy 0, policy_version 172065 (0.0030) [2024-06-18 16:49:00,500][18875] Fps is (10 sec: 39322.4, 60 sec: 40960.1, 300 sec: 41099.4). Total num frames: 2819211264. Throughput: 0: 41469.5. Samples: 43287520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:49:00,500][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 16:49:01,281][19107] Updated weights for policy 0, policy_version 172075 (0.0049) [2024-06-18 16:49:05,453][19107] Updated weights for policy 0, policy_version 172085 (0.0032) [2024-06-18 16:49:05,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41233.0, 300 sec: 41209.9). 
Total num frames: 2819440640. Throughput: 0: 41396.2. Samples: 43537680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:49:05,500][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 16:49:09,187][19107] Updated weights for policy 0, policy_version 172095 (0.0051) [2024-06-18 16:49:10,500][18875] Fps is (10 sec: 44236.2, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 2819653632. Throughput: 0: 41131.9. Samples: 43778820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:49:10,512][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 16:49:13,288][19107] Updated weights for policy 0, policy_version 172105 (0.0041) [2024-06-18 16:49:15,500][18875] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 41266.0). Total num frames: 2819850240. Throughput: 0: 41279.4. Samples: 43910780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:49:15,501][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 16:49:17,162][19107] Updated weights for policy 0, policy_version 172115 (0.0033) [2024-06-18 16:49:20,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41235.4, 300 sec: 41265.5). Total num frames: 2820063232. Throughput: 0: 41264.5. Samples: 44154060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 16:49:20,501][18875] Avg episode reward: [(0, '0.276')] [2024-06-18 16:49:21,276][19107] Updated weights for policy 0, policy_version 172125 (0.0025) [2024-06-18 16:49:25,069][19107] Updated weights for policy 0, policy_version 172135 (0.0033) [2024-06-18 16:49:25,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 2820259840. Throughput: 0: 41180.4. Samples: 44396180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:49:25,501][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 16:49:29,422][19107] Updated weights for policy 0, policy_version 172145 (0.0037) [2024-06-18 16:49:30,504][18875] Fps is (10 sec: 40945.9, 60 sec: 40684.5, 300 sec: 41209.4). 
Total num frames: 2820472832. Throughput: 0: 41155.8. Samples: 44518920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:49:30,504][18875] Avg episode reward: [(0, '0.317')] [2024-06-18 16:49:32,495][19087] Signal inference workers to stop experience collection... (600 times) [2024-06-18 16:49:32,544][19107] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-18 16:49:32,609][19087] Signal inference workers to resume experience collection... (600 times) [2024-06-18 16:49:32,609][19107] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-18 16:49:33,228][19107] Updated weights for policy 0, policy_version 172155 (0.0025) [2024-06-18 16:49:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 2820669440. Throughput: 0: 40867.2. Samples: 44756780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:49:35,501][18875] Avg episode reward: [(0, '0.317')] [2024-06-18 16:49:37,324][19107] Updated weights for policy 0, policy_version 172165 (0.0039) [2024-06-18 16:49:40,504][18875] Fps is (10 sec: 39321.4, 60 sec: 41230.7, 300 sec: 41098.8). Total num frames: 2820866048. Throughput: 0: 41036.4. Samples: 45010820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:49:40,505][18875] Avg episode reward: [(0, '0.268')] [2024-06-18 16:49:41,409][19107] Updated weights for policy 0, policy_version 172175 (0.0051) [2024-06-18 16:49:45,070][19107] Updated weights for policy 0, policy_version 172185 (0.0039) [2024-06-18 16:49:45,500][18875] Fps is (10 sec: 40959.9, 60 sec: 40686.8, 300 sec: 41154.4). Total num frames: 2821079040. Throughput: 0: 40968.3. Samples: 45131100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:49:45,501][18875] Avg episode reward: [(0, '0.336')] [2024-06-18 16:49:49,429][19107] Updated weights for policy 0, policy_version 172195 (0.0040) [2024-06-18 16:49:50,500][18875] Fps is (10 sec: 42613.5, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 2821292032. Throughput: 0: 40970.6. Samples: 45381360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:49:50,501][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 16:49:53,057][19107] Updated weights for policy 0, policy_version 172205 (0.0037) [2024-06-18 16:49:55,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 2821488640. Throughput: 0: 40999.1. Samples: 45623780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:49:55,501][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 16:49:57,538][19107] Updated weights for policy 0, policy_version 172215 (0.0037) [2024-06-18 16:50:00,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 2821701632. Throughput: 0: 40725.3. Samples: 45743420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:50:00,501][18875] Avg episode reward: [(0, '0.161')] [2024-06-18 16:50:01,296][19107] Updated weights for policy 0, policy_version 172225 (0.0036) [2024-06-18 16:50:05,391][19107] Updated weights for policy 0, policy_version 172235 (0.0030) [2024-06-18 16:50:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 2821898240. Throughput: 0: 40805.0. Samples: 45990280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:50:05,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 16:50:08,947][19107] Updated weights for policy 0, policy_version 172245 (0.0025) [2024-06-18 16:50:10,504][18875] Fps is (10 sec: 40945.7, 60 sec: 40957.6, 300 sec: 41265.0). Total num frames: 2822111232. Throughput: 0: 40809.7. Samples: 46232760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:50:10,505][18875] Avg episode reward: [(0, '0.276')] [2024-06-18 16:50:13,582][19107] Updated weights for policy 0, policy_version 172255 (0.0036) [2024-06-18 16:50:15,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 2822307840. Throughput: 0: 41057.0. Samples: 46366340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:50:15,501][18875] Avg episode reward: [(0, '0.418')] [2024-06-18 16:50:16,705][19107] Updated weights for policy 0, policy_version 172265 (0.0029) [2024-06-18 16:50:20,500][18875] Fps is (10 sec: 40974.6, 60 sec: 40960.1, 300 sec: 41098.8). Total num frames: 2822520832. Throughput: 0: 41200.5. Samples: 46610800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:50:20,501][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 16:50:21,461][19107] Updated weights for policy 0, policy_version 172275 (0.0030) [2024-06-18 16:50:24,512][19107] Updated weights for policy 0, policy_version 172285 (0.0030) [2024-06-18 16:50:25,504][18875] Fps is (10 sec: 42583.2, 60 sec: 41230.7, 300 sec: 41209.4). Total num frames: 2822733824. Throughput: 0: 41012.9. Samples: 46856400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 16:50:25,504][18875] Avg episode reward: [(0, '0.374')] [2024-06-18 16:50:29,150][19107] Updated weights for policy 0, policy_version 172295 (0.0036) [2024-06-18 16:50:30,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40962.3, 300 sec: 41098.8). Total num frames: 2822930432. Throughput: 0: 41232.0. Samples: 46986540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:50:30,501][18875] Avg episode reward: [(0, '0.391')] [2024-06-18 16:50:32,293][19107] Updated weights for policy 0, policy_version 172305 (0.0035) [2024-06-18 16:50:35,500][18875] Fps is (10 sec: 40974.8, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2823143424. Throughput: 0: 41198.3. Samples: 47235280. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:50:35,501][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 16:50:35,524][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000172311_2823143424.pth... [2024-06-18 16:50:35,586][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000171708_2813263872.pth [2024-06-18 16:50:36,974][19107] Updated weights for policy 0, policy_version 172315 (0.0039) [2024-06-18 16:50:40,050][19107] Updated weights for policy 0, policy_version 172325 (0.0030) [2024-06-18 16:50:40,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41781.6, 300 sec: 41265.4). Total num frames: 2823372800. Throughput: 0: 41207.9. Samples: 47478140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:50:40,501][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 16:50:44,021][19087] Signal inference workers to stop experience collection... (650 times) [2024-06-18 16:50:44,032][19087] Signal inference workers to resume experience collection... (650 times) [2024-06-18 16:50:44,036][19107] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-18 16:50:44,063][19107] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-18 16:50:45,376][19107] Updated weights for policy 0, policy_version 172335 (0.0049) [2024-06-18 16:50:45,500][18875] Fps is (10 sec: 39321.5, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 2823536640. Throughput: 0: 41335.7. Samples: 47603520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:50:45,501][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 16:50:48,021][19107] Updated weights for policy 0, policy_version 172345 (0.0039) [2024-06-18 16:50:50,504][18875] Fps is (10 sec: 39307.8, 60 sec: 41230.6, 300 sec: 41264.9). Total num frames: 2823766016. Throughput: 0: 41317.6. Samples: 47849720. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:50:50,505][18875] Avg episode reward: [(0, '0.738')] [2024-06-18 16:50:53,491][19107] Updated weights for policy 0, policy_version 172355 (0.0047) [2024-06-18 16:50:55,500][18875] Fps is (10 sec: 45875.1, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 2823995392. Throughput: 0: 41499.3. Samples: 48100080. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:50:55,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 16:50:56,008][19107] Updated weights for policy 0, policy_version 172365 (0.0046) [2024-06-18 16:51:00,500][18875] Fps is (10 sec: 39335.4, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 2824159232. Throughput: 0: 41337.7. Samples: 48226540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:51:00,501][18875] Avg episode reward: [(0, '0.355')] [2024-06-18 16:51:01,245][19107] Updated weights for policy 0, policy_version 172375 (0.0041) [2024-06-18 16:51:04,223][19107] Updated weights for policy 0, policy_version 172385 (0.0030) [2024-06-18 16:51:05,501][18875] Fps is (10 sec: 37682.5, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 2824372224. Throughput: 0: 41358.5. Samples: 48471940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:51:05,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 16:51:09,012][19107] Updated weights for policy 0, policy_version 172395 (0.0038) [2024-06-18 16:51:10,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41235.6, 300 sec: 41209.9). Total num frames: 2824585216. Throughput: 0: 41520.7. Samples: 48724680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:51:10,501][18875] Avg episode reward: [(0, '0.405')] [2024-06-18 16:51:12,082][19107] Updated weights for policy 0, policy_version 172405 (0.0025) [2024-06-18 16:51:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2824781824. Throughput: 0: 41271.1. Samples: 48843740. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:51:15,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 16:51:16,945][19107] Updated weights for policy 0, policy_version 172415 (0.0038) [2024-06-18 16:51:20,122][19107] Updated weights for policy 0, policy_version 172425 (0.0038) [2024-06-18 16:51:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 2825011200. Throughput: 0: 41264.5. Samples: 49092180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:51:20,501][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 16:51:24,983][19107] Updated weights for policy 0, policy_version 172435 (0.0044) [2024-06-18 16:51:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40962.4, 300 sec: 41099.3). Total num frames: 2825191424. Throughput: 0: 41320.1. Samples: 49337540. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:51:25,501][18875] Avg episode reward: [(0, '0.342')] [2024-06-18 16:51:28,155][19107] Updated weights for policy 0, policy_version 172445 (0.0043) [2024-06-18 16:51:30,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 2825404416. Throughput: 0: 41193.3. Samples: 49457220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 16:51:30,501][18875] Avg episode reward: [(0, '0.396')] [2024-06-18 16:51:33,011][19107] Updated weights for policy 0, policy_version 172455 (0.0036) [2024-06-18 16:51:35,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 2825633792. Throughput: 0: 41349.4. Samples: 49710300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:51:35,501][18875] Avg episode reward: [(0, '0.702')] [2024-06-18 16:51:36,199][19107] Updated weights for policy 0, policy_version 172465 (0.0033) [2024-06-18 16:51:40,500][18875] Fps is (10 sec: 39321.9, 60 sec: 40414.0, 300 sec: 41098.9). Total num frames: 2825797632. Throughput: 0: 41213.4. Samples: 49954680. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:51:40,501][18875] Avg episode reward: [(0, '0.274')] [2024-06-18 16:51:40,925][19107] Updated weights for policy 0, policy_version 172475 (0.0049) [2024-06-18 16:51:44,087][19107] Updated weights for policy 0, policy_version 172485 (0.0040) [2024-06-18 16:51:45,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 2826027008. Throughput: 0: 41112.6. Samples: 50076600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:51:45,500][18875] Avg episode reward: [(0, '0.162')] [2024-06-18 16:51:49,044][19107] Updated weights for policy 0, policy_version 172495 (0.0030) [2024-06-18 16:51:50,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41235.6, 300 sec: 41154.4). Total num frames: 2826240000. Throughput: 0: 41242.8. Samples: 50327860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:51:50,501][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 16:51:52,193][19107] Updated weights for policy 0, policy_version 172505 (0.0044) [2024-06-18 16:51:55,500][18875] Fps is (10 sec: 39321.5, 60 sec: 40413.9, 300 sec: 41154.4). Total num frames: 2826420224. Throughput: 0: 41041.3. Samples: 50571540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:51:55,501][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 16:51:57,158][19107] Updated weights for policy 0, policy_version 172515 (0.0041) [2024-06-18 16:52:00,353][19107] Updated weights for policy 0, policy_version 172525 (0.0044) [2024-06-18 16:52:00,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 2826649600. Throughput: 0: 41000.6. Samples: 50688760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:52:00,501][18875] Avg episode reward: [(0, '0.698')] [2024-06-18 16:52:04,973][19107] Updated weights for policy 0, policy_version 172535 (0.0045) [2024-06-18 16:52:05,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41233.1, 300 sec: 41154.4). 
Total num frames: 2826846208. Throughput: 0: 41049.6. Samples: 50939420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:52:05,501][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 16:52:08,239][19107] Updated weights for policy 0, policy_version 172545 (0.0028) [2024-06-18 16:52:08,258][19087] Signal inference workers to stop experience collection... (700 times) [2024-06-18 16:52:08,258][19087] Signal inference workers to resume experience collection... (700 times) [2024-06-18 16:52:08,300][19107] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-18 16:52:08,300][19107] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-18 16:52:10,500][18875] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 2827042816. Throughput: 0: 41036.0. Samples: 51184160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:52:10,504][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 16:52:12,926][19107] Updated weights for policy 0, policy_version 172555 (0.0034) [2024-06-18 16:52:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 2827272192. Throughput: 0: 41136.9. Samples: 51308380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:52:15,501][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 16:52:16,190][19107] Updated weights for policy 0, policy_version 172565 (0.0037) [2024-06-18 16:52:20,500][18875] Fps is (10 sec: 39322.0, 60 sec: 40413.9, 300 sec: 41154.4). Total num frames: 2827436032. Throughput: 0: 41022.0. Samples: 51556280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:52:20,501][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 16:52:20,839][19107] Updated weights for policy 0, policy_version 172575 (0.0047) [2024-06-18 16:52:24,038][19107] Updated weights for policy 0, policy_version 172585 (0.0034) [2024-06-18 16:52:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41265.9). Total num frames: 2827681792. Throughput: 0: 41042.9. Samples: 51801620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:52:25,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 16:52:28,580][19107] Updated weights for policy 0, policy_version 172595 (0.0049) [2024-06-18 16:52:30,504][18875] Fps is (10 sec: 44220.5, 60 sec: 41230.6, 300 sec: 41209.4). Total num frames: 2827878400. Throughput: 0: 41153.1. Samples: 51928640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:52:30,505][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 16:52:32,238][19107] Updated weights for policy 0, policy_version 172605 (0.0047) [2024-06-18 16:52:35,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40413.9, 300 sec: 41154.4). Total num frames: 2828058624. Throughput: 0: 40903.5. Samples: 52168520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 16:52:35,501][18875] Avg episode reward: [(0, '0.300')] [2024-06-18 16:52:35,580][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000172612_2828075008.pth... [2024-06-18 16:52:35,634][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000172010_2818211840.pth [2024-06-18 16:52:36,534][19107] Updated weights for policy 0, policy_version 172615 (0.0038) [2024-06-18 16:52:40,353][19107] Updated weights for policy 0, policy_version 172625 (0.0044) [2024-06-18 16:52:40,500][18875] Fps is (10 sec: 40974.7, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 2828288000. Throughput: 0: 40991.1. Samples: 52416140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:52:40,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 16:52:44,330][19107] Updated weights for policy 0, policy_version 172635 (0.0048) [2024-06-18 16:52:45,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 2828500992. Throughput: 0: 41308.8. Samples: 52547660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:52:45,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 16:52:48,397][19107] Updated weights for policy 0, policy_version 172645 (0.0040) [2024-06-18 16:52:50,500][18875] Fps is (10 sec: 37683.5, 60 sec: 40413.9, 300 sec: 41099.4). Total num frames: 2828664832. Throughput: 0: 41081.0. Samples: 52788060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:52:50,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 16:52:52,255][19107] Updated weights for policy 0, policy_version 172655 (0.0039) [2024-06-18 16:52:55,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41232.9, 300 sec: 41154.4). Total num frames: 2828894208. Throughput: 0: 41054.1. Samples: 53031600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:52:55,509][18875] Avg episode reward: [(0, '0.304')] [2024-06-18 16:52:56,272][19107] Updated weights for policy 0, policy_version 172665 (0.0046) [2024-06-18 16:53:00,221][19107] Updated weights for policy 0, policy_version 172675 (0.0035) [2024-06-18 16:53:00,500][18875] Fps is (10 sec: 44236.3, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 2829107200. Throughput: 0: 41121.8. Samples: 53158860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:53:00,501][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 16:53:04,107][19107] Updated weights for policy 0, policy_version 172685 (0.0036) [2024-06-18 16:53:05,500][18875] Fps is (10 sec: 40960.8, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 2829303808. Throughput: 0: 41043.5. Samples: 53403240. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:53:05,501][18875] Avg episode reward: [(0, '0.292')] [2024-06-18 16:53:08,103][19107] Updated weights for policy 0, policy_version 172695 (0.0029) [2024-06-18 16:53:10,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2829533184. Throughput: 0: 41022.8. Samples: 53647640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:53:10,501][18875] Avg episode reward: [(0, '0.253')] [2024-06-18 16:53:12,053][19107] Updated weights for policy 0, policy_version 172705 (0.0034) [2024-06-18 16:53:15,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 41099.4). Total num frames: 2829713408. Throughput: 0: 41120.2. Samples: 53778900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:53:15,501][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 16:53:16,010][19107] Updated weights for policy 0, policy_version 172715 (0.0033) [2024-06-18 16:53:19,940][19107] Updated weights for policy 0, policy_version 172725 (0.0039) [2024-06-18 16:53:20,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 2829926400. Throughput: 0: 41162.2. Samples: 54020820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:53:20,501][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 16:53:23,877][19107] Updated weights for policy 0, policy_version 172735 (0.0045) [2024-06-18 16:53:25,500][18875] Fps is (10 sec: 45875.0, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 2830172160. Throughput: 0: 41120.0. Samples: 54266540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:53:25,503][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 16:53:27,836][19107] Updated weights for policy 0, policy_version 172745 (0.0036) [2024-06-18 16:53:30,500][18875] Fps is (10 sec: 39322.0, 60 sec: 40689.4, 300 sec: 41043.3). Total num frames: 2830319616. Throughput: 0: 41027.2. Samples: 54393880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:53:30,501][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 16:53:31,803][19107] Updated weights for policy 0, policy_version 172755 (0.0042) [2024-06-18 16:53:35,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 2830565376. Throughput: 0: 41173.7. Samples: 54640880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:53:35,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 16:53:35,975][19107] Updated weights for policy 0, policy_version 172765 (0.0031) [2024-06-18 16:53:39,160][19087] Signal inference workers to stop experience collection... (750 times) [2024-06-18 16:53:39,162][19087] Signal inference workers to resume experience collection... (750 times) [2024-06-18 16:53:39,182][19107] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-18 16:53:39,182][19107] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-18 16:53:39,800][19107] Updated weights for policy 0, policy_version 172775 (0.0039) [2024-06-18 16:53:40,500][18875] Fps is (10 sec: 45875.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2830778368. Throughput: 0: 41261.9. Samples: 54888380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 16:53:40,501][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 16:53:44,275][19107] Updated weights for policy 0, policy_version 172785 (0.0039) [2024-06-18 16:53:45,500][18875] Fps is (10 sec: 37682.6, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 2830942208. Throughput: 0: 41172.8. Samples: 55011640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:53:45,501][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 16:53:47,964][19107] Updated weights for policy 0, policy_version 172795 (0.0031) [2024-06-18 16:53:50,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 2831171584. Throughput: 0: 41211.0. 
Samples: 55257740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:53:50,501][18875] Avg episode reward: [(0, '0.243')] [2024-06-18 16:53:52,069][19107] Updated weights for policy 0, policy_version 172805 (0.0048) [2024-06-18 16:53:55,501][18875] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 2831368192. Throughput: 0: 41426.5. Samples: 55511840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:53:55,501][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 16:53:55,926][19107] Updated weights for policy 0, policy_version 172815 (0.0043) [2024-06-18 16:53:59,956][19107] Updated weights for policy 0, policy_version 172825 (0.0033) [2024-06-18 16:54:00,502][18875] Fps is (10 sec: 40954.4, 60 sec: 41232.1, 300 sec: 41154.2). Total num frames: 2831581184. Throughput: 0: 41028.8. Samples: 55625260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:00,502][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 16:54:03,943][19107] Updated weights for policy 0, policy_version 172835 (0.0034) [2024-06-18 16:54:05,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 2831810560. Throughput: 0: 41216.0. Samples: 55875540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:05,501][18875] Avg episode reward: [(0, '0.274')] [2024-06-18 16:54:07,850][19107] Updated weights for policy 0, policy_version 172845 (0.0034) [2024-06-18 16:54:10,500][18875] Fps is (10 sec: 39327.1, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 2831974400. Throughput: 0: 41355.0. Samples: 56127520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:10,501][18875] Avg episode reward: [(0, '0.389')] [2024-06-18 16:54:11,817][19107] Updated weights for policy 0, policy_version 172855 (0.0049) [2024-06-18 16:54:15,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 2832203776. Throughput: 0: 41016.3. Samples: 56239620. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:15,501][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 16:54:15,753][19107] Updated weights for policy 0, policy_version 172865 (0.0036) [2024-06-18 16:54:19,665][19107] Updated weights for policy 0, policy_version 172875 (0.0039) [2024-06-18 16:54:20,500][18875] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 2832433152. Throughput: 0: 41420.3. Samples: 56504800. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:20,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 16:54:23,814][19107] Updated weights for policy 0, policy_version 172885 (0.0030) [2024-06-18 16:54:25,501][18875] Fps is (10 sec: 39321.3, 60 sec: 40413.7, 300 sec: 41099.3). Total num frames: 2832596992. Throughput: 0: 41362.9. Samples: 56749720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:25,501][18875] Avg episode reward: [(0, '0.800')] [2024-06-18 16:54:27,887][19107] Updated weights for policy 0, policy_version 172895 (0.0038) [2024-06-18 16:54:30,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 2832826368. Throughput: 0: 41251.8. Samples: 56867960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:30,501][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 16:54:31,847][19107] Updated weights for policy 0, policy_version 172905 (0.0039) [2024-06-18 16:54:35,504][18875] Fps is (10 sec: 40945.9, 60 sec: 40684.5, 300 sec: 41154.4). Total num frames: 2833006592. Throughput: 0: 41383.0. Samples: 57120120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:35,505][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 16:54:35,678][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000172914_2833022976.pth... 
[2024-06-18 16:54:35,772][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000172311_2823143424.pth [2024-06-18 16:54:35,895][19107] Updated weights for policy 0, policy_version 172915 (0.0034) [2024-06-18 16:54:40,141][19107] Updated weights for policy 0, policy_version 172925 (0.0049) [2024-06-18 16:54:40,500][18875] Fps is (10 sec: 39321.3, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 2833219584. Throughput: 0: 41077.1. Samples: 57360300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:40,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 16:54:43,719][19107] Updated weights for policy 0, policy_version 172935 (0.0042) [2024-06-18 16:54:45,500][18875] Fps is (10 sec: 44252.9, 60 sec: 41779.3, 300 sec: 41209.9). Total num frames: 2833448960. Throughput: 0: 41392.9. Samples: 57487880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 16:54:45,501][18875] Avg episode reward: [(0, '0.396')] [2024-06-18 16:54:47,995][19107] Updated weights for policy 0, policy_version 172945 (0.0033) [2024-06-18 16:54:50,502][18875] Fps is (10 sec: 40951.3, 60 sec: 40958.6, 300 sec: 41154.1). Total num frames: 2833629184. Throughput: 0: 41287.9. Samples: 57733580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:54:50,503][18875] Avg episode reward: [(0, '0.423')] [2024-06-18 16:54:51,606][19107] Updated weights for policy 0, policy_version 172955 (0.0051) [2024-06-18 16:54:55,500][18875] Fps is (10 sec: 37683.1, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2833825792. Throughput: 0: 40982.3. Samples: 57971720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:54:55,501][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 16:54:56,213][19087] Signal inference workers to stop experience collection... (800 times) [2024-06-18 16:54:56,259][19087] Signal inference workers to resume experience collection... 
(800 times) [2024-06-18 16:54:56,266][19107] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-18 16:54:56,299][19107] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-18 16:54:56,399][19107] Updated weights for policy 0, policy_version 172965 (0.0031) [2024-06-18 16:54:59,465][19107] Updated weights for policy 0, policy_version 172975 (0.0031) [2024-06-18 16:55:00,500][18875] Fps is (10 sec: 42607.0, 60 sec: 41234.0, 300 sec: 41209.9). Total num frames: 2834055168. Throughput: 0: 41285.3. Samples: 58097460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:00,501][18875] Avg episode reward: [(0, '0.409')] [2024-06-18 16:55:04,383][19107] Updated weights for policy 0, policy_version 172985 (0.0033) [2024-06-18 16:55:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40413.9, 300 sec: 41099.3). Total num frames: 2834235392. Throughput: 0: 40894.3. Samples: 58345040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:05,501][18875] Avg episode reward: [(0, '0.700')] [2024-06-18 16:55:07,503][19107] Updated weights for policy 0, policy_version 172995 (0.0049) [2024-06-18 16:55:10,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 2834448384. Throughput: 0: 40766.0. Samples: 58584180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:10,500][18875] Avg episode reward: [(0, '0.734')] [2024-06-18 16:55:12,172][19107] Updated weights for policy 0, policy_version 173005 (0.0030) [2024-06-18 16:55:15,258][19107] Updated weights for policy 0, policy_version 173015 (0.0039) [2024-06-18 16:55:15,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 2834677760. Throughput: 0: 41065.1. Samples: 58715900. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:15,501][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 16:55:19,914][19107] Updated weights for policy 0, policy_version 173025 (0.0026) [2024-06-18 16:55:20,500][18875] Fps is (10 sec: 39321.1, 60 sec: 40140.9, 300 sec: 41043.8). Total num frames: 2834841600. Throughput: 0: 40784.6. Samples: 58955280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:20,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 16:55:23,581][19107] Updated weights for policy 0, policy_version 173035 (0.0043) [2024-06-18 16:55:25,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.3, 300 sec: 41209.9). Total num frames: 2835087360. Throughput: 0: 40925.8. Samples: 59201960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:25,501][18875] Avg episode reward: [(0, '0.385')] [2024-06-18 16:55:27,515][19107] Updated weights for policy 0, policy_version 173045 (0.0044) [2024-06-18 16:55:30,500][18875] Fps is (10 sec: 44237.4, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2835283968. Throughput: 0: 40817.4. Samples: 59324660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:30,500][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 16:55:31,486][19107] Updated weights for policy 0, policy_version 173055 (0.0044) [2024-06-18 16:55:35,500][18875] Fps is (10 sec: 37682.7, 60 sec: 40962.4, 300 sec: 40987.8). Total num frames: 2835464192. Throughput: 0: 40866.7. Samples: 59572500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:35,501][18875] Avg episode reward: [(0, '0.420')] [2024-06-18 16:55:35,927][19107] Updated weights for policy 0, policy_version 173065 (0.0051) [2024-06-18 16:55:39,804][19107] Updated weights for policy 0, policy_version 173075 (0.0029) [2024-06-18 16:55:40,500][18875] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2835677184. Throughput: 0: 40955.1. Samples: 59814700. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:40,501][18875] Avg episode reward: [(0, '0.548')] [2024-06-18 16:55:43,773][19107] Updated weights for policy 0, policy_version 173085 (0.0032) [2024-06-18 16:55:45,500][18875] Fps is (10 sec: 42599.1, 60 sec: 40687.0, 300 sec: 41099.4). Total num frames: 2835890176. Throughput: 0: 40878.8. Samples: 59937000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:45,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 16:55:47,707][19107] Updated weights for policy 0, policy_version 173095 (0.0038) [2024-06-18 16:55:50,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40961.4, 300 sec: 40987.8). Total num frames: 2836086784. Throughput: 0: 40750.7. Samples: 60178820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 16:55:50,501][18875] Avg episode reward: [(0, '0.320')] [2024-06-18 16:55:52,228][19107] Updated weights for policy 0, policy_version 173105 (0.0032) [2024-06-18 16:55:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2836299776. Throughput: 0: 40925.7. Samples: 60425840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:55:55,501][18875] Avg episode reward: [(0, '0.381')] [2024-06-18 16:55:55,578][19107] Updated weights for policy 0, policy_version 173115 (0.0033) [2024-06-18 16:56:00,015][19107] Updated weights for policy 0, policy_version 173125 (0.0029) [2024-06-18 16:56:00,504][18875] Fps is (10 sec: 42583.1, 60 sec: 40957.6, 300 sec: 41153.9). Total num frames: 2836512768. Throughput: 0: 40780.8. Samples: 60551180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:00,505][18875] Avg episode reward: [(0, '0.319')] [2024-06-18 16:56:03,516][19107] Updated weights for policy 0, policy_version 173135 (0.0037) [2024-06-18 16:56:05,500][18875] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 2836692992. Throughput: 0: 40974.3. Samples: 60799120. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:05,501][18875] Avg episode reward: [(0, '0.381')] [2024-06-18 16:56:07,824][19107] Updated weights for policy 0, policy_version 173145 (0.0036) [2024-06-18 16:56:10,500][18875] Fps is (10 sec: 42613.6, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 2836938752. Throughput: 0: 40971.9. Samples: 61045700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:10,501][18875] Avg episode reward: [(0, '0.381')] [2024-06-18 16:56:11,814][19107] Updated weights for policy 0, policy_version 173155 (0.0031) [2024-06-18 16:56:15,467][19107] Updated weights for policy 0, policy_version 173165 (0.0041) [2024-06-18 16:56:15,500][18875] Fps is (10 sec: 44236.9, 60 sec: 40960.1, 300 sec: 41098.8). Total num frames: 2837135360. Throughput: 0: 41139.1. Samples: 61175920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:15,501][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 16:56:19,673][19107] Updated weights for policy 0, policy_version 173175 (0.0034) [2024-06-18 16:56:20,501][18875] Fps is (10 sec: 39321.1, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 2837331968. Throughput: 0: 41112.8. Samples: 61422580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:20,501][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 16:56:23,159][19087] Signal inference workers to stop experience collection... (850 times) [2024-06-18 16:56:23,204][19107] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-18 16:56:23,219][19087] Signal inference workers to resume experience collection... (850 times) [2024-06-18 16:56:23,220][19107] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-18 16:56:23,370][19107] Updated weights for policy 0, policy_version 173185 (0.0030) [2024-06-18 16:56:25,500][18875] Fps is (10 sec: 42597.5, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 2837561344. Throughput: 0: 41102.1. 
Samples: 61664300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:25,501][18875] Avg episode reward: [(0, '0.722')] [2024-06-18 16:56:27,459][19107] Updated weights for policy 0, policy_version 173195 (0.0032) [2024-06-18 16:56:30,500][18875] Fps is (10 sec: 37683.5, 60 sec: 40413.7, 300 sec: 40932.2). Total num frames: 2837708800. Throughput: 0: 41231.0. Samples: 61792400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:30,504][18875] Avg episode reward: [(0, '0.761')] [2024-06-18 16:56:31,255][19107] Updated weights for policy 0, policy_version 173205 (0.0027) [2024-06-18 16:56:35,441][19107] Updated weights for policy 0, policy_version 173215 (0.0032) [2024-06-18 16:56:35,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 2837954560. Throughput: 0: 41337.7. Samples: 62039020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:35,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 16:56:35,519][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000173215_2837954560.pth... [2024-06-18 16:56:35,571][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000172612_2828075008.pth [2024-06-18 16:56:39,509][19107] Updated weights for policy 0, policy_version 173225 (0.0050) [2024-06-18 16:56:40,500][18875] Fps is (10 sec: 45875.8, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 2838167552. Throughput: 0: 41182.7. Samples: 62279060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:40,500][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 16:56:43,616][19107] Updated weights for policy 0, policy_version 173235 (0.0037) [2024-06-18 16:56:45,502][18875] Fps is (10 sec: 37676.3, 60 sec: 40685.6, 300 sec: 40987.5). Total num frames: 2838331392. Throughput: 0: 41260.6. Samples: 62407840. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:45,503][18875] Avg episode reward: [(0, '0.328')] [2024-06-18 16:56:47,453][19107] Updated weights for policy 0, policy_version 173245 (0.0041) [2024-06-18 16:56:50,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 2838577152. Throughput: 0: 40986.5. Samples: 62643520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:50,508][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 16:56:51,654][19107] Updated weights for policy 0, policy_version 173255 (0.0042) [2024-06-18 16:56:55,437][19107] Updated weights for policy 0, policy_version 173265 (0.0030) [2024-06-18 16:56:55,500][18875] Fps is (10 sec: 44246.0, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 2838773760. Throughput: 0: 41295.7. Samples: 62904000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 16:56:55,500][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 16:56:59,605][19107] Updated weights for policy 0, policy_version 173275 (0.0052) [2024-06-18 16:57:00,500][18875] Fps is (10 sec: 37683.7, 60 sec: 40689.4, 300 sec: 41043.3). Total num frames: 2838953984. Throughput: 0: 40915.1. Samples: 63017100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:00,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 16:57:03,298][19107] Updated weights for policy 0, policy_version 173285 (0.0038) [2024-06-18 16:57:05,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 2839199744. Throughput: 0: 40944.2. Samples: 63265060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:05,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 16:57:07,486][19107] Updated weights for policy 0, policy_version 173295 (0.0048) [2024-06-18 16:57:10,500][18875] Fps is (10 sec: 39321.9, 60 sec: 40140.9, 300 sec: 40932.3). Total num frames: 2839347200. Throughput: 0: 41319.3. Samples: 63523660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:10,500][18875] Avg episode reward: [(0, '0.310')] [2024-06-18 16:57:11,293][19107] Updated weights for policy 0, policy_version 173305 (0.0048) [2024-06-18 16:57:15,322][19107] Updated weights for policy 0, policy_version 173315 (0.0040) [2024-06-18 16:57:15,500][18875] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 2839592960. Throughput: 0: 40958.2. Samples: 63635520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:15,501][18875] Avg episode reward: [(0, '0.277')] [2024-06-18 16:57:19,246][19107] Updated weights for policy 0, policy_version 173325 (0.0032) [2024-06-18 16:57:20,500][18875] Fps is (10 sec: 47513.0, 60 sec: 41506.3, 300 sec: 41154.4). Total num frames: 2839822336. Throughput: 0: 41042.3. Samples: 63885920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:20,501][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 16:57:23,077][19107] Updated weights for policy 0, policy_version 173335 (0.0031) [2024-06-18 16:57:25,500][18875] Fps is (10 sec: 37683.4, 60 sec: 40140.9, 300 sec: 40988.3). Total num frames: 2839969792. Throughput: 0: 41381.3. Samples: 64141220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:25,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 16:57:27,071][19107] Updated weights for policy 0, policy_version 173345 (0.0034) [2024-06-18 16:57:30,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 2840215552. Throughput: 0: 41015.0. Samples: 64253440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:30,501][18875] Avg episode reward: [(0, '0.717')] [2024-06-18 16:57:30,881][19107] Updated weights for policy 0, policy_version 173355 (0.0038) [2024-06-18 16:57:33,575][19087] Signal inference workers to stop experience collection... 
(900 times) [2024-06-18 16:57:33,576][19087] Signal inference workers to resume experience collection... (900 times) [2024-06-18 16:57:33,606][19107] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-18 16:57:33,606][19107] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-18 16:57:35,102][19107] Updated weights for policy 0, policy_version 173365 (0.0041) [2024-06-18 16:57:35,500][18875] Fps is (10 sec: 45875.3, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2840428544. Throughput: 0: 41474.7. Samples: 64509880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:35,501][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 16:57:38,809][19107] Updated weights for policy 0, policy_version 173375 (0.0037) [2024-06-18 16:57:40,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 2840592384. Throughput: 0: 41240.8. Samples: 64759840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:40,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 16:57:42,984][19107] Updated weights for policy 0, policy_version 173385 (0.0033) [2024-06-18 16:57:45,504][18875] Fps is (10 sec: 40945.2, 60 sec: 41778.0, 300 sec: 41264.9). Total num frames: 2840838144. Throughput: 0: 41312.6. Samples: 64876320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:45,505][18875] Avg episode reward: [(0, '0.413')] [2024-06-18 16:57:47,042][19107] Updated weights for policy 0, policy_version 173395 (0.0045) [2024-06-18 16:57:50,500][18875] Fps is (10 sec: 44236.4, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2841034752. Throughput: 0: 41479.0. Samples: 65131620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:50,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 16:57:50,877][19107] Updated weights for policy 0, policy_version 173405 (0.0029) [2024-06-18 16:57:54,984][19107] Updated weights for policy 0, policy_version 173415 (0.0045) [2024-06-18 16:57:55,500][18875] Fps is (10 sec: 39335.7, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 2841231360. Throughput: 0: 40990.1. Samples: 65368220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:57:55,503][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 16:57:59,050][19107] Updated weights for policy 0, policy_version 173425 (0.0035) [2024-06-18 16:58:00,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 2841460736. Throughput: 0: 41430.2. Samples: 65499880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 16:58:00,502][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 16:58:02,781][19107] Updated weights for policy 0, policy_version 173435 (0.0039) [2024-06-18 16:58:05,500][18875] Fps is (10 sec: 39322.0, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 2841624576. Throughput: 0: 41293.4. Samples: 65744120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:05,501][18875] Avg episode reward: [(0, '0.351')] [2024-06-18 16:58:06,955][19107] Updated weights for policy 0, policy_version 173445 (0.0032) [2024-06-18 16:58:10,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 2841853952. Throughput: 0: 41026.6. Samples: 65987420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:10,501][18875] Avg episode reward: [(0, '0.370')] [2024-06-18 16:58:10,955][19107] Updated weights for policy 0, policy_version 173455 (0.0042) [2024-06-18 16:58:14,730][19107] Updated weights for policy 0, policy_version 173465 (0.0030) [2024-06-18 16:58:15,500][18875] Fps is (10 sec: 45875.3, 60 sec: 41506.2, 300 sec: 41209.9). 
Total num frames: 2842083328. Throughput: 0: 41456.6. Samples: 66118980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:15,501][18875] Avg episode reward: [(0, '0.699')] [2024-06-18 16:58:19,231][19107] Updated weights for policy 0, policy_version 173475 (0.0038) [2024-06-18 16:58:20,500][18875] Fps is (10 sec: 39322.2, 60 sec: 40413.9, 300 sec: 40932.2). Total num frames: 2842247168. Throughput: 0: 41134.7. Samples: 66360940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:20,501][18875] Avg episode reward: [(0, '0.325')] [2024-06-18 16:58:22,799][19107] Updated weights for policy 0, policy_version 173485 (0.0032) [2024-06-18 16:58:25,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41209.9). Total num frames: 2842476544. Throughput: 0: 40972.1. Samples: 66603580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:25,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 16:58:27,072][19107] Updated weights for policy 0, policy_version 173495 (0.0045) [2024-06-18 16:58:30,502][18875] Fps is (10 sec: 44228.9, 60 sec: 41232.0, 300 sec: 41098.6). Total num frames: 2842689536. Throughput: 0: 41235.1. Samples: 66731820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:30,503][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 16:58:30,759][19107] Updated weights for policy 0, policy_version 173505 (0.0030) [2024-06-18 16:58:35,005][19107] Updated weights for policy 0, policy_version 173515 (0.0035) [2024-06-18 16:58:35,500][18875] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 2842869760. Throughput: 0: 40977.9. Samples: 66975620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:35,501][18875] Avg episode reward: [(0, '0.432')] [2024-06-18 16:58:35,535][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000173515_2842869760.pth... 
[2024-06-18 16:58:35,599][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000172914_2833022976.pth [2024-06-18 16:58:38,899][19107] Updated weights for policy 0, policy_version 173525 (0.0034) [2024-06-18 16:58:40,504][18875] Fps is (10 sec: 40952.3, 60 sec: 41776.7, 300 sec: 41209.4). Total num frames: 2843099136. Throughput: 0: 41179.0. Samples: 67221420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:40,504][18875] Avg episode reward: [(0, '0.436')] [2024-06-18 16:58:42,931][19107] Updated weights for policy 0, policy_version 173535 (0.0033) [2024-06-18 16:58:45,500][18875] Fps is (10 sec: 40959.8, 60 sec: 40689.4, 300 sec: 41043.3). Total num frames: 2843279360. Throughput: 0: 41070.3. Samples: 67348040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:45,501][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 16:58:46,570][19107] Updated weights for policy 0, policy_version 173545 (0.0039) [2024-06-18 16:58:50,500][18875] Fps is (10 sec: 39335.7, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2843492352. Throughput: 0: 41096.0. Samples: 67593440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:50,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 16:58:50,887][19107] Updated weights for policy 0, policy_version 173555 (0.0034) [2024-06-18 16:58:54,475][19107] Updated weights for policy 0, policy_version 173565 (0.0043) [2024-06-18 16:58:55,500][18875] Fps is (10 sec: 45875.6, 60 sec: 41779.3, 300 sec: 41210.1). Total num frames: 2843738112. Throughput: 0: 41015.2. Samples: 67833100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:58:55,501][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 16:58:59,016][19107] Updated weights for policy 0, policy_version 173575 (0.0049) [2024-06-18 16:59:00,500][18875] Fps is (10 sec: 40959.9, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 2843901952. Throughput: 0: 41012.4. Samples: 67964540. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:59:00,501][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 16:59:02,583][19087] Signal inference workers to stop experience collection... (950 times) [2024-06-18 16:59:02,587][19087] Signal inference workers to resume experience collection... (950 times) [2024-06-18 16:59:02,591][19107] Updated weights for policy 0, policy_version 173585 (0.0049) [2024-06-18 16:59:02,604][19107] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-18 16:59:02,604][19107] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-18 16:59:05,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2844114944. Throughput: 0: 41064.0. Samples: 68208820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 16:59:05,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 16:59:06,936][19107] Updated weights for policy 0, policy_version 173595 (0.0035) [2024-06-18 16:59:10,480][19107] Updated weights for policy 0, policy_version 173605 (0.0038) [2024-06-18 16:59:10,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 2844344320. Throughput: 0: 41198.2. Samples: 68457500. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:10,501][18875] Avg episode reward: [(0, '0.660')] [2024-06-18 16:59:14,833][19107] Updated weights for policy 0, policy_version 173615 (0.0042) [2024-06-18 16:59:15,500][18875] Fps is (10 sec: 39321.3, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 2844508160. Throughput: 0: 41009.5. Samples: 68577180. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:15,501][18875] Avg episode reward: [(0, '0.375')] [2024-06-18 16:59:18,502][19107] Updated weights for policy 0, policy_version 173625 (0.0034) [2024-06-18 16:59:20,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 2844737536. Throughput: 0: 40860.9. 
Samples: 68814360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:20,500][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 16:59:22,866][19107] Updated weights for policy 0, policy_version 173635 (0.0031) [2024-06-18 16:59:25,500][18875] Fps is (10 sec: 39321.9, 60 sec: 40413.9, 300 sec: 40932.2). Total num frames: 2844901376. Throughput: 0: 41092.7. Samples: 69070440. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:25,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 16:59:26,488][19107] Updated weights for policy 0, policy_version 173645 (0.0031) [2024-06-18 16:59:30,500][18875] Fps is (10 sec: 40959.2, 60 sec: 40961.1, 300 sec: 41154.9). Total num frames: 2845147136. Throughput: 0: 40755.0. Samples: 69182020. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:30,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 16:59:30,871][19107] Updated weights for policy 0, policy_version 173655 (0.0041) [2024-06-18 16:59:34,435][19107] Updated weights for policy 0, policy_version 173665 (0.0054) [2024-06-18 16:59:35,504][18875] Fps is (10 sec: 44220.4, 60 sec: 41230.5, 300 sec: 41098.3). Total num frames: 2845343744. Throughput: 0: 40825.1. Samples: 69430720. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:35,505][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 16:59:39,019][19107] Updated weights for policy 0, policy_version 173675 (0.0039) [2024-06-18 16:59:40,500][18875] Fps is (10 sec: 39321.8, 60 sec: 40689.3, 300 sec: 40987.8). Total num frames: 2845540352. Throughput: 0: 41023.4. Samples: 69679160. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:40,501][18875] Avg episode reward: [(0, '0.708')] [2024-06-18 16:59:42,827][19107] Updated weights for policy 0, policy_version 173685 (0.0036) [2024-06-18 16:59:45,500][18875] Fps is (10 sec: 39335.5, 60 sec: 40959.9, 300 sec: 41043.6). Total num frames: 2845736960. Throughput: 0: 40703.0. Samples: 69796180. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:45,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 16:59:47,067][19107] Updated weights for policy 0, policy_version 173695 (0.0034) [2024-06-18 16:59:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 2845966336. Throughput: 0: 40912.7. Samples: 70049900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:50,504][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 16:59:50,878][19107] Updated weights for policy 0, policy_version 173705 (0.0042) [2024-06-18 16:59:54,814][19107] Updated weights for policy 0, policy_version 173715 (0.0041) [2024-06-18 16:59:55,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40140.7, 300 sec: 40987.8). Total num frames: 2846146560. Throughput: 0: 40807.1. Samples: 70293820. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 16:59:55,504][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 16:59:58,990][19107] Updated weights for policy 0, policy_version 173725 (0.0034) [2024-06-18 17:00:00,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2846375936. Throughput: 0: 40831.7. Samples: 70414600. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 17:00:00,500][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 17:00:02,685][19107] Updated weights for policy 0, policy_version 173735 (0.0036) [2024-06-18 17:00:05,500][18875] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 2846572544. Throughput: 0: 41178.1. Samples: 70667380. Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 17:00:05,501][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 17:00:06,715][19107] Updated weights for policy 0, policy_version 173745 (0.0037) [2024-06-18 17:00:10,500][18875] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 2846785536. Throughput: 0: 40783.1. Samples: 70905680. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 20.0) [2024-06-18 17:00:10,501][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 17:00:10,584][19107] Updated weights for policy 0, policy_version 173755 (0.0037) [2024-06-18 17:00:14,579][19107] Updated weights for policy 0, policy_version 173765 (0.0047) [2024-06-18 17:00:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2846982144. Throughput: 0: 41102.4. Samples: 71031620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:00:15,501][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 17:00:18,894][19107] Updated weights for policy 0, policy_version 173775 (0.0037) [2024-06-18 17:00:20,500][18875] Fps is (10 sec: 39321.1, 60 sec: 40686.8, 300 sec: 40987.7). Total num frames: 2847178752. Throughput: 0: 41013.0. Samples: 71276160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:00:20,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 17:00:22,579][19107] Updated weights for policy 0, policy_version 173785 (0.0033) [2024-06-18 17:00:23,435][19087] Signal inference workers to stop experience collection... (1000 times) [2024-06-18 17:00:23,492][19107] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-18 17:00:23,550][19087] Signal inference workers to resume experience collection... (1000 times) [2024-06-18 17:00:23,550][19107] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-18 17:00:25,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41043.3). Total num frames: 2847391744. Throughput: 0: 40934.2. Samples: 71521200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:00:25,501][18875] Avg episode reward: [(0, '0.777')] [2024-06-18 17:00:27,105][19107] Updated weights for policy 0, policy_version 173795 (0.0034) [2024-06-18 17:00:30,500][18875] Fps is (10 sec: 42599.4, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 2847604736. Throughput: 0: 41100.2. 
Samples: 71645680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:00:30,500][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 17:00:30,506][19107] Updated weights for policy 0, policy_version 173805 (0.0048) [2024-06-18 17:00:34,993][19107] Updated weights for policy 0, policy_version 173815 (0.0029) [2024-06-18 17:00:35,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40962.4, 300 sec: 41098.8). Total num frames: 2847801344. Throughput: 0: 40884.5. Samples: 71889700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:00:35,504][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 17:00:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000173816_2847801344.pth... [2024-06-18 17:00:35,576][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000173215_2837954560.pth [2024-06-18 17:00:38,529][19107] Updated weights for policy 0, policy_version 173825 (0.0035) [2024-06-18 17:00:40,500][18875] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 2847997952. Throughput: 0: 40863.7. Samples: 72132680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:00:40,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 17:00:43,224][19107] Updated weights for policy 0, policy_version 173835 (0.0046) [2024-06-18 17:00:45,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 2848210944. Throughput: 0: 41084.9. Samples: 72263420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:00:45,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 17:00:46,592][19107] Updated weights for policy 0, policy_version 173845 (0.0043) [2024-06-18 17:00:50,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 2848407552. Throughput: 0: 40897.8. Samples: 72507780. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:00:50,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 17:00:51,165][19107] Updated weights for policy 0, policy_version 173855 (0.0048) [2024-06-18 17:00:54,570][19107] Updated weights for policy 0, policy_version 173865 (0.0051) [2024-06-18 17:00:55,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41043.8). Total num frames: 2848620544. Throughput: 0: 41060.4. Samples: 72753400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:00:55,501][18875] Avg episode reward: [(0, '0.784')] [2024-06-18 17:00:58,994][19107] Updated weights for policy 0, policy_version 173875 (0.0028) [2024-06-18 17:01:00,500][18875] Fps is (10 sec: 40959.7, 60 sec: 40686.8, 300 sec: 41098.8). Total num frames: 2848817152. Throughput: 0: 40984.7. Samples: 72875940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:01:00,501][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 17:01:02,946][19107] Updated weights for policy 0, policy_version 173885 (0.0041) [2024-06-18 17:01:05,500][18875] Fps is (10 sec: 39321.1, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 2849013760. Throughput: 0: 41096.9. Samples: 73125520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:01:05,501][18875] Avg episode reward: [(0, '0.335')] [2024-06-18 17:01:06,889][19107] Updated weights for policy 0, policy_version 173895 (0.0042) [2024-06-18 17:01:10,500][18875] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 2849243136. Throughput: 0: 41031.6. Samples: 73367620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:01:10,501][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 17:01:11,136][19107] Updated weights for policy 0, policy_version 173905 (0.0028) [2024-06-18 17:01:14,807][19107] Updated weights for policy 0, policy_version 173915 (0.0028) [2024-06-18 17:01:15,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41233.0, 300 sec: 41098.9). 
Total num frames: 2849456128. Throughput: 0: 41156.8. Samples: 73497740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:01:15,501][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 17:01:19,149][19107] Updated weights for policy 0, policy_version 173925 (0.0045) [2024-06-18 17:01:20,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 2849652736. Throughput: 0: 41209.9. Samples: 73744140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:01:20,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 17:01:23,025][19107] Updated weights for policy 0, policy_version 173935 (0.0032) [2024-06-18 17:01:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 2849882112. Throughput: 0: 41122.1. Samples: 73983180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:01:25,504][18875] Avg episode reward: [(0, '0.207')] [2024-06-18 17:01:27,088][19107] Updated weights for policy 0, policy_version 173945 (0.0045) [2024-06-18 17:01:30,500][18875] Fps is (10 sec: 39320.7, 60 sec: 40686.8, 300 sec: 40987.8). Total num frames: 2850045952. Throughput: 0: 41051.4. Samples: 74110740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:01:30,502][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 17:01:30,977][19107] Updated weights for policy 0, policy_version 173955 (0.0043) [2024-06-18 17:01:35,097][19107] Updated weights for policy 0, policy_version 173965 (0.0039) [2024-06-18 17:01:35,501][18875] Fps is (10 sec: 37682.3, 60 sec: 40959.9, 300 sec: 40987.7). Total num frames: 2850258944. Throughput: 0: 41069.6. Samples: 74355920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:01:35,501][18875] Avg episode reward: [(0, '0.489')] [2024-06-18 17:01:38,971][19107] Updated weights for policy 0, policy_version 173975 (0.0040) [2024-06-18 17:01:40,500][18875] Fps is (10 sec: 44237.2, 60 sec: 41506.1, 300 sec: 41210.2). 
Total num frames: 2850488320. Throughput: 0: 41104.4. Samples: 74603100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:01:40,501][18875] Avg episode reward: [(0, '0.398')] [2024-06-18 17:01:43,059][19107] Updated weights for policy 0, policy_version 173985 (0.0031) [2024-06-18 17:01:45,500][18875] Fps is (10 sec: 40960.6, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 2850668544. Throughput: 0: 41236.5. Samples: 74731580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:01:45,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 17:01:46,866][19107] Updated weights for policy 0, policy_version 173995 (0.0030) [2024-06-18 17:01:50,500][18875] Fps is (10 sec: 37683.3, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 2850865152. Throughput: 0: 41116.5. Samples: 74975760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:01:50,501][18875] Avg episode reward: [(0, '0.763')] [2024-06-18 17:01:51,009][19107] Updated weights for policy 0, policy_version 174005 (0.0045) [2024-06-18 17:01:54,798][19107] Updated weights for policy 0, policy_version 174015 (0.0036) [2024-06-18 17:01:55,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2851094528. Throughput: 0: 41272.1. Samples: 75224860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:01:55,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 17:01:58,789][19107] Updated weights for policy 0, policy_version 174025 (0.0037) [2024-06-18 17:02:00,500][18875] Fps is (10 sec: 45875.5, 60 sec: 41779.3, 300 sec: 41098.9). Total num frames: 2851323904. Throughput: 0: 41215.2. Samples: 75352420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:02:00,500][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 17:02:02,474][19087] Signal inference workers to stop experience collection... (1050 times) [2024-06-18 17:02:02,475][19087] Signal inference workers to resume experience collection... 
(1050 times) [2024-06-18 17:02:02,524][19107] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-18 17:02:02,524][19107] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-18 17:02:02,613][19107] Updated weights for policy 0, policy_version 174035 (0.0030) [2024-06-18 17:02:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 2851504128. Throughput: 0: 41152.4. Samples: 75596000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:02:05,501][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 17:02:06,601][19107] Updated weights for policy 0, policy_version 174045 (0.0038) [2024-06-18 17:02:10,500][18875] Fps is (10 sec: 37682.7, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 2851700736. Throughput: 0: 41459.5. Samples: 75848860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:02:10,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 17:02:10,641][19107] Updated weights for policy 0, policy_version 174055 (0.0042) [2024-06-18 17:02:14,737][19107] Updated weights for policy 0, policy_version 174065 (0.0042) [2024-06-18 17:02:15,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 2851913728. Throughput: 0: 41300.6. Samples: 75969260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:02:15,501][18875] Avg episode reward: [(0, '0.760')] [2024-06-18 17:02:18,529][19107] Updated weights for policy 0, policy_version 174075 (0.0030) [2024-06-18 17:02:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41232.9, 300 sec: 41209.9). Total num frames: 2852126720. Throughput: 0: 41284.6. Samples: 76213720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:02:20,501][18875] Avg episode reward: [(0, '0.672')] [2024-06-18 17:02:22,558][19107] Updated weights for policy 0, policy_version 174085 (0.0040) [2024-06-18 17:02:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40686.9, 300 sec: 41043.3). 
Total num frames: 2852323328. Throughput: 0: 41333.3. Samples: 76463100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 24.0) [2024-06-18 17:02:25,504][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 17:02:26,585][19107] Updated weights for policy 0, policy_version 174095 (0.0036) [2024-06-18 17:02:30,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 2852519936. Throughput: 0: 41267.2. Samples: 76588600. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:02:30,501][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 17:02:30,697][19107] Updated weights for policy 0, policy_version 174105 (0.0040) [2024-06-18 17:02:34,469][19107] Updated weights for policy 0, policy_version 174115 (0.0040) [2024-06-18 17:02:35,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41506.3, 300 sec: 41209.9). Total num frames: 2852749312. Throughput: 0: 41485.8. Samples: 76842620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:02:35,501][18875] Avg episode reward: [(0, '0.422')] [2024-06-18 17:02:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000174118_2852749312.pth... [2024-06-18 17:02:35,591][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000173515_2842869760.pth [2024-06-18 17:02:38,609][19107] Updated weights for policy 0, policy_version 174125 (0.0048) [2024-06-18 17:02:40,500][18875] Fps is (10 sec: 44236.2, 60 sec: 41233.0, 300 sec: 41099.3). Total num frames: 2852962304. Throughput: 0: 41214.1. Samples: 77079500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:02:40,501][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 17:02:42,293][19107] Updated weights for policy 0, policy_version 174135 (0.0033) [2024-06-18 17:02:45,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41098.9). Total num frames: 2853158912. Throughput: 0: 41209.6. Samples: 77206860. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:02:45,501][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 17:02:46,515][19107] Updated weights for policy 0, policy_version 174145 (0.0034) [2024-06-18 17:02:50,313][19107] Updated weights for policy 0, policy_version 174155 (0.0037) [2024-06-18 17:02:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41154.4). Total num frames: 2853371904. Throughput: 0: 41266.6. Samples: 77453000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:02:50,501][18875] Avg episode reward: [(0, '0.293')] [2024-06-18 17:02:54,431][19107] Updated weights for policy 0, policy_version 174165 (0.0045) [2024-06-18 17:02:55,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 2853568512. Throughput: 0: 41145.8. Samples: 77700420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:02:55,501][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 17:02:58,092][19107] Updated weights for policy 0, policy_version 174175 (0.0049) [2024-06-18 17:03:00,500][18875] Fps is (10 sec: 39322.3, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 2853765120. Throughput: 0: 41117.4. Samples: 77819540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:03:00,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 17:03:02,433][19107] Updated weights for policy 0, policy_version 174185 (0.0029) [2024-06-18 17:03:05,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2853994496. Throughput: 0: 41168.1. Samples: 78066280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:03:05,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 17:03:06,029][19107] Updated weights for policy 0, policy_version 174195 (0.0049) [2024-06-18 17:03:10,294][19107] Updated weights for policy 0, policy_version 174205 (0.0032) [2024-06-18 17:03:10,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 41043.3). 
Total num frames: 2854191104. Throughput: 0: 41238.2. Samples: 78318820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:03:10,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 17:03:13,890][19107] Updated weights for policy 0, policy_version 174215 (0.0040) [2024-06-18 17:03:15,500][18875] Fps is (10 sec: 37683.3, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 2854371328. Throughput: 0: 40971.6. Samples: 78432320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:03:15,500][18875] Avg episode reward: [(0, '0.450')] [2024-06-18 17:03:18,291][19107] Updated weights for policy 0, policy_version 174225 (0.0034) [2024-06-18 17:03:20,262][19087] Signal inference workers to stop experience collection... (1100 times) [2024-06-18 17:03:20,303][19107] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-18 17:03:20,312][19087] Signal inference workers to resume experience collection... (1100 times) [2024-06-18 17:03:20,322][19107] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-18 17:03:20,500][18875] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 2854633472. Throughput: 0: 40948.8. Samples: 78685320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:03:20,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 17:03:21,787][19107] Updated weights for policy 0, policy_version 174235 (0.0031) [2024-06-18 17:03:25,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 40988.0). Total num frames: 2854780928. Throughput: 0: 41343.3. Samples: 78939940. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:03:25,501][18875] Avg episode reward: [(0, '0.790')] [2024-06-18 17:03:26,199][19107] Updated weights for policy 0, policy_version 174245 (0.0040) [2024-06-18 17:03:29,621][19107] Updated weights for policy 0, policy_version 174255 (0.0024) [2024-06-18 17:03:30,504][18875] Fps is (10 sec: 37669.7, 60 sec: 41503.6, 300 sec: 41153.9). Total num frames: 2855010304. Throughput: 0: 41062.5. Samples: 79054820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 17:03:30,505][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 17:03:34,339][19107] Updated weights for policy 0, policy_version 174265 (0.0046) [2024-06-18 17:03:35,500][18875] Fps is (10 sec: 45874.4, 60 sec: 41506.0, 300 sec: 41154.9). Total num frames: 2855239680. Throughput: 0: 41143.1. Samples: 79304440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:03:35,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 17:03:37,662][19107] Updated weights for policy 0, policy_version 174275 (0.0033) [2024-06-18 17:03:40,500][18875] Fps is (10 sec: 39335.8, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 2855403520. Throughput: 0: 41154.7. Samples: 79552380. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:03:40,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 17:03:42,143][19107] Updated weights for policy 0, policy_version 174285 (0.0042) [2024-06-18 17:03:45,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 2855632896. Throughput: 0: 41168.7. Samples: 79672140. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:03:45,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 17:03:45,976][19107] Updated weights for policy 0, policy_version 174295 (0.0037) [2024-06-18 17:03:49,936][19107] Updated weights for policy 0, policy_version 174305 (0.0034) [2024-06-18 17:03:50,500][18875] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 40987.7). 
Total num frames: 2855829504. Throughput: 0: 41296.3. Samples: 79924620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:03:50,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 17:03:53,706][19107] Updated weights for policy 0, policy_version 174315 (0.0036) [2024-06-18 17:03:55,501][18875] Fps is (10 sec: 37683.1, 60 sec: 40686.8, 300 sec: 41043.3). Total num frames: 2856009728. Throughput: 0: 41209.2. Samples: 80173240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:03:55,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 17:03:57,894][19107] Updated weights for policy 0, policy_version 174325 (0.0041) [2024-06-18 17:04:00,500][18875] Fps is (10 sec: 44237.2, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 2856271872. Throughput: 0: 41217.2. Samples: 80287100. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:04:00,502][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 17:04:01,975][19107] Updated weights for policy 0, policy_version 174335 (0.0026) [2024-06-18 17:04:05,500][18875] Fps is (10 sec: 40960.4, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 2856419328. Throughput: 0: 41137.7. Samples: 80536520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:04:05,501][18875] Avg episode reward: [(0, '0.409')] [2024-06-18 17:04:06,041][19107] Updated weights for policy 0, policy_version 174345 (0.0042) [2024-06-18 17:04:09,736][19107] Updated weights for policy 0, policy_version 174355 (0.0039) [2024-06-18 17:04:10,500][18875] Fps is (10 sec: 36044.6, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 2856632320. Throughput: 0: 40866.5. Samples: 80778940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:04:10,501][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 17:04:14,104][19107] Updated weights for policy 0, policy_version 174365 (0.0035) [2024-06-18 17:04:15,500][18875] Fps is (10 sec: 45875.8, 60 sec: 41779.2, 300 sec: 41154.4). 
Total num frames: 2856878080. Throughput: 0: 41238.5. Samples: 80910400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:04:15,508][18875] Avg episode reward: [(0, '0.418')] [2024-06-18 17:04:17,486][19107] Updated weights for policy 0, policy_version 174375 (0.0029) [2024-06-18 17:04:20,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40140.8, 300 sec: 41154.4). Total num frames: 2857041920. Throughput: 0: 41117.4. Samples: 81154720. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:04:20,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 17:04:21,970][19107] Updated weights for policy 0, policy_version 174385 (0.0047) [2024-06-18 17:04:25,246][19107] Updated weights for policy 0, policy_version 174395 (0.0050) [2024-06-18 17:04:25,501][18875] Fps is (10 sec: 40958.8, 60 sec: 41779.0, 300 sec: 41154.4). Total num frames: 2857287680. Throughput: 0: 40978.9. Samples: 81396440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:04:25,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 17:04:29,931][19107] Updated weights for policy 0, policy_version 174405 (0.0029) [2024-06-18 17:04:30,501][18875] Fps is (10 sec: 42597.2, 60 sec: 40962.3, 300 sec: 41099.3). Total num frames: 2857467904. Throughput: 0: 41137.6. Samples: 81523340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:04:30,501][18875] Avg episode reward: [(0, '0.815')] [2024-06-18 17:04:33,019][19107] Updated weights for policy 0, policy_version 174415 (0.0044) [2024-06-18 17:04:35,500][18875] Fps is (10 sec: 34407.6, 60 sec: 39867.9, 300 sec: 40987.8). Total num frames: 2857631744. Throughput: 0: 40916.2. Samples: 81765840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 22.0) [2024-06-18 17:04:35,500][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 17:04:35,675][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000174418_2857664512.pth... 
[2024-06-18 17:04:35,729][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000173816_2847801344.pth [2024-06-18 17:04:37,813][19107] Updated weights for policy 0, policy_version 174425 (0.0038) [2024-06-18 17:04:40,500][18875] Fps is (10 sec: 44237.9, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 2857910272. Throughput: 0: 40709.5. Samples: 82005160. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:04:40,501][18875] Avg episode reward: [(0, '0.551')] [2024-06-18 17:04:40,897][19107] Updated weights for policy 0, policy_version 174435 (0.0038) [2024-06-18 17:04:45,500][18875] Fps is (10 sec: 45874.1, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 2858090496. Throughput: 0: 41247.5. Samples: 82143240. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:04:45,501][18875] Avg episode reward: [(0, '0.627')] [2024-06-18 17:04:45,735][19107] Updated weights for policy 0, policy_version 174445 (0.0035) [2024-06-18 17:04:47,925][19087] Signal inference workers to stop experience collection... (1150 times) [2024-06-18 17:04:47,925][19087] Signal inference workers to resume experience collection... (1150 times) [2024-06-18 17:04:47,943][19107] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-18 17:04:47,943][19107] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-18 17:04:49,626][19107] Updated weights for policy 0, policy_version 174455 (0.0042) [2024-06-18 17:04:50,500][18875] Fps is (10 sec: 36044.7, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 2858270720. Throughput: 0: 40862.7. Samples: 82375340. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:04:50,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 17:04:53,720][19107] Updated weights for policy 0, policy_version 174465 (0.0046) [2024-06-18 17:04:55,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 2858483712. Throughput: 0: 40902.6. 
Samples: 82619560. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:04:55,501][18875] Avg episode reward: [(0, '0.416')] [2024-06-18 17:04:57,537][19107] Updated weights for policy 0, policy_version 174475 (0.0051) [2024-06-18 17:05:00,500][18875] Fps is (10 sec: 39321.9, 60 sec: 39867.8, 300 sec: 40987.8). Total num frames: 2858663936. Throughput: 0: 40719.1. Samples: 82742760. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:05:00,501][18875] Avg episode reward: [(0, '0.286')] [2024-06-18 17:05:01,720][19107] Updated weights for policy 0, policy_version 174485 (0.0032) [2024-06-18 17:05:05,339][19107] Updated weights for policy 0, policy_version 174495 (0.0035) [2024-06-18 17:05:05,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 41154.4). Total num frames: 2858926080. Throughput: 0: 40823.1. Samples: 82991760. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:05:05,501][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 17:05:09,695][19107] Updated weights for policy 0, policy_version 174505 (0.0034) [2024-06-18 17:05:10,500][18875] Fps is (10 sec: 45875.6, 60 sec: 41506.3, 300 sec: 41154.4). Total num frames: 2859122688. Throughput: 0: 40949.6. Samples: 83239160. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:05:10,500][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 17:05:13,253][19107] Updated weights for policy 0, policy_version 174515 (0.0033) [2024-06-18 17:05:15,500][18875] Fps is (10 sec: 36045.0, 60 sec: 40140.8, 300 sec: 41043.3). Total num frames: 2859286528. Throughput: 0: 40770.5. Samples: 83358000. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:05:15,501][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 17:05:17,674][19107] Updated weights for policy 0, policy_version 174525 (0.0029) [2024-06-18 17:05:20,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 2859532288. Throughput: 0: 40987.5. Samples: 83610280. 
Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:05:20,501][18875] Avg episode reward: [(0, '0.450')] [2024-06-18 17:05:21,248][19107] Updated weights for policy 0, policy_version 174535 (0.0045) [2024-06-18 17:05:25,359][19107] Updated weights for policy 0, policy_version 174545 (0.0035) [2024-06-18 17:05:25,504][18875] Fps is (10 sec: 45858.7, 60 sec: 40957.8, 300 sec: 41153.9). Total num frames: 2859745280. Throughput: 0: 41301.7. Samples: 83863880. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:05:25,504][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 17:05:29,259][19107] Updated weights for policy 0, policy_version 174555 (0.0040) [2024-06-18 17:05:30,500][18875] Fps is (10 sec: 37683.1, 60 sec: 40687.1, 300 sec: 41043.3). Total num frames: 2859909120. Throughput: 0: 40944.2. Samples: 83985720. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:05:30,501][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 17:05:33,177][19107] Updated weights for policy 0, policy_version 174565 (0.0038) [2024-06-18 17:05:35,500][18875] Fps is (10 sec: 40974.7, 60 sec: 42052.2, 300 sec: 41209.9). Total num frames: 2860154880. Throughput: 0: 41327.2. Samples: 84235060. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:05:35,501][18875] Avg episode reward: [(0, '0.311')] [2024-06-18 17:05:37,755][19107] Updated weights for policy 0, policy_version 174575 (0.0025) [2024-06-18 17:05:40,500][18875] Fps is (10 sec: 42598.3, 60 sec: 40413.9, 300 sec: 41098.8). Total num frames: 2860335104. Throughput: 0: 41431.2. Samples: 84483960. Policy #0 lag: (min: 2.0, avg: 9.5, max: 22.0) [2024-06-18 17:05:40,501][18875] Avg episode reward: [(0, '0.405')] [2024-06-18 17:05:41,022][19107] Updated weights for policy 0, policy_version 174585 (0.0030) [2024-06-18 17:05:45,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 2860548096. Throughput: 0: 41387.1. Samples: 84605180. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:05:45,501][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 17:05:45,685][19107] Updated weights for policy 0, policy_version 174595 (0.0050) [2024-06-18 17:05:48,735][19107] Updated weights for policy 0, policy_version 174605 (0.0028) [2024-06-18 17:05:50,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2860761088. Throughput: 0: 41253.7. Samples: 84848180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:05:50,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 17:05:53,439][19107] Updated weights for policy 0, policy_version 174615 (0.0028) [2024-06-18 17:05:55,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2860941312. Throughput: 0: 41578.2. Samples: 85110180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:05:55,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 17:05:56,227][19087] Signal inference workers to stop experience collection... (1200 times) [2024-06-18 17:05:56,228][19087] Signal inference workers to resume experience collection... (1200 times) [2024-06-18 17:05:56,244][19107] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-18 17:05:56,244][19107] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-18 17:05:56,540][19107] Updated weights for policy 0, policy_version 174625 (0.0038) [2024-06-18 17:06:00,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 2861170688. Throughput: 0: 41584.7. Samples: 85229320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:00,501][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 17:06:01,160][19107] Updated weights for policy 0, policy_version 174635 (0.0034) [2024-06-18 17:06:04,674][19107] Updated weights for policy 0, policy_version 174645 (0.0036) [2024-06-18 17:06:05,500][18875] Fps is (10 sec: 44236.3, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2861383680. Throughput: 0: 41371.4. Samples: 85472000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:05,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 17:06:09,761][19107] Updated weights for policy 0, policy_version 174655 (0.0039) [2024-06-18 17:06:10,500][18875] Fps is (10 sec: 39321.7, 60 sec: 40686.8, 300 sec: 41043.3). Total num frames: 2861563904. Throughput: 0: 41321.8. Samples: 85723220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:10,501][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 17:06:12,834][19107] Updated weights for policy 0, policy_version 174665 (0.0050) [2024-06-18 17:06:15,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.1, 300 sec: 41209.9). Total num frames: 2861809664. Throughput: 0: 41104.8. Samples: 85835440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:15,501][18875] Avg episode reward: [(0, '0.294')] [2024-06-18 17:06:17,550][19107] Updated weights for policy 0, policy_version 174675 (0.0035) [2024-06-18 17:06:20,504][18875] Fps is (10 sec: 44221.6, 60 sec: 41230.6, 300 sec: 41098.4). Total num frames: 2862006272. Throughput: 0: 41404.3. Samples: 86098400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:20,504][18875] Avg episode reward: [(0, '0.294')] [2024-06-18 17:06:20,673][19107] Updated weights for policy 0, policy_version 174685 (0.0029) [2024-06-18 17:06:25,273][19107] Updated weights for policy 0, policy_version 174695 (0.0030) [2024-06-18 17:06:25,500][18875] Fps is (10 sec: 39322.0, 60 sec: 40962.4, 300 sec: 41209.9). 
Total num frames: 2862202880. Throughput: 0: 41202.2. Samples: 86338060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:25,501][18875] Avg episode reward: [(0, '0.369')] [2024-06-18 17:06:28,647][19107] Updated weights for policy 0, policy_version 174705 (0.0029) [2024-06-18 17:06:30,500][18875] Fps is (10 sec: 40974.5, 60 sec: 41779.2, 300 sec: 41210.0). Total num frames: 2862415872. Throughput: 0: 41248.8. Samples: 86461380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:30,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 17:06:33,552][19107] Updated weights for policy 0, policy_version 174715 (0.0028) [2024-06-18 17:06:35,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 2862596096. Throughput: 0: 41337.4. Samples: 86708360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:35,501][18875] Avg episode reward: [(0, '0.335')] [2024-06-18 17:06:35,541][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000174720_2862612480.pth... [2024-06-18 17:06:35,608][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000174118_2852749312.pth [2024-06-18 17:06:36,755][19107] Updated weights for policy 0, policy_version 174725 (0.0036) [2024-06-18 17:06:40,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2862809088. Throughput: 0: 40931.1. Samples: 86952080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:40,500][18875] Avg episode reward: [(0, '0.403')] [2024-06-18 17:06:41,621][19107] Updated weights for policy 0, policy_version 174735 (0.0045) [2024-06-18 17:06:44,872][19107] Updated weights for policy 0, policy_version 174745 (0.0044) [2024-06-18 17:06:45,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 2863022080. Throughput: 0: 41107.1. Samples: 87079140. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 17:06:45,501][18875] Avg episode reward: [(0, '0.450')] [2024-06-18 17:06:49,863][19107] Updated weights for policy 0, policy_version 174755 (0.0028) [2024-06-18 17:06:50,500][18875] Fps is (10 sec: 39321.2, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 2863202304. Throughput: 0: 41166.7. Samples: 87324500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:06:50,501][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 17:06:52,705][19107] Updated weights for policy 0, policy_version 174765 (0.0031) [2024-06-18 17:06:55,504][18875] Fps is (10 sec: 42583.3, 60 sec: 41776.6, 300 sec: 41098.3). Total num frames: 2863448064. Throughput: 0: 40934.1. Samples: 87565400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:06:55,505][18875] Avg episode reward: [(0, '0.432')] [2024-06-18 17:06:57,809][19107] Updated weights for policy 0, policy_version 174775 (0.0052) [2024-06-18 17:07:00,500][18875] Fps is (10 sec: 45875.7, 60 sec: 41506.3, 300 sec: 41209.9). Total num frames: 2863661056. Throughput: 0: 41375.3. Samples: 87697320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:00,501][18875] Avg episode reward: [(0, '0.598')] [2024-06-18 17:07:00,585][19107] Updated weights for policy 0, policy_version 174785 (0.0039) [2024-06-18 17:07:05,500][18875] Fps is (10 sec: 37696.7, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 2863824896. Throughput: 0: 40830.3. Samples: 87935620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:05,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 17:07:05,709][19107] Updated weights for policy 0, policy_version 174795 (0.0029) [2024-06-18 17:07:08,472][19107] Updated weights for policy 0, policy_version 174805 (0.0030) [2024-06-18 17:07:10,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2864054272. Throughput: 0: 40912.8. Samples: 88179140. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:10,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 17:07:13,764][19107] Updated weights for policy 0, policy_version 174815 (0.0044) [2024-06-18 17:07:14,682][19087] Signal inference workers to stop experience collection... (1250 times) [2024-06-18 17:07:14,692][19107] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-18 17:07:14,796][19087] Signal inference workers to resume experience collection... (1250 times) [2024-06-18 17:07:14,796][19107] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-18 17:07:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 2864250880. Throughput: 0: 40943.6. Samples: 88303840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:15,501][18875] Avg episode reward: [(0, '0.394')] [2024-06-18 17:07:16,706][19107] Updated weights for policy 0, policy_version 174825 (0.0043) [2024-06-18 17:07:20,501][18875] Fps is (10 sec: 39318.2, 60 sec: 40688.7, 300 sec: 41098.7). Total num frames: 2864447488. Throughput: 0: 40954.7. Samples: 88551360. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:20,502][18875] Avg episode reward: [(0, '0.411')] [2024-06-18 17:07:21,605][19107] Updated weights for policy 0, policy_version 174835 (0.0043) [2024-06-18 17:07:24,647][19107] Updated weights for policy 0, policy_version 174845 (0.0042) [2024-06-18 17:07:25,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 2864676864. Throughput: 0: 40881.8. Samples: 88791760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:25,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 17:07:29,649][19107] Updated weights for policy 0, policy_version 174855 (0.0044) [2024-06-18 17:07:30,500][18875] Fps is (10 sec: 39324.9, 60 sec: 40413.8, 300 sec: 40987.7). Total num frames: 2864840704. Throughput: 0: 40980.0. 
Samples: 88923240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:30,501][18875] Avg episode reward: [(0, '0.364')] [2024-06-18 17:07:32,714][19107] Updated weights for policy 0, policy_version 174865 (0.0035) [2024-06-18 17:07:35,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 2865070080. Throughput: 0: 40808.9. Samples: 89160900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:35,504][18875] Avg episode reward: [(0, '0.294')] [2024-06-18 17:07:37,696][19107] Updated weights for policy 0, policy_version 174875 (0.0050) [2024-06-18 17:07:40,504][18875] Fps is (10 sec: 44221.5, 60 sec: 41230.6, 300 sec: 41098.4). Total num frames: 2865283072. Throughput: 0: 40817.4. Samples: 89402180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:40,504][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 17:07:40,813][19107] Updated weights for policy 0, policy_version 174885 (0.0026) [2024-06-18 17:07:45,500][18875] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 2865463296. Throughput: 0: 40705.3. Samples: 89529060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:45,501][18875] Avg episode reward: [(0, '0.585')] [2024-06-18 17:07:45,673][19107] Updated weights for policy 0, policy_version 174895 (0.0042) [2024-06-18 17:07:48,595][19107] Updated weights for policy 0, policy_version 174905 (0.0036) [2024-06-18 17:07:50,500][18875] Fps is (10 sec: 40974.8, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 2865692672. Throughput: 0: 40791.2. Samples: 89771220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 17:07:50,501][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 17:07:54,168][19107] Updated weights for policy 0, policy_version 174915 (0.0050) [2024-06-18 17:07:55,500][18875] Fps is (10 sec: 44237.2, 60 sec: 40962.5, 300 sec: 41154.4). Total num frames: 2865905664. Throughput: 0: 41020.6. Samples: 90025060. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:07:55,500][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 17:07:56,737][19107] Updated weights for policy 0, policy_version 174925 (0.0039) [2024-06-18 17:08:00,500][18875] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 2866085888. Throughput: 0: 40887.1. Samples: 90143760. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:00,501][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 17:08:02,025][19107] Updated weights for policy 0, policy_version 174935 (0.0034) [2024-06-18 17:08:04,809][19107] Updated weights for policy 0, policy_version 174945 (0.0044) [2024-06-18 17:08:05,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 2866315264. Throughput: 0: 40851.5. Samples: 90389640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:05,501][18875] Avg episode reward: [(0, '0.253')] [2024-06-18 17:08:09,970][19107] Updated weights for policy 0, policy_version 174955 (0.0039) [2024-06-18 17:08:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 2866495488. Throughput: 0: 41146.6. Samples: 90643360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:10,501][18875] Avg episode reward: [(0, '0.336')] [2024-06-18 17:08:10,667][19087] Signal inference workers to stop experience collection... (1300 times) [2024-06-18 17:08:10,707][19107] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-18 17:08:10,716][19087] Signal inference workers to resume experience collection... (1300 times) [2024-06-18 17:08:10,723][19107] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-18 17:08:13,000][19107] Updated weights for policy 0, policy_version 174965 (0.0039) [2024-06-18 17:08:15,500][18875] Fps is (10 sec: 39321.0, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 2866708480. Throughput: 0: 40763.1. 
Samples: 90757580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:15,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 17:08:17,795][19107] Updated weights for policy 0, policy_version 174975 (0.0040) [2024-06-18 17:08:20,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41506.7, 300 sec: 41209.9). Total num frames: 2866937856. Throughput: 0: 40981.8. Samples: 91005080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:20,504][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 17:08:20,882][19107] Updated weights for policy 0, policy_version 174985 (0.0038) [2024-06-18 17:08:25,500][18875] Fps is (10 sec: 37683.7, 60 sec: 40140.8, 300 sec: 40932.7). Total num frames: 2867085312. Throughput: 0: 41278.4. Samples: 91259560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:25,501][18875] Avg episode reward: [(0, '0.489')] [2024-06-18 17:08:25,710][19107] Updated weights for policy 0, policy_version 174995 (0.0040) [2024-06-18 17:08:28,770][19107] Updated weights for policy 0, policy_version 175005 (0.0039) [2024-06-18 17:08:30,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 40987.8). Total num frames: 2867331072. Throughput: 0: 40987.4. Samples: 91373500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:30,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 17:08:33,581][19107] Updated weights for policy 0, policy_version 175015 (0.0046) [2024-06-18 17:08:35,500][18875] Fps is (10 sec: 45875.1, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2867544064. Throughput: 0: 41126.1. Samples: 91621900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:35,501][18875] Avg episode reward: [(0, '0.722')] [2024-06-18 17:08:35,510][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000175021_2867544064.pth... 
[2024-06-18 17:08:35,572][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000174418_2857664512.pth [2024-06-18 17:08:37,033][19107] Updated weights for policy 0, policy_version 175025 (0.0039) [2024-06-18 17:08:40,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40416.2, 300 sec: 40932.2). Total num frames: 2867707904. Throughput: 0: 40895.9. Samples: 91865380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:40,501][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 17:08:41,590][19107] Updated weights for policy 0, policy_version 175035 (0.0045) [2024-06-18 17:08:44,997][19107] Updated weights for policy 0, policy_version 175045 (0.0043) [2024-06-18 17:08:45,504][18875] Fps is (10 sec: 40945.3, 60 sec: 41503.6, 300 sec: 41098.4). Total num frames: 2867953664. Throughput: 0: 40959.9. Samples: 91987100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:45,505][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 17:08:49,443][19107] Updated weights for policy 0, policy_version 175055 (0.0052) [2024-06-18 17:08:50,500][18875] Fps is (10 sec: 44236.4, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 2868150272. Throughput: 0: 41075.9. Samples: 92238060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:50,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 17:08:53,378][19107] Updated weights for policy 0, policy_version 175065 (0.0035) [2024-06-18 17:08:55,500][18875] Fps is (10 sec: 39336.0, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 2868346880. Throughput: 0: 40707.6. Samples: 92475200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-18 17:08:55,501][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 17:08:57,352][19107] Updated weights for policy 0, policy_version 175075 (0.0034) [2024-06-18 17:09:00,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 2868559872. Throughput: 0: 41081.8. Samples: 92606260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:00,504][18875] Avg episode reward: [(0, '0.690')] [2024-06-18 17:09:01,385][19107] Updated weights for policy 0, policy_version 175085 (0.0042) [2024-06-18 17:09:05,500][18875] Fps is (10 sec: 37683.2, 60 sec: 40140.8, 300 sec: 40987.8). Total num frames: 2868723712. Throughput: 0: 40959.6. Samples: 92848260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:05,501][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 17:09:05,890][19107] Updated weights for policy 0, policy_version 175095 (0.0039) [2024-06-18 17:09:09,415][19107] Updated weights for policy 0, policy_version 175105 (0.0030) [2024-06-18 17:09:10,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 2868969472. Throughput: 0: 40704.0. Samples: 93091240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:10,501][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 17:09:13,629][19107] Updated weights for policy 0, policy_version 175115 (0.0035) [2024-06-18 17:09:15,504][18875] Fps is (10 sec: 44220.6, 60 sec: 40957.6, 300 sec: 41098.3). Total num frames: 2869166080. Throughput: 0: 41034.6. Samples: 93220200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:15,504][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 17:09:17,455][19107] Updated weights for policy 0, policy_version 175125 (0.0036) [2024-06-18 17:09:20,500][18875] Fps is (10 sec: 37682.9, 60 sec: 40140.8, 300 sec: 40876.7). Total num frames: 2869346304. Throughput: 0: 40895.9. Samples: 93462220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:20,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 17:09:21,466][19107] Updated weights for policy 0, policy_version 175135 (0.0037) [2024-06-18 17:09:25,294][19107] Updated weights for policy 0, policy_version 175145 (0.0036) [2024-06-18 17:09:25,500][18875] Fps is (10 sec: 40974.7, 60 sec: 41506.1, 300 sec: 41043.3). 
Total num frames: 2869575680. Throughput: 0: 41038.2. Samples: 93712100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:25,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 17:09:29,314][19107] Updated weights for policy 0, policy_version 175155 (0.0039) [2024-06-18 17:09:30,500][18875] Fps is (10 sec: 44237.1, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 2869788672. Throughput: 0: 41066.4. Samples: 93834940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:30,504][18875] Avg episode reward: [(0, '0.368')] [2024-06-18 17:09:31,833][19087] Signal inference workers to stop experience collection... (1350 times) [2024-06-18 17:09:31,876][19107] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-18 17:09:31,885][19087] Signal inference workers to resume experience collection... (1350 times) [2024-06-18 17:09:31,893][19107] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-18 17:09:33,086][19107] Updated weights for policy 0, policy_version 175165 (0.0045) [2024-06-18 17:09:35,500][18875] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 2869985280. Throughput: 0: 40966.7. Samples: 94081560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:35,501][18875] Avg episode reward: [(0, '0.171')] [2024-06-18 17:09:37,076][19107] Updated weights for policy 0, policy_version 175175 (0.0039) [2024-06-18 17:09:40,501][18875] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41043.3). Total num frames: 2870198272. Throughput: 0: 41101.6. Samples: 94324780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:40,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 17:09:41,120][19107] Updated weights for policy 0, policy_version 175185 (0.0052) [2024-06-18 17:09:45,345][19107] Updated weights for policy 0, policy_version 175195 (0.0039) [2024-06-18 17:09:45,500][18875] Fps is (10 sec: 42598.4, 60 sec: 40962.4, 300 sec: 41154.4). Total num frames: 2870411264. Throughput: 0: 40926.6. Samples: 94447960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:45,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 17:09:49,029][19107] Updated weights for policy 0, policy_version 175205 (0.0034) [2024-06-18 17:09:50,500][18875] Fps is (10 sec: 40961.2, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2870607872. Throughput: 0: 41045.4. Samples: 94695300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:50,501][18875] Avg episode reward: [(0, '0.273')] [2024-06-18 17:09:53,251][19107] Updated weights for policy 0, policy_version 175215 (0.0033) [2024-06-18 17:09:55,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 2870804480. Throughput: 0: 41008.4. Samples: 94936620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:09:55,501][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 17:09:57,315][19107] Updated weights for policy 0, policy_version 175225 (0.0039) [2024-06-18 17:10:00,500][18875] Fps is (10 sec: 37683.2, 60 sec: 40414.0, 300 sec: 40876.7). Total num frames: 2870984704. Throughput: 0: 41029.2. Samples: 95066360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 17:10:00,501][18875] Avg episode reward: [(0, '0.612')] [2024-06-18 17:10:01,119][19107] Updated weights for policy 0, policy_version 175235 (0.0030) [2024-06-18 17:10:05,078][19107] Updated weights for policy 0, policy_version 175245 (0.0039) [2024-06-18 17:10:05,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 40987.8). 
Total num frames: 2871214080. Throughput: 0: 41119.6. Samples: 95312600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:05,501][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 17:10:09,163][19107] Updated weights for policy 0, policy_version 175255 (0.0041) [2024-06-18 17:10:10,500][18875] Fps is (10 sec: 45874.7, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 2871443456. Throughput: 0: 41021.8. Samples: 95558080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:10,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 17:10:12,925][19107] Updated weights for policy 0, policy_version 175265 (0.0036) [2024-06-18 17:10:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 40962.4, 300 sec: 40987.8). Total num frames: 2871623680. Throughput: 0: 40992.4. Samples: 95679600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:15,501][18875] Avg episode reward: [(0, '0.607')] [2024-06-18 17:10:17,313][19107] Updated weights for policy 0, policy_version 175275 (0.0046) [2024-06-18 17:10:20,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 40932.7). Total num frames: 2871820288. Throughput: 0: 40994.7. Samples: 95926320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:20,501][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 17:10:20,989][19107] Updated weights for policy 0, policy_version 175285 (0.0048) [2024-06-18 17:10:25,045][19107] Updated weights for policy 0, policy_version 175295 (0.0038) [2024-06-18 17:10:25,500][18875] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 2872033280. Throughput: 0: 41152.2. Samples: 96176620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:25,501][18875] Avg episode reward: [(0, '0.457')] [2024-06-18 17:10:29,085][19107] Updated weights for policy 0, policy_version 175305 (0.0036) [2024-06-18 17:10:30,500][18875] Fps is (10 sec: 42599.0, 60 sec: 40960.1, 300 sec: 40987.8). 
Total num frames: 2872246272. Throughput: 0: 41262.8. Samples: 96304780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:30,501][18875] Avg episode reward: [(0, '0.329')] [2024-06-18 17:10:32,971][19107] Updated weights for policy 0, policy_version 175315 (0.0039) [2024-06-18 17:10:35,500][18875] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 2872442880. Throughput: 0: 41019.0. Samples: 96541160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:35,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 17:10:35,528][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000175320_2872442880.pth... [2024-06-18 17:10:35,580][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000174720_2862612480.pth [2024-06-18 17:10:37,316][19107] Updated weights for policy 0, policy_version 175325 (0.0041) [2024-06-18 17:10:40,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40687.1, 300 sec: 40987.8). Total num frames: 2872639488. Throughput: 0: 41179.3. Samples: 96789680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:40,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 17:10:40,938][19107] Updated weights for policy 0, policy_version 175335 (0.0039) [2024-06-18 17:10:45,199][19107] Updated weights for policy 0, policy_version 175345 (0.0034) [2024-06-18 17:10:45,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 2872852480. Throughput: 0: 40958.2. Samples: 96909480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:45,501][18875] Avg episode reward: [(0, '0.323')] [2024-06-18 17:10:48,305][19087] Signal inference workers to stop experience collection... (1400 times) [2024-06-18 17:10:48,362][19107] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-18 17:10:48,421][19087] Signal inference workers to resume experience collection... 
(1400 times) [2024-06-18 17:10:48,422][19107] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-18 17:10:49,310][19107] Updated weights for policy 0, policy_version 175355 (0.0035) [2024-06-18 17:10:50,500][18875] Fps is (10 sec: 42597.9, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 2873065472. Throughput: 0: 40889.3. Samples: 97152620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:50,501][18875] Avg episode reward: [(0, '0.416')] [2024-06-18 17:10:53,155][19107] Updated weights for policy 0, policy_version 175365 (0.0032) [2024-06-18 17:10:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 2873262080. Throughput: 0: 40897.3. Samples: 97398460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:10:55,501][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 17:10:57,338][19107] Updated weights for policy 0, policy_version 175375 (0.0029) [2024-06-18 17:11:00,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 40932.3). Total num frames: 2873458688. Throughput: 0: 40792.6. Samples: 97515260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:11:00,500][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 17:11:01,013][19107] Updated weights for policy 0, policy_version 175385 (0.0040) [2024-06-18 17:11:05,251][19107] Updated weights for policy 0, policy_version 175395 (0.0038) [2024-06-18 17:11:05,500][18875] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 2873671680. Throughput: 0: 40938.2. Samples: 97768540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:11:05,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 17:11:08,900][19107] Updated weights for policy 0, policy_version 175405 (0.0036) [2024-06-18 17:11:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40414.0, 300 sec: 40876.7). Total num frames: 2873868288. Throughput: 0: 40767.2. Samples: 98011140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:10,500][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 17:11:13,421][19107] Updated weights for policy 0, policy_version 175415 (0.0030) [2024-06-18 17:11:15,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 40988.3). Total num frames: 2874097664. Throughput: 0: 40721.7. Samples: 98137260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:15,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 17:11:17,032][19107] Updated weights for policy 0, policy_version 175425 (0.0049) [2024-06-18 17:11:20,504][18875] Fps is (10 sec: 39306.9, 60 sec: 40684.5, 300 sec: 40876.2). Total num frames: 2874261504. Throughput: 0: 40922.5. Samples: 98382820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:20,505][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 17:11:21,364][19107] Updated weights for policy 0, policy_version 175435 (0.0031) [2024-06-18 17:11:25,098][19107] Updated weights for policy 0, policy_version 175445 (0.0037) [2024-06-18 17:11:25,500][18875] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 2874490880. Throughput: 0: 40832.8. Samples: 98627160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:25,501][18875] Avg episode reward: [(0, '0.365')] [2024-06-18 17:11:29,282][19107] Updated weights for policy 0, policy_version 175455 (0.0036) [2024-06-18 17:11:30,500][18875] Fps is (10 sec: 45891.8, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2874720256. Throughput: 0: 41030.6. Samples: 98755860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:30,504][18875] Avg episode reward: [(0, '0.308')] [2024-06-18 17:11:32,744][19107] Updated weights for policy 0, policy_version 175465 (0.0039) [2024-06-18 17:11:35,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 2874900480. Throughput: 0: 41121.8. Samples: 99003100. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:35,504][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 17:11:37,020][19107] Updated weights for policy 0, policy_version 175475 (0.0036) [2024-06-18 17:11:40,462][19107] Updated weights for policy 0, policy_version 175485 (0.0046) [2024-06-18 17:11:40,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41098.8). Total num frames: 2875146240. Throughput: 0: 41213.8. Samples: 99253080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:40,501][18875] Avg episode reward: [(0, '0.657')] [2024-06-18 17:11:44,989][19107] Updated weights for policy 0, policy_version 175495 (0.0039) [2024-06-18 17:11:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 2875326464. Throughput: 0: 41392.4. Samples: 99377920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:45,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 17:11:48,295][19107] Updated weights for policy 0, policy_version 175505 (0.0030) [2024-06-18 17:11:50,500][18875] Fps is (10 sec: 37683.8, 60 sec: 40960.1, 300 sec: 40932.7). Total num frames: 2875523072. Throughput: 0: 41084.5. Samples: 99617340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:50,501][18875] Avg episode reward: [(0, '0.598')] [2024-06-18 17:11:53,268][19107] Updated weights for policy 0, policy_version 175515 (0.0043) [2024-06-18 17:11:55,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41233.2, 300 sec: 40932.2). Total num frames: 2875736064. Throughput: 0: 41305.3. Samples: 99869880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:11:55,500][18875] Avg episode reward: [(0, '0.696')] [2024-06-18 17:11:56,590][19107] Updated weights for policy 0, policy_version 175525 (0.0030) [2024-06-18 17:11:59,813][19087] Signal inference workers to stop experience collection... 
(1450 times) [2024-06-18 17:11:59,813][19087] Signal inference workers to resume experience collection... (1450 times) [2024-06-18 17:11:59,852][19107] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-18 17:11:59,852][19107] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-18 17:12:00,500][18875] Fps is (10 sec: 39320.9, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 2875916288. Throughput: 0: 41275.5. Samples: 99994660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:12:00,501][18875] Avg episode reward: [(0, '0.548')] [2024-06-18 17:12:01,206][19107] Updated weights for policy 0, policy_version 175535 (0.0031) [2024-06-18 17:12:04,401][19107] Updated weights for policy 0, policy_version 175545 (0.0044) [2024-06-18 17:12:05,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 2876145664. Throughput: 0: 41346.1. Samples: 100243240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:12:05,500][18875] Avg episode reward: [(0, '0.657')] [2024-06-18 17:12:09,087][19107] Updated weights for policy 0, policy_version 175555 (0.0039) [2024-06-18 17:12:10,500][18875] Fps is (10 sec: 45875.1, 60 sec: 41779.1, 300 sec: 41098.8). Total num frames: 2876375040. Throughput: 0: 41414.2. Samples: 100490800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:12:10,501][18875] Avg episode reward: [(0, '0.781')] [2024-06-18 17:12:12,169][19107] Updated weights for policy 0, policy_version 175565 (0.0038) [2024-06-18 17:12:15,500][18875] Fps is (10 sec: 40959.4, 60 sec: 40960.0, 300 sec: 41043.4). Total num frames: 2876555264. Throughput: 0: 41407.5. Samples: 100619200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 17:12:15,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 17:12:16,796][19107] Updated weights for policy 0, policy_version 175575 (0.0027) [2024-06-18 17:12:19,974][19107] Updated weights for policy 0, policy_version 175585 (0.0036) [2024-06-18 17:12:20,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42054.8, 300 sec: 41043.3). Total num frames: 2876784640. Throughput: 0: 41351.5. Samples: 100863920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:12:20,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 17:12:24,532][19107] Updated weights for policy 0, policy_version 175595 (0.0035) [2024-06-18 17:12:25,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 41209.9). Total num frames: 2876997632. Throughput: 0: 41410.8. Samples: 101116560. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:12:25,501][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 17:12:28,280][19107] Updated weights for policy 0, policy_version 175605 (0.0033) [2024-06-18 17:12:30,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 2877177856. Throughput: 0: 41434.2. Samples: 101242460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:12:30,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 17:12:32,529][19107] Updated weights for policy 0, policy_version 175615 (0.0033) [2024-06-18 17:12:35,503][18875] Fps is (10 sec: 40946.8, 60 sec: 41777.0, 300 sec: 41098.9). Total num frames: 2877407232. Throughput: 0: 41529.9. Samples: 101486320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:12:35,504][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 17:12:35,529][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000175623_2877407232.pth... 
[2024-06-18 17:12:35,588][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000175021_2867544064.pth [2024-06-18 17:12:36,011][19107] Updated weights for policy 0, policy_version 175625 (0.0039) [2024-06-18 17:12:40,496][19107] Updated weights for policy 0, policy_version 175635 (0.0039) [2024-06-18 17:12:40,500][18875] Fps is (10 sec: 42598.8, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 2877603840. Throughput: 0: 41420.0. Samples: 101733780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:12:40,501][18875] Avg episode reward: [(0, '0.621')] [2024-06-18 17:12:43,987][19107] Updated weights for policy 0, policy_version 175645 (0.0039) [2024-06-18 17:12:45,500][18875] Fps is (10 sec: 40972.7, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 2877816832. Throughput: 0: 41370.2. Samples: 101856320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:12:45,503][18875] Avg episode reward: [(0, '0.367')] [2024-06-18 17:12:48,279][19107] Updated weights for policy 0, policy_version 175655 (0.0032) [2024-06-18 17:12:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 2878013440. Throughput: 0: 41384.4. Samples: 102105540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:12:50,501][18875] Avg episode reward: [(0, '0.416')] [2024-06-18 17:12:52,241][19107] Updated weights for policy 0, policy_version 175665 (0.0041) [2024-06-18 17:12:55,500][18875] Fps is (10 sec: 37683.1, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 2878193664. Throughput: 0: 41406.2. Samples: 102354080. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:12:55,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 17:12:56,157][19107] Updated weights for policy 0, policy_version 175675 (0.0039) [2024-06-18 17:13:00,105][19107] Updated weights for policy 0, policy_version 175685 (0.0038) [2024-06-18 17:13:00,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42052.3, 300 sec: 41098.8). Total num frames: 2878439424. Throughput: 0: 41128.0. Samples: 102469960. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:13:00,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 17:13:03,639][19087] Signal inference workers to stop experience collection... (1500 times) [2024-06-18 17:13:03,641][19087] Signal inference workers to resume experience collection... (1500 times) [2024-06-18 17:13:03,654][19107] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-18 17:13:03,654][19107] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-18 17:13:04,253][19107] Updated weights for policy 0, policy_version 175695 (0.0037) [2024-06-18 17:13:05,500][18875] Fps is (10 sec: 44237.2, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2878636032. Throughput: 0: 41288.1. Samples: 102721880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:13:05,501][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 17:13:07,963][19107] Updated weights for policy 0, policy_version 175705 (0.0040) [2024-06-18 17:13:10,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 2878816256. Throughput: 0: 41020.4. Samples: 102962480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:13:10,501][18875] Avg episode reward: [(0, '0.645')] [2024-06-18 17:13:12,436][19107] Updated weights for policy 0, policy_version 175715 (0.0041) [2024-06-18 17:13:15,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 2879029248. Throughput: 0: 40997.8. 
Samples: 103087360. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:13:15,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 17:13:16,002][19107] Updated weights for policy 0, policy_version 175725 (0.0041) [2024-06-18 17:13:20,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 2879225856. Throughput: 0: 41131.0. Samples: 103337080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 17:13:20,500][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 17:13:20,522][19107] Updated weights for policy 0, policy_version 175735 (0.0037) [2024-06-18 17:13:24,095][19107] Updated weights for policy 0, policy_version 175745 (0.0044) [2024-06-18 17:13:25,500][18875] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 2879455232. Throughput: 0: 40892.4. Samples: 103573940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:13:25,501][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 17:13:28,576][19107] Updated weights for policy 0, policy_version 175755 (0.0038) [2024-06-18 17:13:30,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 2879651840. Throughput: 0: 41039.1. Samples: 103703080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:13:30,501][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 17:13:32,203][19107] Updated weights for policy 0, policy_version 175765 (0.0038) [2024-06-18 17:13:35,500][18875] Fps is (10 sec: 37682.7, 60 sec: 40415.9, 300 sec: 41098.8). Total num frames: 2879832064. Throughput: 0: 40975.0. Samples: 103949420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:13:35,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 17:13:36,299][19107] Updated weights for policy 0, policy_version 175775 (0.0040) [2024-06-18 17:13:40,070][19107] Updated weights for policy 0, policy_version 175785 (0.0049) [2024-06-18 17:13:40,500][18875] Fps is (10 sec: 40959.7, 60 sec: 40959.9, 300 sec: 41043.8). Total num frames: 2880061440. Throughput: 0: 40892.9. Samples: 104194260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:13:40,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 17:13:44,314][19107] Updated weights for policy 0, policy_version 175795 (0.0037) [2024-06-18 17:13:45,500][18875] Fps is (10 sec: 44237.7, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2880274432. Throughput: 0: 41092.6. Samples: 104319120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:13:45,500][18875] Avg episode reward: [(0, '0.627')] [2024-06-18 17:13:48,381][19107] Updated weights for policy 0, policy_version 175805 (0.0038) [2024-06-18 17:13:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 2880471040. Throughput: 0: 40856.4. Samples: 104560420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:13:50,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 17:13:52,212][19107] Updated weights for policy 0, policy_version 175815 (0.0044) [2024-06-18 17:13:55,504][18875] Fps is (10 sec: 40944.1, 60 sec: 41503.6, 300 sec: 41098.3). Total num frames: 2880684032. Throughput: 0: 41061.4. Samples: 104810400. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:13:55,505][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 17:13:56,234][19107] Updated weights for policy 0, policy_version 175825 (0.0048) [2024-06-18 17:14:00,090][19107] Updated weights for policy 0, policy_version 175835 (0.0030) [2024-06-18 17:14:00,500][18875] Fps is (10 sec: 40959.9, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 2880880640. Throughput: 0: 41058.1. Samples: 104934980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:14:00,504][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 17:14:04,179][19107] Updated weights for policy 0, policy_version 175845 (0.0033) [2024-06-18 17:14:05,500][18875] Fps is (10 sec: 40975.4, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 2881093632. Throughput: 0: 40846.6. Samples: 105175180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:14:05,501][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 17:14:08,089][19107] Updated weights for policy 0, policy_version 175855 (0.0042) [2024-06-18 17:14:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41099.3). Total num frames: 2881290240. Throughput: 0: 40940.3. Samples: 105416260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:14:10,501][18875] Avg episode reward: [(0, '0.686')] [2024-06-18 17:14:11,996][19087] Signal inference workers to stop experience collection... (1550 times) [2024-06-18 17:14:12,048][19107] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-18 17:14:12,054][19087] Signal inference workers to resume experience collection... (1550 times) [2024-06-18 17:14:12,061][19107] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-18 17:14:12,198][19107] Updated weights for policy 0, policy_version 175865 (0.0036) [2024-06-18 17:14:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 2881503232. Throughput: 0: 40906.7. 
Samples: 105543880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:14:15,501][18875] Avg episode reward: [(0, '0.819')] [2024-06-18 17:14:16,053][19107] Updated weights for policy 0, policy_version 175875 (0.0036) [2024-06-18 17:14:20,478][19107] Updated weights for policy 0, policy_version 175885 (0.0043) [2024-06-18 17:14:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41232.9, 300 sec: 41098.8). Total num frames: 2881699840. Throughput: 0: 40874.2. Samples: 105788760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:14:20,506][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 17:14:24,096][19107] Updated weights for policy 0, policy_version 175895 (0.0031) [2024-06-18 17:14:25,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 2881880064. Throughput: 0: 40893.9. Samples: 106034480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:14:25,500][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 17:14:28,360][19107] Updated weights for policy 0, policy_version 175905 (0.0042) [2024-06-18 17:14:30,500][18875] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 2882093056. Throughput: 0: 40842.1. Samples: 106157020. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:14:30,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 17:14:32,677][19107] Updated weights for policy 0, policy_version 175915 (0.0038) [2024-06-18 17:14:35,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41506.3, 300 sec: 41098.9). Total num frames: 2882322432. Throughput: 0: 40833.0. Samples: 106397900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:14:35,501][18875] Avg episode reward: [(0, '0.364')] [2024-06-18 17:14:35,533][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000175923_2882322432.pth... 
[2024-06-18 17:14:35,596][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000175320_2872442880.pth [2024-06-18 17:14:36,287][19107] Updated weights for policy 0, policy_version 175925 (0.0031) [2024-06-18 17:14:40,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40687.1, 300 sec: 40987.8). Total num frames: 2882502656. Throughput: 0: 40825.7. Samples: 106647400. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:14:40,501][18875] Avg episode reward: [(0, '0.303')] [2024-06-18 17:14:40,570][19107] Updated weights for policy 0, policy_version 175935 (0.0033) [2024-06-18 17:14:44,446][19107] Updated weights for policy 0, policy_version 175945 (0.0034) [2024-06-18 17:14:45,500][18875] Fps is (10 sec: 40959.3, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 2882732032. Throughput: 0: 40752.4. Samples: 106768840. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:14:45,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 17:14:48,460][19107] Updated weights for policy 0, policy_version 175955 (0.0041) [2024-06-18 17:14:50,500][18875] Fps is (10 sec: 42597.8, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 2882928640. Throughput: 0: 40898.2. Samples: 107015600. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:14:50,501][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 17:14:52,402][19107] Updated weights for policy 0, policy_version 175965 (0.0039) [2024-06-18 17:14:55,500][18875] Fps is (10 sec: 40960.2, 60 sec: 40962.5, 300 sec: 41209.9). Total num frames: 2883141632. Throughput: 0: 41093.4. Samples: 107265460. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:14:55,501][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 17:14:56,248][19107] Updated weights for policy 0, policy_version 175975 (0.0045) [2024-06-18 17:15:00,388][19107] Updated weights for policy 0, policy_version 175985 (0.0035) [2024-06-18 17:15:00,500][18875] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2883338240. Throughput: 0: 41026.3. Samples: 107390060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:15:00,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 17:15:04,019][19107] Updated weights for policy 0, policy_version 175995 (0.0038) [2024-06-18 17:15:05,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 2883534848. Throughput: 0: 41028.1. Samples: 107635020. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:15:05,501][18875] Avg episode reward: [(0, '0.554')] [2024-06-18 17:15:08,546][19107] Updated weights for policy 0, policy_version 176005 (0.0030) [2024-06-18 17:15:10,500][18875] Fps is (10 sec: 40959.8, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2883747840. Throughput: 0: 41155.1. Samples: 107886460. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:15:10,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 17:15:11,974][19107] Updated weights for policy 0, policy_version 176015 (0.0044) [2024-06-18 17:15:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 2883944448. Throughput: 0: 41049.3. Samples: 108004240. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:15:15,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 17:15:16,289][19107] Updated weights for policy 0, policy_version 176025 (0.0039) [2024-06-18 17:15:20,154][19107] Updated weights for policy 0, policy_version 176035 (0.0035) [2024-06-18 17:15:20,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2884157440. Throughput: 0: 41182.7. Samples: 108251120. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:15:20,500][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 17:15:24,090][19107] Updated weights for policy 0, policy_version 176045 (0.0037) [2024-06-18 17:15:25,504][18875] Fps is (10 sec: 42582.9, 60 sec: 41503.6, 300 sec: 41098.3). Total num frames: 2884370432. Throughput: 0: 41221.5. Samples: 108502520. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:15:25,505][18875] Avg episode reward: [(0, '0.341')] [2024-06-18 17:15:27,869][19107] Updated weights for policy 0, policy_version 176055 (0.0038) [2024-06-18 17:15:30,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2884567040. Throughput: 0: 41205.8. Samples: 108623100. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-18 17:15:30,501][18875] Avg episode reward: [(0, '0.171')] [2024-06-18 17:15:31,782][19107] Updated weights for policy 0, policy_version 176065 (0.0042) [2024-06-18 17:15:35,500][18875] Fps is (10 sec: 40974.7, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 2884780032. Throughput: 0: 41332.0. Samples: 108875540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:15:35,501][18875] Avg episode reward: [(0, '0.242')] [2024-06-18 17:15:36,000][19107] Updated weights for policy 0, policy_version 176075 (0.0033) [2024-06-18 17:15:39,916][19107] Updated weights for policy 0, policy_version 176085 (0.0041) [2024-06-18 17:15:40,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 2884976640. Throughput: 0: 41257.0. Samples: 109122020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:15:40,501][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 17:15:43,830][19107] Updated weights for policy 0, policy_version 176095 (0.0045) [2024-06-18 17:15:45,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 2885189632. Throughput: 0: 41212.3. Samples: 109244620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:15:45,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 17:15:48,063][19107] Updated weights for policy 0, policy_version 176105 (0.0035) [2024-06-18 17:15:50,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2885402624. Throughput: 0: 41277.3. Samples: 109492500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:15:50,501][18875] Avg episode reward: [(0, '0.343')] [2024-06-18 17:15:51,802][19107] Updated weights for policy 0, policy_version 176115 (0.0045) [2024-06-18 17:15:55,056][19087] Signal inference workers to stop experience collection... (1600 times) [2024-06-18 17:15:55,057][19087] Signal inference workers to resume experience collection... (1600 times) [2024-06-18 17:15:55,090][19107] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-18 17:15:55,090][19107] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-18 17:15:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 2885615616. Throughput: 0: 41292.9. 
Samples: 109744640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:15:55,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 17:15:55,766][19107] Updated weights for policy 0, policy_version 176125 (0.0033) [2024-06-18 17:16:00,015][19107] Updated weights for policy 0, policy_version 176135 (0.0043) [2024-06-18 17:16:00,504][18875] Fps is (10 sec: 42583.5, 60 sec: 41503.6, 300 sec: 41209.4). Total num frames: 2885828608. Throughput: 0: 41344.3. Samples: 109864880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:16:00,504][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 17:16:03,658][19107] Updated weights for policy 0, policy_version 176145 (0.0046) [2024-06-18 17:16:05,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 2886008832. Throughput: 0: 41313.6. Samples: 110110240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:16:05,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 17:16:07,889][19107] Updated weights for policy 0, policy_version 176155 (0.0034) [2024-06-18 17:16:10,500][18875] Fps is (10 sec: 37697.2, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 2886205440. Throughput: 0: 41261.2. Samples: 110359120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:16:10,500][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 17:16:11,524][19107] Updated weights for policy 0, policy_version 176165 (0.0032) [2024-06-18 17:16:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41266.0). Total num frames: 2886434816. Throughput: 0: 41344.4. Samples: 110483600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:16:15,501][18875] Avg episode reward: [(0, '0.614')] [2024-06-18 17:16:15,799][19107] Updated weights for policy 0, policy_version 176175 (0.0043) [2024-06-18 17:16:19,390][19107] Updated weights for policy 0, policy_version 176185 (0.0032) [2024-06-18 17:16:20,504][18875] Fps is (10 sec: 44219.9, 60 sec: 41503.5, 300 sec: 41209.4). Total num frames: 2886647808. Throughput: 0: 41224.6. Samples: 110730800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:16:20,505][18875] Avg episode reward: [(0, '0.373')] [2024-06-18 17:16:23,916][19107] Updated weights for policy 0, policy_version 176195 (0.0030) [2024-06-18 17:16:25,500][18875] Fps is (10 sec: 39322.0, 60 sec: 40962.5, 300 sec: 41043.3). Total num frames: 2886828032. Throughput: 0: 41268.0. Samples: 110979080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:16:25,501][18875] Avg episode reward: [(0, '0.364')] [2024-06-18 17:16:27,657][19107] Updated weights for policy 0, policy_version 176205 (0.0026) [2024-06-18 17:16:30,500][18875] Fps is (10 sec: 39336.5, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 2887041024. Throughput: 0: 41314.3. Samples: 111103760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:16:30,501][18875] Avg episode reward: [(0, '0.738')] [2024-06-18 17:16:31,801][19107] Updated weights for policy 0, policy_version 176215 (0.0039) [2024-06-18 17:16:35,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 2887254016. Throughput: 0: 41182.2. Samples: 111345700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 17:16:35,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 17:16:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000176224_2887254016.pth... 
[2024-06-18 17:16:35,591][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000175623_2877407232.pth [2024-06-18 17:16:35,736][19107] Updated weights for policy 0, policy_version 176225 (0.0044) [2024-06-18 17:16:39,931][19107] Updated weights for policy 0, policy_version 176235 (0.0039) [2024-06-18 17:16:40,500][18875] Fps is (10 sec: 40959.3, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2887450624. Throughput: 0: 41004.4. Samples: 111589840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:16:40,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 17:16:43,587][19107] Updated weights for policy 0, policy_version 176245 (0.0034) [2024-06-18 17:16:45,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 2887680000. Throughput: 0: 41069.0. Samples: 111712840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:16:45,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 17:16:47,836][19107] Updated weights for policy 0, policy_version 176255 (0.0040) [2024-06-18 17:16:50,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2887876608. Throughput: 0: 41142.2. Samples: 111961640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:16:50,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 17:16:51,926][19107] Updated weights for policy 0, policy_version 176265 (0.0030) [2024-06-18 17:16:55,500][18875] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 2888073216. Throughput: 0: 41118.5. Samples: 112209460. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:16:55,501][18875] Avg episode reward: [(0, '0.280')] [2024-06-18 17:16:55,701][19107] Updated weights for policy 0, policy_version 176275 (0.0048) [2024-06-18 17:16:59,806][19107] Updated weights for policy 0, policy_version 176285 (0.0047) [2024-06-18 17:17:00,500][18875] Fps is (10 sec: 39322.1, 60 sec: 40689.4, 300 sec: 41098.8). Total num frames: 2888269824. Throughput: 0: 41009.0. Samples: 112329000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:17:00,501][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 17:17:03,722][19107] Updated weights for policy 0, policy_version 176295 (0.0039) [2024-06-18 17:17:05,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 2888499200. Throughput: 0: 41156.6. Samples: 112582700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:17:05,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 17:17:07,540][19107] Updated weights for policy 0, policy_version 176305 (0.0039) [2024-06-18 17:17:10,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 2888679424. Throughput: 0: 41008.1. Samples: 112824440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:17:10,500][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 17:17:11,802][19107] Updated weights for policy 0, policy_version 176315 (0.0030) [2024-06-18 17:17:15,500][18875] Fps is (10 sec: 39321.4, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 2888892416. Throughput: 0: 40998.5. Samples: 112948700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:17:15,501][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 17:17:16,040][19107] Updated weights for policy 0, policy_version 176325 (0.0033) [2024-06-18 17:17:16,148][19087] Signal inference workers to stop experience collection... 
(1650 times) [2024-06-18 17:17:16,197][19107] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-18 17:17:16,198][19087] Signal inference workers to resume experience collection... (1650 times) [2024-06-18 17:17:16,210][19107] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-18 17:17:19,613][19107] Updated weights for policy 0, policy_version 176335 (0.0043) [2024-06-18 17:17:20,500][18875] Fps is (10 sec: 40959.2, 60 sec: 40689.4, 300 sec: 40987.7). Total num frames: 2889089024. Throughput: 0: 41174.2. Samples: 113198540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:17:20,501][18875] Avg episode reward: [(0, '0.273')] [2024-06-18 17:17:24,009][19107] Updated weights for policy 0, policy_version 176345 (0.0044) [2024-06-18 17:17:25,500][18875] Fps is (10 sec: 44237.2, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 2889334784. Throughput: 0: 41180.0. Samples: 113442940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:17:25,501][18875] Avg episode reward: [(0, '0.297')] [2024-06-18 17:17:27,705][19107] Updated weights for policy 0, policy_version 176355 (0.0030) [2024-06-18 17:17:30,500][18875] Fps is (10 sec: 44237.0, 60 sec: 41506.0, 300 sec: 41099.3). Total num frames: 2889531392. Throughput: 0: 41358.7. Samples: 113573980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:17:30,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 17:17:31,693][19107] Updated weights for policy 0, policy_version 176365 (0.0036) [2024-06-18 17:17:35,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 2889711616. Throughput: 0: 41193.5. Samples: 113815340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:17:35,501][18875] Avg episode reward: [(0, '0.341')] [2024-06-18 17:17:35,783][19107] Updated weights for policy 0, policy_version 176375 (0.0038) [2024-06-18 17:17:39,363][19107] Updated weights for policy 0, policy_version 176385 (0.0036) [2024-06-18 17:17:40,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 2889940992. Throughput: 0: 41160.9. Samples: 114061700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:17:40,504][18875] Avg episode reward: [(0, '0.422')] [2024-06-18 17:17:43,732][19107] Updated weights for policy 0, policy_version 176395 (0.0034) [2024-06-18 17:17:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2890137600. Throughput: 0: 41247.6. Samples: 114185140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:17:45,500][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 17:17:47,200][19107] Updated weights for policy 0, policy_version 176405 (0.0028) [2024-06-18 17:17:50,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2890334208. Throughput: 0: 41056.9. Samples: 114430260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:17:50,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 17:17:51,585][19107] Updated weights for policy 0, policy_version 176415 (0.0034) [2024-06-18 17:17:55,304][19107] Updated weights for policy 0, policy_version 176425 (0.0047) [2024-06-18 17:17:55,500][18875] Fps is (10 sec: 40959.1, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 2890547200. Throughput: 0: 41260.7. Samples: 114681180. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:17:55,501][18875] Avg episode reward: [(0, '0.607')] [2024-06-18 17:17:59,395][19107] Updated weights for policy 0, policy_version 176435 (0.0044) [2024-06-18 17:18:00,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 2890743808. Throughput: 0: 41214.8. Samples: 114803360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:00,501][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 17:18:03,285][19107] Updated weights for policy 0, policy_version 176445 (0.0038) [2024-06-18 17:18:05,504][18875] Fps is (10 sec: 40945.8, 60 sec: 40957.6, 300 sec: 41153.9). Total num frames: 2890956800. Throughput: 0: 41053.2. Samples: 115046080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:05,504][18875] Avg episode reward: [(0, '0.636')] [2024-06-18 17:18:07,378][19107] Updated weights for policy 0, policy_version 176455 (0.0045) [2024-06-18 17:18:10,500][18875] Fps is (10 sec: 39321.8, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 2891137024. Throughput: 0: 41244.9. Samples: 115298960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:10,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 17:18:11,171][19107] Updated weights for policy 0, policy_version 176465 (0.0028) [2024-06-18 17:18:15,314][19107] Updated weights for policy 0, policy_version 176475 (0.0028) [2024-06-18 17:18:15,504][18875] Fps is (10 sec: 40959.8, 60 sec: 41230.7, 300 sec: 41153.9). Total num frames: 2891366400. Throughput: 0: 40984.3. Samples: 115418420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:15,505][18875] Avg episode reward: [(0, '0.227')] [2024-06-18 17:18:18,936][19107] Updated weights for policy 0, policy_version 176485 (0.0030) [2024-06-18 17:18:20,500][18875] Fps is (10 sec: 45875.6, 60 sec: 41779.3, 300 sec: 41154.4). Total num frames: 2891595776. Throughput: 0: 41077.4. Samples: 115663820. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:20,500][18875] Avg episode reward: [(0, '0.276')] [2024-06-18 17:18:23,251][19107] Updated weights for policy 0, policy_version 176495 (0.0029) [2024-06-18 17:18:25,500][18875] Fps is (10 sec: 40975.3, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 2891776000. Throughput: 0: 41290.8. Samples: 115919780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:25,500][18875] Avg episode reward: [(0, '0.306')] [2024-06-18 17:18:26,744][19107] Updated weights for policy 0, policy_version 176505 (0.0032) [2024-06-18 17:18:30,500][18875] Fps is (10 sec: 37683.2, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 2891972608. Throughput: 0: 41157.8. Samples: 116037240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:30,501][18875] Avg episode reward: [(0, '0.411')] [2024-06-18 17:18:31,032][19107] Updated weights for policy 0, policy_version 176515 (0.0041) [2024-06-18 17:18:33,478][19087] Signal inference workers to stop experience collection... (1700 times) [2024-06-18 17:18:33,478][19087] Signal inference workers to resume experience collection... (1700 times) [2024-06-18 17:18:33,535][19107] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-18 17:18:33,535][19107] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-18 17:18:34,810][19107] Updated weights for policy 0, policy_version 176525 (0.0041) [2024-06-18 17:18:35,500][18875] Fps is (10 sec: 42597.6, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 2892201984. Throughput: 0: 41306.6. Samples: 116289060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:35,504][18875] Avg episode reward: [(0, '0.806')] [2024-06-18 17:18:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000176526_2892201984.pth... 
[2024-06-18 17:18:35,577][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000175923_2882322432.pth [2024-06-18 17:18:38,929][19107] Updated weights for policy 0, policy_version 176535 (0.0034) [2024-06-18 17:18:40,500][18875] Fps is (10 sec: 40959.3, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 2892382208. Throughput: 0: 41162.3. Samples: 116533480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:40,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 17:18:42,688][19107] Updated weights for policy 0, policy_version 176545 (0.0038) [2024-06-18 17:18:45,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41232.9, 300 sec: 41154.4). Total num frames: 2892611584. Throughput: 0: 41083.9. Samples: 116652140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 17:18:45,501][18875] Avg episode reward: [(0, '0.409')] [2024-06-18 17:18:47,293][19107] Updated weights for policy 0, policy_version 176555 (0.0045) [2024-06-18 17:18:50,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 41154.9). Total num frames: 2892824576. Throughput: 0: 41332.1. Samples: 116905880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:18:50,501][18875] Avg episode reward: [(0, '0.382')] [2024-06-18 17:18:50,606][19107] Updated weights for policy 0, policy_version 176565 (0.0038) [2024-06-18 17:18:55,291][19107] Updated weights for policy 0, policy_version 176575 (0.0039) [2024-06-18 17:18:55,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 2893021184. Throughput: 0: 41147.6. Samples: 117150600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:18:55,500][18875] Avg episode reward: [(0, '0.177')] [2024-06-18 17:18:59,334][19107] Updated weights for policy 0, policy_version 176585 (0.0041) [2024-06-18 17:19:00,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 2893217792. Throughput: 0: 41110.0. Samples: 117268220. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:00,501][18875] Avg episode reward: [(0, '0.305')] [2024-06-18 17:19:03,202][19107] Updated weights for policy 0, policy_version 176595 (0.0040) [2024-06-18 17:19:05,500][18875] Fps is (10 sec: 39320.7, 60 sec: 40962.4, 300 sec: 41098.8). Total num frames: 2893414400. Throughput: 0: 41164.7. Samples: 117516240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:05,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 17:19:07,233][19107] Updated weights for policy 0, policy_version 176605 (0.0036) [2024-06-18 17:19:10,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41098.9). Total num frames: 2893627392. Throughput: 0: 40889.7. Samples: 117759820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:10,501][18875] Avg episode reward: [(0, '0.246')] [2024-06-18 17:19:11,090][19107] Updated weights for policy 0, policy_version 176615 (0.0034) [2024-06-18 17:19:15,012][19107] Updated weights for policy 0, policy_version 176625 (0.0050) [2024-06-18 17:19:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41235.5, 300 sec: 41154.4). Total num frames: 2893840384. Throughput: 0: 41120.3. Samples: 117887660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:15,501][18875] Avg episode reward: [(0, '0.676')] [2024-06-18 17:19:18,996][19107] Updated weights for policy 0, policy_version 176635 (0.0045) [2024-06-18 17:19:20,500][18875] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 41154.4). Total num frames: 2894020608. Throughput: 0: 40928.5. Samples: 118130840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:20,501][18875] Avg episode reward: [(0, '0.666')] [2024-06-18 17:19:22,813][19107] Updated weights for policy 0, policy_version 176645 (0.0034) [2024-06-18 17:19:25,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 2894249984. Throughput: 0: 41022.8. Samples: 118379500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:25,500][18875] Avg episode reward: [(0, '0.386')] [2024-06-18 17:19:26,663][19107] Updated weights for policy 0, policy_version 176655 (0.0040) [2024-06-18 17:19:30,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2894462976. Throughput: 0: 41220.2. Samples: 118507040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:30,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 17:19:30,560][19107] Updated weights for policy 0, policy_version 176665 (0.0031) [2024-06-18 17:19:34,497][19107] Updated weights for policy 0, policy_version 176675 (0.0025) [2024-06-18 17:19:35,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 2894659584. Throughput: 0: 41036.1. Samples: 118752500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:35,501][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 17:19:38,506][19107] Updated weights for policy 0, policy_version 176685 (0.0043) [2024-06-18 17:19:40,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 2894888960. Throughput: 0: 41113.7. Samples: 119000720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:40,501][18875] Avg episode reward: [(0, '0.489')] [2024-06-18 17:19:42,501][19107] Updated weights for policy 0, policy_version 176695 (0.0037) [2024-06-18 17:19:45,500][18875] Fps is (10 sec: 39321.3, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 2895052800. Throughput: 0: 41267.5. Samples: 119125260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:45,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 17:19:46,741][19107] Updated weights for policy 0, policy_version 176705 (0.0036) [2024-06-18 17:19:50,500][18875] Fps is (10 sec: 39321.1, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2895282176. Throughput: 0: 41139.6. Samples: 119367520. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 23.0) [2024-06-18 17:19:50,501][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 17:19:50,813][19107] Updated weights for policy 0, policy_version 176715 (0.0035) [2024-06-18 17:19:54,922][19107] Updated weights for policy 0, policy_version 176725 (0.0045) [2024-06-18 17:19:55,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41232.9, 300 sec: 41209.9). Total num frames: 2895495168. Throughput: 0: 41183.9. Samples: 119613100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:19:55,501][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 17:19:58,766][19107] Updated weights for policy 0, policy_version 176735 (0.0045) [2024-06-18 17:20:00,500][18875] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2895675392. Throughput: 0: 41156.9. Samples: 119739720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:00,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 17:20:02,860][19107] Updated weights for policy 0, policy_version 176745 (0.0055) [2024-06-18 17:20:03,666][19087] Signal inference workers to stop experience collection... (1750 times) [2024-06-18 17:20:03,667][19087] Signal inference workers to resume experience collection... (1750 times) [2024-06-18 17:20:03,713][19107] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-18 17:20:03,713][19107] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-18 17:20:05,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 2895888384. Throughput: 0: 41145.9. Samples: 119982400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:05,501][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 17:20:06,854][19107] Updated weights for policy 0, policy_version 176755 (0.0040) [2024-06-18 17:20:10,500][18875] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2896084992. Throughput: 0: 40984.8. 
Samples: 120223820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:10,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 17:20:10,866][19107] Updated weights for policy 0, policy_version 176765 (0.0044) [2024-06-18 17:20:14,756][19107] Updated weights for policy 0, policy_version 176775 (0.0034) [2024-06-18 17:20:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 2896297984. Throughput: 0: 40884.9. Samples: 120346860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:15,501][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 17:20:18,887][19107] Updated weights for policy 0, policy_version 176785 (0.0040) [2024-06-18 17:20:20,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41099.4). Total num frames: 2896494592. Throughput: 0: 40864.0. Samples: 120591380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:20,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 17:20:22,659][19107] Updated weights for policy 0, policy_version 176795 (0.0033) [2024-06-18 17:20:25,500][18875] Fps is (10 sec: 39321.1, 60 sec: 40686.8, 300 sec: 41098.8). Total num frames: 2896691200. Throughput: 0: 40813.2. Samples: 120837320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:25,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 17:20:27,540][19107] Updated weights for policy 0, policy_version 176805 (0.0031) [2024-06-18 17:20:30,500][18875] Fps is (10 sec: 42598.9, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2896920576. Throughput: 0: 40784.2. Samples: 120960540. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:30,500][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 17:20:31,272][19107] Updated weights for policy 0, policy_version 176815 (0.0041) [2024-06-18 17:20:35,484][19107] Updated weights for policy 0, policy_version 176825 (0.0042) [2024-06-18 17:20:35,500][18875] Fps is (10 sec: 40960.9, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 2897100800. Throughput: 0: 40764.2. Samples: 121201900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:35,500][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 17:20:35,630][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000176826_2897117184.pth... [2024-06-18 17:20:35,680][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000176224_2887254016.pth [2024-06-18 17:20:39,045][19107] Updated weights for policy 0, policy_version 176835 (0.0041) [2024-06-18 17:20:40,500][18875] Fps is (10 sec: 39320.7, 60 sec: 40413.8, 300 sec: 41098.8). Total num frames: 2897313792. Throughput: 0: 40728.9. Samples: 121445900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:40,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 17:20:43,453][19107] Updated weights for policy 0, policy_version 176845 (0.0041) [2024-06-18 17:20:45,500][18875] Fps is (10 sec: 42597.4, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2897526784. Throughput: 0: 40667.5. Samples: 121569760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:45,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 17:20:46,965][19107] Updated weights for policy 0, policy_version 176855 (0.0040) [2024-06-18 17:20:50,500][18875] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 2897723392. Throughput: 0: 40649.6. Samples: 121811640. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:50,501][18875] Avg episode reward: [(0, '0.686')] [2024-06-18 17:20:51,366][19107] Updated weights for policy 0, policy_version 176865 (0.0031) [2024-06-18 17:20:54,877][19107] Updated weights for policy 0, policy_version 176875 (0.0031) [2024-06-18 17:20:55,500][18875] Fps is (10 sec: 39322.3, 60 sec: 40414.0, 300 sec: 40988.3). Total num frames: 2897920000. Throughput: 0: 40657.3. Samples: 122053400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:20:55,501][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 17:20:59,389][19107] Updated weights for policy 0, policy_version 176885 (0.0042) [2024-06-18 17:21:00,500][18875] Fps is (10 sec: 39322.4, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 2898116608. Throughput: 0: 40703.6. Samples: 122178520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 17:21:00,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 17:21:03,153][19107] Updated weights for policy 0, policy_version 176895 (0.0041) [2024-06-18 17:21:05,500][18875] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 41043.3). Total num frames: 2898313216. Throughput: 0: 40741.8. Samples: 122424760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:05,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 17:21:07,606][19107] Updated weights for policy 0, policy_version 176905 (0.0046) [2024-06-18 17:21:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 2898526208. Throughput: 0: 40712.1. Samples: 122669360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:10,501][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 17:21:10,937][19107] Updated weights for policy 0, policy_version 176915 (0.0025) [2024-06-18 17:21:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 40413.8, 300 sec: 40932.7). Total num frames: 2898722816. Throughput: 0: 40803.9. Samples: 122796720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:15,501][18875] Avg episode reward: [(0, '0.332')] [2024-06-18 17:21:15,560][19107] Updated weights for policy 0, policy_version 176925 (0.0056) [2024-06-18 17:21:18,987][19107] Updated weights for policy 0, policy_version 176935 (0.0046) [2024-06-18 17:21:20,500][18875] Fps is (10 sec: 42597.8, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 2898952192. Throughput: 0: 40795.4. Samples: 123037700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:20,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 17:21:23,291][19107] Updated weights for policy 0, policy_version 176945 (0.0028) [2024-06-18 17:21:25,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 2899165184. Throughput: 0: 40925.0. Samples: 123287520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:25,501][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 17:21:27,103][19107] Updated weights for policy 0, policy_version 176955 (0.0038) [2024-06-18 17:21:30,500][18875] Fps is (10 sec: 37683.4, 60 sec: 40140.7, 300 sec: 40932.2). Total num frames: 2899329024. Throughput: 0: 40812.1. Samples: 123406300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:30,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 17:21:31,227][19107] Updated weights for policy 0, policy_version 176965 (0.0030) [2024-06-18 17:21:34,860][19107] Updated weights for policy 0, policy_version 176975 (0.0044) [2024-06-18 17:21:35,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41232.9, 300 sec: 41098.8). Total num frames: 2899574784. Throughput: 0: 40957.9. Samples: 123654740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:35,501][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 17:21:39,208][19107] Updated weights for policy 0, policy_version 176985 (0.0034) [2024-06-18 17:21:40,501][18875] Fps is (10 sec: 45874.5, 60 sec: 41233.0, 300 sec: 41043.3). 
Total num frames: 2899787776. Throughput: 0: 41064.2. Samples: 123901300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:40,502][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 17:21:42,601][19087] Signal inference workers to stop experience collection... (1800 times) [2024-06-18 17:21:42,657][19107] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-18 17:21:42,657][19087] Signal inference workers to resume experience collection... (1800 times) [2024-06-18 17:21:42,671][19107] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-18 17:21:42,803][19107] Updated weights for policy 0, policy_version 176995 (0.0042) [2024-06-18 17:21:45,500][18875] Fps is (10 sec: 39321.8, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 2899968000. Throughput: 0: 40992.0. Samples: 124023160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:45,504][18875] Avg episode reward: [(0, '0.660')] [2024-06-18 17:21:47,236][19107] Updated weights for policy 0, policy_version 177005 (0.0037) [2024-06-18 17:21:50,501][18875] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 2900180992. Throughput: 0: 40934.4. Samples: 124266820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:50,501][18875] Avg episode reward: [(0, '0.704')] [2024-06-18 17:21:51,075][19107] Updated weights for policy 0, policy_version 177015 (0.0036) [2024-06-18 17:21:55,189][19107] Updated weights for policy 0, policy_version 177025 (0.0041) [2024-06-18 17:21:55,500][18875] Fps is (10 sec: 40959.7, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 2900377600. Throughput: 0: 41098.5. Samples: 124518800. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:21:55,501][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 17:21:58,853][19107] Updated weights for policy 0, policy_version 177035 (0.0044) [2024-06-18 17:22:00,500][18875] Fps is (10 sec: 39322.5, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 2900574208. Throughput: 0: 40913.4. Samples: 124637820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:22:00,501][18875] Avg episode reward: [(0, '0.574')] [2024-06-18 17:22:03,379][19107] Updated weights for policy 0, policy_version 177045 (0.0041) [2024-06-18 17:22:05,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 2900803584. Throughput: 0: 40873.8. Samples: 124877020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:22:05,501][18875] Avg episode reward: [(0, '0.452')] [2024-06-18 17:22:06,808][19107] Updated weights for policy 0, policy_version 177055 (0.0031) [2024-06-18 17:22:10,500][18875] Fps is (10 sec: 39322.0, 60 sec: 40687.0, 300 sec: 40932.3). Total num frames: 2900967424. Throughput: 0: 40911.7. Samples: 125128540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:10,500][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 17:22:11,290][19107] Updated weights for policy 0, policy_version 177065 (0.0039) [2024-06-18 17:22:14,890][19107] Updated weights for policy 0, policy_version 177075 (0.0044) [2024-06-18 17:22:15,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 2901196800. Throughput: 0: 40887.6. Samples: 125246240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:15,501][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 17:22:19,477][19107] Updated weights for policy 0, policy_version 177085 (0.0028) [2024-06-18 17:22:20,500][18875] Fps is (10 sec: 44236.6, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 2901409792. Throughput: 0: 41042.4. Samples: 125501640. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:20,501][18875] Avg episode reward: [(0, '0.318')] [2024-06-18 17:22:22,887][19107] Updated weights for policy 0, policy_version 177095 (0.0033) [2024-06-18 17:22:25,502][18875] Fps is (10 sec: 40954.2, 60 sec: 40686.0, 300 sec: 40932.0). Total num frames: 2901606400. Throughput: 0: 41025.6. Samples: 125747500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:25,508][18875] Avg episode reward: [(0, '0.365')] [2024-06-18 17:22:27,162][19107] Updated weights for policy 0, policy_version 177105 (0.0047) [2024-06-18 17:22:30,500][18875] Fps is (10 sec: 40959.2, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 2901819392. Throughput: 0: 41067.9. Samples: 125871220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:30,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 17:22:30,966][19107] Updated weights for policy 0, policy_version 177115 (0.0042) [2024-06-18 17:22:35,046][19107] Updated weights for policy 0, policy_version 177125 (0.0034) [2024-06-18 17:22:35,500][18875] Fps is (10 sec: 40965.7, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 2902016000. Throughput: 0: 41183.8. Samples: 126120080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:35,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 17:22:35,517][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000177125_2902016000.pth... [2024-06-18 17:22:35,597][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000176526_2892201984.pth [2024-06-18 17:22:39,347][19107] Updated weights for policy 0, policy_version 177135 (0.0037) [2024-06-18 17:22:40,500][18875] Fps is (10 sec: 40960.5, 60 sec: 40687.1, 300 sec: 40987.8). Total num frames: 2902228992. Throughput: 0: 41047.7. Samples: 126365940. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:40,501][18875] Avg episode reward: [(0, '0.721')] [2024-06-18 17:22:42,860][19107] Updated weights for policy 0, policy_version 177145 (0.0038) [2024-06-18 17:22:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 2902441984. Throughput: 0: 41056.5. Samples: 126485360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:45,501][18875] Avg episode reward: [(0, '0.721')] [2024-06-18 17:22:47,561][19107] Updated weights for policy 0, policy_version 177155 (0.0037) [2024-06-18 17:22:50,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41233.2, 300 sec: 41043.3). Total num frames: 2902654976. Throughput: 0: 41296.5. Samples: 126735360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:50,501][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 17:22:50,611][19107] Updated weights for policy 0, policy_version 177165 (0.0036) [2024-06-18 17:22:55,458][19107] Updated weights for policy 0, policy_version 177175 (0.0040) [2024-06-18 17:22:55,500][18875] Fps is (10 sec: 39320.8, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 2902835200. Throughput: 0: 41237.5. Samples: 126984240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:22:55,501][18875] Avg episode reward: [(0, '0.338')] [2024-06-18 17:22:56,402][19087] Signal inference workers to stop experience collection... (1850 times) [2024-06-18 17:22:56,402][19087] Signal inference workers to resume experience collection... (1850 times) [2024-06-18 17:22:56,416][19107] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-18 17:22:56,416][19107] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-18 17:22:58,988][19107] Updated weights for policy 0, policy_version 177185 (0.0032) [2024-06-18 17:23:00,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41043.8). Total num frames: 2903064576. Throughput: 0: 41196.3. 
Samples: 127100080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:23:00,501][18875] Avg episode reward: [(0, '0.373')] [2024-06-18 17:23:03,557][19107] Updated weights for policy 0, policy_version 177195 (0.0055) [2024-06-18 17:23:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 2903261184. Throughput: 0: 41147.8. Samples: 127353300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:23:05,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 17:23:06,908][19107] Updated weights for policy 0, policy_version 177205 (0.0036) [2024-06-18 17:23:10,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 40988.3). Total num frames: 2903457792. Throughput: 0: 41039.5. Samples: 127594220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 17:23:10,501][18875] Avg episode reward: [(0, '0.418')] [2024-06-18 17:23:11,700][19107] Updated weights for policy 0, policy_version 177215 (0.0048) [2024-06-18 17:23:15,110][19107] Updated weights for policy 0, policy_version 177225 (0.0041) [2024-06-18 17:23:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 2903670784. Throughput: 0: 40922.2. Samples: 127712720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:23:15,501][18875] Avg episode reward: [(0, '0.418')] [2024-06-18 17:23:19,581][19107] Updated weights for policy 0, policy_version 177235 (0.0045) [2024-06-18 17:23:20,500][18875] Fps is (10 sec: 40959.7, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 2903867392. Throughput: 0: 41002.2. Samples: 127965180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:23:20,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 17:23:23,169][19107] Updated weights for policy 0, policy_version 177245 (0.0040) [2024-06-18 17:23:25,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41234.0, 300 sec: 41043.3). Total num frames: 2904080384. Throughput: 0: 40892.8. 
Samples: 128206120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:23:25,501][18875] Avg episode reward: [(0, '0.679')] [2024-06-18 17:23:27,457][19107] Updated weights for policy 0, policy_version 177255 (0.0032) [2024-06-18 17:23:30,500][18875] Fps is (10 sec: 40960.7, 60 sec: 40960.2, 300 sec: 40932.3). Total num frames: 2904276992. Throughput: 0: 41034.7. Samples: 128331920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:23:30,500][18875] Avg episode reward: [(0, '0.734')] [2024-06-18 17:23:30,952][19107] Updated weights for policy 0, policy_version 177265 (0.0040) [2024-06-18 17:23:35,436][19107] Updated weights for policy 0, policy_version 177275 (0.0032) [2024-06-18 17:23:35,500][18875] Fps is (10 sec: 39322.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 2904473600. Throughput: 0: 41084.5. Samples: 128584160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:23:35,501][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 17:23:38,644][19107] Updated weights for policy 0, policy_version 177285 (0.0030) [2024-06-18 17:23:40,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 2904702976. Throughput: 0: 41005.1. Samples: 128829460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:23:40,500][18875] Avg episode reward: [(0, '0.668')] [2024-06-18 17:23:43,100][19107] Updated weights for policy 0, policy_version 177295 (0.0034) [2024-06-18 17:23:45,500][18875] Fps is (10 sec: 45874.3, 60 sec: 41506.0, 300 sec: 41043.3). Total num frames: 2904932352. Throughput: 0: 41321.7. Samples: 128959560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:23:45,501][18875] Avg episode reward: [(0, '0.678')] [2024-06-18 17:23:46,650][19107] Updated weights for policy 0, policy_version 177305 (0.0042) [2024-06-18 17:23:50,505][18875] Fps is (10 sec: 37664.1, 60 sec: 40410.5, 300 sec: 40876.0). Total num frames: 2905079808. Throughput: 0: 40988.0. Samples: 129197960. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:23:50,506][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 17:23:51,162][19107] Updated weights for policy 0, policy_version 177315 (0.0043) [2024-06-18 17:23:54,947][19107] Updated weights for policy 0, policy_version 177325 (0.0033) [2024-06-18 17:23:55,500][18875] Fps is (10 sec: 37683.8, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 2905309184. Throughput: 0: 41088.9. Samples: 129443220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:23:55,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 17:23:59,085][19107] Updated weights for policy 0, policy_version 177335 (0.0047) [2024-06-18 17:24:00,503][18875] Fps is (10 sec: 45886.6, 60 sec: 41231.4, 300 sec: 41098.5). Total num frames: 2905538560. Throughput: 0: 41329.4. Samples: 129572640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:24:00,503][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 17:24:02,712][19107] Updated weights for policy 0, policy_version 177345 (0.0034) [2024-06-18 17:24:05,504][18875] Fps is (10 sec: 39307.2, 60 sec: 40684.6, 300 sec: 40931.7). Total num frames: 2905702400. Throughput: 0: 40985.2. Samples: 129809660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:24:05,505][18875] Avg episode reward: [(0, '0.729')] [2024-06-18 17:24:06,780][19107] Updated weights for policy 0, policy_version 177355 (0.0039) [2024-06-18 17:24:10,414][19107] Updated weights for policy 0, policy_version 177365 (0.0036) [2024-06-18 17:24:10,500][18875] Fps is (10 sec: 40970.1, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 2905948160. Throughput: 0: 41304.1. Samples: 130064800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:24:10,501][18875] Avg episode reward: [(0, '0.585')] [2024-06-18 17:24:14,470][19107] Updated weights for policy 0, policy_version 177375 (0.0035) [2024-06-18 17:24:15,500][18875] Fps is (10 sec: 42614.1, 60 sec: 40960.1, 300 sec: 41043.3). 
Total num frames: 2906128384. Throughput: 0: 41256.0. Samples: 130188440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 17:24:15,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 17:24:18,299][19107] Updated weights for policy 0, policy_version 177385 (0.0033) [2024-06-18 17:24:20,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 2906357760. Throughput: 0: 41075.0. Samples: 130432540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:24:20,501][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 17:24:22,474][19107] Updated weights for policy 0, policy_version 177395 (0.0035) [2024-06-18 17:24:25,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 2906554368. Throughput: 0: 41118.6. Samples: 130679800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:24:25,500][18875] Avg episode reward: [(0, '0.348')] [2024-06-18 17:24:26,573][19107] Updated weights for policy 0, policy_version 177405 (0.0036) [2024-06-18 17:24:30,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 2906750976. Throughput: 0: 41006.8. Samples: 130804860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:24:30,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 17:24:30,784][19107] Updated weights for policy 0, policy_version 177415 (0.0033) [2024-06-18 17:24:34,403][19107] Updated weights for policy 0, policy_version 177425 (0.0035) [2024-06-18 17:24:35,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 40932.2). Total num frames: 2906963968. Throughput: 0: 41284.5. Samples: 131055560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:24:35,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 17:24:35,602][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000177428_2906980352.pth... 
[2024-06-18 17:24:35,658][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000176826_2897117184.pth [2024-06-18 17:24:38,822][19107] Updated weights for policy 0, policy_version 177435 (0.0031) [2024-06-18 17:24:40,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 2907160576. Throughput: 0: 41285.4. Samples: 131301060. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:24:40,501][18875] Avg episode reward: [(0, '0.370')] [2024-06-18 17:24:42,662][19107] Updated weights for policy 0, policy_version 177445 (0.0030) [2024-06-18 17:24:43,403][19087] Signal inference workers to stop experience collection... (1900 times) [2024-06-18 17:24:43,403][19087] Signal inference workers to resume experience collection... (1900 times) [2024-06-18 17:24:43,437][19107] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-18 17:24:43,438][19107] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-18 17:24:45,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 2907373568. Throughput: 0: 41069.3. Samples: 131420660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:24:45,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 17:24:46,700][19107] Updated weights for policy 0, policy_version 177455 (0.0028) [2024-06-18 17:24:50,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41509.6, 300 sec: 40932.3). Total num frames: 2907570176. Throughput: 0: 41342.5. Samples: 131669920. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:24:50,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 17:24:50,576][19107] Updated weights for policy 0, policy_version 177465 (0.0041) [2024-06-18 17:24:54,502][19107] Updated weights for policy 0, policy_version 177475 (0.0041) [2024-06-18 17:24:55,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 2907783168. 
Throughput: 0: 41233.3. Samples: 131920300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:24:55,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 17:24:58,373][19107] Updated weights for policy 0, policy_version 177485 (0.0048) [2024-06-18 17:25:00,503][18875] Fps is (10 sec: 42586.9, 60 sec: 40959.9, 300 sec: 41042.9). Total num frames: 2907996160. Throughput: 0: 41298.8. Samples: 132047000. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:25:00,503][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 17:25:02,337][19107] Updated weights for policy 0, policy_version 177495 (0.0035) [2024-06-18 17:25:05,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41508.6, 300 sec: 41043.3). Total num frames: 2908192768. Throughput: 0: 41315.1. Samples: 132291720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:25:05,501][18875] Avg episode reward: [(0, '0.620')] [2024-06-18 17:25:06,279][19107] Updated weights for policy 0, policy_version 177505 (0.0033) [2024-06-18 17:25:10,191][19107] Updated weights for policy 0, policy_version 177515 (0.0040) [2024-06-18 17:25:10,500][18875] Fps is (10 sec: 40970.6, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 2908405760. Throughput: 0: 41323.0. Samples: 132539340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:25:10,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 17:25:14,582][19107] Updated weights for policy 0, policy_version 177525 (0.0047) [2024-06-18 17:25:15,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 2908602368. Throughput: 0: 41189.6. Samples: 132658400. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:25:15,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 17:25:18,003][19107] Updated weights for policy 0, policy_version 177535 (0.0039) [2024-06-18 17:25:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2908831744. 
Throughput: 0: 41072.8. Samples: 132903840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 25.0) [2024-06-18 17:25:20,503][18875] Avg episode reward: [(0, '0.518')] [2024-06-18 17:25:22,349][19107] Updated weights for policy 0, policy_version 177545 (0.0042) [2024-06-18 17:25:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 2909028352. Throughput: 0: 41143.9. Samples: 133152540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:25:25,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 17:25:25,739][19107] Updated weights for policy 0, policy_version 177555 (0.0039) [2024-06-18 17:25:30,173][19107] Updated weights for policy 0, policy_version 177565 (0.0040) [2024-06-18 17:25:30,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2909224960. Throughput: 0: 41228.0. Samples: 133275920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:25:30,501][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 17:25:33,721][19107] Updated weights for policy 0, policy_version 177575 (0.0046) [2024-06-18 17:25:35,500][18875] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 2909421568. Throughput: 0: 41128.7. Samples: 133520720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:25:35,501][18875] Avg episode reward: [(0, '0.227')] [2024-06-18 17:25:38,492][19107] Updated weights for policy 0, policy_version 177585 (0.0041) [2024-06-18 17:25:40,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41232.9, 300 sec: 41043.3). Total num frames: 2909634560. Throughput: 0: 41193.3. Samples: 133774000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:25:40,501][18875] Avg episode reward: [(0, '0.144')] [2024-06-18 17:25:41,985][19107] Updated weights for policy 0, policy_version 177595 (0.0040) [2024-06-18 17:25:45,501][18875] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2909847552. 
Throughput: 0: 41141.4. Samples: 133898260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:25:45,501][18875] Avg episode reward: [(0, '0.190')] [2024-06-18 17:25:46,338][19107] Updated weights for policy 0, policy_version 177605 (0.0033) [2024-06-18 17:25:50,045][19107] Updated weights for policy 0, policy_version 177615 (0.0034) [2024-06-18 17:25:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2910044160. Throughput: 0: 41206.2. Samples: 134146000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:25:50,501][18875] Avg episode reward: [(0, '0.145')] [2024-06-18 17:25:54,182][19107] Updated weights for policy 0, policy_version 177625 (0.0039) [2024-06-18 17:25:55,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 2910257152. Throughput: 0: 41220.4. Samples: 134394260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:25:55,501][18875] Avg episode reward: [(0, '0.299')] [2024-06-18 17:25:58,071][19107] Updated weights for policy 0, policy_version 177635 (0.0038) [2024-06-18 17:26:00,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41234.8, 300 sec: 41209.9). Total num frames: 2910470144. Throughput: 0: 41281.3. Samples: 134516060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:26:00,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 17:26:01,941][19107] Updated weights for policy 0, policy_version 177645 (0.0038) [2024-06-18 17:26:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2910666752. Throughput: 0: 41284.0. Samples: 134761620. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:26:05,501][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 17:26:05,955][19107] Updated weights for policy 0, policy_version 177655 (0.0032) [2024-06-18 17:26:09,612][19107] Updated weights for policy 0, policy_version 177665 (0.0029) [2024-06-18 17:26:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 2910879744. Throughput: 0: 41348.8. Samples: 135013240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:26:10,501][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 17:26:13,850][19107] Updated weights for policy 0, policy_version 177675 (0.0029) [2024-06-18 17:26:15,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 2911092736. Throughput: 0: 41431.5. Samples: 135140340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:26:15,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 17:26:17,350][19107] Updated weights for policy 0, policy_version 177685 (0.0039) [2024-06-18 17:26:20,500][18875] Fps is (10 sec: 39322.4, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 2911272960. Throughput: 0: 41450.9. Samples: 135386000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:26:20,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 17:26:21,791][19107] Updated weights for policy 0, policy_version 177695 (0.0032) [2024-06-18 17:26:23,428][19087] Signal inference workers to stop experience collection... (1950 times) [2024-06-18 17:26:23,479][19107] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-18 17:26:23,480][19087] Signal inference workers to resume experience collection... (1950 times) [2024-06-18 17:26:23,495][19107] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-18 17:26:25,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 2911502336. Throughput: 0: 41261.0. 
Samples: 135630740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:26:25,501][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 17:26:25,659][19107] Updated weights for policy 0, policy_version 177705 (0.0027) [2024-06-18 17:26:29,593][19107] Updated weights for policy 0, policy_version 177715 (0.0036) [2024-06-18 17:26:30,500][18875] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 2911682560. Throughput: 0: 41295.3. Samples: 135756540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:26:30,500][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 17:26:33,378][19107] Updated weights for policy 0, policy_version 177725 (0.0033) [2024-06-18 17:26:35,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 2911895552. Throughput: 0: 41272.4. Samples: 136003260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:26:35,501][18875] Avg episode reward: [(0, '0.716')] [2024-06-18 17:26:35,738][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000177730_2911928320.pth... [2024-06-18 17:26:35,807][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000177125_2902016000.pth [2024-06-18 17:26:37,793][19107] Updated weights for policy 0, policy_version 177735 (0.0038) [2024-06-18 17:26:40,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41506.3, 300 sec: 41209.9). Total num frames: 2912124928. Throughput: 0: 41282.0. Samples: 136251940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:26:40,500][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 17:26:41,050][19107] Updated weights for policy 0, policy_version 177745 (0.0031) [2024-06-18 17:26:45,500][18875] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2912305152. Throughput: 0: 41314.8. Samples: 136375220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:26:45,501][18875] Avg episode reward: [(0, '0.380')] [2024-06-18 17:26:45,725][19107] Updated weights for policy 0, policy_version 177755 (0.0040) [2024-06-18 17:26:49,289][19107] Updated weights for policy 0, policy_version 177765 (0.0046) [2024-06-18 17:26:50,500][18875] Fps is (10 sec: 39320.7, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2912518144. Throughput: 0: 41209.4. Samples: 136616040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:26:50,501][18875] Avg episode reward: [(0, '0.355')] [2024-06-18 17:26:53,680][19107] Updated weights for policy 0, policy_version 177775 (0.0052) [2024-06-18 17:26:55,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41506.3, 300 sec: 41265.5). Total num frames: 2912747520. Throughput: 0: 41233.1. Samples: 136868720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:26:55,500][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 17:26:57,022][19107] Updated weights for policy 0, policy_version 177785 (0.0043) [2024-06-18 17:27:00,500][18875] Fps is (10 sec: 40960.7, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2912927744. Throughput: 0: 41053.5. Samples: 136987740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:27:00,501][18875] Avg episode reward: [(0, '0.381')] [2024-06-18 17:27:01,767][19107] Updated weights for policy 0, policy_version 177795 (0.0037) [2024-06-18 17:27:05,436][19107] Updated weights for policy 0, policy_version 177805 (0.0040) [2024-06-18 17:27:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.3, 300 sec: 41321.0). Total num frames: 2913157120. Throughput: 0: 41007.6. Samples: 137231340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:27:05,500][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 17:27:09,660][19107] Updated weights for policy 0, policy_version 177815 (0.0035) [2024-06-18 17:27:10,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40960.1, 300 sec: 41154.4). 
Total num frames: 2913337344. Throughput: 0: 41252.5. Samples: 137487100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:27:10,501][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 17:27:13,417][19107] Updated weights for policy 0, policy_version 177825 (0.0037) [2024-06-18 17:27:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41233.2, 300 sec: 41209.9). Total num frames: 2913566720. Throughput: 0: 41065.4. Samples: 137604480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:27:15,500][18875] Avg episode reward: [(0, '0.400')] [2024-06-18 17:27:17,597][19107] Updated weights for policy 0, policy_version 177835 (0.0030) [2024-06-18 17:27:20,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41265.6). Total num frames: 2913779712. Throughput: 0: 41182.7. Samples: 137856480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:27:20,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 17:27:21,120][19107] Updated weights for policy 0, policy_version 177845 (0.0028) [2024-06-18 17:27:25,500][18875] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 2913959936. Throughput: 0: 41179.5. Samples: 138105020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:27:25,500][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 17:27:25,578][19107] Updated weights for policy 0, policy_version 177855 (0.0028) [2024-06-18 17:27:28,983][19107] Updated weights for policy 0, policy_version 177865 (0.0028) [2024-06-18 17:27:30,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 2914189312. Throughput: 0: 41150.7. Samples: 138227000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-18 17:27:30,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 17:27:33,485][19107] Updated weights for policy 0, policy_version 177875 (0.0033) [2024-06-18 17:27:34,814][19087] Signal inference workers to stop experience collection... 
(2000 times) [2024-06-18 17:27:34,868][19107] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-18 17:27:34,877][19087] Signal inference workers to resume experience collection... (2000 times) [2024-06-18 17:27:34,879][19107] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-18 17:27:35,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41779.3, 300 sec: 41265.5). Total num frames: 2914402304. Throughput: 0: 41550.3. Samples: 138485800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:27:35,501][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 17:27:36,659][19107] Updated weights for policy 0, policy_version 177885 (0.0042) [2024-06-18 17:27:40,500][18875] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 2914582528. Throughput: 0: 41313.7. Samples: 138727840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:27:40,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 17:27:42,025][19107] Updated weights for policy 0, policy_version 177895 (0.0042) [2024-06-18 17:27:44,596][19107] Updated weights for policy 0, policy_version 177905 (0.0042) [2024-06-18 17:27:45,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 2914811904. Throughput: 0: 41356.2. Samples: 138848780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:27:45,501][18875] Avg episode reward: [(0, '0.404')] [2024-06-18 17:27:50,199][19107] Updated weights for policy 0, policy_version 177915 (0.0050) [2024-06-18 17:27:50,500][18875] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2914975744. Throughput: 0: 41435.4. Samples: 139095940. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:27:50,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 17:27:52,734][19107] Updated weights for policy 0, policy_version 177925 (0.0031) [2024-06-18 17:27:55,504][18875] Fps is (10 sec: 39308.7, 60 sec: 40957.6, 300 sec: 41153.9). Total num frames: 2915205120. Throughput: 0: 41076.0. Samples: 139335660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:27:55,504][18875] Avg episode reward: [(0, '0.703')] [2024-06-18 17:27:58,011][19107] Updated weights for policy 0, policy_version 177935 (0.0035) [2024-06-18 17:28:00,500][18875] Fps is (10 sec: 45875.4, 60 sec: 41779.1, 300 sec: 41265.5). Total num frames: 2915434496. Throughput: 0: 41423.0. Samples: 139468520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:28:00,501][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 17:28:00,602][19107] Updated weights for policy 0, policy_version 177945 (0.0045) [2024-06-18 17:28:05,500][18875] Fps is (10 sec: 39334.4, 60 sec: 40686.8, 300 sec: 41154.4). Total num frames: 2915598336. Throughput: 0: 41296.8. Samples: 139714840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:28:05,501][18875] Avg episode reward: [(0, '0.666')] [2024-06-18 17:28:05,837][19107] Updated weights for policy 0, policy_version 177955 (0.0037) [2024-06-18 17:28:09,226][19107] Updated weights for policy 0, policy_version 177965 (0.0032) [2024-06-18 17:28:10,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41265.5). Total num frames: 2915844096. Throughput: 0: 40831.1. Samples: 139942420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:28:10,500][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 17:28:13,917][19107] Updated weights for policy 0, policy_version 177975 (0.0044) [2024-06-18 17:28:15,500][18875] Fps is (10 sec: 42599.0, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 2916024320. Throughput: 0: 41207.5. Samples: 140081340. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:28:15,501][18875] Avg episode reward: [(0, '0.380')] [2024-06-18 17:28:16,974][19107] Updated weights for policy 0, policy_version 177985 (0.0037) [2024-06-18 17:28:20,500][18875] Fps is (10 sec: 37682.6, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 2916220928. Throughput: 0: 40662.2. Samples: 140315600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:28:20,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 17:28:21,734][19107] Updated weights for policy 0, policy_version 177995 (0.0027) [2024-06-18 17:28:25,024][19107] Updated weights for policy 0, policy_version 178005 (0.0033) [2024-06-18 17:28:25,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41506.0, 300 sec: 41265.4). Total num frames: 2916450304. Throughput: 0: 40783.0. Samples: 140563080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:28:25,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 17:28:29,981][19107] Updated weights for policy 0, policy_version 178015 (0.0035) [2024-06-18 17:28:30,500][18875] Fps is (10 sec: 40960.5, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 2916630528. Throughput: 0: 40947.3. Samples: 140691400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:28:30,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 17:28:33,147][19107] Updated weights for policy 0, policy_version 178025 (0.0035) [2024-06-18 17:28:35,500][18875] Fps is (10 sec: 39322.1, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 2916843520. Throughput: 0: 40709.4. Samples: 140927860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:28:35,501][18875] Avg episode reward: [(0, '0.770')] [2024-06-18 17:28:35,661][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000178031_2916859904.pth... 
[2024-06-18 17:28:35,718][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000177428_2906980352.pth [2024-06-18 17:28:37,936][19107] Updated weights for policy 0, policy_version 178035 (0.0036) [2024-06-18 17:28:40,500][18875] Fps is (10 sec: 39321.2, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 2917023744. Throughput: 0: 40981.3. Samples: 141179680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-18 17:28:40,504][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 17:28:41,208][19107] Updated weights for policy 0, policy_version 178045 (0.0045) [2024-06-18 17:28:41,534][19087] Signal inference workers to stop experience collection... (2050 times) [2024-06-18 17:28:41,537][19087] Signal inference workers to resume experience collection... (2050 times) [2024-06-18 17:28:41,562][19107] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-18 17:28:41,562][19107] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-18 17:28:45,500][18875] Fps is (10 sec: 37683.0, 60 sec: 40140.9, 300 sec: 41155.1). Total num frames: 2917220352. Throughput: 0: 40602.6. Samples: 141295640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:28:45,504][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 17:28:45,778][19107] Updated weights for policy 0, policy_version 178055 (0.0043) [2024-06-18 17:28:49,221][19107] Updated weights for policy 0, policy_version 178065 (0.0039) [2024-06-18 17:28:50,504][18875] Fps is (10 sec: 45858.9, 60 sec: 41776.7, 300 sec: 41265.0). Total num frames: 2917482496. Throughput: 0: 40775.5. Samples: 141549880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:28:50,505][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 17:28:53,801][19107] Updated weights for policy 0, policy_version 178075 (0.0036) [2024-06-18 17:28:55,501][18875] Fps is (10 sec: 42598.0, 60 sec: 40689.1, 300 sec: 41043.6). Total num frames: 2917646336. 
Throughput: 0: 41091.3. Samples: 141791540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:28:55,501][18875] Avg episode reward: [(0, '0.458')] [2024-06-18 17:28:57,002][19107] Updated weights for policy 0, policy_version 178085 (0.0043) [2024-06-18 17:29:00,500][18875] Fps is (10 sec: 36057.7, 60 sec: 40140.8, 300 sec: 41154.9). Total num frames: 2917842944. Throughput: 0: 40573.3. Samples: 141907140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:00,501][18875] Avg episode reward: [(0, '0.148')] [2024-06-18 17:29:01,806][19107] Updated weights for policy 0, policy_version 178095 (0.0036) [2024-06-18 17:29:05,208][19107] Updated weights for policy 0, policy_version 178105 (0.0038) [2024-06-18 17:29:05,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 2918072320. Throughput: 0: 40932.5. Samples: 142157560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:05,501][18875] Avg episode reward: [(0, '0.282')] [2024-06-18 17:29:09,699][19107] Updated weights for policy 0, policy_version 178115 (0.0039) [2024-06-18 17:29:10,500][18875] Fps is (10 sec: 40960.4, 60 sec: 40140.8, 300 sec: 41098.8). Total num frames: 2918252544. Throughput: 0: 40869.5. Samples: 142402200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:10,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 17:29:13,451][19107] Updated weights for policy 0, policy_version 178125 (0.0026) [2024-06-18 17:29:15,500][18875] Fps is (10 sec: 40959.4, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 2918481920. Throughput: 0: 40709.2. Samples: 142523320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:15,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 17:29:17,617][19107] Updated weights for policy 0, policy_version 178135 (0.0042) [2024-06-18 17:29:20,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2918694912. 
Throughput: 0: 40872.9. Samples: 142767140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:20,501][18875] Avg episode reward: [(0, '0.457')] [2024-06-18 17:29:21,376][19107] Updated weights for policy 0, policy_version 178145 (0.0044) [2024-06-18 17:29:25,500][18875] Fps is (10 sec: 39321.8, 60 sec: 40413.9, 300 sec: 41098.8). Total num frames: 2918875136. Throughput: 0: 40963.5. Samples: 143023040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:25,502][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 17:29:25,770][19107] Updated weights for policy 0, policy_version 178155 (0.0048) [2024-06-18 17:29:29,360][19107] Updated weights for policy 0, policy_version 178165 (0.0038) [2024-06-18 17:29:30,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2919104512. Throughput: 0: 41090.8. Samples: 143144720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:30,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 17:29:33,527][19107] Updated weights for policy 0, policy_version 178175 (0.0046) [2024-06-18 17:29:35,500][18875] Fps is (10 sec: 42598.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2919301120. Throughput: 0: 40993.9. Samples: 143394460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:35,501][18875] Avg episode reward: [(0, '0.592')] [2024-06-18 17:29:37,179][19107] Updated weights for policy 0, policy_version 178185 (0.0036) [2024-06-18 17:29:40,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 2919497728. Throughput: 0: 41149.4. Samples: 143643260. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:40,501][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 17:29:41,330][19107] Updated weights for policy 0, policy_version 178195 (0.0036) [2024-06-18 17:29:45,359][19107] Updated weights for policy 0, policy_version 178205 (0.0045) [2024-06-18 17:29:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 2919727104. Throughput: 0: 41238.2. Samples: 143762860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 23.0) [2024-06-18 17:29:45,501][18875] Avg episode reward: [(0, '0.668')] [2024-06-18 17:29:49,082][19107] Updated weights for policy 0, policy_version 178215 (0.0038) [2024-06-18 17:29:50,500][18875] Fps is (10 sec: 42599.2, 60 sec: 40689.5, 300 sec: 41154.4). Total num frames: 2919923712. Throughput: 0: 41218.3. Samples: 144012380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:29:50,500][18875] Avg episode reward: [(0, '0.400')] [2024-06-18 17:29:53,260][19107] Updated weights for policy 0, policy_version 178225 (0.0044) [2024-06-18 17:29:55,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40960.2, 300 sec: 41043.7). Total num frames: 2920103936. Throughput: 0: 41300.9. Samples: 144260740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:29:55,500][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 17:29:57,127][19087] Signal inference workers to stop experience collection... (2100 times) [2024-06-18 17:29:57,128][19087] Signal inference workers to resume experience collection... (2100 times) [2024-06-18 17:29:57,147][19107] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-18 17:29:57,152][19107] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-18 17:29:57,305][19107] Updated weights for policy 0, policy_version 178235 (0.0038) [2024-06-18 17:30:00,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 2920349696. Throughput: 0: 41248.1. 
Samples: 144379480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:00,501][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 17:30:01,173][19107] Updated weights for policy 0, policy_version 178245 (0.0041) [2024-06-18 17:30:05,057][19107] Updated weights for policy 0, policy_version 178255 (0.0040) [2024-06-18 17:30:05,500][18875] Fps is (10 sec: 42598.2, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 2920529920. Throughput: 0: 41540.0. Samples: 144636440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:05,501][18875] Avg episode reward: [(0, '0.598')] [2024-06-18 17:30:09,143][19107] Updated weights for policy 0, policy_version 178265 (0.0034) [2024-06-18 17:30:10,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 2920742912. Throughput: 0: 41174.2. Samples: 144875880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:10,501][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 17:30:13,005][19107] Updated weights for policy 0, policy_version 178275 (0.0041) [2024-06-18 17:30:15,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 2920955904. Throughput: 0: 41211.1. Samples: 144999220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:15,501][18875] Avg episode reward: [(0, '0.373')] [2024-06-18 17:30:17,071][19107] Updated weights for policy 0, policy_version 178285 (0.0045) [2024-06-18 17:30:20,504][18875] Fps is (10 sec: 40945.8, 60 sec: 40957.6, 300 sec: 41098.4). Total num frames: 2921152512. Throughput: 0: 41252.8. Samples: 145250980. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:20,504][18875] Avg episode reward: [(0, '0.597')] [2024-06-18 17:30:20,966][19107] Updated weights for policy 0, policy_version 178295 (0.0033) [2024-06-18 17:30:24,927][19107] Updated weights for policy 0, policy_version 178305 (0.0040) [2024-06-18 17:30:25,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2921365504. Throughput: 0: 40950.2. Samples: 145486020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:25,501][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 17:30:29,107][19107] Updated weights for policy 0, policy_version 178315 (0.0039) [2024-06-18 17:30:30,500][18875] Fps is (10 sec: 40974.2, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 2921562112. Throughput: 0: 41155.5. Samples: 145614860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:30,501][18875] Avg episode reward: [(0, '0.636')] [2024-06-18 17:30:32,763][19107] Updated weights for policy 0, policy_version 178325 (0.0035) [2024-06-18 17:30:35,500][18875] Fps is (10 sec: 39322.3, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2921758720. Throughput: 0: 41101.3. Samples: 145861940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:35,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 17:30:35,513][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000178330_2921758720.pth... [2024-06-18 17:30:35,576][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000177730_2911928320.pth [2024-06-18 17:30:37,123][19107] Updated weights for policy 0, policy_version 178335 (0.0039) [2024-06-18 17:30:40,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 2921971712. Throughput: 0: 41010.9. Samples: 146106240. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:40,501][18875] Avg episode reward: [(0, '0.548')] [2024-06-18 17:30:40,800][19107] Updated weights for policy 0, policy_version 178345 (0.0038) [2024-06-18 17:30:45,088][19107] Updated weights for policy 0, policy_version 178355 (0.0026) [2024-06-18 17:30:45,500][18875] Fps is (10 sec: 40960.1, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 2922168320. Throughput: 0: 41223.2. Samples: 146234520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:45,501][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 17:30:48,665][19107] Updated weights for policy 0, policy_version 178365 (0.0033) [2024-06-18 17:30:50,500][18875] Fps is (10 sec: 40960.3, 60 sec: 40959.9, 300 sec: 41098.9). Total num frames: 2922381312. Throughput: 0: 40881.3. Samples: 146476100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 17:30:50,501][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 17:30:53,348][19107] Updated weights for policy 0, policy_version 178375 (0.0045) [2024-06-18 17:30:55,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41506.0, 300 sec: 41098.9). Total num frames: 2922594304. Throughput: 0: 41009.0. Samples: 146721280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:30:55,501][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 17:30:57,005][19107] Updated weights for policy 0, policy_version 178385 (0.0038) [2024-06-18 17:31:00,504][18875] Fps is (10 sec: 40945.3, 60 sec: 40684.5, 300 sec: 41098.4). Total num frames: 2922790912. Throughput: 0: 41093.6. Samples: 146848580. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:00,504][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 17:31:01,101][19107] Updated weights for policy 0, policy_version 178395 (0.0029) [2024-06-18 17:31:04,753][19107] Updated weights for policy 0, policy_version 178405 (0.0034) [2024-06-18 17:31:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41098.9). Total num frames: 2923003904. Throughput: 0: 40939.2. Samples: 147093100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:05,501][18875] Avg episode reward: [(0, '0.725')] [2024-06-18 17:31:09,271][19107] Updated weights for policy 0, policy_version 178415 (0.0035) [2024-06-18 17:31:10,500][18875] Fps is (10 sec: 42613.6, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 2923216896. Throughput: 0: 41201.4. Samples: 147340080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:10,501][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 17:31:13,239][19107] Updated weights for policy 0, policy_version 178425 (0.0041) [2024-06-18 17:31:15,500][18875] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 2923413504. Throughput: 0: 41129.4. Samples: 147465680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:15,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 17:31:17,210][19107] Updated weights for policy 0, policy_version 178435 (0.0039) [2024-06-18 17:31:20,500][18875] Fps is (10 sec: 39321.2, 60 sec: 40962.3, 300 sec: 41043.3). Total num frames: 2923610112. Throughput: 0: 40947.8. Samples: 147704600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:20,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 17:31:21,245][19107] Updated weights for policy 0, policy_version 178445 (0.0040) [2024-06-18 17:31:24,271][19087] Signal inference workers to stop experience collection... 
(2150 times) [2024-06-18 17:31:24,272][19087] Signal inference workers to resume experience collection... (2150 times) [2024-06-18 17:31:24,285][19107] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-18 17:31:24,324][19107] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-18 17:31:25,078][19107] Updated weights for policy 0, policy_version 178455 (0.0038) [2024-06-18 17:31:25,504][18875] Fps is (10 sec: 40945.1, 60 sec: 40957.6, 300 sec: 41153.9). Total num frames: 2923823104. Throughput: 0: 41204.3. Samples: 147960580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:25,505][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 17:31:29,155][19107] Updated weights for policy 0, policy_version 178465 (0.0039) [2024-06-18 17:31:30,504][18875] Fps is (10 sec: 42583.7, 60 sec: 41230.7, 300 sec: 41153.9). Total num frames: 2924036096. Throughput: 0: 41008.2. Samples: 148080040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:30,504][18875] Avg episode reward: [(0, '0.679')] [2024-06-18 17:31:32,907][19107] Updated weights for policy 0, policy_version 178475 (0.0029) [2024-06-18 17:31:35,500][18875] Fps is (10 sec: 42613.3, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 2924249088. Throughput: 0: 41022.1. Samples: 148322100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:35,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 17:31:36,931][19107] Updated weights for policy 0, policy_version 178485 (0.0036) [2024-06-18 17:31:40,500][18875] Fps is (10 sec: 39335.8, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 2924429312. Throughput: 0: 41422.3. Samples: 148585280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:40,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 17:31:40,790][19107] Updated weights for policy 0, policy_version 178495 (0.0040) [2024-06-18 17:31:44,669][19107] Updated weights for policy 0, policy_version 178505 (0.0030) [2024-06-18 17:31:45,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 2924658688. Throughput: 0: 41100.1. Samples: 148697940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:45,501][18875] Avg episode reward: [(0, '0.400')] [2024-06-18 17:31:48,865][19107] Updated weights for policy 0, policy_version 178515 (0.0036) [2024-06-18 17:31:50,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 2924871680. Throughput: 0: 41183.6. Samples: 148946360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:50,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 17:31:52,466][19107] Updated weights for policy 0, policy_version 178525 (0.0045) [2024-06-18 17:31:55,500][18875] Fps is (10 sec: 37682.9, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 2925035520. Throughput: 0: 41294.2. Samples: 149198320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 17:31:55,501][18875] Avg episode reward: [(0, '0.769')] [2024-06-18 17:31:56,768][19107] Updated weights for policy 0, policy_version 178535 (0.0052) [2024-06-18 17:32:00,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41235.6, 300 sec: 41043.3). Total num frames: 2925264896. Throughput: 0: 41084.9. Samples: 149314500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:00,501][18875] Avg episode reward: [(0, '0.796')] [2024-06-18 17:32:00,582][19107] Updated weights for policy 0, policy_version 178545 (0.0037) [2024-06-18 17:32:04,665][19107] Updated weights for policy 0, policy_version 178555 (0.0025) [2024-06-18 17:32:05,500][18875] Fps is (10 sec: 45875.2, 60 sec: 41506.1, 300 sec: 41209.9). 
Total num frames: 2925494272. Throughput: 0: 41556.0. Samples: 149574620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:05,501][18875] Avg episode reward: [(0, '0.782')] [2024-06-18 17:32:08,499][19107] Updated weights for policy 0, policy_version 178565 (0.0039) [2024-06-18 17:32:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 2925690880. Throughput: 0: 41109.1. Samples: 149810340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:10,503][18875] Avg episode reward: [(0, '0.727')] [2024-06-18 17:32:12,588][19107] Updated weights for policy 0, policy_version 178575 (0.0031) [2024-06-18 17:32:15,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41098.9). Total num frames: 2925903872. Throughput: 0: 41313.0. Samples: 149938980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:15,509][18875] Avg episode reward: [(0, '0.397')] [2024-06-18 17:32:16,258][19107] Updated weights for policy 0, policy_version 178585 (0.0036) [2024-06-18 17:32:20,504][18875] Fps is (10 sec: 37669.3, 60 sec: 40957.6, 300 sec: 41042.8). Total num frames: 2926067712. Throughput: 0: 41535.4. Samples: 150191340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:20,505][18875] Avg episode reward: [(0, '0.397')] [2024-06-18 17:32:20,647][19107] Updated weights for policy 0, policy_version 178595 (0.0045) [2024-06-18 17:32:24,057][19107] Updated weights for policy 0, policy_version 178605 (0.0045) [2024-06-18 17:32:25,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41508.7, 300 sec: 41098.8). Total num frames: 2926313472. Throughput: 0: 41068.1. Samples: 150433340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:25,501][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 17:32:28,712][19107] Updated weights for policy 0, policy_version 178615 (0.0036) [2024-06-18 17:32:30,500][18875] Fps is (10 sec: 44253.4, 60 sec: 41235.6, 300 sec: 41043.3). 
Total num frames: 2926510080. Throughput: 0: 41448.1. Samples: 150563100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:30,500][18875] Avg episode reward: [(0, '0.721')] [2024-06-18 17:32:32,231][19107] Updated weights for policy 0, policy_version 178625 (0.0032) [2024-06-18 17:32:35,500][18875] Fps is (10 sec: 37682.8, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 2926690304. Throughput: 0: 41234.7. Samples: 150801920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:35,501][18875] Avg episode reward: [(0, '0.608')] [2024-06-18 17:32:35,637][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000178632_2926706688.pth... [2024-06-18 17:32:35,691][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000178031_2916859904.pth [2024-06-18 17:32:36,624][19107] Updated weights for policy 0, policy_version 178635 (0.0050) [2024-06-18 17:32:40,399][19107] Updated weights for policy 0, policy_version 178645 (0.0024) [2024-06-18 17:32:40,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 2926919680. Throughput: 0: 41052.5. Samples: 151045680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:40,501][18875] Avg episode reward: [(0, '0.619')] [2024-06-18 17:32:45,136][19107] Updated weights for policy 0, policy_version 178655 (0.0043) [2024-06-18 17:32:45,500][18875] Fps is (10 sec: 40959.9, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 2927099904. Throughput: 0: 41246.2. Samples: 151170580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:45,501][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 17:32:48,152][19107] Updated weights for policy 0, policy_version 178665 (0.0038) [2024-06-18 17:32:50,500][18875] Fps is (10 sec: 39322.0, 60 sec: 40687.0, 300 sec: 41043.8). Total num frames: 2927312896. Throughput: 0: 40813.0. Samples: 151411200. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:50,501][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 17:32:52,902][19107] Updated weights for policy 0, policy_version 178675 (0.0042) [2024-06-18 17:32:53,539][19087] Signal inference workers to stop experience collection... (2200 times) [2024-06-18 17:32:53,540][19087] Signal inference workers to resume experience collection... (2200 times) [2024-06-18 17:32:53,583][19107] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-18 17:32:53,583][19107] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-18 17:32:55,500][18875] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 41043.3). Total num frames: 2927542272. Throughput: 0: 41229.6. Samples: 151665680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:32:55,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 17:32:55,965][19107] Updated weights for policy 0, policy_version 178685 (0.0044) [2024-06-18 17:33:00,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 2927706112. Throughput: 0: 41025.0. Samples: 151785100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:33:00,501][18875] Avg episode reward: [(0, '0.420')] [2024-06-18 17:33:00,838][19107] Updated weights for policy 0, policy_version 178695 (0.0038) [2024-06-18 17:33:03,783][19107] Updated weights for policy 0, policy_version 178705 (0.0030) [2024-06-18 17:33:05,500][18875] Fps is (10 sec: 40960.5, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 2927951872. Throughput: 0: 40801.5. Samples: 152027260. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:05,501][18875] Avg episode reward: [(0, '0.668')] [2024-06-18 17:33:08,937][19107] Updated weights for policy 0, policy_version 178715 (0.0029) [2024-06-18 17:33:10,500][18875] Fps is (10 sec: 44236.3, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 2928148480. Throughput: 0: 40954.5. 
Samples: 152276300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:10,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 17:33:11,911][19107] Updated weights for policy 0, policy_version 178725 (0.0046) [2024-06-18 17:33:15,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 2928345088. Throughput: 0: 40629.2. Samples: 152391420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:15,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 17:33:17,124][19107] Updated weights for policy 0, policy_version 178735 (0.0047) [2024-06-18 17:33:19,895][19107] Updated weights for policy 0, policy_version 178745 (0.0047) [2024-06-18 17:33:20,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41781.7, 300 sec: 41098.9). Total num frames: 2928574464. Throughput: 0: 40843.1. Samples: 152639860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:20,501][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 17:33:24,947][19107] Updated weights for policy 0, policy_version 178755 (0.0033) [2024-06-18 17:33:25,500][18875] Fps is (10 sec: 39321.8, 60 sec: 40413.8, 300 sec: 41043.3). Total num frames: 2928738304. Throughput: 0: 41123.6. Samples: 152896240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:25,501][18875] Avg episode reward: [(0, '0.720')] [2024-06-18 17:33:27,917][19107] Updated weights for policy 0, policy_version 178765 (0.0040) [2024-06-18 17:33:30,500][18875] Fps is (10 sec: 37683.5, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 2928951296. Throughput: 0: 40929.4. Samples: 153012400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:30,501][18875] Avg episode reward: [(0, '0.679')] [2024-06-18 17:33:32,729][19107] Updated weights for policy 0, policy_version 178775 (0.0040) [2024-06-18 17:33:35,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 2929180672. Throughput: 0: 41331.0. 
Samples: 153271100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:35,501][18875] Avg episode reward: [(0, '0.646')] [2024-06-18 17:33:35,917][19107] Updated weights for policy 0, policy_version 178785 (0.0035) [2024-06-18 17:33:40,500][18875] Fps is (10 sec: 40959.2, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 2929360896. Throughput: 0: 41108.9. Samples: 153515580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:40,501][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 17:33:40,538][19107] Updated weights for policy 0, policy_version 178795 (0.0028) [2024-06-18 17:33:43,837][19107] Updated weights for policy 0, policy_version 178805 (0.0043) [2024-06-18 17:33:45,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 41043.8). Total num frames: 2929590272. Throughput: 0: 41045.1. Samples: 153632140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:45,501][18875] Avg episode reward: [(0, '0.398')] [2024-06-18 17:33:48,783][19107] Updated weights for policy 0, policy_version 178815 (0.0034) [2024-06-18 17:33:50,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41232.9, 300 sec: 41154.4). Total num frames: 2929786880. Throughput: 0: 41285.2. Samples: 153885100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:50,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 17:33:51,819][19107] Updated weights for policy 0, policy_version 178825 (0.0035) [2024-06-18 17:33:55,500][18875] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 2929999872. Throughput: 0: 41220.9. Samples: 154131240. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:33:55,501][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 17:33:56,807][19107] Updated weights for policy 0, policy_version 178835 (0.0033) [2024-06-18 17:33:59,756][19107] Updated weights for policy 0, policy_version 178845 (0.0038) [2024-06-18 17:34:00,501][18875] Fps is (10 sec: 42597.8, 60 sec: 41778.9, 300 sec: 41154.3). Total num frames: 2930212864. Throughput: 0: 41519.3. Samples: 154259800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:34:00,502][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 17:34:04,927][19107] Updated weights for policy 0, policy_version 178855 (0.0040) [2024-06-18 17:34:05,500][18875] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 2930393088. Throughput: 0: 41510.8. Samples: 154507840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:34:05,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 17:34:07,634][19107] Updated weights for policy 0, policy_version 178865 (0.0024) [2024-06-18 17:34:10,500][18875] Fps is (10 sec: 40961.4, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 2930622464. Throughput: 0: 41094.2. Samples: 154745480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-18 17:34:10,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 17:34:12,624][19107] Updated weights for policy 0, policy_version 178875 (0.0035) [2024-06-18 17:34:14,244][19087] Signal inference workers to stop experience collection... (2250 times) [2024-06-18 17:34:14,297][19087] Signal inference workers to resume experience collection... 
(2250 times) [2024-06-18 17:34:14,298][19107] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-18 17:34:14,321][19107] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-18 17:34:15,422][19107] Updated weights for policy 0, policy_version 178885 (0.0034) [2024-06-18 17:34:15,500][18875] Fps is (10 sec: 45874.2, 60 sec: 41779.1, 300 sec: 41209.9). Total num frames: 2930851840. Throughput: 0: 41560.7. Samples: 154882640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:34:15,501][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 17:34:20,290][19107] Updated weights for policy 0, policy_version 178895 (0.0026) [2024-06-18 17:34:20,500][18875] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 2931015680. Throughput: 0: 41213.9. Samples: 155125720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:34:20,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 17:34:23,553][19107] Updated weights for policy 0, policy_version 178905 (0.0048) [2024-06-18 17:34:25,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 41265.5). Total num frames: 2931277824. Throughput: 0: 41056.5. Samples: 155363120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:34:25,501][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 17:34:28,833][19107] Updated weights for policy 0, policy_version 178915 (0.0034) [2024-06-18 17:34:30,500][18875] Fps is (10 sec: 40959.0, 60 sec: 41232.9, 300 sec: 41098.8). Total num frames: 2931425280. Throughput: 0: 41434.7. Samples: 155496700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:34:30,501][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 17:34:31,444][19107] Updated weights for policy 0, policy_version 178925 (0.0040) [2024-06-18 17:34:35,503][18875] Fps is (10 sec: 34398.5, 60 sec: 40685.4, 300 sec: 41098.5). Total num frames: 2931621888. Throughput: 0: 41101.2. Samples: 155734740. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:34:35,503][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 17:34:35,621][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000178933_2931638272.pth... [2024-06-18 17:34:35,682][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000178330_2921758720.pth [2024-06-18 17:34:36,674][19107] Updated weights for policy 0, policy_version 178935 (0.0033) [2024-06-18 17:34:39,615][19107] Updated weights for policy 0, policy_version 178945 (0.0047) [2024-06-18 17:34:40,500][18875] Fps is (10 sec: 45875.9, 60 sec: 42052.3, 300 sec: 41209.9). Total num frames: 2931884032. Throughput: 0: 41136.0. Samples: 155982360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:34:40,501][18875] Avg episode reward: [(0, '0.621')] [2024-06-18 17:34:44,606][19107] Updated weights for policy 0, policy_version 178955 (0.0035) [2024-06-18 17:34:45,500][18875] Fps is (10 sec: 40969.9, 60 sec: 40687.1, 300 sec: 41043.3). Total num frames: 2932031488. Throughput: 0: 41163.1. Samples: 156112120. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:34:45,500][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 17:34:47,611][19107] Updated weights for policy 0, policy_version 178965 (0.0040) [2024-06-18 17:34:50,500][18875] Fps is (10 sec: 37683.5, 60 sec: 41233.2, 300 sec: 41209.9). Total num frames: 2932260864. Throughput: 0: 40815.5. Samples: 156344540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:34:50,501][18875] Avg episode reward: [(0, '0.320')] [2024-06-18 17:34:52,354][19107] Updated weights for policy 0, policy_version 178975 (0.0041) [2024-06-18 17:34:55,500][18875] Fps is (10 sec: 44236.3, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 2932473856. Throughput: 0: 41332.4. Samples: 156605440. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:34:55,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 17:34:55,630][19107] Updated weights for policy 0, policy_version 178985 (0.0036) [2024-06-18 17:35:00,013][19107] Updated weights for policy 0, policy_version 178995 (0.0043) [2024-06-18 17:35:00,500][18875] Fps is (10 sec: 40959.6, 60 sec: 40960.2, 300 sec: 41154.4). Total num frames: 2932670464. Throughput: 0: 40897.9. Samples: 156723040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:35:00,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 17:35:03,530][19107] Updated weights for policy 0, policy_version 179005 (0.0042) [2024-06-18 17:35:05,504][18875] Fps is (10 sec: 44220.7, 60 sec: 42049.7, 300 sec: 41265.0). Total num frames: 2932916224. Throughput: 0: 41059.7. Samples: 156973560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:35:05,504][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 17:35:08,034][19107] Updated weights for policy 0, policy_version 179015 (0.0028) [2024-06-18 17:35:10,500][18875] Fps is (10 sec: 40960.4, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 2933080064. Throughput: 0: 41417.4. Samples: 157226900. Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:35:10,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 17:35:11,481][19107] Updated weights for policy 0, policy_version 179025 (0.0033) [2024-06-18 17:35:15,500][18875] Fps is (10 sec: 36057.4, 60 sec: 40413.9, 300 sec: 41099.3). Total num frames: 2933276672. Throughput: 0: 41000.0. Samples: 157341700. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 28.0) [2024-06-18 17:35:15,501][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 17:35:16,042][19107] Updated weights for policy 0, policy_version 179035 (0.0041) [2024-06-18 17:35:19,355][19107] Updated weights for policy 0, policy_version 179045 (0.0035) [2024-06-18 17:35:20,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 2933506048. Throughput: 0: 41298.1. Samples: 157593060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:35:20,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 17:35:23,971][19107] Updated weights for policy 0, policy_version 179055 (0.0050) [2024-06-18 17:35:25,500][18875] Fps is (10 sec: 40960.2, 60 sec: 40140.7, 300 sec: 41098.8). Total num frames: 2933686272. Throughput: 0: 41296.4. Samples: 157840700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:35:25,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 17:35:25,578][19087] Signal inference workers to stop experience collection... (2300 times) [2024-06-18 17:35:25,578][19087] Signal inference workers to resume experience collection... (2300 times) [2024-06-18 17:35:25,623][19107] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-18 17:35:25,624][19107] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-18 17:35:27,251][19107] Updated weights for policy 0, policy_version 179065 (0.0034) [2024-06-18 17:35:30,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.3, 300 sec: 41209.9). Total num frames: 2933915648. Throughput: 0: 41105.7. Samples: 157961880. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:35:30,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 17:35:31,638][19107] Updated weights for policy 0, policy_version 179075 (0.0027) [2024-06-18 17:35:35,077][19107] Updated weights for policy 0, policy_version 179085 (0.0035) [2024-06-18 17:35:35,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42053.8, 300 sec: 41265.5). Total num frames: 2934145024. Throughput: 0: 41578.0. Samples: 158215560. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:35:35,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 17:35:39,669][19107] Updated weights for policy 0, policy_version 179095 (0.0039) [2024-06-18 17:35:40,500][18875] Fps is (10 sec: 39321.2, 60 sec: 40413.8, 300 sec: 41154.4). Total num frames: 2934308864. Throughput: 0: 41192.8. Samples: 158459120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:35:40,504][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 17:35:43,278][19107] Updated weights for policy 0, policy_version 179105 (0.0029) [2024-06-18 17:35:45,504][18875] Fps is (10 sec: 37670.0, 60 sec: 41503.5, 300 sec: 41153.9). Total num frames: 2934521856. Throughput: 0: 41333.1. Samples: 158583180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:35:45,505][18875] Avg episode reward: [(0, '0.219')] [2024-06-18 17:35:47,383][19107] Updated weights for policy 0, policy_version 179115 (0.0042) [2024-06-18 17:35:50,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 2934734848. Throughput: 0: 41319.8. Samples: 158832800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:35:50,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 17:35:51,281][19107] Updated weights for policy 0, policy_version 179125 (0.0045) [2024-06-18 17:35:55,106][19107] Updated weights for policy 0, policy_version 179135 (0.0032) [2024-06-18 17:35:55,500][18875] Fps is (10 sec: 42613.8, 60 sec: 41233.0, 300 sec: 41210.4). 
Total num frames: 2934947840. Throughput: 0: 41039.9. Samples: 159073700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:35:55,504][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 17:35:59,076][19107] Updated weights for policy 0, policy_version 179145 (0.0045) [2024-06-18 17:36:00,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 2935160832. Throughput: 0: 41381.4. Samples: 159203860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:36:00,501][18875] Avg episode reward: [(0, '0.354')] [2024-06-18 17:36:03,216][19107] Updated weights for policy 0, policy_version 179155 (0.0038) [2024-06-18 17:36:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40689.4, 300 sec: 41154.4). Total num frames: 2935357440. Throughput: 0: 41145.3. Samples: 159444600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:36:05,501][18875] Avg episode reward: [(0, '0.258')] [2024-06-18 17:36:07,098][19107] Updated weights for policy 0, policy_version 179165 (0.0039) [2024-06-18 17:36:10,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 2935554048. Throughput: 0: 41181.0. Samples: 159693840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:36:10,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 17:36:11,153][19107] Updated weights for policy 0, policy_version 179175 (0.0030) [2024-06-18 17:36:15,022][19107] Updated weights for policy 0, policy_version 179185 (0.0034) [2024-06-18 17:36:15,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41265.5). Total num frames: 2935783424. Throughput: 0: 41136.4. Samples: 159813020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:36:15,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 17:36:19,393][19107] Updated weights for policy 0, policy_version 179195 (0.0035) [2024-06-18 17:36:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41210.4). 
Total num frames: 2935980032. Throughput: 0: 41180.6. Samples: 160068680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 17:36:20,501][18875] Avg episode reward: [(0, '0.781')] [2024-06-18 17:36:22,947][19107] Updated weights for policy 0, policy_version 179205 (0.0037) [2024-06-18 17:36:25,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 41210.4). Total num frames: 2936193024. Throughput: 0: 41376.1. Samples: 160321040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:36:25,501][18875] Avg episode reward: [(0, '0.714')] [2024-06-18 17:36:26,832][19107] Updated weights for policy 0, policy_version 179215 (0.0045) [2024-06-18 17:36:30,504][18875] Fps is (10 sec: 40945.2, 60 sec: 41230.5, 300 sec: 41153.9). Total num frames: 2936389632. Throughput: 0: 41454.3. Samples: 160448620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:36:30,505][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 17:36:31,013][19107] Updated weights for policy 0, policy_version 179225 (0.0041) [2024-06-18 17:36:34,496][19107] Updated weights for policy 0, policy_version 179235 (0.0031) [2024-06-18 17:36:35,504][18875] Fps is (10 sec: 40945.1, 60 sec: 40957.6, 300 sec: 41265.0). Total num frames: 2936602624. Throughput: 0: 41500.6. Samples: 160700480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:36:35,505][18875] Avg episode reward: [(0, '0.295')] [2024-06-18 17:36:35,531][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000179236_2936602624.pth... [2024-06-18 17:36:35,587][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000178632_2926706688.pth [2024-06-18 17:36:38,903][19107] Updated weights for policy 0, policy_version 179245 (0.0042) [2024-06-18 17:36:40,500][18875] Fps is (10 sec: 42614.2, 60 sec: 41779.3, 300 sec: 41209.9). Total num frames: 2936815616. Throughput: 0: 41606.8. Samples: 160946000. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:36:40,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 17:36:42,409][19107] Updated weights for policy 0, policy_version 179255 (0.0037) [2024-06-18 17:36:45,500][18875] Fps is (10 sec: 40974.9, 60 sec: 41508.7, 300 sec: 41154.4). Total num frames: 2937012224. Throughput: 0: 41406.8. Samples: 161067160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:36:45,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 17:36:46,771][19107] Updated weights for policy 0, policy_version 179265 (0.0048) [2024-06-18 17:36:50,082][19087] Signal inference workers to stop experience collection... (2350 times) [2024-06-18 17:36:50,112][19107] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-18 17:36:50,134][19087] Signal inference workers to resume experience collection... (2350 times) [2024-06-18 17:36:50,140][19107] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-18 17:36:50,268][19107] Updated weights for policy 0, policy_version 179275 (0.0032) [2024-06-18 17:36:50,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 2937241600. Throughput: 0: 41568.1. Samples: 161315160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:36:50,501][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 17:36:54,567][19107] Updated weights for policy 0, policy_version 179285 (0.0044) [2024-06-18 17:36:55,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 2937421824. Throughput: 0: 41639.5. Samples: 161567620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:36:55,501][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 17:36:58,393][19107] Updated weights for policy 0, policy_version 179295 (0.0035) [2024-06-18 17:37:00,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 2937634816. Throughput: 0: 41738.8. 
Samples: 161691260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:37:00,501][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 17:37:02,263][19107] Updated weights for policy 0, policy_version 179305 (0.0040) [2024-06-18 17:37:05,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 2937864192. Throughput: 0: 41679.9. Samples: 161944280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:37:05,501][18875] Avg episode reward: [(0, '0.664')] [2024-06-18 17:37:06,215][19107] Updated weights for policy 0, policy_version 179315 (0.0041) [2024-06-18 17:37:09,948][19107] Updated weights for policy 0, policy_version 179325 (0.0045) [2024-06-18 17:37:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41209.9). Total num frames: 2938060800. Throughput: 0: 41533.0. Samples: 162190020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:37:10,500][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 17:37:13,949][19107] Updated weights for policy 0, policy_version 179335 (0.0043) [2024-06-18 17:37:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41377.0). Total num frames: 2938273792. Throughput: 0: 41517.9. Samples: 162316780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:37:15,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 17:37:17,605][19107] Updated weights for policy 0, policy_version 179345 (0.0051) [2024-06-18 17:37:20,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 41265.4). Total num frames: 2938486784. Throughput: 0: 41686.4. Samples: 162576220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:37:20,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 17:37:21,877][19107] Updated weights for policy 0, policy_version 179355 (0.0042) [2024-06-18 17:37:25,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 2938683392. Throughput: 0: 41732.4. Samples: 162823960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 17:37:25,500][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 17:37:25,767][19107] Updated weights for policy 0, policy_version 179365 (0.0034) [2024-06-18 17:37:29,814][19107] Updated weights for policy 0, policy_version 179375 (0.0028) [2024-06-18 17:37:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41781.7, 300 sec: 41376.5). Total num frames: 2938896384. Throughput: 0: 41816.9. Samples: 162948920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:37:30,501][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 17:37:33,622][19107] Updated weights for policy 0, policy_version 179385 (0.0039) [2024-06-18 17:37:35,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41781.7, 300 sec: 41321.0). Total num frames: 2939109376. Throughput: 0: 41879.1. Samples: 163199720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:37:35,501][18875] Avg episode reward: [(0, '0.302')] [2024-06-18 17:37:37,441][19107] Updated weights for policy 0, policy_version 179395 (0.0039) [2024-06-18 17:37:40,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 2939289600. Throughput: 0: 41920.9. Samples: 163454060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:37:40,501][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 17:37:41,650][19107] Updated weights for policy 0, policy_version 179405 (0.0029) [2024-06-18 17:37:45,146][19107] Updated weights for policy 0, policy_version 179415 (0.0033) [2024-06-18 17:37:45,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41432.1). Total num frames: 2939535360. Throughput: 0: 41767.9. Samples: 163570820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:37:45,501][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 17:37:49,347][19107] Updated weights for policy 0, policy_version 179425 (0.0043) [2024-06-18 17:37:50,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 2939715584. Throughput: 0: 41770.8. Samples: 163823960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:37:50,500][18875] Avg episode reward: [(0, '0.686')] [2024-06-18 17:37:53,747][19107] Updated weights for policy 0, policy_version 179435 (0.0040) [2024-06-18 17:37:55,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 2939928576. Throughput: 0: 41819.0. Samples: 164071880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:37:55,501][18875] Avg episode reward: [(0, '0.293')] [2024-06-18 17:37:57,534][19107] Updated weights for policy 0, policy_version 179445 (0.0038) [2024-06-18 17:38:00,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 41376.5). Total num frames: 2940157952. Throughput: 0: 41799.6. Samples: 164197760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:38:00,501][18875] Avg episode reward: [(0, '0.656')] [2024-06-18 17:38:01,352][19107] Updated weights for policy 0, policy_version 179455 (0.0034) [2024-06-18 17:38:05,333][19107] Updated weights for policy 0, policy_version 179465 (0.0030) [2024-06-18 17:38:05,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41376.6). Total num frames: 2940354560. Throughput: 0: 41488.1. Samples: 164443180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:38:05,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 17:38:09,161][19107] Updated weights for policy 0, policy_version 179475 (0.0035) [2024-06-18 17:38:10,500][18875] Fps is (10 sec: 39322.2, 60 sec: 41506.1, 300 sec: 41376.6). Total num frames: 2940551168. Throughput: 0: 41512.0. Samples: 164692000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:38:10,500][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 17:38:13,102][19107] Updated weights for policy 0, policy_version 179485 (0.0035) [2024-06-18 17:38:15,500][18875] Fps is (10 sec: 42597.3, 60 sec: 41779.1, 300 sec: 41376.5). Total num frames: 2940780544. Throughput: 0: 41505.2. Samples: 164816660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:38:15,510][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 17:38:16,962][19107] Updated weights for policy 0, policy_version 179495 (0.0035) [2024-06-18 17:38:20,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 2940977152. Throughput: 0: 41644.0. Samples: 165073700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:38:20,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 17:38:20,822][19107] Updated weights for policy 0, policy_version 179505 (0.0049) [2024-06-18 17:38:22,014][19087] Signal inference workers to stop experience collection... (2400 times) [2024-06-18 17:38:22,055][19107] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-18 17:38:22,123][19087] Signal inference workers to resume experience collection... (2400 times) [2024-06-18 17:38:22,123][19107] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-18 17:38:24,958][19107] Updated weights for policy 0, policy_version 179515 (0.0037) [2024-06-18 17:38:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 2941190144. Throughput: 0: 41482.6. Samples: 165320780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:38:25,510][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 17:38:28,588][19107] Updated weights for policy 0, policy_version 179525 (0.0045) [2024-06-18 17:38:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41432.1). Total num frames: 2941403136. Throughput: 0: 41568.1. 
Samples: 165441380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:38:30,500][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 17:38:32,886][19107] Updated weights for policy 0, policy_version 179535 (0.0036) [2024-06-18 17:38:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 2941616128. Throughput: 0: 41681.2. Samples: 165699620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:38:35,501][18875] Avg episode reward: [(0, '0.348')] [2024-06-18 17:38:35,512][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000179542_2941616128.pth... [2024-06-18 17:38:35,566][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000178933_2931638272.pth [2024-06-18 17:38:36,474][19107] Updated weights for policy 0, policy_version 179545 (0.0045) [2024-06-18 17:38:40,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 41376.6). Total num frames: 2941796352. Throughput: 0: 41592.0. Samples: 165943520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:38:40,501][18875] Avg episode reward: [(0, '0.286')] [2024-06-18 17:38:41,174][19107] Updated weights for policy 0, policy_version 179555 (0.0030) [2024-06-18 17:38:44,271][19107] Updated weights for policy 0, policy_version 179565 (0.0036) [2024-06-18 17:38:45,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41487.7). Total num frames: 2942025728. Throughput: 0: 41569.9. Samples: 166068400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:38:45,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 17:38:48,953][19107] Updated weights for policy 0, policy_version 179575 (0.0042) [2024-06-18 17:38:50,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 2942222336. Throughput: 0: 41708.0. Samples: 166320040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:38:50,500][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 17:38:52,516][19107] Updated weights for policy 0, policy_version 179585 (0.0049) [2024-06-18 17:38:55,500][18875] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 41487.7). Total num frames: 2942451712. Throughput: 0: 41554.9. Samples: 166561980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:38:55,501][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 17:38:56,930][19107] Updated weights for policy 0, policy_version 179595 (0.0042) [2024-06-18 17:39:00,351][19107] Updated weights for policy 0, policy_version 179605 (0.0036) [2024-06-18 17:39:00,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.3, 300 sec: 41543.2). Total num frames: 2942648320. Throughput: 0: 41814.5. Samples: 166698300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:39:00,500][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 17:39:04,804][19107] Updated weights for policy 0, policy_version 179615 (0.0045) [2024-06-18 17:39:05,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41232.9, 300 sec: 41376.5). Total num frames: 2942828544. Throughput: 0: 41682.0. Samples: 166949400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:39:05,501][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 17:39:08,199][19107] Updated weights for policy 0, policy_version 179625 (0.0024) [2024-06-18 17:39:10,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41432.1). Total num frames: 2943074304. Throughput: 0: 41603.2. Samples: 167192920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:39:10,501][18875] Avg episode reward: [(0, '0.749')] [2024-06-18 17:39:12,716][19107] Updated weights for policy 0, policy_version 179635 (0.0035) [2024-06-18 17:39:15,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41233.2, 300 sec: 41487.6). Total num frames: 2943254528. Throughput: 0: 41785.7. Samples: 167321740. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:39:15,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 17:39:16,012][19107] Updated weights for policy 0, policy_version 179645 (0.0034) [2024-06-18 17:39:20,495][19107] Updated weights for policy 0, policy_version 179655 (0.0041) [2024-06-18 17:39:20,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 2943467520. Throughput: 0: 41484.1. Samples: 167566400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:39:20,500][18875] Avg episode reward: [(0, '0.308')] [2024-06-18 17:39:23,842][19107] Updated weights for policy 0, policy_version 179665 (0.0034) [2024-06-18 17:39:25,500][18875] Fps is (10 sec: 42597.5, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 2943680512. Throughput: 0: 41548.7. Samples: 167813220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:39:25,501][18875] Avg episode reward: [(0, '0.425')] [2024-06-18 17:39:28,193][19107] Updated weights for policy 0, policy_version 179675 (0.0040) [2024-06-18 17:39:30,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41233.0, 300 sec: 41543.5). Total num frames: 2943877120. Throughput: 0: 41734.2. Samples: 167946440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:39:30,501][18875] Avg episode reward: [(0, '0.687')] [2024-06-18 17:39:31,841][19107] Updated weights for policy 0, policy_version 179685 (0.0041) [2024-06-18 17:39:33,649][19087] Signal inference workers to stop experience collection... (2450 times) [2024-06-18 17:39:33,671][19107] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-18 17:39:33,758][19087] Signal inference workers to resume experience collection... (2450 times) [2024-06-18 17:39:33,759][19107] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-18 17:39:35,501][18875] Fps is (10 sec: 40959.9, 60 sec: 41232.9, 300 sec: 41376.5). Total num frames: 2944090112. Throughput: 0: 41546.0. 
Samples: 168189620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 17:39:35,501][18875] Avg episode reward: [(0, '0.300')] [2024-06-18 17:39:36,014][19107] Updated weights for policy 0, policy_version 179695 (0.0022) [2024-06-18 17:39:39,703][19107] Updated weights for policy 0, policy_version 179705 (0.0030) [2024-06-18 17:39:40,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 2944303104. Throughput: 0: 41630.7. Samples: 168435360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:39:40,501][18875] Avg episode reward: [(0, '0.365')] [2024-06-18 17:39:43,883][19107] Updated weights for policy 0, policy_version 179715 (0.0033) [2024-06-18 17:39:45,500][18875] Fps is (10 sec: 44237.8, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 2944532480. Throughput: 0: 41390.1. Samples: 168560860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:39:45,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 17:39:47,618][19107] Updated weights for policy 0, policy_version 179725 (0.0033) [2024-06-18 17:39:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 2944712704. Throughput: 0: 41374.3. Samples: 168811240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:39:50,501][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 17:39:51,779][19107] Updated weights for policy 0, policy_version 179735 (0.0033) [2024-06-18 17:39:55,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 2944925696. Throughput: 0: 41303.9. Samples: 169051600. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:39:55,501][18875] Avg episode reward: [(0, '0.173')] [2024-06-18 17:39:55,803][19107] Updated weights for policy 0, policy_version 179745 (0.0040) [2024-06-18 17:39:59,631][19107] Updated weights for policy 0, policy_version 179755 (0.0034) [2024-06-18 17:40:00,500][18875] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41321.5). Total num frames: 2945105920. Throughput: 0: 41214.2. Samples: 169176380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:40:00,501][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 17:40:03,554][19107] Updated weights for policy 0, policy_version 179765 (0.0037) [2024-06-18 17:40:05,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41543.1). Total num frames: 2945335296. Throughput: 0: 41316.2. Samples: 169425640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:40:05,501][18875] Avg episode reward: [(0, '0.699')] [2024-06-18 17:40:07,924][19107] Updated weights for policy 0, policy_version 179775 (0.0059) [2024-06-18 17:40:10,503][18875] Fps is (10 sec: 44226.4, 60 sec: 41231.4, 300 sec: 41598.4). Total num frames: 2945548288. Throughput: 0: 41274.5. Samples: 169670660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:40:10,503][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 17:40:11,602][19107] Updated weights for policy 0, policy_version 179785 (0.0032) [2024-06-18 17:40:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 2945744896. Throughput: 0: 41142.2. Samples: 169797840. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:40:15,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 17:40:15,699][19107] Updated weights for policy 0, policy_version 179795 (0.0039) [2024-06-18 17:40:19,400][19107] Updated weights for policy 0, policy_version 179805 (0.0036) [2024-06-18 17:40:20,500][18875] Fps is (10 sec: 39330.6, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 2945941504. Throughput: 0: 41237.0. Samples: 170045280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:40:20,501][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 17:40:23,413][19107] Updated weights for policy 0, policy_version 179815 (0.0038) [2024-06-18 17:40:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41543.1). Total num frames: 2946170880. Throughput: 0: 41311.1. Samples: 170294360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:40:25,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 17:40:27,091][19107] Updated weights for policy 0, policy_version 179825 (0.0034) [2024-06-18 17:40:30,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 2946351104. Throughput: 0: 41250.2. Samples: 170417120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:40:30,501][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 17:40:31,409][19107] Updated weights for policy 0, policy_version 179835 (0.0043) [2024-06-18 17:40:35,298][19107] Updated weights for policy 0, policy_version 179845 (0.0035) [2024-06-18 17:40:35,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 2946580480. Throughput: 0: 41155.9. Samples: 170663260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:40:35,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 17:40:35,531][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000179845_2946580480.pth... 
[2024-06-18 17:40:35,595][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000179236_2936602624.pth [2024-06-18 17:40:39,399][19107] Updated weights for policy 0, policy_version 179855 (0.0035) [2024-06-18 17:40:40,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41543.7). Total num frames: 2946777088. Throughput: 0: 41406.7. Samples: 170914900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-18 17:40:40,501][18875] Avg episode reward: [(0, '0.376')] [2024-06-18 17:40:43,078][19107] Updated weights for policy 0, policy_version 179865 (0.0032) [2024-06-18 17:40:45,500][18875] Fps is (10 sec: 39322.3, 60 sec: 40687.0, 300 sec: 41487.6). Total num frames: 2946973696. Throughput: 0: 41312.5. Samples: 171035440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:40:45,500][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 17:40:47,394][19107] Updated weights for policy 0, policy_version 179875 (0.0038) [2024-06-18 17:40:50,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 2947219456. Throughput: 0: 41297.4. Samples: 171284020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:40:50,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 17:40:50,885][19107] Updated weights for policy 0, policy_version 179885 (0.0047) [2024-06-18 17:40:55,500][18875] Fps is (10 sec: 40959.2, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 2947383296. Throughput: 0: 41499.8. Samples: 171538060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:40:55,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 17:40:55,601][19107] Updated weights for policy 0, policy_version 179895 (0.0038) [2024-06-18 17:40:56,655][19087] Signal inference workers to stop experience collection... (2500 times) [2024-06-18 17:40:56,656][19087] Signal inference workers to resume experience collection... 
(2500 times) [2024-06-18 17:40:56,673][19107] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-18 17:40:56,680][19107] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-18 17:40:58,690][19107] Updated weights for policy 0, policy_version 179905 (0.0053) [2024-06-18 17:41:00,500][18875] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 2947596288. Throughput: 0: 41232.4. Samples: 171653300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:00,501][18875] Avg episode reward: [(0, '0.208')] [2024-06-18 17:41:03,511][19107] Updated weights for policy 0, policy_version 179915 (0.0028) [2024-06-18 17:41:05,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 2947825664. Throughput: 0: 41238.2. Samples: 171901000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:05,501][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 17:41:06,633][19107] Updated weights for policy 0, policy_version 179925 (0.0042) [2024-06-18 17:41:10,501][18875] Fps is (10 sec: 37682.6, 60 sec: 40415.3, 300 sec: 41321.0). Total num frames: 2947973120. Throughput: 0: 41444.3. Samples: 172159360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:10,501][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 17:41:11,408][19107] Updated weights for policy 0, policy_version 179935 (0.0033) [2024-06-18 17:41:14,670][19107] Updated weights for policy 0, policy_version 179945 (0.0045) [2024-06-18 17:41:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 2948235264. Throughput: 0: 41217.4. Samples: 172271900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:15,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 17:41:19,125][19107] Updated weights for policy 0, policy_version 179955 (0.0032) [2024-06-18 17:41:20,500][18875] Fps is (10 sec: 47514.9, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 2948448256. Throughput: 0: 41401.5. Samples: 172526320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:20,500][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 17:41:22,435][19107] Updated weights for policy 0, policy_version 179965 (0.0033) [2024-06-18 17:41:25,500][18875] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41488.1). Total num frames: 2948628480. Throughput: 0: 41395.6. Samples: 172777700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:25,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 17:41:27,326][19107] Updated weights for policy 0, policy_version 179975 (0.0032) [2024-06-18 17:41:30,438][19107] Updated weights for policy 0, policy_version 179985 (0.0037) [2024-06-18 17:41:30,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41599.2). Total num frames: 2948874240. Throughput: 0: 41402.2. Samples: 172898540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:30,512][18875] Avg episode reward: [(0, '0.498')] [2024-06-18 17:41:35,089][19107] Updated weights for policy 0, policy_version 179995 (0.0034) [2024-06-18 17:41:35,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 2949054464. Throughput: 0: 41572.8. Samples: 173154800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:35,501][18875] Avg episode reward: [(0, '0.302')] [2024-06-18 17:41:38,253][19107] Updated weights for policy 0, policy_version 180005 (0.0036) [2024-06-18 17:41:40,500][18875] Fps is (10 sec: 37682.7, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 2949251072. Throughput: 0: 41320.5. Samples: 173397480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:40,501][18875] Avg episode reward: [(0, '0.702')] [2024-06-18 17:41:42,881][19107] Updated weights for policy 0, policy_version 180015 (0.0022) [2024-06-18 17:41:45,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 2949464064. Throughput: 0: 41610.8. Samples: 173525780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:41:45,500][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 17:41:46,330][19107] Updated weights for policy 0, policy_version 180025 (0.0039) [2024-06-18 17:41:50,500][18875] Fps is (10 sec: 42599.1, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 2949677056. Throughput: 0: 41649.0. Samples: 173775200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:41:50,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 17:41:50,677][19107] Updated weights for policy 0, policy_version 180035 (0.0027) [2024-06-18 17:41:54,202][19107] Updated weights for policy 0, policy_version 180045 (0.0041) [2024-06-18 17:41:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 2949873664. Throughput: 0: 41390.8. Samples: 174021940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:41:55,501][18875] Avg episode reward: [(0, '0.341')] [2024-06-18 17:41:58,799][19107] Updated weights for policy 0, policy_version 180055 (0.0032) [2024-06-18 17:42:00,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 41543.2). Total num frames: 2950119424. Throughput: 0: 41762.1. Samples: 174151200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:00,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 17:42:02,078][19107] Updated weights for policy 0, policy_version 180065 (0.0035) [2024-06-18 17:42:05,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 2950299648. Throughput: 0: 41682.6. Samples: 174402040. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:05,501][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 17:42:06,545][19107] Updated weights for policy 0, policy_version 180075 (0.0030) [2024-06-18 17:42:09,872][19107] Updated weights for policy 0, policy_version 180085 (0.0037) [2024-06-18 17:42:10,504][18875] Fps is (10 sec: 40945.7, 60 sec: 42596.0, 300 sec: 41542.7). Total num frames: 2950529024. Throughput: 0: 41472.2. Samples: 174644100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:10,505][18875] Avg episode reward: [(0, '0.693')] [2024-06-18 17:42:14,362][19107] Updated weights for policy 0, policy_version 180095 (0.0042) [2024-06-18 17:42:15,032][19087] Signal inference workers to stop experience collection... (2550 times) [2024-06-18 17:42:15,032][19087] Signal inference workers to resume experience collection... (2550 times) [2024-06-18 17:42:15,060][19107] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-18 17:42:15,060][19107] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-18 17:42:15,500][18875] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 2950742016. Throughput: 0: 41769.8. Samples: 174778180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:15,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 17:42:17,586][19107] Updated weights for policy 0, policy_version 180105 (0.0037) [2024-06-18 17:42:20,500][18875] Fps is (10 sec: 39335.8, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 2950922240. Throughput: 0: 41622.8. Samples: 175027820. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:20,501][18875] Avg episode reward: [(0, '0.379')] [2024-06-18 17:42:22,254][19107] Updated weights for policy 0, policy_version 180115 (0.0034) [2024-06-18 17:42:25,446][19107] Updated weights for policy 0, policy_version 180125 (0.0035) [2024-06-18 17:42:25,500][18875] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 41598.7). Total num frames: 2951168000. Throughput: 0: 41803.6. Samples: 175278640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:25,501][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 17:42:30,178][19107] Updated weights for policy 0, policy_version 180135 (0.0038) [2024-06-18 17:42:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 2951348224. Throughput: 0: 41830.2. Samples: 175408140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:30,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 17:42:33,225][19107] Updated weights for policy 0, policy_version 180145 (0.0046) [2024-06-18 17:42:35,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 2951544832. Throughput: 0: 41664.3. Samples: 175650100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:35,501][18875] Avg episode reward: [(0, '0.411')] [2024-06-18 17:42:35,562][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000180149_2951561216.pth... [2024-06-18 17:42:35,616][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000179542_2941616128.pth [2024-06-18 17:42:38,007][19107] Updated weights for policy 0, policy_version 180155 (0.0037) [2024-06-18 17:42:40,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 41543.2). Total num frames: 2951790592. Throughput: 0: 41746.2. Samples: 175900520. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:40,501][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 17:42:40,977][19107] Updated weights for policy 0, policy_version 180165 (0.0034) [2024-06-18 17:42:45,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 2951954432. Throughput: 0: 41732.1. Samples: 176029140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:45,501][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 17:42:45,872][19107] Updated weights for policy 0, policy_version 180175 (0.0047) [2024-06-18 17:42:48,881][19107] Updated weights for policy 0, policy_version 180185 (0.0050) [2024-06-18 17:42:50,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 41598.7). Total num frames: 2952200192. Throughput: 0: 41624.3. Samples: 176275140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:50,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 17:42:53,842][19107] Updated weights for policy 0, policy_version 180195 (0.0034) [2024-06-18 17:42:55,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 2952396800. Throughput: 0: 41830.4. Samples: 176526320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-18 17:42:55,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 17:42:57,217][19107] Updated weights for policy 0, policy_version 180205 (0.0029) [2024-06-18 17:43:00,501][18875] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 2952593408. Throughput: 0: 41692.7. Samples: 176654360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:00,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 17:43:01,739][19107] Updated weights for policy 0, policy_version 180215 (0.0053) [2024-06-18 17:43:04,988][19107] Updated weights for policy 0, policy_version 180225 (0.0033) [2024-06-18 17:43:05,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41598.7). 
Total num frames: 2952822784. Throughput: 0: 41681.4. Samples: 176903480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:05,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 17:43:09,603][19107] Updated weights for policy 0, policy_version 180235 (0.0028) [2024-06-18 17:43:10,504][18875] Fps is (10 sec: 44221.4, 60 sec: 41779.2, 300 sec: 41542.7). Total num frames: 2953035776. Throughput: 0: 41869.6. Samples: 177162920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:10,505][18875] Avg episode reward: [(0, '0.598')] [2024-06-18 17:43:12,720][19107] Updated weights for policy 0, policy_version 180245 (0.0039) [2024-06-18 17:43:15,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 2953232384. Throughput: 0: 41613.8. Samples: 177280760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:15,501][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 17:43:17,422][19107] Updated weights for policy 0, policy_version 180255 (0.0040) [2024-06-18 17:43:20,382][19107] Updated weights for policy 0, policy_version 180265 (0.0036) [2024-06-18 17:43:20,500][18875] Fps is (10 sec: 42614.0, 60 sec: 42325.3, 300 sec: 41598.7). Total num frames: 2953461760. Throughput: 0: 41866.3. Samples: 177534080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:20,501][18875] Avg episode reward: [(0, '0.638')] [2024-06-18 17:43:25,142][19107] Updated weights for policy 0, policy_version 180275 (0.0036) [2024-06-18 17:43:25,500][18875] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 2953625600. Throughput: 0: 41874.7. Samples: 177784880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:25,501][18875] Avg episode reward: [(0, '0.783')] [2024-06-18 17:43:28,127][19107] Updated weights for policy 0, policy_version 180285 (0.0035) [2024-06-18 17:43:30,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41543.2). 
Total num frames: 2953871360. Throughput: 0: 41600.4. Samples: 177901160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:30,501][18875] Avg episode reward: [(0, '0.843')] [2024-06-18 17:43:33,238][19087] Signal inference workers to stop experience collection... (2600 times) [2024-06-18 17:43:33,239][19087] Signal inference workers to resume experience collection... (2600 times) [2024-06-18 17:43:33,245][19107] Updated weights for policy 0, policy_version 180295 (0.0052) [2024-06-18 17:43:33,267][19107] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-18 17:43:33,272][19107] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-18 17:43:35,500][18875] Fps is (10 sec: 44237.5, 60 sec: 42052.4, 300 sec: 41598.7). Total num frames: 2954067968. Throughput: 0: 41868.2. Samples: 178159200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:35,500][18875] Avg episode reward: [(0, '0.701')] [2024-06-18 17:43:35,943][19107] Updated weights for policy 0, policy_version 180305 (0.0040) [2024-06-18 17:43:40,500][18875] Fps is (10 sec: 37683.5, 60 sec: 40960.1, 300 sec: 41432.1). Total num frames: 2954248192. Throughput: 0: 41744.5. Samples: 178404820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:40,501][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 17:43:40,954][19107] Updated weights for policy 0, policy_version 180315 (0.0034) [2024-06-18 17:43:44,111][19107] Updated weights for policy 0, policy_version 180325 (0.0028) [2024-06-18 17:43:45,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 41598.7). Total num frames: 2954493952. Throughput: 0: 41527.2. Samples: 178523080. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:45,501][18875] Avg episode reward: [(0, '0.340')] [2024-06-18 17:43:48,821][19107] Updated weights for policy 0, policy_version 180335 (0.0043) [2024-06-18 17:43:50,500][18875] Fps is (10 sec: 44237.2, 60 sec: 41506.3, 300 sec: 41487.6). Total num frames: 2954690560. Throughput: 0: 41574.3. Samples: 178774320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:50,500][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 17:43:51,924][19107] Updated weights for policy 0, policy_version 180345 (0.0041) [2024-06-18 17:43:55,500][18875] Fps is (10 sec: 37682.9, 60 sec: 41233.0, 300 sec: 41432.0). Total num frames: 2954870784. Throughput: 0: 41392.1. Samples: 179025420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:43:55,501][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 17:43:56,689][19107] Updated weights for policy 0, policy_version 180355 (0.0031) [2024-06-18 17:43:59,766][19107] Updated weights for policy 0, policy_version 180365 (0.0037) [2024-06-18 17:44:00,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.4, 300 sec: 41598.7). Total num frames: 2955100160. Throughput: 0: 41481.4. Samples: 179147420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:44:00,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 17:44:04,700][19107] Updated weights for policy 0, policy_version 180375 (0.0042) [2024-06-18 17:44:05,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 2955296768. Throughput: 0: 41494.1. Samples: 179401320. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:05,501][18875] Avg episode reward: [(0, '0.436')] [2024-06-18 17:44:07,751][19107] Updated weights for policy 0, policy_version 180385 (0.0045) [2024-06-18 17:44:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41235.6, 300 sec: 41543.2). Total num frames: 2955509760. Throughput: 0: 41535.7. Samples: 179653980. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:10,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 17:44:12,675][19107] Updated weights for policy 0, policy_version 180395 (0.0035) [2024-06-18 17:44:15,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 2955739136. Throughput: 0: 41726.3. Samples: 179778840. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:15,501][18875] Avg episode reward: [(0, '0.325')] [2024-06-18 17:44:15,605][19107] Updated weights for policy 0, policy_version 180405 (0.0038) [2024-06-18 17:44:20,429][19107] Updated weights for policy 0, policy_version 180415 (0.0041) [2024-06-18 17:44:20,500][18875] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41487.6). Total num frames: 2955919360. Throughput: 0: 41570.1. Samples: 180029860. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:20,501][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 17:44:23,987][19107] Updated weights for policy 0, policy_version 180425 (0.0048) [2024-06-18 17:44:25,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 2956132352. Throughput: 0: 41602.7. Samples: 180276940. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:25,500][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 17:44:28,289][19107] Updated weights for policy 0, policy_version 180435 (0.0036) [2024-06-18 17:44:30,500][18875] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 2956378112. Throughput: 0: 41769.4. Samples: 180402700. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:30,509][18875] Avg episode reward: [(0, '0.621')] [2024-06-18 17:44:31,728][19107] Updated weights for policy 0, policy_version 180445 (0.0033) [2024-06-18 17:44:35,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 2956541952. Throughput: 0: 41619.4. Samples: 180647200. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:35,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 17:44:35,513][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000180453_2956541952.pth... [2024-06-18 17:44:35,586][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000179845_2946580480.pth [2024-06-18 17:44:35,926][19107] Updated weights for policy 0, policy_version 180455 (0.0034) [2024-06-18 17:44:37,796][19087] Signal inference workers to stop experience collection... (2650 times) [2024-06-18 17:44:37,797][19087] Signal inference workers to resume experience collection... (2650 times) [2024-06-18 17:44:37,847][19107] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-18 17:44:37,848][19107] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-18 17:44:39,438][19107] Updated weights for policy 0, policy_version 180465 (0.0029) [2024-06-18 17:44:40,500][18875] Fps is (10 sec: 37683.7, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 2956754944. Throughput: 0: 41554.0. Samples: 180895340. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:40,500][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 17:44:43,892][19107] Updated weights for policy 0, policy_version 180475 (0.0034) [2024-06-18 17:44:45,500][18875] Fps is (10 sec: 44237.5, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 2956984320. Throughput: 0: 41733.4. Samples: 181025420. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:45,500][18875] Avg episode reward: [(0, '0.138')] [2024-06-18 17:44:47,803][19107] Updated weights for policy 0, policy_version 180485 (0.0043) [2024-06-18 17:44:50,500][18875] Fps is (10 sec: 40959.2, 60 sec: 41232.9, 300 sec: 41487.6). Total num frames: 2957164544. Throughput: 0: 41597.3. Samples: 181273200. 
Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:50,501][18875] Avg episode reward: [(0, '0.608')] [2024-06-18 17:44:51,711][19107] Updated weights for policy 0, policy_version 180495 (0.0037) [2024-06-18 17:44:55,500][18875] Fps is (10 sec: 39320.5, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 2957377536. Throughput: 0: 41373.6. Samples: 181515800. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:44:55,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 17:44:55,779][19107] Updated weights for policy 0, policy_version 180505 (0.0037) [2024-06-18 17:44:59,530][19107] Updated weights for policy 0, policy_version 180515 (0.0038) [2024-06-18 17:45:00,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 2957590528. Throughput: 0: 41505.8. Samples: 181646600. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:45:00,500][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 17:45:03,442][19107] Updated weights for policy 0, policy_version 180525 (0.0031) [2024-06-18 17:45:05,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41487.9). Total num frames: 2957787136. Throughput: 0: 41415.5. Samples: 181893560. Policy #0 lag: (min: 0.0, avg: 13.3, max: 22.0) [2024-06-18 17:45:05,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 17:45:07,255][19107] Updated weights for policy 0, policy_version 180535 (0.0031) [2024-06-18 17:45:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 2958016512. Throughput: 0: 41324.9. Samples: 182136560. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:10,501][18875] Avg episode reward: [(0, '0.405')] [2024-06-18 17:45:11,355][19107] Updated weights for policy 0, policy_version 180545 (0.0039) [2024-06-18 17:45:15,322][19107] Updated weights for policy 0, policy_version 180555 (0.0041) [2024-06-18 17:45:15,504][18875] Fps is (10 sec: 42583.0, 60 sec: 41230.5, 300 sec: 41598.2). Total num frames: 2958213120. Throughput: 0: 41465.1. Samples: 182268780. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:15,505][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 17:45:19,064][19107] Updated weights for policy 0, policy_version 180565 (0.0042) [2024-06-18 17:45:20,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 2958409728. Throughput: 0: 41501.8. Samples: 182514780. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:20,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 17:45:23,198][19107] Updated weights for policy 0, policy_version 180575 (0.0043) [2024-06-18 17:45:25,504][18875] Fps is (10 sec: 42598.3, 60 sec: 41776.6, 300 sec: 41653.7). Total num frames: 2958639104. Throughput: 0: 41389.0. Samples: 182758000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:25,505][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 17:45:27,110][19107] Updated weights for policy 0, policy_version 180585 (0.0030) [2024-06-18 17:45:30,500][18875] Fps is (10 sec: 42597.9, 60 sec: 40959.9, 300 sec: 41543.2). Total num frames: 2958835712. Throughput: 0: 41502.0. Samples: 182893020. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:30,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 17:45:31,154][19107] Updated weights for policy 0, policy_version 180595 (0.0036) [2024-06-18 17:45:34,747][19107] Updated weights for policy 0, policy_version 180605 (0.0038) [2024-06-18 17:45:35,500][18875] Fps is (10 sec: 39335.8, 60 sec: 41506.1, 300 sec: 41543.2). 
Total num frames: 2959032320. Throughput: 0: 41335.2. Samples: 183133280. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:35,501][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 17:45:38,845][19107] Updated weights for policy 0, policy_version 180615 (0.0036) [2024-06-18 17:45:40,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 2959245312. Throughput: 0: 41409.0. Samples: 183379200. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:40,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 17:45:43,017][19107] Updated weights for policy 0, policy_version 180625 (0.0027) [2024-06-18 17:45:45,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41232.9, 300 sec: 41487.6). Total num frames: 2959458304. Throughput: 0: 41359.4. Samples: 183507780. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:45,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 17:45:47,081][19107] Updated weights for policy 0, policy_version 180635 (0.0044) [2024-06-18 17:45:50,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 2959671296. Throughput: 0: 41312.5. Samples: 183752620. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:50,501][18875] Avg episode reward: [(0, '0.775')] [2024-06-18 17:45:50,814][19107] Updated weights for policy 0, policy_version 180645 (0.0028) [2024-06-18 17:45:54,659][19107] Updated weights for policy 0, policy_version 180655 (0.0031) [2024-06-18 17:45:55,504][18875] Fps is (10 sec: 40945.7, 60 sec: 41503.7, 300 sec: 41598.2). Total num frames: 2959867904. Throughput: 0: 41525.1. Samples: 184005340. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:45:55,505][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 17:45:58,480][19107] Updated weights for policy 0, policy_version 180665 (0.0046) [2024-06-18 17:46:00,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41487.6). 
Total num frames: 2960064512. Throughput: 0: 41319.7. Samples: 184128020. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:46:00,501][18875] Avg episode reward: [(0, '0.684')] [2024-06-18 17:46:02,893][19107] Updated weights for policy 0, policy_version 180675 (0.0025) [2024-06-18 17:46:05,501][18875] Fps is (10 sec: 44250.2, 60 sec: 42051.9, 300 sec: 41820.8). Total num frames: 2960310272. Throughput: 0: 41399.9. Samples: 184377800. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:46:05,502][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 17:46:06,217][19107] Updated weights for policy 0, policy_version 180685 (0.0031) [2024-06-18 17:46:09,756][19087] Signal inference workers to stop experience collection... (2700 times) [2024-06-18 17:46:09,795][19107] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-18 17:46:09,822][19087] Signal inference workers to resume experience collection... (2700 times) [2024-06-18 17:46:09,828][19107] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-18 17:46:10,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 2960490496. Throughput: 0: 41542.9. Samples: 184627280. Policy #0 lag: (min: 1.0, avg: 9.2, max: 20.0) [2024-06-18 17:46:10,501][18875] Avg episode reward: [(0, '0.352')] [2024-06-18 17:46:10,838][19107] Updated weights for policy 0, policy_version 180695 (0.0028) [2024-06-18 17:46:13,869][19107] Updated weights for policy 0, policy_version 180705 (0.0034) [2024-06-18 17:46:15,500][18875] Fps is (10 sec: 39324.1, 60 sec: 41508.7, 300 sec: 41543.2). Total num frames: 2960703488. Throughput: 0: 41282.8. Samples: 184750740. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:46:15,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 17:46:18,744][19107] Updated weights for policy 0, policy_version 180715 (0.0042) [2024-06-18 17:46:20,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 2960932864. Throughput: 0: 41589.0. Samples: 185004780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:46:20,500][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 17:46:21,592][19107] Updated weights for policy 0, policy_version 180725 (0.0028) [2024-06-18 17:46:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41235.6, 300 sec: 41487.6). Total num frames: 2961113088. Throughput: 0: 41714.7. Samples: 185256360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:46:25,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 17:46:26,446][19107] Updated weights for policy 0, policy_version 180735 (0.0037) [2024-06-18 17:46:29,968][19107] Updated weights for policy 0, policy_version 180745 (0.0039) [2024-06-18 17:46:30,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 2961326080. Throughput: 0: 41424.7. Samples: 185371880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:46:30,500][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 17:46:34,146][19107] Updated weights for policy 0, policy_version 180755 (0.0034) [2024-06-18 17:46:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 2961539072. Throughput: 0: 41734.2. Samples: 185630660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:46:35,500][18875] Avg episode reward: [(0, '0.385')] [2024-06-18 17:46:35,674][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000180760_2961571840.pth... 
[2024-06-18 17:46:35,730][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000180149_2951561216.pth [2024-06-18 17:46:38,344][19107] Updated weights for policy 0, policy_version 180765 (0.0037) [2024-06-18 17:46:40,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 2961752064. Throughput: 0: 41757.1. Samples: 185884260. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:46:40,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 17:46:42,433][19107] Updated weights for policy 0, policy_version 180775 (0.0030) [2024-06-18 17:46:45,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 2961948672. Throughput: 0: 41748.6. Samples: 186006700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:46:45,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 17:46:46,193][19107] Updated weights for policy 0, policy_version 180785 (0.0035) [2024-06-18 17:46:50,201][19107] Updated weights for policy 0, policy_version 180795 (0.0054) [2024-06-18 17:46:50,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 2962161664. Throughput: 0: 41794.8. Samples: 186258540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:46:50,501][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 17:46:53,683][19107] Updated weights for policy 0, policy_version 180805 (0.0025) [2024-06-18 17:46:55,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41781.7, 300 sec: 41543.2). Total num frames: 2962374656. Throughput: 0: 41931.1. Samples: 186514180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:46:55,501][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 17:46:57,927][19107] Updated weights for policy 0, policy_version 180815 (0.0035) [2024-06-18 17:47:00,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 2962587648. Throughput: 0: 42028.5. Samples: 186642020. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:47:00,501][18875] Avg episode reward: [(0, '0.355')] [2024-06-18 17:47:01,300][19107] Updated weights for policy 0, policy_version 180825 (0.0038) [2024-06-18 17:47:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.5, 300 sec: 41543.7). Total num frames: 2962784256. Throughput: 0: 41944.8. Samples: 186892300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:47:05,501][18875] Avg episode reward: [(0, '0.242')] [2024-06-18 17:47:05,567][19107] Updated weights for policy 0, policy_version 180835 (0.0037) [2024-06-18 17:47:09,355][19107] Updated weights for policy 0, policy_version 180845 (0.0028) [2024-06-18 17:47:10,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 2962997248. Throughput: 0: 41877.9. Samples: 187140860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:47:10,501][18875] Avg episode reward: [(0, '0.348')] [2024-06-18 17:47:13,318][19107] Updated weights for policy 0, policy_version 180855 (0.0034) [2024-06-18 17:47:15,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 2963210240. Throughput: 0: 42095.5. Samples: 187266180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 17:47:15,501][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 17:47:17,007][19107] Updated weights for policy 0, policy_version 180865 (0.0045) [2024-06-18 17:47:20,504][18875] Fps is (10 sec: 42582.8, 60 sec: 41503.6, 300 sec: 41542.7). Total num frames: 2963423232. Throughput: 0: 41937.0. Samples: 187517980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:47:20,504][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 17:47:20,871][19107] Updated weights for policy 0, policy_version 180875 (0.0029) [2024-06-18 17:47:25,101][19107] Updated weights for policy 0, policy_version 180885 (0.0030) [2024-06-18 17:47:25,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 2963619840. Throughput: 0: 41748.4. Samples: 187762940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:47:25,501][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 17:47:28,737][19107] Updated weights for policy 0, policy_version 180895 (0.0047) [2024-06-18 17:47:30,500][18875] Fps is (10 sec: 42613.7, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 2963849216. Throughput: 0: 41857.7. Samples: 187890300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:47:30,501][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 17:47:32,904][19107] Updated weights for policy 0, policy_version 180905 (0.0040) [2024-06-18 17:47:35,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 2964045824. Throughput: 0: 41837.3. Samples: 188141220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:47:35,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 17:47:36,751][19107] Updated weights for policy 0, policy_version 180915 (0.0046) [2024-06-18 17:47:38,305][19087] Signal inference workers to stop experience collection... (2750 times) [2024-06-18 17:47:38,315][19087] Signal inference workers to resume experience collection... (2750 times) [2024-06-18 17:47:38,356][19107] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-18 17:47:38,356][19107] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-18 17:47:40,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 2964258816. Throughput: 0: 41748.4. 
Samples: 188392860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:47:40,501][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 17:47:40,526][19107] Updated weights for policy 0, policy_version 180925 (0.0033) [2024-06-18 17:47:44,618][19107] Updated weights for policy 0, policy_version 180935 (0.0037) [2024-06-18 17:47:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 2964471808. Throughput: 0: 41765.3. Samples: 188521460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:47:45,501][18875] Avg episode reward: [(0, '0.168')] [2024-06-18 17:47:48,191][19107] Updated weights for policy 0, policy_version 180945 (0.0037) [2024-06-18 17:47:50,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 2964684800. Throughput: 0: 41735.5. Samples: 188770400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:47:50,501][18875] Avg episode reward: [(0, '0.199')] [2024-06-18 17:47:52,306][19107] Updated weights for policy 0, policy_version 180955 (0.0038) [2024-06-18 17:47:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 2964897792. Throughput: 0: 41877.8. Samples: 189025360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:47:55,501][18875] Avg episode reward: [(0, '0.279')] [2024-06-18 17:47:55,871][19107] Updated weights for policy 0, policy_version 180965 (0.0028) [2024-06-18 17:48:00,195][19107] Updated weights for policy 0, policy_version 180975 (0.0044) [2024-06-18 17:48:00,501][18875] Fps is (10 sec: 40958.3, 60 sec: 41778.8, 300 sec: 41598.6). Total num frames: 2965094400. Throughput: 0: 41820.4. Samples: 189148120. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:48:00,501][18875] Avg episode reward: [(0, '0.411')] [2024-06-18 17:48:03,698][19107] Updated weights for policy 0, policy_version 180985 (0.0030) [2024-06-18 17:48:05,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41654.8). Total num frames: 2965323776. Throughput: 0: 41721.6. Samples: 189395300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:48:05,501][18875] Avg episode reward: [(0, '0.711')] [2024-06-18 17:48:08,006][19107] Updated weights for policy 0, policy_version 180995 (0.0026) [2024-06-18 17:48:10,500][18875] Fps is (10 sec: 42600.6, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 2965520384. Throughput: 0: 41975.6. Samples: 189651840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:48:10,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 17:48:11,413][19107] Updated weights for policy 0, policy_version 181005 (0.0034) [2024-06-18 17:48:15,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 2965733376. Throughput: 0: 42015.6. Samples: 189781000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:48:15,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 17:48:15,626][19107] Updated weights for policy 0, policy_version 181015 (0.0029) [2024-06-18 17:48:19,384][19107] Updated weights for policy 0, policy_version 181025 (0.0035) [2024-06-18 17:48:20,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42327.8, 300 sec: 41820.9). Total num frames: 2965962752. Throughput: 0: 42093.3. Samples: 190035420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:48:20,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 17:48:23,729][19107] Updated weights for policy 0, policy_version 181035 (0.0038) [2024-06-18 17:48:25,504][18875] Fps is (10 sec: 42582.9, 60 sec: 42322.8, 300 sec: 41653.7). Total num frames: 2966159360. Throughput: 0: 42097.5. Samples: 190287400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:48:25,504][18875] Avg episode reward: [(0, '0.265')] [2024-06-18 17:48:27,215][19107] Updated weights for policy 0, policy_version 181045 (0.0043) [2024-06-18 17:48:30,500][18875] Fps is (10 sec: 37683.3, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 2966339584. Throughput: 0: 41954.6. Samples: 190409420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:48:30,501][18875] Avg episode reward: [(0, '0.415')] [2024-06-18 17:48:31,489][19107] Updated weights for policy 0, policy_version 181055 (0.0037) [2024-06-18 17:48:35,098][19107] Updated weights for policy 0, policy_version 181065 (0.0036) [2024-06-18 17:48:35,500][18875] Fps is (10 sec: 42613.2, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 2966585344. Throughput: 0: 42142.2. Samples: 190666800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:48:35,501][18875] Avg episode reward: [(0, '0.696')] [2024-06-18 17:48:35,616][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000181067_2966601728.pth... [2024-06-18 17:48:35,685][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000180453_2956541952.pth [2024-06-18 17:48:39,334][19107] Updated weights for policy 0, policy_version 181075 (0.0040) [2024-06-18 17:48:40,504][18875] Fps is (10 sec: 44219.3, 60 sec: 42049.5, 300 sec: 41653.7). Total num frames: 2966781952. Throughput: 0: 42016.7. Samples: 190916280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:48:40,505][18875] Avg episode reward: [(0, '0.291')] [2024-06-18 17:48:42,333][19087] Signal inference workers to stop experience collection... (2800 times) [2024-06-18 17:48:42,370][19107] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-18 17:48:42,395][19087] Signal inference workers to resume experience collection... 
(2800 times) [2024-06-18 17:48:42,396][19107] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-18 17:48:42,687][19107] Updated weights for policy 0, policy_version 181085 (0.0035) [2024-06-18 17:48:45,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 2966994944. Throughput: 0: 42000.5. Samples: 191038120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:48:45,501][18875] Avg episode reward: [(0, '0.198')] [2024-06-18 17:48:47,092][19107] Updated weights for policy 0, policy_version 181095 (0.0026) [2024-06-18 17:48:50,500][18875] Fps is (10 sec: 40975.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 2967191552. Throughput: 0: 42048.7. Samples: 191287500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:48:50,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 17:48:50,987][19107] Updated weights for policy 0, policy_version 181105 (0.0039) [2024-06-18 17:48:55,186][19107] Updated weights for policy 0, policy_version 181115 (0.0034) [2024-06-18 17:48:55,500][18875] Fps is (10 sec: 40959.1, 60 sec: 41779.0, 300 sec: 41709.7). Total num frames: 2967404544. Throughput: 0: 42041.6. Samples: 191543720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:48:55,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 17:48:58,702][19107] Updated weights for policy 0, policy_version 181125 (0.0041) [2024-06-18 17:49:00,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.6, 300 sec: 41765.3). Total num frames: 2967617536. Throughput: 0: 41790.3. Samples: 191661560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:49:00,500][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 17:49:02,809][19107] Updated weights for policy 0, policy_version 181135 (0.0043) [2024-06-18 17:49:05,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 2967814144. Throughput: 0: 41777.8. Samples: 191915420. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:49:05,501][18875] Avg episode reward: [(0, '0.518')] [2024-06-18 17:49:06,390][19107] Updated weights for policy 0, policy_version 181145 (0.0040) [2024-06-18 17:49:10,497][19107] Updated weights for policy 0, policy_version 181155 (0.0040) [2024-06-18 17:49:10,503][18875] Fps is (10 sec: 42588.6, 60 sec: 42050.7, 300 sec: 41709.5). Total num frames: 2968043520. Throughput: 0: 41732.4. Samples: 192165300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:49:10,503][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 17:49:14,464][19107] Updated weights for policy 0, policy_version 181165 (0.0040) [2024-06-18 17:49:15,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 2968240128. Throughput: 0: 41885.9. Samples: 192294280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:49:15,500][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 17:49:18,685][19107] Updated weights for policy 0, policy_version 181175 (0.0040) [2024-06-18 17:49:20,500][18875] Fps is (10 sec: 37691.3, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 2968420352. Throughput: 0: 41710.3. Samples: 192543760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:49:20,501][18875] Avg episode reward: [(0, '0.188')] [2024-06-18 17:49:22,280][19107] Updated weights for policy 0, policy_version 181185 (0.0040) [2024-06-18 17:49:25,500][18875] Fps is (10 sec: 42597.6, 60 sec: 41781.6, 300 sec: 41654.2). Total num frames: 2968666112. Throughput: 0: 41622.7. Samples: 192789140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 17:49:25,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 17:49:26,528][19107] Updated weights for policy 0, policy_version 181195 (0.0031) [2024-06-18 17:49:30,169][19107] Updated weights for policy 0, policy_version 181205 (0.0036) [2024-06-18 17:49:30,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 2968862720. Throughput: 0: 41779.0. Samples: 192918180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:49:30,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 17:49:34,208][19107] Updated weights for policy 0, policy_version 181215 (0.0043) [2024-06-18 17:49:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 2969075712. Throughput: 0: 41789.4. Samples: 193168020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:49:35,501][18875] Avg episode reward: [(0, '0.298')] [2024-06-18 17:49:38,315][19107] Updated weights for policy 0, policy_version 181225 (0.0034) [2024-06-18 17:49:40,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42055.0, 300 sec: 41765.3). Total num frames: 2969305088. Throughput: 0: 41519.2. Samples: 193412080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:49:40,501][18875] Avg episode reward: [(0, '0.202')] [2024-06-18 17:49:42,542][19107] Updated weights for policy 0, policy_version 181235 (0.0039) [2024-06-18 17:49:45,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 2969485312. Throughput: 0: 41768.3. Samples: 193541140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:49:45,501][18875] Avg episode reward: [(0, '0.392')] [2024-06-18 17:49:46,296][19107] Updated weights for policy 0, policy_version 181245 (0.0034) [2024-06-18 17:49:50,292][19107] Updated weights for policy 0, policy_version 181255 (0.0043) [2024-06-18 17:49:50,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41765.3). 
Total num frames: 2969698304. Throughput: 0: 41522.6. Samples: 193783940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:49:50,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 17:49:54,166][19107] Updated weights for policy 0, policy_version 181265 (0.0036) [2024-06-18 17:49:55,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 2969911296. Throughput: 0: 41519.7. Samples: 194033600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:49:55,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 17:49:57,904][19107] Updated weights for policy 0, policy_version 181275 (0.0045) [2024-06-18 17:50:00,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 2970107904. Throughput: 0: 41526.5. Samples: 194162980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:50:00,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 17:50:01,920][19107] Updated weights for policy 0, policy_version 181285 (0.0047) [2024-06-18 17:50:02,551][19087] Signal inference workers to stop experience collection... (2850 times) [2024-06-18 17:50:02,551][19087] Signal inference workers to resume experience collection... (2850 times) [2024-06-18 17:50:02,595][19107] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-18 17:50:02,596][19107] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-18 17:50:05,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 2970320896. Throughput: 0: 41424.6. Samples: 194407860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:50:05,501][18875] Avg episode reward: [(0, '0.720')] [2024-06-18 17:50:05,605][19107] Updated weights for policy 0, policy_version 181295 (0.0030) [2024-06-18 17:50:09,673][19107] Updated weights for policy 0, policy_version 181305 (0.0038) [2024-06-18 17:50:10,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41234.6, 300 sec: 41710.3). Total num frames: 2970517504. Throughput: 0: 41502.2. Samples: 194656740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:50:10,501][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 17:50:13,898][19107] Updated weights for policy 0, policy_version 181315 (0.0039) [2024-06-18 17:50:15,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 2970714112. Throughput: 0: 41416.2. Samples: 194781900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:50:15,500][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 17:50:17,506][19107] Updated weights for policy 0, policy_version 181325 (0.0034) [2024-06-18 17:50:20,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 41710.3). Total num frames: 2970943488. Throughput: 0: 41340.6. Samples: 195028340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:50:20,500][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 17:50:21,632][19107] Updated weights for policy 0, policy_version 181335 (0.0037) [2024-06-18 17:50:25,240][19107] Updated weights for policy 0, policy_version 181345 (0.0044) [2024-06-18 17:50:25,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 2971156480. Throughput: 0: 41593.9. Samples: 195283800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:50:25,500][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 17:50:29,235][19107] Updated weights for policy 0, policy_version 181355 (0.0028) [2024-06-18 17:50:30,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 41820.9). 
Total num frames: 2971369472. Throughput: 0: 41553.9. Samples: 195411060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 17:50:30,501][18875] Avg episode reward: [(0, '0.597')] [2024-06-18 17:50:33,547][19107] Updated weights for policy 0, policy_version 181365 (0.0034) [2024-06-18 17:50:35,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 2971582464. Throughput: 0: 41776.1. Samples: 195663860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:50:35,500][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 17:50:35,526][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000181371_2971582464.pth... [2024-06-18 17:50:35,582][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000180760_2961571840.pth [2024-06-18 17:50:37,084][19107] Updated weights for policy 0, policy_version 181375 (0.0027) [2024-06-18 17:50:40,500][18875] Fps is (10 sec: 39321.4, 60 sec: 40960.0, 300 sec: 41709.8). Total num frames: 2971762688. Throughput: 0: 41740.1. Samples: 195911900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:50:40,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 17:50:41,207][19107] Updated weights for policy 0, policy_version 181385 (0.0035) [2024-06-18 17:50:45,025][19107] Updated weights for policy 0, policy_version 181395 (0.0034) [2024-06-18 17:50:45,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 2971992064. Throughput: 0: 41503.6. Samples: 196030640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:50:45,504][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 17:50:48,892][19107] Updated weights for policy 0, policy_version 181405 (0.0036) [2024-06-18 17:50:50,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41765.8). Total num frames: 2972188672. Throughput: 0: 41640.0. Samples: 196281660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:50:50,500][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 17:50:52,811][19107] Updated weights for policy 0, policy_version 181415 (0.0033) [2024-06-18 17:50:55,504][18875] Fps is (10 sec: 40945.3, 60 sec: 41503.7, 300 sec: 41820.3). Total num frames: 2972401664. Throughput: 0: 41762.0. Samples: 196536180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:50:55,505][18875] Avg episode reward: [(0, '0.518')] [2024-06-18 17:50:57,192][19107] Updated weights for policy 0, policy_version 181425 (0.0046) [2024-06-18 17:51:00,500][18875] Fps is (10 sec: 40958.9, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 2972598272. Throughput: 0: 41630.4. Samples: 196655280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:51:00,501][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 17:51:00,895][19107] Updated weights for policy 0, policy_version 181435 (0.0042) [2024-06-18 17:51:04,982][19107] Updated weights for policy 0, policy_version 181445 (0.0036) [2024-06-18 17:51:05,500][18875] Fps is (10 sec: 40974.4, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 2972811264. Throughput: 0: 41826.0. Samples: 196910520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:51:05,501][18875] Avg episode reward: [(0, '0.683')] [2024-06-18 17:51:08,752][19107] Updated weights for policy 0, policy_version 181455 (0.0025) [2024-06-18 17:51:10,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 2973024256. Throughput: 0: 41622.1. Samples: 197156800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:51:10,501][18875] Avg episode reward: [(0, '0.752')] [2024-06-18 17:51:12,703][19107] Updated weights for policy 0, policy_version 181465 (0.0027) [2024-06-18 17:51:15,503][18875] Fps is (10 sec: 40948.6, 60 sec: 41777.1, 300 sec: 41653.8). Total num frames: 2973220864. Throughput: 0: 41695.5. Samples: 197287480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:51:15,504][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 17:51:16,599][19107] Updated weights for policy 0, policy_version 181475 (0.0032) [2024-06-18 17:51:20,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 2973433856. Throughput: 0: 41445.7. Samples: 197528920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:51:20,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 17:51:20,736][19107] Updated weights for policy 0, policy_version 181485 (0.0041) [2024-06-18 17:51:24,536][19107] Updated weights for policy 0, policy_version 181495 (0.0034) [2024-06-18 17:51:25,500][18875] Fps is (10 sec: 44250.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 2973663232. Throughput: 0: 41484.1. Samples: 197778680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:51:25,501][18875] Avg episode reward: [(0, '0.759')] [2024-06-18 17:51:28,512][19107] Updated weights for policy 0, policy_version 181505 (0.0040) [2024-06-18 17:51:30,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 2973859840. Throughput: 0: 41656.4. Samples: 197905180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:51:30,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 17:51:32,693][19107] Updated weights for policy 0, policy_version 181515 (0.0028) [2024-06-18 17:51:35,501][18875] Fps is (10 sec: 40959.0, 60 sec: 41505.9, 300 sec: 41765.3). Total num frames: 2974072832. Throughput: 0: 41707.7. Samples: 198158520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:51:35,501][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 17:51:36,264][19107] Updated weights for policy 0, policy_version 181525 (0.0035) [2024-06-18 17:51:36,834][19087] Signal inference workers to stop experience collection... 
(2900 times) [2024-06-18 17:51:36,855][19107] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-18 17:51:36,944][19087] Signal inference workers to resume experience collection... (2900 times) [2024-06-18 17:51:36,945][19107] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-18 17:51:40,490][19107] Updated weights for policy 0, policy_version 181535 (0.0037) [2024-06-18 17:51:40,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 2974269440. Throughput: 0: 41511.0. Samples: 198404020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 17:51:40,500][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 17:51:44,282][19107] Updated weights for policy 0, policy_version 181545 (0.0036) [2024-06-18 17:51:45,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 2974482432. Throughput: 0: 41521.5. Samples: 198523740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:51:45,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 17:51:48,327][19107] Updated weights for policy 0, policy_version 181555 (0.0034) [2024-06-18 17:51:50,500][18875] Fps is (10 sec: 40959.3, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 2974679040. Throughput: 0: 41529.4. Samples: 198779340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:51:50,501][18875] Avg episode reward: [(0, '0.385')] [2024-06-18 17:51:51,927][19107] Updated weights for policy 0, policy_version 181565 (0.0029) [2024-06-18 17:51:55,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41508.6, 300 sec: 41709.8). Total num frames: 2974892032. Throughput: 0: 41601.4. Samples: 199028860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:51:55,501][18875] Avg episode reward: [(0, '0.259')] [2024-06-18 17:51:56,067][19107] Updated weights for policy 0, policy_version 181575 (0.0040) [2024-06-18 17:51:59,599][19107] Updated weights for policy 0, policy_version 181585 (0.0033) [2024-06-18 17:52:00,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41779.4, 300 sec: 41765.3). Total num frames: 2975105024. Throughput: 0: 41492.1. Samples: 199154500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:00,501][18875] Avg episode reward: [(0, '0.452')] [2024-06-18 17:52:03,830][19107] Updated weights for policy 0, policy_version 181595 (0.0033) [2024-06-18 17:52:05,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.3, 300 sec: 41709.8). Total num frames: 2975301632. Throughput: 0: 41664.0. Samples: 199403800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:05,501][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 17:52:07,362][19107] Updated weights for policy 0, policy_version 181605 (0.0039) [2024-06-18 17:52:10,500][18875] Fps is (10 sec: 40959.3, 60 sec: 41506.1, 300 sec: 41709.7). Total num frames: 2975514624. Throughput: 0: 41661.2. Samples: 199653440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:10,501][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 17:52:11,691][19107] Updated weights for policy 0, policy_version 181615 (0.0028) [2024-06-18 17:52:15,322][19107] Updated weights for policy 0, policy_version 181625 (0.0031) [2024-06-18 17:52:15,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42054.4, 300 sec: 41765.8). Total num frames: 2975744000. Throughput: 0: 41728.1. Samples: 199782940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:15,501][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 17:52:19,732][19107] Updated weights for policy 0, policy_version 181635 (0.0027) [2024-06-18 17:52:20,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 2975907840. Throughput: 0: 41730.5. Samples: 200036380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:20,500][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 17:52:23,175][19107] Updated weights for policy 0, policy_version 181645 (0.0034) [2024-06-18 17:52:25,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 2976153600. Throughput: 0: 41612.8. Samples: 200276600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:25,501][18875] Avg episode reward: [(0, '0.693')] [2024-06-18 17:52:27,550][19107] Updated weights for policy 0, policy_version 181655 (0.0041) [2024-06-18 17:52:30,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 2976350208. Throughput: 0: 41962.7. Samples: 200412060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:30,501][18875] Avg episode reward: [(0, '0.518')] [2024-06-18 17:52:30,999][19107] Updated weights for policy 0, policy_version 181665 (0.0037) [2024-06-18 17:52:35,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41233.2, 300 sec: 41654.2). Total num frames: 2976546816. Throughput: 0: 41783.1. Samples: 200659580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:35,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 17:52:35,611][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000181675_2976563200.pth... 
[2024-06-18 17:52:35,621][19107] Updated weights for policy 0, policy_version 181675 (0.0042) [2024-06-18 17:52:35,670][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000181067_2966601728.pth [2024-06-18 17:52:38,899][19107] Updated weights for policy 0, policy_version 181685 (0.0029) [2024-06-18 17:52:40,500][18875] Fps is (10 sec: 45874.7, 60 sec: 42325.2, 300 sec: 41820.8). Total num frames: 2976808960. Throughput: 0: 41724.5. Samples: 200906460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:40,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 17:52:43,312][19107] Updated weights for policy 0, policy_version 181695 (0.0038) [2024-06-18 17:52:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 2976972800. Throughput: 0: 41847.4. Samples: 201037640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 17:52:45,501][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 17:52:46,664][19107] Updated weights for policy 0, policy_version 181705 (0.0027) [2024-06-18 17:52:50,500][18875] Fps is (10 sec: 36044.4, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 2977169408. Throughput: 0: 41730.1. Samples: 201281660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:52:50,501][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 17:52:51,265][19107] Updated weights for policy 0, policy_version 181715 (0.0044) [2024-06-18 17:52:54,138][19087] Signal inference workers to stop experience collection... (2950 times) [2024-06-18 17:52:54,186][19107] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-18 17:52:54,193][19087] Signal inference workers to resume experience collection... 
(2950 times) [2024-06-18 17:52:54,201][19107] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-18 17:52:54,783][19107] Updated weights for policy 0, policy_version 181725 (0.0033) [2024-06-18 17:52:55,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 2977398784. Throughput: 0: 41680.0. Samples: 201529040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:52:55,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 17:52:59,354][19107] Updated weights for policy 0, policy_version 181735 (0.0031) [2024-06-18 17:53:00,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 2977579008. Throughput: 0: 41560.0. Samples: 201653140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:00,501][18875] Avg episode reward: [(0, '0.548')] [2024-06-18 17:53:02,969][19107] Updated weights for policy 0, policy_version 181745 (0.0028) [2024-06-18 17:53:05,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 2977824768. Throughput: 0: 41328.4. Samples: 201896160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:05,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 17:53:07,021][19107] Updated weights for policy 0, policy_version 181755 (0.0040) [2024-06-18 17:53:10,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 2978004992. Throughput: 0: 41668.8. Samples: 202151700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:10,501][18875] Avg episode reward: [(0, '0.398')] [2024-06-18 17:53:10,754][19107] Updated weights for policy 0, policy_version 181765 (0.0046) [2024-06-18 17:53:14,815][19107] Updated weights for policy 0, policy_version 181775 (0.0055) [2024-06-18 17:53:15,500][18875] Fps is (10 sec: 39320.8, 60 sec: 41232.9, 300 sec: 41543.1). Total num frames: 2978217984. Throughput: 0: 41312.2. Samples: 202271120. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:15,501][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 17:53:18,555][19107] Updated weights for policy 0, policy_version 181785 (0.0038) [2024-06-18 17:53:20,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 41654.8). Total num frames: 2978447360. Throughput: 0: 41457.9. Samples: 202525180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:20,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 17:53:23,203][19107] Updated weights for policy 0, policy_version 181795 (0.0030) [2024-06-18 17:53:25,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 2978627584. Throughput: 0: 41514.2. Samples: 202774600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:25,501][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 17:53:26,498][19107] Updated weights for policy 0, policy_version 181805 (0.0045) [2024-06-18 17:53:30,500][18875] Fps is (10 sec: 37682.4, 60 sec: 41232.9, 300 sec: 41487.6). Total num frames: 2978824192. Throughput: 0: 41237.7. Samples: 202893340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:30,501][18875] Avg episode reward: [(0, '0.448')] [2024-06-18 17:53:30,963][19107] Updated weights for policy 0, policy_version 181815 (0.0041) [2024-06-18 17:53:34,577][19107] Updated weights for policy 0, policy_version 181825 (0.0039) [2024-06-18 17:53:35,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41654.8). Total num frames: 2979069952. Throughput: 0: 41561.4. Samples: 203151920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:35,501][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 17:53:38,850][19107] Updated weights for policy 0, policy_version 181835 (0.0029) [2024-06-18 17:53:40,500][18875] Fps is (10 sec: 44237.3, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 2979266560. Throughput: 0: 41603.2. Samples: 203401180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:40,501][18875] Avg episode reward: [(0, '0.430')] [2024-06-18 17:53:42,438][19107] Updated weights for policy 0, policy_version 181845 (0.0033) [2024-06-18 17:53:45,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 2979463168. Throughput: 0: 41618.9. Samples: 203526000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:45,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 17:53:46,603][19107] Updated weights for policy 0, policy_version 181855 (0.0035) [2024-06-18 17:53:50,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.3, 300 sec: 41543.2). Total num frames: 2979659776. Throughput: 0: 41685.8. Samples: 203772020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 17:53:50,501][18875] Avg episode reward: [(0, '0.801')] [2024-06-18 17:53:50,613][19107] Updated weights for policy 0, policy_version 181865 (0.0035) [2024-06-18 17:53:54,387][19107] Updated weights for policy 0, policy_version 181875 (0.0034) [2024-06-18 17:53:55,501][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 2979889152. Throughput: 0: 41595.8. Samples: 204023520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:53:55,501][18875] Avg episode reward: [(0, '0.661')] [2024-06-18 17:53:58,349][19107] Updated weights for policy 0, policy_version 181885 (0.0034) [2024-06-18 17:54:00,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 2980102144. Throughput: 0: 41841.4. Samples: 204153980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:00,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 17:54:02,118][19107] Updated weights for policy 0, policy_version 181895 (0.0042) [2024-06-18 17:54:05,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41233.0, 300 sec: 41543.5). Total num frames: 2980298752. Throughput: 0: 41687.4. Samples: 204401120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:05,501][18875] Avg episode reward: [(0, '0.430')] [2024-06-18 17:54:05,994][19107] Updated weights for policy 0, policy_version 181905 (0.0035) [2024-06-18 17:54:09,907][19107] Updated weights for policy 0, policy_version 181915 (0.0037) [2024-06-18 17:54:10,505][18875] Fps is (10 sec: 40939.7, 60 sec: 41775.7, 300 sec: 41598.0). Total num frames: 2980511744. Throughput: 0: 41671.8. Samples: 204650040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:10,506][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 17:54:11,800][19087] Signal inference workers to stop experience collection... (3000 times) [2024-06-18 17:54:11,800][19087] Signal inference workers to resume experience collection... (3000 times) [2024-06-18 17:54:11,828][19107] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-18 17:54:11,829][19107] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-18 17:54:13,615][19107] Updated weights for policy 0, policy_version 181925 (0.0031) [2024-06-18 17:54:15,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 2980741120. Throughput: 0: 41866.7. Samples: 204777340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:15,501][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 17:54:17,663][19107] Updated weights for policy 0, policy_version 181935 (0.0033) [2024-06-18 17:54:20,500][18875] Fps is (10 sec: 42620.1, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 2980937728. Throughput: 0: 41668.5. Samples: 205027000. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:20,500][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 17:54:21,961][19107] Updated weights for policy 0, policy_version 181945 (0.0034) [2024-06-18 17:54:25,388][19107] Updated weights for policy 0, policy_version 181955 (0.0030) [2024-06-18 17:54:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 2981150720. Throughput: 0: 41768.9. Samples: 205280780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:25,501][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 17:54:29,635][19107] Updated weights for policy 0, policy_version 181965 (0.0034) [2024-06-18 17:54:30,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42325.4, 300 sec: 41654.2). Total num frames: 2981363712. Throughput: 0: 41770.7. Samples: 205405680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:30,504][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 17:54:33,269][19107] Updated weights for policy 0, policy_version 181975 (0.0037) [2024-06-18 17:54:35,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 2981576704. Throughput: 0: 41859.0. Samples: 205655680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:35,501][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 17:54:35,529][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000181981_2981576704.pth... [2024-06-18 17:54:35,589][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000181371_2971582464.pth [2024-06-18 17:54:37,446][19107] Updated weights for policy 0, policy_version 181985 (0.0042) [2024-06-18 17:54:40,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 2981773312. Throughput: 0: 41925.5. Samples: 205910160. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:40,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 17:54:40,886][19107] Updated weights for policy 0, policy_version 181995 (0.0033) [2024-06-18 17:54:45,146][19107] Updated weights for policy 0, policy_version 182005 (0.0033) [2024-06-18 17:54:45,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 2981969920. Throughput: 0: 41680.1. Samples: 206029580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:45,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 17:54:48,664][19107] Updated weights for policy 0, policy_version 182015 (0.0023) [2024-06-18 17:54:50,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41654.3). Total num frames: 2982199296. Throughput: 0: 41828.1. Samples: 206283380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:50,501][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 17:54:52,883][19107] Updated weights for policy 0, policy_version 182025 (0.0042) [2024-06-18 17:54:55,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 2982395904. Throughput: 0: 41856.6. Samples: 206533380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 17:54:55,501][18875] Avg episode reward: [(0, '0.574')] [2024-06-18 17:54:56,771][19107] Updated weights for policy 0, policy_version 182035 (0.0026) [2024-06-18 17:55:00,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 2982592512. Throughput: 0: 41677.3. Samples: 206652820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:00,501][18875] Avg episode reward: [(0, '0.358')] [2024-06-18 17:55:01,140][19107] Updated weights for policy 0, policy_version 182045 (0.0044) [2024-06-18 17:55:04,817][19107] Updated weights for policy 0, policy_version 182055 (0.0038) [2024-06-18 17:55:05,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 2982821888. Throughput: 0: 41738.9. Samples: 206905260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:05,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 17:55:09,013][19107] Updated weights for policy 0, policy_version 182065 (0.0036) [2024-06-18 17:55:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41509.5, 300 sec: 41654.2). Total num frames: 2983002112. Throughput: 0: 41543.4. Samples: 207150240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:10,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 17:55:12,823][19107] Updated weights for policy 0, policy_version 182075 (0.0030) [2024-06-18 17:55:15,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 2983231488. Throughput: 0: 41509.7. Samples: 207273620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:15,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 17:55:16,805][19107] Updated weights for policy 0, policy_version 182085 (0.0045) [2024-06-18 17:55:20,504][18875] Fps is (10 sec: 42583.6, 60 sec: 41503.6, 300 sec: 41598.2). Total num frames: 2983428096. Throughput: 0: 41622.1. Samples: 207528820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:20,505][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 17:55:20,642][19107] Updated weights for policy 0, policy_version 182095 (0.0028) [2024-06-18 17:55:24,548][19107] Updated weights for policy 0, policy_version 182105 (0.0033) [2024-06-18 17:55:25,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 2983641088. Throughput: 0: 41536.5. Samples: 207779300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:25,501][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 17:55:28,546][19107] Updated weights for policy 0, policy_version 182115 (0.0026) [2024-06-18 17:55:30,500][18875] Fps is (10 sec: 44252.9, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 2983870464. Throughput: 0: 41762.7. Samples: 207908900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:30,501][18875] Avg episode reward: [(0, '0.373')] [2024-06-18 17:55:32,234][19107] Updated weights for policy 0, policy_version 182125 (0.0037) [2024-06-18 17:55:35,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 2984050688. Throughput: 0: 41730.3. Samples: 208161240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:35,501][18875] Avg episode reward: [(0, '0.276')] [2024-06-18 17:55:36,262][19107] Updated weights for policy 0, policy_version 182135 (0.0034) [2024-06-18 17:55:39,933][19107] Updated weights for policy 0, policy_version 182145 (0.0043) [2024-06-18 17:55:40,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 2984263680. Throughput: 0: 41640.5. Samples: 208407200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:40,501][18875] Avg episode reward: [(0, '0.413')] [2024-06-18 17:55:43,929][19107] Updated weights for policy 0, policy_version 182155 (0.0049) [2024-06-18 17:55:45,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 41709.7). Total num frames: 2984493056. Throughput: 0: 41810.2. Samples: 208534280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:45,501][18875] Avg episode reward: [(0, '0.415')] [2024-06-18 17:55:47,642][19107] Updated weights for policy 0, policy_version 182165 (0.0023) [2024-06-18 17:55:50,503][18875] Fps is (10 sec: 42585.6, 60 sec: 41504.1, 300 sec: 41654.3). Total num frames: 2984689664. Throughput: 0: 41737.4. Samples: 208783560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:50,504][18875] Avg episode reward: [(0, '0.312')] [2024-06-18 17:55:51,863][19107] Updated weights for policy 0, policy_version 182175 (0.0037) [2024-06-18 17:55:55,498][19107] Updated weights for policy 0, policy_version 182185 (0.0033) [2024-06-18 17:55:55,501][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 41765.3). Total num frames: 2984919040. Throughput: 0: 42007.9. Samples: 209040600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:55:55,501][18875] Avg episode reward: [(0, '0.370')] [2024-06-18 17:55:59,599][19107] Updated weights for policy 0, policy_version 182195 (0.0041) [2024-06-18 17:56:00,500][18875] Fps is (10 sec: 44249.7, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 2985132032. Throughput: 0: 42034.3. Samples: 209165160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 17:56:00,501][18875] Avg episode reward: [(0, '0.344')] [2024-06-18 17:56:03,391][19107] Updated weights for policy 0, policy_version 182205 (0.0038) [2024-06-18 17:56:05,510][18875] Fps is (10 sec: 40920.7, 60 sec: 41772.5, 300 sec: 41708.4). Total num frames: 2985328640. Throughput: 0: 42076.4. Samples: 209422520. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:05,511][18875] Avg episode reward: [(0, '0.344')] [2024-06-18 17:56:07,238][19107] Updated weights for policy 0, policy_version 182215 (0.0044) [2024-06-18 17:56:08,350][19087] Signal inference workers to stop experience collection... (3050 times) [2024-06-18 17:56:08,386][19107] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-18 17:56:08,406][19087] Signal inference workers to resume experience collection... (3050 times) [2024-06-18 17:56:08,407][19107] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-18 17:56:10,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 41765.7). Total num frames: 2985541632. Throughput: 0: 41980.5. Samples: 209668420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:10,501][18875] Avg episode reward: [(0, '0.680')] [2024-06-18 17:56:11,276][19107] Updated weights for policy 0, policy_version 182225 (0.0038) [2024-06-18 17:56:14,960][19107] Updated weights for policy 0, policy_version 182235 (0.0033) [2024-06-18 17:56:15,500][18875] Fps is (10 sec: 41000.0, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 2985738240. Throughput: 0: 41895.4. Samples: 209794200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:15,501][18875] Avg episode reward: [(0, '0.680')] [2024-06-18 17:56:19,003][19107] Updated weights for policy 0, policy_version 182245 (0.0043) [2024-06-18 17:56:20,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42054.8, 300 sec: 41654.2). Total num frames: 2985951232. Throughput: 0: 41757.3. Samples: 210040320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:20,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 17:56:22,619][19107] Updated weights for policy 0, policy_version 182255 (0.0042) [2024-06-18 17:56:25,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 2986147840. Throughput: 0: 41836.5. 
Samples: 210289840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:25,501][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 17:56:27,431][19107] Updated weights for policy 0, policy_version 182265 (0.0034) [2024-06-18 17:56:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 2986377216. Throughput: 0: 41775.2. Samples: 210414160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:30,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 17:56:30,607][19107] Updated weights for policy 0, policy_version 182275 (0.0036) [2024-06-18 17:56:34,929][19107] Updated weights for policy 0, policy_version 182285 (0.0033) [2024-06-18 17:56:35,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 2986573824. Throughput: 0: 41778.3. Samples: 210663460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:35,501][18875] Avg episode reward: [(0, '0.754')] [2024-06-18 17:56:35,601][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000182287_2986590208.pth... [2024-06-18 17:56:35,657][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000181675_2976563200.pth [2024-06-18 17:56:38,738][19107] Updated weights for policy 0, policy_version 182295 (0.0035) [2024-06-18 17:56:40,504][18875] Fps is (10 sec: 39307.5, 60 sec: 41776.7, 300 sec: 41653.7). Total num frames: 2986770432. Throughput: 0: 41690.7. Samples: 210916820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:40,504][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 17:56:42,585][19107] Updated weights for policy 0, policy_version 182305 (0.0048) [2024-06-18 17:56:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 2986999808. Throughput: 0: 41731.2. Samples: 211043060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:45,501][18875] Avg episode reward: [(0, '0.780')] [2024-06-18 17:56:46,398][19107] Updated weights for policy 0, policy_version 182315 (0.0033) [2024-06-18 17:56:50,500][18875] Fps is (10 sec: 42613.5, 60 sec: 41781.3, 300 sec: 41709.8). Total num frames: 2987196416. Throughput: 0: 41706.4. Samples: 211298900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:50,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 17:56:50,683][19107] Updated weights for policy 0, policy_version 182325 (0.0027) [2024-06-18 17:56:54,139][19107] Updated weights for policy 0, policy_version 182335 (0.0040) [2024-06-18 17:56:55,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.2, 300 sec: 41709.7). Total num frames: 2987409408. Throughput: 0: 41756.3. Samples: 211547460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:56:55,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 17:56:58,348][19107] Updated weights for policy 0, policy_version 182345 (0.0044) [2024-06-18 17:57:00,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 2987622400. Throughput: 0: 41641.8. Samples: 211668080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:57:00,501][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 17:57:02,158][19107] Updated weights for policy 0, policy_version 182355 (0.0034) [2024-06-18 17:57:05,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41786.0, 300 sec: 41765.3). Total num frames: 2987835392. Throughput: 0: 41796.0. Samples: 211921140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:57:05,501][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 17:57:06,265][19107] Updated weights for policy 0, policy_version 182365 (0.0046) [2024-06-18 17:57:09,951][19107] Updated weights for policy 0, policy_version 182375 (0.0039) [2024-06-18 17:57:10,504][18875] Fps is (10 sec: 42583.3, 60 sec: 41776.6, 300 sec: 41709.3). 
Total num frames: 2988048384. Throughput: 0: 41699.7. Samples: 212166480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 17:57:10,505][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 17:57:14,032][19107] Updated weights for policy 0, policy_version 182385 (0.0033) [2024-06-18 17:57:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41820.8). Total num frames: 2988244992. Throughput: 0: 41857.4. Samples: 212297740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:57:15,501][18875] Avg episode reward: [(0, '0.401')] [2024-06-18 17:57:17,674][19107] Updated weights for policy 0, policy_version 182395 (0.0049) [2024-06-18 17:57:20,500][18875] Fps is (10 sec: 39336.2, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 2988441600. Throughput: 0: 41910.3. Samples: 212549420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:57:20,501][18875] Avg episode reward: [(0, '0.401')] [2024-06-18 17:57:21,773][19107] Updated weights for policy 0, policy_version 182405 (0.0047) [2024-06-18 17:57:25,455][19107] Updated weights for policy 0, policy_version 182415 (0.0023) [2024-06-18 17:57:25,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42325.2, 300 sec: 41820.8). Total num frames: 2988687360. Throughput: 0: 41809.0. Samples: 212798080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:57:25,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 17:57:29,636][19107] Updated weights for policy 0, policy_version 182425 (0.0031) [2024-06-18 17:57:30,501][18875] Fps is (10 sec: 42597.5, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 2988867584. Throughput: 0: 41859.9. Samples: 212926760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:57:30,501][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 17:57:33,100][19087] Signal inference workers to stop experience collection... (3100 times) [2024-06-18 17:57:33,102][19087] Signal inference workers to resume experience collection... 
(3100 times) [2024-06-18 17:57:33,127][19107] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-18 17:57:33,127][19107] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-18 17:57:33,249][19107] Updated weights for policy 0, policy_version 182435 (0.0029) [2024-06-18 17:57:35,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 2989064192. Throughput: 0: 41700.9. Samples: 213175440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:57:35,501][18875] Avg episode reward: [(0, '0.526')] [2024-06-18 17:57:37,566][19107] Updated weights for policy 0, policy_version 182445 (0.0034) [2024-06-18 17:57:40,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42327.9, 300 sec: 41820.9). Total num frames: 2989309952. Throughput: 0: 41661.9. Samples: 213422240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:57:40,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 17:57:41,019][19107] Updated weights for policy 0, policy_version 182455 (0.0027) [2024-06-18 17:57:45,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 2989490176. Throughput: 0: 41776.1. Samples: 213548000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:57:45,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 17:57:45,532][19107] Updated weights for policy 0, policy_version 182465 (0.0038) [2024-06-18 17:57:48,741][19107] Updated weights for policy 0, policy_version 182475 (0.0030) [2024-06-18 17:57:50,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 2989703168. Throughput: 0: 41596.0. Samples: 213792960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:57:50,501][18875] Avg episode reward: [(0, '0.402')] [2024-06-18 17:57:53,409][19107] Updated weights for policy 0, policy_version 182485 (0.0035) [2024-06-18 17:57:55,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 2989932544. Throughput: 0: 41761.2. Samples: 214045580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:57:55,500][18875] Avg episode reward: [(0, '0.294')] [2024-06-18 17:57:57,058][19107] Updated weights for policy 0, policy_version 182495 (0.0035) [2024-06-18 17:58:00,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 2990112768. Throughput: 0: 41830.6. Samples: 214180120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:58:00,501][18875] Avg episode reward: [(0, '0.294')] [2024-06-18 17:58:01,110][19107] Updated weights for policy 0, policy_version 182505 (0.0026) [2024-06-18 17:58:04,723][19107] Updated weights for policy 0, policy_version 182515 (0.0029) [2024-06-18 17:58:05,503][18875] Fps is (10 sec: 40947.1, 60 sec: 41777.1, 300 sec: 41820.4). Total num frames: 2990342144. Throughput: 0: 41793.5. Samples: 214430260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:58:05,504][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 17:58:08,854][19107] Updated weights for policy 0, policy_version 182525 (0.0042) [2024-06-18 17:58:10,500][18875] Fps is (10 sec: 44237.0, 60 sec: 41781.7, 300 sec: 41820.9). Total num frames: 2990555136. Throughput: 0: 41835.2. Samples: 214680660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:58:10,501][18875] Avg episode reward: [(0, '0.401')] [2024-06-18 17:58:12,517][19107] Updated weights for policy 0, policy_version 182535 (0.0038) [2024-06-18 17:58:15,500][18875] Fps is (10 sec: 39334.0, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 2990735360. Throughput: 0: 41761.1. Samples: 214806000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 17:58:15,500][18875] Avg episode reward: [(0, '0.284')] [2024-06-18 17:58:16,637][19107] Updated weights for policy 0, policy_version 182545 (0.0037) [2024-06-18 17:58:20,235][19107] Updated weights for policy 0, policy_version 182555 (0.0033) [2024-06-18 17:58:20,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 2990997504. Throughput: 0: 41884.6. Samples: 215060240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:58:20,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 17:58:24,571][19107] Updated weights for policy 0, policy_version 182565 (0.0035) [2024-06-18 17:58:25,500][18875] Fps is (10 sec: 45874.7, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 2991194112. Throughput: 0: 42019.1. Samples: 215313100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:58:25,501][18875] Avg episode reward: [(0, '0.696')] [2024-06-18 17:58:27,885][19107] Updated weights for policy 0, policy_version 182575 (0.0041) [2024-06-18 17:58:30,500][18875] Fps is (10 sec: 39320.8, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 2991390720. Throughput: 0: 41939.9. Samples: 215435300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:58:30,501][18875] Avg episode reward: [(0, '0.590')] [2024-06-18 17:58:32,510][19107] Updated weights for policy 0, policy_version 182585 (0.0033) [2024-06-18 17:58:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 2991603712. Throughput: 0: 42118.7. Samples: 215688300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:58:35,500][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 17:58:35,536][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000182594_2991620096.pth... 
[2024-06-18 17:58:35,591][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000181981_2981576704.pth [2024-06-18 17:58:35,836][19107] Updated weights for policy 0, policy_version 182595 (0.0039) [2024-06-18 17:58:40,347][19107] Updated weights for policy 0, policy_version 182605 (0.0030) [2024-06-18 17:58:40,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 2991800320. Throughput: 0: 42136.9. Samples: 215941740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:58:40,501][18875] Avg episode reward: [(0, '0.259')] [2024-06-18 17:58:43,620][19107] Updated weights for policy 0, policy_version 182615 (0.0050) [2024-06-18 17:58:45,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 2992013312. Throughput: 0: 41816.5. Samples: 216061860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:58:45,501][18875] Avg episode reward: [(0, '0.321')] [2024-06-18 17:58:48,239][19107] Updated weights for policy 0, policy_version 182625 (0.0035) [2024-06-18 17:58:50,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 2992226304. Throughput: 0: 41792.2. Samples: 216310780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:58:50,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 17:58:51,423][19107] Updated weights for policy 0, policy_version 182635 (0.0035) [2024-06-18 17:58:55,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 2992422912. Throughput: 0: 41867.1. Samples: 216564680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:58:55,501][18875] Avg episode reward: [(0, '0.457')] [2024-06-18 17:58:56,143][19107] Updated weights for policy 0, policy_version 182645 (0.0033) [2024-06-18 17:58:59,309][19107] Updated weights for policy 0, policy_version 182655 (0.0047) [2024-06-18 17:59:00,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 2992635904. Throughput: 0: 41895.5. Samples: 216691300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:59:00,501][18875] Avg episode reward: [(0, '0.585')] [2024-06-18 17:59:03,957][19107] Updated weights for policy 0, policy_version 182665 (0.0034) [2024-06-18 17:59:05,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41781.4, 300 sec: 41821.6). Total num frames: 2992848896. Throughput: 0: 41858.2. Samples: 216943860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:59:05,501][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 17:59:06,831][19087] Signal inference workers to stop experience collection... (3150 times) [2024-06-18 17:59:06,864][19107] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-18 17:59:06,886][19087] Signal inference workers to resume experience collection... (3150 times) [2024-06-18 17:59:06,887][19107] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-18 17:59:07,317][19107] Updated weights for policy 0, policy_version 182675 (0.0044) [2024-06-18 17:59:10,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 2993045504. Throughput: 0: 41718.2. Samples: 217190420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:59:10,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 17:59:11,825][19107] Updated weights for policy 0, policy_version 182685 (0.0037) [2024-06-18 17:59:15,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 2993258496. Throughput: 0: 41673.0. 
Samples: 217310580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:59:15,501][18875] Avg episode reward: [(0, '0.361')] [2024-06-18 17:59:15,594][19107] Updated weights for policy 0, policy_version 182695 (0.0039) [2024-06-18 17:59:19,940][19107] Updated weights for policy 0, policy_version 182705 (0.0035) [2024-06-18 17:59:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 2993471488. Throughput: 0: 41790.6. Samples: 217568880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-18 17:59:20,501][18875] Avg episode reward: [(0, '0.419')] [2024-06-18 17:59:23,291][19107] Updated weights for policy 0, policy_version 182715 (0.0035) [2024-06-18 17:59:25,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 2993684480. Throughput: 0: 41510.7. Samples: 217809720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 17:59:25,501][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 17:59:27,868][19107] Updated weights for policy 0, policy_version 182725 (0.0037) [2024-06-18 17:59:30,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 2993881088. Throughput: 0: 41684.5. Samples: 217937660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 17:59:30,501][18875] Avg episode reward: [(0, '0.364')] [2024-06-18 17:59:31,503][19107] Updated weights for policy 0, policy_version 182735 (0.0029) [2024-06-18 17:59:35,500][18875] Fps is (10 sec: 37683.1, 60 sec: 40960.0, 300 sec: 41654.3). Total num frames: 2994061312. Throughput: 0: 41577.8. Samples: 218181780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 17:59:35,501][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 17:59:35,683][19107] Updated weights for policy 0, policy_version 182745 (0.0034) [2024-06-18 17:59:39,246][19107] Updated weights for policy 0, policy_version 182755 (0.0032) [2024-06-18 17:59:40,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 2994307072. Throughput: 0: 41370.2. Samples: 218426340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 17:59:40,501][18875] Avg episode reward: [(0, '0.790')] [2024-06-18 17:59:43,514][19107] Updated weights for policy 0, policy_version 182765 (0.0043) [2024-06-18 17:59:45,500][18875] Fps is (10 sec: 45874.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 2994520064. Throughput: 0: 41565.7. Samples: 218561760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 17:59:45,502][18875] Avg episode reward: [(0, '0.536')] [2024-06-18 17:59:46,959][19107] Updated weights for policy 0, policy_version 182775 (0.0032) [2024-06-18 17:59:50,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 2994700288. Throughput: 0: 41381.2. Samples: 218806020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 17:59:50,501][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 17:59:51,315][19107] Updated weights for policy 0, policy_version 182785 (0.0036) [2024-06-18 17:59:54,811][19107] Updated weights for policy 0, policy_version 182795 (0.0037) [2024-06-18 17:59:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 2994929664. Throughput: 0: 41413.3. Samples: 219054020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 17:59:55,501][18875] Avg episode reward: [(0, '0.281')] [2024-06-18 17:59:59,171][19107] Updated weights for policy 0, policy_version 182805 (0.0049) [2024-06-18 18:00:00,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 2995142656. Throughput: 0: 41687.2. Samples: 219186500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 18:00:00,501][18875] Avg episode reward: [(0, '0.370')] [2024-06-18 18:00:02,884][19107] Updated weights for policy 0, policy_version 182815 (0.0050) [2024-06-18 18:00:05,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 2995339264. Throughput: 0: 41392.5. Samples: 219431540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 18:00:05,501][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 18:00:06,841][19107] Updated weights for policy 0, policy_version 182825 (0.0042) [2024-06-18 18:00:10,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 2995535872. Throughput: 0: 41612.5. Samples: 219682280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 18:00:10,500][18875] Avg episode reward: [(0, '0.372')] [2024-06-18 18:00:10,717][19107] Updated weights for policy 0, policy_version 182835 (0.0029) [2024-06-18 18:00:14,928][19107] Updated weights for policy 0, policy_version 182845 (0.0035) [2024-06-18 18:00:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41765.8). Total num frames: 2995748864. Throughput: 0: 41501.2. Samples: 219805220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 18:00:15,501][18875] Avg episode reward: [(0, '0.363')] [2024-06-18 18:00:18,548][19107] Updated weights for policy 0, policy_version 182855 (0.0040) [2024-06-18 18:00:20,500][18875] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 2995978240. Throughput: 0: 41650.1. Samples: 220056040. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 18:00:20,501][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 18:00:22,775][19107] Updated weights for policy 0, policy_version 182865 (0.0021) [2024-06-18 18:00:25,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 2996174848. Throughput: 0: 41847.6. Samples: 220309480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 18:00:25,501][18875] Avg episode reward: [(0, '0.476')] [2024-06-18 18:00:26,439][19107] Updated weights for policy 0, policy_version 182875 (0.0039) [2024-06-18 18:00:29,389][19087] Signal inference workers to stop experience collection... (3200 times) [2024-06-18 18:00:29,390][19087] Signal inference workers to resume experience collection... (3200 times) [2024-06-18 18:00:29,429][19107] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-18 18:00:29,429][19107] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-18 18:00:30,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 2996371456. Throughput: 0: 41509.9. Samples: 220429700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:00:30,501][18875] Avg episode reward: [(0, '0.411')] [2024-06-18 18:00:30,613][19107] Updated weights for policy 0, policy_version 182885 (0.0027) [2024-06-18 18:00:34,272][19107] Updated weights for policy 0, policy_version 182895 (0.0041) [2024-06-18 18:00:35,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 2996617216. Throughput: 0: 41810.3. Samples: 220687480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:00:35,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 18:00:35,641][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000182900_2996633600.pth... 
[2024-06-18 18:00:35,698][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000182287_2986590208.pth [2024-06-18 18:00:38,470][19107] Updated weights for policy 0, policy_version 182905 (0.0034) [2024-06-18 18:00:40,504][18875] Fps is (10 sec: 44220.3, 60 sec: 41776.7, 300 sec: 41764.8). Total num frames: 2996813824. Throughput: 0: 41919.4. Samples: 220940540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:00:40,505][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 18:00:41,982][19107] Updated weights for policy 0, policy_version 182915 (0.0033) [2024-06-18 18:00:45,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41765.7). Total num frames: 2997010432. Throughput: 0: 41610.1. Samples: 221058960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:00:45,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 18:00:46,282][19107] Updated weights for policy 0, policy_version 182925 (0.0049) [2024-06-18 18:00:49,897][19107] Updated weights for policy 0, policy_version 182935 (0.0028) [2024-06-18 18:00:50,500][18875] Fps is (10 sec: 40974.9, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 2997223424. Throughput: 0: 41747.1. Samples: 221310160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:00:50,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 18:00:54,087][19107] Updated weights for policy 0, policy_version 182945 (0.0026) [2024-06-18 18:00:55,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 2997420032. Throughput: 0: 41836.0. Samples: 221564900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:00:55,501][18875] Avg episode reward: [(0, '0.348')] [2024-06-18 18:00:57,762][19107] Updated weights for policy 0, policy_version 182955 (0.0029) [2024-06-18 18:01:00,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41711.2). Total num frames: 2997633024. Throughput: 0: 41787.1. Samples: 221685640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:01:00,501][18875] Avg episode reward: [(0, '0.380')] [2024-06-18 18:01:01,676][19107] Updated weights for policy 0, policy_version 182965 (0.0041) [2024-06-18 18:01:05,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 2997846016. Throughput: 0: 41843.2. Samples: 221938980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:01:05,501][18875] Avg episode reward: [(0, '0.479')] [2024-06-18 18:01:05,644][19107] Updated weights for policy 0, policy_version 182975 (0.0040) [2024-06-18 18:01:09,437][19107] Updated weights for policy 0, policy_version 182985 (0.0035) [2024-06-18 18:01:10,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 2998059008. Throughput: 0: 41965.3. Samples: 222197920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:01:10,501][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 18:01:13,422][19107] Updated weights for policy 0, policy_version 182995 (0.0024) [2024-06-18 18:01:15,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 2998272000. Throughput: 0: 42004.4. Samples: 222319900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:01:15,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 18:01:17,233][19107] Updated weights for policy 0, policy_version 183005 (0.0047) [2024-06-18 18:01:20,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 2998501376. Throughput: 0: 41943.5. Samples: 222574940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:01:20,509][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 18:01:21,205][19107] Updated weights for policy 0, policy_version 183015 (0.0041) [2024-06-18 18:01:24,978][19107] Updated weights for policy 0, policy_version 183025 (0.0034) [2024-06-18 18:01:25,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41765.3). 
Total num frames: 2998697984. Throughput: 0: 41856.8. Samples: 222823940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:01:25,512][18875] Avg episode reward: [(0, '0.757')] [2024-06-18 18:01:28,891][19107] Updated weights for policy 0, policy_version 183035 (0.0038) [2024-06-18 18:01:30,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 2998910976. Throughput: 0: 41968.9. Samples: 222947560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:01:30,501][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 18:01:33,011][19107] Updated weights for policy 0, policy_version 183045 (0.0028) [2024-06-18 18:01:35,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41821.4). Total num frames: 2999107584. Throughput: 0: 41996.9. Samples: 223200020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:01:35,501][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 18:01:36,538][19107] Updated weights for policy 0, policy_version 183055 (0.0034) [2024-06-18 18:01:40,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41781.8, 300 sec: 41765.3). Total num frames: 2999320576. Throughput: 0: 42018.6. Samples: 223455740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:01:40,501][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 18:01:40,655][19107] Updated weights for policy 0, policy_version 183065 (0.0038) [2024-06-18 18:01:45,028][19107] Updated weights for policy 0, policy_version 183075 (0.0040) [2024-06-18 18:01:45,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 2999517184. Throughput: 0: 42112.9. Samples: 223580720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:01:45,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 18:01:48,531][19107] Updated weights for policy 0, policy_version 183085 (0.0029) [2024-06-18 18:01:50,502][18875] Fps is (10 sec: 42592.2, 60 sec: 42051.3, 300 sec: 41820.7). 
Total num frames: 2999746560. Throughput: 0: 41974.6. Samples: 223827900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:01:50,502][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 18:01:52,991][19107] Updated weights for policy 0, policy_version 183095 (0.0032) [2024-06-18 18:01:55,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 2999943168. Throughput: 0: 41690.7. Samples: 224074000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:01:55,501][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 18:01:56,387][19107] Updated weights for policy 0, policy_version 183105 (0.0032) [2024-06-18 18:02:00,500][18875] Fps is (10 sec: 37688.8, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 3000123392. Throughput: 0: 41747.2. Samples: 224198520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:02:00,501][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 18:02:00,813][19107] Updated weights for policy 0, policy_version 183115 (0.0029) [2024-06-18 18:02:04,305][19107] Updated weights for policy 0, policy_version 183125 (0.0055) [2024-06-18 18:02:05,320][19087] Signal inference workers to stop experience collection... (3250 times) [2024-06-18 18:02:05,321][19087] Signal inference workers to resume experience collection... (3250 times) [2024-06-18 18:02:05,349][19107] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-18 18:02:05,349][19107] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-18 18:02:05,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41765.8). Total num frames: 3000369152. Throughput: 0: 41597.0. Samples: 224446800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:02:05,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 18:02:08,670][19107] Updated weights for policy 0, policy_version 183135 (0.0040) [2024-06-18 18:02:10,500][18875] Fps is (10 sec: 45874.4, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3000582144. Throughput: 0: 41665.2. Samples: 224698880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:02:10,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 18:02:12,177][19107] Updated weights for policy 0, policy_version 183145 (0.0039) [2024-06-18 18:02:15,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3000762368. Throughput: 0: 41784.9. Samples: 224827880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:02:15,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 18:02:16,653][19107] Updated weights for policy 0, policy_version 183155 (0.0037) [2024-06-18 18:02:20,014][19107] Updated weights for policy 0, policy_version 183165 (0.0044) [2024-06-18 18:02:20,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 3000975360. Throughput: 0: 41522.7. Samples: 225068540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:02:20,501][18875] Avg episode reward: [(0, '0.689')] [2024-06-18 18:02:24,595][19107] Updated weights for policy 0, policy_version 183175 (0.0029) [2024-06-18 18:02:25,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 3001188352. Throughput: 0: 41577.2. Samples: 225326720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:02:25,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 18:02:28,222][19107] Updated weights for policy 0, policy_version 183185 (0.0043) [2024-06-18 18:02:30,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3001401344. Throughput: 0: 41608.4. Samples: 225453100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:02:30,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 18:02:32,179][19107] Updated weights for policy 0, policy_version 183195 (0.0035) [2024-06-18 18:02:35,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3001614336. Throughput: 0: 41550.7. Samples: 225697620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:02:35,501][18875] Avg episode reward: [(0, '0.699')] [2024-06-18 18:02:35,527][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000183204_3001614336.pth... [2024-06-18 18:02:35,595][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000182594_2991620096.pth [2024-06-18 18:02:35,878][19107] Updated weights for policy 0, policy_version 183205 (0.0031) [2024-06-18 18:02:39,968][19107] Updated weights for policy 0, policy_version 183215 (0.0032) [2024-06-18 18:02:40,501][18875] Fps is (10 sec: 40957.9, 60 sec: 41505.7, 300 sec: 41765.2). Total num frames: 3001810944. Throughput: 0: 41692.4. Samples: 225950180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:02:40,501][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 18:02:43,480][19107] Updated weights for policy 0, policy_version 183225 (0.0027) [2024-06-18 18:02:45,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3002023936. Throughput: 0: 41724.8. Samples: 226076140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:02:45,501][18875] Avg episode reward: [(0, '0.498')] [2024-06-18 18:02:47,771][19107] Updated weights for policy 0, policy_version 183235 (0.0034) [2024-06-18 18:02:50,500][18875] Fps is (10 sec: 44238.9, 60 sec: 41780.1, 300 sec: 41765.3). Total num frames: 3002253312. Throughput: 0: 41831.4. Samples: 226329220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:02:50,501][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 18:02:51,140][19107] Updated weights for policy 0, policy_version 183245 (0.0036) [2024-06-18 18:02:55,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 3002417152. Throughput: 0: 41876.0. Samples: 226583300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:02:55,501][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 18:02:55,884][19107] Updated weights for policy 0, policy_version 183255 (0.0033) [2024-06-18 18:02:59,244][19107] Updated weights for policy 0, policy_version 183265 (0.0040) [2024-06-18 18:03:00,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 41765.8). Total num frames: 3002662912. Throughput: 0: 41558.8. Samples: 226698020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:00,501][18875] Avg episode reward: [(0, '0.339')] [2024-06-18 18:03:03,550][19107] Updated weights for policy 0, policy_version 183275 (0.0029) [2024-06-18 18:03:05,500][18875] Fps is (10 sec: 47514.1, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 3002892288. Throughput: 0: 42005.8. Samples: 226958800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:05,501][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 18:03:07,185][19107] Updated weights for policy 0, policy_version 183285 (0.0049) [2024-06-18 18:03:10,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 3003056128. Throughput: 0: 41876.6. Samples: 227211160. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:10,503][18875] Avg episode reward: [(0, '0.424')] [2024-06-18 18:03:11,295][19107] Updated weights for policy 0, policy_version 183295 (0.0039) [2024-06-18 18:03:15,050][19107] Updated weights for policy 0, policy_version 183305 (0.0026) [2024-06-18 18:03:15,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 3003269120. Throughput: 0: 41702.4. Samples: 227329700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:15,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 18:03:19,103][19107] Updated weights for policy 0, policy_version 183315 (0.0037) [2024-06-18 18:03:20,149][19087] Signal inference workers to stop experience collection... (3300 times) [2024-06-18 18:03:20,201][19107] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-18 18:03:20,206][19087] Signal inference workers to resume experience collection... (3300 times) [2024-06-18 18:03:20,214][19107] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-18 18:03:20,504][18875] Fps is (10 sec: 45858.4, 60 sec: 42322.8, 300 sec: 41764.8). Total num frames: 3003514880. Throughput: 0: 41876.1. Samples: 227582200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:20,505][18875] Avg episode reward: [(0, '0.756')] [2024-06-18 18:03:22,812][19107] Updated weights for policy 0, policy_version 183325 (0.0044) [2024-06-18 18:03:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3003678720. Throughput: 0: 41873.4. Samples: 227834460. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:25,501][18875] Avg episode reward: [(0, '0.722')] [2024-06-18 18:03:27,257][19107] Updated weights for policy 0, policy_version 183335 (0.0025) [2024-06-18 18:03:30,474][19107] Updated weights for policy 0, policy_version 183345 (0.0034) [2024-06-18 18:03:30,500][18875] Fps is (10 sec: 40974.5, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3003924480. Throughput: 0: 41732.9. Samples: 227954120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:30,501][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 18:03:34,878][19107] Updated weights for policy 0, policy_version 183355 (0.0042) [2024-06-18 18:03:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3004104704. Throughput: 0: 41847.3. Samples: 228212340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:35,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 18:03:38,705][19107] Updated weights for policy 0, policy_version 183365 (0.0029) [2024-06-18 18:03:40,504][18875] Fps is (10 sec: 39307.7, 60 sec: 41777.1, 300 sec: 41709.3). Total num frames: 3004317696. Throughput: 0: 41619.4. Samples: 228456320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:40,504][18875] Avg episode reward: [(0, '0.260')] [2024-06-18 18:03:42,651][19107] Updated weights for policy 0, policy_version 183375 (0.0030) [2024-06-18 18:03:45,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3004547072. Throughput: 0: 41881.3. Samples: 228582680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 18:03:45,501][18875] Avg episode reward: [(0, '0.332')] [2024-06-18 18:03:46,349][19107] Updated weights for policy 0, policy_version 183385 (0.0043) [2024-06-18 18:03:50,500][18875] Fps is (10 sec: 40975.2, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 3004727296. Throughput: 0: 41669.9. Samples: 228833940. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:03:50,501][18875] Avg episode reward: [(0, '0.328')] [2024-06-18 18:03:50,638][19107] Updated weights for policy 0, policy_version 183395 (0.0044) [2024-06-18 18:03:54,126][19107] Updated weights for policy 0, policy_version 183405 (0.0034) [2024-06-18 18:03:55,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 3004956672. Throughput: 0: 41625.3. Samples: 229084300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:03:55,501][18875] Avg episode reward: [(0, '0.298')] [2024-06-18 18:03:58,479][19107] Updated weights for policy 0, policy_version 183415 (0.0030) [2024-06-18 18:04:00,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3005153280. Throughput: 0: 41813.7. Samples: 229211320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:00,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 18:04:02,085][19107] Updated weights for policy 0, policy_version 183425 (0.0043) [2024-06-18 18:04:05,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3005366272. Throughput: 0: 41706.1. Samples: 229458820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:05,501][18875] Avg episode reward: [(0, '0.757')] [2024-06-18 18:04:06,295][19107] Updated weights for policy 0, policy_version 183435 (0.0028) [2024-06-18 18:04:10,013][19107] Updated weights for policy 0, policy_version 183445 (0.0032) [2024-06-18 18:04:10,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3005579264. Throughput: 0: 41519.2. Samples: 229702820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:10,500][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 18:04:14,053][19107] Updated weights for policy 0, policy_version 183455 (0.0051) [2024-06-18 18:04:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41765.3). 
Total num frames: 3005792256. Throughput: 0: 41704.6. Samples: 229830820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:15,501][18875] Avg episode reward: [(0, '0.313')] [2024-06-18 18:04:17,875][19107] Updated weights for policy 0, policy_version 183465 (0.0039) [2024-06-18 18:04:20,500][18875] Fps is (10 sec: 39321.5, 60 sec: 40962.5, 300 sec: 41654.2). Total num frames: 3005972480. Throughput: 0: 41470.3. Samples: 230078500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:20,500][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 18:04:22,074][19107] Updated weights for policy 0, policy_version 183475 (0.0036) [2024-06-18 18:04:25,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3006185472. Throughput: 0: 41435.9. Samples: 230320780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:25,500][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 18:04:25,767][19107] Updated weights for policy 0, policy_version 183485 (0.0034) [2024-06-18 18:04:30,133][19107] Updated weights for policy 0, policy_version 183495 (0.0038) [2024-06-18 18:04:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41233.2, 300 sec: 41820.9). Total num frames: 3006398464. Throughput: 0: 41361.3. Samples: 230443940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:30,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 18:04:33,596][19107] Updated weights for policy 0, policy_version 183505 (0.0030) [2024-06-18 18:04:35,504][18875] Fps is (10 sec: 40944.9, 60 sec: 41503.6, 300 sec: 41653.7). Total num frames: 3006595072. Throughput: 0: 41203.8. Samples: 230688260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:35,505][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 18:04:35,519][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000183508_3006595072.pth... 
[2024-06-18 18:04:35,574][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000182900_2996633600.pth [2024-06-18 18:04:37,821][19107] Updated weights for policy 0, policy_version 183515 (0.0035) [2024-06-18 18:04:40,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41235.6, 300 sec: 41598.7). Total num frames: 3006791680. Throughput: 0: 41342.7. Samples: 230944720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:40,501][18875] Avg episode reward: [(0, '0.380')] [2024-06-18 18:04:41,331][19107] Updated weights for policy 0, policy_version 183525 (0.0026) [2024-06-18 18:04:45,500][18875] Fps is (10 sec: 42613.4, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 3007021056. Throughput: 0: 41410.1. Samples: 231074780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:45,501][18875] Avg episode reward: [(0, '0.199')] [2024-06-18 18:04:45,525][19107] Updated weights for policy 0, policy_version 183535 (0.0039) [2024-06-18 18:04:49,395][19107] Updated weights for policy 0, policy_version 183545 (0.0032) [2024-06-18 18:04:50,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 3007217664. Throughput: 0: 41392.9. Samples: 231321500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 18:04:50,500][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 18:04:53,363][19107] Updated weights for policy 0, policy_version 183555 (0.0039) [2024-06-18 18:04:55,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3007447040. Throughput: 0: 41484.8. Samples: 231569640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:04:55,501][18875] Avg episode reward: [(0, '0.290')] [2024-06-18 18:04:57,258][19107] Updated weights for policy 0, policy_version 183565 (0.0039) [2024-06-18 18:04:59,587][19087] Signal inference workers to stop experience collection... 
(3350 times) [2024-06-18 18:04:59,618][19107] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-06-18 18:04:59,646][19087] Signal inference workers to resume experience collection... (3350 times) [2024-06-18 18:04:59,646][19107] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-06-18 18:05:00,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3007643648. Throughput: 0: 41551.0. Samples: 231700620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:00,504][18875] Avg episode reward: [(0, '0.308')] [2024-06-18 18:05:01,336][19107] Updated weights for policy 0, policy_version 183575 (0.0027) [2024-06-18 18:05:05,170][19107] Updated weights for policy 0, policy_version 183585 (0.0042) [2024-06-18 18:05:05,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3007873024. Throughput: 0: 41613.2. Samples: 231951100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:05,501][18875] Avg episode reward: [(0, '0.308')] [2024-06-18 18:05:09,154][19107] Updated weights for policy 0, policy_version 183595 (0.0033) [2024-06-18 18:05:10,500][18875] Fps is (10 sec: 44237.7, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3008086016. Throughput: 0: 41654.3. Samples: 232195220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:10,500][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 18:05:12,994][19107] Updated weights for policy 0, policy_version 183605 (0.0040) [2024-06-18 18:05:15,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 3008266240. Throughput: 0: 41772.4. Samples: 232323700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:15,501][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 18:05:17,033][19107] Updated weights for policy 0, policy_version 183615 (0.0037) [2024-06-18 18:05:20,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 3008479232. Throughput: 0: 41944.6. Samples: 232575620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:20,503][18875] Avg episode reward: [(0, '0.261')] [2024-06-18 18:05:20,762][19107] Updated weights for policy 0, policy_version 183625 (0.0046) [2024-06-18 18:05:24,813][19107] Updated weights for policy 0, policy_version 183635 (0.0033) [2024-06-18 18:05:25,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3008692224. Throughput: 0: 41748.8. Samples: 232823420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:25,501][18875] Avg episode reward: [(0, '0.262')] [2024-06-18 18:05:28,783][19107] Updated weights for policy 0, policy_version 183645 (0.0030) [2024-06-18 18:05:30,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3008888832. Throughput: 0: 41758.8. Samples: 232953920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:30,500][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 18:05:32,875][19107] Updated weights for policy 0, policy_version 183655 (0.0045) [2024-06-18 18:05:35,500][18875] Fps is (10 sec: 39322.2, 60 sec: 41508.6, 300 sec: 41599.2). Total num frames: 3009085440. Throughput: 0: 41647.5. Samples: 233195640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:35,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 18:05:36,551][19107] Updated weights for policy 0, policy_version 183665 (0.0044) [2024-06-18 18:05:40,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 3009298432. Throughput: 0: 41737.8. Samples: 233447840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:40,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 18:05:40,713][19107] Updated weights for policy 0, policy_version 183675 (0.0029) [2024-06-18 18:05:44,506][19107] Updated weights for policy 0, policy_version 183685 (0.0037) [2024-06-18 18:05:45,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3009511424. Throughput: 0: 41612.0. Samples: 233573160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:45,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 18:05:48,546][19107] Updated weights for policy 0, policy_version 183695 (0.0039) [2024-06-18 18:05:50,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 3009724416. Throughput: 0: 41532.9. Samples: 233820080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:50,501][18875] Avg episode reward: [(0, '0.391')] [2024-06-18 18:05:52,449][19107] Updated weights for policy 0, policy_version 183705 (0.0036) [2024-06-18 18:05:55,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3009937408. Throughput: 0: 41740.3. Samples: 234073540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-18 18:05:55,501][18875] Avg episode reward: [(0, '0.371')] [2024-06-18 18:05:56,215][19107] Updated weights for policy 0, policy_version 183715 (0.0034) [2024-06-18 18:06:00,369][19107] Updated weights for policy 0, policy_version 183725 (0.0034) [2024-06-18 18:06:00,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3010150400. Throughput: 0: 41676.3. Samples: 234199140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:00,501][18875] Avg episode reward: [(0, '0.354')] [2024-06-18 18:06:03,985][19107] Updated weights for policy 0, policy_version 183735 (0.0034) [2024-06-18 18:06:05,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3010363392. Throughput: 0: 41576.0. Samples: 234446540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:05,501][18875] Avg episode reward: [(0, '0.419')] [2024-06-18 18:06:08,150][19107] Updated weights for policy 0, policy_version 183745 (0.0033) [2024-06-18 18:06:10,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 3010560000. Throughput: 0: 41715.7. Samples: 234700620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:10,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 18:06:11,807][19107] Updated weights for policy 0, policy_version 183755 (0.0041) [2024-06-18 18:06:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3010772992. Throughput: 0: 41581.2. Samples: 234825080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:15,501][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 18:06:15,946][19107] Updated weights for policy 0, policy_version 183765 (0.0042) [2024-06-18 18:06:19,672][19107] Updated weights for policy 0, policy_version 183775 (0.0032) [2024-06-18 18:06:20,501][18875] Fps is (10 sec: 42595.6, 60 sec: 41778.8, 300 sec: 41654.1). Total num frames: 3010985984. Throughput: 0: 41809.2. Samples: 235077080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:20,502][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 18:06:23,798][19107] Updated weights for policy 0, policy_version 183785 (0.0035) [2024-06-18 18:06:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 3011182592. Throughput: 0: 41735.9. Samples: 235325960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:25,503][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 18:06:27,777][19107] Updated weights for policy 0, policy_version 183795 (0.0033) [2024-06-18 18:06:30,500][18875] Fps is (10 sec: 40962.8, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 3011395584. Throughput: 0: 41755.6. Samples: 235452160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:30,501][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 18:06:31,674][19107] Updated weights for policy 0, policy_version 183805 (0.0034) [2024-06-18 18:06:35,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 3011608576. Throughput: 0: 41842.5. Samples: 235703000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:35,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 18:06:35,527][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000183814_3011608576.pth... [2024-06-18 18:06:35,575][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000183204_3001614336.pth [2024-06-18 18:06:35,769][19107] Updated weights for policy 0, policy_version 183815 (0.0031) [2024-06-18 18:06:39,175][19087] Signal inference workers to stop experience collection... (3400 times) [2024-06-18 18:06:39,209][19107] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-06-18 18:06:39,286][19087] Signal inference workers to resume experience collection... (3400 times) [2024-06-18 18:06:39,287][19107] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-06-18 18:06:39,427][19107] Updated weights for policy 0, policy_version 183825 (0.0042) [2024-06-18 18:06:40,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 3011821568. Throughput: 0: 41643.2. Samples: 235947480. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:40,501][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 18:06:43,710][19107] Updated weights for policy 0, policy_version 183835 (0.0039) [2024-06-18 18:06:45,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41779.3, 300 sec: 41598.9). Total num frames: 3012018176. Throughput: 0: 41644.6. Samples: 236073140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:45,500][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 18:06:47,264][19107] Updated weights for policy 0, policy_version 183845 (0.0036) [2024-06-18 18:06:50,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 3012231168. Throughput: 0: 41728.8. Samples: 236324340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:50,501][18875] Avg episode reward: [(0, '0.326')] [2024-06-18 18:06:51,436][19107] Updated weights for policy 0, policy_version 183855 (0.0034) [2024-06-18 18:06:55,270][19107] Updated weights for policy 0, policy_version 183865 (0.0034) [2024-06-18 18:06:55,500][18875] Fps is (10 sec: 42597.5, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3012444160. Throughput: 0: 41522.1. Samples: 236569120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:06:55,501][18875] Avg episode reward: [(0, '0.281')] [2024-06-18 18:06:59,466][19107] Updated weights for policy 0, policy_version 183875 (0.0027) [2024-06-18 18:07:00,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 3012640768. Throughput: 0: 41549.8. Samples: 236694820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:07:00,501][18875] Avg episode reward: [(0, '0.475')] [2024-06-18 18:07:02,989][19107] Updated weights for policy 0, policy_version 183885 (0.0037) [2024-06-18 18:07:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3012853760. Throughput: 0: 41462.3. Samples: 236942860. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:05,501][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 18:07:07,462][19107] Updated weights for policy 0, policy_version 183895 (0.0033) [2024-06-18 18:07:10,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 3013050368. Throughput: 0: 41457.8. Samples: 237191560. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:10,501][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 18:07:10,886][19107] Updated weights for policy 0, policy_version 183905 (0.0028) [2024-06-18 18:07:15,284][19107] Updated weights for policy 0, policy_version 183915 (0.0038) [2024-06-18 18:07:15,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3013263360. Throughput: 0: 41378.2. Samples: 237314180. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:15,501][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 18:07:18,621][19107] Updated weights for policy 0, policy_version 183925 (0.0045) [2024-06-18 18:07:20,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.5, 300 sec: 41654.2). Total num frames: 3013476352. Throughput: 0: 41324.0. Samples: 237562580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:20,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 18:07:23,140][19107] Updated weights for policy 0, policy_version 183935 (0.0043) [2024-06-18 18:07:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3013672960. Throughput: 0: 41495.1. Samples: 237814760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:25,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 18:07:26,481][19107] Updated weights for policy 0, policy_version 183945 (0.0038) [2024-06-18 18:07:30,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 3013885952. Throughput: 0: 41516.7. Samples: 237941400. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:30,501][18875] Avg episode reward: [(0, '0.798')] [2024-06-18 18:07:31,095][19107] Updated weights for policy 0, policy_version 183955 (0.0035) [2024-06-18 18:07:34,256][19107] Updated weights for policy 0, policy_version 183965 (0.0041) [2024-06-18 18:07:35,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41654.3). Total num frames: 3014098944. Throughput: 0: 41301.7. Samples: 238182920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:35,501][18875] Avg episode reward: [(0, '0.784')] [2024-06-18 18:07:39,095][19107] Updated weights for policy 0, policy_version 183975 (0.0029) [2024-06-18 18:07:40,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 3014311936. Throughput: 0: 41558.7. Samples: 238439260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:40,501][18875] Avg episode reward: [(0, '0.476')] [2024-06-18 18:07:42,131][19107] Updated weights for policy 0, policy_version 183985 (0.0029) [2024-06-18 18:07:45,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.0, 300 sec: 41543.2). Total num frames: 3014508544. Throughput: 0: 41456.8. Samples: 238560380. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:45,501][18875] Avg episode reward: [(0, '0.608')] [2024-06-18 18:07:46,847][19107] Updated weights for policy 0, policy_version 183995 (0.0024) [2024-06-18 18:07:50,305][19107] Updated weights for policy 0, policy_version 184005 (0.0042) [2024-06-18 18:07:50,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3014737920. Throughput: 0: 41461.0. Samples: 238808600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:50,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 18:07:55,065][19107] Updated weights for policy 0, policy_version 184015 (0.0039) [2024-06-18 18:07:55,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41233.2, 300 sec: 41543.2). 
Total num frames: 3014918144. Throughput: 0: 41601.4. Samples: 239063620. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:07:55,500][18875] Avg episode reward: [(0, '0.816')] [2024-06-18 18:07:58,150][19107] Updated weights for policy 0, policy_version 184025 (0.0034) [2024-06-18 18:08:00,500][18875] Fps is (10 sec: 37682.7, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 3015114752. Throughput: 0: 41473.3. Samples: 239180480. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:08:00,512][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 18:08:02,802][19107] Updated weights for policy 0, policy_version 184035 (0.0048) [2024-06-18 18:08:03,312][19087] Signal inference workers to stop experience collection... (3450 times) [2024-06-18 18:08:03,312][19087] Signal inference workers to resume experience collection... (3450 times) [2024-06-18 18:08:03,342][19107] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-06-18 18:08:03,342][19107] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-06-18 18:08:05,500][18875] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3015360512. Throughput: 0: 41533.3. Samples: 239431580. Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:08:05,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 18:08:05,880][19107] Updated weights for policy 0, policy_version 184045 (0.0029) [2024-06-18 18:08:10,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 3015540736. Throughput: 0: 41615.2. Samples: 239687440. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 22.0) [2024-06-18 18:08:10,501][18875] Avg episode reward: [(0, '0.488')] [2024-06-18 18:08:10,566][19107] Updated weights for policy 0, policy_version 184055 (0.0044) [2024-06-18 18:08:13,678][19107] Updated weights for policy 0, policy_version 184065 (0.0031) [2024-06-18 18:08:15,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41488.1). Total num frames: 3015753728. Throughput: 0: 41482.7. Samples: 239808120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:08:15,501][18875] Avg episode reward: [(0, '0.642')] [2024-06-18 18:08:18,270][19107] Updated weights for policy 0, policy_version 184075 (0.0040) [2024-06-18 18:08:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3015966720. Throughput: 0: 41741.9. Samples: 240061300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:08:20,501][18875] Avg episode reward: [(0, '0.642')] [2024-06-18 18:08:21,735][19107] Updated weights for policy 0, policy_version 184085 (0.0042) [2024-06-18 18:08:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 3016163328. Throughput: 0: 41640.1. Samples: 240313060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:08:25,501][18875] Avg episode reward: [(0, '0.136')] [2024-06-18 18:08:26,082][19107] Updated weights for policy 0, policy_version 184095 (0.0043) [2024-06-18 18:08:29,842][19107] Updated weights for policy 0, policy_version 184105 (0.0041) [2024-06-18 18:08:30,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 3016392704. Throughput: 0: 41543.2. Samples: 240429820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:08:30,501][18875] Avg episode reward: [(0, '0.358')] [2024-06-18 18:08:33,906][19107] Updated weights for policy 0, policy_version 184115 (0.0035) [2024-06-18 18:08:35,500][18875] Fps is (10 sec: 45875.5, 60 sec: 42052.4, 300 sec: 41710.3). Total num frames: 3016622080. Throughput: 0: 41720.9. Samples: 240686040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:08:35,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 18:08:35,517][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000184120_3016622080.pth... [2024-06-18 18:08:35,573][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000183508_3006595072.pth [2024-06-18 18:08:37,505][19107] Updated weights for policy 0, policy_version 184125 (0.0036) [2024-06-18 18:08:40,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 41487.6). Total num frames: 3016785920. Throughput: 0: 41792.4. Samples: 240944280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:08:40,501][18875] Avg episode reward: [(0, '0.306')] [2024-06-18 18:08:41,615][19107] Updated weights for policy 0, policy_version 184135 (0.0039) [2024-06-18 18:08:45,500][18875] Fps is (10 sec: 39320.9, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 3017015296. Throughput: 0: 41810.2. Samples: 241061940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:08:45,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 18:08:45,648][19107] Updated weights for policy 0, policy_version 184145 (0.0040) [2024-06-18 18:08:49,720][19107] Updated weights for policy 0, policy_version 184155 (0.0040) [2024-06-18 18:08:50,504][18875] Fps is (10 sec: 44221.1, 60 sec: 41503.6, 300 sec: 41598.2). Total num frames: 3017228288. Throughput: 0: 41818.1. Samples: 241313540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:08:50,504][18875] Avg episode reward: [(0, '0.737')] [2024-06-18 18:08:53,459][19107] Updated weights for policy 0, policy_version 184165 (0.0032) [2024-06-18 18:08:55,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 3017408512. Throughput: 0: 41739.1. Samples: 241565700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:08:55,501][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 18:08:57,601][19107] Updated weights for policy 0, policy_version 184175 (0.0037) [2024-06-18 18:09:00,500][18875] Fps is (10 sec: 42613.0, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 3017654272. Throughput: 0: 41888.9. Samples: 241693120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:09:00,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 18:09:01,259][19107] Updated weights for policy 0, policy_version 184185 (0.0027) [2024-06-18 18:09:05,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41233.2, 300 sec: 41543.2). Total num frames: 3017834496. Throughput: 0: 41823.2. Samples: 241943340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:09:05,500][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 18:09:05,571][19107] Updated weights for policy 0, policy_version 184195 (0.0030) [2024-06-18 18:09:09,006][19107] Updated weights for policy 0, policy_version 184205 (0.0034) [2024-06-18 18:09:10,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 3018047488. Throughput: 0: 41715.9. Samples: 242190280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:09:10,501][18875] Avg episode reward: [(0, '0.441')] [2024-06-18 18:09:13,399][19107] Updated weights for policy 0, policy_version 184215 (0.0030) [2024-06-18 18:09:15,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3018276864. Throughput: 0: 42047.1. Samples: 242321940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 18:09:15,501][18875] Avg episode reward: [(0, '0.442')] [2024-06-18 18:09:17,111][19107] Updated weights for policy 0, policy_version 184225 (0.0053) [2024-06-18 18:09:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 3018457088. Throughput: 0: 41736.3. Samples: 242564180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:09:20,501][18875] Avg episode reward: [(0, '0.280')] [2024-06-18 18:09:21,336][19107] Updated weights for policy 0, policy_version 184235 (0.0029) [2024-06-18 18:09:24,784][19107] Updated weights for policy 0, policy_version 184245 (0.0030) [2024-06-18 18:09:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 3018686464. Throughput: 0: 41744.9. Samples: 242822800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:09:25,501][18875] Avg episode reward: [(0, '0.387')] [2024-06-18 18:09:29,056][19107] Updated weights for policy 0, policy_version 184255 (0.0044) [2024-06-18 18:09:30,500][18875] Fps is (10 sec: 45875.1, 60 sec: 42052.2, 300 sec: 41765.8). Total num frames: 3018915840. Throughput: 0: 41980.9. Samples: 242951080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:09:30,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 18:09:32,709][19107] Updated weights for policy 0, policy_version 184265 (0.0033) [2024-06-18 18:09:35,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 3019096064. Throughput: 0: 41871.3. Samples: 243197600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:09:35,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 18:09:36,861][19107] Updated weights for policy 0, policy_version 184275 (0.0032) [2024-06-18 18:09:40,500][18875] Fps is (10 sec: 39322.4, 60 sec: 42052.3, 300 sec: 41654.3). Total num frames: 3019309056. Throughput: 0: 41904.1. Samples: 243451380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:09:40,500][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 18:09:40,573][19107] Updated weights for policy 0, policy_version 184285 (0.0028) [2024-06-18 18:09:44,545][19107] Updated weights for policy 0, policy_version 184295 (0.0033) [2024-06-18 18:09:45,091][19087] Signal inference workers to stop experience collection... (3500 times) [2024-06-18 18:09:45,092][19087] Signal inference workers to resume experience collection... (3500 times) [2024-06-18 18:09:45,132][19107] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-06-18 18:09:45,132][19107] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-06-18 18:09:45,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3019538432. Throughput: 0: 41905.8. Samples: 243578880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:09:45,501][18875] Avg episode reward: [(0, '0.305')] [2024-06-18 18:09:48,342][19107] Updated weights for policy 0, policy_version 184305 (0.0034) [2024-06-18 18:09:50,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41781.6, 300 sec: 41654.2). Total num frames: 3019735040. Throughput: 0: 41963.4. Samples: 243831700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:09:50,501][18875] Avg episode reward: [(0, '0.310')] [2024-06-18 18:09:52,246][19107] Updated weights for policy 0, policy_version 184315 (0.0038) [2024-06-18 18:09:55,503][18875] Fps is (10 sec: 40950.4, 60 sec: 42323.6, 300 sec: 41709.4). Total num frames: 3019948032. Throughput: 0: 42003.1. Samples: 244080520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:09:55,503][18875] Avg episode reward: [(0, '0.331')] [2024-06-18 18:09:56,073][19107] Updated weights for policy 0, policy_version 184325 (0.0043) [2024-06-18 18:10:00,050][19107] Updated weights for policy 0, policy_version 184335 (0.0031) [2024-06-18 18:10:00,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 3020161024. Throughput: 0: 41882.3. Samples: 244206640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:10:00,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 18:10:03,799][19107] Updated weights for policy 0, policy_version 184345 (0.0026) [2024-06-18 18:10:05,500][18875] Fps is (10 sec: 42609.3, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 3020374016. Throughput: 0: 42069.5. Samples: 244457300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:10:05,500][18875] Avg episode reward: [(0, '0.325')] [2024-06-18 18:10:07,846][19107] Updated weights for policy 0, policy_version 184355 (0.0044) [2024-06-18 18:10:10,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3020570624. Throughput: 0: 41803.6. Samples: 244703960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:10:10,501][18875] Avg episode reward: [(0, '0.620')] [2024-06-18 18:10:11,601][19107] Updated weights for policy 0, policy_version 184365 (0.0026) [2024-06-18 18:10:15,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3020767232. Throughput: 0: 41761.5. Samples: 244830340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:10:15,501][18875] Avg episode reward: [(0, '0.749')] [2024-06-18 18:10:15,779][19107] Updated weights for policy 0, policy_version 184375 (0.0044) [2024-06-18 18:10:19,366][19107] Updated weights for policy 0, policy_version 184385 (0.0035) [2024-06-18 18:10:20,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 3020996608. Throughput: 0: 41857.2. Samples: 245081180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 18:10:20,504][18875] Avg episode reward: [(0, '0.756')] [2024-06-18 18:10:23,648][19107] Updated weights for policy 0, policy_version 184395 (0.0032) [2024-06-18 18:10:25,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3021209600. Throughput: 0: 41875.9. Samples: 245335800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:10:25,501][18875] Avg episode reward: [(0, '0.850')] [2024-06-18 18:10:27,360][19107] Updated weights for policy 0, policy_version 184405 (0.0029) [2024-06-18 18:10:30,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 3021389824. Throughput: 0: 41864.2. Samples: 245462760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:10:30,500][18875] Avg episode reward: [(0, '0.676')] [2024-06-18 18:10:31,239][19107] Updated weights for policy 0, policy_version 184415 (0.0042) [2024-06-18 18:10:35,376][19107] Updated weights for policy 0, policy_version 184425 (0.0044) [2024-06-18 18:10:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3021619200. Throughput: 0: 41909.8. Samples: 245717640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:10:35,501][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 18:10:35,556][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000184426_3021635584.pth... 
[2024-06-18 18:10:35,618][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000183814_3011608576.pth [2024-06-18 18:10:39,077][19107] Updated weights for policy 0, policy_version 184435 (0.0029) [2024-06-18 18:10:40,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3021832192. Throughput: 0: 41800.5. Samples: 245961440. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:10:40,501][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 18:10:43,101][19107] Updated weights for policy 0, policy_version 184445 (0.0045) [2024-06-18 18:10:45,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3022028800. Throughput: 0: 41747.8. Samples: 246085300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:10:45,501][18875] Avg episode reward: [(0, '0.530')] [2024-06-18 18:10:46,771][19107] Updated weights for policy 0, policy_version 184455 (0.0037) [2024-06-18 18:10:50,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3022241792. Throughput: 0: 41981.7. Samples: 246346480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:10:50,500][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 18:10:50,820][19107] Updated weights for policy 0, policy_version 184465 (0.0039) [2024-06-18 18:10:54,345][19107] Updated weights for policy 0, policy_version 184475 (0.0029) [2024-06-18 18:10:55,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42054.0, 300 sec: 41765.3). Total num frames: 3022471168. Throughput: 0: 41788.8. Samples: 246584460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:10:55,501][18875] Avg episode reward: [(0, '0.210')] [2024-06-18 18:10:58,076][19087] Signal inference workers to stop experience collection... (3550 times) [2024-06-18 18:10:58,076][19087] Signal inference workers to resume experience collection... 
(3550 times) [2024-06-18 18:10:58,093][19107] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-06-18 18:10:58,093][19107] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-06-18 18:10:58,714][19107] Updated weights for policy 0, policy_version 184485 (0.0035) [2024-06-18 18:11:00,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3022667776. Throughput: 0: 41903.1. Samples: 246715980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:11:00,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 18:11:02,362][19107] Updated weights for policy 0, policy_version 184495 (0.0031) [2024-06-18 18:11:05,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3022864384. Throughput: 0: 41853.9. Samples: 246964600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:11:05,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 18:11:06,489][19107] Updated weights for policy 0, policy_version 184505 (0.0041) [2024-06-18 18:11:10,186][19107] Updated weights for policy 0, policy_version 184515 (0.0036) [2024-06-18 18:11:10,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3023093760. Throughput: 0: 41692.4. Samples: 247211960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:11:10,504][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 18:11:14,361][19107] Updated weights for policy 0, policy_version 184525 (0.0050) [2024-06-18 18:11:15,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41709.9). Total num frames: 3023290368. Throughput: 0: 41812.7. Samples: 247344340. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:11:15,501][18875] Avg episode reward: [(0, '0.518')] [2024-06-18 18:11:18,012][19107] Updated weights for policy 0, policy_version 184535 (0.0038) [2024-06-18 18:11:20,500][18875] Fps is (10 sec: 37683.0, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 3023470592. Throughput: 0: 41574.6. Samples: 247588500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:11:20,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 18:11:21,973][19107] Updated weights for policy 0, policy_version 184545 (0.0040) [2024-06-18 18:11:25,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41709.7). Total num frames: 3023699968. Throughput: 0: 41803.0. Samples: 247842580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 18:11:25,501][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 18:11:25,819][19107] Updated weights for policy 0, policy_version 184555 (0.0032) [2024-06-18 18:11:29,858][19107] Updated weights for policy 0, policy_version 184565 (0.0028) [2024-06-18 18:11:30,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 3023929344. Throughput: 0: 41865.0. Samples: 247969220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:11:30,501][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 18:11:34,237][19107] Updated weights for policy 0, policy_version 184575 (0.0039) [2024-06-18 18:11:35,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 3024109568. Throughput: 0: 41584.9. Samples: 248217800. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:11:35,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 18:11:38,143][19107] Updated weights for policy 0, policy_version 184585 (0.0037) [2024-06-18 18:11:40,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3024338944. Throughput: 0: 41727.1. Samples: 248462180. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:11:40,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 18:11:42,203][19107] Updated weights for policy 0, policy_version 184595 (0.0037) [2024-06-18 18:11:45,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 3024519168. Throughput: 0: 41664.0. Samples: 248590860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:11:45,501][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 18:11:45,911][19107] Updated weights for policy 0, policy_version 184605 (0.0037) [2024-06-18 18:11:49,936][19107] Updated weights for policy 0, policy_version 184615 (0.0022) [2024-06-18 18:11:50,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3024748544. Throughput: 0: 41790.7. Samples: 248845180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:11:50,501][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 18:11:53,592][19107] Updated weights for policy 0, policy_version 184625 (0.0045) [2024-06-18 18:11:55,500][18875] Fps is (10 sec: 47513.2, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3024994304. Throughput: 0: 41804.4. Samples: 249093160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:11:55,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 18:11:57,862][19107] Updated weights for policy 0, policy_version 184635 (0.0037) [2024-06-18 18:12:00,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3025158144. Throughput: 0: 41815.2. Samples: 249226020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:12:00,501][18875] Avg episode reward: [(0, '0.340')] [2024-06-18 18:12:01,377][19107] Updated weights for policy 0, policy_version 184645 (0.0030) [2024-06-18 18:12:05,500][18875] Fps is (10 sec: 36045.1, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3025354752. Throughput: 0: 41880.1. Samples: 249473100. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:12:05,501][18875] Avg episode reward: [(0, '0.338')] [2024-06-18 18:12:05,757][19107] Updated weights for policy 0, policy_version 184655 (0.0047) [2024-06-18 18:12:09,262][19107] Updated weights for policy 0, policy_version 184665 (0.0032) [2024-06-18 18:12:10,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3025600512. Throughput: 0: 41866.8. Samples: 249726580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:12:10,501][18875] Avg episode reward: [(0, '0.338')] [2024-06-18 18:12:12,129][19087] Signal inference workers to stop experience collection... (3600 times) [2024-06-18 18:12:12,129][19087] Signal inference workers to resume experience collection... (3600 times) [2024-06-18 18:12:12,168][19107] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-06-18 18:12:12,168][19107] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-06-18 18:12:13,574][19107] Updated weights for policy 0, policy_version 184675 (0.0035) [2024-06-18 18:12:15,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 3025764352. Throughput: 0: 41951.2. Samples: 249857020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:12:15,500][18875] Avg episode reward: [(0, '0.253')] [2024-06-18 18:12:16,941][19107] Updated weights for policy 0, policy_version 184685 (0.0026) [2024-06-18 18:12:20,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 3026010112. Throughput: 0: 41854.1. Samples: 250101240. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:12:20,509][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 18:12:21,579][19107] Updated weights for policy 0, policy_version 184695 (0.0032) [2024-06-18 18:12:24,634][19107] Updated weights for policy 0, policy_version 184705 (0.0028) [2024-06-18 18:12:25,500][18875] Fps is (10 sec: 47513.0, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3026239488. Throughput: 0: 41952.0. Samples: 250350020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:12:25,509][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 18:12:29,411][19107] Updated weights for policy 0, policy_version 184715 (0.0039) [2024-06-18 18:12:30,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3026419712. Throughput: 0: 41966.2. Samples: 250479340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 22.0) [2024-06-18 18:12:30,501][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 18:12:32,427][19107] Updated weights for policy 0, policy_version 184725 (0.0036) [2024-06-18 18:12:35,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 3026649088. Throughput: 0: 41774.6. Samples: 250725040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:12:35,501][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 18:12:35,530][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000184733_3026665472.pth... [2024-06-18 18:12:35,587][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000184120_3016622080.pth [2024-06-18 18:12:37,221][19107] Updated weights for policy 0, policy_version 184735 (0.0034) [2024-06-18 18:12:40,428][19107] Updated weights for policy 0, policy_version 184745 (0.0037) [2024-06-18 18:12:40,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3026862080. Throughput: 0: 42090.3. Samples: 250987220. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:12:40,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 18:12:45,103][19107] Updated weights for policy 0, policy_version 184755 (0.0037) [2024-06-18 18:12:45,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 3027042304. Throughput: 0: 41861.7. Samples: 251109800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:12:45,501][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 18:12:48,027][19107] Updated weights for policy 0, policy_version 184765 (0.0032) [2024-06-18 18:12:50,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3027288064. Throughput: 0: 41810.6. Samples: 251354580. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:12:50,501][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 18:12:53,192][19107] Updated weights for policy 0, policy_version 184775 (0.0027) [2024-06-18 18:12:55,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3027468288. Throughput: 0: 41948.0. Samples: 251614240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:12:55,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 18:12:55,848][19107] Updated weights for policy 0, policy_version 184785 (0.0035) [2024-06-18 18:13:00,501][18875] Fps is (10 sec: 36044.4, 60 sec: 41505.9, 300 sec: 41654.2). Total num frames: 3027648512. Throughput: 0: 41778.4. Samples: 251737060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:13:00,501][18875] Avg episode reward: [(0, '0.612')] [2024-06-18 18:13:00,983][19107] Updated weights for policy 0, policy_version 184795 (0.0027) [2024-06-18 18:13:03,661][19107] Updated weights for policy 0, policy_version 184805 (0.0047) [2024-06-18 18:13:04,503][19087] Signal inference workers to stop experience collection... 
(3650 times) [2024-06-18 18:13:04,503][19087] Signal inference workers to resume experience collection... (3650 times) [2024-06-18 18:13:04,520][19107] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-06-18 18:13:04,520][19107] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-06-18 18:13:05,504][18875] Fps is (10 sec: 45858.3, 60 sec: 42868.9, 300 sec: 41986.9). Total num frames: 3027927040. Throughput: 0: 41852.7. Samples: 251984760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:13:05,505][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 18:13:08,794][19107] Updated weights for policy 0, policy_version 184815 (0.0033) [2024-06-18 18:13:10,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41506.0, 300 sec: 41820.9). Total num frames: 3028090880. Throughput: 0: 42037.3. Samples: 252241700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:13:10,501][18875] Avg episode reward: [(0, '0.718')] [2024-06-18 18:13:11,533][19107] Updated weights for policy 0, policy_version 184825 (0.0029) [2024-06-18 18:13:15,503][18875] Fps is (10 sec: 37685.2, 60 sec: 42323.1, 300 sec: 41820.4). Total num frames: 3028303872. Throughput: 0: 41756.2. Samples: 252358500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:13:15,504][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 18:13:16,719][19107] Updated weights for policy 0, policy_version 184835 (0.0035) [2024-06-18 18:13:19,259][19107] Updated weights for policy 0, policy_version 184845 (0.0028) [2024-06-18 18:13:20,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3028549632. Throughput: 0: 42012.5. Samples: 252615600. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:13:20,501][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 18:13:24,331][19107] Updated weights for policy 0, policy_version 184855 (0.0035) [2024-06-18 18:13:25,500][18875] Fps is (10 sec: 40972.6, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3028713472. Throughput: 0: 42017.8. Samples: 252878020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:13:25,503][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 18:13:26,995][19107] Updated weights for policy 0, policy_version 184865 (0.0034) [2024-06-18 18:13:30,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3028942848. Throughput: 0: 41740.5. Samples: 252988120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:13:30,501][18875] Avg episode reward: [(0, '0.168')] [2024-06-18 18:13:31,986][19107] Updated weights for policy 0, policy_version 184875 (0.0029) [2024-06-18 18:13:34,650][19107] Updated weights for policy 0, policy_version 184885 (0.0030) [2024-06-18 18:13:35,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3029172224. Throughput: 0: 42009.4. Samples: 253245000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:13:35,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 18:13:39,681][19107] Updated weights for policy 0, policy_version 184895 (0.0034) [2024-06-18 18:13:40,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3029336064. Throughput: 0: 41951.1. Samples: 253502040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 18:13:40,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 18:13:42,387][19107] Updated weights for policy 0, policy_version 184905 (0.0039) [2024-06-18 18:13:45,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41876.9). Total num frames: 3029581824. Throughput: 0: 41857.4. Samples: 253620640. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:13:45,501][18875] Avg episode reward: [(0, '0.387')] [2024-06-18 18:13:47,358][19107] Updated weights for policy 0, policy_version 184915 (0.0043) [2024-06-18 18:13:50,500][18875] Fps is (10 sec: 45874.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3029794816. Throughput: 0: 42044.6. Samples: 253876620. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:13:50,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 18:13:50,735][19107] Updated weights for policy 0, policy_version 184925 (0.0037) [2024-06-18 18:13:55,018][19107] Updated weights for policy 0, policy_version 184935 (0.0039) [2024-06-18 18:13:55,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3029975040. Throughput: 0: 41945.3. Samples: 254129240. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:13:55,501][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 18:13:58,409][19107] Updated weights for policy 0, policy_version 184945 (0.0030) [2024-06-18 18:14:00,500][18875] Fps is (10 sec: 42599.5, 60 sec: 42871.7, 300 sec: 41987.5). Total num frames: 3030220800. Throughput: 0: 41979.4. Samples: 254247440. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:00,501][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 18:14:03,113][19107] Updated weights for policy 0, policy_version 184955 (0.0039) [2024-06-18 18:14:05,500][18875] Fps is (10 sec: 44237.7, 60 sec: 41508.7, 300 sec: 41932.0). Total num frames: 3030417408. Throughput: 0: 42058.4. Samples: 254508220. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:05,500][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 18:14:06,265][19107] Updated weights for policy 0, policy_version 184965 (0.0043) [2024-06-18 18:14:10,500][18875] Fps is (10 sec: 37683.3, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3030597632. Throughput: 0: 41640.6. Samples: 254751840. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:10,500][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 18:14:10,774][19107] Updated weights for policy 0, policy_version 184975 (0.0040) [2024-06-18 18:14:12,357][19087] Signal inference workers to stop experience collection... (3700 times) [2024-06-18 18:14:12,397][19107] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-06-18 18:14:12,418][19087] Signal inference workers to resume experience collection... (3700 times) [2024-06-18 18:14:12,419][19107] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-06-18 18:14:14,247][19107] Updated weights for policy 0, policy_version 184985 (0.0031) [2024-06-18 18:14:15,500][18875] Fps is (10 sec: 42597.6, 60 sec: 42327.5, 300 sec: 41987.5). Total num frames: 3030843392. Throughput: 0: 41971.9. Samples: 254876860. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:15,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 18:14:18,980][19107] Updated weights for policy 0, policy_version 184995 (0.0039) [2024-06-18 18:14:20,501][18875] Fps is (10 sec: 45869.7, 60 sec: 41778.5, 300 sec: 41931.8). Total num frames: 3031056384. Throughput: 0: 42008.0. Samples: 255135400. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:20,502][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 18:14:21,901][19107] Updated weights for policy 0, policy_version 185005 (0.0032) [2024-06-18 18:14:25,504][18875] Fps is (10 sec: 40945.6, 60 sec: 42322.8, 300 sec: 41820.4). Total num frames: 3031252992. Throughput: 0: 41839.7. Samples: 255384980. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:25,504][18875] Avg episode reward: [(0, '0.835')] [2024-06-18 18:14:26,568][19107] Updated weights for policy 0, policy_version 185015 (0.0029) [2024-06-18 18:14:29,553][19107] Updated weights for policy 0, policy_version 185025 (0.0029) [2024-06-18 18:14:30,500][18875] Fps is (10 sec: 40964.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3031465984. Throughput: 0: 41974.4. Samples: 255509480. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:30,501][18875] Avg episode reward: [(0, '0.644')] [2024-06-18 18:14:34,136][19107] Updated weights for policy 0, policy_version 185035 (0.0047) [2024-06-18 18:14:35,500][18875] Fps is (10 sec: 40974.8, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3031662592. Throughput: 0: 41925.5. Samples: 255763260. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:35,501][18875] Avg episode reward: [(0, '0.700')] [2024-06-18 18:14:35,651][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000185039_3031678976.pth... [2024-06-18 18:14:35,718][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000184426_3021635584.pth [2024-06-18 18:14:37,855][19107] Updated weights for policy 0, policy_version 185045 (0.0028) [2024-06-18 18:14:40,504][18875] Fps is (10 sec: 40945.2, 60 sec: 42322.8, 300 sec: 41820.4). Total num frames: 3031875584. Throughput: 0: 41896.8. Samples: 256014740. Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:40,504][18875] Avg episode reward: [(0, '0.680')] [2024-06-18 18:14:41,762][19107] Updated weights for policy 0, policy_version 185055 (0.0039) [2024-06-18 18:14:45,504][18875] Fps is (10 sec: 42583.0, 60 sec: 41776.8, 300 sec: 41875.9). Total num frames: 3032088576. Throughput: 0: 42089.0. Samples: 256141600. 
Policy #0 lag: (min: 1.0, avg: 8.2, max: 21.0) [2024-06-18 18:14:45,504][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 18:14:45,787][19107] Updated weights for policy 0, policy_version 185065 (0.0038) [2024-06-18 18:14:49,576][19107] Updated weights for policy 0, policy_version 185075 (0.0043) [2024-06-18 18:14:50,500][18875] Fps is (10 sec: 42613.2, 60 sec: 41779.2, 300 sec: 41876.7). Total num frames: 3032301568. Throughput: 0: 41865.1. Samples: 256392160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:14:50,501][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 18:14:53,480][19107] Updated weights for policy 0, policy_version 185085 (0.0040) [2024-06-18 18:14:55,500][18875] Fps is (10 sec: 42613.9, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3032514560. Throughput: 0: 41920.8. Samples: 256638280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:14:55,504][18875] Avg episode reward: [(0, '0.700')] [2024-06-18 18:14:57,353][19107] Updated weights for policy 0, policy_version 185095 (0.0024) [2024-06-18 18:15:00,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3032711168. Throughput: 0: 41949.5. Samples: 256764580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:00,500][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 18:15:01,402][19107] Updated weights for policy 0, policy_version 185105 (0.0029) [2024-06-18 18:15:05,318][19107] Updated weights for policy 0, policy_version 185115 (0.0052) [2024-06-18 18:15:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3032924160. Throughput: 0: 41777.9. Samples: 257015360. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:05,501][18875] Avg episode reward: [(0, '0.693')] [2024-06-18 18:15:09,265][19107] Updated weights for policy 0, policy_version 185125 (0.0034) [2024-06-18 18:15:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3033137152. Throughput: 0: 41796.3. Samples: 257265660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:10,500][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 18:15:13,098][19107] Updated weights for policy 0, policy_version 185135 (0.0040) [2024-06-18 18:15:15,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3033333760. Throughput: 0: 41822.1. Samples: 257391480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:15,501][18875] Avg episode reward: [(0, '0.689')] [2024-06-18 18:15:17,022][19107] Updated weights for policy 0, policy_version 185145 (0.0029) [2024-06-18 18:15:20,504][18875] Fps is (10 sec: 40944.7, 60 sec: 41504.4, 300 sec: 41820.3). Total num frames: 3033546752. Throughput: 0: 41807.2. Samples: 257644740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:20,505][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 18:15:20,971][19107] Updated weights for policy 0, policy_version 185155 (0.0037) [2024-06-18 18:15:24,847][19107] Updated weights for policy 0, policy_version 185165 (0.0026) [2024-06-18 18:15:25,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41781.6, 300 sec: 41931.9). Total num frames: 3033759744. Throughput: 0: 41769.9. Samples: 257894240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:25,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 18:15:28,660][19107] Updated weights for policy 0, policy_version 185175 (0.0037) [2024-06-18 18:15:30,500][18875] Fps is (10 sec: 44252.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3033989120. Throughput: 0: 41737.5. Samples: 258019640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:30,501][18875] Avg episode reward: [(0, '0.691')] [2024-06-18 18:15:32,757][19107] Updated weights for policy 0, policy_version 185185 (0.0043) [2024-06-18 18:15:35,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3034169344. Throughput: 0: 41843.3. Samples: 258275100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:35,500][18875] Avg episode reward: [(0, '0.740')] [2024-06-18 18:15:35,654][19087] Signal inference workers to stop experience collection... (3750 times) [2024-06-18 18:15:35,655][19087] Signal inference workers to resume experience collection... (3750 times) [2024-06-18 18:15:35,690][19107] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-06-18 18:15:35,690][19107] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-06-18 18:15:36,329][19107] Updated weights for policy 0, policy_version 185195 (0.0033) [2024-06-18 18:15:40,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41781.7, 300 sec: 41876.4). Total num frames: 3034382336. Throughput: 0: 41964.4. Samples: 258526680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:40,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 18:15:40,608][19107] Updated weights for policy 0, policy_version 185205 (0.0037) [2024-06-18 18:15:44,105][19107] Updated weights for policy 0, policy_version 185215 (0.0054) [2024-06-18 18:15:45,500][18875] Fps is (10 sec: 45874.2, 60 sec: 42327.8, 300 sec: 41987.4). Total num frames: 3034628096. Throughput: 0: 41992.7. Samples: 258654260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:45,501][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 18:15:48,501][19107] Updated weights for policy 0, policy_version 185225 (0.0044) [2024-06-18 18:15:50,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 3034775552. Throughput: 0: 42050.1. 
Samples: 258907620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:15:50,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 18:15:51,730][19107] Updated weights for policy 0, policy_version 185235 (0.0040) [2024-06-18 18:15:55,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 3035004928. Throughput: 0: 42026.1. Samples: 259156840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:15:55,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 18:15:56,356][19107] Updated weights for policy 0, policy_version 185245 (0.0046) [2024-06-18 18:15:59,653][19107] Updated weights for policy 0, policy_version 185255 (0.0037) [2024-06-18 18:16:00,500][18875] Fps is (10 sec: 49152.4, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3035267072. Throughput: 0: 42009.9. Samples: 259281920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:00,501][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 18:16:04,525][19107] Updated weights for policy 0, policy_version 185265 (0.0039) [2024-06-18 18:16:05,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3035414528. Throughput: 0: 41866.5. Samples: 259528580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:05,501][18875] Avg episode reward: [(0, '0.750')] [2024-06-18 18:16:07,418][19107] Updated weights for policy 0, policy_version 185275 (0.0045) [2024-06-18 18:16:10,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3035643904. Throughput: 0: 41813.4. Samples: 259775840. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:10,501][18875] Avg episode reward: [(0, '0.619')] [2024-06-18 18:16:12,271][19107] Updated weights for policy 0, policy_version 185285 (0.0039) [2024-06-18 18:16:15,298][19107] Updated weights for policy 0, policy_version 185295 (0.0035) [2024-06-18 18:16:15,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3035873280. Throughput: 0: 41928.1. Samples: 259906400. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:15,500][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 18:16:20,051][19107] Updated weights for policy 0, policy_version 185305 (0.0029) [2024-06-18 18:16:20,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41508.6, 300 sec: 41820.9). Total num frames: 3036037120. Throughput: 0: 41723.9. Samples: 260152680. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:20,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 18:16:23,172][19107] Updated weights for policy 0, policy_version 185315 (0.0027) [2024-06-18 18:16:25,503][18875] Fps is (10 sec: 40948.5, 60 sec: 42050.4, 300 sec: 41876.0). Total num frames: 3036282880. Throughput: 0: 41629.8. Samples: 260400140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:25,504][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 18:16:27,834][19107] Updated weights for policy 0, policy_version 185325 (0.0043) [2024-06-18 18:16:30,500][18875] Fps is (10 sec: 45875.4, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3036495872. Throughput: 0: 41759.2. Samples: 260533420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:30,501][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 18:16:30,904][19107] Updated weights for policy 0, policy_version 185335 (0.0032) [2024-06-18 18:16:35,500][18875] Fps is (10 sec: 39332.1, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3036676096. Throughput: 0: 41679.0. Samples: 260783180. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:35,501][18875] Avg episode reward: [(0, '0.337')] [2024-06-18 18:16:35,522][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000185344_3036676096.pth... [2024-06-18 18:16:35,584][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000184733_3026665472.pth [2024-06-18 18:16:35,734][19107] Updated weights for policy 0, policy_version 185345 (0.0030) [2024-06-18 18:16:38,908][19107] Updated weights for policy 0, policy_version 185355 (0.0037) [2024-06-18 18:16:40,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3036921856. Throughput: 0: 41672.9. Samples: 261032120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:40,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 18:16:41,973][19087] Signal inference workers to stop experience collection... (3800 times) [2024-06-18 18:16:42,015][19107] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-06-18 18:16:42,026][19087] Signal inference workers to resume experience collection... (3800 times) [2024-06-18 18:16:42,035][19107] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-06-18 18:16:43,752][19107] Updated weights for policy 0, policy_version 185365 (0.0040) [2024-06-18 18:16:45,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 3037102080. Throughput: 0: 41866.2. Samples: 261165900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:45,501][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 18:16:46,508][19107] Updated weights for policy 0, policy_version 185375 (0.0030) [2024-06-18 18:16:50,500][18875] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 3037315072. Throughput: 0: 41812.7. Samples: 261410160. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:50,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 18:16:51,305][19107] Updated weights for policy 0, policy_version 185385 (0.0037) [2024-06-18 18:16:54,403][19107] Updated weights for policy 0, policy_version 185395 (0.0032) [2024-06-18 18:16:55,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3037528064. Throughput: 0: 41877.7. Samples: 261660340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 21.0) [2024-06-18 18:16:55,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 18:16:58,967][19107] Updated weights for policy 0, policy_version 185405 (0.0034) [2024-06-18 18:17:00,500][18875] Fps is (10 sec: 40960.9, 60 sec: 40960.1, 300 sec: 41931.9). Total num frames: 3037724672. Throughput: 0: 41837.8. Samples: 261789100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:00,500][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 18:17:02,130][19107] Updated weights for policy 0, policy_version 185415 (0.0027) [2024-06-18 18:17:05,500][18875] Fps is (10 sec: 40960.9, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3037937664. Throughput: 0: 41813.9. Samples: 262034300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:05,500][18875] Avg episode reward: [(0, '0.665')] [2024-06-18 18:17:06,879][19107] Updated weights for policy 0, policy_version 185425 (0.0030) [2024-06-18 18:17:10,088][19107] Updated weights for policy 0, policy_version 185435 (0.0036) [2024-06-18 18:17:10,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3038167040. Throughput: 0: 41812.9. Samples: 262281600. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:10,500][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 18:17:14,645][19107] Updated weights for policy 0, policy_version 185445 (0.0048) [2024-06-18 18:17:15,500][18875] Fps is (10 sec: 40958.9, 60 sec: 41232.9, 300 sec: 41820.8). Total num frames: 3038347264. Throughput: 0: 41733.2. Samples: 262411420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:15,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 18:17:17,987][19107] Updated weights for policy 0, policy_version 185455 (0.0033) [2024-06-18 18:17:20,500][18875] Fps is (10 sec: 39320.9, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3038560256. Throughput: 0: 41564.5. Samples: 262653580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:20,501][18875] Avg episode reward: [(0, '0.677')] [2024-06-18 18:17:22,478][19107] Updated weights for policy 0, policy_version 185465 (0.0030) [2024-06-18 18:17:25,500][18875] Fps is (10 sec: 44237.5, 60 sec: 41781.1, 300 sec: 41931.9). Total num frames: 3038789632. Throughput: 0: 41746.7. Samples: 262910720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:25,501][18875] Avg episode reward: [(0, '0.740')] [2024-06-18 18:17:25,821][19107] Updated weights for policy 0, policy_version 185475 (0.0038) [2024-06-18 18:17:30,250][19107] Updated weights for policy 0, policy_version 185485 (0.0037) [2024-06-18 18:17:30,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 3038986240. Throughput: 0: 41530.1. Samples: 263034760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:30,501][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 18:17:33,677][19107] Updated weights for policy 0, policy_version 185495 (0.0034) [2024-06-18 18:17:35,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 3039199232. Throughput: 0: 41625.9. Samples: 263283320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:35,501][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 18:17:38,137][19107] Updated weights for policy 0, policy_version 185505 (0.0033) [2024-06-18 18:17:40,500][18875] Fps is (10 sec: 40961.0, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 3039395840. Throughput: 0: 41588.7. Samples: 263531820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:40,500][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 18:17:41,788][19107] Updated weights for policy 0, policy_version 185515 (0.0036) [2024-06-18 18:17:45,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3039592448. Throughput: 0: 41445.6. Samples: 263654160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:45,501][18875] Avg episode reward: [(0, '0.614')] [2024-06-18 18:17:46,293][19107] Updated weights for policy 0, policy_version 185525 (0.0033) [2024-06-18 18:17:49,587][19107] Updated weights for policy 0, policy_version 185535 (0.0030) [2024-06-18 18:17:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3039821824. Throughput: 0: 41510.6. Samples: 263902280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:50,500][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 18:17:54,173][19107] Updated weights for policy 0, policy_version 185545 (0.0036) [2024-06-18 18:17:55,504][18875] Fps is (10 sec: 42583.1, 60 sec: 41503.7, 300 sec: 41931.4). Total num frames: 3040018432. Throughput: 0: 41864.1. Samples: 264165640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:17:55,505][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 18:17:57,504][19107] Updated weights for policy 0, policy_version 185555 (0.0038) [2024-06-18 18:17:59,722][19087] Signal inference workers to stop experience collection... 
(3850 times) [2024-06-18 18:17:59,723][19087] Signal inference workers to resume experience collection... (3850 times) [2024-06-18 18:17:59,749][19107] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-06-18 18:17:59,749][19107] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-06-18 18:18:00,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41506.0, 300 sec: 41654.7). Total num frames: 3040215040. Throughput: 0: 41634.3. Samples: 264284960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:18:00,504][18875] Avg episode reward: [(0, '0.607')] [2024-06-18 18:18:01,731][19107] Updated weights for policy 0, policy_version 185565 (0.0035) [2024-06-18 18:18:05,194][19107] Updated weights for policy 0, policy_version 185575 (0.0032) [2024-06-18 18:18:05,500][18875] Fps is (10 sec: 44253.3, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 3040460800. Throughput: 0: 41726.4. Samples: 264531260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 18:18:05,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 18:18:09,524][19107] Updated weights for policy 0, policy_version 185585 (0.0027) [2024-06-18 18:18:10,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41506.0, 300 sec: 41876.8). Total num frames: 3040657408. Throughput: 0: 41808.4. Samples: 264792100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:10,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 18:18:12,773][19107] Updated weights for policy 0, policy_version 185595 (0.0033) [2024-06-18 18:18:15,500][18875] Fps is (10 sec: 39320.8, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3040854016. Throughput: 0: 41689.3. Samples: 264910780. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:15,501][18875] Avg episode reward: [(0, '0.865')] [2024-06-18 18:18:17,302][19107] Updated weights for policy 0, policy_version 185605 (0.0037) [2024-06-18 18:18:20,454][19107] Updated weights for policy 0, policy_version 185615 (0.0045) [2024-06-18 18:18:20,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3041116160. Throughput: 0: 41829.7. Samples: 265165660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:20,501][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 18:18:25,134][19107] Updated weights for policy 0, policy_version 185625 (0.0025) [2024-06-18 18:18:25,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 3041280000. Throughput: 0: 41914.1. Samples: 265417960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:25,501][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 18:18:28,950][19107] Updated weights for policy 0, policy_version 185635 (0.0038) [2024-06-18 18:18:30,504][18875] Fps is (10 sec: 36031.9, 60 sec: 41503.7, 300 sec: 41709.3). Total num frames: 3041476608. Throughput: 0: 41787.8. Samples: 265534760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:30,504][18875] Avg episode reward: [(0, '0.358')] [2024-06-18 18:18:32,906][19107] Updated weights for policy 0, policy_version 185645 (0.0027) [2024-06-18 18:18:35,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3041722368. Throughput: 0: 42067.0. Samples: 265795300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:35,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 18:18:35,527][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000185652_3041722368.pth... 
[2024-06-18 18:18:35,605][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000185039_3031678976.pth [2024-06-18 18:18:36,718][19107] Updated weights for policy 0, policy_version 185655 (0.0042) [2024-06-18 18:18:40,500][18875] Fps is (10 sec: 42614.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3041902592. Throughput: 0: 41837.2. Samples: 266048160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:40,501][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 18:18:40,838][19107] Updated weights for policy 0, policy_version 185665 (0.0029) [2024-06-18 18:18:44,478][19107] Updated weights for policy 0, policy_version 185675 (0.0034) [2024-06-18 18:18:45,500][18875] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3042115584. Throughput: 0: 41857.8. Samples: 266168560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:45,501][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 18:18:48,815][19107] Updated weights for policy 0, policy_version 185685 (0.0039) [2024-06-18 18:18:50,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3042328576. Throughput: 0: 41922.1. Samples: 266417760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:50,501][18875] Avg episode reward: [(0, '0.342')] [2024-06-18 18:18:52,276][19107] Updated weights for policy 0, policy_version 185695 (0.0045) [2024-06-18 18:18:55,504][18875] Fps is (10 sec: 40945.3, 60 sec: 41779.2, 300 sec: 41709.3). Total num frames: 3042525184. Throughput: 0: 41892.2. Samples: 266677400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:18:55,504][18875] Avg episode reward: [(0, '0.387')] [2024-06-18 18:18:56,422][19107] Updated weights for policy 0, policy_version 185705 (0.0028) [2024-06-18 18:19:00,069][19107] Updated weights for policy 0, policy_version 185715 (0.0043) [2024-06-18 18:19:00,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 3042754560. Throughput: 0: 41883.2. Samples: 266795520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:19:00,503][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 18:19:04,351][19107] Updated weights for policy 0, policy_version 185725 (0.0023) [2024-06-18 18:19:05,500][18875] Fps is (10 sec: 44252.4, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3042967552. Throughput: 0: 41885.3. Samples: 267050500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:19:05,501][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 18:19:07,963][19107] Updated weights for policy 0, policy_version 185735 (0.0033) [2024-06-18 18:19:10,504][18875] Fps is (10 sec: 40945.7, 60 sec: 41776.7, 300 sec: 41764.8). Total num frames: 3043164160. Throughput: 0: 41797.6. Samples: 267299000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 18:19:10,504][18875] Avg episode reward: [(0, '0.145')] [2024-06-18 18:19:12,698][19107] Updated weights for policy 0, policy_version 185745 (0.0036) [2024-06-18 18:19:15,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41765.5). Total num frames: 3043377152. Throughput: 0: 41858.4. Samples: 267418240. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:19:15,501][18875] Avg episode reward: [(0, '0.734')] [2024-06-18 18:19:16,257][19107] Updated weights for policy 0, policy_version 185755 (0.0026) [2024-06-18 18:19:20,423][19107] Updated weights for policy 0, policy_version 185765 (0.0046) [2024-06-18 18:19:20,500][18875] Fps is (10 sec: 40974.8, 60 sec: 40960.0, 300 sec: 41765.8). Total num frames: 3043573760. Throughput: 0: 41603.6. Samples: 267667460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:19:20,501][18875] Avg episode reward: [(0, '0.250')] [2024-06-18 18:19:24,127][19107] Updated weights for policy 0, policy_version 185775 (0.0039) [2024-06-18 18:19:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3043786752. Throughput: 0: 41512.9. Samples: 267916240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:19:25,501][18875] Avg episode reward: [(0, '0.268')] [2024-06-18 18:19:28,152][19107] Updated weights for policy 0, policy_version 185785 (0.0044) [2024-06-18 18:19:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42054.8, 300 sec: 41820.9). Total num frames: 3043999744. Throughput: 0: 41623.6. Samples: 268041620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:19:30,501][18875] Avg episode reward: [(0, '0.319')] [2024-06-18 18:19:31,967][19107] Updated weights for policy 0, policy_version 185795 (0.0046) [2024-06-18 18:19:33,875][19087] Signal inference workers to stop experience collection... (3900 times) [2024-06-18 18:19:33,906][19107] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-06-18 18:19:33,942][19087] Signal inference workers to resume experience collection... (3900 times) [2024-06-18 18:19:33,943][19107] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-06-18 18:19:35,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 41765.8). Total num frames: 3044196352. Throughput: 0: 41659.5. 
Samples: 268292440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:19:35,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 18:19:35,883][19107] Updated weights for policy 0, policy_version 185805 (0.0045) [2024-06-18 18:19:40,108][19107] Updated weights for policy 0, policy_version 185815 (0.0044) [2024-06-18 18:19:40,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 41765.8). Total num frames: 3044409344. Throughput: 0: 41623.7. Samples: 268550320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:19:40,501][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 18:19:43,732][19107] Updated weights for policy 0, policy_version 185825 (0.0030) [2024-06-18 18:19:45,500][18875] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3044638720. Throughput: 0: 41717.9. Samples: 268672820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:19:45,500][18875] Avg episode reward: [(0, '0.403')] [2024-06-18 18:19:47,992][19107] Updated weights for policy 0, policy_version 185835 (0.0035) [2024-06-18 18:19:50,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3044835328. Throughput: 0: 41658.2. Samples: 268925120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:19:50,501][18875] Avg episode reward: [(0, '0.442')] [2024-06-18 18:19:51,670][19107] Updated weights for policy 0, policy_version 185845 (0.0040) [2024-06-18 18:19:55,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41781.8, 300 sec: 41765.3). Total num frames: 3045031936. Throughput: 0: 41735.8. Samples: 269176960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:19:55,500][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 18:19:55,764][19107] Updated weights for policy 0, policy_version 185855 (0.0031) [2024-06-18 18:19:59,788][19107] Updated weights for policy 0, policy_version 185865 (0.0035) [2024-06-18 18:20:00,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3045261312. Throughput: 0: 41898.7. Samples: 269303680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:20:00,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 18:20:03,585][19107] Updated weights for policy 0, policy_version 185875 (0.0045) [2024-06-18 18:20:05,500][18875] Fps is (10 sec: 44235.9, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3045474304. Throughput: 0: 41847.9. Samples: 269550620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:20:05,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 18:20:07,450][19107] Updated weights for policy 0, policy_version 185885 (0.0028) [2024-06-18 18:20:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41781.6, 300 sec: 41820.9). Total num frames: 3045670912. Throughput: 0: 41757.7. Samples: 269795340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:20:10,501][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 18:20:11,516][19107] Updated weights for policy 0, policy_version 185895 (0.0045) [2024-06-18 18:20:15,239][19107] Updated weights for policy 0, policy_version 185905 (0.0038) [2024-06-18 18:20:15,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41765.8). Total num frames: 3045867520. Throughput: 0: 41731.4. Samples: 269919540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 18:20:15,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 18:20:19,508][19107] Updated weights for policy 0, policy_version 185915 (0.0024) [2024-06-18 18:20:20,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41765.3). 
Total num frames: 3046080512. Throughput: 0: 41838.3. Samples: 270175160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:20:20,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 18:20:22,991][19107] Updated weights for policy 0, policy_version 185925 (0.0037) [2024-06-18 18:20:25,500][18875] Fps is (10 sec: 42599.5, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3046293504. Throughput: 0: 41555.3. Samples: 270420300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:20:25,500][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 18:20:27,279][19107] Updated weights for policy 0, policy_version 185935 (0.0031) [2024-06-18 18:20:30,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3046490112. Throughput: 0: 41689.3. Samples: 270548840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:20:30,501][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 18:20:30,718][19107] Updated weights for policy 0, policy_version 185945 (0.0037) [2024-06-18 18:20:34,975][19107] Updated weights for policy 0, policy_version 185955 (0.0036) [2024-06-18 18:20:35,501][18875] Fps is (10 sec: 42597.1, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3046719488. Throughput: 0: 41705.2. Samples: 270801860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:20:35,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 18:20:35,529][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000185957_3046719488.pth... [2024-06-18 18:20:35,584][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000185344_3036676096.pth [2024-06-18 18:20:38,777][19107] Updated weights for policy 0, policy_version 185965 (0.0037) [2024-06-18 18:20:40,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3046932480. Throughput: 0: 41571.4. Samples: 271047680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:20:40,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 18:20:41,776][19087] Signal inference workers to stop experience collection... (3950 times) [2024-06-18 18:20:41,776][19087] Signal inference workers to resume experience collection... (3950 times) [2024-06-18 18:20:41,798][19107] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-06-18 18:20:41,829][19107] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-06-18 18:20:42,816][19107] Updated weights for policy 0, policy_version 185975 (0.0030) [2024-06-18 18:20:45,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41232.9, 300 sec: 41820.8). Total num frames: 3047112704. Throughput: 0: 41620.3. Samples: 271176600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:20:45,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 18:20:46,396][19107] Updated weights for policy 0, policy_version 185985 (0.0037) [2024-06-18 18:20:50,458][19107] Updated weights for policy 0, policy_version 185995 (0.0035) [2024-06-18 18:20:50,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3047342080. Throughput: 0: 41700.0. Samples: 271427120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:20:50,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 18:20:54,157][19107] Updated weights for policy 0, policy_version 186005 (0.0029) [2024-06-18 18:20:55,500][18875] Fps is (10 sec: 44238.0, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 3047555072. Throughput: 0: 41710.8. Samples: 271672320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:20:55,500][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 18:20:58,351][19107] Updated weights for policy 0, policy_version 186015 (0.0027) [2024-06-18 18:21:00,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 3047751680. Throughput: 0: 41779.7. 
Samples: 271799620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:21:00,501][18875] Avg episode reward: [(0, '0.365')] [2024-06-18 18:21:02,185][19107] Updated weights for policy 0, policy_version 186025 (0.0039) [2024-06-18 18:21:05,500][18875] Fps is (10 sec: 40959.3, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3047964672. Throughput: 0: 41725.3. Samples: 272052800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:21:05,501][18875] Avg episode reward: [(0, '0.398')] [2024-06-18 18:21:06,200][19107] Updated weights for policy 0, policy_version 186035 (0.0038) [2024-06-18 18:21:09,963][19107] Updated weights for policy 0, policy_version 186045 (0.0047) [2024-06-18 18:21:10,502][18875] Fps is (10 sec: 42590.7, 60 sec: 41778.0, 300 sec: 41709.5). Total num frames: 3048177664. Throughput: 0: 41760.0. Samples: 272299580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:21:10,503][18875] Avg episode reward: [(0, '0.518')] [2024-06-18 18:21:13,887][19107] Updated weights for policy 0, policy_version 186055 (0.0031) [2024-06-18 18:21:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3048374272. Throughput: 0: 41771.0. Samples: 272428540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:21:15,503][18875] Avg episode reward: [(0, '0.320')] [2024-06-18 18:21:17,927][19107] Updated weights for policy 0, policy_version 186065 (0.0032) [2024-06-18 18:21:20,504][18875] Fps is (10 sec: 40952.7, 60 sec: 41776.7, 300 sec: 41709.7). Total num frames: 3048587264. Throughput: 0: 41586.2. Samples: 272673380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 18:21:20,505][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 18:21:21,688][19107] Updated weights for policy 0, policy_version 186075 (0.0041) [2024-06-18 18:21:25,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 3048783872. Throughput: 0: 41743.9. 
Samples: 272926160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:21:25,501][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 18:21:25,872][19107] Updated weights for policy 0, policy_version 186085 (0.0037) [2024-06-18 18:21:29,950][19107] Updated weights for policy 0, policy_version 186095 (0.0043) [2024-06-18 18:21:30,500][18875] Fps is (10 sec: 40975.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3048996864. Throughput: 0: 41603.3. Samples: 273048740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:21:30,501][18875] Avg episode reward: [(0, '0.630')] [2024-06-18 18:21:33,739][19107] Updated weights for policy 0, policy_version 186105 (0.0042) [2024-06-18 18:21:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3049209856. Throughput: 0: 41457.0. Samples: 273292680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:21:35,501][18875] Avg episode reward: [(0, '0.333')] [2024-06-18 18:21:37,757][19107] Updated weights for policy 0, policy_version 186115 (0.0034) [2024-06-18 18:21:40,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 3049406464. Throughput: 0: 41668.0. Samples: 273547380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:21:40,500][18875] Avg episode reward: [(0, '0.341')] [2024-06-18 18:21:41,562][19107] Updated weights for policy 0, policy_version 186125 (0.0033) [2024-06-18 18:21:45,373][19107] Updated weights for policy 0, policy_version 186135 (0.0039) [2024-06-18 18:21:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3049635840. Throughput: 0: 41613.7. Samples: 273672240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:21:45,501][18875] Avg episode reward: [(0, '0.378')] [2024-06-18 18:21:49,526][19107] Updated weights for policy 0, policy_version 186145 (0.0037) [2024-06-18 18:21:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3049832448. Throughput: 0: 41543.2. Samples: 273922240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:21:50,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 18:21:53,537][19107] Updated weights for policy 0, policy_version 186155 (0.0033) [2024-06-18 18:21:55,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 3050045440. Throughput: 0: 41584.7. Samples: 274170820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:21:55,501][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 18:21:57,228][19107] Updated weights for policy 0, policy_version 186165 (0.0043) [2024-06-18 18:22:00,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3050258432. Throughput: 0: 41576.6. Samples: 274299480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:22:00,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 18:22:01,482][19107] Updated weights for policy 0, policy_version 186175 (0.0042) [2024-06-18 18:22:05,229][19107] Updated weights for policy 0, policy_version 186185 (0.0034) [2024-06-18 18:22:05,500][18875] Fps is (10 sec: 40961.0, 60 sec: 41506.3, 300 sec: 41654.2). Total num frames: 3050455040. Throughput: 0: 41658.1. Samples: 274547840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:22:05,500][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 18:22:09,322][19107] Updated weights for policy 0, policy_version 186195 (0.0042) [2024-06-18 18:22:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41507.4, 300 sec: 41765.3). Total num frames: 3050668032. Throughput: 0: 41515.2. Samples: 274794340. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:22:10,504][18875] Avg episode reward: [(0, '0.717')] [2024-06-18 18:22:12,925][19107] Updated weights for policy 0, policy_version 186205 (0.0041) [2024-06-18 18:22:15,500][18875] Fps is (10 sec: 42597.4, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3050881024. Throughput: 0: 41454.1. Samples: 274914180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:22:15,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 18:22:17,022][19087] Signal inference workers to stop experience collection... (4000 times) [2024-06-18 18:22:17,022][19087] Signal inference workers to resume experience collection... (4000 times) [2024-06-18 18:22:17,068][19107] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-06-18 18:22:17,072][19107] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-06-18 18:22:17,160][19107] Updated weights for policy 0, policy_version 186215 (0.0047) [2024-06-18 18:22:20,504][18875] Fps is (10 sec: 40945.1, 60 sec: 41506.2, 300 sec: 41653.7). Total num frames: 3051077632. Throughput: 0: 41555.8. Samples: 275162840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:22:20,504][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 18:22:21,226][19107] Updated weights for policy 0, policy_version 186225 (0.0034) [2024-06-18 18:22:24,992][19107] Updated weights for policy 0, policy_version 186235 (0.0040) [2024-06-18 18:22:25,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 3051274240. Throughput: 0: 41423.0. Samples: 275411420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:22:25,501][18875] Avg episode reward: [(0, '0.644')] [2024-06-18 18:22:29,111][19107] Updated weights for policy 0, policy_version 186245 (0.0046) [2024-06-18 18:22:30,500][18875] Fps is (10 sec: 40974.3, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 3051487232. Throughput: 0: 41452.0. 
Samples: 275537580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 18:22:30,501][18875] Avg episode reward: [(0, '0.363')] [2024-06-18 18:22:32,763][19107] Updated weights for policy 0, policy_version 186255 (0.0032) [2024-06-18 18:22:35,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41709.7). Total num frames: 3051700224. Throughput: 0: 41369.2. Samples: 275783860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:22:35,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 18:22:35,524][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000186261_3051700224.pth... [2024-06-18 18:22:35,590][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000185652_3041722368.pth [2024-06-18 18:22:37,522][19107] Updated weights for policy 0, policy_version 186265 (0.0044) [2024-06-18 18:22:40,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3051913216. Throughput: 0: 41373.9. Samples: 276032640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:22:40,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 18:22:40,716][19107] Updated weights for policy 0, policy_version 186275 (0.0035) [2024-06-18 18:22:45,439][19107] Updated weights for policy 0, policy_version 186285 (0.0040) [2024-06-18 18:22:45,500][18875] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41598.7). Total num frames: 3052093440. Throughput: 0: 41164.7. Samples: 276151900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:22:45,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 18:22:49,013][19107] Updated weights for policy 0, policy_version 186295 (0.0042) [2024-06-18 18:22:50,501][18875] Fps is (10 sec: 39321.0, 60 sec: 41232.9, 300 sec: 41654.7). Total num frames: 3052306432. Throughput: 0: 41130.4. Samples: 276398720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:22:50,501][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 18:22:53,185][19107] Updated weights for policy 0, policy_version 186305 (0.0029) [2024-06-18 18:22:55,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 3052519424. Throughput: 0: 41237.7. Samples: 276650040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:22:55,501][18875] Avg episode reward: [(0, '0.367')] [2024-06-18 18:22:56,887][19107] Updated weights for policy 0, policy_version 186315 (0.0029) [2024-06-18 18:23:00,500][18875] Fps is (10 sec: 40961.3, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 3052716032. Throughput: 0: 41358.5. Samples: 276775300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:23:00,500][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 18:23:00,945][19107] Updated weights for policy 0, policy_version 186325 (0.0038) [2024-06-18 18:23:04,815][19107] Updated weights for policy 0, policy_version 186335 (0.0036) [2024-06-18 18:23:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41232.9, 300 sec: 41598.7). Total num frames: 3052929024. Throughput: 0: 41319.2. Samples: 277022060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:23:05,501][18875] Avg episode reward: [(0, '0.846')] [2024-06-18 18:23:08,907][19107] Updated weights for policy 0, policy_version 186345 (0.0040) [2024-06-18 18:23:10,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41654.3). Total num frames: 3053142016. Throughput: 0: 41326.8. Samples: 277271120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:23:10,500][18875] Avg episode reward: [(0, '0.792')] [2024-06-18 18:23:13,089][19107] Updated weights for policy 0, policy_version 186355 (0.0035) [2024-06-18 18:23:15,500][18875] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41432.1). Total num frames: 3053338624. Throughput: 0: 41465.4. Samples: 277403520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:23:15,501][18875] Avg episode reward: [(0, '0.680')] [2024-06-18 18:23:16,757][19107] Updated weights for policy 0, policy_version 186365 (0.0039) [2024-06-18 18:23:20,500][18875] Fps is (10 sec: 39321.3, 60 sec: 40962.4, 300 sec: 41543.2). Total num frames: 3053535232. Throughput: 0: 41299.6. Samples: 277642340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:23:20,501][18875] Avg episode reward: [(0, '0.363')] [2024-06-18 18:23:20,937][19107] Updated weights for policy 0, policy_version 186375 (0.0041) [2024-06-18 18:23:24,428][19107] Updated weights for policy 0, policy_version 186385 (0.0047) [2024-06-18 18:23:25,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41654.8). Total num frames: 3053764608. Throughput: 0: 41382.8. Samples: 277894860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:23:25,501][18875] Avg episode reward: [(0, '0.331')] [2024-06-18 18:23:28,559][19107] Updated weights for policy 0, policy_version 186395 (0.0028) [2024-06-18 18:23:30,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41506.3, 300 sec: 41543.2). Total num frames: 3053977600. Throughput: 0: 41669.6. Samples: 278027020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:23:30,501][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 18:23:32,276][19107] Updated weights for policy 0, policy_version 186405 (0.0042) [2024-06-18 18:23:35,500][18875] Fps is (10 sec: 39321.7, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 3054157824. Throughput: 0: 41614.9. Samples: 278271380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:23:35,500][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 18:23:36,971][19107] Updated weights for policy 0, policy_version 186415 (0.0041) [2024-06-18 18:23:37,658][19087] Signal inference workers to stop experience collection... 
(4050 times) [2024-06-18 18:23:37,659][19087] Signal inference workers to resume experience collection... (4050 times) [2024-06-18 18:23:37,671][19107] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-06-18 18:23:37,671][19107] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-06-18 18:23:40,158][19107] Updated weights for policy 0, policy_version 186425 (0.0033) [2024-06-18 18:23:40,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 3054387200. Throughput: 0: 41541.4. Samples: 278519400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:23:40,501][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 18:23:44,835][19107] Updated weights for policy 0, policy_version 186435 (0.0047) [2024-06-18 18:23:45,504][18875] Fps is (10 sec: 44220.3, 60 sec: 41776.8, 300 sec: 41598.2). Total num frames: 3054600192. Throughput: 0: 41541.4. Samples: 278644820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:23:45,505][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 18:23:48,306][19107] Updated weights for policy 0, policy_version 186445 (0.0037) [2024-06-18 18:23:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 41654.7). Total num frames: 3054813184. Throughput: 0: 41581.3. Samples: 278893220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:23:50,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 18:23:52,565][19107] Updated weights for policy 0, policy_version 186455 (0.0035) [2024-06-18 18:23:55,500][18875] Fps is (10 sec: 40974.8, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 3055009792. Throughput: 0: 41707.9. Samples: 279147980. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:23:55,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 18:23:55,961][19107] Updated weights for policy 0, policy_version 186465 (0.0040) [2024-06-18 18:24:00,281][19107] Updated weights for policy 0, policy_version 186475 (0.0044) [2024-06-18 18:24:00,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3055206400. Throughput: 0: 41607.6. Samples: 279275860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:24:00,501][18875] Avg episode reward: [(0, '0.536')] [2024-06-18 18:24:03,894][19107] Updated weights for policy 0, policy_version 186485 (0.0034) [2024-06-18 18:24:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41599.2). Total num frames: 3055435776. Throughput: 0: 41841.4. Samples: 279525200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:24:05,501][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 18:24:08,005][19107] Updated weights for policy 0, policy_version 186495 (0.0043) [2024-06-18 18:24:10,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 3055648768. Throughput: 0: 41778.6. Samples: 279774900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:24:10,501][18875] Avg episode reward: [(0, '0.361')] [2024-06-18 18:24:11,854][19107] Updated weights for policy 0, policy_version 186505 (0.0032) [2024-06-18 18:24:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3055845376. Throughput: 0: 41689.2. Samples: 279903040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:24:15,501][18875] Avg episode reward: [(0, '0.385')] [2024-06-18 18:24:15,659][19107] Updated weights for policy 0, policy_version 186515 (0.0031) [2024-06-18 18:24:19,393][19107] Updated weights for policy 0, policy_version 186525 (0.0038) [2024-06-18 18:24:20,501][18875] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 41654.2). 
Total num frames: 3056074752. Throughput: 0: 41974.3. Samples: 280160240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:24:20,501][18875] Avg episode reward: [(0, '0.264')] [2024-06-18 18:24:23,463][19107] Updated weights for policy 0, policy_version 186535 (0.0032) [2024-06-18 18:24:25,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 3056287744. Throughput: 0: 41939.7. Samples: 280406680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:24:25,501][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 18:24:26,965][19107] Updated weights for policy 0, policy_version 186545 (0.0035) [2024-06-18 18:24:30,500][18875] Fps is (10 sec: 39323.0, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3056467968. Throughput: 0: 41949.7. Samples: 280532400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:24:30,500][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 18:24:31,165][19107] Updated weights for policy 0, policy_version 186555 (0.0034) [2024-06-18 18:24:34,671][19107] Updated weights for policy 0, policy_version 186565 (0.0035) [2024-06-18 18:24:35,501][18875] Fps is (10 sec: 40958.5, 60 sec: 42325.1, 300 sec: 41654.2). Total num frames: 3056697344. Throughput: 0: 41989.6. Samples: 280782760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:24:35,501][18875] Avg episode reward: [(0, '0.244')] [2024-06-18 18:24:35,518][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000186566_3056697344.pth... [2024-06-18 18:24:35,563][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000185957_3046719488.pth [2024-06-18 18:24:39,105][19107] Updated weights for policy 0, policy_version 186575 (0.0037) [2024-06-18 18:24:40,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 3056910336. Throughput: 0: 41904.1. Samples: 281033660. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 22.0) [2024-06-18 18:24:40,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 18:24:42,413][19107] Updated weights for policy 0, policy_version 186585 (0.0042) [2024-06-18 18:24:45,500][18875] Fps is (10 sec: 39323.1, 60 sec: 41508.7, 300 sec: 41543.2). Total num frames: 3057090560. Throughput: 0: 41866.7. Samples: 281159860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:24:45,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 18:24:47,017][19107] Updated weights for policy 0, policy_version 186595 (0.0040) [2024-06-18 18:24:50,084][19107] Updated weights for policy 0, policy_version 186605 (0.0037) [2024-06-18 18:24:50,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3057336320. Throughput: 0: 41870.2. Samples: 281409360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:24:50,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 18:24:54,832][19107] Updated weights for policy 0, policy_version 186615 (0.0037) [2024-06-18 18:24:55,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 41654.3). Total num frames: 3057549312. Throughput: 0: 42027.6. Samples: 281666140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:24:55,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 18:24:58,005][19107] Updated weights for policy 0, policy_version 186625 (0.0031) [2024-06-18 18:25:00,504][18875] Fps is (10 sec: 39307.6, 60 sec: 42049.7, 300 sec: 41542.7). Total num frames: 3057729536. Throughput: 0: 41756.7. Samples: 281782240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:00,504][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 18:25:02,419][19087] Signal inference workers to stop experience collection... 
(4100 times) [2024-06-18 18:25:02,470][19107] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-06-18 18:25:02,477][19087] Signal inference workers to resume experience collection... (4100 times) [2024-06-18 18:25:02,485][19107] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-06-18 18:25:02,633][19107] Updated weights for policy 0, policy_version 186635 (0.0043) [2024-06-18 18:25:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41654.3). Total num frames: 3057958912. Throughput: 0: 41591.8. Samples: 282031860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:05,501][18875] Avg episode reward: [(0, '0.479')] [2024-06-18 18:25:06,385][19107] Updated weights for policy 0, policy_version 186645 (0.0036) [2024-06-18 18:25:10,500][18875] Fps is (10 sec: 40974.7, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 3058139136. Throughput: 0: 41851.9. Samples: 282290020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:10,501][18875] Avg episode reward: [(0, '0.672')] [2024-06-18 18:25:10,528][19107] Updated weights for policy 0, policy_version 186655 (0.0036) [2024-06-18 18:25:14,092][19107] Updated weights for policy 0, policy_version 186665 (0.0033) [2024-06-18 18:25:15,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3058352128. Throughput: 0: 41708.3. Samples: 282409280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:15,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 18:25:18,321][19107] Updated weights for policy 0, policy_version 186675 (0.0028) [2024-06-18 18:25:20,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 3058565120. Throughput: 0: 41619.8. Samples: 282655640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:20,501][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 18:25:21,773][19107] Updated weights for policy 0, policy_version 186685 (0.0036) [2024-06-18 18:25:25,504][18875] Fps is (10 sec: 40945.7, 60 sec: 41230.6, 300 sec: 41598.2). Total num frames: 3058761728. Throughput: 0: 41937.5. Samples: 282921000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:25,504][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 18:25:26,066][19107] Updated weights for policy 0, policy_version 186695 (0.0036) [2024-06-18 18:25:29,519][19107] Updated weights for policy 0, policy_version 186705 (0.0027) [2024-06-18 18:25:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 3058991104. Throughput: 0: 41750.1. Samples: 283038620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:30,501][18875] Avg episode reward: [(0, '0.764')] [2024-06-18 18:25:33,723][19107] Updated weights for policy 0, policy_version 186715 (0.0031) [2024-06-18 18:25:35,500][18875] Fps is (10 sec: 45892.1, 60 sec: 42052.5, 300 sec: 41654.3). Total num frames: 3059220480. Throughput: 0: 41825.0. Samples: 283291480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:35,500][18875] Avg episode reward: [(0, '0.734')] [2024-06-18 18:25:37,517][19107] Updated weights for policy 0, policy_version 186725 (0.0044) [2024-06-18 18:25:40,500][18875] Fps is (10 sec: 37683.6, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 3059367936. Throughput: 0: 41788.9. Samples: 283546640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:40,500][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 18:25:41,581][19107] Updated weights for policy 0, policy_version 186735 (0.0046) [2024-06-18 18:25:45,099][19107] Updated weights for policy 0, policy_version 186745 (0.0038) [2024-06-18 18:25:45,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 41654.2). 
Total num frames: 3059630080. Throughput: 0: 41777.9. Samples: 283662100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 18:25:45,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 18:25:49,504][19107] Updated weights for policy 0, policy_version 186755 (0.0029) [2024-06-18 18:25:50,500][18875] Fps is (10 sec: 45874.3, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3059826688. Throughput: 0: 42000.3. Samples: 283921880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:25:50,501][18875] Avg episode reward: [(0, '0.334')] [2024-06-18 18:25:52,865][19107] Updated weights for policy 0, policy_version 186765 (0.0037) [2024-06-18 18:25:55,500][18875] Fps is (10 sec: 37683.4, 60 sec: 40959.9, 300 sec: 41543.2). Total num frames: 3060006912. Throughput: 0: 41804.8. Samples: 284171240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:25:55,501][18875] Avg episode reward: [(0, '0.303')] [2024-06-18 18:25:57,322][19107] Updated weights for policy 0, policy_version 186775 (0.0027) [2024-06-18 18:26:00,500][18875] Fps is (10 sec: 44237.7, 60 sec: 42327.9, 300 sec: 41709.8). Total num frames: 3060269056. Throughput: 0: 41936.6. Samples: 284296420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:00,500][18875] Avg episode reward: [(0, '0.317')] [2024-06-18 18:26:00,584][19107] Updated weights for policy 0, policy_version 186785 (0.0041) [2024-06-18 18:26:05,069][19107] Updated weights for policy 0, policy_version 186795 (0.0030) [2024-06-18 18:26:05,500][18875] Fps is (10 sec: 45874.9, 60 sec: 41779.1, 300 sec: 41654.5). Total num frames: 3060465664. Throughput: 0: 42138.1. Samples: 284551860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:05,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 18:26:08,409][19107] Updated weights for policy 0, policy_version 186805 (0.0026) [2024-06-18 18:26:10,500][18875] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 41654.3). 
Total num frames: 3060662272. Throughput: 0: 41695.4. Samples: 284797140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:10,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 18:26:12,921][19107] Updated weights for policy 0, policy_version 186815 (0.0028) [2024-06-18 18:26:15,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41654.8). Total num frames: 3060875264. Throughput: 0: 41800.0. Samples: 284919620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:15,501][18875] Avg episode reward: [(0, '0.341')] [2024-06-18 18:26:16,418][19107] Updated weights for policy 0, policy_version 186825 (0.0036) [2024-06-18 18:26:20,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3061088256. Throughput: 0: 41870.1. Samples: 285175640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:20,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 18:26:20,690][19107] Updated weights for policy 0, policy_version 186835 (0.0032) [2024-06-18 18:26:23,429][19087] Signal inference workers to stop experience collection... (4150 times) [2024-06-18 18:26:23,474][19087] Signal inference workers to resume experience collection... (4150 times) [2024-06-18 18:26:23,484][19107] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-06-18 18:26:23,519][19107] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-06-18 18:26:24,074][19107] Updated weights for policy 0, policy_version 186845 (0.0044) [2024-06-18 18:26:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42054.7, 300 sec: 41654.2). Total num frames: 3061284864. Throughput: 0: 41696.3. Samples: 285422980. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:25,501][18875] Avg episode reward: [(0, '0.331')] [2024-06-18 18:26:28,693][19107] Updated weights for policy 0, policy_version 186855 (0.0037) [2024-06-18 18:26:30,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3061481472. Throughput: 0: 42007.6. Samples: 285552440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:30,501][18875] Avg episode reward: [(0, '0.236')] [2024-06-18 18:26:31,991][19107] Updated weights for policy 0, policy_version 186865 (0.0028) [2024-06-18 18:26:35,502][18875] Fps is (10 sec: 40951.7, 60 sec: 41231.6, 300 sec: 41653.9). Total num frames: 3061694464. Throughput: 0: 41675.9. Samples: 285797380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:35,503][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 18:26:35,528][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000186871_3061694464.pth... [2024-06-18 18:26:35,587][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000186261_3051700224.pth [2024-06-18 18:26:36,466][19107] Updated weights for policy 0, policy_version 186875 (0.0042) [2024-06-18 18:26:39,830][19107] Updated weights for policy 0, policy_version 186885 (0.0028) [2024-06-18 18:26:40,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 41654.2). Total num frames: 3061923840. Throughput: 0: 41388.0. Samples: 286033700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:40,501][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 18:26:44,234][19107] Updated weights for policy 0, policy_version 186895 (0.0037) [2024-06-18 18:26:45,500][18875] Fps is (10 sec: 40968.2, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 3062104064. Throughput: 0: 41574.0. Samples: 286167260. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:45,501][18875] Avg episode reward: [(0, '0.305')] [2024-06-18 18:26:47,793][19107] Updated weights for policy 0, policy_version 186905 (0.0031) [2024-06-18 18:26:50,500][18875] Fps is (10 sec: 37683.2, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3062300672. Throughput: 0: 41574.8. Samples: 286422720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:50,501][18875] Avg episode reward: [(0, '0.394')] [2024-06-18 18:26:52,089][19107] Updated weights for policy 0, policy_version 186915 (0.0033) [2024-06-18 18:26:55,500][18875] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 41654.2). Total num frames: 3062546432. Throughput: 0: 41540.0. Samples: 286666440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 18:26:55,501][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 18:26:56,065][19107] Updated weights for policy 0, policy_version 186925 (0.0049) [2024-06-18 18:27:00,025][19107] Updated weights for policy 0, policy_version 186935 (0.0028) [2024-06-18 18:27:00,500][18875] Fps is (10 sec: 45875.0, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 3062759424. Throughput: 0: 41840.8. Samples: 286802460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:00,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 18:27:03,709][19107] Updated weights for policy 0, policy_version 186945 (0.0048) [2024-06-18 18:27:05,500][18875] Fps is (10 sec: 37683.1, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 3062923264. Throughput: 0: 41601.4. Samples: 287047700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:05,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 18:27:07,646][19107] Updated weights for policy 0, policy_version 186955 (0.0036) [2024-06-18 18:27:10,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 3063185408. Throughput: 0: 41662.7. Samples: 287297800. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:10,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 18:27:11,336][19107] Updated weights for policy 0, policy_version 186965 (0.0030) [2024-06-18 18:27:15,500][18875] Fps is (10 sec: 45875.4, 60 sec: 41779.2, 300 sec: 41710.3). Total num frames: 3063382016. Throughput: 0: 41908.6. Samples: 287438320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:15,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 18:27:15,521][19107] Updated weights for policy 0, policy_version 186975 (0.0028) [2024-06-18 18:27:18,930][19107] Updated weights for policy 0, policy_version 186985 (0.0029) [2024-06-18 18:27:20,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3063578624. Throughput: 0: 41841.9. Samples: 287680180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:20,501][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 18:27:23,628][19107] Updated weights for policy 0, policy_version 186995 (0.0033) [2024-06-18 18:27:25,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42325.5, 300 sec: 41820.9). Total num frames: 3063824384. Throughput: 0: 42257.5. Samples: 287935280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:25,500][18875] Avg episode reward: [(0, '0.335')] [2024-06-18 18:27:26,747][19107] Updated weights for policy 0, policy_version 187005 (0.0028) [2024-06-18 18:27:30,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3064004608. Throughput: 0: 42239.2. Samples: 288068020. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:30,501][18875] Avg episode reward: [(0, '0.383')] [2024-06-18 18:27:31,307][19107] Updated weights for policy 0, policy_version 187015 (0.0032) [2024-06-18 18:27:34,725][19107] Updated weights for policy 0, policy_version 187025 (0.0029) [2024-06-18 18:27:35,500][18875] Fps is (10 sec: 40959.0, 60 sec: 42326.7, 300 sec: 41765.3). Total num frames: 3064233984. Throughput: 0: 41996.3. Samples: 288312560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:35,501][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 18:27:39,044][19107] Updated weights for policy 0, policy_version 187035 (0.0036) [2024-06-18 18:27:40,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3064446976. Throughput: 0: 42129.3. Samples: 288562260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:40,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 18:27:42,489][19107] Updated weights for policy 0, policy_version 187045 (0.0040) [2024-06-18 18:27:45,500][18875] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3064627200. Throughput: 0: 41957.8. Samples: 288690560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:45,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 18:27:46,899][19107] Updated weights for policy 0, policy_version 187055 (0.0023) [2024-06-18 18:27:47,479][19087] Signal inference workers to stop experience collection... (4200 times) [2024-06-18 18:27:47,526][19107] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-06-18 18:27:47,533][19087] Signal inference workers to resume experience collection... 
(4200 times) [2024-06-18 18:27:47,542][19107] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-06-18 18:27:50,478][19107] Updated weights for policy 0, policy_version 187065 (0.0036) [2024-06-18 18:27:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 41876.4). Total num frames: 3064872960. Throughput: 0: 42175.9. Samples: 288945620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:50,501][18875] Avg episode reward: [(0, '0.287')] [2024-06-18 18:27:54,755][19107] Updated weights for policy 0, policy_version 187075 (0.0030) [2024-06-18 18:27:55,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3065053184. Throughput: 0: 42182.7. Samples: 289196020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:27:55,501][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 18:27:58,379][19107] Updated weights for policy 0, policy_version 187085 (0.0032) [2024-06-18 18:28:00,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3065249792. Throughput: 0: 41760.0. Samples: 289317520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 18:28:00,500][18875] Avg episode reward: [(0, '0.370')] [2024-06-18 18:28:02,384][19107] Updated weights for policy 0, policy_version 187095 (0.0047) [2024-06-18 18:28:05,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 41876.4). Total num frames: 3065495552. Throughput: 0: 41999.2. Samples: 289570140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:05,501][18875] Avg episode reward: [(0, '0.415')] [2024-06-18 18:28:06,077][19107] Updated weights for policy 0, policy_version 187105 (0.0050) [2024-06-18 18:28:10,497][19107] Updated weights for policy 0, policy_version 187115 (0.0042) [2024-06-18 18:28:10,500][18875] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3065692160. Throughput: 0: 41859.4. Samples: 289818960. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:10,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 18:28:14,016][19107] Updated weights for policy 0, policy_version 187125 (0.0029) [2024-06-18 18:28:15,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3065872384. Throughput: 0: 41744.0. Samples: 289946500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:15,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 18:28:18,208][19107] Updated weights for policy 0, policy_version 187135 (0.0032) [2024-06-18 18:28:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3066118144. Throughput: 0: 41845.9. Samples: 290195620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:20,501][18875] Avg episode reward: [(0, '0.672')] [2024-06-18 18:28:22,095][19107] Updated weights for policy 0, policy_version 187145 (0.0028) [2024-06-18 18:28:25,500][18875] Fps is (10 sec: 44236.0, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 3066314752. Throughput: 0: 41989.2. Samples: 290451780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:25,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 18:28:25,841][19107] Updated weights for policy 0, policy_version 187155 (0.0030) [2024-06-18 18:28:30,066][19107] Updated weights for policy 0, policy_version 187165 (0.0038) [2024-06-18 18:28:30,500][18875] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3066511360. Throughput: 0: 41965.4. Samples: 290579000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:30,500][18875] Avg episode reward: [(0, '0.682')] [2024-06-18 18:28:33,714][19107] Updated weights for policy 0, policy_version 187175 (0.0032) [2024-06-18 18:28:35,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 3066724352. Throughput: 0: 41777.8. Samples: 290825620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:35,501][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 18:28:35,645][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000187180_3066757120.pth... [2024-06-18 18:28:35,705][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000186566_3056697344.pth [2024-06-18 18:28:38,540][19107] Updated weights for policy 0, policy_version 187185 (0.0029) [2024-06-18 18:28:40,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 41821.4). Total num frames: 3066937344. Throughput: 0: 41777.8. Samples: 291076020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:40,501][18875] Avg episode reward: [(0, '0.396')] [2024-06-18 18:28:41,563][19107] Updated weights for policy 0, policy_version 187195 (0.0035) [2024-06-18 18:28:45,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3067133952. Throughput: 0: 41840.8. Samples: 291200360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:45,501][18875] Avg episode reward: [(0, '0.800')] [2024-06-18 18:28:46,205][19107] Updated weights for policy 0, policy_version 187205 (0.0036) [2024-06-18 18:28:49,296][19107] Updated weights for policy 0, policy_version 187215 (0.0032) [2024-06-18 18:28:50,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3067363328. Throughput: 0: 41787.2. Samples: 291450560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:50,501][18875] Avg episode reward: [(0, '0.750')] [2024-06-18 18:28:53,926][19107] Updated weights for policy 0, policy_version 187225 (0.0037) [2024-06-18 18:28:55,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3067576320. Throughput: 0: 41834.1. Samples: 291701500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:28:55,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 18:28:57,350][19107] Updated weights for policy 0, policy_version 187235 (0.0024) [2024-06-18 18:28:58,507][19087] Signal inference workers to stop experience collection... (4250 times) [2024-06-18 18:28:58,509][19087] Signal inference workers to resume experience collection... (4250 times) [2024-06-18 18:28:58,547][19107] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-06-18 18:28:58,547][19107] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-06-18 18:29:00,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3067756544. Throughput: 0: 41811.0. Samples: 291828000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:29:00,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 18:29:01,644][19107] Updated weights for policy 0, policy_version 187245 (0.0030) [2024-06-18 18:29:05,203][19107] Updated weights for policy 0, policy_version 187255 (0.0037) [2024-06-18 18:29:05,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 3067985920. Throughput: 0: 41967.9. Samples: 292084180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 18:29:05,501][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 18:29:09,654][19107] Updated weights for policy 0, policy_version 187265 (0.0030) [2024-06-18 18:29:10,500][18875] Fps is (10 sec: 44237.2, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3068198912. Throughput: 0: 41784.2. Samples: 292332060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:10,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 18:29:13,174][19107] Updated weights for policy 0, policy_version 187275 (0.0032) [2024-06-18 18:29:15,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 41820.9). Total num frames: 3068411904. Throughput: 0: 41751.3. 
Samples: 292457820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:15,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 18:29:17,367][19107] Updated weights for policy 0, policy_version 187285 (0.0026) [2024-06-18 18:29:20,501][18875] Fps is (10 sec: 40959.1, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3068608512. Throughput: 0: 41757.7. Samples: 292704720. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:20,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 18:29:20,922][19107] Updated weights for policy 0, policy_version 187295 (0.0037) [2024-06-18 18:29:25,119][19107] Updated weights for policy 0, policy_version 187305 (0.0037) [2024-06-18 18:29:25,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3068821504. Throughput: 0: 41862.2. Samples: 292959820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:25,501][18875] Avg episode reward: [(0, '0.318')] [2024-06-18 18:29:28,878][19107] Updated weights for policy 0, policy_version 187315 (0.0037) [2024-06-18 18:29:30,500][18875] Fps is (10 sec: 44237.7, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3069050880. Throughput: 0: 41911.6. Samples: 293086380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:30,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 18:29:32,782][19107] Updated weights for policy 0, policy_version 187325 (0.0042) [2024-06-18 18:29:35,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3069214720. Throughput: 0: 41838.5. Samples: 293333300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:35,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 18:29:36,610][19107] Updated weights for policy 0, policy_version 187335 (0.0036) [2024-06-18 18:29:40,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3069444096. Throughput: 0: 41957.5. 
Samples: 293589580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:40,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 18:29:40,546][19107] Updated weights for policy 0, policy_version 187345 (0.0039) [2024-06-18 18:29:44,568][19107] Updated weights for policy 0, policy_version 187355 (0.0037) [2024-06-18 18:29:45,500][18875] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 3069673472. Throughput: 0: 41949.9. Samples: 293715740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:45,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 18:29:48,452][19107] Updated weights for policy 0, policy_version 187365 (0.0031) [2024-06-18 18:29:50,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3069870080. Throughput: 0: 41722.0. Samples: 293961660. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:50,500][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 18:29:52,124][19107] Updated weights for policy 0, policy_version 187375 (0.0032) [2024-06-18 18:29:55,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.2, 300 sec: 41821.4). Total num frames: 3070066688. Throughput: 0: 41955.0. Samples: 294220040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:29:55,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 18:29:56,176][19107] Updated weights for policy 0, policy_version 187385 (0.0038) [2024-06-18 18:29:59,972][19107] Updated weights for policy 0, policy_version 187395 (0.0041) [2024-06-18 18:30:00,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 3070296064. Throughput: 0: 41900.6. Samples: 294343340. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:30:00,501][18875] Avg episode reward: [(0, '0.592')] [2024-06-18 18:30:04,016][19107] Updated weights for policy 0, policy_version 187405 (0.0029) [2024-06-18 18:30:05,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 3070509056. Throughput: 0: 42100.6. Samples: 294599240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:30:05,501][18875] Avg episode reward: [(0, '0.592')] [2024-06-18 18:30:07,596][19107] Updated weights for policy 0, policy_version 187415 (0.0034) [2024-06-18 18:30:10,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3070705664. Throughput: 0: 42009.3. Samples: 294850240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 20.0) [2024-06-18 18:30:10,501][18875] Avg episode reward: [(0, '0.627')] [2024-06-18 18:30:11,787][19107] Updated weights for policy 0, policy_version 187425 (0.0035) [2024-06-18 18:30:15,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3070918656. Throughput: 0: 41948.3. Samples: 294974060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:30:15,501][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 18:30:15,537][19107] Updated weights for policy 0, policy_version 187435 (0.0032) [2024-06-18 18:30:19,695][19107] Updated weights for policy 0, policy_version 187445 (0.0032) [2024-06-18 18:30:20,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 41988.0). Total num frames: 3071148032. Throughput: 0: 42118.3. Samples: 295228620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:30:20,501][18875] Avg episode reward: [(0, '0.273')] [2024-06-18 18:30:23,449][19087] Signal inference workers to stop experience collection... (4300 times) [2024-06-18 18:30:23,451][19087] Signal inference workers to resume experience collection... 
(4300 times) [2024-06-18 18:30:23,464][19107] Updated weights for policy 0, policy_version 187455 (0.0037) [2024-06-18 18:30:23,473][19107] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-06-18 18:30:23,473][19107] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-06-18 18:30:25,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3071344640. Throughput: 0: 41950.2. Samples: 295477340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:30:25,501][18875] Avg episode reward: [(0, '0.306')] [2024-06-18 18:30:27,451][19107] Updated weights for policy 0, policy_version 187465 (0.0049) [2024-06-18 18:30:30,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3071541248. Throughput: 0: 41921.8. Samples: 295602220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:30:30,501][18875] Avg episode reward: [(0, '0.275')] [2024-06-18 18:30:31,203][19107] Updated weights for policy 0, policy_version 187475 (0.0042) [2024-06-18 18:30:35,268][19107] Updated weights for policy 0, policy_version 187485 (0.0033) [2024-06-18 18:30:35,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3071770624. Throughput: 0: 42115.5. Samples: 295856860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:30:35,500][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 18:30:35,523][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000187486_3071770624.pth... [2024-06-18 18:30:35,577][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000186871_3061694464.pth [2024-06-18 18:30:39,133][19107] Updated weights for policy 0, policy_version 187495 (0.0053) [2024-06-18 18:30:40,500][18875] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 3071967232. Throughput: 0: 41772.4. Samples: 296099800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:30:40,501][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 18:30:43,015][19107] Updated weights for policy 0, policy_version 187505 (0.0028) [2024-06-18 18:30:45,501][18875] Fps is (10 sec: 40959.0, 60 sec: 41779.0, 300 sec: 41876.4). Total num frames: 3072180224. Throughput: 0: 41935.3. Samples: 296230440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:30:45,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 18:30:46,890][19107] Updated weights for policy 0, policy_version 187515 (0.0036) [2024-06-18 18:30:50,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3072376832. Throughput: 0: 41809.7. Samples: 296480680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:30:50,501][18875] Avg episode reward: [(0, '0.326')] [2024-06-18 18:30:50,947][19107] Updated weights for policy 0, policy_version 187525 (0.0052) [2024-06-18 18:30:54,557][19107] Updated weights for policy 0, policy_version 187535 (0.0043) [2024-06-18 18:30:55,500][18875] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 3072606208. Throughput: 0: 41764.5. Samples: 296729640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:30:55,501][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 18:30:58,967][19107] Updated weights for policy 0, policy_version 187545 (0.0030) [2024-06-18 18:31:00,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 3072802816. Throughput: 0: 41854.8. Samples: 296857520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:31:00,504][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 18:31:02,783][19107] Updated weights for policy 0, policy_version 187555 (0.0040) [2024-06-18 18:31:05,501][18875] Fps is (10 sec: 39321.0, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 3072999424. Throughput: 0: 41572.3. Samples: 297099380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:31:05,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 18:31:06,962][19107] Updated weights for policy 0, policy_version 187565 (0.0030) [2024-06-18 18:31:10,504][18875] Fps is (10 sec: 40943.9, 60 sec: 41776.5, 300 sec: 41820.3). Total num frames: 3073212416. Throughput: 0: 41656.8. Samples: 297352060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:31:10,507][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 18:31:10,549][19107] Updated weights for policy 0, policy_version 187575 (0.0037) [2024-06-18 18:31:14,803][19107] Updated weights for policy 0, policy_version 187585 (0.0036) [2024-06-18 18:31:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3073409024. Throughput: 0: 41628.7. Samples: 297475520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 25.0) [2024-06-18 18:31:15,501][18875] Avg episode reward: [(0, '0.385')] [2024-06-18 18:31:18,564][19107] Updated weights for policy 0, policy_version 187595 (0.0046) [2024-06-18 18:31:20,500][18875] Fps is (10 sec: 42615.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3073638400. Throughput: 0: 41550.6. Samples: 297726640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:31:20,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 18:31:22,702][19107] Updated weights for policy 0, policy_version 187605 (0.0036) [2024-06-18 18:31:25,500][18875] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3073851392. Throughput: 0: 41536.5. Samples: 297968940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:31:25,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 18:31:26,393][19107] Updated weights for policy 0, policy_version 187615 (0.0040) [2024-06-18 18:31:30,500][18875] Fps is (10 sec: 37683.7, 60 sec: 41233.0, 300 sec: 41765.6). Total num frames: 3074015232. Throughput: 0: 41547.8. Samples: 298100080. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:31:30,501][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 18:31:30,706][19107] Updated weights for policy 0, policy_version 187625 (0.0032) [2024-06-18 18:31:34,618][19107] Updated weights for policy 0, policy_version 187635 (0.0028) [2024-06-18 18:31:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3074277376. Throughput: 0: 41661.4. Samples: 298355440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:31:35,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 18:31:38,465][19107] Updated weights for policy 0, policy_version 187645 (0.0034) [2024-06-18 18:31:40,500][18875] Fps is (10 sec: 45875.4, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 3074473984. Throughput: 0: 41526.8. Samples: 298598340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:31:40,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 18:31:42,410][19107] Updated weights for policy 0, policy_version 187655 (0.0024) [2024-06-18 18:31:45,500][18875] Fps is (10 sec: 37682.6, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3074654208. Throughput: 0: 41489.7. Samples: 298724560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:31:45,502][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 18:31:46,160][19107] Updated weights for policy 0, policy_version 187665 (0.0031) [2024-06-18 18:31:50,221][19107] Updated weights for policy 0, policy_version 187675 (0.0041) [2024-06-18 18:31:50,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3074883584. Throughput: 0: 41671.3. Samples: 298974580. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:31:50,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 18:31:53,819][19107] Updated weights for policy 0, policy_version 187685 (0.0040) [2024-06-18 18:31:55,500][18875] Fps is (10 sec: 44237.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3075096576. Throughput: 0: 41683.2. Samples: 299227640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:31:55,501][18875] Avg episode reward: [(0, '0.347')] [2024-06-18 18:31:58,100][19107] Updated weights for policy 0, policy_version 187695 (0.0033) [2024-06-18 18:32:00,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3075293184. Throughput: 0: 41829.0. Samples: 299357820. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:32:00,501][18875] Avg episode reward: [(0, '0.383')] [2024-06-18 18:32:01,673][19107] Updated weights for policy 0, policy_version 187705 (0.0040) [2024-06-18 18:32:05,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41506.3, 300 sec: 41709.8). Total num frames: 3075489792. Throughput: 0: 41625.9. Samples: 299599800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:32:05,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 18:32:05,739][19107] Updated weights for policy 0, policy_version 187715 (0.0025) [2024-06-18 18:32:09,383][19107] Updated weights for policy 0, policy_version 187725 (0.0036) [2024-06-18 18:32:10,291][19087] Signal inference workers to stop experience collection... (4350 times) [2024-06-18 18:32:10,328][19107] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-06-18 18:32:10,346][19087] Signal inference workers to resume experience collection... (4350 times) [2024-06-18 18:32:10,347][19107] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-06-18 18:32:10,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42055.0, 300 sec: 41876.4). Total num frames: 3075735552. Throughput: 0: 41819.0. 
Samples: 299850800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:32:10,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 18:32:13,522][19107] Updated weights for policy 0, policy_version 187735 (0.0035) [2024-06-18 18:32:15,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3075915776. Throughput: 0: 41760.8. Samples: 299979320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:32:15,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 18:32:17,281][19107] Updated weights for policy 0, policy_version 187745 (0.0043) [2024-06-18 18:32:20,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3076145152. Throughput: 0: 41604.3. Samples: 300227640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:32:20,501][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 18:32:21,215][19107] Updated weights for policy 0, policy_version 187755 (0.0047) [2024-06-18 18:32:25,267][19107] Updated weights for policy 0, policy_version 187765 (0.0030) [2024-06-18 18:32:25,504][18875] Fps is (10 sec: 44221.1, 60 sec: 41776.7, 300 sec: 41875.9). Total num frames: 3076358144. Throughput: 0: 41902.8. Samples: 300484120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-18 18:32:25,505][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 18:32:29,023][19107] Updated weights for policy 0, policy_version 187775 (0.0035) [2024-06-18 18:32:30,500][18875] Fps is (10 sec: 39321.4, 60 sec: 42052.1, 300 sec: 41709.8). Total num frames: 3076538368. Throughput: 0: 41784.0. Samples: 300604840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:32:30,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 18:32:33,083][19107] Updated weights for policy 0, policy_version 187785 (0.0038) [2024-06-18 18:32:35,500][18875] Fps is (10 sec: 42614.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3076784128. Throughput: 0: 41840.5. 
Samples: 300857400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:32:35,500][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 18:32:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000187792_3076784128.pth... [2024-06-18 18:32:35,585][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000187180_3066757120.pth [2024-06-18 18:32:36,763][19107] Updated weights for policy 0, policy_version 187795 (0.0043) [2024-06-18 18:32:40,500][18875] Fps is (10 sec: 42599.3, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3076964352. Throughput: 0: 41808.6. Samples: 301109020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:32:40,500][18875] Avg episode reward: [(0, '0.642')] [2024-06-18 18:32:40,819][19107] Updated weights for policy 0, policy_version 187805 (0.0024) [2024-06-18 18:32:44,918][19107] Updated weights for policy 0, policy_version 187815 (0.0036) [2024-06-18 18:32:45,500][18875] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3077177344. Throughput: 0: 41642.6. Samples: 301231740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:32:45,501][18875] Avg episode reward: [(0, '0.656')] [2024-06-18 18:32:48,499][19107] Updated weights for policy 0, policy_version 187825 (0.0038) [2024-06-18 18:32:50,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3077406720. Throughput: 0: 41862.1. Samples: 301483600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:32:50,501][18875] Avg episode reward: [(0, '0.598')] [2024-06-18 18:32:52,582][19107] Updated weights for policy 0, policy_version 187835 (0.0041) [2024-06-18 18:32:55,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41779.4, 300 sec: 41876.4). Total num frames: 3077603328. Throughput: 0: 41951.8. Samples: 301738620. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:32:55,500][18875] Avg episode reward: [(0, '0.307')] [2024-06-18 18:32:56,470][19107] Updated weights for policy 0, policy_version 187845 (0.0034) [2024-06-18 18:33:00,197][19107] Updated weights for policy 0, policy_version 187855 (0.0041) [2024-06-18 18:33:00,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3077816320. Throughput: 0: 41745.4. Samples: 301857860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:33:00,501][18875] Avg episode reward: [(0, '0.176')] [2024-06-18 18:33:04,327][19107] Updated weights for policy 0, policy_version 187865 (0.0045) [2024-06-18 18:33:05,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3078012928. Throughput: 0: 41958.3. Samples: 302115760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:33:05,501][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 18:33:07,803][19107] Updated weights for policy 0, policy_version 187875 (0.0027) [2024-06-18 18:33:10,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3078225920. Throughput: 0: 41867.7. Samples: 302368020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:33:10,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 18:33:12,043][19107] Updated weights for policy 0, policy_version 187885 (0.0032) [2024-06-18 18:33:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3078438912. Throughput: 0: 41977.5. Samples: 302493820. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:33:15,501][18875] Avg episode reward: [(0, '0.306')] [2024-06-18 18:33:15,787][19107] Updated weights for policy 0, policy_version 187895 (0.0030) [2024-06-18 18:33:19,988][19107] Updated weights for policy 0, policy_version 187905 (0.0039) [2024-06-18 18:33:20,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3078651904. Throughput: 0: 41800.8. Samples: 302738440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:33:20,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 18:33:23,739][19107] Updated weights for policy 0, policy_version 187915 (0.0044) [2024-06-18 18:33:25,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41235.5, 300 sec: 41765.3). Total num frames: 3078832128. Throughput: 0: 41753.6. Samples: 302987940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:33:25,501][18875] Avg episode reward: [(0, '0.285')] [2024-06-18 18:33:27,746][19107] Updated weights for policy 0, policy_version 187925 (0.0038) [2024-06-18 18:33:30,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3079045120. Throughput: 0: 41699.1. Samples: 303108200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-18 18:33:30,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 18:33:32,019][19107] Updated weights for policy 0, policy_version 187935 (0.0024) [2024-06-18 18:33:35,494][19107] Updated weights for policy 0, policy_version 187945 (0.0042) [2024-06-18 18:33:35,501][18875] Fps is (10 sec: 45871.6, 60 sec: 41778.5, 300 sec: 41876.3). Total num frames: 3079290880. Throughput: 0: 41756.9. Samples: 303362700. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:33:35,502][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 18:33:39,952][19107] Updated weights for policy 0, policy_version 187955 (0.0048) [2024-06-18 18:33:40,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.0, 300 sec: 41820.8). Total num frames: 3079471104. Throughput: 0: 41634.4. Samples: 303612180. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:33:40,501][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 18:33:43,165][19107] Updated weights for policy 0, policy_version 187965 (0.0044) [2024-06-18 18:33:45,500][18875] Fps is (10 sec: 37686.3, 60 sec: 41506.1, 300 sec: 41709.7). Total num frames: 3079667712. Throughput: 0: 41719.9. Samples: 303735260. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:33:45,501][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 18:33:48,067][19107] Updated weights for policy 0, policy_version 187975 (0.0036) [2024-06-18 18:33:48,989][19087] Signal inference workers to stop experience collection... (4400 times) [2024-06-18 18:33:49,035][19107] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-06-18 18:33:49,040][19087] Signal inference workers to resume experience collection... (4400 times) [2024-06-18 18:33:49,052][19107] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-06-18 18:33:50,503][18875] Fps is (10 sec: 44226.6, 60 sec: 41777.5, 300 sec: 41820.5). Total num frames: 3079913472. Throughput: 0: 41569.3. Samples: 303986480. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:33:50,503][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 18:33:50,999][19107] Updated weights for policy 0, policy_version 187985 (0.0039) [2024-06-18 18:33:55,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3080093696. Throughput: 0: 41735.3. Samples: 304246100. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:33:55,500][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 18:33:55,738][19107] Updated weights for policy 0, policy_version 187995 (0.0039) [2024-06-18 18:33:58,641][19107] Updated weights for policy 0, policy_version 188005 (0.0043) [2024-06-18 18:34:00,500][18875] Fps is (10 sec: 40969.6, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 3080323072. Throughput: 0: 41473.6. Samples: 304360140. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:34:00,512][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 18:34:03,828][19107] Updated weights for policy 0, policy_version 188015 (0.0051) [2024-06-18 18:34:05,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3080536064. Throughput: 0: 41806.3. Samples: 304619720. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:34:05,501][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 18:34:06,398][19107] Updated weights for policy 0, policy_version 188025 (0.0033) [2024-06-18 18:34:10,500][18875] Fps is (10 sec: 37683.3, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 3080699904. Throughput: 0: 41863.1. Samples: 304871780. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:34:10,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 18:34:11,613][19107] Updated weights for policy 0, policy_version 188035 (0.0027) [2024-06-18 18:34:14,057][19107] Updated weights for policy 0, policy_version 188045 (0.0034) [2024-06-18 18:34:15,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3080962048. Throughput: 0: 41846.6. Samples: 304991300. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:34:15,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 18:34:19,463][19107] Updated weights for policy 0, policy_version 188055 (0.0030) [2024-06-18 18:34:20,500][18875] Fps is (10 sec: 45875.6, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3081158656. Throughput: 0: 41955.5. Samples: 305250660. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:34:20,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 18:34:21,921][19107] Updated weights for policy 0, policy_version 188065 (0.0036) [2024-06-18 18:34:25,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 3081338880. Throughput: 0: 41769.4. Samples: 305491800. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:34:25,501][18875] Avg episode reward: [(0, '0.420')] [2024-06-18 18:34:27,297][19107] Updated weights for policy 0, policy_version 188075 (0.0038) [2024-06-18 18:34:29,843][19107] Updated weights for policy 0, policy_version 188085 (0.0048) [2024-06-18 18:34:30,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 41987.5). Total num frames: 3081601024. Throughput: 0: 41898.4. Samples: 305620680. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:34:30,501][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 18:34:35,085][19107] Updated weights for policy 0, policy_version 188095 (0.0044) [2024-06-18 18:34:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41233.7, 300 sec: 41765.3). Total num frames: 3081764864. Throughput: 0: 42026.3. Samples: 305877560. Policy #0 lag: (min: 0.0, avg: 12.8, max: 25.0) [2024-06-18 18:34:35,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 18:34:35,625][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000188097_3081781248.pth... 
[2024-06-18 18:34:35,674][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000187486_3071770624.pth [2024-06-18 18:34:38,003][19107] Updated weights for policy 0, policy_version 188105 (0.0036) [2024-06-18 18:34:40,500][18875] Fps is (10 sec: 39321.3, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 3081994240. Throughput: 0: 41643.4. Samples: 306120060. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:34:40,501][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 18:34:42,867][19107] Updated weights for policy 0, policy_version 188115 (0.0039) [2024-06-18 18:34:45,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 3082207232. Throughput: 0: 42012.0. Samples: 306250680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:34:45,501][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 18:34:45,697][19107] Updated weights for policy 0, policy_version 188125 (0.0034) [2024-06-18 18:34:50,500][18875] Fps is (10 sec: 37683.7, 60 sec: 40961.7, 300 sec: 41709.8). Total num frames: 3082371072. Throughput: 0: 41902.2. Samples: 306505320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:34:50,501][18875] Avg episode reward: [(0, '0.385')] [2024-06-18 18:34:50,653][19107] Updated weights for policy 0, policy_version 188135 (0.0034) [2024-06-18 18:34:53,319][19107] Updated weights for policy 0, policy_version 188145 (0.0032) [2024-06-18 18:34:55,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3082616832. Throughput: 0: 41724.5. Samples: 306749380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:34:55,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 18:34:58,370][19107] Updated weights for policy 0, policy_version 188155 (0.0036) [2024-06-18 18:34:59,603][19087] Signal inference workers to stop experience collection... 
(4450 times) [2024-06-18 18:34:59,655][19107] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-06-18 18:34:59,655][19087] Signal inference workers to resume experience collection... (4450 times) [2024-06-18 18:34:59,678][19107] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-06-18 18:35:00,501][18875] Fps is (10 sec: 47512.1, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3082846208. Throughput: 0: 42123.9. Samples: 306886880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:35:00,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 18:35:01,173][19107] Updated weights for policy 0, policy_version 188165 (0.0028) [2024-06-18 18:35:05,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 3083010048. Throughput: 0: 41907.6. Samples: 307136500. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:35:05,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 18:35:06,126][19107] Updated weights for policy 0, policy_version 188175 (0.0032) [2024-06-18 18:35:09,371][19107] Updated weights for policy 0, policy_version 188185 (0.0032) [2024-06-18 18:35:10,500][18875] Fps is (10 sec: 40961.1, 60 sec: 42598.5, 300 sec: 41820.9). Total num frames: 3083255808. Throughput: 0: 41919.2. Samples: 307378160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:35:10,501][18875] Avg episode reward: [(0, '0.364')] [2024-06-18 18:35:13,916][19107] Updated weights for policy 0, policy_version 188195 (0.0030) [2024-06-18 18:35:15,500][18875] Fps is (10 sec: 45875.2, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3083468800. Throughput: 0: 42152.0. Samples: 307517520. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:35:15,501][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 18:35:17,219][19107] Updated weights for policy 0, policy_version 188205 (0.0037) [2024-06-18 18:35:20,504][18875] Fps is (10 sec: 39307.3, 60 sec: 41503.6, 300 sec: 41709.3). Total num frames: 3083649024. Throughput: 0: 41865.1. Samples: 307761640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:35:20,505][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 18:35:21,645][19107] Updated weights for policy 0, policy_version 188215 (0.0043) [2024-06-18 18:35:24,869][19107] Updated weights for policy 0, policy_version 188225 (0.0045) [2024-06-18 18:35:25,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 41876.4). Total num frames: 3083894784. Throughput: 0: 41959.2. Samples: 308008220. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:35:25,501][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 18:35:29,533][19107] Updated weights for policy 0, policy_version 188235 (0.0047) [2024-06-18 18:35:30,504][18875] Fps is (10 sec: 44236.7, 60 sec: 41503.6, 300 sec: 41764.8). Total num frames: 3084091392. Throughput: 0: 42002.1. Samples: 308140920. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:35:30,505][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 18:35:32,443][19107] Updated weights for policy 0, policy_version 188245 (0.0045) [2024-06-18 18:35:35,500][18875] Fps is (10 sec: 37683.3, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3084271616. Throughput: 0: 41781.3. Samples: 308385480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:35:35,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 18:35:37,197][19107] Updated weights for policy 0, policy_version 188255 (0.0033) [2024-06-18 18:35:40,500][18875] Fps is (10 sec: 42613.7, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3084517376. Throughput: 0: 41763.5. Samples: 308628740. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 22.0) [2024-06-18 18:35:40,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 18:35:40,703][19107] Updated weights for policy 0, policy_version 188265 (0.0030) [2024-06-18 18:35:45,106][19107] Updated weights for policy 0, policy_version 188275 (0.0041) [2024-06-18 18:35:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 3084697600. Throughput: 0: 41721.6. Samples: 308764340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:35:45,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 18:35:48,302][19107] Updated weights for policy 0, policy_version 188285 (0.0031) [2024-06-18 18:35:50,500][18875] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 3084910592. Throughput: 0: 41485.8. Samples: 309003360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:35:50,500][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 18:35:52,977][19107] Updated weights for policy 0, policy_version 188295 (0.0034) [2024-06-18 18:35:55,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3085156352. Throughput: 0: 41798.7. Samples: 309259100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:35:55,501][18875] Avg episode reward: [(0, '0.347')] [2024-06-18 18:35:55,934][19107] Updated weights for policy 0, policy_version 188305 (0.0033) [2024-06-18 18:36:00,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41506.3, 300 sec: 41820.9). Total num frames: 3085336576. Throughput: 0: 41438.2. Samples: 309382240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:00,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 18:36:00,914][19107] Updated weights for policy 0, policy_version 188315 (0.0038) [2024-06-18 18:36:04,201][19107] Updated weights for policy 0, policy_version 188325 (0.0034) [2024-06-18 18:36:05,500][18875] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 41765.9). 
Total num frames: 3085533184. Throughput: 0: 41403.8. Samples: 309624660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:05,500][18875] Avg episode reward: [(0, '0.262')] [2024-06-18 18:36:08,888][19107] Updated weights for policy 0, policy_version 188335 (0.0039) [2024-06-18 18:36:10,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3085746176. Throughput: 0: 41508.4. Samples: 309876100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:10,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 18:36:12,062][19107] Updated weights for policy 0, policy_version 188345 (0.0034) [2024-06-18 18:36:15,504][18875] Fps is (10 sec: 39307.1, 60 sec: 40957.5, 300 sec: 41653.7). Total num frames: 3085926400. Throughput: 0: 41302.7. Samples: 309999540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:15,513][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 18:36:16,948][19107] Updated weights for policy 0, policy_version 188355 (0.0044) [2024-06-18 18:36:20,040][19107] Updated weights for policy 0, policy_version 188365 (0.0031) [2024-06-18 18:36:20,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42054.7, 300 sec: 41765.3). Total num frames: 3086172160. Throughput: 0: 41384.7. Samples: 310247800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:20,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 18:36:24,662][19107] Updated weights for policy 0, policy_version 188375 (0.0036) [2024-06-18 18:36:25,500][18875] Fps is (10 sec: 44252.8, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3086368768. Throughput: 0: 41587.6. Samples: 310500180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:25,501][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 18:36:27,843][19107] Updated weights for policy 0, policy_version 188385 (0.0046) [2024-06-18 18:36:30,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41508.6, 300 sec: 41709.8). 
Total num frames: 3086581760. Throughput: 0: 41226.6. Samples: 310619540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:30,501][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 18:36:31,888][19087] Signal inference workers to stop experience collection... (4500 times) [2024-06-18 18:36:31,888][19087] Signal inference workers to resume experience collection... (4500 times) [2024-06-18 18:36:31,918][19107] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-06-18 18:36:31,918][19107] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-06-18 18:36:32,507][19107] Updated weights for policy 0, policy_version 188395 (0.0034) [2024-06-18 18:36:35,500][18875] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 41765.3). Total num frames: 3086794752. Throughput: 0: 41507.8. Samples: 310871220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:35,501][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 18:36:35,538][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000188404_3086811136.pth... [2024-06-18 18:36:35,603][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000187792_3076784128.pth [2024-06-18 18:36:35,977][19107] Updated weights for policy 0, policy_version 188405 (0.0032) [2024-06-18 18:36:40,500][18875] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 3086974976. Throughput: 0: 41426.5. Samples: 311123300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:40,501][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 18:36:40,736][19107] Updated weights for policy 0, policy_version 188415 (0.0033) [2024-06-18 18:36:43,987][19107] Updated weights for policy 0, policy_version 188425 (0.0033) [2024-06-18 18:36:45,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3087204352. Throughput: 0: 41382.6. Samples: 311244460. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:45,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 18:36:48,504][19107] Updated weights for policy 0, policy_version 188435 (0.0041) [2024-06-18 18:36:50,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3087400960. Throughput: 0: 41598.6. Samples: 311496600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-18 18:36:50,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 18:36:51,891][19107] Updated weights for policy 0, policy_version 188445 (0.0031) [2024-06-18 18:36:55,500][18875] Fps is (10 sec: 39322.2, 60 sec: 40686.9, 300 sec: 41709.8). Total num frames: 3087597568. Throughput: 0: 41585.8. Samples: 311747460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:36:55,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 18:36:56,218][19107] Updated weights for policy 0, policy_version 188455 (0.0044) [2024-06-18 18:36:59,700][19107] Updated weights for policy 0, policy_version 188465 (0.0035) [2024-06-18 18:37:00,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 3087826944. Throughput: 0: 41533.5. Samples: 311868400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:00,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 18:37:03,817][19107] Updated weights for policy 0, policy_version 188475 (0.0055) [2024-06-18 18:37:05,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 3088039936. Throughput: 0: 41638.3. Samples: 312121520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:05,504][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 18:37:07,746][19107] Updated weights for policy 0, policy_version 188485 (0.0024) [2024-06-18 18:37:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3088236544. Throughput: 0: 41625.7. Samples: 312373340. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:10,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 18:37:11,630][19107] Updated weights for policy 0, policy_version 188495 (0.0037) [2024-06-18 18:37:15,500][18875] Fps is (10 sec: 39322.2, 60 sec: 41781.8, 300 sec: 41654.3). Total num frames: 3088433152. Throughput: 0: 41789.5. Samples: 312500060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:15,500][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 18:37:15,742][19107] Updated weights for policy 0, policy_version 188505 (0.0030) [2024-06-18 18:37:19,420][19107] Updated weights for policy 0, policy_version 188515 (0.0032) [2024-06-18 18:37:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41654.7). Total num frames: 3088646144. Throughput: 0: 41743.6. Samples: 312749680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:20,501][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 18:37:23,594][19107] Updated weights for policy 0, policy_version 188525 (0.0034) [2024-06-18 18:37:25,500][18875] Fps is (10 sec: 44236.0, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 3088875520. Throughput: 0: 41658.2. Samples: 312997920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:25,501][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 18:37:27,355][19107] Updated weights for policy 0, policy_version 188535 (0.0030) [2024-06-18 18:37:30,500][18875] Fps is (10 sec: 44237.5, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3089088512. Throughput: 0: 41890.0. Samples: 313129500. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:30,501][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 18:37:31,367][19107] Updated weights for policy 0, policy_version 188545 (0.0038) [2024-06-18 18:37:34,955][19107] Updated weights for policy 0, policy_version 188555 (0.0032) [2024-06-18 18:37:35,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41779.4, 300 sec: 41820.9). Total num frames: 3089301504. Throughput: 0: 41992.1. Samples: 313386240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:35,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 18:37:39,047][19107] Updated weights for policy 0, policy_version 188565 (0.0035) [2024-06-18 18:37:39,774][19087] Signal inference workers to stop experience collection... (4550 times) [2024-06-18 18:37:39,824][19107] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-06-18 18:37:39,829][19087] Signal inference workers to resume experience collection... (4550 times) [2024-06-18 18:37:39,839][19107] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-06-18 18:37:40,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 3089530880. Throughput: 0: 41946.1. Samples: 313635040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:40,501][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 18:37:42,914][19107] Updated weights for policy 0, policy_version 188575 (0.0024) [2024-06-18 18:37:45,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3089711104. Throughput: 0: 42078.8. Samples: 313761940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:45,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 18:37:46,749][19107] Updated weights for policy 0, policy_version 188585 (0.0028) [2024-06-18 18:37:50,503][18875] Fps is (10 sec: 37672.2, 60 sec: 41777.1, 300 sec: 41709.3). Total num frames: 3089907712. Throughput: 0: 42146.1. 
Samples: 314018220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:50,504][18875] Avg episode reward: [(0, '0.730')] [2024-06-18 18:37:50,690][19107] Updated weights for policy 0, policy_version 188595 (0.0041) [2024-06-18 18:37:54,305][19107] Updated weights for policy 0, policy_version 188605 (0.0037) [2024-06-18 18:37:55,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 41820.9). Total num frames: 3090153472. Throughput: 0: 42159.6. Samples: 314270520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-18 18:37:55,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 18:37:58,361][19107] Updated weights for policy 0, policy_version 188615 (0.0036) [2024-06-18 18:38:00,500][18875] Fps is (10 sec: 44250.4, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 3090350080. Throughput: 0: 42352.9. Samples: 314405940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:00,501][18875] Avg episode reward: [(0, '0.380')] [2024-06-18 18:38:01,909][19107] Updated weights for policy 0, policy_version 188625 (0.0035) [2024-06-18 18:38:05,500][18875] Fps is (10 sec: 37683.0, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3090530304. Throughput: 0: 42368.9. Samples: 314656280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:05,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 18:38:06,574][19107] Updated weights for policy 0, policy_version 188635 (0.0029) [2024-06-18 18:38:09,695][19107] Updated weights for policy 0, policy_version 188645 (0.0030) [2024-06-18 18:38:10,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 41876.4). Total num frames: 3090792448. Throughput: 0: 42266.4. Samples: 314899900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:10,500][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 18:38:14,347][19107] Updated weights for policy 0, policy_version 188655 (0.0039) [2024-06-18 18:38:15,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 3090972672. Throughput: 0: 42409.6. Samples: 315037940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:15,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 18:38:17,480][19107] Updated weights for policy 0, policy_version 188665 (0.0033) [2024-06-18 18:38:20,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3091185664. Throughput: 0: 42152.8. Samples: 315283120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:20,501][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 18:38:22,071][19107] Updated weights for policy 0, policy_version 188675 (0.0031) [2024-06-18 18:38:25,245][19107] Updated weights for policy 0, policy_version 188685 (0.0027) [2024-06-18 18:38:25,500][18875] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 41987.5). Total num frames: 3091431424. Throughput: 0: 42323.2. Samples: 315539580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:25,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 18:38:29,707][19107] Updated weights for policy 0, policy_version 188695 (0.0028) [2024-06-18 18:38:30,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41765.5). Total num frames: 3091611648. Throughput: 0: 42340.0. Samples: 315667240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:30,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 18:38:32,918][19107] Updated weights for policy 0, policy_version 188705 (0.0024) [2024-06-18 18:38:35,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 41932.0). Total num frames: 3091841024. Throughput: 0: 42157.5. Samples: 315915180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:35,501][18875] Avg episode reward: [(0, '0.733')] [2024-06-18 18:38:35,511][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000188711_3091841024.pth... [2024-06-18 18:38:35,567][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000188097_3081781248.pth [2024-06-18 18:38:37,455][19107] Updated weights for policy 0, policy_version 188715 (0.0032) [2024-06-18 18:38:40,502][18875] Fps is (10 sec: 42590.0, 60 sec: 41777.9, 300 sec: 41931.7). Total num frames: 3092037632. Throughput: 0: 42201.8. Samples: 316169680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:40,503][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 18:38:40,787][19107] Updated weights for policy 0, policy_version 188725 (0.0038) [2024-06-18 18:38:45,489][19107] Updated weights for policy 0, policy_version 188735 (0.0038) [2024-06-18 18:38:45,500][18875] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 41765.7). Total num frames: 3092234240. Throughput: 0: 41971.0. Samples: 316294640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:45,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 18:38:48,675][19107] Updated weights for policy 0, policy_version 188745 (0.0031) [2024-06-18 18:38:50,500][18875] Fps is (10 sec: 44244.5, 60 sec: 42873.5, 300 sec: 41987.4). Total num frames: 3092480000. Throughput: 0: 42043.5. Samples: 316548240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:50,501][18875] Avg episode reward: [(0, '0.397')] [2024-06-18 18:38:53,114][19107] Updated weights for policy 0, policy_version 188755 (0.0041) [2024-06-18 18:38:55,501][18875] Fps is (10 sec: 42597.6, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3092660224. Throughput: 0: 42362.9. Samples: 316806240. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:38:55,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 18:38:56,449][19107] Updated weights for policy 0, policy_version 188765 (0.0038) [2024-06-18 18:39:00,504][18875] Fps is (10 sec: 37670.2, 60 sec: 41776.6, 300 sec: 41764.8). Total num frames: 3092856832. Throughput: 0: 41955.0. Samples: 316926060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:39:00,504][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 18:39:00,848][19107] Updated weights for policy 0, policy_version 188775 (0.0038) [2024-06-18 18:39:04,553][19087] Signal inference workers to stop experience collection... (4600 times) [2024-06-18 18:39:04,554][19087] Signal inference workers to resume experience collection... (4600 times) [2024-06-18 18:39:04,558][19107] Updated weights for policy 0, policy_version 188785 (0.0029) [2024-06-18 18:39:04,568][19107] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-06-18 18:39:04,599][19107] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-06-18 18:39:05,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42043.0). Total num frames: 3093102592. Throughput: 0: 42191.5. Samples: 317181740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:05,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 18:39:08,477][19107] Updated weights for policy 0, policy_version 188795 (0.0029) [2024-06-18 18:39:10,500][18875] Fps is (10 sec: 44252.3, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 3093299200. Throughput: 0: 42078.2. Samples: 317433100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:10,501][18875] Avg episode reward: [(0, '0.325')] [2024-06-18 18:39:12,371][19107] Updated weights for policy 0, policy_version 188805 (0.0044) [2024-06-18 18:39:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3093512192. Throughput: 0: 41990.1. 
Samples: 317556800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:15,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 18:39:16,343][19107] Updated weights for policy 0, policy_version 188815 (0.0036) [2024-06-18 18:39:20,147][19107] Updated weights for policy 0, policy_version 188825 (0.0028) [2024-06-18 18:39:20,501][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 3093725184. Throughput: 0: 42145.6. Samples: 317811740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:20,501][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 18:39:24,100][19107] Updated weights for policy 0, policy_version 188835 (0.0023) [2024-06-18 18:39:25,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3093938176. Throughput: 0: 42096.9. Samples: 318063960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:25,501][18875] Avg episode reward: [(0, '0.330')] [2024-06-18 18:39:27,926][19107] Updated weights for policy 0, policy_version 188845 (0.0040) [2024-06-18 18:39:30,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 3094151168. Throughput: 0: 42088.4. Samples: 318188620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:30,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 18:39:31,902][19107] Updated weights for policy 0, policy_version 188855 (0.0041) [2024-06-18 18:39:35,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3094347776. Throughput: 0: 42062.3. Samples: 318441040. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:35,501][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 18:39:35,757][19107] Updated weights for policy 0, policy_version 188865 (0.0030) [2024-06-18 18:39:39,777][19107] Updated weights for policy 0, policy_version 188875 (0.0038) [2024-06-18 18:39:40,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42053.5, 300 sec: 41876.4). Total num frames: 3094560768. Throughput: 0: 41894.3. Samples: 318691480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:40,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 18:39:43,869][19107] Updated weights for policy 0, policy_version 188885 (0.0029) [2024-06-18 18:39:45,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3094773760. Throughput: 0: 42130.5. Samples: 318821780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:45,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 18:39:47,472][19107] Updated weights for policy 0, policy_version 188895 (0.0033) [2024-06-18 18:39:50,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3094970368. Throughput: 0: 41875.1. Samples: 319066120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:50,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 18:39:51,805][19107] Updated weights for policy 0, policy_version 188905 (0.0026) [2024-06-18 18:39:55,163][19107] Updated weights for policy 0, policy_version 188915 (0.0043) [2024-06-18 18:39:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3095183360. Throughput: 0: 41841.8. Samples: 319315980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:39:55,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 18:39:59,696][19107] Updated weights for policy 0, policy_version 188925 (0.0037) [2024-06-18 18:40:00,502][18875] Fps is (10 sec: 42590.0, 60 sec: 42326.4, 300 sec: 41987.2). 
Total num frames: 3095396352. Throughput: 0: 41883.1. Samples: 319441620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:40:00,503][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 18:40:02,829][19107] Updated weights for policy 0, policy_version 188935 (0.0030) [2024-06-18 18:40:05,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3095609344. Throughput: 0: 41774.4. Samples: 319691580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-18 18:40:05,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 18:40:07,528][19107] Updated weights for policy 0, policy_version 188945 (0.0035) [2024-06-18 18:40:10,500][18875] Fps is (10 sec: 40968.7, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3095805952. Throughput: 0: 41772.5. Samples: 319943720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:10,500][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 18:40:10,775][19107] Updated weights for policy 0, policy_version 188955 (0.0033) [2024-06-18 18:40:15,264][19107] Updated weights for policy 0, policy_version 188965 (0.0038) [2024-06-18 18:40:15,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41932.4). Total num frames: 3096018944. Throughput: 0: 41782.6. Samples: 320068840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:15,501][18875] Avg episode reward: [(0, '0.367')] [2024-06-18 18:40:18,935][19107] Updated weights for policy 0, policy_version 188975 (0.0038) [2024-06-18 18:40:20,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41779.3, 300 sec: 41820.8). Total num frames: 3096231936. Throughput: 0: 41694.7. Samples: 320317300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:20,501][18875] Avg episode reward: [(0, '0.749')] [2024-06-18 18:40:22,975][19107] Updated weights for policy 0, policy_version 188985 (0.0040) [2024-06-18 18:40:25,504][18875] Fps is (10 sec: 40945.8, 60 sec: 41503.6, 300 sec: 41820.9). 
Total num frames: 3096428544. Throughput: 0: 41619.5. Samples: 320564500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:25,504][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 18:40:26,788][19107] Updated weights for policy 0, policy_version 188995 (0.0025) [2024-06-18 18:40:30,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3096625152. Throughput: 0: 41496.0. Samples: 320689100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:30,501][18875] Avg episode reward: [(0, '0.323')] [2024-06-18 18:40:31,026][19107] Updated weights for policy 0, policy_version 189005 (0.0029) [2024-06-18 18:40:34,618][19107] Updated weights for policy 0, policy_version 189015 (0.0047) [2024-06-18 18:40:35,500][18875] Fps is (10 sec: 42614.0, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3096854528. Throughput: 0: 41767.2. Samples: 320945640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:35,500][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 18:40:35,605][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000189018_3096870912.pth... [2024-06-18 18:40:35,667][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000188404_3086811136.pth [2024-06-18 18:40:38,874][19107] Updated weights for policy 0, policy_version 189025 (0.0034) [2024-06-18 18:40:40,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 3097051136. Throughput: 0: 41718.3. Samples: 321193300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:40,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 18:40:42,333][19107] Updated weights for policy 0, policy_version 189035 (0.0038) [2024-06-18 18:40:45,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3097264128. Throughput: 0: 41802.3. Samples: 321322640. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:45,501][18875] Avg episode reward: [(0, '0.313')] [2024-06-18 18:40:46,450][19107] Updated weights for policy 0, policy_version 189045 (0.0028) [2024-06-18 18:40:49,900][19087] Signal inference workers to stop experience collection... (4650 times) [2024-06-18 18:40:49,900][19087] Signal inference workers to resume experience collection... (4650 times) [2024-06-18 18:40:49,952][19107] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-06-18 18:40:49,952][19107] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-06-18 18:40:50,030][19107] Updated weights for policy 0, policy_version 189055 (0.0040) [2024-06-18 18:40:50,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3097477120. Throughput: 0: 41716.4. Samples: 321568820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:50,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 18:40:54,427][19107] Updated weights for policy 0, policy_version 189065 (0.0027) [2024-06-18 18:40:55,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.3, 300 sec: 41820.9). Total num frames: 3097673728. Throughput: 0: 41679.1. Samples: 321819280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:40:55,500][18875] Avg episode reward: [(0, '0.730')] [2024-06-18 18:40:58,086][19107] Updated weights for policy 0, policy_version 189075 (0.0040) [2024-06-18 18:41:00,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41780.6, 300 sec: 41931.9). Total num frames: 3097903104. Throughput: 0: 41561.9. Samples: 321939120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:41:00,501][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 18:41:02,504][19107] Updated weights for policy 0, policy_version 189085 (0.0033) [2024-06-18 18:41:05,500][18875] Fps is (10 sec: 42597.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3098099712. Throughput: 0: 41642.6. 
Samples: 322191220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:41:05,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 18:41:05,937][19107] Updated weights for policy 0, policy_version 189095 (0.0038) [2024-06-18 18:41:10,182][19107] Updated weights for policy 0, policy_version 189105 (0.0034) [2024-06-18 18:41:10,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.0, 300 sec: 41932.4). Total num frames: 3098296320. Throughput: 0: 41743.7. Samples: 322442820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:41:10,501][18875] Avg episode reward: [(0, '0.475')] [2024-06-18 18:41:13,884][19107] Updated weights for policy 0, policy_version 189115 (0.0041) [2024-06-18 18:41:15,503][18875] Fps is (10 sec: 40950.2, 60 sec: 41504.5, 300 sec: 41820.5). Total num frames: 3098509312. Throughput: 0: 41683.9. Samples: 322564980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 18:41:15,503][18875] Avg episode reward: [(0, '0.498')] [2024-06-18 18:41:17,990][19107] Updated weights for policy 0, policy_version 189125 (0.0035) [2024-06-18 18:41:20,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3098738688. Throughput: 0: 41556.4. Samples: 322815680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:41:20,501][18875] Avg episode reward: [(0, '0.630')] [2024-06-18 18:41:21,590][19107] Updated weights for policy 0, policy_version 189135 (0.0031) [2024-06-18 18:41:25,504][18875] Fps is (10 sec: 42593.6, 60 sec: 41779.2, 300 sec: 41875.9). Total num frames: 3098935296. Throughput: 0: 41596.2. Samples: 323065280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:41:25,504][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 18:41:25,768][19107] Updated weights for policy 0, policy_version 189145 (0.0026) [2024-06-18 18:41:29,600][19107] Updated weights for policy 0, policy_version 189155 (0.0043) [2024-06-18 18:41:30,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3099131904. Throughput: 0: 41586.3. Samples: 323194020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:41:30,501][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 18:41:33,553][19107] Updated weights for policy 0, policy_version 189165 (0.0033) [2024-06-18 18:41:35,504][18875] Fps is (10 sec: 42598.5, 60 sec: 41776.7, 300 sec: 41987.0). Total num frames: 3099361280. Throughput: 0: 41663.8. Samples: 323443840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:41:35,504][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 18:41:37,729][19107] Updated weights for policy 0, policy_version 189175 (0.0032) [2024-06-18 18:41:40,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3099574272. Throughput: 0: 41673.1. Samples: 323694580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:41:40,501][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 18:41:41,378][19107] Updated weights for policy 0, policy_version 189185 (0.0041) [2024-06-18 18:41:45,500][18875] Fps is (10 sec: 37696.7, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 3099738112. Throughput: 0: 41672.5. Samples: 323814380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:41:45,501][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 18:41:45,714][19107] Updated weights for policy 0, policy_version 189195 (0.0042) [2024-06-18 18:41:49,224][19107] Updated weights for policy 0, policy_version 189205 (0.0037) [2024-06-18 18:41:50,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42043.0). 
Total num frames: 3100000256. Throughput: 0: 41856.1. Samples: 324074740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:41:50,500][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 18:41:53,756][19107] Updated weights for policy 0, policy_version 189215 (0.0034) [2024-06-18 18:41:55,500][18875] Fps is (10 sec: 45874.6, 60 sec: 42052.1, 300 sec: 41931.9). Total num frames: 3100196864. Throughput: 0: 41791.1. Samples: 324323420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:41:55,501][18875] Avg episode reward: [(0, '0.381')] [2024-06-18 18:41:57,129][19107] Updated weights for policy 0, policy_version 189225 (0.0045) [2024-06-18 18:42:00,500][18875] Fps is (10 sec: 39320.9, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3100393472. Throughput: 0: 41764.8. Samples: 324444300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:42:00,501][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 18:42:01,395][19107] Updated weights for policy 0, policy_version 189235 (0.0034) [2024-06-18 18:42:05,023][19107] Updated weights for policy 0, policy_version 189245 (0.0028) [2024-06-18 18:42:05,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3100622848. Throughput: 0: 41906.7. Samples: 324701480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:42:05,501][18875] Avg episode reward: [(0, '0.380')] [2024-06-18 18:42:09,126][19107] Updated weights for policy 0, policy_version 189255 (0.0034) [2024-06-18 18:42:10,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3100819456. Throughput: 0: 41870.5. Samples: 324949300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:42:10,501][18875] Avg episode reward: [(0, '0.536')] [2024-06-18 18:42:12,655][19107] Updated weights for policy 0, policy_version 189265 (0.0043) [2024-06-18 18:42:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42054.0, 300 sec: 41987.5). 
Total num frames: 3101032448. Throughput: 0: 41793.4. Samples: 325074720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:42:15,501][18875] Avg episode reward: [(0, '0.723')] [2024-06-18 18:42:16,994][19107] Updated weights for policy 0, policy_version 189275 (0.0033) [2024-06-18 18:42:19,725][19087] Signal inference workers to stop experience collection... (4700 times) [2024-06-18 18:42:19,726][19087] Signal inference workers to resume experience collection... (4700 times) [2024-06-18 18:42:19,775][19107] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-06-18 18:42:19,780][19107] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-06-18 18:42:20,305][19107] Updated weights for policy 0, policy_version 189285 (0.0040) [2024-06-18 18:42:20,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3101245440. Throughput: 0: 41848.1. Samples: 325326860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-18 18:42:20,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 18:42:25,077][19107] Updated weights for policy 0, policy_version 189295 (0.0030) [2024-06-18 18:42:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41781.7, 300 sec: 41876.4). Total num frames: 3101442048. Throughput: 0: 41908.1. Samples: 325580440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:42:25,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 18:42:28,464][19107] Updated weights for policy 0, policy_version 189305 (0.0041) [2024-06-18 18:42:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3101655040. Throughput: 0: 41821.3. Samples: 325696340. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:42:30,504][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 18:42:32,739][19107] Updated weights for policy 0, policy_version 189315 (0.0034) [2024-06-18 18:42:35,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41508.6, 300 sec: 41765.3). Total num frames: 3101851648. Throughput: 0: 41708.8. Samples: 325951640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:42:35,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 18:42:35,538][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000189323_3101868032.pth... [2024-06-18 18:42:35,595][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000188711_3091841024.pth [2024-06-18 18:42:36,208][19107] Updated weights for policy 0, policy_version 189325 (0.0040) [2024-06-18 18:42:40,349][19107] Updated weights for policy 0, policy_version 189335 (0.0033) [2024-06-18 18:42:40,503][18875] Fps is (10 sec: 40950.3, 60 sec: 41504.5, 300 sec: 41876.0). Total num frames: 3102064640. Throughput: 0: 41825.8. Samples: 326205680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:42:40,503][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 18:42:43,930][19107] Updated weights for policy 0, policy_version 189345 (0.0035) [2024-06-18 18:42:45,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 41987.9). Total num frames: 3102294016. Throughput: 0: 41816.5. Samples: 326326040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:42:45,501][18875] Avg episode reward: [(0, '0.368')] [2024-06-18 18:42:48,164][19107] Updated weights for policy 0, policy_version 189355 (0.0037) [2024-06-18 18:42:50,500][18875] Fps is (10 sec: 42609.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3102490624. Throughput: 0: 41797.3. Samples: 326582360. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:42:50,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 18:42:51,607][19107] Updated weights for policy 0, policy_version 189365 (0.0037) [2024-06-18 18:42:55,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 3102687232. Throughput: 0: 41841.3. Samples: 326832160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:42:55,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 18:42:55,890][19107] Updated weights for policy 0, policy_version 189375 (0.0035) [2024-06-18 18:42:59,350][19107] Updated weights for policy 0, policy_version 189385 (0.0028) [2024-06-18 18:43:00,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3102932992. Throughput: 0: 41825.7. Samples: 326956880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:43:00,501][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 18:43:03,559][19107] Updated weights for policy 0, policy_version 189395 (0.0033) [2024-06-18 18:43:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 3103096832. Throughput: 0: 41849.9. Samples: 327210100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:43:05,501][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 18:43:07,106][19107] Updated weights for policy 0, policy_version 189405 (0.0038) [2024-06-18 18:43:10,500][18875] Fps is (10 sec: 37683.5, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3103309824. Throughput: 0: 41778.7. Samples: 327460480. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:43:10,501][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 18:43:11,230][19107] Updated weights for policy 0, policy_version 189415 (0.0036) [2024-06-18 18:43:14,936][19107] Updated weights for policy 0, policy_version 189425 (0.0023) [2024-06-18 18:43:15,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3103555584. Throughput: 0: 42064.1. Samples: 327589220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:43:15,500][18875] Avg episode reward: [(0, '0.204')] [2024-06-18 18:43:19,122][19107] Updated weights for policy 0, policy_version 189435 (0.0035) [2024-06-18 18:43:20,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 3103719424. Throughput: 0: 41888.0. Samples: 327836600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:43:20,501][18875] Avg episode reward: [(0, '0.357')] [2024-06-18 18:43:22,692][19107] Updated weights for policy 0, policy_version 189445 (0.0039) [2024-06-18 18:43:25,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3103948800. Throughput: 0: 41986.8. Samples: 328094980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 18:43:25,500][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 18:43:27,047][19107] Updated weights for policy 0, policy_version 189455 (0.0040) [2024-06-18 18:43:30,504][18875] Fps is (10 sec: 45858.5, 60 sec: 42049.8, 300 sec: 41820.3). Total num frames: 3104178176. Throughput: 0: 42153.1. Samples: 328223080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:43:30,505][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 18:43:30,656][19107] Updated weights for policy 0, policy_version 189465 (0.0031) [2024-06-18 18:43:33,508][19087] Signal inference workers to stop experience collection... 
(4750 times) [2024-06-18 18:43:33,509][19087] Signal inference workers to resume experience collection... (4750 times) [2024-06-18 18:43:33,556][19107] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-06-18 18:43:33,557][19107] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-06-18 18:43:34,802][19107] Updated weights for policy 0, policy_version 189475 (0.0036) [2024-06-18 18:43:35,500][18875] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 41765.6). Total num frames: 3104358400. Throughput: 0: 41874.5. Samples: 328466720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:43:35,501][18875] Avg episode reward: [(0, '0.324')] [2024-06-18 18:43:38,941][19107] Updated weights for policy 0, policy_version 189485 (0.0035) [2024-06-18 18:43:40,500][18875] Fps is (10 sec: 39336.3, 60 sec: 41781.0, 300 sec: 41820.9). Total num frames: 3104571392. Throughput: 0: 41801.4. Samples: 328713220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:43:40,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 18:43:42,532][19107] Updated weights for policy 0, policy_version 189495 (0.0034) [2024-06-18 18:43:45,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3104784384. Throughput: 0: 41840.6. Samples: 328839700. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:43:45,501][18875] Avg episode reward: [(0, '0.315')] [2024-06-18 18:43:46,624][19107] Updated weights for policy 0, policy_version 189505 (0.0030) [2024-06-18 18:43:50,355][19107] Updated weights for policy 0, policy_version 189515 (0.0031) [2024-06-18 18:43:50,504][18875] Fps is (10 sec: 44220.5, 60 sec: 42049.7, 300 sec: 41875.9). Total num frames: 3105013760. Throughput: 0: 41833.5. Samples: 329092760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:43:50,505][18875] Avg episode reward: [(0, '0.219')] [2024-06-18 18:43:54,350][19107] Updated weights for policy 0, policy_version 189525 (0.0045) [2024-06-18 18:43:55,504][18875] Fps is (10 sec: 44220.6, 60 sec: 42322.8, 300 sec: 41931.9). Total num frames: 3105226752. Throughput: 0: 41666.9. Samples: 329335640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:43:55,504][18875] Avg episode reward: [(0, '0.597')] [2024-06-18 18:43:58,119][19107] Updated weights for policy 0, policy_version 189535 (0.0029) [2024-06-18 18:44:00,500][18875] Fps is (10 sec: 37696.8, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 3105390592. Throughput: 0: 41622.6. Samples: 329462240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:44:00,501][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 18:44:02,185][19107] Updated weights for policy 0, policy_version 189545 (0.0039) [2024-06-18 18:44:05,504][18875] Fps is (10 sec: 40960.9, 60 sec: 42323.0, 300 sec: 41820.4). Total num frames: 3105636352. Throughput: 0: 41713.4. Samples: 329713840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:44:05,504][18875] Avg episode reward: [(0, '0.380')] [2024-06-18 18:44:06,194][19107] Updated weights for policy 0, policy_version 189555 (0.0030) [2024-06-18 18:44:10,415][19107] Updated weights for policy 0, policy_version 189565 (0.0037) [2024-06-18 18:44:10,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3105832960. Throughput: 0: 41589.6. Samples: 329966520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:44:10,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 18:44:13,779][19107] Updated weights for policy 0, policy_version 189575 (0.0034) [2024-06-18 18:44:15,500][18875] Fps is (10 sec: 39335.0, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 3106029568. Throughput: 0: 41455.9. Samples: 330088440. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:44:15,501][18875] Avg episode reward: [(0, '0.292')] [2024-06-18 18:44:18,138][19107] Updated weights for policy 0, policy_version 189585 (0.0028) [2024-06-18 18:44:20,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 3106258944. Throughput: 0: 41670.6. Samples: 330341900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:44:20,501][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 18:44:21,754][19107] Updated weights for policy 0, policy_version 189595 (0.0030) [2024-06-18 18:44:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.0, 300 sec: 41654.2). Total num frames: 3106439168. Throughput: 0: 41836.3. Samples: 330595860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:44:25,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 18:44:25,957][19107] Updated weights for policy 0, policy_version 189605 (0.0040) [2024-06-18 18:44:29,736][19107] Updated weights for policy 0, policy_version 189615 (0.0054) [2024-06-18 18:44:30,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41235.5, 300 sec: 41709.8). Total num frames: 3106652160. Throughput: 0: 41698.1. Samples: 330716120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-18 18:44:30,501][18875] Avg episode reward: [(0, '0.396')] [2024-06-18 18:44:33,745][19107] Updated weights for policy 0, policy_version 189625 (0.0024) [2024-06-18 18:44:35,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3106865152. Throughput: 0: 41559.0. Samples: 330962760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:44:35,500][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 18:44:35,521][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000189629_3106881536.pth... 
[2024-06-18 18:44:35,577][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000189018_3096870912.pth [2024-06-18 18:44:37,800][19107] Updated weights for policy 0, policy_version 189635 (0.0036) [2024-06-18 18:44:40,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 3107061760. Throughput: 0: 41815.3. Samples: 331217180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:44:40,501][18875] Avg episode reward: [(0, '0.736')] [2024-06-18 18:44:41,670][19107] Updated weights for policy 0, policy_version 189645 (0.0039) [2024-06-18 18:44:45,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3107291136. Throughput: 0: 41629.3. Samples: 331335560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:44:45,501][18875] Avg episode reward: [(0, '0.291')] [2024-06-18 18:44:45,706][19107] Updated weights for policy 0, policy_version 189655 (0.0040) [2024-06-18 18:44:49,589][19107] Updated weights for policy 0, policy_version 189665 (0.0034) [2024-06-18 18:44:50,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41508.7, 300 sec: 41765.3). Total num frames: 3107504128. Throughput: 0: 41714.3. Samples: 331590840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:44:50,501][18875] Avg episode reward: [(0, '0.319')] [2024-06-18 18:44:53,631][19107] Updated weights for policy 0, policy_version 189675 (0.0038) [2024-06-18 18:44:55,500][18875] Fps is (10 sec: 39321.7, 60 sec: 40962.4, 300 sec: 41654.5). Total num frames: 3107684352. Throughput: 0: 41504.1. Samples: 331834200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:44:55,512][18875] Avg episode reward: [(0, '0.210')] [2024-06-18 18:44:57,384][19107] Updated weights for policy 0, policy_version 189685 (0.0029) [2024-06-18 18:45:00,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 3107930112. Throughput: 0: 41473.2. Samples: 331954740. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:45:00,501][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 18:45:01,580][19107] Updated weights for policy 0, policy_version 189695 (0.0046) [2024-06-18 18:45:05,256][19107] Updated weights for policy 0, policy_version 189705 (0.0046) [2024-06-18 18:45:05,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41508.5, 300 sec: 41765.3). Total num frames: 3108126720. Throughput: 0: 41650.0. Samples: 332216140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:45:05,500][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 18:45:07,348][19087] Signal inference workers to stop experience collection... (4800 times) [2024-06-18 18:45:07,349][19087] Signal inference workers to resume experience collection... (4800 times) [2024-06-18 18:45:07,375][19107] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-06-18 18:45:07,375][19107] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-06-18 18:45:09,391][19107] Updated weights for policy 0, policy_version 189715 (0.0029) [2024-06-18 18:45:10,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41506.3, 300 sec: 41709.8). Total num frames: 3108323328. Throughput: 0: 41338.8. Samples: 332456100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:45:10,501][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 18:45:13,087][19107] Updated weights for policy 0, policy_version 189725 (0.0035) [2024-06-18 18:45:15,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3108536320. Throughput: 0: 41427.6. Samples: 332580360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:45:15,501][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 18:45:17,082][19107] Updated weights for policy 0, policy_version 189735 (0.0039) [2024-06-18 18:45:20,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41233.3, 300 sec: 41710.3). Total num frames: 3108732928. Throughput: 0: 41592.0. 
Samples: 332834400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:45:20,500][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 18:45:20,989][19107] Updated weights for policy 0, policy_version 189745 (0.0036) [2024-06-18 18:45:24,899][19107] Updated weights for policy 0, policy_version 189755 (0.0035) [2024-06-18 18:45:25,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 3108962304. Throughput: 0: 41367.1. Samples: 333078700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:45:25,501][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 18:45:28,880][19107] Updated weights for policy 0, policy_version 189765 (0.0031) [2024-06-18 18:45:30,504][18875] Fps is (10 sec: 42582.7, 60 sec: 41776.7, 300 sec: 41709.3). Total num frames: 3109158912. Throughput: 0: 41668.7. Samples: 333210800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:45:30,504][18875] Avg episode reward: [(0, '0.279')] [2024-06-18 18:45:32,575][19107] Updated weights for policy 0, policy_version 189775 (0.0032) [2024-06-18 18:45:35,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 3109355520. Throughput: 0: 41588.3. Samples: 333462320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:45:35,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 18:45:36,763][19107] Updated weights for policy 0, policy_version 189785 (0.0038) [2024-06-18 18:45:40,429][19107] Updated weights for policy 0, policy_version 189795 (0.0037) [2024-06-18 18:45:40,500][18875] Fps is (10 sec: 44252.5, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 3109601280. Throughput: 0: 41743.5. Samples: 333712660. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 18:45:40,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 18:45:44,608][19107] Updated weights for policy 0, policy_version 189805 (0.0029) [2024-06-18 18:45:45,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3109781504. Throughput: 0: 41959.2. Samples: 333842900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:45:45,501][18875] Avg episode reward: [(0, '0.765')] [2024-06-18 18:45:48,086][19107] Updated weights for policy 0, policy_version 189815 (0.0038) [2024-06-18 18:45:50,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41232.9, 300 sec: 41709.7). Total num frames: 3109978112. Throughput: 0: 41622.9. Samples: 334089180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:45:50,501][18875] Avg episode reward: [(0, '0.751')] [2024-06-18 18:45:52,327][19107] Updated weights for policy 0, policy_version 189825 (0.0039) [2024-06-18 18:45:55,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 3110223872. Throughput: 0: 41862.6. Samples: 334339920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:45:55,501][18875] Avg episode reward: [(0, '0.748')] [2024-06-18 18:45:55,982][19107] Updated weights for policy 0, policy_version 189835 (0.0041) [2024-06-18 18:46:00,504][18875] Fps is (10 sec: 42583.6, 60 sec: 41230.7, 300 sec: 41709.3). Total num frames: 3110404096. Throughput: 0: 42002.5. Samples: 334470620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:00,505][18875] Avg episode reward: [(0, '0.551')] [2024-06-18 18:46:00,579][19107] Updated weights for policy 0, policy_version 189845 (0.0032) [2024-06-18 18:46:03,767][19107] Updated weights for policy 0, policy_version 189855 (0.0041) [2024-06-18 18:46:05,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 3110617088. Throughput: 0: 41714.1. Samples: 334711540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:05,501][18875] Avg episode reward: [(0, '0.342')] [2024-06-18 18:46:08,546][19107] Updated weights for policy 0, policy_version 189865 (0.0039) [2024-06-18 18:46:10,500][18875] Fps is (10 sec: 44252.7, 60 sec: 42052.2, 300 sec: 41821.2). Total num frames: 3110846464. Throughput: 0: 41833.8. Samples: 334961220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:10,501][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 18:46:12,116][19107] Updated weights for policy 0, policy_version 189875 (0.0042) [2024-06-18 18:46:15,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 3111026688. Throughput: 0: 41736.1. Samples: 335088780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:15,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 18:46:16,473][19107] Updated weights for policy 0, policy_version 189885 (0.0035) [2024-06-18 18:46:19,880][19107] Updated weights for policy 0, policy_version 189895 (0.0036) [2024-06-18 18:46:20,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 41710.3). Total num frames: 3111239680. Throughput: 0: 41579.2. Samples: 335333380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:20,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 18:46:24,433][19107] Updated weights for policy 0, policy_version 189905 (0.0046) [2024-06-18 18:46:25,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3111469056. Throughput: 0: 41721.8. Samples: 335590140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:25,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 18:46:27,895][19107] Updated weights for policy 0, policy_version 189915 (0.0043) [2024-06-18 18:46:30,504][18875] Fps is (10 sec: 40945.6, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 3111649280. Throughput: 0: 41554.5. Samples: 335713000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:30,504][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 18:46:32,117][19107] Updated weights for policy 0, policy_version 189925 (0.0033) [2024-06-18 18:46:35,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 3111862272. Throughput: 0: 41617.3. Samples: 335961960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:35,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 18:46:35,516][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000189933_3111862272.pth... [2024-06-18 18:46:35,594][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000189323_3101868032.pth [2024-06-18 18:46:35,959][19107] Updated weights for policy 0, policy_version 189935 (0.0041) [2024-06-18 18:46:39,871][19107] Updated weights for policy 0, policy_version 189945 (0.0044) [2024-06-18 18:46:40,500][18875] Fps is (10 sec: 44252.7, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3112091648. Throughput: 0: 41775.6. Samples: 336219820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:40,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 18:46:42,477][19087] Signal inference workers to stop experience collection... (4850 times) [2024-06-18 18:46:42,526][19107] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-06-18 18:46:42,532][19087] Signal inference workers to resume experience collection... (4850 times) [2024-06-18 18:46:42,543][19107] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-06-18 18:46:43,682][19107] Updated weights for policy 0, policy_version 189955 (0.0033) [2024-06-18 18:46:45,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3112304640. Throughput: 0: 41617.6. Samples: 336343260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 18:46:45,500][18875] Avg episode reward: [(0, '0.476')] [2024-06-18 18:46:47,739][19107] Updated weights for policy 0, policy_version 189965 (0.0043) [2024-06-18 18:46:50,501][18875] Fps is (10 sec: 39317.0, 60 sec: 41778.5, 300 sec: 41654.1). Total num frames: 3112484864. Throughput: 0: 41704.8. Samples: 336588300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:46:50,502][18875] Avg episode reward: [(0, '0.657')] [2024-06-18 18:46:51,421][19107] Updated weights for policy 0, policy_version 189975 (0.0040) [2024-06-18 18:46:55,388][19107] Updated weights for policy 0, policy_version 189985 (0.0036) [2024-06-18 18:46:55,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3112714240. Throughput: 0: 41896.1. Samples: 336846540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:46:55,500][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 18:46:59,190][19107] Updated weights for policy 0, policy_version 189995 (0.0034) [2024-06-18 18:47:00,500][18875] Fps is (10 sec: 44242.0, 60 sec: 42054.8, 300 sec: 41709.8). Total num frames: 3112927232. Throughput: 0: 41856.1. Samples: 336972300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:00,501][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 18:47:03,059][19107] Updated weights for policy 0, policy_version 190005 (0.0031) [2024-06-18 18:47:05,504][18875] Fps is (10 sec: 40944.7, 60 sec: 41776.7, 300 sec: 41709.3). Total num frames: 3113123840. Throughput: 0: 41770.5. Samples: 337213200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:05,504][18875] Avg episode reward: [(0, '0.597')] [2024-06-18 18:47:07,310][19107] Updated weights for policy 0, policy_version 190015 (0.0028) [2024-06-18 18:47:10,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3113336832. Throughput: 0: 41961.3. Samples: 337478400. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:10,501][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 18:47:11,180][19107] Updated weights for policy 0, policy_version 190025 (0.0038) [2024-06-18 18:47:15,045][19107] Updated weights for policy 0, policy_version 190035 (0.0041) [2024-06-18 18:47:15,500][18875] Fps is (10 sec: 42614.1, 60 sec: 42052.4, 300 sec: 41709.8). Total num frames: 3113549824. Throughput: 0: 41887.8. Samples: 337597800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:15,501][18875] Avg episode reward: [(0, '0.744')] [2024-06-18 18:47:18,767][19107] Updated weights for policy 0, policy_version 190045 (0.0034) [2024-06-18 18:47:20,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 3113779200. Throughput: 0: 41964.0. Samples: 337850340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:20,501][18875] Avg episode reward: [(0, '0.598')] [2024-06-18 18:47:23,067][19107] Updated weights for policy 0, policy_version 190055 (0.0045) [2024-06-18 18:47:25,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3113959424. Throughput: 0: 41971.6. Samples: 338108540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:25,500][18875] Avg episode reward: [(0, '0.353')] [2024-06-18 18:47:26,348][19107] Updated weights for policy 0, policy_version 190065 (0.0041) [2024-06-18 18:47:30,500][18875] Fps is (10 sec: 37683.0, 60 sec: 41781.6, 300 sec: 41709.8). Total num frames: 3114156032. Throughput: 0: 41812.7. Samples: 338224840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:30,501][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 18:47:31,142][19107] Updated weights for policy 0, policy_version 190075 (0.0048) [2024-06-18 18:47:33,956][19107] Updated weights for policy 0, policy_version 190085 (0.0031) [2024-06-18 18:47:35,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 41876.7). 
Total num frames: 3114418176. Throughput: 0: 41997.1. Samples: 338478120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:35,501][18875] Avg episode reward: [(0, '0.420')] [2024-06-18 18:47:38,810][19107] Updated weights for policy 0, policy_version 190095 (0.0029) [2024-06-18 18:47:40,458][19087] Signal inference workers to stop experience collection... (4900 times) [2024-06-18 18:47:40,464][19087] Signal inference workers to resume experience collection... (4900 times) [2024-06-18 18:47:40,500][18875] Fps is (10 sec: 44237.9, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3114598400. Throughput: 0: 42113.3. Samples: 338741640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:40,500][18875] Avg episode reward: [(0, '0.354')] [2024-06-18 18:47:40,506][19107] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-06-18 18:47:40,506][19107] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-06-18 18:47:41,693][19107] Updated weights for policy 0, policy_version 190105 (0.0035) [2024-06-18 18:47:45,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3114795008. Throughput: 0: 41966.3. Samples: 338860780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:45,501][18875] Avg episode reward: [(0, '0.398')] [2024-06-18 18:47:46,385][19107] Updated weights for policy 0, policy_version 190115 (0.0036) [2024-06-18 18:47:49,333][19107] Updated weights for policy 0, policy_version 190125 (0.0037) [2024-06-18 18:47:50,500][18875] Fps is (10 sec: 45874.4, 60 sec: 42872.2, 300 sec: 41931.9). Total num frames: 3115057152. Throughput: 0: 42264.7. Samples: 339114960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 18:47:50,512][18875] Avg episode reward: [(0, '0.347')] [2024-06-18 18:47:54,080][19107] Updated weights for policy 0, policy_version 190135 (0.0036) [2024-06-18 18:47:55,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 3115220992. Throughput: 0: 42088.4. Samples: 339372380. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:47:55,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 18:47:56,950][19107] Updated weights for policy 0, policy_version 190145 (0.0027) [2024-06-18 18:48:00,500][18875] Fps is (10 sec: 37683.3, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3115433984. Throughput: 0: 42152.8. Samples: 339494680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:00,504][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 18:48:01,699][19107] Updated weights for policy 0, policy_version 190155 (0.0030) [2024-06-18 18:48:05,008][19107] Updated weights for policy 0, policy_version 190165 (0.0035) [2024-06-18 18:48:05,504][18875] Fps is (10 sec: 45858.7, 60 sec: 42598.4, 300 sec: 41931.4). Total num frames: 3115679744. Throughput: 0: 42125.6. Samples: 339746140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:05,504][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 18:48:09,779][19107] Updated weights for policy 0, policy_version 190175 (0.0039) [2024-06-18 18:48:10,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 3115843584. Throughput: 0: 42039.0. Samples: 340000300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:10,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 18:48:12,700][19107] Updated weights for policy 0, policy_version 190185 (0.0032) [2024-06-18 18:48:15,500][18875] Fps is (10 sec: 39335.7, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3116072960. Throughput: 0: 42054.3. Samples: 340117280. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:15,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 18:48:17,607][19107] Updated weights for policy 0, policy_version 190195 (0.0042) [2024-06-18 18:48:20,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3116302336. Throughput: 0: 42196.8. Samples: 340376980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:20,501][18875] Avg episode reward: [(0, '0.828')] [2024-06-18 18:48:20,922][19107] Updated weights for policy 0, policy_version 190205 (0.0044) [2024-06-18 18:48:25,294][19107] Updated weights for policy 0, policy_version 190215 (0.0030) [2024-06-18 18:48:25,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41710.3). Total num frames: 3116482560. Throughput: 0: 41860.4. Samples: 340625360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:25,500][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 18:48:28,570][19107] Updated weights for policy 0, policy_version 190225 (0.0033) [2024-06-18 18:48:30,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 3116711936. Throughput: 0: 41861.6. Samples: 340744560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:30,501][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 18:48:33,043][19107] Updated weights for policy 0, policy_version 190235 (0.0037) [2024-06-18 18:48:35,500][18875] Fps is (10 sec: 42597.3, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 3116908544. Throughput: 0: 42003.5. Samples: 341005120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:35,502][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 18:48:35,582][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000190242_3116924928.pth... 
[2024-06-18 18:48:35,641][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000189629_3106881536.pth [2024-06-18 18:48:36,331][19107] Updated weights for policy 0, policy_version 190245 (0.0040) [2024-06-18 18:48:40,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3117105152. Throughput: 0: 41844.1. Samples: 341255360. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:40,500][18875] Avg episode reward: [(0, '0.394')] [2024-06-18 18:48:40,999][19107] Updated weights for policy 0, policy_version 190255 (0.0031) [2024-06-18 18:48:44,217][19107] Updated weights for policy 0, policy_version 190265 (0.0032) [2024-06-18 18:48:45,500][18875] Fps is (10 sec: 42599.4, 60 sec: 42325.3, 300 sec: 41765.8). Total num frames: 3117334528. Throughput: 0: 41683.2. Samples: 341370420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:45,500][18875] Avg episode reward: [(0, '0.663')] [2024-06-18 18:48:49,035][19107] Updated weights for policy 0, policy_version 190275 (0.0033) [2024-06-18 18:48:50,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41506.2, 300 sec: 41765.8). Total num frames: 3117547520. Throughput: 0: 41892.7. Samples: 341631160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:50,501][18875] Avg episode reward: [(0, '0.663')] [2024-06-18 18:48:51,934][19107] Updated weights for policy 0, policy_version 190285 (0.0039) [2024-06-18 18:48:55,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3117727744. Throughput: 0: 41826.7. Samples: 341882500. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:48:55,501][18875] Avg episode reward: [(0, '0.676')] [2024-06-18 18:48:56,787][19107] Updated weights for policy 0, policy_version 190295 (0.0048) [2024-06-18 18:48:59,776][19107] Updated weights for policy 0, policy_version 190305 (0.0043) [2024-06-18 18:49:00,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41765.8). Total num frames: 3117957120. Throughput: 0: 41922.6. Samples: 342003800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 18:49:00,501][18875] Avg episode reward: [(0, '0.426')] [2024-06-18 18:49:04,503][19107] Updated weights for policy 0, policy_version 190315 (0.0028) [2024-06-18 18:49:05,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41235.5, 300 sec: 41765.3). Total num frames: 3118153728. Throughput: 0: 41809.8. Samples: 342258420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:05,501][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 18:49:05,866][19087] Signal inference workers to stop experience collection... (4950 times) [2024-06-18 18:49:05,867][19087] Signal inference workers to resume experience collection... (4950 times) [2024-06-18 18:49:05,892][19107] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-06-18 18:49:05,892][19107] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-06-18 18:49:07,669][19107] Updated weights for policy 0, policy_version 190325 (0.0036) [2024-06-18 18:49:10,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3118350336. Throughput: 0: 41791.8. Samples: 342506000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:10,504][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 18:49:12,446][19107] Updated weights for policy 0, policy_version 190335 (0.0035) [2024-06-18 18:49:15,418][19107] Updated weights for policy 0, policy_version 190345 (0.0030) [2024-06-18 18:49:15,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3118612480. Throughput: 0: 41917.4. Samples: 342630840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:15,501][18875] Avg episode reward: [(0, '0.770')] [2024-06-18 18:49:20,383][19107] Updated weights for policy 0, policy_version 190355 (0.0034) [2024-06-18 18:49:20,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41820.8). Total num frames: 3118776320. Throughput: 0: 41603.2. Samples: 342877260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:20,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 18:49:23,215][19107] Updated weights for policy 0, policy_version 190365 (0.0052) [2024-06-18 18:49:25,500][18875] Fps is (10 sec: 36044.8, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 3118972928. Throughput: 0: 41573.7. Samples: 343126180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:25,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 18:49:28,230][19107] Updated weights for policy 0, policy_version 190375 (0.0039) [2024-06-18 18:49:30,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3119235072. Throughput: 0: 41790.0. Samples: 343250980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:30,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 18:49:31,053][19107] Updated weights for policy 0, policy_version 190385 (0.0040) [2024-06-18 18:49:35,504][18875] Fps is (10 sec: 42583.3, 60 sec: 41503.8, 300 sec: 41820.4). Total num frames: 3119398912. Throughput: 0: 41615.8. Samples: 343504020. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:35,504][18875] Avg episode reward: [(0, '0.425')] [2024-06-18 18:49:36,003][19107] Updated weights for policy 0, policy_version 190395 (0.0023) [2024-06-18 18:49:39,195][19107] Updated weights for policy 0, policy_version 190405 (0.0035) [2024-06-18 18:49:40,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3119611904. Throughput: 0: 41331.9. Samples: 343742440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:40,501][18875] Avg episode reward: [(0, '0.352')] [2024-06-18 18:49:44,154][19107] Updated weights for policy 0, policy_version 190415 (0.0029) [2024-06-18 18:49:45,500][18875] Fps is (10 sec: 44252.3, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3119841280. Throughput: 0: 41573.3. Samples: 343874600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:45,501][18875] Avg episode reward: [(0, '0.305')] [2024-06-18 18:49:47,131][19107] Updated weights for policy 0, policy_version 190425 (0.0037) [2024-06-18 18:49:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 3120021504. Throughput: 0: 41545.4. Samples: 344127960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:50,501][18875] Avg episode reward: [(0, '0.345')] [2024-06-18 18:49:51,855][19107] Updated weights for policy 0, policy_version 190435 (0.0023) [2024-06-18 18:49:55,145][19107] Updated weights for policy 0, policy_version 190445 (0.0045) [2024-06-18 18:49:55,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3120250880. Throughput: 0: 41550.2. Samples: 344375760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:49:55,501][18875] Avg episode reward: [(0, '0.326')] [2024-06-18 18:49:59,551][19107] Updated weights for policy 0, policy_version 190455 (0.0038) [2024-06-18 18:50:00,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 41820.9). 
Total num frames: 3120463872. Throughput: 0: 41593.4. Samples: 344502540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:50:00,500][18875] Avg episode reward: [(0, '0.346')] [2024-06-18 18:50:03,175][19107] Updated weights for policy 0, policy_version 190465 (0.0043) [2024-06-18 18:50:05,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 3120627712. Throughput: 0: 41528.5. Samples: 344746040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 18:50:05,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 18:50:07,437][19107] Updated weights for policy 0, policy_version 190475 (0.0037) [2024-06-18 18:50:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 3120873472. Throughput: 0: 41576.1. Samples: 344997100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:10,500][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 18:50:10,977][19107] Updated weights for policy 0, policy_version 190485 (0.0040) [2024-06-18 18:50:15,210][19107] Updated weights for policy 0, policy_version 190495 (0.0033) [2024-06-18 18:50:15,500][18875] Fps is (10 sec: 45874.7, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 3121086464. Throughput: 0: 41681.3. Samples: 345126640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:15,504][18875] Avg episode reward: [(0, '0.328')] [2024-06-18 18:50:18,730][19107] Updated weights for policy 0, policy_version 190505 (0.0030) [2024-06-18 18:50:20,500][18875] Fps is (10 sec: 37682.6, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 3121250304. Throughput: 0: 41484.6. Samples: 345370680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:20,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 18:50:22,991][19107] Updated weights for policy 0, policy_version 190515 (0.0040) [2024-06-18 18:50:25,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41821.4). 
Total num frames: 3121496064. Throughput: 0: 41691.5. Samples: 345618560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:25,501][18875] Avg episode reward: [(0, '0.418')] [2024-06-18 18:50:27,239][19107] Updated weights for policy 0, policy_version 190525 (0.0026) [2024-06-18 18:50:30,500][18875] Fps is (10 sec: 44237.5, 60 sec: 40960.1, 300 sec: 41820.9). Total num frames: 3121692672. Throughput: 0: 41749.9. Samples: 345753340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:30,501][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 18:50:30,801][19107] Updated weights for policy 0, policy_version 190535 (0.0044) [2024-06-18 18:50:31,672][19087] Signal inference workers to stop experience collection... (5000 times) [2024-06-18 18:50:31,673][19087] Signal inference workers to resume experience collection... (5000 times) [2024-06-18 18:50:31,692][19107] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-06-18 18:50:31,692][19107] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-06-18 18:50:34,941][19107] Updated weights for policy 0, policy_version 190545 (0.0037) [2024-06-18 18:50:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41781.7, 300 sec: 41709.8). Total num frames: 3121905664. Throughput: 0: 41520.0. Samples: 345996360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:35,501][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 18:50:35,526][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000190546_3121905664.pth... [2024-06-18 18:50:35,571][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000189933_3111862272.pth [2024-06-18 18:50:38,967][19107] Updated weights for policy 0, policy_version 190555 (0.0030) [2024-06-18 18:50:40,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3122135040. Throughput: 0: 41558.7. Samples: 346245900. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:40,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 18:50:42,796][19107] Updated weights for policy 0, policy_version 190565 (0.0031) [2024-06-18 18:50:45,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 3122315264. Throughput: 0: 41627.4. Samples: 346375780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:45,501][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 18:50:46,740][19107] Updated weights for policy 0, policy_version 190575 (0.0027) [2024-06-18 18:50:50,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3122528256. Throughput: 0: 41874.6. Samples: 346630400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:50,501][18875] Avg episode reward: [(0, '0.661')] [2024-06-18 18:50:50,709][19107] Updated weights for policy 0, policy_version 190585 (0.0034) [2024-06-18 18:50:54,279][19107] Updated weights for policy 0, policy_version 190595 (0.0032) [2024-06-18 18:50:55,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41876.9). Total num frames: 3122757632. Throughput: 0: 41838.0. Samples: 346879820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:50:55,501][18875] Avg episode reward: [(0, '0.644')] [2024-06-18 18:50:58,344][19107] Updated weights for policy 0, policy_version 190605 (0.0034) [2024-06-18 18:51:00,502][18875] Fps is (10 sec: 42591.4, 60 sec: 41504.9, 300 sec: 41820.6). Total num frames: 3122954240. Throughput: 0: 41821.2. Samples: 347008660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:51:00,503][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 18:51:02,129][19107] Updated weights for policy 0, policy_version 190615 (0.0039) [2024-06-18 18:51:05,500][18875] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 3123150848. Throughput: 0: 41933.4. Samples: 347257680. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:51:05,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 18:51:06,323][19107] Updated weights for policy 0, policy_version 190625 (0.0036) [2024-06-18 18:51:09,730][19107] Updated weights for policy 0, policy_version 190635 (0.0035) [2024-06-18 18:51:10,500][18875] Fps is (10 sec: 44244.6, 60 sec: 42052.2, 300 sec: 41932.0). Total num frames: 3123396608. Throughput: 0: 41952.6. Samples: 347506420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 18:51:10,501][18875] Avg episode reward: [(0, '0.342')] [2024-06-18 18:51:14,193][19107] Updated weights for policy 0, policy_version 190645 (0.0038) [2024-06-18 18:51:15,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 3123576832. Throughput: 0: 41931.0. Samples: 347640240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:51:15,501][18875] Avg episode reward: [(0, '0.666')] [2024-06-18 18:51:17,756][19107] Updated weights for policy 0, policy_version 190655 (0.0037) [2024-06-18 18:51:20,500][18875] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 3123789824. Throughput: 0: 41919.0. Samples: 347882720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:51:20,501][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 18:51:22,031][19107] Updated weights for policy 0, policy_version 190665 (0.0040) [2024-06-18 18:51:25,450][19107] Updated weights for policy 0, policy_version 190675 (0.0028) [2024-06-18 18:51:25,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41932.4). Total num frames: 3124019200. Throughput: 0: 41975.6. Samples: 348134800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:51:25,504][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 18:51:29,772][19107] Updated weights for policy 0, policy_version 190685 (0.0044) [2024-06-18 18:51:30,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 3124199424. Throughput: 0: 41870.7. Samples: 348259960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:51:30,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 18:51:33,189][19107] Updated weights for policy 0, policy_version 190695 (0.0040) [2024-06-18 18:51:35,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3124428800. Throughput: 0: 41842.6. Samples: 348513320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:51:35,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 18:51:37,530][19107] Updated weights for policy 0, policy_version 190705 (0.0036) [2024-06-18 18:51:40,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3124625408. Throughput: 0: 41878.7. Samples: 348764360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:51:40,501][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 18:51:41,247][19107] Updated weights for policy 0, policy_version 190715 (0.0030) [2024-06-18 18:51:45,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41779.3, 300 sec: 41821.0). Total num frames: 3124822016. Throughput: 0: 41690.5. Samples: 348884660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:51:45,501][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 18:51:45,653][19107] Updated weights for policy 0, policy_version 190725 (0.0034) [2024-06-18 18:51:48,871][19107] Updated weights for policy 0, policy_version 190735 (0.0034) [2024-06-18 18:51:50,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3125067776. Throughput: 0: 41780.5. Samples: 349137800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:51:50,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 18:51:53,449][19107] Updated weights for policy 0, policy_version 190745 (0.0045) [2024-06-18 18:51:54,512][19087] Signal inference workers to stop experience collection... (5050 times) [2024-06-18 18:51:54,513][19087] Signal inference workers to resume experience collection... (5050 times) [2024-06-18 18:51:54,536][19107] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-06-18 18:51:54,536][19107] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-06-18 18:51:55,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3125264384. Throughput: 0: 41824.9. Samples: 349388540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:51:55,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 18:51:56,605][19107] Updated weights for policy 0, policy_version 190755 (0.0033) [2024-06-18 18:52:00,504][18875] Fps is (10 sec: 39307.3, 60 sec: 41777.9, 300 sec: 41820.9). Total num frames: 3125460992. Throughput: 0: 41603.8. Samples: 349512560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:52:00,504][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 18:52:01,199][19107] Updated weights for policy 0, policy_version 190765 (0.0039) [2024-06-18 18:52:04,669][19107] Updated weights for policy 0, policy_version 190775 (0.0029) [2024-06-18 18:52:05,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3125690368. Throughput: 0: 41965.1. Samples: 349771140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:52:05,500][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 18:52:09,048][19107] Updated weights for policy 0, policy_version 190785 (0.0043) [2024-06-18 18:52:10,500][18875] Fps is (10 sec: 42613.2, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 3125886976. Throughput: 0: 42015.5. 
Samples: 350025500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:52:10,501][18875] Avg episode reward: [(0, '0.334')] [2024-06-18 18:52:12,319][19107] Updated weights for policy 0, policy_version 190795 (0.0040) [2024-06-18 18:52:15,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3126083584. Throughput: 0: 41941.4. Samples: 350147320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 18:52:15,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 18:52:16,756][19107] Updated weights for policy 0, policy_version 190805 (0.0047) [2024-06-18 18:52:20,303][19107] Updated weights for policy 0, policy_version 190815 (0.0043) [2024-06-18 18:52:20,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3126312960. Throughput: 0: 41929.9. Samples: 350400160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:52:20,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 18:52:24,866][19107] Updated weights for policy 0, policy_version 190825 (0.0042) [2024-06-18 18:52:25,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 3126525952. Throughput: 0: 41826.8. Samples: 350646560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:52:25,501][18875] Avg episode reward: [(0, '0.382')] [2024-06-18 18:52:27,993][19107] Updated weights for policy 0, policy_version 190835 (0.0029) [2024-06-18 18:52:30,504][18875] Fps is (10 sec: 40945.1, 60 sec: 42049.7, 300 sec: 41709.3). Total num frames: 3126722560. Throughput: 0: 41868.5. Samples: 350768900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:52:30,504][18875] Avg episode reward: [(0, '0.450')] [2024-06-18 18:52:32,614][19107] Updated weights for policy 0, policy_version 190845 (0.0037) [2024-06-18 18:52:35,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3126951936. Throughput: 0: 41894.6. 
Samples: 351023060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:52:35,501][18875] Avg episode reward: [(0, '0.659')] [2024-06-18 18:52:35,581][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000190855_3126968320.pth... [2024-06-18 18:52:35,588][19107] Updated weights for policy 0, policy_version 190855 (0.0030) [2024-06-18 18:52:35,644][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000190242_3116924928.pth [2024-06-18 18:52:40,439][19107] Updated weights for policy 0, policy_version 190865 (0.0030) [2024-06-18 18:52:40,500][18875] Fps is (10 sec: 40975.3, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3127132160. Throughput: 0: 42035.6. Samples: 351280140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:52:40,500][18875] Avg episode reward: [(0, '0.339')] [2024-06-18 18:52:43,702][19107] Updated weights for policy 0, policy_version 190875 (0.0030) [2024-06-18 18:52:45,504][18875] Fps is (10 sec: 39307.7, 60 sec: 42049.7, 300 sec: 41653.7). Total num frames: 3127345152. Throughput: 0: 41848.0. Samples: 351395720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:52:45,504][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 18:52:48,204][19107] Updated weights for policy 0, policy_version 190885 (0.0038) [2024-06-18 18:52:50,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3127558144. Throughput: 0: 41725.3. Samples: 351648780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:52:50,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 18:52:51,660][19107] Updated weights for policy 0, policy_version 190895 (0.0047) [2024-06-18 18:52:55,500][18875] Fps is (10 sec: 39335.8, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 3127738368. Throughput: 0: 41805.4. Samples: 351906740. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:52:55,501][18875] Avg episode reward: [(0, '0.436')] [2024-06-18 18:52:56,028][19107] Updated weights for policy 0, policy_version 190905 (0.0041) [2024-06-18 18:52:57,169][19087] Signal inference workers to stop experience collection... (5100 times) [2024-06-18 18:52:57,169][19087] Signal inference workers to resume experience collection... (5100 times) [2024-06-18 18:52:57,215][19107] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-06-18 18:52:57,216][19107] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-06-18 18:52:59,392][19107] Updated weights for policy 0, policy_version 190915 (0.0037) [2024-06-18 18:53:00,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42054.9, 300 sec: 41710.3). Total num frames: 3127984128. Throughput: 0: 41801.8. Samples: 352028400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:53:00,500][18875] Avg episode reward: [(0, '0.306')] [2024-06-18 18:53:03,662][19107] Updated weights for policy 0, policy_version 190925 (0.0029) [2024-06-18 18:53:05,500][18875] Fps is (10 sec: 45874.9, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3128197120. Throughput: 0: 41806.1. Samples: 352281440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:53:05,501][18875] Avg episode reward: [(0, '0.248')] [2024-06-18 18:53:07,407][19107] Updated weights for policy 0, policy_version 190935 (0.0034) [2024-06-18 18:53:10,500][18875] Fps is (10 sec: 39320.9, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3128377344. Throughput: 0: 41962.6. Samples: 352534880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:53:10,501][18875] Avg episode reward: [(0, '0.308')] [2024-06-18 18:53:11,356][19107] Updated weights for policy 0, policy_version 190945 (0.0037) [2024-06-18 18:53:15,191][19107] Updated weights for policy 0, policy_version 190955 (0.0037) [2024-06-18 18:53:15,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 3128606720. Throughput: 0: 41967.4. Samples: 352657280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:53:15,501][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 18:53:19,215][19107] Updated weights for policy 0, policy_version 190965 (0.0036) [2024-06-18 18:53:20,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3128836096. Throughput: 0: 42071.2. Samples: 352916260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:53:20,504][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 18:53:22,687][19107] Updated weights for policy 0, policy_version 190975 (0.0040) [2024-06-18 18:53:25,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3129032704. Throughput: 0: 41922.0. Samples: 353166640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 18:53:25,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 18:53:26,931][19107] Updated weights for policy 0, policy_version 190985 (0.0038) [2024-06-18 18:53:30,383][19107] Updated weights for policy 0, policy_version 190995 (0.0031) [2024-06-18 18:53:30,504][18875] Fps is (10 sec: 42582.6, 60 sec: 42325.3, 300 sec: 41875.9). Total num frames: 3129262080. Throughput: 0: 42216.4. Samples: 353295460. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:53:30,505][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 18:53:34,990][19107] Updated weights for policy 0, policy_version 191005 (0.0047) [2024-06-18 18:53:35,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 3129442304. Throughput: 0: 42174.7. Samples: 353546640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:53:35,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 18:53:38,255][19107] Updated weights for policy 0, policy_version 191015 (0.0033) [2024-06-18 18:53:40,500][18875] Fps is (10 sec: 42614.2, 60 sec: 42598.3, 300 sec: 41876.4). Total num frames: 3129688064. Throughput: 0: 42021.4. Samples: 353797700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:53:40,500][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 18:53:42,682][19107] Updated weights for policy 0, policy_version 191025 (0.0029) [2024-06-18 18:53:45,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42054.7, 300 sec: 41765.3). Total num frames: 3129868288. Throughput: 0: 42184.2. Samples: 353926700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:53:45,501][18875] Avg episode reward: [(0, '0.548')] [2024-06-18 18:53:46,016][19107] Updated weights for policy 0, policy_version 191035 (0.0035) [2024-06-18 18:53:50,324][19107] Updated weights for policy 0, policy_version 191045 (0.0033) [2024-06-18 18:53:50,500][18875] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3130081280. Throughput: 0: 42135.1. Samples: 354177520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:53:50,501][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 18:53:53,672][19107] Updated weights for policy 0, policy_version 191055 (0.0028) [2024-06-18 18:53:55,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 41820.8). Total num frames: 3130294272. Throughput: 0: 42185.3. Samples: 354433220. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:53:55,501][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 18:53:57,941][19107] Updated weights for policy 0, policy_version 191065 (0.0030) [2024-06-18 18:54:00,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3130523648. Throughput: 0: 42236.0. Samples: 354557900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:54:00,501][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 18:54:01,728][19107] Updated weights for policy 0, policy_version 191075 (0.0032) [2024-06-18 18:54:05,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 41932.0). Total num frames: 3130720256. Throughput: 0: 42036.9. Samples: 354807920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:54:05,500][18875] Avg episode reward: [(0, '0.378')] [2024-06-18 18:54:05,586][19107] Updated weights for policy 0, policy_version 191085 (0.0024) [2024-06-18 18:54:09,537][19107] Updated weights for policy 0, policy_version 191095 (0.0036) [2024-06-18 18:54:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 41765.3). Total num frames: 3130933248. Throughput: 0: 42014.7. Samples: 355057300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:54:10,501][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 18:54:13,273][19107] Updated weights for policy 0, policy_version 191105 (0.0037) [2024-06-18 18:54:15,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3131129856. Throughput: 0: 41938.4. Samples: 355182540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:54:15,501][18875] Avg episode reward: [(0, '0.714')] [2024-06-18 18:54:17,283][19107] Updated weights for policy 0, policy_version 191115 (0.0043) [2024-06-18 18:54:20,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3131326464. Throughput: 0: 42034.7. Samples: 355438200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:54:20,500][18875] Avg episode reward: [(0, '0.498')] [2024-06-18 18:54:21,244][19107] Updated weights for policy 0, policy_version 191125 (0.0030) [2024-06-18 18:54:22,830][19087] Signal inference workers to stop experience collection... (5150 times) [2024-06-18 18:54:22,879][19107] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-06-18 18:54:22,947][19087] Signal inference workers to resume experience collection... (5150 times) [2024-06-18 18:54:22,947][19107] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-06-18 18:54:24,893][19107] Updated weights for policy 0, policy_version 191135 (0.0040) [2024-06-18 18:54:25,503][18875] Fps is (10 sec: 44224.2, 60 sec: 42323.3, 300 sec: 41820.5). Total num frames: 3131572224. Throughput: 0: 41931.0. Samples: 355684720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:54:25,504][18875] Avg episode reward: [(0, '0.394')] [2024-06-18 18:54:29,256][19107] Updated weights for policy 0, policy_version 191145 (0.0035) [2024-06-18 18:54:30,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41781.8, 300 sec: 41932.4). Total num frames: 3131768832. Throughput: 0: 42031.7. Samples: 355818120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:54:30,501][18875] Avg episode reward: [(0, '0.649')] [2024-06-18 18:54:32,557][19107] Updated weights for policy 0, policy_version 191155 (0.0029) [2024-06-18 18:54:35,500][18875] Fps is (10 sec: 40971.7, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3131981824. Throughput: 0: 42088.9. Samples: 356071520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:54:35,501][18875] Avg episode reward: [(0, '0.590')] [2024-06-18 18:54:35,533][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000191161_3131981824.pth... 
[2024-06-18 18:54:35,578][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000190546_3121905664.pth [2024-06-18 18:54:36,868][19107] Updated weights for policy 0, policy_version 191165 (0.0043) [2024-06-18 18:54:40,406][19107] Updated weights for policy 0, policy_version 191175 (0.0036) [2024-06-18 18:54:40,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3132211200. Throughput: 0: 42007.3. Samples: 356323540. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:54:40,500][18875] Avg episode reward: [(0, '0.700')] [2024-06-18 18:54:44,440][19107] Updated weights for policy 0, policy_version 191185 (0.0046) [2024-06-18 18:54:45,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3132391424. Throughput: 0: 42093.0. Samples: 356452080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:54:45,501][18875] Avg episode reward: [(0, '0.746')] [2024-06-18 18:54:48,340][19107] Updated weights for policy 0, policy_version 191195 (0.0034) [2024-06-18 18:54:50,500][18875] Fps is (10 sec: 39320.8, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3132604416. Throughput: 0: 42150.4. Samples: 356704700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:54:50,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 18:54:52,155][19107] Updated weights for policy 0, policy_version 191205 (0.0038) [2024-06-18 18:54:55,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 3132833792. Throughput: 0: 42231.2. Samples: 356957700. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:54:55,501][18875] Avg episode reward: [(0, '0.159')] [2024-06-18 18:54:56,207][19107] Updated weights for policy 0, policy_version 191215 (0.0052) [2024-06-18 18:55:00,049][19107] Updated weights for policy 0, policy_version 191225 (0.0036) [2024-06-18 18:55:00,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3133046784. Throughput: 0: 42272.5. Samples: 357084800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:55:00,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 18:55:03,805][19107] Updated weights for policy 0, policy_version 191235 (0.0047) [2024-06-18 18:55:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3133243392. Throughput: 0: 42153.2. Samples: 357335100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:55:05,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 18:55:07,588][19107] Updated weights for policy 0, policy_version 191245 (0.0028) [2024-06-18 18:55:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3133456384. Throughput: 0: 42393.9. Samples: 357592320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:55:10,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 18:55:11,731][19107] Updated weights for policy 0, policy_version 191255 (0.0038) [2024-06-18 18:55:15,491][19107] Updated weights for policy 0, policy_version 191265 (0.0039) [2024-06-18 18:55:15,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3133685760. Throughput: 0: 42095.5. Samples: 357712420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:55:15,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 18:55:19,496][19107] Updated weights for policy 0, policy_version 191275 (0.0046) [2024-06-18 18:55:20,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42043.0). Total num frames: 3133898752. Throughput: 0: 42167.6. Samples: 357969060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:55:20,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 18:55:23,604][19107] Updated weights for policy 0, policy_version 191285 (0.0034) [2024-06-18 18:55:25,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41781.2, 300 sec: 41987.4). Total num frames: 3134078976. Throughput: 0: 42225.6. Samples: 358223700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:55:25,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 18:55:27,184][19107] Updated weights for policy 0, policy_version 191295 (0.0038) [2024-06-18 18:55:30,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3134308352. Throughput: 0: 42015.1. Samples: 358342760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:55:30,501][18875] Avg episode reward: [(0, '0.411')] [2024-06-18 18:55:31,414][19107] Updated weights for policy 0, policy_version 191305 (0.0034) [2024-06-18 18:55:35,055][19107] Updated weights for policy 0, policy_version 191315 (0.0033) [2024-06-18 18:55:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3134504960. Throughput: 0: 42136.1. Samples: 358600820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-18 18:55:35,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 18:55:39,250][19107] Updated weights for policy 0, policy_version 191325 (0.0029) [2024-06-18 18:55:40,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3134701568. Throughput: 0: 42104.5. Samples: 358852400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:55:40,501][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 18:55:42,906][19107] Updated weights for policy 0, policy_version 191335 (0.0033) [2024-06-18 18:55:45,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3134914560. Throughput: 0: 41900.8. Samples: 358970340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:55:45,501][18875] Avg episode reward: [(0, '0.420')] [2024-06-18 18:55:47,021][19107] Updated weights for policy 0, policy_version 191345 (0.0032) [2024-06-18 18:55:48,351][19087] Signal inference workers to stop experience collection... (5200 times) [2024-06-18 18:55:48,382][19107] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-06-18 18:55:48,467][19087] Signal inference workers to resume experience collection... (5200 times) [2024-06-18 18:55:48,468][19107] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-06-18 18:55:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3135127552. Throughput: 0: 41994.6. Samples: 359224860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:55:50,501][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 18:55:50,954][19107] Updated weights for policy 0, policy_version 191355 (0.0032) [2024-06-18 18:55:54,705][19107] Updated weights for policy 0, policy_version 191365 (0.0043) [2024-06-18 18:55:55,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 41987.7). Total num frames: 3135340544. Throughput: 0: 41817.7. Samples: 359474120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:55:55,501][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 18:55:58,703][19107] Updated weights for policy 0, policy_version 191375 (0.0026) [2024-06-18 18:56:00,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3135569920. Throughput: 0: 42047.0. 
Samples: 359604540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:00,501][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 18:56:02,770][19107] Updated weights for policy 0, policy_version 191385 (0.0027) [2024-06-18 18:56:05,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3135750144. Throughput: 0: 41842.3. Samples: 359851960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:05,501][18875] Avg episode reward: [(0, '0.340')] [2024-06-18 18:56:06,474][19107] Updated weights for policy 0, policy_version 191395 (0.0043) [2024-06-18 18:56:10,500][18875] Fps is (10 sec: 37683.5, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3135946752. Throughput: 0: 41744.1. Samples: 360102180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:10,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 18:56:10,820][19107] Updated weights for policy 0, policy_version 191405 (0.0031) [2024-06-18 18:56:14,196][19107] Updated weights for policy 0, policy_version 191415 (0.0030) [2024-06-18 18:56:15,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3136192512. Throughput: 0: 41782.2. Samples: 360222960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:15,501][18875] Avg episode reward: [(0, '0.691')] [2024-06-18 18:56:18,782][19107] Updated weights for policy 0, policy_version 191425 (0.0035) [2024-06-18 18:56:20,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3136389120. Throughput: 0: 41649.8. Samples: 360475060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:20,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 18:56:22,205][19107] Updated weights for policy 0, policy_version 191435 (0.0040) [2024-06-18 18:56:25,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3136585728. Throughput: 0: 41732.4. 
Samples: 360730360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:25,501][18875] Avg episode reward: [(0, '0.758')] [2024-06-18 18:56:26,491][19107] Updated weights for policy 0, policy_version 191445 (0.0033) [2024-06-18 18:56:30,063][19107] Updated weights for policy 0, policy_version 191455 (0.0023) [2024-06-18 18:56:30,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3136815104. Throughput: 0: 41851.6. Samples: 360853660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:30,501][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 18:56:34,387][19107] Updated weights for policy 0, policy_version 191465 (0.0030) [2024-06-18 18:56:35,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3137011712. Throughput: 0: 41854.3. Samples: 361108300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:35,501][18875] Avg episode reward: [(0, '0.666')] [2024-06-18 18:56:35,520][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000191468_3137011712.pth... [2024-06-18 18:56:35,578][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000190855_3126968320.pth [2024-06-18 18:56:38,012][19107] Updated weights for policy 0, policy_version 191475 (0.0037) [2024-06-18 18:56:40,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3137208320. Throughput: 0: 41735.7. Samples: 361352220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:40,500][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 18:56:42,547][19107] Updated weights for policy 0, policy_version 191485 (0.0031) [2024-06-18 18:56:45,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3137421312. Throughput: 0: 41726.8. Samples: 361482240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 18:56:45,501][18875] Avg episode reward: [(0, '0.526')] [2024-06-18 18:56:46,037][19107] Updated weights for policy 0, policy_version 191495 (0.0045) [2024-06-18 18:56:50,062][19107] Updated weights for policy 0, policy_version 191505 (0.0038) [2024-06-18 18:56:50,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3137634304. Throughput: 0: 41809.8. Samples: 361733400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:56:50,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 18:56:53,534][19107] Updated weights for policy 0, policy_version 191515 (0.0033) [2024-06-18 18:56:55,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 3137863680. Throughput: 0: 41811.9. Samples: 361983720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:56:55,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 18:56:57,960][19107] Updated weights for policy 0, policy_version 191525 (0.0032) [2024-06-18 18:57:00,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3138076672. Throughput: 0: 42159.1. Samples: 362120120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:00,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 18:57:00,932][19107] Updated weights for policy 0, policy_version 191535 (0.0023) [2024-06-18 18:57:05,454][19107] Updated weights for policy 0, policy_version 191545 (0.0028) [2024-06-18 18:57:05,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3138273280. Throughput: 0: 42260.9. Samples: 362376800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:05,501][18875] Avg episode reward: [(0, '0.401')] [2024-06-18 18:57:06,549][19087] Signal inference workers to stop experience collection... 
(5250 times) [2024-06-18 18:57:06,603][19107] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-06-18 18:57:06,608][19087] Signal inference workers to resume experience collection... (5250 times) [2024-06-18 18:57:06,623][19107] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-06-18 18:57:08,426][19107] Updated weights for policy 0, policy_version 191555 (0.0038) [2024-06-18 18:57:10,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 3138502656. Throughput: 0: 42191.1. Samples: 362628960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:10,501][18875] Avg episode reward: [(0, '0.436')] [2024-06-18 18:57:13,217][19107] Updated weights for policy 0, policy_version 191565 (0.0037) [2024-06-18 18:57:15,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3138715648. Throughput: 0: 42382.3. Samples: 362760860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:15,501][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 18:57:16,377][19107] Updated weights for policy 0, policy_version 191575 (0.0028) [2024-06-18 18:57:20,504][18875] Fps is (10 sec: 39307.8, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 3138895872. Throughput: 0: 42240.7. Samples: 363009280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:20,505][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 18:57:20,968][19107] Updated weights for policy 0, policy_version 191585 (0.0035) [2024-06-18 18:57:24,069][19107] Updated weights for policy 0, policy_version 191595 (0.0038) [2024-06-18 18:57:25,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42043.5). Total num frames: 3139125248. Throughput: 0: 42534.7. Samples: 363266280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:25,501][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 18:57:28,777][19107] Updated weights for policy 0, policy_version 191605 (0.0033) [2024-06-18 18:57:30,500][18875] Fps is (10 sec: 45892.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3139354624. Throughput: 0: 42629.8. Samples: 363400580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:30,501][18875] Avg episode reward: [(0, '0.769')] [2024-06-18 18:57:31,668][19107] Updated weights for policy 0, policy_version 191615 (0.0041) [2024-06-18 18:57:35,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3139551232. Throughput: 0: 42552.3. Samples: 363648260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:35,501][18875] Avg episode reward: [(0, '0.411')] [2024-06-18 18:57:36,373][19107] Updated weights for policy 0, policy_version 191625 (0.0027) [2024-06-18 18:57:39,370][19107] Updated weights for policy 0, policy_version 191635 (0.0037) [2024-06-18 18:57:40,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42099.1). Total num frames: 3139764224. Throughput: 0: 42511.2. Samples: 363896720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:40,501][18875] Avg episode reward: [(0, '0.315')] [2024-06-18 18:57:44,033][19107] Updated weights for policy 0, policy_version 191645 (0.0026) [2024-06-18 18:57:45,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 3139977216. Throughput: 0: 42409.3. Samples: 364028540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:45,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 18:57:47,026][19107] Updated weights for policy 0, policy_version 191655 (0.0028) [2024-06-18 18:57:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3140173824. Throughput: 0: 42425.8. Samples: 364285960. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 18:57:50,501][18875] Avg episode reward: [(0, '0.335')] [2024-06-18 18:57:51,878][19107] Updated weights for policy 0, policy_version 191665 (0.0039) [2024-06-18 18:57:54,635][19107] Updated weights for policy 0, policy_version 191675 (0.0025) [2024-06-18 18:57:55,504][18875] Fps is (10 sec: 44220.7, 60 sec: 42595.9, 300 sec: 42153.5). Total num frames: 3140419584. Throughput: 0: 42261.1. Samples: 364530860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:57:55,505][18875] Avg episode reward: [(0, '0.335')] [2024-06-18 18:57:59,539][19107] Updated weights for policy 0, policy_version 191685 (0.0027) [2024-06-18 18:58:00,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3140616192. Throughput: 0: 42300.9. Samples: 364664400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:00,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 18:58:02,426][19107] Updated weights for policy 0, policy_version 191695 (0.0040) [2024-06-18 18:58:05,500][18875] Fps is (10 sec: 40974.6, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3140829184. Throughput: 0: 42346.4. Samples: 364914720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:05,501][18875] Avg episode reward: [(0, '0.405')] [2024-06-18 18:58:07,229][19107] Updated weights for policy 0, policy_version 191705 (0.0042) [2024-06-18 18:58:08,734][19087] Signal inference workers to stop experience collection... (5300 times) [2024-06-18 18:58:08,741][19087] Signal inference workers to resume experience collection... (5300 times) [2024-06-18 18:58:08,754][19107] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-06-18 18:58:08,754][19107] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-06-18 18:58:10,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3141042176. Throughput: 0: 42173.7. 
Samples: 365164100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:10,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 18:58:10,771][19107] Updated weights for policy 0, policy_version 191715 (0.0048) [2024-06-18 18:58:14,949][19107] Updated weights for policy 0, policy_version 191725 (0.0039) [2024-06-18 18:58:15,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3141238784. Throughput: 0: 42057.3. Samples: 365293160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:15,501][18875] Avg episode reward: [(0, '0.419')] [2024-06-18 18:58:18,629][19107] Updated weights for policy 0, policy_version 191735 (0.0031) [2024-06-18 18:58:20,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42600.9, 300 sec: 42098.5). Total num frames: 3141451776. Throughput: 0: 42078.2. Samples: 365541780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:20,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 18:58:22,639][19107] Updated weights for policy 0, policy_version 191745 (0.0035) [2024-06-18 18:58:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42043.5). Total num frames: 3141664768. Throughput: 0: 42141.8. Samples: 365793100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:25,501][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 18:58:26,312][19107] Updated weights for policy 0, policy_version 191755 (0.0038) [2024-06-18 18:58:30,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3141861376. Throughput: 0: 42009.8. Samples: 365918980. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:30,501][18875] Avg episode reward: [(0, '0.392')] [2024-06-18 18:58:30,716][19107] Updated weights for policy 0, policy_version 191765 (0.0035) [2024-06-18 18:58:34,581][19107] Updated weights for policy 0, policy_version 191775 (0.0033) [2024-06-18 18:58:35,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3142090752. Throughput: 0: 41741.3. Samples: 366164320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:35,501][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 18:58:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000191778_3142090752.pth... [2024-06-18 18:58:35,568][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000191161_3131981824.pth [2024-06-18 18:58:38,212][19107] Updated weights for policy 0, policy_version 191785 (0.0032) [2024-06-18 18:58:40,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 3142287360. Throughput: 0: 41918.9. Samples: 366417060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:40,504][18875] Avg episode reward: [(0, '0.383')] [2024-06-18 18:58:42,240][19107] Updated weights for policy 0, policy_version 191795 (0.0042) [2024-06-18 18:58:45,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3142483968. Throughput: 0: 41752.1. Samples: 366543240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:45,500][18875] Avg episode reward: [(0, '0.315')] [2024-06-18 18:58:46,358][19107] Updated weights for policy 0, policy_version 191805 (0.0034) [2024-06-18 18:58:50,211][19107] Updated weights for policy 0, policy_version 191815 (0.0032) [2024-06-18 18:58:50,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3142713344. Throughput: 0: 41894.7. Samples: 366799980. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:50,501][18875] Avg episode reward: [(0, '0.315')] [2024-06-18 18:58:54,117][19107] Updated weights for policy 0, policy_version 191825 (0.0041) [2024-06-18 18:58:55,500][18875] Fps is (10 sec: 45874.4, 60 sec: 42054.8, 300 sec: 42098.5). Total num frames: 3142942720. Throughput: 0: 41896.8. Samples: 367049460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 18:58:55,501][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 18:58:57,949][19107] Updated weights for policy 0, policy_version 191835 (0.0037) [2024-06-18 18:59:00,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3143122944. Throughput: 0: 41989.3. Samples: 367182680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:00,504][18875] Avg episode reward: [(0, '0.365')] [2024-06-18 18:59:01,824][19107] Updated weights for policy 0, policy_version 191845 (0.0047) [2024-06-18 18:59:05,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3143335936. Throughput: 0: 42010.2. Samples: 367432240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:05,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 18:59:05,750][19107] Updated weights for policy 0, policy_version 191855 (0.0023) [2024-06-18 18:59:09,724][19107] Updated weights for policy 0, policy_version 191865 (0.0033) [2024-06-18 18:59:10,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3143532544. Throughput: 0: 41961.2. Samples: 367681360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:10,501][18875] Avg episode reward: [(0, '0.776')] [2024-06-18 18:59:13,476][19107] Updated weights for policy 0, policy_version 191875 (0.0033) [2024-06-18 18:59:15,500][18875] Fps is (10 sec: 42599.4, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3143761920. Throughput: 0: 41994.7. Samples: 367808740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:15,501][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 18:59:17,477][19107] Updated weights for policy 0, policy_version 191885 (0.0034) [2024-06-18 18:59:20,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42043.4). Total num frames: 3143974912. Throughput: 0: 41989.7. Samples: 368053860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:20,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 18:59:21,263][19107] Updated weights for policy 0, policy_version 191895 (0.0027) [2024-06-18 18:59:25,315][19107] Updated weights for policy 0, policy_version 191905 (0.0038) [2024-06-18 18:59:25,500][18875] Fps is (10 sec: 40959.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3144171520. Throughput: 0: 42119.1. Samples: 368312420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:25,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 18:59:29,182][19107] Updated weights for policy 0, policy_version 191915 (0.0045) [2024-06-18 18:59:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3144384512. Throughput: 0: 42001.2. Samples: 368433300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:30,501][18875] Avg episode reward: [(0, '0.640')] [2024-06-18 18:59:32,932][19087] Signal inference workers to stop experience collection... (5350 times) [2024-06-18 18:59:32,932][19087] Signal inference workers to resume experience collection... (5350 times) [2024-06-18 18:59:32,967][19107] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-06-18 18:59:32,967][19107] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-06-18 18:59:33,086][19107] Updated weights for policy 0, policy_version 191925 (0.0031) [2024-06-18 18:59:35,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3144597504. Throughput: 0: 41771.5. 
Samples: 368679700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:35,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 18:59:36,942][19107] Updated weights for policy 0, policy_version 191935 (0.0043) [2024-06-18 18:59:40,504][18875] Fps is (10 sec: 40945.4, 60 sec: 41776.7, 300 sec: 42042.5). Total num frames: 3144794112. Throughput: 0: 41901.6. Samples: 368935180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:40,505][18875] Avg episode reward: [(0, '0.797')] [2024-06-18 18:59:40,707][19107] Updated weights for policy 0, policy_version 191945 (0.0033) [2024-06-18 18:59:44,789][19107] Updated weights for policy 0, policy_version 191955 (0.0051) [2024-06-18 18:59:45,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 3145007104. Throughput: 0: 41721.2. Samples: 369060140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:45,501][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 18:59:48,487][19107] Updated weights for policy 0, policy_version 191965 (0.0023) [2024-06-18 18:59:50,500][18875] Fps is (10 sec: 40975.2, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3145203712. Throughput: 0: 41703.8. Samples: 369308900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:50,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 18:59:52,546][19107] Updated weights for policy 0, policy_version 191975 (0.0037) [2024-06-18 18:59:55,500][18875] Fps is (10 sec: 42599.5, 60 sec: 41506.3, 300 sec: 41987.5). Total num frames: 3145433088. Throughput: 0: 41770.0. Samples: 369561000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 18:59:55,500][18875] Avg episode reward: [(0, '0.375')] [2024-06-18 18:59:57,039][19107] Updated weights for policy 0, policy_version 191985 (0.0038) [2024-06-18 19:00:00,359][19107] Updated weights for policy 0, policy_version 191995 (0.0045) [2024-06-18 19:00:00,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3145646080. Throughput: 0: 41828.8. Samples: 369691040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:00:00,501][18875] Avg episode reward: [(0, '0.345')] [2024-06-18 19:00:04,726][19107] Updated weights for policy 0, policy_version 192005 (0.0043) [2024-06-18 19:00:05,500][18875] Fps is (10 sec: 37682.5, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3145809920. Throughput: 0: 41805.8. Samples: 369935120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:00:05,501][18875] Avg episode reward: [(0, '0.347')] [2024-06-18 19:00:08,229][19107] Updated weights for policy 0, policy_version 192015 (0.0033) [2024-06-18 19:00:10,504][18875] Fps is (10 sec: 40945.4, 60 sec: 42049.8, 300 sec: 41931.4). Total num frames: 3146055680. Throughput: 0: 41498.1. Samples: 370179980. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:10,505][18875] Avg episode reward: [(0, '0.308')] [2024-06-18 19:00:12,767][19107] Updated weights for policy 0, policy_version 192025 (0.0030) [2024-06-18 19:00:15,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 3146235904. Throughput: 0: 41617.5. Samples: 370306080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:15,501][18875] Avg episode reward: [(0, '0.765')] [2024-06-18 19:00:16,568][19107] Updated weights for policy 0, policy_version 192035 (0.0032) [2024-06-18 19:00:20,500][18875] Fps is (10 sec: 39335.4, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 3146448896. Throughput: 0: 41622.2. Samples: 370552700. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:20,501][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 19:00:20,614][19107] Updated weights for policy 0, policy_version 192045 (0.0037) [2024-06-18 19:00:24,123][19107] Updated weights for policy 0, policy_version 192055 (0.0035) [2024-06-18 19:00:25,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3146661888. Throughput: 0: 41599.3. Samples: 370807000. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:25,501][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 19:00:28,362][19107] Updated weights for policy 0, policy_version 192065 (0.0025) [2024-06-18 19:00:30,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3146874880. Throughput: 0: 41640.2. Samples: 370933940. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:30,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 19:00:31,994][19107] Updated weights for policy 0, policy_version 192075 (0.0039) [2024-06-18 19:00:35,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3147087872. Throughput: 0: 41574.5. Samples: 371179760. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:35,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 19:00:35,509][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000192083_3147087872.pth... [2024-06-18 19:00:35,567][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000191468_3137011712.pth [2024-06-18 19:00:36,037][19107] Updated weights for policy 0, policy_version 192085 (0.0042) [2024-06-18 19:00:40,045][19107] Updated weights for policy 0, policy_version 192095 (0.0033) [2024-06-18 19:00:40,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41508.7, 300 sec: 41931.9). Total num frames: 3147284480. Throughput: 0: 41622.6. Samples: 371434020. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:40,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 19:00:43,965][19107] Updated weights for policy 0, policy_version 192105 (0.0029) [2024-06-18 19:00:45,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3147513856. Throughput: 0: 41494.3. Samples: 371558280. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:45,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 19:00:47,822][19107] Updated weights for policy 0, policy_version 192115 (0.0044) [2024-06-18 19:00:50,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3147743232. Throughput: 0: 41608.0. Samples: 371807480. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:50,501][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 19:00:51,815][19107] Updated weights for policy 0, policy_version 192125 (0.0028) [2024-06-18 19:00:55,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 3147923456. Throughput: 0: 41721.4. Samples: 372057300. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:00:55,501][18875] Avg episode reward: [(0, '0.769')] [2024-06-18 19:00:55,739][19107] Updated weights for policy 0, policy_version 192135 (0.0039) [2024-06-18 19:00:59,559][19107] Updated weights for policy 0, policy_version 192145 (0.0028) [2024-06-18 19:01:00,504][18875] Fps is (10 sec: 39307.9, 60 sec: 41503.7, 300 sec: 41987.0). Total num frames: 3148136448. Throughput: 0: 41570.0. Samples: 372176880. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:01:00,504][18875] Avg episode reward: [(0, '0.659')] [2024-06-18 19:01:03,538][19107] Updated weights for policy 0, policy_version 192155 (0.0031) [2024-06-18 19:01:04,941][19087] Signal inference workers to stop experience collection... 
(5400 times) [2024-06-18 19:01:04,992][19107] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-06-18 19:01:05,002][19087] Signal inference workers to resume experience collection... (5400 times) [2024-06-18 19:01:05,008][19107] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-06-18 19:01:05,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3148349440. Throughput: 0: 41760.5. Samples: 372431920. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:01:05,501][18875] Avg episode reward: [(0, '0.598')] [2024-06-18 19:01:07,651][19107] Updated weights for policy 0, policy_version 192165 (0.0037) [2024-06-18 19:01:10,500][18875] Fps is (10 sec: 39336.0, 60 sec: 41235.6, 300 sec: 41820.9). Total num frames: 3148529664. Throughput: 0: 41544.6. Samples: 372676500. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-18 19:01:10,501][18875] Avg episode reward: [(0, '0.353')] [2024-06-18 19:01:11,579][19107] Updated weights for policy 0, policy_version 192175 (0.0026) [2024-06-18 19:01:15,471][19107] Updated weights for policy 0, policy_version 192185 (0.0035) [2024-06-18 19:01:15,504][18875] Fps is (10 sec: 40945.3, 60 sec: 42049.7, 300 sec: 41931.4). Total num frames: 3148759040. Throughput: 0: 41434.9. Samples: 372798660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:01:15,504][18875] Avg episode reward: [(0, '0.211')] [2024-06-18 19:01:19,333][19107] Updated weights for policy 0, policy_version 192195 (0.0036) [2024-06-18 19:01:20,504][18875] Fps is (10 sec: 42582.5, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 3148955648. Throughput: 0: 41686.9. Samples: 373055820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:01:20,505][18875] Avg episode reward: [(0, '0.621')] [2024-06-18 19:01:23,138][19107] Updated weights for policy 0, policy_version 192205 (0.0039) [2024-06-18 19:01:25,500][18875] Fps is (10 sec: 40974.3, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3149168640. Throughput: 0: 41569.7. Samples: 373304660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:01:25,501][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 19:01:27,104][19107] Updated weights for policy 0, policy_version 192215 (0.0036) [2024-06-18 19:01:30,504][18875] Fps is (10 sec: 42598.6, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 3149381632. Throughput: 0: 41668.7. Samples: 373433520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:01:30,504][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 19:01:31,300][19107] Updated weights for policy 0, policy_version 192225 (0.0037) [2024-06-18 19:01:34,800][19107] Updated weights for policy 0, policy_version 192235 (0.0040) [2024-06-18 19:01:35,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3149594624. Throughput: 0: 41751.3. Samples: 373686280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:01:35,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 19:01:39,092][19107] Updated weights for policy 0, policy_version 192245 (0.0028) [2024-06-18 19:01:40,500][18875] Fps is (10 sec: 44252.1, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3149824000. Throughput: 0: 41713.3. Samples: 373934400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:01:40,501][18875] Avg episode reward: [(0, '0.554')] [2024-06-18 19:01:42,710][19107] Updated weights for policy 0, policy_version 192255 (0.0029) [2024-06-18 19:01:45,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3150004224. Throughput: 0: 41843.4. Samples: 374059680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:01:45,500][18875] Avg episode reward: [(0, '0.695')] [2024-06-18 19:01:46,795][19107] Updated weights for policy 0, policy_version 192265 (0.0027) [2024-06-18 19:01:50,508][18875] Fps is (10 sec: 39292.5, 60 sec: 41228.0, 300 sec: 41875.3). Total num frames: 3150217216. Throughput: 0: 41760.1. Samples: 374311440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:01:50,508][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 19:01:50,691][19107] Updated weights for policy 0, policy_version 192275 (0.0035) [2024-06-18 19:01:54,547][19107] Updated weights for policy 0, policy_version 192285 (0.0031) [2024-06-18 19:01:55,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3150430208. Throughput: 0: 41832.4. Samples: 374558960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:01:55,500][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 19:01:58,340][19107] Updated weights for policy 0, policy_version 192295 (0.0038) [2024-06-18 19:02:00,500][18875] Fps is (10 sec: 40990.9, 60 sec: 41508.6, 300 sec: 41876.4). Total num frames: 3150626816. Throughput: 0: 42014.9. Samples: 374689180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:02:00,504][18875] Avg episode reward: [(0, '0.768')] [2024-06-18 19:02:02,247][19107] Updated weights for policy 0, policy_version 192305 (0.0035) [2024-06-18 19:02:05,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3150823424. Throughput: 0: 41802.6. Samples: 374936780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:02:05,500][18875] Avg episode reward: [(0, '0.848')] [2024-06-18 19:02:06,231][19107] Updated weights for policy 0, policy_version 192315 (0.0039) [2024-06-18 19:02:10,022][19107] Updated weights for policy 0, policy_version 192325 (0.0043) [2024-06-18 19:02:10,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.1, 300 sec: 41820.8). Total num frames: 3151052800. Throughput: 0: 41928.5. Samples: 375191440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:02:10,501][18875] Avg episode reward: [(0, '0.670')] [2024-06-18 19:02:13,892][19107] Updated weights for policy 0, policy_version 192335 (0.0032) [2024-06-18 19:02:15,500][18875] Fps is (10 sec: 44236.0, 60 sec: 41781.7, 300 sec: 41932.4). Total num frames: 3151265792. Throughput: 0: 41989.5. Samples: 375322900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 19:02:15,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 19:02:17,649][19107] Updated weights for policy 0, policy_version 192345 (0.0040) [2024-06-18 19:02:20,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42054.7, 300 sec: 41876.4). Total num frames: 3151478784. Throughput: 0: 41880.7. Samples: 375570920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:02:20,501][18875] Avg episode reward: [(0, '0.319')] [2024-06-18 19:02:21,582][19107] Updated weights for policy 0, policy_version 192355 (0.0029) [2024-06-18 19:02:25,503][18875] Fps is (10 sec: 42586.0, 60 sec: 42050.2, 300 sec: 41820.4). Total num frames: 3151691776. Throughput: 0: 42000.0. Samples: 375824520. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:02:25,504][18875] Avg episode reward: [(0, '0.197')] [2024-06-18 19:02:25,720][19107] Updated weights for policy 0, policy_version 192365 (0.0031) [2024-06-18 19:02:29,155][19107] Updated weights for policy 0, policy_version 192375 (0.0039) [2024-06-18 19:02:30,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42054.8, 300 sec: 41876.4). Total num frames: 3151904768. Throughput: 0: 42096.8. Samples: 375954040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:02:30,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 19:02:33,492][19107] Updated weights for policy 0, policy_version 192385 (0.0025) [2024-06-18 19:02:35,504][18875] Fps is (10 sec: 39319.3, 60 sec: 41503.6, 300 sec: 41764.8). Total num frames: 3152084992. Throughput: 0: 41880.6. Samples: 376195900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:02:35,504][18875] Avg episode reward: [(0, '0.739')] [2024-06-18 19:02:35,662][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000192389_3152101376.pth... [2024-06-18 19:02:35,706][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000191778_3142090752.pth [2024-06-18 19:02:37,457][19107] Updated weights for policy 0, policy_version 192395 (0.0033) [2024-06-18 19:02:40,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 3152297984. Throughput: 0: 41971.1. Samples: 376447660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:02:40,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 19:02:41,379][19107] Updated weights for policy 0, policy_version 192405 (0.0040) [2024-06-18 19:02:45,500][18875] Fps is (10 sec: 42614.0, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3152510976. Throughput: 0: 41917.4. Samples: 376575460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:02:45,500][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 19:02:45,634][19107] Updated weights for policy 0, policy_version 192415 (0.0043) [2024-06-18 19:02:48,987][19087] Signal inference workers to stop experience collection... (5450 times) [2024-06-18 19:02:49,036][19107] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-06-18 19:02:49,050][19087] Signal inference workers to resume experience collection... (5450 times) [2024-06-18 19:02:49,055][19107] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-06-18 19:02:49,210][19107] Updated weights for policy 0, policy_version 192425 (0.0023) [2024-06-18 19:02:50,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42057.6, 300 sec: 41765.8). Total num frames: 3152740352. Throughput: 0: 42005.7. Samples: 376827040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:02:50,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 19:02:53,457][19107] Updated weights for policy 0, policy_version 192435 (0.0037) [2024-06-18 19:02:55,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3152953344. Throughput: 0: 41843.2. Samples: 377074380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:02:55,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 19:02:56,807][19107] Updated weights for policy 0, policy_version 192445 (0.0043) [2024-06-18 19:03:00,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3153149952. Throughput: 0: 41711.6. Samples: 377199920. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:03:00,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 19:03:01,272][19107] Updated weights for policy 0, policy_version 192455 (0.0048) [2024-06-18 19:03:04,542][19107] Updated weights for policy 0, policy_version 192465 (0.0040) [2024-06-18 19:03:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 41820.9). Total num frames: 3153379328. Throughput: 0: 41911.2. Samples: 377456920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:03:05,501][18875] Avg episode reward: [(0, '0.703')] [2024-06-18 19:03:09,201][19107] Updated weights for policy 0, policy_version 192475 (0.0047) [2024-06-18 19:03:10,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3153592320. Throughput: 0: 41900.1. Samples: 377709900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:03:10,501][18875] Avg episode reward: [(0, '0.590')] [2024-06-18 19:03:12,689][19107] Updated weights for policy 0, policy_version 192485 (0.0032) [2024-06-18 19:03:15,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3153772544. Throughput: 0: 41800.4. Samples: 377835060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:03:15,501][18875] Avg episode reward: [(0, '0.773')] [2024-06-18 19:03:17,142][19107] Updated weights for policy 0, policy_version 192495 (0.0030) [2024-06-18 19:03:20,334][19107] Updated weights for policy 0, policy_version 192505 (0.0033) [2024-06-18 19:03:20,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 3154001920. Throughput: 0: 41975.0. Samples: 378084620. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:03:20,501][18875] Avg episode reward: [(0, '0.784')] [2024-06-18 19:03:24,999][19107] Updated weights for policy 0, policy_version 192515 (0.0039) [2024-06-18 19:03:25,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41508.2, 300 sec: 41765.3). Total num frames: 3154182144. Throughput: 0: 42084.0. Samples: 378341440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:03:25,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 19:03:28,011][19107] Updated weights for policy 0, policy_version 192525 (0.0039) [2024-06-18 19:03:30,505][18875] Fps is (10 sec: 42576.4, 60 sec: 42048.7, 300 sec: 41820.1). Total num frames: 3154427904. Throughput: 0: 41858.8. Samples: 378459320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:03:30,506][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 19:03:32,580][19107] Updated weights for policy 0, policy_version 192535 (0.0033) [2024-06-18 19:03:35,504][18875] Fps is (10 sec: 45858.4, 60 sec: 42598.4, 300 sec: 41875.9). Total num frames: 3154640896. Throughput: 0: 41892.1. Samples: 378712340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:03:35,505][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 19:03:35,717][19107] Updated weights for policy 0, policy_version 192545 (0.0027) [2024-06-18 19:03:40,094][19107] Updated weights for policy 0, policy_version 192555 (0.0033) [2024-06-18 19:03:40,500][18875] Fps is (10 sec: 39341.3, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3154821120. Throughput: 0: 42132.4. Samples: 378970340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:03:40,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 19:03:43,382][19107] Updated weights for policy 0, policy_version 192565 (0.0040) [2024-06-18 19:03:45,500][18875] Fps is (10 sec: 40974.6, 60 sec: 42325.2, 300 sec: 41820.8). Total num frames: 3155050496. Throughput: 0: 41984.4. Samples: 379089220. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:03:45,501][18875] Avg episode reward: [(0, '0.441')] [2024-06-18 19:03:47,702][19107] Updated weights for policy 0, policy_version 192575 (0.0048) [2024-06-18 19:03:50,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3155247104. Throughput: 0: 42055.7. Samples: 379349420. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:03:50,500][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 19:03:51,243][19107] Updated weights for policy 0, policy_version 192585 (0.0033) [2024-06-18 19:03:55,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3155460096. Throughput: 0: 41912.0. Samples: 379595940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:03:55,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 19:03:55,826][19107] Updated weights for policy 0, policy_version 192595 (0.0037) [2024-06-18 19:03:59,140][19107] Updated weights for policy 0, policy_version 192605 (0.0033) [2024-06-18 19:04:00,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3155689472. Throughput: 0: 42029.4. Samples: 379726380. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:04:00,501][18875] Avg episode reward: [(0, '0.175')] [2024-06-18 19:04:03,534][19107] Updated weights for policy 0, policy_version 192615 (0.0035) [2024-06-18 19:04:05,504][18875] Fps is (10 sec: 42583.2, 60 sec: 41776.7, 300 sec: 41875.9). Total num frames: 3155886080. Throughput: 0: 42015.6. Samples: 379975480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:04:05,505][18875] Avg episode reward: [(0, '0.426')] [2024-06-18 19:04:07,120][19107] Updated weights for policy 0, policy_version 192625 (0.0039) [2024-06-18 19:04:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3156099072. Throughput: 0: 41878.1. Samples: 380225960. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:04:10,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 19:04:11,170][19107] Updated weights for policy 0, policy_version 192635 (0.0034) [2024-06-18 19:04:14,361][19087] Signal inference workers to stop experience collection... (5500 times) [2024-06-18 19:04:14,362][19087] Signal inference workers to resume experience collection... (5500 times) [2024-06-18 19:04:14,411][19107] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-06-18 19:04:14,411][19107] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-06-18 19:04:15,068][19107] Updated weights for policy 0, policy_version 192645 (0.0029) [2024-06-18 19:04:15,500][18875] Fps is (10 sec: 42614.2, 60 sec: 42325.5, 300 sec: 41820.9). Total num frames: 3156312064. Throughput: 0: 42097.3. Samples: 380353480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:04:15,501][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 19:04:18,606][19107] Updated weights for policy 0, policy_version 192655 (0.0041) [2024-06-18 19:04:20,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3156492288. Throughput: 0: 41968.8. Samples: 380600780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:04:20,501][18875] Avg episode reward: [(0, '0.357')] [2024-06-18 19:04:22,789][19107] Updated weights for policy 0, policy_version 192665 (0.0031) [2024-06-18 19:04:25,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 3156738048. Throughput: 0: 41848.6. Samples: 380853520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:04:25,500][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 19:04:26,503][19107] Updated weights for policy 0, policy_version 192675 (0.0034) [2024-06-18 19:04:30,500][18875] Fps is (10 sec: 44235.9, 60 sec: 41782.7, 300 sec: 41820.8). Total num frames: 3156934656. Throughput: 0: 42064.4. 
Samples: 380982120. Policy #0 lag: (min: 1.0, avg: 10.6, max: 23.0) [2024-06-18 19:04:30,501][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 19:04:30,563][19107] Updated weights for policy 0, policy_version 192685 (0.0040) [2024-06-18 19:04:34,335][19107] Updated weights for policy 0, policy_version 192695 (0.0031) [2024-06-18 19:04:35,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41508.6, 300 sec: 41821.4). Total num frames: 3157131264. Throughput: 0: 41786.1. Samples: 381229800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:04:35,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 19:04:35,675][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000192697_3157147648.pth... [2024-06-18 19:04:35,723][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000192083_3147087872.pth [2024-06-18 19:04:38,837][19107] Updated weights for policy 0, policy_version 192705 (0.0033) [2024-06-18 19:04:40,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3157360640. Throughput: 0: 41959.1. Samples: 381484100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:04:40,501][18875] Avg episode reward: [(0, '0.296')] [2024-06-18 19:04:41,968][19107] Updated weights for policy 0, policy_version 192715 (0.0028) [2024-06-18 19:04:45,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 3157540864. Throughput: 0: 41821.5. Samples: 381608340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:04:45,500][18875] Avg episode reward: [(0, '0.649')] [2024-06-18 19:04:46,527][19107] Updated weights for policy 0, policy_version 192725 (0.0031) [2024-06-18 19:04:49,722][19107] Updated weights for policy 0, policy_version 192735 (0.0038) [2024-06-18 19:04:50,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 3157786624. Throughput: 0: 41915.3. Samples: 381861520. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:04:50,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 19:04:54,358][19107] Updated weights for policy 0, policy_version 192745 (0.0039) [2024-06-18 19:04:55,500][18875] Fps is (10 sec: 45874.2, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3157999616. Throughput: 0: 42022.6. Samples: 382116980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:04:55,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 19:04:57,330][19107] Updated weights for policy 0, policy_version 192755 (0.0031) [2024-06-18 19:05:00,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3158179840. Throughput: 0: 41939.5. Samples: 382240760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:05:00,501][18875] Avg episode reward: [(0, '0.422')] [2024-06-18 19:05:01,980][19107] Updated weights for policy 0, policy_version 192765 (0.0040) [2024-06-18 19:05:04,943][19107] Updated weights for policy 0, policy_version 192775 (0.0028) [2024-06-18 19:05:05,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42601.0, 300 sec: 41988.0). Total num frames: 3158441984. Throughput: 0: 42216.3. Samples: 382500520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:05:05,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 19:05:09,649][19107] Updated weights for policy 0, policy_version 192785 (0.0035) [2024-06-18 19:05:10,504][18875] Fps is (10 sec: 42583.0, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 3158605824. Throughput: 0: 42171.2. Samples: 382751380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:05:10,505][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 19:05:12,534][19107] Updated weights for policy 0, policy_version 192795 (0.0024) [2024-06-18 19:05:15,500][18875] Fps is (10 sec: 36044.6, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 3158802432. Throughput: 0: 41907.6. Samples: 382867960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:05:15,501][18875] Avg episode reward: [(0, '0.349')] [2024-06-18 19:05:17,754][19107] Updated weights for policy 0, policy_version 192805 (0.0038) [2024-06-18 19:05:20,500][18875] Fps is (10 sec: 45891.2, 60 sec: 42871.3, 300 sec: 42043.0). Total num frames: 3159064576. Throughput: 0: 42143.9. Samples: 383126280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:05:20,501][18875] Avg episode reward: [(0, '0.319')] [2024-06-18 19:05:20,682][19107] Updated weights for policy 0, policy_version 192815 (0.0033) [2024-06-18 19:05:25,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3159228416. Throughput: 0: 42287.2. Samples: 383387020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:05:25,501][18875] Avg episode reward: [(0, '0.171')] [2024-06-18 19:05:25,617][19107] Updated weights for policy 0, policy_version 192825 (0.0034) [2024-06-18 19:05:28,484][19107] Updated weights for policy 0, policy_version 192835 (0.0039) [2024-06-18 19:05:30,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3159441408. Throughput: 0: 42023.0. Samples: 383499380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:05:30,501][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 19:05:33,214][19087] Signal inference workers to stop experience collection... (5550 times) [2024-06-18 19:05:33,215][19087] Signal inference workers to resume experience collection... (5550 times) [2024-06-18 19:05:33,255][19107] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-06-18 19:05:33,256][19107] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-06-18 19:05:33,359][19107] Updated weights for policy 0, policy_version 192845 (0.0038) [2024-06-18 19:05:35,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3159670784. Throughput: 0: 42023.6. 
Samples: 383752580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-18 19:05:35,501][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 19:05:36,298][19107] Updated weights for policy 0, policy_version 192855 (0.0032) [2024-06-18 19:05:40,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 3159851008. Throughput: 0: 42131.8. Samples: 384012900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:05:40,500][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 19:05:41,220][19107] Updated weights for policy 0, policy_version 192865 (0.0035) [2024-06-18 19:05:44,247][19107] Updated weights for policy 0, policy_version 192875 (0.0023) [2024-06-18 19:05:45,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 41820.9). Total num frames: 3160080384. Throughput: 0: 41966.6. Samples: 384129260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:05:45,501][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 19:05:48,882][19107] Updated weights for policy 0, policy_version 192885 (0.0030) [2024-06-18 19:05:50,500][18875] Fps is (10 sec: 45875.1, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3160309760. Throughput: 0: 41884.1. Samples: 384385300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:05:50,500][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 19:05:52,058][19107] Updated weights for policy 0, policy_version 192895 (0.0037) [2024-06-18 19:05:55,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41876.9). Total num frames: 3160489984. Throughput: 0: 41968.6. Samples: 384639820. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:05:55,501][18875] Avg episode reward: [(0, '0.405')] [2024-06-18 19:05:56,680][19107] Updated weights for policy 0, policy_version 192905 (0.0043) [2024-06-18 19:06:00,247][19107] Updated weights for policy 0, policy_version 192915 (0.0036) [2024-06-18 19:06:00,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3160719360. Throughput: 0: 42055.6. Samples: 384760460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:06:00,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 19:06:04,278][19107] Updated weights for policy 0, policy_version 192925 (0.0037) [2024-06-18 19:06:05,500][18875] Fps is (10 sec: 42599.5, 60 sec: 41233.2, 300 sec: 41987.5). Total num frames: 3160915968. Throughput: 0: 41935.8. Samples: 385013380. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:06:05,500][18875] Avg episode reward: [(0, '0.835')] [2024-06-18 19:06:08,117][19107] Updated weights for policy 0, policy_version 192935 (0.0041) [2024-06-18 19:06:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42054.8, 300 sec: 41932.4). Total num frames: 3161128960. Throughput: 0: 41717.7. Samples: 385264320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:06:10,501][18875] Avg episode reward: [(0, '0.809')] [2024-06-18 19:06:12,136][19107] Updated weights for policy 0, policy_version 192945 (0.0036) [2024-06-18 19:06:15,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 41932.4). Total num frames: 3161325568. Throughput: 0: 41935.6. Samples: 385386480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:06:15,501][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 19:06:16,294][19107] Updated weights for policy 0, policy_version 192955 (0.0045) [2024-06-18 19:06:19,896][19107] Updated weights for policy 0, policy_version 192965 (0.0041) [2024-06-18 19:06:20,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41506.3, 300 sec: 41987.5). 
Total num frames: 3161554944. Throughput: 0: 41953.9. Samples: 385640500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:06:20,500][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 19:06:24,189][19107] Updated weights for policy 0, policy_version 192975 (0.0045) [2024-06-18 19:06:25,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41932.5). Total num frames: 3161751552. Throughput: 0: 41785.7. Samples: 385893260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:06:25,501][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 19:06:27,852][19107] Updated weights for policy 0, policy_version 192985 (0.0031) [2024-06-18 19:06:30,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3161964544. Throughput: 0: 41912.5. Samples: 386015320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:06:30,501][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 19:06:32,033][19107] Updated weights for policy 0, policy_version 192995 (0.0045) [2024-06-18 19:06:35,501][18875] Fps is (10 sec: 42597.3, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3162177536. Throughput: 0: 41848.6. Samples: 386268500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:06:35,501][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 19:06:35,642][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000193005_3162193920.pth... [2024-06-18 19:06:35,648][19107] Updated weights for policy 0, policy_version 193005 (0.0046) [2024-06-18 19:06:35,702][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000192389_3152101376.pth [2024-06-18 19:06:39,822][19107] Updated weights for policy 0, policy_version 193015 (0.0028) [2024-06-18 19:06:40,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3162390528. Throughput: 0: 41783.3. Samples: 386520060. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:06:40,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 19:06:43,508][19107] Updated weights for policy 0, policy_version 193025 (0.0047) [2024-06-18 19:06:45,500][18875] Fps is (10 sec: 42599.4, 60 sec: 42052.4, 300 sec: 41988.5). Total num frames: 3162603520. Throughput: 0: 41964.0. Samples: 386648840. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:06:45,501][18875] Avg episode reward: [(0, '0.651')] [2024-06-18 19:06:47,647][19107] Updated weights for policy 0, policy_version 193035 (0.0032) [2024-06-18 19:06:50,504][18875] Fps is (10 sec: 39307.3, 60 sec: 41230.5, 300 sec: 41875.9). Total num frames: 3162783744. Throughput: 0: 41772.5. Samples: 386893300. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:06:50,505][18875] Avg episode reward: [(0, '0.691')] [2024-06-18 19:06:51,303][19107] Updated weights for policy 0, policy_version 193045 (0.0030) [2024-06-18 19:06:55,428][19107] Updated weights for policy 0, policy_version 193055 (0.0056) [2024-06-18 19:06:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3163013120. Throughput: 0: 41852.0. Samples: 387147660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:06:55,501][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 19:06:59,421][19107] Updated weights for policy 0, policy_version 193065 (0.0035) [2024-06-18 19:07:00,500][18875] Fps is (10 sec: 45892.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3163242496. Throughput: 0: 41955.6. Samples: 387274480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:00,501][18875] Avg episode reward: [(0, '0.206')] [2024-06-18 19:07:03,209][19107] Updated weights for policy 0, policy_version 193075 (0.0034) [2024-06-18 19:07:05,500][18875] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3163439104. Throughput: 0: 41763.6. Samples: 387519860. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:05,500][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 19:07:07,553][19107] Updated weights for policy 0, policy_version 193085 (0.0040) [2024-06-18 19:07:07,844][19087] Signal inference workers to stop experience collection... (5600 times) [2024-06-18 19:07:07,846][19087] Signal inference workers to resume experience collection... (5600 times) [2024-06-18 19:07:07,873][19107] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-06-18 19:07:07,873][19107] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-06-18 19:07:10,500][18875] Fps is (10 sec: 37682.9, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3163619328. Throughput: 0: 41747.5. Samples: 387771900. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:10,501][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 19:07:11,098][19107] Updated weights for policy 0, policy_version 193095 (0.0034) [2024-06-18 19:07:15,184][19107] Updated weights for policy 0, policy_version 193105 (0.0033) [2024-06-18 19:07:15,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3163832320. Throughput: 0: 41674.3. Samples: 387890660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:15,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 19:07:18,955][19107] Updated weights for policy 0, policy_version 193115 (0.0034) [2024-06-18 19:07:20,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41779.2, 300 sec: 41932.4). Total num frames: 3164061696. Throughput: 0: 41694.9. Samples: 388144760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:20,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 19:07:22,877][19107] Updated weights for policy 0, policy_version 193125 (0.0038) [2024-06-18 19:07:25,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3164274688. Throughput: 0: 41708.4. 
Samples: 388396940. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:25,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 19:07:26,604][19107] Updated weights for policy 0, policy_version 193135 (0.0046) [2024-06-18 19:07:30,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 41932.4). Total num frames: 3164454912. Throughput: 0: 41595.9. Samples: 388520660. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:30,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 19:07:30,949][19107] Updated weights for policy 0, policy_version 193145 (0.0046) [2024-06-18 19:07:34,373][19107] Updated weights for policy 0, policy_version 193155 (0.0034) [2024-06-18 19:07:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3164700672. Throughput: 0: 41981.2. Samples: 388782300. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:35,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 19:07:39,019][19107] Updated weights for policy 0, policy_version 193165 (0.0027) [2024-06-18 19:07:40,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3164897280. Throughput: 0: 41819.7. Samples: 389029540. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:40,501][18875] Avg episode reward: [(0, '0.345')] [2024-06-18 19:07:42,083][19107] Updated weights for policy 0, policy_version 193175 (0.0031) [2024-06-18 19:07:45,500][18875] Fps is (10 sec: 37682.7, 60 sec: 41233.0, 300 sec: 41820.8). Total num frames: 3165077504. Throughput: 0: 41664.8. Samples: 389149400. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:45,501][18875] Avg episode reward: [(0, '0.370')] [2024-06-18 19:07:46,724][19107] Updated weights for policy 0, policy_version 193185 (0.0043) [2024-06-18 19:07:49,831][19107] Updated weights for policy 0, policy_version 193195 (0.0038) [2024-06-18 19:07:50,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42327.9, 300 sec: 41931.9). Total num frames: 3165323264. Throughput: 0: 41899.1. Samples: 389405320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 22.0) [2024-06-18 19:07:50,501][18875] Avg episode reward: [(0, '0.422')] [2024-06-18 19:07:54,439][19107] Updated weights for policy 0, policy_version 193205 (0.0039) [2024-06-18 19:07:55,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3165519872. Throughput: 0: 41798.7. Samples: 389652840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:07:55,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 19:07:57,578][19107] Updated weights for policy 0, policy_version 193215 (0.0030) [2024-06-18 19:08:00,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3165732864. Throughput: 0: 41837.7. Samples: 389773360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:00,501][18875] Avg episode reward: [(0, '0.703')] [2024-06-18 19:08:02,190][19107] Updated weights for policy 0, policy_version 193225 (0.0037) [2024-06-18 19:08:05,150][19107] Updated weights for policy 0, policy_version 193235 (0.0043) [2024-06-18 19:08:05,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 3165978624. Throughput: 0: 41957.7. Samples: 390032860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:05,501][18875] Avg episode reward: [(0, '0.670')] [2024-06-18 19:08:10,037][19107] Updated weights for policy 0, policy_version 193245 (0.0048) [2024-06-18 19:08:10,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3166142464. Throughput: 0: 41946.7. Samples: 390284540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:10,501][18875] Avg episode reward: [(0, '0.627')] [2024-06-18 19:08:13,127][19107] Updated weights for policy 0, policy_version 193255 (0.0039) [2024-06-18 19:08:15,500][18875] Fps is (10 sec: 36044.6, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3166339072. Throughput: 0: 41726.6. Samples: 390398360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:15,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 19:08:18,394][19107] Updated weights for policy 0, policy_version 193265 (0.0027) [2024-06-18 19:08:20,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3166584832. Throughput: 0: 41718.7. Samples: 390659640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:20,501][18875] Avg episode reward: [(0, '0.378')] [2024-06-18 19:08:20,828][19087] Signal inference workers to stop experience collection... (5650 times) [2024-06-18 19:08:20,871][19107] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-06-18 19:08:20,880][19087] Signal inference workers to resume experience collection... (5650 times) [2024-06-18 19:08:20,885][19107] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-06-18 19:08:20,890][19107] Updated weights for policy 0, policy_version 193275 (0.0033) [2024-06-18 19:08:25,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41766.0). Total num frames: 3166748672. Throughput: 0: 41754.5. Samples: 390908500. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:25,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 19:08:26,299][19107] Updated weights for policy 0, policy_version 193285 (0.0042) [2024-06-18 19:08:28,571][19107] Updated weights for policy 0, policy_version 193295 (0.0040) [2024-06-18 19:08:30,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41876.9). Total num frames: 3166994432. Throughput: 0: 41710.2. Samples: 391026360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:30,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 19:08:34,152][19107] Updated weights for policy 0, policy_version 193305 (0.0045) [2024-06-18 19:08:35,500][18875] Fps is (10 sec: 45874.9, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3167207424. Throughput: 0: 41863.8. Samples: 391289200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:35,501][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 19:08:35,510][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000193311_3167207424.pth... [2024-06-18 19:08:35,588][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000192697_3157147648.pth [2024-06-18 19:08:36,470][19107] Updated weights for policy 0, policy_version 193315 (0.0041) [2024-06-18 19:08:40,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 3167371264. Throughput: 0: 41833.3. Samples: 391535340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:40,501][18875] Avg episode reward: [(0, '0.403')] [2024-06-18 19:08:41,867][19107] Updated weights for policy 0, policy_version 193325 (0.0037) [2024-06-18 19:08:44,687][19107] Updated weights for policy 0, policy_version 193335 (0.0037) [2024-06-18 19:08:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 41987.4). Total num frames: 3167633408. Throughput: 0: 41778.6. Samples: 391653400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:45,501][18875] Avg episode reward: [(0, '0.389')] [2024-06-18 19:08:49,641][19107] Updated weights for policy 0, policy_version 193345 (0.0029) [2024-06-18 19:08:50,500][18875] Fps is (10 sec: 47513.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3167846400. Throughput: 0: 41938.6. Samples: 391920100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:50,501][18875] Avg episode reward: [(0, '0.329')] [2024-06-18 19:08:52,614][19107] Updated weights for policy 0, policy_version 193355 (0.0038) [2024-06-18 19:08:55,500][18875] Fps is (10 sec: 37683.3, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 3168010240. Throughput: 0: 41787.5. Samples: 392164980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 19:08:55,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 19:08:57,278][19107] Updated weights for policy 0, policy_version 193365 (0.0036) [2024-06-18 19:09:00,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 41876.9). Total num frames: 3168239616. Throughput: 0: 41928.8. Samples: 392285160. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:00,501][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 19:09:00,589][19107] Updated weights for policy 0, policy_version 193375 (0.0036) [2024-06-18 19:09:05,156][19107] Updated weights for policy 0, policy_version 193385 (0.0030) [2024-06-18 19:09:05,500][18875] Fps is (10 sec: 44237.8, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 3168452608. Throughput: 0: 41887.6. Samples: 392544580. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:05,500][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 19:09:08,474][19107] Updated weights for policy 0, policy_version 193395 (0.0044) [2024-06-18 19:09:10,500][18875] Fps is (10 sec: 39322.7, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3168632832. Throughput: 0: 41686.0. Samples: 392784360. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:10,500][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 19:09:12,870][19107] Updated weights for policy 0, policy_version 193405 (0.0039) [2024-06-18 19:09:14,268][19087] Signal inference workers to stop experience collection... (5700 times) [2024-06-18 19:09:14,268][19087] Signal inference workers to resume experience collection... (5700 times) [2024-06-18 19:09:14,305][19107] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-06-18 19:09:14,305][19107] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-06-18 19:09:15,504][18875] Fps is (10 sec: 40944.8, 60 sec: 42049.8, 300 sec: 41931.4). Total num frames: 3168862208. Throughput: 0: 41890.0. Samples: 392911560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:15,504][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 19:09:16,417][19107] Updated weights for policy 0, policy_version 193415 (0.0033) [2024-06-18 19:09:20,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3169058816. Throughput: 0: 41775.3. Samples: 393169080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:20,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 19:09:20,606][19107] Updated weights for policy 0, policy_version 193425 (0.0033) [2024-06-18 19:09:24,195][19107] Updated weights for policy 0, policy_version 193435 (0.0045) [2024-06-18 19:09:25,500][18875] Fps is (10 sec: 40975.0, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 3169271808. Throughput: 0: 41808.0. Samples: 393416700. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:25,501][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 19:09:28,409][19107] Updated weights for policy 0, policy_version 193445 (0.0038) [2024-06-18 19:09:30,500][18875] Fps is (10 sec: 45875.1, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3169517568. Throughput: 0: 42032.2. 
Samples: 393544840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:30,500][18875] Avg episode reward: [(0, '0.373')] [2024-06-18 19:09:31,931][19107] Updated weights for policy 0, policy_version 193455 (0.0037) [2024-06-18 19:09:35,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.3, 300 sec: 41820.9). Total num frames: 3169697792. Throughput: 0: 41748.1. Samples: 393798760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:35,500][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 19:09:35,801][19107] Updated weights for policy 0, policy_version 193465 (0.0033) [2024-06-18 19:09:40,165][19107] Updated weights for policy 0, policy_version 193475 (0.0035) [2024-06-18 19:09:40,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3169910784. Throughput: 0: 41863.6. Samples: 394048840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:40,501][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 19:09:43,373][19107] Updated weights for policy 0, policy_version 193485 (0.0044) [2024-06-18 19:09:45,500][18875] Fps is (10 sec: 45874.6, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3170156544. Throughput: 0: 41928.1. Samples: 394171920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:45,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 19:09:47,889][19107] Updated weights for policy 0, policy_version 193495 (0.0038) [2024-06-18 19:09:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3170336768. Throughput: 0: 41870.9. Samples: 394428780. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:50,501][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 19:09:51,215][19107] Updated weights for policy 0, policy_version 193505 (0.0030) [2024-06-18 19:09:55,500][18875] Fps is (10 sec: 37683.5, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 3170533376. Throughput: 0: 42105.7. 
Samples: 394679120. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:09:55,501][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 19:09:55,689][19107] Updated weights for policy 0, policy_version 193515 (0.0032) [2024-06-18 19:09:59,394][19107] Updated weights for policy 0, policy_version 193525 (0.0029) [2024-06-18 19:10:00,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 41820.8). Total num frames: 3170779136. Throughput: 0: 42064.6. Samples: 394804320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 26.0) [2024-06-18 19:10:00,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 19:10:03,332][19107] Updated weights for policy 0, policy_version 193535 (0.0034) [2024-06-18 19:10:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41821.4). Total num frames: 3170942976. Throughput: 0: 41992.4. Samples: 395058740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:05,500][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 19:10:07,036][19107] Updated weights for policy 0, policy_version 193545 (0.0043) [2024-06-18 19:10:10,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 3171188736. Throughput: 0: 41999.0. Samples: 395306660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:10,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 19:10:11,066][19107] Updated weights for policy 0, policy_version 193555 (0.0032) [2024-06-18 19:10:14,700][19107] Updated weights for policy 0, policy_version 193565 (0.0032) [2024-06-18 19:10:15,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41781.6, 300 sec: 41709.8). Total num frames: 3171368960. Throughput: 0: 42010.1. Samples: 395435300. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:15,501][18875] Avg episode reward: [(0, '0.430')] [2024-06-18 19:10:18,910][19107] Updated weights for policy 0, policy_version 193575 (0.0026) [2024-06-18 19:10:20,500][18875] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3171581952. Throughput: 0: 41738.7. Samples: 395677000. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:20,500][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 19:10:22,692][19107] Updated weights for policy 0, policy_version 193585 (0.0024) [2024-06-18 19:10:25,502][18875] Fps is (10 sec: 42593.2, 60 sec: 42051.3, 300 sec: 41876.2). Total num frames: 3171794944. Throughput: 0: 41828.6. Samples: 395931180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:25,502][18875] Avg episode reward: [(0, '0.222')] [2024-06-18 19:10:26,589][19107] Updated weights for policy 0, policy_version 193595 (0.0045) [2024-06-18 19:10:30,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3172007936. Throughput: 0: 41995.2. Samples: 396061700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:30,501][18875] Avg episode reward: [(0, '0.222')] [2024-06-18 19:10:30,578][19107] Updated weights for policy 0, policy_version 193605 (0.0043) [2024-06-18 19:10:34,931][19107] Updated weights for policy 0, policy_version 193615 (0.0040) [2024-06-18 19:10:35,500][18875] Fps is (10 sec: 39327.2, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3172188160. Throughput: 0: 41819.3. Samples: 396310640. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:35,500][18875] Avg episode reward: [(0, '0.400')] [2024-06-18 19:10:35,526][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000193616_3172204544.pth... 
[2024-06-18 19:10:35,576][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000193005_3162193920.pth [2024-06-18 19:10:38,170][19107] Updated weights for policy 0, policy_version 193625 (0.0038) [2024-06-18 19:10:38,680][19087] Signal inference workers to stop experience collection... (5750 times) [2024-06-18 19:10:38,732][19107] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-06-18 19:10:38,742][19087] Signal inference workers to resume experience collection... (5750 times) [2024-06-18 19:10:38,747][19107] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-06-18 19:10:40,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3172433920. Throughput: 0: 41704.9. Samples: 396555840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:40,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 19:10:43,099][19107] Updated weights for policy 0, policy_version 193635 (0.0036) [2024-06-18 19:10:45,500][18875] Fps is (10 sec: 45874.4, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 3172646912. Throughput: 0: 41980.4. Samples: 396693440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:45,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 19:10:45,802][19107] Updated weights for policy 0, policy_version 193645 (0.0027) [2024-06-18 19:10:50,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3172827136. Throughput: 0: 41877.2. Samples: 396943220. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:50,501][18875] Avg episode reward: [(0, '0.678')] [2024-06-18 19:10:50,729][19107] Updated weights for policy 0, policy_version 193655 (0.0034) [2024-06-18 19:10:53,582][19107] Updated weights for policy 0, policy_version 193665 (0.0036) [2024-06-18 19:10:55,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 41820.8). Total num frames: 3173056512. 
Throughput: 0: 41716.3. Samples: 397183900. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:10:55,508][18875] Avg episode reward: [(0, '0.715')] [2024-06-18 19:10:58,329][19107] Updated weights for policy 0, policy_version 193675 (0.0039) [2024-06-18 19:11:00,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3173269504. Throughput: 0: 41744.1. Samples: 397313780. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:11:00,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 19:11:01,678][19107] Updated weights for policy 0, policy_version 193685 (0.0036) [2024-06-18 19:11:05,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3173449728. Throughput: 0: 41950.1. Samples: 397564760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:11:05,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 19:11:05,956][19107] Updated weights for policy 0, policy_version 193695 (0.0041) [2024-06-18 19:11:09,570][19107] Updated weights for policy 0, policy_version 193705 (0.0044) [2024-06-18 19:11:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3173695488. Throughput: 0: 41748.4. Samples: 397809800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 19:11:10,501][18875] Avg episode reward: [(0, '0.392')] [2024-06-18 19:11:14,141][19107] Updated weights for policy 0, policy_version 193715 (0.0035) [2024-06-18 19:11:15,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3173908480. Throughput: 0: 41853.2. Samples: 397945100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:11:15,504][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 19:11:17,434][19107] Updated weights for policy 0, policy_version 193725 (0.0034) [2024-06-18 19:11:20,500][18875] Fps is (10 sec: 37683.0, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 3174072320. 
Throughput: 0: 41780.8. Samples: 398190780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:11:20,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 19:11:21,679][19107] Updated weights for policy 0, policy_version 193735 (0.0025) [2024-06-18 19:11:25,337][19107] Updated weights for policy 0, policy_version 193745 (0.0027) [2024-06-18 19:11:25,504][18875] Fps is (10 sec: 42583.5, 60 sec: 42323.7, 300 sec: 41931.4). Total num frames: 3174334464. Throughput: 0: 41948.2. Samples: 398443660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:11:25,504][18875] Avg episode reward: [(0, '0.452')] [2024-06-18 19:11:29,263][19107] Updated weights for policy 0, policy_version 193755 (0.0036) [2024-06-18 19:11:30,500][18875] Fps is (10 sec: 45875.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3174531072. Throughput: 0: 41750.8. Samples: 398572220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:11:30,500][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 19:11:33,065][19107] Updated weights for policy 0, policy_version 193765 (0.0031) [2024-06-18 19:11:35,500][18875] Fps is (10 sec: 37697.2, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3174711296. Throughput: 0: 41653.1. Samples: 398817600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:11:35,500][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 19:11:37,426][19107] Updated weights for policy 0, policy_version 193775 (0.0035) [2024-06-18 19:11:40,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3174940672. Throughput: 0: 41832.1. Samples: 399066340. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:11:40,501][18875] Avg episode reward: [(0, '0.693')] [2024-06-18 19:11:40,903][19107] Updated weights for policy 0, policy_version 193785 (0.0038) [2024-06-18 19:11:45,258][19107] Updated weights for policy 0, policy_version 193795 (0.0034) [2024-06-18 19:11:45,500][18875] Fps is (10 sec: 44235.9, 60 sec: 41779.2, 300 sec: 41932.4). Total num frames: 3175153664. Throughput: 0: 41811.9. Samples: 399195320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:11:45,501][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 19:11:49,212][19107] Updated weights for policy 0, policy_version 193805 (0.0043) [2024-06-18 19:11:50,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3175333888. Throughput: 0: 41662.7. Samples: 399439580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:11:50,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 19:11:53,110][19107] Updated weights for policy 0, policy_version 193815 (0.0022) [2024-06-18 19:11:55,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3175546880. Throughput: 0: 41697.3. Samples: 399686180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:11:55,501][18875] Avg episode reward: [(0, '0.315')] [2024-06-18 19:11:55,781][19087] Signal inference workers to stop experience collection... (5800 times) [2024-06-18 19:11:55,781][19087] Signal inference workers to resume experience collection... (5800 times) [2024-06-18 19:11:55,828][19107] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-06-18 19:11:55,828][19107] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-06-18 19:11:57,111][19107] Updated weights for policy 0, policy_version 193825 (0.0036) [2024-06-18 19:12:00,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3175759872. Throughput: 0: 41494.8. 
Samples: 399812360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:12:00,500][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 19:12:00,759][19107] Updated weights for policy 0, policy_version 193835 (0.0047) [2024-06-18 19:12:04,958][19107] Updated weights for policy 0, policy_version 193845 (0.0033) [2024-06-18 19:12:05,504][18875] Fps is (10 sec: 40945.4, 60 sec: 41776.7, 300 sec: 41820.3). Total num frames: 3175956480. Throughput: 0: 41540.2. Samples: 400060240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:12:05,505][18875] Avg episode reward: [(0, '0.396')] [2024-06-18 19:12:08,430][19107] Updated weights for policy 0, policy_version 193855 (0.0033) [2024-06-18 19:12:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 3176169472. Throughput: 0: 41408.3. Samples: 400306880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:12:10,501][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 19:12:12,775][19107] Updated weights for policy 0, policy_version 193865 (0.0042) [2024-06-18 19:12:15,500][18875] Fps is (10 sec: 42613.9, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3176382464. Throughput: 0: 41391.5. Samples: 400434840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 19:12:15,501][18875] Avg episode reward: [(0, '0.345')] [2024-06-18 19:12:16,768][19107] Updated weights for policy 0, policy_version 193875 (0.0030) [2024-06-18 19:12:20,500][18875] Fps is (10 sec: 40959.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3176579072. Throughput: 0: 41454.0. Samples: 400683040. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:12:20,501][18875] Avg episode reward: [(0, '0.358')] [2024-06-18 19:12:20,733][19107] Updated weights for policy 0, policy_version 193885 (0.0041) [2024-06-18 19:12:24,272][19107] Updated weights for policy 0, policy_version 193895 (0.0044) [2024-06-18 19:12:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 40962.4, 300 sec: 41820.9). Total num frames: 3176792064. Throughput: 0: 41521.8. Samples: 400934820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:12:25,501][18875] Avg episode reward: [(0, '0.380')] [2024-06-18 19:12:28,415][19107] Updated weights for policy 0, policy_version 193905 (0.0037) [2024-06-18 19:12:30,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 3177005056. Throughput: 0: 41492.6. Samples: 401062480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:12:30,501][18875] Avg episode reward: [(0, '0.351')] [2024-06-18 19:12:31,951][19107] Updated weights for policy 0, policy_version 193915 (0.0031) [2024-06-18 19:12:35,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.0, 300 sec: 41765.3). Total num frames: 3177218048. Throughput: 0: 41723.0. Samples: 401317120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:12:35,501][18875] Avg episode reward: [(0, '0.368')] [2024-06-18 19:12:35,653][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000193923_3177234432.pth... [2024-06-18 19:12:35,713][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000193311_3167207424.pth [2024-06-18 19:12:36,107][19107] Updated weights for policy 0, policy_version 193925 (0.0037) [2024-06-18 19:12:39,562][19107] Updated weights for policy 0, policy_version 193935 (0.0032) [2024-06-18 19:12:40,500][18875] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3177447424. Throughput: 0: 41802.2. Samples: 401567280. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:12:40,501][18875] Avg episode reward: [(0, '0.368')] [2024-06-18 19:12:43,855][19107] Updated weights for policy 0, policy_version 193945 (0.0028) [2024-06-18 19:12:45,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 3177627648. Throughput: 0: 41913.2. Samples: 401698460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:12:45,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 19:12:47,301][19107] Updated weights for policy 0, policy_version 193955 (0.0038) [2024-06-18 19:12:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3177857024. Throughput: 0: 42009.1. Samples: 401950500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:12:50,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 19:12:52,077][19107] Updated weights for policy 0, policy_version 193965 (0.0036) [2024-06-18 19:12:55,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3178070016. Throughput: 0: 41973.3. Samples: 402195680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:12:55,501][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 19:12:55,627][19107] Updated weights for policy 0, policy_version 193975 (0.0035) [2024-06-18 19:12:59,658][19087] Signal inference workers to stop experience collection... (5850 times) [2024-06-18 19:12:59,698][19107] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-06-18 19:12:59,723][19087] Signal inference workers to resume experience collection... (5850 times) [2024-06-18 19:12:59,723][19107] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-06-18 19:12:59,865][19107] Updated weights for policy 0, policy_version 193985 (0.0036) [2024-06-18 19:13:00,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 3178266624. Throughput: 0: 41987.5. 
Samples: 402324280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:13:00,501][18875] Avg episode reward: [(0, '0.314')] [2024-06-18 19:13:03,346][19107] Updated weights for policy 0, policy_version 193995 (0.0030) [2024-06-18 19:13:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42054.8, 300 sec: 41820.9). Total num frames: 3178479616. Throughput: 0: 41971.6. Samples: 402571760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:13:05,501][18875] Avg episode reward: [(0, '0.741')] [2024-06-18 19:13:07,517][19107] Updated weights for policy 0, policy_version 194005 (0.0036) [2024-06-18 19:13:10,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3178708992. Throughput: 0: 42105.8. Samples: 402829580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:13:10,501][18875] Avg episode reward: [(0, '0.741')] [2024-06-18 19:13:11,056][19107] Updated weights for policy 0, policy_version 194015 (0.0032) [2024-06-18 19:13:15,357][19107] Updated weights for policy 0, policy_version 194025 (0.0031) [2024-06-18 19:13:15,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3178905600. Throughput: 0: 42051.0. Samples: 402954780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:13:15,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 19:13:19,153][19107] Updated weights for policy 0, policy_version 194035 (0.0033) [2024-06-18 19:13:20,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 41932.0). Total num frames: 3179118592. Throughput: 0: 42070.9. Samples: 403210300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:13:20,500][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 19:13:23,199][19107] Updated weights for policy 0, policy_version 194045 (0.0024) [2024-06-18 19:13:25,504][18875] Fps is (10 sec: 42583.4, 60 sec: 42322.8, 300 sec: 41820.4). Total num frames: 3179331584. Throughput: 0: 42048.7. 
Samples: 403459620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:13:25,513][18875] Avg episode reward: [(0, '0.662')] [2024-06-18 19:13:26,771][19107] Updated weights for policy 0, policy_version 194055 (0.0040) [2024-06-18 19:13:30,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3179528192. Throughput: 0: 41955.1. Samples: 403586440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:13:30,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 19:13:31,071][19107] Updated weights for policy 0, policy_version 194065 (0.0037) [2024-06-18 19:13:34,551][19107] Updated weights for policy 0, policy_version 194075 (0.0035) [2024-06-18 19:13:35,500][18875] Fps is (10 sec: 42613.5, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3179757568. Throughput: 0: 42159.1. Samples: 403847660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:13:35,501][18875] Avg episode reward: [(0, '0.368')] [2024-06-18 19:13:39,024][19107] Updated weights for policy 0, policy_version 194085 (0.0038) [2024-06-18 19:13:40,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3179970560. Throughput: 0: 42093.3. Samples: 404089880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:13:40,501][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 19:13:42,229][19107] Updated weights for policy 0, policy_version 194095 (0.0045) [2024-06-18 19:13:45,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 3180167168. Throughput: 0: 42128.1. Samples: 404220040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:13:45,501][18875] Avg episode reward: [(0, '0.662')] [2024-06-18 19:13:46,603][19107] Updated weights for policy 0, policy_version 194105 (0.0025) [2024-06-18 19:13:49,992][19107] Updated weights for policy 0, policy_version 194115 (0.0028) [2024-06-18 19:13:50,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 3180380160. Throughput: 0: 42272.1. Samples: 404474000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:13:50,500][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 19:13:54,344][19107] Updated weights for policy 0, policy_version 194125 (0.0032) [2024-06-18 19:13:55,501][18875] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3180593152. Throughput: 0: 42111.4. Samples: 404724600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:13:55,501][18875] Avg episode reward: [(0, '0.732')] [2024-06-18 19:13:57,853][19107] Updated weights for policy 0, policy_version 194135 (0.0039) [2024-06-18 19:14:00,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3180806144. Throughput: 0: 41952.0. Samples: 404842620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:14:00,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 19:14:02,325][19107] Updated weights for policy 0, policy_version 194145 (0.0035) [2024-06-18 19:14:04,140][19087] Signal inference workers to stop experience collection... (5900 times) [2024-06-18 19:14:04,141][19087] Signal inference workers to resume experience collection... (5900 times) [2024-06-18 19:14:04,177][19107] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-06-18 19:14:04,177][19107] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-06-18 19:14:05,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 41987.4). Total num frames: 3181019136. Throughput: 0: 41886.9. 
Samples: 405095220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:14:05,501][18875] Avg episode reward: [(0, '0.479')] [2024-06-18 19:14:05,901][19107] Updated weights for policy 0, policy_version 194155 (0.0036) [2024-06-18 19:14:10,033][19107] Updated weights for policy 0, policy_version 194165 (0.0028) [2024-06-18 19:14:10,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 41821.4). Total num frames: 3181199360. Throughput: 0: 41968.8. Samples: 405348060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:14:10,500][18875] Avg episode reward: [(0, '0.479')] [2024-06-18 19:14:13,538][19107] Updated weights for policy 0, policy_version 194175 (0.0033) [2024-06-18 19:14:15,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 3181428736. Throughput: 0: 41911.2. Samples: 405472440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:14:15,500][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 19:14:18,139][19107] Updated weights for policy 0, policy_version 194185 (0.0035) [2024-06-18 19:14:20,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3181625344. Throughput: 0: 41637.9. Samples: 405721360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:14:20,501][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 19:14:21,378][19107] Updated weights for policy 0, policy_version 194195 (0.0027) [2024-06-18 19:14:25,501][18875] Fps is (10 sec: 40957.1, 60 sec: 41781.3, 300 sec: 41765.2). Total num frames: 3181838336. Throughput: 0: 41871.9. Samples: 405974140. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:14:25,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 19:14:25,770][19107] Updated weights for policy 0, policy_version 194205 (0.0039) [2024-06-18 19:14:29,415][19107] Updated weights for policy 0, policy_version 194215 (0.0034) [2024-06-18 19:14:30,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3182034944. Throughput: 0: 41803.5. Samples: 406101200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:14:30,501][18875] Avg episode reward: [(0, '0.687')] [2024-06-18 19:14:33,269][19107] Updated weights for policy 0, policy_version 194225 (0.0035) [2024-06-18 19:14:35,500][18875] Fps is (10 sec: 40961.8, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 3182247936. Throughput: 0: 41714.0. Samples: 406351140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:14:35,501][18875] Avg episode reward: [(0, '0.682')] [2024-06-18 19:14:35,532][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000194229_3182247936.pth... [2024-06-18 19:14:35,587][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000193616_3172204544.pth [2024-06-18 19:14:37,359][19107] Updated weights for policy 0, policy_version 194235 (0.0033) [2024-06-18 19:14:40,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3182477312. Throughput: 0: 41532.1. Samples: 406593540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:14:40,501][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 19:14:41,165][19107] Updated weights for policy 0, policy_version 194245 (0.0029) [2024-06-18 19:14:45,241][19107] Updated weights for policy 0, policy_version 194255 (0.0025) [2024-06-18 19:14:45,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3182673920. Throughput: 0: 41918.7. Samples: 406728960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:14:45,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 19:14:48,783][19107] Updated weights for policy 0, policy_version 194265 (0.0026) [2024-06-18 19:14:50,504][18875] Fps is (10 sec: 40945.4, 60 sec: 41776.6, 300 sec: 41875.9). Total num frames: 3182886912. Throughput: 0: 41731.0. Samples: 406973260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:14:50,505][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 19:14:53,149][19107] Updated weights for policy 0, policy_version 194275 (0.0042) [2024-06-18 19:14:55,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 3183116288. Throughput: 0: 41688.3. Samples: 407224040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:14:55,501][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 19:14:56,558][19107] Updated weights for policy 0, policy_version 194285 (0.0041) [2024-06-18 19:15:00,500][18875] Fps is (10 sec: 39336.1, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 3183280128. Throughput: 0: 41837.3. Samples: 407355120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:15:00,500][18875] Avg episode reward: [(0, '0.754')] [2024-06-18 19:15:01,003][19107] Updated weights for policy 0, policy_version 194295 (0.0039) [2024-06-18 19:15:04,316][19107] Updated weights for policy 0, policy_version 194305 (0.0029) [2024-06-18 19:15:05,501][18875] Fps is (10 sec: 39319.9, 60 sec: 41505.9, 300 sec: 41765.3). Total num frames: 3183509504. Throughput: 0: 41802.2. Samples: 407602480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:15:05,501][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 19:15:08,918][19107] Updated weights for policy 0, policy_version 194315 (0.0058) [2024-06-18 19:15:10,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 41932.0). Total num frames: 3183738880. Throughput: 0: 41839.3. Samples: 407856880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:15:10,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 19:15:11,892][19107] Updated weights for policy 0, policy_version 194325 (0.0026) [2024-06-18 19:15:13,973][19087] Signal inference workers to stop experience collection... (5950 times) [2024-06-18 19:15:13,975][19087] Signal inference workers to resume experience collection... (5950 times) [2024-06-18 19:15:13,990][19107] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-06-18 19:15:14,006][19107] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-06-18 19:15:15,500][18875] Fps is (10 sec: 39323.4, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 3183902720. Throughput: 0: 41817.3. Samples: 407982980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:15:15,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 19:15:16,809][19107] Updated weights for policy 0, policy_version 194335 (0.0040) [2024-06-18 19:15:19,523][19107] Updated weights for policy 0, policy_version 194345 (0.0040) [2024-06-18 19:15:20,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41876.6). Total num frames: 3184148480. Throughput: 0: 41745.0. Samples: 408229660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:15:20,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 19:15:24,716][19107] Updated weights for policy 0, policy_version 194355 (0.0035) [2024-06-18 19:15:25,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42052.7, 300 sec: 41876.4). Total num frames: 3184361472. Throughput: 0: 42062.2. Samples: 408486340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:15:25,506][18875] Avg episode reward: [(0, '0.736')] [2024-06-18 19:15:27,511][19107] Updated weights for policy 0, policy_version 194365 (0.0035) [2024-06-18 19:15:30,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3184541696. Throughput: 0: 41678.8. 
Samples: 408604500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:15:30,500][18875] Avg episode reward: [(0, '0.684')] [2024-06-18 19:15:32,515][19107] Updated weights for policy 0, policy_version 194375 (0.0031) [2024-06-18 19:15:35,404][19107] Updated weights for policy 0, policy_version 194385 (0.0039) [2024-06-18 19:15:35,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 41931.9). Total num frames: 3184803840. Throughput: 0: 41922.1. Samples: 408859600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-18 19:15:35,500][18875] Avg episode reward: [(0, '0.401')] [2024-06-18 19:15:40,244][19107] Updated weights for policy 0, policy_version 194395 (0.0036) [2024-06-18 19:15:40,500][18875] Fps is (10 sec: 42597.5, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3184967680. Throughput: 0: 42079.5. Samples: 409117620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:15:40,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 19:15:43,555][19107] Updated weights for policy 0, policy_version 194405 (0.0039) [2024-06-18 19:15:45,500][18875] Fps is (10 sec: 37682.4, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3185180672. Throughput: 0: 41796.3. Samples: 409235960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:15:45,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 19:15:47,977][19107] Updated weights for policy 0, policy_version 194415 (0.0027) [2024-06-18 19:15:50,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41781.7, 300 sec: 41820.9). Total num frames: 3185393664. Throughput: 0: 41911.5. Samples: 409488480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:15:50,501][18875] Avg episode reward: [(0, '0.213')] [2024-06-18 19:15:51,306][19107] Updated weights for policy 0, policy_version 194425 (0.0041) [2024-06-18 19:15:55,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 3185590272. Throughput: 0: 42128.1. 
Samples: 409752640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:15:55,500][18875] Avg episode reward: [(0, '0.258')] [2024-06-18 19:15:55,730][19107] Updated weights for policy 0, policy_version 194435 (0.0036) [2024-06-18 19:15:58,842][19107] Updated weights for policy 0, policy_version 194445 (0.0022) [2024-06-18 19:16:00,500][18875] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3185819648. Throughput: 0: 41966.5. Samples: 409871480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:16:00,501][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 19:16:03,401][19107] Updated weights for policy 0, policy_version 194455 (0.0032) [2024-06-18 19:16:05,500][18875] Fps is (10 sec: 45874.5, 60 sec: 42325.6, 300 sec: 41876.4). Total num frames: 3186049024. Throughput: 0: 42169.7. Samples: 410127300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:16:05,501][18875] Avg episode reward: [(0, '0.334')] [2024-06-18 19:16:06,776][19107] Updated weights for policy 0, policy_version 194465 (0.0032) [2024-06-18 19:16:10,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3186229248. Throughput: 0: 42048.0. Samples: 410378500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:16:10,501][18875] Avg episode reward: [(0, '0.636')] [2024-06-18 19:16:11,069][19107] Updated weights for policy 0, policy_version 194475 (0.0036) [2024-06-18 19:16:14,804][19107] Updated weights for policy 0, policy_version 194485 (0.0042) [2024-06-18 19:16:15,501][18875] Fps is (10 sec: 40955.9, 60 sec: 42597.7, 300 sec: 41987.3). Total num frames: 3186458624. Throughput: 0: 42094.5. Samples: 410498800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:16:15,502][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 19:16:18,713][19107] Updated weights for policy 0, policy_version 194495 (0.0027) [2024-06-18 19:16:20,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41821.3). Total num frames: 3186671616. Throughput: 0: 42065.1. Samples: 410752540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:16:20,501][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 19:16:22,750][19107] Updated weights for policy 0, policy_version 194505 (0.0030) [2024-06-18 19:16:25,500][18875] Fps is (10 sec: 39325.5, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3186851840. Throughput: 0: 42190.7. Samples: 411016200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:16:25,501][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 19:16:26,379][19107] Updated weights for policy 0, policy_version 194515 (0.0028) [2024-06-18 19:16:30,478][19107] Updated weights for policy 0, policy_version 194525 (0.0049) [2024-06-18 19:16:30,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 3187097600. Throughput: 0: 42160.1. Samples: 411133160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:16:30,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 19:16:33,661][19087] Signal inference workers to stop experience collection... (6000 times) [2024-06-18 19:16:33,706][19107] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-06-18 19:16:33,715][19087] Signal inference workers to resume experience collection... (6000 times) [2024-06-18 19:16:33,722][19107] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-06-18 19:16:34,171][19107] Updated weights for policy 0, policy_version 194535 (0.0029) [2024-06-18 19:16:35,500][18875] Fps is (10 sec: 45875.0, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3187310592. Throughput: 0: 42155.9. 
Samples: 411385500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:16:35,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 19:16:35,518][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000194538_3187310592.pth... [2024-06-18 19:16:35,585][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000193923_3177234432.pth [2024-06-18 19:16:38,367][19107] Updated weights for policy 0, policy_version 194545 (0.0040) [2024-06-18 19:16:40,500][18875] Fps is (10 sec: 37683.2, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3187474432. Throughput: 0: 42015.9. Samples: 411643360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:16:40,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 19:16:42,155][19107] Updated weights for policy 0, policy_version 194555 (0.0031) [2024-06-18 19:16:45,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3187720192. Throughput: 0: 42055.2. Samples: 411763960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:16:45,501][18875] Avg episode reward: [(0, '0.387')] [2024-06-18 19:16:46,111][19107] Updated weights for policy 0, policy_version 194565 (0.0035) [2024-06-18 19:16:50,223][19107] Updated weights for policy 0, policy_version 194575 (0.0035) [2024-06-18 19:16:50,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3187916800. Throughput: 0: 41972.9. Samples: 412016080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:16:50,501][18875] Avg episode reward: [(0, '0.621')] [2024-06-18 19:16:53,734][19107] Updated weights for policy 0, policy_version 194585 (0.0038) [2024-06-18 19:16:55,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3188113408. Throughput: 0: 42093.4. Samples: 412272700. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:16:55,500][18875] Avg episode reward: [(0, '0.719')] [2024-06-18 19:16:57,707][19107] Updated weights for policy 0, policy_version 194595 (0.0031) [2024-06-18 19:17:00,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42099.1). Total num frames: 3188375552. Throughput: 0: 42123.1. Samples: 412394300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:00,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 19:17:01,714][19107] Updated weights for policy 0, policy_version 194605 (0.0029) [2024-06-18 19:17:05,501][18875] Fps is (10 sec: 44235.4, 60 sec: 41779.1, 300 sec: 41987.4). Total num frames: 3188555776. Throughput: 0: 42116.8. Samples: 412647800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:05,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 19:17:05,663][19107] Updated weights for policy 0, policy_version 194615 (0.0043) [2024-06-18 19:17:09,656][19107] Updated weights for policy 0, policy_version 194625 (0.0030) [2024-06-18 19:17:10,500][18875] Fps is (10 sec: 36045.4, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3188736000. Throughput: 0: 41810.3. Samples: 412897660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:10,509][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 19:17:13,448][19107] Updated weights for policy 0, policy_version 194635 (0.0032) [2024-06-18 19:17:15,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42326.0, 300 sec: 42098.5). Total num frames: 3188998144. Throughput: 0: 41981.3. Samples: 413022320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:15,510][18875] Avg episode reward: [(0, '0.397')] [2024-06-18 19:17:17,611][19107] Updated weights for policy 0, policy_version 194645 (0.0032) [2024-06-18 19:17:20,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3189178368. Throughput: 0: 42002.8. Samples: 413275620. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:20,501][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 19:17:21,519][19107] Updated weights for policy 0, policy_version 194655 (0.0025) [2024-06-18 19:17:25,504][18875] Fps is (10 sec: 37669.9, 60 sec: 42049.7, 300 sec: 41931.4). Total num frames: 3189374976. Throughput: 0: 41718.0. Samples: 413520820. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:25,505][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 19:17:25,745][19107] Updated weights for policy 0, policy_version 194665 (0.0041) [2024-06-18 19:17:29,216][19107] Updated weights for policy 0, policy_version 194675 (0.0049) [2024-06-18 19:17:30,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3189604352. Throughput: 0: 41935.2. Samples: 413651040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:30,500][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 19:17:33,572][19107] Updated weights for policy 0, policy_version 194685 (0.0020) [2024-06-18 19:17:35,500][18875] Fps is (10 sec: 42613.4, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3189800960. Throughput: 0: 41975.5. Samples: 413904980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:35,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 19:17:36,878][19107] Updated weights for policy 0, policy_version 194695 (0.0027) [2024-06-18 19:17:40,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3190013952. Throughput: 0: 41731.9. Samples: 414150640. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:40,501][18875] Avg episode reward: [(0, '0.403')] [2024-06-18 19:17:41,345][19107] Updated weights for policy 0, policy_version 194705 (0.0026) [2024-06-18 19:17:44,515][19107] Updated weights for policy 0, policy_version 194715 (0.0035) [2024-06-18 19:17:45,501][18875] Fps is (10 sec: 42597.4, 60 sec: 41779.0, 300 sec: 41931.9). Total num frames: 3190226944. Throughput: 0: 41929.5. Samples: 414281140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:45,501][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 19:17:49,092][19107] Updated weights for policy 0, policy_version 194725 (0.0029) [2024-06-18 19:17:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3190423552. Throughput: 0: 41833.1. Samples: 414530280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-18 19:17:50,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 19:17:51,505][19087] Signal inference workers to stop experience collection... (6050 times) [2024-06-18 19:17:51,555][19107] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-06-18 19:17:51,567][19087] Signal inference workers to resume experience collection... (6050 times) [2024-06-18 19:17:51,572][19107] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-06-18 19:17:52,758][19107] Updated weights for policy 0, policy_version 194735 (0.0030) [2024-06-18 19:17:55,500][18875] Fps is (10 sec: 40961.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3190636544. Throughput: 0: 41827.1. Samples: 414779880. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:17:55,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 19:17:56,652][19107] Updated weights for policy 0, policy_version 194745 (0.0036) [2024-06-18 19:18:00,415][19107] Updated weights for policy 0, policy_version 194755 (0.0029) [2024-06-18 19:18:00,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3190865920. Throughput: 0: 41806.3. Samples: 414903600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:00,501][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 19:18:04,168][19107] Updated weights for policy 0, policy_version 194765 (0.0041) [2024-06-18 19:18:05,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.4, 300 sec: 41876.4). Total num frames: 3191062528. Throughput: 0: 41781.8. Samples: 415155800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:05,500][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 19:18:08,206][19107] Updated weights for policy 0, policy_version 194775 (0.0031) [2024-06-18 19:18:10,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3191275520. Throughput: 0: 42013.6. Samples: 415411280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:10,501][18875] Avg episode reward: [(0, '0.768')] [2024-06-18 19:18:11,819][19107] Updated weights for policy 0, policy_version 194785 (0.0027) [2024-06-18 19:18:15,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41506.3, 300 sec: 41931.9). Total num frames: 3191488512. Throughput: 0: 41883.5. Samples: 415535800. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:15,500][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 19:18:16,235][19107] Updated weights for policy 0, policy_version 194795 (0.0050) [2024-06-18 19:18:19,715][19107] Updated weights for policy 0, policy_version 194805 (0.0040) [2024-06-18 19:18:20,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41932.4). Total num frames: 3191701504. Throughput: 0: 41720.1. Samples: 415782380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:20,501][18875] Avg episode reward: [(0, '0.676')] [2024-06-18 19:18:24,271][19107] Updated weights for policy 0, policy_version 194815 (0.0043) [2024-06-18 19:18:25,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42054.8, 300 sec: 41931.9). Total num frames: 3191898112. Throughput: 0: 42021.8. Samples: 416041620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:25,501][18875] Avg episode reward: [(0, '0.713')] [2024-06-18 19:18:27,549][19107] Updated weights for policy 0, policy_version 194825 (0.0041) [2024-06-18 19:18:30,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3192111104. Throughput: 0: 41792.0. Samples: 416161760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:30,500][18875] Avg episode reward: [(0, '0.670')] [2024-06-18 19:18:31,828][19107] Updated weights for policy 0, policy_version 194835 (0.0028) [2024-06-18 19:18:35,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3192324096. Throughput: 0: 41893.3. Samples: 416415480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:35,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 19:18:35,519][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000194844_3192324096.pth... 
[2024-06-18 19:18:35,593][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000194229_3182247936.pth [2024-06-18 19:18:35,734][19107] Updated weights for policy 0, policy_version 194845 (0.0032) [2024-06-18 19:18:39,437][19107] Updated weights for policy 0, policy_version 194855 (0.0029) [2024-06-18 19:18:40,500][18875] Fps is (10 sec: 40959.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3192520704. Throughput: 0: 41991.4. Samples: 416669500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:40,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 19:18:43,247][19107] Updated weights for policy 0, policy_version 194865 (0.0034) [2024-06-18 19:18:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.5, 300 sec: 41931.9). Total num frames: 3192750080. Throughput: 0: 41951.1. Samples: 416791400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:45,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 19:18:47,094][19107] Updated weights for policy 0, policy_version 194875 (0.0039) [2024-06-18 19:18:50,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 3192963072. Throughput: 0: 42127.5. Samples: 417051540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:50,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 19:18:51,315][19107] Updated weights for policy 0, policy_version 194885 (0.0038) [2024-06-18 19:18:54,765][19107] Updated weights for policy 0, policy_version 194895 (0.0036) [2024-06-18 19:18:55,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3193159680. Throughput: 0: 41891.2. Samples: 417296380. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:18:55,501][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 19:18:59,128][19107] Updated weights for policy 0, policy_version 194905 (0.0029) [2024-06-18 19:19:00,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3193405440. Throughput: 0: 41987.4. Samples: 417425240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:00,501][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 19:19:02,714][19107] Updated weights for policy 0, policy_version 194915 (0.0037) [2024-06-18 19:19:05,500][18875] Fps is (10 sec: 40959.2, 60 sec: 41779.0, 300 sec: 41931.9). Total num frames: 3193569280. Throughput: 0: 42143.0. Samples: 417678820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:05,501][18875] Avg episode reward: [(0, '0.409')] [2024-06-18 19:19:06,891][19107] Updated weights for policy 0, policy_version 194925 (0.0025) [2024-06-18 19:19:10,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3193782272. Throughput: 0: 41945.3. Samples: 417929160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:10,501][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 19:19:11,062][19107] Updated weights for policy 0, policy_version 194935 (0.0027) [2024-06-18 19:19:14,714][19107] Updated weights for policy 0, policy_version 194945 (0.0036) [2024-06-18 19:19:15,500][18875] Fps is (10 sec: 45876.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3194028032. Throughput: 0: 42086.2. Samples: 418055640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:15,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 19:19:18,911][19107] Updated weights for policy 0, policy_version 194955 (0.0034) [2024-06-18 19:19:20,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 3194208256. Throughput: 0: 42036.1. Samples: 418307100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:20,501][18875] Avg episode reward: [(0, '0.702')] [2024-06-18 19:19:22,421][19107] Updated weights for policy 0, policy_version 194965 (0.0025) [2024-06-18 19:19:25,500][18875] Fps is (10 sec: 39320.7, 60 sec: 42052.1, 300 sec: 41987.4). Total num frames: 3194421248. Throughput: 0: 42091.5. Samples: 418563620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:25,501][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 19:19:26,618][19107] Updated weights for policy 0, policy_version 194975 (0.0052) [2024-06-18 19:19:27,865][19087] Signal inference workers to stop experience collection... (6100 times) [2024-06-18 19:19:27,911][19107] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-06-18 19:19:27,921][19087] Signal inference workers to resume experience collection... (6100 times) [2024-06-18 19:19:27,926][19107] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-06-18 19:19:30,086][19107] Updated weights for policy 0, policy_version 194985 (0.0032) [2024-06-18 19:19:30,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3194650624. Throughput: 0: 42119.2. Samples: 418686760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:30,501][18875] Avg episode reward: [(0, '0.448')] [2024-06-18 19:19:34,356][19107] Updated weights for policy 0, policy_version 194995 (0.0030) [2024-06-18 19:19:35,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3194830848. Throughput: 0: 42049.2. Samples: 418943760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:35,501][18875] Avg episode reward: [(0, '0.313')] [2024-06-18 19:19:37,783][19107] Updated weights for policy 0, policy_version 195005 (0.0024) [2024-06-18 19:19:40,504][18875] Fps is (10 sec: 40945.0, 60 sec: 42322.9, 300 sec: 41987.0). Total num frames: 3195060224. Throughput: 0: 42013.0. 
Samples: 419187120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:40,504][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 19:19:42,128][19107] Updated weights for policy 0, policy_version 195015 (0.0046) [2024-06-18 19:19:45,453][19107] Updated weights for policy 0, policy_version 195025 (0.0028) [2024-06-18 19:19:45,503][18875] Fps is (10 sec: 45861.5, 60 sec: 42323.2, 300 sec: 42043.1). Total num frames: 3195289600. Throughput: 0: 42189.3. Samples: 419323880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:45,504][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 19:19:49,898][19107] Updated weights for policy 0, policy_version 195035 (0.0033) [2024-06-18 19:19:50,500][18875] Fps is (10 sec: 40974.6, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3195469824. Throughput: 0: 42185.0. Samples: 419577140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:50,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 19:19:53,431][19107] Updated weights for policy 0, policy_version 195045 (0.0037) [2024-06-18 19:19:55,500][18875] Fps is (10 sec: 42611.0, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3195715584. Throughput: 0: 42035.1. Samples: 419820740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:19:55,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 19:19:57,560][19107] Updated weights for policy 0, policy_version 195055 (0.0030) [2024-06-18 19:20:00,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 42043.1). Total num frames: 3195912192. Throughput: 0: 42190.1. Samples: 419954200. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 19:20:00,501][18875] Avg episode reward: [(0, '0.704')] [2024-06-18 19:20:01,091][19107] Updated weights for policy 0, policy_version 195065 (0.0030) [2024-06-18 19:20:05,276][19107] Updated weights for policy 0, policy_version 195075 (0.0033) [2024-06-18 19:20:05,500][18875] Fps is (10 sec: 39322.1, 60 sec: 42325.5, 300 sec: 41931.9). Total num frames: 3196108800. Throughput: 0: 42171.1. Samples: 420204800. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:05,500][18875] Avg episode reward: [(0, '0.820')] [2024-06-18 19:20:09,053][19107] Updated weights for policy 0, policy_version 195085 (0.0039) [2024-06-18 19:20:10,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3196338176. Throughput: 0: 41977.1. Samples: 420452580. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:10,501][18875] Avg episode reward: [(0, '0.809')] [2024-06-18 19:20:12,914][19107] Updated weights for policy 0, policy_version 195095 (0.0037) [2024-06-18 19:20:15,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3196534784. Throughput: 0: 42078.2. Samples: 420580280. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:15,500][18875] Avg episode reward: [(0, '0.734')] [2024-06-18 19:20:17,007][19107] Updated weights for policy 0, policy_version 195105 (0.0047) [2024-06-18 19:20:20,504][18875] Fps is (10 sec: 39307.3, 60 sec: 42049.7, 300 sec: 41931.4). Total num frames: 3196731392. Throughput: 0: 41936.7. Samples: 420831060. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:20,504][18875] Avg episode reward: [(0, '0.795')] [2024-06-18 19:20:20,799][19107] Updated weights for policy 0, policy_version 195115 (0.0029) [2024-06-18 19:20:24,671][19107] Updated weights for policy 0, policy_version 195125 (0.0033) [2024-06-18 19:20:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42098.5). 
Total num frames: 3196960768. Throughput: 0: 42159.4. Samples: 421084140. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:25,500][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 19:20:28,477][19107] Updated weights for policy 0, policy_version 195135 (0.0044) [2024-06-18 19:20:30,500][18875] Fps is (10 sec: 42613.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3197157376. Throughput: 0: 41977.1. Samples: 421212720. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:30,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 19:20:32,527][19107] Updated weights for policy 0, policy_version 195145 (0.0034) [2024-06-18 19:20:35,500][18875] Fps is (10 sec: 39320.5, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3197353984. Throughput: 0: 41991.4. Samples: 421466760. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:35,501][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 19:20:35,516][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000195152_3197370368.pth... [2024-06-18 19:20:35,597][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000194538_3187310592.pth [2024-06-18 19:20:36,027][19107] Updated weights for policy 0, policy_version 195155 (0.0029) [2024-06-18 19:20:40,445][19107] Updated weights for policy 0, policy_version 195165 (0.0037) [2024-06-18 19:20:40,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42054.9, 300 sec: 42043.0). Total num frames: 3197583360. Throughput: 0: 42138.8. Samples: 421716980. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:40,500][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 19:20:43,738][19107] Updated weights for policy 0, policy_version 195175 (0.0039) [2024-06-18 19:20:45,500][18875] Fps is (10 sec: 44237.5, 60 sec: 41781.3, 300 sec: 42043.0). Total num frames: 3197796352. Throughput: 0: 42013.3. Samples: 421844800. 
Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:45,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 19:20:48,273][19107] Updated weights for policy 0, policy_version 195185 (0.0029) [2024-06-18 19:20:50,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3198009344. Throughput: 0: 42108.0. Samples: 422099660. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:50,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 19:20:51,531][19107] Updated weights for policy 0, policy_version 195195 (0.0025) [2024-06-18 19:20:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3198205952. Throughput: 0: 42164.7. Samples: 422350000. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:20:55,501][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 19:20:56,034][19107] Updated weights for policy 0, policy_version 195205 (0.0029) [2024-06-18 19:20:58,262][19087] Signal inference workers to stop experience collection... (6150 times) [2024-06-18 19:20:58,285][19107] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-06-18 19:20:58,322][19087] Signal inference workers to resume experience collection... (6150 times) [2024-06-18 19:20:58,322][19107] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-06-18 19:20:59,426][19107] Updated weights for policy 0, policy_version 195215 (0.0029) [2024-06-18 19:21:00,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3198402560. Throughput: 0: 42050.6. Samples: 422472560. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:21:00,501][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 19:21:03,861][19107] Updated weights for policy 0, policy_version 195225 (0.0034) [2024-06-18 19:21:05,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3198615552. Throughput: 0: 42058.5. 
Samples: 422723540. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:21:05,501][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 19:21:07,039][19107] Updated weights for policy 0, policy_version 195235 (0.0040) [2024-06-18 19:21:10,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41987.6). Total num frames: 3198844928. Throughput: 0: 42006.5. Samples: 422974440. Policy #0 lag: (min: 1.0, avg: 9.2, max: 21.0) [2024-06-18 19:21:10,501][18875] Avg episode reward: [(0, '0.592')] [2024-06-18 19:21:11,628][19107] Updated weights for policy 0, policy_version 195245 (0.0028) [2024-06-18 19:21:14,725][19107] Updated weights for policy 0, policy_version 195255 (0.0036) [2024-06-18 19:21:15,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3199057920. Throughput: 0: 42048.3. Samples: 423104900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:21:15,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 19:21:19,414][19107] Updated weights for policy 0, policy_version 195265 (0.0036) [2024-06-18 19:21:20,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 3199254528. Throughput: 0: 41979.4. Samples: 423355820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:21:20,500][18875] Avg episode reward: [(0, '0.712')] [2024-06-18 19:21:22,756][19107] Updated weights for policy 0, policy_version 195275 (0.0024) [2024-06-18 19:21:25,504][18875] Fps is (10 sec: 40945.3, 60 sec: 41776.6, 300 sec: 41931.4). Total num frames: 3199467520. Throughput: 0: 42005.4. Samples: 423607380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:21:25,505][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 19:21:27,340][19107] Updated weights for policy 0, policy_version 195285 (0.0041) [2024-06-18 19:21:30,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3199696896. Throughput: 0: 42041.4. Samples: 423736660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:21:30,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 19:21:30,908][19107] Updated weights for policy 0, policy_version 195295 (0.0046) [2024-06-18 19:21:35,392][19107] Updated weights for policy 0, policy_version 195305 (0.0028) [2024-06-18 19:21:35,500][18875] Fps is (10 sec: 40975.1, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3199877120. Throughput: 0: 42040.0. Samples: 423991460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:21:35,501][18875] Avg episode reward: [(0, '0.793')] [2024-06-18 19:21:38,748][19107] Updated weights for policy 0, policy_version 195315 (0.0045) [2024-06-18 19:21:40,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 3200106496. Throughput: 0: 41895.6. Samples: 424235300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:21:40,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 19:21:43,139][19107] Updated weights for policy 0, policy_version 195325 (0.0030) [2024-06-18 19:21:45,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3200335872. Throughput: 0: 42206.6. Samples: 424371860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:21:45,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 19:21:46,323][19107] Updated weights for policy 0, policy_version 195335 (0.0034) [2024-06-18 19:21:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3200516096. Throughput: 0: 42224.4. Samples: 424623640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:21:50,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 19:21:50,798][19107] Updated weights for policy 0, policy_version 195345 (0.0031) [2024-06-18 19:21:53,892][19107] Updated weights for policy 0, policy_version 195355 (0.0027) [2024-06-18 19:21:55,500][18875] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 41876.4). 
Total num frames: 3200729088. Throughput: 0: 42201.0. Samples: 424873480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:21:55,500][18875] Avg episode reward: [(0, '0.759')] [2024-06-18 19:21:58,406][19107] Updated weights for policy 0, policy_version 195365 (0.0042) [2024-06-18 19:22:00,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3200958464. Throughput: 0: 42217.4. Samples: 425004680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:22:00,501][18875] Avg episode reward: [(0, '0.743')] [2024-06-18 19:22:01,641][19107] Updated weights for policy 0, policy_version 195375 (0.0039) [2024-06-18 19:22:05,504][18875] Fps is (10 sec: 42582.7, 60 sec: 42322.8, 300 sec: 42098.0). Total num frames: 3201155072. Throughput: 0: 42108.5. Samples: 425250860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:22:05,504][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 19:22:06,703][19107] Updated weights for policy 0, policy_version 195385 (0.0025) [2024-06-18 19:22:07,050][19087] Signal inference workers to stop experience collection... (6200 times) [2024-06-18 19:22:07,051][19087] Signal inference workers to resume experience collection... (6200 times) [2024-06-18 19:22:07,064][19107] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-06-18 19:22:07,064][19107] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-06-18 19:22:09,419][19107] Updated weights for policy 0, policy_version 195395 (0.0027) [2024-06-18 19:22:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3201368064. Throughput: 0: 41923.4. Samples: 425493780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:22:10,501][18875] Avg episode reward: [(0, '0.682')] [2024-06-18 19:22:14,226][19107] Updated weights for policy 0, policy_version 195405 (0.0040) [2024-06-18 19:22:15,502][18875] Fps is (10 sec: 42605.8, 60 sec: 42051.0, 300 sec: 42042.7). Total num frames: 3201581056. Throughput: 0: 42002.3. Samples: 425626840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:22:15,503][18875] Avg episode reward: [(0, '0.689')] [2024-06-18 19:22:17,523][19107] Updated weights for policy 0, policy_version 195415 (0.0034) [2024-06-18 19:22:20,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 3201777664. Throughput: 0: 41806.6. Samples: 425872760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:22:20,501][18875] Avg episode reward: [(0, '0.862')] [2024-06-18 19:22:21,957][19107] Updated weights for policy 0, policy_version 195425 (0.0042) [2024-06-18 19:22:25,337][19107] Updated weights for policy 0, policy_version 195435 (0.0024) [2024-06-18 19:22:25,500][18875] Fps is (10 sec: 42606.1, 60 sec: 42327.9, 300 sec: 42043.0). Total num frames: 3202007040. Throughput: 0: 42071.6. Samples: 426128520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:22:25,501][18875] Avg episode reward: [(0, '0.633')] [2024-06-18 19:22:29,554][19107] Updated weights for policy 0, policy_version 195445 (0.0034) [2024-06-18 19:22:30,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3202203648. Throughput: 0: 41855.5. Samples: 426255360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:22:30,501][18875] Avg episode reward: [(0, '0.728')] [2024-06-18 19:22:33,108][19107] Updated weights for policy 0, policy_version 195455 (0.0043) [2024-06-18 19:22:35,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3202416640. Throughput: 0: 41848.0. Samples: 426506800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:22:35,501][18875] Avg episode reward: [(0, '0.607')] [2024-06-18 19:22:35,516][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000195460_3202416640.pth... [2024-06-18 19:22:35,578][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000194844_3192324096.pth [2024-06-18 19:22:37,385][19107] Updated weights for policy 0, policy_version 195465 (0.0036) [2024-06-18 19:22:40,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3202629632. Throughput: 0: 41952.2. Samples: 426761340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:22:40,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 19:22:40,954][19107] Updated weights for policy 0, policy_version 195475 (0.0022) [2024-06-18 19:22:45,261][19107] Updated weights for policy 0, policy_version 195485 (0.0049) [2024-06-18 19:22:45,504][18875] Fps is (10 sec: 40945.6, 60 sec: 41503.7, 300 sec: 42042.5). Total num frames: 3202826240. Throughput: 0: 41892.2. Samples: 426889980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:22:45,505][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 19:22:48,738][19107] Updated weights for policy 0, policy_version 195495 (0.0037) [2024-06-18 19:22:50,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3203055616. Throughput: 0: 42090.9. Samples: 427144800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:22:50,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 19:22:52,797][19107] Updated weights for policy 0, policy_version 195505 (0.0028) [2024-06-18 19:22:55,500][18875] Fps is (10 sec: 44253.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3203268608. Throughput: 0: 42409.0. Samples: 427402180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:22:55,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 19:22:56,515][19107] Updated weights for policy 0, policy_version 195515 (0.0029) [2024-06-18 19:23:00,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3203465216. Throughput: 0: 42132.3. Samples: 427522720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:23:00,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 19:23:00,516][19107] Updated weights for policy 0, policy_version 195525 (0.0031) [2024-06-18 19:23:04,446][19107] Updated weights for policy 0, policy_version 195535 (0.0035) [2024-06-18 19:23:05,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42327.9, 300 sec: 42098.6). Total num frames: 3203694592. Throughput: 0: 42360.0. Samples: 427778960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:23:05,501][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 19:23:08,391][19107] Updated weights for policy 0, policy_version 195545 (0.0035) [2024-06-18 19:23:10,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3203891200. Throughput: 0: 42218.8. Samples: 428028360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:23:10,500][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 19:23:12,192][19107] Updated weights for policy 0, policy_version 195555 (0.0043) [2024-06-18 19:23:15,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42053.5, 300 sec: 42043.0). Total num frames: 3204104192. Throughput: 0: 42171.6. Samples: 428153080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:23:15,501][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 19:23:15,995][19107] Updated weights for policy 0, policy_version 195565 (0.0028) [2024-06-18 19:23:20,090][19107] Updated weights for policy 0, policy_version 195575 (0.0039) [2024-06-18 19:23:20,502][18875] Fps is (10 sec: 42590.2, 60 sec: 42324.0, 300 sec: 42098.3). Total num frames: 3204317184. Throughput: 0: 42353.5. Samples: 428412780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:23:20,503][18875] Avg episode reward: [(0, '0.186')] [2024-06-18 19:23:23,542][19107] Updated weights for policy 0, policy_version 195585 (0.0025) [2024-06-18 19:23:24,603][19087] Signal inference workers to stop experience collection... (6250 times) [2024-06-18 19:23:24,635][19107] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-06-18 19:23:24,653][19087] Signal inference workers to resume experience collection... (6250 times) [2024-06-18 19:23:24,656][19107] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-06-18 19:23:25,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3204530176. Throughput: 0: 42286.3. Samples: 428664220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:23:25,501][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 19:23:27,631][19107] Updated weights for policy 0, policy_version 195595 (0.0028) [2024-06-18 19:23:30,500][18875] Fps is (10 sec: 42605.9, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3204743168. Throughput: 0: 42334.0. Samples: 428794860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:23:30,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 19:23:31,281][19107] Updated weights for policy 0, policy_version 195605 (0.0032) [2024-06-18 19:23:35,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3204939776. Throughput: 0: 42368.1. 
Samples: 429051360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:23:35,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 19:23:35,585][19107] Updated weights for policy 0, policy_version 195615 (0.0028) [2024-06-18 19:23:38,871][19107] Updated weights for policy 0, policy_version 195625 (0.0029) [2024-06-18 19:23:40,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3205152768. Throughput: 0: 42197.2. Samples: 429301060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:23:40,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 19:23:43,123][19107] Updated weights for policy 0, policy_version 195635 (0.0035) [2024-06-18 19:23:45,500][18875] Fps is (10 sec: 42597.4, 60 sec: 42327.8, 300 sec: 42043.0). Total num frames: 3205365760. Throughput: 0: 42463.1. Samples: 429433560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:23:45,501][18875] Avg episode reward: [(0, '0.498')] [2024-06-18 19:23:46,552][19107] Updated weights for policy 0, policy_version 195645 (0.0053) [2024-06-18 19:23:50,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3205595136. Throughput: 0: 42334.1. Samples: 429684000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:23:50,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 19:23:50,667][19107] Updated weights for policy 0, policy_version 195655 (0.0029) [2024-06-18 19:23:54,306][19107] Updated weights for policy 0, policy_version 195665 (0.0032) [2024-06-18 19:23:55,502][18875] Fps is (10 sec: 42592.0, 60 sec: 42051.1, 300 sec: 41987.3). Total num frames: 3205791744. Throughput: 0: 42439.2. Samples: 429938200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:23:55,503][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 19:23:58,322][19107] Updated weights for policy 0, policy_version 195675 (0.0030) [2024-06-18 19:24:00,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42209.7). Total num frames: 3206021120. Throughput: 0: 42341.9. Samples: 430058460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:24:00,501][18875] Avg episode reward: [(0, '0.304')] [2024-06-18 19:24:02,602][19107] Updated weights for policy 0, policy_version 195685 (0.0027) [2024-06-18 19:24:05,500][18875] Fps is (10 sec: 42605.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3206217728. Throughput: 0: 42340.9. Samples: 430318040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:24:05,500][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 19:24:05,903][19107] Updated weights for policy 0, policy_version 195695 (0.0033) [2024-06-18 19:24:10,414][19107] Updated weights for policy 0, policy_version 195705 (0.0030) [2024-06-18 19:24:10,504][18875] Fps is (10 sec: 40945.0, 60 sec: 42322.7, 300 sec: 42042.5). Total num frames: 3206430720. Throughput: 0: 42352.2. Samples: 430570220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:24:10,505][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 19:24:13,568][19107] Updated weights for policy 0, policy_version 195715 (0.0033) [2024-06-18 19:24:15,504][18875] Fps is (10 sec: 44220.5, 60 sec: 42595.9, 300 sec: 42209.1). Total num frames: 3206660096. Throughput: 0: 42338.4. Samples: 430700240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:24:15,504][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 19:24:18,099][19107] Updated weights for policy 0, policy_version 195725 (0.0032) [2024-06-18 19:24:20,500][18875] Fps is (10 sec: 42614.0, 60 sec: 42326.7, 300 sec: 42154.1). Total num frames: 3206856704. Throughput: 0: 42286.2. Samples: 430954240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:24:20,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 19:24:21,264][19107] Updated weights for policy 0, policy_version 195735 (0.0032) [2024-06-18 19:24:25,500][18875] Fps is (10 sec: 40974.6, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3207069696. Throughput: 0: 42368.5. Samples: 431207640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:24:25,501][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 19:24:25,756][19107] Updated weights for policy 0, policy_version 195745 (0.0031) [2024-06-18 19:24:29,007][19107] Updated weights for policy 0, policy_version 195755 (0.0033) [2024-06-18 19:24:30,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 3207315456. Throughput: 0: 42259.3. Samples: 431335220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:24:30,501][18875] Avg episode reward: [(0, '0.338')] [2024-06-18 19:24:33,709][19107] Updated weights for policy 0, policy_version 195765 (0.0036) [2024-06-18 19:24:35,500][18875] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 3207462912. Throughput: 0: 42125.4. Samples: 431579640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:24:35,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 19:24:35,722][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000195770_3207495680.pth... [2024-06-18 19:24:35,781][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000195152_3197370368.pth [2024-06-18 19:24:37,226][19107] Updated weights for policy 0, policy_version 195775 (0.0046) [2024-06-18 19:24:40,500][18875] Fps is (10 sec: 36044.8, 60 sec: 42052.4, 300 sec: 41987.9). Total num frames: 3207675904. Throughput: 0: 42121.2. Samples: 431833580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:24:40,500][18875] Avg episode reward: [(0, '0.419')] [2024-06-18 19:24:41,342][19107] Updated weights for policy 0, policy_version 195785 (0.0032) [2024-06-18 19:24:44,889][19107] Updated weights for policy 0, policy_version 195795 (0.0044) [2024-06-18 19:24:45,500][18875] Fps is (10 sec: 47514.0, 60 sec: 42871.6, 300 sec: 42265.2). Total num frames: 3207938048. Throughput: 0: 42340.0. Samples: 431963760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:24:45,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 19:24:48,819][19087] Signal inference workers to stop experience collection... (6300 times) [2024-06-18 19:24:48,879][19107] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-06-18 19:24:48,884][19087] Signal inference workers to resume experience collection... (6300 times) [2024-06-18 19:24:48,896][19107] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-06-18 19:24:49,026][19107] Updated weights for policy 0, policy_version 195805 (0.0043) [2024-06-18 19:24:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3208101888. Throughput: 0: 42071.5. Samples: 432211260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:24:50,501][18875] Avg episode reward: [(0, '0.660')] [2024-06-18 19:24:52,657][19107] Updated weights for policy 0, policy_version 195815 (0.0036) [2024-06-18 19:24:55,500][18875] Fps is (10 sec: 39321.0, 60 sec: 42326.4, 300 sec: 42098.5). Total num frames: 3208331264. Throughput: 0: 42133.9. Samples: 432466100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:24:55,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 19:24:56,708][19107] Updated weights for policy 0, policy_version 195825 (0.0035) [2024-06-18 19:25:00,204][19107] Updated weights for policy 0, policy_version 195835 (0.0029) [2024-06-18 19:25:00,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3208560640. Throughput: 0: 42186.1. Samples: 432598460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:25:00,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 19:25:04,556][19107] Updated weights for policy 0, policy_version 195845 (0.0025) [2024-06-18 19:25:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 3208740864. Throughput: 0: 42108.7. Samples: 432849140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:25:05,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 19:25:08,408][19107] Updated weights for policy 0, policy_version 195855 (0.0030) [2024-06-18 19:25:10,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42327.9, 300 sec: 42154.1). Total num frames: 3208970240. Throughput: 0: 41945.0. Samples: 433095160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:25:10,501][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 19:25:12,407][19107] Updated weights for policy 0, policy_version 195865 (0.0025) [2024-06-18 19:25:15,504][18875] Fps is (10 sec: 44221.4, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3209183232. Throughput: 0: 41942.3. Samples: 433222780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:25:15,504][18875] Avg episode reward: [(0, '0.328')] [2024-06-18 19:25:16,395][19107] Updated weights for policy 0, policy_version 195875 (0.0039) [2024-06-18 19:25:20,048][19107] Updated weights for policy 0, policy_version 195885 (0.0042) [2024-06-18 19:25:20,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42154.1). 
Total num frames: 3209396224. Throughput: 0: 42181.0. Samples: 433477780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:25:20,501][18875] Avg episode reward: [(0, '0.305')] [2024-06-18 19:25:23,936][19107] Updated weights for policy 0, policy_version 195895 (0.0026) [2024-06-18 19:25:25,500][18875] Fps is (10 sec: 44252.1, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 3209625600. Throughput: 0: 42076.7. Samples: 433727040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:25:25,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 19:25:27,774][19107] Updated weights for policy 0, policy_version 195905 (0.0031) [2024-06-18 19:25:30,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 42154.1). Total num frames: 3209789440. Throughput: 0: 42046.2. Samples: 433855840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:25:30,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 19:25:31,727][19107] Updated weights for policy 0, policy_version 195915 (0.0051) [2024-06-18 19:25:35,500][18875] Fps is (10 sec: 39322.7, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3210018816. Throughput: 0: 42149.4. Samples: 434107980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:25:35,500][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 19:25:35,555][19107] Updated weights for policy 0, policy_version 195925 (0.0034) [2024-06-18 19:25:39,497][19107] Updated weights for policy 0, policy_version 195935 (0.0033) [2024-06-18 19:25:40,500][18875] Fps is (10 sec: 47513.5, 60 sec: 43144.5, 300 sec: 42265.2). Total num frames: 3210264576. Throughput: 0: 41999.2. Samples: 434356060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:25:40,501][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 19:25:43,550][19107] Updated weights for policy 0, policy_version 195945 (0.0037) [2024-06-18 19:25:45,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 42043.0). 
Total num frames: 3210412032. Throughput: 0: 41993.3. Samples: 434488160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:25:45,501][18875] Avg episode reward: [(0, '0.392')] [2024-06-18 19:25:47,201][19107] Updated weights for policy 0, policy_version 195955 (0.0048) [2024-06-18 19:25:50,504][18875] Fps is (10 sec: 37670.0, 60 sec: 42322.8, 300 sec: 42153.6). Total num frames: 3210641408. Throughput: 0: 42004.4. Samples: 434739480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:25:50,504][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 19:25:51,373][19107] Updated weights for policy 0, policy_version 195965 (0.0046) [2024-06-18 19:25:55,005][19107] Updated weights for policy 0, policy_version 195975 (0.0042) [2024-06-18 19:25:55,500][18875] Fps is (10 sec: 47512.9, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 3210887168. Throughput: 0: 42086.5. Samples: 434989060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:25:55,501][18875] Avg episode reward: [(0, '0.713')] [2024-06-18 19:25:59,233][19107] Updated weights for policy 0, policy_version 195985 (0.0044) [2024-06-18 19:26:00,500][18875] Fps is (10 sec: 42613.3, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 3211067392. Throughput: 0: 42189.1. Samples: 435121140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:00,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 19:26:02,649][19107] Updated weights for policy 0, policy_version 195995 (0.0040) [2024-06-18 19:26:05,500][18875] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3211280384. Throughput: 0: 42022.2. Samples: 435368780. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:05,501][18875] Avg episode reward: [(0, '0.488')] [2024-06-18 19:26:07,204][19107] Updated weights for policy 0, policy_version 196005 (0.0037) [2024-06-18 19:26:10,447][19107] Updated weights for policy 0, policy_version 196015 (0.0025) [2024-06-18 19:26:10,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3211509760. Throughput: 0: 42138.0. Samples: 435623240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:10,500][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 19:26:15,030][19107] Updated weights for policy 0, policy_version 196025 (0.0041) [2024-06-18 19:26:15,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41781.7, 300 sec: 42154.1). Total num frames: 3211689984. Throughput: 0: 42128.9. Samples: 435751640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:15,500][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 19:26:18,188][19107] Updated weights for policy 0, policy_version 196035 (0.0030) [2024-06-18 19:26:20,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42210.1). Total num frames: 3211919360. Throughput: 0: 42100.8. Samples: 436002520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:20,501][18875] Avg episode reward: [(0, '0.651')] [2024-06-18 19:26:22,851][19107] Updated weights for policy 0, policy_version 196045 (0.0022) [2024-06-18 19:26:23,904][19087] Signal inference workers to stop experience collection... (6350 times) [2024-06-18 19:26:23,905][19087] Signal inference workers to resume experience collection... (6350 times) [2024-06-18 19:26:23,946][19107] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-06-18 19:26:23,946][19107] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-06-18 19:26:25,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 3212115968. Throughput: 0: 42209.8. 
Samples: 436255500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:25,501][18875] Avg episode reward: [(0, '0.378')] [2024-06-18 19:26:26,167][19107] Updated weights for policy 0, policy_version 196055 (0.0030) [2024-06-18 19:26:30,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3212312576. Throughput: 0: 42068.5. Samples: 436381240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:30,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 19:26:30,566][19107] Updated weights for policy 0, policy_version 196065 (0.0034) [2024-06-18 19:26:34,060][19107] Updated weights for policy 0, policy_version 196075 (0.0043) [2024-06-18 19:26:35,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 3212558336. Throughput: 0: 42029.0. Samples: 436630640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:35,501][18875] Avg episode reward: [(0, '0.746')] [2024-06-18 19:26:35,517][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000196079_3212558336.pth... [2024-06-18 19:26:35,578][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000195460_3202416640.pth [2024-06-18 19:26:38,461][19107] Updated weights for policy 0, policy_version 196085 (0.0035) [2024-06-18 19:26:40,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 3212738560. Throughput: 0: 42126.3. Samples: 436884740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:40,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 19:26:41,851][19107] Updated weights for policy 0, policy_version 196095 (0.0043) [2024-06-18 19:26:45,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3212951552. Throughput: 0: 41833.3. Samples: 437003640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:26:45,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 19:26:46,279][19107] Updated weights for policy 0, policy_version 196105 (0.0029) [2024-06-18 19:26:49,618][19107] Updated weights for policy 0, policy_version 196115 (0.0032) [2024-06-18 19:26:50,504][18875] Fps is (10 sec: 45858.9, 60 sec: 42598.4, 300 sec: 42264.6). Total num frames: 3213197312. Throughput: 0: 42077.1. Samples: 437262400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:26:50,504][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 19:26:53,812][19107] Updated weights for policy 0, policy_version 196125 (0.0038) [2024-06-18 19:26:55,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41506.3, 300 sec: 42098.6). Total num frames: 3213377536. Throughput: 0: 42163.1. Samples: 437520580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:26:55,500][18875] Avg episode reward: [(0, '0.353')] [2024-06-18 19:26:57,285][19107] Updated weights for policy 0, policy_version 196135 (0.0034) [2024-06-18 19:27:00,500][18875] Fps is (10 sec: 39335.4, 60 sec: 42052.2, 300 sec: 42154.6). Total num frames: 3213590528. Throughput: 0: 41883.4. Samples: 437636400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:00,501][18875] Avg episode reward: [(0, '0.501')] [2024-06-18 19:27:01,823][19107] Updated weights for policy 0, policy_version 196145 (0.0031) [2024-06-18 19:27:04,963][19107] Updated weights for policy 0, policy_version 196155 (0.0039) [2024-06-18 19:27:05,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3213819904. Throughput: 0: 41990.6. Samples: 437892100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:05,501][18875] Avg episode reward: [(0, '0.420')] [2024-06-18 19:27:09,927][19107] Updated weights for policy 0, policy_version 196165 (0.0035) [2024-06-18 19:27:10,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41233.0, 300 sec: 42043.3). 
Total num frames: 3213983744. Throughput: 0: 42107.1. Samples: 438150320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:10,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 19:27:12,563][19107] Updated weights for policy 0, policy_version 196175 (0.0035) [2024-06-18 19:27:15,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3214229504. Throughput: 0: 41934.2. Samples: 438268280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:15,501][18875] Avg episode reward: [(0, '0.670')] [2024-06-18 19:27:17,578][19107] Updated weights for policy 0, policy_version 196185 (0.0023) [2024-06-18 19:27:20,482][19107] Updated weights for policy 0, policy_version 196195 (0.0038) [2024-06-18 19:27:20,500][18875] Fps is (10 sec: 47513.5, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3214458880. Throughput: 0: 42089.8. Samples: 438524680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:20,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 19:27:25,053][19107] Updated weights for policy 0, policy_version 196205 (0.0036) [2024-06-18 19:27:25,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3214622720. Throughput: 0: 42148.4. Samples: 438781420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:25,501][18875] Avg episode reward: [(0, '0.736')] [2024-06-18 19:27:28,581][19107] Updated weights for policy 0, policy_version 196215 (0.0038) [2024-06-18 19:27:30,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3214868480. Throughput: 0: 42155.5. Samples: 438900640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:30,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 19:27:32,800][19107] Updated weights for policy 0, policy_version 196225 (0.0037) [2024-06-18 19:27:35,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 42209.6). 
Total num frames: 3215081472. Throughput: 0: 42176.6. Samples: 439160200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:35,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 19:27:36,207][19107] Updated weights for policy 0, policy_version 196235 (0.0031) [2024-06-18 19:27:40,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42154.6). Total num frames: 3215261696. Throughput: 0: 41956.7. Samples: 439408640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:40,501][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 19:27:40,537][19107] Updated weights for policy 0, policy_version 196245 (0.0034) [2024-06-18 19:27:44,105][19107] Updated weights for policy 0, policy_version 196255 (0.0026) [2024-06-18 19:27:45,502][18875] Fps is (10 sec: 40953.9, 60 sec: 42324.3, 300 sec: 42153.9). Total num frames: 3215491072. Throughput: 0: 42144.4. Samples: 439532960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:45,502][18875] Avg episode reward: [(0, '0.368')] [2024-06-18 19:27:48,164][19107] Updated weights for policy 0, policy_version 196265 (0.0037) [2024-06-18 19:27:50,327][19087] Signal inference workers to stop experience collection... (6400 times) [2024-06-18 19:27:50,328][19087] Signal inference workers to resume experience collection... (6400 times) [2024-06-18 19:27:50,343][19107] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-06-18 19:27:50,344][19107] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-06-18 19:27:50,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41508.7, 300 sec: 42098.6). Total num frames: 3215687680. Throughput: 0: 42168.6. Samples: 439789680. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 19:27:50,501][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 19:27:52,024][19107] Updated weights for policy 0, policy_version 196275 (0.0033) [2024-06-18 19:27:55,500][18875] Fps is (10 sec: 40966.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3215900672. Throughput: 0: 41830.7. Samples: 440032700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:27:55,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 19:27:56,174][19107] Updated weights for policy 0, policy_version 196285 (0.0037) [2024-06-18 19:27:59,974][19107] Updated weights for policy 0, policy_version 196295 (0.0023) [2024-06-18 19:28:00,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3216113664. Throughput: 0: 42208.9. Samples: 440167680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:00,501][18875] Avg episode reward: [(0, '0.620')] [2024-06-18 19:28:03,707][19107] Updated weights for policy 0, policy_version 196305 (0.0034) [2024-06-18 19:28:05,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3216326656. Throughput: 0: 42071.5. Samples: 440417900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:05,501][18875] Avg episode reward: [(0, '0.373')] [2024-06-18 19:28:07,691][19107] Updated weights for policy 0, policy_version 196315 (0.0035) [2024-06-18 19:28:10,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 3216556032. Throughput: 0: 41875.9. Samples: 440665840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:10,501][18875] Avg episode reward: [(0, '0.764')] [2024-06-18 19:28:11,266][19107] Updated weights for policy 0, policy_version 196325 (0.0040) [2024-06-18 19:28:15,451][19107] Updated weights for policy 0, policy_version 196335 (0.0036) [2024-06-18 19:28:15,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42154.4). 
Total num frames: 3216752640. Throughput: 0: 42059.2. Samples: 440793300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:15,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 19:28:19,292][19107] Updated weights for policy 0, policy_version 196345 (0.0035) [2024-06-18 19:28:20,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 3216949248. Throughput: 0: 41866.7. Samples: 441044200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:20,501][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 19:28:23,131][19107] Updated weights for policy 0, policy_version 196355 (0.0035) [2024-06-18 19:28:25,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3217178624. Throughput: 0: 41901.7. Samples: 441294220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:25,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 19:28:26,827][19107] Updated weights for policy 0, policy_version 196365 (0.0027) [2024-06-18 19:28:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3217375232. Throughput: 0: 42080.5. Samples: 441426520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:30,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 19:28:31,129][19107] Updated weights for policy 0, policy_version 196375 (0.0025) [2024-06-18 19:28:34,621][19107] Updated weights for policy 0, policy_version 196385 (0.0022) [2024-06-18 19:28:35,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 3217571840. Throughput: 0: 41830.6. Samples: 441672060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:35,501][18875] Avg episode reward: [(0, '0.326')] [2024-06-18 19:28:35,562][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000196386_3217588224.pth... 
[2024-06-18 19:28:35,629][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000195770_3207495680.pth [2024-06-18 19:28:38,920][19107] Updated weights for policy 0, policy_version 196395 (0.0039) [2024-06-18 19:28:40,504][18875] Fps is (10 sec: 42583.5, 60 sec: 42322.9, 300 sec: 42153.6). Total num frames: 3217801216. Throughput: 0: 41927.7. Samples: 441919600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:40,504][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 19:28:42,250][19107] Updated weights for policy 0, policy_version 196405 (0.0035) [2024-06-18 19:28:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41780.3, 300 sec: 42043.0). Total num frames: 3217997824. Throughput: 0: 41812.4. Samples: 442049240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:45,501][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 19:28:46,708][19107] Updated weights for policy 0, policy_version 196415 (0.0031) [2024-06-18 19:28:49,914][19107] Updated weights for policy 0, policy_version 196425 (0.0042) [2024-06-18 19:28:50,500][18875] Fps is (10 sec: 42613.2, 60 sec: 42325.2, 300 sec: 42154.3). Total num frames: 3218227200. Throughput: 0: 41867.1. Samples: 442301920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:50,501][18875] Avg episode reward: [(0, '0.282')] [2024-06-18 19:28:54,735][19107] Updated weights for policy 0, policy_version 196435 (0.0027) [2024-06-18 19:28:55,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3218423808. Throughput: 0: 42036.5. Samples: 442557480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 19:28:55,501][18875] Avg episode reward: [(0, '0.363')] [2024-06-18 19:28:57,962][19107] Updated weights for policy 0, policy_version 196445 (0.0044) [2024-06-18 19:29:00,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3218620416. Throughput: 0: 41855.9. Samples: 442676820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:00,501][18875] Avg episode reward: [(0, '0.729')] [2024-06-18 19:29:02,444][19107] Updated weights for policy 0, policy_version 196455 (0.0055) [2024-06-18 19:29:05,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42099.1). Total num frames: 3218849792. Throughput: 0: 41877.7. Samples: 442928700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:05,501][18875] Avg episode reward: [(0, '0.344')] [2024-06-18 19:29:05,831][19107] Updated weights for policy 0, policy_version 196465 (0.0039) [2024-06-18 19:29:10,163][19107] Updated weights for policy 0, policy_version 196475 (0.0029) [2024-06-18 19:29:10,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41988.0). Total num frames: 3219046400. Throughput: 0: 42102.3. Samples: 443188820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:10,501][18875] Avg episode reward: [(0, '0.339')] [2024-06-18 19:29:11,056][19087] Signal inference workers to stop experience collection... (6450 times) [2024-06-18 19:29:11,057][19087] Signal inference workers to resume experience collection... (6450 times) [2024-06-18 19:29:11,100][19107] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-06-18 19:29:11,100][19107] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-06-18 19:29:13,482][19107] Updated weights for policy 0, policy_version 196485 (0.0030) [2024-06-18 19:29:15,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3219275776. Throughput: 0: 41941.0. Samples: 443313860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:15,501][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 19:29:17,808][19107] Updated weights for policy 0, policy_version 196495 (0.0046) [2024-06-18 19:29:20,501][18875] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 3219488768. Throughput: 0: 42221.1. 
Samples: 443572020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:20,501][18875] Avg episode reward: [(0, '0.679')] [2024-06-18 19:29:21,219][19107] Updated weights for policy 0, policy_version 196505 (0.0035) [2024-06-18 19:29:25,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3219685376. Throughput: 0: 42524.7. Samples: 443833060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:25,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 19:29:25,613][19107] Updated weights for policy 0, policy_version 196515 (0.0044) [2024-06-18 19:29:29,131][19107] Updated weights for policy 0, policy_version 196525 (0.0043) [2024-06-18 19:29:30,500][18875] Fps is (10 sec: 42599.5, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3219914752. Throughput: 0: 42302.3. Samples: 443952840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:30,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 19:29:33,441][19107] Updated weights for policy 0, policy_version 196535 (0.0043) [2024-06-18 19:29:35,504][18875] Fps is (10 sec: 42582.5, 60 sec: 42322.7, 300 sec: 42153.6). Total num frames: 3220111360. Throughput: 0: 42364.6. Samples: 444208480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:35,505][18875] Avg episode reward: [(0, '0.640')] [2024-06-18 19:29:36,930][19107] Updated weights for policy 0, policy_version 196545 (0.0034) [2024-06-18 19:29:40,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41781.7, 300 sec: 41931.9). Total num frames: 3220307968. Throughput: 0: 42398.2. Samples: 444465400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:40,501][18875] Avg episode reward: [(0, '0.721')] [2024-06-18 19:29:41,126][19107] Updated weights for policy 0, policy_version 196555 (0.0034) [2024-06-18 19:29:44,599][19107] Updated weights for policy 0, policy_version 196565 (0.0039) [2024-06-18 19:29:45,500][18875] Fps is (10 sec: 44252.8, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3220553728. Throughput: 0: 42484.5. Samples: 444588620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:45,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 19:29:48,872][19107] Updated weights for policy 0, policy_version 196575 (0.0032) [2024-06-18 19:29:50,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3220750336. Throughput: 0: 42418.3. Samples: 444837520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:50,501][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 19:29:52,541][19107] Updated weights for policy 0, policy_version 196585 (0.0028) [2024-06-18 19:29:55,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3220946944. Throughput: 0: 42306.2. Samples: 445092600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:29:55,509][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 19:29:56,556][19107] Updated weights for policy 0, policy_version 196595 (0.0027) [2024-06-18 19:30:00,265][19107] Updated weights for policy 0, policy_version 196605 (0.0039) [2024-06-18 19:30:00,503][18875] Fps is (10 sec: 42585.8, 60 sec: 42596.4, 300 sec: 42153.7). Total num frames: 3221176320. Throughput: 0: 42346.5. Samples: 445219580. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:30:00,504][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 19:30:04,293][19107] Updated weights for policy 0, policy_version 196615 (0.0039) [2024-06-18 19:30:05,504][18875] Fps is (10 sec: 42583.3, 60 sec: 42049.8, 300 sec: 42042.5). Total num frames: 3221372928. Throughput: 0: 42316.3. Samples: 445476400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-18 19:30:05,504][18875] Avg episode reward: [(0, '0.475')] [2024-06-18 19:30:08,126][19107] Updated weights for policy 0, policy_version 196625 (0.0032) [2024-06-18 19:30:10,501][18875] Fps is (10 sec: 40969.0, 60 sec: 42324.8, 300 sec: 42043.4). Total num frames: 3221585920. Throughput: 0: 42052.6. Samples: 445725460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:10,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 19:30:12,048][19107] Updated weights for policy 0, policy_version 196635 (0.0047) [2024-06-18 19:30:15,500][18875] Fps is (10 sec: 42613.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3221798912. Throughput: 0: 42263.9. Samples: 445854720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:15,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 19:30:15,707][19107] Updated weights for policy 0, policy_version 196645 (0.0026) [2024-06-18 19:30:19,683][19107] Updated weights for policy 0, policy_version 196655 (0.0030) [2024-06-18 19:30:20,500][18875] Fps is (10 sec: 42601.2, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3222011904. Throughput: 0: 42227.8. Samples: 446108580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:20,504][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 19:30:24,051][19107] Updated weights for policy 0, policy_version 196665 (0.0032) [2024-06-18 19:30:25,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3222224896. Throughput: 0: 42102.2. Samples: 446360000. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:25,501][18875] Avg episode reward: [(0, '0.387')] [2024-06-18 19:30:27,536][19107] Updated weights for policy 0, policy_version 196675 (0.0041) [2024-06-18 19:30:30,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 3222454272. Throughput: 0: 42176.0. Samples: 446486540. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:30,501][18875] Avg episode reward: [(0, '0.399')] [2024-06-18 19:30:31,645][19107] Updated weights for policy 0, policy_version 196685 (0.0047) [2024-06-18 19:30:35,272][19107] Updated weights for policy 0, policy_version 196695 (0.0030) [2024-06-18 19:30:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42327.9, 300 sec: 41987.5). Total num frames: 3222650880. Throughput: 0: 42437.8. Samples: 446747220. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:35,501][18875] Avg episode reward: [(0, '0.554')] [2024-06-18 19:30:35,519][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000196695_3222650880.pth... [2024-06-18 19:30:35,584][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000196079_3212558336.pth [2024-06-18 19:30:39,199][19107] Updated weights for policy 0, policy_version 196705 (0.0035) [2024-06-18 19:30:40,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3222863872. Throughput: 0: 42300.4. Samples: 446996120. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:40,501][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 19:30:42,993][19107] Updated weights for policy 0, policy_version 196715 (0.0028) [2024-06-18 19:30:45,504][18875] Fps is (10 sec: 42583.1, 60 sec: 42049.8, 300 sec: 42154.1). Total num frames: 3223076864. Throughput: 0: 42256.3. Samples: 447121140. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:45,504][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 19:30:47,076][19107] Updated weights for policy 0, policy_version 196725 (0.0029) [2024-06-18 19:30:49,979][19087] Signal inference workers to stop experience collection... (6500 times) [2024-06-18 19:30:50,002][19107] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-06-18 19:30:50,091][19087] Signal inference workers to resume experience collection... (6500 times) [2024-06-18 19:30:50,091][19107] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-06-18 19:30:50,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3223273472. Throughput: 0: 42245.6. Samples: 447377300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:50,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 19:30:50,801][19107] Updated weights for policy 0, policy_version 196735 (0.0037) [2024-06-18 19:30:54,662][19107] Updated weights for policy 0, policy_version 196745 (0.0032) [2024-06-18 19:30:55,500][18875] Fps is (10 sec: 44253.0, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 3223519232. Throughput: 0: 42300.8. Samples: 447628960. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:30:55,500][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 19:30:58,506][19107] Updated weights for policy 0, policy_version 196755 (0.0034) [2024-06-18 19:31:00,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42054.4, 300 sec: 42098.6). Total num frames: 3223699456. Throughput: 0: 42368.5. Samples: 447761300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:31:00,500][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 19:31:02,210][19107] Updated weights for policy 0, policy_version 196765 (0.0041) [2024-06-18 19:31:05,504][18875] Fps is (10 sec: 40944.6, 60 sec: 42598.4, 300 sec: 42098.0). Total num frames: 3223928832. Throughput: 0: 42358.0. 
Samples: 448014840. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:31:05,505][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 19:31:06,163][19107] Updated weights for policy 0, policy_version 196775 (0.0028) [2024-06-18 19:31:10,434][19107] Updated weights for policy 0, policy_version 196785 (0.0034) [2024-06-18 19:31:10,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.9, 300 sec: 42154.1). Total num frames: 3224125440. Throughput: 0: 42495.6. Samples: 448272300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 22.0) [2024-06-18 19:31:10,501][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 19:31:13,725][19107] Updated weights for policy 0, policy_version 196795 (0.0030) [2024-06-18 19:31:15,500][18875] Fps is (10 sec: 42614.2, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3224354816. Throughput: 0: 42433.0. Samples: 448396020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:31:15,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 19:31:18,268][19107] Updated weights for policy 0, policy_version 196805 (0.0030) [2024-06-18 19:31:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3224535040. Throughput: 0: 42125.8. Samples: 448642880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:31:20,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 19:31:21,730][19107] Updated weights for policy 0, policy_version 196815 (0.0030) [2024-06-18 19:31:25,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3224748032. Throughput: 0: 42193.0. Samples: 448894800. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:31:25,501][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 19:31:25,967][19107] Updated weights for policy 0, policy_version 196825 (0.0048) [2024-06-18 19:31:29,353][19107] Updated weights for policy 0, policy_version 196835 (0.0043) [2024-06-18 19:31:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3224961024. Throughput: 0: 42238.0. Samples: 449021700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:31:30,501][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 19:31:33,518][19107] Updated weights for policy 0, policy_version 196845 (0.0034) [2024-06-18 19:31:35,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3225174016. Throughput: 0: 42031.1. Samples: 449268700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:31:35,501][18875] Avg episode reward: [(0, '0.764')] [2024-06-18 19:31:36,981][19107] Updated weights for policy 0, policy_version 196855 (0.0042) [2024-06-18 19:31:40,504][18875] Fps is (10 sec: 44221.0, 60 sec: 42322.9, 300 sec: 42209.1). Total num frames: 3225403392. Throughput: 0: 42168.1. Samples: 449526680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:31:40,505][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 19:31:41,092][19107] Updated weights for policy 0, policy_version 196865 (0.0031) [2024-06-18 19:31:44,662][19107] Updated weights for policy 0, policy_version 196875 (0.0028) [2024-06-18 19:31:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42054.7, 300 sec: 42043.5). Total num frames: 3225600000. Throughput: 0: 42136.3. Samples: 449657440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:31:45,501][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 19:31:48,760][19107] Updated weights for policy 0, policy_version 196885 (0.0030) [2024-06-18 19:31:50,500][18875] Fps is (10 sec: 40975.0, 60 sec: 42325.4, 300 sec: 42154.1). 
Total num frames: 3225812992. Throughput: 0: 41964.8. Samples: 449903100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:31:50,501][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 19:31:52,613][19107] Updated weights for policy 0, policy_version 196895 (0.0028) [2024-06-18 19:31:55,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 3226009600. Throughput: 0: 41992.9. Samples: 450161980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:31:55,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 19:31:56,350][19087] Signal inference workers to stop experience collection... (6550 times) [2024-06-18 19:31:56,359][19107] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-06-18 19:31:56,461][19087] Signal inference workers to resume experience collection... (6550 times) [2024-06-18 19:31:56,461][19107] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-06-18 19:31:56,591][19107] Updated weights for policy 0, policy_version 196905 (0.0026) [2024-06-18 19:32:00,466][19107] Updated weights for policy 0, policy_version 196915 (0.0023) [2024-06-18 19:32:00,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3226255360. Throughput: 0: 42016.8. Samples: 450286780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:32:00,501][18875] Avg episode reward: [(0, '0.436')] [2024-06-18 19:32:04,243][19107] Updated weights for policy 0, policy_version 196925 (0.0028) [2024-06-18 19:32:05,500][18875] Fps is (10 sec: 42597.6, 60 sec: 41781.6, 300 sec: 42209.6). Total num frames: 3226435584. Throughput: 0: 42094.1. Samples: 450537120. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:32:05,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 19:32:08,052][19107] Updated weights for policy 0, policy_version 196935 (0.0033) [2024-06-18 19:32:10,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3226632192. Throughput: 0: 42293.0. Samples: 450797980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:32:10,501][18875] Avg episode reward: [(0, '0.526')] [2024-06-18 19:32:12,223][19107] Updated weights for policy 0, policy_version 196945 (0.0038) [2024-06-18 19:32:15,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 3226877952. Throughput: 0: 42119.4. Samples: 450917080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-18 19:32:15,501][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 19:32:16,155][19107] Updated weights for policy 0, policy_version 196955 (0.0034) [2024-06-18 19:32:19,930][19107] Updated weights for policy 0, policy_version 196965 (0.0031) [2024-06-18 19:32:20,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 3227090944. Throughput: 0: 42311.2. Samples: 451172700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:32:20,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 19:32:23,981][19107] Updated weights for policy 0, policy_version 196975 (0.0028) [2024-06-18 19:32:25,500][18875] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3227271168. Throughput: 0: 42267.7. Samples: 451428580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:32:25,504][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 19:32:28,111][19107] Updated weights for policy 0, policy_version 196985 (0.0041) [2024-06-18 19:32:30,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3227516928. Throughput: 0: 42175.7. Samples: 451555340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:32:30,500][18875] Avg episode reward: [(0, '0.612')] [2024-06-18 19:32:31,807][19107] Updated weights for policy 0, policy_version 196995 (0.0025) [2024-06-18 19:32:35,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3227697152. Throughput: 0: 42207.0. Samples: 451802420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:32:35,501][18875] Avg episode reward: [(0, '0.687')] [2024-06-18 19:32:35,602][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000197004_3227713536.pth... [2024-06-18 19:32:35,647][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000196386_3217588224.pth [2024-06-18 19:32:36,005][19107] Updated weights for policy 0, policy_version 197005 (0.0041) [2024-06-18 19:32:39,839][19107] Updated weights for policy 0, policy_version 197015 (0.0027) [2024-06-18 19:32:40,500][18875] Fps is (10 sec: 39320.6, 60 sec: 41781.6, 300 sec: 42098.8). Total num frames: 3227910144. Throughput: 0: 42070.0. Samples: 452055140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:32:40,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 19:32:43,653][19107] Updated weights for policy 0, policy_version 197025 (0.0027) [2024-06-18 19:32:45,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 3228155904. Throughput: 0: 42111.0. Samples: 452181780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:32:45,501][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 19:32:47,801][19107] Updated weights for policy 0, policy_version 197035 (0.0044) [2024-06-18 19:32:50,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3228352512. Throughput: 0: 42233.9. Samples: 452437640. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:32:50,501][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 19:32:51,300][19107] Updated weights for policy 0, policy_version 197045 (0.0032) [2024-06-18 19:32:55,458][19107] Updated weights for policy 0, policy_version 197055 (0.0031) [2024-06-18 19:32:55,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 3228549120. Throughput: 0: 42038.0. Samples: 452689700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:32:55,501][18875] Avg episode reward: [(0, '0.646')] [2024-06-18 19:32:59,162][19107] Updated weights for policy 0, policy_version 197065 (0.0034) [2024-06-18 19:33:00,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 3228794880. Throughput: 0: 42168.6. Samples: 452814660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:33:00,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 19:33:03,179][19107] Updated weights for policy 0, policy_version 197075 (0.0026) [2024-06-18 19:33:05,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3228958720. Throughput: 0: 42259.1. Samples: 453074360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:33:05,501][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 19:33:06,903][19107] Updated weights for policy 0, policy_version 197085 (0.0036) [2024-06-18 19:33:10,500][18875] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3229171712. Throughput: 0: 42130.3. Samples: 453324440. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:33:10,501][18875] Avg episode reward: [(0, '0.386')] [2024-06-18 19:33:10,814][19107] Updated weights for policy 0, policy_version 197095 (0.0030) [2024-06-18 19:33:14,514][19107] Updated weights for policy 0, policy_version 197105 (0.0037) [2024-06-18 19:33:14,850][19087] Signal inference workers to stop experience collection... (6600 times) [2024-06-18 19:33:14,850][19087] Signal inference workers to resume experience collection... (6600 times) [2024-06-18 19:33:14,879][19107] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-06-18 19:33:14,879][19107] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-06-18 19:33:15,500][18875] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 3229417472. Throughput: 0: 42146.1. Samples: 453451920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:33:15,501][18875] Avg episode reward: [(0, '0.381')] [2024-06-18 19:33:18,501][19107] Updated weights for policy 0, policy_version 197115 (0.0038) [2024-06-18 19:33:20,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 3229597696. Throughput: 0: 42346.3. Samples: 453708000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:33:20,501][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 19:33:22,038][19107] Updated weights for policy 0, policy_version 197125 (0.0043) [2024-06-18 19:33:25,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 3229827072. Throughput: 0: 42355.7. Samples: 453961140. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 19:33:25,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 19:33:26,067][19107] Updated weights for policy 0, policy_version 197135 (0.0040) [2024-06-18 19:33:29,632][19107] Updated weights for policy 0, policy_version 197145 (0.0030) [2024-06-18 19:33:30,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 3230056448. Throughput: 0: 42496.2. Samples: 454094100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:33:30,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 19:33:34,018][19107] Updated weights for policy 0, policy_version 197155 (0.0049) [2024-06-18 19:33:35,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42099.0). Total num frames: 3230220288. Throughput: 0: 42302.5. Samples: 454341260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:33:35,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 19:33:37,513][19107] Updated weights for policy 0, policy_version 197165 (0.0038) [2024-06-18 19:33:40,500][18875] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3230449664. Throughput: 0: 42213.9. Samples: 454589320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:33:40,501][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 19:33:41,725][19107] Updated weights for policy 0, policy_version 197175 (0.0039) [2024-06-18 19:33:45,180][19107] Updated weights for policy 0, policy_version 197185 (0.0026) [2024-06-18 19:33:45,500][18875] Fps is (10 sec: 45875.8, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 3230679040. Throughput: 0: 42440.5. Samples: 454724480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:33:45,501][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 19:33:49,499][19107] Updated weights for policy 0, policy_version 197195 (0.0027) [2024-06-18 19:33:50,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42154.1). 
Total num frames: 3230859264. Throughput: 0: 42159.5. Samples: 454971540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:33:50,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 19:33:52,880][19107] Updated weights for policy 0, policy_version 197205 (0.0029) [2024-06-18 19:33:55,500][18875] Fps is (10 sec: 37682.9, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3231055872. Throughput: 0: 42107.0. Samples: 455219260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:33:55,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 19:33:57,079][19107] Updated weights for policy 0, policy_version 197215 (0.0037) [2024-06-18 19:34:00,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 3231285248. Throughput: 0: 42031.0. Samples: 455343320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:34:00,501][18875] Avg episode reward: [(0, '0.358')] [2024-06-18 19:34:00,926][19107] Updated weights for policy 0, policy_version 197225 (0.0025) [2024-06-18 19:34:05,099][19107] Updated weights for policy 0, policy_version 197235 (0.0032) [2024-06-18 19:34:05,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 3231514624. Throughput: 0: 41894.2. Samples: 455593240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:34:05,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 19:34:08,750][19107] Updated weights for policy 0, policy_version 197245 (0.0038) [2024-06-18 19:34:10,504][18875] Fps is (10 sec: 40945.6, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 3231694848. Throughput: 0: 41805.5. Samples: 455842540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:34:10,505][18875] Avg episode reward: [(0, '0.783')] [2024-06-18 19:34:12,989][19107] Updated weights for policy 0, policy_version 197255 (0.0033) [2024-06-18 19:34:15,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42154.1). 
Total num frames: 3231924224. Throughput: 0: 41552.8. Samples: 455963980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:34:15,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 19:34:16,560][19107] Updated weights for policy 0, policy_version 197265 (0.0029) [2024-06-18 19:34:20,500][18875] Fps is (10 sec: 44252.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3232137216. Throughput: 0: 41946.7. Samples: 456228860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:34:20,501][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 19:34:20,591][19107] Updated weights for policy 0, policy_version 197275 (0.0029) [2024-06-18 19:34:24,472][19107] Updated weights for policy 0, policy_version 197285 (0.0029) [2024-06-18 19:34:25,503][18875] Fps is (10 sec: 40947.8, 60 sec: 41777.1, 300 sec: 42098.1). Total num frames: 3232333824. Throughput: 0: 41918.1. Samples: 456475760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:34:25,504][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 19:34:26,008][19087] Signal inference workers to stop experience collection... (6650 times) [2024-06-18 19:34:26,055][19087] Signal inference workers to resume experience collection... (6650 times) [2024-06-18 19:34:26,056][19107] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-06-18 19:34:26,072][19107] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-06-18 19:34:28,364][19107] Updated weights for policy 0, policy_version 197295 (0.0038) [2024-06-18 19:34:30,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42265.7). Total num frames: 3232579584. Throughput: 0: 41703.0. Samples: 456601120. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 19:34:30,501][18875] Avg episode reward: [(0, '0.423')] [2024-06-18 19:34:32,709][19107] Updated weights for policy 0, policy_version 197305 (0.0039) [2024-06-18 19:34:35,500][18875] Fps is (10 sec: 40972.9, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3232743424. Throughput: 0: 41941.5. Samples: 456858900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:34:35,501][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 19:34:35,681][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000197313_3232776192.pth... [2024-06-18 19:34:35,735][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000196695_3222650880.pth [2024-06-18 19:34:36,364][19107] Updated weights for policy 0, policy_version 197315 (0.0035) [2024-06-18 19:34:40,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3232972800. Throughput: 0: 41991.6. Samples: 457108880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:34:40,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 19:34:40,502][19107] Updated weights for policy 0, policy_version 197325 (0.0032) [2024-06-18 19:34:44,109][19107] Updated weights for policy 0, policy_version 197335 (0.0034) [2024-06-18 19:34:45,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3233185792. Throughput: 0: 42002.0. Samples: 457233400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:34:45,500][18875] Avg episode reward: [(0, '0.712')] [2024-06-18 19:34:48,140][19107] Updated weights for policy 0, policy_version 197345 (0.0031) [2024-06-18 19:34:50,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3233366016. Throughput: 0: 42109.3. Samples: 457488160. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:34:50,501][18875] Avg episode reward: [(0, '0.651')] [2024-06-18 19:34:51,899][19107] Updated weights for policy 0, policy_version 197355 (0.0028) [2024-06-18 19:34:55,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42099.0). Total num frames: 3233595392. Throughput: 0: 42022.9. Samples: 457733420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:34:55,501][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 19:34:55,876][19107] Updated weights for policy 0, policy_version 197365 (0.0025) [2024-06-18 19:34:59,877][19107] Updated weights for policy 0, policy_version 197375 (0.0027) [2024-06-18 19:35:00,500][18875] Fps is (10 sec: 47513.5, 60 sec: 42598.5, 300 sec: 42265.7). Total num frames: 3233841152. Throughput: 0: 42253.9. Samples: 457865400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:35:00,502][18875] Avg episode reward: [(0, '0.661')] [2024-06-18 19:35:04,236][19107] Updated weights for policy 0, policy_version 197385 (0.0033) [2024-06-18 19:35:05,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 42098.7). Total num frames: 3234004992. Throughput: 0: 42029.4. Samples: 458120180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:35:05,501][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 19:35:07,731][19107] Updated weights for policy 0, policy_version 197395 (0.0043) [2024-06-18 19:35:10,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42600.9, 300 sec: 42209.6). Total num frames: 3234250752. Throughput: 0: 41915.7. Samples: 458361840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:35:10,501][18875] Avg episode reward: [(0, '0.640')] [2024-06-18 19:35:11,923][19107] Updated weights for policy 0, policy_version 197405 (0.0027) [2024-06-18 19:35:15,494][19107] Updated weights for policy 0, policy_version 197415 (0.0025) [2024-06-18 19:35:15,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42154.1). 
Total num frames: 3234447360. Throughput: 0: 42111.6. Samples: 458496140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:35:15,501][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 19:35:19,539][19107] Updated weights for policy 0, policy_version 197425 (0.0042) [2024-06-18 19:35:20,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3234627584. Throughput: 0: 42008.7. Samples: 458749300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:35:20,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 19:35:23,226][19107] Updated weights for policy 0, policy_version 197435 (0.0029) [2024-06-18 19:35:23,807][19087] Signal inference workers to stop experience collection... (6700 times) [2024-06-18 19:35:23,808][19087] Signal inference workers to resume experience collection... (6700 times) [2024-06-18 19:35:23,822][19107] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-06-18 19:35:23,841][19107] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-06-18 19:35:25,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42600.6, 300 sec: 42154.1). Total num frames: 3234889728. Throughput: 0: 41812.9. Samples: 458990460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:35:25,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 19:35:27,142][19107] Updated weights for policy 0, policy_version 197445 (0.0029) [2024-06-18 19:35:30,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 3235069952. Throughput: 0: 42181.2. Samples: 459131560. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:35:30,501][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 19:35:30,929][19107] Updated weights for policy 0, policy_version 197455 (0.0039) [2024-06-18 19:35:35,036][19107] Updated weights for policy 0, policy_version 197465 (0.0033) [2024-06-18 19:35:35,500][18875] Fps is (10 sec: 37682.7, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 3235266560. Throughput: 0: 42022.1. Samples: 459379160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:35:35,501][18875] Avg episode reward: [(0, '0.378')] [2024-06-18 19:35:38,715][19107] Updated weights for policy 0, policy_version 197475 (0.0040) [2024-06-18 19:35:40,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42210.1). Total num frames: 3235528704. Throughput: 0: 42024.1. Samples: 459624500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 19:35:40,501][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 19:35:42,586][19107] Updated weights for policy 0, policy_version 197485 (0.0034) [2024-06-18 19:35:45,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 3235708928. Throughput: 0: 42135.1. Samples: 459761480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:35:45,501][18875] Avg episode reward: [(0, '0.780')] [2024-06-18 19:35:46,430][19107] Updated weights for policy 0, policy_version 197495 (0.0026) [2024-06-18 19:35:50,224][19107] Updated weights for policy 0, policy_version 197505 (0.0029) [2024-06-18 19:35:50,502][18875] Fps is (10 sec: 39314.1, 60 sec: 42597.1, 300 sec: 42042.7). Total num frames: 3235921920. Throughput: 0: 41995.5. Samples: 460010060. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:35:50,503][18875] Avg episode reward: [(0, '0.651')] [2024-06-18 19:35:54,215][19107] Updated weights for policy 0, policy_version 197515 (0.0041) [2024-06-18 19:35:55,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3236151296. Throughput: 0: 42095.2. Samples: 460256120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:35:55,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 19:35:57,864][19107] Updated weights for policy 0, policy_version 197525 (0.0029) [2024-06-18 19:36:00,500][18875] Fps is (10 sec: 40968.0, 60 sec: 41506.2, 300 sec: 42043.5). Total num frames: 3236331520. Throughput: 0: 41949.0. Samples: 460383840. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:00,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 19:36:02,071][19107] Updated weights for policy 0, policy_version 197535 (0.0027) [2024-06-18 19:36:05,466][19107] Updated weights for policy 0, policy_version 197545 (0.0042) [2024-06-18 19:36:05,500][18875] Fps is (10 sec: 42597.8, 60 sec: 42871.3, 300 sec: 42209.6). Total num frames: 3236577280. Throughput: 0: 41939.5. Samples: 460636580. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:05,501][18875] Avg episode reward: [(0, '0.765')] [2024-06-18 19:36:09,912][19107] Updated weights for policy 0, policy_version 197555 (0.0031) [2024-06-18 19:36:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3236757504. Throughput: 0: 42105.9. Samples: 460885220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:10,501][18875] Avg episode reward: [(0, '0.753')] [2024-06-18 19:36:13,590][19107] Updated weights for policy 0, policy_version 197565 (0.0053) [2024-06-18 19:36:15,500][18875] Fps is (10 sec: 37684.1, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3236954112. Throughput: 0: 41763.3. Samples: 461010900. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:15,501][18875] Avg episode reward: [(0, '0.756')] [2024-06-18 19:36:17,927][19107] Updated weights for policy 0, policy_version 197575 (0.0031) [2024-06-18 19:36:20,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3237183488. Throughput: 0: 41907.7. Samples: 461265000. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:20,501][18875] Avg episode reward: [(0, '0.717')] [2024-06-18 19:36:21,243][19107] Updated weights for policy 0, policy_version 197585 (0.0031) [2024-06-18 19:36:25,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 3237380096. Throughput: 0: 42160.5. Samples: 461521720. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:25,501][18875] Avg episode reward: [(0, '0.636')] [2024-06-18 19:36:25,522][19107] Updated weights for policy 0, policy_version 197595 (0.0049) [2024-06-18 19:36:29,421][19107] Updated weights for policy 0, policy_version 197605 (0.0038) [2024-06-18 19:36:30,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3237576704. Throughput: 0: 41820.9. Samples: 461643420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:30,501][18875] Avg episode reward: [(0, '0.729')] [2024-06-18 19:36:33,239][19107] Updated weights for policy 0, policy_version 197615 (0.0038) [2024-06-18 19:36:35,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42099.0). Total num frames: 3237822464. Throughput: 0: 41982.1. Samples: 461899180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:35,501][18875] Avg episode reward: [(0, '0.333')] [2024-06-18 19:36:35,509][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000197621_3237822464.pth... 
[2024-06-18 19:36:35,567][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000197004_3227713536.pth [2024-06-18 19:36:37,161][19107] Updated weights for policy 0, policy_version 197625 (0.0043) [2024-06-18 19:36:40,500][18875] Fps is (10 sec: 42599.3, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 3238002688. Throughput: 0: 42141.0. Samples: 462152460. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:40,500][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 19:36:41,243][19107] Updated weights for policy 0, policy_version 197635 (0.0029) [2024-06-18 19:36:45,051][19107] Updated weights for policy 0, policy_version 197645 (0.0030) [2024-06-18 19:36:45,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3238232064. Throughput: 0: 42072.3. Samples: 462277100. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-18 19:36:45,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 19:36:48,670][19087] Signal inference workers to stop experience collection... (6750 times) [2024-06-18 19:36:48,696][19107] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-06-18 19:36:48,734][19087] Signal inference workers to resume experience collection... (6750 times) [2024-06-18 19:36:48,734][19107] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-06-18 19:36:48,874][19107] Updated weights for policy 0, policy_version 197655 (0.0035) [2024-06-18 19:36:50,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41780.5, 300 sec: 42098.5). Total num frames: 3238428672. Throughput: 0: 42023.7. Samples: 462527640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:36:50,501][18875] Avg episode reward: [(0, '0.489')] [2024-06-18 19:36:52,902][19107] Updated weights for policy 0, policy_version 197665 (0.0038) [2024-06-18 19:36:55,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3238641664. 
Throughput: 0: 42111.0. Samples: 462780220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:36:55,501][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 19:36:56,604][19107] Updated weights for policy 0, policy_version 197675 (0.0046) [2024-06-18 19:37:00,504][18875] Fps is (10 sec: 42583.7, 60 sec: 42049.8, 300 sec: 42098.1). Total num frames: 3238854656. Throughput: 0: 42240.7. Samples: 462911880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:00,504][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 19:37:00,761][19107] Updated weights for policy 0, policy_version 197685 (0.0035) [2024-06-18 19:37:04,310][19107] Updated weights for policy 0, policy_version 197695 (0.0044) [2024-06-18 19:37:05,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41233.2, 300 sec: 42098.6). Total num frames: 3239051264. Throughput: 0: 42029.0. Samples: 463156300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:05,500][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 19:37:08,418][19107] Updated weights for policy 0, policy_version 197705 (0.0036) [2024-06-18 19:37:10,500][18875] Fps is (10 sec: 42613.7, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3239280640. Throughput: 0: 41992.0. Samples: 463411360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:10,500][18875] Avg episode reward: [(0, '0.335')] [2024-06-18 19:37:12,342][19107] Updated weights for policy 0, policy_version 197715 (0.0043) [2024-06-18 19:37:15,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3239493632. Throughput: 0: 42163.2. Samples: 463540760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:15,501][18875] Avg episode reward: [(0, '0.238')] [2024-06-18 19:37:16,366][19107] Updated weights for policy 0, policy_version 197725 (0.0037) [2024-06-18 19:37:20,093][19107] Updated weights for policy 0, policy_version 197735 (0.0040) [2024-06-18 19:37:20,500][18875] Fps is (10 sec: 40959.2, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3239690240. Throughput: 0: 42075.1. Samples: 463792560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:20,501][18875] Avg episode reward: [(0, '0.324')] [2024-06-18 19:37:24,074][19107] Updated weights for policy 0, policy_version 197745 (0.0049) [2024-06-18 19:37:25,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3239903232. Throughput: 0: 41980.0. Samples: 464041560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:25,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 19:37:27,929][19107] Updated weights for policy 0, policy_version 197755 (0.0033) [2024-06-18 19:37:30,500][18875] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3240132608. Throughput: 0: 42101.5. Samples: 464171660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:30,501][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 19:37:31,848][19107] Updated weights for policy 0, policy_version 197765 (0.0037) [2024-06-18 19:37:35,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.3, 300 sec: 42043.0). Total num frames: 3240312832. Throughput: 0: 42140.5. Samples: 464423960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:35,501][18875] Avg episode reward: [(0, '0.717')] [2024-06-18 19:37:35,689][19107] Updated weights for policy 0, policy_version 197775 (0.0027) [2024-06-18 19:37:39,674][19107] Updated weights for policy 0, policy_version 197785 (0.0044) [2024-06-18 19:37:40,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42325.1, 300 sec: 41987.5). Total num frames: 3240542208. Throughput: 0: 42107.9. Samples: 464675080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:40,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 19:37:43,386][19107] Updated weights for policy 0, policy_version 197795 (0.0043) [2024-06-18 19:37:45,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3240771584. Throughput: 0: 42071.2. Samples: 464804940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:45,501][18875] Avg episode reward: [(0, '0.526')] [2024-06-18 19:37:47,491][19107] Updated weights for policy 0, policy_version 197805 (0.0037) [2024-06-18 19:37:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3240951808. Throughput: 0: 42067.4. Samples: 465049340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 19:37:50,501][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 19:37:51,272][19107] Updated weights for policy 0, policy_version 197815 (0.0033) [2024-06-18 19:37:55,216][19107] Updated weights for policy 0, policy_version 197825 (0.0041) [2024-06-18 19:37:55,504][18875] Fps is (10 sec: 40945.4, 60 sec: 42322.8, 300 sec: 41987.0). Total num frames: 3241181184. Throughput: 0: 42151.7. Samples: 465308340. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:37:55,504][18875] Avg episode reward: [(0, '0.388')] [2024-06-18 19:37:58,972][19107] Updated weights for policy 0, policy_version 197835 (0.0049) [2024-06-18 19:38:00,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42327.7, 300 sec: 42154.1). Total num frames: 3241394176. Throughput: 0: 42104.3. Samples: 465435460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:00,501][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 19:38:02,827][19107] Updated weights for policy 0, policy_version 197845 (0.0035) [2024-06-18 19:38:05,500][18875] Fps is (10 sec: 42613.3, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3241607168. Throughput: 0: 42123.1. Samples: 465688100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:05,501][18875] Avg episode reward: [(0, '0.326')] [2024-06-18 19:38:06,780][19107] Updated weights for policy 0, policy_version 197855 (0.0046) [2024-06-18 19:38:10,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3241803776. Throughput: 0: 42301.7. Samples: 465945140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:10,501][18875] Avg episode reward: [(0, '0.424')] [2024-06-18 19:38:10,622][19107] Updated weights for policy 0, policy_version 197865 (0.0023) [2024-06-18 19:38:14,694][19087] Signal inference workers to stop experience collection... (6800 times) [2024-06-18 19:38:14,743][19107] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-06-18 19:38:14,753][19087] Signal inference workers to resume experience collection... (6800 times) [2024-06-18 19:38:14,759][19107] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-06-18 19:38:14,904][19107] Updated weights for policy 0, policy_version 197875 (0.0034) [2024-06-18 19:38:15,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3242000384. Throughput: 0: 42163.5. 
Samples: 466069020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:15,501][18875] Avg episode reward: [(0, '0.661')] [2024-06-18 19:38:18,428][19107] Updated weights for policy 0, policy_version 197885 (0.0041) [2024-06-18 19:38:20,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3242229760. Throughput: 0: 42024.3. Samples: 466315060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:20,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 19:38:22,497][19107] Updated weights for policy 0, policy_version 197895 (0.0031) [2024-06-18 19:38:25,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 3242442752. Throughput: 0: 41987.6. Samples: 466564520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:25,501][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 19:38:26,191][19107] Updated weights for policy 0, policy_version 197905 (0.0034) [2024-06-18 19:38:30,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3242622976. Throughput: 0: 41916.4. Samples: 466691180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:30,501][18875] Avg episode reward: [(0, '0.342')] [2024-06-18 19:38:30,520][19107] Updated weights for policy 0, policy_version 197915 (0.0033) [2024-06-18 19:38:34,239][19107] Updated weights for policy 0, policy_version 197925 (0.0034) [2024-06-18 19:38:35,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 3242868736. Throughput: 0: 42161.8. Samples: 466946620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:35,501][18875] Avg episode reward: [(0, '0.766')] [2024-06-18 19:38:35,517][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000197929_3242868736.pth... 
[2024-06-18 19:38:35,583][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000197313_3232776192.pth [2024-06-18 19:38:38,017][19107] Updated weights for policy 0, policy_version 197935 (0.0041) [2024-06-18 19:38:40,500][18875] Fps is (10 sec: 45875.5, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 3243081728. Throughput: 0: 41924.7. Samples: 467194800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:40,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 19:38:42,069][19107] Updated weights for policy 0, policy_version 197945 (0.0030) [2024-06-18 19:38:45,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3243278336. Throughput: 0: 41882.3. Samples: 467320160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:45,501][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 19:38:45,620][19107] Updated weights for policy 0, policy_version 197955 (0.0033) [2024-06-18 19:38:49,814][19107] Updated weights for policy 0, policy_version 197965 (0.0037) [2024-06-18 19:38:50,500][18875] Fps is (10 sec: 39321.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3243474944. Throughput: 0: 41908.4. Samples: 467573980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:50,501][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 19:38:53,330][19107] Updated weights for policy 0, policy_version 197975 (0.0028) [2024-06-18 19:38:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42054.8, 300 sec: 42098.6). Total num frames: 3243704320. Throughput: 0: 41914.8. Samples: 467831300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:38:55,501][18875] Avg episode reward: [(0, '0.746')] [2024-06-18 19:38:57,629][19107] Updated weights for policy 0, policy_version 197985 (0.0044) [2024-06-18 19:39:00,501][18875] Fps is (10 sec: 42595.1, 60 sec: 41778.6, 300 sec: 41987.3). Total num frames: 3243900928. Throughput: 0: 41857.4. Samples: 467952640. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 19:39:00,502][18875] Avg episode reward: [(0, '0.636')] [2024-06-18 19:39:01,086][19107] Updated weights for policy 0, policy_version 197995 (0.0024) [2024-06-18 19:39:05,373][19107] Updated weights for policy 0, policy_version 198005 (0.0033) [2024-06-18 19:39:05,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42099.1). Total num frames: 3244113920. Throughput: 0: 41920.0. Samples: 468201460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:05,501][18875] Avg episode reward: [(0, '0.659')] [2024-06-18 19:39:09,220][19107] Updated weights for policy 0, policy_version 198015 (0.0029) [2024-06-18 19:39:10,504][18875] Fps is (10 sec: 40949.1, 60 sec: 41776.7, 300 sec: 41987.0). Total num frames: 3244310528. Throughput: 0: 41951.0. Samples: 468452460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:10,504][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 19:39:13,142][19107] Updated weights for policy 0, policy_version 198025 (0.0038) [2024-06-18 19:39:15,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3244523520. Throughput: 0: 41881.4. Samples: 468575840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:15,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 19:39:16,919][19107] Updated weights for policy 0, policy_version 198035 (0.0037) [2024-06-18 19:39:20,500][18875] Fps is (10 sec: 42613.5, 60 sec: 41779.3, 300 sec: 42043.4). Total num frames: 3244736512. Throughput: 0: 41789.9. Samples: 468827160. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:20,501][18875] Avg episode reward: [(0, '0.286')] [2024-06-18 19:39:20,895][19107] Updated weights for policy 0, policy_version 198045 (0.0042) [2024-06-18 19:39:24,719][19107] Updated weights for policy 0, policy_version 198055 (0.0029) [2024-06-18 19:39:25,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3244949504. Throughput: 0: 41797.3. Samples: 469075680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:25,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 19:39:28,925][19107] Updated weights for policy 0, policy_version 198065 (0.0033) [2024-06-18 19:39:30,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3245146112. Throughput: 0: 41909.3. Samples: 469206080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:30,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 19:39:32,599][19107] Updated weights for policy 0, policy_version 198075 (0.0034) [2024-06-18 19:39:32,599][19087] Signal inference workers to stop experience collection... (6850 times) [2024-06-18 19:39:32,599][19087] Signal inference workers to resume experience collection... (6850 times) [2024-06-18 19:39:32,646][19107] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-06-18 19:39:32,647][19107] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-06-18 19:39:35,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3245359104. Throughput: 0: 41795.2. Samples: 469454760. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:35,501][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 19:39:36,591][19107] Updated weights for policy 0, policy_version 198085 (0.0031) [2024-06-18 19:39:40,335][19107] Updated weights for policy 0, policy_version 198095 (0.0050) [2024-06-18 19:39:40,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3245588480. Throughput: 0: 41725.2. Samples: 469708940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:40,501][18875] Avg episode reward: [(0, '0.554')] [2024-06-18 19:39:44,558][19107] Updated weights for policy 0, policy_version 198105 (0.0031) [2024-06-18 19:39:45,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3245768704. Throughput: 0: 41886.6. Samples: 469837500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:45,509][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 19:39:48,125][19107] Updated weights for policy 0, policy_version 198115 (0.0035) [2024-06-18 19:39:50,504][18875] Fps is (10 sec: 40945.4, 60 sec: 42049.8, 300 sec: 42042.5). Total num frames: 3245998080. Throughput: 0: 41896.7. Samples: 470086960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:50,504][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 19:39:52,103][19107] Updated weights for policy 0, policy_version 198125 (0.0041) [2024-06-18 19:39:55,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3246211072. Throughput: 0: 42075.2. Samples: 470345700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:39:55,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 19:39:55,697][19107] Updated weights for policy 0, policy_version 198135 (0.0033) [2024-06-18 19:39:59,859][19107] Updated weights for policy 0, policy_version 198145 (0.0030) [2024-06-18 19:40:00,500][18875] Fps is (10 sec: 42613.2, 60 sec: 42052.8, 300 sec: 42098.5). Total num frames: 3246424064. Throughput: 0: 42116.3. Samples: 470471080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:40:00,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 19:40:03,441][19107] Updated weights for policy 0, policy_version 198155 (0.0026) [2024-06-18 19:40:05,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3246620672. Throughput: 0: 42030.6. Samples: 470718540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 19:40:05,501][18875] Avg episode reward: [(0, '0.488')] [2024-06-18 19:40:07,689][19107] Updated weights for policy 0, policy_version 198165 (0.0035) [2024-06-18 19:40:10,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42327.9, 300 sec: 42043.0). Total num frames: 3246850048. Throughput: 0: 42116.0. Samples: 470970900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:10,501][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 19:40:11,313][19107] Updated weights for policy 0, policy_version 198175 (0.0031) [2024-06-18 19:40:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 3247046656. Throughput: 0: 41979.1. Samples: 471095140. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:15,501][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 19:40:15,838][19107] Updated weights for policy 0, policy_version 198185 (0.0035) [2024-06-18 19:40:19,295][19107] Updated weights for policy 0, policy_version 198195 (0.0043) [2024-06-18 19:40:20,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3247259648. Throughput: 0: 41968.4. Samples: 471343340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:20,501][18875] Avg episode reward: [(0, '0.790')] [2024-06-18 19:40:23,610][19107] Updated weights for policy 0, policy_version 198205 (0.0042) [2024-06-18 19:40:25,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3247439872. Throughput: 0: 41971.1. Samples: 471597640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:25,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 19:40:27,391][19107] Updated weights for policy 0, policy_version 198215 (0.0042) [2024-06-18 19:40:30,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3247652864. Throughput: 0: 41889.0. Samples: 471722500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:30,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 19:40:31,215][19107] Updated weights for policy 0, policy_version 198225 (0.0027) [2024-06-18 19:40:35,015][19107] Updated weights for policy 0, policy_version 198235 (0.0028) [2024-06-18 19:40:35,504][18875] Fps is (10 sec: 45859.0, 60 sec: 42322.8, 300 sec: 41931.4). Total num frames: 3247898624. Throughput: 0: 41955.5. Samples: 471974960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:35,505][18875] Avg episode reward: [(0, '0.721')] [2024-06-18 19:40:35,523][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000198236_3247898624.pth... 
[2024-06-18 19:40:35,589][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000197621_3237822464.pth [2024-06-18 19:40:39,072][19107] Updated weights for policy 0, policy_version 198245 (0.0034) [2024-06-18 19:40:40,500][18875] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3248095232. Throughput: 0: 41831.1. Samples: 472228100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:40,501][18875] Avg episode reward: [(0, '0.784')] [2024-06-18 19:40:42,683][19107] Updated weights for policy 0, policy_version 198255 (0.0030) [2024-06-18 19:40:45,504][18875] Fps is (10 sec: 39321.8, 60 sec: 42049.8, 300 sec: 41931.7). Total num frames: 3248291840. Throughput: 0: 41702.6. Samples: 472347840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:45,504][18875] Avg episode reward: [(0, '0.775')] [2024-06-18 19:40:47,295][19107] Updated weights for policy 0, policy_version 198265 (0.0034) [2024-06-18 19:40:50,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41781.7, 300 sec: 41876.4). Total num frames: 3248504832. Throughput: 0: 41844.1. Samples: 472601520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:50,501][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 19:40:50,925][19107] Updated weights for policy 0, policy_version 198275 (0.0032) [2024-06-18 19:40:54,959][19107] Updated weights for policy 0, policy_version 198285 (0.0034) [2024-06-18 19:40:55,500][18875] Fps is (10 sec: 44252.5, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3248734208. Throughput: 0: 41818.6. Samples: 472852740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:40:55,501][18875] Avg episode reward: [(0, '0.285')] [2024-06-18 19:40:57,708][19087] Signal inference workers to stop experience collection... 
(6900 times) [2024-06-18 19:40:57,745][19107] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-06-18 19:40:57,761][19087] Signal inference workers to resume experience collection... (6900 times) [2024-06-18 19:40:57,761][19107] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-06-18 19:40:58,661][19107] Updated weights for policy 0, policy_version 198295 (0.0034) [2024-06-18 19:41:00,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41779.4, 300 sec: 41876.4). Total num frames: 3248930816. Throughput: 0: 41745.9. Samples: 472973700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:41:00,500][18875] Avg episode reward: [(0, '0.399')] [2024-06-18 19:41:02,813][19107] Updated weights for policy 0, policy_version 198305 (0.0043) [2024-06-18 19:41:05,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3249127424. Throughput: 0: 41800.0. Samples: 473224340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:41:05,501][18875] Avg episode reward: [(0, '0.706')] [2024-06-18 19:41:06,466][19107] Updated weights for policy 0, policy_version 198315 (0.0037) [2024-06-18 19:41:10,461][19107] Updated weights for policy 0, policy_version 198325 (0.0051) [2024-06-18 19:41:10,500][18875] Fps is (10 sec: 42597.4, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3249356800. Throughput: 0: 41768.8. Samples: 473477240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:41:10,501][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 19:41:14,646][19107] Updated weights for policy 0, policy_version 198335 (0.0036) [2024-06-18 19:41:15,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3249537024. Throughput: 0: 41743.6. Samples: 473600960. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 19:41:15,500][18875] Avg episode reward: [(0, '0.742')] [2024-06-18 19:41:18,146][19107] Updated weights for policy 0, policy_version 198345 (0.0025) [2024-06-18 19:41:20,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3249782784. Throughput: 0: 41696.3. Samples: 473851140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:41:20,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 19:41:22,587][19107] Updated weights for policy 0, policy_version 198355 (0.0030) [2024-06-18 19:41:25,500][18875] Fps is (10 sec: 44235.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3249979392. Throughput: 0: 41772.4. Samples: 474107860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:41:25,501][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 19:41:25,838][19107] Updated weights for policy 0, policy_version 198365 (0.0034) [2024-06-18 19:41:30,402][19107] Updated weights for policy 0, policy_version 198375 (0.0027) [2024-06-18 19:41:30,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3250176000. Throughput: 0: 41746.4. Samples: 474226280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:41:30,501][18875] Avg episode reward: [(0, '0.518')] [2024-06-18 19:41:33,785][19107] Updated weights for policy 0, policy_version 198385 (0.0049) [2024-06-18 19:41:35,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41781.7, 300 sec: 42043.0). Total num frames: 3250405376. Throughput: 0: 41843.1. Samples: 474484460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:41:35,501][18875] Avg episode reward: [(0, '0.741')] [2024-06-18 19:41:37,916][19107] Updated weights for policy 0, policy_version 198395 (0.0036) [2024-06-18 19:41:40,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 3250601984. Throughput: 0: 41931.2. Samples: 474739640. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:41:40,500][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 19:41:41,634][19107] Updated weights for policy 0, policy_version 198405 (0.0026) [2024-06-18 19:41:45,504][18875] Fps is (10 sec: 40945.2, 60 sec: 42052.2, 300 sec: 41987.0). Total num frames: 3250814976. Throughput: 0: 41985.4. Samples: 474863200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:41:45,505][18875] Avg episode reward: [(0, '0.666')] [2024-06-18 19:41:46,178][19107] Updated weights for policy 0, policy_version 198415 (0.0038) [2024-06-18 19:41:49,519][19107] Updated weights for policy 0, policy_version 198425 (0.0031) [2024-06-18 19:41:50,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3251044352. Throughput: 0: 42016.4. Samples: 475115080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:41:50,501][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 19:41:54,128][19107] Updated weights for policy 0, policy_version 198435 (0.0031) [2024-06-18 19:41:55,501][18875] Fps is (10 sec: 42612.7, 60 sec: 41779.0, 300 sec: 41987.9). Total num frames: 3251240960. Throughput: 0: 42039.9. Samples: 475369040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:41:55,501][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 19:41:57,126][19107] Updated weights for policy 0, policy_version 198445 (0.0031) [2024-06-18 19:42:00,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3251437568. Throughput: 0: 42056.4. Samples: 475493500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:42:00,500][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 19:42:01,889][19107] Updated weights for policy 0, policy_version 198455 (0.0037) [2024-06-18 19:42:04,987][19107] Updated weights for policy 0, policy_version 198465 (0.0031) [2024-06-18 19:42:05,500][18875] Fps is (10 sec: 42599.8, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3251666944. Throughput: 0: 42181.8. Samples: 475749320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:42:05,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 19:42:09,405][19107] Updated weights for policy 0, policy_version 198475 (0.0047) [2024-06-18 19:42:10,504][18875] Fps is (10 sec: 42582.5, 60 sec: 41776.8, 300 sec: 41931.4). Total num frames: 3251863552. Throughput: 0: 42310.5. Samples: 476011980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:42:10,505][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 19:42:12,698][19107] Updated weights for policy 0, policy_version 198485 (0.0041) [2024-06-18 19:42:15,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 3252076544. Throughput: 0: 42371.1. Samples: 476132980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:42:15,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 19:42:16,876][19107] Updated weights for policy 0, policy_version 198495 (0.0045) [2024-06-18 19:42:20,271][19107] Updated weights for policy 0, policy_version 198505 (0.0036) [2024-06-18 19:42:20,500][18875] Fps is (10 sec: 44253.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3252305920. Throughput: 0: 42350.8. Samples: 476390240. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 19:42:20,500][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 19:42:24,739][19107] Updated weights for policy 0, policy_version 198515 (0.0032) [2024-06-18 19:42:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3252486144. Throughput: 0: 42258.9. Samples: 476641300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:42:25,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 19:42:27,945][19107] Updated weights for policy 0, policy_version 198525 (0.0029) [2024-06-18 19:42:30,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3252715520. Throughput: 0: 42094.5. Samples: 476757300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:42:30,501][18875] Avg episode reward: [(0, '0.372')] [2024-06-18 19:42:32,566][19107] Updated weights for policy 0, policy_version 198535 (0.0046) [2024-06-18 19:42:34,225][19087] Signal inference workers to stop experience collection... (6950 times) [2024-06-18 19:42:34,259][19107] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-06-18 19:42:34,282][19087] Signal inference workers to resume experience collection... (6950 times) [2024-06-18 19:42:34,282][19107] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-06-18 19:42:35,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3252944896. Throughput: 0: 42356.0. Samples: 477021100. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:42:35,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 19:42:35,530][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000198544_3252944896.pth... 
[2024-06-18 19:42:35,598][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000197929_3242868736.pth [2024-06-18 19:42:36,115][19107] Updated weights for policy 0, policy_version 198545 (0.0040) [2024-06-18 19:42:40,231][19107] Updated weights for policy 0, policy_version 198555 (0.0033) [2024-06-18 19:42:40,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3253141504. Throughput: 0: 42280.6. Samples: 477271660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:42:40,504][18875] Avg episode reward: [(0, '0.638')] [2024-06-18 19:42:43,905][19107] Updated weights for policy 0, policy_version 198565 (0.0038) [2024-06-18 19:42:45,501][18875] Fps is (10 sec: 39321.1, 60 sec: 42054.7, 300 sec: 41987.5). Total num frames: 3253338112. Throughput: 0: 42259.8. Samples: 477395200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:42:45,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 19:42:47,963][19107] Updated weights for policy 0, policy_version 198575 (0.0040) [2024-06-18 19:42:50,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41932.4). Total num frames: 3253551104. Throughput: 0: 42245.6. Samples: 477650380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:42:50,501][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 19:42:51,563][19107] Updated weights for policy 0, policy_version 198585 (0.0034) [2024-06-18 19:42:55,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41779.4, 300 sec: 41876.4). Total num frames: 3253747712. Throughput: 0: 41934.6. Samples: 477898880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:42:55,500][18875] Avg episode reward: [(0, '0.704')] [2024-06-18 19:42:55,738][19107] Updated weights for policy 0, policy_version 198595 (0.0040) [2024-06-18 19:42:59,254][19107] Updated weights for policy 0, policy_version 198605 (0.0033) [2024-06-18 19:43:00,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 3253993472. Throughput: 0: 42119.5. Samples: 478028360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:43:00,501][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 19:43:03,243][19107] Updated weights for policy 0, policy_version 198615 (0.0024) [2024-06-18 19:43:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3254157312. Throughput: 0: 42028.0. Samples: 478281500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:43:05,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 19:43:07,090][19107] Updated weights for policy 0, policy_version 198625 (0.0033) [2024-06-18 19:43:10,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42327.9, 300 sec: 42043.0). Total num frames: 3254403072. Throughput: 0: 42115.2. Samples: 478536480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:43:10,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 19:43:11,045][19107] Updated weights for policy 0, policy_version 198635 (0.0043) [2024-06-18 19:43:14,695][19107] Updated weights for policy 0, policy_version 198645 (0.0032) [2024-06-18 19:43:15,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3254599680. Throughput: 0: 42359.6. Samples: 478663480. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:43:15,501][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 19:43:18,938][19107] Updated weights for policy 0, policy_version 198655 (0.0032) [2024-06-18 19:43:20,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3254812672. Throughput: 0: 41923.5. Samples: 478907660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:43:20,504][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 19:43:22,599][19107] Updated weights for policy 0, policy_version 198665 (0.0037) [2024-06-18 19:43:25,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3255025664. Throughput: 0: 41987.2. Samples: 479161080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 19:43:25,501][18875] Avg episode reward: [(0, '0.347')] [2024-06-18 19:43:26,671][19107] Updated weights for policy 0, policy_version 198675 (0.0039) [2024-06-18 19:43:30,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 3255238656. Throughput: 0: 41986.0. Samples: 479284560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:43:30,501][18875] Avg episode reward: [(0, '0.307')] [2024-06-18 19:43:30,560][19107] Updated weights for policy 0, policy_version 198685 (0.0044) [2024-06-18 19:43:34,506][19107] Updated weights for policy 0, policy_version 198695 (0.0038) [2024-06-18 19:43:35,502][18875] Fps is (10 sec: 44230.5, 60 sec: 42051.3, 300 sec: 41987.3). Total num frames: 3255468032. Throughput: 0: 41999.2. Samples: 479540400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:43:35,502][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 19:43:38,445][19107] Updated weights for policy 0, policy_version 198705 (0.0033) [2024-06-18 19:43:40,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3255648256. Throughput: 0: 42070.5. Samples: 479792060. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:43:40,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 19:43:42,431][19107] Updated weights for policy 0, policy_version 198715 (0.0027) [2024-06-18 19:43:45,500][18875] Fps is (10 sec: 39327.6, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3255861248. Throughput: 0: 42016.2. Samples: 479919080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:43:45,500][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 19:43:46,172][19107] Updated weights for policy 0, policy_version 198725 (0.0026) [2024-06-18 19:43:50,150][19107] Updated weights for policy 0, policy_version 198735 (0.0025) [2024-06-18 19:43:50,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 3256074240. Throughput: 0: 42070.7. Samples: 480174680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:43:50,501][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 19:43:53,675][19107] Updated weights for policy 0, policy_version 198745 (0.0027) [2024-06-18 19:43:55,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.6). Total num frames: 3256287232. Throughput: 0: 41988.1. Samples: 480425940. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:43:55,501][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 19:43:56,733][19087] Signal inference workers to stop experience collection... (7000 times) [2024-06-18 19:43:56,733][19087] Signal inference workers to resume experience collection... (7000 times) [2024-06-18 19:43:56,776][19107] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-06-18 19:43:56,776][19107] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-06-18 19:43:57,866][19107] Updated weights for policy 0, policy_version 198755 (0.0035) [2024-06-18 19:44:00,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3256500224. Throughput: 0: 42020.1. 
Samples: 480554380. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:44:00,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 19:44:01,592][19107] Updated weights for policy 0, policy_version 198765 (0.0034) [2024-06-18 19:44:05,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42043.5). Total num frames: 3256713216. Throughput: 0: 42197.9. Samples: 480806560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:44:05,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 19:44:05,700][19107] Updated weights for policy 0, policy_version 198775 (0.0037) [2024-06-18 19:44:09,454][19107] Updated weights for policy 0, policy_version 198785 (0.0035) [2024-06-18 19:44:10,504][18875] Fps is (10 sec: 44220.5, 60 sec: 42322.8, 300 sec: 42098.0). Total num frames: 3256942592. Throughput: 0: 42139.8. Samples: 481057520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:44:10,504][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 19:44:13,605][19107] Updated weights for policy 0, policy_version 198795 (0.0034) [2024-06-18 19:44:15,504][18875] Fps is (10 sec: 40945.1, 60 sec: 42049.7, 300 sec: 41987.0). Total num frames: 3257122816. Throughput: 0: 42260.5. Samples: 481186440. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:44:15,505][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 19:44:17,403][19107] Updated weights for policy 0, policy_version 198805 (0.0038) [2024-06-18 19:44:20,500][18875] Fps is (10 sec: 40974.7, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3257352192. Throughput: 0: 42131.6. Samples: 481436260. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:44:20,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 19:44:21,171][19107] Updated weights for policy 0, policy_version 198815 (0.0039) [2024-06-18 19:44:24,912][19107] Updated weights for policy 0, policy_version 198825 (0.0042) [2024-06-18 19:44:25,500][18875] Fps is (10 sec: 44252.8, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3257565184. Throughput: 0: 42246.3. Samples: 481693140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:44:25,501][18875] Avg episode reward: [(0, '0.662')] [2024-06-18 19:44:28,722][19107] Updated weights for policy 0, policy_version 198835 (0.0048) [2024-06-18 19:44:30,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3257761792. Throughput: 0: 42307.0. Samples: 481822900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:44:30,501][18875] Avg episode reward: [(0, '0.679')] [2024-06-18 19:44:32,502][19107] Updated weights for policy 0, policy_version 198845 (0.0034) [2024-06-18 19:44:35,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42053.2, 300 sec: 42043.0). Total num frames: 3257991168. Throughput: 0: 42315.0. Samples: 482078860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-18 19:44:35,504][18875] Avg episode reward: [(0, '0.679')] [2024-06-18 19:44:35,595][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000198853_3258007552.pth... [2024-06-18 19:44:35,652][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000198236_3247898624.pth [2024-06-18 19:44:36,400][19107] Updated weights for policy 0, policy_version 198855 (0.0030) [2024-06-18 19:44:40,127][19107] Updated weights for policy 0, policy_version 198865 (0.0029) [2024-06-18 19:44:40,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3258204160. Throughput: 0: 42345.7. Samples: 482331500. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:44:40,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 19:44:43,971][19107] Updated weights for policy 0, policy_version 198875 (0.0034) [2024-06-18 19:44:45,504][18875] Fps is (10 sec: 39307.9, 60 sec: 42049.7, 300 sec: 41987.5). Total num frames: 3258384384. Throughput: 0: 42405.0. Samples: 482462760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:44:45,504][18875] Avg episode reward: [(0, '0.392')] [2024-06-18 19:44:47,821][19107] Updated weights for policy 0, policy_version 198885 (0.0040) [2024-06-18 19:44:50,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42098.6). Total num frames: 3258630144. Throughput: 0: 42347.9. Samples: 482712220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:44:50,501][18875] Avg episode reward: [(0, '0.793')] [2024-06-18 19:44:51,823][19107] Updated weights for policy 0, policy_version 198895 (0.0030) [2024-06-18 19:44:55,500][18875] Fps is (10 sec: 45891.9, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3258843136. Throughput: 0: 42527.5. Samples: 482971100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:44:55,501][18875] Avg episode reward: [(0, '0.656')] [2024-06-18 19:44:55,510][19107] Updated weights for policy 0, policy_version 198905 (0.0036) [2024-06-18 19:44:59,525][19107] Updated weights for policy 0, policy_version 198915 (0.0045) [2024-06-18 19:45:00,500][18875] Fps is (10 sec: 39322.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3259023360. Throughput: 0: 42416.0. Samples: 483095000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:45:00,500][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 19:45:03,434][19107] Updated weights for policy 0, policy_version 198925 (0.0032) [2024-06-18 19:45:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3259269120. Throughput: 0: 42428.1. Samples: 483345520. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:45:05,501][18875] Avg episode reward: [(0, '0.699')] [2024-06-18 19:45:07,412][19107] Updated weights for policy 0, policy_version 198935 (0.0043) [2024-06-18 19:45:10,502][18875] Fps is (10 sec: 44230.8, 60 sec: 42053.9, 300 sec: 42098.4). Total num frames: 3259465728. Throughput: 0: 42515.3. Samples: 483606380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:45:10,502][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 19:45:11,103][19107] Updated weights for policy 0, policy_version 198945 (0.0036) [2024-06-18 19:45:15,322][19107] Updated weights for policy 0, policy_version 198955 (0.0029) [2024-06-18 19:45:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42601.0, 300 sec: 42098.6). Total num frames: 3259678720. Throughput: 0: 42241.0. Samples: 483723740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:45:15,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 19:45:18,795][19107] Updated weights for policy 0, policy_version 198965 (0.0028) [2024-06-18 19:45:19,738][19087] Signal inference workers to stop experience collection... (7050 times) [2024-06-18 19:45:19,739][19087] Signal inference workers to resume experience collection... (7050 times) [2024-06-18 19:45:19,757][19107] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-06-18 19:45:19,757][19107] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-06-18 19:45:20,500][18875] Fps is (10 sec: 44242.0, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 3259908096. Throughput: 0: 42260.9. Samples: 483980600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:45:20,504][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 19:45:23,329][19107] Updated weights for policy 0, policy_version 198975 (0.0049) [2024-06-18 19:45:25,500][18875] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3260104704. Throughput: 0: 42360.4. 
Samples: 484237720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:45:25,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 19:45:26,559][19107] Updated weights for policy 0, policy_version 198985 (0.0033) [2024-06-18 19:45:30,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42043.5). Total num frames: 3260301312. Throughput: 0: 42069.6. Samples: 484355740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:45:30,501][18875] Avg episode reward: [(0, '0.712')] [2024-06-18 19:45:30,990][19107] Updated weights for policy 0, policy_version 198995 (0.0031) [2024-06-18 19:45:34,309][19107] Updated weights for policy 0, policy_version 199005 (0.0035) [2024-06-18 19:45:35,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3260547072. Throughput: 0: 42209.8. Samples: 484611660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:45:35,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 19:45:38,882][19107] Updated weights for policy 0, policy_version 199015 (0.0028) [2024-06-18 19:45:40,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42210.1). Total num frames: 3260743680. Throughput: 0: 42077.7. Samples: 484864600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-18 19:45:40,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 19:45:42,113][19107] Updated weights for policy 0, policy_version 199025 (0.0038) [2024-06-18 19:45:45,500][18875] Fps is (10 sec: 37683.0, 60 sec: 42327.8, 300 sec: 42098.5). Total num frames: 3260923904. Throughput: 0: 42022.4. Samples: 484986020. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:45:45,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 19:45:46,604][19107] Updated weights for policy 0, policy_version 199035 (0.0040) [2024-06-18 19:45:49,727][19107] Updated weights for policy 0, policy_version 199045 (0.0033) [2024-06-18 19:45:50,500][18875] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42154.1). Total num frames: 3261169664. Throughput: 0: 42236.0. Samples: 485246140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:45:50,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 19:45:54,535][19107] Updated weights for policy 0, policy_version 199055 (0.0042) [2024-06-18 19:45:55,504][18875] Fps is (10 sec: 44221.3, 60 sec: 42049.7, 300 sec: 42153.6). Total num frames: 3261366272. Throughput: 0: 42024.9. Samples: 485497600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:45:55,505][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 19:45:57,486][19107] Updated weights for policy 0, policy_version 199065 (0.0044) [2024-06-18 19:46:00,504][18875] Fps is (10 sec: 39307.1, 60 sec: 42322.7, 300 sec: 42153.6). Total num frames: 3261562880. Throughput: 0: 42070.8. Samples: 485617080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:00,504][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 19:46:02,285][19107] Updated weights for policy 0, policy_version 199075 (0.0039) [2024-06-18 19:46:05,264][19107] Updated weights for policy 0, policy_version 199085 (0.0035) [2024-06-18 19:46:05,500][18875] Fps is (10 sec: 44253.2, 60 sec: 42325.3, 300 sec: 42209.7). Total num frames: 3261808640. Throughput: 0: 42099.7. Samples: 485875080. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:05,501][18875] Avg episode reward: [(0, '0.457')] [2024-06-18 19:46:09,993][19107] Updated weights for policy 0, policy_version 199095 (0.0034) [2024-06-18 19:46:10,500][18875] Fps is (10 sec: 42613.2, 60 sec: 42053.1, 300 sec: 42209.6). Total num frames: 3261988864. Throughput: 0: 41982.2. Samples: 486126920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:10,501][18875] Avg episode reward: [(0, '0.303')] [2024-06-18 19:46:13,452][19107] Updated weights for policy 0, policy_version 199105 (0.0038) [2024-06-18 19:46:15,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3262201856. Throughput: 0: 42124.8. Samples: 486251360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:15,501][18875] Avg episode reward: [(0, '0.372')] [2024-06-18 19:46:17,821][19107] Updated weights for policy 0, policy_version 199115 (0.0045) [2024-06-18 19:46:20,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3262414848. Throughput: 0: 42144.0. Samples: 486508140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:20,501][18875] Avg episode reward: [(0, '0.423')] [2024-06-18 19:46:21,232][19107] Updated weights for policy 0, policy_version 199125 (0.0035) [2024-06-18 19:46:25,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3262611456. Throughput: 0: 42098.3. Samples: 486759020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:25,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 19:46:25,597][19107] Updated weights for policy 0, policy_version 199135 (0.0036) [2024-06-18 19:46:29,028][19107] Updated weights for policy 0, policy_version 199145 (0.0047) [2024-06-18 19:46:30,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3262840832. Throughput: 0: 42187.3. Samples: 486884440. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:30,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 19:46:33,449][19107] Updated weights for policy 0, policy_version 199155 (0.0029) [2024-06-18 19:46:35,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41506.3, 300 sec: 42154.1). Total num frames: 3263037440. Throughput: 0: 41912.0. Samples: 487132180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:35,500][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 19:46:35,540][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000199161_3263053824.pth... [2024-06-18 19:46:35,583][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000198544_3252944896.pth [2024-06-18 19:46:37,143][19107] Updated weights for policy 0, policy_version 199165 (0.0031) [2024-06-18 19:46:40,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42154.6). Total num frames: 3263250432. Throughput: 0: 41994.4. Samples: 487387200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:40,510][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 19:46:41,155][19107] Updated weights for policy 0, policy_version 199175 (0.0037) [2024-06-18 19:46:45,090][19107] Updated weights for policy 0, policy_version 199185 (0.0032) [2024-06-18 19:46:45,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3263447040. Throughput: 0: 42103.3. Samples: 487511580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:45,501][18875] Avg episode reward: [(0, '0.719')] [2024-06-18 19:46:49,131][19107] Updated weights for policy 0, policy_version 199195 (0.0033) [2024-06-18 19:46:50,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.0, 300 sec: 42098.6). Total num frames: 3263660032. Throughput: 0: 41865.7. Samples: 487759040. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 19:46:50,501][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 19:46:52,800][19107] Updated weights for policy 0, policy_version 199205 (0.0028) [2024-06-18 19:46:55,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41781.7, 300 sec: 42154.1). Total num frames: 3263873024. Throughput: 0: 41864.9. Samples: 488010840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:46:55,501][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 19:46:56,721][19107] Updated weights for policy 0, policy_version 199215 (0.0028) [2024-06-18 19:46:59,305][19087] Signal inference workers to stop experience collection... (7100 times) [2024-06-18 19:46:59,358][19087] Signal inference workers to resume experience collection... (7100 times) [2024-06-18 19:46:59,364][19107] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-06-18 19:46:59,377][19107] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-06-18 19:47:00,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42054.8, 300 sec: 42098.6). Total num frames: 3264086016. Throughput: 0: 41866.8. Samples: 488135360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:00,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 19:47:00,626][19107] Updated weights for policy 0, policy_version 199225 (0.0031) [2024-06-18 19:47:04,755][19107] Updated weights for policy 0, policy_version 199235 (0.0040) [2024-06-18 19:47:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 42154.6). Total num frames: 3264299008. Throughput: 0: 41834.7. Samples: 488390700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:05,501][18875] Avg episode reward: [(0, '0.644')] [2024-06-18 19:47:08,371][19107] Updated weights for policy 0, policy_version 199245 (0.0028) [2024-06-18 19:47:10,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3264512000. Throughput: 0: 41817.0. 
Samples: 488640780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:10,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 19:47:12,585][19107] Updated weights for policy 0, policy_version 199255 (0.0041) [2024-06-18 19:47:15,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3264724992. Throughput: 0: 41790.9. Samples: 488765040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:15,501][18875] Avg episode reward: [(0, '0.729')] [2024-06-18 19:47:15,917][19107] Updated weights for policy 0, policy_version 199265 (0.0040) [2024-06-18 19:47:20,478][19107] Updated weights for policy 0, policy_version 199275 (0.0023) [2024-06-18 19:47:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3264921600. Throughput: 0: 41981.7. Samples: 489021360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:20,501][18875] Avg episode reward: [(0, '0.753')] [2024-06-18 19:47:23,854][19107] Updated weights for policy 0, policy_version 199285 (0.0033) [2024-06-18 19:47:25,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3265134592. Throughput: 0: 41868.8. Samples: 489271300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:25,501][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 19:47:28,175][19107] Updated weights for policy 0, policy_version 199295 (0.0026) [2024-06-18 19:47:30,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 3265363968. Throughput: 0: 41887.5. Samples: 489396520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:30,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 19:47:31,769][19107] Updated weights for policy 0, policy_version 199305 (0.0041) [2024-06-18 19:47:35,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3265544192. Throughput: 0: 41897.8. Samples: 489644440. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:35,501][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 19:47:35,911][19107] Updated weights for policy 0, policy_version 199315 (0.0029) [2024-06-18 19:47:39,679][19107] Updated weights for policy 0, policy_version 199325 (0.0034) [2024-06-18 19:47:40,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3265757184. Throughput: 0: 41914.7. Samples: 489897000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:40,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 19:47:43,616][19107] Updated weights for policy 0, policy_version 199335 (0.0031) [2024-06-18 19:47:45,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3265986560. Throughput: 0: 41987.0. Samples: 490024780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:45,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 19:47:47,805][19107] Updated weights for policy 0, policy_version 199345 (0.0037) [2024-06-18 19:47:50,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3266183168. Throughput: 0: 42038.3. Samples: 490282420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:50,501][18875] Avg episode reward: [(0, '0.397')] [2024-06-18 19:47:51,800][19107] Updated weights for policy 0, policy_version 199355 (0.0036) [2024-06-18 19:47:55,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3266379776. Throughput: 0: 41958.1. Samples: 490528900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 19:47:55,501][18875] Avg episode reward: [(0, '0.536')] [2024-06-18 19:47:55,752][19107] Updated weights for policy 0, policy_version 199365 (0.0029) [2024-06-18 19:47:59,451][19107] Updated weights for policy 0, policy_version 199375 (0.0038) [2024-06-18 19:48:00,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42265.1). 
Total num frames: 3266625536. Throughput: 0: 41959.2. Samples: 490653200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:00,501][18875] Avg episode reward: [(0, '0.786')] [2024-06-18 19:48:04,103][19107] Updated weights for policy 0, policy_version 199385 (0.0039) [2024-06-18 19:48:05,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3266805760. Throughput: 0: 41937.4. Samples: 490908540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:05,500][18875] Avg episode reward: [(0, '0.668')] [2024-06-18 19:48:07,302][19107] Updated weights for policy 0, policy_version 199395 (0.0030) [2024-06-18 19:48:10,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3267018752. Throughput: 0: 41738.3. Samples: 491149520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:10,501][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 19:48:11,958][19107] Updated weights for policy 0, policy_version 199405 (0.0047) [2024-06-18 19:48:15,212][19107] Updated weights for policy 0, policy_version 199415 (0.0042) [2024-06-18 19:48:15,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3267215360. Throughput: 0: 41785.4. Samples: 491276860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:15,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 19:48:19,743][19107] Updated weights for policy 0, policy_version 199425 (0.0034) [2024-06-18 19:48:19,834][19087] Signal inference workers to stop experience collection... (7150 times) [2024-06-18 19:48:19,872][19107] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-06-18 19:48:19,904][19087] Signal inference workers to resume experience collection... 
(7150 times) [2024-06-18 19:48:19,905][19107] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-06-18 19:48:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3267428352. Throughput: 0: 41868.8. Samples: 491528540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:20,501][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 19:48:22,933][19107] Updated weights for policy 0, policy_version 199435 (0.0031) [2024-06-18 19:48:25,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3267657728. Throughput: 0: 41700.9. Samples: 491773540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:25,501][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 19:48:27,336][19107] Updated weights for policy 0, policy_version 199445 (0.0034) [2024-06-18 19:48:30,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41987.7). Total num frames: 3267854336. Throughput: 0: 41732.3. Samples: 491902740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:30,501][18875] Avg episode reward: [(0, '0.420')] [2024-06-18 19:48:30,637][19107] Updated weights for policy 0, policy_version 199455 (0.0037) [2024-06-18 19:48:35,054][19107] Updated weights for policy 0, policy_version 199465 (0.0039) [2024-06-18 19:48:35,504][18875] Fps is (10 sec: 39307.0, 60 sec: 41776.6, 300 sec: 42042.5). Total num frames: 3268050944. Throughput: 0: 41605.8. Samples: 492154840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:35,505][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 19:48:35,609][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000199467_3268067328.pth... 
[2024-06-18 19:48:35,664][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000198853_3258007552.pth [2024-06-18 19:48:38,333][19107] Updated weights for policy 0, policy_version 199475 (0.0032) [2024-06-18 19:48:40,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42098.5). Total num frames: 3268280320. Throughput: 0: 41746.4. Samples: 492407480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:40,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 19:48:42,725][19107] Updated weights for policy 0, policy_version 199485 (0.0030) [2024-06-18 19:48:45,500][18875] Fps is (10 sec: 44253.7, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3268493312. Throughput: 0: 41889.4. Samples: 492538220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:45,501][18875] Avg episode reward: [(0, '0.697')] [2024-06-18 19:48:46,449][19107] Updated weights for policy 0, policy_version 199495 (0.0037) [2024-06-18 19:48:50,374][19107] Updated weights for policy 0, policy_version 199505 (0.0032) [2024-06-18 19:48:50,501][18875] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3268689920. Throughput: 0: 41758.0. Samples: 492787660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:50,501][18875] Avg episode reward: [(0, '0.693')] [2024-06-18 19:48:54,265][19107] Updated weights for policy 0, policy_version 199515 (0.0035) [2024-06-18 19:48:55,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3268919296. Throughput: 0: 41858.7. Samples: 493033160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:48:55,501][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 19:48:58,029][19107] Updated weights for policy 0, policy_version 199525 (0.0024) [2024-06-18 19:49:00,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3269115904. Throughput: 0: 41992.9. Samples: 493166540. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 19:49:00,501][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 19:49:01,811][19107] Updated weights for policy 0, policy_version 199535 (0.0040) [2024-06-18 19:49:05,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.1, 300 sec: 41932.4). Total num frames: 3269312512. Throughput: 0: 41910.3. Samples: 493414500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:05,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 19:49:06,054][19107] Updated weights for policy 0, policy_version 199545 (0.0038) [2024-06-18 19:49:09,399][19107] Updated weights for policy 0, policy_version 199555 (0.0033) [2024-06-18 19:49:10,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42154.6). Total num frames: 3269558272. Throughput: 0: 42187.7. Samples: 493671980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:10,501][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 19:49:13,738][19107] Updated weights for policy 0, policy_version 199565 (0.0035) [2024-06-18 19:49:15,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3269754880. Throughput: 0: 42209.8. Samples: 493802180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:15,501][18875] Avg episode reward: [(0, '0.693')] [2024-06-18 19:49:16,997][19107] Updated weights for policy 0, policy_version 199575 (0.0028) [2024-06-18 19:49:20,500][18875] Fps is (10 sec: 37682.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3269935104. Throughput: 0: 42166.5. Samples: 494052180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:20,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 19:49:21,346][19107] Updated weights for policy 0, policy_version 199585 (0.0044) [2024-06-18 19:49:25,202][19107] Updated weights for policy 0, policy_version 199595 (0.0044) [2024-06-18 19:49:25,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 42043.0). 
Total num frames: 3270164480. Throughput: 0: 42051.5. Samples: 494299800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:25,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 19:49:28,994][19107] Updated weights for policy 0, policy_version 199605 (0.0033) [2024-06-18 19:49:30,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3270377472. Throughput: 0: 42087.0. Samples: 494432140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:30,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 19:49:31,486][19087] Signal inference workers to stop experience collection... (7200 times) [2024-06-18 19:49:31,494][19087] Signal inference workers to resume experience collection... (7200 times) [2024-06-18 19:49:31,513][19107] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-06-18 19:49:31,513][19107] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-06-18 19:49:32,978][19107] Updated weights for policy 0, policy_version 199615 (0.0046) [2024-06-18 19:49:35,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42054.9, 300 sec: 41931.9). Total num frames: 3270574080. Throughput: 0: 42053.9. Samples: 494680080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:35,501][18875] Avg episode reward: [(0, '0.687')] [2024-06-18 19:49:37,166][19107] Updated weights for policy 0, policy_version 199625 (0.0040) [2024-06-18 19:49:40,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42099.1). Total num frames: 3270803456. Throughput: 0: 42224.0. Samples: 494933240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:40,509][18875] Avg episode reward: [(0, '0.786')] [2024-06-18 19:49:40,644][19107] Updated weights for policy 0, policy_version 199635 (0.0042) [2024-06-18 19:49:44,761][19107] Updated weights for policy 0, policy_version 199645 (0.0031) [2024-06-18 19:49:45,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3271016448. Throughput: 0: 42120.4. Samples: 495061960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:45,509][18875] Avg episode reward: [(0, '0.713')] [2024-06-18 19:49:48,426][19107] Updated weights for policy 0, policy_version 199655 (0.0036) [2024-06-18 19:49:50,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3271229440. Throughput: 0: 42230.6. Samples: 495314880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:50,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 19:49:52,367][19107] Updated weights for policy 0, policy_version 199665 (0.0028) [2024-06-18 19:49:55,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3271442432. Throughput: 0: 42099.0. Samples: 495566440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:49:55,501][18875] Avg episode reward: [(0, '0.526')] [2024-06-18 19:49:56,077][19107] Updated weights for policy 0, policy_version 199675 (0.0029) [2024-06-18 19:50:00,097][19107] Updated weights for policy 0, policy_version 199685 (0.0044) [2024-06-18 19:50:00,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3271639040. Throughput: 0: 42095.1. Samples: 495696460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:50:00,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 19:50:03,860][19107] Updated weights for policy 0, policy_version 199695 (0.0047) [2024-06-18 19:50:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 41987.6). 
Total num frames: 3271852032. Throughput: 0: 41911.6. Samples: 495938200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:50:05,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 19:50:08,257][19107] Updated weights for policy 0, policy_version 199705 (0.0036) [2024-06-18 19:50:10,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3272065024. Throughput: 0: 42073.7. Samples: 496193120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-18 19:50:10,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 19:50:11,589][19107] Updated weights for policy 0, policy_version 199715 (0.0038) [2024-06-18 19:50:15,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3272261632. Throughput: 0: 42004.6. Samples: 496322340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:50:15,501][18875] Avg episode reward: [(0, '0.394')] [2024-06-18 19:50:16,129][19107] Updated weights for policy 0, policy_version 199725 (0.0026) [2024-06-18 19:50:19,497][19107] Updated weights for policy 0, policy_version 199735 (0.0028) [2024-06-18 19:50:20,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3272474624. Throughput: 0: 42008.8. Samples: 496570480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:50:20,508][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 19:50:23,774][19107] Updated weights for policy 0, policy_version 199745 (0.0035) [2024-06-18 19:50:25,504][18875] Fps is (10 sec: 44220.4, 60 sec: 42322.7, 300 sec: 42042.5). Total num frames: 3272704000. Throughput: 0: 42064.2. Samples: 496826280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:50:25,505][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 19:50:27,367][19107] Updated weights for policy 0, policy_version 199755 (0.0038) [2024-06-18 19:50:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41820.9). 
Total num frames: 3272884224. Throughput: 0: 42004.4. Samples: 496952160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:50:30,501][18875] Avg episode reward: [(0, '0.778')] [2024-06-18 19:50:31,577][19107] Updated weights for policy 0, policy_version 199765 (0.0040) [2024-06-18 19:50:35,130][19107] Updated weights for policy 0, policy_version 199775 (0.0051) [2024-06-18 19:50:35,500][18875] Fps is (10 sec: 40974.8, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3273113600. Throughput: 0: 41997.3. Samples: 497204760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:50:35,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 19:50:35,519][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000199776_3273129984.pth... [2024-06-18 19:50:35,584][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000199161_3263053824.pth [2024-06-18 19:50:39,583][19107] Updated weights for policy 0, policy_version 199785 (0.0046) [2024-06-18 19:50:40,500][18875] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3273326592. Throughput: 0: 41785.5. Samples: 497446780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:50:40,501][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 19:50:42,968][19107] Updated weights for policy 0, policy_version 199795 (0.0038) [2024-06-18 19:50:45,500][18875] Fps is (10 sec: 37683.8, 60 sec: 41233.2, 300 sec: 41765.3). Total num frames: 3273490432. Throughput: 0: 41526.0. Samples: 497565120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:50:45,500][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 19:50:45,764][19087] Signal inference workers to stop experience collection... (7250 times) [2024-06-18 19:50:45,764][19087] Signal inference workers to resume experience collection... 
(7250 times) [2024-06-18 19:50:45,804][19107] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-06-18 19:50:45,804][19107] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-06-18 19:50:47,745][19107] Updated weights for policy 0, policy_version 199805 (0.0038) [2024-06-18 19:50:50,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 41932.4). Total num frames: 3273736192. Throughput: 0: 41752.9. Samples: 497817080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:50:50,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 19:50:50,978][19107] Updated weights for policy 0, policy_version 199815 (0.0044) [2024-06-18 19:50:55,343][19107] Updated weights for policy 0, policy_version 199825 (0.0027) [2024-06-18 19:50:55,500][18875] Fps is (10 sec: 44235.9, 60 sec: 41506.1, 300 sec: 41932.4). Total num frames: 3273932800. Throughput: 0: 41950.2. Samples: 498080880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:50:55,504][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 19:50:58,564][19107] Updated weights for policy 0, policy_version 199835 (0.0031) [2024-06-18 19:51:00,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3274145792. Throughput: 0: 41728.9. Samples: 498200140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:51:00,501][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 19:51:02,988][19107] Updated weights for policy 0, policy_version 199845 (0.0028) [2024-06-18 19:51:05,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3274391552. Throughput: 0: 41871.7. Samples: 498454700. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:51:05,504][18875] Avg episode reward: [(0, '0.424')] [2024-06-18 19:51:06,344][19107] Updated weights for policy 0, policy_version 199855 (0.0033) [2024-06-18 19:51:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3274571776. Throughput: 0: 42024.4. Samples: 498717220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:51:10,501][18875] Avg episode reward: [(0, '0.424')] [2024-06-18 19:51:10,566][19107] Updated weights for policy 0, policy_version 199865 (0.0036) [2024-06-18 19:51:14,114][19107] Updated weights for policy 0, policy_version 199875 (0.0032) [2024-06-18 19:51:15,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3274768384. Throughput: 0: 41779.7. Samples: 498832240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 19:51:15,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 19:51:18,195][19107] Updated weights for policy 0, policy_version 199885 (0.0030) [2024-06-18 19:51:20,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 3275014144. Throughput: 0: 41959.2. Samples: 499092920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:51:20,501][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 19:51:21,676][19107] Updated weights for policy 0, policy_version 199895 (0.0036) [2024-06-18 19:51:25,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41781.7, 300 sec: 41931.9). Total num frames: 3275210752. Throughput: 0: 42196.8. Samples: 499345640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:51:25,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 19:51:25,875][19107] Updated weights for policy 0, policy_version 199905 (0.0029) [2024-06-18 19:51:29,456][19107] Updated weights for policy 0, policy_version 199915 (0.0033) [2024-06-18 19:51:30,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3275407360. Throughput: 0: 42336.7. Samples: 499470280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:51:30,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 19:51:33,951][19107] Updated weights for policy 0, policy_version 199925 (0.0036) [2024-06-18 19:51:35,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3275620352. Throughput: 0: 42349.9. Samples: 499722820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:51:35,501][18875] Avg episode reward: [(0, '0.527')] [2024-06-18 19:51:37,236][19107] Updated weights for policy 0, policy_version 199935 (0.0035) [2024-06-18 19:51:40,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3275833344. Throughput: 0: 42112.4. Samples: 499975940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:51:40,501][18875] Avg episode reward: [(0, '0.225')] [2024-06-18 19:51:42,165][19107] Updated weights for policy 0, policy_version 199945 (0.0035) [2024-06-18 19:51:44,981][19107] Updated weights for policy 0, policy_version 199955 (0.0034) [2024-06-18 19:51:45,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42043.0). Total num frames: 3276062720. Throughput: 0: 42330.9. Samples: 500105040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:51:45,501][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 19:51:49,714][19107] Updated weights for policy 0, policy_version 199965 (0.0039) [2024-06-18 19:51:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41931.9). 
Total num frames: 3276242944. Throughput: 0: 42262.6. Samples: 500356520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:51:50,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 19:51:52,966][19107] Updated weights for policy 0, policy_version 199975 (0.0045) [2024-06-18 19:51:55,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3276488704. Throughput: 0: 41987.4. Samples: 500606660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:51:55,501][18875] Avg episode reward: [(0, '0.753')] [2024-06-18 19:51:57,635][19107] Updated weights for policy 0, policy_version 199985 (0.0029) [2024-06-18 19:52:00,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 3276701696. Throughput: 0: 42275.9. Samples: 500734660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:52:00,501][18875] Avg episode reward: [(0, '0.707')] [2024-06-18 19:52:00,676][19107] Updated weights for policy 0, policy_version 199995 (0.0039) [2024-06-18 19:52:05,196][19107] Updated weights for policy 0, policy_version 200005 (0.0033) [2024-06-18 19:52:05,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3276881920. Throughput: 0: 42139.9. Samples: 500989220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:52:05,501][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 19:52:08,398][19107] Updated weights for policy 0, policy_version 200015 (0.0035) [2024-06-18 19:52:10,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 3277127680. Throughput: 0: 42113.3. Samples: 501240740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:52:10,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 19:52:12,895][19107] Updated weights for policy 0, policy_version 200025 (0.0037) [2024-06-18 19:52:15,208][19087] Signal inference workers to stop experience collection... 
(7300 times) [2024-06-18 19:52:15,209][19087] Signal inference workers to resume experience collection... (7300 times) [2024-06-18 19:52:15,256][19107] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-06-18 19:52:15,256][19107] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-06-18 19:52:15,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3277324288. Throughput: 0: 42172.1. Samples: 501368020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:52:15,501][18875] Avg episode reward: [(0, '0.476')] [2024-06-18 19:52:16,169][19107] Updated weights for policy 0, policy_version 200035 (0.0028) [2024-06-18 19:52:20,276][19107] Updated weights for policy 0, policy_version 200045 (0.0038) [2024-06-18 19:52:20,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3277537280. Throughput: 0: 42226.2. Samples: 501623000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:52:20,501][18875] Avg episode reward: [(0, '0.355')] [2024-06-18 19:52:23,923][19107] Updated weights for policy 0, policy_version 200055 (0.0029) [2024-06-18 19:52:25,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3277750272. Throughput: 0: 42313.9. Samples: 501880060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 19:52:25,501][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 19:52:27,889][19107] Updated weights for policy 0, policy_version 200065 (0.0036) [2024-06-18 19:52:30,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3277946880. Throughput: 0: 42156.1. Samples: 502002060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:52:30,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 19:52:31,895][19107] Updated weights for policy 0, policy_version 200075 (0.0039) [2024-06-18 19:52:35,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3278176256. Throughput: 0: 42191.7. Samples: 502255140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:52:35,501][18875] Avg episode reward: [(0, '0.678')] [2024-06-18 19:52:35,607][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000200085_3278192640.pth... [2024-06-18 19:52:35,610][19107] Updated weights for policy 0, policy_version 200085 (0.0024) [2024-06-18 19:52:35,660][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000199467_3268067328.pth [2024-06-18 19:52:39,717][19107] Updated weights for policy 0, policy_version 200095 (0.0034) [2024-06-18 19:52:40,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 41987.4). Total num frames: 3278372864. Throughput: 0: 42293.7. Samples: 502509880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:52:40,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 19:52:43,296][19107] Updated weights for policy 0, policy_version 200105 (0.0037) [2024-06-18 19:52:45,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3278569472. Throughput: 0: 42282.6. Samples: 502637380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:52:45,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 19:52:47,398][19107] Updated weights for policy 0, policy_version 200115 (0.0031) [2024-06-18 19:52:50,500][18875] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42098.6). Total num frames: 3278798848. Throughput: 0: 42204.5. Samples: 502888420. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:52:50,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 19:52:51,108][19107] Updated weights for policy 0, policy_version 200125 (0.0053) [2024-06-18 19:52:55,335][19107] Updated weights for policy 0, policy_version 200135 (0.0025) [2024-06-18 19:52:55,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3279011840. Throughput: 0: 42202.6. Samples: 503139860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:52:55,501][18875] Avg episode reward: [(0, '0.746')] [2024-06-18 19:52:58,810][19107] Updated weights for policy 0, policy_version 200145 (0.0033) [2024-06-18 19:53:00,504][18875] Fps is (10 sec: 39307.5, 60 sec: 41503.7, 300 sec: 41987.0). Total num frames: 3279192064. Throughput: 0: 42200.1. Samples: 503267180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:53:00,505][18875] Avg episode reward: [(0, '0.746')] [2024-06-18 19:53:02,966][19107] Updated weights for policy 0, policy_version 200155 (0.0036) [2024-06-18 19:53:05,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 3279454208. Throughput: 0: 42186.2. Samples: 503521380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:53:05,501][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 19:53:06,695][19107] Updated weights for policy 0, policy_version 200165 (0.0044) [2024-06-18 19:53:10,500][18875] Fps is (10 sec: 44252.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3279634432. Throughput: 0: 42207.5. Samples: 503779400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:53:10,501][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 19:53:10,789][19107] Updated weights for policy 0, policy_version 200175 (0.0031) [2024-06-18 19:53:14,560][19107] Updated weights for policy 0, policy_version 200185 (0.0034) [2024-06-18 19:53:15,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 3279847424. Throughput: 0: 42042.7. Samples: 503893980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:53:15,501][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 19:53:18,591][19107] Updated weights for policy 0, policy_version 200195 (0.0025) [2024-06-18 19:53:20,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3280076800. Throughput: 0: 42185.7. Samples: 504153500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:53:20,501][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 19:53:22,279][19107] Updated weights for policy 0, policy_version 200205 (0.0037) [2024-06-18 19:53:25,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3280257024. Throughput: 0: 42229.9. Samples: 504410220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:53:25,501][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 19:53:26,602][19107] Updated weights for policy 0, policy_version 200215 (0.0034) [2024-06-18 19:53:30,017][19107] Updated weights for policy 0, policy_version 200225 (0.0031) [2024-06-18 19:53:30,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42210.2). Total num frames: 3280502784. Throughput: 0: 41992.9. Samples: 504527060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-18 19:53:30,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 19:53:34,481][19107] Updated weights for policy 0, policy_version 200235 (0.0049) [2024-06-18 19:53:35,504][18875] Fps is (10 sec: 44220.9, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 3280699392. Throughput: 0: 42232.6. Samples: 504789040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:53:35,505][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 19:53:37,643][19107] Updated weights for policy 0, policy_version 200245 (0.0027) [2024-06-18 19:53:40,500][18875] Fps is (10 sec: 37683.2, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3280879616. Throughput: 0: 42167.6. Samples: 505037400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:53:40,501][18875] Avg episode reward: [(0, '0.394')] [2024-06-18 19:53:42,237][19107] Updated weights for policy 0, policy_version 200255 (0.0047) [2024-06-18 19:53:44,839][19087] Signal inference workers to stop experience collection... (7350 times) [2024-06-18 19:53:44,848][19087] Signal inference workers to resume experience collection... (7350 times) [2024-06-18 19:53:44,858][19107] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-06-18 19:53:44,892][19107] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-06-18 19:53:45,500][18875] Fps is (10 sec: 42613.7, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3281125376. Throughput: 0: 42019.3. Samples: 505157900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:53:45,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 19:53:45,732][19107] Updated weights for policy 0, policy_version 200265 (0.0033) [2024-06-18 19:53:50,281][19107] Updated weights for policy 0, policy_version 200275 (0.0042) [2024-06-18 19:53:50,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3281305600. Throughput: 0: 41998.3. 
Samples: 505411300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:53:50,500][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 19:53:53,624][19107] Updated weights for policy 0, policy_version 200285 (0.0035) [2024-06-18 19:53:55,500][18875] Fps is (10 sec: 37683.7, 60 sec: 41506.3, 300 sec: 41987.5). Total num frames: 3281502208. Throughput: 0: 41849.0. Samples: 505662600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:53:55,500][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 19:53:58,220][19107] Updated weights for policy 0, policy_version 200295 (0.0039) [2024-06-18 19:54:00,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42874.1, 300 sec: 42209.6). Total num frames: 3281764352. Throughput: 0: 42088.1. Samples: 505787940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:54:00,500][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 19:54:01,330][19107] Updated weights for policy 0, policy_version 200305 (0.0045) [2024-06-18 19:54:05,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 3281928192. Throughput: 0: 41789.0. Samples: 506034000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:54:05,501][18875] Avg episode reward: [(0, '0.400')] [2024-06-18 19:54:05,815][19107] Updated weights for policy 0, policy_version 200315 (0.0027) [2024-06-18 19:54:09,016][19107] Updated weights for policy 0, policy_version 200325 (0.0037) [2024-06-18 19:54:10,500][18875] Fps is (10 sec: 37683.2, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3282141184. Throughput: 0: 41713.0. Samples: 506287300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:54:10,501][18875] Avg episode reward: [(0, '0.201')] [2024-06-18 19:54:13,629][19107] Updated weights for policy 0, policy_version 200335 (0.0029) [2024-06-18 19:54:15,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 3282386944. Throughput: 0: 41927.7. Samples: 506413800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:54:15,501][18875] Avg episode reward: [(0, '0.401')] [2024-06-18 19:54:16,754][19107] Updated weights for policy 0, policy_version 200345 (0.0028) [2024-06-18 19:54:20,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3282567168. Throughput: 0: 41819.8. Samples: 506670780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:54:20,501][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 19:54:21,363][19107] Updated weights for policy 0, policy_version 200355 (0.0036) [2024-06-18 19:54:24,435][19107] Updated weights for policy 0, policy_version 200365 (0.0038) [2024-06-18 19:54:25,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3282796544. Throughput: 0: 41803.0. Samples: 506918540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:54:25,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 19:54:29,003][19107] Updated weights for policy 0, policy_version 200375 (0.0042) [2024-06-18 19:54:30,500][18875] Fps is (10 sec: 44237.5, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3283009536. Throughput: 0: 42110.8. Samples: 507052880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:54:30,500][18875] Avg episode reward: [(0, '0.761')] [2024-06-18 19:54:32,522][19107] Updated weights for policy 0, policy_version 200385 (0.0039) [2024-06-18 19:54:35,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41508.6, 300 sec: 41987.5). Total num frames: 3283189760. Throughput: 0: 42080.3. Samples: 507304920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:54:35,501][18875] Avg episode reward: [(0, '0.864')] [2024-06-18 19:54:35,521][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000200390_3283189760.pth... 
[2024-06-18 19:54:35,586][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000199776_3273129984.pth [2024-06-18 19:54:36,757][19107] Updated weights for policy 0, policy_version 200395 (0.0037) [2024-06-18 19:54:40,277][19107] Updated weights for policy 0, policy_version 200405 (0.0034) [2024-06-18 19:54:40,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3283435520. Throughput: 0: 42091.0. Samples: 507556700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 19:54:40,501][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 19:54:44,405][19107] Updated weights for policy 0, policy_version 200415 (0.0029) [2024-06-18 19:54:45,500][18875] Fps is (10 sec: 45875.1, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3283648512. Throughput: 0: 42239.4. Samples: 507688720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:54:45,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 19:54:48,015][19107] Updated weights for policy 0, policy_version 200425 (0.0024) [2024-06-18 19:54:50,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3283845120. Throughput: 0: 42326.1. Samples: 507938680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:54:50,501][18875] Avg episode reward: [(0, '0.413')] [2024-06-18 19:54:52,197][19107] Updated weights for policy 0, policy_version 200435 (0.0028) [2024-06-18 19:54:55,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42154.1). Total num frames: 3284074496. Throughput: 0: 42313.6. Samples: 508191420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:54:55,501][18875] Avg episode reward: [(0, '0.389')] [2024-06-18 19:54:55,711][19107] Updated weights for policy 0, policy_version 200445 (0.0034) [2024-06-18 19:54:59,759][19087] Signal inference workers to stop experience collection... 
(7400 times) [2024-06-18 19:54:59,775][19107] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-06-18 19:54:59,872][19087] Signal inference workers to resume experience collection... (7400 times) [2024-06-18 19:54:59,872][19107] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-06-18 19:55:00,021][19107] Updated weights for policy 0, policy_version 200455 (0.0031) [2024-06-18 19:55:00,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3284271104. Throughput: 0: 42349.3. Samples: 508319520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:00,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 19:55:03,511][19107] Updated weights for policy 0, policy_version 200465 (0.0042) [2024-06-18 19:55:05,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 3284500480. Throughput: 0: 42122.6. Samples: 508566300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:05,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 19:55:07,665][19107] Updated weights for policy 0, policy_version 200475 (0.0033) [2024-06-18 19:55:10,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 3284680704. Throughput: 0: 42393.4. Samples: 508826240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:10,501][18875] Avg episode reward: [(0, '0.399')] [2024-06-18 19:55:11,280][19107] Updated weights for policy 0, policy_version 200485 (0.0045) [2024-06-18 19:55:15,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 3284893696. Throughput: 0: 42223.4. Samples: 508952940. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:15,501][18875] Avg episode reward: [(0, '0.424')] [2024-06-18 19:55:15,543][19107] Updated weights for policy 0, policy_version 200495 (0.0036) [2024-06-18 19:55:19,015][19107] Updated weights for policy 0, policy_version 200505 (0.0030) [2024-06-18 19:55:20,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42099.1). Total num frames: 3285123072. Throughput: 0: 42149.9. Samples: 509201660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:20,501][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 19:55:23,219][19107] Updated weights for policy 0, policy_version 200515 (0.0042) [2024-06-18 19:55:25,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3285319680. Throughput: 0: 42257.9. Samples: 509458300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:25,500][18875] Avg episode reward: [(0, '0.707')] [2024-06-18 19:55:26,725][19107] Updated weights for policy 0, policy_version 200525 (0.0031) [2024-06-18 19:55:30,504][18875] Fps is (10 sec: 40945.1, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 3285532672. Throughput: 0: 42111.4. Samples: 509583880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:30,504][18875] Avg episode reward: [(0, '0.780')] [2024-06-18 19:55:30,955][19107] Updated weights for policy 0, policy_version 200535 (0.0023) [2024-06-18 19:55:34,474][19107] Updated weights for policy 0, policy_version 200545 (0.0033) [2024-06-18 19:55:35,500][18875] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 3285762048. Throughput: 0: 42304.0. Samples: 509842360. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:35,501][18875] Avg episode reward: [(0, '0.721')] [2024-06-18 19:55:39,199][19107] Updated weights for policy 0, policy_version 200555 (0.0025) [2024-06-18 19:55:40,500][18875] Fps is (10 sec: 44253.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 3285975040. Throughput: 0: 42199.4. Samples: 510090380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:40,500][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 19:55:42,203][19107] Updated weights for policy 0, policy_version 200565 (0.0031) [2024-06-18 19:55:45,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3286155264. Throughput: 0: 42158.0. Samples: 510216640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 19:55:45,501][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 19:55:46,853][19107] Updated weights for policy 0, policy_version 200575 (0.0047) [2024-06-18 19:55:50,015][19107] Updated weights for policy 0, policy_version 200585 (0.0033) [2024-06-18 19:55:50,504][18875] Fps is (10 sec: 42582.5, 60 sec: 42595.9, 300 sec: 42264.7). Total num frames: 3286401024. Throughput: 0: 42358.0. Samples: 510472560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:55:50,504][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 19:55:54,580][19107] Updated weights for policy 0, policy_version 200595 (0.0028) [2024-06-18 19:55:55,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3286597632. Throughput: 0: 42347.5. Samples: 510731880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:55:55,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 19:55:57,666][19107] Updated weights for policy 0, policy_version 200605 (0.0042) [2024-06-18 19:56:00,500][18875] Fps is (10 sec: 40974.4, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 3286810624. Throughput: 0: 42232.0. Samples: 510853380. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:00,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 19:56:02,242][19107] Updated weights for policy 0, policy_version 200615 (0.0046) [2024-06-18 19:56:05,417][19107] Updated weights for policy 0, policy_version 200625 (0.0039) [2024-06-18 19:56:05,500][18875] Fps is (10 sec: 44237.7, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 3287040000. Throughput: 0: 42299.2. Samples: 511105120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:05,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 19:56:10,025][19107] Updated weights for policy 0, policy_version 200635 (0.0032) [2024-06-18 19:56:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3287220224. Throughput: 0: 42296.7. Samples: 511361660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:10,501][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 19:56:13,318][19107] Updated weights for policy 0, policy_version 200645 (0.0037) [2024-06-18 19:56:15,504][18875] Fps is (10 sec: 39307.1, 60 sec: 42322.8, 300 sec: 42098.0). Total num frames: 3287433216. Throughput: 0: 42165.3. Samples: 511481320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:15,505][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 19:56:16,607][19087] Signal inference workers to stop experience collection... (7450 times) [2024-06-18 19:56:16,608][19087] Signal inference workers to resume experience collection... (7450 times) [2024-06-18 19:56:16,647][19107] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-06-18 19:56:16,647][19107] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-06-18 19:56:17,798][19107] Updated weights for policy 0, policy_version 200655 (0.0030) [2024-06-18 19:56:20,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3287646208. Throughput: 0: 42002.3. 
Samples: 511732460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:20,504][18875] Avg episode reward: [(0, '0.295')] [2024-06-18 19:56:20,978][19107] Updated weights for policy 0, policy_version 200665 (0.0038) [2024-06-18 19:56:25,500][18875] Fps is (10 sec: 40974.3, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 3287842816. Throughput: 0: 42151.8. Samples: 511987220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:25,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 19:56:25,604][19107] Updated weights for policy 0, policy_version 200675 (0.0037) [2024-06-18 19:56:28,824][19107] Updated weights for policy 0, policy_version 200685 (0.0041) [2024-06-18 19:56:30,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42327.8, 300 sec: 42209.6). Total num frames: 3288072192. Throughput: 0: 42170.7. Samples: 512114320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:30,501][18875] Avg episode reward: [(0, '0.426')] [2024-06-18 19:56:33,411][19107] Updated weights for policy 0, policy_version 200695 (0.0032) [2024-06-18 19:56:35,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3288268800. Throughput: 0: 42122.4. Samples: 512367920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:35,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 19:56:35,518][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000200700_3288268800.pth... [2024-06-18 19:56:35,581][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000200085_3278192640.pth [2024-06-18 19:56:36,545][19107] Updated weights for policy 0, policy_version 200705 (0.0027) [2024-06-18 19:56:40,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.0, 300 sec: 42098.5). Total num frames: 3288481792. Throughput: 0: 41904.4. Samples: 512617580. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:40,501][18875] Avg episode reward: [(0, '0.771')] [2024-06-18 19:56:41,655][19107] Updated weights for policy 0, policy_version 200715 (0.0039) [2024-06-18 19:56:44,535][19107] Updated weights for policy 0, policy_version 200725 (0.0030) [2024-06-18 19:56:45,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 3288694784. Throughput: 0: 41982.4. Samples: 512742580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:45,500][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 19:56:49,297][19107] Updated weights for policy 0, policy_version 200735 (0.0044) [2024-06-18 19:56:50,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42054.8, 300 sec: 42154.1). Total num frames: 3288924160. Throughput: 0: 42094.6. Samples: 512999380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 19:56:50,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 19:56:52,567][19107] Updated weights for policy 0, policy_version 200745 (0.0041) [2024-06-18 19:56:55,505][18875] Fps is (10 sec: 42576.3, 60 sec: 42048.7, 300 sec: 42097.8). Total num frames: 3289120768. Throughput: 0: 41889.5. Samples: 513246900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:56:55,506][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 19:56:56,961][19107] Updated weights for policy 0, policy_version 200755 (0.0032) [2024-06-18 19:57:00,402][19107] Updated weights for policy 0, policy_version 200765 (0.0041) [2024-06-18 19:57:00,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3289333760. Throughput: 0: 42054.8. Samples: 513373640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:00,501][18875] Avg episode reward: [(0, '0.530')] [2024-06-18 19:57:04,665][19107] Updated weights for policy 0, policy_version 200775 (0.0046) [2024-06-18 19:57:05,500][18875] Fps is (10 sec: 40980.9, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3289530368. Throughput: 0: 42151.6. Samples: 513629280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:05,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 19:57:08,256][19107] Updated weights for policy 0, policy_version 200785 (0.0046) [2024-06-18 19:57:10,500][18875] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3289743360. Throughput: 0: 41939.7. Samples: 513874500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:10,501][18875] Avg episode reward: [(0, '0.794')] [2024-06-18 19:57:12,319][19107] Updated weights for policy 0, policy_version 200795 (0.0043) [2024-06-18 19:57:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41781.7, 300 sec: 42043.0). Total num frames: 3289939968. Throughput: 0: 41886.3. Samples: 513999200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:15,501][18875] Avg episode reward: [(0, '0.882')] [2024-06-18 19:57:16,026][19107] Updated weights for policy 0, policy_version 200805 (0.0029) [2024-06-18 19:57:20,226][19107] Updated weights for policy 0, policy_version 200815 (0.0032) [2024-06-18 19:57:20,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3290169344. Throughput: 0: 41934.7. Samples: 514254980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:20,501][18875] Avg episode reward: [(0, '0.844')] [2024-06-18 19:57:24,083][19107] Updated weights for policy 0, policy_version 200825 (0.0031) [2024-06-18 19:57:25,504][18875] Fps is (10 sec: 44220.9, 60 sec: 42322.9, 300 sec: 42153.6). Total num frames: 3290382336. Throughput: 0: 41856.3. Samples: 514501260. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:25,505][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 19:57:27,866][19107] Updated weights for policy 0, policy_version 200835 (0.0028) [2024-06-18 19:57:30,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41987.4). Total num frames: 3290562560. Throughput: 0: 41915.8. Samples: 514628800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:30,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 19:57:32,068][19107] Updated weights for policy 0, policy_version 200845 (0.0049) [2024-06-18 19:57:35,500][18875] Fps is (10 sec: 40974.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3290791936. Throughput: 0: 41752.8. Samples: 514878260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:35,501][18875] Avg episode reward: [(0, '0.416')] [2024-06-18 19:57:35,659][19107] Updated weights for policy 0, policy_version 200855 (0.0035) [2024-06-18 19:57:35,910][19087] Signal inference workers to stop experience collection... (7500 times) [2024-06-18 19:57:35,949][19107] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-06-18 19:57:35,983][19087] Signal inference workers to resume experience collection... (7500 times) [2024-06-18 19:57:35,987][19107] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-06-18 19:57:40,014][19107] Updated weights for policy 0, policy_version 200865 (0.0038) [2024-06-18 19:57:40,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3291004928. Throughput: 0: 41950.1. Samples: 515134440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:40,501][18875] Avg episode reward: [(0, '0.689')] [2024-06-18 19:57:43,378][19107] Updated weights for policy 0, policy_version 200875 (0.0034) [2024-06-18 19:57:45,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3291185152. Throughput: 0: 41834.8. 
Samples: 515256200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:45,501][18875] Avg episode reward: [(0, '0.683')] [2024-06-18 19:57:47,806][19107] Updated weights for policy 0, policy_version 200885 (0.0034) [2024-06-18 19:57:50,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3291414528. Throughput: 0: 41733.7. Samples: 515507300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:50,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 19:57:51,110][19107] Updated weights for policy 0, policy_version 200895 (0.0029) [2024-06-18 19:57:55,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41509.7, 300 sec: 42099.1). Total num frames: 3291611136. Throughput: 0: 42006.2. Samples: 515764780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:57:55,501][18875] Avg episode reward: [(0, '0.396')] [2024-06-18 19:57:55,580][19107] Updated weights for policy 0, policy_version 200905 (0.0043) [2024-06-18 19:57:58,848][19107] Updated weights for policy 0, policy_version 200915 (0.0039) [2024-06-18 19:58:00,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3291840512. Throughput: 0: 41932.9. Samples: 515886180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 19:58:00,501][18875] Avg episode reward: [(0, '0.353')] [2024-06-18 19:58:03,599][19107] Updated weights for policy 0, policy_version 200925 (0.0024) [2024-06-18 19:58:05,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3292053504. Throughput: 0: 41886.2. Samples: 516139860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:05,501][18875] Avg episode reward: [(0, '0.228')] [2024-06-18 19:58:06,404][19107] Updated weights for policy 0, policy_version 200935 (0.0043) [2024-06-18 19:58:10,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.0, 300 sec: 42043.0). Total num frames: 3292250112. Throughput: 0: 41994.4. 
Samples: 516390860. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:10,501][18875] Avg episode reward: [(0, '0.488')] [2024-06-18 19:58:11,547][19107] Updated weights for policy 0, policy_version 200945 (0.0030) [2024-06-18 19:58:14,363][19107] Updated weights for policy 0, policy_version 200955 (0.0033) [2024-06-18 19:58:15,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3292463104. Throughput: 0: 41777.0. Samples: 516508760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:15,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 19:58:19,410][19107] Updated weights for policy 0, policy_version 200965 (0.0033) [2024-06-18 19:58:20,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3292676096. Throughput: 0: 42085.4. Samples: 516772100. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:20,500][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 19:58:21,954][19107] Updated weights for policy 0, policy_version 200975 (0.0035) [2024-06-18 19:58:25,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41508.6, 300 sec: 41931.9). Total num frames: 3292872704. Throughput: 0: 41814.2. Samples: 517016080. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:25,501][18875] Avg episode reward: [(0, '0.392')] [2024-06-18 19:58:27,235][19107] Updated weights for policy 0, policy_version 200985 (0.0031) [2024-06-18 19:58:29,975][19107] Updated weights for policy 0, policy_version 200995 (0.0027) [2024-06-18 19:58:30,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42099.1). Total num frames: 3293118464. Throughput: 0: 41983.2. Samples: 517145440. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:30,501][18875] Avg episode reward: [(0, '0.696')] [2024-06-18 19:58:34,925][19107] Updated weights for policy 0, policy_version 201005 (0.0041) [2024-06-18 19:58:35,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3293282304. Throughput: 0: 42220.0. Samples: 517407200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:35,501][18875] Avg episode reward: [(0, '0.749')] [2024-06-18 19:58:35,619][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000201007_3293298688.pth... [2024-06-18 19:58:35,698][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000200390_3283189760.pth [2024-06-18 19:58:37,446][19107] Updated weights for policy 0, policy_version 201015 (0.0042) [2024-06-18 19:58:40,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3293528064. Throughput: 0: 41955.4. Samples: 517652780. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:40,501][18875] Avg episode reward: [(0, '0.776')] [2024-06-18 19:58:42,633][19107] Updated weights for policy 0, policy_version 201025 (0.0038) [2024-06-18 19:58:45,223][19107] Updated weights for policy 0, policy_version 201035 (0.0035) [2024-06-18 19:58:45,500][18875] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 3293757440. Throughput: 0: 42359.1. Samples: 517792340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:45,501][18875] Avg episode reward: [(0, '0.662')] [2024-06-18 19:58:50,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3293904896. Throughput: 0: 42296.5. Samples: 518043200. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:50,504][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 19:58:50,680][19107] Updated weights for policy 0, policy_version 201045 (0.0040) [2024-06-18 19:58:52,946][19107] Updated weights for policy 0, policy_version 201055 (0.0038) [2024-06-18 19:58:55,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 3294167040. Throughput: 0: 42088.6. Samples: 518284840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:58:55,501][18875] Avg episode reward: [(0, '0.530')] [2024-06-18 19:58:58,359][19107] Updated weights for policy 0, policy_version 201065 (0.0026) [2024-06-18 19:58:59,263][19087] Signal inference workers to stop experience collection... (7550 times) [2024-06-18 19:58:59,295][19107] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-06-18 19:58:59,333][19087] Signal inference workers to resume experience collection... (7550 times) [2024-06-18 19:58:59,334][19107] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-06-18 19:59:00,500][18875] Fps is (10 sec: 47513.9, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3294380032. Throughput: 0: 42608.1. Samples: 518426120. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:59:00,500][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 19:59:00,764][19107] Updated weights for policy 0, policy_version 201075 (0.0036) [2024-06-18 19:59:05,500][18875] Fps is (10 sec: 37683.5, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3294543872. Throughput: 0: 42055.1. Samples: 518664580. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 19:59:05,505][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 19:59:05,890][19107] Updated weights for policy 0, policy_version 201085 (0.0038) [2024-06-18 19:59:08,708][19107] Updated weights for policy 0, policy_version 201095 (0.0025) [2024-06-18 19:59:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 3294789632. Throughput: 0: 42005.4. Samples: 518906320. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:10,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 19:59:13,744][19107] Updated weights for policy 0, policy_version 201105 (0.0035) [2024-06-18 19:59:15,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3294986240. Throughput: 0: 42231.9. Samples: 519045880. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:15,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 19:59:16,398][19107] Updated weights for policy 0, policy_version 201115 (0.0033) [2024-06-18 19:59:20,500][18875] Fps is (10 sec: 37682.5, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 3295166464. Throughput: 0: 41837.3. Samples: 519289880. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:20,501][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 19:59:21,493][19107] Updated weights for policy 0, policy_version 201125 (0.0039) [2024-06-18 19:59:24,250][19107] Updated weights for policy 0, policy_version 201135 (0.0041) [2024-06-18 19:59:25,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3295412224. Throughput: 0: 41879.3. Samples: 519537340. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:25,500][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 19:59:29,139][19107] Updated weights for policy 0, policy_version 201145 (0.0038) [2024-06-18 19:59:30,500][18875] Fps is (10 sec: 44237.7, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 3295608832. Throughput: 0: 41824.5. Samples: 519674440. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:30,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 19:59:32,149][19107] Updated weights for policy 0, policy_version 201155 (0.0035) [2024-06-18 19:59:35,500][18875] Fps is (10 sec: 39320.5, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3295805440. Throughput: 0: 41523.4. Samples: 519911760. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:35,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 19:59:37,479][19107] Updated weights for policy 0, policy_version 201165 (0.0031) [2024-06-18 19:59:40,052][19107] Updated weights for policy 0, policy_version 201175 (0.0044) [2024-06-18 19:59:40,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3296051200. Throughput: 0: 41695.2. Samples: 520161120. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:40,500][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 19:59:45,083][19107] Updated weights for policy 0, policy_version 201185 (0.0036) [2024-06-18 19:59:45,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 3296231424. Throughput: 0: 41480.7. Samples: 520292760. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:45,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 19:59:47,961][19107] Updated weights for policy 0, policy_version 201195 (0.0040) [2024-06-18 19:59:50,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 3296444416. Throughput: 0: 41737.8. Samples: 520542780. 
Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:50,500][18875] Avg episode reward: [(0, '0.585')] [2024-06-18 19:59:52,684][19107] Updated weights for policy 0, policy_version 201205 (0.0037) [2024-06-18 19:59:55,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3296673792. Throughput: 0: 41979.9. Samples: 520795420. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 19:59:55,501][18875] Avg episode reward: [(0, '0.642')] [2024-06-18 19:59:55,810][19107] Updated weights for policy 0, policy_version 201215 (0.0038) [2024-06-18 20:00:00,376][19107] Updated weights for policy 0, policy_version 201225 (0.0050) [2024-06-18 20:00:00,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 3296870400. Throughput: 0: 41666.7. Samples: 520920880. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 20:00:00,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 20:00:00,747][19087] Signal inference workers to stop experience collection... (7600 times) [2024-06-18 20:00:00,753][19087] Signal inference workers to resume experience collection... (7600 times) [2024-06-18 20:00:00,789][19107] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-06-18 20:00:00,789][19107] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-06-18 20:00:03,873][19107] Updated weights for policy 0, policy_version 201235 (0.0041) [2024-06-18 20:00:05,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3297083392. Throughput: 0: 41753.3. Samples: 521168780. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 20:00:05,501][18875] Avg episode reward: [(0, '0.324')] [2024-06-18 20:00:08,152][19107] Updated weights for policy 0, policy_version 201245 (0.0043) [2024-06-18 20:00:10,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3297280000. Throughput: 0: 41939.5. 
Samples: 521424620. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 20:00:10,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 20:00:11,668][19107] Updated weights for policy 0, policy_version 201255 (0.0050) [2024-06-18 20:00:15,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3297492992. Throughput: 0: 41740.9. Samples: 521552780. Policy #0 lag: (min: 0.0, avg: 13.6, max: 30.0) [2024-06-18 20:00:15,500][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 20:00:16,001][19107] Updated weights for policy 0, policy_version 201265 (0.0032) [2024-06-18 20:00:19,526][19107] Updated weights for policy 0, policy_version 201275 (0.0031) [2024-06-18 20:00:20,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3297722368. Throughput: 0: 42023.3. Samples: 521802800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:00:20,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 20:00:23,786][19107] Updated weights for policy 0, policy_version 201285 (0.0029) [2024-06-18 20:00:25,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 3297935360. Throughput: 0: 42069.2. Samples: 522054240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:00:25,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 20:00:27,510][19107] Updated weights for policy 0, policy_version 201295 (0.0042) [2024-06-18 20:00:30,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3298115584. Throughput: 0: 41820.4. Samples: 522174680. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:00:30,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 20:00:31,592][19107] Updated weights for policy 0, policy_version 201305 (0.0038) [2024-06-18 20:00:35,154][19107] Updated weights for policy 0, policy_version 201315 (0.0029) [2024-06-18 20:00:35,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42598.6, 300 sec: 41987.5). Total num frames: 3298361344. Throughput: 0: 41935.1. Samples: 522429860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:00:35,500][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 20:00:35,507][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000201316_3298361344.pth... [2024-06-18 20:00:35,556][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000200700_3288268800.pth [2024-06-18 20:00:39,457][19107] Updated weights for policy 0, policy_version 201325 (0.0028) [2024-06-18 20:00:40,506][18875] Fps is (10 sec: 45851.2, 60 sec: 42048.5, 300 sec: 42097.8). Total num frames: 3298574336. Throughput: 0: 42044.4. Samples: 522687640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:00:40,506][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 20:00:43,107][19107] Updated weights for policy 0, policy_version 201335 (0.0038) [2024-06-18 20:00:45,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42052.4, 300 sec: 41876.9). Total num frames: 3298754560. Throughput: 0: 42014.8. Samples: 522811540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:00:45,501][18875] Avg episode reward: [(0, '0.614')] [2024-06-18 20:00:47,109][19107] Updated weights for policy 0, policy_version 201345 (0.0028) [2024-06-18 20:00:50,500][18875] Fps is (10 sec: 39342.8, 60 sec: 42052.2, 300 sec: 41932.0). Total num frames: 3298967552. Throughput: 0: 41949.0. Samples: 523056480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:00:50,501][18875] Avg episode reward: [(0, '0.738')] [2024-06-18 20:00:51,160][19107] Updated weights for policy 0, policy_version 201355 (0.0041) [2024-06-18 20:00:55,175][19107] Updated weights for policy 0, policy_version 201365 (0.0040) [2024-06-18 20:00:55,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3299180544. Throughput: 0: 42010.0. Samples: 523315080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:00:55,501][18875] Avg episode reward: [(0, '0.765')] [2024-06-18 20:00:59,043][19107] Updated weights for policy 0, policy_version 201375 (0.0037) [2024-06-18 20:01:00,504][18875] Fps is (10 sec: 40944.9, 60 sec: 41776.7, 300 sec: 41820.3). Total num frames: 3299377152. Throughput: 0: 41905.4. Samples: 523438680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:00,505][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 20:01:02,823][19107] Updated weights for policy 0, policy_version 201385 (0.0031) [2024-06-18 20:01:05,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3299590144. Throughput: 0: 41858.2. Samples: 523686420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:05,502][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 20:01:06,797][19107] Updated weights for policy 0, policy_version 201395 (0.0046) [2024-06-18 20:01:10,500][18875] Fps is (10 sec: 42614.4, 60 sec: 42052.3, 300 sec: 41932.5). Total num frames: 3299803136. Throughput: 0: 41962.4. Samples: 523942540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:10,500][18875] Avg episode reward: [(0, '0.597')] [2024-06-18 20:01:10,530][19107] Updated weights for policy 0, policy_version 201405 (0.0035) [2024-06-18 20:01:14,544][19107] Updated weights for policy 0, policy_version 201415 (0.0040) [2024-06-18 20:01:15,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3300016128. Throughput: 0: 42058.7. Samples: 524067320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:15,501][18875] Avg episode reward: [(0, '0.724')] [2024-06-18 20:01:18,165][19107] Updated weights for policy 0, policy_version 201425 (0.0038) [2024-06-18 20:01:20,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3300229120. Throughput: 0: 42015.8. Samples: 524320580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:20,501][18875] Avg episode reward: [(0, '0.853')] [2024-06-18 20:01:22,599][19107] Updated weights for policy 0, policy_version 201435 (0.0038) [2024-06-18 20:01:25,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 3300442112. Throughput: 0: 41902.0. Samples: 524573000. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:25,500][18875] Avg episode reward: [(0, '0.686')] [2024-06-18 20:01:25,801][19107] Updated weights for policy 0, policy_version 201445 (0.0033) [2024-06-18 20:01:27,690][19087] Signal inference workers to stop experience collection... (7650 times) [2024-06-18 20:01:27,736][19107] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-06-18 20:01:27,743][19087] Signal inference workers to resume experience collection... 
(7650 times) [2024-06-18 20:01:27,751][19107] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-06-18 20:01:30,226][19107] Updated weights for policy 0, policy_version 201455 (0.0039) [2024-06-18 20:01:30,504][18875] Fps is (10 sec: 40945.6, 60 sec: 42049.8, 300 sec: 41931.4). Total num frames: 3300638720. Throughput: 0: 41861.1. Samples: 524695440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:30,504][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 20:01:33,562][19107] Updated weights for policy 0, policy_version 201465 (0.0034) [2024-06-18 20:01:35,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 3300851712. Throughput: 0: 42074.2. Samples: 524949820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:35,508][18875] Avg episode reward: [(0, '0.358')] [2024-06-18 20:01:37,768][19107] Updated weights for policy 0, policy_version 201475 (0.0036) [2024-06-18 20:01:40,500][18875] Fps is (10 sec: 42614.2, 60 sec: 41509.9, 300 sec: 41931.9). Total num frames: 3301064704. Throughput: 0: 42082.5. Samples: 525208780. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:40,500][18875] Avg episode reward: [(0, '0.426')] [2024-06-18 20:01:41,273][19107] Updated weights for policy 0, policy_version 201485 (0.0026) [2024-06-18 20:01:45,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3301261312. Throughput: 0: 42162.6. Samples: 525335840. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:45,500][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 20:01:45,906][19107] Updated weights for policy 0, policy_version 201495 (0.0027) [2024-06-18 20:01:49,137][19107] Updated weights for policy 0, policy_version 201505 (0.0041) [2024-06-18 20:01:50,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 41988.2). Total num frames: 3301507072. Throughput: 0: 42195.1. Samples: 525585200. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:50,501][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 20:01:53,523][19107] Updated weights for policy 0, policy_version 201515 (0.0036) [2024-06-18 20:01:55,500][18875] Fps is (10 sec: 45874.5, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3301720064. Throughput: 0: 42251.4. Samples: 525843860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:01:55,502][18875] Avg episode reward: [(0, '0.355')] [2024-06-18 20:01:56,958][19107] Updated weights for policy 0, policy_version 201525 (0.0044) [2024-06-18 20:02:00,504][18875] Fps is (10 sec: 37669.7, 60 sec: 41779.2, 300 sec: 41875.9). Total num frames: 3301883904. Throughput: 0: 42147.8. Samples: 525964120. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:02:00,505][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 20:02:01,381][19107] Updated weights for policy 0, policy_version 201535 (0.0033) [2024-06-18 20:02:04,747][19107] Updated weights for policy 0, policy_version 201545 (0.0040) [2024-06-18 20:02:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3302129664. Throughput: 0: 42128.9. Samples: 526216380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:02:05,501][18875] Avg episode reward: [(0, '0.830')] [2024-06-18 20:02:09,284][19107] Updated weights for policy 0, policy_version 201555 (0.0032) [2024-06-18 20:02:10,500][18875] Fps is (10 sec: 44252.4, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 3302326272. Throughput: 0: 42099.0. Samples: 526467460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:02:10,501][18875] Avg episode reward: [(0, '0.574')] [2024-06-18 20:02:12,667][19107] Updated weights for policy 0, policy_version 201565 (0.0034) [2024-06-18 20:02:15,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3302539264. Throughput: 0: 42165.0. Samples: 526592720. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:02:15,501][18875] Avg episode reward: [(0, '0.337')] [2024-06-18 20:02:17,102][19107] Updated weights for policy 0, policy_version 201575 (0.0044) [2024-06-18 20:02:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41932.4). Total num frames: 3302752256. Throughput: 0: 42141.3. Samples: 526846180. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:02:20,501][18875] Avg episode reward: [(0, '0.432')] [2024-06-18 20:02:20,588][19107] Updated weights for policy 0, policy_version 201585 (0.0033) [2024-06-18 20:02:25,085][19107] Updated weights for policy 0, policy_version 201595 (0.0038) [2024-06-18 20:02:25,504][18875] Fps is (10 sec: 40945.4, 60 sec: 41776.6, 300 sec: 41987.0). Total num frames: 3302948864. Throughput: 0: 42126.2. Samples: 527104620. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:02:25,505][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 20:02:28,183][19107] Updated weights for policy 0, policy_version 201605 (0.0040) [2024-06-18 20:02:30,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42327.9, 300 sec: 41987.5). Total num frames: 3303178240. Throughput: 0: 41967.5. Samples: 527224380. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-18 20:02:30,501][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 20:02:33,077][19107] Updated weights for policy 0, policy_version 201615 (0.0038) [2024-06-18 20:02:35,500][18875] Fps is (10 sec: 44252.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3303391232. Throughput: 0: 42217.7. Samples: 527485000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:02:35,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 20:02:35,524][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000201623_3303391232.pth... 
[2024-06-18 20:02:35,573][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000201007_3293298688.pth [2024-06-18 20:02:35,974][19107] Updated weights for policy 0, policy_version 201625 (0.0038) [2024-06-18 20:02:40,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3303555072. Throughput: 0: 42060.5. Samples: 527736580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:02:40,500][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 20:02:40,905][19107] Updated weights for policy 0, policy_version 201635 (0.0051) [2024-06-18 20:02:43,701][19107] Updated weights for policy 0, policy_version 201645 (0.0032) [2024-06-18 20:02:45,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42098.6). Total num frames: 3303833600. Throughput: 0: 42025.5. Samples: 527855120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:02:45,501][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 20:02:46,612][19087] Signal inference workers to stop experience collection... (7700 times) [2024-06-18 20:02:46,638][19107] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-06-18 20:02:46,677][19087] Signal inference workers to resume experience collection... (7700 times) [2024-06-18 20:02:46,678][19107] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-06-18 20:02:48,465][19107] Updated weights for policy 0, policy_version 201655 (0.0031) [2024-06-18 20:02:50,500][18875] Fps is (10 sec: 47513.4, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3304030208. Throughput: 0: 42278.3. Samples: 528118900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:02:50,501][18875] Avg episode reward: [(0, '0.766')] [2024-06-18 20:02:51,196][19107] Updated weights for policy 0, policy_version 201665 (0.0038) [2024-06-18 20:02:55,503][18875] Fps is (10 sec: 37674.9, 60 sec: 41504.6, 300 sec: 41931.6). Total num frames: 3304210432. 
Throughput: 0: 42223.3. Samples: 528367600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:02:55,503][18875] Avg episode reward: [(0, '0.837')] [2024-06-18 20:02:56,163][19107] Updated weights for policy 0, policy_version 201675 (0.0029) [2024-06-18 20:02:58,857][19107] Updated weights for policy 0, policy_version 201685 (0.0032) [2024-06-18 20:03:00,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42874.1, 300 sec: 42043.0). Total num frames: 3304456192. Throughput: 0: 42216.2. Samples: 528492440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:03:00,500][18875] Avg episode reward: [(0, '0.834')] [2024-06-18 20:03:04,391][19107] Updated weights for policy 0, policy_version 201695 (0.0030) [2024-06-18 20:03:05,500][18875] Fps is (10 sec: 42608.2, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3304636416. Throughput: 0: 42243.6. Samples: 528747140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:03:05,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 20:03:06,759][19107] Updated weights for policy 0, policy_version 201705 (0.0032) [2024-06-18 20:03:10,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3304865792. Throughput: 0: 41931.4. Samples: 528991380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:03:10,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 20:03:11,907][19107] Updated weights for policy 0, policy_version 201715 (0.0050) [2024-06-18 20:03:14,698][19107] Updated weights for policy 0, policy_version 201725 (0.0041) [2024-06-18 20:03:15,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42598.5, 300 sec: 42098.5). Total num frames: 3305095168. Throughput: 0: 42222.7. Samples: 529124400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:03:15,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 20:03:19,603][19107] Updated weights for policy 0, policy_version 201735 (0.0031) [2024-06-18 20:03:20,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3305259008. Throughput: 0: 42060.2. Samples: 529377700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:03:20,501][18875] Avg episode reward: [(0, '0.425')] [2024-06-18 20:03:22,534][19107] Updated weights for policy 0, policy_version 201745 (0.0032) [2024-06-18 20:03:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42601.1, 300 sec: 41987.5). Total num frames: 3305504768. Throughput: 0: 41795.1. Samples: 529617360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:03:25,500][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 20:03:27,897][19107] Updated weights for policy 0, policy_version 201755 (0.0029) [2024-06-18 20:03:30,340][19107] Updated weights for policy 0, policy_version 201765 (0.0030) [2024-06-18 20:03:30,500][18875] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3305717760. Throughput: 0: 42223.5. Samples: 529755180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:03:30,501][18875] Avg episode reward: [(0, '0.354')] [2024-06-18 20:03:35,500][18875] Fps is (10 sec: 36044.2, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 3305865216. Throughput: 0: 41727.0. Samples: 529996620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 20:03:35,501][18875] Avg episode reward: [(0, '0.354')] [2024-06-18 20:03:35,533][19107] Updated weights for policy 0, policy_version 201775 (0.0044) [2024-06-18 20:03:38,441][19107] Updated weights for policy 0, policy_version 201785 (0.0038) [2024-06-18 20:03:40,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 41931.9). Total num frames: 3306127360. Throughput: 0: 41670.1. Samples: 530242660. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:03:40,501][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 20:03:43,191][19107] Updated weights for policy 0, policy_version 201795 (0.0023) [2024-06-18 20:03:45,500][18875] Fps is (10 sec: 45875.5, 60 sec: 41506.2, 300 sec: 42098.5). Total num frames: 3306323968. Throughput: 0: 41916.3. Samples: 530378680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:03:45,501][18875] Avg episode reward: [(0, '0.791')] [2024-06-18 20:03:46,100][19107] Updated weights for policy 0, policy_version 201805 (0.0029) [2024-06-18 20:03:50,500][18875] Fps is (10 sec: 37682.5, 60 sec: 41233.0, 300 sec: 41820.8). Total num frames: 3306504192. Throughput: 0: 41688.7. Samples: 530623140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:03:50,501][18875] Avg episode reward: [(0, '0.701')] [2024-06-18 20:03:50,902][19107] Updated weights for policy 0, policy_version 201815 (0.0043) [2024-06-18 20:03:53,255][19087] Signal inference workers to stop experience collection... (7750 times) [2024-06-18 20:03:53,310][19107] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-06-18 20:03:53,310][19087] Signal inference workers to resume experience collection... (7750 times) [2024-06-18 20:03:53,335][19107] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-06-18 20:03:53,796][19107] Updated weights for policy 0, policy_version 201825 (0.0034) [2024-06-18 20:03:55,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42600.0, 300 sec: 41987.5). Total num frames: 3306766336. Throughput: 0: 41696.5. Samples: 530867720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:03:55,501][18875] Avg episode reward: [(0, '0.767')] [2024-06-18 20:03:58,909][19107] Updated weights for policy 0, policy_version 201835 (0.0029) [2024-06-18 20:04:00,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41506.0, 300 sec: 42043.0). Total num frames: 3306946560. Throughput: 0: 41643.1. 
Samples: 530998340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:04:00,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 20:04:01,942][19107] Updated weights for policy 0, policy_version 201845 (0.0037) [2024-06-18 20:04:05,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3307159552. Throughput: 0: 41578.7. Samples: 531248740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:04:05,500][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 20:04:06,526][19107] Updated weights for policy 0, policy_version 201855 (0.0041) [2024-06-18 20:04:09,718][19107] Updated weights for policy 0, policy_version 201865 (0.0050) [2024-06-18 20:04:10,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3307388928. Throughput: 0: 41812.8. Samples: 531498940. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:04:10,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 20:04:14,202][19107] Updated weights for policy 0, policy_version 201875 (0.0040) [2024-06-18 20:04:15,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 3307585536. Throughput: 0: 41573.3. Samples: 531625980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:04:15,501][18875] Avg episode reward: [(0, '0.452')] [2024-06-18 20:04:17,627][19107] Updated weights for policy 0, policy_version 201885 (0.0028) [2024-06-18 20:04:20,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3307782144. Throughput: 0: 41700.6. Samples: 531873140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:04:20,508][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 20:04:21,853][19107] Updated weights for policy 0, policy_version 201895 (0.0043) [2024-06-18 20:04:25,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3307995136. Throughput: 0: 41877.8. 
Samples: 532127160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:04:25,501][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 20:04:25,652][19107] Updated weights for policy 0, policy_version 201905 (0.0033) [2024-06-18 20:04:29,556][19107] Updated weights for policy 0, policy_version 201915 (0.0032) [2024-06-18 20:04:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3308208128. Throughput: 0: 41728.1. Samples: 532256440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:04:30,501][18875] Avg episode reward: [(0, '0.627')] [2024-06-18 20:04:33,279][19107] Updated weights for policy 0, policy_version 201925 (0.0050) [2024-06-18 20:04:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.5, 300 sec: 41876.4). Total num frames: 3308404736. Throughput: 0: 41857.5. Samples: 532506720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:04:35,500][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 20:04:35,586][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000201930_3308421120.pth... [2024-06-18 20:04:35,631][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000201316_3298361344.pth [2024-06-18 20:04:37,297][19107] Updated weights for policy 0, policy_version 201935 (0.0040) [2024-06-18 20:04:40,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3308634112. Throughput: 0: 41988.4. Samples: 532757200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-18 20:04:40,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 20:04:41,036][19107] Updated weights for policy 0, policy_version 201945 (0.0041) [2024-06-18 20:04:45,442][19107] Updated weights for policy 0, policy_version 201955 (0.0041) [2024-06-18 20:04:45,500][18875] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 41987.4). Total num frames: 3308830720. Throughput: 0: 41938.6. Samples: 532885580. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:04:45,501][18875] Avg episode reward: [(0, '0.719')] [2024-06-18 20:04:48,807][19107] Updated weights for policy 0, policy_version 201965 (0.0028) [2024-06-18 20:04:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 3309043712. Throughput: 0: 41812.8. Samples: 533130320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:04:50,501][18875] Avg episode reward: [(0, '0.708')] [2024-06-18 20:04:53,284][19107] Updated weights for policy 0, policy_version 201975 (0.0036) [2024-06-18 20:04:55,504][18875] Fps is (10 sec: 42583.5, 60 sec: 41503.6, 300 sec: 41987.0). Total num frames: 3309256704. Throughput: 0: 41803.8. Samples: 533380260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:04:55,504][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 20:04:56,618][19107] Updated weights for policy 0, policy_version 201985 (0.0042) [2024-06-18 20:05:00,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3309436928. Throughput: 0: 41831.1. Samples: 533508380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:00,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 20:05:00,929][19107] Updated weights for policy 0, policy_version 201995 (0.0038) [2024-06-18 20:05:04,610][19107] Updated weights for policy 0, policy_version 202005 (0.0041) [2024-06-18 20:05:05,500][18875] Fps is (10 sec: 42613.4, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 3309682688. Throughput: 0: 41964.3. Samples: 533761540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:05,501][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 20:05:08,633][19107] Updated weights for policy 0, policy_version 202015 (0.0033) [2024-06-18 20:05:10,500][18875] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3309895680. Throughput: 0: 41833.2. Samples: 534009660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:10,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 20:05:12,547][19107] Updated weights for policy 0, policy_version 202025 (0.0036) [2024-06-18 20:05:15,500][18875] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3310075904. Throughput: 0: 41765.3. Samples: 534135880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:15,501][18875] Avg episode reward: [(0, '0.442')] [2024-06-18 20:05:16,419][19107] Updated weights for policy 0, policy_version 202035 (0.0037) [2024-06-18 20:05:20,423][19107] Updated weights for policy 0, policy_version 202045 (0.0030) [2024-06-18 20:05:20,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3310305280. Throughput: 0: 41614.1. Samples: 534379360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:20,501][18875] Avg episode reward: [(0, '0.777')] [2024-06-18 20:05:23,625][19087] Signal inference workers to stop experience collection... (7800 times) [2024-06-18 20:05:23,633][19087] Signal inference workers to resume experience collection... (7800 times) [2024-06-18 20:05:23,642][19107] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-06-18 20:05:23,654][19107] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-06-18 20:05:24,097][19107] Updated weights for policy 0, policy_version 202055 (0.0036) [2024-06-18 20:05:25,500][18875] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3310501888. Throughput: 0: 41856.3. Samples: 534640740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:25,501][18875] Avg episode reward: [(0, '0.753')] [2024-06-18 20:05:28,272][19107] Updated weights for policy 0, policy_version 202065 (0.0035) [2024-06-18 20:05:30,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 3310698496. Throughput: 0: 41854.7. 
Samples: 534769040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:30,501][18875] Avg episode reward: [(0, '0.753')] [2024-06-18 20:05:31,951][19107] Updated weights for policy 0, policy_version 202075 (0.0041) [2024-06-18 20:05:35,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.2, 300 sec: 41877.2). Total num frames: 3310927872. Throughput: 0: 41822.7. Samples: 535012340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:35,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 20:05:36,119][19107] Updated weights for policy 0, policy_version 202085 (0.0038) [2024-06-18 20:05:39,782][19107] Updated weights for policy 0, policy_version 202095 (0.0035) [2024-06-18 20:05:40,500][18875] Fps is (10 sec: 44237.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3311140864. Throughput: 0: 41910.1. Samples: 535266060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:40,501][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 20:05:43,995][19107] Updated weights for policy 0, policy_version 202105 (0.0031) [2024-06-18 20:05:45,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3311321088. Throughput: 0: 41849.2. Samples: 535391600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:45,501][18875] Avg episode reward: [(0, '0.355')] [2024-06-18 20:05:47,730][19107] Updated weights for policy 0, policy_version 202115 (0.0035) [2024-06-18 20:05:50,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3311566848. Throughput: 0: 41705.3. Samples: 535638280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:05:50,501][18875] Avg episode reward: [(0, '0.374')] [2024-06-18 20:05:51,919][19107] Updated weights for policy 0, policy_version 202125 (0.0030) [2024-06-18 20:05:55,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41781.6, 300 sec: 41988.0). Total num frames: 3311763456. Throughput: 0: 41893.7. 
Samples: 535894880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:05:55,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 20:05:55,516][19107] Updated weights for policy 0, policy_version 202135 (0.0038) [2024-06-18 20:05:59,606][19107] Updated weights for policy 0, policy_version 202145 (0.0037) [2024-06-18 20:06:00,500][18875] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3311960064. Throughput: 0: 41759.1. Samples: 536015040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:00,501][18875] Avg episode reward: [(0, '0.640')] [2024-06-18 20:06:03,557][19107] Updated weights for policy 0, policy_version 202155 (0.0040) [2024-06-18 20:06:05,500][18875] Fps is (10 sec: 42599.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3312189440. Throughput: 0: 42013.0. Samples: 536269940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:05,500][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 20:06:07,868][19107] Updated weights for policy 0, policy_version 202165 (0.0037) [2024-06-18 20:06:10,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3312386048. Throughput: 0: 41836.9. Samples: 536523400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:10,501][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 20:06:11,350][19107] Updated weights for policy 0, policy_version 202175 (0.0036) [2024-06-18 20:06:15,500][18875] Fps is (10 sec: 37682.5, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 3312566272. Throughput: 0: 41694.2. Samples: 536645280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:15,501][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 20:06:15,688][19107] Updated weights for policy 0, policy_version 202185 (0.0038) [2024-06-18 20:06:19,094][19107] Updated weights for policy 0, policy_version 202195 (0.0044) [2024-06-18 20:06:20,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3312844800. Throughput: 0: 41975.9. Samples: 536901260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:20,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 20:06:23,364][19087] Signal inference workers to stop experience collection... (7850 times) [2024-06-18 20:06:23,389][19107] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-06-18 20:06:23,423][19087] Signal inference workers to resume experience collection... (7850 times) [2024-06-18 20:06:23,435][19107] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-06-18 20:06:23,596][19107] Updated weights for policy 0, policy_version 202205 (0.0033) [2024-06-18 20:06:25,500][18875] Fps is (10 sec: 45876.2, 60 sec: 42052.4, 300 sec: 41988.0). Total num frames: 3313025024. Throughput: 0: 41974.3. Samples: 537154900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:25,501][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 20:06:26,848][19107] Updated weights for policy 0, policy_version 202215 (0.0050) [2024-06-18 20:06:30,501][18875] Fps is (10 sec: 37682.7, 60 sec: 42052.1, 300 sec: 41931.9). Total num frames: 3313221632. Throughput: 0: 41905.2. Samples: 537277340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:30,501][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 20:06:31,344][19107] Updated weights for policy 0, policy_version 202225 (0.0032) [2024-06-18 20:06:35,043][19107] Updated weights for policy 0, policy_version 202235 (0.0047) [2024-06-18 20:06:35,504][18875] Fps is (10 sec: 42582.5, 60 sec: 42049.7, 300 sec: 41986.9). Total num frames: 3313451008. Throughput: 0: 42034.9. Samples: 537530000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:35,504][18875] Avg episode reward: [(0, '0.666')] [2024-06-18 20:06:35,572][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000202238_3313467392.pth... [2024-06-18 20:06:35,654][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000201623_3303391232.pth [2024-06-18 20:06:39,161][19107] Updated weights for policy 0, policy_version 202245 (0.0037) [2024-06-18 20:06:40,500][18875] Fps is (10 sec: 40961.6, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3313631232. Throughput: 0: 41873.6. Samples: 537779180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:40,500][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 20:06:42,988][19107] Updated weights for policy 0, policy_version 202255 (0.0037) [2024-06-18 20:06:45,500][18875] Fps is (10 sec: 40974.6, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3313860608. Throughput: 0: 41890.6. Samples: 537900120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:45,501][18875] Avg episode reward: [(0, '0.708')] [2024-06-18 20:06:46,779][19107] Updated weights for policy 0, policy_version 202265 (0.0037) [2024-06-18 20:06:50,500][18875] Fps is (10 sec: 40959.0, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 3314040832. Throughput: 0: 41802.0. Samples: 538151040. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:50,501][18875] Avg episode reward: [(0, '0.690')] [2024-06-18 20:06:50,738][19107] Updated weights for policy 0, policy_version 202275 (0.0036) [2024-06-18 20:06:54,408][19107] Updated weights for policy 0, policy_version 202285 (0.0045) [2024-06-18 20:06:55,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41506.2, 300 sec: 41932.4). Total num frames: 3314253824. Throughput: 0: 41782.7. Samples: 538403620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:06:55,501][18875] Avg episode reward: [(0, '0.817')] [2024-06-18 20:06:58,503][19107] Updated weights for policy 0, policy_version 202295 (0.0043) [2024-06-18 20:07:00,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3314483200. Throughput: 0: 41842.4. Samples: 538528180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:00,501][18875] Avg episode reward: [(0, '0.729')] [2024-06-18 20:07:01,923][19107] Updated weights for policy 0, policy_version 202305 (0.0033) [2024-06-18 20:07:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 3314679808. Throughput: 0: 41808.5. Samples: 538782640. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:05,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 20:07:06,203][19107] Updated weights for policy 0, policy_version 202315 (0.0034) [2024-06-18 20:07:10,254][19107] Updated weights for policy 0, policy_version 202325 (0.0039) [2024-06-18 20:07:10,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3314892800. Throughput: 0: 41623.4. Samples: 539027960. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:10,501][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 20:07:14,181][19107] Updated weights for policy 0, policy_version 202335 (0.0033) [2024-06-18 20:07:15,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 41876.4). Total num frames: 3315105792. Throughput: 0: 41741.6. Samples: 539155700. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:15,500][18875] Avg episode reward: [(0, '0.703')] [2024-06-18 20:07:17,943][19107] Updated weights for policy 0, policy_version 202345 (0.0024) [2024-06-18 20:07:20,500][18875] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41876.9). Total num frames: 3315302400. Throughput: 0: 41740.6. Samples: 539408180. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:20,501][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 20:07:22,104][19107] Updated weights for policy 0, policy_version 202355 (0.0039) [2024-06-18 20:07:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3315515392. Throughput: 0: 41630.1. Samples: 539652540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:25,501][18875] Avg episode reward: [(0, '0.548')] [2024-06-18 20:07:26,253][19107] Updated weights for policy 0, policy_version 202365 (0.0045) [2024-06-18 20:07:29,813][19107] Updated weights for policy 0, policy_version 202375 (0.0035) [2024-06-18 20:07:30,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 3315744768. Throughput: 0: 41860.5. Samples: 539783840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:30,509][18875] Avg episode reward: [(0, '0.345')] [2024-06-18 20:07:33,849][19107] Updated weights for policy 0, policy_version 202385 (0.0045) [2024-06-18 20:07:35,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41235.5, 300 sec: 41931.9). Total num frames: 3315924992. Throughput: 0: 41856.0. Samples: 540034560. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:35,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 20:07:37,573][19107] Updated weights for policy 0, policy_version 202395 (0.0039) [2024-06-18 20:07:40,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3316154368. Throughput: 0: 41747.6. Samples: 540282260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:40,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 20:07:41,819][19107] Updated weights for policy 0, policy_version 202405 (0.0023) [2024-06-18 20:07:45,375][19107] Updated weights for policy 0, policy_version 202415 (0.0039) [2024-06-18 20:07:45,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3316367360. Throughput: 0: 41878.6. Samples: 540412720. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:45,501][18875] Avg episode reward: [(0, '0.344')] [2024-06-18 20:07:49,524][19107] Updated weights for policy 0, policy_version 202425 (0.0031) [2024-06-18 20:07:50,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41876.7). Total num frames: 3316563968. Throughput: 0: 41807.0. Samples: 540663960. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:50,501][18875] Avg episode reward: [(0, '0.327')] [2024-06-18 20:07:53,051][19107] Updated weights for policy 0, policy_version 202435 (0.0043) [2024-06-18 20:07:53,961][19087] Signal inference workers to stop experience collection... (7900 times) [2024-06-18 20:07:54,016][19107] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-06-18 20:07:54,018][19087] Signal inference workers to resume experience collection... (7900 times) [2024-06-18 20:07:54,028][19107] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-06-18 20:07:55,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 3316793344. Throughput: 0: 41904.9. 
Samples: 540913680. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:07:55,501][18875] Avg episode reward: [(0, '0.489')] [2024-06-18 20:07:57,457][19107] Updated weights for policy 0, policy_version 202445 (0.0034) [2024-06-18 20:08:00,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3316989952. Throughput: 0: 42038.6. Samples: 541047440. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:08:00,501][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 20:08:00,783][19107] Updated weights for policy 0, policy_version 202455 (0.0031) [2024-06-18 20:08:05,043][19107] Updated weights for policy 0, policy_version 202465 (0.0033) [2024-06-18 20:08:05,500][18875] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3317186560. Throughput: 0: 42004.6. Samples: 541298380. Policy #0 lag: (min: 0.0, avg: 12.0, max: 27.0) [2024-06-18 20:08:05,500][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 20:08:08,788][19107] Updated weights for policy 0, policy_version 202475 (0.0034) [2024-06-18 20:08:10,504][18875] Fps is (10 sec: 44220.9, 60 sec: 42322.8, 300 sec: 41820.3). Total num frames: 3317432320. Throughput: 0: 41954.8. Samples: 541540660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:10,505][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 20:08:13,019][19107] Updated weights for policy 0, policy_version 202485 (0.0035) [2024-06-18 20:08:15,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3317612544. Throughput: 0: 41987.1. Samples: 541673260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:15,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 20:08:16,655][19107] Updated weights for policy 0, policy_version 202495 (0.0026) [2024-06-18 20:08:20,500][18875] Fps is (10 sec: 39335.5, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3317825536. Throughput: 0: 42004.9. 
Samples: 541924780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:20,501][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 20:08:20,547][19107] Updated weights for policy 0, policy_version 202505 (0.0025) [2024-06-18 20:08:24,431][19107] Updated weights for policy 0, policy_version 202515 (0.0029) [2024-06-18 20:08:25,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 3318054912. Throughput: 0: 42014.7. Samples: 542172920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:25,500][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 20:08:28,230][19107] Updated weights for policy 0, policy_version 202525 (0.0035) [2024-06-18 20:08:30,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3318251520. Throughput: 0: 41881.0. Samples: 542297360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:30,501][18875] Avg episode reward: [(0, '0.686')] [2024-06-18 20:08:32,397][19107] Updated weights for policy 0, policy_version 202535 (0.0029) [2024-06-18 20:08:35,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 3318464512. Throughput: 0: 41824.1. Samples: 542546040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:35,508][18875] Avg episode reward: [(0, '0.771')] [2024-06-18 20:08:35,526][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000202543_3318464512.pth... [2024-06-18 20:08:35,585][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000201930_3308421120.pth [2024-06-18 20:08:36,171][19107] Updated weights for policy 0, policy_version 202545 (0.0042) [2024-06-18 20:08:40,055][19107] Updated weights for policy 0, policy_version 202555 (0.0032) [2024-06-18 20:08:40,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3318661120. Throughput: 0: 41931.7. Samples: 542800600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:40,500][18875] Avg episode reward: [(0, '0.661')] [2024-06-18 20:08:44,518][19107] Updated weights for policy 0, policy_version 202565 (0.0028) [2024-06-18 20:08:45,502][18875] Fps is (10 sec: 40952.4, 60 sec: 41777.9, 300 sec: 41931.7). Total num frames: 3318874112. Throughput: 0: 41752.4. Samples: 542926380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:45,503][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 20:08:47,881][19107] Updated weights for policy 0, policy_version 202575 (0.0033) [2024-06-18 20:08:50,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3319087104. Throughput: 0: 41771.4. Samples: 543178100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:50,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 20:08:52,475][19107] Updated weights for policy 0, policy_version 202585 (0.0041) [2024-06-18 20:08:55,500][18875] Fps is (10 sec: 40968.4, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 3319283712. Throughput: 0: 41956.3. Samples: 543428540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:08:55,500][18875] Avg episode reward: [(0, '0.411')] [2024-06-18 20:08:55,683][19107] Updated weights for policy 0, policy_version 202595 (0.0037) [2024-06-18 20:09:00,223][19107] Updated weights for policy 0, policy_version 202605 (0.0037) [2024-06-18 20:09:00,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3319496704. Throughput: 0: 41701.3. Samples: 543549820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:09:00,501][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 20:09:03,571][19107] Updated weights for policy 0, policy_version 202615 (0.0030) [2024-06-18 20:09:05,504][18875] Fps is (10 sec: 44220.3, 60 sec: 42322.7, 300 sec: 41820.3). Total num frames: 3319726080. Throughput: 0: 41835.4. Samples: 543807520. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:09:05,513][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 20:09:07,738][19107] Updated weights for policy 0, policy_version 202625 (0.0026) [2024-06-18 20:09:10,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41235.6, 300 sec: 41765.3). Total num frames: 3319906304. Throughput: 0: 41901.4. Samples: 544058480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 20:09:10,500][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 20:09:11,608][19107] Updated weights for policy 0, policy_version 202635 (0.0033) [2024-06-18 20:09:12,257][19087] Signal inference workers to stop experience collection... (7950 times) [2024-06-18 20:09:12,257][19087] Signal inference workers to resume experience collection... (7950 times) [2024-06-18 20:09:12,269][19107] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-06-18 20:09:12,270][19107] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-06-18 20:09:15,475][19107] Updated weights for policy 0, policy_version 202645 (0.0032) [2024-06-18 20:09:15,500][18875] Fps is (10 sec: 40974.6, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3320135680. Throughput: 0: 41706.6. Samples: 544174160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:09:15,504][18875] Avg episode reward: [(0, '0.585')] [2024-06-18 20:09:19,419][19107] Updated weights for policy 0, policy_version 202655 (0.0029) [2024-06-18 20:09:20,500][18875] Fps is (10 sec: 47513.4, 60 sec: 42598.5, 300 sec: 41987.5). Total num frames: 3320381440. Throughput: 0: 42007.2. Samples: 544436360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:09:20,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 20:09:23,119][19107] Updated weights for policy 0, policy_version 202665 (0.0031) [2024-06-18 20:09:25,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3320561664. Throughput: 0: 41936.0. 
Samples: 544687720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:09:25,500][18875] Avg episode reward: [(0, '0.337')] [2024-06-18 20:09:27,126][19107] Updated weights for policy 0, policy_version 202675 (0.0047) [2024-06-18 20:09:30,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3320758272. Throughput: 0: 41855.6. Samples: 544809800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:09:30,501][18875] Avg episode reward: [(0, '0.651')] [2024-06-18 20:09:30,777][19107] Updated weights for policy 0, policy_version 202685 (0.0044) [2024-06-18 20:09:34,814][19107] Updated weights for policy 0, policy_version 202695 (0.0029) [2024-06-18 20:09:35,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3320971264. Throughput: 0: 41861.5. Samples: 545061860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:09:35,500][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 20:09:38,778][19107] Updated weights for policy 0, policy_version 202705 (0.0032) [2024-06-18 20:09:40,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3321184256. Throughput: 0: 41772.4. Samples: 545308300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:09:40,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 20:09:42,723][19107] Updated weights for policy 0, policy_version 202715 (0.0040) [2024-06-18 20:09:45,500][18875] Fps is (10 sec: 40959.1, 60 sec: 41780.4, 300 sec: 41820.8). Total num frames: 3321380864. Throughput: 0: 41943.8. Samples: 545437300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:09:45,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 20:09:46,646][19107] Updated weights for policy 0, policy_version 202725 (0.0028) [2024-06-18 20:09:50,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41821.4). Total num frames: 3321593856. Throughput: 0: 41864.3. 
Samples: 545691260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:09:50,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 20:09:50,608][19107] Updated weights for policy 0, policy_version 202735 (0.0029) [2024-06-18 20:09:54,317][19107] Updated weights for policy 0, policy_version 202745 (0.0027) [2024-06-18 20:09:55,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 3321823232. Throughput: 0: 41907.0. Samples: 545944300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:09:55,501][18875] Avg episode reward: [(0, '0.425')] [2024-06-18 20:09:58,462][19107] Updated weights for policy 0, policy_version 202755 (0.0044) [2024-06-18 20:10:00,500][18875] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3322019840. Throughput: 0: 42176.8. Samples: 546072120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:10:00,501][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 20:10:02,407][19107] Updated weights for policy 0, policy_version 202765 (0.0032) [2024-06-18 20:10:05,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42054.8, 300 sec: 41876.4). Total num frames: 3322249216. Throughput: 0: 41944.8. Samples: 546323880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:10:05,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 20:10:06,258][19107] Updated weights for policy 0, policy_version 202775 (0.0033) [2024-06-18 20:10:09,931][19107] Updated weights for policy 0, policy_version 202785 (0.0031) [2024-06-18 20:10:10,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3322445824. Throughput: 0: 42038.1. Samples: 546579440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:10:10,501][18875] Avg episode reward: [(0, '0.802')] [2024-06-18 20:10:14,087][19107] Updated weights for policy 0, policy_version 202795 (0.0040) [2024-06-18 20:10:15,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3322642432. Throughput: 0: 42089.6. Samples: 546703840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:10:15,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 20:10:17,512][19107] Updated weights for policy 0, policy_version 202805 (0.0031) [2024-06-18 20:10:20,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3322888192. Throughput: 0: 42138.6. Samples: 546958100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-18 20:10:20,501][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 20:10:21,712][19107] Updated weights for policy 0, policy_version 202815 (0.0033) [2024-06-18 20:10:25,270][19107] Updated weights for policy 0, policy_version 202825 (0.0032) [2024-06-18 20:10:25,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3323084800. Throughput: 0: 42140.4. Samples: 547204620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:10:25,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 20:10:29,529][19107] Updated weights for policy 0, policy_version 202835 (0.0035) [2024-06-18 20:10:30,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3323265024. Throughput: 0: 42212.2. Samples: 547336840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:10:30,500][18875] Avg episode reward: [(0, '0.689')] [2024-06-18 20:10:32,904][19107] Updated weights for policy 0, policy_version 202845 (0.0029) [2024-06-18 20:10:35,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3323494400. Throughput: 0: 42075.4. Samples: 547584660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:10:35,501][18875] Avg episode reward: [(0, '0.700')] [2024-06-18 20:10:35,633][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000202851_3323510784.pth... [2024-06-18 20:10:35,702][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000202238_3313467392.pth [2024-06-18 20:10:37,311][19107] Updated weights for policy 0, policy_version 202855 (0.0034) [2024-06-18 20:10:40,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3323707392. Throughput: 0: 42042.7. Samples: 547836220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:10:40,501][18875] Avg episode reward: [(0, '0.633')] [2024-06-18 20:10:41,155][19107] Updated weights for policy 0, policy_version 202865 (0.0034) [2024-06-18 20:10:44,912][19107] Updated weights for policy 0, policy_version 202875 (0.0031) [2024-06-18 20:10:45,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 41876.4). Total num frames: 3323920384. Throughput: 0: 41986.9. Samples: 547961520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:10:45,500][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 20:10:48,925][19107] Updated weights for policy 0, policy_version 202885 (0.0032) [2024-06-18 20:10:50,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42052.1, 300 sec: 41876.4). Total num frames: 3324116992. Throughput: 0: 41980.8. Samples: 548213020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:10:50,501][18875] Avg episode reward: [(0, '0.733')] [2024-06-18 20:10:50,531][19087] Signal inference workers to stop experience collection... (8000 times) [2024-06-18 20:10:50,584][19107] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-06-18 20:10:50,645][19087] Signal inference workers to resume experience collection... 
(8000 times) [2024-06-18 20:10:50,645][19107] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-06-18 20:10:52,743][19107] Updated weights for policy 0, policy_version 202895 (0.0043) [2024-06-18 20:10:55,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3324329984. Throughput: 0: 41854.7. Samples: 548462900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:10:55,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 20:10:56,704][19107] Updated weights for policy 0, policy_version 202905 (0.0041) [2024-06-18 20:11:00,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3324542976. Throughput: 0: 41797.8. Samples: 548584740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:11:00,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 20:11:01,028][19107] Updated weights for policy 0, policy_version 202915 (0.0033) [2024-06-18 20:11:04,467][19107] Updated weights for policy 0, policy_version 202925 (0.0038) [2024-06-18 20:11:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3324739584. Throughput: 0: 41714.7. Samples: 548835260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:11:05,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 20:11:08,859][19107] Updated weights for policy 0, policy_version 202935 (0.0030) [2024-06-18 20:11:10,500][18875] Fps is (10 sec: 39322.5, 60 sec: 41506.3, 300 sec: 41932.0). Total num frames: 3324936192. Throughput: 0: 41904.1. Samples: 549090300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:11:10,500][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 20:11:12,418][19107] Updated weights for policy 0, policy_version 202945 (0.0053) [2024-06-18 20:11:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 3325165568. Throughput: 0: 41669.7. Samples: 549211980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:11:15,501][18875] Avg episode reward: [(0, '0.640')] [2024-06-18 20:11:16,992][19107] Updated weights for policy 0, policy_version 202955 (0.0039) [2024-06-18 20:11:20,132][19107] Updated weights for policy 0, policy_version 202965 (0.0032) [2024-06-18 20:11:20,500][18875] Fps is (10 sec: 44236.2, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3325378560. Throughput: 0: 41671.1. Samples: 549459860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:11:20,512][18875] Avg episode reward: [(0, '0.608')] [2024-06-18 20:11:24,801][19107] Updated weights for policy 0, policy_version 202975 (0.0032) [2024-06-18 20:11:25,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 3325558784. Throughput: 0: 41771.9. Samples: 549715960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 25.0) [2024-06-18 20:11:25,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 20:11:27,779][19107] Updated weights for policy 0, policy_version 202985 (0.0023) [2024-06-18 20:11:30,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 41876.9). Total num frames: 3325804544. Throughput: 0: 41679.5. Samples: 549837100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:11:30,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 20:11:32,599][19107] Updated weights for policy 0, policy_version 202995 (0.0037) [2024-06-18 20:11:35,500][18875] Fps is (10 sec: 44237.9, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3326001152. Throughput: 0: 41724.3. Samples: 550090600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:11:35,500][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 20:11:35,862][19107] Updated weights for policy 0, policy_version 203005 (0.0048) [2024-06-18 20:11:40,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3326181376. Throughput: 0: 41811.6. Samples: 550344420. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:11:40,501][18875] Avg episode reward: [(0, '0.739')] [2024-06-18 20:11:40,782][19107] Updated weights for policy 0, policy_version 203015 (0.0045) [2024-06-18 20:11:43,496][19107] Updated weights for policy 0, policy_version 203025 (0.0044) [2024-06-18 20:11:45,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41932.0). Total num frames: 3326410752. Throughput: 0: 41833.5. Samples: 550467240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:11:45,501][18875] Avg episode reward: [(0, '0.354')] [2024-06-18 20:11:48,496][19107] Updated weights for policy 0, policy_version 203035 (0.0037) [2024-06-18 20:11:50,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3326640128. Throughput: 0: 42017.7. Samples: 550726060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:11:50,501][18875] Avg episode reward: [(0, '0.612')] [2024-06-18 20:11:51,309][19107] Updated weights for policy 0, policy_version 203045 (0.0025) [2024-06-18 20:11:55,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 3326820352. Throughput: 0: 41945.7. Samples: 550977860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:11:55,501][18875] Avg episode reward: [(0, '0.801')] [2024-06-18 20:11:56,111][19107] Updated weights for policy 0, policy_version 203055 (0.0034) [2024-06-18 20:11:59,094][19107] Updated weights for policy 0, policy_version 203065 (0.0040) [2024-06-18 20:12:00,504][18875] Fps is (10 sec: 42583.4, 60 sec: 42049.8, 300 sec: 41987.0). Total num frames: 3327066112. Throughput: 0: 41918.0. Samples: 551098440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:12:00,505][18875] Avg episode reward: [(0, '0.527')] [2024-06-18 20:12:03,620][19107] Updated weights for policy 0, policy_version 203075 (0.0034) [2024-06-18 20:12:05,500][18875] Fps is (10 sec: 45874.7, 60 sec: 42325.3, 300 sec: 41987.5). 
Total num frames: 3327279104. Throughput: 0: 42204.8. Samples: 551359080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:12:05,501][18875] Avg episode reward: [(0, '0.672')] [2024-06-18 20:12:06,694][19107] Updated weights for policy 0, policy_version 203085 (0.0048) [2024-06-18 20:12:10,500][18875] Fps is (10 sec: 39336.0, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3327459328. Throughput: 0: 42151.7. Samples: 551612780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:12:10,501][18875] Avg episode reward: [(0, '0.612')] [2024-06-18 20:12:11,353][19107] Updated weights for policy 0, policy_version 203095 (0.0040) [2024-06-18 20:12:14,309][19107] Updated weights for policy 0, policy_version 203105 (0.0026) [2024-06-18 20:12:15,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3327688704. Throughput: 0: 42164.4. Samples: 551734500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:12:15,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 20:12:16,901][19087] Signal inference workers to stop experience collection... (8050 times) [2024-06-18 20:12:16,901][19087] Signal inference workers to resume experience collection... (8050 times) [2024-06-18 20:12:16,922][19107] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-06-18 20:12:16,923][19107] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-06-18 20:12:18,949][19107] Updated weights for policy 0, policy_version 203115 (0.0045) [2024-06-18 20:12:20,500][18875] Fps is (10 sec: 44235.9, 60 sec: 42052.2, 300 sec: 41987.4). Total num frames: 3327901696. Throughput: 0: 42169.1. Samples: 551988220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:12:20,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 20:12:22,143][19107] Updated weights for policy 0, policy_version 203125 (0.0030) [2024-06-18 20:12:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3328098304. Throughput: 0: 42156.4. Samples: 552241460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:12:25,501][18875] Avg episode reward: [(0, '0.391')] [2024-06-18 20:12:26,627][19107] Updated weights for policy 0, policy_version 203135 (0.0028) [2024-06-18 20:12:30,338][19107] Updated weights for policy 0, policy_version 203145 (0.0044) [2024-06-18 20:12:30,500][18875] Fps is (10 sec: 42599.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3328327680. Throughput: 0: 42209.9. Samples: 552366680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:12:30,501][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 20:12:34,652][19107] Updated weights for policy 0, policy_version 203155 (0.0048) [2024-06-18 20:12:35,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3328524288. Throughput: 0: 42052.1. Samples: 552618400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 20:12:35,500][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 20:12:35,650][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000203158_3328540672.pth... [2024-06-18 20:12:35,699][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000202543_3318464512.pth [2024-06-18 20:12:38,248][19107] Updated weights for policy 0, policy_version 203165 (0.0041) [2024-06-18 20:12:40,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 3328737280. Throughput: 0: 41898.2. Samples: 552863280. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:12:40,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 20:12:42,639][19107] Updated weights for policy 0, policy_version 203175 (0.0029) [2024-06-18 20:12:45,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3328933888. Throughput: 0: 42156.6. Samples: 552995340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:12:45,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 20:12:46,217][19107] Updated weights for policy 0, policy_version 203185 (0.0035) [2024-06-18 20:12:50,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 3329130496. Throughput: 0: 41910.3. Samples: 553245040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:12:50,501][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 20:12:50,545][19107] Updated weights for policy 0, policy_version 203195 (0.0028) [2024-06-18 20:12:54,082][19107] Updated weights for policy 0, policy_version 203205 (0.0029) [2024-06-18 20:12:55,500][18875] Fps is (10 sec: 45875.5, 60 sec: 42871.4, 300 sec: 42043.0). Total num frames: 3329392640. Throughput: 0: 41794.1. Samples: 553493520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:12:55,501][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 20:12:58,460][19107] Updated weights for policy 0, policy_version 203215 (0.0026) [2024-06-18 20:13:00,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41781.7, 300 sec: 41987.5). Total num frames: 3329572864. Throughput: 0: 42144.0. Samples: 553630980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:13:00,501][18875] Avg episode reward: [(0, '0.391')] [2024-06-18 20:13:01,894][19107] Updated weights for policy 0, policy_version 203225 (0.0046) [2024-06-18 20:13:05,500][18875] Fps is (10 sec: 37683.7, 60 sec: 41506.3, 300 sec: 41821.4). Total num frames: 3329769472. Throughput: 0: 41933.1. Samples: 553875200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:13:05,501][18875] Avg episode reward: [(0, '0.363')] [2024-06-18 20:13:06,129][19107] Updated weights for policy 0, policy_version 203235 (0.0033) [2024-06-18 20:13:09,629][19107] Updated weights for policy 0, policy_version 203245 (0.0045) [2024-06-18 20:13:10,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3329998848. Throughput: 0: 41866.3. Samples: 554125440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:13:10,501][18875] Avg episode reward: [(0, '0.700')] [2024-06-18 20:13:13,853][19107] Updated weights for policy 0, policy_version 203255 (0.0045) [2024-06-18 20:13:15,504][18875] Fps is (10 sec: 44220.6, 60 sec: 42049.8, 300 sec: 41987.0). Total num frames: 3330211840. Throughput: 0: 41915.7. Samples: 554253040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:13:15,505][18875] Avg episode reward: [(0, '0.800')] [2024-06-18 20:13:17,791][19107] Updated weights for policy 0, policy_version 203265 (0.0030) [2024-06-18 20:13:20,504][18875] Fps is (10 sec: 40944.9, 60 sec: 41776.8, 300 sec: 41875.9). Total num frames: 3330408448. Throughput: 0: 41806.3. Samples: 554499840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:13:20,505][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 20:13:21,565][19107] Updated weights for policy 0, policy_version 203275 (0.0039) [2024-06-18 20:13:25,500][18875] Fps is (10 sec: 39335.9, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3330605056. Throughput: 0: 42034.3. Samples: 554754820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:13:25,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 20:13:25,659][19107] Updated weights for policy 0, policy_version 203285 (0.0023) [2024-06-18 20:13:29,298][19107] Updated weights for policy 0, policy_version 203295 (0.0027) [2024-06-18 20:13:30,500][18875] Fps is (10 sec: 40975.1, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3330818048. Throughput: 0: 41811.3. Samples: 554876840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:13:30,500][18875] Avg episode reward: [(0, '0.590')] [2024-06-18 20:13:30,635][19087] Signal inference workers to stop experience collection... (8100 times) [2024-06-18 20:13:30,636][19087] Signal inference workers to resume experience collection... (8100 times) [2024-06-18 20:13:30,658][19107] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-06-18 20:13:30,685][19107] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-06-18 20:13:33,361][19107] Updated weights for policy 0, policy_version 203305 (0.0039) [2024-06-18 20:13:35,504][18875] Fps is (10 sec: 44220.6, 60 sec: 42049.7, 300 sec: 41986.9). Total num frames: 3331047424. Throughput: 0: 41734.4. Samples: 555123240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:13:35,504][18875] Avg episode reward: [(0, '0.612')] [2024-06-18 20:13:37,254][19107] Updated weights for policy 0, policy_version 203315 (0.0040) [2024-06-18 20:13:40,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41821.1). Total num frames: 3331211264. Throughput: 0: 41949.9. Samples: 555381260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 20:13:40,500][18875] Avg episode reward: [(0, '0.701')] [2024-06-18 20:13:41,267][19107] Updated weights for policy 0, policy_version 203325 (0.0032) [2024-06-18 20:13:44,898][19107] Updated weights for policy 0, policy_version 203335 (0.0032) [2024-06-18 20:13:45,500][18875] Fps is (10 sec: 40974.9, 60 sec: 42052.4, 300 sec: 41932.0). Total num frames: 3331457024. Throughput: 0: 41549.8. Samples: 555500720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:13:45,501][18875] Avg episode reward: [(0, '0.811')] [2024-06-18 20:13:48,870][19107] Updated weights for policy 0, policy_version 203345 (0.0036) [2024-06-18 20:13:50,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3331670016. Throughput: 0: 41744.3. Samples: 555753700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:13:50,501][18875] Avg episode reward: [(0, '0.805')] [2024-06-18 20:13:52,825][19107] Updated weights for policy 0, policy_version 203355 (0.0036) [2024-06-18 20:13:55,500][18875] Fps is (10 sec: 39321.1, 60 sec: 40960.0, 300 sec: 41876.4). Total num frames: 3331850240. Throughput: 0: 41830.1. Samples: 556007800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:13:55,501][18875] Avg episode reward: [(0, '0.845')] [2024-06-18 20:13:56,653][19107] Updated weights for policy 0, policy_version 203365 (0.0045) [2024-06-18 20:14:00,496][19107] Updated weights for policy 0, policy_version 203375 (0.0042) [2024-06-18 20:14:00,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41932.5). Total num frames: 3332096000. Throughput: 0: 41606.4. Samples: 556125180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:00,501][18875] Avg episode reward: [(0, '0.592')] [2024-06-18 20:14:04,848][19107] Updated weights for policy 0, policy_version 203385 (0.0038) [2024-06-18 20:14:05,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3332308992. Throughput: 0: 41793.5. Samples: 556380400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:05,501][18875] Avg episode reward: [(0, '0.760')] [2024-06-18 20:14:08,115][19107] Updated weights for policy 0, policy_version 203395 (0.0033) [2024-06-18 20:14:10,504][18875] Fps is (10 sec: 37669.6, 60 sec: 41230.6, 300 sec: 41820.4). Total num frames: 3332472832. Throughput: 0: 41717.9. Samples: 556632280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:10,504][18875] Avg episode reward: [(0, '0.762')] [2024-06-18 20:14:12,607][19107] Updated weights for policy 0, policy_version 203405 (0.0034) [2024-06-18 20:14:15,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41781.7, 300 sec: 41820.9). Total num frames: 3332718592. Throughput: 0: 41716.0. Samples: 556754060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:15,501][18875] Avg episode reward: [(0, '0.409')] [2024-06-18 20:14:15,757][19107] Updated weights for policy 0, policy_version 203415 (0.0041) [2024-06-18 20:14:20,499][19107] Updated weights for policy 0, policy_version 203425 (0.0035) [2024-06-18 20:14:20,500][18875] Fps is (10 sec: 44252.6, 60 sec: 41781.7, 300 sec: 41876.4). Total num frames: 3332915200. Throughput: 0: 41878.5. Samples: 557007620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:20,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 20:14:23,546][19107] Updated weights for policy 0, policy_version 203435 (0.0045) [2024-06-18 20:14:25,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3333111808. Throughput: 0: 41583.1. Samples: 557252500. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:25,501][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 20:14:28,300][19107] Updated weights for policy 0, policy_version 203445 (0.0041) [2024-06-18 20:14:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3333324800. Throughput: 0: 41824.9. Samples: 557382840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:30,501][18875] Avg episode reward: [(0, '0.330')] [2024-06-18 20:14:31,424][19107] Updated weights for policy 0, policy_version 203455 (0.0048) [2024-06-18 20:14:35,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41235.5, 300 sec: 41820.8). Total num frames: 3333521408. Throughput: 0: 41594.2. Samples: 557625440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:35,501][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 20:14:35,519][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000203463_3333537792.pth... [2024-06-18 20:14:35,583][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000202851_3323510784.pth [2024-06-18 20:14:36,210][19107] Updated weights for policy 0, policy_version 203465 (0.0037) [2024-06-18 20:14:39,284][19107] Updated weights for policy 0, policy_version 203475 (0.0033) [2024-06-18 20:14:40,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3333734400. Throughput: 0: 41543.1. Samples: 557877240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:40,501][18875] Avg episode reward: [(0, '0.282')] [2024-06-18 20:14:44,023][19107] Updated weights for policy 0, policy_version 203485 (0.0032) [2024-06-18 20:14:44,518][19087] Signal inference workers to stop experience collection... (8150 times) [2024-06-18 20:14:44,519][19087] Signal inference workers to resume experience collection... 
(8150 times) [2024-06-18 20:14:44,560][19107] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-06-18 20:14:44,564][19107] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-06-18 20:14:45,504][18875] Fps is (10 sec: 42583.3, 60 sec: 41503.6, 300 sec: 41875.9). Total num frames: 3333947392. Throughput: 0: 41832.2. Samples: 558007780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:45,504][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 20:14:47,647][19107] Updated weights for policy 0, policy_version 203495 (0.0031) [2024-06-18 20:14:50,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 3334160384. Throughput: 0: 41675.6. Samples: 558255800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 20:14:50,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 20:14:51,996][19107] Updated weights for policy 0, policy_version 203505 (0.0042) [2024-06-18 20:14:55,500][18875] Fps is (10 sec: 42613.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3334373376. Throughput: 0: 41702.0. Samples: 558508720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:14:55,501][18875] Avg episode reward: [(0, '0.368')] [2024-06-18 20:14:55,508][19107] Updated weights for policy 0, policy_version 203515 (0.0036) [2024-06-18 20:14:59,552][19107] Updated weights for policy 0, policy_version 203525 (0.0035) [2024-06-18 20:15:00,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 3334586368. Throughput: 0: 41859.4. Samples: 558637740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:00,501][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 20:15:03,658][19107] Updated weights for policy 0, policy_version 203535 (0.0033) [2024-06-18 20:15:05,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3334799360. Throughput: 0: 41845.8. Samples: 558890680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:05,501][18875] Avg episode reward: [(0, '0.656')] [2024-06-18 20:15:07,199][19107] Updated weights for policy 0, policy_version 203545 (0.0028) [2024-06-18 20:15:10,500][18875] Fps is (10 sec: 42599.2, 60 sec: 42327.9, 300 sec: 41932.0). Total num frames: 3335012352. Throughput: 0: 41998.3. Samples: 559142420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:10,501][18875] Avg episode reward: [(0, '0.656')] [2024-06-18 20:15:11,292][19107] Updated weights for policy 0, policy_version 203555 (0.0036) [2024-06-18 20:15:15,003][19107] Updated weights for policy 0, policy_version 203565 (0.0036) [2024-06-18 20:15:15,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3335225344. Throughput: 0: 41832.4. Samples: 559265300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:15,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 20:15:19,094][19107] Updated weights for policy 0, policy_version 203575 (0.0033) [2024-06-18 20:15:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3335438336. Throughput: 0: 42166.8. Samples: 559522940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:20,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 20:15:22,701][19107] Updated weights for policy 0, policy_version 203585 (0.0037) [2024-06-18 20:15:25,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3335634944. Throughput: 0: 42087.2. Samples: 559771160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:25,501][18875] Avg episode reward: [(0, '0.303')] [2024-06-18 20:15:26,832][19107] Updated weights for policy 0, policy_version 203595 (0.0043) [2024-06-18 20:15:30,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3335847936. Throughput: 0: 41967.3. Samples: 559896160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:30,501][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 20:15:30,594][19107] Updated weights for policy 0, policy_version 203605 (0.0037) [2024-06-18 20:15:34,485][19107] Updated weights for policy 0, policy_version 203615 (0.0040) [2024-06-18 20:15:35,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 41931.9). Total num frames: 3336077312. Throughput: 0: 42132.8. Samples: 560151780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:35,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 20:15:38,543][19107] Updated weights for policy 0, policy_version 203625 (0.0056) [2024-06-18 20:15:40,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 3336257536. Throughput: 0: 42152.6. Samples: 560405580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:40,500][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 20:15:42,203][19107] Updated weights for policy 0, policy_version 203635 (0.0033) [2024-06-18 20:15:45,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42327.9, 300 sec: 41932.0). Total num frames: 3336486912. Throughput: 0: 41932.1. Samples: 560524680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:45,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 20:15:46,145][19107] Updated weights for policy 0, policy_version 203645 (0.0031) [2024-06-18 20:15:49,837][19107] Updated weights for policy 0, policy_version 203655 (0.0040) [2024-06-18 20:15:50,500][18875] Fps is (10 sec: 44235.8, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3336699904. Throughput: 0: 42033.6. Samples: 560782200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:50,501][18875] Avg episode reward: [(0, '0.270')] [2024-06-18 20:15:53,933][19107] Updated weights for policy 0, policy_version 203665 (0.0041) [2024-06-18 20:15:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41876.4). 
Total num frames: 3336896512. Throughput: 0: 42046.5. Samples: 561034520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 23.0) [2024-06-18 20:15:55,501][18875] Avg episode reward: [(0, '0.309')] [2024-06-18 20:15:57,776][19107] Updated weights for policy 0, policy_version 203675 (0.0036) [2024-06-18 20:16:00,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3337109504. Throughput: 0: 42022.2. Samples: 561156300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:00,501][18875] Avg episode reward: [(0, '0.630')] [2024-06-18 20:16:02,003][19107] Updated weights for policy 0, policy_version 203685 (0.0038) [2024-06-18 20:16:05,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41987.4). Total num frames: 3337322496. Throughput: 0: 41919.0. Samples: 561409300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:05,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 20:16:05,667][19107] Updated weights for policy 0, policy_version 203695 (0.0041) [2024-06-18 20:16:09,875][19107] Updated weights for policy 0, policy_version 203705 (0.0040) [2024-06-18 20:16:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3337519104. Throughput: 0: 41899.5. Samples: 561656640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:10,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 20:16:13,715][19107] Updated weights for policy 0, policy_version 203715 (0.0032) [2024-06-18 20:16:15,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3337715712. Throughput: 0: 41957.8. Samples: 561784260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:15,501][18875] Avg episode reward: [(0, '0.411')] [2024-06-18 20:16:17,510][19107] Updated weights for policy 0, policy_version 203725 (0.0030) [2024-06-18 20:16:20,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41987.5). 
Total num frames: 3337945088. Throughput: 0: 41859.6. Samples: 562035460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:20,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 20:16:21,562][19107] Updated weights for policy 0, policy_version 203735 (0.0034) [2024-06-18 20:16:25,502][18875] Fps is (10 sec: 40954.2, 60 sec: 41505.1, 300 sec: 41765.1). Total num frames: 3338125312. Throughput: 0: 41756.4. Samples: 562284680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:25,502][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 20:16:25,819][19107] Updated weights for policy 0, policy_version 203745 (0.0040) [2024-06-18 20:16:26,684][19087] Signal inference workers to stop experience collection... (8200 times) [2024-06-18 20:16:26,690][19087] Signal inference workers to resume experience collection... (8200 times) [2024-06-18 20:16:26,717][19107] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-06-18 20:16:26,717][19107] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-06-18 20:16:29,447][19107] Updated weights for policy 0, policy_version 203755 (0.0033) [2024-06-18 20:16:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3338371072. Throughput: 0: 41859.4. Samples: 562408360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:30,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 20:16:33,570][19107] Updated weights for policy 0, policy_version 203765 (0.0026) [2024-06-18 20:16:35,500][18875] Fps is (10 sec: 42604.4, 60 sec: 41233.2, 300 sec: 41931.9). Total num frames: 3338551296. Throughput: 0: 41838.3. Samples: 562664920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:35,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 20:16:35,552][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000203770_3338567680.pth... 
[2024-06-18 20:16:35,628][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000203158_3328540672.pth [2024-06-18 20:16:37,260][19107] Updated weights for policy 0, policy_version 203775 (0.0033) [2024-06-18 20:16:40,500][18875] Fps is (10 sec: 40961.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3338780672. Throughput: 0: 41616.6. Samples: 562907260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:40,500][18875] Avg episode reward: [(0, '0.839')] [2024-06-18 20:16:41,226][19107] Updated weights for policy 0, policy_version 203785 (0.0030) [2024-06-18 20:16:45,136][19107] Updated weights for policy 0, policy_version 203795 (0.0037) [2024-06-18 20:16:45,503][18875] Fps is (10 sec: 44223.9, 60 sec: 41777.2, 300 sec: 41876.0). Total num frames: 3338993664. Throughput: 0: 41860.9. Samples: 563040160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:45,504][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 20:16:48,783][19107] Updated weights for policy 0, policy_version 203805 (0.0031) [2024-06-18 20:16:50,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3339206656. Throughput: 0: 41837.8. Samples: 563292000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:50,501][18875] Avg episode reward: [(0, '0.621')] [2024-06-18 20:16:52,872][19107] Updated weights for policy 0, policy_version 203815 (0.0031) [2024-06-18 20:16:55,500][18875] Fps is (10 sec: 42610.6, 60 sec: 42052.3, 300 sec: 41876.9). Total num frames: 3339419648. Throughput: 0: 41919.5. Samples: 563543020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:16:55,501][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 20:16:56,847][19107] Updated weights for policy 0, policy_version 203825 (0.0038) [2024-06-18 20:17:00,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3339616256. Throughput: 0: 41942.3. Samples: 563671660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:17:00,501][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 20:17:00,634][19107] Updated weights for policy 0, policy_version 203835 (0.0045) [2024-06-18 20:17:04,862][19107] Updated weights for policy 0, policy_version 203845 (0.0037) [2024-06-18 20:17:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3339829248. Throughput: 0: 41908.9. Samples: 563921360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 20:17:05,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 20:17:08,714][19107] Updated weights for policy 0, policy_version 203855 (0.0038) [2024-06-18 20:17:10,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 3340058624. Throughput: 0: 41838.7. Samples: 564167360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:10,501][18875] Avg episode reward: [(0, '0.355')] [2024-06-18 20:17:12,692][19107] Updated weights for policy 0, policy_version 203865 (0.0036) [2024-06-18 20:17:15,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3340238848. Throughput: 0: 41893.5. Samples: 564293560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:15,507][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 20:17:16,389][19107] Updated weights for policy 0, policy_version 203875 (0.0032) [2024-06-18 20:17:20,183][19107] Updated weights for policy 0, policy_version 203885 (0.0026) [2024-06-18 20:17:20,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3340468224. Throughput: 0: 41863.9. Samples: 564548800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:20,501][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 20:17:24,268][19107] Updated weights for policy 0, policy_version 203895 (0.0035) [2024-06-18 20:17:25,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42326.3, 300 sec: 41820.8). Total num frames: 3340664832. Throughput: 0: 42155.4. Samples: 564804260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:25,501][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 20:17:27,968][19107] Updated weights for policy 0, policy_version 203905 (0.0032) [2024-06-18 20:17:30,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3340877824. Throughput: 0: 41881.8. Samples: 564924720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:30,501][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 20:17:32,038][19107] Updated weights for policy 0, policy_version 203915 (0.0034) [2024-06-18 20:17:32,978][19087] Signal inference workers to stop experience collection... (8250 times) [2024-06-18 20:17:33,009][19107] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-06-18 20:17:33,030][19087] Signal inference workers to resume experience collection... (8250 times) [2024-06-18 20:17:33,031][19107] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-06-18 20:17:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3341074432. Throughput: 0: 41877.9. Samples: 565176500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:35,501][18875] Avg episode reward: [(0, '0.738')] [2024-06-18 20:17:35,694][19107] Updated weights for policy 0, policy_version 203925 (0.0029) [2024-06-18 20:17:39,758][19107] Updated weights for policy 0, policy_version 203935 (0.0039) [2024-06-18 20:17:40,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3341287424. Throughput: 0: 41903.5. 
Samples: 565428680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:40,501][18875] Avg episode reward: [(0, '0.548')] [2024-06-18 20:17:43,323][19107] Updated weights for policy 0, policy_version 203945 (0.0040) [2024-06-18 20:17:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41781.2, 300 sec: 41931.9). Total num frames: 3341500416. Throughput: 0: 41798.6. Samples: 565552600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:45,501][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 20:17:47,626][19107] Updated weights for policy 0, policy_version 203955 (0.0041) [2024-06-18 20:17:50,500][18875] Fps is (10 sec: 42599.3, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3341713408. Throughput: 0: 41943.3. Samples: 565808800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:50,500][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 20:17:50,911][19107] Updated weights for policy 0, policy_version 203965 (0.0034) [2024-06-18 20:17:55,440][19107] Updated weights for policy 0, policy_version 203975 (0.0038) [2024-06-18 20:17:55,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3341926400. Throughput: 0: 42165.6. Samples: 566064820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:17:55,501][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 20:17:58,573][19107] Updated weights for policy 0, policy_version 203985 (0.0049) [2024-06-18 20:18:00,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3342155776. Throughput: 0: 42119.9. Samples: 566188960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:18:00,501][18875] Avg episode reward: [(0, '0.699')] [2024-06-18 20:18:03,317][19107] Updated weights for policy 0, policy_version 203995 (0.0047) [2024-06-18 20:18:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3342352384. Throughput: 0: 42223.1. 
Samples: 566448840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:18:05,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 20:18:06,240][19107] Updated weights for policy 0, policy_version 204005 (0.0030) [2024-06-18 20:18:10,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41821.4). Total num frames: 3342548992. Throughput: 0: 42032.0. Samples: 566695700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-18 20:18:10,501][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 20:18:11,118][19107] Updated weights for policy 0, policy_version 204015 (0.0037) [2024-06-18 20:18:14,457][19107] Updated weights for policy 0, policy_version 204025 (0.0032) [2024-06-18 20:18:15,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 41988.0). Total num frames: 3342794752. Throughput: 0: 42148.6. Samples: 566821400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:18:15,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 20:18:18,913][19107] Updated weights for policy 0, policy_version 204035 (0.0027) [2024-06-18 20:18:20,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3342991360. Throughput: 0: 42278.2. Samples: 567079020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:18:20,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 20:18:22,186][19107] Updated weights for policy 0, policy_version 204045 (0.0038) [2024-06-18 20:18:25,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3343187968. Throughput: 0: 42259.2. Samples: 567330340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:18:25,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 20:18:26,827][19107] Updated weights for policy 0, policy_version 204055 (0.0037) [2024-06-18 20:18:29,979][19107] Updated weights for policy 0, policy_version 204065 (0.0032) [2024-06-18 20:18:30,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 41988.0). Total num frames: 3343433728. Throughput: 0: 42391.2. Samples: 567460200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:18:30,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 20:18:34,744][19107] Updated weights for policy 0, policy_version 204075 (0.0041) [2024-06-18 20:18:35,504][18875] Fps is (10 sec: 39307.2, 60 sec: 41776.6, 300 sec: 41931.4). Total num frames: 3343581184. Throughput: 0: 42229.8. Samples: 567709300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:18:35,505][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 20:18:35,646][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000204077_3343597568.pth... [2024-06-18 20:18:35,706][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000203463_3333537792.pth [2024-06-18 20:18:37,720][19107] Updated weights for policy 0, policy_version 204085 (0.0038) [2024-06-18 20:18:40,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 3343826944. Throughput: 0: 41928.9. Samples: 567951620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:18:40,501][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 20:18:42,616][19107] Updated weights for policy 0, policy_version 204095 (0.0027) [2024-06-18 20:18:43,511][19087] Signal inference workers to stop experience collection... (8300 times) [2024-06-18 20:18:43,550][19107] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-06-18 20:18:43,557][19087] Signal inference workers to resume experience collection... 
(8300 times) [2024-06-18 20:18:43,568][19107] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-06-18 20:18:45,456][19107] Updated weights for policy 0, policy_version 204105 (0.0029) [2024-06-18 20:18:45,504][18875] Fps is (10 sec: 47514.0, 60 sec: 42595.9, 300 sec: 41987.0). Total num frames: 3344056320. Throughput: 0: 42151.8. Samples: 568085940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:18:45,504][18875] Avg episode reward: [(0, '0.649')] [2024-06-18 20:18:50,494][19107] Updated weights for policy 0, policy_version 204115 (0.0025) [2024-06-18 20:18:50,502][18875] Fps is (10 sec: 39315.7, 60 sec: 41778.0, 300 sec: 41931.7). Total num frames: 3344220160. Throughput: 0: 41939.1. Samples: 568336160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:18:50,502][18875] Avg episode reward: [(0, '0.761')] [2024-06-18 20:18:53,784][19107] Updated weights for policy 0, policy_version 204125 (0.0026) [2024-06-18 20:18:55,500][18875] Fps is (10 sec: 42613.5, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3344482304. Throughput: 0: 41639.1. Samples: 568569460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:18:55,501][18875] Avg episode reward: [(0, '0.661')] [2024-06-18 20:18:58,365][19107] Updated weights for policy 0, policy_version 204135 (0.0032) [2024-06-18 20:19:00,504][18875] Fps is (10 sec: 44227.8, 60 sec: 41776.7, 300 sec: 41875.9). Total num frames: 3344662528. Throughput: 0: 41950.4. Samples: 568709320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:19:00,504][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 20:19:01,620][19107] Updated weights for policy 0, policy_version 204145 (0.0030) [2024-06-18 20:19:05,500][18875] Fps is (10 sec: 36044.6, 60 sec: 41506.1, 300 sec: 41932.4). Total num frames: 3344842752. Throughput: 0: 41607.0. Samples: 568951340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:19:05,501][18875] Avg episode reward: [(0, '0.665')] [2024-06-18 20:19:06,051][19107] Updated weights for policy 0, policy_version 204155 (0.0035) [2024-06-18 20:19:09,463][19107] Updated weights for policy 0, policy_version 204165 (0.0028) [2024-06-18 20:19:10,500][18875] Fps is (10 sec: 44252.4, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3345104896. Throughput: 0: 41570.2. Samples: 569201000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:19:10,501][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 20:19:13,717][19107] Updated weights for policy 0, policy_version 204175 (0.0030) [2024-06-18 20:19:15,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 3345268736. Throughput: 0: 41671.5. Samples: 569335420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:19:15,501][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 20:19:17,449][19107] Updated weights for policy 0, policy_version 204185 (0.0039) [2024-06-18 20:19:20,500][18875] Fps is (10 sec: 36045.1, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3345465344. Throughput: 0: 41417.2. Samples: 569572920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 20:19:20,501][18875] Avg episode reward: [(0, '0.298')] [2024-06-18 20:19:21,723][19107] Updated weights for policy 0, policy_version 204195 (0.0028) [2024-06-18 20:19:25,219][19107] Updated weights for policy 0, policy_version 204205 (0.0040) [2024-06-18 20:19:25,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3345711104. Throughput: 0: 41816.5. Samples: 569833360. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:19:25,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 20:19:29,333][19107] Updated weights for policy 0, policy_version 204215 (0.0037) [2024-06-18 20:19:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 41931.9). Total num frames: 3345891328. Throughput: 0: 41651.8. Samples: 569960120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:19:30,501][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 20:19:32,947][19107] Updated weights for policy 0, policy_version 204225 (0.0042) [2024-06-18 20:19:35,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42327.9, 300 sec: 41987.5). Total num frames: 3346120704. Throughput: 0: 41473.8. Samples: 570202420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:19:35,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 20:19:36,998][19107] Updated weights for policy 0, policy_version 204235 (0.0028) [2024-06-18 20:19:40,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 41988.0). Total num frames: 3346333696. Throughput: 0: 42147.7. Samples: 570466100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:19:40,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 20:19:40,707][19107] Updated weights for policy 0, policy_version 204245 (0.0040) [2024-06-18 20:19:44,891][19107] Updated weights for policy 0, policy_version 204255 (0.0040) [2024-06-18 20:19:45,504][18875] Fps is (10 sec: 40945.6, 60 sec: 41233.1, 300 sec: 41931.4). Total num frames: 3346530304. Throughput: 0: 41772.4. Samples: 570589080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:19:45,504][18875] Avg episode reward: [(0, '0.396')] [2024-06-18 20:19:46,575][19087] Signal inference workers to stop experience collection... (8350 times) [2024-06-18 20:19:46,577][19087] Signal inference workers to resume experience collection... 
(8350 times) [2024-06-18 20:19:46,618][19107] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-06-18 20:19:46,618][19107] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-06-18 20:19:48,427][19107] Updated weights for policy 0, policy_version 204265 (0.0034) [2024-06-18 20:19:50,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42326.4, 300 sec: 41987.5). Total num frames: 3346759680. Throughput: 0: 41797.5. Samples: 570832220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:19:50,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 20:19:52,791][19107] Updated weights for policy 0, policy_version 204275 (0.0035) [2024-06-18 20:19:55,500][18875] Fps is (10 sec: 42614.1, 60 sec: 41233.2, 300 sec: 41932.0). Total num frames: 3346956288. Throughput: 0: 41935.2. Samples: 571088080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:19:55,501][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 20:19:56,370][19107] Updated weights for policy 0, policy_version 204285 (0.0028) [2024-06-18 20:20:00,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41508.7, 300 sec: 41876.4). Total num frames: 3347152896. Throughput: 0: 41678.7. Samples: 571210960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:20:00,501][18875] Avg episode reward: [(0, '0.685')] [2024-06-18 20:20:00,522][19107] Updated weights for policy 0, policy_version 204295 (0.0037) [2024-06-18 20:20:04,069][19107] Updated weights for policy 0, policy_version 204305 (0.0030) [2024-06-18 20:20:05,500][18875] Fps is (10 sec: 45874.4, 60 sec: 42871.5, 300 sec: 42043.0). Total num frames: 3347415040. Throughput: 0: 42057.2. Samples: 571465500. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:20:05,501][18875] Avg episode reward: [(0, '0.682')] [2024-06-18 20:20:08,251][19107] Updated weights for policy 0, policy_version 204315 (0.0033) [2024-06-18 20:20:10,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 3347578880. Throughput: 0: 41995.2. Samples: 571723140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:20:10,500][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 20:20:11,996][19107] Updated weights for policy 0, policy_version 204325 (0.0029) [2024-06-18 20:20:15,500][18875] Fps is (10 sec: 36045.5, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3347775488. Throughput: 0: 41785.8. Samples: 571840480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:20:15,500][18875] Avg episode reward: [(0, '0.423')] [2024-06-18 20:20:16,158][19107] Updated weights for policy 0, policy_version 204335 (0.0038) [2024-06-18 20:20:19,743][19107] Updated weights for policy 0, policy_version 204345 (0.0031) [2024-06-18 20:20:20,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 3348021248. Throughput: 0: 42215.1. Samples: 572102100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:20:20,501][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 20:20:23,761][19107] Updated weights for policy 0, policy_version 204355 (0.0029) [2024-06-18 20:20:25,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3348201472. Throughput: 0: 41935.5. Samples: 572353200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 20:20:25,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 20:20:27,444][19107] Updated weights for policy 0, policy_version 204365 (0.0043) [2024-06-18 20:20:30,500][18875] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 3348414464. Throughput: 0: 41982.9. Samples: 572478160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:20:30,501][18875] Avg episode reward: [(0, '0.812')] [2024-06-18 20:20:31,558][19107] Updated weights for policy 0, policy_version 204375 (0.0032) [2024-06-18 20:20:35,348][19107] Updated weights for policy 0, policy_version 204385 (0.0052) [2024-06-18 20:20:35,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 41987.4). Total num frames: 3348643840. Throughput: 0: 42261.2. Samples: 572733980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:20:35,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 20:20:35,517][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000204385_3348643840.pth... [2024-06-18 20:20:35,577][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000203770_3338567680.pth [2024-06-18 20:20:39,319][19107] Updated weights for policy 0, policy_version 204395 (0.0032) [2024-06-18 20:20:40,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 3348824064. Throughput: 0: 42153.2. Samples: 572984980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:20:40,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 20:20:43,197][19107] Updated weights for policy 0, policy_version 204405 (0.0034) [2024-06-18 20:20:45,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42327.9, 300 sec: 41931.9). Total num frames: 3349069824. Throughput: 0: 42182.2. Samples: 573109160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:20:45,501][18875] Avg episode reward: [(0, '0.489')] [2024-06-18 20:20:46,972][19107] Updated weights for policy 0, policy_version 204415 (0.0034) [2024-06-18 20:20:50,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3349266432. Throughput: 0: 42161.4. Samples: 573362760. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:20:50,501][18875] Avg episode reward: [(0, '0.638')] [2024-06-18 20:20:51,021][19107] Updated weights for policy 0, policy_version 204425 (0.0039) [2024-06-18 20:20:54,496][19107] Updated weights for policy 0, policy_version 204435 (0.0026) [2024-06-18 20:20:55,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3349479424. Throughput: 0: 42125.6. Samples: 573618800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:20:55,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 20:20:58,802][19107] Updated weights for policy 0, policy_version 204445 (0.0030) [2024-06-18 20:21:00,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 3349708800. Throughput: 0: 42392.7. Samples: 573748160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:21:00,501][18875] Avg episode reward: [(0, '0.715')] [2024-06-18 20:21:01,515][19087] Signal inference workers to stop experience collection... (8400 times) [2024-06-18 20:21:01,520][19087] Signal inference workers to resume experience collection... (8400 times) [2024-06-18 20:21:01,563][19107] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-06-18 20:21:01,563][19107] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-06-18 20:21:02,477][19107] Updated weights for policy 0, policy_version 204455 (0.0042) [2024-06-18 20:21:05,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3349921792. Throughput: 0: 42285.9. Samples: 574004960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:21:05,500][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 20:21:06,422][19107] Updated weights for policy 0, policy_version 204465 (0.0034) [2024-06-18 20:21:10,058][19107] Updated weights for policy 0, policy_version 204475 (0.0029) [2024-06-18 20:21:10,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 3350134784. Throughput: 0: 42332.8. Samples: 574258180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:21:10,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 20:21:13,886][19107] Updated weights for policy 0, policy_version 204485 (0.0034) [2024-06-18 20:21:15,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42043.0). Total num frames: 3350347776. Throughput: 0: 42396.0. Samples: 574385980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:21:15,501][18875] Avg episode reward: [(0, '0.672')] [2024-06-18 20:21:18,051][19107] Updated weights for policy 0, policy_version 204495 (0.0036) [2024-06-18 20:21:20,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41779.3, 300 sec: 42043.2). Total num frames: 3350528000. Throughput: 0: 42250.0. Samples: 574635220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:21:20,500][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 20:21:21,927][19107] Updated weights for policy 0, policy_version 204505 (0.0034) [2024-06-18 20:21:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3350757376. Throughput: 0: 42355.7. Samples: 574890980. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:21:25,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 20:21:25,578][19107] Updated weights for policy 0, policy_version 204515 (0.0035) [2024-06-18 20:21:29,562][19107] Updated weights for policy 0, policy_version 204525 (0.0026) [2024-06-18 20:21:30,500][18875] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3350953984. Throughput: 0: 42471.0. Samples: 575020360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:21:30,504][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 20:21:33,173][19107] Updated weights for policy 0, policy_version 204535 (0.0038) [2024-06-18 20:21:35,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 3351183360. Throughput: 0: 42466.3. Samples: 575273740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-18 20:21:35,500][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 20:21:37,451][19107] Updated weights for policy 0, policy_version 204545 (0.0026) [2024-06-18 20:21:40,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 41987.9). Total num frames: 3351379968. Throughput: 0: 42353.4. Samples: 575524700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:21:40,501][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 20:21:40,978][19107] Updated weights for policy 0, policy_version 204555 (0.0043) [2024-06-18 20:21:45,362][19107] Updated weights for policy 0, policy_version 204565 (0.0030) [2024-06-18 20:21:45,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3351592960. Throughput: 0: 42253.3. Samples: 575649560. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:21:45,501][18875] Avg episode reward: [(0, '0.354')] [2024-06-18 20:21:48,904][19107] Updated weights for policy 0, policy_version 204575 (0.0026) [2024-06-18 20:21:50,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3351822336. Throughput: 0: 42084.0. Samples: 575898740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:21:50,501][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 20:21:53,118][19107] Updated weights for policy 0, policy_version 204585 (0.0038) [2024-06-18 20:21:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3352018944. Throughput: 0: 42177.8. Samples: 576156180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:21:55,501][18875] Avg episode reward: [(0, '0.657')] [2024-06-18 20:21:56,660][19107] Updated weights for policy 0, policy_version 204595 (0.0036) [2024-06-18 20:22:00,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3352215552. Throughput: 0: 42101.9. Samples: 576280560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:22:00,500][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 20:22:00,742][19107] Updated weights for policy 0, policy_version 204605 (0.0032) [2024-06-18 20:22:04,442][19107] Updated weights for policy 0, policy_version 204615 (0.0044) [2024-06-18 20:22:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3352444928. Throughput: 0: 42275.0. Samples: 576537600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:22:05,501][18875] Avg episode reward: [(0, '0.402')] [2024-06-18 20:22:08,454][19107] Updated weights for policy 0, policy_version 204625 (0.0035) [2024-06-18 20:22:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3352641536. Throughput: 0: 42169.8. Samples: 576788620. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:22:10,501][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 20:22:12,113][19107] Updated weights for policy 0, policy_version 204635 (0.0042) [2024-06-18 20:22:15,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3352854528. Throughput: 0: 42029.0. Samples: 576911660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:22:15,501][18875] Avg episode reward: [(0, '0.536')] [2024-06-18 20:22:16,250][19107] Updated weights for policy 0, policy_version 204645 (0.0039) [2024-06-18 20:22:20,189][19107] Updated weights for policy 0, policy_version 204655 (0.0030) [2024-06-18 20:22:20,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3353083904. Throughput: 0: 42004.9. Samples: 577163960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:22:20,500][18875] Avg episode reward: [(0, '0.388')] [2024-06-18 20:22:24,051][19107] Updated weights for policy 0, policy_version 204665 (0.0032) [2024-06-18 20:22:25,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3353264128. Throughput: 0: 42116.8. Samples: 577419960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:22:25,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 20:22:27,979][19107] Updated weights for policy 0, policy_version 204675 (0.0026) [2024-06-18 20:22:29,380][19087] Signal inference workers to stop experience collection... (8450 times) [2024-06-18 20:22:29,422][19107] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-06-18 20:22:29,432][19087] Signal inference workers to resume experience collection... (8450 times) [2024-06-18 20:22:29,445][19107] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-06-18 20:22:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3353509888. Throughput: 0: 42034.8. 
Samples: 577541120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:22:30,500][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 20:22:31,742][19107] Updated weights for policy 0, policy_version 204685 (0.0034) [2024-06-18 20:22:35,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3353706496. Throughput: 0: 42193.8. Samples: 577797460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:22:35,500][18875] Avg episode reward: [(0, '0.765')] [2024-06-18 20:22:35,618][19107] Updated weights for policy 0, policy_version 204695 (0.0030) [2024-06-18 20:22:35,620][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000204695_3353722880.pth... [2024-06-18 20:22:35,678][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000204077_3343597568.pth [2024-06-18 20:22:39,626][19107] Updated weights for policy 0, policy_version 204705 (0.0043) [2024-06-18 20:22:40,504][18875] Fps is (10 sec: 39307.1, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 3353903104. Throughput: 0: 42143.3. Samples: 578052780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 24.0) [2024-06-18 20:22:40,504][18875] Avg episode reward: [(0, '0.772')] [2024-06-18 20:22:43,527][19107] Updated weights for policy 0, policy_version 204715 (0.0029) [2024-06-18 20:22:45,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 3354165248. Throughput: 0: 42296.0. Samples: 578183880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:22:45,501][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 20:22:47,491][19107] Updated weights for policy 0, policy_version 204725 (0.0033) [2024-06-18 20:22:50,504][18875] Fps is (10 sec: 44236.9, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 3354345472. Throughput: 0: 42192.6. Samples: 578436420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:22:50,513][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 20:22:51,150][19107] Updated weights for policy 0, policy_version 204735 (0.0030) [2024-06-18 20:22:55,178][19107] Updated weights for policy 0, policy_version 204745 (0.0026) [2024-06-18 20:22:55,500][18875] Fps is (10 sec: 37683.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3354542080. Throughput: 0: 42268.4. Samples: 578690700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:22:55,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 20:22:58,767][19107] Updated weights for policy 0, policy_version 204755 (0.0028) [2024-06-18 20:23:00,501][18875] Fps is (10 sec: 44251.8, 60 sec: 42871.3, 300 sec: 42154.1). Total num frames: 3354787840. Throughput: 0: 42347.3. Samples: 578817300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:00,504][18875] Avg episode reward: [(0, '0.426')] [2024-06-18 20:23:02,843][19107] Updated weights for policy 0, policy_version 204765 (0.0033) [2024-06-18 20:23:05,504][18875] Fps is (10 sec: 42583.0, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 3354968064. Throughput: 0: 42333.4. Samples: 579069120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:05,504][18875] Avg episode reward: [(0, '0.793')] [2024-06-18 20:23:06,898][19107] Updated weights for policy 0, policy_version 204775 (0.0042) [2024-06-18 20:23:10,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42325.2, 300 sec: 41987.4). Total num frames: 3355181056. Throughput: 0: 42254.2. Samples: 579321400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:10,501][18875] Avg episode reward: [(0, '0.527')] [2024-06-18 20:23:10,723][19107] Updated weights for policy 0, policy_version 204785 (0.0036) [2024-06-18 20:23:14,702][19107] Updated weights for policy 0, policy_version 204795 (0.0028) [2024-06-18 20:23:15,500][18875] Fps is (10 sec: 44253.1, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3355410432. Throughput: 0: 42267.6. Samples: 579443160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:15,501][18875] Avg episode reward: [(0, '0.348')] [2024-06-18 20:23:18,419][19107] Updated weights for policy 0, policy_version 204805 (0.0034) [2024-06-18 20:23:20,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3355607040. Throughput: 0: 42280.3. Samples: 579700080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:20,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 20:23:22,247][19107] Updated weights for policy 0, policy_version 204815 (0.0030) [2024-06-18 20:23:25,501][18875] Fps is (10 sec: 40959.0, 60 sec: 42598.3, 300 sec: 41987.4). Total num frames: 3355820032. Throughput: 0: 42257.4. Samples: 579954220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:25,501][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 20:23:26,142][19107] Updated weights for policy 0, policy_version 204825 (0.0024) [2024-06-18 20:23:29,681][19107] Updated weights for policy 0, policy_version 204835 (0.0045) [2024-06-18 20:23:30,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42210.2). Total num frames: 3356033024. Throughput: 0: 42148.5. Samples: 580080560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:30,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 20:23:33,606][19107] Updated weights for policy 0, policy_version 204845 (0.0039) [2024-06-18 20:23:35,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3356229632. Throughput: 0: 42188.7. Samples: 580334760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:35,501][18875] Avg episode reward: [(0, '0.683')] [2024-06-18 20:23:37,385][19107] Updated weights for policy 0, policy_version 204855 (0.0042) [2024-06-18 20:23:40,500][18875] Fps is (10 sec: 44235.8, 60 sec: 42873.9, 300 sec: 42099.0). Total num frames: 3356475392. Throughput: 0: 42164.3. Samples: 580588100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:40,512][18875] Avg episode reward: [(0, '0.752')] [2024-06-18 20:23:41,963][19107] Updated weights for policy 0, policy_version 204865 (0.0044) [2024-06-18 20:23:45,231][19107] Updated weights for policy 0, policy_version 204875 (0.0035) [2024-06-18 20:23:45,504][18875] Fps is (10 sec: 44220.8, 60 sec: 41776.6, 300 sec: 42209.3). Total num frames: 3356672000. Throughput: 0: 42261.2. Samples: 580719200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:45,505][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 20:23:49,713][19107] Updated weights for policy 0, policy_version 204885 (0.0038) [2024-06-18 20:23:50,504][18875] Fps is (10 sec: 39308.1, 60 sec: 42052.3, 300 sec: 41987.0). Total num frames: 3356868608. Throughput: 0: 42247.1. Samples: 580970240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 20:23:50,504][18875] Avg episode reward: [(0, '0.633')] [2024-06-18 20:23:53,031][19107] Updated weights for policy 0, policy_version 204895 (0.0047) [2024-06-18 20:23:55,500][18875] Fps is (10 sec: 42613.7, 60 sec: 42598.3, 300 sec: 42154.6). Total num frames: 3357097984. Throughput: 0: 42088.5. Samples: 581215380. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:23:55,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 20:23:57,479][19107] Updated weights for policy 0, policy_version 204905 (0.0027) [2024-06-18 20:24:00,500][18875] Fps is (10 sec: 42613.1, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 3357294592. Throughput: 0: 42279.8. Samples: 581345760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:00,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 20:24:00,724][19107] Updated weights for policy 0, policy_version 204915 (0.0039) [2024-06-18 20:24:05,059][19107] Updated weights for policy 0, policy_version 204925 (0.0037) [2024-06-18 20:24:05,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42054.8, 300 sec: 41987.5). Total num frames: 3357491200. Throughput: 0: 42031.5. Samples: 581591500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:05,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 20:24:08,483][19107] Updated weights for policy 0, policy_version 204935 (0.0034) [2024-06-18 20:24:10,500][18875] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3357704192. Throughput: 0: 41933.6. Samples: 581841220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:10,500][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 20:24:12,450][19087] Signal inference workers to stop experience collection... (8500 times) [2024-06-18 20:24:12,450][19087] Signal inference workers to resume experience collection... (8500 times) [2024-06-18 20:24:12,489][19107] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-06-18 20:24:12,489][19107] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-06-18 20:24:12,773][19107] Updated weights for policy 0, policy_version 204945 (0.0038) [2024-06-18 20:24:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 3357917184. Throughput: 0: 41954.6. 
Samples: 581968520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:15,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 20:24:16,337][19107] Updated weights for policy 0, policy_version 204955 (0.0037) [2024-06-18 20:24:20,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3358130176. Throughput: 0: 42017.4. Samples: 582225540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:20,501][18875] Avg episode reward: [(0, '0.501')] [2024-06-18 20:24:20,690][19107] Updated weights for policy 0, policy_version 204965 (0.0034) [2024-06-18 20:24:23,987][19107] Updated weights for policy 0, policy_version 204975 (0.0039) [2024-06-18 20:24:25,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3358326784. Throughput: 0: 41921.0. Samples: 582474540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:25,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 20:24:28,429][19107] Updated weights for policy 0, policy_version 204985 (0.0046) [2024-06-18 20:24:30,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3358556160. Throughput: 0: 41923.8. Samples: 582605620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:30,501][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 20:24:31,579][19107] Updated weights for policy 0, policy_version 204995 (0.0034) [2024-06-18 20:24:35,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3358736384. Throughput: 0: 41920.2. Samples: 582856500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:35,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 20:24:35,507][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000205001_3358736384.pth... 
[2024-06-18 20:24:35,586][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000204385_3348643840.pth [2024-06-18 20:24:36,410][19107] Updated weights for policy 0, policy_version 205005 (0.0024) [2024-06-18 20:24:39,606][19107] Updated weights for policy 0, policy_version 205015 (0.0036) [2024-06-18 20:24:40,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42210.1). Total num frames: 3358982144. Throughput: 0: 41809.7. Samples: 583096820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:40,501][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 20:24:44,371][19107] Updated weights for policy 0, policy_version 205025 (0.0039) [2024-06-18 20:24:45,500][18875] Fps is (10 sec: 45875.7, 60 sec: 42054.9, 300 sec: 42154.1). Total num frames: 3359195136. Throughput: 0: 41970.9. Samples: 583234440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:45,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 20:24:47,578][19107] Updated weights for policy 0, policy_version 205035 (0.0043) [2024-06-18 20:24:50,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41781.6, 300 sec: 42098.5). Total num frames: 3359375360. Throughput: 0: 42019.9. Samples: 583482400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:50,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 20:24:52,231][19107] Updated weights for policy 0, policy_version 205045 (0.0031) [2024-06-18 20:24:55,313][19107] Updated weights for policy 0, policy_version 205055 (0.0048) [2024-06-18 20:24:55,504][18875] Fps is (10 sec: 42582.7, 60 sec: 42049.8, 300 sec: 42264.6). Total num frames: 3359621120. Throughput: 0: 41982.3. Samples: 583730580. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 20:24:55,505][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 20:25:00,010][19107] Updated weights for policy 0, policy_version 205065 (0.0030) [2024-06-18 20:25:00,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3359801344. Throughput: 0: 42158.2. Samples: 583865640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:00,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 20:25:03,033][19107] Updated weights for policy 0, policy_version 205075 (0.0040) [2024-06-18 20:25:05,500][18875] Fps is (10 sec: 37697.2, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 3359997952. Throughput: 0: 41992.5. Samples: 584115200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:05,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 20:25:07,675][19107] Updated weights for policy 0, policy_version 205085 (0.0036) [2024-06-18 20:25:10,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 3360243712. Throughput: 0: 41915.5. Samples: 584360740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:10,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 20:25:10,779][19107] Updated weights for policy 0, policy_version 205095 (0.0032) [2024-06-18 20:25:15,385][19107] Updated weights for policy 0, policy_version 205105 (0.0036) [2024-06-18 20:25:15,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 3360440320. Throughput: 0: 41980.0. Samples: 584494720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:15,501][18875] Avg episode reward: [(0, '0.442')] [2024-06-18 20:25:18,454][19107] Updated weights for policy 0, policy_version 205115 (0.0039) [2024-06-18 20:25:20,501][18875] Fps is (10 sec: 40959.4, 60 sec: 42052.0, 300 sec: 42209.6). Total num frames: 3360653312. Throughput: 0: 41831.4. Samples: 584738920. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:20,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 20:25:23,200][19107] Updated weights for policy 0, policy_version 205125 (0.0042) [2024-06-18 20:25:25,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 3360882688. Throughput: 0: 42040.9. Samples: 584988660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:25,501][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 20:25:26,190][19107] Updated weights for policy 0, policy_version 205135 (0.0036) [2024-06-18 20:25:30,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3361062912. Throughput: 0: 41905.2. Samples: 585120180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:30,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 20:25:31,143][19107] Updated weights for policy 0, policy_version 205145 (0.0031) [2024-06-18 20:25:31,748][19087] Signal inference workers to stop experience collection... (8550 times) [2024-06-18 20:25:31,749][19087] Signal inference workers to resume experience collection... (8550 times) [2024-06-18 20:25:31,787][19107] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-06-18 20:25:31,788][19107] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-06-18 20:25:34,075][19107] Updated weights for policy 0, policy_version 205155 (0.0038) [2024-06-18 20:25:35,500][18875] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3361275904. Throughput: 0: 41840.5. Samples: 585365220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:35,501][18875] Avg episode reward: [(0, '0.432')] [2024-06-18 20:25:39,101][19107] Updated weights for policy 0, policy_version 205165 (0.0045) [2024-06-18 20:25:40,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3361505280. Throughput: 0: 41966.0. 
Samples: 585618900. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:40,501][18875] Avg episode reward: [(0, '0.682')] [2024-06-18 20:25:42,301][19107] Updated weights for policy 0, policy_version 205175 (0.0032) [2024-06-18 20:25:45,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 3361669120. Throughput: 0: 41789.8. Samples: 585746180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:45,501][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 20:25:46,766][19107] Updated weights for policy 0, policy_version 205185 (0.0032) [2024-06-18 20:25:50,041][19107] Updated weights for policy 0, policy_version 205195 (0.0039) [2024-06-18 20:25:50,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3361914880. Throughput: 0: 41882.9. Samples: 585999940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:50,501][18875] Avg episode reward: [(0, '0.374')] [2024-06-18 20:25:54,464][19107] Updated weights for policy 0, policy_version 205205 (0.0044) [2024-06-18 20:25:55,500][18875] Fps is (10 sec: 45875.0, 60 sec: 41781.7, 300 sec: 42098.6). Total num frames: 3362127872. Throughput: 0: 42020.5. Samples: 586251660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:25:55,501][18875] Avg episode reward: [(0, '0.536')] [2024-06-18 20:25:57,874][19107] Updated weights for policy 0, policy_version 205215 (0.0029) [2024-06-18 20:26:00,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3362308096. Throughput: 0: 41746.7. Samples: 586373320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:26:00,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 20:26:02,169][19107] Updated weights for policy 0, policy_version 205225 (0.0040) [2024-06-18 20:26:05,504][18875] Fps is (10 sec: 40945.0, 60 sec: 42322.7, 300 sec: 42042.5). Total num frames: 3362537472. Throughput: 0: 41984.8. Samples: 586628380. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:26:05,505][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 20:26:05,755][19107] Updated weights for policy 0, policy_version 205235 (0.0031) [2024-06-18 20:26:10,317][19107] Updated weights for policy 0, policy_version 205245 (0.0033) [2024-06-18 20:26:10,500][18875] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3362750464. Throughput: 0: 42212.2. Samples: 586888200. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:10,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 20:26:13,703][19107] Updated weights for policy 0, policy_version 205255 (0.0028) [2024-06-18 20:26:15,504][18875] Fps is (10 sec: 40960.1, 60 sec: 41776.7, 300 sec: 42098.0). Total num frames: 3362947072. Throughput: 0: 41980.7. Samples: 587009460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:15,505][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 20:26:18,160][19107] Updated weights for policy 0, policy_version 205265 (0.0043) [2024-06-18 20:26:20,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3363160064. Throughput: 0: 42021.3. Samples: 587256180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:20,501][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 20:26:21,577][19107] Updated weights for policy 0, policy_version 205275 (0.0036) [2024-06-18 20:26:25,500][18875] Fps is (10 sec: 40974.5, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 3363356672. Throughput: 0: 42152.0. Samples: 587515740. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:25,501][18875] Avg episode reward: [(0, '0.708')] [2024-06-18 20:26:25,868][19107] Updated weights for policy 0, policy_version 205285 (0.0036) [2024-06-18 20:26:29,393][19107] Updated weights for policy 0, policy_version 205295 (0.0035) [2024-06-18 20:26:30,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3363586048. Throughput: 0: 41966.6. Samples: 587634680. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:30,501][18875] Avg episode reward: [(0, '0.574')] [2024-06-18 20:26:33,685][19107] Updated weights for policy 0, policy_version 205305 (0.0028) [2024-06-18 20:26:35,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3363782656. Throughput: 0: 41756.6. Samples: 587878980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:35,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 20:26:35,610][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000205310_3363799040.pth... [2024-06-18 20:26:35,673][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000204695_3353722880.pth [2024-06-18 20:26:37,367][19107] Updated weights for policy 0, policy_version 205315 (0.0027) [2024-06-18 20:26:40,502][18875] Fps is (10 sec: 40952.2, 60 sec: 41504.8, 300 sec: 42042.7). Total num frames: 3363995648. Throughput: 0: 41774.2. Samples: 588131580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:40,503][18875] Avg episode reward: [(0, '0.416')] [2024-06-18 20:26:41,453][19107] Updated weights for policy 0, policy_version 205325 (0.0033) [2024-06-18 20:26:45,073][19107] Updated weights for policy 0, policy_version 205335 (0.0039) [2024-06-18 20:26:45,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3364208640. Throughput: 0: 41877.2. Samples: 588257800. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:45,501][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 20:26:49,313][19107] Updated weights for policy 0, policy_version 205345 (0.0040) [2024-06-18 20:26:50,501][18875] Fps is (10 sec: 42606.1, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3364421632. Throughput: 0: 41945.0. Samples: 588515760. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:50,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 20:26:52,836][19107] Updated weights for policy 0, policy_version 205355 (0.0039) [2024-06-18 20:26:55,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3364634624. Throughput: 0: 41752.8. Samples: 588767080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:26:55,501][18875] Avg episode reward: [(0, '0.518')] [2024-06-18 20:26:57,150][19107] Updated weights for policy 0, policy_version 205365 (0.0048) [2024-06-18 20:27:00,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3364847616. Throughput: 0: 41796.6. Samples: 588890160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:27:00,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 20:27:00,567][19107] Updated weights for policy 0, policy_version 205375 (0.0025) [2024-06-18 20:27:04,779][19107] Updated weights for policy 0, policy_version 205385 (0.0029) [2024-06-18 20:27:05,504][18875] Fps is (10 sec: 42583.4, 60 sec: 42052.3, 300 sec: 42098.0). Total num frames: 3365060608. Throughput: 0: 42038.1. Samples: 589148040. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:27:05,504][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 20:27:08,319][19107] Updated weights for policy 0, policy_version 205395 (0.0029) [2024-06-18 20:27:10,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 3365289984. Throughput: 0: 41900.0. Samples: 589401240. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 20:27:10,501][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 20:27:12,323][19107] Updated weights for policy 0, policy_version 205405 (0.0024) [2024-06-18 20:27:13,515][19087] Signal inference workers to stop experience collection... (8600 times) [2024-06-18 20:27:13,555][19107] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-06-18 20:27:13,579][19087] Signal inference workers to resume experience collection... (8600 times) [2024-06-18 20:27:13,580][19107] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-06-18 20:27:15,500][18875] Fps is (10 sec: 42613.5, 60 sec: 42327.9, 300 sec: 42043.0). Total num frames: 3365486592. Throughput: 0: 42184.9. Samples: 589533000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:27:15,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 20:27:15,841][19107] Updated weights for policy 0, policy_version 205415 (0.0033) [2024-06-18 20:27:20,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3365666816. Throughput: 0: 42376.9. Samples: 589785940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:27:20,501][18875] Avg episode reward: [(0, '0.742')] [2024-06-18 20:27:20,543][19107] Updated weights for policy 0, policy_version 205425 (0.0044) [2024-06-18 20:27:23,947][19107] Updated weights for policy 0, policy_version 205435 (0.0038) [2024-06-18 20:27:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3365912576. Throughput: 0: 42275.2. Samples: 590033880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:27:25,503][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 20:27:28,231][19107] Updated weights for policy 0, policy_version 205445 (0.0032) [2024-06-18 20:27:30,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3366109184. Throughput: 0: 42462.4. 
Samples: 590168600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:27:30,500][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 20:27:31,766][19107] Updated weights for policy 0, policy_version 205455 (0.0027) [2024-06-18 20:27:35,500][18875] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 3366305792. Throughput: 0: 42175.1. Samples: 590413640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:27:35,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 20:27:35,883][19107] Updated weights for policy 0, policy_version 205465 (0.0040) [2024-06-18 20:27:39,344][19107] Updated weights for policy 0, policy_version 205475 (0.0035) [2024-06-18 20:27:40,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42599.8, 300 sec: 41987.5). Total num frames: 3366551552. Throughput: 0: 42220.0. Samples: 590666980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:27:40,501][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 20:27:43,700][19107] Updated weights for policy 0, policy_version 205485 (0.0030) [2024-06-18 20:27:45,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 3366731776. Throughput: 0: 42404.1. Samples: 590798340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:27:45,501][18875] Avg episode reward: [(0, '0.423')] [2024-06-18 20:27:47,368][19107] Updated weights for policy 0, policy_version 205495 (0.0042) [2024-06-18 20:27:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 3366961152. Throughput: 0: 42194.0. Samples: 591046620. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:27:50,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 20:27:51,494][19107] Updated weights for policy 0, policy_version 205505 (0.0029) [2024-06-18 20:27:55,113][19107] Updated weights for policy 0, policy_version 205515 (0.0046) [2024-06-18 20:27:55,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3367174144. Throughput: 0: 42182.8. Samples: 591299460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:27:55,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 20:27:59,354][19107] Updated weights for policy 0, policy_version 205525 (0.0033) [2024-06-18 20:28:00,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 41988.0). Total num frames: 3367354368. Throughput: 0: 42044.5. Samples: 591425000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:28:00,501][18875] Avg episode reward: [(0, '0.657')] [2024-06-18 20:28:02,818][19107] Updated weights for policy 0, policy_version 205535 (0.0044) [2024-06-18 20:28:05,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42054.7, 300 sec: 42043.0). Total num frames: 3367583744. Throughput: 0: 41984.0. Samples: 591675220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:28:05,501][18875] Avg episode reward: [(0, '0.680')] [2024-06-18 20:28:06,856][19107] Updated weights for policy 0, policy_version 205545 (0.0035) [2024-06-18 20:28:10,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3367796736. Throughput: 0: 42096.0. Samples: 591928200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:28:10,501][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 20:28:10,583][19107] Updated weights for policy 0, policy_version 205555 (0.0035) [2024-06-18 20:28:14,617][19107] Updated weights for policy 0, policy_version 205565 (0.0033) [2024-06-18 20:28:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41987.5). 
Total num frames: 3367993344. Throughput: 0: 41956.4. Samples: 592056640. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:28:15,501][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 20:28:18,236][19107] Updated weights for policy 0, policy_version 205575 (0.0033) [2024-06-18 20:28:20,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3368222720. Throughput: 0: 42101.5. Samples: 592308200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:28:20,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 20:28:22,252][19107] Updated weights for policy 0, policy_version 205585 (0.0032) [2024-06-18 20:28:25,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3368402944. Throughput: 0: 42036.6. Samples: 592558620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:28:25,500][18875] Avg episode reward: [(0, '0.372')] [2024-06-18 20:28:26,089][19107] Updated weights for policy 0, policy_version 205595 (0.0027) [2024-06-18 20:28:30,465][19107] Updated weights for policy 0, policy_version 205605 (0.0028) [2024-06-18 20:28:30,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3368632320. Throughput: 0: 41783.5. Samples: 592678600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:28:30,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 20:28:34,006][19107] Updated weights for policy 0, policy_version 205615 (0.0027) [2024-06-18 20:28:35,501][18875] Fps is (10 sec: 44235.5, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3368845312. Throughput: 0: 41909.1. Samples: 592932540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:28:35,501][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 20:28:35,659][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000205620_3368878080.pth... 
[2024-06-18 20:28:35,719][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000205001_3358736384.pth [2024-06-18 20:28:38,079][19107] Updated weights for policy 0, policy_version 205625 (0.0036) [2024-06-18 20:28:40,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41932.5). Total num frames: 3369041920. Throughput: 0: 41860.4. Samples: 593183180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:28:40,500][18875] Avg episode reward: [(0, '0.776')] [2024-06-18 20:28:41,457][19087] Signal inference workers to stop experience collection... (8650 times) [2024-06-18 20:28:41,457][19087] Signal inference workers to resume experience collection... (8650 times) [2024-06-18 20:28:41,468][19107] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-06-18 20:28:41,477][19107] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-06-18 20:28:41,895][19107] Updated weights for policy 0, policy_version 205635 (0.0044) [2024-06-18 20:28:45,500][18875] Fps is (10 sec: 40961.0, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 3369254912. Throughput: 0: 41838.3. Samples: 593307720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:28:45,501][18875] Avg episode reward: [(0, '0.397')] [2024-06-18 20:28:46,398][19107] Updated weights for policy 0, policy_version 205645 (0.0023) [2024-06-18 20:28:49,916][19107] Updated weights for policy 0, policy_version 205655 (0.0026) [2024-06-18 20:28:50,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3369484288. Throughput: 0: 42011.2. Samples: 593565720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:28:50,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 20:28:54,048][19107] Updated weights for policy 0, policy_version 205665 (0.0038) [2024-06-18 20:28:55,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3369697280. 
Throughput: 0: 41955.7. Samples: 593816200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:28:55,501][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 20:28:57,815][19107] Updated weights for policy 0, policy_version 205675 (0.0033) [2024-06-18 20:29:00,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3369893888. Throughput: 0: 41884.1. Samples: 593941420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:29:00,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 20:29:01,545][19107] Updated weights for policy 0, policy_version 205685 (0.0027) [2024-06-18 20:29:05,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3370090496. Throughput: 0: 41881.8. Samples: 594192880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:29:05,501][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 20:29:05,546][19107] Updated weights for policy 0, policy_version 205695 (0.0029) [2024-06-18 20:29:09,150][19107] Updated weights for policy 0, policy_version 205705 (0.0031) [2024-06-18 20:29:10,504][18875] Fps is (10 sec: 42582.6, 60 sec: 42049.8, 300 sec: 42042.5). Total num frames: 3370319872. Throughput: 0: 41958.3. Samples: 594446900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:29:10,505][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 20:29:13,658][19107] Updated weights for policy 0, policy_version 205715 (0.0051) [2024-06-18 20:29:15,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3370516480. Throughput: 0: 42101.3. Samples: 594573160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:29:15,501][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 20:29:17,082][19107] Updated weights for policy 0, policy_version 205725 (0.0034) [2024-06-18 20:29:20,500][18875] Fps is (10 sec: 40975.5, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3370729472. 
Throughput: 0: 41900.7. Samples: 594818060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:29:20,500][18875] Avg episode reward: [(0, '0.679')] [2024-06-18 20:29:21,683][19107] Updated weights for policy 0, policy_version 205735 (0.0037) [2024-06-18 20:29:24,863][19107] Updated weights for policy 0, policy_version 205745 (0.0032) [2024-06-18 20:29:25,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3370942464. Throughput: 0: 42041.8. Samples: 595075060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:29:25,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 20:29:29,342][19107] Updated weights for policy 0, policy_version 205755 (0.0034) [2024-06-18 20:29:30,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3371139072. Throughput: 0: 42103.6. Samples: 595202380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:29:30,500][18875] Avg episode reward: [(0, '0.330')] [2024-06-18 20:29:32,740][19107] Updated weights for policy 0, policy_version 205765 (0.0032) [2024-06-18 20:29:35,500][18875] Fps is (10 sec: 40959.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3371352064. Throughput: 0: 41786.5. Samples: 595446120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:29:35,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 20:29:37,105][19107] Updated weights for policy 0, policy_version 205775 (0.0029) [2024-06-18 20:29:40,465][19107] Updated weights for policy 0, policy_version 205785 (0.0042) [2024-06-18 20:29:40,501][18875] Fps is (10 sec: 44234.9, 60 sec: 42325.1, 300 sec: 41987.4). Total num frames: 3371581440. Throughput: 0: 41790.7. Samples: 595696800. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:29:40,501][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 20:29:45,071][19107] Updated weights for policy 0, policy_version 205795 (0.0049) [2024-06-18 20:29:45,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41506.1, 300 sec: 41932.0). Total num frames: 3371745280. Throughput: 0: 41740.9. Samples: 595819760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:29:45,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 20:29:48,387][19107] Updated weights for policy 0, policy_version 205805 (0.0039) [2024-06-18 20:29:50,504][18875] Fps is (10 sec: 40946.6, 60 sec: 41776.6, 300 sec: 41931.9). Total num frames: 3371991040. Throughput: 0: 41689.9. Samples: 596069080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:29:50,513][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 20:29:52,871][19107] Updated weights for policy 0, policy_version 205815 (0.0035) [2024-06-18 20:29:55,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3372187648. Throughput: 0: 41659.0. Samples: 596321400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:29:55,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 20:29:56,285][19107] Updated weights for policy 0, policy_version 205825 (0.0040) [2024-06-18 20:30:00,500][18875] Fps is (10 sec: 39335.7, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3372384256. Throughput: 0: 41566.7. Samples: 596443660. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:30:00,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 20:30:01,024][19107] Updated weights for policy 0, policy_version 205835 (0.0042) [2024-06-18 20:30:04,204][19107] Updated weights for policy 0, policy_version 205845 (0.0041) [2024-06-18 20:30:05,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 3372630016. Throughput: 0: 41742.0. Samples: 596696460. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:30:05,501][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 20:30:08,741][19107] Updated weights for policy 0, policy_version 205855 (0.0036) [2024-06-18 20:30:10,388][19087] Signal inference workers to stop experience collection... (8700 times) [2024-06-18 20:30:10,388][19087] Signal inference workers to resume experience collection... (8700 times) [2024-06-18 20:30:10,432][19107] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-06-18 20:30:10,432][19107] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-06-18 20:30:10,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41508.7, 300 sec: 41931.9). Total num frames: 3372810240. Throughput: 0: 41802.2. Samples: 596956160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:30:10,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 20:30:11,916][19107] Updated weights for policy 0, policy_version 205865 (0.0039) [2024-06-18 20:30:15,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 3373023232. Throughput: 0: 41526.5. Samples: 597071080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:30:15,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 20:30:16,291][19107] Updated weights for policy 0, policy_version 205875 (0.0055) [2024-06-18 20:30:19,784][19107] Updated weights for policy 0, policy_version 205885 (0.0038) [2024-06-18 20:30:20,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 41932.0). Total num frames: 3373252608. Throughput: 0: 41683.7. Samples: 597321880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:30:20,500][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 20:30:23,943][19107] Updated weights for policy 0, policy_version 205895 (0.0036) [2024-06-18 20:30:25,500][18875] Fps is (10 sec: 37683.9, 60 sec: 40960.0, 300 sec: 41820.9). Total num frames: 3373400064. Throughput: 0: 42019.9. 
Samples: 597587680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:30:25,500][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 20:30:27,568][19107] Updated weights for policy 0, policy_version 205905 (0.0036) [2024-06-18 20:30:30,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3373662208. Throughput: 0: 41810.6. Samples: 597701240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 20:30:30,501][18875] Avg episode reward: [(0, '0.793')] [2024-06-18 20:30:31,888][19107] Updated weights for policy 0, policy_version 205915 (0.0047) [2024-06-18 20:30:35,280][19107] Updated weights for policy 0, policy_version 205925 (0.0037) [2024-06-18 20:30:35,500][18875] Fps is (10 sec: 47513.4, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 3373875200. Throughput: 0: 41946.1. Samples: 597956500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:30:35,501][18875] Avg episode reward: [(0, '0.887')] [2024-06-18 20:30:35,610][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000205926_3373891584.pth... [2024-06-18 20:30:35,689][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000205310_3363799040.pth [2024-06-18 20:30:39,728][19107] Updated weights for policy 0, policy_version 205935 (0.0048) [2024-06-18 20:30:40,500][18875] Fps is (10 sec: 37683.4, 60 sec: 40960.2, 300 sec: 41931.9). Total num frames: 3374039040. Throughput: 0: 41977.3. Samples: 598210380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:30:40,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 20:30:43,436][19107] Updated weights for policy 0, policy_version 205945 (0.0031) [2024-06-18 20:30:45,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3374284800. Throughput: 0: 41994.1. Samples: 598333400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:30:45,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 20:30:47,674][19107] Updated weights for policy 0, policy_version 205955 (0.0039) [2024-06-18 20:30:50,500][18875] Fps is (10 sec: 45874.5, 60 sec: 41781.6, 300 sec: 41931.9). Total num frames: 3374497792. Throughput: 0: 41946.2. Samples: 598584040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:30:50,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 20:30:51,089][19107] Updated weights for policy 0, policy_version 205965 (0.0060) [2024-06-18 20:30:55,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3374678016. Throughput: 0: 41674.2. Samples: 598831500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:30:55,501][18875] Avg episode reward: [(0, '0.597')] [2024-06-18 20:30:55,778][19107] Updated weights for policy 0, policy_version 205975 (0.0036) [2024-06-18 20:30:58,840][19107] Updated weights for policy 0, policy_version 205985 (0.0033) [2024-06-18 20:31:00,500][18875] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 41988.0). Total num frames: 3374923776. Throughput: 0: 41967.7. Samples: 598959620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:31:00,500][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 20:31:03,480][19107] Updated weights for policy 0, policy_version 205995 (0.0024) [2024-06-18 20:31:05,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3375120384. Throughput: 0: 42158.5. Samples: 599219020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:31:05,501][18875] Avg episode reward: [(0, '0.585')] [2024-06-18 20:31:06,767][19107] Updated weights for policy 0, policy_version 206005 (0.0046) [2024-06-18 20:31:10,500][18875] Fps is (10 sec: 39320.8, 60 sec: 41779.1, 300 sec: 41932.4). Total num frames: 3375316992. Throughput: 0: 41517.6. Samples: 599455980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:31:10,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 20:31:11,275][19107] Updated weights for policy 0, policy_version 206015 (0.0034) [2024-06-18 20:31:14,558][19107] Updated weights for policy 0, policy_version 206025 (0.0034) [2024-06-18 20:31:15,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3375546368. Throughput: 0: 41948.4. Samples: 599588920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:31:15,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 20:31:18,869][19107] Updated weights for policy 0, policy_version 206035 (0.0049) [2024-06-18 20:31:20,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41233.0, 300 sec: 41932.0). Total num frames: 3375726592. Throughput: 0: 41799.1. Samples: 599837460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:31:20,501][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 20:31:22,292][19107] Updated weights for policy 0, policy_version 206045 (0.0027) [2024-06-18 20:31:25,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 41987.5). Total num frames: 3375972352. Throughput: 0: 41615.1. Samples: 600083060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:31:25,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 20:31:27,040][19107] Updated weights for policy 0, policy_version 206055 (0.0034) [2024-06-18 20:31:30,290][19107] Updated weights for policy 0, policy_version 206065 (0.0043) [2024-06-18 20:31:30,502][18875] Fps is (10 sec: 44227.8, 60 sec: 41777.8, 300 sec: 41987.2). Total num frames: 3376168960. Throughput: 0: 41832.9. Samples: 600215960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:31:30,503][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 20:31:34,847][19107] Updated weights for policy 0, policy_version 206075 (0.0026) [2024-06-18 20:31:35,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41233.0, 300 sec: 41876.7). Total num frames: 3376349184. Throughput: 0: 41819.7. Samples: 600465920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:31:35,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 20:31:38,159][19107] Updated weights for policy 0, policy_version 206085 (0.0040) [2024-06-18 20:31:40,500][18875] Fps is (10 sec: 42607.2, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3376594944. Throughput: 0: 41799.6. Samples: 600712480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-18 20:31:40,501][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 20:31:42,782][19107] Updated weights for policy 0, policy_version 206095 (0.0043) [2024-06-18 20:31:43,956][19087] Signal inference workers to stop experience collection... (8750 times) [2024-06-18 20:31:43,956][19087] Signal inference workers to resume experience collection... (8750 times) [2024-06-18 20:31:43,997][19107] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-06-18 20:31:43,997][19107] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-06-18 20:31:45,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3376791552. Throughput: 0: 41889.6. Samples: 600844660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:31:45,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 20:31:45,982][19107] Updated weights for policy 0, policy_version 206105 (0.0031) [2024-06-18 20:31:50,343][19107] Updated weights for policy 0, policy_version 206115 (0.0026) [2024-06-18 20:31:50,504][18875] Fps is (10 sec: 39307.3, 60 sec: 41503.7, 300 sec: 41875.9). Total num frames: 3376988160. Throughput: 0: 41559.9. 
Samples: 601089360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:31:50,504][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 20:31:53,889][19107] Updated weights for policy 0, policy_version 206125 (0.0040) [2024-06-18 20:31:55,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3377233920. Throughput: 0: 41826.3. Samples: 601338160. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:31:55,501][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 20:31:58,000][19107] Updated weights for policy 0, policy_version 206135 (0.0045) [2024-06-18 20:32:00,500][18875] Fps is (10 sec: 42613.9, 60 sec: 41506.1, 300 sec: 41876.9). Total num frames: 3377414144. Throughput: 0: 41815.6. Samples: 601470620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:00,501][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 20:32:01,746][19107] Updated weights for policy 0, policy_version 206145 (0.0043) [2024-06-18 20:32:05,501][18875] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3377627136. Throughput: 0: 41757.6. Samples: 601716560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:05,501][18875] Avg episode reward: [(0, '0.683')] [2024-06-18 20:32:05,660][19107] Updated weights for policy 0, policy_version 206155 (0.0037) [2024-06-18 20:32:09,728][19107] Updated weights for policy 0, policy_version 206165 (0.0024) [2024-06-18 20:32:10,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 41987.5). Total num frames: 3377872896. Throughput: 0: 41906.3. Samples: 601968840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:10,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 20:32:13,359][19107] Updated weights for policy 0, policy_version 206175 (0.0043) [2024-06-18 20:32:15,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3378020352. Throughput: 0: 41782.3. Samples: 602096080. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:15,501][18875] Avg episode reward: [(0, '0.712')] [2024-06-18 20:32:17,497][19107] Updated weights for policy 0, policy_version 206185 (0.0032) [2024-06-18 20:32:20,500][18875] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 3378266112. Throughput: 0: 41609.2. Samples: 602338340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:20,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 20:32:21,156][19107] Updated weights for policy 0, policy_version 206195 (0.0037) [2024-06-18 20:32:25,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41820.8). Total num frames: 3378446336. Throughput: 0: 41799.0. Samples: 602593440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:25,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 20:32:25,592][19107] Updated weights for policy 0, policy_version 206205 (0.0040) [2024-06-18 20:32:29,097][19107] Updated weights for policy 0, policy_version 206215 (0.0035) [2024-06-18 20:32:30,500][18875] Fps is (10 sec: 37683.5, 60 sec: 41234.4, 300 sec: 41820.9). Total num frames: 3378642944. Throughput: 0: 41492.5. Samples: 602711820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:30,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 20:32:33,281][19107] Updated weights for policy 0, policy_version 206225 (0.0036) [2024-06-18 20:32:35,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 41820.8). Total num frames: 3378888704. Throughput: 0: 41769.0. Samples: 602968820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:35,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 20:32:35,509][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000206231_3378888704.pth... 
[2024-06-18 20:32:35,569][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000205620_3368878080.pth [2024-06-18 20:32:36,820][19107] Updated weights for policy 0, policy_version 206235 (0.0035) [2024-06-18 20:32:40,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41232.9, 300 sec: 41820.8). Total num frames: 3379068928. Throughput: 0: 41881.7. Samples: 603222840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:40,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 20:32:41,391][19107] Updated weights for policy 0, policy_version 206245 (0.0032) [2024-06-18 20:32:45,055][19107] Updated weights for policy 0, policy_version 206255 (0.0035) [2024-06-18 20:32:45,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3379281920. Throughput: 0: 41545.6. Samples: 603340180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-18 20:32:45,501][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 20:32:49,356][19107] Updated weights for policy 0, policy_version 206265 (0.0033) [2024-06-18 20:32:50,500][18875] Fps is (10 sec: 44237.5, 60 sec: 42054.8, 300 sec: 41820.8). Total num frames: 3379511296. Throughput: 0: 41783.7. Samples: 603596820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:32:50,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 20:32:52,749][19107] Updated weights for policy 0, policy_version 206275 (0.0038) [2024-06-18 20:32:55,500][18875] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 41820.9). Total num frames: 3379691520. Throughput: 0: 41736.9. Samples: 603847000. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:32:55,504][18875] Avg episode reward: [(0, '0.704')] [2024-06-18 20:32:57,131][19107] Updated weights for policy 0, policy_version 206285 (0.0038) [2024-06-18 20:33:00,393][19107] Updated weights for policy 0, policy_version 206295 (0.0035) [2024-06-18 20:33:00,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3379937280. Throughput: 0: 41533.8. Samples: 603965100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:00,501][18875] Avg episode reward: [(0, '0.630')] [2024-06-18 20:33:04,870][19107] Updated weights for policy 0, policy_version 206305 (0.0028) [2024-06-18 20:33:05,459][19087] Signal inference workers to stop experience collection... (8800 times) [2024-06-18 20:33:05,461][19087] Signal inference workers to resume experience collection... (8800 times) [2024-06-18 20:33:05,474][19107] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-06-18 20:33:05,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41779.4, 300 sec: 41820.9). Total num frames: 3380133888. Throughput: 0: 42024.6. Samples: 604229440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:05,500][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 20:33:05,502][19107] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-06-18 20:33:07,907][19107] Updated weights for policy 0, policy_version 206315 (0.0031) [2024-06-18 20:33:10,503][18875] Fps is (10 sec: 39310.2, 60 sec: 40958.0, 300 sec: 41820.4). Total num frames: 3380330496. Throughput: 0: 41878.7. Samples: 604478100. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:10,504][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 20:33:12,579][19107] Updated weights for policy 0, policy_version 206325 (0.0035) [2024-06-18 20:33:15,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 41876.4). Total num frames: 3380576256. Throughput: 0: 42030.8. 
Samples: 604603200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:15,500][18875] Avg episode reward: [(0, '0.379')] [2024-06-18 20:33:15,582][19107] Updated weights for policy 0, policy_version 206335 (0.0033) [2024-06-18 20:33:20,500][18875] Fps is (10 sec: 39332.5, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 3380723712. Throughput: 0: 41836.5. Samples: 604851460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:20,501][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 20:33:20,839][19107] Updated weights for policy 0, policy_version 206345 (0.0042) [2024-06-18 20:33:23,257][19107] Updated weights for policy 0, policy_version 206355 (0.0030) [2024-06-18 20:33:25,500][18875] Fps is (10 sec: 39320.8, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3380969472. Throughput: 0: 41788.1. Samples: 605103300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:25,501][18875] Avg episode reward: [(0, '0.717')] [2024-06-18 20:33:28,750][19107] Updated weights for policy 0, policy_version 206365 (0.0039) [2024-06-18 20:33:30,500][18875] Fps is (10 sec: 47514.2, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 3381198848. Throughput: 0: 42156.6. Samples: 605237220. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:30,501][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 20:33:31,473][19107] Updated weights for policy 0, policy_version 206375 (0.0039) [2024-06-18 20:33:35,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3381362688. Throughput: 0: 41924.8. Samples: 605483440. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:35,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 20:33:36,302][19107] Updated weights for policy 0, policy_version 206385 (0.0033) [2024-06-18 20:33:39,047][19107] Updated weights for policy 0, policy_version 206395 (0.0035) [2024-06-18 20:33:40,500][18875] Fps is (10 sec: 39321.0, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 3381592064. Throughput: 0: 41920.3. Samples: 605733420. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:40,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 20:33:44,011][19107] Updated weights for policy 0, policy_version 206405 (0.0032) [2024-06-18 20:33:45,500][18875] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 41820.8). Total num frames: 3381821440. Throughput: 0: 42292.0. Samples: 605868240. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:45,501][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 20:33:46,820][19107] Updated weights for policy 0, policy_version 206415 (0.0040) [2024-06-18 20:33:50,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3382001664. Throughput: 0: 41952.4. Samples: 606117300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:50,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 20:33:51,631][19107] Updated weights for policy 0, policy_version 206425 (0.0044) [2024-06-18 20:33:54,450][19107] Updated weights for policy 0, policy_version 206435 (0.0035) [2024-06-18 20:33:55,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 3382247424. Throughput: 0: 41946.7. Samples: 606365580. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-18 20:33:55,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 20:33:59,393][19107] Updated weights for policy 0, policy_version 206445 (0.0035) [2024-06-18 20:34:00,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3382460416. Throughput: 0: 42175.9. Samples: 606501120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:00,501][18875] Avg episode reward: [(0, '0.422')] [2024-06-18 20:34:02,240][19107] Updated weights for policy 0, policy_version 206455 (0.0035) [2024-06-18 20:34:05,504][18875] Fps is (10 sec: 39307.3, 60 sec: 41776.6, 300 sec: 41765.3). Total num frames: 3382640640. Throughput: 0: 42174.0. Samples: 606749440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:05,505][18875] Avg episode reward: [(0, '0.526')] [2024-06-18 20:34:07,115][19107] Updated weights for policy 0, policy_version 206465 (0.0040) [2024-06-18 20:34:10,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42327.4, 300 sec: 41876.4). Total num frames: 3382870016. Throughput: 0: 42032.2. Samples: 606994740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:10,500][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 20:34:10,550][19107] Updated weights for policy 0, policy_version 206475 (0.0028) [2024-06-18 20:34:14,804][19107] Updated weights for policy 0, policy_version 206485 (0.0029) [2024-06-18 20:34:15,500][18875] Fps is (10 sec: 44252.5, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3383083008. Throughput: 0: 41984.8. Samples: 607126540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:15,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 20:34:18,464][19107] Updated weights for policy 0, policy_version 206495 (0.0036) [2024-06-18 20:34:20,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 41820.8). Total num frames: 3383279616. Throughput: 0: 42090.4. Samples: 607377500. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:20,501][18875] Avg episode reward: [(0, '0.640')] [2024-06-18 20:34:22,457][19087] Signal inference workers to stop experience collection... (8850 times) [2024-06-18 20:34:22,489][19107] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-06-18 20:34:22,526][19087] Signal inference workers to resume experience collection... (8850 times) [2024-06-18 20:34:22,527][19107] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-06-18 20:34:22,661][19107] Updated weights for policy 0, policy_version 206505 (0.0022) [2024-06-18 20:34:25,504][18875] Fps is (10 sec: 42583.4, 60 sec: 42322.9, 300 sec: 41931.4). Total num frames: 3383508992. Throughput: 0: 42208.7. Samples: 607632960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:25,504][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 20:34:26,049][19107] Updated weights for policy 0, policy_version 206515 (0.0028) [2024-06-18 20:34:30,372][19107] Updated weights for policy 0, policy_version 206525 (0.0030) [2024-06-18 20:34:30,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3383705600. Throughput: 0: 42046.6. Samples: 607760340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:30,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 20:34:34,007][19107] Updated weights for policy 0, policy_version 206535 (0.0027) [2024-06-18 20:34:35,504][18875] Fps is (10 sec: 40959.6, 60 sec: 42595.8, 300 sec: 41820.4). Total num frames: 3383918592. Throughput: 0: 42102.3. Samples: 608012060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:35,505][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 20:34:35,531][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000206538_3383918592.pth... 
[2024-06-18 20:34:35,590][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000205926_3373891584.pth [2024-06-18 20:34:38,159][19107] Updated weights for policy 0, policy_version 206545 (0.0032) [2024-06-18 20:34:40,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.4). Total num frames: 3384131584. Throughput: 0: 42178.6. Samples: 608263620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:40,501][18875] Avg episode reward: [(0, '0.304')] [2024-06-18 20:34:41,636][19107] Updated weights for policy 0, policy_version 206555 (0.0048) [2024-06-18 20:34:45,504][18875] Fps is (10 sec: 40960.4, 60 sec: 41776.7, 300 sec: 41820.9). Total num frames: 3384328192. Throughput: 0: 41974.9. Samples: 608390140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:45,505][18875] Avg episode reward: [(0, '0.304')] [2024-06-18 20:34:45,902][19107] Updated weights for policy 0, policy_version 206565 (0.0028) [2024-06-18 20:34:49,409][19107] Updated weights for policy 0, policy_version 206575 (0.0025) [2024-06-18 20:34:50,500][18875] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 3384557568. Throughput: 0: 42051.0. Samples: 608641580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:50,500][18875] Avg episode reward: [(0, '0.365')] [2024-06-18 20:34:54,453][19107] Updated weights for policy 0, policy_version 206585 (0.0034) [2024-06-18 20:34:55,500][18875] Fps is (10 sec: 40974.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3384737792. Throughput: 0: 42271.9. Samples: 608896980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:34:55,501][18875] Avg episode reward: [(0, '0.728')] [2024-06-18 20:34:57,065][19107] Updated weights for policy 0, policy_version 206595 (0.0028) [2024-06-18 20:35:00,500][18875] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3384967168. Throughput: 0: 41927.1. Samples: 609013260. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:35:00,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 20:35:02,397][19107] Updated weights for policy 0, policy_version 206605 (0.0038) [2024-06-18 20:35:05,107][19107] Updated weights for policy 0, policy_version 206615 (0.0035) [2024-06-18 20:35:05,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42600.9, 300 sec: 41987.5). Total num frames: 3385196544. Throughput: 0: 41938.5. Samples: 609264740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 20:35:05,501][18875] Avg episode reward: [(0, '0.755')] [2024-06-18 20:35:10,107][19107] Updated weights for policy 0, policy_version 206625 (0.0033) [2024-06-18 20:35:10,500][18875] Fps is (10 sec: 37683.8, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 3385344000. Throughput: 0: 41969.2. Samples: 609521420. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:10,501][18875] Avg episode reward: [(0, '0.707')] [2024-06-18 20:35:12,977][19107] Updated weights for policy 0, policy_version 206635 (0.0034) [2024-06-18 20:35:15,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 41820.8). Total num frames: 3385589760. Throughput: 0: 41688.1. Samples: 609636300. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:15,509][18875] Avg episode reward: [(0, '0.636')] [2024-06-18 20:35:17,638][19107] Updated weights for policy 0, policy_version 206645 (0.0031) [2024-06-18 20:35:20,500][18875] Fps is (10 sec: 47513.0, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 3385819136. Throughput: 0: 41720.7. Samples: 609889340. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:20,501][18875] Avg episode reward: [(0, '0.403')] [2024-06-18 20:35:20,690][19107] Updated weights for policy 0, policy_version 206655 (0.0033) [2024-06-18 20:35:25,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41235.5, 300 sec: 41765.3). Total num frames: 3385982976. Throughput: 0: 41733.4. Samples: 610141620. 
Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:25,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 20:35:25,899][19107] Updated weights for policy 0, policy_version 206665 (0.0034) [2024-06-18 20:35:28,492][19107] Updated weights for policy 0, policy_version 206675 (0.0038) [2024-06-18 20:35:30,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 3386245120. Throughput: 0: 41567.8. Samples: 610260540. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:30,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 20:35:33,487][19107] Updated weights for policy 0, policy_version 206685 (0.0042) [2024-06-18 20:35:35,500][18875] Fps is (10 sec: 47513.4, 60 sec: 42327.9, 300 sec: 42098.5). Total num frames: 3386458112. Throughput: 0: 41846.5. Samples: 610524680. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:35,501][18875] Avg episode reward: [(0, '0.787')] [2024-06-18 20:35:36,338][19107] Updated weights for policy 0, policy_version 206695 (0.0046) [2024-06-18 20:35:40,500][18875] Fps is (10 sec: 37683.0, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 3386621952. Throughput: 0: 41810.2. Samples: 610778440. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:40,501][18875] Avg episode reward: [(0, '0.870')] [2024-06-18 20:35:41,102][19107] Updated weights for policy 0, policy_version 206705 (0.0031) [2024-06-18 20:35:43,134][19087] Signal inference workers to stop experience collection... (8900 times) [2024-06-18 20:35:43,168][19107] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-06-18 20:35:43,182][19087] Signal inference workers to resume experience collection... 
(8900 times) [2024-06-18 20:35:43,192][19107] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-06-18 20:35:44,205][19107] Updated weights for policy 0, policy_version 206715 (0.0043) [2024-06-18 20:35:45,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42327.8, 300 sec: 41931.9). Total num frames: 3386867712. Throughput: 0: 41864.0. Samples: 610897140. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:45,501][18875] Avg episode reward: [(0, '0.530')] [2024-06-18 20:35:48,948][19107] Updated weights for policy 0, policy_version 206725 (0.0043) [2024-06-18 20:35:50,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41232.9, 300 sec: 41876.4). Total num frames: 3387031552. Throughput: 0: 41922.2. Samples: 611151240. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:50,501][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 20:35:52,114][19107] Updated weights for policy 0, policy_version 206735 (0.0030) [2024-06-18 20:35:55,500][18875] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 3387260928. Throughput: 0: 41833.8. Samples: 611403940. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:35:55,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 20:35:56,483][19107] Updated weights for policy 0, policy_version 206745 (0.0038) [2024-06-18 20:35:59,820][19107] Updated weights for policy 0, policy_version 206755 (0.0030) [2024-06-18 20:36:00,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3387490304. Throughput: 0: 42203.9. Samples: 611535480. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:36:00,501][18875] Avg episode reward: [(0, '0.385')] [2024-06-18 20:36:04,102][19107] Updated weights for policy 0, policy_version 206765 (0.0033) [2024-06-18 20:36:05,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3387670528. Throughput: 0: 42075.2. Samples: 611782720. 
Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:36:05,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 20:36:07,669][19107] Updated weights for policy 0, policy_version 206775 (0.0024) [2024-06-18 20:36:10,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 41876.4). Total num frames: 3387899904. Throughput: 0: 41956.8. Samples: 612029680. Policy #0 lag: (min: 0.0, avg: 13.2, max: 23.0) [2024-06-18 20:36:10,501][18875] Avg episode reward: [(0, '0.612')] [2024-06-18 20:36:11,907][19107] Updated weights for policy 0, policy_version 206785 (0.0033) [2024-06-18 20:36:15,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3388112896. Throughput: 0: 42302.6. Samples: 612164160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:36:15,501][18875] Avg episode reward: [(0, '0.718')] [2024-06-18 20:36:15,666][19107] Updated weights for policy 0, policy_version 206795 (0.0039) [2024-06-18 20:36:19,818][19107] Updated weights for policy 0, policy_version 206805 (0.0036) [2024-06-18 20:36:20,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 3388309504. Throughput: 0: 41938.7. Samples: 612411920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:36:20,501][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 20:36:23,247][19107] Updated weights for policy 0, policy_version 206815 (0.0024) [2024-06-18 20:36:25,504][18875] Fps is (10 sec: 42583.4, 60 sec: 42595.9, 300 sec: 41931.7). Total num frames: 3388538880. Throughput: 0: 41774.5. Samples: 612658440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:36:25,505][18875] Avg episode reward: [(0, '0.400')] [2024-06-18 20:36:27,458][19107] Updated weights for policy 0, policy_version 206825 (0.0038) [2024-06-18 20:36:30,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3388735488. Throughput: 0: 42104.1. Samples: 612791820. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:36:30,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 20:36:31,517][19107] Updated weights for policy 0, policy_version 206835 (0.0042) [2024-06-18 20:36:35,138][19107] Updated weights for policy 0, policy_version 206845 (0.0037) [2024-06-18 20:36:35,500][18875] Fps is (10 sec: 42614.2, 60 sec: 41779.4, 300 sec: 41931.9). Total num frames: 3388964864. Throughput: 0: 41963.7. Samples: 613039600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:36:35,501][18875] Avg episode reward: [(0, '0.689')] [2024-06-18 20:36:35,512][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000206846_3388964864.pth... [2024-06-18 20:36:35,571][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000206231_3378888704.pth [2024-06-18 20:36:39,151][19107] Updated weights for policy 0, policy_version 206855 (0.0031) [2024-06-18 20:36:40,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 3389161472. Throughput: 0: 41892.0. Samples: 613289080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:36:40,500][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 20:36:42,835][19107] Updated weights for policy 0, policy_version 206865 (0.0028) [2024-06-18 20:36:45,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41988.0). Total num frames: 3389374464. Throughput: 0: 41790.7. Samples: 613416060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:36:45,501][18875] Avg episode reward: [(0, '0.695')] [2024-06-18 20:36:47,107][19107] Updated weights for policy 0, policy_version 206875 (0.0029) [2024-06-18 20:36:50,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42325.5, 300 sec: 41820.9). Total num frames: 3389571072. Throughput: 0: 41837.0. Samples: 613665380. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:36:50,500][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 20:36:51,042][19107] Updated weights for policy 0, policy_version 206885 (0.0033) [2024-06-18 20:36:53,616][19087] Signal inference workers to stop experience collection... (8950 times) [2024-06-18 20:36:53,616][19087] Signal inference workers to resume experience collection... (8950 times) [2024-06-18 20:36:53,654][19107] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-06-18 20:36:53,654][19107] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-06-18 20:36:54,897][19107] Updated weights for policy 0, policy_version 206895 (0.0031) [2024-06-18 20:36:55,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3389784064. Throughput: 0: 41997.5. Samples: 613919560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:36:55,501][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 20:36:58,576][19107] Updated weights for policy 0, policy_version 206905 (0.0022) [2024-06-18 20:37:00,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 3389980672. Throughput: 0: 41773.5. Samples: 614043960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:37:00,501][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 20:37:02,607][19107] Updated weights for policy 0, policy_version 206915 (0.0031) [2024-06-18 20:37:05,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 3390210048. Throughput: 0: 42055.7. Samples: 614304420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:37:05,501][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 20:37:06,099][19107] Updated weights for policy 0, policy_version 206925 (0.0039) [2024-06-18 20:37:10,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3390406656. Throughput: 0: 42187.4. 
Samples: 614556720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:37:10,501][18875] Avg episode reward: [(0, '0.536')] [2024-06-18 20:37:10,585][19107] Updated weights for policy 0, policy_version 206935 (0.0032) [2024-06-18 20:37:13,993][19107] Updated weights for policy 0, policy_version 206945 (0.0039) [2024-06-18 20:37:15,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 3390603264. Throughput: 0: 42043.9. Samples: 614683800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:37:15,501][18875] Avg episode reward: [(0, '0.738')] [2024-06-18 20:37:18,189][19107] Updated weights for policy 0, policy_version 206955 (0.0039) [2024-06-18 20:37:20,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3390849024. Throughput: 0: 42276.4. Samples: 614942040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 20:37:20,501][18875] Avg episode reward: [(0, '0.747')] [2024-06-18 20:37:21,645][19107] Updated weights for policy 0, policy_version 206965 (0.0023) [2024-06-18 20:37:25,500][18875] Fps is (10 sec: 45875.9, 60 sec: 42054.8, 300 sec: 42098.6). Total num frames: 3391062016. Throughput: 0: 42255.1. Samples: 615190560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:37:25,501][18875] Avg episode reward: [(0, '0.729')] [2024-06-18 20:37:25,771][19107] Updated weights for policy 0, policy_version 206975 (0.0037) [2024-06-18 20:37:29,449][19107] Updated weights for policy 0, policy_version 206985 (0.0032) [2024-06-18 20:37:30,504][18875] Fps is (10 sec: 40945.1, 60 sec: 42049.7, 300 sec: 41931.4). Total num frames: 3391258624. Throughput: 0: 42267.3. Samples: 615318240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:37:30,504][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 20:37:33,907][19107] Updated weights for policy 0, policy_version 206995 (0.0026) [2024-06-18 20:37:35,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3391471616. Throughput: 0: 42311.0. Samples: 615569380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:37:35,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 20:37:37,183][19107] Updated weights for policy 0, policy_version 207005 (0.0037) [2024-06-18 20:37:40,500][18875] Fps is (10 sec: 42614.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3391684608. Throughput: 0: 42199.7. Samples: 615818540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:37:40,501][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 20:37:41,617][19107] Updated weights for policy 0, policy_version 207015 (0.0041) [2024-06-18 20:37:45,070][19107] Updated weights for policy 0, policy_version 207025 (0.0032) [2024-06-18 20:37:45,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3391897600. Throughput: 0: 42316.3. Samples: 615948200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:37:45,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 20:37:49,455][19107] Updated weights for policy 0, policy_version 207035 (0.0028) [2024-06-18 20:37:50,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3392110592. Throughput: 0: 42170.2. Samples: 616202080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:37:50,501][18875] Avg episode reward: [(0, '0.704')] [2024-06-18 20:37:53,129][19107] Updated weights for policy 0, policy_version 207045 (0.0039) [2024-06-18 20:37:55,504][18875] Fps is (10 sec: 42583.3, 60 sec: 42322.8, 300 sec: 41987.0). Total num frames: 3392323584. Throughput: 0: 42053.6. Samples: 616449280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:37:55,504][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 20:37:57,036][19107] Updated weights for policy 0, policy_version 207055 (0.0036) [2024-06-18 20:38:00,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3392487424. Throughput: 0: 42028.2. Samples: 616575060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:38:00,500][18875] Avg episode reward: [(0, '0.735')] [2024-06-18 20:38:01,171][19107] Updated weights for policy 0, policy_version 207065 (0.0031) [2024-06-18 20:38:04,869][19107] Updated weights for policy 0, policy_version 207075 (0.0035) [2024-06-18 20:38:05,504][18875] Fps is (10 sec: 42598.1, 60 sec: 42322.7, 300 sec: 42098.4). Total num frames: 3392749568. Throughput: 0: 41861.5. Samples: 616825960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:38:05,505][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 20:38:08,998][19107] Updated weights for policy 0, policy_version 207085 (0.0021) [2024-06-18 20:38:10,500][18875] Fps is (10 sec: 45874.4, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3392946176. Throughput: 0: 41889.6. Samples: 617075600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:38:10,501][18875] Avg episode reward: [(0, '0.387')] [2024-06-18 20:38:12,677][19107] Updated weights for policy 0, policy_version 207095 (0.0037) [2024-06-18 20:38:14,058][19087] Signal inference workers to stop experience collection... (9000 times) [2024-06-18 20:38:14,059][19087] Signal inference workers to resume experience collection... (9000 times) [2024-06-18 20:38:14,088][19107] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-06-18 20:38:14,088][19107] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-06-18 20:38:15,504][18875] Fps is (10 sec: 39321.8, 60 sec: 42322.9, 300 sec: 42098.0). Total num frames: 3393142784. Throughput: 0: 41836.9. 
Samples: 617200900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:38:15,504][18875] Avg episode reward: [(0, '0.387')] [2024-06-18 20:38:16,649][19107] Updated weights for policy 0, policy_version 207105 (0.0036) [2024-06-18 20:38:20,486][19107] Updated weights for policy 0, policy_version 207115 (0.0038) [2024-06-18 20:38:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3393372160. Throughput: 0: 41931.0. Samples: 617456280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:38:20,501][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 20:38:24,383][19107] Updated weights for policy 0, policy_version 207125 (0.0037) [2024-06-18 20:38:25,500][18875] Fps is (10 sec: 45891.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3393601536. Throughput: 0: 42085.7. Samples: 617712400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-18 20:38:25,501][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 20:38:28,060][19107] Updated weights for policy 0, policy_version 207135 (0.0043) [2024-06-18 20:38:30,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42054.8, 300 sec: 42098.6). Total num frames: 3393781760. Throughput: 0: 41962.7. Samples: 617836520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:38:30,501][18875] Avg episode reward: [(0, '0.585')] [2024-06-18 20:38:32,350][19107] Updated weights for policy 0, policy_version 207145 (0.0028) [2024-06-18 20:38:35,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3393994752. Throughput: 0: 41989.4. Samples: 618091600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:38:35,501][18875] Avg episode reward: [(0, '0.721')] [2024-06-18 20:38:35,599][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000207154_3394011136.pth... 
[2024-06-18 20:38:35,649][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000206538_3383918592.pth [2024-06-18 20:38:35,879][19107] Updated weights for policy 0, policy_version 207155 (0.0032) [2024-06-18 20:38:40,051][19107] Updated weights for policy 0, policy_version 207165 (0.0031) [2024-06-18 20:38:40,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3394207744. Throughput: 0: 42129.5. Samples: 618344960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:38:40,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 20:38:43,643][19107] Updated weights for policy 0, policy_version 207175 (0.0035) [2024-06-18 20:38:45,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3394420736. Throughput: 0: 42003.5. Samples: 618465220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:38:45,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 20:38:48,328][19107] Updated weights for policy 0, policy_version 207185 (0.0039) [2024-06-18 20:38:50,504][18875] Fps is (10 sec: 44221.2, 60 sec: 42322.8, 300 sec: 42042.5). Total num frames: 3394650112. Throughput: 0: 42035.2. Samples: 618717540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:38:50,504][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 20:38:51,385][19107] Updated weights for policy 0, policy_version 207195 (0.0029) [2024-06-18 20:38:55,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41781.7, 300 sec: 41931.9). Total num frames: 3394830336. Throughput: 0: 42165.9. Samples: 618973060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:38:55,501][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 20:38:55,953][19107] Updated weights for policy 0, policy_version 207205 (0.0029) [2024-06-18 20:38:59,145][19107] Updated weights for policy 0, policy_version 207215 (0.0034) [2024-06-18 20:39:00,500][18875] Fps is (10 sec: 40974.8, 60 sec: 42871.5, 300 sec: 42099.1). Total num frames: 3395059712. Throughput: 0: 42099.9. Samples: 619095240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:39:00,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 20:39:03,663][19107] Updated weights for policy 0, policy_version 207225 (0.0044) [2024-06-18 20:39:05,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 3395272704. Throughput: 0: 42080.1. Samples: 619349880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:39:05,501][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 20:39:06,840][19107] Updated weights for policy 0, policy_version 207235 (0.0027) [2024-06-18 20:39:10,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3395452928. Throughput: 0: 42021.8. Samples: 619603380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:39:10,501][18875] Avg episode reward: [(0, '0.778')] [2024-06-18 20:39:11,595][19107] Updated weights for policy 0, policy_version 207245 (0.0044) [2024-06-18 20:39:14,550][19107] Updated weights for policy 0, policy_version 207255 (0.0042) [2024-06-18 20:39:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42327.9, 300 sec: 42043.0). Total num frames: 3395682304. Throughput: 0: 41980.0. Samples: 619725620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:39:15,500][18875] Avg episode reward: [(0, '0.761')] [2024-06-18 20:39:19,162][19107] Updated weights for policy 0, policy_version 207265 (0.0043) [2024-06-18 20:39:20,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41932.4). Total num frames: 3395878912. Throughput: 0: 42044.3. Samples: 619983600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:39:20,501][18875] Avg episode reward: [(0, '0.692')] [2024-06-18 20:39:22,293][19107] Updated weights for policy 0, policy_version 207275 (0.0047) [2024-06-18 20:39:25,500][18875] Fps is (10 sec: 40959.3, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3396091904. Throughput: 0: 42039.1. Samples: 620236720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:39:25,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 20:39:27,022][19107] Updated weights for policy 0, policy_version 207285 (0.0024) [2024-06-18 20:39:30,039][19107] Updated weights for policy 0, policy_version 207295 (0.0027) [2024-06-18 20:39:30,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42099.1). Total num frames: 3396337664. Throughput: 0: 41992.5. Samples: 620354880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:39:30,504][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 20:39:34,687][19107] Updated weights for policy 0, policy_version 207305 (0.0022) [2024-06-18 20:39:35,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 3396501504. Throughput: 0: 42187.4. Samples: 620615820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 20:39:35,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 20:39:37,619][19087] Signal inference workers to stop experience collection... (9050 times) [2024-06-18 20:39:37,620][19087] Signal inference workers to resume experience collection... 
(9050 times) [2024-06-18 20:39:37,660][19107] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-06-18 20:39:37,660][19107] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-06-18 20:39:37,771][19107] Updated weights for policy 0, policy_version 207315 (0.0029) [2024-06-18 20:39:40,501][18875] Fps is (10 sec: 36043.9, 60 sec: 41506.0, 300 sec: 41932.4). Total num frames: 3396698112. Throughput: 0: 42014.8. Samples: 620863740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:39:40,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 20:39:42,370][19107] Updated weights for policy 0, policy_version 207325 (0.0039) [2024-06-18 20:39:45,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3396960256. Throughput: 0: 42038.3. Samples: 620986960. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:39:45,500][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 20:39:45,559][19107] Updated weights for policy 0, policy_version 207335 (0.0031) [2024-06-18 20:39:50,202][19107] Updated weights for policy 0, policy_version 207345 (0.0039) [2024-06-18 20:39:50,500][18875] Fps is (10 sec: 44238.0, 60 sec: 41508.6, 300 sec: 42043.0). Total num frames: 3397140480. Throughput: 0: 42037.3. Samples: 621241560. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:39:50,501][18875] Avg episode reward: [(0, '0.685')] [2024-06-18 20:39:53,420][19107] Updated weights for policy 0, policy_version 207355 (0.0037) [2024-06-18 20:39:55,500][18875] Fps is (10 sec: 39320.6, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 3397353472. Throughput: 0: 41976.3. Samples: 621492320. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:39:55,501][18875] Avg episode reward: [(0, '0.664')] [2024-06-18 20:39:58,138][19107] Updated weights for policy 0, policy_version 207365 (0.0037) [2024-06-18 20:40:00,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3397566464. Throughput: 0: 42084.3. Samples: 621619420. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:40:00,501][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 20:40:01,522][19107] Updated weights for policy 0, policy_version 207375 (0.0026) [2024-06-18 20:40:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.0, 300 sec: 42098.5). Total num frames: 3397763072. Throughput: 0: 41935.9. Samples: 621870720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:40:05,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 20:40:06,132][19107] Updated weights for policy 0, policy_version 207385 (0.0042) [2024-06-18 20:40:09,184][19107] Updated weights for policy 0, policy_version 207395 (0.0034) [2024-06-18 20:40:10,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3397992448. Throughput: 0: 41689.4. Samples: 622112740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:40:10,501][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 20:40:13,995][19107] Updated weights for policy 0, policy_version 207405 (0.0039) [2024-06-18 20:40:15,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 3398172672. Throughput: 0: 41897.7. Samples: 622240280. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:40:15,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 20:40:17,102][19107] Updated weights for policy 0, policy_version 207415 (0.0030) [2024-06-18 20:40:20,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3398402048. Throughput: 0: 41776.0. Samples: 622495740. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:40:20,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 20:40:21,584][19107] Updated weights for policy 0, policy_version 207425 (0.0054) [2024-06-18 20:40:24,962][19107] Updated weights for policy 0, policy_version 207435 (0.0030) [2024-06-18 20:40:25,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3398615040. Throughput: 0: 41658.0. Samples: 622738340. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:40:25,501][18875] Avg episode reward: [(0, '0.834')] [2024-06-18 20:40:29,280][19107] Updated weights for policy 0, policy_version 207445 (0.0043) [2024-06-18 20:40:30,500][18875] Fps is (10 sec: 39322.1, 60 sec: 40960.1, 300 sec: 41820.9). Total num frames: 3398795264. Throughput: 0: 41830.2. Samples: 622869320. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:40:30,501][18875] Avg episode reward: [(0, '0.784')] [2024-06-18 20:40:32,642][19107] Updated weights for policy 0, policy_version 207455 (0.0042) [2024-06-18 20:40:35,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3399041024. Throughput: 0: 41804.4. Samples: 623122760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:40:35,501][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 20:40:35,518][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000207461_3399041024.pth... [2024-06-18 20:40:35,583][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000206846_3388964864.pth [2024-06-18 20:40:37,150][19107] Updated weights for policy 0, policy_version 207465 (0.0033) [2024-06-18 20:40:40,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42598.7, 300 sec: 41987.5). Total num frames: 3399254016. Throughput: 0: 41779.8. Samples: 623372400. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 20:40:40,500][18875] Avg episode reward: [(0, '0.665')] [2024-06-18 20:40:40,741][19107] Updated weights for policy 0, policy_version 207475 (0.0048) [2024-06-18 20:40:45,381][19107] Updated weights for policy 0, policy_version 207485 (0.0032) [2024-06-18 20:40:45,504][18875] Fps is (10 sec: 39307.5, 60 sec: 41230.5, 300 sec: 42042.5). Total num frames: 3399434240. Throughput: 0: 41831.4. Samples: 623501980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:40:45,505][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 20:40:48,399][19107] Updated weights for policy 0, policy_version 207495 (0.0032) [2024-06-18 20:40:50,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3399680000. Throughput: 0: 41920.6. Samples: 623757140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:40:50,501][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 20:40:53,114][19107] Updated weights for policy 0, policy_version 207505 (0.0028) [2024-06-18 20:40:55,500][18875] Fps is (10 sec: 44252.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3399876608. Throughput: 0: 42153.7. Samples: 624009660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:40:55,501][18875] Avg episode reward: [(0, '0.415')] [2024-06-18 20:40:56,186][19107] Updated weights for policy 0, policy_version 207515 (0.0028) [2024-06-18 20:41:00,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3400073216. Throughput: 0: 42157.8. Samples: 624137380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:00,504][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 20:41:00,783][19107] Updated weights for policy 0, policy_version 207525 (0.0048) [2024-06-18 20:41:03,903][19107] Updated weights for policy 0, policy_version 207535 (0.0032) [2024-06-18 20:41:05,500][18875] Fps is (10 sec: 42599.2, 60 sec: 42325.5, 300 sec: 42043.0). 
Total num frames: 3400302592. Throughput: 0: 42018.7. Samples: 624386580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:05,501][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 20:41:08,345][19107] Updated weights for policy 0, policy_version 207545 (0.0037) [2024-06-18 20:41:10,500][18875] Fps is (10 sec: 42599.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3400499200. Throughput: 0: 42428.6. Samples: 624647620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:10,501][18875] Avg episode reward: [(0, '0.275')] [2024-06-18 20:41:11,702][19107] Updated weights for policy 0, policy_version 207555 (0.0033) [2024-06-18 20:41:13,515][19087] Signal inference workers to stop experience collection... (9100 times) [2024-06-18 20:41:13,528][19107] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-06-18 20:41:13,630][19087] Signal inference workers to resume experience collection... (9100 times) [2024-06-18 20:41:13,630][19107] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-06-18 20:41:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3400712192. Throughput: 0: 42165.7. Samples: 624766780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:15,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 20:41:15,930][19107] Updated weights for policy 0, policy_version 207565 (0.0038) [2024-06-18 20:41:19,500][19107] Updated weights for policy 0, policy_version 207575 (0.0037) [2024-06-18 20:41:20,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 3400925184. Throughput: 0: 42202.8. Samples: 625021880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:20,501][18875] Avg episode reward: [(0, '0.620')] [2024-06-18 20:41:23,806][19107] Updated weights for policy 0, policy_version 207585 (0.0030) [2024-06-18 20:41:25,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3401121792. Throughput: 0: 42257.7. Samples: 625274000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:25,501][18875] Avg episode reward: [(0, '0.422')] [2024-06-18 20:41:27,414][19107] Updated weights for policy 0, policy_version 207595 (0.0029) [2024-06-18 20:41:30,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42043.0). Total num frames: 3401367552. Throughput: 0: 42099.4. Samples: 625396300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:30,501][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 20:41:31,629][19107] Updated weights for policy 0, policy_version 207605 (0.0024) [2024-06-18 20:41:35,262][19107] Updated weights for policy 0, policy_version 207615 (0.0035) [2024-06-18 20:41:35,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3401564160. Throughput: 0: 42204.5. Samples: 625656340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:35,501][18875] Avg episode reward: [(0, '0.370')] [2024-06-18 20:41:39,404][19107] Updated weights for policy 0, policy_version 207625 (0.0042) [2024-06-18 20:41:40,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3401777152. Throughput: 0: 42108.2. Samples: 625904520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:40,500][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 20:41:43,130][19107] Updated weights for policy 0, policy_version 207635 (0.0033) [2024-06-18 20:41:45,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42601.0, 300 sec: 42098.5). Total num frames: 3401990144. Throughput: 0: 42032.9. Samples: 626028860. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:45,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 20:41:47,058][19107] Updated weights for policy 0, policy_version 207645 (0.0035) [2024-06-18 20:41:50,500][18875] Fps is (10 sec: 37682.8, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 3402153984. Throughput: 0: 42129.3. Samples: 626282400. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 20:41:50,501][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 20:41:51,049][19107] Updated weights for policy 0, policy_version 207655 (0.0023) [2024-06-18 20:41:54,672][19107] Updated weights for policy 0, policy_version 207665 (0.0036) [2024-06-18 20:41:55,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42098.5). Total num frames: 3402399744. Throughput: 0: 41759.5. Samples: 626526800. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:41:55,504][18875] Avg episode reward: [(0, '0.839')] [2024-06-18 20:41:58,865][19107] Updated weights for policy 0, policy_version 207675 (0.0033) [2024-06-18 20:42:00,500][18875] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 3402629120. Throughput: 0: 42122.1. Samples: 626662280. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:00,501][18875] Avg episode reward: [(0, '0.727')] [2024-06-18 20:42:02,551][19107] Updated weights for policy 0, policy_version 207685 (0.0031) [2024-06-18 20:42:05,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3402792960. Throughput: 0: 42030.2. Samples: 626913240. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:05,501][18875] Avg episode reward: [(0, '0.727')] [2024-06-18 20:42:06,625][19107] Updated weights for policy 0, policy_version 207695 (0.0034) [2024-06-18 20:42:10,208][19107] Updated weights for policy 0, policy_version 207705 (0.0033) [2024-06-18 20:42:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42154.1). 
Total num frames: 3403038720. Throughput: 0: 41880.4. Samples: 627158620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:10,501][18875] Avg episode reward: [(0, '0.551')] [2024-06-18 20:42:13,797][19087] Signal inference workers to stop experience collection... (9150 times) [2024-06-18 20:42:13,844][19107] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-06-18 20:42:13,915][19087] Signal inference workers to resume experience collection... (9150 times) [2024-06-18 20:42:13,915][19107] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-06-18 20:42:14,418][19107] Updated weights for policy 0, policy_version 207715 (0.0030) [2024-06-18 20:42:15,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3403251712. Throughput: 0: 42338.2. Samples: 627301520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:15,502][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 20:42:17,714][19107] Updated weights for policy 0, policy_version 207725 (0.0033) [2024-06-18 20:42:20,501][18875] Fps is (10 sec: 37681.3, 60 sec: 41505.7, 300 sec: 41876.3). Total num frames: 3403415552. Throughput: 0: 42010.6. Samples: 627546840. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:20,501][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 20:42:22,273][19107] Updated weights for policy 0, policy_version 207735 (0.0038) [2024-06-18 20:42:25,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42099.1). Total num frames: 3403677696. Throughput: 0: 41947.9. Samples: 627792180. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:25,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 20:42:25,631][19107] Updated weights for policy 0, policy_version 207745 (0.0035) [2024-06-18 20:42:30,027][19107] Updated weights for policy 0, policy_version 207755 (0.0041) [2024-06-18 20:42:30,500][18875] Fps is (10 sec: 45877.0, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3403874304. Throughput: 0: 42203.9. Samples: 627928040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:30,501][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 20:42:33,381][19107] Updated weights for policy 0, policy_version 207765 (0.0043) [2024-06-18 20:42:35,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41987.4). Total num frames: 3404070912. Throughput: 0: 41966.2. Samples: 628170880. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:35,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 20:42:35,512][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000207768_3404070912.pth... [2024-06-18 20:42:35,571][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000207154_3394011136.pth [2024-06-18 20:42:37,995][19107] Updated weights for policy 0, policy_version 207775 (0.0030) [2024-06-18 20:42:40,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 3404316672. Throughput: 0: 41979.9. Samples: 628415900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:40,506][18875] Avg episode reward: [(0, '0.687')] [2024-06-18 20:42:41,190][19107] Updated weights for policy 0, policy_version 207785 (0.0041) [2024-06-18 20:42:45,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3404480512. Throughput: 0: 42003.6. Samples: 628552440. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:45,501][18875] Avg episode reward: [(0, '0.638')] [2024-06-18 20:42:45,847][19107] Updated weights for policy 0, policy_version 207795 (0.0038) [2024-06-18 20:42:48,990][19107] Updated weights for policy 0, policy_version 207805 (0.0024) [2024-06-18 20:42:50,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 41988.0). Total num frames: 3404709888. Throughput: 0: 41884.4. Samples: 628798040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:50,501][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 20:42:53,593][19107] Updated weights for policy 0, policy_version 207815 (0.0029) [2024-06-18 20:42:55,500][18875] Fps is (10 sec: 47513.4, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 3404955648. Throughput: 0: 42029.3. Samples: 629049940. Policy #0 lag: (min: 1.0, avg: 9.4, max: 24.0) [2024-06-18 20:42:55,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 20:42:56,724][19107] Updated weights for policy 0, policy_version 207825 (0.0031) [2024-06-18 20:43:00,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41932.4). Total num frames: 3405119488. Throughput: 0: 41926.2. Samples: 629188200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:00,501][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 20:43:01,195][19107] Updated weights for policy 0, policy_version 207835 (0.0043) [2024-06-18 20:43:04,378][19107] Updated weights for policy 0, policy_version 207845 (0.0032) [2024-06-18 20:43:05,504][18875] Fps is (10 sec: 39307.3, 60 sec: 42595.8, 300 sec: 42042.5). Total num frames: 3405348864. Throughput: 0: 41874.8. Samples: 629431340. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:05,505][18875] Avg episode reward: [(0, '0.283')] [2024-06-18 20:43:09,012][19107] Updated weights for policy 0, policy_version 207855 (0.0032) [2024-06-18 20:43:10,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42099.1). Total num frames: 3405561856. Throughput: 0: 42205.4. Samples: 629691420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:10,501][18875] Avg episode reward: [(0, '0.420')] [2024-06-18 20:43:12,041][19107] Updated weights for policy 0, policy_version 207865 (0.0037) [2024-06-18 20:43:15,500][18875] Fps is (10 sec: 40974.7, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3405758464. Throughput: 0: 41940.5. Samples: 629815360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:15,501][18875] Avg episode reward: [(0, '0.661')] [2024-06-18 20:43:16,640][19107] Updated weights for policy 0, policy_version 207875 (0.0039) [2024-06-18 20:43:19,828][19107] Updated weights for policy 0, policy_version 207885 (0.0038) [2024-06-18 20:43:20,500][18875] Fps is (10 sec: 44237.0, 60 sec: 43145.0, 300 sec: 42043.0). Total num frames: 3406004224. Throughput: 0: 42083.2. Samples: 630064620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:20,501][18875] Avg episode reward: [(0, '0.656')] [2024-06-18 20:43:24,149][19087] Signal inference workers to stop experience collection... (9200 times) [2024-06-18 20:43:24,156][19087] Signal inference workers to resume experience collection... (9200 times) [2024-06-18 20:43:24,178][19107] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-06-18 20:43:24,178][19107] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-06-18 20:43:24,313][19107] Updated weights for policy 0, policy_version 207895 (0.0041) [2024-06-18 20:43:25,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3406200832. Throughput: 0: 42487.2. 
Samples: 630327820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:25,501][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 20:43:27,454][19107] Updated weights for policy 0, policy_version 207905 (0.0037) [2024-06-18 20:43:30,500][18875] Fps is (10 sec: 37683.2, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3406381056. Throughput: 0: 42212.5. Samples: 630452000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:30,501][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 20:43:32,211][19107] Updated weights for policy 0, policy_version 207915 (0.0034) [2024-06-18 20:43:35,438][19107] Updated weights for policy 0, policy_version 207925 (0.0023) [2024-06-18 20:43:35,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 3406643200. Throughput: 0: 42349.8. Samples: 630703780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:35,501][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 20:43:39,951][19107] Updated weights for policy 0, policy_version 207935 (0.0029) [2024-06-18 20:43:40,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3406823424. Throughput: 0: 42348.6. Samples: 630955620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:40,501][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 20:43:43,482][19107] Updated weights for policy 0, policy_version 207945 (0.0036) [2024-06-18 20:43:45,500][18875] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 41932.4). Total num frames: 3407020032. Throughput: 0: 41999.5. Samples: 631078180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:45,501][18875] Avg episode reward: [(0, '0.722')] [2024-06-18 20:43:47,843][19107] Updated weights for policy 0, policy_version 207955 (0.0024) [2024-06-18 20:43:50,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3407265792. Throughput: 0: 42214.7. 
Samples: 631330840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:50,500][18875] Avg episode reward: [(0, '0.822')] [2024-06-18 20:43:51,403][19107] Updated weights for policy 0, policy_version 207965 (0.0037) [2024-06-18 20:43:55,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3407446016. Throughput: 0: 42231.9. Samples: 631591860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:43:55,501][18875] Avg episode reward: [(0, '0.822')] [2024-06-18 20:43:55,677][19107] Updated weights for policy 0, policy_version 207975 (0.0038) [2024-06-18 20:43:59,354][19107] Updated weights for policy 0, policy_version 207985 (0.0028) [2024-06-18 20:44:00,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3407675392. Throughput: 0: 42187.2. Samples: 631713780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:44:00,501][18875] Avg episode reward: [(0, '0.719')] [2024-06-18 20:44:03,113][19107] Updated weights for policy 0, policy_version 207995 (0.0030) [2024-06-18 20:44:05,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42328.0, 300 sec: 42154.1). Total num frames: 3407888384. Throughput: 0: 42364.0. Samples: 631971000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:44:05,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 20:44:07,123][19107] Updated weights for policy 0, policy_version 208005 (0.0023) [2024-06-18 20:44:10,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3408084992. Throughput: 0: 42180.8. Samples: 632225960. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:10,501][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 20:44:10,762][19107] Updated weights for policy 0, policy_version 208015 (0.0041) [2024-06-18 20:44:14,958][19107] Updated weights for policy 0, policy_version 208025 (0.0029) [2024-06-18 20:44:15,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3408297984. Throughput: 0: 42158.1. Samples: 632349120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:15,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 20:44:18,445][19107] Updated weights for policy 0, policy_version 208035 (0.0051) [2024-06-18 20:44:20,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3408510976. Throughput: 0: 42158.3. Samples: 632600900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:20,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 20:44:22,661][19107] Updated weights for policy 0, policy_version 208045 (0.0035) [2024-06-18 20:44:25,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 3408707584. Throughput: 0: 42364.1. Samples: 632862000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:25,500][18875] Avg episode reward: [(0, '0.663')] [2024-06-18 20:44:26,193][19107] Updated weights for policy 0, policy_version 208055 (0.0038) [2024-06-18 20:44:30,193][19107] Updated weights for policy 0, policy_version 208065 (0.0045) [2024-06-18 20:44:30,504][18875] Fps is (10 sec: 42582.8, 60 sec: 42595.8, 300 sec: 42153.6). Total num frames: 3408936960. Throughput: 0: 42208.3. Samples: 632977700. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:30,504][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 20:44:34,391][19107] Updated weights for policy 0, policy_version 208075 (0.0033) [2024-06-18 20:44:35,500][18875] Fps is (10 sec: 44235.6, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 3409149952. Throughput: 0: 42291.8. Samples: 633233980. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:35,501][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 20:44:35,511][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000208078_3409149952.pth... [2024-06-18 20:44:35,596][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000207461_3399041024.pth [2024-06-18 20:44:38,019][19107] Updated weights for policy 0, policy_version 208085 (0.0026) [2024-06-18 20:44:40,500][18875] Fps is (10 sec: 39336.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3409330176. Throughput: 0: 42041.0. Samples: 633483700. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:40,500][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 20:44:42,046][19107] Updated weights for policy 0, policy_version 208095 (0.0029) [2024-06-18 20:44:45,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3409575936. Throughput: 0: 42097.8. Samples: 633608180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:45,500][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 20:44:45,580][19107] Updated weights for policy 0, policy_version 208105 (0.0037) [2024-06-18 20:44:49,695][19087] Signal inference workers to stop experience collection... (9250 times) [2024-06-18 20:44:49,698][19087] Signal inference workers to resume experience collection... 
(9250 times) [2024-06-18 20:44:49,715][19107] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-06-18 20:44:49,720][19107] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-06-18 20:44:49,866][19107] Updated weights for policy 0, policy_version 208115 (0.0035) [2024-06-18 20:44:50,500][18875] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 3409772544. Throughput: 0: 41981.7. Samples: 633860180. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:50,504][18875] Avg episode reward: [(0, '0.347')] [2024-06-18 20:44:53,315][19107] Updated weights for policy 0, policy_version 208125 (0.0036) [2024-06-18 20:44:55,504][18875] Fps is (10 sec: 37669.4, 60 sec: 41776.7, 300 sec: 41987.0). Total num frames: 3409952768. Throughput: 0: 42003.8. Samples: 634116280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:44:55,504][18875] Avg episode reward: [(0, '0.416')] [2024-06-18 20:44:57,525][19107] Updated weights for policy 0, policy_version 208135 (0.0035) [2024-06-18 20:45:00,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3410198528. Throughput: 0: 41978.2. Samples: 634238140. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:45:00,501][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 20:45:01,582][19107] Updated weights for policy 0, policy_version 208145 (0.0051) [2024-06-18 20:45:05,371][19107] Updated weights for policy 0, policy_version 208155 (0.0030) [2024-06-18 20:45:05,501][18875] Fps is (10 sec: 45890.9, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 3410411520. Throughput: 0: 42069.5. Samples: 634494040. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:45:05,502][18875] Avg episode reward: [(0, '0.378')] [2024-06-18 20:45:09,323][19107] Updated weights for policy 0, policy_version 208165 (0.0033) [2024-06-18 20:45:10,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3410591744. Throughput: 0: 41931.0. Samples: 634748900. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-18 20:45:10,501][18875] Avg episode reward: [(0, '0.218')] [2024-06-18 20:45:13,127][19107] Updated weights for policy 0, policy_version 208175 (0.0046) [2024-06-18 20:45:15,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3410821120. Throughput: 0: 42116.2. Samples: 634872780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:45:15,501][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 20:45:17,124][19107] Updated weights for policy 0, policy_version 208185 (0.0044) [2024-06-18 20:45:20,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3411034112. Throughput: 0: 42211.6. Samples: 635133500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:45:20,501][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 20:45:20,716][19107] Updated weights for policy 0, policy_version 208195 (0.0035) [2024-06-18 20:45:25,007][19107] Updated weights for policy 0, policy_version 208205 (0.0029) [2024-06-18 20:45:25,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 3411230720. Throughput: 0: 42120.7. Samples: 635379140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:45:25,501][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 20:45:28,298][19107] Updated weights for policy 0, policy_version 208215 (0.0033) [2024-06-18 20:45:30,504][18875] Fps is (10 sec: 44221.2, 60 sec: 42325.3, 300 sec: 42153.6). Total num frames: 3411476480. Throughput: 0: 42140.6. Samples: 635504660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:45:30,504][18875] Avg episode reward: [(0, '0.375')] [2024-06-18 20:45:32,785][19107] Updated weights for policy 0, policy_version 208225 (0.0034) [2024-06-18 20:45:35,500][18875] Fps is (10 sec: 42599.3, 60 sec: 41779.4, 300 sec: 42043.0). Total num frames: 3411656704. Throughput: 0: 42297.5. Samples: 635763560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:45:35,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 20:45:36,138][19107] Updated weights for policy 0, policy_version 208235 (0.0037) [2024-06-18 20:45:40,500][18875] Fps is (10 sec: 39335.4, 60 sec: 42325.2, 300 sec: 42154.6). Total num frames: 3411869696. Throughput: 0: 41989.9. Samples: 636005680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:45:40,501][18875] Avg episode reward: [(0, '0.716')] [2024-06-18 20:45:41,178][19107] Updated weights for policy 0, policy_version 208245 (0.0035) [2024-06-18 20:45:43,868][19107] Updated weights for policy 0, policy_version 208255 (0.0039) [2024-06-18 20:45:45,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3412115456. Throughput: 0: 42081.0. Samples: 636131780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:45:45,501][18875] Avg episode reward: [(0, '0.731')] [2024-06-18 20:45:48,954][19107] Updated weights for policy 0, policy_version 208265 (0.0035) [2024-06-18 20:45:50,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3412262912. Throughput: 0: 42026.4. Samples: 636385220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:45:50,501][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 20:45:51,666][19107] Updated weights for policy 0, policy_version 208275 (0.0033) [2024-06-18 20:45:55,500][18875] Fps is (10 sec: 37682.8, 60 sec: 42327.8, 300 sec: 42098.5). Total num frames: 3412492288. Throughput: 0: 41983.1. Samples: 636638140. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:45:55,501][18875] Avg episode reward: [(0, '0.422')] [2024-06-18 20:45:56,757][19107] Updated weights for policy 0, policy_version 208285 (0.0037) [2024-06-18 20:45:58,721][19087] Signal inference workers to stop experience collection... (9300 times) [2024-06-18 20:45:58,753][19107] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-06-18 20:45:58,838][19087] Signal inference workers to resume experience collection... (9300 times) [2024-06-18 20:45:58,838][19107] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-06-18 20:45:59,487][19107] Updated weights for policy 0, policy_version 208295 (0.0029) [2024-06-18 20:46:00,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3412721664. Throughput: 0: 42115.6. Samples: 636767980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:46:00,501][18875] Avg episode reward: [(0, '0.851')] [2024-06-18 20:46:04,440][19107] Updated weights for policy 0, policy_version 208305 (0.0047) [2024-06-18 20:46:05,504][18875] Fps is (10 sec: 39307.7, 60 sec: 41230.7, 300 sec: 41986.9). Total num frames: 3412885504. Throughput: 0: 41867.3. Samples: 637017680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:46:05,505][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 20:46:07,150][19107] Updated weights for policy 0, policy_version 208315 (0.0031) [2024-06-18 20:46:10,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3413114880. Throughput: 0: 41890.8. Samples: 637264220. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:46:10,501][18875] Avg episode reward: [(0, '0.318')] [2024-06-18 20:46:12,408][19107] Updated weights for policy 0, policy_version 208325 (0.0037) [2024-06-18 20:46:15,442][19107] Updated weights for policy 0, policy_version 208335 (0.0043) [2024-06-18 20:46:15,501][18875] Fps is (10 sec: 47526.4, 60 sec: 42324.7, 300 sec: 42153.9). Total num frames: 3413360640. Throughput: 0: 41988.7. Samples: 637394040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:46:15,502][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 20:46:20,193][19107] Updated weights for policy 0, policy_version 208345 (0.0032) [2024-06-18 20:46:20,502][18875] Fps is (10 sec: 40953.9, 60 sec: 41505.2, 300 sec: 42042.8). Total num frames: 3413524480. Throughput: 0: 41853.3. Samples: 637647020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-18 20:46:20,502][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 20:46:23,215][19107] Updated weights for policy 0, policy_version 208355 (0.0033) [2024-06-18 20:46:25,500][18875] Fps is (10 sec: 40963.2, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3413770240. Throughput: 0: 41841.3. Samples: 637888540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:46:25,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 20:46:27,900][19107] Updated weights for policy 0, policy_version 208365 (0.0039) [2024-06-18 20:46:30,500][18875] Fps is (10 sec: 45881.5, 60 sec: 41781.7, 300 sec: 42098.5). Total num frames: 3413983232. Throughput: 0: 42030.2. Samples: 638023140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:46:30,501][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 20:46:30,954][19107] Updated weights for policy 0, policy_version 208375 (0.0038) [2024-06-18 20:46:35,500][18875] Fps is (10 sec: 37684.2, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3414147072. Throughput: 0: 42008.6. Samples: 638275600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:46:35,500][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 20:46:35,519][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000208384_3414163456.pth... [2024-06-18 20:46:35,573][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000207768_3404070912.pth [2024-06-18 20:46:36,073][19107] Updated weights for policy 0, policy_version 208385 (0.0038) [2024-06-18 20:46:38,715][19107] Updated weights for policy 0, policy_version 208395 (0.0041) [2024-06-18 20:46:40,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3414409216. Throughput: 0: 41695.1. Samples: 638514420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:46:40,501][18875] Avg episode reward: [(0, '0.318')] [2024-06-18 20:46:43,889][19107] Updated weights for policy 0, policy_version 208405 (0.0035) [2024-06-18 20:46:45,500][18875] Fps is (10 sec: 45874.2, 60 sec: 41506.0, 300 sec: 42209.6). Total num frames: 3414605824. Throughput: 0: 41854.1. Samples: 638651420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:46:45,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 20:46:46,362][19107] Updated weights for policy 0, policy_version 208415 (0.0033) [2024-06-18 20:46:50,500][18875] Fps is (10 sec: 36045.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3414769664. Throughput: 0: 41713.6. Samples: 638894640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:46:50,501][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 20:46:51,611][19107] Updated weights for policy 0, policy_version 208425 (0.0026) [2024-06-18 20:46:54,033][19107] Updated weights for policy 0, policy_version 208435 (0.0033) [2024-06-18 20:46:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3415031808. Throughput: 0: 41716.8. Samples: 639141480. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:46:55,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 20:46:59,357][19107] Updated weights for policy 0, policy_version 208445 (0.0035) [2024-06-18 20:47:00,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 3415212032. Throughput: 0: 41960.4. Samples: 639282220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:47:00,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 20:47:01,914][19107] Updated weights for policy 0, policy_version 208455 (0.0029) [2024-06-18 20:47:05,500][18875] Fps is (10 sec: 37683.5, 60 sec: 42054.8, 300 sec: 41931.9). Total num frames: 3415408640. Throughput: 0: 41724.4. Samples: 639524560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:47:05,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 20:47:07,209][19107] Updated weights for policy 0, policy_version 208465 (0.0041) [2024-06-18 20:47:09,964][19107] Updated weights for policy 0, policy_version 208475 (0.0028) [2024-06-18 20:47:10,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3415670784. Throughput: 0: 41811.8. Samples: 639770060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:47:10,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 20:47:14,792][19087] Signal inference workers to stop experience collection... (9350 times) [2024-06-18 20:47:14,826][19107] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-06-18 20:47:14,904][19087] Signal inference workers to resume experience collection... (9350 times) [2024-06-18 20:47:14,904][19107] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-06-18 20:47:15,053][19107] Updated weights for policy 0, policy_version 208485 (0.0032) [2024-06-18 20:47:15,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41233.7, 300 sec: 42098.6). Total num frames: 3415834624. Throughput: 0: 41839.2. 
Samples: 639905900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:47:15,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 20:47:17,778][19107] Updated weights for policy 0, policy_version 208495 (0.0043) [2024-06-18 20:47:20,500][18875] Fps is (10 sec: 39321.0, 60 sec: 42326.3, 300 sec: 41987.5). Total num frames: 3416064000. Throughput: 0: 41714.1. Samples: 640152740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:47:20,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 20:47:22,846][19107] Updated weights for policy 0, policy_version 208505 (0.0034) [2024-06-18 20:47:25,381][19107] Updated weights for policy 0, policy_version 208515 (0.0025) [2024-06-18 20:47:25,500][18875] Fps is (10 sec: 47513.5, 60 sec: 42325.5, 300 sec: 42154.1). Total num frames: 3416309760. Throughput: 0: 41990.4. Samples: 640403980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-18 20:47:25,501][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 20:47:30,501][18875] Fps is (10 sec: 39317.8, 60 sec: 41232.4, 300 sec: 41987.3). Total num frames: 3416457216. Throughput: 0: 41761.8. Samples: 640530740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:47:30,502][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 20:47:30,849][19107] Updated weights for policy 0, policy_version 208525 (0.0041) [2024-06-18 20:47:33,138][19107] Updated weights for policy 0, policy_version 208535 (0.0032) [2024-06-18 20:47:35,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 3416702976. Throughput: 0: 41939.9. Samples: 640781940. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:47:35,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 20:47:38,487][19107] Updated weights for policy 0, policy_version 208545 (0.0037) [2024-06-18 20:47:40,500][18875] Fps is (10 sec: 47519.3, 60 sec: 42052.5, 300 sec: 42209.7). Total num frames: 3416932352. Throughput: 0: 42079.3. 
Samples: 641035040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:47:40,500][18875] Avg episode reward: [(0, '0.551')] [2024-06-18 20:47:41,190][19107] Updated weights for policy 0, policy_version 208555 (0.0042) [2024-06-18 20:47:45,500][18875] Fps is (10 sec: 37683.7, 60 sec: 41233.2, 300 sec: 41931.9). Total num frames: 3417079808. Throughput: 0: 41803.2. Samples: 641163360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:47:45,500][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 20:47:46,037][19107] Updated weights for policy 0, policy_version 208565 (0.0041) [2024-06-18 20:47:48,870][19107] Updated weights for policy 0, policy_version 208575 (0.0031) [2024-06-18 20:47:50,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 41932.0). Total num frames: 3417325568. Throughput: 0: 41965.8. Samples: 641413020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:47:50,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 20:47:53,679][19107] Updated weights for policy 0, policy_version 208585 (0.0039) [2024-06-18 20:47:55,500][18875] Fps is (10 sec: 47513.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3417554944. Throughput: 0: 42243.4. Samples: 641671020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:47:55,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 20:47:56,801][19107] Updated weights for policy 0, policy_version 208595 (0.0034) [2024-06-18 20:48:00,504][18875] Fps is (10 sec: 39307.1, 60 sec: 41776.7, 300 sec: 41931.9). Total num frames: 3417718784. Throughput: 0: 42090.3. Samples: 641800120. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:48:00,505][18875] Avg episode reward: [(0, '0.677')] [2024-06-18 20:48:01,624][19107] Updated weights for policy 0, policy_version 208605 (0.0028) [2024-06-18 20:48:04,546][19107] Updated weights for policy 0, policy_version 208615 (0.0033) [2024-06-18 20:48:05,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 3417964544. Throughput: 0: 42165.8. Samples: 642050200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:48:05,501][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 20:48:09,232][19107] Updated weights for policy 0, policy_version 208625 (0.0033) [2024-06-18 20:48:10,500][18875] Fps is (10 sec: 45891.8, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 3418177536. Throughput: 0: 42153.3. Samples: 642300880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:48:10,501][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 20:48:12,486][19107] Updated weights for policy 0, policy_version 208635 (0.0034) [2024-06-18 20:48:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3418374144. Throughput: 0: 42279.6. Samples: 642433280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:48:15,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 20:48:16,762][19107] Updated weights for policy 0, policy_version 208645 (0.0036) [2024-06-18 20:48:20,504][18875] Fps is (10 sec: 40945.2, 60 sec: 42049.8, 300 sec: 41987.0). Total num frames: 3418587136. Throughput: 0: 42181.6. Samples: 642680260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:48:20,504][18875] Avg episode reward: [(0, '0.716')] [2024-06-18 20:48:20,542][19107] Updated weights for policy 0, policy_version 208655 (0.0035) [2024-06-18 20:48:24,562][19107] Updated weights for policy 0, policy_version 208665 (0.0035) [2024-06-18 20:48:25,504][18875] Fps is (10 sec: 44220.9, 60 sec: 41776.6, 300 sec: 42153.6). 
Total num frames: 3418816512. Throughput: 0: 42189.3. Samples: 642933720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:48:25,505][18875] Avg episode reward: [(0, '0.672')] [2024-06-18 20:48:28,507][19087] Signal inference workers to stop experience collection... (9400 times) [2024-06-18 20:48:28,508][19087] Signal inference workers to resume experience collection... (9400 times) [2024-06-18 20:48:28,516][19107] Updated weights for policy 0, policy_version 208675 (0.0041) [2024-06-18 20:48:28,537][19107] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-06-18 20:48:28,537][19107] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-06-18 20:48:30,500][18875] Fps is (10 sec: 42614.2, 60 sec: 42599.2, 300 sec: 41932.0). Total num frames: 3419013120. Throughput: 0: 42179.6. Samples: 643061440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:48:30,500][18875] Avg episode reward: [(0, '0.397')] [2024-06-18 20:48:32,340][19107] Updated weights for policy 0, policy_version 208685 (0.0043) [2024-06-18 20:48:35,500][18875] Fps is (10 sec: 40975.3, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3419226112. Throughput: 0: 42183.6. Samples: 643311280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 20.0) [2024-06-18 20:48:35,500][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 20:48:35,513][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000208693_3419226112.pth... [2024-06-18 20:48:35,570][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000208078_3409149952.pth [2024-06-18 20:48:36,145][19107] Updated weights for policy 0, policy_version 208695 (0.0034) [2024-06-18 20:48:40,045][19107] Updated weights for policy 0, policy_version 208705 (0.0023) [2024-06-18 20:48:40,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41506.0, 300 sec: 42043.0). Total num frames: 3419422720. Throughput: 0: 42078.7. Samples: 643564560. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:48:40,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 20:48:43,987][19107] Updated weights for policy 0, policy_version 208715 (0.0026) [2024-06-18 20:48:45,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 41931.9). Total num frames: 3419635712. Throughput: 0: 41990.9. Samples: 643689560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:48:45,501][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 20:48:48,093][19107] Updated weights for policy 0, policy_version 208725 (0.0035) [2024-06-18 20:48:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3419832320. Throughput: 0: 42018.3. Samples: 643941020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:48:50,501][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 20:48:52,103][19107] Updated weights for policy 0, policy_version 208735 (0.0033) [2024-06-18 20:48:55,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3420045312. Throughput: 0: 41946.6. Samples: 644188480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:48:55,501][18875] Avg episode reward: [(0, '0.706')] [2024-06-18 20:48:55,941][19107] Updated weights for policy 0, policy_version 208745 (0.0049) [2024-06-18 20:48:59,771][19107] Updated weights for policy 0, policy_version 208755 (0.0035) [2024-06-18 20:49:00,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42874.0, 300 sec: 42043.0). Total num frames: 3420291072. Throughput: 0: 41932.9. Samples: 644320260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:49:00,501][18875] Avg episode reward: [(0, '0.762')] [2024-06-18 20:49:03,535][19107] Updated weights for policy 0, policy_version 208765 (0.0037) [2024-06-18 20:49:05,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3420487680. Throughput: 0: 42018.0. Samples: 644570920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:49:05,503][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 20:49:07,360][19107] Updated weights for policy 0, policy_version 208775 (0.0030) [2024-06-18 20:49:10,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3420700672. Throughput: 0: 42005.7. Samples: 644823820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:49:10,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 20:49:11,353][19107] Updated weights for policy 0, policy_version 208785 (0.0034) [2024-06-18 20:49:14,919][19107] Updated weights for policy 0, policy_version 208795 (0.0031) [2024-06-18 20:49:15,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3420913664. Throughput: 0: 41999.8. Samples: 644951440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:49:15,501][18875] Avg episode reward: [(0, '0.742')] [2024-06-18 20:49:18,972][19107] Updated weights for policy 0, policy_version 208805 (0.0043) [2024-06-18 20:49:20,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 3421110272. Throughput: 0: 41954.6. Samples: 645199240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:49:20,501][18875] Avg episode reward: [(0, '0.791')] [2024-06-18 20:49:22,928][19107] Updated weights for policy 0, policy_version 208815 (0.0042) [2024-06-18 20:49:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41781.8, 300 sec: 41988.0). Total num frames: 3421323264. Throughput: 0: 42013.8. Samples: 645455180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:49:25,501][18875] Avg episode reward: [(0, '0.668')] [2024-06-18 20:49:26,510][19107] Updated weights for policy 0, policy_version 208825 (0.0034) [2024-06-18 20:49:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3421536256. Throughput: 0: 42161.8. Samples: 645586840. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:49:30,501][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 20:49:30,573][19107] Updated weights for policy 0, policy_version 208835 (0.0036) [2024-06-18 20:49:34,226][19107] Updated weights for policy 0, policy_version 208845 (0.0033) [2024-06-18 20:49:35,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3421732864. Throughput: 0: 42149.7. Samples: 645837760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:49:35,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 20:49:38,667][19107] Updated weights for policy 0, policy_version 208855 (0.0038) [2024-06-18 20:49:40,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 41987.4). Total num frames: 3421962240. Throughput: 0: 42266.6. Samples: 646090480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 20:49:40,501][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 20:49:42,003][19107] Updated weights for policy 0, policy_version 208865 (0.0035) [2024-06-18 20:49:45,504][18875] Fps is (10 sec: 40945.5, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 3422142464. Throughput: 0: 42110.0. Samples: 646215360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:49:45,505][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 20:49:46,348][19107] Updated weights for policy 0, policy_version 208875 (0.0040) [2024-06-18 20:49:47,808][19087] Signal inference workers to stop experience collection... (9450 times) [2024-06-18 20:49:47,815][19087] Signal inference workers to resume experience collection... 
(9450 times) [2024-06-18 20:49:47,854][19107] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-06-18 20:49:47,854][19107] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-06-18 20:49:49,951][19107] Updated weights for policy 0, policy_version 208885 (0.0040) [2024-06-18 20:49:50,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42154.6). Total num frames: 3422388224. Throughput: 0: 42155.9. Samples: 646467940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:49:50,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 20:49:54,175][19107] Updated weights for policy 0, policy_version 208895 (0.0028) [2024-06-18 20:49:55,500][18875] Fps is (10 sec: 42613.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3422568448. Throughput: 0: 42290.9. Samples: 646726920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:49:55,501][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 20:49:57,496][19107] Updated weights for policy 0, policy_version 208905 (0.0035) [2024-06-18 20:50:00,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3422797824. Throughput: 0: 42128.0. Samples: 646847200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:00,501][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 20:50:02,245][19107] Updated weights for policy 0, policy_version 208915 (0.0030) [2024-06-18 20:50:05,140][19107] Updated weights for policy 0, policy_version 208925 (0.0040) [2024-06-18 20:50:05,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3423027200. Throughput: 0: 42354.2. Samples: 647105180. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:05,504][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 20:50:09,826][19107] Updated weights for policy 0, policy_version 208935 (0.0034) [2024-06-18 20:50:10,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3423207424. Throughput: 0: 42353.8. Samples: 647361100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:10,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 20:50:12,945][19107] Updated weights for policy 0, policy_version 208945 (0.0046) [2024-06-18 20:50:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3423436800. Throughput: 0: 42039.9. Samples: 647478640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:15,502][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 20:50:17,554][19107] Updated weights for policy 0, policy_version 208955 (0.0044) [2024-06-18 20:50:20,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3423649792. Throughput: 0: 42220.1. Samples: 647737660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:20,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 20:50:20,822][19107] Updated weights for policy 0, policy_version 208965 (0.0040) [2024-06-18 20:50:25,243][19107] Updated weights for policy 0, policy_version 208975 (0.0033) [2024-06-18 20:50:25,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41932.4). Total num frames: 3423846400. Throughput: 0: 42237.5. Samples: 647991160. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:25,501][18875] Avg episode reward: [(0, '0.684')] [2024-06-18 20:50:28,640][19107] Updated weights for policy 0, policy_version 208985 (0.0039) [2024-06-18 20:50:30,504][18875] Fps is (10 sec: 40944.5, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 3424059392. Throughput: 0: 42275.9. Samples: 648117780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:30,505][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 20:50:32,981][19107] Updated weights for policy 0, policy_version 208995 (0.0031) [2024-06-18 20:50:35,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3424272384. Throughput: 0: 42210.2. Samples: 648367400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:35,501][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 20:50:35,526][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000209001_3424272384.pth... [2024-06-18 20:50:35,572][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000208384_3414163456.pth [2024-06-18 20:50:36,698][19107] Updated weights for policy 0, policy_version 209005 (0.0035) [2024-06-18 20:50:40,500][18875] Fps is (10 sec: 40975.1, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3424468992. Throughput: 0: 42017.9. Samples: 648617720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:40,501][18875] Avg episode reward: [(0, '0.313')] [2024-06-18 20:50:40,677][19107] Updated weights for policy 0, policy_version 209015 (0.0037) [2024-06-18 20:50:44,616][19107] Updated weights for policy 0, policy_version 209025 (0.0032) [2024-06-18 20:50:45,504][18875] Fps is (10 sec: 42583.5, 60 sec: 42598.3, 300 sec: 42153.6). Total num frames: 3424698368. Throughput: 0: 42251.7. Samples: 648748680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:45,505][18875] Avg episode reward: [(0, '0.337')] [2024-06-18 20:50:48,403][19107] Updated weights for policy 0, policy_version 209035 (0.0045) [2024-06-18 20:50:50,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3424878592. Throughput: 0: 42009.3. Samples: 648995600. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-18 20:50:50,501][18875] Avg episode reward: [(0, '0.764')] [2024-06-18 20:50:52,423][19107] Updated weights for policy 0, policy_version 209045 (0.0031) [2024-06-18 20:50:55,500][18875] Fps is (10 sec: 42614.1, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3425124352. Throughput: 0: 41739.1. Samples: 649239360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:50:55,501][18875] Avg episode reward: [(0, '0.392')] [2024-06-18 20:50:56,265][19107] Updated weights for policy 0, policy_version 209055 (0.0035) [2024-06-18 20:51:00,504][18875] Fps is (10 sec: 42583.5, 60 sec: 41776.7, 300 sec: 42098.6). Total num frames: 3425304576. Throughput: 0: 41994.1. Samples: 649368520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:00,505][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 20:51:00,739][19107] Updated weights for policy 0, policy_version 209065 (0.0047) [2024-06-18 20:51:03,939][19107] Updated weights for policy 0, policy_version 209075 (0.0024) [2024-06-18 20:51:05,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3425533952. Throughput: 0: 41752.3. Samples: 649616520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:05,501][18875] Avg episode reward: [(0, '0.597')] [2024-06-18 20:51:08,287][19107] Updated weights for policy 0, policy_version 209085 (0.0040) [2024-06-18 20:51:10,500][18875] Fps is (10 sec: 44253.0, 60 sec: 42325.4, 300 sec: 41987.6). Total num frames: 3425746944. Throughput: 0: 41870.2. Samples: 649875320. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:10,501][18875] Avg episode reward: [(0, '0.527')] [2024-06-18 20:51:11,584][19107] Updated weights for policy 0, policy_version 209095 (0.0036) [2024-06-18 20:51:15,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 42043.2). Total num frames: 3425927168. Throughput: 0: 41826.1. Samples: 649999800. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:15,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 20:51:15,991][19107] Updated weights for policy 0, policy_version 209105 (0.0034) [2024-06-18 20:51:19,317][19107] Updated weights for policy 0, policy_version 209115 (0.0039) [2024-06-18 20:51:20,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3426156544. Throughput: 0: 41838.4. Samples: 650250120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:20,501][18875] Avg episode reward: [(0, '0.498')] [2024-06-18 20:51:22,043][19087] Signal inference workers to stop experience collection... (9500 times) [2024-06-18 20:51:22,044][19087] Signal inference workers to resume experience collection... (9500 times) [2024-06-18 20:51:22,054][19107] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-06-18 20:51:22,054][19107] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-06-18 20:51:23,808][19107] Updated weights for policy 0, policy_version 209125 (0.0022) [2024-06-18 20:51:25,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3426385920. Throughput: 0: 41897.0. Samples: 650503080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:25,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 20:51:27,521][19107] Updated weights for policy 0, policy_version 209135 (0.0029) [2024-06-18 20:51:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41781.8, 300 sec: 42098.5). Total num frames: 3426566144. Throughput: 0: 41745.3. Samples: 650627060. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:30,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 20:51:31,679][19107] Updated weights for policy 0, policy_version 209145 (0.0030) [2024-06-18 20:51:35,294][19107] Updated weights for policy 0, policy_version 209155 (0.0036) [2024-06-18 20:51:35,502][18875] Fps is (10 sec: 40953.6, 60 sec: 42051.3, 300 sec: 41987.3). Total num frames: 3426795520. Throughput: 0: 41805.4. Samples: 650876900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:35,502][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 20:51:39,329][19107] Updated weights for policy 0, policy_version 209165 (0.0039) [2024-06-18 20:51:40,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3426992128. Throughput: 0: 41966.6. Samples: 651127860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:40,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 20:51:43,044][19107] Updated weights for policy 0, policy_version 209175 (0.0028) [2024-06-18 20:51:45,500][18875] Fps is (10 sec: 39327.3, 60 sec: 41508.6, 300 sec: 42098.5). Total num frames: 3427188736. Throughput: 0: 41847.7. Samples: 651251520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:45,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 20:51:47,086][19107] Updated weights for policy 0, policy_version 209185 (0.0032) [2024-06-18 20:51:50,504][18875] Fps is (10 sec: 44221.4, 60 sec: 42595.9, 300 sec: 42042.5). Total num frames: 3427434496. Throughput: 0: 41930.0. Samples: 651503520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:50,504][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 20:51:50,841][19107] Updated weights for policy 0, policy_version 209195 (0.0033) [2024-06-18 20:51:55,280][19107] Updated weights for policy 0, policy_version 209205 (0.0049) [2024-06-18 20:51:55,504][18875] Fps is (10 sec: 44221.0, 60 sec: 41776.7, 300 sec: 42098.0). 
Total num frames: 3427631104. Throughput: 0: 41863.7. Samples: 651759340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-18 20:51:55,505][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 20:51:58,616][19107] Updated weights for policy 0, policy_version 209215 (0.0034) [2024-06-18 20:52:00,500][18875] Fps is (10 sec: 37696.7, 60 sec: 41781.7, 300 sec: 42043.0). Total num frames: 3427811328. Throughput: 0: 41793.4. Samples: 651880500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:00,501][18875] Avg episode reward: [(0, '0.396')] [2024-06-18 20:52:02,899][19107] Updated weights for policy 0, policy_version 209225 (0.0039) [2024-06-18 20:52:05,500][18875] Fps is (10 sec: 40974.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3428040704. Throughput: 0: 41740.7. Samples: 652128460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:05,501][18875] Avg episode reward: [(0, '0.726')] [2024-06-18 20:52:06,430][19107] Updated weights for policy 0, policy_version 209235 (0.0025) [2024-06-18 20:52:10,504][18875] Fps is (10 sec: 42583.0, 60 sec: 41503.6, 300 sec: 42042.5). Total num frames: 3428237312. Throughput: 0: 41843.3. Samples: 652386180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:10,505][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 20:52:11,055][19107] Updated weights for policy 0, policy_version 209245 (0.0033) [2024-06-18 20:52:14,235][19107] Updated weights for policy 0, policy_version 209255 (0.0041) [2024-06-18 20:52:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3428450304. Throughput: 0: 41700.8. Samples: 652503600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:15,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 20:52:18,894][19107] Updated weights for policy 0, policy_version 209265 (0.0035) [2024-06-18 20:52:20,500][18875] Fps is (10 sec: 44252.5, 60 sec: 42052.2, 300 sec: 41931.9). 
Total num frames: 3428679680. Throughput: 0: 41872.9. Samples: 652761120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:20,501][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 20:52:22,143][19107] Updated weights for policy 0, policy_version 209275 (0.0038) [2024-06-18 20:52:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 42043.1). Total num frames: 3428859904. Throughput: 0: 41959.1. Samples: 653016020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:25,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 20:52:26,675][19107] Updated weights for policy 0, policy_version 209285 (0.0037) [2024-06-18 20:52:29,885][19107] Updated weights for policy 0, policy_version 209295 (0.0030) [2024-06-18 20:52:30,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 3429089280. Throughput: 0: 41897.3. Samples: 653136900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:30,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 20:52:34,445][19107] Updated weights for policy 0, policy_version 209305 (0.0044) [2024-06-18 20:52:35,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42053.3, 300 sec: 41987.4). Total num frames: 3429318656. Throughput: 0: 42099.7. Samples: 653397860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:35,501][18875] Avg episode reward: [(0, '0.678')] [2024-06-18 20:52:35,555][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000209310_3429335040.pth... [2024-06-18 20:52:35,618][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000208693_3419226112.pth [2024-06-18 20:52:37,743][19107] Updated weights for policy 0, policy_version 209315 (0.0032) [2024-06-18 20:52:40,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3429498880. Throughput: 0: 42038.6. Samples: 653650920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:40,500][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 20:52:42,204][19107] Updated weights for policy 0, policy_version 209325 (0.0040) [2024-06-18 20:52:43,314][19087] Signal inference workers to stop experience collection... (9550 times) [2024-06-18 20:52:43,314][19087] Signal inference workers to resume experience collection... (9550 times) [2024-06-18 20:52:43,359][19107] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-06-18 20:52:43,359][19107] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-06-18 20:52:45,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3429728256. Throughput: 0: 42023.5. Samples: 653771560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:45,501][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 20:52:45,527][19107] Updated weights for policy 0, policy_version 209335 (0.0035) [2024-06-18 20:52:49,999][19107] Updated weights for policy 0, policy_version 209345 (0.0032) [2024-06-18 20:52:50,500][18875] Fps is (10 sec: 42597.3, 60 sec: 41508.5, 300 sec: 41931.9). Total num frames: 3429924864. Throughput: 0: 42361.7. Samples: 654034740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:50,501][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 20:52:53,127][19107] Updated weights for policy 0, policy_version 209355 (0.0031) [2024-06-18 20:52:55,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41781.7, 300 sec: 42099.1). Total num frames: 3430137856. Throughput: 0: 41990.4. Samples: 654275600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:52:55,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 20:52:57,706][19107] Updated weights for policy 0, policy_version 209365 (0.0034) [2024-06-18 20:53:00,500][18875] Fps is (10 sec: 45876.3, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 3430383616. Throughput: 0: 42336.6. 
Samples: 654408740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:53:00,501][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 20:53:00,641][19107] Updated weights for policy 0, policy_version 209375 (0.0030) [2024-06-18 20:53:05,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3430547456. Throughput: 0: 42339.0. Samples: 654666380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-18 20:53:05,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 20:53:05,617][19107] Updated weights for policy 0, policy_version 209385 (0.0035) [2024-06-18 20:53:08,309][19107] Updated weights for policy 0, policy_version 209395 (0.0033) [2024-06-18 20:53:10,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42328.0, 300 sec: 42043.0). Total num frames: 3430776832. Throughput: 0: 42122.8. Samples: 654911540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:10,500][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 20:53:13,249][19107] Updated weights for policy 0, policy_version 209405 (0.0037) [2024-06-18 20:53:15,500][18875] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42154.6). Total num frames: 3431022592. Throughput: 0: 42335.7. Samples: 655042000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:15,501][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 20:53:16,370][19107] Updated weights for policy 0, policy_version 209415 (0.0028) [2024-06-18 20:53:20,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41932.5). Total num frames: 3431186432. Throughput: 0: 42106.4. Samples: 655292640. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:20,500][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 20:53:21,035][19107] Updated weights for policy 0, policy_version 209425 (0.0041) [2024-06-18 20:53:24,211][19107] Updated weights for policy 0, policy_version 209435 (0.0039) [2024-06-18 20:53:25,502][18875] Fps is (10 sec: 39315.5, 60 sec: 42597.4, 300 sec: 42042.8). Total num frames: 3431415808. Throughput: 0: 41972.3. Samples: 655539740. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:25,502][18875] Avg episode reward: [(0, '0.498')] [2024-06-18 20:53:29,023][19107] Updated weights for policy 0, policy_version 209445 (0.0029) [2024-06-18 20:53:30,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3431628800. Throughput: 0: 42255.9. Samples: 655673080. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:30,501][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 20:53:31,818][19107] Updated weights for policy 0, policy_version 209455 (0.0032) [2024-06-18 20:53:35,500][18875] Fps is (10 sec: 39327.6, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3431809024. Throughput: 0: 41929.5. Samples: 655921560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:35,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 20:53:36,676][19107] Updated weights for policy 0, policy_version 209465 (0.0038) [2024-06-18 20:53:39,575][19107] Updated weights for policy 0, policy_version 209475 (0.0035) [2024-06-18 20:53:40,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3432038400. Throughput: 0: 42037.0. Samples: 656167260. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:40,501][18875] Avg episode reward: [(0, '0.677')] [2024-06-18 20:53:44,389][19107] Updated weights for policy 0, policy_version 209485 (0.0038) [2024-06-18 20:53:45,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3432251392. Throughput: 0: 42030.0. Samples: 656300100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:45,501][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 20:53:47,695][19107] Updated weights for policy 0, policy_version 209495 (0.0032) [2024-06-18 20:53:50,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3432448000. Throughput: 0: 41850.3. Samples: 656549640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:50,501][18875] Avg episode reward: [(0, '0.772')] [2024-06-18 20:53:51,969][19107] Updated weights for policy 0, policy_version 209505 (0.0030) [2024-06-18 20:53:55,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3432677376. Throughput: 0: 41921.7. Samples: 656798020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:53:55,501][18875] Avg episode reward: [(0, '0.733')] [2024-06-18 20:53:55,524][19107] Updated weights for policy 0, policy_version 209515 (0.0044) [2024-06-18 20:53:59,659][19107] Updated weights for policy 0, policy_version 209525 (0.0038) [2024-06-18 20:54:00,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 3432873984. Throughput: 0: 41907.5. Samples: 656927840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:54:00,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 20:54:02,529][19087] Signal inference workers to stop experience collection... 
(9600 times) [2024-06-18 20:54:02,581][19107] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-06-18 20:54:02,642][19087] Signal inference workers to resume experience collection... (9600 times) [2024-06-18 20:54:02,642][19107] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-06-18 20:54:03,286][19107] Updated weights for policy 0, policy_version 209535 (0.0031) [2024-06-18 20:54:05,504][18875] Fps is (10 sec: 40945.3, 60 sec: 42322.9, 300 sec: 41987.0). Total num frames: 3433086976. Throughput: 0: 41861.0. Samples: 657176540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:54:05,505][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 20:54:07,297][19107] Updated weights for policy 0, policy_version 209545 (0.0031) [2024-06-18 20:54:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3433299968. Throughput: 0: 42011.2. Samples: 657430180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 26.0) [2024-06-18 20:54:10,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 20:54:10,996][19107] Updated weights for policy 0, policy_version 209555 (0.0038) [2024-06-18 20:54:14,969][19107] Updated weights for policy 0, policy_version 209565 (0.0033) [2024-06-18 20:54:15,500][18875] Fps is (10 sec: 42613.1, 60 sec: 41506.0, 300 sec: 42043.0). Total num frames: 3433512960. Throughput: 0: 41887.5. Samples: 657558020. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:54:15,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 20:54:18,965][19107] Updated weights for policy 0, policy_version 209575 (0.0030) [2024-06-18 20:54:20,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3433725952. Throughput: 0: 41791.5. Samples: 657802180. 
Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:54:20,501][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 20:54:23,266][19107] Updated weights for policy 0, policy_version 209585 (0.0040) [2024-06-18 20:54:25,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41780.2, 300 sec: 41987.5). Total num frames: 3433922560. Throughput: 0: 42027.0. Samples: 658058480. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:54:25,501][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 20:54:26,873][19107] Updated weights for policy 0, policy_version 209595 (0.0031) [2024-06-18 20:54:30,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3434135552. Throughput: 0: 41758.3. Samples: 658179220. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:54:30,508][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 20:54:30,877][19107] Updated weights for policy 0, policy_version 209605 (0.0026) [2024-06-18 20:54:34,799][19107] Updated weights for policy 0, policy_version 209615 (0.0041) [2024-06-18 20:54:35,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3434348544. Throughput: 0: 41852.8. Samples: 658433020. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:54:35,501][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 20:54:35,520][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000209616_3434348544.pth... [2024-06-18 20:54:35,582][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000209001_3424272384.pth [2024-06-18 20:54:38,624][19107] Updated weights for policy 0, policy_version 209625 (0.0039) [2024-06-18 20:54:40,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42043.5). Total num frames: 3434545152. Throughput: 0: 41997.7. Samples: 658687920. 
Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:54:40,501][18875] Avg episode reward: [(0, '0.706')] [2024-06-18 20:54:43,070][19107] Updated weights for policy 0, policy_version 209635 (0.0036) [2024-06-18 20:54:45,504][18875] Fps is (10 sec: 42583.4, 60 sec: 42049.8, 300 sec: 41987.0). Total num frames: 3434774528. Throughput: 0: 41911.4. Samples: 658814000. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:54:45,504][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 20:54:46,581][19107] Updated weights for policy 0, policy_version 209645 (0.0038) [2024-06-18 20:54:50,504][18875] Fps is (10 sec: 42583.5, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 3434971136. Throughput: 0: 42026.7. Samples: 659067740. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:54:50,505][18875] Avg episode reward: [(0, '0.793')] [2024-06-18 20:54:50,742][19107] Updated weights for policy 0, policy_version 209655 (0.0032) [2024-06-18 20:54:54,016][19107] Updated weights for policy 0, policy_version 209665 (0.0027) [2024-06-18 20:54:55,500][18875] Fps is (10 sec: 42613.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3435200512. Throughput: 0: 42040.0. Samples: 659321980. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:54:55,501][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 20:54:58,617][19107] Updated weights for policy 0, policy_version 209675 (0.0041) [2024-06-18 20:55:00,500][18875] Fps is (10 sec: 42613.6, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3435397120. Throughput: 0: 42089.5. Samples: 659452040. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:55:00,501][18875] Avg episode reward: [(0, '0.298')] [2024-06-18 20:55:01,505][19107] Updated weights for policy 0, policy_version 209685 (0.0034) [2024-06-18 20:55:05,503][18875] Fps is (10 sec: 37675.0, 60 sec: 41507.1, 300 sec: 41931.6). Total num frames: 3435577344. Throughput: 0: 42243.8. Samples: 659703240. 
Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:55:05,503][18875] Avg episode reward: [(0, '0.256')] [2024-06-18 20:55:06,312][19107] Updated weights for policy 0, policy_version 209695 (0.0033) [2024-06-18 20:55:09,455][19107] Updated weights for policy 0, policy_version 209705 (0.0038) [2024-06-18 20:55:10,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3435839488. Throughput: 0: 42058.2. Samples: 659951100. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:55:10,502][18875] Avg episode reward: [(0, '0.287')] [2024-06-18 20:55:14,213][19107] Updated weights for policy 0, policy_version 209715 (0.0036) [2024-06-18 20:55:15,500][18875] Fps is (10 sec: 45885.4, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3436036096. Throughput: 0: 42423.2. Samples: 660088260. Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:55:15,501][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 20:55:16,943][19107] Updated weights for policy 0, policy_version 209725 (0.0027) [2024-06-18 20:55:17,997][19087] Signal inference workers to stop experience collection... (9650 times) [2024-06-18 20:55:17,998][19087] Signal inference workers to resume experience collection... (9650 times) [2024-06-18 20:55:18,013][19107] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-06-18 20:55:18,014][19107] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-06-18 20:55:20,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3436232704. Throughput: 0: 42345.4. Samples: 660338560. 
Policy #0 lag: (min: 2.0, avg: 11.1, max: 23.0) [2024-06-18 20:55:20,501][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 20:55:21,811][19107] Updated weights for policy 0, policy_version 209735 (0.0053) [2024-06-18 20:55:24,482][19107] Updated weights for policy 0, policy_version 209745 (0.0041) [2024-06-18 20:55:25,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42099.1). Total num frames: 3436478464. Throughput: 0: 42252.4. Samples: 660589280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:55:25,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 20:55:29,595][19107] Updated weights for policy 0, policy_version 209755 (0.0033) [2024-06-18 20:55:30,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3436658688. Throughput: 0: 42496.4. Samples: 660726180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:55:30,500][18875] Avg episode reward: [(0, '0.253')] [2024-06-18 20:55:32,281][19107] Updated weights for policy 0, policy_version 209765 (0.0038) [2024-06-18 20:55:35,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3436871680. Throughput: 0: 42373.9. Samples: 660974420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:55:35,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 20:55:37,447][19107] Updated weights for policy 0, policy_version 209775 (0.0031) [2024-06-18 20:55:40,011][19107] Updated weights for policy 0, policy_version 209785 (0.0044) [2024-06-18 20:55:40,500][18875] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42099.1). Total num frames: 3437117440. Throughput: 0: 42223.6. Samples: 661222040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:55:40,501][18875] Avg episode reward: [(0, '0.341')] [2024-06-18 20:55:45,067][19107] Updated weights for policy 0, policy_version 209795 (0.0031) [2024-06-18 20:55:45,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41781.7, 300 sec: 42043.0). 
Total num frames: 3437281280. Throughput: 0: 42397.8. Samples: 661359940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:55:45,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 20:55:47,778][19107] Updated weights for policy 0, policy_version 209805 (0.0035) [2024-06-18 20:55:50,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42327.9, 300 sec: 41987.5). Total num frames: 3437510656. Throughput: 0: 42323.9. Samples: 661607720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:55:50,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 20:55:52,910][19107] Updated weights for policy 0, policy_version 209815 (0.0040) [2024-06-18 20:55:55,500][18875] Fps is (10 sec: 47513.6, 60 sec: 42598.5, 300 sec: 42210.1). Total num frames: 3437756416. Throughput: 0: 42429.9. Samples: 661860440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:55:55,500][18875] Avg episode reward: [(0, '0.376')] [2024-06-18 20:55:55,561][19107] Updated weights for policy 0, policy_version 209825 (0.0026) [2024-06-18 20:56:00,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3437936640. Throughput: 0: 42295.6. Samples: 661991560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:56:00,500][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 20:56:00,507][19107] Updated weights for policy 0, policy_version 209835 (0.0038) [2024-06-18 20:56:03,621][19107] Updated weights for policy 0, policy_version 209845 (0.0032) [2024-06-18 20:56:05,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42873.0, 300 sec: 42043.0). Total num frames: 3438149632. Throughput: 0: 42148.9. Samples: 662235260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:56:05,501][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 20:56:08,202][19107] Updated weights for policy 0, policy_version 209855 (0.0037) [2024-06-18 20:56:10,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42154.1). 
Total num frames: 3438362624. Throughput: 0: 42273.5. Samples: 662491580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:56:10,501][18875] Avg episode reward: [(0, '0.696')] [2024-06-18 20:56:11,466][19107] Updated weights for policy 0, policy_version 209865 (0.0036) [2024-06-18 20:56:15,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3438542848. Throughput: 0: 42022.1. Samples: 662617180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:56:15,501][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 20:56:16,102][19107] Updated weights for policy 0, policy_version 209875 (0.0034) [2024-06-18 20:56:19,242][19107] Updated weights for policy 0, policy_version 209885 (0.0038) [2024-06-18 20:56:20,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3438788608. Throughput: 0: 42055.2. Samples: 662866900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:56:20,504][18875] Avg episode reward: [(0, '0.570')] [2024-06-18 20:56:24,044][19107] Updated weights for policy 0, policy_version 209895 (0.0038) [2024-06-18 20:56:25,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 3438985216. Throughput: 0: 42260.0. Samples: 663123740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 20:56:25,501][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 20:56:26,953][19107] Updated weights for policy 0, policy_version 209905 (0.0036) [2024-06-18 20:56:30,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42043.2). Total num frames: 3439198208. Throughput: 0: 41939.6. Samples: 663247220. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:56:30,500][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 20:56:31,540][19107] Updated weights for policy 0, policy_version 209915 (0.0028) [2024-06-18 20:56:32,708][19087] Signal inference workers to stop experience collection... 
(9700 times) [2024-06-18 20:56:32,748][19107] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-06-18 20:56:32,763][19087] Signal inference workers to resume experience collection... (9700 times) [2024-06-18 20:56:32,764][19107] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-06-18 20:56:34,901][19107] Updated weights for policy 0, policy_version 209925 (0.0037) [2024-06-18 20:56:35,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3439411200. Throughput: 0: 42170.6. Samples: 663505400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:56:35,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 20:56:35,682][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000209927_3439443968.pth... [2024-06-18 20:56:35,737][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000209310_3429335040.pth [2024-06-18 20:56:39,314][19107] Updated weights for policy 0, policy_version 209935 (0.0022) [2024-06-18 20:56:40,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3439624192. Throughput: 0: 42228.8. Samples: 663760740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:56:40,501][18875] Avg episode reward: [(0, '0.405')] [2024-06-18 20:56:43,064][19107] Updated weights for policy 0, policy_version 209945 (0.0032) [2024-06-18 20:56:45,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42043.5). Total num frames: 3439837184. Throughput: 0: 42064.3. Samples: 663884460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:56:45,501][18875] Avg episode reward: [(0, '0.746')] [2024-06-18 20:56:46,942][19107] Updated weights for policy 0, policy_version 209955 (0.0037) [2024-06-18 20:56:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 3440033792. Throughput: 0: 42195.6. Samples: 664134060. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:56:50,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 20:56:50,893][19107] Updated weights for policy 0, policy_version 209965 (0.0029) [2024-06-18 20:56:54,464][19107] Updated weights for policy 0, policy_version 209975 (0.0035) [2024-06-18 20:56:55,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 3440246784. Throughput: 0: 42181.8. Samples: 664389760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:56:55,500][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 20:56:58,672][19107] Updated weights for policy 0, policy_version 209985 (0.0027) [2024-06-18 20:57:00,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3440476160. Throughput: 0: 42354.3. Samples: 664523120. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:57:00,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 20:57:02,020][19107] Updated weights for policy 0, policy_version 209995 (0.0033) [2024-06-18 20:57:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42099.1). Total num frames: 3440656384. Throughput: 0: 42294.3. Samples: 664770140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:57:05,500][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 20:57:06,562][19107] Updated weights for policy 0, policy_version 210005 (0.0028) [2024-06-18 20:57:09,730][19107] Updated weights for policy 0, policy_version 210015 (0.0028) [2024-06-18 20:57:10,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3440885760. Throughput: 0: 42015.1. Samples: 665014420. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:57:10,501][18875] Avg episode reward: [(0, '0.855')] [2024-06-18 20:57:14,519][19107] Updated weights for policy 0, policy_version 210025 (0.0029) [2024-06-18 20:57:15,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42098.6). 
Total num frames: 3441098752. Throughput: 0: 42183.5. Samples: 665145480. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:57:15,501][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 20:57:18,066][19107] Updated weights for policy 0, policy_version 210035 (0.0037) [2024-06-18 20:57:20,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42209.7). Total num frames: 3441311744. Throughput: 0: 42096.6. Samples: 665399740. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:57:20,500][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 20:57:22,117][19107] Updated weights for policy 0, policy_version 210045 (0.0050) [2024-06-18 20:57:25,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3441524736. Throughput: 0: 41914.7. Samples: 665646900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:57:25,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 20:57:25,637][19107] Updated weights for policy 0, policy_version 210055 (0.0037) [2024-06-18 20:57:30,178][19107] Updated weights for policy 0, policy_version 210065 (0.0027) [2024-06-18 20:57:30,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3441704960. Throughput: 0: 41935.3. Samples: 665771540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:57:30,500][18875] Avg episode reward: [(0, '0.870')] [2024-06-18 20:57:33,547][19107] Updated weights for policy 0, policy_version 210075 (0.0034) [2024-06-18 20:57:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3441934336. Throughput: 0: 42037.8. Samples: 666025760. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-18 20:57:35,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 20:57:37,838][19107] Updated weights for policy 0, policy_version 210085 (0.0055) [2024-06-18 20:57:40,500][18875] Fps is (10 sec: 45874.4, 60 sec: 42325.3, 300 sec: 42154.1). 
Total num frames: 3442163712. Throughput: 0: 41936.8. Samples: 666276920. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:57:40,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 20:57:41,646][19107] Updated weights for policy 0, policy_version 210095 (0.0027) [2024-06-18 20:57:45,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3442343936. Throughput: 0: 41840.3. Samples: 666405940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:57:45,509][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 20:57:45,519][19107] Updated weights for policy 0, policy_version 210105 (0.0023) [2024-06-18 20:57:49,192][19107] Updated weights for policy 0, policy_version 210115 (0.0046) [2024-06-18 20:57:50,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3442573312. Throughput: 0: 41966.9. Samples: 666658660. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:57:50,501][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 20:57:53,155][19107] Updated weights for policy 0, policy_version 210125 (0.0029) [2024-06-18 20:57:53,746][19087] Signal inference workers to stop experience collection... (9750 times) [2024-06-18 20:57:53,747][19087] Signal inference workers to resume experience collection... (9750 times) [2024-06-18 20:57:53,766][19107] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-06-18 20:57:53,767][19107] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-06-18 20:57:55,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42598.2, 300 sec: 42098.5). Total num frames: 3442802688. Throughput: 0: 42082.1. Samples: 666908120. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:57:55,501][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 20:57:56,776][19107] Updated weights for policy 0, policy_version 210135 (0.0038) [2024-06-18 20:58:00,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 3442982912. Throughput: 0: 42167.5. Samples: 667043020. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:58:00,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 20:58:01,233][19107] Updated weights for policy 0, policy_version 210145 (0.0041) [2024-06-18 20:58:04,757][19107] Updated weights for policy 0, policy_version 210155 (0.0037) [2024-06-18 20:58:05,500][18875] Fps is (10 sec: 39322.4, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3443195904. Throughput: 0: 41972.8. Samples: 667288520. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:58:05,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 20:58:08,902][19107] Updated weights for policy 0, policy_version 210165 (0.0034) [2024-06-18 20:58:10,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3443408896. Throughput: 0: 42066.3. Samples: 667539880. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:58:10,501][18875] Avg episode reward: [(0, '0.425')] [2024-06-18 20:58:12,449][19107] Updated weights for policy 0, policy_version 210175 (0.0031) [2024-06-18 20:58:15,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3443605504. Throughput: 0: 42187.4. Samples: 667669980. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:58:15,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 20:58:16,405][19107] Updated weights for policy 0, policy_version 210185 (0.0029) [2024-06-18 20:58:20,074][19107] Updated weights for policy 0, policy_version 210195 (0.0035) [2024-06-18 20:58:20,500][18875] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42098.7). 
Total num frames: 3443834880. Throughput: 0: 42141.6. Samples: 667922140. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:58:20,501][18875] Avg episode reward: [(0, '0.592')] [2024-06-18 20:58:24,067][19107] Updated weights for policy 0, policy_version 210205 (0.0030) [2024-06-18 20:58:25,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3444047872. Throughput: 0: 42172.9. Samples: 668174700. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:58:25,501][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 20:58:28,230][19107] Updated weights for policy 0, policy_version 210215 (0.0032) [2024-06-18 20:58:30,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3444228096. Throughput: 0: 42185.4. Samples: 668304280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:58:30,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 20:58:31,648][19107] Updated weights for policy 0, policy_version 210225 (0.0032) [2024-06-18 20:58:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3444457472. Throughput: 0: 42115.7. Samples: 668553860. Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:58:35,501][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 20:58:35,522][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000210233_3444457472.pth... [2024-06-18 20:58:35,592][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000209616_3434348544.pth [2024-06-18 20:58:35,858][19107] Updated weights for policy 0, policy_version 210235 (0.0022) [2024-06-18 20:58:39,360][19107] Updated weights for policy 0, policy_version 210245 (0.0045) [2024-06-18 20:58:40,500][18875] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3444670464. Throughput: 0: 42303.6. Samples: 668811780. 
Policy #0 lag: (min: 1.0, avg: 9.3, max: 21.0) [2024-06-18 20:58:40,501][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 20:58:43,557][19107] Updated weights for policy 0, policy_version 210255 (0.0049) [2024-06-18 20:58:45,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3444883456. Throughput: 0: 42166.3. Samples: 668940500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:58:45,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 20:58:47,174][19107] Updated weights for policy 0, policy_version 210265 (0.0036) [2024-06-18 20:58:50,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3445096448. Throughput: 0: 42207.1. Samples: 669187840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:58:50,501][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 20:58:51,423][19107] Updated weights for policy 0, policy_version 210275 (0.0030) [2024-06-18 20:58:54,975][19107] Updated weights for policy 0, policy_version 210285 (0.0028) [2024-06-18 20:58:55,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3445309440. Throughput: 0: 42120.7. Samples: 669435320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:58:55,501][18875] Avg episode reward: [(0, '0.424')] [2024-06-18 20:58:59,223][19107] Updated weights for policy 0, policy_version 210295 (0.0036) [2024-06-18 20:59:00,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42099.0). Total num frames: 3445506048. Throughput: 0: 42139.5. Samples: 669566260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:00,501][18875] Avg episode reward: [(0, '0.692')] [2024-06-18 20:59:03,322][19107] Updated weights for policy 0, policy_version 210305 (0.0043) [2024-06-18 20:59:04,691][19087] Signal inference workers to stop experience collection... 
(9800 times) [2024-06-18 20:59:04,727][19107] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-06-18 20:59:04,753][19087] Signal inference workers to resume experience collection... (9800 times) [2024-06-18 20:59:04,756][19107] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-06-18 20:59:05,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3445735424. Throughput: 0: 42092.1. Samples: 669816280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:05,501][18875] Avg episode reward: [(0, '0.808')] [2024-06-18 20:59:06,823][19107] Updated weights for policy 0, policy_version 210315 (0.0034) [2024-06-18 20:59:10,500][18875] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3445932032. Throughput: 0: 42120.5. Samples: 670070120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:10,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 20:59:11,245][19107] Updated weights for policy 0, policy_version 210325 (0.0044) [2024-06-18 20:59:14,633][19107] Updated weights for policy 0, policy_version 210335 (0.0052) [2024-06-18 20:59:15,504][18875] Fps is (10 sec: 39307.0, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 3446128640. Throughput: 0: 41902.4. Samples: 670190040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:15,505][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 20:59:18,974][19107] Updated weights for policy 0, policy_version 210345 (0.0045) [2024-06-18 20:59:20,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3446374400. Throughput: 0: 42000.3. Samples: 670443880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:20,502][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 20:59:22,483][19107] Updated weights for policy 0, policy_version 210355 (0.0031) [2024-06-18 20:59:25,500][18875] Fps is (10 sec: 44253.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3446571008. Throughput: 0: 41982.8. Samples: 670701000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:25,501][18875] Avg episode reward: [(0, '0.321')] [2024-06-18 20:59:26,588][19107] Updated weights for policy 0, policy_version 210365 (0.0029) [2024-06-18 20:59:30,268][19107] Updated weights for policy 0, policy_version 210375 (0.0027) [2024-06-18 20:59:30,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3446784000. Throughput: 0: 41828.5. Samples: 670822780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:30,501][18875] Avg episode reward: [(0, '0.416')] [2024-06-18 20:59:34,290][19107] Updated weights for policy 0, policy_version 210385 (0.0030) [2024-06-18 20:59:35,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 3446996992. Throughput: 0: 42161.2. Samples: 671085100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:35,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 20:59:38,027][19107] Updated weights for policy 0, policy_version 210395 (0.0038) [2024-06-18 20:59:40,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42099.1). Total num frames: 3447193600. Throughput: 0: 42247.3. Samples: 671336440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:40,501][18875] Avg episode reward: [(0, '0.679')] [2024-06-18 20:59:42,137][19107] Updated weights for policy 0, policy_version 210405 (0.0032) [2024-06-18 20:59:45,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42154.6). Total num frames: 3447406592. Throughput: 0: 42107.6. Samples: 671461100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:45,501][18875] Avg episode reward: [(0, '0.810')] [2024-06-18 20:59:45,670][19107] Updated weights for policy 0, policy_version 210415 (0.0036) [2024-06-18 20:59:49,770][19107] Updated weights for policy 0, policy_version 210425 (0.0042) [2024-06-18 20:59:50,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3447619584. Throughput: 0: 42234.7. Samples: 671716840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 20:59:50,501][18875] Avg episode reward: [(0, '0.763')] [2024-06-18 20:59:53,391][19107] Updated weights for policy 0, policy_version 210435 (0.0036) [2024-06-18 20:59:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3447832576. Throughput: 0: 42183.9. Samples: 671968400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 20:59:55,501][18875] Avg episode reward: [(0, '0.763')] [2024-06-18 20:59:57,547][19107] Updated weights for policy 0, policy_version 210445 (0.0032) [2024-06-18 21:00:00,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42265.5). Total num frames: 3448045568. Throughput: 0: 42242.7. Samples: 672090800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:00,501][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 21:00:00,974][19107] Updated weights for policy 0, policy_version 210455 (0.0031) [2024-06-18 21:00:05,309][19107] Updated weights for policy 0, policy_version 210465 (0.0036) [2024-06-18 21:00:05,501][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 3448258560. Throughput: 0: 42234.6. Samples: 672344440. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:05,501][18875] Avg episode reward: [(0, '0.298')] [2024-06-18 21:00:08,941][19107] Updated weights for policy 0, policy_version 210475 (0.0050) [2024-06-18 21:00:10,501][18875] Fps is (10 sec: 39318.0, 60 sec: 41778.6, 300 sec: 42042.9). Total num frames: 3448438784. Throughput: 0: 42116.1. Samples: 672596260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:10,502][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 21:00:13,343][19107] Updated weights for policy 0, policy_version 210485 (0.0032) [2024-06-18 21:00:15,500][18875] Fps is (10 sec: 42599.4, 60 sec: 42601.0, 300 sec: 42209.6). Total num frames: 3448684544. Throughput: 0: 42169.8. Samples: 672720420. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:15,501][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 21:00:16,992][19107] Updated weights for policy 0, policy_version 210495 (0.0027) [2024-06-18 21:00:20,500][18875] Fps is (10 sec: 42602.5, 60 sec: 41506.3, 300 sec: 41987.5). Total num frames: 3448864768. Throughput: 0: 42049.1. Samples: 672977300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:20,500][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 21:00:20,979][19107] Updated weights for policy 0, policy_version 210505 (0.0032) [2024-06-18 21:00:24,757][19107] Updated weights for policy 0, policy_version 210515 (0.0030) [2024-06-18 21:00:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3449094144. Throughput: 0: 42023.1. Samples: 673227480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:25,500][18875] Avg episode reward: [(0, '0.354')] [2024-06-18 21:00:28,827][19107] Updated weights for policy 0, policy_version 210525 (0.0034) [2024-06-18 21:00:29,922][19087] Signal inference workers to stop experience collection... 
(9850 times) [2024-06-18 21:00:29,945][19107] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-06-18 21:00:30,034][19087] Signal inference workers to resume experience collection... (9850 times) [2024-06-18 21:00:30,034][19107] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-06-18 21:00:30,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 3449323520. Throughput: 0: 42140.2. Samples: 673357400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:30,501][18875] Avg episode reward: [(0, '0.337')] [2024-06-18 21:00:32,475][19107] Updated weights for policy 0, policy_version 210535 (0.0028) [2024-06-18 21:00:35,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3449520128. Throughput: 0: 42083.5. Samples: 673610600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:35,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 21:00:35,514][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000210542_3449520128.pth... [2024-06-18 21:00:35,580][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000209927_3439443968.pth [2024-06-18 21:00:36,658][19107] Updated weights for policy 0, policy_version 210545 (0.0030) [2024-06-18 21:00:40,043][19107] Updated weights for policy 0, policy_version 210555 (0.0029) [2024-06-18 21:00:40,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3449733120. Throughput: 0: 41956.6. Samples: 673856440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:40,501][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 21:00:44,379][19107] Updated weights for policy 0, policy_version 210565 (0.0034) [2024-06-18 21:00:45,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3449929728. Throughput: 0: 42160.7. Samples: 673988040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:45,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 21:00:47,923][19107] Updated weights for policy 0, policy_version 210575 (0.0033) [2024-06-18 21:00:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3450142720. Throughput: 0: 42041.2. Samples: 674236280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:50,500][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 21:00:52,609][19107] Updated weights for policy 0, policy_version 210585 (0.0040) [2024-06-18 21:00:55,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3450372096. Throughput: 0: 41974.9. Samples: 674485100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:00:55,501][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 21:00:55,606][19107] Updated weights for policy 0, policy_version 210595 (0.0040) [2024-06-18 21:01:00,351][19107] Updated weights for policy 0, policy_version 210605 (0.0034) [2024-06-18 21:01:00,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3450552320. Throughput: 0: 42087.5. Samples: 674614360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 23.0) [2024-06-18 21:01:00,501][18875] Avg episode reward: [(0, '0.633')] [2024-06-18 21:01:03,464][19107] Updated weights for policy 0, policy_version 210615 (0.0030) [2024-06-18 21:01:05,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3450765312. Throughput: 0: 41971.0. Samples: 674866000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:05,501][18875] Avg episode reward: [(0, '0.529')] [2024-06-18 21:01:08,010][19107] Updated weights for policy 0, policy_version 210625 (0.0035) [2024-06-18 21:01:10,501][18875] Fps is (10 sec: 45871.5, 60 sec: 42871.5, 300 sec: 42265.0). Total num frames: 3451011072. Throughput: 0: 42058.3. Samples: 675120140. 
Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:10,502][18875] Avg episode reward: [(0, '0.263')] [2024-06-18 21:01:11,574][19107] Updated weights for policy 0, policy_version 210635 (0.0027) [2024-06-18 21:01:15,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3451191296. Throughput: 0: 42137.7. Samples: 675253600. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:15,501][18875] Avg episode reward: [(0, '0.780')] [2024-06-18 21:01:15,682][19107] Updated weights for policy 0, policy_version 210645 (0.0026) [2024-06-18 21:01:19,271][19107] Updated weights for policy 0, policy_version 210655 (0.0032) [2024-06-18 21:01:20,500][18875] Fps is (10 sec: 40963.2, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3451420672. Throughput: 0: 42098.2. Samples: 675505020. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:20,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 21:01:23,306][19107] Updated weights for policy 0, policy_version 210665 (0.0054) [2024-06-18 21:01:25,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 3451617280. Throughput: 0: 42220.3. Samples: 675756360. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:25,501][18875] Avg episode reward: [(0, '0.415')] [2024-06-18 21:01:26,813][19107] Updated weights for policy 0, policy_version 210675 (0.0045) [2024-06-18 21:01:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 3451830272. Throughput: 0: 42106.8. Samples: 675882840. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:30,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 21:01:30,908][19107] Updated weights for policy 0, policy_version 210685 (0.0036) [2024-06-18 21:01:34,348][19107] Updated weights for policy 0, policy_version 210695 (0.0027) [2024-06-18 21:01:35,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42098.6). 
Total num frames: 3452043264. Throughput: 0: 42264.8. Samples: 676138200. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:35,501][18875] Avg episode reward: [(0, '0.413')] [2024-06-18 21:01:38,779][19107] Updated weights for policy 0, policy_version 210705 (0.0046) [2024-06-18 21:01:40,504][18875] Fps is (10 sec: 44220.9, 60 sec: 42322.7, 300 sec: 42153.6). Total num frames: 3452272640. Throughput: 0: 42355.3. Samples: 676391240. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:40,505][18875] Avg episode reward: [(0, '0.346')] [2024-06-18 21:01:42,030][19107] Updated weights for policy 0, policy_version 210715 (0.0037) [2024-06-18 21:01:45,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3452452864. Throughput: 0: 42365.8. Samples: 676520820. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:45,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 21:01:46,406][19107] Updated weights for policy 0, policy_version 210725 (0.0030) [2024-06-18 21:01:49,726][19107] Updated weights for policy 0, policy_version 210735 (0.0049) [2024-06-18 21:01:50,500][18875] Fps is (10 sec: 40974.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3452682240. Throughput: 0: 42406.7. Samples: 676774300. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:50,501][18875] Avg episode reward: [(0, '0.459')] [2024-06-18 21:01:54,166][19107] Updated weights for policy 0, policy_version 210745 (0.0027) [2024-06-18 21:01:55,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 42098.5). Total num frames: 3452895232. Throughput: 0: 42367.5. Samples: 677026640. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:01:55,501][18875] Avg episode reward: [(0, '0.735')] [2024-06-18 21:01:57,838][19087] Signal inference workers to stop experience collection... (9900 times) [2024-06-18 21:01:57,839][19087] Signal inference workers to resume experience collection... 
(9900 times) [2024-06-18 21:01:57,879][19107] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-06-18 21:01:57,880][19107] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-06-18 21:01:57,982][19107] Updated weights for policy 0, policy_version 210755 (0.0044) [2024-06-18 21:02:00,504][18875] Fps is (10 sec: 40945.0, 60 sec: 42322.8, 300 sec: 42153.6). Total num frames: 3453091840. Throughput: 0: 42014.9. Samples: 677144420. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:02:00,505][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 21:02:02,389][19107] Updated weights for policy 0, policy_version 210765 (0.0035) [2024-06-18 21:02:05,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3453321216. Throughput: 0: 42133.0. Samples: 677401000. Policy #0 lag: (min: 1.0, avg: 9.6, max: 22.0) [2024-06-18 21:02:05,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 21:02:05,585][19107] Updated weights for policy 0, policy_version 210775 (0.0039) [2024-06-18 21:02:10,247][19107] Updated weights for policy 0, policy_version 210785 (0.0034) [2024-06-18 21:02:10,500][18875] Fps is (10 sec: 42613.9, 60 sec: 41779.8, 300 sec: 42098.6). Total num frames: 3453517824. Throughput: 0: 42421.0. Samples: 677665300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:10,501][18875] Avg episode reward: [(0, '0.475')] [2024-06-18 21:02:13,311][19107] Updated weights for policy 0, policy_version 210795 (0.0030) [2024-06-18 21:02:15,500][18875] Fps is (10 sec: 39320.8, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3453714432. Throughput: 0: 42207.0. Samples: 677782160. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:15,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 21:02:17,900][19107] Updated weights for policy 0, policy_version 210805 (0.0034) [2024-06-18 21:02:20,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3453960192. Throughput: 0: 42077.8. Samples: 678031700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:20,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 21:02:21,093][19107] Updated weights for policy 0, policy_version 210815 (0.0030) [2024-06-18 21:02:25,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3454140416. Throughput: 0: 42361.2. Samples: 678297340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:25,509][18875] Avg episode reward: [(0, '0.516')] [2024-06-18 21:02:25,630][19107] Updated weights for policy 0, policy_version 210825 (0.0047) [2024-06-18 21:02:29,032][19107] Updated weights for policy 0, policy_version 210835 (0.0029) [2024-06-18 21:02:30,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3454369792. Throughput: 0: 42058.2. Samples: 678413440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:30,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 21:02:33,686][19107] Updated weights for policy 0, policy_version 210845 (0.0046) [2024-06-18 21:02:35,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3454582784. Throughput: 0: 42044.1. Samples: 678666280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:35,500][18875] Avg episode reward: [(0, '0.211')] [2024-06-18 21:02:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000210851_3454582784.pth... 
[2024-06-18 21:02:35,594][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000210233_3444457472.pth [2024-06-18 21:02:36,668][19107] Updated weights for policy 0, policy_version 210855 (0.0030) [2024-06-18 21:02:40,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41781.7, 300 sec: 42154.1). Total num frames: 3454779392. Throughput: 0: 42148.8. Samples: 678923340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:40,501][18875] Avg episode reward: [(0, '0.199')] [2024-06-18 21:02:41,287][19107] Updated weights for policy 0, policy_version 210865 (0.0040) [2024-06-18 21:02:44,302][19107] Updated weights for policy 0, policy_version 210875 (0.0037) [2024-06-18 21:02:45,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3454992384. Throughput: 0: 42230.9. Samples: 679044660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:45,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 21:02:49,215][19107] Updated weights for policy 0, policy_version 210885 (0.0034) [2024-06-18 21:02:50,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3455205376. Throughput: 0: 42232.4. Samples: 679301460. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:50,501][18875] Avg episode reward: [(0, '0.415')] [2024-06-18 21:02:52,149][19107] Updated weights for policy 0, policy_version 210895 (0.0040) [2024-06-18 21:02:55,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3455401984. Throughput: 0: 41831.5. Samples: 679547720. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:02:55,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 21:02:56,959][19107] Updated weights for policy 0, policy_version 210905 (0.0034) [2024-06-18 21:02:59,799][19107] Updated weights for policy 0, policy_version 210915 (0.0030) [2024-06-18 21:03:00,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42327.9, 300 sec: 42154.1). Total num frames: 3455631360. Throughput: 0: 42054.0. Samples: 679674580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:03:00,501][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 21:03:04,707][19107] Updated weights for policy 0, policy_version 210925 (0.0033) [2024-06-18 21:03:05,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 3455844352. Throughput: 0: 42078.5. Samples: 679925240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:03:05,501][18875] Avg episode reward: [(0, '0.607')] [2024-06-18 21:03:06,772][19087] Signal inference workers to stop experience collection... (9950 times) [2024-06-18 21:03:06,832][19107] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-06-18 21:03:06,836][19087] Signal inference workers to resume experience collection... (9950 times) [2024-06-18 21:03:06,844][19107] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-06-18 21:03:07,986][19107] Updated weights for policy 0, policy_version 210935 (0.0047) [2024-06-18 21:03:10,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3456024576. Throughput: 0: 41650.6. Samples: 680171620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:03:10,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 21:03:12,444][19107] Updated weights for policy 0, policy_version 210945 (0.0051) [2024-06-18 21:03:15,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3456270336. Throughput: 0: 41920.5. 
Samples: 680299860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-18 21:03:15,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 21:03:15,760][19107] Updated weights for policy 0, policy_version 210955 (0.0038) [2024-06-18 21:03:20,380][19107] Updated weights for policy 0, policy_version 210965 (0.0035) [2024-06-18 21:03:20,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3456450560. Throughput: 0: 41954.2. Samples: 680554220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:03:20,500][18875] Avg episode reward: [(0, '0.713')] [2024-06-18 21:03:23,520][19107] Updated weights for policy 0, policy_version 210975 (0.0030) [2024-06-18 21:03:25,504][18875] Fps is (10 sec: 39307.2, 60 sec: 42049.7, 300 sec: 42153.6). Total num frames: 3456663552. Throughput: 0: 41682.0. Samples: 680799180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:03:25,505][18875] Avg episode reward: [(0, '0.536')] [2024-06-18 21:03:28,346][19107] Updated weights for policy 0, policy_version 210985 (0.0048) [2024-06-18 21:03:30,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3456876544. Throughput: 0: 41640.9. Samples: 680918500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:03:30,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 21:03:31,388][19107] Updated weights for policy 0, policy_version 210995 (0.0038) [2024-06-18 21:03:35,504][18875] Fps is (10 sec: 40959.9, 60 sec: 41503.5, 300 sec: 42042.5). Total num frames: 3457073152. Throughput: 0: 41472.6. Samples: 681167880. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:03:35,505][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 21:03:35,992][19107] Updated weights for policy 0, policy_version 211005 (0.0047) [2024-06-18 21:03:39,222][19107] Updated weights for policy 0, policy_version 211015 (0.0037) [2024-06-18 21:03:40,500][18875] Fps is (10 sec: 39322.2, 60 sec: 41506.3, 300 sec: 41987.5). Total num frames: 3457269760. Throughput: 0: 41457.5. Samples: 681413300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:03:40,501][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 21:03:44,124][19107] Updated weights for policy 0, policy_version 211025 (0.0056) [2024-06-18 21:03:45,500][18875] Fps is (10 sec: 40975.0, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3457482752. Throughput: 0: 41534.6. Samples: 681543640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:03:45,501][18875] Avg episode reward: [(0, '0.300')] [2024-06-18 21:03:47,523][19107] Updated weights for policy 0, policy_version 211035 (0.0038) [2024-06-18 21:03:50,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3457695744. Throughput: 0: 41511.3. Samples: 681793240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:03:50,500][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 21:03:51,757][19107] Updated weights for policy 0, policy_version 211045 (0.0033) [2024-06-18 21:03:55,358][19107] Updated weights for policy 0, policy_version 211055 (0.0029) [2024-06-18 21:03:55,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3457925120. Throughput: 0: 41506.8. Samples: 682039420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:03:55,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 21:03:59,548][19107] Updated weights for policy 0, policy_version 211065 (0.0040) [2024-06-18 21:04:00,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3458121728. Throughput: 0: 41581.8. Samples: 682171040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:04:00,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 21:04:03,119][19107] Updated weights for policy 0, policy_version 211075 (0.0036) [2024-06-18 21:04:05,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41233.2, 300 sec: 41987.5). Total num frames: 3458318336. Throughput: 0: 41544.9. Samples: 682423740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:04:05,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 21:04:07,199][19107] Updated weights for policy 0, policy_version 211085 (0.0027) [2024-06-18 21:04:10,504][18875] Fps is (10 sec: 42582.8, 60 sec: 42049.8, 300 sec: 42098.6). Total num frames: 3458547712. Throughput: 0: 41529.8. Samples: 682668020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:04:10,504][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 21:04:10,986][19107] Updated weights for policy 0, policy_version 211095 (0.0040) [2024-06-18 21:04:14,966][19107] Updated weights for policy 0, policy_version 211105 (0.0034) [2024-06-18 21:04:15,500][18875] Fps is (10 sec: 44236.2, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3458760704. Throughput: 0: 41781.8. Samples: 682798680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:04:15,501][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 21:04:19,367][19107] Updated weights for policy 0, policy_version 211115 (0.0039) [2024-06-18 21:04:20,500][18875] Fps is (10 sec: 40974.8, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3458957312. Throughput: 0: 41882.1. Samples: 683052420. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-18 21:04:20,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 21:04:22,594][19107] Updated weights for policy 0, policy_version 211125 (0.0029) [2024-06-18 21:04:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41781.6, 300 sec: 41987.4). Total num frames: 3459170304. Throughput: 0: 42075.3. Samples: 683306700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:04:25,501][18875] Avg episode reward: [(0, '0.374')] [2024-06-18 21:04:26,963][19107] Updated weights for policy 0, policy_version 211135 (0.0033) [2024-06-18 21:04:30,434][19107] Updated weights for policy 0, policy_version 211145 (0.0028) [2024-06-18 21:04:30,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3459399680. Throughput: 0: 41926.2. Samples: 683430320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:04:30,501][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 21:04:34,714][19107] Updated weights for policy 0, policy_version 211155 (0.0036) [2024-06-18 21:04:35,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42054.7, 300 sec: 42043.0). Total num frames: 3459596288. Throughput: 0: 42031.8. Samples: 683684680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:04:35,501][18875] Avg episode reward: [(0, '0.692')] [2024-06-18 21:04:35,512][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000211157_3459596288.pth... [2024-06-18 21:04:35,564][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000210542_3449520128.pth [2024-06-18 21:04:38,098][19107] Updated weights for policy 0, policy_version 211165 (0.0035) [2024-06-18 21:04:40,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3459792896. Throughput: 0: 42087.5. Samples: 683933360. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:04:40,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 21:04:42,452][19107] Updated weights for policy 0, policy_version 211175 (0.0038) [2024-06-18 21:04:43,592][19087] Signal inference workers to stop experience collection... (10000 times) [2024-06-18 21:04:43,592][19087] Signal inference workers to resume experience collection... (10000 times) [2024-06-18 21:04:43,618][19107] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-06-18 21:04:43,618][19107] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-06-18 21:04:45,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3460022272. Throughput: 0: 41989.6. Samples: 684060580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:04:45,501][18875] Avg episode reward: [(0, '0.643')] [2024-06-18 21:04:46,028][19107] Updated weights for policy 0, policy_version 211185 (0.0037) [2024-06-18 21:04:50,353][19107] Updated weights for policy 0, policy_version 211195 (0.0034) [2024-06-18 21:04:50,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3460235264. Throughput: 0: 42008.3. Samples: 684314120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:04:50,501][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 21:04:53,904][19107] Updated weights for policy 0, policy_version 211205 (0.0033) [2024-06-18 21:04:55,503][18875] Fps is (10 sec: 40947.4, 60 sec: 41776.9, 300 sec: 41987.0). Total num frames: 3460431872. Throughput: 0: 42092.4. Samples: 684562160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:04:55,504][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 21:04:57,984][19107] Updated weights for policy 0, policy_version 211215 (0.0043) [2024-06-18 21:05:00,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3460644864. Throughput: 0: 41969.0. 
Samples: 684687280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:05:00,501][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 21:05:01,780][19107] Updated weights for policy 0, policy_version 211225 (0.0031) [2024-06-18 21:05:05,500][18875] Fps is (10 sec: 40973.5, 60 sec: 42052.3, 300 sec: 42043.1). Total num frames: 3460841472. Throughput: 0: 42046.3. Samples: 684944500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:05:05,500][18875] Avg episode reward: [(0, '0.691')] [2024-06-18 21:05:05,699][19107] Updated weights for policy 0, policy_version 211235 (0.0035) [2024-06-18 21:05:09,597][19107] Updated weights for policy 0, policy_version 211245 (0.0035) [2024-06-18 21:05:10,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41781.7, 300 sec: 41931.9). Total num frames: 3461054464. Throughput: 0: 41874.3. Samples: 685191040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:05:10,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 21:05:13,418][19107] Updated weights for policy 0, policy_version 211255 (0.0029) [2024-06-18 21:05:15,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3461283840. Throughput: 0: 42029.3. Samples: 685321640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:05:15,501][18875] Avg episode reward: [(0, '0.636')] [2024-06-18 21:05:17,190][19107] Updated weights for policy 0, policy_version 211265 (0.0050) [2024-06-18 21:05:20,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3461480448. Throughput: 0: 42114.9. Samples: 685579840. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:05:20,500][18875] Avg episode reward: [(0, '0.692')] [2024-06-18 21:05:21,133][19107] Updated weights for policy 0, policy_version 211275 (0.0026) [2024-06-18 21:05:24,822][19107] Updated weights for policy 0, policy_version 211285 (0.0036) [2024-06-18 21:05:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3461709824. Throughput: 0: 42093.8. Samples: 685827580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:05:25,501][18875] Avg episode reward: [(0, '0.847')] [2024-06-18 21:05:28,904][19107] Updated weights for policy 0, policy_version 211295 (0.0036) [2024-06-18 21:05:30,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3461906432. Throughput: 0: 42131.2. Samples: 685956480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-18 21:05:30,501][18875] Avg episode reward: [(0, '0.672')] [2024-06-18 21:05:32,451][19107] Updated weights for policy 0, policy_version 211305 (0.0041) [2024-06-18 21:05:35,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3462119424. Throughput: 0: 42113.4. Samples: 686209220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:05:35,501][18875] Avg episode reward: [(0, '0.684')] [2024-06-18 21:05:37,091][19107] Updated weights for policy 0, policy_version 211315 (0.0028) [2024-06-18 21:05:40,504][18875] Fps is (10 sec: 42583.1, 60 sec: 42322.8, 300 sec: 42042.5). Total num frames: 3462332416. Throughput: 0: 42149.4. Samples: 686458900. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:05:40,505][18875] Avg episode reward: [(0, '0.330')] [2024-06-18 21:05:40,685][19107] Updated weights for policy 0, policy_version 211325 (0.0039) [2024-06-18 21:05:44,957][19107] Updated weights for policy 0, policy_version 211335 (0.0040) [2024-06-18 21:05:45,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3462529024. Throughput: 0: 42203.1. Samples: 686586420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:05:45,501][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 21:05:48,591][19107] Updated weights for policy 0, policy_version 211345 (0.0029) [2024-06-18 21:05:50,500][18875] Fps is (10 sec: 42613.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3462758400. Throughput: 0: 41960.7. Samples: 686832740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:05:50,501][18875] Avg episode reward: [(0, '0.465')] [2024-06-18 21:05:52,815][19107] Updated weights for policy 0, policy_version 211355 (0.0046) [2024-06-18 21:05:55,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42327.6, 300 sec: 42098.6). Total num frames: 3462971392. Throughput: 0: 42086.3. Samples: 687084920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:05:55,500][18875] Avg episode reward: [(0, '0.690')] [2024-06-18 21:05:56,118][19107] Updated weights for policy 0, policy_version 211365 (0.0038) [2024-06-18 21:06:00,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3463151616. Throughput: 0: 42103.7. Samples: 687216300. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:06:00,501][18875] Avg episode reward: [(0, '0.756')] [2024-06-18 21:06:00,567][19107] Updated weights for policy 0, policy_version 211375 (0.0037) [2024-06-18 21:06:03,758][19107] Updated weights for policy 0, policy_version 211385 (0.0033) [2024-06-18 21:06:05,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 41932.0). Total num frames: 3463380992. Throughput: 0: 41975.4. Samples: 687468740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:06:05,501][18875] Avg episode reward: [(0, '0.829')] [2024-06-18 21:06:08,794][19107] Updated weights for policy 0, policy_version 211395 (0.0049) [2024-06-18 21:06:10,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3463593984. Throughput: 0: 42093.8. Samples: 687721800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:06:10,501][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 21:06:11,450][19107] Updated weights for policy 0, policy_version 211405 (0.0044) [2024-06-18 21:06:15,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3463774208. Throughput: 0: 42018.2. Samples: 687847300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:06:15,501][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 21:06:16,285][19107] Updated weights for policy 0, policy_version 211415 (0.0033) [2024-06-18 21:06:16,727][19087] Signal inference workers to stop experience collection... (10050 times) [2024-06-18 21:06:16,771][19107] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-06-18 21:06:16,782][19087] Signal inference workers to resume experience collection... 
(10050 times) [2024-06-18 21:06:16,790][19107] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-06-18 21:06:19,131][19107] Updated weights for policy 0, policy_version 211425 (0.0031) [2024-06-18 21:06:20,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3464019968. Throughput: 0: 42003.7. Samples: 688099380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:06:20,500][18875] Avg episode reward: [(0, '0.358')] [2024-06-18 21:06:24,025][19107] Updated weights for policy 0, policy_version 211435 (0.0027) [2024-06-18 21:06:25,504][18875] Fps is (10 sec: 45858.7, 60 sec: 42049.8, 300 sec: 42042.5). Total num frames: 3464232960. Throughput: 0: 42152.4. Samples: 688355760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:06:25,504][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 21:06:26,865][19107] Updated weights for policy 0, policy_version 211445 (0.0034) [2024-06-18 21:06:30,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3464429568. Throughput: 0: 42163.9. Samples: 688483800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:06:30,503][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 21:06:31,856][19107] Updated weights for policy 0, policy_version 211455 (0.0045) [2024-06-18 21:06:34,612][19107] Updated weights for policy 0, policy_version 211465 (0.0031) [2024-06-18 21:06:35,500][18875] Fps is (10 sec: 42613.3, 60 sec: 42325.3, 300 sec: 41988.0). Total num frames: 3464658944. Throughput: 0: 42351.5. Samples: 688738560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-18 21:06:35,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 21:06:35,514][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000211466_3464658944.pth... 
[2024-06-18 21:06:35,569][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000210851_3454582784.pth [2024-06-18 21:06:39,929][19107] Updated weights for policy 0, policy_version 211475 (0.0042) [2024-06-18 21:06:40,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42054.7, 300 sec: 42043.0). Total num frames: 3464855552. Throughput: 0: 42301.6. Samples: 688988500. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:06:40,501][18875] Avg episode reward: [(0, '0.621')] [2024-06-18 21:06:42,428][19107] Updated weights for policy 0, policy_version 211485 (0.0024) [2024-06-18 21:06:45,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3465068544. Throughput: 0: 41990.6. Samples: 689105880. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:06:45,501][18875] Avg episode reward: [(0, '0.645')] [2024-06-18 21:06:47,712][19107] Updated weights for policy 0, policy_version 211495 (0.0032) [2024-06-18 21:06:50,213][19107] Updated weights for policy 0, policy_version 211505 (0.0037) [2024-06-18 21:06:50,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3465297920. Throughput: 0: 42181.7. Samples: 689366920. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:06:50,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 21:06:55,450][19107] Updated weights for policy 0, policy_version 211515 (0.0043) [2024-06-18 21:06:55,500][18875] Fps is (10 sec: 39320.9, 60 sec: 41506.0, 300 sec: 41932.4). Total num frames: 3465461760. Throughput: 0: 42154.5. Samples: 689618760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:06:55,501][18875] Avg episode reward: [(0, '0.285')] [2024-06-18 21:06:58,143][19107] Updated weights for policy 0, policy_version 211525 (0.0030) [2024-06-18 21:07:00,500][18875] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3465691136. Throughput: 0: 41867.6. Samples: 689731340. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:00,501][18875] Avg episode reward: [(0, '0.442')] [2024-06-18 21:07:03,243][19107] Updated weights for policy 0, policy_version 211535 (0.0028) [2024-06-18 21:07:05,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3465904128. Throughput: 0: 42007.9. Samples: 689989740. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:05,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 21:07:06,652][19107] Updated weights for policy 0, policy_version 211545 (0.0032) [2024-06-18 21:07:10,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3466100736. Throughput: 0: 41720.7. Samples: 690233040. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:10,501][18875] Avg episode reward: [(0, '0.612')] [2024-06-18 21:07:11,191][19107] Updated weights for policy 0, policy_version 211555 (0.0032) [2024-06-18 21:07:14,423][19107] Updated weights for policy 0, policy_version 211565 (0.0037) [2024-06-18 21:07:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3466313728. Throughput: 0: 41780.0. Samples: 690363900. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:15,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 21:07:17,098][19087] Signal inference workers to stop experience collection... (10100 times) [2024-06-18 21:07:17,127][19107] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-06-18 21:07:17,146][19087] Signal inference workers to resume experience collection... (10100 times) [2024-06-18 21:07:17,148][19107] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-06-18 21:07:18,873][19107] Updated weights for policy 0, policy_version 211575 (0.0029) [2024-06-18 21:07:20,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 3466510336. Throughput: 0: 41730.3. 
Samples: 690616420. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:20,501][18875] Avg episode reward: [(0, '0.450')] [2024-06-18 21:07:21,940][19107] Updated weights for policy 0, policy_version 211585 (0.0036) [2024-06-18 21:07:25,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42054.8, 300 sec: 41987.5). Total num frames: 3466756096. Throughput: 0: 41594.7. Samples: 690860260. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:25,501][18875] Avg episode reward: [(0, '0.308')] [2024-06-18 21:07:26,949][19107] Updated weights for policy 0, policy_version 211595 (0.0028) [2024-06-18 21:07:29,879][19107] Updated weights for policy 0, policy_version 211605 (0.0046) [2024-06-18 21:07:30,504][18875] Fps is (10 sec: 44221.0, 60 sec: 42049.8, 300 sec: 41931.4). Total num frames: 3466952704. Throughput: 0: 41928.2. Samples: 690992800. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:30,504][18875] Avg episode reward: [(0, '0.430')] [2024-06-18 21:07:34,485][19107] Updated weights for policy 0, policy_version 211615 (0.0034) [2024-06-18 21:07:35,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3467149312. Throughput: 0: 41831.1. Samples: 691249320. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:35,501][18875] Avg episode reward: [(0, '0.635')] [2024-06-18 21:07:37,517][19107] Updated weights for policy 0, policy_version 211625 (0.0041) [2024-06-18 21:07:40,500][18875] Fps is (10 sec: 40974.9, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3467362304. Throughput: 0: 41734.8. Samples: 691496820. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:40,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 21:07:42,025][19107] Updated weights for policy 0, policy_version 211635 (0.0043) [2024-06-18 21:07:45,229][19107] Updated weights for policy 0, policy_version 211645 (0.0037) [2024-06-18 21:07:45,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3467591680. Throughput: 0: 42247.5. Samples: 691632480. Policy #0 lag: (min: 0.0, avg: 8.0, max: 21.0) [2024-06-18 21:07:45,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 21:07:49,596][19107] Updated weights for policy 0, policy_version 211655 (0.0039) [2024-06-18 21:07:50,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 3467771904. Throughput: 0: 42052.8. Samples: 691882120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:07:50,501][18875] Avg episode reward: [(0, '0.825')] [2024-06-18 21:07:53,094][19107] Updated weights for policy 0, policy_version 211665 (0.0036) [2024-06-18 21:07:55,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 41987.4). Total num frames: 3468017664. Throughput: 0: 42203.5. Samples: 692132200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:07:55,501][18875] Avg episode reward: [(0, '0.630')] [2024-06-18 21:07:57,186][19107] Updated weights for policy 0, policy_version 211675 (0.0038) [2024-06-18 21:08:00,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3468197888. Throughput: 0: 42291.0. Samples: 692267000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:00,501][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 21:08:00,918][19107] Updated weights for policy 0, policy_version 211685 (0.0037) [2024-06-18 21:08:05,084][19107] Updated weights for policy 0, policy_version 211695 (0.0042) [2024-06-18 21:08:05,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3468410880. Throughput: 0: 42203.6. Samples: 692515580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:05,501][18875] Avg episode reward: [(0, '0.327')] [2024-06-18 21:08:08,837][19107] Updated weights for policy 0, policy_version 211705 (0.0035) [2024-06-18 21:08:10,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 41987.4). Total num frames: 3468656640. Throughput: 0: 42406.6. Samples: 692768560. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:10,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 21:08:13,329][19107] Updated weights for policy 0, policy_version 211715 (0.0042) [2024-06-18 21:08:15,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3468836864. Throughput: 0: 42223.5. Samples: 692892700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:15,500][18875] Avg episode reward: [(0, '0.850')] [2024-06-18 21:08:16,668][19107] Updated weights for policy 0, policy_version 211725 (0.0034) [2024-06-18 21:08:20,500][18875] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 41932.4). Total num frames: 3469033472. Throughput: 0: 42097.0. Samples: 693143680. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:20,501][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 21:08:20,845][19107] Updated weights for policy 0, policy_version 211735 (0.0034) [2024-06-18 21:08:24,574][19107] Updated weights for policy 0, policy_version 211745 (0.0031) [2024-06-18 21:08:25,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3469262848. Throughput: 0: 42113.8. Samples: 693391940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:25,500][18875] Avg episode reward: [(0, '0.410')] [2024-06-18 21:08:28,610][19107] Updated weights for policy 0, policy_version 211755 (0.0030) [2024-06-18 21:08:30,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42054.7, 300 sec: 42043.5). Total num frames: 3469475840. Throughput: 0: 41928.4. Samples: 693519260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:30,501][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 21:08:32,387][19107] Updated weights for policy 0, policy_version 211765 (0.0030) [2024-06-18 21:08:33,171][19087] Signal inference workers to stop experience collection... (10150 times) [2024-06-18 21:08:33,172][19087] Signal inference workers to resume experience collection... (10150 times) [2024-06-18 21:08:33,200][19107] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-06-18 21:08:33,200][19107] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-06-18 21:08:35,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3469672448. Throughput: 0: 41873.9. Samples: 693766440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:35,500][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 21:08:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000211773_3469688832.pth... 
[2024-06-18 21:08:35,568][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000211157_3459596288.pth [2024-06-18 21:08:36,374][19107] Updated weights for policy 0, policy_version 211775 (0.0036) [2024-06-18 21:08:40,187][19107] Updated weights for policy 0, policy_version 211785 (0.0032) [2024-06-18 21:08:40,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 3469901824. Throughput: 0: 42088.9. Samples: 694026200. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:40,501][18875] Avg episode reward: [(0, '0.358')] [2024-06-18 21:08:44,213][19107] Updated weights for policy 0, policy_version 211795 (0.0035) [2024-06-18 21:08:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3470098432. Throughput: 0: 41878.0. Samples: 694151500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:45,500][18875] Avg episode reward: [(0, '0.731')] [2024-06-18 21:08:48,284][19107] Updated weights for policy 0, policy_version 211805 (0.0035) [2024-06-18 21:08:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3470311424. Throughput: 0: 41831.1. Samples: 694397980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 21:08:50,501][18875] Avg episode reward: [(0, '0.836')] [2024-06-18 21:08:51,965][19107] Updated weights for policy 0, policy_version 211815 (0.0044) [2024-06-18 21:08:55,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3470508032. Throughput: 0: 41827.2. Samples: 694650780. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:08:55,501][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 21:08:56,135][19107] Updated weights for policy 0, policy_version 211825 (0.0032) [2024-06-18 21:08:59,621][19107] Updated weights for policy 0, policy_version 211835 (0.0034) [2024-06-18 21:09:00,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 3470737408. Throughput: 0: 41804.5. Samples: 694773900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:00,500][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 21:09:03,900][19107] Updated weights for policy 0, policy_version 211845 (0.0035) [2024-06-18 21:09:05,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42043.5). Total num frames: 3470950400. Throughput: 0: 42036.5. Samples: 695035320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:05,501][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 21:09:07,295][19107] Updated weights for policy 0, policy_version 211855 (0.0033) [2024-06-18 21:09:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.3, 300 sec: 41987.5). Total num frames: 3471147008. Throughput: 0: 42109.8. Samples: 695286880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:10,500][18875] Avg episode reward: [(0, '0.415')] [2024-06-18 21:09:11,557][19107] Updated weights for policy 0, policy_version 211865 (0.0044) [2024-06-18 21:09:14,884][19107] Updated weights for policy 0, policy_version 211875 (0.0048) [2024-06-18 21:09:15,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3471392768. Throughput: 0: 42100.0. Samples: 695413760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:15,501][18875] Avg episode reward: [(0, '0.295')] [2024-06-18 21:09:19,294][19107] Updated weights for policy 0, policy_version 211885 (0.0033) [2024-06-18 21:09:20,504][18875] Fps is (10 sec: 42582.5, 60 sec: 42322.8, 300 sec: 42042.5). 
Total num frames: 3471572992. Throughput: 0: 42248.5. Samples: 695667780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:20,504][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 21:09:22,745][19107] Updated weights for policy 0, policy_version 211895 (0.0023) [2024-06-18 21:09:25,504][18875] Fps is (10 sec: 37669.7, 60 sec: 41776.6, 300 sec: 41931.4). Total num frames: 3471769600. Throughput: 0: 42072.3. Samples: 695919600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:25,504][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 21:09:27,089][19107] Updated weights for policy 0, policy_version 211905 (0.0034) [2024-06-18 21:09:30,500][18875] Fps is (10 sec: 42613.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3471998976. Throughput: 0: 42047.5. Samples: 696043640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:30,501][18875] Avg episode reward: [(0, '0.813')] [2024-06-18 21:09:30,540][19107] Updated weights for policy 0, policy_version 211915 (0.0046) [2024-06-18 21:09:34,926][19107] Updated weights for policy 0, policy_version 211925 (0.0043) [2024-06-18 21:09:35,500][18875] Fps is (10 sec: 42613.4, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 3472195584. Throughput: 0: 42277.7. Samples: 696300480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:35,501][18875] Avg episode reward: [(0, '0.678')] [2024-06-18 21:09:37,995][19107] Updated weights for policy 0, policy_version 211935 (0.0031) [2024-06-18 21:09:40,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41506.3, 300 sec: 41932.0). Total num frames: 3472392192. Throughput: 0: 42196.2. Samples: 696549600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:40,500][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 21:09:42,774][19107] Updated weights for policy 0, policy_version 211945 (0.0032) [2024-06-18 21:09:45,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42098.6). 
Total num frames: 3472654336. Throughput: 0: 42308.7. Samples: 696677800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:45,501][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 21:09:45,938][19107] Updated weights for policy 0, policy_version 211955 (0.0030) [2024-06-18 21:09:50,501][19107] Updated weights for policy 0, policy_version 211965 (0.0025) [2024-06-18 21:09:50,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42043.5). Total num frames: 3472834560. Throughput: 0: 42147.1. Samples: 696931940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:50,501][18875] Avg episode reward: [(0, '0.369')] [2024-06-18 21:09:52,798][19087] Signal inference workers to stop experience collection... (10200 times) [2024-06-18 21:09:52,846][19107] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-06-18 21:09:52,855][19087] Signal inference workers to resume experience collection... (10200 times) [2024-06-18 21:09:52,861][19107] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-06-18 21:09:53,905][19107] Updated weights for policy 0, policy_version 211975 (0.0040) [2024-06-18 21:09:55,500][18875] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3473047552. Throughput: 0: 41814.9. Samples: 697168560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:09:55,503][18875] Avg episode reward: [(0, '0.642')] [2024-06-18 21:09:58,358][19107] Updated weights for policy 0, policy_version 211985 (0.0034) [2024-06-18 21:10:00,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3473260544. Throughput: 0: 41968.5. Samples: 697302340. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 21:10:00,501][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 21:10:01,861][19107] Updated weights for policy 0, policy_version 211995 (0.0030) [2024-06-18 21:10:05,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3473440768. Throughput: 0: 41770.0. Samples: 697547280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:05,501][18875] Avg episode reward: [(0, '0.676')] [2024-06-18 21:10:06,308][19107] Updated weights for policy 0, policy_version 212005 (0.0036) [2024-06-18 21:10:09,540][19107] Updated weights for policy 0, policy_version 212015 (0.0035) [2024-06-18 21:10:10,504][18875] Fps is (10 sec: 44220.9, 60 sec: 42595.8, 300 sec: 42098.0). Total num frames: 3473702912. Throughput: 0: 41745.4. Samples: 697798140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:10,504][18875] Avg episode reward: [(0, '0.687')] [2024-06-18 21:10:14,266][19107] Updated weights for policy 0, policy_version 212025 (0.0040) [2024-06-18 21:10:15,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 3473866752. Throughput: 0: 42043.1. Samples: 697935580. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:15,501][18875] Avg episode reward: [(0, '0.627')] [2024-06-18 21:10:17,281][19107] Updated weights for policy 0, policy_version 212035 (0.0048) [2024-06-18 21:10:20,502][18875] Fps is (10 sec: 37688.9, 60 sec: 41780.3, 300 sec: 41931.6). Total num frames: 3474079744. Throughput: 0: 41710.2. Samples: 698177520. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:20,503][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 21:10:21,981][19107] Updated weights for policy 0, policy_version 212045 (0.0035) [2024-06-18 21:10:24,912][19107] Updated weights for policy 0, policy_version 212055 (0.0041) [2024-06-18 21:10:25,500][18875] Fps is (10 sec: 45875.7, 60 sec: 42601.0, 300 sec: 42098.6). Total num frames: 3474325504. Throughput: 0: 41842.2. Samples: 698432500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:25,500][18875] Avg episode reward: [(0, '0.517')] [2024-06-18 21:10:29,827][19107] Updated weights for policy 0, policy_version 212065 (0.0025) [2024-06-18 21:10:30,500][18875] Fps is (10 sec: 42607.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3474505728. Throughput: 0: 42012.5. Samples: 698568360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:30,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 21:10:32,888][19107] Updated weights for policy 0, policy_version 212075 (0.0022) [2024-06-18 21:10:35,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 3474718720. Throughput: 0: 41876.4. Samples: 698816380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:35,501][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 21:10:35,516][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000212080_3474718720.pth... [2024-06-18 21:10:35,573][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000211466_3464658944.pth [2024-06-18 21:10:37,205][19107] Updated weights for policy 0, policy_version 212085 (0.0033) [2024-06-18 21:10:40,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 3474948096. Throughput: 0: 42287.2. Samples: 699071480. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:40,501][18875] Avg episode reward: [(0, '0.344')] [2024-06-18 21:10:40,557][19107] Updated weights for policy 0, policy_version 212095 (0.0025) [2024-06-18 21:10:44,866][19107] Updated weights for policy 0, policy_version 212105 (0.0030) [2024-06-18 21:10:45,504][18875] Fps is (10 sec: 42582.9, 60 sec: 41503.6, 300 sec: 41987.0). Total num frames: 3475144704. Throughput: 0: 42193.0. Samples: 699201180. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:45,505][18875] Avg episode reward: [(0, '0.367')] [2024-06-18 21:10:48,729][19107] Updated weights for policy 0, policy_version 212115 (0.0025) [2024-06-18 21:10:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3475374080. Throughput: 0: 42325.3. Samples: 699451920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:50,501][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 21:10:52,695][19107] Updated weights for policy 0, policy_version 212125 (0.0024) [2024-06-18 21:10:53,569][19087] Signal inference workers to stop experience collection... (10250 times) [2024-06-18 21:10:53,569][19087] Signal inference workers to resume experience collection... (10250 times) [2024-06-18 21:10:53,598][19107] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-06-18 21:10:53,604][19107] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-06-18 21:10:55,500][18875] Fps is (10 sec: 44252.8, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3475587072. Throughput: 0: 42424.2. Samples: 699707080. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:10:55,501][18875] Avg episode reward: [(0, '0.432')] [2024-06-18 21:10:56,422][19107] Updated weights for policy 0, policy_version 212135 (0.0029) [2024-06-18 21:11:00,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3475767296. Throughput: 0: 42198.2. 
Samples: 699834500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:11:00,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 21:11:00,709][19107] Updated weights for policy 0, policy_version 212145 (0.0033) [2024-06-18 21:11:04,036][19107] Updated weights for policy 0, policy_version 212155 (0.0039) [2024-06-18 21:11:05,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 3476013056. Throughput: 0: 42552.2. Samples: 700092280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:11:05,500][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 21:11:08,095][19107] Updated weights for policy 0, policy_version 212165 (0.0027) [2024-06-18 21:11:10,500][18875] Fps is (10 sec: 45876.0, 60 sec: 42054.8, 300 sec: 42209.6). Total num frames: 3476226048. Throughput: 0: 42529.3. Samples: 700346320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 22.0) [2024-06-18 21:11:10,500][18875] Avg episode reward: [(0, '0.741')] [2024-06-18 21:11:11,742][19107] Updated weights for policy 0, policy_version 212175 (0.0032) [2024-06-18 21:11:15,500][18875] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3476406272. Throughput: 0: 42277.8. Samples: 700470860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:11:15,501][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 21:11:15,731][19107] Updated weights for policy 0, policy_version 212185 (0.0029) [2024-06-18 21:11:19,601][19107] Updated weights for policy 0, policy_version 212195 (0.0044) [2024-06-18 21:11:20,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42599.8, 300 sec: 42043.5). Total num frames: 3476635648. Throughput: 0: 42384.9. Samples: 700723700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:11:20,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 21:11:23,428][19107] Updated weights for policy 0, policy_version 212205 (0.0043) [2024-06-18 21:11:25,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3476848640. Throughput: 0: 42345.0. Samples: 700977000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:11:25,500][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 21:11:27,412][19107] Updated weights for policy 0, policy_version 212215 (0.0033) [2024-06-18 21:11:30,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 3477061632. Throughput: 0: 42223.4. Samples: 701101080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:11:30,504][18875] Avg episode reward: [(0, '0.498')] [2024-06-18 21:11:31,073][19107] Updated weights for policy 0, policy_version 212225 (0.0042) [2024-06-18 21:11:35,094][19107] Updated weights for policy 0, policy_version 212235 (0.0038) [2024-06-18 21:11:35,500][18875] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 3477274624. Throughput: 0: 42332.0. Samples: 701356860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:11:35,501][18875] Avg episode reward: [(0, '0.365')] [2024-06-18 21:11:38,907][19107] Updated weights for policy 0, policy_version 212245 (0.0037) [2024-06-18 21:11:40,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3477471232. Throughput: 0: 42040.8. Samples: 701598920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:11:40,501][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 21:11:42,870][19107] Updated weights for policy 0, policy_version 212255 (0.0037) [2024-06-18 21:11:45,500][18875] Fps is (10 sec: 39321.8, 60 sec: 42054.8, 300 sec: 41931.9). Total num frames: 3477667840. Throughput: 0: 42102.2. Samples: 701729100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:11:45,501][18875] Avg episode reward: [(0, '0.744')] [2024-06-18 21:11:46,553][19107] Updated weights for policy 0, policy_version 212265 (0.0033) [2024-06-18 21:11:50,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3477897216. Throughput: 0: 41968.0. Samples: 701980840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:11:50,501][18875] Avg episode reward: [(0, '0.795')] [2024-06-18 21:11:50,578][19107] Updated weights for policy 0, policy_version 212275 (0.0030) [2024-06-18 21:11:54,723][19107] Updated weights for policy 0, policy_version 212285 (0.0035) [2024-06-18 21:11:55,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3478110208. Throughput: 0: 41912.8. Samples: 702232400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:11:55,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 21:11:58,560][19107] Updated weights for policy 0, policy_version 212295 (0.0030) [2024-06-18 21:12:00,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3478306816. Throughput: 0: 42008.8. Samples: 702361260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:12:00,501][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 21:12:02,467][19107] Updated weights for policy 0, policy_version 212305 (0.0043) [2024-06-18 21:12:05,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3478536192. Throughput: 0: 41867.1. Samples: 702607720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:12:05,504][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 21:12:06,300][19107] Updated weights for policy 0, policy_version 212315 (0.0043) [2024-06-18 21:12:08,090][19087] Signal inference workers to stop experience collection... 
(10300 times) [2024-06-18 21:12:08,091][19087] Signal inference workers to resume experience collection... (10300 times) [2024-06-18 21:12:08,112][19107] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-06-18 21:12:08,112][19107] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-06-18 21:12:10,235][19107] Updated weights for policy 0, policy_version 212325 (0.0041) [2024-06-18 21:12:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3478732800. Throughput: 0: 41796.3. Samples: 702857840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:12:10,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 21:12:14,076][19107] Updated weights for policy 0, policy_version 212335 (0.0036) [2024-06-18 21:12:15,500][18875] Fps is (10 sec: 37682.8, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3478913024. Throughput: 0: 41866.2. Samples: 702985060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:12:15,501][18875] Avg episode reward: [(0, '0.324')] [2024-06-18 21:12:18,396][19107] Updated weights for policy 0, policy_version 212345 (0.0035) [2024-06-18 21:12:20,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3479158784. Throughput: 0: 41835.8. Samples: 703239460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:12:20,500][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 21:12:21,714][19107] Updated weights for policy 0, policy_version 212355 (0.0044) [2024-06-18 21:12:25,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 42043.5). Total num frames: 3479355392. Throughput: 0: 42180.1. Samples: 703497020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:12:25,501][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 21:12:26,145][19107] Updated weights for policy 0, policy_version 212365 (0.0037) [2024-06-18 21:12:29,789][19107] Updated weights for policy 0, policy_version 212375 (0.0046) [2024-06-18 21:12:30,504][18875] Fps is (10 sec: 40944.7, 60 sec: 41776.7, 300 sec: 42098.1). Total num frames: 3479568384. Throughput: 0: 41986.9. Samples: 703618660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:12:30,505][18875] Avg episode reward: [(0, '0.754')] [2024-06-18 21:12:33,906][19107] Updated weights for policy 0, policy_version 212385 (0.0039) [2024-06-18 21:12:35,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3479797760. Throughput: 0: 42075.6. Samples: 703874240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:12:35,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 21:12:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000212390_3479797760.pth... [2024-06-18 21:12:35,589][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000211773_3469688832.pth [2024-06-18 21:12:37,470][19107] Updated weights for policy 0, policy_version 212395 (0.0026) [2024-06-18 21:12:40,500][18875] Fps is (10 sec: 40975.0, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3479977984. Throughput: 0: 42089.4. Samples: 704126420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:12:40,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 21:12:41,630][19107] Updated weights for policy 0, policy_version 212405 (0.0042) [2024-06-18 21:12:45,504][18875] Fps is (10 sec: 39307.2, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 3480190976. Throughput: 0: 41977.1. Samples: 704250380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:12:45,505][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 21:12:45,543][19107] Updated weights for policy 0, policy_version 212415 (0.0037) [2024-06-18 21:12:49,356][19107] Updated weights for policy 0, policy_version 212425 (0.0038) [2024-06-18 21:12:50,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3480420352. Throughput: 0: 42300.9. Samples: 704511260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:12:50,501][18875] Avg episode reward: [(0, '0.640')] [2024-06-18 21:12:53,278][19107] Updated weights for policy 0, policy_version 212435 (0.0036) [2024-06-18 21:12:55,500][18875] Fps is (10 sec: 42614.3, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3480616960. Throughput: 0: 42331.7. Samples: 704762760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:12:55,500][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 21:12:57,137][19107] Updated weights for policy 0, policy_version 212445 (0.0034) [2024-06-18 21:13:00,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3480846336. Throughput: 0: 42258.4. Samples: 704886680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:13:00,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 21:13:00,825][19107] Updated weights for policy 0, policy_version 212455 (0.0043) [2024-06-18 21:13:04,971][19107] Updated weights for policy 0, policy_version 212465 (0.0045) [2024-06-18 21:13:05,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3481042944. Throughput: 0: 42295.9. Samples: 705142780. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:13:05,501][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 21:13:08,536][19107] Updated weights for policy 0, policy_version 212475 (0.0046) [2024-06-18 21:13:10,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3481239552. Throughput: 0: 42074.7. Samples: 705390380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:13:10,501][18875] Avg episode reward: [(0, '0.867')] [2024-06-18 21:13:12,598][19107] Updated weights for policy 0, policy_version 212485 (0.0040) [2024-06-18 21:13:15,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42598.6, 300 sec: 42154.1). Total num frames: 3481468928. Throughput: 0: 42165.7. Samples: 705515960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:13:15,500][18875] Avg episode reward: [(0, '0.870')] [2024-06-18 21:13:16,158][19107] Updated weights for policy 0, policy_version 212495 (0.0044) [2024-06-18 21:13:20,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3481665536. Throughput: 0: 42009.0. Samples: 705764640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:13:20,500][18875] Avg episode reward: [(0, '0.742')] [2024-06-18 21:13:20,766][19107] Updated weights for policy 0, policy_version 212505 (0.0033) [2024-06-18 21:13:24,000][19107] Updated weights for policy 0, policy_version 212515 (0.0033) [2024-06-18 21:13:25,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3481878528. Throughput: 0: 42079.0. Samples: 706019980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:13:25,501][18875] Avg episode reward: [(0, '0.646')] [2024-06-18 21:13:28,799][19107] Updated weights for policy 0, policy_version 212525 (0.0040) [2024-06-18 21:13:30,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42054.8, 300 sec: 42098.5). Total num frames: 3482091520. Throughput: 0: 42146.9. Samples: 706146840. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:13:30,501][18875] Avg episode reward: [(0, '0.399')] [2024-06-18 21:13:31,789][19107] Updated weights for policy 0, policy_version 212535 (0.0036) [2024-06-18 21:13:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3482304512. Throughput: 0: 41911.6. Samples: 706397280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:13:35,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 21:13:36,435][19087] Signal inference workers to stop experience collection... (10350 times) [2024-06-18 21:13:36,436][19087] Signal inference workers to resume experience collection... (10350 times) [2024-06-18 21:13:36,451][19107] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-06-18 21:13:36,451][19107] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-06-18 21:13:36,615][19107] Updated weights for policy 0, policy_version 212545 (0.0031) [2024-06-18 21:13:39,581][19107] Updated weights for policy 0, policy_version 212555 (0.0038) [2024-06-18 21:13:40,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3482517504. Throughput: 0: 41856.9. Samples: 706646320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:13:40,500][18875] Avg episode reward: [(0, '0.582')] [2024-06-18 21:13:44,266][19107] Updated weights for policy 0, policy_version 212565 (0.0037) [2024-06-18 21:13:45,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42054.9, 300 sec: 42043.0). Total num frames: 3482714112. Throughput: 0: 41913.8. Samples: 706772800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:13:45,501][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 21:13:47,602][19107] Updated weights for policy 0, policy_version 212575 (0.0029) [2024-06-18 21:13:50,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3482927104. Throughput: 0: 41721.0. 
Samples: 707020220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:13:50,500][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 21:13:52,108][19107] Updated weights for policy 0, policy_version 212585 (0.0046) [2024-06-18 21:13:55,186][19107] Updated weights for policy 0, policy_version 212595 (0.0040) [2024-06-18 21:13:55,504][18875] Fps is (10 sec: 44220.7, 60 sec: 42322.7, 300 sec: 42098.0). Total num frames: 3483156480. Throughput: 0: 41778.9. Samples: 707270580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:13:55,505][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 21:13:59,586][19107] Updated weights for policy 0, policy_version 212605 (0.0039) [2024-06-18 21:14:00,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 3483320320. Throughput: 0: 41836.4. Samples: 707398600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:14:00,500][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 21:14:02,874][19107] Updated weights for policy 0, policy_version 212615 (0.0025) [2024-06-18 21:14:05,500][18875] Fps is (10 sec: 40974.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3483566080. Throughput: 0: 41946.9. Samples: 707652260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:14:05,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 21:14:07,504][19107] Updated weights for policy 0, policy_version 212625 (0.0046) [2024-06-18 21:14:10,500][18875] Fps is (10 sec: 45874.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3483779072. Throughput: 0: 41873.3. Samples: 707904280. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:14:10,501][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 21:14:10,748][19107] Updated weights for policy 0, policy_version 212635 (0.0035) [2024-06-18 21:14:15,081][19107] Updated weights for policy 0, policy_version 212645 (0.0032) [2024-06-18 21:14:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.1, 300 sec: 42043.5). Total num frames: 3483975680. Throughput: 0: 41790.7. Samples: 708027420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:14:15,501][18875] Avg episode reward: [(0, '0.388')] [2024-06-18 21:14:18,459][19107] Updated weights for policy 0, policy_version 212655 (0.0035) [2024-06-18 21:14:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42154.6). Total num frames: 3484205056. Throughput: 0: 41846.6. Samples: 708280380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:14:20,501][18875] Avg episode reward: [(0, '0.423')] [2024-06-18 21:14:22,623][19107] Updated weights for policy 0, policy_version 212665 (0.0028) [2024-06-18 21:14:25,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41506.3, 300 sec: 41931.9). Total num frames: 3484368896. Throughput: 0: 42181.3. Samples: 708544480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:14:25,500][18875] Avg episode reward: [(0, '0.272')] [2024-06-18 21:14:26,489][19107] Updated weights for policy 0, policy_version 212675 (0.0039) [2024-06-18 21:14:30,266][19107] Updated weights for policy 0, policy_version 212685 (0.0029) [2024-06-18 21:14:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3484631040. Throughput: 0: 42015.9. Samples: 708663520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:14:30,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 21:14:34,319][19107] Updated weights for policy 0, policy_version 212695 (0.0049) [2024-06-18 21:14:35,500][18875] Fps is (10 sec: 47513.3, 60 sec: 42325.4, 300 sec: 42209.6). 
Total num frames: 3484844032. Throughput: 0: 42145.7. Samples: 708916780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:14:35,501][18875] Avg episode reward: [(0, '0.461')] [2024-06-18 21:14:35,509][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000212698_3484844032.pth... [2024-06-18 21:14:35,573][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000212080_3474718720.pth [2024-06-18 21:14:38,004][19107] Updated weights for policy 0, policy_version 212705 (0.0039) [2024-06-18 21:14:40,500][18875] Fps is (10 sec: 37683.5, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3485007872. Throughput: 0: 42209.1. Samples: 709169840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:14:40,501][18875] Avg episode reward: [(0, '0.750')] [2024-06-18 21:14:40,981][19087] Signal inference workers to stop experience collection... (10400 times) [2024-06-18 21:14:40,996][19107] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-06-18 21:14:41,046][19087] Signal inference workers to resume experience collection... (10400 times) [2024-06-18 21:14:41,046][19107] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-06-18 21:14:42,615][19107] Updated weights for policy 0, policy_version 212715 (0.0031) [2024-06-18 21:14:45,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3485253632. Throughput: 0: 41942.6. Samples: 709286020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:14:45,501][18875] Avg episode reward: [(0, '0.764')] [2024-06-18 21:14:45,753][19107] Updated weights for policy 0, policy_version 212725 (0.0037) [2024-06-18 21:14:50,235][19107] Updated weights for policy 0, policy_version 212735 (0.0028) [2024-06-18 21:14:50,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3485450240. Throughput: 0: 42069.8. Samples: 709545400. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:14:50,501][18875] Avg episode reward: [(0, '0.399')] [2024-06-18 21:14:53,345][19107] Updated weights for policy 0, policy_version 212745 (0.0029) [2024-06-18 21:14:55,500][18875] Fps is (10 sec: 37682.8, 60 sec: 41235.5, 300 sec: 41931.9). Total num frames: 3485630464. Throughput: 0: 42026.3. Samples: 709795460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:14:55,501][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 21:14:58,152][19107] Updated weights for policy 0, policy_version 212755 (0.0032) [2024-06-18 21:15:00,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3485876224. Throughput: 0: 42056.1. Samples: 709919940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:15:00,501][18875] Avg episode reward: [(0, '0.791')] [2024-06-18 21:15:01,091][19107] Updated weights for policy 0, policy_version 212765 (0.0031) [2024-06-18 21:15:05,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41876.9). Total num frames: 3486056448. Throughput: 0: 42197.3. Samples: 710179260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:15:05,501][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 21:15:05,931][19107] Updated weights for policy 0, policy_version 212775 (0.0033) [2024-06-18 21:15:09,244][19107] Updated weights for policy 0, policy_version 212785 (0.0038) [2024-06-18 21:15:10,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3486285824. Throughput: 0: 41547.9. Samples: 710414140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:15:10,501][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 21:15:13,997][19107] Updated weights for policy 0, policy_version 212795 (0.0048) [2024-06-18 21:15:15,500][18875] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42098.8). Total num frames: 3486498816. Throughput: 0: 41947.3. Samples: 710551140. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:15:15,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 21:15:16,995][19107] Updated weights for policy 0, policy_version 212805 (0.0031) [2024-06-18 21:15:20,500][18875] Fps is (10 sec: 39322.2, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 3486679040. Throughput: 0: 41920.5. Samples: 710803200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:15:20,501][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 21:15:21,898][19107] Updated weights for policy 0, policy_version 212815 (0.0030) [2024-06-18 21:15:24,794][19107] Updated weights for policy 0, policy_version 212825 (0.0034) [2024-06-18 21:15:25,500][18875] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42154.1). Total num frames: 3486941184. Throughput: 0: 41684.8. Samples: 711045660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:15:25,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 21:15:29,502][19107] Updated weights for policy 0, policy_version 212835 (0.0027) [2024-06-18 21:15:30,500][18875] Fps is (10 sec: 45874.8, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3487137792. Throughput: 0: 42232.0. Samples: 711186460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:15:30,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 21:15:32,742][19107] Updated weights for policy 0, policy_version 212845 (0.0032) [2024-06-18 21:15:35,500][18875] Fps is (10 sec: 36044.7, 60 sec: 40959.9, 300 sec: 41876.4). Total num frames: 3487301632. Throughput: 0: 41940.8. Samples: 711432740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:15:35,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 21:15:37,460][19107] Updated weights for policy 0, policy_version 212855 (0.0045) [2024-06-18 21:15:40,338][19087] Signal inference workers to stop experience collection... 
(10450 times) [2024-06-18 21:15:40,388][19107] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-06-18 21:15:40,393][19087] Signal inference workers to resume experience collection... (10450 times) [2024-06-18 21:15:40,404][19107] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-06-18 21:15:40,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42099.1). Total num frames: 3487563776. Throughput: 0: 41772.6. Samples: 711675220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 25.0) [2024-06-18 21:15:40,500][18875] Avg episode reward: [(0, '0.858')] [2024-06-18 21:15:40,534][19107] Updated weights for policy 0, policy_version 212865 (0.0040) [2024-06-18 21:15:45,319][19107] Updated weights for policy 0, policy_version 212875 (0.0040) [2024-06-18 21:15:45,500][18875] Fps is (10 sec: 45875.5, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3487760384. Throughput: 0: 41990.5. Samples: 711809520. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:15:45,501][18875] Avg episode reward: [(0, '0.746')] [2024-06-18 21:15:48,374][19107] Updated weights for policy 0, policy_version 212885 (0.0038) [2024-06-18 21:15:50,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3487956992. Throughput: 0: 41751.3. Samples: 712058060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:15:50,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 21:15:52,852][19107] Updated weights for policy 0, policy_version 212895 (0.0033) [2024-06-18 21:15:55,505][18875] Fps is (10 sec: 44218.2, 60 sec: 42868.4, 300 sec: 42153.5). Total num frames: 3488202752. Throughput: 0: 41985.8. Samples: 712303680. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:15:55,505][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 21:15:56,215][19107] Updated weights for policy 0, policy_version 212905 (0.0043) [2024-06-18 21:16:00,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3488382976. Throughput: 0: 41971.0. Samples: 712439840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:00,501][18875] Avg episode reward: [(0, '0.343')] [2024-06-18 21:16:00,638][19107] Updated weights for policy 0, policy_version 212915 (0.0028) [2024-06-18 21:16:04,386][19107] Updated weights for policy 0, policy_version 212925 (0.0037) [2024-06-18 21:16:05,500][18875] Fps is (10 sec: 37699.6, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 3488579584. Throughput: 0: 41904.4. Samples: 712688900. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:05,500][18875] Avg episode reward: [(0, '0.741')] [2024-06-18 21:16:08,807][19107] Updated weights for policy 0, policy_version 212935 (0.0043) [2024-06-18 21:16:10,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3488825344. Throughput: 0: 41907.2. Samples: 712931480. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:10,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 21:16:11,906][19107] Updated weights for policy 0, policy_version 212945 (0.0034) [2024-06-18 21:16:15,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3489005568. Throughput: 0: 41752.4. Samples: 713065320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:15,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 21:16:16,482][19107] Updated weights for policy 0, policy_version 212955 (0.0034) [2024-06-18 21:16:19,503][19107] Updated weights for policy 0, policy_version 212965 (0.0044) [2024-06-18 21:16:20,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 41987.5). 
Total num frames: 3489234944. Throughput: 0: 41764.1. Samples: 713312120. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:20,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 21:16:24,071][19107] Updated weights for policy 0, policy_version 212975 (0.0034) [2024-06-18 21:16:25,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41779.4, 300 sec: 41987.5). Total num frames: 3489447936. Throughput: 0: 42049.8. Samples: 713567460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:25,500][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 21:16:27,473][19107] Updated weights for policy 0, policy_version 212985 (0.0051) [2024-06-18 21:16:30,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 3489628160. Throughput: 0: 41897.8. Samples: 713694920. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:30,501][18875] Avg episode reward: [(0, '0.779')] [2024-06-18 21:16:31,916][19107] Updated weights for policy 0, policy_version 212995 (0.0027) [2024-06-18 21:16:35,216][19107] Updated weights for policy 0, policy_version 213005 (0.0031) [2024-06-18 21:16:35,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42871.5, 300 sec: 42043.0). Total num frames: 3489873920. Throughput: 0: 42047.4. Samples: 713950200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:35,501][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 21:16:35,518][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000213005_3489873920.pth... [2024-06-18 21:16:35,570][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000212390_3479797760.pth [2024-06-18 21:16:39,684][19107] Updated weights for policy 0, policy_version 213015 (0.0034) [2024-06-18 21:16:40,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3490070528. Throughput: 0: 42156.9. Samples: 714200560. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:40,501][18875] Avg episode reward: [(0, '0.217')] [2024-06-18 21:16:43,153][19107] Updated weights for policy 0, policy_version 213025 (0.0026) [2024-06-18 21:16:45,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3490250752. Throughput: 0: 41883.5. Samples: 714324600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:45,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 21:16:47,745][19107] Updated weights for policy 0, policy_version 213035 (0.0027) [2024-06-18 21:16:47,912][19087] Signal inference workers to stop experience collection... (10500 times) [2024-06-18 21:16:47,960][19107] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-06-18 21:16:47,966][19087] Signal inference workers to resume experience collection... (10500 times) [2024-06-18 21:16:47,975][19107] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-06-18 21:16:50,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3490480128. Throughput: 0: 41926.7. Samples: 714575600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 21:16:50,500][18875] Avg episode reward: [(0, '0.767')] [2024-06-18 21:16:51,298][19107] Updated weights for policy 0, policy_version 213045 (0.0039) [2024-06-18 21:16:55,334][19107] Updated weights for policy 0, policy_version 213055 (0.0041) [2024-06-18 21:16:55,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41509.1, 300 sec: 41987.5). Total num frames: 3490693120. Throughput: 0: 42264.0. Samples: 714833360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:16:55,501][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 21:16:59,040][19107] Updated weights for policy 0, policy_version 213065 (0.0035) [2024-06-18 21:17:00,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3490889728. Throughput: 0: 42014.3. 
Samples: 714955960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:00,500][18875] Avg episode reward: [(0, '0.427')] [2024-06-18 21:17:02,876][19107] Updated weights for policy 0, policy_version 213075 (0.0030) [2024-06-18 21:17:05,504][18875] Fps is (10 sec: 42583.2, 60 sec: 42322.8, 300 sec: 41987.0). Total num frames: 3491119104. Throughput: 0: 42108.7. Samples: 715207160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:05,504][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 21:17:06,720][19107] Updated weights for policy 0, policy_version 213085 (0.0040) [2024-06-18 21:17:10,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3491332096. Throughput: 0: 42243.4. Samples: 715468420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:10,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 21:17:10,586][19107] Updated weights for policy 0, policy_version 213095 (0.0035) [2024-06-18 21:17:15,022][19107] Updated weights for policy 0, policy_version 213105 (0.0045) [2024-06-18 21:17:15,500][18875] Fps is (10 sec: 40974.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3491528704. Throughput: 0: 42046.7. Samples: 715587020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:15,501][18875] Avg episode reward: [(0, '0.185')] [2024-06-18 21:17:18,591][19107] Updated weights for policy 0, policy_version 213115 (0.0036) [2024-06-18 21:17:20,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3491758080. Throughput: 0: 42011.7. Samples: 715840720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:20,501][18875] Avg episode reward: [(0, '0.448')] [2024-06-18 21:17:22,626][19107] Updated weights for policy 0, policy_version 213125 (0.0048) [2024-06-18 21:17:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41988.0). Total num frames: 3491954688. Throughput: 0: 42147.1. 
Samples: 716097180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:25,501][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 21:17:26,372][19107] Updated weights for policy 0, policy_version 213135 (0.0034) [2024-06-18 21:17:30,269][19107] Updated weights for policy 0, policy_version 213145 (0.0043) [2024-06-18 21:17:30,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 3492167680. Throughput: 0: 42093.4. Samples: 716218800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:30,502][18875] Avg episode reward: [(0, '0.526')] [2024-06-18 21:17:34,371][19107] Updated weights for policy 0, policy_version 213155 (0.0039) [2024-06-18 21:17:35,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3492397056. Throughput: 0: 42312.5. Samples: 716479660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:35,501][18875] Avg episode reward: [(0, '0.303')] [2024-06-18 21:17:37,918][19107] Updated weights for policy 0, policy_version 213165 (0.0034) [2024-06-18 21:17:40,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42043.5). Total num frames: 3492593664. Throughput: 0: 42053.8. Samples: 716725780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:40,501][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 21:17:41,926][19107] Updated weights for policy 0, policy_version 213175 (0.0035) [2024-06-18 21:17:45,500][18875] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 3492790272. Throughput: 0: 42091.9. Samples: 716850100. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:45,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 21:17:46,162][19107] Updated weights for policy 0, policy_version 213185 (0.0030) [2024-06-18 21:17:49,627][19107] Updated weights for policy 0, policy_version 213195 (0.0045) [2024-06-18 21:17:50,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3493003264. Throughput: 0: 42260.7. Samples: 717108740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:50,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 21:17:53,928][19107] Updated weights for policy 0, policy_version 213205 (0.0041) [2024-06-18 21:17:55,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3493232640. Throughput: 0: 41958.7. Samples: 717356560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 21:17:55,501][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 21:17:57,363][19107] Updated weights for policy 0, policy_version 213215 (0.0042) [2024-06-18 21:18:00,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3493412864. Throughput: 0: 42205.9. Samples: 717486280. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:00,501][18875] Avg episode reward: [(0, '0.400')] [2024-06-18 21:18:01,494][19107] Updated weights for policy 0, policy_version 213225 (0.0034) [2024-06-18 21:18:04,724][19087] Signal inference workers to stop experience collection... (10550 times) [2024-06-18 21:18:04,726][19087] Signal inference workers to resume experience collection... 
(10550 times) [2024-06-18 21:18:04,761][19107] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-06-18 21:18:04,761][19107] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-06-18 21:18:05,219][19107] Updated weights for policy 0, policy_version 213235 (0.0029) [2024-06-18 21:18:05,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42054.7, 300 sec: 42043.0). Total num frames: 3493642240. Throughput: 0: 42108.3. Samples: 717735600. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:05,501][18875] Avg episode reward: [(0, '0.419')] [2024-06-18 21:18:09,286][19107] Updated weights for policy 0, policy_version 213245 (0.0048) [2024-06-18 21:18:10,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3493871616. Throughput: 0: 41848.4. Samples: 717980360. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:10,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 21:18:13,013][19107] Updated weights for policy 0, policy_version 213255 (0.0043) [2024-06-18 21:18:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3494068224. Throughput: 0: 42083.1. Samples: 718112540. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:15,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 21:18:16,832][19107] Updated weights for policy 0, policy_version 213265 (0.0041) [2024-06-18 21:18:20,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3494281216. Throughput: 0: 41878.2. Samples: 718364180. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:20,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 21:18:20,806][19107] Updated weights for policy 0, policy_version 213275 (0.0037) [2024-06-18 21:18:24,457][19107] Updated weights for policy 0, policy_version 213285 (0.0038) [2024-06-18 21:18:25,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3494494208. Throughput: 0: 41992.5. Samples: 718615440. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:25,501][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 21:18:28,498][19107] Updated weights for policy 0, policy_version 213295 (0.0033) [2024-06-18 21:18:30,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3494690816. Throughput: 0: 42167.7. Samples: 718747640. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:30,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 21:18:32,332][19107] Updated weights for policy 0, policy_version 213305 (0.0029) [2024-06-18 21:18:35,501][18875] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 3494920192. Throughput: 0: 41988.7. Samples: 718998240. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:35,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 21:18:35,513][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000213313_3494920192.pth... [2024-06-18 21:18:35,570][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000212698_3484844032.pth [2024-06-18 21:18:36,245][19107] Updated weights for policy 0, policy_version 213315 (0.0033) [2024-06-18 21:18:40,021][19107] Updated weights for policy 0, policy_version 213325 (0.0029) [2024-06-18 21:18:40,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3495133184. Throughput: 0: 42035.4. Samples: 719248160. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:40,507][18875] Avg episode reward: [(0, '0.321')] [2024-06-18 21:18:43,841][19107] Updated weights for policy 0, policy_version 213335 (0.0039) [2024-06-18 21:18:45,500][18875] Fps is (10 sec: 39322.7, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3495313408. Throughput: 0: 42049.4. Samples: 719378500. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:45,501][18875] Avg episode reward: [(0, '0.398')] [2024-06-18 21:18:47,649][19107] Updated weights for policy 0, policy_version 213345 (0.0033) [2024-06-18 21:18:50,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 41988.0). Total num frames: 3495542784. Throughput: 0: 42064.1. Samples: 719628480. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:50,501][18875] Avg episode reward: [(0, '0.660')] [2024-06-18 21:18:51,466][19107] Updated weights for policy 0, policy_version 213355 (0.0032) [2024-06-18 21:18:55,329][19107] Updated weights for policy 0, policy_version 213365 (0.0035) [2024-06-18 21:18:55,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3495772160. Throughput: 0: 42309.5. Samples: 719884280. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:18:55,500][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 21:18:59,305][19107] Updated weights for policy 0, policy_version 213375 (0.0034) [2024-06-18 21:19:00,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3495952384. Throughput: 0: 42285.0. Samples: 720015360. Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:19:00,501][18875] Avg episode reward: [(0, '0.502')] [2024-06-18 21:19:03,132][19107] Updated weights for policy 0, policy_version 213385 (0.0029) [2024-06-18 21:19:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 3496181760. Throughput: 0: 42126.3. Samples: 720259860. 
Policy #0 lag: (min: 1.0, avg: 12.2, max: 24.0) [2024-06-18 21:19:05,500][18875] Avg episode reward: [(0, '0.422')] [2024-06-18 21:19:07,006][19107] Updated weights for policy 0, policy_version 213395 (0.0029) [2024-06-18 21:19:10,506][18875] Fps is (10 sec: 42575.5, 60 sec: 41775.5, 300 sec: 42042.2). Total num frames: 3496378368. Throughput: 0: 42364.7. Samples: 720522080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:10,506][18875] Avg episode reward: [(0, '0.670')] [2024-06-18 21:19:10,917][19107] Updated weights for policy 0, policy_version 213405 (0.0035) [2024-06-18 21:19:14,192][19087] Signal inference workers to stop experience collection... (10600 times) [2024-06-18 21:19:14,192][19087] Signal inference workers to resume experience collection... (10600 times) [2024-06-18 21:19:14,237][19107] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-06-18 21:19:14,237][19107] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-06-18 21:19:14,961][19107] Updated weights for policy 0, policy_version 213415 (0.0030) [2024-06-18 21:19:15,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3496591360. Throughput: 0: 42263.0. Samples: 720649480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:15,501][18875] Avg episode reward: [(0, '0.703')] [2024-06-18 21:19:18,672][19107] Updated weights for policy 0, policy_version 213425 (0.0043) [2024-06-18 21:19:20,500][18875] Fps is (10 sec: 42621.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3496804352. Throughput: 0: 42223.8. Samples: 720898300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:20,501][18875] Avg episode reward: [(0, '0.732')] [2024-06-18 21:19:22,590][19107] Updated weights for policy 0, policy_version 213435 (0.0031) [2024-06-18 21:19:25,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3497017344. Throughput: 0: 42439.1. 
Samples: 721157920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:25,501][18875] Avg episode reward: [(0, '0.685')] [2024-06-18 21:19:26,505][19107] Updated weights for policy 0, policy_version 213445 (0.0028) [2024-06-18 21:19:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3497230336. Throughput: 0: 42202.6. Samples: 721277620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:30,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 21:19:30,702][19107] Updated weights for policy 0, policy_version 213455 (0.0024) [2024-06-18 21:19:34,684][19107] Updated weights for policy 0, policy_version 213465 (0.0042) [2024-06-18 21:19:35,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3497459712. Throughput: 0: 42150.6. Samples: 721525260. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:35,501][18875] Avg episode reward: [(0, '0.750')] [2024-06-18 21:19:38,377][19107] Updated weights for policy 0, policy_version 213475 (0.0029) [2024-06-18 21:19:40,504][18875] Fps is (10 sec: 39307.3, 60 sec: 41503.7, 300 sec: 41931.4). Total num frames: 3497623552. Throughput: 0: 42168.9. Samples: 721782040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:40,505][18875] Avg episode reward: [(0, '0.736')] [2024-06-18 21:19:42,490][19107] Updated weights for policy 0, policy_version 213485 (0.0036) [2024-06-18 21:19:45,500][18875] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3497869312. Throughput: 0: 41889.0. Samples: 721900360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:45,501][18875] Avg episode reward: [(0, '0.684')] [2024-06-18 21:19:46,528][19107] Updated weights for policy 0, policy_version 213495 (0.0036) [2024-06-18 21:19:50,297][19107] Updated weights for policy 0, policy_version 213505 (0.0035) [2024-06-18 21:19:50,500][18875] Fps is (10 sec: 44252.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3498065920. Throughput: 0: 42111.5. Samples: 722154880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:50,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 21:19:54,142][19107] Updated weights for policy 0, policy_version 213515 (0.0025) [2024-06-18 21:19:55,500][18875] Fps is (10 sec: 37682.5, 60 sec: 41232.9, 300 sec: 41931.9). Total num frames: 3498246144. Throughput: 0: 41878.2. Samples: 722406380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:19:55,501][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 21:19:58,372][19107] Updated weights for policy 0, policy_version 213525 (0.0037) [2024-06-18 21:20:00,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3498491904. Throughput: 0: 41936.1. Samples: 722536600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:20:00,501][18875] Avg episode reward: [(0, '0.661')] [2024-06-18 21:20:01,833][19107] Updated weights for policy 0, policy_version 213535 (0.0032) [2024-06-18 21:20:05,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3498688512. Throughput: 0: 41847.5. Samples: 722781440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:20:05,501][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 21:20:06,184][19107] Updated weights for policy 0, policy_version 213545 (0.0038) [2024-06-18 21:20:09,386][19107] Updated weights for policy 0, policy_version 213555 (0.0032) [2024-06-18 21:20:10,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41782.9, 300 sec: 41987.5). Total num frames: 3498885120. Throughput: 0: 41688.5. Samples: 723033900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:20:10,510][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 21:20:14,163][19107] Updated weights for policy 0, policy_version 213565 (0.0043) [2024-06-18 21:20:15,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3499130880. Throughput: 0: 41808.9. Samples: 723159020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:20:15,501][18875] Avg episode reward: [(0, '0.386')] [2024-06-18 21:20:17,728][19107] Updated weights for policy 0, policy_version 213575 (0.0044) [2024-06-18 21:20:20,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3499311104. Throughput: 0: 41821.0. Samples: 723407200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:20:20,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 21:20:21,886][19107] Updated weights for policy 0, policy_version 213585 (0.0046) [2024-06-18 21:20:25,343][19107] Updated weights for policy 0, policy_version 213595 (0.0039) [2024-06-18 21:20:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3499540480. Throughput: 0: 41765.7. Samples: 723661340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:20:25,501][18875] Avg episode reward: [(0, '0.300')] [2024-06-18 21:20:29,698][19107] Updated weights for policy 0, policy_version 213605 (0.0043) [2024-06-18 21:20:30,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 3499737088. Throughput: 0: 41997.2. Samples: 723790240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:20:30,501][18875] Avg episode reward: [(0, '0.592')] [2024-06-18 21:20:31,689][19087] Signal inference workers to stop experience collection... (10650 times) [2024-06-18 21:20:31,690][19087] Signal inference workers to resume experience collection... (10650 times) [2024-06-18 21:20:31,716][19107] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-06-18 21:20:31,716][19107] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-06-18 21:20:33,015][19107] Updated weights for policy 0, policy_version 213615 (0.0033) [2024-06-18 21:20:35,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3499950080. Throughput: 0: 41891.1. Samples: 724039980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:20:35,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 21:20:35,519][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000213620_3499950080.pth... [2024-06-18 21:20:35,582][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000213005_3489873920.pth [2024-06-18 21:20:37,395][19107] Updated weights for policy 0, policy_version 213625 (0.0050) [2024-06-18 21:20:40,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42600.9, 300 sec: 42098.6). Total num frames: 3500179456. Throughput: 0: 41767.1. Samples: 724285900. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:20:40,501][18875] Avg episode reward: [(0, '0.728')] [2024-06-18 21:20:40,834][19107] Updated weights for policy 0, policy_version 213635 (0.0029) [2024-06-18 21:20:45,196][19107] Updated weights for policy 0, policy_version 213645 (0.0032) [2024-06-18 21:20:45,501][18875] Fps is (10 sec: 40959.2, 60 sec: 41505.9, 300 sec: 42043.0). Total num frames: 3500359680. Throughput: 0: 41804.1. Samples: 724417800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:20:45,501][18875] Avg episode reward: [(0, '0.696')] [2024-06-18 21:20:48,665][19107] Updated weights for policy 0, policy_version 213655 (0.0039) [2024-06-18 21:20:50,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41988.1). Total num frames: 3500589056. Throughput: 0: 42044.5. Samples: 724673440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:20:50,500][18875] Avg episode reward: [(0, '0.722')] [2024-06-18 21:20:53,108][19107] Updated weights for policy 0, policy_version 213665 (0.0048) [2024-06-18 21:20:55,504][18875] Fps is (10 sec: 44221.9, 60 sec: 42595.9, 300 sec: 42098.0). Total num frames: 3500802048. Throughput: 0: 42026.0. Samples: 724925220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:20:55,504][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 21:20:56,536][19107] Updated weights for policy 0, policy_version 213675 (0.0042) [2024-06-18 21:21:00,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3500982272. Throughput: 0: 41963.2. Samples: 725047360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:21:00,500][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 21:21:01,023][19107] Updated weights for policy 0, policy_version 213685 (0.0032) [2024-06-18 21:21:04,046][19107] Updated weights for policy 0, policy_version 213695 (0.0035) [2024-06-18 21:21:05,500][18875] Fps is (10 sec: 42614.1, 60 sec: 42325.4, 300 sec: 42043.0). 
Total num frames: 3501228032. Throughput: 0: 42110.3. Samples: 725302160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:21:05,500][18875] Avg episode reward: [(0, '0.319')] [2024-06-18 21:21:08,745][19107] Updated weights for policy 0, policy_version 213705 (0.0043) [2024-06-18 21:21:10,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3501424640. Throughput: 0: 42215.5. Samples: 725561040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:21:10,501][18875] Avg episode reward: [(0, '0.404')] [2024-06-18 21:21:12,014][19107] Updated weights for policy 0, policy_version 213715 (0.0034) [2024-06-18 21:21:15,500][18875] Fps is (10 sec: 40959.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3501637632. Throughput: 0: 42044.0. Samples: 725682220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:21:15,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 21:21:16,277][19107] Updated weights for policy 0, policy_version 213725 (0.0031) [2024-06-18 21:21:19,669][19107] Updated weights for policy 0, policy_version 213735 (0.0060) [2024-06-18 21:21:20,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 3501867008. Throughput: 0: 42322.7. Samples: 725944500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:21:20,501][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 21:21:23,930][19107] Updated weights for policy 0, policy_version 213745 (0.0026) [2024-06-18 21:21:25,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3502063616. Throughput: 0: 42315.2. Samples: 726190080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:21:25,501][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 21:21:27,406][19107] Updated weights for policy 0, policy_version 213755 (0.0037) [2024-06-18 21:21:30,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 41987.5). 
Total num frames: 3502260224. Throughput: 0: 42108.2. Samples: 726312660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:21:30,501][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 21:21:32,039][19107] Updated weights for policy 0, policy_version 213765 (0.0038) [2024-06-18 21:21:35,057][19107] Updated weights for policy 0, policy_version 213775 (0.0034) [2024-06-18 21:21:35,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3502505984. Throughput: 0: 42220.7. Samples: 726573380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:21:35,501][18875] Avg episode reward: [(0, '0.479')] [2024-06-18 21:21:39,599][19107] Updated weights for policy 0, policy_version 213785 (0.0035) [2024-06-18 21:21:40,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42052.4, 300 sec: 42209.7). Total num frames: 3502702592. Throughput: 0: 42117.3. Samples: 726820340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:21:40,500][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 21:21:42,817][19107] Updated weights for policy 0, policy_version 213795 (0.0043) [2024-06-18 21:21:45,500][18875] Fps is (10 sec: 37683.6, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3502882816. Throughput: 0: 42229.6. Samples: 726947700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:21:45,501][18875] Avg episode reward: [(0, '0.524')] [2024-06-18 21:21:47,157][19107] Updated weights for policy 0, policy_version 213805 (0.0041) [2024-06-18 21:21:50,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3503128576. Throughput: 0: 42393.7. Samples: 727209880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:21:50,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 21:21:50,727][19107] Updated weights for policy 0, policy_version 213815 (0.0029) [2024-06-18 21:21:54,909][19107] Updated weights for policy 0, policy_version 213825 (0.0041) [2024-06-18 21:21:55,504][18875] Fps is (10 sec: 44221.1, 60 sec: 42052.3, 300 sec: 42153.6). Total num frames: 3503325184. Throughput: 0: 42170.8. Samples: 727458880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:21:55,504][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 21:21:58,404][19107] Updated weights for policy 0, policy_version 213835 (0.0023) [2024-06-18 21:22:00,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42043.5). Total num frames: 3503521792. Throughput: 0: 42313.0. Samples: 727586300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:22:00,501][18875] Avg episode reward: [(0, '0.548')] [2024-06-18 21:22:02,672][19107] Updated weights for policy 0, policy_version 213845 (0.0039) [2024-06-18 21:22:05,500][18875] Fps is (10 sec: 42614.1, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3503751168. Throughput: 0: 42205.4. Samples: 727843740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:22:05,500][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 21:22:06,375][19107] Updated weights for policy 0, policy_version 213855 (0.0028) [2024-06-18 21:22:10,384][19107] Updated weights for policy 0, policy_version 213865 (0.0042) [2024-06-18 21:22:10,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3503964160. Throughput: 0: 42326.7. Samples: 728094780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:22:10,501][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 21:22:11,188][19087] Signal inference workers to stop experience collection... 
(10700 times) [2024-06-18 21:22:11,224][19107] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-06-18 21:22:11,235][19087] Signal inference workers to resume experience collection... (10700 times) [2024-06-18 21:22:11,244][19107] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-06-18 21:22:14,148][19107] Updated weights for policy 0, policy_version 213875 (0.0029) [2024-06-18 21:22:15,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3504177152. Throughput: 0: 42400.8. Samples: 728220700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:22:15,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 21:22:18,275][19107] Updated weights for policy 0, policy_version 213885 (0.0044) [2024-06-18 21:22:20,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3504373760. Throughput: 0: 42141.5. Samples: 728469740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:22:20,501][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 21:22:21,751][19107] Updated weights for policy 0, policy_version 213895 (0.0034) [2024-06-18 21:22:25,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3504586752. Throughput: 0: 42359.4. Samples: 728726520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:22:25,501][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 21:22:26,052][19107] Updated weights for policy 0, policy_version 213905 (0.0038) [2024-06-18 21:22:29,357][19107] Updated weights for policy 0, policy_version 213915 (0.0041) [2024-06-18 21:22:30,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3504799744. Throughput: 0: 42325.3. Samples: 728852340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:22:30,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 21:22:33,699][19107] Updated weights for policy 0, policy_version 213925 (0.0042) [2024-06-18 21:22:35,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 3505012736. Throughput: 0: 42016.0. Samples: 729100600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:22:35,501][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 21:22:35,599][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000213930_3505029120.pth... [2024-06-18 21:22:35,642][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000213313_3494920192.pth [2024-06-18 21:22:37,052][19107] Updated weights for policy 0, policy_version 213935 (0.0041) [2024-06-18 21:22:40,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.0, 300 sec: 42098.5). Total num frames: 3505209344. Throughput: 0: 42140.1. Samples: 729355040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:22:40,501][18875] Avg episode reward: [(0, '0.436')] [2024-06-18 21:22:41,743][19107] Updated weights for policy 0, policy_version 213945 (0.0037) [2024-06-18 21:22:45,015][19107] Updated weights for policy 0, policy_version 213955 (0.0048) [2024-06-18 21:22:45,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 3505455104. Throughput: 0: 42214.7. Samples: 729485960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:22:45,501][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 21:22:49,632][19107] Updated weights for policy 0, policy_version 213965 (0.0044) [2024-06-18 21:22:50,504][18875] Fps is (10 sec: 44221.7, 60 sec: 42049.8, 300 sec: 42098.0). Total num frames: 3505651712. Throughput: 0: 42045.9. Samples: 729735960. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:22:50,504][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 21:22:52,766][19107] Updated weights for policy 0, policy_version 213975 (0.0035) [2024-06-18 21:22:55,500][18875] Fps is (10 sec: 39321.1, 60 sec: 42054.7, 300 sec: 42154.1). Total num frames: 3505848320. Throughput: 0: 42067.1. Samples: 729987800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:22:55,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 21:22:57,438][19107] Updated weights for policy 0, policy_version 213985 (0.0042) [2024-06-18 21:23:00,402][19107] Updated weights for policy 0, policy_version 213995 (0.0038) [2024-06-18 21:23:00,500][18875] Fps is (10 sec: 44252.6, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 3506094080. Throughput: 0: 42034.3. Samples: 730112240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:23:00,501][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 21:23:05,119][19107] Updated weights for policy 0, policy_version 214005 (0.0024) [2024-06-18 21:23:05,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3506290688. Throughput: 0: 42325.8. Samples: 730374400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:23:05,501][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 21:23:08,115][19107] Updated weights for policy 0, policy_version 214015 (0.0042) [2024-06-18 21:23:10,500][18875] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3506470912. Throughput: 0: 42222.3. Samples: 730626520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:23:10,501][18875] Avg episode reward: [(0, '0.720')] [2024-06-18 21:23:12,897][19107] Updated weights for policy 0, policy_version 214025 (0.0037) [2024-06-18 21:23:15,500][18875] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3506716672. Throughput: 0: 42168.0. Samples: 730749900. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:23:15,501][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 21:23:15,946][19107] Updated weights for policy 0, policy_version 214035 (0.0055) [2024-06-18 21:23:20,483][19107] Updated weights for policy 0, policy_version 214045 (0.0035) [2024-06-18 21:23:20,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3506913280. Throughput: 0: 42315.6. Samples: 731004800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:23:20,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 21:23:23,655][19107] Updated weights for policy 0, policy_version 214055 (0.0032) [2024-06-18 21:23:25,500][18875] Fps is (10 sec: 39322.4, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3507109888. Throughput: 0: 42354.9. Samples: 731261000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:23:25,500][18875] Avg episode reward: [(0, '0.668')] [2024-06-18 21:23:28,275][19107] Updated weights for policy 0, policy_version 214065 (0.0036) [2024-06-18 21:23:30,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3507355648. Throughput: 0: 42287.5. Samples: 731388900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:23:30,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 21:23:31,386][19107] Updated weights for policy 0, policy_version 214075 (0.0035) [2024-06-18 21:23:35,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3507535872. Throughput: 0: 42303.3. Samples: 731639460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 21:23:35,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 21:23:35,823][19107] Updated weights for policy 0, policy_version 214085 (0.0034) [2024-06-18 21:23:38,220][19087] Signal inference workers to stop experience collection... 
(10750 times) [2024-06-18 21:23:38,220][19087] Signal inference workers to resume experience collection... (10750 times) [2024-06-18 21:23:38,263][19107] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-06-18 21:23:38,263][19107] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-06-18 21:23:39,752][19107] Updated weights for policy 0, policy_version 214095 (0.0049) [2024-06-18 21:23:40,500][18875] Fps is (10 sec: 39321.1, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3507748864. Throughput: 0: 42288.0. Samples: 731890760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:23:40,501][18875] Avg episode reward: [(0, '0.385')] [2024-06-18 21:23:43,740][19107] Updated weights for policy 0, policy_version 214105 (0.0036) [2024-06-18 21:23:45,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3507961856. Throughput: 0: 42377.0. Samples: 732019200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:23:45,501][18875] Avg episode reward: [(0, '0.620')] [2024-06-18 21:23:47,407][19107] Updated weights for policy 0, policy_version 214115 (0.0038) [2024-06-18 21:23:50,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41781.6, 300 sec: 41987.4). Total num frames: 3508158464. Throughput: 0: 42015.4. Samples: 732265100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:23:50,501][18875] Avg episode reward: [(0, '0.297')] [2024-06-18 21:23:51,585][19107] Updated weights for policy 0, policy_version 214125 (0.0045) [2024-06-18 21:23:55,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3508371456. Throughput: 0: 42018.7. Samples: 732517360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:23:55,500][18875] Avg episode reward: [(0, '0.665')] [2024-06-18 21:23:55,529][19107] Updated weights for policy 0, policy_version 214135 (0.0036) [2024-06-18 21:23:59,246][19107] Updated weights for policy 0, policy_version 214145 (0.0035) [2024-06-18 21:24:00,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3508584448. Throughput: 0: 42158.3. Samples: 732647020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:00,509][18875] Avg episode reward: [(0, '0.638')] [2024-06-18 21:24:03,202][19107] Updated weights for policy 0, policy_version 214155 (0.0035) [2024-06-18 21:24:05,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42099.3). Total num frames: 3508797440. Throughput: 0: 41991.6. Samples: 732894420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:05,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 21:24:07,279][19107] Updated weights for policy 0, policy_version 214165 (0.0035) [2024-06-18 21:24:10,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3509010432. Throughput: 0: 41991.1. Samples: 733150600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:10,501][18875] Avg episode reward: [(0, '0.608')] [2024-06-18 21:24:10,881][19107] Updated weights for policy 0, policy_version 214175 (0.0037) [2024-06-18 21:24:15,047][19107] Updated weights for policy 0, policy_version 214185 (0.0037) [2024-06-18 21:24:15,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3509207040. Throughput: 0: 41913.3. Samples: 733275000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:15,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 21:24:18,792][19107] Updated weights for policy 0, policy_version 214195 (0.0040) [2024-06-18 21:24:20,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3509436416. Throughput: 0: 41750.8. Samples: 733518240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:20,500][18875] Avg episode reward: [(0, '0.680')] [2024-06-18 21:24:23,198][19107] Updated weights for policy 0, policy_version 214205 (0.0036) [2024-06-18 21:24:25,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3509649408. Throughput: 0: 41820.6. Samples: 733772680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:25,500][18875] Avg episode reward: [(0, '0.636')] [2024-06-18 21:24:26,414][19107] Updated weights for policy 0, policy_version 214215 (0.0040) [2024-06-18 21:24:30,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 41932.0). Total num frames: 3509829632. Throughput: 0: 41793.3. Samples: 733899900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:30,501][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 21:24:30,910][19107] Updated weights for policy 0, policy_version 214225 (0.0029) [2024-06-18 21:24:34,359][19107] Updated weights for policy 0, policy_version 214235 (0.0039) [2024-06-18 21:24:35,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42154.6). Total num frames: 3510059008. Throughput: 0: 41856.1. Samples: 734148620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:35,501][18875] Avg episode reward: [(0, '0.732')] [2024-06-18 21:24:35,526][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000214238_3510075392.pth... 
[2024-06-18 21:24:35,574][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000213620_3499950080.pth [2024-06-18 21:24:38,723][19107] Updated weights for policy 0, policy_version 214245 (0.0030) [2024-06-18 21:24:40,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3510272000. Throughput: 0: 41874.2. Samples: 734401700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:40,501][18875] Avg episode reward: [(0, '0.698')] [2024-06-18 21:24:42,166][19107] Updated weights for policy 0, policy_version 214255 (0.0041) [2024-06-18 21:24:45,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 3510452224. Throughput: 0: 41655.0. Samples: 734521500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 21:24:45,501][18875] Avg episode reward: [(0, '0.685')] [2024-06-18 21:24:46,714][19107] Updated weights for policy 0, policy_version 214265 (0.0044) [2024-06-18 21:24:49,996][19107] Updated weights for policy 0, policy_version 214275 (0.0028) [2024-06-18 21:24:50,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3510697984. Throughput: 0: 41755.1. Samples: 734773400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:24:50,501][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 21:24:54,597][19107] Updated weights for policy 0, policy_version 214285 (0.0037) [2024-06-18 21:24:55,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3510894592. Throughput: 0: 41680.9. Samples: 735026240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:24:55,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 21:24:57,768][19107] Updated weights for policy 0, policy_version 214295 (0.0022) [2024-06-18 21:25:00,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3511091200. Throughput: 0: 41569.8. Samples: 735145640. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:00,501][18875] Avg episode reward: [(0, '0.660')] [2024-06-18 21:25:02,358][19107] Updated weights for policy 0, policy_version 214305 (0.0042) [2024-06-18 21:25:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3511320576. Throughput: 0: 41874.1. Samples: 735402580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:05,501][18875] Avg episode reward: [(0, '0.691')] [2024-06-18 21:25:05,619][19107] Updated weights for policy 0, policy_version 214315 (0.0032) [2024-06-18 21:25:10,391][19107] Updated weights for policy 0, policy_version 214325 (0.0047) [2024-06-18 21:25:10,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 3511500800. Throughput: 0: 41761.5. Samples: 735651960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:10,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 21:25:13,489][19107] Updated weights for policy 0, policy_version 214335 (0.0036) [2024-06-18 21:25:14,113][19087] Signal inference workers to stop experience collection... (10800 times) [2024-06-18 21:25:14,147][19107] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-06-18 21:25:14,159][19087] Signal inference workers to resume experience collection... (10800 times) [2024-06-18 21:25:14,176][19107] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-06-18 21:25:15,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3511713792. Throughput: 0: 41535.9. Samples: 735769020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:15,501][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 21:25:18,006][19107] Updated weights for policy 0, policy_version 214345 (0.0030) [2024-06-18 21:25:20,500][18875] Fps is (10 sec: 42599.4, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3511926784. Throughput: 0: 41744.5. 
Samples: 736027120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:20,500][18875] Avg episode reward: [(0, '0.724')] [2024-06-18 21:25:21,261][19107] Updated weights for policy 0, policy_version 214355 (0.0038) [2024-06-18 21:25:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41232.9, 300 sec: 41987.5). Total num frames: 3512123392. Throughput: 0: 41591.9. Samples: 736273340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:25,501][18875] Avg episode reward: [(0, '0.806')] [2024-06-18 21:25:25,856][19107] Updated weights for policy 0, policy_version 214365 (0.0046) [2024-06-18 21:25:29,167][19107] Updated weights for policy 0, policy_version 214375 (0.0036) [2024-06-18 21:25:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3512352768. Throughput: 0: 41717.0. Samples: 736398760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:30,501][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 21:25:33,953][19107] Updated weights for policy 0, policy_version 214385 (0.0036) [2024-06-18 21:25:35,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3512549376. Throughput: 0: 41773.6. Samples: 736653220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:35,501][18875] Avg episode reward: [(0, '0.614')] [2024-06-18 21:25:36,846][19107] Updated weights for policy 0, policy_version 214395 (0.0031) [2024-06-18 21:25:40,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3512762368. Throughput: 0: 41680.9. Samples: 736901880. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:40,501][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 21:25:41,777][19107] Updated weights for policy 0, policy_version 214405 (0.0042) [2024-06-18 21:25:44,752][19107] Updated weights for policy 0, policy_version 214415 (0.0037) [2024-06-18 21:25:45,500][18875] Fps is (10 sec: 44237.8, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 3512991744. Throughput: 0: 41920.6. Samples: 737032060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:45,500][18875] Avg episode reward: [(0, '0.551')] [2024-06-18 21:25:49,383][19107] Updated weights for policy 0, policy_version 214425 (0.0035) [2024-06-18 21:25:50,504][18875] Fps is (10 sec: 42583.3, 60 sec: 41503.6, 300 sec: 41987.5). Total num frames: 3513188352. Throughput: 0: 41902.0. Samples: 737288320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:50,504][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 21:25:52,668][19107] Updated weights for policy 0, policy_version 214435 (0.0028) [2024-06-18 21:25:55,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3513401344. Throughput: 0: 41865.0. Samples: 737535880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 25.0) [2024-06-18 21:25:55,501][18875] Avg episode reward: [(0, '0.351')] [2024-06-18 21:25:57,169][19107] Updated weights for policy 0, policy_version 214445 (0.0036) [2024-06-18 21:26:00,435][19107] Updated weights for policy 0, policy_version 214455 (0.0034) [2024-06-18 21:26:00,500][18875] Fps is (10 sec: 44252.8, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3513630720. Throughput: 0: 42116.5. Samples: 737664260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:00,501][18875] Avg episode reward: [(0, '0.341')] [2024-06-18 21:26:04,967][19107] Updated weights for policy 0, policy_version 214465 (0.0038) [2024-06-18 21:26:05,504][18875] Fps is (10 sec: 40945.4, 60 sec: 41503.6, 300 sec: 41987.0). Total num frames: 3513810944. Throughput: 0: 41973.9. Samples: 737916100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:05,504][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 21:26:08,110][19107] Updated weights for policy 0, policy_version 214475 (0.0031) [2024-06-18 21:26:10,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3514040320. Throughput: 0: 42147.7. Samples: 738169980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:10,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 21:26:12,838][19107] Updated weights for policy 0, policy_version 214485 (0.0029) [2024-06-18 21:26:15,500][18875] Fps is (10 sec: 44252.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3514253312. Throughput: 0: 42125.3. Samples: 738294400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:15,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 21:26:15,944][19107] Updated weights for policy 0, policy_version 214495 (0.0035) [2024-06-18 21:26:20,387][19107] Updated weights for policy 0, policy_version 214505 (0.0034) [2024-06-18 21:26:20,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3514449920. Throughput: 0: 42079.2. Samples: 738546780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:20,501][18875] Avg episode reward: [(0, '0.678')] [2024-06-18 21:26:23,779][19107] Updated weights for policy 0, policy_version 214515 (0.0030) [2024-06-18 21:26:25,504][18875] Fps is (10 sec: 40945.3, 60 sec: 42322.8, 300 sec: 42042.5). Total num frames: 3514662912. Throughput: 0: 42123.7. Samples: 738797600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:25,505][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 21:26:27,936][19107] Updated weights for policy 0, policy_version 214525 (0.0029) [2024-06-18 21:26:30,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3514892288. Throughput: 0: 42018.1. Samples: 738922880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:30,501][18875] Avg episode reward: [(0, '0.310')] [2024-06-18 21:26:31,502][19107] Updated weights for policy 0, policy_version 214535 (0.0038) [2024-06-18 21:26:35,500][18875] Fps is (10 sec: 42613.5, 60 sec: 42325.3, 300 sec: 41987.4). Total num frames: 3515088896. Throughput: 0: 42033.5. Samples: 739179680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:35,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 21:26:35,529][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000214544_3515088896.pth... [2024-06-18 21:26:35,589][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000213930_3505029120.pth [2024-06-18 21:26:35,875][19107] Updated weights for policy 0, policy_version 214545 (0.0039) [2024-06-18 21:26:39,321][19107] Updated weights for policy 0, policy_version 214555 (0.0029) [2024-06-18 21:26:40,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3515301888. Throughput: 0: 42107.2. Samples: 739430700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:40,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 21:26:43,283][19087] Signal inference workers to stop experience collection... (10850 times) [2024-06-18 21:26:43,284][19087] Signal inference workers to resume experience collection... 
(10850 times) [2024-06-18 21:26:43,298][19107] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-06-18 21:26:43,299][19107] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-06-18 21:26:43,441][19107] Updated weights for policy 0, policy_version 214565 (0.0025) [2024-06-18 21:26:45,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3515498496. Throughput: 0: 42092.5. Samples: 739558420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:45,500][18875] Avg episode reward: [(0, '0.248')] [2024-06-18 21:26:47,334][19107] Updated weights for policy 0, policy_version 214575 (0.0035) [2024-06-18 21:26:50,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42327.9, 300 sec: 42043.5). Total num frames: 3515727872. Throughput: 0: 42012.7. Samples: 739806520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:50,501][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 21:26:51,335][19107] Updated weights for policy 0, policy_version 214585 (0.0038) [2024-06-18 21:26:54,959][19107] Updated weights for policy 0, policy_version 214595 (0.0040) [2024-06-18 21:26:55,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3515924480. Throughput: 0: 41997.0. Samples: 740059840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:26:55,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 21:26:59,013][19107] Updated weights for policy 0, policy_version 214605 (0.0037) [2024-06-18 21:27:00,504][18875] Fps is (10 sec: 40945.3, 60 sec: 41776.7, 300 sec: 41986.9). Total num frames: 3516137472. Throughput: 0: 42206.0. Samples: 740193820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:27:00,504][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 21:27:02,608][19107] Updated weights for policy 0, policy_version 214615 (0.0037) [2024-06-18 21:27:05,504][18875] Fps is (10 sec: 42582.6, 60 sec: 42325.3, 300 sec: 41987.0). Total num frames: 3516350464. Throughput: 0: 42164.6. Samples: 740444340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:05,505][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 21:27:06,550][19107] Updated weights for policy 0, policy_version 214625 (0.0024) [2024-06-18 21:27:10,448][19107] Updated weights for policy 0, policy_version 214635 (0.0032) [2024-06-18 21:27:10,500][18875] Fps is (10 sec: 44252.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3516579840. Throughput: 0: 42331.9. Samples: 740702380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:10,501][18875] Avg episode reward: [(0, '0.551')] [2024-06-18 21:27:14,593][19107] Updated weights for policy 0, policy_version 214645 (0.0030) [2024-06-18 21:27:15,500][18875] Fps is (10 sec: 42614.2, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3516776448. Throughput: 0: 42297.9. Samples: 740826280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:15,500][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 21:27:18,086][19107] Updated weights for policy 0, policy_version 214655 (0.0029) [2024-06-18 21:27:20,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3516973056. Throughput: 0: 42125.9. Samples: 741075340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:20,501][18875] Avg episode reward: [(0, '0.493')] [2024-06-18 21:27:22,533][19107] Updated weights for policy 0, policy_version 214665 (0.0028) [2024-06-18 21:27:25,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42327.9, 300 sec: 42043.0). Total num frames: 3517202432. Throughput: 0: 42228.9. Samples: 741331000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:25,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 21:27:25,701][19107] Updated weights for policy 0, policy_version 214675 (0.0034) [2024-06-18 21:27:30,272][19107] Updated weights for policy 0, policy_version 214685 (0.0043) [2024-06-18 21:27:30,504][18875] Fps is (10 sec: 42583.1, 60 sec: 41776.7, 300 sec: 41987.0). Total num frames: 3517399040. Throughput: 0: 42206.3. Samples: 741457860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:30,505][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 21:27:33,423][19107] Updated weights for policy 0, policy_version 214695 (0.0032) [2024-06-18 21:27:35,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3517612032. Throughput: 0: 42255.1. Samples: 741708000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:35,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 21:27:37,843][19107] Updated weights for policy 0, policy_version 214705 (0.0049) [2024-06-18 21:27:40,500][18875] Fps is (10 sec: 44252.8, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3517841408. Throughput: 0: 42246.6. Samples: 741960940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:40,501][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 21:27:41,196][19107] Updated weights for policy 0, policy_version 214715 (0.0040) [2024-06-18 21:27:45,498][19107] Updated weights for policy 0, policy_version 214725 (0.0050) [2024-06-18 21:27:45,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42043.5). Total num frames: 3518054400. Throughput: 0: 42279.3. Samples: 742096240. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:45,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 21:27:48,851][19107] Updated weights for policy 0, policy_version 214735 (0.0030) [2024-06-18 21:27:50,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3518251008. Throughput: 0: 42214.5. Samples: 742343840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:50,501][18875] Avg episode reward: [(0, '0.440')] [2024-06-18 21:27:53,262][19107] Updated weights for policy 0, policy_version 214745 (0.0038) [2024-06-18 21:27:55,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 3518480384. Throughput: 0: 42171.1. Samples: 742600080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:27:55,501][18875] Avg episode reward: [(0, '0.403')] [2024-06-18 21:27:57,146][19107] Updated weights for policy 0, policy_version 214755 (0.0040) [2024-06-18 21:28:00,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42327.8, 300 sec: 41987.4). Total num frames: 3518676992. Throughput: 0: 42389.1. Samples: 742733800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:28:00,501][18875] Avg episode reward: [(0, '0.283')] [2024-06-18 21:28:00,896][19107] Updated weights for policy 0, policy_version 214765 (0.0038) [2024-06-18 21:28:05,011][19107] Updated weights for policy 0, policy_version 214775 (0.0030) [2024-06-18 21:28:05,500][18875] Fps is (10 sec: 39321.3, 60 sec: 42054.7, 300 sec: 42043.0). Total num frames: 3518873600. Throughput: 0: 42326.5. Samples: 742980040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:28:05,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 21:28:05,734][19087] Signal inference workers to stop experience collection... 
(10900 times) [2024-06-18 21:28:05,780][19107] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-06-18 21:28:05,851][19087] Signal inference workers to resume experience collection... (10900 times) [2024-06-18 21:28:05,851][19107] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-06-18 21:28:09,047][19107] Updated weights for policy 0, policy_version 214785 (0.0035) [2024-06-18 21:28:10,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3519119360. Throughput: 0: 42329.2. Samples: 743235820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:28:10,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 21:28:12,782][19107] Updated weights for policy 0, policy_version 214795 (0.0037) [2024-06-18 21:28:15,500][18875] Fps is (10 sec: 45875.6, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 3519332352. Throughput: 0: 42401.1. Samples: 743365760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:28:15,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 21:28:16,582][19107] Updated weights for policy 0, policy_version 214805 (0.0030) [2024-06-18 21:28:20,379][19107] Updated weights for policy 0, policy_version 214815 (0.0038) [2024-06-18 21:28:20,504][18875] Fps is (10 sec: 40945.5, 60 sec: 42595.9, 300 sec: 42098.0). Total num frames: 3519528960. Throughput: 0: 42478.8. Samples: 743619700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:28:20,504][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 21:28:24,374][19107] Updated weights for policy 0, policy_version 214825 (0.0029) [2024-06-18 21:28:25,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3519741952. Throughput: 0: 42456.5. Samples: 743871480. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:28:25,500][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 21:28:27,880][19107] Updated weights for policy 0, policy_version 214835 (0.0027) [2024-06-18 21:28:30,500][18875] Fps is (10 sec: 40974.4, 60 sec: 42327.8, 300 sec: 42043.0). Total num frames: 3519938560. Throughput: 0: 42244.9. Samples: 743997260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:28:30,501][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 21:28:31,954][19107] Updated weights for policy 0, policy_version 214845 (0.0029) [2024-06-18 21:28:35,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3520167936. Throughput: 0: 42260.1. Samples: 744245540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:28:35,500][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 21:28:35,512][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000214854_3520167936.pth... [2024-06-18 21:28:35,577][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000214238_3510075392.pth [2024-06-18 21:28:35,723][19107] Updated weights for policy 0, policy_version 214855 (0.0033) [2024-06-18 21:28:39,607][19107] Updated weights for policy 0, policy_version 214865 (0.0030) [2024-06-18 21:28:40,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3520380928. Throughput: 0: 42266.3. Samples: 744502060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:28:40,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 21:28:43,591][19107] Updated weights for policy 0, policy_version 214875 (0.0027) [2024-06-18 21:28:45,500][18875] Fps is (10 sec: 39320.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3520561152. Throughput: 0: 42033.8. Samples: 744625320. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:28:45,501][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 21:28:47,443][19107] Updated weights for policy 0, policy_version 214885 (0.0039) [2024-06-18 21:28:50,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3520806912. Throughput: 0: 42058.4. Samples: 744872660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:28:50,501][18875] Avg episode reward: [(0, '0.257')] [2024-06-18 21:28:51,735][19107] Updated weights for policy 0, policy_version 214895 (0.0033) [2024-06-18 21:28:55,462][19107] Updated weights for policy 0, policy_version 214905 (0.0040) [2024-06-18 21:28:55,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3521003520. Throughput: 0: 42200.9. Samples: 745134860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:28:55,501][18875] Avg episode reward: [(0, '0.257')] [2024-06-18 21:28:59,113][19107] Updated weights for policy 0, policy_version 214915 (0.0029) [2024-06-18 21:29:00,500][18875] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3521200128. Throughput: 0: 42123.5. Samples: 745261320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:00,501][18875] Avg episode reward: [(0, '0.419')] [2024-06-18 21:29:03,014][19107] Updated weights for policy 0, policy_version 214925 (0.0038) [2024-06-18 21:29:05,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42154.1). Total num frames: 3521445888. Throughput: 0: 42182.9. Samples: 745517780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:05,501][18875] Avg episode reward: [(0, '0.574')] [2024-06-18 21:29:06,565][19107] Updated weights for policy 0, policy_version 214935 (0.0031) [2024-06-18 21:29:10,501][18875] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 3521642496. Throughput: 0: 42319.2. Samples: 745775860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:10,501][18875] Avg episode reward: [(0, '0.424')] [2024-06-18 21:29:10,686][19107] Updated weights for policy 0, policy_version 214945 (0.0041) [2024-06-18 21:29:14,096][19107] Updated weights for policy 0, policy_version 214955 (0.0027) [2024-06-18 21:29:15,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3521839104. Throughput: 0: 42308.0. Samples: 745901120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:15,501][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 21:29:18,323][19107] Updated weights for policy 0, policy_version 214965 (0.0042) [2024-06-18 21:29:20,500][18875] Fps is (10 sec: 44238.3, 60 sec: 42601.0, 300 sec: 42154.1). Total num frames: 3522084864. Throughput: 0: 42390.7. Samples: 746153120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:20,501][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 21:29:22,175][19107] Updated weights for policy 0, policy_version 214975 (0.0029) [2024-06-18 21:29:25,504][18875] Fps is (10 sec: 40945.8, 60 sec: 41776.6, 300 sec: 42098.0). Total num frames: 3522248704. Throughput: 0: 42361.1. Samples: 746408460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:25,504][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 21:29:26,131][19107] Updated weights for policy 0, policy_version 214985 (0.0034) [2024-06-18 21:29:26,948][19087] Signal inference workers to stop experience collection... (10950 times) [2024-06-18 21:29:26,952][19087] Signal inference workers to resume experience collection... 
(10950 times) [2024-06-18 21:29:27,001][19107] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-06-18 21:29:27,008][19107] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-06-18 21:29:30,014][19107] Updated weights for policy 0, policy_version 214995 (0.0034) [2024-06-18 21:29:30,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3522478080. Throughput: 0: 42307.2. Samples: 746529140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:30,501][18875] Avg episode reward: [(0, '0.548')] [2024-06-18 21:29:34,082][19107] Updated weights for policy 0, policy_version 215005 (0.0027) [2024-06-18 21:29:35,501][18875] Fps is (10 sec: 44251.8, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 3522691072. Throughput: 0: 42411.8. Samples: 746781200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:35,501][18875] Avg episode reward: [(0, '0.501')] [2024-06-18 21:29:37,898][19107] Updated weights for policy 0, policy_version 215015 (0.0029) [2024-06-18 21:29:40,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 3522887680. Throughput: 0: 42266.6. Samples: 747036860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:40,501][18875] Avg episode reward: [(0, '0.403')] [2024-06-18 21:29:41,873][19107] Updated weights for policy 0, policy_version 215025 (0.0032) [2024-06-18 21:29:45,500][18875] Fps is (10 sec: 42599.5, 60 sec: 42598.5, 300 sec: 42098.6). Total num frames: 3523117056. Throughput: 0: 42286.8. Samples: 747164220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:45,501][18875] Avg episode reward: [(0, '0.324')] [2024-06-18 21:29:45,582][19107] Updated weights for policy 0, policy_version 215035 (0.0035) [2024-06-18 21:29:49,710][19107] Updated weights for policy 0, policy_version 215045 (0.0030) [2024-06-18 21:29:50,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 3523330048. Throughput: 0: 42209.2. Samples: 747417200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:50,501][18875] Avg episode reward: [(0, '0.432')] [2024-06-18 21:29:53,173][19107] Updated weights for policy 0, policy_version 215055 (0.0030) [2024-06-18 21:29:55,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3523526656. Throughput: 0: 42159.7. Samples: 747673040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:29:55,501][18875] Avg episode reward: [(0, '0.805')] [2024-06-18 21:29:57,470][19107] Updated weights for policy 0, policy_version 215065 (0.0029) [2024-06-18 21:30:00,500][18875] Fps is (10 sec: 44237.9, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 3523772416. Throughput: 0: 42183.7. Samples: 747799380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:30:00,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 21:30:00,713][19107] Updated weights for policy 0, policy_version 215075 (0.0046) [2024-06-18 21:30:05,465][19107] Updated weights for policy 0, policy_version 215085 (0.0039) [2024-06-18 21:30:05,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 3523952640. Throughput: 0: 42107.8. Samples: 748047980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:30:05,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 21:30:09,041][19107] Updated weights for policy 0, policy_version 215095 (0.0038) [2024-06-18 21:30:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.6, 300 sec: 42265.2). Total num frames: 3524182016. Throughput: 0: 41971.5. Samples: 748297020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:30:10,500][18875] Avg episode reward: [(0, '0.657')] [2024-06-18 21:30:13,233][19107] Updated weights for policy 0, policy_version 215105 (0.0028) [2024-06-18 21:30:15,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 3524378624. Throughput: 0: 42276.5. Samples: 748431580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:30:15,501][18875] Avg episode reward: [(0, '0.703')] [2024-06-18 21:30:16,671][19107] Updated weights for policy 0, policy_version 215115 (0.0030) [2024-06-18 21:30:20,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 42209.7). Total num frames: 3524575232. Throughput: 0: 42253.6. Samples: 748682600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:30:20,501][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 21:30:21,109][19107] Updated weights for policy 0, policy_version 215125 (0.0046) [2024-06-18 21:30:24,365][19107] Updated weights for policy 0, policy_version 215135 (0.0040) [2024-06-18 21:30:25,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42600.9, 300 sec: 42209.6). Total num frames: 3524804608. Throughput: 0: 42094.3. Samples: 748931100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 21:30:25,501][18875] Avg episode reward: [(0, '0.783')] [2024-06-18 21:30:28,695][19107] Updated weights for policy 0, policy_version 215145 (0.0035) [2024-06-18 21:30:30,504][18875] Fps is (10 sec: 44220.3, 60 sec: 42322.8, 300 sec: 42264.7). Total num frames: 3525017600. Throughput: 0: 42108.1. Samples: 749059240. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:30:30,505][18875] Avg episode reward: [(0, '0.781')] [2024-06-18 21:30:32,401][19107] Updated weights for policy 0, policy_version 215155 (0.0031) [2024-06-18 21:30:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 3525214208. Throughput: 0: 42085.9. Samples: 749311060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:30:35,501][18875] Avg episode reward: [(0, '0.378')] [2024-06-18 21:30:35,532][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000215162_3525214208.pth... [2024-06-18 21:30:35,588][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000214544_3515088896.pth [2024-06-18 21:30:36,892][19107] Updated weights for policy 0, policy_version 215165 (0.0035) [2024-06-18 21:30:39,881][19107] Updated weights for policy 0, policy_version 215175 (0.0026) [2024-06-18 21:30:40,500][18875] Fps is (10 sec: 42613.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3525443584. Throughput: 0: 41979.6. Samples: 749562120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:30:40,501][18875] Avg episode reward: [(0, '0.501')] [2024-06-18 21:30:44,371][19107] Updated weights for policy 0, policy_version 215185 (0.0032) [2024-06-18 21:30:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42210.1). Total num frames: 3525640192. Throughput: 0: 42174.6. Samples: 749697240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:30:45,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 21:30:47,533][19107] Updated weights for policy 0, policy_version 215195 (0.0039) [2024-06-18 21:30:50,504][18875] Fps is (10 sec: 40946.0, 60 sec: 42049.9, 300 sec: 42209.1). Total num frames: 3525853184. Throughput: 0: 42107.5. Samples: 749942960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:30:50,505][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 21:30:51,816][19107] Updated weights for policy 0, policy_version 215205 (0.0040) [2024-06-18 21:30:55,395][19107] Updated weights for policy 0, policy_version 215215 (0.0037) [2024-06-18 21:30:55,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 3526082560. Throughput: 0: 42318.6. Samples: 750201360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:30:55,501][18875] Avg episode reward: [(0, '0.730')] [2024-06-18 21:30:59,706][19107] Updated weights for policy 0, policy_version 215225 (0.0022) [2024-06-18 21:31:00,500][18875] Fps is (10 sec: 40974.4, 60 sec: 41506.1, 300 sec: 42210.1). Total num frames: 3526262784. Throughput: 0: 42152.8. Samples: 750328460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:31:00,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 21:31:02,999][19087] Signal inference workers to stop experience collection... (11000 times) [2024-06-18 21:31:03,030][19107] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-06-18 21:31:03,064][19087] Signal inference workers to resume experience collection... (11000 times) [2024-06-18 21:31:03,068][19107] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-06-18 21:31:03,239][19107] Updated weights for policy 0, policy_version 215235 (0.0025) [2024-06-18 21:31:05,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 3526492160. Throughput: 0: 42122.3. Samples: 750578100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:31:05,500][18875] Avg episode reward: [(0, '0.738')] [2024-06-18 21:31:07,224][19107] Updated weights for policy 0, policy_version 215245 (0.0040) [2024-06-18 21:31:10,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3526688768. Throughput: 0: 42214.4. 
Samples: 750830740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:31:10,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 21:31:10,926][19107] Updated weights for policy 0, policy_version 215255 (0.0027) [2024-06-18 21:31:14,955][19107] Updated weights for policy 0, policy_version 215265 (0.0047) [2024-06-18 21:31:15,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3526901760. Throughput: 0: 42219.9. Samples: 750958980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:31:15,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 21:31:18,658][19107] Updated weights for policy 0, policy_version 215275 (0.0038) [2024-06-18 21:31:20,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42210.2). Total num frames: 3527114752. Throughput: 0: 42260.1. Samples: 751212760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:31:20,501][18875] Avg episode reward: [(0, '0.346')] [2024-06-18 21:31:22,669][19107] Updated weights for policy 0, policy_version 215285 (0.0032) [2024-06-18 21:31:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3527327744. Throughput: 0: 42363.3. Samples: 751468460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:31:25,500][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 21:31:26,386][19107] Updated weights for policy 0, policy_version 215295 (0.0040) [2024-06-18 21:31:30,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42054.8, 300 sec: 42209.6). Total num frames: 3527540736. Throughput: 0: 42126.2. Samples: 751592920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:31:30,501][18875] Avg episode reward: [(0, '0.418')] [2024-06-18 21:31:30,526][19107] Updated weights for policy 0, policy_version 215305 (0.0046) [2024-06-18 21:31:34,064][19107] Updated weights for policy 0, policy_version 215315 (0.0030) [2024-06-18 21:31:35,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 3527770112. Throughput: 0: 42254.0. Samples: 751844240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 23.0) [2024-06-18 21:31:35,509][18875] Avg episode reward: [(0, '0.418')] [2024-06-18 21:31:38,301][19107] Updated weights for policy 0, policy_version 215325 (0.0037) [2024-06-18 21:31:40,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 3527950336. Throughput: 0: 42261.0. Samples: 752103100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:31:40,500][18875] Avg episode reward: [(0, '0.719')] [2024-06-18 21:31:41,770][19107] Updated weights for policy 0, policy_version 215335 (0.0033) [2024-06-18 21:31:45,504][18875] Fps is (10 sec: 39307.4, 60 sec: 42049.7, 300 sec: 42153.6). Total num frames: 3528163328. Throughput: 0: 42245.5. Samples: 752229660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:31:45,504][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 21:31:46,094][19107] Updated weights for policy 0, policy_version 215345 (0.0032) [2024-06-18 21:31:49,475][19107] Updated weights for policy 0, policy_version 215355 (0.0025) [2024-06-18 21:31:50,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42600.9, 300 sec: 42320.7). Total num frames: 3528409088. Throughput: 0: 42354.6. Samples: 752484060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:31:50,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 21:31:53,650][19107] Updated weights for policy 0, policy_version 215365 (0.0027) [2024-06-18 21:31:55,500][18875] Fps is (10 sec: 42613.6, 60 sec: 41779.2, 300 sec: 42210.1). Total num frames: 3528589312. Throughput: 0: 42490.5. Samples: 752742820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:31:55,501][18875] Avg episode reward: [(0, '0.670')] [2024-06-18 21:31:57,468][19107] Updated weights for policy 0, policy_version 215375 (0.0032) [2024-06-18 21:32:00,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42265.7). Total num frames: 3528818688. Throughput: 0: 42243.6. Samples: 752859940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:32:00,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 21:32:01,346][19107] Updated weights for policy 0, policy_version 215385 (0.0030) [2024-06-18 21:32:05,396][19107] Updated weights for policy 0, policy_version 215395 (0.0041) [2024-06-18 21:32:05,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 3529031680. Throughput: 0: 42265.2. Samples: 753114700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:32:05,501][18875] Avg episode reward: [(0, '0.311')] [2024-06-18 21:32:09,417][19107] Updated weights for policy 0, policy_version 215405 (0.0031) [2024-06-18 21:32:10,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3529228288. Throughput: 0: 42211.1. Samples: 753367960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:32:10,500][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 21:32:13,343][19107] Updated weights for policy 0, policy_version 215415 (0.0027) [2024-06-18 21:32:15,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 3529441280. Throughput: 0: 42321.0. Samples: 753497360. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:32:15,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 21:32:17,068][19107] Updated weights for policy 0, policy_version 215425 (0.0040) [2024-06-18 21:32:20,504][18875] Fps is (10 sec: 42582.5, 60 sec: 42322.7, 300 sec: 42209.1). Total num frames: 3529654272. Throughput: 0: 42213.1. Samples: 753743980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:32:20,505][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 21:32:21,166][19107] Updated weights for policy 0, policy_version 215435 (0.0032) [2024-06-18 21:32:25,043][19107] Updated weights for policy 0, policy_version 215445 (0.0039) [2024-06-18 21:32:25,505][18875] Fps is (10 sec: 40940.4, 60 sec: 42048.9, 300 sec: 42209.5). Total num frames: 3529850880. Throughput: 0: 42041.7. Samples: 753995180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:32:25,505][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 21:32:28,993][19107] Updated weights for policy 0, policy_version 215455 (0.0056) [2024-06-18 21:32:30,198][19087] Signal inference workers to stop experience collection... (11050 times) [2024-06-18 21:32:30,232][19107] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-06-18 21:32:30,254][19087] Signal inference workers to resume experience collection... (11050 times) [2024-06-18 21:32:30,255][19107] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-06-18 21:32:30,500][18875] Fps is (10 sec: 40975.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3530063872. Throughput: 0: 42152.3. Samples: 754126360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:32:30,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 21:32:32,664][19107] Updated weights for policy 0, policy_version 215465 (0.0033) [2024-06-18 21:32:35,504][18875] Fps is (10 sec: 42603.0, 60 sec: 41776.6, 300 sec: 42153.6). Total num frames: 3530276864. Throughput: 0: 41979.7. 
Samples: 754373300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:32:35,505][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 21:32:35,517][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000215471_3530276864.pth... [2024-06-18 21:32:35,576][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000214854_3520167936.pth [2024-06-18 21:32:37,196][19107] Updated weights for policy 0, policy_version 215475 (0.0028) [2024-06-18 21:32:40,387][19107] Updated weights for policy 0, policy_version 215485 (0.0038) [2024-06-18 21:32:40,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3530506240. Throughput: 0: 41619.2. Samples: 754615680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 21:32:40,501][18875] Avg episode reward: [(0, '0.757')] [2024-06-18 21:32:45,211][19107] Updated weights for policy 0, policy_version 215495 (0.0035) [2024-06-18 21:32:45,500][18875] Fps is (10 sec: 39335.9, 60 sec: 41781.7, 300 sec: 42098.5). Total num frames: 3530670080. Throughput: 0: 41888.8. Samples: 754744940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:32:45,501][18875] Avg episode reward: [(0, '0.761')] [2024-06-18 21:32:48,082][19107] Updated weights for policy 0, policy_version 215505 (0.0028) [2024-06-18 21:32:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3530915840. Throughput: 0: 41840.6. Samples: 754997520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:32:50,501][18875] Avg episode reward: [(0, '0.745')] [2024-06-18 21:32:53,091][19107] Updated weights for policy 0, policy_version 215515 (0.0028) [2024-06-18 21:32:55,500][18875] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 3531128832. Throughput: 0: 41808.4. Samples: 755249340. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:32:55,501][18875] Avg episode reward: [(0, '0.551')] [2024-06-18 21:32:55,849][19107] Updated weights for policy 0, policy_version 215525 (0.0046) [2024-06-18 21:33:00,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 3531309056. Throughput: 0: 41641.8. Samples: 755371240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:00,501][18875] Avg episode reward: [(0, '0.707')] [2024-06-18 21:33:00,727][19107] Updated weights for policy 0, policy_version 215535 (0.0034) [2024-06-18 21:33:04,039][19107] Updated weights for policy 0, policy_version 215545 (0.0036) [2024-06-18 21:33:05,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3531538432. Throughput: 0: 41792.2. Samples: 755624480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:05,501][18875] Avg episode reward: [(0, '0.707')] [2024-06-18 21:33:08,487][19107] Updated weights for policy 0, policy_version 215555 (0.0028) [2024-06-18 21:33:10,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 3531751424. Throughput: 0: 41926.6. Samples: 755881680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:10,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 21:33:11,546][19107] Updated weights for policy 0, policy_version 215565 (0.0024) [2024-06-18 21:33:15,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 42099.1). Total num frames: 3531948032. Throughput: 0: 41679.6. Samples: 756001940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:15,500][18875] Avg episode reward: [(0, '0.739')] [2024-06-18 21:33:16,153][19107] Updated weights for policy 0, policy_version 215575 (0.0033) [2024-06-18 21:33:19,789][19107] Updated weights for policy 0, policy_version 215585 (0.0041) [2024-06-18 21:33:20,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42054.8, 300 sec: 42154.1). 
Total num frames: 3532177408. Throughput: 0: 41854.1. Samples: 756256580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:20,501][18875] Avg episode reward: [(0, '0.751')] [2024-06-18 21:33:23,998][19107] Updated weights for policy 0, policy_version 215595 (0.0030) [2024-06-18 21:33:25,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42055.6, 300 sec: 42154.1). Total num frames: 3532374016. Throughput: 0: 42203.1. Samples: 756514820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:25,501][18875] Avg episode reward: [(0, '0.740')] [2024-06-18 21:33:27,390][19107] Updated weights for policy 0, policy_version 215605 (0.0030) [2024-06-18 21:33:30,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3532570624. Throughput: 0: 42005.3. Samples: 756635180. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:30,501][18875] Avg episode reward: [(0, '0.780')] [2024-06-18 21:33:31,871][19107] Updated weights for policy 0, policy_version 215615 (0.0032) [2024-06-18 21:33:35,038][19107] Updated weights for policy 0, policy_version 215625 (0.0025) [2024-06-18 21:33:35,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42327.9, 300 sec: 42154.1). Total num frames: 3532816384. Throughput: 0: 42107.5. Samples: 756892360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:35,501][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 21:33:39,585][19107] Updated weights for policy 0, policy_version 215635 (0.0035) [2024-06-18 21:33:40,503][18875] Fps is (10 sec: 42586.1, 60 sec: 41504.0, 300 sec: 42153.7). Total num frames: 3532996608. Throughput: 0: 42164.3. Samples: 757146860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:40,504][18875] Avg episode reward: [(0, '0.359')] [2024-06-18 21:33:43,049][19107] Updated weights for policy 0, policy_version 215645 (0.0043) [2024-06-18 21:33:45,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42098.5). 
Total num frames: 3533225984. Throughput: 0: 42083.5. Samples: 757265000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:45,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 21:33:47,237][19107] Updated weights for policy 0, policy_version 215655 (0.0026) [2024-06-18 21:33:50,500][18875] Fps is (10 sec: 42611.2, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3533422592. Throughput: 0: 42157.5. Samples: 757521560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 23.0) [2024-06-18 21:33:50,501][18875] Avg episode reward: [(0, '0.685')] [2024-06-18 21:33:50,782][19107] Updated weights for policy 0, policy_version 215665 (0.0029) [2024-06-18 21:33:54,675][19087] Signal inference workers to stop experience collection... (11100 times) [2024-06-18 21:33:54,676][19087] Signal inference workers to resume experience collection... (11100 times) [2024-06-18 21:33:54,699][19107] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-06-18 21:33:54,700][19107] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-06-18 21:33:55,485][19107] Updated weights for policy 0, policy_version 215675 (0.0023) [2024-06-18 21:33:55,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41506.0, 300 sec: 42098.5). Total num frames: 3533619200. Throughput: 0: 41999.1. Samples: 757771640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:33:55,501][18875] Avg episode reward: [(0, '0.774')] [2024-06-18 21:33:58,782][19107] Updated weights for policy 0, policy_version 215685 (0.0039) [2024-06-18 21:34:00,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 3533864960. Throughput: 0: 42029.2. Samples: 757893260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:00,501][18875] Avg episode reward: [(0, '0.908')] [2024-06-18 21:34:03,049][19107] Updated weights for policy 0, policy_version 215695 (0.0048) [2024-06-18 21:34:05,500][18875] Fps is (10 sec: 42599.1, 60 sec: 41779.3, 300 sec: 42043.1). Total num frames: 3534045184. Throughput: 0: 42033.0. Samples: 758148060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:05,500][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 21:34:06,491][19107] Updated weights for policy 0, policy_version 215705 (0.0037) [2024-06-18 21:34:10,500][18875] Fps is (10 sec: 37683.0, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3534241792. Throughput: 0: 41768.3. Samples: 758394400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:10,501][18875] Avg episode reward: [(0, '0.432')] [2024-06-18 21:34:10,970][19107] Updated weights for policy 0, policy_version 215715 (0.0028) [2024-06-18 21:34:14,499][19107] Updated weights for policy 0, policy_version 215725 (0.0039) [2024-06-18 21:34:15,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3534487552. Throughput: 0: 41780.5. Samples: 758515300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:15,501][18875] Avg episode reward: [(0, '0.441')] [2024-06-18 21:34:18,533][19107] Updated weights for policy 0, policy_version 215735 (0.0035) [2024-06-18 21:34:20,500][18875] Fps is (10 sec: 42599.3, 60 sec: 41506.2, 300 sec: 42099.1). Total num frames: 3534667776. Throughput: 0: 41817.8. Samples: 758774160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:20,501][18875] Avg episode reward: [(0, '0.328')] [2024-06-18 21:34:22,224][19107] Updated weights for policy 0, policy_version 215745 (0.0041) [2024-06-18 21:34:25,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3534880768. Throughput: 0: 41674.3. Samples: 759022080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:25,501][18875] Avg episode reward: [(0, '0.245')] [2024-06-18 21:34:26,163][19107] Updated weights for policy 0, policy_version 215755 (0.0028) [2024-06-18 21:34:29,981][19107] Updated weights for policy 0, policy_version 215765 (0.0026) [2024-06-18 21:34:30,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 3535110144. Throughput: 0: 41957.4. Samples: 759153080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:30,501][18875] Avg episode reward: [(0, '0.689')] [2024-06-18 21:34:34,304][19107] Updated weights for policy 0, policy_version 215775 (0.0034) [2024-06-18 21:34:35,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 3535290368. Throughput: 0: 41771.4. Samples: 759401280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:35,501][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 21:34:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000215778_3535306752.pth... [2024-06-18 21:34:35,588][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000215162_3525214208.pth [2024-06-18 21:34:37,559][19107] Updated weights for policy 0, policy_version 215785 (0.0035) [2024-06-18 21:34:40,500][18875] Fps is (10 sec: 39320.7, 60 sec: 41781.2, 300 sec: 41987.4). Total num frames: 3535503360. Throughput: 0: 41687.5. Samples: 759647580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:40,501][18875] Avg episode reward: [(0, '0.314')] [2024-06-18 21:34:41,881][19107] Updated weights for policy 0, policy_version 215795 (0.0036) [2024-06-18 21:34:45,482][19107] Updated weights for policy 0, policy_version 215805 (0.0038) [2024-06-18 21:34:45,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 3535749120. Throughput: 0: 41978.7. Samples: 759782300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:45,501][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 21:34:49,626][19107] Updated weights for policy 0, policy_version 215815 (0.0048) [2024-06-18 21:34:50,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3535929344. Throughput: 0: 41872.8. Samples: 760032340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:50,501][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 21:34:53,138][19107] Updated weights for policy 0, policy_version 215825 (0.0031) [2024-06-18 21:34:55,500][18875] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3536142336. Throughput: 0: 41872.0. Samples: 760278640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:34:55,501][18875] Avg episode reward: [(0, '0.271')] [2024-06-18 21:34:57,262][19107] Updated weights for policy 0, policy_version 215835 (0.0044) [2024-06-18 21:35:00,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3536355328. Throughput: 0: 41979.9. Samples: 760404400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-18 21:35:00,501][18875] Avg episode reward: [(0, '0.332')] [2024-06-18 21:35:01,220][19107] Updated weights for policy 0, policy_version 215845 (0.0040) [2024-06-18 21:35:05,292][19107] Updated weights for policy 0, policy_version 215855 (0.0043) [2024-06-18 21:35:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41987.4). Total num frames: 3536568320. Throughput: 0: 41932.7. Samples: 760661140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:05,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 21:35:08,950][19107] Updated weights for policy 0, policy_version 215865 (0.0037) [2024-06-18 21:35:10,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3536781312. Throughput: 0: 41838.2. Samples: 760904800. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:10,501][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 21:35:12,963][19107] Updated weights for policy 0, policy_version 215875 (0.0036) [2024-06-18 21:35:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3536977920. Throughput: 0: 41787.0. Samples: 761033500. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:15,501][18875] Avg episode reward: [(0, '0.402')] [2024-06-18 21:35:16,617][19107] Updated weights for policy 0, policy_version 215885 (0.0035) [2024-06-18 21:35:20,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3537190912. Throughput: 0: 41963.2. Samples: 761289620. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:20,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 21:35:20,834][19107] Updated weights for policy 0, policy_version 215895 (0.0031) [2024-06-18 21:35:24,188][19107] Updated weights for policy 0, policy_version 215905 (0.0035) [2024-06-18 21:35:25,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42043.5). Total num frames: 3537420288. Throughput: 0: 41973.0. Samples: 761536360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:25,501][18875] Avg episode reward: [(0, '0.713')] [2024-06-18 21:35:28,542][19107] Updated weights for policy 0, policy_version 215915 (0.0030) [2024-06-18 21:35:30,347][19087] Signal inference workers to stop experience collection... (11150 times) [2024-06-18 21:35:30,347][19087] Signal inference workers to resume experience collection... (11150 times) [2024-06-18 21:35:30,364][19107] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-06-18 21:35:30,364][19107] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-06-18 21:35:30,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3537616896. Throughput: 0: 41874.3. 
Samples: 761666640. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:30,501][18875] Avg episode reward: [(0, '0.608')] [2024-06-18 21:35:32,082][19107] Updated weights for policy 0, policy_version 215925 (0.0032) [2024-06-18 21:35:35,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3537829888. Throughput: 0: 42050.2. Samples: 761924600. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:35,503][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 21:35:36,263][19107] Updated weights for policy 0, policy_version 215935 (0.0033) [2024-06-18 21:35:39,865][19107] Updated weights for policy 0, policy_version 215945 (0.0033) [2024-06-18 21:35:40,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42098.5). Total num frames: 3538059264. Throughput: 0: 42000.1. Samples: 762168640. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:40,501][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 21:35:43,982][19107] Updated weights for policy 0, policy_version 215955 (0.0021) [2024-06-18 21:35:45,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41988.0). Total num frames: 3538239488. Throughput: 0: 42173.1. Samples: 762302180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:45,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 21:35:47,547][19107] Updated weights for policy 0, policy_version 215965 (0.0027) [2024-06-18 21:35:50,504][18875] Fps is (10 sec: 39307.6, 60 sec: 42049.8, 300 sec: 41931.4). Total num frames: 3538452480. Throughput: 0: 42149.6. Samples: 762558020. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:50,504][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 21:35:51,826][19107] Updated weights for policy 0, policy_version 215975 (0.0043) [2024-06-18 21:35:55,361][19107] Updated weights for policy 0, policy_version 215985 (0.0050) [2024-06-18 21:35:55,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3538698240. Throughput: 0: 42226.2. Samples: 762804980. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:35:55,501][18875] Avg episode reward: [(0, '0.632')] [2024-06-18 21:35:59,525][19107] Updated weights for policy 0, policy_version 215995 (0.0039) [2024-06-18 21:36:00,500][18875] Fps is (10 sec: 42614.1, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 3538878464. Throughput: 0: 42181.9. Samples: 762931680. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:36:00,500][18875] Avg episode reward: [(0, '0.551')] [2024-06-18 21:36:03,147][19107] Updated weights for policy 0, policy_version 216005 (0.0042) [2024-06-18 21:36:05,504][18875] Fps is (10 sec: 37669.7, 60 sec: 41776.7, 300 sec: 41986.9). Total num frames: 3539075072. Throughput: 0: 42162.0. Samples: 763187060. Policy #0 lag: (min: 0.0, avg: 12.7, max: 23.0) [2024-06-18 21:36:05,504][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 21:36:07,245][19107] Updated weights for policy 0, policy_version 216015 (0.0033) [2024-06-18 21:36:10,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 3539320832. Throughput: 0: 42171.7. Samples: 763434080. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:10,500][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 21:36:11,437][19107] Updated weights for policy 0, policy_version 216025 (0.0034) [2024-06-18 21:36:15,043][19107] Updated weights for policy 0, policy_version 216035 (0.0043) [2024-06-18 21:36:15,500][18875] Fps is (10 sec: 44252.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3539517440. Throughput: 0: 42175.6. Samples: 763564540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:15,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 21:36:19,349][19107] Updated weights for policy 0, policy_version 216045 (0.0041) [2024-06-18 21:36:20,500][18875] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3539714048. Throughput: 0: 41946.3. Samples: 763812180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:20,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 21:36:23,064][19107] Updated weights for policy 0, policy_version 216055 (0.0044) [2024-06-18 21:36:25,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3539943424. Throughput: 0: 42128.4. Samples: 764064420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:25,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 21:36:27,439][19107] Updated weights for policy 0, policy_version 216065 (0.0037) [2024-06-18 21:36:30,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3540156416. Throughput: 0: 42011.6. Samples: 764192700. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:30,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 21:36:30,991][19107] Updated weights for policy 0, policy_version 216075 (0.0039) [2024-06-18 21:36:35,245][19107] Updated weights for policy 0, policy_version 216085 (0.0045) [2024-06-18 21:36:35,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41987.4). Total num frames: 3540336640. Throughput: 0: 41922.8. Samples: 764444400. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:35,501][18875] Avg episode reward: [(0, '0.256')] [2024-06-18 21:36:35,595][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000216086_3540353024.pth... [2024-06-18 21:36:35,645][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000215471_3530276864.pth [2024-06-18 21:36:38,840][19107] Updated weights for policy 0, policy_version 216095 (0.0038) [2024-06-18 21:36:40,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42099.1). Total num frames: 3540582400. Throughput: 0: 41891.2. Samples: 764690080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:40,500][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 21:36:43,074][19107] Updated weights for policy 0, policy_version 216105 (0.0037) [2024-06-18 21:36:45,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3540779008. Throughput: 0: 42103.8. Samples: 764826360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:45,501][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 21:36:46,538][19107] Updated weights for policy 0, policy_version 216115 (0.0036) [2024-06-18 21:36:47,206][19087] Signal inference workers to stop experience collection... (11200 times) [2024-06-18 21:36:47,207][19087] Signal inference workers to resume experience collection... 
(11200 times) [2024-06-18 21:36:47,232][19107] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-06-18 21:36:47,232][19107] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-06-18 21:36:50,500][18875] Fps is (10 sec: 39320.5, 60 sec: 42054.6, 300 sec: 41987.5). Total num frames: 3540975616. Throughput: 0: 41865.4. Samples: 765070860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:50,501][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 21:36:50,672][19107] Updated weights for policy 0, policy_version 216125 (0.0041) [2024-06-18 21:36:54,194][19107] Updated weights for policy 0, policy_version 216135 (0.0040) [2024-06-18 21:36:55,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3541204992. Throughput: 0: 42016.3. Samples: 765324820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:36:55,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 21:36:58,576][19107] Updated weights for policy 0, policy_version 216145 (0.0031) [2024-06-18 21:37:00,500][18875] Fps is (10 sec: 42599.5, 60 sec: 42052.2, 300 sec: 41932.0). Total num frames: 3541401600. Throughput: 0: 42053.8. Samples: 765456960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:37:00,501][18875] Avg episode reward: [(0, '0.426')] [2024-06-18 21:37:01,857][19107] Updated weights for policy 0, policy_version 216155 (0.0040) [2024-06-18 21:37:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42327.8, 300 sec: 41987.4). Total num frames: 3541614592. Throughput: 0: 42086.1. Samples: 765706060. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:37:05,501][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 21:37:06,362][19107] Updated weights for policy 0, policy_version 216165 (0.0030) [2024-06-18 21:37:09,474][19107] Updated weights for policy 0, policy_version 216175 (0.0025) [2024-06-18 21:37:10,500][18875] Fps is (10 sec: 45874.5, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 3541860352. Throughput: 0: 42113.3. Samples: 765959520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:37:10,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 21:37:14,015][19107] Updated weights for policy 0, policy_version 216185 (0.0034) [2024-06-18 21:37:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41932.4). Total num frames: 3542024192. Throughput: 0: 42162.5. Samples: 766090020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 21:37:15,501][18875] Avg episode reward: [(0, '0.700')] [2024-06-18 21:37:17,213][19107] Updated weights for policy 0, policy_version 216195 (0.0045) [2024-06-18 21:37:20,500][18875] Fps is (10 sec: 37683.3, 60 sec: 42052.2, 300 sec: 41988.1). Total num frames: 3542237184. Throughput: 0: 42098.3. Samples: 766338820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:37:20,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 21:37:21,621][19107] Updated weights for policy 0, policy_version 216205 (0.0041) [2024-06-18 21:37:25,172][19107] Updated weights for policy 0, policy_version 216215 (0.0032) [2024-06-18 21:37:25,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3542466560. Throughput: 0: 42380.4. Samples: 766597200. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:37:25,500][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 21:37:29,166][19107] Updated weights for policy 0, policy_version 216225 (0.0027) [2024-06-18 21:37:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41988.0). Total num frames: 3542663168. Throughput: 0: 42252.5. Samples: 766727720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:37:30,501][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 21:37:32,867][19107] Updated weights for policy 0, policy_version 216235 (0.0028) [2024-06-18 21:37:35,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3542892544. Throughput: 0: 42368.1. Samples: 766977420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:37:35,501][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 21:37:36,852][19107] Updated weights for policy 0, policy_version 216245 (0.0034) [2024-06-18 21:37:40,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3543105536. Throughput: 0: 42357.5. Samples: 767230900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:37:40,500][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 21:37:40,603][19107] Updated weights for policy 0, policy_version 216255 (0.0027) [2024-06-18 21:37:44,518][19107] Updated weights for policy 0, policy_version 216265 (0.0034) [2024-06-18 21:37:45,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 3543285760. Throughput: 0: 42269.3. Samples: 767359080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:37:45,501][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 21:37:48,221][19107] Updated weights for policy 0, policy_version 216275 (0.0029) [2024-06-18 21:37:50,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3543531520. Throughput: 0: 42357.4. Samples: 767612140. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:37:50,501][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 21:37:52,008][19107] Updated weights for policy 0, policy_version 216285 (0.0029) [2024-06-18 21:37:55,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3543728128. Throughput: 0: 42407.6. Samples: 767867860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:37:55,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 21:37:56,288][19107] Updated weights for policy 0, policy_version 216295 (0.0036) [2024-06-18 21:37:59,638][19107] Updated weights for policy 0, policy_version 216305 (0.0034) [2024-06-18 21:38:00,504][18875] Fps is (10 sec: 40945.4, 60 sec: 42322.8, 300 sec: 42042.5). Total num frames: 3543941120. Throughput: 0: 42250.5. Samples: 767991440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:38:00,505][18875] Avg episode reward: [(0, '0.526')] [2024-06-18 21:38:03,820][19107] Updated weights for policy 0, policy_version 216315 (0.0028) [2024-06-18 21:38:05,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42098.6). Total num frames: 3544170496. Throughput: 0: 42383.7. Samples: 768246080. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:38:05,501][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 21:38:07,666][19107] Updated weights for policy 0, policy_version 216325 (0.0030) [2024-06-18 21:38:10,500][18875] Fps is (10 sec: 42613.9, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 3544367104. Throughput: 0: 42376.0. Samples: 768504120. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:38:10,501][18875] Avg episode reward: [(0, '0.668')] [2024-06-18 21:38:11,412][19107] Updated weights for policy 0, policy_version 216335 (0.0032) [2024-06-18 21:38:12,882][19087] Signal inference workers to stop experience collection... 
(11250 times) [2024-06-18 21:38:12,883][19087] Signal inference workers to resume experience collection... (11250 times) [2024-06-18 21:38:12,928][19107] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-06-18 21:38:12,928][19107] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-06-18 21:38:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3544580096. Throughput: 0: 42236.2. Samples: 768628340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:38:15,501][18875] Avg episode reward: [(0, '0.723')] [2024-06-18 21:38:15,609][19107] Updated weights for policy 0, policy_version 216345 (0.0035) [2024-06-18 21:38:19,316][19107] Updated weights for policy 0, policy_version 216355 (0.0041) [2024-06-18 21:38:20,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3544776704. Throughput: 0: 42213.7. Samples: 768877040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:38:20,501][18875] Avg episode reward: [(0, '0.839')] [2024-06-18 21:38:23,308][19107] Updated weights for policy 0, policy_version 216365 (0.0052) [2024-06-18 21:38:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 3544989696. Throughput: 0: 42220.3. Samples: 769130820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-18 21:38:25,501][18875] Avg episode reward: [(0, '0.304')] [2024-06-18 21:38:27,460][19107] Updated weights for policy 0, policy_version 216375 (0.0038) [2024-06-18 21:38:30,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3545219072. Throughput: 0: 42035.1. Samples: 769250660. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:38:30,501][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 21:38:31,543][19107] Updated weights for policy 0, policy_version 216385 (0.0035) [2024-06-18 21:38:35,484][19107] Updated weights for policy 0, policy_version 216395 (0.0046) [2024-06-18 21:38:35,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42099.0). Total num frames: 3545415680. Throughput: 0: 42103.7. Samples: 769506800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:38:35,500][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 21:38:35,527][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000216395_3545415680.pth... [2024-06-18 21:38:35,575][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000215778_3535306752.pth [2024-06-18 21:38:39,477][19107] Updated weights for policy 0, policy_version 216405 (0.0029) [2024-06-18 21:38:40,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 3545612288. Throughput: 0: 41815.1. Samples: 769749540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:38:40,501][18875] Avg episode reward: [(0, '0.665')] [2024-06-18 21:38:43,337][19107] Updated weights for policy 0, policy_version 216415 (0.0041) [2024-06-18 21:38:45,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 3545841664. Throughput: 0: 42022.9. Samples: 769882320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:38:45,501][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 21:38:47,271][19107] Updated weights for policy 0, policy_version 216425 (0.0037) [2024-06-18 21:38:50,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3546038272. Throughput: 0: 41855.1. Samples: 770129560. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:38:50,501][18875] Avg episode reward: [(0, '0.372')] [2024-06-18 21:38:51,040][19107] Updated weights for policy 0, policy_version 216435 (0.0029) [2024-06-18 21:38:55,090][19107] Updated weights for policy 0, policy_version 216445 (0.0026) [2024-06-18 21:38:55,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3546251264. Throughput: 0: 41862.6. Samples: 770387940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:38:55,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 21:38:58,859][19107] Updated weights for policy 0, policy_version 216455 (0.0036) [2024-06-18 21:39:00,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42327.9, 300 sec: 42154.1). Total num frames: 3546480640. Throughput: 0: 41832.0. Samples: 770510780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:39:00,500][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 21:39:02,859][19107] Updated weights for policy 0, policy_version 216465 (0.0028) [2024-06-18 21:39:05,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.0, 300 sec: 42098.6). Total num frames: 3546660864. Throughput: 0: 41818.3. Samples: 770758860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:39:05,501][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 21:39:06,876][19107] Updated weights for policy 0, policy_version 216475 (0.0034) [2024-06-18 21:39:10,500][18875] Fps is (10 sec: 37682.7, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3546857472. Throughput: 0: 41736.0. Samples: 771008940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:39:10,501][18875] Avg episode reward: [(0, '0.340')] [2024-06-18 21:39:11,015][19107] Updated weights for policy 0, policy_version 216485 (0.0035) [2024-06-18 21:39:14,613][19107] Updated weights for policy 0, policy_version 216495 (0.0047) [2024-06-18 21:39:15,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 42154.1). 
Total num frames: 3547103232. Throughput: 0: 41844.9. Samples: 771133680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:39:15,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 21:39:18,724][19107] Updated weights for policy 0, policy_version 216505 (0.0042) [2024-06-18 21:39:20,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3547299840. Throughput: 0: 41772.3. Samples: 771386560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:39:20,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 21:39:22,362][19107] Updated weights for policy 0, policy_version 216515 (0.0026) [2024-06-18 21:39:25,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3547512832. Throughput: 0: 41941.8. Samples: 771636920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:39:25,501][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 21:39:26,610][19107] Updated weights for policy 0, policy_version 216525 (0.0039) [2024-06-18 21:39:30,147][19107] Updated weights for policy 0, policy_version 216535 (0.0034) [2024-06-18 21:39:30,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3547725824. Throughput: 0: 41769.8. Samples: 771761960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 21:39:30,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 21:39:34,362][19107] Updated weights for policy 0, policy_version 216545 (0.0026) [2024-06-18 21:39:35,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3547922432. Throughput: 0: 42036.4. Samples: 772021200. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:39:35,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 21:39:37,936][19107] Updated weights for policy 0, policy_version 216555 (0.0035) [2024-06-18 21:39:38,438][19087] Signal inference workers to stop experience collection... 
(11300 times) [2024-06-18 21:39:38,489][19107] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-06-18 21:39:38,553][19087] Signal inference workers to resume experience collection... (11300 times) [2024-06-18 21:39:38,553][19107] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-06-18 21:39:40,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3548151808. Throughput: 0: 41742.7. Samples: 772266360. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:39:40,501][18875] Avg episode reward: [(0, '0.367')] [2024-06-18 21:39:41,997][19107] Updated weights for policy 0, policy_version 216565 (0.0035) [2024-06-18 21:39:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3548348416. Throughput: 0: 41792.4. Samples: 772391440. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:39:45,501][18875] Avg episode reward: [(0, '0.773')] [2024-06-18 21:39:46,062][19107] Updated weights for policy 0, policy_version 216575 (0.0042) [2024-06-18 21:39:49,792][19107] Updated weights for policy 0, policy_version 216585 (0.0041) [2024-06-18 21:39:50,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3548545024. Throughput: 0: 41924.5. Samples: 772645460. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:39:50,501][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 21:39:53,774][19107] Updated weights for policy 0, policy_version 216595 (0.0056) [2024-06-18 21:39:55,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3548774400. Throughput: 0: 41741.4. Samples: 772887300. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:39:55,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 21:39:57,789][19107] Updated weights for policy 0, policy_version 216605 (0.0033) [2024-06-18 21:40:00,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 3548954624. Throughput: 0: 41870.2. Samples: 773017840. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:40:00,501][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 21:40:01,696][19107] Updated weights for policy 0, policy_version 216615 (0.0044) [2024-06-18 21:40:05,349][19107] Updated weights for policy 0, policy_version 216625 (0.0038) [2024-06-18 21:40:05,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3549184000. Throughput: 0: 41753.7. Samples: 773265480. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:40:05,501][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 21:40:09,673][19107] Updated weights for policy 0, policy_version 216635 (0.0043) [2024-06-18 21:40:10,504][18875] Fps is (10 sec: 44220.9, 60 sec: 42322.8, 300 sec: 42098.0). Total num frames: 3549396992. Throughput: 0: 41874.9. Samples: 773521440. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:40:10,505][18875] Avg episode reward: [(0, '0.768')] [2024-06-18 21:40:12,921][19107] Updated weights for policy 0, policy_version 216645 (0.0034) [2024-06-18 21:40:15,501][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 42043.0). Total num frames: 3549593600. Throughput: 0: 41862.5. Samples: 773645780. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:40:15,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 21:40:17,153][19107] Updated weights for policy 0, policy_version 216655 (0.0039) [2024-06-18 21:40:20,500][18875] Fps is (10 sec: 42614.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3549822976. Throughput: 0: 41737.8. Samples: 773899400. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:40:20,501][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 21:40:20,652][19107] Updated weights for policy 0, policy_version 216665 (0.0037) [2024-06-18 21:40:24,731][19107] Updated weights for policy 0, policy_version 216675 (0.0029) [2024-06-18 21:40:25,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3550035968. Throughput: 0: 42118.2. Samples: 774161680. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:40:25,504][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 21:40:28,286][19107] Updated weights for policy 0, policy_version 216685 (0.0032) [2024-06-18 21:40:30,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3550216192. Throughput: 0: 42192.5. Samples: 774290100. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:40:30,501][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 21:40:32,357][19107] Updated weights for policy 0, policy_version 216695 (0.0027) [2024-06-18 21:40:35,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 3550461952. Throughput: 0: 42181.3. Samples: 774543620. Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:40:35,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 21:40:35,520][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000216703_3550461952.pth... [2024-06-18 21:40:35,576][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000216086_3540353024.pth [2024-06-18 21:40:35,959][19107] Updated weights for policy 0, policy_version 216705 (0.0028) [2024-06-18 21:40:40,080][19107] Updated weights for policy 0, policy_version 216715 (0.0042) [2024-06-18 21:40:40,500][18875] Fps is (10 sec: 44235.9, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3550658560. Throughput: 0: 42377.6. Samples: 774794300. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 21.0) [2024-06-18 21:40:40,501][18875] Avg episode reward: [(0, '0.355')] [2024-06-18 21:40:44,121][19107] Updated weights for policy 0, policy_version 216725 (0.0031) [2024-06-18 21:40:45,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42099.1). Total num frames: 3550871552. Throughput: 0: 42172.0. Samples: 774915580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:40:45,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 21:40:47,993][19107] Updated weights for policy 0, policy_version 216735 (0.0036) [2024-06-18 21:40:50,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3551084544. Throughput: 0: 42403.1. Samples: 775173620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:40:50,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 21:40:51,763][19107] Updated weights for policy 0, policy_version 216745 (0.0045) [2024-06-18 21:40:55,501][18875] Fps is (10 sec: 42597.8, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 3551297536. Throughput: 0: 42212.1. Samples: 775420840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:40:55,501][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 21:40:55,922][19107] Updated weights for policy 0, policy_version 216755 (0.0041) [2024-06-18 21:40:59,451][19107] Updated weights for policy 0, policy_version 216765 (0.0026) [2024-06-18 21:41:00,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42099.1). Total num frames: 3551494144. Throughput: 0: 42085.1. Samples: 775539600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:00,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 21:41:04,136][19107] Updated weights for policy 0, policy_version 216775 (0.0043) [2024-06-18 21:41:05,167][19087] Signal inference workers to stop experience collection... 
(11350 times) [2024-06-18 21:41:05,168][19087] Signal inference workers to resume experience collection... (11350 times) [2024-06-18 21:41:05,202][19107] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-06-18 21:41:05,202][19107] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-06-18 21:41:05,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3551723520. Throughput: 0: 42129.7. Samples: 775795240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:05,501][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 21:41:07,086][19107] Updated weights for policy 0, policy_version 216785 (0.0035) [2024-06-18 21:41:10,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41781.7, 300 sec: 41987.5). Total num frames: 3551903744. Throughput: 0: 41843.1. Samples: 776044620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:10,501][18875] Avg episode reward: [(0, '0.418')] [2024-06-18 21:41:11,929][19107] Updated weights for policy 0, policy_version 216795 (0.0047) [2024-06-18 21:41:14,852][19107] Updated weights for policy 0, policy_version 216805 (0.0036) [2024-06-18 21:41:15,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 3552133120. Throughput: 0: 41748.5. Samples: 776168780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:15,500][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 21:41:19,731][19107] Updated weights for policy 0, policy_version 216815 (0.0031) [2024-06-18 21:41:20,500][18875] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3552329728. Throughput: 0: 41890.0. Samples: 776428660. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:20,501][18875] Avg episode reward: [(0, '0.687')] [2024-06-18 21:41:22,802][19107] Updated weights for policy 0, policy_version 216825 (0.0031) [2024-06-18 21:41:25,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3552526336. Throughput: 0: 41862.4. Samples: 776678100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:25,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 21:41:27,442][19107] Updated weights for policy 0, policy_version 216835 (0.0037) [2024-06-18 21:41:30,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3552772096. Throughput: 0: 41887.6. Samples: 776800520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:30,501][18875] Avg episode reward: [(0, '0.663')] [2024-06-18 21:41:30,578][19107] Updated weights for policy 0, policy_version 216845 (0.0037) [2024-06-18 21:41:35,355][19107] Updated weights for policy 0, policy_version 216855 (0.0033) [2024-06-18 21:41:35,501][18875] Fps is (10 sec: 42597.4, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3552952320. Throughput: 0: 41809.7. Samples: 777055060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:35,501][18875] Avg episode reward: [(0, '0.677')] [2024-06-18 21:41:38,939][19107] Updated weights for policy 0, policy_version 216865 (0.0033) [2024-06-18 21:41:40,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3553165312. Throughput: 0: 41724.5. Samples: 777298440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:40,501][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 21:41:43,265][19107] Updated weights for policy 0, policy_version 216875 (0.0033) [2024-06-18 21:41:45,500][18875] Fps is (10 sec: 40961.2, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3553361920. Throughput: 0: 41922.3. Samples: 777426100. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:45,500][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 21:41:46,685][19107] Updated weights for policy 0, policy_version 216885 (0.0048) [2024-06-18 21:41:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3553574912. Throughput: 0: 41716.0. Samples: 777672460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 21:41:50,501][18875] Avg episode reward: [(0, '0.638')] [2024-06-18 21:41:51,274][19107] Updated weights for policy 0, policy_version 216895 (0.0035) [2024-06-18 21:41:54,617][19107] Updated weights for policy 0, policy_version 216905 (0.0038) [2024-06-18 21:41:55,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.3, 300 sec: 41987.5). Total num frames: 3553787904. Throughput: 0: 41771.7. Samples: 777924340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:41:55,501][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 21:41:58,983][19107] Updated weights for policy 0, policy_version 216915 (0.0030) [2024-06-18 21:42:00,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3554000896. Throughput: 0: 41822.2. Samples: 778050780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:00,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 21:42:02,465][19107] Updated weights for policy 0, policy_version 216925 (0.0032) [2024-06-18 21:42:05,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3554213888. Throughput: 0: 41692.4. Samples: 778304820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:05,501][18875] Avg episode reward: [(0, '0.697')] [2024-06-18 21:42:06,742][19107] Updated weights for policy 0, policy_version 216935 (0.0038) [2024-06-18 21:42:10,254][19107] Updated weights for policy 0, policy_version 216945 (0.0027) [2024-06-18 21:42:10,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3554426880. Throughput: 0: 41708.3. Samples: 778554980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:10,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 21:42:14,592][19107] Updated weights for policy 0, policy_version 216955 (0.0023) [2024-06-18 21:42:15,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3554623488. Throughput: 0: 41885.4. Samples: 778685360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:15,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 21:42:17,870][19107] Updated weights for policy 0, policy_version 216965 (0.0035) [2024-06-18 21:42:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.1, 300 sec: 41987.4). Total num frames: 3554852864. Throughput: 0: 41808.1. Samples: 778936420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:20,501][18875] Avg episode reward: [(0, '0.382')] [2024-06-18 21:42:22,619][19107] Updated weights for policy 0, policy_version 216975 (0.0031) [2024-06-18 21:42:25,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3555065856. Throughput: 0: 41925.9. Samples: 779185100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:25,501][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 21:42:25,890][19107] Updated weights for policy 0, policy_version 216985 (0.0035) [2024-06-18 21:42:30,323][19107] Updated weights for policy 0, policy_version 216995 (0.0034) [2024-06-18 21:42:30,500][18875] Fps is (10 sec: 39322.3, 60 sec: 41233.1, 300 sec: 41876.4). 
Total num frames: 3555246080. Throughput: 0: 41933.7. Samples: 779313120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:30,500][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 21:42:33,550][19107] Updated weights for policy 0, policy_version 217005 (0.0037) [2024-06-18 21:42:35,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3555491840. Throughput: 0: 42047.6. Samples: 779564600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:35,512][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 21:42:35,537][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000217010_3555491840.pth... [2024-06-18 21:42:35,627][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000216395_3545415680.pth [2024-06-18 21:42:38,070][19107] Updated weights for policy 0, policy_version 217015 (0.0028) [2024-06-18 21:42:40,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3555704832. Throughput: 0: 42089.3. Samples: 779818360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:40,501][18875] Avg episode reward: [(0, '0.334')] [2024-06-18 21:42:41,226][19107] Updated weights for policy 0, policy_version 217025 (0.0039) [2024-06-18 21:42:44,724][19087] Signal inference workers to stop experience collection... (11400 times) [2024-06-18 21:42:44,724][19087] Signal inference workers to resume experience collection... (11400 times) [2024-06-18 21:42:44,757][19107] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-06-18 21:42:44,764][19107] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-06-18 21:42:45,500][18875] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3555885056. Throughput: 0: 42018.1. Samples: 779941600. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:45,501][18875] Avg episode reward: [(0, '0.426')] [2024-06-18 21:42:45,694][19107] Updated weights for policy 0, policy_version 217035 (0.0034) [2024-06-18 21:42:49,016][19107] Updated weights for policy 0, policy_version 217045 (0.0041) [2024-06-18 21:42:50,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42098.6). Total num frames: 3556147200. Throughput: 0: 42095.6. Samples: 780199120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:50,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 21:42:53,385][19107] Updated weights for policy 0, policy_version 217055 (0.0047) [2024-06-18 21:42:55,504][18875] Fps is (10 sec: 44221.0, 60 sec: 42322.7, 300 sec: 41987.5). Total num frames: 3556327424. Throughput: 0: 42230.0. Samples: 780455480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 21:42:55,505][18875] Avg episode reward: [(0, '0.286')] [2024-06-18 21:42:56,967][19107] Updated weights for policy 0, policy_version 217065 (0.0035) [2024-06-18 21:43:00,500][18875] Fps is (10 sec: 36044.4, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3556507648. Throughput: 0: 41993.3. Samples: 780575060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:00,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 21:43:01,115][19107] Updated weights for policy 0, policy_version 217075 (0.0030) [2024-06-18 21:43:04,825][19107] Updated weights for policy 0, policy_version 217085 (0.0029) [2024-06-18 21:43:05,500][18875] Fps is (10 sec: 44252.5, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 3556769792. Throughput: 0: 42121.8. Samples: 780831900. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:05,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 21:43:08,911][19107] Updated weights for policy 0, policy_version 217095 (0.0033) [2024-06-18 21:43:10,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3556933632. Throughput: 0: 42168.0. Samples: 781082660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:10,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 21:43:12,602][19107] Updated weights for policy 0, policy_version 217105 (0.0037) [2024-06-18 21:43:15,500][18875] Fps is (10 sec: 37683.5, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3557146624. Throughput: 0: 41930.6. Samples: 781200000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:15,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 21:43:16,812][19107] Updated weights for policy 0, policy_version 217115 (0.0041) [2024-06-18 21:43:20,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3557359616. Throughput: 0: 42045.2. Samples: 781456640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:20,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 21:43:20,519][19107] Updated weights for policy 0, policy_version 217125 (0.0041) [2024-06-18 21:43:24,558][19107] Updated weights for policy 0, policy_version 217135 (0.0046) [2024-06-18 21:43:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 3557556224. Throughput: 0: 42023.5. Samples: 781709420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:25,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 21:43:28,361][19107] Updated weights for policy 0, policy_version 217145 (0.0032) [2024-06-18 21:43:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3557785600. Throughput: 0: 41994.6. Samples: 781831360. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:30,501][18875] Avg episode reward: [(0, '0.706')] [2024-06-18 21:43:32,330][19107] Updated weights for policy 0, policy_version 217155 (0.0033) [2024-06-18 21:43:35,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3557965824. Throughput: 0: 41900.9. Samples: 782084660. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:35,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 21:43:36,046][19107] Updated weights for policy 0, policy_version 217165 (0.0027) [2024-06-18 21:43:40,438][19107] Updated weights for policy 0, policy_version 217175 (0.0033) [2024-06-18 21:43:40,503][18875] Fps is (10 sec: 40948.4, 60 sec: 41504.1, 300 sec: 41876.0). Total num frames: 3558195200. Throughput: 0: 41809.1. Samples: 782336860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:40,504][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 21:43:43,933][19107] Updated weights for policy 0, policy_version 217185 (0.0038) [2024-06-18 21:43:45,500][18875] Fps is (10 sec: 47513.8, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3558440960. Throughput: 0: 41975.2. Samples: 782463940. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:45,501][18875] Avg episode reward: [(0, '0.683')] [2024-06-18 21:43:48,394][19107] Updated weights for policy 0, policy_version 217195 (0.0029) [2024-06-18 21:43:50,500][18875] Fps is (10 sec: 42610.5, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 3558621184. Throughput: 0: 41874.2. Samples: 782716240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:50,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 21:43:51,940][19107] Updated weights for policy 0, policy_version 217205 (0.0034) [2024-06-18 21:43:55,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41508.7, 300 sec: 41820.8). Total num frames: 3558817792. Throughput: 0: 41849.4. Samples: 782965880. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:43:55,500][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 21:43:55,975][19107] Updated weights for policy 0, policy_version 217215 (0.0050) [2024-06-18 21:43:58,686][19087] Signal inference workers to stop experience collection... (11450 times) [2024-06-18 21:43:58,686][19087] Signal inference workers to resume experience collection... (11450 times) [2024-06-18 21:43:58,714][19107] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-06-18 21:43:58,715][19107] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-06-18 21:43:59,811][19107] Updated weights for policy 0, policy_version 217225 (0.0026) [2024-06-18 21:44:00,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3559063552. Throughput: 0: 42151.1. Samples: 783096800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:44:00,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 21:44:03,558][19107] Updated weights for policy 0, policy_version 217235 (0.0044) [2024-06-18 21:44:05,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3559260160. Throughput: 0: 42052.5. Samples: 783349000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 21:44:05,504][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 21:44:07,779][19107] Updated weights for policy 0, policy_version 217245 (0.0031) [2024-06-18 21:44:10,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 3559473152. Throughput: 0: 41888.6. Samples: 783594400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:10,500][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 21:44:11,280][19107] Updated weights for policy 0, policy_version 217255 (0.0044) [2024-06-18 21:44:15,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3559653376. Throughput: 0: 42097.4. 
Samples: 783725740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:15,501][18875] Avg episode reward: [(0, '0.644')] [2024-06-18 21:44:15,592][19107] Updated weights for policy 0, policy_version 217265 (0.0030) [2024-06-18 21:44:18,883][19107] Updated weights for policy 0, policy_version 217275 (0.0028) [2024-06-18 21:44:20,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3559899136. Throughput: 0: 42178.5. Samples: 783982700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:20,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 21:44:23,180][19107] Updated weights for policy 0, policy_version 217285 (0.0045) [2024-06-18 21:44:25,500][18875] Fps is (10 sec: 45875.9, 60 sec: 42598.5, 300 sec: 41987.5). Total num frames: 3560112128. Throughput: 0: 42109.9. Samples: 784231680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:25,500][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 21:44:26,588][19107] Updated weights for policy 0, policy_version 217295 (0.0045) [2024-06-18 21:44:30,500][18875] Fps is (10 sec: 37683.3, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 3560275968. Throughput: 0: 42116.7. Samples: 784359200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:30,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 21:44:30,826][19107] Updated weights for policy 0, policy_version 217305 (0.0033) [2024-06-18 21:44:34,367][19107] Updated weights for policy 0, policy_version 217315 (0.0038) [2024-06-18 21:44:35,504][18875] Fps is (10 sec: 42582.6, 60 sec: 42868.8, 300 sec: 41987.0). Total num frames: 3560538112. Throughput: 0: 42215.3. Samples: 784616080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:35,505][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 21:44:35,521][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000217318_3560538112.pth... 
[2024-06-18 21:44:35,573][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000216703_3550461952.pth [2024-06-18 21:44:38,540][19107] Updated weights for policy 0, policy_version 217325 (0.0052) [2024-06-18 21:44:40,500][18875] Fps is (10 sec: 45875.4, 60 sec: 42327.4, 300 sec: 41987.5). Total num frames: 3560734720. Throughput: 0: 42111.0. Samples: 784860880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:40,501][18875] Avg episode reward: [(0, '0.887')] [2024-06-18 21:44:42,312][19107] Updated weights for policy 0, policy_version 217335 (0.0033) [2024-06-18 21:44:45,500][18875] Fps is (10 sec: 37696.4, 60 sec: 41232.9, 300 sec: 41931.9). Total num frames: 3560914944. Throughput: 0: 41912.8. Samples: 784982880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:45,501][18875] Avg episode reward: [(0, '0.726')] [2024-06-18 21:44:46,457][19107] Updated weights for policy 0, policy_version 217345 (0.0038) [2024-06-18 21:44:49,960][19107] Updated weights for policy 0, policy_version 217355 (0.0037) [2024-06-18 21:44:50,502][18875] Fps is (10 sec: 44229.6, 60 sec: 42597.3, 300 sec: 42042.8). Total num frames: 3561177088. Throughput: 0: 41974.9. Samples: 785237940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:50,503][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 21:44:54,359][19107] Updated weights for policy 0, policy_version 217365 (0.0034) [2024-06-18 21:44:55,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3561340928. Throughput: 0: 42258.6. Samples: 785496040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:44:55,501][18875] Avg episode reward: [(0, '0.311')] [2024-06-18 21:44:57,646][19107] Updated weights for policy 0, policy_version 217375 (0.0046) [2024-06-18 21:45:00,500][18875] Fps is (10 sec: 37689.4, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3561553920. Throughput: 0: 41941.4. Samples: 785613100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:45:00,501][18875] Avg episode reward: [(0, '0.205')] [2024-06-18 21:45:01,963][19107] Updated weights for policy 0, policy_version 217385 (0.0044) [2024-06-18 21:45:05,194][19087] Signal inference workers to stop experience collection... (11500 times) [2024-06-18 21:45:05,243][19087] Signal inference workers to resume experience collection... (11500 times) [2024-06-18 21:45:05,243][19107] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-06-18 21:45:05,246][19107] Updated weights for policy 0, policy_version 217395 (0.0038) [2024-06-18 21:45:05,256][19107] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-06-18 21:45:05,504][18875] Fps is (10 sec: 47496.3, 60 sec: 42595.9, 300 sec: 42098.6). Total num frames: 3561816064. Throughput: 0: 42045.2. Samples: 785874880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:45:05,505][18875] Avg episode reward: [(0, '0.248')] [2024-06-18 21:45:09,719][19107] Updated weights for policy 0, policy_version 217405 (0.0035) [2024-06-18 21:45:10,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 3561963520. Throughput: 0: 42139.0. Samples: 786127940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 21:45:10,501][18875] Avg episode reward: [(0, '0.304')] [2024-06-18 21:45:12,922][19107] Updated weights for policy 0, policy_version 217415 (0.0038) [2024-06-18 21:45:15,500][18875] Fps is (10 sec: 39335.6, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3562209280. Throughput: 0: 41921.8. Samples: 786245680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:45:15,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 21:45:17,320][19107] Updated weights for policy 0, policy_version 217425 (0.0038) [2024-06-18 21:45:20,504][18875] Fps is (10 sec: 45859.0, 60 sec: 42049.8, 300 sec: 41987.0). Total num frames: 3562422272. Throughput: 0: 41870.2. 
Samples: 786500240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:45:20,505][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 21:45:20,841][19107] Updated weights for policy 0, policy_version 217435 (0.0027) [2024-06-18 21:45:25,004][19107] Updated weights for policy 0, policy_version 217445 (0.0042) [2024-06-18 21:45:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3562618880. Throughput: 0: 41996.4. Samples: 786750720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:45:25,501][18875] Avg episode reward: [(0, '0.645')] [2024-06-18 21:45:28,538][19107] Updated weights for policy 0, policy_version 217455 (0.0040) [2024-06-18 21:45:30,500][18875] Fps is (10 sec: 40974.1, 60 sec: 42598.3, 300 sec: 41931.9). Total num frames: 3562831872. Throughput: 0: 42141.8. Samples: 786879260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:45:30,501][18875] Avg episode reward: [(0, '0.716')] [2024-06-18 21:45:33,123][19107] Updated weights for policy 0, policy_version 217465 (0.0030) [2024-06-18 21:45:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41508.6, 300 sec: 41931.9). Total num frames: 3563028480. Throughput: 0: 42234.4. Samples: 787138420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:45:35,501][18875] Avg episode reward: [(0, '0.714')] [2024-06-18 21:45:36,461][19107] Updated weights for policy 0, policy_version 217475 (0.0028) [2024-06-18 21:45:40,500][18875] Fps is (10 sec: 40961.1, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 3563241472. Throughput: 0: 41958.3. Samples: 787384160. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:45:40,500][18875] Avg episode reward: [(0, '0.445')] [2024-06-18 21:45:40,795][19107] Updated weights for policy 0, policy_version 217485 (0.0042) [2024-06-18 21:45:44,486][19107] Updated weights for policy 0, policy_version 217495 (0.0037) [2024-06-18 21:45:45,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3563470848. Throughput: 0: 42163.0. Samples: 787510440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:45:45,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 21:45:48,550][19107] Updated weights for policy 0, policy_version 217505 (0.0033) [2024-06-18 21:45:50,500][18875] Fps is (10 sec: 39321.7, 60 sec: 40961.2, 300 sec: 41820.9). Total num frames: 3563634688. Throughput: 0: 41830.1. Samples: 787757080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:45:50,500][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 21:45:52,318][19107] Updated weights for policy 0, policy_version 217515 (0.0043) [2024-06-18 21:45:55,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 3563880448. Throughput: 0: 41918.2. Samples: 788014260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:45:55,501][18875] Avg episode reward: [(0, '0.594')] [2024-06-18 21:45:56,403][19107] Updated weights for policy 0, policy_version 217525 (0.0035) [2024-06-18 21:46:00,060][19107] Updated weights for policy 0, policy_version 217535 (0.0038) [2024-06-18 21:46:00,500][18875] Fps is (10 sec: 47512.6, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3564109824. Throughput: 0: 42198.6. Samples: 788144620. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:46:00,501][18875] Avg episode reward: [(0, '0.630')] [2024-06-18 21:46:04,120][19107] Updated weights for policy 0, policy_version 217545 (0.0032) [2024-06-18 21:46:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41235.5, 300 sec: 41987.5). Total num frames: 3564290048. Throughput: 0: 41970.4. Samples: 788388760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:46:05,501][18875] Avg episode reward: [(0, '0.590')] [2024-06-18 21:46:08,013][19107] Updated weights for policy 0, policy_version 217555 (0.0044) [2024-06-18 21:46:10,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 41987.4). Total num frames: 3564519424. Throughput: 0: 41896.4. Samples: 788636060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:46:10,501][18875] Avg episode reward: [(0, '0.728')] [2024-06-18 21:46:12,006][19107] Updated weights for policy 0, policy_version 217565 (0.0027) [2024-06-18 21:46:15,504][18875] Fps is (10 sec: 42583.5, 60 sec: 41776.7, 300 sec: 41986.9). Total num frames: 3564716032. Throughput: 0: 41971.5. Samples: 788768120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:46:15,505][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 21:46:15,898][19107] Updated weights for policy 0, policy_version 217575 (0.0037) [2024-06-18 21:46:19,780][19087] Signal inference workers to stop experience collection... (11550 times) [2024-06-18 21:46:19,780][19087] Signal inference workers to resume experience collection... (11550 times) [2024-06-18 21:46:19,808][19107] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-06-18 21:46:19,809][19107] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-06-18 21:46:19,926][19107] Updated weights for policy 0, policy_version 217585 (0.0043) [2024-06-18 21:46:20,500][18875] Fps is (10 sec: 40960.8, 60 sec: 41781.8, 300 sec: 42043.0). Total num frames: 3564929024. Throughput: 0: 41776.1. 
Samples: 789018340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 21:46:20,501][18875] Avg episode reward: [(0, '0.843')] [2024-06-18 21:46:23,733][19107] Updated weights for policy 0, policy_version 217595 (0.0033) [2024-06-18 21:46:25,500][18875] Fps is (10 sec: 42614.1, 60 sec: 42052.4, 300 sec: 41932.0). Total num frames: 3565142016. Throughput: 0: 41806.7. Samples: 789265460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:46:25,500][18875] Avg episode reward: [(0, '0.709')] [2024-06-18 21:46:27,713][19107] Updated weights for policy 0, policy_version 217605 (0.0049) [2024-06-18 21:46:30,504][18875] Fps is (10 sec: 40945.3, 60 sec: 41776.8, 300 sec: 41987.0). Total num frames: 3565338624. Throughput: 0: 41865.3. Samples: 789394520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:46:30,504][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 21:46:31,584][19107] Updated weights for policy 0, policy_version 217615 (0.0050) [2024-06-18 21:46:35,498][19107] Updated weights for policy 0, policy_version 217625 (0.0051) [2024-06-18 21:46:35,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3565568000. Throughput: 0: 41951.8. Samples: 789644920. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:46:35,501][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 21:46:35,507][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000217625_3565568000.pth... [2024-06-18 21:46:35,569][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000217010_3555491840.pth [2024-06-18 21:46:39,382][19107] Updated weights for policy 0, policy_version 217635 (0.0024) [2024-06-18 21:46:40,500][18875] Fps is (10 sec: 42613.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3565764608. Throughput: 0: 41852.5. Samples: 789897620. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:46:40,501][18875] Avg episode reward: [(0, '0.448')] [2024-06-18 21:46:43,364][19107] Updated weights for policy 0, policy_version 217645 (0.0028) [2024-06-18 21:46:45,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3565977600. Throughput: 0: 41612.8. Samples: 790017200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:46:45,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 21:46:47,622][19107] Updated weights for policy 0, policy_version 217655 (0.0023) [2024-06-18 21:46:50,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 3566190592. Throughput: 0: 41882.7. Samples: 790273480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:46:50,501][18875] Avg episode reward: [(0, '0.838')] [2024-06-18 21:46:51,498][19107] Updated weights for policy 0, policy_version 217665 (0.0032) [2024-06-18 21:46:55,349][19107] Updated weights for policy 0, policy_version 217675 (0.0041) [2024-06-18 21:46:55,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3566403584. Throughput: 0: 42109.9. Samples: 790531000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:46:55,501][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 21:46:59,060][19107] Updated weights for policy 0, policy_version 217685 (0.0037) [2024-06-18 21:47:00,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3566600192. Throughput: 0: 41775.7. Samples: 790647880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:47:00,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 21:47:03,094][19107] Updated weights for policy 0, policy_version 217695 (0.0032) [2024-06-18 21:47:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3566829568. Throughput: 0: 41818.2. Samples: 790900160. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:47:05,501][18875] Avg episode reward: [(0, '0.642')] [2024-06-18 21:47:06,715][19107] Updated weights for policy 0, policy_version 217705 (0.0041) [2024-06-18 21:47:10,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.3, 300 sec: 41987.5). Total num frames: 3567009792. Throughput: 0: 42130.2. Samples: 791161320. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:47:10,500][18875] Avg episode reward: [(0, '0.729')] [2024-06-18 21:47:10,715][19107] Updated weights for policy 0, policy_version 217715 (0.0039) [2024-06-18 21:47:15,046][19107] Updated weights for policy 0, policy_version 217725 (0.0038) [2024-06-18 21:47:15,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41781.6, 300 sec: 41931.9). Total num frames: 3567222784. Throughput: 0: 41850.7. Samples: 791277660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:47:15,501][18875] Avg episode reward: [(0, '0.727')] [2024-06-18 21:47:18,390][19107] Updated weights for policy 0, policy_version 217735 (0.0028) [2024-06-18 21:47:20,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3567468544. Throughput: 0: 41899.2. Samples: 791530380. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:47:20,501][18875] Avg episode reward: [(0, '0.598')] [2024-06-18 21:47:22,557][19107] Updated weights for policy 0, policy_version 217745 (0.0029) [2024-06-18 21:47:25,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 3567632384. Throughput: 0: 42004.9. Samples: 791787840. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:47:25,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 21:47:26,003][19087] Signal inference workers to stop experience collection... (11600 times) [2024-06-18 21:47:26,004][19087] Signal inference workers to resume experience collection... 
(11600 times) [2024-06-18 21:47:26,036][19107] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-06-18 21:47:26,036][19107] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-06-18 21:47:26,595][19107] Updated weights for policy 0, policy_version 217755 (0.0037) [2024-06-18 21:47:30,052][19107] Updated weights for policy 0, policy_version 217765 (0.0037) [2024-06-18 21:47:30,504][18875] Fps is (10 sec: 39307.4, 60 sec: 42052.2, 300 sec: 41931.4). Total num frames: 3567861760. Throughput: 0: 41941.7. Samples: 791904720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-18 21:47:30,504][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 21:47:34,260][19107] Updated weights for policy 0, policy_version 217775 (0.0039) [2024-06-18 21:47:35,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3568091136. Throughput: 0: 42052.9. Samples: 792165860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:47:35,501][18875] Avg episode reward: [(0, '0.781')] [2024-06-18 21:47:37,522][19107] Updated weights for policy 0, policy_version 217785 (0.0040) [2024-06-18 21:47:40,500][18875] Fps is (10 sec: 40974.4, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3568271360. Throughput: 0: 42025.3. Samples: 792422140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:47:40,501][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 21:47:42,078][19107] Updated weights for policy 0, policy_version 217795 (0.0036) [2024-06-18 21:47:45,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.5, 300 sec: 41876.4). Total num frames: 3568500736. Throughput: 0: 42059.3. Samples: 792540540. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:47:45,500][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 21:47:45,701][19107] Updated weights for policy 0, policy_version 217805 (0.0027) [2024-06-18 21:47:49,794][19107] Updated weights for policy 0, policy_version 217815 (0.0036) [2024-06-18 21:47:50,500][18875] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 3568713728. Throughput: 0: 42145.9. Samples: 792796720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:47:50,501][18875] Avg episode reward: [(0, '0.585')] [2024-06-18 21:47:53,443][19107] Updated weights for policy 0, policy_version 217825 (0.0031) [2024-06-18 21:47:55,500][18875] Fps is (10 sec: 37682.8, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 3568877568. Throughput: 0: 42088.4. Samples: 793055300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:47:55,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 21:47:57,639][19107] Updated weights for policy 0, policy_version 217835 (0.0036) [2024-06-18 21:48:00,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3569156096. Throughput: 0: 42196.1. Samples: 793176480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:48:00,501][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 21:48:01,149][19107] Updated weights for policy 0, policy_version 217845 (0.0035) [2024-06-18 21:48:05,447][19107] Updated weights for policy 0, policy_version 217855 (0.0039) [2024-06-18 21:48:05,500][18875] Fps is (10 sec: 45874.7, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3569336320. Throughput: 0: 42321.2. Samples: 793434840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:48:05,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 21:48:08,811][19107] Updated weights for policy 0, policy_version 217865 (0.0027) [2024-06-18 21:48:10,500][18875] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 41987.5). 
Total num frames: 3569532928. Throughput: 0: 42193.7. Samples: 793686560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:48:10,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 21:48:13,221][19107] Updated weights for policy 0, policy_version 217875 (0.0038) [2024-06-18 21:48:15,500][18875] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 3569795072. Throughput: 0: 42314.9. Samples: 793808740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:48:15,501][18875] Avg episode reward: [(0, '0.404')] [2024-06-18 21:48:16,570][19107] Updated weights for policy 0, policy_version 217885 (0.0023) [2024-06-18 21:48:20,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3569958912. Throughput: 0: 42209.8. Samples: 794065300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:48:20,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 21:48:21,058][19107] Updated weights for policy 0, policy_version 217895 (0.0042) [2024-06-18 21:48:24,164][19107] Updated weights for policy 0, policy_version 217905 (0.0026) [2024-06-18 21:48:25,500][18875] Fps is (10 sec: 37683.4, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3570171904. Throughput: 0: 42085.0. Samples: 794315960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:48:25,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 21:48:28,666][19107] Updated weights for policy 0, policy_version 217915 (0.0049) [2024-06-18 21:48:30,500][18875] Fps is (10 sec: 45875.7, 60 sec: 42601.0, 300 sec: 42209.6). Total num frames: 3570417664. Throughput: 0: 42348.9. Samples: 794446240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:48:30,501][18875] Avg episode reward: [(0, '0.736')] [2024-06-18 21:48:31,750][19107] Updated weights for policy 0, policy_version 217925 (0.0052) [2024-06-18 21:48:35,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41932.3). 
Total num frames: 3570565120. Throughput: 0: 42185.7. Samples: 794695080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-18 21:48:35,508][18875] Avg episode reward: [(0, '0.664')] [2024-06-18 21:48:35,539][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000217931_3570581504.pth... [2024-06-18 21:48:35,593][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000217318_3560538112.pth [2024-06-18 21:48:36,463][19107] Updated weights for policy 0, policy_version 217935 (0.0042) [2024-06-18 21:48:39,873][19107] Updated weights for policy 0, policy_version 217945 (0.0029) [2024-06-18 21:48:40,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42325.5, 300 sec: 41931.9). Total num frames: 3570810880. Throughput: 0: 41817.9. Samples: 794937100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:48:40,500][18875] Avg episode reward: [(0, '0.563')] [2024-06-18 21:48:44,092][19107] Updated weights for policy 0, policy_version 217955 (0.0033) [2024-06-18 21:48:45,500][18875] Fps is (10 sec: 45875.5, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3571023872. Throughput: 0: 42224.1. Samples: 795076560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:48:45,501][18875] Avg episode reward: [(0, '0.347')] [2024-06-18 21:48:47,488][19107] Updated weights for policy 0, policy_version 217965 (0.0028) [2024-06-18 21:48:50,500][18875] Fps is (10 sec: 37682.8, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 3571187712. Throughput: 0: 42054.7. Samples: 795327300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:48:50,501][18875] Avg episode reward: [(0, '0.378')] [2024-06-18 21:48:51,663][19087] Signal inference workers to stop experience collection... (11650 times) [2024-06-18 21:48:51,663][19087] Signal inference workers to resume experience collection... 
(11650 times) [2024-06-18 21:48:51,700][19107] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-06-18 21:48:51,700][19107] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-06-18 21:48:51,982][19107] Updated weights for policy 0, policy_version 217975 (0.0029) [2024-06-18 21:48:55,212][19107] Updated weights for policy 0, policy_version 217985 (0.0031) [2024-06-18 21:48:55,500][18875] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42043.0). Total num frames: 3571466240. Throughput: 0: 41865.8. Samples: 795570520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:48:55,501][18875] Avg episode reward: [(0, '0.458')] [2024-06-18 21:48:59,631][19107] Updated weights for policy 0, policy_version 217995 (0.0034) [2024-06-18 21:49:00,500][18875] Fps is (10 sec: 45875.5, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3571646464. Throughput: 0: 42276.5. Samples: 795711180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:00,500][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 21:49:02,983][19107] Updated weights for policy 0, policy_version 218005 (0.0039) [2024-06-18 21:49:05,500][18875] Fps is (10 sec: 37682.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3571843072. Throughput: 0: 42102.6. Samples: 795959920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:05,501][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 21:49:07,344][19107] Updated weights for policy 0, policy_version 218015 (0.0034) [2024-06-18 21:49:10,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 3572105216. Throughput: 0: 41866.3. Samples: 796199940. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:10,501][18875] Avg episode reward: [(0, '0.265')] [2024-06-18 21:49:11,313][19107] Updated weights for policy 0, policy_version 218025 (0.0034) [2024-06-18 21:49:15,211][19107] Updated weights for policy 0, policy_version 218035 (0.0046) [2024-06-18 21:49:15,500][18875] Fps is (10 sec: 44237.0, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3572285440. Throughput: 0: 41995.4. Samples: 796336040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:15,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 21:49:18,852][19107] Updated weights for policy 0, policy_version 218045 (0.0046) [2024-06-18 21:49:20,500][18875] Fps is (10 sec: 37682.4, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3572482048. Throughput: 0: 42025.2. Samples: 796586220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:20,501][18875] Avg episode reward: [(0, '0.587')] [2024-06-18 21:49:23,298][19107] Updated weights for policy 0, policy_version 218055 (0.0033) [2024-06-18 21:49:25,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42871.3, 300 sec: 42265.2). Total num frames: 3572744192. Throughput: 0: 42163.8. Samples: 796834480. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:25,509][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 21:49:26,391][19107] Updated weights for policy 0, policy_version 218065 (0.0029) [2024-06-18 21:49:30,500][18875] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 41988.0). Total num frames: 3572924416. Throughput: 0: 42158.1. Samples: 796973680. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:30,501][18875] Avg episode reward: [(0, '0.366')] [2024-06-18 21:49:30,771][19107] Updated weights for policy 0, policy_version 218075 (0.0034) [2024-06-18 21:49:33,833][19107] Updated weights for policy 0, policy_version 218085 (0.0037) [2024-06-18 21:49:35,500][18875] Fps is (10 sec: 37683.5, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3573121024. Throughput: 0: 41993.7. Samples: 797217020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:35,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 21:49:38,832][19107] Updated weights for policy 0, policy_version 218095 (0.0036) [2024-06-18 21:49:40,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3573350400. Throughput: 0: 42290.7. Samples: 797473600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:40,501][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 21:49:41,754][19107] Updated weights for policy 0, policy_version 218105 (0.0024) [2024-06-18 21:49:45,504][18875] Fps is (10 sec: 40945.4, 60 sec: 41776.6, 300 sec: 41876.1). Total num frames: 3573530624. Throughput: 0: 42031.7. Samples: 797602760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-18 21:49:45,505][18875] Avg episode reward: [(0, '0.466')] [2024-06-18 21:49:46,438][19107] Updated weights for policy 0, policy_version 218115 (0.0037) [2024-06-18 21:49:49,679][19107] Updated weights for policy 0, policy_version 218125 (0.0034) [2024-06-18 21:49:50,500][18875] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42154.1). Total num frames: 3573776384. Throughput: 0: 41930.3. Samples: 797846780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:49:50,501][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 21:49:54,158][19107] Updated weights for policy 0, policy_version 218135 (0.0052) [2024-06-18 21:49:55,010][19087] Signal inference workers to stop experience collection... (11700 times) [2024-06-18 21:49:55,064][19107] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-06-18 21:49:55,124][19087] Signal inference workers to resume experience collection... (11700 times) [2024-06-18 21:49:55,124][19107] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-06-18 21:49:55,500][18875] Fps is (10 sec: 45892.3, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3573989376. Throughput: 0: 42229.8. Samples: 798100280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:49:55,508][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 21:49:57,568][19107] Updated weights for policy 0, policy_version 218145 (0.0035) [2024-06-18 21:50:00,500][18875] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 41876.9). Total num frames: 3574169600. Throughput: 0: 41936.2. Samples: 798223160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:00,500][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 21:50:02,391][19107] Updated weights for policy 0, policy_version 218155 (0.0033) [2024-06-18 21:50:05,456][19107] Updated weights for policy 0, policy_version 218165 (0.0027) [2024-06-18 21:50:05,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 3574415360. Throughput: 0: 42002.8. Samples: 798476340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:05,501][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 21:50:10,164][19107] Updated weights for policy 0, policy_version 218175 (0.0052) [2024-06-18 21:50:10,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3574595584. Throughput: 0: 41959.7. 
Samples: 798722660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:10,501][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 21:50:13,539][19107] Updated weights for policy 0, policy_version 218185 (0.0051) [2024-06-18 21:50:15,500][18875] Fps is (10 sec: 37682.7, 60 sec: 41779.2, 300 sec: 41932.4). Total num frames: 3574792192. Throughput: 0: 41524.4. Samples: 798842280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:15,508][18875] Avg episode reward: [(0, '0.691')] [2024-06-18 21:50:17,808][19107] Updated weights for policy 0, policy_version 218195 (0.0046) [2024-06-18 21:50:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3575005184. Throughput: 0: 41704.5. Samples: 799093720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:20,501][18875] Avg episode reward: [(0, '0.691')] [2024-06-18 21:50:21,553][19107] Updated weights for policy 0, policy_version 218205 (0.0029) [2024-06-18 21:50:25,500][18875] Fps is (10 sec: 44237.6, 60 sec: 41506.3, 300 sec: 42043.0). Total num frames: 3575234560. Throughput: 0: 41802.7. Samples: 799354720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:25,501][18875] Avg episode reward: [(0, '0.693')] [2024-06-18 21:50:25,505][19107] Updated weights for policy 0, policy_version 218215 (0.0034) [2024-06-18 21:50:29,387][19107] Updated weights for policy 0, policy_version 218225 (0.0035) [2024-06-18 21:50:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3575414784. Throughput: 0: 41693.1. Samples: 799478800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:30,501][18875] Avg episode reward: [(0, '0.499')] [2024-06-18 21:50:33,083][19107] Updated weights for policy 0, policy_version 218235 (0.0034) [2024-06-18 21:50:35,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3575644160. Throughput: 0: 41878.8. 
Samples: 799731320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:35,501][18875] Avg episode reward: [(0, '0.324')] [2024-06-18 21:50:35,513][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000218240_3575644160.pth... [2024-06-18 21:50:35,575][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000217625_3565568000.pth [2024-06-18 21:50:37,152][19107] Updated weights for policy 0, policy_version 218245 (0.0030) [2024-06-18 21:50:40,504][18875] Fps is (10 sec: 44220.9, 60 sec: 41776.6, 300 sec: 41987.0). Total num frames: 3575857152. Throughput: 0: 41997.9. Samples: 799990340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:40,505][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 21:50:40,710][19107] Updated weights for policy 0, policy_version 218255 (0.0040) [2024-06-18 21:50:45,034][19107] Updated weights for policy 0, policy_version 218265 (0.0030) [2024-06-18 21:50:45,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42054.7, 300 sec: 42098.5). Total num frames: 3576053760. Throughput: 0: 41982.5. Samples: 800112380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:45,501][18875] Avg episode reward: [(0, '0.644')] [2024-06-18 21:50:48,476][19107] Updated weights for policy 0, policy_version 218275 (0.0032) [2024-06-18 21:50:50,500][18875] Fps is (10 sec: 40974.6, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3576266752. Throughput: 0: 41942.2. Samples: 800363740. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:50,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 21:50:52,906][19107] Updated weights for policy 0, policy_version 218285 (0.0037) [2024-06-18 21:50:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 3576479744. Throughput: 0: 42097.7. Samples: 800617060. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-18 21:50:55,501][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 21:50:56,447][19107] Updated weights for policy 0, policy_version 218295 (0.0030) [2024-06-18 21:51:00,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3576692736. Throughput: 0: 42268.2. Samples: 800744340. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:00,500][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 21:51:01,017][19107] Updated weights for policy 0, policy_version 218305 (0.0036) [2024-06-18 21:51:04,056][19107] Updated weights for policy 0, policy_version 218315 (0.0040) [2024-06-18 21:51:05,500][18875] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 3576905728. Throughput: 0: 42255.5. Samples: 800995220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:05,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 21:51:08,510][19107] Updated weights for policy 0, policy_version 218325 (0.0026) [2024-06-18 21:51:10,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42099.1). Total num frames: 3577135104. Throughput: 0: 42287.9. Samples: 801257680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:10,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 21:51:11,717][19107] Updated weights for policy 0, policy_version 218335 (0.0040) [2024-06-18 21:51:15,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3577315328. Throughput: 0: 42330.7. Samples: 801383680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:15,501][18875] Avg episode reward: [(0, '0.262')] [2024-06-18 21:51:16,235][19107] Updated weights for policy 0, policy_version 218345 (0.0039) [2024-06-18 21:51:18,795][19087] Signal inference workers to stop experience collection... 
(11750 times) [2024-06-18 21:51:18,795][19087] Signal inference workers to resume experience collection... (11750 times) [2024-06-18 21:51:18,810][19107] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-06-18 21:51:18,810][19107] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-06-18 21:51:19,255][19107] Updated weights for policy 0, policy_version 218355 (0.0030) [2024-06-18 21:51:20,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3577544704. Throughput: 0: 42266.2. Samples: 801633300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:20,501][18875] Avg episode reward: [(0, '0.262')] [2024-06-18 21:51:24,057][19107] Updated weights for policy 0, policy_version 218365 (0.0041) [2024-06-18 21:51:25,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42099.0). Total num frames: 3577757696. Throughput: 0: 42245.1. Samples: 801891220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:25,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 21:51:26,830][19107] Updated weights for policy 0, policy_version 218375 (0.0034) [2024-06-18 21:51:30,504][18875] Fps is (10 sec: 42583.3, 60 sec: 42595.9, 300 sec: 42042.5). Total num frames: 3577970688. Throughput: 0: 42303.4. Samples: 802016180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:30,504][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 21:51:31,647][19107] Updated weights for policy 0, policy_version 218385 (0.0031) [2024-06-18 21:51:34,699][19107] Updated weights for policy 0, policy_version 218395 (0.0031) [2024-06-18 21:51:35,504][18875] Fps is (10 sec: 44220.7, 60 sec: 42595.7, 300 sec: 42153.6). Total num frames: 3578200064. Throughput: 0: 42398.8. Samples: 802271840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:35,505][18875] Avg episode reward: [(0, '0.749')] [2024-06-18 21:51:39,226][19107] Updated weights for policy 0, policy_version 218405 (0.0032) [2024-06-18 21:51:40,500][18875] Fps is (10 sec: 42613.6, 60 sec: 42327.9, 300 sec: 42098.6). Total num frames: 3578396672. Throughput: 0: 42500.5. Samples: 802529580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:40,501][18875] Avg episode reward: [(0, '0.783')] [2024-06-18 21:51:42,310][19107] Updated weights for policy 0, policy_version 218415 (0.0039) [2024-06-18 21:51:45,500][18875] Fps is (10 sec: 39336.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3578593280. Throughput: 0: 42423.1. Samples: 802653380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:45,501][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 21:51:47,245][19107] Updated weights for policy 0, policy_version 218425 (0.0028) [2024-06-18 21:51:50,298][19107] Updated weights for policy 0, policy_version 218435 (0.0039) [2024-06-18 21:51:50,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 3578839040. Throughput: 0: 42562.8. Samples: 802910540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:50,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 21:51:54,899][19107] Updated weights for policy 0, policy_version 218445 (0.0035) [2024-06-18 21:51:55,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3579002880. Throughput: 0: 42195.0. Samples: 803156460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:51:55,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 21:51:57,966][19107] Updated weights for policy 0, policy_version 218455 (0.0032) [2024-06-18 21:52:00,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3579232256. Throughput: 0: 41994.7. Samples: 803273440. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-18 21:52:00,500][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 21:52:02,755][19107] Updated weights for policy 0, policy_version 218465 (0.0032) [2024-06-18 21:52:05,504][18875] Fps is (10 sec: 45859.1, 60 sec: 42595.9, 300 sec: 42209.1). Total num frames: 3579461632. Throughput: 0: 42355.7. Samples: 803539460. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:05,505][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 21:52:05,647][19107] Updated weights for policy 0, policy_version 218475 (0.0034) [2024-06-18 21:52:10,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3579625472. Throughput: 0: 42203.2. Samples: 803790360. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:10,501][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 21:52:10,709][19107] Updated weights for policy 0, policy_version 218485 (0.0027) [2024-06-18 21:52:13,483][19107] Updated weights for policy 0, policy_version 218495 (0.0028) [2024-06-18 21:52:15,504][18875] Fps is (10 sec: 39321.5, 60 sec: 42322.8, 300 sec: 41987.0). Total num frames: 3579854848. Throughput: 0: 41977.7. Samples: 803905180. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:15,505][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 21:52:18,595][19107] Updated weights for policy 0, policy_version 218505 (0.0041) [2024-06-18 21:52:20,500][18875] Fps is (10 sec: 45874.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 3580084224. Throughput: 0: 42126.4. Samples: 804167380. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:20,501][18875] Avg episode reward: [(0, '0.621')] [2024-06-18 21:52:21,147][19107] Updated weights for policy 0, policy_version 218515 (0.0028) [2024-06-18 21:52:25,500][18875] Fps is (10 sec: 40975.3, 60 sec: 41779.3, 300 sec: 42043.5). Total num frames: 3580264448. Throughput: 0: 42117.9. Samples: 804424880. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:25,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 21:52:26,493][19107] Updated weights for policy 0, policy_version 218525 (0.0041) [2024-06-18 21:52:28,923][19107] Updated weights for policy 0, policy_version 218535 (0.0033) [2024-06-18 21:52:30,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42327.8, 300 sec: 42098.5). Total num frames: 3580510208. Throughput: 0: 42018.6. Samples: 804544220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:30,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 21:52:34,134][19107] Updated weights for policy 0, policy_version 218545 (0.0038) [2024-06-18 21:52:35,500][18875] Fps is (10 sec: 44236.2, 60 sec: 41781.7, 300 sec: 42154.1). Total num frames: 3580706816. Throughput: 0: 42100.4. Samples: 804805060. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:35,501][18875] Avg episode reward: [(0, '0.708')] [2024-06-18 21:52:35,602][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000218550_3580723200.pth... [2024-06-18 21:52:35,648][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000217931_3570581504.pth [2024-06-18 21:52:36,943][19107] Updated weights for policy 0, policy_version 218555 (0.0032) [2024-06-18 21:52:40,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3580903424. Throughput: 0: 42151.3. Samples: 805053260. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:40,501][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 21:52:41,895][19107] Updated weights for policy 0, policy_version 218565 (0.0030) [2024-06-18 21:52:43,295][19087] Signal inference workers to stop experience collection... (11800 times) [2024-06-18 21:52:43,336][19107] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-06-18 21:52:43,344][19087] Signal inference workers to resume experience collection... 
(11800 times) [2024-06-18 21:52:43,350][19107] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-06-18 21:52:44,706][19107] Updated weights for policy 0, policy_version 218575 (0.0040) [2024-06-18 21:52:45,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42598.2, 300 sec: 42154.1). Total num frames: 3581149184. Throughput: 0: 42366.0. Samples: 805179920. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:45,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 21:52:49,429][19107] Updated weights for policy 0, policy_version 218585 (0.0035) [2024-06-18 21:52:50,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 3581329408. Throughput: 0: 42234.5. Samples: 805439860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:50,501][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 21:52:52,328][19107] Updated weights for policy 0, policy_version 218595 (0.0047) [2024-06-18 21:52:55,500][18875] Fps is (10 sec: 37683.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3581526016. Throughput: 0: 42170.5. Samples: 805688040. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:52:55,501][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 21:52:57,289][19107] Updated weights for policy 0, policy_version 218605 (0.0038) [2024-06-18 21:53:00,014][19107] Updated weights for policy 0, policy_version 218615 (0.0031) [2024-06-18 21:53:00,500][18875] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 3581804544. Throughput: 0: 42481.2. Samples: 805816680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:53:00,501][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 21:53:04,985][19107] Updated weights for policy 0, policy_version 218625 (0.0041) [2024-06-18 21:53:05,500][18875] Fps is (10 sec: 44237.8, 60 sec: 41781.8, 300 sec: 42154.1). Total num frames: 3581968384. Throughput: 0: 42402.0. Samples: 806075460. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:53:05,500][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 21:53:07,749][19107] Updated weights for policy 0, policy_version 218635 (0.0037) [2024-06-18 21:53:10,500][18875] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3582181376. Throughput: 0: 42367.9. Samples: 806331440. Policy #0 lag: (min: 1.0, avg: 9.9, max: 20.0) [2024-06-18 21:53:10,501][18875] Avg episode reward: [(0, '0.330')] [2024-06-18 21:53:12,701][19107] Updated weights for policy 0, policy_version 218645 (0.0036) [2024-06-18 21:53:15,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42874.1, 300 sec: 42265.2). Total num frames: 3582427136. Throughput: 0: 42509.3. Samples: 806457140. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:53:15,501][18875] Avg episode reward: [(0, '0.384')] [2024-06-18 21:53:15,685][19107] Updated weights for policy 0, policy_version 218655 (0.0039) [2024-06-18 21:53:20,239][19107] Updated weights for policy 0, policy_version 218665 (0.0040) [2024-06-18 21:53:20,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.5, 300 sec: 42154.1). Total num frames: 3582607360. Throughput: 0: 42501.9. Samples: 806717640. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:53:20,500][18875] Avg episode reward: [(0, '0.372')] [2024-06-18 21:53:23,371][19107] Updated weights for policy 0, policy_version 218675 (0.0025) [2024-06-18 21:53:25,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3582820352. Throughput: 0: 42459.6. Samples: 806963940. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:53:25,501][18875] Avg episode reward: [(0, '0.345')] [2024-06-18 21:53:27,917][19107] Updated weights for policy 0, policy_version 218685 (0.0040) [2024-06-18 21:53:30,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 3583049728. Throughput: 0: 42550.9. Samples: 807094700. 
Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:53:30,500][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 21:53:31,041][19107] Updated weights for policy 0, policy_version 218695 (0.0036) [2024-06-18 21:53:35,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3583246336. Throughput: 0: 42483.6. Samples: 807351620. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:53:35,501][18875] Avg episode reward: [(0, '0.385')] [2024-06-18 21:53:35,547][19107] Updated weights for policy 0, policy_version 218705 (0.0033) [2024-06-18 21:53:38,904][19107] Updated weights for policy 0, policy_version 218715 (0.0039) [2024-06-18 21:53:40,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 3583475712. Throughput: 0: 42502.7. Samples: 807600660. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:53:40,501][18875] Avg episode reward: [(0, '0.492')] [2024-06-18 21:53:43,595][19107] Updated weights for policy 0, policy_version 218725 (0.0026) [2024-06-18 21:53:45,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 3583672320. Throughput: 0: 42534.5. Samples: 807730740. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:53:45,501][18875] Avg episode reward: [(0, '0.381')] [2024-06-18 21:53:47,240][19107] Updated weights for policy 0, policy_version 218735 (0.0027) [2024-06-18 21:53:50,500][18875] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3583868928. Throughput: 0: 42220.5. Samples: 807975380. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:53:50,500][18875] Avg episode reward: [(0, '0.644')] [2024-06-18 21:53:51,267][19107] Updated weights for policy 0, policy_version 218745 (0.0026) [2024-06-18 21:53:54,991][19107] Updated weights for policy 0, policy_version 218755 (0.0035) [2024-06-18 21:53:55,500][18875] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42265.1). 
Total num frames: 3584114688. Throughput: 0: 42195.5. Samples: 808230240. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:53:55,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 21:53:59,322][19107] Updated weights for policy 0, policy_version 218765 (0.0039) [2024-06-18 21:54:00,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 3584311296. Throughput: 0: 42219.6. Samples: 808357020. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:54:00,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 21:54:02,543][19107] Updated weights for policy 0, policy_version 218775 (0.0027) [2024-06-18 21:54:05,504][18875] Fps is (10 sec: 40945.4, 60 sec: 42595.8, 300 sec: 42098.0). Total num frames: 3584524288. Throughput: 0: 41872.1. Samples: 808602040. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:54:05,505][18875] Avg episode reward: [(0, '0.441')] [2024-06-18 21:54:06,916][19107] Updated weights for policy 0, policy_version 218785 (0.0038) [2024-06-18 21:54:10,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3584720896. Throughput: 0: 42082.6. Samples: 808857660. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:54:10,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 21:54:10,525][19107] Updated weights for policy 0, policy_version 218795 (0.0031) [2024-06-18 21:54:14,484][19107] Updated weights for policy 0, policy_version 218805 (0.0035) [2024-06-18 21:54:15,500][18875] Fps is (10 sec: 40974.6, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 3584933888. Throughput: 0: 42048.7. Samples: 808986900. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:54:15,501][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 21:54:18,210][19107] Updated weights for policy 0, policy_version 218815 (0.0033) [2024-06-18 21:54:20,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41987.5). 
Total num frames: 3585130496. Throughput: 0: 41868.0. Samples: 809235680. Policy #0 lag: (min: 1.0, avg: 8.3, max: 20.0) [2024-06-18 21:54:20,501][18875] Avg episode reward: [(0, '0.745')] [2024-06-18 21:54:22,257][19087] Signal inference workers to stop experience collection... (11850 times) [2024-06-18 21:54:22,257][19087] Signal inference workers to resume experience collection... (11850 times) [2024-06-18 21:54:22,260][19107] Updated weights for policy 0, policy_version 218825 (0.0038) [2024-06-18 21:54:22,304][19107] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-06-18 21:54:22,304][19107] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-06-18 21:54:25,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3585376256. Throughput: 0: 42002.7. Samples: 809490780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:54:25,501][18875] Avg episode reward: [(0, '0.664')] [2024-06-18 21:54:25,823][19107] Updated weights for policy 0, policy_version 218835 (0.0041) [2024-06-18 21:54:29,978][19107] Updated weights for policy 0, policy_version 218845 (0.0048) [2024-06-18 21:54:30,500][18875] Fps is (10 sec: 44235.8, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 3585572864. Throughput: 0: 42008.0. Samples: 809621100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:54:30,501][18875] Avg episode reward: [(0, '0.738')] [2024-06-18 21:54:33,632][19107] Updated weights for policy 0, policy_version 218855 (0.0039) [2024-06-18 21:54:35,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 3585785856. Throughput: 0: 42049.1. Samples: 809867600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:54:35,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 21:54:35,521][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000218859_3585785856.pth... 
[2024-06-18 21:54:35,583][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000218240_3575644160.pth [2024-06-18 21:54:37,661][19107] Updated weights for policy 0, policy_version 218865 (0.0036) [2024-06-18 21:54:40,502][18875] Fps is (10 sec: 42593.2, 60 sec: 42051.4, 300 sec: 42265.5). Total num frames: 3585998848. Throughput: 0: 42101.9. Samples: 810124880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:54:40,502][18875] Avg episode reward: [(0, '0.413')] [2024-06-18 21:54:41,486][19107] Updated weights for policy 0, policy_version 218875 (0.0041) [2024-06-18 21:54:45,439][19107] Updated weights for policy 0, policy_version 218885 (0.0037) [2024-06-18 21:54:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3586211840. Throughput: 0: 42068.3. Samples: 810250100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:54:45,501][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 21:54:49,294][19107] Updated weights for policy 0, policy_version 218895 (0.0033) [2024-06-18 21:54:50,501][18875] Fps is (10 sec: 40964.8, 60 sec: 42325.1, 300 sec: 42098.5). Total num frames: 3586408448. Throughput: 0: 42283.2. Samples: 810504640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:54:50,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 21:54:53,298][19107] Updated weights for policy 0, policy_version 218905 (0.0037) [2024-06-18 21:54:55,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 3586621440. Throughput: 0: 42270.5. Samples: 810759840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:54:55,501][18875] Avg episode reward: [(0, '0.685')] [2024-06-18 21:54:57,033][19107] Updated weights for policy 0, policy_version 218915 (0.0028) [2024-06-18 21:55:00,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3586834432. Throughput: 0: 42238.2. Samples: 810887620. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:55:00,501][18875] Avg episode reward: [(0, '0.685')] [2024-06-18 21:55:00,765][19107] Updated weights for policy 0, policy_version 218925 (0.0029) [2024-06-18 21:55:04,811][19107] Updated weights for policy 0, policy_version 218935 (0.0033) [2024-06-18 21:55:05,500][18875] Fps is (10 sec: 40960.9, 60 sec: 41781.8, 300 sec: 42154.1). Total num frames: 3587031040. Throughput: 0: 42329.8. Samples: 811140520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:55:05,501][18875] Avg episode reward: [(0, '0.803')] [2024-06-18 21:55:08,448][19107] Updated weights for policy 0, policy_version 218945 (0.0044) [2024-06-18 21:55:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3587244032. Throughput: 0: 42200.9. Samples: 811389820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:55:10,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 21:55:12,597][19107] Updated weights for policy 0, policy_version 218955 (0.0038) [2024-06-18 21:55:15,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 3587473408. Throughput: 0: 42218.8. Samples: 811520940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:55:15,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 21:55:16,350][19107] Updated weights for policy 0, policy_version 218965 (0.0036) [2024-06-18 21:55:20,307][19107] Updated weights for policy 0, policy_version 218975 (0.0043) [2024-06-18 21:55:20,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3587686400. Throughput: 0: 42315.6. Samples: 811771800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:55:20,501][18875] Avg episode reward: [(0, '0.777')] [2024-06-18 21:55:23,941][19107] Updated weights for policy 0, policy_version 218985 (0.0037) [2024-06-18 21:55:25,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42320.7). 
Total num frames: 3587899392. Throughput: 0: 42177.8. Samples: 812022820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 21:55:25,500][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 21:55:28,336][19107] Updated weights for policy 0, policy_version 218995 (0.0039) [2024-06-18 21:55:30,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 3588112384. Throughput: 0: 42258.8. Samples: 812151740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:55:30,501][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 21:55:31,610][19107] Updated weights for policy 0, policy_version 219005 (0.0034) [2024-06-18 21:55:35,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.4, 300 sec: 42210.1). Total num frames: 3588308992. Throughput: 0: 42238.4. Samples: 812405360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:55:35,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 21:55:36,019][19107] Updated weights for policy 0, policy_version 219015 (0.0036) [2024-06-18 21:55:39,445][19107] Updated weights for policy 0, policy_version 219025 (0.0036) [2024-06-18 21:55:40,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42326.3, 300 sec: 42320.7). Total num frames: 3588538368. Throughput: 0: 42036.6. Samples: 812651480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:55:40,501][18875] Avg episode reward: [(0, '0.663')] [2024-06-18 21:55:43,696][19107] Updated weights for policy 0, policy_version 219035 (0.0040) [2024-06-18 21:55:45,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 3588734976. Throughput: 0: 42105.8. Samples: 812782380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:55:45,501][18875] Avg episode reward: [(0, '0.755')] [2024-06-18 21:55:47,108][19107] Updated weights for policy 0, policy_version 219045 (0.0034) [2024-06-18 21:55:50,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42320.7). 
Total num frames: 3588964352. Throughput: 0: 42260.3. Samples: 813042240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:55:50,501][18875] Avg episode reward: [(0, '0.707')] [2024-06-18 21:55:51,907][19107] Updated weights for policy 0, policy_version 219055 (0.0036) [2024-06-18 21:55:54,863][19107] Updated weights for policy 0, policy_version 219065 (0.0034) [2024-06-18 21:55:55,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 3589177344. Throughput: 0: 42192.8. Samples: 813288500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:55:55,504][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 21:55:59,585][19107] Updated weights for policy 0, policy_version 219075 (0.0029) [2024-06-18 21:56:00,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3589357568. Throughput: 0: 42128.9. Samples: 813416740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:56:00,501][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 21:56:02,268][19087] Signal inference workers to stop experience collection... (11900 times) [2024-06-18 21:56:02,313][19107] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-06-18 21:56:02,318][19087] Signal inference workers to resume experience collection... (11900 times) [2024-06-18 21:56:02,327][19107] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-06-18 21:56:02,623][19107] Updated weights for policy 0, policy_version 219085 (0.0029) [2024-06-18 21:56:05,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3589570560. Throughput: 0: 42146.3. Samples: 813668380. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:56:05,501][18875] Avg episode reward: [(0, '0.704')] [2024-06-18 21:56:07,321][19107] Updated weights for policy 0, policy_version 219095 (0.0037) [2024-06-18 21:56:10,337][19107] Updated weights for policy 0, policy_version 219105 (0.0031) [2024-06-18 21:56:10,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 3589816320. Throughput: 0: 42191.4. Samples: 813921440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:56:10,501][18875] Avg episode reward: [(0, '0.704')] [2024-06-18 21:56:14,914][19107] Updated weights for policy 0, policy_version 219115 (0.0034) [2024-06-18 21:56:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3589980160. Throughput: 0: 42268.0. Samples: 814053800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:56:15,501][18875] Avg episode reward: [(0, '0.770')] [2024-06-18 21:56:18,230][19107] Updated weights for policy 0, policy_version 219125 (0.0039) [2024-06-18 21:56:20,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3590209536. Throughput: 0: 42064.4. Samples: 814298260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:56:20,501][18875] Avg episode reward: [(0, '0.673')] [2024-06-18 21:56:22,959][19107] Updated weights for policy 0, policy_version 219135 (0.0049) [2024-06-18 21:56:25,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42210.1). Total num frames: 3590422528. Throughput: 0: 42333.8. Samples: 814556500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:56:25,501][18875] Avg episode reward: [(0, '0.709')] [2024-06-18 21:56:26,394][19107] Updated weights for policy 0, policy_version 219145 (0.0032) [2024-06-18 21:56:30,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42099.1). Total num frames: 3590619136. Throughput: 0: 42272.5. Samples: 814684640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:56:30,501][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 21:56:30,558][19107] Updated weights for policy 0, policy_version 219155 (0.0033) [2024-06-18 21:56:33,977][19107] Updated weights for policy 0, policy_version 219165 (0.0053) [2024-06-18 21:56:35,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 3590864896. Throughput: 0: 42043.6. Samples: 814934200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-18 21:56:35,500][18875] Avg episode reward: [(0, '0.228')] [2024-06-18 21:56:35,515][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000219169_3590864896.pth... [2024-06-18 21:56:35,566][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000218550_3580723200.pth [2024-06-18 21:56:38,771][19107] Updated weights for policy 0, policy_version 219175 (0.0032) [2024-06-18 21:56:40,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 3591061504. Throughput: 0: 42209.8. Samples: 815187940. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:56:40,501][18875] Avg episode reward: [(0, '0.740')] [2024-06-18 21:56:41,697][19107] Updated weights for policy 0, policy_version 219185 (0.0041) [2024-06-18 21:56:45,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3591258112. Throughput: 0: 42112.1. Samples: 815311780. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:56:45,501][18875] Avg episode reward: [(0, '0.754')] [2024-06-18 21:56:46,228][19107] Updated weights for policy 0, policy_version 219195 (0.0038) [2024-06-18 21:56:49,389][19107] Updated weights for policy 0, policy_version 219205 (0.0036) [2024-06-18 21:56:50,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 3591503872. Throughput: 0: 42287.9. Samples: 815571340. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:56:50,502][18875] Avg episode reward: [(0, '0.754')] [2024-06-18 21:56:53,670][19107] Updated weights for policy 0, policy_version 219215 (0.0047) [2024-06-18 21:56:55,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41233.2, 300 sec: 42098.6). Total num frames: 3591651328. Throughput: 0: 42586.4. Samples: 815837820. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:56:55,500][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 21:56:57,031][19107] Updated weights for policy 0, policy_version 219225 (0.0029) [2024-06-18 21:57:00,500][18875] Fps is (10 sec: 37683.1, 60 sec: 42052.2, 300 sec: 42099.1). Total num frames: 3591880704. Throughput: 0: 42054.6. Samples: 815946260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:00,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 21:57:01,443][19107] Updated weights for policy 0, policy_version 219235 (0.0043) [2024-06-18 21:57:04,976][19107] Updated weights for policy 0, policy_version 219245 (0.0034) [2024-06-18 21:57:05,500][18875] Fps is (10 sec: 47512.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 3592126464. Throughput: 0: 42338.2. Samples: 816203480. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:05,501][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 21:57:09,321][19107] Updated weights for policy 0, policy_version 219255 (0.0051) [2024-06-18 21:57:10,500][18875] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 42154.6). Total num frames: 3592290304. Throughput: 0: 42296.4. Samples: 816459840. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:10,501][18875] Avg episode reward: [(0, '0.485')] [2024-06-18 21:57:12,430][19087] Signal inference workers to stop experience collection... 
(11950 times) [2024-06-18 21:57:12,463][19107] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-06-18 21:57:12,545][19087] Signal inference workers to resume experience collection... (11950 times) [2024-06-18 21:57:12,546][19107] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-06-18 21:57:12,686][19107] Updated weights for policy 0, policy_version 219265 (0.0028) [2024-06-18 21:57:15,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3592536064. Throughput: 0: 42011.5. Samples: 816575160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:15,501][18875] Avg episode reward: [(0, '0.676')] [2024-06-18 21:57:17,286][19107] Updated weights for policy 0, policy_version 219275 (0.0028) [2024-06-18 21:57:20,456][19107] Updated weights for policy 0, policy_version 219285 (0.0034) [2024-06-18 21:57:20,500][18875] Fps is (10 sec: 47513.4, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 3592765440. Throughput: 0: 42359.9. Samples: 816840400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:20,504][18875] Avg episode reward: [(0, '0.750')] [2024-06-18 21:57:24,866][19107] Updated weights for policy 0, policy_version 219295 (0.0039) [2024-06-18 21:57:25,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3592929280. Throughput: 0: 42210.2. Samples: 817087400. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:25,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 21:57:28,211][19107] Updated weights for policy 0, policy_version 219305 (0.0030) [2024-06-18 21:57:30,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 3593175040. Throughput: 0: 42175.5. Samples: 817209680. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:30,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 21:57:32,755][19107] Updated weights for policy 0, policy_version 219315 (0.0045) [2024-06-18 21:57:35,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 3593371648. Throughput: 0: 42160.1. Samples: 817468540. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:35,500][18875] Avg episode reward: [(0, '0.728')] [2024-06-18 21:57:36,507][19107] Updated weights for policy 0, policy_version 219325 (0.0028) [2024-06-18 21:57:40,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3593568256. Throughput: 0: 41655.8. Samples: 817712340. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:40,510][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 21:57:40,539][19107] Updated weights for policy 0, policy_version 219335 (0.0042) [2024-06-18 21:57:44,274][19107] Updated weights for policy 0, policy_version 219345 (0.0041) [2024-06-18 21:57:45,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 3593814016. Throughput: 0: 42053.3. Samples: 817838660. Policy #0 lag: (min: 0.0, avg: 12.0, max: 23.0) [2024-06-18 21:57:45,510][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 21:57:48,470][19107] Updated weights for policy 0, policy_version 219355 (0.0028) [2024-06-18 21:57:50,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 3594010624. Throughput: 0: 42097.3. Samples: 818097860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:57:50,501][18875] Avg episode reward: [(0, '0.638')] [2024-06-18 21:57:51,998][19107] Updated weights for policy 0, policy_version 219365 (0.0041) [2024-06-18 21:57:55,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 3594207232. Throughput: 0: 41804.9. Samples: 818341060. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:57:55,501][18875] Avg episode reward: [(0, '0.574')] [2024-06-18 21:57:56,135][19107] Updated weights for policy 0, policy_version 219375 (0.0037) [2024-06-18 21:57:59,735][19107] Updated weights for policy 0, policy_version 219385 (0.0035) [2024-06-18 21:58:00,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3594420224. Throughput: 0: 42117.4. Samples: 818470440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:00,501][18875] Avg episode reward: [(0, '0.660')] [2024-06-18 21:58:03,805][19107] Updated weights for policy 0, policy_version 219395 (0.0042) [2024-06-18 21:58:05,504][18875] Fps is (10 sec: 42583.2, 60 sec: 41776.7, 300 sec: 42209.1). Total num frames: 3594633216. Throughput: 0: 41883.4. Samples: 818725300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:05,504][18875] Avg episode reward: [(0, '0.758')] [2024-06-18 21:58:07,586][19107] Updated weights for policy 0, policy_version 219405 (0.0044) [2024-06-18 21:58:10,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 3594862592. Throughput: 0: 41892.4. Samples: 818972560. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:10,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 21:58:11,598][19107] Updated weights for policy 0, policy_version 219415 (0.0031) [2024-06-18 21:58:15,256][19107] Updated weights for policy 0, policy_version 219425 (0.0039) [2024-06-18 21:58:15,500][18875] Fps is (10 sec: 42613.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3595059200. Throughput: 0: 42058.7. Samples: 819102320. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:15,501][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 21:58:19,279][19107] Updated weights for policy 0, policy_version 219435 (0.0031) [2024-06-18 21:58:20,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 3595255808. Throughput: 0: 42035.5. Samples: 819360140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:20,501][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 21:58:22,987][19107] Updated weights for policy 0, policy_version 219445 (0.0034) [2024-06-18 21:58:25,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42209.6). Total num frames: 3595501568. Throughput: 0: 42229.1. Samples: 819612640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:25,501][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 21:58:26,751][19107] Updated weights for policy 0, policy_version 219455 (0.0031) [2024-06-18 21:58:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3595681792. Throughput: 0: 42344.1. Samples: 819744140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:30,501][18875] Avg episode reward: [(0, '0.408')] [2024-06-18 21:58:30,930][19107] Updated weights for policy 0, policy_version 219465 (0.0030) [2024-06-18 21:58:34,316][19107] Updated weights for policy 0, policy_version 219475 (0.0031) [2024-06-18 21:58:35,500][18875] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3595894784. Throughput: 0: 42159.7. Samples: 819995040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:35,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 21:58:35,513][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000219476_3595894784.pth... 
[2024-06-18 21:58:35,562][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000218859_3585785856.pth [2024-06-18 21:58:38,605][19107] Updated weights for policy 0, policy_version 219485 (0.0052) [2024-06-18 21:58:40,500][18875] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 3596140544. Throughput: 0: 42378.1. Samples: 820248080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:40,501][18875] Avg episode reward: [(0, '0.636')] [2024-06-18 21:58:42,311][19107] Updated weights for policy 0, policy_version 219495 (0.0030) [2024-06-18 21:58:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 3596320768. Throughput: 0: 42461.4. Samples: 820381200. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:45,500][18875] Avg episode reward: [(0, '0.458')] [2024-06-18 21:58:46,418][19107] Updated weights for policy 0, policy_version 219505 (0.0031) [2024-06-18 21:58:50,403][19107] Updated weights for policy 0, policy_version 219515 (0.0039) [2024-06-18 21:58:50,500][18875] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3596533760. Throughput: 0: 42284.2. Samples: 820627940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 21:58:50,501][18875] Avg episode reward: [(0, '0.457')] [2024-06-18 21:58:52,174][19087] Signal inference workers to stop experience collection... (12000 times) [2024-06-18 21:58:52,232][19107] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-06-18 21:58:52,234][19087] Signal inference workers to resume experience collection... (12000 times) [2024-06-18 21:58:52,242][19107] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-06-18 21:58:54,305][19107] Updated weights for policy 0, policy_version 219525 (0.0044) [2024-06-18 21:58:55,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3596746752. 
Throughput: 0: 42278.3. Samples: 820875080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:58:55,501][18875] Avg episode reward: [(0, '0.702')] [2024-06-18 21:58:58,089][19107] Updated weights for policy 0, policy_version 219535 (0.0043) [2024-06-18 21:59:00,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42099.1). Total num frames: 3596943360. Throughput: 0: 42175.5. Samples: 821000220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:00,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 21:59:02,114][19107] Updated weights for policy 0, policy_version 219545 (0.0036) [2024-06-18 21:59:05,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42054.8, 300 sec: 42154.1). Total num frames: 3597156352. Throughput: 0: 42157.8. Samples: 821257240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:05,501][18875] Avg episode reward: [(0, '0.281')] [2024-06-18 21:59:05,891][19107] Updated weights for policy 0, policy_version 219555 (0.0035) [2024-06-18 21:59:09,818][19107] Updated weights for policy 0, policy_version 219565 (0.0037) [2024-06-18 21:59:10,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3597385728. Throughput: 0: 42067.4. Samples: 821505680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:10,501][18875] Avg episode reward: [(0, '0.308')] [2024-06-18 21:59:13,557][19107] Updated weights for policy 0, policy_version 219575 (0.0040) [2024-06-18 21:59:15,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 3597598720. Throughput: 0: 41917.7. Samples: 821630440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:15,501][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 21:59:17,527][19107] Updated weights for policy 0, policy_version 219585 (0.0039) [2024-06-18 21:59:20,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3597795328. 
Throughput: 0: 42038.2. Samples: 821886760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:20,501][18875] Avg episode reward: [(0, '0.728')] [2024-06-18 21:59:21,485][19107] Updated weights for policy 0, policy_version 219595 (0.0034) [2024-06-18 21:59:25,323][19107] Updated weights for policy 0, policy_version 219605 (0.0039) [2024-06-18 21:59:25,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 3598008320. Throughput: 0: 42071.2. Samples: 822141280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:25,504][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 21:59:28,946][19107] Updated weights for policy 0, policy_version 219615 (0.0030) [2024-06-18 21:59:30,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 3598221312. Throughput: 0: 41946.9. Samples: 822268820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:30,501][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 21:59:33,126][19107] Updated weights for policy 0, policy_version 219625 (0.0046) [2024-06-18 21:59:35,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42154.3). Total num frames: 3598434304. Throughput: 0: 42078.3. Samples: 822521460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:35,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 21:59:36,664][19107] Updated weights for policy 0, policy_version 219635 (0.0043) [2024-06-18 21:59:40,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41233.2, 300 sec: 42043.0). Total num frames: 3598614528. Throughput: 0: 42286.3. Samples: 822777960. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:40,501][18875] Avg episode reward: [(0, '0.683')] [2024-06-18 21:59:40,893][19107] Updated weights for policy 0, policy_version 219645 (0.0051) [2024-06-18 21:59:44,727][19107] Updated weights for policy 0, policy_version 219655 (0.0027) [2024-06-18 21:59:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42209.7). Total num frames: 3598860288. Throughput: 0: 42254.4. Samples: 822901660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:45,500][18875] Avg episode reward: [(0, '0.827')] [2024-06-18 21:59:48,499][19107] Updated weights for policy 0, policy_version 219665 (0.0037) [2024-06-18 21:59:50,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3599040512. Throughput: 0: 42173.8. Samples: 823155060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:50,501][18875] Avg episode reward: [(0, '0.881')] [2024-06-18 21:59:52,527][19107] Updated weights for policy 0, policy_version 219675 (0.0035) [2024-06-18 21:59:55,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3599253504. Throughput: 0: 42241.4. Samples: 823406540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 21:59:55,501][18875] Avg episode reward: [(0, '0.549')] [2024-06-18 21:59:56,345][19107] Updated weights for policy 0, policy_version 219685 (0.0033) [2024-06-18 22:00:00,022][19107] Updated weights for policy 0, policy_version 219695 (0.0031) [2024-06-18 22:00:00,500][18875] Fps is (10 sec: 44235.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3599482880. Throughput: 0: 42380.8. Samples: 823537580. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-18 22:00:00,501][18875] Avg episode reward: [(0, '0.399')] [2024-06-18 22:00:04,070][19107] Updated weights for policy 0, policy_version 219705 (0.0027) [2024-06-18 22:00:05,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42154.1). 
Total num frames: 3599679488. Throughput: 0: 42244.0. Samples: 823787740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:05,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 22:00:08,195][19107] Updated weights for policy 0, policy_version 219715 (0.0037) [2024-06-18 22:00:10,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3599908864. Throughput: 0: 42132.8. Samples: 824037260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:10,501][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 22:00:11,786][19107] Updated weights for policy 0, policy_version 219725 (0.0041) [2024-06-18 22:00:15,500][18875] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3600105472. Throughput: 0: 42198.3. Samples: 824167740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:15,501][18875] Avg episode reward: [(0, '0.811')] [2024-06-18 22:00:15,855][19107] Updated weights for policy 0, policy_version 219735 (0.0023) [2024-06-18 22:00:16,486][19087] Signal inference workers to stop experience collection... (12050 times) [2024-06-18 22:00:16,486][19087] Signal inference workers to resume experience collection... (12050 times) [2024-06-18 22:00:16,504][19107] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-06-18 22:00:16,504][19107] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-06-18 22:00:19,548][19107] Updated weights for policy 0, policy_version 219745 (0.0051) [2024-06-18 22:00:20,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3600334848. Throughput: 0: 42116.5. Samples: 824416700. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:20,501][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 22:00:23,595][19107] Updated weights for policy 0, policy_version 219755 (0.0032) [2024-06-18 22:00:25,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3600531456. Throughput: 0: 42032.8. Samples: 824669440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:25,501][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 22:00:27,270][19107] Updated weights for policy 0, policy_version 219765 (0.0032) [2024-06-18 22:00:30,500][18875] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3600760832. Throughput: 0: 42151.8. Samples: 824798500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:30,501][18875] Avg episode reward: [(0, '0.802')] [2024-06-18 22:00:31,522][19107] Updated weights for policy 0, policy_version 219775 (0.0038) [2024-06-18 22:00:34,941][19107] Updated weights for policy 0, policy_version 219785 (0.0032) [2024-06-18 22:00:35,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 3600973824. Throughput: 0: 42225.6. Samples: 825055220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:35,501][18875] Avg episode reward: [(0, '0.799')] [2024-06-18 22:00:35,521][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000219786_3600973824.pth... [2024-06-18 22:00:35,579][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000219169_3590864896.pth [2024-06-18 22:00:39,035][19107] Updated weights for policy 0, policy_version 219795 (0.0024) [2024-06-18 22:00:40,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 3601186816. Throughput: 0: 42187.1. Samples: 825304960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:40,501][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 22:00:43,132][19107] Updated weights for policy 0, policy_version 219805 (0.0051) [2024-06-18 22:00:45,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3601399808. Throughput: 0: 42111.3. Samples: 825432580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:45,501][18875] Avg episode reward: [(0, '0.746')] [2024-06-18 22:00:47,049][19107] Updated weights for policy 0, policy_version 219815 (0.0039) [2024-06-18 22:00:50,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3601580032. Throughput: 0: 42304.4. Samples: 825691440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:50,501][18875] Avg episode reward: [(0, '0.767')] [2024-06-18 22:00:50,758][19107] Updated weights for policy 0, policy_version 219825 (0.0035) [2024-06-18 22:00:54,706][19107] Updated weights for policy 0, policy_version 219835 (0.0033) [2024-06-18 22:00:55,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 3601825792. Throughput: 0: 42391.3. Samples: 825944860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:00:55,500][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 22:00:58,289][19107] Updated weights for policy 0, policy_version 219845 (0.0041) [2024-06-18 22:01:00,504][18875] Fps is (10 sec: 45858.5, 60 sec: 42595.9, 300 sec: 42264.7). Total num frames: 3602038784. Throughput: 0: 42354.5. Samples: 826073840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:01:00,513][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 22:01:02,299][19107] Updated weights for policy 0, policy_version 219855 (0.0043) [2024-06-18 22:01:05,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42098.6). Total num frames: 3602235392. Throughput: 0: 42518.1. Samples: 826330020. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:01:05,501][18875] Avg episode reward: [(0, '0.554')] [2024-06-18 22:01:06,035][19107] Updated weights for policy 0, policy_version 219865 (0.0034) [2024-06-18 22:01:09,898][19107] Updated weights for policy 0, policy_version 219875 (0.0039) [2024-06-18 22:01:10,500][18875] Fps is (10 sec: 40974.4, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 3602448384. Throughput: 0: 42458.6. Samples: 826580080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-18 22:01:10,501][18875] Avg episode reward: [(0, '0.722')] [2024-06-18 22:01:13,774][19107] Updated weights for policy 0, policy_version 219885 (0.0024) [2024-06-18 22:01:15,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 3602677760. Throughput: 0: 42545.4. Samples: 826713040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:01:15,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 22:01:17,564][19107] Updated weights for policy 0, policy_version 219895 (0.0025) [2024-06-18 22:01:20,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3602857984. Throughput: 0: 42566.8. Samples: 826970720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:01:20,501][18875] Avg episode reward: [(0, '0.691')] [2024-06-18 22:01:21,654][19107] Updated weights for policy 0, policy_version 219905 (0.0048) [2024-06-18 22:01:25,345][19107] Updated weights for policy 0, policy_version 219915 (0.0031) [2024-06-18 22:01:25,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 3603087360. Throughput: 0: 42502.6. Samples: 827217580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:01:25,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 22:01:29,464][19107] Updated weights for policy 0, policy_version 219925 (0.0028) [2024-06-18 22:01:30,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 3603316736. Throughput: 0: 42468.4. Samples: 827343660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:01:30,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 22:01:33,081][19107] Updated weights for policy 0, policy_version 219935 (0.0035) [2024-06-18 22:01:35,500][18875] Fps is (10 sec: 40961.0, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3603496960. Throughput: 0: 42434.7. Samples: 827601000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:01:35,500][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 22:01:37,065][19107] Updated weights for policy 0, policy_version 219945 (0.0045) [2024-06-18 22:01:37,751][19087] Signal inference workers to stop experience collection... (12100 times) [2024-06-18 22:01:37,755][19087] Signal inference workers to resume experience collection... (12100 times) [2024-06-18 22:01:37,787][19107] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-06-18 22:01:37,787][19107] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-06-18 22:01:40,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 3603726336. Throughput: 0: 42333.2. Samples: 827849860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:01:40,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 22:01:40,898][19107] Updated weights for policy 0, policy_version 219955 (0.0034) [2024-06-18 22:01:44,651][19107] Updated weights for policy 0, policy_version 219965 (0.0043) [2024-06-18 22:01:45,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3603922944. Throughput: 0: 42395.9. 
Samples: 827981500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:01:45,501][18875] Avg episode reward: [(0, '0.675')] [2024-06-18 22:01:48,555][19107] Updated weights for policy 0, policy_version 219975 (0.0039) [2024-06-18 22:01:50,504][18875] Fps is (10 sec: 40945.4, 60 sec: 42595.8, 300 sec: 42320.2). Total num frames: 3604135936. Throughput: 0: 42294.9. Samples: 828233440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:01:50,504][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 22:01:52,803][19107] Updated weights for policy 0, policy_version 219985 (0.0029) [2024-06-18 22:01:55,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 3604365312. Throughput: 0: 42217.0. Samples: 828479840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:01:55,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 22:01:56,192][19107] Updated weights for policy 0, policy_version 219995 (0.0038) [2024-06-18 22:02:00,500][18875] Fps is (10 sec: 40974.6, 60 sec: 41781.7, 300 sec: 42098.6). Total num frames: 3604545536. Throughput: 0: 42107.1. Samples: 828607860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:02:00,501][18875] Avg episode reward: [(0, '0.565')] [2024-06-18 22:02:00,699][19107] Updated weights for policy 0, policy_version 220005 (0.0032) [2024-06-18 22:02:04,310][19107] Updated weights for policy 0, policy_version 220015 (0.0034) [2024-06-18 22:02:05,504][18875] Fps is (10 sec: 40945.2, 60 sec: 42322.8, 300 sec: 42320.2). Total num frames: 3604774912. Throughput: 0: 41955.7. Samples: 828858880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:02:05,505][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 22:02:08,434][19107] Updated weights for policy 0, policy_version 220025 (0.0030) [2024-06-18 22:02:10,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 3604987904. Throughput: 0: 41997.1. Samples: 829107440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:02:10,500][18875] Avg episode reward: [(0, '0.420')] [2024-06-18 22:02:11,982][19107] Updated weights for policy 0, policy_version 220035 (0.0034) [2024-06-18 22:02:15,500][18875] Fps is (10 sec: 39335.7, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3605168128. Throughput: 0: 41971.1. Samples: 829232360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:02:15,501][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 22:02:16,166][19107] Updated weights for policy 0, policy_version 220045 (0.0033) [2024-06-18 22:02:19,697][19107] Updated weights for policy 0, policy_version 220055 (0.0025) [2024-06-18 22:02:20,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 3605397504. Throughput: 0: 41819.0. Samples: 829482860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:02:20,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 22:02:23,864][19107] Updated weights for policy 0, policy_version 220065 (0.0029) [2024-06-18 22:02:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3605594112. Throughput: 0: 41889.8. Samples: 829734900. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:02:25,501][18875] Avg episode reward: [(0, '0.495')] [2024-06-18 22:02:27,364][19107] Updated weights for policy 0, policy_version 220075 (0.0039) [2024-06-18 22:02:30,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 3605807104. Throughput: 0: 41798.6. Samples: 829862440. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:02:30,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 22:02:31,930][19107] Updated weights for policy 0, policy_version 220085 (0.0040) [2024-06-18 22:02:35,019][19107] Updated weights for policy 0, policy_version 220095 (0.0037) [2024-06-18 22:02:35,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 3606036480. Throughput: 0: 41706.3. Samples: 830110080. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:02:35,501][18875] Avg episode reward: [(0, '0.736')] [2024-06-18 22:02:35,526][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000220095_3606036480.pth... [2024-06-18 22:02:35,597][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000219476_3595894784.pth [2024-06-18 22:02:39,779][19107] Updated weights for policy 0, policy_version 220105 (0.0041) [2024-06-18 22:02:40,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3606216704. Throughput: 0: 41887.1. Samples: 830364760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:02:40,501][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 22:02:43,296][19107] Updated weights for policy 0, policy_version 220115 (0.0046) [2024-06-18 22:02:45,500][18875] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3606446080. Throughput: 0: 41722.4. Samples: 830485360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:02:45,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 22:02:47,484][19107] Updated weights for policy 0, policy_version 220125 (0.0043) [2024-06-18 22:02:50,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42054.7, 300 sec: 42209.6). Total num frames: 3606659072. Throughput: 0: 41833.1. Samples: 830741220. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:02:50,501][18875] Avg episode reward: [(0, '0.774')] [2024-06-18 22:02:50,903][19107] Updated weights for policy 0, policy_version 220135 (0.0039) [2024-06-18 22:02:52,060][19087] Signal inference workers to stop experience collection... (12150 times) [2024-06-18 22:02:52,060][19087] Signal inference workers to resume experience collection... (12150 times) [2024-06-18 22:02:52,101][19107] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-06-18 22:02:52,108][19107] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-06-18 22:02:55,500][18875] Fps is (10 sec: 39320.8, 60 sec: 41233.0, 300 sec: 42098.5). Total num frames: 3606839296. Throughput: 0: 41918.5. Samples: 830993780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:02:55,501][18875] Avg episode reward: [(0, '0.333')] [2024-06-18 22:02:55,635][19107] Updated weights for policy 0, policy_version 220145 (0.0050) [2024-06-18 22:02:58,623][19107] Updated weights for policy 0, policy_version 220155 (0.0038) [2024-06-18 22:03:00,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42154.6). Total num frames: 3607068672. Throughput: 0: 41925.5. Samples: 831119000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:03:00,500][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 22:03:03,191][19107] Updated weights for policy 0, policy_version 220165 (0.0031) [2024-06-18 22:03:05,500][18875] Fps is (10 sec: 44237.5, 60 sec: 41781.8, 300 sec: 42098.6). Total num frames: 3607281664. Throughput: 0: 41894.3. Samples: 831368100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:03:05,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 22:03:06,372][19107] Updated weights for policy 0, policy_version 220175 (0.0034) [2024-06-18 22:03:10,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 3607494656. Throughput: 0: 42040.0. 
Samples: 831626700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:03:10,501][18875] Avg episode reward: [(0, '0.319')] [2024-06-18 22:03:10,715][19107] Updated weights for policy 0, policy_version 220185 (0.0031) [2024-06-18 22:03:14,238][19107] Updated weights for policy 0, policy_version 220195 (0.0029) [2024-06-18 22:03:15,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3607691264. Throughput: 0: 42003.2. Samples: 831752580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:03:15,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 22:03:18,335][19107] Updated weights for policy 0, policy_version 220205 (0.0042) [2024-06-18 22:03:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3607904256. Throughput: 0: 42049.0. Samples: 832002280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:03:20,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 22:03:22,276][19107] Updated weights for policy 0, policy_version 220215 (0.0034) [2024-06-18 22:03:25,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3608117248. Throughput: 0: 42028.0. Samples: 832256020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 23.0) [2024-06-18 22:03:25,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 22:03:26,168][19107] Updated weights for policy 0, policy_version 220225 (0.0040) [2024-06-18 22:03:30,065][19107] Updated weights for policy 0, policy_version 220235 (0.0032) [2024-06-18 22:03:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3608330240. Throughput: 0: 42186.5. Samples: 832383760. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:03:30,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 22:03:34,111][19107] Updated weights for policy 0, policy_version 220245 (0.0034) [2024-06-18 22:03:35,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3608526848. Throughput: 0: 42008.0. Samples: 832631580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:03:35,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 22:03:37,801][19107] Updated weights for policy 0, policy_version 220255 (0.0037) [2024-06-18 22:03:40,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3608756224. Throughput: 0: 42017.5. Samples: 832884560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:03:40,504][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 22:03:41,959][19107] Updated weights for policy 0, policy_version 220265 (0.0041) [2024-06-18 22:03:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 3608952832. Throughput: 0: 42102.1. Samples: 833013600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:03:45,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 22:03:45,771][19107] Updated weights for policy 0, policy_version 220275 (0.0035) [2024-06-18 22:03:49,757][19107] Updated weights for policy 0, policy_version 220285 (0.0050) [2024-06-18 22:03:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3609165824. Throughput: 0: 42090.7. Samples: 833262180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:03:50,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 22:03:53,561][19107] Updated weights for policy 0, policy_version 220295 (0.0037) [2024-06-18 22:03:55,503][18875] Fps is (10 sec: 45864.2, 60 sec: 42869.8, 300 sec: 42264.8). Total num frames: 3609411584. Throughput: 0: 42026.6. Samples: 833518000. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:03:55,503][18875] Avg episode reward: [(0, '0.483')] [2024-06-18 22:03:57,386][19107] Updated weights for policy 0, policy_version 220305 (0.0036) [2024-06-18 22:04:00,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3609591808. Throughput: 0: 42004.8. Samples: 833642800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:04:00,501][18875] Avg episode reward: [(0, '0.844')] [2024-06-18 22:04:01,474][19107] Updated weights for policy 0, policy_version 220315 (0.0042) [2024-06-18 22:04:05,210][19107] Updated weights for policy 0, policy_version 220325 (0.0030) [2024-06-18 22:04:05,500][18875] Fps is (10 sec: 39331.2, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3609804800. Throughput: 0: 41998.7. Samples: 833892220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:04:05,501][18875] Avg episode reward: [(0, '0.682')] [2024-06-18 22:04:09,205][19107] Updated weights for policy 0, policy_version 220335 (0.0030) [2024-06-18 22:04:10,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3610017792. Throughput: 0: 42066.5. Samples: 834149020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:04:10,501][18875] Avg episode reward: [(0, '0.755')] [2024-06-18 22:04:13,103][19107] Updated weights for policy 0, policy_version 220345 (0.0031) [2024-06-18 22:04:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3610214400. Throughput: 0: 42102.8. Samples: 834278380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:04:15,500][18875] Avg episode reward: [(0, '0.679')] [2024-06-18 22:04:16,981][19107] Updated weights for policy 0, policy_version 220355 (0.0032) [2024-06-18 22:04:20,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3610427392. Throughput: 0: 42093.7. Samples: 834525800. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:04:20,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 22:04:20,909][19107] Updated weights for policy 0, policy_version 220365 (0.0035) [2024-06-18 22:04:24,698][19107] Updated weights for policy 0, policy_version 220375 (0.0039) [2024-06-18 22:04:25,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3610656768. Throughput: 0: 42096.8. Samples: 834778920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:04:25,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 22:04:26,742][19087] Signal inference workers to stop experience collection... (12200 times) [2024-06-18 22:04:26,743][19087] Signal inference workers to resume experience collection... (12200 times) [2024-06-18 22:04:26,753][19107] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-06-18 22:04:26,786][19107] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-06-18 22:04:28,722][19107] Updated weights for policy 0, policy_version 220385 (0.0030) [2024-06-18 22:04:30,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3610853376. Throughput: 0: 42108.9. Samples: 834908500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:04:30,500][18875] Avg episode reward: [(0, '0.666')] [2024-06-18 22:04:32,518][19107] Updated weights for policy 0, policy_version 220395 (0.0034) [2024-06-18 22:04:35,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 3611082752. Throughput: 0: 42120.3. Samples: 835157600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-18 22:04:35,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 22:04:35,513][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000220403_3611082752.pth... 
[2024-06-18 22:04:35,572][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000219786_3600973824.pth [2024-06-18 22:04:36,897][19107] Updated weights for policy 0, policy_version 220405 (0.0033) [2024-06-18 22:04:40,286][19107] Updated weights for policy 0, policy_version 220415 (0.0040) [2024-06-18 22:04:40,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3611295744. Throughput: 0: 42172.8. Samples: 835415680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:04:40,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 22:04:44,517][19107] Updated weights for policy 0, policy_version 220425 (0.0034) [2024-06-18 22:04:45,504][18875] Fps is (10 sec: 40946.5, 60 sec: 42323.0, 300 sec: 42209.1). Total num frames: 3611492352. Throughput: 0: 42035.6. Samples: 835534540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:04:45,504][18875] Avg episode reward: [(0, '0.685')] [2024-06-18 22:04:48,162][19107] Updated weights for policy 0, policy_version 220435 (0.0030) [2024-06-18 22:04:50,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 3611721728. Throughput: 0: 42211.0. Samples: 835791720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:04:50,501][18875] Avg episode reward: [(0, '0.708')] [2024-06-18 22:04:52,046][19107] Updated weights for policy 0, policy_version 220445 (0.0038) [2024-06-18 22:04:55,500][18875] Fps is (10 sec: 42612.9, 60 sec: 41780.9, 300 sec: 42154.1). Total num frames: 3611918336. Throughput: 0: 42156.2. Samples: 836046040. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:04:55,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 22:04:55,724][19107] Updated weights for policy 0, policy_version 220455 (0.0033) [2024-06-18 22:04:59,702][19107] Updated weights for policy 0, policy_version 220465 (0.0036) [2024-06-18 22:05:00,504][18875] Fps is (10 sec: 40945.2, 60 sec: 42322.8, 300 sec: 42209.1). Total num frames: 3612131328. Throughput: 0: 42013.4. Samples: 836169140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:05:00,505][18875] Avg episode reward: [(0, '0.646')] [2024-06-18 22:05:03,437][19107] Updated weights for policy 0, policy_version 220475 (0.0034) [2024-06-18 22:05:05,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3612360704. Throughput: 0: 42333.7. Samples: 836430820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:05:05,501][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 22:05:07,332][19107] Updated weights for policy 0, policy_version 220485 (0.0049) [2024-06-18 22:05:10,500][18875] Fps is (10 sec: 40975.1, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3612540928. Throughput: 0: 42465.0. Samples: 836689840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:05:10,501][18875] Avg episode reward: [(0, '0.695')] [2024-06-18 22:05:11,564][19107] Updated weights for policy 0, policy_version 220495 (0.0031) [2024-06-18 22:05:15,029][19107] Updated weights for policy 0, policy_version 220505 (0.0032) [2024-06-18 22:05:15,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3612770304. Throughput: 0: 42152.3. Samples: 836805360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:05:15,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 22:05:19,169][19107] Updated weights for policy 0, policy_version 220515 (0.0042) [2024-06-18 22:05:20,500][18875] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42265.2). 
Total num frames: 3612999680. Throughput: 0: 42396.9. Samples: 837065460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:05:20,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 22:05:22,497][19107] Updated weights for policy 0, policy_version 220525 (0.0038) [2024-06-18 22:05:25,504][18875] Fps is (10 sec: 39307.6, 60 sec: 41776.7, 300 sec: 42042.5). Total num frames: 3613163520. Throughput: 0: 42324.2. Samples: 837320420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:05:25,504][18875] Avg episode reward: [(0, '0.501')] [2024-06-18 22:05:26,972][19107] Updated weights for policy 0, policy_version 220535 (0.0047) [2024-06-18 22:05:30,433][19107] Updated weights for policy 0, policy_version 220545 (0.0025) [2024-06-18 22:05:30,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3613409280. Throughput: 0: 42317.0. Samples: 837438660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:05:30,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 22:05:34,696][19107] Updated weights for policy 0, policy_version 220555 (0.0034) [2024-06-18 22:05:35,500][18875] Fps is (10 sec: 45891.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3613622272. Throughput: 0: 42390.6. Samples: 837699300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:05:35,501][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 22:05:37,986][19107] Updated weights for policy 0, policy_version 220565 (0.0029) [2024-06-18 22:05:40,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3613802496. Throughput: 0: 42396.9. Samples: 837953900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:05:40,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 22:05:42,403][19107] Updated weights for policy 0, policy_version 220575 (0.0033) [2024-06-18 22:05:43,807][19087] Signal inference workers to stop experience collection... 
(12250 times) [2024-06-18 22:05:43,858][19107] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-06-18 22:05:43,863][19087] Signal inference workers to resume experience collection... (12250 times) [2024-06-18 22:05:43,873][19107] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-06-18 22:05:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42600.8, 300 sec: 42265.2). Total num frames: 3614048256. Throughput: 0: 42405.2. Samples: 838077220. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:05:45,501][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 22:05:45,625][19107] Updated weights for policy 0, policy_version 220585 (0.0033) [2024-06-18 22:05:50,144][19107] Updated weights for policy 0, policy_version 220595 (0.0052) [2024-06-18 22:05:50,504][18875] Fps is (10 sec: 44220.6, 60 sec: 42049.8, 300 sec: 42098.0). Total num frames: 3614244864. Throughput: 0: 42334.1. Samples: 838336000. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:05:50,504][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 22:05:53,756][19107] Updated weights for policy 0, policy_version 220605 (0.0046) [2024-06-18 22:05:55,500][18875] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 3614441472. Throughput: 0: 42159.9. Samples: 838587040. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:05:55,501][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 22:05:57,919][19107] Updated weights for policy 0, policy_version 220615 (0.0034) [2024-06-18 22:06:00,500][18875] Fps is (10 sec: 44252.4, 60 sec: 42600.9, 300 sec: 42209.6). Total num frames: 3614687232. Throughput: 0: 42344.9. Samples: 838710880. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:00,501][18875] Avg episode reward: [(0, '0.536')] [2024-06-18 22:06:01,302][19107] Updated weights for policy 0, policy_version 220625 (0.0040) [2024-06-18 22:06:05,504][18875] Fps is (10 sec: 40943.6, 60 sec: 41503.4, 300 sec: 42042.4). Total num frames: 3614851072. Throughput: 0: 42183.8. Samples: 838963900. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:05,505][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 22:06:05,775][19107] Updated weights for policy 0, policy_version 220635 (0.0034) [2024-06-18 22:06:08,971][19107] Updated weights for policy 0, policy_version 220645 (0.0039) [2024-06-18 22:06:10,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3615080448. Throughput: 0: 41991.8. Samples: 839209900. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:10,501][18875] Avg episode reward: [(0, '0.458')] [2024-06-18 22:06:13,587][19107] Updated weights for policy 0, policy_version 220655 (0.0036) [2024-06-18 22:06:15,500][18875] Fps is (10 sec: 45894.0, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3615309824. Throughput: 0: 42444.9. Samples: 839348680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:15,501][18875] Avg episode reward: [(0, '0.218')] [2024-06-18 22:06:16,834][19107] Updated weights for policy 0, policy_version 220665 (0.0042) [2024-06-18 22:06:20,500][18875] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 3615473664. Throughput: 0: 42132.9. Samples: 839595280. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:20,501][18875] Avg episode reward: [(0, '0.326')] [2024-06-18 22:06:21,407][19107] Updated weights for policy 0, policy_version 220675 (0.0032) [2024-06-18 22:06:24,655][19107] Updated weights for policy 0, policy_version 220685 (0.0040) [2024-06-18 22:06:25,504][18875] Fps is (10 sec: 40945.1, 60 sec: 42598.4, 300 sec: 42042.5). Total num frames: 3615719424. Throughput: 0: 41919.7. Samples: 839840440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:25,505][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 22:06:29,273][19107] Updated weights for policy 0, policy_version 220695 (0.0035) [2024-06-18 22:06:30,500][18875] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 3615916032. Throughput: 0: 42127.2. Samples: 839972940. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:30,500][18875] Avg episode reward: [(0, '0.670')] [2024-06-18 22:06:32,989][19107] Updated weights for policy 0, policy_version 220705 (0.0051) [2024-06-18 22:06:35,500][18875] Fps is (10 sec: 40974.2, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 3616129024. Throughput: 0: 41907.2. Samples: 840221680. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:35,501][18875] Avg episode reward: [(0, '0.361')] [2024-06-18 22:06:35,509][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000220711_3616129024.pth... [2024-06-18 22:06:35,571][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000220095_3606036480.pth [2024-06-18 22:06:37,119][19107] Updated weights for policy 0, policy_version 220715 (0.0037) [2024-06-18 22:06:40,504][18875] Fps is (10 sec: 42582.4, 60 sec: 42322.7, 300 sec: 42098.0). Total num frames: 3616342016. Throughput: 0: 41908.2. Samples: 840473060. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:40,505][18875] Avg episode reward: [(0, '0.630')] [2024-06-18 22:06:40,925][19107] Updated weights for policy 0, policy_version 220725 (0.0045) [2024-06-18 22:06:44,977][19107] Updated weights for policy 0, policy_version 220735 (0.0024) [2024-06-18 22:06:45,500][18875] Fps is (10 sec: 42599.6, 60 sec: 41779.3, 300 sec: 42099.1). Total num frames: 3616555008. Throughput: 0: 41874.9. Samples: 840595240. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:45,500][18875] Avg episode reward: [(0, '0.630')] [2024-06-18 22:06:48,533][19107] Updated weights for policy 0, policy_version 220745 (0.0026) [2024-06-18 22:06:50,500][18875] Fps is (10 sec: 42614.0, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 3616768000. Throughput: 0: 42012.2. Samples: 840854280. Policy #0 lag: (min: 1.0, avg: 11.0, max: 21.0) [2024-06-18 22:06:50,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 22:06:52,567][19107] Updated weights for policy 0, policy_version 220755 (0.0045) [2024-06-18 22:06:55,500][18875] Fps is (10 sec: 39320.7, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3616948224. Throughput: 0: 42170.6. Samples: 841107580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:06:55,504][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 22:06:56,463][19107] Updated weights for policy 0, policy_version 220765 (0.0028) [2024-06-18 22:06:57,074][19087] Signal inference workers to stop experience collection... (12300 times) [2024-06-18 22:06:57,108][19107] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-06-18 22:06:57,139][19087] Signal inference workers to resume experience collection... 
(12300 times) [2024-06-18 22:06:57,141][19107] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-06-18 22:07:00,246][19107] Updated weights for policy 0, policy_version 220775 (0.0028) [2024-06-18 22:07:00,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 42043.5). Total num frames: 3617177600. Throughput: 0: 41847.6. Samples: 841231820. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:00,500][18875] Avg episode reward: [(0, '0.660')] [2024-06-18 22:07:04,239][19107] Updated weights for policy 0, policy_version 220785 (0.0032) [2024-06-18 22:07:05,500][18875] Fps is (10 sec: 44237.7, 60 sec: 42328.3, 300 sec: 42043.0). Total num frames: 3617390592. Throughput: 0: 42159.7. Samples: 841492460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:05,500][18875] Avg episode reward: [(0, '0.767')] [2024-06-18 22:07:07,884][19107] Updated weights for policy 0, policy_version 220795 (0.0033) [2024-06-18 22:07:10,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3617603584. Throughput: 0: 42130.4. Samples: 841736160. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:10,501][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 22:07:12,127][19107] Updated weights for policy 0, policy_version 220805 (0.0031) [2024-06-18 22:07:15,430][19107] Updated weights for policy 0, policy_version 220815 (0.0042) [2024-06-18 22:07:15,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3617832960. Throughput: 0: 42104.3. Samples: 841867640. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:15,501][18875] Avg episode reward: [(0, '0.702')] [2024-06-18 22:07:19,699][19107] Updated weights for policy 0, policy_version 220825 (0.0037) [2024-06-18 22:07:20,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3618029568. Throughput: 0: 42314.4. Samples: 842125820. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:20,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 22:07:23,204][19107] Updated weights for policy 0, policy_version 220835 (0.0048) [2024-06-18 22:07:25,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42327.9, 300 sec: 42209.6). Total num frames: 3618258944. Throughput: 0: 42200.3. Samples: 842371920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:25,501][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 22:07:27,428][19107] Updated weights for policy 0, policy_version 220845 (0.0039) [2024-06-18 22:07:30,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3618455552. Throughput: 0: 42383.9. Samples: 842502520. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:30,501][18875] Avg episode reward: [(0, '0.614')] [2024-06-18 22:07:30,826][19107] Updated weights for policy 0, policy_version 220855 (0.0034) [2024-06-18 22:07:35,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3618652160. Throughput: 0: 42332.0. Samples: 842759220. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:35,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 22:07:35,513][19107] Updated weights for policy 0, policy_version 220865 (0.0040) [2024-06-18 22:07:38,488][19107] Updated weights for policy 0, policy_version 220875 (0.0029) [2024-06-18 22:07:40,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42327.9, 300 sec: 42154.1). Total num frames: 3618881536. Throughput: 0: 42189.0. Samples: 843006080. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:40,501][18875] Avg episode reward: [(0, '0.357')] [2024-06-18 22:07:43,078][19107] Updated weights for policy 0, policy_version 220885 (0.0031) [2024-06-18 22:07:45,500][18875] Fps is (10 sec: 42597.6, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 3619078144. Throughput: 0: 42247.4. Samples: 843132960. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:45,501][18875] Avg episode reward: [(0, '0.510')] [2024-06-18 22:07:46,543][19107] Updated weights for policy 0, policy_version 220895 (0.0038) [2024-06-18 22:07:50,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3619291136. Throughput: 0: 41992.3. Samples: 843382120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:50,501][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 22:07:50,937][19107] Updated weights for policy 0, policy_version 220905 (0.0032) [2024-06-18 22:07:54,291][19107] Updated weights for policy 0, policy_version 220915 (0.0030) [2024-06-18 22:07:55,500][18875] Fps is (10 sec: 40961.0, 60 sec: 42325.5, 300 sec: 42098.5). Total num frames: 3619487744. Throughput: 0: 42060.6. Samples: 843628880. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:07:55,500][18875] Avg episode reward: [(0, '0.693')] [2024-06-18 22:07:58,443][19107] Updated weights for policy 0, policy_version 220925 (0.0037) [2024-06-18 22:08:00,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3619700736. Throughput: 0: 42037.9. Samples: 843759340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 21.0) [2024-06-18 22:08:00,501][18875] Avg episode reward: [(0, '0.752')] [2024-06-18 22:08:01,964][19107] Updated weights for policy 0, policy_version 220935 (0.0029) [2024-06-18 22:08:05,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3619930112. Throughput: 0: 41963.6. Samples: 844014180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:05,501][18875] Avg episode reward: [(0, '0.762')] [2024-06-18 22:08:06,001][19107] Updated weights for policy 0, policy_version 220945 (0.0035) [2024-06-18 22:08:07,241][19087] Signal inference workers to stop experience collection... 
(12350 times) [2024-06-18 22:08:07,291][19107] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-06-18 22:08:07,302][19087] Signal inference workers to resume experience collection... (12350 times) [2024-06-18 22:08:07,306][19107] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-06-18 22:08:09,909][19107] Updated weights for policy 0, policy_version 220955 (0.0039) [2024-06-18 22:08:10,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3620126720. Throughput: 0: 41964.0. Samples: 844260300. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:10,501][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 22:08:13,645][19107] Updated weights for policy 0, policy_version 220965 (0.0033) [2024-06-18 22:08:15,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3620339712. Throughput: 0: 42012.4. Samples: 844393080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:15,501][18875] Avg episode reward: [(0, '0.626')] [2024-06-18 22:08:17,795][19107] Updated weights for policy 0, policy_version 220975 (0.0036) [2024-06-18 22:08:20,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3620552704. Throughput: 0: 41827.6. Samples: 844641460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:20,501][18875] Avg episode reward: [(0, '0.831')] [2024-06-18 22:08:21,416][19107] Updated weights for policy 0, policy_version 220985 (0.0046) [2024-06-18 22:08:25,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3620765696. Throughput: 0: 42043.6. Samples: 844898040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:25,501][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 22:08:25,584][19107] Updated weights for policy 0, policy_version 220995 (0.0040) [2024-06-18 22:08:29,202][19107] Updated weights for policy 0, policy_version 221005 (0.0031) [2024-06-18 22:08:30,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3620962304. Throughput: 0: 41968.2. Samples: 845021520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:30,500][18875] Avg episode reward: [(0, '0.341')] [2024-06-18 22:08:33,356][19107] Updated weights for policy 0, policy_version 221015 (0.0027) [2024-06-18 22:08:35,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3621191680. Throughput: 0: 42160.9. Samples: 845279360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:35,501][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 22:08:35,565][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000221021_3621208064.pth... [2024-06-18 22:08:35,630][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000220403_3611082752.pth [2024-06-18 22:08:36,919][19107] Updated weights for policy 0, policy_version 221025 (0.0034) [2024-06-18 22:08:40,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3621388288. Throughput: 0: 42463.4. Samples: 845539740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:40,501][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 22:08:41,129][19107] Updated weights for policy 0, policy_version 221035 (0.0043) [2024-06-18 22:08:44,709][19107] Updated weights for policy 0, policy_version 221045 (0.0041) [2024-06-18 22:08:45,504][18875] Fps is (10 sec: 42583.0, 60 sec: 42322.9, 300 sec: 42209.1). Total num frames: 3621617664. Throughput: 0: 42191.6. Samples: 845658120. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:45,505][18875] Avg episode reward: [(0, '0.715')] [2024-06-18 22:08:49,163][19107] Updated weights for policy 0, policy_version 221055 (0.0040) [2024-06-18 22:08:50,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42098.9). Total num frames: 3621830656. Throughput: 0: 42280.4. Samples: 845916800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:50,501][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 22:08:52,427][19107] Updated weights for policy 0, policy_version 221065 (0.0026) [2024-06-18 22:08:55,501][18875] Fps is (10 sec: 39335.3, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 3622010880. Throughput: 0: 42535.4. Samples: 846174400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:08:55,501][18875] Avg episode reward: [(0, '0.399')] [2024-06-18 22:08:56,935][19107] Updated weights for policy 0, policy_version 221075 (0.0023) [2024-06-18 22:09:00,067][19107] Updated weights for policy 0, policy_version 221085 (0.0037) [2024-06-18 22:09:00,500][18875] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3622256640. Throughput: 0: 42224.9. Samples: 846293200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:09:00,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 22:09:04,739][19107] Updated weights for policy 0, policy_version 221095 (0.0040) [2024-06-18 22:09:05,500][18875] Fps is (10 sec: 45875.9, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 3622469632. Throughput: 0: 42449.7. Samples: 846551700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:09:05,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 22:09:07,774][19107] Updated weights for policy 0, policy_version 221105 (0.0032) [2024-06-18 22:09:10,500][18875] Fps is (10 sec: 37683.0, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3622633472. Throughput: 0: 42375.0. Samples: 846804920. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 24.0) [2024-06-18 22:09:10,501][18875] Avg episode reward: [(0, '0.602')] [2024-06-18 22:09:12,518][19107] Updated weights for policy 0, policy_version 221115 (0.0028) [2024-06-18 22:09:15,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 3622895616. Throughput: 0: 42325.2. Samples: 846926160. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:09:15,501][18875] Avg episode reward: [(0, '0.695')] [2024-06-18 22:09:15,595][19107] Updated weights for policy 0, policy_version 221125 (0.0031) [2024-06-18 22:09:20,488][19107] Updated weights for policy 0, policy_version 221135 (0.0029) [2024-06-18 22:09:20,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3623075840. Throughput: 0: 42360.5. Samples: 847185580. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:09:20,501][18875] Avg episode reward: [(0, '0.416')] [2024-06-18 22:09:23,157][19107] Updated weights for policy 0, policy_version 221145 (0.0036) [2024-06-18 22:09:25,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3623288832. Throughput: 0: 42110.3. Samples: 847434700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:09:25,501][18875] Avg episode reward: [(0, '0.404')] [2024-06-18 22:09:28,093][19107] Updated weights for policy 0, policy_version 221155 (0.0038) [2024-06-18 22:09:30,501][18875] Fps is (10 sec: 45874.3, 60 sec: 42871.3, 300 sec: 42209.6). Total num frames: 3623534592. Throughput: 0: 42365.9. Samples: 847564440. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:09:30,501][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 22:09:30,857][19107] Updated weights for policy 0, policy_version 221165 (0.0029) [2024-06-18 22:09:35,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3623698432. Throughput: 0: 42269.7. Samples: 847818940. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:09:35,501][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 22:09:35,844][19107] Updated weights for policy 0, policy_version 221175 (0.0033) [2024-06-18 22:09:38,722][19107] Updated weights for policy 0, policy_version 221185 (0.0045) [2024-06-18 22:09:40,501][18875] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42154.5). Total num frames: 3623927808. Throughput: 0: 41869.3. Samples: 848058520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:09:40,501][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 22:09:43,479][19107] Updated weights for policy 0, policy_version 221195 (0.0031) [2024-06-18 22:09:44,597][19087] Signal inference workers to stop experience collection... (12400 times) [2024-06-18 22:09:44,597][19087] Signal inference workers to resume experience collection... (12400 times) [2024-06-18 22:09:44,618][19107] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-06-18 22:09:44,618][19107] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-06-18 22:09:45,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42054.9, 300 sec: 42098.6). Total num frames: 3624140800. Throughput: 0: 42135.6. Samples: 848189300. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:09:45,501][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 22:09:46,448][19107] Updated weights for policy 0, policy_version 221205 (0.0033) [2024-06-18 22:09:50,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3624321024. Throughput: 0: 42008.9. Samples: 848442100. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:09:50,501][18875] Avg episode reward: [(0, '0.700')] [2024-06-18 22:09:51,353][19107] Updated weights for policy 0, policy_version 221215 (0.0029) [2024-06-18 22:09:54,215][19107] Updated weights for policy 0, policy_version 221225 (0.0027) [2024-06-18 22:09:55,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42154.6). Total num frames: 3624566784. Throughput: 0: 41823.6. Samples: 848686980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:09:55,501][18875] Avg episode reward: [(0, '0.711')] [2024-06-18 22:09:59,463][19107] Updated weights for policy 0, policy_version 221235 (0.0037) [2024-06-18 22:10:00,500][18875] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3624763392. Throughput: 0: 42143.6. Samples: 848822620. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:10:00,501][18875] Avg episode reward: [(0, '0.597')] [2024-06-18 22:10:02,009][19107] Updated weights for policy 0, policy_version 221245 (0.0037) [2024-06-18 22:10:05,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3624976384. Throughput: 0: 41899.9. Samples: 849071080. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:10:05,501][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 22:10:07,328][19107] Updated weights for policy 0, policy_version 221255 (0.0038) [2024-06-18 22:10:09,926][19107] Updated weights for policy 0, policy_version 221265 (0.0033) [2024-06-18 22:10:10,500][18875] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42209.6). Total num frames: 3625222144. Throughput: 0: 41831.9. Samples: 849317140. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:10:10,501][18875] Avg episode reward: [(0, '0.782')] [2024-06-18 22:10:15,034][19107] Updated weights for policy 0, policy_version 221275 (0.0040) [2024-06-18 22:10:15,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41987.5). 
Total num frames: 3625385984. Throughput: 0: 41977.8. Samples: 849453440. Policy #0 lag: (min: 0.0, avg: 7.9, max: 22.0) [2024-06-18 22:10:15,510][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 22:10:17,911][19107] Updated weights for policy 0, policy_version 221285 (0.0027) [2024-06-18 22:10:20,500][18875] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42210.1). Total num frames: 3625615360. Throughput: 0: 41832.4. Samples: 849701400. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:10:20,509][18875] Avg episode reward: [(0, '0.281')] [2024-06-18 22:10:22,999][19107] Updated weights for policy 0, policy_version 221295 (0.0038) [2024-06-18 22:10:25,500][18875] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3625828352. Throughput: 0: 42081.5. Samples: 849952180. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:10:25,501][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 22:10:25,799][19107] Updated weights for policy 0, policy_version 221305 (0.0038) [2024-06-18 22:10:30,501][18875] Fps is (10 sec: 37682.7, 60 sec: 40960.0, 300 sec: 41931.9). Total num frames: 3625992192. Throughput: 0: 41882.4. Samples: 850074020. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:10:30,504][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 22:10:30,796][19107] Updated weights for policy 0, policy_version 221315 (0.0028) [2024-06-18 22:10:33,734][19107] Updated weights for policy 0, policy_version 221325 (0.0036) [2024-06-18 22:10:35,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3626237952. Throughput: 0: 41778.3. Samples: 850322120. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:10:35,501][18875] Avg episode reward: [(0, '0.748')] [2024-06-18 22:10:35,551][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000221329_3626254336.pth... 
[2024-06-18 22:10:35,619][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000220711_3616129024.pth [2024-06-18 22:10:38,663][19107] Updated weights for policy 0, policy_version 221335 (0.0030) [2024-06-18 22:10:40,500][18875] Fps is (10 sec: 44237.8, 60 sec: 41779.4, 300 sec: 41987.5). Total num frames: 3626434560. Throughput: 0: 42018.3. Samples: 850577800. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:10:40,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 22:10:41,735][19107] Updated weights for policy 0, policy_version 221345 (0.0024) [2024-06-18 22:10:45,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42043.5). Total num frames: 3626647552. Throughput: 0: 41822.7. Samples: 850704640. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:10:45,501][18875] Avg episode reward: [(0, '0.489')] [2024-06-18 22:10:46,238][19107] Updated weights for policy 0, policy_version 221355 (0.0044) [2024-06-18 22:10:49,698][19107] Updated weights for policy 0, policy_version 221365 (0.0037) [2024-06-18 22:10:50,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3626860544. Throughput: 0: 41825.0. Samples: 850953200. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:10:50,501][18875] Avg episode reward: [(0, '0.746')] [2024-06-18 22:10:53,910][19107] Updated weights for policy 0, policy_version 221375 (0.0040) [2024-06-18 22:10:55,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3627057152. Throughput: 0: 41873.8. Samples: 851201460. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:10:55,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 22:10:57,138][19087] Signal inference workers to stop experience collection... (12450 times) [2024-06-18 22:10:57,138][19087] Signal inference workers to resume experience collection... 
(12450 times) [2024-06-18 22:10:57,187][19107] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-06-18 22:10:57,187][19107] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-06-18 22:10:57,413][19107] Updated weights for policy 0, policy_version 221385 (0.0029) [2024-06-18 22:11:00,500][18875] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42099.1). Total num frames: 3627270144. Throughput: 0: 41600.1. Samples: 851325440. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:11:00,501][18875] Avg episode reward: [(0, '0.453')] [2024-06-18 22:11:01,778][19107] Updated weights for policy 0, policy_version 221395 (0.0034) [2024-06-18 22:11:05,147][19107] Updated weights for policy 0, policy_version 221405 (0.0034) [2024-06-18 22:11:05,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3627499520. Throughput: 0: 41816.8. Samples: 851583160. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:11:05,502][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 22:11:09,369][19107] Updated weights for policy 0, policy_version 221415 (0.0046) [2024-06-18 22:11:10,504][18875] Fps is (10 sec: 40945.3, 60 sec: 40957.5, 300 sec: 41931.4). Total num frames: 3627679744. Throughput: 0: 41859.7. Samples: 851836020. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:11:10,505][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 22:11:13,274][19107] Updated weights for policy 0, policy_version 221425 (0.0032) [2024-06-18 22:11:15,504][18875] Fps is (10 sec: 42583.6, 60 sec: 42322.9, 300 sec: 42209.1). Total num frames: 3627925504. Throughput: 0: 41861.3. Samples: 851957920. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:11:15,504][18875] Avg episode reward: [(0, '0.610')] [2024-06-18 22:11:16,857][19107] Updated weights for policy 0, policy_version 221435 (0.0033) [2024-06-18 22:11:20,500][18875] Fps is (10 sec: 45891.8, 60 sec: 42052.3, 300 sec: 42099.1). Total num frames: 3628138496. Throughput: 0: 42174.2. Samples: 852219960. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:11:20,504][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 22:11:21,030][19107] Updated weights for policy 0, policy_version 221445 (0.0034) [2024-06-18 22:11:24,637][19107] Updated weights for policy 0, policy_version 221455 (0.0028) [2024-06-18 22:11:25,500][18875] Fps is (10 sec: 40974.3, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3628335104. Throughput: 0: 41882.1. Samples: 852462500. Policy #0 lag: (min: 0.0, avg: 13.1, max: 21.0) [2024-06-18 22:11:25,501][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 22:11:28,967][19107] Updated weights for policy 0, policy_version 221465 (0.0038) [2024-06-18 22:11:30,504][18875] Fps is (10 sec: 39307.4, 60 sec: 42322.9, 300 sec: 42042.5). Total num frames: 3628531712. Throughput: 0: 41999.3. Samples: 852594760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:11:30,505][18875] Avg episode reward: [(0, '0.497')] [2024-06-18 22:11:32,241][19107] Updated weights for policy 0, policy_version 221475 (0.0028) [2024-06-18 22:11:35,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41988.0). Total num frames: 3628728320. Throughput: 0: 41994.1. Samples: 852842940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:11:35,501][18875] Avg episode reward: [(0, '0.407')] [2024-06-18 22:11:36,602][19107] Updated weights for policy 0, policy_version 221485 (0.0031) [2024-06-18 22:11:40,030][19107] Updated weights for policy 0, policy_version 221495 (0.0024) [2024-06-18 22:11:40,500][18875] Fps is (10 sec: 45891.5, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3628990464. Throughput: 0: 42065.3. Samples: 853094400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:11:40,501][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 22:11:44,353][19107] Updated weights for policy 0, policy_version 221505 (0.0028) [2024-06-18 22:11:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3629154304. Throughput: 0: 42304.9. Samples: 853229160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:11:45,501][18875] Avg episode reward: [(0, '0.720')] [2024-06-18 22:11:47,741][19107] Updated weights for policy 0, policy_version 221515 (0.0043) [2024-06-18 22:11:50,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3629383680. Throughput: 0: 42051.2. Samples: 853475460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:11:50,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 22:11:52,183][19107] Updated weights for policy 0, policy_version 221525 (0.0041) [2024-06-18 22:11:55,500][18875] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3629613056. Throughput: 0: 42091.8. Samples: 853730000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:11:55,501][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 22:11:55,556][19107] Updated weights for policy 0, policy_version 221535 (0.0042) [2024-06-18 22:11:59,964][19107] Updated weights for policy 0, policy_version 221545 (0.0027) [2024-06-18 22:12:00,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42043.0). 
Total num frames: 3629793280. Throughput: 0: 42150.5. Samples: 853854540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:12:00,501][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 22:12:03,341][19107] Updated weights for policy 0, policy_version 221555 (0.0041) [2024-06-18 22:12:05,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3630006272. Throughput: 0: 41825.2. Samples: 854102100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:12:05,501][18875] Avg episode reward: [(0, '0.819')] [2024-06-18 22:12:07,993][19107] Updated weights for policy 0, policy_version 221565 (0.0034) [2024-06-18 22:12:10,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42327.8, 300 sec: 41987.5). Total num frames: 3630219264. Throughput: 0: 42093.8. Samples: 854356720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:12:10,501][18875] Avg episode reward: [(0, '0.849')] [2024-06-18 22:12:11,292][19107] Updated weights for policy 0, policy_version 221575 (0.0050) [2024-06-18 22:12:15,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41235.6, 300 sec: 41931.9). Total num frames: 3630399488. Throughput: 0: 41934.1. Samples: 854481640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:12:15,501][18875] Avg episode reward: [(0, '0.799')] [2024-06-18 22:12:15,887][19107] Updated weights for policy 0, policy_version 221585 (0.0029) [2024-06-18 22:12:19,171][19107] Updated weights for policy 0, policy_version 221595 (0.0043) [2024-06-18 22:12:20,501][18875] Fps is (10 sec: 42594.4, 60 sec: 41778.5, 300 sec: 41987.3). Total num frames: 3630645248. Throughput: 0: 41943.1. Samples: 854730420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:12:20,502][18875] Avg episode reward: [(0, '0.805')] [2024-06-18 22:12:23,887][19107] Updated weights for policy 0, policy_version 221605 (0.0032) [2024-06-18 22:12:25,504][18875] Fps is (10 sec: 45858.4, 60 sec: 42049.8, 300 sec: 42042.5). 
Total num frames: 3630858240. Throughput: 0: 41975.4. Samples: 854983440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:12:25,504][18875] Avg episode reward: [(0, '0.831')] [2024-06-18 22:12:26,789][19107] Updated weights for policy 0, policy_version 221615 (0.0036) [2024-06-18 22:12:30,500][18875] Fps is (10 sec: 39325.6, 60 sec: 41781.7, 300 sec: 41987.5). Total num frames: 3631038464. Throughput: 0: 41684.4. Samples: 855104960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:12:30,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 22:12:31,721][19107] Updated weights for policy 0, policy_version 221625 (0.0042) [2024-06-18 22:12:34,525][19107] Updated weights for policy 0, policy_version 221635 (0.0041) [2024-06-18 22:12:35,503][18875] Fps is (10 sec: 42600.1, 60 sec: 42596.2, 300 sec: 42042.6). Total num frames: 3631284224. Throughput: 0: 41848.6. Samples: 855358780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-18 22:12:35,504][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 22:12:35,520][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000221636_3631284224.pth... [2024-06-18 22:12:35,586][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000221021_3621208064.pth [2024-06-18 22:12:36,036][19087] Signal inference workers to stop experience collection... (12500 times) [2024-06-18 22:12:36,036][19087] Signal inference workers to resume experience collection... (12500 times) [2024-06-18 22:12:36,080][19107] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-06-18 22:12:36,080][19107] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-06-18 22:12:39,393][19107] Updated weights for policy 0, policy_version 221645 (0.0035) [2024-06-18 22:12:40,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41233.2, 300 sec: 41987.5). Total num frames: 3631464448. Throughput: 0: 41895.6. Samples: 855615300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:12:40,501][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 22:12:42,384][19107] Updated weights for policy 0, policy_version 221655 (0.0033) [2024-06-18 22:12:45,500][18875] Fps is (10 sec: 39334.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3631677440. Throughput: 0: 41692.0. Samples: 855730680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:12:45,501][18875] Avg episode reward: [(0, '0.527')] [2024-06-18 22:12:47,269][19107] Updated weights for policy 0, policy_version 221665 (0.0025) [2024-06-18 22:12:50,452][19107] Updated weights for policy 0, policy_version 221675 (0.0032) [2024-06-18 22:12:50,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3631923200. Throughput: 0: 42030.4. Samples: 855993460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:12:50,501][18875] Avg episode reward: [(0, '0.619')] [2024-06-18 22:12:54,886][19107] Updated weights for policy 0, policy_version 221685 (0.0044) [2024-06-18 22:12:55,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3632103424. Throughput: 0: 41923.7. Samples: 856243280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:12:55,500][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 22:12:58,354][19107] Updated weights for policy 0, policy_version 221695 (0.0042) [2024-06-18 22:13:00,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3632332800. Throughput: 0: 41956.9. Samples: 856369700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:13:00,501][18875] Avg episode reward: [(0, '0.430')] [2024-06-18 22:13:02,524][19107] Updated weights for policy 0, policy_version 221705 (0.0025) [2024-06-18 22:13:05,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3632545792. Throughput: 0: 42264.9. Samples: 856632300. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:13:05,501][18875] Avg episode reward: [(0, '0.430')] [2024-06-18 22:13:06,181][19107] Updated weights for policy 0, policy_version 221715 (0.0027) [2024-06-18 22:13:10,063][19107] Updated weights for policy 0, policy_version 221725 (0.0030) [2024-06-18 22:13:10,504][18875] Fps is (10 sec: 42582.8, 60 sec: 42322.9, 300 sec: 42098.0). Total num frames: 3632758784. Throughput: 0: 42138.7. Samples: 856879680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:13:10,505][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 22:13:14,072][19107] Updated weights for policy 0, policy_version 221735 (0.0034) [2024-06-18 22:13:15,501][18875] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42098.5). Total num frames: 3632971776. Throughput: 0: 42206.1. Samples: 857004240. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:13:15,501][18875] Avg episode reward: [(0, '0.506')] [2024-06-18 22:13:17,931][19107] Updated weights for policy 0, policy_version 221745 (0.0032) [2024-06-18 22:13:20,500][18875] Fps is (10 sec: 42613.7, 60 sec: 42326.1, 300 sec: 42098.5). Total num frames: 3633184768. Throughput: 0: 42378.1. Samples: 857265660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:13:20,501][18875] Avg episode reward: [(0, '0.744')] [2024-06-18 22:13:21,884][19107] Updated weights for policy 0, policy_version 221755 (0.0043) [2024-06-18 22:13:25,500][18875] Fps is (10 sec: 39322.5, 60 sec: 41781.7, 300 sec: 42043.0). Total num frames: 3633364992. Throughput: 0: 42272.0. Samples: 857517540. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:13:25,500][18875] Avg episode reward: [(0, '0.564')] [2024-06-18 22:13:25,648][19107] Updated weights for policy 0, policy_version 221765 (0.0036) [2024-06-18 22:13:29,586][19107] Updated weights for policy 0, policy_version 221775 (0.0025) [2024-06-18 22:13:30,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42098.5). Total num frames: 3633610752. Throughput: 0: 42453.7. Samples: 857641100. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:13:30,501][18875] Avg episode reward: [(0, '0.532')] [2024-06-18 22:13:33,607][19107] Updated weights for policy 0, policy_version 221785 (0.0028) [2024-06-18 22:13:35,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42054.6, 300 sec: 42098.6). Total num frames: 3633807360. Throughput: 0: 42363.1. Samples: 857899800. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:13:35,501][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 22:13:37,156][19107] Updated weights for policy 0, policy_version 221795 (0.0039) [2024-06-18 22:13:40,504][18875] Fps is (10 sec: 40945.4, 60 sec: 42595.8, 300 sec: 42043.0). Total num frames: 3634020352. Throughput: 0: 42307.2. Samples: 858147260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 22:13:40,505][18875] Avg episode reward: [(0, '0.477')] [2024-06-18 22:13:41,119][19107] Updated weights for policy 0, policy_version 221805 (0.0035) [2024-06-18 22:13:44,811][19107] Updated weights for policy 0, policy_version 221815 (0.0041) [2024-06-18 22:13:45,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 41987.4). Total num frames: 3634216960. Throughput: 0: 42319.0. Samples: 858274060. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:13:45,501][18875] Avg episode reward: [(0, '0.489')] [2024-06-18 22:13:49,262][19107] Updated weights for policy 0, policy_version 221825 (0.0031) [2024-06-18 22:13:50,500][18875] Fps is (10 sec: 40975.2, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3634429952. Throughput: 0: 42115.3. Samples: 858527480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:13:50,501][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 22:13:52,931][19107] Updated weights for policy 0, policy_version 221835 (0.0024) [2024-06-18 22:13:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3634642944. Throughput: 0: 42134.0. Samples: 858775560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:13:55,504][18875] Avg episode reward: [(0, '0.599')] [2024-06-18 22:13:57,066][19107] Updated weights for policy 0, policy_version 221845 (0.0039) [2024-06-18 22:14:00,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3634855936. Throughput: 0: 42154.0. Samples: 858901160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:00,501][18875] Avg episode reward: [(0, '0.460')] [2024-06-18 22:14:00,815][19107] Updated weights for policy 0, policy_version 221855 (0.0037) [2024-06-18 22:14:04,706][19107] Updated weights for policy 0, policy_version 221865 (0.0046) [2024-06-18 22:14:05,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3635052544. Throughput: 0: 41925.3. Samples: 859152300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:05,501][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 22:14:05,703][19087] Signal inference workers to stop experience collection... (12550 times) [2024-06-18 22:14:05,707][19087] Signal inference workers to resume experience collection... 
(12550 times) [2024-06-18 22:14:05,748][19107] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-06-18 22:14:05,748][19107] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-06-18 22:14:08,749][19107] Updated weights for policy 0, policy_version 221875 (0.0039) [2024-06-18 22:14:10,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42054.8, 300 sec: 41987.5). Total num frames: 3635281920. Throughput: 0: 41985.2. Samples: 859406880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:10,501][18875] Avg episode reward: [(0, '0.457')] [2024-06-18 22:14:12,458][19107] Updated weights for policy 0, policy_version 221885 (0.0040) [2024-06-18 22:14:15,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3635478528. Throughput: 0: 42037.4. Samples: 859532780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:15,501][18875] Avg episode reward: [(0, '0.511')] [2024-06-18 22:14:16,411][19107] Updated weights for policy 0, policy_version 221895 (0.0034) [2024-06-18 22:14:20,043][19107] Updated weights for policy 0, policy_version 221905 (0.0033) [2024-06-18 22:14:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3635691520. Throughput: 0: 41870.1. Samples: 859783960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:20,501][18875] Avg episode reward: [(0, '0.473')] [2024-06-18 22:14:24,315][19107] Updated weights for policy 0, policy_version 221915 (0.0041) [2024-06-18 22:14:25,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3635904512. Throughput: 0: 42009.9. Samples: 860037560. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:25,501][18875] Avg episode reward: [(0, '0.481')] [2024-06-18 22:14:27,860][19107] Updated weights for policy 0, policy_version 221925 (0.0041) [2024-06-18 22:14:30,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3636101120. Throughput: 0: 42006.3. Samples: 860164340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:30,501][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 22:14:32,024][19107] Updated weights for policy 0, policy_version 221935 (0.0035) [2024-06-18 22:14:35,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3636330496. Throughput: 0: 42062.6. Samples: 860420300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:35,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 22:14:35,615][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000221945_3636346880.pth... [2024-06-18 22:14:35,623][19107] Updated weights for policy 0, policy_version 221945 (0.0047) [2024-06-18 22:14:35,670][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000221329_3626254336.pth [2024-06-18 22:14:39,725][19107] Updated weights for policy 0, policy_version 221955 (0.0033) [2024-06-18 22:14:40,500][18875] Fps is (10 sec: 42597.8, 60 sec: 41781.7, 300 sec: 41987.5). Total num frames: 3636527104. Throughput: 0: 42133.7. Samples: 860671580. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:40,501][18875] Avg episode reward: [(0, '0.666')] [2024-06-18 22:14:43,252][19107] Updated weights for policy 0, policy_version 221965 (0.0032) [2024-06-18 22:14:45,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3636756480. Throughput: 0: 42148.3. Samples: 860797840. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:45,501][18875] Avg episode reward: [(0, '0.591')] [2024-06-18 22:14:47,871][19107] Updated weights for policy 0, policy_version 221975 (0.0036) [2024-06-18 22:14:50,504][18875] Fps is (10 sec: 45859.3, 60 sec: 42595.8, 300 sec: 42098.1). Total num frames: 3636985856. Throughput: 0: 42199.4. Samples: 861051420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-18 22:14:50,505][18875] Avg episode reward: [(0, '0.822')] [2024-06-18 22:14:50,839][19107] Updated weights for policy 0, policy_version 221985 (0.0027) [2024-06-18 22:14:55,472][19107] Updated weights for policy 0, policy_version 221995 (0.0036) [2024-06-18 22:14:55,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 3637166080. Throughput: 0: 42328.1. Samples: 861311640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:14:55,500][18875] Avg episode reward: [(0, '0.371')] [2024-06-18 22:14:58,540][19107] Updated weights for policy 0, policy_version 222005 (0.0027) [2024-06-18 22:15:00,500][18875] Fps is (10 sec: 39335.4, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 3637379072. Throughput: 0: 42108.0. Samples: 861427640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:00,501][18875] Avg episode reward: [(0, '0.344')] [2024-06-18 22:15:03,330][19107] Updated weights for policy 0, policy_version 222015 (0.0024) [2024-06-18 22:15:05,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 3637592064. Throughput: 0: 42247.5. Samples: 861685100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:05,501][18875] Avg episode reward: [(0, '0.381')] [2024-06-18 22:15:06,457][19107] Updated weights for policy 0, policy_version 222025 (0.0033) [2024-06-18 22:15:10,500][18875] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3637772288. Throughput: 0: 42348.6. Samples: 861943240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:10,501][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 22:15:11,034][19107] Updated weights for policy 0, policy_version 222035 (0.0043) [2024-06-18 22:15:14,311][19107] Updated weights for policy 0, policy_version 222045 (0.0032) [2024-06-18 22:15:15,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3638018048. Throughput: 0: 42217.4. Samples: 862064120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:15,500][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 22:15:18,617][19107] Updated weights for policy 0, policy_version 222055 (0.0035) [2024-06-18 22:15:20,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3638231040. Throughput: 0: 42229.7. Samples: 862320640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:20,501][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 22:15:21,969][19107] Updated weights for policy 0, policy_version 222065 (0.0041) [2024-06-18 22:15:25,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3638427648. Throughput: 0: 42408.6. Samples: 862579960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:25,501][18875] Avg episode reward: [(0, '0.652')] [2024-06-18 22:15:26,247][19107] Updated weights for policy 0, policy_version 222075 (0.0027) [2024-06-18 22:15:29,647][19107] Updated weights for policy 0, policy_version 222085 (0.0034) [2024-06-18 22:15:30,267][19087] Signal inference workers to stop experience collection... (12600 times) [2024-06-18 22:15:30,269][19087] Signal inference workers to resume experience collection... 
(12600 times) [2024-06-18 22:15:30,287][19107] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-06-18 22:15:30,287][19107] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-06-18 22:15:30,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 3638673408. Throughput: 0: 42327.2. Samples: 862702560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:30,501][18875] Avg episode reward: [(0, '0.551')] [2024-06-18 22:15:33,846][19107] Updated weights for policy 0, policy_version 222095 (0.0043) [2024-06-18 22:15:35,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3638870016. Throughput: 0: 42380.2. Samples: 862958380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:35,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 22:15:37,197][19107] Updated weights for policy 0, policy_version 222105 (0.0034) [2024-06-18 22:15:40,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3639066624. Throughput: 0: 42254.1. Samples: 863213080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:40,501][18875] Avg episode reward: [(0, '0.333')] [2024-06-18 22:15:41,540][19107] Updated weights for policy 0, policy_version 222115 (0.0030) [2024-06-18 22:15:44,982][19107] Updated weights for policy 0, policy_version 222125 (0.0035) [2024-06-18 22:15:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3639296000. Throughput: 0: 42511.6. Samples: 863340660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:45,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 22:15:49,146][19107] Updated weights for policy 0, policy_version 222135 (0.0031) [2024-06-18 22:15:50,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41781.8, 300 sec: 42154.1). Total num frames: 3639492608. Throughput: 0: 42462.0. Samples: 863595880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:50,501][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 22:15:53,075][19107] Updated weights for policy 0, policy_version 222145 (0.0031) [2024-06-18 22:15:55,504][18875] Fps is (10 sec: 40945.4, 60 sec: 42322.7, 300 sec: 42153.6). Total num frames: 3639705600. Throughput: 0: 42240.2. Samples: 863844200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:15:55,504][18875] Avg episode reward: [(0, '0.304')] [2024-06-18 22:15:56,890][19107] Updated weights for policy 0, policy_version 222155 (0.0032) [2024-06-18 22:16:00,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3639934976. Throughput: 0: 42402.6. Samples: 863972240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:16:00,501][18875] Avg episode reward: [(0, '0.365')] [2024-06-18 22:16:00,691][19107] Updated weights for policy 0, policy_version 222165 (0.0039) [2024-06-18 22:16:04,916][19107] Updated weights for policy 0, policy_version 222175 (0.0031) [2024-06-18 22:16:05,500][18875] Fps is (10 sec: 42613.7, 60 sec: 42325.4, 300 sec: 42210.1). Total num frames: 3640131584. Throughput: 0: 42385.4. Samples: 864227980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:05,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 22:16:08,309][19107] Updated weights for policy 0, policy_version 222185 (0.0046) [2024-06-18 22:16:10,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42099.1). Total num frames: 3640344576. Throughput: 0: 42214.5. Samples: 864479620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:10,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 22:16:12,843][19107] Updated weights for policy 0, policy_version 222195 (0.0029) [2024-06-18 22:16:15,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3640557568. Throughput: 0: 42315.6. Samples: 864606760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:15,501][18875] Avg episode reward: [(0, '0.379')] [2024-06-18 22:16:16,018][19107] Updated weights for policy 0, policy_version 222205 (0.0034) [2024-06-18 22:16:20,460][19107] Updated weights for policy 0, policy_version 222215 (0.0041) [2024-06-18 22:16:20,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 3640770560. Throughput: 0: 42320.8. Samples: 864862820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:20,501][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 22:16:24,036][19107] Updated weights for policy 0, policy_version 222225 (0.0034) [2024-06-18 22:16:25,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42154.6). Total num frames: 3640967168. Throughput: 0: 42353.4. Samples: 865118980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:25,500][18875] Avg episode reward: [(0, '0.670')] [2024-06-18 22:16:28,136][19107] Updated weights for policy 0, policy_version 222235 (0.0040) [2024-06-18 22:16:30,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 3641196544. Throughput: 0: 42337.3. Samples: 865245840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:30,501][18875] Avg episode reward: [(0, '0.709')] [2024-06-18 22:16:31,733][19107] Updated weights for policy 0, policy_version 222245 (0.0042) [2024-06-18 22:16:35,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3641409536. Throughput: 0: 42353.7. Samples: 865501800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:35,501][18875] Avg episode reward: [(0, '0.689')] [2024-06-18 22:16:35,622][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000222255_3641425920.pth... 
[2024-06-18 22:16:35,627][19107] Updated weights for policy 0, policy_version 222255 (0.0031) [2024-06-18 22:16:35,676][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000221636_3631284224.pth [2024-06-18 22:16:39,535][19107] Updated weights for policy 0, policy_version 222265 (0.0032) [2024-06-18 22:16:40,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3641606144. Throughput: 0: 42409.6. Samples: 865752480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:40,501][18875] Avg episode reward: [(0, '0.824')] [2024-06-18 22:16:43,519][19107] Updated weights for policy 0, policy_version 222275 (0.0026) [2024-06-18 22:16:45,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3641835520. Throughput: 0: 42354.6. Samples: 865878200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:45,501][18875] Avg episode reward: [(0, '0.807')] [2024-06-18 22:16:47,463][19107] Updated weights for policy 0, policy_version 222285 (0.0039) [2024-06-18 22:16:50,504][18875] Fps is (10 sec: 42583.1, 60 sec: 42322.7, 300 sec: 42098.0). Total num frames: 3642032128. Throughput: 0: 42314.0. Samples: 866132260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:50,504][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 22:16:51,247][19107] Updated weights for policy 0, policy_version 222295 (0.0035) [2024-06-18 22:16:55,170][19107] Updated weights for policy 0, policy_version 222305 (0.0031) [2024-06-18 22:16:55,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42327.9, 300 sec: 42209.6). Total num frames: 3642245120. Throughput: 0: 42309.5. Samples: 866383540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:16:55,501][18875] Avg episode reward: [(0, '0.476')] [2024-06-18 22:16:59,086][19107] Updated weights for policy 0, policy_version 222315 (0.0039) [2024-06-18 22:17:00,500][18875] Fps is (10 sec: 42613.9, 60 sec: 42052.3, 300 sec: 42209.7). Total num frames: 3642458112. Throughput: 0: 42368.9. Samples: 866513360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:00,501][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 22:17:02,949][19107] Updated weights for policy 0, policy_version 222325 (0.0047) [2024-06-18 22:17:05,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3642654720. Throughput: 0: 42271.7. Samples: 866765040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:05,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 22:17:07,004][19107] Updated weights for policy 0, policy_version 222335 (0.0031) [2024-06-18 22:17:10,504][18875] Fps is (10 sec: 40945.1, 60 sec: 42049.8, 300 sec: 42264.6). Total num frames: 3642867712. Throughput: 0: 41950.8. Samples: 867006920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:10,504][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 22:17:10,826][19107] Updated weights for policy 0, policy_version 222345 (0.0031) [2024-06-18 22:17:14,711][19107] Updated weights for policy 0, policy_version 222355 (0.0031) [2024-06-18 22:17:15,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42154.2). Total num frames: 3643080704. Throughput: 0: 42111.2. Samples: 867140840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:15,501][18875] Avg episode reward: [(0, '0.534')] [2024-06-18 22:17:18,736][19107] Updated weights for policy 0, policy_version 222365 (0.0045) [2024-06-18 22:17:20,500][18875] Fps is (10 sec: 40974.3, 60 sec: 41779.3, 300 sec: 42099.1). Total num frames: 3643277312. Throughput: 0: 41999.0. Samples: 867391760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:20,501][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 22:17:20,662][19087] Signal inference workers to stop experience collection... (12650 times) [2024-06-18 22:17:20,719][19107] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-06-18 22:17:20,730][19087] Signal inference workers to resume experience collection... (12650 times) [2024-06-18 22:17:20,731][19107] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-06-18 22:17:22,346][19107] Updated weights for policy 0, policy_version 222375 (0.0036) [2024-06-18 22:17:25,500][18875] Fps is (10 sec: 44236.0, 60 sec: 42598.2, 300 sec: 42320.7). Total num frames: 3643523072. Throughput: 0: 42022.1. Samples: 867643480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:25,501][18875] Avg episode reward: [(0, '0.389')] [2024-06-18 22:17:26,188][19107] Updated weights for policy 0, policy_version 222385 (0.0050) [2024-06-18 22:17:29,848][19107] Updated weights for policy 0, policy_version 222395 (0.0025) [2024-06-18 22:17:30,500][18875] Fps is (10 sec: 45875.8, 60 sec: 42325.4, 300 sec: 42210.1). Total num frames: 3643736064. Throughput: 0: 42262.7. Samples: 867780020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:30,501][18875] Avg episode reward: [(0, '0.389')] [2024-06-18 22:17:33,977][19107] Updated weights for policy 0, policy_version 222405 (0.0043) [2024-06-18 22:17:35,501][18875] Fps is (10 sec: 37683.0, 60 sec: 41506.0, 300 sec: 42154.1). Total num frames: 3643899904. Throughput: 0: 41973.0. Samples: 868020900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:35,501][18875] Avg episode reward: [(0, '0.663')] [2024-06-18 22:17:37,477][19107] Updated weights for policy 0, policy_version 222415 (0.0032) [2024-06-18 22:17:40,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 3644145664. Throughput: 0: 41998.1. 
Samples: 868273460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:40,501][18875] Avg episode reward: [(0, '0.650')] [2024-06-18 22:17:41,759][19107] Updated weights for policy 0, policy_version 222425 (0.0043) [2024-06-18 22:17:45,500][18875] Fps is (10 sec: 45876.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3644358656. Throughput: 0: 42115.9. Samples: 868408580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:45,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 22:17:45,562][19107] Updated weights for policy 0, policy_version 222435 (0.0040) [2024-06-18 22:17:49,467][19107] Updated weights for policy 0, policy_version 222445 (0.0032) [2024-06-18 22:17:50,500][18875] Fps is (10 sec: 39321.4, 60 sec: 41781.6, 300 sec: 42154.1). Total num frames: 3644538880. Throughput: 0: 41983.5. Samples: 868654300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:50,501][18875] Avg episode reward: [(0, '0.433')] [2024-06-18 22:17:53,581][19107] Updated weights for policy 0, policy_version 222455 (0.0037) [2024-06-18 22:17:55,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3644784640. Throughput: 0: 42226.5. Samples: 868906960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:17:55,501][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 22:17:57,123][19107] Updated weights for policy 0, policy_version 222465 (0.0038) [2024-06-18 22:18:00,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3644981248. Throughput: 0: 42137.3. Samples: 869037020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:18:00,501][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 22:18:01,156][19107] Updated weights for policy 0, policy_version 222475 (0.0043) [2024-06-18 22:18:05,197][19107] Updated weights for policy 0, policy_version 222485 (0.0034) [2024-06-18 22:18:05,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42154.6). Total num frames: 3645194240. Throughput: 0: 42063.3. Samples: 869284600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:18:05,500][18875] Avg episode reward: [(0, '0.718')] [2024-06-18 22:18:09,456][19107] Updated weights for policy 0, policy_version 222495 (0.0042) [2024-06-18 22:18:10,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42327.8, 300 sec: 42154.1). Total num frames: 3645407232. Throughput: 0: 42000.5. Samples: 869533500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:18:10,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 22:18:13,008][19107] Updated weights for policy 0, policy_version 222505 (0.0045) [2024-06-18 22:18:15,503][18875] Fps is (10 sec: 42586.7, 60 sec: 42323.4, 300 sec: 42153.7). Total num frames: 3645620224. Throughput: 0: 41716.2. Samples: 869657360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:18:15,503][18875] Avg episode reward: [(0, '0.710')] [2024-06-18 22:18:17,107][19107] Updated weights for policy 0, policy_version 222515 (0.0034) [2024-06-18 22:18:20,502][18875] Fps is (10 sec: 40952.0, 60 sec: 42324.0, 300 sec: 42209.3). Total num frames: 3645816832. Throughput: 0: 41910.7. Samples: 869906960. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:18:20,503][18875] Avg episode reward: [(0, '0.625')] [2024-06-18 22:18:20,975][19107] Updated weights for policy 0, policy_version 222525 (0.0035) [2024-06-18 22:18:24,919][19107] Updated weights for policy 0, policy_version 222535 (0.0029) [2024-06-18 22:18:25,500][18875] Fps is (10 sec: 40971.1, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3646029824. Throughput: 0: 41972.9. Samples: 870162240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:18:25,501][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 22:18:28,876][19107] Updated weights for policy 0, policy_version 222545 (0.0034) [2024-06-18 22:18:30,500][18875] Fps is (10 sec: 40968.1, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 3646226432. Throughput: 0: 41769.8. Samples: 870288220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:18:30,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 22:18:32,576][19107] Updated weights for policy 0, policy_version 222555 (0.0028) [2024-06-18 22:18:34,145][19087] Signal inference workers to stop experience collection... (12700 times) [2024-06-18 22:18:34,195][19107] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-06-18 22:18:34,206][19087] Signal inference workers to resume experience collection... (12700 times) [2024-06-18 22:18:34,215][19107] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-06-18 22:18:35,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42154.6). Total num frames: 3646455808. Throughput: 0: 41962.8. Samples: 870542620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:18:35,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 22:18:35,509][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000222562_3646455808.pth... 
[2024-06-18 22:18:35,555][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000221945_3636346880.pth [2024-06-18 22:18:36,531][19107] Updated weights for policy 0, policy_version 222565 (0.0030) [2024-06-18 22:18:40,500][18875] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 3646636032. Throughput: 0: 41945.3. Samples: 870794500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:18:40,501][18875] Avg episode reward: [(0, '0.343')] [2024-06-18 22:18:40,696][19107] Updated weights for policy 0, policy_version 222575 (0.0029) [2024-06-18 22:18:44,449][19107] Updated weights for policy 0, policy_version 222585 (0.0040) [2024-06-18 22:18:45,500][18875] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 3646849024. Throughput: 0: 41801.6. Samples: 870918100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:18:45,510][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 22:18:48,407][19107] Updated weights for policy 0, policy_version 222595 (0.0033) [2024-06-18 22:18:50,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3647094784. Throughput: 0: 41868.8. Samples: 871168700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:18:50,501][18875] Avg episode reward: [(0, '0.400')] [2024-06-18 22:18:52,281][19107] Updated weights for policy 0, policy_version 222605 (0.0040) [2024-06-18 22:18:55,500][18875] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 3647275008. Throughput: 0: 41950.3. Samples: 871421260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:18:55,509][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 22:18:56,401][19107] Updated weights for policy 0, policy_version 222615 (0.0030) [2024-06-18 22:19:00,253][19107] Updated weights for policy 0, policy_version 222625 (0.0030) [2024-06-18 22:19:00,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3647488000. Throughput: 0: 41747.4. Samples: 871535880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:19:00,500][18875] Avg episode reward: [(0, '0.708')] [2024-06-18 22:19:04,149][19107] Updated weights for policy 0, policy_version 222635 (0.0043) [2024-06-18 22:19:05,501][18875] Fps is (10 sec: 45874.2, 60 sec: 42325.1, 300 sec: 42209.6). Total num frames: 3647733760. Throughput: 0: 41960.8. Samples: 871795120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:19:05,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 22:19:08,169][19107] Updated weights for policy 0, policy_version 222645 (0.0037) [2024-06-18 22:19:10,500][18875] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3647913984. Throughput: 0: 41798.6. Samples: 872043180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:19:10,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 22:19:12,284][19107] Updated weights for policy 0, policy_version 222655 (0.0026) [2024-06-18 22:19:15,500][18875] Fps is (10 sec: 37684.0, 60 sec: 41508.0, 300 sec: 42098.6). Total num frames: 3648110592. Throughput: 0: 41740.5. Samples: 872166540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:19:15,501][18875] Avg episode reward: [(0, '0.597')] [2024-06-18 22:19:16,115][19107] Updated weights for policy 0, policy_version 222665 (0.0031) [2024-06-18 22:19:19,881][19107] Updated weights for policy 0, policy_version 222675 (0.0041) [2024-06-18 22:19:20,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41780.6, 300 sec: 42098.6). 
Total num frames: 3648323584. Throughput: 0: 41973.4. Samples: 872431420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:19:20,501][18875] Avg episode reward: [(0, '0.723')] [2024-06-18 22:19:24,021][19107] Updated weights for policy 0, policy_version 222685 (0.0030) [2024-06-18 22:19:25,500][18875] Fps is (10 sec: 45874.5, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 3648569344. Throughput: 0: 41889.7. Samples: 872679540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 22:19:25,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 22:19:27,326][19107] Updated weights for policy 0, policy_version 222695 (0.0030) [2024-06-18 22:19:30,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3648765952. Throughput: 0: 42011.7. Samples: 872808620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:19:30,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 22:19:31,903][19107] Updated weights for policy 0, policy_version 222705 (0.0036) [2024-06-18 22:19:35,235][19107] Updated weights for policy 0, policy_version 222715 (0.0043) [2024-06-18 22:19:35,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3648962560. Throughput: 0: 42116.6. Samples: 873063940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:19:35,500][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 22:19:39,510][19107] Updated weights for policy 0, policy_version 222725 (0.0029) [2024-06-18 22:19:40,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3649191936. Throughput: 0: 42132.0. Samples: 873317200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:19:40,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 22:19:42,842][19107] Updated weights for policy 0, policy_version 222735 (0.0033) [2024-06-18 22:19:43,613][19087] Signal inference workers to stop experience collection... 
(12750 times) [2024-06-18 22:19:43,613][19087] Signal inference workers to resume experience collection... (12750 times) [2024-06-18 22:19:43,629][19107] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-06-18 22:19:43,629][19107] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-06-18 22:19:45,500][18875] Fps is (10 sec: 42597.4, 60 sec: 42325.3, 300 sec: 42043.5). Total num frames: 3649388544. Throughput: 0: 42315.8. Samples: 873440100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:19:45,501][18875] Avg episode reward: [(0, '0.716')] [2024-06-18 22:19:47,063][19107] Updated weights for policy 0, policy_version 222745 (0.0030) [2024-06-18 22:19:50,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3649601536. Throughput: 0: 42179.7. Samples: 873693200. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:19:50,501][18875] Avg episode reward: [(0, '0.432')] [2024-06-18 22:19:50,572][19107] Updated weights for policy 0, policy_version 222755 (0.0029) [2024-06-18 22:19:55,019][19107] Updated weights for policy 0, policy_version 222765 (0.0041) [2024-06-18 22:19:55,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3649814528. Throughput: 0: 42391.1. Samples: 873950780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:19:55,501][18875] Avg episode reward: [(0, '0.527')] [2024-06-18 22:19:58,297][19107] Updated weights for policy 0, policy_version 222775 (0.0036) [2024-06-18 22:20:00,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42209.7). Total num frames: 3650043904. Throughput: 0: 42433.8. Samples: 874076060. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:20:00,501][18875] Avg episode reward: [(0, '0.498')] [2024-06-18 22:20:02,755][19107] Updated weights for policy 0, policy_version 222785 (0.0036) [2024-06-18 22:20:05,500][18875] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 3650224128. Throughput: 0: 42171.0. Samples: 874329120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:20:05,501][18875] Avg episode reward: [(0, '0.732')] [2024-06-18 22:20:06,220][19107] Updated weights for policy 0, policy_version 222795 (0.0031) [2024-06-18 22:20:10,200][19107] Updated weights for policy 0, policy_version 222805 (0.0029) [2024-06-18 22:20:10,500][18875] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3650437120. Throughput: 0: 42341.9. Samples: 874584920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:20:10,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 22:20:14,017][19107] Updated weights for policy 0, policy_version 222815 (0.0034) [2024-06-18 22:20:15,501][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 3650650112. Throughput: 0: 42281.5. Samples: 874711300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:20:15,501][18875] Avg episode reward: [(0, '0.442')] [2024-06-18 22:20:17,810][19107] Updated weights for policy 0, policy_version 222825 (0.0034) [2024-06-18 22:20:20,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3650879488. Throughput: 0: 42205.2. Samples: 874963180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:20:20,501][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 22:20:21,858][19107] Updated weights for policy 0, policy_version 222835 (0.0039) [2024-06-18 22:20:25,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3651076096. Throughput: 0: 42319.9. Samples: 875221600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:20:25,501][18875] Avg episode reward: [(0, '0.585')] [2024-06-18 22:20:25,751][19107] Updated weights for policy 0, policy_version 222845 (0.0042) [2024-06-18 22:20:29,606][19107] Updated weights for policy 0, policy_version 222855 (0.0038) [2024-06-18 22:20:30,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3651305472. Throughput: 0: 42331.2. Samples: 875345000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:20:30,504][18875] Avg episode reward: [(0, '0.447')] [2024-06-18 22:20:33,483][19107] Updated weights for policy 0, policy_version 222865 (0.0028) [2024-06-18 22:20:35,500][18875] Fps is (10 sec: 42599.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3651502080. Throughput: 0: 42356.1. Samples: 875599220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-18 22:20:35,501][18875] Avg episode reward: [(0, '0.732')] [2024-06-18 22:20:35,591][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000222871_3651518464.pth... [2024-06-18 22:20:35,651][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000222255_3641425920.pth [2024-06-18 22:20:37,130][19107] Updated weights for policy 0, policy_version 222875 (0.0048) [2024-06-18 22:20:40,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3651715072. Throughput: 0: 42220.5. Samples: 875850700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:20:40,501][18875] Avg episode reward: [(0, '0.736')] [2024-06-18 22:20:41,017][19107] Updated weights for policy 0, policy_version 222885 (0.0036) [2024-06-18 22:20:44,862][19107] Updated weights for policy 0, policy_version 222895 (0.0053) [2024-06-18 22:20:45,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3651928064. Throughput: 0: 42253.3. Samples: 875977460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:20:45,501][18875] Avg episode reward: [(0, '0.736')] [2024-06-18 22:20:48,554][19107] Updated weights for policy 0, policy_version 222905 (0.0052) [2024-06-18 22:20:50,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42099.1). Total num frames: 3652124672. Throughput: 0: 42380.1. Samples: 876236220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:20:50,501][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 22:20:52,417][19107] Updated weights for policy 0, policy_version 222915 (0.0038) [2024-06-18 22:20:55,504][18875] Fps is (10 sec: 42582.9, 60 sec: 42322.8, 300 sec: 42098.0). Total num frames: 3652354048. Throughput: 0: 42371.3. Samples: 876491780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:20:55,505][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 22:20:56,439][19107] Updated weights for policy 0, policy_version 222925 (0.0036) [2024-06-18 22:21:00,014][19107] Updated weights for policy 0, policy_version 222935 (0.0027) [2024-06-18 22:21:00,500][18875] Fps is (10 sec: 45874.5, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 3652583424. Throughput: 0: 42482.7. Samples: 876623020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:21:00,501][18875] Avg episode reward: [(0, '0.550')] [2024-06-18 22:21:04,171][19107] Updated weights for policy 0, policy_version 222945 (0.0024) [2024-06-18 22:21:05,500][18875] Fps is (10 sec: 42613.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3652780032. Throughput: 0: 42584.4. Samples: 876879480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:21:05,501][18875] Avg episode reward: [(0, '0.474')] [2024-06-18 22:21:07,772][19107] Updated weights for policy 0, policy_version 222955 (0.0034) [2024-06-18 22:21:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3652993024. Throughput: 0: 42511.2. Samples: 877134600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:21:10,512][18875] Avg episode reward: [(0, '0.705')] [2024-06-18 22:21:11,959][19107] Updated weights for policy 0, policy_version 222965 (0.0035) [2024-06-18 22:21:14,134][19087] Signal inference workers to stop experience collection... (12800 times) [2024-06-18 22:21:14,135][19087] Signal inference workers to resume experience collection... (12800 times) [2024-06-18 22:21:14,161][19107] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-06-18 22:21:14,195][19107] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-06-18 22:21:15,495][19107] Updated weights for policy 0, policy_version 222975 (0.0026) [2024-06-18 22:21:15,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42209.7). Total num frames: 3653222400. Throughput: 0: 42612.0. Samples: 877262540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:21:15,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 22:21:19,782][19107] Updated weights for policy 0, policy_version 222985 (0.0037) [2024-06-18 22:21:20,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3653419008. Throughput: 0: 42582.6. Samples: 877515440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:21:20,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 22:21:23,294][19107] Updated weights for policy 0, policy_version 222995 (0.0046) [2024-06-18 22:21:25,501][18875] Fps is (10 sec: 40956.7, 60 sec: 42598.0, 300 sec: 42154.0). Total num frames: 3653632000. Throughput: 0: 42492.5. Samples: 877762900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:21:25,501][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 22:21:27,888][19107] Updated weights for policy 0, policy_version 223005 (0.0030) [2024-06-18 22:21:30,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3653828608. Throughput: 0: 42468.3. 
Samples: 877888540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:21:30,501][18875] Avg episode reward: [(0, '0.833')] [2024-06-18 22:21:31,541][19107] Updated weights for policy 0, policy_version 223015 (0.0039) [2024-06-18 22:21:35,437][19107] Updated weights for policy 0, policy_version 223025 (0.0035) [2024-06-18 22:21:35,500][18875] Fps is (10 sec: 40963.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3654041600. Throughput: 0: 42235.9. Samples: 878136840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:21:35,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 22:21:39,123][19107] Updated weights for policy 0, policy_version 223035 (0.0034) [2024-06-18 22:21:40,500][18875] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3654270976. Throughput: 0: 42258.5. Samples: 878393260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 22:21:40,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 22:21:42,944][19107] Updated weights for policy 0, policy_version 223045 (0.0036) [2024-06-18 22:21:45,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42154.6). Total num frames: 3654467584. Throughput: 0: 42208.9. Samples: 878522420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:21:45,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 22:21:46,642][19107] Updated weights for policy 0, policy_version 223055 (0.0028) [2024-06-18 22:21:50,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3654680576. Throughput: 0: 42160.5. Samples: 878776700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:21:50,503][18875] Avg episode reward: [(0, '0.560')] [2024-06-18 22:21:50,764][19107] Updated weights for policy 0, policy_version 223065 (0.0025) [2024-06-18 22:21:54,187][19107] Updated weights for policy 0, policy_version 223075 (0.0040) [2024-06-18 22:21:55,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42600.9, 300 sec: 42209.6). Total num frames: 3654909952. Throughput: 0: 42174.2. Samples: 879032440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:21:55,501][18875] Avg episode reward: [(0, '0.391')] [2024-06-18 22:21:58,639][19107] Updated weights for policy 0, policy_version 223085 (0.0033) [2024-06-18 22:22:00,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 3655106560. Throughput: 0: 42190.3. Samples: 879161100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:00,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 22:22:02,043][19107] Updated weights for policy 0, policy_version 223095 (0.0039) [2024-06-18 22:22:05,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42154.6). Total num frames: 3655303168. Throughput: 0: 42113.8. Samples: 879410560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:05,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 22:22:06,475][19107] Updated weights for policy 0, policy_version 223105 (0.0045) [2024-06-18 22:22:10,112][19107] Updated weights for policy 0, policy_version 223115 (0.0025) [2024-06-18 22:22:10,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3655532544. Throughput: 0: 42382.5. Samples: 879670080. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:10,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 22:22:14,141][19107] Updated weights for policy 0, policy_version 223125 (0.0040) [2024-06-18 22:22:15,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 3655729152. Throughput: 0: 42317.9. Samples: 879792840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:15,501][18875] Avg episode reward: [(0, '0.826')] [2024-06-18 22:22:18,016][19107] Updated weights for policy 0, policy_version 223135 (0.0028) [2024-06-18 22:22:20,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3655958528. Throughput: 0: 42188.4. Samples: 880035320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:20,501][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 22:22:22,102][19107] Updated weights for policy 0, policy_version 223145 (0.0039) [2024-06-18 22:22:25,500][18875] Fps is (10 sec: 40959.3, 60 sec: 41779.6, 300 sec: 42043.0). Total num frames: 3656138752. Throughput: 0: 42127.4. Samples: 880289000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:25,501][18875] Avg episode reward: [(0, '0.513')] [2024-06-18 22:22:25,830][19107] Updated weights for policy 0, policy_version 223155 (0.0038) [2024-06-18 22:22:29,923][19107] Updated weights for policy 0, policy_version 223165 (0.0038) [2024-06-18 22:22:30,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 3656351744. Throughput: 0: 41928.9. Samples: 880409220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:30,501][18875] Avg episode reward: [(0, '0.647')] [2024-06-18 22:22:33,743][19107] Updated weights for policy 0, policy_version 223175 (0.0027) [2024-06-18 22:22:34,615][19087] Signal inference workers to stop experience collection... 
(12850 times) [2024-06-18 22:22:34,616][19087] Signal inference workers to resume experience collection... (12850 times) [2024-06-18 22:22:34,632][19107] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-06-18 22:22:34,644][19107] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-06-18 22:22:35,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3656597504. Throughput: 0: 41982.1. Samples: 880665900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:35,505][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 22:22:35,514][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000223181_3656597504.pth... [2024-06-18 22:22:35,567][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000222562_3646455808.pth [2024-06-18 22:22:37,515][19107] Updated weights for policy 0, policy_version 223185 (0.0045) [2024-06-18 22:22:40,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3656777728. Throughput: 0: 41866.8. Samples: 880916440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:40,501][18875] Avg episode reward: [(0, '0.695')] [2024-06-18 22:22:41,389][19107] Updated weights for policy 0, policy_version 223195 (0.0026) [2024-06-18 22:22:45,208][19107] Updated weights for policy 0, policy_version 223205 (0.0032) [2024-06-18 22:22:45,501][18875] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3656990720. Throughput: 0: 41758.9. Samples: 881040260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:45,501][18875] Avg episode reward: [(0, '0.748')] [2024-06-18 22:22:49,535][19107] Updated weights for policy 0, policy_version 223215 (0.0033) [2024-06-18 22:22:50,500][18875] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3657236480. Throughput: 0: 41955.1. Samples: 881298540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-18 22:22:50,501][18875] Avg episode reward: [(0, '0.340')] [2024-06-18 22:22:52,812][19107] Updated weights for policy 0, policy_version 223225 (0.0042) [2024-06-18 22:22:55,500][18875] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3657416704. Throughput: 0: 41721.7. Samples: 881547560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:22:55,501][18875] Avg episode reward: [(0, '0.345')] [2024-06-18 22:22:57,232][19107] Updated weights for policy 0, policy_version 223235 (0.0037) [2024-06-18 22:23:00,417][19107] Updated weights for policy 0, policy_version 223245 (0.0046) [2024-06-18 22:23:00,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3657646080. Throughput: 0: 41627.7. Samples: 881666080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:00,500][18875] Avg episode reward: [(0, '0.167')] [2024-06-18 22:23:04,964][19107] Updated weights for policy 0, policy_version 223255 (0.0033) [2024-06-18 22:23:05,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3657842688. Throughput: 0: 41911.5. Samples: 881921340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:05,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 22:23:08,709][19107] Updated weights for policy 0, policy_version 223265 (0.0031) [2024-06-18 22:23:10,500][18875] Fps is (10 sec: 37682.2, 60 sec: 41506.0, 300 sec: 42043.4). Total num frames: 3658022912. Throughput: 0: 41782.2. Samples: 882169200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:10,501][18875] Avg episode reward: [(0, '0.657')] [2024-06-18 22:23:12,753][19107] Updated weights for policy 0, policy_version 223275 (0.0041) [2024-06-18 22:23:15,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 42098.8). Total num frames: 3658235904. Throughput: 0: 41715.1. Samples: 882286400. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:15,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 22:23:16,313][19107] Updated weights for policy 0, policy_version 223285 (0.0041) [2024-06-18 22:23:20,500][18875] Fps is (10 sec: 44238.2, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3658465280. Throughput: 0: 41801.2. Samples: 882546940. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:20,500][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 22:23:20,512][19107] Updated weights for policy 0, policy_version 223295 (0.0041) [2024-06-18 22:23:24,010][19107] Updated weights for policy 0, policy_version 223305 (0.0042) [2024-06-18 22:23:25,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3658661888. Throughput: 0: 41699.5. Samples: 882792920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:25,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 22:23:28,252][19107] Updated weights for policy 0, policy_version 223315 (0.0032) [2024-06-18 22:23:30,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3658874880. Throughput: 0: 41736.6. Samples: 882918400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:30,509][18875] Avg episode reward: [(0, '0.537')] [2024-06-18 22:23:32,099][19107] Updated weights for policy 0, policy_version 223325 (0.0040) [2024-06-18 22:23:35,500][18875] Fps is (10 sec: 39321.6, 60 sec: 40960.1, 300 sec: 42098.5). Total num frames: 3659055104. Throughput: 0: 41633.3. Samples: 883172040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:35,509][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 22:23:36,221][19107] Updated weights for policy 0, policy_version 223335 (0.0038) [2024-06-18 22:23:40,155][19107] Updated weights for policy 0, policy_version 223345 (0.0035) [2024-06-18 22:23:40,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42209.7). 
Total num frames: 3659300864. Throughput: 0: 41532.2. Samples: 883416500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:40,500][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 22:23:44,234][19107] Updated weights for policy 0, policy_version 223355 (0.0038) [2024-06-18 22:23:44,816][19087] Signal inference workers to stop experience collection... (12900 times) [2024-06-18 22:23:44,867][19107] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-06-18 22:23:44,873][19087] Signal inference workers to resume experience collection... (12900 times) [2024-06-18 22:23:44,889][19107] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-06-18 22:23:45,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3659497472. Throughput: 0: 41853.6. Samples: 883549500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:45,501][18875] Avg episode reward: [(0, '0.412')] [2024-06-18 22:23:48,082][19107] Updated weights for policy 0, policy_version 223365 (0.0029) [2024-06-18 22:23:50,500][18875] Fps is (10 sec: 37683.0, 60 sec: 40687.0, 300 sec: 42043.0). Total num frames: 3659677696. Throughput: 0: 41577.0. Samples: 883792300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:50,501][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 22:23:52,005][19107] Updated weights for policy 0, policy_version 223375 (0.0041) [2024-06-18 22:23:55,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 3659923456. Throughput: 0: 41704.6. Samples: 884045900. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:23:55,501][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 22:23:55,712][19107] Updated weights for policy 0, policy_version 223385 (0.0031) [2024-06-18 22:23:59,783][19107] Updated weights for policy 0, policy_version 223395 (0.0028) [2024-06-18 22:24:00,500][18875] Fps is (10 sec: 47513.2, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 3660152832. Throughput: 0: 41948.5. Samples: 884174080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:24:00,501][18875] Avg episode reward: [(0, '0.657')] [2024-06-18 22:24:03,730][19107] Updated weights for policy 0, policy_version 223405 (0.0037) [2024-06-18 22:24:05,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 3660316672. Throughput: 0: 41601.2. Samples: 884419000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:05,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 22:24:07,457][19107] Updated weights for policy 0, policy_version 223415 (0.0037) [2024-06-18 22:24:10,500][18875] Fps is (10 sec: 37682.9, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3660529664. Throughput: 0: 41790.6. Samples: 884673500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:10,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 22:24:11,569][19107] Updated weights for policy 0, policy_version 223425 (0.0049) [2024-06-18 22:24:15,292][19107] Updated weights for policy 0, policy_version 223435 (0.0036) [2024-06-18 22:24:15,504][18875] Fps is (10 sec: 45858.6, 60 sec: 42322.8, 300 sec: 42209.1). Total num frames: 3660775424. Throughput: 0: 41804.6. Samples: 884799760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:15,505][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 22:24:19,260][19107] Updated weights for policy 0, policy_version 223445 (0.0037) [2024-06-18 22:24:20,504][18875] Fps is (10 sec: 42583.5, 60 sec: 41503.5, 300 sec: 41987.0). Total num frames: 3660955648. Throughput: 0: 41700.3. Samples: 885048700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:20,504][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 22:24:22,999][19107] Updated weights for policy 0, policy_version 223455 (0.0039) [2024-06-18 22:24:25,500][18875] Fps is (10 sec: 37696.7, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 3661152256. Throughput: 0: 41943.9. Samples: 885303980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:25,501][18875] Avg episode reward: [(0, '0.618')] [2024-06-18 22:24:27,187][19107] Updated weights for policy 0, policy_version 223465 (0.0048) [2024-06-18 22:24:30,500][18875] Fps is (10 sec: 42613.9, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3661381632. Throughput: 0: 41768.1. Samples: 885429060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:30,501][18875] Avg episode reward: [(0, '0.530')] [2024-06-18 22:24:30,702][19107] Updated weights for policy 0, policy_version 223475 (0.0036) [2024-06-18 22:24:34,980][19107] Updated weights for policy 0, policy_version 223485 (0.0033) [2024-06-18 22:24:35,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 3661594624. Throughput: 0: 41955.1. Samples: 885680280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:35,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 22:24:35,524][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000223486_3661594624.pth... 
[2024-06-18 22:24:35,574][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000222871_3651518464.pth [2024-06-18 22:24:38,788][19107] Updated weights for policy 0, policy_version 223495 (0.0035) [2024-06-18 22:24:40,500][18875] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 42043.0). Total num frames: 3661791232. Throughput: 0: 41767.9. Samples: 885925460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:40,501][18875] Avg episode reward: [(0, '0.589')] [2024-06-18 22:24:42,742][19107] Updated weights for policy 0, policy_version 223505 (0.0050) [2024-06-18 22:24:45,500][18875] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3662004224. Throughput: 0: 41746.7. Samples: 886052680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:45,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 22:24:46,532][19107] Updated weights for policy 0, policy_version 223515 (0.0042) [2024-06-18 22:24:50,442][19107] Updated weights for policy 0, policy_version 223525 (0.0026) [2024-06-18 22:24:50,500][18875] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3662233600. Throughput: 0: 42126.3. Samples: 886314680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:50,501][18875] Avg episode reward: [(0, '0.780')] [2024-06-18 22:24:54,469][19107] Updated weights for policy 0, policy_version 223535 (0.0034) [2024-06-18 22:24:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3662430208. Throughput: 0: 42012.1. Samples: 886564040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:24:55,501][18875] Avg episode reward: [(0, '0.593')] [2024-06-18 22:24:58,522][19107] Updated weights for policy 0, policy_version 223545 (0.0025) [2024-06-18 22:25:00,500][18875] Fps is (10 sec: 37683.1, 60 sec: 40960.0, 300 sec: 41987.5). Total num frames: 3662610432. Throughput: 0: 42007.4. Samples: 886689940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:25:00,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 22:25:02,282][19107] Updated weights for policy 0, policy_version 223555 (0.0029) [2024-06-18 22:25:05,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3662856192. Throughput: 0: 42121.1. Samples: 886944000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:25:05,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 22:25:06,176][19107] Updated weights for policy 0, policy_version 223565 (0.0043) [2024-06-18 22:25:09,086][19087] Signal inference workers to stop experience collection... (12950 times) [2024-06-18 22:25:09,087][19087] Signal inference workers to resume experience collection... (12950 times) [2024-06-18 22:25:09,100][19107] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-06-18 22:25:09,130][19107] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-06-18 22:25:10,144][19107] Updated weights for policy 0, policy_version 223575 (0.0033) [2024-06-18 22:25:10,500][18875] Fps is (10 sec: 45875.5, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 3663069184. Throughput: 0: 42159.2. Samples: 887201140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:25:10,501][18875] Avg episode reward: [(0, '0.605')] [2024-06-18 22:25:13,875][19107] Updated weights for policy 0, policy_version 223585 (0.0025) [2024-06-18 22:25:15,500][18875] Fps is (10 sec: 39321.2, 60 sec: 41235.5, 300 sec: 41931.9). Total num frames: 3663249408. Throughput: 0: 42162.5. Samples: 887326380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:25:15,501][18875] Avg episode reward: [(0, '0.487')] [2024-06-18 22:25:17,792][19107] Updated weights for policy 0, policy_version 223595 (0.0032) [2024-06-18 22:25:20,500][18875] Fps is (10 sec: 40959.0, 60 sec: 42054.7, 300 sec: 42043.0). Total num frames: 3663478784. Throughput: 0: 42199.8. 
Samples: 887579280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:25:20,501][18875] Avg episode reward: [(0, '0.458')] [2024-06-18 22:25:21,647][19107] Updated weights for policy 0, policy_version 223605 (0.0039) [2024-06-18 22:25:25,389][19107] Updated weights for policy 0, policy_version 223615 (0.0037) [2024-06-18 22:25:25,500][18875] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 3663708160. Throughput: 0: 42249.9. Samples: 887826700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:25:25,501][18875] Avg episode reward: [(0, '0.637')] [2024-06-18 22:25:29,549][19107] Updated weights for policy 0, policy_version 223625 (0.0032) [2024-06-18 22:25:30,500][18875] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3663888384. Throughput: 0: 42281.8. Samples: 887955360. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:25:30,501][18875] Avg episode reward: [(0, '0.717')] [2024-06-18 22:25:33,016][19107] Updated weights for policy 0, policy_version 223635 (0.0032) [2024-06-18 22:25:35,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3664117760. Throughput: 0: 42063.6. Samples: 888207540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:25:35,501][18875] Avg episode reward: [(0, '0.641')] [2024-06-18 22:25:37,337][19107] Updated weights for policy 0, policy_version 223645 (0.0032) [2024-06-18 22:25:40,504][18875] Fps is (10 sec: 44220.7, 60 sec: 42322.9, 300 sec: 42042.5). Total num frames: 3664330752. Throughput: 0: 42115.2. Samples: 888459380. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:25:40,505][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 22:25:40,809][19107] Updated weights for policy 0, policy_version 223655 (0.0036) [2024-06-18 22:25:45,314][19107] Updated weights for policy 0, policy_version 223665 (0.0042) [2024-06-18 22:25:45,501][18875] Fps is (10 sec: 40955.8, 60 sec: 42051.6, 300 sec: 42042.9). Total num frames: 3664527360. Throughput: 0: 42182.7. Samples: 888588200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:25:45,502][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 22:25:48,435][19107] Updated weights for policy 0, policy_version 223675 (0.0034) [2024-06-18 22:25:50,500][18875] Fps is (10 sec: 42614.0, 60 sec: 42052.3, 300 sec: 42043.5). Total num frames: 3664756736. Throughput: 0: 42162.3. Samples: 888841300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:25:50,501][18875] Avg episode reward: [(0, '0.735')] [2024-06-18 22:25:52,978][19107] Updated weights for policy 0, policy_version 223685 (0.0046) [2024-06-18 22:25:55,500][18875] Fps is (10 sec: 42602.4, 60 sec: 42052.2, 300 sec: 41932.0). Total num frames: 3664953344. Throughput: 0: 41975.9. Samples: 889090060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:25:55,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 22:25:56,456][19107] Updated weights for policy 0, policy_version 223695 (0.0035) [2024-06-18 22:26:00,501][18875] Fps is (10 sec: 39320.8, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3665149952. Throughput: 0: 42103.1. Samples: 889221020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:26:00,501][18875] Avg episode reward: [(0, '0.611')] [2024-06-18 22:26:00,989][19107] Updated weights for policy 0, policy_version 223705 (0.0044) [2024-06-18 22:26:04,233][19107] Updated weights for policy 0, policy_version 223715 (0.0031) [2024-06-18 22:26:05,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 3665395712. Throughput: 0: 42106.3. Samples: 889474060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:26:05,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 22:26:08,563][19107] Updated weights for policy 0, policy_version 223725 (0.0037) [2024-06-18 22:26:10,500][18875] Fps is (10 sec: 45875.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3665608704. Throughput: 0: 42274.6. Samples: 889729060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:26:10,504][18875] Avg episode reward: [(0, '0.642')] [2024-06-18 22:26:11,823][19107] Updated weights for policy 0, policy_version 223735 (0.0034) [2024-06-18 22:26:15,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 3665805312. Throughput: 0: 42254.1. Samples: 889856800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-18 22:26:15,501][18875] Avg episode reward: [(0, '0.528')] [2024-06-18 22:26:16,203][19107] Updated weights for policy 0, policy_version 223745 (0.0030) [2024-06-18 22:26:19,293][19107] Updated weights for policy 0, policy_version 223755 (0.0040) [2024-06-18 22:26:20,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 41987.6). Total num frames: 3666018304. Throughput: 0: 42165.6. Samples: 890105000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:26:20,501][18875] Avg episode reward: [(0, '0.436')] [2024-06-18 22:26:23,876][19107] Updated weights for policy 0, policy_version 223765 (0.0039) [2024-06-18 22:26:25,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3666231296. Throughput: 0: 42319.5. Samples: 890363600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:26:25,501][18875] Avg episode reward: [(0, '0.592')] [2024-06-18 22:26:27,237][19107] Updated weights for policy 0, policy_version 223775 (0.0028) [2024-06-18 22:26:30,500][18875] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 3666427904. Throughput: 0: 42260.1. Samples: 890489860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:26:30,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 22:26:31,692][19107] Updated weights for policy 0, policy_version 223785 (0.0038) [2024-06-18 22:26:34,871][19107] Updated weights for policy 0, policy_version 223795 (0.0036) [2024-06-18 22:26:35,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3666657280. Throughput: 0: 42022.2. Samples: 890732300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:26:35,501][18875] Avg episode reward: [(0, '0.603')] [2024-06-18 22:26:35,525][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000223795_3666657280.pth... [2024-06-18 22:26:35,580][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000223181_3656597504.pth [2024-06-18 22:26:37,890][19087] Signal inference workers to stop experience collection... (13000 times) [2024-06-18 22:26:37,892][19087] Signal inference workers to resume experience collection... 
(13000 times) [2024-06-18 22:26:37,937][19107] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-06-18 22:26:37,937][19107] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-06-18 22:26:39,915][19107] Updated weights for policy 0, policy_version 223805 (0.0044) [2024-06-18 22:26:40,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42054.9, 300 sec: 41987.5). Total num frames: 3666853888. Throughput: 0: 42335.2. Samples: 890995140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:26:40,501][18875] Avg episode reward: [(0, '0.471')] [2024-06-18 22:26:42,581][19107] Updated weights for policy 0, policy_version 223815 (0.0048) [2024-06-18 22:26:45,500][18875] Fps is (10 sec: 39321.4, 60 sec: 42052.9, 300 sec: 41931.9). Total num frames: 3667050496. Throughput: 0: 42055.7. Samples: 891113520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:26:45,501][18875] Avg episode reward: [(0, '0.344')] [2024-06-18 22:26:47,445][19107] Updated weights for policy 0, policy_version 223825 (0.0044) [2024-06-18 22:26:50,350][19107] Updated weights for policy 0, policy_version 223835 (0.0040) [2024-06-18 22:26:50,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3667312640. Throughput: 0: 42101.5. Samples: 891368620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:26:50,500][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 22:26:55,272][19107] Updated weights for policy 0, policy_version 223845 (0.0034) [2024-06-18 22:26:55,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 3667476480. Throughput: 0: 42197.2. Samples: 891627940. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:26:55,501][18875] Avg episode reward: [(0, '0.330')] [2024-06-18 22:26:58,570][19107] Updated weights for policy 0, policy_version 223855 (0.0034) [2024-06-18 22:27:00,500][18875] Fps is (10 sec: 37682.9, 60 sec: 42325.5, 300 sec: 41987.5). Total num frames: 3667689472. Throughput: 0: 41965.0. Samples: 891745220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:27:00,501][18875] Avg episode reward: [(0, '0.671')] [2024-06-18 22:27:02,906][19107] Updated weights for policy 0, policy_version 223865 (0.0034) [2024-06-18 22:27:05,500][18875] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 3667951616. Throughput: 0: 42200.9. Samples: 892004040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:27:05,501][18875] Avg episode reward: [(0, '0.373')] [2024-06-18 22:27:06,155][19107] Updated weights for policy 0, policy_version 223875 (0.0038) [2024-06-18 22:27:10,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 3668115456. Throughput: 0: 42123.9. Samples: 892259180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:27:10,501][18875] Avg episode reward: [(0, '0.457')] [2024-06-18 22:27:10,647][19107] Updated weights for policy 0, policy_version 223885 (0.0036) [2024-06-18 22:27:13,761][19107] Updated weights for policy 0, policy_version 223895 (0.0034) [2024-06-18 22:27:15,500][18875] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3668328448. Throughput: 0: 41843.3. Samples: 892372820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:27:15,501][18875] Avg episode reward: [(0, '0.442')] [2024-06-18 22:27:18,399][19107] Updated weights for policy 0, policy_version 223905 (0.0038) [2024-06-18 22:27:20,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3668557824. Throughput: 0: 42288.0. Samples: 892635260. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:27:20,501][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 22:27:21,177][19107] Updated weights for policy 0, policy_version 223915 (0.0030) [2024-06-18 22:27:25,500][18875] Fps is (10 sec: 39322.4, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 3668721664. Throughput: 0: 42132.4. Samples: 892891100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-18 22:27:25,501][18875] Avg episode reward: [(0, '0.810')] [2024-06-18 22:27:26,265][19107] Updated weights for policy 0, policy_version 223925 (0.0037) [2024-06-18 22:27:29,061][19107] Updated weights for policy 0, policy_version 223935 (0.0040) [2024-06-18 22:27:30,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 3668967424. Throughput: 0: 42076.8. Samples: 893006980. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:27:30,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 22:27:34,135][19107] Updated weights for policy 0, policy_version 223945 (0.0038) [2024-06-18 22:27:35,502][18875] Fps is (10 sec: 47506.1, 60 sec: 42324.2, 300 sec: 42098.3). Total num frames: 3669196800. Throughput: 0: 42233.1. Samples: 893269180. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:27:35,502][18875] Avg episode reward: [(0, '0.520')] [2024-06-18 22:27:35,838][19087] Signal inference workers to stop experience collection... (13050 times) [2024-06-18 22:27:35,870][19107] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-06-18 22:27:35,896][19087] Signal inference workers to resume experience collection... (13050 times) [2024-06-18 22:27:35,900][19107] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-06-18 22:27:36,676][19107] Updated weights for policy 0, policy_version 223955 (0.0039) [2024-06-18 22:27:40,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 3669344256. Throughput: 0: 42136.5. 
Samples: 893524080. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:27:40,503][18875] Avg episode reward: [(0, '0.715')] [2024-06-18 22:27:41,757][19107] Updated weights for policy 0, policy_version 223965 (0.0033) [2024-06-18 22:27:45,082][19107] Updated weights for policy 0, policy_version 223975 (0.0034) [2024-06-18 22:27:45,500][18875] Fps is (10 sec: 40965.9, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 3669606400. Throughput: 0: 42074.1. Samples: 893638560. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:27:45,501][18875] Avg episode reward: [(0, '0.512')] [2024-06-18 22:27:49,906][19107] Updated weights for policy 0, policy_version 223985 (0.0039) [2024-06-18 22:27:50,504][18875] Fps is (10 sec: 45859.2, 60 sec: 41503.6, 300 sec: 41987.0). Total num frames: 3669803008. Throughput: 0: 42027.9. Samples: 893895440. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:27:50,504][18875] Avg episode reward: [(0, '0.490')] [2024-06-18 22:27:53,417][19107] Updated weights for policy 0, policy_version 223995 (0.0037) [2024-06-18 22:27:55,500][18875] Fps is (10 sec: 36045.1, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3669966848. Throughput: 0: 42056.9. Samples: 894151740. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:27:55,501][18875] Avg episode reward: [(0, '0.553')] [2024-06-18 22:27:57,775][19107] Updated weights for policy 0, policy_version 224005 (0.0028) [2024-06-18 22:28:00,500][18875] Fps is (10 sec: 44252.2, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 3670245376. Throughput: 0: 42251.7. Samples: 894274140. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:28:00,501][18875] Avg episode reward: [(0, '0.619')] [2024-06-18 22:28:01,154][19107] Updated weights for policy 0, policy_version 224015 (0.0044) [2024-06-18 22:28:05,303][19107] Updated weights for policy 0, policy_version 224025 (0.0038) [2024-06-18 22:28:05,500][18875] Fps is (10 sec: 45875.0, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 3670425600. Throughput: 0: 42097.3. Samples: 894529640. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:28:05,501][18875] Avg episode reward: [(0, '0.631')] [2024-06-18 22:28:08,801][19107] Updated weights for policy 0, policy_version 224035 (0.0028) [2024-06-18 22:28:10,504][18875] Fps is (10 sec: 36031.9, 60 sec: 41503.6, 300 sec: 41931.4). Total num frames: 3670605824. Throughput: 0: 42038.3. Samples: 894782980. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:28:10,505][18875] Avg episode reward: [(0, '0.478')] [2024-06-18 22:28:12,926][19107] Updated weights for policy 0, policy_version 224045 (0.0033) [2024-06-18 22:28:15,500][18875] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 3670884352. Throughput: 0: 42159.9. Samples: 894904180. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:28:15,501][18875] Avg episode reward: [(0, '0.783')] [2024-06-18 22:28:16,303][19107] Updated weights for policy 0, policy_version 224055 (0.0033) [2024-06-18 22:28:20,500][18875] Fps is (10 sec: 44253.1, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3671048192. Throughput: 0: 42086.8. Samples: 895163020. 
Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:28:20,501][18875] Avg episode reward: [(0, '0.468')] [2024-06-18 22:28:20,910][19107] Updated weights for policy 0, policy_version 224065 (0.0030) [2024-06-18 22:28:23,832][19107] Updated weights for policy 0, policy_version 224075 (0.0033) [2024-06-18 22:28:25,500][18875] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 3671261184. Throughput: 0: 42003.1. Samples: 895414220. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:28:25,504][18875] Avg episode reward: [(0, '0.808')] [2024-06-18 22:28:28,350][19107] Updated weights for policy 0, policy_version 224085 (0.0030) [2024-06-18 22:28:30,504][18875] Fps is (10 sec: 45857.9, 60 sec: 42322.8, 300 sec: 42209.1). Total num frames: 3671506944. Throughput: 0: 42396.1. Samples: 895546540. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:28:30,505][18875] Avg episode reward: [(0, '0.925')] [2024-06-18 22:28:30,505][19087] Saving new best policy, reward=0.925! [2024-06-18 22:28:31,140][19087] Signal inference workers to stop experience collection... (13100 times) [2024-06-18 22:28:31,140][19087] Signal inference workers to resume experience collection... (13100 times) [2024-06-18 22:28:31,150][19107] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-06-18 22:28:31,150][19107] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-06-18 22:28:31,295][19107] Updated weights for policy 0, policy_version 224095 (0.0036) [2024-06-18 22:28:35,500][18875] Fps is (10 sec: 44237.3, 60 sec: 41780.3, 300 sec: 42043.0). Total num frames: 3671703552. Throughput: 0: 42380.3. Samples: 895802400. Policy #0 lag: (min: 0.0, avg: 13.4, max: 26.0) [2024-06-18 22:28:35,501][18875] Avg episode reward: [(0, '0.812')] [2024-06-18 22:28:35,647][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000224104_3671719936.pth... 
[2024-06-18 22:28:35,712][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000223486_3661594624.pth [2024-06-18 22:28:35,867][19107] Updated weights for policy 0, policy_version 224105 (0.0035) [2024-06-18 22:28:39,132][19107] Updated weights for policy 0, policy_version 224115 (0.0046) [2024-06-18 22:28:40,504][18875] Fps is (10 sec: 40960.4, 60 sec: 42868.9, 300 sec: 42098.0). Total num frames: 3671916544. Throughput: 0: 42227.7. Samples: 896052140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:28:40,505][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 22:28:43,610][19107] Updated weights for policy 0, policy_version 224125 (0.0041) [2024-06-18 22:28:45,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 3672145920. Throughput: 0: 42463.6. Samples: 896185000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:28:45,501][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 22:28:47,290][19107] Updated weights for policy 0, policy_version 224135 (0.0038) [2024-06-18 22:28:50,500][18875] Fps is (10 sec: 40974.7, 60 sec: 42054.7, 300 sec: 42043.0). Total num frames: 3672326144. Throughput: 0: 42435.1. Samples: 896439220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:28:50,501][18875] Avg episode reward: [(0, '0.568')] [2024-06-18 22:28:51,461][19107] Updated weights for policy 0, policy_version 224145 (0.0038) [2024-06-18 22:28:54,948][19107] Updated weights for policy 0, policy_version 224155 (0.0028) [2024-06-18 22:28:55,500][18875] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42043.0). Total num frames: 3672555520. Throughput: 0: 42410.6. Samples: 896691300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:28:55,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 22:28:59,340][19107] Updated weights for policy 0, policy_version 224165 (0.0038) [2024-06-18 22:29:00,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3672768512. Throughput: 0: 42766.4. Samples: 896828660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:00,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 22:29:02,395][19107] Updated weights for policy 0, policy_version 224175 (0.0038) [2024-06-18 22:29:05,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3672981504. Throughput: 0: 42487.9. Samples: 897074980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:05,501][18875] Avg episode reward: [(0, '0.377')] [2024-06-18 22:29:06,882][19107] Updated weights for policy 0, policy_version 224185 (0.0032) [2024-06-18 22:29:10,395][19107] Updated weights for policy 0, policy_version 224195 (0.0040) [2024-06-18 22:29:10,500][18875] Fps is (10 sec: 44236.7, 60 sec: 43420.2, 300 sec: 42154.6). Total num frames: 3673210880. Throughput: 0: 42603.6. Samples: 897331380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:10,501][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 22:29:14,341][19107] Updated weights for policy 0, policy_version 224205 (0.0027) [2024-06-18 22:29:15,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 42210.1). Total num frames: 3673407488. Throughput: 0: 42678.6. Samples: 897466920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:15,501][18875] Avg episode reward: [(0, '0.552')] [2024-06-18 22:29:17,923][19107] Updated weights for policy 0, policy_version 224215 (0.0026) [2024-06-18 22:29:20,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 3673620480. Throughput: 0: 42431.5. Samples: 897711820. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:20,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 22:29:21,878][19107] Updated weights for policy 0, policy_version 224225 (0.0044) [2024-06-18 22:29:25,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 3673833472. Throughput: 0: 42684.8. Samples: 897972800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:25,501][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 22:29:25,939][19107] Updated weights for policy 0, policy_version 224235 (0.0033) [2024-06-18 22:29:29,521][19107] Updated weights for policy 0, policy_version 224245 (0.0037) [2024-06-18 22:29:30,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42327.9, 300 sec: 42209.6). Total num frames: 3674046464. Throughput: 0: 42566.2. Samples: 898100480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:30,501][18875] Avg episode reward: [(0, '0.615')] [2024-06-18 22:29:33,489][19107] Updated weights for policy 0, policy_version 224255 (0.0032) [2024-06-18 22:29:35,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 3674259456. Throughput: 0: 42535.0. Samples: 898353300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:35,509][18875] Avg episode reward: [(0, '0.535')] [2024-06-18 22:29:37,096][19107] Updated weights for policy 0, policy_version 224265 (0.0050) [2024-06-18 22:29:40,066][19087] Signal inference workers to stop experience collection... (13150 times) [2024-06-18 22:29:40,067][19087] Signal inference workers to resume experience collection... (13150 times) [2024-06-18 22:29:40,088][19107] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-06-18 22:29:40,088][19107] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-06-18 22:29:40,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42874.1, 300 sec: 42320.7). Total num frames: 3674488832. Throughput: 0: 42768.9. 
Samples: 898615900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:40,500][18875] Avg episode reward: [(0, '0.693')] [2024-06-18 22:29:41,172][19107] Updated weights for policy 0, policy_version 224275 (0.0031) [2024-06-18 22:29:44,741][19107] Updated weights for policy 0, policy_version 224285 (0.0041) [2024-06-18 22:29:45,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3674685440. Throughput: 0: 42511.9. Samples: 898741700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-18 22:29:45,501][18875] Avg episode reward: [(0, '0.747')] [2024-06-18 22:29:48,816][19107] Updated weights for policy 0, policy_version 224295 (0.0030) [2024-06-18 22:29:50,500][18875] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42320.7). Total num frames: 3674914816. Throughput: 0: 42656.5. Samples: 898994520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:29:50,508][18875] Avg episode reward: [(0, '0.463')] [2024-06-18 22:29:52,686][19107] Updated weights for policy 0, policy_version 224305 (0.0034) [2024-06-18 22:29:55,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 3675111424. Throughput: 0: 42692.0. Samples: 899252520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:29:55,501][18875] Avg episode reward: [(0, '0.651')] [2024-06-18 22:29:56,520][19107] Updated weights for policy 0, policy_version 224315 (0.0032) [2024-06-18 22:30:00,251][19107] Updated weights for policy 0, policy_version 224325 (0.0037) [2024-06-18 22:30:00,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 3675340800. Throughput: 0: 42493.8. Samples: 899379140. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:00,501][18875] Avg episode reward: [(0, '0.508')] [2024-06-18 22:30:04,193][19107] Updated weights for policy 0, policy_version 224335 (0.0042) [2024-06-18 22:30:05,504][18875] Fps is (10 sec: 44221.2, 60 sec: 42868.9, 300 sec: 42320.2). Total num frames: 3675553792. Throughput: 0: 42713.5. Samples: 899634080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:05,504][18875] Avg episode reward: [(0, '0.640')] [2024-06-18 22:30:07,949][19107] Updated weights for policy 0, policy_version 224345 (0.0043) [2024-06-18 22:30:10,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 3675750400. Throughput: 0: 42626.2. Samples: 899890980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:10,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 22:30:12,043][19107] Updated weights for policy 0, policy_version 224355 (0.0035) [2024-06-18 22:30:15,500][18875] Fps is (10 sec: 42614.2, 60 sec: 42871.6, 300 sec: 42376.3). Total num frames: 3675979776. Throughput: 0: 42495.7. Samples: 900012780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:15,500][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 22:30:15,730][19107] Updated weights for policy 0, policy_version 224365 (0.0027) [2024-06-18 22:30:19,807][19107] Updated weights for policy 0, policy_version 224375 (0.0035) [2024-06-18 22:30:20,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 3676192768. Throughput: 0: 42636.1. Samples: 900271920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:20,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 22:30:23,272][19107] Updated weights for policy 0, policy_version 224385 (0.0029) [2024-06-18 22:30:25,500][18875] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 3676389376. Throughput: 0: 42518.5. Samples: 900529240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:25,501][18875] Avg episode reward: [(0, '0.538')] [2024-06-18 22:30:27,607][19107] Updated weights for policy 0, policy_version 224395 (0.0040) [2024-06-18 22:30:30,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 3676602368. Throughput: 0: 42440.6. Samples: 900651520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:30,501][18875] Avg episode reward: [(0, '0.662')] [2024-06-18 22:30:31,038][19107] Updated weights for policy 0, policy_version 224405 (0.0043) [2024-06-18 22:30:35,450][19107] Updated weights for policy 0, policy_version 224415 (0.0033) [2024-06-18 22:30:35,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42321.2). Total num frames: 3676815360. Throughput: 0: 42483.1. Samples: 900906260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:35,509][18875] Avg episode reward: [(0, '0.717')] [2024-06-18 22:30:35,556][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000224416_3676831744.pth... [2024-06-18 22:30:35,600][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000223795_3666657280.pth [2024-06-18 22:30:38,994][19107] Updated weights for policy 0, policy_version 224425 (0.0048) [2024-06-18 22:30:40,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42376.4). Total num frames: 3677028352. Throughput: 0: 42461.5. Samples: 901163280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:40,500][18875] Avg episode reward: [(0, '0.802')] [2024-06-18 22:30:43,072][19107] Updated weights for policy 0, policy_version 224435 (0.0038) [2024-06-18 22:30:45,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 3677241344. Throughput: 0: 42428.9. Samples: 901288440. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:45,501][18875] Avg episode reward: [(0, '0.663')] [2024-06-18 22:30:46,548][19107] Updated weights for policy 0, policy_version 224445 (0.0039) [2024-06-18 22:30:50,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 3677437952. Throughput: 0: 42306.5. Samples: 901537720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 22:30:50,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 22:30:50,903][19107] Updated weights for policy 0, policy_version 224455 (0.0035) [2024-06-18 22:30:54,667][19107] Updated weights for policy 0, policy_version 224465 (0.0037) [2024-06-18 22:30:55,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 3677650944. Throughput: 0: 42283.8. Samples: 901793760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:30:55,501][18875] Avg episode reward: [(0, '0.505')] [2024-06-18 22:30:58,675][19107] Updated weights for policy 0, policy_version 224475 (0.0038) [2024-06-18 22:31:00,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 3677880320. Throughput: 0: 42388.9. Samples: 901920280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:00,500][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 22:31:02,412][19107] Updated weights for policy 0, policy_version 224485 (0.0042) [2024-06-18 22:31:05,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42327.8, 300 sec: 42320.7). Total num frames: 3678093312. Throughput: 0: 42345.6. Samples: 902177480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:05,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 22:31:06,212][19107] Updated weights for policy 0, policy_version 224495 (0.0041) [2024-06-18 22:31:08,924][19087] Signal inference workers to stop experience collection... 
(13200 times) [2024-06-18 22:31:08,978][19107] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-06-18 22:31:08,984][19087] Signal inference workers to resume experience collection... (13200 times) [2024-06-18 22:31:08,993][19107] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-06-18 22:31:10,391][19107] Updated weights for policy 0, policy_version 224505 (0.0045) [2024-06-18 22:31:10,500][18875] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 3678289920. Throughput: 0: 42247.1. Samples: 902430360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:10,501][18875] Avg episode reward: [(0, '0.369')] [2024-06-18 22:31:13,982][19107] Updated weights for policy 0, policy_version 224515 (0.0022) [2024-06-18 22:31:15,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 3678519296. Throughput: 0: 42270.6. Samples: 902553700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:15,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 22:31:18,016][19107] Updated weights for policy 0, policy_version 224525 (0.0034) [2024-06-18 22:31:20,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 3678715904. Throughput: 0: 42202.7. Samples: 902805380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:20,501][18875] Avg episode reward: [(0, '0.659')] [2024-06-18 22:31:21,914][19107] Updated weights for policy 0, policy_version 224535 (0.0039) [2024-06-18 22:31:25,500][18875] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 3678912512. Throughput: 0: 42222.5. Samples: 903063300. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:25,501][18875] Avg episode reward: [(0, '0.658')] [2024-06-18 22:31:25,751][19107] Updated weights for policy 0, policy_version 224545 (0.0029) [2024-06-18 22:31:29,579][19107] Updated weights for policy 0, policy_version 224555 (0.0033) [2024-06-18 22:31:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 3679125504. Throughput: 0: 42137.4. Samples: 903184620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:30,501][18875] Avg episode reward: [(0, '0.696')] [2024-06-18 22:31:33,837][19107] Updated weights for policy 0, policy_version 224565 (0.0039) [2024-06-18 22:31:35,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 3679338496. Throughput: 0: 42232.1. Samples: 903438160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:35,500][18875] Avg episode reward: [(0, '0.668')] [2024-06-18 22:31:37,209][19107] Updated weights for policy 0, policy_version 224575 (0.0036) [2024-06-18 22:31:40,500][18875] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 3679551488. Throughput: 0: 42145.0. Samples: 903690280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:40,501][18875] Avg episode reward: [(0, '0.430')] [2024-06-18 22:31:41,650][19107] Updated weights for policy 0, policy_version 224585 (0.0041) [2024-06-18 22:31:44,795][19107] Updated weights for policy 0, policy_version 224595 (0.0055) [2024-06-18 22:31:45,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 3679780864. Throughput: 0: 42137.3. Samples: 903816460. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:45,501][18875] Avg episode reward: [(0, '0.358')] [2024-06-18 22:31:49,177][19107] Updated weights for policy 0, policy_version 224605 (0.0046) [2024-06-18 22:31:50,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 3679977472. Throughput: 0: 42067.1. Samples: 904070500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:50,501][18875] Avg episode reward: [(0, '0.488')] [2024-06-18 22:31:52,414][19107] Updated weights for policy 0, policy_version 224615 (0.0034) [2024-06-18 22:31:55,500][18875] Fps is (10 sec: 37683.5, 60 sec: 41779.4, 300 sec: 42265.2). Total num frames: 3680157696. Throughput: 0: 42093.5. Samples: 904324560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:31:55,500][18875] Avg episode reward: [(0, '0.677')] [2024-06-18 22:31:56,749][19107] Updated weights for policy 0, policy_version 224625 (0.0048) [2024-06-18 22:32:00,258][19107] Updated weights for policy 0, policy_version 224635 (0.0040) [2024-06-18 22:32:00,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 3680419840. Throughput: 0: 42120.3. Samples: 904449120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:32:00,501][18875] Avg episode reward: [(0, '0.337')] [2024-06-18 22:32:04,790][19107] Updated weights for policy 0, policy_version 224645 (0.0045) [2024-06-18 22:32:05,500][18875] Fps is (10 sec: 44235.7, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 3680600064. Throughput: 0: 42064.3. Samples: 904698280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:05,501][18875] Avg episode reward: [(0, '0.672')] [2024-06-18 22:32:08,457][19107] Updated weights for policy 0, policy_version 224655 (0.0033) [2024-06-18 22:32:10,500][18875] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 3680813056. Throughput: 0: 41959.6. Samples: 904951480. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:10,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 22:32:12,588][19107] Updated weights for policy 0, policy_version 224665 (0.0047) [2024-06-18 22:32:15,500][18875] Fps is (10 sec: 44237.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 3681042432. Throughput: 0: 41984.9. Samples: 905073940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:15,500][18875] Avg episode reward: [(0, '0.714')] [2024-06-18 22:32:16,094][19107] Updated weights for policy 0, policy_version 224675 (0.0032) [2024-06-18 22:32:20,446][19107] Updated weights for policy 0, policy_version 224685 (0.0039) [2024-06-18 22:32:20,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 3681239040. Throughput: 0: 42179.1. Samples: 905336220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:20,501][18875] Avg episode reward: [(0, '0.809')] [2024-06-18 22:32:23,885][19107] Updated weights for policy 0, policy_version 224695 (0.0020) [2024-06-18 22:32:25,501][18875] Fps is (10 sec: 40958.8, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 3681452032. Throughput: 0: 42092.8. Samples: 905584460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:25,501][18875] Avg episode reward: [(0, '0.728')] [2024-06-18 22:32:28,284][19107] Updated weights for policy 0, policy_version 224705 (0.0033) [2024-06-18 22:32:30,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42265.4). Total num frames: 3681665024. Throughput: 0: 42237.6. Samples: 905717160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:30,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 22:32:31,935][19107] Updated weights for policy 0, policy_version 224715 (0.0029) [2024-06-18 22:32:35,500][18875] Fps is (10 sec: 40960.9, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 3681861632. Throughput: 0: 42126.8. Samples: 905966200. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:35,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 22:32:35,515][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000224723_3681861632.pth... [2024-06-18 22:32:35,584][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000224104_3671719936.pth [2024-06-18 22:32:36,084][19107] Updated weights for policy 0, policy_version 224725 (0.0041) [2024-06-18 22:32:39,544][19107] Updated weights for policy 0, policy_version 224735 (0.0040) [2024-06-18 22:32:40,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 3682074624. Throughput: 0: 42101.6. Samples: 906219140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:40,501][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 22:32:43,815][19107] Updated weights for policy 0, policy_version 224745 (0.0029) [2024-06-18 22:32:45,500][18875] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42432.3). Total num frames: 3682320384. Throughput: 0: 42327.3. Samples: 906353840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:45,500][18875] Avg episode reward: [(0, '0.616')] [2024-06-18 22:32:47,055][19107] Updated weights for policy 0, policy_version 224755 (0.0035) [2024-06-18 22:32:50,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 3682500608. Throughput: 0: 42438.8. Samples: 906608020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:50,501][18875] Avg episode reward: [(0, '0.595')] [2024-06-18 22:32:51,343][19107] Updated weights for policy 0, policy_version 224765 (0.0038) [2024-06-18 22:32:53,379][19087] Signal inference workers to stop experience collection... (13250 times) [2024-06-18 22:32:53,379][19087] Signal inference workers to resume experience collection... 
(13250 times) [2024-06-18 22:32:53,401][19107] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-06-18 22:32:53,402][19107] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-06-18 22:32:54,722][19107] Updated weights for policy 0, policy_version 224775 (0.0034) [2024-06-18 22:32:55,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 3682729984. Throughput: 0: 42359.7. Samples: 906857660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:32:55,500][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 22:32:58,999][19107] Updated weights for policy 0, policy_version 224785 (0.0038) [2024-06-18 22:33:00,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 3682942976. Throughput: 0: 42667.0. Samples: 906993960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:33:00,501][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 22:33:02,500][19107] Updated weights for policy 0, policy_version 224795 (0.0030) [2024-06-18 22:33:05,500][18875] Fps is (10 sec: 39321.2, 60 sec: 42052.4, 300 sec: 42432.3). Total num frames: 3683123200. Throughput: 0: 42360.8. Samples: 907242460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:33:05,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 22:33:06,617][19107] Updated weights for policy 0, policy_version 224805 (0.0028) [2024-06-18 22:33:10,105][19107] Updated weights for policy 0, policy_version 224815 (0.0039) [2024-06-18 22:33:10,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 3683368960. Throughput: 0: 42444.2. Samples: 907494440. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 22:33:10,501][18875] Avg episode reward: [(0, '0.544')] [2024-06-18 22:33:14,216][19107] Updated weights for policy 0, policy_version 224825 (0.0031) [2024-06-18 22:33:15,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 3683581952. Throughput: 0: 42478.3. Samples: 907628680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:33:15,501][18875] Avg episode reward: [(0, '0.464')] [2024-06-18 22:33:18,004][19107] Updated weights for policy 0, policy_version 224835 (0.0029) [2024-06-18 22:33:20,500][18875] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 3683778560. Throughput: 0: 42558.6. Samples: 907881340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:33:20,501][18875] Avg episode reward: [(0, '0.670')] [2024-06-18 22:33:21,726][19107] Updated weights for policy 0, policy_version 224845 (0.0031) [2024-06-18 22:33:25,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42598.6, 300 sec: 42376.8). Total num frames: 3684007936. Throughput: 0: 42490.8. Samples: 908131220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:33:25,500][18875] Avg episode reward: [(0, '0.578')] [2024-06-18 22:33:25,547][19107] Updated weights for policy 0, policy_version 224855 (0.0036) [2024-06-18 22:33:29,581][19107] Updated weights for policy 0, policy_version 224865 (0.0033) [2024-06-18 22:33:30,501][18875] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42431.7). Total num frames: 3684220928. Throughput: 0: 42382.8. Samples: 908261080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:33:30,501][18875] Avg episode reward: [(0, '0.634')] [2024-06-18 22:33:33,531][19107] Updated weights for policy 0, policy_version 224875 (0.0041) [2024-06-18 22:33:35,500][18875] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42321.2). Total num frames: 3684401152. Throughput: 0: 42397.8. Samples: 908515920. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:33:35,501][18875] Avg episode reward: [(0, '0.613')] [2024-06-18 22:33:37,309][19107] Updated weights for policy 0, policy_version 224885 (0.0044) [2024-06-18 22:33:40,500][18875] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 3684646912. Throughput: 0: 42315.4. Samples: 908761860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:33:40,505][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 22:33:41,974][19107] Updated weights for policy 0, policy_version 224895 (0.0033) [2024-06-18 22:33:44,910][19107] Updated weights for policy 0, policy_version 224905 (0.0040) [2024-06-18 22:33:45,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.1, 300 sec: 42431.8). Total num frames: 3684843520. Throughput: 0: 42170.5. Samples: 908891640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:33:45,501][18875] Avg episode reward: [(0, '0.480')] [2024-06-18 22:33:49,509][19107] Updated weights for policy 0, policy_version 224915 (0.0039) [2024-06-18 22:33:50,500][18875] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 3685040128. Throughput: 0: 42294.1. Samples: 909145700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:33:50,501][18875] Avg episode reward: [(0, '0.299')] [2024-06-18 22:33:52,779][19107] Updated weights for policy 0, policy_version 224925 (0.0046) [2024-06-18 22:33:55,500][18875] Fps is (10 sec: 40960.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 3685253120. Throughput: 0: 42310.2. Samples: 909398400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:33:55,501][18875] Avg episode reward: [(0, '0.439')] [2024-06-18 22:33:57,077][19107] Updated weights for policy 0, policy_version 224935 (0.0039) [2024-06-18 22:34:00,453][19107] Updated weights for policy 0, policy_version 224945 (0.0029) [2024-06-18 22:34:00,504][18875] Fps is (10 sec: 45859.1, 60 sec: 42595.8, 300 sec: 42431.3). 
Total num frames: 3685498880. Throughput: 0: 42232.2. Samples: 909529280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:34:00,505][18875] Avg episode reward: [(0, '0.628')] [2024-06-18 22:34:04,775][19107] Updated weights for policy 0, policy_version 224955 (0.0035) [2024-06-18 22:34:05,500][18875] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3685662720. Throughput: 0: 42225.8. Samples: 909781500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:34:05,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 22:34:08,070][19107] Updated weights for policy 0, policy_version 224965 (0.0028) [2024-06-18 22:34:10,500][18875] Fps is (10 sec: 40974.5, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 3685908480. Throughput: 0: 42186.1. Samples: 910029600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:34:10,501][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 22:34:13,369][19107] Updated weights for policy 0, policy_version 224975 (0.0046) [2024-06-18 22:34:15,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 3686105088. Throughput: 0: 42332.6. Samples: 910166040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:34:15,501][18875] Avg episode reward: [(0, '0.722')] [2024-06-18 22:34:15,947][19107] Updated weights for policy 0, policy_version 224985 (0.0034) [2024-06-18 22:34:20,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 3686301696. Throughput: 0: 42037.0. Samples: 910407580. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:34:20,501][18875] Avg episode reward: [(0, '0.649')] [2024-06-18 22:34:21,115][19107] Updated weights for policy 0, policy_version 224995 (0.0023) [2024-06-18 22:34:23,737][19107] Updated weights for policy 0, policy_version 225005 (0.0047) [2024-06-18 22:34:24,275][19087] Signal inference workers to stop experience collection... 
(13300 times) [2024-06-18 22:34:24,285][19107] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-06-18 22:34:24,335][19087] Signal inference workers to resume experience collection... (13300 times) [2024-06-18 22:34:24,336][19107] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-06-18 22:34:25,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 3686547456. Throughput: 0: 42106.1. Samples: 910656640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:34:25,501][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 22:34:28,601][19107] Updated weights for policy 0, policy_version 225015 (0.0040) [2024-06-18 22:34:30,504][18875] Fps is (10 sec: 42582.9, 60 sec: 41776.8, 300 sec: 42264.7). Total num frames: 3686727680. Throughput: 0: 42290.1. Samples: 910794840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:34:30,505][18875] Avg episode reward: [(0, '0.545')] [2024-06-18 22:34:31,899][19107] Updated weights for policy 0, policy_version 225025 (0.0042) [2024-06-18 22:34:35,500][18875] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3686940672. Throughput: 0: 41973.0. Samples: 911034480. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:34:35,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 22:34:35,516][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000225033_3686940672.pth... [2024-06-18 22:34:35,568][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000224416_3676831744.pth [2024-06-18 22:34:36,370][19107] Updated weights for policy 0, policy_version 225035 (0.0038) [2024-06-18 22:34:39,628][19107] Updated weights for policy 0, policy_version 225045 (0.0045) [2024-06-18 22:34:40,500][18875] Fps is (10 sec: 45891.3, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 3687186432. Throughput: 0: 42013.6. Samples: 911289020. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:34:40,501][18875] Avg episode reward: [(0, '0.428')] [2024-06-18 22:34:44,390][19107] Updated weights for policy 0, policy_version 225055 (0.0034) [2024-06-18 22:34:45,500][18875] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 3687333888. Throughput: 0: 42011.4. Samples: 911419640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:34:45,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 22:34:47,544][19107] Updated weights for policy 0, policy_version 225065 (0.0039) [2024-06-18 22:34:50,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 3687596032. Throughput: 0: 41852.1. Samples: 911664840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:34:50,501][18875] Avg episode reward: [(0, '0.676')] [2024-06-18 22:34:51,902][19107] Updated weights for policy 0, policy_version 225075 (0.0032) [2024-06-18 22:34:55,353][19107] Updated weights for policy 0, policy_version 225085 (0.0027) [2024-06-18 22:34:55,500][18875] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3687792640. Throughput: 0: 42230.3. Samples: 911929960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:34:55,501][18875] Avg episode reward: [(0, '0.519')] [2024-06-18 22:34:59,403][19107] Updated weights for policy 0, policy_version 225095 (0.0028) [2024-06-18 22:35:00,500][18875] Fps is (10 sec: 37682.6, 60 sec: 41235.4, 300 sec: 42099.0). Total num frames: 3687972864. Throughput: 0: 41897.2. Samples: 912051420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:35:00,501][18875] Avg episode reward: [(0, '0.491')] [2024-06-18 22:35:02,882][19107] Updated weights for policy 0, policy_version 225105 (0.0035) [2024-06-18 22:35:05,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 3688218624. Throughput: 0: 42158.6. Samples: 912304720. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:35:05,501][18875] Avg episode reward: [(0, '0.514')] [2024-06-18 22:35:06,990][19107] Updated weights for policy 0, policy_version 225115 (0.0027) [2024-06-18 22:35:10,433][19107] Updated weights for policy 0, policy_version 225125 (0.0044) [2024-06-18 22:35:10,504][18875] Fps is (10 sec: 47497.0, 60 sec: 42322.8, 300 sec: 42264.6). Total num frames: 3688448000. Throughput: 0: 42231.4. Samples: 912557200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:35:10,505][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 22:35:15,154][19107] Updated weights for policy 0, policy_version 225135 (0.0033) [2024-06-18 22:35:15,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3688611840. Throughput: 0: 41930.0. Samples: 912681540. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:35:15,501][18875] Avg episode reward: [(0, '0.562')] [2024-06-18 22:35:18,041][19107] Updated weights for policy 0, policy_version 225145 (0.0027) [2024-06-18 22:35:20,500][18875] Fps is (10 sec: 42613.7, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 3688873984. Throughput: 0: 42316.0. Samples: 912938700. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:35:20,501][18875] Avg episode reward: [(0, '0.395')] [2024-06-18 22:35:22,722][19107] Updated weights for policy 0, policy_version 225155 (0.0047) [2024-06-18 22:35:25,500][18875] Fps is (10 sec: 47514.1, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 3689086976. Throughput: 0: 42363.7. Samples: 913195380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-18 22:35:25,501][18875] Avg episode reward: [(0, '0.381')] [2024-06-18 22:35:25,602][19107] Updated weights for policy 0, policy_version 225165 (0.0041) [2024-06-18 22:35:30,500][18875] Fps is (10 sec: 37683.4, 60 sec: 42054.8, 300 sec: 42154.1). Total num frames: 3689250816. Throughput: 0: 42200.4. Samples: 913318660. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:35:30,501][18875] Avg episode reward: [(0, '0.586')] [2024-06-18 22:35:30,537][19107] Updated weights for policy 0, policy_version 225175 (0.0033) [2024-06-18 22:35:33,597][19107] Updated weights for policy 0, policy_version 225185 (0.0036) [2024-06-18 22:35:35,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 3689512960. Throughput: 0: 42407.6. Samples: 913573180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:35:35,501][18875] Avg episode reward: [(0, '0.353')] [2024-06-18 22:35:38,407][19107] Updated weights for policy 0, policy_version 225195 (0.0025) [2024-06-18 22:35:38,925][19087] Signal inference workers to stop experience collection... (13350 times) [2024-06-18 22:35:38,926][19087] Signal inference workers to resume experience collection... (13350 times) [2024-06-18 22:35:38,973][19107] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-06-18 22:35:38,973][19107] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-06-18 22:35:40,500][18875] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 3689693184. Throughput: 0: 42292.4. Samples: 913833120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:35:40,501][18875] Avg episode reward: [(0, '0.426')] [2024-06-18 22:35:41,194][19107] Updated weights for policy 0, policy_version 225205 (0.0035) [2024-06-18 22:35:45,504][18875] Fps is (10 sec: 37669.6, 60 sec: 42595.8, 300 sec: 42209.1). Total num frames: 3689889792. Throughput: 0: 42254.5. Samples: 913953020. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:35:45,504][18875] Avg episode reward: [(0, '0.392')] [2024-06-18 22:35:45,980][19107] Updated weights for policy 0, policy_version 225215 (0.0034) [2024-06-18 22:35:48,906][19107] Updated weights for policy 0, policy_version 225225 (0.0030) [2024-06-18 22:35:50,500][18875] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 3690151936. Throughput: 0: 42266.3. Samples: 914206700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:35:50,501][18875] Avg episode reward: [(0, '0.438')] [2024-06-18 22:35:53,775][19107] Updated weights for policy 0, policy_version 225235 (0.0037) [2024-06-18 22:35:55,500][18875] Fps is (10 sec: 42613.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3690315776. Throughput: 0: 42450.1. Samples: 914467300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:35:55,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 22:35:56,745][19107] Updated weights for policy 0, policy_version 225245 (0.0039) [2024-06-18 22:36:00,500][18875] Fps is (10 sec: 37682.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3690528768. Throughput: 0: 42342.2. Samples: 914586940. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:36:00,501][18875] Avg episode reward: [(0, '0.669')] [2024-06-18 22:36:01,608][19107] Updated weights for policy 0, policy_version 225255 (0.0028) [2024-06-18 22:36:04,459][19107] Updated weights for policy 0, policy_version 225265 (0.0029) [2024-06-18 22:36:05,500][18875] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 3690774528. Throughput: 0: 42355.5. Samples: 914844700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:36:05,501][18875] Avg episode reward: [(0, '0.298')] [2024-06-18 22:36:09,239][19107] Updated weights for policy 0, policy_version 225275 (0.0027) [2024-06-18 22:36:10,500][18875] Fps is (10 sec: 42599.3, 60 sec: 41781.8, 300 sec: 42154.1). 
Total num frames: 3690954752. Throughput: 0: 42331.6. Samples: 915100300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:36:10,500][18875] Avg episode reward: [(0, '0.738')] [2024-06-18 22:36:12,604][19107] Updated weights for policy 0, policy_version 225285 (0.0036) [2024-06-18 22:36:15,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3691167744. Throughput: 0: 42300.8. Samples: 915222200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:36:15,501][18875] Avg episode reward: [(0, '0.676')] [2024-06-18 22:36:16,889][19107] Updated weights for policy 0, policy_version 225295 (0.0041) [2024-06-18 22:36:20,247][19107] Updated weights for policy 0, policy_version 225305 (0.0029) [2024-06-18 22:36:20,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 3691397120. Throughput: 0: 42207.0. Samples: 915472500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:36:20,501][18875] Avg episode reward: [(0, '0.362')] [2024-06-18 22:36:24,702][19107] Updated weights for policy 0, policy_version 225315 (0.0033) [2024-06-18 22:36:25,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 42209.6). Total num frames: 3691577344. Throughput: 0: 42217.3. Samples: 915732900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:36:25,501][18875] Avg episode reward: [(0, '0.596')] [2024-06-18 22:36:28,052][19107] Updated weights for policy 0, policy_version 225325 (0.0040) [2024-06-18 22:36:30,501][18875] Fps is (10 sec: 40959.6, 60 sec: 42598.2, 300 sec: 42265.1). Total num frames: 3691806720. Throughput: 0: 42287.6. Samples: 915855820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:36:30,501][18875] Avg episode reward: [(0, '0.462')] [2024-06-18 22:36:32,566][19107] Updated weights for policy 0, policy_version 225335 (0.0037) [2024-06-18 22:36:35,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 42265.2). 
Total num frames: 3692019712. Throughput: 0: 42240.2. Samples: 916107520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 22:36:35,501][18875] Avg episode reward: [(0, '0.754')] [2024-06-18 22:36:35,658][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000225344_3692036096.pth... [2024-06-18 22:36:35,703][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000224723_3681861632.pth [2024-06-18 22:36:35,883][19107] Updated weights for policy 0, policy_version 225345 (0.0023) [2024-06-18 22:36:40,202][19107] Updated weights for policy 0, policy_version 225355 (0.0029) [2024-06-18 22:36:40,500][18875] Fps is (10 sec: 40960.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3692216320. Throughput: 0: 42240.5. Samples: 916368120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:36:40,501][18875] Avg episode reward: [(0, '0.699')] [2024-06-18 22:36:43,588][19107] Updated weights for policy 0, policy_version 225365 (0.0037) [2024-06-18 22:36:45,504][18875] Fps is (10 sec: 42583.6, 60 sec: 42598.4, 300 sec: 42264.7). Total num frames: 3692445696. Throughput: 0: 42389.6. Samples: 916494620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:36:45,505][18875] Avg episode reward: [(0, '0.689')] [2024-06-18 22:36:47,937][19107] Updated weights for policy 0, policy_version 225375 (0.0042) [2024-06-18 22:36:50,500][18875] Fps is (10 sec: 42597.7, 60 sec: 41506.0, 300 sec: 42320.7). Total num frames: 3692642304. Throughput: 0: 42315.5. Samples: 916748900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:36:50,501][18875] Avg episode reward: [(0, '0.443')] [2024-06-18 22:36:51,430][19107] Updated weights for policy 0, policy_version 225385 (0.0041) [2024-06-18 22:36:55,500][18875] Fps is (10 sec: 40974.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3692855296. Throughput: 0: 42210.2. Samples: 916999760. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:36:55,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 22:36:55,852][19107] Updated weights for policy 0, policy_version 225395 (0.0034) [2024-06-18 22:36:56,948][19087] Signal inference workers to stop experience collection... (13400 times) [2024-06-18 22:36:56,992][19107] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-06-18 22:36:56,998][19087] Signal inference workers to resume experience collection... (13400 times) [2024-06-18 22:36:57,006][19107] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-06-18 22:36:59,359][19107] Updated weights for policy 0, policy_version 225405 (0.0037) [2024-06-18 22:37:00,500][18875] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 3693068288. Throughput: 0: 42205.9. Samples: 917121460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:00,500][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 22:37:03,475][19107] Updated weights for policy 0, policy_version 225415 (0.0044) [2024-06-18 22:37:05,504][18875] Fps is (10 sec: 44220.7, 60 sec: 42049.8, 300 sec: 42320.2). Total num frames: 3693297664. Throughput: 0: 42314.5. Samples: 917376800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:05,504][18875] Avg episode reward: [(0, '0.504')] [2024-06-18 22:37:07,221][19107] Updated weights for policy 0, policy_version 225425 (0.0031) [2024-06-18 22:37:10,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3693477888. Throughput: 0: 42048.9. Samples: 917625100. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:10,502][18875] Avg episode reward: [(0, '0.644')] [2024-06-18 22:37:11,339][19107] Updated weights for policy 0, policy_version 225435 (0.0032) [2024-06-18 22:37:15,151][19107] Updated weights for policy 0, policy_version 225445 (0.0029) [2024-06-18 22:37:15,501][18875] Fps is (10 sec: 40974.0, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 3693707264. Throughput: 0: 42062.2. Samples: 917748620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:15,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 22:37:19,247][19107] Updated weights for policy 0, policy_version 225455 (0.0036) [2024-06-18 22:37:20,504][18875] Fps is (10 sec: 44221.3, 60 sec: 42049.8, 300 sec: 42264.7). Total num frames: 3693920256. Throughput: 0: 42263.4. Samples: 918009520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:20,504][18875] Avg episode reward: [(0, '0.690')] [2024-06-18 22:37:22,855][19107] Updated weights for policy 0, policy_version 225465 (0.0038) [2024-06-18 22:37:25,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3694116864. Throughput: 0: 42090.9. Samples: 918262220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:25,501][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 22:37:26,727][19107] Updated weights for policy 0, policy_version 225475 (0.0028) [2024-06-18 22:37:30,358][19107] Updated weights for policy 0, policy_version 225485 (0.0033) [2024-06-18 22:37:30,501][18875] Fps is (10 sec: 42612.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 3694346240. Throughput: 0: 41992.1. Samples: 918384120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:30,501][18875] Avg episode reward: [(0, '0.624')] [2024-06-18 22:37:34,519][19107] Updated weights for policy 0, policy_version 225495 (0.0045) [2024-06-18 22:37:35,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42320.7). 
Total num frames: 3694559232. Throughput: 0: 42056.1. Samples: 918641420. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:35,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 22:37:37,877][19107] Updated weights for policy 0, policy_version 225505 (0.0038) [2024-06-18 22:37:40,500][18875] Fps is (10 sec: 37684.0, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3694723072. Throughput: 0: 42104.0. Samples: 918894440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:40,501][18875] Avg episode reward: [(0, '0.719')] [2024-06-18 22:37:42,310][19107] Updated weights for policy 0, policy_version 225515 (0.0029) [2024-06-18 22:37:45,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42054.8, 300 sec: 42265.2). Total num frames: 3694968832. Throughput: 0: 41949.3. Samples: 919009180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-18 22:37:45,501][18875] Avg episode reward: [(0, '0.416')] [2024-06-18 22:37:45,858][19107] Updated weights for policy 0, policy_version 225525 (0.0046) [2024-06-18 22:37:50,051][19107] Updated weights for policy 0, policy_version 225535 (0.0038) [2024-06-18 22:37:50,500][18875] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3695181824. Throughput: 0: 42081.1. Samples: 919270300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:37:50,501][18875] Avg episode reward: [(0, '0.500')] [2024-06-18 22:37:53,546][19107] Updated weights for policy 0, policy_version 225545 (0.0032) [2024-06-18 22:37:55,500][18875] Fps is (10 sec: 39321.1, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 3695362048. Throughput: 0: 42112.9. Samples: 919520180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:37:55,501][18875] Avg episode reward: [(0, '0.472')] [2024-06-18 22:37:57,857][19107] Updated weights for policy 0, policy_version 225555 (0.0040) [2024-06-18 22:38:00,500][18875] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42265.2). 
Total num frames: 3695591424. Throughput: 0: 42142.3. Samples: 919645020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:00,501][18875] Avg episode reward: [(0, '0.386')] [2024-06-18 22:38:01,245][19107] Updated weights for policy 0, policy_version 225565 (0.0038) [2024-06-18 22:38:05,500][18875] Fps is (10 sec: 44236.8, 60 sec: 41781.7, 300 sec: 42154.1). Total num frames: 3695804416. Throughput: 0: 42147.3. Samples: 919906000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:05,501][18875] Avg episode reward: [(0, '0.786')] [2024-06-18 22:38:05,566][19107] Updated weights for policy 0, policy_version 225575 (0.0026) [2024-06-18 22:38:09,162][19107] Updated weights for policy 0, policy_version 225585 (0.0040) [2024-06-18 22:38:10,504][18875] Fps is (10 sec: 40945.6, 60 sec: 42049.8, 300 sec: 42098.0). Total num frames: 3696001024. Throughput: 0: 41822.1. Samples: 920144360. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:10,504][18875] Avg episode reward: [(0, '0.509')] [2024-06-18 22:38:12,754][19087] Signal inference workers to stop experience collection... (13450 times) [2024-06-18 22:38:12,756][19087] Signal inference workers to resume experience collection... (13450 times) [2024-06-18 22:38:12,773][19107] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-06-18 22:38:12,774][19107] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-06-18 22:38:13,346][19107] Updated weights for policy 0, policy_version 225595 (0.0034) [2024-06-18 22:38:15,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 3696230400. Throughput: 0: 42008.6. Samples: 920274500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:15,501][18875] Avg episode reward: [(0, '0.421')] [2024-06-18 22:38:17,385][19107] Updated weights for policy 0, policy_version 225605 (0.0032) [2024-06-18 22:38:20,500][18875] Fps is (10 sec: 40974.7, 60 sec: 41508.6, 300 sec: 42043.0). Total num frames: 3696410624. Throughput: 0: 41869.0. Samples: 920525520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:20,501][18875] Avg episode reward: [(0, '0.390')] [2024-06-18 22:38:21,119][19107] Updated weights for policy 0, policy_version 225615 (0.0042) [2024-06-18 22:38:24,951][19107] Updated weights for policy 0, policy_version 225625 (0.0033) [2024-06-18 22:38:25,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42154.1). Total num frames: 3696656384. Throughput: 0: 41850.7. Samples: 920777720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:25,500][18875] Avg episode reward: [(0, '0.581')] [2024-06-18 22:38:29,036][19107] Updated weights for policy 0, policy_version 225635 (0.0045) [2024-06-18 22:38:30,500][18875] Fps is (10 sec: 45875.0, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 3696869376. Throughput: 0: 42269.2. Samples: 920911300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:30,501][18875] Avg episode reward: [(0, '0.684')] [2024-06-18 22:38:32,639][19107] Updated weights for policy 0, policy_version 225645 (0.0037) [2024-06-18 22:38:35,500][18875] Fps is (10 sec: 39321.3, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3697049600. Throughput: 0: 41893.0. Samples: 921155480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:35,501][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 22:38:35,644][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000225651_3697065984.pth... 
[2024-06-18 22:38:35,693][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000225033_3686940672.pth [2024-06-18 22:38:36,604][19107] Updated weights for policy 0, policy_version 225655 (0.0037) [2024-06-18 22:38:40,469][19107] Updated weights for policy 0, policy_version 225665 (0.0036) [2024-06-18 22:38:40,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42209.6). Total num frames: 3697295360. Throughput: 0: 42087.0. Samples: 921414100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:40,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 22:38:44,395][19107] Updated weights for policy 0, policy_version 225675 (0.0028) [2024-06-18 22:38:45,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3697475584. Throughput: 0: 42224.1. Samples: 921545100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:45,500][18875] Avg episode reward: [(0, '0.484')] [2024-06-18 22:38:48,363][19107] Updated weights for policy 0, policy_version 225685 (0.0033) [2024-06-18 22:38:50,500][18875] Fps is (10 sec: 40961.0, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 3697704960. Throughput: 0: 42035.7. Samples: 921797600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:50,501][18875] Avg episode reward: [(0, '0.417')] [2024-06-18 22:38:52,029][19107] Updated weights for policy 0, policy_version 225695 (0.0035) [2024-06-18 22:38:55,500][18875] Fps is (10 sec: 44235.9, 60 sec: 42598.4, 300 sec: 42099.0). Total num frames: 3697917952. Throughput: 0: 42425.5. Samples: 922053360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 22:38:55,501][18875] Avg episode reward: [(0, '0.306')] [2024-06-18 22:38:55,949][19107] Updated weights for policy 0, policy_version 225705 (0.0050) [2024-06-18 22:38:59,841][19107] Updated weights for policy 0, policy_version 225715 (0.0040) [2024-06-18 22:39:00,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3698114560. Throughput: 0: 42405.8. Samples: 922182760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:00,501][18875] Avg episode reward: [(0, '0.540')] [2024-06-18 22:39:03,459][19107] Updated weights for policy 0, policy_version 225725 (0.0038) [2024-06-18 22:39:05,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3698343936. Throughput: 0: 42444.4. Samples: 922435520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:05,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 22:39:07,612][19107] Updated weights for policy 0, policy_version 225735 (0.0026) [2024-06-18 22:39:10,500][18875] Fps is (10 sec: 45875.1, 60 sec: 42874.0, 300 sec: 42265.2). Total num frames: 3698573312. Throughput: 0: 42433.7. Samples: 922687240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:10,501][18875] Avg episode reward: [(0, '0.326')] [2024-06-18 22:39:11,214][19107] Updated weights for policy 0, policy_version 225745 (0.0043) [2024-06-18 22:39:15,437][19107] Updated weights for policy 0, policy_version 225755 (0.0038) [2024-06-18 22:39:15,504][18875] Fps is (10 sec: 42583.3, 60 sec: 42322.8, 300 sec: 42264.6). Total num frames: 3698769920. Throughput: 0: 42245.1. Samples: 922812480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:15,504][18875] Avg episode reward: [(0, '0.348')] [2024-06-18 22:39:18,412][19087] Signal inference workers to stop experience collection... 
(13500 times) [2024-06-18 22:39:18,463][19107] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-06-18 22:39:18,465][19087] Signal inference workers to resume experience collection... (13500 times) [2024-06-18 22:39:18,478][19107] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-06-18 22:39:19,117][19107] Updated weights for policy 0, policy_version 225765 (0.0044) [2024-06-18 22:39:20,504][18875] Fps is (10 sec: 40945.0, 60 sec: 42868.9, 300 sec: 42153.6). Total num frames: 3698982912. Throughput: 0: 42495.6. Samples: 923067940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:20,505][18875] Avg episode reward: [(0, '0.348')] [2024-06-18 22:39:23,153][19107] Updated weights for policy 0, policy_version 225775 (0.0039) [2024-06-18 22:39:25,500][18875] Fps is (10 sec: 40974.8, 60 sec: 42052.2, 300 sec: 42210.1). Total num frames: 3699179520. Throughput: 0: 42426.8. Samples: 923323300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:25,501][18875] Avg episode reward: [(0, '0.620')] [2024-06-18 22:39:26,904][19107] Updated weights for policy 0, policy_version 225785 (0.0042) [2024-06-18 22:39:30,500][18875] Fps is (10 sec: 40974.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3699392512. Throughput: 0: 42213.1. Samples: 923444700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:30,501][18875] Avg episode reward: [(0, '0.579')] [2024-06-18 22:39:30,970][19107] Updated weights for policy 0, policy_version 225795 (0.0039) [2024-06-18 22:39:34,554][19107] Updated weights for policy 0, policy_version 225805 (0.0029) [2024-06-18 22:39:35,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 3699605504. Throughput: 0: 42237.7. Samples: 923698300. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:35,502][18875] Avg episode reward: [(0, '0.740')] [2024-06-18 22:39:38,736][19107] Updated weights for policy 0, policy_version 225815 (0.0037) [2024-06-18 22:39:40,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 3699834880. Throughput: 0: 42394.7. Samples: 923961120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:40,501][18875] Avg episode reward: [(0, '0.674')] [2024-06-18 22:39:42,364][19107] Updated weights for policy 0, policy_version 225825 (0.0039) [2024-06-18 22:39:45,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3700031488. Throughput: 0: 42188.8. Samples: 924081260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:45,501][18875] Avg episode reward: [(0, '0.897')] [2024-06-18 22:39:46,221][19107] Updated weights for policy 0, policy_version 225835 (0.0028) [2024-06-18 22:39:50,093][19107] Updated weights for policy 0, policy_version 225845 (0.0037) [2024-06-18 22:39:50,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 3700260864. Throughput: 0: 42220.4. Samples: 924335440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:50,501][18875] Avg episode reward: [(0, '0.783')] [2024-06-18 22:39:54,089][19107] Updated weights for policy 0, policy_version 225855 (0.0047) [2024-06-18 22:39:55,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 3700457472. Throughput: 0: 42358.2. Samples: 924593360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:39:55,501][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 22:39:57,946][19107] Updated weights for policy 0, policy_version 225865 (0.0033) [2024-06-18 22:40:00,500][18875] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3700654080. Throughput: 0: 42330.5. Samples: 924717200. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-18 22:40:00,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 22:40:01,941][19107] Updated weights for policy 0, policy_version 225875 (0.0043) [2024-06-18 22:40:05,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42154.6). Total num frames: 3700883456. Throughput: 0: 42307.4. Samples: 924971620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:05,501][18875] Avg episode reward: [(0, '0.378')] [2024-06-18 22:40:05,702][19107] Updated weights for policy 0, policy_version 225885 (0.0043) [2024-06-18 22:40:09,808][19107] Updated weights for policy 0, policy_version 225895 (0.0027) [2024-06-18 22:40:10,500][18875] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 3701080064. Throughput: 0: 42299.5. Samples: 925226780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:10,501][18875] Avg episode reward: [(0, '0.576')] [2024-06-18 22:40:13,563][19107] Updated weights for policy 0, policy_version 225905 (0.0028) [2024-06-18 22:40:15,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42054.7, 300 sec: 42098.5). Total num frames: 3701293056. Throughput: 0: 42389.8. Samples: 925352240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:15,501][18875] Avg episode reward: [(0, '0.541')] [2024-06-18 22:40:17,496][19107] Updated weights for policy 0, policy_version 225915 (0.0034) [2024-06-18 22:40:20,504][18875] Fps is (10 sec: 44220.9, 60 sec: 42325.4, 300 sec: 42153.6). Total num frames: 3701522432. Throughput: 0: 42321.5. Samples: 925602920. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:20,505][18875] Avg episode reward: [(0, '0.745')] [2024-06-18 22:40:21,189][19107] Updated weights for policy 0, policy_version 225925 (0.0036) [2024-06-18 22:40:25,040][19107] Updated weights for policy 0, policy_version 225935 (0.0035) [2024-06-18 22:40:25,500][18875] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 3701719040. Throughput: 0: 42197.8. Samples: 925860020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:25,501][18875] Avg episode reward: [(0, '0.667')] [2024-06-18 22:40:28,847][19107] Updated weights for policy 0, policy_version 225945 (0.0043) [2024-06-18 22:40:30,500][18875] Fps is (10 sec: 40974.7, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 3701932032. Throughput: 0: 42331.1. Samples: 925986160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:30,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 22:40:32,729][19107] Updated weights for policy 0, policy_version 225955 (0.0046) [2024-06-18 22:40:33,809][19087] Signal inference workers to stop experience collection... (13550 times) [2024-06-18 22:40:33,855][19107] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-06-18 22:40:33,864][19087] Signal inference workers to resume experience collection... (13550 times) [2024-06-18 22:40:33,880][19107] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-06-18 22:40:35,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3702145024. Throughput: 0: 42309.0. Samples: 926239340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:35,501][18875] Avg episode reward: [(0, '0.542')] [2024-06-18 22:40:35,517][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000225961_3702145024.pth... 
[2024-06-18 22:40:35,619][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000225344_3692036096.pth [2024-06-18 22:40:36,623][19107] Updated weights for policy 0, policy_version 225965 (0.0030) [2024-06-18 22:40:40,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42210.1). Total num frames: 3702341632. Throughput: 0: 42226.3. Samples: 926493540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:40,500][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 22:40:40,678][19107] Updated weights for policy 0, policy_version 225975 (0.0039) [2024-06-18 22:40:44,363][19107] Updated weights for policy 0, policy_version 225985 (0.0041) [2024-06-18 22:40:45,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 3702571008. Throughput: 0: 42143.9. Samples: 926613680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:45,501][18875] Avg episode reward: [(0, '0.577')] [2024-06-18 22:40:48,303][19107] Updated weights for policy 0, policy_version 225995 (0.0040) [2024-06-18 22:40:50,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 3702784000. Throughput: 0: 42244.6. Samples: 926872620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:50,500][18875] Avg episode reward: [(0, '0.708')] [2024-06-18 22:40:52,094][19107] Updated weights for policy 0, policy_version 226005 (0.0029) [2024-06-18 22:40:55,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3702980608. Throughput: 0: 42142.2. Samples: 927123180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:40:55,501][18875] Avg episode reward: [(0, '0.644')] [2024-06-18 22:40:56,226][19107] Updated weights for policy 0, policy_version 226015 (0.0039) [2024-06-18 22:40:59,867][19107] Updated weights for policy 0, policy_version 226025 (0.0043) [2024-06-18 22:41:00,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 3703193600. Throughput: 0: 42125.1. Samples: 927247860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:41:00,500][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 22:41:03,630][19107] Updated weights for policy 0, policy_version 226035 (0.0034) [2024-06-18 22:41:05,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3703406592. Throughput: 0: 42144.2. Samples: 927499260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:41:05,501][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 22:41:07,338][19107] Updated weights for policy 0, policy_version 226045 (0.0040) [2024-06-18 22:41:10,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 3703619584. Throughput: 0: 42215.8. Samples: 927759720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-18 22:41:10,500][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 22:41:11,754][19107] Updated weights for policy 0, policy_version 226055 (0.0039) [2024-06-18 22:41:15,290][19107] Updated weights for policy 0, policy_version 226065 (0.0024) [2024-06-18 22:41:15,500][18875] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 3703848960. Throughput: 0: 42151.2. Samples: 927882960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:41:15,501][18875] Avg episode reward: [(0, '0.456')] [2024-06-18 22:41:19,433][19107] Updated weights for policy 0, policy_version 226075 (0.0035) [2024-06-18 22:41:20,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42054.9, 300 sec: 42265.2). Total num frames: 3704045568. Throughput: 0: 42171.2. Samples: 928137040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:41:20,500][18875] Avg episode reward: [(0, '0.645')] [2024-06-18 22:41:23,077][19107] Updated weights for policy 0, policy_version 226085 (0.0037) [2024-06-18 22:41:25,500][18875] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3704242176. Throughput: 0: 42318.1. Samples: 928397860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:41:25,501][18875] Avg episode reward: [(0, '0.619')] [2024-06-18 22:41:26,893][19107] Updated weights for policy 0, policy_version 226095 (0.0043) [2024-06-18 22:41:30,504][18875] Fps is (10 sec: 44220.2, 60 sec: 42595.8, 300 sec: 42264.7). Total num frames: 3704487936. Throughput: 0: 42307.8. Samples: 928517680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:41:30,505][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 22:41:31,400][19107] Updated weights for policy 0, policy_version 226105 (0.0026) [2024-06-18 22:41:34,712][19107] Updated weights for policy 0, policy_version 226115 (0.0041) [2024-06-18 22:41:35,500][18875] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 3704684544. Throughput: 0: 42281.7. Samples: 928775300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:41:35,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 22:41:39,291][19107] Updated weights for policy 0, policy_version 226125 (0.0027) [2024-06-18 22:41:40,504][18875] Fps is (10 sec: 37683.5, 60 sec: 42049.7, 300 sec: 42098.6). Total num frames: 3704864768. Throughput: 0: 42412.3. Samples: 929031880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:41:40,505][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 22:41:42,865][19107] Updated weights for policy 0, policy_version 226135 (0.0036) [2024-06-18 22:41:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 3705110528. Throughput: 0: 42363.1. Samples: 929154200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:41:45,501][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 22:41:47,128][19107] Updated weights for policy 0, policy_version 226145 (0.0023) [2024-06-18 22:41:50,500][18875] Fps is (10 sec: 42613.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3705290752. Throughput: 0: 42498.8. Samples: 929411700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:41:50,501][18875] Avg episode reward: [(0, '0.247')] [2024-06-18 22:41:50,880][19107] Updated weights for policy 0, policy_version 226155 (0.0032) [2024-06-18 22:41:54,871][19107] Updated weights for policy 0, policy_version 226165 (0.0032) [2024-06-18 22:41:55,500][18875] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3705503744. Throughput: 0: 42096.8. Samples: 929654080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:41:55,501][18875] Avg episode reward: [(0, '0.525')] [2024-06-18 22:41:58,691][19107] Updated weights for policy 0, policy_version 226175 (0.0036) [2024-06-18 22:42:00,500][18875] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42210.1). Total num frames: 3705749504. Throughput: 0: 42111.5. Samples: 929777980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:42:00,501][18875] Avg episode reward: [(0, '0.720')] [2024-06-18 22:42:01,212][19087] Signal inference workers to stop experience collection... (13600 times) [2024-06-18 22:42:01,212][19087] Signal inference workers to resume experience collection... 
(13600 times) [2024-06-18 22:42:01,237][19107] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-06-18 22:42:01,238][19107] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-06-18 22:42:02,956][19107] Updated weights for policy 0, policy_version 226185 (0.0034) [2024-06-18 22:42:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 3705929728. Throughput: 0: 42255.5. Samples: 930038540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:42:05,501][18875] Avg episode reward: [(0, '0.584')] [2024-06-18 22:42:06,214][19107] Updated weights for policy 0, policy_version 226195 (0.0037) [2024-06-18 22:42:10,500][18875] Fps is (10 sec: 37683.4, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 3706126336. Throughput: 0: 41909.0. Samples: 930283760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:42:10,501][18875] Avg episode reward: [(0, '0.711')] [2024-06-18 22:42:10,554][19107] Updated weights for policy 0, policy_version 226205 (0.0028) [2024-06-18 22:42:13,913][19107] Updated weights for policy 0, policy_version 226215 (0.0040) [2024-06-18 22:42:15,500][18875] Fps is (10 sec: 45874.4, 60 sec: 42325.2, 300 sec: 42265.7). Total num frames: 3706388480. Throughput: 0: 42003.7. Samples: 930407700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:42:15,501][18875] Avg episode reward: [(0, '0.546')] [2024-06-18 22:42:18,086][19107] Updated weights for policy 0, policy_version 226225 (0.0028) [2024-06-18 22:42:20,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3706552320. Throughput: 0: 42059.1. Samples: 930667960. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:42:20,501][18875] Avg episode reward: [(0, '0.665')] [2024-06-18 22:42:21,506][19107] Updated weights for policy 0, policy_version 226235 (0.0042) [2024-06-18 22:42:25,500][18875] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3706781696. Throughput: 0: 41994.8. Samples: 930921500. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:42:25,501][18875] Avg episode reward: [(0, '0.728')] [2024-06-18 22:42:25,600][19107] Updated weights for policy 0, policy_version 226245 (0.0030) [2024-06-18 22:42:29,330][19107] Updated weights for policy 0, policy_version 226255 (0.0040) [2024-06-18 22:42:30,500][18875] Fps is (10 sec: 49151.0, 60 sec: 42600.9, 300 sec: 42320.7). Total num frames: 3707043840. Throughput: 0: 42117.5. Samples: 931049500. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:42:30,501][18875] Avg episode reward: [(0, '0.467')] [2024-06-18 22:42:33,606][19107] Updated weights for policy 0, policy_version 226265 (0.0041) [2024-06-18 22:42:35,501][18875] Fps is (10 sec: 39321.0, 60 sec: 41505.9, 300 sec: 42209.6). Total num frames: 3707174912. Throughput: 0: 41948.2. Samples: 931299380. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:42:35,501][18875] Avg episode reward: [(0, '0.503')] [2024-06-18 22:42:35,510][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000226268_3707174912.pth... [2024-06-18 22:42:35,564][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000225651_3697065984.pth [2024-06-18 22:42:37,141][19107] Updated weights for policy 0, policy_version 226275 (0.0032) [2024-06-18 22:42:40,500][18875] Fps is (10 sec: 37683.4, 60 sec: 42600.9, 300 sec: 42209.6). Total num frames: 3707420672. Throughput: 0: 42016.8. Samples: 931544840. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:42:40,501][18875] Avg episode reward: [(0, '0.522')] [2024-06-18 22:42:41,272][19107] Updated weights for policy 0, policy_version 226285 (0.0036) [2024-06-18 22:42:45,074][19107] Updated weights for policy 0, policy_version 226295 (0.0028) [2024-06-18 22:42:45,500][18875] Fps is (10 sec: 45876.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3707633664. Throughput: 0: 42190.6. Samples: 931676560. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:42:45,501][18875] Avg episode reward: [(0, '0.409')] [2024-06-18 22:42:48,960][19107] Updated weights for policy 0, policy_version 226305 (0.0041) [2024-06-18 22:42:50,504][18875] Fps is (10 sec: 39307.7, 60 sec: 42049.7, 300 sec: 42209.1). Total num frames: 3707813888. Throughput: 0: 41915.7. Samples: 931924900. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:42:50,505][18875] Avg episode reward: [(0, '0.630')] [2024-06-18 22:42:52,838][19107] Updated weights for policy 0, policy_version 226315 (0.0028) [2024-06-18 22:42:55,500][18875] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 3708059648. Throughput: 0: 42020.5. Samples: 932174680. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:42:55,500][18875] Avg episode reward: [(0, '0.805')] [2024-06-18 22:42:56,597][19107] Updated weights for policy 0, policy_version 226325 (0.0041) [2024-06-18 22:43:00,504][18875] Fps is (10 sec: 44236.8, 60 sec: 41776.7, 300 sec: 42209.1). Total num frames: 3708256256. Throughput: 0: 42167.4. Samples: 932305380. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:43:00,505][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 22:43:00,635][19107] Updated weights for policy 0, policy_version 226335 (0.0035) [2024-06-18 22:43:04,663][19107] Updated weights for policy 0, policy_version 226345 (0.0039) [2024-06-18 22:43:05,500][18875] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42210.1). Total num frames: 3708452864. Throughput: 0: 41870.7. Samples: 932552140. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:43:05,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 22:43:08,462][19107] Updated weights for policy 0, policy_version 226355 (0.0036) [2024-06-18 22:43:10,502][18875] Fps is (10 sec: 42604.7, 60 sec: 42596.8, 300 sec: 42209.3). Total num frames: 3708682240. Throughput: 0: 41735.8. Samples: 932799700. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:43:10,503][18875] Avg episode reward: [(0, '0.389')] [2024-06-18 22:43:12,066][19087] Signal inference workers to stop experience collection... (13650 times) [2024-06-18 22:43:12,067][19087] Signal inference workers to resume experience collection... (13650 times) [2024-06-18 22:43:12,090][19107] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-06-18 22:43:12,091][19107] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-06-18 22:43:12,211][19107] Updated weights for policy 0, policy_version 226365 (0.0034) [2024-06-18 22:43:15,500][18875] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 3708895232. Throughput: 0: 41817.4. Samples: 932931280. 
Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:43:15,501][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 22:43:16,025][19107] Updated weights for policy 0, policy_version 226375 (0.0037) [2024-06-18 22:43:20,057][19107] Updated weights for policy 0, policy_version 226385 (0.0043) [2024-06-18 22:43:20,500][18875] Fps is (10 sec: 40968.2, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 3709091840. Throughput: 0: 41950.3. Samples: 933187140. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:43:20,501][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 22:43:23,925][19107] Updated weights for policy 0, policy_version 226395 (0.0032) [2024-06-18 22:43:25,500][18875] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 3709337600. Throughput: 0: 42057.5. Samples: 933437420. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:43:25,501][18875] Avg episode reward: [(0, '0.429')] [2024-06-18 22:43:28,082][19107] Updated weights for policy 0, policy_version 226405 (0.0022) [2024-06-18 22:43:30,500][18875] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 42265.1). Total num frames: 3709517824. Throughput: 0: 42033.3. Samples: 933568060. Policy #0 lag: (min: 0.0, avg: 12.8, max: 21.0) [2024-06-18 22:43:30,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 22:43:31,967][19107] Updated weights for policy 0, policy_version 226415 (0.0032) [2024-06-18 22:43:35,500][18875] Fps is (10 sec: 37683.2, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 3709714432. Throughput: 0: 42092.3. Samples: 933818900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:43:35,501][18875] Avg episode reward: [(0, '0.569')] [2024-06-18 22:43:35,767][19107] Updated weights for policy 0, policy_version 226425 (0.0023) [2024-06-18 22:43:39,589][19107] Updated weights for policy 0, policy_version 226435 (0.0031) [2024-06-18 22:43:40,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 3709943808. Throughput: 0: 42109.7. Samples: 934069620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:43:40,504][18875] Avg episode reward: [(0, '0.567')] [2024-06-18 22:43:44,051][19107] Updated weights for policy 0, policy_version 226445 (0.0043) [2024-06-18 22:43:45,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3710156800. Throughput: 0: 42054.9. Samples: 934197700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:43:45,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 22:43:47,532][19107] Updated weights for policy 0, policy_version 226455 (0.0028) [2024-06-18 22:43:50,500][18875] Fps is (10 sec: 42598.9, 60 sec: 42601.0, 300 sec: 42209.7). Total num frames: 3710369792. Throughput: 0: 42048.9. Samples: 934444340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:43:50,501][18875] Avg episode reward: [(0, '0.227')] [2024-06-18 22:43:51,732][19107] Updated weights for policy 0, policy_version 226465 (0.0047) [2024-06-18 22:43:55,401][19107] Updated weights for policy 0, policy_version 226475 (0.0034) [2024-06-18 22:43:55,500][18875] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 3710566400. Throughput: 0: 42347.9. Samples: 934705260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:43:55,501][18875] Avg episode reward: [(0, '0.250')] [2024-06-18 22:43:59,413][19107] Updated weights for policy 0, policy_version 226485 (0.0035) [2024-06-18 22:44:00,500][18875] Fps is (10 sec: 40959.9, 60 sec: 42054.9, 300 sec: 42154.1). Total num frames: 3710779392. Throughput: 0: 42197.5. Samples: 934830160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:44:00,501][18875] Avg episode reward: [(0, '0.414')] [2024-06-18 22:44:03,110][19107] Updated weights for policy 0, policy_version 226495 (0.0028) [2024-06-18 22:44:05,500][18875] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3710976000. Throughput: 0: 42020.2. Samples: 935078040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:44:05,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 22:44:07,121][19107] Updated weights for policy 0, policy_version 226505 (0.0033) [2024-06-18 22:44:10,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42053.7, 300 sec: 42154.6). Total num frames: 3711205376. Throughput: 0: 42189.2. Samples: 935335940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:44:10,501][18875] Avg episode reward: [(0, '0.573')] [2024-06-18 22:44:10,667][19107] Updated weights for policy 0, policy_version 226515 (0.0032) [2024-06-18 22:44:14,749][19107] Updated weights for policy 0, policy_version 226525 (0.0039) [2024-06-18 22:44:15,500][18875] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 42099.1). Total num frames: 3711401984. Throughput: 0: 42184.6. Samples: 935466360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:44:15,501][18875] Avg episode reward: [(0, '0.633')] [2024-06-18 22:44:18,281][19107] Updated weights for policy 0, policy_version 226535 (0.0045) [2024-06-18 22:44:20,500][18875] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3711614976. Throughput: 0: 42116.0. Samples: 935714120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:44:20,500][18875] Avg episode reward: [(0, '0.558')] [2024-06-18 22:44:22,502][19107] Updated weights for policy 0, policy_version 226545 (0.0033) [2024-06-18 22:44:25,500][18875] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 3711844352. Throughput: 0: 42156.9. Samples: 935966680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:44:25,509][18875] Avg episode reward: [(0, '0.891')] [2024-06-18 22:44:25,996][19107] Updated weights for policy 0, policy_version 226555 (0.0040) [2024-06-18 22:44:30,257][19107] Updated weights for policy 0, policy_version 226565 (0.0042) [2024-06-18 22:44:30,500][18875] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3712040960. Throughput: 0: 42156.5. Samples: 936094740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:44:30,501][18875] Avg episode reward: [(0, '0.660')] [2024-06-18 22:44:33,905][19107] Updated weights for policy 0, policy_version 226575 (0.0034) [2024-06-18 22:44:35,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 3712253952. Throughput: 0: 42143.0. Samples: 936340780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:44:35,501][18875] Avg episode reward: [(0, '0.677')] [2024-06-18 22:44:35,523][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000226578_3712253952.pth... [2024-06-18 22:44:35,583][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000225961_3702145024.pth [2024-06-18 22:44:38,168][19107] Updated weights for policy 0, policy_version 226585 (0.0034) [2024-06-18 22:44:40,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3712466944. Throughput: 0: 42044.4. Samples: 936597260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 22:44:40,501][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 22:44:41,535][19107] Updated weights for policy 0, policy_version 226595 (0.0043) [2024-06-18 22:44:45,500][18875] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3712663552. Throughput: 0: 42096.8. Samples: 936724520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:44:45,501][18875] Avg episode reward: [(0, '0.365')] [2024-06-18 22:44:45,999][19087] Signal inference workers to stop experience collection... (13700 times) [2024-06-18 22:44:46,032][19107] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-06-18 22:44:46,054][19087] Signal inference workers to resume experience collection... (13700 times) [2024-06-18 22:44:46,055][19107] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-06-18 22:44:46,058][19107] Updated weights for policy 0, policy_version 226605 (0.0029) [2024-06-18 22:44:49,166][19107] Updated weights for policy 0, policy_version 226615 (0.0042) [2024-06-18 22:44:50,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3712892928. Throughput: 0: 42211.5. Samples: 936977560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:44:50,501][18875] Avg episode reward: [(0, '0.494')] [2024-06-18 22:44:53,653][19107] Updated weights for policy 0, policy_version 226625 (0.0028) [2024-06-18 22:44:55,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 3713105920. Throughput: 0: 42197.3. Samples: 937234820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:44:55,501][18875] Avg episode reward: [(0, '0.435')] [2024-06-18 22:44:57,225][19107] Updated weights for policy 0, policy_version 226635 (0.0035) [2024-06-18 22:45:00,500][18875] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3713286144. Throughput: 0: 42013.3. 
Samples: 937356960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:00,501][18875] Avg episode reward: [(0, '0.654')] [2024-06-18 22:45:01,543][19107] Updated weights for policy 0, policy_version 226645 (0.0028) [2024-06-18 22:45:04,684][19107] Updated weights for policy 0, policy_version 226655 (0.0034) [2024-06-18 22:45:05,500][18875] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 3713548288. Throughput: 0: 42237.6. Samples: 937614820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:05,501][18875] Avg episode reward: [(0, '0.687')] [2024-06-18 22:45:09,059][19107] Updated weights for policy 0, policy_version 226665 (0.0036) [2024-06-18 22:45:10,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3713728512. Throughput: 0: 42353.4. Samples: 937872580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:10,501][18875] Avg episode reward: [(0, '0.676')] [2024-06-18 22:45:12,342][19107] Updated weights for policy 0, policy_version 226675 (0.0031) [2024-06-18 22:45:15,500][18875] Fps is (10 sec: 39321.7, 60 sec: 42325.2, 300 sec: 42099.1). Total num frames: 3713941504. Throughput: 0: 42213.2. Samples: 937994340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:15,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 22:45:16,630][19107] Updated weights for policy 0, policy_version 226685 (0.0039) [2024-06-18 22:45:20,029][19107] Updated weights for policy 0, policy_version 226695 (0.0036) [2024-06-18 22:45:20,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3714170880. Throughput: 0: 42446.6. Samples: 938250880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:20,501][18875] Avg episode reward: [(0, '0.149')] [2024-06-18 22:45:24,382][19107] Updated weights for policy 0, policy_version 226705 (0.0034) [2024-06-18 22:45:25,500][18875] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 3714351104. Throughput: 0: 42377.8. Samples: 938504260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:25,501][18875] Avg episode reward: [(0, '0.623')] [2024-06-18 22:45:27,982][19107] Updated weights for policy 0, policy_version 226715 (0.0025) [2024-06-18 22:45:30,500][18875] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3714564096. Throughput: 0: 42329.4. Samples: 938629340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:30,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 22:45:32,047][19107] Updated weights for policy 0, policy_version 226725 (0.0040) [2024-06-18 22:45:35,500][18875] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3714793472. Throughput: 0: 42348.5. Samples: 938883240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:35,501][18875] Avg episode reward: [(0, '0.617')] [2024-06-18 22:45:36,084][19107] Updated weights for policy 0, policy_version 226735 (0.0032) [2024-06-18 22:45:39,688][19107] Updated weights for policy 0, policy_version 226745 (0.0038) [2024-06-18 22:45:40,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3714990080. Throughput: 0: 42188.2. Samples: 939133280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:40,500][18875] Avg episode reward: [(0, '0.507')] [2024-06-18 22:45:44,020][19107] Updated weights for policy 0, policy_version 226755 (0.0026) [2024-06-18 22:45:45,502][18875] Fps is (10 sec: 40951.4, 60 sec: 42323.9, 300 sec: 42098.2). Total num frames: 3715203072. Throughput: 0: 42238.0. Samples: 939257760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:45:45,503][18875] Avg episode reward: [(0, '0.450')] [2024-06-18 22:45:47,591][19107] Updated weights for policy 0, policy_version 226765 (0.0035) [2024-06-18 22:45:50,500][18875] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 3715432448. Throughput: 0: 42197.0. Samples: 939513680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:45:50,501][18875] Avg episode reward: [(0, '0.446')] [2024-06-18 22:45:51,647][19107] Updated weights for policy 0, policy_version 226775 (0.0031) [2024-06-18 22:45:55,371][19107] Updated weights for policy 0, policy_version 226785 (0.0033) [2024-06-18 22:45:55,500][18875] Fps is (10 sec: 44245.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3715645440. Throughput: 0: 42171.8. Samples: 939770320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:45:55,501][18875] Avg episode reward: [(0, '0.451')] [2024-06-18 22:45:59,399][19107] Updated weights for policy 0, policy_version 226795 (0.0033) [2024-06-18 22:46:00,504][18875] Fps is (10 sec: 40945.1, 60 sec: 42595.8, 300 sec: 42153.6). Total num frames: 3715842048. Throughput: 0: 42232.3. Samples: 939894940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:00,504][18875] Avg episode reward: [(0, '0.629')] [2024-06-18 22:46:03,365][19107] Updated weights for policy 0, policy_version 226805 (0.0030) [2024-06-18 22:46:05,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3716071424. Throughput: 0: 42177.3. Samples: 940148860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:05,501][18875] Avg episode reward: [(0, '0.604')] [2024-06-18 22:46:06,932][19107] Updated weights for policy 0, policy_version 226815 (0.0044) [2024-06-18 22:46:08,328][19087] Signal inference workers to stop experience collection... 
(13750 times) [2024-06-18 22:46:08,329][19087] Signal inference workers to resume experience collection... (13750 times) [2024-06-18 22:46:08,344][19107] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-06-18 22:46:08,344][19107] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-06-18 22:46:10,500][18875] Fps is (10 sec: 44252.4, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3716284416. Throughput: 0: 42171.1. Samples: 940401960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:10,501][18875] Avg episode reward: [(0, '0.696')] [2024-06-18 22:46:11,138][19107] Updated weights for policy 0, policy_version 226825 (0.0037) [2024-06-18 22:46:14,966][19107] Updated weights for policy 0, policy_version 226835 (0.0036) [2024-06-18 22:46:15,500][18875] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 3716464640. Throughput: 0: 42158.6. Samples: 940526480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:15,501][18875] Avg episode reward: [(0, '0.829')] [2024-06-18 22:46:18,829][19107] Updated weights for policy 0, policy_version 226845 (0.0030) [2024-06-18 22:46:20,504][18875] Fps is (10 sec: 40945.5, 60 sec: 42049.8, 300 sec: 42209.1). Total num frames: 3716694016. Throughput: 0: 42091.3. Samples: 940777500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:20,504][18875] Avg episode reward: [(0, '0.694')] [2024-06-18 22:46:22,630][19107] Updated weights for policy 0, policy_version 226855 (0.0035) [2024-06-18 22:46:25,500][18875] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42154.6). Total num frames: 3716923392. Throughput: 0: 42157.6. Samples: 941030380. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:25,501][18875] Avg episode reward: [(0, '0.539')] [2024-06-18 22:46:26,923][19107] Updated weights for policy 0, policy_version 226865 (0.0052) [2024-06-18 22:46:30,501][18875] Fps is (10 sec: 40974.0, 60 sec: 42325.1, 300 sec: 42098.5). Total num frames: 3717103616. Throughput: 0: 42253.8. Samples: 941159100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:30,501][18875] Avg episode reward: [(0, '0.580')] [2024-06-18 22:46:30,671][19107] Updated weights for policy 0, policy_version 226875 (0.0025) [2024-06-18 22:46:34,406][19107] Updated weights for policy 0, policy_version 226885 (0.0036) [2024-06-18 22:46:35,500][18875] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42265.7). Total num frames: 3717332992. Throughput: 0: 42160.3. Samples: 941410900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:35,501][18875] Avg episode reward: [(0, '0.431')] [2024-06-18 22:46:35,517][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000226888_3717332992.pth... [2024-06-18 22:46:35,564][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000226268_3707174912.pth [2024-06-18 22:46:38,284][19107] Updated weights for policy 0, policy_version 226895 (0.0030) [2024-06-18 22:46:40,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42598.2, 300 sec: 42154.1). Total num frames: 3717545984. Throughput: 0: 42170.2. Samples: 941667980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:40,501][18875] Avg episode reward: [(0, '0.382')] [2024-06-18 22:46:42,175][19107] Updated weights for policy 0, policy_version 226905 (0.0036) [2024-06-18 22:46:45,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42599.9, 300 sec: 42265.2). Total num frames: 3717758976. Throughput: 0: 42154.9. Samples: 941791760. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:45,501][18875] Avg episode reward: [(0, '0.382')] [2024-06-18 22:46:45,967][19107] Updated weights for policy 0, policy_version 226915 (0.0053) [2024-06-18 22:46:49,892][19107] Updated weights for policy 0, policy_version 226925 (0.0033) [2024-06-18 22:46:50,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 3717971968. Throughput: 0: 42185.4. Samples: 942047200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:50,501][18875] Avg episode reward: [(0, '0.382')] [2024-06-18 22:46:53,915][19107] Updated weights for policy 0, policy_version 226935 (0.0029) [2024-06-18 22:46:55,500][18875] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3718152192. Throughput: 0: 42257.8. Samples: 942303560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-18 22:46:55,501][18875] Avg episode reward: [(0, '0.620')] [2024-06-18 22:46:57,541][19107] Updated weights for policy 0, policy_version 226945 (0.0040) [2024-06-18 22:47:00,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42327.8, 300 sec: 42209.6). Total num frames: 3718381568. Throughput: 0: 42151.0. Samples: 942423280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:00,501][18875] Avg episode reward: [(0, '0.755')] [2024-06-18 22:47:01,679][19107] Updated weights for policy 0, policy_version 226955 (0.0027) [2024-06-18 22:47:05,118][19107] Updated weights for policy 0, policy_version 226965 (0.0026) [2024-06-18 22:47:05,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 3718594560. Throughput: 0: 42360.3. Samples: 942683560. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:05,501][18875] Avg episode reward: [(0, '0.660')] [2024-06-18 22:47:09,488][19107] Updated weights for policy 0, policy_version 226975 (0.0033) [2024-06-18 22:47:10,500][18875] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 3718774784. Throughput: 0: 42337.1. Samples: 942935540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:10,501][18875] Avg episode reward: [(0, '0.680')] [2024-06-18 22:47:12,806][19107] Updated weights for policy 0, policy_version 226985 (0.0046) [2024-06-18 22:47:15,503][18875] Fps is (10 sec: 42585.0, 60 sec: 42596.2, 300 sec: 42264.7). Total num frames: 3719020544. Throughput: 0: 42174.2. Samples: 943057060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:15,504][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 22:47:17,174][19107] Updated weights for policy 0, policy_version 226995 (0.0043) [2024-06-18 22:47:20,419][19107] Updated weights for policy 0, policy_version 227005 (0.0033) [2024-06-18 22:47:20,500][18875] Fps is (10 sec: 47513.7, 60 sec: 42601.0, 300 sec: 42265.2). Total num frames: 3719249920. Throughput: 0: 42422.3. Samples: 943319900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:20,501][18875] Avg episode reward: [(0, '0.306')] [2024-06-18 22:47:24,951][19107] Updated weights for policy 0, policy_version 227015 (0.0028) [2024-06-18 22:47:25,500][18875] Fps is (10 sec: 40973.1, 60 sec: 41779.4, 300 sec: 41987.5). Total num frames: 3719430144. Throughput: 0: 42139.3. Samples: 943564240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:25,500][18875] Avg episode reward: [(0, '0.449')] [2024-06-18 22:47:28,226][19107] Updated weights for policy 0, policy_version 227025 (0.0026) [2024-06-18 22:47:30,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42598.6, 300 sec: 42320.7). Total num frames: 3719659520. Throughput: 0: 42134.3. Samples: 943687800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:30,500][18875] Avg episode reward: [(0, '0.607')] [2024-06-18 22:47:32,565][19107] Updated weights for policy 0, policy_version 227035 (0.0042) [2024-06-18 22:47:35,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3719839744. Throughput: 0: 42191.7. Samples: 943945820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:35,500][18875] Avg episode reward: [(0, '0.543')] [2024-06-18 22:47:36,048][19107] Updated weights for policy 0, policy_version 227045 (0.0038) [2024-06-18 22:47:40,454][19107] Updated weights for policy 0, policy_version 227055 (0.0029) [2024-06-18 22:47:40,500][18875] Fps is (10 sec: 40959.7, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 3720069120. Throughput: 0: 42028.5. Samples: 944194840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:40,501][18875] Avg episode reward: [(0, '0.455')] [2024-06-18 22:47:42,041][19087] Signal inference workers to stop experience collection... (13800 times) [2024-06-18 22:47:42,042][19087] Signal inference workers to resume experience collection... (13800 times) [2024-06-18 22:47:42,084][19107] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-06-18 22:47:42,084][19107] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-06-18 22:47:43,976][19107] Updated weights for policy 0, policy_version 227065 (0.0030) [2024-06-18 22:47:45,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42265.7). Total num frames: 3720282112. Throughput: 0: 42217.4. Samples: 944323060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:45,501][18875] Avg episode reward: [(0, '0.561')] [2024-06-18 22:47:48,113][19107] Updated weights for policy 0, policy_version 227075 (0.0046) [2024-06-18 22:47:50,504][18875] Fps is (10 sec: 39307.3, 60 sec: 41503.6, 300 sec: 42042.5). Total num frames: 3720462336. Throughput: 0: 41989.0. 
Samples: 944573220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:50,505][18875] Avg episode reward: [(0, '0.393')] [2024-06-18 22:47:52,010][19107] Updated weights for policy 0, policy_version 227085 (0.0035) [2024-06-18 22:47:55,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42210.1). Total num frames: 3720708096. Throughput: 0: 41996.4. Samples: 944825380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:47:55,501][18875] Avg episode reward: [(0, '0.556')] [2024-06-18 22:47:56,367][19107] Updated weights for policy 0, policy_version 227095 (0.0042) [2024-06-18 22:47:59,949][19107] Updated weights for policy 0, policy_version 227105 (0.0031) [2024-06-18 22:48:00,500][18875] Fps is (10 sec: 44252.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3720904704. Throughput: 0: 42129.9. Samples: 944952780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:48:00,504][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 22:48:04,199][19107] Updated weights for policy 0, policy_version 227115 (0.0037) [2024-06-18 22:48:05,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42098.9). Total num frames: 3721101312. Throughput: 0: 41900.0. Samples: 945205400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 22:48:05,501][18875] Avg episode reward: [(0, '0.572')] [2024-06-18 22:48:07,704][19107] Updated weights for policy 0, policy_version 227125 (0.0032) [2024-06-18 22:48:10,500][18875] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3721330688. Throughput: 0: 41923.9. Samples: 945450820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:10,501][18875] Avg episode reward: [(0, '0.575')] [2024-06-18 22:48:11,769][19107] Updated weights for policy 0, policy_version 227135 (0.0043) [2024-06-18 22:48:15,288][19107] Updated weights for policy 0, policy_version 227145 (0.0040) [2024-06-18 22:48:15,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42054.5, 300 sec: 42209.7). Total num frames: 3721543680. Throughput: 0: 42083.5. Samples: 945581560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:15,501][18875] Avg episode reward: [(0, '0.592')] [2024-06-18 22:48:19,480][19107] Updated weights for policy 0, policy_version 227155 (0.0025) [2024-06-18 22:48:20,500][18875] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3721740288. Throughput: 0: 42139.1. Samples: 945842080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:20,500][18875] Avg episode reward: [(0, '0.527')] [2024-06-18 22:48:22,861][19107] Updated weights for policy 0, policy_version 227165 (0.0034) [2024-06-18 22:48:25,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 3721969664. Throughput: 0: 42150.1. Samples: 946091600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:25,501][18875] Avg episode reward: [(0, '0.590')] [2024-06-18 22:48:27,311][19107] Updated weights for policy 0, policy_version 227175 (0.0039) [2024-06-18 22:48:30,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 3722182656. Throughput: 0: 42091.6. Samples: 946217180. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:30,500][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 22:48:30,576][19107] Updated weights for policy 0, policy_version 227185 (0.0036) [2024-06-18 22:48:35,208][19107] Updated weights for policy 0, policy_version 227195 (0.0027) [2024-06-18 22:48:35,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3722379264. Throughput: 0: 42210.5. Samples: 946472540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:35,501][18875] Avg episode reward: [(0, '0.725')] [2024-06-18 22:48:35,540][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227197_3722395648.pth... [2024-06-18 22:48:35,607][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000226578_3712253952.pth [2024-06-18 22:48:38,365][19107] Updated weights for policy 0, policy_version 227205 (0.0049) [2024-06-18 22:48:40,500][18875] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3722592256. Throughput: 0: 42136.8. Samples: 946721540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:40,501][18875] Avg episode reward: [(0, '0.771')] [2024-06-18 22:48:42,918][19107] Updated weights for policy 0, policy_version 227215 (0.0036) [2024-06-18 22:48:45,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3722805248. Throughput: 0: 42200.6. Samples: 946851800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:45,501][18875] Avg episode reward: [(0, '0.773')] [2024-06-18 22:48:46,156][19107] Updated weights for policy 0, policy_version 227225 (0.0036) [2024-06-18 22:48:50,500][18875] Fps is (10 sec: 40960.9, 60 sec: 42328.0, 300 sec: 42154.1). Total num frames: 3723001856. Throughput: 0: 42266.7. Samples: 947107400. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:50,500][18875] Avg episode reward: [(0, '0.684')] [2024-06-18 22:48:50,561][19107] Updated weights for policy 0, policy_version 227235 (0.0048) [2024-06-18 22:48:54,200][19107] Updated weights for policy 0, policy_version 227245 (0.0031) [2024-06-18 22:48:55,500][18875] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3723214848. Throughput: 0: 42186.6. Samples: 947349220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:48:55,501][18875] Avg episode reward: [(0, '0.773')] [2024-06-18 22:48:57,529][19087] Signal inference workers to stop experience collection... (13850 times) [2024-06-18 22:48:57,530][19087] Signal inference workers to resume experience collection... (13850 times) [2024-06-18 22:48:57,560][19107] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-06-18 22:48:57,561][19107] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-06-18 22:48:58,649][19107] Updated weights for policy 0, policy_version 227255 (0.0053) [2024-06-18 22:49:00,500][18875] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 3723444224. Throughput: 0: 42147.0. Samples: 947478180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:49:00,501][18875] Avg episode reward: [(0, '0.600')] [2024-06-18 22:49:02,005][19107] Updated weights for policy 0, policy_version 227265 (0.0041) [2024-06-18 22:49:05,504][18875] Fps is (10 sec: 42583.4, 60 sec: 42322.7, 300 sec: 42153.6). Total num frames: 3723640832. Throughput: 0: 41956.6. Samples: 947730280. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:49:05,504][18875] Avg episode reward: [(0, '0.559')] [2024-06-18 22:49:06,328][19107] Updated weights for policy 0, policy_version 227275 (0.0032) [2024-06-18 22:49:09,748][19107] Updated weights for policy 0, policy_version 227285 (0.0033) [2024-06-18 22:49:10,500][18875] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 3723837440. Throughput: 0: 42003.2. Samples: 947981740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:49:10,501][18875] Avg episode reward: [(0, '0.607')] [2024-06-18 22:49:14,069][19107] Updated weights for policy 0, policy_version 227295 (0.0039) [2024-06-18 22:49:15,500][18875] Fps is (10 sec: 42613.6, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3724066816. Throughput: 0: 42058.1. Samples: 948109800. Policy #0 lag: (min: 0.0, avg: 10.0, max: 19.0) [2024-06-18 22:49:15,501][18875] Avg episode reward: [(0, '0.356')] [2024-06-18 22:49:17,443][19107] Updated weights for policy 0, policy_version 227305 (0.0032) [2024-06-18 22:49:20,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 3724263424. Throughput: 0: 41970.2. Samples: 948361200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:49:20,501][18875] Avg episode reward: [(0, '0.437')] [2024-06-18 22:49:21,530][19107] Updated weights for policy 0, policy_version 227315 (0.0037) [2024-06-18 22:49:25,291][19107] Updated weights for policy 0, policy_version 227325 (0.0034) [2024-06-18 22:49:25,500][18875] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3724492800. Throughput: 0: 42040.1. Samples: 948613340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:49:25,501][18875] Avg episode reward: [(0, '0.349')] [2024-06-18 22:49:29,182][19107] Updated weights for policy 0, policy_version 227335 (0.0044) [2024-06-18 22:49:30,500][18875] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3724705792. Throughput: 0: 42025.2. Samples: 948742940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:49:30,501][18875] Avg episode reward: [(0, '0.360')] [2024-06-18 22:49:33,885][19107] Updated weights for policy 0, policy_version 227345 (0.0037) [2024-06-18 22:49:35,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3724902400. Throughput: 0: 41964.8. Samples: 948995820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:49:35,501][18875] Avg episode reward: [(0, '0.653')] [2024-06-18 22:49:37,043][19107] Updated weights for policy 0, policy_version 227355 (0.0045) [2024-06-18 22:49:40,500][18875] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3725115392. Throughput: 0: 42098.3. Samples: 949243640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:49:40,501][18875] Avg episode reward: [(0, '0.531')] [2024-06-18 22:49:41,609][19107] Updated weights for policy 0, policy_version 227365 (0.0050) [2024-06-18 22:49:44,890][19107] Updated weights for policy 0, policy_version 227375 (0.0038) [2024-06-18 22:49:45,500][18875] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3725328384. Throughput: 0: 42057.0. Samples: 949370740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:49:45,501][18875] Avg episode reward: [(0, '0.639')] [2024-06-18 22:49:49,245][19107] Updated weights for policy 0, policy_version 227385 (0.0035) [2024-06-18 22:49:50,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 3725524992. Throughput: 0: 42206.2. Samples: 949629400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:49:50,500][18875] Avg episode reward: [(0, '0.486')] [2024-06-18 22:49:52,801][19107] Updated weights for policy 0, policy_version 227395 (0.0043) [2024-06-18 22:49:55,500][18875] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3725737984. Throughput: 0: 42109.7. Samples: 949876680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:49:55,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 22:49:57,594][19107] Updated weights for policy 0, policy_version 227405 (0.0036) [2024-06-18 22:50:00,386][19107] Updated weights for policy 0, policy_version 227415 (0.0037) [2024-06-18 22:50:00,500][18875] Fps is (10 sec: 44236.5, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 3725967360. Throughput: 0: 42151.6. Samples: 950006620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:50:00,501][18875] Avg episode reward: [(0, '0.681')] [2024-06-18 22:50:05,389][19107] Updated weights for policy 0, policy_version 227425 (0.0027) [2024-06-18 22:50:05,504][18875] Fps is (10 sec: 39307.8, 60 sec: 41506.2, 300 sec: 42042.5). Total num frames: 3726131200. Throughput: 0: 42122.8. Samples: 950256880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:50:05,505][18875] Avg episode reward: [(0, '0.622')] [2024-06-18 22:50:07,904][19107] Updated weights for policy 0, policy_version 227435 (0.0036) [2024-06-18 22:50:10,500][18875] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 3726393344. Throughput: 0: 42083.0. Samples: 950507080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:50:10,501][18875] Avg episode reward: [(0, '0.482')] [2024-06-18 22:50:13,188][19107] Updated weights for policy 0, policy_version 227445 (0.0041) [2024-06-18 22:50:15,500][18875] Fps is (10 sec: 47530.7, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 3726606336. Throughput: 0: 42381.0. Samples: 950650080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:50:15,501][18875] Avg episode reward: [(0, '0.454')] [2024-06-18 22:50:15,580][19107] Updated weights for policy 0, policy_version 227455 (0.0033) [2024-06-18 22:50:20,500][18875] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 3726770176. Throughput: 0: 42209.0. Samples: 950895220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:50:20,501][18875] Avg episode reward: [(0, '0.496')] [2024-06-18 22:50:20,937][19107] Updated weights for policy 0, policy_version 227465 (0.0041) [2024-06-18 22:50:21,448][19087] Signal inference workers to stop experience collection... (13900 times) [2024-06-18 22:50:21,448][19087] Signal inference workers to resume experience collection... (13900 times) [2024-06-18 22:50:21,484][19107] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-06-18 22:50:21,484][19107] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-06-18 22:50:23,261][19107] Updated weights for policy 0, policy_version 227475 (0.0029) [2024-06-18 22:50:25,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 3727015936. Throughput: 0: 42282.2. Samples: 951146340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-18 22:50:25,501][18875] Avg episode reward: [(0, '0.469')] [2024-06-18 22:50:28,542][19107] Updated weights for policy 0, policy_version 227485 (0.0043) [2024-06-18 22:50:30,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 3727212544. Throughput: 0: 42349.3. Samples: 951276460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:50:30,501][18875] Avg episode reward: [(0, '0.264')] [2024-06-18 22:50:31,279][19107] Updated weights for policy 0, policy_version 227495 (0.0041) [2024-06-18 22:50:35,500][18875] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 3727425536. Throughput: 0: 42150.6. 
Samples: 951526180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:50:35,501][18875] Avg episode reward: [(0, '0.583')] [2024-06-18 22:50:35,509][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227504_3727425536.pth... [2024-06-18 22:50:35,562][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000226888_3717332992.pth [2024-06-18 22:50:36,111][19107] Updated weights for policy 0, policy_version 227505 (0.0040) [2024-06-18 22:50:39,047][19107] Updated weights for policy 0, policy_version 227515 (0.0047) [2024-06-18 22:50:40,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42209.9). Total num frames: 3727654912. Throughput: 0: 42173.5. Samples: 951774480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:50:40,500][18875] Avg episode reward: [(0, '0.819')] [2024-06-18 22:50:43,738][19107] Updated weights for policy 0, policy_version 227525 (0.0035) [2024-06-18 22:50:45,500][18875] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3727851520. Throughput: 0: 42340.7. Samples: 951911960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:50:45,501][18875] Avg episode reward: [(0, '0.601')] [2024-06-18 22:50:46,655][19107] Updated weights for policy 0, policy_version 227535 (0.0039) [2024-06-18 22:50:50,500][18875] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 3728048128. Throughput: 0: 42169.6. Samples: 952154360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:50:50,501][18875] Avg episode reward: [(0, '0.470')] [2024-06-18 22:50:51,364][19107] Updated weights for policy 0, policy_version 227545 (0.0034) [2024-06-18 22:50:54,533][19107] Updated weights for policy 0, policy_version 227555 (0.0027) [2024-06-18 22:50:55,500][18875] Fps is (10 sec: 44237.8, 60 sec: 42598.5, 300 sec: 42210.2). Total num frames: 3728293888. Throughput: 0: 42265.0. Samples: 952409000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:50:55,500][18875] Avg episode reward: [(0, '0.345')] [2024-06-18 22:50:59,614][19107] Updated weights for policy 0, policy_version 227565 (0.0041) [2024-06-18 22:51:00,500][18875] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 3728490496. Throughput: 0: 42059.5. Samples: 952542760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:51:00,501][18875] Avg episode reward: [(0, '0.347')] [2024-06-18 22:51:02,131][19107] Updated weights for policy 0, policy_version 227575 (0.0032) [2024-06-18 22:51:05,500][18875] Fps is (10 sec: 40958.9, 60 sec: 42873.9, 300 sec: 42098.5). Total num frames: 3728703488. Throughput: 0: 42039.3. Samples: 952787000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:51:05,501][18875] Avg episode reward: [(0, '0.319')] [2024-06-18 22:51:07,073][19107] Updated weights for policy 0, policy_version 227585 (0.0039) [2024-06-18 22:51:10,134][19107] Updated weights for policy 0, policy_version 227595 (0.0038) [2024-06-18 22:51:10,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 3728932864. Throughput: 0: 42231.6. Samples: 953046760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:51:10,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 22:51:14,661][19107] Updated weights for policy 0, policy_version 227605 (0.0037) [2024-06-18 22:51:15,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 42099.1). Total num frames: 3729113088. Throughput: 0: 42132.4. Samples: 953172420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:51:15,501][18875] Avg episode reward: [(0, '0.756')] [2024-06-18 22:51:17,727][19087] Signal inference workers to stop experience collection... 
(13950 times) [2024-06-18 22:51:17,774][19107] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-06-18 22:51:17,785][19087] Signal inference workers to resume experience collection... (13950 times) [2024-06-18 22:51:17,800][19107] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-06-18 22:51:17,929][19107] Updated weights for policy 0, policy_version 227615 (0.0044) [2024-06-18 22:51:20,504][18875] Fps is (10 sec: 40946.6, 60 sec: 42869.1, 300 sec: 42098.1). Total num frames: 3729342464. Throughput: 0: 42135.5. Samples: 953422420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:51:20,504][18875] Avg episode reward: [(0, '0.606')] [2024-06-18 22:51:22,537][19107] Updated weights for policy 0, policy_version 227625 (0.0041) [2024-06-18 22:51:25,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 3729555456. Throughput: 0: 42327.8. Samples: 953679240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:51:25,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 22:51:25,639][19107] Updated weights for policy 0, policy_version 227635 (0.0035) [2024-06-18 22:51:30,411][19107] Updated weights for policy 0, policy_version 227645 (0.0032) [2024-06-18 22:51:30,500][18875] Fps is (10 sec: 39334.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3729735680. Throughput: 0: 41988.6. Samples: 953801440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-18 22:51:30,502][18875] Avg episode reward: [(0, '0.434')] [2024-06-18 22:51:33,556][19107] Updated weights for policy 0, policy_version 227655 (0.0033) [2024-06-18 22:51:35,500][18875] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 3729981440. Throughput: 0: 42180.8. Samples: 954052500. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:51:35,501][18875] Avg episode reward: [(0, '0.334')] [2024-06-18 22:51:38,253][19107] Updated weights for policy 0, policy_version 227665 (0.0039) [2024-06-18 22:51:40,500][18875] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3730161664. Throughput: 0: 42248.0. Samples: 954310160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:51:40,500][18875] Avg episode reward: [(0, '0.363')] [2024-06-18 22:51:41,316][19107] Updated weights for policy 0, policy_version 227675 (0.0033) [2024-06-18 22:51:45,500][18875] Fps is (10 sec: 37683.7, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 3730358272. Throughput: 0: 41942.3. Samples: 954430160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:51:45,501][18875] Avg episode reward: [(0, '0.571')] [2024-06-18 22:51:45,848][19107] Updated weights for policy 0, policy_version 227685 (0.0029) [2024-06-18 22:51:49,055][19107] Updated weights for policy 0, policy_version 227695 (0.0030) [2024-06-18 22:51:50,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 3730620416. Throughput: 0: 42145.1. Samples: 954683520. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:51:50,500][18875] Avg episode reward: [(0, '0.406')] [2024-06-18 22:51:53,399][19107] Updated weights for policy 0, policy_version 227705 (0.0033) [2024-06-18 22:51:55,500][18875] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 42098.6). Total num frames: 3730800640. Throughput: 0: 42124.9. Samples: 954942380. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:51:55,501][18875] Avg episode reward: [(0, '0.523')] [2024-06-18 22:51:56,955][19107] Updated weights for policy 0, policy_version 227715 (0.0040) [2024-06-18 22:52:00,500][18875] Fps is (10 sec: 37683.1, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 3730997248. Throughput: 0: 41846.3. Samples: 955055500. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:52:00,501][18875] Avg episode reward: [(0, '0.715')] [2024-06-18 22:52:01,294][19107] Updated weights for policy 0, policy_version 227725 (0.0043) [2024-06-18 22:52:04,967][19107] Updated weights for policy 0, policy_version 227735 (0.0033) [2024-06-18 22:52:05,500][18875] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 3731243008. Throughput: 0: 42213.4. Samples: 955321880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:52:05,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 22:52:09,237][19107] Updated weights for policy 0, policy_version 227745 (0.0035) [2024-06-18 22:52:10,500][18875] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 42043.5). Total num frames: 3731423232. Throughput: 0: 41903.2. Samples: 955564880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:52:10,501][18875] Avg episode reward: [(0, '0.555')] [2024-06-18 22:52:12,927][19107] Updated weights for policy 0, policy_version 227755 (0.0033) [2024-06-18 22:52:15,500][18875] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3731636224. Throughput: 0: 41975.2. Samples: 955690320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:52:15,501][18875] Avg episode reward: [(0, '0.688')] [2024-06-18 22:52:16,738][19107] Updated weights for policy 0, policy_version 227765 (0.0033) [2024-06-18 22:52:20,500][18875] Fps is (10 sec: 42598.7, 60 sec: 41781.6, 300 sec: 42098.5). Total num frames: 3731849216. Throughput: 0: 42279.8. Samples: 955955080. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:52:20,501][18875] Avg episode reward: [(0, '0.775')] [2024-06-18 22:52:20,510][19107] Updated weights for policy 0, policy_version 227775 (0.0032) [2024-06-18 22:52:24,253][19107] Updated weights for policy 0, policy_version 227785 (0.0034) [2024-06-18 22:52:25,500][18875] Fps is (10 sec: 42597.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 3732062208. Throughput: 0: 42038.0. Samples: 956201880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:52:25,501][18875] Avg episode reward: [(0, '0.817')] [2024-06-18 22:52:28,162][19107] Updated weights for policy 0, policy_version 227795 (0.0045) [2024-06-18 22:52:30,500][18875] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3732291584. Throughput: 0: 42322.6. Samples: 956334680. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:52:30,501][18875] Avg episode reward: [(0, '0.817')] [2024-06-18 22:52:31,916][19107] Updated weights for policy 0, policy_version 227805 (0.0028) [2024-06-18 22:52:34,440][19087] Signal inference workers to stop experience collection... (14000 times) [2024-06-18 22:52:34,466][19107] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-06-18 22:52:34,502][19087] Signal inference workers to resume experience collection... (14000 times) [2024-06-18 22:52:34,516][19107] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-06-18 22:52:35,500][18875] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 3732471808. Throughput: 0: 42295.0. Samples: 956586800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:52:35,501][18875] Avg episode reward: [(0, '0.444')] [2024-06-18 22:52:35,676][19087] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227814_3732504576.pth... 
[2024-06-18 22:52:35,754][19087] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227197_3722395648.pth [2024-06-18 22:52:35,907][19107] Updated weights for policy 0, policy_version 227815 (0.0029) [2024-06-18 22:52:39,581][19107] Updated weights for policy 0, policy_version 227825 (0.0043) [2024-06-18 22:52:40,500][18875] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 3732701184. Throughput: 0: 41901.2. Samples: 956827940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 22:52:40,501][18875] Avg episode reward: [(0, '0.452')] [2024-06-18 22:52:43,798][19107] Updated weights for policy 0, policy_version 227835 (0.0047) [2024-06-18 22:52:45,500][18875] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42265.7). Total num frames: 3732930560. Throughput: 0: 42394.6. Samples: 956963260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:52:45,501][18875] Avg episode reward: [(0, '0.521')] [2024-06-18 22:52:47,149][19107] Updated weights for policy 0, policy_version 227845 (0.0041) [2024-06-18 22:52:50,500][18875] Fps is (10 sec: 39322.1, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 3733094400. Throughput: 0: 41998.6. Samples: 957211820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:52:50,501][18875] Avg episode reward: [(0, '0.648')] [2024-06-18 22:52:51,805][19107] Updated weights for policy 0, policy_version 227855 (0.0037) [2024-06-18 22:52:54,925][19107] Updated weights for policy 0, policy_version 227865 (0.0034) [2024-06-18 22:52:55,500][18875] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 3733356544. Throughput: 0: 41996.0. Samples: 957454700. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:52:55,501][18875] Avg episode reward: [(0, '0.533')] [2024-06-18 22:52:59,426][19107] Updated weights for policy 0, policy_version 227875 (0.0047) [2024-06-18 22:53:00,500][18875] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3733536768. Throughput: 0: 42379.5. Samples: 957597400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:00,501][18875] Avg episode reward: [(0, '0.557')] [2024-06-18 22:53:02,326][19107] Updated weights for policy 0, policy_version 227885 (0.0037) [2024-06-18 22:53:05,500][18875] Fps is (10 sec: 37683.0, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 3733733376. Throughput: 0: 42080.7. Samples: 957848720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:05,501][18875] Avg episode reward: [(0, '0.813')] [2024-06-18 22:53:07,195][19107] Updated weights for policy 0, policy_version 227895 (0.0038) [2024-06-18 22:53:09,991][19107] Updated weights for policy 0, policy_version 227905 (0.0032) [2024-06-18 22:53:10,500][18875] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 3733995520. Throughput: 0: 41994.5. Samples: 958091620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:10,500][18875] Avg episode reward: [(0, '0.699')] [2024-06-18 22:53:14,647][19107] Updated weights for policy 0, policy_version 227915 (0.0033) [2024-06-18 22:53:15,500][18875] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 3734175744. Throughput: 0: 42237.4. Samples: 958235360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:15,501][18875] Avg episode reward: [(0, '0.588')] [2024-06-18 22:53:18,235][19107] Updated weights for policy 0, policy_version 227925 (0.0035) [2024-06-18 22:53:20,500][18875] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 3734372352. Throughput: 0: 42136.1. Samples: 958482920. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:20,501][18875] Avg episode reward: [(0, '0.547')] [2024-06-18 22:53:22,338][19107] Updated weights for policy 0, policy_version 227935 (0.0037) [2024-06-18 22:53:25,500][18875] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 3734618112. Throughput: 0: 42428.0. Samples: 958737200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:25,501][18875] Avg episode reward: [(0, '0.402')] [2024-06-18 22:53:25,758][19107] Updated weights for policy 0, policy_version 227945 (0.0033) [2024-06-18 22:53:29,726][19107] Updated weights for policy 0, policy_version 227955 (0.0032) [2024-06-18 22:53:30,504][18875] Fps is (10 sec: 44220.7, 60 sec: 42049.8, 300 sec: 42153.6). Total num frames: 3734814720. Throughput: 0: 42369.1. Samples: 958870020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:30,505][18875] Avg episode reward: [(0, '0.530')] [2024-06-18 22:53:33,550][19107] Updated weights for policy 0, policy_version 227965 (0.0033) [2024-06-18 22:53:35,500][18875] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 3735027712. Throughput: 0: 42483.1. Samples: 959123560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:35,501][18875] Avg episode reward: [(0, '0.609')] [2024-06-18 22:53:37,892][19107] Updated weights for policy 0, policy_version 227975 (0.0042) [2024-06-18 22:53:40,500][18875] Fps is (10 sec: 44252.9, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 3735257088. Throughput: 0: 42709.8. Samples: 959376640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:40,501][18875] Avg episode reward: [(0, '0.566')] [2024-06-18 22:53:41,142][19107] Updated weights for policy 0, policy_version 227985 (0.0046) [2024-06-18 22:53:45,500][18875] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 3735453696. Throughput: 0: 42438.7. Samples: 959507140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:45,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 22:53:45,881][19107] Updated weights for policy 0, policy_version 227995 (0.0035) [2024-06-18 22:53:48,977][19107] Updated weights for policy 0, policy_version 228005 (0.0040) [2024-06-18 22:53:50,500][18875] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42209.7). Total num frames: 3735666688. Throughput: 0: 42285.0. Samples: 959751540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-18 22:53:50,501][18875] Avg episode reward: [(0, '0.515')] [2024-06-18 22:53:53,594][19107] Updated weights for policy 0, policy_version 228015 (0.0032) [2024-06-18 22:53:53,609][19087] Signal inference workers to stop experience collection... (14050 times) [2024-06-18 22:53:53,610][19087] Signal inference workers to resume experience collection... (14050 times) [2024-06-18 22:53:53,652][19107] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-06-18 22:53:53,652][19107] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-06-18 22:53:55,500][18875] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 3735879680. Throughput: 0: 42556.7. Samples: 960006680. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 22:53:55,501][18875] Avg episode reward: [(0, '0.322')] [2024-06-18 22:53:56,755][19107] Updated weights for policy 0, policy_version 228025 (0.0038) [2024-06-18 22:54:00,500][18875] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42210.1). Total num frames: 3736092672. Throughput: 0: 42253.7. Samples: 960136780. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 22:54:00,501][18875] Avg episode reward: [(0, '0.318')] [2024-06-18 22:54:01,186][19107] Updated weights for policy 0, policy_version 228035 (0.0032) [2024-06-18 22:54:05,072][19107] Updated weights for policy 0, policy_version 228045 (0.0032) [2024-06-18 22:54:05,500][18875] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 3736289280. Throughput: 0: 42330.2. Samples: 960387780. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 22:54:05,500][18875] Avg episode reward: [(0, '0.273')] [2024-06-18 22:54:09,004][19107] Updated weights for policy 0, policy_version 228055 (0.0044) [2024-06-18 22:54:10,500][18875] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 3736502272. Throughput: 0: 42242.3. Samples: 960638100. Policy #0 lag: (min: 1.0, avg: 10.3, max: 21.0) [2024-06-18 22:54:10,501][18875] Avg episode reward: [(0, '0.655')] [2024-06-18 22:54:12,700][19107] Updated weights for policy 0, policy_version 228065 (0.0048) [2024-06-18 22:54:29,475][21373] Saving configuration to /workspace/metta/train_dir/p2.dr4/config.json... 
[2024-06-18 22:54:29,491][21373] Rollout worker 0 uses device cpu [2024-06-18 22:54:29,491][21373] Rollout worker 1 uses device cpu [2024-06-18 22:54:29,491][21373] Rollout worker 2 uses device cpu [2024-06-18 22:54:29,491][21373] Rollout worker 3 uses device cpu [2024-06-18 22:54:29,492][21373] Rollout worker 4 uses device cpu [2024-06-18 22:54:29,492][21373] Rollout worker 5 uses device cpu [2024-06-18 22:54:29,492][21373] Rollout worker 6 uses device cpu [2024-06-18 22:54:29,492][21373] Rollout worker 7 uses device cpu [2024-06-18 22:54:29,492][21373] Rollout worker 8 uses device cpu [2024-06-18 22:54:29,492][21373] Rollout worker 9 uses device cpu [2024-06-18 22:54:29,492][21373] Rollout worker 10 uses device cpu [2024-06-18 22:54:29,492][21373] Rollout worker 11 uses device cpu [2024-06-18 22:54:29,493][21373] Rollout worker 12 uses device cpu [2024-06-18 22:54:29,493][21373] Rollout worker 13 uses device cpu [2024-06-18 22:54:29,493][21373] Rollout worker 14 uses device cpu [2024-06-18 22:54:29,493][21373] Rollout worker 15 uses device cpu [2024-06-18 22:54:29,493][21373] Rollout worker 16 uses device cpu [2024-06-18 22:54:29,493][21373] Rollout worker 17 uses device cpu [2024-06-18 22:54:29,493][21373] Rollout worker 18 uses device cpu [2024-06-18 22:54:29,493][21373] Rollout worker 19 uses device cpu [2024-06-18 22:54:29,494][21373] Rollout worker 20 uses device cpu [2024-06-18 22:54:29,494][21373] Rollout worker 21 uses device cpu [2024-06-18 22:54:29,494][21373] Rollout worker 22 uses device cpu [2024-06-18 22:54:29,494][21373] Rollout worker 23 uses device cpu [2024-06-18 22:54:29,494][21373] Rollout worker 24 uses device cpu [2024-06-18 22:54:29,494][21373] Rollout worker 25 uses device cpu [2024-06-18 22:54:29,494][21373] Rollout worker 26 uses device cpu [2024-06-18 22:54:29,494][21373] Rollout worker 27 uses device cpu [2024-06-18 22:54:29,495][21373] Rollout worker 28 uses device cpu [2024-06-18 22:54:29,495][21373] Rollout worker 29 uses device cpu 
[2024-06-18 22:54:29,495][21373] Rollout worker 30 uses device cpu [2024-06-18 22:54:29,495][21373] Rollout worker 31 uses device cpu [2024-06-18 22:54:30,077][21373] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:54:30,077][21373] InferenceWorker_p0-w0: min num requests: 10 [2024-06-18 22:54:30,156][21373] Starting all processes... [2024-06-18 22:54:30,156][21373] Starting process learner_proc0 [2024-06-18 22:54:30,390][21373] Starting all processes... [2024-06-18 22:54:30,392][21373] Starting process inference_proc0-0 [2024-06-18 22:54:30,392][21373] Starting process rollout_proc0 [2024-06-18 22:54:30,392][21373] Starting process rollout_proc1 [2024-06-18 22:54:30,393][21373] Starting process rollout_proc2 [2024-06-18 22:54:30,455][21373] Starting process rollout_proc3 [2024-06-18 22:54:30,455][21373] Starting process rollout_proc4 [2024-06-18 22:54:30,455][21373] Starting process rollout_proc5 [2024-06-18 22:54:30,455][21373] Starting process rollout_proc6 [2024-06-18 22:54:30,455][21373] Starting process rollout_proc7 [2024-06-18 22:54:30,456][21373] Starting process rollout_proc8 [2024-06-18 22:54:30,456][21373] Starting process rollout_proc9 [2024-06-18 22:54:30,457][21373] Starting process rollout_proc10 [2024-06-18 22:54:30,457][21373] Starting process rollout_proc11 [2024-06-18 22:54:30,458][21373] Starting process rollout_proc12 [2024-06-18 22:54:30,458][21373] Starting process rollout_proc13 [2024-06-18 22:54:30,458][21373] Starting process rollout_proc14 [2024-06-18 22:54:30,459][21373] Starting process rollout_proc15 [2024-06-18 22:54:30,460][21373] Starting process rollout_proc16 [2024-06-18 22:54:30,468][21373] Starting process rollout_proc17 [2024-06-18 22:54:30,468][21373] Starting process rollout_proc18 [2024-06-18 22:54:30,480][21373] Starting process rollout_proc19 [2024-06-18 22:54:30,486][21373] Starting process rollout_proc20 [2024-06-18 22:54:30,487][21373] Starting process rollout_proc21 [2024-06-18 
22:54:30,487][21373] Starting process rollout_proc22 [2024-06-18 22:54:30,488][21373] Starting process rollout_proc23 [2024-06-18 22:54:30,505][21373] Starting process rollout_proc24 [2024-06-18 22:54:30,505][21373] Starting process rollout_proc25 [2024-06-18 22:54:30,506][21373] Starting process rollout_proc26 [2024-06-18 22:54:30,508][21373] Starting process rollout_proc27 [2024-06-18 22:54:30,512][21373] Starting process rollout_proc28 [2024-06-18 22:54:30,514][21373] Starting process rollout_proc29 [2024-06-18 22:54:30,523][21373] Starting process rollout_proc30 [2024-06-18 22:54:30,525][21373] Starting process rollout_proc31 [2024-06-18 22:54:32,540][21606] Worker 0 uses CPU cores [0] [2024-06-18 22:54:32,593][21605] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:54:32,594][21605] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-18 22:54:32,603][21605] Num visible devices: 1 [2024-06-18 22:54:32,680][21642] Worker 5 uses CPU cores [5] [2024-06-18 22:54:32,695][21609] Worker 3 uses CPU cores [3] [2024-06-18 22:54:32,704][21641] Worker 2 uses CPU cores [2] [2024-06-18 22:54:32,719][21646] Worker 8 uses CPU cores [8] [2024-06-18 22:54:32,724][21666] Worker 30 uses CPU cores [30] [2024-06-18 22:54:32,727][21654] Worker 19 uses CPU cores [19] [2024-06-18 22:54:32,751][21607] Worker 1 uses CPU cores [1] [2024-06-18 22:54:32,756][21663] Worker 25 uses CPU cores [25] [2024-06-18 22:54:32,760][21661] Worker 23 uses CPU cores [23] [2024-06-18 22:54:32,768][21660] Worker 22 uses CPU cores [22] [2024-06-18 22:54:32,772][21659] Worker 20 uses CPU cores [20] [2024-06-18 22:54:32,780][21649] Worker 12 uses CPU cores [12] [2024-06-18 22:54:32,792][21648] Worker 11 uses CPU cores [11] [2024-06-18 22:54:32,800][21664] Worker 27 uses CPU cores [27] [2024-06-18 22:54:32,823][21656] Worker 17 uses CPU cores [17] [2024-06-18 22:54:32,847][21644] Worker 7 uses CPU cores [7] [2024-06-18 22:54:32,852][21647] 
Worker 10 uses CPU cores [10] [2024-06-18 22:54:32,885][21643] Worker 6 uses CPU cores [6] [2024-06-18 22:54:32,886][21655] Worker 18 uses CPU cores [18] [2024-06-18 22:54:32,890][21585] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:54:32,890][21585] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-18 22:54:32,899][21585] Num visible devices: 1 [2024-06-18 22:54:32,899][21640] Worker 4 uses CPU cores [4] [2024-06-18 22:54:32,911][21658] Worker 24 uses CPU cores [24] [2024-06-18 22:54:32,912][21585] Setting fixed seed 0 [2024-06-18 22:54:32,913][21585] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:54:32,913][21585] Initializing actor-critic model on device cuda:0 [2024-06-18 22:54:32,928][21650] Worker 14 uses CPU cores [14] [2024-06-18 22:54:32,931][21665] Worker 29 uses CPU cores [29] [2024-06-18 22:54:32,934][21698] Worker 31 uses CPU cores [31] [2024-06-18 22:54:32,985][21662] Worker 26 uses CPU cores [26] [2024-06-18 22:54:32,997][21645] Worker 9 uses CPU cores [9] [2024-06-18 22:54:33,001][21651] Worker 13 uses CPU cores [13] [2024-06-18 22:54:33,008][21657] Worker 21 uses CPU cores [21] [2024-06-18 22:54:33,022][21653] Worker 16 uses CPU cores [16] [2024-06-18 22:54:33,040][21652] Worker 15 uses CPU cores [15] [2024-06-18 22:54:33,100][21699] Worker 28 uses CPU cores [28] [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] 
RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,682][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,683][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,686][21585] RunningMeanStd input shape: (1,) [2024-06-18 22:54:33,686][21585] RunningMeanStd input shape: (1,) [2024-06-18 22:54:33,686][21585] RunningMeanStd input shape: (1,) [2024-06-18 22:54:33,687][21585] RunningMeanStd input shape: (1,) [2024-06-18 22:54:33,687][21585] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:33,727][21585] RunningMeanStd input shape: (1,) [2024-06-18 22:54:33,731][21585] Created Actor Critic model with architecture: [2024-06-18 22:54:33,731][21585] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (clock): RunningMeanStdInPlace() (converter): 
RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, 
out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-18 22:54:33,799][21585] Using optimizer [2024-06-18 22:54:33,984][21585] Loading state from checkpoint /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227814_3732504576.pth... [2024-06-18 22:54:33,999][21585] Loading model from checkpoint [2024-06-18 22:54:34,001][21585] Loaded experiment state at self.train_step=227814, self.env_steps=3732504576 [2024-06-18 22:54:34,001][21585] Initialized policy 0 weights for model version 227814 [2024-06-18 22:54:34,002][21585] LearnerWorker_p0 finished initialization! 
[2024-06-18 22:54:34,003][21585] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:54:34,758][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,758][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,758][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,758][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,758][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,758][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,758][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,758][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,758][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,759][21605] RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,762][21605] RunningMeanStd input shape: (1,) [2024-06-18 22:54:34,762][21605] RunningMeanStd input shape: (1,) [2024-06-18 22:54:34,762][21605] RunningMeanStd input shape: (1,) [2024-06-18 22:54:34,763][21605] RunningMeanStd input shape: (1,) [2024-06-18 22:54:34,763][21605] 
RunningMeanStd input shape: (11, 11) [2024-06-18 22:54:34,801][21605] RunningMeanStd input shape: (1,) [2024-06-18 22:54:34,823][21373] Inference worker 0-0 is ready! [2024-06-18 22:54:34,824][21373] All inference workers are ready! Signal rollout workers to start! [2024-06-18 22:54:37,138][21373] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 3732504576. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 22:54:37,545][21657] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,552][21662] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,560][21666] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,587][21664] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,606][21665] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,607][21648] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,609][21649] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,613][21698] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,631][21653] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,637][21609] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,647][21647] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,668][21643] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,669][21656] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,677][21651] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,686][21641] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,701][21650] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,701][21645] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,705][21642] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,718][21646] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,734][21606] Decorrelating experience for 0 frames... 
[2024-06-18 22:54:37,752][21655] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,754][21699] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,757][21644] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,761][21652] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,788][21658] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,792][21640] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,793][21607] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,794][21654] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,797][21660] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,801][21659] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,806][21663] Decorrelating experience for 0 frames... [2024-06-18 22:54:37,807][21661] Decorrelating experience for 0 frames... [2024-06-18 22:54:38,673][21662] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,755][21657] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,771][21648] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,778][21664] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,822][21653] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,838][21698] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,866][21641] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,905][21665] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,913][21652] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,913][21666] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,947][21650] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,949][21642] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,956][21647] Decorrelating experience for 256 frames... [2024-06-18 22:54:38,964][21644] Decorrelating experience for 256 frames... 
[2024-06-18 22:54:38,996][21609] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,000][21649] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,006][21651] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,009][21640] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,020][21606] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,037][21643] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,038][21663] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,050][21645] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,055][21656] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,066][21646] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,075][21661] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,077][21655] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,092][21659] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,119][21607] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,167][21699] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,178][21658] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,212][21660] Decorrelating experience for 256 frames... [2024-06-18 22:54:39,232][21654] Decorrelating experience for 256 frames... [2024-06-18 22:54:42,138][21373] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 3732504576. Throughput: 0: 972.0. Samples: 4860. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 22:54:46,385][21647] EvtLoop [rollout_proc10_evt_loop, process=rollout_proc10] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance10'), args=(1, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects[object].symbol TypeError: 'method' object is not subscriptable [2024-06-18 22:54:46,386][21647] Unhandled exception 'method' object is not subscriptable in evt loop rollout_proc10_evt_loop [2024-06-18 22:54:46,419][21648] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-18 22:54:46,451][21652] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-18 22:54:46,569][21664] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-18 22:54:46,569][21641] Worker 2, sleep for 9.375 sec to decorrelate experience 
collection [2024-06-18 22:54:46,575][21609] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-18 22:54:46,588][21657] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-18 22:54:46,630][21644] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-18 22:54:46,651][21662] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-18 22:54:46,682][21642] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-18 22:54:46,709][21650] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-18 22:54:46,711][21646] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-18 22:54:46,722][21653] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-18 22:54:46,738][21585] Signal inference workers to stop experience collection... [2024-06-18 22:54:46,749][21605] InferenceWorker_p0-w0: stopping experience collection [2024-06-18 22:54:46,756][21649] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-18 22:54:46,763][21640] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-18 22:54:47,138][21373] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 3732504576. Throughput: 0: 31956.2. Samples: 319560. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 22:54:47,347][21585] Signal inference workers to resume experience collection... 
[2024-06-18 22:54:47,348][21605] InferenceWorker_p0-w0: resuming experience collection [2024-06-18 22:54:47,368][21663] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-18 22:54:47,376][21698] Worker 31, sleep for 145.312 sec to decorrelate experience collection [2024-06-18 22:54:47,379][21645] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-18 22:54:47,380][21607] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-18 22:54:47,595][21651] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-18 22:54:47,620][21643] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-18 22:54:47,747][21659] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-18 22:54:47,788][21666] EvtLoop [rollout_proc30_evt_loop, process=rollout_proc30] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance30'), args=(1, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step 
self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects[object].symbol TypeError: 'method' object is not subscriptable [2024-06-18 22:54:47,789][21666] Unhandled exception 'method' object is not subscriptable in evt loop rollout_proc30_evt_loop [2024-06-18 22:54:47,841][21655] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-18 22:54:48,028][21656] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-18 22:54:48,056][21661] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-18 22:54:48,129][21658] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-18 22:54:48,214][21665] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-18 22:54:48,395][21660] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-18 22:54:48,844][21654] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-18 22:54:48,905][21699] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-18 22:54:50,073][21373] Heartbeat connected on Batcher_0 [2024-06-18 22:54:50,075][21373] Heartbeat connected on LearnerWorker_p0 [2024-06-18 22:54:50,080][21373] Heartbeat connected on RolloutWorker_w0 [2024-06-18 22:54:50,114][21373] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-18 22:54:52,091][21607] Worker 1 awakens! [2024-06-18 22:54:52,096][21373] Heartbeat connected on RolloutWorker_w1 [2024-06-18 22:54:52,138][21373] Fps is (10 sec: 14745.7, 60 sec: 9830.4, 300 sec: 9830.4). Total num frames: 3732652032. Throughput: 0: 21984.1. Samples: 329760. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:54:52,234][21605] Updated weights for policy 0, policy_version 227824 (0.0018) [2024-06-18 22:54:55,992][21641] Worker 2 awakens! 
[2024-06-18 22:54:55,997][21373] Heartbeat connected on RolloutWorker_w2 [2024-06-18 22:54:57,138][21373] Fps is (10 sec: 16383.9, 60 sec: 8192.0, 300 sec: 8192.0). Total num frames: 3732668416. Throughput: 0: 17090.0. Samples: 341800. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:55:00,708][21609] Worker 3 awakens! [2024-06-18 22:55:00,715][21373] Heartbeat connected on RolloutWorker_w3 [2024-06-18 22:55:02,138][21373] Fps is (10 sec: 3276.7, 60 sec: 7208.9, 300 sec: 7208.9). Total num frames: 3732684800. Throughput: 0: 14658.3. Samples: 366460. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:55:05,607][21640] Worker 4 awakens! [2024-06-18 22:55:05,611][21373] Heartbeat connected on RolloutWorker_w4 [2024-06-18 22:55:07,138][21373] Fps is (10 sec: 6553.6, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 3732733952. Throughput: 0: 12722.0. Samples: 381660. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:55:10,220][21642] Worker 5 awakens! [2024-06-18 22:55:10,226][21373] Heartbeat connected on RolloutWorker_w5 [2024-06-18 22:55:12,138][21373] Fps is (10 sec: 11469.0, 60 sec: 8426.1, 300 sec: 8426.1). Total num frames: 3732799488. Throughput: 0: 13027.4. Samples: 455960. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:55:13,656][21605] Updated weights for policy 0, policy_version 227834 (0.0015) [2024-06-18 22:55:15,845][21643] Worker 6 awakens! [2024-06-18 22:55:15,851][21373] Heartbeat connected on RolloutWorker_w6 [2024-06-18 22:55:17,138][21373] Fps is (10 sec: 13107.2, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 3732865024. Throughput: 0: 13678.5. Samples: 547140. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:55:19,542][21644] Worker 7 awakens! 
[2024-06-18 22:55:19,551][21373] Heartbeat connected on RolloutWorker_w7 [2024-06-18 22:55:21,223][21642] EvtLoop [rollout_proc5_evt_loop, process=rollout_proc5] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance5'), args=(0, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects[object].symbol TypeError: 'method' object is not subscriptable [2024-06-18 22:55:21,224][21642] Unhandled exception 'method' object is not subscriptable in evt loop rollout_proc5_evt_loop [2024-06-18 22:55:22,138][21373] Fps is (10 sec: 18022.4, 60 sec: 10558.6, 300 sec: 10558.6). Total num frames: 3732979712. Throughput: 0: 13440.5. Samples: 604820. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:55:22,323][21605] Updated weights for policy 0, policy_version 227844 (0.0012) [2024-06-18 22:55:23,143][21609] EvtLoop [rollout_proc3_evt_loop, process=rollout_proc3] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance3'), args=(0, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects[object].symbol TypeError: 'method' object is not subscriptable [2024-06-18 22:55:23,144][21609] Unhandled exception 'method' object is not subscriptable in evt loop rollout_proc3_evt_loop [2024-06-18 22:55:24,310][21646] Worker 8 awakens! 
[2024-06-18 22:55:24,316][21373] Heartbeat connected on RolloutWorker_w8 [2024-06-18 22:55:26,079][21641] EvtLoop [rollout_proc2_evt_loop, process=rollout_proc2] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance2'), args=(0, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects[object].symbol TypeError: 'method' object is not subscriptable [2024-06-18 22:55:26,080][21641] Unhandled exception 'method' object is not subscriptable in evt loop rollout_proc2_evt_loop [2024-06-18 22:55:26,321][21606] EvtLoop [rollout_proc0_evt_loop, process=rollout_proc0] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance0'), args=(0, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal 
slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects[object].symbol TypeError: 'method' object is not subscriptable [2024-06-18 22:55:26,321][21606] Unhandled exception 'method' object is not subscriptable in evt loop rollout_proc0_evt_loop [2024-06-18 22:55:27,138][21373] Fps is (10 sec: 19660.8, 60 sec: 11141.1, 300 sec: 11141.1). Total num frames: 3733061632. Throughput: 0: 15850.2. Samples: 718120. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:56:30,112][23870] Saving configuration to /workspace/metta/train_dir/p2.dr4/config.json... 
[2024-06-18 22:56:30,129][23870] Rollout worker 0 uses device cpu [2024-06-18 22:56:30,129][23870] Rollout worker 1 uses device cpu [2024-06-18 22:56:30,129][23870] Rollout worker 2 uses device cpu [2024-06-18 22:56:30,129][23870] Rollout worker 3 uses device cpu [2024-06-18 22:56:30,129][23870] Rollout worker 4 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 5 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 6 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 7 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 8 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 9 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 10 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 11 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 12 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 13 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 14 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 15 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 16 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 17 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 18 uses device cpu [2024-06-18 22:56:30,130][23870] Rollout worker 19 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 20 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 21 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 22 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 23 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 24 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 25 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 26 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 27 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 28 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 29 uses device cpu 
[2024-06-18 22:56:30,131][23870] Rollout worker 30 uses device cpu [2024-06-18 22:56:30,131][23870] Rollout worker 31 uses device cpu [2024-06-18 22:56:30,700][23870] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:56:30,700][23870] InferenceWorker_p0-w0: min num requests: 10 [2024-06-18 22:56:30,775][23870] Starting all processes... [2024-06-18 22:56:30,776][23870] Starting process learner_proc0 [2024-06-18 22:56:31,006][23870] Starting all processes... [2024-06-18 22:56:31,008][23870] Starting process inference_proc0-0 [2024-06-18 22:56:31,009][23870] Starting process rollout_proc0 [2024-06-18 22:56:31,009][23870] Starting process rollout_proc2 [2024-06-18 22:56:31,009][23870] Starting process rollout_proc1 [2024-06-18 22:56:31,010][23870] Starting process rollout_proc3 [2024-06-18 22:56:31,010][23870] Starting process rollout_proc4 [2024-06-18 22:56:31,010][23870] Starting process rollout_proc5 [2024-06-18 22:56:31,010][23870] Starting process rollout_proc6 [2024-06-18 22:56:31,011][23870] Starting process rollout_proc7 [2024-06-18 22:56:31,012][23870] Starting process rollout_proc8 [2024-06-18 22:56:31,013][23870] Starting process rollout_proc9 [2024-06-18 22:56:31,013][23870] Starting process rollout_proc10 [2024-06-18 22:56:31,013][23870] Starting process rollout_proc11 [2024-06-18 22:56:31,078][23870] Starting process rollout_proc12 [2024-06-18 22:56:31,079][23870] Starting process rollout_proc13 [2024-06-18 22:56:31,079][23870] Starting process rollout_proc14 [2024-06-18 22:56:31,094][23870] Starting process rollout_proc16 [2024-06-18 22:56:31,088][23870] Starting process rollout_proc15 [2024-06-18 22:56:31,094][23870] Starting process rollout_proc17 [2024-06-18 22:56:31,094][23870] Starting process rollout_proc18 [2024-06-18 22:56:31,100][23870] Starting process rollout_proc19 [2024-06-18 22:56:31,107][23870] Starting process rollout_proc20 [2024-06-18 22:56:31,107][23870] Starting process rollout_proc21 [2024-06-18 
22:56:31,109][23870] Starting process rollout_proc22 [2024-06-18 22:56:31,109][23870] Starting process rollout_proc23 [2024-06-18 22:56:31,112][23870] Starting process rollout_proc24 [2024-06-18 22:56:31,117][23870] Starting process rollout_proc25 [2024-06-18 22:56:31,125][23870] Starting process rollout_proc26 [2024-06-18 22:56:31,126][23870] Starting process rollout_proc27 [2024-06-18 22:56:31,136][23870] Starting process rollout_proc28 [2024-06-18 22:56:31,137][23870] Starting process rollout_proc29 [2024-06-18 22:56:31,140][23870] Starting process rollout_proc30 [2024-06-18 22:56:31,141][23870] Starting process rollout_proc31 [2024-06-18 22:56:33,167][24082] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:56:33,167][24082] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-18 22:56:33,176][24082] Num visible devices: 1 [2024-06-18 22:56:33,188][24082] Setting fixed seed 0 [2024-06-18 22:56:33,189][24082] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:56:33,190][24082] Initializing actor-critic model on device cuda:0 [2024-06-18 22:56:33,216][24138] Worker 4 uses CPU cores [4] [2024-06-18 22:56:33,263][24103] Worker 0 uses CPU cores [0] [2024-06-18 22:56:33,276][24136] Worker 2 uses CPU cores [2] [2024-06-18 22:56:33,284][24184] Worker 19 uses CPU cores [19] [2024-06-18 22:56:33,316][24135] Worker 1 uses CPU cores [1] [2024-06-18 22:56:33,328][24185] Worker 20 uses CPU cores [20] [2024-06-18 22:56:33,352][24176] Worker 9 uses CPU cores [9] [2024-06-18 22:56:33,366][24102] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:56:33,366][24102] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-18 22:56:33,368][24191] Worker 27 uses CPU cores [27] [2024-06-18 22:56:33,372][24189] Worker 24 uses CPU cores [24] [2024-06-18 22:56:33,376][24183] Worker 17 uses CPU cores [17] [2024-06-18 22:56:33,376][24102] 
Num visible devices: 1 [2024-06-18 22:56:33,416][24194] Worker 31 uses CPU cores [31] [2024-06-18 22:56:33,420][24186] Worker 21 uses CPU cores [21] [2024-06-18 22:56:33,428][24187] Worker 22 uses CPU cores [22] [2024-06-18 22:56:33,428][24192] Worker 26 uses CPU cores [26] [2024-06-18 22:56:33,474][24171] Worker 7 uses CPU cores [7] [2024-06-18 22:56:33,488][24177] Worker 13 uses CPU cores [13] [2024-06-18 22:56:33,499][24181] Worker 18 uses CPU cores [18] [2024-06-18 22:56:33,506][24137] Worker 3 uses CPU cores [3] [2024-06-18 22:56:33,512][24195] Worker 30 uses CPU cores [30] [2024-06-18 22:56:33,515][24155] Worker 6 uses CPU cores [6] [2024-06-18 22:56:33,524][24175] Worker 11 uses CPU cores [11] [2024-06-18 22:56:33,536][24188] Worker 23 uses CPU cores [23] [2024-06-18 22:56:33,554][24173] Worker 8 uses CPU cores [8] [2024-06-18 22:56:33,574][24174] Worker 10 uses CPU cores [10] [2024-06-18 22:56:33,588][24182] Worker 15 uses CPU cores [15] [2024-06-18 22:56:33,600][24180] Worker 16 uses CPU cores [16] [2024-06-18 22:56:33,632][24196] Worker 29 uses CPU cores [29] [2024-06-18 22:56:33,634][24179] Worker 14 uses CPU cores [14] [2024-06-18 22:56:33,639][24193] Worker 28 uses CPU cores [28] [2024-06-18 22:56:33,641][24178] Worker 12 uses CPU cores [12] [2024-06-18 22:56:33,672][24172] Worker 5 uses CPU cores [5] [2024-06-18 22:56:33,821][24190] Worker 25 uses CPU cores [25] [2024-06-18 22:56:34,151][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] 
RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,152][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,153][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,153][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,156][24082] RunningMeanStd input shape: (1,) [2024-06-18 22:56:34,156][24082] RunningMeanStd input shape: (1,) [2024-06-18 22:56:34,156][24082] RunningMeanStd input shape: (1,) [2024-06-18 22:56:34,156][24082] RunningMeanStd input shape: (1,) [2024-06-18 22:56:34,156][24082] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:34,195][24082] RunningMeanStd input shape: (1,) [2024-06-18 22:56:34,199][24082] Created Actor Critic model with architecture: [2024-06-18 22:56:34,200][24082] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (clock): RunningMeanStdInPlace() (converter): 
RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, 
out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-18 22:56:34,269][24082] Using optimizer [2024-06-18 22:56:34,454][24082] Loading state from checkpoint /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227814_3732504576.pth... [2024-06-18 22:56:34,469][24082] Loading model from checkpoint [2024-06-18 22:56:34,470][24082] Loaded experiment state at self.train_step=227814, self.env_steps=3732504576 [2024-06-18 22:56:34,471][24082] Initialized policy 0 weights for model version 227814 [2024-06-18 22:56:34,472][24082] LearnerWorker_p0 finished initialization! 
[2024-06-18 22:56:34,472][24082] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:56:35,260][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,261][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,262][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,262][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,262][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,262][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,262][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,262][24102] RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,265][24102] RunningMeanStd input shape: (1,) [2024-06-18 22:56:35,265][24102] RunningMeanStd input shape: (1,) [2024-06-18 22:56:35,265][24102] RunningMeanStd input shape: (1,) [2024-06-18 22:56:35,265][24102] RunningMeanStd input shape: (1,) [2024-06-18 22:56:35,265][24102] 
RunningMeanStd input shape: (11, 11) [2024-06-18 22:56:35,304][24102] RunningMeanStd input shape: (1,) [2024-06-18 22:56:35,326][23870] Inference worker 0-0 is ready! [2024-06-18 22:56:35,326][23870] All inference workers are ready! Signal rollout workers to start! [2024-06-18 22:56:37,807][23870] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 3732504576. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 22:56:38,067][24186] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,108][24194] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,113][24171] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,122][24178] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,135][24136] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,139][24174] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,140][24175] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,149][24180] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,159][24193] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,162][24155] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,198][24138] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,206][24183] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,214][24190] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,218][24191] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,221][24173] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,224][24192] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,227][24135] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,231][24177] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,239][24196] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,239][24185] Decorrelating experience for 0 frames... 
[2024-06-18 22:56:38,246][24172] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,246][24188] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,248][24181] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,260][24103] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,269][24195] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,269][24182] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,279][24137] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,282][24187] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,287][24179] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,290][24176] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,296][24184] Decorrelating experience for 0 frames... [2024-06-18 22:56:38,302][24189] Decorrelating experience for 0 frames... [2024-06-18 22:56:39,222][24186] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,281][24136] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,323][24178] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,369][24177] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,394][24155] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,394][24137] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,436][24135] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,446][24138] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,455][24193] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,478][24190] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,479][24174] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,493][24171] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,494][24175] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,496][24181] Decorrelating experience for 256 frames... 
[2024-06-18 22:56:39,515][24194] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,530][24185] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,531][24195] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,531][24180] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,533][24192] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,573][24196] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,574][24172] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,600][24188] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,601][24187] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,621][24182] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,636][24176] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,642][24173] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,648][24183] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,659][24103] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,667][24189] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,671][24179] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,675][24184] Decorrelating experience for 256 frames... [2024-06-18 22:56:39,735][24191] Decorrelating experience for 256 frames... [2024-06-18 22:56:42,812][23870] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 3732504576. Throughput: 0: 811.2. Samples: 4060. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 22:56:47,034][24137] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-18 22:56:47,042][24135] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-18 22:56:47,048][24136] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-18 22:56:47,067][24177] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-18 22:56:47,089][24178] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-18 22:56:47,113][24175] EvtLoop [rollout_proc11_evt_loop, process=rollout_proc11] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance11'), args=(1, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects()[object].symbol TypeError: list indices must be integers or slices, not str 
[2024-06-18 22:56:47,114][24175] Unhandled exception list indices must be integers or slices, not str in evt loop rollout_proc11_evt_loop [2024-06-18 22:56:47,212][24138] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-18 22:56:47,212][24186] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-18 22:56:47,218][24155] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-18 22:56:47,237][24082] Signal inference workers to stop experience collection... [2024-06-18 22:56:47,241][24179] EvtLoop [rollout_proc14_evt_loop, process=rollout_proc14] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance14'), args=(1, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects()[object].symbol TypeError: list indices must be integers or slices, not str [2024-06-18 
22:56:47,241][24179] Unhandled exception list indices must be integers or slices, not str in evt loop rollout_proc14_evt_loop [2024-06-18 22:56:47,245][24102] InferenceWorker_p0-w0: stopping experience collection [2024-06-18 22:56:47,724][24082] Signal inference workers to resume experience collection... [2024-06-18 22:56:47,725][24102] InferenceWorker_p0-w0: resuming experience collection [2024-06-18 22:56:47,738][24182] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-18 22:56:47,749][24196] EvtLoop [rollout_proc29_evt_loop, process=rollout_proc29] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance29'), args=(1, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects()[object].symbol TypeError: list indices must be integers or slices, not str [2024-06-18 22:56:47,750][24196] Unhandled exception list indices 
must be integers or slices, not str in evt loop rollout_proc29_evt_loop [2024-06-18 22:56:47,777][24172] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-18 22:56:47,780][24176] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-18 22:56:47,795][24174] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-18 22:56:47,807][23870] Fps is (10 sec: 1638.4, 60 sec: 1638.4, 300 sec: 1638.4). Total num frames: 3732520960. Throughput: 0: 31422.2. Samples: 314220. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:56:48,048][24173] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-18 22:56:48,052][24195] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-18 22:56:48,078][24180] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-18 22:56:48,216][24181] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-18 22:56:48,260][24185] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-18 22:56:48,321][24193] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-18 22:56:48,327][24194] EvtLoop [rollout_proc31_evt_loop, process=rollout_proc31] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance31'), args=(1, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = 
e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects()[object].symbol TypeError: list indices must be integers or slices, not str [2024-06-18 22:56:48,328][24194] Unhandled exception list indices must be integers or slices, not str in evt loop rollout_proc31_evt_loop [2024-06-18 22:56:48,382][24190] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-18 22:56:48,422][24192] EvtLoop [rollout_proc26_evt_loop, process=rollout_proc26] unhandled exception in slot='advance_rollouts' connected to emitter=Emitter(object_id='InferenceWorker_p0-w0', signal_name='advance26'), args=(1, 0) Traceback (most recent call last): File "/opt/conda/lib/python3.10/site-packages/signal_slot/signal_slot.py", line 355, in _process_signal slot_callable(*args) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/rollout_worker.py", line 241, in advance_rollouts complete_rollouts, episodic_stats = runner.advance_rollouts(policy_id, self.timing) File "/workspace/metta/third_party/sample_factory/sample_factory/algo/sampling/non_batched_sampling.py", line 634, in advance_rollouts new_obs, rewards, terminated, truncated, infos = e.step(actions) File "/opt/conda/lib/python3.10/site-packages/gymnasium/core.py", line 461, in step return self.env.step(action) File "/workspace/metta/rl_framework/sample_factory/sample_factory_env_wrapper.py", line 52, in step obs, rewards, terminated, truncated, infos_dict = self.gym_env.step(actions) File 
"/workspace/metta/env/griddly/mettagrid/gym_env.py", line 56, in step self.process_episode_stats(info["episode_extra_stats"]) File "/workspace/metta/env/griddly/mettagrid/gym_env.py", line 68, in process_episode_stats symbol = self._game_builder.objects()[object].symbol TypeError: list indices must be integers or slices, not str [2024-06-18 22:56:48,422][24171] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-18 22:56:48,422][24192] Unhandled exception list indices must be integers or slices, not str in evt loop rollout_proc26_evt_loop [2024-06-18 22:56:48,504][24189] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-18 22:56:48,600][24187] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-18 22:56:48,662][24188] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-18 22:56:48,751][24183] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-18 22:56:49,026][24191] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-18 22:56:49,085][24184] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-18 22:57:40,655][26367] Saving configuration to /workspace/metta/train_dir/p2.dr4/config.json... 
[2024-06-18 22:57:40,672][26367] Rollout worker 0 uses device cpu [2024-06-18 22:57:40,672][26367] Rollout worker 1 uses device cpu [2024-06-18 22:57:40,672][26367] Rollout worker 2 uses device cpu [2024-06-18 22:57:40,672][26367] Rollout worker 3 uses device cpu [2024-06-18 22:57:40,672][26367] Rollout worker 4 uses device cpu [2024-06-18 22:57:40,673][26367] Rollout worker 5 uses device cpu [2024-06-18 22:57:40,673][26367] Rollout worker 6 uses device cpu [2024-06-18 22:57:40,673][26367] Rollout worker 7 uses device cpu [2024-06-18 22:57:40,673][26367] Rollout worker 8 uses device cpu [2024-06-18 22:57:40,673][26367] Rollout worker 9 uses device cpu [2024-06-18 22:57:40,673][26367] Rollout worker 10 uses device cpu [2024-06-18 22:57:40,673][26367] Rollout worker 11 uses device cpu [2024-06-18 22:57:40,673][26367] Rollout worker 12 uses device cpu [2024-06-18 22:57:40,674][26367] Rollout worker 13 uses device cpu [2024-06-18 22:57:40,674][26367] Rollout worker 14 uses device cpu [2024-06-18 22:57:40,674][26367] Rollout worker 15 uses device cpu [2024-06-18 22:57:40,674][26367] Rollout worker 16 uses device cpu [2024-06-18 22:57:40,674][26367] Rollout worker 17 uses device cpu [2024-06-18 22:57:40,674][26367] Rollout worker 18 uses device cpu [2024-06-18 22:57:40,674][26367] Rollout worker 19 uses device cpu [2024-06-18 22:57:40,674][26367] Rollout worker 20 uses device cpu [2024-06-18 22:57:40,675][26367] Rollout worker 21 uses device cpu [2024-06-18 22:57:40,675][26367] Rollout worker 22 uses device cpu [2024-06-18 22:57:40,675][26367] Rollout worker 23 uses device cpu [2024-06-18 22:57:40,675][26367] Rollout worker 24 uses device cpu [2024-06-18 22:57:40,675][26367] Rollout worker 25 uses device cpu [2024-06-18 22:57:40,675][26367] Rollout worker 26 uses device cpu [2024-06-18 22:57:40,675][26367] Rollout worker 27 uses device cpu [2024-06-18 22:57:40,676][26367] Rollout worker 28 uses device cpu [2024-06-18 22:57:40,676][26367] Rollout worker 29 uses device cpu 
[2024-06-18 22:57:40,676][26367] Rollout worker 30 uses device cpu [2024-06-18 22:57:40,676][26367] Rollout worker 31 uses device cpu [2024-06-18 22:57:41,261][26367] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:57:41,261][26367] InferenceWorker_p0-w0: min num requests: 10 [2024-06-18 22:57:41,340][26367] Starting all processes... [2024-06-18 22:57:41,340][26367] Starting process learner_proc0 [2024-06-18 22:57:41,569][26367] Starting all processes... [2024-06-18 22:57:41,572][26367] Starting process inference_proc0-0 [2024-06-18 22:57:41,572][26367] Starting process rollout_proc0 [2024-06-18 22:57:41,572][26367] Starting process rollout_proc1 [2024-06-18 22:57:41,572][26367] Starting process rollout_proc2 [2024-06-18 22:57:41,572][26367] Starting process rollout_proc3 [2024-06-18 22:57:41,575][26367] Starting process rollout_proc4 [2024-06-18 22:57:41,575][26367] Starting process rollout_proc5 [2024-06-18 22:57:41,575][26367] Starting process rollout_proc6 [2024-06-18 22:57:41,575][26367] Starting process rollout_proc7 [2024-06-18 22:57:41,639][26367] Starting process rollout_proc8 [2024-06-18 22:57:41,639][26367] Starting process rollout_proc9 [2024-06-18 22:57:41,639][26367] Starting process rollout_proc10 [2024-06-18 22:57:41,639][26367] Starting process rollout_proc11 [2024-06-18 22:57:41,639][26367] Starting process rollout_proc12 [2024-06-18 22:57:41,639][26367] Starting process rollout_proc13 [2024-06-18 22:57:41,640][26367] Starting process rollout_proc14 [2024-06-18 22:57:41,640][26367] Starting process rollout_proc15 [2024-06-18 22:57:41,641][26367] Starting process rollout_proc16 [2024-06-18 22:57:41,642][26367] Starting process rollout_proc17 [2024-06-18 22:57:41,642][26367] Starting process rollout_proc18 [2024-06-18 22:57:41,642][26367] Starting process rollout_proc19 [2024-06-18 22:57:41,644][26367] Starting process rollout_proc20 [2024-06-18 22:57:41,653][26367] Starting process rollout_proc21 [2024-06-18 
22:57:41,658][26367] Starting process rollout_proc22 [2024-06-18 22:57:41,676][26367] Starting process rollout_proc23 [2024-06-18 22:57:41,676][26367] Starting process rollout_proc24 [2024-06-18 22:57:41,676][26367] Starting process rollout_proc25 [2024-06-18 22:57:41,680][26367] Starting process rollout_proc26 [2024-06-18 22:57:41,692][26367] Starting process rollout_proc27 [2024-06-18 22:57:41,692][26367] Starting process rollout_proc28 [2024-06-18 22:57:41,696][26367] Starting process rollout_proc29 [2024-06-18 22:57:41,696][26367] Starting process rollout_proc30 [2024-06-18 22:57:41,696][26367] Starting process rollout_proc31 [2024-06-18 22:57:43,705][26599] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:57:43,705][26599] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for inference process 0 [2024-06-18 22:57:43,714][26599] Num visible devices: 1 [2024-06-18 22:57:43,741][26602] Worker 1 uses CPU cores [1] [2024-06-18 22:57:43,791][26600] Worker 2 uses CPU cores [2] [2024-06-18 22:57:43,799][26601] Worker 0 uses CPU cores [0] [2024-06-18 22:57:43,820][26686] Worker 25 uses CPU cores [25] [2024-06-18 22:57:43,824][26579] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:57:43,824][26579] Set environment var CUDA_VISIBLE_DEVICES to '0' (GPU indices [0]) for learning process 0 [2024-06-18 22:57:43,836][26579] Num visible devices: 1 [2024-06-18 22:57:43,852][26679] Worker 16 uses CPU cores [16] [2024-06-18 22:57:43,852][26579] Setting fixed seed 0 [2024-06-18 22:57:43,853][26579] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:57:43,853][26579] Initializing actor-critic model on device cuda:0 [2024-06-18 22:57:43,892][26637] Worker 6 uses CPU cores [6] [2024-06-18 22:57:43,896][26690] Worker 26 uses CPU cores [26] [2024-06-18 22:57:43,924][26672] Worker 12 uses CPU cores [12] [2024-06-18 22:57:43,936][26684] Worker 22 uses CPU cores [22] [2024-06-18 22:57:43,952][26678] Worker 
17 uses CPU cores [17] [2024-06-18 22:57:43,956][26691] Worker 31 uses CPU cores [31] [2024-06-18 22:57:43,967][26671] Worker 8 uses CPU cores [8] [2024-06-18 22:57:43,971][26674] Worker 13 uses CPU cores [13] [2024-06-18 22:57:44,010][26673] Worker 10 uses CPU cores [10] [2024-06-18 22:57:44,015][26693] Worker 29 uses CPU cores [29] [2024-06-18 22:57:44,019][26669] Worker 7 uses CPU cores [7] [2024-06-18 22:57:44,019][26681] Worker 19 uses CPU cores [19] [2024-06-18 22:57:44,050][26603] Worker 3 uses CPU cores [3] [2024-06-18 22:57:44,060][26604] Worker 4 uses CPU cores [4] [2024-06-18 22:57:44,064][26682] Worker 21 uses CPU cores [21] [2024-06-18 22:57:44,067][26636] Worker 5 uses CPU cores [5] [2024-06-18 22:57:44,087][26677] Worker 11 uses CPU cores [11] [2024-06-18 22:57:44,128][26670] Worker 9 uses CPU cores [9] [2024-06-18 22:57:44,130][26685] Worker 20 uses CPU cores [20] [2024-06-18 22:57:44,164][26689] Worker 28 uses CPU cores [28] [2024-06-18 22:57:44,191][26676] Worker 15 uses CPU cores [15] [2024-06-18 22:57:44,196][26683] Worker 24 uses CPU cores [24] [2024-06-18 22:57:44,276][26680] Worker 18 uses CPU cores [18] [2024-06-18 22:57:44,297][26692] Worker 30 uses CPU cores [30] [2024-06-18 22:57:44,309][26688] Worker 23 uses CPU cores [23] [2024-06-18 22:57:44,333][26675] Worker 14 uses CPU cores [14] [2024-06-18 22:57:44,346][26687] Worker 27 uses CPU cores [27] [2024-06-18 22:57:44,799][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] 
RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,800][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,801][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,801][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,804][26579] RunningMeanStd input shape: (1,) [2024-06-18 22:57:44,804][26579] RunningMeanStd input shape: (1,) [2024-06-18 22:57:44,804][26579] RunningMeanStd input shape: (1,) [2024-06-18 22:57:44,804][26579] RunningMeanStd input shape: (1,) [2024-06-18 22:57:44,804][26579] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:44,844][26579] RunningMeanStd input shape: (1,) [2024-06-18 22:57:44,848][26579] Created Actor Critic model with architecture: [2024-06-18 22:57:44,848][26579] SampleFactoryAgentWrapper( (obs_normalizer): ObservationNormalizer() (returns_normalizer): RecursiveScriptModule(original_name=RunningMeanStdInPlace) (agent): MettaAgent( (_encoder): MultiFeatureSetEncoder( (feature_set_encoders): ModuleDict( (grid_obs): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (agent): RunningMeanStdInPlace() (altar): RunningMeanStdInPlace() (clock): RunningMeanStdInPlace() (converter): 
RunningMeanStdInPlace() (generator): RunningMeanStdInPlace() (wall): RunningMeanStdInPlace() (agent:dir): RunningMeanStdInPlace() (agent:energy): RunningMeanStdInPlace() (agent:frozen): RunningMeanStdInPlace() (agent:hp): RunningMeanStdInPlace() (agent:id): RunningMeanStdInPlace() (agent:inv_r1): RunningMeanStdInPlace() (agent:inv_r2): RunningMeanStdInPlace() (agent:inv_r3): RunningMeanStdInPlace() (agent:shield): RunningMeanStdInPlace() (altar:hp): RunningMeanStdInPlace() (altar:state): RunningMeanStdInPlace() (converter:hp): RunningMeanStdInPlace() (converter:state): RunningMeanStdInPlace() (generator:amount): RunningMeanStdInPlace() (generator:hp): RunningMeanStdInPlace() (generator:state): RunningMeanStdInPlace() (wall:hp): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) (6): Linear(in_features=512, out_features=512, bias=True) (7): ELU(alpha=1.0) ) ) (global_vars): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (_steps): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_action): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_action_id): RunningMeanStdInPlace() (last_action_val): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (last_reward): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (last_reward): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=5, 
out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) (kinship): FeatureSetEncoder( (_normalizer): FeatureListNormalizer( (_norms_dict): ModuleDict( (kinship): RunningMeanStdInPlace() ) ) (embedding_net): Sequential( (0): Linear(in_features=125, out_features=8, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=8, out_features=8, bias=True) (3): ELU(alpha=1.0) ) ) ) (merged_encoder): Sequential( (0): Linear(in_features=544, out_features=512, bias=True) (1): ELU(alpha=1.0) (2): Linear(in_features=512, out_features=512, bias=True) (3): ELU(alpha=1.0) (4): Linear(in_features=512, out_features=512, bias=True) (5): ELU(alpha=1.0) ) ) (_core): ModelCoreRNN( (core): GRU(512, 512) ) (_decoder): Decoder( (mlp): Identity() ) (_critic_linear): Linear(in_features=512, out_features=1, bias=True) (_action_parameterization): ActionParameterizationDefault( (distribution_linear): Linear(in_features=512, out_features=16, bias=True) ) ) ) [2024-06-18 22:57:44,912][26579] Using optimizer [2024-06-18 22:57:45,097][26579] Loading state from checkpoint /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227814_3732504576.pth... [2024-06-18 22:57:45,112][26579] Loading model from checkpoint [2024-06-18 22:57:45,114][26579] Loaded experiment state at self.train_step=227814, self.env_steps=3732504576 [2024-06-18 22:57:45,114][26579] Initialized policy 0 weights for model version 227814 [2024-06-18 22:57:45,115][26579] LearnerWorker_p0 finished initialization! 
[2024-06-18 22:57:45,115][26579] Using GPUs [0] for process 0 (actually maps to GPUs [0]) [2024-06-18 22:57:45,852][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,852][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,852][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,852][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,852][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,852][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,853][26599] RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,856][26599] RunningMeanStd input shape: (1,) [2024-06-18 22:57:45,857][26599] RunningMeanStd input shape: (1,) [2024-06-18 22:57:45,857][26599] RunningMeanStd input shape: (1,) [2024-06-18 22:57:45,857][26599] RunningMeanStd input shape: (1,) [2024-06-18 22:57:45,857][26599] 
RunningMeanStd input shape: (11, 11) [2024-06-18 22:57:45,897][26599] RunningMeanStd input shape: (1,) [2024-06-18 22:57:45,918][26367] Inference worker 0-0 is ready! [2024-06-18 22:57:45,919][26367] All inference workers are ready! Signal rollout workers to start! [2024-06-18 22:57:48,380][26367] Fps is (10 sec: nan, 60 sec: nan, 300 sec: nan). Total num frames: 3732504576. Throughput: 0: nan. Samples: 0. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 22:57:48,675][26682] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,700][26602] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,705][26637] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,751][26691] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,752][26681] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,773][26674] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,777][26692] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,777][26685] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,782][26670] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,782][26690] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,803][26636] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,803][26684] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,814][26686] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,839][26678] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,844][26687] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,848][26600] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,849][26672] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,856][26677] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,857][26689] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,862][26693] Decorrelating experience for 0 frames... 
[2024-06-18 22:57:48,865][26601] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,868][26604] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,871][26603] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,876][26683] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,882][26675] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,887][26673] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,889][26671] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,891][26669] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,901][26676] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,907][26679] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,921][26688] Decorrelating experience for 0 frames... [2024-06-18 22:57:48,925][26680] Decorrelating experience for 0 frames... [2024-06-18 22:57:49,920][26674] Decorrelating experience for 256 frames... [2024-06-18 22:57:49,966][26637] Decorrelating experience for 256 frames... [2024-06-18 22:57:49,973][26684] Decorrelating experience for 256 frames... [2024-06-18 22:57:49,979][26685] Decorrelating experience for 256 frames... [2024-06-18 22:57:49,993][26636] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,005][26602] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,028][26670] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,029][26672] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,035][26682] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,040][26686] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,059][26671] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,085][26688] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,107][26683] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,109][26673] Decorrelating experience for 256 frames... 
[2024-06-18 22:57:50,115][26681] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,118][26676] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,122][26687] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,144][26690] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,149][26675] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,176][26691] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,177][26669] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,184][26603] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,188][26601] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,199][26677] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,202][26689] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,207][26678] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,209][26604] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,215][26600] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,217][26692] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,237][26693] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,282][26679] Decorrelating experience for 256 frames... [2024-06-18 22:57:50,346][26680] Decorrelating experience for 256 frames... [2024-06-18 22:57:53,380][26367] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 3732504576. Throughput: 0: 732.0. Samples: 3660. 
Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 22:57:57,667][26674] Worker 13, sleep for 60.938 sec to decorrelate experience collection [2024-06-18 22:57:57,726][26602] Worker 1, sleep for 4.688 sec to decorrelate experience collection [2024-06-18 22:57:57,729][26670] Worker 9, sleep for 42.188 sec to decorrelate experience collection [2024-06-18 22:57:57,748][26675] Worker 14, sleep for 65.625 sec to decorrelate experience collection [2024-06-18 22:57:57,756][26672] Worker 12, sleep for 56.250 sec to decorrelate experience collection [2024-06-18 22:57:57,786][26671] Worker 8, sleep for 37.500 sec to decorrelate experience collection [2024-06-18 22:57:57,800][26676] Worker 15, sleep for 70.312 sec to decorrelate experience collection [2024-06-18 22:57:57,820][26636] Worker 5, sleep for 23.438 sec to decorrelate experience collection [2024-06-18 22:57:57,861][26688] Worker 23, sleep for 107.812 sec to decorrelate experience collection [2024-06-18 22:57:57,865][26673] Worker 10, sleep for 46.875 sec to decorrelate experience collection [2024-06-18 22:57:57,885][26579] Signal inference workers to stop experience collection... [2024-06-18 22:57:57,893][26599] InferenceWorker_p0-w0: stopping experience collection [2024-06-18 22:57:58,380][26367] Fps is (10 sec: 0.0, 60 sec: 0.0, 300 sec: 0.0). Total num frames: 3732504576. Throughput: 0: 31806.1. Samples: 318060. Policy #0 lag: (min: -1.0, avg: -1.0, max: -1.0) [2024-06-18 22:57:58,395][26579] Signal inference workers to resume experience collection... 
[2024-06-18 22:57:58,396][26599] InferenceWorker_p0-w0: resuming experience collection [2024-06-18 22:57:58,424][26684] Worker 22, sleep for 103.125 sec to decorrelate experience collection [2024-06-18 22:57:58,424][26687] Worker 27, sleep for 126.562 sec to decorrelate experience collection [2024-06-18 22:57:58,425][26677] Worker 11, sleep for 51.562 sec to decorrelate experience collection [2024-06-18 22:57:58,439][26685] Worker 20, sleep for 93.750 sec to decorrelate experience collection [2024-06-18 22:57:58,639][26603] Worker 3, sleep for 14.062 sec to decorrelate experience collection [2024-06-18 22:57:58,692][26637] Worker 6, sleep for 28.125 sec to decorrelate experience collection [2024-06-18 22:57:58,762][26600] Worker 2, sleep for 9.375 sec to decorrelate experience collection [2024-06-18 22:57:58,814][26669] Worker 7, sleep for 32.812 sec to decorrelate experience collection [2024-06-18 22:57:58,887][26604] Worker 4, sleep for 18.750 sec to decorrelate experience collection [2024-06-18 22:57:59,021][26681] Worker 19, sleep for 89.062 sec to decorrelate experience collection [2024-06-18 22:57:59,042][26683] Worker 24, sleep for 112.500 sec to decorrelate experience collection [2024-06-18 22:57:59,066][26686] Worker 25, sleep for 117.188 sec to decorrelate experience collection [2024-06-18 22:57:59,080][26689] Worker 28, sleep for 131.250 sec to decorrelate experience collection [2024-06-18 22:57:59,246][26678] Worker 17, sleep for 79.688 sec to decorrelate experience collection [2024-06-18 22:57:59,302][26690] Worker 26, sleep for 121.875 sec to decorrelate experience collection [2024-06-18 22:57:59,307][26682] Worker 21, sleep for 98.438 sec to decorrelate experience collection [2024-06-18 22:57:59,367][26679] Worker 16, sleep for 75.000 sec to decorrelate experience collection [2024-06-18 22:57:59,486][26692] Worker 30, sleep for 140.625 sec to decorrelate experience collection [2024-06-18 22:57:59,526][26691] Worker 31, sleep for 145.312 sec to 
decorrelate experience collection [2024-06-18 22:57:59,526][26680] Worker 18, sleep for 84.375 sec to decorrelate experience collection [2024-06-18 22:57:59,558][26693] Worker 29, sleep for 135.938 sec to decorrelate experience collection [2024-06-18 22:57:59,670][26599] Updated weights for policy 0, policy_version 227824 (0.0014) [2024-06-18 22:58:01,257][26367] Heartbeat connected on Batcher_0 [2024-06-18 22:58:01,259][26367] Heartbeat connected on LearnerWorker_p0 [2024-06-18 22:58:01,264][26367] Heartbeat connected on RolloutWorker_w0 [2024-06-18 22:58:01,299][26367] Heartbeat connected on InferenceWorker_p0-w0 [2024-06-18 22:58:02,437][26602] Worker 1 awakens! [2024-06-18 22:58:02,444][26367] Heartbeat connected on RolloutWorker_w1 [2024-06-18 22:58:03,380][26367] Fps is (10 sec: 16383.9, 60 sec: 10922.6, 300 sec: 10922.6). Total num frames: 3732668416. Throughput: 0: 22038.6. Samples: 330580. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:08,184][26600] Worker 2 awakens! [2024-06-18 22:58:08,189][26367] Heartbeat connected on RolloutWorker_w2 [2024-06-18 22:58:08,380][26367] Fps is (10 sec: 18022.3, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 3732684800. Throughput: 0: 17182.0. Samples: 343640. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:12,772][26603] Worker 3 awakens! [2024-06-18 22:58:12,782][26367] Heartbeat connected on RolloutWorker_w3 [2024-06-18 22:58:13,380][26367] Fps is (10 sec: 3276.8, 60 sec: 7864.3, 300 sec: 7864.3). Total num frames: 3732701184. Throughput: 0: 14608.0. Samples: 365200. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:17,731][26604] Worker 4 awakens! [2024-06-18 22:58:17,737][26367] Heartbeat connected on RolloutWorker_w4 [2024-06-18 22:58:18,380][26367] Fps is (10 sec: 4915.2, 60 sec: 7645.9, 300 sec: 7645.9). Total num frames: 3732733952. Throughput: 0: 12685.4. Samples: 380560. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:21,357][26636] Worker 5 awakens! [2024-06-18 22:58:21,364][26367] Heartbeat connected on RolloutWorker_w5 [2024-06-18 22:58:23,380][26367] Fps is (10 sec: 9830.5, 60 sec: 8426.1, 300 sec: 8426.1). Total num frames: 3732799488. Throughput: 0: 12722.3. Samples: 445280. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:25,606][26599] Updated weights for policy 0, policy_version 227834 (0.0017) [2024-06-18 22:58:26,918][26637] Worker 6 awakens! [2024-06-18 22:58:26,923][26367] Heartbeat connected on RolloutWorker_w6 [2024-06-18 22:58:28,380][26367] Fps is (10 sec: 13107.2, 60 sec: 9011.2, 300 sec: 9011.2). Total num frames: 3732865024. Throughput: 0: 13434.5. Samples: 537380. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:31,727][26669] Worker 7 awakens! [2024-06-18 22:58:31,736][26367] Heartbeat connected on RolloutWorker_w7 [2024-06-18 22:58:33,380][26367] Fps is (10 sec: 18022.2, 60 sec: 10558.6, 300 sec: 10558.6). Total num frames: 3732979712. Throughput: 0: 13165.8. Samples: 592460. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:33,966][26599] Updated weights for policy 0, policy_version 227844 (0.0013) [2024-06-18 22:58:35,384][26671] Worker 8 awakens! [2024-06-18 22:58:35,389][26367] Heartbeat connected on RolloutWorker_w8 [2024-06-18 22:58:38,380][26367] Fps is (10 sec: 21299.2, 60 sec: 11468.8, 300 sec: 11468.8). Total num frames: 3733078016. Throughput: 0: 15832.9. Samples: 716140. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:40,017][26670] Worker 9 awakens! [2024-06-18 22:58:40,023][26367] Heartbeat connected on RolloutWorker_w9 [2024-06-18 22:58:41,618][26599] Updated weights for policy 0, policy_version 227854 (0.0012) [2024-06-18 22:58:43,380][26367] Fps is (10 sec: 22937.4, 60 sec: 12809.3, 300 sec: 12809.3). Total num frames: 3733209088. Throughput: 0: 11929.3. Samples: 854880. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:43,381][26367] Avg episode reward: [(0, '0.808')] [2024-06-18 22:58:44,840][26673] Worker 10 awakens! [2024-06-18 22:58:44,845][26367] Heartbeat connected on RolloutWorker_w10 [2024-06-18 22:58:47,976][26599] Updated weights for policy 0, policy_version 227864 (0.0016) [2024-06-18 22:58:48,380][26367] Fps is (10 sec: 24575.7, 60 sec: 13653.3, 300 sec: 13653.3). Total num frames: 3733323776. Throughput: 0: 13215.1. Samples: 925260. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:48,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-18 22:58:50,088][26677] Worker 11 awakens! [2024-06-18 22:58:50,098][26367] Heartbeat connected on RolloutWorker_w11 [2024-06-18 22:58:53,380][26367] Fps is (10 sec: 26214.4, 60 sec: 16110.9, 300 sec: 14871.6). Total num frames: 3733471232. Throughput: 0: 16801.8. Samples: 1099720. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:53,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-18 22:58:53,419][26599] Updated weights for policy 0, policy_version 227874 (0.0014) [2024-06-18 22:58:54,104][26672] Worker 12 awakens! [2024-06-18 22:58:54,112][26367] Heartbeat connected on RolloutWorker_w12 [2024-06-18 22:58:58,380][26367] Fps is (10 sec: 31129.7, 60 sec: 18841.6, 300 sec: 16149.9). Total num frames: 3733635072. Throughput: 0: 20616.4. Samples: 1292940. Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:58:58,381][26367] Avg episode reward: [(0, '0.335')] [2024-06-18 22:58:58,523][26599] Updated weights for policy 0, policy_version 227884 (0.0026) [2024-06-18 22:58:58,704][26674] Worker 13 awakens! [2024-06-18 22:58:58,713][26367] Heartbeat connected on RolloutWorker_w13 [2024-06-18 22:59:03,380][26367] Fps is (10 sec: 32767.9, 60 sec: 18841.6, 300 sec: 17257.8). Total num frames: 3733798912. Throughput: 0: 22291.5. Samples: 1383680. 
Policy #0 lag: (min: 0.0, avg: 0.0, max: 0.0) [2024-06-18 22:59:03,381][26367] Avg episode reward: [(0, '0.358')] [2024-06-18 22:59:03,472][26675] Worker 14 awakens! [2024-06-18 22:59:03,479][26367] Heartbeat connected on RolloutWorker_w14 [2024-06-18 22:59:03,897][26599] Updated weights for policy 0, policy_version 227894 (0.0027) [2024-06-18 22:59:08,212][26676] Worker 15 awakens! [2024-06-18 22:59:08,221][26367] Heartbeat connected on RolloutWorker_w15 [2024-06-18 22:59:08,380][26367] Fps is (10 sec: 32767.9, 60 sec: 21299.2, 300 sec: 18227.2). Total num frames: 3733962752. Throughput: 0: 25380.8. Samples: 1587420. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:08,381][26367] Avg episode reward: [(0, '0.208')] [2024-06-18 22:59:08,543][26599] Updated weights for policy 0, policy_version 227904 (0.0019) [2024-06-18 22:59:13,380][26367] Fps is (10 sec: 32768.0, 60 sec: 23756.8, 300 sec: 19082.5). Total num frames: 3734126592. Throughput: 0: 27643.9. Samples: 1781360. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:13,381][26367] Avg episode reward: [(0, '0.420')] [2024-06-18 22:59:13,635][26599] Updated weights for policy 0, policy_version 227914 (0.0024) [2024-06-18 22:59:14,382][26679] Worker 16 awakens! [2024-06-18 22:59:14,393][26367] Heartbeat connected on RolloutWorker_w16 [2024-06-18 22:59:18,380][26367] Fps is (10 sec: 32768.2, 60 sec: 25941.3, 300 sec: 19842.8). Total num frames: 3734290432. Throughput: 0: 28695.5. Samples: 1883760. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:18,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-18 22:59:18,625][26599] Updated weights for policy 0, policy_version 227924 (0.0028) [2024-06-18 22:59:19,034][26678] Worker 17 awakens! 
[2024-06-18 22:59:19,046][26367] Heartbeat connected on RolloutWorker_w17 [2024-06-18 22:59:22,932][26599] Updated weights for policy 0, policy_version 227934 (0.0029) [2024-06-18 22:59:23,380][26367] Fps is (10 sec: 34406.5, 60 sec: 27852.7, 300 sec: 20695.6). Total num frames: 3734470656. Throughput: 0: 30584.4. Samples: 2092440. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:23,381][26367] Avg episode reward: [(0, '0.329')] [2024-06-18 22:59:24,000][26680] Worker 18 awakens! [2024-06-18 22:59:24,013][26367] Heartbeat connected on RolloutWorker_w18 [2024-06-18 22:59:28,006][26599] Updated weights for policy 0, policy_version 227944 (0.0021) [2024-06-18 22:59:28,182][26681] Worker 19 awakens! [2024-06-18 22:59:28,194][26367] Heartbeat connected on RolloutWorker_w19 [2024-06-18 22:59:28,380][26367] Fps is (10 sec: 36044.7, 60 sec: 29764.2, 300 sec: 21463.0). Total num frames: 3734650880. Throughput: 0: 32156.4. Samples: 2301920. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:28,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-18 22:59:32,288][26685] Worker 20 awakens! [2024-06-18 22:59:32,298][26367] Heartbeat connected on RolloutWorker_w20 [2024-06-18 22:59:32,691][26599] Updated weights for policy 0, policy_version 227954 (0.0022) [2024-06-18 22:59:33,380][26367] Fps is (10 sec: 36044.5, 60 sec: 30856.5, 300 sec: 22157.4). Total num frames: 3734831104. Throughput: 0: 32897.3. Samples: 2405640. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:33,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-18 22:59:36,402][26599] Updated weights for policy 0, policy_version 227964 (0.0033) [2024-06-18 22:59:37,844][26682] Worker 21 awakens! [2024-06-18 22:59:37,856][26367] Heartbeat connected on RolloutWorker_w21 [2024-06-18 22:59:38,380][26367] Fps is (10 sec: 36044.9, 60 sec: 32221.8, 300 sec: 22788.7). Total num frames: 3735011328. Throughput: 0: 34022.2. Samples: 2630720. 
Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:38,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-18 22:59:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227967_3735011328.pth... [2024-06-18 22:59:38,443][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227504_3727425536.pth [2024-06-18 22:59:41,249][26599] Updated weights for policy 0, policy_version 227974 (0.0028) [2024-06-18 22:59:41,649][26684] Worker 22 awakens! [2024-06-18 22:59:41,663][26367] Heartbeat connected on RolloutWorker_w22 [2024-06-18 22:59:43,380][26367] Fps is (10 sec: 36044.9, 60 sec: 33041.0, 300 sec: 23365.0). Total num frames: 3735191552. Throughput: 0: 34654.6. Samples: 2852400. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:43,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-18 22:59:45,733][26599] Updated weights for policy 0, policy_version 227984 (0.0034) [2024-06-18 22:59:45,772][26688] Worker 23 awakens! [2024-06-18 22:59:45,784][26367] Heartbeat connected on RolloutWorker_w23 [2024-06-18 22:59:48,380][26367] Fps is (10 sec: 39321.9, 60 sec: 34679.5, 300 sec: 24166.4). Total num frames: 3735404544. Throughput: 0: 35267.2. Samples: 2970700. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:48,381][26367] Avg episode reward: [(0, '0.236')] [2024-06-18 22:59:49,750][26599] Updated weights for policy 0, policy_version 227994 (0.0031) [2024-06-18 22:59:51,616][26683] Worker 24 awakens! [2024-06-18 22:59:51,630][26367] Heartbeat connected on RolloutWorker_w24 [2024-06-18 22:59:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 35498.6, 300 sec: 24772.6). Total num frames: 3735601152. Throughput: 0: 36004.0. Samples: 3207600. 
Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:53,381][26367] Avg episode reward: [(0, '0.268')] [2024-06-18 22:59:53,706][26599] Updated weights for policy 0, policy_version 228004 (0.0032) [2024-06-18 22:59:56,352][26686] Worker 25 awakens! [2024-06-18 22:59:56,367][26367] Heartbeat connected on RolloutWorker_w25 [2024-06-18 22:59:58,282][26599] Updated weights for policy 0, policy_version 228014 (0.0033) [2024-06-18 22:59:58,380][26367] Fps is (10 sec: 37683.4, 60 sec: 35771.8, 300 sec: 25206.2). Total num frames: 3735781376. Throughput: 0: 36942.8. Samples: 3443780. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 22:59:58,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-18 23:00:01,202][26690] Worker 26 awakens! [2024-06-18 23:00:01,214][26367] Heartbeat connected on RolloutWorker_w26 [2024-06-18 23:00:02,298][26599] Updated weights for policy 0, policy_version 228024 (0.0029) [2024-06-18 23:00:03,380][26367] Fps is (10 sec: 39322.0, 60 sec: 36591.0, 300 sec: 25850.3). Total num frames: 3735994368. Throughput: 0: 37305.8. Samples: 3562520. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 23:00:03,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-18 23:00:05,087][26687] Worker 27 awakens! [2024-06-18 23:00:05,100][26367] Heartbeat connected on RolloutWorker_w27 [2024-06-18 23:00:05,655][26599] Updated weights for policy 0, policy_version 228034 (0.0029) [2024-06-18 23:00:08,380][26367] Fps is (10 sec: 42597.9, 60 sec: 37410.2, 300 sec: 26448.5). Total num frames: 3736207360. Throughput: 0: 37992.9. Samples: 3802120. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 23:00:08,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-18 23:00:10,428][26689] Worker 28 awakens! 
[2024-06-18 23:00:10,443][26367] Heartbeat connected on RolloutWorker_w28 [2024-06-18 23:00:10,638][26599] Updated weights for policy 0, policy_version 228044 (0.0034) [2024-06-18 23:00:13,380][26367] Fps is (10 sec: 39321.6, 60 sec: 37683.2, 300 sec: 26779.4). Total num frames: 3736387584. Throughput: 0: 38707.6. Samples: 4043760. Policy #0 lag: (min: 1.0, avg: 4.6, max: 11.0) [2024-06-18 23:00:13,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-18 23:00:14,124][26599] Updated weights for policy 0, policy_version 228054 (0.0035) [2024-06-18 23:00:15,596][26693] Worker 29 awakens! [2024-06-18 23:00:15,611][26367] Heartbeat connected on RolloutWorker_w29 [2024-06-18 23:00:18,340][26599] Updated weights for policy 0, policy_version 228064 (0.0032) [2024-06-18 23:00:18,380][26367] Fps is (10 sec: 39321.5, 60 sec: 38502.4, 300 sec: 27306.7). Total num frames: 3736600576. Throughput: 0: 39166.3. Samples: 4168120. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:00:18,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-18 23:00:20,202][26692] Worker 30 awakens! [2024-06-18 23:00:20,220][26367] Heartbeat connected on RolloutWorker_w30 [2024-06-18 23:00:21,958][26599] Updated weights for policy 0, policy_version 228074 (0.0033) [2024-06-18 23:00:23,380][26367] Fps is (10 sec: 42598.8, 60 sec: 39048.6, 300 sec: 27800.0). Total num frames: 3736813568. Throughput: 0: 39657.4. Samples: 4415300. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:00:23,380][26367] Avg episode reward: [(0, '0.725')] [2024-06-18 23:00:24,936][26691] Worker 31 awakens! [2024-06-18 23:00:24,952][26367] Heartbeat connected on RolloutWorker_w31 [2024-06-18 23:00:26,149][26599] Updated weights for policy 0, policy_version 228084 (0.0042) [2024-06-18 23:00:28,380][26367] Fps is (10 sec: 42598.8, 60 sec: 39594.7, 300 sec: 28262.4). Total num frames: 3737026560. Throughput: 0: 40257.5. Samples: 4663980. 
Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:00:28,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-18 23:00:29,868][26599] Updated weights for policy 0, policy_version 228094 (0.0034) [2024-06-18 23:00:33,380][26367] Fps is (10 sec: 39321.5, 60 sec: 39594.8, 300 sec: 28498.2). Total num frames: 3737206784. Throughput: 0: 40450.2. Samples: 4790960. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:00:33,380][26367] Avg episode reward: [(0, '0.656')] [2024-06-18 23:00:34,208][26599] Updated weights for policy 0, policy_version 228104 (0.0046) [2024-06-18 23:00:37,623][26599] Updated weights for policy 0, policy_version 228114 (0.0040) [2024-06-18 23:00:38,381][26367] Fps is (10 sec: 40958.3, 60 sec: 40413.6, 300 sec: 29009.3). Total num frames: 3737436160. Throughput: 0: 40718.5. Samples: 5039940. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:00:38,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-18 23:00:41,712][26579] Signal inference workers to stop experience collection... (50 times) [2024-06-18 23:00:41,763][26599] InferenceWorker_p0-w0: stopping experience collection (50 times) [2024-06-18 23:00:41,770][26579] Signal inference workers to resume experience collection... (50 times) [2024-06-18 23:00:41,776][26599] InferenceWorker_p0-w0: resuming experience collection (50 times) [2024-06-18 23:00:41,916][26599] Updated weights for policy 0, policy_version 228124 (0.0052) [2024-06-18 23:00:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40413.9, 300 sec: 29210.3). Total num frames: 3737616384. Throughput: 0: 41111.9. Samples: 5293820. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:00:43,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-18 23:00:45,344][26599] Updated weights for policy 0, policy_version 228134 (0.0034) [2024-06-18 23:00:48,380][26367] Fps is (10 sec: 40961.7, 60 sec: 40686.9, 300 sec: 29673.3). Total num frames: 3737845760. Throughput: 0: 41197.4. 
Samples: 5416400. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:00:48,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-18 23:00:49,528][26599] Updated weights for policy 0, policy_version 228144 (0.0037) [2024-06-18 23:00:53,157][26599] Updated weights for policy 0, policy_version 228154 (0.0049) [2024-06-18 23:00:53,380][26367] Fps is (10 sec: 45874.6, 60 sec: 41233.1, 300 sec: 30111.1). Total num frames: 3738075136. Throughput: 0: 41538.6. Samples: 5671360. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:00:53,381][26367] Avg episode reward: [(0, '0.341')] [2024-06-18 23:00:57,370][26599] Updated weights for policy 0, policy_version 228164 (0.0028) [2024-06-18 23:00:58,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 30353.5). Total num frames: 3738271744. Throughput: 0: 41768.5. Samples: 5923340. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:00:58,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-18 23:01:01,009][26599] Updated weights for policy 0, policy_version 228174 (0.0045) [2024-06-18 23:01:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 30583.4). Total num frames: 3738468352. Throughput: 0: 41717.7. Samples: 6045420. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:01:03,381][26367] Avg episode reward: [(0, '0.389')] [2024-06-18 23:01:05,191][26599] Updated weights for policy 0, policy_version 228184 (0.0028) [2024-06-18 23:01:08,384][26367] Fps is (10 sec: 42583.7, 60 sec: 41503.8, 300 sec: 30965.2). Total num frames: 3738697728. Throughput: 0: 41891.8. Samples: 6300580. 
Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:01:08,392][26367] Avg episode reward: [(0, '0.492')] [2024-06-18 23:01:08,839][26599] Updated weights for policy 0, policy_version 228194 (0.0045) [2024-06-18 23:01:13,021][26599] Updated weights for policy 0, policy_version 228204 (0.0037) [2024-06-18 23:01:13,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42052.3, 300 sec: 31249.5). Total num frames: 3738910720. Throughput: 0: 41772.4. Samples: 6543740. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:01:13,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-18 23:01:16,543][26599] Updated weights for policy 0, policy_version 228214 (0.0032) [2024-06-18 23:01:18,384][26367] Fps is (10 sec: 39321.0, 60 sec: 41503.7, 300 sec: 31363.1). Total num frames: 3739090944. Throughput: 0: 41657.5. Samples: 6665700. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:01:18,384][26367] Avg episode reward: [(0, '0.674')] [2024-06-18 23:01:20,883][26599] Updated weights for policy 0, policy_version 228224 (0.0030) [2024-06-18 23:01:23,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 31625.0). Total num frames: 3739303936. Throughput: 0: 41794.7. Samples: 6920680. Policy #0 lag: (min: 0.0, avg: 82.9, max: 244.0) [2024-06-18 23:01:23,380][26367] Avg episode reward: [(0, '0.621')] [2024-06-18 23:01:24,286][26599] Updated weights for policy 0, policy_version 228234 (0.0027) [2024-06-18 23:01:28,380][26367] Fps is (10 sec: 42613.3, 60 sec: 41506.0, 300 sec: 31874.3). Total num frames: 3739516928. Throughput: 0: 41517.2. Samples: 7162100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:01:28,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-18 23:01:28,726][26599] Updated weights for policy 0, policy_version 228244 (0.0025) [2024-06-18 23:01:32,173][26599] Updated weights for policy 0, policy_version 228254 (0.0026) [2024-06-18 23:01:33,381][26367] Fps is (10 sec: 40957.8, 60 sec: 41778.9, 300 sec: 32039.8). 
Total num frames: 3739713536. Throughput: 0: 41815.6. Samples: 7298120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:01:33,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-18 23:01:36,575][26599] Updated weights for policy 0, policy_version 228264 (0.0036) [2024-06-18 23:01:38,380][26367] Fps is (10 sec: 37683.7, 60 sec: 40960.3, 300 sec: 32126.9). Total num frames: 3739893760. Throughput: 0: 41537.9. Samples: 7540560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:01:38,380][26367] Avg episode reward: [(0, '0.567')] [2024-06-18 23:01:38,475][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000228266_3739910144.pth... [2024-06-18 23:01:38,556][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227814_3732504576.pth [2024-06-18 23:01:40,206][26599] Updated weights for policy 0, policy_version 228274 (0.0029) [2024-06-18 23:01:43,380][26367] Fps is (10 sec: 42599.7, 60 sec: 42052.2, 300 sec: 32489.1). Total num frames: 3740139520. Throughput: 0: 41517.7. Samples: 7791640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:01:43,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-18 23:01:44,597][26599] Updated weights for policy 0, policy_version 228284 (0.0044) [2024-06-18 23:01:47,905][26599] Updated weights for policy 0, policy_version 228294 (0.0028) [2024-06-18 23:01:48,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42052.2, 300 sec: 32768.0). Total num frames: 3740368896. Throughput: 0: 41659.7. Samples: 7920100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:01:48,381][26367] Avg episode reward: [(0, '0.768')] [2024-06-18 23:01:52,303][26599] Updated weights for policy 0, policy_version 228304 (0.0035) [2024-06-18 23:01:53,380][26367] Fps is (10 sec: 39322.1, 60 sec: 40960.1, 300 sec: 32768.0). Total num frames: 3740532736. Throughput: 0: 41417.9. Samples: 8164240. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:01:53,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-18 23:01:56,007][26599] Updated weights for policy 0, policy_version 228314 (0.0032) [2024-06-18 23:01:57,848][26579] Signal inference workers to stop experience collection... (100 times) [2024-06-18 23:01:57,858][26599] InferenceWorker_p0-w0: stopping experience collection (100 times) [2024-06-18 23:01:57,958][26579] Signal inference workers to resume experience collection... (100 times) [2024-06-18 23:01:57,958][26599] InferenceWorker_p0-w0: resuming experience collection (100 times) [2024-06-18 23:01:58,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 33030.1). Total num frames: 3740762112. Throughput: 0: 41520.0. Samples: 8412140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:01:58,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-18 23:02:00,617][26599] Updated weights for policy 0, policy_version 228324 (0.0034) [2024-06-18 23:02:03,384][26367] Fps is (10 sec: 44220.5, 60 sec: 41776.8, 300 sec: 33217.3). Total num frames: 3740975104. Throughput: 0: 41475.5. Samples: 8532100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:02:03,384][26367] Avg episode reward: [(0, '0.578')] [2024-06-18 23:02:04,403][26599] Updated weights for policy 0, policy_version 228334 (0.0029) [2024-06-18 23:02:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41235.4, 300 sec: 33335.1). Total num frames: 3741171712. Throughput: 0: 41364.2. Samples: 8782080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:02:08,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-18 23:02:08,817][26599] Updated weights for policy 0, policy_version 228344 (0.0032) [2024-06-18 23:02:12,377][26599] Updated weights for policy 0, policy_version 228354 (0.0038) [2024-06-18 23:02:13,380][26367] Fps is (10 sec: 39336.0, 60 sec: 40960.0, 300 sec: 33448.1). Total num frames: 3741368320. Throughput: 0: 41473.0. Samples: 9028380. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:02:13,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-18 23:02:16,689][26599] Updated weights for policy 0, policy_version 228364 (0.0042) [2024-06-18 23:02:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41781.7, 300 sec: 33678.2). Total num frames: 3741597696. Throughput: 0: 41381.2. Samples: 9160260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:02:18,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-18 23:02:19,986][26599] Updated weights for policy 0, policy_version 228374 (0.0040) [2024-06-18 23:02:23,380][26367] Fps is (10 sec: 39321.6, 60 sec: 40959.9, 300 sec: 33661.7). Total num frames: 3741761536. Throughput: 0: 41406.2. Samples: 9403840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:02:23,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-18 23:02:24,491][26599] Updated weights for policy 0, policy_version 228384 (0.0037) [2024-06-18 23:02:28,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 33879.8). Total num frames: 3741990912. Throughput: 0: 41373.8. Samples: 9653460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:02:28,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-18 23:02:28,442][26599] Updated weights for policy 0, policy_version 228394 (0.0035) [2024-06-18 23:02:32,174][26599] Updated weights for policy 0, policy_version 228404 (0.0035) [2024-06-18 23:02:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41233.4, 300 sec: 33975.2). Total num frames: 3742187520. Throughput: 0: 41496.0. Samples: 9787420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-18 23:02:33,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-18 23:02:36,113][26599] Updated weights for policy 0, policy_version 228414 (0.0043) [2024-06-18 23:02:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 34180.4). Total num frames: 3742416896. Throughput: 0: 41486.1. Samples: 10031120. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:02:38,385][26367] Avg episode reward: [(0, '0.613')] [2024-06-18 23:02:40,338][26599] Updated weights for policy 0, policy_version 228424 (0.0042) [2024-06-18 23:02:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41506.2, 300 sec: 34323.1). Total num frames: 3742629888. Throughput: 0: 41387.1. Samples: 10274560. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:02:43,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-18 23:02:43,946][26599] Updated weights for policy 0, policy_version 228434 (0.0034) [2024-06-18 23:02:48,218][26599] Updated weights for policy 0, policy_version 228444 (0.0038) [2024-06-18 23:02:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 34989.6). Total num frames: 3742826496. Throughput: 0: 41517.1. Samples: 10400220. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:02:48,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-18 23:02:51,829][26599] Updated weights for policy 0, policy_version 228454 (0.0035) [2024-06-18 23:02:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 35711.6). Total num frames: 3743039488. Throughput: 0: 41506.3. Samples: 10649860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:02:53,381][26367] Avg episode reward: [(0, '0.391')] [2024-06-18 23:02:56,059][26599] Updated weights for policy 0, policy_version 228464 (0.0033) [2024-06-18 23:02:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 35933.7). Total num frames: 3743268864. Throughput: 0: 41523.1. Samples: 10896920. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:02:58,381][26367] Avg episode reward: [(0, '0.793')] [2024-06-18 23:02:59,838][26599] Updated weights for policy 0, policy_version 228474 (0.0035) [2024-06-18 23:03:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 40962.5, 300 sec: 36433.6). Total num frames: 3743432704. Throughput: 0: 41509.0. Samples: 11028160. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:03:03,380][26367] Avg episode reward: [(0, '0.554')] [2024-06-18 23:03:04,009][26599] Updated weights for policy 0, policy_version 228484 (0.0033) [2024-06-18 23:03:07,639][26599] Updated weights for policy 0, policy_version 228494 (0.0042) [2024-06-18 23:03:08,380][26367] Fps is (10 sec: 39320.9, 60 sec: 41506.1, 300 sec: 37155.6). Total num frames: 3743662080. Throughput: 0: 41631.8. Samples: 11277280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:03:08,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-18 23:03:12,067][26599] Updated weights for policy 0, policy_version 228504 (0.0048) [2024-06-18 23:03:13,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 37822.1). Total num frames: 3743891456. Throughput: 0: 41534.7. Samples: 11522520. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:03:13,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-18 23:03:15,264][26579] Signal inference workers to stop experience collection... (150 times) [2024-06-18 23:03:15,272][26579] Signal inference workers to resume experience collection... (150 times) [2024-06-18 23:03:15,314][26599] InferenceWorker_p0-w0: stopping experience collection (150 times) [2024-06-18 23:03:15,315][26599] InferenceWorker_p0-w0: resuming experience collection (150 times) [2024-06-18 23:03:15,410][26599] Updated weights for policy 0, policy_version 228514 (0.0043) [2024-06-18 23:03:18,380][26367] Fps is (10 sec: 39322.5, 60 sec: 40960.1, 300 sec: 38155.3). Total num frames: 3744055296. Throughput: 0: 41480.0. Samples: 11654020. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:03:18,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-18 23:03:19,692][26599] Updated weights for policy 0, policy_version 228524 (0.0042) [2024-06-18 23:03:23,230][26599] Updated weights for policy 0, policy_version 228534 (0.0031) [2024-06-18 23:03:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 38766.2). Total num frames: 3744301056. Throughput: 0: 41369.4. Samples: 11892740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:03:23,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-18 23:03:27,672][26599] Updated weights for policy 0, policy_version 228544 (0.0037) [2024-06-18 23:03:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.2, 300 sec: 38988.4). Total num frames: 3744481280. Throughput: 0: 41730.3. Samples: 12152420. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:03:28,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-18 23:03:31,013][26599] Updated weights for policy 0, policy_version 228554 (0.0033) [2024-06-18 23:03:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 39432.7). Total num frames: 3744710656. Throughput: 0: 41660.5. Samples: 12274940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:03:33,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-18 23:03:35,451][26599] Updated weights for policy 0, policy_version 228564 (0.0035) [2024-06-18 23:03:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41779.3, 300 sec: 39710.4). Total num frames: 3744923648. Throughput: 0: 41629.4. Samples: 12523180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:03:38,380][26367] Avg episode reward: [(0, '0.665')] [2024-06-18 23:03:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000228572_3744923648.pth... 
[2024-06-18 23:03:38,453][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000227967_3735011328.pth [2024-06-18 23:03:38,862][26599] Updated weights for policy 0, policy_version 228574 (0.0036) [2024-06-18 23:03:43,355][26599] Updated weights for policy 0, policy_version 228584 (0.0034) [2024-06-18 23:03:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 39988.1). Total num frames: 3745120256. Throughput: 0: 41829.7. Samples: 12779260. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-18 23:03:43,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-18 23:03:46,999][26599] Updated weights for policy 0, policy_version 228594 (0.0032) [2024-06-18 23:03:48,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 40154.7). Total num frames: 3745316864. Throughput: 0: 41474.2. Samples: 12894500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:03:48,381][26367] Avg episode reward: [(0, '0.418')] [2024-06-18 23:03:51,368][26599] Updated weights for policy 0, policy_version 228604 (0.0044) [2024-06-18 23:03:53,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42052.4, 300 sec: 40432.4). Total num frames: 3745562624. Throughput: 0: 41534.0. Samples: 13146300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:03:53,380][26367] Avg episode reward: [(0, '0.515')] [2024-06-18 23:03:54,843][26599] Updated weights for policy 0, policy_version 228614 (0.0039) [2024-06-18 23:03:58,380][26367] Fps is (10 sec: 39321.4, 60 sec: 40686.9, 300 sec: 40376.8). Total num frames: 3745710080. Throughput: 0: 41831.9. Samples: 13404960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:03:58,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-18 23:03:59,118][26599] Updated weights for policy 0, policy_version 228624 (0.0035) [2024-06-18 23:04:02,567][26599] Updated weights for policy 0, policy_version 228634 (0.0039) [2024-06-18 23:04:03,380][26367] Fps is (10 sec: 37682.6, 60 sec: 41779.1, 300 sec: 40599.0). Total num frames: 3745939456. Throughput: 0: 41457.6. Samples: 13519620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:03,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-18 23:04:06,987][26599] Updated weights for policy 0, policy_version 228644 (0.0033) [2024-06-18 23:04:08,380][26367] Fps is (10 sec: 47514.2, 60 sec: 42052.4, 300 sec: 40876.7). Total num frames: 3746185216. Throughput: 0: 41837.0. Samples: 13775400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:08,380][26367] Avg episode reward: [(0, '0.550')] [2024-06-18 23:04:10,273][26599] Updated weights for policy 0, policy_version 228654 (0.0039) [2024-06-18 23:04:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 40876.7). Total num frames: 3746349056. Throughput: 0: 41694.1. Samples: 14028660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:13,381][26367] Avg episode reward: [(0, '0.385')] [2024-06-18 23:04:14,893][26599] Updated weights for policy 0, policy_version 228664 (0.0033) [2024-06-18 23:04:18,276][26599] Updated weights for policy 0, policy_version 228674 (0.0044) [2024-06-18 23:04:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41098.9). Total num frames: 3746594816. Throughput: 0: 41452.5. Samples: 14140300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:18,380][26367] Avg episode reward: [(0, '0.541')] [2024-06-18 23:04:22,628][26599] Updated weights for policy 0, policy_version 228684 (0.0044) [2024-06-18 23:04:23,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41506.2, 300 sec: 41154.4). 
Total num frames: 3746791424. Throughput: 0: 41851.1. Samples: 14406480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:23,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-18 23:04:25,929][26599] Updated weights for policy 0, policy_version 228694 (0.0036) [2024-06-18 23:04:28,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41210.0). Total num frames: 3746988032. Throughput: 0: 41528.1. Samples: 14648020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:28,380][26367] Avg episode reward: [(0, '0.592')] [2024-06-18 23:04:30,658][26599] Updated weights for policy 0, policy_version 228704 (0.0041) [2024-06-18 23:04:31,229][26579] Signal inference workers to stop experience collection... (200 times) [2024-06-18 23:04:31,271][26599] InferenceWorker_p0-w0: stopping experience collection (200 times) [2024-06-18 23:04:31,277][26579] Signal inference workers to resume experience collection... (200 times) [2024-06-18 23:04:31,295][26599] InferenceWorker_p0-w0: resuming experience collection (200 times) [2024-06-18 23:04:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 3747201024. Throughput: 0: 41673.4. Samples: 14769800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:33,381][26367] Avg episode reward: [(0, '0.457')] [2024-06-18 23:04:33,750][26599] Updated weights for policy 0, policy_version 228714 (0.0030) [2024-06-18 23:04:38,363][26599] Updated weights for policy 0, policy_version 228724 (0.0029) [2024-06-18 23:04:38,380][26367] Fps is (10 sec: 42597.4, 60 sec: 41506.0, 300 sec: 41432.1). Total num frames: 3747414016. Throughput: 0: 41809.1. Samples: 15027720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:38,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-18 23:04:41,659][26599] Updated weights for policy 0, policy_version 228734 (0.0035) [2024-06-18 23:04:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 3747610624. Throughput: 0: 41455.1. Samples: 15270440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:43,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-18 23:04:46,319][26599] Updated weights for policy 0, policy_version 228744 (0.0035) [2024-06-18 23:04:48,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 3747840000. Throughput: 0: 41717.0. Samples: 15396880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:48,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-18 23:04:49,837][26599] Updated weights for policy 0, policy_version 228754 (0.0035) [2024-06-18 23:04:53,380][26367] Fps is (10 sec: 37683.0, 60 sec: 40413.8, 300 sec: 41376.5). Total num frames: 3747987456. Throughput: 0: 41548.3. Samples: 15645080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-18 23:04:53,381][26367] Avg episode reward: [(0, '0.384')] [2024-06-18 23:04:54,138][26599] Updated weights for policy 0, policy_version 228764 (0.0041) [2024-06-18 23:04:57,558][26599] Updated weights for policy 0, policy_version 228774 (0.0039) [2024-06-18 23:04:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41543.2). Total num frames: 3748249600. Throughput: 0: 41376.9. Samples: 15890620. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:04:58,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-18 23:05:01,965][26599] Updated weights for policy 0, policy_version 228784 (0.0043) [2024-06-18 23:05:03,380][26367] Fps is (10 sec: 45875.7, 60 sec: 41779.3, 300 sec: 41487.6). Total num frames: 3748446208. Throughput: 0: 41771.1. Samples: 16020000. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:03,380][26367] Avg episode reward: [(0, '0.486')] [2024-06-18 23:05:06,208][26599] Updated weights for policy 0, policy_version 228794 (0.0029) [2024-06-18 23:05:08,380][26367] Fps is (10 sec: 39321.7, 60 sec: 40959.9, 300 sec: 41543.2). Total num frames: 3748642816. Throughput: 0: 41114.2. Samples: 16256620. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:08,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-18 23:05:10,413][26599] Updated weights for policy 0, policy_version 228804 (0.0042) [2024-06-18 23:05:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 3748872192. Throughput: 0: 41163.0. Samples: 16500360. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:13,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-18 23:05:14,003][26599] Updated weights for policy 0, policy_version 228814 (0.0045) [2024-06-18 23:05:18,199][26599] Updated weights for policy 0, policy_version 228824 (0.0036) [2024-06-18 23:05:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41487.6). Total num frames: 3749052416. Throughput: 0: 41317.3. Samples: 16629080. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:18,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-18 23:05:21,931][26599] Updated weights for policy 0, policy_version 228834 (0.0048) [2024-06-18 23:05:23,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 3749265408. Throughput: 0: 41140.0. Samples: 16879020. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:23,381][26367] Avg episode reward: [(0, '0.362')] [2024-06-18 23:05:26,147][26599] Updated weights for policy 0, policy_version 228844 (0.0028) [2024-06-18 23:05:28,380][26367] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 3749494784. Throughput: 0: 41147.0. Samples: 17122060. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:28,381][26367] Avg episode reward: [(0, '0.409')] [2024-06-18 23:05:30,051][26599] Updated weights for policy 0, policy_version 228854 (0.0036) [2024-06-18 23:05:33,380][26367] Fps is (10 sec: 39322.2, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 3749658624. Throughput: 0: 41256.5. Samples: 17253420. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:33,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-18 23:05:33,962][26599] Updated weights for policy 0, policy_version 228864 (0.0037) [2024-06-18 23:05:37,644][26599] Updated weights for policy 0, policy_version 228874 (0.0034) [2024-06-18 23:05:38,380][26367] Fps is (10 sec: 37683.6, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 3749871616. Throughput: 0: 41196.9. Samples: 17498940. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:38,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-18 23:05:38,433][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000228875_3749888000.pth... [2024-06-18 23:05:38,476][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000228266_3739910144.pth [2024-06-18 23:05:41,949][26599] Updated weights for policy 0, policy_version 228884 (0.0046) [2024-06-18 23:05:43,380][26367] Fps is (10 sec: 45875.2, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3750117376. Throughput: 0: 41239.2. Samples: 17746380. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:43,380][26367] Avg episode reward: [(0, '0.415')] [2024-06-18 23:05:45,556][26599] Updated weights for policy 0, policy_version 228894 (0.0050) [2024-06-18 23:05:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 40959.9, 300 sec: 41432.1). Total num frames: 3750297600. Throughput: 0: 41325.7. Samples: 17879660. 
Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:48,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-18 23:05:49,606][26599] Updated weights for policy 0, policy_version 228904 (0.0050) [2024-06-18 23:05:53,172][26599] Updated weights for policy 0, policy_version 228914 (0.0034) [2024-06-18 23:05:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 41543.2). Total num frames: 3750526976. Throughput: 0: 41519.9. Samples: 18125020. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:53,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-18 23:05:57,472][26599] Updated weights for policy 0, policy_version 228924 (0.0041) [2024-06-18 23:05:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 3750739968. Throughput: 0: 41713.0. Samples: 18377440. Policy #0 lag: (min: 0.0, avg: 7.6, max: 20.0) [2024-06-18 23:05:58,380][26367] Avg episode reward: [(0, '0.585')] [2024-06-18 23:06:01,532][26599] Updated weights for policy 0, policy_version 228934 (0.0040) [2024-06-18 23:06:03,380][26367] Fps is (10 sec: 37683.2, 60 sec: 40959.9, 300 sec: 41377.0). Total num frames: 3750903808. Throughput: 0: 41594.1. Samples: 18500820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:03,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-18 23:06:03,583][26579] Signal inference workers to stop experience collection... (250 times) [2024-06-18 23:06:03,585][26579] Signal inference workers to resume experience collection... (250 times) [2024-06-18 23:06:03,605][26599] InferenceWorker_p0-w0: stopping experience collection (250 times) [2024-06-18 23:06:03,605][26599] InferenceWorker_p0-w0: resuming experience collection (250 times) [2024-06-18 23:06:05,381][26599] Updated weights for policy 0, policy_version 228944 (0.0043) [2024-06-18 23:06:08,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 3751149568. Throughput: 0: 41505.8. 
Samples: 18746780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:08,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-18 23:06:09,196][26599] Updated weights for policy 0, policy_version 228954 (0.0044) [2024-06-18 23:06:13,053][26599] Updated weights for policy 0, policy_version 228964 (0.0040) [2024-06-18 23:06:13,380][26367] Fps is (10 sec: 45875.3, 60 sec: 41506.2, 300 sec: 41599.2). Total num frames: 3751362560. Throughput: 0: 41800.5. Samples: 19003080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:13,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-18 23:06:16,938][26599] Updated weights for policy 0, policy_version 228974 (0.0034) [2024-06-18 23:06:18,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3751542784. Throughput: 0: 41572.3. Samples: 19124180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:18,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-18 23:06:20,876][26599] Updated weights for policy 0, policy_version 228984 (0.0040) [2024-06-18 23:06:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 3751772160. Throughput: 0: 41665.4. Samples: 19373880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:23,380][26367] Avg episode reward: [(0, '0.487')] [2024-06-18 23:06:24,732][26599] Updated weights for policy 0, policy_version 228994 (0.0030) [2024-06-18 23:06:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3751968768. Throughput: 0: 41842.5. Samples: 19629300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:28,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-18 23:06:28,685][26599] Updated weights for policy 0, policy_version 229004 (0.0031) [2024-06-18 23:06:32,754][26599] Updated weights for policy 0, policy_version 229014 (0.0033) [2024-06-18 23:06:33,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 3752181760. Throughput: 0: 41474.3. Samples: 19746000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:33,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-18 23:06:36,527][26599] Updated weights for policy 0, policy_version 229024 (0.0028) [2024-06-18 23:06:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41543.2). Total num frames: 3752394752. Throughput: 0: 41613.8. Samples: 19997640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:38,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-18 23:06:40,416][26599] Updated weights for policy 0, policy_version 229034 (0.0032) [2024-06-18 23:06:43,384][26367] Fps is (10 sec: 39307.2, 60 sec: 40957.4, 300 sec: 41376.0). Total num frames: 3752574976. Throughput: 0: 41703.2. Samples: 20254240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:43,385][26367] Avg episode reward: [(0, '0.462')] [2024-06-18 23:06:44,659][26599] Updated weights for policy 0, policy_version 229044 (0.0031) [2024-06-18 23:06:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3752804352. Throughput: 0: 41452.0. Samples: 20366160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:48,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-18 23:06:48,515][26599] Updated weights for policy 0, policy_version 229054 (0.0031) [2024-06-18 23:06:52,481][26599] Updated weights for policy 0, policy_version 229064 (0.0036) [2024-06-18 23:06:53,380][26367] Fps is (10 sec: 44252.8, 60 sec: 41506.1, 300 sec: 41543.2). 
Total num frames: 3753017344. Throughput: 0: 41693.3. Samples: 20622980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:53,381][26367] Avg episode reward: [(0, '0.719')] [2024-06-18 23:06:56,674][26599] Updated weights for policy 0, policy_version 229074 (0.0044) [2024-06-18 23:06:58,380][26367] Fps is (10 sec: 39321.6, 60 sec: 40959.9, 300 sec: 41432.6). Total num frames: 3753197568. Throughput: 0: 41517.8. Samples: 20871380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:06:58,381][26367] Avg episode reward: [(0, '0.864')] [2024-06-18 23:07:00,240][26599] Updated weights for policy 0, policy_version 229084 (0.0038) [2024-06-18 23:07:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 3753426944. Throughput: 0: 41501.9. Samples: 20991760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:07:03,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-18 23:07:04,618][26599] Updated weights for policy 0, policy_version 229094 (0.0036) [2024-06-18 23:07:08,343][26599] Updated weights for policy 0, policy_version 229104 (0.0040) [2024-06-18 23:07:08,384][26367] Fps is (10 sec: 44220.8, 60 sec: 41503.6, 300 sec: 41598.2). Total num frames: 3753639936. Throughput: 0: 41506.3. Samples: 21241820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-18 23:07:08,385][26367] Avg episode reward: [(0, '0.362')] [2024-06-18 23:07:12,923][26599] Updated weights for policy 0, policy_version 229114 (0.0035) [2024-06-18 23:07:13,380][26367] Fps is (10 sec: 39320.8, 60 sec: 40959.9, 300 sec: 41432.1). Total num frames: 3753820160. Throughput: 0: 41331.0. Samples: 21489200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:13,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-18 23:07:16,235][26599] Updated weights for policy 0, policy_version 229124 (0.0027) [2024-06-18 23:07:18,380][26367] Fps is (10 sec: 40975.2, 60 sec: 41779.3, 300 sec: 41654.2). 
Total num frames: 3754049536. Throughput: 0: 41360.1. Samples: 21607200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:18,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-18 23:07:20,683][26599] Updated weights for policy 0, policy_version 229134 (0.0024) [2024-06-18 23:07:23,380][26367] Fps is (10 sec: 40960.8, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 3754229760. Throughput: 0: 41367.2. Samples: 21859160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:23,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-18 23:07:24,299][26599] Updated weights for policy 0, policy_version 229144 (0.0041) [2024-06-18 23:07:28,380][26367] Fps is (10 sec: 37683.2, 60 sec: 40960.1, 300 sec: 41487.6). Total num frames: 3754426368. Throughput: 0: 41122.5. Samples: 22104600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:28,380][26367] Avg episode reward: [(0, '0.495')] [2024-06-18 23:07:28,536][26599] Updated weights for policy 0, policy_version 229154 (0.0038) [2024-06-18 23:07:32,462][26599] Updated weights for policy 0, policy_version 229164 (0.0037) [2024-06-18 23:07:32,675][26579] Signal inference workers to stop experience collection... (300 times) [2024-06-18 23:07:32,677][26579] Signal inference workers to resume experience collection... (300 times) [2024-06-18 23:07:32,713][26599] InferenceWorker_p0-w0: stopping experience collection (300 times) [2024-06-18 23:07:32,713][26599] InferenceWorker_p0-w0: resuming experience collection (300 times) [2024-06-18 23:07:33,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 3754672128. Throughput: 0: 41430.7. Samples: 22230540. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:33,381][26367] Avg episode reward: [(0, '0.347')] [2024-06-18 23:07:36,436][26599] Updated weights for policy 0, policy_version 229174 (0.0046) [2024-06-18 23:07:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 3754852352. Throughput: 0: 41083.1. Samples: 22471720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:38,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-18 23:07:38,386][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000229178_3754852352.pth... [2024-06-18 23:07:38,437][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000228572_3744923648.pth [2024-06-18 23:07:40,374][26599] Updated weights for policy 0, policy_version 229184 (0.0028) [2024-06-18 23:07:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41508.6, 300 sec: 41487.6). Total num frames: 3755065344. Throughput: 0: 41033.4. Samples: 22717880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:43,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-18 23:07:44,563][26599] Updated weights for policy 0, policy_version 229194 (0.0030) [2024-06-18 23:07:48,269][26599] Updated weights for policy 0, policy_version 229204 (0.0033) [2024-06-18 23:07:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 3755278336. Throughput: 0: 41193.7. Samples: 22845480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:48,389][26367] Avg episode reward: [(0, '0.239')] [2024-06-18 23:07:52,535][26599] Updated weights for policy 0, policy_version 229214 (0.0033) [2024-06-18 23:07:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 3755474944. Throughput: 0: 40946.9. Samples: 23084280. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:53,389][26367] Avg episode reward: [(0, '0.548')] [2024-06-18 23:07:56,400][26599] Updated weights for policy 0, policy_version 229224 (0.0035) [2024-06-18 23:07:58,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3755704320. Throughput: 0: 40856.6. Samples: 23327740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:07:58,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-18 23:08:00,351][26599] Updated weights for policy 0, policy_version 229234 (0.0051) [2024-06-18 23:08:03,380][26367] Fps is (10 sec: 37683.6, 60 sec: 40413.9, 300 sec: 41321.0). Total num frames: 3755851776. Throughput: 0: 41154.7. Samples: 23459160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:08:03,380][26367] Avg episode reward: [(0, '0.724')] [2024-06-18 23:08:04,562][26599] Updated weights for policy 0, policy_version 229244 (0.0042) [2024-06-18 23:08:08,380][26367] Fps is (10 sec: 37682.7, 60 sec: 40689.3, 300 sec: 41321.0). Total num frames: 3756081152. Throughput: 0: 40942.1. Samples: 23701560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:08:08,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-18 23:08:08,757][26599] Updated weights for policy 0, policy_version 229254 (0.0042) [2024-06-18 23:08:12,416][26599] Updated weights for policy 0, policy_version 229264 (0.0035) [2024-06-18 23:08:13,380][26367] Fps is (10 sec: 45874.4, 60 sec: 41506.2, 300 sec: 41543.1). Total num frames: 3756310528. Throughput: 0: 40961.2. Samples: 23947860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:08:13,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-18 23:08:16,468][26599] Updated weights for policy 0, policy_version 229274 (0.0045) [2024-06-18 23:08:18,380][26367] Fps is (10 sec: 40960.8, 60 sec: 40686.9, 300 sec: 41321.0). Total num frames: 3756490752. Throughput: 0: 40884.5. Samples: 24070340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-18 23:08:18,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-18 23:08:20,335][26599] Updated weights for policy 0, policy_version 229284 (0.0049) [2024-06-18 23:08:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.0, 300 sec: 41487.6). Total num frames: 3756720128. Throughput: 0: 40903.9. Samples: 24312400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:08:23,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-18 23:08:24,193][26599] Updated weights for policy 0, policy_version 229294 (0.0057) [2024-06-18 23:08:28,164][26599] Updated weights for policy 0, policy_version 229304 (0.0034) [2024-06-18 23:08:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 3756916736. Throughput: 0: 41036.0. Samples: 24564500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:08:28,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-18 23:08:32,625][26599] Updated weights for policy 0, policy_version 229314 (0.0048) [2024-06-18 23:08:33,384][26367] Fps is (10 sec: 37669.6, 60 sec: 40411.4, 300 sec: 41264.9). Total num frames: 3757096960. Throughput: 0: 40890.5. Samples: 24685700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:08:33,385][26367] Avg episode reward: [(0, '0.506')] [2024-06-18 23:08:35,922][26599] Updated weights for policy 0, policy_version 229324 (0.0039) [2024-06-18 23:08:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 3757342720. Throughput: 0: 41140.4. Samples: 24935600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:08:38,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-18 23:08:40,373][26599] Updated weights for policy 0, policy_version 229334 (0.0028) [2024-06-18 23:08:43,380][26367] Fps is (10 sec: 42614.7, 60 sec: 40960.1, 300 sec: 41376.6). Total num frames: 3757522944. Throughput: 0: 41271.2. Samples: 25184940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:08:43,380][26367] Avg episode reward: [(0, '0.479')] [2024-06-18 23:08:44,187][26599] Updated weights for policy 0, policy_version 229344 (0.0043) [2024-06-18 23:08:48,260][26599] Updated weights for policy 0, policy_version 229354 (0.0038) [2024-06-18 23:08:48,383][26367] Fps is (10 sec: 39310.7, 60 sec: 40958.1, 300 sec: 41265.1). Total num frames: 3757735936. Throughput: 0: 41062.2. Samples: 25307080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:08:48,384][26367] Avg episode reward: [(0, '0.356')] [2024-06-18 23:08:51,856][26599] Updated weights for policy 0, policy_version 229364 (0.0047) [2024-06-18 23:08:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 3757948928. Throughput: 0: 41204.1. Samples: 25555740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:08:53,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-18 23:08:56,244][26599] Updated weights for policy 0, policy_version 229374 (0.0039) [2024-06-18 23:08:58,384][26367] Fps is (10 sec: 40957.1, 60 sec: 40684.5, 300 sec: 41376.1). Total num frames: 3758145536. Throughput: 0: 41324.4. Samples: 25807600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:08:58,384][26367] Avg episode reward: [(0, '0.667')] [2024-06-18 23:08:59,862][26599] Updated weights for policy 0, policy_version 229384 (0.0031) [2024-06-18 23:09:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41265.4). Total num frames: 3758358528. Throughput: 0: 41308.8. Samples: 25929240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:09:03,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-18 23:09:04,057][26599] Updated weights for policy 0, policy_version 229394 (0.0045) [2024-06-18 23:09:05,622][26579] Signal inference workers to stop experience collection... 
(350 times) [2024-06-18 23:09:05,658][26599] InferenceWorker_p0-w0: stopping experience collection (350 times) [2024-06-18 23:09:05,670][26579] Signal inference workers to resume experience collection... (350 times) [2024-06-18 23:09:05,677][26599] InferenceWorker_p0-w0: resuming experience collection (350 times) [2024-06-18 23:09:07,654][26599] Updated weights for policy 0, policy_version 229404 (0.0036) [2024-06-18 23:09:08,380][26367] Fps is (10 sec: 42613.3, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 3758571520. Throughput: 0: 41459.6. Samples: 26178080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:09:08,381][26367] Avg episode reward: [(0, '0.763')] [2024-06-18 23:09:12,000][26599] Updated weights for policy 0, policy_version 229414 (0.0052) [2024-06-18 23:09:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41265.4). Total num frames: 3758768128. Throughput: 0: 41575.5. Samples: 26435400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:09:13,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-18 23:09:15,527][26599] Updated weights for policy 0, policy_version 229424 (0.0043) [2024-06-18 23:09:18,383][26367] Fps is (10 sec: 40948.2, 60 sec: 41504.0, 300 sec: 41320.6). Total num frames: 3758981120. Throughput: 0: 41478.9. Samples: 26552220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:09:18,384][26367] Avg episode reward: [(0, '0.677')] [2024-06-18 23:09:19,880][26599] Updated weights for policy 0, policy_version 229434 (0.0041) [2024-06-18 23:09:23,384][26367] Fps is (10 sec: 42583.1, 60 sec: 41230.6, 300 sec: 41376.0). Total num frames: 3759194112. Throughput: 0: 41339.8. Samples: 26796040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:09:23,385][26367] Avg episode reward: [(0, '0.712')] [2024-06-18 23:09:23,794][26599] Updated weights for policy 0, policy_version 229444 (0.0049) [2024-06-18 23:09:28,201][26599] Updated weights for policy 0, policy_version 229454 (0.0044) [2024-06-18 23:09:28,380][26367] Fps is (10 sec: 39332.6, 60 sec: 40959.9, 300 sec: 41265.4). Total num frames: 3759374336. Throughput: 0: 41372.7. Samples: 27046720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-18 23:09:28,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-18 23:09:31,571][26599] Updated weights for policy 0, policy_version 229464 (0.0026) [2024-06-18 23:09:33,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42054.8, 300 sec: 41376.6). Total num frames: 3759620096. Throughput: 0: 41304.4. Samples: 27165660. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:09:33,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-18 23:09:36,048][26599] Updated weights for policy 0, policy_version 229474 (0.0030) [2024-06-18 23:09:38,380][26367] Fps is (10 sec: 44237.8, 60 sec: 41233.2, 300 sec: 41376.6). Total num frames: 3759816704. Throughput: 0: 41477.0. Samples: 27422200. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:09:38,380][26367] Avg episode reward: [(0, '0.500')] [2024-06-18 23:09:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000229481_3759816704.pth... [2024-06-18 23:09:38,454][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000228875_3749888000.pth [2024-06-18 23:09:39,397][26599] Updated weights for policy 0, policy_version 229484 (0.0030) [2024-06-18 23:09:43,384][26367] Fps is (10 sec: 39307.3, 60 sec: 41503.5, 300 sec: 41264.9). Total num frames: 3760013312. Throughput: 0: 41368.3. Samples: 27669180. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:09:43,385][26367] Avg episode reward: [(0, '0.500')] [2024-06-18 23:09:43,988][26599] Updated weights for policy 0, policy_version 229494 (0.0043) [2024-06-18 23:09:47,274][26599] Updated weights for policy 0, policy_version 229504 (0.0049) [2024-06-18 23:09:48,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41508.1, 300 sec: 41487.6). Total num frames: 3760226304. Throughput: 0: 41456.0. Samples: 27794760. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:09:48,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-18 23:09:51,766][26599] Updated weights for policy 0, policy_version 229514 (0.0057) [2024-06-18 23:09:53,380][26367] Fps is (10 sec: 39335.9, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 3760406528. Throughput: 0: 41552.5. Samples: 28047940. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:09:53,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-18 23:09:54,996][26599] Updated weights for policy 0, policy_version 229524 (0.0023) [2024-06-18 23:09:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41508.6, 300 sec: 41321.0). Total num frames: 3760635904. Throughput: 0: 41225.0. Samples: 28290520. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:09:58,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-18 23:09:59,639][26599] Updated weights for policy 0, policy_version 229534 (0.0029) [2024-06-18 23:10:02,838][26599] Updated weights for policy 0, policy_version 229544 (0.0047) [2024-06-18 23:10:03,382][26367] Fps is (10 sec: 44230.0, 60 sec: 41505.1, 300 sec: 41376.3). Total num frames: 3760848896. Throughput: 0: 41431.9. Samples: 28416600. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:10:03,382][26367] Avg episode reward: [(0, '0.570')] [2024-06-18 23:10:07,671][26599] Updated weights for policy 0, policy_version 229554 (0.0058) [2024-06-18 23:10:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 41265.5). 
Total num frames: 3761045504. Throughput: 0: 41549.7. Samples: 28665620. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:10:08,380][26367] Avg episode reward: [(0, '0.632')] [2024-06-18 23:10:10,691][26599] Updated weights for policy 0, policy_version 229564 (0.0035) [2024-06-18 23:10:13,380][26367] Fps is (10 sec: 40966.3, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 3761258496. Throughput: 0: 41451.7. Samples: 28912040. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:10:13,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-18 23:10:15,607][26599] Updated weights for policy 0, policy_version 229574 (0.0036) [2024-06-18 23:10:18,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41781.3, 300 sec: 41432.1). Total num frames: 3761487872. Throughput: 0: 41536.1. Samples: 29034780. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:10:18,380][26367] Avg episode reward: [(0, '0.638')] [2024-06-18 23:10:18,654][26599] Updated weights for policy 0, policy_version 229584 (0.0045) [2024-06-18 23:10:23,380][26367] Fps is (10 sec: 37683.6, 60 sec: 40689.5, 300 sec: 41154.4). Total num frames: 3761635328. Throughput: 0: 41267.1. Samples: 29279220. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:10:23,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-18 23:10:23,622][26599] Updated weights for policy 0, policy_version 229594 (0.0026) [2024-06-18 23:10:23,886][26579] Signal inference workers to stop experience collection... (400 times) [2024-06-18 23:10:23,886][26579] Signal inference workers to resume experience collection... 
(400 times) [2024-06-18 23:10:23,901][26599] InferenceWorker_p0-w0: stopping experience collection (400 times) [2024-06-18 23:10:23,902][26599] InferenceWorker_p0-w0: resuming experience collection (400 times) [2024-06-18 23:10:26,749][26599] Updated weights for policy 0, policy_version 229604 (0.0035) [2024-06-18 23:10:28,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41506.3, 300 sec: 41376.5). Total num frames: 3761864704. Throughput: 0: 41371.0. Samples: 29530720. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:10:28,380][26367] Avg episode reward: [(0, '0.730')] [2024-06-18 23:10:31,490][26599] Updated weights for policy 0, policy_version 229614 (0.0033) [2024-06-18 23:10:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 3762077696. Throughput: 0: 41294.7. Samples: 29653020. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-18 23:10:33,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-18 23:10:34,761][26599] Updated weights for policy 0, policy_version 229624 (0.0028) [2024-06-18 23:10:38,380][26367] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 3762274304. Throughput: 0: 41161.8. Samples: 29900220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:10:38,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-18 23:10:39,285][26599] Updated weights for policy 0, policy_version 229634 (0.0036) [2024-06-18 23:10:42,533][26599] Updated weights for policy 0, policy_version 229644 (0.0039) [2024-06-18 23:10:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41508.6, 300 sec: 41376.5). Total num frames: 3762503680. Throughput: 0: 41325.7. Samples: 30150180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:10:43,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-18 23:10:47,316][26599] Updated weights for policy 0, policy_version 229654 (0.0026) [2024-06-18 23:10:48,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41506.2, 300 sec: 41321.0). 
Total num frames: 3762716672. Throughput: 0: 41396.2. Samples: 30279360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:10:48,381][26367] Avg episode reward: [(0, '0.363')] [2024-06-18 23:10:50,327][26599] Updated weights for policy 0, policy_version 229664 (0.0037) [2024-06-18 23:10:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 3762896896. Throughput: 0: 41232.2. Samples: 30521080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:10:53,388][26367] Avg episode reward: [(0, '0.521')] [2024-06-18 23:10:55,235][26599] Updated weights for policy 0, policy_version 229674 (0.0038) [2024-06-18 23:10:58,359][26599] Updated weights for policy 0, policy_version 229684 (0.0039) [2024-06-18 23:10:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41487.6). Total num frames: 3763142656. Throughput: 0: 41049.0. Samples: 30759240. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:10:58,380][26367] Avg episode reward: [(0, '0.356')] [2024-06-18 23:11:03,213][26599] Updated weights for policy 0, policy_version 229694 (0.0037) [2024-06-18 23:11:03,384][26367] Fps is (10 sec: 40945.5, 60 sec: 40958.6, 300 sec: 41209.4). Total num frames: 3763306496. Throughput: 0: 41221.0. Samples: 30889880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:11:03,384][26367] Avg episode reward: [(0, '0.624')] [2024-06-18 23:11:06,455][26599] Updated weights for policy 0, policy_version 229704 (0.0033) [2024-06-18 23:11:08,380][26367] Fps is (10 sec: 37682.9, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 3763519488. Throughput: 0: 41168.4. Samples: 31131800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:11:08,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-18 23:11:10,991][26599] Updated weights for policy 0, policy_version 229714 (0.0038) [2024-06-18 23:11:13,380][26367] Fps is (10 sec: 42613.4, 60 sec: 41233.0, 300 sec: 41321.0). 
Total num frames: 3763732480. Throughput: 0: 41115.8. Samples: 31380940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:11:13,392][26367] Avg episode reward: [(0, '0.449')] [2024-06-18 23:11:14,683][26599] Updated weights for policy 0, policy_version 229724 (0.0049) [2024-06-18 23:11:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 3763929088. Throughput: 0: 41226.3. Samples: 31508200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:11:18,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-18 23:11:18,818][26599] Updated weights for policy 0, policy_version 229734 (0.0048) [2024-06-18 23:11:22,576][26599] Updated weights for policy 0, policy_version 229744 (0.0033) [2024-06-18 23:11:23,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 3764125696. Throughput: 0: 41180.9. Samples: 31753360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:11:23,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-18 23:11:26,868][26599] Updated weights for policy 0, policy_version 229754 (0.0045) [2024-06-18 23:11:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 3764371456. Throughput: 0: 41085.9. Samples: 31999040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:11:28,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-18 23:11:30,437][26599] Updated weights for policy 0, policy_version 229764 (0.0031) [2024-06-18 23:11:33,380][26367] Fps is (10 sec: 40960.5, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 3764535296. Throughput: 0: 40986.3. Samples: 32123740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:11:33,380][26367] Avg episode reward: [(0, '0.328')] [2024-06-18 23:11:34,870][26599] Updated weights for policy 0, policy_version 229774 (0.0033) [2024-06-18 23:11:38,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 41321.5). 
Total num frames: 3764764672. Throughput: 0: 41051.5. Samples: 32368400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:11:38,381][26367] Avg episode reward: [(0, '0.291')] [2024-06-18 23:11:38,393][26599] Updated weights for policy 0, policy_version 229784 (0.0040) [2024-06-18 23:11:38,400][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000229784_3764781056.pth... [2024-06-18 23:11:38,454][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000229178_3754852352.pth [2024-06-18 23:11:42,557][26579] Signal inference workers to stop experience collection... (450 times) [2024-06-18 23:11:42,558][26579] Signal inference workers to resume experience collection... (450 times) [2024-06-18 23:11:42,579][26599] InferenceWorker_p0-w0: stopping experience collection (450 times) [2024-06-18 23:11:42,579][26599] InferenceWorker_p0-w0: resuming experience collection (450 times) [2024-06-18 23:11:42,707][26599] Updated weights for policy 0, policy_version 229794 (0.0044) [2024-06-18 23:11:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 3764961280. Throughput: 0: 41262.2. Samples: 32616040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 25.0) [2024-06-18 23:11:43,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-18 23:11:46,498][26599] Updated weights for policy 0, policy_version 229804 (0.0047) [2024-06-18 23:11:48,380][26367] Fps is (10 sec: 40961.0, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 3765174272. Throughput: 0: 41086.5. Samples: 32738620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:11:48,380][26367] Avg episode reward: [(0, '0.611')] [2024-06-18 23:11:50,891][26599] Updated weights for policy 0, policy_version 229814 (0.0034) [2024-06-18 23:11:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 3765387264. Throughput: 0: 41128.9. Samples: 32982600. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:11:53,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-18 23:11:54,447][26599] Updated weights for policy 0, policy_version 229824 (0.0050) [2024-06-18 23:11:58,380][26367] Fps is (10 sec: 39320.5, 60 sec: 40413.7, 300 sec: 41154.4). Total num frames: 3765567488. Throughput: 0: 41270.2. Samples: 33238100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:11:58,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-18 23:11:58,799][26599] Updated weights for policy 0, policy_version 229834 (0.0027) [2024-06-18 23:12:02,500][26599] Updated weights for policy 0, policy_version 229844 (0.0023) [2024-06-18 23:12:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41235.6, 300 sec: 41154.9). Total num frames: 3765780480. Throughput: 0: 41008.0. Samples: 33353560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:03,380][26367] Avg episode reward: [(0, '0.516')] [2024-06-18 23:12:06,860][26599] Updated weights for policy 0, policy_version 229854 (0.0048) [2024-06-18 23:12:08,381][26367] Fps is (10 sec: 44232.5, 60 sec: 41505.3, 300 sec: 41320.9). Total num frames: 3766009856. Throughput: 0: 41093.2. Samples: 33602600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:08,382][26367] Avg episode reward: [(0, '0.744')] [2024-06-18 23:12:10,540][26599] Updated weights for policy 0, policy_version 229864 (0.0050) [2024-06-18 23:12:13,380][26367] Fps is (10 sec: 39321.1, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 3766173696. Throughput: 0: 41209.7. Samples: 33853480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:13,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-18 23:12:14,942][26599] Updated weights for policy 0, policy_version 229874 (0.0043) [2024-06-18 23:12:18,380][26367] Fps is (10 sec: 39325.9, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 3766403072. Throughput: 0: 41051.4. Samples: 33971060. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:18,386][26367] Avg episode reward: [(0, '0.590')] [2024-06-18 23:12:18,446][26599] Updated weights for policy 0, policy_version 229884 (0.0034) [2024-06-18 23:12:22,796][26599] Updated weights for policy 0, policy_version 229894 (0.0042) [2024-06-18 23:12:23,380][26367] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 3766616064. Throughput: 0: 41177.9. Samples: 34221400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:23,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-18 23:12:27,005][26599] Updated weights for policy 0, policy_version 229904 (0.0030) [2024-06-18 23:12:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 3766812672. Throughput: 0: 41057.2. Samples: 34463620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:28,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-18 23:12:30,775][26599] Updated weights for policy 0, policy_version 229914 (0.0041) [2024-06-18 23:12:33,384][26367] Fps is (10 sec: 40944.9, 60 sec: 41503.5, 300 sec: 41265.0). Total num frames: 3767025664. Throughput: 0: 41006.8. Samples: 34584080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:33,385][26367] Avg episode reward: [(0, '0.458')] [2024-06-18 23:12:35,058][26599] Updated weights for policy 0, policy_version 229924 (0.0039) [2024-06-18 23:12:38,380][26367] Fps is (10 sec: 39321.7, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 3767205888. Throughput: 0: 41132.4. Samples: 34833560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:38,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-18 23:12:38,626][26599] Updated weights for policy 0, policy_version 229934 (0.0041) [2024-06-18 23:12:42,904][26599] Updated weights for policy 0, policy_version 229944 (0.0053) [2024-06-18 23:12:43,380][26367] Fps is (10 sec: 39336.3, 60 sec: 40960.0, 300 sec: 41154.4). 
Total num frames: 3767418880. Throughput: 0: 40905.1. Samples: 35078820. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:43,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-18 23:12:46,510][26599] Updated weights for policy 0, policy_version 229954 (0.0054) [2024-06-18 23:12:48,380][26367] Fps is (10 sec: 44237.1, 60 sec: 41233.0, 300 sec: 41265.5). Total num frames: 3767648256. Throughput: 0: 41106.2. Samples: 35203340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:48,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-18 23:12:50,658][26599] Updated weights for policy 0, policy_version 229964 (0.0038) [2024-06-18 23:12:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 3767828480. Throughput: 0: 41162.0. Samples: 35454840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-18 23:12:53,380][26367] Avg episode reward: [(0, '0.606')] [2024-06-18 23:12:54,514][26599] Updated weights for policy 0, policy_version 229974 (0.0031) [2024-06-18 23:12:58,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41233.3, 300 sec: 41321.0). Total num frames: 3768041472. Throughput: 0: 41074.4. Samples: 35701820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:12:58,380][26367] Avg episode reward: [(0, '0.516')] [2024-06-18 23:12:58,778][26599] Updated weights for policy 0, policy_version 229984 (0.0027) [2024-06-18 23:13:02,442][26599] Updated weights for policy 0, policy_version 229994 (0.0034) [2024-06-18 23:13:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41210.0). Total num frames: 3768238080. Throughput: 0: 41207.3. Samples: 35825380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:03,380][26367] Avg episode reward: [(0, '0.266')] [2024-06-18 23:13:06,591][26599] Updated weights for policy 0, policy_version 230004 (0.0031) [2024-06-18 23:13:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40687.8, 300 sec: 41154.4). 
Total num frames: 3768451072. Throughput: 0: 41096.1. Samples: 36070720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:08,380][26367] Avg episode reward: [(0, '0.335')] [2024-06-18 23:13:10,331][26599] Updated weights for policy 0, policy_version 230014 (0.0030) [2024-06-18 23:13:13,384][26367] Fps is (10 sec: 42582.4, 60 sec: 41503.7, 300 sec: 41264.9). Total num frames: 3768664064. Throughput: 0: 41138.5. Samples: 36315000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:13,384][26367] Avg episode reward: [(0, '0.335')] [2024-06-18 23:13:14,719][26599] Updated weights for policy 0, policy_version 230024 (0.0043) [2024-06-18 23:13:18,350][26599] Updated weights for policy 0, policy_version 230034 (0.0040) [2024-06-18 23:13:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41233.2, 300 sec: 41210.0). Total num frames: 3768877056. Throughput: 0: 41150.1. Samples: 36435680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:18,380][26367] Avg episode reward: [(0, '0.470')] [2024-06-18 23:13:23,002][26599] Updated weights for policy 0, policy_version 230044 (0.0034) [2024-06-18 23:13:23,380][26367] Fps is (10 sec: 39335.5, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 3769057280. Throughput: 0: 41084.4. Samples: 36682360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:23,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-18 23:13:26,318][26599] Updated weights for policy 0, policy_version 230054 (0.0031) [2024-06-18 23:13:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41233.2, 300 sec: 41321.5). Total num frames: 3769286656. Throughput: 0: 41145.8. Samples: 36930380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:28,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-18 23:13:31,169][26599] Updated weights for policy 0, policy_version 230064 (0.0048) [2024-06-18 23:13:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40689.4, 300 sec: 41098.9). 
Total num frames: 3769466880. Throughput: 0: 41238.6. Samples: 37059080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:33,384][26367] Avg episode reward: [(0, '0.665')] [2024-06-18 23:13:34,597][26599] Updated weights for policy 0, policy_version 230074 (0.0037) [2024-06-18 23:13:38,380][26367] Fps is (10 sec: 37682.6, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 3769663488. Throughput: 0: 40954.5. Samples: 37297800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:38,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-18 23:13:38,418][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000230082_3769663488.pth... [2024-06-18 23:13:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000229481_3759816704.pth [2024-06-18 23:13:38,995][26579] Signal inference workers to stop experience collection... (500 times) [2024-06-18 23:13:38,997][26579] Signal inference workers to resume experience collection... (500 times) [2024-06-18 23:13:39,043][26599] InferenceWorker_p0-w0: stopping experience collection (500 times) [2024-06-18 23:13:39,043][26599] InferenceWorker_p0-w0: resuming experience collection (500 times) [2024-06-18 23:13:39,133][26599] Updated weights for policy 0, policy_version 230084 (0.0033) [2024-06-18 23:13:42,439][26599] Updated weights for policy 0, policy_version 230094 (0.0042) [2024-06-18 23:13:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41210.3). Total num frames: 3769892864. Throughput: 0: 40829.6. Samples: 37539160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:43,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-18 23:13:47,093][26599] Updated weights for policy 0, policy_version 230104 (0.0031) [2024-06-18 23:13:48,384][26367] Fps is (10 sec: 44221.1, 60 sec: 40957.5, 300 sec: 41209.4). Total num frames: 3770105856. Throughput: 0: 40911.3. Samples: 37666540. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:48,384][26367] Avg episode reward: [(0, '0.533')] [2024-06-18 23:13:50,296][26599] Updated weights for policy 0, policy_version 230114 (0.0042) [2024-06-18 23:13:53,380][26367] Fps is (10 sec: 39322.3, 60 sec: 40960.0, 300 sec: 41154.9). Total num frames: 3770286080. Throughput: 0: 41005.8. Samples: 37915980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:53,380][26367] Avg episode reward: [(0, '0.568')] [2024-06-18 23:13:54,911][26599] Updated weights for policy 0, policy_version 230124 (0.0041) [2024-06-18 23:13:58,170][26599] Updated weights for policy 0, policy_version 230134 (0.0034) [2024-06-18 23:13:58,380][26367] Fps is (10 sec: 42613.6, 60 sec: 41506.0, 300 sec: 41265.5). Total num frames: 3770531840. Throughput: 0: 41102.8. Samples: 38164480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:13:58,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-18 23:14:02,635][26599] Updated weights for policy 0, policy_version 230144 (0.0037) [2024-06-18 23:14:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 3770695680. Throughput: 0: 41234.6. Samples: 38291240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-18 23:14:03,380][26367] Avg episode reward: [(0, '0.372')] [2024-06-18 23:14:06,007][26599] Updated weights for policy 0, policy_version 230154 (0.0026) [2024-06-18 23:14:08,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41232.9, 300 sec: 41209.9). Total num frames: 3770925056. Throughput: 0: 41138.6. Samples: 38533600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:08,381][26367] Avg episode reward: [(0, '0.418')] [2024-06-18 23:14:10,228][26599] Updated weights for policy 0, policy_version 230164 (0.0040) [2024-06-18 23:14:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 40962.5, 300 sec: 41154.8). Total num frames: 3771121664. Throughput: 0: 41249.8. Samples: 38786620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:13,380][26367] Avg episode reward: [(0, '0.800')] [2024-06-18 23:14:13,843][26599] Updated weights for policy 0, policy_version 230174 (0.0039) [2024-06-18 23:14:18,380][26367] Fps is (10 sec: 39322.0, 60 sec: 40686.8, 300 sec: 41099.4). Total num frames: 3771318272. Throughput: 0: 40978.7. Samples: 38903120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:18,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-18 23:14:18,578][26599] Updated weights for policy 0, policy_version 230184 (0.0053) [2024-06-18 23:14:22,221][26599] Updated weights for policy 0, policy_version 230194 (0.0049) [2024-06-18 23:14:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41779.3, 300 sec: 41321.0). Total num frames: 3771564032. Throughput: 0: 41168.1. Samples: 39150360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:23,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-18 23:14:26,550][26599] Updated weights for policy 0, policy_version 230204 (0.0027) [2024-06-18 23:14:28,384][26367] Fps is (10 sec: 42583.1, 60 sec: 40957.5, 300 sec: 41098.3). Total num frames: 3771744256. Throughput: 0: 41346.9. Samples: 39399920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:28,393][26367] Avg episode reward: [(0, '0.638')] [2024-06-18 23:14:30,472][26599] Updated weights for policy 0, policy_version 230214 (0.0044) [2024-06-18 23:14:33,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 3771940864. Throughput: 0: 41157.6. Samples: 39518480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:33,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-18 23:14:34,364][26599] Updated weights for policy 0, policy_version 230224 (0.0037) [2024-06-18 23:14:38,380][26367] Fps is (10 sec: 39336.1, 60 sec: 41233.2, 300 sec: 41099.4). Total num frames: 3772137472. Throughput: 0: 41067.9. Samples: 39764040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:38,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-18 23:14:38,515][26599] Updated weights for policy 0, policy_version 230234 (0.0045) [2024-06-18 23:14:42,393][26599] Updated weights for policy 0, policy_version 230244 (0.0041) [2024-06-18 23:14:43,380][26367] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 3772350464. Throughput: 0: 40980.6. Samples: 40008600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:43,381][26367] Avg episode reward: [(0, '0.792')] [2024-06-18 23:14:46,763][26599] Updated weights for policy 0, policy_version 230254 (0.0029) [2024-06-18 23:14:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 40962.5, 300 sec: 41209.9). Total num frames: 3772563456. Throughput: 0: 41034.2. Samples: 40137780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:48,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-18 23:14:50,256][26599] Updated weights for policy 0, policy_version 230264 (0.0037) [2024-06-18 23:14:52,294][26579] Signal inference workers to stop experience collection... (550 times) [2024-06-18 23:14:52,294][26579] Signal inference workers to resume experience collection... (550 times) [2024-06-18 23:14:52,306][26599] InferenceWorker_p0-w0: stopping experience collection (550 times) [2024-06-18 23:14:52,306][26599] InferenceWorker_p0-w0: resuming experience collection (550 times) [2024-06-18 23:14:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.0, 300 sec: 41154.4). Total num frames: 3772776448. Throughput: 0: 41115.7. Samples: 40383800. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:53,382][26367] Avg episode reward: [(0, '0.685')] [2024-06-18 23:14:54,732][26599] Updated weights for policy 0, policy_version 230274 (0.0035) [2024-06-18 23:14:58,323][26599] Updated weights for policy 0, policy_version 230284 (0.0029) [2024-06-18 23:14:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 40687.0, 300 sec: 41099.1). Total num frames: 3772973056. Throughput: 0: 40952.4. Samples: 40629480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:14:58,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-18 23:15:02,459][26599] Updated weights for policy 0, policy_version 230294 (0.0033) [2024-06-18 23:15:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 3773186048. Throughput: 0: 40956.0. Samples: 40746140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:15:03,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-18 23:15:06,553][26599] Updated weights for policy 0, policy_version 230304 (0.0035) [2024-06-18 23:15:08,384][26367] Fps is (10 sec: 40945.2, 60 sec: 40957.6, 300 sec: 41098.3). Total num frames: 3773382656. Throughput: 0: 41057.1. Samples: 40998080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:15:08,393][26367] Avg episode reward: [(0, '0.347')] [2024-06-18 23:15:10,243][26599] Updated weights for policy 0, policy_version 230314 (0.0049) [2024-06-18 23:15:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 3773595648. Throughput: 0: 40912.6. Samples: 41240840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-18 23:15:13,381][26367] Avg episode reward: [(0, '0.370')] [2024-06-18 23:15:14,423][26599] Updated weights for policy 0, policy_version 230324 (0.0032) [2024-06-18 23:15:18,159][26599] Updated weights for policy 0, policy_version 230334 (0.0032) [2024-06-18 23:15:18,380][26367] Fps is (10 sec: 40974.5, 60 sec: 41233.0, 300 sec: 41209.9). 
Total num frames: 3773792256. Throughput: 0: 41023.4. Samples: 41364540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:15:18,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-18 23:15:22,299][26599] Updated weights for policy 0, policy_version 230344 (0.0044) [2024-06-18 23:15:23,380][26367] Fps is (10 sec: 37683.4, 60 sec: 40140.8, 300 sec: 41043.3). Total num frames: 3773972480. Throughput: 0: 41022.2. Samples: 41610040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:15:23,380][26367] Avg episode reward: [(0, '0.570')] [2024-06-18 23:15:26,508][26599] Updated weights for policy 0, policy_version 230354 (0.0031) [2024-06-18 23:15:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40962.4, 300 sec: 41098.8). Total num frames: 3774201856. Throughput: 0: 41036.8. Samples: 41855260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:15:28,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-18 23:15:30,368][26599] Updated weights for policy 0, policy_version 230364 (0.0037) [2024-06-18 23:15:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 3774398464. Throughput: 0: 41046.7. Samples: 41984880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:15:33,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-18 23:15:34,372][26599] Updated weights for policy 0, policy_version 230374 (0.0043) [2024-06-18 23:15:38,283][26599] Updated weights for policy 0, policy_version 230384 (0.0046) [2024-06-18 23:15:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 3774611456. Throughput: 0: 40900.0. Samples: 42224300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:15:38,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-18 23:15:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000230384_3774611456.pth... 
[2024-06-18 23:15:38,449][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000229784_3764781056.pth [2024-06-18 23:15:42,339][26599] Updated weights for policy 0, policy_version 230394 (0.0034) [2024-06-18 23:15:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 3774824448. Throughput: 0: 40970.7. Samples: 42473160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:15:43,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-18 23:15:46,087][26599] Updated weights for policy 0, policy_version 230404 (0.0041) [2024-06-18 23:15:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 3775021056. Throughput: 0: 41232.1. Samples: 42601580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:15:48,380][26367] Avg episode reward: [(0, '0.441')] [2024-06-18 23:15:50,106][26599] Updated weights for policy 0, policy_version 230414 (0.0034) [2024-06-18 23:15:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3775234048. Throughput: 0: 41071.8. Samples: 42846160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:15:53,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-18 23:15:54,018][26599] Updated weights for policy 0, policy_version 230424 (0.0033) [2024-06-18 23:15:57,879][26599] Updated weights for policy 0, policy_version 230434 (0.0042) [2024-06-18 23:15:58,380][26367] Fps is (10 sec: 40959.3, 60 sec: 40960.0, 300 sec: 41099.3). Total num frames: 3775430656. Throughput: 0: 40951.5. Samples: 43083660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:15:58,381][26367] Avg episode reward: [(0, '0.390')] [2024-06-18 23:16:02,009][26599] Updated weights for policy 0, policy_version 230444 (0.0038) [2024-06-18 23:16:03,380][26367] Fps is (10 sec: 37683.5, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 3775610880. Throughput: 0: 40868.2. Samples: 43203600. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:16:03,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-18 23:16:06,416][26599] Updated weights for policy 0, policy_version 230454 (0.0030) [2024-06-18 23:16:08,380][26367] Fps is (10 sec: 42599.2, 60 sec: 41235.6, 300 sec: 41098.9). Total num frames: 3775856640. Throughput: 0: 40992.9. Samples: 43454720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:16:08,380][26367] Avg episode reward: [(0, '0.635')] [2024-06-18 23:16:09,943][26599] Updated weights for policy 0, policy_version 230464 (0.0040) [2024-06-18 23:16:13,380][26367] Fps is (10 sec: 44236.4, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 3776053248. Throughput: 0: 40892.1. Samples: 43695400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:16:13,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-18 23:16:14,268][26599] Updated weights for policy 0, policy_version 230474 (0.0036) [2024-06-18 23:16:18,050][26599] Updated weights for policy 0, policy_version 230484 (0.0042) [2024-06-18 23:16:18,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 3776266240. Throughput: 0: 40851.9. Samples: 43823220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:16:18,382][26367] Avg episode reward: [(0, '0.552')] [2024-06-18 23:16:22,193][26599] Updated weights for policy 0, policy_version 230494 (0.0042) [2024-06-18 23:16:23,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 40932.2). Total num frames: 3776446464. Throughput: 0: 40997.3. Samples: 44069180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:16:23,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-18 23:16:25,811][26599] Updated weights for policy 0, policy_version 230504 (0.0033) [2024-06-18 23:16:28,380][26367] Fps is (10 sec: 37683.3, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 3776643072. Throughput: 0: 41028.4. Samples: 44319440. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:16:28,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-18 23:16:30,432][26599] Updated weights for policy 0, policy_version 230514 (0.0029) [2024-06-18 23:16:33,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3776872448. Throughput: 0: 40869.3. Samples: 44440700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:16:33,381][26367] Avg episode reward: [(0, '0.744')] [2024-06-18 23:16:33,685][26599] Updated weights for policy 0, policy_version 230524 (0.0045) [2024-06-18 23:16:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 3777052672. Throughput: 0: 40893.7. Samples: 44686380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:16:38,381][26367] Avg episode reward: [(0, '0.761')] [2024-06-18 23:16:38,556][26599] Updated weights for policy 0, policy_version 230534 (0.0041) [2024-06-18 23:16:41,595][26599] Updated weights for policy 0, policy_version 230544 (0.0032) [2024-06-18 23:16:43,380][26367] Fps is (10 sec: 39321.0, 60 sec: 40686.9, 300 sec: 40987.7). Total num frames: 3777265664. Throughput: 0: 41087.1. Samples: 44932580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:16:43,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-18 23:16:45,131][26579] Signal inference workers to stop experience collection... (600 times) [2024-06-18 23:16:45,157][26599] InferenceWorker_p0-w0: stopping experience collection (600 times) [2024-06-18 23:16:45,190][26579] Signal inference workers to resume experience collection... (600 times) [2024-06-18 23:16:45,192][26599] InferenceWorker_p0-w0: resuming experience collection (600 times) [2024-06-18 23:16:46,358][26599] Updated weights for policy 0, policy_version 230554 (0.0040) [2024-06-18 23:16:48,380][26367] Fps is (10 sec: 44237.4, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3777495040. Throughput: 0: 41074.2. 
Samples: 45051940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:16:48,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-18 23:16:49,711][26599] Updated weights for policy 0, policy_version 230564 (0.0030) [2024-06-18 23:16:53,380][26367] Fps is (10 sec: 40960.6, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 3777675264. Throughput: 0: 40960.4. Samples: 45297940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:16:53,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-18 23:16:54,297][26599] Updated weights for policy 0, policy_version 230574 (0.0030) [2024-06-18 23:16:58,143][26599] Updated weights for policy 0, policy_version 230584 (0.0034) [2024-06-18 23:16:58,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 3777904640. Throughput: 0: 40993.7. Samples: 45540120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:16:58,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-18 23:17:02,162][26599] Updated weights for policy 0, policy_version 230594 (0.0051) [2024-06-18 23:17:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 40876.9). Total num frames: 3778068480. Throughput: 0: 40885.9. Samples: 45663080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:17:03,380][26367] Avg episode reward: [(0, '0.839')] [2024-06-18 23:17:05,989][26599] Updated weights for policy 0, policy_version 230604 (0.0040) [2024-06-18 23:17:08,380][26367] Fps is (10 sec: 37683.4, 60 sec: 40413.8, 300 sec: 41043.3). Total num frames: 3778281472. Throughput: 0: 40820.5. Samples: 45906100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:17:08,381][26367] Avg episode reward: [(0, '0.775')] [2024-06-18 23:17:10,332][26599] Updated weights for policy 0, policy_version 230614 (0.0039) [2024-06-18 23:17:13,380][26367] Fps is (10 sec: 44236.3, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 3778510848. Throughput: 0: 40652.4. Samples: 46148800. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:17:13,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-18 23:17:13,885][26599] Updated weights for policy 0, policy_version 230624 (0.0044) [2024-06-18 23:17:18,347][26599] Updated weights for policy 0, policy_version 230634 (0.0033) [2024-06-18 23:17:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 3778707456. Throughput: 0: 40763.4. Samples: 46275060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:17:18,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-18 23:17:21,803][26599] Updated weights for policy 0, policy_version 230644 (0.0033) [2024-06-18 23:17:23,380][26367] Fps is (10 sec: 39322.4, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 3778904064. Throughput: 0: 40682.9. Samples: 46517100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:17:23,380][26367] Avg episode reward: [(0, '0.421')] [2024-06-18 23:17:26,445][26599] Updated weights for policy 0, policy_version 230654 (0.0036) [2024-06-18 23:17:28,380][26367] Fps is (10 sec: 37683.3, 60 sec: 40686.9, 300 sec: 40877.2). Total num frames: 3779084288. Throughput: 0: 40907.1. Samples: 46773400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-18 23:17:28,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-18 23:17:29,975][26599] Updated weights for policy 0, policy_version 230664 (0.0039) [2024-06-18 23:17:33,380][26367] Fps is (10 sec: 39321.0, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 3779297280. Throughput: 0: 40838.2. Samples: 46889660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:17:33,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-18 23:17:34,297][26599] Updated weights for policy 0, policy_version 230674 (0.0038) [2024-06-18 23:17:37,879][26599] Updated weights for policy 0, policy_version 230684 (0.0030) [2024-06-18 23:17:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41233.1, 300 sec: 41043.3). 
Total num frames: 3779526656. Throughput: 0: 40841.2. Samples: 47135800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:17:38,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-18 23:17:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000230684_3779526656.pth... [2024-06-18 23:17:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000230082_3769663488.pth [2024-06-18 23:17:42,081][26599] Updated weights for policy 0, policy_version 230694 (0.0021) [2024-06-18 23:17:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 40686.9, 300 sec: 40876.7). Total num frames: 3779706880. Throughput: 0: 41062.6. Samples: 47387940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:17:43,381][26367] Avg episode reward: [(0, '0.452')] [2024-06-18 23:17:45,706][26599] Updated weights for policy 0, policy_version 230704 (0.0025) [2024-06-18 23:17:48,380][26367] Fps is (10 sec: 40960.7, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 3779936256. Throughput: 0: 40915.6. Samples: 47504280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:17:48,380][26367] Avg episode reward: [(0, '0.513')] [2024-06-18 23:17:50,068][26599] Updated weights for policy 0, policy_version 230714 (0.0056) [2024-06-18 23:17:53,380][26367] Fps is (10 sec: 45875.7, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 3780165632. Throughput: 0: 41203.1. Samples: 47760240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:17:53,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-18 23:17:53,565][26599] Updated weights for policy 0, policy_version 230724 (0.0045) [2024-06-18 23:17:57,822][26599] Updated weights for policy 0, policy_version 230734 (0.0042) [2024-06-18 23:17:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 3780345856. Throughput: 0: 41273.9. Samples: 48006120. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:17:58,380][26367] Avg episode reward: [(0, '0.759')] [2024-06-18 23:18:01,496][26599] Updated weights for policy 0, policy_version 230744 (0.0042) [2024-06-18 23:18:03,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 3780558848. Throughput: 0: 41325.4. Samples: 48134700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:18:03,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-18 23:18:06,343][26599] Updated weights for policy 0, policy_version 230754 (0.0047) [2024-06-18 23:18:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 40988.3). Total num frames: 3780755456. Throughput: 0: 41380.0. Samples: 48379200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:18:08,380][26367] Avg episode reward: [(0, '0.594')] [2024-06-18 23:18:09,483][26599] Updated weights for policy 0, policy_version 230764 (0.0035) [2024-06-18 23:18:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3780968448. Throughput: 0: 41124.9. Samples: 48624020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:18:13,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-18 23:18:14,343][26599] Updated weights for policy 0, policy_version 230774 (0.0031) [2024-06-18 23:18:17,389][26599] Updated weights for policy 0, policy_version 230784 (0.0030) [2024-06-18 23:18:18,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41506.3, 300 sec: 41154.4). Total num frames: 3781197824. Throughput: 0: 41232.5. Samples: 48745120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:18:18,380][26367] Avg episode reward: [(0, '0.457')] [2024-06-18 23:18:22,368][26599] Updated weights for policy 0, policy_version 230794 (0.0040) [2024-06-18 23:18:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 3781378048. Throughput: 0: 41427.2. Samples: 49000020. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:18:23,381][26367] Avg episode reward: [(0, '0.457')] [2024-06-18 23:18:25,192][26599] Updated weights for policy 0, policy_version 230804 (0.0029) [2024-06-18 23:18:28,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41506.2, 300 sec: 41043.3). Total num frames: 3781574656. Throughput: 0: 41274.8. Samples: 49245300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:18:28,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-18 23:18:30,275][26599] Updated weights for policy 0, policy_version 230814 (0.0050) [2024-06-18 23:18:33,345][26579] Signal inference workers to stop experience collection... (650 times) [2024-06-18 23:18:33,345][26579] Signal inference workers to resume experience collection... (650 times) [2024-06-18 23:18:33,370][26599] InferenceWorker_p0-w0: stopping experience collection (650 times) [2024-06-18 23:18:33,370][26599] InferenceWorker_p0-w0: resuming experience collection (650 times) [2024-06-18 23:18:33,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 41154.4). Total num frames: 3781804032. Throughput: 0: 41379.1. Samples: 49366340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:18:33,380][26367] Avg episode reward: [(0, '0.520')] [2024-06-18 23:18:33,502][26599] Updated weights for policy 0, policy_version 230824 (0.0037) [2024-06-18 23:18:38,215][26599] Updated weights for policy 0, policy_version 230834 (0.0038) [2024-06-18 23:18:38,380][26367] Fps is (10 sec: 40959.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3781984256. Throughput: 0: 41178.5. Samples: 49613280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-18 23:18:38,381][26367] Avg episode reward: [(0, '0.453')] [2024-06-18 23:18:41,677][26599] Updated weights for policy 0, policy_version 230844 (0.0029) [2024-06-18 23:18:43,380][26367] Fps is (10 sec: 40958.8, 60 sec: 41779.1, 300 sec: 41043.8). Total num frames: 3782213632. Throughput: 0: 41137.5. 
Samples: 49857320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:18:43,381][26367] Avg episode reward: [(0, '0.799')] [2024-06-18 23:18:45,971][26599] Updated weights for policy 0, policy_version 230854 (0.0042) [2024-06-18 23:18:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 3782410240. Throughput: 0: 41121.4. Samples: 49985160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:18:48,381][26367] Avg episode reward: [(0, '0.833')] [2024-06-18 23:18:49,526][26599] Updated weights for policy 0, policy_version 230864 (0.0037) [2024-06-18 23:18:53,380][26367] Fps is (10 sec: 39322.0, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 3782606848. Throughput: 0: 41113.6. Samples: 50229320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:18:53,381][26367] Avg episode reward: [(0, '0.420')] [2024-06-18 23:18:54,291][26599] Updated weights for policy 0, policy_version 230874 (0.0042) [2024-06-18 23:18:57,477][26599] Updated weights for policy 0, policy_version 230884 (0.0045) [2024-06-18 23:18:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 3782836224. Throughput: 0: 41066.8. Samples: 50472020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:18:58,380][26367] Avg episode reward: [(0, '0.538')] [2024-06-18 23:19:02,184][26599] Updated weights for policy 0, policy_version 230894 (0.0032) [2024-06-18 23:19:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3783016448. Throughput: 0: 41222.1. Samples: 50600120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:03,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-18 23:19:05,629][26599] Updated weights for policy 0, policy_version 230904 (0.0040) [2024-06-18 23:19:08,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 3783229440. Throughput: 0: 40884.5. Samples: 50839820. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:08,380][26367] Avg episode reward: [(0, '0.441')] [2024-06-18 23:19:10,135][26599] Updated weights for policy 0, policy_version 230914 (0.0042) [2024-06-18 23:19:13,384][26367] Fps is (10 sec: 42583.3, 60 sec: 41230.6, 300 sec: 41098.4). Total num frames: 3783442432. Throughput: 0: 41010.4. Samples: 51090920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:13,384][26367] Avg episode reward: [(0, '0.449')] [2024-06-18 23:19:13,478][26599] Updated weights for policy 0, policy_version 230924 (0.0045) [2024-06-18 23:19:18,066][26599] Updated weights for policy 0, policy_version 230934 (0.0037) [2024-06-18 23:19:18,380][26367] Fps is (10 sec: 39320.9, 60 sec: 40413.7, 300 sec: 40876.7). Total num frames: 3783622656. Throughput: 0: 41012.7. Samples: 51211920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:18,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-18 23:19:21,568][26599] Updated weights for policy 0, policy_version 230944 (0.0042) [2024-06-18 23:19:23,380][26367] Fps is (10 sec: 42613.5, 60 sec: 41506.1, 300 sec: 41099.3). Total num frames: 3783868416. Throughput: 0: 40994.7. Samples: 51458040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:23,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-18 23:19:26,058][26599] Updated weights for policy 0, policy_version 230954 (0.0036) [2024-06-18 23:19:28,380][26367] Fps is (10 sec: 40960.8, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3784032256. Throughput: 0: 41169.1. Samples: 51709920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:28,381][26367] Avg episode reward: [(0, '0.300')] [2024-06-18 23:19:29,283][26599] Updated weights for policy 0, policy_version 230964 (0.0038) [2024-06-18 23:19:33,384][26367] Fps is (10 sec: 37669.7, 60 sec: 40684.4, 300 sec: 41042.8). Total num frames: 3784245248. Throughput: 0: 40914.0. Samples: 51826440. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:33,384][26367] Avg episode reward: [(0, '0.350')] [2024-06-18 23:19:34,102][26599] Updated weights for policy 0, policy_version 230974 (0.0030) [2024-06-18 23:19:37,172][26599] Updated weights for policy 0, policy_version 230984 (0.0032) [2024-06-18 23:19:38,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3784458240. Throughput: 0: 40921.8. Samples: 52070800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:38,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-18 23:19:38,460][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000230986_3784474624.pth... [2024-06-18 23:19:38,521][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000230384_3774611456.pth [2024-06-18 23:19:41,937][26599] Updated weights for policy 0, policy_version 230994 (0.0036) [2024-06-18 23:19:43,380][26367] Fps is (10 sec: 40974.7, 60 sec: 40687.0, 300 sec: 40987.8). Total num frames: 3784654848. Throughput: 0: 41068.8. Samples: 52320120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:43,381][26367] Avg episode reward: [(0, '0.298')] [2024-06-18 23:19:45,927][26599] Updated weights for policy 0, policy_version 231004 (0.0029) [2024-06-18 23:19:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 3784867840. Throughput: 0: 40948.4. Samples: 52442800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-18 23:19:48,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-18 23:19:49,648][26599] Updated weights for policy 0, policy_version 231014 (0.0038) [2024-06-18 23:19:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 3785048064. Throughput: 0: 41059.5. Samples: 52687500. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:19:53,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-18 23:19:53,841][26599] Updated weights for policy 0, policy_version 231024 (0.0046) [2024-06-18 23:19:57,829][26599] Updated weights for policy 0, policy_version 231034 (0.0025) [2024-06-18 23:19:58,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 3785277440. Throughput: 0: 40972.2. Samples: 52934520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:19:58,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-18 23:20:01,753][26599] Updated weights for policy 0, policy_version 231044 (0.0041) [2024-06-18 23:20:03,380][26367] Fps is (10 sec: 45875.5, 60 sec: 41506.2, 300 sec: 41099.4). Total num frames: 3785506816. Throughput: 0: 41184.6. Samples: 53065220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:03,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-18 23:20:05,763][26599] Updated weights for policy 0, policy_version 231054 (0.0038) [2024-06-18 23:20:08,381][26367] Fps is (10 sec: 39320.7, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 3785670656. Throughput: 0: 41016.7. Samples: 53303800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:08,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-18 23:20:09,814][26599] Updated weights for policy 0, policy_version 231064 (0.0046) [2024-06-18 23:20:13,380][26367] Fps is (10 sec: 37683.1, 60 sec: 40689.4, 300 sec: 40987.8). Total num frames: 3785883648. Throughput: 0: 40900.0. Samples: 53550420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:13,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-18 23:20:13,884][26599] Updated weights for policy 0, policy_version 231074 (0.0034) [2024-06-18 23:20:15,744][26579] Signal inference workers to stop experience collection... 
(700 times) [2024-06-18 23:20:15,745][26579] Signal inference workers to resume experience collection... (700 times) [2024-06-18 23:20:15,759][26599] InferenceWorker_p0-w0: stopping experience collection (700 times) [2024-06-18 23:20:15,760][26599] InferenceWorker_p0-w0: resuming experience collection (700 times) [2024-06-18 23:20:17,677][26599] Updated weights for policy 0, policy_version 231084 (0.0032) [2024-06-18 23:20:18,380][26367] Fps is (10 sec: 45876.1, 60 sec: 41779.3, 300 sec: 41209.9). Total num frames: 3786129408. Throughput: 0: 41074.4. Samples: 53674640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:18,381][26367] Avg episode reward: [(0, '0.371')] [2024-06-18 23:20:21,723][26599] Updated weights for policy 0, policy_version 231094 (0.0039) [2024-06-18 23:20:23,380][26367] Fps is (10 sec: 39321.4, 60 sec: 40140.8, 300 sec: 40932.2). Total num frames: 3786276864. Throughput: 0: 41096.9. Samples: 53920160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:23,381][26367] Avg episode reward: [(0, '0.436')] [2024-06-18 23:20:25,449][26599] Updated weights for policy 0, policy_version 231104 (0.0034) [2024-06-18 23:20:28,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 3786522624. Throughput: 0: 40985.7. Samples: 54164480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:28,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-18 23:20:29,644][26599] Updated weights for policy 0, policy_version 231114 (0.0030) [2024-06-18 23:20:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41235.5, 300 sec: 41043.3). Total num frames: 3786719232. Throughput: 0: 41082.6. Samples: 54291520. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:33,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-18 23:20:33,663][26599] Updated weights for policy 0, policy_version 231124 (0.0039) [2024-06-18 23:20:37,692][26599] Updated weights for policy 0, policy_version 231134 (0.0030) [2024-06-18 23:20:38,380][26367] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3786915840. Throughput: 0: 41105.8. Samples: 54537260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:38,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-18 23:20:41,510][26599] Updated weights for policy 0, policy_version 231144 (0.0043) [2024-06-18 23:20:43,384][26367] Fps is (10 sec: 42583.4, 60 sec: 41503.6, 300 sec: 41098.3). Total num frames: 3787145216. Throughput: 0: 41026.0. Samples: 54780840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:43,384][26367] Avg episode reward: [(0, '0.434')] [2024-06-18 23:20:45,611][26599] Updated weights for policy 0, policy_version 231154 (0.0037) [2024-06-18 23:20:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3787341824. Throughput: 0: 41057.7. Samples: 54912820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:48,381][26367] Avg episode reward: [(0, '0.371')] [2024-06-18 23:20:49,117][26599] Updated weights for policy 0, policy_version 231164 (0.0035) [2024-06-18 23:20:53,380][26367] Fps is (10 sec: 39335.5, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 3787538432. Throughput: 0: 41139.2. Samples: 55155060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:53,381][26367] Avg episode reward: [(0, '0.437')] [2024-06-18 23:20:53,539][26599] Updated weights for policy 0, policy_version 231174 (0.0041) [2024-06-18 23:20:56,898][26599] Updated weights for policy 0, policy_version 231184 (0.0037) [2024-06-18 23:20:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41506.0, 300 sec: 41209.9). 
Total num frames: 3787767808. Throughput: 0: 41144.7. Samples: 55401940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-18 23:20:58,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-18 23:21:01,543][26599] Updated weights for policy 0, policy_version 231194 (0.0047) [2024-06-18 23:21:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 40413.8, 300 sec: 40932.2). Total num frames: 3787931648. Throughput: 0: 41250.2. Samples: 55530900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:03,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-18 23:21:05,237][26599] Updated weights for policy 0, policy_version 231204 (0.0038) [2024-06-18 23:21:08,381][26367] Fps is (10 sec: 40956.7, 60 sec: 41778.7, 300 sec: 41098.7). Total num frames: 3788177408. Throughput: 0: 41140.9. Samples: 55771540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:08,382][26367] Avg episode reward: [(0, '0.637')] [2024-06-18 23:21:09,498][26599] Updated weights for policy 0, policy_version 231214 (0.0034) [2024-06-18 23:21:13,075][26599] Updated weights for policy 0, policy_version 231224 (0.0031) [2024-06-18 23:21:13,380][26367] Fps is (10 sec: 45875.1, 60 sec: 41779.1, 300 sec: 41098.8). Total num frames: 3788390400. Throughput: 0: 41298.7. Samples: 56022920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:13,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-18 23:21:17,470][26599] Updated weights for policy 0, policy_version 231234 (0.0042) [2024-06-18 23:21:18,380][26367] Fps is (10 sec: 36048.5, 60 sec: 40140.9, 300 sec: 40987.8). Total num frames: 3788537856. Throughput: 0: 41212.2. Samples: 56146060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:18,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-18 23:21:20,882][26599] Updated weights for policy 0, policy_version 231244 (0.0045) [2024-06-18 23:21:23,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41154.4). 
Total num frames: 3788783616. Throughput: 0: 41198.2. Samples: 56391180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:23,381][26367] Avg episode reward: [(0, '0.345')] [2024-06-18 23:21:25,383][26599] Updated weights for policy 0, policy_version 231254 (0.0049) [2024-06-18 23:21:28,380][26367] Fps is (10 sec: 45875.2, 60 sec: 41233.2, 300 sec: 41098.8). Total num frames: 3788996608. Throughput: 0: 41490.5. Samples: 56647760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:28,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-18 23:21:28,915][26599] Updated weights for policy 0, policy_version 231264 (0.0036) [2024-06-18 23:21:33,340][26599] Updated weights for policy 0, policy_version 231274 (0.0043) [2024-06-18 23:21:33,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 3789193216. Throughput: 0: 41211.0. Samples: 56767320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:33,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-18 23:21:36,072][26579] Signal inference workers to stop experience collection... (750 times) [2024-06-18 23:21:36,073][26579] Signal inference workers to resume experience collection... (750 times) [2024-06-18 23:21:36,116][26599] InferenceWorker_p0-w0: stopping experience collection (750 times) [2024-06-18 23:21:36,116][26599] InferenceWorker_p0-w0: resuming experience collection (750 times) [2024-06-18 23:21:36,611][26599] Updated weights for policy 0, policy_version 231284 (0.0031) [2024-06-18 23:21:38,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 3789406208. Throughput: 0: 41326.3. Samples: 57014740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:38,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-18 23:21:38,450][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000231288_3789422592.pth... 
[2024-06-18 23:21:38,527][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000230684_3779526656.pth [2024-06-18 23:21:41,146][26599] Updated weights for policy 0, policy_version 231294 (0.0029) [2024-06-18 23:21:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40962.5, 300 sec: 41043.3). Total num frames: 3789602816. Throughput: 0: 41486.8. Samples: 57268840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:43,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-18 23:21:44,600][26599] Updated weights for policy 0, policy_version 231304 (0.0032) [2024-06-18 23:21:48,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 3789815808. Throughput: 0: 41200.6. Samples: 57384920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:48,380][26367] Avg episode reward: [(0, '0.726')] [2024-06-18 23:21:49,163][26599] Updated weights for policy 0, policy_version 231314 (0.0038) [2024-06-18 23:21:52,539][26599] Updated weights for policy 0, policy_version 231324 (0.0040) [2024-06-18 23:21:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 3790028800. Throughput: 0: 41304.0. Samples: 57630180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:53,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-18 23:21:57,218][26599] Updated weights for policy 0, policy_version 231334 (0.0040) [2024-06-18 23:21:58,380][26367] Fps is (10 sec: 37683.2, 60 sec: 40414.0, 300 sec: 41098.8). Total num frames: 3790192640. Throughput: 0: 41409.5. Samples: 57886340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:21:58,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-18 23:22:00,545][26599] Updated weights for policy 0, policy_version 231344 (0.0028) [2024-06-18 23:22:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 3790438400. Throughput: 0: 41337.7. Samples: 58006260. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:22:03,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-18 23:22:05,097][26599] Updated weights for policy 0, policy_version 231354 (0.0035) [2024-06-18 23:22:08,380][26367] Fps is (10 sec: 44236.9, 60 sec: 40960.7, 300 sec: 41098.9). Total num frames: 3790635008. Throughput: 0: 41404.5. Samples: 58254380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-18 23:22:08,380][26367] Avg episode reward: [(0, '0.581')] [2024-06-18 23:22:08,783][26599] Updated weights for policy 0, policy_version 231364 (0.0031) [2024-06-18 23:22:13,100][26599] Updated weights for policy 0, policy_version 231374 (0.0029) [2024-06-18 23:22:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 40687.0, 300 sec: 41098.9). Total num frames: 3790831616. Throughput: 0: 41225.7. Samples: 58502920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:13,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-18 23:22:16,886][26599] Updated weights for policy 0, policy_version 231384 (0.0050) [2024-06-18 23:22:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 41265.4). Total num frames: 3791077376. Throughput: 0: 41156.1. Samples: 58619340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:18,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-18 23:22:21,333][26599] Updated weights for policy 0, policy_version 231394 (0.0038) [2024-06-18 23:22:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 3791241216. Throughput: 0: 41264.1. Samples: 58871620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:23,381][26367] Avg episode reward: [(0, '0.397')] [2024-06-18 23:22:24,739][26599] Updated weights for policy 0, policy_version 231404 (0.0040) [2024-06-18 23:22:28,380][26367] Fps is (10 sec: 37683.0, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 3791454208. Throughput: 0: 41023.1. Samples: 59114880. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:28,381][26367] Avg episode reward: [(0, '0.357')] [2024-06-18 23:22:29,313][26599] Updated weights for policy 0, policy_version 231414 (0.0034) [2024-06-18 23:22:32,542][26599] Updated weights for policy 0, policy_version 231424 (0.0031) [2024-06-18 23:22:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 3791667200. Throughput: 0: 41104.8. Samples: 59234640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:33,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-18 23:22:37,188][26599] Updated weights for policy 0, policy_version 231434 (0.0024) [2024-06-18 23:22:38,380][26367] Fps is (10 sec: 39321.8, 60 sec: 40687.0, 300 sec: 41154.4). Total num frames: 3791847424. Throughput: 0: 41110.7. Samples: 59480160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:38,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-18 23:22:40,912][26599] Updated weights for policy 0, policy_version 231444 (0.0025) [2024-06-18 23:22:43,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 3792076800. Throughput: 0: 40810.5. Samples: 59722820. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:43,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-18 23:22:45,178][26599] Updated weights for policy 0, policy_version 231454 (0.0033) [2024-06-18 23:22:48,384][26367] Fps is (10 sec: 44220.9, 60 sec: 41230.6, 300 sec: 41098.3). Total num frames: 3792289792. Throughput: 0: 40943.9. Samples: 59848880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:48,384][26367] Avg episode reward: [(0, '0.696')] [2024-06-18 23:22:48,796][26599] Updated weights for policy 0, policy_version 231464 (0.0038) [2024-06-18 23:22:52,926][26599] Updated weights for policy 0, policy_version 231474 (0.0043) [2024-06-18 23:22:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 41154.4). 
Total num frames: 3792486400. Throughput: 0: 40861.2. Samples: 60093140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:53,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-18 23:22:56,696][26599] Updated weights for policy 0, policy_version 231484 (0.0046) [2024-06-18 23:22:58,380][26367] Fps is (10 sec: 39335.5, 60 sec: 41506.0, 300 sec: 41098.8). Total num frames: 3792683008. Throughput: 0: 40918.2. Samples: 60344240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:22:58,382][26367] Avg episode reward: [(0, '0.492')] [2024-06-18 23:23:00,670][26599] Updated weights for policy 0, policy_version 231494 (0.0045) [2024-06-18 23:23:03,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 3792896000. Throughput: 0: 41062.2. Samples: 60467140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:23:03,381][26367] Avg episode reward: [(0, '0.746')] [2024-06-18 23:23:04,584][26599] Updated weights for policy 0, policy_version 231504 (0.0034) [2024-06-18 23:23:08,384][26367] Fps is (10 sec: 40945.2, 60 sec: 40957.4, 300 sec: 41098.3). Total num frames: 3793092608. Throughput: 0: 40808.7. Samples: 60708160. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:23:08,385][26367] Avg episode reward: [(0, '0.622')] [2024-06-18 23:23:09,434][26599] Updated weights for policy 0, policy_version 231514 (0.0047) [2024-06-18 23:23:12,759][26599] Updated weights for policy 0, policy_version 231524 (0.0032) [2024-06-18 23:23:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 3793305600. Throughput: 0: 40903.1. Samples: 60955520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:23:13,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-18 23:23:17,194][26599] Updated weights for policy 0, policy_version 231534 (0.0045) [2024-06-18 23:23:18,380][26367] Fps is (10 sec: 40974.8, 60 sec: 40413.8, 300 sec: 41098.8). 
Total num frames: 3793502208. Throughput: 0: 40991.6. Samples: 61079260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-18 23:23:18,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-18 23:23:19,568][26579] Signal inference workers to stop experience collection... (800 times) [2024-06-18 23:23:19,569][26579] Signal inference workers to resume experience collection... (800 times) [2024-06-18 23:23:19,607][26599] InferenceWorker_p0-w0: stopping experience collection (800 times) [2024-06-18 23:23:19,608][26599] InferenceWorker_p0-w0: resuming experience collection (800 times) [2024-06-18 23:23:20,823][26599] Updated weights for policy 0, policy_version 231544 (0.0038) [2024-06-18 23:23:23,380][26367] Fps is (10 sec: 39321.9, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 3793698816. Throughput: 0: 40892.0. Samples: 61320300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:23:23,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-18 23:23:25,034][26599] Updated weights for policy 0, policy_version 231554 (0.0043) [2024-06-18 23:23:28,380][26367] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 3793911808. Throughput: 0: 41038.8. Samples: 61569560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:23:28,380][26367] Avg episode reward: [(0, '0.451')] [2024-06-18 23:23:28,663][26599] Updated weights for policy 0, policy_version 231564 (0.0038) [2024-06-18 23:23:32,689][26599] Updated weights for policy 0, policy_version 231574 (0.0033) [2024-06-18 23:23:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 3794124800. Throughput: 0: 41113.5. Samples: 61698840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:23:33,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-18 23:23:36,563][26599] Updated weights for policy 0, policy_version 231584 (0.0034) [2024-06-18 23:23:38,380][26367] Fps is (10 sec: 40959.1, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 3794321408. Throughput: 0: 41148.4. Samples: 61944820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:23:38,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-18 23:23:38,497][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000231588_3794337792.pth... [2024-06-18 23:23:38,544][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000230986_3784474624.pth [2024-06-18 23:23:40,445][26599] Updated weights for policy 0, policy_version 231594 (0.0043) [2024-06-18 23:23:43,381][26367] Fps is (10 sec: 40959.0, 60 sec: 40959.8, 300 sec: 41098.8). Total num frames: 3794534400. Throughput: 0: 41045.1. Samples: 62191280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:23:43,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-18 23:23:44,515][26599] Updated weights for policy 0, policy_version 231604 (0.0031) [2024-06-18 23:23:48,292][26599] Updated weights for policy 0, policy_version 231614 (0.0046) [2024-06-18 23:23:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41235.5, 300 sec: 41209.9). Total num frames: 3794763776. Throughput: 0: 41155.5. Samples: 62319140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:23:48,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-18 23:23:52,758][26599] Updated weights for policy 0, policy_version 231624 (0.0044) [2024-06-18 23:23:53,380][26367] Fps is (10 sec: 40960.7, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 3794944000. Throughput: 0: 41132.6. Samples: 62558980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:23:53,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-18 23:23:56,355][26599] Updated weights for policy 0, policy_version 231634 (0.0031) [2024-06-18 23:23:58,380][26367] Fps is (10 sec: 37684.0, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 3795140608. Throughput: 0: 41123.7. Samples: 62806080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:23:58,380][26367] Avg episode reward: [(0, '0.704')] [2024-06-18 23:24:00,748][26599] Updated weights for policy 0, policy_version 231644 (0.0044) [2024-06-18 23:24:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 3795353600. Throughput: 0: 41021.8. Samples: 62925240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:24:03,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-18 23:24:04,310][26599] Updated weights for policy 0, policy_version 231654 (0.0050) [2024-06-18 23:24:08,380][26367] Fps is (10 sec: 40959.4, 60 sec: 40962.5, 300 sec: 41043.8). Total num frames: 3795550208. Throughput: 0: 41162.6. Samples: 63172620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:24:08,381][26367] Avg episode reward: [(0, '0.418')] [2024-06-18 23:24:08,966][26599] Updated weights for policy 0, policy_version 231664 (0.0035) [2024-06-18 23:24:12,191][26599] Updated weights for policy 0, policy_version 231674 (0.0031) [2024-06-18 23:24:13,380][26367] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 3795763200. Throughput: 0: 40949.3. Samples: 63412280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:24:13,381][26367] Avg episode reward: [(0, '0.758')] [2024-06-18 23:24:16,952][26599] Updated weights for policy 0, policy_version 231684 (0.0043) [2024-06-18 23:24:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3795959808. Throughput: 0: 40840.9. Samples: 63536680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:24:18,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-18 23:24:20,458][26599] Updated weights for policy 0, policy_version 231694 (0.0042) [2024-06-18 23:24:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 3796172800. Throughput: 0: 40816.1. Samples: 63781540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-18 23:24:23,382][26367] Avg episode reward: [(0, '0.474')] [2024-06-18 23:24:24,903][26599] Updated weights for policy 0, policy_version 231704 (0.0034) [2024-06-18 23:24:28,309][26599] Updated weights for policy 0, policy_version 231714 (0.0048) [2024-06-18 23:24:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41506.0, 300 sec: 41210.4). Total num frames: 3796402176. Throughput: 0: 40733.1. Samples: 64024260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:24:28,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-18 23:24:33,159][26599] Updated weights for policy 0, policy_version 231724 (0.0036) [2024-06-18 23:24:33,380][26367] Fps is (10 sec: 39321.4, 60 sec: 40686.9, 300 sec: 41043.3). Total num frames: 3796566016. Throughput: 0: 40711.1. Samples: 64151140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:24:33,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-18 23:24:36,202][26599] Updated weights for policy 0, policy_version 231734 (0.0040) [2024-06-18 23:24:38,380][26367] Fps is (10 sec: 37683.8, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 3796779008. Throughput: 0: 40803.7. Samples: 64395140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:24:38,380][26367] Avg episode reward: [(0, '0.616')] [2024-06-18 23:24:39,013][26579] Signal inference workers to stop experience collection... 
(850 times) [2024-06-18 23:24:39,055][26599] InferenceWorker_p0-w0: stopping experience collection (850 times) [2024-06-18 23:24:39,062][26579] Signal inference workers to resume experience collection... (850 times) [2024-06-18 23:24:39,073][26599] InferenceWorker_p0-w0: resuming experience collection (850 times) [2024-06-18 23:24:41,319][26599] Updated weights for policy 0, policy_version 231744 (0.0041) [2024-06-18 23:24:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 40960.2, 300 sec: 41098.8). Total num frames: 3796992000. Throughput: 0: 40812.3. Samples: 64642640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:24:43,381][26367] Avg episode reward: [(0, '0.470')] [2024-06-18 23:24:44,233][26599] Updated weights for policy 0, policy_version 231754 (0.0035) [2024-06-18 23:24:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40414.0, 300 sec: 41154.4). Total num frames: 3797188608. Throughput: 0: 40731.2. Samples: 64758140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:24:48,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-18 23:24:49,302][26599] Updated weights for policy 0, policy_version 231764 (0.0029) [2024-06-18 23:24:52,353][26599] Updated weights for policy 0, policy_version 231774 (0.0032) [2024-06-18 23:24:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 41098.8). Total num frames: 3797401600. Throughput: 0: 40704.0. Samples: 65004300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:24:53,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-18 23:24:57,323][26599] Updated weights for policy 0, policy_version 231784 (0.0034) [2024-06-18 23:24:58,380][26367] Fps is (10 sec: 39321.0, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 3797581824. Throughput: 0: 41015.9. Samples: 65258000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:24:58,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-18 23:25:00,611][26599] Updated weights for policy 0, policy_version 231794 (0.0038) [2024-06-18 23:25:03,380][26367] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 3797811200. Throughput: 0: 40863.0. Samples: 65375520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:25:03,381][26367] Avg episode reward: [(0, '0.437')] [2024-06-18 23:25:05,451][26599] Updated weights for policy 0, policy_version 231804 (0.0036) [2024-06-18 23:25:08,380][26367] Fps is (10 sec: 42598.8, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 3798007808. Throughput: 0: 40820.0. Samples: 65618440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:25:08,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-18 23:25:08,531][26599] Updated weights for policy 0, policy_version 231814 (0.0030) [2024-06-18 23:25:13,380][26367] Fps is (10 sec: 37683.1, 60 sec: 40413.7, 300 sec: 40876.7). Total num frames: 3798188032. Throughput: 0: 41072.8. Samples: 65872540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:25:13,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-18 23:25:13,395][26599] Updated weights for policy 0, policy_version 231824 (0.0042) [2024-06-18 23:25:16,811][26599] Updated weights for policy 0, policy_version 231834 (0.0044) [2024-06-18 23:25:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 41154.4). Total num frames: 3798417408. Throughput: 0: 40867.3. Samples: 65990160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:25:18,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-18 23:25:21,225][26599] Updated weights for policy 0, policy_version 231844 (0.0049) [2024-06-18 23:25:23,380][26367] Fps is (10 sec: 44237.9, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 3798630400. Throughput: 0: 41064.5. Samples: 66243040. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:25:23,380][26367] Avg episode reward: [(0, '0.594')] [2024-06-18 23:25:24,769][26599] Updated weights for policy 0, policy_version 231854 (0.0048) [2024-06-18 23:25:28,380][26367] Fps is (10 sec: 39321.3, 60 sec: 40140.8, 300 sec: 40987.8). Total num frames: 3798810624. Throughput: 0: 40990.7. Samples: 66487220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:25:28,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-18 23:25:29,263][26599] Updated weights for policy 0, policy_version 231864 (0.0040) [2024-06-18 23:25:32,743][26599] Updated weights for policy 0, policy_version 231874 (0.0027) [2024-06-18 23:25:33,380][26367] Fps is (10 sec: 40959.1, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 3799040000. Throughput: 0: 40993.2. Samples: 66602840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-18 23:25:33,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-18 23:25:37,168][26599] Updated weights for policy 0, policy_version 231884 (0.0044) [2024-06-18 23:25:38,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41233.1, 300 sec: 41043.8). Total num frames: 3799252992. Throughput: 0: 41222.3. Samples: 66859300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:25:38,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-18 23:25:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000231889_3799269376.pth... [2024-06-18 23:25:38,455][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000231288_3789422592.pth [2024-06-18 23:25:40,877][26599] Updated weights for policy 0, policy_version 231894 (0.0036) [2024-06-18 23:25:43,380][26367] Fps is (10 sec: 40960.6, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 3799449600. Throughput: 0: 40952.2. Samples: 67100840. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:25:43,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-18 23:25:45,085][26599] Updated weights for policy 0, policy_version 231904 (0.0040) [2024-06-18 23:25:48,380][26367] Fps is (10 sec: 39320.8, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 3799646208. Throughput: 0: 41094.2. Samples: 67224760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:25:48,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-18 23:25:48,869][26599] Updated weights for policy 0, policy_version 231914 (0.0035) [2024-06-18 23:25:53,012][26599] Updated weights for policy 0, policy_version 231924 (0.0040) [2024-06-18 23:25:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3799859200. Throughput: 0: 41286.6. Samples: 67476340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:25:53,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-18 23:25:56,935][26599] Updated weights for policy 0, policy_version 231934 (0.0042) [2024-06-18 23:25:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 3800055808. Throughput: 0: 40957.0. Samples: 67715600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:25:58,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-18 23:25:58,560][26579] Signal inference workers to stop experience collection... (900 times) [2024-06-18 23:25:58,560][26579] Signal inference workers to resume experience collection... (900 times) [2024-06-18 23:25:58,581][26599] InferenceWorker_p0-w0: stopping experience collection (900 times) [2024-06-18 23:25:58,582][26599] InferenceWorker_p0-w0: resuming experience collection (900 times) [2024-06-18 23:26:01,040][26599] Updated weights for policy 0, policy_version 231944 (0.0042) [2024-06-18 23:26:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.1, 300 sec: 40987.9). Total num frames: 3800268800. Throughput: 0: 41035.0. 
Samples: 67836740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:26:03,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-18 23:26:04,859][26599] Updated weights for policy 0, policy_version 231954 (0.0040) [2024-06-18 23:26:08,380][26367] Fps is (10 sec: 39321.9, 60 sec: 40687.0, 300 sec: 40876.7). Total num frames: 3800449024. Throughput: 0: 40777.7. Samples: 68078040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:26:08,380][26367] Avg episode reward: [(0, '0.739')] [2024-06-18 23:26:08,851][26599] Updated weights for policy 0, policy_version 231964 (0.0035) [2024-06-18 23:26:12,795][26599] Updated weights for policy 0, policy_version 231974 (0.0039) [2024-06-18 23:26:13,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 3800662016. Throughput: 0: 40892.0. Samples: 68327360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:26:13,384][26367] Avg episode reward: [(0, '0.756')] [2024-06-18 23:26:17,054][26599] Updated weights for policy 0, policy_version 231984 (0.0033) [2024-06-18 23:26:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3800875008. Throughput: 0: 41124.6. Samples: 68453440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:26:18,381][26367] Avg episode reward: [(0, '0.825')] [2024-06-18 23:26:20,747][26599] Updated weights for policy 0, policy_version 231994 (0.0043) [2024-06-18 23:26:23,380][26367] Fps is (10 sec: 40960.4, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 3801071616. Throughput: 0: 40745.3. Samples: 68692840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:26:23,380][26367] Avg episode reward: [(0, '0.802')] [2024-06-18 23:26:24,886][26599] Updated weights for policy 0, policy_version 232004 (0.0042) [2024-06-18 23:26:28,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 3801284608. Throughput: 0: 40839.4. Samples: 68938620. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:26:28,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-18 23:26:28,854][26599] Updated weights for policy 0, policy_version 232014 (0.0037) [2024-06-18 23:26:32,956][26599] Updated weights for policy 0, policy_version 232024 (0.0039) [2024-06-18 23:26:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40687.0, 300 sec: 40932.2). Total num frames: 3801481216. Throughput: 0: 40893.5. Samples: 69064960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:26:33,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-18 23:26:36,964][26599] Updated weights for policy 0, policy_version 232034 (0.0035) [2024-06-18 23:26:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 40686.8, 300 sec: 40987.8). Total num frames: 3801694208. Throughput: 0: 40740.8. Samples: 69309680. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:26:38,381][26367] Avg episode reward: [(0, '0.361')] [2024-06-18 23:26:40,871][26599] Updated weights for policy 0, policy_version 232044 (0.0038) [2024-06-18 23:26:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 3801907200. Throughput: 0: 40919.1. Samples: 69556960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-18 23:26:43,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-18 23:26:44,885][26599] Updated weights for policy 0, policy_version 232054 (0.0034) [2024-06-18 23:26:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 3802103808. Throughput: 0: 41094.6. Samples: 69686000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:26:48,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-18 23:26:48,736][26599] Updated weights for policy 0, policy_version 232064 (0.0036) [2024-06-18 23:26:53,119][26599] Updated weights for policy 0, policy_version 232074 (0.0032) [2024-06-18 23:26:53,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41098.8). 
Total num frames: 3802316800. Throughput: 0: 41091.9. Samples: 69927180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:26:53,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-18 23:26:56,795][26599] Updated weights for policy 0, policy_version 232084 (0.0032) [2024-06-18 23:26:58,380][26367] Fps is (10 sec: 42599.4, 60 sec: 41233.2, 300 sec: 40987.8). Total num frames: 3802529792. Throughput: 0: 41016.6. Samples: 70173100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:26:58,380][26367] Avg episode reward: [(0, '0.616')] [2024-06-18 23:27:00,987][26599] Updated weights for policy 0, policy_version 232094 (0.0035) [2024-06-18 23:27:03,380][26367] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 40932.2). Total num frames: 3802710016. Throughput: 0: 41080.3. Samples: 70302060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:03,381][26367] Avg episode reward: [(0, '0.815')] [2024-06-18 23:27:04,693][26599] Updated weights for policy 0, policy_version 232104 (0.0046) [2024-06-18 23:27:08,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41233.1, 300 sec: 40987.8). Total num frames: 3802923008. Throughput: 0: 41166.7. Samples: 70545340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:08,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-18 23:27:08,907][26599] Updated weights for policy 0, policy_version 232114 (0.0028) [2024-06-18 23:27:12,498][26599] Updated weights for policy 0, policy_version 232124 (0.0046) [2024-06-18 23:27:13,380][26367] Fps is (10 sec: 45875.6, 60 sec: 41779.2, 300 sec: 40987.8). Total num frames: 3803168768. Throughput: 0: 41179.2. Samples: 70791680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:13,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-18 23:27:16,749][26599] Updated weights for policy 0, policy_version 232134 (0.0030) [2024-06-18 23:27:18,088][26579] Signal inference workers to stop experience collection... 
(950 times) [2024-06-18 23:27:18,092][26579] Signal inference workers to resume experience collection... (950 times) [2024-06-18 23:27:18,133][26599] InferenceWorker_p0-w0: stopping experience collection (950 times) [2024-06-18 23:27:18,140][26599] InferenceWorker_p0-w0: resuming experience collection (950 times) [2024-06-18 23:27:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3803332608. Throughput: 0: 41188.5. Samples: 70918440. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:18,380][26367] Avg episode reward: [(0, '0.515')] [2024-06-18 23:27:20,540][26599] Updated weights for policy 0, policy_version 232144 (0.0043) [2024-06-18 23:27:23,380][26367] Fps is (10 sec: 36045.1, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 3803529216. Throughput: 0: 41073.9. Samples: 71158000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:23,380][26367] Avg episode reward: [(0, '0.664')] [2024-06-18 23:27:25,013][26599] Updated weights for policy 0, policy_version 232154 (0.0039) [2024-06-18 23:27:28,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 3803758592. Throughput: 0: 40995.5. Samples: 71401760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:28,381][26367] Avg episode reward: [(0, '0.290')] [2024-06-18 23:27:28,431][26599] Updated weights for policy 0, policy_version 232164 (0.0042) [2024-06-18 23:27:32,870][26599] Updated weights for policy 0, policy_version 232174 (0.0036) [2024-06-18 23:27:33,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 3803955200. Throughput: 0: 40998.6. Samples: 71530940. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:33,381][26367] Avg episode reward: [(0, '0.394')] [2024-06-18 23:27:36,655][26599] Updated weights for policy 0, policy_version 232184 (0.0038) [2024-06-18 23:27:38,380][26367] Fps is (10 sec: 39322.2, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 3804151808. Throughput: 0: 41023.2. Samples: 71773220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:38,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-18 23:27:38,418][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000232188_3804168192.pth... [2024-06-18 23:27:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000231588_3794337792.pth [2024-06-18 23:27:40,666][26599] Updated weights for policy 0, policy_version 232194 (0.0038) [2024-06-18 23:27:43,380][26367] Fps is (10 sec: 42599.5, 60 sec: 41233.2, 300 sec: 40988.3). Total num frames: 3804381184. Throughput: 0: 41070.6. Samples: 72021280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:43,380][26367] Avg episode reward: [(0, '0.682')] [2024-06-18 23:27:44,537][26599] Updated weights for policy 0, policy_version 232204 (0.0054) [2024-06-18 23:27:48,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 3804577792. Throughput: 0: 40988.9. Samples: 72146560. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:48,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-18 23:27:48,473][26599] Updated weights for policy 0, policy_version 232214 (0.0030) [2024-06-18 23:27:52,321][26599] Updated weights for policy 0, policy_version 232224 (0.0028) [2024-06-18 23:27:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3804790784. Throughput: 0: 41175.5. Samples: 72398240. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-18 23:27:53,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-18 23:27:56,295][26599] Updated weights for policy 0, policy_version 232234 (0.0039) [2024-06-18 23:27:58,380][26367] Fps is (10 sec: 39322.1, 60 sec: 40686.8, 300 sec: 40932.2). Total num frames: 3804971008. Throughput: 0: 41264.5. Samples: 72648580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:27:58,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-18 23:28:00,182][26599] Updated weights for policy 0, policy_version 232244 (0.0040) [2024-06-18 23:28:03,380][26367] Fps is (10 sec: 37683.0, 60 sec: 40960.0, 300 sec: 40932.7). Total num frames: 3805167616. Throughput: 0: 41022.1. Samples: 72764440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:03,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-18 23:28:04,621][26599] Updated weights for policy 0, policy_version 232254 (0.0048) [2024-06-18 23:28:08,100][26599] Updated weights for policy 0, policy_version 232264 (0.0030) [2024-06-18 23:28:08,380][26367] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 3805413376. Throughput: 0: 41200.4. Samples: 73012020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:08,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-18 23:28:12,678][26599] Updated weights for policy 0, policy_version 232274 (0.0036) [2024-06-18 23:28:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 3805593600. Throughput: 0: 41176.5. Samples: 73254700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:13,381][26367] Avg episode reward: [(0, '0.763')] [2024-06-18 23:28:16,075][26599] Updated weights for policy 0, policy_version 232284 (0.0025) [2024-06-18 23:28:18,380][26367] Fps is (10 sec: 39320.8, 60 sec: 41232.9, 300 sec: 41043.3). Total num frames: 3805806592. Throughput: 0: 41076.4. Samples: 73379380. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:18,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-18 23:28:20,583][26599] Updated weights for policy 0, policy_version 232294 (0.0035) [2024-06-18 23:28:23,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41098.8). Total num frames: 3806035968. Throughput: 0: 41279.5. Samples: 73630800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:23,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-18 23:28:24,099][26599] Updated weights for policy 0, policy_version 232304 (0.0047) [2024-06-18 23:28:28,349][26599] Updated weights for policy 0, policy_version 232314 (0.0024) [2024-06-18 23:28:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3806232576. Throughput: 0: 41282.0. Samples: 73878980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:28,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-18 23:28:32,262][26599] Updated weights for policy 0, policy_version 232324 (0.0034) [2024-06-18 23:28:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41506.3, 300 sec: 41098.9). Total num frames: 3806445568. Throughput: 0: 41219.7. Samples: 74001440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:33,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-18 23:28:36,124][26599] Updated weights for policy 0, policy_version 232334 (0.0038) [2024-06-18 23:28:38,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 3806625792. Throughput: 0: 41172.9. Samples: 74251020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:38,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-18 23:28:40,013][26599] Updated weights for policy 0, policy_version 232344 (0.0050) [2024-06-18 23:28:40,759][26579] Signal inference workers to stop experience collection... 
(1000 times) [2024-06-18 23:28:40,806][26599] InferenceWorker_p0-w0: stopping experience collection (1000 times) [2024-06-18 23:28:40,877][26579] Signal inference workers to resume experience collection... (1000 times) [2024-06-18 23:28:40,877][26599] InferenceWorker_p0-w0: resuming experience collection (1000 times) [2024-06-18 23:28:43,380][26367] Fps is (10 sec: 39321.4, 60 sec: 40959.9, 300 sec: 40932.2). Total num frames: 3806838784. Throughput: 0: 41185.8. Samples: 74501940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:43,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-18 23:28:44,539][26599] Updated weights for policy 0, policy_version 232354 (0.0036) [2024-06-18 23:28:47,984][26599] Updated weights for policy 0, policy_version 232364 (0.0041) [2024-06-18 23:28:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3807051776. Throughput: 0: 41369.8. Samples: 74626080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:48,381][26367] Avg episode reward: [(0, '0.779')] [2024-06-18 23:28:52,414][26599] Updated weights for policy 0, policy_version 232374 (0.0045) [2024-06-18 23:28:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 3807264768. Throughput: 0: 41336.4. Samples: 74872160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:53,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-18 23:28:55,826][26599] Updated weights for policy 0, policy_version 232384 (0.0035) [2024-06-18 23:28:58,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 40987.8). Total num frames: 3807444992. Throughput: 0: 41376.9. Samples: 75116660. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:28:58,381][26367] Avg episode reward: [(0, '0.308')] [2024-06-18 23:29:00,402][26599] Updated weights for policy 0, policy_version 232394 (0.0038) [2024-06-18 23:29:03,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41506.3, 300 sec: 41043.3). Total num frames: 3807657984. Throughput: 0: 41390.9. Samples: 75241960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-18 23:29:03,380][26367] Avg episode reward: [(0, '0.443')] [2024-06-18 23:29:03,767][26599] Updated weights for policy 0, policy_version 232404 (0.0041) [2024-06-18 23:29:08,245][26599] Updated weights for policy 0, policy_version 232414 (0.0026) [2024-06-18 23:29:08,380][26367] Fps is (10 sec: 42598.7, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 3807870976. Throughput: 0: 41243.6. Samples: 75486760. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:08,381][26367] Avg episode reward: [(0, '0.820')] [2024-06-18 23:29:11,942][26599] Updated weights for policy 0, policy_version 232424 (0.0043) [2024-06-18 23:29:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41098.9). Total num frames: 3808083968. Throughput: 0: 41053.1. Samples: 75726360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:13,380][26367] Avg episode reward: [(0, '0.545')] [2024-06-18 23:29:16,294][26599] Updated weights for policy 0, policy_version 232434 (0.0050) [2024-06-18 23:29:18,380][26367] Fps is (10 sec: 39321.2, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 3808264192. Throughput: 0: 41159.0. Samples: 75853600. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:18,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-18 23:29:20,048][26599] Updated weights for policy 0, policy_version 232444 (0.0033) [2024-06-18 23:29:23,380][26367] Fps is (10 sec: 37682.8, 60 sec: 40413.9, 300 sec: 40876.7). Total num frames: 3808460800. Throughput: 0: 41031.1. Samples: 76097420. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:23,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-18 23:29:24,199][26599] Updated weights for policy 0, policy_version 232454 (0.0039) [2024-06-18 23:29:28,030][26599] Updated weights for policy 0, policy_version 232464 (0.0044) [2024-06-18 23:29:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 3808706560. Throughput: 0: 41067.4. Samples: 76349980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:28,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-18 23:29:32,134][26599] Updated weights for policy 0, policy_version 232474 (0.0034) [2024-06-18 23:29:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 3808903168. Throughput: 0: 41146.8. Samples: 76477680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:33,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-18 23:29:36,117][26599] Updated weights for policy 0, policy_version 232484 (0.0030) [2024-06-18 23:29:38,383][26367] Fps is (10 sec: 39310.0, 60 sec: 41231.0, 300 sec: 41042.9). Total num frames: 3809099776. Throughput: 0: 41025.7. Samples: 76718440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:38,384][26367] Avg episode reward: [(0, '0.725')] [2024-06-18 23:29:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000232489_3809099776.pth... [2024-06-18 23:29:38,450][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000231889_3799269376.pth [2024-06-18 23:29:40,017][26599] Updated weights for policy 0, policy_version 232494 (0.0027) [2024-06-18 23:29:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 3809312768. Throughput: 0: 41212.4. Samples: 76971220. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:43,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-18 23:29:43,995][26599] Updated weights for policy 0, policy_version 232504 (0.0035) [2024-06-18 23:29:47,811][26599] Updated weights for policy 0, policy_version 232514 (0.0041) [2024-06-18 23:29:48,380][26367] Fps is (10 sec: 42611.3, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 3809525760. Throughput: 0: 41162.1. Samples: 77094260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:48,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-18 23:29:51,881][26599] Updated weights for policy 0, policy_version 232524 (0.0032) [2024-06-18 23:29:53,384][26367] Fps is (10 sec: 40945.3, 60 sec: 40957.5, 300 sec: 41153.9). Total num frames: 3809722368. Throughput: 0: 41234.0. Samples: 77342440. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:53,384][26367] Avg episode reward: [(0, '0.473')] [2024-06-18 23:29:55,629][26599] Updated weights for policy 0, policy_version 232534 (0.0040) [2024-06-18 23:29:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41098.9). Total num frames: 3809935360. Throughput: 0: 41530.1. Samples: 77595220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:29:58,382][26367] Avg episode reward: [(0, '0.408')] [2024-06-18 23:29:59,866][26599] Updated weights for policy 0, policy_version 232544 (0.0037) [2024-06-18 23:30:03,380][26367] Fps is (10 sec: 40974.7, 60 sec: 41233.0, 300 sec: 41098.8). Total num frames: 3810131968. Throughput: 0: 41390.7. Samples: 77716180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:30:03,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-18 23:30:03,759][26599] Updated weights for policy 0, policy_version 232554 (0.0049) [2024-06-18 23:30:07,851][26599] Updated weights for policy 0, policy_version 232564 (0.0040) [2024-06-18 23:30:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41209.9). 
Total num frames: 3810344960. Throughput: 0: 41465.3. Samples: 77963360. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:30:08,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-18 23:30:11,509][26599] Updated weights for policy 0, policy_version 232574 (0.0031) [2024-06-18 23:30:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41232.9, 300 sec: 41154.4). Total num frames: 3810557952. Throughput: 0: 41170.3. Samples: 78202640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 21.0) [2024-06-18 23:30:13,382][26367] Avg episode reward: [(0, '0.548')] [2024-06-18 23:30:14,801][26579] Signal inference workers to stop experience collection... (1050 times) [2024-06-18 23:30:14,859][26599] InferenceWorker_p0-w0: stopping experience collection (1050 times) [2024-06-18 23:30:14,916][26579] Signal inference workers to resume experience collection... (1050 times) [2024-06-18 23:30:14,916][26599] InferenceWorker_p0-w0: resuming experience collection (1050 times) [2024-06-18 23:30:15,752][26599] Updated weights for policy 0, policy_version 232584 (0.0037) [2024-06-18 23:30:18,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 3810770944. Throughput: 0: 41163.8. Samples: 78330060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:30:18,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-18 23:30:19,576][26599] Updated weights for policy 0, policy_version 232594 (0.0040) [2024-06-18 23:30:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 3810967552. Throughput: 0: 41373.1. Samples: 78580100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:30:23,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-18 23:30:23,579][26599] Updated weights for policy 0, policy_version 232604 (0.0031) [2024-06-18 23:30:27,399][26599] Updated weights for policy 0, policy_version 232614 (0.0046) [2024-06-18 23:30:28,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 3811196928. Throughput: 0: 41230.2. Samples: 78826580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:30:28,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-18 23:30:31,528][26599] Updated weights for policy 0, policy_version 232624 (0.0043) [2024-06-18 23:30:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 3811393536. Throughput: 0: 41318.7. Samples: 78953600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:30:33,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-18 23:30:35,145][26599] Updated weights for policy 0, policy_version 232634 (0.0049) [2024-06-18 23:30:38,380][26367] Fps is (10 sec: 36044.8, 60 sec: 40962.1, 300 sec: 41043.3). Total num frames: 3811557376. Throughput: 0: 41233.0. Samples: 79197780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:30:38,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-18 23:30:39,946][26599] Updated weights for policy 0, policy_version 232644 (0.0035) [2024-06-18 23:30:43,039][26599] Updated weights for policy 0, policy_version 232654 (0.0052) [2024-06-18 23:30:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41265.5). Total num frames: 3811819520. Throughput: 0: 40938.6. Samples: 79437460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:30:43,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-18 23:30:47,947][26599] Updated weights for policy 0, policy_version 232664 (0.0041) [2024-06-18 23:30:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 40960.0, 300 sec: 41098.9). 
Total num frames: 3811983360. Throughput: 0: 41145.8. Samples: 79567740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:30:48,381][26367] Avg episode reward: [(0, '0.811')] [2024-06-18 23:30:51,312][26599] Updated weights for policy 0, policy_version 232674 (0.0042) [2024-06-18 23:30:53,380][26367] Fps is (10 sec: 36045.5, 60 sec: 40962.6, 300 sec: 41098.9). Total num frames: 3812179968. Throughput: 0: 40861.1. Samples: 79802100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:30:53,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-18 23:30:55,936][26599] Updated weights for policy 0, policy_version 232684 (0.0025) [2024-06-18 23:30:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 3812425728. Throughput: 0: 41138.7. Samples: 80053880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:30:58,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-18 23:30:59,068][26599] Updated weights for policy 0, policy_version 232694 (0.0034) [2024-06-18 23:31:03,380][26367] Fps is (10 sec: 39320.7, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 3812573184. Throughput: 0: 41119.1. Samples: 80180420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:31:03,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-18 23:31:03,963][26599] Updated weights for policy 0, policy_version 232704 (0.0040) [2024-06-18 23:31:06,810][26599] Updated weights for policy 0, policy_version 232714 (0.0028) [2024-06-18 23:31:08,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 3812835328. Throughput: 0: 40955.8. Samples: 80423120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:31:08,381][26367] Avg episode reward: [(0, '0.746')] [2024-06-18 23:31:12,046][26599] Updated weights for policy 0, policy_version 232724 (0.0025) [2024-06-18 23:31:13,380][26367] Fps is (10 sec: 45875.3, 60 sec: 41233.0, 300 sec: 41209.9). 
Total num frames: 3813031936. Throughput: 0: 41048.4. Samples: 80673760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:31:13,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-18 23:31:14,619][26599] Updated weights for policy 0, policy_version 232734 (0.0034) [2024-06-18 23:31:18,383][26367] Fps is (10 sec: 36036.5, 60 sec: 40412.3, 300 sec: 41098.5). Total num frames: 3813195776. Throughput: 0: 40819.1. Samples: 80790560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-18 23:31:18,383][26367] Avg episode reward: [(0, '0.635')] [2024-06-18 23:31:19,968][26599] Updated weights for policy 0, policy_version 232744 (0.0049) [2024-06-18 23:31:20,247][26579] Signal inference workers to stop experience collection... (1100 times) [2024-06-18 23:31:20,247][26579] Signal inference workers to resume experience collection... (1100 times) [2024-06-18 23:31:20,283][26599] InferenceWorker_p0-w0: stopping experience collection (1100 times) [2024-06-18 23:31:20,284][26599] InferenceWorker_p0-w0: resuming experience collection (1100 times) [2024-06-18 23:31:22,586][26599] Updated weights for policy 0, policy_version 232754 (0.0035) [2024-06-18 23:31:23,384][26367] Fps is (10 sec: 42583.0, 60 sec: 41503.5, 300 sec: 41265.0). Total num frames: 3813457920. Throughput: 0: 40960.2. Samples: 81041140. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:31:23,385][26367] Avg episode reward: [(0, '0.807')] [2024-06-18 23:31:27,759][26599] Updated weights for policy 0, policy_version 232764 (0.0035) [2024-06-18 23:31:28,380][26367] Fps is (10 sec: 44247.0, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 3813638144. Throughput: 0: 41294.6. Samples: 81295720. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:31:28,381][26367] Avg episode reward: [(0, '0.783')] [2024-06-18 23:31:30,682][26599] Updated weights for policy 0, policy_version 232774 (0.0028) [2024-06-18 23:31:33,380][26367] Fps is (10 sec: 37697.0, 60 sec: 40686.9, 300 sec: 41154.4). Total num frames: 3813834752. Throughput: 0: 41017.3. Samples: 81413520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:31:33,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-18 23:31:35,641][26599] Updated weights for policy 0, policy_version 232784 (0.0035) [2024-06-18 23:31:38,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42052.4, 300 sec: 41265.5). Total num frames: 3814080512. Throughput: 0: 41390.6. Samples: 81664680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:31:38,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-18 23:31:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000232793_3814080512.pth... [2024-06-18 23:31:38,446][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000232188_3804168192.pth [2024-06-18 23:31:38,596][26599] Updated weights for policy 0, policy_version 232794 (0.0029) [2024-06-18 23:31:43,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40413.9, 300 sec: 41154.4). Total num frames: 3814244352. Throughput: 0: 41466.7. Samples: 81919880. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:31:43,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-18 23:31:43,419][26599] Updated weights for policy 0, policy_version 232804 (0.0043) [2024-06-18 23:31:46,320][26599] Updated weights for policy 0, policy_version 232814 (0.0041) [2024-06-18 23:31:48,382][26367] Fps is (10 sec: 39312.7, 60 sec: 41504.6, 300 sec: 41209.6). Total num frames: 3814473728. Throughput: 0: 41170.1. Samples: 82033160. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:31:48,383][26367] Avg episode reward: [(0, '0.666')] [2024-06-18 23:31:51,579][26599] Updated weights for policy 0, policy_version 232824 (0.0040) [2024-06-18 23:31:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 3814686720. Throughput: 0: 41347.3. Samples: 82283740. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:31:53,381][26367] Avg episode reward: [(0, '0.399')] [2024-06-18 23:31:54,307][26599] Updated weights for policy 0, policy_version 232834 (0.0029) [2024-06-18 23:31:58,380][26367] Fps is (10 sec: 37691.1, 60 sec: 40413.8, 300 sec: 41154.4). Total num frames: 3814850560. Throughput: 0: 41274.2. Samples: 82531100. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:31:58,389][26367] Avg episode reward: [(0, '0.373')] [2024-06-18 23:31:59,762][26599] Updated weights for policy 0, policy_version 232844 (0.0039) [2024-06-18 23:32:02,810][26599] Updated weights for policy 0, policy_version 232854 (0.0031) [2024-06-18 23:32:03,380][26367] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 3815079936. Throughput: 0: 41239.9. Samples: 82646260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:32:03,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-18 23:32:07,582][26599] Updated weights for policy 0, policy_version 232864 (0.0052) [2024-06-18 23:32:08,380][26367] Fps is (10 sec: 44236.3, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 3815292928. Throughput: 0: 41354.8. Samples: 82901960. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:32:08,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-18 23:32:10,669][26599] Updated weights for policy 0, policy_version 232874 (0.0027) [2024-06-18 23:32:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 3815489536. Throughput: 0: 41162.2. Samples: 83148020. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:32:13,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-18 23:32:15,526][26599] Updated weights for policy 0, policy_version 232884 (0.0044) [2024-06-18 23:32:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42053.9, 300 sec: 41321.0). Total num frames: 3815718912. Throughput: 0: 41247.9. Samples: 83269680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:32:18,381][26367] Avg episode reward: [(0, '0.401')] [2024-06-18 23:32:18,537][26599] Updated weights for policy 0, policy_version 232894 (0.0029) [2024-06-18 23:32:23,380][26367] Fps is (10 sec: 39322.1, 60 sec: 40416.4, 300 sec: 41098.9). Total num frames: 3815882752. Throughput: 0: 41312.0. Samples: 83523720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:32:23,380][26367] Avg episode reward: [(0, '0.551')] [2024-06-18 23:32:23,398][26599] Updated weights for policy 0, policy_version 232904 (0.0039) [2024-06-18 23:32:26,770][26599] Updated weights for policy 0, policy_version 232914 (0.0035) [2024-06-18 23:32:28,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41233.2, 300 sec: 41210.0). Total num frames: 3816112128. Throughput: 0: 41040.5. Samples: 83766700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 20.0) [2024-06-18 23:32:28,381][26367] Avg episode reward: [(0, '0.841')] [2024-06-18 23:32:31,431][26599] Updated weights for policy 0, policy_version 232924 (0.0046) [2024-06-18 23:32:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 3816325120. Throughput: 0: 41263.3. Samples: 83889920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:32:33,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-18 23:32:34,577][26599] Updated weights for policy 0, policy_version 232934 (0.0042) [2024-06-18 23:32:38,380][26367] Fps is (10 sec: 39321.4, 60 sec: 40413.8, 300 sec: 41098.8). Total num frames: 3816505344. Throughput: 0: 41186.6. Samples: 84137140. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:32:38,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-18 23:32:39,172][26599] Updated weights for policy 0, policy_version 232944 (0.0038) [2024-06-18 23:32:39,391][26579] Signal inference workers to stop experience collection... (1150 times) [2024-06-18 23:32:39,391][26579] Signal inference workers to resume experience collection... (1150 times) [2024-06-18 23:32:39,407][26599] InferenceWorker_p0-w0: stopping experience collection (1150 times) [2024-06-18 23:32:39,407][26599] InferenceWorker_p0-w0: resuming experience collection (1150 times) [2024-06-18 23:32:42,421][26599] Updated weights for policy 0, policy_version 232954 (0.0038) [2024-06-18 23:32:43,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 3816718336. Throughput: 0: 41216.9. Samples: 84385860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:32:43,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-18 23:32:47,107][26599] Updated weights for policy 0, policy_version 232964 (0.0033) [2024-06-18 23:32:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41234.6, 300 sec: 41209.9). Total num frames: 3816947712. Throughput: 0: 41413.0. Samples: 84509840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:32:48,381][26367] Avg episode reward: [(0, '0.756')] [2024-06-18 23:32:50,914][26599] Updated weights for policy 0, policy_version 232974 (0.0039) [2024-06-18 23:32:53,380][26367] Fps is (10 sec: 40960.5, 60 sec: 40686.9, 300 sec: 41209.9). Total num frames: 3817127936. Throughput: 0: 41162.8. Samples: 84754280. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:32:53,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-18 23:32:54,993][26599] Updated weights for policy 0, policy_version 232984 (0.0039) [2024-06-18 23:32:58,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 41265.5). Total num frames: 3817340928. Throughput: 0: 41067.1. 
Samples: 84996040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:32:58,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-18 23:32:58,920][26599] Updated weights for policy 0, policy_version 232994 (0.0048) [2024-06-18 23:33:02,831][26599] Updated weights for policy 0, policy_version 233004 (0.0037) [2024-06-18 23:33:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41233.2, 300 sec: 41154.4). Total num frames: 3817553920. Throughput: 0: 41249.1. Samples: 85125880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:33:03,380][26367] Avg episode reward: [(0, '0.494')] [2024-06-18 23:33:06,963][26599] Updated weights for policy 0, policy_version 233014 (0.0038) [2024-06-18 23:33:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40960.1, 300 sec: 41209.9). Total num frames: 3817750528. Throughput: 0: 41022.1. Samples: 85369720. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:33:08,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-18 23:33:10,843][26599] Updated weights for policy 0, policy_version 233024 (0.0031) [2024-06-18 23:33:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 3817963520. Throughput: 0: 41045.7. Samples: 85613760. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:33:13,381][26367] Avg episode reward: [(0, '0.367')] [2024-06-18 23:33:14,812][26599] Updated weights for policy 0, policy_version 233034 (0.0036) [2024-06-18 23:33:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 3818160128. Throughput: 0: 41061.3. Samples: 85737680. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:33:18,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-18 23:33:18,678][26599] Updated weights for policy 0, policy_version 233044 (0.0036) [2024-06-18 23:33:22,921][26599] Updated weights for policy 0, policy_version 233054 (0.0034) [2024-06-18 23:33:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 3818373120. Throughput: 0: 41000.0. Samples: 85982140. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:33:23,384][26367] Avg episode reward: [(0, '0.445')] [2024-06-18 23:33:26,492][26599] Updated weights for policy 0, policy_version 233064 (0.0042) [2024-06-18 23:33:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 3818569728. Throughput: 0: 41099.3. Samples: 86235320. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:33:28,381][26367] Avg episode reward: [(0, '0.402')] [2024-06-18 23:33:30,951][26599] Updated weights for policy 0, policy_version 233074 (0.0027) [2024-06-18 23:33:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 3818782720. Throughput: 0: 40957.3. Samples: 86352920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:33:33,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-18 23:33:34,405][26599] Updated weights for policy 0, policy_version 233084 (0.0039) [2024-06-18 23:33:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 3818995712. Throughput: 0: 41062.2. Samples: 86602080. Policy #0 lag: (min: 0.0, avg: 12.3, max: 25.0) [2024-06-18 23:33:38,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-18 23:33:38,386][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000233093_3818995712.pth... 
[2024-06-18 23:33:38,432][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000232489_3809099776.pth [2024-06-18 23:33:38,757][26599] Updated weights for policy 0, policy_version 233094 (0.0039) [2024-06-18 23:33:42,409][26599] Updated weights for policy 0, policy_version 233104 (0.0045) [2024-06-18 23:33:43,384][26367] Fps is (10 sec: 40944.7, 60 sec: 41230.6, 300 sec: 41153.9). Total num frames: 3819192320. Throughput: 0: 41019.8. Samples: 86842080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:33:43,385][26367] Avg episode reward: [(0, '0.626')] [2024-06-18 23:33:46,788][26599] Updated weights for policy 0, policy_version 233114 (0.0034) [2024-06-18 23:33:48,380][26367] Fps is (10 sec: 37683.0, 60 sec: 40413.8, 300 sec: 41043.3). Total num frames: 3819372544. Throughput: 0: 40890.5. Samples: 86965960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:33:48,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-18 23:33:50,739][26599] Updated weights for policy 0, policy_version 233124 (0.0038) [2024-06-18 23:33:53,380][26367] Fps is (10 sec: 40975.4, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 3819601920. Throughput: 0: 41065.9. Samples: 87217680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:33:53,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-18 23:33:54,491][26599] Updated weights for policy 0, policy_version 233134 (0.0032) [2024-06-18 23:33:58,380][26367] Fps is (10 sec: 42599.1, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 3819798528. Throughput: 0: 41084.1. Samples: 87462540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:33:58,380][26367] Avg episode reward: [(0, '0.678')] [2024-06-18 23:33:58,599][26599] Updated weights for policy 0, policy_version 233144 (0.0033) [2024-06-18 23:34:00,668][26579] Signal inference workers to stop experience collection... 
(1200 times) [2024-06-18 23:34:00,668][26579] Signal inference workers to resume experience collection... (1200 times) [2024-06-18 23:34:00,707][26599] InferenceWorker_p0-w0: stopping experience collection (1200 times) [2024-06-18 23:34:00,708][26599] InferenceWorker_p0-w0: resuming experience collection (1200 times) [2024-06-18 23:34:02,623][26599] Updated weights for policy 0, policy_version 233154 (0.0035) [2024-06-18 23:34:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 3820011520. Throughput: 0: 41046.2. Samples: 87584760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:03,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-18 23:34:06,486][26599] Updated weights for policy 0, policy_version 233164 (0.0035) [2024-06-18 23:34:08,382][26367] Fps is (10 sec: 40951.9, 60 sec: 40958.8, 300 sec: 41098.6). Total num frames: 3820208128. Throughput: 0: 41092.0. Samples: 87831360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:08,383][26367] Avg episode reward: [(0, '0.467')] [2024-06-18 23:34:10,424][26599] Updated weights for policy 0, policy_version 233174 (0.0046) [2024-06-18 23:34:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 40959.9, 300 sec: 41209.9). Total num frames: 3820421120. Throughput: 0: 40966.1. Samples: 88078800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:13,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-18 23:34:14,290][26599] Updated weights for policy 0, policy_version 233184 (0.0031) [2024-06-18 23:34:18,380][26367] Fps is (10 sec: 42606.5, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 3820634112. Throughput: 0: 41154.7. Samples: 88204880. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:18,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-18 23:34:18,466][26599] Updated weights for policy 0, policy_version 233194 (0.0027) [2024-06-18 23:34:22,482][26599] Updated weights for policy 0, policy_version 233204 (0.0030) [2024-06-18 23:34:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 3820847104. Throughput: 0: 41066.6. Samples: 88450080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:23,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-18 23:34:26,248][26599] Updated weights for policy 0, policy_version 233214 (0.0040) [2024-06-18 23:34:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 3821043712. Throughput: 0: 41433.2. Samples: 88706420. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:28,381][26367] Avg episode reward: [(0, '0.409')] [2024-06-18 23:34:30,388][26599] Updated weights for policy 0, policy_version 233224 (0.0043) [2024-06-18 23:34:33,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41265.9). Total num frames: 3821273088. Throughput: 0: 41334.8. Samples: 88826020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:33,380][26367] Avg episode reward: [(0, '0.551')] [2024-06-18 23:34:33,944][26599] Updated weights for policy 0, policy_version 233234 (0.0039) [2024-06-18 23:34:38,139][26599] Updated weights for policy 0, policy_version 233244 (0.0041) [2024-06-18 23:34:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 3821469696. Throughput: 0: 41250.2. Samples: 89073940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:38,380][26367] Avg episode reward: [(0, '0.679')] [2024-06-18 23:34:41,891][26599] Updated weights for policy 0, policy_version 233254 (0.0050) [2024-06-18 23:34:43,380][26367] Fps is (10 sec: 37683.4, 60 sec: 40962.6, 300 sec: 41098.9). 
Total num frames: 3821649920. Throughput: 0: 41311.1. Samples: 89321540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:43,380][26367] Avg episode reward: [(0, '0.721')] [2024-06-18 23:34:45,866][26599] Updated weights for policy 0, policy_version 233264 (0.0037) [2024-06-18 23:34:48,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41233.1, 300 sec: 41099.4). Total num frames: 3821846528. Throughput: 0: 41219.1. Samples: 89439620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-18 23:34:48,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-18 23:34:49,767][26599] Updated weights for policy 0, policy_version 233274 (0.0038) [2024-06-18 23:34:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 3822092288. Throughput: 0: 41496.9. Samples: 89698640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:34:53,380][26367] Avg episode reward: [(0, '0.595')] [2024-06-18 23:34:53,855][26599] Updated weights for policy 0, policy_version 233284 (0.0047) [2024-06-18 23:34:57,636][26599] Updated weights for policy 0, policy_version 233294 (0.0038) [2024-06-18 23:34:58,380][26367] Fps is (10 sec: 44236.1, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 3822288896. Throughput: 0: 41282.2. Samples: 89936500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:34:58,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-18 23:35:01,609][26599] Updated weights for policy 0, policy_version 233304 (0.0041) [2024-06-18 23:35:03,380][26367] Fps is (10 sec: 37683.1, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 3822469120. Throughput: 0: 41189.4. Samples: 90058400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:03,381][26367] Avg episode reward: [(0, '0.738')] [2024-06-18 23:35:05,686][26599] Updated weights for policy 0, policy_version 233314 (0.0040) [2024-06-18 23:35:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41507.5, 300 sec: 41154.4). 
Total num frames: 3822698496. Throughput: 0: 41311.2. Samples: 90309080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:08,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-18 23:35:10,265][26599] Updated weights for policy 0, policy_version 233324 (0.0043) [2024-06-18 23:35:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41506.2, 300 sec: 41154.4). Total num frames: 3822911488. Throughput: 0: 41003.6. Samples: 90551580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:13,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-18 23:35:14,134][26599] Updated weights for policy 0, policy_version 233334 (0.0035) [2024-06-18 23:35:16,120][26579] Signal inference workers to stop experience collection... (1250 times) [2024-06-18 23:35:16,121][26579] Signal inference workers to resume experience collection... (1250 times) [2024-06-18 23:35:16,173][26599] InferenceWorker_p0-w0: stopping experience collection (1250 times) [2024-06-18 23:35:16,174][26599] InferenceWorker_p0-w0: resuming experience collection (1250 times) [2024-06-18 23:35:18,236][26599] Updated weights for policy 0, policy_version 233344 (0.0045) [2024-06-18 23:35:18,384][26367] Fps is (10 sec: 42582.9, 60 sec: 41503.6, 300 sec: 41209.4). Total num frames: 3823124480. Throughput: 0: 41146.0. Samples: 90677740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:18,384][26367] Avg episode reward: [(0, '0.591')] [2024-06-18 23:35:22,052][26599] Updated weights for policy 0, policy_version 233354 (0.0030) [2024-06-18 23:35:23,380][26367] Fps is (10 sec: 37682.8, 60 sec: 40686.9, 300 sec: 40987.8). Total num frames: 3823288320. Throughput: 0: 41045.2. Samples: 90920980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:23,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-18 23:35:26,123][26599] Updated weights for policy 0, policy_version 233364 (0.0032) [2024-06-18 23:35:28,380][26367] Fps is (10 sec: 37696.9, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 3823501312. Throughput: 0: 41285.7. Samples: 91179400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:28,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-18 23:35:29,911][26599] Updated weights for policy 0, policy_version 233374 (0.0034) [2024-06-18 23:35:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 3823730688. Throughput: 0: 41323.9. Samples: 91299200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:33,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-18 23:35:33,881][26599] Updated weights for policy 0, policy_version 233384 (0.0031) [2024-06-18 23:35:37,961][26599] Updated weights for policy 0, policy_version 233394 (0.0046) [2024-06-18 23:35:38,384][26367] Fps is (10 sec: 42582.9, 60 sec: 40957.5, 300 sec: 41042.8). Total num frames: 3823927296. Throughput: 0: 41031.3. Samples: 91545200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:38,385][26367] Avg episode reward: [(0, '0.319')] [2024-06-18 23:35:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000233394_3823927296.pth... [2024-06-18 23:35:38,445][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000232793_3814080512.pth [2024-06-18 23:35:42,057][26599] Updated weights for policy 0, policy_version 233404 (0.0043) [2024-06-18 23:35:43,380][26367] Fps is (10 sec: 39322.4, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 3824123904. Throughput: 0: 41167.3. Samples: 91789020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:43,380][26367] Avg episode reward: [(0, '0.244')] [2024-06-18 23:35:46,065][26599] Updated weights for policy 0, policy_version 233414 (0.0035) [2024-06-18 23:35:48,380][26367] Fps is (10 sec: 42613.4, 60 sec: 41779.1, 300 sec: 41265.4). Total num frames: 3824353280. Throughput: 0: 41215.0. Samples: 91913080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:48,381][26367] Avg episode reward: [(0, '0.270')] [2024-06-18 23:35:49,899][26599] Updated weights for policy 0, policy_version 233424 (0.0034) [2024-06-18 23:35:53,384][26367] Fps is (10 sec: 39307.2, 60 sec: 40411.4, 300 sec: 40987.3). Total num frames: 3824517120. Throughput: 0: 41010.9. Samples: 92154720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:53,384][26367] Avg episode reward: [(0, '0.355')] [2024-06-18 23:35:54,273][26599] Updated weights for policy 0, policy_version 233434 (0.0036) [2024-06-18 23:35:58,307][26599] Updated weights for policy 0, policy_version 233444 (0.0037) [2024-06-18 23:35:58,380][26367] Fps is (10 sec: 39322.4, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 3824746496. Throughput: 0: 41185.0. Samples: 92404900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-18 23:35:58,380][26367] Avg episode reward: [(0, '0.431')] [2024-06-18 23:36:02,129][26599] Updated weights for policy 0, policy_version 233454 (0.0028) [2024-06-18 23:36:03,380][26367] Fps is (10 sec: 45891.2, 60 sec: 41779.1, 300 sec: 41154.4). Total num frames: 3824975872. Throughput: 0: 41179.2. Samples: 92530660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:03,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-18 23:36:06,371][26599] Updated weights for policy 0, policy_version 233464 (0.0037) [2024-06-18 23:36:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 3825172480. Throughput: 0: 41219.7. Samples: 92775860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:08,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-18 23:36:10,232][26599] Updated weights for policy 0, policy_version 233474 (0.0039) [2024-06-18 23:36:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41321.3). Total num frames: 3825385472. Throughput: 0: 40883.1. Samples: 93019140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:13,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-18 23:36:14,267][26599] Updated weights for policy 0, policy_version 233484 (0.0046) [2024-06-18 23:36:18,161][26599] Updated weights for policy 0, policy_version 233494 (0.0039) [2024-06-18 23:36:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40962.5, 300 sec: 41099.4). Total num frames: 3825582080. Throughput: 0: 41064.5. Samples: 93147100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:18,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-18 23:36:22,072][26599] Updated weights for policy 0, policy_version 233504 (0.0046) [2024-06-18 23:36:23,384][26367] Fps is (10 sec: 39307.5, 60 sec: 41503.7, 300 sec: 41153.9). Total num frames: 3825778688. Throughput: 0: 41053.8. Samples: 93392620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:23,384][26367] Avg episode reward: [(0, '0.458')] [2024-06-18 23:36:26,024][26599] Updated weights for policy 0, policy_version 233514 (0.0051) [2024-06-18 23:36:28,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 3825991680. Throughput: 0: 41099.3. Samples: 93638500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:28,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-18 23:36:30,194][26599] Updated weights for policy 0, policy_version 233524 (0.0034) [2024-06-18 23:36:30,966][26579] Signal inference workers to stop experience collection... 
(1300 times) [2024-06-18 23:36:31,022][26599] InferenceWorker_p0-w0: stopping experience collection (1300 times) [2024-06-18 23:36:31,025][26579] Signal inference workers to resume experience collection... (1300 times) [2024-06-18 23:36:31,036][26599] InferenceWorker_p0-w0: resuming experience collection (1300 times) [2024-06-18 23:36:33,380][26367] Fps is (10 sec: 39336.4, 60 sec: 40687.1, 300 sec: 40987.8). Total num frames: 3826171904. Throughput: 0: 41163.8. Samples: 93765440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:33,380][26367] Avg episode reward: [(0, '0.658')] [2024-06-18 23:36:34,015][26599] Updated weights for policy 0, policy_version 233534 (0.0033) [2024-06-18 23:36:37,999][26599] Updated weights for policy 0, policy_version 233544 (0.0033) [2024-06-18 23:36:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41235.5, 300 sec: 41209.9). Total num frames: 3826401280. Throughput: 0: 41236.5. Samples: 94010220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:38,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-18 23:36:41,847][26599] Updated weights for policy 0, policy_version 233554 (0.0036) [2024-06-18 23:36:43,380][26367] Fps is (10 sec: 45874.9, 60 sec: 41779.2, 300 sec: 41210.2). Total num frames: 3826630656. Throughput: 0: 41037.3. Samples: 94251580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:43,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-18 23:36:46,201][26599] Updated weights for policy 0, policy_version 233564 (0.0043) [2024-06-18 23:36:48,380][26367] Fps is (10 sec: 40961.0, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 3826810880. Throughput: 0: 41255.7. Samples: 94387160. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:48,380][26367] Avg episode reward: [(0, '0.532')] [2024-06-18 23:36:49,882][26599] Updated weights for policy 0, policy_version 233574 (0.0043) [2024-06-18 23:36:53,384][26367] Fps is (10 sec: 37669.1, 60 sec: 41506.1, 300 sec: 41209.4). Total num frames: 3827007488. Throughput: 0: 41117.1. Samples: 94626280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:53,385][26367] Avg episode reward: [(0, '0.545')] [2024-06-18 23:36:54,001][26599] Updated weights for policy 0, policy_version 233584 (0.0035) [2024-06-18 23:36:57,718][26599] Updated weights for policy 0, policy_version 233594 (0.0039) [2024-06-18 23:36:58,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 3827236864. Throughput: 0: 41122.2. Samples: 94869640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:36:58,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-18 23:37:02,161][26599] Updated weights for policy 0, policy_version 233604 (0.0040) [2024-06-18 23:37:03,380][26367] Fps is (10 sec: 42614.3, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 3827433472. Throughput: 0: 41047.2. Samples: 94994220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:37:03,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-18 23:37:05,709][26599] Updated weights for policy 0, policy_version 233614 (0.0034) [2024-06-18 23:37:08,380][26367] Fps is (10 sec: 39321.6, 60 sec: 40959.9, 300 sec: 41154.4). Total num frames: 3827630080. Throughput: 0: 40985.0. Samples: 95236800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 25.0) [2024-06-18 23:37:08,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-18 23:37:09,830][26599] Updated weights for policy 0, policy_version 233624 (0.0046) [2024-06-18 23:37:13,380][26367] Fps is (10 sec: 37682.9, 60 sec: 40413.9, 300 sec: 40987.8). Total num frames: 3827810304. Throughput: 0: 41054.4. Samples: 95485940. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:13,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-18 23:37:13,773][26599] Updated weights for policy 0, policy_version 233634 (0.0043) [2024-06-18 23:37:17,725][26599] Updated weights for policy 0, policy_version 233644 (0.0045) [2024-06-18 23:37:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 3828039680. Throughput: 0: 40956.3. Samples: 95608480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:18,381][26367] Avg episode reward: [(0, '0.394')] [2024-06-18 23:37:21,747][26599] Updated weights for policy 0, policy_version 233654 (0.0035) [2024-06-18 23:37:23,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41235.6, 300 sec: 41154.4). Total num frames: 3828252672. Throughput: 0: 40965.1. Samples: 95853640. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:23,381][26367] Avg episode reward: [(0, '0.388')] [2024-06-18 23:37:25,550][26599] Updated weights for policy 0, policy_version 233664 (0.0032) [2024-06-18 23:37:28,380][26367] Fps is (10 sec: 39321.1, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 3828432896. Throughput: 0: 41177.1. Samples: 96104560. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:28,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-18 23:37:29,896][26599] Updated weights for policy 0, policy_version 233674 (0.0035) [2024-06-18 23:37:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 3828662272. Throughput: 0: 40794.2. Samples: 96222900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:33,381][26367] Avg episode reward: [(0, '0.375')] [2024-06-18 23:37:33,587][26599] Updated weights for policy 0, policy_version 233684 (0.0037) [2024-06-18 23:37:37,831][26599] Updated weights for policy 0, policy_version 233694 (0.0035) [2024-06-18 23:37:38,380][26367] Fps is (10 sec: 42599.1, 60 sec: 40960.1, 300 sec: 41154.4). 
Total num frames: 3828858880. Throughput: 0: 41034.0. Samples: 96472660. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:38,381][26367] Avg episode reward: [(0, '0.378')] [2024-06-18 23:37:38,482][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000233696_3828875264.pth... [2024-06-18 23:37:38,529][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000233093_3818995712.pth [2024-06-18 23:37:41,568][26599] Updated weights for policy 0, policy_version 233704 (0.0034) [2024-06-18 23:37:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 41098.9). Total num frames: 3829071872. Throughput: 0: 40965.0. Samples: 96713060. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:43,380][26367] Avg episode reward: [(0, '0.490')] [2024-06-18 23:37:45,744][26599] Updated weights for policy 0, policy_version 233714 (0.0037) [2024-06-18 23:37:48,380][26367] Fps is (10 sec: 39321.3, 60 sec: 40686.8, 300 sec: 41098.8). Total num frames: 3829252096. Throughput: 0: 40929.2. Samples: 96836040. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:48,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-18 23:37:49,081][26579] Signal inference workers to stop experience collection... (1350 times) [2024-06-18 23:37:49,132][26599] InferenceWorker_p0-w0: stopping experience collection (1350 times) [2024-06-18 23:37:49,141][26579] Signal inference workers to resume experience collection... (1350 times) [2024-06-18 23:37:49,154][26599] InferenceWorker_p0-w0: resuming experience collection (1350 times) [2024-06-18 23:37:49,279][26599] Updated weights for policy 0, policy_version 233724 (0.0037) [2024-06-18 23:37:53,380][26367] Fps is (10 sec: 39321.1, 60 sec: 40962.4, 300 sec: 41098.8). Total num frames: 3829465088. Throughput: 0: 41107.1. Samples: 97086620. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:53,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-18 23:37:53,690][26599] Updated weights for policy 0, policy_version 233734 (0.0048) [2024-06-18 23:37:57,311][26599] Updated weights for policy 0, policy_version 233744 (0.0033) [2024-06-18 23:37:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 40687.0, 300 sec: 41098.8). Total num frames: 3829678080. Throughput: 0: 41038.7. Samples: 97332680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:37:58,380][26367] Avg episode reward: [(0, '0.720')] [2024-06-18 23:38:01,511][26599] Updated weights for policy 0, policy_version 233754 (0.0038) [2024-06-18 23:38:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40686.8, 300 sec: 41098.8). Total num frames: 3829874688. Throughput: 0: 41120.3. Samples: 97458900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:38:03,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-18 23:38:05,481][26599] Updated weights for policy 0, policy_version 233764 (0.0049) [2024-06-18 23:38:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41098.8). Total num frames: 3830087680. Throughput: 0: 41128.0. Samples: 97704400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:38:08,381][26367] Avg episode reward: [(0, '0.290')] [2024-06-18 23:38:09,376][26599] Updated weights for policy 0, policy_version 233774 (0.0045) [2024-06-18 23:38:13,350][26599] Updated weights for policy 0, policy_version 233784 (0.0037) [2024-06-18 23:38:13,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41779.2, 300 sec: 41209.9). Total num frames: 3830317056. Throughput: 0: 40948.9. Samples: 97947260. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:38:13,381][26367] Avg episode reward: [(0, '0.295')] [2024-06-18 23:38:17,312][26599] Updated weights for policy 0, policy_version 233794 (0.0035) [2024-06-18 23:38:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 40960.0, 300 sec: 41098.8). 
Total num frames: 3830497280. Throughput: 0: 41289.7. Samples: 98080940. Policy #0 lag: (min: 1.0, avg: 10.6, max: 20.0) [2024-06-18 23:38:18,381][26367] Avg episode reward: [(0, '0.420')] [2024-06-18 23:38:21,313][26599] Updated weights for policy 0, policy_version 233804 (0.0033) [2024-06-18 23:38:23,380][26367] Fps is (10 sec: 37683.4, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 3830693888. Throughput: 0: 41099.5. Samples: 98322140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:38:23,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-18 23:38:25,252][26599] Updated weights for policy 0, policy_version 233814 (0.0030) [2024-06-18 23:38:28,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41233.2, 300 sec: 41098.9). Total num frames: 3830906880. Throughput: 0: 41162.2. Samples: 98565360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:38:28,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-18 23:38:29,338][26599] Updated weights for policy 0, policy_version 233824 (0.0045) [2024-06-18 23:38:33,132][26599] Updated weights for policy 0, policy_version 233834 (0.0043) [2024-06-18 23:38:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 3831136256. Throughput: 0: 41188.0. Samples: 98689500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:38:33,384][26367] Avg episode reward: [(0, '0.608')] [2024-06-18 23:38:37,309][26599] Updated weights for policy 0, policy_version 233844 (0.0038) [2024-06-18 23:38:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41233.0, 300 sec: 41154.9). Total num frames: 3831332864. Throughput: 0: 41140.0. Samples: 98937920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:38:38,381][26367] Avg episode reward: [(0, '0.701')] [2024-06-18 23:38:41,312][26599] Updated weights for policy 0, policy_version 233854 (0.0036) [2024-06-18 23:38:43,380][26367] Fps is (10 sec: 37682.6, 60 sec: 40686.8, 300 sec: 41154.4). 
Total num frames: 3831513088. Throughput: 0: 41124.2. Samples: 99183280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:38:43,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-18 23:38:45,245][26599] Updated weights for policy 0, policy_version 233864 (0.0044) [2024-06-18 23:38:48,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41233.1, 300 sec: 41098.8). Total num frames: 3831726080. Throughput: 0: 40864.6. Samples: 99297800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:38:48,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-18 23:38:49,402][26599] Updated weights for policy 0, policy_version 233874 (0.0038) [2024-06-18 23:38:53,374][26599] Updated weights for policy 0, policy_version 233884 (0.0046) [2024-06-18 23:38:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 3831955456. Throughput: 0: 40936.8. Samples: 99546560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:38:53,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-18 23:38:57,326][26599] Updated weights for policy 0, policy_version 233894 (0.0036) [2024-06-18 23:38:58,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 3832135680. Throughput: 0: 40938.3. Samples: 99789480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:38:58,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-18 23:39:01,424][26599] Updated weights for policy 0, policy_version 233904 (0.0044) [2024-06-18 23:39:03,380][26367] Fps is (10 sec: 37683.6, 60 sec: 40960.1, 300 sec: 41099.1). Total num frames: 3832332288. Throughput: 0: 40818.7. Samples: 99917780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:39:03,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-18 23:39:05,279][26599] Updated weights for policy 0, policy_version 233914 (0.0037) [2024-06-18 23:39:05,284][26579] Signal inference workers to stop experience collection... 
(1400 times) [2024-06-18 23:39:05,285][26579] Signal inference workers to resume experience collection... (1400 times) [2024-06-18 23:39:05,325][26599] InferenceWorker_p0-w0: stopping experience collection (1400 times) [2024-06-18 23:39:05,325][26599] InferenceWorker_p0-w0: resuming experience collection (1400 times) [2024-06-18 23:39:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41098.9). Total num frames: 3832545280. Throughput: 0: 40993.4. Samples: 100166840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:39:08,380][26367] Avg episode reward: [(0, '0.463')] [2024-06-18 23:39:09,435][26599] Updated weights for policy 0, policy_version 233924 (0.0037) [2024-06-18 23:39:13,359][26599] Updated weights for policy 0, policy_version 233934 (0.0034) [2024-06-18 23:39:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 40960.1, 300 sec: 41154.4). Total num frames: 3832774656. Throughput: 0: 41005.3. Samples: 100410600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:39:13,380][26367] Avg episode reward: [(0, '0.297')] [2024-06-18 23:39:17,417][26599] Updated weights for policy 0, policy_version 233944 (0.0043) [2024-06-18 23:39:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 3832954880. Throughput: 0: 40951.1. Samples: 100532300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:39:18,381][26367] Avg episode reward: [(0, '0.432')] [2024-06-18 23:39:21,070][26599] Updated weights for policy 0, policy_version 233954 (0.0030) [2024-06-18 23:39:23,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41098.9). Total num frames: 3833167872. Throughput: 0: 40961.5. Samples: 100781180. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:39:23,380][26367] Avg episode reward: [(0, '0.605')] [2024-06-18 23:39:25,424][26599] Updated weights for policy 0, policy_version 233964 (0.0038) [2024-06-18 23:39:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3833364480. Throughput: 0: 41104.6. Samples: 101032980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-18 23:39:28,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-18 23:39:29,454][26599] Updated weights for policy 0, policy_version 233974 (0.0035) [2024-06-18 23:39:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 3833577472. Throughput: 0: 41188.5. Samples: 101151280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:39:33,380][26367] Avg episode reward: [(0, '0.618')] [2024-06-18 23:39:33,426][26599] Updated weights for policy 0, policy_version 233984 (0.0038) [2024-06-18 23:39:37,268][26599] Updated weights for policy 0, policy_version 233994 (0.0041) [2024-06-18 23:39:38,380][26367] Fps is (10 sec: 40959.2, 60 sec: 40686.9, 300 sec: 41098.8). Total num frames: 3833774080. Throughput: 0: 41187.5. Samples: 101400000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:39:38,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-18 23:39:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000233995_3833774080.pth... [2024-06-18 23:39:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000233394_3823927296.pth [2024-06-18 23:39:41,281][26599] Updated weights for policy 0, policy_version 234004 (0.0035) [2024-06-18 23:39:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 41209.9). Total num frames: 3834003456. Throughput: 0: 41291.9. Samples: 101647620. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:39:43,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-18 23:39:45,489][26599] Updated weights for policy 0, policy_version 234014 (0.0044) [2024-06-18 23:39:48,380][26367] Fps is (10 sec: 40961.1, 60 sec: 40960.1, 300 sec: 40987.8). Total num frames: 3834183680. Throughput: 0: 41177.9. Samples: 101770780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:39:48,380][26367] Avg episode reward: [(0, '0.292')] [2024-06-18 23:39:49,249][26599] Updated weights for policy 0, policy_version 234024 (0.0044) [2024-06-18 23:39:53,255][26599] Updated weights for policy 0, policy_version 234034 (0.0026) [2024-06-18 23:39:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 3834413056. Throughput: 0: 41052.9. Samples: 102014220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:39:53,381][26367] Avg episode reward: [(0, '0.507')] [2024-06-18 23:39:57,409][26599] Updated weights for policy 0, policy_version 234044 (0.0036) [2024-06-18 23:39:58,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 3834609664. Throughput: 0: 41107.0. Samples: 102260420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:39:58,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-18 23:40:01,294][26599] Updated weights for policy 0, policy_version 234054 (0.0028) [2024-06-18 23:40:03,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3834806272. Throughput: 0: 41051.6. Samples: 102379620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:40:03,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-18 23:40:05,838][26599] Updated weights for policy 0, policy_version 234064 (0.0033) [2024-06-18 23:40:08,380][26367] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 3835002880. Throughput: 0: 40998.0. Samples: 102626100. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:40:08,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-18 23:40:09,516][26599] Updated weights for policy 0, policy_version 234074 (0.0026) [2024-06-18 23:40:13,382][26367] Fps is (10 sec: 40951.3, 60 sec: 40685.4, 300 sec: 40988.0). Total num frames: 3835215872. Throughput: 0: 40821.6. Samples: 102870040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:40:13,383][26367] Avg episode reward: [(0, '0.430')] [2024-06-18 23:40:13,837][26599] Updated weights for policy 0, policy_version 234084 (0.0045) [2024-06-18 23:40:17,766][26599] Updated weights for policy 0, policy_version 234094 (0.0039) [2024-06-18 23:40:18,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41506.0, 300 sec: 41209.9). Total num frames: 3835445248. Throughput: 0: 40954.4. Samples: 102994240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:40:18,381][26367] Avg episode reward: [(0, '0.425')] [2024-06-18 23:40:21,615][26599] Updated weights for policy 0, policy_version 234104 (0.0036) [2024-06-18 23:40:23,380][26367] Fps is (10 sec: 42607.8, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 3835641856. Throughput: 0: 40948.7. Samples: 103242680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:40:23,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-18 23:40:25,639][26599] Updated weights for policy 0, policy_version 234114 (0.0029) [2024-06-18 23:40:28,380][26367] Fps is (10 sec: 37683.6, 60 sec: 40959.9, 300 sec: 40987.8). Total num frames: 3835822080. Throughput: 0: 40904.8. Samples: 103488340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:40:28,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-18 23:40:29,594][26599] Updated weights for policy 0, policy_version 234124 (0.0041) [2024-06-18 23:40:33,380][26367] Fps is (10 sec: 39321.1, 60 sec: 40959.9, 300 sec: 41043.8). Total num frames: 3836035072. Throughput: 0: 40780.3. Samples: 103605900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:40:33,381][26367] Avg episode reward: [(0, '0.487')] [2024-06-18 23:40:33,468][26599] Updated weights for policy 0, policy_version 234134 (0.0041) [2024-06-18 23:40:33,742][26579] Signal inference workers to stop experience collection... (1450 times) [2024-06-18 23:40:33,755][26599] InferenceWorker_p0-w0: stopping experience collection (1450 times) [2024-06-18 23:40:33,858][26579] Signal inference workers to resume experience collection... (1450 times) [2024-06-18 23:40:33,858][26599] InferenceWorker_p0-w0: resuming experience collection (1450 times) [2024-06-18 23:40:37,493][26599] Updated weights for policy 0, policy_version 234144 (0.0035) [2024-06-18 23:40:38,384][26367] Fps is (10 sec: 42583.4, 60 sec: 41230.7, 300 sec: 41098.3). Total num frames: 3836248064. Throughput: 0: 41061.1. Samples: 103862120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-18 23:40:38,384][26367] Avg episode reward: [(0, '0.568')] [2024-06-18 23:40:41,626][26599] Updated weights for policy 0, policy_version 234154 (0.0042) [2024-06-18 23:40:43,380][26367] Fps is (10 sec: 42599.1, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 3836461056. Throughput: 0: 40924.6. Samples: 104102020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:40:43,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-18 23:40:45,452][26599] Updated weights for policy 0, policy_version 234164 (0.0036) [2024-06-18 23:40:48,380][26367] Fps is (10 sec: 40975.1, 60 sec: 41233.1, 300 sec: 41154.9). Total num frames: 3836657664. Throughput: 0: 41043.2. Samples: 104226560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:40:48,380][26367] Avg episode reward: [(0, '0.476')] [2024-06-18 23:40:49,396][26599] Updated weights for policy 0, policy_version 234174 (0.0036) [2024-06-18 23:40:53,380][26367] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 41043.3). Total num frames: 3836854272. Throughput: 0: 41137.1. 
Samples: 104477260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:40:53,380][26367] Avg episode reward: [(0, '0.658')] [2024-06-18 23:40:53,491][26599] Updated weights for policy 0, policy_version 234184 (0.0031) [2024-06-18 23:40:57,457][26599] Updated weights for policy 0, policy_version 234194 (0.0033) [2024-06-18 23:40:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 40987.8). Total num frames: 3837067264. Throughput: 0: 41131.7. Samples: 104720880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:40:58,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-18 23:41:01,505][26599] Updated weights for policy 0, policy_version 234204 (0.0045) [2024-06-18 23:41:03,380][26367] Fps is (10 sec: 42597.6, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 3837280256. Throughput: 0: 41142.8. Samples: 104845660. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:41:03,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-18 23:41:05,311][26599] Updated weights for policy 0, policy_version 234214 (0.0036) [2024-06-18 23:41:08,380][26367] Fps is (10 sec: 39321.8, 60 sec: 40960.1, 300 sec: 40932.2). Total num frames: 3837460480. Throughput: 0: 41000.4. Samples: 105087700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:41:08,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-18 23:41:09,388][26599] Updated weights for policy 0, policy_version 234224 (0.0050) [2024-06-18 23:41:13,380][26367] Fps is (10 sec: 39322.2, 60 sec: 40961.5, 300 sec: 40987.8). Total num frames: 3837673472. Throughput: 0: 41116.2. Samples: 105338560. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:41:13,380][26367] Avg episode reward: [(0, '0.552')] [2024-06-18 23:41:13,394][26599] Updated weights for policy 0, policy_version 234234 (0.0036) [2024-06-18 23:41:17,167][26599] Updated weights for policy 0, policy_version 234244 (0.0032) [2024-06-18 23:41:18,380][26367] Fps is (10 sec: 45875.5, 60 sec: 41233.3, 300 sec: 41154.9). Total num frames: 3837919232. Throughput: 0: 41332.1. Samples: 105465840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:41:18,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-18 23:41:21,288][26599] Updated weights for policy 0, policy_version 234254 (0.0041) [2024-06-18 23:41:23,384][26367] Fps is (10 sec: 42582.2, 60 sec: 40957.4, 300 sec: 41042.8). Total num frames: 3838099456. Throughput: 0: 40961.7. Samples: 105705400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:41:23,385][26367] Avg episode reward: [(0, '0.411')] [2024-06-18 23:41:25,214][26599] Updated weights for policy 0, policy_version 234264 (0.0046) [2024-06-18 23:41:28,380][26367] Fps is (10 sec: 36044.6, 60 sec: 40960.1, 300 sec: 41043.3). Total num frames: 3838279680. Throughput: 0: 41116.8. Samples: 105952280. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:41:28,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-18 23:41:29,314][26599] Updated weights for policy 0, policy_version 234274 (0.0029) [2024-06-18 23:41:33,299][26599] Updated weights for policy 0, policy_version 234284 (0.0037) [2024-06-18 23:41:33,380][26367] Fps is (10 sec: 40975.2, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3838509056. Throughput: 0: 41067.9. Samples: 106074620. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:41:33,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-18 23:41:37,431][26599] Updated weights for policy 0, policy_version 234294 (0.0040) [2024-06-18 23:41:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 40962.5, 300 sec: 40932.2). Total num frames: 3838705664. Throughput: 0: 41109.3. Samples: 106327180. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:41:38,380][26367] Avg episode reward: [(0, '0.706')] [2024-06-18 23:41:38,442][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000234297_3838722048.pth... [2024-06-18 23:41:38,496][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000233696_3828875264.pth [2024-06-18 23:41:41,270][26599] Updated weights for policy 0, policy_version 234304 (0.0040) [2024-06-18 23:41:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 3838918656. Throughput: 0: 41059.5. Samples: 106568560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-18 23:41:43,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-18 23:41:45,435][26599] Updated weights for policy 0, policy_version 234314 (0.0035) [2024-06-18 23:41:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41043.8). Total num frames: 3839115264. Throughput: 0: 41215.7. Samples: 106700360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:41:48,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-18 23:41:48,913][26599] Updated weights for policy 0, policy_version 234324 (0.0050) [2024-06-18 23:41:53,380][26367] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 40932.2). Total num frames: 3839311872. Throughput: 0: 41297.4. Samples: 106946080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:41:53,380][26367] Avg episode reward: [(0, '0.478')] [2024-06-18 23:41:53,480][26599] Updated weights for policy 0, policy_version 234334 (0.0048) [2024-06-18 23:41:56,810][26599] Updated weights for policy 0, policy_version 234344 (0.0049) [2024-06-18 23:41:58,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 41043.3). Total num frames: 3839541248. Throughput: 0: 41106.5. Samples: 107188360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:41:58,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-18 23:42:01,563][26599] Updated weights for policy 0, policy_version 234354 (0.0037) [2024-06-18 23:42:03,380][26367] Fps is (10 sec: 42597.9, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 3839737856. Throughput: 0: 41223.0. Samples: 107320880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:03,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-18 23:42:04,644][26599] Updated weights for policy 0, policy_version 234364 (0.0026) [2024-06-18 23:42:08,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 3839950848. Throughput: 0: 41143.9. Samples: 107556720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:08,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-18 23:42:09,598][26599] Updated weights for policy 0, policy_version 234374 (0.0040) [2024-06-18 23:42:09,614][26579] Signal inference workers to stop experience collection... (1500 times) [2024-06-18 23:42:09,614][26579] Signal inference workers to resume experience collection... 
(1500 times) [2024-06-18 23:42:09,668][26599] InferenceWorker_p0-w0: stopping experience collection (1500 times) [2024-06-18 23:42:09,677][26599] InferenceWorker_p0-w0: resuming experience collection (1500 times) [2024-06-18 23:42:12,747][26599] Updated weights for policy 0, policy_version 234384 (0.0036) [2024-06-18 23:42:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 3840163840. Throughput: 0: 41096.0. Samples: 107801600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:13,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-18 23:42:17,659][26599] Updated weights for policy 0, policy_version 234394 (0.0029) [2024-06-18 23:42:18,380][26367] Fps is (10 sec: 39321.3, 60 sec: 40413.8, 300 sec: 40987.8). Total num frames: 3840344064. Throughput: 0: 41066.2. Samples: 107922600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:18,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-18 23:42:20,800][26599] Updated weights for policy 0, policy_version 234404 (0.0034) [2024-06-18 23:42:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41508.7, 300 sec: 41209.9). Total num frames: 3840589824. Throughput: 0: 41029.2. Samples: 108173500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:23,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-18 23:42:25,374][26599] Updated weights for policy 0, policy_version 234414 (0.0037) [2024-06-18 23:42:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41043.3). Total num frames: 3840770048. Throughput: 0: 41233.8. Samples: 108424080. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:28,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-18 23:42:28,598][26599] Updated weights for policy 0, policy_version 234424 (0.0036) [2024-06-18 23:42:33,295][26599] Updated weights for policy 0, policy_version 234434 (0.0034) [2024-06-18 23:42:33,380][26367] Fps is (10 sec: 37683.2, 60 sec: 40960.0, 300 sec: 41043.3). Total num frames: 3840966656. Throughput: 0: 40977.2. Samples: 108544340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:33,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-18 23:42:36,287][26599] Updated weights for policy 0, policy_version 234444 (0.0037) [2024-06-18 23:42:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41098.8). Total num frames: 3841196032. Throughput: 0: 41061.3. Samples: 108793840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:38,380][26367] Avg episode reward: [(0, '0.403')] [2024-06-18 23:42:41,401][26599] Updated weights for policy 0, policy_version 234454 (0.0045) [2024-06-18 23:42:43,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41233.1, 300 sec: 41154.4). Total num frames: 3841392640. Throughput: 0: 41365.1. Samples: 109049780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:43,381][26367] Avg episode reward: [(0, '0.457')] [2024-06-18 23:42:44,055][26599] Updated weights for policy 0, policy_version 234464 (0.0044) [2024-06-18 23:42:48,380][26367] Fps is (10 sec: 37682.5, 60 sec: 40959.9, 300 sec: 41043.3). Total num frames: 3841572864. Throughput: 0: 41132.8. Samples: 109171860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:48,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-18 23:42:49,181][26599] Updated weights for policy 0, policy_version 234474 (0.0028) [2024-06-18 23:42:52,162][26599] Updated weights for policy 0, policy_version 234484 (0.0048) [2024-06-18 23:42:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41098.8). 
Total num frames: 3841802240. Throughput: 0: 41386.7. Samples: 109419120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:42:53,380][26367] Avg episode reward: [(0, '0.555')] [2024-06-18 23:42:56,902][26599] Updated weights for policy 0, policy_version 234494 (0.0031) [2024-06-18 23:42:58,380][26367] Fps is (10 sec: 42599.0, 60 sec: 40960.1, 300 sec: 41098.9). Total num frames: 3841998848. Throughput: 0: 41627.1. Samples: 109674820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:42:58,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-18 23:43:00,195][26599] Updated weights for policy 0, policy_version 234504 (0.0034) [2024-06-18 23:43:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41154.4). Total num frames: 3842228224. Throughput: 0: 41506.2. Samples: 109790380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:03,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-18 23:43:04,934][26599] Updated weights for policy 0, policy_version 234514 (0.0031) [2024-06-18 23:43:07,974][26599] Updated weights for policy 0, policy_version 234524 (0.0030) [2024-06-18 23:43:08,380][26367] Fps is (10 sec: 44237.1, 60 sec: 41506.1, 300 sec: 41098.9). Total num frames: 3842441216. Throughput: 0: 41589.0. Samples: 110045000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:08,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-18 23:43:12,737][26599] Updated weights for policy 0, policy_version 234534 (0.0044) [2024-06-18 23:43:13,380][26367] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41098.8). Total num frames: 3842621440. Throughput: 0: 41472.8. Samples: 110290360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:13,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-18 23:43:15,885][26599] Updated weights for policy 0, policy_version 234544 (0.0034) [2024-06-18 23:43:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41265.5). 
Total num frames: 3842867200. Throughput: 0: 41527.7. Samples: 110413080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:18,380][26367] Avg episode reward: [(0, '0.741')] [2024-06-18 23:43:20,516][26599] Updated weights for policy 0, policy_version 234554 (0.0041) [2024-06-18 23:43:23,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41233.1, 300 sec: 41209.9). Total num frames: 3843063808. Throughput: 0: 41601.7. Samples: 110665920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:23,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-18 23:43:23,698][26599] Updated weights for policy 0, policy_version 234564 (0.0034) [2024-06-18 23:43:28,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41043.3). Total num frames: 3843244032. Throughput: 0: 41598.2. Samples: 110921700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:28,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-18 23:43:28,510][26599] Updated weights for policy 0, policy_version 234574 (0.0043) [2024-06-18 23:43:31,714][26599] Updated weights for policy 0, policy_version 234584 (0.0032) [2024-06-18 23:43:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 41209.9). Total num frames: 3843489792. Throughput: 0: 41477.5. Samples: 111038340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:33,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-18 23:43:36,210][26599] Updated weights for policy 0, policy_version 234594 (0.0026) [2024-06-18 23:43:38,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 3843686400. Throughput: 0: 41669.7. Samples: 111294260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:38,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-18 23:43:38,384][26579] Signal inference workers to stop experience collection... (1550 times) [2024-06-18 23:43:38,384][26579] Signal inference workers to resume experience collection... 
(1550 times) [2024-06-18 23:43:38,420][26599] InferenceWorker_p0-w0: stopping experience collection (1550 times) [2024-06-18 23:43:38,420][26599] InferenceWorker_p0-w0: resuming experience collection (1550 times) [2024-06-18 23:43:38,536][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000234601_3843702784.pth... [2024-06-18 23:43:38,581][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000233995_3833774080.pth [2024-06-18 23:43:39,992][26599] Updated weights for policy 0, policy_version 234604 (0.0043) [2024-06-18 23:43:43,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 41209.9). Total num frames: 3843883008. Throughput: 0: 41552.0. Samples: 111544660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:43,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-18 23:43:44,019][26599] Updated weights for policy 0, policy_version 234614 (0.0026) [2024-06-18 23:43:47,749][26599] Updated weights for policy 0, policy_version 234624 (0.0045) [2024-06-18 23:43:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 41154.4). Total num frames: 3844096000. Throughput: 0: 41730.3. Samples: 111668240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:48,380][26367] Avg episode reward: [(0, '0.640')] [2024-06-18 23:43:52,069][26599] Updated weights for policy 0, policy_version 234634 (0.0045) [2024-06-18 23:43:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41265.5). Total num frames: 3844308992. Throughput: 0: 41667.5. Samples: 111920040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:53,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-18 23:43:55,424][26599] Updated weights for policy 0, policy_version 234644 (0.0037) [2024-06-18 23:43:58,385][26367] Fps is (10 sec: 40940.9, 60 sec: 41776.0, 300 sec: 41264.8). Total num frames: 3844505600. Throughput: 0: 41539.4. Samples: 112159820. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:43:58,385][26367] Avg episode reward: [(0, '0.648')] [2024-06-18 23:43:59,839][26599] Updated weights for policy 0, policy_version 234654 (0.0030) [2024-06-18 23:44:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 3844718592. Throughput: 0: 41740.8. Samples: 112291420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-18 23:44:03,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-18 23:44:03,506][26599] Updated weights for policy 0, policy_version 234664 (0.0030) [2024-06-18 23:44:07,509][26599] Updated weights for policy 0, policy_version 234674 (0.0035) [2024-06-18 23:44:08,380][26367] Fps is (10 sec: 40978.6, 60 sec: 41233.0, 300 sec: 41154.4). Total num frames: 3844915200. Throughput: 0: 41634.6. Samples: 112539480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:08,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-18 23:44:11,588][26599] Updated weights for policy 0, policy_version 234684 (0.0042) [2024-06-18 23:44:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 41321.0). Total num frames: 3845144576. Throughput: 0: 41512.9. Samples: 112789780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:13,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-18 23:44:15,276][26599] Updated weights for policy 0, policy_version 234694 (0.0039) [2024-06-18 23:44:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41233.0, 300 sec: 41265.4). Total num frames: 3845341184. Throughput: 0: 41776.8. Samples: 112918300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:18,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-18 23:44:19,255][26599] Updated weights for policy 0, policy_version 234704 (0.0028) [2024-06-18 23:44:22,966][26599] Updated weights for policy 0, policy_version 234714 (0.0029) [2024-06-18 23:44:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41321.0). Total num frames: 3845554176. Throughput: 0: 41468.5. Samples: 113160340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:23,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-18 23:44:26,966][26599] Updated weights for policy 0, policy_version 234724 (0.0032) [2024-06-18 23:44:28,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 41321.0). Total num frames: 3845767168. Throughput: 0: 41492.9. Samples: 113411840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:28,381][26367] Avg episode reward: [(0, '0.359')] [2024-06-18 23:44:30,734][26599] Updated weights for policy 0, policy_version 234734 (0.0039) [2024-06-18 23:44:33,384][26367] Fps is (10 sec: 40944.9, 60 sec: 41230.5, 300 sec: 41320.5). Total num frames: 3845963776. Throughput: 0: 41423.7. Samples: 113532460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:33,384][26367] Avg episode reward: [(0, '0.467')] [2024-06-18 23:44:34,817][26599] Updated weights for policy 0, policy_version 234744 (0.0042) [2024-06-18 23:44:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 3846193152. Throughput: 0: 41494.7. Samples: 113787300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:38,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-18 23:44:38,424][26599] Updated weights for policy 0, policy_version 234754 (0.0022) [2024-06-18 23:44:42,631][26599] Updated weights for policy 0, policy_version 234764 (0.0035) [2024-06-18 23:44:43,380][26367] Fps is (10 sec: 40974.7, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 3846373376. Throughput: 0: 41590.0. Samples: 114031180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:43,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-18 23:44:46,346][26599] Updated weights for policy 0, policy_version 234774 (0.0039) [2024-06-18 23:44:48,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 3846586368. Throughput: 0: 41388.0. Samples: 114153880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:48,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-18 23:44:50,735][26599] Updated weights for policy 0, policy_version 234784 (0.0041) [2024-06-18 23:44:53,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41233.2, 300 sec: 41265.5). Total num frames: 3846782976. Throughput: 0: 41426.4. Samples: 114403660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:53,380][26367] Avg episode reward: [(0, '0.343')] [2024-06-18 23:44:54,398][26599] Updated weights for policy 0, policy_version 234794 (0.0045) [2024-06-18 23:44:58,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41509.2, 300 sec: 41321.0). Total num frames: 3846995968. Throughput: 0: 41482.1. Samples: 114656480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:44:58,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-18 23:44:59,053][26599] Updated weights for policy 0, policy_version 234804 (0.0039) [2024-06-18 23:45:02,450][26599] Updated weights for policy 0, policy_version 234814 (0.0040) [2024-06-18 23:45:03,381][26367] Fps is (10 sec: 40958.6, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 3847192576. Throughput: 0: 41251.4. Samples: 114774620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:45:03,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-18 23:45:04,723][26579] Signal inference workers to stop experience collection... (1600 times) [2024-06-18 23:45:04,723][26579] Signal inference workers to resume experience collection... (1600 times) [2024-06-18 23:45:04,767][26599] InferenceWorker_p0-w0: stopping experience collection (1600 times) [2024-06-18 23:45:04,768][26599] InferenceWorker_p0-w0: resuming experience collection (1600 times) [2024-06-18 23:45:06,821][26599] Updated weights for policy 0, policy_version 234824 (0.0034) [2024-06-18 23:45:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41321.3). Total num frames: 3847405568. Throughput: 0: 41478.1. Samples: 115026860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:45:08,384][26367] Avg episode reward: [(0, '0.645')] [2024-06-18 23:45:10,376][26599] Updated weights for policy 0, policy_version 234834 (0.0033) [2024-06-18 23:45:13,380][26367] Fps is (10 sec: 40961.1, 60 sec: 40960.0, 300 sec: 41210.0). Total num frames: 3847602176. Throughput: 0: 41350.2. Samples: 115272600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:45:13,381][26367] Avg episode reward: [(0, '0.750')] [2024-06-18 23:45:14,905][26599] Updated weights for policy 0, policy_version 234844 (0.0038) [2024-06-18 23:45:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 3847831552. Throughput: 0: 41417.1. 
Samples: 115396080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:45:18,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-18 23:45:18,608][26599] Updated weights for policy 0, policy_version 234854 (0.0043) [2024-06-18 23:45:22,663][26599] Updated weights for policy 0, policy_version 234864 (0.0035) [2024-06-18 23:45:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 3848044544. Throughput: 0: 41420.0. Samples: 115651200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:45:23,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-18 23:45:26,513][26599] Updated weights for policy 0, policy_version 234874 (0.0033) [2024-06-18 23:45:28,384][26367] Fps is (10 sec: 42583.0, 60 sec: 41503.6, 300 sec: 41431.6). Total num frames: 3848257536. Throughput: 0: 41466.9. Samples: 115897340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:45:28,385][26367] Avg episode reward: [(0, '0.419')] [2024-06-18 23:45:30,436][26599] Updated weights for policy 0, policy_version 234884 (0.0032) [2024-06-18 23:45:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41508.7, 300 sec: 41377.1). Total num frames: 3848454144. Throughput: 0: 41583.2. Samples: 116025120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:45:33,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-18 23:45:34,654][26599] Updated weights for policy 0, policy_version 234894 (0.0042) [2024-06-18 23:45:38,190][26599] Updated weights for policy 0, policy_version 234904 (0.0039) [2024-06-18 23:45:38,380][26367] Fps is (10 sec: 40974.5, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 3848667136. Throughput: 0: 41585.1. Samples: 116275000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:45:38,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-18 23:45:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000234904_3848667136.pth... 
[2024-06-18 23:45:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000234297_3838722048.pth [2024-06-18 23:45:42,510][26599] Updated weights for policy 0, policy_version 234914 (0.0042) [2024-06-18 23:45:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 3848880128. Throughput: 0: 41368.5. Samples: 116518060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:45:43,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-18 23:45:46,509][26599] Updated weights for policy 0, policy_version 234924 (0.0030) [2024-06-18 23:45:48,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 3849076736. Throughput: 0: 41511.8. Samples: 116642640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:45:48,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-18 23:45:50,651][26599] Updated weights for policy 0, policy_version 234934 (0.0036) [2024-06-18 23:45:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 3849273344. Throughput: 0: 41287.7. Samples: 116884800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:45:53,384][26367] Avg episode reward: [(0, '0.756')] [2024-06-18 23:45:54,287][26599] Updated weights for policy 0, policy_version 234944 (0.0031) [2024-06-18 23:45:58,384][26367] Fps is (10 sec: 37669.1, 60 sec: 40957.6, 300 sec: 41265.0). Total num frames: 3849453568. Throughput: 0: 41524.1. Samples: 117141340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:45:58,384][26367] Avg episode reward: [(0, '0.469')] [2024-06-18 23:45:58,698][26599] Updated weights for policy 0, policy_version 234954 (0.0030) [2024-06-18 23:46:02,072][26599] Updated weights for policy 0, policy_version 234964 (0.0031) [2024-06-18 23:46:03,384][26367] Fps is (10 sec: 42582.8, 60 sec: 41776.8, 300 sec: 41487.1). Total num frames: 3849699328. Throughput: 0: 41478.5. Samples: 117262760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:46:03,384][26367] Avg episode reward: [(0, '0.230')] [2024-06-18 23:46:06,343][26599] Updated weights for policy 0, policy_version 234974 (0.0035) [2024-06-18 23:46:08,384][26367] Fps is (10 sec: 45875.4, 60 sec: 41776.8, 300 sec: 41487.1). Total num frames: 3849912320. Throughput: 0: 41406.5. Samples: 117514640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:46:08,384][26367] Avg episode reward: [(0, '0.468')] [2024-06-18 23:46:09,975][26599] Updated weights for policy 0, policy_version 234984 (0.0043) [2024-06-18 23:46:13,380][26367] Fps is (10 sec: 40975.1, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 3850108928. Throughput: 0: 41487.4. Samples: 117764120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:46:13,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-18 23:46:14,355][26599] Updated weights for policy 0, policy_version 234994 (0.0037) [2024-06-18 23:46:17,855][26599] Updated weights for policy 0, policy_version 235004 (0.0034) [2024-06-18 23:46:18,380][26367] Fps is (10 sec: 40974.3, 60 sec: 41506.1, 300 sec: 41432.6). Total num frames: 3850321920. Throughput: 0: 41274.5. Samples: 117882480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:46:18,381][26367] Avg episode reward: [(0, '0.339')] [2024-06-18 23:46:22,183][26599] Updated weights for policy 0, policy_version 235014 (0.0035) [2024-06-18 23:46:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 3850534912. Throughput: 0: 41287.8. Samples: 118132940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-18 23:46:23,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-18 23:46:25,611][26599] Updated weights for policy 0, policy_version 235024 (0.0041) [2024-06-18 23:46:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41235.6, 300 sec: 41432.1). Total num frames: 3850731520. Throughput: 0: 41403.1. Samples: 118381200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:46:28,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-18 23:46:30,050][26599] Updated weights for policy 0, policy_version 235034 (0.0036) [2024-06-18 23:46:32,988][26579] Signal inference workers to stop experience collection... (1650 times) [2024-06-18 23:46:33,045][26599] InferenceWorker_p0-w0: stopping experience collection (1650 times) [2024-06-18 23:46:33,049][26579] Signal inference workers to resume experience collection... (1650 times) [2024-06-18 23:46:33,059][26599] InferenceWorker_p0-w0: resuming experience collection (1650 times) [2024-06-18 23:46:33,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3850944512. Throughput: 0: 41498.6. Samples: 118510080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:46:33,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-18 23:46:33,529][26599] Updated weights for policy 0, policy_version 235044 (0.0034) [2024-06-18 23:46:37,836][26599] Updated weights for policy 0, policy_version 235054 (0.0029) [2024-06-18 23:46:38,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41230.7, 300 sec: 41431.6). Total num frames: 3851141120. Throughput: 0: 41593.1. Samples: 118756640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:46:38,384][26367] Avg episode reward: [(0, '0.623')] [2024-06-18 23:46:41,561][26599] Updated weights for policy 0, policy_version 235064 (0.0039) [2024-06-18 23:46:43,380][26367] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 3851337728. Throughput: 0: 41429.6. Samples: 119005520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:46:43,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-18 23:46:46,268][26599] Updated weights for policy 0, policy_version 235074 (0.0038) [2024-06-18 23:46:48,380][26367] Fps is (10 sec: 40974.9, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 3851550720. Throughput: 0: 41569.1. 
Samples: 119133220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:46:48,384][26367] Avg episode reward: [(0, '0.551')] [2024-06-18 23:46:49,768][26599] Updated weights for policy 0, policy_version 235084 (0.0039) [2024-06-18 23:46:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 3851763712. Throughput: 0: 41320.1. Samples: 119373900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:46:53,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-18 23:46:54,055][26599] Updated weights for policy 0, policy_version 235094 (0.0038) [2024-06-18 23:46:57,888][26599] Updated weights for policy 0, policy_version 235104 (0.0038) [2024-06-18 23:46:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41781.7, 300 sec: 41432.1). Total num frames: 3851960320. Throughput: 0: 41387.4. Samples: 119626560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:46:58,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-18 23:47:02,001][26599] Updated weights for policy 0, policy_version 235114 (0.0039) [2024-06-18 23:47:03,380][26367] Fps is (10 sec: 42599.2, 60 sec: 41508.7, 300 sec: 41487.6). Total num frames: 3852189696. Throughput: 0: 41356.2. Samples: 119743500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:47:03,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-18 23:47:05,916][26599] Updated weights for policy 0, policy_version 235124 (0.0031) [2024-06-18 23:47:08,380][26367] Fps is (10 sec: 42599.1, 60 sec: 41235.6, 300 sec: 41432.1). Total num frames: 3852386304. Throughput: 0: 41433.7. Samples: 119997460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:47:08,380][26367] Avg episode reward: [(0, '0.599')] [2024-06-18 23:47:09,735][26599] Updated weights for policy 0, policy_version 235134 (0.0040) [2024-06-18 23:47:13,380][26367] Fps is (10 sec: 39321.1, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 3852582912. Throughput: 0: 41329.3. 
Samples: 120241020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:47:13,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-18 23:47:13,820][26599] Updated weights for policy 0, policy_version 235144 (0.0039) [2024-06-18 23:47:17,704][26599] Updated weights for policy 0, policy_version 235154 (0.0036) [2024-06-18 23:47:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 3852779520. Throughput: 0: 41194.2. Samples: 120363820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:47:18,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-18 23:47:21,913][26599] Updated weights for policy 0, policy_version 235164 (0.0051) [2024-06-18 23:47:23,384][26367] Fps is (10 sec: 40944.3, 60 sec: 40957.3, 300 sec: 41431.5). Total num frames: 3852992512. Throughput: 0: 41207.3. Samples: 120610980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:47:23,385][26367] Avg episode reward: [(0, '0.476')] [2024-06-18 23:47:25,522][26599] Updated weights for policy 0, policy_version 235174 (0.0048) [2024-06-18 23:47:28,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 3853205504. Throughput: 0: 41144.9. Samples: 120857040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:47:28,380][26367] Avg episode reward: [(0, '0.598')] [2024-06-18 23:47:29,564][26599] Updated weights for policy 0, policy_version 235184 (0.0032) [2024-06-18 23:47:33,328][26599] Updated weights for policy 0, policy_version 235194 (0.0036) [2024-06-18 23:47:33,380][26367] Fps is (10 sec: 42615.3, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 3853418496. Throughput: 0: 41122.3. Samples: 120983720. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-18 23:47:33,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-18 23:47:37,537][26599] Updated weights for policy 0, policy_version 235204 (0.0037) [2024-06-18 23:47:38,380][26367] Fps is (10 sec: 39321.6, 60 sec: 40962.5, 300 sec: 41376.5). Total num frames: 3853598720. Throughput: 0: 41328.6. Samples: 121233680. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:47:38,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-18 23:47:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000235205_3853598720.pth... [2024-06-18 23:47:38,463][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000234601_3843702784.pth [2024-06-18 23:47:41,326][26599] Updated weights for policy 0, policy_version 235214 (0.0032) [2024-06-18 23:47:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 3853828096. Throughput: 0: 40897.5. Samples: 121466940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:47:43,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-18 23:47:45,724][26599] Updated weights for policy 0, policy_version 235224 (0.0040) [2024-06-18 23:47:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 3854024704. Throughput: 0: 41230.1. Samples: 121598860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:47:48,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-18 23:47:49,294][26599] Updated weights for policy 0, policy_version 235234 (0.0035) [2024-06-18 23:47:53,380][26367] Fps is (10 sec: 37683.2, 60 sec: 40687.0, 300 sec: 41376.5). Total num frames: 3854204928. Throughput: 0: 41048.4. Samples: 121844640. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:47:53,381][26367] Avg episode reward: [(0, '0.445')] [2024-06-18 23:47:53,691][26599] Updated weights for policy 0, policy_version 235244 (0.0034) [2024-06-18 23:47:56,954][26599] Updated weights for policy 0, policy_version 235254 (0.0047) [2024-06-18 23:47:58,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.3, 300 sec: 41432.1). Total num frames: 3854450688. Throughput: 0: 41134.3. Samples: 122092060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:47:58,381][26367] Avg episode reward: [(0, '0.395')] [2024-06-18 23:48:02,163][26599] Updated weights for policy 0, policy_version 235264 (0.0032) [2024-06-18 23:48:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 40686.8, 300 sec: 41321.0). Total num frames: 3854630912. Throughput: 0: 41277.3. Samples: 122221300. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:48:03,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-18 23:48:04,796][26599] Updated weights for policy 0, policy_version 235274 (0.0037) [2024-06-18 23:48:08,380][26367] Fps is (10 sec: 39321.2, 60 sec: 40959.9, 300 sec: 41432.1). Total num frames: 3854843904. Throughput: 0: 41082.6. Samples: 122459540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:48:08,381][26367] Avg episode reward: [(0, '0.361')] [2024-06-18 23:48:09,974][26599] Updated weights for policy 0, policy_version 235284 (0.0035) [2024-06-18 23:48:12,184][26579] Signal inference workers to stop experience collection... (1700 times) [2024-06-18 23:48:12,184][26579] Signal inference workers to resume experience collection... 
(1700 times) [2024-06-18 23:48:12,206][26599] InferenceWorker_p0-w0: stopping experience collection (1700 times) [2024-06-18 23:48:12,206][26599] InferenceWorker_p0-w0: resuming experience collection (1700 times) [2024-06-18 23:48:13,353][26599] Updated weights for policy 0, policy_version 235294 (0.0030) [2024-06-18 23:48:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 3855056896. Throughput: 0: 41206.6. Samples: 122711340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:48:13,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-18 23:48:17,685][26599] Updated weights for policy 0, policy_version 235304 (0.0032) [2024-06-18 23:48:18,380][26367] Fps is (10 sec: 39322.3, 60 sec: 40960.1, 300 sec: 41265.5). Total num frames: 3855237120. Throughput: 0: 41128.9. Samples: 122834520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:48:18,380][26367] Avg episode reward: [(0, '0.427')] [2024-06-18 23:48:21,104][26599] Updated weights for policy 0, policy_version 235314 (0.0042) [2024-06-18 23:48:23,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41508.9, 300 sec: 41487.6). Total num frames: 3855482880. Throughput: 0: 41033.3. Samples: 123080180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:48:23,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-18 23:48:25,414][26599] Updated weights for policy 0, policy_version 235324 (0.0023) [2024-06-18 23:48:28,380][26367] Fps is (10 sec: 42597.6, 60 sec: 40959.9, 300 sec: 41265.4). Total num frames: 3855663104. Throughput: 0: 41581.2. Samples: 123338100. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:48:28,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-18 23:48:28,842][26599] Updated weights for policy 0, policy_version 235334 (0.0036) [2024-06-18 23:48:33,200][26599] Updated weights for policy 0, policy_version 235344 (0.0041) [2024-06-18 23:48:33,380][26367] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 3855876096. Throughput: 0: 41235.2. Samples: 123454440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:48:33,380][26367] Avg episode reward: [(0, '0.483')] [2024-06-18 23:48:36,617][26599] Updated weights for policy 0, policy_version 235354 (0.0038) [2024-06-18 23:48:38,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 3856121856. Throughput: 0: 41361.3. Samples: 123705900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:48:38,381][26367] Avg episode reward: [(0, '0.350')] [2024-06-18 23:48:41,034][26599] Updated weights for policy 0, policy_version 235364 (0.0025) [2024-06-18 23:48:43,380][26367] Fps is (10 sec: 39321.8, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 3856269312. Throughput: 0: 41466.7. Samples: 123958060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-18 23:48:43,380][26367] Avg episode reward: [(0, '0.743')] [2024-06-18 23:48:44,922][26599] Updated weights for policy 0, policy_version 235374 (0.0035) [2024-06-18 23:48:48,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 3856515072. Throughput: 0: 41087.6. Samples: 124070240. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:48:48,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-18 23:48:49,460][26599] Updated weights for policy 0, policy_version 235384 (0.0050) [2024-06-18 23:48:52,857][26599] Updated weights for policy 0, policy_version 235394 (0.0036) [2024-06-18 23:48:53,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42052.2, 300 sec: 41432.7). Total num frames: 3856728064. Throughput: 0: 41466.2. Samples: 124325520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:48:53,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-18 23:48:57,263][26599] Updated weights for policy 0, policy_version 235404 (0.0039) [2024-06-18 23:48:58,380][26367] Fps is (10 sec: 36045.0, 60 sec: 40413.8, 300 sec: 41209.9). Total num frames: 3856875520. Throughput: 0: 41402.7. Samples: 124574460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:48:58,381][26367] Avg episode reward: [(0, '0.432')] [2024-06-18 23:49:00,787][26599] Updated weights for policy 0, policy_version 235414 (0.0041) [2024-06-18 23:49:03,384][26367] Fps is (10 sec: 42583.2, 60 sec: 42049.8, 300 sec: 41487.1). Total num frames: 3857154048. Throughput: 0: 41260.6. Samples: 124691400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:03,385][26367] Avg episode reward: [(0, '0.678')] [2024-06-18 23:49:05,042][26599] Updated weights for policy 0, policy_version 235424 (0.0032) [2024-06-18 23:49:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 40960.0, 300 sec: 41209.9). Total num frames: 3857301504. Throughput: 0: 41601.3. Samples: 124952240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:08,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-18 23:49:08,712][26599] Updated weights for policy 0, policy_version 235434 (0.0031) [2024-06-18 23:49:12,838][26599] Updated weights for policy 0, policy_version 235444 (0.0043) [2024-06-18 23:49:13,380][26367] Fps is (10 sec: 36057.6, 60 sec: 40959.9, 300 sec: 41265.5). 
Total num frames: 3857514496. Throughput: 0: 41208.0. Samples: 125192460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:13,381][26367] Avg episode reward: [(0, '0.381')] [2024-06-18 23:49:16,599][26599] Updated weights for policy 0, policy_version 235454 (0.0029) [2024-06-18 23:49:18,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42325.2, 300 sec: 41432.1). Total num frames: 3857776640. Throughput: 0: 41565.7. Samples: 125324900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:18,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-18 23:49:20,634][26599] Updated weights for policy 0, policy_version 235464 (0.0051) [2024-06-18 23:49:23,380][26367] Fps is (10 sec: 42598.9, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 3857940480. Throughput: 0: 41547.2. Samples: 125575520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:23,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-18 23:49:24,435][26599] Updated weights for policy 0, policy_version 235474 (0.0039) [2024-06-18 23:49:28,380][26367] Fps is (10 sec: 37683.4, 60 sec: 41506.2, 300 sec: 41321.5). Total num frames: 3858153472. Throughput: 0: 41318.6. Samples: 125817400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:28,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-18 23:49:28,439][26599] Updated weights for policy 0, policy_version 235484 (0.0044) [2024-06-18 23:49:32,266][26599] Updated weights for policy 0, policy_version 235494 (0.0035) [2024-06-18 23:49:33,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 3858382848. Throughput: 0: 41668.1. Samples: 125945300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:33,380][26367] Avg episode reward: [(0, '0.641')] [2024-06-18 23:49:36,828][26599] Updated weights for policy 0, policy_version 235504 (0.0047) [2024-06-18 23:49:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 40687.0, 300 sec: 41321.0). 
Total num frames: 3858563072. Throughput: 0: 41539.6. Samples: 126194800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:38,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-18 23:49:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000235508_3858563072.pth... [2024-06-18 23:49:38,458][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000234904_3848667136.pth [2024-06-18 23:49:40,006][26599] Updated weights for policy 0, policy_version 235514 (0.0033) [2024-06-18 23:49:40,905][26579] Signal inference workers to stop experience collection... (1750 times) [2024-06-18 23:49:40,912][26579] Signal inference workers to resume experience collection... (1750 times) [2024-06-18 23:49:40,928][26599] InferenceWorker_p0-w0: stopping experience collection (1750 times) [2024-06-18 23:49:40,956][26599] InferenceWorker_p0-w0: resuming experience collection (1750 times) [2024-06-18 23:49:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 41432.1). Total num frames: 3858808832. Throughput: 0: 41209.4. Samples: 126428880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:43,380][26367] Avg episode reward: [(0, '0.820')] [2024-06-18 23:49:44,620][26599] Updated weights for policy 0, policy_version 235524 (0.0040) [2024-06-18 23:49:48,039][26599] Updated weights for policy 0, policy_version 235534 (0.0027) [2024-06-18 23:49:48,384][26367] Fps is (10 sec: 44220.9, 60 sec: 41503.6, 300 sec: 41431.6). Total num frames: 3859005440. Throughput: 0: 41628.9. Samples: 126564700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:48,384][26367] Avg episode reward: [(0, '0.570')] [2024-06-18 23:49:52,346][26599] Updated weights for policy 0, policy_version 235544 (0.0039) [2024-06-18 23:49:53,380][26367] Fps is (10 sec: 36044.9, 60 sec: 40687.1, 300 sec: 41265.5). Total num frames: 3859169280. Throughput: 0: 41323.7. Samples: 126811800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 25.0) [2024-06-18 23:49:53,380][26367] Avg episode reward: [(0, '0.570')] [2024-06-18 23:49:55,764][26599] Updated weights for policy 0, policy_version 235554 (0.0034) [2024-06-18 23:49:58,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42598.4, 300 sec: 41487.7). Total num frames: 3859431424. Throughput: 0: 41501.9. Samples: 127060040. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:49:58,380][26367] Avg episode reward: [(0, '0.170')] [2024-06-18 23:50:00,609][26599] Updated weights for policy 0, policy_version 235564 (0.0039) [2024-06-18 23:50:03,380][26367] Fps is (10 sec: 42598.3, 60 sec: 40689.5, 300 sec: 41321.0). Total num frames: 3859595264. Throughput: 0: 41477.1. Samples: 127191360. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:03,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-18 23:50:03,860][26599] Updated weights for policy 0, policy_version 235574 (0.0036) [2024-06-18 23:50:08,317][26599] Updated weights for policy 0, policy_version 235584 (0.0033) [2024-06-18 23:50:08,380][26367] Fps is (10 sec: 37682.8, 60 sec: 41779.2, 300 sec: 41376.5). Total num frames: 3859808256. Throughput: 0: 41193.7. Samples: 127429240. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:08,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-18 23:50:11,765][26599] Updated weights for policy 0, policy_version 235594 (0.0038) [2024-06-18 23:50:13,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.4, 300 sec: 41376.6). Total num frames: 3860037632. Throughput: 0: 41246.8. Samples: 127673500. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:13,380][26367] Avg episode reward: [(0, '0.683')] [2024-06-18 23:50:16,413][26599] Updated weights for policy 0, policy_version 235604 (0.0040) [2024-06-18 23:50:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 3860217856. Throughput: 0: 41484.9. Samples: 127812120. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:18,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-18 23:50:19,552][26599] Updated weights for policy 0, policy_version 235614 (0.0046) [2024-06-18 23:50:23,380][26367] Fps is (10 sec: 39320.5, 60 sec: 41506.0, 300 sec: 41266.0). Total num frames: 3860430848. Throughput: 0: 41200.3. Samples: 128048820. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:23,381][26367] Avg episode reward: [(0, '0.719')] [2024-06-18 23:50:24,297][26599] Updated weights for policy 0, policy_version 235624 (0.0042) [2024-06-18 23:50:27,743][26599] Updated weights for policy 0, policy_version 235634 (0.0043) [2024-06-18 23:50:28,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 41432.1). Total num frames: 3860676608. Throughput: 0: 41482.1. Samples: 128295580. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:28,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-18 23:50:32,515][26599] Updated weights for policy 0, policy_version 235644 (0.0031) [2024-06-18 23:50:33,380][26367] Fps is (10 sec: 40960.6, 60 sec: 40959.9, 300 sec: 41265.5). Total num frames: 3860840448. Throughput: 0: 41241.5. Samples: 128420420. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:33,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-18 23:50:35,510][26599] Updated weights for policy 0, policy_version 235654 (0.0039) [2024-06-18 23:50:38,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 41376.6). Total num frames: 3861086208. Throughput: 0: 41178.2. Samples: 128664820. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:38,380][26367] Avg episode reward: [(0, '0.529')] [2024-06-18 23:50:40,185][26599] Updated weights for policy 0, policy_version 235664 (0.0036) [2024-06-18 23:50:43,380][26367] Fps is (10 sec: 42599.0, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 3861266432. Throughput: 0: 41285.8. Samples: 128917900. 
Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:43,380][26367] Avg episode reward: [(0, '0.459')] [2024-06-18 23:50:43,407][26599] Updated weights for policy 0, policy_version 235674 (0.0041) [2024-06-18 23:50:47,880][26599] Updated weights for policy 0, policy_version 235684 (0.0043) [2024-06-18 23:50:48,380][26367] Fps is (10 sec: 36044.3, 60 sec: 40689.4, 300 sec: 41265.5). Total num frames: 3861446656. Throughput: 0: 41075.0. Samples: 129039740. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:48,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-18 23:50:51,286][26599] Updated weights for policy 0, policy_version 235694 (0.0036) [2024-06-18 23:50:51,972][26579] Signal inference workers to stop experience collection... (1800 times) [2024-06-18 23:50:52,017][26579] Signal inference workers to resume experience collection... (1800 times) [2024-06-18 23:50:52,017][26599] InferenceWorker_p0-w0: stopping experience collection (1800 times) [2024-06-18 23:50:52,036][26599] InferenceWorker_p0-w0: resuming experience collection (1800 times) [2024-06-18 23:50:53,381][26367] Fps is (10 sec: 42597.2, 60 sec: 42052.0, 300 sec: 41488.1). Total num frames: 3861692416. Throughput: 0: 41300.3. Samples: 129287760. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:53,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-18 23:50:55,765][26599] Updated weights for policy 0, policy_version 235704 (0.0039) [2024-06-18 23:50:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 40687.0, 300 sec: 41266.0). Total num frames: 3861872640. Throughput: 0: 41507.6. Samples: 129541340. Policy #0 lag: (min: 0.0, avg: 7.8, max: 21.0) [2024-06-18 23:50:58,380][26367] Avg episode reward: [(0, '0.618')] [2024-06-18 23:50:59,489][26599] Updated weights for policy 0, policy_version 235714 (0.0041) [2024-06-18 23:51:03,384][26367] Fps is (10 sec: 39308.1, 60 sec: 41503.5, 300 sec: 41265.5). Total num frames: 3862085632. Throughput: 0: 40930.9. 
Samples: 129654160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:03,384][26367] Avg episode reward: [(0, '0.515')] [2024-06-18 23:51:03,782][26599] Updated weights for policy 0, policy_version 235724 (0.0037) [2024-06-18 23:51:07,546][26599] Updated weights for policy 0, policy_version 235734 (0.0027) [2024-06-18 23:51:08,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 3862331392. Throughput: 0: 41355.3. Samples: 129909800. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:08,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-18 23:51:11,609][26599] Updated weights for policy 0, policy_version 235744 (0.0039) [2024-06-18 23:51:13,380][26367] Fps is (10 sec: 42613.7, 60 sec: 41233.0, 300 sec: 41321.0). Total num frames: 3862511616. Throughput: 0: 41492.0. Samples: 130162720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:13,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-18 23:51:15,245][26599] Updated weights for policy 0, policy_version 235754 (0.0025) [2024-06-18 23:51:18,380][26367] Fps is (10 sec: 36044.4, 60 sec: 41233.0, 300 sec: 41209.9). Total num frames: 3862691840. Throughput: 0: 41261.3. Samples: 130277180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:18,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-18 23:51:19,367][26599] Updated weights for policy 0, policy_version 235764 (0.0028) [2024-06-18 23:51:23,103][26599] Updated weights for policy 0, policy_version 235774 (0.0038) [2024-06-18 23:51:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 41376.5). Total num frames: 3862937600. Throughput: 0: 41571.8. Samples: 130535560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:23,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-18 23:51:27,072][26599] Updated weights for policy 0, policy_version 235784 (0.0041) [2024-06-18 23:51:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 3863117824. Throughput: 0: 41550.1. Samples: 130787660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:28,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-18 23:51:30,964][26599] Updated weights for policy 0, policy_version 235794 (0.0036) [2024-06-18 23:51:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41321.5). Total num frames: 3863330816. Throughput: 0: 41508.9. Samples: 130907640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:33,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-18 23:51:34,879][26599] Updated weights for policy 0, policy_version 235804 (0.0031) [2024-06-18 23:51:38,380][26367] Fps is (10 sec: 42599.0, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 3863543808. Throughput: 0: 41514.9. Samples: 131155920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:38,380][26367] Avg episode reward: [(0, '0.180')] [2024-06-18 23:51:38,444][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000235813_3863560192.pth... [2024-06-18 23:51:38,512][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000235205_3853598720.pth [2024-06-18 23:51:38,658][26599] Updated weights for policy 0, policy_version 235814 (0.0038) [2024-06-18 23:51:42,870][26599] Updated weights for policy 0, policy_version 235824 (0.0030) [2024-06-18 23:51:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41232.9, 300 sec: 41321.0). Total num frames: 3863740416. Throughput: 0: 41442.9. Samples: 131406280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:43,381][26367] Avg episode reward: [(0, '0.375')] [2024-06-18 23:51:46,820][26599] Updated weights for policy 0, policy_version 235834 (0.0033) [2024-06-18 23:51:48,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41779.2, 300 sec: 41321.0). Total num frames: 3863953408. Throughput: 0: 41688.2. Samples: 131529980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:48,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-18 23:51:50,598][26599] Updated weights for policy 0, policy_version 235844 (0.0033) [2024-06-18 23:51:53,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41506.3, 300 sec: 41432.1). Total num frames: 3864182784. Throughput: 0: 41578.6. Samples: 131780840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:53,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-18 23:51:54,722][26599] Updated weights for policy 0, policy_version 235854 (0.0038) [2024-06-18 23:51:58,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41506.1, 300 sec: 41265.5). Total num frames: 3864363008. Throughput: 0: 41584.6. Samples: 132034020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:51:58,380][26367] Avg episode reward: [(0, '0.545')] [2024-06-18 23:51:58,583][26599] Updated weights for policy 0, policy_version 235864 (0.0035) [2024-06-18 23:52:02,725][26599] Updated weights for policy 0, policy_version 235874 (0.0040) [2024-06-18 23:52:03,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41508.6, 300 sec: 41321.0). Total num frames: 3864576000. Throughput: 0: 41580.0. Samples: 132148280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:52:03,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-18 23:52:06,493][26599] Updated weights for policy 0, policy_version 235884 (0.0028) [2024-06-18 23:52:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 40960.0, 300 sec: 41376.6). Total num frames: 3864788992. Throughput: 0: 41304.6. Samples: 132394260. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:52:08,380][26367] Avg episode reward: [(0, '0.551')] [2024-06-18 23:52:10,847][26599] Updated weights for policy 0, policy_version 235894 (0.0037) [2024-06-18 23:52:13,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41233.2, 300 sec: 41376.6). Total num frames: 3864985600. Throughput: 0: 41209.9. Samples: 132642100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:13,380][26367] Avg episode reward: [(0, '0.634')] [2024-06-18 23:52:13,789][26579] Signal inference workers to stop experience collection... (1850 times) [2024-06-18 23:52:13,842][26599] InferenceWorker_p0-w0: stopping experience collection (1850 times) [2024-06-18 23:52:13,907][26579] Signal inference workers to resume experience collection... (1850 times) [2024-06-18 23:52:13,908][26599] InferenceWorker_p0-w0: resuming experience collection (1850 times) [2024-06-18 23:52:14,437][26599] Updated weights for policy 0, policy_version 235904 (0.0036) [2024-06-18 23:52:18,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41321.6). Total num frames: 3865182208. Throughput: 0: 41246.8. Samples: 132763740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:18,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-18 23:52:18,811][26599] Updated weights for policy 0, policy_version 235914 (0.0034) [2024-06-18 23:52:22,347][26599] Updated weights for policy 0, policy_version 235924 (0.0031) [2024-06-18 23:52:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40960.1, 300 sec: 41321.0). Total num frames: 3865395200. Throughput: 0: 41175.1. Samples: 133008800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:23,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-18 23:52:26,852][26599] Updated weights for policy 0, policy_version 235934 (0.0037) [2024-06-18 23:52:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41265.5). Total num frames: 3865591808. Throughput: 0: 41034.3. 
Samples: 133252820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:28,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-18 23:52:30,424][26599] Updated weights for policy 0, policy_version 235944 (0.0036) [2024-06-18 23:52:33,380][26367] Fps is (10 sec: 39321.0, 60 sec: 40960.0, 300 sec: 41321.0). Total num frames: 3865788416. Throughput: 0: 41016.4. Samples: 133375720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:33,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-18 23:52:34,998][26599] Updated weights for policy 0, policy_version 235954 (0.0036) [2024-06-18 23:52:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41265.5). Total num frames: 3866001408. Throughput: 0: 41021.4. Samples: 133626800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:38,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-18 23:52:38,853][26599] Updated weights for policy 0, policy_version 235964 (0.0044) [2024-06-18 23:52:42,823][26599] Updated weights for policy 0, policy_version 235974 (0.0036) [2024-06-18 23:52:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 3866214400. Throughput: 0: 40906.9. Samples: 133874840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:43,381][26367] Avg episode reward: [(0, '0.334')] [2024-06-18 23:52:46,659][26599] Updated weights for policy 0, policy_version 235984 (0.0036) [2024-06-18 23:52:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 3866443776. Throughput: 0: 41165.0. Samples: 134000700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:48,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-18 23:52:50,743][26599] Updated weights for policy 0, policy_version 235994 (0.0041) [2024-06-18 23:52:53,380][26367] Fps is (10 sec: 40960.5, 60 sec: 40687.0, 300 sec: 41265.5). Total num frames: 3866624000. Throughput: 0: 41170.2. Samples: 134246920. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:53,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-18 23:52:54,540][26599] Updated weights for policy 0, policy_version 236004 (0.0028) [2024-06-18 23:52:58,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 3866836992. Throughput: 0: 41252.8. Samples: 134498480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:52:58,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-18 23:52:58,625][26599] Updated weights for policy 0, policy_version 236014 (0.0036) [2024-06-18 23:53:02,373][26599] Updated weights for policy 0, policy_version 236024 (0.0029) [2024-06-18 23:53:03,380][26367] Fps is (10 sec: 45875.4, 60 sec: 41779.3, 300 sec: 41487.6). Total num frames: 3867082752. Throughput: 0: 41416.9. Samples: 134627500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:53:03,380][26367] Avg episode reward: [(0, '0.730')] [2024-06-18 23:53:06,360][26599] Updated weights for policy 0, policy_version 236034 (0.0032) [2024-06-18 23:53:08,380][26367] Fps is (10 sec: 39322.0, 60 sec: 40686.9, 300 sec: 41265.5). Total num frames: 3867230208. Throughput: 0: 41480.5. Samples: 134875420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:53:08,380][26367] Avg episode reward: [(0, '0.569')] [2024-06-18 23:53:10,045][26599] Updated weights for policy 0, policy_version 236044 (0.0032) [2024-06-18 23:53:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41506.0, 300 sec: 41487.6). Total num frames: 3867475968. Throughput: 0: 41489.3. Samples: 135119840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:53:13,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-18 23:53:14,221][26599] Updated weights for policy 0, policy_version 236054 (0.0040) [2024-06-18 23:53:17,838][26599] Updated weights for policy 0, policy_version 236064 (0.0036) [2024-06-18 23:53:18,380][26367] Fps is (10 sec: 45875.0, 60 sec: 41779.2, 300 sec: 41376.5). 
Total num frames: 3867688960. Throughput: 0: 41653.9. Samples: 135250140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-18 23:53:18,381][26367] Avg episode reward: [(0, '0.396')] [2024-06-18 23:53:22,075][26599] Updated weights for policy 0, policy_version 236074 (0.0036) [2024-06-18 23:53:23,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 3867869184. Throughput: 0: 41640.5. Samples: 135500620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:53:23,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-18 23:53:24,863][26579] Signal inference workers to stop experience collection... (1900 times) [2024-06-18 23:53:24,863][26579] Signal inference workers to resume experience collection... (1900 times) [2024-06-18 23:53:24,904][26599] InferenceWorker_p0-w0: stopping experience collection (1900 times) [2024-06-18 23:53:24,908][26599] InferenceWorker_p0-w0: resuming experience collection (1900 times) [2024-06-18 23:53:25,770][26599] Updated weights for policy 0, policy_version 236084 (0.0036) [2024-06-18 23:53:28,380][26367] Fps is (10 sec: 39320.9, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 3868082176. Throughput: 0: 41620.8. Samples: 135747780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:53:28,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-18 23:53:29,906][26599] Updated weights for policy 0, policy_version 236094 (0.0045) [2024-06-18 23:53:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41265.5). Total num frames: 3868295168. Throughput: 0: 41705.8. Samples: 135877460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:53:33,380][26367] Avg episode reward: [(0, '0.481')] [2024-06-18 23:53:33,930][26599] Updated weights for policy 0, policy_version 236104 (0.0039) [2024-06-18 23:53:37,656][26599] Updated weights for policy 0, policy_version 236114 (0.0025) [2024-06-18 23:53:38,380][26367] Fps is (10 sec: 42599.3, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 3868508160. Throughput: 0: 41790.7. Samples: 136127500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:53:38,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-18 23:53:38,423][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000236116_3868524544.pth... [2024-06-18 23:53:38,465][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000235508_3858563072.pth [2024-06-18 23:53:41,774][26599] Updated weights for policy 0, policy_version 236124 (0.0042) [2024-06-18 23:53:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 41432.1). Total num frames: 3868737536. Throughput: 0: 41698.4. Samples: 136374900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:53:43,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-18 23:53:45,396][26599] Updated weights for policy 0, policy_version 236134 (0.0029) [2024-06-18 23:53:48,384][26367] Fps is (10 sec: 42582.6, 60 sec: 41503.6, 300 sec: 41376.0). Total num frames: 3868934144. Throughput: 0: 41762.3. Samples: 136506960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:53:48,385][26367] Avg episode reward: [(0, '0.528')] [2024-06-18 23:53:49,464][26599] Updated weights for policy 0, policy_version 236144 (0.0037) [2024-06-18 23:53:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 3869130752. Throughput: 0: 41743.1. Samples: 136753860. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:53:53,381][26367] Avg episode reward: [(0, '0.719')] [2024-06-18 23:53:53,417][26599] Updated weights for policy 0, policy_version 236154 (0.0036) [2024-06-18 23:53:57,137][26599] Updated weights for policy 0, policy_version 236164 (0.0039) [2024-06-18 23:53:58,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42325.3, 300 sec: 41432.6). Total num frames: 3869376512. Throughput: 0: 41789.3. Samples: 137000360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:53:58,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-18 23:54:01,158][26599] Updated weights for policy 0, policy_version 236174 (0.0042) [2024-06-18 23:54:03,384][26367] Fps is (10 sec: 39307.1, 60 sec: 40684.4, 300 sec: 41431.6). Total num frames: 3869523968. Throughput: 0: 41827.7. Samples: 137132540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:54:03,384][26367] Avg episode reward: [(0, '0.731')] [2024-06-18 23:54:04,855][26599] Updated weights for policy 0, policy_version 236184 (0.0043) [2024-06-18 23:54:08,384][26367] Fps is (10 sec: 37669.5, 60 sec: 42049.6, 300 sec: 41487.1). Total num frames: 3869753344. Throughput: 0: 41766.7. Samples: 137380280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:54:08,385][26367] Avg episode reward: [(0, '0.563')] [2024-06-18 23:54:09,018][26599] Updated weights for policy 0, policy_version 236194 (0.0037) [2024-06-18 23:54:12,915][26599] Updated weights for policy 0, policy_version 236204 (0.0050) [2024-06-18 23:54:13,380][26367] Fps is (10 sec: 45892.4, 60 sec: 41779.3, 300 sec: 41376.6). Total num frames: 3869982720. Throughput: 0: 41686.5. Samples: 137623660. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:54:13,380][26367] Avg episode reward: [(0, '0.459')] [2024-06-18 23:54:16,683][26599] Updated weights for policy 0, policy_version 236214 (0.0032) [2024-06-18 23:54:18,380][26367] Fps is (10 sec: 40974.5, 60 sec: 41232.9, 300 sec: 41432.1). Total num frames: 3870162944. Throughput: 0: 41628.2. Samples: 137750740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:54:18,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-18 23:54:20,518][26599] Updated weights for policy 0, policy_version 236224 (0.0036) [2024-06-18 23:54:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 3870392320. Throughput: 0: 41712.4. Samples: 138004560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:54:23,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-18 23:54:24,531][26599] Updated weights for policy 0, policy_version 236234 (0.0031) [2024-06-18 23:54:28,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42052.4, 300 sec: 41432.1). Total num frames: 3870605312. Throughput: 0: 41793.7. Samples: 138255620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-18 23:54:28,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-18 23:54:28,426][26599] Updated weights for policy 0, policy_version 236244 (0.0029) [2024-06-18 23:54:32,424][26599] Updated weights for policy 0, policy_version 236254 (0.0051) [2024-06-18 23:54:33,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41776.6, 300 sec: 41487.1). Total num frames: 3870801920. Throughput: 0: 41661.8. Samples: 138381740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:54:33,385][26367] Avg episode reward: [(0, '0.639')] [2024-06-18 23:54:36,389][26599] Updated weights for policy 0, policy_version 236264 (0.0043) [2024-06-18 23:54:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41432.1). Total num frames: 3871031296. Throughput: 0: 41680.8. Samples: 138629500. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:54:38,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-18 23:54:40,531][26599] Updated weights for policy 0, policy_version 236274 (0.0030) [2024-06-18 23:54:43,380][26367] Fps is (10 sec: 44252.3, 60 sec: 41779.1, 300 sec: 41488.1). Total num frames: 3871244288. Throughput: 0: 41882.6. Samples: 138885080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:54:43,381][26367] Avg episode reward: [(0, '0.318')] [2024-06-18 23:54:44,077][26599] Updated weights for policy 0, policy_version 236284 (0.0043) [2024-06-18 23:54:48,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41508.7, 300 sec: 41543.1). Total num frames: 3871424512. Throughput: 0: 41637.2. Samples: 139006060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:54:48,380][26367] Avg episode reward: [(0, '0.376')] [2024-06-18 23:54:48,465][26599] Updated weights for policy 0, policy_version 236294 (0.0039) [2024-06-18 23:54:49,900][26579] Signal inference workers to stop experience collection... (1950 times) [2024-06-18 23:54:49,900][26579] Signal inference workers to resume experience collection... (1950 times) [2024-06-18 23:54:49,933][26599] InferenceWorker_p0-w0: stopping experience collection (1950 times) [2024-06-18 23:54:49,934][26599] InferenceWorker_p0-w0: resuming experience collection (1950 times) [2024-06-18 23:54:51,818][26599] Updated weights for policy 0, policy_version 236304 (0.0042) [2024-06-18 23:54:53,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 3871653888. Throughput: 0: 41708.4. Samples: 139257000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:54:53,380][26367] Avg episode reward: [(0, '0.483')] [2024-06-18 23:54:56,411][26599] Updated weights for policy 0, policy_version 236314 (0.0028) [2024-06-18 23:54:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3871850496. Throughput: 0: 41963.5. 
Samples: 139512020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:54:58,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-18 23:54:59,553][26599] Updated weights for policy 0, policy_version 236324 (0.0028) [2024-06-18 23:55:03,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42054.8, 300 sec: 41487.6). Total num frames: 3872047104. Throughput: 0: 41786.8. Samples: 139631140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:55:03,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-18 23:55:04,225][26599] Updated weights for policy 0, policy_version 236334 (0.0035) [2024-06-18 23:55:07,359][26599] Updated weights for policy 0, policy_version 236344 (0.0031) [2024-06-18 23:55:08,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42327.8, 300 sec: 41543.1). Total num frames: 3872292864. Throughput: 0: 41718.1. Samples: 139881880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:55:08,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-18 23:55:12,107][26599] Updated weights for policy 0, policy_version 236354 (0.0039) [2024-06-18 23:55:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.0, 300 sec: 41543.1). Total num frames: 3872473088. Throughput: 0: 41830.6. Samples: 140138000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:55:13,381][26367] Avg episode reward: [(0, '0.766')] [2024-06-18 23:55:15,085][26599] Updated weights for policy 0, policy_version 236364 (0.0031) [2024-06-18 23:55:18,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 3872686080. Throughput: 0: 41728.2. Samples: 140259360. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:55:18,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-18 23:55:20,010][26599] Updated weights for policy 0, policy_version 236374 (0.0040) [2024-06-18 23:55:23,378][26599] Updated weights for policy 0, policy_version 236384 (0.0036) [2024-06-18 23:55:23,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 3872915456. Throughput: 0: 41711.6. Samples: 140506520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:55:23,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-18 23:55:28,003][26599] Updated weights for policy 0, policy_version 236394 (0.0034) [2024-06-18 23:55:28,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 3873079296. Throughput: 0: 41638.0. Samples: 140758780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:55:28,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-18 23:55:31,292][26599] Updated weights for policy 0, policy_version 236404 (0.0041) [2024-06-18 23:55:33,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41781.7, 300 sec: 41432.1). Total num frames: 3873308672. Throughput: 0: 41605.2. Samples: 140878300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:55:33,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-18 23:55:35,827][26599] Updated weights for policy 0, policy_version 236414 (0.0033) [2024-06-18 23:55:38,380][26367] Fps is (10 sec: 44235.9, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 3873521664. Throughput: 0: 41658.0. Samples: 141131620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-18 23:55:38,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-18 23:55:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000236422_3873538048.pth... 
[2024-06-18 23:55:38,480][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000235813_3863560192.pth [2024-06-18 23:55:39,394][26599] Updated weights for policy 0, policy_version 236424 (0.0043) [2024-06-18 23:55:43,380][26367] Fps is (10 sec: 39321.7, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 3873701888. Throughput: 0: 41427.5. Samples: 141376260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:55:43,381][26367] Avg episode reward: [(0, '0.761')] [2024-06-18 23:55:43,807][26599] Updated weights for policy 0, policy_version 236434 (0.0030) [2024-06-18 23:55:47,177][26599] Updated weights for policy 0, policy_version 236444 (0.0030) [2024-06-18 23:55:48,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.0, 300 sec: 41432.1). Total num frames: 3873914880. Throughput: 0: 41504.8. Samples: 141498860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:55:48,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-18 23:55:51,518][26599] Updated weights for policy 0, policy_version 236454 (0.0026) [2024-06-18 23:55:53,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3874144256. Throughput: 0: 41548.6. Samples: 141751560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:55:53,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-18 23:55:55,222][26599] Updated weights for policy 0, policy_version 236464 (0.0035) [2024-06-18 23:55:58,380][26367] Fps is (10 sec: 42599.4, 60 sec: 41506.1, 300 sec: 41543.7). Total num frames: 3874340864. Throughput: 0: 41435.2. Samples: 142002580. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:55:58,380][26367] Avg episode reward: [(0, '0.632')] [2024-06-18 23:55:59,240][26599] Updated weights for policy 0, policy_version 236474 (0.0045) [2024-06-18 23:56:02,898][26599] Updated weights for policy 0, policy_version 236484 (0.0026) [2024-06-18 23:56:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41487.6). Total num frames: 3874570240. Throughput: 0: 41503.5. Samples: 142127020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:03,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-18 23:56:06,972][26599] Updated weights for policy 0, policy_version 236494 (0.0041) [2024-06-18 23:56:08,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41233.2, 300 sec: 41543.2). Total num frames: 3874766848. Throughput: 0: 41445.4. Samples: 142371560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:08,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-18 23:56:10,918][26599] Updated weights for policy 0, policy_version 236504 (0.0035) [2024-06-18 23:56:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3874963456. Throughput: 0: 41484.3. Samples: 142625580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:13,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-18 23:56:14,743][26599] Updated weights for policy 0, policy_version 236514 (0.0035) [2024-06-18 23:56:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3875176448. Throughput: 0: 41444.0. Samples: 142743280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:18,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-18 23:56:19,288][26599] Updated weights for policy 0, policy_version 236524 (0.0032) [2024-06-18 23:56:21,831][26579] Signal inference workers to stop experience collection... 
(2000 times) [2024-06-18 23:56:21,882][26599] InferenceWorker_p0-w0: stopping experience collection (2000 times) [2024-06-18 23:56:21,886][26579] Signal inference workers to resume experience collection... (2000 times) [2024-06-18 23:56:21,897][26599] InferenceWorker_p0-w0: resuming experience collection (2000 times) [2024-06-18 23:56:22,760][26599] Updated weights for policy 0, policy_version 236534 (0.0036) [2024-06-18 23:56:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 3875389440. Throughput: 0: 41461.4. Samples: 142997380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:23,386][26367] Avg episode reward: [(0, '0.602')] [2024-06-18 23:56:26,935][26599] Updated weights for policy 0, policy_version 236544 (0.0033) [2024-06-18 23:56:28,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 3875586048. Throughput: 0: 41564.5. Samples: 143246660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:28,380][26367] Avg episode reward: [(0, '0.765')] [2024-06-18 23:56:30,550][26599] Updated weights for policy 0, policy_version 236554 (0.0032) [2024-06-18 23:56:33,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 3875782656. Throughput: 0: 41486.7. Samples: 143365760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:33,381][26367] Avg episode reward: [(0, '0.732')] [2024-06-18 23:56:35,013][26599] Updated weights for policy 0, policy_version 236564 (0.0039) [2024-06-18 23:56:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 3876012032. Throughput: 0: 41443.2. Samples: 143616500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:38,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-18 23:56:38,483][26599] Updated weights for policy 0, policy_version 236574 (0.0030) [2024-06-18 23:56:42,731][26599] Updated weights for policy 0, policy_version 236584 (0.0026) [2024-06-18 23:56:43,382][26367] Fps is (10 sec: 42593.8, 60 sec: 41778.4, 300 sec: 41543.0). Total num frames: 3876208640. Throughput: 0: 41392.6. Samples: 143865300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:43,382][26367] Avg episode reward: [(0, '0.524')] [2024-06-18 23:56:46,099][26599] Updated weights for policy 0, policy_version 236594 (0.0035) [2024-06-18 23:56:48,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41506.3, 300 sec: 41432.1). Total num frames: 3876405248. Throughput: 0: 41325.0. Samples: 143986640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-18 23:56:48,380][26367] Avg episode reward: [(0, '0.555')] [2024-06-18 23:56:50,639][26599] Updated weights for policy 0, policy_version 236604 (0.0038) [2024-06-18 23:56:53,380][26367] Fps is (10 sec: 42603.9, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 3876634624. Throughput: 0: 41540.1. Samples: 144240860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:56:53,380][26367] Avg episode reward: [(0, '0.378')] [2024-06-18 23:56:54,137][26599] Updated weights for policy 0, policy_version 236614 (0.0032) [2024-06-18 23:56:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41506.0, 300 sec: 41543.2). Total num frames: 3876831232. Throughput: 0: 41432.9. Samples: 144490060. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:56:58,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-18 23:56:58,511][26599] Updated weights for policy 0, policy_version 236624 (0.0032) [2024-06-18 23:57:02,232][26599] Updated weights for policy 0, policy_version 236634 (0.0044) [2024-06-18 23:57:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 40960.1, 300 sec: 41487.6). Total num frames: 3877027840. Throughput: 0: 41573.5. Samples: 144614080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:03,380][26367] Avg episode reward: [(0, '0.583')] [2024-06-18 23:57:06,456][26599] Updated weights for policy 0, policy_version 236644 (0.0040) [2024-06-18 23:57:08,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3877240832. Throughput: 0: 41437.5. Samples: 144862060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:08,380][26367] Avg episode reward: [(0, '0.735')] [2024-06-18 23:57:10,151][26599] Updated weights for policy 0, policy_version 236654 (0.0031) [2024-06-18 23:57:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3877437440. Throughput: 0: 41398.2. Samples: 145109580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:13,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-18 23:57:14,214][26599] Updated weights for policy 0, policy_version 236664 (0.0043) [2024-06-18 23:57:18,085][26599] Updated weights for policy 0, policy_version 236674 (0.0043) [2024-06-18 23:57:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 3877666816. Throughput: 0: 41525.9. Samples: 145234420. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:18,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-18 23:57:22,040][26599] Updated weights for policy 0, policy_version 236684 (0.0046) [2024-06-18 23:57:23,380][26367] Fps is (10 sec: 40960.4, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 3877847040. Throughput: 0: 41452.5. Samples: 145481860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:23,380][26367] Avg episode reward: [(0, '0.765')] [2024-06-18 23:57:26,086][26599] Updated weights for policy 0, policy_version 236694 (0.0044) [2024-06-18 23:57:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 3878076416. Throughput: 0: 41355.8. Samples: 145726260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:28,381][26367] Avg episode reward: [(0, '0.798')] [2024-06-18 23:57:30,390][26599] Updated weights for policy 0, policy_version 236704 (0.0027) [2024-06-18 23:57:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 3878273024. Throughput: 0: 41589.8. Samples: 145858180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:33,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-18 23:57:33,980][26599] Updated weights for policy 0, policy_version 236714 (0.0045) [2024-06-18 23:57:38,171][26599] Updated weights for policy 0, policy_version 236724 (0.0025) [2024-06-18 23:57:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 3878486016. Throughput: 0: 41404.0. Samples: 146104040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:38,381][26367] Avg episode reward: [(0, '0.415')] [2024-06-18 23:57:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000236724_3878486016.pth... 
[2024-06-18 23:57:38,448][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000236116_3868524544.pth [2024-06-18 23:57:41,813][26599] Updated weights for policy 0, policy_version 236734 (0.0046) [2024-06-18 23:57:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41780.1, 300 sec: 41598.7). Total num frames: 3878715392. Throughput: 0: 41331.7. Samples: 146349980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:43,381][26367] Avg episode reward: [(0, '0.415')] [2024-06-18 23:57:46,129][26599] Updated weights for policy 0, policy_version 236744 (0.0056) [2024-06-18 23:57:48,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41779.0, 300 sec: 41654.2). Total num frames: 3878912000. Throughput: 0: 41447.3. Samples: 146479220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:48,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-18 23:57:49,901][26599] Updated weights for policy 0, policy_version 236754 (0.0042) [2024-06-18 23:57:53,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 3879108608. Throughput: 0: 41336.9. Samples: 146722220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:53,380][26367] Avg episode reward: [(0, '0.571')] [2024-06-18 23:57:54,063][26599] Updated weights for policy 0, policy_version 236764 (0.0039) [2024-06-18 23:57:57,843][26599] Updated weights for policy 0, policy_version 236774 (0.0035) [2024-06-18 23:57:58,380][26367] Fps is (10 sec: 40961.0, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 3879321600. Throughput: 0: 41407.2. Samples: 146972900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 23.0) [2024-06-18 23:57:58,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-18 23:57:58,487][26579] Signal inference workers to stop experience collection... 
(2050 times) [2024-06-18 23:57:58,527][26599] InferenceWorker_p0-w0: stopping experience collection (2050 times) [2024-06-18 23:57:58,539][26579] Signal inference workers to resume experience collection... (2050 times) [2024-06-18 23:57:58,540][26599] InferenceWorker_p0-w0: resuming experience collection (2050 times) [2024-06-18 23:58:01,863][26599] Updated weights for policy 0, policy_version 236784 (0.0040) [2024-06-18 23:58:03,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 3879501824. Throughput: 0: 41443.2. Samples: 147099360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:03,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-18 23:58:05,922][26599] Updated weights for policy 0, policy_version 236794 (0.0030) [2024-06-18 23:58:08,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 41654.3). Total num frames: 3879763968. Throughput: 0: 41364.4. Samples: 147343260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:08,380][26367] Avg episode reward: [(0, '0.720')] [2024-06-18 23:58:09,607][26599] Updated weights for policy 0, policy_version 236804 (0.0037) [2024-06-18 23:58:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3879927808. Throughput: 0: 41619.2. Samples: 147599120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:13,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-18 23:58:13,842][26599] Updated weights for policy 0, policy_version 236814 (0.0036) [2024-06-18 23:58:17,694][26599] Updated weights for policy 0, policy_version 236824 (0.0038) [2024-06-18 23:58:18,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 3880140800. Throughput: 0: 41324.5. Samples: 147717780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:18,380][26367] Avg episode reward: [(0, '0.558')] [2024-06-18 23:58:21,678][26599] Updated weights for policy 0, policy_version 236834 (0.0031) [2024-06-18 23:58:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 41654.3). Total num frames: 3880370176. Throughput: 0: 41456.4. Samples: 147969580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:23,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-18 23:58:25,481][26599] Updated weights for policy 0, policy_version 236844 (0.0039) [2024-06-18 23:58:28,384][26367] Fps is (10 sec: 40944.7, 60 sec: 41230.6, 300 sec: 41542.6). Total num frames: 3880550400. Throughput: 0: 41637.5. Samples: 148223820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:28,385][26367] Avg episode reward: [(0, '0.568')] [2024-06-18 23:58:29,473][26599] Updated weights for policy 0, policy_version 236854 (0.0038) [2024-06-18 23:58:33,304][26599] Updated weights for policy 0, policy_version 236864 (0.0029) [2024-06-18 23:58:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3880779776. Throughput: 0: 41369.9. Samples: 148340860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:33,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-18 23:58:37,578][26599] Updated weights for policy 0, policy_version 236874 (0.0039) [2024-06-18 23:58:38,380][26367] Fps is (10 sec: 45892.2, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 3881009152. Throughput: 0: 41674.6. Samples: 148597580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:38,381][26367] Avg episode reward: [(0, '0.753')] [2024-06-18 23:58:41,186][26599] Updated weights for policy 0, policy_version 236884 (0.0039) [2024-06-18 23:58:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 40960.0, 300 sec: 41488.1). Total num frames: 3881172992. Throughput: 0: 41536.4. Samples: 148842040. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:43,384][26367] Avg episode reward: [(0, '0.640')] [2024-06-18 23:58:45,584][26599] Updated weights for policy 0, policy_version 236894 (0.0031) [2024-06-18 23:58:48,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 3881402368. Throughput: 0: 41325.4. Samples: 148959000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:48,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-18 23:58:49,403][26599] Updated weights for policy 0, policy_version 236904 (0.0023) [2024-06-18 23:58:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41376.6). Total num frames: 3881582592. Throughput: 0: 41516.0. Samples: 149211480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:53,380][26367] Avg episode reward: [(0, '0.547')] [2024-06-18 23:58:53,390][26599] Updated weights for policy 0, policy_version 236914 (0.0031) [2024-06-18 23:58:57,008][26599] Updated weights for policy 0, policy_version 236924 (0.0033) [2024-06-18 23:58:58,380][26367] Fps is (10 sec: 39320.9, 60 sec: 41232.9, 300 sec: 41599.2). Total num frames: 3881795584. Throughput: 0: 41469.2. Samples: 149465240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:58:58,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-18 23:59:01,360][26599] Updated weights for policy 0, policy_version 236934 (0.0039) [2024-06-18 23:59:03,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41599.2). Total num frames: 3882024960. Throughput: 0: 41580.4. Samples: 149588900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:59:03,381][26367] Avg episode reward: [(0, '0.295')] [2024-06-18 23:59:04,625][26599] Updated weights for policy 0, policy_version 236944 (0.0033) [2024-06-18 23:59:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 40686.9, 300 sec: 41432.1). Total num frames: 3882205184. Throughput: 0: 41612.0. Samples: 149842120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-18 23:59:08,381][26367] Avg episode reward: [(0, '0.330')] [2024-06-18 23:59:09,078][26599] Updated weights for policy 0, policy_version 236954 (0.0036) [2024-06-18 23:59:12,326][26599] Updated weights for policy 0, policy_version 236964 (0.0039) [2024-06-18 23:59:13,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 3882418176. Throughput: 0: 41391.9. Samples: 150086300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:13,381][26367] Avg episode reward: [(0, '0.303')] [2024-06-18 23:59:17,003][26599] Updated weights for policy 0, policy_version 236974 (0.0050) [2024-06-18 23:59:17,237][26579] Signal inference workers to stop experience collection... (2100 times) [2024-06-18 23:59:17,238][26579] Signal inference workers to resume experience collection... (2100 times) [2024-06-18 23:59:17,282][26599] InferenceWorker_p0-w0: stopping experience collection (2100 times) [2024-06-18 23:59:17,283][26599] InferenceWorker_p0-w0: resuming experience collection (2100 times) [2024-06-18 23:59:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3882631168. Throughput: 0: 41569.8. Samples: 150211500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:18,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-18 23:59:20,276][26599] Updated weights for policy 0, policy_version 236984 (0.0042) [2024-06-18 23:59:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 3882844160. Throughput: 0: 41330.2. Samples: 150457440. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:23,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-18 23:59:25,010][26599] Updated weights for policy 0, policy_version 236994 (0.0037) [2024-06-18 23:59:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41781.7, 300 sec: 41543.7). Total num frames: 3883057152. Throughput: 0: 41296.9. 
Samples: 150700400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:28,384][26367] Avg episode reward: [(0, '0.592')] [2024-06-18 23:59:29,218][26599] Updated weights for policy 0, policy_version 237004 (0.0024) [2024-06-18 23:59:32,713][26599] Updated weights for policy 0, policy_version 237014 (0.0038) [2024-06-18 23:59:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 3883253760. Throughput: 0: 41507.5. Samples: 150826840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:33,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-18 23:59:36,866][26599] Updated weights for policy 0, policy_version 237024 (0.0028) [2024-06-18 23:59:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40959.9, 300 sec: 41432.1). Total num frames: 3883466752. Throughput: 0: 41471.0. Samples: 151077680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:38,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-18 23:59:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000237028_3883466752.pth... [2024-06-18 23:59:38,446][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000236422_3873538048.pth [2024-06-18 23:59:40,425][26599] Updated weights for policy 0, policy_version 237034 (0.0031) [2024-06-18 23:59:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 3883679744. Throughput: 0: 41390.8. Samples: 151327820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:43,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-18 23:59:44,483][26599] Updated weights for policy 0, policy_version 237044 (0.0029) [2024-06-18 23:59:48,264][26599] Updated weights for policy 0, policy_version 237054 (0.0033) [2024-06-18 23:59:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41506.0, 300 sec: 41487.6). Total num frames: 3883892736. Throughput: 0: 41332.3. Samples: 151448860. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:48,381][26367] Avg episode reward: [(0, '0.328')] [2024-06-18 23:59:52,314][26599] Updated weights for policy 0, policy_version 237064 (0.0031) [2024-06-18 23:59:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 3884072960. Throughput: 0: 41231.1. Samples: 151697520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:53,381][26367] Avg episode reward: [(0, '0.327')] [2024-06-18 23:59:56,109][26599] Updated weights for policy 0, policy_version 237074 (0.0032) [2024-06-18 23:59:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 3884302336. Throughput: 0: 41383.9. Samples: 151948580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-18 23:59:58,381][26367] Avg episode reward: [(0, '0.326')] [2024-06-19 00:00:00,075][26599] Updated weights for policy 0, policy_version 237084 (0.0041) [2024-06-19 00:00:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 3884515328. Throughput: 0: 41401.7. Samples: 152074580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-19 00:00:03,381][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 00:00:03,875][26599] Updated weights for policy 0, policy_version 237094 (0.0037) [2024-06-19 00:00:07,959][26599] Updated weights for policy 0, policy_version 237104 (0.0040) [2024-06-19 00:00:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 3884711936. Throughput: 0: 41468.4. Samples: 152323520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-19 00:00:08,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 00:00:11,938][26599] Updated weights for policy 0, policy_version 237114 (0.0025) [2024-06-19 00:00:13,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 3884908544. Throughput: 0: 41659.2. Samples: 152575060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-19 00:00:13,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 00:00:15,990][26599] Updated weights for policy 0, policy_version 237124 (0.0028) [2024-06-19 00:00:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41506.0, 300 sec: 41376.5). Total num frames: 3885121536. Throughput: 0: 41556.8. Samples: 152696900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 23.0) [2024-06-19 00:00:18,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 00:00:19,569][26599] Updated weights for policy 0, policy_version 237134 (0.0031) [2024-06-19 00:00:23,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41506.1, 300 sec: 41543.1). Total num frames: 3885334528. Throughput: 0: 41522.7. Samples: 152946200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:00:23,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 00:00:23,686][26599] Updated weights for policy 0, policy_version 237144 (0.0040) [2024-06-19 00:00:27,652][26599] Updated weights for policy 0, policy_version 237154 (0.0040) [2024-06-19 00:00:28,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 3885531136. Throughput: 0: 41609.4. Samples: 153200240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:00:28,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 00:00:30,491][26579] Signal inference workers to stop experience collection... (2150 times) [2024-06-19 00:00:30,491][26579] Signal inference workers to resume experience collection... (2150 times) [2024-06-19 00:00:30,528][26599] InferenceWorker_p0-w0: stopping experience collection (2150 times) [2024-06-19 00:00:30,528][26599] InferenceWorker_p0-w0: resuming experience collection (2150 times) [2024-06-19 00:00:31,537][26599] Updated weights for policy 0, policy_version 237164 (0.0038) [2024-06-19 00:00:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 3885744128. Throughput: 0: 41710.3. 
Samples: 153325820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:00:33,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 00:00:35,395][26599] Updated weights for policy 0, policy_version 237174 (0.0034) [2024-06-19 00:00:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 3885973504. Throughput: 0: 41789.4. Samples: 153578040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:00:38,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 00:00:39,414][26599] Updated weights for policy 0, policy_version 237184 (0.0024) [2024-06-19 00:00:43,275][26599] Updated weights for policy 0, policy_version 237194 (0.0026) [2024-06-19 00:00:43,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 3886186496. Throughput: 0: 41682.4. Samples: 153824280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:00:43,380][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 00:00:47,356][26599] Updated weights for policy 0, policy_version 237204 (0.0032) [2024-06-19 00:00:48,380][26367] Fps is (10 sec: 39321.1, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 3886366720. Throughput: 0: 41590.6. Samples: 153946160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:00:48,381][26367] Avg episode reward: [(0, '0.393')] [2024-06-19 00:00:51,019][26599] Updated weights for policy 0, policy_version 237214 (0.0042) [2024-06-19 00:00:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 41543.2). Total num frames: 3886596096. Throughput: 0: 41798.8. Samples: 154204460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:00:53,380][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 00:00:55,041][26599] Updated weights for policy 0, policy_version 237224 (0.0046) [2024-06-19 00:00:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 3886792704. Throughput: 0: 41738.1. 
Samples: 154453280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:00:58,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 00:00:59,307][26599] Updated weights for policy 0, policy_version 237234 (0.0035) [2024-06-19 00:01:02,859][26599] Updated weights for policy 0, policy_version 237244 (0.0040) [2024-06-19 00:01:03,384][26367] Fps is (10 sec: 40946.2, 60 sec: 41503.9, 300 sec: 41487.2). Total num frames: 3887005696. Throughput: 0: 41733.5. Samples: 154575040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:01:03,384][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 00:01:07,007][26599] Updated weights for policy 0, policy_version 237254 (0.0034) [2024-06-19 00:01:08,384][26367] Fps is (10 sec: 39309.1, 60 sec: 41230.9, 300 sec: 41431.6). Total num frames: 3887185920. Throughput: 0: 41637.9. Samples: 154820040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:01:08,384][26367] Avg episode reward: [(0, '0.792')] [2024-06-19 00:01:10,818][26599] Updated weights for policy 0, policy_version 237264 (0.0030) [2024-06-19 00:01:13,380][26367] Fps is (10 sec: 40973.1, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 3887415296. Throughput: 0: 41603.0. Samples: 155072380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:01:13,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 00:01:15,177][26599] Updated weights for policy 0, policy_version 237274 (0.0036) [2024-06-19 00:01:18,384][26367] Fps is (10 sec: 45873.6, 60 sec: 42049.8, 300 sec: 41542.7). Total num frames: 3887644672. Throughput: 0: 41587.8. Samples: 155197420. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:01:18,385][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 00:01:18,682][26599] Updated weights for policy 0, policy_version 237284 (0.0035) [2024-06-19 00:01:22,992][26599] Updated weights for policy 0, policy_version 237294 (0.0040) [2024-06-19 00:01:23,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 3887824896. Throughput: 0: 41515.1. Samples: 155446220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:01:23,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 00:01:26,613][26599] Updated weights for policy 0, policy_version 237304 (0.0042) [2024-06-19 00:01:28,380][26367] Fps is (10 sec: 39336.0, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 3888037888. Throughput: 0: 41620.4. Samples: 155697200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:01:28,380][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 00:01:30,883][26599] Updated weights for policy 0, policy_version 237314 (0.0030) [2024-06-19 00:01:33,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 3888267264. Throughput: 0: 41637.5. Samples: 155819840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:01:33,380][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 00:01:34,609][26599] Updated weights for policy 0, policy_version 237324 (0.0042) [2024-06-19 00:01:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41233.1, 300 sec: 41487.8). Total num frames: 3888447488. Throughput: 0: 41495.1. Samples: 156071740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:01:38,380][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 00:01:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000237333_3888463872.pth... 
[2024-06-19 00:01:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000236724_3878486016.pth [2024-06-19 00:01:38,699][26599] Updated weights for policy 0, policy_version 237334 (0.0026) [2024-06-19 00:01:42,519][26599] Updated weights for policy 0, policy_version 237344 (0.0034) [2024-06-19 00:01:43,380][26367] Fps is (10 sec: 40959.2, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 3888676864. Throughput: 0: 41498.2. Samples: 156320700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:01:43,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 00:01:46,575][26599] Updated weights for policy 0, policy_version 237354 (0.0024) [2024-06-19 00:01:48,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 41543.1). Total num frames: 3888889856. Throughput: 0: 41636.7. Samples: 156448560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:01:48,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 00:01:50,314][26599] Updated weights for policy 0, policy_version 237364 (0.0025) [2024-06-19 00:01:53,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 3889070080. Throughput: 0: 41743.6. Samples: 156698360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:01:53,380][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 00:01:54,329][26599] Updated weights for policy 0, policy_version 237374 (0.0036) [2024-06-19 00:01:58,175][26599] Updated weights for policy 0, policy_version 237384 (0.0034) [2024-06-19 00:01:58,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3889299456. Throughput: 0: 41531.9. Samples: 156941320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:01:58,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 00:02:01,921][26579] Signal inference workers to stop experience collection... 
(2200 times) [2024-06-19 00:02:01,962][26599] InferenceWorker_p0-w0: stopping experience collection (2200 times) [2024-06-19 00:02:01,971][26579] Signal inference workers to resume experience collection... (2200 times) [2024-06-19 00:02:01,977][26599] InferenceWorker_p0-w0: resuming experience collection (2200 times) [2024-06-19 00:02:02,115][26599] Updated weights for policy 0, policy_version 237394 (0.0034) [2024-06-19 00:02:03,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41235.3, 300 sec: 41487.6). Total num frames: 3889479680. Throughput: 0: 41660.2. Samples: 157071980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:02:03,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 00:02:05,865][26599] Updated weights for policy 0, policy_version 237404 (0.0033) [2024-06-19 00:02:08,384][26367] Fps is (10 sec: 39307.8, 60 sec: 41778.9, 300 sec: 41542.6). Total num frames: 3889692672. Throughput: 0: 41522.8. Samples: 157314900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:02:08,384][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 00:02:10,569][26599] Updated weights for policy 0, policy_version 237414 (0.0040) [2024-06-19 00:02:13,380][26367] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 3889922048. Throughput: 0: 41431.6. Samples: 157561620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:02:13,380][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 00:02:14,141][26599] Updated weights for policy 0, policy_version 237424 (0.0030) [2024-06-19 00:02:18,311][26599] Updated weights for policy 0, policy_version 237434 (0.0030) [2024-06-19 00:02:18,380][26367] Fps is (10 sec: 42613.8, 60 sec: 41235.5, 300 sec: 41598.7). Total num frames: 3890118656. Throughput: 0: 41596.3. Samples: 157691680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:02:18,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 00:02:22,110][26599] Updated weights for policy 0, policy_version 237444 (0.0039) [2024-06-19 00:02:23,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 3890331648. Throughput: 0: 41503.0. Samples: 157939380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:02:23,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 00:02:25,947][26599] Updated weights for policy 0, policy_version 237454 (0.0029) [2024-06-19 00:02:28,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3890544640. Throughput: 0: 41505.1. Samples: 158188420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:02:28,380][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 00:02:29,755][26599] Updated weights for policy 0, policy_version 237464 (0.0038) [2024-06-19 00:02:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 3890741248. Throughput: 0: 41465.4. Samples: 158314500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:02:33,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 00:02:34,082][26599] Updated weights for policy 0, policy_version 237474 (0.0026) [2024-06-19 00:02:37,712][26599] Updated weights for policy 0, policy_version 237484 (0.0049) [2024-06-19 00:02:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 3890954240. Throughput: 0: 41413.3. Samples: 158561960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:02:38,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 00:02:41,962][26599] Updated weights for policy 0, policy_version 237494 (0.0036) [2024-06-19 00:02:43,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41230.6, 300 sec: 41487.1). Total num frames: 3891150848. Throughput: 0: 41494.6. Samples: 158808720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:02:43,385][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 00:02:45,491][26599] Updated weights for policy 0, policy_version 237504 (0.0030) [2024-06-19 00:02:48,381][26367] Fps is (10 sec: 39317.2, 60 sec: 40959.3, 300 sec: 41487.5). Total num frames: 3891347456. Throughput: 0: 41292.4. Samples: 158930180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:02:48,382][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 00:02:49,883][26599] Updated weights for policy 0, policy_version 237514 (0.0028) [2024-06-19 00:02:53,380][26367] Fps is (10 sec: 42613.9, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 3891576832. Throughput: 0: 41514.0. Samples: 159182880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:02:53,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 00:02:53,694][26599] Updated weights for policy 0, policy_version 237524 (0.0030) [2024-06-19 00:02:57,610][26599] Updated weights for policy 0, policy_version 237534 (0.0032) [2024-06-19 00:02:58,380][26367] Fps is (10 sec: 42603.4, 60 sec: 41233.2, 300 sec: 41598.7). Total num frames: 3891773440. Throughput: 0: 41435.1. Samples: 159426200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:02:58,380][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 00:03:01,774][26599] Updated weights for policy 0, policy_version 237544 (0.0033) [2024-06-19 00:03:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 3891970048. Throughput: 0: 41272.5. Samples: 159548940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:03:03,380][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 00:03:05,643][26599] Updated weights for policy 0, policy_version 237554 (0.0035) [2024-06-19 00:03:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41781.8, 300 sec: 41598.7). Total num frames: 3892199424. Throughput: 0: 41388.6. Samples: 159801860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:03:08,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 00:03:09,592][26599] Updated weights for policy 0, policy_version 237564 (0.0028) [2024-06-19 00:03:13,355][26599] Updated weights for policy 0, policy_version 237574 (0.0031) [2024-06-19 00:03:13,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 3892412416. Throughput: 0: 41509.2. Samples: 160056340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:03:13,381][26367] Avg episode reward: [(0, '0.381')] [2024-06-19 00:03:17,474][26599] Updated weights for policy 0, policy_version 237584 (0.0029) [2024-06-19 00:03:18,384][26367] Fps is (10 sec: 42582.8, 60 sec: 41776.7, 300 sec: 41542.7). Total num frames: 3892625408. Throughput: 0: 41530.0. Samples: 160183500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:03:18,384][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 00:03:21,038][26599] Updated weights for policy 0, policy_version 237594 (0.0046) [2024-06-19 00:03:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41599.2). Total num frames: 3892822016. Throughput: 0: 41626.6. Samples: 160435160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:03:23,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 00:03:25,369][26599] Updated weights for policy 0, policy_version 237604 (0.0035) [2024-06-19 00:03:28,380][26367] Fps is (10 sec: 40975.3, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 3893035008. Throughput: 0: 41763.9. Samples: 160687940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:03:28,380][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 00:03:28,834][26599] Updated weights for policy 0, policy_version 237614 (0.0026) [2024-06-19 00:03:33,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41376.5). Total num frames: 3893215232. Throughput: 0: 41652.2. Samples: 160804480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:03:33,380][26367] Avg episode reward: [(0, '0.857')] [2024-06-19 00:03:33,435][26599] Updated weights for policy 0, policy_version 237624 (0.0037) [2024-06-19 00:03:34,673][26579] Signal inference workers to stop experience collection... (2250 times) [2024-06-19 00:03:34,721][26599] InferenceWorker_p0-w0: stopping experience collection (2250 times) [2024-06-19 00:03:34,796][26579] Signal inference workers to resume experience collection... (2250 times) [2024-06-19 00:03:34,796][26599] InferenceWorker_p0-w0: resuming experience collection (2250 times) [2024-06-19 00:03:37,037][26599] Updated weights for policy 0, policy_version 237634 (0.0029) [2024-06-19 00:03:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 3893444608. Throughput: 0: 41651.2. Samples: 161057180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:03:38,380][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 00:03:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000237637_3893444608.pth... [2024-06-19 00:03:38,468][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000237028_3883466752.pth [2024-06-19 00:03:41,344][26599] Updated weights for policy 0, policy_version 237644 (0.0031) [2024-06-19 00:03:43,380][26367] Fps is (10 sec: 44236.1, 60 sec: 41781.7, 300 sec: 41543.1). Total num frames: 3893657600. Throughput: 0: 41634.1. Samples: 161299740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 00:03:43,383][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 00:03:44,642][26599] Updated weights for policy 0, policy_version 237654 (0.0029) [2024-06-19 00:03:48,384][26367] Fps is (10 sec: 40944.7, 60 sec: 41777.4, 300 sec: 41598.2). Total num frames: 3893854208. Throughput: 0: 41731.2. Samples: 161427000. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:03:48,384][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 00:03:49,286][26599] Updated weights for policy 0, policy_version 237664 (0.0042) [2024-06-19 00:03:52,496][26599] Updated weights for policy 0, policy_version 237674 (0.0034) [2024-06-19 00:03:53,384][26367] Fps is (10 sec: 40945.5, 60 sec: 41503.6, 300 sec: 41598.2). Total num frames: 3894067200. Throughput: 0: 41625.0. Samples: 161675140. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:03:53,384][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 00:03:56,969][26599] Updated weights for policy 0, policy_version 237684 (0.0037) [2024-06-19 00:03:58,380][26367] Fps is (10 sec: 42614.4, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 3894280192. Throughput: 0: 41666.4. Samples: 161931320. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:03:58,380][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 00:04:00,203][26599] Updated weights for policy 0, policy_version 237694 (0.0035) [2024-06-19 00:04:03,380][26367] Fps is (10 sec: 42613.6, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 3894493184. Throughput: 0: 41578.8. Samples: 162054400. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:03,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 00:04:04,883][26599] Updated weights for policy 0, policy_version 237704 (0.0044) [2024-06-19 00:04:08,070][26599] Updated weights for policy 0, policy_version 237714 (0.0036) [2024-06-19 00:04:08,380][26367] Fps is (10 sec: 42597.2, 60 sec: 41779.0, 300 sec: 41654.2). Total num frames: 3894706176. Throughput: 0: 41427.4. Samples: 162299400. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:08,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 00:04:12,687][26599] Updated weights for policy 0, policy_version 237724 (0.0035) [2024-06-19 00:04:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41233.0, 300 sec: 41543.1). Total num frames: 3894886400. Throughput: 0: 41551.3. Samples: 162557760. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:13,381][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 00:04:15,941][26599] Updated weights for policy 0, policy_version 237734 (0.0032) [2024-06-19 00:04:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41508.6, 300 sec: 41598.7). Total num frames: 3895115776. Throughput: 0: 41581.2. Samples: 162675640. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:18,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 00:04:20,514][26599] Updated weights for policy 0, policy_version 237744 (0.0031) [2024-06-19 00:04:23,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3895328768. Throughput: 0: 41472.8. Samples: 162923460. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:23,381][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 00:04:24,340][26599] Updated weights for policy 0, policy_version 237754 (0.0034) [2024-06-19 00:04:28,277][26599] Updated weights for policy 0, policy_version 237764 (0.0038) [2024-06-19 00:04:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 3895525376. Throughput: 0: 41782.2. Samples: 163179940. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:28,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 00:04:32,115][26599] Updated weights for policy 0, policy_version 237774 (0.0046) [2024-06-19 00:04:33,384][26367] Fps is (10 sec: 39307.4, 60 sec: 41776.6, 300 sec: 41542.7). Total num frames: 3895721984. Throughput: 0: 41644.0. Samples: 163300980. 
Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:33,385][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 00:04:36,547][26599] Updated weights for policy 0, policy_version 237784 (0.0034) [2024-06-19 00:04:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.0, 300 sec: 41543.2). Total num frames: 3895934976. Throughput: 0: 41606.4. Samples: 163547280. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:38,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 00:04:39,919][26599] Updated weights for policy 0, policy_version 237794 (0.0043) [2024-06-19 00:04:43,380][26367] Fps is (10 sec: 39335.8, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 3896115200. Throughput: 0: 41325.2. Samples: 163790960. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:43,381][26367] Avg episode reward: [(0, '0.323')] [2024-06-19 00:04:44,461][26599] Updated weights for policy 0, policy_version 237804 (0.0038) [2024-06-19 00:04:45,267][26579] Signal inference workers to stop experience collection... (2300 times) [2024-06-19 00:04:45,274][26579] Signal inference workers to resume experience collection... (2300 times) [2024-06-19 00:04:45,290][26599] InferenceWorker_p0-w0: stopping experience collection (2300 times) [2024-06-19 00:04:45,290][26599] InferenceWorker_p0-w0: resuming experience collection (2300 times) [2024-06-19 00:04:48,179][26599] Updated weights for policy 0, policy_version 237814 (0.0036) [2024-06-19 00:04:48,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41506.1, 300 sec: 41598.2). Total num frames: 3896344576. Throughput: 0: 41218.5. Samples: 163909380. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:48,384][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 00:04:52,295][26599] Updated weights for policy 0, policy_version 237824 (0.0029) [2024-06-19 00:04:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41508.6, 300 sec: 41543.1). Total num frames: 3896557568. Throughput: 0: 41508.9. 
Samples: 164167300. Policy #0 lag: (min: 1.0, avg: 11.4, max: 22.0) [2024-06-19 00:04:53,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 00:04:55,972][26599] Updated weights for policy 0, policy_version 237834 (0.0032) [2024-06-19 00:04:58,380][26367] Fps is (10 sec: 40974.5, 60 sec: 41232.9, 300 sec: 41487.6). Total num frames: 3896754176. Throughput: 0: 41403.6. Samples: 164420920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:04:58,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 00:04:59,947][26599] Updated weights for policy 0, policy_version 237844 (0.0034) [2024-06-19 00:05:03,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3896967168. Throughput: 0: 41490.3. Samples: 164542700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:03,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 00:05:03,739][26599] Updated weights for policy 0, policy_version 237854 (0.0032) [2024-06-19 00:05:07,684][26599] Updated weights for policy 0, policy_version 237864 (0.0037) [2024-06-19 00:05:08,380][26367] Fps is (10 sec: 42599.1, 60 sec: 41233.2, 300 sec: 41598.7). Total num frames: 3897180160. Throughput: 0: 41637.4. Samples: 164797140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:08,381][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 00:05:11,567][26599] Updated weights for policy 0, policy_version 237874 (0.0045) [2024-06-19 00:05:13,384][26367] Fps is (10 sec: 40944.8, 60 sec: 41503.7, 300 sec: 41542.7). Total num frames: 3897376768. Throughput: 0: 41293.6. Samples: 165038300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:13,385][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 00:05:15,491][26599] Updated weights for policy 0, policy_version 237884 (0.0032) [2024-06-19 00:05:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3897589760. Throughput: 0: 41362.1. 
Samples: 165162120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:18,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 00:05:19,490][26599] Updated weights for policy 0, policy_version 237894 (0.0047) [2024-06-19 00:05:23,334][26599] Updated weights for policy 0, policy_version 237904 (0.0031) [2024-06-19 00:05:23,380][26367] Fps is (10 sec: 44252.9, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 3897819136. Throughput: 0: 41648.9. Samples: 165421480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:23,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 00:05:27,634][26599] Updated weights for policy 0, policy_version 237914 (0.0037) [2024-06-19 00:05:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3897999360. Throughput: 0: 41530.3. Samples: 165659820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:28,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 00:05:31,332][26599] Updated weights for policy 0, policy_version 237924 (0.0033) [2024-06-19 00:05:33,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41235.5, 300 sec: 41432.1). Total num frames: 3898195968. Throughput: 0: 41608.7. Samples: 165781620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:33,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 00:05:35,471][26599] Updated weights for policy 0, policy_version 237934 (0.0044) [2024-06-19 00:05:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 3898408960. Throughput: 0: 41341.5. Samples: 166027660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:38,380][26367] Avg episode reward: [(0, '0.321')] [2024-06-19 00:05:38,524][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000237941_3898425344.pth... 
[2024-06-19 00:05:38,581][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000237333_3888463872.pth [2024-06-19 00:05:39,326][26599] Updated weights for policy 0, policy_version 237944 (0.0037) [2024-06-19 00:05:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 3898621952. Throughput: 0: 41237.3. Samples: 166276600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:43,381][26367] Avg episode reward: [(0, '0.393')] [2024-06-19 00:05:43,532][26599] Updated weights for policy 0, policy_version 237954 (0.0024) [2024-06-19 00:05:47,200][26599] Updated weights for policy 0, policy_version 237964 (0.0031) [2024-06-19 00:05:48,380][26367] Fps is (10 sec: 40958.9, 60 sec: 41235.5, 300 sec: 41432.0). Total num frames: 3898818560. Throughput: 0: 41367.8. Samples: 166404260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:48,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 00:05:51,452][26599] Updated weights for policy 0, policy_version 237974 (0.0033) [2024-06-19 00:05:53,380][26367] Fps is (10 sec: 39322.1, 60 sec: 40960.1, 300 sec: 41432.1). Total num frames: 3899015168. Throughput: 0: 41178.2. Samples: 166650160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:53,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 00:05:55,194][26599] Updated weights for policy 0, policy_version 237984 (0.0044) [2024-06-19 00:05:58,380][26367] Fps is (10 sec: 44237.7, 60 sec: 41779.3, 300 sec: 41543.6). Total num frames: 3899260928. Throughput: 0: 41248.7. Samples: 166894340. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:05:58,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 00:05:59,127][26599] Updated weights for policy 0, policy_version 237994 (0.0025) [2024-06-19 00:06:03,139][26599] Updated weights for policy 0, policy_version 238004 (0.0040) [2024-06-19 00:06:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41506.1, 300 sec: 41599.2). Total num frames: 3899457536. Throughput: 0: 41321.7. Samples: 167021600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 00:06:03,381][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 00:06:03,893][26579] Signal inference workers to stop experience collection... (2350 times) [2024-06-19 00:06:03,897][26579] Signal inference workers to resume experience collection... (2350 times) [2024-06-19 00:06:03,907][26599] InferenceWorker_p0-w0: stopping experience collection (2350 times) [2024-06-19 00:06:03,942][26599] InferenceWorker_p0-w0: resuming experience collection (2350 times) [2024-06-19 00:06:07,432][26599] Updated weights for policy 0, policy_version 238014 (0.0039) [2024-06-19 00:06:08,380][26367] Fps is (10 sec: 37682.8, 60 sec: 40959.9, 300 sec: 41432.1). Total num frames: 3899637760. Throughput: 0: 41004.4. Samples: 167266680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:08,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 00:06:11,650][26599] Updated weights for policy 0, policy_version 238024 (0.0059) [2024-06-19 00:06:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41781.7, 300 sec: 41488.1). Total num frames: 3899883520. Throughput: 0: 41048.8. Samples: 167507020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:13,381][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 00:06:15,314][26599] Updated weights for policy 0, policy_version 238034 (0.0035) [2024-06-19 00:06:18,380][26367] Fps is (10 sec: 40960.7, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 3900047360. Throughput: 0: 41217.0. 
Samples: 167636380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:18,380][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 00:06:19,465][26599] Updated weights for policy 0, policy_version 238044 (0.0050) [2024-06-19 00:06:23,096][26599] Updated weights for policy 0, policy_version 238054 (0.0035) [2024-06-19 00:06:23,380][26367] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 41487.6). Total num frames: 3900276736. Throughput: 0: 41141.7. Samples: 167879040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:23,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 00:06:27,312][26599] Updated weights for policy 0, policy_version 238064 (0.0032) [2024-06-19 00:06:28,380][26367] Fps is (10 sec: 44235.9, 60 sec: 41506.0, 300 sec: 41432.0). Total num frames: 3900489728. Throughput: 0: 41166.7. Samples: 168129100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:28,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 00:06:30,807][26599] Updated weights for policy 0, policy_version 238074 (0.0033) [2024-06-19 00:06:33,380][26367] Fps is (10 sec: 37683.5, 60 sec: 40960.1, 300 sec: 41376.5). Total num frames: 3900653568. Throughput: 0: 41006.9. Samples: 168249560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:33,380][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 00:06:35,155][26599] Updated weights for policy 0, policy_version 238084 (0.0024) [2024-06-19 00:06:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41506.0, 300 sec: 41432.1). Total num frames: 3900899328. Throughput: 0: 41168.0. Samples: 168502720. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:38,381][26367] Avg episode reward: [(0, '0.773')] [2024-06-19 00:06:38,774][26599] Updated weights for policy 0, policy_version 238094 (0.0029) [2024-06-19 00:06:43,079][26599] Updated weights for policy 0, policy_version 238104 (0.0033) [2024-06-19 00:06:43,380][26367] Fps is (10 sec: 45875.4, 60 sec: 41506.3, 300 sec: 41432.1). Total num frames: 3901112320. Throughput: 0: 41255.6. Samples: 168750840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:43,384][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 00:06:46,513][26599] Updated weights for policy 0, policy_version 238114 (0.0033) [2024-06-19 00:06:48,384][26367] Fps is (10 sec: 37669.5, 60 sec: 40957.6, 300 sec: 41376.0). Total num frames: 3901276160. Throughput: 0: 41076.2. Samples: 168870180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:48,384][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 00:06:50,966][26599] Updated weights for policy 0, policy_version 238124 (0.0033) [2024-06-19 00:06:53,380][26367] Fps is (10 sec: 40959.1, 60 sec: 41779.1, 300 sec: 41432.1). Total num frames: 3901521920. Throughput: 0: 41227.1. Samples: 169121900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:53,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 00:06:54,851][26599] Updated weights for policy 0, policy_version 238134 (0.0035) [2024-06-19 00:06:58,380][26367] Fps is (10 sec: 42614.3, 60 sec: 40687.0, 300 sec: 41432.1). Total num frames: 3901702144. Throughput: 0: 41554.3. Samples: 169376960. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:06:58,380][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 00:06:58,828][26599] Updated weights for policy 0, policy_version 238144 (0.0050) [2024-06-19 00:07:02,529][26599] Updated weights for policy 0, policy_version 238154 (0.0045) [2024-06-19 00:07:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 40959.9, 300 sec: 41432.6). Total num frames: 3901915136. Throughput: 0: 41245.6. Samples: 169492440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:07:03,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 00:07:06,763][26599] Updated weights for policy 0, policy_version 238164 (0.0052) [2024-06-19 00:07:08,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41321.0). Total num frames: 3902111744. Throughput: 0: 41422.2. Samples: 169743040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:07:08,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 00:07:10,808][26599] Updated weights for policy 0, policy_version 238174 (0.0041) [2024-06-19 00:07:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40686.9, 300 sec: 41376.5). Total num frames: 3902324736. Throughput: 0: 41386.7. Samples: 169991500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 00:07:13,381][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 00:07:14,798][26599] Updated weights for policy 0, policy_version 238184 (0.0038) [2024-06-19 00:07:16,849][26579] Signal inference workers to stop experience collection... (2400 times) [2024-06-19 00:07:16,856][26579] Signal inference workers to resume experience collection... (2400 times) [2024-06-19 00:07:16,896][26599] InferenceWorker_p0-w0: stopping experience collection (2400 times) [2024-06-19 00:07:16,897][26599] InferenceWorker_p0-w0: resuming experience collection (2400 times) [2024-06-19 00:07:18,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41779.1, 300 sec: 41432.1). Total num frames: 3902554112. Throughput: 0: 41444.8. 
Samples: 170114580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:07:18,381][26367] Avg episode reward: [(0, '0.782')] [2024-06-19 00:07:18,601][26599] Updated weights for policy 0, policy_version 238194 (0.0039) [2024-06-19 00:07:22,666][26599] Updated weights for policy 0, policy_version 238204 (0.0031) [2024-06-19 00:07:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 3902750720. Throughput: 0: 41435.9. Samples: 170367340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:07:23,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 00:07:26,563][26599] Updated weights for policy 0, policy_version 238214 (0.0026) [2024-06-19 00:07:28,380][26367] Fps is (10 sec: 39322.1, 60 sec: 40960.2, 300 sec: 41376.6). Total num frames: 3902947328. Throughput: 0: 41416.9. Samples: 170614600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:07:28,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 00:07:30,785][26599] Updated weights for policy 0, policy_version 238224 (0.0041) [2024-06-19 00:07:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 41487.6). Total num frames: 3903193088. Throughput: 0: 41482.9. Samples: 170736760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:07:33,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 00:07:34,440][26599] Updated weights for policy 0, policy_version 238234 (0.0049) [2024-06-19 00:07:38,380][26367] Fps is (10 sec: 40959.2, 60 sec: 40960.0, 300 sec: 41377.0). Total num frames: 3903356928. Throughput: 0: 41412.5. Samples: 170985460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:07:38,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 00:07:38,487][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000238243_3903373312.pth... 
[2024-06-19 00:07:38,536][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000237637_3893444608.pth [2024-06-19 00:07:38,851][26599] Updated weights for policy 0, policy_version 238244 (0.0027) [2024-06-19 00:07:42,273][26599] Updated weights for policy 0, policy_version 238254 (0.0034) [2024-06-19 00:07:43,381][26367] Fps is (10 sec: 37682.5, 60 sec: 40959.8, 300 sec: 41432.2). Total num frames: 3903569920. Throughput: 0: 41286.0. Samples: 171234840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:07:43,381][26367] Avg episode reward: [(0, '0.820')] [2024-06-19 00:07:46,605][26599] Updated weights for policy 0, policy_version 238264 (0.0030) [2024-06-19 00:07:48,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42327.9, 300 sec: 41487.6). Total num frames: 3903815680. Throughput: 0: 41604.5. Samples: 171364640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:07:48,381][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 00:07:50,013][26599] Updated weights for policy 0, policy_version 238274 (0.0033) [2024-06-19 00:07:53,380][26367] Fps is (10 sec: 40961.3, 60 sec: 40960.1, 300 sec: 41376.5). Total num frames: 3903979520. Throughput: 0: 41693.0. Samples: 171619220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:07:53,380][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 00:07:54,348][26599] Updated weights for policy 0, policy_version 238284 (0.0041) [2024-06-19 00:07:58,026][26599] Updated weights for policy 0, policy_version 238294 (0.0033) [2024-06-19 00:07:58,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 3904225280. Throughput: 0: 41669.9. Samples: 171866640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:07:58,381][26367] Avg episode reward: [(0, '0.206')] [2024-06-19 00:08:02,276][26599] Updated weights for policy 0, policy_version 238304 (0.0038) [2024-06-19 00:08:03,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 3904438272. Throughput: 0: 41848.0. Samples: 171997740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:08:03,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 00:08:05,989][26599] Updated weights for policy 0, policy_version 238314 (0.0038) [2024-06-19 00:08:08,380][26367] Fps is (10 sec: 37682.8, 60 sec: 41506.1, 300 sec: 41321.0). Total num frames: 3904602112. Throughput: 0: 41629.4. Samples: 172240660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:08:08,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 00:08:10,188][26599] Updated weights for policy 0, policy_version 238324 (0.0030) [2024-06-19 00:08:13,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 41377.0). Total num frames: 3904831488. Throughput: 0: 41687.4. Samples: 172490540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:08:13,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 00:08:13,801][26599] Updated weights for policy 0, policy_version 238334 (0.0032) [2024-06-19 00:08:17,912][26599] Updated weights for policy 0, policy_version 238344 (0.0038) [2024-06-19 00:08:18,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 3905044480. Throughput: 0: 41932.5. Samples: 172623720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:08:18,380][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 00:08:21,475][26599] Updated weights for policy 0, policy_version 238354 (0.0034) [2024-06-19 00:08:23,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41376.5). Total num frames: 3905241088. Throughput: 0: 41833.5. Samples: 172867960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 00:08:23,380][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 00:08:25,895][26599] Updated weights for policy 0, policy_version 238364 (0.0038) [2024-06-19 00:08:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41543.1). Total num frames: 3905470464. Throughput: 0: 41804.2. Samples: 173116020. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:08:28,381][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 00:08:29,225][26599] Updated weights for policy 0, policy_version 238374 (0.0029) [2024-06-19 00:08:33,380][26367] Fps is (10 sec: 40959.5, 60 sec: 40960.0, 300 sec: 41376.5). Total num frames: 3905650688. Throughput: 0: 41836.5. Samples: 173247280. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:08:33,381][26367] Avg episode reward: [(0, '0.276')] [2024-06-19 00:08:33,527][26599] Updated weights for policy 0, policy_version 238384 (0.0033) [2024-06-19 00:08:34,902][26579] Signal inference workers to stop experience collection... (2450 times) [2024-06-19 00:08:34,950][26599] InferenceWorker_p0-w0: stopping experience collection (2450 times) [2024-06-19 00:08:34,959][26579] Signal inference workers to resume experience collection... (2450 times) [2024-06-19 00:08:34,968][26599] InferenceWorker_p0-w0: resuming experience collection (2450 times) [2024-06-19 00:08:37,158][26599] Updated weights for policy 0, policy_version 238394 (0.0052) [2024-06-19 00:08:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 3905880064. Throughput: 0: 41692.8. Samples: 173495400. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:08:38,381][26367] Avg episode reward: [(0, '0.323')] [2024-06-19 00:08:41,252][26599] Updated weights for policy 0, policy_version 238404 (0.0031) [2024-06-19 00:08:43,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42325.5, 300 sec: 41543.7). Total num frames: 3906109440. Throughput: 0: 41709.4. 
Samples: 173743560. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:08:43,380][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 00:08:45,011][26599] Updated weights for policy 0, policy_version 238414 (0.0027) [2024-06-19 00:08:48,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41488.1). Total num frames: 3906306048. Throughput: 0: 41652.0. Samples: 173872080. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:08:48,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 00:08:49,689][26599] Updated weights for policy 0, policy_version 238424 (0.0046) [2024-06-19 00:08:52,878][26599] Updated weights for policy 0, policy_version 238434 (0.0047) [2024-06-19 00:08:53,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41432.1). Total num frames: 3906502656. Throughput: 0: 41681.9. Samples: 174116340. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:08:53,380][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 00:08:57,591][26599] Updated weights for policy 0, policy_version 238444 (0.0028) [2024-06-19 00:08:58,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 3906715648. Throughput: 0: 41839.2. Samples: 174373300. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:08:58,380][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 00:09:00,638][26599] Updated weights for policy 0, policy_version 238454 (0.0040) [2024-06-19 00:09:03,380][26367] Fps is (10 sec: 40959.2, 60 sec: 41233.0, 300 sec: 41376.6). Total num frames: 3906912256. Throughput: 0: 41531.8. Samples: 174492660. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:09:03,381][26367] Avg episode reward: [(0, '0.378')] [2024-06-19 00:09:05,399][26599] Updated weights for policy 0, policy_version 238464 (0.0031) [2024-06-19 00:09:08,382][26367] Fps is (10 sec: 42591.8, 60 sec: 42324.4, 300 sec: 41543.0). Total num frames: 3907141632. Throughput: 0: 41667.5. Samples: 174743060. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:09:08,382][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 00:09:08,448][26599] Updated weights for policy 0, policy_version 238474 (0.0039) [2024-06-19 00:09:13,326][26599] Updated weights for policy 0, policy_version 238484 (0.0038) [2024-06-19 00:09:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41376.5). Total num frames: 3907321856. Throughput: 0: 41730.6. Samples: 174993900. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:09:13,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 00:09:16,435][26599] Updated weights for policy 0, policy_version 238494 (0.0045) [2024-06-19 00:09:18,383][26367] Fps is (10 sec: 39315.8, 60 sec: 41504.1, 300 sec: 41376.1). Total num frames: 3907534848. Throughput: 0: 41363.6. Samples: 175108760. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:09:18,384][26367] Avg episode reward: [(0, '0.381')] [2024-06-19 00:09:21,298][26599] Updated weights for policy 0, policy_version 238504 (0.0038) [2024-06-19 00:09:23,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41432.1). Total num frames: 3907747840. Throughput: 0: 41465.8. Samples: 175361360. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:09:23,380][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 00:09:24,880][26599] Updated weights for policy 0, policy_version 238514 (0.0039) [2024-06-19 00:09:28,380][26367] Fps is (10 sec: 42610.6, 60 sec: 41506.1, 300 sec: 41488.1). Total num frames: 3907960832. Throughput: 0: 41481.7. Samples: 175610240. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:09:28,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 00:09:29,033][26599] Updated weights for policy 0, policy_version 238524 (0.0029) [2024-06-19 00:09:32,623][26599] Updated weights for policy 0, policy_version 238534 (0.0029) [2024-06-19 00:09:33,380][26367] Fps is (10 sec: 40959.1, 60 sec: 41779.1, 300 sec: 41432.1). 
Total num frames: 3908157440. Throughput: 0: 41360.0. Samples: 175733280. Policy #0 lag: (min: 1.0, avg: 8.8, max: 19.0) [2024-06-19 00:09:33,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 00:09:36,500][26599] Updated weights for policy 0, policy_version 238544 (0.0032) [2024-06-19 00:09:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3908386816. Throughput: 0: 41557.3. Samples: 175986420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:09:38,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 00:09:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000238549_3908386816.pth... [2024-06-19 00:09:38,448][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000237941_3898425344.pth [2024-06-19 00:09:40,381][26599] Updated weights for policy 0, policy_version 238554 (0.0036) [2024-06-19 00:09:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40959.9, 300 sec: 41432.6). Total num frames: 3908567040. Throughput: 0: 41498.9. Samples: 176240760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:09:43,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 00:09:44,092][26599] Updated weights for policy 0, policy_version 238564 (0.0023) [2024-06-19 00:09:48,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 3908780032. Throughput: 0: 41581.0. Samples: 176363800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:09:48,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 00:09:48,541][26599] Updated weights for policy 0, policy_version 238574 (0.0045) [2024-06-19 00:09:51,798][26599] Updated weights for policy 0, policy_version 238584 (0.0048) [2024-06-19 00:09:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3908993024. Throughput: 0: 41340.9. Samples: 176603340. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:09:53,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 00:09:56,392][26599] Updated weights for policy 0, policy_version 238594 (0.0053) [2024-06-19 00:09:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 3909189632. Throughput: 0: 41428.0. Samples: 176858160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:09:58,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 00:10:00,036][26599] Updated weights for policy 0, policy_version 238604 (0.0037) [2024-06-19 00:10:03,384][26367] Fps is (10 sec: 40945.0, 60 sec: 41503.7, 300 sec: 41431.6). Total num frames: 3909402624. Throughput: 0: 41612.2. Samples: 176981340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:10:03,384][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 00:10:04,152][26599] Updated weights for policy 0, policy_version 238614 (0.0036) [2024-06-19 00:10:07,799][26599] Updated weights for policy 0, policy_version 238624 (0.0033) [2024-06-19 00:10:08,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41234.0, 300 sec: 41488.1). Total num frames: 3909615616. Throughput: 0: 41636.7. Samples: 177235020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:10:08,381][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 00:10:11,806][26599] Updated weights for policy 0, policy_version 238634 (0.0030) [2024-06-19 00:10:13,380][26367] Fps is (10 sec: 42614.0, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 3909828608. Throughput: 0: 41660.9. Samples: 177484980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:10:13,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 00:10:15,525][26599] Updated weights for policy 0, policy_version 238644 (0.0032) [2024-06-19 00:10:18,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41508.2, 300 sec: 41376.6). Total num frames: 3910025216. Throughput: 0: 41710.5. Samples: 177610240. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:10:18,380][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 00:10:19,380][26579] Signal inference workers to stop experience collection... (2500 times) [2024-06-19 00:10:19,382][26579] Signal inference workers to resume experience collection... (2500 times) [2024-06-19 00:10:19,421][26599] InferenceWorker_p0-w0: stopping experience collection (2500 times) [2024-06-19 00:10:19,421][26599] InferenceWorker_p0-w0: resuming experience collection (2500 times) [2024-06-19 00:10:19,523][26599] Updated weights for policy 0, policy_version 238654 (0.0037) [2024-06-19 00:10:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 3910254592. Throughput: 0: 41742.2. Samples: 177864820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:10:23,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 00:10:23,422][26599] Updated weights for policy 0, policy_version 238664 (0.0033) [2024-06-19 00:10:27,634][26599] Updated weights for policy 0, policy_version 238674 (0.0041) [2024-06-19 00:10:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 3910451200. Throughput: 0: 41689.0. Samples: 178116760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:10:28,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 00:10:31,058][26599] Updated weights for policy 0, policy_version 238684 (0.0040) [2024-06-19 00:10:33,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41779.3, 300 sec: 41543.1). Total num frames: 3910664192. Throughput: 0: 41617.7. Samples: 178236600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:10:33,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 00:10:35,397][26599] Updated weights for policy 0, policy_version 238694 (0.0033) [2024-06-19 00:10:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 3910877184. Throughput: 0: 41855.5. 
Samples: 178486840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:10:38,381][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 00:10:39,319][26599] Updated weights for policy 0, policy_version 238704 (0.0048) [2024-06-19 00:10:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 3911073792. Throughput: 0: 41746.7. Samples: 178736760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 00:10:43,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 00:10:43,554][26599] Updated weights for policy 0, policy_version 238714 (0.0037) [2024-06-19 00:10:47,084][26599] Updated weights for policy 0, policy_version 238724 (0.0042) [2024-06-19 00:10:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3911286784. Throughput: 0: 41672.7. Samples: 178856460. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:10:48,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 00:10:51,345][26599] Updated weights for policy 0, policy_version 238734 (0.0037) [2024-06-19 00:10:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 3911499776. Throughput: 0: 41634.3. Samples: 179108560. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:10:53,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 00:10:55,156][26599] Updated weights for policy 0, policy_version 238744 (0.0044) [2024-06-19 00:10:58,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41432.1). Total num frames: 3911680000. Throughput: 0: 41506.7. Samples: 179352780. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:10:58,380][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 00:10:59,415][26599] Updated weights for policy 0, policy_version 238754 (0.0040) [2024-06-19 00:11:02,902][26599] Updated weights for policy 0, policy_version 238764 (0.0032) [2024-06-19 00:11:03,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41781.8, 300 sec: 41598.7). Total num frames: 3911909376. Throughput: 0: 41483.9. Samples: 179477020. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:03,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 00:11:07,416][26599] Updated weights for policy 0, policy_version 238774 (0.0036) [2024-06-19 00:11:08,381][26367] Fps is (10 sec: 40958.9, 60 sec: 41233.0, 300 sec: 41376.5). Total num frames: 3912089600. Throughput: 0: 41254.9. Samples: 179721300. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:08,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 00:11:10,902][26599] Updated weights for policy 0, policy_version 238784 (0.0028) [2024-06-19 00:11:13,380][26367] Fps is (10 sec: 40959.1, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 3912318976. Throughput: 0: 41224.3. Samples: 179971860. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:13,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 00:11:15,243][26599] Updated weights for policy 0, policy_version 238794 (0.0038) [2024-06-19 00:11:18,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41779.0, 300 sec: 41543.1). Total num frames: 3912531968. Throughput: 0: 41316.4. Samples: 180095840. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:18,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 00:11:19,158][26599] Updated weights for policy 0, policy_version 238804 (0.0034) [2024-06-19 00:11:23,024][26599] Updated weights for policy 0, policy_version 238814 (0.0029) [2024-06-19 00:11:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 3912728576. Throughput: 0: 41246.6. Samples: 180342940. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:23,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 00:11:27,062][26599] Updated weights for policy 0, policy_version 238824 (0.0035) [2024-06-19 00:11:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 3912941568. Throughput: 0: 41235.0. Samples: 180592340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:28,384][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 00:11:31,193][26599] Updated weights for policy 0, policy_version 238834 (0.0041) [2024-06-19 00:11:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41543.2). Total num frames: 3913154560. Throughput: 0: 41310.6. Samples: 180715440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:33,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 00:11:34,813][26599] Updated weights for policy 0, policy_version 238844 (0.0036) [2024-06-19 00:11:38,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 3913351168. Throughput: 0: 41201.6. Samples: 180962640. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:38,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 00:11:38,522][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000238853_3913367552.pth... 
[2024-06-19 00:11:38,573][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000238243_3903373312.pth [2024-06-19 00:11:39,081][26599] Updated weights for policy 0, policy_version 238854 (0.0037) [2024-06-19 00:11:42,742][26599] Updated weights for policy 0, policy_version 238864 (0.0043) [2024-06-19 00:11:43,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41599.2). Total num frames: 3913547776. Throughput: 0: 41125.8. Samples: 181203440. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:43,380][26367] Avg episode reward: [(0, '0.197')] [2024-06-19 00:11:47,264][26599] Updated weights for policy 0, policy_version 238874 (0.0031) [2024-06-19 00:11:48,380][26367] Fps is (10 sec: 37683.6, 60 sec: 40686.9, 300 sec: 41376.5). Total num frames: 3913728000. Throughput: 0: 41259.9. Samples: 181333720. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:48,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 00:11:50,486][26599] Updated weights for policy 0, policy_version 238884 (0.0043) [2024-06-19 00:11:52,763][26579] Signal inference workers to stop experience collection... (2550 times) [2024-06-19 00:11:52,766][26579] Signal inference workers to resume experience collection... (2550 times) [2024-06-19 00:11:52,782][26599] InferenceWorker_p0-w0: stopping experience collection (2550 times) [2024-06-19 00:11:52,783][26599] InferenceWorker_p0-w0: resuming experience collection (2550 times) [2024-06-19 00:11:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 3913957376. Throughput: 0: 41262.0. Samples: 181578080. Policy #0 lag: (min: 1.0, avg: 10.5, max: 21.0) [2024-06-19 00:11:53,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 00:11:55,073][26599] Updated weights for policy 0, policy_version 238894 (0.0033) [2024-06-19 00:11:58,380][26367] Fps is (10 sec: 45875.4, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 3914186752. 
Throughput: 0: 41211.2. Samples: 181826360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:11:58,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 00:11:58,526][26599] Updated weights for policy 0, policy_version 238904 (0.0040) [2024-06-19 00:12:02,855][26599] Updated weights for policy 0, policy_version 238914 (0.0044) [2024-06-19 00:12:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 3914366976. Throughput: 0: 41252.2. Samples: 181952180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:03,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 00:12:06,714][26599] Updated weights for policy 0, policy_version 238924 (0.0047) [2024-06-19 00:12:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.4, 300 sec: 41598.7). Total num frames: 3914596352. Throughput: 0: 41301.9. Samples: 182201520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:08,381][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 00:12:10,664][26599] Updated weights for policy 0, policy_version 238934 (0.0038) [2024-06-19 00:12:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 3914792960. Throughput: 0: 41248.4. Samples: 182448520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:13,381][26367] Avg episode reward: [(0, '0.401')] [2024-06-19 00:12:14,455][26599] Updated weights for policy 0, policy_version 238944 (0.0044) [2024-06-19 00:12:18,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3915005952. Throughput: 0: 41282.7. Samples: 182573160. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:18,381][26367] Avg episode reward: [(0, '0.401')] [2024-06-19 00:12:18,513][26599] Updated weights for policy 0, policy_version 238954 (0.0037) [2024-06-19 00:12:22,319][26599] Updated weights for policy 0, policy_version 238964 (0.0053) [2024-06-19 00:12:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 3915218944. Throughput: 0: 41343.7. Samples: 182823100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:23,381][26367] Avg episode reward: [(0, '0.282')] [2024-06-19 00:12:26,412][26599] Updated weights for policy 0, policy_version 238974 (0.0034) [2024-06-19 00:12:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 3915415552. Throughput: 0: 41507.8. Samples: 183071300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:28,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 00:12:30,503][26599] Updated weights for policy 0, policy_version 238984 (0.0032) [2024-06-19 00:12:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 3915612160. Throughput: 0: 41350.3. Samples: 183194480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:33,381][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 00:12:34,513][26599] Updated weights for policy 0, policy_version 238994 (0.0026) [2024-06-19 00:12:38,239][26599] Updated weights for policy 0, policy_version 239004 (0.0027) [2024-06-19 00:12:38,380][26367] Fps is (10 sec: 42599.5, 60 sec: 41506.3, 300 sec: 41598.7). Total num frames: 3915841536. Throughput: 0: 41559.6. Samples: 183448260. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:38,380][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 00:12:42,249][26599] Updated weights for policy 0, policy_version 239014 (0.0038) [2024-06-19 00:12:43,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41232.9, 300 sec: 41376.5). Total num frames: 3916021760. Throughput: 0: 41547.4. Samples: 183696000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:43,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 00:12:46,000][26599] Updated weights for policy 0, policy_version 239024 (0.0044) [2024-06-19 00:12:48,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 3916267520. Throughput: 0: 41533.6. Samples: 183821200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:48,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 00:12:50,133][26599] Updated weights for policy 0, policy_version 239034 (0.0047) [2024-06-19 00:12:53,380][26367] Fps is (10 sec: 42599.1, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 3916447744. Throughput: 0: 41583.9. Samples: 184072800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:53,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 00:12:53,813][26599] Updated weights for policy 0, policy_version 239044 (0.0030) [2024-06-19 00:12:57,861][26599] Updated weights for policy 0, policy_version 239054 (0.0023) [2024-06-19 00:12:58,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 3916660736. Throughput: 0: 41513.9. Samples: 184316640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:12:58,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 00:13:02,227][26599] Updated weights for policy 0, policy_version 239064 (0.0030) [2024-06-19 00:13:03,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 41654.2). Total num frames: 3916890112. Throughput: 0: 41547.9. Samples: 184442820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 00:13:03,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 00:13:05,544][26599] Updated weights for policy 0, policy_version 239074 (0.0028) [2024-06-19 00:13:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 3917070336. Throughput: 0: 41700.1. Samples: 184699600. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:08,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 00:13:09,908][26599] Updated weights for policy 0, policy_version 239084 (0.0032) [2024-06-19 00:13:13,384][26367] Fps is (10 sec: 40946.1, 60 sec: 41776.8, 300 sec: 41542.7). Total num frames: 3917299712. Throughput: 0: 41742.2. Samples: 184949840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:13,384][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 00:13:13,769][26599] Updated weights for policy 0, policy_version 239094 (0.0037) [2024-06-19 00:13:17,653][26599] Updated weights for policy 0, policy_version 239104 (0.0032) [2024-06-19 00:13:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 3917496320. Throughput: 0: 41788.0. Samples: 185074940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:18,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 00:13:18,511][26579] Signal inference workers to stop experience collection... (2600 times) [2024-06-19 00:13:18,516][26579] Signal inference workers to resume experience collection... (2600 times) [2024-06-19 00:13:18,543][26599] InferenceWorker_p0-w0: stopping experience collection (2600 times) [2024-06-19 00:13:18,543][26599] InferenceWorker_p0-w0: resuming experience collection (2600 times) [2024-06-19 00:13:21,465][26599] Updated weights for policy 0, policy_version 239114 (0.0048) [2024-06-19 00:13:23,380][26367] Fps is (10 sec: 39335.8, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 3917692928. Throughput: 0: 41789.3. 
Samples: 185328780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:23,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 00:13:25,370][26599] Updated weights for policy 0, policy_version 239124 (0.0042) [2024-06-19 00:13:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 3917922304. Throughput: 0: 41778.0. Samples: 185576000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:28,380][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 00:13:29,723][26599] Updated weights for policy 0, policy_version 239134 (0.0023) [2024-06-19 00:13:33,157][26599] Updated weights for policy 0, policy_version 239144 (0.0033) [2024-06-19 00:13:33,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 41543.1). Total num frames: 3918135296. Throughput: 0: 41760.9. Samples: 185700440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:33,381][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 00:13:37,577][26599] Updated weights for policy 0, policy_version 239154 (0.0028) [2024-06-19 00:13:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 41432.1). Total num frames: 3918331904. Throughput: 0: 41800.9. Samples: 185953840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:38,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 00:13:38,456][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000239157_3918348288.pth... [2024-06-19 00:13:38,503][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000238549_3908386816.pth [2024-06-19 00:13:40,886][26599] Updated weights for policy 0, policy_version 239164 (0.0051) [2024-06-19 00:13:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 41543.2). Total num frames: 3918561280. Throughput: 0: 41854.1. Samples: 186200080. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:43,390][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 00:13:45,385][26599] Updated weights for policy 0, policy_version 239174 (0.0047) [2024-06-19 00:13:48,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3918774272. Throughput: 0: 41890.3. Samples: 186327880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:48,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 00:13:48,731][26599] Updated weights for policy 0, policy_version 239184 (0.0039) [2024-06-19 00:13:53,370][26599] Updated weights for policy 0, policy_version 239194 (0.0036) [2024-06-19 00:13:53,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41779.3, 300 sec: 41487.6). Total num frames: 3918954496. Throughput: 0: 41796.5. Samples: 186580440. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:53,380][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 00:13:56,703][26599] Updated weights for policy 0, policy_version 239204 (0.0030) [2024-06-19 00:13:58,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 41543.2). Total num frames: 3919167488. Throughput: 0: 41878.5. Samples: 186834220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:13:58,380][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 00:14:01,364][26599] Updated weights for policy 0, policy_version 239214 (0.0039) [2024-06-19 00:14:03,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42052.5, 300 sec: 41598.9). Total num frames: 3919413248. Throughput: 0: 41911.6. Samples: 186960960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:14:03,380][26367] Avg episode reward: [(0, '0.468')] [2024-06-19 00:14:04,186][26599] Updated weights for policy 0, policy_version 239224 (0.0035) [2024-06-19 00:14:08,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41543.1). Total num frames: 3919577088. Throughput: 0: 41735.4. Samples: 187206880. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:14:08,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 00:14:09,091][26599] Updated weights for policy 0, policy_version 239234 (0.0031) [2024-06-19 00:14:11,878][26599] Updated weights for policy 0, policy_version 239244 (0.0036) [2024-06-19 00:14:13,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41508.7, 300 sec: 41543.6). Total num frames: 3919790080. Throughput: 0: 41764.5. Samples: 187455400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 00:14:13,380][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 00:14:16,901][26599] Updated weights for policy 0, policy_version 239254 (0.0042) [2024-06-19 00:14:18,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 3920019456. Throughput: 0: 41733.9. Samples: 187578460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:14:18,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 00:14:20,047][26599] Updated weights for policy 0, policy_version 239264 (0.0036) [2024-06-19 00:14:22,662][26579] Signal inference workers to stop experience collection... (2650 times) [2024-06-19 00:14:22,720][26599] InferenceWorker_p0-w0: stopping experience collection (2650 times) [2024-06-19 00:14:22,723][26579] Signal inference workers to resume experience collection... (2650 times) [2024-06-19 00:14:22,728][26599] InferenceWorker_p0-w0: resuming experience collection (2650 times) [2024-06-19 00:14:23,380][26367] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 41487.6). Total num frames: 3920199680. Throughput: 0: 41687.9. Samples: 187829800. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:14:23,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 00:14:24,867][26599] Updated weights for policy 0, policy_version 239274 (0.0041) [2024-06-19 00:14:27,799][26599] Updated weights for policy 0, policy_version 239284 (0.0028) [2024-06-19 00:14:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 3920429056. Throughput: 0: 41543.5. Samples: 188069540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:14:28,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 00:14:32,677][26599] Updated weights for policy 0, policy_version 239294 (0.0040) [2024-06-19 00:14:33,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41233.2, 300 sec: 41432.1). Total num frames: 3920609280. Throughput: 0: 41631.7. Samples: 188201300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:14:33,381][26367] Avg episode reward: [(0, '0.192')] [2024-06-19 00:14:35,537][26599] Updated weights for policy 0, policy_version 239304 (0.0030) [2024-06-19 00:14:38,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 3920822272. Throughput: 0: 41590.6. Samples: 188452020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:14:38,380][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 00:14:40,395][26599] Updated weights for policy 0, policy_version 239314 (0.0051) [2024-06-19 00:14:43,336][26599] Updated weights for policy 0, policy_version 239324 (0.0025) [2024-06-19 00:14:43,380][26367] Fps is (10 sec: 47512.4, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 3921084416. Throughput: 0: 41207.8. Samples: 188688580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:14:43,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 00:14:48,125][26599] Updated weights for policy 0, policy_version 239334 (0.0034) [2024-06-19 00:14:48,380][26367] Fps is (10 sec: 42597.6, 60 sec: 41233.0, 300 sec: 41543.1). 
Total num frames: 3921248256. Throughput: 0: 41458.5. Samples: 188826600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:14:48,381][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 00:14:51,035][26599] Updated weights for policy 0, policy_version 239344 (0.0039) [2024-06-19 00:14:53,380][26367] Fps is (10 sec: 37683.8, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 3921461248. Throughput: 0: 41333.4. Samples: 189066880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:14:53,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 00:14:56,705][26599] Updated weights for policy 0, policy_version 239354 (0.0041) [2024-06-19 00:14:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41599.2). Total num frames: 3921674240. Throughput: 0: 41350.1. Samples: 189316160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:14:58,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 00:14:58,930][26599] Updated weights for policy 0, policy_version 239364 (0.0048) [2024-06-19 00:15:03,380][26367] Fps is (10 sec: 37683.0, 60 sec: 40413.8, 300 sec: 41432.1). Total num frames: 3921838080. Throughput: 0: 41400.0. Samples: 189441460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:15:03,381][26367] Avg episode reward: [(0, '0.305')] [2024-06-19 00:15:04,546][26599] Updated weights for policy 0, policy_version 239374 (0.0024) [2024-06-19 00:15:06,646][26599] Updated weights for policy 0, policy_version 239384 (0.0033) [2024-06-19 00:15:08,384][26367] Fps is (10 sec: 40945.0, 60 sec: 41776.7, 300 sec: 41542.6). Total num frames: 3922083840. Throughput: 0: 41094.5. Samples: 189679200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:15:08,385][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 00:15:12,525][26599] Updated weights for policy 0, policy_version 239394 (0.0038) [2024-06-19 00:15:13,380][26367] Fps is (10 sec: 44237.6, 60 sec: 41506.1, 300 sec: 41543.2). 
Total num frames: 3922280448. Throughput: 0: 41584.6. Samples: 189940840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:15:13,380][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 00:15:14,865][26599] Updated weights for policy 0, policy_version 239404 (0.0033) [2024-06-19 00:15:18,384][26367] Fps is (10 sec: 39321.7, 60 sec: 40957.5, 300 sec: 41431.6). Total num frames: 3922477056. Throughput: 0: 41293.1. Samples: 190059640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:15:18,384][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 00:15:20,261][26599] Updated weights for policy 0, policy_version 239414 (0.0038) [2024-06-19 00:15:22,523][26599] Updated weights for policy 0, policy_version 239424 (0.0034) [2024-06-19 00:15:23,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 3922722816. Throughput: 0: 41241.7. Samples: 190307900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 00:15:23,381][26367] Avg episode reward: [(0, '0.287')] [2024-06-19 00:15:27,643][26579] Signal inference workers to stop experience collection... (2700 times) [2024-06-19 00:15:27,651][26599] InferenceWorker_p0-w0: stopping experience collection (2700 times) [2024-06-19 00:15:27,701][26579] Signal inference workers to resume experience collection... (2700 times) [2024-06-19 00:15:27,701][26599] InferenceWorker_p0-w0: resuming experience collection (2700 times) [2024-06-19 00:15:28,166][26599] Updated weights for policy 0, policy_version 239434 (0.0044) [2024-06-19 00:15:28,380][26367] Fps is (10 sec: 42614.0, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 3922903040. Throughput: 0: 41769.1. Samples: 190568180. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:15:28,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 00:15:30,752][26599] Updated weights for policy 0, policy_version 239444 (0.0033) [2024-06-19 00:15:33,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41487.6). Total num frames: 3923116032. Throughput: 0: 41182.8. Samples: 190679820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:15:33,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 00:15:36,001][26599] Updated weights for policy 0, policy_version 239454 (0.0037) [2024-06-19 00:15:38,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 3923361792. Throughput: 0: 41567.2. Samples: 190937400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:15:38,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 00:15:38,525][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000239464_3923378176.pth... [2024-06-19 00:15:38,527][26599] Updated weights for policy 0, policy_version 239464 (0.0038) [2024-06-19 00:15:38,578][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000238853_3913367552.pth [2024-06-19 00:15:43,380][26367] Fps is (10 sec: 39321.4, 60 sec: 40414.0, 300 sec: 41432.1). Total num frames: 3923509248. Throughput: 0: 41808.8. Samples: 191197560. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:15:43,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 00:15:43,870][26599] Updated weights for policy 0, policy_version 239474 (0.0025) [2024-06-19 00:15:46,356][26599] Updated weights for policy 0, policy_version 239484 (0.0045) [2024-06-19 00:15:48,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 3923755008. Throughput: 0: 41504.0. Samples: 191309140. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:15:48,381][26367] Avg episode reward: [(0, '0.316')] [2024-06-19 00:15:51,720][26599] Updated weights for policy 0, policy_version 239494 (0.0038) [2024-06-19 00:15:53,380][26367] Fps is (10 sec: 45875.2, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 3923968000. Throughput: 0: 42002.9. Samples: 191569180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:15:53,382][26367] Avg episode reward: [(0, '0.396')] [2024-06-19 00:15:54,156][26599] Updated weights for policy 0, policy_version 239504 (0.0030) [2024-06-19 00:15:58,380][26367] Fps is (10 sec: 37683.4, 60 sec: 40960.0, 300 sec: 41432.1). Total num frames: 3924131840. Throughput: 0: 41835.9. Samples: 191823460. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:15:58,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 00:15:59,410][26599] Updated weights for policy 0, policy_version 239514 (0.0039) [2024-06-19 00:16:01,981][26599] Updated weights for policy 0, policy_version 239524 (0.0034) [2024-06-19 00:16:03,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42322.8, 300 sec: 41653.7). Total num frames: 3924377600. Throughput: 0: 41781.7. Samples: 191939820. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:16:03,385][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 00:16:07,304][26599] Updated weights for policy 0, policy_version 239534 (0.0033) [2024-06-19 00:16:08,380][26367] Fps is (10 sec: 45875.0, 60 sec: 41781.7, 300 sec: 41598.7). Total num frames: 3924590592. Throughput: 0: 42095.1. Samples: 192202180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:16:08,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 00:16:09,778][26599] Updated weights for policy 0, policy_version 239544 (0.0034) [2024-06-19 00:16:13,380][26367] Fps is (10 sec: 39336.2, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3924770816. Throughput: 0: 41678.7. Samples: 192443720. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:16:13,380][26367] Avg episode reward: [(0, '0.411')] [2024-06-19 00:16:15,313][26599] Updated weights for policy 0, policy_version 239554 (0.0035) [2024-06-19 00:16:17,691][26599] Updated weights for policy 0, policy_version 239564 (0.0033) [2024-06-19 00:16:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42327.9, 300 sec: 41654.3). Total num frames: 3925016576. Throughput: 0: 41944.9. Samples: 192567340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:16:18,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 00:16:23,226][26599] Updated weights for policy 0, policy_version 239574 (0.0038) [2024-06-19 00:16:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 3925180416. Throughput: 0: 41777.7. Samples: 192817400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:16:23,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 00:16:25,522][26599] Updated weights for policy 0, policy_version 239584 (0.0022) [2024-06-19 00:16:28,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3925393408. Throughput: 0: 41527.6. Samples: 193066300. Policy #0 lag: (min: 0.0, avg: 12.6, max: 20.0) [2024-06-19 00:16:28,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 00:16:29,351][26579] Signal inference workers to stop experience collection... (2750 times) [2024-06-19 00:16:29,351][26579] Signal inference workers to resume experience collection... (2750 times) [2024-06-19 00:16:29,383][26599] InferenceWorker_p0-w0: stopping experience collection (2750 times) [2024-06-19 00:16:29,383][26599] InferenceWorker_p0-w0: resuming experience collection (2750 times) [2024-06-19 00:16:31,153][26599] Updated weights for policy 0, policy_version 239594 (0.0043) [2024-06-19 00:16:33,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42052.3, 300 sec: 41654.3). Total num frames: 3925639168. Throughput: 0: 42000.5. 
Samples: 193199160. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:16:33,380][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 00:16:34,005][26599] Updated weights for policy 0, policy_version 239604 (0.0034) [2024-06-19 00:16:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40686.9, 300 sec: 41543.2). Total num frames: 3925803008. Throughput: 0: 41733.4. Samples: 193447180. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:16:38,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 00:16:38,954][26599] Updated weights for policy 0, policy_version 239614 (0.0028) [2024-06-19 00:16:41,850][26599] Updated weights for policy 0, policy_version 239624 (0.0037) [2024-06-19 00:16:43,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3926032384. Throughput: 0: 41500.8. Samples: 193691000. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:16:43,381][26367] Avg episode reward: [(0, '0.814')] [2024-06-19 00:16:46,598][26599] Updated weights for policy 0, policy_version 239634 (0.0026) [2024-06-19 00:16:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3926245376. Throughput: 0: 41792.3. Samples: 193820320. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:16:48,381][26367] Avg episode reward: [(0, '0.827')] [2024-06-19 00:16:49,684][26599] Updated weights for policy 0, policy_version 239644 (0.0041) [2024-06-19 00:16:53,380][26367] Fps is (10 sec: 39321.3, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 3926425600. Throughput: 0: 41503.0. Samples: 194069820. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:16:53,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 00:16:54,436][26599] Updated weights for policy 0, policy_version 239654 (0.0042) [2024-06-19 00:16:57,490][26599] Updated weights for policy 0, policy_version 239664 (0.0030) [2024-06-19 00:16:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 3926671360. Throughput: 0: 41354.1. Samples: 194304660. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:16:58,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 00:17:02,632][26599] Updated weights for policy 0, policy_version 239674 (0.0040) [2024-06-19 00:17:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41235.5, 300 sec: 41543.1). Total num frames: 3926851584. Throughput: 0: 41496.4. Samples: 194434680. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:17:03,383][26367] Avg episode reward: [(0, '0.793')] [2024-06-19 00:17:05,349][26599] Updated weights for policy 0, policy_version 239684 (0.0039) [2024-06-19 00:17:08,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 3927064576. Throughput: 0: 41418.6. Samples: 194681240. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:17:08,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 00:17:10,460][26599] Updated weights for policy 0, policy_version 239694 (0.0034) [2024-06-19 00:17:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3927277568. Throughput: 0: 41234.7. Samples: 194921860. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:17:13,381][26367] Avg episode reward: [(0, '0.329')] [2024-06-19 00:17:13,627][26599] Updated weights for policy 0, policy_version 239704 (0.0050) [2024-06-19 00:17:18,351][26599] Updated weights for policy 0, policy_version 239714 (0.0028) [2024-06-19 00:17:18,384][26367] Fps is (10 sec: 40946.0, 60 sec: 40957.6, 300 sec: 41542.7). Total num frames: 3927474176. Throughput: 0: 41083.4. Samples: 195048060. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:17:18,384][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 00:17:21,590][26599] Updated weights for policy 0, policy_version 239724 (0.0035) [2024-06-19 00:17:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 3927687168. Throughput: 0: 41058.7. Samples: 195294820. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:17:23,380][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 00:17:26,335][26599] Updated weights for policy 0, policy_version 239734 (0.0030) [2024-06-19 00:17:28,380][26367] Fps is (10 sec: 42613.0, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 3927900160. Throughput: 0: 41236.4. Samples: 195546640. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:17:28,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 00:17:29,318][26599] Updated weights for policy 0, policy_version 239744 (0.0033) [2024-06-19 00:17:33,380][26367] Fps is (10 sec: 39321.5, 60 sec: 40686.9, 300 sec: 41487.6). Total num frames: 3928080384. Throughput: 0: 41170.7. Samples: 195673000. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:17:33,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 00:17:34,379][26599] Updated weights for policy 0, policy_version 239754 (0.0031) [2024-06-19 00:17:36,904][26579] Signal inference workers to stop experience collection... 
(2800 times) [2024-06-19 00:17:36,943][26599] InferenceWorker_p0-w0: stopping experience collection (2800 times) [2024-06-19 00:17:36,957][26579] Signal inference workers to resume experience collection... (2800 times) [2024-06-19 00:17:36,961][26599] InferenceWorker_p0-w0: resuming experience collection (2800 times) [2024-06-19 00:17:37,095][26599] Updated weights for policy 0, policy_version 239764 (0.0035) [2024-06-19 00:17:38,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 41654.3). Total num frames: 3928309760. Throughput: 0: 41039.3. Samples: 195916580. Policy #0 lag: (min: 1.0, avg: 12.3, max: 22.0) [2024-06-19 00:17:38,380][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 00:17:38,556][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000239767_3928342528.pth... [2024-06-19 00:17:38,613][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000239157_3918348288.pth [2024-06-19 00:17:42,211][26599] Updated weights for policy 0, policy_version 239774 (0.0035) [2024-06-19 00:17:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41233.2, 300 sec: 41487.6). Total num frames: 3928506368. Throughput: 0: 41493.9. Samples: 196171880. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:17:43,380][26367] Avg episode reward: [(0, '0.323')] [2024-06-19 00:17:45,397][26599] Updated weights for policy 0, policy_version 239784 (0.0031) [2024-06-19 00:17:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 3928719360. Throughput: 0: 41325.9. Samples: 196294340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:17:48,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 00:17:50,012][26599] Updated weights for policy 0, policy_version 239794 (0.0042) [2024-06-19 00:17:53,246][26599] Updated weights for policy 0, policy_version 239804 (0.0032) [2024-06-19 00:17:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 41654.2). 
Total num frames: 3928948736. Throughput: 0: 41449.4. Samples: 196546460. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:17:53,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 00:17:57,877][26599] Updated weights for policy 0, policy_version 239814 (0.0036) [2024-06-19 00:17:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 40960.1, 300 sec: 41487.7). Total num frames: 3929128960. Throughput: 0: 41654.7. Samples: 196796320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:17:58,380][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 00:18:01,195][26599] Updated weights for policy 0, policy_version 239824 (0.0039) [2024-06-19 00:18:03,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3929341952. Throughput: 0: 41563.6. Samples: 196918280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:03,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 00:18:05,937][26599] Updated weights for policy 0, policy_version 239834 (0.0050) [2024-06-19 00:18:08,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41779.3, 300 sec: 41599.2). Total num frames: 3929571328. Throughput: 0: 41677.7. Samples: 197170320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:08,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 00:18:08,979][26599] Updated weights for policy 0, policy_version 239844 (0.0036) [2024-06-19 00:18:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41232.9, 300 sec: 41543.1). Total num frames: 3929751552. Throughput: 0: 41636.4. Samples: 197420280. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:13,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 00:18:13,930][26599] Updated weights for policy 0, policy_version 239854 (0.0029) [2024-06-19 00:18:16,727][26599] Updated weights for policy 0, policy_version 239864 (0.0031) [2024-06-19 00:18:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41508.6, 300 sec: 41598.7). 
Total num frames: 3929964544. Throughput: 0: 41461.4. Samples: 197538760. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:18,380][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 00:18:21,506][26599] Updated weights for policy 0, policy_version 239874 (0.0039) [2024-06-19 00:18:23,381][26367] Fps is (10 sec: 44237.1, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 3930193920. Throughput: 0: 41816.8. Samples: 197798340. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:23,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 00:18:24,575][26599] Updated weights for policy 0, policy_version 239884 (0.0029) [2024-06-19 00:18:28,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41233.2, 300 sec: 41487.6). Total num frames: 3930374144. Throughput: 0: 41812.4. Samples: 198053440. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:28,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 00:18:29,191][26599] Updated weights for policy 0, policy_version 239894 (0.0048) [2024-06-19 00:18:32,362][26599] Updated weights for policy 0, policy_version 239904 (0.0026) [2024-06-19 00:18:33,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 3930603520. Throughput: 0: 41805.8. Samples: 198175600. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:33,380][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 00:18:36,859][26599] Updated weights for policy 0, policy_version 239914 (0.0023) [2024-06-19 00:18:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41432.1). Total num frames: 3930783744. Throughput: 0: 41705.0. Samples: 198423180. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:38,380][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 00:18:40,338][26599] Updated weights for policy 0, policy_version 239924 (0.0032) [2024-06-19 00:18:43,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 41432.1). 
Total num frames: 3930996736. Throughput: 0: 41711.0. Samples: 198673320. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:43,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-19 00:18:44,531][26599] Updated weights for policy 0, policy_version 239934 (0.0030) [2024-06-19 00:18:48,148][26599] Updated weights for policy 0, policy_version 239944 (0.0039) [2024-06-19 00:18:48,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 3931242496. Throughput: 0: 41748.2. Samples: 198796940. Policy #0 lag: (min: 1.0, avg: 9.3, max: 22.0) [2024-06-19 00:18:48,380][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 00:18:50,134][26579] Signal inference workers to stop experience collection... (2850 times) [2024-06-19 00:18:50,139][26579] Signal inference workers to resume experience collection... (2850 times) [2024-06-19 00:18:50,160][26599] InferenceWorker_p0-w0: stopping experience collection (2850 times) [2024-06-19 00:18:50,161][26599] InferenceWorker_p0-w0: resuming experience collection (2850 times) [2024-06-19 00:18:52,394][26599] Updated weights for policy 0, policy_version 239954 (0.0034) [2024-06-19 00:18:53,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41233.2, 300 sec: 41543.2). Total num frames: 3931422720. Throughput: 0: 41743.7. Samples: 199048780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:18:53,380][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 00:18:56,218][26599] Updated weights for policy 0, policy_version 239964 (0.0028) [2024-06-19 00:18:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41487.6). Total num frames: 3931652096. Throughput: 0: 41594.4. Samples: 199292020. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:18:58,380][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 00:19:00,581][26599] Updated weights for policy 0, policy_version 239974 (0.0032) [2024-06-19 00:19:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 41654.3). Total num frames: 3931865088. Throughput: 0: 41743.2. Samples: 199417200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:03,380][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 00:19:03,873][26599] Updated weights for policy 0, policy_version 239984 (0.0035) [2024-06-19 00:19:08,326][26599] Updated weights for policy 0, policy_version 239994 (0.0026) [2024-06-19 00:19:08,380][26367] Fps is (10 sec: 40959.0, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 3932061696. Throughput: 0: 41664.4. Samples: 199673240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:08,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 00:19:11,876][26599] Updated weights for policy 0, policy_version 240004 (0.0041) [2024-06-19 00:19:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 41543.2). Total num frames: 3932274688. Throughput: 0: 41397.7. Samples: 199916340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:13,381][26367] Avg episode reward: [(0, '0.358')] [2024-06-19 00:19:16,380][26599] Updated weights for policy 0, policy_version 240014 (0.0049) [2024-06-19 00:19:18,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3932471296. Throughput: 0: 41500.8. Samples: 200043140. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:18,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 00:19:19,830][26599] Updated weights for policy 0, policy_version 240024 (0.0027) [2024-06-19 00:19:23,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41487.6). Total num frames: 3932667904. Throughput: 0: 41524.0. Samples: 200291760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:23,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 00:19:24,393][26599] Updated weights for policy 0, policy_version 240034 (0.0028) [2024-06-19 00:19:27,588][26599] Updated weights for policy 0, policy_version 240044 (0.0037) [2024-06-19 00:19:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 3932897280. Throughput: 0: 41463.1. Samples: 200539160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:28,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 00:19:32,153][26599] Updated weights for policy 0, policy_version 240054 (0.0039) [2024-06-19 00:19:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3933077504. Throughput: 0: 41634.6. Samples: 200670500. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:33,380][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 00:19:35,374][26599] Updated weights for policy 0, policy_version 240064 (0.0036) [2024-06-19 00:19:38,381][26367] Fps is (10 sec: 37681.7, 60 sec: 41505.8, 300 sec: 41321.0). Total num frames: 3933274112. Throughput: 0: 41430.2. Samples: 200913160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:38,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 00:19:38,403][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000240068_3933274112.pth... [2024-06-19 00:19:38,465][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000239464_3923378176.pth [2024-06-19 00:19:40,027][26599] Updated weights for policy 0, policy_version 240074 (0.0038) [2024-06-19 00:19:43,118][26599] Updated weights for policy 0, policy_version 240084 (0.0026) [2024-06-19 00:19:43,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 3933536256. Throughput: 0: 41503.4. Samples: 201159680. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:43,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 00:19:48,135][26599] Updated weights for policy 0, policy_version 240094 (0.0029) [2024-06-19 00:19:48,380][26367] Fps is (10 sec: 42600.0, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 3933700096. Throughput: 0: 41637.2. Samples: 201290880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:48,383][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 00:19:51,087][26599] Updated weights for policy 0, policy_version 240104 (0.0027) [2024-06-19 00:19:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 3933929472. Throughput: 0: 41298.3. Samples: 201531660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:53,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 00:19:56,006][26579] Signal inference workers to stop experience collection... (2900 times) [2024-06-19 00:19:56,055][26599] InferenceWorker_p0-w0: stopping experience collection (2900 times) [2024-06-19 00:19:56,065][26579] Signal inference workers to resume experience collection... (2900 times) [2024-06-19 00:19:56,072][26599] InferenceWorker_p0-w0: resuming experience collection (2900 times) [2024-06-19 00:19:56,080][26599] Updated weights for policy 0, policy_version 240114 (0.0048) [2024-06-19 00:19:58,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41233.0, 300 sec: 41654.3). Total num frames: 3934126080. Throughput: 0: 41597.4. Samples: 201788220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:19:58,381][26367] Avg episode reward: [(0, '0.195')] [2024-06-19 00:19:59,045][26599] Updated weights for policy 0, policy_version 240124 (0.0030) [2024-06-19 00:20:03,380][26367] Fps is (10 sec: 37683.1, 60 sec: 40686.8, 300 sec: 41432.6). Total num frames: 3934306304. Throughput: 0: 41579.9. Samples: 201914240. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:03,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 00:20:03,808][26599] Updated weights for policy 0, policy_version 240134 (0.0042) [2024-06-19 00:20:06,842][26599] Updated weights for policy 0, policy_version 240144 (0.0037) [2024-06-19 00:20:08,380][26367] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 3934568448. Throughput: 0: 41441.7. Samples: 202156640. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:08,390][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 00:20:11,818][26599] Updated weights for policy 0, policy_version 240154 (0.0027) [2024-06-19 00:20:13,380][26367] Fps is (10 sec: 45875.4, 60 sec: 41506.1, 300 sec: 41654.7). Total num frames: 3934765056. Throughput: 0: 41658.2. Samples: 202413780. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:13,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 00:20:14,813][26599] Updated weights for policy 0, policy_version 240164 (0.0039) [2024-06-19 00:20:18,384][26367] Fps is (10 sec: 37669.8, 60 sec: 41230.6, 300 sec: 41431.6). Total num frames: 3934945280. Throughput: 0: 41389.9. Samples: 202533200. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:18,385][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 00:20:19,579][26599] Updated weights for policy 0, policy_version 240174 (0.0039) [2024-06-19 00:20:22,613][26599] Updated weights for policy 0, policy_version 240184 (0.0041) [2024-06-19 00:20:23,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 3935207424. Throughput: 0: 41766.2. Samples: 202792620. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:23,380][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 00:20:27,374][26599] Updated weights for policy 0, policy_version 240194 (0.0047) [2024-06-19 00:20:28,380][26367] Fps is (10 sec: 44252.8, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3935387648. Throughput: 0: 41881.8. Samples: 203044360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:28,381][26367] Avg episode reward: [(0, '0.399')] [2024-06-19 00:20:30,393][26599] Updated weights for policy 0, policy_version 240204 (0.0043) [2024-06-19 00:20:33,380][26367] Fps is (10 sec: 39320.7, 60 sec: 42052.1, 300 sec: 41487.6). Total num frames: 3935600640. Throughput: 0: 41607.0. Samples: 203163200. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:33,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 00:20:35,052][26599] Updated weights for policy 0, policy_version 240214 (0.0037) [2024-06-19 00:20:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.7, 300 sec: 41709.8). Total num frames: 3935813632. Throughput: 0: 41956.5. Samples: 203419700. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:38,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 00:20:38,405][26599] Updated weights for policy 0, policy_version 240224 (0.0048) [2024-06-19 00:20:43,015][26599] Updated weights for policy 0, policy_version 240234 (0.0036) [2024-06-19 00:20:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3936010240. Throughput: 0: 41758.6. Samples: 203667360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:43,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 00:20:46,394][26599] Updated weights for policy 0, policy_version 240244 (0.0035) [2024-06-19 00:20:48,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 41598.7). Total num frames: 3936239616. Throughput: 0: 41584.4. Samples: 203785540. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:48,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 00:20:50,986][26599] Updated weights for policy 0, policy_version 240254 (0.0034) [2024-06-19 00:20:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3936419840. Throughput: 0: 41789.9. Samples: 204037180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:53,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 00:20:54,791][26599] Updated weights for policy 0, policy_version 240264 (0.0028) [2024-06-19 00:20:58,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41779.2, 300 sec: 41543.7). Total num frames: 3936632832. Throughput: 0: 41668.5. Samples: 204288860. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:20:58,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 00:20:58,619][26599] Updated weights for policy 0, policy_version 240274 (0.0043) [2024-06-19 00:21:02,552][26599] Updated weights for policy 0, policy_version 240284 (0.0028) [2024-06-19 00:21:02,565][26579] Signal inference workers to stop experience collection... (2950 times) [2024-06-19 00:21:02,565][26579] Signal inference workers to resume experience collection... (2950 times) [2024-06-19 00:21:02,610][26599] InferenceWorker_p0-w0: stopping experience collection (2950 times) [2024-06-19 00:21:02,610][26599] InferenceWorker_p0-w0: resuming experience collection (2950 times) [2024-06-19 00:21:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 41487.6). Total num frames: 3936829440. Throughput: 0: 41730.1. Samples: 204410900. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:21:03,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 00:21:06,350][26599] Updated weights for policy 0, policy_version 240294 (0.0033) [2024-06-19 00:21:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 40960.0, 300 sec: 41543.1). Total num frames: 3937026048. Throughput: 0: 41325.7. 
Samples: 204652280. Policy #0 lag: (min: 0.0, avg: 12.7, max: 27.0) [2024-06-19 00:21:08,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 00:21:10,447][26599] Updated weights for policy 0, policy_version 240304 (0.0037) [2024-06-19 00:21:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 41432.1). Total num frames: 3937239040. Throughput: 0: 41251.1. Samples: 204900660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:13,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 00:21:14,321][26599] Updated weights for policy 0, policy_version 240314 (0.0042) [2024-06-19 00:21:18,384][26367] Fps is (10 sec: 42583.1, 60 sec: 41779.2, 300 sec: 41598.2). Total num frames: 3937452032. Throughput: 0: 41368.8. Samples: 205024940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:18,384][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 00:21:18,467][26599] Updated weights for policy 0, policy_version 240324 (0.0032) [2024-06-19 00:21:22,034][26599] Updated weights for policy 0, policy_version 240334 (0.0035) [2024-06-19 00:21:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 40959.9, 300 sec: 41598.7). Total num frames: 3937665024. Throughput: 0: 41124.3. Samples: 205270300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:23,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 00:21:26,592][26599] Updated weights for policy 0, policy_version 240344 (0.0035) [2024-06-19 00:21:28,380][26367] Fps is (10 sec: 42613.5, 60 sec: 41506.1, 300 sec: 41487.6). Total num frames: 3937878016. Throughput: 0: 41084.9. Samples: 205516180. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:28,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 00:21:30,151][26599] Updated weights for policy 0, policy_version 240354 (0.0024) [2024-06-19 00:21:33,380][26367] Fps is (10 sec: 37683.5, 60 sec: 40687.0, 300 sec: 41487.6). Total num frames: 3938041856. Throughput: 0: 41343.2. 
Samples: 205645980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:33,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 00:21:34,567][26599] Updated weights for policy 0, policy_version 240364 (0.0035) [2024-06-19 00:21:37,956][26599] Updated weights for policy 0, policy_version 240374 (0.0038) [2024-06-19 00:21:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41232.9, 300 sec: 41543.1). Total num frames: 3938287616. Throughput: 0: 41249.2. Samples: 205893400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:38,381][26367] Avg episode reward: [(0, '0.287')] [2024-06-19 00:21:38,510][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000240375_3938304000.pth... [2024-06-19 00:21:38,575][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000239767_3928342528.pth [2024-06-19 00:21:42,486][26599] Updated weights for policy 0, policy_version 240384 (0.0048) [2024-06-19 00:21:43,384][26367] Fps is (10 sec: 45858.9, 60 sec: 41503.7, 300 sec: 41542.6). Total num frames: 3938500608. Throughput: 0: 41139.8. Samples: 206140300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:43,384][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 00:21:45,672][26599] Updated weights for policy 0, policy_version 240394 (0.0033) [2024-06-19 00:21:48,380][26367] Fps is (10 sec: 37683.2, 60 sec: 40413.9, 300 sec: 41487.6). Total num frames: 3938664448. Throughput: 0: 41118.9. Samples: 206261260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:48,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 00:21:50,180][26599] Updated weights for policy 0, policy_version 240404 (0.0033) [2024-06-19 00:21:53,380][26367] Fps is (10 sec: 42614.1, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 3938926592. Throughput: 0: 41295.7. Samples: 206510580. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:53,380][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 00:21:53,412][26599] Updated weights for policy 0, policy_version 240414 (0.0036) [2024-06-19 00:21:58,220][26599] Updated weights for policy 0, policy_version 240424 (0.0036) [2024-06-19 00:21:58,380][26367] Fps is (10 sec: 44237.5, 60 sec: 41233.1, 300 sec: 41543.2). Total num frames: 3939106816. Throughput: 0: 41515.2. Samples: 206768840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:21:58,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 00:22:01,104][26599] Updated weights for policy 0, policy_version 240434 (0.0036) [2024-06-19 00:22:03,380][26367] Fps is (10 sec: 37682.9, 60 sec: 41233.0, 300 sec: 41487.6). Total num frames: 3939303424. Throughput: 0: 41291.8. Samples: 206882920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:22:03,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 00:22:06,255][26599] Updated weights for policy 0, policy_version 240444 (0.0046) [2024-06-19 00:22:08,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 3939549184. Throughput: 0: 41548.5. Samples: 207139980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:22:08,381][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 00:22:09,306][26599] Updated weights for policy 0, policy_version 240454 (0.0027) [2024-06-19 00:22:13,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41230.6, 300 sec: 41487.6). Total num frames: 3939713024. Throughput: 0: 41671.0. Samples: 207391520. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:22:13,384][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 00:22:14,229][26599] Updated weights for policy 0, policy_version 240464 (0.0032) [2024-06-19 00:22:17,057][26599] Updated weights for policy 0, policy_version 240474 (0.0036) [2024-06-19 00:22:17,946][26579] Signal inference workers to stop experience collection... (3000 times) [2024-06-19 00:22:17,987][26599] InferenceWorker_p0-w0: stopping experience collection (3000 times) [2024-06-19 00:22:17,995][26579] Signal inference workers to resume experience collection... (3000 times) [2024-06-19 00:22:18,011][26599] InferenceWorker_p0-w0: resuming experience collection (3000 times) [2024-06-19 00:22:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41781.8, 300 sec: 41598.7). Total num frames: 3939958784. Throughput: 0: 41334.8. Samples: 207506040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 00:22:18,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 00:22:22,151][26599] Updated weights for policy 0, policy_version 240484 (0.0030) [2024-06-19 00:22:23,380][26367] Fps is (10 sec: 44253.0, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 3940155392. Throughput: 0: 41652.6. Samples: 207767760. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:22:23,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 00:22:24,970][26599] Updated weights for policy 0, policy_version 240494 (0.0029) [2024-06-19 00:22:28,380][26367] Fps is (10 sec: 37682.5, 60 sec: 40960.0, 300 sec: 41543.1). Total num frames: 3940335616. Throughput: 0: 41781.9. Samples: 208020340. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:22:28,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 00:22:30,281][26599] Updated weights for policy 0, policy_version 240504 (0.0034) [2024-06-19 00:22:32,588][26599] Updated weights for policy 0, policy_version 240514 (0.0039) [2024-06-19 00:22:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 41654.2). Total num frames: 3940597760. Throughput: 0: 41730.8. Samples: 208139140. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:22:33,381][26367] Avg episode reward: [(0, '0.242')] [2024-06-19 00:22:37,975][26599] Updated weights for policy 0, policy_version 240524 (0.0043) [2024-06-19 00:22:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 41543.1). Total num frames: 3940761600. Throughput: 0: 41890.1. Samples: 208395640. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:22:38,381][26367] Avg episode reward: [(0, '0.378')] [2024-06-19 00:22:40,623][26599] Updated weights for policy 0, policy_version 240534 (0.0042) [2024-06-19 00:22:43,380][26367] Fps is (10 sec: 37682.9, 60 sec: 41235.5, 300 sec: 41543.1). Total num frames: 3940974592. Throughput: 0: 41575.0. Samples: 208639720. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:22:43,381][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 00:22:46,014][26599] Updated weights for policy 0, policy_version 240544 (0.0047) [2024-06-19 00:22:48,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 41598.7). Total num frames: 3941220352. Throughput: 0: 41895.9. Samples: 208768240. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:22:48,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 00:22:48,803][26599] Updated weights for policy 0, policy_version 240554 (0.0036) [2024-06-19 00:22:53,384][26367] Fps is (10 sec: 39307.8, 60 sec: 40684.4, 300 sec: 41487.1). Total num frames: 3941367808. Throughput: 0: 41770.5. Samples: 209019800. 
Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:22:53,384][26367] Avg episode reward: [(0, '0.811')] [2024-06-19 00:22:53,854][26599] Updated weights for policy 0, policy_version 240564 (0.0032) [2024-06-19 00:22:56,614][26599] Updated weights for policy 0, policy_version 240574 (0.0030) [2024-06-19 00:22:58,380][26367] Fps is (10 sec: 37683.8, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 3941597184. Throughput: 0: 41569.2. Samples: 209261980. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:22:58,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 00:23:01,715][26599] Updated weights for policy 0, policy_version 240584 (0.0037) [2024-06-19 00:23:03,380][26367] Fps is (10 sec: 45891.4, 60 sec: 42052.2, 300 sec: 41543.2). Total num frames: 3941826560. Throughput: 0: 41839.5. Samples: 209388820. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:23:03,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 00:23:04,472][26599] Updated weights for policy 0, policy_version 240594 (0.0036) [2024-06-19 00:23:08,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 3942023168. Throughput: 0: 41381.2. Samples: 209629920. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:23:08,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 00:23:09,563][26599] Updated weights for policy 0, policy_version 240604 (0.0029) [2024-06-19 00:23:12,677][26599] Updated weights for policy 0, policy_version 240614 (0.0033) [2024-06-19 00:23:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42054.8, 300 sec: 41598.7). Total num frames: 3942236160. Throughput: 0: 41355.7. Samples: 209881340. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:23:13,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 00:23:17,159][26599] Updated weights for policy 0, policy_version 240624 (0.0044) [2024-06-19 00:23:18,380][26367] Fps is (10 sec: 39322.2, 60 sec: 40960.0, 300 sec: 41432.1). 
Total num frames: 3942416384. Throughput: 0: 41607.6. Samples: 210011480. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:23:18,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 00:23:20,625][26599] Updated weights for policy 0, policy_version 240634 (0.0033) [2024-06-19 00:23:23,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 41543.2). Total num frames: 3942629376. Throughput: 0: 41348.0. Samples: 210256300. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:23:23,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 00:23:24,799][26599] Updated weights for policy 0, policy_version 240644 (0.0028) [2024-06-19 00:23:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 41543.1). Total num frames: 3942858752. Throughput: 0: 41432.5. Samples: 210504180. Policy #0 lag: (min: 0.0, avg: 8.4, max: 21.0) [2024-06-19 00:23:28,381][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 00:23:28,643][26599] Updated weights for policy 0, policy_version 240654 (0.0035) [2024-06-19 00:23:32,683][26599] Updated weights for policy 0, policy_version 240664 (0.0037) [2024-06-19 00:23:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 40959.9, 300 sec: 41598.7). Total num frames: 3943055360. Throughput: 0: 41428.8. Samples: 210632540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:23:33,389][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 00:23:34,134][26579] Signal inference workers to stop experience collection... (3050 times) [2024-06-19 00:23:34,134][26579] Signal inference workers to resume experience collection... 
(3050 times) [2024-06-19 00:23:34,147][26599] InferenceWorker_p0-w0: stopping experience collection (3050 times) [2024-06-19 00:23:34,161][26599] InferenceWorker_p0-w0: resuming experience collection (3050 times) [2024-06-19 00:23:36,172][26599] Updated weights for policy 0, policy_version 240674 (0.0031) [2024-06-19 00:23:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 3943284736. Throughput: 0: 41531.8. Samples: 210888580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:23:38,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 00:23:38,436][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000240679_3943284736.pth... [2024-06-19 00:23:38,488][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000240068_3933274112.pth [2024-06-19 00:23:40,313][26599] Updated weights for policy 0, policy_version 240684 (0.0038) [2024-06-19 00:23:43,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41543.1). Total num frames: 3943497728. Throughput: 0: 41687.4. Samples: 211137920. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:23:43,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 00:23:43,800][26599] Updated weights for policy 0, policy_version 240694 (0.0030) [2024-06-19 00:23:48,134][26599] Updated weights for policy 0, policy_version 240704 (0.0030) [2024-06-19 00:23:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 3943694336. Throughput: 0: 41579.1. Samples: 211259880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:23:48,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 00:23:52,231][26599] Updated weights for policy 0, policy_version 240714 (0.0042) [2024-06-19 00:23:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42054.8, 300 sec: 41487.6). Total num frames: 3943890944. Throughput: 0: 41955.7. Samples: 211517920. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:23:53,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 00:23:55,855][26599] Updated weights for policy 0, policy_version 240724 (0.0041) [2024-06-19 00:23:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41543.1). Total num frames: 3944120320. Throughput: 0: 41942.6. Samples: 211768760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:23:58,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 00:24:00,520][26599] Updated weights for policy 0, policy_version 240734 (0.0032) [2024-06-19 00:24:03,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 3944316928. Throughput: 0: 41748.0. Samples: 211890140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:24:03,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 00:24:03,680][26599] Updated weights for policy 0, policy_version 240744 (0.0038) [2024-06-19 00:24:08,384][26367] Fps is (10 sec: 37669.5, 60 sec: 41230.6, 300 sec: 41431.6). Total num frames: 3944497152. Throughput: 0: 41778.8. Samples: 212136500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:24:08,385][26367] Avg episode reward: [(0, '0.299')] [2024-06-19 00:24:08,566][26599] Updated weights for policy 0, policy_version 240754 (0.0025) [2024-06-19 00:24:11,728][26599] Updated weights for policy 0, policy_version 240764 (0.0037) [2024-06-19 00:24:13,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 41598.7). Total num frames: 3944742912. Throughput: 0: 41789.3. Samples: 212384700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:24:13,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 00:24:16,179][26599] Updated weights for policy 0, policy_version 240774 (0.0050) [2024-06-19 00:24:18,383][26367] Fps is (10 sec: 44240.8, 60 sec: 42050.3, 300 sec: 41598.3). Total num frames: 3944939520. Throughput: 0: 41927.8. Samples: 212519400. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:24:18,384][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 00:24:19,326][26599] Updated weights for policy 0, policy_version 240784 (0.0037) [2024-06-19 00:24:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41543.2). Total num frames: 3945152512. Throughput: 0: 41607.1. Samples: 212760900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:24:23,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 00:24:23,802][26599] Updated weights for policy 0, policy_version 240794 (0.0033) [2024-06-19 00:24:27,492][26599] Updated weights for policy 0, policy_version 240804 (0.0049) [2024-06-19 00:24:28,380][26367] Fps is (10 sec: 42610.5, 60 sec: 41779.3, 300 sec: 41654.2). Total num frames: 3945365504. Throughput: 0: 41707.7. Samples: 213014760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:24:28,380][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 00:24:31,970][26599] Updated weights for policy 0, policy_version 240814 (0.0033) [2024-06-19 00:24:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 3945562112. Throughput: 0: 41848.1. Samples: 213143040. Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:24:33,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 00:24:35,142][26599] Updated weights for policy 0, policy_version 240824 (0.0028) [2024-06-19 00:24:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41487.6). Total num frames: 3945775104. Throughput: 0: 41603.1. Samples: 213390060. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 24.0) [2024-06-19 00:24:38,380][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 00:24:39,671][26599] Updated weights for policy 0, policy_version 240834 (0.0042) [2024-06-19 00:24:43,080][26599] Updated weights for policy 0, policy_version 240844 (0.0034) [2024-06-19 00:24:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3945988096. Throughput: 0: 41610.7. Samples: 213641240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:24:43,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 00:24:47,282][26599] Updated weights for policy 0, policy_version 240854 (0.0029) [2024-06-19 00:24:48,384][26367] Fps is (10 sec: 42582.9, 60 sec: 41776.7, 300 sec: 41598.2). Total num frames: 3946201088. Throughput: 0: 41791.7. Samples: 213770920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:24:48,384][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 00:24:50,737][26599] Updated weights for policy 0, policy_version 240864 (0.0038) [2024-06-19 00:24:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41654.2). Total num frames: 3946414080. Throughput: 0: 41994.9. Samples: 214026120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:24:53,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 00:24:54,877][26599] Updated weights for policy 0, policy_version 240874 (0.0047) [2024-06-19 00:24:58,384][26367] Fps is (10 sec: 42598.2, 60 sec: 41776.7, 300 sec: 41764.8). Total num frames: 3946627072. Throughput: 0: 41959.8. Samples: 214273040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:24:58,385][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 00:24:58,463][26599] Updated weights for policy 0, policy_version 240884 (0.0030) [2024-06-19 00:25:02,541][26599] Updated weights for policy 0, policy_version 240894 (0.0037) [2024-06-19 00:25:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41543.2). 
Total num frames: 3946823680. Throughput: 0: 41852.3. Samples: 214402640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:03,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 00:25:04,308][26579] Signal inference workers to stop experience collection... (3100 times) [2024-06-19 00:25:04,352][26599] InferenceWorker_p0-w0: stopping experience collection (3100 times) [2024-06-19 00:25:04,380][26579] Signal inference workers to resume experience collection... (3100 times) [2024-06-19 00:25:04,380][26599] InferenceWorker_p0-w0: resuming experience collection (3100 times) [2024-06-19 00:25:06,091][26599] Updated weights for policy 0, policy_version 240904 (0.0035) [2024-06-19 00:25:08,380][26367] Fps is (10 sec: 40975.3, 60 sec: 42328.0, 300 sec: 41598.7). Total num frames: 3947036672. Throughput: 0: 42069.9. Samples: 214654040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:08,380][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 00:25:10,678][26599] Updated weights for policy 0, policy_version 240914 (0.0035) [2024-06-19 00:25:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 41710.3). Total num frames: 3947249664. Throughput: 0: 42019.6. Samples: 214905640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:13,380][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 00:25:13,846][26599] Updated weights for policy 0, policy_version 240924 (0.0040) [2024-06-19 00:25:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41781.2, 300 sec: 41487.6). Total num frames: 3947446272. Throughput: 0: 42036.9. Samples: 215034700. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:18,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 00:25:18,454][26599] Updated weights for policy 0, policy_version 240934 (0.0030) [2024-06-19 00:25:21,783][26599] Updated weights for policy 0, policy_version 240944 (0.0037) [2024-06-19 00:25:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 3947659264. Throughput: 0: 42066.2. Samples: 215283040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:23,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 00:25:26,042][26599] Updated weights for policy 0, policy_version 240954 (0.0023) [2024-06-19 00:25:28,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41654.3). Total num frames: 3947888640. Throughput: 0: 42054.2. Samples: 215533680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:28,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 00:25:29,488][26599] Updated weights for policy 0, policy_version 240964 (0.0030) [2024-06-19 00:25:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 3948085248. Throughput: 0: 42116.3. Samples: 215666000. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:33,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 00:25:33,724][26599] Updated weights for policy 0, policy_version 240974 (0.0033) [2024-06-19 00:25:37,293][26599] Updated weights for policy 0, policy_version 240984 (0.0037) [2024-06-19 00:25:38,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 41654.2). Total num frames: 3948298240. Throughput: 0: 41999.0. Samples: 215916080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:38,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 00:25:38,403][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000240985_3948298240.pth... 
[2024-06-19 00:25:38,458][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000240375_3938304000.pth [2024-06-19 00:25:41,476][26599] Updated weights for policy 0, policy_version 240994 (0.0029) [2024-06-19 00:25:43,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 3948527616. Throughput: 0: 42117.1. Samples: 216168160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:43,381][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 00:25:45,154][26599] Updated weights for policy 0, policy_version 241004 (0.0032) [2024-06-19 00:25:48,384][26367] Fps is (10 sec: 40945.8, 60 sec: 41779.2, 300 sec: 41653.7). Total num frames: 3948707840. Throughput: 0: 42139.8. Samples: 216299080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 00:25:48,384][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 00:25:49,208][26599] Updated weights for policy 0, policy_version 241014 (0.0028) [2024-06-19 00:25:53,041][26599] Updated weights for policy 0, policy_version 241024 (0.0036) [2024-06-19 00:25:53,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3948937216. Throughput: 0: 42016.8. Samples: 216544800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:25:53,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 00:25:57,558][26599] Updated weights for policy 0, policy_version 241034 (0.0033) [2024-06-19 00:25:58,380][26367] Fps is (10 sec: 45891.7, 60 sec: 42327.9, 300 sec: 41820.8). Total num frames: 3949166592. Throughput: 0: 42077.7. Samples: 216799140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:25:58,381][26367] Avg episode reward: [(0, '0.813')] [2024-06-19 00:26:01,520][26599] Updated weights for policy 0, policy_version 241044 (0.0034) [2024-06-19 00:26:03,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3949346816. Throughput: 0: 41942.0. Samples: 216922100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:03,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 00:26:05,347][26599] Updated weights for policy 0, policy_version 241054 (0.0044) [2024-06-19 00:26:08,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 41820.8). Total num frames: 3949576192. Throughput: 0: 42014.1. Samples: 217173680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:08,381][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 00:26:09,016][26599] Updated weights for policy 0, policy_version 241064 (0.0032) [2024-06-19 00:26:13,000][26599] Updated weights for policy 0, policy_version 241074 (0.0035) [2024-06-19 00:26:13,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.2, 300 sec: 41765.8). Total num frames: 3949772800. Throughput: 0: 42141.9. Samples: 217430060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:13,380][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 00:26:14,973][26579] Signal inference workers to stop experience collection... (3150 times) [2024-06-19 00:26:15,030][26599] InferenceWorker_p0-w0: stopping experience collection (3150 times) [2024-06-19 00:26:15,036][26579] Signal inference workers to resume experience collection... (3150 times) [2024-06-19 00:26:15,040][26599] InferenceWorker_p0-w0: resuming experience collection (3150 times) [2024-06-19 00:26:16,639][26599] Updated weights for policy 0, policy_version 241084 (0.0029) [2024-06-19 00:26:18,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 3949969408. Throughput: 0: 41831.1. Samples: 217548400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:18,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 00:26:20,782][26599] Updated weights for policy 0, policy_version 241094 (0.0040) [2024-06-19 00:26:23,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 41820.9). Total num frames: 3950215168. Throughput: 0: 42015.6. 
Samples: 217806780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:23,381][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 00:26:24,277][26599] Updated weights for policy 0, policy_version 241104 (0.0032) [2024-06-19 00:26:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3950395392. Throughput: 0: 42088.5. Samples: 218062140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:28,381][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 00:26:28,585][26599] Updated weights for policy 0, policy_version 241114 (0.0038) [2024-06-19 00:26:31,916][26599] Updated weights for policy 0, policy_version 241124 (0.0033) [2024-06-19 00:26:33,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3950592000. Throughput: 0: 41796.7. Samples: 218179780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:33,381][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 00:26:36,482][26599] Updated weights for policy 0, policy_version 241134 (0.0031) [2024-06-19 00:26:38,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 41876.9). Total num frames: 3950854144. Throughput: 0: 41934.3. Samples: 218431840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:38,381][26367] Avg episode reward: [(0, '0.355')] [2024-06-19 00:26:40,224][26599] Updated weights for policy 0, policy_version 241144 (0.0037) [2024-06-19 00:26:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 3951017984. Throughput: 0: 41816.5. Samples: 218680880. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:43,380][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 00:26:44,467][26599] Updated weights for policy 0, policy_version 241154 (0.0040) [2024-06-19 00:26:47,941][26599] Updated weights for policy 0, policy_version 241164 (0.0044) [2024-06-19 00:26:48,380][26367] Fps is (10 sec: 37682.8, 60 sec: 42054.7, 300 sec: 41709.8). Total num frames: 3951230976. Throughput: 0: 41694.2. Samples: 218798340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:48,381][26367] Avg episode reward: [(0, '0.732')] [2024-06-19 00:26:52,328][26599] Updated weights for policy 0, policy_version 241174 (0.0054) [2024-06-19 00:26:53,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3951460352. Throughput: 0: 41970.2. Samples: 219062340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:53,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 00:26:55,656][26599] Updated weights for policy 0, policy_version 241184 (0.0038) [2024-06-19 00:26:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3951656960. Throughput: 0: 41846.5. Samples: 219313160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 00:26:58,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 00:27:00,075][26599] Updated weights for policy 0, policy_version 241194 (0.0036) [2024-06-19 00:27:03,272][26599] Updated weights for policy 0, policy_version 241204 (0.0033) [2024-06-19 00:27:03,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 3951886336. Throughput: 0: 41899.6. Samples: 219433880. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:03,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 00:27:07,918][26599] Updated weights for policy 0, policy_version 241214 (0.0031) [2024-06-19 00:27:08,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41506.3, 300 sec: 41876.9). Total num frames: 3952066560. Throughput: 0: 41837.0. Samples: 219689440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:08,380][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 00:27:10,882][26599] Updated weights for policy 0, policy_version 241224 (0.0042) [2024-06-19 00:27:13,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3952263168. Throughput: 0: 41710.3. Samples: 219939100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:13,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 00:27:15,759][26599] Updated weights for policy 0, policy_version 241234 (0.0027) [2024-06-19 00:27:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3952492544. Throughput: 0: 41884.5. Samples: 220064580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:18,380][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 00:27:19,395][26599] Updated weights for policy 0, policy_version 241244 (0.0027) [2024-06-19 00:27:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 3952689152. Throughput: 0: 41888.5. Samples: 220316820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:23,380][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 00:27:23,415][26599] Updated weights for policy 0, policy_version 241254 (0.0034) [2024-06-19 00:27:27,122][26599] Updated weights for policy 0, policy_version 241264 (0.0032) [2024-06-19 00:27:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3952902144. Throughput: 0: 41959.1. Samples: 220569040. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:28,382][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 00:27:31,255][26599] Updated weights for policy 0, policy_version 241274 (0.0032) [2024-06-19 00:27:33,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42322.8, 300 sec: 41931.4). Total num frames: 3953131520. Throughput: 0: 42153.1. Samples: 220695380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:33,384][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 00:27:34,910][26599] Updated weights for policy 0, policy_version 241284 (0.0025) [2024-06-19 00:27:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 3953328128. Throughput: 0: 41821.0. Samples: 220944280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:38,381][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 00:27:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000241292_3953328128.pth... [2024-06-19 00:27:38,456][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000240679_3943284736.pth [2024-06-19 00:27:38,923][26599] Updated weights for policy 0, policy_version 241294 (0.0049) [2024-06-19 00:27:40,284][26579] Signal inference workers to stop experience collection... (3200 times) [2024-06-19 00:27:40,330][26599] InferenceWorker_p0-w0: stopping experience collection (3200 times) [2024-06-19 00:27:40,342][26579] Signal inference workers to resume experience collection... (3200 times) [2024-06-19 00:27:40,349][26599] InferenceWorker_p0-w0: resuming experience collection (3200 times) [2024-06-19 00:27:42,738][26599] Updated weights for policy 0, policy_version 241304 (0.0034) [2024-06-19 00:27:43,380][26367] Fps is (10 sec: 40974.5, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3953541120. Throughput: 0: 41668.5. Samples: 221188240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:43,381][26367] Avg episode reward: [(0, '0.333')] [2024-06-19 00:27:46,751][26599] Updated weights for policy 0, policy_version 241314 (0.0032) [2024-06-19 00:27:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42043.5). Total num frames: 3953770496. Throughput: 0: 41922.6. Samples: 221320400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:48,381][26367] Avg episode reward: [(0, '0.346')] [2024-06-19 00:27:50,484][26599] Updated weights for policy 0, policy_version 241324 (0.0038) [2024-06-19 00:27:53,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 3953950720. Throughput: 0: 41904.9. Samples: 221575160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:53,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 00:27:54,585][26599] Updated weights for policy 0, policy_version 241334 (0.0027) [2024-06-19 00:27:58,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 3954163712. Throughput: 0: 41841.3. Samples: 221821960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:27:58,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 00:27:58,487][26599] Updated weights for policy 0, policy_version 241344 (0.0035) [2024-06-19 00:28:02,265][26599] Updated weights for policy 0, policy_version 241354 (0.0039) [2024-06-19 00:28:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 3954376704. Throughput: 0: 41809.2. Samples: 221946000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:28:03,383][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 00:28:06,546][26599] Updated weights for policy 0, policy_version 241364 (0.0040) [2024-06-19 00:28:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 3954573312. Throughput: 0: 41820.7. Samples: 222198760. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 00:28:08,382][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 00:28:09,973][26599] Updated weights for policy 0, policy_version 241374 (0.0034) [2024-06-19 00:28:13,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 3954769920. Throughput: 0: 41680.8. Samples: 222444680. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:13,381][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 00:28:14,818][26599] Updated weights for policy 0, policy_version 241384 (0.0038) [2024-06-19 00:28:17,715][26599] Updated weights for policy 0, policy_version 241394 (0.0037) [2024-06-19 00:28:18,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 3955015680. Throughput: 0: 41701.2. Samples: 222571780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:18,380][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 00:28:22,692][26599] Updated weights for policy 0, policy_version 241404 (0.0032) [2024-06-19 00:28:23,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3955212288. Throughput: 0: 41791.1. Samples: 222824880. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:23,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 00:28:25,736][26599] Updated weights for policy 0, policy_version 241414 (0.0035) [2024-06-19 00:28:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 3955425280. Throughput: 0: 41721.9. Samples: 223065720. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:28,380][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 00:28:30,630][26599] Updated weights for policy 0, policy_version 241424 (0.0038) [2024-06-19 00:28:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41781.7, 300 sec: 41876.4). Total num frames: 3955638272. Throughput: 0: 41733.3. Samples: 223198400. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:33,388][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 00:28:33,516][26599] Updated weights for policy 0, policy_version 241434 (0.0023) [2024-06-19 00:28:38,297][26599] Updated weights for policy 0, policy_version 241444 (0.0038) [2024-06-19 00:28:38,380][26367] Fps is (10 sec: 39320.9, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3955818496. Throughput: 0: 41628.8. Samples: 223448460. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:38,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 00:28:41,136][26599] Updated weights for policy 0, policy_version 241454 (0.0044) [2024-06-19 00:28:43,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41506.3, 300 sec: 41820.9). Total num frames: 3956031488. Throughput: 0: 41697.9. Samples: 223698360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:43,380][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 00:28:45,974][26599] Updated weights for policy 0, policy_version 241464 (0.0033) [2024-06-19 00:28:48,380][26367] Fps is (10 sec: 44237.5, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 3956260864. Throughput: 0: 41790.8. Samples: 223826580. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:48,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 00:28:49,002][26599] Updated weights for policy 0, policy_version 241474 (0.0038) [2024-06-19 00:28:53,380][26367] Fps is (10 sec: 40959.2, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3956441088. Throughput: 0: 41709.3. Samples: 224075680. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:53,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 00:28:53,731][26599] Updated weights for policy 0, policy_version 241484 (0.0033) [2024-06-19 00:28:57,006][26599] Updated weights for policy 0, policy_version 241494 (0.0058) [2024-06-19 00:28:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 3956670464. Throughput: 0: 41750.4. Samples: 224323440. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:28:58,380][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 00:29:01,754][26599] Updated weights for policy 0, policy_version 241504 (0.0038) [2024-06-19 00:29:03,380][26367] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41932.5). Total num frames: 3956867072. Throughput: 0: 41841.8. Samples: 224454660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:29:03,380][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 00:29:04,750][26599] Updated weights for policy 0, policy_version 241514 (0.0045) [2024-06-19 00:29:08,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3957080064. Throughput: 0: 41567.9. Samples: 224695440. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:29:08,381][26367] Avg episode reward: [(0, '0.393')] [2024-06-19 00:29:10,195][26599] Updated weights for policy 0, policy_version 241524 (0.0029) [2024-06-19 00:29:12,644][26599] Updated weights for policy 0, policy_version 241534 (0.0035) [2024-06-19 00:29:13,380][26367] Fps is (10 sec: 44235.7, 60 sec: 42325.3, 300 sec: 41932.3). Total num frames: 3957309440. Throughput: 0: 41578.4. Samples: 224936760. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:29:13,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 00:29:17,977][26599] Updated weights for policy 0, policy_version 241544 (0.0040) [2024-06-19 00:29:18,380][26367] Fps is (10 sec: 39322.1, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 3957473280. Throughput: 0: 41567.2. Samples: 225068920. Policy #0 lag: (min: 1.0, avg: 11.6, max: 25.0) [2024-06-19 00:29:18,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 00:29:20,383][26599] Updated weights for policy 0, policy_version 241554 (0.0031) [2024-06-19 00:29:23,380][26367] Fps is (10 sec: 39322.6, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 3957702656. Throughput: 0: 41457.5. Samples: 225314040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:29:23,381][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 00:29:25,524][26579] Signal inference workers to stop experience collection... (3250 times) [2024-06-19 00:29:25,551][26599] InferenceWorker_p0-w0: stopping experience collection (3250 times) [2024-06-19 00:29:25,580][26579] Signal inference workers to resume experience collection... (3250 times) [2024-06-19 00:29:25,584][26599] InferenceWorker_p0-w0: resuming experience collection (3250 times) [2024-06-19 00:29:25,720][26599] Updated weights for policy 0, policy_version 241564 (0.0030) [2024-06-19 00:29:28,380][26367] Fps is (10 sec: 45874.7, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3957932032. Throughput: 0: 41382.0. Samples: 225560560. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:29:28,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 00:29:28,603][26599] Updated weights for policy 0, policy_version 241574 (0.0023) [2024-06-19 00:29:33,380][26367] Fps is (10 sec: 39321.3, 60 sec: 40960.1, 300 sec: 41765.3). Total num frames: 3958095872. Throughput: 0: 41520.8. Samples: 225695020. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:29:33,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 00:29:33,482][26599] Updated weights for policy 0, policy_version 241584 (0.0031) [2024-06-19 00:29:37,027][26599] Updated weights for policy 0, policy_version 241594 (0.0033) [2024-06-19 00:29:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3958341632. Throughput: 0: 41508.1. Samples: 225943540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:29:38,381][26367] Avg episode reward: [(0, '0.433')] [2024-06-19 00:29:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000241598_3958341632.pth... [2024-06-19 00:29:38,441][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000240985_3948298240.pth [2024-06-19 00:29:41,248][26599] Updated weights for policy 0, policy_version 241604 (0.0028) [2024-06-19 00:29:43,380][26367] Fps is (10 sec: 47513.2, 60 sec: 42325.2, 300 sec: 41932.4). Total num frames: 3958571008. Throughput: 0: 41410.1. Samples: 226186900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:29:43,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 00:29:44,737][26599] Updated weights for policy 0, policy_version 241614 (0.0041) [2024-06-19 00:29:48,380][26367] Fps is (10 sec: 37683.5, 60 sec: 40960.0, 300 sec: 41709.8). Total num frames: 3958718464. Throughput: 0: 41352.4. Samples: 226315520. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:29:48,388][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 00:29:49,239][26599] Updated weights for policy 0, policy_version 241624 (0.0039) [2024-06-19 00:29:52,571][26599] Updated weights for policy 0, policy_version 241634 (0.0037) [2024-06-19 00:29:53,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41765.8). Total num frames: 3958947840. Throughput: 0: 41465.9. Samples: 226561400. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:29:53,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 00:29:57,134][26599] Updated weights for policy 0, policy_version 241644 (0.0039) [2024-06-19 00:29:58,380][26367] Fps is (10 sec: 45874.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 3959177216. Throughput: 0: 41673.9. Samples: 226812080. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:29:58,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 00:30:00,527][26599] Updated weights for policy 0, policy_version 241654 (0.0044) [2024-06-19 00:30:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 3959357440. Throughput: 0: 41563.5. Samples: 226939280. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:30:03,384][26367] Avg episode reward: [(0, '0.255')] [2024-06-19 00:30:05,040][26599] Updated weights for policy 0, policy_version 241664 (0.0033) [2024-06-19 00:30:08,313][26599] Updated weights for policy 0, policy_version 241674 (0.0047) [2024-06-19 00:30:08,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3959586816. Throughput: 0: 41440.7. Samples: 227178880. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:30:08,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 00:30:12,691][26599] Updated weights for policy 0, policy_version 241684 (0.0044) [2024-06-19 00:30:13,380][26367] Fps is (10 sec: 40959.8, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 3959767040. Throughput: 0: 41717.8. Samples: 227437860. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:30:13,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 00:30:16,043][26599] Updated weights for policy 0, policy_version 241694 (0.0029) [2024-06-19 00:30:18,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3959996416. Throughput: 0: 41376.0. Samples: 227556940. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:30:18,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 00:30:20,368][26599] Updated weights for policy 0, policy_version 241704 (0.0040) [2024-06-19 00:30:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3960209408. Throughput: 0: 41483.0. Samples: 227810280. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:30:23,381][26367] Avg episode reward: [(0, '0.369')] [2024-06-19 00:30:23,816][26599] Updated weights for policy 0, policy_version 241714 (0.0036) [2024-06-19 00:30:27,824][26579] Signal inference workers to stop experience collection... (3300 times) [2024-06-19 00:30:27,824][26579] Signal inference workers to resume experience collection... (3300 times) [2024-06-19 00:30:27,839][26599] InferenceWorker_p0-w0: stopping experience collection (3300 times) [2024-06-19 00:30:27,839][26599] InferenceWorker_p0-w0: resuming experience collection (3300 times) [2024-06-19 00:30:28,125][26599] Updated weights for policy 0, policy_version 241724 (0.0045) [2024-06-19 00:30:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3960406016. Throughput: 0: 41755.6. Samples: 228065900. Policy #0 lag: (min: 1.0, avg: 10.1, max: 23.0) [2024-06-19 00:30:28,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 00:30:31,381][26599] Updated weights for policy 0, policy_version 241734 (0.0042) [2024-06-19 00:30:33,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3960619008. Throughput: 0: 41527.0. Samples: 228184240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:30:33,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 00:30:36,075][26599] Updated weights for policy 0, policy_version 241744 (0.0029) [2024-06-19 00:30:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3960832000. Throughput: 0: 41725.4. 
Samples: 228439040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:30:38,380][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 00:30:39,570][26599] Updated weights for policy 0, policy_version 241754 (0.0029) [2024-06-19 00:30:43,380][26367] Fps is (10 sec: 39321.8, 60 sec: 40687.0, 300 sec: 41710.3). Total num frames: 3961012224. Throughput: 0: 41875.6. Samples: 228696480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:30:43,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 00:30:44,108][26599] Updated weights for policy 0, policy_version 241764 (0.0037) [2024-06-19 00:30:47,284][26599] Updated weights for policy 0, policy_version 241774 (0.0041) [2024-06-19 00:30:48,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 3961241600. Throughput: 0: 41656.9. Samples: 228813840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:30:48,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 00:30:51,780][26599] Updated weights for policy 0, policy_version 241784 (0.0038) [2024-06-19 00:30:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 3961454592. Throughput: 0: 41877.9. Samples: 229063380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:30:53,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 00:30:55,094][26599] Updated weights for policy 0, policy_version 241794 (0.0032) [2024-06-19 00:30:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 3961651200. Throughput: 0: 41708.4. Samples: 229314740. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:30:58,381][26367] Avg episode reward: [(0, '0.329')] [2024-06-19 00:30:59,948][26599] Updated weights for policy 0, policy_version 241804 (0.0033) [2024-06-19 00:31:02,756][26599] Updated weights for policy 0, policy_version 241814 (0.0030) [2024-06-19 00:31:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3961880576. Throughput: 0: 41824.0. Samples: 229439020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:31:03,381][26367] Avg episode reward: [(0, '0.312')] [2024-06-19 00:31:07,787][26599] Updated weights for policy 0, policy_version 241824 (0.0043) [2024-06-19 00:31:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3962077184. Throughput: 0: 41936.4. Samples: 229697420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:31:08,381][26367] Avg episode reward: [(0, '0.307')] [2024-06-19 00:31:10,475][26599] Updated weights for policy 0, policy_version 241834 (0.0045) [2024-06-19 00:31:13,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 3962273792. Throughput: 0: 41875.7. Samples: 229950300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:31:13,380][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 00:31:15,651][26599] Updated weights for policy 0, policy_version 241844 (0.0040) [2024-06-19 00:31:18,160][26599] Updated weights for policy 0, policy_version 241854 (0.0035) [2024-06-19 00:31:18,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 3962535936. Throughput: 0: 42011.6. Samples: 230074760. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:31:18,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 00:31:23,376][26599] Updated weights for policy 0, policy_version 241864 (0.0033) [2024-06-19 00:31:23,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3962699776. Throughput: 0: 41920.7. Samples: 230325480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:31:23,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 00:31:26,029][26599] Updated weights for policy 0, policy_version 241874 (0.0037) [2024-06-19 00:31:28,380][26367] Fps is (10 sec: 36044.8, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3962896384. Throughput: 0: 41728.9. Samples: 230574280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:31:28,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 00:31:31,032][26599] Updated weights for policy 0, policy_version 241884 (0.0041) [2024-06-19 00:31:33,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 3963158528. Throughput: 0: 41992.0. Samples: 230703480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:31:33,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 00:31:34,437][26599] Updated weights for policy 0, policy_version 241894 (0.0043) [2024-06-19 00:31:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 3963322368. Throughput: 0: 42016.4. Samples: 230954120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 00:31:38,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 00:31:38,450][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000241903_3963338752.pth... 
[2024-06-19 00:31:38,494][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000241292_3953328128.pth [2024-06-19 00:31:38,791][26599] Updated weights for policy 0, policy_version 241904 (0.0030) [2024-06-19 00:31:42,313][26599] Updated weights for policy 0, policy_version 241914 (0.0039) [2024-06-19 00:31:43,380][26367] Fps is (10 sec: 37682.8, 60 sec: 42052.1, 300 sec: 41709.8). Total num frames: 3963535360. Throughput: 0: 41823.5. Samples: 231196800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:31:43,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 00:31:46,959][26599] Updated weights for policy 0, policy_version 241924 (0.0037) [2024-06-19 00:31:48,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3963764736. Throughput: 0: 41904.8. Samples: 231324740. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:31:48,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 00:31:50,064][26599] Updated weights for policy 0, policy_version 241934 (0.0030) [2024-06-19 00:31:53,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 3963944960. Throughput: 0: 41744.6. Samples: 231575920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:31:53,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 00:31:54,761][26599] Updated weights for policy 0, policy_version 241944 (0.0047) [2024-06-19 00:31:57,869][26579] Signal inference workers to stop experience collection... (3350 times) [2024-06-19 00:31:57,870][26579] Signal inference workers to resume experience collection... 
(3350 times) [2024-06-19 00:31:57,914][26599] InferenceWorker_p0-w0: stopping experience collection (3350 times) [2024-06-19 00:31:57,914][26599] InferenceWorker_p0-w0: resuming experience collection (3350 times) [2024-06-19 00:31:57,999][26599] Updated weights for policy 0, policy_version 241954 (0.0030) [2024-06-19 00:31:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 41654.2). Total num frames: 3964174336. Throughput: 0: 41615.5. Samples: 231823000. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:31:58,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 00:32:02,502][26599] Updated weights for policy 0, policy_version 241964 (0.0044) [2024-06-19 00:32:03,384][26367] Fps is (10 sec: 42582.7, 60 sec: 41503.6, 300 sec: 41709.3). Total num frames: 3964370944. Throughput: 0: 41750.4. Samples: 231953680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:32:03,385][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 00:32:05,898][26599] Updated weights for policy 0, policy_version 241974 (0.0041) [2024-06-19 00:32:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.4, 300 sec: 41765.3). Total num frames: 3964583936. Throughput: 0: 41653.9. Samples: 232199900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:32:08,380][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 00:32:10,232][26599] Updated weights for policy 0, policy_version 241984 (0.0046) [2024-06-19 00:32:13,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 3964796928. Throughput: 0: 41625.8. Samples: 232447440. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:32:13,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 00:32:13,695][26599] Updated weights for policy 0, policy_version 241994 (0.0029) [2024-06-19 00:32:18,004][26599] Updated weights for policy 0, policy_version 242004 (0.0036) [2024-06-19 00:32:18,380][26367] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 41709.8). Total num frames: 3964993536. Throughput: 0: 41732.9. Samples: 232581460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:32:18,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 00:32:21,659][26599] Updated weights for policy 0, policy_version 242014 (0.0027) [2024-06-19 00:32:23,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3965206528. Throughput: 0: 41450.2. Samples: 232819380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:32:23,381][26367] Avg episode reward: [(0, '0.810')] [2024-06-19 00:32:25,891][26599] Updated weights for policy 0, policy_version 242024 (0.0042) [2024-06-19 00:32:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 41710.3). Total num frames: 3965435904. Throughput: 0: 41644.4. Samples: 233070800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:32:28,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 00:32:29,695][26599] Updated weights for policy 0, policy_version 242034 (0.0028) [2024-06-19 00:32:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 3965616128. Throughput: 0: 41676.5. Samples: 233200180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:32:33,381][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 00:32:33,718][26599] Updated weights for policy 0, policy_version 242044 (0.0045) [2024-06-19 00:32:38,000][26599] Updated weights for policy 0, policy_version 242054 (0.0029) [2024-06-19 00:32:38,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41654.2). 
Total num frames: 3965829120. Throughput: 0: 41671.0. Samples: 233451120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:32:38,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 00:32:41,323][26599] Updated weights for policy 0, policy_version 242064 (0.0029) [2024-06-19 00:32:43,384][26367] Fps is (10 sec: 45858.2, 60 sec: 42322.8, 300 sec: 41709.3). Total num frames: 3966074880. Throughput: 0: 41564.6. Samples: 233693560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:32:43,385][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 00:32:45,625][26599] Updated weights for policy 0, policy_version 242074 (0.0031) [2024-06-19 00:32:48,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41230.6, 300 sec: 41653.7). Total num frames: 3966238720. Throughput: 0: 41733.7. Samples: 233831700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:32:48,384][26367] Avg episode reward: [(0, '0.260')] [2024-06-19 00:32:49,206][26599] Updated weights for policy 0, policy_version 242084 (0.0029) [2024-06-19 00:32:53,380][26367] Fps is (10 sec: 37696.5, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 3966451712. Throughput: 0: 41799.8. Samples: 234080900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:32:53,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 00:32:53,655][26599] Updated weights for policy 0, policy_version 242094 (0.0041) [2024-06-19 00:32:57,055][26599] Updated weights for policy 0, policy_version 242104 (0.0026) [2024-06-19 00:32:58,380][26367] Fps is (10 sec: 47530.9, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 3966713856. Throughput: 0: 41864.8. Samples: 234331360. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:32:58,381][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 00:33:01,385][26599] Updated weights for policy 0, policy_version 242114 (0.0032) [2024-06-19 00:33:03,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41781.7, 300 sec: 41709.8). 
Total num frames: 3966877696. Throughput: 0: 41773.7. Samples: 234461280. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:03,384][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 00:33:05,083][26599] Updated weights for policy 0, policy_version 242124 (0.0035) [2024-06-19 00:33:08,380][26367] Fps is (10 sec: 37683.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3967090688. Throughput: 0: 41940.5. Samples: 234706700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:08,380][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 00:33:09,020][26599] Updated weights for policy 0, policy_version 242134 (0.0036) [2024-06-19 00:33:12,882][26599] Updated weights for policy 0, policy_version 242144 (0.0036) [2024-06-19 00:33:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 3967303680. Throughput: 0: 41975.7. Samples: 234959700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:13,381][26367] Avg episode reward: [(0, '0.315')] [2024-06-19 00:33:16,783][26599] Updated weights for policy 0, policy_version 242154 (0.0036) [2024-06-19 00:33:17,936][26579] Signal inference workers to stop experience collection... (3400 times) [2024-06-19 00:33:17,937][26579] Signal inference workers to resume experience collection... (3400 times) [2024-06-19 00:33:17,977][26599] InferenceWorker_p0-w0: stopping experience collection (3400 times) [2024-06-19 00:33:17,978][26599] InferenceWorker_p0-w0: resuming experience collection (3400 times) [2024-06-19 00:33:18,384][26367] Fps is (10 sec: 44220.2, 60 sec: 42322.8, 300 sec: 41764.8). Total num frames: 3967533056. Throughput: 0: 42000.5. Samples: 235090360. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:18,385][26367] Avg episode reward: [(0, '0.420')] [2024-06-19 00:33:20,645][26599] Updated weights for policy 0, policy_version 242164 (0.0039) [2024-06-19 00:33:23,382][26367] Fps is (10 sec: 40951.0, 60 sec: 41777.7, 300 sec: 41653.9). Total num frames: 3967713280. Throughput: 0: 41820.7. Samples: 235333140. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:23,383][26367] Avg episode reward: [(0, '0.355')] [2024-06-19 00:33:24,371][26599] Updated weights for policy 0, policy_version 242174 (0.0038) [2024-06-19 00:33:28,380][26367] Fps is (10 sec: 39335.8, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3967926272. Throughput: 0: 42239.8. Samples: 235594200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:28,381][26367] Avg episode reward: [(0, '0.390')] [2024-06-19 00:33:28,873][26599] Updated weights for policy 0, policy_version 242184 (0.0030) [2024-06-19 00:33:32,182][26599] Updated weights for policy 0, policy_version 242194 (0.0034) [2024-06-19 00:33:33,380][26367] Fps is (10 sec: 44246.7, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 3968155648. Throughput: 0: 41964.9. Samples: 235719960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:33,380][26367] Avg episode reward: [(0, '0.330')] [2024-06-19 00:33:36,855][26599] Updated weights for policy 0, policy_version 242204 (0.0023) [2024-06-19 00:33:38,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 3968368640. Throughput: 0: 41994.3. Samples: 235970640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:38,381][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 00:33:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000242210_3968368640.pth... 
[2024-06-19 00:33:38,449][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000241598_3958341632.pth [2024-06-19 00:33:40,209][26599] Updated weights for policy 0, policy_version 242214 (0.0035) [2024-06-19 00:33:43,380][26367] Fps is (10 sec: 40959.1, 60 sec: 41508.6, 300 sec: 41709.8). Total num frames: 3968565248. Throughput: 0: 41850.6. Samples: 236214640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:43,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 00:33:44,506][26599] Updated weights for policy 0, policy_version 242224 (0.0031) [2024-06-19 00:33:47,962][26599] Updated weights for policy 0, policy_version 242234 (0.0039) [2024-06-19 00:33:48,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42328.0, 300 sec: 41820.9). Total num frames: 3968778240. Throughput: 0: 41810.8. Samples: 236342760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:48,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 00:33:52,303][26599] Updated weights for policy 0, policy_version 242244 (0.0027) [2024-06-19 00:33:53,384][26367] Fps is (10 sec: 40945.7, 60 sec: 42049.8, 300 sec: 41709.3). Total num frames: 3968974848. Throughput: 0: 42048.1. Samples: 236599020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 00:33:53,384][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 00:33:55,678][26599] Updated weights for policy 0, policy_version 242254 (0.0023) [2024-06-19 00:33:58,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 3969187840. Throughput: 0: 41810.5. Samples: 236841180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:33:58,381][26367] Avg episode reward: [(0, '0.338')] [2024-06-19 00:34:00,131][26599] Updated weights for policy 0, policy_version 242264 (0.0033) [2024-06-19 00:34:03,380][26367] Fps is (10 sec: 42613.5, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3969400832. Throughput: 0: 41986.0. Samples: 236979580. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:03,381][26367] Avg episode reward: [(0, '0.349')] [2024-06-19 00:34:03,676][26599] Updated weights for policy 0, policy_version 242274 (0.0043) [2024-06-19 00:34:07,755][26599] Updated weights for policy 0, policy_version 242284 (0.0033) [2024-06-19 00:34:08,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 3969581056. Throughput: 0: 42064.7. Samples: 237225960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:08,380][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 00:34:11,360][26599] Updated weights for policy 0, policy_version 242294 (0.0043) [2024-06-19 00:34:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3969826816. Throughput: 0: 41685.4. Samples: 237470040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:13,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 00:34:15,874][26599] Updated weights for policy 0, policy_version 242304 (0.0028) [2024-06-19 00:34:18,380][26367] Fps is (10 sec: 45874.4, 60 sec: 41781.7, 300 sec: 41820.8). Total num frames: 3970039808. Throughput: 0: 41801.1. Samples: 237601020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:18,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 00:34:19,005][26599] Updated weights for policy 0, policy_version 242314 (0.0043) [2024-06-19 00:34:23,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41507.6, 300 sec: 41598.7). Total num frames: 3970203648. Throughput: 0: 41749.0. Samples: 237849340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:23,380][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 00:34:23,701][26599] Updated weights for policy 0, policy_version 242324 (0.0035) [2024-06-19 00:34:26,752][26599] Updated weights for policy 0, policy_version 242334 (0.0036) [2024-06-19 00:34:28,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 3970465792. Throughput: 0: 41714.8. Samples: 238091800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:28,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 00:34:31,897][26599] Updated weights for policy 0, policy_version 242344 (0.0041) [2024-06-19 00:34:33,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 3970646016. Throughput: 0: 41784.3. Samples: 238223060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:33,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 00:34:34,651][26599] Updated weights for policy 0, policy_version 242354 (0.0044) [2024-06-19 00:34:36,165][26579] Signal inference workers to stop experience collection... (3450 times) [2024-06-19 00:34:36,165][26579] Signal inference workers to resume experience collection... (3450 times) [2024-06-19 00:34:36,189][26599] InferenceWorker_p0-w0: stopping experience collection (3450 times) [2024-06-19 00:34:36,189][26599] InferenceWorker_p0-w0: resuming experience collection (3450 times) [2024-06-19 00:34:38,380][26367] Fps is (10 sec: 37682.7, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 3970842624. Throughput: 0: 41422.8. Samples: 238462900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:38,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 00:34:39,853][26599] Updated weights for policy 0, policy_version 242364 (0.0035) [2024-06-19 00:34:42,512][26599] Updated weights for policy 0, policy_version 242374 (0.0039) [2024-06-19 00:34:43,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 3971088384. Throughput: 0: 41617.8. Samples: 238713980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:43,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 00:34:47,752][26599] Updated weights for policy 0, policy_version 242384 (0.0036) [2024-06-19 00:34:48,380][26367] Fps is (10 sec: 37683.9, 60 sec: 40686.9, 300 sec: 41598.7). Total num frames: 3971219456. Throughput: 0: 41388.6. Samples: 238842060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:48,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 00:34:50,593][26599] Updated weights for policy 0, policy_version 242394 (0.0039) [2024-06-19 00:34:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41781.7, 300 sec: 41709.8). Total num frames: 3971481600. Throughput: 0: 41141.7. Samples: 239077340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:53,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 00:34:55,599][26599] Updated weights for policy 0, policy_version 242404 (0.0031) [2024-06-19 00:34:58,380][26367] Fps is (10 sec: 45874.6, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3971678208. Throughput: 0: 41380.4. Samples: 239332160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:34:58,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 00:34:58,993][26599] Updated weights for policy 0, policy_version 242414 (0.0049) [2024-06-19 00:35:03,380][26367] Fps is (10 sec: 36045.2, 60 sec: 40687.0, 300 sec: 41543.2). Total num frames: 3971842048. Throughput: 0: 41227.7. Samples: 239456260. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 00:35:03,380][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 00:35:03,644][26599] Updated weights for policy 0, policy_version 242424 (0.0023) [2024-06-19 00:35:06,872][26599] Updated weights for policy 0, policy_version 242434 (0.0037) [2024-06-19 00:35:08,383][26367] Fps is (10 sec: 44226.1, 60 sec: 42323.5, 300 sec: 41876.1). Total num frames: 3972120576. Throughput: 0: 41331.4. Samples: 239709360. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:08,383][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 00:35:11,404][26599] Updated weights for policy 0, policy_version 242444 (0.0039) [2024-06-19 00:35:13,380][26367] Fps is (10 sec: 47513.6, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3972317184. Throughput: 0: 41468.5. Samples: 239957880. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:13,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 00:35:14,550][26599] Updated weights for policy 0, policy_version 242454 (0.0049) [2024-06-19 00:35:18,384][26367] Fps is (10 sec: 37678.9, 60 sec: 40957.6, 300 sec: 41653.7). Total num frames: 3972497408. Throughput: 0: 41217.6. Samples: 240078000. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:18,384][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 00:35:19,215][26599] Updated weights for policy 0, policy_version 242464 (0.0028) [2024-06-19 00:35:22,275][26599] Updated weights for policy 0, policy_version 242474 (0.0036) [2024-06-19 00:35:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 3972743168. Throughput: 0: 41648.6. Samples: 240337080. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:23,381][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 00:35:27,035][26599] Updated weights for policy 0, policy_version 242484 (0.0034) [2024-06-19 00:35:28,380][26367] Fps is (10 sec: 40975.0, 60 sec: 40686.9, 300 sec: 41654.2). Total num frames: 3972907008. Throughput: 0: 41696.5. Samples: 240590320. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:28,381][26367] Avg episode reward: [(0, '0.800')] [2024-06-19 00:35:30,014][26599] Updated weights for policy 0, policy_version 242494 (0.0031) [2024-06-19 00:35:33,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3973136384. Throughput: 0: 41448.9. Samples: 240707260. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:33,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 00:35:34,850][26599] Updated weights for policy 0, policy_version 242504 (0.0035) [2024-06-19 00:35:37,881][26599] Updated weights for policy 0, policy_version 242514 (0.0037) [2024-06-19 00:35:38,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3973365760. Throughput: 0: 41892.9. Samples: 240962520. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:38,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 00:35:38,558][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000242516_3973382144.pth... [2024-06-19 00:35:38,607][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000241903_3963338752.pth [2024-06-19 00:35:42,900][26599] Updated weights for policy 0, policy_version 242524 (0.0026) [2024-06-19 00:35:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 40687.0, 300 sec: 41654.2). Total num frames: 3973529600. Throughput: 0: 41920.1. Samples: 241218560. 
Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:43,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 00:35:44,176][26579] Signal inference workers to stop experience collection... (3500 times) [2024-06-19 00:35:44,223][26599] InferenceWorker_p0-w0: stopping experience collection (3500 times) [2024-06-19 00:35:44,229][26579] Signal inference workers to resume experience collection... (3500 times) [2024-06-19 00:35:44,240][26599] InferenceWorker_p0-w0: resuming experience collection (3500 times) [2024-06-19 00:35:45,575][26599] Updated weights for policy 0, policy_version 242534 (0.0047) [2024-06-19 00:35:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 41765.3). Total num frames: 3973775360. Throughput: 0: 41784.3. Samples: 241336560. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:48,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 00:35:50,323][26599] Updated weights for policy 0, policy_version 242544 (0.0037) [2024-06-19 00:35:53,246][26599] Updated weights for policy 0, policy_version 242554 (0.0036) [2024-06-19 00:35:53,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3974004736. Throughput: 0: 41997.9. Samples: 241599160. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:53,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 00:35:58,009][26599] Updated weights for policy 0, policy_version 242564 (0.0042) [2024-06-19 00:35:58,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 3974168576. Throughput: 0: 42101.7. Samples: 241852460. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:35:58,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 00:36:01,271][26599] Updated weights for policy 0, policy_version 242574 (0.0032) [2024-06-19 00:36:03,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 41820.9). Total num frames: 3974414336. Throughput: 0: 42027.4. 
Samples: 241969080. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:36:03,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 00:36:05,995][26599] Updated weights for policy 0, policy_version 242584 (0.0023) [2024-06-19 00:36:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41234.7, 300 sec: 41765.3). Total num frames: 3974594560. Throughput: 0: 42027.0. Samples: 242228300. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:36:08,382][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 00:36:09,075][26599] Updated weights for policy 0, policy_version 242594 (0.0034) [2024-06-19 00:36:13,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41233.0, 300 sec: 41543.1). Total num frames: 3974791168. Throughput: 0: 41793.3. Samples: 242471020. Policy #0 lag: (min: 0.0, avg: 12.0, max: 24.0) [2024-06-19 00:36:13,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 00:36:14,016][26599] Updated weights for policy 0, policy_version 242604 (0.0039) [2024-06-19 00:36:16,897][26599] Updated weights for policy 0, policy_version 242614 (0.0047) [2024-06-19 00:36:18,384][26367] Fps is (10 sec: 44220.9, 60 sec: 42325.3, 300 sec: 41820.3). Total num frames: 3975036928. Throughput: 0: 42128.1. Samples: 242603180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:36:18,385][26367] Avg episode reward: [(0, '0.321')] [2024-06-19 00:36:21,751][26599] Updated weights for policy 0, policy_version 242624 (0.0034) [2024-06-19 00:36:23,384][26367] Fps is (10 sec: 40945.4, 60 sec: 40957.5, 300 sec: 41709.3). Total num frames: 3975200768. Throughput: 0: 41993.1. Samples: 242852360. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:36:23,384][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 00:36:24,712][26599] Updated weights for policy 0, policy_version 242634 (0.0044) [2024-06-19 00:36:28,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 3975446528. Throughput: 0: 41707.0. 
Samples: 243095380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:36:28,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 00:36:29,392][26599] Updated weights for policy 0, policy_version 242644 (0.0039) [2024-06-19 00:36:32,466][26599] Updated weights for policy 0, policy_version 242654 (0.0028) [2024-06-19 00:36:33,380][26367] Fps is (10 sec: 47530.7, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3975675904. Throughput: 0: 42123.2. Samples: 243232100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:36:33,381][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 00:36:37,003][26599] Updated weights for policy 0, policy_version 242664 (0.0044) [2024-06-19 00:36:38,380][26367] Fps is (10 sec: 37683.2, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 3975823360. Throughput: 0: 41620.8. Samples: 243472100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:36:38,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 00:36:40,546][26599] Updated weights for policy 0, policy_version 242674 (0.0049) [2024-06-19 00:36:43,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 3976069120. Throughput: 0: 41510.7. Samples: 243720440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:36:43,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 00:36:44,665][26599] Updated weights for policy 0, policy_version 242684 (0.0040) [2024-06-19 00:36:48,305][26579] Signal inference workers to stop experience collection... (3550 times) [2024-06-19 00:36:48,354][26599] InferenceWorker_p0-w0: stopping experience collection (3550 times) [2024-06-19 00:36:48,359][26579] Signal inference workers to resume experience collection... 
(3550 times) [2024-06-19 00:36:48,375][26599] InferenceWorker_p0-w0: resuming experience collection (3550 times) [2024-06-19 00:36:48,382][26599] Updated weights for policy 0, policy_version 242694 (0.0032) [2024-06-19 00:36:48,382][26367] Fps is (10 sec: 47506.5, 60 sec: 42051.3, 300 sec: 41876.2). Total num frames: 3976298496. Throughput: 0: 41883.0. Samples: 243853880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:36:48,382][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 00:36:52,346][26599] Updated weights for policy 0, policy_version 242704 (0.0051) [2024-06-19 00:36:53,380][26367] Fps is (10 sec: 39322.1, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 3976462336. Throughput: 0: 41612.1. Samples: 244100840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:36:53,380][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 00:36:56,046][26599] Updated weights for policy 0, policy_version 242714 (0.0027) [2024-06-19 00:36:58,380][26367] Fps is (10 sec: 40966.1, 60 sec: 42325.3, 300 sec: 41821.4). Total num frames: 3976708096. Throughput: 0: 41744.9. Samples: 244349540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:36:58,386][26367] Avg episode reward: [(0, '0.353')] [2024-06-19 00:37:00,261][26599] Updated weights for policy 0, policy_version 242724 (0.0038) [2024-06-19 00:37:03,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 3976904704. Throughput: 0: 41688.3. Samples: 244479000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:37:03,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 00:37:03,996][26599] Updated weights for policy 0, policy_version 242734 (0.0039) [2024-06-19 00:37:08,141][26599] Updated weights for policy 0, policy_version 242744 (0.0042) [2024-06-19 00:37:08,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 3977117696. Throughput: 0: 41636.3. Samples: 244725840. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:37:08,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 00:37:12,406][26599] Updated weights for policy 0, policy_version 242754 (0.0033) [2024-06-19 00:37:13,380][26367] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 3977297920. Throughput: 0: 41829.7. Samples: 244977720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:37:13,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 00:37:16,117][26599] Updated weights for policy 0, policy_version 242764 (0.0028) [2024-06-19 00:37:18,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41781.7, 300 sec: 41820.8). Total num frames: 3977543680. Throughput: 0: 41603.5. Samples: 245104260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:37:18,381][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 00:37:20,024][26599] Updated weights for policy 0, policy_version 242774 (0.0037) [2024-06-19 00:37:23,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42328.0, 300 sec: 41709.8). Total num frames: 3977740288. Throughput: 0: 41817.1. Samples: 245353860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 21.0) [2024-06-19 00:37:23,380][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 00:37:24,230][26599] Updated weights for policy 0, policy_version 242784 (0.0040) [2024-06-19 00:37:27,729][26599] Updated weights for policy 0, policy_version 242794 (0.0032) [2024-06-19 00:37:28,380][26367] Fps is (10 sec: 39322.4, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 3977936896. Throughput: 0: 41877.9. Samples: 245604940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:37:28,380][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 00:37:32,363][26599] Updated weights for policy 0, policy_version 242804 (0.0039) [2024-06-19 00:37:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 3978166272. Throughput: 0: 41733.5. Samples: 245731820. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:37:33,380][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 00:37:35,461][26599] Updated weights for policy 0, policy_version 242814 (0.0039) [2024-06-19 00:37:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 41654.8). Total num frames: 3978362880. Throughput: 0: 41858.6. Samples: 245984480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:37:38,380][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 00:37:38,477][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000242821_3978379264.pth... [2024-06-19 00:37:38,544][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000242210_3968368640.pth [2024-06-19 00:37:40,018][26599] Updated weights for policy 0, policy_version 242824 (0.0035) [2024-06-19 00:37:43,126][26599] Updated weights for policy 0, policy_version 242834 (0.0034) [2024-06-19 00:37:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 41876.9). Total num frames: 3978592256. Throughput: 0: 41869.8. Samples: 246233680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:37:43,384][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 00:37:47,691][26599] Updated weights for policy 0, policy_version 242844 (0.0043) [2024-06-19 00:37:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41234.1, 300 sec: 41765.3). Total num frames: 3978772480. Throughput: 0: 41775.9. Samples: 246358920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:37:48,384][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 00:37:50,749][26599] Updated weights for policy 0, policy_version 242854 (0.0028) [2024-06-19 00:37:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 41598.7). Total num frames: 3978985472. Throughput: 0: 41943.4. Samples: 246613300. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:37:53,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 00:37:55,395][26599] Updated weights for policy 0, policy_version 242864 (0.0028) [2024-06-19 00:37:58,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3979231232. Throughput: 0: 41825.4. Samples: 246859860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:37:58,381][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 00:37:58,448][26599] Updated weights for policy 0, policy_version 242874 (0.0041) [2024-06-19 00:38:03,056][26599] Updated weights for policy 0, policy_version 242884 (0.0044) [2024-06-19 00:38:03,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42052.1, 300 sec: 41820.8). Total num frames: 3979427840. Throughput: 0: 41943.0. Samples: 246991700. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:38:03,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 00:38:06,363][26599] Updated weights for policy 0, policy_version 242894 (0.0040) [2024-06-19 00:38:08,380][26367] Fps is (10 sec: 37682.7, 60 sec: 41506.0, 300 sec: 41709.7). Total num frames: 3979608064. Throughput: 0: 41677.5. Samples: 247229360. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:38:08,381][26367] Avg episode reward: [(0, '0.366')] [2024-06-19 00:38:09,211][26579] Signal inference workers to stop experience collection... (3600 times) [2024-06-19 00:38:09,212][26579] Signal inference workers to resume experience collection... (3600 times) [2024-06-19 00:38:09,243][26599] InferenceWorker_p0-w0: stopping experience collection (3600 times) [2024-06-19 00:38:09,244][26599] InferenceWorker_p0-w0: resuming experience collection (3600 times) [2024-06-19 00:38:10,759][26599] Updated weights for policy 0, policy_version 242904 (0.0032) [2024-06-19 00:38:13,380][26367] Fps is (10 sec: 40961.1, 60 sec: 42325.5, 300 sec: 41710.3). Total num frames: 3979837440. Throughput: 0: 41992.0. 
Samples: 247494580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:38:13,380][26367] Avg episode reward: [(0, '0.320')] [2024-06-19 00:38:14,266][26599] Updated weights for policy 0, policy_version 242914 (0.0038) [2024-06-19 00:38:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41765.6). Total num frames: 3980034048. Throughput: 0: 41889.6. Samples: 247616860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:38:18,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 00:38:18,822][26599] Updated weights for policy 0, policy_version 242924 (0.0037) [2024-06-19 00:38:22,191][26599] Updated weights for policy 0, policy_version 242934 (0.0039) [2024-06-19 00:38:23,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 3980247040. Throughput: 0: 41718.1. Samples: 247861800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:38:23,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 00:38:26,611][26599] Updated weights for policy 0, policy_version 242944 (0.0032) [2024-06-19 00:38:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.1, 300 sec: 41709.8). Total num frames: 3980460032. Throughput: 0: 41801.7. Samples: 248114760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:38:28,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 00:38:30,147][26599] Updated weights for policy 0, policy_version 242954 (0.0036) [2024-06-19 00:38:33,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41503.5, 300 sec: 41653.7). Total num frames: 3980656640. Throughput: 0: 41891.3. Samples: 248244180. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 00:38:33,385][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 00:38:34,433][26599] Updated weights for policy 0, policy_version 242964 (0.0030) [2024-06-19 00:38:38,283][26599] Updated weights for policy 0, policy_version 242974 (0.0028) [2024-06-19 00:38:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3980886016. Throughput: 0: 41863.6. Samples: 248497160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:38:38,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 00:38:41,983][26599] Updated weights for policy 0, policy_version 242984 (0.0032) [2024-06-19 00:38:43,384][26367] Fps is (10 sec: 44237.1, 60 sec: 41776.7, 300 sec: 41764.8). Total num frames: 3981099008. Throughput: 0: 41803.8. Samples: 248741180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:38:43,384][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 00:38:46,001][26599] Updated weights for policy 0, policy_version 242994 (0.0042) [2024-06-19 00:38:48,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41654.7). Total num frames: 3981262848. Throughput: 0: 41629.5. Samples: 248865020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:38:48,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 00:38:49,981][26599] Updated weights for policy 0, policy_version 243004 (0.0037) [2024-06-19 00:38:53,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3981508608. Throughput: 0: 42061.0. Samples: 249122100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:38:53,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 00:38:53,712][26599] Updated weights for policy 0, policy_version 243014 (0.0039) [2024-06-19 00:38:57,676][26599] Updated weights for policy 0, policy_version 243024 (0.0037) [2024-06-19 00:38:58,380][26367] Fps is (10 sec: 45874.9, 60 sec: 41506.1, 300 sec: 41765.3). 
Total num frames: 3981721600. Throughput: 0: 41663.0. Samples: 249369420. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:38:58,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 00:39:01,328][26599] Updated weights for policy 0, policy_version 243034 (0.0031) [2024-06-19 00:39:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 3981918208. Throughput: 0: 41865.4. Samples: 249500800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:39:03,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 00:39:05,467][26599] Updated weights for policy 0, policy_version 243044 (0.0038) [2024-06-19 00:39:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 3982131200. Throughput: 0: 42075.5. Samples: 249755200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:39:08,381][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 00:39:09,101][26599] Updated weights for policy 0, policy_version 243054 (0.0029) [2024-06-19 00:39:13,229][26599] Updated weights for policy 0, policy_version 243064 (0.0037) [2024-06-19 00:39:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3982360576. Throughput: 0: 42076.5. Samples: 250008200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:39:13,381][26367] Avg episode reward: [(0, '0.386')] [2024-06-19 00:39:16,937][26599] Updated weights for policy 0, policy_version 243074 (0.0033) [2024-06-19 00:39:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3982557184. Throughput: 0: 41966.9. Samples: 250132540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:39:18,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 00:39:21,109][26599] Updated weights for policy 0, policy_version 243084 (0.0046) [2024-06-19 00:39:23,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 41654.2). 
Total num frames: 3982753792. Throughput: 0: 41787.9. Samples: 250377620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:39:23,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 00:39:25,024][26599] Updated weights for policy 0, policy_version 243094 (0.0042) [2024-06-19 00:39:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3982966784. Throughput: 0: 41972.2. Samples: 250629780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:39:28,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 00:39:29,280][26599] Updated weights for policy 0, policy_version 243104 (0.0041) [2024-06-19 00:39:29,796][26579] Signal inference workers to stop experience collection... (3650 times) [2024-06-19 00:39:29,797][26579] Signal inference workers to resume experience collection... (3650 times) [2024-06-19 00:39:29,809][26599] InferenceWorker_p0-w0: stopping experience collection (3650 times) [2024-06-19 00:39:29,809][26599] InferenceWorker_p0-w0: resuming experience collection (3650 times) [2024-06-19 00:39:32,830][26599] Updated weights for policy 0, policy_version 243114 (0.0030) [2024-06-19 00:39:33,380][26367] Fps is (10 sec: 44237.8, 60 sec: 42328.0, 300 sec: 41876.4). Total num frames: 3983196160. Throughput: 0: 41990.8. Samples: 250754600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:39:33,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 00:39:37,083][26599] Updated weights for policy 0, policy_version 243124 (0.0032) [2024-06-19 00:39:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 3983392768. Throughput: 0: 41940.8. Samples: 251009440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:39:38,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 00:39:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000243127_3983392768.pth... 
[2024-06-19 00:39:38,463][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000242516_3973382144.pth [2024-06-19 00:39:40,966][26599] Updated weights for policy 0, policy_version 243134 (0.0035) [2024-06-19 00:39:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41781.7, 300 sec: 41987.5). Total num frames: 3983605760. Throughput: 0: 41926.8. Samples: 251256120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 20.0) [2024-06-19 00:39:43,381][26367] Avg episode reward: [(0, '0.251')] [2024-06-19 00:39:44,843][26599] Updated weights for policy 0, policy_version 243144 (0.0037) [2024-06-19 00:39:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 3983802368. Throughput: 0: 41717.8. Samples: 251378100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:39:48,381][26367] Avg episode reward: [(0, '0.226')] [2024-06-19 00:39:49,013][26599] Updated weights for policy 0, policy_version 243154 (0.0039) [2024-06-19 00:39:52,601][26599] Updated weights for policy 0, policy_version 243164 (0.0036) [2024-06-19 00:39:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 3984031744. Throughput: 0: 41627.7. Samples: 251628440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:39:53,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 00:39:56,912][26599] Updated weights for policy 0, policy_version 243174 (0.0046) [2024-06-19 00:39:58,381][26367] Fps is (10 sec: 40958.8, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 3984211968. Throughput: 0: 41667.7. Samples: 251883260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:39:58,381][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 00:40:00,336][26599] Updated weights for policy 0, policy_version 243184 (0.0031) [2024-06-19 00:40:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 41710.1). Total num frames: 3984424960. Throughput: 0: 41552.2. Samples: 252002380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:03,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 00:40:04,611][26599] Updated weights for policy 0, policy_version 243194 (0.0037) [2024-06-19 00:40:08,055][26599] Updated weights for policy 0, policy_version 243204 (0.0028) [2024-06-19 00:40:08,380][26367] Fps is (10 sec: 44238.1, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 3984654336. Throughput: 0: 41817.0. Samples: 252259380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:08,381][26367] Avg episode reward: [(0, '0.793')] [2024-06-19 00:40:12,423][26599] Updated weights for policy 0, policy_version 243214 (0.0035) [2024-06-19 00:40:13,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41506.0, 300 sec: 41876.9). Total num frames: 3984850944. Throughput: 0: 41679.5. Samples: 252505360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:13,381][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 00:40:16,187][26599] Updated weights for policy 0, policy_version 243224 (0.0034) [2024-06-19 00:40:18,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3985047552. Throughput: 0: 41604.8. Samples: 252626820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:18,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 00:40:20,225][26599] Updated weights for policy 0, policy_version 243234 (0.0033) [2024-06-19 00:40:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 3985276928. Throughput: 0: 41646.7. Samples: 252883540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:23,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 00:40:23,815][26599] Updated weights for policy 0, policy_version 243244 (0.0033) [2024-06-19 00:40:27,942][26599] Updated weights for policy 0, policy_version 243254 (0.0057) [2024-06-19 00:40:28,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 3985489920. Throughput: 0: 41739.0. Samples: 253134380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:28,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 00:40:31,828][26599] Updated weights for policy 0, policy_version 243264 (0.0041) [2024-06-19 00:40:33,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41503.6, 300 sec: 41764.8). Total num frames: 3985686528. Throughput: 0: 41899.3. Samples: 253263720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:33,385][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 00:40:35,668][26599] Updated weights for policy 0, policy_version 243274 (0.0041) [2024-06-19 00:40:38,380][26367] Fps is (10 sec: 39322.7, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 3985883136. Throughput: 0: 41922.3. Samples: 253514940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:38,380][26367] Avg episode reward: [(0, '0.327')] [2024-06-19 00:40:39,715][26599] Updated weights for policy 0, policy_version 243284 (0.0036) [2024-06-19 00:40:43,380][26367] Fps is (10 sec: 42613.8, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3986112512. Throughput: 0: 41749.2. Samples: 253761960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:43,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 00:40:43,533][26599] Updated weights for policy 0, policy_version 243294 (0.0046) [2024-06-19 00:40:47,477][26599] Updated weights for policy 0, policy_version 243304 (0.0030) [2024-06-19 00:40:48,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 3986325504. Throughput: 0: 41968.9. Samples: 253890980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:48,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 00:40:51,411][26599] Updated weights for policy 0, policy_version 243314 (0.0039) [2024-06-19 00:40:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 3986538496. Throughput: 0: 41697.3. Samples: 254135760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 00:40:53,381][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 00:40:55,680][26599] Updated weights for policy 0, policy_version 243324 (0.0048) [2024-06-19 00:40:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.6, 300 sec: 41765.3). Total num frames: 3986735104. Throughput: 0: 41859.3. Samples: 254389020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:40:58,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 00:40:59,190][26599] Updated weights for policy 0, policy_version 243334 (0.0031) [2024-06-19 00:41:03,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3986931712. Throughput: 0: 41812.1. Samples: 254508360. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:03,380][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 00:41:03,387][26599] Updated weights for policy 0, policy_version 243344 (0.0028) [2024-06-19 00:41:07,284][26599] Updated weights for policy 0, policy_version 243354 (0.0041) [2024-06-19 00:41:08,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 3987161088. Throughput: 0: 41675.1. Samples: 254758920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:08,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 00:41:10,679][26579] Signal inference workers to stop experience collection... (3700 times) [2024-06-19 00:41:10,732][26579] Signal inference workers to resume experience collection... (3700 times) [2024-06-19 00:41:10,732][26599] InferenceWorker_p0-w0: stopping experience collection (3700 times) [2024-06-19 00:41:10,758][26599] InferenceWorker_p0-w0: resuming experience collection (3700 times) [2024-06-19 00:41:11,036][26599] Updated weights for policy 0, policy_version 243364 (0.0031) [2024-06-19 00:41:13,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41710.3). Total num frames: 3987341312. Throughput: 0: 41802.3. Samples: 255015480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:13,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 00:41:15,194][26599] Updated weights for policy 0, policy_version 243374 (0.0023) [2024-06-19 00:41:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 41876.9). Total num frames: 3987554304. Throughput: 0: 41434.3. Samples: 255128120. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:18,381][26367] Avg episode reward: [(0, '0.236')] [2024-06-19 00:41:19,067][26599] Updated weights for policy 0, policy_version 243384 (0.0030) [2024-06-19 00:41:22,953][26599] Updated weights for policy 0, policy_version 243394 (0.0035) [2024-06-19 00:41:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 3987783680. Throughput: 0: 41571.9. Samples: 255385680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:23,381][26367] Avg episode reward: [(0, '0.401')] [2024-06-19 00:41:26,825][26599] Updated weights for policy 0, policy_version 243404 (0.0030) [2024-06-19 00:41:28,380][26367] Fps is (10 sec: 39321.7, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 3987947520. Throughput: 0: 41663.5. Samples: 255636820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:28,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 00:41:30,697][26599] Updated weights for policy 0, policy_version 243414 (0.0042) [2024-06-19 00:41:33,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41781.6, 300 sec: 41931.9). Total num frames: 3988193280. Throughput: 0: 41477.6. Samples: 255757480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:33,381][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 00:41:34,885][26599] Updated weights for policy 0, policy_version 243424 (0.0042) [2024-06-19 00:41:38,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42052.1, 300 sec: 41820.8). Total num frames: 3988406272. Throughput: 0: 41814.6. Samples: 256017420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:38,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 00:41:38,398][26599] Updated weights for policy 0, policy_version 243434 (0.0033) [2024-06-19 00:41:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000243434_3988422656.pth... 
[2024-06-19 00:41:38,444][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000242821_3978379264.pth [2024-06-19 00:41:42,839][26599] Updated weights for policy 0, policy_version 243444 (0.0034) [2024-06-19 00:41:43,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41654.5). Total num frames: 3988586496. Throughput: 0: 41606.1. Samples: 256261300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:43,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 00:41:46,723][26599] Updated weights for policy 0, policy_version 243454 (0.0039) [2024-06-19 00:41:48,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 3988848640. Throughput: 0: 41817.7. Samples: 256390160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:48,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 00:41:51,025][26599] Updated weights for policy 0, policy_version 243464 (0.0032) [2024-06-19 00:41:53,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 3989012480. Throughput: 0: 41930.6. Samples: 256645800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:53,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 00:41:54,369][26599] Updated weights for policy 0, policy_version 243474 (0.0045) [2024-06-19 00:41:58,380][26367] Fps is (10 sec: 37682.6, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 3989225472. Throughput: 0: 41671.4. Samples: 256890700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:41:58,381][26367] Avg episode reward: [(0, '0.404')] [2024-06-19 00:41:58,609][26599] Updated weights for policy 0, policy_version 243484 (0.0037) [2024-06-19 00:42:02,204][26599] Updated weights for policy 0, policy_version 243494 (0.0029) [2024-06-19 00:42:03,380][26367] Fps is (10 sec: 45876.5, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 3989471232. Throughput: 0: 42035.8. Samples: 257019720. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 00:42:03,380][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 00:42:06,276][26599] Updated weights for policy 0, policy_version 243504 (0.0032) [2024-06-19 00:42:08,380][26367] Fps is (10 sec: 39322.1, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 3989618688. Throughput: 0: 41977.8. Samples: 257274680. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:08,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 00:42:09,782][26599] Updated weights for policy 0, policy_version 243514 (0.0030) [2024-06-19 00:42:13,384][26367] Fps is (10 sec: 39306.9, 60 sec: 42049.7, 300 sec: 41764.8). Total num frames: 3989864448. Throughput: 0: 41742.9. Samples: 257515400. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:13,384][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 00:42:13,923][26599] Updated weights for policy 0, policy_version 243524 (0.0030) [2024-06-19 00:42:17,940][26599] Updated weights for policy 0, policy_version 243534 (0.0031) [2024-06-19 00:42:18,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 3990077440. Throughput: 0: 42045.5. Samples: 257649520. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:18,380][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 00:42:22,153][26599] Updated weights for policy 0, policy_version 243544 (0.0045) [2024-06-19 00:42:23,380][26367] Fps is (10 sec: 39335.8, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 3990257664. Throughput: 0: 41733.4. Samples: 257895420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:23,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 00:42:25,657][26599] Updated weights for policy 0, policy_version 243554 (0.0059) [2024-06-19 00:42:26,928][26579] Signal inference workers to stop experience collection... 
(3750 times) [2024-06-19 00:42:26,928][26579] Signal inference workers to resume experience collection... (3750 times) [2024-06-19 00:42:26,957][26599] InferenceWorker_p0-w0: stopping experience collection (3750 times) [2024-06-19 00:42:26,957][26599] InferenceWorker_p0-w0: resuming experience collection (3750 times) [2024-06-19 00:42:28,380][26367] Fps is (10 sec: 42597.2, 60 sec: 42598.3, 300 sec: 41820.8). Total num frames: 3990503424. Throughput: 0: 41711.4. Samples: 258138320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:28,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 00:42:29,746][26599] Updated weights for policy 0, policy_version 243564 (0.0037) [2024-06-19 00:42:33,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 3990683648. Throughput: 0: 41906.8. Samples: 258275960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:33,380][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 00:42:33,568][26599] Updated weights for policy 0, policy_version 243574 (0.0028) [2024-06-19 00:42:37,488][26599] Updated weights for policy 0, policy_version 243584 (0.0029) [2024-06-19 00:42:38,380][26367] Fps is (10 sec: 39322.5, 60 sec: 41506.3, 300 sec: 41709.8). Total num frames: 3990896640. Throughput: 0: 41583.3. Samples: 258517040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:38,380][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 00:42:41,514][26599] Updated weights for policy 0, policy_version 243594 (0.0038) [2024-06-19 00:42:43,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 3991142400. Throughput: 0: 41635.2. Samples: 258764280. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:43,381][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 00:42:45,482][26599] Updated weights for policy 0, policy_version 243604 (0.0027) [2024-06-19 00:42:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 3991306240. Throughput: 0: 41779.9. Samples: 258899820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:48,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 00:42:49,256][26599] Updated weights for policy 0, policy_version 243614 (0.0037) [2024-06-19 00:42:53,178][26599] Updated weights for policy 0, policy_version 243624 (0.0032) [2024-06-19 00:42:53,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 41709.8). Total num frames: 3991535616. Throughput: 0: 41608.1. Samples: 259147040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:53,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 00:42:56,977][26599] Updated weights for policy 0, policy_version 243634 (0.0043) [2024-06-19 00:42:58,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42325.5, 300 sec: 41820.9). Total num frames: 3991764992. Throughput: 0: 41923.9. Samples: 259401820. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:42:58,380][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 00:43:00,898][26599] Updated weights for policy 0, policy_version 243644 (0.0032) [2024-06-19 00:43:03,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41232.9, 300 sec: 41820.9). Total num frames: 3991945216. Throughput: 0: 41767.8. Samples: 259529080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:43:03,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 00:43:04,773][26599] Updated weights for policy 0, policy_version 243654 (0.0041) [2024-06-19 00:43:08,381][26367] Fps is (10 sec: 40954.9, 60 sec: 42597.6, 300 sec: 41820.7). Total num frames: 3992174592. Throughput: 0: 41782.1. Samples: 259775660. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:43:08,382][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 00:43:08,639][26599] Updated weights for policy 0, policy_version 243664 (0.0037) [2024-06-19 00:43:12,651][26599] Updated weights for policy 0, policy_version 243674 (0.0043) [2024-06-19 00:43:13,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42054.9, 300 sec: 41876.4). Total num frames: 3992387584. Throughput: 0: 42000.7. Samples: 260028340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 00:43:13,380][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 00:43:16,447][26599] Updated weights for policy 0, policy_version 243684 (0.0042) [2024-06-19 00:43:18,380][26367] Fps is (10 sec: 39325.8, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 3992567808. Throughput: 0: 41768.3. Samples: 260155540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:43:18,381][26367] Avg episode reward: [(0, '0.365')] [2024-06-19 00:43:20,555][26599] Updated weights for policy 0, policy_version 243694 (0.0035) [2024-06-19 00:43:23,384][26367] Fps is (10 sec: 44220.4, 60 sec: 42868.9, 300 sec: 41931.4). Total num frames: 3992829952. Throughput: 0: 41857.0. Samples: 260400760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:43:23,384][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 00:43:24,729][26599] Updated weights for policy 0, policy_version 243704 (0.0040) [2024-06-19 00:43:28,347][26599] Updated weights for policy 0, policy_version 243714 (0.0034) [2024-06-19 00:43:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41779.3, 300 sec: 41876.9). Total num frames: 3993010176. Throughput: 0: 42140.4. Samples: 260660600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:43:28,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 00:43:32,467][26599] Updated weights for policy 0, policy_version 243724 (0.0037) [2024-06-19 00:43:33,383][26367] Fps is (10 sec: 37687.8, 60 sec: 42050.5, 300 sec: 41765.0). 
Total num frames: 3993206784. Throughput: 0: 41650.2. Samples: 260774180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:43:33,383][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 00:43:36,031][26599] Updated weights for policy 0, policy_version 243734 (0.0038) [2024-06-19 00:43:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 41821.4). Total num frames: 3993436160. Throughput: 0: 41869.7. Samples: 261031180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:43:38,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 00:43:38,574][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000243742_3993468928.pth... [2024-06-19 00:43:38,631][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000243127_3983392768.pth [2024-06-19 00:43:40,267][26599] Updated weights for policy 0, policy_version 243744 (0.0035) [2024-06-19 00:43:43,380][26367] Fps is (10 sec: 40969.8, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 3993616384. Throughput: 0: 41989.2. Samples: 261291340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:43:43,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 00:43:43,512][26579] Signal inference workers to stop experience collection... (3800 times) [2024-06-19 00:43:43,513][26579] Signal inference workers to resume experience collection... (3800 times) [2024-06-19 00:43:43,533][26599] InferenceWorker_p0-w0: stopping experience collection (3800 times) [2024-06-19 00:43:43,533][26599] InferenceWorker_p0-w0: resuming experience collection (3800 times) [2024-06-19 00:43:43,830][26599] Updated weights for policy 0, policy_version 243754 (0.0043) [2024-06-19 00:43:47,901][26599] Updated weights for policy 0, policy_version 243764 (0.0030) [2024-06-19 00:43:48,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 41820.8). Total num frames: 3993845760. Throughput: 0: 41750.6. Samples: 261407860. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:43:48,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 00:43:51,629][26599] Updated weights for policy 0, policy_version 243774 (0.0040) [2024-06-19 00:43:53,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 3994075136. Throughput: 0: 41995.7. Samples: 261665420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:43:53,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 00:43:55,945][26599] Updated weights for policy 0, policy_version 243784 (0.0041) [2024-06-19 00:43:58,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41232.9, 300 sec: 41765.3). Total num frames: 3994238976. Throughput: 0: 41997.1. Samples: 261918220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:43:58,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 00:43:59,454][26599] Updated weights for policy 0, policy_version 243794 (0.0036) [2024-06-19 00:44:03,380][26367] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3994451968. Throughput: 0: 41803.2. Samples: 262036680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:44:03,380][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 00:44:03,795][26599] Updated weights for policy 0, policy_version 243804 (0.0037) [2024-06-19 00:44:07,417][26599] Updated weights for policy 0, policy_version 243814 (0.0036) [2024-06-19 00:44:08,380][26367] Fps is (10 sec: 44237.8, 60 sec: 41780.0, 300 sec: 41765.3). Total num frames: 3994681344. Throughput: 0: 41902.6. Samples: 262286220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:44:08,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 00:44:11,685][26599] Updated weights for policy 0, policy_version 243824 (0.0037) [2024-06-19 00:44:13,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41232.9, 300 sec: 41709.8). Total num frames: 3994861568. Throughput: 0: 41776.4. Samples: 262540540. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:44:13,384][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 00:44:15,717][26599] Updated weights for policy 0, policy_version 243834 (0.0042) [2024-06-19 00:44:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 3995090944. Throughput: 0: 41945.4. Samples: 262661620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:44:18,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 00:44:19,521][26599] Updated weights for policy 0, policy_version 243844 (0.0042) [2024-06-19 00:44:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 40689.4, 300 sec: 41709.8). Total num frames: 3995271168. Throughput: 0: 41707.2. Samples: 262908000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 00:44:23,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 00:44:23,764][26599] Updated weights for policy 0, policy_version 243854 (0.0035) [2024-06-19 00:44:27,714][26599] Updated weights for policy 0, policy_version 243864 (0.0037) [2024-06-19 00:44:28,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3995500544. Throughput: 0: 41421.4. Samples: 263155300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:44:28,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 00:44:31,623][26599] Updated weights for policy 0, policy_version 243874 (0.0035) [2024-06-19 00:44:33,380][26367] Fps is (10 sec: 44236.0, 60 sec: 41780.8, 300 sec: 41765.3). Total num frames: 3995713536. Throughput: 0: 41579.1. Samples: 263278920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:44:33,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 00:44:35,590][26599] Updated weights for policy 0, policy_version 243884 (0.0032) [2024-06-19 00:44:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 3995926528. Throughput: 0: 41305.8. Samples: 263524180. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:44:38,384][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 00:44:39,292][26599] Updated weights for policy 0, policy_version 243894 (0.0036) [2024-06-19 00:44:43,243][26599] Updated weights for policy 0, policy_version 243904 (0.0037) [2024-06-19 00:44:43,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3996123136. Throughput: 0: 41409.1. Samples: 263781620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:44:43,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 00:44:47,182][26599] Updated weights for policy 0, policy_version 243914 (0.0031) [2024-06-19 00:44:48,380][26367] Fps is (10 sec: 39321.1, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 3996319744. Throughput: 0: 41427.4. Samples: 263900920. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:44:48,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 00:44:51,024][26599] Updated weights for policy 0, policy_version 243924 (0.0027) [2024-06-19 00:44:53,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 3996549120. Throughput: 0: 41446.0. Samples: 264151300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:44:53,382][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 00:44:54,891][26599] Updated weights for policy 0, policy_version 243934 (0.0035) [2024-06-19 00:44:58,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 3996745728. Throughput: 0: 41587.6. Samples: 264411980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:44:58,381][26367] Avg episode reward: [(0, '0.227')] [2024-06-19 00:44:58,885][26599] Updated weights for policy 0, policy_version 243944 (0.0034) [2024-06-19 00:45:02,563][26599] Updated weights for policy 0, policy_version 243954 (0.0029) [2024-06-19 00:45:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41709.8). 
Total num frames: 3996958720. Throughput: 0: 41513.7. Samples: 264529740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:45:03,381][26367] Avg episode reward: [(0, '0.396')] [2024-06-19 00:45:06,468][26599] Updated weights for policy 0, policy_version 243964 (0.0025) [2024-06-19 00:45:07,932][26579] Signal inference workers to stop experience collection... (3850 times) [2024-06-19 00:45:07,932][26579] Signal inference workers to resume experience collection... (3850 times) [2024-06-19 00:45:07,975][26599] InferenceWorker_p0-w0: stopping experience collection (3850 times) [2024-06-19 00:45:07,975][26599] InferenceWorker_p0-w0: resuming experience collection (3850 times) [2024-06-19 00:45:08,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 3997188096. Throughput: 0: 41705.2. Samples: 264784740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:45:08,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 00:45:10,331][26599] Updated weights for policy 0, policy_version 243974 (0.0033) [2024-06-19 00:45:13,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 3997351936. Throughput: 0: 42018.2. Samples: 265046120. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:45:13,380][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 00:45:14,258][26599] Updated weights for policy 0, policy_version 243984 (0.0029) [2024-06-19 00:45:18,040][26599] Updated weights for policy 0, policy_version 243994 (0.0036) [2024-06-19 00:45:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3997614080. Throughput: 0: 41926.3. Samples: 265165600. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:45:18,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 00:45:22,092][26599] Updated weights for policy 0, policy_version 244004 (0.0025) [2024-06-19 00:45:23,380][26367] Fps is (10 sec: 47513.1, 60 sec: 42598.3, 300 sec: 41820.9). Total num frames: 3997827072. Throughput: 0: 42098.6. Samples: 265418620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:45:23,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 00:45:26,041][26599] Updated weights for policy 0, policy_version 244014 (0.0040) [2024-06-19 00:45:28,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41506.1, 300 sec: 41710.3). Total num frames: 3997990912. Throughput: 0: 42002.2. Samples: 265671720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:45:28,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 00:45:29,903][26599] Updated weights for policy 0, policy_version 244024 (0.0027) [2024-06-19 00:45:33,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 3998220288. Throughput: 0: 41878.2. Samples: 265785440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:45:33,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 00:45:34,145][26599] Updated weights for policy 0, policy_version 244034 (0.0042) [2024-06-19 00:45:37,613][26599] Updated weights for policy 0, policy_version 244044 (0.0027) [2024-06-19 00:45:38,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 3998449664. Throughput: 0: 42122.7. Samples: 266046820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:45:38,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 00:45:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000244047_3998466048.pth... 
[2024-06-19 00:45:38,461][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000243434_3988422656.pth [2024-06-19 00:45:41,970][26599] Updated weights for policy 0, policy_version 244054 (0.0046) [2024-06-19 00:45:43,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 3998629888. Throughput: 0: 41865.2. Samples: 266295920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:45:43,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 00:45:45,374][26599] Updated weights for policy 0, policy_version 244064 (0.0024) [2024-06-19 00:45:48,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42052.4, 300 sec: 41709.8). Total num frames: 3998842880. Throughput: 0: 41894.0. Samples: 266414960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:45:48,380][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 00:45:49,688][26599] Updated weights for policy 0, policy_version 244074 (0.0039) [2024-06-19 00:45:53,295][26599] Updated weights for policy 0, policy_version 244084 (0.0033) [2024-06-19 00:45:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 3999072256. Throughput: 0: 41972.9. Samples: 266673520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:45:53,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 00:45:57,397][26599] Updated weights for policy 0, policy_version 244094 (0.0033) [2024-06-19 00:45:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 3999252480. Throughput: 0: 41740.9. Samples: 266924460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:45:58,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 00:46:01,251][26599] Updated weights for policy 0, policy_version 244104 (0.0042) [2024-06-19 00:46:03,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 3999481856. Throughput: 0: 41853.0. Samples: 267048980. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:46:03,380][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 00:46:05,513][26599] Updated weights for policy 0, policy_version 244114 (0.0049) [2024-06-19 00:46:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.2, 300 sec: 41820.8). Total num frames: 3999678464. Throughput: 0: 41809.4. Samples: 267300040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:46:08,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 00:46:08,975][26599] Updated weights for policy 0, policy_version 244124 (0.0039) [2024-06-19 00:46:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 3999875072. Throughput: 0: 41736.0. Samples: 267549840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:46:13,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 00:46:13,555][26599] Updated weights for policy 0, policy_version 244134 (0.0027) [2024-06-19 00:46:16,810][26599] Updated weights for policy 0, policy_version 244144 (0.0037) [2024-06-19 00:46:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4000120832. Throughput: 0: 41947.2. Samples: 267673060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:46:18,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 00:46:21,547][26599] Updated weights for policy 0, policy_version 244154 (0.0047) [2024-06-19 00:46:21,581][26579] Signal inference workers to stop experience collection... (3900 times) [2024-06-19 00:46:21,581][26579] Signal inference workers to resume experience collection... (3900 times) [2024-06-19 00:46:21,608][26599] InferenceWorker_p0-w0: stopping experience collection (3900 times) [2024-06-19 00:46:21,608][26599] InferenceWorker_p0-w0: resuming experience collection (3900 times) [2024-06-19 00:46:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 4000301056. Throughput: 0: 41751.2. 
Samples: 267925620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:46:23,380][26367] Avg episode reward: [(0, '0.339')] [2024-06-19 00:46:24,600][26599] Updated weights for policy 0, policy_version 244164 (0.0037) [2024-06-19 00:46:28,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4000497664. Throughput: 0: 41788.5. Samples: 268176400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:46:28,384][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 00:46:29,404][26599] Updated weights for policy 0, policy_version 244174 (0.0034) [2024-06-19 00:46:32,302][26599] Updated weights for policy 0, policy_version 244184 (0.0042) [2024-06-19 00:46:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 4000727040. Throughput: 0: 41800.9. Samples: 268296000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:46:33,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 00:46:37,667][26599] Updated weights for policy 0, policy_version 244194 (0.0037) [2024-06-19 00:46:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41233.0, 300 sec: 41820.8). Total num frames: 4000923648. Throughput: 0: 41783.5. Samples: 268553780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:46:38,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 00:46:40,117][26599] Updated weights for policy 0, policy_version 244204 (0.0036) [2024-06-19 00:46:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41654.3). Total num frames: 4001136640. Throughput: 0: 41658.7. Samples: 268799100. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 00:46:43,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 00:46:45,470][26599] Updated weights for policy 0, policy_version 244214 (0.0046) [2024-06-19 00:46:47,918][26599] Updated weights for policy 0, policy_version 244224 (0.0028) [2024-06-19 00:46:48,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 4001382400. Throughput: 0: 41742.0. Samples: 268927380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:46:48,381][26367] Avg episode reward: [(0, '0.778')] [2024-06-19 00:46:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 40960.1, 300 sec: 41709.8). Total num frames: 4001529856. Throughput: 0: 41688.5. Samples: 269176020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:46:53,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 00:46:53,388][26599] Updated weights for policy 0, policy_version 244234 (0.0024) [2024-06-19 00:46:56,056][26599] Updated weights for policy 0, policy_version 244244 (0.0035) [2024-06-19 00:46:58,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42052.1, 300 sec: 41709.7). Total num frames: 4001775616. Throughput: 0: 41559.4. Samples: 269420020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:46:58,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 00:47:01,203][26599] Updated weights for policy 0, policy_version 244254 (0.0035) [2024-06-19 00:47:03,380][26367] Fps is (10 sec: 45875.3, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4001988608. Throughput: 0: 41845.4. Samples: 269556100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:03,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 00:47:03,699][26599] Updated weights for policy 0, policy_version 244264 (0.0029) [2024-06-19 00:47:08,380][26367] Fps is (10 sec: 37683.7, 60 sec: 41233.0, 300 sec: 41654.7). Total num frames: 4002152448. Throughput: 0: 41647.9. Samples: 269799780. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:08,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 00:47:08,980][26599] Updated weights for policy 0, policy_version 244274 (0.0042) [2024-06-19 00:47:11,572][26599] Updated weights for policy 0, policy_version 244284 (0.0037) [2024-06-19 00:47:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 4002398208. Throughput: 0: 41420.0. Samples: 270040300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:13,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 00:47:17,074][26599] Updated weights for policy 0, policy_version 244294 (0.0030) [2024-06-19 00:47:18,380][26367] Fps is (10 sec: 45875.7, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4002611200. Throughput: 0: 41739.1. Samples: 270174260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:18,380][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 00:47:19,718][26599] Updated weights for policy 0, policy_version 244304 (0.0036) [2024-06-19 00:47:23,380][26367] Fps is (10 sec: 36044.9, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 4002758656. Throughput: 0: 41299.3. Samples: 270412240. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:23,381][26367] Avg episode reward: [(0, '0.782')] [2024-06-19 00:47:24,946][26599] Updated weights for policy 0, policy_version 244314 (0.0053) [2024-06-19 00:47:27,575][26599] Updated weights for policy 0, policy_version 244324 (0.0030) [2024-06-19 00:47:28,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 4003020800. Throughput: 0: 41272.3. Samples: 270656360. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:28,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 00:47:32,694][26599] Updated weights for policy 0, policy_version 244334 (0.0032) [2024-06-19 00:47:33,380][26367] Fps is (10 sec: 45875.2, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4003217408. Throughput: 0: 41373.9. Samples: 270789200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:33,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 00:47:35,619][26599] Updated weights for policy 0, policy_version 244344 (0.0043) [2024-06-19 00:47:38,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 4003414016. Throughput: 0: 41228.8. Samples: 271031320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:38,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 00:47:38,417][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000244349_4003414016.pth... [2024-06-19 00:47:38,477][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000243742_3993468928.pth [2024-06-19 00:47:39,608][26579] Signal inference workers to stop experience collection... (3950 times) [2024-06-19 00:47:39,658][26599] InferenceWorker_p0-w0: stopping experience collection (3950 times) [2024-06-19 00:47:39,666][26579] Signal inference workers to resume experience collection... (3950 times) [2024-06-19 00:47:39,672][26599] InferenceWorker_p0-w0: resuming experience collection (3950 times) [2024-06-19 00:47:40,348][26599] Updated weights for policy 0, policy_version 244354 (0.0038) [2024-06-19 00:47:43,354][26599] Updated weights for policy 0, policy_version 244364 (0.0037) [2024-06-19 00:47:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4003659776. Throughput: 0: 41357.5. Samples: 271281100. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:43,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 00:47:48,291][26599] Updated weights for policy 0, policy_version 244374 (0.0035) [2024-06-19 00:47:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 40686.9, 300 sec: 41654.2). Total num frames: 4003823616. Throughput: 0: 41141.6. Samples: 271407480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:48,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 00:47:51,783][26599] Updated weights for policy 0, policy_version 244384 (0.0037) [2024-06-19 00:47:53,380][26367] Fps is (10 sec: 37683.4, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 4004036608. Throughput: 0: 41185.4. Samples: 271653120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 00:47:53,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 00:47:56,129][26599] Updated weights for policy 0, policy_version 244394 (0.0039) [2024-06-19 00:47:58,380][26367] Fps is (10 sec: 44237.6, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 4004265984. Throughput: 0: 41341.3. Samples: 271900660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:47:58,381][26367] Avg episode reward: [(0, '0.349')] [2024-06-19 00:47:59,980][26599] Updated weights for policy 0, policy_version 244404 (0.0033) [2024-06-19 00:48:03,380][26367] Fps is (10 sec: 39321.1, 60 sec: 40686.8, 300 sec: 41543.3). Total num frames: 4004429824. Throughput: 0: 41117.6. Samples: 272024560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:03,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 00:48:03,932][26599] Updated weights for policy 0, policy_version 244414 (0.0030) [2024-06-19 00:48:07,660][26599] Updated weights for policy 0, policy_version 244424 (0.0040) [2024-06-19 00:48:08,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 4004659200. Throughput: 0: 41506.3. Samples: 272280020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:08,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 00:48:12,117][26599] Updated weights for policy 0, policy_version 244434 (0.0032) [2024-06-19 00:48:13,380][26367] Fps is (10 sec: 45875.1, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 4004888576. Throughput: 0: 41208.0. Samples: 272510720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:13,381][26367] Avg episode reward: [(0, '0.404')] [2024-06-19 00:48:15,585][26599] Updated weights for policy 0, policy_version 244444 (0.0045) [2024-06-19 00:48:18,380][26367] Fps is (10 sec: 39320.7, 60 sec: 40686.8, 300 sec: 41432.6). Total num frames: 4005052416. Throughput: 0: 41119.4. Samples: 272639580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:18,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 00:48:19,776][26599] Updated weights for policy 0, policy_version 244454 (0.0032) [2024-06-19 00:48:23,295][26599] Updated weights for policy 0, policy_version 244464 (0.0034) [2024-06-19 00:48:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 4005298176. Throughput: 0: 41269.4. Samples: 272888440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:23,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 00:48:27,563][26599] Updated weights for policy 0, policy_version 244474 (0.0033) [2024-06-19 00:48:28,380][26367] Fps is (10 sec: 45875.7, 60 sec: 41506.2, 300 sec: 41710.1). Total num frames: 4005511168. Throughput: 0: 41271.1. Samples: 273138300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:28,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 00:48:31,056][26599] Updated weights for policy 0, policy_version 244484 (0.0035) [2024-06-19 00:48:33,380][26367] Fps is (10 sec: 37682.8, 60 sec: 40959.9, 300 sec: 41487.6). Total num frames: 4005675008. Throughput: 0: 41261.4. Samples: 273264240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:33,381][26367] Avg episode reward: [(0, '0.386')] [2024-06-19 00:48:35,493][26599] Updated weights for policy 0, policy_version 244494 (0.0024) [2024-06-19 00:48:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 4005904384. Throughput: 0: 41323.5. Samples: 273512680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:38,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 00:48:38,801][26599] Updated weights for policy 0, policy_version 244504 (0.0034) [2024-06-19 00:48:43,195][26599] Updated weights for policy 0, policy_version 244514 (0.0045) [2024-06-19 00:48:43,380][26367] Fps is (10 sec: 44237.0, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 4006117376. Throughput: 0: 41468.8. Samples: 273766760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:43,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 00:48:47,210][26599] Updated weights for policy 0, policy_version 244524 (0.0042) [2024-06-19 00:48:48,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41543.2). Total num frames: 4006330368. Throughput: 0: 41561.9. Samples: 273894840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:48,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 00:48:50,843][26599] Updated weights for policy 0, policy_version 244534 (0.0025) [2024-06-19 00:48:53,380][26367] Fps is (10 sec: 42599.2, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 4006543360. Throughput: 0: 41432.9. Samples: 274144500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:53,380][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 00:48:54,981][26599] Updated weights for policy 0, policy_version 244544 (0.0040) [2024-06-19 00:48:58,384][26367] Fps is (10 sec: 40945.0, 60 sec: 41230.6, 300 sec: 41653.7). Total num frames: 4006739968. Throughput: 0: 42086.5. Samples: 274404760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:48:58,385][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 00:48:58,837][26599] Updated weights for policy 0, policy_version 244554 (0.0034) [2024-06-19 00:49:01,504][26579] Signal inference workers to stop experience collection... (4000 times) [2024-06-19 00:49:01,505][26579] Signal inference workers to resume experience collection... (4000 times) [2024-06-19 00:49:01,532][26599] InferenceWorker_p0-w0: stopping experience collection (4000 times) [2024-06-19 00:49:01,532][26599] InferenceWorker_p0-w0: resuming experience collection (4000 times) [2024-06-19 00:49:02,730][26599] Updated weights for policy 0, policy_version 244564 (0.0041) [2024-06-19 00:49:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 41598.7). Total num frames: 4006952960. Throughput: 0: 41901.5. Samples: 274525140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 00:49:03,380][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 00:49:06,643][26599] Updated weights for policy 0, policy_version 244574 (0.0034) [2024-06-19 00:49:08,380][26367] Fps is (10 sec: 44252.4, 60 sec: 42052.1, 300 sec: 41765.3). Total num frames: 4007182336. Throughput: 0: 42029.2. Samples: 274779760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:08,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 00:49:10,420][26599] Updated weights for policy 0, policy_version 244584 (0.0037) [2024-06-19 00:49:13,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 4007362560. Throughput: 0: 42192.8. Samples: 275036980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:13,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 00:49:14,321][26599] Updated weights for policy 0, policy_version 244594 (0.0036) [2024-06-19 00:49:18,003][26599] Updated weights for policy 0, policy_version 244604 (0.0045) [2024-06-19 00:49:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 41820.8). Total num frames: 4007608320. Throughput: 0: 41983.6. Samples: 275153500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:18,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 00:49:22,164][26599] Updated weights for policy 0, policy_version 244614 (0.0038) [2024-06-19 00:49:23,384][26367] Fps is (10 sec: 44221.1, 60 sec: 41776.7, 300 sec: 41709.3). Total num frames: 4007804928. Throughput: 0: 42239.3. Samples: 275413600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:23,385][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 00:49:26,147][26599] Updated weights for policy 0, policy_version 244624 (0.0038) [2024-06-19 00:49:28,384][26367] Fps is (10 sec: 39309.1, 60 sec: 41503.9, 300 sec: 41653.8). Total num frames: 4008001536. Throughput: 0: 42044.2. Samples: 275658880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:28,384][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 00:49:30,312][26599] Updated weights for policy 0, policy_version 244634 (0.0030) [2024-06-19 00:49:33,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42325.4, 300 sec: 41654.2). Total num frames: 4008214528. Throughput: 0: 41872.8. Samples: 275779120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:33,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 00:49:34,051][26599] Updated weights for policy 0, policy_version 244644 (0.0046) [2024-06-19 00:49:38,024][26599] Updated weights for policy 0, policy_version 244654 (0.0030) [2024-06-19 00:49:38,380][26367] Fps is (10 sec: 42612.4, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 4008427520. Throughput: 0: 42185.3. Samples: 276042840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:38,380][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 00:49:38,528][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000244656_4008443904.pth... [2024-06-19 00:49:38,586][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000244047_3998466048.pth [2024-06-19 00:49:41,825][26599] Updated weights for policy 0, policy_version 244664 (0.0033) [2024-06-19 00:49:43,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 4008640512. Throughput: 0: 41910.1. Samples: 276290560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:43,380][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 00:49:46,067][26599] Updated weights for policy 0, policy_version 244674 (0.0038) [2024-06-19 00:49:48,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 4008869888. Throughput: 0: 42028.3. Samples: 276416420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:48,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 00:49:49,711][26599] Updated weights for policy 0, policy_version 244684 (0.0031) [2024-06-19 00:49:53,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 4009033728. Throughput: 0: 41849.5. Samples: 276662980. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:53,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 00:49:53,829][26599] Updated weights for policy 0, policy_version 244694 (0.0037) [2024-06-19 00:49:57,283][26599] Updated weights for policy 0, policy_version 244704 (0.0036) [2024-06-19 00:49:58,384][26367] Fps is (10 sec: 39307.5, 60 sec: 42052.3, 300 sec: 41709.3). Total num frames: 4009263104. Throughput: 0: 41734.5. Samples: 276915180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:49:58,385][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 00:50:01,680][26579] Signal inference workers to stop experience collection... (4050 times) [2024-06-19 00:50:01,680][26579] Signal inference workers to resume experience collection... (4050 times) [2024-06-19 00:50:01,698][26599] Updated weights for policy 0, policy_version 244714 (0.0038) [2024-06-19 00:50:01,726][26599] InferenceWorker_p0-w0: stopping experience collection (4050 times) [2024-06-19 00:50:01,727][26599] InferenceWorker_p0-w0: resuming experience collection (4050 times) [2024-06-19 00:50:03,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 41654.3). Total num frames: 4009476096. Throughput: 0: 41995.2. Samples: 277043280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:50:03,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 00:50:05,629][26599] Updated weights for policy 0, policy_version 244724 (0.0031) [2024-06-19 00:50:08,380][26367] Fps is (10 sec: 40974.8, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4009672704. Throughput: 0: 41559.3. Samples: 277283620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:50:08,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 00:50:09,386][26599] Updated weights for policy 0, policy_version 244734 (0.0039) [2024-06-19 00:50:13,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41779.4, 300 sec: 41543.2). Total num frames: 4009869312. Throughput: 0: 41708.4. 
Samples: 277535620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 19.0) [2024-06-19 00:50:13,380][26367] Avg episode reward: [(0, '0.433')] [2024-06-19 00:50:13,479][26599] Updated weights for policy 0, policy_version 244744 (0.0024) [2024-06-19 00:50:17,110][26599] Updated weights for policy 0, policy_version 244754 (0.0030) [2024-06-19 00:50:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 4010098688. Throughput: 0: 41950.3. Samples: 277666880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:50:18,381][26367] Avg episode reward: [(0, '0.328')] [2024-06-19 00:50:21,130][26599] Updated weights for policy 0, policy_version 244764 (0.0034) [2024-06-19 00:50:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41508.7, 300 sec: 41709.8). Total num frames: 4010295296. Throughput: 0: 41616.9. Samples: 277915600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:50:23,380][26367] Avg episode reward: [(0, '0.420')] [2024-06-19 00:50:24,883][26599] Updated weights for policy 0, policy_version 244774 (0.0043) [2024-06-19 00:50:28,382][26367] Fps is (10 sec: 42592.9, 60 sec: 42053.6, 300 sec: 41709.6). Total num frames: 4010524672. Throughput: 0: 41660.1. Samples: 278165320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:50:28,382][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 00:50:28,803][26599] Updated weights for policy 0, policy_version 244784 (0.0032) [2024-06-19 00:50:32,639][26599] Updated weights for policy 0, policy_version 244794 (0.0028) [2024-06-19 00:50:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 4010721280. Throughput: 0: 41645.9. Samples: 278290480. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:50:33,380][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 00:50:36,574][26599] Updated weights for policy 0, policy_version 244804 (0.0029) [2024-06-19 00:50:38,383][26367] Fps is (10 sec: 40954.0, 60 sec: 41777.3, 300 sec: 41709.4). Total num frames: 4010934272. Throughput: 0: 41581.4. Samples: 278534260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:50:38,383][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 00:50:40,533][26599] Updated weights for policy 0, policy_version 244814 (0.0025) [2024-06-19 00:50:43,384][26367] Fps is (10 sec: 40944.8, 60 sec: 41503.6, 300 sec: 41653.7). Total num frames: 4011130880. Throughput: 0: 41743.2. Samples: 278793620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:50:43,384][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 00:50:44,325][26599] Updated weights for policy 0, policy_version 244824 (0.0029) [2024-06-19 00:50:48,380][26367] Fps is (10 sec: 39332.1, 60 sec: 40960.0, 300 sec: 41543.2). Total num frames: 4011327488. Throughput: 0: 41546.1. Samples: 278912860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:50:48,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 00:50:48,762][26599] Updated weights for policy 0, policy_version 244834 (0.0032) [2024-06-19 00:50:52,094][26599] Updated weights for policy 0, policy_version 244844 (0.0027) [2024-06-19 00:50:53,380][26367] Fps is (10 sec: 44252.3, 60 sec: 42325.2, 300 sec: 41765.3). Total num frames: 4011573248. Throughput: 0: 41692.0. Samples: 279159760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:50:53,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 00:50:56,417][26599] Updated weights for policy 0, policy_version 244854 (0.0034) [2024-06-19 00:50:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41235.5, 300 sec: 41543.1). Total num frames: 4011737088. Throughput: 0: 41803.3. Samples: 279416780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:50:58,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 00:51:00,323][26599] Updated weights for policy 0, policy_version 244864 (0.0031) [2024-06-19 00:51:03,380][26367] Fps is (10 sec: 37683.6, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 4011950080. Throughput: 0: 41398.7. Samples: 279529820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:51:03,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 00:51:04,118][26599] Updated weights for policy 0, policy_version 244874 (0.0031) [2024-06-19 00:51:08,091][26599] Updated weights for policy 0, policy_version 244884 (0.0041) [2024-06-19 00:51:08,380][26367] Fps is (10 sec: 45876.3, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 4012195840. Throughput: 0: 41598.6. Samples: 279787540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:51:08,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 00:51:12,057][26599] Updated weights for policy 0, policy_version 244894 (0.0047) [2024-06-19 00:51:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41779.0, 300 sec: 41543.1). Total num frames: 4012376064. Throughput: 0: 41577.1. Samples: 280036240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:51:13,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 00:51:15,827][26599] Updated weights for policy 0, policy_version 244904 (0.0032) [2024-06-19 00:51:18,383][26367] Fps is (10 sec: 37671.2, 60 sec: 41230.9, 300 sec: 41598.2). Total num frames: 4012572672. Throughput: 0: 41493.9. Samples: 280157840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:51:18,384][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 00:51:19,829][26599] Updated weights for policy 0, policy_version 244914 (0.0040) [2024-06-19 00:51:23,380][26367] Fps is (10 sec: 42599.2, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4012802048. Throughput: 0: 41757.3. Samples: 280413220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 00:51:23,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 00:51:24,104][26599] Updated weights for policy 0, policy_version 244924 (0.0040) [2024-06-19 00:51:24,492][26579] Signal inference workers to stop experience collection... (4100 times) [2024-06-19 00:51:24,492][26579] Signal inference workers to resume experience collection... (4100 times) [2024-06-19 00:51:24,516][26599] InferenceWorker_p0-w0: stopping experience collection (4100 times) [2024-06-19 00:51:24,516][26599] InferenceWorker_p0-w0: resuming experience collection (4100 times) [2024-06-19 00:51:27,747][26599] Updated weights for policy 0, policy_version 244934 (0.0030) [2024-06-19 00:51:28,380][26367] Fps is (10 sec: 44250.5, 60 sec: 41507.0, 300 sec: 41654.2). Total num frames: 4013015040. Throughput: 0: 41535.3. Samples: 280662560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:51:28,381][26367] Avg episode reward: [(0, '0.390')] [2024-06-19 00:51:32,101][26599] Updated weights for policy 0, policy_version 244944 (0.0043) [2024-06-19 00:51:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 4013228032. Throughput: 0: 41649.8. Samples: 280787100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:51:33,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 00:51:35,738][26599] Updated weights for policy 0, policy_version 244954 (0.0033) [2024-06-19 00:51:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41508.0, 300 sec: 41654.2). Total num frames: 4013424640. Throughput: 0: 41813.0. Samples: 281041340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:51:38,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 00:51:38,451][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000244961_4013441024.pth... 
[2024-06-19 00:51:38,506][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000244349_4003414016.pth [2024-06-19 00:51:39,875][26599] Updated weights for policy 0, policy_version 244964 (0.0033) [2024-06-19 00:51:43,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41508.5, 300 sec: 41487.6). Total num frames: 4013621248. Throughput: 0: 41586.7. Samples: 281288180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:51:43,381][26367] Avg episode reward: [(0, '0.402')] [2024-06-19 00:51:43,975][26599] Updated weights for policy 0, policy_version 244974 (0.0036) [2024-06-19 00:51:47,582][26599] Updated weights for policy 0, policy_version 244984 (0.0039) [2024-06-19 00:51:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 41820.8). Total num frames: 4013867008. Throughput: 0: 41866.2. Samples: 281413800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:51:48,384][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 00:51:51,717][26599] Updated weights for policy 0, policy_version 244994 (0.0036) [2024-06-19 00:51:53,380][26367] Fps is (10 sec: 42599.2, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 4014047232. Throughput: 0: 41865.7. Samples: 281671500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:51:53,381][26367] Avg episode reward: [(0, '0.367')] [2024-06-19 00:51:55,361][26599] Updated weights for policy 0, policy_version 245004 (0.0036) [2024-06-19 00:51:58,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.4, 300 sec: 41598.7). Total num frames: 4014260224. Throughput: 0: 41921.0. Samples: 281922680. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:51:58,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 00:51:59,738][26599] Updated weights for policy 0, policy_version 245014 (0.0038) [2024-06-19 00:52:03,093][26599] Updated weights for policy 0, policy_version 245024 (0.0046) [2024-06-19 00:52:03,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 4014489600. Throughput: 0: 41968.6. Samples: 282046300. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:52:03,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 00:52:07,598][26599] Updated weights for policy 0, policy_version 245034 (0.0028) [2024-06-19 00:52:08,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41598.7). Total num frames: 4014669824. Throughput: 0: 41993.4. Samples: 282302920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:52:08,380][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 00:52:10,832][26599] Updated weights for policy 0, policy_version 245044 (0.0033) [2024-06-19 00:52:13,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 41654.2). Total num frames: 4014899200. Throughput: 0: 41874.3. Samples: 282546900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:52:13,380][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 00:52:15,477][26599] Updated weights for policy 0, policy_version 245054 (0.0033) [2024-06-19 00:52:18,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42054.4, 300 sec: 41820.8). Total num frames: 4015095808. Throughput: 0: 41958.6. Samples: 282675240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:52:18,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 00:52:18,680][26599] Updated weights for policy 0, policy_version 245064 (0.0029) [2024-06-19 00:52:23,202][26599] Updated weights for policy 0, policy_version 245074 (0.0037) [2024-06-19 00:52:23,380][26367] Fps is (10 sec: 39320.6, 60 sec: 41506.0, 300 sec: 41598.7). 
Total num frames: 4015292416. Throughput: 0: 41818.1. Samples: 282923160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:52:23,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 00:52:26,678][26599] Updated weights for policy 0, policy_version 245084 (0.0038) [2024-06-19 00:52:28,384][26367] Fps is (10 sec: 42582.8, 60 sec: 41776.6, 300 sec: 41709.2). Total num frames: 4015521792. Throughput: 0: 41836.3. Samples: 283170960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:52:28,385][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 00:52:30,961][26599] Updated weights for policy 0, policy_version 245094 (0.0033) [2024-06-19 00:52:33,380][26367] Fps is (10 sec: 44237.7, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 4015734784. Throughput: 0: 41873.9. Samples: 283298120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 00:52:33,380][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 00:52:34,419][26599] Updated weights for policy 0, policy_version 245104 (0.0039) [2024-06-19 00:52:38,380][26367] Fps is (10 sec: 39336.6, 60 sec: 41506.2, 300 sec: 41543.2). Total num frames: 4015915008. Throughput: 0: 41705.0. Samples: 283548220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:52:38,380][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 00:52:38,747][26599] Updated weights for policy 0, policy_version 245114 (0.0036) [2024-06-19 00:52:39,760][26579] Signal inference workers to stop experience collection... (4150 times) [2024-06-19 00:52:39,761][26579] Signal inference workers to resume experience collection... 
(4150 times) [2024-06-19 00:52:39,775][26599] InferenceWorker_p0-w0: stopping experience collection (4150 times) [2024-06-19 00:52:39,776][26599] InferenceWorker_p0-w0: resuming experience collection (4150 times) [2024-06-19 00:52:42,279][26599] Updated weights for policy 0, policy_version 245124 (0.0025) [2024-06-19 00:52:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 41765.3). Total num frames: 4016144384. Throughput: 0: 41624.1. Samples: 283795760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:52:43,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 00:52:46,647][26599] Updated weights for policy 0, policy_version 245134 (0.0040) [2024-06-19 00:52:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 4016340992. Throughput: 0: 41646.3. Samples: 283920380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:52:48,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 00:52:49,917][26599] Updated weights for policy 0, policy_version 245144 (0.0037) [2024-06-19 00:52:53,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 4016570368. Throughput: 0: 41627.6. Samples: 284176160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:52:53,380][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 00:52:54,797][26599] Updated weights for policy 0, policy_version 245154 (0.0032) [2024-06-19 00:52:57,656][26599] Updated weights for policy 0, policy_version 245164 (0.0035) [2024-06-19 00:52:58,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4016783360. Throughput: 0: 41550.9. Samples: 284416700. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:52:58,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 00:53:02,543][26599] Updated weights for policy 0, policy_version 245174 (0.0042) [2024-06-19 00:53:03,382][26367] Fps is (10 sec: 39315.9, 60 sec: 41232.2, 300 sec: 41709.6). Total num frames: 4016963584. Throughput: 0: 41562.4. Samples: 284545600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:53:03,382][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 00:53:05,583][26599] Updated weights for policy 0, policy_version 245184 (0.0035) [2024-06-19 00:53:08,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 41709.8). Total num frames: 4017192960. Throughput: 0: 41668.2. Samples: 284798220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:53:08,381][26367] Avg episode reward: [(0, '0.304')] [2024-06-19 00:53:10,464][26599] Updated weights for policy 0, policy_version 245194 (0.0033) [2024-06-19 00:53:13,126][26599] Updated weights for policy 0, policy_version 245204 (0.0035) [2024-06-19 00:53:13,380][26367] Fps is (10 sec: 45881.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4017422336. Throughput: 0: 41586.9. Samples: 285042220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:53:13,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 00:53:18,056][26599] Updated weights for policy 0, policy_version 245214 (0.0034) [2024-06-19 00:53:18,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41654.3). Total num frames: 4017586176. Throughput: 0: 41725.8. Samples: 285175780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:53:18,380][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 00:53:21,110][26599] Updated weights for policy 0, policy_version 245224 (0.0025) [2024-06-19 00:53:23,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 4017831936. Throughput: 0: 41666.5. Samples: 285423220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:53:23,381][26367] Avg episode reward: [(0, '0.386')] [2024-06-19 00:53:26,505][26599] Updated weights for policy 0, policy_version 245234 (0.0047) [2024-06-19 00:53:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41781.8, 300 sec: 41876.4). Total num frames: 4018028544. Throughput: 0: 41822.2. Samples: 285677760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:53:28,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 00:53:29,002][26599] Updated weights for policy 0, policy_version 245244 (0.0026) [2024-06-19 00:53:33,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 4018208768. Throughput: 0: 41717.2. Samples: 285797660. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:53:33,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 00:53:34,144][26599] Updated weights for policy 0, policy_version 245254 (0.0035) [2024-06-19 00:53:36,742][26599] Updated weights for policy 0, policy_version 245264 (0.0047) [2024-06-19 00:53:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 4018454528. Throughput: 0: 41736.4. Samples: 286054300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:53:38,380][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 00:53:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000245267_4018454528.pth... [2024-06-19 00:53:38,478][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000244656_4008443904.pth [2024-06-19 00:53:41,847][26599] Updated weights for policy 0, policy_version 245274 (0.0030) [2024-06-19 00:53:43,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 4018651136. Throughput: 0: 42129.4. Samples: 286312520. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 00:53:43,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 00:53:44,445][26599] Updated weights for policy 0, policy_version 245284 (0.0030) [2024-06-19 00:53:48,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 4018864128. Throughput: 0: 42039.9. Samples: 286437340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:53:48,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 00:53:49,717][26599] Updated weights for policy 0, policy_version 245294 (0.0036) [2024-06-19 00:53:51,422][26579] Signal inference workers to stop experience collection... (4200 times) [2024-06-19 00:53:51,423][26579] Signal inference workers to resume experience collection... (4200 times) [2024-06-19 00:53:51,452][26599] InferenceWorker_p0-w0: stopping experience collection (4200 times) [2024-06-19 00:53:51,452][26599] InferenceWorker_p0-w0: resuming experience collection (4200 times) [2024-06-19 00:53:52,071][26599] Updated weights for policy 0, policy_version 245304 (0.0034) [2024-06-19 00:53:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41821.4). Total num frames: 4019077120. Throughput: 0: 41873.3. Samples: 286682520. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:53:53,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 00:53:57,478][26599] Updated weights for policy 0, policy_version 245314 (0.0030) [2024-06-19 00:53:58,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 4019257344. Throughput: 0: 42334.3. Samples: 286947260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:53:58,380][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 00:53:59,914][26599] Updated weights for policy 0, policy_version 245324 (0.0041) [2024-06-19 00:54:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42326.3, 300 sec: 41765.3). Total num frames: 4019503104. Throughput: 0: 41992.0. 
Samples: 287065420. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:03,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 00:54:05,365][26599] Updated weights for policy 0, policy_version 245334 (0.0034) [2024-06-19 00:54:07,546][26599] Updated weights for policy 0, policy_version 245344 (0.0035) [2024-06-19 00:54:08,380][26367] Fps is (10 sec: 47513.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4019732480. Throughput: 0: 42146.7. Samples: 287319820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:08,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 00:54:13,047][26599] Updated weights for policy 0, policy_version 245354 (0.0032) [2024-06-19 00:54:13,380][26367] Fps is (10 sec: 37682.7, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 4019879936. Throughput: 0: 42211.0. Samples: 287577260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:13,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 00:54:15,460][26599] Updated weights for policy 0, policy_version 245364 (0.0040) [2024-06-19 00:54:18,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 41765.8). Total num frames: 4020125696. Throughput: 0: 42005.9. Samples: 287687920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:18,380][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 00:54:20,687][26599] Updated weights for policy 0, policy_version 245374 (0.0032) [2024-06-19 00:54:23,356][26599] Updated weights for policy 0, policy_version 245384 (0.0049) [2024-06-19 00:54:23,384][26367] Fps is (10 sec: 49134.5, 60 sec: 42322.9, 300 sec: 41931.9). Total num frames: 4020371456. Throughput: 0: 42037.0. Samples: 287946120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:23,384][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 00:54:28,380][26367] Fps is (10 sec: 37682.9, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 4020502528. Throughput: 0: 41965.4. 
Samples: 288200960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:28,384][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 00:54:28,683][26599] Updated weights for policy 0, policy_version 245394 (0.0038) [2024-06-19 00:54:31,245][26599] Updated weights for policy 0, policy_version 245404 (0.0035) [2024-06-19 00:54:33,380][26367] Fps is (10 sec: 37697.1, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 4020748288. Throughput: 0: 41694.7. Samples: 288313600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:33,381][26367] Avg episode reward: [(0, '0.364')] [2024-06-19 00:54:36,810][26599] Updated weights for policy 0, policy_version 245414 (0.0034) [2024-06-19 00:54:38,380][26367] Fps is (10 sec: 47513.1, 60 sec: 42052.1, 300 sec: 41820.8). Total num frames: 4020977664. Throughput: 0: 42204.8. Samples: 288581740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:38,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 00:54:38,898][26599] Updated weights for policy 0, policy_version 245424 (0.0034) [2024-06-19 00:54:43,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 4021157888. Throughput: 0: 41776.3. Samples: 288827200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:43,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 00:54:44,430][26599] Updated weights for policy 0, policy_version 245434 (0.0034) [2024-06-19 00:54:46,764][26599] Updated weights for policy 0, policy_version 245444 (0.0039) [2024-06-19 00:54:48,381][26367] Fps is (10 sec: 40959.0, 60 sec: 42052.0, 300 sec: 41876.3). Total num frames: 4021387264. Throughput: 0: 41926.7. Samples: 288952140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:48,382][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 00:54:48,876][26579] Signal inference workers to stop experience collection... 
(4250 times) [2024-06-19 00:54:48,883][26579] Signal inference workers to resume experience collection... (4250 times) [2024-06-19 00:54:48,906][26599] InferenceWorker_p0-w0: stopping experience collection (4250 times) [2024-06-19 00:54:48,907][26599] InferenceWorker_p0-w0: resuming experience collection (4250 times) [2024-06-19 00:54:52,114][26599] Updated weights for policy 0, policy_version 245454 (0.0031) [2024-06-19 00:54:53,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41506.2, 300 sec: 41710.3). Total num frames: 4021567488. Throughput: 0: 41964.6. Samples: 289208220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 00:54:53,380][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 00:54:54,643][26599] Updated weights for policy 0, policy_version 245464 (0.0031) [2024-06-19 00:54:58,380][26367] Fps is (10 sec: 40961.5, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 4021796864. Throughput: 0: 41791.6. Samples: 289457880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:54:58,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 00:54:59,745][26599] Updated weights for policy 0, policy_version 245474 (0.0030) [2024-06-19 00:55:02,533][26599] Updated weights for policy 0, policy_version 245484 (0.0030) [2024-06-19 00:55:03,382][26367] Fps is (10 sec: 45869.1, 60 sec: 42051.3, 300 sec: 41876.2). Total num frames: 4022026240. Throughput: 0: 42197.0. Samples: 289586840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:03,382][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 00:55:07,377][26599] Updated weights for policy 0, policy_version 245494 (0.0037) [2024-06-19 00:55:08,380][26367] Fps is (10 sec: 39321.1, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 4022190080. Throughput: 0: 42010.8. Samples: 289836460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:08,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 00:55:10,439][26599] Updated weights for policy 0, policy_version 245504 (0.0038) [2024-06-19 00:55:13,380][26367] Fps is (10 sec: 39326.7, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 4022419456. Throughput: 0: 41805.0. Samples: 290082180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:13,380][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 00:55:15,082][26599] Updated weights for policy 0, policy_version 245514 (0.0036) [2024-06-19 00:55:18,372][26599] Updated weights for policy 0, policy_version 245524 (0.0040) [2024-06-19 00:55:18,380][26367] Fps is (10 sec: 47514.2, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4022665216. Throughput: 0: 42259.5. Samples: 290215280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:18,381][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 00:55:23,380][26367] Fps is (10 sec: 39321.4, 60 sec: 40689.4, 300 sec: 41654.4). Total num frames: 4022812672. Throughput: 0: 41801.9. Samples: 290462820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:23,381][26367] Avg episode reward: [(0, '0.830')] [2024-06-19 00:55:23,435][26599] Updated weights for policy 0, policy_version 245534 (0.0033) [2024-06-19 00:55:26,159][26599] Updated weights for policy 0, policy_version 245544 (0.0032) [2024-06-19 00:55:28,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 41820.8). Total num frames: 4023058432. Throughput: 0: 41940.5. Samples: 290714520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:28,381][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 00:55:31,077][26599] Updated weights for policy 0, policy_version 245554 (0.0033) [2024-06-19 00:55:33,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 41821.3). Total num frames: 4023271424. Throughput: 0: 42140.4. Samples: 290848440. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:33,381][26367] Avg episode reward: [(0, '0.773')] [2024-06-19 00:55:34,046][26599] Updated weights for policy 0, policy_version 245564 (0.0042) [2024-06-19 00:55:38,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41821.4). Total num frames: 4023468032. Throughput: 0: 41879.8. Samples: 291092820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:38,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 00:55:38,413][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000245573_4023468032.pth... [2024-06-19 00:55:38,472][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000244961_4013441024.pth [2024-06-19 00:55:38,705][26599] Updated weights for policy 0, policy_version 245574 (0.0042) [2024-06-19 00:55:41,891][26599] Updated weights for policy 0, policy_version 245584 (0.0035) [2024-06-19 00:55:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4023697408. Throughput: 0: 41849.8. Samples: 291341120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:43,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 00:55:46,611][26599] Updated weights for policy 0, policy_version 245594 (0.0037) [2024-06-19 00:55:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.5, 300 sec: 41820.9). Total num frames: 4023910400. Throughput: 0: 41766.8. Samples: 291466300. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:48,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 00:55:49,600][26599] Updated weights for policy 0, policy_version 245604 (0.0040) [2024-06-19 00:55:53,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 41932.0). Total num frames: 4024107008. Throughput: 0: 41841.9. Samples: 291719340. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:53,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 00:55:54,219][26599] Updated weights for policy 0, policy_version 245614 (0.0040) [2024-06-19 00:55:57,638][26599] Updated weights for policy 0, policy_version 245624 (0.0030) [2024-06-19 00:55:58,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4024320000. Throughput: 0: 41788.9. Samples: 291962680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:55:58,380][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 00:56:01,913][26599] Updated weights for policy 0, policy_version 245634 (0.0034) [2024-06-19 00:56:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41507.0, 300 sec: 41765.3). Total num frames: 4024516608. Throughput: 0: 41738.2. Samples: 292093500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 00:56:03,381][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 00:56:05,203][26599] Updated weights for policy 0, policy_version 245644 (0.0033) [2024-06-19 00:56:08,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 4024729600. Throughput: 0: 41838.2. Samples: 292345540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:08,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 00:56:10,186][26599] Updated weights for policy 0, policy_version 245654 (0.0048) [2024-06-19 00:56:12,399][26579] Signal inference workers to stop experience collection... (4300 times) [2024-06-19 00:56:12,432][26599] InferenceWorker_p0-w0: stopping experience collection (4300 times) [2024-06-19 00:56:12,456][26579] Signal inference workers to resume experience collection... 
(4300 times) [2024-06-19 00:56:12,460][26599] InferenceWorker_p0-w0: resuming experience collection (4300 times) [2024-06-19 00:56:13,234][26599] Updated weights for policy 0, policy_version 245664 (0.0037) [2024-06-19 00:56:13,384][26367] Fps is (10 sec: 44220.6, 60 sec: 42322.7, 300 sec: 41987.4). Total num frames: 4024958976. Throughput: 0: 41798.8. Samples: 292595620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:13,385][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 00:56:17,939][26599] Updated weights for policy 0, policy_version 245674 (0.0030) [2024-06-19 00:56:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 4025139200. Throughput: 0: 41635.1. Samples: 292722020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:18,380][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 00:56:21,318][26599] Updated weights for policy 0, policy_version 245684 (0.0024) [2024-06-19 00:56:23,380][26367] Fps is (10 sec: 39336.2, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 4025352192. Throughput: 0: 41707.7. Samples: 292969660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:23,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 00:56:25,557][26599] Updated weights for policy 0, policy_version 245694 (0.0042) [2024-06-19 00:56:28,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4025565184. Throughput: 0: 41798.6. Samples: 293222060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:28,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 00:56:29,199][26599] Updated weights for policy 0, policy_version 245704 (0.0042) [2024-06-19 00:56:33,151][26599] Updated weights for policy 0, policy_version 245714 (0.0033) [2024-06-19 00:56:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4025778176. Throughput: 0: 41780.1. Samples: 293346400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:33,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 00:56:37,142][26599] Updated weights for policy 0, policy_version 245724 (0.0034) [2024-06-19 00:56:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4025991168. Throughput: 0: 41667.1. Samples: 293594360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:38,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 00:56:40,821][26599] Updated weights for policy 0, policy_version 245734 (0.0030) [2024-06-19 00:56:43,384][26367] Fps is (10 sec: 42582.9, 60 sec: 41776.6, 300 sec: 41820.3). Total num frames: 4026204160. Throughput: 0: 41741.9. Samples: 293841220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:43,385][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 00:56:44,958][26599] Updated weights for policy 0, policy_version 245744 (0.0037) [2024-06-19 00:56:48,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41503.7, 300 sec: 41875.9). Total num frames: 4026400768. Throughput: 0: 41734.8. Samples: 293971720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:48,385][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 00:56:48,932][26599] Updated weights for policy 0, policy_version 245754 (0.0032) [2024-06-19 00:56:53,009][26599] Updated weights for policy 0, policy_version 245764 (0.0042) [2024-06-19 00:56:53,380][26367] Fps is (10 sec: 40974.8, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4026613760. Throughput: 0: 41698.2. Samples: 294221960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:53,381][26367] Avg episode reward: [(0, '0.468')] [2024-06-19 00:56:56,723][26599] Updated weights for policy 0, policy_version 245774 (0.0031) [2024-06-19 00:56:58,380][26367] Fps is (10 sec: 42614.2, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4026826752. Throughput: 0: 41649.7. Samples: 294469700. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:56:58,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 00:57:00,878][26599] Updated weights for policy 0, policy_version 245784 (0.0035) [2024-06-19 00:57:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4027023360. Throughput: 0: 41669.7. Samples: 294597160. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:57:03,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 00:57:04,549][26599] Updated weights for policy 0, policy_version 245794 (0.0029) [2024-06-19 00:57:08,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 4027236352. Throughput: 0: 41738.5. Samples: 294847900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:57:08,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 00:57:08,586][26599] Updated weights for policy 0, policy_version 245804 (0.0028) [2024-06-19 00:57:12,607][26599] Updated weights for policy 0, policy_version 245814 (0.0029) [2024-06-19 00:57:13,384][26367] Fps is (10 sec: 44220.9, 60 sec: 41779.2, 300 sec: 41931.4). Total num frames: 4027465728. Throughput: 0: 41641.1. Samples: 295096060. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 00:57:13,385][26367] Avg episode reward: [(0, '0.320')] [2024-06-19 00:57:16,303][26599] Updated weights for policy 0, policy_version 245824 (0.0036) [2024-06-19 00:57:18,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4027645952. Throughput: 0: 41756.9. Samples: 295225460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:57:18,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 00:57:20,373][26599] Updated weights for policy 0, policy_version 245834 (0.0036) [2024-06-19 00:57:23,380][26367] Fps is (10 sec: 39336.1, 60 sec: 41779.2, 300 sec: 41821.4). Total num frames: 4027858944. Throughput: 0: 41633.9. Samples: 295467880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:57:23,381][26367] Avg episode reward: [(0, '0.415')] [2024-06-19 00:57:24,478][26599] Updated weights for policy 0, policy_version 245844 (0.0042) [2024-06-19 00:57:28,210][26599] Updated weights for policy 0, policy_version 245854 (0.0040) [2024-06-19 00:57:28,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4028088320. Throughput: 0: 41927.3. Samples: 295727800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:57:28,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 00:57:32,389][26599] Updated weights for policy 0, policy_version 245864 (0.0034) [2024-06-19 00:57:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4028284928. Throughput: 0: 41805.3. Samples: 295852800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:57:33,380][26367] Avg episode reward: [(0, '0.265')] [2024-06-19 00:57:35,884][26599] Updated weights for policy 0, policy_version 245874 (0.0028) [2024-06-19 00:57:38,381][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4028497920. Throughput: 0: 41716.7. Samples: 296099220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:57:38,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 00:57:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000245880_4028497920.pth... [2024-06-19 00:57:38,451][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000245267_4018454528.pth [2024-06-19 00:57:40,342][26599] Updated weights for policy 0, policy_version 245884 (0.0033) [2024-06-19 00:57:43,380][26367] Fps is (10 sec: 40959.1, 60 sec: 41508.6, 300 sec: 41876.4). Total num frames: 4028694528. Throughput: 0: 41914.5. Samples: 296355860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:57:43,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 00:57:43,769][26599] Updated weights for policy 0, policy_version 245894 (0.0030) [2024-06-19 00:57:46,971][26579] Signal inference workers to stop experience collection... (4350 times) [2024-06-19 00:57:46,995][26599] InferenceWorker_p0-w0: stopping experience collection (4350 times) [2024-06-19 00:57:47,086][26579] Signal inference workers to resume experience collection... (4350 times) [2024-06-19 00:57:47,086][26599] InferenceWorker_p0-w0: resuming experience collection (4350 times) [2024-06-19 00:57:48,257][26599] Updated weights for policy 0, policy_version 245904 (0.0035) [2024-06-19 00:57:48,380][26367] Fps is (10 sec: 39322.6, 60 sec: 41508.7, 300 sec: 41765.3). Total num frames: 4028891136. Throughput: 0: 41680.0. Samples: 296472760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:57:48,380][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 00:57:51,755][26599] Updated weights for policy 0, policy_version 245914 (0.0033) [2024-06-19 00:57:53,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 4029153280. Throughput: 0: 41626.8. Samples: 296721100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:57:53,381][26367] Avg episode reward: [(0, '0.303')] [2024-06-19 00:57:56,066][26599] Updated weights for policy 0, policy_version 245924 (0.0026) [2024-06-19 00:57:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41821.0). Total num frames: 4029300736. Throughput: 0: 41904.3. Samples: 296981600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:57:58,380][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 00:57:59,431][26599] Updated weights for policy 0, policy_version 245934 (0.0034) [2024-06-19 00:58:03,384][26367] Fps is (10 sec: 34393.8, 60 sec: 41230.6, 300 sec: 41709.3). Total num frames: 4029497344. Throughput: 0: 41633.0. 
Samples: 297099100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:58:03,385][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 00:58:03,765][26599] Updated weights for policy 0, policy_version 245944 (0.0042) [2024-06-19 00:58:07,596][26599] Updated weights for policy 0, policy_version 245954 (0.0043) [2024-06-19 00:58:08,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 4029775872. Throughput: 0: 41957.3. Samples: 297355960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:58:08,381][26367] Avg episode reward: [(0, '0.393')] [2024-06-19 00:58:12,083][26599] Updated weights for policy 0, policy_version 245964 (0.0052) [2024-06-19 00:58:13,380][26367] Fps is (10 sec: 44252.9, 60 sec: 41235.6, 300 sec: 41876.4). Total num frames: 4029939712. Throughput: 0: 41691.6. Samples: 297603920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:58:13,381][26367] Avg episode reward: [(0, '0.364')] [2024-06-19 00:58:15,314][26599] Updated weights for policy 0, policy_version 245974 (0.0041) [2024-06-19 00:58:18,380][26367] Fps is (10 sec: 36044.7, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 4030136320. Throughput: 0: 41617.2. Samples: 297725580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:58:18,381][26367] Avg episode reward: [(0, '0.375')] [2024-06-19 00:58:19,769][26599] Updated weights for policy 0, policy_version 245984 (0.0030) [2024-06-19 00:58:22,911][26599] Updated weights for policy 0, policy_version 245994 (0.0033) [2024-06-19 00:58:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4030382080. Throughput: 0: 41814.9. Samples: 297980880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 00:58:23,381][26367] Avg episode reward: [(0, '0.373')] [2024-06-19 00:58:27,447][26599] Updated weights for policy 0, policy_version 246004 (0.0032) [2024-06-19 00:58:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4030578688. Throughput: 0: 41846.7. Samples: 298238960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:58:28,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 00:58:30,637][26599] Updated weights for policy 0, policy_version 246014 (0.0037) [2024-06-19 00:58:33,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4030791680. Throughput: 0: 41815.6. Samples: 298354460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:58:33,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 00:58:35,339][26599] Updated weights for policy 0, policy_version 246024 (0.0035) [2024-06-19 00:58:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.4, 300 sec: 41876.4). Total num frames: 4031004672. Throughput: 0: 42013.7. Samples: 298611720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:58:38,381][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 00:58:38,400][26599] Updated weights for policy 0, policy_version 246034 (0.0033) [2024-06-19 00:58:43,122][26599] Updated weights for policy 0, policy_version 246044 (0.0039) [2024-06-19 00:58:43,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4031184896. Throughput: 0: 41885.2. Samples: 298866440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:58:43,384][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 00:58:46,113][26599] Updated weights for policy 0, policy_version 246054 (0.0034) [2024-06-19 00:58:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 4031414272. Throughput: 0: 41864.6. Samples: 298982860. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:58:48,381][26367] Avg episode reward: [(0, '0.392')] [2024-06-19 00:58:50,978][26599] Updated weights for policy 0, policy_version 246064 (0.0036) [2024-06-19 00:58:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 4031627264. Throughput: 0: 41769.4. Samples: 299235580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:58:53,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 00:58:54,339][26599] Updated weights for policy 0, policy_version 246074 (0.0035) [2024-06-19 00:58:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 4031823872. Throughput: 0: 41730.2. Samples: 299481780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:58:58,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 00:58:58,682][26599] Updated weights for policy 0, policy_version 246084 (0.0028) [2024-06-19 00:58:59,692][26579] Signal inference workers to stop experience collection... (4400 times) [2024-06-19 00:58:59,742][26599] InferenceWorker_p0-w0: stopping experience collection (4400 times) [2024-06-19 00:58:59,747][26579] Signal inference workers to resume experience collection... (4400 times) [2024-06-19 00:58:59,755][26599] InferenceWorker_p0-w0: resuming experience collection (4400 times) [2024-06-19 00:59:01,863][26599] Updated weights for policy 0, policy_version 246094 (0.0039) [2024-06-19 00:59:03,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42327.8, 300 sec: 41709.8). Total num frames: 4032036864. Throughput: 0: 41808.0. Samples: 299606940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:59:03,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 00:59:06,573][26599] Updated weights for policy 0, policy_version 246104 (0.0036) [2024-06-19 00:59:08,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4032266240. Throughput: 0: 41784.8. 
Samples: 299861200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:59:08,381][26367] Avg episode reward: [(0, '0.285')] [2024-06-19 00:59:09,756][26599] Updated weights for policy 0, policy_version 246114 (0.0046) [2024-06-19 00:59:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 4032446464. Throughput: 0: 41596.8. Samples: 300110820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:59:13,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 00:59:14,445][26599] Updated weights for policy 0, policy_version 246124 (0.0044) [2024-06-19 00:59:17,553][26599] Updated weights for policy 0, policy_version 246134 (0.0039) [2024-06-19 00:59:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41710.3). Total num frames: 4032675840. Throughput: 0: 41759.0. Samples: 300233620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:59:18,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 00:59:22,107][26599] Updated weights for policy 0, policy_version 246144 (0.0033) [2024-06-19 00:59:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4032872448. Throughput: 0: 41484.0. Samples: 300478500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:59:23,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 00:59:25,643][26599] Updated weights for policy 0, policy_version 246154 (0.0033) [2024-06-19 00:59:28,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 4033052672. Throughput: 0: 41484.9. Samples: 300733260. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:59:28,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 00:59:30,305][26599] Updated weights for policy 0, policy_version 246164 (0.0030) [2024-06-19 00:59:33,367][26599] Updated weights for policy 0, policy_version 246174 (0.0032) [2024-06-19 00:59:33,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 4033314816. Throughput: 0: 41559.3. Samples: 300853020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 00:59:33,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 00:59:38,027][26599] Updated weights for policy 0, policy_version 246184 (0.0047) [2024-06-19 00:59:38,380][26367] Fps is (10 sec: 45875.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4033511424. Throughput: 0: 41814.7. Samples: 301117240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 00:59:38,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 00:59:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000246186_4033511424.pth... [2024-06-19 00:59:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000245573_4023468032.pth [2024-06-19 00:59:41,119][26599] Updated weights for policy 0, policy_version 246194 (0.0030) [2024-06-19 00:59:43,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4033691648. Throughput: 0: 41869.3. Samples: 301365900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 00:59:43,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 00:59:45,647][26599] Updated weights for policy 0, policy_version 246204 (0.0037) [2024-06-19 00:59:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4033937408. Throughput: 0: 41872.0. Samples: 301491180. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 00:59:48,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 00:59:48,989][26599] Updated weights for policy 0, policy_version 246214 (0.0040) [2024-06-19 00:59:53,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4034117632. Throughput: 0: 41818.8. Samples: 301743040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 00:59:53,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 00:59:53,439][26599] Updated weights for policy 0, policy_version 246224 (0.0033) [2024-06-19 00:59:57,079][26599] Updated weights for policy 0, policy_version 246234 (0.0036) [2024-06-19 00:59:58,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 41765.5). Total num frames: 4034347008. Throughput: 0: 41875.3. Samples: 301995200. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 00:59:58,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 01:00:01,106][26599] Updated weights for policy 0, policy_version 246244 (0.0043) [2024-06-19 01:00:03,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42325.5, 300 sec: 41987.5). Total num frames: 4034576384. Throughput: 0: 42015.2. Samples: 302124300. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 01:00:03,380][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 01:00:04,798][26599] Updated weights for policy 0, policy_version 246254 (0.0041) [2024-06-19 01:00:08,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 4034756608. Throughput: 0: 42091.0. Samples: 302372600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 01:00:08,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 01:00:08,867][26599] Updated weights for policy 0, policy_version 246264 (0.0041) [2024-06-19 01:00:12,530][26599] Updated weights for policy 0, policy_version 246274 (0.0031) [2024-06-19 01:00:13,381][26367] Fps is (10 sec: 39319.5, 60 sec: 42052.1, 300 sec: 41709.7). 
Total num frames: 4034969600. Throughput: 0: 41976.5. Samples: 302622220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 01:00:13,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 01:00:16,911][26599] Updated weights for policy 0, policy_version 246284 (0.0028) [2024-06-19 01:00:17,586][26579] Signal inference workers to stop experience collection... (4450 times) [2024-06-19 01:00:17,586][26579] Signal inference workers to resume experience collection... (4450 times) [2024-06-19 01:00:17,618][26599] InferenceWorker_p0-w0: stopping experience collection (4450 times) [2024-06-19 01:00:17,618][26599] InferenceWorker_p0-w0: resuming experience collection (4450 times) [2024-06-19 01:00:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4035166208. Throughput: 0: 42181.3. Samples: 302751180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 01:00:18,380][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 01:00:20,802][26599] Updated weights for policy 0, policy_version 246294 (0.0048) [2024-06-19 01:00:23,380][26367] Fps is (10 sec: 40962.1, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 4035379200. Throughput: 0: 41898.3. Samples: 303002660. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 01:00:23,380][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 01:00:24,507][26599] Updated weights for policy 0, policy_version 246304 (0.0048) [2024-06-19 01:00:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 4035592192. Throughput: 0: 41922.2. Samples: 303252400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 01:00:28,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 01:00:28,394][26599] Updated weights for policy 0, policy_version 246314 (0.0029) [2024-06-19 01:00:32,127][26599] Updated weights for policy 0, policy_version 246324 (0.0048) [2024-06-19 01:00:33,380][26367] Fps is (10 sec: 44235.9, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4035821568. Throughput: 0: 42030.2. Samples: 303382540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 01:00:33,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 01:00:36,310][26599] Updated weights for policy 0, policy_version 246334 (0.0033) [2024-06-19 01:00:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4036018176. Throughput: 0: 42088.9. Samples: 303637040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 01:00:38,381][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 01:00:39,749][26599] Updated weights for policy 0, policy_version 246344 (0.0032) [2024-06-19 01:00:43,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 4036231168. Throughput: 0: 42108.5. Samples: 303890080. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 01:00:43,380][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 01:00:44,181][26599] Updated weights for policy 0, policy_version 246354 (0.0032) [2024-06-19 01:00:47,708][26599] Updated weights for policy 0, policy_version 246364 (0.0036) [2024-06-19 01:00:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4036444160. Throughput: 0: 41918.1. Samples: 304010620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:00:48,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 01:00:51,776][26599] Updated weights for policy 0, policy_version 246374 (0.0031) [2024-06-19 01:00:53,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41765.3). 
Total num frames: 4036640768. Throughput: 0: 42122.8. Samples: 304268120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:00:53,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 01:00:55,357][26599] Updated weights for policy 0, policy_version 246384 (0.0038) [2024-06-19 01:00:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4036853760. Throughput: 0: 42190.6. Samples: 304520780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:00:58,383][26367] Avg episode reward: [(0, '0.337')] [2024-06-19 01:00:59,371][26599] Updated weights for policy 0, policy_version 246394 (0.0042) [2024-06-19 01:01:03,145][26599] Updated weights for policy 0, policy_version 246404 (0.0041) [2024-06-19 01:01:03,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4037083136. Throughput: 0: 42123.4. Samples: 304646740. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:03,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 01:01:07,185][26599] Updated weights for policy 0, policy_version 246414 (0.0028) [2024-06-19 01:01:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41710.3). Total num frames: 4037263360. Throughput: 0: 41913.3. Samples: 304888760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:08,380][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 01:01:11,009][26599] Updated weights for policy 0, policy_version 246424 (0.0036) [2024-06-19 01:01:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.6, 300 sec: 41876.4). Total num frames: 4037492736. Throughput: 0: 41844.5. Samples: 305135400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:13,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 01:01:15,197][26599] Updated weights for policy 0, policy_version 246434 (0.0038) [2024-06-19 01:01:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41820.9). 
Total num frames: 4037689344. Throughput: 0: 41861.5. Samples: 305266300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:18,380][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 01:01:18,888][26599] Updated weights for policy 0, policy_version 246444 (0.0037) [2024-06-19 01:01:23,164][26599] Updated weights for policy 0, policy_version 246454 (0.0035) [2024-06-19 01:01:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4037918720. Throughput: 0: 41872.4. Samples: 305521300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:23,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 01:01:26,660][26599] Updated weights for policy 0, policy_version 246464 (0.0035) [2024-06-19 01:01:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 4038131712. Throughput: 0: 41661.3. Samples: 305764840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:28,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 01:01:30,884][26599] Updated weights for policy 0, policy_version 246474 (0.0046) [2024-06-19 01:01:33,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4038311936. Throughput: 0: 41692.0. Samples: 305886760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:33,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 01:01:34,189][26579] Signal inference workers to stop experience collection... (4500 times) [2024-06-19 01:01:34,244][26599] InferenceWorker_p0-w0: stopping experience collection (4500 times) [2024-06-19 01:01:34,244][26579] Signal inference workers to resume experience collection... 
(4500 times) [2024-06-19 01:01:34,260][26599] InferenceWorker_p0-w0: resuming experience collection (4500 times) [2024-06-19 01:01:34,662][26599] Updated weights for policy 0, policy_version 246484 (0.0044) [2024-06-19 01:01:38,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41710.3). Total num frames: 4038508544. Throughput: 0: 41641.4. Samples: 306141980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:38,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 01:01:38,518][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000246492_4038524928.pth... [2024-06-19 01:01:38,561][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000245880_4028497920.pth [2024-06-19 01:01:38,908][26599] Updated weights for policy 0, policy_version 246494 (0.0046) [2024-06-19 01:01:42,357][26599] Updated weights for policy 0, policy_version 246504 (0.0026) [2024-06-19 01:01:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41821.4). Total num frames: 4038737920. Throughput: 0: 41391.1. Samples: 306383380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:43,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 01:01:47,129][26599] Updated weights for policy 0, policy_version 246514 (0.0038) [2024-06-19 01:01:48,382][26367] Fps is (10 sec: 42591.3, 60 sec: 41505.0, 300 sec: 41765.1). Total num frames: 4038934528. Throughput: 0: 41576.8. Samples: 306517760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:48,382][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 01:01:50,082][26599] Updated weights for policy 0, policy_version 246524 (0.0038) [2024-06-19 01:01:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 4039131136. Throughput: 0: 41609.2. Samples: 306761180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 01:01:53,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 01:01:55,248][26599] Updated weights for policy 0, policy_version 246534 (0.0029) [2024-06-19 01:01:58,380][26367] Fps is (10 sec: 42605.8, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4039360512. Throughput: 0: 41647.2. Samples: 307009520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:01:58,380][26367] Avg episode reward: [(0, '0.807')] [2024-06-19 01:01:58,430][26599] Updated weights for policy 0, policy_version 246544 (0.0043) [2024-06-19 01:02:03,038][26599] Updated weights for policy 0, policy_version 246554 (0.0033) [2024-06-19 01:02:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 4039557120. Throughput: 0: 41549.2. Samples: 307136020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:03,381][26367] Avg episode reward: [(0, '0.281')] [2024-06-19 01:02:06,200][26599] Updated weights for policy 0, policy_version 246564 (0.0033) [2024-06-19 01:02:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41765.8). Total num frames: 4039786496. Throughput: 0: 41355.1. Samples: 307382280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:08,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 01:02:10,746][26599] Updated weights for policy 0, policy_version 246574 (0.0044) [2024-06-19 01:02:13,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 4039966720. Throughput: 0: 41572.0. Samples: 307635580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:13,380][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 01:02:13,924][26599] Updated weights for policy 0, policy_version 246584 (0.0048) [2024-06-19 01:02:18,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4040179712. Throughput: 0: 41575.2. Samples: 307757640. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:18,380][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 01:02:18,450][26599] Updated weights for policy 0, policy_version 246594 (0.0036) [2024-06-19 01:02:21,648][26599] Updated weights for policy 0, policy_version 246604 (0.0035) [2024-06-19 01:02:23,380][26367] Fps is (10 sec: 44236.0, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4040409088. Throughput: 0: 41483.4. Samples: 308008740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:23,381][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 01:02:26,790][26599] Updated weights for policy 0, policy_version 246614 (0.0035) [2024-06-19 01:02:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 4040605696. Throughput: 0: 41690.4. Samples: 308259440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:28,380][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 01:02:29,543][26599] Updated weights for policy 0, policy_version 246624 (0.0035) [2024-06-19 01:02:33,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 4040802304. Throughput: 0: 41422.0. Samples: 308381680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:33,381][26367] Avg episode reward: [(0, '0.425')] [2024-06-19 01:02:34,714][26599] Updated weights for policy 0, policy_version 246634 (0.0030) [2024-06-19 01:02:37,707][26599] Updated weights for policy 0, policy_version 246644 (0.0035) [2024-06-19 01:02:38,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 4041031680. Throughput: 0: 41624.9. Samples: 308634300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:38,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 01:02:42,426][26599] Updated weights for policy 0, policy_version 246654 (0.0043) [2024-06-19 01:02:43,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 4041228288. Throughput: 0: 41842.5. Samples: 308892440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:43,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 01:02:45,666][26599] Updated weights for policy 0, policy_version 246664 (0.0031) [2024-06-19 01:02:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41780.3, 300 sec: 41654.2). Total num frames: 4041441280. Throughput: 0: 41671.1. Samples: 309011220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:48,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 01:02:49,949][26579] Signal inference workers to stop experience collection... (4550 times) [2024-06-19 01:02:49,950][26579] Signal inference workers to resume experience collection... (4550 times) [2024-06-19 01:02:49,965][26599] InferenceWorker_p0-w0: stopping experience collection (4550 times) [2024-06-19 01:02:49,965][26599] InferenceWorker_p0-w0: resuming experience collection (4550 times) [2024-06-19 01:02:50,105][26599] Updated weights for policy 0, policy_version 246674 (0.0035) [2024-06-19 01:02:53,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4041670656. Throughput: 0: 41853.8. Samples: 309265700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:53,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 01:02:53,388][26599] Updated weights for policy 0, policy_version 246684 (0.0030) [2024-06-19 01:02:57,945][26599] Updated weights for policy 0, policy_version 246694 (0.0042) [2024-06-19 01:02:58,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41233.0, 300 sec: 41821.4). Total num frames: 4041834496. Throughput: 0: 41883.5. 
Samples: 309520340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:02:58,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 01:03:01,077][26599] Updated weights for policy 0, policy_version 246704 (0.0029) [2024-06-19 01:03:03,381][26367] Fps is (10 sec: 40957.9, 60 sec: 42052.0, 300 sec: 41709.7). Total num frames: 4042080256. Throughput: 0: 41929.2. Samples: 309644480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 24.0) [2024-06-19 01:03:03,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 01:03:05,498][26599] Updated weights for policy 0, policy_version 246714 (0.0042) [2024-06-19 01:03:08,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 4042276864. Throughput: 0: 41947.7. Samples: 309896380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:08,380][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 01:03:09,195][26599] Updated weights for policy 0, policy_version 246724 (0.0029) [2024-06-19 01:03:13,380][26367] Fps is (10 sec: 39323.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4042473472. Throughput: 0: 41998.6. Samples: 310149380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:13,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 01:03:13,722][26599] Updated weights for policy 0, policy_version 246734 (0.0031) [2024-06-19 01:03:16,932][26599] Updated weights for policy 0, policy_version 246744 (0.0037) [2024-06-19 01:03:18,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 4042719232. Throughput: 0: 42090.3. Samples: 310275740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:18,380][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 01:03:21,307][26599] Updated weights for policy 0, policy_version 246754 (0.0038) [2024-06-19 01:03:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4042915840. Throughput: 0: 42144.0. 
Samples: 310530780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:23,384][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 01:03:24,511][26599] Updated weights for policy 0, policy_version 246764 (0.0037) [2024-06-19 01:03:28,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 4043112448. Throughput: 0: 41952.0. Samples: 310780280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:28,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 01:03:29,257][26599] Updated weights for policy 0, policy_version 246774 (0.0030) [2024-06-19 01:03:32,236][26599] Updated weights for policy 0, policy_version 246784 (0.0040) [2024-06-19 01:03:33,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 41876.4). Total num frames: 4043358208. Throughput: 0: 42132.8. Samples: 310907200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:33,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 01:03:36,757][26599] Updated weights for policy 0, policy_version 246794 (0.0033) [2024-06-19 01:03:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4043554816. Throughput: 0: 42086.1. Samples: 311159580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:38,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 01:03:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000246799_4043554816.pth... [2024-06-19 01:03:38,451][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000246186_4033511424.pth [2024-06-19 01:03:40,649][26599] Updated weights for policy 0, policy_version 246804 (0.0031) [2024-06-19 01:03:43,380][26367] Fps is (10 sec: 39322.5, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 4043751424. Throughput: 0: 41700.9. Samples: 311396880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:43,380][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 01:03:44,948][26599] Updated weights for policy 0, policy_version 246814 (0.0035) [2024-06-19 01:03:48,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 4043931648. Throughput: 0: 41750.1. Samples: 311523220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:48,381][26367] Avg episode reward: [(0, '0.812')] [2024-06-19 01:03:48,561][26599] Updated weights for policy 0, policy_version 246824 (0.0042) [2024-06-19 01:03:52,728][26599] Updated weights for policy 0, policy_version 246834 (0.0034) [2024-06-19 01:03:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 4044161024. Throughput: 0: 41795.5. Samples: 311777180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:53,382][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 01:03:56,266][26599] Updated weights for policy 0, policy_version 246844 (0.0035) [2024-06-19 01:03:57,264][26579] Signal inference workers to stop experience collection... (4600 times) [2024-06-19 01:03:57,276][26599] InferenceWorker_p0-w0: stopping experience collection (4600 times) [2024-06-19 01:03:57,328][26579] Signal inference workers to resume experience collection... (4600 times) [2024-06-19 01:03:57,328][26599] InferenceWorker_p0-w0: resuming experience collection (4600 times) [2024-06-19 01:03:58,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 4044390400. Throughput: 0: 41633.8. Samples: 312022900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:03:58,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 01:04:00,613][26599] Updated weights for policy 0, policy_version 246854 (0.0047) [2024-06-19 01:04:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41779.4, 300 sec: 41765.3). Total num frames: 4044587008. Throughput: 0: 41709.2. 
Samples: 312152660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:04:03,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 01:04:04,395][26599] Updated weights for policy 0, policy_version 246864 (0.0029) [2024-06-19 01:04:08,276][26599] Updated weights for policy 0, policy_version 246874 (0.0040) [2024-06-19 01:04:08,380][26367] Fps is (10 sec: 39320.8, 60 sec: 41779.0, 300 sec: 41820.9). Total num frames: 4044783616. Throughput: 0: 41541.7. Samples: 312400160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:04:08,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 01:04:12,153][26599] Updated weights for policy 0, policy_version 246884 (0.0033) [2024-06-19 01:04:13,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 4044996608. Throughput: 0: 41599.7. Samples: 312652260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:04:13,380][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 01:04:16,001][26599] Updated weights for policy 0, policy_version 246894 (0.0043) [2024-06-19 01:04:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.0, 300 sec: 41820.8). Total num frames: 4045209600. Throughput: 0: 41512.5. Samples: 312775260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:04:18,384][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 01:04:19,924][26599] Updated weights for policy 0, policy_version 246904 (0.0040) [2024-06-19 01:04:23,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4045422592. Throughput: 0: 41615.9. Samples: 313032300. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:04:23,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 01:04:23,611][26599] Updated weights for policy 0, policy_version 246914 (0.0037) [2024-06-19 01:04:27,682][26599] Updated weights for policy 0, policy_version 246924 (0.0044) [2024-06-19 01:04:28,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41776.7, 300 sec: 41709.3). Total num frames: 4045619200. Throughput: 0: 41937.4. Samples: 313284220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:04:28,385][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 01:04:31,266][26599] Updated weights for policy 0, policy_version 246934 (0.0036) [2024-06-19 01:04:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 4045832192. Throughput: 0: 41868.0. Samples: 313407280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:04:33,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 01:04:35,430][26599] Updated weights for policy 0, policy_version 246944 (0.0036) [2024-06-19 01:04:38,380][26367] Fps is (10 sec: 40975.2, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 4046028800. Throughput: 0: 41663.6. Samples: 313652040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:04:38,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 01:04:39,570][26599] Updated weights for policy 0, policy_version 246954 (0.0027) [2024-06-19 01:04:43,174][26599] Updated weights for policy 0, policy_version 246964 (0.0030) [2024-06-19 01:04:43,384][26367] Fps is (10 sec: 42582.9, 60 sec: 41776.6, 300 sec: 41764.8). Total num frames: 4046258176. Throughput: 0: 41928.6. Samples: 313909840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:04:43,385][26367] Avg episode reward: [(0, '0.787')] [2024-06-19 01:04:47,266][26599] Updated weights for policy 0, policy_version 246974 (0.0035) [2024-06-19 01:04:48,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 41876.4). 
Total num frames: 4046471168. Throughput: 0: 41870.3. Samples: 314036820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:04:48,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 01:04:51,306][26599] Updated weights for policy 0, policy_version 246984 (0.0040) [2024-06-19 01:04:53,384][26367] Fps is (10 sec: 42598.5, 60 sec: 42049.7, 300 sec: 41820.3). Total num frames: 4046684160. Throughput: 0: 41863.4. Samples: 314284160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:04:53,385][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 01:04:55,009][26599] Updated weights for policy 0, policy_version 246994 (0.0048) [2024-06-19 01:04:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 4046880768. Throughput: 0: 41858.6. Samples: 314535900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:04:58,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 01:04:59,259][26599] Updated weights for policy 0, policy_version 247004 (0.0041) [2024-06-19 01:05:02,788][26599] Updated weights for policy 0, policy_version 247014 (0.0032) [2024-06-19 01:05:03,380][26367] Fps is (10 sec: 40974.6, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4047093760. Throughput: 0: 41856.0. Samples: 314658780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:05:03,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 01:05:07,341][26599] Updated weights for policy 0, policy_version 247024 (0.0024) [2024-06-19 01:05:08,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.5, 300 sec: 41820.9). Total num frames: 4047306752. Throughput: 0: 41942.0. Samples: 314919680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:05:08,380][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 01:05:10,695][26599] Updated weights for policy 0, policy_version 247034 (0.0031) [2024-06-19 01:05:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 41876.4). 
Total num frames: 4047519744. Throughput: 0: 41628.8. Samples: 315157360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:05:13,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 01:05:14,937][26599] Updated weights for policy 0, policy_version 247044 (0.0043) [2024-06-19 01:05:18,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4047716352. Throughput: 0: 41739.1. Samples: 315285540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:05:18,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 01:05:18,728][26599] Updated weights for policy 0, policy_version 247054 (0.0036) [2024-06-19 01:05:22,844][26599] Updated weights for policy 0, policy_version 247064 (0.0039) [2024-06-19 01:05:23,380][26367] Fps is (10 sec: 37682.6, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 4047896576. Throughput: 0: 41869.7. Samples: 315536180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:05:23,381][26367] Avg episode reward: [(0, '0.805')] [2024-06-19 01:05:23,403][26579] Signal inference workers to stop experience collection... (4650 times) [2024-06-19 01:05:23,449][26599] InferenceWorker_p0-w0: stopping experience collection (4650 times) [2024-06-19 01:05:23,458][26579] Signal inference workers to resume experience collection... (4650 times) [2024-06-19 01:05:23,465][26599] InferenceWorker_p0-w0: resuming experience collection (4650 times) [2024-06-19 01:05:26,505][26599] Updated weights for policy 0, policy_version 247074 (0.0031) [2024-06-19 01:05:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42054.7, 300 sec: 41765.3). Total num frames: 4048142336. Throughput: 0: 41611.2. Samples: 315782200. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 01:05:28,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 01:05:30,587][26599] Updated weights for policy 0, policy_version 247084 (0.0036) [2024-06-19 01:05:33,384][26367] Fps is (10 sec: 42583.2, 60 sec: 41503.6, 300 sec: 41709.3). Total num frames: 4048322560. Throughput: 0: 41709.6. Samples: 315913900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:05:33,384][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 01:05:34,339][26599] Updated weights for policy 0, policy_version 247094 (0.0049) [2024-06-19 01:05:38,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 4048535552. Throughput: 0: 41531.3. Samples: 316152920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:05:38,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 01:05:38,399][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000247103_4048535552.pth... [2024-06-19 01:05:38,454][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000246492_4038524928.pth [2024-06-19 01:05:38,907][26599] Updated weights for policy 0, policy_version 247104 (0.0032) [2024-06-19 01:05:42,195][26599] Updated weights for policy 0, policy_version 247114 (0.0040) [2024-06-19 01:05:43,380][26367] Fps is (10 sec: 42614.5, 60 sec: 41508.7, 300 sec: 41709.8). Total num frames: 4048748544. Throughput: 0: 41528.5. Samples: 316404680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:05:43,380][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 01:05:46,775][26599] Updated weights for policy 0, policy_version 247124 (0.0037) [2024-06-19 01:05:48,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 4048945152. Throughput: 0: 41657.5. Samples: 316533360. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:05:48,380][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 01:05:50,354][26599] Updated weights for policy 0, policy_version 247134 (0.0039) [2024-06-19 01:05:53,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41508.6, 300 sec: 41765.3). Total num frames: 4049174528. Throughput: 0: 41298.5. Samples: 316778120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:05:53,381][26367] Avg episode reward: [(0, '0.414')] [2024-06-19 01:05:54,491][26599] Updated weights for policy 0, policy_version 247144 (0.0040) [2024-06-19 01:05:57,957][26599] Updated weights for policy 0, policy_version 247154 (0.0023) [2024-06-19 01:05:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4049387520. Throughput: 0: 41759.1. Samples: 317036520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:05:58,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 01:06:02,168][26599] Updated weights for policy 0, policy_version 247164 (0.0035) [2024-06-19 01:06:03,382][26367] Fps is (10 sec: 39315.0, 60 sec: 41231.9, 300 sec: 41709.5). Total num frames: 4049567744. Throughput: 0: 41602.9. Samples: 317157740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:06:03,382][26367] Avg episode reward: [(0, '0.797')] [2024-06-19 01:06:05,651][26599] Updated weights for policy 0, policy_version 247174 (0.0033) [2024-06-19 01:06:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 4049829888. Throughput: 0: 41656.1. Samples: 317410700. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:06:08,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 01:06:09,867][26599] Updated weights for policy 0, policy_version 247184 (0.0041) [2024-06-19 01:06:13,380][26367] Fps is (10 sec: 44244.9, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4050010112. Throughput: 0: 41757.6. Samples: 317661280. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:06:13,380][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 01:06:13,479][26599] Updated weights for policy 0, policy_version 247194 (0.0033) [2024-06-19 01:06:17,494][26599] Updated weights for policy 0, policy_version 247204 (0.0024) [2024-06-19 01:06:18,380][26367] Fps is (10 sec: 37682.6, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 4050206720. Throughput: 0: 41625.5. Samples: 317786900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:06:18,381][26367] Avg episode reward: [(0, '0.369')] [2024-06-19 01:06:21,191][26599] Updated weights for policy 0, policy_version 247214 (0.0046) [2024-06-19 01:06:23,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 41709.8). Total num frames: 4050436096. Throughput: 0: 41929.4. Samples: 318039740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:06:23,381][26367] Avg episode reward: [(0, '0.334')] [2024-06-19 01:06:25,118][26599] Updated weights for policy 0, policy_version 247224 (0.0042) [2024-06-19 01:06:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 4050632704. Throughput: 0: 41985.2. Samples: 318294020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:06:28,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 01:06:29,045][26599] Updated weights for policy 0, policy_version 247234 (0.0044) [2024-06-19 01:06:32,732][26599] Updated weights for policy 0, policy_version 247244 (0.0043) [2024-06-19 01:06:33,384][26367] Fps is (10 sec: 40945.5, 60 sec: 42052.3, 300 sec: 41820.3). Total num frames: 4050845696. Throughput: 0: 41829.4. Samples: 318415840. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:06:33,384][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 01:06:36,742][26599] Updated weights for policy 0, policy_version 247254 (0.0033) [2024-06-19 01:06:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 4051058688. Throughput: 0: 41974.7. Samples: 318666980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 01:06:38,381][26367] Avg episode reward: [(0, '0.761')] [2024-06-19 01:06:40,552][26599] Updated weights for policy 0, policy_version 247264 (0.0033) [2024-06-19 01:06:43,383][26367] Fps is (10 sec: 40963.5, 60 sec: 41777.2, 300 sec: 41765.2). Total num frames: 4051255296. Throughput: 0: 41960.9. Samples: 318924880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:06:43,384][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 01:06:44,371][26599] Updated weights for policy 0, policy_version 247274 (0.0037) [2024-06-19 01:06:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 4051484672. Throughput: 0: 41948.6. Samples: 319045360. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:06:48,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 01:06:49,093][26599] Updated weights for policy 0, policy_version 247284 (0.0031) [2024-06-19 01:06:51,914][26579] Signal inference workers to stop experience collection... (4700 times) [2024-06-19 01:06:51,968][26599] InferenceWorker_p0-w0: stopping experience collection (4700 times) [2024-06-19 01:06:51,974][26579] Signal inference workers to resume experience collection... (4700 times) [2024-06-19 01:06:51,986][26599] InferenceWorker_p0-w0: resuming experience collection (4700 times) [2024-06-19 01:06:52,108][26599] Updated weights for policy 0, policy_version 247294 (0.0043) [2024-06-19 01:06:53,380][26367] Fps is (10 sec: 42610.2, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4051681280. Throughput: 0: 41925.3. 
Samples: 319297340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:06:53,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 01:06:56,958][26599] Updated weights for policy 0, policy_version 247304 (0.0030) [2024-06-19 01:06:58,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 4051877888. Throughput: 0: 42055.4. Samples: 319553780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:06:58,381][26367] Avg episode reward: [(0, '0.376')] [2024-06-19 01:06:59,971][26599] Updated weights for policy 0, policy_version 247314 (0.0039) [2024-06-19 01:07:03,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42872.7, 300 sec: 41876.4). Total num frames: 4052140032. Throughput: 0: 41937.5. Samples: 319674080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:03,381][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 01:07:04,735][26599] Updated weights for policy 0, policy_version 247324 (0.0036) [2024-06-19 01:07:08,354][26599] Updated weights for policy 0, policy_version 247334 (0.0035) [2024-06-19 01:07:08,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4052320256. Throughput: 0: 41914.3. Samples: 319925880. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:08,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 01:07:12,595][26599] Updated weights for policy 0, policy_version 247344 (0.0032) [2024-06-19 01:07:13,380][26367] Fps is (10 sec: 36045.0, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4052500480. Throughput: 0: 42016.9. Samples: 320184780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:13,380][26367] Avg episode reward: [(0, '0.371')] [2024-06-19 01:07:16,140][26599] Updated weights for policy 0, policy_version 247354 (0.0028) [2024-06-19 01:07:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 4052746240. Throughput: 0: 42035.7. 
Samples: 320307300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:18,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 01:07:20,104][26599] Updated weights for policy 0, policy_version 247364 (0.0041) [2024-06-19 01:07:23,380][26367] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4052942848. Throughput: 0: 42081.7. Samples: 320560660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:23,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 01:07:23,965][26599] Updated weights for policy 0, policy_version 247374 (0.0041) [2024-06-19 01:07:28,380][26367] Fps is (10 sec: 37683.4, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4053123072. Throughput: 0: 41932.8. Samples: 320811740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:28,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 01:07:28,395][26599] Updated weights for policy 0, policy_version 247384 (0.0030) [2024-06-19 01:07:31,879][26599] Updated weights for policy 0, policy_version 247394 (0.0035) [2024-06-19 01:07:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42327.9, 300 sec: 41876.4). Total num frames: 4053385216. Throughput: 0: 41986.3. Samples: 320934740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:33,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 01:07:36,270][26599] Updated weights for policy 0, policy_version 247404 (0.0037) [2024-06-19 01:07:38,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4053581824. Throughput: 0: 42079.2. Samples: 321190900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:38,380][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 01:07:38,480][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000247412_4053598208.pth... 
[2024-06-19 01:07:38,524][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000246799_4043554816.pth [2024-06-19 01:07:39,810][26599] Updated weights for policy 0, policy_version 247414 (0.0028) [2024-06-19 01:07:43,380][26367] Fps is (10 sec: 37683.8, 60 sec: 41781.2, 300 sec: 41765.3). Total num frames: 4053762048. Throughput: 0: 41972.6. Samples: 321442540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:43,380][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 01:07:43,778][26599] Updated weights for policy 0, policy_version 247424 (0.0029) [2024-06-19 01:07:47,517][26599] Updated weights for policy 0, policy_version 247434 (0.0035) [2024-06-19 01:07:48,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 4054007808. Throughput: 0: 42073.3. Samples: 321567380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 01:07:48,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 01:07:51,381][26599] Updated weights for policy 0, policy_version 247444 (0.0034) [2024-06-19 01:07:53,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4054204416. Throughput: 0: 42136.4. Samples: 321822020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:07:53,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 01:07:55,303][26599] Updated weights for policy 0, policy_version 247454 (0.0028) [2024-06-19 01:07:58,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 4054417408. Throughput: 0: 41919.1. Samples: 322071140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:07:58,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 01:07:59,135][26599] Updated weights for policy 0, policy_version 247464 (0.0037) [2024-06-19 01:07:59,747][26579] Signal inference workers to stop experience collection... 
(4750 times) [2024-06-19 01:07:59,747][26579] Signal inference workers to resume experience collection... (4750 times) [2024-06-19 01:07:59,793][26599] InferenceWorker_p0-w0: stopping experience collection (4750 times) [2024-06-19 01:07:59,793][26599] InferenceWorker_p0-w0: resuming experience collection (4750 times) [2024-06-19 01:08:03,143][26599] Updated weights for policy 0, policy_version 247474 (0.0040) [2024-06-19 01:08:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 4054614016. Throughput: 0: 42092.2. Samples: 322201440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:03,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 01:08:07,094][26599] Updated weights for policy 0, policy_version 247484 (0.0026) [2024-06-19 01:08:08,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4054827008. Throughput: 0: 42175.6. Samples: 322458560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:08,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 01:08:10,864][26599] Updated weights for policy 0, policy_version 247494 (0.0033) [2024-06-19 01:08:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 41820.9). Total num frames: 4055056384. Throughput: 0: 41971.7. Samples: 322700460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:13,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 01:08:14,733][26599] Updated weights for policy 0, policy_version 247504 (0.0042) [2024-06-19 01:08:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4055252992. Throughput: 0: 42140.4. Samples: 322831060. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:18,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 01:08:18,531][26599] Updated weights for policy 0, policy_version 247514 (0.0034) [2024-06-19 01:08:22,437][26599] Updated weights for policy 0, policy_version 247524 (0.0034) [2024-06-19 01:08:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4055465984. Throughput: 0: 41934.1. Samples: 323077940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:23,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 01:08:26,133][26599] Updated weights for policy 0, policy_version 247534 (0.0035) [2024-06-19 01:08:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 4055662592. Throughput: 0: 42010.1. Samples: 323333000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:28,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 01:08:30,029][26599] Updated weights for policy 0, policy_version 247544 (0.0037) [2024-06-19 01:08:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4055875584. Throughput: 0: 41972.1. Samples: 323456120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:33,380][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 01:08:34,325][26599] Updated weights for policy 0, policy_version 247554 (0.0046) [2024-06-19 01:08:37,754][26599] Updated weights for policy 0, policy_version 247564 (0.0032) [2024-06-19 01:08:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4056104960. Throughput: 0: 41851.5. Samples: 323705340. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:38,381][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 01:08:42,083][26599] Updated weights for policy 0, policy_version 247574 (0.0029) [2024-06-19 01:08:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4056301568. Throughput: 0: 42115.1. Samples: 323966320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:43,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 01:08:45,362][26599] Updated weights for policy 0, policy_version 247584 (0.0035) [2024-06-19 01:08:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4056514560. Throughput: 0: 41831.4. Samples: 324083860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:48,381][26367] Avg episode reward: [(0, '0.402')] [2024-06-19 01:08:49,809][26599] Updated weights for policy 0, policy_version 247594 (0.0040) [2024-06-19 01:08:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 4056727552. Throughput: 0: 41838.3. Samples: 324341280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:53,380][26367] Avg episode reward: [(0, '0.332')] [2024-06-19 01:08:53,391][26599] Updated weights for policy 0, policy_version 247604 (0.0032) [2024-06-19 01:08:57,810][26599] Updated weights for policy 0, policy_version 247614 (0.0039) [2024-06-19 01:08:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4056924160. Throughput: 0: 42064.0. Samples: 324593340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 01:08:58,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 01:09:01,038][26599] Updated weights for policy 0, policy_version 247624 (0.0028) [2024-06-19 01:09:03,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 4057169920. Throughput: 0: 41918.6. Samples: 324717400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:03,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 01:09:05,483][26599] Updated weights for policy 0, policy_version 247634 (0.0027) [2024-06-19 01:09:08,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4057382912. Throughput: 0: 42122.2. Samples: 324973440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:08,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 01:09:08,598][26599] Updated weights for policy 0, policy_version 247644 (0.0040) [2024-06-19 01:09:13,322][26599] Updated weights for policy 0, policy_version 247654 (0.0035) [2024-06-19 01:09:13,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4057563136. Throughput: 0: 41999.1. Samples: 325222960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:13,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 01:09:16,902][26599] Updated weights for policy 0, policy_version 247664 (0.0028) [2024-06-19 01:09:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4057792512. Throughput: 0: 41941.6. Samples: 325343500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:18,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 01:09:20,884][26599] Updated weights for policy 0, policy_version 247674 (0.0038) [2024-06-19 01:09:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41876.9). Total num frames: 4057972736. Throughput: 0: 42141.8. Samples: 325601720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:23,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 01:09:23,752][26579] Signal inference workers to stop experience collection... 
(4800 times) [2024-06-19 01:09:23,802][26599] InferenceWorker_p0-w0: stopping experience collection (4800 times) [2024-06-19 01:09:23,802][26579] Signal inference workers to resume experience collection... (4800 times) [2024-06-19 01:09:23,824][26599] InferenceWorker_p0-w0: resuming experience collection (4800 times) [2024-06-19 01:09:24,797][26599] Updated weights for policy 0, policy_version 247684 (0.0030) [2024-06-19 01:09:28,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4058202112. Throughput: 0: 41758.6. Samples: 325845460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:28,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 01:09:28,412][26599] Updated weights for policy 0, policy_version 247694 (0.0025) [2024-06-19 01:09:32,604][26599] Updated weights for policy 0, policy_version 247704 (0.0039) [2024-06-19 01:09:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4058398720. Throughput: 0: 42026.7. Samples: 325975060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:33,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 01:09:36,281][26599] Updated weights for policy 0, policy_version 247714 (0.0042) [2024-06-19 01:09:38,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41821.4). Total num frames: 4058595328. Throughput: 0: 41868.5. Samples: 326225360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:38,381][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 01:09:38,387][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000247717_4058595328.pth... [2024-06-19 01:09:38,442][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000247103_4048535552.pth [2024-06-19 01:09:40,219][26599] Updated weights for policy 0, policy_version 247724 (0.0034) [2024-06-19 01:09:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 41876.4). 
Total num frames: 4058824704. Throughput: 0: 41685.2. Samples: 326469180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:43,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 01:09:44,131][26599] Updated weights for policy 0, policy_version 247734 (0.0032) [2024-06-19 01:09:48,058][26599] Updated weights for policy 0, policy_version 247744 (0.0031) [2024-06-19 01:09:48,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 41876.9). Total num frames: 4059037696. Throughput: 0: 41924.5. Samples: 326604000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:48,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 01:09:51,753][26599] Updated weights for policy 0, policy_version 247754 (0.0036) [2024-06-19 01:09:53,384][26367] Fps is (10 sec: 39307.5, 60 sec: 41503.6, 300 sec: 41820.3). Total num frames: 4059217920. Throughput: 0: 41610.0. Samples: 326846040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:53,385][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 01:09:55,822][26599] Updated weights for policy 0, policy_version 247764 (0.0025) [2024-06-19 01:09:58,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42322.7, 300 sec: 41931.4). Total num frames: 4059463680. Throughput: 0: 41742.4. Samples: 327101520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:09:58,385][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 01:09:59,745][26599] Updated weights for policy 0, policy_version 247774 (0.0025) [2024-06-19 01:10:03,380][26367] Fps is (10 sec: 44253.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4059660288. Throughput: 0: 41983.7. Samples: 327232760. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:10:03,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 01:10:03,915][26599] Updated weights for policy 0, policy_version 247784 (0.0034) [2024-06-19 01:10:07,407][26599] Updated weights for policy 0, policy_version 247794 (0.0032) [2024-06-19 01:10:08,380][26367] Fps is (10 sec: 40974.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4059873280. Throughput: 0: 41705.7. Samples: 327478480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 01:10:08,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 01:10:11,791][26599] Updated weights for policy 0, policy_version 247804 (0.0034) [2024-06-19 01:10:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4060069888. Throughput: 0: 41954.2. Samples: 327733400. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:13,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 01:10:15,198][26599] Updated weights for policy 0, policy_version 247814 (0.0043) [2024-06-19 01:10:18,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 4060266496. Throughput: 0: 41897.8. Samples: 327860460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:18,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 01:10:19,561][26599] Updated weights for policy 0, policy_version 247824 (0.0036) [2024-06-19 01:10:23,155][26599] Updated weights for policy 0, policy_version 247834 (0.0032) [2024-06-19 01:10:23,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 41932.0). Total num frames: 4060512256. Throughput: 0: 41866.6. Samples: 328109360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:23,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 01:10:27,463][26599] Updated weights for policy 0, policy_version 247844 (0.0032) [2024-06-19 01:10:28,384][26367] Fps is (10 sec: 44220.8, 60 sec: 41776.7, 300 sec: 41987.5). Total num frames: 4060708864. Throughput: 0: 42058.9. Samples: 328361980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:28,385][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 01:10:31,430][26599] Updated weights for policy 0, policy_version 247854 (0.0035) [2024-06-19 01:10:33,380][26367] Fps is (10 sec: 37683.7, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4060889088. Throughput: 0: 41868.5. Samples: 328488080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:33,380][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 01:10:35,478][26599] Updated weights for policy 0, policy_version 247864 (0.0041) [2024-06-19 01:10:38,380][26367] Fps is (10 sec: 44253.2, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 4061151232. Throughput: 0: 41998.2. Samples: 328735800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:38,381][26367] Avg episode reward: [(0, '0.328')] [2024-06-19 01:10:39,126][26599] Updated weights for policy 0, policy_version 247874 (0.0036) [2024-06-19 01:10:43,102][26599] Updated weights for policy 0, policy_version 247884 (0.0030) [2024-06-19 01:10:43,380][26367] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 41987.4). Total num frames: 4061331456. Throughput: 0: 42026.9. Samples: 328992580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:43,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 01:10:46,793][26599] Updated weights for policy 0, policy_version 247894 (0.0038) [2024-06-19 01:10:48,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4061544448. Throughput: 0: 41779.9. Samples: 329112860. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:48,381][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 01:10:51,163][26599] Updated weights for policy 0, policy_version 247904 (0.0048) [2024-06-19 01:10:53,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42601.0, 300 sec: 41987.5). Total num frames: 4061773824. Throughput: 0: 42089.5. Samples: 329372500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:53,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 01:10:54,390][26599] Updated weights for policy 0, policy_version 247914 (0.0038) [2024-06-19 01:10:56,467][26579] Signal inference workers to stop experience collection... (4850 times) [2024-06-19 01:10:56,516][26599] InferenceWorker_p0-w0: stopping experience collection (4850 times) [2024-06-19 01:10:56,524][26579] Signal inference workers to resume experience collection... (4850 times) [2024-06-19 01:10:56,526][26599] InferenceWorker_p0-w0: resuming experience collection (4850 times) [2024-06-19 01:10:58,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41506.1, 300 sec: 41987.2). Total num frames: 4061954048. Throughput: 0: 41962.8. Samples: 329621880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:10:58,385][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 01:10:59,210][26599] Updated weights for policy 0, policy_version 247924 (0.0028) [2024-06-19 01:11:01,984][26599] Updated weights for policy 0, policy_version 247934 (0.0038) [2024-06-19 01:11:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4062167040. Throughput: 0: 41818.7. Samples: 329742300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:11:03,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 01:11:06,990][26599] Updated weights for policy 0, policy_version 247944 (0.0036) [2024-06-19 01:11:08,380][26367] Fps is (10 sec: 40975.1, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4062363648. Throughput: 0: 42036.9. 
Samples: 330001020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:11:08,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 01:11:10,243][26599] Updated weights for policy 0, policy_version 247954 (0.0039) [2024-06-19 01:11:13,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4062593024. Throughput: 0: 41952.7. Samples: 330249700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:11:13,381][26367] Avg episode reward: [(0, '0.843')] [2024-06-19 01:11:14,715][26599] Updated weights for policy 0, policy_version 247964 (0.0038) [2024-06-19 01:11:17,970][26599] Updated weights for policy 0, policy_version 247974 (0.0038) [2024-06-19 01:11:18,382][26367] Fps is (10 sec: 44230.1, 60 sec: 42324.3, 300 sec: 41931.7). Total num frames: 4062806016. Throughput: 0: 42020.7. Samples: 330379080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 01:11:18,382][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 01:11:22,616][26599] Updated weights for policy 0, policy_version 247984 (0.0035) [2024-06-19 01:11:23,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 4062986240. Throughput: 0: 41933.3. Samples: 330622800. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:11:23,380][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 01:11:25,701][26599] Updated weights for policy 0, policy_version 247994 (0.0059) [2024-06-19 01:11:28,380][26367] Fps is (10 sec: 42604.9, 60 sec: 42054.8, 300 sec: 41988.0). Total num frames: 4063232000. Throughput: 0: 41734.3. Samples: 330870620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:11:28,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 01:11:30,228][26599] Updated weights for policy 0, policy_version 248004 (0.0035) [2024-06-19 01:11:33,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4063444992. Throughput: 0: 42024.2. 
Samples: 331003940. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:11:33,380][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 01:11:33,541][26599] Updated weights for policy 0, policy_version 248014 (0.0030) [2024-06-19 01:11:38,033][26599] Updated weights for policy 0, policy_version 248024 (0.0030) [2024-06-19 01:11:38,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 41932.3). Total num frames: 4063625216. Throughput: 0: 41747.6. Samples: 331251140. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:11:38,380][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 01:11:38,423][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000248025_4063641600.pth... [2024-06-19 01:11:38,485][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000247412_4053598208.pth [2024-06-19 01:11:41,325][26599] Updated weights for policy 0, policy_version 248034 (0.0040) [2024-06-19 01:11:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4063870976. Throughput: 0: 41834.1. Samples: 331504260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:11:43,381][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 01:11:45,578][26599] Updated weights for policy 0, policy_version 248044 (0.0034) [2024-06-19 01:11:48,384][26367] Fps is (10 sec: 42582.5, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 4064051200. Throughput: 0: 42062.3. Samples: 331635260. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:11:48,385][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 01:11:49,167][26599] Updated weights for policy 0, policy_version 248054 (0.0033) [2024-06-19 01:11:53,329][26599] Updated weights for policy 0, policy_version 248064 (0.0039) [2024-06-19 01:11:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4064280576. Throughput: 0: 41879.9. Samples: 331885620. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:11:53,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 01:11:56,990][26599] Updated weights for policy 0, policy_version 248074 (0.0031) [2024-06-19 01:11:58,380][26367] Fps is (10 sec: 44253.1, 60 sec: 42327.9, 300 sec: 41876.4). Total num frames: 4064493568. Throughput: 0: 41857.4. Samples: 332133280. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:11:58,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 01:12:01,218][26599] Updated weights for policy 0, policy_version 248084 (0.0032) [2024-06-19 01:12:03,383][26367] Fps is (10 sec: 39310.5, 60 sec: 41777.1, 300 sec: 41876.0). Total num frames: 4064673792. Throughput: 0: 41954.7. Samples: 332267100. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:12:03,384][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 01:12:04,842][26599] Updated weights for policy 0, policy_version 248094 (0.0044) [2024-06-19 01:12:08,381][26367] Fps is (10 sec: 40959.0, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 4064903168. Throughput: 0: 42034.8. Samples: 332514380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:12:08,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 01:12:08,957][26599] Updated weights for policy 0, policy_version 248104 (0.0028) [2024-06-19 01:12:12,847][26599] Updated weights for policy 0, policy_version 248114 (0.0029) [2024-06-19 01:12:13,383][26367] Fps is (10 sec: 44238.2, 60 sec: 42050.5, 300 sec: 41931.6). Total num frames: 4065116160. Throughput: 0: 41974.9. Samples: 332759600. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:12:13,383][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 01:12:16,727][26599] Updated weights for policy 0, policy_version 248124 (0.0032) [2024-06-19 01:12:18,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41780.2, 300 sec: 41931.9). Total num frames: 4065312768. Throughput: 0: 41848.8. Samples: 332887140. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:12:18,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 01:12:20,603][26599] Updated weights for policy 0, policy_version 248134 (0.0028) [2024-06-19 01:12:23,380][26367] Fps is (10 sec: 40970.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4065525760. Throughput: 0: 41922.6. Samples: 333137660. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:12:23,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 01:12:24,755][26599] Updated weights for policy 0, policy_version 248144 (0.0046) [2024-06-19 01:12:25,863][26579] Signal inference workers to stop experience collection... (4900 times) [2024-06-19 01:12:25,898][26599] InferenceWorker_p0-w0: stopping experience collection (4900 times) [2024-06-19 01:12:25,926][26579] Signal inference workers to resume experience collection... (4900 times) [2024-06-19 01:12:25,927][26599] InferenceWorker_p0-w0: resuming experience collection (4900 times) [2024-06-19 01:12:28,316][26599] Updated weights for policy 0, policy_version 248154 (0.0030) [2024-06-19 01:12:28,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4065755136. Throughput: 0: 41943.2. Samples: 333391700. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 01:12:28,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 01:12:32,675][26599] Updated weights for policy 0, policy_version 248164 (0.0038) [2024-06-19 01:12:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4065935360. Throughput: 0: 41780.8. Samples: 333515240. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:12:33,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 01:12:36,337][26599] Updated weights for policy 0, policy_version 248174 (0.0027) [2024-06-19 01:12:38,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 4066164736. Throughput: 0: 41722.7. 
Samples: 333763140. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:12:38,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 01:12:40,338][26599] Updated weights for policy 0, policy_version 248184 (0.0051) [2024-06-19 01:12:43,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41233.0, 300 sec: 41820.8). Total num frames: 4066344960. Throughput: 0: 41947.0. Samples: 334020900. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:12:43,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 01:12:44,067][26599] Updated weights for policy 0, policy_version 248194 (0.0031) [2024-06-19 01:12:48,150][26599] Updated weights for policy 0, policy_version 248204 (0.0038) [2024-06-19 01:12:48,384][26367] Fps is (10 sec: 40945.5, 60 sec: 42052.3, 300 sec: 41931.4). Total num frames: 4066574336. Throughput: 0: 41571.8. Samples: 334137860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:12:48,385][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 01:12:51,693][26599] Updated weights for policy 0, policy_version 248214 (0.0028) [2024-06-19 01:12:53,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4066803712. Throughput: 0: 41685.9. Samples: 334390240. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:12:53,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 01:12:56,043][26599] Updated weights for policy 0, policy_version 248224 (0.0039) [2024-06-19 01:12:58,380][26367] Fps is (10 sec: 40974.6, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4066983936. Throughput: 0: 42017.0. Samples: 334650260. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:12:58,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 01:12:59,592][26599] Updated weights for policy 0, policy_version 248234 (0.0045) [2024-06-19 01:13:03,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42054.4, 300 sec: 41931.9). Total num frames: 4067196928. Throughput: 0: 41773.5. 
Samples: 334766940. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:13:03,381][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 01:13:03,777][26599] Updated weights for policy 0, policy_version 248244 (0.0037) [2024-06-19 01:13:07,398][26599] Updated weights for policy 0, policy_version 248254 (0.0029) [2024-06-19 01:13:08,380][26367] Fps is (10 sec: 47514.3, 60 sec: 42598.6, 300 sec: 42043.0). Total num frames: 4067459072. Throughput: 0: 42026.7. Samples: 335028860. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:13:08,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 01:13:11,659][26599] Updated weights for policy 0, policy_version 248264 (0.0038) [2024-06-19 01:13:13,382][26367] Fps is (10 sec: 40952.5, 60 sec: 41506.7, 300 sec: 41876.2). Total num frames: 4067606528. Throughput: 0: 41974.8. Samples: 335280640. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:13:13,382][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 01:13:15,349][26599] Updated weights for policy 0, policy_version 248274 (0.0046) [2024-06-19 01:13:18,380][26367] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4067835904. Throughput: 0: 41856.4. Samples: 335398780. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:13:18,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 01:13:19,168][26599] Updated weights for policy 0, policy_version 248284 (0.0049) [2024-06-19 01:13:23,143][26599] Updated weights for policy 0, policy_version 248294 (0.0038) [2024-06-19 01:13:23,380][26367] Fps is (10 sec: 44244.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4068048896. Throughput: 0: 42152.1. Samples: 335659980. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:13:23,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 01:13:27,031][26599] Updated weights for policy 0, policy_version 248304 (0.0033) [2024-06-19 01:13:28,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 4068229120. Throughput: 0: 42019.7. Samples: 335911780. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:13:28,380][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 01:13:31,021][26599] Updated weights for policy 0, policy_version 248314 (0.0031) [2024-06-19 01:13:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4068458496. Throughput: 0: 42167.0. Samples: 336035220. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:13:33,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 01:13:35,024][26599] Updated weights for policy 0, policy_version 248324 (0.0037) [2024-06-19 01:13:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4068655104. Throughput: 0: 42125.5. Samples: 336285880. Policy #0 lag: (min: 1.0, avg: 11.2, max: 21.0) [2024-06-19 01:13:38,380][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 01:13:38,488][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000248332_4068671488.pth... [2024-06-19 01:13:38,569][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000247717_4058595328.pth [2024-06-19 01:13:38,814][26599] Updated weights for policy 0, policy_version 248334 (0.0032) [2024-06-19 01:13:42,685][26599] Updated weights for policy 0, policy_version 248344 (0.0038) [2024-06-19 01:13:43,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4068884480. Throughput: 0: 41791.1. Samples: 336530860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:13:43,381][26367] Avg episode reward: [(0, '0.332')] [2024-06-19 01:13:46,506][26599] Updated weights for policy 0, policy_version 248354 (0.0038) [2024-06-19 01:13:48,383][26367] Fps is (10 sec: 42587.4, 60 sec: 41780.0, 300 sec: 41876.0). Total num frames: 4069081088. Throughput: 0: 42125.6. Samples: 336662700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:13:48,383][26367] Avg episode reward: [(0, '0.265')] [2024-06-19 01:13:50,208][26579] Signal inference workers to stop experience collection... (4950 times) [2024-06-19 01:13:50,208][26579] Signal inference workers to resume experience collection... (4950 times) [2024-06-19 01:13:50,229][26599] InferenceWorker_p0-w0: stopping experience collection (4950 times) [2024-06-19 01:13:50,229][26599] InferenceWorker_p0-w0: resuming experience collection (4950 times) [2024-06-19 01:13:50,549][26599] Updated weights for policy 0, policy_version 248364 (0.0028) [2024-06-19 01:13:53,383][26367] Fps is (10 sec: 40950.8, 60 sec: 41504.6, 300 sec: 41931.6). Total num frames: 4069294080. Throughput: 0: 41851.5. Samples: 336912280. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:13:53,383][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 01:13:54,342][26599] Updated weights for policy 0, policy_version 248374 (0.0031) [2024-06-19 01:13:58,170][26599] Updated weights for policy 0, policy_version 248384 (0.0025) [2024-06-19 01:13:58,380][26367] Fps is (10 sec: 45886.2, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 4069539840. Throughput: 0: 41970.9. Samples: 337169260. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:13:58,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 01:14:02,163][26599] Updated weights for policy 0, policy_version 248394 (0.0039) [2024-06-19 01:14:03,380][26367] Fps is (10 sec: 44247.0, 60 sec: 42325.2, 300 sec: 41876.4). Total num frames: 4069736448. Throughput: 0: 42170.6. 
Samples: 337296460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:03,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 01:14:05,962][26599] Updated weights for policy 0, policy_version 248404 (0.0039) [2024-06-19 01:14:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4069949440. Throughput: 0: 41890.7. Samples: 337545060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:08,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 01:14:10,063][26599] Updated weights for policy 0, policy_version 248414 (0.0024) [2024-06-19 01:14:13,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42053.5, 300 sec: 41820.9). Total num frames: 4070129664. Throughput: 0: 42099.1. Samples: 337806240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:13,380][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 01:14:13,904][26599] Updated weights for policy 0, policy_version 248424 (0.0038) [2024-06-19 01:14:18,075][26599] Updated weights for policy 0, policy_version 248434 (0.0042) [2024-06-19 01:14:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4070342656. Throughput: 0: 41932.4. Samples: 337922180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:18,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 01:14:21,570][26599] Updated weights for policy 0, policy_version 248444 (0.0044) [2024-06-19 01:14:23,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4070588416. Throughput: 0: 41952.5. Samples: 338173740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:23,380][26367] Avg episode reward: [(0, '0.270')] [2024-06-19 01:14:26,117][26599] Updated weights for policy 0, policy_version 248454 (0.0043) [2024-06-19 01:14:28,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4070752256. Throughput: 0: 42321.4. 
Samples: 338435320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:28,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 01:14:29,378][26599] Updated weights for policy 0, policy_version 248464 (0.0035) [2024-06-19 01:14:33,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4070965248. Throughput: 0: 41881.1. Samples: 338547240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:33,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 01:14:33,781][26599] Updated weights for policy 0, policy_version 248474 (0.0041) [2024-06-19 01:14:37,151][26599] Updated weights for policy 0, policy_version 248484 (0.0033) [2024-06-19 01:14:38,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4071211008. Throughput: 0: 42047.1. Samples: 338804300. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:38,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 01:14:41,496][26599] Updated weights for policy 0, policy_version 248494 (0.0062) [2024-06-19 01:14:43,383][26367] Fps is (10 sec: 42588.6, 60 sec: 41777.7, 300 sec: 41876.1). Total num frames: 4071391232. Throughput: 0: 41904.2. Samples: 339055040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:43,383][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 01:14:44,878][26599] Updated weights for policy 0, policy_version 248504 (0.0044) [2024-06-19 01:14:48,384][26367] Fps is (10 sec: 39308.7, 60 sec: 42051.8, 300 sec: 41987.5). Total num frames: 4071604224. Throughput: 0: 41758.4. Samples: 339175720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 01:14:48,384][26367] Avg episode reward: [(0, '0.863')] [2024-06-19 01:14:49,566][26599] Updated weights for policy 0, policy_version 248514 (0.0040) [2024-06-19 01:14:52,580][26599] Updated weights for policy 0, policy_version 248524 (0.0035) [2024-06-19 01:14:53,380][26367] Fps is (10 sec: 44247.0, 60 sec: 42327.0, 300 sec: 41932.5). Total num frames: 4071833600. Throughput: 0: 41981.4. Samples: 339434220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:14:53,381][26367] Avg episode reward: [(0, '0.863')] [2024-06-19 01:14:57,493][26599] Updated weights for policy 0, policy_version 248534 (0.0038) [2024-06-19 01:14:58,380][26367] Fps is (10 sec: 40973.4, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 4072013824. Throughput: 0: 41873.3. Samples: 339690540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:14:58,381][26367] Avg episode reward: [(0, '0.905')] [2024-06-19 01:15:00,545][26599] Updated weights for policy 0, policy_version 248544 (0.0036) [2024-06-19 01:15:03,384][26367] Fps is (10 sec: 40944.6, 60 sec: 41776.6, 300 sec: 41931.4). Total num frames: 4072243200. Throughput: 0: 41805.4. Samples: 339803580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:03,385][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 01:15:05,225][26599] Updated weights for policy 0, policy_version 248554 (0.0036) [2024-06-19 01:15:08,370][26599] Updated weights for policy 0, policy_version 248564 (0.0043) [2024-06-19 01:15:08,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4072472576. Throughput: 0: 41958.2. Samples: 340061860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:08,380][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 01:15:12,716][26579] Signal inference workers to stop experience collection... 
(5000 times) [2024-06-19 01:15:12,745][26599] InferenceWorker_p0-w0: stopping experience collection (5000 times) [2024-06-19 01:15:12,784][26579] Signal inference workers to resume experience collection... (5000 times) [2024-06-19 01:15:12,784][26599] InferenceWorker_p0-w0: resuming experience collection (5000 times) [2024-06-19 01:15:13,089][26599] Updated weights for policy 0, policy_version 248574 (0.0027) [2024-06-19 01:15:13,380][26367] Fps is (10 sec: 39336.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4072636416. Throughput: 0: 41837.4. Samples: 340318000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:13,381][26367] Avg episode reward: [(0, '0.318')] [2024-06-19 01:15:16,127][26599] Updated weights for policy 0, policy_version 248584 (0.0025) [2024-06-19 01:15:18,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 4072898560. Throughput: 0: 41975.0. Samples: 340436120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:18,381][26367] Avg episode reward: [(0, '0.345')] [2024-06-19 01:15:20,939][26599] Updated weights for policy 0, policy_version 248594 (0.0025) [2024-06-19 01:15:23,384][26367] Fps is (10 sec: 45858.4, 60 sec: 41776.6, 300 sec: 41987.5). Total num frames: 4073095168. Throughput: 0: 42088.1. Samples: 340698420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:23,384][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 01:15:23,735][26599] Updated weights for policy 0, policy_version 248604 (0.0024) [2024-06-19 01:15:28,380][26367] Fps is (10 sec: 36045.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4073259008. Throughput: 0: 42123.0. Samples: 340950480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:28,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 01:15:28,760][26599] Updated weights for policy 0, policy_version 248614 (0.0029) [2024-06-19 01:15:31,556][26599] Updated weights for policy 0, policy_version 248624 (0.0044) [2024-06-19 01:15:33,380][26367] Fps is (10 sec: 44252.9, 60 sec: 42871.4, 300 sec: 41987.5). Total num frames: 4073537536. Throughput: 0: 42110.6. Samples: 341070560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:33,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 01:15:36,723][26599] Updated weights for policy 0, policy_version 248634 (0.0031) [2024-06-19 01:15:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 4073684992. Throughput: 0: 42050.6. Samples: 341326500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:38,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 01:15:38,430][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000248639_4073701376.pth... [2024-06-19 01:15:38,469][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000248025_4063641600.pth [2024-06-19 01:15:39,635][26599] Updated weights for policy 0, policy_version 248644 (0.0036) [2024-06-19 01:15:43,380][26367] Fps is (10 sec: 36045.1, 60 sec: 41780.8, 300 sec: 41876.4). Total num frames: 4073897984. Throughput: 0: 41840.5. Samples: 341573360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:43,380][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 01:15:44,448][26599] Updated weights for policy 0, policy_version 248654 (0.0031) [2024-06-19 01:15:47,423][26599] Updated weights for policy 0, policy_version 248664 (0.0031) [2024-06-19 01:15:48,380][26367] Fps is (10 sec: 47513.6, 60 sec: 42600.7, 300 sec: 41987.5). Total num frames: 4074160128. Throughput: 0: 42108.8. Samples: 341698320. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:48,381][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 01:15:52,171][26599] Updated weights for policy 0, policy_version 248674 (0.0024) [2024-06-19 01:15:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 41932.5). Total num frames: 4074323968. Throughput: 0: 42078.2. Samples: 341955380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:53,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 01:15:55,075][26599] Updated weights for policy 0, policy_version 248684 (0.0033) [2024-06-19 01:15:58,380][26367] Fps is (10 sec: 36044.4, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4074520576. Throughput: 0: 41906.1. Samples: 342203780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:15:58,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 01:15:59,994][26599] Updated weights for policy 0, policy_version 248694 (0.0024) [2024-06-19 01:16:02,949][26599] Updated weights for policy 0, policy_version 248704 (0.0037) [2024-06-19 01:16:03,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42328.0, 300 sec: 42098.6). Total num frames: 4074782720. Throughput: 0: 42141.0. Samples: 342332460. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:03,384][26367] Avg episode reward: [(0, '0.756')] [2024-06-19 01:16:07,822][26599] Updated weights for policy 0, policy_version 248714 (0.0046) [2024-06-19 01:16:08,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40959.9, 300 sec: 41820.9). Total num frames: 4074930176. Throughput: 0: 41978.5. Samples: 342587300. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:08,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 01:16:10,431][26579] Signal inference workers to stop experience collection... 
(5050 times) [2024-06-19 01:16:10,464][26599] InferenceWorker_p0-w0: stopping experience collection (5050 times) [2024-06-19 01:16:10,498][26579] Signal inference workers to resume experience collection... (5050 times) [2024-06-19 01:16:10,498][26599] InferenceWorker_p0-w0: resuming experience collection (5050 times) [2024-06-19 01:16:10,661][26599] Updated weights for policy 0, policy_version 248724 (0.0031) [2024-06-19 01:16:13,384][26367] Fps is (10 sec: 39307.1, 60 sec: 42322.7, 300 sec: 41931.6). Total num frames: 4075175936. Throughput: 0: 41686.4. Samples: 342826520. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:13,385][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 01:16:15,825][26599] Updated weights for policy 0, policy_version 248734 (0.0044) [2024-06-19 01:16:18,380][26367] Fps is (10 sec: 47513.4, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 4075405312. Throughput: 0: 42009.3. Samples: 342960980. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:18,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 01:16:18,491][26599] Updated weights for policy 0, policy_version 248744 (0.0024) [2024-06-19 01:16:23,380][26367] Fps is (10 sec: 37697.3, 60 sec: 40962.5, 300 sec: 41765.3). Total num frames: 4075552768. Throughput: 0: 41803.2. Samples: 343207640. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:23,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 01:16:24,079][26599] Updated weights for policy 0, policy_version 248754 (0.0042) [2024-06-19 01:16:26,306][26599] Updated weights for policy 0, policy_version 248764 (0.0028) [2024-06-19 01:16:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 41987.4). Total num frames: 4075831296. Throughput: 0: 41650.1. Samples: 343447620. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:28,388][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 01:16:31,829][26599] Updated weights for policy 0, policy_version 248774 (0.0040) [2024-06-19 01:16:33,380][26367] Fps is (10 sec: 45875.4, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 4076011520. Throughput: 0: 42032.5. Samples: 343589780. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:33,380][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 01:16:34,187][26599] Updated weights for policy 0, policy_version 248784 (0.0038) [2024-06-19 01:16:38,380][26367] Fps is (10 sec: 34406.6, 60 sec: 41506.1, 300 sec: 41709.8). Total num frames: 4076175360. Throughput: 0: 41703.9. Samples: 343832060. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:38,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 01:16:39,548][26599] Updated weights for policy 0, policy_version 248794 (0.0043) [2024-06-19 01:16:42,042][26599] Updated weights for policy 0, policy_version 248804 (0.0031) [2024-06-19 01:16:43,381][26367] Fps is (10 sec: 45869.5, 60 sec: 42870.6, 300 sec: 42098.9). Total num frames: 4076470272. Throughput: 0: 41443.1. Samples: 344068760. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:43,382][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 01:16:47,520][26599] Updated weights for policy 0, policy_version 248814 (0.0045) [2024-06-19 01:16:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 40687.0, 300 sec: 41765.3). Total num frames: 4076601344. Throughput: 0: 41738.3. Samples: 344210680. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:48,380][26367] Avg episode reward: [(0, '0.799')] [2024-06-19 01:16:49,773][26599] Updated weights for policy 0, policy_version 248824 (0.0037) [2024-06-19 01:16:53,380][26367] Fps is (10 sec: 36048.7, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 4076830720. Throughput: 0: 41404.4. Samples: 344450500. 
Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:53,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 01:16:55,701][26599] Updated weights for policy 0, policy_version 248834 (0.0032) [2024-06-19 01:16:57,839][26599] Updated weights for policy 0, policy_version 248844 (0.0036) [2024-06-19 01:16:58,380][26367] Fps is (10 sec: 50788.8, 60 sec: 43144.5, 300 sec: 42154.5). Total num frames: 4077109248. Throughput: 0: 41555.2. Samples: 344696360. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:16:58,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 01:17:03,349][26599] Updated weights for policy 0, policy_version 248854 (0.0030) [2024-06-19 01:17:03,380][26367] Fps is (10 sec: 39322.2, 60 sec: 40687.0, 300 sec: 41765.4). Total num frames: 4077223936. Throughput: 0: 41617.1. Samples: 344833740. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:17:03,380][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 01:17:04,680][26579] Signal inference workers to stop experience collection... (5100 times) [2024-06-19 01:17:04,681][26579] Signal inference workers to resume experience collection... (5100 times) [2024-06-19 01:17:04,702][26599] InferenceWorker_p0-w0: stopping experience collection (5100 times) [2024-06-19 01:17:04,702][26599] InferenceWorker_p0-w0: resuming experience collection (5100 times) [2024-06-19 01:17:05,492][26599] Updated weights for policy 0, policy_version 248864 (0.0038) [2024-06-19 01:17:08,380][26367] Fps is (10 sec: 36045.7, 60 sec: 42325.4, 300 sec: 41876.8). Total num frames: 4077469696. Throughput: 0: 41630.6. Samples: 345081020. Policy #0 lag: (min: 0.0, avg: 8.0, max: 20.0) [2024-06-19 01:17:08,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 01:17:10,931][26599] Updated weights for policy 0, policy_version 248874 (0.0028) [2024-06-19 01:17:13,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42054.9, 300 sec: 41987.5). Total num frames: 4077699072. Throughput: 0: 41914.4. 
Samples: 345333760. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:13,381][26367] Avg episode reward: [(0, '0.277')] [2024-06-19 01:17:13,442][26599] Updated weights for policy 0, policy_version 248884 (0.0033) [2024-06-19 01:17:18,380][26367] Fps is (10 sec: 39321.8, 60 sec: 40960.1, 300 sec: 41820.9). Total num frames: 4077862912. Throughput: 0: 41657.8. Samples: 345464380. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:18,380][26367] Avg episode reward: [(0, '0.797')] [2024-06-19 01:17:18,547][26599] Updated weights for policy 0, policy_version 248894 (0.0035) [2024-06-19 01:17:21,109][26599] Updated weights for policy 0, policy_version 248904 (0.0035) [2024-06-19 01:17:23,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 41931.9). Total num frames: 4078125056. Throughput: 0: 41821.3. Samples: 345714020. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:23,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 01:17:26,517][26599] Updated weights for policy 0, policy_version 248914 (0.0042) [2024-06-19 01:17:28,380][26367] Fps is (10 sec: 45875.2, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4078321664. Throughput: 0: 42217.6. Samples: 345968500. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:28,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 01:17:28,811][26599] Updated weights for policy 0, policy_version 248924 (0.0040) [2024-06-19 01:17:33,380][26367] Fps is (10 sec: 37683.6, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 4078501888. Throughput: 0: 41703.0. Samples: 346087320. 
Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:33,384][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 01:17:34,103][26599] Updated weights for policy 0, policy_version 248934 (0.0024) [2024-06-19 01:17:36,758][26599] Updated weights for policy 0, policy_version 248944 (0.0042) [2024-06-19 01:17:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42043.0). Total num frames: 4078747648. Throughput: 0: 42029.4. Samples: 346341820. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:38,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 01:17:38,462][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000248948_4078764032.pth... [2024-06-19 01:17:38,505][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000248332_4068671488.pth [2024-06-19 01:17:41,893][26599] Updated weights for policy 0, policy_version 248954 (0.0047) [2024-06-19 01:17:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 40960.8, 300 sec: 41876.9). Total num frames: 4078927872. Throughput: 0: 42360.2. Samples: 346602560. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:43,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 01:17:44,844][26599] Updated weights for policy 0, policy_version 248964 (0.0028) [2024-06-19 01:17:48,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 41876.4). Total num frames: 4079157248. Throughput: 0: 41941.5. Samples: 346721120. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:48,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 01:17:49,567][26599] Updated weights for policy 0, policy_version 248974 (0.0037) [2024-06-19 01:17:52,494][26599] Updated weights for policy 0, policy_version 248984 (0.0037) [2024-06-19 01:17:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4079370240. Throughput: 0: 42072.3. Samples: 346974280. 
Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:53,381][26367] Avg episode reward: [(0, '0.388')] [2024-06-19 01:17:57,043][26599] Updated weights for policy 0, policy_version 248994 (0.0029) [2024-06-19 01:17:58,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.1, 300 sec: 41931.9). Total num frames: 4079566848. Throughput: 0: 42246.1. Samples: 347234840. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:17:58,381][26367] Avg episode reward: [(0, '0.332')] [2024-06-19 01:18:00,303][26599] Updated weights for policy 0, policy_version 249004 (0.0038) [2024-06-19 01:18:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 41765.3). Total num frames: 4079779840. Throughput: 0: 42049.6. Samples: 347356620. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:18:03,381][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 01:18:04,298][26579] Signal inference workers to stop experience collection... (5150 times) [2024-06-19 01:18:04,299][26579] Signal inference workers to resume experience collection... (5150 times) [2024-06-19 01:18:04,321][26599] InferenceWorker_p0-w0: stopping experience collection (5150 times) [2024-06-19 01:18:04,321][26599] InferenceWorker_p0-w0: resuming experience collection (5150 times) [2024-06-19 01:18:05,118][26599] Updated weights for policy 0, policy_version 249014 (0.0035) [2024-06-19 01:18:08,071][26599] Updated weights for policy 0, policy_version 249024 (0.0025) [2024-06-19 01:18:08,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42098.8). Total num frames: 4080025600. Throughput: 0: 42157.5. Samples: 347611100. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:18:08,380][26367] Avg episode reward: [(0, '0.336')] [2024-06-19 01:18:12,603][26599] Updated weights for policy 0, policy_version 249034 (0.0037) [2024-06-19 01:18:13,384][26367] Fps is (10 sec: 44221.0, 60 sec: 42049.7, 300 sec: 41986.9). Total num frames: 4080222208. Throughput: 0: 42195.6. 
Samples: 347867460. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:18:13,384][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 01:18:15,943][26599] Updated weights for policy 0, policy_version 249044 (0.0039) [2024-06-19 01:18:18,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42598.3, 300 sec: 41931.9). Total num frames: 4080418816. Throughput: 0: 42203.9. Samples: 347986500. Policy #0 lag: (min: 0.0, avg: 6.7, max: 21.0) [2024-06-19 01:18:18,385][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 01:18:20,384][26599] Updated weights for policy 0, policy_version 249054 (0.0025) [2024-06-19 01:18:23,384][26367] Fps is (10 sec: 37683.5, 60 sec: 41230.7, 300 sec: 41931.4). Total num frames: 4080599040. Throughput: 0: 42126.0. Samples: 348237640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:18:23,385][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 01:18:23,959][26599] Updated weights for policy 0, policy_version 249064 (0.0029) [2024-06-19 01:18:27,958][26599] Updated weights for policy 0, policy_version 249074 (0.0039) [2024-06-19 01:18:28,384][26367] Fps is (10 sec: 42583.2, 60 sec: 42049.7, 300 sec: 41987.0). Total num frames: 4080844800. Throughput: 0: 41876.2. Samples: 348487140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:18:28,385][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 01:18:31,723][26599] Updated weights for policy 0, policy_version 249084 (0.0028) [2024-06-19 01:18:33,380][26367] Fps is (10 sec: 42613.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4081025024. Throughput: 0: 42074.3. Samples: 348614460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:18:33,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 01:18:35,898][26599] Updated weights for policy 0, policy_version 249094 (0.0041) [2024-06-19 01:18:38,380][26367] Fps is (10 sec: 39335.9, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4081238016. Throughput: 0: 41892.1. 
Samples: 348859420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:18:38,381][26367] Avg episode reward: [(0, '0.411')] [2024-06-19 01:18:40,112][26599] Updated weights for policy 0, policy_version 249104 (0.0028) [2024-06-19 01:18:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41932.3). Total num frames: 4081451008. Throughput: 0: 41638.7. Samples: 349108580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:18:43,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 01:18:43,919][26599] Updated weights for policy 0, policy_version 249114 (0.0030) [2024-06-19 01:18:48,006][26599] Updated weights for policy 0, policy_version 249124 (0.0032) [2024-06-19 01:18:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 41932.3). Total num frames: 4081664000. Throughput: 0: 41751.2. Samples: 349235420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:18:48,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 01:18:51,601][26599] Updated weights for policy 0, policy_version 249134 (0.0039) [2024-06-19 01:18:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4081893376. Throughput: 0: 41742.1. Samples: 349489500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:18:53,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 01:18:55,820][26599] Updated weights for policy 0, policy_version 249144 (0.0035) [2024-06-19 01:18:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4082089984. Throughput: 0: 41496.1. Samples: 349734640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:18:58,381][26367] Avg episode reward: [(0, '0.371')] [2024-06-19 01:18:59,302][26599] Updated weights for policy 0, policy_version 249154 (0.0044) [2024-06-19 01:19:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4082286592. Throughput: 0: 41615.7. 
Samples: 349859200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:19:03,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 01:19:03,563][26599] Updated weights for policy 0, policy_version 249164 (0.0029) [2024-06-19 01:19:06,975][26599] Updated weights for policy 0, policy_version 249174 (0.0029) [2024-06-19 01:19:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41232.9, 300 sec: 41931.9). Total num frames: 4082499584. Throughput: 0: 41640.1. Samples: 350111300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:19:08,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 01:19:11,327][26599] Updated weights for policy 0, policy_version 249184 (0.0032) [2024-06-19 01:19:13,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41508.6, 300 sec: 41931.9). Total num frames: 4082712576. Throughput: 0: 41699.3. Samples: 350363460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:19:13,381][26367] Avg episode reward: [(0, '0.849')] [2024-06-19 01:19:15,210][26599] Updated weights for policy 0, policy_version 249194 (0.0036) [2024-06-19 01:19:18,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 4082909184. Throughput: 0: 41750.9. Samples: 350493240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:19:18,381][26367] Avg episode reward: [(0, '0.875')] [2024-06-19 01:19:19,037][26599] Updated weights for policy 0, policy_version 249204 (0.0037) [2024-06-19 01:19:22,895][26599] Updated weights for policy 0, policy_version 249214 (0.0038) [2024-06-19 01:19:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42327.9, 300 sec: 41987.5). Total num frames: 4083138560. Throughput: 0: 41819.2. Samples: 350741280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:19:23,380][26367] Avg episode reward: [(0, '0.384')] [2024-06-19 01:19:27,038][26599] Updated weights for policy 0, policy_version 249224 (0.0037) [2024-06-19 01:19:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41508.7, 300 sec: 41931.9). Total num frames: 4083335168. Throughput: 0: 41846.4. Samples: 350991660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:19:28,380][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 01:19:28,655][26579] Signal inference workers to stop experience collection... (5200 times) [2024-06-19 01:19:28,707][26599] InferenceWorker_p0-w0: stopping experience collection (5200 times) [2024-06-19 01:19:28,716][26579] Signal inference workers to resume experience collection... (5200 times) [2024-06-19 01:19:28,732][26599] InferenceWorker_p0-w0: resuming experience collection (5200 times) [2024-06-19 01:19:30,745][26599] Updated weights for policy 0, policy_version 249234 (0.0029) [2024-06-19 01:19:33,380][26367] Fps is (10 sec: 39320.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4083531776. Throughput: 0: 41871.0. Samples: 351119620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:19:33,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 01:19:34,840][26599] Updated weights for policy 0, policy_version 249244 (0.0046) [2024-06-19 01:19:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41932.3). Total num frames: 4083761152. Throughput: 0: 41720.5. Samples: 351366920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:19:38,380][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 01:19:38,477][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000249254_4083777536.pth... 
[2024-06-19 01:19:38,485][26599] Updated weights for policy 0, policy_version 249254 (0.0033) [2024-06-19 01:19:38,538][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000248639_4073701376.pth [2024-06-19 01:19:42,900][26599] Updated weights for policy 0, policy_version 249264 (0.0028) [2024-06-19 01:19:43,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41506.3, 300 sec: 41821.3). Total num frames: 4083941376. Throughput: 0: 41854.8. Samples: 351618100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:19:43,380][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 01:19:46,151][26599] Updated weights for policy 0, policy_version 249274 (0.0032) [2024-06-19 01:19:48,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4084154368. Throughput: 0: 41842.1. Samples: 351742100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:19:48,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 01:19:50,626][26599] Updated weights for policy 0, policy_version 249284 (0.0043) [2024-06-19 01:19:53,380][26367] Fps is (10 sec: 47512.7, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4084416512. Throughput: 0: 41806.2. Samples: 351992580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:19:53,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 01:19:53,925][26599] Updated weights for policy 0, policy_version 249294 (0.0032) [2024-06-19 01:19:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41821.4). Total num frames: 4084580352. Throughput: 0: 41878.7. Samples: 352248000. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:19:58,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 01:19:58,506][26599] Updated weights for policy 0, policy_version 249304 (0.0031) [2024-06-19 01:20:01,649][26599] Updated weights for policy 0, policy_version 249314 (0.0041) [2024-06-19 01:20:03,380][26367] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4084793344. Throughput: 0: 41584.4. Samples: 352364540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:20:03,381][26367] Avg episode reward: [(0, '0.292')] [2024-06-19 01:20:06,599][26599] Updated weights for policy 0, policy_version 249324 (0.0030) [2024-06-19 01:20:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4085006336. Throughput: 0: 41799.4. Samples: 352622260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:20:08,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 01:20:09,775][26599] Updated weights for policy 0, policy_version 249334 (0.0041) [2024-06-19 01:20:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 4085202944. Throughput: 0: 41852.3. Samples: 352875020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:20:13,381][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 01:20:14,244][26599] Updated weights for policy 0, policy_version 249344 (0.0036) [2024-06-19 01:20:17,621][26599] Updated weights for policy 0, policy_version 249354 (0.0024) [2024-06-19 01:20:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41821.4). Total num frames: 4085432320. Throughput: 0: 41658.3. Samples: 352994240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:20:18,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 01:20:22,010][26599] Updated weights for policy 0, policy_version 249364 (0.0052) [2024-06-19 01:20:23,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42052.1, 300 sec: 42043.0). 
Total num frames: 4085661696. Throughput: 0: 41759.8. Samples: 353246120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:20:23,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 01:20:25,466][26599] Updated weights for policy 0, policy_version 249374 (0.0036) [2024-06-19 01:20:28,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41233.0, 300 sec: 41598.7). Total num frames: 4085809152. Throughput: 0: 41867.5. Samples: 353502140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:20:28,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 01:20:30,204][26599] Updated weights for policy 0, policy_version 249384 (0.0032) [2024-06-19 01:20:33,353][26599] Updated weights for policy 0, policy_version 249394 (0.0028) [2024-06-19 01:20:33,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4086071296. Throughput: 0: 41622.8. Samples: 353615120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:20:33,380][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 01:20:38,189][26599] Updated weights for policy 0, policy_version 249404 (0.0042) [2024-06-19 01:20:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 4086235136. Throughput: 0: 41756.1. Samples: 353871600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 01:20:38,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 01:20:41,199][26599] Updated weights for policy 0, policy_version 249414 (0.0031) [2024-06-19 01:20:43,380][26367] Fps is (10 sec: 37682.7, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 4086448128. Throughput: 0: 41522.6. Samples: 354116520. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:20:43,384][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 01:20:44,787][26579] Signal inference workers to stop experience collection... (5250 times) [2024-06-19 01:20:44,787][26579] Signal inference workers to resume experience collection... 
(5250 times) [2024-06-19 01:20:44,830][26599] InferenceWorker_p0-w0: stopping experience collection (5250 times) [2024-06-19 01:20:44,830][26599] InferenceWorker_p0-w0: resuming experience collection (5250 times) [2024-06-19 01:20:45,934][26599] Updated weights for policy 0, policy_version 249424 (0.0035) [2024-06-19 01:20:48,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4086693888. Throughput: 0: 41758.6. Samples: 354243680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:20:48,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 01:20:48,831][26599] Updated weights for policy 0, policy_version 249434 (0.0042) [2024-06-19 01:20:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40687.0, 300 sec: 41820.9). Total num frames: 4086857728. Throughput: 0: 41632.9. Samples: 354495740. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:20:53,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 01:20:53,681][26599] Updated weights for policy 0, policy_version 249444 (0.0035) [2024-06-19 01:20:57,054][26599] Updated weights for policy 0, policy_version 249454 (0.0037) [2024-06-19 01:20:58,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4087087104. Throughput: 0: 41538.2. Samples: 354744240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:20:58,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 01:21:01,306][26599] Updated weights for policy 0, policy_version 249464 (0.0030) [2024-06-19 01:21:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4087300096. Throughput: 0: 41740.5. Samples: 354872560. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:03,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 01:21:05,012][26599] Updated weights for policy 0, policy_version 249474 (0.0052) [2024-06-19 01:21:08,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41765.8). Total num frames: 4087496704. Throughput: 0: 41737.8. Samples: 355124320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:08,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 01:21:09,359][26599] Updated weights for policy 0, policy_version 249484 (0.0046) [2024-06-19 01:21:12,824][26599] Updated weights for policy 0, policy_version 249494 (0.0037) [2024-06-19 01:21:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 4087709696. Throughput: 0: 41480.8. Samples: 355368780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:13,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 01:21:16,986][26599] Updated weights for policy 0, policy_version 249504 (0.0036) [2024-06-19 01:21:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4087922688. Throughput: 0: 41900.8. Samples: 355500660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:18,381][26367] Avg episode reward: [(0, '0.401')] [2024-06-19 01:21:21,204][26599] Updated weights for policy 0, policy_version 249514 (0.0032) [2024-06-19 01:21:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 4088135680. Throughput: 0: 41661.2. Samples: 355746360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:23,381][26367] Avg episode reward: [(0, '0.394')] [2024-06-19 01:21:24,679][26599] Updated weights for policy 0, policy_version 249524 (0.0032) [2024-06-19 01:21:28,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42049.7, 300 sec: 41764.8). Total num frames: 4088332288. Throughput: 0: 41715.8. Samples: 355993880. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:28,384][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 01:21:28,777][26599] Updated weights for policy 0, policy_version 249534 (0.0037) [2024-06-19 01:21:32,476][26599] Updated weights for policy 0, policy_version 249544 (0.0051) [2024-06-19 01:21:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 4088545280. Throughput: 0: 41690.7. Samples: 356119760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:33,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 01:21:36,640][26599] Updated weights for policy 0, policy_version 249554 (0.0029) [2024-06-19 01:21:38,380][26367] Fps is (10 sec: 42613.3, 60 sec: 42052.1, 300 sec: 41654.4). Total num frames: 4088758272. Throughput: 0: 41690.1. Samples: 356371800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:38,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 01:21:38,435][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000249559_4088774656.pth... [2024-06-19 01:21:38,482][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000248948_4078764032.pth [2024-06-19 01:21:40,179][26599] Updated weights for policy 0, policy_version 249564 (0.0032) [2024-06-19 01:21:43,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4088971264. Throughput: 0: 41798.9. Samples: 356625200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:43,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 01:21:44,190][26599] Updated weights for policy 0, policy_version 249574 (0.0029) [2024-06-19 01:21:47,910][26599] Updated weights for policy 0, policy_version 249584 (0.0029) [2024-06-19 01:21:48,384][26367] Fps is (10 sec: 42583.6, 60 sec: 41503.7, 300 sec: 41875.9). Total num frames: 4089184256. Throughput: 0: 41684.2. Samples: 356748500. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 01:21:48,384][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 01:21:52,516][26599] Updated weights for policy 0, policy_version 249594 (0.0026) [2024-06-19 01:21:53,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 4089380864. Throughput: 0: 41701.0. Samples: 357000860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:21:53,381][26367] Avg episode reward: [(0, '0.737')] [2024-06-19 01:21:55,583][26599] Updated weights for policy 0, policy_version 249604 (0.0037) [2024-06-19 01:21:58,380][26367] Fps is (10 sec: 40974.0, 60 sec: 41779.0, 300 sec: 41931.9). Total num frames: 4089593856. Throughput: 0: 41845.7. Samples: 357251840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:21:58,381][26367] Avg episode reward: [(0, '0.804')] [2024-06-19 01:22:00,091][26599] Updated weights for policy 0, policy_version 249614 (0.0033) [2024-06-19 01:22:03,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4089823232. Throughput: 0: 41756.5. Samples: 357379700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:03,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 01:22:03,409][26599] Updated weights for policy 0, policy_version 249624 (0.0041) [2024-06-19 01:22:07,788][26599] Updated weights for policy 0, policy_version 249634 (0.0031) [2024-06-19 01:22:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 4090019840. Throughput: 0: 42002.7. Samples: 357636480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:08,381][26367] Avg episode reward: [(0, '0.265')] [2024-06-19 01:22:11,609][26599] Updated weights for policy 0, policy_version 249644 (0.0033) [2024-06-19 01:22:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4090232832. Throughput: 0: 42053.2. Samples: 357886120. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:13,384][26367] Avg episode reward: [(0, '0.283')] [2024-06-19 01:22:15,770][26599] Updated weights for policy 0, policy_version 249654 (0.0044) [2024-06-19 01:22:18,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 4090445824. Throughput: 0: 41891.9. Samples: 358004900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:18,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 01:22:19,207][26599] Updated weights for policy 0, policy_version 249664 (0.0034) [2024-06-19 01:22:23,240][26579] Signal inference workers to stop experience collection... (5300 times) [2024-06-19 01:22:23,273][26599] InferenceWorker_p0-w0: stopping experience collection (5300 times) [2024-06-19 01:22:23,299][26579] Signal inference workers to resume experience collection... (5300 times) [2024-06-19 01:22:23,304][26599] InferenceWorker_p0-w0: resuming experience collection (5300 times) [2024-06-19 01:22:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4090642432. Throughput: 0: 41988.9. Samples: 358261300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:23,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 01:22:23,448][26599] Updated weights for policy 0, policy_version 249674 (0.0025) [2024-06-19 01:22:27,119][26599] Updated weights for policy 0, policy_version 249684 (0.0039) [2024-06-19 01:22:28,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42054.9, 300 sec: 41876.4). Total num frames: 4090855424. Throughput: 0: 41978.9. Samples: 358514240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:28,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 01:22:31,096][26599] Updated weights for policy 0, policy_version 249694 (0.0035) [2024-06-19 01:22:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 4091084800. Throughput: 0: 41996.2. 
Samples: 358638180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:33,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 01:22:34,923][26599] Updated weights for policy 0, policy_version 249704 (0.0034) [2024-06-19 01:22:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4091265024. Throughput: 0: 41981.8. Samples: 358890040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:38,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 01:22:38,945][26599] Updated weights for policy 0, policy_version 249714 (0.0034) [2024-06-19 01:22:42,755][26599] Updated weights for policy 0, policy_version 249724 (0.0029) [2024-06-19 01:22:43,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41779.4, 300 sec: 41765.3). Total num frames: 4091478016. Throughput: 0: 41984.2. Samples: 359141120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:43,381][26367] Avg episode reward: [(0, '0.415')] [2024-06-19 01:22:46,746][26599] Updated weights for policy 0, policy_version 249734 (0.0029) [2024-06-19 01:22:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41781.8, 300 sec: 41765.3). Total num frames: 4091691008. Throughput: 0: 42122.3. Samples: 359275200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:48,380][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 01:22:50,488][26599] Updated weights for policy 0, policy_version 249744 (0.0026) [2024-06-19 01:22:53,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 4091904000. Throughput: 0: 41882.2. Samples: 359521180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:53,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 01:22:55,010][26599] Updated weights for policy 0, policy_version 249754 (0.0032) [2024-06-19 01:22:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.5, 300 sec: 41820.9). Total num frames: 4092116992. Throughput: 0: 41873.4. Samples: 359770420. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 01:22:58,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 01:22:58,411][26599] Updated weights for policy 0, policy_version 249764 (0.0033) [2024-06-19 01:23:02,784][26599] Updated weights for policy 0, policy_version 249774 (0.0037) [2024-06-19 01:23:03,380][26367] Fps is (10 sec: 42599.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4092329984. Throughput: 0: 42025.5. Samples: 359896040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:03,380][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 01:23:06,435][26599] Updated weights for policy 0, policy_version 249784 (0.0037) [2024-06-19 01:23:08,381][26367] Fps is (10 sec: 42596.4, 60 sec: 42052.1, 300 sec: 41765.8). Total num frames: 4092542976. Throughput: 0: 41866.9. Samples: 360145320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:08,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 01:23:10,577][26599] Updated weights for policy 0, policy_version 249794 (0.0032) [2024-06-19 01:23:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4092739584. Throughput: 0: 41905.3. Samples: 360399980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:13,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 01:23:14,226][26599] Updated weights for policy 0, policy_version 249804 (0.0037) [2024-06-19 01:23:18,313][26599] Updated weights for policy 0, policy_version 249814 (0.0038) [2024-06-19 01:23:18,380][26367] Fps is (10 sec: 40961.3, 60 sec: 41779.2, 300 sec: 41876.9). Total num frames: 4092952576. Throughput: 0: 41815.1. Samples: 360519860. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:18,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 01:23:22,108][26599] Updated weights for policy 0, policy_version 249824 (0.0032) [2024-06-19 01:23:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41710.3). 
Total num frames: 4093149184. Throughput: 0: 41864.9. Samples: 360773960. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:23,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 01:23:26,107][26599] Updated weights for policy 0, policy_version 249834 (0.0035) [2024-06-19 01:23:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 4093362176. Throughput: 0: 41746.1. Samples: 361019700. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:28,381][26367] Avg episode reward: [(0, '0.788')] [2024-06-19 01:23:29,985][26599] Updated weights for policy 0, policy_version 249844 (0.0035) [2024-06-19 01:23:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 4093575168. Throughput: 0: 41583.0. Samples: 361146440. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:33,384][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 01:23:33,758][26599] Updated weights for policy 0, policy_version 249854 (0.0050) [2024-06-19 01:23:37,836][26599] Updated weights for policy 0, policy_version 249864 (0.0035) [2024-06-19 01:23:38,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42049.7, 300 sec: 41820.4). Total num frames: 4093788160. Throughput: 0: 41759.0. Samples: 361400480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:38,385][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 01:23:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000249865_4093788160.pth... [2024-06-19 01:23:38,456][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000249254_4083777536.pth [2024-06-19 01:23:41,779][26599] Updated weights for policy 0, policy_version 249874 (0.0036) [2024-06-19 01:23:43,384][26367] Fps is (10 sec: 42584.3, 60 sec: 42049.9, 300 sec: 41820.4). Total num frames: 4094001152. Throughput: 0: 41689.8. Samples: 361646600. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:43,384][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 01:23:45,625][26599] Updated weights for policy 0, policy_version 249884 (0.0045) [2024-06-19 01:23:48,380][26367] Fps is (10 sec: 40975.4, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4094197760. Throughput: 0: 41726.3. Samples: 361773720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:48,380][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 01:23:49,303][26599] Updated weights for policy 0, policy_version 249894 (0.0039) [2024-06-19 01:23:52,210][26579] Signal inference workers to stop experience collection... (5350 times) [2024-06-19 01:23:52,211][26579] Signal inference workers to resume experience collection... (5350 times) [2024-06-19 01:23:52,245][26599] InferenceWorker_p0-w0: stopping experience collection (5350 times) [2024-06-19 01:23:52,245][26599] InferenceWorker_p0-w0: resuming experience collection (5350 times) [2024-06-19 01:23:53,380][26367] Fps is (10 sec: 39334.9, 60 sec: 41506.3, 300 sec: 41709.8). Total num frames: 4094394368. Throughput: 0: 41910.7. Samples: 362031280. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:53,380][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 01:23:53,753][26599] Updated weights for policy 0, policy_version 249904 (0.0035) [2024-06-19 01:23:57,191][26599] Updated weights for policy 0, policy_version 249914 (0.0029) [2024-06-19 01:23:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4094623744. Throughput: 0: 41502.3. Samples: 362267580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:23:58,380][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 01:24:01,822][26599] Updated weights for policy 0, policy_version 249924 (0.0034) [2024-06-19 01:24:03,380][26367] Fps is (10 sec: 44236.2, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 4094836736. Throughput: 0: 41749.8. 
Samples: 362398600. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:24:03,384][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 01:24:04,975][26599] Updated weights for policy 0, policy_version 249934 (0.0037) [2024-06-19 01:24:08,380][26367] Fps is (10 sec: 37683.1, 60 sec: 40960.3, 300 sec: 41654.3). Total num frames: 4095000576. Throughput: 0: 41543.6. Samples: 362643420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 01:24:08,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 01:24:09,464][26599] Updated weights for policy 0, policy_version 249944 (0.0033) [2024-06-19 01:24:12,995][26599] Updated weights for policy 0, policy_version 249954 (0.0042) [2024-06-19 01:24:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4095262720. Throughput: 0: 41518.7. Samples: 362888040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:13,381][26367] Avg episode reward: [(0, '0.788')] [2024-06-19 01:24:17,240][26599] Updated weights for policy 0, policy_version 249964 (0.0041) [2024-06-19 01:24:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41233.2, 300 sec: 41654.2). Total num frames: 4095426560. Throughput: 0: 41712.5. Samples: 363023500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:18,380][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 01:24:20,783][26599] Updated weights for policy 0, policy_version 249974 (0.0032) [2024-06-19 01:24:23,380][26367] Fps is (10 sec: 36044.6, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 4095623168. Throughput: 0: 41436.2. Samples: 363264960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:23,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 01:24:24,878][26599] Updated weights for policy 0, policy_version 249984 (0.0046) [2024-06-19 01:24:28,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4095885312. Throughput: 0: 41568.9. 
Samples: 363517060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:28,380][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 01:24:28,392][26599] Updated weights for policy 0, policy_version 249994 (0.0033) [2024-06-19 01:24:32,497][26599] Updated weights for policy 0, policy_version 250004 (0.0042) [2024-06-19 01:24:33,380][26367] Fps is (10 sec: 45875.6, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4096081920. Throughput: 0: 41810.6. Samples: 363655200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:33,380][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 01:24:36,521][26599] Updated weights for policy 0, policy_version 250014 (0.0048) [2024-06-19 01:24:38,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41235.6, 300 sec: 41765.3). Total num frames: 4096262144. Throughput: 0: 41347.5. Samples: 363891920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:38,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 01:24:40,741][26599] Updated weights for policy 0, policy_version 250024 (0.0042) [2024-06-19 01:24:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41508.5, 300 sec: 41820.9). Total num frames: 4096491520. Throughput: 0: 41767.1. Samples: 364147100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:43,380][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 01:24:44,210][26599] Updated weights for policy 0, policy_version 250034 (0.0031) [2024-06-19 01:24:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 4096688128. Throughput: 0: 41812.6. Samples: 364280160. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:48,380][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 01:24:48,548][26599] Updated weights for policy 0, policy_version 250044 (0.0037) [2024-06-19 01:24:52,050][26599] Updated weights for policy 0, policy_version 250054 (0.0041) [2024-06-19 01:24:53,380][26367] Fps is (10 sec: 42597.3, 60 sec: 42052.1, 300 sec: 41820.8). Total num frames: 4096917504. Throughput: 0: 41685.5. Samples: 364519280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:53,381][26367] Avg episode reward: [(0, '0.336')] [2024-06-19 01:24:56,258][26599] Updated weights for policy 0, policy_version 250064 (0.0023) [2024-06-19 01:24:58,380][26367] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 4097130496. Throughput: 0: 42042.6. Samples: 364779960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:24:58,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 01:24:59,822][26599] Updated weights for policy 0, policy_version 250074 (0.0032) [2024-06-19 01:25:03,380][26367] Fps is (10 sec: 39322.8, 60 sec: 41233.2, 300 sec: 41709.8). Total num frames: 4097310720. Throughput: 0: 41713.4. Samples: 364900600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:25:03,380][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 01:25:04,078][26599] Updated weights for policy 0, policy_version 250084 (0.0037) [2024-06-19 01:25:07,675][26599] Updated weights for policy 0, policy_version 250094 (0.0036) [2024-06-19 01:25:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 41876.4). Total num frames: 4097556480. Throughput: 0: 41942.6. Samples: 365152380. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:25:08,381][26367] Avg episode reward: [(0, '0.283')] [2024-06-19 01:25:12,163][26599] Updated weights for policy 0, policy_version 250104 (0.0033) [2024-06-19 01:25:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 40960.1, 300 sec: 41654.3). Total num frames: 4097720320. Throughput: 0: 41989.8. Samples: 365406600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:25:13,380][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 01:25:13,528][26579] Signal inference workers to stop experience collection... (5400 times) [2024-06-19 01:25:13,528][26579] Signal inference workers to resume experience collection... (5400 times) [2024-06-19 01:25:13,547][26599] InferenceWorker_p0-w0: stopping experience collection (5400 times) [2024-06-19 01:25:13,547][26599] InferenceWorker_p0-w0: resuming experience collection (5400 times) [2024-06-19 01:25:15,578][26599] Updated weights for policy 0, policy_version 250114 (0.0035) [2024-06-19 01:25:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 41654.3). Total num frames: 4097949696. Throughput: 0: 41495.5. Samples: 365522500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:25:18,381][26367] Avg episode reward: [(0, '0.266')] [2024-06-19 01:25:20,498][26599] Updated weights for policy 0, policy_version 250124 (0.0027) [2024-06-19 01:25:23,377][26599] Updated weights for policy 0, policy_version 250134 (0.0044) [2024-06-19 01:25:23,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42871.6, 300 sec: 41987.5). Total num frames: 4098195456. Throughput: 0: 41795.6. Samples: 365772720. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:25:23,380][26367] Avg episode reward: [(0, '0.394')] [2024-06-19 01:25:28,277][26599] Updated weights for policy 0, policy_version 250144 (0.0037) [2024-06-19 01:25:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41233.0, 300 sec: 41654.2). Total num frames: 4098359296. Throughput: 0: 41875.9. 
Samples: 366031520. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:25:28,381][26367] Avg episode reward: [(0, '0.325')] [2024-06-19 01:25:31,243][26599] Updated weights for policy 0, policy_version 250154 (0.0028) [2024-06-19 01:25:33,380][26367] Fps is (10 sec: 37682.8, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 4098572288. Throughput: 0: 41455.4. Samples: 366145660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:25:33,381][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 01:25:36,187][26599] Updated weights for policy 0, policy_version 250164 (0.0038) [2024-06-19 01:25:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4098801664. Throughput: 0: 41798.8. Samples: 366400220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:25:38,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 01:25:38,390][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000250171_4098801664.pth... [2024-06-19 01:25:38,463][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000249559_4088774656.pth [2024-06-19 01:25:39,366][26599] Updated weights for policy 0, policy_version 250174 (0.0037) [2024-06-19 01:25:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41654.2). Total num frames: 4098981888. Throughput: 0: 41590.7. Samples: 366651540. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:25:43,381][26367] Avg episode reward: [(0, '0.763')] [2024-06-19 01:25:44,189][26599] Updated weights for policy 0, policy_version 250184 (0.0025) [2024-06-19 01:25:47,060][26599] Updated weights for policy 0, policy_version 250194 (0.0041) [2024-06-19 01:25:48,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4099194880. Throughput: 0: 41639.5. Samples: 366774380. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:25:48,380][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 01:25:51,930][26599] Updated weights for policy 0, policy_version 250204 (0.0028) [2024-06-19 01:25:53,384][26367] Fps is (10 sec: 44220.8, 60 sec: 41776.8, 300 sec: 41820.3). Total num frames: 4099424256. Throughput: 0: 41683.4. Samples: 367028280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:25:53,385][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 01:25:54,960][26599] Updated weights for policy 0, policy_version 250214 (0.0034) [2024-06-19 01:25:58,380][26367] Fps is (10 sec: 40959.0, 60 sec: 41233.0, 300 sec: 41709.8). Total num frames: 4099604480. Throughput: 0: 41441.1. Samples: 367271460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:25:58,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 01:25:59,697][26599] Updated weights for policy 0, policy_version 250224 (0.0042) [2024-06-19 01:26:03,380][26367] Fps is (10 sec: 39335.9, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 4099817472. Throughput: 0: 41615.1. Samples: 367395180. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:26:03,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 01:26:03,410][26599] Updated weights for policy 0, policy_version 250234 (0.0034) [2024-06-19 01:26:07,505][26599] Updated weights for policy 0, policy_version 250244 (0.0038) [2024-06-19 01:26:08,380][26367] Fps is (10 sec: 40960.9, 60 sec: 40960.1, 300 sec: 41709.8). Total num frames: 4100014080. Throughput: 0: 41582.2. Samples: 367643920. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:26:08,380][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 01:26:11,401][26599] Updated weights for policy 0, policy_version 250254 (0.0041) [2024-06-19 01:26:13,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4100227072. Throughput: 0: 41288.2. Samples: 367889480. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:26:13,380][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 01:26:15,291][26599] Updated weights for policy 0, policy_version 250264 (0.0033) [2024-06-19 01:26:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 4100440064. Throughput: 0: 41631.6. Samples: 368019080. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:26:18,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 01:26:19,115][26599] Updated weights for policy 0, policy_version 250274 (0.0037) [2024-06-19 01:26:23,082][26599] Updated weights for policy 0, policy_version 250284 (0.0034) [2024-06-19 01:26:23,380][26367] Fps is (10 sec: 42597.1, 60 sec: 40959.8, 300 sec: 41765.8). Total num frames: 4100653056. Throughput: 0: 41515.4. Samples: 368268420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:26:23,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 01:26:27,024][26599] Updated weights for policy 0, policy_version 250294 (0.0035) [2024-06-19 01:26:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4100866048. Throughput: 0: 41443.1. Samples: 368516480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:26:28,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 01:26:30,727][26599] Updated weights for policy 0, policy_version 250304 (0.0032) [2024-06-19 01:26:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.0, 300 sec: 41709.8). Total num frames: 4101062656. Throughput: 0: 41446.4. Samples: 368639480. Policy #0 lag: (min: 0.0, avg: 11.9, max: 26.0) [2024-06-19 01:26:33,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 01:26:35,270][26599] Updated weights for policy 0, policy_version 250314 (0.0034) [2024-06-19 01:26:38,303][26579] Signal inference workers to stop experience collection... 
(5450 times) [2024-06-19 01:26:38,344][26599] InferenceWorker_p0-w0: stopping experience collection (5450 times) [2024-06-19 01:26:38,354][26579] Signal inference workers to resume experience collection... (5450 times) [2024-06-19 01:26:38,355][26599] InferenceWorker_p0-w0: resuming experience collection (5450 times) [2024-06-19 01:26:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 41765.4). Total num frames: 4101292032. Throughput: 0: 41366.1. Samples: 368889600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:26:38,380][26367] Avg episode reward: [(0, '0.822')] [2024-06-19 01:26:38,513][26599] Updated weights for policy 0, policy_version 250324 (0.0037) [2024-06-19 01:26:43,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41599.2). Total num frames: 4101455872. Throughput: 0: 41477.0. Samples: 369137920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:26:43,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 01:26:43,523][26599] Updated weights for policy 0, policy_version 250334 (0.0040) [2024-06-19 01:26:46,572][26599] Updated weights for policy 0, policy_version 250344 (0.0042) [2024-06-19 01:26:48,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 4101718016. Throughput: 0: 41371.1. Samples: 369256880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:26:48,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 01:26:51,465][26599] Updated weights for policy 0, policy_version 250354 (0.0032) [2024-06-19 01:26:53,380][26367] Fps is (10 sec: 42599.0, 60 sec: 40962.6, 300 sec: 41654.3). Total num frames: 4101881856. Throughput: 0: 41428.9. Samples: 369508220. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:26:53,380][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 01:26:54,392][26599] Updated weights for policy 0, policy_version 250364 (0.0041) [2024-06-19 01:26:58,380][26367] Fps is (10 sec: 36045.3, 60 sec: 41233.2, 300 sec: 41543.2). Total num frames: 4102078464. Throughput: 0: 41680.4. Samples: 369765100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:26:58,380][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 01:26:59,105][26599] Updated weights for policy 0, policy_version 250374 (0.0051) [2024-06-19 01:27:02,208][26599] Updated weights for policy 0, policy_version 250384 (0.0036) [2024-06-19 01:27:03,380][26367] Fps is (10 sec: 45874.3, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 4102340608. Throughput: 0: 41631.0. Samples: 369892480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:27:03,381][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 01:27:06,801][26599] Updated weights for policy 0, policy_version 250394 (0.0032) [2024-06-19 01:27:08,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 4102504448. Throughput: 0: 41510.8. Samples: 370136400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:27:08,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 01:27:10,114][26599] Updated weights for policy 0, policy_version 250404 (0.0032) [2024-06-19 01:27:13,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41506.0, 300 sec: 41598.7). Total num frames: 4102717440. Throughput: 0: 41610.2. Samples: 370388940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:27:13,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 01:27:14,495][26599] Updated weights for policy 0, policy_version 250414 (0.0035) [2024-06-19 01:27:17,994][26599] Updated weights for policy 0, policy_version 250424 (0.0036) [2024-06-19 01:27:18,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 4102963200. Throughput: 0: 41856.1. Samples: 370523000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:27:18,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 01:27:22,220][26599] Updated weights for policy 0, policy_version 250434 (0.0029) [2024-06-19 01:27:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41233.2, 300 sec: 41598.7). Total num frames: 4103127040. Throughput: 0: 41746.1. Samples: 370768180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:27:23,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 01:27:25,974][26599] Updated weights for policy 0, policy_version 250444 (0.0038) [2024-06-19 01:27:28,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41776.7, 300 sec: 41653.7). Total num frames: 4103372800. Throughput: 0: 41685.6. Samples: 371013920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:27:28,385][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 01:27:29,973][26599] Updated weights for policy 0, policy_version 250454 (0.0056) [2024-06-19 01:27:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41779.3, 300 sec: 41709.8). Total num frames: 4103569408. Throughput: 0: 41953.3. Samples: 371144780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:27:33,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 01:27:33,735][26599] Updated weights for policy 0, policy_version 250464 (0.0036) [2024-06-19 01:27:37,718][26599] Updated weights for policy 0, policy_version 250474 (0.0047) [2024-06-19 01:27:38,384][26367] Fps is (10 sec: 40959.8, 60 sec: 41503.5, 300 sec: 41709.2). Total num frames: 4103782400. Throughput: 0: 41867.6. Samples: 371392420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:27:38,385][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 01:27:38,399][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000250475_4103782400.pth... [2024-06-19 01:27:38,456][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000249865_4093788160.pth [2024-06-19 01:27:41,512][26599] Updated weights for policy 0, policy_version 250484 (0.0040) [2024-06-19 01:27:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 41765.3). Total num frames: 4104011776. Throughput: 0: 41655.5. Samples: 371639600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 24.0) [2024-06-19 01:27:43,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 01:27:45,928][26599] Updated weights for policy 0, policy_version 250494 (0.0042) [2024-06-19 01:27:48,380][26367] Fps is (10 sec: 39336.0, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 4104175616. Throughput: 0: 41592.9. Samples: 371764160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:27:48,381][26367] Avg episode reward: [(0, '0.420')] [2024-06-19 01:27:48,458][26579] Signal inference workers to stop experience collection... (5500 times) [2024-06-19 01:27:48,459][26579] Signal inference workers to resume experience collection... 
(5500 times) [2024-06-19 01:27:48,479][26599] InferenceWorker_p0-w0: stopping experience collection (5500 times) [2024-06-19 01:27:48,479][26599] InferenceWorker_p0-w0: resuming experience collection (5500 times) [2024-06-19 01:27:49,249][26599] Updated weights for policy 0, policy_version 250504 (0.0042) [2024-06-19 01:27:53,384][26367] Fps is (10 sec: 39307.2, 60 sec: 42049.6, 300 sec: 41653.7). Total num frames: 4104404992. Throughput: 0: 41691.8. Samples: 372012680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:27:53,385][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 01:27:53,750][26599] Updated weights for policy 0, policy_version 250514 (0.0035) [2024-06-19 01:27:57,243][26599] Updated weights for policy 0, policy_version 250524 (0.0034) [2024-06-19 01:27:58,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 41654.2). Total num frames: 4104617984. Throughput: 0: 41665.2. Samples: 372263880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:27:58,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 01:28:01,497][26599] Updated weights for policy 0, policy_version 250534 (0.0032) [2024-06-19 01:28:03,380][26367] Fps is (10 sec: 39336.0, 60 sec: 40960.1, 300 sec: 41543.2). Total num frames: 4104798208. Throughput: 0: 41517.4. Samples: 372391280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:03,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 01:28:05,073][26599] Updated weights for policy 0, policy_version 250544 (0.0047) [2024-06-19 01:28:08,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 4105011200. Throughput: 0: 41590.3. Samples: 372639740. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:08,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 01:28:09,322][26599] Updated weights for policy 0, policy_version 250554 (0.0040) [2024-06-19 01:28:13,051][26599] Updated weights for policy 0, policy_version 250564 (0.0036) [2024-06-19 01:28:13,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 41709.8). Total num frames: 4105256960. Throughput: 0: 41647.0. Samples: 372887880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:13,380][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 01:28:17,330][26599] Updated weights for policy 0, policy_version 250574 (0.0026) [2024-06-19 01:28:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 4105420800. Throughput: 0: 41584.0. Samples: 373016060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:18,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 01:28:20,832][26599] Updated weights for policy 0, policy_version 250584 (0.0036) [2024-06-19 01:28:23,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 41654.2). Total num frames: 4105650176. Throughput: 0: 41458.6. Samples: 373257900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:23,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 01:28:25,395][26599] Updated weights for policy 0, policy_version 250594 (0.0035) [2024-06-19 01:28:28,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41508.7, 300 sec: 41654.2). Total num frames: 4105863168. Throughput: 0: 41628.5. Samples: 373512880. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:28,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 01:28:28,749][26599] Updated weights for policy 0, policy_version 250604 (0.0041) [2024-06-19 01:28:33,282][26599] Updated weights for policy 0, policy_version 250614 (0.0042) [2024-06-19 01:28:33,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 41599.2). Total num frames: 4106059776. Throughput: 0: 41547.1. Samples: 373633780. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:33,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 01:28:37,073][26599] Updated weights for policy 0, policy_version 250624 (0.0034) [2024-06-19 01:28:38,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41508.7, 300 sec: 41599.2). Total num frames: 4106272768. Throughput: 0: 41558.0. Samples: 373882640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:38,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 01:28:41,195][26599] Updated weights for policy 0, policy_version 250634 (0.0035) [2024-06-19 01:28:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 41598.7). Total num frames: 4106469376. Throughput: 0: 41563.7. Samples: 374134240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:43,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 01:28:44,960][26599] Updated weights for policy 0, policy_version 250644 (0.0028) [2024-06-19 01:28:48,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 41598.7). Total num frames: 4106665984. Throughput: 0: 41372.8. Samples: 374253060. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:48,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 01:28:49,144][26599] Updated weights for policy 0, policy_version 250654 (0.0026) [2024-06-19 01:28:52,684][26599] Updated weights for policy 0, policy_version 250664 (0.0036) [2024-06-19 01:28:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41508.7, 300 sec: 41598.7). Total num frames: 4106895360. Throughput: 0: 41627.1. Samples: 374512960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 01:28:53,380][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 01:28:56,797][26599] Updated weights for policy 0, policy_version 250674 (0.0035) [2024-06-19 01:28:58,380][26367] Fps is (10 sec: 45875.1, 60 sec: 41779.2, 300 sec: 41654.2). Total num frames: 4107124736. Throughput: 0: 41541.6. Samples: 374757260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:28:58,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-19 01:29:00,669][26599] Updated weights for policy 0, policy_version 250684 (0.0025) [2024-06-19 01:29:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4107304960. Throughput: 0: 41525.8. Samples: 374884720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:03,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 01:29:04,420][26599] Updated weights for policy 0, policy_version 250694 (0.0023) [2024-06-19 01:29:08,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.1, 300 sec: 41543.2). Total num frames: 4107517952. Throughput: 0: 41783.9. Samples: 375138180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:08,384][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 01:29:08,710][26599] Updated weights for policy 0, policy_version 250704 (0.0036) [2024-06-19 01:29:12,350][26599] Updated weights for policy 0, policy_version 250714 (0.0033) [2024-06-19 01:29:13,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4107747328. Throughput: 0: 41533.8. Samples: 375381900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:13,380][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 01:29:16,148][26579] Signal inference workers to stop experience collection... (5550 times) [2024-06-19 01:29:16,148][26579] Signal inference workers to resume experience collection... (5550 times) [2024-06-19 01:29:16,188][26599] InferenceWorker_p0-w0: stopping experience collection (5550 times) [2024-06-19 01:29:16,188][26599] InferenceWorker_p0-w0: resuming experience collection (5550 times) [2024-06-19 01:29:16,283][26599] Updated weights for policy 0, policy_version 250724 (0.0035) [2024-06-19 01:29:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4107927552. Throughput: 0: 41703.1. Samples: 375510420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:18,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 01:29:20,377][26599] Updated weights for policy 0, policy_version 250734 (0.0031) [2024-06-19 01:29:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41598.7). Total num frames: 4108156928. Throughput: 0: 41686.8. Samples: 375758540. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:23,381][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 01:29:24,077][26599] Updated weights for policy 0, policy_version 250744 (0.0027) [2024-06-19 01:29:28,018][26599] Updated weights for policy 0, policy_version 250754 (0.0036) [2024-06-19 01:29:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41779.1, 300 sec: 41654.2). Total num frames: 4108369920. Throughput: 0: 41674.6. Samples: 376009600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:28,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 01:29:31,702][26599] Updated weights for policy 0, policy_version 250764 (0.0048) [2024-06-19 01:29:33,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 4108566528. Throughput: 0: 41990.6. Samples: 376142640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:33,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 01:29:35,578][26599] Updated weights for policy 0, policy_version 250774 (0.0032) [2024-06-19 01:29:38,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41598.7). Total num frames: 4108763136. Throughput: 0: 41783.5. Samples: 376393220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:38,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 01:29:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000250779_4108763136.pth... [2024-06-19 01:29:38,469][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000250171_4098801664.pth [2024-06-19 01:29:40,000][26599] Updated weights for policy 0, policy_version 250784 (0.0038) [2024-06-19 01:29:43,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41709.8). Total num frames: 4108992512. Throughput: 0: 41928.1. Samples: 376644020. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:43,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 01:29:43,432][26599] Updated weights for policy 0, policy_version 250794 (0.0034) [2024-06-19 01:29:47,713][26599] Updated weights for policy 0, policy_version 250804 (0.0038) [2024-06-19 01:29:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 41598.7). Total num frames: 4109189120. Throughput: 0: 41963.1. Samples: 376773060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:48,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 01:29:51,026][26599] Updated weights for policy 0, policy_version 250814 (0.0046) [2024-06-19 01:29:53,384][26367] Fps is (10 sec: 40945.6, 60 sec: 41776.7, 300 sec: 41598.2). Total num frames: 4109402112. Throughput: 0: 41742.6. Samples: 377016740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:53,384][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 01:29:55,599][26599] Updated weights for policy 0, policy_version 250824 (0.0037) [2024-06-19 01:29:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 4109631488. Throughput: 0: 42081.7. Samples: 377275580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:29:58,380][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 01:29:58,647][26599] Updated weights for policy 0, policy_version 250834 (0.0029) [2024-06-19 01:30:03,379][26599] Updated weights for policy 0, policy_version 250844 (0.0040) [2024-06-19 01:30:03,380][26367] Fps is (10 sec: 42613.4, 60 sec: 42052.3, 300 sec: 41598.7). Total num frames: 4109828096. Throughput: 0: 42052.5. Samples: 377402780. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 01:30:03,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 01:30:06,480][26599] Updated weights for policy 0, policy_version 250854 (0.0049) [2024-06-19 01:30:08,384][26367] Fps is (10 sec: 40944.9, 60 sec: 42049.8, 300 sec: 41764.8). Total num frames: 4110041088. Throughput: 0: 41987.2. Samples: 377648120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:08,384][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 01:30:11,285][26599] Updated weights for policy 0, policy_version 250864 (0.0044) [2024-06-19 01:30:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4110254080. Throughput: 0: 41981.0. Samples: 377898740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:13,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 01:30:14,518][26599] Updated weights for policy 0, policy_version 250874 (0.0039) [2024-06-19 01:30:18,381][26367] Fps is (10 sec: 39334.4, 60 sec: 41779.0, 300 sec: 41487.6). Total num frames: 4110434304. Throughput: 0: 41757.5. Samples: 378021740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:18,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 01:30:19,088][26599] Updated weights for policy 0, policy_version 250884 (0.0040) [2024-06-19 01:30:22,191][26599] Updated weights for policy 0, policy_version 250894 (0.0034) [2024-06-19 01:30:23,384][26367] Fps is (10 sec: 40944.9, 60 sec: 41776.6, 300 sec: 41709.3). Total num frames: 4110663680. Throughput: 0: 41710.8. Samples: 378270360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:23,385][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 01:30:26,902][26599] Updated weights for policy 0, policy_version 250904 (0.0033) [2024-06-19 01:30:28,384][26367] Fps is (10 sec: 44222.2, 60 sec: 41776.7, 300 sec: 41709.3). Total num frames: 4110876672. Throughput: 0: 41813.5. Samples: 378525780. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:28,385][26367] Avg episode reward: [(0, '0.859')] [2024-06-19 01:30:29,972][26599] Updated weights for policy 0, policy_version 250914 (0.0037) [2024-06-19 01:30:33,380][26367] Fps is (10 sec: 40974.9, 60 sec: 41779.3, 300 sec: 41598.7). Total num frames: 4111073280. Throughput: 0: 41669.3. Samples: 378648180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:33,381][26367] Avg episode reward: [(0, '0.846')] [2024-06-19 01:30:34,995][26599] Updated weights for policy 0, policy_version 250924 (0.0029) [2024-06-19 01:30:35,524][26579] Signal inference workers to stop experience collection... (5600 times) [2024-06-19 01:30:35,525][26579] Signal inference workers to resume experience collection... (5600 times) [2024-06-19 01:30:35,547][26599] InferenceWorker_p0-w0: stopping experience collection (5600 times) [2024-06-19 01:30:35,552][26599] InferenceWorker_p0-w0: resuming experience collection (5600 times) [2024-06-19 01:30:37,851][26599] Updated weights for policy 0, policy_version 250934 (0.0036) [2024-06-19 01:30:38,382][26367] Fps is (10 sec: 44244.1, 60 sec: 42596.9, 300 sec: 41820.6). Total num frames: 4111319040. Throughput: 0: 41767.2. Samples: 378896200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:38,383][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 01:30:43,061][26599] Updated weights for policy 0, policy_version 250944 (0.0033) [2024-06-19 01:30:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 41654.2). Total num frames: 4111482880. Throughput: 0: 41886.3. Samples: 379160460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:43,380][26367] Avg episode reward: [(0, '0.192')] [2024-06-19 01:30:45,747][26599] Updated weights for policy 0, policy_version 250954 (0.0033) [2024-06-19 01:30:48,380][26367] Fps is (10 sec: 39329.3, 60 sec: 42052.2, 300 sec: 41654.7). Total num frames: 4111712256. Throughput: 0: 41481.7. 
Samples: 379269460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:48,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 01:30:51,002][26599] Updated weights for policy 0, policy_version 250964 (0.0033) [2024-06-19 01:30:53,380][26367] Fps is (10 sec: 45874.3, 60 sec: 42327.8, 300 sec: 41820.9). Total num frames: 4111941632. Throughput: 0: 41720.2. Samples: 379525380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:53,381][26367] Avg episode reward: [(0, '0.834')] [2024-06-19 01:30:53,631][26599] Updated weights for policy 0, policy_version 250974 (0.0030) [2024-06-19 01:30:58,380][26367] Fps is (10 sec: 37682.9, 60 sec: 40959.9, 300 sec: 41598.7). Total num frames: 4112089088. Throughput: 0: 41915.4. Samples: 379784940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:30:58,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 01:30:58,692][26599] Updated weights for policy 0, policy_version 250984 (0.0036) [2024-06-19 01:31:01,302][26599] Updated weights for policy 0, policy_version 250994 (0.0041) [2024-06-19 01:31:03,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4112334848. Throughput: 0: 41764.8. Samples: 379901140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:31:03,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 01:31:06,423][26599] Updated weights for policy 0, policy_version 251004 (0.0034) [2024-06-19 01:31:08,380][26367] Fps is (10 sec: 45876.0, 60 sec: 41781.8, 300 sec: 41765.3). Total num frames: 4112547840. Throughput: 0: 41865.7. Samples: 380154160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:31:08,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 01:31:09,218][26599] Updated weights for policy 0, policy_version 251014 (0.0033) [2024-06-19 01:31:13,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 4112728064. Throughput: 0: 41866.6. 
Samples: 380409620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 01:31:13,380][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 01:31:14,187][26599] Updated weights for policy 0, policy_version 251024 (0.0036) [2024-06-19 01:31:17,164][26599] Updated weights for policy 0, policy_version 251034 (0.0037) [2024-06-19 01:31:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.5, 300 sec: 41765.3). Total num frames: 4112973824. Throughput: 0: 41821.2. Samples: 380530140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:31:18,381][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 01:31:22,030][26599] Updated weights for policy 0, policy_version 251044 (0.0037) [2024-06-19 01:31:23,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41781.7, 300 sec: 41709.8). Total num frames: 4113170432. Throughput: 0: 41969.9. Samples: 380784760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:31:23,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 01:31:24,986][26599] Updated weights for policy 0, policy_version 251054 (0.0030) [2024-06-19 01:31:28,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41508.6, 300 sec: 41709.8). Total num frames: 4113367040. Throughput: 0: 41649.7. Samples: 381034700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:31:28,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 01:31:29,794][26599] Updated weights for policy 0, policy_version 251064 (0.0025) [2024-06-19 01:31:32,763][26599] Updated weights for policy 0, policy_version 251074 (0.0037) [2024-06-19 01:31:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 4113612800. Throughput: 0: 41970.3. Samples: 381158120. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:31:33,381][26367] Avg episode reward: [(0, '0.404')] [2024-06-19 01:31:37,543][26599] Updated weights for policy 0, policy_version 251084 (0.0042) [2024-06-19 01:31:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 40961.4, 300 sec: 41765.3). Total num frames: 4113776640. Throughput: 0: 42059.2. Samples: 381418040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:31:38,380][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 01:31:38,539][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000251087_4113809408.pth... [2024-06-19 01:31:38,623][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000250475_4103782400.pth [2024-06-19 01:31:40,530][26599] Updated weights for policy 0, policy_version 251094 (0.0022) [2024-06-19 01:31:43,380][26367] Fps is (10 sec: 37682.7, 60 sec: 41779.0, 300 sec: 41598.7). Total num frames: 4113989632. Throughput: 0: 41780.9. Samples: 381665080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:31:43,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 01:31:45,237][26599] Updated weights for policy 0, policy_version 251104 (0.0034) [2024-06-19 01:31:46,273][26579] Signal inference workers to stop experience collection... (5650 times) [2024-06-19 01:31:46,315][26599] InferenceWorker_p0-w0: stopping experience collection (5650 times) [2024-06-19 01:31:46,328][26579] Signal inference workers to resume experience collection... (5650 times) [2024-06-19 01:31:46,341][26599] InferenceWorker_p0-w0: resuming experience collection (5650 times) [2024-06-19 01:31:48,295][26599] Updated weights for policy 0, policy_version 251114 (0.0053) [2024-06-19 01:31:48,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4114251776. Throughput: 0: 42020.0. Samples: 381792040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:31:48,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 01:31:53,132][26599] Updated weights for policy 0, policy_version 251124 (0.0027) [2024-06-19 01:31:53,380][26367] Fps is (10 sec: 42599.4, 60 sec: 41233.2, 300 sec: 41820.9). Total num frames: 4114415616. Throughput: 0: 42048.5. Samples: 382046340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:31:53,380][26367] Avg episode reward: [(0, '0.800')] [2024-06-19 01:31:56,427][26599] Updated weights for policy 0, policy_version 251134 (0.0030) [2024-06-19 01:31:58,380][26367] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 41654.2). Total num frames: 4114628608. Throughput: 0: 41702.4. Samples: 382286240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:31:58,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 01:32:01,204][26599] Updated weights for policy 0, policy_version 251144 (0.0035) [2024-06-19 01:32:03,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4114841600. Throughput: 0: 42041.1. Samples: 382421980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:32:03,380][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 01:32:04,089][26599] Updated weights for policy 0, policy_version 251154 (0.0032) [2024-06-19 01:32:08,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4115038208. Throughput: 0: 41988.0. Samples: 382674220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:32:08,381][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 01:32:08,986][26599] Updated weights for policy 0, policy_version 251164 (0.0030) [2024-06-19 01:32:11,894][26599] Updated weights for policy 0, policy_version 251174 (0.0029) [2024-06-19 01:32:13,384][26367] Fps is (10 sec: 44220.2, 60 sec: 42595.7, 300 sec: 41764.8). Total num frames: 4115283968. Throughput: 0: 41974.4. Samples: 382923700. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:32:13,385][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 01:32:16,519][26599] Updated weights for policy 0, policy_version 251184 (0.0028) [2024-06-19 01:32:18,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4115480576. Throughput: 0: 42131.0. Samples: 383054020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:32:18,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 01:32:19,496][26599] Updated weights for policy 0, policy_version 251194 (0.0036) [2024-06-19 01:32:23,380][26367] Fps is (10 sec: 39335.7, 60 sec: 41779.1, 300 sec: 41710.3). Total num frames: 4115677184. Throughput: 0: 41753.2. Samples: 383296940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 01:32:23,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 01:32:24,179][26599] Updated weights for policy 0, policy_version 251204 (0.0030) [2024-06-19 01:32:27,545][26599] Updated weights for policy 0, policy_version 251214 (0.0040) [2024-06-19 01:32:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 4115906560. Throughput: 0: 41817.4. Samples: 383546860. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:32:28,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 01:32:31,998][26599] Updated weights for policy 0, policy_version 251224 (0.0033) [2024-06-19 01:32:33,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41233.1, 300 sec: 41710.3). Total num frames: 4116086784. Throughput: 0: 41944.9. Samples: 383679560. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:32:33,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 01:32:35,710][26599] Updated weights for policy 0, policy_version 251234 (0.0045) [2024-06-19 01:32:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 41709.8). Total num frames: 4116316160. Throughput: 0: 41779.3. Samples: 383926420. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:32:38,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 01:32:39,750][26599] Updated weights for policy 0, policy_version 251244 (0.0033) [2024-06-19 01:32:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 4116512768. Throughput: 0: 42128.3. Samples: 384182000. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:32:43,380][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 01:32:43,568][26599] Updated weights for policy 0, policy_version 251254 (0.0035) [2024-06-19 01:32:48,039][26599] Updated weights for policy 0, policy_version 251264 (0.0023) [2024-06-19 01:32:48,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41233.1, 300 sec: 41765.8). Total num frames: 4116725760. Throughput: 0: 41757.8. Samples: 384301080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:32:48,380][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 01:32:51,328][26599] Updated weights for policy 0, policy_version 251274 (0.0034) [2024-06-19 01:32:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 41820.9). Total num frames: 4116955136. Throughput: 0: 41805.8. Samples: 384555480. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:32:53,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 01:32:55,708][26599] Updated weights for policy 0, policy_version 251284 (0.0035) [2024-06-19 01:32:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4117151744. Throughput: 0: 41958.6. Samples: 384811680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:32:58,380][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 01:32:59,370][26599] Updated weights for policy 0, policy_version 251294 (0.0026) [2024-06-19 01:33:03,232][26599] Updated weights for policy 0, policy_version 251304 (0.0027) [2024-06-19 01:33:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41876.4). 
Total num frames: 4117364736. Throughput: 0: 41739.7. Samples: 384932300. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:33:03,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 01:33:07,165][26599] Updated weights for policy 0, policy_version 251314 (0.0033) [2024-06-19 01:33:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 41765.3). Total num frames: 4117577728. Throughput: 0: 42000.5. Samples: 385186960. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:33:08,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 01:33:08,488][26579] Signal inference workers to stop experience collection... (5700 times) [2024-06-19 01:33:08,488][26579] Signal inference workers to resume experience collection... (5700 times) [2024-06-19 01:33:08,538][26599] InferenceWorker_p0-w0: stopping experience collection (5700 times) [2024-06-19 01:33:08,538][26599] InferenceWorker_p0-w0: resuming experience collection (5700 times) [2024-06-19 01:33:10,889][26599] Updated weights for policy 0, policy_version 251324 (0.0033) [2024-06-19 01:33:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41781.8, 300 sec: 41932.0). Total num frames: 4117790720. Throughput: 0: 42033.9. Samples: 385438380. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:33:13,380][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 01:33:14,907][26599] Updated weights for policy 0, policy_version 251334 (0.0032) [2024-06-19 01:33:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4117987328. Throughput: 0: 41854.1. Samples: 385563000. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:33:18,381][26367] Avg episode reward: [(0, '0.362')] [2024-06-19 01:33:18,687][26599] Updated weights for policy 0, policy_version 251344 (0.0040) [2024-06-19 01:33:22,569][26599] Updated weights for policy 0, policy_version 251354 (0.0045) [2024-06-19 01:33:23,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 4118200320. Throughput: 0: 42150.3. Samples: 385823180. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:33:23,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 01:33:26,415][26599] Updated weights for policy 0, policy_version 251364 (0.0034) [2024-06-19 01:33:28,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 4118396928. Throughput: 0: 41972.3. Samples: 386070760. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:33:28,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 01:33:30,570][26599] Updated weights for policy 0, policy_version 251374 (0.0024) [2024-06-19 01:33:33,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4118626304. Throughput: 0: 42147.0. Samples: 386197700. Policy #0 lag: (min: 1.0, avg: 9.4, max: 22.0) [2024-06-19 01:33:33,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 01:33:34,637][26599] Updated weights for policy 0, policy_version 251384 (0.0034) [2024-06-19 01:33:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 4118822912. Throughput: 0: 42034.6. Samples: 386447040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:33:38,380][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 01:33:38,464][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000251394_4118839296.pth... 
[2024-06-19 01:33:38,473][26599] Updated weights for policy 0, policy_version 251394 (0.0028) [2024-06-19 01:33:38,541][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000250779_4108763136.pth [2024-06-19 01:33:42,190][26599] Updated weights for policy 0, policy_version 251404 (0.0041) [2024-06-19 01:33:43,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4119019520. Throughput: 0: 41917.3. Samples: 386697960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:33:43,380][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 01:33:46,321][26599] Updated weights for policy 0, policy_version 251414 (0.0029) [2024-06-19 01:33:48,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 4119265280. Throughput: 0: 41995.9. Samples: 386822120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:33:48,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 01:33:49,799][26599] Updated weights for policy 0, policy_version 251424 (0.0029) [2024-06-19 01:33:53,380][26367] Fps is (10 sec: 44236.2, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 4119461888. Throughput: 0: 42154.6. Samples: 387083920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:33:53,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 01:33:53,950][26599] Updated weights for policy 0, policy_version 251434 (0.0034) [2024-06-19 01:33:57,982][26599] Updated weights for policy 0, policy_version 251444 (0.0042) [2024-06-19 01:33:58,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4119658496. Throughput: 0: 42118.2. Samples: 387333700. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:33:58,380][26367] Avg episode reward: [(0, '0.387')] [2024-06-19 01:34:01,770][26599] Updated weights for policy 0, policy_version 251454 (0.0045) [2024-06-19 01:34:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4119887872. Throughput: 0: 42053.0. Samples: 387455380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:34:03,380][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 01:34:05,619][26599] Updated weights for policy 0, policy_version 251464 (0.0034) [2024-06-19 01:34:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4120100864. Throughput: 0: 41974.8. Samples: 387712040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:34:08,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 01:34:09,289][26599] Updated weights for policy 0, policy_version 251474 (0.0037) [2024-06-19 01:34:13,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4120281088. Throughput: 0: 42014.3. Samples: 387961400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:34:13,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 01:34:13,664][26599] Updated weights for policy 0, policy_version 251484 (0.0040) [2024-06-19 01:34:16,986][26599] Updated weights for policy 0, policy_version 251494 (0.0024) [2024-06-19 01:34:18,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4120526848. Throughput: 0: 41957.2. Samples: 388085780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:34:18,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 01:34:21,460][26599] Updated weights for policy 0, policy_version 251504 (0.0031) [2024-06-19 01:34:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4120723456. Throughput: 0: 42075.6. Samples: 388340440. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:34:23,380][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 01:34:24,652][26599] Updated weights for policy 0, policy_version 251514 (0.0036) [2024-06-19 01:34:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4120936448. Throughput: 0: 42077.2. Samples: 388591440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:34:28,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 01:34:29,135][26599] Updated weights for policy 0, policy_version 251524 (0.0041) [2024-06-19 01:34:31,543][26579] Signal inference workers to stop experience collection... (5750 times) [2024-06-19 01:34:31,543][26579] Signal inference workers to resume experience collection... (5750 times) [2024-06-19 01:34:31,560][26599] InferenceWorker_p0-w0: stopping experience collection (5750 times) [2024-06-19 01:34:31,592][26599] InferenceWorker_p0-w0: resuming experience collection (5750 times) [2024-06-19 01:34:32,320][26599] Updated weights for policy 0, policy_version 251534 (0.0032) [2024-06-19 01:34:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4121165824. Throughput: 0: 42292.1. Samples: 388725260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:34:33,380][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 01:34:37,290][26599] Updated weights for policy 0, policy_version 251544 (0.0046) [2024-06-19 01:34:38,380][26367] Fps is (10 sec: 37683.6, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4121313280. Throughput: 0: 42040.5. Samples: 388975740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:34:38,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 01:34:40,573][26599] Updated weights for policy 0, policy_version 251554 (0.0037) [2024-06-19 01:34:43,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4121575424. Throughput: 0: 41981.4. 
Samples: 389222860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 01:34:43,384][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 01:34:45,042][26599] Updated weights for policy 0, policy_version 251564 (0.0049) [2024-06-19 01:34:48,170][26599] Updated weights for policy 0, policy_version 251574 (0.0027) [2024-06-19 01:34:48,380][26367] Fps is (10 sec: 47513.2, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 4121788416. Throughput: 0: 42329.7. Samples: 389360220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:34:48,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 01:34:52,714][26599] Updated weights for policy 0, policy_version 251584 (0.0026) [2024-06-19 01:34:53,380][26367] Fps is (10 sec: 39320.8, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4121968640. Throughput: 0: 42112.3. Samples: 389607100. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:34:53,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 01:34:55,907][26599] Updated weights for policy 0, policy_version 251594 (0.0029) [2024-06-19 01:34:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4122198016. Throughput: 0: 42104.0. Samples: 389856080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:34:58,380][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 01:35:00,318][26599] Updated weights for policy 0, policy_version 251604 (0.0036) [2024-06-19 01:35:03,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 41988.0). Total num frames: 4122427392. Throughput: 0: 42321.0. Samples: 389990220. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:03,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 01:35:03,455][26599] Updated weights for policy 0, policy_version 251614 (0.0040) [2024-06-19 01:35:08,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 4122591232. Throughput: 0: 42186.2. Samples: 390238820. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:08,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 01:35:08,450][26599] Updated weights for policy 0, policy_version 251624 (0.0035) [2024-06-19 01:35:11,849][26599] Updated weights for policy 0, policy_version 251634 (0.0030) [2024-06-19 01:35:13,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42598.2, 300 sec: 42043.0). Total num frames: 4122836992. Throughput: 0: 41881.7. Samples: 390476120. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:13,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 01:35:16,034][26599] Updated weights for policy 0, policy_version 251644 (0.0040) [2024-06-19 01:35:18,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 4123049984. Throughput: 0: 41863.9. Samples: 390609140. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:18,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 01:35:19,546][26599] Updated weights for policy 0, policy_version 251654 (0.0039) [2024-06-19 01:35:23,380][26367] Fps is (10 sec: 39322.4, 60 sec: 41779.2, 300 sec: 41876.9). Total num frames: 4123230208. Throughput: 0: 41833.8. Samples: 390858260. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:23,381][26367] Avg episode reward: [(0, '0.768')] [2024-06-19 01:35:23,723][26599] Updated weights for policy 0, policy_version 251664 (0.0032) [2024-06-19 01:35:27,093][26599] Updated weights for policy 0, policy_version 251674 (0.0041) [2024-06-19 01:35:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4123459584. Throughput: 0: 41907.9. Samples: 391108720. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:28,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 01:35:31,405][26599] Updated weights for policy 0, policy_version 251684 (0.0039) [2024-06-19 01:35:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41821.2). 
Total num frames: 4123656192. Throughput: 0: 41783.2. Samples: 391240460. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:33,380][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 01:35:34,639][26599] Updated weights for policy 0, policy_version 251694 (0.0031) [2024-06-19 01:35:38,384][26367] Fps is (10 sec: 39307.0, 60 sec: 42322.7, 300 sec: 41931.4). Total num frames: 4123852800. Throughput: 0: 41795.3. Samples: 391488040. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:38,385][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 01:35:38,395][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000251700_4123852800.pth... [2024-06-19 01:35:38,443][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000251087_4113809408.pth [2024-06-19 01:35:39,294][26599] Updated weights for policy 0, policy_version 251704 (0.0032) [2024-06-19 01:35:41,080][26579] Signal inference workers to stop experience collection... (5800 times) [2024-06-19 01:35:41,081][26579] Signal inference workers to resume experience collection... (5800 times) [2024-06-19 01:35:41,092][26599] InferenceWorker_p0-w0: stopping experience collection (5800 times) [2024-06-19 01:35:41,092][26599] InferenceWorker_p0-w0: resuming experience collection (5800 times) [2024-06-19 01:35:42,335][26599] Updated weights for policy 0, policy_version 251714 (0.0039) [2024-06-19 01:35:43,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4124098560. Throughput: 0: 41768.4. Samples: 391735660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:43,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 01:35:47,090][26599] Updated weights for policy 0, policy_version 251724 (0.0038) [2024-06-19 01:35:48,380][26367] Fps is (10 sec: 45892.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4124311552. Throughput: 0: 41906.2. Samples: 391876000. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:48,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 01:35:50,281][26599] Updated weights for policy 0, policy_version 251734 (0.0052) [2024-06-19 01:35:53,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4124475392. Throughput: 0: 41750.2. Samples: 392117580. Policy #0 lag: (min: 0.0, avg: 8.3, max: 22.0) [2024-06-19 01:35:53,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 01:35:54,800][26599] Updated weights for policy 0, policy_version 251744 (0.0038) [2024-06-19 01:35:58,077][26599] Updated weights for policy 0, policy_version 251754 (0.0044) [2024-06-19 01:35:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4124737536. Throughput: 0: 42096.6. Samples: 392370460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:35:58,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 01:36:02,784][26599] Updated weights for policy 0, policy_version 251764 (0.0032) [2024-06-19 01:36:03,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4124917760. Throughput: 0: 42120.4. Samples: 392504560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:03,381][26367] Avg episode reward: [(0, '0.371')] [2024-06-19 01:36:06,132][26599] Updated weights for policy 0, policy_version 251774 (0.0038) [2024-06-19 01:36:08,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4125114368. Throughput: 0: 42016.4. Samples: 392749000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:08,380][26367] Avg episode reward: [(0, '0.866')] [2024-06-19 01:36:10,509][26599] Updated weights for policy 0, policy_version 251784 (0.0038) [2024-06-19 01:36:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 4125360128. Throughput: 0: 41978.2. Samples: 392997740. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:13,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 01:36:13,747][26599] Updated weights for policy 0, policy_version 251794 (0.0039) [2024-06-19 01:36:18,155][26599] Updated weights for policy 0, policy_version 251804 (0.0030) [2024-06-19 01:36:18,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4125556736. Throughput: 0: 42017.2. Samples: 393131240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:18,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 01:36:21,529][26599] Updated weights for policy 0, policy_version 251814 (0.0039) [2024-06-19 01:36:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4125769728. Throughput: 0: 41940.8. Samples: 393375220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:23,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 01:36:25,950][26599] Updated weights for policy 0, policy_version 251824 (0.0036) [2024-06-19 01:36:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4125982720. Throughput: 0: 42213.8. Samples: 393635280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:28,381][26367] Avg episode reward: [(0, '0.334')] [2024-06-19 01:36:29,372][26599] Updated weights for policy 0, policy_version 251834 (0.0026) [2024-06-19 01:36:33,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 4126179328. Throughput: 0: 41789.7. Samples: 393756540. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:33,381][26367] Avg episode reward: [(0, '0.354')] [2024-06-19 01:36:33,647][26599] Updated weights for policy 0, policy_version 251844 (0.0046) [2024-06-19 01:36:37,173][26599] Updated weights for policy 0, policy_version 251854 (0.0040) [2024-06-19 01:36:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42600.9, 300 sec: 42098.6). Total num frames: 4126408704. Throughput: 0: 42003.5. Samples: 394007740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:38,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 01:36:41,719][26599] Updated weights for policy 0, policy_version 251864 (0.0038) [2024-06-19 01:36:43,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 4126588928. Throughput: 0: 42011.2. Samples: 394260960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:43,380][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 01:36:45,131][26599] Updated weights for policy 0, policy_version 251874 (0.0033) [2024-06-19 01:36:48,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 41987.4). Total num frames: 4126801920. Throughput: 0: 41791.6. Samples: 394385180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:48,384][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 01:36:49,246][26599] Updated weights for policy 0, policy_version 251884 (0.0030) [2024-06-19 01:36:52,029][26579] Signal inference workers to stop experience collection... (5850 times) [2024-06-19 01:36:52,030][26579] Signal inference workers to resume experience collection... 
(5850 times) [2024-06-19 01:36:52,052][26599] InferenceWorker_p0-w0: stopping experience collection (5850 times) [2024-06-19 01:36:52,053][26599] InferenceWorker_p0-w0: resuming experience collection (5850 times) [2024-06-19 01:36:52,978][26599] Updated weights for policy 0, policy_version 251894 (0.0027) [2024-06-19 01:36:53,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 4127047680. Throughput: 0: 42067.1. Samples: 394642020. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:53,381][26367] Avg episode reward: [(0, '0.364')] [2024-06-19 01:36:56,859][26599] Updated weights for policy 0, policy_version 251904 (0.0028) [2024-06-19 01:36:58,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4127227904. Throughput: 0: 42238.3. Samples: 394898460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:36:58,380][26367] Avg episode reward: [(0, '0.399')] [2024-06-19 01:37:00,791][26599] Updated weights for policy 0, policy_version 251914 (0.0031) [2024-06-19 01:37:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 4127457280. Throughput: 0: 41843.2. Samples: 395014180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:37:03,380][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 01:37:04,427][26599] Updated weights for policy 0, policy_version 251924 (0.0030) [2024-06-19 01:37:08,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 41988.0). Total num frames: 4127670272. Throughput: 0: 42174.3. Samples: 395273060. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:08,380][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 01:37:08,539][26599] Updated weights for policy 0, policy_version 251934 (0.0039) [2024-06-19 01:37:12,267][26599] Updated weights for policy 0, policy_version 251944 (0.0032) [2024-06-19 01:37:13,380][26367] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 4127866880. Throughput: 0: 42042.2. Samples: 395527180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:13,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 01:37:16,437][26599] Updated weights for policy 0, policy_version 251954 (0.0042) [2024-06-19 01:37:18,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4128079872. Throughput: 0: 42154.7. Samples: 395653500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:18,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 01:37:20,333][26599] Updated weights for policy 0, policy_version 251964 (0.0031) [2024-06-19 01:37:23,384][26367] Fps is (10 sec: 40945.4, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 4128276480. Throughput: 0: 42039.3. Samples: 395899660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:23,384][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 01:37:24,634][26599] Updated weights for policy 0, policy_version 251974 (0.0032) [2024-06-19 01:37:27,996][26599] Updated weights for policy 0, policy_version 251984 (0.0038) [2024-06-19 01:37:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4128522240. Throughput: 0: 41994.6. Samples: 396150720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:28,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 01:37:32,373][26599] Updated weights for policy 0, policy_version 251994 (0.0040) [2024-06-19 01:37:33,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42325.4, 300 sec: 42043.0). 
Total num frames: 4128718848. Throughput: 0: 42230.2. Samples: 396285540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:33,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 01:37:35,766][26599] Updated weights for policy 0, policy_version 252004 (0.0031) [2024-06-19 01:37:38,380][26367] Fps is (10 sec: 37683.6, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4128899072. Throughput: 0: 41870.7. Samples: 396526200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:38,380][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 01:37:38,459][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000252009_4128915456.pth... [2024-06-19 01:37:38,511][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000251394_4118839296.pth [2024-06-19 01:37:40,308][26599] Updated weights for policy 0, policy_version 252014 (0.0035) [2024-06-19 01:37:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 4129144832. Throughput: 0: 41682.1. Samples: 396774160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:43,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 01:37:43,906][26599] Updated weights for policy 0, policy_version 252024 (0.0037) [2024-06-19 01:37:48,116][26599] Updated weights for policy 0, policy_version 252034 (0.0035) [2024-06-19 01:37:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 4129325056. Throughput: 0: 42099.1. Samples: 396908640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:48,381][26367] Avg episode reward: [(0, '0.373')] [2024-06-19 01:37:51,395][26599] Updated weights for policy 0, policy_version 252044 (0.0033) [2024-06-19 01:37:53,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4129538048. Throughput: 0: 41738.2. Samples: 397151280. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:53,381][26367] Avg episode reward: [(0, '0.756')] [2024-06-19 01:37:56,247][26599] Updated weights for policy 0, policy_version 252054 (0.0035) [2024-06-19 01:37:58,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4129783808. Throughput: 0: 41630.8. Samples: 397400560. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:37:58,380][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 01:37:59,061][26599] Updated weights for policy 0, policy_version 252064 (0.0035) [2024-06-19 01:38:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 4129931264. Throughput: 0: 41715.2. Samples: 397530680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:38:03,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 01:38:03,952][26599] Updated weights for policy 0, policy_version 252074 (0.0037) [2024-06-19 01:38:04,423][26579] Signal inference workers to stop experience collection... (5900 times) [2024-06-19 01:38:04,423][26579] Signal inference workers to resume experience collection... (5900 times) [2024-06-19 01:38:04,441][26599] InferenceWorker_p0-w0: stopping experience collection (5900 times) [2024-06-19 01:38:04,441][26599] InferenceWorker_p0-w0: resuming experience collection (5900 times) [2024-06-19 01:38:06,638][26599] Updated weights for policy 0, policy_version 252084 (0.0028) [2024-06-19 01:38:08,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4130193408. Throughput: 0: 41769.1. Samples: 397779120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:38:08,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 01:38:12,068][26599] Updated weights for policy 0, policy_version 252094 (0.0035) [2024-06-19 01:38:13,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4130406400. Throughput: 0: 41794.8. 
Samples: 398031480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:38:13,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 01:38:14,930][26599] Updated weights for policy 0, policy_version 252104 (0.0023) [2024-06-19 01:38:18,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4130570240. Throughput: 0: 41637.3. Samples: 398159220. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:38:18,381][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 01:38:19,908][26599] Updated weights for policy 0, policy_version 252114 (0.0040) [2024-06-19 01:38:22,940][26599] Updated weights for policy 0, policy_version 252124 (0.0032) [2024-06-19 01:38:23,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 4130799616. Throughput: 0: 41735.5. Samples: 398404300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:38:23,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 01:38:27,412][26599] Updated weights for policy 0, policy_version 252134 (0.0037) [2024-06-19 01:38:28,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4131012608. Throughput: 0: 42077.4. Samples: 398667640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:38:28,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 01:38:30,452][26599] Updated weights for policy 0, policy_version 252144 (0.0034) [2024-06-19 01:38:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4131209216. Throughput: 0: 41934.7. Samples: 398795700. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:38:33,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 01:38:34,894][26599] Updated weights for policy 0, policy_version 252154 (0.0029) [2024-06-19 01:38:38,042][26599] Updated weights for policy 0, policy_version 252164 (0.0027) [2024-06-19 01:38:38,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4131454976. Throughput: 0: 41925.3. Samples: 399037920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:38:38,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 01:38:43,057][26599] Updated weights for policy 0, policy_version 252174 (0.0045) [2024-06-19 01:38:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4131635200. Throughput: 0: 42329.2. Samples: 399305380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:38:43,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 01:38:46,006][26599] Updated weights for policy 0, policy_version 252184 (0.0037) [2024-06-19 01:38:48,384][26367] Fps is (10 sec: 39307.1, 60 sec: 42049.6, 300 sec: 41986.9). Total num frames: 4131848192. Throughput: 0: 42056.5. Samples: 399423380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:38:48,385][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 01:38:50,619][26599] Updated weights for policy 0, policy_version 252194 (0.0035) [2024-06-19 01:38:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 4132077568. Throughput: 0: 42153.7. Samples: 399676040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:38:53,381][26367] Avg episode reward: [(0, '0.367')] [2024-06-19 01:38:53,864][26599] Updated weights for policy 0, policy_version 252204 (0.0035) [2024-06-19 01:38:58,380][26367] Fps is (10 sec: 40975.1, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 4132257792. Throughput: 0: 42399.1. Samples: 399939440. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:38:58,381][26367] Avg episode reward: [(0, '0.756')] [2024-06-19 01:38:58,409][26599] Updated weights for policy 0, policy_version 252214 (0.0033) [2024-06-19 01:39:01,470][26599] Updated weights for policy 0, policy_version 252224 (0.0029) [2024-06-19 01:39:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4132470784. Throughput: 0: 42170.3. Samples: 400056880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:39:03,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 01:39:05,971][26599] Updated weights for policy 0, policy_version 252234 (0.0035) [2024-06-19 01:39:08,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4132732928. Throughput: 0: 42457.4. Samples: 400314880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:39:08,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 01:39:09,139][26599] Updated weights for policy 0, policy_version 252244 (0.0027) [2024-06-19 01:39:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4132896768. Throughput: 0: 42335.0. Samples: 400572720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:39:13,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 01:39:13,649][26599] Updated weights for policy 0, policy_version 252254 (0.0033) [2024-06-19 01:39:17,057][26599] Updated weights for policy 0, policy_version 252264 (0.0032) [2024-06-19 01:39:18,380][26367] Fps is (10 sec: 36044.7, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4133093376. Throughput: 0: 42111.5. Samples: 400690720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:39:18,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 01:39:21,335][26599] Updated weights for policy 0, policy_version 252274 (0.0033) [2024-06-19 01:39:22,500][26579] Signal inference workers to stop experience collection... 
(5950 times) [2024-06-19 01:39:22,549][26599] InferenceWorker_p0-w0: stopping experience collection (5950 times) [2024-06-19 01:39:22,559][26579] Signal inference workers to resume experience collection... (5950 times) [2024-06-19 01:39:22,569][26599] InferenceWorker_p0-w0: resuming experience collection (5950 times) [2024-06-19 01:39:23,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 4133355520. Throughput: 0: 42387.0. Samples: 400945340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:39:23,392][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 01:39:24,956][26599] Updated weights for policy 0, policy_version 252284 (0.0031) [2024-06-19 01:39:28,383][26367] Fps is (10 sec: 44223.0, 60 sec: 42050.0, 300 sec: 41931.5). Total num frames: 4133535744. Throughput: 0: 42244.6. Samples: 401206520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 01:39:28,384][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 01:39:29,182][26599] Updated weights for policy 0, policy_version 252294 (0.0029) [2024-06-19 01:39:32,952][26599] Updated weights for policy 0, policy_version 252304 (0.0034) [2024-06-19 01:39:33,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 4133748736. Throughput: 0: 42160.7. Samples: 401320460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:39:33,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 01:39:36,850][26599] Updated weights for policy 0, policy_version 252314 (0.0031) [2024-06-19 01:39:38,380][26367] Fps is (10 sec: 45889.6, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4133994496. Throughput: 0: 42256.1. Samples: 401577560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:39:38,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 01:39:38,458][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000252320_4134010880.pth... 
[2024-06-19 01:39:38,515][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000251700_4123852800.pth [2024-06-19 01:39:41,159][26599] Updated weights for policy 0, policy_version 252324 (0.0030) [2024-06-19 01:39:43,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4134158336. Throughput: 0: 42206.6. Samples: 401838740. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:39:43,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 01:39:44,671][26599] Updated weights for policy 0, policy_version 252334 (0.0042) [2024-06-19 01:39:48,384][26367] Fps is (10 sec: 37669.5, 60 sec: 42052.3, 300 sec: 42042.5). Total num frames: 4134371328. Throughput: 0: 42071.3. Samples: 401950240. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:39:48,384][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 01:39:48,861][26599] Updated weights for policy 0, policy_version 252344 (0.0033) [2024-06-19 01:39:52,300][26599] Updated weights for policy 0, policy_version 252354 (0.0033) [2024-06-19 01:39:53,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 4134617088. Throughput: 0: 42116.5. Samples: 402210120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:39:53,380][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 01:39:56,581][26599] Updated weights for policy 0, policy_version 252364 (0.0038) [2024-06-19 01:39:58,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4134780928. Throughput: 0: 42130.7. Samples: 402468600. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:39:58,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 01:39:59,933][26599] Updated weights for policy 0, policy_version 252374 (0.0030) [2024-06-19 01:40:03,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4135026688. Throughput: 0: 42194.2. Samples: 402589460. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:40:03,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 01:40:04,297][26599] Updated weights for policy 0, policy_version 252384 (0.0046) [2024-06-19 01:40:08,002][26599] Updated weights for policy 0, policy_version 252394 (0.0039) [2024-06-19 01:40:08,380][26367] Fps is (10 sec: 45875.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4135239680. Throughput: 0: 42112.6. Samples: 402840400. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:40:08,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 01:40:12,075][26599] Updated weights for policy 0, policy_version 252404 (0.0034) [2024-06-19 01:40:13,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4135419904. Throughput: 0: 41962.5. Samples: 403094700. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:40:13,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 01:40:15,757][26599] Updated weights for policy 0, policy_version 252414 (0.0036) [2024-06-19 01:40:18,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4135632896. Throughput: 0: 42060.5. Samples: 403213180. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:40:18,383][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 01:40:19,874][26599] Updated weights for policy 0, policy_version 252424 (0.0025) [2024-06-19 01:40:23,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4135862272. Throughput: 0: 42224.4. Samples: 403477660. 
Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:40:23,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 01:40:23,431][26599] Updated weights for policy 0, policy_version 252434 (0.0023) [2024-06-19 01:40:27,480][26599] Updated weights for policy 0, policy_version 252444 (0.0028) [2024-06-19 01:40:28,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41781.5, 300 sec: 41987.5). Total num frames: 4136042496. Throughput: 0: 41958.4. Samples: 403726860. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:40:28,380][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 01:40:31,044][26599] Updated weights for policy 0, policy_version 252454 (0.0033) [2024-06-19 01:40:33,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42154.6). Total num frames: 4136288256. Throughput: 0: 42163.9. Samples: 403847460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:40:33,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 01:40:35,159][26599] Updated weights for policy 0, policy_version 252464 (0.0034) [2024-06-19 01:40:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4136484864. Throughput: 0: 42106.2. Samples: 404104900. Policy #0 lag: (min: 1.0, avg: 10.9, max: 23.0) [2024-06-19 01:40:38,381][26367] Avg episode reward: [(0, '0.394')] [2024-06-19 01:40:38,872][26599] Updated weights for policy 0, policy_version 252474 (0.0027) [2024-06-19 01:40:42,948][26599] Updated weights for policy 0, policy_version 252484 (0.0042) [2024-06-19 01:40:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4136697856. Throughput: 0: 41871.1. Samples: 404352800. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:40:43,381][26367] Avg episode reward: [(0, '0.389')] [2024-06-19 01:40:46,879][26599] Updated weights for policy 0, policy_version 252494 (0.0033) [2024-06-19 01:40:48,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42327.8, 300 sec: 42154.1). Total num frames: 4136910848. Throughput: 0: 42056.8. Samples: 404482020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:40:48,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 01:40:50,751][26599] Updated weights for policy 0, policy_version 252504 (0.0048) [2024-06-19 01:40:53,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4137123840. Throughput: 0: 42038.2. Samples: 404732120. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:40:53,380][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 01:40:54,856][26599] Updated weights for policy 0, policy_version 252514 (0.0035) [2024-06-19 01:40:58,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4137320448. Throughput: 0: 41847.1. Samples: 404977820. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:40:58,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 01:40:58,822][26599] Updated weights for policy 0, policy_version 252524 (0.0033) [2024-06-19 01:40:59,856][26579] Signal inference workers to stop experience collection... (6000 times) [2024-06-19 01:40:59,856][26579] Signal inference workers to resume experience collection... (6000 times) [2024-06-19 01:40:59,900][26599] InferenceWorker_p0-w0: stopping experience collection (6000 times) [2024-06-19 01:40:59,900][26599] InferenceWorker_p0-w0: resuming experience collection (6000 times) [2024-06-19 01:41:02,731][26599] Updated weights for policy 0, policy_version 252534 (0.0038) [2024-06-19 01:41:03,380][26367] Fps is (10 sec: 39320.7, 60 sec: 41506.0, 300 sec: 42043.0). Total num frames: 4137517056. Throughput: 0: 41972.8. 
Samples: 405101960. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:03,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 01:41:06,677][26599] Updated weights for policy 0, policy_version 252544 (0.0047) [2024-06-19 01:41:08,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41779.0, 300 sec: 41987.5). Total num frames: 4137746432. Throughput: 0: 41569.3. Samples: 405348280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:08,382][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 01:41:10,395][26599] Updated weights for policy 0, policy_version 252554 (0.0027) [2024-06-19 01:41:13,380][26367] Fps is (10 sec: 42599.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4137943040. Throughput: 0: 41662.7. Samples: 405601680. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:13,380][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 01:41:14,640][26599] Updated weights for policy 0, policy_version 252564 (0.0046) [2024-06-19 01:41:18,199][26599] Updated weights for policy 0, policy_version 252574 (0.0033) [2024-06-19 01:41:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4138172416. Throughput: 0: 41751.0. Samples: 405726260. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:18,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 01:41:22,645][26599] Updated weights for policy 0, policy_version 252584 (0.0044) [2024-06-19 01:41:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4138369024. Throughput: 0: 41698.7. Samples: 405981340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:23,380][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 01:41:26,436][26599] Updated weights for policy 0, policy_version 252594 (0.0029) [2024-06-19 01:41:28,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4138582016. Throughput: 0: 41720.5. 
Samples: 406230220. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:28,380][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 01:41:30,729][26599] Updated weights for policy 0, policy_version 252604 (0.0046) [2024-06-19 01:41:33,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 4138762240. Throughput: 0: 41539.7. Samples: 406351300. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:33,380][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 01:41:34,475][26599] Updated weights for policy 0, policy_version 252614 (0.0034) [2024-06-19 01:41:38,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4138975232. Throughput: 0: 41616.8. Samples: 406604880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:38,380][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 01:41:38,415][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000252624_4138991616.pth... [2024-06-19 01:41:38,425][26599] Updated weights for policy 0, policy_version 252624 (0.0041) [2024-06-19 01:41:38,474][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000252009_4128915456.pth [2024-06-19 01:41:42,283][26599] Updated weights for policy 0, policy_version 252634 (0.0041) [2024-06-19 01:41:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4139188224. Throughput: 0: 41840.5. Samples: 406860640. Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:43,380][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 01:41:46,137][26599] Updated weights for policy 0, policy_version 252644 (0.0030) [2024-06-19 01:41:48,384][26367] Fps is (10 sec: 44220.5, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 4139417600. Throughput: 0: 41847.4. Samples: 406985240. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 24.0) [2024-06-19 01:41:48,384][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 01:41:49,867][26599] Updated weights for policy 0, policy_version 252654 (0.0039) [2024-06-19 01:41:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4139614208. Throughput: 0: 41903.3. Samples: 407233920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:41:53,380][26367] Avg episode reward: [(0, '0.237')] [2024-06-19 01:41:53,775][26599] Updated weights for policy 0, policy_version 252664 (0.0039) [2024-06-19 01:41:58,020][26599] Updated weights for policy 0, policy_version 252674 (0.0024) [2024-06-19 01:41:58,380][26367] Fps is (10 sec: 40974.6, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4139827200. Throughput: 0: 41957.6. Samples: 407489780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:41:58,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 01:42:01,426][26599] Updated weights for policy 0, policy_version 252684 (0.0043) [2024-06-19 01:42:03,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 41987.4). Total num frames: 4140056576. Throughput: 0: 41983.1. Samples: 407615500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:03,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 01:42:05,675][26599] Updated weights for policy 0, policy_version 252694 (0.0036) [2024-06-19 01:42:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4140236800. Throughput: 0: 41960.8. Samples: 407869580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:08,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 01:42:08,615][26579] Signal inference workers to stop experience collection... (6050 times) [2024-06-19 01:42:08,615][26579] Signal inference workers to resume experience collection... 
(6050 times) [2024-06-19 01:42:08,632][26599] InferenceWorker_p0-w0: stopping experience collection (6050 times) [2024-06-19 01:42:08,632][26599] InferenceWorker_p0-w0: resuming experience collection (6050 times) [2024-06-19 01:42:09,096][26599] Updated weights for policy 0, policy_version 252704 (0.0031) [2024-06-19 01:42:13,353][26599] Updated weights for policy 0, policy_version 252714 (0.0035) [2024-06-19 01:42:13,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4140466176. Throughput: 0: 41948.8. Samples: 408117920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:13,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 01:42:16,844][26599] Updated weights for policy 0, policy_version 252724 (0.0028) [2024-06-19 01:42:18,384][26367] Fps is (10 sec: 45858.9, 60 sec: 42049.8, 300 sec: 42098.6). Total num frames: 4140695552. Throughput: 0: 42139.2. Samples: 408247720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:18,385][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 01:42:21,081][26599] Updated weights for policy 0, policy_version 252734 (0.0052) [2024-06-19 01:42:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4140875776. Throughput: 0: 42134.5. Samples: 408500940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:23,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 01:42:24,611][26599] Updated weights for policy 0, policy_version 252744 (0.0025) [2024-06-19 01:42:28,380][26367] Fps is (10 sec: 39335.5, 60 sec: 41779.0, 300 sec: 41931.9). Total num frames: 4141088768. Throughput: 0: 42104.7. Samples: 408755360. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:28,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 01:42:28,787][26599] Updated weights for policy 0, policy_version 252754 (0.0034) [2024-06-19 01:42:32,351][26599] Updated weights for policy 0, policy_version 252764 (0.0034) [2024-06-19 01:42:33,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 4141334528. Throughput: 0: 42188.3. Samples: 408883560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:33,381][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 01:42:36,349][26599] Updated weights for policy 0, policy_version 252774 (0.0031) [2024-06-19 01:42:38,384][26367] Fps is (10 sec: 40945.4, 60 sec: 42049.7, 300 sec: 41875.9). Total num frames: 4141498368. Throughput: 0: 42169.0. Samples: 409131680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:38,385][26367] Avg episode reward: [(0, '0.373')] [2024-06-19 01:42:40,451][26599] Updated weights for policy 0, policy_version 252784 (0.0053) [2024-06-19 01:42:43,380][26367] Fps is (10 sec: 37683.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4141711360. Throughput: 0: 42095.7. Samples: 409384080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:43,380][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 01:42:44,133][26599] Updated weights for policy 0, policy_version 252794 (0.0034) [2024-06-19 01:42:48,160][26599] Updated weights for policy 0, policy_version 252804 (0.0052) [2024-06-19 01:42:48,380][26367] Fps is (10 sec: 45892.2, 60 sec: 42327.9, 300 sec: 42098.5). Total num frames: 4141957120. Throughput: 0: 42093.0. Samples: 409509680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:48,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 01:42:52,251][26599] Updated weights for policy 0, policy_version 252814 (0.0036) [2024-06-19 01:42:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 41931.9). 
Total num frames: 4142153728. Throughput: 0: 42029.0. Samples: 409760880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:53,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 01:42:55,768][26599] Updated weights for policy 0, policy_version 252824 (0.0035) [2024-06-19 01:42:58,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 4142350336. Throughput: 0: 42083.7. Samples: 410011680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 01:42:58,380][26367] Avg episode reward: [(0, '0.359')] [2024-06-19 01:43:00,045][26599] Updated weights for policy 0, policy_version 252834 (0.0038) [2024-06-19 01:43:03,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4142563328. Throughput: 0: 42068.3. Samples: 410140640. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:03,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 01:43:03,590][26599] Updated weights for policy 0, policy_version 252844 (0.0038) [2024-06-19 01:43:07,773][26599] Updated weights for policy 0, policy_version 252854 (0.0042) [2024-06-19 01:43:08,384][26367] Fps is (10 sec: 42582.2, 60 sec: 42322.8, 300 sec: 41931.4). Total num frames: 4142776320. Throughput: 0: 42030.4. Samples: 410392460. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:08,385][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 01:43:10,722][26579] Signal inference workers to stop experience collection... (6100 times) [2024-06-19 01:43:10,723][26579] Signal inference workers to resume experience collection... 
(6100 times) [2024-06-19 01:43:10,741][26599] InferenceWorker_p0-w0: stopping experience collection (6100 times) [2024-06-19 01:43:10,776][26599] InferenceWorker_p0-w0: resuming experience collection (6100 times) [2024-06-19 01:43:11,337][26599] Updated weights for policy 0, policy_version 252864 (0.0037) [2024-06-19 01:43:13,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 4142989312. Throughput: 0: 41929.6. Samples: 410642340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:13,384][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 01:43:15,770][26599] Updated weights for policy 0, policy_version 252874 (0.0046) [2024-06-19 01:43:18,380][26367] Fps is (10 sec: 40975.3, 60 sec: 41508.7, 300 sec: 41987.5). Total num frames: 4143185920. Throughput: 0: 41877.4. Samples: 410768040. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:18,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 01:43:19,489][26599] Updated weights for policy 0, policy_version 252884 (0.0030) [2024-06-19 01:43:23,384][26367] Fps is (10 sec: 40960.1, 60 sec: 42049.8, 300 sec: 41986.9). Total num frames: 4143398912. Throughput: 0: 41949.8. Samples: 411019420. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:23,385][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 01:43:23,519][26599] Updated weights for policy 0, policy_version 252894 (0.0028) [2024-06-19 01:43:27,231][26599] Updated weights for policy 0, policy_version 252904 (0.0035) [2024-06-19 01:43:28,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4143628288. Throughput: 0: 41884.2. Samples: 411268880. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:28,381][26367] Avg episode reward: [(0, '0.388')] [2024-06-19 01:43:31,455][26599] Updated weights for policy 0, policy_version 252914 (0.0036) [2024-06-19 01:43:33,384][26367] Fps is (10 sec: 42598.5, 60 sec: 41503.6, 300 sec: 41931.4). Total num frames: 4143824896. Throughput: 0: 41999.7. Samples: 411399820. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:33,384][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 01:43:35,043][26599] Updated weights for policy 0, policy_version 252924 (0.0051) [2024-06-19 01:43:38,384][26367] Fps is (10 sec: 40945.4, 60 sec: 42325.3, 300 sec: 42042.5). Total num frames: 4144037888. Throughput: 0: 41925.9. Samples: 411647700. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:38,385][26367] Avg episode reward: [(0, '0.758')] [2024-06-19 01:43:38,412][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000252932_4144037888.pth... [2024-06-19 01:43:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000252320_4134010880.pth [2024-06-19 01:43:39,211][26599] Updated weights for policy 0, policy_version 252934 (0.0035) [2024-06-19 01:43:43,053][26599] Updated weights for policy 0, policy_version 252944 (0.0051) [2024-06-19 01:43:43,380][26367] Fps is (10 sec: 40975.3, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 4144234496. Throughput: 0: 41944.0. Samples: 411899160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:43,380][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 01:43:46,920][26599] Updated weights for policy 0, policy_version 252954 (0.0042) [2024-06-19 01:43:48,384][26367] Fps is (10 sec: 40960.2, 60 sec: 41503.6, 300 sec: 41931.4). Total num frames: 4144447488. Throughput: 0: 41827.8. Samples: 412023040. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:48,384][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 01:43:51,062][26599] Updated weights for policy 0, policy_version 252964 (0.0048) [2024-06-19 01:43:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 4144676864. Throughput: 0: 41912.8. Samples: 412278380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:53,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 01:43:54,733][26599] Updated weights for policy 0, policy_version 252974 (0.0031) [2024-06-19 01:43:58,380][26367] Fps is (10 sec: 40974.5, 60 sec: 41779.0, 300 sec: 41987.5). Total num frames: 4144857088. Throughput: 0: 41939.8. Samples: 412529480. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:43:58,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 01:43:58,839][26599] Updated weights for policy 0, policy_version 252984 (0.0028) [2024-06-19 01:44:02,882][26599] Updated weights for policy 0, policy_version 252994 (0.0044) [2024-06-19 01:44:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4145070080. Throughput: 0: 41917.8. Samples: 412654340. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:44:03,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 01:44:06,571][26599] Updated weights for policy 0, policy_version 253004 (0.0028) [2024-06-19 01:44:08,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 4145299456. Throughput: 0: 41892.7. Samples: 412904440. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 01:44:08,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 01:44:10,457][26599] Updated weights for policy 0, policy_version 253014 (0.0029) [2024-06-19 01:44:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41781.8, 300 sec: 42043.0). Total num frames: 4145496064. Throughput: 0: 42019.2. Samples: 413159740. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:13,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 01:44:14,249][26599] Updated weights for policy 0, policy_version 253024 (0.0035) [2024-06-19 01:44:18,307][26599] Updated weights for policy 0, policy_version 253034 (0.0031) [2024-06-19 01:44:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4145709056. Throughput: 0: 41882.9. Samples: 413284400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:18,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 01:44:21,970][26599] Updated weights for policy 0, policy_version 253044 (0.0038) [2024-06-19 01:44:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42327.8, 300 sec: 42043.4). Total num frames: 4145938432. Throughput: 0: 41886.0. Samples: 413532420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:23,381][26367] Avg episode reward: [(0, '0.243')] [2024-06-19 01:44:26,050][26599] Updated weights for policy 0, policy_version 253054 (0.0034) [2024-06-19 01:44:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4146135040. Throughput: 0: 42018.1. Samples: 413789980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:28,381][26367] Avg episode reward: [(0, '0.373')] [2024-06-19 01:44:29,868][26599] Updated weights for policy 0, policy_version 253064 (0.0029) [2024-06-19 01:44:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42054.8, 300 sec: 41876.4). Total num frames: 4146348032. Throughput: 0: 41839.3. Samples: 413905660. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:33,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 01:44:33,726][26599] Updated weights for policy 0, policy_version 253074 (0.0030) [2024-06-19 01:44:36,628][26579] Signal inference workers to stop experience collection... 
(6150 times) [2024-06-19 01:44:36,628][26579] Signal inference workers to resume experience collection... (6150 times) [2024-06-19 01:44:36,653][26599] InferenceWorker_p0-w0: stopping experience collection (6150 times) [2024-06-19 01:44:36,654][26599] InferenceWorker_p0-w0: resuming experience collection (6150 times) [2024-06-19 01:44:37,422][26599] Updated weights for policy 0, policy_version 253084 (0.0030) [2024-06-19 01:44:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 4146561024. Throughput: 0: 41954.1. Samples: 414166320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:38,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 01:44:41,557][26599] Updated weights for policy 0, policy_version 253094 (0.0041) [2024-06-19 01:44:43,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.0, 300 sec: 41932.4). Total num frames: 4146741248. Throughput: 0: 42183.1. Samples: 414427720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:43,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 01:44:44,991][26599] Updated weights for policy 0, policy_version 253104 (0.0035) [2024-06-19 01:44:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42327.9, 300 sec: 41931.9). Total num frames: 4146987008. Throughput: 0: 41924.9. Samples: 414540960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:48,381][26367] Avg episode reward: [(0, '0.225')] [2024-06-19 01:44:49,111][26599] Updated weights for policy 0, policy_version 253114 (0.0034) [2024-06-19 01:44:52,969][26599] Updated weights for policy 0, policy_version 253124 (0.0029) [2024-06-19 01:44:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4147183616. Throughput: 0: 42029.8. Samples: 414795780. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:53,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 01:44:57,058][26599] Updated weights for policy 0, policy_version 253134 (0.0042) [2024-06-19 01:44:58,380][26367] Fps is (10 sec: 37682.7, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4147363840. Throughput: 0: 42033.7. Samples: 415051260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:44:58,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 01:45:00,653][26599] Updated weights for policy 0, policy_version 253144 (0.0044) [2024-06-19 01:45:03,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 4147609600. Throughput: 0: 41859.1. Samples: 415168060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:45:03,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 01:45:05,508][26599] Updated weights for policy 0, policy_version 253154 (0.0041) [2024-06-19 01:45:08,380][26367] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4147806208. Throughput: 0: 42182.8. Samples: 415430640. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:45:08,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 01:45:08,882][26599] Updated weights for policy 0, policy_version 253164 (0.0037) [2024-06-19 01:45:13,191][26599] Updated weights for policy 0, policy_version 253174 (0.0052) [2024-06-19 01:45:13,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 4148002816. Throughput: 0: 41966.3. Samples: 415678460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:45:13,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 01:45:16,665][26599] Updated weights for policy 0, policy_version 253184 (0.0045) [2024-06-19 01:45:18,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 4148264960. Throughput: 0: 42220.4. Samples: 415805580. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 01:45:18,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 01:45:20,606][26599] Updated weights for policy 0, policy_version 253194 (0.0048) [2024-06-19 01:45:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4148445184. Throughput: 0: 42138.2. Samples: 416062540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:45:23,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 01:45:24,651][26599] Updated weights for policy 0, policy_version 253204 (0.0032) [2024-06-19 01:45:28,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4148641792. Throughput: 0: 41801.4. Samples: 416308780. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:45:28,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 01:45:28,593][26599] Updated weights for policy 0, policy_version 253214 (0.0040) [2024-06-19 01:45:32,286][26599] Updated weights for policy 0, policy_version 253224 (0.0042) [2024-06-19 01:45:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4148887552. Throughput: 0: 42082.2. Samples: 416434660. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:45:33,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 01:45:36,742][26599] Updated weights for policy 0, policy_version 253234 (0.0040) [2024-06-19 01:45:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4149067776. Throughput: 0: 42194.7. Samples: 416694540. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:45:38,381][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 01:45:38,416][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000253239_4149067776.pth... 
[2024-06-19 01:45:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000252624_4138991616.pth [2024-06-19 01:45:40,057][26599] Updated weights for policy 0, policy_version 253244 (0.0033) [2024-06-19 01:45:43,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.5, 300 sec: 41932.0). Total num frames: 4149280768. Throughput: 0: 41958.8. Samples: 416939400. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:45:43,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 01:45:44,529][26599] Updated weights for policy 0, policy_version 253254 (0.0037) [2024-06-19 01:45:47,833][26599] Updated weights for policy 0, policy_version 253264 (0.0042) [2024-06-19 01:45:48,384][26367] Fps is (10 sec: 44220.6, 60 sec: 42049.7, 300 sec: 41986.9). Total num frames: 4149510144. Throughput: 0: 42131.8. Samples: 417064140. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:45:48,385][26367] Avg episode reward: [(0, '0.384')] [2024-06-19 01:45:52,547][26599] Updated weights for policy 0, policy_version 253274 (0.0038) [2024-06-19 01:45:53,380][26367] Fps is (10 sec: 40959.0, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4149690368. Throughput: 0: 41882.9. Samples: 417315380. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:45:53,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 01:45:54,118][26579] Signal inference workers to stop experience collection... (6200 times) [2024-06-19 01:45:54,118][26579] Signal inference workers to resume experience collection... (6200 times) [2024-06-19 01:45:54,134][26599] InferenceWorker_p0-w0: stopping experience collection (6200 times) [2024-06-19 01:45:54,134][26599] InferenceWorker_p0-w0: resuming experience collection (6200 times) [2024-06-19 01:45:55,659][26599] Updated weights for policy 0, policy_version 253284 (0.0029) [2024-06-19 01:45:58,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 4149919744. 
Throughput: 0: 41863.5. Samples: 417562320. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:45:58,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 01:46:00,173][26599] Updated weights for policy 0, policy_version 253294 (0.0035) [2024-06-19 01:46:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4150116352. Throughput: 0: 41944.5. Samples: 417693080. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:46:03,381][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 01:46:03,462][26599] Updated weights for policy 0, policy_version 253304 (0.0037) [2024-06-19 01:46:07,705][26599] Updated weights for policy 0, policy_version 253314 (0.0034) [2024-06-19 01:46:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4150312960. Throughput: 0: 41845.8. Samples: 417945600. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:46:08,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 01:46:11,119][26599] Updated weights for policy 0, policy_version 253324 (0.0036) [2024-06-19 01:46:13,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4150558720. Throughput: 0: 41800.6. Samples: 418189800. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:46:13,380][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 01:46:15,272][26599] Updated weights for policy 0, policy_version 253334 (0.0028) [2024-06-19 01:46:18,380][26367] Fps is (10 sec: 40960.7, 60 sec: 40960.1, 300 sec: 41876.4). Total num frames: 4150722560. Throughput: 0: 42028.5. Samples: 418325940. 
Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:46:18,380][26367] Avg episode reward: [(0, '0.807')] [2024-06-19 01:46:19,064][26599] Updated weights for policy 0, policy_version 253344 (0.0025) [2024-06-19 01:46:22,843][26599] Updated weights for policy 0, policy_version 253354 (0.0023) [2024-06-19 01:46:23,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4150951936. Throughput: 0: 41755.1. Samples: 418573520. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:46:23,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 01:46:26,846][26599] Updated weights for policy 0, policy_version 253364 (0.0045) [2024-06-19 01:46:28,380][26367] Fps is (10 sec: 47512.2, 60 sec: 42598.3, 300 sec: 42154.0). Total num frames: 4151197696. Throughput: 0: 41817.5. Samples: 418821200. Policy #0 lag: (min: 0.0, avg: 8.3, max: 21.0) [2024-06-19 01:46:28,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 01:46:30,509][26599] Updated weights for policy 0, policy_version 253374 (0.0034) [2024-06-19 01:46:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 4151361536. Throughput: 0: 42010.1. Samples: 418954440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:46:33,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 01:46:34,639][26599] Updated weights for policy 0, policy_version 253384 (0.0034) [2024-06-19 01:46:38,189][26599] Updated weights for policy 0, policy_version 253394 (0.0039) [2024-06-19 01:46:38,380][26367] Fps is (10 sec: 40961.0, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 4151607296. Throughput: 0: 41943.8. Samples: 419202840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:46:38,380][26367] Avg episode reward: [(0, '0.364')] [2024-06-19 01:46:42,312][26599] Updated weights for policy 0, policy_version 253404 (0.0031) [2024-06-19 01:46:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41932.5). 
Total num frames: 4151787520. Throughput: 0: 42047.2. Samples: 419454440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:46:43,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 01:46:45,970][26599] Updated weights for policy 0, policy_version 253414 (0.0038) [2024-06-19 01:46:48,380][26367] Fps is (10 sec: 37682.9, 60 sec: 41235.6, 300 sec: 41931.9). Total num frames: 4151984128. Throughput: 0: 41884.1. Samples: 419577860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:46:48,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 01:46:50,233][26599] Updated weights for policy 0, policy_version 253424 (0.0038) [2024-06-19 01:46:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 4152229888. Throughput: 0: 41925.9. Samples: 419832260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:46:53,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 01:46:53,597][26599] Updated weights for policy 0, policy_version 253434 (0.0033) [2024-06-19 01:46:57,906][26599] Updated weights for policy 0, policy_version 253444 (0.0027) [2024-06-19 01:46:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4152426496. Throughput: 0: 42155.4. Samples: 420086800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:46:58,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 01:47:01,842][26599] Updated weights for policy 0, policy_version 253454 (0.0020) [2024-06-19 01:47:03,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4152639488. Throughput: 0: 41875.8. Samples: 420210360. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:47:03,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 01:47:05,936][26599] Updated weights for policy 0, policy_version 253464 (0.0034) [2024-06-19 01:47:08,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 41987.5). 
Total num frames: 4152852480. Throughput: 0: 42097.0. Samples: 420467880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:47:08,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 01:47:09,314][26599] Updated weights for policy 0, policy_version 253474 (0.0022) [2024-06-19 01:47:13,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41932.5). Total num frames: 4153065472. Throughput: 0: 42302.0. Samples: 420724780. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:47:13,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 01:47:13,482][26599] Updated weights for policy 0, policy_version 253484 (0.0035) [2024-06-19 01:47:16,932][26599] Updated weights for policy 0, policy_version 253494 (0.0036) [2024-06-19 01:47:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42098.6). Total num frames: 4153294848. Throughput: 0: 42028.0. Samples: 420845700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:47:18,381][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 01:47:21,500][26599] Updated weights for policy 0, policy_version 253504 (0.0028) [2024-06-19 01:47:22,117][26579] Signal inference workers to stop experience collection... (6250 times) [2024-06-19 01:47:22,118][26579] Signal inference workers to resume experience collection... (6250 times) [2024-06-19 01:47:22,135][26599] InferenceWorker_p0-w0: stopping experience collection (6250 times) [2024-06-19 01:47:22,135][26599] InferenceWorker_p0-w0: resuming experience collection (6250 times) [2024-06-19 01:47:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4153491456. Throughput: 0: 42283.1. Samples: 421105580. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:47:23,381][26367] Avg episode reward: [(0, '0.286')] [2024-06-19 01:47:24,574][26599] Updated weights for policy 0, policy_version 253514 (0.0038) [2024-06-19 01:47:28,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 4153688064. Throughput: 0: 42326.1. Samples: 421359120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:47:28,381][26367] Avg episode reward: [(0, '0.344')] [2024-06-19 01:47:29,065][26599] Updated weights for policy 0, policy_version 253524 (0.0038) [2024-06-19 01:47:32,362][26599] Updated weights for policy 0, policy_version 253534 (0.0045) [2024-06-19 01:47:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42099.1). Total num frames: 4153917440. Throughput: 0: 42337.4. Samples: 421483040. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:47:33,384][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 01:47:36,916][26599] Updated weights for policy 0, policy_version 253544 (0.0031) [2024-06-19 01:47:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4154114048. Throughput: 0: 42281.6. Samples: 421734940. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 01:47:38,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 01:47:38,399][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000253547_4154114048.pth... [2024-06-19 01:47:38,481][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000252932_4144037888.pth [2024-06-19 01:47:40,250][26599] Updated weights for policy 0, policy_version 253554 (0.0032) [2024-06-19 01:47:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4154327040. Throughput: 0: 42263.2. Samples: 421988640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:47:43,381][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 01:47:44,893][26599] Updated weights for policy 0, policy_version 253564 (0.0041) [2024-06-19 01:47:47,918][26599] Updated weights for policy 0, policy_version 253574 (0.0039) [2024-06-19 01:47:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42043.0). Total num frames: 4154556416. Throughput: 0: 42356.9. Samples: 422116420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:47:48,381][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 01:47:52,721][26599] Updated weights for policy 0, policy_version 253584 (0.0032) [2024-06-19 01:47:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 4154736640. Throughput: 0: 42164.8. Samples: 422365300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:47:53,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 01:47:55,689][26599] Updated weights for policy 0, policy_version 253594 (0.0041) [2024-06-19 01:47:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4154966016. Throughput: 0: 41960.0. Samples: 422612980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:47:58,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 01:48:00,481][26599] Updated weights for policy 0, policy_version 253604 (0.0035) [2024-06-19 01:48:03,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42043.5). Total num frames: 4155179008. Throughput: 0: 42094.6. Samples: 422739960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:03,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 01:48:03,934][26599] Updated weights for policy 0, policy_version 253614 (0.0039) [2024-06-19 01:48:08,131][26599] Updated weights for policy 0, policy_version 253624 (0.0028) [2024-06-19 01:48:08,384][26367] Fps is (10 sec: 40945.0, 60 sec: 42049.7, 300 sec: 41987.5). 
Total num frames: 4155375616. Throughput: 0: 41947.6. Samples: 422993380. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:08,384][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 01:48:11,692][26599] Updated weights for policy 0, policy_version 253634 (0.0043) [2024-06-19 01:48:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 4155588608. Throughput: 0: 41854.1. Samples: 423242560. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:13,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 01:48:15,802][26599] Updated weights for policy 0, policy_version 253644 (0.0032) [2024-06-19 01:48:18,380][26367] Fps is (10 sec: 42614.3, 60 sec: 41779.2, 300 sec: 42043.5). Total num frames: 4155801600. Throughput: 0: 41887.2. Samples: 423367960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:18,380][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 01:48:19,555][26599] Updated weights for policy 0, policy_version 253654 (0.0025) [2024-06-19 01:48:23,380][26367] Fps is (10 sec: 42599.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4156014592. Throughput: 0: 41936.6. Samples: 423622080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:23,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 01:48:23,550][26599] Updated weights for policy 0, policy_version 253664 (0.0034) [2024-06-19 01:48:27,518][26599] Updated weights for policy 0, policy_version 253674 (0.0028) [2024-06-19 01:48:28,384][26367] Fps is (10 sec: 42582.5, 60 sec: 42322.8, 300 sec: 42043.0). Total num frames: 4156227584. Throughput: 0: 41905.5. Samples: 423874540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:28,384][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 01:48:31,995][26599] Updated weights for policy 0, policy_version 253684 (0.0026) [2024-06-19 01:48:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 41988.0). 
Total num frames: 4156424192. Throughput: 0: 41883.8. Samples: 424001180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:33,380][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 01:48:35,399][26599] Updated weights for policy 0, policy_version 253694 (0.0035) [2024-06-19 01:48:38,382][26367] Fps is (10 sec: 40967.3, 60 sec: 42051.0, 300 sec: 42042.7). Total num frames: 4156637184. Throughput: 0: 41868.9. Samples: 424249480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:38,383][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 01:48:39,567][26599] Updated weights for policy 0, policy_version 253704 (0.0040) [2024-06-19 01:48:43,353][26599] Updated weights for policy 0, policy_version 253714 (0.0042) [2024-06-19 01:48:43,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 4156850176. Throughput: 0: 42015.5. Samples: 424503680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:43,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 01:48:47,770][26599] Updated weights for policy 0, policy_version 253724 (0.0028) [2024-06-19 01:48:48,380][26367] Fps is (10 sec: 40968.2, 60 sec: 41506.3, 300 sec: 41931.9). Total num frames: 4157046784. Throughput: 0: 41981.5. Samples: 424629120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:48,380][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 01:48:51,083][26599] Updated weights for policy 0, policy_version 253734 (0.0027) [2024-06-19 01:48:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4157276160. Throughput: 0: 41986.6. Samples: 424882620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 20.0) [2024-06-19 01:48:53,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 01:48:55,369][26599] Updated weights for policy 0, policy_version 253744 (0.0035) [2024-06-19 01:48:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42043.0). 
Total num frames: 4157472768. Throughput: 0: 41979.4. Samples: 425131620. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:48:58,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 01:48:59,061][26599] Updated weights for policy 0, policy_version 253754 (0.0040) [2024-06-19 01:49:03,183][26599] Updated weights for policy 0, policy_version 253764 (0.0034) [2024-06-19 01:49:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41932.0). Total num frames: 4157669376. Throughput: 0: 41978.2. Samples: 425256980. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:03,380][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 01:49:06,682][26599] Updated weights for policy 0, policy_version 253774 (0.0033) [2024-06-19 01:49:07,626][26579] Signal inference workers to stop experience collection... (6300 times) [2024-06-19 01:49:07,629][26579] Signal inference workers to resume experience collection... (6300 times) [2024-06-19 01:49:07,662][26599] InferenceWorker_p0-w0: stopping experience collection (6300 times) [2024-06-19 01:49:07,662][26599] InferenceWorker_p0-w0: resuming experience collection (6300 times) [2024-06-19 01:49:08,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42327.9, 300 sec: 42098.6). Total num frames: 4157915136. Throughput: 0: 41885.3. Samples: 425506920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:08,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 01:49:10,718][26599] Updated weights for policy 0, policy_version 253784 (0.0028) [2024-06-19 01:49:13,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 4158128128. Throughput: 0: 41966.1. Samples: 425762860. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:13,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 01:49:14,599][26599] Updated weights for policy 0, policy_version 253794 (0.0052) [2024-06-19 01:49:18,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4158291968. Throughput: 0: 41842.2. Samples: 425884080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:18,380][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 01:49:18,559][26599] Updated weights for policy 0, policy_version 253804 (0.0035) [2024-06-19 01:49:22,093][26599] Updated weights for policy 0, policy_version 253814 (0.0029) [2024-06-19 01:49:23,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4158521344. Throughput: 0: 42016.5. Samples: 426140140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:23,381][26367] Avg episode reward: [(0, '0.404')] [2024-06-19 01:49:26,178][26599] Updated weights for policy 0, policy_version 253824 (0.0026) [2024-06-19 01:49:28,387][26367] Fps is (10 sec: 44205.5, 60 sec: 41776.9, 300 sec: 41986.5). Total num frames: 4158734336. Throughput: 0: 41964.6. Samples: 426392380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:28,388][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 01:49:29,635][26599] Updated weights for policy 0, policy_version 253834 (0.0029) [2024-06-19 01:49:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4158947328. Throughput: 0: 42026.6. Samples: 426520320. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:33,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 01:49:33,821][26599] Updated weights for policy 0, policy_version 253844 (0.0036) [2024-06-19 01:49:37,854][26599] Updated weights for policy 0, policy_version 253854 (0.0030) [2024-06-19 01:49:38,380][26367] Fps is (10 sec: 42628.1, 60 sec: 42053.6, 300 sec: 42098.6). Total num frames: 4159160320. Throughput: 0: 41921.7. Samples: 426769100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:38,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 01:49:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000253855_4159160320.pth... [2024-06-19 01:49:38,446][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000253239_4149067776.pth [2024-06-19 01:49:41,749][26599] Updated weights for policy 0, policy_version 253864 (0.0040) [2024-06-19 01:49:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4159356928. Throughput: 0: 41943.1. Samples: 427019060. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:43,380][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 01:49:45,598][26599] Updated weights for policy 0, policy_version 253874 (0.0033) [2024-06-19 01:49:48,384][26367] Fps is (10 sec: 42582.8, 60 sec: 42322.7, 300 sec: 42042.5). Total num frames: 4159586304. Throughput: 0: 42043.1. Samples: 427149080. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:48,385][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 01:49:49,491][26599] Updated weights for policy 0, policy_version 253884 (0.0028) [2024-06-19 01:49:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4159782912. Throughput: 0: 42026.8. Samples: 427398120. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:53,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 01:49:53,412][26599] Updated weights for policy 0, policy_version 253894 (0.0033) [2024-06-19 01:49:57,350][26599] Updated weights for policy 0, policy_version 253904 (0.0028) [2024-06-19 01:49:58,380][26367] Fps is (10 sec: 39336.3, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 4159979520. Throughput: 0: 41829.4. Samples: 427645180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:49:58,381][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 01:50:01,059][26599] Updated weights for policy 0, policy_version 253914 (0.0039) [2024-06-19 01:50:03,384][26367] Fps is (10 sec: 42582.5, 60 sec: 42322.7, 300 sec: 42042.5). Total num frames: 4160208896. Throughput: 0: 41877.4. Samples: 427768720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 01:50:03,384][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 01:50:05,310][26599] Updated weights for policy 0, policy_version 253924 (0.0040) [2024-06-19 01:50:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4160405504. Throughput: 0: 41766.6. Samples: 428019640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:08,381][26367] Avg episode reward: [(0, '0.249')] [2024-06-19 01:50:09,155][26599] Updated weights for policy 0, policy_version 253934 (0.0036) [2024-06-19 01:50:12,831][26599] Updated weights for policy 0, policy_version 253944 (0.0031) [2024-06-19 01:50:13,380][26367] Fps is (10 sec: 42614.0, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 4160634880. Throughput: 0: 41736.7. Samples: 428270240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:13,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 01:50:16,947][26599] Updated weights for policy 0, policy_version 253954 (0.0030) [2024-06-19 01:50:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4160831488. Throughput: 0: 41875.5. Samples: 428404720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:18,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 01:50:20,552][26599] Updated weights for policy 0, policy_version 253964 (0.0043) [2024-06-19 01:50:23,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4161028096. Throughput: 0: 41855.6. Samples: 428652600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:23,381][26367] Avg episode reward: [(0, '0.786')] [2024-06-19 01:50:23,743][26579] Signal inference workers to stop experience collection... (6350 times) [2024-06-19 01:50:23,797][26599] InferenceWorker_p0-w0: stopping experience collection (6350 times) [2024-06-19 01:50:23,861][26579] Signal inference workers to resume experience collection... (6350 times) [2024-06-19 01:50:23,861][26599] InferenceWorker_p0-w0: resuming experience collection (6350 times) [2024-06-19 01:50:25,032][26599] Updated weights for policy 0, policy_version 253974 (0.0037) [2024-06-19 01:50:28,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42054.6, 300 sec: 41931.4). Total num frames: 4161257472. Throughput: 0: 41866.3. Samples: 428903200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:28,384][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 01:50:28,596][26599] Updated weights for policy 0, policy_version 253984 (0.0040) [2024-06-19 01:50:32,872][26599] Updated weights for policy 0, policy_version 253994 (0.0040) [2024-06-19 01:50:33,381][26367] Fps is (10 sec: 42594.9, 60 sec: 41778.6, 300 sec: 41987.4). Total num frames: 4161454080. Throughput: 0: 41869.4. 
Samples: 429033080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:33,382][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 01:50:36,363][26599] Updated weights for policy 0, policy_version 254004 (0.0046) [2024-06-19 01:50:38,380][26367] Fps is (10 sec: 39336.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4161650688. Throughput: 0: 41733.3. Samples: 429276120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:38,381][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 01:50:40,729][26599] Updated weights for policy 0, policy_version 254014 (0.0035) [2024-06-19 01:50:43,380][26367] Fps is (10 sec: 42601.9, 60 sec: 42052.2, 300 sec: 41932.5). Total num frames: 4161880064. Throughput: 0: 41813.7. Samples: 429526800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:43,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 01:50:44,103][26599] Updated weights for policy 0, policy_version 254024 (0.0032) [2024-06-19 01:50:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41235.6, 300 sec: 41932.0). Total num frames: 4162060288. Throughput: 0: 41944.3. Samples: 429656060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:48,380][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 01:50:48,702][26599] Updated weights for policy 0, policy_version 254034 (0.0028) [2024-06-19 01:50:51,866][26599] Updated weights for policy 0, policy_version 254044 (0.0034) [2024-06-19 01:50:53,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4162273280. Throughput: 0: 41696.9. Samples: 429896000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:53,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 01:50:56,451][26599] Updated weights for policy 0, policy_version 254054 (0.0032) [2024-06-19 01:50:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41932.0). Total num frames: 4162486272. Throughput: 0: 41988.9. 
Samples: 430159740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:50:58,380][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 01:50:59,594][26599] Updated weights for policy 0, policy_version 254064 (0.0032) [2024-06-19 01:51:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41508.7, 300 sec: 41987.5). Total num frames: 4162699264. Throughput: 0: 41745.4. Samples: 430283260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:51:03,380][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 01:51:04,236][26599] Updated weights for policy 0, policy_version 254074 (0.0026) [2024-06-19 01:51:07,351][26599] Updated weights for policy 0, policy_version 254084 (0.0027) [2024-06-19 01:51:08,384][26367] Fps is (10 sec: 44220.2, 60 sec: 42049.7, 300 sec: 41931.4). Total num frames: 4162928640. Throughput: 0: 41766.3. Samples: 430532240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:51:08,384][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 01:51:12,246][26599] Updated weights for policy 0, policy_version 254094 (0.0034) [2024-06-19 01:51:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4163125248. Throughput: 0: 41813.6. Samples: 430784660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 01:51:13,381][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 01:51:15,087][26599] Updated weights for policy 0, policy_version 254104 (0.0028) [2024-06-19 01:51:18,380][26367] Fps is (10 sec: 40975.2, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4163338240. Throughput: 0: 41595.0. Samples: 430904820. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:51:18,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 01:51:19,954][26599] Updated weights for policy 0, policy_version 254114 (0.0029) [2024-06-19 01:51:22,974][26599] Updated weights for policy 0, policy_version 254124 (0.0044) [2024-06-19 01:51:23,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 4163584000. Throughput: 0: 41973.2. Samples: 431164920. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:51:23,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 01:51:27,738][26599] Updated weights for policy 0, policy_version 254134 (0.0034) [2024-06-19 01:51:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41508.6, 300 sec: 41987.5). Total num frames: 4163747840. Throughput: 0: 42052.4. Samples: 431419160. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:51:28,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 01:51:30,879][26599] Updated weights for policy 0, policy_version 254144 (0.0041) [2024-06-19 01:51:33,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41779.7, 300 sec: 41876.4). Total num frames: 4163960832. Throughput: 0: 41757.7. Samples: 431535160. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:51:33,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 01:51:35,497][26599] Updated weights for policy 0, policy_version 254154 (0.0027) [2024-06-19 01:51:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4164190208. Throughput: 0: 42145.3. Samples: 431792540. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:51:38,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 01:51:38,479][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000254163_4164206592.pth... 
[2024-06-19 01:51:38,531][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000253547_4154114048.pth [2024-06-19 01:51:39,061][26599] Updated weights for policy 0, policy_version 254164 (0.0032) [2024-06-19 01:51:43,275][26599] Updated weights for policy 0, policy_version 254174 (0.0030) [2024-06-19 01:51:43,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4164386816. Throughput: 0: 41903.9. Samples: 432045420. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:51:43,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 01:51:46,822][26599] Updated weights for policy 0, policy_version 254184 (0.0029) [2024-06-19 01:51:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4164599808. Throughput: 0: 41872.4. Samples: 432167520. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:51:48,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 01:51:50,930][26579] Signal inference workers to stop experience collection... (6400 times) [2024-06-19 01:51:50,940][26579] Signal inference workers to resume experience collection... (6400 times) [2024-06-19 01:51:50,962][26599] InferenceWorker_p0-w0: stopping experience collection (6400 times) [2024-06-19 01:51:50,962][26599] InferenceWorker_p0-w0: resuming experience collection (6400 times) [2024-06-19 01:51:51,082][26599] Updated weights for policy 0, policy_version 254194 (0.0033) [2024-06-19 01:51:53,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4164796416. Throughput: 0: 42027.3. Samples: 432423320. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:51:53,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 01:51:54,488][26599] Updated weights for policy 0, policy_version 254204 (0.0030) [2024-06-19 01:51:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4165009408. 
Throughput: 0: 42005.7. Samples: 432674920. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:51:58,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 01:51:58,915][26599] Updated weights for policy 0, policy_version 254214 (0.0042) [2024-06-19 01:52:02,253][26599] Updated weights for policy 0, policy_version 254224 (0.0034) [2024-06-19 01:52:03,384][26367] Fps is (10 sec: 44221.1, 60 sec: 42322.7, 300 sec: 41986.9). Total num frames: 4165238784. Throughput: 0: 42031.2. Samples: 432796380. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:52:03,384][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 01:52:07,029][26599] Updated weights for policy 0, policy_version 254234 (0.0039) [2024-06-19 01:52:08,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41508.7, 300 sec: 41876.4). Total num frames: 4165419008. Throughput: 0: 41906.3. Samples: 433050700. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:52:08,380][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 01:52:10,130][26599] Updated weights for policy 0, policy_version 254244 (0.0038) [2024-06-19 01:52:13,380][26367] Fps is (10 sec: 39336.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4165632000. Throughput: 0: 41852.5. Samples: 433302520. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:52:13,380][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 01:52:14,615][26599] Updated weights for policy 0, policy_version 254254 (0.0034) [2024-06-19 01:52:17,905][26599] Updated weights for policy 0, policy_version 254264 (0.0041) [2024-06-19 01:52:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4165861376. Throughput: 0: 42032.1. Samples: 433426600. 
Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:52:18,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 01:52:22,377][26599] Updated weights for policy 0, policy_version 254274 (0.0040) [2024-06-19 01:52:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4166074368. Throughput: 0: 41854.3. Samples: 433675980. Policy #0 lag: (min: 1.0, avg: 12.9, max: 21.0) [2024-06-19 01:52:23,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 01:52:25,845][26599] Updated weights for policy 0, policy_version 254284 (0.0030) [2024-06-19 01:52:28,384][26367] Fps is (10 sec: 40944.6, 60 sec: 42049.7, 300 sec: 41875.9). Total num frames: 4166270976. Throughput: 0: 41815.6. Samples: 433927280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:52:28,384][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 01:52:30,107][26599] Updated weights for policy 0, policy_version 254294 (0.0039) [2024-06-19 01:52:33,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4166483968. Throughput: 0: 41920.8. Samples: 434053960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:52:33,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 01:52:33,716][26599] Updated weights for policy 0, policy_version 254304 (0.0039) [2024-06-19 01:52:37,836][26599] Updated weights for policy 0, policy_version 254314 (0.0035) [2024-06-19 01:52:38,380][26367] Fps is (10 sec: 40975.2, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4166680576. Throughput: 0: 42000.1. Samples: 434313320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:52:38,381][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 01:52:41,690][26599] Updated weights for policy 0, policy_version 254324 (0.0042) [2024-06-19 01:52:43,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 4166926336. Throughput: 0: 41786.3. Samples: 434555300. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:52:43,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 01:52:45,658][26599] Updated weights for policy 0, policy_version 254334 (0.0035) [2024-06-19 01:52:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4167106560. Throughput: 0: 41929.7. Samples: 434683060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:52:48,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 01:52:49,425][26599] Updated weights for policy 0, policy_version 254344 (0.0035) [2024-06-19 01:52:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4167319552. Throughput: 0: 41930.7. Samples: 434937580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:52:53,380][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 01:52:53,394][26599] Updated weights for policy 0, policy_version 254354 (0.0036) [2024-06-19 01:52:57,247][26599] Updated weights for policy 0, policy_version 254364 (0.0032) [2024-06-19 01:52:58,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4167548928. Throughput: 0: 41757.7. Samples: 435181620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:52:58,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 01:53:01,820][26599] Updated weights for policy 0, policy_version 254374 (0.0038) [2024-06-19 01:53:03,384][26367] Fps is (10 sec: 42582.2, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4167745536. Throughput: 0: 41800.5. Samples: 435307780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:53:03,385][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 01:53:05,011][26599] Updated weights for policy 0, policy_version 254384 (0.0022) [2024-06-19 01:53:08,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4167925760. Throughput: 0: 41868.0. Samples: 435560040. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:53:08,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 01:53:09,394][26599] Updated weights for policy 0, policy_version 254394 (0.0025) [2024-06-19 01:53:12,830][26579] Signal inference workers to stop experience collection... (6450 times) [2024-06-19 01:53:12,831][26579] Signal inference workers to resume experience collection... (6450 times) [2024-06-19 01:53:12,856][26599] InferenceWorker_p0-w0: stopping experience collection (6450 times) [2024-06-19 01:53:12,859][26599] Updated weights for policy 0, policy_version 254404 (0.0040) [2024-06-19 01:53:12,885][26599] InferenceWorker_p0-w0: resuming experience collection (6450 times) [2024-06-19 01:53:13,384][26367] Fps is (10 sec: 42598.2, 60 sec: 42322.6, 300 sec: 41931.4). Total num frames: 4168171520. Throughput: 0: 41752.9. Samples: 435806160. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:53:13,385][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 01:53:17,137][26599] Updated weights for policy 0, policy_version 254414 (0.0035) [2024-06-19 01:53:18,381][26367] Fps is (10 sec: 42593.5, 60 sec: 41505.3, 300 sec: 41820.7). Total num frames: 4168351744. Throughput: 0: 41870.5. Samples: 435938180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:53:18,382][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 01:53:20,622][26599] Updated weights for policy 0, policy_version 254424 (0.0041) [2024-06-19 01:53:23,380][26367] Fps is (10 sec: 40975.5, 60 sec: 41779.2, 300 sec: 41876.9). Total num frames: 4168581120. Throughput: 0: 41707.2. Samples: 436190140. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:53:23,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 01:53:24,895][26599] Updated weights for policy 0, policy_version 254434 (0.0041) [2024-06-19 01:53:28,380][26367] Fps is (10 sec: 42603.7, 60 sec: 41781.8, 300 sec: 41876.4). Total num frames: 4168777728. Throughput: 0: 42017.3. 
Samples: 436446080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:53:28,380][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 01:53:28,639][26599] Updated weights for policy 0, policy_version 254444 (0.0039) [2024-06-19 01:53:32,624][26599] Updated weights for policy 0, policy_version 254454 (0.0036) [2024-06-19 01:53:33,382][26367] Fps is (10 sec: 40951.5, 60 sec: 41777.8, 300 sec: 41876.4). Total num frames: 4168990720. Throughput: 0: 41963.0. Samples: 436571480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 01:53:33,383][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 01:53:36,246][26599] Updated weights for policy 0, policy_version 254464 (0.0032) [2024-06-19 01:53:38,384][26367] Fps is (10 sec: 44220.1, 60 sec: 42322.7, 300 sec: 41931.4). Total num frames: 4169220096. Throughput: 0: 41852.0. Samples: 436821080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:53:38,384][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 01:53:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000254469_4169220096.pth... [2024-06-19 01:53:38,447][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000253855_4159160320.pth [2024-06-19 01:53:40,167][26599] Updated weights for policy 0, policy_version 254474 (0.0026) [2024-06-19 01:53:43,380][26367] Fps is (10 sec: 40968.8, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 4169400320. Throughput: 0: 42147.6. Samples: 437078260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:53:43,380][26367] Avg episode reward: [(0, '0.402')] [2024-06-19 01:53:44,216][26599] Updated weights for policy 0, policy_version 254484 (0.0041) [2024-06-19 01:53:48,380][26367] Fps is (10 sec: 39336.4, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4169613312. Throughput: 0: 42158.2. Samples: 437204740. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:53:48,380][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 01:53:48,435][26599] Updated weights for policy 0, policy_version 254494 (0.0034) [2024-06-19 01:53:51,896][26599] Updated weights for policy 0, policy_version 254504 (0.0026) [2024-06-19 01:53:53,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4169842688. Throughput: 0: 42017.9. Samples: 437450840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:53:53,380][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 01:53:55,994][26599] Updated weights for policy 0, policy_version 254514 (0.0039) [2024-06-19 01:53:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4170055680. Throughput: 0: 42243.5. Samples: 437706960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:53:58,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 01:53:59,669][26599] Updated weights for policy 0, policy_version 254524 (0.0036) [2024-06-19 01:54:03,384][26367] Fps is (10 sec: 40944.5, 60 sec: 41779.2, 300 sec: 41820.3). Total num frames: 4170252288. Throughput: 0: 42154.1. Samples: 437835220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:54:03,384][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 01:54:03,572][26599] Updated weights for policy 0, policy_version 254534 (0.0035) [2024-06-19 01:54:07,791][26599] Updated weights for policy 0, policy_version 254544 (0.0035) [2024-06-19 01:54:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 4170481664. Throughput: 0: 42168.5. Samples: 438087720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:54:08,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 01:54:11,454][26599] Updated weights for policy 0, policy_version 254554 (0.0027) [2024-06-19 01:54:13,380][26367] Fps is (10 sec: 44253.1, 60 sec: 42054.9, 300 sec: 42043.0). Total num frames: 4170694656. Throughput: 0: 41937.7. Samples: 438333280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:54:13,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 01:54:15,498][26599] Updated weights for policy 0, policy_version 254564 (0.0040) [2024-06-19 01:54:18,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42053.1, 300 sec: 41876.4). Total num frames: 4170874880. Throughput: 0: 42014.4. Samples: 438462040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:54:18,380][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 01:54:19,239][26599] Updated weights for policy 0, policy_version 254574 (0.0031) [2024-06-19 01:54:23,264][26599] Updated weights for policy 0, policy_version 254584 (0.0044) [2024-06-19 01:54:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 41932.9). Total num frames: 4171104256. Throughput: 0: 42209.7. Samples: 438720360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:54:23,380][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 01:54:26,961][26599] Updated weights for policy 0, policy_version 254594 (0.0038) [2024-06-19 01:54:27,801][26579] Signal inference workers to stop experience collection... (6500 times) [2024-06-19 01:54:27,802][26579] Signal inference workers to resume experience collection... (6500 times) [2024-06-19 01:54:27,810][26599] InferenceWorker_p0-w0: stopping experience collection (6500 times) [2024-06-19 01:54:27,810][26599] InferenceWorker_p0-w0: resuming experience collection (6500 times) [2024-06-19 01:54:28,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4171333632. Throughput: 0: 42029.3. 
Samples: 438969580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:54:28,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 01:54:31,154][26599] Updated weights for policy 0, policy_version 254604 (0.0037) [2024-06-19 01:54:33,384][26367] Fps is (10 sec: 40945.4, 60 sec: 42051.3, 300 sec: 41875.9). Total num frames: 4171513856. Throughput: 0: 42118.5. Samples: 439100220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:54:33,384][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 01:54:34,635][26599] Updated weights for policy 0, policy_version 254614 (0.0038) [2024-06-19 01:54:38,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41781.8, 300 sec: 41931.9). Total num frames: 4171726848. Throughput: 0: 42104.4. Samples: 439345540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:54:38,380][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 01:54:39,057][26599] Updated weights for policy 0, policy_version 254624 (0.0035) [2024-06-19 01:54:42,694][26599] Updated weights for policy 0, policy_version 254634 (0.0040) [2024-06-19 01:54:43,380][26367] Fps is (10 sec: 44252.2, 60 sec: 42598.3, 300 sec: 41932.5). Total num frames: 4171956224. Throughput: 0: 42066.2. Samples: 439599940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 01:54:43,381][26367] Avg episode reward: [(0, '0.306')] [2024-06-19 01:54:46,855][26599] Updated weights for policy 0, policy_version 254644 (0.0035) [2024-06-19 01:54:48,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4172169216. Throughput: 0: 41942.6. Samples: 439722480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:54:48,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 01:54:50,562][26599] Updated weights for policy 0, policy_version 254654 (0.0037) [2024-06-19 01:54:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4172365824. Throughput: 0: 41912.9. 
Samples: 439973800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:54:53,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 01:54:54,576][26599] Updated weights for policy 0, policy_version 254664 (0.0041) [2024-06-19 01:54:58,371][26599] Updated weights for policy 0, policy_version 254674 (0.0038) [2024-06-19 01:54:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41932.5). Total num frames: 4172578816. Throughput: 0: 42017.3. Samples: 440224060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:54:58,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 01:55:02,498][26599] Updated weights for policy 0, policy_version 254684 (0.0048) [2024-06-19 01:55:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42328.0, 300 sec: 41987.5). Total num frames: 4172791808. Throughput: 0: 41824.5. Samples: 440344140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:03,380][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 01:55:06,107][26599] Updated weights for policy 0, policy_version 254694 (0.0044) [2024-06-19 01:55:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4172988416. Throughput: 0: 41804.4. Samples: 440601560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:08,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 01:55:10,201][26599] Updated weights for policy 0, policy_version 254704 (0.0033) [2024-06-19 01:55:13,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4173185024. Throughput: 0: 41876.0. Samples: 440854000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:13,381][26367] Avg episode reward: [(0, '0.381')] [2024-06-19 01:55:14,207][26599] Updated weights for policy 0, policy_version 254714 (0.0037) [2024-06-19 01:55:18,081][26599] Updated weights for policy 0, policy_version 254724 (0.0037) [2024-06-19 01:55:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4173398016. Throughput: 0: 41749.4. Samples: 440978800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:18,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 01:55:22,259][26599] Updated weights for policy 0, policy_version 254734 (0.0044) [2024-06-19 01:55:23,384][26367] Fps is (10 sec: 42582.5, 60 sec: 41776.6, 300 sec: 41876.4). Total num frames: 4173611008. Throughput: 0: 41921.4. Samples: 441232160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:23,384][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 01:55:25,953][26599] Updated weights for policy 0, policy_version 254744 (0.0039) [2024-06-19 01:55:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41932.0). Total num frames: 4173824000. Throughput: 0: 41637.3. Samples: 441473620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:28,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 01:55:30,284][26599] Updated weights for policy 0, policy_version 254754 (0.0044) [2024-06-19 01:55:33,380][26367] Fps is (10 sec: 42614.4, 60 sec: 42054.8, 300 sec: 41987.5). Total num frames: 4174036992. Throughput: 0: 41682.3. Samples: 441598180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:33,380][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 01:55:33,811][26599] Updated weights for policy 0, policy_version 254764 (0.0033) [2024-06-19 01:55:37,907][26599] Updated weights for policy 0, policy_version 254774 (0.0032) [2024-06-19 01:55:38,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4174233600. Throughput: 0: 41712.5. Samples: 441850860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:38,380][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 01:55:38,395][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000254775_4174233600.pth... [2024-06-19 01:55:38,471][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000254163_4164206592.pth [2024-06-19 01:55:41,655][26599] Updated weights for policy 0, policy_version 254784 (0.0038) [2024-06-19 01:55:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4174446592. Throughput: 0: 41684.0. Samples: 442099840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:43,380][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 01:55:45,887][26599] Updated weights for policy 0, policy_version 254794 (0.0040) [2024-06-19 01:55:48,384][26367] Fps is (10 sec: 44220.4, 60 sec: 41776.7, 300 sec: 42042.5). Total num frames: 4174675968. Throughput: 0: 41896.1. Samples: 442229620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:48,384][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 01:55:49,306][26599] Updated weights for policy 0, policy_version 254804 (0.0028) [2024-06-19 01:55:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4174856192. Throughput: 0: 41768.9. Samples: 442481160. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:53,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 01:55:53,553][26599] Updated weights for policy 0, policy_version 254814 (0.0037) [2024-06-19 01:55:57,175][26599] Updated weights for policy 0, policy_version 254824 (0.0033) [2024-06-19 01:55:58,384][26367] Fps is (10 sec: 39321.2, 60 sec: 41503.5, 300 sec: 41931.4). Total num frames: 4175069184. Throughput: 0: 41725.8. Samples: 442731820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 01:55:58,385][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 01:56:01,386][26599] Updated weights for policy 0, policy_version 254834 (0.0040) [2024-06-19 01:56:03,384][26367] Fps is (10 sec: 44220.4, 60 sec: 41776.6, 300 sec: 41931.9). Total num frames: 4175298560. Throughput: 0: 41823.8. Samples: 442861020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:03,384][26367] Avg episode reward: [(0, '0.318')] [2024-06-19 01:56:04,932][26599] Updated weights for policy 0, policy_version 254844 (0.0034) [2024-06-19 01:56:08,380][26367] Fps is (10 sec: 39336.6, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 4175462400. Throughput: 0: 41666.2. Samples: 443106980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:08,380][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 01:56:08,735][26579] Signal inference workers to stop experience collection... (6550 times) [2024-06-19 01:56:08,789][26599] InferenceWorker_p0-w0: stopping experience collection (6550 times) [2024-06-19 01:56:08,796][26579] Signal inference workers to resume experience collection... 
(6550 times) [2024-06-19 01:56:08,808][26599] InferenceWorker_p0-w0: resuming experience collection (6550 times) [2024-06-19 01:56:09,084][26599] Updated weights for policy 0, policy_version 254854 (0.0034) [2024-06-19 01:56:12,753][26599] Updated weights for policy 0, policy_version 254864 (0.0040) [2024-06-19 01:56:13,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4175724544. Throughput: 0: 41896.9. Samples: 443358980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:13,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 01:56:17,061][26599] Updated weights for policy 0, policy_version 254874 (0.0037) [2024-06-19 01:56:18,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 4175937536. Throughput: 0: 42138.2. Samples: 443494400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:18,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 01:56:20,770][26599] Updated weights for policy 0, policy_version 254884 (0.0045) [2024-06-19 01:56:23,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41508.7, 300 sec: 41876.4). Total num frames: 4176101376. Throughput: 0: 41917.7. Samples: 443737160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:23,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 01:56:24,874][26599] Updated weights for policy 0, policy_version 254894 (0.0035) [2024-06-19 01:56:28,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4176330752. Throughput: 0: 41952.0. Samples: 443987680. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:28,380][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 01:56:28,421][26599] Updated weights for policy 0, policy_version 254904 (0.0033) [2024-06-19 01:56:32,691][26599] Updated weights for policy 0, policy_version 254914 (0.0041) [2024-06-19 01:56:33,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4176560128. Throughput: 0: 42019.4. Samples: 444120340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:33,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 01:56:36,197][26599] Updated weights for policy 0, policy_version 254924 (0.0028) [2024-06-19 01:56:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4176740352. Throughput: 0: 41814.2. Samples: 444362800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:38,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 01:56:40,346][26599] Updated weights for policy 0, policy_version 254934 (0.0041) [2024-06-19 01:56:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4176953344. Throughput: 0: 42037.3. Samples: 444623340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:43,381][26367] Avg episode reward: [(0, '0.807')] [2024-06-19 01:56:44,034][26599] Updated weights for policy 0, policy_version 254944 (0.0030) [2024-06-19 01:56:48,084][26599] Updated weights for policy 0, policy_version 254954 (0.0032) [2024-06-19 01:56:48,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41508.6, 300 sec: 41931.9). Total num frames: 4177166336. Throughput: 0: 41901.6. Samples: 444746440. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:48,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 01:56:51,727][26599] Updated weights for policy 0, policy_version 254964 (0.0027) [2024-06-19 01:56:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4177379328. Throughput: 0: 41856.3. Samples: 444990520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:53,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 01:56:55,877][26599] Updated weights for policy 0, policy_version 254974 (0.0037) [2024-06-19 01:56:58,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41781.9, 300 sec: 41821.4). Total num frames: 4177575936. Throughput: 0: 41937.4. Samples: 445246160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:56:58,380][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 01:56:59,700][26599] Updated weights for policy 0, policy_version 254984 (0.0043) [2024-06-19 01:57:03,381][26367] Fps is (10 sec: 40954.8, 60 sec: 41507.8, 300 sec: 41931.7). Total num frames: 4177788928. Throughput: 0: 41653.0. Samples: 445368840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:57:03,382][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 01:57:03,879][26599] Updated weights for policy 0, policy_version 254994 (0.0043) [2024-06-19 01:57:07,391][26599] Updated weights for policy 0, policy_version 255004 (0.0028) [2024-06-19 01:57:08,381][26367] Fps is (10 sec: 42595.6, 60 sec: 42324.9, 300 sec: 41931.8). Total num frames: 4178001920. Throughput: 0: 41844.4. Samples: 445620180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 24.0) [2024-06-19 01:57:08,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 01:57:11,742][26599] Updated weights for policy 0, policy_version 255014 (0.0039) [2024-06-19 01:57:13,380][26367] Fps is (10 sec: 40965.5, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 4178198528. Throughput: 0: 41844.5. Samples: 445870680. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:13,380][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 01:57:14,501][26579] Signal inference workers to stop experience collection... (6600 times) [2024-06-19 01:57:14,501][26579] Signal inference workers to resume experience collection... (6600 times) [2024-06-19 01:57:14,544][26599] InferenceWorker_p0-w0: stopping experience collection (6600 times) [2024-06-19 01:57:14,545][26599] InferenceWorker_p0-w0: resuming experience collection (6600 times) [2024-06-19 01:57:14,979][26599] Updated weights for policy 0, policy_version 255024 (0.0028) [2024-06-19 01:57:18,380][26367] Fps is (10 sec: 40962.5, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 4178411520. Throughput: 0: 41725.4. Samples: 445997980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:18,380][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 01:57:19,467][26599] Updated weights for policy 0, policy_version 255034 (0.0035) [2024-06-19 01:57:23,150][26599] Updated weights for policy 0, policy_version 255044 (0.0041) [2024-06-19 01:57:23,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 41932.5). Total num frames: 4178640896. Throughput: 0: 41892.3. Samples: 446247960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:23,381][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 01:57:27,215][26599] Updated weights for policy 0, policy_version 255054 (0.0040) [2024-06-19 01:57:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4178837504. Throughput: 0: 41676.9. Samples: 446498800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:28,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 01:57:30,970][26599] Updated weights for policy 0, policy_version 255064 (0.0050) [2024-06-19 01:57:33,384][26367] Fps is (10 sec: 42583.1, 60 sec: 41776.6, 300 sec: 41987.0). Total num frames: 4179066880. Throughput: 0: 41714.4. 
Samples: 446623740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:33,384][26367] Avg episode reward: [(0, '0.745')] [2024-06-19 01:57:34,881][26599] Updated weights for policy 0, policy_version 255074 (0.0040) [2024-06-19 01:57:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4179279872. Throughput: 0: 42064.4. Samples: 446883420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:38,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 01:57:38,400][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000255083_4179279872.pth... [2024-06-19 01:57:38,469][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000254469_4169220096.pth [2024-06-19 01:57:38,619][26599] Updated weights for policy 0, policy_version 255084 (0.0033) [2024-06-19 01:57:42,797][26599] Updated weights for policy 0, policy_version 255094 (0.0024) [2024-06-19 01:57:43,380][26367] Fps is (10 sec: 40974.3, 60 sec: 42052.1, 300 sec: 41931.9). Total num frames: 4179476480. Throughput: 0: 41843.7. Samples: 447129140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:43,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 01:57:46,910][26599] Updated weights for policy 0, policy_version 255104 (0.0040) [2024-06-19 01:57:48,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4179705856. Throughput: 0: 41892.8. Samples: 447253960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:48,380][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 01:57:50,491][26599] Updated weights for policy 0, policy_version 255114 (0.0044) [2024-06-19 01:57:53,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4179886080. Throughput: 0: 42037.5. Samples: 447511840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:53,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 01:57:54,679][26599] Updated weights for policy 0, policy_version 255124 (0.0034) [2024-06-19 01:57:58,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 41876.9). Total num frames: 4180099072. Throughput: 0: 41896.9. Samples: 447756040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:57:58,380][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 01:57:58,497][26599] Updated weights for policy 0, policy_version 255134 (0.0039) [2024-06-19 01:58:02,451][26599] Updated weights for policy 0, policy_version 255144 (0.0037) [2024-06-19 01:58:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42053.2, 300 sec: 41987.5). Total num frames: 4180312064. Throughput: 0: 41861.7. Samples: 447881760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:58:03,381][26367] Avg episode reward: [(0, '0.402')] [2024-06-19 01:58:06,138][26599] Updated weights for policy 0, policy_version 255154 (0.0029) [2024-06-19 01:58:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.6, 300 sec: 41821.4). Total num frames: 4180508672. Throughput: 0: 41909.5. Samples: 448133880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:58:08,380][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 01:58:10,115][26599] Updated weights for policy 0, policy_version 255164 (0.0037) [2024-06-19 01:58:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41987.6). Total num frames: 4180738048. Throughput: 0: 41945.8. Samples: 448386360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:58:13,380][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 01:58:14,186][26599] Updated weights for policy 0, policy_version 255174 (0.0026) [2024-06-19 01:58:17,885][26599] Updated weights for policy 0, policy_version 255184 (0.0036) [2024-06-19 01:58:18,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41876.4). 
Total num frames: 4180934656. Throughput: 0: 42068.8. Samples: 448516680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 01:58:18,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 01:58:21,753][26599] Updated weights for policy 0, policy_version 255194 (0.0031) [2024-06-19 01:58:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4181147648. Throughput: 0: 41893.9. Samples: 448768640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:58:23,380][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 01:58:25,411][26599] Updated weights for policy 0, policy_version 255204 (0.0049) [2024-06-19 01:58:28,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42043.3). Total num frames: 4181393408. Throughput: 0: 42027.4. Samples: 449020360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:58:28,380][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 01:58:29,477][26599] Updated weights for policy 0, policy_version 255214 (0.0042) [2024-06-19 01:58:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42054.8, 300 sec: 41932.5). Total num frames: 4181590016. Throughput: 0: 42155.9. Samples: 449150980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:58:33,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 01:58:33,381][26599] Updated weights for policy 0, policy_version 255224 (0.0030) [2024-06-19 01:58:37,153][26599] Updated weights for policy 0, policy_version 255234 (0.0049) [2024-06-19 01:58:38,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4181770240. Throughput: 0: 41913.8. Samples: 449397960. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:58:38,380][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 01:58:41,648][26599] Updated weights for policy 0, policy_version 255244 (0.0053) [2024-06-19 01:58:43,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42052.2, 300 sec: 41987.4). 
Total num frames: 4181999616. Throughput: 0: 42129.5. Samples: 449651880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:58:43,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 01:58:44,913][26599] Updated weights for policy 0, policy_version 255254 (0.0037) [2024-06-19 01:58:48,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4182212608. Throughput: 0: 42155.6. Samples: 449778760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:58:48,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 01:58:49,252][26599] Updated weights for policy 0, policy_version 255264 (0.0034) [2024-06-19 01:58:53,005][26599] Updated weights for policy 0, policy_version 255274 (0.0044) [2024-06-19 01:58:53,384][26367] Fps is (10 sec: 40945.7, 60 sec: 42049.6, 300 sec: 41875.9). Total num frames: 4182409216. Throughput: 0: 41966.2. Samples: 450022520. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:58:53,385][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 01:58:57,110][26599] Updated weights for policy 0, policy_version 255284 (0.0033) [2024-06-19 01:58:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41932.5). Total num frames: 4182622208. Throughput: 0: 42110.2. Samples: 450281320. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:58:58,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 01:59:00,606][26599] Updated weights for policy 0, policy_version 255294 (0.0046) [2024-06-19 01:59:03,383][26367] Fps is (10 sec: 40964.2, 60 sec: 41777.3, 300 sec: 41820.5). Total num frames: 4182818816. Throughput: 0: 41934.8. Samples: 450403860. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:59:03,383][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 01:59:04,740][26599] Updated weights for policy 0, policy_version 255304 (0.0033) [2024-06-19 01:59:06,901][26579] Signal inference workers to stop experience collection... 
(6650 times) [2024-06-19 01:59:06,951][26579] Signal inference workers to resume experience collection... (6650 times) [2024-06-19 01:59:06,952][26599] InferenceWorker_p0-w0: stopping experience collection (6650 times) [2024-06-19 01:59:06,968][26599] InferenceWorker_p0-w0: resuming experience collection (6650 times) [2024-06-19 01:59:08,063][26599] Updated weights for policy 0, policy_version 255314 (0.0027) [2024-06-19 01:59:08,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 4183064576. Throughput: 0: 41980.0. Samples: 450657740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:59:08,380][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 01:59:12,545][26599] Updated weights for policy 0, policy_version 255324 (0.0047) [2024-06-19 01:59:13,380][26367] Fps is (10 sec: 44249.1, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4183261184. Throughput: 0: 41992.9. Samples: 450910040. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:59:13,380][26367] Avg episode reward: [(0, '0.331')] [2024-06-19 01:59:15,713][26599] Updated weights for policy 0, policy_version 255334 (0.0034) [2024-06-19 01:59:18,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4183441408. Throughput: 0: 41760.9. Samples: 451030220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:59:18,380][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 01:59:20,429][26599] Updated weights for policy 0, policy_version 255344 (0.0055) [2024-06-19 01:59:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4183687168. Throughput: 0: 42116.0. Samples: 451293180. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:59:23,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 01:59:23,909][26599] Updated weights for policy 0, policy_version 255354 (0.0030) [2024-06-19 01:59:28,255][26599] Updated weights for policy 0, policy_version 255364 (0.0037) [2024-06-19 01:59:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41506.0, 300 sec: 41932.4). Total num frames: 4183883776. Throughput: 0: 42095.7. Samples: 451546180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 22.0) [2024-06-19 01:59:28,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 01:59:31,777][26599] Updated weights for policy 0, policy_version 255374 (0.0038) [2024-06-19 01:59:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4184096768. Throughput: 0: 41810.3. Samples: 451660220. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 01:59:33,380][26367] Avg episode reward: [(0, '0.335')] [2024-06-19 01:59:35,903][26599] Updated weights for policy 0, policy_version 255384 (0.0031) [2024-06-19 01:59:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 41931.9). Total num frames: 4184326144. Throughput: 0: 42289.2. Samples: 451925380. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 01:59:38,381][26367] Avg episode reward: [(0, '0.317')] [2024-06-19 01:59:38,407][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000255391_4184326144.pth... [2024-06-19 01:59:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000254775_4174233600.pth [2024-06-19 01:59:39,439][26599] Updated weights for policy 0, policy_version 255394 (0.0027) [2024-06-19 01:59:43,384][26367] Fps is (10 sec: 40944.3, 60 sec: 41776.8, 300 sec: 41820.3). Total num frames: 4184506368. Throughput: 0: 42183.6. Samples: 452179740. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 01:59:43,385][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 01:59:43,807][26599] Updated weights for policy 0, policy_version 255404 (0.0029) [2024-06-19 01:59:47,360][26599] Updated weights for policy 0, policy_version 255414 (0.0033) [2024-06-19 01:59:48,381][26367] Fps is (10 sec: 40956.6, 60 sec: 42051.7, 300 sec: 41931.8). Total num frames: 4184735744. Throughput: 0: 42008.4. Samples: 452294160. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 01:59:48,382][26367] Avg episode reward: [(0, '0.299')] [2024-06-19 01:59:51,577][26599] Updated weights for policy 0, policy_version 255424 (0.0038) [2024-06-19 01:59:53,380][26367] Fps is (10 sec: 44253.8, 60 sec: 42328.0, 300 sec: 41931.9). Total num frames: 4184948736. Throughput: 0: 42155.6. Samples: 452554740. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 01:59:53,380][26367] Avg episode reward: [(0, '0.399')] [2024-06-19 01:59:54,907][26599] Updated weights for policy 0, policy_version 255434 (0.0038) [2024-06-19 01:59:58,380][26367] Fps is (10 sec: 40963.7, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4185145344. Throughput: 0: 42242.2. Samples: 452810940. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 01:59:58,381][26367] Avg episode reward: [(0, '0.394')] [2024-06-19 01:59:59,385][26599] Updated weights for policy 0, policy_version 255444 (0.0043) [2024-06-19 02:00:02,428][26599] Updated weights for policy 0, policy_version 255454 (0.0034) [2024-06-19 02:00:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42600.4, 300 sec: 41987.5). Total num frames: 4185374720. Throughput: 0: 42269.8. Samples: 452932360. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 02:00:03,380][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 02:00:07,194][26599] Updated weights for policy 0, policy_version 255464 (0.0039) [2024-06-19 02:00:08,384][26367] Fps is (10 sec: 42582.6, 60 sec: 41776.6, 300 sec: 41986.9). Total num frames: 4185571328. Throughput: 0: 42078.3. Samples: 453186860. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 02:00:08,384][26367] Avg episode reward: [(0, '0.839')] [2024-06-19 02:00:10,489][26599] Updated weights for policy 0, policy_version 255474 (0.0043) [2024-06-19 02:00:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4185767936. Throughput: 0: 41984.0. Samples: 453435460. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 02:00:13,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 02:00:14,953][26599] Updated weights for policy 0, policy_version 255484 (0.0039) [2024-06-19 02:00:18,235][26599] Updated weights for policy 0, policy_version 255494 (0.0032) [2024-06-19 02:00:18,380][26367] Fps is (10 sec: 44253.4, 60 sec: 42871.5, 300 sec: 42043.5). Total num frames: 4186013696. Throughput: 0: 42334.7. Samples: 453565280. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 02:00:18,380][26367] Avg episode reward: [(0, '0.350')] [2024-06-19 02:00:22,723][26599] Updated weights for policy 0, policy_version 255504 (0.0042) [2024-06-19 02:00:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4186193920. Throughput: 0: 42150.3. Samples: 453822140. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 02:00:23,380][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 02:00:25,901][26599] Updated weights for policy 0, policy_version 255514 (0.0038) [2024-06-19 02:00:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4186423296. Throughput: 0: 42005.3. Samples: 454069820. 
Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 02:00:28,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 02:00:30,601][26599] Updated weights for policy 0, policy_version 255524 (0.0043) [2024-06-19 02:00:32,102][26579] Signal inference workers to stop experience collection... (6700 times) [2024-06-19 02:00:32,102][26579] Signal inference workers to resume experience collection... (6700 times) [2024-06-19 02:00:32,130][26599] InferenceWorker_p0-w0: stopping experience collection (6700 times) [2024-06-19 02:00:32,130][26599] InferenceWorker_p0-w0: resuming experience collection (6700 times) [2024-06-19 02:00:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4186619904. Throughput: 0: 42230.2. Samples: 454194480. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 02:00:33,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 02:00:34,389][26599] Updated weights for policy 0, policy_version 255534 (0.0041) [2024-06-19 02:00:38,383][26367] Fps is (10 sec: 39309.1, 60 sec: 41504.0, 300 sec: 41931.5). Total num frames: 4186816512. Throughput: 0: 41792.6. Samples: 454435540. Policy #0 lag: (min: 1.0, avg: 12.0, max: 22.0) [2024-06-19 02:00:38,384][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 02:00:38,875][26599] Updated weights for policy 0, policy_version 255544 (0.0038) [2024-06-19 02:00:42,116][26599] Updated weights for policy 0, policy_version 255554 (0.0038) [2024-06-19 02:00:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42054.9, 300 sec: 41876.9). Total num frames: 4187029504. Throughput: 0: 41565.3. Samples: 454681380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:00:43,381][26367] Avg episode reward: [(0, '0.390')] [2024-06-19 02:00:46,976][26599] Updated weights for policy 0, policy_version 255564 (0.0033) [2024-06-19 02:00:48,380][26367] Fps is (10 sec: 42611.6, 60 sec: 41779.8, 300 sec: 41987.5). Total num frames: 4187242496. Throughput: 0: 41704.3. 
Samples: 454809060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:00:48,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 02:00:49,920][26599] Updated weights for policy 0, policy_version 255574 (0.0038) [2024-06-19 02:00:53,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41988.0). Total num frames: 4187455488. Throughput: 0: 41577.1. Samples: 455057680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:00:53,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 02:00:54,726][26599] Updated weights for policy 0, policy_version 255584 (0.0034) [2024-06-19 02:00:57,462][26599] Updated weights for policy 0, policy_version 255594 (0.0035) [2024-06-19 02:00:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 41932.5). Total num frames: 4187668480. Throughput: 0: 41565.4. Samples: 455305900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:00:58,380][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 02:01:02,399][26599] Updated weights for policy 0, policy_version 255604 (0.0027) [2024-06-19 02:01:03,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 4187848704. Throughput: 0: 41621.3. Samples: 455438240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:03,381][26367] Avg episode reward: [(0, '0.347')] [2024-06-19 02:01:05,211][26599] Updated weights for policy 0, policy_version 255614 (0.0040) [2024-06-19 02:01:08,380][26367] Fps is (10 sec: 40959.2, 60 sec: 41781.7, 300 sec: 41876.4). Total num frames: 4188078080. Throughput: 0: 41416.3. Samples: 455685880. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:08,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 02:01:09,951][26599] Updated weights for policy 0, policy_version 255624 (0.0032) [2024-06-19 02:01:13,327][26599] Updated weights for policy 0, policy_version 255634 (0.0032) [2024-06-19 02:01:13,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4188307456. Throughput: 0: 41506.1. Samples: 455937600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:13,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 02:01:18,115][26599] Updated weights for policy 0, policy_version 255644 (0.0039) [2024-06-19 02:01:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41233.0, 300 sec: 41987.5). Total num frames: 4188487680. Throughput: 0: 41591.1. Samples: 456066080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:18,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 02:01:21,000][26599] Updated weights for policy 0, policy_version 255654 (0.0046) [2024-06-19 02:01:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4188717056. Throughput: 0: 41854.0. Samples: 456318840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:23,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 02:01:25,646][26599] Updated weights for policy 0, policy_version 255664 (0.0040) [2024-06-19 02:01:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4188913664. Throughput: 0: 41979.1. Samples: 456570440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:28,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 02:01:28,984][26599] Updated weights for policy 0, policy_version 255674 (0.0035) [2024-06-19 02:01:33,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4189110272. Throughput: 0: 41801.8. Samples: 456690140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:33,380][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 02:01:33,411][26599] Updated weights for policy 0, policy_version 255684 (0.0034) [2024-06-19 02:01:36,821][26599] Updated weights for policy 0, policy_version 255694 (0.0032) [2024-06-19 02:01:38,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42327.6, 300 sec: 42043.0). Total num frames: 4189356032. Throughput: 0: 41777.5. Samples: 456937660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:38,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 02:01:38,403][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000255698_4189356032.pth... [2024-06-19 02:01:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000255083_4179279872.pth [2024-06-19 02:01:41,374][26599] Updated weights for policy 0, policy_version 255704 (0.0046) [2024-06-19 02:01:42,987][26579] Signal inference workers to stop experience collection... (6750 times) [2024-06-19 02:01:42,987][26579] Signal inference workers to resume experience collection... (6750 times) [2024-06-19 02:01:43,023][26599] InferenceWorker_p0-w0: stopping experience collection (6750 times) [2024-06-19 02:01:43,023][26599] InferenceWorker_p0-w0: resuming experience collection (6750 times) [2024-06-19 02:01:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4189536256. Throughput: 0: 42095.1. Samples: 457200180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:43,380][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 02:01:44,878][26599] Updated weights for policy 0, policy_version 255714 (0.0038) [2024-06-19 02:01:48,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4189732864. Throughput: 0: 41720.0. Samples: 457315640. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 25.0) [2024-06-19 02:01:48,380][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 02:01:49,326][26599] Updated weights for policy 0, policy_version 255724 (0.0034) [2024-06-19 02:01:52,594][26599] Updated weights for policy 0, policy_version 255734 (0.0034) [2024-06-19 02:01:53,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42325.5, 300 sec: 42098.5). Total num frames: 4189995008. Throughput: 0: 41946.8. Samples: 457573480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:01:53,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 02:01:57,181][26599] Updated weights for policy 0, policy_version 255744 (0.0035) [2024-06-19 02:01:58,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41932.1). Total num frames: 4190158848. Throughput: 0: 41966.2. Samples: 457826080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:01:58,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 02:02:00,418][26599] Updated weights for policy 0, policy_version 255754 (0.0027) [2024-06-19 02:02:03,380][26367] Fps is (10 sec: 37682.8, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 4190371840. Throughput: 0: 41649.7. Samples: 457940320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:03,381][26367] Avg episode reward: [(0, '0.384')] [2024-06-19 02:02:04,954][26599] Updated weights for policy 0, policy_version 255764 (0.0030) [2024-06-19 02:02:08,078][26599] Updated weights for policy 0, policy_version 255774 (0.0040) [2024-06-19 02:02:08,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 4190601216. Throughput: 0: 41826.8. Samples: 458201200. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:08,385][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 02:02:12,693][26599] Updated weights for policy 0, policy_version 255784 (0.0037) [2024-06-19 02:02:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 4190781440. Throughput: 0: 41822.8. Samples: 458452460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:13,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 02:02:15,921][26599] Updated weights for policy 0, policy_version 255794 (0.0043) [2024-06-19 02:02:18,384][26367] Fps is (10 sec: 40960.1, 60 sec: 42049.6, 300 sec: 41931.4). Total num frames: 4191010816. Throughput: 0: 41833.8. Samples: 458572820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:18,385][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 02:02:20,506][26599] Updated weights for policy 0, policy_version 255804 (0.0037) [2024-06-19 02:02:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4191207424. Throughput: 0: 42024.4. Samples: 458828760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:23,381][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 02:02:23,750][26599] Updated weights for policy 0, policy_version 255814 (0.0033) [2024-06-19 02:02:28,089][26599] Updated weights for policy 0, policy_version 255824 (0.0035) [2024-06-19 02:02:28,380][26367] Fps is (10 sec: 40975.6, 60 sec: 41779.3, 300 sec: 41876.9). Total num frames: 4191420416. Throughput: 0: 41614.7. Samples: 459072840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:28,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 02:02:32,011][26599] Updated weights for policy 0, policy_version 255834 (0.0032) [2024-06-19 02:02:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4191633408. Throughput: 0: 41836.4. Samples: 459198280. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:33,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 02:02:35,996][26599] Updated weights for policy 0, policy_version 255844 (0.0040) [2024-06-19 02:02:38,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 4191830016. Throughput: 0: 41867.0. Samples: 459457500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:38,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 02:02:39,706][26599] Updated weights for policy 0, policy_version 255854 (0.0037) [2024-06-19 02:02:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 4192043008. Throughput: 0: 41920.0. Samples: 459712480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:43,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 02:02:43,747][26599] Updated weights for policy 0, policy_version 255864 (0.0028) [2024-06-19 02:02:47,354][26599] Updated weights for policy 0, policy_version 255874 (0.0043) [2024-06-19 02:02:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4192256000. Throughput: 0: 42188.0. Samples: 459838780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:48,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 02:02:51,511][26599] Updated weights for policy 0, policy_version 255884 (0.0040) [2024-06-19 02:02:53,384][26367] Fps is (10 sec: 42582.9, 60 sec: 41230.5, 300 sec: 41931.4). Total num frames: 4192468992. Throughput: 0: 41966.3. Samples: 460089680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:53,385][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 02:02:55,019][26599] Updated weights for policy 0, policy_version 255894 (0.0035) [2024-06-19 02:02:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4192698368. Throughput: 0: 42031.9. Samples: 460343900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:02:58,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 02:02:59,288][26599] Updated weights for policy 0, policy_version 255904 (0.0042) [2024-06-19 02:03:02,924][26599] Updated weights for policy 0, policy_version 255914 (0.0030) [2024-06-19 02:03:03,380][26367] Fps is (10 sec: 42614.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4192894976. Throughput: 0: 42195.1. Samples: 460471440. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:03,380][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 02:03:06,956][26599] Updated weights for policy 0, policy_version 255924 (0.0045) [2024-06-19 02:03:08,384][26367] Fps is (10 sec: 37669.5, 60 sec: 41233.1, 300 sec: 41820.3). Total num frames: 4193075200. Throughput: 0: 42076.2. Samples: 460722340. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:08,384][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 02:03:10,697][26599] Updated weights for policy 0, policy_version 255934 (0.0032) [2024-06-19 02:03:13,227][26579] Signal inference workers to stop experience collection... (6800 times) [2024-06-19 02:03:13,227][26579] Signal inference workers to resume experience collection... (6800 times) [2024-06-19 02:03:13,248][26599] InferenceWorker_p0-w0: stopping experience collection (6800 times) [2024-06-19 02:03:13,248][26599] InferenceWorker_p0-w0: resuming experience collection (6800 times) [2024-06-19 02:03:13,382][26367] Fps is (10 sec: 44227.6, 60 sec: 42596.9, 300 sec: 42042.7). Total num frames: 4193337344. Throughput: 0: 42190.5. Samples: 460971500. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:13,383][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 02:03:14,792][26599] Updated weights for policy 0, policy_version 255944 (0.0032) [2024-06-19 02:03:18,380][26367] Fps is (10 sec: 45892.0, 60 sec: 42054.9, 300 sec: 41987.5). Total num frames: 4193533952. Throughput: 0: 42275.6. 
Samples: 461100680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:18,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 02:03:18,548][26599] Updated weights for policy 0, policy_version 255954 (0.0037) [2024-06-19 02:03:22,333][26599] Updated weights for policy 0, policy_version 255964 (0.0035) [2024-06-19 02:03:23,380][26367] Fps is (10 sec: 39329.3, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 4193730560. Throughput: 0: 42125.3. Samples: 461353140. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:23,381][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 02:03:26,637][26599] Updated weights for policy 0, policy_version 255974 (0.0029) [2024-06-19 02:03:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4193959936. Throughput: 0: 42069.9. Samples: 461605620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:28,380][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 02:03:30,826][26599] Updated weights for policy 0, policy_version 255984 (0.0043) [2024-06-19 02:03:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 41987.4). Total num frames: 4194156544. Throughput: 0: 42044.3. Samples: 461730780. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:33,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 02:03:34,504][26599] Updated weights for policy 0, policy_version 255994 (0.0042) [2024-06-19 02:03:38,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4194353152. Throughput: 0: 42018.6. Samples: 461980360. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:38,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 02:03:38,501][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000256004_4194369536.pth... 
[2024-06-19 02:03:38,506][26599] Updated weights for policy 0, policy_version 256004 (0.0036) [2024-06-19 02:03:38,545][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000255391_4184326144.pth [2024-06-19 02:03:42,130][26599] Updated weights for policy 0, policy_version 256014 (0.0034) [2024-06-19 02:03:43,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4194582528. Throughput: 0: 41933.4. Samples: 462230900. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:43,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 02:03:46,056][26599] Updated weights for policy 0, policy_version 256024 (0.0030) [2024-06-19 02:03:48,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 41988.0). Total num frames: 4194795520. Throughput: 0: 41983.4. Samples: 462360700. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:48,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 02:03:49,877][26599] Updated weights for policy 0, policy_version 256034 (0.0030) [2024-06-19 02:03:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42327.9, 300 sec: 41987.5). Total num frames: 4195008512. Throughput: 0: 42000.8. Samples: 462612220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:53,381][26367] Avg episode reward: [(0, '0.328')] [2024-06-19 02:03:53,630][26599] Updated weights for policy 0, policy_version 256044 (0.0038) [2024-06-19 02:03:57,634][26599] Updated weights for policy 0, policy_version 256054 (0.0023) [2024-06-19 02:03:58,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 41987.9). Total num frames: 4195205120. Throughput: 0: 42058.4. Samples: 462864040. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:03:58,380][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 02:04:01,262][26599] Updated weights for policy 0, policy_version 256064 (0.0041) [2024-06-19 02:04:03,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 4195401728. Throughput: 0: 41989.8. Samples: 462990220. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:04:03,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 02:04:05,290][26599] Updated weights for policy 0, policy_version 256074 (0.0040) [2024-06-19 02:04:08,384][26367] Fps is (10 sec: 44220.2, 60 sec: 42871.5, 300 sec: 41986.9). Total num frames: 4195647488. Throughput: 0: 41933.5. Samples: 463240300. Policy #0 lag: (min: 1.0, avg: 9.5, max: 22.0) [2024-06-19 02:04:08,384][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 02:04:08,923][26599] Updated weights for policy 0, policy_version 256084 (0.0029) [2024-06-19 02:04:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41507.5, 300 sec: 41987.5). Total num frames: 4195827712. Throughput: 0: 41979.5. Samples: 463494700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:13,380][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 02:04:13,503][26599] Updated weights for policy 0, policy_version 256094 (0.0031) [2024-06-19 02:04:16,755][26599] Updated weights for policy 0, policy_version 256104 (0.0035) [2024-06-19 02:04:18,380][26367] Fps is (10 sec: 37697.4, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 4196024320. Throughput: 0: 42029.1. Samples: 463622080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:18,380][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 02:04:21,148][26599] Updated weights for policy 0, policy_version 256114 (0.0033) [2024-06-19 02:04:23,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4196270080. Throughput: 0: 42074.2. Samples: 463873700. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:23,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 02:04:24,606][26599] Updated weights for policy 0, policy_version 256124 (0.0038) [2024-06-19 02:04:28,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4196466688. Throughput: 0: 42187.5. Samples: 464129340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:28,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 02:04:28,845][26599] Updated weights for policy 0, policy_version 256134 (0.0029) [2024-06-19 02:04:30,194][26579] Signal inference workers to stop experience collection... (6850 times) [2024-06-19 02:04:30,194][26579] Signal inference workers to resume experience collection... (6850 times) [2024-06-19 02:04:30,228][26599] InferenceWorker_p0-w0: stopping experience collection (6850 times) [2024-06-19 02:04:30,228][26599] InferenceWorker_p0-w0: resuming experience collection (6850 times) [2024-06-19 02:04:32,888][26599] Updated weights for policy 0, policy_version 256144 (0.0026) [2024-06-19 02:04:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4196663296. Throughput: 0: 41890.8. Samples: 464245780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:33,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 02:04:36,849][26599] Updated weights for policy 0, policy_version 256154 (0.0046) [2024-06-19 02:04:38,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42099.1). Total num frames: 4196925440. Throughput: 0: 41996.0. Samples: 464502040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:38,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 02:04:40,469][26599] Updated weights for policy 0, policy_version 256164 (0.0043) [2024-06-19 02:04:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 41876.5). Total num frames: 4197089280. Throughput: 0: 42075.9. 
Samples: 464757460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:43,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 02:04:44,312][26599] Updated weights for policy 0, policy_version 256174 (0.0040) [2024-06-19 02:04:48,357][26599] Updated weights for policy 0, policy_version 256184 (0.0031) [2024-06-19 02:04:48,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4197318656. Throughput: 0: 41924.8. Samples: 464876840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:48,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 02:04:52,397][26599] Updated weights for policy 0, policy_version 256194 (0.0044) [2024-06-19 02:04:53,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4197515264. Throughput: 0: 42112.4. Samples: 465135200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:53,380][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 02:04:56,221][26599] Updated weights for policy 0, policy_version 256204 (0.0042) [2024-06-19 02:04:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4197728256. Throughput: 0: 41972.3. Samples: 465383460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:04:58,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 02:05:00,749][26599] Updated weights for policy 0, policy_version 256214 (0.0042) [2024-06-19 02:05:03,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 41932.4). Total num frames: 4197941248. Throughput: 0: 41936.3. Samples: 465509220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:05:03,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 02:05:03,909][26599] Updated weights for policy 0, policy_version 256224 (0.0029) [2024-06-19 02:05:08,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41235.6, 300 sec: 41876.4). Total num frames: 4198121472. Throughput: 0: 41928.9. 
Samples: 465760500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:05:08,381][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 02:05:08,541][26599] Updated weights for policy 0, policy_version 256234 (0.0038) [2024-06-19 02:05:11,686][26599] Updated weights for policy 0, policy_version 256244 (0.0038) [2024-06-19 02:05:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 4198350848. Throughput: 0: 41803.6. Samples: 466010500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:05:13,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 02:05:16,255][26599] Updated weights for policy 0, policy_version 256254 (0.0041) [2024-06-19 02:05:18,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4198580224. Throughput: 0: 42040.0. Samples: 466137580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:05:18,380][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 02:05:19,258][26599] Updated weights for policy 0, policy_version 256264 (0.0037) [2024-06-19 02:05:23,384][26367] Fps is (10 sec: 40945.1, 60 sec: 41503.6, 300 sec: 41820.3). Total num frames: 4198760448. Throughput: 0: 41977.0. Samples: 466391160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 02:05:23,385][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 02:05:23,858][26599] Updated weights for policy 0, policy_version 256274 (0.0051) [2024-06-19 02:05:26,926][26599] Updated weights for policy 0, policy_version 256284 (0.0028) [2024-06-19 02:05:28,380][26367] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4198973440. Throughput: 0: 41842.2. Samples: 466640360. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:05:28,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 02:05:31,657][26599] Updated weights for policy 0, policy_version 256294 (0.0038) [2024-06-19 02:05:33,380][26367] Fps is (10 sec: 44253.2, 60 sec: 42325.3, 300 sec: 41987.9). Total num frames: 4199202816. Throughput: 0: 42055.7. Samples: 466769340. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:05:33,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 02:05:34,883][26599] Updated weights for policy 0, policy_version 256304 (0.0037) [2024-06-19 02:05:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 4199399424. Throughput: 0: 42026.1. Samples: 467026380. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:05:38,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 02:05:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000256311_4199399424.pth... [2024-06-19 02:05:38,488][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000255698_4189356032.pth [2024-06-19 02:05:39,269][26599] Updated weights for policy 0, policy_version 256314 (0.0034) [2024-06-19 02:05:42,378][26579] Signal inference workers to stop experience collection... (6900 times) [2024-06-19 02:05:42,381][26579] Signal inference workers to resume experience collection... (6900 times) [2024-06-19 02:05:42,403][26599] InferenceWorker_p0-w0: stopping experience collection (6900 times) [2024-06-19 02:05:42,403][26599] InferenceWorker_p0-w0: resuming experience collection (6900 times) [2024-06-19 02:05:42,701][26599] Updated weights for policy 0, policy_version 256324 (0.0043) [2024-06-19 02:05:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4199628800. Throughput: 0: 41781.8. Samples: 467263640. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:05:43,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 02:05:46,978][26599] Updated weights for policy 0, policy_version 256334 (0.0044) [2024-06-19 02:05:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4199841792. Throughput: 0: 42060.0. Samples: 467401920. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:05:48,381][26367] Avg episode reward: [(0, '0.488')] [2024-06-19 02:05:50,434][26599] Updated weights for policy 0, policy_version 256344 (0.0040) [2024-06-19 02:05:53,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 4200005632. Throughput: 0: 41977.8. Samples: 467649500. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:05:53,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 02:05:54,651][26599] Updated weights for policy 0, policy_version 256354 (0.0031) [2024-06-19 02:05:58,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 4200251392. Throughput: 0: 41919.7. Samples: 467896880. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:05:58,380][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 02:05:58,454][26599] Updated weights for policy 0, policy_version 256364 (0.0024) [2024-06-19 02:06:02,581][26599] Updated weights for policy 0, policy_version 256374 (0.0047) [2024-06-19 02:06:03,384][26367] Fps is (10 sec: 45858.0, 60 sec: 42049.7, 300 sec: 41987.0). Total num frames: 4200464384. Throughput: 0: 42054.7. Samples: 468030200. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:06:03,385][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 02:06:06,117][26599] Updated weights for policy 0, policy_version 256384 (0.0034) [2024-06-19 02:06:08,384][26367] Fps is (10 sec: 39306.6, 60 sec: 42049.6, 300 sec: 41820.3). Total num frames: 4200644608. Throughput: 0: 41705.7. Samples: 468267920. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:06:08,385][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 02:06:10,637][26599] Updated weights for policy 0, policy_version 256394 (0.0033) [2024-06-19 02:06:13,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4200890368. Throughput: 0: 41719.1. Samples: 468517720. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:06:13,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 02:06:14,153][26599] Updated weights for policy 0, policy_version 256404 (0.0042) [2024-06-19 02:06:18,380][26367] Fps is (10 sec: 42614.6, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4201070592. Throughput: 0: 41747.1. Samples: 468647960. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:06:18,380][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 02:06:18,386][26599] Updated weights for policy 0, policy_version 256414 (0.0034) [2024-06-19 02:06:21,833][26599] Updated weights for policy 0, policy_version 256424 (0.0047) [2024-06-19 02:06:23,384][26367] Fps is (10 sec: 37669.3, 60 sec: 41779.2, 300 sec: 41875.9). Total num frames: 4201267200. Throughput: 0: 41470.4. Samples: 468892700. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:06:23,384][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 02:06:26,161][26599] Updated weights for policy 0, policy_version 256434 (0.0036) [2024-06-19 02:06:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4201512960. Throughput: 0: 42034.3. Samples: 469155180. Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:06:28,380][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 02:06:29,814][26599] Updated weights for policy 0, policy_version 256444 (0.0033) [2024-06-19 02:06:33,380][26367] Fps is (10 sec: 40975.1, 60 sec: 41233.0, 300 sec: 41765.3). Total num frames: 4201676800. Throughput: 0: 41720.9. Samples: 469279360. 
Policy #0 lag: (min: 1.0, avg: 9.0, max: 20.0) [2024-06-19 02:06:33,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 02:06:34,173][26599] Updated weights for policy 0, policy_version 256454 (0.0028) [2024-06-19 02:06:37,524][26599] Updated weights for policy 0, policy_version 256464 (0.0032) [2024-06-19 02:06:38,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4201922560. Throughput: 0: 41755.9. Samples: 469528520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:06:38,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 02:06:41,992][26599] Updated weights for policy 0, policy_version 256474 (0.0040) [2024-06-19 02:06:43,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 4202151936. Throughput: 0: 41943.0. Samples: 469784320. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:06:43,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 02:06:45,266][26599] Updated weights for policy 0, policy_version 256484 (0.0023) [2024-06-19 02:06:48,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 4202315776. Throughput: 0: 41690.5. Samples: 469906120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:06:48,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 02:06:49,747][26599] Updated weights for policy 0, policy_version 256494 (0.0040) [2024-06-19 02:06:53,075][26599] Updated weights for policy 0, policy_version 256504 (0.0041) [2024-06-19 02:06:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 4202561536. Throughput: 0: 41827.3. Samples: 470150000. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:06:53,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 02:06:57,572][26599] Updated weights for policy 0, policy_version 256514 (0.0034) [2024-06-19 02:06:58,380][26367] Fps is (10 sec: 42597.6, 60 sec: 41505.9, 300 sec: 41931.9). Total num frames: 4202741760. Throughput: 0: 42220.2. Samples: 470417640. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:06:58,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 02:07:01,674][26599] Updated weights for policy 0, policy_version 256524 (0.0028) [2024-06-19 02:07:03,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41508.7, 300 sec: 41876.9). Total num frames: 4202954752. Throughput: 0: 41947.1. Samples: 470535580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:07:03,380][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 02:07:05,235][26599] Updated weights for policy 0, policy_version 256534 (0.0028) [2024-06-19 02:07:08,382][26367] Fps is (10 sec: 44230.8, 60 sec: 42326.8, 300 sec: 42042.8). Total num frames: 4203184128. Throughput: 0: 42029.1. Samples: 470783920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:07:08,382][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 02:07:09,332][26599] Updated weights for policy 0, policy_version 256544 (0.0048) [2024-06-19 02:07:12,792][26579] Signal inference workers to stop experience collection... (6950 times) [2024-06-19 02:07:12,792][26579] Signal inference workers to resume experience collection... (6950 times) [2024-06-19 02:07:12,837][26599] InferenceWorker_p0-w0: stopping experience collection (6950 times) [2024-06-19 02:07:12,838][26599] InferenceWorker_p0-w0: resuming experience collection (6950 times) [2024-06-19 02:07:12,927][26599] Updated weights for policy 0, policy_version 256554 (0.0037) [2024-06-19 02:07:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41932.5). Total num frames: 4203380736. Throughput: 0: 41944.4. 
Samples: 471042680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:07:13,380][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 02:07:16,936][26599] Updated weights for policy 0, policy_version 256564 (0.0043) [2024-06-19 02:07:18,380][26367] Fps is (10 sec: 40966.1, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 4203593728. Throughput: 0: 42011.0. Samples: 471169860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:07:18,381][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 02:07:20,902][26599] Updated weights for policy 0, policy_version 256574 (0.0037) [2024-06-19 02:07:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42601.0, 300 sec: 42043.0). Total num frames: 4203823104. Throughput: 0: 42022.3. Samples: 471419520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:07:23,381][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 02:07:24,558][26599] Updated weights for policy 0, policy_version 256584 (0.0047) [2024-06-19 02:07:28,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4204019712. Throughput: 0: 41997.8. Samples: 471674220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:07:28,380][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 02:07:28,682][26599] Updated weights for policy 0, policy_version 256594 (0.0026) [2024-06-19 02:07:32,481][26599] Updated weights for policy 0, policy_version 256604 (0.0038) [2024-06-19 02:07:33,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4204216320. Throughput: 0: 42058.3. Samples: 471798740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:07:33,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 02:07:36,223][26599] Updated weights for policy 0, policy_version 256614 (0.0029) [2024-06-19 02:07:38,384][26367] Fps is (10 sec: 42582.2, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 4204445696. Throughput: 0: 42364.1. 
Samples: 472056540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:07:38,385][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 02:07:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000256619_4204445696.pth... [2024-06-19 02:07:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000256004_4194369536.pth [2024-06-19 02:07:40,115][26599] Updated weights for policy 0, policy_version 256624 (0.0028) [2024-06-19 02:07:43,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4204658688. Throughput: 0: 41983.4. Samples: 472306880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 02:07:43,380][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 02:07:43,865][26599] Updated weights for policy 0, policy_version 256634 (0.0040) [2024-06-19 02:07:47,700][26599] Updated weights for policy 0, policy_version 256644 (0.0045) [2024-06-19 02:07:48,380][26367] Fps is (10 sec: 40975.7, 60 sec: 42325.4, 300 sec: 41988.0). Total num frames: 4204855296. Throughput: 0: 42227.1. Samples: 472435800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:07:48,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 02:07:52,187][26599] Updated weights for policy 0, policy_version 256654 (0.0037) [2024-06-19 02:07:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 4205084672. Throughput: 0: 42347.8. Samples: 472689500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:07:53,380][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 02:07:55,293][26599] Updated weights for policy 0, policy_version 256664 (0.0037) [2024-06-19 02:07:58,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.5, 300 sec: 41987.5). Total num frames: 4205281280. Throughput: 0: 42090.6. Samples: 472936760. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:07:58,381][26367] Avg episode reward: [(0, '0.278')] [2024-06-19 02:07:59,841][26599] Updated weights for policy 0, policy_version 256674 (0.0035) [2024-06-19 02:08:03,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42099.1). Total num frames: 4205494272. Throughput: 0: 42032.4. Samples: 473061320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:03,381][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 02:08:03,492][26599] Updated weights for policy 0, policy_version 256684 (0.0040) [2024-06-19 02:08:07,590][26599] Updated weights for policy 0, policy_version 256694 (0.0040) [2024-06-19 02:08:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41780.3, 300 sec: 41876.7). Total num frames: 4205690880. Throughput: 0: 42244.9. Samples: 473320540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:08,381][26367] Avg episode reward: [(0, '0.392')] [2024-06-19 02:08:11,218][26599] Updated weights for policy 0, policy_version 256704 (0.0043) [2024-06-19 02:08:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4205920256. Throughput: 0: 42059.5. Samples: 473566900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:13,381][26367] Avg episode reward: [(0, '0.333')] [2024-06-19 02:08:15,479][26599] Updated weights for policy 0, policy_version 256714 (0.0043) [2024-06-19 02:08:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4206100480. Throughput: 0: 42032.4. Samples: 473690200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:18,380][26367] Avg episode reward: [(0, '0.838')] [2024-06-19 02:08:19,265][26599] Updated weights for policy 0, policy_version 256724 (0.0035) [2024-06-19 02:08:23,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4206313472. Throughput: 0: 41779.1. Samples: 473936440. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:23,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 02:08:23,421][26599] Updated weights for policy 0, policy_version 256734 (0.0037) [2024-06-19 02:08:27,117][26599] Updated weights for policy 0, policy_version 256744 (0.0040) [2024-06-19 02:08:28,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4206542848. Throughput: 0: 41855.1. Samples: 474190360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:28,380][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 02:08:31,101][26599] Updated weights for policy 0, policy_version 256754 (0.0037) [2024-06-19 02:08:33,381][26367] Fps is (10 sec: 42594.7, 60 sec: 42051.6, 300 sec: 41987.4). Total num frames: 4206739456. Throughput: 0: 41832.0. Samples: 474318280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:33,382][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 02:08:34,943][26599] Updated weights for policy 0, policy_version 256764 (0.0035) [2024-06-19 02:08:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41781.8, 300 sec: 41931.9). Total num frames: 4206952448. Throughput: 0: 41688.4. Samples: 474565480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:38,380][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 02:08:38,647][26599] Updated weights for policy 0, policy_version 256774 (0.0034) [2024-06-19 02:08:42,820][26599] Updated weights for policy 0, policy_version 256784 (0.0039) [2024-06-19 02:08:43,380][26367] Fps is (10 sec: 44240.2, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 4207181824. Throughput: 0: 41863.1. Samples: 474820600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:43,381][26367] Avg episode reward: [(0, '0.775')] [2024-06-19 02:08:46,490][26599] Updated weights for policy 0, policy_version 256794 (0.0031) [2024-06-19 02:08:48,384][26367] Fps is (10 sec: 40945.0, 60 sec: 41776.6, 300 sec: 41875.9). 
Total num frames: 4207362048. Throughput: 0: 41837.6. Samples: 474944160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:48,384][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 02:08:50,663][26579] Signal inference workers to stop experience collection... (7000 times) [2024-06-19 02:08:50,712][26599] InferenceWorker_p0-w0: stopping experience collection (7000 times) [2024-06-19 02:08:50,773][26579] Signal inference workers to resume experience collection... (7000 times) [2024-06-19 02:08:50,774][26599] InferenceWorker_p0-w0: resuming experience collection (7000 times) [2024-06-19 02:08:50,777][26599] Updated weights for policy 0, policy_version 256804 (0.0034) [2024-06-19 02:08:53,384][26367] Fps is (10 sec: 40945.4, 60 sec: 41776.6, 300 sec: 41986.9). Total num frames: 4207591424. Throughput: 0: 41598.4. Samples: 475192620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 02:08:53,384][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 02:08:54,665][26599] Updated weights for policy 0, policy_version 256814 (0.0040) [2024-06-19 02:08:58,380][26367] Fps is (10 sec: 40975.3, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4207771648. Throughput: 0: 41749.0. Samples: 475445600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:08:58,380][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 02:08:58,632][26599] Updated weights for policy 0, policy_version 256824 (0.0037) [2024-06-19 02:09:02,461][26599] Updated weights for policy 0, policy_version 256834 (0.0044) [2024-06-19 02:09:03,380][26367] Fps is (10 sec: 39336.1, 60 sec: 41506.2, 300 sec: 41821.4). Total num frames: 4207984640. Throughput: 0: 41614.2. Samples: 475562840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:03,380][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 02:09:06,499][26599] Updated weights for policy 0, policy_version 256844 (0.0038) [2024-06-19 02:09:08,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4208230400. Throughput: 0: 41887.1. Samples: 475821360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:08,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 02:09:10,081][26599] Updated weights for policy 0, policy_version 256854 (0.0037) [2024-06-19 02:09:13,386][26367] Fps is (10 sec: 42575.8, 60 sec: 41502.5, 300 sec: 41986.7). Total num frames: 4208410624. Throughput: 0: 42011.4. Samples: 476081100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:13,386][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 02:09:14,184][26599] Updated weights for policy 0, policy_version 256864 (0.0042) [2024-06-19 02:09:17,783][26599] Updated weights for policy 0, policy_version 256874 (0.0035) [2024-06-19 02:09:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4208640000. Throughput: 0: 41719.9. Samples: 476195640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:18,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 02:09:21,986][26599] Updated weights for policy 0, policy_version 256884 (0.0043) [2024-06-19 02:09:23,380][26367] Fps is (10 sec: 45899.3, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 4208869376. Throughput: 0: 42032.0. Samples: 476456920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:23,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 02:09:25,402][26599] Updated weights for policy 0, policy_version 256894 (0.0026) [2024-06-19 02:09:28,380][26367] Fps is (10 sec: 37683.4, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 4209016832. Throughput: 0: 42000.5. Samples: 476710620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:28,381][26367] Avg episode reward: [(0, '0.324')] [2024-06-19 02:09:29,988][26599] Updated weights for policy 0, policy_version 256904 (0.0031) [2024-06-19 02:09:33,273][26599] Updated weights for policy 0, policy_version 256914 (0.0028) [2024-06-19 02:09:33,382][26367] Fps is (10 sec: 40953.8, 60 sec: 42324.8, 300 sec: 41876.2). Total num frames: 4209278976. Throughput: 0: 41903.7. Samples: 476829740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:33,382][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 02:09:37,644][26599] Updated weights for policy 0, policy_version 256924 (0.0045) [2024-06-19 02:09:38,383][26367] Fps is (10 sec: 47501.2, 60 sec: 42323.5, 300 sec: 42042.6). Total num frames: 4209491968. Throughput: 0: 42193.0. Samples: 477091260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:38,383][26367] Avg episode reward: [(0, '0.261')] [2024-06-19 02:09:38,413][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000256927_4209491968.pth... [2024-06-19 02:09:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000256311_4199399424.pth [2024-06-19 02:09:41,116][26599] Updated weights for policy 0, policy_version 256934 (0.0040) [2024-06-19 02:09:43,380][26367] Fps is (10 sec: 37689.2, 60 sec: 41233.2, 300 sec: 41820.9). Total num frames: 4209655808. Throughput: 0: 41997.7. Samples: 477335500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:43,381][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 02:09:45,351][26599] Updated weights for policy 0, policy_version 256944 (0.0039) [2024-06-19 02:09:48,380][26367] Fps is (10 sec: 40970.4, 60 sec: 42327.9, 300 sec: 41987.5). Total num frames: 4209901568. Throughput: 0: 42089.3. Samples: 477456860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:48,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 02:09:48,702][26599] Updated weights for policy 0, policy_version 256954 (0.0039) [2024-06-19 02:09:52,947][26599] Updated weights for policy 0, policy_version 256964 (0.0032) [2024-06-19 02:09:53,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42054.8, 300 sec: 41987.5). Total num frames: 4210114560. Throughput: 0: 42199.1. Samples: 477720320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:53,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 02:09:56,386][26599] Updated weights for policy 0, policy_version 256974 (0.0044) [2024-06-19 02:09:58,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4210278400. Throughput: 0: 41907.6. Samples: 477966720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:09:58,380][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 02:10:00,930][26599] Updated weights for policy 0, policy_version 256984 (0.0043) [2024-06-19 02:10:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4210524160. Throughput: 0: 42009.8. Samples: 478086080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 02:10:03,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 02:10:04,483][26599] Updated weights for policy 0, policy_version 256994 (0.0035) [2024-06-19 02:10:07,072][26579] Signal inference workers to stop experience collection... (7050 times) [2024-06-19 02:10:07,112][26599] InferenceWorker_p0-w0: stopping experience collection (7050 times) [2024-06-19 02:10:07,129][26579] Signal inference workers to resume experience collection... (7050 times) [2024-06-19 02:10:07,130][26599] InferenceWorker_p0-w0: resuming experience collection (7050 times) [2024-06-19 02:10:08,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4210720768. Throughput: 0: 41955.2. 
Samples: 478344900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:08,380][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 02:10:08,791][26599] Updated weights for policy 0, policy_version 257004 (0.0037) [2024-06-19 02:10:12,482][26599] Updated weights for policy 0, policy_version 257014 (0.0050) [2024-06-19 02:10:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42056.0, 300 sec: 41876.4). Total num frames: 4210933760. Throughput: 0: 41749.8. Samples: 478589360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:13,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 02:10:16,770][26599] Updated weights for policy 0, policy_version 257024 (0.0053) [2024-06-19 02:10:18,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42325.4, 300 sec: 42099.1). Total num frames: 4211179520. Throughput: 0: 41997.0. Samples: 478719540. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:18,381][26367] Avg episode reward: [(0, '0.297')] [2024-06-19 02:10:20,448][26599] Updated weights for policy 0, policy_version 257034 (0.0037) [2024-06-19 02:10:23,380][26367] Fps is (10 sec: 39321.9, 60 sec: 40960.1, 300 sec: 41876.4). Total num frames: 4211326976. Throughput: 0: 41727.4. Samples: 478968880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:23,380][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 02:10:24,818][26599] Updated weights for policy 0, policy_version 257044 (0.0031) [2024-06-19 02:10:28,242][26599] Updated weights for policy 0, policy_version 257054 (0.0037) [2024-06-19 02:10:28,384][26367] Fps is (10 sec: 39307.3, 60 sec: 42595.8, 300 sec: 41931.4). Total num frames: 4211572736. Throughput: 0: 41874.3. Samples: 479220000. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:28,384][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 02:10:32,439][26599] Updated weights for policy 0, policy_version 257064 (0.0037) [2024-06-19 02:10:33,380][26367] Fps is (10 sec: 45875.2, 60 sec: 41780.3, 300 sec: 41987.5). Total num frames: 4211785728. Throughput: 0: 42006.3. Samples: 479347140. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:33,380][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 02:10:35,897][26599] Updated weights for policy 0, policy_version 257074 (0.0028) [2024-06-19 02:10:38,380][26367] Fps is (10 sec: 39335.4, 60 sec: 41234.8, 300 sec: 41820.8). Total num frames: 4211965952. Throughput: 0: 41746.1. Samples: 479598900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:38,381][26367] Avg episode reward: [(0, '0.404')] [2024-06-19 02:10:40,064][26599] Updated weights for policy 0, policy_version 257084 (0.0039) [2024-06-19 02:10:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4212195328. Throughput: 0: 41668.5. Samples: 479841800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:43,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 02:10:43,588][26599] Updated weights for policy 0, policy_version 257094 (0.0029) [2024-06-19 02:10:48,041][26599] Updated weights for policy 0, policy_version 257104 (0.0037) [2024-06-19 02:10:48,382][26367] Fps is (10 sec: 44230.0, 60 sec: 41778.1, 300 sec: 42042.8). Total num frames: 4212408320. Throughput: 0: 41965.6. Samples: 479974600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:48,382][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 02:10:51,244][26599] Updated weights for policy 0, policy_version 257114 (0.0042) [2024-06-19 02:10:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4212604928. Throughput: 0: 41780.0. Samples: 480225000. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:53,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 02:10:55,781][26599] Updated weights for policy 0, policy_version 257124 (0.0039) [2024-06-19 02:10:58,384][26367] Fps is (10 sec: 40951.8, 60 sec: 42322.7, 300 sec: 41876.4). Total num frames: 4212817920. Throughput: 0: 41785.0. Samples: 480469840. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:10:58,384][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 02:10:59,760][26599] Updated weights for policy 0, policy_version 257134 (0.0031) [2024-06-19 02:11:03,384][26367] Fps is (10 sec: 42582.6, 60 sec: 41776.7, 300 sec: 41987.5). Total num frames: 4213030912. Throughput: 0: 41891.3. Samples: 480604800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:11:03,385][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 02:11:03,529][26599] Updated weights for policy 0, policy_version 257144 (0.0040) [2024-06-19 02:11:07,869][26599] Updated weights for policy 0, policy_version 257154 (0.0036) [2024-06-19 02:11:08,380][26367] Fps is (10 sec: 40975.0, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 4213227520. Throughput: 0: 41828.8. Samples: 480851180. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:11:08,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 02:11:11,167][26599] Updated weights for policy 0, policy_version 257164 (0.0038) [2024-06-19 02:11:13,380][26367] Fps is (10 sec: 42614.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4213456896. Throughput: 0: 41779.5. Samples: 481099920. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 02:11:13,380][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 02:11:15,513][26599] Updated weights for policy 0, policy_version 257174 (0.0039) [2024-06-19 02:11:18,380][26367] Fps is (10 sec: 40959.5, 60 sec: 40959.9, 300 sec: 41932.4). Total num frames: 4213637120. Throughput: 0: 41822.9. Samples: 481229180. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:11:18,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 02:11:18,487][26579] Signal inference workers to stop experience collection... (7100 times) [2024-06-19 02:11:18,533][26599] InferenceWorker_p0-w0: stopping experience collection (7100 times) [2024-06-19 02:11:18,542][26579] Signal inference workers to resume experience collection... (7100 times) [2024-06-19 02:11:18,552][26599] InferenceWorker_p0-w0: resuming experience collection (7100 times) [2024-06-19 02:11:19,021][26599] Updated weights for policy 0, policy_version 257184 (0.0050) [2024-06-19 02:11:23,156][26599] Updated weights for policy 0, policy_version 257194 (0.0033) [2024-06-19 02:11:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4213866496. Throughput: 0: 41865.9. Samples: 481482860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:11:23,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 02:11:26,643][26599] Updated weights for policy 0, policy_version 257204 (0.0041) [2024-06-19 02:11:28,380][26367] Fps is (10 sec: 44237.6, 60 sec: 41781.8, 300 sec: 42043.0). Total num frames: 4214079488. Throughput: 0: 42009.8. Samples: 481732240. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:11:28,380][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 02:11:30,782][26599] Updated weights for policy 0, policy_version 257214 (0.0036) [2024-06-19 02:11:33,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 4214259712. Throughput: 0: 41858.4. Samples: 481858160. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:11:33,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 02:11:34,440][26599] Updated weights for policy 0, policy_version 257224 (0.0040) [2024-06-19 02:11:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 4214489088. Throughput: 0: 41915.0. 
Samples: 482111180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:11:38,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 02:11:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000257232_4214489088.pth... [2024-06-19 02:11:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000256619_4204445696.pth [2024-06-19 02:11:38,734][26599] Updated weights for policy 0, policy_version 257234 (0.0032) [2024-06-19 02:11:42,486][26599] Updated weights for policy 0, policy_version 257244 (0.0034) [2024-06-19 02:11:43,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4214734848. Throughput: 0: 41790.9. Samples: 482350280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:11:43,381][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 02:11:46,705][26599] Updated weights for policy 0, policy_version 257254 (0.0042) [2024-06-19 02:11:48,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41234.2, 300 sec: 41765.3). Total num frames: 4214882304. Throughput: 0: 41730.5. Samples: 482482520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:11:48,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 02:11:50,126][26599] Updated weights for policy 0, policy_version 257264 (0.0030) [2024-06-19 02:11:53,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41779.1, 300 sec: 41932.0). Total num frames: 4215111680. Throughput: 0: 41691.0. Samples: 482727280. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:11:53,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 02:11:54,877][26599] Updated weights for policy 0, policy_version 257274 (0.0037) [2024-06-19 02:11:57,862][26599] Updated weights for policy 0, policy_version 257284 (0.0043) [2024-06-19 02:11:58,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42327.9, 300 sec: 42043.0). Total num frames: 4215357440. Throughput: 0: 41810.1. Samples: 482981380. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:11:58,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 02:12:02,755][26599] Updated weights for policy 0, policy_version 257294 (0.0031) [2024-06-19 02:12:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41508.6, 300 sec: 41821.1). Total num frames: 4215521280. Throughput: 0: 41907.1. Samples: 483115000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:12:03,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 02:12:05,575][26599] Updated weights for policy 0, policy_version 257304 (0.0036) [2024-06-19 02:12:08,383][26367] Fps is (10 sec: 39311.3, 60 sec: 42050.4, 300 sec: 41931.5). Total num frames: 4215750656. Throughput: 0: 41722.4. Samples: 483360480. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:12:08,383][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 02:12:10,418][26599] Updated weights for policy 0, policy_version 257314 (0.0027) [2024-06-19 02:12:11,586][26579] Signal inference workers to stop experience collection... (7150 times) [2024-06-19 02:12:11,637][26599] InferenceWorker_p0-w0: stopping experience collection (7150 times) [2024-06-19 02:12:11,646][26579] Signal inference workers to resume experience collection... (7150 times) [2024-06-19 02:12:11,651][26599] InferenceWorker_p0-w0: resuming experience collection (7150 times) [2024-06-19 02:12:13,271][26599] Updated weights for policy 0, policy_version 257324 (0.0025) [2024-06-19 02:12:13,380][26367] Fps is (10 sec: 49152.9, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4216012800. Throughput: 0: 41982.7. Samples: 483621460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:12:13,380][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 02:12:18,201][26599] Updated weights for policy 0, policy_version 257334 (0.0045) [2024-06-19 02:12:18,380][26367] Fps is (10 sec: 40970.5, 60 sec: 42052.3, 300 sec: 41820.8). Total num frames: 4216160256. Throughput: 0: 41993.3. 
Samples: 483747860. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:12:18,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 02:12:21,159][26599] Updated weights for policy 0, policy_version 257344 (0.0028) [2024-06-19 02:12:23,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4216406016. Throughput: 0: 41827.2. Samples: 483993400. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:12:23,381][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 02:12:26,011][26599] Updated weights for policy 0, policy_version 257354 (0.0039) [2024-06-19 02:12:28,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4216602624. Throughput: 0: 42263.2. Samples: 484252120. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:12:28,380][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 02:12:29,338][26599] Updated weights for policy 0, policy_version 257364 (0.0030) [2024-06-19 02:12:33,380][26367] Fps is (10 sec: 36044.6, 60 sec: 41779.2, 300 sec: 41765.8). Total num frames: 4216766464. Throughput: 0: 41915.1. Samples: 484368700. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:12:33,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 02:12:33,765][26599] Updated weights for policy 0, policy_version 257374 (0.0028) [2024-06-19 02:12:37,016][26599] Updated weights for policy 0, policy_version 257384 (0.0033) [2024-06-19 02:12:38,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4217044992. Throughput: 0: 42246.3. Samples: 484628360. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:12:38,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 02:12:41,685][26599] Updated weights for policy 0, policy_version 257394 (0.0032) [2024-06-19 02:12:43,380][26367] Fps is (10 sec: 45875.4, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4217225216. Throughput: 0: 42325.8. 
Samples: 484886040. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:12:43,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 02:12:44,834][26599] Updated weights for policy 0, policy_version 257404 (0.0028) [2024-06-19 02:12:48,384][26367] Fps is (10 sec: 37669.4, 60 sec: 42322.7, 300 sec: 41820.3). Total num frames: 4217421824. Throughput: 0: 41906.4. Samples: 485000940. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:12:48,385][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 02:12:49,363][26599] Updated weights for policy 0, policy_version 257414 (0.0038) [2024-06-19 02:12:52,503][26599] Updated weights for policy 0, policy_version 257424 (0.0030) [2024-06-19 02:12:53,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42043.0). Total num frames: 4217683968. Throughput: 0: 42379.8. Samples: 485267460. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:12:53,381][26367] Avg episode reward: [(0, '0.840')] [2024-06-19 02:12:57,106][26599] Updated weights for policy 0, policy_version 257434 (0.0039) [2024-06-19 02:12:58,380][26367] Fps is (10 sec: 42613.9, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4217847808. Throughput: 0: 42208.3. Samples: 485520840. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:12:58,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 02:13:00,498][26599] Updated weights for policy 0, policy_version 257444 (0.0031) [2024-06-19 02:13:03,380][26367] Fps is (10 sec: 37683.1, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4218060800. Throughput: 0: 42000.5. Samples: 485637880. 
Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:13:03,381][26367] Avg episode reward: [(0, '0.219')] [2024-06-19 02:13:04,976][26599] Updated weights for policy 0, policy_version 257454 (0.0042) [2024-06-19 02:13:08,208][26599] Updated weights for policy 0, policy_version 257464 (0.0034) [2024-06-19 02:13:08,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42327.2, 300 sec: 41931.9). Total num frames: 4218290176. Throughput: 0: 42295.5. Samples: 485896700. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:13:08,380][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 02:13:12,721][26599] Updated weights for policy 0, policy_version 257474 (0.0028) [2024-06-19 02:13:13,380][26367] Fps is (10 sec: 40960.5, 60 sec: 40960.0, 300 sec: 41931.9). Total num frames: 4218470400. Throughput: 0: 42121.8. Samples: 486147600. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:13:13,380][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 02:13:15,883][26579] Signal inference workers to stop experience collection... (7200 times) [2024-06-19 02:13:15,938][26579] Signal inference workers to resume experience collection... (7200 times) [2024-06-19 02:13:15,939][26599] InferenceWorker_p0-w0: stopping experience collection (7200 times) [2024-06-19 02:13:15,954][26599] InferenceWorker_p0-w0: resuming experience collection (7200 times) [2024-06-19 02:13:16,106][26599] Updated weights for policy 0, policy_version 257484 (0.0036) [2024-06-19 02:13:18,383][26367] Fps is (10 sec: 42586.9, 60 sec: 42596.6, 300 sec: 42042.6). Total num frames: 4218716160. Throughput: 0: 42304.6. Samples: 486272520. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:13:18,383][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 02:13:20,272][26599] Updated weights for policy 0, policy_version 257494 (0.0034) [2024-06-19 02:13:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4218896384. Throughput: 0: 42154.3. 
Samples: 486525300. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:13:23,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 02:13:24,145][26599] Updated weights for policy 0, policy_version 257504 (0.0039) [2024-06-19 02:13:28,206][26599] Updated weights for policy 0, policy_version 257514 (0.0042) [2024-06-19 02:13:28,384][26367] Fps is (10 sec: 39317.6, 60 sec: 41776.6, 300 sec: 41931.5). Total num frames: 4219109376. Throughput: 0: 42065.0. Samples: 486779120. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:13:28,384][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 02:13:31,935][26599] Updated weights for policy 0, policy_version 257524 (0.0029) [2024-06-19 02:13:33,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 41987.5). Total num frames: 4219338752. Throughput: 0: 42294.6. Samples: 486904040. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:13:33,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 02:13:35,941][26599] Updated weights for policy 0, policy_version 257534 (0.0031) [2024-06-19 02:13:38,380][26367] Fps is (10 sec: 42613.7, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4219535360. Throughput: 0: 41794.6. Samples: 487148220. Policy #0 lag: (min: 1.0, avg: 11.5, max: 26.0) [2024-06-19 02:13:38,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 02:13:38,406][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000257540_4219535360.pth... [2024-06-19 02:13:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000256927_4209491968.pth [2024-06-19 02:13:39,838][26599] Updated weights for policy 0, policy_version 257544 (0.0034) [2024-06-19 02:13:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41988.0). Total num frames: 4219748352. Throughput: 0: 41888.4. Samples: 487405820. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:13:43,381][26367] Avg episode reward: [(0, '0.380')] [2024-06-19 02:13:43,586][26599] Updated weights for policy 0, policy_version 257554 (0.0035) [2024-06-19 02:13:47,503][26599] Updated weights for policy 0, policy_version 257564 (0.0025) [2024-06-19 02:13:48,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42328.0, 300 sec: 41932.5). Total num frames: 4219961344. Throughput: 0: 42140.1. Samples: 487534180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:13:48,380][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 02:13:51,209][26599] Updated weights for policy 0, policy_version 257574 (0.0039) [2024-06-19 02:13:53,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4220174336. Throughput: 0: 42097.8. Samples: 487791100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:13:53,381][26367] Avg episode reward: [(0, '0.802')] [2024-06-19 02:13:55,140][26599] Updated weights for policy 0, policy_version 257584 (0.0042) [2024-06-19 02:13:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4220387328. Throughput: 0: 42052.9. Samples: 488039980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:13:58,380][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 02:13:58,909][26599] Updated weights for policy 0, policy_version 257594 (0.0030) [2024-06-19 02:14:02,841][26599] Updated weights for policy 0, policy_version 257604 (0.0040) [2024-06-19 02:14:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4220583936. Throughput: 0: 42178.9. Samples: 488170460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:03,381][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 02:14:06,535][26599] Updated weights for policy 0, policy_version 257614 (0.0040) [2024-06-19 02:14:08,382][26367] Fps is (10 sec: 40951.4, 60 sec: 41777.8, 300 sec: 41987.9). 
Total num frames: 4220796928. Throughput: 0: 42113.2. Samples: 488420480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:08,383][26367] Avg episode reward: [(0, '0.396')] [2024-06-19 02:14:11,074][26599] Updated weights for policy 0, policy_version 257624 (0.0034) [2024-06-19 02:14:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4221009920. Throughput: 0: 41990.6. Samples: 488668540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:13,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 02:14:14,430][26599] Updated weights for policy 0, policy_version 257634 (0.0028) [2024-06-19 02:14:18,380][26367] Fps is (10 sec: 42607.3, 60 sec: 41781.1, 300 sec: 41876.4). Total num frames: 4221222912. Throughput: 0: 41953.4. Samples: 488791940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:18,380][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 02:14:18,672][26599] Updated weights for policy 0, policy_version 257644 (0.0038) [2024-06-19 02:14:22,314][26599] Updated weights for policy 0, policy_version 257654 (0.0054) [2024-06-19 02:14:23,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42322.7, 300 sec: 42098.0). Total num frames: 4221435904. Throughput: 0: 42099.3. Samples: 489042840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:23,384][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 02:14:26,181][26599] Updated weights for policy 0, policy_version 257664 (0.0028) [2024-06-19 02:14:28,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41781.8, 300 sec: 41821.1). Total num frames: 4221616128. Throughput: 0: 42192.6. Samples: 489304480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:28,380][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 02:14:30,074][26599] Updated weights for policy 0, policy_version 257674 (0.0033) [2024-06-19 02:14:33,380][26367] Fps is (10 sec: 44253.2, 60 sec: 42325.3, 300 sec: 41987.8). 
Total num frames: 4221878272. Throughput: 0: 42051.5. Samples: 489426500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:33,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 02:14:33,646][26599] Updated weights for policy 0, policy_version 257684 (0.0038) [2024-06-19 02:14:37,726][26599] Updated weights for policy 0, policy_version 257694 (0.0042) [2024-06-19 02:14:38,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 4222074880. Throughput: 0: 41956.5. Samples: 489679140. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:38,384][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 02:14:41,215][26599] Updated weights for policy 0, policy_version 257704 (0.0034) [2024-06-19 02:14:43,380][26367] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 4222255104. Throughput: 0: 42136.0. Samples: 489936100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:43,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 02:14:44,892][26579] Signal inference workers to stop experience collection... (7250 times) [2024-06-19 02:14:44,892][26579] Signal inference workers to resume experience collection... (7250 times) [2024-06-19 02:14:44,915][26599] InferenceWorker_p0-w0: stopping experience collection (7250 times) [2024-06-19 02:14:44,915][26599] InferenceWorker_p0-w0: resuming experience collection (7250 times) [2024-06-19 02:14:45,187][26599] Updated weights for policy 0, policy_version 257714 (0.0030) [2024-06-19 02:14:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 4222517248. Throughput: 0: 42019.2. Samples: 490061320. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 02:14:48,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 02:14:49,742][26599] Updated weights for policy 0, policy_version 257724 (0.0044) [2024-06-19 02:14:52,813][26599] Updated weights for policy 0, policy_version 257734 (0.0039) [2024-06-19 02:14:53,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4222713856. Throughput: 0: 42288.6. Samples: 490323380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:14:53,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 02:14:57,397][26599] Updated weights for policy 0, policy_version 257744 (0.0027) [2024-06-19 02:14:58,380][26367] Fps is (10 sec: 37682.9, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4222894080. Throughput: 0: 42271.1. Samples: 490570740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:14:58,388][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 02:15:00,909][26599] Updated weights for policy 0, policy_version 257754 (0.0031) [2024-06-19 02:15:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4223123456. Throughput: 0: 42242.2. Samples: 490692840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:03,380][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 02:15:05,254][26599] Updated weights for policy 0, policy_version 257764 (0.0027) [2024-06-19 02:15:08,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42324.1, 300 sec: 42042.5). Total num frames: 4223336448. Throughput: 0: 42406.7. Samples: 490951140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:08,385][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 02:15:08,633][26599] Updated weights for policy 0, policy_version 257774 (0.0025) [2024-06-19 02:15:12,903][26599] Updated weights for policy 0, policy_version 257784 (0.0050) [2024-06-19 02:15:13,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4223533056. Throughput: 0: 42113.6. Samples: 491199600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:13,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 02:15:16,381][26599] Updated weights for policy 0, policy_version 257794 (0.0032) [2024-06-19 02:15:18,380][26367] Fps is (10 sec: 40975.1, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4223746048. Throughput: 0: 42219.1. Samples: 491326360. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:18,381][26367] Avg episode reward: [(0, '0.342')] [2024-06-19 02:15:20,486][26599] Updated weights for policy 0, policy_version 257804 (0.0041) [2024-06-19 02:15:23,380][26367] Fps is (10 sec: 44237.8, 60 sec: 42328.0, 300 sec: 42043.5). Total num frames: 4223975424. Throughput: 0: 42280.9. Samples: 491581780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:23,380][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 02:15:24,278][26599] Updated weights for policy 0, policy_version 257814 (0.0023) [2024-06-19 02:15:28,063][26599] Updated weights for policy 0, policy_version 257824 (0.0032) [2024-06-19 02:15:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42043.0). Total num frames: 4224188416. Throughput: 0: 42272.8. Samples: 491838380. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:28,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 02:15:32,264][26599] Updated weights for policy 0, policy_version 257834 (0.0026) [2024-06-19 02:15:33,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4224385024. Throughput: 0: 42353.7. Samples: 491967240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:33,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 02:15:35,708][26599] Updated weights for policy 0, policy_version 257844 (0.0029) [2024-06-19 02:15:38,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4224598016. Throughput: 0: 42139.2. Samples: 492219640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:38,380][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 02:15:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000257850_4224614400.pth... [2024-06-19 02:15:38,446][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000257232_4214489088.pth [2024-06-19 02:15:39,925][26599] Updated weights for policy 0, policy_version 257854 (0.0045) [2024-06-19 02:15:43,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42098.8). Total num frames: 4224827392. Throughput: 0: 42150.8. Samples: 492467520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:43,380][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 02:15:43,455][26599] Updated weights for policy 0, policy_version 257864 (0.0031) [2024-06-19 02:15:47,623][26599] Updated weights for policy 0, policy_version 257874 (0.0032) [2024-06-19 02:15:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4225007616. Throughput: 0: 42276.4. Samples: 492595280. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:48,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 02:15:50,991][26599] Updated weights for policy 0, policy_version 257884 (0.0031) [2024-06-19 02:15:53,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 42154.6). Total num frames: 4225253376. Throughput: 0: 42286.9. Samples: 492853900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:53,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 02:15:55,046][26599] Updated weights for policy 0, policy_version 257894 (0.0029) [2024-06-19 02:15:58,151][26579] Signal inference workers to stop experience collection... (7300 times) [2024-06-19 02:15:58,208][26599] InferenceWorker_p0-w0: stopping experience collection (7300 times) [2024-06-19 02:15:58,269][26579] Signal inference workers to resume experience collection... (7300 times) [2024-06-19 02:15:58,269][26599] InferenceWorker_p0-w0: resuming experience collection (7300 times) [2024-06-19 02:15:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42099.1). Total num frames: 4225449984. Throughput: 0: 42451.3. Samples: 493109900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 02:15:58,380][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 02:15:58,925][26599] Updated weights for policy 0, policy_version 257904 (0.0026) [2024-06-19 02:16:02,991][26599] Updated weights for policy 0, policy_version 257914 (0.0027) [2024-06-19 02:16:03,384][26367] Fps is (10 sec: 40945.7, 60 sec: 42322.8, 300 sec: 42153.6). Total num frames: 4225662976. Throughput: 0: 42245.1. Samples: 493227540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:03,384][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 02:16:07,101][26599] Updated weights for policy 0, policy_version 257924 (0.0036) [2024-06-19 02:16:08,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42601.0, 300 sec: 42154.1). Total num frames: 4225892352. Throughput: 0: 42386.2. 
Samples: 493489160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:08,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 02:16:10,504][26599] Updated weights for policy 0, policy_version 257934 (0.0036) [2024-06-19 02:16:13,380][26367] Fps is (10 sec: 42613.8, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 4226088960. Throughput: 0: 42367.6. Samples: 493744920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:13,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 02:16:14,704][26599] Updated weights for policy 0, policy_version 257944 (0.0029) [2024-06-19 02:16:18,198][26599] Updated weights for policy 0, policy_version 257954 (0.0040) [2024-06-19 02:16:18,384][26367] Fps is (10 sec: 42582.6, 60 sec: 42868.8, 300 sec: 42209.1). Total num frames: 4226318336. Throughput: 0: 42276.5. Samples: 493869840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:18,385][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 02:16:22,418][26599] Updated weights for policy 0, policy_version 257964 (0.0021) [2024-06-19 02:16:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4226498560. Throughput: 0: 42361.2. Samples: 494125900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:23,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 02:16:25,963][26599] Updated weights for policy 0, policy_version 257974 (0.0033) [2024-06-19 02:16:28,380][26367] Fps is (10 sec: 39336.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4226711552. Throughput: 0: 42627.9. Samples: 494385780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:28,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 02:16:30,166][26599] Updated weights for policy 0, policy_version 257984 (0.0023) [2024-06-19 02:16:33,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4226940928. Throughput: 0: 42425.3. 
Samples: 494504420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:33,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 02:16:34,142][26599] Updated weights for policy 0, policy_version 257994 (0.0028) [2024-06-19 02:16:38,012][26599] Updated weights for policy 0, policy_version 258004 (0.0030) [2024-06-19 02:16:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 4227153920. Throughput: 0: 42200.5. Samples: 494752920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:38,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 02:16:42,075][26599] Updated weights for policy 0, policy_version 258014 (0.0051) [2024-06-19 02:16:43,380][26367] Fps is (10 sec: 37683.4, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 4227317760. Throughput: 0: 42231.0. Samples: 495010300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:43,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 02:16:45,781][26599] Updated weights for policy 0, policy_version 258024 (0.0039) [2024-06-19 02:16:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4227563520. Throughput: 0: 42091.8. Samples: 495121520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:48,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 02:16:49,692][26599] Updated weights for policy 0, policy_version 258034 (0.0037) [2024-06-19 02:16:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4227760128. Throughput: 0: 42058.7. Samples: 495381800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:53,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 02:16:53,648][26599] Updated weights for policy 0, policy_version 258044 (0.0037) [2024-06-19 02:16:57,407][26599] Updated weights for policy 0, policy_version 258054 (0.0036) [2024-06-19 02:16:58,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4227956736. Throughput: 0: 42019.6. Samples: 495635800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:16:58,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 02:17:01,558][26599] Updated weights for policy 0, policy_version 258064 (0.0045) [2024-06-19 02:17:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42054.8, 300 sec: 42154.5). Total num frames: 4228186112. Throughput: 0: 42012.4. Samples: 495760240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:17:03,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 02:17:05,100][26599] Updated weights for policy 0, policy_version 258074 (0.0034) [2024-06-19 02:17:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4228382720. Throughput: 0: 41884.9. Samples: 496010720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:17:08,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 02:17:09,318][26599] Updated weights for policy 0, policy_version 258084 (0.0034) [2024-06-19 02:17:13,203][26599] Updated weights for policy 0, policy_version 258094 (0.0041) [2024-06-19 02:17:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4228612096. Throughput: 0: 41716.4. Samples: 496263020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:13,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 02:17:16,644][26579] Signal inference workers to stop experience collection... 
(7350 times) [2024-06-19 02:17:16,692][26599] InferenceWorker_p0-w0: stopping experience collection (7350 times) [2024-06-19 02:17:16,765][26579] Signal inference workers to resume experience collection... (7350 times) [2024-06-19 02:17:16,766][26599] InferenceWorker_p0-w0: resuming experience collection (7350 times) [2024-06-19 02:17:16,910][26599] Updated weights for policy 0, policy_version 258104 (0.0032) [2024-06-19 02:17:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41508.7, 300 sec: 42043.0). Total num frames: 4228808704. Throughput: 0: 41932.5. Samples: 496391380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:18,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 02:17:20,891][26599] Updated weights for policy 0, policy_version 258114 (0.0046) [2024-06-19 02:17:23,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4229021696. Throughput: 0: 41969.7. Samples: 496641560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:23,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 02:17:24,613][26599] Updated weights for policy 0, policy_version 258124 (0.0038) [2024-06-19 02:17:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4229234688. Throughput: 0: 41735.6. Samples: 496888400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:28,380][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 02:17:28,877][26599] Updated weights for policy 0, policy_version 258134 (0.0039) [2024-06-19 02:17:32,498][26599] Updated weights for policy 0, policy_version 258144 (0.0045) [2024-06-19 02:17:33,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4229447680. Throughput: 0: 42207.5. Samples: 497020860. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:33,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 02:17:36,624][26599] Updated weights for policy 0, policy_version 258154 (0.0033) [2024-06-19 02:17:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 4229644288. Throughput: 0: 42004.5. Samples: 497272000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:38,380][26367] Avg episode reward: [(0, '0.737')] [2024-06-19 02:17:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000258157_4229644288.pth... [2024-06-19 02:17:38,468][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000257540_4219535360.pth [2024-06-19 02:17:40,357][26599] Updated weights for policy 0, policy_version 258164 (0.0034) [2024-06-19 02:17:43,381][26367] Fps is (10 sec: 42594.8, 60 sec: 42597.8, 300 sec: 42210.0). Total num frames: 4229873664. Throughput: 0: 41914.7. Samples: 497522000. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:43,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 02:17:44,475][26599] Updated weights for policy 0, policy_version 258174 (0.0028) [2024-06-19 02:17:48,131][26599] Updated weights for policy 0, policy_version 258184 (0.0038) [2024-06-19 02:17:48,384][26367] Fps is (10 sec: 44220.4, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 4230086656. Throughput: 0: 41914.8. Samples: 497646560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:48,384][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 02:17:52,370][26599] Updated weights for policy 0, policy_version 258194 (0.0034) [2024-06-19 02:17:53,380][26367] Fps is (10 sec: 40963.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4230283264. Throughput: 0: 41841.8. Samples: 497893600. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:53,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 02:17:56,030][26599] Updated weights for policy 0, policy_version 258204 (0.0038) [2024-06-19 02:17:58,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4230496256. Throughput: 0: 41860.1. Samples: 498146720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:17:58,380][26367] Avg episode reward: [(0, '0.374')] [2024-06-19 02:18:00,423][26599] Updated weights for policy 0, policy_version 258214 (0.0030) [2024-06-19 02:18:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4230676480. Throughput: 0: 41743.6. Samples: 498269840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:18:03,380][26367] Avg episode reward: [(0, '0.425')] [2024-06-19 02:18:03,846][26599] Updated weights for policy 0, policy_version 258224 (0.0031) [2024-06-19 02:18:08,122][26599] Updated weights for policy 0, policy_version 258234 (0.0032) [2024-06-19 02:18:08,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4230905856. Throughput: 0: 41777.8. Samples: 498521560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:18:08,381][26367] Avg episode reward: [(0, '0.235')] [2024-06-19 02:18:11,796][26599] Updated weights for policy 0, policy_version 258244 (0.0034) [2024-06-19 02:18:13,380][26367] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 42043.4). Total num frames: 4231118848. Throughput: 0: 41698.5. Samples: 498764840. Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:18:13,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 02:18:15,857][26599] Updated weights for policy 0, policy_version 258254 (0.0035) [2024-06-19 02:18:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4231315456. Throughput: 0: 41456.5. Samples: 498886400. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 24.0) [2024-06-19 02:18:18,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 02:18:19,818][26599] Updated weights for policy 0, policy_version 258264 (0.0025) [2024-06-19 02:18:23,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42099.1). Total num frames: 4231528448. Throughput: 0: 41541.4. Samples: 499141360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:18:23,380][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 02:18:23,542][26599] Updated weights for policy 0, policy_version 258274 (0.0033) [2024-06-19 02:18:27,902][26599] Updated weights for policy 0, policy_version 258284 (0.0044) [2024-06-19 02:18:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4231725056. Throughput: 0: 41564.8. Samples: 499392380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:18:28,380][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 02:18:31,417][26599] Updated weights for policy 0, policy_version 258294 (0.0034) [2024-06-19 02:18:33,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 4231921664. Throughput: 0: 41645.7. Samples: 499520460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:18:33,380][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 02:18:35,683][26599] Updated weights for policy 0, policy_version 258304 (0.0038) [2024-06-19 02:18:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4232151040. Throughput: 0: 41648.0. Samples: 499767760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:18:38,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 02:18:39,169][26599] Updated weights for policy 0, policy_version 258314 (0.0039) [2024-06-19 02:18:41,736][26579] Signal inference workers to stop experience collection... 
(7400 times) [2024-06-19 02:18:41,780][26599] InferenceWorker_p0-w0: stopping experience collection (7400 times) [2024-06-19 02:18:41,799][26579] Signal inference workers to resume experience collection... (7400 times) [2024-06-19 02:18:41,799][26599] InferenceWorker_p0-w0: resuming experience collection (7400 times) [2024-06-19 02:18:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41506.7, 300 sec: 42043.0). Total num frames: 4232364032. Throughput: 0: 41587.1. Samples: 500018140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:18:43,381][26367] Avg episode reward: [(0, '0.305')] [2024-06-19 02:18:43,639][26599] Updated weights for policy 0, policy_version 258324 (0.0035) [2024-06-19 02:18:47,153][26599] Updated weights for policy 0, policy_version 258334 (0.0035) [2024-06-19 02:18:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41235.5, 300 sec: 41987.5). Total num frames: 4232560640. Throughput: 0: 41645.2. Samples: 500143880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:18:48,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 02:18:51,435][26599] Updated weights for policy 0, policy_version 258344 (0.0043) [2024-06-19 02:18:53,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 4232757248. Throughput: 0: 41632.5. Samples: 500395020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:18:53,381][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 02:18:54,935][26599] Updated weights for policy 0, policy_version 258354 (0.0037) [2024-06-19 02:18:58,384][26367] Fps is (10 sec: 44220.3, 60 sec: 41776.5, 300 sec: 42098.0). Total num frames: 4233003008. Throughput: 0: 41780.1. Samples: 500645100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:18:58,385][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 02:18:59,221][26599] Updated weights for policy 0, policy_version 258364 (0.0038) [2024-06-19 02:19:02,684][26599] Updated weights for policy 0, policy_version 258374 (0.0034) [2024-06-19 02:19:03,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42043.3). Total num frames: 4233199616. Throughput: 0: 42076.9. Samples: 500779860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:19:03,380][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 02:19:06,890][26599] Updated weights for policy 0, policy_version 258384 (0.0029) [2024-06-19 02:19:08,380][26367] Fps is (10 sec: 37697.5, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 4233379840. Throughput: 0: 41697.7. Samples: 501017760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:19:08,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 02:19:10,564][26599] Updated weights for policy 0, policy_version 258394 (0.0036) [2024-06-19 02:19:13,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41506.1, 300 sec: 41987.4). Total num frames: 4233609216. Throughput: 0: 41714.1. Samples: 501269520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:19:13,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 02:19:15,132][26599] Updated weights for policy 0, policy_version 258404 (0.0028) [2024-06-19 02:19:18,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42043.5). Total num frames: 4233838592. Throughput: 0: 41768.9. Samples: 501400060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:19:18,380][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 02:19:18,667][26599] Updated weights for policy 0, policy_version 258414 (0.0035) [2024-06-19 02:19:22,926][26599] Updated weights for policy 0, policy_version 258424 (0.0026) [2024-06-19 02:19:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 42098.5). 
Total num frames: 4234035200. Throughput: 0: 41816.8. Samples: 501649520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:19:23,381][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 02:19:26,399][26599] Updated weights for policy 0, policy_version 258434 (0.0033) [2024-06-19 02:19:28,384][26367] Fps is (10 sec: 42582.4, 60 sec: 42322.7, 300 sec: 41986.9). Total num frames: 4234264576. Throughput: 0: 41645.9. Samples: 501892360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:19:28,384][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 02:19:30,649][26599] Updated weights for policy 0, policy_version 258444 (0.0033) [2024-06-19 02:19:33,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4234444800. Throughput: 0: 41785.8. Samples: 502024240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 02:19:33,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 02:19:34,258][26599] Updated weights for policy 0, policy_version 258454 (0.0035) [2024-06-19 02:19:38,380][26367] Fps is (10 sec: 39336.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4234657792. Throughput: 0: 41778.8. Samples: 502275060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:19:38,380][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 02:19:38,439][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000258464_4234674176.pth... [2024-06-19 02:19:38,445][26599] Updated weights for policy 0, policy_version 258464 (0.0046) [2024-06-19 02:19:38,499][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000257850_4224614400.pth [2024-06-19 02:19:42,467][26599] Updated weights for policy 0, policy_version 258474 (0.0034) [2024-06-19 02:19:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4234887168. Throughput: 0: 41702.6. Samples: 502521560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:19:43,381][26367] Avg episode reward: [(0, '0.816')] [2024-06-19 02:19:46,290][26599] Updated weights for policy 0, policy_version 258484 (0.0040) [2024-06-19 02:19:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4235083776. Throughput: 0: 41649.7. Samples: 502654100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:19:48,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 02:19:50,209][26599] Updated weights for policy 0, policy_version 258494 (0.0034) [2024-06-19 02:19:53,382][26367] Fps is (10 sec: 39316.0, 60 sec: 42051.3, 300 sec: 41987.3). Total num frames: 4235280384. Throughput: 0: 41751.1. Samples: 502896620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:19:53,382][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 02:19:54,368][26599] Updated weights for policy 0, policy_version 258504 (0.0047) [2024-06-19 02:19:58,262][26599] Updated weights for policy 0, policy_version 258514 (0.0034) [2024-06-19 02:19:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41508.7, 300 sec: 41931.9). Total num frames: 4235493376. Throughput: 0: 41836.5. Samples: 503152160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:19:58,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 02:20:02,191][26599] Updated weights for policy 0, policy_version 258524 (0.0048) [2024-06-19 02:20:03,380][26367] Fps is (10 sec: 40966.0, 60 sec: 41506.2, 300 sec: 41876.9). Total num frames: 4235689984. Throughput: 0: 41645.3. Samples: 503274100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:20:03,380][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 02:20:05,994][26599] Updated weights for policy 0, policy_version 258534 (0.0042) [2024-06-19 02:20:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4235919360. Throughput: 0: 41572.9. Samples: 503520300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:20:08,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 02:20:10,223][26599] Updated weights for policy 0, policy_version 258544 (0.0026) [2024-06-19 02:20:13,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4236115968. Throughput: 0: 41781.1. Samples: 503772360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:20:13,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 02:20:13,778][26599] Updated weights for policy 0, policy_version 258554 (0.0027) [2024-06-19 02:20:14,327][26579] Signal inference workers to stop experience collection... (7450 times) [2024-06-19 02:20:14,388][26599] InferenceWorker_p0-w0: stopping experience collection (7450 times) [2024-06-19 02:20:14,446][26579] Signal inference workers to resume experience collection... (7450 times) [2024-06-19 02:20:14,446][26599] InferenceWorker_p0-w0: resuming experience collection (7450 times) [2024-06-19 02:20:18,292][26599] Updated weights for policy 0, policy_version 258564 (0.0025) [2024-06-19 02:20:18,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41233.0, 300 sec: 41820.8). Total num frames: 4236312576. Throughput: 0: 41600.4. Samples: 503896260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:20:18,380][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 02:20:21,767][26599] Updated weights for policy 0, policy_version 258574 (0.0031) [2024-06-19 02:20:23,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4236558336. Throughput: 0: 41653.7. Samples: 504149480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:20:23,381][26367] Avg episode reward: [(0, '0.785')] [2024-06-19 02:20:25,879][26599] Updated weights for policy 0, policy_version 258584 (0.0031) [2024-06-19 02:20:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41508.6, 300 sec: 41931.9). Total num frames: 4236754944. Throughput: 0: 41839.9. 
Samples: 504404360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:20:28,381][26367] Avg episode reward: [(0, '0.861')] [2024-06-19 02:20:29,386][26599] Updated weights for policy 0, policy_version 258594 (0.0041) [2024-06-19 02:20:33,384][26367] Fps is (10 sec: 39307.3, 60 sec: 41776.6, 300 sec: 41875.9). Total num frames: 4236951552. Throughput: 0: 41722.8. Samples: 504531780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:20:33,384][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 02:20:33,596][26599] Updated weights for policy 0, policy_version 258604 (0.0040) [2024-06-19 02:20:37,007][26599] Updated weights for policy 0, policy_version 258614 (0.0031) [2024-06-19 02:20:38,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4237164544. Throughput: 0: 41939.1. Samples: 504783820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:20:38,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 02:20:41,653][26599] Updated weights for policy 0, policy_version 258624 (0.0033) [2024-06-19 02:20:43,380][26367] Fps is (10 sec: 44253.4, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4237393920. Throughput: 0: 41834.8. Samples: 505034720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:20:43,380][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 02:20:44,629][26599] Updated weights for policy 0, policy_version 258634 (0.0033) [2024-06-19 02:20:48,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 4237557760. Throughput: 0: 41943.0. Samples: 505161540. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:20:48,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 02:20:49,302][26599] Updated weights for policy 0, policy_version 258644 (0.0043) [2024-06-19 02:20:52,500][26599] Updated weights for policy 0, policy_version 258654 (0.0029) [2024-06-19 02:20:53,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42050.7, 300 sec: 41875.9). Total num frames: 4237803520. Throughput: 0: 42063.3. Samples: 505413300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:20:53,384][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 02:20:57,083][26599] Updated weights for policy 0, policy_version 258664 (0.0035) [2024-06-19 02:20:58,384][26367] Fps is (10 sec: 44220.6, 60 sec: 41776.7, 300 sec: 41820.8). Total num frames: 4238000128. Throughput: 0: 42011.8. Samples: 505663040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:20:58,385][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 02:21:00,193][26599] Updated weights for policy 0, policy_version 258674 (0.0045) [2024-06-19 02:21:03,380][26367] Fps is (10 sec: 39336.0, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4238196736. Throughput: 0: 42015.1. Samples: 505786940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:03,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 02:21:04,977][26599] Updated weights for policy 0, policy_version 258684 (0.0026) [2024-06-19 02:21:07,819][26599] Updated weights for policy 0, policy_version 258694 (0.0041) [2024-06-19 02:21:08,380][26367] Fps is (10 sec: 44252.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4238442496. Throughput: 0: 41968.0. Samples: 506038040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:08,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 02:21:12,934][26599] Updated weights for policy 0, policy_version 258704 (0.0030) [2024-06-19 02:21:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41710.3). Total num frames: 4238622720. Throughput: 0: 41946.7. Samples: 506291960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:13,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 02:21:15,449][26599] Updated weights for policy 0, policy_version 258714 (0.0034) [2024-06-19 02:21:18,380][26367] Fps is (10 sec: 37682.8, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 4238819328. Throughput: 0: 41729.5. Samples: 506409460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:18,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 02:21:21,014][26599] Updated weights for policy 0, policy_version 258724 (0.0036) [2024-06-19 02:21:23,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4239065088. Throughput: 0: 41761.7. Samples: 506663100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:23,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 02:21:23,599][26599] Updated weights for policy 0, policy_version 258734 (0.0026) [2024-06-19 02:21:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 4239228928. Throughput: 0: 41877.7. Samples: 506919220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:28,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 02:21:28,717][26579] Signal inference workers to stop experience collection... (7500 times) [2024-06-19 02:21:28,765][26599] InferenceWorker_p0-w0: stopping experience collection (7500 times) [2024-06-19 02:21:28,841][26579] Signal inference workers to resume experience collection... 
(7500 times) [2024-06-19 02:21:28,842][26599] InferenceWorker_p0-w0: resuming experience collection (7500 times) [2024-06-19 02:21:28,844][26599] Updated weights for policy 0, policy_version 258744 (0.0037) [2024-06-19 02:21:31,469][26599] Updated weights for policy 0, policy_version 258754 (0.0039) [2024-06-19 02:21:33,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41781.8, 300 sec: 41709.8). Total num frames: 4239458304. Throughput: 0: 41657.0. Samples: 507036100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:33,380][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 02:21:36,579][26599] Updated weights for policy 0, policy_version 258764 (0.0041) [2024-06-19 02:21:38,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4239687680. Throughput: 0: 41727.8. Samples: 507290900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:38,381][26367] Avg episode reward: [(0, '0.810')] [2024-06-19 02:21:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000258770_4239687680.pth... [2024-06-19 02:21:38,452][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000258157_4229644288.pth [2024-06-19 02:21:39,553][26599] Updated weights for policy 0, policy_version 258774 (0.0044) [2024-06-19 02:21:43,382][26367] Fps is (10 sec: 42589.5, 60 sec: 41504.7, 300 sec: 41765.0). Total num frames: 4239884288. Throughput: 0: 41840.2. Samples: 507545780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:43,383][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 02:21:44,434][26599] Updated weights for policy 0, policy_version 258784 (0.0038) [2024-06-19 02:21:47,303][26599] Updated weights for policy 0, policy_version 258794 (0.0035) [2024-06-19 02:21:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 4240113664. Throughput: 0: 41736.0. Samples: 507665060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:48,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 02:21:52,200][26599] Updated weights for policy 0, policy_version 258804 (0.0038) [2024-06-19 02:21:53,380][26367] Fps is (10 sec: 40968.6, 60 sec: 41508.7, 300 sec: 41820.9). Total num frames: 4240293888. Throughput: 0: 41840.5. Samples: 507920860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 02:21:53,380][26367] Avg episode reward: [(0, '0.848')] [2024-06-19 02:21:55,422][26599] Updated weights for policy 0, policy_version 258814 (0.0034) [2024-06-19 02:21:58,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41781.7, 300 sec: 41765.3). Total num frames: 4240506880. Throughput: 0: 41703.5. Samples: 508168620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:21:58,381][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 02:22:00,009][26599] Updated weights for policy 0, policy_version 258824 (0.0046) [2024-06-19 02:22:03,270][26599] Updated weights for policy 0, policy_version 258834 (0.0033) [2024-06-19 02:22:03,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4240736256. Throughput: 0: 41967.2. Samples: 508297980. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:03,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 02:22:07,738][26599] Updated weights for policy 0, policy_version 258844 (0.0040) [2024-06-19 02:22:08,380][26367] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 41654.2). Total num frames: 4240900096. Throughput: 0: 41838.3. Samples: 508545820. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:08,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 02:22:10,919][26599] Updated weights for policy 0, policy_version 258854 (0.0037) [2024-06-19 02:22:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4241129472. Throughput: 0: 41667.5. Samples: 508794260. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:13,381][26367] Avg episode reward: [(0, '0.811')] [2024-06-19 02:22:15,605][26599] Updated weights for policy 0, policy_version 258864 (0.0040) [2024-06-19 02:22:18,384][26367] Fps is (10 sec: 47496.0, 60 sec: 42595.9, 300 sec: 41875.9). Total num frames: 4241375232. Throughput: 0: 41898.3. Samples: 508921680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:18,385][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 02:22:19,127][26599] Updated weights for policy 0, policy_version 258874 (0.0033) [2024-06-19 02:22:23,370][26599] Updated weights for policy 0, policy_version 258884 (0.0042) [2024-06-19 02:22:23,380][26367] Fps is (10 sec: 42599.1, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4241555456. Throughput: 0: 41727.2. Samples: 509168620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:23,380][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 02:22:26,723][26599] Updated weights for policy 0, policy_version 258894 (0.0029) [2024-06-19 02:22:28,380][26367] Fps is (10 sec: 39336.4, 60 sec: 42325.4, 300 sec: 41765.3). Total num frames: 4241768448. Throughput: 0: 41671.7. Samples: 509420920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:28,380][26367] Avg episode reward: [(0, '0.825')] [2024-06-19 02:22:31,013][26599] Updated weights for policy 0, policy_version 258904 (0.0030) [2024-06-19 02:22:33,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 4241981440. Throughput: 0: 41817.7. Samples: 509546860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:33,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 02:22:34,667][26599] Updated weights for policy 0, policy_version 258914 (0.0043) [2024-06-19 02:22:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41709.9). Total num frames: 4242178048. Throughput: 0: 41677.3. Samples: 509796340. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:38,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 02:22:38,747][26599] Updated weights for policy 0, policy_version 258924 (0.0044) [2024-06-19 02:22:42,389][26599] Updated weights for policy 0, policy_version 258934 (0.0039) [2024-06-19 02:22:43,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41778.1, 300 sec: 41709.8). Total num frames: 4242391040. Throughput: 0: 41637.1. Samples: 510042440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:43,385][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 02:22:46,659][26599] Updated weights for policy 0, policy_version 258944 (0.0031) [2024-06-19 02:22:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 4242587648. Throughput: 0: 41610.7. Samples: 510170460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:48,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 02:22:50,263][26599] Updated weights for policy 0, policy_version 258954 (0.0034) [2024-06-19 02:22:53,380][26367] Fps is (10 sec: 40975.0, 60 sec: 41779.1, 300 sec: 41709.8). Total num frames: 4242800640. Throughput: 0: 41683.5. Samples: 510421580. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:53,381][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 02:22:54,482][26599] Updated weights for policy 0, policy_version 258964 (0.0040) [2024-06-19 02:22:58,141][26599] Updated weights for policy 0, policy_version 258974 (0.0032) [2024-06-19 02:22:58,383][26367] Fps is (10 sec: 44226.6, 60 sec: 42050.7, 300 sec: 41876.1). Total num frames: 4243030016. Throughput: 0: 41746.4. Samples: 510672940. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:22:58,383][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 02:22:59,448][26579] Signal inference workers to stop experience collection... 
(7550 times) [2024-06-19 02:22:59,456][26579] Signal inference workers to resume experience collection... (7550 times) [2024-06-19 02:22:59,495][26599] InferenceWorker_p0-w0: stopping experience collection (7550 times) [2024-06-19 02:22:59,495][26599] InferenceWorker_p0-w0: resuming experience collection (7550 times) [2024-06-19 02:23:02,576][26599] Updated weights for policy 0, policy_version 258984 (0.0040) [2024-06-19 02:23:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4243226624. Throughput: 0: 41811.9. Samples: 510803060. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 02:23:03,380][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 02:23:05,591][26599] Updated weights for policy 0, policy_version 258994 (0.0037) [2024-06-19 02:23:08,380][26367] Fps is (10 sec: 42608.1, 60 sec: 42598.4, 300 sec: 41820.9). Total num frames: 4243456000. Throughput: 0: 41923.5. Samples: 511055180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:08,381][26367] Avg episode reward: [(0, '0.787')] [2024-06-19 02:23:10,425][26599] Updated weights for policy 0, policy_version 259004 (0.0045) [2024-06-19 02:23:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 4243668992. Throughput: 0: 41845.2. Samples: 511303960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:13,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 02:23:13,764][26599] Updated weights for policy 0, policy_version 259014 (0.0045) [2024-06-19 02:23:18,380][26367] Fps is (10 sec: 37683.1, 60 sec: 40962.5, 300 sec: 41709.8). Total num frames: 4243832832. Throughput: 0: 41797.8. Samples: 511427760. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:18,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 02:23:18,413][26599] Updated weights for policy 0, policy_version 259024 (0.0047) [2024-06-19 02:23:21,679][26599] Updated weights for policy 0, policy_version 259034 (0.0040) [2024-06-19 02:23:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4244078592. Throughput: 0: 41784.1. Samples: 511676620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:23,380][26367] Avg episode reward: [(0, '0.358')] [2024-06-19 02:23:26,351][26599] Updated weights for policy 0, policy_version 259044 (0.0032) [2024-06-19 02:23:28,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4244275200. Throughput: 0: 41986.9. Samples: 511931700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:28,381][26367] Avg episode reward: [(0, '0.384')] [2024-06-19 02:23:29,413][26599] Updated weights for policy 0, policy_version 259054 (0.0038) [2024-06-19 02:23:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4244488192. Throughput: 0: 41879.6. Samples: 512055040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:33,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 02:23:33,986][26599] Updated weights for policy 0, policy_version 259064 (0.0038) [2024-06-19 02:23:37,218][26599] Updated weights for policy 0, policy_version 259074 (0.0024) [2024-06-19 02:23:38,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4244717568. Throughput: 0: 42082.6. Samples: 512315300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:38,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 02:23:38,387][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000259078_4244733952.pth... 
[2024-06-19 02:23:38,435][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000258464_4234674176.pth [2024-06-19 02:23:41,640][26599] Updated weights for policy 0, policy_version 259084 (0.0033) [2024-06-19 02:23:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41508.7, 300 sec: 41765.3). Total num frames: 4244881408. Throughput: 0: 42205.3. Samples: 512572080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:43,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 02:23:44,958][26599] Updated weights for policy 0, policy_version 259094 (0.0033) [2024-06-19 02:23:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4245127168. Throughput: 0: 41851.4. Samples: 512686380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:48,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 02:23:49,603][26599] Updated weights for policy 0, policy_version 259104 (0.0041) [2024-06-19 02:23:52,663][26599] Updated weights for policy 0, policy_version 259114 (0.0036) [2024-06-19 02:23:53,380][26367] Fps is (10 sec: 47513.0, 60 sec: 42598.4, 300 sec: 41876.9). Total num frames: 4245356544. Throughput: 0: 41989.7. Samples: 512944720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:53,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 02:23:57,093][26599] Updated weights for policy 0, policy_version 259124 (0.0037) [2024-06-19 02:23:58,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41507.8, 300 sec: 41765.3). Total num frames: 4245520384. Throughput: 0: 42215.7. Samples: 513203660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:23:58,380][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 02:24:00,407][26599] Updated weights for policy 0, policy_version 259134 (0.0031) [2024-06-19 02:24:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 4245766144. Throughput: 0: 42122.6. Samples: 513323280. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:24:03,381][26367] Avg episode reward: [(0, '0.414')] [2024-06-19 02:24:04,655][26599] Updated weights for policy 0, policy_version 259144 (0.0042) [2024-06-19 02:24:08,278][26599] Updated weights for policy 0, policy_version 259154 (0.0030) [2024-06-19 02:24:08,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4245979136. Throughput: 0: 42358.1. Samples: 513582740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:24:08,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 02:24:12,736][26599] Updated weights for policy 0, policy_version 259164 (0.0039) [2024-06-19 02:24:13,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4246159360. Throughput: 0: 42109.8. Samples: 513826640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 02:24:13,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 02:24:16,040][26599] Updated weights for policy 0, policy_version 259174 (0.0046) [2024-06-19 02:24:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 41931.9). Total num frames: 4246405120. Throughput: 0: 42163.5. Samples: 513952400. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:24:18,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 02:24:20,377][26599] Updated weights for policy 0, policy_version 259184 (0.0039) [2024-06-19 02:24:23,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41765.8). Total num frames: 4246585344. Throughput: 0: 41984.1. Samples: 514204580. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:24:23,380][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 02:24:23,961][26599] Updated weights for policy 0, policy_version 259194 (0.0050) [2024-06-19 02:24:24,925][26579] Signal inference workers to stop experience collection... 
(7600 times) [2024-06-19 02:24:24,977][26599] InferenceWorker_p0-w0: stopping experience collection (7600 times) [2024-06-19 02:24:24,986][26579] Signal inference workers to resume experience collection... (7600 times) [2024-06-19 02:24:24,999][26599] InferenceWorker_p0-w0: resuming experience collection (7600 times) [2024-06-19 02:24:28,227][26599] Updated weights for policy 0, policy_version 259204 (0.0031) [2024-06-19 02:24:28,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4246798336. Throughput: 0: 41841.8. Samples: 514454960. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:24:28,380][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 02:24:31,993][26599] Updated weights for policy 0, policy_version 259214 (0.0037) [2024-06-19 02:24:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4247011328. Throughput: 0: 42066.3. Samples: 514579360. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:24:33,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 02:24:36,013][26599] Updated weights for policy 0, policy_version 259224 (0.0030) [2024-06-19 02:24:38,384][26367] Fps is (10 sec: 42582.7, 60 sec: 41776.7, 300 sec: 41820.3). Total num frames: 4247224320. Throughput: 0: 41938.9. Samples: 514832120. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:24:38,384][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 02:24:39,885][26599] Updated weights for policy 0, policy_version 259234 (0.0037) [2024-06-19 02:24:43,383][26367] Fps is (10 sec: 39311.7, 60 sec: 42050.5, 300 sec: 41765.0). Total num frames: 4247404544. Throughput: 0: 41847.8. Samples: 515086920. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:24:43,383][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 02:24:44,119][26599] Updated weights for policy 0, policy_version 259244 (0.0032) [2024-06-19 02:24:47,654][26599] Updated weights for policy 0, policy_version 259254 (0.0033) [2024-06-19 02:24:48,380][26367] Fps is (10 sec: 40974.7, 60 sec: 41779.2, 300 sec: 41876.6). Total num frames: 4247633920. Throughput: 0: 41896.0. Samples: 515208600. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:24:48,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 02:24:52,031][26599] Updated weights for policy 0, policy_version 259264 (0.0040) [2024-06-19 02:24:53,380][26367] Fps is (10 sec: 44248.5, 60 sec: 41506.3, 300 sec: 41876.4). Total num frames: 4247846912. Throughput: 0: 41716.6. Samples: 515459980. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:24:53,380][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 02:24:55,691][26599] Updated weights for policy 0, policy_version 259274 (0.0048) [2024-06-19 02:24:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4248043520. Throughput: 0: 41780.9. Samples: 515706780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:24:58,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 02:24:59,840][26599] Updated weights for policy 0, policy_version 259284 (0.0038) [2024-06-19 02:25:03,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 4248256512. Throughput: 0: 41747.5. Samples: 515831040. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:25:03,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 02:25:03,462][26599] Updated weights for policy 0, policy_version 259294 (0.0033) [2024-06-19 02:25:07,654][26599] Updated weights for policy 0, policy_version 259304 (0.0043) [2024-06-19 02:25:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41233.2, 300 sec: 41820.9). Total num frames: 4248453120. Throughput: 0: 41793.8. Samples: 516085300. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:25:08,380][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 02:25:11,171][26599] Updated weights for policy 0, policy_version 259314 (0.0053) [2024-06-19 02:25:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4248682496. Throughput: 0: 41756.0. Samples: 516333980. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:25:13,381][26367] Avg episode reward: [(0, '0.425')] [2024-06-19 02:25:15,313][26599] Updated weights for policy 0, policy_version 259324 (0.0045) [2024-06-19 02:25:18,380][26367] Fps is (10 sec: 44235.8, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 4248895488. Throughput: 0: 41784.3. Samples: 516459660. Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:25:18,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 02:25:19,379][26599] Updated weights for policy 0, policy_version 259334 (0.0028) [2024-06-19 02:25:23,123][26599] Updated weights for policy 0, policy_version 259344 (0.0040) [2024-06-19 02:25:23,384][26367] Fps is (10 sec: 40945.0, 60 sec: 41776.6, 300 sec: 41820.3). Total num frames: 4249092096. Throughput: 0: 41770.6. Samples: 516711800. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 22.0) [2024-06-19 02:25:23,384][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 02:25:27,449][26599] Updated weights for policy 0, policy_version 259354 (0.0033) [2024-06-19 02:25:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.1, 300 sec: 41876.9). Total num frames: 4249305088. Throughput: 0: 41634.3. Samples: 516960360. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:25:28,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 02:25:30,933][26599] Updated weights for policy 0, policy_version 259364 (0.0051) [2024-06-19 02:25:33,380][26367] Fps is (10 sec: 44253.5, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 4249534464. Throughput: 0: 41648.1. Samples: 517082760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:25:33,380][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 02:25:35,301][26599] Updated weights for policy 0, policy_version 259374 (0.0039) [2024-06-19 02:25:38,380][26367] Fps is (10 sec: 40959.1, 60 sec: 41508.5, 300 sec: 41765.3). Total num frames: 4249714688. Throughput: 0: 41647.7. Samples: 517334140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:25:38,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 02:25:38,395][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000259382_4249714688.pth... [2024-06-19 02:25:38,450][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000258770_4239687680.pth [2024-06-19 02:25:38,935][26599] Updated weights for policy 0, policy_version 259384 (0.0037) [2024-06-19 02:25:42,972][26599] Updated weights for policy 0, policy_version 259394 (0.0032) [2024-06-19 02:25:43,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41781.0, 300 sec: 41876.4). Total num frames: 4249911296. Throughput: 0: 41687.6. Samples: 517582720. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:25:43,380][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 02:25:44,638][26579] Signal inference workers to stop experience collection... (7650 times) [2024-06-19 02:25:44,684][26599] InferenceWorker_p0-w0: stopping experience collection (7650 times) [2024-06-19 02:25:44,688][26579] Signal inference workers to resume experience collection... (7650 times) [2024-06-19 02:25:44,695][26599] InferenceWorker_p0-w0: resuming experience collection (7650 times) [2024-06-19 02:25:46,739][26599] Updated weights for policy 0, policy_version 259404 (0.0034) [2024-06-19 02:25:48,380][26367] Fps is (10 sec: 44238.0, 60 sec: 42052.3, 300 sec: 41876.9). Total num frames: 4250157056. Throughput: 0: 41579.6. Samples: 517702120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:25:48,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 02:25:50,899][26599] Updated weights for policy 0, policy_version 259414 (0.0031) [2024-06-19 02:25:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41233.0, 300 sec: 41765.8). Total num frames: 4250320896. Throughput: 0: 41640.8. Samples: 517959140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:25:53,380][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 02:25:54,685][26599] Updated weights for policy 0, policy_version 259424 (0.0046) [2024-06-19 02:25:58,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4250550272. Throughput: 0: 41465.7. Samples: 518199940. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:25:58,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 02:25:59,050][26599] Updated weights for policy 0, policy_version 259434 (0.0033) [2024-06-19 02:26:02,587][26599] Updated weights for policy 0, policy_version 259444 (0.0037) [2024-06-19 02:26:03,384][26367] Fps is (10 sec: 45858.3, 60 sec: 42049.8, 300 sec: 41820.3). Total num frames: 4250779648. Throughput: 0: 41593.2. 
Samples: 518331500. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:26:03,385][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 02:26:06,883][26599] Updated weights for policy 0, policy_version 259454 (0.0051) [2024-06-19 02:26:08,381][26367] Fps is (10 sec: 37678.9, 60 sec: 41232.2, 300 sec: 41709.6). Total num frames: 4250927104. Throughput: 0: 41526.3. Samples: 518580380. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:26:08,382][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 02:26:10,133][26599] Updated weights for policy 0, policy_version 259464 (0.0034) [2024-06-19 02:26:13,384][26367] Fps is (10 sec: 39321.3, 60 sec: 41503.6, 300 sec: 41875.9). Total num frames: 4251172864. Throughput: 0: 41402.8. Samples: 518823640. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:26:13,385][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 02:26:15,181][26599] Updated weights for policy 0, policy_version 259474 (0.0038) [2024-06-19 02:26:17,866][26599] Updated weights for policy 0, policy_version 259484 (0.0053) [2024-06-19 02:26:18,380][26367] Fps is (10 sec: 47519.6, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4251402240. Throughput: 0: 41708.8. Samples: 518959660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:26:18,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 02:26:22,807][26599] Updated weights for policy 0, policy_version 259494 (0.0028) [2024-06-19 02:26:23,380][26367] Fps is (10 sec: 39336.1, 60 sec: 41235.6, 300 sec: 41820.9). Total num frames: 4251566080. Throughput: 0: 41668.6. Samples: 519209220. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:26:23,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 02:26:25,798][26599] Updated weights for policy 0, policy_version 259504 (0.0035) [2024-06-19 02:26:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4251811840. Throughput: 0: 41574.6. 
Samples: 519453580. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:26:28,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 02:26:30,340][26599] Updated weights for policy 0, policy_version 259514 (0.0035) [2024-06-19 02:26:33,380][26367] Fps is (10 sec: 45875.8, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 4252024832. Throughput: 0: 42000.1. Samples: 519592120. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:26:33,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 02:26:33,475][26599] Updated weights for policy 0, policy_version 259524 (0.0039) [2024-06-19 02:26:37,820][26599] Updated weights for policy 0, policy_version 259534 (0.0040) [2024-06-19 02:26:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.4, 300 sec: 41821.1). Total num frames: 4252221440. Throughput: 0: 41866.6. Samples: 519843140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 23.0) [2024-06-19 02:26:38,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 02:26:41,442][26599] Updated weights for policy 0, policy_version 259544 (0.0030) [2024-06-19 02:26:43,381][26367] Fps is (10 sec: 44234.2, 60 sec: 42598.0, 300 sec: 41876.3). Total num frames: 4252467200. Throughput: 0: 41872.9. Samples: 520084240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:26:43,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 02:26:45,516][26599] Updated weights for policy 0, policy_version 259554 (0.0039) [2024-06-19 02:26:47,794][26579] Signal inference workers to stop experience collection... (7700 times) [2024-06-19 02:26:47,847][26599] InferenceWorker_p0-w0: stopping experience collection (7700 times) [2024-06-19 02:26:47,847][26579] Signal inference workers to resume experience collection... (7700 times) [2024-06-19 02:26:47,863][26599] InferenceWorker_p0-w0: resuming experience collection (7700 times) [2024-06-19 02:26:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4252647424. 
Throughput: 0: 41984.3. Samples: 520220640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:26:48,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 02:26:49,069][26599] Updated weights for policy 0, policy_version 259564 (0.0031) [2024-06-19 02:26:53,380][26367] Fps is (10 sec: 37685.1, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 4252844032. Throughput: 0: 42046.0. Samples: 520472400. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:26:53,381][26367] Avg episode reward: [(0, '0.376')] [2024-06-19 02:26:53,616][26599] Updated weights for policy 0, policy_version 259574 (0.0026) [2024-06-19 02:26:56,753][26599] Updated weights for policy 0, policy_version 259584 (0.0041) [2024-06-19 02:26:58,383][26367] Fps is (10 sec: 44224.2, 60 sec: 42323.4, 300 sec: 41876.0). Total num frames: 4253089792. Throughput: 0: 42084.8. Samples: 520717420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:26:58,383][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 02:27:01,328][26599] Updated weights for policy 0, policy_version 259594 (0.0025) [2024-06-19 02:27:03,380][26367] Fps is (10 sec: 39321.8, 60 sec: 40962.5, 300 sec: 41820.9). Total num frames: 4253237248. Throughput: 0: 41950.2. Samples: 520847420. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:03,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 02:27:04,651][26599] Updated weights for policy 0, policy_version 259604 (0.0037) [2024-06-19 02:27:08,384][26367] Fps is (10 sec: 39318.7, 60 sec: 42596.7, 300 sec: 41875.9). Total num frames: 4253483008. Throughput: 0: 41901.1. Samples: 521094920. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:08,384][26367] Avg episode reward: [(0, '0.863')] [2024-06-19 02:27:08,918][26599] Updated weights for policy 0, policy_version 259614 (0.0034) [2024-06-19 02:27:12,559][26599] Updated weights for policy 0, policy_version 259624 (0.0034) [2024-06-19 02:27:13,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42055.0, 300 sec: 41765.8). Total num frames: 4253696000. Throughput: 0: 42091.7. Samples: 521347700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:13,380][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 02:27:16,663][26599] Updated weights for policy 0, policy_version 259634 (0.0044) [2024-06-19 02:27:18,383][26367] Fps is (10 sec: 39323.8, 60 sec: 41230.9, 300 sec: 41764.9). Total num frames: 4253876224. Throughput: 0: 41899.7. Samples: 521477740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:18,384][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 02:27:20,337][26599] Updated weights for policy 0, policy_version 259644 (0.0042) [2024-06-19 02:27:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 41876.4). Total num frames: 4254121984. Throughput: 0: 41886.3. Samples: 521728020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:23,380][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 02:27:24,424][26599] Updated weights for policy 0, policy_version 259654 (0.0038) [2024-06-19 02:27:28,046][26599] Updated weights for policy 0, policy_version 259664 (0.0046) [2024-06-19 02:27:28,380][26367] Fps is (10 sec: 45889.2, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4254334976. Throughput: 0: 42345.4. Samples: 521989760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:28,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 02:27:32,039][26599] Updated weights for policy 0, policy_version 259674 (0.0042) [2024-06-19 02:27:33,382][26367] Fps is (10 sec: 40950.9, 60 sec: 41777.6, 300 sec: 41876.1). 
Total num frames: 4254531584. Throughput: 0: 42082.9. Samples: 522114460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:33,383][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 02:27:35,935][26599] Updated weights for policy 0, policy_version 259684 (0.0034) [2024-06-19 02:27:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 41932.5). Total num frames: 4254760960. Throughput: 0: 42090.3. Samples: 522366460. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:38,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 02:27:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000259691_4254777344.pth... [2024-06-19 02:27:38,460][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000259078_4244733952.pth [2024-06-19 02:27:39,791][26599] Updated weights for policy 0, policy_version 259694 (0.0031) [2024-06-19 02:27:43,380][26367] Fps is (10 sec: 42607.6, 60 sec: 41506.5, 300 sec: 41931.9). Total num frames: 4254957568. Throughput: 0: 42329.4. Samples: 522622120. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:43,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 02:27:43,611][26599] Updated weights for policy 0, policy_version 259704 (0.0029) [2024-06-19 02:27:47,736][26599] Updated weights for policy 0, policy_version 259714 (0.0048) [2024-06-19 02:27:48,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4255170560. Throughput: 0: 42180.3. Samples: 522745540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:27:48,381][26367] Avg episode reward: [(0, '0.895')] [2024-06-19 02:27:51,443][26599] Updated weights for policy 0, policy_version 259724 (0.0035) [2024-06-19 02:27:53,381][26367] Fps is (10 sec: 44232.2, 60 sec: 42597.7, 300 sec: 41932.1). Total num frames: 4255399936. Throughput: 0: 42309.1. Samples: 522998720. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:27:53,382][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 02:27:55,485][26599] Updated weights for policy 0, policy_version 259734 (0.0038) [2024-06-19 02:27:58,380][26367] Fps is (10 sec: 42599.4, 60 sec: 41781.3, 300 sec: 41931.9). Total num frames: 4255596544. Throughput: 0: 42341.8. Samples: 523253080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:27:58,380][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 02:27:59,145][26599] Updated weights for policy 0, policy_version 259744 (0.0029) [2024-06-19 02:28:03,263][26579] Signal inference workers to stop experience collection... (7750 times) [2024-06-19 02:28:03,263][26579] Signal inference workers to resume experience collection... (7750 times) [2024-06-19 02:28:03,291][26599] InferenceWorker_p0-w0: stopping experience collection (7750 times) [2024-06-19 02:28:03,291][26599] InferenceWorker_p0-w0: resuming experience collection (7750 times) [2024-06-19 02:28:03,380][26367] Fps is (10 sec: 39326.0, 60 sec: 42598.4, 300 sec: 41820.9). Total num frames: 4255793152. Throughput: 0: 42223.4. Samples: 523377660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:03,380][26367] Avg episode reward: [(0, '0.376')] [2024-06-19 02:28:03,422][26599] Updated weights for policy 0, policy_version 259754 (0.0034) [2024-06-19 02:28:06,846][26599] Updated weights for policy 0, policy_version 259764 (0.0035) [2024-06-19 02:28:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42327.9, 300 sec: 41876.4). Total num frames: 4256022528. Throughput: 0: 42327.1. Samples: 523632740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:08,380][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 02:28:10,996][26599] Updated weights for policy 0, policy_version 259774 (0.0039) [2024-06-19 02:28:13,384][26367] Fps is (10 sec: 44220.0, 60 sec: 42322.6, 300 sec: 42042.5). Total num frames: 4256235520. Throughput: 0: 42057.4. 
Samples: 523882500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:13,385][26367] Avg episode reward: [(0, '0.372')] [2024-06-19 02:28:14,950][26599] Updated weights for policy 0, policy_version 259784 (0.0031) [2024-06-19 02:28:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42600.7, 300 sec: 41876.4). Total num frames: 4256432128. Throughput: 0: 42100.3. Samples: 524008880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:18,380][26367] Avg episode reward: [(0, '0.354')] [2024-06-19 02:28:18,695][26599] Updated weights for policy 0, policy_version 259794 (0.0024) [2024-06-19 02:28:22,796][26599] Updated weights for policy 0, policy_version 259804 (0.0033) [2024-06-19 02:28:23,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4256645120. Throughput: 0: 42115.1. Samples: 524261640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:23,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 02:28:26,526][26599] Updated weights for policy 0, policy_version 259814 (0.0039) [2024-06-19 02:28:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4256858112. Throughput: 0: 42057.8. Samples: 524514720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:28,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 02:28:30,605][26599] Updated weights for policy 0, policy_version 259824 (0.0029) [2024-06-19 02:28:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42326.9, 300 sec: 41876.4). Total num frames: 4257071104. Throughput: 0: 42112.6. Samples: 524640600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:33,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 02:28:34,232][26599] Updated weights for policy 0, policy_version 259834 (0.0033) [2024-06-19 02:28:38,186][26599] Updated weights for policy 0, policy_version 259844 (0.0036) [2024-06-19 02:28:38,383][26367] Fps is (10 sec: 42587.4, 60 sec: 42050.5, 300 sec: 42042.6). Total num frames: 4257284096. Throughput: 0: 41990.6. Samples: 524888360. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:38,383][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 02:28:42,187][26599] Updated weights for policy 0, policy_version 259854 (0.0038) [2024-06-19 02:28:43,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42049.7, 300 sec: 41875.9). Total num frames: 4257480704. Throughput: 0: 42002.3. Samples: 525143340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:43,385][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 02:28:45,855][26599] Updated weights for policy 0, policy_version 259864 (0.0040) [2024-06-19 02:28:48,380][26367] Fps is (10 sec: 40970.2, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 4257693696. Throughput: 0: 41959.5. Samples: 525265840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:48,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 02:28:49,827][26599] Updated weights for policy 0, policy_version 259874 (0.0028) [2024-06-19 02:28:53,384][26367] Fps is (10 sec: 44236.7, 60 sec: 42050.4, 300 sec: 42042.5). Total num frames: 4257923072. Throughput: 0: 42011.6. Samples: 525523420. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:53,384][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 02:28:53,985][26599] Updated weights for policy 0, policy_version 259884 (0.0043) [2024-06-19 02:28:57,456][26599] Updated weights for policy 0, policy_version 259894 (0.0048) [2024-06-19 02:28:58,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4258119680. Throughput: 0: 41991.5. Samples: 525771960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 02:28:58,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 02:29:01,723][26599] Updated weights for policy 0, policy_version 259904 (0.0032) [2024-06-19 02:29:03,380][26367] Fps is (10 sec: 40975.4, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4258332672. Throughput: 0: 42050.2. Samples: 525901140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:03,380][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 02:29:05,114][26599] Updated weights for policy 0, policy_version 259914 (0.0045) [2024-06-19 02:29:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4258529280. Throughput: 0: 42039.6. Samples: 526153420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:08,380][26367] Avg episode reward: [(0, '0.284')] [2024-06-19 02:29:09,374][26599] Updated weights for policy 0, policy_version 259924 (0.0039) [2024-06-19 02:29:13,010][26599] Updated weights for policy 0, policy_version 259934 (0.0030) [2024-06-19 02:29:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42054.9, 300 sec: 41876.4). Total num frames: 4258758656. Throughput: 0: 41708.0. Samples: 526391580. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:13,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 02:29:17,440][26599] Updated weights for policy 0, policy_version 259944 (0.0030) [2024-06-19 02:29:18,384][26367] Fps is (10 sec: 40944.7, 60 sec: 41776.6, 300 sec: 41875.9). Total num frames: 4258938880. Throughput: 0: 41773.4. Samples: 526520560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:18,385][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 02:29:20,407][26579] Signal inference workers to stop experience collection... (7800 times) [2024-06-19 02:29:20,408][26579] Signal inference workers to resume experience collection... (7800 times) [2024-06-19 02:29:20,428][26599] InferenceWorker_p0-w0: stopping experience collection (7800 times) [2024-06-19 02:29:20,428][26599] InferenceWorker_p0-w0: resuming experience collection (7800 times) [2024-06-19 02:29:20,910][26599] Updated weights for policy 0, policy_version 259954 (0.0037) [2024-06-19 02:29:23,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 4259135488. Throughput: 0: 41850.0. Samples: 526771500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:23,381][26367] Avg episode reward: [(0, '0.758')] [2024-06-19 02:29:25,143][26599] Updated weights for policy 0, policy_version 259964 (0.0036) [2024-06-19 02:29:28,380][26367] Fps is (10 sec: 44252.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4259381248. Throughput: 0: 41746.9. Samples: 527021800. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:28,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 02:29:28,783][26599] Updated weights for policy 0, policy_version 259974 (0.0026) [2024-06-19 02:29:33,154][26599] Updated weights for policy 0, policy_version 259984 (0.0038) [2024-06-19 02:29:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 41876.9). Total num frames: 4259577856. Throughput: 0: 41940.0. 
Samples: 527153140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:33,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 02:29:36,394][26599] Updated weights for policy 0, policy_version 259994 (0.0037) [2024-06-19 02:29:38,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41507.9, 300 sec: 41932.3). Total num frames: 4259774464. Throughput: 0: 41661.7. Samples: 527398040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:38,380][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 02:29:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000259996_4259774464.pth... [2024-06-19 02:29:38,486][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000259382_4249714688.pth [2024-06-19 02:29:40,866][26599] Updated weights for policy 0, policy_version 260004 (0.0028) [2024-06-19 02:29:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42054.8, 300 sec: 41931.9). Total num frames: 4260003840. Throughput: 0: 41846.2. Samples: 527655040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:43,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 02:29:44,187][26599] Updated weights for policy 0, policy_version 260014 (0.0043) [2024-06-19 02:29:48,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4260200448. Throughput: 0: 41835.0. Samples: 527783720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:48,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 02:29:48,865][26599] Updated weights for policy 0, policy_version 260024 (0.0035) [2024-06-19 02:29:51,835][26599] Updated weights for policy 0, policy_version 260034 (0.0031) [2024-06-19 02:29:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41508.7, 300 sec: 41931.9). Total num frames: 4260413440. Throughput: 0: 41644.0. Samples: 528027400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:53,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 02:29:56,946][26599] Updated weights for policy 0, policy_version 260044 (0.0033) [2024-06-19 02:29:58,383][26367] Fps is (10 sec: 42588.2, 60 sec: 41777.5, 300 sec: 41931.6). Total num frames: 4260626432. Throughput: 0: 42093.7. Samples: 528285900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:29:58,383][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 02:29:59,667][26599] Updated weights for policy 0, policy_version 260054 (0.0032) [2024-06-19 02:30:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4260823040. Throughput: 0: 41978.6. Samples: 528409440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:30:03,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 02:30:04,961][26599] Updated weights for policy 0, policy_version 260064 (0.0027) [2024-06-19 02:30:07,415][26599] Updated weights for policy 0, policy_version 260074 (0.0034) [2024-06-19 02:30:08,380][26367] Fps is (10 sec: 44247.4, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4261068800. Throughput: 0: 41855.5. Samples: 528655000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 02:30:08,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 02:30:12,594][26599] Updated weights for policy 0, policy_version 260084 (0.0038) [2024-06-19 02:30:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4261249024. Throughput: 0: 42009.4. Samples: 528912220. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:13,380][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 02:30:15,395][26599] Updated weights for policy 0, policy_version 260094 (0.0036) [2024-06-19 02:30:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42054.8, 300 sec: 41932.4). Total num frames: 4261462016. Throughput: 0: 41711.0. Samples: 529030140. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:18,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 02:30:20,323][26599] Updated weights for policy 0, policy_version 260104 (0.0046) [2024-06-19 02:30:23,381][26367] Fps is (10 sec: 44234.5, 60 sec: 42598.0, 300 sec: 41987.4). Total num frames: 4261691392. Throughput: 0: 41827.5. Samples: 529280300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:23,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 02:30:23,846][26599] Updated weights for policy 0, policy_version 260114 (0.0045) [2024-06-19 02:30:27,429][26579] Signal inference workers to stop experience collection... (7850 times) [2024-06-19 02:30:27,472][26599] InferenceWorker_p0-w0: stopping experience collection (7850 times) [2024-06-19 02:30:27,480][26579] Signal inference workers to resume experience collection... (7850 times) [2024-06-19 02:30:27,494][26599] InferenceWorker_p0-w0: resuming experience collection (7850 times) [2024-06-19 02:30:28,267][26599] Updated weights for policy 0, policy_version 260124 (0.0037) [2024-06-19 02:30:28,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41503.6, 300 sec: 41820.3). Total num frames: 4261871616. Throughput: 0: 41881.9. Samples: 529539880. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:28,384][26367] Avg episode reward: [(0, '0.802')] [2024-06-19 02:30:31,719][26599] Updated weights for policy 0, policy_version 260134 (0.0043) [2024-06-19 02:30:33,380][26367] Fps is (10 sec: 39322.9, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4262084608. Throughput: 0: 41635.9. Samples: 529657340. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:33,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 02:30:35,840][26599] Updated weights for policy 0, policy_version 260144 (0.0038) [2024-06-19 02:30:38,384][26367] Fps is (10 sec: 45875.3, 60 sec: 42595.8, 300 sec: 42098.0). Total num frames: 4262330368. Throughput: 0: 41958.3. 
Samples: 529915680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:38,384][26367] Avg episode reward: [(0, '0.763')] [2024-06-19 02:30:39,580][26599] Updated weights for policy 0, policy_version 260154 (0.0031) [2024-06-19 02:30:43,380][26367] Fps is (10 sec: 40961.1, 60 sec: 41506.2, 300 sec: 41820.9). Total num frames: 4262494208. Throughput: 0: 41754.8. Samples: 530164760. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:43,380][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 02:30:43,681][26599] Updated weights for policy 0, policy_version 260164 (0.0033) [2024-06-19 02:30:47,322][26599] Updated weights for policy 0, policy_version 260174 (0.0041) [2024-06-19 02:30:48,384][26367] Fps is (10 sec: 39321.3, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 4262723584. Throughput: 0: 41593.5. Samples: 530281300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:48,385][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 02:30:51,672][26599] Updated weights for policy 0, policy_version 260184 (0.0028) [2024-06-19 02:30:53,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4262936576. Throughput: 0: 41906.7. Samples: 530540800. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:53,381][26367] Avg episode reward: [(0, '0.350')] [2024-06-19 02:30:55,134][26599] Updated weights for policy 0, policy_version 260194 (0.0028) [2024-06-19 02:30:58,380][26367] Fps is (10 sec: 39336.6, 60 sec: 41507.9, 300 sec: 41821.4). Total num frames: 4263116800. Throughput: 0: 41788.9. Samples: 530792720. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:30:58,380][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 02:30:59,399][26599] Updated weights for policy 0, policy_version 260204 (0.0044) [2024-06-19 02:31:02,953][26599] Updated weights for policy 0, policy_version 260214 (0.0031) [2024-06-19 02:31:03,381][26367] Fps is (10 sec: 40959.1, 60 sec: 42052.1, 300 sec: 42098.7). Total num frames: 4263346176. Throughput: 0: 41831.4. Samples: 530912560. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:31:03,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 02:31:07,257][26599] Updated weights for policy 0, policy_version 260224 (0.0033) [2024-06-19 02:31:08,384][26367] Fps is (10 sec: 44220.3, 60 sec: 41503.6, 300 sec: 41987.5). Total num frames: 4263559168. Throughput: 0: 42094.8. Samples: 531174700. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:31:08,384][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 02:31:10,666][26599] Updated weights for policy 0, policy_version 260234 (0.0031) [2024-06-19 02:31:13,380][26367] Fps is (10 sec: 40961.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4263755776. Throughput: 0: 41906.6. Samples: 531425520. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:31:13,380][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 02:31:14,942][26599] Updated weights for policy 0, policy_version 260244 (0.0027) [2024-06-19 02:31:18,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 4263985152. Throughput: 0: 42011.3. Samples: 531547840. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 02:31:18,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 02:31:18,420][26599] Updated weights for policy 0, policy_version 260254 (0.0043) [2024-06-19 02:31:22,710][26599] Updated weights for policy 0, policy_version 260264 (0.0032) [2024-06-19 02:31:23,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41779.5, 300 sec: 41987.5). 
Total num frames: 4264198144. Throughput: 0: 42033.2. Samples: 531807020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:31:23,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 02:31:26,010][26599] Updated weights for policy 0, policy_version 260274 (0.0039) [2024-06-19 02:31:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42327.9, 300 sec: 41987.5). Total num frames: 4264411136. Throughput: 0: 42102.5. Samples: 532059380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:31:28,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 02:31:30,415][26599] Updated weights for policy 0, policy_version 260284 (0.0032) [2024-06-19 02:31:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 4264624128. Throughput: 0: 42370.2. Samples: 532187800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:31:33,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 02:31:33,702][26599] Updated weights for policy 0, policy_version 260294 (0.0049) [2024-06-19 02:31:38,132][26599] Updated weights for policy 0, policy_version 260304 (0.0047) [2024-06-19 02:31:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41781.7, 300 sec: 41932.0). Total num frames: 4264837120. Throughput: 0: 42289.4. Samples: 532443820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:31:38,381][26367] Avg episode reward: [(0, '0.377')] [2024-06-19 02:31:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000260305_4264837120.pth... [2024-06-19 02:31:38,472][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000259691_4254777344.pth [2024-06-19 02:31:41,410][26599] Updated weights for policy 0, policy_version 260314 (0.0040) [2024-06-19 02:31:43,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42322.7, 300 sec: 41987.0). Total num frames: 4265033728. Throughput: 0: 42108.9. Samples: 532687780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:31:43,384][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 02:31:45,819][26599] Updated weights for policy 0, policy_version 260324 (0.0031) [2024-06-19 02:31:48,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41781.8, 300 sec: 41987.5). Total num frames: 4265230336. Throughput: 0: 42234.9. Samples: 532813120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:31:48,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 02:31:49,541][26599] Updated weights for policy 0, policy_version 260334 (0.0037) [2024-06-19 02:31:51,140][26579] Signal inference workers to stop experience collection... (7900 times) [2024-06-19 02:31:51,141][26579] Signal inference workers to resume experience collection... (7900 times) [2024-06-19 02:31:51,165][26599] InferenceWorker_p0-w0: stopping experience collection (7900 times) [2024-06-19 02:31:51,166][26599] InferenceWorker_p0-w0: resuming experience collection (7900 times) [2024-06-19 02:31:53,384][26367] Fps is (10 sec: 42598.5, 60 sec: 42049.7, 300 sec: 41931.8). Total num frames: 4265459712. Throughput: 0: 42031.5. Samples: 533066120. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:31:53,385][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 02:31:53,601][26599] Updated weights for policy 0, policy_version 260344 (0.0025) [2024-06-19 02:31:57,510][26599] Updated weights for policy 0, policy_version 260354 (0.0037) [2024-06-19 02:31:58,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 4265656320. Throughput: 0: 41977.3. Samples: 533314500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:31:58,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 02:32:01,388][26599] Updated weights for policy 0, policy_version 260364 (0.0031) [2024-06-19 02:32:03,380][26367] Fps is (10 sec: 39336.0, 60 sec: 41779.4, 300 sec: 41932.5). Total num frames: 4265852928. Throughput: 0: 41993.3. 
Samples: 533437540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:32:03,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 02:32:05,566][26599] Updated weights for policy 0, policy_version 260374 (0.0028) [2024-06-19 02:32:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41781.7, 300 sec: 41931.9). Total num frames: 4266065920. Throughput: 0: 41891.6. Samples: 533692140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:32:08,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 02:32:09,097][26599] Updated weights for policy 0, policy_version 260384 (0.0031) [2024-06-19 02:32:13,206][26599] Updated weights for policy 0, policy_version 260394 (0.0033) [2024-06-19 02:32:13,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42099.0). Total num frames: 4266295296. Throughput: 0: 41931.7. Samples: 533946300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:32:13,380][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 02:32:17,049][26599] Updated weights for policy 0, policy_version 260404 (0.0038) [2024-06-19 02:32:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4266491904. Throughput: 0: 41765.8. Samples: 534067260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:32:18,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 02:32:21,365][26599] Updated weights for policy 0, policy_version 260414 (0.0033) [2024-06-19 02:32:23,384][26367] Fps is (10 sec: 42582.5, 60 sec: 42049.7, 300 sec: 41987.0). Total num frames: 4266721280. Throughput: 0: 41731.7. Samples: 534321900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:32:23,384][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 02:32:24,730][26599] Updated weights for policy 0, policy_version 260424 (0.0033) [2024-06-19 02:32:28,382][26367] Fps is (10 sec: 40951.3, 60 sec: 41504.7, 300 sec: 41931.9). Total num frames: 4266901504. Throughput: 0: 42006.8. 
Samples: 534578020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:32:28,383][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 02:32:29,123][26599] Updated weights for policy 0, policy_version 260434 (0.0039) [2024-06-19 02:32:32,546][26599] Updated weights for policy 0, policy_version 260444 (0.0049) [2024-06-19 02:32:33,380][26367] Fps is (10 sec: 40975.2, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4267130880. Throughput: 0: 41884.5. Samples: 534697920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 23.0) [2024-06-19 02:32:33,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 02:32:37,046][26599] Updated weights for policy 0, policy_version 260454 (0.0038) [2024-06-19 02:32:38,380][26367] Fps is (10 sec: 45884.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4267360256. Throughput: 0: 41992.8. Samples: 534955640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:32:38,380][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 02:32:40,625][26599] Updated weights for policy 0, policy_version 260464 (0.0032) [2024-06-19 02:32:43,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41508.7, 300 sec: 41876.4). Total num frames: 4267524096. Throughput: 0: 41924.5. Samples: 535201100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:32:43,380][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 02:32:44,955][26599] Updated weights for policy 0, policy_version 260474 (0.0038) [2024-06-19 02:32:48,224][26599] Updated weights for policy 0, policy_version 260484 (0.0038) [2024-06-19 02:32:48,384][26367] Fps is (10 sec: 40944.6, 60 sec: 42322.7, 300 sec: 41931.6). Total num frames: 4267769856. Throughput: 0: 41823.6. Samples: 535319760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:32:48,385][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 02:32:52,768][26599] Updated weights for policy 0, policy_version 260494 (0.0022) [2024-06-19 02:32:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41781.8, 300 sec: 41931.9). Total num frames: 4267966464. Throughput: 0: 41904.0. Samples: 535577820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:32:53,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 02:32:55,735][26599] Updated weights for policy 0, policy_version 260504 (0.0030) [2024-06-19 02:32:58,384][26367] Fps is (10 sec: 37683.6, 60 sec: 41503.6, 300 sec: 41875.9). Total num frames: 4268146688. Throughput: 0: 41739.7. Samples: 535824740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:32:58,393][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 02:33:00,531][26599] Updated weights for policy 0, policy_version 260514 (0.0033) [2024-06-19 02:33:03,363][26599] Updated weights for policy 0, policy_version 260524 (0.0030) [2024-06-19 02:33:03,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42043.0). Total num frames: 4268425216. Throughput: 0: 41837.3. Samples: 535949940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:33:03,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 02:33:08,380][26367] Fps is (10 sec: 44252.3, 60 sec: 42052.2, 300 sec: 41876.9). Total num frames: 4268589056. Throughput: 0: 41747.3. Samples: 536200380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:33:08,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 02:33:08,388][26599] Updated weights for policy 0, policy_version 260534 (0.0043) [2024-06-19 02:33:09,323][26579] Signal inference workers to stop experience collection... (7950 times) [2024-06-19 02:33:09,324][26579] Signal inference workers to resume experience collection... 
(7950 times) [2024-06-19 02:33:09,351][26599] InferenceWorker_p0-w0: stopping experience collection (7950 times) [2024-06-19 02:33:09,351][26599] InferenceWorker_p0-w0: resuming experience collection (7950 times) [2024-06-19 02:33:11,218][26599] Updated weights for policy 0, policy_version 260544 (0.0048) [2024-06-19 02:33:13,380][26367] Fps is (10 sec: 34406.8, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 4268769280. Throughput: 0: 41798.5. Samples: 536458860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:33:13,380][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 02:33:16,399][26599] Updated weights for policy 0, policy_version 260554 (0.0030) [2024-06-19 02:33:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4269015040. Throughput: 0: 41830.7. Samples: 536580300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:33:18,380][26367] Avg episode reward: [(0, '0.320')] [2024-06-19 02:33:18,975][26599] Updated weights for policy 0, policy_version 260564 (0.0029) [2024-06-19 02:33:23,382][26367] Fps is (10 sec: 44227.6, 60 sec: 41507.3, 300 sec: 41876.1). Total num frames: 4269211648. Throughput: 0: 41793.2. Samples: 536836420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:33:23,383][26367] Avg episode reward: [(0, '0.266')] [2024-06-19 02:33:24,001][26599] Updated weights for policy 0, policy_version 260574 (0.0038) [2024-06-19 02:33:26,868][26599] Updated weights for policy 0, policy_version 260584 (0.0048) [2024-06-19 02:33:28,384][26367] Fps is (10 sec: 39307.0, 60 sec: 41778.1, 300 sec: 41820.3). Total num frames: 4269408256. Throughput: 0: 41772.5. Samples: 537081020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:33:28,384][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 02:33:31,738][26599] Updated weights for policy 0, policy_version 260594 (0.0039) [2024-06-19 02:33:33,380][26367] Fps is (10 sec: 44245.2, 60 sec: 42052.2, 300 sec: 41932.3). Total num frames: 4269654016. Throughput: 0: 41964.3. Samples: 537208000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:33:33,381][26367] Avg episode reward: [(0, '0.303')] [2024-06-19 02:33:34,886][26599] Updated weights for policy 0, policy_version 260604 (0.0041) [2024-06-19 02:33:38,380][26367] Fps is (10 sec: 42614.2, 60 sec: 41233.1, 300 sec: 41876.9). Total num frames: 4269834240. Throughput: 0: 41952.0. Samples: 537465660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:33:38,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 02:33:38,430][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000260611_4269850624.pth... [2024-06-19 02:33:38,484][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000259996_4259774464.pth [2024-06-19 02:33:39,664][26599] Updated weights for policy 0, policy_version 260614 (0.0042) [2024-06-19 02:33:42,643][26599] Updated weights for policy 0, policy_version 260624 (0.0036) [2024-06-19 02:33:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 4270063616. Throughput: 0: 41864.1. Samples: 537708480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 02:33:43,381][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 02:33:47,398][26599] Updated weights for policy 0, policy_version 260634 (0.0024) [2024-06-19 02:33:48,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42054.8, 300 sec: 41932.4). Total num frames: 4270292992. Throughput: 0: 41952.8. Samples: 537837820. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:33:48,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 02:33:50,587][26599] Updated weights for policy 0, policy_version 260644 (0.0034) [2024-06-19 02:33:53,384][26367] Fps is (10 sec: 39307.9, 60 sec: 41503.6, 300 sec: 41820.3). Total num frames: 4270456832. Throughput: 0: 42062.9. Samples: 538093360. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:33:53,385][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 02:33:55,134][26599] Updated weights for policy 0, policy_version 260654 (0.0028) [2024-06-19 02:33:58,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42600.9, 300 sec: 41931.9). Total num frames: 4270702592. Throughput: 0: 41609.1. Samples: 538331280. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:33:58,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 02:33:59,052][26599] Updated weights for policy 0, policy_version 260664 (0.0022) [2024-06-19 02:34:02,707][26599] Updated weights for policy 0, policy_version 260674 (0.0031) [2024-06-19 02:34:03,380][26367] Fps is (10 sec: 45891.8, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4270915584. Throughput: 0: 41914.2. Samples: 538466440. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:03,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 02:34:07,250][26599] Updated weights for policy 0, policy_version 260684 (0.0035) [2024-06-19 02:34:08,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4271079424. Throughput: 0: 41779.5. Samples: 538716420. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:08,381][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 02:34:10,339][26599] Updated weights for policy 0, policy_version 260694 (0.0042) [2024-06-19 02:34:13,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 41932.5). Total num frames: 4271308800. Throughput: 0: 41736.8. Samples: 538959020. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:13,380][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 02:34:15,579][26599] Updated weights for policy 0, policy_version 260704 (0.0042) [2024-06-19 02:34:18,224][26599] Updated weights for policy 0, policy_version 260714 (0.0043) [2024-06-19 02:34:18,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4271538176. Throughput: 0: 41969.8. Samples: 539096640. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:18,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 02:34:23,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41234.5, 300 sec: 41709.8). Total num frames: 4271685632. Throughput: 0: 41590.3. Samples: 539337220. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:23,380][26367] Avg episode reward: [(0, '0.362')] [2024-06-19 02:34:23,496][26599] Updated weights for policy 0, policy_version 260724 (0.0033) [2024-06-19 02:34:24,307][26579] Signal inference workers to stop experience collection... (8000 times) [2024-06-19 02:34:24,330][26599] InferenceWorker_p0-w0: stopping experience collection (8000 times) [2024-06-19 02:34:24,362][26579] Signal inference workers to resume experience collection... (8000 times) [2024-06-19 02:34:24,363][26599] InferenceWorker_p0-w0: resuming experience collection (8000 times) [2024-06-19 02:34:26,331][26599] Updated weights for policy 0, policy_version 260734 (0.0042) [2024-06-19 02:34:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42601.0, 300 sec: 41987.5). Total num frames: 4271964160. Throughput: 0: 41452.1. Samples: 539573820. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:28,381][26367] Avg episode reward: [(0, '0.362')] [2024-06-19 02:34:31,347][26599] Updated weights for policy 0, policy_version 260744 (0.0047) [2024-06-19 02:34:33,380][26367] Fps is (10 sec: 45874.9, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4272144384. Throughput: 0: 41776.6. 
Samples: 539717760. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:33,381][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 02:34:34,233][26599] Updated weights for policy 0, policy_version 260754 (0.0049) [2024-06-19 02:34:38,380][26367] Fps is (10 sec: 36044.4, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4272324608. Throughput: 0: 41474.8. Samples: 539959580. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:38,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 02:34:39,071][26599] Updated weights for policy 0, policy_version 260764 (0.0038) [2024-06-19 02:34:42,317][26599] Updated weights for policy 0, policy_version 260774 (0.0034) [2024-06-19 02:34:43,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42598.5, 300 sec: 42098.5). Total num frames: 4272619520. Throughput: 0: 41661.0. Samples: 540206020. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:43,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 02:34:47,109][26599] Updated weights for policy 0, policy_version 260784 (0.0044) [2024-06-19 02:34:48,380][26367] Fps is (10 sec: 44237.4, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 4272766976. Throughput: 0: 41740.9. Samples: 540344780. Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:48,380][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 02:34:49,993][26599] Updated weights for policy 0, policy_version 260794 (0.0034) [2024-06-19 02:34:53,380][26367] Fps is (10 sec: 32768.2, 60 sec: 41508.7, 300 sec: 41765.7). Total num frames: 4272947200. Throughput: 0: 41389.0. Samples: 540578920. 
Policy #0 lag: (min: 0.0, avg: 7.7, max: 20.0) [2024-06-19 02:34:53,380][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 02:34:54,784][26599] Updated weights for policy 0, policy_version 260804 (0.0040) [2024-06-19 02:34:57,777][26599] Updated weights for policy 0, policy_version 260814 (0.0028) [2024-06-19 02:34:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4273209344. Throughput: 0: 41663.5. Samples: 540833880. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:34:58,381][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 02:35:02,684][26599] Updated weights for policy 0, policy_version 260824 (0.0036) [2024-06-19 02:35:03,380][26367] Fps is (10 sec: 42598.6, 60 sec: 40960.1, 300 sec: 41709.8). Total num frames: 4273373184. Throughput: 0: 41613.9. Samples: 540969260. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:03,380][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 02:35:05,530][26599] Updated weights for policy 0, policy_version 260834 (0.0034) [2024-06-19 02:35:08,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4273602560. Throughput: 0: 41588.8. Samples: 541208720. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:08,381][26367] Avg episode reward: [(0, '0.775')] [2024-06-19 02:35:10,433][26599] Updated weights for policy 0, policy_version 260844 (0.0033) [2024-06-19 02:35:13,143][26599] Updated weights for policy 0, policy_version 260854 (0.0030) [2024-06-19 02:35:13,384][26367] Fps is (10 sec: 45858.0, 60 sec: 42049.7, 300 sec: 41931.4). Total num frames: 4273831936. Throughput: 0: 41935.2. Samples: 541461060. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:13,384][26367] Avg episode reward: [(0, '0.782')] [2024-06-19 02:35:16,843][26579] Signal inference workers to stop experience collection... 
(8050 times) [2024-06-19 02:35:16,843][26579] Signal inference workers to resume experience collection... (8050 times) [2024-06-19 02:35:16,887][26599] InferenceWorker_p0-w0: stopping experience collection (8050 times) [2024-06-19 02:35:16,888][26599] InferenceWorker_p0-w0: resuming experience collection (8050 times) [2024-06-19 02:35:18,128][26599] Updated weights for policy 0, policy_version 260864 (0.0031) [2024-06-19 02:35:18,380][26367] Fps is (10 sec: 39321.2, 60 sec: 40960.0, 300 sec: 41709.8). Total num frames: 4273995776. Throughput: 0: 41604.4. Samples: 541589960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:18,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 02:35:20,925][26599] Updated weights for policy 0, policy_version 260874 (0.0054) [2024-06-19 02:35:23,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42598.3, 300 sec: 41932.4). Total num frames: 4274241536. Throughput: 0: 41647.1. Samples: 541833700. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:23,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 02:35:26,117][26599] Updated weights for policy 0, policy_version 260884 (0.0041) [2024-06-19 02:35:28,384][26367] Fps is (10 sec: 45859.3, 60 sec: 41503.7, 300 sec: 41931.5). Total num frames: 4274454528. Throughput: 0: 42010.1. Samples: 542096620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:28,384][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 02:35:28,644][26599] Updated weights for policy 0, policy_version 260894 (0.0037) [2024-06-19 02:35:33,380][26367] Fps is (10 sec: 37683.7, 60 sec: 41233.1, 300 sec: 41654.8). Total num frames: 4274618368. Throughput: 0: 41604.5. Samples: 542216980. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:33,380][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 02:35:34,222][26599] Updated weights for policy 0, policy_version 260904 (0.0028) [2024-06-19 02:35:36,393][26599] Updated weights for policy 0, policy_version 260914 (0.0039) [2024-06-19 02:35:38,380][26367] Fps is (10 sec: 44252.6, 60 sec: 42871.6, 300 sec: 42043.0). Total num frames: 4274896896. Throughput: 0: 41908.9. Samples: 542464820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:38,380][26367] Avg episode reward: [(0, '0.219')] [2024-06-19 02:35:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000260919_4274896896.pth... [2024-06-19 02:35:38,437][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000260305_4264837120.pth [2024-06-19 02:35:41,857][26599] Updated weights for policy 0, policy_version 260924 (0.0036) [2024-06-19 02:35:43,380][26367] Fps is (10 sec: 44235.9, 60 sec: 40686.8, 300 sec: 41821.4). Total num frames: 4275060736. Throughput: 0: 42037.6. Samples: 542725580. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:43,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 02:35:44,448][26599] Updated weights for policy 0, policy_version 260934 (0.0023) [2024-06-19 02:35:48,380][26367] Fps is (10 sec: 34406.3, 60 sec: 41233.1, 300 sec: 41709.8). Total num frames: 4275240960. Throughput: 0: 41657.7. Samples: 542843860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:48,380][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 02:35:49,501][26599] Updated weights for policy 0, policy_version 260944 (0.0028) [2024-06-19 02:35:52,314][26599] Updated weights for policy 0, policy_version 260954 (0.0033) [2024-06-19 02:35:53,384][26367] Fps is (10 sec: 44221.4, 60 sec: 42595.8, 300 sec: 41986.9). Total num frames: 4275503104. Throughput: 0: 41915.7. Samples: 543095080. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:53,384][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 02:35:57,233][26599] Updated weights for policy 0, policy_version 260964 (0.0034) [2024-06-19 02:35:58,384][26367] Fps is (10 sec: 44220.6, 60 sec: 41230.5, 300 sec: 41820.4). Total num frames: 4275683328. Throughput: 0: 42104.9. Samples: 543355780. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:35:58,392][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 02:36:00,079][26599] Updated weights for policy 0, policy_version 260974 (0.0040) [2024-06-19 02:36:03,384][26367] Fps is (10 sec: 39321.5, 60 sec: 42049.6, 300 sec: 41820.9). Total num frames: 4275896320. Throughput: 0: 41838.9. Samples: 543472860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 19.0) [2024-06-19 02:36:03,384][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 02:36:05,018][26599] Updated weights for policy 0, policy_version 260984 (0.0025) [2024-06-19 02:36:07,764][26599] Updated weights for policy 0, policy_version 260994 (0.0041) [2024-06-19 02:36:08,380][26367] Fps is (10 sec: 45891.5, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 4276142080. Throughput: 0: 42228.0. Samples: 543733960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:08,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 02:36:12,707][26599] Updated weights for policy 0, policy_version 261004 (0.0029) [2024-06-19 02:36:13,380][26367] Fps is (10 sec: 42613.4, 60 sec: 41508.6, 300 sec: 41820.8). Total num frames: 4276322304. Throughput: 0: 42239.6. Samples: 543997260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:13,381][26367] Avg episode reward: [(0, '0.425')] [2024-06-19 02:36:15,368][26599] Updated weights for policy 0, policy_version 261014 (0.0038) [2024-06-19 02:36:18,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 41820.9). Total num frames: 4276535296. Throughput: 0: 42122.7. Samples: 544112500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:18,380][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 02:36:20,589][26599] Updated weights for policy 0, policy_version 261024 (0.0039) [2024-06-19 02:36:23,123][26599] Updated weights for policy 0, policy_version 261034 (0.0036) [2024-06-19 02:36:23,380][26367] Fps is (10 sec: 45876.3, 60 sec: 42325.5, 300 sec: 41932.0). Total num frames: 4276781056. Throughput: 0: 42203.6. Samples: 544363980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:23,380][26367] Avg episode reward: [(0, '0.271')] [2024-06-19 02:36:28,168][26599] Updated weights for policy 0, policy_version 261044 (0.0032) [2024-06-19 02:36:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41508.5, 300 sec: 41765.3). Total num frames: 4276944896. Throughput: 0: 42196.1. Samples: 544624400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:28,381][26367] Avg episode reward: [(0, '0.296')] [2024-06-19 02:36:28,713][26579] Signal inference workers to stop experience collection... (8100 times) [2024-06-19 02:36:28,756][26599] InferenceWorker_p0-w0: stopping experience collection (8100 times) [2024-06-19 02:36:28,833][26579] Signal inference workers to resume experience collection... (8100 times) [2024-06-19 02:36:28,833][26599] InferenceWorker_p0-w0: resuming experience collection (8100 times) [2024-06-19 02:36:30,770][26599] Updated weights for policy 0, policy_version 261054 (0.0044) [2024-06-19 02:36:33,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 41820.8). Total num frames: 4277174272. Throughput: 0: 42069.3. Samples: 544736980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:33,381][26367] Avg episode reward: [(0, '0.296')] [2024-06-19 02:36:35,931][26599] Updated weights for policy 0, policy_version 261064 (0.0044) [2024-06-19 02:36:38,380][26367] Fps is (10 sec: 47513.9, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 4277420032. Throughput: 0: 42345.7. 
Samples: 545000480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:38,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 02:36:38,523][26599] Updated weights for policy 0, policy_version 261074 (0.0035) [2024-06-19 02:36:43,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41779.4, 300 sec: 41820.9). Total num frames: 4277567488. Throughput: 0: 42234.6. Samples: 545256180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:43,380][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 02:36:43,661][26599] Updated weights for policy 0, policy_version 261084 (0.0039) [2024-06-19 02:36:46,598][26599] Updated weights for policy 0, policy_version 261094 (0.0031) [2024-06-19 02:36:48,380][26367] Fps is (10 sec: 37682.6, 60 sec: 42598.3, 300 sec: 41821.4). Total num frames: 4277796864. Throughput: 0: 42162.0. Samples: 545370000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:48,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 02:36:51,617][26599] Updated weights for policy 0, policy_version 261104 (0.0046) [2024-06-19 02:36:53,384][26367] Fps is (10 sec: 47495.6, 60 sec: 42325.3, 300 sec: 41986.9). Total num frames: 4278042624. Throughput: 0: 42170.4. Samples: 545631780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:53,385][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 02:36:54,726][26599] Updated weights for policy 0, policy_version 261114 (0.0038) [2024-06-19 02:36:58,380][26367] Fps is (10 sec: 39322.4, 60 sec: 41781.8, 300 sec: 41820.9). Total num frames: 4278190080. Throughput: 0: 41808.2. Samples: 545878620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:36:58,380][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 02:36:59,425][26599] Updated weights for policy 0, policy_version 261124 (0.0044) [2024-06-19 02:37:02,816][26599] Updated weights for policy 0, policy_version 261134 (0.0029) [2024-06-19 02:37:03,380][26367] Fps is (10 sec: 39336.3, 60 sec: 42328.0, 300 sec: 41931.9). Total num frames: 4278435840. Throughput: 0: 41838.7. Samples: 545995240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:37:03,380][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 02:37:07,064][26599] Updated weights for policy 0, policy_version 261144 (0.0035) [2024-06-19 02:37:08,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42052.4, 300 sec: 41931.9). Total num frames: 4278665216. Throughput: 0: 42135.9. Samples: 546260100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:37:08,380][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 02:37:10,512][26599] Updated weights for policy 0, policy_version 261154 (0.0029) [2024-06-19 02:37:13,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4278829056. Throughput: 0: 41932.5. Samples: 546511360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 02:37:13,380][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 02:37:14,938][26599] Updated weights for policy 0, policy_version 261164 (0.0024) [2024-06-19 02:37:18,208][26599] Updated weights for policy 0, policy_version 261174 (0.0033) [2024-06-19 02:37:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41876.9). Total num frames: 4279074816. Throughput: 0: 42043.6. Samples: 546628940. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:37:18,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 02:37:22,769][26599] Updated weights for policy 0, policy_version 261184 (0.0047) [2024-06-19 02:37:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41506.1, 300 sec: 41932.2). Total num frames: 4279271424. Throughput: 0: 41992.0. Samples: 546890120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:37:23,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 02:37:25,820][26599] Updated weights for policy 0, policy_version 261194 (0.0038) [2024-06-19 02:37:28,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4279451648. Throughput: 0: 41828.8. Samples: 547138480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:37:28,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 02:37:30,353][26599] Updated weights for policy 0, policy_version 261204 (0.0029) [2024-06-19 02:37:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 41820.9). Total num frames: 4279697408. Throughput: 0: 42104.6. Samples: 547264700. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:37:33,380][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 02:37:33,847][26599] Updated weights for policy 0, policy_version 261214 (0.0024) [2024-06-19 02:37:37,963][26599] Updated weights for policy 0, policy_version 261224 (0.0034) [2024-06-19 02:37:38,380][26367] Fps is (10 sec: 45875.5, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4279910400. Throughput: 0: 42005.7. Samples: 547521880. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:37:38,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 02:37:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000261225_4279910400.pth... 
[2024-06-19 02:37:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000260611_4269850624.pth [2024-06-19 02:37:39,366][26579] Signal inference workers to stop experience collection... (8150 times) [2024-06-19 02:37:39,366][26579] Signal inference workers to resume experience collection... (8150 times) [2024-06-19 02:37:39,407][26599] InferenceWorker_p0-w0: stopping experience collection (8150 times) [2024-06-19 02:37:39,407][26599] InferenceWorker_p0-w0: resuming experience collection (8150 times) [2024-06-19 02:37:41,685][26599] Updated weights for policy 0, policy_version 261234 (0.0033) [2024-06-19 02:37:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 41821.4). Total num frames: 4280107008. Throughput: 0: 42010.6. Samples: 547769100. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:37:43,380][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 02:37:45,787][26599] Updated weights for policy 0, policy_version 261244 (0.0030) [2024-06-19 02:37:48,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 41987.5). Total num frames: 4280352768. Throughput: 0: 42184.0. Samples: 547893520. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:37:48,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 02:37:49,615][26599] Updated weights for policy 0, policy_version 261254 (0.0034) [2024-06-19 02:37:53,365][26599] Updated weights for policy 0, policy_version 261264 (0.0035) [2024-06-19 02:37:53,380][26367] Fps is (10 sec: 44236.1, 60 sec: 41781.7, 300 sec: 42043.5). Total num frames: 4280549376. Throughput: 0: 42062.5. Samples: 548152920. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:37:53,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 02:37:57,687][26599] Updated weights for policy 0, policy_version 261274 (0.0033) [2024-06-19 02:37:58,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 41765.3). Total num frames: 4280745984. 
Throughput: 0: 41925.7. Samples: 548398020. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:37:58,381][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 02:38:01,302][26599] Updated weights for policy 0, policy_version 261284 (0.0041) [2024-06-19 02:38:03,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4280958976. Throughput: 0: 42161.3. Samples: 548526200. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:38:03,381][26367] Avg episode reward: [(0, '0.295')] [2024-06-19 02:38:05,351][26599] Updated weights for policy 0, policy_version 261294 (0.0033) [2024-06-19 02:38:08,384][26367] Fps is (10 sec: 40944.8, 60 sec: 41503.5, 300 sec: 41986.9). Total num frames: 4281155584. Throughput: 0: 41907.6. Samples: 548776120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:38:08,385][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 02:38:09,426][26599] Updated weights for policy 0, policy_version 261304 (0.0030) [2024-06-19 02:38:13,055][26599] Updated weights for policy 0, policy_version 261314 (0.0033) [2024-06-19 02:38:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4281368576. Throughput: 0: 41910.3. Samples: 549024440. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:38:13,381][26367] Avg episode reward: [(0, '0.882')] [2024-06-19 02:38:16,993][26599] Updated weights for policy 0, policy_version 261324 (0.0034) [2024-06-19 02:38:18,384][26367] Fps is (10 sec: 44236.9, 60 sec: 42049.7, 300 sec: 41987.2). Total num frames: 4281597952. Throughput: 0: 42000.9. Samples: 549154900. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:38:18,385][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 02:38:20,740][26599] Updated weights for policy 0, policy_version 261334 (0.0041) [2024-06-19 02:38:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 41988.0). Total num frames: 4281794560. 
Throughput: 0: 41919.5. Samples: 549408260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:38:23,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 02:38:24,739][26599] Updated weights for policy 0, policy_version 261344 (0.0026) [2024-06-19 02:38:28,381][26367] Fps is (10 sec: 40971.3, 60 sec: 42597.8, 300 sec: 41876.3). Total num frames: 4282007552. Throughput: 0: 41893.8. Samples: 549654360. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 02:38:28,382][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 02:38:29,000][26599] Updated weights for policy 0, policy_version 261354 (0.0032) [2024-06-19 02:38:32,516][26599] Updated weights for policy 0, policy_version 261364 (0.0048) [2024-06-19 02:38:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4282236928. Throughput: 0: 41872.9. Samples: 549777800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:38:33,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 02:38:36,700][26599] Updated weights for policy 0, policy_version 261374 (0.0054) [2024-06-19 02:38:38,380][26367] Fps is (10 sec: 40964.3, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 4282417152. Throughput: 0: 41864.7. Samples: 550036820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:38:38,380][26367] Avg episode reward: [(0, '0.387')] [2024-06-19 02:38:40,343][26599] Updated weights for policy 0, policy_version 261384 (0.0042) [2024-06-19 02:38:43,384][26367] Fps is (10 sec: 37669.4, 60 sec: 41776.6, 300 sec: 41764.8). Total num frames: 4282613760. Throughput: 0: 42006.4. Samples: 550288460. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:38:43,385][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 02:38:44,391][26599] Updated weights for policy 0, policy_version 261394 (0.0033) [2024-06-19 02:38:48,076][26599] Updated weights for policy 0, policy_version 261404 (0.0039) [2024-06-19 02:38:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.3, 300 sec: 42043.5). Total num frames: 4282859520. Throughput: 0: 42012.6. Samples: 550416760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:38:48,380][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 02:38:52,145][26599] Updated weights for policy 0, policy_version 261414 (0.0047) [2024-06-19 02:38:53,380][26367] Fps is (10 sec: 44253.0, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 4283056128. Throughput: 0: 42144.4. Samples: 550672460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:38:53,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 02:38:55,726][26599] Updated weights for policy 0, policy_version 261424 (0.0035) [2024-06-19 02:38:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4283269120. Throughput: 0: 42164.0. Samples: 550921820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:38:58,380][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 02:39:00,231][26599] Updated weights for policy 0, policy_version 261434 (0.0041) [2024-06-19 02:39:00,236][26579] Signal inference workers to stop experience collection... (8200 times) [2024-06-19 02:39:00,236][26579] Signal inference workers to resume experience collection... (8200 times) [2024-06-19 02:39:00,286][26599] InferenceWorker_p0-w0: stopping experience collection (8200 times) [2024-06-19 02:39:00,286][26599] InferenceWorker_p0-w0: resuming experience collection (8200 times) [2024-06-19 02:39:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 4283482112. Throughput: 0: 42165.8. 
Samples: 551052200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:39:03,380][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 02:39:03,402][26599] Updated weights for policy 0, policy_version 261444 (0.0039) [2024-06-19 02:39:07,795][26599] Updated weights for policy 0, policy_version 261454 (0.0038) [2024-06-19 02:39:08,384][26367] Fps is (10 sec: 40944.7, 60 sec: 42052.3, 300 sec: 41931.4). Total num frames: 4283678720. Throughput: 0: 42179.3. Samples: 551306480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:39:08,384][26367] Avg episode reward: [(0, '0.334')] [2024-06-19 02:39:11,290][26599] Updated weights for policy 0, policy_version 261464 (0.0038) [2024-06-19 02:39:13,384][26367] Fps is (10 sec: 42582.5, 60 sec: 42322.8, 300 sec: 41931.4). Total num frames: 4283908096. Throughput: 0: 42195.3. Samples: 551553260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:39:13,384][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 02:39:15,565][26599] Updated weights for policy 0, policy_version 261474 (0.0037) [2024-06-19 02:39:18,380][26367] Fps is (10 sec: 42614.4, 60 sec: 41781.8, 300 sec: 42098.5). Total num frames: 4284104704. Throughput: 0: 42421.0. Samples: 551686740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:39:18,380][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 02:39:18,980][26599] Updated weights for policy 0, policy_version 261484 (0.0027) [2024-06-19 02:39:23,384][26367] Fps is (10 sec: 39321.5, 60 sec: 41776.7, 300 sec: 41820.3). Total num frames: 4284301312. Throughput: 0: 42076.0. Samples: 551930400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:39:23,384][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 02:39:23,666][26599] Updated weights for policy 0, policy_version 261494 (0.0031) [2024-06-19 02:39:26,607][26599] Updated weights for policy 0, policy_version 261504 (0.0030) [2024-06-19 02:39:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42326.0, 300 sec: 42043.0). Total num frames: 4284547072. Throughput: 0: 42137.2. Samples: 552184480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:39:28,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 02:39:31,633][26599] Updated weights for policy 0, policy_version 261514 (0.0047) [2024-06-19 02:39:33,380][26367] Fps is (10 sec: 42614.2, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4284727296. Throughput: 0: 42151.0. Samples: 552313560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:39:33,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 02:39:34,497][26599] Updated weights for policy 0, policy_version 261524 (0.0042) [2024-06-19 02:39:38,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 41765.3). Total num frames: 4284940288. Throughput: 0: 41872.5. Samples: 552556720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:39:38,380][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 02:39:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000261532_4284940288.pth... [2024-06-19 02:39:38,446][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000260919_4274896896.pth [2024-06-19 02:39:39,339][26599] Updated weights for policy 0, policy_version 261534 (0.0052) [2024-06-19 02:39:42,259][26599] Updated weights for policy 0, policy_version 261544 (0.0037) [2024-06-19 02:39:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42601.0, 300 sec: 42043.0). Total num frames: 4285169664. Throughput: 0: 41951.0. Samples: 552809620. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:39:43,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 02:39:46,991][26599] Updated weights for policy 0, policy_version 261554 (0.0034) [2024-06-19 02:39:48,384][26367] Fps is (10 sec: 40944.8, 60 sec: 41503.5, 300 sec: 42042.5). Total num frames: 4285349888. Throughput: 0: 41911.6. Samples: 552938380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:39:48,385][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 02:39:49,983][26599] Updated weights for policy 0, policy_version 261564 (0.0039) [2024-06-19 02:39:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4285562880. Throughput: 0: 41741.6. Samples: 553184700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:39:53,381][26367] Avg episode reward: [(0, '0.386')] [2024-06-19 02:39:54,519][26599] Updated weights for policy 0, policy_version 261574 (0.0027) [2024-06-19 02:39:57,697][26599] Updated weights for policy 0, policy_version 261584 (0.0037) [2024-06-19 02:39:58,382][26367] Fps is (10 sec: 45884.8, 60 sec: 42324.2, 300 sec: 42153.9). Total num frames: 4285808640. Throughput: 0: 41855.3. Samples: 553436660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:39:58,382][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 02:40:02,343][26599] Updated weights for policy 0, policy_version 261594 (0.0045) [2024-06-19 02:40:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.0, 300 sec: 41931.9). Total num frames: 4285972480. Throughput: 0: 41875.0. Samples: 553571120. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:03,381][26367] Avg episode reward: [(0, '0.343')] [2024-06-19 02:40:05,690][26599] Updated weights for policy 0, policy_version 261604 (0.0032) [2024-06-19 02:40:08,380][26367] Fps is (10 sec: 39327.8, 60 sec: 42054.9, 300 sec: 41932.5). Total num frames: 4286201856. Throughput: 0: 41871.4. Samples: 553814460. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:08,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 02:40:10,053][26599] Updated weights for policy 0, policy_version 261614 (0.0040) [2024-06-19 02:40:13,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42054.8, 300 sec: 42154.1). Total num frames: 4286431232. Throughput: 0: 41857.7. Samples: 554068080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:13,381][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 02:40:13,552][26599] Updated weights for policy 0, policy_version 261624 (0.0044) [2024-06-19 02:40:17,723][26599] Updated weights for policy 0, policy_version 261634 (0.0044) [2024-06-19 02:40:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4286611456. Throughput: 0: 41847.1. Samples: 554196680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:18,380][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 02:40:21,199][26599] Updated weights for policy 0, policy_version 261644 (0.0031) [2024-06-19 02:40:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42327.9, 300 sec: 41988.0). Total num frames: 4286840832. Throughput: 0: 42071.1. Samples: 554449920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:23,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 02:40:25,757][26599] Updated weights for policy 0, policy_version 261654 (0.0045) [2024-06-19 02:40:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 4287037440. Throughput: 0: 42072.4. Samples: 554702880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:28,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 02:40:29,023][26579] Signal inference workers to stop experience collection... (8250 times) [2024-06-19 02:40:29,023][26579] Signal inference workers to resume experience collection... 
(8250 times) [2024-06-19 02:40:29,061][26599] InferenceWorker_p0-w0: stopping experience collection (8250 times) [2024-06-19 02:40:29,061][26599] InferenceWorker_p0-w0: resuming experience collection (8250 times) [2024-06-19 02:40:29,170][26599] Updated weights for policy 0, policy_version 261664 (0.0033) [2024-06-19 02:40:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4287250432. Throughput: 0: 41987.0. Samples: 554827640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:33,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 02:40:33,437][26599] Updated weights for policy 0, policy_version 261674 (0.0025) [2024-06-19 02:40:36,860][26599] Updated weights for policy 0, policy_version 261684 (0.0038) [2024-06-19 02:40:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4287463424. Throughput: 0: 41995.6. Samples: 555074500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:38,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 02:40:41,218][26599] Updated weights for policy 0, policy_version 261694 (0.0038) [2024-06-19 02:40:43,384][26367] Fps is (10 sec: 42583.1, 60 sec: 41776.7, 300 sec: 42153.6). Total num frames: 4287676416. Throughput: 0: 42162.1. Samples: 555334040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:43,384][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 02:40:44,650][26599] Updated weights for policy 0, policy_version 261704 (0.0030) [2024-06-19 02:40:48,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42327.9, 300 sec: 41988.0). Total num frames: 4287889408. Throughput: 0: 41956.1. Samples: 555459140. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 02:40:48,380][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 02:40:49,109][26599] Updated weights for policy 0, policy_version 261714 (0.0033) [2024-06-19 02:40:52,423][26599] Updated weights for policy 0, policy_version 261724 (0.0033) [2024-06-19 02:40:53,384][26367] Fps is (10 sec: 42598.2, 60 sec: 42322.8, 300 sec: 42098.5). Total num frames: 4288102400. Throughput: 0: 42048.6. Samples: 555706800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:40:53,384][26367] Avg episode reward: [(0, '0.307')] [2024-06-19 02:40:56,863][26599] Updated weights for policy 0, policy_version 261734 (0.0044) [2024-06-19 02:40:58,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41234.2, 300 sec: 41988.0). Total num frames: 4288282624. Throughput: 0: 42015.2. Samples: 555958760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:40:58,380][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 02:41:00,475][26599] Updated weights for policy 0, policy_version 261744 (0.0039) [2024-06-19 02:41:03,380][26367] Fps is (10 sec: 39335.4, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4288495616. Throughput: 0: 41862.5. Samples: 556080500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:03,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 02:41:04,538][26599] Updated weights for policy 0, policy_version 261754 (0.0036) [2024-06-19 02:41:08,313][26599] Updated weights for policy 0, policy_version 261764 (0.0038) [2024-06-19 02:41:08,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 4288741376. Throughput: 0: 41947.1. Samples: 556337540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:08,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 02:41:12,519][26599] Updated weights for policy 0, policy_version 261774 (0.0037) [2024-06-19 02:41:13,384][26367] Fps is (10 sec: 42583.2, 60 sec: 41503.6, 300 sec: 41986.9). 
Total num frames: 4288921600. Throughput: 0: 41818.8. Samples: 556584880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:13,385][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 02:41:16,261][26599] Updated weights for policy 0, policy_version 261784 (0.0050) [2024-06-19 02:41:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4289134592. Throughput: 0: 41861.8. Samples: 556711420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:18,381][26367] Avg episode reward: [(0, '0.785')] [2024-06-19 02:41:20,324][26599] Updated weights for policy 0, policy_version 261794 (0.0040) [2024-06-19 02:41:23,380][26367] Fps is (10 sec: 42614.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4289347584. Throughput: 0: 41965.4. Samples: 556962940. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:23,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 02:41:24,197][26599] Updated weights for policy 0, policy_version 261804 (0.0041) [2024-06-19 02:41:28,162][26599] Updated weights for policy 0, policy_version 261814 (0.0029) [2024-06-19 02:41:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4289560576. Throughput: 0: 41855.3. Samples: 557217380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:28,385][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 02:41:32,009][26599] Updated weights for policy 0, policy_version 261824 (0.0042) [2024-06-19 02:41:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4289773568. Throughput: 0: 41760.4. Samples: 557338360. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:33,380][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 02:41:36,310][26599] Updated weights for policy 0, policy_version 261834 (0.0036) [2024-06-19 02:41:38,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 42098.5). 
Total num frames: 4289986560. Throughput: 0: 41965.2. Samples: 557595080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:38,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 02:41:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000261840_4289986560.pth... [2024-06-19 02:41:38,454][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000261225_4279910400.pth [2024-06-19 02:41:39,827][26599] Updated weights for policy 0, policy_version 261844 (0.0027) [2024-06-19 02:41:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 4290199552. Throughput: 0: 41804.9. Samples: 557839980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:43,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 02:41:44,400][26599] Updated weights for policy 0, policy_version 261854 (0.0036) [2024-06-19 02:41:47,951][26599] Updated weights for policy 0, policy_version 261864 (0.0036) [2024-06-19 02:41:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 41876.9). Total num frames: 4290396160. Throughput: 0: 41808.6. Samples: 557961880. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:48,380][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 02:41:52,026][26599] Updated weights for policy 0, policy_version 261874 (0.0032) [2024-06-19 02:41:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42054.8, 300 sec: 42154.1). Total num frames: 4290625536. Throughput: 0: 42014.3. Samples: 558228180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:53,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 02:41:55,588][26599] Updated weights for policy 0, policy_version 261884 (0.0041) [2024-06-19 02:41:58,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.2, 300 sec: 41987.4). Total num frames: 4290822144. Throughput: 0: 41960.2. Samples: 558472940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:41:58,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 02:41:59,734][26599] Updated weights for policy 0, policy_version 261894 (0.0039) [2024-06-19 02:42:03,191][26599] Updated weights for policy 0, policy_version 261904 (0.0039) [2024-06-19 02:42:03,382][26367] Fps is (10 sec: 40953.9, 60 sec: 42324.4, 300 sec: 41931.7). Total num frames: 4291035136. Throughput: 0: 41900.0. Samples: 558596980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:03,382][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 02:42:07,509][26599] Updated weights for policy 0, policy_version 261914 (0.0045) [2024-06-19 02:42:08,382][26367] Fps is (10 sec: 42592.5, 60 sec: 41778.2, 300 sec: 42098.3). Total num frames: 4291248128. Throughput: 0: 41943.9. Samples: 558850480. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:08,382][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 02:42:11,064][26579] Signal inference workers to stop experience collection... (8300 times) [2024-06-19 02:42:11,068][26579] Signal inference workers to resume experience collection... (8300 times) [2024-06-19 02:42:11,105][26599] InferenceWorker_p0-w0: stopping experience collection (8300 times) [2024-06-19 02:42:11,105][26599] InferenceWorker_p0-w0: resuming experience collection (8300 times) [2024-06-19 02:42:11,225][26599] Updated weights for policy 0, policy_version 261924 (0.0035) [2024-06-19 02:42:13,380][26367] Fps is (10 sec: 40966.3, 60 sec: 42054.9, 300 sec: 41931.9). Total num frames: 4291444736. Throughput: 0: 41780.6. Samples: 559097500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:13,380][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 02:42:15,232][26599] Updated weights for policy 0, policy_version 261934 (0.0045) [2024-06-19 02:42:18,380][26367] Fps is (10 sec: 40966.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4291657728. Throughput: 0: 41827.6. 
Samples: 559220600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:18,380][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 02:42:18,924][26599] Updated weights for policy 0, policy_version 261944 (0.0038) [2024-06-19 02:42:22,838][26599] Updated weights for policy 0, policy_version 261954 (0.0030) [2024-06-19 02:42:23,380][26367] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4291854336. Throughput: 0: 41801.6. Samples: 559476160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:23,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 02:42:26,626][26599] Updated weights for policy 0, policy_version 261964 (0.0028) [2024-06-19 02:42:28,382][26367] Fps is (10 sec: 40950.7, 60 sec: 41777.7, 300 sec: 41931.6). Total num frames: 4292067328. Throughput: 0: 41778.8. Samples: 559720120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:28,383][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 02:42:30,969][26599] Updated weights for policy 0, policy_version 261974 (0.0043) [2024-06-19 02:42:33,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4292280320. Throughput: 0: 42030.6. Samples: 559853260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:33,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 02:42:34,586][26599] Updated weights for policy 0, policy_version 261984 (0.0030) [2024-06-19 02:42:38,380][26367] Fps is (10 sec: 39330.2, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 4292460544. Throughput: 0: 41649.3. Samples: 560102400. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:38,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 02:42:38,732][26599] Updated weights for policy 0, policy_version 261994 (0.0044) [2024-06-19 02:42:42,178][26599] Updated weights for policy 0, policy_version 262004 (0.0039) [2024-06-19 02:42:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4292706304. Throughput: 0: 41801.0. Samples: 560353980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:43,381][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 02:42:46,670][26599] Updated weights for policy 0, policy_version 262014 (0.0036) [2024-06-19 02:42:48,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 4292919296. Throughput: 0: 41963.6. Samples: 560485280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:48,381][26367] Avg episode reward: [(0, '0.342')] [2024-06-19 02:42:50,090][26599] Updated weights for policy 0, policy_version 262024 (0.0043) [2024-06-19 02:42:53,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41232.9, 300 sec: 41876.4). Total num frames: 4293099520. Throughput: 0: 41852.4. Samples: 560733780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:53,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 02:42:54,345][26599] Updated weights for policy 0, policy_version 262034 (0.0052) [2024-06-19 02:42:57,856][26599] Updated weights for policy 0, policy_version 262044 (0.0050) [2024-06-19 02:42:58,381][26367] Fps is (10 sec: 40957.4, 60 sec: 41778.9, 300 sec: 41931.9). Total num frames: 4293328896. Throughput: 0: 41878.5. Samples: 560982060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:42:58,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 02:43:02,119][26599] Updated weights for policy 0, policy_version 262054 (0.0035) [2024-06-19 02:43:03,380][26367] Fps is (10 sec: 44237.7, 60 sec: 41780.3, 300 sec: 41988.0). 
Total num frames: 4293541888. Throughput: 0: 42135.5. Samples: 561116700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:43:03,380][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 02:43:05,648][26599] Updated weights for policy 0, policy_version 262064 (0.0032) [2024-06-19 02:43:08,380][26367] Fps is (10 sec: 40962.3, 60 sec: 41507.2, 300 sec: 41931.9). Total num frames: 4293738496. Throughput: 0: 41973.9. Samples: 561364980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:43:08,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 02:43:09,814][26599] Updated weights for policy 0, policy_version 262074 (0.0025) [2024-06-19 02:43:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41932.5). Total num frames: 4293967872. Throughput: 0: 42104.4. Samples: 561614720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 02:43:13,380][26367] Avg episode reward: [(0, '0.242')] [2024-06-19 02:43:13,422][26599] Updated weights for policy 0, policy_version 262084 (0.0038) [2024-06-19 02:43:17,754][26599] Updated weights for policy 0, policy_version 262094 (0.0029) [2024-06-19 02:43:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4294164480. Throughput: 0: 42020.0. Samples: 561744160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:43:18,381][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 02:43:21,289][26599] Updated weights for policy 0, policy_version 262104 (0.0041) [2024-06-19 02:43:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 41932.1). Total num frames: 4294377472. Throughput: 0: 42026.7. Samples: 561993600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:43:23,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 02:43:25,526][26599] Updated weights for policy 0, policy_version 262114 (0.0030) [2024-06-19 02:43:26,622][26579] Signal inference workers to stop experience collection... 
(8350 times) [2024-06-19 02:43:26,622][26579] Signal inference workers to resume experience collection... (8350 times) [2024-06-19 02:43:26,668][26599] InferenceWorker_p0-w0: stopping experience collection (8350 times) [2024-06-19 02:43:26,668][26599] InferenceWorker_p0-w0: resuming experience collection (8350 times) [2024-06-19 02:43:28,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42326.9, 300 sec: 41931.9). Total num frames: 4294606848. Throughput: 0: 41961.9. Samples: 562242260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:43:28,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 02:43:29,142][26599] Updated weights for policy 0, policy_version 262124 (0.0038) [2024-06-19 02:43:33,260][26599] Updated weights for policy 0, policy_version 262134 (0.0039) [2024-06-19 02:43:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4294803456. Throughput: 0: 41987.5. Samples: 562374720. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:43:33,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 02:43:36,655][26599] Updated weights for policy 0, policy_version 262144 (0.0029) [2024-06-19 02:43:38,382][26367] Fps is (10 sec: 39312.7, 60 sec: 42323.8, 300 sec: 41987.7). Total num frames: 4295000064. Throughput: 0: 41993.6. Samples: 562623580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:43:38,383][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 02:43:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000262146_4295000064.pth... [2024-06-19 02:43:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000261532_4284940288.pth [2024-06-19 02:43:40,947][26599] Updated weights for policy 0, policy_version 262154 (0.0043) [2024-06-19 02:43:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4295245824. Throughput: 0: 42096.5. Samples: 562876380. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:43:43,381][26367] Avg episode reward: [(0, '0.425')] [2024-06-19 02:43:44,184][26599] Updated weights for policy 0, policy_version 262164 (0.0041) [2024-06-19 02:43:48,380][26367] Fps is (10 sec: 42607.8, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4295426048. Throughput: 0: 42027.5. Samples: 563007940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:43:48,381][26367] Avg episode reward: [(0, '0.298')] [2024-06-19 02:43:48,681][26599] Updated weights for policy 0, policy_version 262174 (0.0037) [2024-06-19 02:43:51,688][26599] Updated weights for policy 0, policy_version 262184 (0.0037) [2024-06-19 02:43:53,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4295639040. Throughput: 0: 41924.9. Samples: 563251600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:43:53,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 02:43:56,457][26599] Updated weights for policy 0, policy_version 262194 (0.0032) [2024-06-19 02:43:58,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.8, 300 sec: 41987.5). Total num frames: 4295868416. Throughput: 0: 42141.7. Samples: 563511100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:43:58,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 02:43:59,968][26599] Updated weights for policy 0, policy_version 262204 (0.0027) [2024-06-19 02:44:03,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41932.5). Total num frames: 4296048640. Throughput: 0: 42129.9. Samples: 563640000. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:44:03,380][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 02:44:04,140][26599] Updated weights for policy 0, policy_version 262214 (0.0050) [2024-06-19 02:44:08,020][26599] Updated weights for policy 0, policy_version 262224 (0.0034) [2024-06-19 02:44:08,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 41932.4). Total num frames: 4296278016. Throughput: 0: 42055.5. Samples: 563886100. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:44:08,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 02:44:11,926][26599] Updated weights for policy 0, policy_version 262234 (0.0026) [2024-06-19 02:44:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4296474624. Throughput: 0: 42226.3. Samples: 564142440. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:44:13,380][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 02:44:15,693][26599] Updated weights for policy 0, policy_version 262244 (0.0031) [2024-06-19 02:44:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41932.5). Total num frames: 4296671232. Throughput: 0: 41954.2. Samples: 564262660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:44:18,381][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 02:44:19,715][26599] Updated weights for policy 0, policy_version 262254 (0.0047) [2024-06-19 02:44:23,384][26367] Fps is (10 sec: 44220.1, 60 sec: 42322.8, 300 sec: 41931.4). Total num frames: 4296916992. Throughput: 0: 42032.4. Samples: 564515100. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 02:44:23,385][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 02:44:23,545][26599] Updated weights for policy 0, policy_version 262264 (0.0031) [2024-06-19 02:44:27,417][26599] Updated weights for policy 0, policy_version 262274 (0.0043) [2024-06-19 02:44:28,380][26367] Fps is (10 sec: 44235.9, 60 sec: 41779.0, 300 sec: 41987.4). Total num frames: 4297113600. Throughput: 0: 42145.2. Samples: 564772920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:44:28,381][26367] Avg episode reward: [(0, '0.364')] [2024-06-19 02:44:31,674][26599] Updated weights for policy 0, policy_version 262284 (0.0032) [2024-06-19 02:44:33,383][26367] Fps is (10 sec: 39326.4, 60 sec: 41777.5, 300 sec: 41931.6). Total num frames: 4297310208. Throughput: 0: 41993.3. Samples: 564897740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:44:33,383][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 02:44:35,554][26579] Signal inference workers to stop experience collection... (8400 times) [2024-06-19 02:44:35,554][26579] Signal inference workers to resume experience collection... (8400 times) [2024-06-19 02:44:35,606][26599] InferenceWorker_p0-w0: stopping experience collection (8400 times) [2024-06-19 02:44:35,606][26599] InferenceWorker_p0-w0: resuming experience collection (8400 times) [2024-06-19 02:44:35,708][26599] Updated weights for policy 0, policy_version 262294 (0.0030) [2024-06-19 02:44:38,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42599.9, 300 sec: 41987.5). Total num frames: 4297555968. Throughput: 0: 42015.9. Samples: 565142320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:44:38,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 02:44:39,371][26599] Updated weights for policy 0, policy_version 262304 (0.0042) [2024-06-19 02:44:43,380][26367] Fps is (10 sec: 40970.5, 60 sec: 41233.1, 300 sec: 41932.5). Total num frames: 4297719808. Throughput: 0: 42040.1. 
Samples: 565402900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:44:43,380][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 02:44:43,578][26599] Updated weights for policy 0, policy_version 262314 (0.0033) [2024-06-19 02:44:47,321][26599] Updated weights for policy 0, policy_version 262324 (0.0038) [2024-06-19 02:44:48,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4297949184. Throughput: 0: 41876.7. Samples: 565524460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:44:48,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 02:44:51,378][26599] Updated weights for policy 0, policy_version 262334 (0.0036) [2024-06-19 02:44:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 41876.6). Total num frames: 4298162176. Throughput: 0: 41871.2. Samples: 565770300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:44:53,380][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 02:44:55,425][26599] Updated weights for policy 0, policy_version 262344 (0.0028) [2024-06-19 02:44:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 4298358784. Throughput: 0: 41934.0. Samples: 566029480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:44:58,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 02:44:59,125][26599] Updated weights for policy 0, policy_version 262354 (0.0035) [2024-06-19 02:45:03,189][26599] Updated weights for policy 0, policy_version 262364 (0.0026) [2024-06-19 02:45:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4298571776. Throughput: 0: 41970.7. Samples: 566151340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:45:03,380][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 02:45:06,872][26599] Updated weights for policy 0, policy_version 262374 (0.0041) [2024-06-19 02:45:08,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4298817536. Throughput: 0: 41975.8. Samples: 566403860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:45:08,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 02:45:11,084][26599] Updated weights for policy 0, policy_version 262384 (0.0049) [2024-06-19 02:45:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4298997760. Throughput: 0: 42018.9. Samples: 566663760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:45:13,380][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 02:45:14,582][26599] Updated weights for policy 0, policy_version 262394 (0.0041) [2024-06-19 02:45:18,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4299194368. Throughput: 0: 41791.5. Samples: 566778260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:45:18,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 02:45:18,866][26599] Updated weights for policy 0, policy_version 262404 (0.0029) [2024-06-19 02:45:22,159][26599] Updated weights for policy 0, policy_version 262414 (0.0031) [2024-06-19 02:45:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42054.9, 300 sec: 42043.0). Total num frames: 4299440128. Throughput: 0: 42135.2. Samples: 567038400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:45:23,380][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 02:45:26,670][26599] Updated weights for policy 0, policy_version 262424 (0.0039) [2024-06-19 02:45:28,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4299603968. Throughput: 0: 42035.4. Samples: 567294500. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:45:28,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 02:45:29,877][26599] Updated weights for policy 0, policy_version 262434 (0.0043) [2024-06-19 02:45:33,384][26367] Fps is (10 sec: 40944.7, 60 sec: 42324.5, 300 sec: 41987.0). Total num frames: 4299849728. Throughput: 0: 41822.0. Samples: 567406600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 02:45:33,385][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 02:45:35,155][26599] Updated weights for policy 0, policy_version 262444 (0.0030) [2024-06-19 02:45:37,612][26579] Signal inference workers to stop experience collection... (8450 times) [2024-06-19 02:45:37,613][26579] Signal inference workers to resume experience collection... (8450 times) [2024-06-19 02:45:37,637][26599] InferenceWorker_p0-w0: stopping experience collection (8450 times) [2024-06-19 02:45:37,637][26599] InferenceWorker_p0-w0: resuming experience collection (8450 times) [2024-06-19 02:45:37,768][26599] Updated weights for policy 0, policy_version 262454 (0.0044) [2024-06-19 02:45:38,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42052.3, 300 sec: 42043.5). Total num frames: 4300079104. Throughput: 0: 42109.2. Samples: 567665220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:45:38,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 02:45:38,406][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000262456_4300079104.pth... [2024-06-19 02:45:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000261840_4289986560.pth [2024-06-19 02:45:42,830][26599] Updated weights for policy 0, policy_version 262464 (0.0041) [2024-06-19 02:45:43,380][26367] Fps is (10 sec: 37696.9, 60 sec: 41779.1, 300 sec: 41820.8). Total num frames: 4300226560. Throughput: 0: 42094.7. Samples: 567923740. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:45:43,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 02:45:45,665][26599] Updated weights for policy 0, policy_version 262474 (0.0027) [2024-06-19 02:45:48,382][26367] Fps is (10 sec: 39313.6, 60 sec: 42050.9, 300 sec: 41932.2). Total num frames: 4300472320. Throughput: 0: 41873.1. Samples: 568035720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:45:48,383][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 02:45:50,774][26599] Updated weights for policy 0, policy_version 262484 (0.0049) [2024-06-19 02:45:53,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4300685312. Throughput: 0: 42031.1. Samples: 568295260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:45:53,381][26367] Avg episode reward: [(0, '0.799')] [2024-06-19 02:45:53,676][26599] Updated weights for policy 0, policy_version 262494 (0.0039) [2024-06-19 02:45:58,371][26599] Updated weights for policy 0, policy_version 262504 (0.0042) [2024-06-19 02:45:58,380][26367] Fps is (10 sec: 39330.0, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 4300865536. Throughput: 0: 41866.2. Samples: 568547740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:45:58,380][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 02:46:01,298][26599] Updated weights for policy 0, policy_version 262514 (0.0036) [2024-06-19 02:46:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 4301127680. Throughput: 0: 41854.3. Samples: 568661700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:46:03,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 02:46:05,926][26599] Updated weights for policy 0, policy_version 262524 (0.0035) [2024-06-19 02:46:08,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41506.2, 300 sec: 41988.0). Total num frames: 4301307904. Throughput: 0: 41972.1. Samples: 568927140. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:46:08,380][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 02:46:09,276][26599] Updated weights for policy 0, policy_version 262534 (0.0038) [2024-06-19 02:46:13,380][26367] Fps is (10 sec: 36045.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4301488128. Throughput: 0: 41777.8. Samples: 569174500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:46:13,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 02:46:13,687][26599] Updated weights for policy 0, policy_version 262544 (0.0040) [2024-06-19 02:46:16,949][26599] Updated weights for policy 0, policy_version 262554 (0.0041) [2024-06-19 02:46:18,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42098.5). Total num frames: 4301766656. Throughput: 0: 42064.3. Samples: 569299340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:46:18,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 02:46:21,442][26599] Updated weights for policy 0, policy_version 262564 (0.0035) [2024-06-19 02:46:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 4301914112. Throughput: 0: 41947.2. Samples: 569552840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:46:23,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 02:46:24,782][26599] Updated weights for policy 0, policy_version 262574 (0.0039) [2024-06-19 02:46:28,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4302143488. Throughput: 0: 41833.8. Samples: 569806260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:46:28,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 02:46:29,173][26599] Updated weights for policy 0, policy_version 262584 (0.0034) [2024-06-19 02:46:32,418][26599] Updated weights for policy 0, policy_version 262594 (0.0029) [2024-06-19 02:46:33,380][26367] Fps is (10 sec: 47513.0, 60 sec: 42327.9, 300 sec: 42043.0). 
Total num frames: 4302389248. Throughput: 0: 42209.9. Samples: 569935080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:46:33,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 02:46:36,903][26599] Updated weights for policy 0, policy_version 262604 (0.0040) [2024-06-19 02:46:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 4302553088. Throughput: 0: 41876.9. Samples: 570179720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:46:38,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 02:46:40,395][26599] Updated weights for policy 0, policy_version 262614 (0.0039) [2024-06-19 02:46:43,384][26367] Fps is (10 sec: 39307.5, 60 sec: 42595.8, 300 sec: 41986.9). Total num frames: 4302782464. Throughput: 0: 41730.3. Samples: 570425760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 02:46:43,384][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 02:46:44,595][26599] Updated weights for policy 0, policy_version 262624 (0.0029) [2024-06-19 02:46:48,230][26599] Updated weights for policy 0, policy_version 262634 (0.0029) [2024-06-19 02:46:48,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42053.7, 300 sec: 41931.9). Total num frames: 4302995456. Throughput: 0: 42140.9. Samples: 570558040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:46:48,381][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 02:46:52,202][26599] Updated weights for policy 0, policy_version 262644 (0.0039) [2024-06-19 02:46:53,380][26367] Fps is (10 sec: 39336.0, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4303175680. Throughput: 0: 41836.4. Samples: 570809780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:46:53,381][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 02:46:53,536][26579] Signal inference workers to stop experience collection... (8500 times) [2024-06-19 02:46:53,538][26579] Signal inference workers to resume experience collection... 
(8500 times) [2024-06-19 02:46:53,550][26599] InferenceWorker_p0-w0: stopping experience collection (8500 times) [2024-06-19 02:46:53,550][26599] InferenceWorker_p0-w0: resuming experience collection (8500 times) [2024-06-19 02:46:56,035][26599] Updated weights for policy 0, policy_version 262654 (0.0031) [2024-06-19 02:46:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 41932.1). Total num frames: 4303405056. Throughput: 0: 41864.4. Samples: 571058400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:46:58,381][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 02:47:00,085][26599] Updated weights for policy 0, policy_version 262664 (0.0038) [2024-06-19 02:47:03,381][26367] Fps is (10 sec: 45872.5, 60 sec: 41778.8, 300 sec: 41987.6). Total num frames: 4303634432. Throughput: 0: 41908.8. Samples: 571185260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:03,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 02:47:03,774][26599] Updated weights for policy 0, policy_version 262674 (0.0028) [2024-06-19 02:47:08,130][26599] Updated weights for policy 0, policy_version 262684 (0.0033) [2024-06-19 02:47:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4303814656. Throughput: 0: 41875.4. Samples: 571437240. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:08,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 02:47:11,590][26599] Updated weights for policy 0, policy_version 262694 (0.0025) [2024-06-19 02:47:13,380][26367] Fps is (10 sec: 39324.2, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4304027648. Throughput: 0: 41745.0. Samples: 571684780. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:13,384][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 02:47:15,751][26599] Updated weights for policy 0, policy_version 262704 (0.0045) [2024-06-19 02:47:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 4304240640. Throughput: 0: 41663.6. Samples: 571809940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:18,381][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 02:47:19,183][26599] Updated weights for policy 0, policy_version 262714 (0.0023) [2024-06-19 02:47:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 41987.8). Total num frames: 4304453632. Throughput: 0: 41929.3. Samples: 572066540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:23,380][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 02:47:23,555][26599] Updated weights for policy 0, policy_version 262724 (0.0032) [2024-06-19 02:47:26,939][26599] Updated weights for policy 0, policy_version 262734 (0.0039) [2024-06-19 02:47:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4304666624. Throughput: 0: 42006.0. Samples: 572315880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:28,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 02:47:31,374][26599] Updated weights for policy 0, policy_version 262744 (0.0035) [2024-06-19 02:47:33,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 4304863232. Throughput: 0: 41989.3. Samples: 572447560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:33,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 02:47:34,841][26599] Updated weights for policy 0, policy_version 262754 (0.0035) [2024-06-19 02:47:38,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 41932.0). Total num frames: 4305076224. Throughput: 0: 42079.7. Samples: 572703360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:38,380][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 02:47:38,480][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000262762_4305092608.pth... [2024-06-19 02:47:38,550][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000262146_4295000064.pth [2024-06-19 02:47:39,167][26599] Updated weights for policy 0, policy_version 262764 (0.0029) [2024-06-19 02:47:42,433][26599] Updated weights for policy 0, policy_version 262774 (0.0037) [2024-06-19 02:47:43,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42054.9, 300 sec: 41987.5). Total num frames: 4305305600. Throughput: 0: 41994.7. Samples: 572948160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:43,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 02:47:47,140][26599] Updated weights for policy 0, policy_version 262784 (0.0040) [2024-06-19 02:47:48,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4305502208. Throughput: 0: 42167.6. Samples: 573082780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:48,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 02:47:50,478][26599] Updated weights for policy 0, policy_version 262794 (0.0040) [2024-06-19 02:47:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 41987.6). Total num frames: 4305715200. Throughput: 0: 42053.1. Samples: 573329620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 02:47:53,380][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 02:47:54,951][26599] Updated weights for policy 0, policy_version 262804 (0.0031) [2024-06-19 02:47:58,044][26599] Updated weights for policy 0, policy_version 262814 (0.0034) [2024-06-19 02:47:58,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42098.6). Total num frames: 4305960960. Throughput: 0: 42155.1. Samples: 573581760. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:47:58,380][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 02:48:02,697][26599] Updated weights for policy 0, policy_version 262824 (0.0041) [2024-06-19 02:48:03,384][26367] Fps is (10 sec: 42582.6, 60 sec: 41777.1, 300 sec: 42042.5). Total num frames: 4306141184. Throughput: 0: 42287.3. Samples: 573713020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:03,384][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 02:48:05,851][26599] Updated weights for policy 0, policy_version 262834 (0.0030) [2024-06-19 02:48:08,384][26367] Fps is (10 sec: 37669.1, 60 sec: 42049.8, 300 sec: 41931.4). Total num frames: 4306337792. Throughput: 0: 42133.9. Samples: 573962720. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:08,384][26367] Avg episode reward: [(0, '0.340')] [2024-06-19 02:48:10,772][26599] Updated weights for policy 0, policy_version 262844 (0.0043) [2024-06-19 02:48:13,380][26367] Fps is (10 sec: 44252.6, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 4306583552. Throughput: 0: 42009.8. Samples: 574206320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:13,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 02:48:13,684][26599] Updated weights for policy 0, policy_version 262854 (0.0038) [2024-06-19 02:48:18,380][26367] Fps is (10 sec: 40974.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4306747392. Throughput: 0: 42046.7. Samples: 574339660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:18,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 02:48:18,638][26599] Updated weights for policy 0, policy_version 262864 (0.0040) [2024-06-19 02:48:20,440][26579] Signal inference workers to stop experience collection... (8550 times) [2024-06-19 02:48:20,441][26579] Signal inference workers to resume experience collection... 
(8550 times) [2024-06-19 02:48:20,468][26599] InferenceWorker_p0-w0: stopping experience collection (8550 times) [2024-06-19 02:48:20,468][26599] InferenceWorker_p0-w0: resuming experience collection (8550 times) [2024-06-19 02:48:21,557][26599] Updated weights for policy 0, policy_version 262874 (0.0032) [2024-06-19 02:48:23,383][26367] Fps is (10 sec: 39311.9, 60 sec: 42050.5, 300 sec: 41931.6). Total num frames: 4306976768. Throughput: 0: 41739.3. Samples: 574581740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:23,383][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 02:48:26,184][26599] Updated weights for policy 0, policy_version 262884 (0.0027) [2024-06-19 02:48:28,380][26367] Fps is (10 sec: 47514.0, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 4307222528. Throughput: 0: 42028.9. Samples: 574839460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:28,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 02:48:29,290][26599] Updated weights for policy 0, policy_version 262894 (0.0033) [2024-06-19 02:48:33,380][26367] Fps is (10 sec: 39331.6, 60 sec: 41779.3, 300 sec: 41932.2). Total num frames: 4307369984. Throughput: 0: 41894.3. Samples: 574968020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:33,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 02:48:33,998][26599] Updated weights for policy 0, policy_version 262904 (0.0034) [2024-06-19 02:48:36,942][26599] Updated weights for policy 0, policy_version 262914 (0.0041) [2024-06-19 02:48:38,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 41931.9). Total num frames: 4307615744. Throughput: 0: 41901.7. Samples: 575215200. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:38,381][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 02:48:41,816][26599] Updated weights for policy 0, policy_version 262924 (0.0031) [2024-06-19 02:48:43,380][26367] Fps is (10 sec: 47513.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4307845120. Throughput: 0: 42143.1. Samples: 575478200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:43,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 02:48:44,625][26599] Updated weights for policy 0, policy_version 262934 (0.0032) [2024-06-19 02:48:48,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4308008960. Throughput: 0: 42044.8. Samples: 575604880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:48,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 02:48:49,626][26599] Updated weights for policy 0, policy_version 262944 (0.0029) [2024-06-19 02:48:52,338][26599] Updated weights for policy 0, policy_version 262954 (0.0028) [2024-06-19 02:48:53,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4308254720. Throughput: 0: 42043.4. Samples: 575854520. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:53,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 02:48:57,255][26599] Updated weights for policy 0, policy_version 262964 (0.0036) [2024-06-19 02:48:58,380][26367] Fps is (10 sec: 44237.2, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4308451328. Throughput: 0: 42288.6. Samples: 576109300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:48:58,380][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 02:49:00,128][26599] Updated weights for policy 0, policy_version 262974 (0.0037) [2024-06-19 02:49:03,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41781.8, 300 sec: 41932.0). Total num frames: 4308647936. Throughput: 0: 42093.0. Samples: 576233840. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:49:03,380][26367] Avg episode reward: [(0, '0.701')] [2024-06-19 02:49:04,923][26599] Updated weights for policy 0, policy_version 262984 (0.0041) [2024-06-19 02:49:07,936][26599] Updated weights for policy 0, policy_version 262994 (0.0043) [2024-06-19 02:49:08,381][26367] Fps is (10 sec: 44232.7, 60 sec: 42600.4, 300 sec: 42098.4). Total num frames: 4308893696. Throughput: 0: 42357.2. Samples: 576487740. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 02:49:08,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 02:49:12,884][26599] Updated weights for policy 0, policy_version 263004 (0.0030) [2024-06-19 02:49:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.3, 300 sec: 42043.0). Total num frames: 4309073920. Throughput: 0: 42194.3. Samples: 576738200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:13,380][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 02:49:15,961][26599] Updated weights for policy 0, policy_version 263014 (0.0033) [2024-06-19 02:49:18,380][26367] Fps is (10 sec: 39324.9, 60 sec: 42325.4, 300 sec: 41932.5). Total num frames: 4309286912. Throughput: 0: 41972.0. Samples: 576856760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:18,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 02:49:20,457][26599] Updated weights for policy 0, policy_version 263024 (0.0032) [2024-06-19 02:49:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42054.1, 300 sec: 41987.5). Total num frames: 4309499904. Throughput: 0: 42140.1. Samples: 577111500. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:23,380][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 02:49:23,884][26599] Updated weights for policy 0, policy_version 263034 (0.0049) [2024-06-19 02:49:28,218][26599] Updated weights for policy 0, policy_version 263044 (0.0041) [2024-06-19 02:49:28,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41506.2, 300 sec: 42043.4). Total num frames: 4309712896. Throughput: 0: 42073.8. Samples: 577371520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:28,380][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 02:49:31,836][26599] Updated weights for policy 0, policy_version 263054 (0.0039) [2024-06-19 02:49:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 4309925888. Throughput: 0: 41946.7. Samples: 577492480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:33,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 02:49:36,270][26599] Updated weights for policy 0, policy_version 263064 (0.0029) [2024-06-19 02:49:36,832][26579] Signal inference workers to stop experience collection... (8600 times) [2024-06-19 02:49:36,835][26579] Signal inference workers to resume experience collection... (8600 times) [2024-06-19 02:49:36,849][26599] InferenceWorker_p0-w0: stopping experience collection (8600 times) [2024-06-19 02:49:36,849][26599] InferenceWorker_p0-w0: resuming experience collection (8600 times) [2024-06-19 02:49:38,384][26367] Fps is (10 sec: 40944.5, 60 sec: 41776.7, 300 sec: 42042.5). Total num frames: 4310122496. Throughput: 0: 41853.5. Samples: 577738080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:38,385][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 02:49:38,412][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000263069_4310122496.pth... 
[2024-06-19 02:49:38,473][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000262456_4300079104.pth [2024-06-19 02:49:39,729][26599] Updated weights for policy 0, policy_version 263074 (0.0042) [2024-06-19 02:49:43,383][26367] Fps is (10 sec: 40949.8, 60 sec: 41504.4, 300 sec: 41987.1). Total num frames: 4310335488. Throughput: 0: 41871.9. Samples: 577993640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:43,384][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 02:49:44,071][26599] Updated weights for policy 0, policy_version 263084 (0.0039) [2024-06-19 02:49:47,872][26599] Updated weights for policy 0, policy_version 263094 (0.0048) [2024-06-19 02:49:48,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42325.3, 300 sec: 41987.4). Total num frames: 4310548480. Throughput: 0: 41827.4. Samples: 578116080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:48,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 02:49:51,786][26599] Updated weights for policy 0, policy_version 263104 (0.0029) [2024-06-19 02:49:53,380][26367] Fps is (10 sec: 44247.3, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 4310777856. Throughput: 0: 41767.8. Samples: 578367260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:53,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 02:49:55,907][26599] Updated weights for policy 0, policy_version 263114 (0.0032) [2024-06-19 02:49:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.0, 300 sec: 41987.4). Total num frames: 4310958080. Throughput: 0: 41874.9. Samples: 578622580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:49:58,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 02:49:59,521][26599] Updated weights for policy 0, policy_version 263124 (0.0046) [2024-06-19 02:50:03,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4311171072. Throughput: 0: 41912.5. Samples: 578742820. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:50:03,380][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 02:50:03,874][26599] Updated weights for policy 0, policy_version 263134 (0.0029) [2024-06-19 02:50:07,181][26599] Updated weights for policy 0, policy_version 263144 (0.0029) [2024-06-19 02:50:08,384][26367] Fps is (10 sec: 44221.2, 60 sec: 41777.2, 300 sec: 42042.5). Total num frames: 4311400448. Throughput: 0: 41965.4. Samples: 579000100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:50:08,385][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 02:50:11,714][26599] Updated weights for policy 0, policy_version 263154 (0.0047) [2024-06-19 02:50:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4311597056. Throughput: 0: 41774.2. Samples: 579251360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:50:13,381][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 02:50:14,956][26599] Updated weights for policy 0, policy_version 263164 (0.0040) [2024-06-19 02:50:18,384][26367] Fps is (10 sec: 39321.5, 60 sec: 41776.6, 300 sec: 41875.9). Total num frames: 4311793664. Throughput: 0: 41756.1. Samples: 579371660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 02:50:18,385][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 02:50:19,377][26599] Updated weights for policy 0, policy_version 263174 (0.0048) [2024-06-19 02:50:22,640][26599] Updated weights for policy 0, policy_version 263184 (0.0039) [2024-06-19 02:50:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4312023040. Throughput: 0: 41946.5. Samples: 579625520. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:50:23,381][26367] Avg episode reward: [(0, '0.778')] [2024-06-19 02:50:26,999][26599] Updated weights for policy 0, policy_version 263194 (0.0032) [2024-06-19 02:50:28,380][26367] Fps is (10 sec: 40975.3, 60 sec: 41506.1, 300 sec: 41876.9). Total num frames: 4312203264. Throughput: 0: 42059.2. Samples: 579886200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:50:28,380][26367] Avg episode reward: [(0, '0.758')] [2024-06-19 02:50:30,578][26599] Updated weights for policy 0, policy_version 263204 (0.0038) [2024-06-19 02:50:33,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42049.6, 300 sec: 41931.4). Total num frames: 4312449024. Throughput: 0: 41985.5. Samples: 580005580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:50:33,384][26367] Avg episode reward: [(0, '0.798')] [2024-06-19 02:50:34,856][26599] Updated weights for policy 0, policy_version 263214 (0.0037) [2024-06-19 02:50:38,343][26599] Updated weights for policy 0, policy_version 263224 (0.0032) [2024-06-19 02:50:38,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42327.9, 300 sec: 42154.1). Total num frames: 4312662016. Throughput: 0: 42144.5. Samples: 580263760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:50:38,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 02:50:42,838][26599] Updated weights for policy 0, policy_version 263234 (0.0038) [2024-06-19 02:50:43,384][26367] Fps is (10 sec: 37683.6, 60 sec: 41505.3, 300 sec: 41876.2). Total num frames: 4312825856. Throughput: 0: 41981.2. Samples: 580511880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:50:43,384][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 02:50:46,072][26599] Updated weights for policy 0, policy_version 263244 (0.0028) [2024-06-19 02:50:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4313088000. Throughput: 0: 41975.9. Samples: 580631740. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:50:48,381][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 02:50:50,607][26599] Updated weights for policy 0, policy_version 263254 (0.0030) [2024-06-19 02:50:53,380][26367] Fps is (10 sec: 44252.7, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4313268224. Throughput: 0: 42005.6. Samples: 580890200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:50:53,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 02:50:53,888][26579] Signal inference workers to stop experience collection... (8650 times) [2024-06-19 02:50:53,888][26579] Signal inference workers to resume experience collection... (8650 times) [2024-06-19 02:50:53,910][26599] InferenceWorker_p0-w0: stopping experience collection (8650 times) [2024-06-19 02:50:53,911][26599] InferenceWorker_p0-w0: resuming experience collection (8650 times) [2024-06-19 02:50:54,035][26599] Updated weights for policy 0, policy_version 263264 (0.0039) [2024-06-19 02:50:58,380][26367] Fps is (10 sec: 37683.7, 60 sec: 41779.4, 300 sec: 41820.9). Total num frames: 4313464832. Throughput: 0: 41951.6. Samples: 581139180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:50:58,380][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 02:50:58,487][26599] Updated weights for policy 0, policy_version 263274 (0.0029) [2024-06-19 02:51:02,023][26599] Updated weights for policy 0, policy_version 263284 (0.0039) [2024-06-19 02:51:03,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4313710592. Throughput: 0: 42029.8. Samples: 581262840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:51:03,380][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 02:51:06,436][26599] Updated weights for policy 0, policy_version 263294 (0.0034) [2024-06-19 02:51:08,380][26367] Fps is (10 sec: 42597.6, 60 sec: 41508.6, 300 sec: 42043.0). Total num frames: 4313890816. Throughput: 0: 41972.9. 
Samples: 581514300. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:51:08,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 02:51:09,731][26599] Updated weights for policy 0, policy_version 263304 (0.0037) [2024-06-19 02:51:13,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41779.1, 300 sec: 41820.9). Total num frames: 4314103808. Throughput: 0: 41777.7. Samples: 581766200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:51:13,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 02:51:14,099][26599] Updated weights for policy 0, policy_version 263314 (0.0046) [2024-06-19 02:51:17,519][26599] Updated weights for policy 0, policy_version 263324 (0.0036) [2024-06-19 02:51:18,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42054.9, 300 sec: 42043.0). Total num frames: 4314316800. Throughput: 0: 41879.9. Samples: 581890020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:51:18,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 02:51:21,821][26599] Updated weights for policy 0, policy_version 263334 (0.0043) [2024-06-19 02:51:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4314529792. Throughput: 0: 41781.8. Samples: 582143940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:51:23,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 02:51:25,642][26599] Updated weights for policy 0, policy_version 263344 (0.0038) [2024-06-19 02:51:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4314742784. Throughput: 0: 41788.2. Samples: 582392200. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 02:51:28,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 02:51:29,460][26599] Updated weights for policy 0, policy_version 263354 (0.0041) [2024-06-19 02:51:33,215][26599] Updated weights for policy 0, policy_version 263364 (0.0041) [2024-06-19 02:51:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41781.7, 300 sec: 42043.0). Total num frames: 4314955776. Throughput: 0: 42139.4. Samples: 582528020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:51:33,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 02:51:37,070][26599] Updated weights for policy 0, policy_version 263374 (0.0037) [2024-06-19 02:51:38,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 41876.9). Total num frames: 4315136000. Throughput: 0: 41712.9. Samples: 582767280. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:51:38,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 02:51:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000263375_4315136000.pth... [2024-06-19 02:51:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000262762_4305092608.pth [2024-06-19 02:51:40,966][26599] Updated weights for policy 0, policy_version 263384 (0.0039) [2024-06-19 02:51:43,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42327.9, 300 sec: 41931.9). Total num frames: 4315365376. Throughput: 0: 41836.8. Samples: 583021840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:51:43,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 02:51:45,223][26599] Updated weights for policy 0, policy_version 263394 (0.0028) [2024-06-19 02:51:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4315578368. Throughput: 0: 41847.5. Samples: 583145980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:51:48,380][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 02:51:48,660][26599] Updated weights for policy 0, policy_version 263404 (0.0032) [2024-06-19 02:51:53,178][26599] Updated weights for policy 0, policy_version 263414 (0.0035) [2024-06-19 02:51:53,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 4315774976. Throughput: 0: 41866.9. Samples: 583398460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:51:53,384][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 02:51:56,241][26599] Updated weights for policy 0, policy_version 263424 (0.0042) [2024-06-19 02:51:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 41876.5). Total num frames: 4315987968. Throughput: 0: 41884.9. Samples: 583651020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:51:58,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 02:52:00,845][26599] Updated weights for policy 0, policy_version 263434 (0.0046) [2024-06-19 02:52:03,383][26367] Fps is (10 sec: 42601.3, 60 sec: 41504.0, 300 sec: 41987.1). Total num frames: 4316200960. Throughput: 0: 41885.7. Samples: 583775000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:52:03,384][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 02:52:04,086][26599] Updated weights for policy 0, policy_version 263444 (0.0041) [2024-06-19 02:52:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4316413952. Throughput: 0: 41938.7. Samples: 584031180. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:52:08,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 02:52:08,964][26599] Updated weights for policy 0, policy_version 263454 (0.0046) [2024-06-19 02:52:12,281][26599] Updated weights for policy 0, policy_version 263464 (0.0029) [2024-06-19 02:52:13,380][26367] Fps is (10 sec: 40971.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4316610560. Throughput: 0: 41824.4. Samples: 584274300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:52:13,381][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 02:52:14,452][26579] Signal inference workers to stop experience collection... (8700 times) [2024-06-19 02:52:14,496][26599] InferenceWorker_p0-w0: stopping experience collection (8700 times) [2024-06-19 02:52:14,562][26579] Signal inference workers to resume experience collection... (8700 times) [2024-06-19 02:52:14,562][26599] InferenceWorker_p0-w0: resuming experience collection (8700 times) [2024-06-19 02:52:16,555][26599] Updated weights for policy 0, policy_version 263474 (0.0033) [2024-06-19 02:52:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4316839936. Throughput: 0: 41573.5. Samples: 584398820. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:52:18,380][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 02:52:19,938][26599] Updated weights for policy 0, policy_version 263484 (0.0036) [2024-06-19 02:52:23,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41503.6, 300 sec: 41875.9). Total num frames: 4317020160. Throughput: 0: 41972.1. Samples: 584656180. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:52:23,385][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 02:52:24,383][26599] Updated weights for policy 0, policy_version 263494 (0.0027) [2024-06-19 02:52:28,096][26599] Updated weights for policy 0, policy_version 263504 (0.0034) [2024-06-19 02:52:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4317249536. Throughput: 0: 41783.1. Samples: 584902080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:52:28,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 02:52:32,176][26599] Updated weights for policy 0, policy_version 263514 (0.0028) [2024-06-19 02:52:33,380][26367] Fps is (10 sec: 44253.6, 60 sec: 41779.4, 300 sec: 41987.5). Total num frames: 4317462528. Throughput: 0: 41879.2. Samples: 585030540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:52:33,380][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 02:52:35,981][26599] Updated weights for policy 0, policy_version 263524 (0.0047) [2024-06-19 02:52:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4317659136. Throughput: 0: 41928.2. Samples: 585285080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:52:38,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 02:52:39,936][26599] Updated weights for policy 0, policy_version 263534 (0.0030) [2024-06-19 02:52:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 4317888512. Throughput: 0: 41717.9. Samples: 585528320. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 02:52:43,380][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 02:52:43,757][26599] Updated weights for policy 0, policy_version 263544 (0.0033) [2024-06-19 02:52:47,898][26599] Updated weights for policy 0, policy_version 263554 (0.0028) [2024-06-19 02:52:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4318085120. Throughput: 0: 41839.6. Samples: 585657660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:52:48,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 02:52:51,676][26599] Updated weights for policy 0, policy_version 263564 (0.0041) [2024-06-19 02:52:53,384][26367] Fps is (10 sec: 39306.9, 60 sec: 41779.2, 300 sec: 41764.8). Total num frames: 4318281728. Throughput: 0: 41735.3. Samples: 585909420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:52:53,385][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 02:52:55,713][26599] Updated weights for policy 0, policy_version 263574 (0.0045) [2024-06-19 02:52:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41932.4). Total num frames: 4318511104. Throughput: 0: 41845.0. Samples: 586157320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:52:58,381][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 02:52:59,415][26599] Updated weights for policy 0, policy_version 263584 (0.0034) [2024-06-19 02:53:03,384][26367] Fps is (10 sec: 40960.3, 60 sec: 41505.7, 300 sec: 41876.4). Total num frames: 4318691328. Throughput: 0: 41953.1. Samples: 586286860. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:03,384][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 02:53:03,706][26599] Updated weights for policy 0, policy_version 263594 (0.0032) [2024-06-19 02:53:07,228][26599] Updated weights for policy 0, policy_version 263604 (0.0032) [2024-06-19 02:53:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4318937088. Throughput: 0: 41861.6. Samples: 586539800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:08,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 02:53:11,408][26599] Updated weights for policy 0, policy_version 263614 (0.0031) [2024-06-19 02:53:13,380][26367] Fps is (10 sec: 47530.3, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4319166464. Throughput: 0: 41932.4. Samples: 586789040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:13,381][26367] Avg episode reward: [(0, '0.745')] [2024-06-19 02:53:15,114][26599] Updated weights for policy 0, policy_version 263624 (0.0044) [2024-06-19 02:53:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41506.0, 300 sec: 41876.7). Total num frames: 4319330304. Throughput: 0: 41963.8. Samples: 586918920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:18,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 02:53:19,098][26599] Updated weights for policy 0, policy_version 263634 (0.0029) [2024-06-19 02:53:22,999][26599] Updated weights for policy 0, policy_version 263644 (0.0032) [2024-06-19 02:53:23,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42054.8, 300 sec: 41765.3). Total num frames: 4319543296. Throughput: 0: 41748.5. Samples: 587163760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:23,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 02:53:25,481][26579] Signal inference workers to stop experience collection... 
(8750 times) [2024-06-19 02:53:25,481][26579] Signal inference workers to resume experience collection... (8750 times) [2024-06-19 02:53:25,499][26599] InferenceWorker_p0-w0: stopping experience collection (8750 times) [2024-06-19 02:53:25,500][26599] InferenceWorker_p0-w0: resuming experience collection (8750 times) [2024-06-19 02:53:27,052][26599] Updated weights for policy 0, policy_version 263654 (0.0043) [2024-06-19 02:53:28,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4319789056. Throughput: 0: 41931.1. Samples: 587415220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:28,380][26367] Avg episode reward: [(0, '0.341')] [2024-06-19 02:53:31,105][26599] Updated weights for policy 0, policy_version 263664 (0.0042) [2024-06-19 02:53:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4319969280. Throughput: 0: 41951.1. Samples: 587545460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:33,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 02:53:35,007][26599] Updated weights for policy 0, policy_version 263674 (0.0030) [2024-06-19 02:53:38,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 4320165888. Throughput: 0: 41791.4. Samples: 587789880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:38,380][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 02:53:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000263682_4320165888.pth... [2024-06-19 02:53:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000263069_4310122496.pth [2024-06-19 02:53:39,031][26599] Updated weights for policy 0, policy_version 263684 (0.0033) [2024-06-19 02:53:42,764][26599] Updated weights for policy 0, policy_version 263694 (0.0039) [2024-06-19 02:53:43,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42052.2, 300 sec: 42043.0). 
Total num frames: 4320411648. Throughput: 0: 41907.2. Samples: 588043140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:43,380][26367] Avg episode reward: [(0, '0.337')] [2024-06-19 02:53:46,877][26599] Updated weights for policy 0, policy_version 263704 (0.0034) [2024-06-19 02:53:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4320575488. Throughput: 0: 41909.9. Samples: 588172660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:48,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 02:53:50,519][26599] Updated weights for policy 0, policy_version 263714 (0.0052) [2024-06-19 02:53:53,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42327.9, 300 sec: 41931.9). Total num frames: 4320821248. Throughput: 0: 41591.2. Samples: 588411400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:53:53,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 02:53:55,032][26599] Updated weights for policy 0, policy_version 263724 (0.0048) [2024-06-19 02:53:58,191][26599] Updated weights for policy 0, policy_version 263734 (0.0046) [2024-06-19 02:53:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4321017856. Throughput: 0: 41798.3. Samples: 588669960. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:53:58,380][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 02:54:02,735][26599] Updated weights for policy 0, policy_version 263744 (0.0035) [2024-06-19 02:54:03,380][26367] Fps is (10 sec: 36044.7, 60 sec: 41508.6, 300 sec: 41654.4). Total num frames: 4321181696. Throughput: 0: 41650.7. Samples: 588793200. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:03,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 02:54:05,818][26599] Updated weights for policy 0, policy_version 263754 (0.0039) [2024-06-19 02:54:08,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 41931.9). 
Total num frames: 4321443840. Throughput: 0: 41743.9. Samples: 589042240. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:08,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 02:54:10,418][26599] Updated weights for policy 0, policy_version 263764 (0.0047) [2024-06-19 02:54:13,380][26367] Fps is (10 sec: 45874.9, 60 sec: 41233.0, 300 sec: 41876.4). Total num frames: 4321640448. Throughput: 0: 41849.2. Samples: 589298440. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:13,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 02:54:13,769][26599] Updated weights for policy 0, policy_version 263774 (0.0031) [2024-06-19 02:54:18,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41506.1, 300 sec: 41765.3). Total num frames: 4321820672. Throughput: 0: 41737.3. Samples: 589423640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:18,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 02:54:18,758][26599] Updated weights for policy 0, policy_version 263784 (0.0037) [2024-06-19 02:54:21,536][26599] Updated weights for policy 0, policy_version 263794 (0.0040) [2024-06-19 02:54:23,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4322082816. Throughput: 0: 41724.4. Samples: 589667480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:23,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 02:54:26,605][26599] Updated weights for policy 0, policy_version 263804 (0.0043) [2024-06-19 02:54:28,380][26367] Fps is (10 sec: 44237.5, 60 sec: 41233.0, 300 sec: 41820.9). Total num frames: 4322263040. Throughput: 0: 41852.9. Samples: 589926520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:28,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 02:54:29,063][26579] Signal inference workers to stop experience collection... (8800 times) [2024-06-19 02:54:29,066][26579] Signal inference workers to resume experience collection... 
(8800 times) [2024-06-19 02:54:29,073][26599] InferenceWorker_p0-w0: stopping experience collection (8800 times) [2024-06-19 02:54:29,086][26599] InferenceWorker_p0-w0: resuming experience collection (8800 times) [2024-06-19 02:54:29,237][26599] Updated weights for policy 0, policy_version 263814 (0.0034) [2024-06-19 02:54:33,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41506.2, 300 sec: 41821.4). Total num frames: 4322459648. Throughput: 0: 41800.0. Samples: 590053660. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:33,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 02:54:34,406][26599] Updated weights for policy 0, policy_version 263824 (0.0027) [2024-06-19 02:54:36,918][26599] Updated weights for policy 0, policy_version 263834 (0.0025) [2024-06-19 02:54:38,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 41932.3). Total num frames: 4322705408. Throughput: 0: 41957.7. Samples: 590299500. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:38,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 02:54:42,172][26599] Updated weights for policy 0, policy_version 263844 (0.0033) [2024-06-19 02:54:43,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 4322902016. Throughput: 0: 41934.1. Samples: 590557000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:43,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 02:54:44,578][26599] Updated weights for policy 0, policy_version 263854 (0.0036) [2024-06-19 02:54:48,384][26367] Fps is (10 sec: 39308.2, 60 sec: 42049.9, 300 sec: 41764.8). Total num frames: 4323098624. Throughput: 0: 41918.5. Samples: 590679680. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:48,384][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 02:54:49,826][26599] Updated weights for policy 0, policy_version 263864 (0.0048) [2024-06-19 02:54:52,173][26599] Updated weights for policy 0, policy_version 263874 (0.0028) [2024-06-19 02:54:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4323344384. Throughput: 0: 41940.1. Samples: 590929540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:53,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 02:54:57,631][26599] Updated weights for policy 0, policy_version 263884 (0.0036) [2024-06-19 02:54:58,380][26367] Fps is (10 sec: 42613.2, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4323524608. Throughput: 0: 42070.7. Samples: 591191620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:54:58,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 02:55:00,394][26599] Updated weights for policy 0, policy_version 263894 (0.0029) [2024-06-19 02:55:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 41821.4). Total num frames: 4323737600. Throughput: 0: 41925.9. Samples: 591310300. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 02:55:03,380][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 02:55:05,319][26599] Updated weights for policy 0, policy_version 263904 (0.0032) [2024-06-19 02:55:08,040][26599] Updated weights for policy 0, policy_version 263914 (0.0035) [2024-06-19 02:55:08,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4323983360. Throughput: 0: 42163.5. Samples: 591564840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:08,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 02:55:12,942][26599] Updated weights for policy 0, policy_version 263924 (0.0028) [2024-06-19 02:55:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 41876.9). 
Total num frames: 4324147200. Throughput: 0: 42175.5. Samples: 591824420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:13,381][26367] Avg episode reward: [(0, '0.291')] [2024-06-19 02:55:15,747][26599] Updated weights for policy 0, policy_version 263934 (0.0034) [2024-06-19 02:55:18,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 41876.4). Total num frames: 4324376576. Throughput: 0: 42019.6. Samples: 591944540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:18,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 02:55:20,692][26599] Updated weights for policy 0, policy_version 263944 (0.0030) [2024-06-19 02:55:22,496][26579] Signal inference workers to stop experience collection... (8850 times) [2024-06-19 02:55:22,550][26599] InferenceWorker_p0-w0: stopping experience collection (8850 times) [2024-06-19 02:55:22,552][26579] Signal inference workers to resume experience collection... (8850 times) [2024-06-19 02:55:22,565][26599] InferenceWorker_p0-w0: resuming experience collection (8850 times) [2024-06-19 02:55:23,384][26367] Fps is (10 sec: 45858.2, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 4324605952. Throughput: 0: 42295.7. Samples: 592202960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:23,385][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 02:55:23,723][26599] Updated weights for policy 0, policy_version 263954 (0.0040) [2024-06-19 02:55:28,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 41765.9). Total num frames: 4324769792. Throughput: 0: 42205.0. Samples: 592456220. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:28,380][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 02:55:28,394][26599] Updated weights for policy 0, policy_version 263964 (0.0038) [2024-06-19 02:55:31,599][26599] Updated weights for policy 0, policy_version 263974 (0.0033) [2024-06-19 02:55:33,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 4325015552. Throughput: 0: 42139.7. Samples: 592575820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:33,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 02:55:36,030][26599] Updated weights for policy 0, policy_version 263984 (0.0028) [2024-06-19 02:55:38,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41779.3, 300 sec: 41988.0). Total num frames: 4325212160. Throughput: 0: 42141.7. Samples: 592825920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:38,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 02:55:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000263990_4325212160.pth... [2024-06-19 02:55:38,464][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000263375_4315136000.pth [2024-06-19 02:55:39,353][26599] Updated weights for policy 0, policy_version 263994 (0.0029) [2024-06-19 02:55:43,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 41765.3). Total num frames: 4325408768. Throughput: 0: 42139.0. Samples: 593087880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:43,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 02:55:43,599][26599] Updated weights for policy 0, policy_version 264004 (0.0034) [2024-06-19 02:55:46,972][26599] Updated weights for policy 0, policy_version 264014 (0.0034) [2024-06-19 02:55:48,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42600.8, 300 sec: 41987.5). Total num frames: 4325654528. Throughput: 0: 42179.0. Samples: 593208360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:48,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 02:55:51,545][26599] Updated weights for policy 0, policy_version 264024 (0.0037) [2024-06-19 02:55:53,384][26367] Fps is (10 sec: 44221.1, 60 sec: 41776.6, 300 sec: 41986.9). Total num frames: 4325851136. Throughput: 0: 42164.6. Samples: 593462400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:53,384][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 02:55:54,816][26599] Updated weights for policy 0, policy_version 264034 (0.0026) [2024-06-19 02:55:58,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 4326047744. Throughput: 0: 42001.4. Samples: 593714480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:55:58,380][26367] Avg episode reward: [(0, '0.367')] [2024-06-19 02:55:59,286][26599] Updated weights for policy 0, policy_version 264044 (0.0031) [2024-06-19 02:56:02,833][26599] Updated weights for policy 0, policy_version 264054 (0.0038) [2024-06-19 02:56:03,380][26367] Fps is (10 sec: 42614.5, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4326277120. Throughput: 0: 42052.0. Samples: 593836880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:56:03,380][26367] Avg episode reward: [(0, '0.367')] [2024-06-19 02:56:06,813][26599] Updated weights for policy 0, policy_version 264064 (0.0039) [2024-06-19 02:56:08,380][26367] Fps is (10 sec: 44236.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4326490112. Throughput: 0: 42030.9. Samples: 594094200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:56:08,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 02:56:10,476][26599] Updated weights for policy 0, policy_version 264074 (0.0042) [2024-06-19 02:56:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4326670336. Throughput: 0: 42006.6. Samples: 594346520. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 02:56:13,380][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 02:56:14,663][26599] Updated weights for policy 0, policy_version 264084 (0.0038) [2024-06-19 02:56:18,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4326899712. Throughput: 0: 42095.6. Samples: 594470120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:56:18,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 02:56:18,584][26599] Updated weights for policy 0, policy_version 264094 (0.0044) [2024-06-19 02:56:22,448][26599] Updated weights for policy 0, policy_version 264104 (0.0048) [2024-06-19 02:56:23,380][26367] Fps is (10 sec: 44235.7, 60 sec: 41781.6, 300 sec: 41931.9). Total num frames: 4327112704. Throughput: 0: 42271.4. Samples: 594728140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:56:23,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 02:56:26,241][26599] Updated weights for policy 0, policy_version 264114 (0.0026) [2024-06-19 02:56:28,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.1, 300 sec: 41820.9). Total num frames: 4327292928. Throughput: 0: 42058.2. Samples: 594980500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:56:28,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 02:56:30,090][26599] Updated weights for policy 0, policy_version 264124 (0.0049) [2024-06-19 02:56:33,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4327555072. Throughput: 0: 42051.1. Samples: 595100660. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:56:33,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 02:56:34,090][26599] Updated weights for policy 0, policy_version 264134 (0.0027) [2024-06-19 02:56:37,766][26599] Updated weights for policy 0, policy_version 264144 (0.0033) [2024-06-19 02:56:38,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4327751680. Throughput: 0: 42189.3. Samples: 595360760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:56:38,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 02:56:42,151][26599] Updated weights for policy 0, policy_version 264154 (0.0032) [2024-06-19 02:56:43,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4327931904. Throughput: 0: 42166.1. Samples: 595611960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:56:43,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 02:56:45,517][26599] Updated weights for policy 0, policy_version 264164 (0.0031) [2024-06-19 02:56:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42043.5). Total num frames: 4328177664. Throughput: 0: 42119.1. Samples: 595732240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:56:48,381][26367] Avg episode reward: [(0, '0.785')] [2024-06-19 02:56:49,844][26599] Updated weights for policy 0, policy_version 264174 (0.0040) [2024-06-19 02:56:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42054.9, 300 sec: 41987.5). Total num frames: 4328374272. Throughput: 0: 42075.7. Samples: 595987600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:56:53,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 02:56:53,418][26599] Updated weights for policy 0, policy_version 264184 (0.0037) [2024-06-19 02:56:55,246][26579] Signal inference workers to stop experience collection... 
(8900 times) [2024-06-19 02:56:55,294][26599] InferenceWorker_p0-w0: stopping experience collection (8900 times) [2024-06-19 02:56:55,295][26579] Signal inference workers to resume experience collection... (8900 times) [2024-06-19 02:56:55,309][26599] InferenceWorker_p0-w0: resuming experience collection (8900 times) [2024-06-19 02:56:57,447][26599] Updated weights for policy 0, policy_version 264194 (0.0031) [2024-06-19 02:56:58,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 41932.3). Total num frames: 4328570880. Throughput: 0: 42070.6. Samples: 596239700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:56:58,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 02:57:01,228][26599] Updated weights for policy 0, policy_version 264204 (0.0027) [2024-06-19 02:57:03,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4328800256. Throughput: 0: 42082.4. Samples: 596363820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:57:03,380][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 02:57:05,114][26599] Updated weights for policy 0, policy_version 264214 (0.0029) [2024-06-19 02:57:08,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4329013248. Throughput: 0: 41998.7. Samples: 596618080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:57:08,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 02:57:08,999][26599] Updated weights for policy 0, policy_version 264224 (0.0035) [2024-06-19 02:57:12,994][26599] Updated weights for policy 0, policy_version 264234 (0.0031) [2024-06-19 02:57:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4329209856. Throughput: 0: 41801.9. Samples: 596861580. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:57:13,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 02:57:16,842][26599] Updated weights for policy 0, policy_version 264244 (0.0038) [2024-06-19 02:57:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 4329422848. Throughput: 0: 41980.0. Samples: 596989760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:57:18,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 02:57:20,842][26599] Updated weights for policy 0, policy_version 264254 (0.0042) [2024-06-19 02:57:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4329619456. Throughput: 0: 41862.1. Samples: 597244560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:57:23,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 02:57:24,435][26599] Updated weights for policy 0, policy_version 264264 (0.0038) [2024-06-19 02:57:28,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4329816064. Throughput: 0: 41800.0. Samples: 597492960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 02:57:28,380][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 02:57:28,955][26599] Updated weights for policy 0, policy_version 264274 (0.0038) [2024-06-19 02:57:32,042][26599] Updated weights for policy 0, policy_version 264284 (0.0035) [2024-06-19 02:57:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4330045440. Throughput: 0: 41951.4. Samples: 597620060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:57:33,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 02:57:37,031][26599] Updated weights for policy 0, policy_version 264294 (0.0039) [2024-06-19 02:57:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 4330242048. Throughput: 0: 41887.4. Samples: 597872540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:57:38,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 02:57:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000264297_4330242048.pth... [2024-06-19 02:57:38,452][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000263682_4320165888.pth [2024-06-19 02:57:40,090][26599] Updated weights for policy 0, policy_version 264304 (0.0036) [2024-06-19 02:57:43,384][26367] Fps is (10 sec: 40945.4, 60 sec: 42049.7, 300 sec: 41931.4). Total num frames: 4330455040. Throughput: 0: 41628.2. Samples: 598113120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:57:43,385][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 02:57:44,849][26599] Updated weights for policy 0, policy_version 264314 (0.0035) [2024-06-19 02:57:47,908][26599] Updated weights for policy 0, policy_version 264324 (0.0039) [2024-06-19 02:57:48,384][26367] Fps is (10 sec: 44221.1, 60 sec: 41776.6, 300 sec: 42043.0). Total num frames: 4330684416. Throughput: 0: 41798.3. Samples: 598244900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:57:48,384][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 02:57:52,798][26599] Updated weights for policy 0, policy_version 264334 (0.0037) [2024-06-19 02:57:53,380][26367] Fps is (10 sec: 40975.1, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4330864640. Throughput: 0: 41750.3. Samples: 598496840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:57:53,381][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 02:57:56,340][26599] Updated weights for policy 0, policy_version 264344 (0.0051) [2024-06-19 02:57:58,380][26367] Fps is (10 sec: 42613.6, 60 sec: 42325.3, 300 sec: 42099.1). Total num frames: 4331110400. Throughput: 0: 41737.7. Samples: 598739780. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:57:58,384][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 02:58:00,647][26599] Updated weights for policy 0, policy_version 264354 (0.0028) [2024-06-19 02:58:03,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4331323392. Throughput: 0: 41843.2. Samples: 598872700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:58:03,381][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 02:58:04,015][26599] Updated weights for policy 0, policy_version 264364 (0.0036) [2024-06-19 02:58:08,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41233.1, 300 sec: 41765.3). Total num frames: 4331487232. Throughput: 0: 41721.7. Samples: 599122040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:58:08,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 02:58:08,570][26599] Updated weights for policy 0, policy_version 264374 (0.0033) [2024-06-19 02:58:11,913][26599] Updated weights for policy 0, policy_version 264384 (0.0041) [2024-06-19 02:58:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4331732992. Throughput: 0: 41698.6. Samples: 599369400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:58:13,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 02:58:16,200][26599] Updated weights for policy 0, policy_version 264394 (0.0047) [2024-06-19 02:58:18,383][26367] Fps is (10 sec: 44224.9, 60 sec: 41777.3, 300 sec: 41987.1). Total num frames: 4331929600. Throughput: 0: 41775.3. Samples: 599500060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:58:18,388][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 02:58:19,732][26599] Updated weights for policy 0, policy_version 264404 (0.0033) [2024-06-19 02:58:20,625][26579] Signal inference workers to stop experience collection... 
(8950 times) [2024-06-19 02:58:20,626][26579] Signal inference workers to resume experience collection... (8950 times) [2024-06-19 02:58:20,667][26599] InferenceWorker_p0-w0: stopping experience collection (8950 times) [2024-06-19 02:58:20,667][26599] InferenceWorker_p0-w0: resuming experience collection (8950 times) [2024-06-19 02:58:23,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4332126208. Throughput: 0: 41680.0. Samples: 599748140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:58:23,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 02:58:23,921][26599] Updated weights for policy 0, policy_version 264414 (0.0030) [2024-06-19 02:58:27,598][26599] Updated weights for policy 0, policy_version 264424 (0.0033) [2024-06-19 02:58:28,380][26367] Fps is (10 sec: 42610.2, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4332355584. Throughput: 0: 41983.0. Samples: 600002200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:58:28,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 02:58:31,712][26599] Updated weights for policy 0, policy_version 264434 (0.0042) [2024-06-19 02:58:33,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 4332584960. Throughput: 0: 41989.1. Samples: 600134260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:58:33,381][26367] Avg episode reward: [(0, '0.373')] [2024-06-19 02:58:35,423][26599] Updated weights for policy 0, policy_version 264444 (0.0035) [2024-06-19 02:58:38,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41779.3, 300 sec: 41820.8). Total num frames: 4332748800. Throughput: 0: 42036.1. Samples: 600388460. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 02:58:38,380][26367] Avg episode reward: [(0, '0.254')] [2024-06-19 02:58:39,549][26599] Updated weights for policy 0, policy_version 264454 (0.0047) [2024-06-19 02:58:43,209][26599] Updated weights for policy 0, policy_version 264464 (0.0034) [2024-06-19 02:58:43,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42054.9, 300 sec: 42043.0). Total num frames: 4332978176. Throughput: 0: 42237.0. Samples: 600640440. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:58:43,381][26367] Avg episode reward: [(0, '0.221')] [2024-06-19 02:58:47,107][26599] Updated weights for policy 0, policy_version 264474 (0.0038) [2024-06-19 02:58:48,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41781.7, 300 sec: 41931.9). Total num frames: 4333191168. Throughput: 0: 42161.7. Samples: 600769980. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:58:48,389][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 02:58:51,059][26599] Updated weights for policy 0, policy_version 264484 (0.0046) [2024-06-19 02:58:53,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4333387776. Throughput: 0: 41915.6. Samples: 601008240. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:58:53,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 02:58:54,842][26599] Updated weights for policy 0, policy_version 264494 (0.0041) [2024-06-19 02:58:58,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 4333600768. Throughput: 0: 42125.3. Samples: 601265040. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:58:58,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 02:58:58,695][26599] Updated weights for policy 0, policy_version 264504 (0.0029) [2024-06-19 02:59:02,621][26599] Updated weights for policy 0, policy_version 264514 (0.0040) [2024-06-19 02:59:03,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41779.1, 300 sec: 41987.5). 
Total num frames: 4333830144. Throughput: 0: 42094.1. Samples: 601394180. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:03,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 02:59:06,463][26599] Updated weights for policy 0, policy_version 264524 (0.0048) [2024-06-19 02:59:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4334010368. Throughput: 0: 42066.6. Samples: 601641140. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:08,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 02:59:10,454][26599] Updated weights for policy 0, policy_version 264534 (0.0030) [2024-06-19 02:59:13,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4334223360. Throughput: 0: 41992.5. Samples: 601891860. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:13,380][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 02:59:14,501][26599] Updated weights for policy 0, policy_version 264544 (0.0044) [2024-06-19 02:59:18,384][26367] Fps is (10 sec: 40946.8, 60 sec: 41505.8, 300 sec: 41820.4). Total num frames: 4334419968. Throughput: 0: 41875.2. Samples: 602018780. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:18,384][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 02:59:18,603][26599] Updated weights for policy 0, policy_version 264554 (0.0027) [2024-06-19 02:59:22,100][26599] Updated weights for policy 0, policy_version 264564 (0.0044) [2024-06-19 02:59:23,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42049.8, 300 sec: 41986.9). Total num frames: 4334649344. Throughput: 0: 41729.0. Samples: 602266420. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:23,384][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 02:59:26,508][26599] Updated weights for policy 0, policy_version 264574 (0.0042) [2024-06-19 02:59:28,380][26367] Fps is (10 sec: 45890.2, 60 sec: 42052.2, 300 sec: 42098.5). 
Total num frames: 4334878720. Throughput: 0: 41623.4. Samples: 602513500. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:28,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 02:59:29,819][26599] Updated weights for policy 0, policy_version 264584 (0.0044) [2024-06-19 02:59:33,380][26367] Fps is (10 sec: 40974.8, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 4335058944. Throughput: 0: 41658.2. Samples: 602644600. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:33,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 02:59:34,278][26599] Updated weights for policy 0, policy_version 264594 (0.0043) [2024-06-19 02:59:37,819][26599] Updated weights for policy 0, policy_version 264604 (0.0032) [2024-06-19 02:59:38,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4335271936. Throughput: 0: 41894.6. Samples: 602893500. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:38,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 02:59:38,474][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000264605_4335288320.pth... [2024-06-19 02:59:38,524][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000263990_4325212160.pth [2024-06-19 02:59:41,935][26599] Updated weights for policy 0, policy_version 264614 (0.0042) [2024-06-19 02:59:43,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 41988.0). Total num frames: 4335484928. Throughput: 0: 41724.2. Samples: 603142620. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:43,380][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 02:59:45,855][26599] Updated weights for policy 0, policy_version 264624 (0.0045) [2024-06-19 02:59:47,120][26579] Signal inference workers to stop experience collection... 
(9000 times) [2024-06-19 02:59:47,159][26599] InferenceWorker_p0-w0: stopping experience collection (9000 times) [2024-06-19 02:59:47,170][26579] Signal inference workers to resume experience collection... (9000 times) [2024-06-19 02:59:47,180][26599] InferenceWorker_p0-w0: resuming experience collection (9000 times) [2024-06-19 02:59:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4335697920. Throughput: 0: 41596.1. Samples: 603266000. Policy #0 lag: (min: 2.0, avg: 9.9, max: 21.0) [2024-06-19 02:59:48,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 02:59:49,896][26599] Updated weights for policy 0, policy_version 264634 (0.0037) [2024-06-19 02:59:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4335894528. Throughput: 0: 41736.9. Samples: 603519300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:59:53,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 02:59:53,599][26599] Updated weights for policy 0, policy_version 264644 (0.0042) [2024-06-19 02:59:57,558][26599] Updated weights for policy 0, policy_version 264654 (0.0031) [2024-06-19 02:59:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4336123904. Throughput: 0: 41731.5. Samples: 603769780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 02:59:58,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 03:00:01,427][26599] Updated weights for policy 0, policy_version 264664 (0.0042) [2024-06-19 03:00:03,384][26367] Fps is (10 sec: 44221.0, 60 sec: 41776.7, 300 sec: 41875.9). Total num frames: 4336336896. Throughput: 0: 41734.4. Samples: 603896840. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:03,384][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 03:00:05,166][26599] Updated weights for policy 0, policy_version 264674 (0.0035) [2024-06-19 03:00:08,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4336533504. Throughput: 0: 41822.0. Samples: 604148260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:08,381][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 03:00:09,585][26599] Updated weights for policy 0, policy_version 264684 (0.0039) [2024-06-19 03:00:12,859][26599] Updated weights for policy 0, policy_version 264694 (0.0034) [2024-06-19 03:00:13,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4336762880. Throughput: 0: 41687.7. Samples: 604389440. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:13,380][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 03:00:17,355][26599] Updated weights for policy 0, policy_version 264704 (0.0037) [2024-06-19 03:00:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41781.4, 300 sec: 41765.8). Total num frames: 4336926720. Throughput: 0: 41645.7. Samples: 604518660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:18,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 03:00:20,654][26599] Updated weights for policy 0, policy_version 264714 (0.0038) [2024-06-19 03:00:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 4337172480. Throughput: 0: 41596.4. Samples: 604765340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:23,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 03:00:25,289][26599] Updated weights for policy 0, policy_version 264724 (0.0033) [2024-06-19 03:00:28,380][26367] Fps is (10 sec: 45876.0, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4337385472. Throughput: 0: 41675.5. Samples: 605018020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:28,380][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 03:00:28,545][26599] Updated weights for policy 0, policy_version 264734 (0.0038) [2024-06-19 03:00:33,198][26599] Updated weights for policy 0, policy_version 264744 (0.0025) [2024-06-19 03:00:33,381][26367] Fps is (10 sec: 39321.2, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4337565696. Throughput: 0: 41800.3. Samples: 605147020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:33,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 03:00:36,361][26599] Updated weights for policy 0, policy_version 264754 (0.0029) [2024-06-19 03:00:38,384][26367] Fps is (10 sec: 42582.4, 60 sec: 42322.7, 300 sec: 42042.5). Total num frames: 4337811456. Throughput: 0: 41813.0. Samples: 605401040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:38,384][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 03:00:41,203][26599] Updated weights for policy 0, policy_version 264764 (0.0034) [2024-06-19 03:00:43,380][26367] Fps is (10 sec: 42599.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4337991680. Throughput: 0: 41799.6. Samples: 605650760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:43,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 03:00:44,291][26599] Updated weights for policy 0, policy_version 264774 (0.0038) [2024-06-19 03:00:48,380][26367] Fps is (10 sec: 37697.3, 60 sec: 41506.1, 300 sec: 41821.4). Total num frames: 4338188288. Throughput: 0: 41693.2. Samples: 605772880. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:48,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 03:00:49,211][26599] Updated weights for policy 0, policy_version 264784 (0.0037) [2024-06-19 03:00:51,975][26599] Updated weights for policy 0, policy_version 264794 (0.0035) [2024-06-19 03:00:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4338434048. Throughput: 0: 41727.6. Samples: 606026000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:53,380][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 03:00:57,086][26599] Updated weights for policy 0, policy_version 264804 (0.0033) [2024-06-19 03:00:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4338630656. Throughput: 0: 42076.4. Samples: 606282880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 03:00:58,380][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 03:00:59,761][26599] Updated weights for policy 0, policy_version 264814 (0.0044) [2024-06-19 03:01:00,466][26579] Signal inference workers to stop experience collection... (9050 times) [2024-06-19 03:01:00,481][26599] InferenceWorker_p0-w0: stopping experience collection (9050 times) [2024-06-19 03:01:00,526][26579] Signal inference workers to resume experience collection... (9050 times) [2024-06-19 03:01:00,526][26599] InferenceWorker_p0-w0: resuming experience collection (9050 times) [2024-06-19 03:01:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41508.7, 300 sec: 41820.9). Total num frames: 4338827264. Throughput: 0: 41918.0. Samples: 606404960. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:03,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 03:01:04,664][26599] Updated weights for policy 0, policy_version 264824 (0.0048) [2024-06-19 03:01:07,710][26599] Updated weights for policy 0, policy_version 264834 (0.0029) [2024-06-19 03:01:08,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4339073024. Throughput: 0: 42094.7. Samples: 606659600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:08,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 03:01:12,356][26599] Updated weights for policy 0, policy_version 264844 (0.0037) [2024-06-19 03:01:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41233.1, 300 sec: 41820.9). Total num frames: 4339236864. Throughput: 0: 42197.4. Samples: 606916900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:13,380][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 03:01:15,607][26599] Updated weights for policy 0, policy_version 264854 (0.0030) [2024-06-19 03:01:18,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4339466240. Throughput: 0: 42039.1. Samples: 607038780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:18,381][26367] Avg episode reward: [(0, '0.773')] [2024-06-19 03:01:19,979][26599] Updated weights for policy 0, policy_version 264864 (0.0045) [2024-06-19 03:01:23,311][26599] Updated weights for policy 0, policy_version 264874 (0.0042) [2024-06-19 03:01:23,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4339695616. Throughput: 0: 42102.1. Samples: 607295480. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:23,381][26367] Avg episode reward: [(0, '0.866')] [2024-06-19 03:01:27,751][26599] Updated weights for policy 0, policy_version 264884 (0.0043) [2024-06-19 03:01:28,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 4339875840. Throughput: 0: 42239.3. Samples: 607551540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:28,381][26367] Avg episode reward: [(0, '0.284')] [2024-06-19 03:01:30,929][26599] Updated weights for policy 0, policy_version 264894 (0.0035) [2024-06-19 03:01:33,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 41876.4). Total num frames: 4340105216. Throughput: 0: 42133.4. Samples: 607668880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:33,380][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 03:01:35,299][26599] Updated weights for policy 0, policy_version 264904 (0.0034) [2024-06-19 03:01:38,380][26367] Fps is (10 sec: 44237.4, 60 sec: 41781.8, 300 sec: 41987.5). Total num frames: 4340318208. Throughput: 0: 42188.0. Samples: 607924460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:38,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 03:01:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000264912_4340318208.pth... [2024-06-19 03:01:38,453][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000264297_4330242048.pth [2024-06-19 03:01:39,347][26599] Updated weights for policy 0, policy_version 264914 (0.0026) [2024-06-19 03:01:43,299][26599] Updated weights for policy 0, policy_version 264924 (0.0034) [2024-06-19 03:01:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 4340514816. Throughput: 0: 42052.9. Samples: 608175260. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:43,380][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 03:01:47,307][26599] Updated weights for policy 0, policy_version 264934 (0.0031) [2024-06-19 03:01:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4340727808. Throughput: 0: 42024.8. Samples: 608296080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:48,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 03:01:51,142][26599] Updated weights for policy 0, policy_version 264944 (0.0044) [2024-06-19 03:01:53,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4340957184. Throughput: 0: 42006.3. Samples: 608549880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:53,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 03:01:55,022][26599] Updated weights for policy 0, policy_version 264954 (0.0033) [2024-06-19 03:01:58,384][26367] Fps is (10 sec: 40945.1, 60 sec: 41776.6, 300 sec: 41820.3). Total num frames: 4341137408. Throughput: 0: 41891.2. Samples: 608802160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:01:58,384][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 03:01:58,989][26599] Updated weights for policy 0, policy_version 264964 (0.0028) [2024-06-19 03:02:02,783][26599] Updated weights for policy 0, policy_version 264974 (0.0041) [2024-06-19 03:02:03,384][26367] Fps is (10 sec: 39307.6, 60 sec: 42049.8, 300 sec: 41820.4). Total num frames: 4341350400. Throughput: 0: 41896.0. Samples: 608924240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:02:03,384][26367] Avg episode reward: [(0, '0.411')] [2024-06-19 03:02:06,659][26599] Updated weights for policy 0, policy_version 264984 (0.0029) [2024-06-19 03:02:08,380][26367] Fps is (10 sec: 45892.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4341596160. Throughput: 0: 41925.9. Samples: 609182140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:02:08,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 03:02:10,455][26599] Updated weights for policy 0, policy_version 264994 (0.0043) [2024-06-19 03:02:13,380][26367] Fps is (10 sec: 42613.3, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4341776384. Throughput: 0: 41818.4. Samples: 609433360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:02:13,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 03:02:14,422][26599] Updated weights for policy 0, policy_version 265004 (0.0037) [2024-06-19 03:02:18,104][26599] Updated weights for policy 0, policy_version 265014 (0.0044) [2024-06-19 03:02:18,380][26367] Fps is (10 sec: 39320.8, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4341989376. Throughput: 0: 41842.9. Samples: 609551820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:02:18,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 03:02:22,098][26599] Updated weights for policy 0, policy_version 265024 (0.0032) [2024-06-19 03:02:23,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 4342218752. Throughput: 0: 41894.4. Samples: 609809700. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:02:23,380][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 03:02:25,950][26599] Updated weights for policy 0, policy_version 265034 (0.0034) [2024-06-19 03:02:28,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4342398976. Throughput: 0: 41936.8. Samples: 610062420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:02:28,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 03:02:29,812][26599] Updated weights for policy 0, policy_version 265044 (0.0029) [2024-06-19 03:02:33,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4342628352. Throughput: 0: 41907.5. Samples: 610181920. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:02:33,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 03:02:33,614][26599] Updated weights for policy 0, policy_version 265054 (0.0025) [2024-06-19 03:02:37,608][26599] Updated weights for policy 0, policy_version 265064 (0.0035) [2024-06-19 03:02:37,824][26579] Signal inference workers to stop experience collection... (9100 times) [2024-06-19 03:02:37,824][26579] Signal inference workers to resume experience collection... (9100 times) [2024-06-19 03:02:37,843][26599] InferenceWorker_p0-w0: stopping experience collection (9100 times) [2024-06-19 03:02:37,843][26599] InferenceWorker_p0-w0: resuming experience collection (9100 times) [2024-06-19 03:02:38,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42043.5). Total num frames: 4342857728. Throughput: 0: 42122.6. Samples: 610445400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:02:38,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 03:02:41,376][26599] Updated weights for policy 0, policy_version 265074 (0.0043) [2024-06-19 03:02:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.1, 300 sec: 41821.4). Total num frames: 4343021568. Throughput: 0: 42173.6. Samples: 610699820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:02:43,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 03:02:45,438][26599] Updated weights for policy 0, policy_version 265084 (0.0030) [2024-06-19 03:02:48,381][26367] Fps is (10 sec: 40957.0, 60 sec: 42324.8, 300 sec: 42042.9). Total num frames: 4343267328. Throughput: 0: 42049.2. Samples: 610816340. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:02:48,382][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 03:02:49,075][26599] Updated weights for policy 0, policy_version 265094 (0.0030) [2024-06-19 03:02:53,294][26599] Updated weights for policy 0, policy_version 265104 (0.0025) [2024-06-19 03:02:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4343463936. Throughput: 0: 42074.1. Samples: 611075480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:02:53,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 03:02:56,716][26599] Updated weights for policy 0, policy_version 265114 (0.0051) [2024-06-19 03:02:58,380][26367] Fps is (10 sec: 37685.9, 60 sec: 41781.7, 300 sec: 41765.3). Total num frames: 4343644160. Throughput: 0: 42133.7. Samples: 611329380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:02:58,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 03:03:00,990][26599] Updated weights for policy 0, policy_version 265124 (0.0035) [2024-06-19 03:03:03,384][26367] Fps is (10 sec: 42583.4, 60 sec: 42325.2, 300 sec: 42042.5). Total num frames: 4343889920. Throughput: 0: 42120.3. Samples: 611447380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:03:03,384][26367] Avg episode reward: [(0, '0.845')] [2024-06-19 03:03:05,066][26599] Updated weights for policy 0, policy_version 265134 (0.0038) [2024-06-19 03:03:08,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4344086528. Throughput: 0: 42094.6. Samples: 611703960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:03:08,380][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 03:03:08,768][26599] Updated weights for policy 0, policy_version 265144 (0.0028) [2024-06-19 03:03:13,060][26599] Updated weights for policy 0, policy_version 265154 (0.0040) [2024-06-19 03:03:13,380][26367] Fps is (10 sec: 39335.7, 60 sec: 41779.2, 300 sec: 41876.8). 
Total num frames: 4344283136. Throughput: 0: 41978.2. Samples: 611951440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:03:13,381][26367] Avg episode reward: [(0, '0.802')] [2024-06-19 03:03:16,655][26599] Updated weights for policy 0, policy_version 265164 (0.0042) [2024-06-19 03:03:18,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.4, 300 sec: 41932.0). Total num frames: 4344496128. Throughput: 0: 42092.1. Samples: 612076060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:03:18,380][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 03:03:21,005][26599] Updated weights for policy 0, policy_version 265174 (0.0036) [2024-06-19 03:03:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41506.0, 300 sec: 41876.4). Total num frames: 4344709120. Throughput: 0: 41779.5. Samples: 612325480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:03:23,381][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 03:03:24,479][26599] Updated weights for policy 0, policy_version 265184 (0.0038) [2024-06-19 03:03:28,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4344905728. Throughput: 0: 41611.6. Samples: 612572340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:03:28,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 03:03:28,748][26599] Updated weights for policy 0, policy_version 265194 (0.0032) [2024-06-19 03:03:32,178][26599] Updated weights for policy 0, policy_version 265204 (0.0039) [2024-06-19 03:03:33,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4345151488. Throughput: 0: 41907.0. Samples: 612702120. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:03:33,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 03:03:36,691][26599] Updated weights for policy 0, policy_version 265214 (0.0029) [2024-06-19 03:03:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 41876.4). 
Total num frames: 4345331712. Throughput: 0: 41761.0. Samples: 612954720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:03:38,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 03:03:38,450][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000265219_4345348096.pth... [2024-06-19 03:03:38,504][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000264605_4335288320.pth [2024-06-19 03:03:40,132][26599] Updated weights for policy 0, policy_version 265224 (0.0030) [2024-06-19 03:03:43,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4345544704. Throughput: 0: 41674.7. Samples: 613204740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:03:43,381][26367] Avg episode reward: [(0, '0.244')] [2024-06-19 03:03:44,368][26599] Updated weights for policy 0, policy_version 265234 (0.0040) [2024-06-19 03:03:47,887][26599] Updated weights for policy 0, policy_version 265244 (0.0028) [2024-06-19 03:03:48,380][26367] Fps is (10 sec: 44236.2, 60 sec: 41779.7, 300 sec: 41987.5). Total num frames: 4345774080. Throughput: 0: 41923.3. Samples: 613333780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:03:48,381][26367] Avg episode reward: [(0, '0.287')] [2024-06-19 03:03:52,167][26599] Updated weights for policy 0, policy_version 265254 (0.0045) [2024-06-19 03:03:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4345954304. Throughput: 0: 41765.7. Samples: 613583420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:03:53,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 03:03:55,774][26599] Updated weights for policy 0, policy_version 265264 (0.0038) [2024-06-19 03:03:58,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 4346183680. Throughput: 0: 41830.8. Samples: 613833820. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:03:58,380][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 03:04:00,026][26599] Updated weights for policy 0, policy_version 265274 (0.0029) [2024-06-19 03:04:03,373][26599] Updated weights for policy 0, policy_version 265284 (0.0031) [2024-06-19 03:04:03,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 4346413056. Throughput: 0: 41964.8. Samples: 613964480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:04:03,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 03:04:07,779][26599] Updated weights for policy 0, policy_version 265294 (0.0032) [2024-06-19 03:04:08,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4346593280. Throughput: 0: 41862.7. Samples: 614209300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:04:08,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 03:04:10,823][26579] Signal inference workers to stop experience collection... (9150 times) [2024-06-19 03:04:10,829][26579] Signal inference workers to resume experience collection... (9150 times) [2024-06-19 03:04:10,872][26599] InferenceWorker_p0-w0: stopping experience collection (9150 times) [2024-06-19 03:04:10,872][26599] InferenceWorker_p0-w0: resuming experience collection (9150 times) [2024-06-19 03:04:11,799][26599] Updated weights for policy 0, policy_version 265304 (0.0033) [2024-06-19 03:04:13,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 41987.9). Total num frames: 4346806272. Throughput: 0: 41962.3. Samples: 614460640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:04:13,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 03:04:15,511][26599] Updated weights for policy 0, policy_version 265314 (0.0036) [2024-06-19 03:04:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 41876.9). Total num frames: 4347002880. Throughput: 0: 41946.3. 
Samples: 614589700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:04:18,380][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 03:04:19,653][26599] Updated weights for policy 0, policy_version 265324 (0.0029) [2024-06-19 03:04:23,256][26599] Updated weights for policy 0, policy_version 265334 (0.0033) [2024-06-19 03:04:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4347232256. Throughput: 0: 41840.4. Samples: 614837540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:04:23,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 03:04:27,371][26599] Updated weights for policy 0, policy_version 265344 (0.0027) [2024-06-19 03:04:28,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 4347445248. Throughput: 0: 41875.9. Samples: 615089160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:04:28,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 03:04:30,896][26599] Updated weights for policy 0, policy_version 265354 (0.0039) [2024-06-19 03:04:33,382][26367] Fps is (10 sec: 40954.6, 60 sec: 41505.2, 300 sec: 41931.7). Total num frames: 4347641856. Throughput: 0: 41803.3. Samples: 615214980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 03:04:33,382][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 03:04:35,067][26599] Updated weights for policy 0, policy_version 265364 (0.0032) [2024-06-19 03:04:38,380][26367] Fps is (10 sec: 40961.1, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4347854848. Throughput: 0: 42002.8. Samples: 615473540. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:04:38,380][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 03:04:38,655][26599] Updated weights for policy 0, policy_version 265374 (0.0039) [2024-06-19 03:04:42,769][26599] Updated weights for policy 0, policy_version 265384 (0.0039) [2024-06-19 03:04:43,380][26367] Fps is (10 sec: 40965.8, 60 sec: 41779.3, 300 sec: 41876.4). Total num frames: 4348051456. Throughput: 0: 41913.7. Samples: 615719940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:04:43,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 03:04:46,699][26599] Updated weights for policy 0, policy_version 265394 (0.0038) [2024-06-19 03:04:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4348280832. Throughput: 0: 41796.9. Samples: 615845340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:04:48,380][26367] Avg episode reward: [(0, '0.361')] [2024-06-19 03:04:50,754][26599] Updated weights for policy 0, policy_version 265404 (0.0040) [2024-06-19 03:04:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4348493824. Throughput: 0: 41983.2. Samples: 616098540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:04:53,380][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 03:04:54,783][26599] Updated weights for policy 0, policy_version 265414 (0.0036) [2024-06-19 03:04:58,380][26367] Fps is (10 sec: 39320.6, 60 sec: 41505.9, 300 sec: 41821.3). Total num frames: 4348674048. Throughput: 0: 41845.6. Samples: 616343700. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:04:58,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 03:04:58,714][26599] Updated weights for policy 0, policy_version 265424 (0.0030) [2024-06-19 03:05:02,377][26599] Updated weights for policy 0, policy_version 265434 (0.0029) [2024-06-19 03:05:03,381][26367] Fps is (10 sec: 40956.0, 60 sec: 41505.5, 300 sec: 41931.8). Total num frames: 4348903424. Throughput: 0: 41762.7. Samples: 616469060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:05:03,382][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 03:05:06,342][26599] Updated weights for policy 0, policy_version 265444 (0.0040) [2024-06-19 03:05:08,380][26367] Fps is (10 sec: 44238.2, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4349116416. Throughput: 0: 42018.8. Samples: 616728380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:05:08,380][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 03:05:09,865][26599] Updated weights for policy 0, policy_version 265454 (0.0051) [2024-06-19 03:05:13,380][26367] Fps is (10 sec: 40963.9, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4349313024. Throughput: 0: 42089.1. Samples: 616983160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:05:13,380][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 03:05:14,071][26599] Updated weights for policy 0, policy_version 265464 (0.0032) [2024-06-19 03:05:17,442][26599] Updated weights for policy 0, policy_version 265474 (0.0029) [2024-06-19 03:05:18,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 41876.4). Total num frames: 4349526016. Throughput: 0: 42016.8. Samples: 617105680. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:05:18,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 03:05:21,908][26599] Updated weights for policy 0, policy_version 265484 (0.0030) [2024-06-19 03:05:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4349739008. Throughput: 0: 41897.3. Samples: 617358920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:05:23,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 03:05:25,097][26599] Updated weights for policy 0, policy_version 265494 (0.0043) [2024-06-19 03:05:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4349935616. Throughput: 0: 42071.5. Samples: 617613160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:05:28,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 03:05:29,702][26599] Updated weights for policy 0, policy_version 265504 (0.0044) [2024-06-19 03:05:32,347][26579] Signal inference workers to stop experience collection... (9200 times) [2024-06-19 03:05:32,348][26579] Signal inference workers to resume experience collection... (9200 times) [2024-06-19 03:05:32,363][26599] InferenceWorker_p0-w0: stopping experience collection (9200 times) [2024-06-19 03:05:32,363][26599] InferenceWorker_p0-w0: resuming experience collection (9200 times) [2024-06-19 03:05:33,029][26599] Updated weights for policy 0, policy_version 265514 (0.0034) [2024-06-19 03:05:33,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42326.3, 300 sec: 41932.5). Total num frames: 4350181376. Throughput: 0: 42060.8. Samples: 617738080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:05:33,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 03:05:37,397][26599] Updated weights for policy 0, policy_version 265524 (0.0037) [2024-06-19 03:05:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4350361600. Throughput: 0: 42199.5. 
Samples: 617997520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:05:38,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 03:05:38,523][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000265526_4350377984.pth... [2024-06-19 03:05:38,584][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000264912_4340318208.pth [2024-06-19 03:05:40,867][26599] Updated weights for policy 0, policy_version 265534 (0.0024) [2024-06-19 03:05:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 4350590976. Throughput: 0: 42184.1. Samples: 618241980. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 03:05:43,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 03:05:44,917][26599] Updated weights for policy 0, policy_version 265544 (0.0028) [2024-06-19 03:05:48,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4350803968. Throughput: 0: 42414.7. Samples: 618377680. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:05:48,380][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 03:05:48,618][26599] Updated weights for policy 0, policy_version 265554 (0.0037) [2024-06-19 03:05:53,079][26599] Updated weights for policy 0, policy_version 265564 (0.0035) [2024-06-19 03:05:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4351000576. Throughput: 0: 42201.2. Samples: 618627440. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:05:53,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 03:05:56,767][26599] Updated weights for policy 0, policy_version 265574 (0.0035) [2024-06-19 03:05:58,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 4351229952. Throughput: 0: 41949.7. Samples: 618870900. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:05:58,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 03:06:01,209][26599] Updated weights for policy 0, policy_version 265584 (0.0039) [2024-06-19 03:06:03,382][26367] Fps is (10 sec: 42592.5, 60 sec: 42051.9, 300 sec: 41876.2). Total num frames: 4351426560. Throughput: 0: 42304.1. Samples: 619009420. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:03,382][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 03:06:04,365][26599] Updated weights for policy 0, policy_version 265594 (0.0035) [2024-06-19 03:06:08,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4351623168. Throughput: 0: 42262.3. Samples: 619260720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:08,380][26367] Avg episode reward: [(0, '0.336')] [2024-06-19 03:06:08,763][26599] Updated weights for policy 0, policy_version 265604 (0.0029) [2024-06-19 03:06:11,992][26599] Updated weights for policy 0, policy_version 265614 (0.0038) [2024-06-19 03:06:13,380][26367] Fps is (10 sec: 44243.6, 60 sec: 42598.5, 300 sec: 42043.1). Total num frames: 4351868928. Throughput: 0: 41986.4. Samples: 619502540. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:13,380][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 03:06:16,569][26599] Updated weights for policy 0, policy_version 265624 (0.0034) [2024-06-19 03:06:18,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4352065536. Throughput: 0: 42310.2. Samples: 619642040. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:18,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 03:06:19,579][26599] Updated weights for policy 0, policy_version 265634 (0.0033) [2024-06-19 03:06:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4352278528. Throughput: 0: 42124.0. Samples: 619893100. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:23,381][26367] Avg episode reward: [(0, '0.830')] [2024-06-19 03:06:24,527][26599] Updated weights for policy 0, policy_version 265644 (0.0037) [2024-06-19 03:06:27,604][26599] Updated weights for policy 0, policy_version 265654 (0.0040) [2024-06-19 03:06:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4352491520. Throughput: 0: 42187.2. Samples: 620140400. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:28,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 03:06:32,226][26599] Updated weights for policy 0, policy_version 265664 (0.0046) [2024-06-19 03:06:33,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4352688128. Throughput: 0: 41966.0. Samples: 620266160. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:33,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 03:06:35,465][26599] Updated weights for policy 0, policy_version 265674 (0.0032) [2024-06-19 03:06:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 4352917504. Throughput: 0: 41963.1. Samples: 620515780. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:38,381][26367] Avg episode reward: [(0, '0.792')] [2024-06-19 03:06:39,869][26599] Updated weights for policy 0, policy_version 265684 (0.0037) [2024-06-19 03:06:43,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 4353114112. Throughput: 0: 42130.4. Samples: 620766760. 
Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:43,380][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 03:06:43,566][26599] Updated weights for policy 0, policy_version 265694 (0.0034) [2024-06-19 03:06:47,496][26599] Updated weights for policy 0, policy_version 265704 (0.0035) [2024-06-19 03:06:48,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4353327104. Throughput: 0: 41764.5. Samples: 620888760. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:48,380][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 03:06:51,443][26599] Updated weights for policy 0, policy_version 265714 (0.0029) [2024-06-19 03:06:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42043.5). Total num frames: 4353540096. Throughput: 0: 41813.8. Samples: 621142340. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:53,380][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 03:06:55,199][26599] Updated weights for policy 0, policy_version 265724 (0.0045) [2024-06-19 03:06:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41988.0). Total num frames: 4353736704. Throughput: 0: 42226.2. Samples: 621402720. Policy #0 lag: (min: 1.0, avg: 10.1, max: 24.0) [2024-06-19 03:06:58,380][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 03:06:59,091][26599] Updated weights for policy 0, policy_version 265734 (0.0029) [2024-06-19 03:07:03,039][26599] Updated weights for policy 0, policy_version 265744 (0.0032) [2024-06-19 03:07:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42053.3, 300 sec: 41876.4). Total num frames: 4353949696. Throughput: 0: 41774.3. Samples: 621521880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:03,380][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 03:07:06,854][26599] Updated weights for policy 0, policy_version 265754 (0.0035) [2024-06-19 03:07:08,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 41987.5). Total num frames: 4354162688. Throughput: 0: 41854.6. Samples: 621776560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:08,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 03:07:10,599][26599] Updated weights for policy 0, policy_version 265764 (0.0041) [2024-06-19 03:07:13,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 4354375680. Throughput: 0: 42052.9. Samples: 622032780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:13,381][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 03:07:14,699][26599] Updated weights for policy 0, policy_version 265774 (0.0050) [2024-06-19 03:07:18,108][26599] Updated weights for policy 0, policy_version 265784 (0.0051) [2024-06-19 03:07:18,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4354605056. Throughput: 0: 42075.3. Samples: 622159540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:18,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 03:07:22,502][26599] Updated weights for policy 0, policy_version 265794 (0.0037) [2024-06-19 03:07:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4354801664. Throughput: 0: 42276.5. Samples: 622418220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:23,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 03:07:25,130][26579] Signal inference workers to stop experience collection... (9250 times) [2024-06-19 03:07:25,137][26579] Signal inference workers to resume experience collection... 
(9250 times) [2024-06-19 03:07:25,188][26599] InferenceWorker_p0-w0: stopping experience collection (9250 times) [2024-06-19 03:07:25,188][26599] InferenceWorker_p0-w0: resuming experience collection (9250 times) [2024-06-19 03:07:26,188][26599] Updated weights for policy 0, policy_version 265804 (0.0039) [2024-06-19 03:07:28,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4355014656. Throughput: 0: 42194.6. Samples: 622665520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:28,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 03:07:30,143][26599] Updated weights for policy 0, policy_version 265814 (0.0032) [2024-06-19 03:07:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 41931.9). Total num frames: 4355227648. Throughput: 0: 42343.0. Samples: 622794200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:33,381][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 03:07:33,801][26599] Updated weights for policy 0, policy_version 265824 (0.0048) [2024-06-19 03:07:38,271][26599] Updated weights for policy 0, policy_version 265834 (0.0045) [2024-06-19 03:07:38,384][26367] Fps is (10 sec: 40944.7, 60 sec: 41776.7, 300 sec: 42042.5). Total num frames: 4355424256. Throughput: 0: 42424.0. Samples: 623051580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:38,384][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 03:07:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000265834_4355424256.pth... [2024-06-19 03:07:38,452][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000265219_4345348096.pth [2024-06-19 03:07:41,479][26599] Updated weights for policy 0, policy_version 265844 (0.0033) [2024-06-19 03:07:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 41932.0). Total num frames: 4355637248. Throughput: 0: 42117.6. Samples: 623298020. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:43,385][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 03:07:46,329][26599] Updated weights for policy 0, policy_version 265854 (0.0030) [2024-06-19 03:07:48,380][26367] Fps is (10 sec: 45892.3, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4355883008. Throughput: 0: 42491.5. Samples: 623434000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:48,381][26367] Avg episode reward: [(0, '0.386')] [2024-06-19 03:07:49,194][26599] Updated weights for policy 0, policy_version 265864 (0.0033) [2024-06-19 03:07:53,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 4356063232. Throughput: 0: 42497.4. Samples: 623688940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:53,381][26367] Avg episode reward: [(0, '0.316')] [2024-06-19 03:07:54,085][26599] Updated weights for policy 0, policy_version 265874 (0.0034) [2024-06-19 03:07:56,873][26599] Updated weights for policy 0, policy_version 265884 (0.0037) [2024-06-19 03:07:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42043.5). Total num frames: 4356292608. Throughput: 0: 42267.1. Samples: 623934800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:07:58,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 03:08:01,652][26599] Updated weights for policy 0, policy_version 265894 (0.0041) [2024-06-19 03:08:03,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 4356505600. Throughput: 0: 42424.9. Samples: 624068660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:08:03,380][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 03:08:04,491][26599] Updated weights for policy 0, policy_version 265904 (0.0038) [2024-06-19 03:08:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4356718592. Throughput: 0: 42339.1. Samples: 624323480. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 24.0) [2024-06-19 03:08:08,381][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 03:08:09,134][26599] Updated weights for policy 0, policy_version 265914 (0.0034) [2024-06-19 03:08:12,058][26599] Updated weights for policy 0, policy_version 265924 (0.0038) [2024-06-19 03:08:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4356931584. Throughput: 0: 42388.9. Samples: 624573020. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:13,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 03:08:16,714][26599] Updated weights for policy 0, policy_version 265934 (0.0030) [2024-06-19 03:08:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 4357128192. Throughput: 0: 42452.9. Samples: 624704580. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:18,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 03:08:19,989][26599] Updated weights for policy 0, policy_version 265944 (0.0038) [2024-06-19 03:08:23,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4357341184. Throughput: 0: 42307.1. Samples: 624955240. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:23,380][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 03:08:24,504][26599] Updated weights for policy 0, policy_version 265954 (0.0033) [2024-06-19 03:08:27,559][26599] Updated weights for policy 0, policy_version 265964 (0.0032) [2024-06-19 03:08:28,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 4357586944. Throughput: 0: 42449.3. Samples: 625208240. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:28,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 03:08:32,232][26599] Updated weights for policy 0, policy_version 265974 (0.0041) [2024-06-19 03:08:33,384][26367] Fps is (10 sec: 39306.9, 60 sec: 41776.7, 300 sec: 42042.5). 
Total num frames: 4357734400. Throughput: 0: 42370.3. Samples: 625340820. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:33,384][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 03:08:35,515][26599] Updated weights for policy 0, policy_version 265984 (0.0047) [2024-06-19 03:08:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42601.0, 300 sec: 42154.1). Total num frames: 4357980160. Throughput: 0: 42121.7. Samples: 625584420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:38,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 03:08:39,934][26599] Updated weights for policy 0, policy_version 265994 (0.0037) [2024-06-19 03:08:43,380][26367] Fps is (10 sec: 45891.7, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4358193152. Throughput: 0: 42216.0. Samples: 625834520. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:43,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 03:08:43,577][26599] Updated weights for policy 0, policy_version 266004 (0.0045) [2024-06-19 03:08:47,704][26599] Updated weights for policy 0, policy_version 266014 (0.0034) [2024-06-19 03:08:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4358389760. Throughput: 0: 42126.6. Samples: 625964360. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:48,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 03:08:51,507][26599] Updated weights for policy 0, policy_version 266024 (0.0033) [2024-06-19 03:08:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4358619136. Throughput: 0: 41939.6. Samples: 626210760. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:53,384][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 03:08:55,331][26599] Updated weights for policy 0, policy_version 266034 (0.0042) [2024-06-19 03:08:56,703][26579] Signal inference workers to stop experience collection... 
(9300 times) [2024-06-19 03:08:56,744][26599] InferenceWorker_p0-w0: stopping experience collection (9300 times) [2024-06-19 03:08:56,752][26579] Signal inference workers to resume experience collection... (9300 times) [2024-06-19 03:08:56,761][26599] InferenceWorker_p0-w0: resuming experience collection (9300 times) [2024-06-19 03:08:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4358799360. Throughput: 0: 42048.5. Samples: 626465200. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:08:58,381][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 03:08:59,465][26599] Updated weights for policy 0, policy_version 266044 (0.0027) [2024-06-19 03:09:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4359012352. Throughput: 0: 41744.0. Samples: 626583060. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:09:03,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 03:09:03,682][26599] Updated weights for policy 0, policy_version 266054 (0.0033) [2024-06-19 03:09:07,650][26599] Updated weights for policy 0, policy_version 266064 (0.0036) [2024-06-19 03:09:08,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4359241728. Throughput: 0: 41764.3. Samples: 626834640. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:09:08,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 03:09:11,676][26599] Updated weights for policy 0, policy_version 266074 (0.0031) [2024-06-19 03:09:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 4359421952. Throughput: 0: 41723.1. Samples: 627085780. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:09:13,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 03:09:15,394][26599] Updated weights for policy 0, policy_version 266084 (0.0034) [2024-06-19 03:09:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4359634944. Throughput: 0: 41500.6. Samples: 627208200. Policy #0 lag: (min: 1.0, avg: 8.9, max: 21.0) [2024-06-19 03:09:18,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 03:09:19,439][26599] Updated weights for policy 0, policy_version 266094 (0.0043) [2024-06-19 03:09:23,199][26599] Updated weights for policy 0, policy_version 266104 (0.0034) [2024-06-19 03:09:23,380][26367] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 42043.1). Total num frames: 4359847936. Throughput: 0: 41886.8. Samples: 627469320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:09:23,380][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 03:09:26,982][26599] Updated weights for policy 0, policy_version 266114 (0.0037) [2024-06-19 03:09:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 40960.0, 300 sec: 42043.2). Total num frames: 4360044544. Throughput: 0: 41995.1. Samples: 627724300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:09:28,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 03:09:31,047][26599] Updated weights for policy 0, policy_version 266124 (0.0030) [2024-06-19 03:09:33,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42600.9, 300 sec: 42154.1). Total num frames: 4360290304. Throughput: 0: 41886.6. Samples: 627849260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:09:33,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 03:09:35,270][26599] Updated weights for policy 0, policy_version 266134 (0.0025) [2024-06-19 03:09:38,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 4360454144. Throughput: 0: 41962.3. Samples: 628099060. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:09:38,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 03:09:38,473][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000266142_4360470528.pth... [2024-06-19 03:09:38,534][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000265526_4350377984.pth [2024-06-19 03:09:38,892][26599] Updated weights for policy 0, policy_version 266144 (0.0036) [2024-06-19 03:09:43,060][26599] Updated weights for policy 0, policy_version 266154 (0.0027) [2024-06-19 03:09:43,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4360683520. Throughput: 0: 41933.8. Samples: 628352220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:09:43,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 03:09:46,661][26599] Updated weights for policy 0, policy_version 266164 (0.0033) [2024-06-19 03:09:48,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 4360912896. Throughput: 0: 42040.5. Samples: 628474880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:09:48,380][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 03:09:50,667][26599] Updated weights for policy 0, policy_version 266174 (0.0032) [2024-06-19 03:09:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41233.2, 300 sec: 42098.6). Total num frames: 4361093120. Throughput: 0: 42021.1. Samples: 628725580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:09:53,380][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 03:09:54,648][26599] Updated weights for policy 0, policy_version 266184 (0.0032) [2024-06-19 03:09:58,380][26599] Updated weights for policy 0, policy_version 266194 (0.0026) [2024-06-19 03:09:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42098.7). Total num frames: 4361322496. Throughput: 0: 42100.0. Samples: 628980280. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:09:58,381][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 03:10:02,318][26599] Updated weights for policy 0, policy_version 266204 (0.0046) [2024-06-19 03:10:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 4361535488. Throughput: 0: 42191.8. Samples: 629106820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:10:03,380][26367] Avg episode reward: [(0, '0.348')] [2024-06-19 03:10:06,076][26599] Updated weights for policy 0, policy_version 266214 (0.0044) [2024-06-19 03:10:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 4361732096. Throughput: 0: 42099.8. Samples: 629363820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:10:08,384][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 03:10:09,955][26599] Updated weights for policy 0, policy_version 266224 (0.0033) [2024-06-19 03:10:13,381][26367] Fps is (10 sec: 42593.8, 60 sec: 42324.7, 300 sec: 42154.0). Total num frames: 4361961472. Throughput: 0: 41869.8. Samples: 629608480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:10:13,382][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 03:10:13,676][26599] Updated weights for policy 0, policy_version 266234 (0.0033) [2024-06-19 03:10:16,697][26579] Signal inference workers to stop experience collection... (9350 times) [2024-06-19 03:10:16,697][26579] Signal inference workers to resume experience collection... (9350 times) [2024-06-19 03:10:16,709][26599] InferenceWorker_p0-w0: stopping experience collection (9350 times) [2024-06-19 03:10:16,709][26599] InferenceWorker_p0-w0: resuming experience collection (9350 times) [2024-06-19 03:10:17,718][26599] Updated weights for policy 0, policy_version 266244 (0.0034) [2024-06-19 03:10:18,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42052.5, 300 sec: 42098.6). Total num frames: 4362158080. Throughput: 0: 41952.2. 
Samples: 629737100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:10:18,380][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 03:10:21,310][26599] Updated weights for policy 0, policy_version 266254 (0.0024) [2024-06-19 03:10:23,384][26367] Fps is (10 sec: 42587.0, 60 sec: 42322.7, 300 sec: 42209.1). Total num frames: 4362387456. Throughput: 0: 42151.7. Samples: 629996040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:10:23,384][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 03:10:25,246][26599] Updated weights for policy 0, policy_version 266264 (0.0028) [2024-06-19 03:10:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 4362584064. Throughput: 0: 42124.5. Samples: 630247820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 03:10:28,380][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 03:10:28,980][26599] Updated weights for policy 0, policy_version 266274 (0.0041) [2024-06-19 03:10:33,069][26599] Updated weights for policy 0, policy_version 266284 (0.0047) [2024-06-19 03:10:33,380][26367] Fps is (10 sec: 40974.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4362797056. Throughput: 0: 42249.7. Samples: 630376120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:10:33,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 03:10:36,630][26599] Updated weights for policy 0, policy_version 266294 (0.0031) [2024-06-19 03:10:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4363010048. Throughput: 0: 42207.0. Samples: 630624900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:10:38,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 03:10:40,639][26599] Updated weights for policy 0, policy_version 266304 (0.0036) [2024-06-19 03:10:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4363223040. Throughput: 0: 42102.8. 
Samples: 630874900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:10:43,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 03:10:44,654][26599] Updated weights for policy 0, policy_version 266314 (0.0040) [2024-06-19 03:10:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4363436032. Throughput: 0: 42239.4. Samples: 631007600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:10:48,381][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 03:10:48,754][26599] Updated weights for policy 0, policy_version 266324 (0.0032) [2024-06-19 03:10:52,532][26599] Updated weights for policy 0, policy_version 266334 (0.0032) [2024-06-19 03:10:53,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 4363632640. Throughput: 0: 42024.9. Samples: 631254940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:10:53,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 03:10:56,303][26599] Updated weights for policy 0, policy_version 266344 (0.0034) [2024-06-19 03:10:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42209.8). Total num frames: 4363878400. Throughput: 0: 42214.7. Samples: 631508100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:10:58,380][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 03:11:00,183][26599] Updated weights for policy 0, policy_version 266354 (0.0033) [2024-06-19 03:11:03,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4364058624. Throughput: 0: 42264.4. Samples: 631639000. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:11:03,380][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 03:11:04,201][26599] Updated weights for policy 0, policy_version 266364 (0.0036) [2024-06-19 03:11:08,147][26599] Updated weights for policy 0, policy_version 266374 (0.0034) [2024-06-19 03:11:08,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4364271616. Throughput: 0: 41934.4. Samples: 631882940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:11:08,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 03:11:11,801][26599] Updated weights for policy 0, policy_version 266384 (0.0052) [2024-06-19 03:11:13,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42326.0, 300 sec: 42154.1). Total num frames: 4364500992. Throughput: 0: 41909.2. Samples: 632133740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:11:13,381][26367] Avg episode reward: [(0, '0.781')] [2024-06-19 03:11:16,574][26599] Updated weights for policy 0, policy_version 266394 (0.0033) [2024-06-19 03:11:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 4364697600. Throughput: 0: 42066.2. Samples: 632269100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:11:18,381][26367] Avg episode reward: [(0, '0.756')] [2024-06-19 03:11:19,401][26599] Updated weights for policy 0, policy_version 266404 (0.0037) [2024-06-19 03:11:23,384][26367] Fps is (10 sec: 39307.4, 60 sec: 41779.2, 300 sec: 42042.5). Total num frames: 4364894208. Throughput: 0: 41914.0. Samples: 632511180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:11:23,384][26367] Avg episode reward: [(0, '0.810')] [2024-06-19 03:11:24,279][26599] Updated weights for policy 0, policy_version 266414 (0.0035) [2024-06-19 03:11:27,116][26599] Updated weights for policy 0, policy_version 266424 (0.0039) [2024-06-19 03:11:28,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42098.6). 
Total num frames: 4365107200. Throughput: 0: 41933.8. Samples: 632761920. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:11:28,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 03:11:31,938][26599] Updated weights for policy 0, policy_version 266434 (0.0030) [2024-06-19 03:11:33,380][26367] Fps is (10 sec: 42613.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4365320192. Throughput: 0: 41922.7. Samples: 632894120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:11:33,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 03:11:35,197][26599] Updated weights for policy 0, policy_version 266444 (0.0027) [2024-06-19 03:11:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4365533184. Throughput: 0: 41986.3. Samples: 633144320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:11:38,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 03:11:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000266451_4365533184.pth... [2024-06-19 03:11:38,451][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000265834_4355424256.pth [2024-06-19 03:11:40,054][26599] Updated weights for policy 0, policy_version 266454 (0.0052) [2024-06-19 03:11:42,925][26599] Updated weights for policy 0, policy_version 266464 (0.0026) [2024-06-19 03:11:43,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4365762560. Throughput: 0: 41799.6. Samples: 633389080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:11:43,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 03:11:47,053][26579] Signal inference workers to stop experience collection... (9400 times) [2024-06-19 03:11:47,053][26579] Signal inference workers to resume experience collection... 
(9400 times) [2024-06-19 03:11:47,100][26599] InferenceWorker_p0-w0: stopping experience collection (9400 times) [2024-06-19 03:11:47,100][26599] InferenceWorker_p0-w0: resuming experience collection (9400 times) [2024-06-19 03:11:47,620][26599] Updated weights for policy 0, policy_version 266474 (0.0041) [2024-06-19 03:11:48,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 41987.4). Total num frames: 4365926400. Throughput: 0: 41826.6. Samples: 633521200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:11:48,381][26367] Avg episode reward: [(0, '0.286')] [2024-06-19 03:11:50,683][26599] Updated weights for policy 0, policy_version 266484 (0.0030) [2024-06-19 03:11:53,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4366172160. Throughput: 0: 41844.4. Samples: 633765940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:11:53,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 03:11:55,271][26599] Updated weights for policy 0, policy_version 266494 (0.0039) [2024-06-19 03:11:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41506.0, 300 sec: 42098.5). Total num frames: 4366368768. Throughput: 0: 41915.5. Samples: 634019940. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:11:58,381][26367] Avg episode reward: [(0, '0.865')] [2024-06-19 03:11:58,863][26599] Updated weights for policy 0, policy_version 266504 (0.0038) [2024-06-19 03:12:03,238][26599] Updated weights for policy 0, policy_version 266514 (0.0027) [2024-06-19 03:12:03,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4366565376. Throughput: 0: 41674.3. Samples: 634144440. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:03,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 03:12:06,723][26599] Updated weights for policy 0, policy_version 266524 (0.0033) [2024-06-19 03:12:08,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4366811136. Throughput: 0: 41949.9. Samples: 634398780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:08,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 03:12:11,046][26599] Updated weights for policy 0, policy_version 266534 (0.0031) [2024-06-19 03:12:13,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41506.0, 300 sec: 41987.4). Total num frames: 4366991360. Throughput: 0: 41842.5. Samples: 634644840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:13,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 03:12:14,552][26599] Updated weights for policy 0, policy_version 266544 (0.0052) [2024-06-19 03:12:18,380][26367] Fps is (10 sec: 37683.1, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4367187968. Throughput: 0: 41731.0. Samples: 634772020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:18,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 03:12:18,719][26599] Updated weights for policy 0, policy_version 266554 (0.0032) [2024-06-19 03:12:22,534][26599] Updated weights for policy 0, policy_version 266564 (0.0046) [2024-06-19 03:12:23,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41781.7, 300 sec: 41987.5). Total num frames: 4367400960. Throughput: 0: 41806.8. Samples: 635025620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:23,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 03:12:26,475][26599] Updated weights for policy 0, policy_version 266574 (0.0030) [2024-06-19 03:12:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 4367630336. Throughput: 0: 41884.2. Samples: 635273880. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:28,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 03:12:30,590][26599] Updated weights for policy 0, policy_version 266584 (0.0032) [2024-06-19 03:12:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42043.5). Total num frames: 4367826944. Throughput: 0: 41848.4. Samples: 635404380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:33,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 03:12:34,127][26599] Updated weights for policy 0, policy_version 266594 (0.0032) [2024-06-19 03:12:38,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4368023552. Throughput: 0: 41905.0. Samples: 635651660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:38,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 03:12:38,505][26599] Updated weights for policy 0, policy_version 266604 (0.0041) [2024-06-19 03:12:42,338][26599] Updated weights for policy 0, policy_version 266614 (0.0025) [2024-06-19 03:12:43,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 4368269312. Throughput: 0: 41621.4. Samples: 635892900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:43,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 03:12:46,338][26599] Updated weights for policy 0, policy_version 266624 (0.0034) [2024-06-19 03:12:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4368433152. Throughput: 0: 41818.7. Samples: 636026280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:48,380][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 03:12:50,160][26599] Updated weights for policy 0, policy_version 266634 (0.0049) [2024-06-19 03:12:53,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4368662528. Throughput: 0: 41608.9. Samples: 636271180. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 03:12:53,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 03:12:53,953][26599] Updated weights for policy 0, policy_version 266644 (0.0027) [2024-06-19 03:12:57,900][26599] Updated weights for policy 0, policy_version 266654 (0.0037) [2024-06-19 03:12:58,384][26367] Fps is (10 sec: 44220.5, 60 sec: 41776.7, 300 sec: 41931.4). Total num frames: 4368875520. Throughput: 0: 41753.6. Samples: 636523900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:12:58,384][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 03:13:01,637][26599] Updated weights for policy 0, policy_version 266664 (0.0033) [2024-06-19 03:13:03,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 4369055744. Throughput: 0: 41735.7. Samples: 636650120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:03,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 03:13:05,547][26599] Updated weights for policy 0, policy_version 266674 (0.0039) [2024-06-19 03:13:08,380][26367] Fps is (10 sec: 42613.4, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4369301504. Throughput: 0: 41638.5. Samples: 636899360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:08,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 03:13:09,051][26579] Signal inference workers to stop experience collection... (9450 times) [2024-06-19 03:13:09,089][26599] InferenceWorker_p0-w0: stopping experience collection (9450 times) [2024-06-19 03:13:09,166][26579] Signal inference workers to resume experience collection... (9450 times) [2024-06-19 03:13:09,166][26599] InferenceWorker_p0-w0: resuming experience collection (9450 times) [2024-06-19 03:13:09,307][26599] Updated weights for policy 0, policy_version 266684 (0.0027) [2024-06-19 03:13:13,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41779.4, 300 sec: 41931.9). Total num frames: 4369498112. Throughput: 0: 41990.4. 
Samples: 637163440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:13,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 03:13:13,441][26599] Updated weights for policy 0, policy_version 266694 (0.0030) [2024-06-19 03:13:16,983][26599] Updated weights for policy 0, policy_version 266704 (0.0038) [2024-06-19 03:13:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4369694720. Throughput: 0: 41746.6. Samples: 637282980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:18,386][26367] Avg episode reward: [(0, '0.841')] [2024-06-19 03:13:21,251][26599] Updated weights for policy 0, policy_version 266714 (0.0023) [2024-06-19 03:13:23,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4369940480. Throughput: 0: 41817.3. Samples: 637533440. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:23,381][26367] Avg episode reward: [(0, '0.404')] [2024-06-19 03:13:24,718][26599] Updated weights for policy 0, policy_version 266724 (0.0037) [2024-06-19 03:13:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.2, 300 sec: 41988.0). Total num frames: 4370120704. Throughput: 0: 42364.4. Samples: 637799300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:28,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 03:13:28,979][26599] Updated weights for policy 0, policy_version 266734 (0.0035) [2024-06-19 03:13:32,635][26599] Updated weights for policy 0, policy_version 266744 (0.0032) [2024-06-19 03:13:33,384][26367] Fps is (10 sec: 39307.5, 60 sec: 41776.7, 300 sec: 41875.9). Total num frames: 4370333696. Throughput: 0: 41888.6. Samples: 637911420. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:33,385][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 03:13:36,834][26599] Updated weights for policy 0, policy_version 266754 (0.0035) [2024-06-19 03:13:38,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4370579456. Throughput: 0: 42091.7. Samples: 638165300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:38,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 03:13:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000266759_4370579456.pth... [2024-06-19 03:13:38,447][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000266142_4360470528.pth [2024-06-19 03:13:40,555][26599] Updated weights for policy 0, policy_version 266764 (0.0030) [2024-06-19 03:13:43,380][26367] Fps is (10 sec: 40975.1, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 4370743296. Throughput: 0: 42265.7. Samples: 638425700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:43,381][26367] Avg episode reward: [(0, '0.314')] [2024-06-19 03:13:44,491][26599] Updated weights for policy 0, policy_version 266774 (0.0029) [2024-06-19 03:13:48,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4370972672. Throughput: 0: 41988.9. Samples: 638539620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:48,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 03:13:48,519][26599] Updated weights for policy 0, policy_version 266784 (0.0036) [2024-06-19 03:13:52,209][26599] Updated weights for policy 0, policy_version 266794 (0.0034) [2024-06-19 03:13:53,380][26367] Fps is (10 sec: 47513.1, 60 sec: 42598.5, 300 sec: 42098.5). Total num frames: 4371218432. Throughput: 0: 42200.9. Samples: 638798400. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:53,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 03:13:56,402][26599] Updated weights for policy 0, policy_version 266804 (0.0042) [2024-06-19 03:13:58,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41508.6, 300 sec: 41876.4). Total num frames: 4371365888. Throughput: 0: 41997.6. Samples: 639053340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:13:58,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 03:14:00,023][26599] Updated weights for policy 0, policy_version 266814 (0.0039) [2024-06-19 03:14:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 41931.9). Total num frames: 4371611648. Throughput: 0: 41870.7. Samples: 639167160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:14:03,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 03:14:04,212][26599] Updated weights for policy 0, policy_version 266824 (0.0042) [2024-06-19 03:14:07,770][26599] Updated weights for policy 0, policy_version 266834 (0.0030) [2024-06-19 03:14:08,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4371824640. Throughput: 0: 42059.1. Samples: 639426100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:08,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 03:14:12,033][26599] Updated weights for policy 0, policy_version 266844 (0.0036) [2024-06-19 03:14:13,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 4372021248. Throughput: 0: 41817.3. Samples: 639681080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:13,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 03:14:15,708][26599] Updated weights for policy 0, policy_version 266854 (0.0029) [2024-06-19 03:14:18,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 4372250624. Throughput: 0: 42019.8. Samples: 639802160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:18,385][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 03:14:19,832][26599] Updated weights for policy 0, policy_version 266864 (0.0048) [2024-06-19 03:14:23,380][26367] Fps is (10 sec: 42599.5, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4372447232. Throughput: 0: 42065.9. Samples: 640058260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:23,380][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 03:14:23,451][26599] Updated weights for policy 0, policy_version 266874 (0.0035) [2024-06-19 03:14:27,525][26599] Updated weights for policy 0, policy_version 266884 (0.0037) [2024-06-19 03:14:28,380][26367] Fps is (10 sec: 37683.8, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4372627456. Throughput: 0: 41768.0. Samples: 640305260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:28,380][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 03:14:29,888][26579] Signal inference workers to stop experience collection... (9500 times) [2024-06-19 03:14:29,888][26579] Signal inference workers to resume experience collection... (9500 times) [2024-06-19 03:14:29,930][26599] InferenceWorker_p0-w0: stopping experience collection (9500 times) [2024-06-19 03:14:29,930][26599] InferenceWorker_p0-w0: resuming experience collection (9500 times) [2024-06-19 03:14:31,177][26599] Updated weights for policy 0, policy_version 266894 (0.0039) [2024-06-19 03:14:33,380][26367] Fps is (10 sec: 42597.1, 60 sec: 42327.8, 300 sec: 42098.5). Total num frames: 4372873216. Throughput: 0: 41996.8. Samples: 640429480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:33,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 03:14:35,187][26599] Updated weights for policy 0, policy_version 266904 (0.0038) [2024-06-19 03:14:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41233.1, 300 sec: 41931.9). Total num frames: 4373053440. Throughput: 0: 42029.9. 
Samples: 640689740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:38,380][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 03:14:38,958][26599] Updated weights for policy 0, policy_version 266914 (0.0035) [2024-06-19 03:14:42,829][26599] Updated weights for policy 0, policy_version 266924 (0.0043) [2024-06-19 03:14:43,385][26367] Fps is (10 sec: 40939.4, 60 sec: 42321.6, 300 sec: 41931.2). Total num frames: 4373282816. Throughput: 0: 41767.3. Samples: 640933080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:43,386][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 03:14:46,939][26599] Updated weights for policy 0, policy_version 266934 (0.0034) [2024-06-19 03:14:48,380][26367] Fps is (10 sec: 42597.4, 60 sec: 41779.1, 300 sec: 41987.4). Total num frames: 4373479424. Throughput: 0: 42043.9. Samples: 641059140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:48,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 03:14:50,930][26599] Updated weights for policy 0, policy_version 266944 (0.0048) [2024-06-19 03:14:53,380][26367] Fps is (10 sec: 42620.2, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4373708800. Throughput: 0: 42004.0. Samples: 641316280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:53,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 03:14:54,527][26599] Updated weights for policy 0, policy_version 266954 (0.0034) [2024-06-19 03:14:58,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 41987.5). Total num frames: 4373921792. Throughput: 0: 41919.3. Samples: 641567440. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:14:58,380][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 03:14:58,517][26599] Updated weights for policy 0, policy_version 266964 (0.0045) [2024-06-19 03:15:02,923][26599] Updated weights for policy 0, policy_version 266974 (0.0039) [2024-06-19 03:15:03,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4374118400. Throughput: 0: 42009.9. Samples: 641692600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:15:03,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 03:15:06,859][26599] Updated weights for policy 0, policy_version 266984 (0.0028) [2024-06-19 03:15:08,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 41987.6). Total num frames: 4374347776. Throughput: 0: 41968.2. Samples: 641946840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:15:08,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 03:15:10,498][26599] Updated weights for policy 0, policy_version 266994 (0.0024) [2024-06-19 03:15:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4374528000. Throughput: 0: 42057.2. Samples: 642197840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:15:13,381][26367] Avg episode reward: [(0, '0.818')] [2024-06-19 03:15:14,503][26599] Updated weights for policy 0, policy_version 267004 (0.0048) [2024-06-19 03:15:18,070][26599] Updated weights for policy 0, policy_version 267014 (0.0028) [2024-06-19 03:15:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41932.4). Total num frames: 4374757376. Throughput: 0: 42157.0. Samples: 642326540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 03:15:18,381][26367] Avg episode reward: [(0, '0.831')] [2024-06-19 03:15:22,228][26599] Updated weights for policy 0, policy_version 267024 (0.0035) [2024-06-19 03:15:23,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4374970368. Throughput: 0: 42111.1. Samples: 642584740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:15:23,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 03:15:25,886][26599] Updated weights for policy 0, policy_version 267034 (0.0030) [2024-06-19 03:15:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 4375183360. Throughput: 0: 42288.9. Samples: 642835860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:15:28,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 03:15:29,977][26599] Updated weights for policy 0, policy_version 267044 (0.0043) [2024-06-19 03:15:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.4, 300 sec: 41931.9). Total num frames: 4375379968. Throughput: 0: 42286.0. Samples: 642962000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:15:33,380][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 03:15:33,871][26599] Updated weights for policy 0, policy_version 267054 (0.0043) [2024-06-19 03:15:37,672][26599] Updated weights for policy 0, policy_version 267064 (0.0046) [2024-06-19 03:15:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4375592960. Throughput: 0: 42237.9. Samples: 643216980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:15:38,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 03:15:38,473][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000267066_4375609344.pth... 
[2024-06-19 03:15:38,560][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000266451_4365533184.pth [2024-06-19 03:15:41,583][26599] Updated weights for policy 0, policy_version 267074 (0.0034) [2024-06-19 03:15:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42056.0, 300 sec: 41932.0). Total num frames: 4375805952. Throughput: 0: 42265.4. Samples: 643469380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:15:43,380][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 03:15:45,351][26599] Updated weights for policy 0, policy_version 267084 (0.0034) [2024-06-19 03:15:48,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 4376035328. Throughput: 0: 42282.2. Samples: 643595300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:15:48,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 03:15:49,171][26599] Updated weights for policy 0, policy_version 267094 (0.0031) [2024-06-19 03:15:53,322][26599] Updated weights for policy 0, policy_version 267104 (0.0037) [2024-06-19 03:15:53,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 41876.4). Total num frames: 4376231936. Throughput: 0: 42316.5. Samples: 643851080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:15:53,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 03:15:56,766][26599] Updated weights for policy 0, policy_version 267114 (0.0024) [2024-06-19 03:15:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4376444928. Throughput: 0: 42301.0. Samples: 644101380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:15:58,380][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 03:16:00,044][26579] Signal inference workers to stop experience collection... (9550 times) [2024-06-19 03:16:00,045][26579] Signal inference workers to resume experience collection... 
(9550 times) [2024-06-19 03:16:00,084][26599] InferenceWorker_p0-w0: stopping experience collection (9550 times) [2024-06-19 03:16:00,084][26599] InferenceWorker_p0-w0: resuming experience collection (9550 times) [2024-06-19 03:16:01,145][26599] Updated weights for policy 0, policy_version 267124 (0.0036) [2024-06-19 03:16:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4376657920. Throughput: 0: 42251.5. Samples: 644227860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:16:03,381][26367] Avg episode reward: [(0, '0.823')] [2024-06-19 03:16:05,020][26599] Updated weights for policy 0, policy_version 267134 (0.0039) [2024-06-19 03:16:08,381][26367] Fps is (10 sec: 40957.4, 60 sec: 41778.9, 300 sec: 41876.3). Total num frames: 4376854528. Throughput: 0: 42086.5. Samples: 644478660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:16:08,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 03:16:08,724][26599] Updated weights for policy 0, policy_version 267144 (0.0033) [2024-06-19 03:16:12,614][26599] Updated weights for policy 0, policy_version 267154 (0.0030) [2024-06-19 03:16:13,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 41932.0). Total num frames: 4377067520. Throughput: 0: 42033.4. Samples: 644727360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:16:13,380][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 03:16:16,378][26599] Updated weights for policy 0, policy_version 267164 (0.0032) [2024-06-19 03:16:18,380][26367] Fps is (10 sec: 42600.5, 60 sec: 42052.2, 300 sec: 41988.0). Total num frames: 4377280512. Throughput: 0: 42037.1. Samples: 644853680. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:16:18,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 03:16:20,199][26599] Updated weights for policy 0, policy_version 267174 (0.0034) [2024-06-19 03:16:23,383][26367] Fps is (10 sec: 40949.9, 60 sec: 41777.5, 300 sec: 41931.6). Total num frames: 4377477120. Throughput: 0: 42046.2. Samples: 645109160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:16:23,383][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 03:16:24,200][26599] Updated weights for policy 0, policy_version 267184 (0.0036) [2024-06-19 03:16:27,899][26599] Updated weights for policy 0, policy_version 267194 (0.0034) [2024-06-19 03:16:28,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4377722880. Throughput: 0: 41891.4. Samples: 645354500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:16:28,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 03:16:31,980][26599] Updated weights for policy 0, policy_version 267204 (0.0030) [2024-06-19 03:16:33,380][26367] Fps is (10 sec: 44247.5, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4377919488. Throughput: 0: 42112.0. Samples: 645490340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:16:33,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 03:16:35,579][26599] Updated weights for policy 0, policy_version 267214 (0.0028) [2024-06-19 03:16:38,380][26367] Fps is (10 sec: 37683.8, 60 sec: 41779.3, 300 sec: 41820.9). Total num frames: 4378099712. Throughput: 0: 41955.7. Samples: 645739080. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:16:38,380][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 03:16:39,726][26599] Updated weights for policy 0, policy_version 267224 (0.0023) [2024-06-19 03:16:43,336][26599] Updated weights for policy 0, policy_version 267234 (0.0030) [2024-06-19 03:16:43,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42595.7, 300 sec: 42153.6). Total num frames: 4378361856. Throughput: 0: 41869.9. Samples: 645985680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:16:43,384][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 03:16:47,388][26599] Updated weights for policy 0, policy_version 267244 (0.0031) [2024-06-19 03:16:48,380][26367] Fps is (10 sec: 44235.9, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4378542080. Throughput: 0: 42010.2. Samples: 646118320. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:16:48,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 03:16:51,143][26599] Updated weights for policy 0, policy_version 267254 (0.0045) [2024-06-19 03:16:53,380][26367] Fps is (10 sec: 37696.7, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4378738688. Throughput: 0: 41903.6. Samples: 646364300. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:16:53,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 03:16:55,266][26599] Updated weights for policy 0, policy_version 267264 (0.0033) [2024-06-19 03:16:58,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4378968064. Throughput: 0: 42074.2. Samples: 646620700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:16:58,380][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 03:16:58,992][26599] Updated weights for policy 0, policy_version 267274 (0.0049) [2024-06-19 03:17:03,102][26599] Updated weights for policy 0, policy_version 267284 (0.0035) [2024-06-19 03:17:03,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 41987.5). 
Total num frames: 4379197440. Throughput: 0: 42095.7. Samples: 646747980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:17:03,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 03:17:06,684][26599] Updated weights for policy 0, policy_version 267294 (0.0035) [2024-06-19 03:17:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.7, 300 sec: 41987.5). Total num frames: 4379377664. Throughput: 0: 41871.2. Samples: 646993260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:17:08,380][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 03:17:09,252][26579] Signal inference workers to stop experience collection... (9600 times) [2024-06-19 03:17:09,300][26599] InferenceWorker_p0-w0: stopping experience collection (9600 times) [2024-06-19 03:17:09,304][26579] Signal inference workers to resume experience collection... (9600 times) [2024-06-19 03:17:09,320][26599] InferenceWorker_p0-w0: resuming experience collection (9600 times) [2024-06-19 03:17:10,811][26599] Updated weights for policy 0, policy_version 267304 (0.0034) [2024-06-19 03:17:13,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4379590656. Throughput: 0: 42161.0. Samples: 647251740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:17:13,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 03:17:15,045][26599] Updated weights for policy 0, policy_version 267314 (0.0035) [2024-06-19 03:17:18,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 4379820032. Throughput: 0: 41942.3. Samples: 647377740. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:17:18,381][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 03:17:18,840][26599] Updated weights for policy 0, policy_version 267324 (0.0040) [2024-06-19 03:17:22,794][26599] Updated weights for policy 0, policy_version 267334 (0.0042) [2024-06-19 03:17:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42054.0, 300 sec: 41932.0). Total num frames: 4380000256. Throughput: 0: 41937.8. Samples: 647626280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:17:23,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 03:17:26,486][26599] Updated weights for policy 0, policy_version 267344 (0.0028) [2024-06-19 03:17:28,384][26367] Fps is (10 sec: 40944.4, 60 sec: 41776.7, 300 sec: 42042.5). Total num frames: 4380229632. Throughput: 0: 42227.1. Samples: 647885900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:17:28,384][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 03:17:30,465][26599] Updated weights for policy 0, policy_version 267354 (0.0042) [2024-06-19 03:17:33,384][26367] Fps is (10 sec: 44221.9, 60 sec: 42050.0, 300 sec: 42098.1). Total num frames: 4380442624. Throughput: 0: 42053.0. Samples: 648010840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:17:33,384][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 03:17:34,071][26599] Updated weights for policy 0, policy_version 267364 (0.0036) [2024-06-19 03:17:38,252][26599] Updated weights for policy 0, policy_version 267374 (0.0034) [2024-06-19 03:17:38,380][26367] Fps is (10 sec: 42614.3, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4380655616. Throughput: 0: 42257.0. Samples: 648265860. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 03:17:38,380][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 03:17:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000267374_4380655616.pth... 
[2024-06-19 03:17:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000266759_4370579456.pth [2024-06-19 03:17:41,834][26599] Updated weights for policy 0, policy_version 267384 (0.0040) [2024-06-19 03:17:43,380][26367] Fps is (10 sec: 42611.5, 60 sec: 41781.6, 300 sec: 42154.1). Total num frames: 4380868608. Throughput: 0: 42196.6. Samples: 648519560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:17:43,381][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 03:17:45,861][26599] Updated weights for policy 0, policy_version 267394 (0.0040) [2024-06-19 03:17:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 4381065216. Throughput: 0: 42109.0. Samples: 648642880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:17:48,380][26367] Avg episode reward: [(0, '0.406')] [2024-06-19 03:17:49,382][26599] Updated weights for policy 0, policy_version 267404 (0.0036) [2024-06-19 03:17:53,380][26367] Fps is (10 sec: 40961.1, 60 sec: 42325.4, 300 sec: 42043.5). Total num frames: 4381278208. Throughput: 0: 42255.1. Samples: 648894740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:17:53,380][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 03:17:53,556][26599] Updated weights for policy 0, policy_version 267414 (0.0037) [2024-06-19 03:17:57,201][26599] Updated weights for policy 0, policy_version 267424 (0.0042) [2024-06-19 03:17:58,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42322.7, 300 sec: 42209.1). Total num frames: 4381507584. Throughput: 0: 42222.3. Samples: 649151900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:17:58,384][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 03:18:01,247][26599] Updated weights for policy 0, policy_version 267434 (0.0035) [2024-06-19 03:18:03,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4381687808. Throughput: 0: 42244.3. Samples: 649278740. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:03,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 03:18:04,984][26599] Updated weights for policy 0, policy_version 267444 (0.0038) [2024-06-19 03:18:08,380][26367] Fps is (10 sec: 42613.1, 60 sec: 42598.2, 300 sec: 42154.1). Total num frames: 4381933568. Throughput: 0: 42337.5. Samples: 649531480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:08,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 03:18:09,388][26599] Updated weights for policy 0, policy_version 267454 (0.0034) [2024-06-19 03:18:12,937][26599] Updated weights for policy 0, policy_version 267464 (0.0030) [2024-06-19 03:18:13,384][26367] Fps is (10 sec: 45858.6, 60 sec: 42595.7, 300 sec: 42209.1). Total num frames: 4382146560. Throughput: 0: 42120.0. Samples: 649781300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:13,385][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 03:18:16,951][26599] Updated weights for policy 0, policy_version 267474 (0.0041) [2024-06-19 03:18:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 4382343168. Throughput: 0: 42206.6. Samples: 649910000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:18,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 03:18:20,594][26599] Updated weights for policy 0, policy_version 267484 (0.0028) [2024-06-19 03:18:23,380][26367] Fps is (10 sec: 40975.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4382556160. Throughput: 0: 42212.4. Samples: 650165420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:23,380][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 03:18:24,921][26599] Updated weights for policy 0, policy_version 267494 (0.0027) [2024-06-19 03:18:25,706][26579] Signal inference workers to stop experience collection... 
(9650 times) [2024-06-19 03:18:25,706][26579] Signal inference workers to resume experience collection... (9650 times) [2024-06-19 03:18:25,729][26599] InferenceWorker_p0-w0: stopping experience collection (9650 times) [2024-06-19 03:18:25,729][26599] InferenceWorker_p0-w0: resuming experience collection (9650 times) [2024-06-19 03:18:28,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42054.9, 300 sec: 42099.1). Total num frames: 4382752768. Throughput: 0: 42310.5. Samples: 650423520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:28,380][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 03:18:28,647][26599] Updated weights for policy 0, policy_version 267504 (0.0032) [2024-06-19 03:18:32,495][26599] Updated weights for policy 0, policy_version 267514 (0.0032) [2024-06-19 03:18:33,380][26367] Fps is (10 sec: 40958.9, 60 sec: 42054.4, 300 sec: 41987.4). Total num frames: 4382965760. Throughput: 0: 42151.7. Samples: 650539720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:33,381][26367] Avg episode reward: [(0, '0.801')] [2024-06-19 03:18:36,967][26599] Updated weights for policy 0, policy_version 267524 (0.0038) [2024-06-19 03:18:38,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4383195136. Throughput: 0: 42338.5. Samples: 650799980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:38,384][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 03:18:40,381][26599] Updated weights for policy 0, policy_version 267534 (0.0037) [2024-06-19 03:18:43,384][26367] Fps is (10 sec: 40945.7, 60 sec: 41776.8, 300 sec: 42042.5). Total num frames: 4383375360. Throughput: 0: 42205.3. Samples: 651051140. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:43,385][26367] Avg episode reward: [(0, '0.361')] [2024-06-19 03:18:44,678][26599] Updated weights for policy 0, policy_version 267544 (0.0034) [2024-06-19 03:18:48,237][26599] Updated weights for policy 0, policy_version 267554 (0.0040) [2024-06-19 03:18:48,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42322.6, 300 sec: 41987.0). Total num frames: 4383604736. Throughput: 0: 41958.8. Samples: 651167040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:48,385][26367] Avg episode reward: [(0, '0.338')] [2024-06-19 03:18:52,401][26599] Updated weights for policy 0, policy_version 267564 (0.0028) [2024-06-19 03:18:53,380][26367] Fps is (10 sec: 45891.9, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 4383834112. Throughput: 0: 42189.4. Samples: 651430000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 20.0) [2024-06-19 03:18:53,381][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 03:18:56,043][26599] Updated weights for policy 0, policy_version 267574 (0.0035) [2024-06-19 03:18:58,381][26367] Fps is (10 sec: 40971.8, 60 sec: 41781.1, 300 sec: 42042.9). Total num frames: 4384014336. Throughput: 0: 42153.8. Samples: 651678100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:18:58,382][26367] Avg episode reward: [(0, '0.372')] [2024-06-19 03:19:00,251][26599] Updated weights for policy 0, policy_version 267584 (0.0038) [2024-06-19 03:19:03,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4384227328. Throughput: 0: 42107.5. Samples: 651804840. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:03,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 03:19:03,698][26599] Updated weights for policy 0, policy_version 267594 (0.0035) [2024-06-19 03:19:07,963][26599] Updated weights for policy 0, policy_version 267604 (0.0038) [2024-06-19 03:19:08,380][26367] Fps is (10 sec: 44240.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4384456704. Throughput: 0: 42265.6. Samples: 652067380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:08,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 03:19:11,503][26599] Updated weights for policy 0, policy_version 267614 (0.0039) [2024-06-19 03:19:13,384][26367] Fps is (10 sec: 42583.3, 60 sec: 41779.2, 300 sec: 42042.5). Total num frames: 4384653312. Throughput: 0: 41842.3. Samples: 652306580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:13,384][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 03:19:15,748][26599] Updated weights for policy 0, policy_version 267624 (0.0040) [2024-06-19 03:19:18,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4384849920. Throughput: 0: 42202.0. Samples: 652438800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:18,380][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 03:19:19,058][26599] Updated weights for policy 0, policy_version 267634 (0.0038) [2024-06-19 03:19:23,380][26367] Fps is (10 sec: 40975.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4385062912. Throughput: 0: 41921.1. Samples: 652686420. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:23,380][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 03:19:23,413][26599] Updated weights for policy 0, policy_version 267644 (0.0031) [2024-06-19 03:19:27,031][26599] Updated weights for policy 0, policy_version 267654 (0.0027) [2024-06-19 03:19:28,384][26367] Fps is (10 sec: 45858.1, 60 sec: 42595.7, 300 sec: 42153.6). Total num frames: 4385308672. Throughput: 0: 41895.1. Samples: 652936420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:28,384][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 03:19:31,304][26599] Updated weights for policy 0, policy_version 267664 (0.0040) [2024-06-19 03:19:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.5, 300 sec: 42154.1). Total num frames: 4385488896. Throughput: 0: 42252.5. Samples: 653068240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:33,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 03:19:34,876][26599] Updated weights for policy 0, policy_version 267674 (0.0043) [2024-06-19 03:19:38,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42052.3, 300 sec: 42154.8). Total num frames: 4385718272. Throughput: 0: 41915.1. Samples: 653316180. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:38,388][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 03:19:38,422][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000267683_4385718272.pth... [2024-06-19 03:19:38,484][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000267066_4375609344.pth [2024-06-19 03:19:38,862][26599] Updated weights for policy 0, policy_version 267684 (0.0032) [2024-06-19 03:19:42,088][26579] Signal inference workers to stop experience collection... (9700 times) [2024-06-19 03:19:42,092][26579] Signal inference workers to resume experience collection... 
(9700 times) [2024-06-19 03:19:42,135][26599] InferenceWorker_p0-w0: stopping experience collection (9700 times) [2024-06-19 03:19:42,135][26599] InferenceWorker_p0-w0: resuming experience collection (9700 times) [2024-06-19 03:19:42,835][26599] Updated weights for policy 0, policy_version 267694 (0.0040) [2024-06-19 03:19:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42601.0, 300 sec: 42209.7). Total num frames: 4385931264. Throughput: 0: 42037.7. Samples: 653569760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:43,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 03:19:46,485][26599] Updated weights for policy 0, policy_version 267704 (0.0034) [2024-06-19 03:19:48,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41781.8, 300 sec: 42043.0). Total num frames: 4386111488. Throughput: 0: 41909.5. Samples: 653690760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:48,381][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 03:19:50,547][26599] Updated weights for policy 0, policy_version 267714 (0.0040) [2024-06-19 03:19:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4386373632. Throughput: 0: 41761.9. Samples: 653946660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:53,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 03:19:54,927][26599] Updated weights for policy 0, policy_version 267724 (0.0031) [2024-06-19 03:19:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.8, 300 sec: 42098.5). Total num frames: 4386537472. Throughput: 0: 42157.2. Samples: 654203500. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:19:58,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 03:19:58,427][26599] Updated weights for policy 0, policy_version 267734 (0.0032) [2024-06-19 03:20:02,275][26599] Updated weights for policy 0, policy_version 267744 (0.0043) [2024-06-19 03:20:03,380][26367] Fps is (10 sec: 36044.9, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4386734080. Throughput: 0: 41823.1. Samples: 654320840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 03:20:03,380][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 03:20:06,026][26599] Updated weights for policy 0, policy_version 267754 (0.0025) [2024-06-19 03:20:08,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42049.8, 300 sec: 42209.1). Total num frames: 4386979840. Throughput: 0: 42156.9. Samples: 654583640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:08,384][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 03:20:09,631][26599] Updated weights for policy 0, policy_version 267764 (0.0025) [2024-06-19 03:20:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41781.8, 300 sec: 42043.0). Total num frames: 4387160064. Throughput: 0: 42297.2. Samples: 654839640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:13,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 03:20:14,008][26599] Updated weights for policy 0, policy_version 267774 (0.0031) [2024-06-19 03:20:17,127][26599] Updated weights for policy 0, policy_version 267784 (0.0028) [2024-06-19 03:20:18,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 4387389440. Throughput: 0: 42002.1. Samples: 654958340. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:18,381][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 03:20:21,954][26599] Updated weights for policy 0, policy_version 267794 (0.0024) [2024-06-19 03:20:23,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4387618816. Throughput: 0: 42179.7. Samples: 655214260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:23,380][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 03:20:25,391][26599] Updated weights for policy 0, policy_version 267804 (0.0027) [2024-06-19 03:20:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41508.7, 300 sec: 42098.5). Total num frames: 4387799040. Throughput: 0: 42141.8. Samples: 655466140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:28,389][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 03:20:29,815][26599] Updated weights for policy 0, policy_version 267814 (0.0036) [2024-06-19 03:20:33,079][26599] Updated weights for policy 0, policy_version 267824 (0.0025) [2024-06-19 03:20:33,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42322.7, 300 sec: 42153.6). Total num frames: 4388028416. Throughput: 0: 42235.2. Samples: 655591500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:33,384][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 03:20:37,350][26599] Updated weights for policy 0, policy_version 267834 (0.0043) [2024-06-19 03:20:38,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42049.7, 300 sec: 42153.6). Total num frames: 4388241408. Throughput: 0: 42317.4. Samples: 655851100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:38,384][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 03:20:40,942][26599] Updated weights for policy 0, policy_version 267844 (0.0036) [2024-06-19 03:20:43,380][26367] Fps is (10 sec: 42613.6, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4388454400. Throughput: 0: 42107.1. Samples: 656098320. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:43,383][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 03:20:44,969][26599] Updated weights for policy 0, policy_version 267854 (0.0031) [2024-06-19 03:20:48,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4388651008. Throughput: 0: 42310.2. Samples: 656224800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:48,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 03:20:48,890][26599] Updated weights for policy 0, policy_version 267864 (0.0043) [2024-06-19 03:20:52,996][26599] Updated weights for policy 0, policy_version 267874 (0.0033) [2024-06-19 03:20:53,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 4388864000. Throughput: 0: 42140.8. Samples: 656479820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:53,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 03:20:56,475][26599] Updated weights for policy 0, policy_version 267884 (0.0039) [2024-06-19 03:20:58,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42595.8, 300 sec: 42153.6). Total num frames: 4389093376. Throughput: 0: 42076.9. Samples: 656733260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:20:58,385][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 03:21:00,911][26599] Updated weights for policy 0, policy_version 267894 (0.0044) [2024-06-19 03:21:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42154.2). Total num frames: 4389289984. Throughput: 0: 42402.7. Samples: 656866460. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:21:03,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 03:21:04,295][26599] Updated weights for policy 0, policy_version 267904 (0.0039) [2024-06-19 03:21:08,380][26367] Fps is (10 sec: 37697.6, 60 sec: 41508.7, 300 sec: 42043.0). Total num frames: 4389470208. Throughput: 0: 42092.0. Samples: 657108400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:21:08,380][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 03:21:08,647][26599] Updated weights for policy 0, policy_version 267914 (0.0027) [2024-06-19 03:21:12,314][26599] Updated weights for policy 0, policy_version 267924 (0.0034) [2024-06-19 03:21:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 4389699584. Throughput: 0: 42148.9. Samples: 657362840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 25.0) [2024-06-19 03:21:13,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 03:21:14,132][26579] Signal inference workers to stop experience collection... (9750 times) [2024-06-19 03:21:14,141][26579] Signal inference workers to resume experience collection... (9750 times) [2024-06-19 03:21:14,167][26599] InferenceWorker_p0-w0: stopping experience collection (9750 times) [2024-06-19 03:21:14,167][26599] InferenceWorker_p0-w0: resuming experience collection (9750 times) [2024-06-19 03:21:16,347][26599] Updated weights for policy 0, policy_version 267934 (0.0035) [2024-06-19 03:21:18,380][26367] Fps is (10 sec: 45874.2, 60 sec: 42325.3, 300 sec: 42210.0). Total num frames: 4389928960. Throughput: 0: 42267.3. Samples: 657493380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:21:18,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 03:21:19,943][26599] Updated weights for policy 0, policy_version 267944 (0.0033) [2024-06-19 03:21:23,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41779.0, 300 sec: 42043.0). Total num frames: 4390125568. Throughput: 0: 42040.1. Samples: 657742760. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:21:23,388][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 03:21:24,136][26599] Updated weights for policy 0, policy_version 267954 (0.0038) [2024-06-19 03:21:27,521][26599] Updated weights for policy 0, policy_version 267964 (0.0039) [2024-06-19 03:21:28,384][26367] Fps is (10 sec: 39307.6, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 4390322176. Throughput: 0: 42096.2. Samples: 657992800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:21:28,384][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 03:21:31,813][26599] Updated weights for policy 0, policy_version 267974 (0.0032) [2024-06-19 03:21:33,380][26367] Fps is (10 sec: 42599.6, 60 sec: 42054.9, 300 sec: 42209.6). Total num frames: 4390551552. Throughput: 0: 42021.5. Samples: 658115760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:21:33,380][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 03:21:35,566][26599] Updated weights for policy 0, policy_version 267984 (0.0030) [2024-06-19 03:21:38,380][26367] Fps is (10 sec: 40975.0, 60 sec: 41508.7, 300 sec: 41932.5). Total num frames: 4390731776. Throughput: 0: 41968.4. Samples: 658368400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:21:38,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 03:21:38,455][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000267990_4390748160.pth... [2024-06-19 03:21:38,510][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000267374_4380655616.pth [2024-06-19 03:21:39,668][26599] Updated weights for policy 0, policy_version 267994 (0.0033) [2024-06-19 03:21:43,319][26599] Updated weights for policy 0, policy_version 268004 (0.0043) [2024-06-19 03:21:43,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4390977536. Throughput: 0: 41849.6. Samples: 658616340. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:21:43,384][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 03:21:47,544][26599] Updated weights for policy 0, policy_version 268014 (0.0046) [2024-06-19 03:21:48,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4391190528. Throughput: 0: 41669.4. Samples: 658741580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:21:48,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 03:21:51,112][26599] Updated weights for policy 0, policy_version 268024 (0.0040) [2024-06-19 03:21:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4391370752. Throughput: 0: 41861.6. Samples: 658992180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:21:53,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 03:21:55,478][26599] Updated weights for policy 0, policy_version 268034 (0.0032) [2024-06-19 03:21:58,384][26367] Fps is (10 sec: 40945.0, 60 sec: 41779.2, 300 sec: 42042.5). Total num frames: 4391600128. Throughput: 0: 41794.8. Samples: 659243760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:21:58,384][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 03:21:59,416][26599] Updated weights for policy 0, policy_version 268044 (0.0033) [2024-06-19 03:22:03,187][26599] Updated weights for policy 0, policy_version 268054 (0.0036) [2024-06-19 03:22:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 4391796736. Throughput: 0: 41682.8. Samples: 659369100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:22:03,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 03:22:07,154][26599] Updated weights for policy 0, policy_version 268064 (0.0043) [2024-06-19 03:22:08,384][26367] Fps is (10 sec: 40960.1, 60 sec: 42322.7, 300 sec: 42098.0). Total num frames: 4392009728. Throughput: 0: 41709.2. Samples: 659619820. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:22:08,385][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 03:22:11,009][26599] Updated weights for policy 0, policy_version 268074 (0.0034) [2024-06-19 03:22:13,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4392222720. Throughput: 0: 41694.8. Samples: 659868920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:22:13,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 03:22:14,909][26599] Updated weights for policy 0, policy_version 268084 (0.0045) [2024-06-19 03:22:18,380][26367] Fps is (10 sec: 40974.3, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 4392419328. Throughput: 0: 41710.0. Samples: 659992720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:22:18,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 03:22:18,770][26599] Updated weights for policy 0, policy_version 268094 (0.0041) [2024-06-19 03:22:22,553][26599] Updated weights for policy 0, policy_version 268104 (0.0037) [2024-06-19 03:22:23,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 42043.5). Total num frames: 4392632320. Throughput: 0: 41722.7. Samples: 660245920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 03:22:23,381][26367] Avg episode reward: [(0, '0.854')] [2024-06-19 03:22:25,849][26579] Signal inference workers to stop experience collection... (9800 times) [2024-06-19 03:22:25,850][26579] Signal inference workers to resume experience collection... (9800 times) [2024-06-19 03:22:25,873][26599] InferenceWorker_p0-w0: stopping experience collection (9800 times) [2024-06-19 03:22:25,904][26599] InferenceWorker_p0-w0: resuming experience collection (9800 times) [2024-06-19 03:22:26,504][26599] Updated weights for policy 0, policy_version 268114 (0.0046) [2024-06-19 03:22:28,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42327.9, 300 sec: 42099.0). Total num frames: 4392861696. Throughput: 0: 41844.1. 
Samples: 660499320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:22:28,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 03:22:30,103][26599] Updated weights for policy 0, policy_version 268124 (0.0035) [2024-06-19 03:22:33,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4393041920. Throughput: 0: 41806.7. Samples: 660622880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:22:33,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 03:22:34,336][26599] Updated weights for policy 0, policy_version 268134 (0.0033) [2024-06-19 03:22:37,676][26599] Updated weights for policy 0, policy_version 268144 (0.0046) [2024-06-19 03:22:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4393271296. Throughput: 0: 41804.9. Samples: 660873400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:22:38,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 03:22:42,338][26599] Updated weights for policy 0, policy_version 268154 (0.0034) [2024-06-19 03:22:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4393467904. Throughput: 0: 41786.6. Samples: 661124000. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:22:43,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 03:22:46,255][26599] Updated weights for policy 0, policy_version 268164 (0.0035) [2024-06-19 03:22:48,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 4393664512. Throughput: 0: 41827.6. Samples: 661251340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:22:48,380][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 03:22:50,187][26599] Updated weights for policy 0, policy_version 268174 (0.0046) [2024-06-19 03:22:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 4393893888. Throughput: 0: 41701.2. 
Samples: 661496220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:22:53,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 03:22:53,806][26599] Updated weights for policy 0, policy_version 268184 (0.0025) [2024-06-19 03:22:57,849][26599] Updated weights for policy 0, policy_version 268194 (0.0053) [2024-06-19 03:22:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41508.7, 300 sec: 42043.0). Total num frames: 4394090496. Throughput: 0: 41813.4. Samples: 661750520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:22:58,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 03:23:01,436][26599] Updated weights for policy 0, policy_version 268204 (0.0036) [2024-06-19 03:23:03,380][26367] Fps is (10 sec: 37682.3, 60 sec: 41232.9, 300 sec: 41820.8). Total num frames: 4394270720. Throughput: 0: 41839.1. Samples: 661875480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:23:03,381][26367] Avg episode reward: [(0, '0.348')] [2024-06-19 03:23:05,553][26599] Updated weights for policy 0, policy_version 268214 (0.0042) [2024-06-19 03:23:08,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42054.9, 300 sec: 41988.0). Total num frames: 4394532864. Throughput: 0: 41954.7. Samples: 662133880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:23:08,380][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 03:23:09,066][26599] Updated weights for policy 0, policy_version 268224 (0.0031) [2024-06-19 03:23:13,380][26367] Fps is (10 sec: 45876.4, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4394729472. Throughput: 0: 41832.5. Samples: 662381780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:23:13,380][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 03:23:13,499][26599] Updated weights for policy 0, policy_version 268234 (0.0034) [2024-06-19 03:23:17,033][26599] Updated weights for policy 0, policy_version 268244 (0.0036) [2024-06-19 03:23:18,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41779.3, 300 sec: 41931.9). Total num frames: 4394926080. Throughput: 0: 41725.3. Samples: 662500520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:23:18,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 03:23:21,627][26599] Updated weights for policy 0, policy_version 268254 (0.0029) [2024-06-19 03:23:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4395155456. Throughput: 0: 41838.7. Samples: 662756140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:23:23,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 03:23:24,750][26599] Updated weights for policy 0, policy_version 268264 (0.0025) [2024-06-19 03:23:27,252][26579] Signal inference workers to stop experience collection... (9850 times) [2024-06-19 03:23:27,302][26599] InferenceWorker_p0-w0: stopping experience collection (9850 times) [2024-06-19 03:23:27,307][26579] Signal inference workers to resume experience collection... (9850 times) [2024-06-19 03:23:27,324][26599] InferenceWorker_p0-w0: resuming experience collection (9850 times) [2024-06-19 03:23:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4395352064. Throughput: 0: 42090.1. Samples: 663018060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:23:28,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 03:23:29,635][26599] Updated weights for policy 0, policy_version 268274 (0.0028) [2024-06-19 03:23:32,343][26599] Updated weights for policy 0, policy_version 268284 (0.0037) [2024-06-19 03:23:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4395581440. Throughput: 0: 41876.8. Samples: 663135800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:23:33,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 03:23:37,299][26599] Updated weights for policy 0, policy_version 268294 (0.0035) [2024-06-19 03:23:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42043.5). Total num frames: 4395778048. Throughput: 0: 42093.3. Samples: 663390420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 03:23:38,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 03:23:38,418][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000268298_4395794432.pth... [2024-06-19 03:23:38,483][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000267683_4385718272.pth [2024-06-19 03:23:40,590][26599] Updated weights for policy 0, policy_version 268304 (0.0034) [2024-06-19 03:23:43,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 41932.5). Total num frames: 4395974656. Throughput: 0: 42150.7. Samples: 663647300. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:23:43,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 03:23:45,048][26599] Updated weights for policy 0, policy_version 268314 (0.0038) [2024-06-19 03:23:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 41932.0). Total num frames: 4396204032. Throughput: 0: 41992.3. Samples: 663765120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:23:48,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 03:23:48,418][26599] Updated weights for policy 0, policy_version 268324 (0.0034) [2024-06-19 03:23:52,732][26599] Updated weights for policy 0, policy_version 268334 (0.0037) [2024-06-19 03:23:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42043.1). Total num frames: 4396417024. Throughput: 0: 42001.7. Samples: 664023960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:23:53,389][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 03:23:56,219][26599] Updated weights for policy 0, policy_version 268344 (0.0040) [2024-06-19 03:23:58,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 41932.0). Total num frames: 4396597248. Throughput: 0: 42128.5. Samples: 664277560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:23:58,380][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 03:24:00,492][26599] Updated weights for policy 0, policy_version 268354 (0.0040) [2024-06-19 03:24:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.6, 300 sec: 41932.0). Total num frames: 4396826624. Throughput: 0: 42148.9. Samples: 664397220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:03,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 03:24:04,073][26599] Updated weights for policy 0, policy_version 268364 (0.0036) [2024-06-19 03:24:08,083][26599] Updated weights for policy 0, policy_version 268374 (0.0036) [2024-06-19 03:24:08,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 41988.0). Total num frames: 4397039616. Throughput: 0: 42288.5. Samples: 664659120. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:08,381][26367] Avg episode reward: [(0, '0.318')] [2024-06-19 03:24:11,843][26599] Updated weights for policy 0, policy_version 268384 (0.0032) [2024-06-19 03:24:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4397236224. Throughput: 0: 42029.0. Samples: 664909360. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:13,380][26367] Avg episode reward: [(0, '0.256')] [2024-06-19 03:24:15,715][26599] Updated weights for policy 0, policy_version 268394 (0.0028) [2024-06-19 03:24:18,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 4397481984. Throughput: 0: 42121.3. Samples: 665031260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:18,381][26367] Avg episode reward: [(0, '0.818')] [2024-06-19 03:24:19,786][26599] Updated weights for policy 0, policy_version 268404 (0.0035) [2024-06-19 03:24:23,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 41932.4). Total num frames: 4397678592. Throughput: 0: 42292.3. Samples: 665293580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:23,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 03:24:23,575][26599] Updated weights for policy 0, policy_version 268414 (0.0031) [2024-06-19 03:24:27,390][26599] Updated weights for policy 0, policy_version 268424 (0.0038) [2024-06-19 03:24:28,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4397875200. Throughput: 0: 42060.8. Samples: 665540040. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:28,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 03:24:31,142][26599] Updated weights for policy 0, policy_version 268434 (0.0029) [2024-06-19 03:24:33,384][26367] Fps is (10 sec: 44221.0, 60 sec: 42322.8, 300 sec: 42042.5). Total num frames: 4398120960. Throughput: 0: 42225.0. Samples: 665665400. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:33,385][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 03:24:35,236][26599] Updated weights for policy 0, policy_version 268444 (0.0039) [2024-06-19 03:24:38,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42322.7, 300 sec: 41986.9). Total num frames: 4398317568. Throughput: 0: 42209.9. Samples: 665923560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:38,385][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 03:24:39,191][26599] Updated weights for policy 0, policy_version 268454 (0.0033) [2024-06-19 03:24:43,049][26599] Updated weights for policy 0, policy_version 268464 (0.0043) [2024-06-19 03:24:43,380][26367] Fps is (10 sec: 39336.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4398514176. Throughput: 0: 42128.0. Samples: 666173320. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:43,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 03:24:47,040][26599] Updated weights for policy 0, policy_version 268474 (0.0034) [2024-06-19 03:24:48,380][26367] Fps is (10 sec: 42614.4, 60 sec: 42325.3, 300 sec: 41931.9). Total num frames: 4398743552. Throughput: 0: 42265.3. Samples: 666299160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 03:24:48,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 03:24:51,083][26599] Updated weights for policy 0, policy_version 268484 (0.0039) [2024-06-19 03:24:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4398940160. Throughput: 0: 42002.5. Samples: 666549240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:24:53,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 03:24:54,629][26579] Signal inference workers to stop experience collection... (9900 times) [2024-06-19 03:24:54,629][26579] Signal inference workers to resume experience collection... 
(9900 times) [2024-06-19 03:24:54,684][26599] InferenceWorker_p0-w0: stopping experience collection (9900 times) [2024-06-19 03:24:54,684][26599] InferenceWorker_p0-w0: resuming experience collection (9900 times) [2024-06-19 03:24:54,764][26599] Updated weights for policy 0, policy_version 268494 (0.0028) [2024-06-19 03:24:58,380][26367] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 4399136768. Throughput: 0: 42150.9. Samples: 666806160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:24:58,381][26367] Avg episode reward: [(0, '0.360')] [2024-06-19 03:24:58,863][26599] Updated weights for policy 0, policy_version 268504 (0.0032) [2024-06-19 03:25:02,485][26599] Updated weights for policy 0, policy_version 268514 (0.0029) [2024-06-19 03:25:03,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 41932.5). Total num frames: 4399349760. Throughput: 0: 42182.3. Samples: 666929460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:03,380][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 03:25:06,637][26599] Updated weights for policy 0, policy_version 268524 (0.0037) [2024-06-19 03:25:08,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4399579136. Throughput: 0: 42042.3. Samples: 667185480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:08,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 03:25:10,095][26599] Updated weights for policy 0, policy_version 268534 (0.0044) [2024-06-19 03:25:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4399759360. Throughput: 0: 42089.0. Samples: 667434040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:13,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 03:25:14,482][26599] Updated weights for policy 0, policy_version 268544 (0.0028) [2024-06-19 03:25:17,872][26599] Updated weights for policy 0, policy_version 268554 (0.0043) [2024-06-19 03:25:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4400005120. Throughput: 0: 42171.4. Samples: 667562960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:18,381][26367] Avg episode reward: [(0, '0.293')] [2024-06-19 03:25:22,309][26599] Updated weights for policy 0, policy_version 268564 (0.0032) [2024-06-19 03:25:23,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 4400201728. Throughput: 0: 42178.2. Samples: 667821420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:23,380][26367] Avg episode reward: [(0, '0.293')] [2024-06-19 03:25:25,397][26599] Updated weights for policy 0, policy_version 268574 (0.0042) [2024-06-19 03:25:28,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 41988.0). Total num frames: 4400414720. Throughput: 0: 42216.7. Samples: 668073080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:28,381][26367] Avg episode reward: [(0, '0.374')] [2024-06-19 03:25:29,917][26599] Updated weights for policy 0, policy_version 268584 (0.0046) [2024-06-19 03:25:33,084][26599] Updated weights for policy 0, policy_version 268594 (0.0036) [2024-06-19 03:25:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42054.8, 300 sec: 42043.5). Total num frames: 4400644096. Throughput: 0: 42269.3. Samples: 668201280. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:33,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 03:25:37,782][26599] Updated weights for policy 0, policy_version 268604 (0.0040) [2024-06-19 03:25:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42054.8, 300 sec: 41987.5). Total num frames: 4400840704. Throughput: 0: 42451.1. Samples: 668459540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:38,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 03:25:38,458][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000268607_4400857088.pth... [2024-06-19 03:25:38,499][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000267990_4390748160.pth [2024-06-19 03:25:40,943][26599] Updated weights for policy 0, policy_version 268614 (0.0035) [2024-06-19 03:25:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4401053696. Throughput: 0: 42265.0. Samples: 668708080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:43,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 03:25:45,534][26599] Updated weights for policy 0, policy_version 268624 (0.0029) [2024-06-19 03:25:48,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4401266688. Throughput: 0: 42471.6. Samples: 668840680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:48,380][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 03:25:48,535][26599] Updated weights for policy 0, policy_version 268634 (0.0034) [2024-06-19 03:25:53,188][26599] Updated weights for policy 0, policy_version 268644 (0.0031) [2024-06-19 03:25:53,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 41932.5). Total num frames: 4401463296. Throughput: 0: 42411.2. Samples: 669093980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:53,380][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 03:25:56,234][26599] Updated weights for policy 0, policy_version 268654 (0.0044) [2024-06-19 03:25:58,384][26367] Fps is (10 sec: 40944.7, 60 sec: 42322.9, 300 sec: 41987.0). Total num frames: 4401676288. Throughput: 0: 42556.1. Samples: 669349220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 03:25:58,384][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 03:26:00,761][26599] Updated weights for policy 0, policy_version 268664 (0.0031) [2024-06-19 03:26:03,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 4401905664. Throughput: 0: 42540.9. Samples: 669477300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:03,381][26367] Avg episode reward: [(0, '0.378')] [2024-06-19 03:26:04,092][26599] Updated weights for policy 0, policy_version 268674 (0.0040) [2024-06-19 03:26:08,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4402102272. Throughput: 0: 42416.4. Samples: 669730160. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:08,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 03:26:08,563][26599] Updated weights for policy 0, policy_version 268684 (0.0031) [2024-06-19 03:26:11,754][26599] Updated weights for policy 0, policy_version 268694 (0.0032) [2024-06-19 03:26:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42043.0). Total num frames: 4402331648. Throughput: 0: 42261.9. Samples: 669974860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:13,381][26367] Avg episode reward: [(0, '0.317')] [2024-06-19 03:26:16,207][26599] Updated weights for policy 0, policy_version 268704 (0.0021) [2024-06-19 03:26:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 4402544640. Throughput: 0: 42348.4. Samples: 670106960. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:18,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 03:26:19,680][26599] Updated weights for policy 0, policy_version 268714 (0.0041) [2024-06-19 03:26:23,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 4402724864. Throughput: 0: 42134.3. Samples: 670355580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:23,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 03:26:24,038][26599] Updated weights for policy 0, policy_version 268724 (0.0027) [2024-06-19 03:26:27,429][26599] Updated weights for policy 0, policy_version 268734 (0.0029) [2024-06-19 03:26:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4402954240. Throughput: 0: 42248.4. Samples: 670609260. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:28,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 03:26:31,846][26599] Updated weights for policy 0, policy_version 268744 (0.0039) [2024-06-19 03:26:32,732][26579] Signal inference workers to stop experience collection... (9950 times) [2024-06-19 03:26:32,732][26579] Signal inference workers to resume experience collection... (9950 times) [2024-06-19 03:26:32,762][26599] InferenceWorker_p0-w0: stopping experience collection (9950 times) [2024-06-19 03:26:32,762][26599] InferenceWorker_p0-w0: resuming experience collection (9950 times) [2024-06-19 03:26:33,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4403183616. Throughput: 0: 42247.4. Samples: 670741820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:33,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 03:26:35,426][26599] Updated weights for policy 0, policy_version 268754 (0.0041) [2024-06-19 03:26:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4403363840. Throughput: 0: 42165.2. 
Samples: 670991420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:38,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 03:26:39,570][26599] Updated weights for policy 0, policy_version 268764 (0.0031) [2024-06-19 03:26:43,127][26599] Updated weights for policy 0, policy_version 268774 (0.0029) [2024-06-19 03:26:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4403593216. Throughput: 0: 42057.5. Samples: 671241660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:43,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 03:26:47,535][26599] Updated weights for policy 0, policy_version 268784 (0.0039) [2024-06-19 03:26:48,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 4403806208. Throughput: 0: 42039.9. Samples: 671369100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:48,389][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 03:26:50,947][26599] Updated weights for policy 0, policy_version 268794 (0.0031) [2024-06-19 03:26:53,384][26367] Fps is (10 sec: 40945.6, 60 sec: 42322.7, 300 sec: 42043.0). Total num frames: 4404002816. Throughput: 0: 42083.3. Samples: 671624060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:53,384][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 03:26:55,307][26599] Updated weights for policy 0, policy_version 268804 (0.0031) [2024-06-19 03:26:58,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42601.1, 300 sec: 42154.1). Total num frames: 4404232192. Throughput: 0: 42046.3. Samples: 671866940. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:26:58,380][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 03:26:58,471][26599] Updated weights for policy 0, policy_version 268814 (0.0040) [2024-06-19 03:27:02,910][26599] Updated weights for policy 0, policy_version 268824 (0.0047) [2024-06-19 03:27:03,380][26367] Fps is (10 sec: 40974.5, 60 sec: 41779.1, 300 sec: 42043.5). Total num frames: 4404412416. Throughput: 0: 42044.8. Samples: 671998980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:27:03,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 03:27:06,142][26599] Updated weights for policy 0, policy_version 268834 (0.0034) [2024-06-19 03:27:08,380][26367] Fps is (10 sec: 37682.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4404609024. Throughput: 0: 42113.3. Samples: 672250680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:27:08,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 03:27:10,803][26599] Updated weights for policy 0, policy_version 268844 (0.0041) [2024-06-19 03:27:13,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4404887552. Throughput: 0: 41901.8. Samples: 672494840. Policy #0 lag: (min: 0.0, avg: 9.8, max: 20.0) [2024-06-19 03:27:13,384][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 03:27:13,785][26599] Updated weights for policy 0, policy_version 268854 (0.0031) [2024-06-19 03:27:18,293][26599] Updated weights for policy 0, policy_version 268864 (0.0045) [2024-06-19 03:27:18,384][26367] Fps is (10 sec: 45858.6, 60 sec: 42049.7, 300 sec: 42153.6). Total num frames: 4405067776. Throughput: 0: 42115.7. Samples: 672637180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:27:18,385][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 03:27:21,515][26599] Updated weights for policy 0, policy_version 268874 (0.0034) [2024-06-19 03:27:23,384][26367] Fps is (10 sec: 36032.0, 60 sec: 42049.8, 300 sec: 41987.0). 
Total num frames: 4405248000. Throughput: 0: 41942.4. Samples: 672878980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:27:23,384][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 03:27:26,307][26599] Updated weights for policy 0, policy_version 268884 (0.0041) [2024-06-19 03:27:28,380][26367] Fps is (10 sec: 44253.3, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 4405510144. Throughput: 0: 42048.6. Samples: 673133840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:27:28,380][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 03:27:29,166][26599] Updated weights for policy 0, policy_version 268894 (0.0029) [2024-06-19 03:27:33,380][26367] Fps is (10 sec: 44252.6, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 4405690368. Throughput: 0: 42158.3. Samples: 673266220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:27:33,381][26367] Avg episode reward: [(0, '0.334')] [2024-06-19 03:27:33,942][26599] Updated weights for policy 0, policy_version 268904 (0.0028) [2024-06-19 03:27:36,873][26599] Updated weights for policy 0, policy_version 268914 (0.0033) [2024-06-19 03:27:38,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 4405886976. Throughput: 0: 41847.9. Samples: 673507060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:27:38,380][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 03:27:38,483][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000268915_4405903360.pth... [2024-06-19 03:27:38,538][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000268298_4395794432.pth [2024-06-19 03:27:41,838][26599] Updated weights for policy 0, policy_version 268924 (0.0039) [2024-06-19 03:27:42,685][26579] Signal inference workers to stop experience collection... 
(10000 times) [2024-06-19 03:27:42,731][26599] InferenceWorker_p0-w0: stopping experience collection (10000 times) [2024-06-19 03:27:42,746][26579] Signal inference workers to resume experience collection... (10000 times) [2024-06-19 03:27:42,749][26599] InferenceWorker_p0-w0: resuming experience collection (10000 times) [2024-06-19 03:27:43,384][26367] Fps is (10 sec: 44220.9, 60 sec: 42322.8, 300 sec: 42264.6). Total num frames: 4406132736. Throughput: 0: 42088.0. Samples: 673761060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:27:43,384][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 03:27:44,609][26599] Updated weights for policy 0, policy_version 268934 (0.0025) [2024-06-19 03:27:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4406296576. Throughput: 0: 41993.0. Samples: 673888660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:27:48,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 03:27:50,120][26599] Updated weights for policy 0, policy_version 268944 (0.0042) [2024-06-19 03:27:52,477][26599] Updated weights for policy 0, policy_version 268954 (0.0025) [2024-06-19 03:27:53,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42327.9, 300 sec: 42209.6). Total num frames: 4406542336. Throughput: 0: 41830.2. Samples: 674133040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:27:53,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 03:27:57,894][26599] Updated weights for policy 0, policy_version 268964 (0.0040) [2024-06-19 03:27:58,384][26367] Fps is (10 sec: 44220.5, 60 sec: 41776.6, 300 sec: 42264.7). Total num frames: 4406738944. Throughput: 0: 42182.4. Samples: 674393200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:27:58,385][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 03:28:00,267][26599] Updated weights for policy 0, policy_version 268974 (0.0042) [2024-06-19 03:28:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4406935552. Throughput: 0: 41643.8. Samples: 674511000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:28:03,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 03:28:05,494][26599] Updated weights for policy 0, policy_version 268984 (0.0035) [2024-06-19 03:28:08,221][26599] Updated weights for policy 0, policy_version 268994 (0.0044) [2024-06-19 03:28:08,380][26367] Fps is (10 sec: 45891.7, 60 sec: 43144.5, 300 sec: 42265.2). Total num frames: 4407197696. Throughput: 0: 41940.2. Samples: 674766140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:28:08,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 03:28:13,335][26599] Updated weights for policy 0, policy_version 269004 (0.0030) [2024-06-19 03:28:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41233.1, 300 sec: 42154.1). Total num frames: 4407361536. Throughput: 0: 41937.3. Samples: 675021020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:28:13,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 03:28:17,026][26599] Updated weights for policy 0, policy_version 269014 (0.0031) [2024-06-19 03:28:18,380][26367] Fps is (10 sec: 36044.6, 60 sec: 41508.6, 300 sec: 42043.0). Total num frames: 4407558144. Throughput: 0: 41571.1. Samples: 675136920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:28:18,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 03:28:21,369][26599] Updated weights for policy 0, policy_version 269024 (0.0043) [2024-06-19 03:28:23,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42601.0, 300 sec: 42209.6). Total num frames: 4407803904. Throughput: 0: 42044.0. Samples: 675399040. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 03:28:23,380][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 03:28:24,590][26599] Updated weights for policy 0, policy_version 269034 (0.0034) [2024-06-19 03:28:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 4407984128. Throughput: 0: 41905.6. Samples: 675646660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:28:28,380][26367] Avg episode reward: [(0, '0.790')] [2024-06-19 03:28:28,967][26599] Updated weights for policy 0, policy_version 269044 (0.0031) [2024-06-19 03:28:32,323][26599] Updated weights for policy 0, policy_version 269054 (0.0040) [2024-06-19 03:28:33,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4408180736. Throughput: 0: 41795.6. Samples: 675769460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:28:33,380][26367] Avg episode reward: [(0, '0.326')] [2024-06-19 03:28:37,106][26599] Updated weights for policy 0, policy_version 269064 (0.0032) [2024-06-19 03:28:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4408410112. Throughput: 0: 42012.0. Samples: 676023580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:28:38,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 03:28:40,299][26599] Updated weights for policy 0, policy_version 269074 (0.0033) [2024-06-19 03:28:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41235.6, 300 sec: 42043.0). Total num frames: 4408606720. Throughput: 0: 41779.9. Samples: 676273140. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:28:43,380][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 03:28:44,857][26599] Updated weights for policy 0, policy_version 269084 (0.0039) [2024-06-19 03:28:48,226][26599] Updated weights for policy 0, policy_version 269094 (0.0046) [2024-06-19 03:28:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 4408836096. Throughput: 0: 41915.6. Samples: 676397200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:28:48,381][26367] Avg episode reward: [(0, '0.783')] [2024-06-19 03:28:52,503][26599] Updated weights for policy 0, policy_version 269104 (0.0029) [2024-06-19 03:28:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 4409032704. Throughput: 0: 41933.0. Samples: 676653120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:28:53,380][26367] Avg episode reward: [(0, '0.763')] [2024-06-19 03:28:56,238][26599] Updated weights for policy 0, policy_version 269114 (0.0032) [2024-06-19 03:28:56,789][26579] Signal inference workers to stop experience collection... (10050 times) [2024-06-19 03:28:56,839][26599] InferenceWorker_p0-w0: stopping experience collection (10050 times) [2024-06-19 03:28:56,849][26579] Signal inference workers to resume experience collection... (10050 times) [2024-06-19 03:28:56,861][26599] InferenceWorker_p0-w0: resuming experience collection (10050 times) [2024-06-19 03:28:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41781.7, 300 sec: 42098.5). Total num frames: 4409245696. Throughput: 0: 41786.1. Samples: 676901400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:28:58,381][26367] Avg episode reward: [(0, '0.488')] [2024-06-19 03:29:00,137][26599] Updated weights for policy 0, policy_version 269124 (0.0036) [2024-06-19 03:29:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4409442304. Throughput: 0: 42014.4. 
Samples: 677027560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:29:03,380][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 03:29:04,147][26599] Updated weights for policy 0, policy_version 269134 (0.0034) [2024-06-19 03:29:07,837][26599] Updated weights for policy 0, policy_version 269144 (0.0032) [2024-06-19 03:29:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41233.0, 300 sec: 42154.1). Total num frames: 4409671680. Throughput: 0: 41795.8. Samples: 677279860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:29:08,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 03:29:11,915][26599] Updated weights for policy 0, policy_version 269154 (0.0039) [2024-06-19 03:29:13,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4409884672. Throughput: 0: 41752.4. Samples: 677525520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:29:13,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 03:29:15,474][26599] Updated weights for policy 0, policy_version 269164 (0.0032) [2024-06-19 03:29:18,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4410064896. Throughput: 0: 41856.7. Samples: 677653020. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:29:18,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 03:29:19,771][26599] Updated weights for policy 0, policy_version 269174 (0.0030) [2024-06-19 03:29:23,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 4410277888. Throughput: 0: 41889.9. Samples: 677908620. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:29:23,380][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 03:29:23,636][26599] Updated weights for policy 0, policy_version 269184 (0.0048) [2024-06-19 03:29:27,584][26599] Updated weights for policy 0, policy_version 269194 (0.0044) [2024-06-19 03:29:28,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42043.5). Total num frames: 4410523648. Throughput: 0: 41943.5. Samples: 678160600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:29:28,381][26367] Avg episode reward: [(0, '0.339')] [2024-06-19 03:29:31,305][26599] Updated weights for policy 0, policy_version 269204 (0.0029) [2024-06-19 03:29:33,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42043.5). Total num frames: 4410720256. Throughput: 0: 42072.9. Samples: 678290480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:29:33,381][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 03:29:35,259][26599] Updated weights for policy 0, policy_version 269214 (0.0032) [2024-06-19 03:29:38,384][26367] Fps is (10 sec: 40945.0, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 4410933248. Throughput: 0: 42037.4. Samples: 678544960. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 03:29:38,384][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 03:29:38,411][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000269222_4410933248.pth... [2024-06-19 03:29:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000268607_4400857088.pth [2024-06-19 03:29:38,926][26599] Updated weights for policy 0, policy_version 269224 (0.0032) [2024-06-19 03:29:43,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4411113472. Throughput: 0: 42172.2. Samples: 678799140. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:29:43,380][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 03:29:43,407][26599] Updated weights for policy 0, policy_version 269234 (0.0034) [2024-06-19 03:29:46,579][26599] Updated weights for policy 0, policy_version 269244 (0.0036) [2024-06-19 03:29:48,380][26367] Fps is (10 sec: 44252.6, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4411375616. Throughput: 0: 42084.2. Samples: 678921360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:29:48,384][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 03:29:51,118][26599] Updated weights for policy 0, policy_version 269254 (0.0029) [2024-06-19 03:29:53,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 4411555840. Throughput: 0: 42021.9. Samples: 679170840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:29:53,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 03:29:54,306][26599] Updated weights for policy 0, policy_version 269264 (0.0024) [2024-06-19 03:29:58,380][26367] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4411752448. Throughput: 0: 42256.9. Samples: 679427080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:29:58,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 03:29:58,831][26599] Updated weights for policy 0, policy_version 269274 (0.0042) [2024-06-19 03:30:02,032][26599] Updated weights for policy 0, policy_version 269284 (0.0028) [2024-06-19 03:30:03,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4411998208. Throughput: 0: 42197.5. Samples: 679551900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:03,380][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 03:30:07,089][26599] Updated weights for policy 0, policy_version 269294 (0.0035) [2024-06-19 03:30:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 4412178432. Throughput: 0: 42285.1. Samples: 679811460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:08,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 03:30:10,060][26599] Updated weights for policy 0, policy_version 269304 (0.0029) [2024-06-19 03:30:13,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4412391424. Throughput: 0: 42097.8. Samples: 680055000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:13,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 03:30:14,889][26599] Updated weights for policy 0, policy_version 269314 (0.0042) [2024-06-19 03:30:17,821][26599] Updated weights for policy 0, policy_version 269324 (0.0027) [2024-06-19 03:30:18,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42098.5). Total num frames: 4412620800. Throughput: 0: 42008.1. Samples: 680180840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:18,380][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 03:30:18,680][26579] Signal inference workers to stop experience collection... (10100 times) [2024-06-19 03:30:18,731][26599] InferenceWorker_p0-w0: stopping experience collection (10100 times) [2024-06-19 03:30:18,795][26579] Signal inference workers to resume experience collection... (10100 times) [2024-06-19 03:30:18,795][26599] InferenceWorker_p0-w0: resuming experience collection (10100 times) [2024-06-19 03:30:22,524][26599] Updated weights for policy 0, policy_version 269334 (0.0032) [2024-06-19 03:30:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 41987.5). Total num frames: 4412801024. Throughput: 0: 41976.7. 
Samples: 680433760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:23,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 03:30:25,896][26599] Updated weights for policy 0, policy_version 269344 (0.0031) [2024-06-19 03:30:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4413030400. Throughput: 0: 41872.7. Samples: 680683420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:28,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 03:30:30,118][26599] Updated weights for policy 0, policy_version 269354 (0.0040) [2024-06-19 03:30:33,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4413243392. Throughput: 0: 42089.3. Samples: 680815380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:33,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 03:30:33,567][26599] Updated weights for policy 0, policy_version 269364 (0.0033) [2024-06-19 03:30:37,872][26599] Updated weights for policy 0, policy_version 269374 (0.0046) [2024-06-19 03:30:38,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41781.8, 300 sec: 41987.5). Total num frames: 4413440000. Throughput: 0: 42128.5. Samples: 681066620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:38,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 03:30:41,249][26599] Updated weights for policy 0, policy_version 269384 (0.0038) [2024-06-19 03:30:43,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 4413669376. Throughput: 0: 41850.2. Samples: 681310340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:43,386][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 03:30:45,968][26599] Updated weights for policy 0, policy_version 269394 (0.0032) [2024-06-19 03:30:48,380][26367] Fps is (10 sec: 44236.2, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 4413882368. Throughput: 0: 42200.7. 
Samples: 681450940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 03:30:48,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 03:30:49,068][26599] Updated weights for policy 0, policy_version 269404 (0.0045) [2024-06-19 03:30:53,384][26367] Fps is (10 sec: 39307.4, 60 sec: 41776.6, 300 sec: 41987.5). Total num frames: 4414062592. Throughput: 0: 42003.7. Samples: 681701780. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:30:53,385][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 03:30:53,649][26599] Updated weights for policy 0, policy_version 269414 (0.0033) [2024-06-19 03:30:56,719][26599] Updated weights for policy 0, policy_version 269424 (0.0037) [2024-06-19 03:30:58,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42098.6). Total num frames: 4414324736. Throughput: 0: 42063.1. Samples: 681947840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:30:58,380][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 03:31:01,466][26599] Updated weights for policy 0, policy_version 269434 (0.0043) [2024-06-19 03:31:03,380][26367] Fps is (10 sec: 42614.4, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4414488576. Throughput: 0: 42203.1. Samples: 682079980. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:03,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 03:31:04,346][26599] Updated weights for policy 0, policy_version 269444 (0.0036) [2024-06-19 03:31:08,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4414701568. Throughput: 0: 42121.0. Samples: 682329200. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:08,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 03:31:09,029][26599] Updated weights for policy 0, policy_version 269454 (0.0039) [2024-06-19 03:31:12,323][26599] Updated weights for policy 0, policy_version 269464 (0.0039) [2024-06-19 03:31:13,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4414930944. Throughput: 0: 42080.0. Samples: 682577020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:13,381][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 03:31:16,902][26599] Updated weights for policy 0, policy_version 269474 (0.0036) [2024-06-19 03:31:18,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41506.0, 300 sec: 41987.5). Total num frames: 4415111168. Throughput: 0: 42074.7. Samples: 682708740. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:18,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 03:31:20,212][26599] Updated weights for policy 0, policy_version 269484 (0.0036) [2024-06-19 03:31:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 4415356928. Throughput: 0: 41987.1. Samples: 682956040. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:23,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 03:31:25,065][26599] Updated weights for policy 0, policy_version 269494 (0.0029) [2024-06-19 03:31:27,987][26599] Updated weights for policy 0, policy_version 269504 (0.0034) [2024-06-19 03:31:28,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4415569920. Throughput: 0: 42225.9. Samples: 683210500. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:28,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 03:31:32,633][26599] Updated weights for policy 0, policy_version 269514 (0.0032) [2024-06-19 03:31:33,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4415733760. Throughput: 0: 41896.5. Samples: 683336280. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:33,381][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 03:31:34,631][26579] Signal inference workers to stop experience collection... (10150 times) [2024-06-19 03:31:34,631][26579] Signal inference workers to resume experience collection... (10150 times) [2024-06-19 03:31:34,652][26599] InferenceWorker_p0-w0: stopping experience collection (10150 times) [2024-06-19 03:31:34,652][26599] InferenceWorker_p0-w0: resuming experience collection (10150 times) [2024-06-19 03:31:35,692][26599] Updated weights for policy 0, policy_version 269524 (0.0033) [2024-06-19 03:31:38,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 4415995904. Throughput: 0: 41963.7. Samples: 683590000. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:38,381][26367] Avg episode reward: [(0, '0.380')] [2024-06-19 03:31:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000269531_4415995904.pth... [2024-06-19 03:31:38,445][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000268915_4405903360.pth [2024-06-19 03:31:40,537][26599] Updated weights for policy 0, policy_version 269534 (0.0055) [2024-06-19 03:31:43,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4416192512. Throughput: 0: 42076.4. Samples: 683841280. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:43,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 03:31:43,417][26599] Updated weights for policy 0, policy_version 269544 (0.0032) [2024-06-19 03:31:48,380][26367] Fps is (10 sec: 36045.5, 60 sec: 41233.2, 300 sec: 41876.9). Total num frames: 4416356352. Throughput: 0: 41748.4. Samples: 683958660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:48,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 03:31:48,481][26599] Updated weights for policy 0, policy_version 269554 (0.0053) [2024-06-19 03:31:51,675][26599] Updated weights for policy 0, policy_version 269564 (0.0044) [2024-06-19 03:31:53,381][26367] Fps is (10 sec: 44233.0, 60 sec: 42873.5, 300 sec: 42042.9). Total num frames: 4416634880. Throughput: 0: 41924.9. Samples: 684215860. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:53,382][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 03:31:56,309][26599] Updated weights for policy 0, policy_version 269574 (0.0039) [2024-06-19 03:31:58,384][26367] Fps is (10 sec: 44220.8, 60 sec: 41230.6, 300 sec: 41987.0). Total num frames: 4416798720. Throughput: 0: 42210.9. Samples: 684476660. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 03:31:58,384][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 03:31:59,332][26599] Updated weights for policy 0, policy_version 269584 (0.0024) [2024-06-19 03:32:03,380][26367] Fps is (10 sec: 36047.8, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 4416995328. Throughput: 0: 41859.6. Samples: 684592420. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:03,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 03:32:03,898][26599] Updated weights for policy 0, policy_version 269594 (0.0025) [2024-06-19 03:32:06,941][26599] Updated weights for policy 0, policy_version 269604 (0.0035) [2024-06-19 03:32:08,380][26367] Fps is (10 sec: 47530.5, 60 sec: 42871.4, 300 sec: 41987.5). Total num frames: 4417273856. Throughput: 0: 42108.9. Samples: 684850940. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:08,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 03:32:11,826][26599] Updated weights for policy 0, policy_version 269614 (0.0042) [2024-06-19 03:32:13,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42052.3, 300 sec: 41988.0). Total num frames: 4417454080. Throughput: 0: 42180.4. Samples: 685108620. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:13,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 03:32:14,726][26599] Updated weights for policy 0, policy_version 269624 (0.0036) [2024-06-19 03:32:18,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42325.4, 300 sec: 42043.5). Total num frames: 4417650688. Throughput: 0: 41908.9. Samples: 685222180. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:18,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 03:32:19,414][26599] Updated weights for policy 0, policy_version 269634 (0.0033) [2024-06-19 03:32:22,346][26599] Updated weights for policy 0, policy_version 269644 (0.0038) [2024-06-19 03:32:23,384][26367] Fps is (10 sec: 44220.9, 60 sec: 42322.8, 300 sec: 41986.9). Total num frames: 4417896448. Throughput: 0: 42055.0. Samples: 685482620. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:23,384][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 03:32:27,133][26599] Updated weights for policy 0, policy_version 269654 (0.0040) [2024-06-19 03:32:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41931.9). Total num frames: 4418060288. Throughput: 0: 42200.5. Samples: 685740300. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:28,381][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 03:32:30,424][26599] Updated weights for policy 0, policy_version 269664 (0.0030) [2024-06-19 03:32:33,380][26367] Fps is (10 sec: 39336.2, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 4418289664. Throughput: 0: 42111.6. Samples: 685853680. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:33,381][26367] Avg episode reward: [(0, '0.825')] [2024-06-19 03:32:34,962][26599] Updated weights for policy 0, policy_version 269674 (0.0038) [2024-06-19 03:32:38,161][26599] Updated weights for policy 0, policy_version 269684 (0.0035) [2024-06-19 03:32:38,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41779.2, 300 sec: 41932.4). Total num frames: 4418502656. Throughput: 0: 42042.5. Samples: 686107740. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:38,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 03:32:42,702][26599] Updated weights for policy 0, policy_version 269694 (0.0026) [2024-06-19 03:32:43,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 41987.5). Total num frames: 4418682880. Throughput: 0: 41918.5. Samples: 686362840. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:43,380][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 03:32:45,749][26599] Updated weights for policy 0, policy_version 269704 (0.0024) [2024-06-19 03:32:48,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42598.5, 300 sec: 41932.0). Total num frames: 4418912256. Throughput: 0: 42005.0. Samples: 686482640. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:48,380][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 03:32:50,551][26599] Updated weights for policy 0, policy_version 269714 (0.0044) [2024-06-19 03:32:52,116][26579] Signal inference workers to stop experience collection... (10200 times) [2024-06-19 03:32:52,180][26599] InferenceWorker_p0-w0: stopping experience collection (10200 times) [2024-06-19 03:32:52,239][26579] Signal inference workers to resume experience collection... (10200 times) [2024-06-19 03:32:52,239][26599] InferenceWorker_p0-w0: resuming experience collection (10200 times) [2024-06-19 03:32:53,380][26367] Fps is (10 sec: 45874.8, 60 sec: 41779.8, 300 sec: 42043.5). Total num frames: 4419141632. Throughput: 0: 42012.0. Samples: 686741480. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:53,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 03:32:53,587][26599] Updated weights for policy 0, policy_version 269724 (0.0041) [2024-06-19 03:32:58,167][26599] Updated weights for policy 0, policy_version 269734 (0.0031) [2024-06-19 03:32:58,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42054.7, 300 sec: 41987.5). Total num frames: 4419321856. Throughput: 0: 41848.4. Samples: 686991800. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:32:58,381][26367] Avg episode reward: [(0, '0.343')] [2024-06-19 03:33:01,618][26599] Updated weights for policy 0, policy_version 269744 (0.0044) [2024-06-19 03:33:03,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 41931.9). Total num frames: 4419567616. Throughput: 0: 42030.6. Samples: 687113560. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:33:03,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 03:33:05,776][26599] Updated weights for policy 0, policy_version 269754 (0.0040) [2024-06-19 03:33:08,380][26367] Fps is (10 sec: 40960.3, 60 sec: 40960.0, 300 sec: 41931.9). Total num frames: 4419731456. Throughput: 0: 42050.9. 
Samples: 687374760. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:33:08,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 03:33:09,336][26599] Updated weights for policy 0, policy_version 269764 (0.0042) [2024-06-19 03:33:13,380][26367] Fps is (10 sec: 39322.3, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4419960832. Throughput: 0: 41796.0. Samples: 687621120. Policy #0 lag: (min: 0.0, avg: 12.7, max: 21.0) [2024-06-19 03:33:13,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 03:33:13,568][26599] Updated weights for policy 0, policy_version 269774 (0.0039) [2024-06-19 03:33:17,109][26599] Updated weights for policy 0, policy_version 269784 (0.0035) [2024-06-19 03:33:18,380][26367] Fps is (10 sec: 47513.6, 60 sec: 42598.4, 300 sec: 42043.0). Total num frames: 4420206592. Throughput: 0: 42167.0. Samples: 687751200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:33:18,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 03:33:21,508][26599] Updated weights for policy 0, policy_version 269794 (0.0049) [2024-06-19 03:33:23,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41235.5, 300 sec: 41987.5). Total num frames: 4420370432. Throughput: 0: 42063.5. Samples: 688000600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:33:23,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 03:33:25,049][26599] Updated weights for policy 0, policy_version 269804 (0.0044) [2024-06-19 03:33:28,381][26367] Fps is (10 sec: 39319.7, 60 sec: 42325.0, 300 sec: 42098.5). Total num frames: 4420599808. Throughput: 0: 41971.5. Samples: 688251580. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:33:28,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 03:33:29,407][26599] Updated weights for policy 0, policy_version 269814 (0.0029) [2024-06-19 03:33:33,023][26599] Updated weights for policy 0, policy_version 269824 (0.0035) [2024-06-19 03:33:33,380][26367] Fps is (10 sec: 45876.3, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4420829184. Throughput: 0: 42156.5. Samples: 688379680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:33:33,380][26367] Avg episode reward: [(0, '0.248')] [2024-06-19 03:33:37,108][26599] Updated weights for policy 0, policy_version 269834 (0.0045) [2024-06-19 03:33:38,384][26367] Fps is (10 sec: 40947.1, 60 sec: 41776.7, 300 sec: 42042.5). Total num frames: 4421009408. Throughput: 0: 41937.9. Samples: 688628840. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:33:38,385][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 03:33:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000269837_4421009408.pth... [2024-06-19 03:33:38,448][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000269222_4410933248.pth [2024-06-19 03:33:40,767][26599] Updated weights for policy 0, policy_version 269844 (0.0028) [2024-06-19 03:33:43,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4421222400. Throughput: 0: 41885.4. Samples: 688876640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:33:43,384][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 03:33:44,953][26599] Updated weights for policy 0, policy_version 269854 (0.0034) [2024-06-19 03:33:48,380][26367] Fps is (10 sec: 42614.4, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4421435392. Throughput: 0: 41976.7. Samples: 689002500. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:33:48,380][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 03:33:48,432][26599] Updated weights for policy 0, policy_version 269864 (0.0030) [2024-06-19 03:33:52,538][26599] Updated weights for policy 0, policy_version 269874 (0.0042) [2024-06-19 03:33:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4421648384. Throughput: 0: 41853.4. Samples: 689258160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:33:53,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 03:33:56,455][26599] Updated weights for policy 0, policy_version 269884 (0.0049) [2024-06-19 03:33:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 4421861376. Throughput: 0: 41876.9. Samples: 689505580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:33:58,381][26367] Avg episode reward: [(0, '0.266')] [2024-06-19 03:34:00,321][26599] Updated weights for policy 0, policy_version 269894 (0.0040) [2024-06-19 03:34:03,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4422074368. Throughput: 0: 41867.0. Samples: 689635220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:34:03,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 03:34:04,039][26599] Updated weights for policy 0, policy_version 269904 (0.0042) [2024-06-19 03:34:07,886][26599] Updated weights for policy 0, policy_version 269914 (0.0044) [2024-06-19 03:34:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4422270976. Throughput: 0: 41977.1. Samples: 689889560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:34:08,380][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 03:34:11,975][26599] Updated weights for policy 0, policy_version 269924 (0.0033) [2024-06-19 03:34:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.1, 300 sec: 42098.5). 
Total num frames: 4422483968. Throughput: 0: 41931.5. Samples: 690138480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:34:13,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 03:34:15,896][26599] Updated weights for policy 0, policy_version 269934 (0.0044) [2024-06-19 03:34:18,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41233.1, 300 sec: 42043.0). Total num frames: 4422680576. Throughput: 0: 41902.6. Samples: 690265300. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:34:18,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 03:34:19,642][26599] Updated weights for policy 0, policy_version 269944 (0.0033) [2024-06-19 03:34:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4422909952. Throughput: 0: 41938.0. Samples: 690515900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 03:34:23,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 03:34:23,688][26599] Updated weights for policy 0, policy_version 269954 (0.0031) [2024-06-19 03:34:27,820][26579] Signal inference workers to stop experience collection... (10250 times) [2024-06-19 03:34:27,820][26579] Signal inference workers to resume experience collection... (10250 times) [2024-06-19 03:34:27,824][26599] Updated weights for policy 0, policy_version 269964 (0.0037) [2024-06-19 03:34:27,872][26599] InferenceWorker_p0-w0: stopping experience collection (10250 times) [2024-06-19 03:34:27,872][26599] InferenceWorker_p0-w0: resuming experience collection (10250 times) [2024-06-19 03:34:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.6, 300 sec: 41987.5). Total num frames: 4423106560. Throughput: 0: 42059.6. Samples: 690769320. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:34:28,381][26367] Avg episode reward: [(0, '0.351')] [2024-06-19 03:34:31,373][26599] Updated weights for policy 0, policy_version 269974 (0.0036) [2024-06-19 03:34:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41506.1, 300 sec: 41988.0). Total num frames: 4423319552. Throughput: 0: 42118.2. Samples: 690897820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:34:33,381][26367] Avg episode reward: [(0, '0.384')] [2024-06-19 03:34:35,374][26599] Updated weights for policy 0, policy_version 269984 (0.0034) [2024-06-19 03:34:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42054.8, 300 sec: 42098.5). Total num frames: 4423532544. Throughput: 0: 42118.5. Samples: 691153500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:34:38,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 03:34:38,875][26599] Updated weights for policy 0, policy_version 269994 (0.0036) [2024-06-19 03:34:43,112][26599] Updated weights for policy 0, policy_version 270004 (0.0034) [2024-06-19 03:34:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4423761920. Throughput: 0: 42255.9. Samples: 691407100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:34:43,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 03:34:46,968][26599] Updated weights for policy 0, policy_version 270014 (0.0040) [2024-06-19 03:34:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4423958528. Throughput: 0: 42205.9. Samples: 691534480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:34:48,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 03:34:50,742][26599] Updated weights for policy 0, policy_version 270024 (0.0022) [2024-06-19 03:34:53,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4424155136. Throughput: 0: 42056.9. Samples: 691782120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:34:53,380][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 03:34:54,611][26599] Updated weights for policy 0, policy_version 270034 (0.0031) [2024-06-19 03:34:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4424384512. Throughput: 0: 42208.5. Samples: 692037860. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:34:58,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 03:34:58,761][26599] Updated weights for policy 0, policy_version 270044 (0.0045) [2024-06-19 03:35:02,583][26599] Updated weights for policy 0, policy_version 270054 (0.0037) [2024-06-19 03:35:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 4424597504. Throughput: 0: 42289.0. Samples: 692168300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:35:03,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 03:35:06,637][26599] Updated weights for policy 0, policy_version 270064 (0.0028) [2024-06-19 03:35:08,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41779.0, 300 sec: 41987.4). Total num frames: 4424777728. Throughput: 0: 42243.9. Samples: 692416880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:35:08,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 03:35:10,475][26599] Updated weights for policy 0, policy_version 270074 (0.0032) [2024-06-19 03:35:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 4425007104. Throughput: 0: 42266.3. Samples: 692671300. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:35:13,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 03:35:14,265][26599] Updated weights for policy 0, policy_version 270084 (0.0028) [2024-06-19 03:35:18,182][26599] Updated weights for policy 0, policy_version 270094 (0.0034) [2024-06-19 03:35:18,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 4425220096. Throughput: 0: 42259.5. Samples: 692799500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:35:18,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 03:35:21,843][26599] Updated weights for policy 0, policy_version 270104 (0.0038) [2024-06-19 03:35:23,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4425433088. Throughput: 0: 42098.3. Samples: 693047920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:35:23,381][26367] Avg episode reward: [(0, '0.857')] [2024-06-19 03:35:26,274][26599] Updated weights for policy 0, policy_version 270114 (0.0033) [2024-06-19 03:35:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 4425646080. Throughput: 0: 41949.8. Samples: 693294840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:35:28,381][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 03:35:29,803][26599] Updated weights for policy 0, policy_version 270124 (0.0037) [2024-06-19 03:35:33,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 4425842688. Throughput: 0: 42039.9. Samples: 693426280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 03:35:33,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 03:35:34,063][26599] Updated weights for policy 0, policy_version 270134 (0.0037) [2024-06-19 03:35:37,565][26599] Updated weights for policy 0, policy_version 270144 (0.0029) [2024-06-19 03:35:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4426055680. Throughput: 0: 42191.8. Samples: 693680760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:35:38,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 03:35:38,469][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000270146_4426072064.pth... [2024-06-19 03:35:38,526][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000269531_4415995904.pth [2024-06-19 03:35:41,778][26599] Updated weights for policy 0, policy_version 270154 (0.0030) [2024-06-19 03:35:43,384][26367] Fps is (10 sec: 44221.1, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 4426285056. Throughput: 0: 41945.5. Samples: 693925560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:35:43,385][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 03:35:45,278][26599] Updated weights for policy 0, policy_version 270164 (0.0032) [2024-06-19 03:35:48,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42099.1). Total num frames: 4426481664. Throughput: 0: 42027.4. Samples: 694059540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:35:48,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 03:35:49,539][26599] Updated weights for policy 0, policy_version 270174 (0.0033) [2024-06-19 03:35:50,167][26579] Signal inference workers to stop experience collection... (10300 times) [2024-06-19 03:35:50,168][26579] Signal inference workers to resume experience collection... 
(10300 times) [2024-06-19 03:35:50,211][26599] InferenceWorker_p0-w0: stopping experience collection (10300 times) [2024-06-19 03:35:50,211][26599] InferenceWorker_p0-w0: resuming experience collection (10300 times) [2024-06-19 03:35:52,981][26599] Updated weights for policy 0, policy_version 270184 (0.0044) [2024-06-19 03:35:53,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42598.3, 300 sec: 41987.5). Total num frames: 4426711040. Throughput: 0: 42205.0. Samples: 694316100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:35:53,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 03:35:57,270][26599] Updated weights for policy 0, policy_version 270194 (0.0036) [2024-06-19 03:35:58,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 4426907648. Throughput: 0: 41958.3. Samples: 694559580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:35:58,385][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 03:36:00,623][26599] Updated weights for policy 0, policy_version 270204 (0.0037) [2024-06-19 03:36:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4427104256. Throughput: 0: 41888.5. Samples: 694684480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:03,380][26367] Avg episode reward: [(0, '0.387')] [2024-06-19 03:36:04,819][26599] Updated weights for policy 0, policy_version 270214 (0.0035) [2024-06-19 03:36:08,380][26367] Fps is (10 sec: 40975.0, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4427317248. Throughput: 0: 42137.3. Samples: 694944100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:08,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 03:36:08,797][26599] Updated weights for policy 0, policy_version 270224 (0.0034) [2024-06-19 03:36:12,597][26599] Updated weights for policy 0, policy_version 270234 (0.0041) [2024-06-19 03:36:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 4427530240. Throughput: 0: 42174.8. Samples: 695192700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:13,380][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 03:36:16,548][26599] Updated weights for policy 0, policy_version 270244 (0.0057) [2024-06-19 03:36:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4427743232. Throughput: 0: 42070.3. Samples: 695319440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:18,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 03:36:20,292][26599] Updated weights for policy 0, policy_version 270254 (0.0027) [2024-06-19 03:36:23,380][26367] Fps is (10 sec: 40959.2, 60 sec: 41779.1, 300 sec: 41931.9). Total num frames: 4427939840. Throughput: 0: 42092.4. Samples: 695574920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:23,381][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 03:36:24,175][26599] Updated weights for policy 0, policy_version 270264 (0.0028) [2024-06-19 03:36:27,839][26599] Updated weights for policy 0, policy_version 270274 (0.0029) [2024-06-19 03:36:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4428185600. Throughput: 0: 42234.0. Samples: 695825940. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:28,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 03:36:31,913][26599] Updated weights for policy 0, policy_version 270284 (0.0031) [2024-06-19 03:36:33,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4428382208. Throughput: 0: 42253.3. Samples: 695960940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:33,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 03:36:35,654][26599] Updated weights for policy 0, policy_version 270294 (0.0049) [2024-06-19 03:36:38,384][26367] Fps is (10 sec: 39307.5, 60 sec: 42049.8, 300 sec: 41986.9). Total num frames: 4428578816. Throughput: 0: 42078.4. Samples: 696209780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:38,385][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 03:36:39,711][26599] Updated weights for policy 0, policy_version 270304 (0.0035) [2024-06-19 03:36:43,307][26599] Updated weights for policy 0, policy_version 270314 (0.0038) [2024-06-19 03:36:43,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42328.0, 300 sec: 42265.2). Total num frames: 4428824576. Throughput: 0: 42113.3. Samples: 696454520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:43,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 03:36:47,764][26599] Updated weights for policy 0, policy_version 270324 (0.0039) [2024-06-19 03:36:48,380][26367] Fps is (10 sec: 40974.8, 60 sec: 41779.2, 300 sec: 41876.5). Total num frames: 4428988416. Throughput: 0: 42188.8. Samples: 696582980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:36:48,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 03:36:51,032][26599] Updated weights for policy 0, policy_version 270334 (0.0051) [2024-06-19 03:36:53,380][26367] Fps is (10 sec: 37682.7, 60 sec: 41506.1, 300 sec: 42043.5). Total num frames: 4429201408. Throughput: 0: 41888.4. Samples: 696829080. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:36:53,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 03:36:55,821][26599] Updated weights for policy 0, policy_version 270344 (0.0024) [2024-06-19 03:36:58,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42327.9, 300 sec: 42209.6). Total num frames: 4429447168. Throughput: 0: 41922.5. Samples: 697079220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:36:58,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 03:36:58,927][26599] Updated weights for policy 0, policy_version 270354 (0.0050) [2024-06-19 03:37:03,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4429611008. Throughput: 0: 42057.8. Samples: 697212040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:03,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 03:37:03,556][26599] Updated weights for policy 0, policy_version 270364 (0.0033) [2024-06-19 03:37:06,588][26599] Updated weights for policy 0, policy_version 270374 (0.0041) [2024-06-19 03:37:08,384][26367] Fps is (10 sec: 39307.5, 60 sec: 42049.7, 300 sec: 41987.0). Total num frames: 4429840384. Throughput: 0: 41969.6. Samples: 697463700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:08,385][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 03:37:11,351][26599] Updated weights for policy 0, policy_version 270384 (0.0031) [2024-06-19 03:37:13,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 4430086144. Throughput: 0: 42008.5. Samples: 697716320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:13,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 03:37:14,342][26599] Updated weights for policy 0, policy_version 270394 (0.0037) [2024-06-19 03:37:18,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42052.3, 300 sec: 41932.4). Total num frames: 4430266368. Throughput: 0: 41949.4. Samples: 697848660. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:18,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 03:37:19,391][26599] Updated weights for policy 0, policy_version 270404 (0.0041) [2024-06-19 03:37:20,489][26579] Signal inference workers to stop experience collection... (10350 times) [2024-06-19 03:37:20,531][26599] InferenceWorker_p0-w0: stopping experience collection (10350 times) [2024-06-19 03:37:20,539][26579] Signal inference workers to resume experience collection... (10350 times) [2024-06-19 03:37:20,543][26599] InferenceWorker_p0-w0: resuming experience collection (10350 times) [2024-06-19 03:37:21,941][26599] Updated weights for policy 0, policy_version 270414 (0.0029) [2024-06-19 03:37:23,383][26367] Fps is (10 sec: 39309.6, 60 sec: 42323.3, 300 sec: 42098.1). Total num frames: 4430479360. Throughput: 0: 41802.8. Samples: 698090880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:23,384][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 03:37:27,032][26599] Updated weights for policy 0, policy_version 270424 (0.0032) [2024-06-19 03:37:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 4430708736. Throughput: 0: 42146.1. Samples: 698351100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:28,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 03:37:29,626][26599] Updated weights for policy 0, policy_version 270434 (0.0034) [2024-06-19 03:37:33,380][26367] Fps is (10 sec: 40972.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4430888960. Throughput: 0: 42167.6. Samples: 698480520. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:33,384][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 03:37:34,767][26599] Updated weights for policy 0, policy_version 270444 (0.0039) [2024-06-19 03:37:37,445][26599] Updated weights for policy 0, policy_version 270454 (0.0026) [2024-06-19 03:37:38,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42598.4, 300 sec: 42209.1). Total num frames: 4431134720. Throughput: 0: 42186.9. Samples: 698727640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:38,385][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 03:37:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000270455_4431134720.pth... [2024-06-19 03:37:38,452][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000269837_4421009408.pth [2024-06-19 03:37:42,571][26599] Updated weights for policy 0, policy_version 270464 (0.0041) [2024-06-19 03:37:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 4431331328. Throughput: 0: 42389.7. Samples: 698986760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:43,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 03:37:45,151][26599] Updated weights for policy 0, policy_version 270474 (0.0035) [2024-06-19 03:37:48,380][26367] Fps is (10 sec: 39336.0, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4431527936. Throughput: 0: 42101.3. Samples: 699106600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:48,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 03:37:50,274][26599] Updated weights for policy 0, policy_version 270484 (0.0040) [2024-06-19 03:37:52,813][26599] Updated weights for policy 0, policy_version 270494 (0.0023) [2024-06-19 03:37:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 4431773696. Throughput: 0: 42139.4. Samples: 699359820. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:53,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 03:37:58,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41233.2, 300 sec: 41876.4). Total num frames: 4431921152. Throughput: 0: 42301.4. Samples: 699619880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 24.0) [2024-06-19 03:37:58,380][26367] Avg episode reward: [(0, '0.781')] [2024-06-19 03:37:58,452][26599] Updated weights for policy 0, policy_version 270504 (0.0028) [2024-06-19 03:38:00,807][26599] Updated weights for policy 0, policy_version 270514 (0.0037) [2024-06-19 03:38:03,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4432150528. Throughput: 0: 41765.8. Samples: 699728120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:03,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 03:38:06,209][26599] Updated weights for policy 0, policy_version 270524 (0.0025) [2024-06-19 03:38:08,380][26367] Fps is (10 sec: 47512.7, 60 sec: 42600.9, 300 sec: 42154.1). Total num frames: 4432396288. Throughput: 0: 42183.7. Samples: 699989020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:08,381][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 03:38:08,734][26599] Updated weights for policy 0, policy_version 270534 (0.0035) [2024-06-19 03:38:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.1, 300 sec: 41876.4). Total num frames: 4432560128. Throughput: 0: 42152.9. Samples: 700247980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:13,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 03:38:13,823][26599] Updated weights for policy 0, policy_version 270544 (0.0030) [2024-06-19 03:38:16,595][26599] Updated weights for policy 0, policy_version 270554 (0.0030) [2024-06-19 03:38:18,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4432805888. Throughput: 0: 41903.5. Samples: 700366180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:18,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 03:38:21,676][26599] Updated weights for policy 0, policy_version 270564 (0.0043) [2024-06-19 03:38:23,380][26367] Fps is (10 sec: 47513.6, 60 sec: 42600.6, 300 sec: 42154.2). Total num frames: 4433035264. Throughput: 0: 42228.3. Samples: 700627760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:23,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 03:38:24,395][26599] Updated weights for policy 0, policy_version 270574 (0.0032) [2024-06-19 03:38:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4433215488. Throughput: 0: 42015.7. Samples: 700877460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:28,381][26367] Avg episode reward: [(0, '0.339')] [2024-06-19 03:38:29,364][26599] Updated weights for policy 0, policy_version 270584 (0.0040) [2024-06-19 03:38:31,583][26579] Signal inference workers to stop experience collection... (10400 times) [2024-06-19 03:38:31,631][26599] InferenceWorker_p0-w0: stopping experience collection (10400 times) [2024-06-19 03:38:31,640][26579] Signal inference workers to resume experience collection... (10400 times) [2024-06-19 03:38:31,653][26599] InferenceWorker_p0-w0: resuming experience collection (10400 times) [2024-06-19 03:38:32,246][26599] Updated weights for policy 0, policy_version 270594 (0.0029) [2024-06-19 03:38:33,380][26367] Fps is (10 sec: 39320.9, 60 sec: 42325.2, 300 sec: 42099.0). Total num frames: 4433428480. Throughput: 0: 42046.1. Samples: 700998680. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:33,381][26367] Avg episode reward: [(0, '0.209')] [2024-06-19 03:38:36,952][26599] Updated weights for policy 0, policy_version 270604 (0.0036) [2024-06-19 03:38:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42054.9, 300 sec: 42154.1). Total num frames: 4433657856. Throughput: 0: 42180.1. 
Samples: 701257920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:38,380][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 03:38:40,135][26599] Updated weights for policy 0, policy_version 270614 (0.0030) [2024-06-19 03:38:43,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4433838080. Throughput: 0: 41965.7. Samples: 701508340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:43,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 03:38:44,578][26599] Updated weights for policy 0, policy_version 270624 (0.0032) [2024-06-19 03:38:48,158][26599] Updated weights for policy 0, policy_version 270634 (0.0026) [2024-06-19 03:38:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4434067456. Throughput: 0: 42283.0. Samples: 701630860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:48,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 03:38:52,328][26599] Updated weights for policy 0, policy_version 270644 (0.0038) [2024-06-19 03:38:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 4434280448. Throughput: 0: 42265.9. Samples: 701890980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:53,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 03:38:56,159][26599] Updated weights for policy 0, policy_version 270654 (0.0038) [2024-06-19 03:38:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42098.5). Total num frames: 4434493440. Throughput: 0: 41843.9. Samples: 702130960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:38:58,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 03:39:00,033][26599] Updated weights for policy 0, policy_version 270664 (0.0035) [2024-06-19 03:39:03,384][26367] Fps is (10 sec: 39307.1, 60 sec: 42049.7, 300 sec: 42042.5). Total num frames: 4434673664. Throughput: 0: 42138.0. 
Samples: 702262540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:39:03,385][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 03:39:04,233][26599] Updated weights for policy 0, policy_version 270674 (0.0046) [2024-06-19 03:39:07,734][26599] Updated weights for policy 0, policy_version 270684 (0.0037) [2024-06-19 03:39:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4434903040. Throughput: 0: 41863.1. Samples: 702511600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 03:39:08,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 03:39:11,991][26599] Updated weights for policy 0, policy_version 270694 (0.0041) [2024-06-19 03:39:13,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4435116032. Throughput: 0: 41887.5. Samples: 702762400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:13,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 03:39:15,381][26599] Updated weights for policy 0, policy_version 270704 (0.0035) [2024-06-19 03:39:18,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4435312640. Throughput: 0: 41939.3. Samples: 702885940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:18,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 03:39:19,718][26599] Updated weights for policy 0, policy_version 270714 (0.0034) [2024-06-19 03:39:23,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 4435525632. Throughput: 0: 41786.3. Samples: 703138300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:23,380][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 03:39:23,512][26599] Updated weights for policy 0, policy_version 270724 (0.0043) [2024-06-19 03:39:27,463][26599] Updated weights for policy 0, policy_version 270734 (0.0041) [2024-06-19 03:39:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 4435738624. Throughput: 0: 41742.3. Samples: 703386740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:28,380][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 03:39:31,180][26599] Updated weights for policy 0, policy_version 270744 (0.0028) [2024-06-19 03:39:33,384][26367] Fps is (10 sec: 42582.2, 60 sec: 42049.8, 300 sec: 42098.0). Total num frames: 4435951616. Throughput: 0: 41802.4. Samples: 703512120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:33,385][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 03:39:35,317][26599] Updated weights for policy 0, policy_version 270754 (0.0023) [2024-06-19 03:39:38,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 4436131840. Throughput: 0: 41651.4. Samples: 703765300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:38,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 03:39:38,524][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000270762_4436164608.pth... [2024-06-19 03:39:38,580][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000270146_4426072064.pth [2024-06-19 03:39:38,944][26599] Updated weights for policy 0, policy_version 270764 (0.0033) [2024-06-19 03:39:43,112][26599] Updated weights for policy 0, policy_version 270774 (0.0038) [2024-06-19 03:39:43,380][26367] Fps is (10 sec: 42614.4, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4436377600. Throughput: 0: 41955.7. Samples: 704018960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:43,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 03:39:46,888][26599] Updated weights for policy 0, policy_version 270784 (0.0045) [2024-06-19 03:39:48,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4436590592. Throughput: 0: 41941.0. Samples: 704149740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:48,381][26367] Avg episode reward: [(0, '0.375')] [2024-06-19 03:39:50,857][26599] Updated weights for policy 0, policy_version 270794 (0.0037) [2024-06-19 03:39:51,745][26579] Signal inference workers to stop experience collection... (10450 times) [2024-06-19 03:39:51,751][26579] Signal inference workers to resume experience collection... (10450 times) [2024-06-19 03:39:51,796][26599] InferenceWorker_p0-w0: stopping experience collection (10450 times) [2024-06-19 03:39:51,796][26599] InferenceWorker_p0-w0: resuming experience collection (10450 times) [2024-06-19 03:39:53,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4436770816. Throughput: 0: 41918.3. Samples: 704397920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:53,381][26367] Avg episode reward: [(0, '0.306')] [2024-06-19 03:39:54,670][26599] Updated weights for policy 0, policy_version 270804 (0.0033) [2024-06-19 03:39:58,380][26367] Fps is (10 sec: 39322.5, 60 sec: 41506.3, 300 sec: 41987.5). Total num frames: 4436983808. Throughput: 0: 41959.2. Samples: 704650560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:39:58,380][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 03:39:58,657][26599] Updated weights for policy 0, policy_version 270814 (0.0034) [2024-06-19 03:40:02,549][26599] Updated weights for policy 0, policy_version 270824 (0.0041) [2024-06-19 03:40:03,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42327.8, 300 sec: 42154.1). Total num frames: 4437213184. Throughput: 0: 42012.3. 
Samples: 704776500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:40:03,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 03:40:06,635][26599] Updated weights for policy 0, policy_version 270834 (0.0029) [2024-06-19 03:40:08,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 4437426176. Throughput: 0: 41957.2. Samples: 705026380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:40:08,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 03:40:10,611][26599] Updated weights for policy 0, policy_version 270844 (0.0034) [2024-06-19 03:40:13,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4437622784. Throughput: 0: 42054.6. Samples: 705279200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:40:13,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 03:40:14,431][26599] Updated weights for policy 0, policy_version 270854 (0.0045) [2024-06-19 03:40:18,370][26599] Updated weights for policy 0, policy_version 270864 (0.0036) [2024-06-19 03:40:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4437835776. Throughput: 0: 42027.5. Samples: 705403200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:40:18,380][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 03:40:22,162][26599] Updated weights for policy 0, policy_version 270874 (0.0043) [2024-06-19 03:40:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 4438048768. Throughput: 0: 42038.7. Samples: 705657040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 03:40:23,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 03:40:26,642][26599] Updated weights for policy 0, policy_version 270884 (0.0034) [2024-06-19 03:40:28,383][26367] Fps is (10 sec: 42584.8, 60 sec: 42050.0, 300 sec: 42098.1). Total num frames: 4438261760. Throughput: 0: 41861.5. Samples: 705902860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:40:28,384][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 03:40:30,191][26599] Updated weights for policy 0, policy_version 270894 (0.0028) [2024-06-19 03:40:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41781.8, 300 sec: 42043.0). Total num frames: 4438458368. Throughput: 0: 41884.1. Samples: 706034520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:40:33,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 03:40:34,192][26599] Updated weights for policy 0, policy_version 270904 (0.0049) [2024-06-19 03:40:37,947][26599] Updated weights for policy 0, policy_version 270914 (0.0032) [2024-06-19 03:40:38,380][26367] Fps is (10 sec: 40972.6, 60 sec: 42325.4, 300 sec: 41988.0). Total num frames: 4438671360. Throughput: 0: 41938.2. Samples: 706285140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:40:38,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 03:40:41,896][26599] Updated weights for policy 0, policy_version 270924 (0.0032) [2024-06-19 03:40:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 4438900736. Throughput: 0: 41945.3. Samples: 706538100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:40:43,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 03:40:45,591][26599] Updated weights for policy 0, policy_version 270934 (0.0031) [2024-06-19 03:40:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4439097344. Throughput: 0: 42036.5. Samples: 706668140. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:40:48,381][26367] Avg episode reward: [(0, '0.468')] [2024-06-19 03:40:49,673][26599] Updated weights for policy 0, policy_version 270944 (0.0030) [2024-06-19 03:40:53,204][26599] Updated weights for policy 0, policy_version 270954 (0.0041) [2024-06-19 03:40:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42043.5). Total num frames: 4439310336. Throughput: 0: 42015.1. Samples: 706917060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:40:53,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 03:40:57,320][26599] Updated weights for policy 0, policy_version 270964 (0.0045) [2024-06-19 03:40:58,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 4439523328. Throughput: 0: 42066.7. Samples: 707172200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:40:58,380][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 03:41:01,107][26599] Updated weights for policy 0, policy_version 270974 (0.0036) [2024-06-19 03:41:03,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.4, 300 sec: 42043.0). Total num frames: 4439719936. Throughput: 0: 42161.4. Samples: 707300460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:41:03,380][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 03:41:05,069][26599] Updated weights for policy 0, policy_version 270984 (0.0041) [2024-06-19 03:41:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4439932928. Throughput: 0: 41981.9. Samples: 707546220. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:41:08,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 03:41:08,935][26599] Updated weights for policy 0, policy_version 270994 (0.0045) [2024-06-19 03:41:12,745][26599] Updated weights for policy 0, policy_version 271004 (0.0038) [2024-06-19 03:41:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4440145920. Throughput: 0: 42183.9. Samples: 707801000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:41:13,380][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 03:41:16,702][26599] Updated weights for policy 0, policy_version 271014 (0.0038) [2024-06-19 03:41:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 4440358912. Throughput: 0: 42095.1. Samples: 707928800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:41:18,381][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 03:41:20,493][26599] Updated weights for policy 0, policy_version 271024 (0.0028) [2024-06-19 03:41:20,985][26579] Signal inference workers to stop experience collection... (10500 times) [2024-06-19 03:41:21,026][26599] InferenceWorker_p0-w0: stopping experience collection (10500 times) [2024-06-19 03:41:21,047][26579] Signal inference workers to resume experience collection... (10500 times) [2024-06-19 03:41:21,048][26599] InferenceWorker_p0-w0: resuming experience collection (10500 times) [2024-06-19 03:41:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4440555520. Throughput: 0: 42153.3. Samples: 708182040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:41:23,384][26367] Avg episode reward: [(0, '0.888')] [2024-06-19 03:41:24,319][26599] Updated weights for policy 0, policy_version 271034 (0.0039) [2024-06-19 03:41:28,231][26599] Updated weights for policy 0, policy_version 271044 (0.0034) [2024-06-19 03:41:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42054.5, 300 sec: 42043.0). Total num frames: 4440784896. Throughput: 0: 42194.7. Samples: 708436860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:41:28,380][26367] Avg episode reward: [(0, '0.402')] [2024-06-19 03:41:31,987][26599] Updated weights for policy 0, policy_version 271054 (0.0037) [2024-06-19 03:41:33,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42049.7, 300 sec: 42043.0). Total num frames: 4440981504. Throughput: 0: 42114.4. Samples: 708563440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 03:41:33,385][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 03:41:36,316][26599] Updated weights for policy 0, policy_version 271064 (0.0037) [2024-06-19 03:41:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4441194496. Throughput: 0: 42021.8. Samples: 708808040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:41:38,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 03:41:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000271069_4441194496.pth... [2024-06-19 03:41:38,463][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000270455_4431134720.pth [2024-06-19 03:41:40,221][26599] Updated weights for policy 0, policy_version 271074 (0.0035) [2024-06-19 03:41:43,380][26367] Fps is (10 sec: 40975.1, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4441391104. Throughput: 0: 41961.3. Samples: 709060460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:41:43,381][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 03:41:44,067][26599] Updated weights for policy 0, policy_version 271084 (0.0050) [2024-06-19 03:41:48,056][26599] Updated weights for policy 0, policy_version 271094 (0.0039) [2024-06-19 03:41:48,382][26367] Fps is (10 sec: 40954.0, 60 sec: 41778.3, 300 sec: 42042.8). Total num frames: 4441604096. Throughput: 0: 41891.4. Samples: 709185640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:41:48,382][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 03:41:51,888][26599] Updated weights for policy 0, policy_version 271104 (0.0041) [2024-06-19 03:41:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4441817088. Throughput: 0: 42015.0. Samples: 709436900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:41:53,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 03:41:55,793][26599] Updated weights for policy 0, policy_version 271114 (0.0043) [2024-06-19 03:41:58,380][26367] Fps is (10 sec: 42604.6, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 4442030080. Throughput: 0: 41793.7. Samples: 709681720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:41:58,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 03:41:59,692][26599] Updated weights for policy 0, policy_version 271124 (0.0042) [2024-06-19 03:42:03,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41779.0, 300 sec: 41988.0). Total num frames: 4442226688. Throughput: 0: 41786.5. Samples: 709809200. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:42:03,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 03:42:03,904][26599] Updated weights for policy 0, policy_version 271134 (0.0036) [2024-06-19 03:42:07,701][26599] Updated weights for policy 0, policy_version 271144 (0.0038) [2024-06-19 03:42:08,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 41931.9). Total num frames: 4442456064. Throughput: 0: 41606.1. Samples: 710054320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:42:08,381][26367] Avg episode reward: [(0, '0.786')] [2024-06-19 03:42:11,789][26599] Updated weights for policy 0, policy_version 271154 (0.0034) [2024-06-19 03:42:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 4442652672. Throughput: 0: 41413.6. Samples: 710300480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:42:13,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 03:42:15,594][26599] Updated weights for policy 0, policy_version 271164 (0.0028) [2024-06-19 03:42:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 41932.4). Total num frames: 4442849280. Throughput: 0: 41316.2. Samples: 710422520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:42:18,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 03:42:19,838][26599] Updated weights for policy 0, policy_version 271174 (0.0036) [2024-06-19 03:42:23,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41779.2, 300 sec: 41876.4). Total num frames: 4443062272. Throughput: 0: 41535.1. Samples: 710677120. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:42:23,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 03:42:23,538][26599] Updated weights for policy 0, policy_version 271184 (0.0042) [2024-06-19 03:42:27,601][26599] Updated weights for policy 0, policy_version 271194 (0.0040) [2024-06-19 03:42:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4443291648. Throughput: 0: 41382.6. Samples: 710922680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:42:28,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 03:42:31,368][26599] Updated weights for policy 0, policy_version 271204 (0.0045) [2024-06-19 03:42:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41508.7, 300 sec: 41821.4). Total num frames: 4443471872. Throughput: 0: 41426.7. Samples: 711049780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:42:33,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 03:42:35,318][26599] Updated weights for policy 0, policy_version 271214 (0.0036) [2024-06-19 03:42:38,380][26367] Fps is (10 sec: 36045.1, 60 sec: 40960.0, 300 sec: 41765.3). Total num frames: 4443652096. Throughput: 0: 41268.9. Samples: 711294000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:42:38,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 03:42:39,295][26599] Updated weights for policy 0, policy_version 271224 (0.0047) [2024-06-19 03:42:41,072][26579] Signal inference workers to stop experience collection... (10550 times) [2024-06-19 03:42:41,072][26579] Signal inference workers to resume experience collection... (10550 times) [2024-06-19 03:42:41,098][26599] InferenceWorker_p0-w0: stopping experience collection (10550 times) [2024-06-19 03:42:41,099][26599] InferenceWorker_p0-w0: resuming experience collection (10550 times) [2024-06-19 03:42:43,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4443881472. Throughput: 0: 41288.0. 
Samples: 711539680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 03:42:43,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 03:42:43,453][26599] Updated weights for policy 0, policy_version 271234 (0.0029) [2024-06-19 03:42:47,092][26599] Updated weights for policy 0, policy_version 271244 (0.0044) [2024-06-19 03:42:48,384][26367] Fps is (10 sec: 42582.8, 60 sec: 41231.6, 300 sec: 41709.3). Total num frames: 4444078080. Throughput: 0: 41337.2. Samples: 711669520. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:42:48,384][26367] Avg episode reward: [(0, '0.800')] [2024-06-19 03:42:51,057][26599] Updated weights for policy 0, policy_version 271254 (0.0041) [2024-06-19 03:42:53,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41233.0, 300 sec: 41931.9). Total num frames: 4444291072. Throughput: 0: 41454.7. Samples: 711919780. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:42:53,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 03:42:55,087][26599] Updated weights for policy 0, policy_version 271264 (0.0039) [2024-06-19 03:42:58,380][26367] Fps is (10 sec: 44253.0, 60 sec: 41506.2, 300 sec: 41931.9). Total num frames: 4444520448. Throughput: 0: 41661.5. Samples: 712175240. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:42:58,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 03:42:58,660][26599] Updated weights for policy 0, policy_version 271274 (0.0027) [2024-06-19 03:43:02,850][26599] Updated weights for policy 0, policy_version 271284 (0.0030) [2024-06-19 03:43:03,380][26367] Fps is (10 sec: 42599.2, 60 sec: 41506.3, 300 sec: 41765.3). Total num frames: 4444717056. Throughput: 0: 41834.8. Samples: 712305080. 
Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:03,380][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 03:43:06,324][26599] Updated weights for policy 0, policy_version 271294 (0.0046) [2024-06-19 03:43:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41233.2, 300 sec: 41931.9). Total num frames: 4444930048. Throughput: 0: 41501.3. Samples: 712544680. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:08,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 03:43:10,789][26599] Updated weights for policy 0, policy_version 271304 (0.0042) [2024-06-19 03:43:13,380][26367] Fps is (10 sec: 42597.4, 60 sec: 41506.1, 300 sec: 41820.8). Total num frames: 4445143040. Throughput: 0: 41599.5. Samples: 712794660. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:13,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 03:43:14,048][26599] Updated weights for policy 0, policy_version 271314 (0.0032) [2024-06-19 03:43:18,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 4445339648. Throughput: 0: 41664.4. Samples: 712924680. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:18,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 03:43:18,571][26599] Updated weights for policy 0, policy_version 271324 (0.0022) [2024-06-19 03:43:21,935][26599] Updated weights for policy 0, policy_version 271334 (0.0032) [2024-06-19 03:43:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.1, 300 sec: 41876.4). Total num frames: 4445569024. Throughput: 0: 41683.4. Samples: 713169760. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:23,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 03:43:26,356][26599] Updated weights for policy 0, policy_version 271344 (0.0041) [2024-06-19 03:43:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41506.2, 300 sec: 41876.4). Total num frames: 4445782016. Throughput: 0: 41937.4. Samples: 713426860. 
Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:28,380][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 03:43:29,810][26599] Updated weights for policy 0, policy_version 271354 (0.0033) [2024-06-19 03:43:33,380][26367] Fps is (10 sec: 37683.9, 60 sec: 41233.1, 300 sec: 41654.2). Total num frames: 4445945856. Throughput: 0: 41716.7. Samples: 713546620. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:33,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 03:43:34,333][26599] Updated weights for policy 0, policy_version 271364 (0.0039) [2024-06-19 03:43:37,916][26599] Updated weights for policy 0, policy_version 271374 (0.0039) [2024-06-19 03:43:38,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 41931.9). Total num frames: 4446208000. Throughput: 0: 41784.5. Samples: 713800080. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:38,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 03:43:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000271375_4446208000.pth... [2024-06-19 03:43:38,466][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000270762_4436164608.pth [2024-06-19 03:43:42,031][26599] Updated weights for policy 0, policy_version 271384 (0.0037) [2024-06-19 03:43:43,384][26367] Fps is (10 sec: 44220.2, 60 sec: 41776.7, 300 sec: 41764.8). Total num frames: 4446388224. Throughput: 0: 41733.9. Samples: 714053420. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:43,385][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 03:43:43,765][26579] Signal inference workers to stop experience collection... (10600 times) [2024-06-19 03:43:43,765][26579] Signal inference workers to resume experience collection... 
(10600 times) [2024-06-19 03:43:43,797][26599] InferenceWorker_p0-w0: stopping experience collection (10600 times) [2024-06-19 03:43:43,797][26599] InferenceWorker_p0-w0: resuming experience collection (10600 times) [2024-06-19 03:43:45,703][26599] Updated weights for policy 0, policy_version 271394 (0.0026) [2024-06-19 03:43:48,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41781.6, 300 sec: 41709.8). Total num frames: 4446584832. Throughput: 0: 41449.6. Samples: 714170320. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:48,381][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 03:43:49,939][26599] Updated weights for policy 0, policy_version 271404 (0.0035) [2024-06-19 03:43:53,380][26367] Fps is (10 sec: 44253.6, 60 sec: 42325.5, 300 sec: 41820.9). Total num frames: 4446830592. Throughput: 0: 41863.6. Samples: 714428540. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:53,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 03:43:53,453][26599] Updated weights for policy 0, policy_version 271414 (0.0044) [2024-06-19 03:43:57,888][26599] Updated weights for policy 0, policy_version 271424 (0.0038) [2024-06-19 03:43:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.1, 300 sec: 41876.9). Total num frames: 4447027200. Throughput: 0: 41884.9. Samples: 714679480. Policy #0 lag: (min: 2.0, avg: 11.6, max: 24.0) [2024-06-19 03:43:58,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 03:44:01,188][26599] Updated weights for policy 0, policy_version 271434 (0.0030) [2024-06-19 03:44:03,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42052.2, 300 sec: 41820.9). Total num frames: 4447240192. Throughput: 0: 41810.1. Samples: 714806140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:03,386][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 03:44:05,611][26599] Updated weights for policy 0, policy_version 271444 (0.0033) [2024-06-19 03:44:08,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 41876.4). Total num frames: 4447469568. Throughput: 0: 41980.2. Samples: 715058860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:08,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 03:44:08,792][26599] Updated weights for policy 0, policy_version 271454 (0.0032) [2024-06-19 03:44:13,338][26599] Updated weights for policy 0, policy_version 271464 (0.0037) [2024-06-19 03:44:13,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4447666176. Throughput: 0: 42024.0. Samples: 715317940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:13,380][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 03:44:16,877][26599] Updated weights for policy 0, policy_version 271474 (0.0039) [2024-06-19 03:44:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 41820.8). Total num frames: 4447862784. Throughput: 0: 41944.4. Samples: 715434120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:18,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 03:44:21,265][26599] Updated weights for policy 0, policy_version 271484 (0.0039) [2024-06-19 03:44:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.4, 300 sec: 41876.4). Total num frames: 4448092160. Throughput: 0: 42033.4. Samples: 715691580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:23,381][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 03:44:24,628][26599] Updated weights for policy 0, policy_version 271494 (0.0046) [2024-06-19 03:44:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 41821.4). Total num frames: 4448288768. Throughput: 0: 41840.8. Samples: 715936100. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:28,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 03:44:29,239][26599] Updated weights for policy 0, policy_version 271504 (0.0042) [2024-06-19 03:44:32,406][26599] Updated weights for policy 0, policy_version 271514 (0.0038) [2024-06-19 03:44:33,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42598.2, 300 sec: 41931.9). Total num frames: 4448501760. Throughput: 0: 42019.9. Samples: 716061220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:33,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 03:44:37,248][26599] Updated weights for policy 0, policy_version 271524 (0.0033) [2024-06-19 03:44:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41506.2, 300 sec: 41765.3). Total num frames: 4448698368. Throughput: 0: 42030.1. Samples: 716319900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:38,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 03:44:40,235][26599] Updated weights for policy 0, policy_version 271534 (0.0031) [2024-06-19 03:44:43,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42054.9, 300 sec: 41765.3). Total num frames: 4448911360. Throughput: 0: 42017.5. Samples: 716570260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:43,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 03:44:45,022][26599] Updated weights for policy 0, policy_version 271544 (0.0035) [2024-06-19 03:44:48,171][26599] Updated weights for policy 0, policy_version 271554 (0.0035) [2024-06-19 03:44:48,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42595.9, 300 sec: 41931.4). Total num frames: 4449140736. Throughput: 0: 41940.2. Samples: 716693600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:48,384][26367] Avg episode reward: [(0, '0.396')] [2024-06-19 03:44:52,964][26599] Updated weights for policy 0, policy_version 271564 (0.0029) [2024-06-19 03:44:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 4449320960. Throughput: 0: 41698.7. Samples: 716935300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:53,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 03:44:56,087][26579] Signal inference workers to stop experience collection... (10650 times) [2024-06-19 03:44:56,109][26599] InferenceWorker_p0-w0: stopping experience collection (10650 times) [2024-06-19 03:44:56,201][26579] Signal inference workers to resume experience collection... (10650 times) [2024-06-19 03:44:56,201][26599] InferenceWorker_p0-w0: resuming experience collection (10650 times) [2024-06-19 03:44:56,203][26599] Updated weights for policy 0, policy_version 271574 (0.0038) [2024-06-19 03:44:58,380][26367] Fps is (10 sec: 37697.3, 60 sec: 41506.3, 300 sec: 41709.8). Total num frames: 4449517568. Throughput: 0: 41692.4. Samples: 717194100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:44:58,380][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 03:45:00,789][26599] Updated weights for policy 0, policy_version 271584 (0.0050) [2024-06-19 03:45:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.2, 300 sec: 41709.8). Total num frames: 4449730560. Throughput: 0: 41884.5. Samples: 717318920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:45:03,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 03:45:03,936][26599] Updated weights for policy 0, policy_version 271594 (0.0027) [2024-06-19 03:45:08,366][26599] Updated weights for policy 0, policy_version 271604 (0.0028) [2024-06-19 03:45:08,380][26367] Fps is (10 sec: 44236.3, 60 sec: 41506.1, 300 sec: 41820.9). Total num frames: 4449959936. Throughput: 0: 41727.5. 
Samples: 717569320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 03:45:08,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 03:45:11,823][26599] Updated weights for policy 0, policy_version 271614 (0.0041) [2024-06-19 03:45:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41506.0, 300 sec: 41765.3). Total num frames: 4450156544. Throughput: 0: 41827.4. Samples: 717818340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:13,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 03:45:16,637][26599] Updated weights for policy 0, policy_version 271624 (0.0029) [2024-06-19 03:45:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 41765.3). Total num frames: 4450369536. Throughput: 0: 41902.9. Samples: 717946840. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:18,380][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 03:45:19,614][26599] Updated weights for policy 0, policy_version 271634 (0.0035) [2024-06-19 03:45:23,380][26367] Fps is (10 sec: 39322.0, 60 sec: 40960.0, 300 sec: 41654.7). Total num frames: 4450549760. Throughput: 0: 41583.1. Samples: 718191140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:23,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 03:45:24,528][26599] Updated weights for policy 0, policy_version 271644 (0.0040) [2024-06-19 03:45:27,452][26599] Updated weights for policy 0, policy_version 271654 (0.0044) [2024-06-19 03:45:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 41820.8). Total num frames: 4450795520. Throughput: 0: 41633.3. Samples: 718443760. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:28,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 03:45:32,254][26599] Updated weights for policy 0, policy_version 271664 (0.0045) [2024-06-19 03:45:33,384][26367] Fps is (10 sec: 45858.5, 60 sec: 41776.8, 300 sec: 41820.3). Total num frames: 4451008512. Throughput: 0: 41952.4. 
Samples: 718581460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:33,385][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 03:45:35,039][26599] Updated weights for policy 0, policy_version 271674 (0.0036) [2024-06-19 03:45:38,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 41709.8). Total num frames: 4451205120. Throughput: 0: 42038.2. Samples: 718827020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:38,381][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 03:45:38,413][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000271680_4451205120.pth... [2024-06-19 03:45:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000271069_4441194496.pth [2024-06-19 03:45:40,002][26599] Updated weights for policy 0, policy_version 271684 (0.0041) [2024-06-19 03:45:42,736][26599] Updated weights for policy 0, policy_version 271694 (0.0038) [2024-06-19 03:45:43,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42052.3, 300 sec: 41820.9). Total num frames: 4451434496. Throughput: 0: 41845.3. Samples: 719077140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:43,381][26367] Avg episode reward: [(0, '0.845')] [2024-06-19 03:45:48,061][26599] Updated weights for policy 0, policy_version 271704 (0.0045) [2024-06-19 03:45:48,384][26367] Fps is (10 sec: 40944.8, 60 sec: 41233.0, 300 sec: 41709.3). Total num frames: 4451614720. Throughput: 0: 42053.0. Samples: 719211460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:48,385][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 03:45:50,961][26599] Updated weights for policy 0, policy_version 271714 (0.0042) [2024-06-19 03:45:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 41820.8). Total num frames: 4451860480. Throughput: 0: 42095.2. Samples: 719463600. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:53,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 03:45:55,750][26599] Updated weights for policy 0, policy_version 271724 (0.0035) [2024-06-19 03:45:58,380][26367] Fps is (10 sec: 45891.3, 60 sec: 42598.2, 300 sec: 41876.4). Total num frames: 4452073472. Throughput: 0: 42172.4. Samples: 719716100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:45:58,388][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 03:45:58,704][26599] Updated weights for policy 0, policy_version 271734 (0.0032) [2024-06-19 03:46:03,251][26599] Updated weights for policy 0, policy_version 271744 (0.0036) [2024-06-19 03:46:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 41765.3). Total num frames: 4452253696. Throughput: 0: 42184.0. Samples: 719845120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:46:03,380][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 03:46:06,423][26599] Updated weights for policy 0, policy_version 271754 (0.0032) [2024-06-19 03:46:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 41876.4). Total num frames: 4452499456. Throughput: 0: 42385.2. Samples: 720098480. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:46:08,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 03:46:11,109][26599] Updated weights for policy 0, policy_version 271764 (0.0034) [2024-06-19 03:46:13,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 41876.4). Total num frames: 4452712448. Throughput: 0: 42230.2. Samples: 720344120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:46:13,384][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 03:46:14,624][26599] Updated weights for policy 0, policy_version 271774 (0.0035) [2024-06-19 03:46:18,380][26367] Fps is (10 sec: 37684.0, 60 sec: 41779.2, 300 sec: 41765.3). Total num frames: 4452876288. Throughput: 0: 41953.7. Samples: 720469220. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 25.0) [2024-06-19 03:46:18,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 03:46:18,533][26579] Signal inference workers to stop experience collection... (10700 times) [2024-06-19 03:46:18,533][26579] Signal inference workers to resume experience collection... (10700 times) [2024-06-19 03:46:18,579][26599] InferenceWorker_p0-w0: stopping experience collection (10700 times) [2024-06-19 03:46:18,579][26599] InferenceWorker_p0-w0: resuming experience collection (10700 times) [2024-06-19 03:46:18,671][26599] Updated weights for policy 0, policy_version 271784 (0.0031) [2024-06-19 03:46:22,197][26599] Updated weights for policy 0, policy_version 271794 (0.0038) [2024-06-19 03:46:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 41820.8). Total num frames: 4453122048. Throughput: 0: 42295.9. Samples: 720730340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:46:23,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 03:46:26,920][26599] Updated weights for policy 0, policy_version 271804 (0.0031) [2024-06-19 03:46:28,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42325.3, 300 sec: 41876.9). Total num frames: 4453335040. Throughput: 0: 42258.6. Samples: 720978780. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:46:28,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 03:46:30,120][26599] Updated weights for policy 0, policy_version 271814 (0.0048) [2024-06-19 03:46:33,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42054.8, 300 sec: 41820.8). Total num frames: 4453531648. Throughput: 0: 42011.8. Samples: 721101840. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:46:33,388][26367] Avg episode reward: [(0, '0.813')] [2024-06-19 03:46:34,563][26599] Updated weights for policy 0, policy_version 271824 (0.0043) [2024-06-19 03:46:37,646][26599] Updated weights for policy 0, policy_version 271834 (0.0041) [2024-06-19 03:46:38,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42322.7, 300 sec: 41875.9). Total num frames: 4453744640. Throughput: 0: 42087.7. Samples: 721357700. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:46:38,385][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 03:46:42,087][26599] Updated weights for policy 0, policy_version 271844 (0.0031) [2024-06-19 03:46:43,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 41821.1). Total num frames: 4453941248. Throughput: 0: 42103.3. Samples: 721610740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:46:43,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 03:46:45,258][26599] Updated weights for policy 0, policy_version 271854 (0.0035) [2024-06-19 03:46:48,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42601.0, 300 sec: 41876.4). Total num frames: 4454170624. Throughput: 0: 42046.2. Samples: 721737200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:46:48,381][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 03:46:49,595][26599] Updated weights for policy 0, policy_version 271864 (0.0033) [2024-06-19 03:46:52,855][26599] Updated weights for policy 0, policy_version 271874 (0.0033) [2024-06-19 03:46:53,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42049.7, 300 sec: 41875.9). Total num frames: 4454383616. Throughput: 0: 42043.8. Samples: 721990600. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:46:53,385][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 03:46:57,235][26599] Updated weights for policy 0, policy_version 271884 (0.0030) [2024-06-19 03:46:58,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.4, 300 sec: 41876.4). Total num frames: 4454580224. Throughput: 0: 42251.2. Samples: 722245420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:46:58,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 03:47:00,695][26599] Updated weights for policy 0, policy_version 271894 (0.0034) [2024-06-19 03:47:03,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42598.4, 300 sec: 41876.4). Total num frames: 4454809600. Throughput: 0: 42251.9. Samples: 722370560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:47:03,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 03:47:05,147][26599] Updated weights for policy 0, policy_version 271904 (0.0042) [2024-06-19 03:47:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.4, 300 sec: 41876.4). Total num frames: 4455006208. Throughput: 0: 42222.3. Samples: 722630340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:47:08,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 03:47:08,634][26599] Updated weights for policy 0, policy_version 271914 (0.0031) [2024-06-19 03:47:12,737][26599] Updated weights for policy 0, policy_version 271924 (0.0033) [2024-06-19 03:47:13,388][26367] Fps is (10 sec: 40928.8, 60 sec: 41773.9, 300 sec: 41930.9). Total num frames: 4455219200. Throughput: 0: 42122.7. Samples: 722874620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:47:13,388][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 03:47:16,648][26599] Updated weights for policy 0, policy_version 271934 (0.0036) [2024-06-19 03:47:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 41987.5). Total num frames: 4455448576. Throughput: 0: 42153.9. Samples: 722998760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:47:18,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 03:47:20,958][26599] Updated weights for policy 0, policy_version 271944 (0.0050) [2024-06-19 03:47:23,380][26367] Fps is (10 sec: 40991.0, 60 sec: 41779.2, 300 sec: 41820.9). Total num frames: 4455628800. Throughput: 0: 42141.6. Samples: 723253920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:47:23,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 03:47:24,435][26599] Updated weights for policy 0, policy_version 271954 (0.0037) [2024-06-19 03:47:28,380][26367] Fps is (10 sec: 37682.5, 60 sec: 41506.1, 300 sec: 41876.4). Total num frames: 4455825408. Throughput: 0: 42110.6. Samples: 723505720. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:47:28,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 03:47:28,381][26579] Signal inference workers to stop experience collection... (10750 times) [2024-06-19 03:47:28,383][26579] Signal inference workers to resume experience collection... (10750 times) [2024-06-19 03:47:28,423][26599] InferenceWorker_p0-w0: stopping experience collection (10750 times) [2024-06-19 03:47:28,424][26599] InferenceWorker_p0-w0: resuming experience collection (10750 times) [2024-06-19 03:47:28,520][26599] Updated weights for policy 0, policy_version 271964 (0.0038) [2024-06-19 03:47:32,276][26599] Updated weights for policy 0, policy_version 271974 (0.0041) [2024-06-19 03:47:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 4456071168. Throughput: 0: 42132.9. Samples: 723633180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 03:47:33,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 03:47:35,988][26599] Updated weights for policy 0, policy_version 271984 (0.0030) [2024-06-19 03:47:38,380][26367] Fps is (10 sec: 45876.2, 60 sec: 42328.0, 300 sec: 42043.0). Total num frames: 4456284160. Throughput: 0: 42307.1. 
Samples: 723894260. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:47:38,380][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 03:47:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000271990_4456284160.pth... [2024-06-19 03:47:38,463][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000271375_4446208000.pth [2024-06-19 03:47:39,848][26599] Updated weights for policy 0, policy_version 271994 (0.0029) [2024-06-19 03:47:43,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42043.5). Total num frames: 4456480768. Throughput: 0: 42336.5. Samples: 724150560. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:47:43,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 03:47:43,697][26599] Updated weights for policy 0, policy_version 272004 (0.0037) [2024-06-19 03:47:47,446][26599] Updated weights for policy 0, policy_version 272014 (0.0037) [2024-06-19 03:47:48,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4456693760. Throughput: 0: 42249.7. Samples: 724271800. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:47:48,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 03:47:51,419][26599] Updated weights for policy 0, policy_version 272024 (0.0031) [2024-06-19 03:47:53,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42327.8, 300 sec: 42043.0). Total num frames: 4456923136. Throughput: 0: 42240.3. Samples: 724531160. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:47:53,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 03:47:55,712][26599] Updated weights for policy 0, policy_version 272034 (0.0041) [2024-06-19 03:47:58,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4457119744. Throughput: 0: 42356.6. Samples: 724780340. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:47:58,380][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 03:47:59,600][26599] Updated weights for policy 0, policy_version 272044 (0.0031) [2024-06-19 03:48:03,307][26599] Updated weights for policy 0, policy_version 272054 (0.0051) [2024-06-19 03:48:03,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4457332736. Throughput: 0: 42366.2. Samples: 724905240. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:48:03,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 03:48:07,171][26599] Updated weights for policy 0, policy_version 272064 (0.0038) [2024-06-19 03:48:08,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 4457545728. Throughput: 0: 42420.0. Samples: 725162820. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:48:08,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 03:48:10,804][26599] Updated weights for policy 0, policy_version 272074 (0.0033) [2024-06-19 03:48:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42330.7, 300 sec: 42098.5). Total num frames: 4457758720. Throughput: 0: 42338.8. Samples: 725410960. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:48:13,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 03:48:14,883][26599] Updated weights for policy 0, policy_version 272084 (0.0041) [2024-06-19 03:48:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4457971712. Throughput: 0: 42461.8. Samples: 725543960. 
Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:48:18,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 03:48:18,392][26599] Updated weights for policy 0, policy_version 272094 (0.0040) [2024-06-19 03:48:22,944][26599] Updated weights for policy 0, policy_version 272104 (0.0034) [2024-06-19 03:48:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4458168320. Throughput: 0: 42284.3. Samples: 725797060. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:48:23,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 03:48:26,159][26599] Updated weights for policy 0, policy_version 272114 (0.0046) [2024-06-19 03:48:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 4458397696. Throughput: 0: 42123.9. Samples: 726046140. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:48:28,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 03:48:30,435][26599] Updated weights for policy 0, policy_version 272124 (0.0029) [2024-06-19 03:48:33,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 4458627072. Throughput: 0: 42391.6. Samples: 726179420. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:48:33,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 03:48:33,769][26599] Updated weights for policy 0, policy_version 272134 (0.0033) [2024-06-19 03:48:34,863][26579] Signal inference workers to stop experience collection... (10800 times) [2024-06-19 03:48:34,883][26599] InferenceWorker_p0-w0: stopping experience collection (10800 times) [2024-06-19 03:48:34,920][26579] Signal inference workers to resume experience collection... 
(10800 times) [2024-06-19 03:48:34,921][26599] InferenceWorker_p0-w0: resuming experience collection (10800 times) [2024-06-19 03:48:37,957][26599] Updated weights for policy 0, policy_version 272144 (0.0027) [2024-06-19 03:48:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42099.1). Total num frames: 4458807296. Throughput: 0: 42287.2. Samples: 726434080. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:48:38,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 03:48:41,602][26599] Updated weights for policy 0, policy_version 272154 (0.0040) [2024-06-19 03:48:43,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42209.7). Total num frames: 4459036672. Throughput: 0: 42264.0. Samples: 726682220. Policy #0 lag: (min: 0.0, avg: 12.7, max: 22.0) [2024-06-19 03:48:43,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 03:48:45,845][26599] Updated weights for policy 0, policy_version 272164 (0.0037) [2024-06-19 03:48:48,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4459233280. Throughput: 0: 42336.7. Samples: 726810400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:48:48,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 03:48:49,450][26599] Updated weights for policy 0, policy_version 272174 (0.0040) [2024-06-19 03:48:53,384][26367] Fps is (10 sec: 39307.1, 60 sec: 41776.7, 300 sec: 42042.5). Total num frames: 4459429888. Throughput: 0: 42239.3. Samples: 727063740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:48:53,384][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 03:48:53,619][26599] Updated weights for policy 0, policy_version 272184 (0.0042) [2024-06-19 03:48:57,145][26599] Updated weights for policy 0, policy_version 272194 (0.0044) [2024-06-19 03:48:58,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 4459675648. Throughput: 0: 42326.6. Samples: 727315660. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:48:58,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 03:49:01,335][26599] Updated weights for policy 0, policy_version 272204 (0.0029) [2024-06-19 03:49:03,380][26367] Fps is (10 sec: 44252.7, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4459872256. Throughput: 0: 42294.7. Samples: 727447220. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:03,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 03:49:04,778][26599] Updated weights for policy 0, policy_version 272214 (0.0044) [2024-06-19 03:49:08,380][26367] Fps is (10 sec: 37683.7, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4460052480. Throughput: 0: 42185.8. Samples: 727695420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:08,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 03:49:09,137][26599] Updated weights for policy 0, policy_version 272224 (0.0037) [2024-06-19 03:49:12,830][26599] Updated weights for policy 0, policy_version 272234 (0.0047) [2024-06-19 03:49:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4460298240. Throughput: 0: 42303.1. Samples: 727949780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:13,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 03:49:16,810][26599] Updated weights for policy 0, policy_version 272244 (0.0042) [2024-06-19 03:49:18,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4460511232. Throughput: 0: 42318.4. Samples: 728083740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:18,380][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 03:49:20,348][26599] Updated weights for policy 0, policy_version 272254 (0.0040) [2024-06-19 03:49:23,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4460707840. Throughput: 0: 42125.3. Samples: 728329720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:23,381][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 03:49:24,574][26599] Updated weights for policy 0, policy_version 272264 (0.0047) [2024-06-19 03:49:28,214][26599] Updated weights for policy 0, policy_version 272274 (0.0043) [2024-06-19 03:49:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4460937216. Throughput: 0: 42295.6. Samples: 728585520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:28,380][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 03:49:32,178][26599] Updated weights for policy 0, policy_version 272284 (0.0052) [2024-06-19 03:49:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4461133824. Throughput: 0: 42323.2. Samples: 728714940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:33,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 03:49:35,746][26599] Updated weights for policy 0, policy_version 272294 (0.0048) [2024-06-19 03:49:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4461363200. Throughput: 0: 42267.8. Samples: 728965640. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:38,381][26367] Avg episode reward: [(0, '0.370')] [2024-06-19 03:49:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000272300_4461363200.pth... [2024-06-19 03:49:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000271680_4451205120.pth [2024-06-19 03:49:39,836][26599] Updated weights for policy 0, policy_version 272304 (0.0032) [2024-06-19 03:49:43,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42154.6). Total num frames: 4461576192. Throughput: 0: 42357.0. Samples: 729221720. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:43,381][26367] Avg episode reward: [(0, '0.376')] [2024-06-19 03:49:43,440][26599] Updated weights for policy 0, policy_version 272314 (0.0033) [2024-06-19 03:49:46,070][26579] Signal inference workers to stop experience collection... (10850 times) [2024-06-19 03:49:46,070][26579] Signal inference workers to resume experience collection... (10850 times) [2024-06-19 03:49:46,086][26599] InferenceWorker_p0-w0: stopping experience collection (10850 times) [2024-06-19 03:49:46,101][26599] InferenceWorker_p0-w0: resuming experience collection (10850 times) [2024-06-19 03:49:47,555][26599] Updated weights for policy 0, policy_version 272324 (0.0030) [2024-06-19 03:49:48,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4461756416. Throughput: 0: 42236.9. Samples: 729347880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:48,381][26367] Avg episode reward: [(0, '0.281')] [2024-06-19 03:49:51,508][26599] Updated weights for policy 0, policy_version 272334 (0.0042) [2024-06-19 03:49:53,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42327.9, 300 sec: 42209.6). Total num frames: 4461969408. Throughput: 0: 42176.9. Samples: 729593380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:53,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 03:49:55,639][26599] Updated weights for policy 0, policy_version 272344 (0.0035) [2024-06-19 03:49:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 4462182400. Throughput: 0: 42193.8. Samples: 729848500. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 22.0) [2024-06-19 03:49:58,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 03:49:59,206][26599] Updated weights for policy 0, policy_version 272354 (0.0027) [2024-06-19 03:50:03,084][26599] Updated weights for policy 0, policy_version 272364 (0.0040) [2024-06-19 03:50:03,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4462411776. Throughput: 0: 42087.8. Samples: 729977700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:03,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 03:50:07,155][26599] Updated weights for policy 0, policy_version 272374 (0.0033) [2024-06-19 03:50:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42209.7). Total num frames: 4462608384. Throughput: 0: 42271.7. Samples: 730231940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:08,380][26367] Avg episode reward: [(0, '0.393')] [2024-06-19 03:50:10,790][26599] Updated weights for policy 0, policy_version 272384 (0.0033) [2024-06-19 03:50:13,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4462821376. Throughput: 0: 42151.9. Samples: 730482360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:13,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 03:50:14,997][26599] Updated weights for policy 0, policy_version 272394 (0.0044) [2024-06-19 03:50:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4463050752. Throughput: 0: 42097.0. Samples: 730609300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:18,380][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 03:50:18,451][26599] Updated weights for policy 0, policy_version 272404 (0.0028) [2024-06-19 03:50:22,604][26599] Updated weights for policy 0, policy_version 272414 (0.0043) [2024-06-19 03:50:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42209.6). 
Total num frames: 4463247360. Throughput: 0: 42176.4. Samples: 730863580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:23,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 03:50:26,672][26599] Updated weights for policy 0, policy_version 272424 (0.0038) [2024-06-19 03:50:28,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42210.1). Total num frames: 4463460352. Throughput: 0: 42091.0. Samples: 731115820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:28,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 03:50:30,267][26599] Updated weights for policy 0, policy_version 272434 (0.0033) [2024-06-19 03:50:33,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 4463656960. Throughput: 0: 42089.9. Samples: 731241920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:33,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 03:50:34,313][26599] Updated weights for policy 0, policy_version 272444 (0.0040) [2024-06-19 03:50:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4463869952. Throughput: 0: 42262.6. Samples: 731495200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:38,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 03:50:38,439][26599] Updated weights for policy 0, policy_version 272454 (0.0038) [2024-06-19 03:50:42,430][26599] Updated weights for policy 0, policy_version 272464 (0.0023) [2024-06-19 03:50:43,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42265.7). Total num frames: 4464082944. Throughput: 0: 42058.2. Samples: 731741120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:43,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 03:50:46,119][26599] Updated weights for policy 0, policy_version 272474 (0.0041) [2024-06-19 03:50:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42154.1). 
Total num frames: 4464295936. Throughput: 0: 42047.2. Samples: 731869820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:48,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 03:50:50,213][26599] Updated weights for policy 0, policy_version 272484 (0.0035) [2024-06-19 03:50:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4464508928. Throughput: 0: 41953.2. Samples: 732119840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:53,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 03:50:53,939][26599] Updated weights for policy 0, policy_version 272494 (0.0033) [2024-06-19 03:50:57,980][26599] Updated weights for policy 0, policy_version 272504 (0.0042) [2024-06-19 03:50:58,384][26367] Fps is (10 sec: 42583.2, 60 sec: 42322.8, 300 sec: 42264.6). Total num frames: 4464721920. Throughput: 0: 41933.5. Samples: 732369520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:50:58,384][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 03:51:01,539][26599] Updated weights for policy 0, policy_version 272514 (0.0043) [2024-06-19 03:51:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 4464918528. Throughput: 0: 41863.0. Samples: 732493140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:51:03,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 03:51:05,738][26599] Updated weights for policy 0, policy_version 272524 (0.0047) [2024-06-19 03:51:08,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 4465131520. Throughput: 0: 41909.4. Samples: 732749500. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 03:51:08,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 03:51:09,131][26599] Updated weights for policy 0, policy_version 272534 (0.0032) [2024-06-19 03:51:13,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42209.6). 
Total num frames: 4465328128. Throughput: 0: 41960.6. Samples: 733004040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:13,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 03:51:13,580][26599] Updated weights for policy 0, policy_version 272544 (0.0042) [2024-06-19 03:51:17,305][26599] Updated weights for policy 0, policy_version 272554 (0.0031) [2024-06-19 03:51:18,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4465573888. Throughput: 0: 41972.8. Samples: 733130700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:18,381][26367] Avg episode reward: [(0, '0.314')] [2024-06-19 03:51:21,516][26599] Updated weights for policy 0, policy_version 272564 (0.0030) [2024-06-19 03:51:23,384][26367] Fps is (10 sec: 42582.3, 60 sec: 41776.7, 300 sec: 42098.0). Total num frames: 4465754112. Throughput: 0: 41852.6. Samples: 733378720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:23,385][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 03:51:25,000][26599] Updated weights for policy 0, policy_version 272574 (0.0023) [2024-06-19 03:51:28,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4465967104. Throughput: 0: 41995.1. Samples: 733630900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:28,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 03:51:29,241][26599] Updated weights for policy 0, policy_version 272584 (0.0042) [2024-06-19 03:51:29,245][26579] Signal inference workers to stop experience collection... (10900 times) [2024-06-19 03:51:29,245][26579] Signal inference workers to resume experience collection... 
(10900 times) [2024-06-19 03:51:29,284][26599] InferenceWorker_p0-w0: stopping experience collection (10900 times) [2024-06-19 03:51:29,285][26599] InferenceWorker_p0-w0: resuming experience collection (10900 times) [2024-06-19 03:51:32,723][26599] Updated weights for policy 0, policy_version 272594 (0.0031) [2024-06-19 03:51:33,380][26367] Fps is (10 sec: 44252.9, 60 sec: 42325.2, 300 sec: 42210.1). Total num frames: 4466196480. Throughput: 0: 42027.5. Samples: 733761060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:33,384][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 03:51:36,882][26599] Updated weights for policy 0, policy_version 272604 (0.0038) [2024-06-19 03:51:38,383][26367] Fps is (10 sec: 42586.8, 60 sec: 42050.4, 300 sec: 42209.2). Total num frames: 4466393088. Throughput: 0: 42080.6. Samples: 734013580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:38,383][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 03:51:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000272607_4466393088.pth... [2024-06-19 03:51:38,447][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000271990_4456284160.pth [2024-06-19 03:51:40,371][26599] Updated weights for policy 0, policy_version 272614 (0.0027) [2024-06-19 03:51:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4466606080. Throughput: 0: 42153.1. Samples: 734266260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:43,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 03:51:44,599][26599] Updated weights for policy 0, policy_version 272624 (0.0035) [2024-06-19 03:51:48,139][26599] Updated weights for policy 0, policy_version 272634 (0.0033) [2024-06-19 03:51:48,380][26367] Fps is (10 sec: 44248.6, 60 sec: 42325.3, 300 sec: 42210.1). Total num frames: 4466835456. Throughput: 0: 42169.3. Samples: 734390760. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:48,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 03:51:52,403][26599] Updated weights for policy 0, policy_version 272644 (0.0032) [2024-06-19 03:51:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4467032064. Throughput: 0: 42098.2. Samples: 734643920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:53,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 03:51:56,114][26599] Updated weights for policy 0, policy_version 272654 (0.0039) [2024-06-19 03:51:58,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41781.6, 300 sec: 42098.5). Total num frames: 4467228672. Throughput: 0: 41934.5. Samples: 734891100. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:51:58,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 03:52:00,621][26599] Updated weights for policy 0, policy_version 272664 (0.0039) [2024-06-19 03:52:03,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 4467458048. Throughput: 0: 41881.0. Samples: 735015340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:52:03,380][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 03:52:04,047][26599] Updated weights for policy 0, policy_version 272674 (0.0034) [2024-06-19 03:52:08,371][26599] Updated weights for policy 0, policy_version 272684 (0.0042) [2024-06-19 03:52:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42155.2). Total num frames: 4467654656. Throughput: 0: 41999.0. Samples: 735268520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:52:08,381][26367] Avg episode reward: [(0, '0.389')] [2024-06-19 03:52:11,910][26599] Updated weights for policy 0, policy_version 272694 (0.0034) [2024-06-19 03:52:13,382][26367] Fps is (10 sec: 39314.1, 60 sec: 42050.9, 300 sec: 42042.7). Total num frames: 4467851264. Throughput: 0: 41842.8. Samples: 735513900. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:52:13,383][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 03:52:16,267][26599] Updated weights for policy 0, policy_version 272704 (0.0028) [2024-06-19 03:52:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4468080640. Throughput: 0: 41817.7. Samples: 735642860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 20.0) [2024-06-19 03:52:18,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 03:52:19,753][26599] Updated weights for policy 0, policy_version 272714 (0.0037) [2024-06-19 03:52:23,380][26367] Fps is (10 sec: 40967.4, 60 sec: 41781.8, 300 sec: 42154.1). Total num frames: 4468260864. Throughput: 0: 41859.9. Samples: 735897160. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:52:23,381][26367] Avg episode reward: [(0, '0.801')] [2024-06-19 03:52:23,826][26599] Updated weights for policy 0, policy_version 272724 (0.0040) [2024-06-19 03:52:27,471][26599] Updated weights for policy 0, policy_version 272734 (0.0027) [2024-06-19 03:52:28,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42049.7, 300 sec: 42098.0). Total num frames: 4468490240. Throughput: 0: 41734.4. Samples: 736144460. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:52:28,385][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 03:52:32,090][26599] Updated weights for policy 0, policy_version 272744 (0.0047) [2024-06-19 03:52:33,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4468719616. Throughput: 0: 41910.8. Samples: 736276740. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:52:33,380][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 03:52:35,168][26599] Updated weights for policy 0, policy_version 272754 (0.0038) [2024-06-19 03:52:38,380][26367] Fps is (10 sec: 40975.5, 60 sec: 41781.2, 300 sec: 42098.6). Total num frames: 4468899840. Throughput: 0: 41844.1. Samples: 736526900. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:52:38,380][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 03:52:39,864][26599] Updated weights for policy 0, policy_version 272764 (0.0045) [2024-06-19 03:52:43,112][26599] Updated weights for policy 0, policy_version 272774 (0.0032) [2024-06-19 03:52:43,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4469129216. Throughput: 0: 41761.8. Samples: 736770380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:52:43,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 03:52:47,524][26599] Updated weights for policy 0, policy_version 272784 (0.0029) [2024-06-19 03:52:48,380][26367] Fps is (10 sec: 44236.4, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4469342208. Throughput: 0: 41947.9. Samples: 736903000. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:52:48,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 03:52:50,581][26599] Updated weights for policy 0, policy_version 272794 (0.0032) [2024-06-19 03:52:53,380][26367] Fps is (10 sec: 42599.6, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4469555200. Throughput: 0: 42106.3. Samples: 737163300. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:52:53,380][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 03:52:55,358][26599] Updated weights for policy 0, policy_version 272804 (0.0039) [2024-06-19 03:52:58,164][26599] Updated weights for policy 0, policy_version 272814 (0.0029) [2024-06-19 03:52:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4469784576. Throughput: 0: 42130.1. Samples: 737409680. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:52:58,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 03:53:00,884][26579] Signal inference workers to stop experience collection... 
(10950 times) [2024-06-19 03:53:00,919][26599] InferenceWorker_p0-w0: stopping experience collection (10950 times) [2024-06-19 03:53:00,940][26579] Signal inference workers to resume experience collection... (10950 times) [2024-06-19 03:53:00,941][26599] InferenceWorker_p0-w0: resuming experience collection (10950 times) [2024-06-19 03:53:03,087][26599] Updated weights for policy 0, policy_version 272824 (0.0038) [2024-06-19 03:53:03,380][26367] Fps is (10 sec: 39321.0, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4469948416. Throughput: 0: 42105.9. Samples: 737537620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:53:03,381][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 03:53:06,249][26599] Updated weights for policy 0, policy_version 272834 (0.0028) [2024-06-19 03:53:08,380][26367] Fps is (10 sec: 37683.0, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4470161408. Throughput: 0: 42047.5. Samples: 737789300. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:53:08,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 03:53:10,821][26599] Updated weights for policy 0, policy_version 272844 (0.0038) [2024-06-19 03:53:13,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42326.7, 300 sec: 42098.6). Total num frames: 4470390784. Throughput: 0: 42090.2. Samples: 738038360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:53:13,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 03:53:14,143][26599] Updated weights for policy 0, policy_version 272854 (0.0032) [2024-06-19 03:53:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4470571008. Throughput: 0: 42003.1. Samples: 738166880. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:53:18,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 03:53:18,587][26599] Updated weights for policy 0, policy_version 272864 (0.0033) [2024-06-19 03:53:22,023][26599] Updated weights for policy 0, policy_version 272874 (0.0043) [2024-06-19 03:53:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4470800384. Throughput: 0: 41981.2. Samples: 738416060. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:53:23,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 03:53:26,370][26599] Updated weights for policy 0, policy_version 272884 (0.0040) [2024-06-19 03:53:28,380][26367] Fps is (10 sec: 45874.3, 60 sec: 42327.8, 300 sec: 42043.0). Total num frames: 4471029760. Throughput: 0: 42177.7. Samples: 738668380. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:53:28,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 03:53:29,862][26599] Updated weights for policy 0, policy_version 272894 (0.0037) [2024-06-19 03:53:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.0, 300 sec: 42043.0). Total num frames: 4471209984. Throughput: 0: 41947.5. Samples: 738790640. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 03:53:33,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 03:53:34,351][26599] Updated weights for policy 0, policy_version 272904 (0.0041) [2024-06-19 03:53:37,789][26599] Updated weights for policy 0, policy_version 272914 (0.0036) [2024-06-19 03:53:38,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4471439360. Throughput: 0: 41694.1. Samples: 739039540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:53:38,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 03:53:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000272915_4471439360.pth... 
[2024-06-19 03:53:38,444][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000272300_4461363200.pth [2024-06-19 03:53:41,987][26599] Updated weights for policy 0, policy_version 272924 (0.0024) [2024-06-19 03:53:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4471635968. Throughput: 0: 41976.0. Samples: 739298600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:53:43,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 03:53:45,662][26599] Updated weights for policy 0, policy_version 272934 (0.0040) [2024-06-19 03:53:48,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41506.1, 300 sec: 42043.5). Total num frames: 4471832576. Throughput: 0: 41914.6. Samples: 739423780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:53:48,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 03:53:49,561][26599] Updated weights for policy 0, policy_version 272944 (0.0029) [2024-06-19 03:53:53,384][26367] Fps is (10 sec: 42583.1, 60 sec: 41776.6, 300 sec: 41987.0). Total num frames: 4472061952. Throughput: 0: 41915.9. Samples: 739675660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:53:53,384][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 03:53:53,561][26599] Updated weights for policy 0, policy_version 272954 (0.0032) [2024-06-19 03:53:57,415][26599] Updated weights for policy 0, policy_version 272964 (0.0039) [2024-06-19 03:53:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4472274944. Throughput: 0: 42148.8. Samples: 739935060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:53:58,380][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 03:54:01,137][26599] Updated weights for policy 0, policy_version 272974 (0.0046) [2024-06-19 03:54:03,380][26367] Fps is (10 sec: 42613.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4472487936. Throughput: 0: 42086.6. Samples: 740060780. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:54:03,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 03:54:05,126][26599] Updated weights for policy 0, policy_version 272984 (0.0037) [2024-06-19 03:54:08,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4472700928. Throughput: 0: 42199.4. Samples: 740315040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:54:08,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 03:54:09,040][26599] Updated weights for policy 0, policy_version 272994 (0.0041) [2024-06-19 03:54:12,721][26599] Updated weights for policy 0, policy_version 273004 (0.0038) [2024-06-19 03:54:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 41987.5). Total num frames: 4472897536. Throughput: 0: 42125.5. Samples: 740564020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:54:13,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 03:54:16,722][26599] Updated weights for policy 0, policy_version 273014 (0.0030) [2024-06-19 03:54:18,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4473126912. Throughput: 0: 42306.8. Samples: 740694440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:54:18,380][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 03:54:20,381][26599] Updated weights for policy 0, policy_version 273024 (0.0033) [2024-06-19 03:54:23,379][26579] Signal inference workers to stop experience collection... (11000 times) [2024-06-19 03:54:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 41931.9). Total num frames: 4473307136. Throughput: 0: 42376.5. Samples: 740946480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:54:23,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 03:54:23,385][26579] Signal inference workers to resume experience collection... 
(11000 times) [2024-06-19 03:54:23,402][26599] InferenceWorker_p0-w0: stopping experience collection (11000 times) [2024-06-19 03:54:23,434][26599] InferenceWorker_p0-w0: resuming experience collection (11000 times) [2024-06-19 03:54:24,666][26599] Updated weights for policy 0, policy_version 273034 (0.0030) [2024-06-19 03:54:28,034][26599] Updated weights for policy 0, policy_version 273044 (0.0032) [2024-06-19 03:54:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 4473552896. Throughput: 0: 42107.5. Samples: 741193440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:54:28,381][26367] Avg episode reward: [(0, '0.800')] [2024-06-19 03:54:32,425][26599] Updated weights for policy 0, policy_version 273054 (0.0031) [2024-06-19 03:54:33,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 41987.5). Total num frames: 4473749504. Throughput: 0: 42281.4. Samples: 741326440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:54:33,380][26367] Avg episode reward: [(0, '0.818')] [2024-06-19 03:54:35,594][26599] Updated weights for policy 0, policy_version 273064 (0.0036) [2024-06-19 03:54:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4473962496. Throughput: 0: 42350.4. Samples: 741581280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:54:38,381][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 03:54:40,410][26599] Updated weights for policy 0, policy_version 273074 (0.0041) [2024-06-19 03:54:43,275][26599] Updated weights for policy 0, policy_version 273084 (0.0028) [2024-06-19 03:54:43,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 4474208256. Throughput: 0: 41978.1. Samples: 741824080. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 03:54:43,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 03:54:48,211][26599] Updated weights for policy 0, policy_version 273094 (0.0044) [2024-06-19 03:54:48,382][26367] Fps is (10 sec: 40954.3, 60 sec: 42324.3, 300 sec: 42042.8). Total num frames: 4474372096. Throughput: 0: 42203.1. Samples: 741959980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:54:48,382][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 03:54:51,159][26599] Updated weights for policy 0, policy_version 273104 (0.0034) [2024-06-19 03:54:53,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42054.7, 300 sec: 42043.0). Total num frames: 4474585088. Throughput: 0: 42177.4. Samples: 742213020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:54:53,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 03:54:55,660][26599] Updated weights for policy 0, policy_version 273114 (0.0033) [2024-06-19 03:54:58,380][26367] Fps is (10 sec: 45881.5, 60 sec: 42598.3, 300 sec: 42098.6). Total num frames: 4474830848. Throughput: 0: 42165.7. Samples: 742461480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:54:58,381][26367] Avg episode reward: [(0, '0.790')] [2024-06-19 03:54:59,323][26599] Updated weights for policy 0, policy_version 273124 (0.0025) [2024-06-19 03:55:03,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 4475011072. Throughput: 0: 42246.2. Samples: 742595520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:03,380][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 03:55:03,454][26599] Updated weights for policy 0, policy_version 273134 (0.0042) [2024-06-19 03:55:06,829][26599] Updated weights for policy 0, policy_version 273144 (0.0041) [2024-06-19 03:55:08,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 4475240448. Throughput: 0: 42200.9. Samples: 742845520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:08,380][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 03:55:11,307][26599] Updated weights for policy 0, policy_version 273154 (0.0025) [2024-06-19 03:55:13,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42098.5). Total num frames: 4475469824. Throughput: 0: 42310.7. Samples: 743097420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:13,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 03:55:14,453][26599] Updated weights for policy 0, policy_version 273164 (0.0027) [2024-06-19 03:55:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4475650048. Throughput: 0: 42289.3. Samples: 743229460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:18,380][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 03:55:18,879][26599] Updated weights for policy 0, policy_version 273174 (0.0034) [2024-06-19 03:55:22,230][26599] Updated weights for policy 0, policy_version 273184 (0.0043) [2024-06-19 03:55:23,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 4475863040. Throughput: 0: 42199.8. Samples: 743480260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:23,380][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 03:55:26,587][26599] Updated weights for policy 0, policy_version 273194 (0.0027) [2024-06-19 03:55:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 42043.0). Total num frames: 4476059648. Throughput: 0: 42437.1. Samples: 743733740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:28,380][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 03:55:30,088][26599] Updated weights for policy 0, policy_version 273204 (0.0054) [2024-06-19 03:55:33,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 4476272640. Throughput: 0: 42115.1. Samples: 743855100. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:33,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 03:55:34,359][26599] Updated weights for policy 0, policy_version 273214 (0.0033) [2024-06-19 03:55:38,087][26599] Updated weights for policy 0, policy_version 273224 (0.0029) [2024-06-19 03:55:38,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42098.6). Total num frames: 4476502016. Throughput: 0: 42094.8. Samples: 744107280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:38,380][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 03:55:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000273225_4476518400.pth... [2024-06-19 03:55:38,456][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000272607_4466393088.pth [2024-06-19 03:55:41,390][26579] Signal inference workers to stop experience collection... (11050 times) [2024-06-19 03:55:41,390][26579] Signal inference workers to resume experience collection... (11050 times) [2024-06-19 03:55:41,411][26599] InferenceWorker_p0-w0: stopping experience collection (11050 times) [2024-06-19 03:55:41,412][26599] InferenceWorker_p0-w0: resuming experience collection (11050 times) [2024-06-19 03:55:41,889][26599] Updated weights for policy 0, policy_version 273234 (0.0030) [2024-06-19 03:55:43,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41233.2, 300 sec: 41987.5). Total num frames: 4476682240. Throughput: 0: 42231.3. Samples: 744361880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:43,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 03:55:45,994][26599] Updated weights for policy 0, policy_version 273244 (0.0047) [2024-06-19 03:55:48,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42053.4, 300 sec: 41987.5). Total num frames: 4476895232. Throughput: 0: 41930.6. Samples: 744482400. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:48,380][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 03:55:49,553][26599] Updated weights for policy 0, policy_version 273254 (0.0038) [2024-06-19 03:55:53,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42099.1). Total num frames: 4477140992. Throughput: 0: 41986.1. Samples: 744734900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 03:55:53,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 03:55:53,575][26599] Updated weights for policy 0, policy_version 273264 (0.0041) [2024-06-19 03:55:57,941][26599] Updated weights for policy 0, policy_version 273274 (0.0043) [2024-06-19 03:55:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41506.2, 300 sec: 42043.0). Total num frames: 4477321216. Throughput: 0: 42007.9. Samples: 744987780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:55:58,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 03:56:01,331][26599] Updated weights for policy 0, policy_version 273284 (0.0036) [2024-06-19 03:56:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4477534208. Throughput: 0: 41769.6. Samples: 745109100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:03,384][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 03:56:05,751][26599] Updated weights for policy 0, policy_version 273294 (0.0032) [2024-06-19 03:56:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 4477747200. Throughput: 0: 41922.6. Samples: 745366780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:08,381][26367] Avg episode reward: [(0, '0.327')] [2024-06-19 03:56:09,145][26599] Updated weights for policy 0, policy_version 273304 (0.0040) [2024-06-19 03:56:13,382][26367] Fps is (10 sec: 40954.2, 60 sec: 41232.1, 300 sec: 41931.7). Total num frames: 4477943808. Throughput: 0: 41911.4. Samples: 745619820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:13,382][26367] Avg episode reward: [(0, '0.375')] [2024-06-19 03:56:13,715][26599] Updated weights for policy 0, policy_version 273314 (0.0044) [2024-06-19 03:56:16,930][26599] Updated weights for policy 0, policy_version 273324 (0.0031) [2024-06-19 03:56:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42099.1). Total num frames: 4478173184. Throughput: 0: 41971.7. Samples: 745743820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:18,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 03:56:21,549][26599] Updated weights for policy 0, policy_version 273334 (0.0037) [2024-06-19 03:56:23,384][26367] Fps is (10 sec: 44226.4, 60 sec: 42049.5, 300 sec: 42098.0). Total num frames: 4478386176. Throughput: 0: 42161.2. Samples: 746004700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:23,385][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 03:56:24,561][26599] Updated weights for policy 0, policy_version 273344 (0.0025) [2024-06-19 03:56:28,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4478582784. Throughput: 0: 42068.3. Samples: 746254960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:28,381][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 03:56:29,708][26599] Updated weights for policy 0, policy_version 273354 (0.0039) [2024-06-19 03:56:32,371][26599] Updated weights for policy 0, policy_version 273364 (0.0043) [2024-06-19 03:56:33,380][26367] Fps is (10 sec: 42615.0, 60 sec: 42325.5, 300 sec: 42098.9). Total num frames: 4478812160. Throughput: 0: 42051.1. Samples: 746374700. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:33,381][26367] Avg episode reward: [(0, '0.232')] [2024-06-19 03:56:37,415][26599] Updated weights for policy 0, policy_version 273374 (0.0042) [2024-06-19 03:56:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4479008768. Throughput: 0: 42228.4. Samples: 746635180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:38,381][26367] Avg episode reward: [(0, '0.316')] [2024-06-19 03:56:40,269][26599] Updated weights for policy 0, policy_version 273384 (0.0039) [2024-06-19 03:56:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4479221760. Throughput: 0: 41971.7. Samples: 746876500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:43,381][26367] Avg episode reward: [(0, '0.825')] [2024-06-19 03:56:45,203][26599] Updated weights for policy 0, policy_version 273394 (0.0032) [2024-06-19 03:56:48,377][26599] Updated weights for policy 0, policy_version 273404 (0.0034) [2024-06-19 03:56:48,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4479451136. Throughput: 0: 42203.2. Samples: 747008240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:48,380][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 03:56:52,737][26599] Updated weights for policy 0, policy_version 273414 (0.0040) [2024-06-19 03:56:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4479631360. Throughput: 0: 42102.2. Samples: 747261380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:53,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 03:56:56,023][26599] Updated weights for policy 0, policy_version 273424 (0.0036) [2024-06-19 03:56:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4479860736. Throughput: 0: 42117.3. Samples: 747515040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:56:58,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 03:57:00,307][26599] Updated weights for policy 0, policy_version 273434 (0.0037) [2024-06-19 03:57:03,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 4480090112. Throughput: 0: 42191.6. Samples: 747642440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:57:03,380][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 03:57:03,695][26599] Updated weights for policy 0, policy_version 273444 (0.0033) [2024-06-19 03:57:07,819][26579] Signal inference workers to stop experience collection... (11100 times) [2024-06-19 03:57:07,819][26579] Signal inference workers to resume experience collection... (11100 times) [2024-06-19 03:57:07,838][26599] InferenceWorker_p0-w0: stopping experience collection (11100 times) [2024-06-19 03:57:07,871][26599] InferenceWorker_p0-w0: resuming experience collection (11100 times) [2024-06-19 03:57:07,968][26599] Updated weights for policy 0, policy_version 273454 (0.0036) [2024-06-19 03:57:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42098.8). Total num frames: 4480270336. Throughput: 0: 41932.1. Samples: 747891480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 23.0) [2024-06-19 03:57:08,380][26367] Avg episode reward: [(0, '0.402')] [2024-06-19 03:57:11,365][26599] Updated weights for policy 0, policy_version 273464 (0.0043) [2024-06-19 03:57:13,380][26367] Fps is (10 sec: 39320.9, 60 sec: 42326.3, 300 sec: 42043.0). Total num frames: 4480483328. Throughput: 0: 41945.7. Samples: 748142520. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:13,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 03:57:15,968][26599] Updated weights for policy 0, policy_version 273474 (0.0036) [2024-06-19 03:57:18,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4480729088. Throughput: 0: 42150.6. 
Samples: 748271480. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:18,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 03:57:18,976][26599] Updated weights for policy 0, policy_version 273484 (0.0045) [2024-06-19 03:57:23,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42054.9, 300 sec: 42099.1). Total num frames: 4480909312. Throughput: 0: 41933.8. Samples: 748522200. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:23,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 03:57:23,642][26599] Updated weights for policy 0, policy_version 273494 (0.0035) [2024-06-19 03:57:27,342][26599] Updated weights for policy 0, policy_version 273504 (0.0034) [2024-06-19 03:57:28,380][26367] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4481105920. Throughput: 0: 42133.8. Samples: 748772520. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:28,381][26367] Avg episode reward: [(0, '0.330')] [2024-06-19 03:57:31,255][26599] Updated weights for policy 0, policy_version 273514 (0.0028) [2024-06-19 03:57:33,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4481335296. Throughput: 0: 42005.3. Samples: 748898480. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:33,380][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 03:57:35,061][26599] Updated weights for policy 0, policy_version 273524 (0.0032) [2024-06-19 03:57:38,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4481548288. Throughput: 0: 42117.2. Samples: 749156660. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:38,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 03:57:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000273532_4481548288.pth... 
[2024-06-19 03:57:38,451][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000272915_4471439360.pth [2024-06-19 03:57:39,266][26599] Updated weights for policy 0, policy_version 273534 (0.0032) [2024-06-19 03:57:42,650][26599] Updated weights for policy 0, policy_version 273544 (0.0042) [2024-06-19 03:57:43,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4481761280. Throughput: 0: 41929.8. Samples: 749401880. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:43,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 03:57:47,534][26599] Updated weights for policy 0, policy_version 273554 (0.0036) [2024-06-19 03:57:48,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41506.1, 300 sec: 41987.4). Total num frames: 4481941504. Throughput: 0: 41909.3. Samples: 749528360. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:48,381][26367] Avg episode reward: [(0, '0.798')] [2024-06-19 03:57:50,835][26599] Updated weights for policy 0, policy_version 273564 (0.0051) [2024-06-19 03:57:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 41987.5). Total num frames: 4482170880. Throughput: 0: 42037.6. Samples: 749783180. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:53,381][26367] Avg episode reward: [(0, '0.838')] [2024-06-19 03:57:55,178][26599] Updated weights for policy 0, policy_version 273574 (0.0034) [2024-06-19 03:57:58,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4482383872. Throughput: 0: 42042.8. Samples: 750034440. 
Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:57:58,381][26367] Avg episode reward: [(0, '0.792')] [2024-06-19 03:57:58,496][26599] Updated weights for policy 0, policy_version 273584 (0.0039) [2024-06-19 03:58:02,713][26599] Updated weights for policy 0, policy_version 273594 (0.0044) [2024-06-19 03:58:03,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 4482580480. Throughput: 0: 42056.1. Samples: 750164000. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:58:03,380][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 03:58:06,013][26599] Updated weights for policy 0, policy_version 273604 (0.0035) [2024-06-19 03:58:08,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 41987.4). Total num frames: 4482777088. Throughput: 0: 42050.2. Samples: 750414460. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:58:08,389][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 03:58:10,443][26599] Updated weights for policy 0, policy_version 273614 (0.0029) [2024-06-19 03:58:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4483022848. Throughput: 0: 41861.3. Samples: 750656280. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:58:13,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 03:58:13,638][26599] Updated weights for policy 0, policy_version 273624 (0.0045) [2024-06-19 03:58:18,345][26599] Updated weights for policy 0, policy_version 273634 (0.0032) [2024-06-19 03:58:18,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 4483219456. Throughput: 0: 42153.3. Samples: 750795380. Policy #0 lag: (min: 1.0, avg: 9.1, max: 21.0) [2024-06-19 03:58:18,380][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 03:58:21,510][26599] Updated weights for policy 0, policy_version 273644 (0.0040) [2024-06-19 03:58:23,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 41987.5). 
Total num frames: 4483416064. Throughput: 0: 41883.8. Samples: 751041420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:58:23,380][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 03:58:26,245][26599] Updated weights for policy 0, policy_version 273654 (0.0033) [2024-06-19 03:58:28,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4483661824. Throughput: 0: 41918.6. Samples: 751288220. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:58:28,384][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 03:58:28,922][26579] Signal inference workers to stop experience collection... (11150 times) [2024-06-19 03:58:28,955][26599] InferenceWorker_p0-w0: stopping experience collection (11150 times) [2024-06-19 03:58:28,970][26579] Signal inference workers to resume experience collection... (11150 times) [2024-06-19 03:58:28,971][26599] InferenceWorker_p0-w0: resuming experience collection (11150 times) [2024-06-19 03:58:29,129][26599] Updated weights for policy 0, policy_version 273664 (0.0028) [2024-06-19 03:58:33,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4483825664. Throughput: 0: 42077.3. Samples: 751421840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:58:33,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 03:58:33,906][26599] Updated weights for policy 0, policy_version 273674 (0.0035) [2024-06-19 03:58:36,882][26599] Updated weights for policy 0, policy_version 273684 (0.0023) [2024-06-19 03:58:38,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41779.4, 300 sec: 42098.6). Total num frames: 4484055040. Throughput: 0: 41783.7. Samples: 751663440. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:58:38,380][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 03:58:41,780][26599] Updated weights for policy 0, policy_version 273694 (0.0054) [2024-06-19 03:58:43,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4484300800. Throughput: 0: 41905.3. Samples: 751920180. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:58:43,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 03:58:44,715][26599] Updated weights for policy 0, policy_version 273704 (0.0030) [2024-06-19 03:58:48,380][26367] Fps is (10 sec: 39320.8, 60 sec: 41779.1, 300 sec: 41988.0). Total num frames: 4484448256. Throughput: 0: 41877.1. Samples: 752048480. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:58:48,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 03:58:49,740][26599] Updated weights for policy 0, policy_version 273714 (0.0040) [2024-06-19 03:58:52,557][26599] Updated weights for policy 0, policy_version 273724 (0.0029) [2024-06-19 03:58:53,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4484710400. Throughput: 0: 41832.0. Samples: 752296900. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:58:53,388][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 03:58:57,411][26599] Updated weights for policy 0, policy_version 273734 (0.0044) [2024-06-19 03:58:58,380][26367] Fps is (10 sec: 47514.4, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4484923392. Throughput: 0: 42193.8. Samples: 752555000. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:58:58,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 03:59:00,319][26599] Updated weights for policy 0, policy_version 273744 (0.0031) [2024-06-19 03:59:03,380][26367] Fps is (10 sec: 36045.1, 60 sec: 41506.1, 300 sec: 41932.0). Total num frames: 4485070848. Throughput: 0: 41862.2. Samples: 752679180. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:59:03,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 03:59:05,282][26599] Updated weights for policy 0, policy_version 273754 (0.0047) [2024-06-19 03:59:08,194][26599] Updated weights for policy 0, policy_version 273764 (0.0041) [2024-06-19 03:59:08,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 4485349376. Throughput: 0: 41961.1. Samples: 752929680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:59:08,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 03:59:12,969][26599] Updated weights for policy 0, policy_version 273774 (0.0046) [2024-06-19 03:59:13,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 4485545984. Throughput: 0: 42154.3. Samples: 753185160. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:59:13,381][26367] Avg episode reward: [(0, '0.837')] [2024-06-19 03:59:16,081][26599] Updated weights for policy 0, policy_version 273784 (0.0033) [2024-06-19 03:59:18,380][26367] Fps is (10 sec: 34407.2, 60 sec: 41233.1, 300 sec: 41987.5). Total num frames: 4485693440. Throughput: 0: 41876.1. Samples: 753306260. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:59:18,380][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 03:59:20,756][26599] Updated weights for policy 0, policy_version 273794 (0.0029) [2024-06-19 03:59:23,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42871.3, 300 sec: 42154.1). Total num frames: 4485988352. Throughput: 0: 42123.8. Samples: 753559020. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:59:23,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 03:59:24,512][26599] Updated weights for policy 0, policy_version 273804 (0.0038) [2024-06-19 03:59:28,380][26367] Fps is (10 sec: 45874.3, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4486152192. Throughput: 0: 42307.5. Samples: 753824020. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:59:28,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 03:59:28,395][26599] Updated weights for policy 0, policy_version 273814 (0.0033) [2024-06-19 03:59:32,487][26599] Updated weights for policy 0, policy_version 273824 (0.0029) [2024-06-19 03:59:33,380][26367] Fps is (10 sec: 36045.0, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4486348800. Throughput: 0: 42073.8. Samples: 753941800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 03:59:33,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 03:59:35,554][26579] Signal inference workers to stop experience collection... (11200 times) [2024-06-19 03:59:35,556][26579] Signal inference workers to resume experience collection... (11200 times) [2024-06-19 03:59:35,586][26599] InferenceWorker_p0-w0: stopping experience collection (11200 times) [2024-06-19 03:59:35,586][26599] InferenceWorker_p0-w0: resuming experience collection (11200 times) [2024-06-19 03:59:36,000][26599] Updated weights for policy 0, policy_version 273834 (0.0031) [2024-06-19 03:59:38,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42043.0). Total num frames: 4486610944. Throughput: 0: 42139.6. Samples: 754193180. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 03:59:38,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 03:59:38,413][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000273841_4486610944.pth... [2024-06-19 03:59:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000273225_4476518400.pth [2024-06-19 03:59:40,289][26599] Updated weights for policy 0, policy_version 273844 (0.0048) [2024-06-19 03:59:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 41506.1, 300 sec: 42098.8). Total num frames: 4486791168. Throughput: 0: 42257.7. Samples: 754456600. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 03:59:43,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 03:59:43,651][26599] Updated weights for policy 0, policy_version 273854 (0.0024) [2024-06-19 03:59:47,868][26599] Updated weights for policy 0, policy_version 273864 (0.0028) [2024-06-19 03:59:48,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4486987776. Throughput: 0: 42131.0. Samples: 754575080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 03:59:48,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 03:59:51,306][26599] Updated weights for policy 0, policy_version 273874 (0.0034) [2024-06-19 03:59:53,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4487249920. Throughput: 0: 42238.0. Samples: 754830380. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 03:59:53,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 03:59:55,414][26599] Updated weights for policy 0, policy_version 273884 (0.0026) [2024-06-19 03:59:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4487413760. Throughput: 0: 42359.0. Samples: 755091320. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 03:59:58,381][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 03:59:59,248][26599] Updated weights for policy 0, policy_version 273894 (0.0030) [2024-06-19 04:00:03,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42598.4, 300 sec: 41987.5). Total num frames: 4487626752. Throughput: 0: 42217.3. Samples: 755206040. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 04:00:03,381][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 04:00:03,593][26599] Updated weights for policy 0, policy_version 273904 (0.0035) [2024-06-19 04:00:06,930][26599] Updated weights for policy 0, policy_version 273914 (0.0045) [2024-06-19 04:00:08,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42325.4, 300 sec: 42098.5). 
Total num frames: 4487888896. Throughput: 0: 42213.0. Samples: 755458600. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 04:00:08,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 04:00:11,514][26599] Updated weights for policy 0, policy_version 273924 (0.0031) [2024-06-19 04:00:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4488052736. Throughput: 0: 42094.8. Samples: 755718280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 04:00:13,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 04:00:14,849][26599] Updated weights for policy 0, policy_version 273934 (0.0035) [2024-06-19 04:00:18,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42871.4, 300 sec: 42043.0). Total num frames: 4488265728. Throughput: 0: 42040.5. Samples: 755833620. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 04:00:18,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 04:00:19,389][26599] Updated weights for policy 0, policy_version 273944 (0.0029) [2024-06-19 04:00:22,533][26599] Updated weights for policy 0, policy_version 273954 (0.0028) [2024-06-19 04:00:23,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4488511488. Throughput: 0: 42131.1. Samples: 756089080. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 04:00:23,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 04:00:27,456][26599] Updated weights for policy 0, policy_version 273964 (0.0038) [2024-06-19 04:00:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4488675328. Throughput: 0: 41883.5. Samples: 756341360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 04:00:28,381][26367] Avg episode reward: [(0, '0.370')] [2024-06-19 04:00:30,345][26599] Updated weights for policy 0, policy_version 273974 (0.0023) [2024-06-19 04:00:33,380][26367] Fps is (10 sec: 37683.8, 60 sec: 42325.4, 300 sec: 41987.5). 
Total num frames: 4488888320. Throughput: 0: 42020.6. Samples: 756466000. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 04:00:33,380][26367] Avg episode reward: [(0, '0.370')] [2024-06-19 04:00:34,730][26579] Signal inference workers to stop experience collection... (11250 times) [2024-06-19 04:00:34,786][26599] InferenceWorker_p0-w0: stopping experience collection (11250 times) [2024-06-19 04:00:34,849][26579] Signal inference workers to resume experience collection... (11250 times) [2024-06-19 04:00:34,849][26599] InferenceWorker_p0-w0: resuming experience collection (11250 times) [2024-06-19 04:00:34,994][26599] Updated weights for policy 0, policy_version 273984 (0.0041) [2024-06-19 04:00:38,055][26599] Updated weights for policy 0, policy_version 273994 (0.0030) [2024-06-19 04:00:38,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4489134080. Throughput: 0: 42086.7. Samples: 756724280. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 04:00:38,380][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 04:00:42,684][26599] Updated weights for policy 0, policy_version 274004 (0.0034) [2024-06-19 04:00:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4489297920. Throughput: 0: 42029.4. Samples: 756982640. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 04:00:43,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 04:00:45,621][26599] Updated weights for policy 0, policy_version 274014 (0.0029) [2024-06-19 04:00:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42043.0). Total num frames: 4489543680. Throughput: 0: 42035.5. Samples: 757097640. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:00:48,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 04:00:50,276][26599] Updated weights for policy 0, policy_version 274024 (0.0037) [2024-06-19 04:00:53,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 4489740288. Throughput: 0: 42202.3. Samples: 757357700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:00:53,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 04:00:53,751][26599] Updated weights for policy 0, policy_version 274034 (0.0038) [2024-06-19 04:00:58,026][26599] Updated weights for policy 0, policy_version 274044 (0.0038) [2024-06-19 04:00:58,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4489936896. Throughput: 0: 42031.5. Samples: 757609700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:00:58,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 04:01:01,357][26599] Updated weights for policy 0, policy_version 274054 (0.0033) [2024-06-19 04:01:03,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4490166272. Throughput: 0: 42171.2. Samples: 757731320. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:03,380][26367] Avg episode reward: [(0, '0.375')] [2024-06-19 04:01:05,885][26599] Updated weights for policy 0, policy_version 274064 (0.0044) [2024-06-19 04:01:08,380][26367] Fps is (10 sec: 44236.6, 60 sec: 41506.1, 300 sec: 42154.3). Total num frames: 4490379264. Throughput: 0: 42192.0. Samples: 757987720. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:08,381][26367] Avg episode reward: [(0, '0.349')] [2024-06-19 04:01:09,006][26599] Updated weights for policy 0, policy_version 274074 (0.0040) [2024-06-19 04:01:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4490575872. Throughput: 0: 42164.2. Samples: 758238740. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:13,380][26367] Avg episode reward: [(0, '0.375')] [2024-06-19 04:01:13,501][26599] Updated weights for policy 0, policy_version 274084 (0.0040) [2024-06-19 04:01:16,769][26599] Updated weights for policy 0, policy_version 274094 (0.0033) [2024-06-19 04:01:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42099.1). Total num frames: 4490805248. Throughput: 0: 42290.2. Samples: 758369060. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:18,381][26367] Avg episode reward: [(0, '0.375')] [2024-06-19 04:01:21,175][26599] Updated weights for policy 0, policy_version 274104 (0.0024) [2024-06-19 04:01:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 4491001856. Throughput: 0: 42277.7. Samples: 758626780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:23,381][26367] Avg episode reward: [(0, '0.277')] [2024-06-19 04:01:24,535][26599] Updated weights for policy 0, policy_version 274114 (0.0032) [2024-06-19 04:01:28,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4491214848. Throughput: 0: 42247.9. Samples: 758883800. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:28,381][26367] Avg episode reward: [(0, '0.351')] [2024-06-19 04:01:28,794][26599] Updated weights for policy 0, policy_version 274124 (0.0043) [2024-06-19 04:01:32,106][26599] Updated weights for policy 0, policy_version 274134 (0.0032) [2024-06-19 04:01:33,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 4491460608. Throughput: 0: 42454.6. Samples: 759008100. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:33,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 04:01:36,512][26599] Updated weights for policy 0, policy_version 274144 (0.0033) [2024-06-19 04:01:38,380][26367] Fps is (10 sec: 42599.4, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4491640832. Throughput: 0: 42357.4. Samples: 759263780. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:38,380][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 04:01:38,432][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000274149_4491657216.pth... [2024-06-19 04:01:38,491][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000273532_4481548288.pth [2024-06-19 04:01:39,782][26599] Updated weights for policy 0, policy_version 274154 (0.0038) [2024-06-19 04:01:43,380][26367] Fps is (10 sec: 36044.9, 60 sec: 42052.3, 300 sec: 41931.9). Total num frames: 4491821056. Throughput: 0: 42416.5. Samples: 759518440. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:43,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 04:01:44,634][26599] Updated weights for policy 0, policy_version 274164 (0.0028) [2024-06-19 04:01:47,691][26599] Updated weights for policy 0, policy_version 274174 (0.0029) [2024-06-19 04:01:48,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 4492099584. Throughput: 0: 42403.4. Samples: 759639480. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:48,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 04:01:52,312][26599] Updated weights for policy 0, policy_version 274184 (0.0031) [2024-06-19 04:01:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4492263424. Throughput: 0: 42236.1. Samples: 759888340. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 04:01:53,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 04:01:55,620][26599] Updated weights for policy 0, policy_version 274194 (0.0048) [2024-06-19 04:01:58,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 41987.4). Total num frames: 4492476416. Throughput: 0: 42256.3. Samples: 760140280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:01:58,381][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 04:01:59,986][26599] Updated weights for policy 0, policy_version 274204 (0.0029) [2024-06-19 04:02:03,273][26579] Signal inference workers to stop experience collection... (11300 times) [2024-06-19 04:02:03,326][26599] InferenceWorker_p0-w0: stopping experience collection (11300 times) [2024-06-19 04:02:03,326][26579] Signal inference workers to resume experience collection... (11300 times) [2024-06-19 04:02:03,349][26599] InferenceWorker_p0-w0: resuming experience collection (11300 times) [2024-06-19 04:02:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4492705792. Throughput: 0: 42099.5. Samples: 760263540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:03,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 04:02:03,481][26599] Updated weights for policy 0, policy_version 274214 (0.0033) [2024-06-19 04:02:07,958][26599] Updated weights for policy 0, policy_version 274224 (0.0030) [2024-06-19 04:02:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 4492902400. Throughput: 0: 42028.9. Samples: 760518080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:08,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 04:02:11,148][26599] Updated weights for policy 0, policy_version 274234 (0.0036) [2024-06-19 04:02:13,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4493099008. Throughput: 0: 42008.5. 
Samples: 760774180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:13,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 04:02:15,649][26599] Updated weights for policy 0, policy_version 274244 (0.0036) [2024-06-19 04:02:18,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4493344768. Throughput: 0: 42058.3. Samples: 760900720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:18,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 04:02:18,965][26599] Updated weights for policy 0, policy_version 274254 (0.0047) [2024-06-19 04:02:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 4493524992. Throughput: 0: 41958.2. Samples: 761151900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:23,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 04:02:23,406][26599] Updated weights for policy 0, policy_version 274264 (0.0047) [2024-06-19 04:02:26,600][26599] Updated weights for policy 0, policy_version 274274 (0.0026) [2024-06-19 04:02:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 4493754368. Throughput: 0: 41772.8. Samples: 761398220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:28,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 04:02:31,442][26599] Updated weights for policy 0, policy_version 274284 (0.0044) [2024-06-19 04:02:33,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4493983744. Throughput: 0: 42038.8. Samples: 761531220. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:33,380][26367] Avg episode reward: [(0, '0.376')] [2024-06-19 04:02:35,121][26599] Updated weights for policy 0, policy_version 274294 (0.0044) [2024-06-19 04:02:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 4494163968. Throughput: 0: 42099.9. 
Samples: 761782840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:38,381][26367] Avg episode reward: [(0, '0.358')] [2024-06-19 04:02:39,068][26599] Updated weights for policy 0, policy_version 274304 (0.0032) [2024-06-19 04:02:43,040][26599] Updated weights for policy 0, policy_version 274314 (0.0038) [2024-06-19 04:02:43,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 4494376960. Throughput: 0: 42012.5. Samples: 762030840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:43,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 04:02:46,629][26599] Updated weights for policy 0, policy_version 274324 (0.0038) [2024-06-19 04:02:48,380][26367] Fps is (10 sec: 44237.1, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4494606336. Throughput: 0: 42108.8. Samples: 762158440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:48,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 04:02:50,890][26599] Updated weights for policy 0, policy_version 274334 (0.0039) [2024-06-19 04:02:53,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4494786560. Throughput: 0: 42030.3. Samples: 762409440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:53,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 04:02:54,430][26599] Updated weights for policy 0, policy_version 274344 (0.0041) [2024-06-19 04:02:58,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 4494999552. Throughput: 0: 41949.4. Samples: 762661900. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:02:58,381][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 04:02:58,867][26599] Updated weights for policy 0, policy_version 274354 (0.0039) [2024-06-19 04:03:02,192][26599] Updated weights for policy 0, policy_version 274364 (0.0029) [2024-06-19 04:03:03,384][26367] Fps is (10 sec: 44220.4, 60 sec: 42049.7, 300 sec: 42209.1). Total num frames: 4495228928. Throughput: 0: 42081.5. Samples: 762794540. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:03:03,385][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 04:03:06,408][26599] Updated weights for policy 0, policy_version 274374 (0.0033) [2024-06-19 04:03:08,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4495425536. Throughput: 0: 42117.2. Samples: 763047180. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:03:08,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 04:03:10,039][26599] Updated weights for policy 0, policy_version 274384 (0.0031) [2024-06-19 04:03:13,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4495638528. Throughput: 0: 42156.9. Samples: 763295280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:13,383][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 04:03:13,965][26599] Updated weights for policy 0, policy_version 274394 (0.0032) [2024-06-19 04:03:17,833][26599] Updated weights for policy 0, policy_version 274404 (0.0027) [2024-06-19 04:03:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 4495851520. Throughput: 0: 42105.6. Samples: 763425980. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:18,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 04:03:21,585][26599] Updated weights for policy 0, policy_version 274414 (0.0035) [2024-06-19 04:03:22,974][26579] Signal inference workers to stop experience collection... (11350 times) [2024-06-19 04:03:22,982][26579] Signal inference workers to resume experience collection... (11350 times) [2024-06-19 04:03:22,996][26599] InferenceWorker_p0-w0: stopping experience collection (11350 times) [2024-06-19 04:03:22,997][26599] InferenceWorker_p0-w0: resuming experience collection (11350 times) [2024-06-19 04:03:23,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 41987.5). Total num frames: 4496048128. Throughput: 0: 42230.0. Samples: 763683180. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:23,380][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 04:03:25,516][26599] Updated weights for policy 0, policy_version 274424 (0.0024) [2024-06-19 04:03:28,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4496293888. Throughput: 0: 42130.3. Samples: 763926700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:28,381][26367] Avg episode reward: [(0, '0.237')] [2024-06-19 04:03:29,663][26599] Updated weights for policy 0, policy_version 274434 (0.0038) [2024-06-19 04:03:33,255][26599] Updated weights for policy 0, policy_version 274444 (0.0031) [2024-06-19 04:03:33,380][26367] Fps is (10 sec: 44235.9, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 4496490496. Throughput: 0: 42288.4. Samples: 764061420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:33,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 04:03:37,245][26599] Updated weights for policy 0, policy_version 274454 (0.0027) [2024-06-19 04:03:38,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 4496687104. Throughput: 0: 42408.4. 
Samples: 764317820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:38,381][26367] Avg episode reward: [(0, '0.797')] [2024-06-19 04:03:38,415][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000274457_4496703488.pth... [2024-06-19 04:03:38,465][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000273841_4486610944.pth [2024-06-19 04:03:41,061][26599] Updated weights for policy 0, policy_version 274464 (0.0041) [2024-06-19 04:03:43,381][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 4496916480. Throughput: 0: 42292.7. Samples: 764565080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:43,381][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 04:03:44,928][26599] Updated weights for policy 0, policy_version 274474 (0.0039) [2024-06-19 04:03:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4497113088. Throughput: 0: 42229.6. Samples: 764694720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:48,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 04:03:48,660][26599] Updated weights for policy 0, policy_version 274484 (0.0044) [2024-06-19 04:03:52,490][26599] Updated weights for policy 0, policy_version 274494 (0.0041) [2024-06-19 04:03:53,380][26367] Fps is (10 sec: 39322.7, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4497309696. Throughput: 0: 42162.8. Samples: 764944500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:53,381][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 04:03:56,287][26599] Updated weights for policy 0, policy_version 274504 (0.0032) [2024-06-19 04:03:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4497555456. Throughput: 0: 42285.0. Samples: 765198100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:03:58,380][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 04:04:00,117][26599] Updated weights for policy 0, policy_version 274514 (0.0029) [2024-06-19 04:04:03,380][26367] Fps is (10 sec: 45874.2, 60 sec: 42327.8, 300 sec: 42098.5). Total num frames: 4497768448. Throughput: 0: 42331.4. Samples: 765330900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:04:03,388][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 04:04:04,134][26599] Updated weights for policy 0, policy_version 274524 (0.0039) [2024-06-19 04:04:08,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4497948672. Throughput: 0: 42169.7. Samples: 765580820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:04:08,381][26367] Avg episode reward: [(0, '0.323')] [2024-06-19 04:04:08,675][26599] Updated weights for policy 0, policy_version 274534 (0.0028) [2024-06-19 04:04:11,766][26599] Updated weights for policy 0, policy_version 274544 (0.0045) [2024-06-19 04:04:13,380][26367] Fps is (10 sec: 42599.5, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4498194432. Throughput: 0: 42272.9. Samples: 765828980. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:04:13,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 04:04:16,311][26599] Updated weights for policy 0, policy_version 274554 (0.0035) [2024-06-19 04:04:18,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 41987.5). Total num frames: 4498374656. Throughput: 0: 42312.2. Samples: 765965460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 04:04:18,380][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 04:04:19,465][26599] Updated weights for policy 0, policy_version 274564 (0.0026) [2024-06-19 04:04:23,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4498587648. Throughput: 0: 42008.9. Samples: 766208220. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:04:23,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 04:04:24,312][26599] Updated weights for policy 0, policy_version 274574 (0.0038) [2024-06-19 04:04:27,343][26599] Updated weights for policy 0, policy_version 274584 (0.0037) [2024-06-19 04:04:28,380][26367] Fps is (10 sec: 45874.1, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4498833408. Throughput: 0: 42071.7. Samples: 766458300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:04:28,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 04:04:32,069][26599] Updated weights for policy 0, policy_version 274594 (0.0039) [2024-06-19 04:04:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.3, 300 sec: 41987.5). Total num frames: 4498997248. Throughput: 0: 42215.1. Samples: 766594400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:04:33,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 04:04:34,887][26599] Updated weights for policy 0, policy_version 274604 (0.0027) [2024-06-19 04:04:38,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4499243008. Throughput: 0: 42177.3. Samples: 766842480. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:04:38,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 04:04:39,567][26599] Updated weights for policy 0, policy_version 274614 (0.0030) [2024-06-19 04:04:39,989][26579] Signal inference workers to stop experience collection... (11400 times) [2024-06-19 04:04:40,030][26599] InferenceWorker_p0-w0: stopping experience collection (11400 times) [2024-06-19 04:04:40,036][26579] Signal inference workers to resume experience collection... 
(11400 times) [2024-06-19 04:04:40,053][26599] InferenceWorker_p0-w0: resuming experience collection (11400 times) [2024-06-19 04:04:42,754][26599] Updated weights for policy 0, policy_version 274624 (0.0033) [2024-06-19 04:04:43,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4499456000. Throughput: 0: 42242.0. Samples: 767099000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:04:43,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 04:04:47,121][26599] Updated weights for policy 0, policy_version 274634 (0.0033) [2024-06-19 04:04:48,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4499652608. Throughput: 0: 42198.4. Samples: 767229820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:04:48,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 04:04:50,355][26599] Updated weights for policy 0, policy_version 274644 (0.0023) [2024-06-19 04:04:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4499865600. Throughput: 0: 42168.4. Samples: 767478400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:04:53,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 04:04:54,673][26599] Updated weights for policy 0, policy_version 274654 (0.0040) [2024-06-19 04:04:57,800][26599] Updated weights for policy 0, policy_version 274664 (0.0031) [2024-06-19 04:04:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4500094976. Throughput: 0: 42327.5. Samples: 767733720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:04:58,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 04:05:02,710][26599] Updated weights for policy 0, policy_version 274674 (0.0034) [2024-06-19 04:05:03,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41779.4, 300 sec: 41987.5). Total num frames: 4500275200. Throughput: 0: 42180.0. Samples: 767863560. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:05:03,380][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 04:05:06,109][26599] Updated weights for policy 0, policy_version 274684 (0.0032) [2024-06-19 04:05:08,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 4500520960. Throughput: 0: 42378.2. Samples: 768115240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:05:08,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 04:05:10,282][26599] Updated weights for policy 0, policy_version 274694 (0.0034) [2024-06-19 04:05:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4500717568. Throughput: 0: 42342.8. Samples: 768363720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:05:13,380][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 04:05:14,095][26599] Updated weights for policy 0, policy_version 274704 (0.0032) [2024-06-19 04:05:18,099][26599] Updated weights for policy 0, policy_version 274714 (0.0039) [2024-06-19 04:05:18,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 4500914176. Throughput: 0: 42061.7. Samples: 768487180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:05:18,384][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 04:05:21,655][26599] Updated weights for policy 0, policy_version 274724 (0.0043) [2024-06-19 04:05:23,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4501159936. Throughput: 0: 42246.6. Samples: 768743580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:05:23,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 04:05:25,595][26599] Updated weights for policy 0, policy_version 274734 (0.0040) [2024-06-19 04:05:28,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4501356544. Throughput: 0: 42152.1. Samples: 768995840. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 04:05:28,381][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 04:05:29,367][26599] Updated weights for policy 0, policy_version 274744 (0.0029) [2024-06-19 04:05:33,299][26599] Updated weights for policy 0, policy_version 274754 (0.0035) [2024-06-19 04:05:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 4501569536. Throughput: 0: 41980.0. Samples: 769118920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:05:33,381][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 04:05:37,066][26599] Updated weights for policy 0, policy_version 274764 (0.0028) [2024-06-19 04:05:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4501766144. Throughput: 0: 42119.1. Samples: 769373760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:05:38,381][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 04:05:38,471][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000274767_4501782528.pth... [2024-06-19 04:05:38,532][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000274149_4491657216.pth [2024-06-19 04:05:41,575][26599] Updated weights for policy 0, policy_version 274774 (0.0042) [2024-06-19 04:05:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4501979136. Throughput: 0: 42205.4. Samples: 769632960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:05:43,380][26367] Avg episode reward: [(0, '0.256')] [2024-06-19 04:05:44,921][26599] Updated weights for policy 0, policy_version 274784 (0.0028) [2024-06-19 04:05:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4502192128. Throughput: 0: 42020.7. Samples: 769754500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:05:48,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 04:05:49,969][26599] Updated weights for policy 0, policy_version 274794 (0.0037) [2024-06-19 04:05:52,589][26599] Updated weights for policy 0, policy_version 274804 (0.0034) [2024-06-19 04:05:53,380][26367] Fps is (10 sec: 40959.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4502388736. Throughput: 0: 41854.6. Samples: 769998700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:05:53,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 04:05:57,479][26599] Updated weights for policy 0, policy_version 274814 (0.0031) [2024-06-19 04:05:58,380][26367] Fps is (10 sec: 37683.4, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 4502568960. Throughput: 0: 42196.8. Samples: 770262580. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:05:58,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 04:05:58,701][26579] Signal inference workers to stop experience collection... (11450 times) [2024-06-19 04:05:58,751][26599] InferenceWorker_p0-w0: stopping experience collection (11450 times) [2024-06-19 04:05:58,757][26579] Signal inference workers to resume experience collection... (11450 times) [2024-06-19 04:05:58,764][26599] InferenceWorker_p0-w0: resuming experience collection (11450 times) [2024-06-19 04:06:00,338][26599] Updated weights for policy 0, policy_version 274824 (0.0032) [2024-06-19 04:06:03,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4502814720. Throughput: 0: 42137.0. Samples: 770383340. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:06:03,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 04:06:05,347][26599] Updated weights for policy 0, policy_version 274834 (0.0039) [2024-06-19 04:06:08,336][26599] Updated weights for policy 0, policy_version 274844 (0.0043) [2024-06-19 04:06:08,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42052.3, 300 sec: 42265.1). Total num frames: 4503044096. Throughput: 0: 41969.3. Samples: 770632200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:06:08,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 04:06:13,085][26599] Updated weights for policy 0, policy_version 274854 (0.0024) [2024-06-19 04:06:13,380][26367] Fps is (10 sec: 39321.1, 60 sec: 41506.0, 300 sec: 42043.0). Total num frames: 4503207936. Throughput: 0: 42093.7. Samples: 770890060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:06:13,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 04:06:15,973][26599] Updated weights for policy 0, policy_version 274864 (0.0025) [2024-06-19 04:06:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4503437312. Throughput: 0: 41939.6. Samples: 771006200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:06:18,381][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 04:06:20,782][26599] Updated weights for policy 0, policy_version 274874 (0.0038) [2024-06-19 04:06:23,380][26367] Fps is (10 sec: 45875.9, 60 sec: 41779.2, 300 sec: 42209.7). Total num frames: 4503666688. Throughput: 0: 42095.2. Samples: 771268040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:06:23,381][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 04:06:23,778][26599] Updated weights for policy 0, policy_version 274884 (0.0037) [2024-06-19 04:06:28,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41506.1, 300 sec: 41987.5). Total num frames: 4503846912. Throughput: 0: 41993.2. Samples: 771522660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:06:28,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 04:06:28,568][26599] Updated weights for policy 0, policy_version 274894 (0.0029) [2024-06-19 04:06:31,501][26599] Updated weights for policy 0, policy_version 274904 (0.0046) [2024-06-19 04:06:33,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 4504059904. Throughput: 0: 42004.5. Samples: 771644700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:06:33,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 04:06:36,259][26599] Updated weights for policy 0, policy_version 274914 (0.0031) [2024-06-19 04:06:38,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4504289280. Throughput: 0: 42273.5. Samples: 771901000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:06:38,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 04:06:39,313][26599] Updated weights for policy 0, policy_version 274924 (0.0043) [2024-06-19 04:06:43,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4504485888. Throughput: 0: 41965.0. Samples: 772151000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:06:43,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 04:06:43,931][26599] Updated weights for policy 0, policy_version 274934 (0.0031) [2024-06-19 04:06:47,159][26599] Updated weights for policy 0, policy_version 274944 (0.0035) [2024-06-19 04:06:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4504698880. Throughput: 0: 42103.9. Samples: 772278020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:06:48,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 04:06:51,697][26599] Updated weights for policy 0, policy_version 274954 (0.0041) [2024-06-19 04:06:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4504911872. Throughput: 0: 42110.3. Samples: 772527160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:06:53,380][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 04:06:55,553][26599] Updated weights for policy 0, policy_version 274964 (0.0047) [2024-06-19 04:06:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4505108480. Throughput: 0: 41930.8. Samples: 772776940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:06:58,384][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 04:06:59,656][26599] Updated weights for policy 0, policy_version 274974 (0.0032) [2024-06-19 04:07:03,153][26599] Updated weights for policy 0, policy_version 274984 (0.0038) [2024-06-19 04:07:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4505337856. Throughput: 0: 42039.5. Samples: 772897980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:03,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 04:07:07,170][26599] Updated weights for policy 0, policy_version 274994 (0.0031) [2024-06-19 04:07:08,380][26367] Fps is (10 sec: 44236.8, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 4505550848. Throughput: 0: 42011.1. Samples: 773158540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:08,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 04:07:11,006][26599] Updated weights for policy 0, policy_version 275004 (0.0024) [2024-06-19 04:07:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4505747456. Throughput: 0: 42038.2. Samples: 773414380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:13,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 04:07:14,952][26599] Updated weights for policy 0, policy_version 275014 (0.0041) [2024-06-19 04:07:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4505960448. Throughput: 0: 42107.1. Samples: 773539520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:18,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 04:07:18,984][26599] Updated weights for policy 0, policy_version 275024 (0.0032) [2024-06-19 04:07:22,996][26599] Updated weights for policy 0, policy_version 275034 (0.0028) [2024-06-19 04:07:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 4506173440. Throughput: 0: 41943.9. Samples: 773788480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:23,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 04:07:26,514][26599] Updated weights for policy 0, policy_version 275044 (0.0023) [2024-06-19 04:07:28,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42043.0). Total num frames: 4506386432. Throughput: 0: 42022.3. Samples: 774042000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:28,380][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 04:07:30,565][26599] Updated weights for policy 0, policy_version 275054 (0.0038) [2024-06-19 04:07:32,399][26579] Signal inference workers to stop experience collection... (11500 times) [2024-06-19 04:07:32,400][26579] Signal inference workers to resume experience collection... (11500 times) [2024-06-19 04:07:32,422][26599] InferenceWorker_p0-w0: stopping experience collection (11500 times) [2024-06-19 04:07:32,422][26599] InferenceWorker_p0-w0: resuming experience collection (11500 times) [2024-06-19 04:07:33,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 4506615808. Throughput: 0: 41974.3. 
Samples: 774166860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:33,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 04:07:34,410][26599] Updated weights for policy 0, policy_version 275064 (0.0030) [2024-06-19 04:07:38,231][26599] Updated weights for policy 0, policy_version 275074 (0.0038) [2024-06-19 04:07:38,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4506812416. Throughput: 0: 42228.3. Samples: 774427440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:38,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 04:07:38,501][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000275075_4506828800.pth... [2024-06-19 04:07:38,550][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000274457_4496703488.pth [2024-06-19 04:07:42,128][26599] Updated weights for policy 0, policy_version 275084 (0.0038) [2024-06-19 04:07:43,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4507025408. Throughput: 0: 42285.7. Samples: 774679800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:43,381][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 04:07:46,155][26599] Updated weights for policy 0, policy_version 275094 (0.0037) [2024-06-19 04:07:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4507238400. Throughput: 0: 42381.9. Samples: 774805160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:48,380][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 04:07:49,864][26599] Updated weights for policy 0, policy_version 275104 (0.0041) [2024-06-19 04:07:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4507435008. Throughput: 0: 42258.6. Samples: 775060180. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 04:07:53,381][26367] Avg episode reward: [(0, '0.785')] [2024-06-19 04:07:53,906][26599] Updated weights for policy 0, policy_version 275114 (0.0030) [2024-06-19 04:07:57,375][26599] Updated weights for policy 0, policy_version 275124 (0.0035) [2024-06-19 04:07:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42099.1). Total num frames: 4507648000. Throughput: 0: 42112.2. Samples: 775309420. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:07:58,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 04:08:01,741][26599] Updated weights for policy 0, policy_version 275134 (0.0035) [2024-06-19 04:08:03,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4507893760. Throughput: 0: 42221.4. Samples: 775439480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:03,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 04:08:04,843][26599] Updated weights for policy 0, policy_version 275144 (0.0034) [2024-06-19 04:08:08,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 4508057600. Throughput: 0: 42336.4. Samples: 775693620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:08,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 04:08:09,324][26599] Updated weights for policy 0, policy_version 275154 (0.0040) [2024-06-19 04:08:13,059][26599] Updated weights for policy 0, policy_version 275164 (0.0045) [2024-06-19 04:08:13,384][26367] Fps is (10 sec: 39307.4, 60 sec: 42322.8, 300 sec: 42153.6). Total num frames: 4508286976. Throughput: 0: 42163.6. Samples: 775939520. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:13,384][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 04:08:17,409][26599] Updated weights for policy 0, policy_version 275174 (0.0028) [2024-06-19 04:08:18,381][26367] Fps is (10 sec: 47511.0, 60 sec: 42871.1, 300 sec: 42320.6). Total num frames: 4508532736. Throughput: 0: 42321.1. Samples: 776071340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:18,381][26367] Avg episode reward: [(0, '0.867')] [2024-06-19 04:08:20,693][26599] Updated weights for policy 0, policy_version 275184 (0.0041) [2024-06-19 04:08:23,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4508696576. Throughput: 0: 42047.1. Samples: 776319560. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:23,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 04:08:25,142][26599] Updated weights for policy 0, policy_version 275194 (0.0036) [2024-06-19 04:08:28,265][26599] Updated weights for policy 0, policy_version 275204 (0.0037) [2024-06-19 04:08:28,380][26367] Fps is (10 sec: 40962.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4508942336. Throughput: 0: 41986.3. Samples: 776569180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:28,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 04:08:32,674][26599] Updated weights for policy 0, policy_version 275214 (0.0029) [2024-06-19 04:08:33,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4509122560. Throughput: 0: 42102.2. Samples: 776699760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:33,380][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 04:08:36,475][26599] Updated weights for policy 0, policy_version 275224 (0.0029) [2024-06-19 04:08:38,384][26367] Fps is (10 sec: 39307.2, 60 sec: 42049.7, 300 sec: 42098.1). Total num frames: 4509335552. Throughput: 0: 42003.3. Samples: 776950480. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:38,385][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 04:08:40,329][26599] Updated weights for policy 0, policy_version 275234 (0.0042) [2024-06-19 04:08:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4509564928. Throughput: 0: 41942.6. Samples: 777196840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:43,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 04:08:44,133][26599] Updated weights for policy 0, policy_version 275244 (0.0028) [2024-06-19 04:08:48,308][26599] Updated weights for policy 0, policy_version 275254 (0.0030) [2024-06-19 04:08:48,380][26367] Fps is (10 sec: 42614.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4509761536. Throughput: 0: 41946.8. Samples: 777327080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:48,380][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 04:08:51,833][26599] Updated weights for policy 0, policy_version 275264 (0.0038) [2024-06-19 04:08:53,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 4509958144. Throughput: 0: 41873.9. Samples: 777577940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:53,380][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 04:08:56,188][26579] Signal inference workers to stop experience collection... (11550 times) [2024-06-19 04:08:56,229][26599] InferenceWorker_p0-w0: stopping experience collection (11550 times) [2024-06-19 04:08:56,238][26579] Signal inference workers to resume experience collection... (11550 times) [2024-06-19 04:08:56,243][26599] InferenceWorker_p0-w0: resuming experience collection (11550 times) [2024-06-19 04:08:56,415][26599] Updated weights for policy 0, policy_version 275274 (0.0040) [2024-06-19 04:08:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42098.6). Total num frames: 4510187520. Throughput: 0: 42048.2. 
Samples: 777831540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:08:58,384][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 04:08:59,475][26599] Updated weights for policy 0, policy_version 275284 (0.0030) [2024-06-19 04:09:03,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 4510384128. Throughput: 0: 41996.1. Samples: 777961140. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:09:03,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 04:09:04,184][26599] Updated weights for policy 0, policy_version 275294 (0.0032) [2024-06-19 04:09:07,197][26599] Updated weights for policy 0, policy_version 275304 (0.0045) [2024-06-19 04:09:08,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4510597120. Throughput: 0: 41968.0. Samples: 778208120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 04:09:08,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 04:09:11,947][26599] Updated weights for policy 0, policy_version 275314 (0.0040) [2024-06-19 04:09:13,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42325.3, 300 sec: 42209.1). Total num frames: 4510826496. Throughput: 0: 42133.5. Samples: 778465340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:13,384][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 04:09:14,940][26599] Updated weights for policy 0, policy_version 275324 (0.0039) [2024-06-19 04:09:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41233.6, 300 sec: 42098.6). Total num frames: 4511006720. Throughput: 0: 42026.7. Samples: 778590960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:18,380][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 04:09:19,644][26599] Updated weights for policy 0, policy_version 275334 (0.0035) [2024-06-19 04:09:22,840][26599] Updated weights for policy 0, policy_version 275344 (0.0042) [2024-06-19 04:09:23,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4511236096. Throughput: 0: 41870.4. Samples: 778834500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:23,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 04:09:27,569][26599] Updated weights for policy 0, policy_version 275354 (0.0042) [2024-06-19 04:09:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 4511432704. Throughput: 0: 42217.4. Samples: 779096620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:28,380][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 04:09:31,148][26599] Updated weights for policy 0, policy_version 275364 (0.0043) [2024-06-19 04:09:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42043.0). Total num frames: 4511645696. Throughput: 0: 42092.8. Samples: 779221260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:33,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 04:09:35,311][26599] Updated weights for policy 0, policy_version 275374 (0.0029) [2024-06-19 04:09:38,384][26367] Fps is (10 sec: 44219.9, 60 sec: 42325.3, 300 sec: 42098.0). Total num frames: 4511875072. Throughput: 0: 42091.1. Samples: 779472200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:38,385][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 04:09:38,415][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000275383_4511875072.pth... 
[2024-06-19 04:09:38,474][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000274767_4501782528.pth [2024-06-19 04:09:38,918][26599] Updated weights for policy 0, policy_version 275384 (0.0030) [2024-06-19 04:09:43,164][26599] Updated weights for policy 0, policy_version 275394 (0.0041) [2024-06-19 04:09:43,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41506.0, 300 sec: 42043.0). Total num frames: 4512055296. Throughput: 0: 42200.8. Samples: 779730580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:43,381][26367] Avg episode reward: [(0, '0.361')] [2024-06-19 04:09:47,012][26599] Updated weights for policy 0, policy_version 275404 (0.0031) [2024-06-19 04:09:48,380][26367] Fps is (10 sec: 39336.5, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4512268288. Throughput: 0: 41992.1. Samples: 779850780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:48,381][26367] Avg episode reward: [(0, '0.291')] [2024-06-19 04:09:50,839][26599] Updated weights for policy 0, policy_version 275414 (0.0038) [2024-06-19 04:09:53,380][26367] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42098.6). Total num frames: 4512514048. Throughput: 0: 42156.1. Samples: 780105140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:53,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 04:09:54,666][26599] Updated weights for policy 0, policy_version 275424 (0.0034) [2024-06-19 04:09:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 4512694272. Throughput: 0: 42060.4. Samples: 780357900. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:09:58,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 04:09:58,551][26599] Updated weights for policy 0, policy_version 275434 (0.0034) [2024-06-19 04:10:02,284][26599] Updated weights for policy 0, policy_version 275444 (0.0032) [2024-06-19 04:10:03,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4512907264. Throughput: 0: 42027.0. Samples: 780482180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:10:03,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 04:10:06,541][26599] Updated weights for policy 0, policy_version 275454 (0.0052) [2024-06-19 04:10:08,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4513136640. Throughput: 0: 42269.9. Samples: 780736640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:10:08,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 04:10:09,953][26599] Updated weights for policy 0, policy_version 275464 (0.0036) [2024-06-19 04:10:13,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41235.6, 300 sec: 41987.5). Total num frames: 4513300480. Throughput: 0: 41908.8. Samples: 780982520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:10:13,381][26367] Avg episode reward: [(0, '0.843')] [2024-06-19 04:10:14,626][26599] Updated weights for policy 0, policy_version 275474 (0.0037) [2024-06-19 04:10:17,338][26579] Signal inference workers to stop experience collection... (11600 times) [2024-06-19 04:10:17,339][26579] Signal inference workers to resume experience collection... 
(11600 times) [2024-06-19 04:10:17,373][26599] InferenceWorker_p0-w0: stopping experience collection (11600 times) [2024-06-19 04:10:17,373][26599] InferenceWorker_p0-w0: resuming experience collection (11600 times) [2024-06-19 04:10:17,481][26599] Updated weights for policy 0, policy_version 275484 (0.0045) [2024-06-19 04:10:18,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42322.7, 300 sec: 41987.0). Total num frames: 4513546240. Throughput: 0: 41864.6. Samples: 781105320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 04:10:18,384][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 04:10:22,281][26599] Updated weights for policy 0, policy_version 275494 (0.0026) [2024-06-19 04:10:23,384][26367] Fps is (10 sec: 44222.1, 60 sec: 41777.0, 300 sec: 41987.0). Total num frames: 4513742848. Throughput: 0: 41968.8. Samples: 781360780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:10:23,384][26367] Avg episode reward: [(0, '0.860')] [2024-06-19 04:10:25,527][26599] Updated weights for policy 0, policy_version 275504 (0.0032) [2024-06-19 04:10:28,380][26367] Fps is (10 sec: 40974.9, 60 sec: 42052.2, 300 sec: 41987.5). Total num frames: 4513955840. Throughput: 0: 41857.0. Samples: 781614140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:10:28,381][26367] Avg episode reward: [(0, '0.860')] [2024-06-19 04:10:29,878][26599] Updated weights for policy 0, policy_version 275514 (0.0033) [2024-06-19 04:10:33,167][26599] Updated weights for policy 0, policy_version 275524 (0.0030) [2024-06-19 04:10:33,380][26367] Fps is (10 sec: 44251.4, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 4514185216. Throughput: 0: 42042.2. Samples: 781742680. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:10:33,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 04:10:37,626][26599] Updated weights for policy 0, policy_version 275534 (0.0030) [2024-06-19 04:10:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41781.7, 300 sec: 42043.0). Total num frames: 4514381824. Throughput: 0: 42087.0. Samples: 781999060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:10:38,389][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 04:10:40,707][26599] Updated weights for policy 0, policy_version 275544 (0.0026) [2024-06-19 04:10:43,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42322.9, 300 sec: 42042.5). Total num frames: 4514594816. Throughput: 0: 41959.6. Samples: 782246240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:10:43,385][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 04:10:45,129][26599] Updated weights for policy 0, policy_version 275554 (0.0033) [2024-06-19 04:10:48,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4514824192. Throughput: 0: 42037.4. Samples: 782373860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:10:48,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 04:10:48,536][26599] Updated weights for policy 0, policy_version 275564 (0.0024) [2024-06-19 04:10:52,701][26599] Updated weights for policy 0, policy_version 275574 (0.0033) [2024-06-19 04:10:53,380][26367] Fps is (10 sec: 42613.6, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4515020800. Throughput: 0: 42075.5. Samples: 782630040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:10:53,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 04:10:57,091][26599] Updated weights for policy 0, policy_version 275584 (0.0042) [2024-06-19 04:10:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 4515233792. Throughput: 0: 42275.5. Samples: 782884920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:10:58,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 04:11:00,779][26599] Updated weights for policy 0, policy_version 275594 (0.0049) [2024-06-19 04:11:03,384][26367] Fps is (10 sec: 40945.5, 60 sec: 42049.8, 300 sec: 41987.0). Total num frames: 4515430400. Throughput: 0: 42324.0. Samples: 783009900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:11:03,384][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 04:11:04,806][26599] Updated weights for policy 0, policy_version 275604 (0.0035) [2024-06-19 04:11:08,310][26599] Updated weights for policy 0, policy_version 275614 (0.0032) [2024-06-19 04:11:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4515659776. Throughput: 0: 42241.7. Samples: 783261520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:11:08,381][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 04:11:12,476][26599] Updated weights for policy 0, policy_version 275624 (0.0044) [2024-06-19 04:11:13,384][26367] Fps is (10 sec: 44236.6, 60 sec: 42868.8, 300 sec: 42153.6). Total num frames: 4515872768. Throughput: 0: 42305.4. Samples: 783518040. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:11:13,385][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 04:11:15,892][26599] Updated weights for policy 0, policy_version 275634 (0.0045) [2024-06-19 04:11:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42054.7, 300 sec: 42043.0). Total num frames: 4516069376. Throughput: 0: 42064.7. Samples: 783635600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:11:18,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 04:11:20,499][26599] Updated weights for policy 0, policy_version 275644 (0.0023) [2024-06-19 04:11:23,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42600.7, 300 sec: 42209.6). Total num frames: 4516298752. Throughput: 0: 42010.7. Samples: 783889540. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:11:23,381][26367] Avg episode reward: [(0, '0.414')] [2024-06-19 04:11:23,892][26599] Updated weights for policy 0, policy_version 275654 (0.0032) [2024-06-19 04:11:28,063][26599] Updated weights for policy 0, policy_version 275664 (0.0032) [2024-06-19 04:11:28,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4516495360. Throughput: 0: 42341.2. Samples: 784151440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:11:28,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 04:11:31,463][26599] Updated weights for policy 0, policy_version 275674 (0.0027) [2024-06-19 04:11:33,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 4516724736. Throughput: 0: 42222.1. Samples: 784273860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:11:33,381][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 04:11:35,786][26599] Updated weights for policy 0, policy_version 275684 (0.0043) [2024-06-19 04:11:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4516921344. Throughput: 0: 42075.2. Samples: 784523420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:11:38,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 04:11:38,412][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000275692_4516937728.pth... [2024-06-19 04:11:38,465][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000275075_4506828800.pth [2024-06-19 04:11:39,770][26599] Updated weights for policy 0, policy_version 275694 (0.0032) [2024-06-19 04:11:39,936][26579] Signal inference workers to stop experience collection... (11650 times) [2024-06-19 04:11:39,937][26579] Signal inference workers to resume experience collection... 
(11650 times) [2024-06-19 04:11:39,954][26599] InferenceWorker_p0-w0: stopping experience collection (11650 times) [2024-06-19 04:11:39,982][26599] InferenceWorker_p0-w0: resuming experience collection (11650 times) [2024-06-19 04:11:43,380][26367] Fps is (10 sec: 37683.9, 60 sec: 41781.8, 300 sec: 42043.0). Total num frames: 4517101568. Throughput: 0: 42168.5. Samples: 784782500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:11:43,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 04:11:43,666][26599] Updated weights for policy 0, policy_version 275704 (0.0029) [2024-06-19 04:11:47,589][26599] Updated weights for policy 0, policy_version 275714 (0.0034) [2024-06-19 04:11:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4517347328. Throughput: 0: 42180.3. Samples: 784907860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:11:48,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 04:11:51,430][26599] Updated weights for policy 0, policy_version 275724 (0.0041) [2024-06-19 04:11:53,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4517560320. Throughput: 0: 42108.1. Samples: 785156380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:11:53,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 04:11:55,244][26599] Updated weights for policy 0, policy_version 275734 (0.0033) [2024-06-19 04:11:58,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4517756928. Throughput: 0: 42313.2. Samples: 785421980. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:11:58,381][26367] Avg episode reward: [(0, '0.850')] [2024-06-19 04:11:59,266][26599] Updated weights for policy 0, policy_version 275744 (0.0039) [2024-06-19 04:12:02,956][26599] Updated weights for policy 0, policy_version 275754 (0.0038) [2024-06-19 04:12:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42327.9, 300 sec: 42098.6). Total num frames: 4517969920. Throughput: 0: 42368.6. Samples: 785542180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:12:03,381][26367] Avg episode reward: [(0, '0.814')] [2024-06-19 04:12:06,882][26599] Updated weights for policy 0, policy_version 275764 (0.0028) [2024-06-19 04:12:08,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42325.5, 300 sec: 42209.7). Total num frames: 4518199296. Throughput: 0: 42446.3. Samples: 785799620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:12:08,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 04:12:10,515][26599] Updated weights for policy 0, policy_version 275774 (0.0027) [2024-06-19 04:12:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41781.8, 300 sec: 42098.6). Total num frames: 4518379520. Throughput: 0: 42273.9. Samples: 786053760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:12:13,380][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 04:12:14,523][26599] Updated weights for policy 0, policy_version 275784 (0.0033) [2024-06-19 04:12:18,213][26599] Updated weights for policy 0, policy_version 275794 (0.0036) [2024-06-19 04:12:18,384][26367] Fps is (10 sec: 40944.7, 60 sec: 42322.9, 300 sec: 42153.6). Total num frames: 4518608896. Throughput: 0: 42263.8. Samples: 786175880. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:12:18,384][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 04:12:22,278][26599] Updated weights for policy 0, policy_version 275804 (0.0041) [2024-06-19 04:12:23,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4518838272. Throughput: 0: 42576.9. Samples: 786439380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:12:23,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 04:12:25,909][26599] Updated weights for policy 0, policy_version 275814 (0.0046) [2024-06-19 04:12:28,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42052.3, 300 sec: 42043.0). Total num frames: 4519018496. Throughput: 0: 42276.4. Samples: 786684940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:12:28,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 04:12:29,920][26599] Updated weights for policy 0, policy_version 275824 (0.0032) [2024-06-19 04:12:33,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4519247872. Throughput: 0: 42389.4. Samples: 786815380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:12:33,381][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 04:12:33,606][26599] Updated weights for policy 0, policy_version 275834 (0.0042) [2024-06-19 04:12:37,652][26599] Updated weights for policy 0, policy_version 275844 (0.0028) [2024-06-19 04:12:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4519428096. Throughput: 0: 42627.1. Samples: 787074600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:12:38,381][26367] Avg episode reward: [(0, '0.339')] [2024-06-19 04:12:41,166][26599] Updated weights for policy 0, policy_version 275854 (0.0032) [2024-06-19 04:12:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42154.1). Total num frames: 4519673856. Throughput: 0: 42216.6. Samples: 787321720. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 21.0) [2024-06-19 04:12:43,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 04:12:46,002][26599] Updated weights for policy 0, policy_version 275864 (0.0038) [2024-06-19 04:12:46,642][26579] Signal inference workers to stop experience collection... (11700 times) [2024-06-19 04:12:46,642][26579] Signal inference workers to resume experience collection... (11700 times) [2024-06-19 04:12:46,682][26599] InferenceWorker_p0-w0: stopping experience collection (11700 times) [2024-06-19 04:12:46,683][26599] InferenceWorker_p0-w0: resuming experience collection (11700 times) [2024-06-19 04:12:48,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4519903232. Throughput: 0: 42506.7. Samples: 787454980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:12:48,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 04:12:48,751][26599] Updated weights for policy 0, policy_version 275874 (0.0022) [2024-06-19 04:12:53,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 4520067072. Throughput: 0: 42357.6. Samples: 787705720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:12:53,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 04:12:53,521][26599] Updated weights for policy 0, policy_version 275884 (0.0035) [2024-06-19 04:12:56,387][26599] Updated weights for policy 0, policy_version 275894 (0.0041) [2024-06-19 04:12:58,380][26367] Fps is (10 sec: 39320.8, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4520296448. Throughput: 0: 42253.1. Samples: 787955160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:12:58,381][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 04:13:01,143][26599] Updated weights for policy 0, policy_version 275904 (0.0044) [2024-06-19 04:13:03,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4520525824. Throughput: 0: 42412.9. 
Samples: 788084300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:03,380][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 04:13:04,488][26599] Updated weights for policy 0, policy_version 275914 (0.0033) [2024-06-19 04:13:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.1, 300 sec: 42099.1). Total num frames: 4520706048. Throughput: 0: 42121.2. Samples: 788334840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:08,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 04:13:08,748][26599] Updated weights for policy 0, policy_version 275924 (0.0042) [2024-06-19 04:13:12,155][26599] Updated weights for policy 0, policy_version 275934 (0.0027) [2024-06-19 04:13:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 41987.6). Total num frames: 4520919040. Throughput: 0: 42199.1. Samples: 788583900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:13,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 04:13:16,550][26599] Updated weights for policy 0, policy_version 275944 (0.0030) [2024-06-19 04:13:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42054.8, 300 sec: 42154.1). Total num frames: 4521132032. Throughput: 0: 42228.0. Samples: 788715640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:18,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 04:13:20,057][26599] Updated weights for policy 0, policy_version 275954 (0.0029) [2024-06-19 04:13:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.2, 300 sec: 42043.0). Total num frames: 4521345024. Throughput: 0: 41876.5. Samples: 788959040. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:23,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 04:13:24,247][26599] Updated weights for policy 0, policy_version 275964 (0.0035) [2024-06-19 04:13:27,847][26599] Updated weights for policy 0, policy_version 275974 (0.0032) [2024-06-19 04:13:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4521574400. Throughput: 0: 41988.3. Samples: 789211200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:28,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 04:13:32,267][26599] Updated weights for policy 0, policy_version 275984 (0.0026) [2024-06-19 04:13:33,384][26367] Fps is (10 sec: 39307.2, 60 sec: 41503.6, 300 sec: 42043.0). Total num frames: 4521738240. Throughput: 0: 41921.5. Samples: 789341600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:33,384][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 04:13:35,765][26599] Updated weights for policy 0, policy_version 275994 (0.0031) [2024-06-19 04:13:38,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4521967616. Throughput: 0: 41668.5. Samples: 789580800. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:38,381][26367] Avg episode reward: [(0, '0.834')] [2024-06-19 04:13:38,430][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000276000_4521984000.pth... [2024-06-19 04:13:38,479][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000275383_4511875072.pth [2024-06-19 04:13:40,387][26599] Updated weights for policy 0, policy_version 276004 (0.0024) [2024-06-19 04:13:43,380][26367] Fps is (10 sec: 45891.5, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4522196992. Throughput: 0: 41875.7. Samples: 789839560. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:43,381][26367] Avg episode reward: [(0, '0.834')] [2024-06-19 04:13:43,555][26599] Updated weights for policy 0, policy_version 276014 (0.0039) [2024-06-19 04:13:48,127][26599] Updated weights for policy 0, policy_version 276024 (0.0038) [2024-06-19 04:13:48,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41233.0, 300 sec: 42098.5). Total num frames: 4522377216. Throughput: 0: 41757.6. Samples: 789963400. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:48,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 04:13:51,382][26599] Updated weights for policy 0, policy_version 276034 (0.0025) [2024-06-19 04:13:53,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4522606592. Throughput: 0: 41649.0. Samples: 790209040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 04:13:53,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 04:13:55,851][26599] Updated weights for policy 0, policy_version 276044 (0.0038) [2024-06-19 04:13:58,380][26367] Fps is (10 sec: 42599.2, 60 sec: 41779.4, 300 sec: 42098.6). Total num frames: 4522803200. Throughput: 0: 41865.8. Samples: 790467860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:13:58,381][26367] Avg episode reward: [(0, '0.894')] [2024-06-19 04:13:59,224][26599] Updated weights for policy 0, policy_version 276054 (0.0025) [2024-06-19 04:14:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41233.0, 300 sec: 42043.0). Total num frames: 4522999808. Throughput: 0: 41694.3. Samples: 790591880. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:03,380][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 04:14:03,616][26599] Updated weights for policy 0, policy_version 276064 (0.0030) [2024-06-19 04:14:07,043][26599] Updated weights for policy 0, policy_version 276074 (0.0035) [2024-06-19 04:14:07,053][26579] Signal inference workers to stop experience collection... 
(11750 times) [2024-06-19 04:14:07,060][26579] Signal inference workers to resume experience collection... (11750 times) [2024-06-19 04:14:07,089][26599] InferenceWorker_p0-w0: stopping experience collection (11750 times) [2024-06-19 04:14:07,090][26599] InferenceWorker_p0-w0: resuming experience collection (11750 times) [2024-06-19 04:14:08,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42099.1). Total num frames: 4523245568. Throughput: 0: 41757.2. Samples: 790838120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:08,388][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 04:14:11,338][26599] Updated weights for policy 0, policy_version 276084 (0.0047) [2024-06-19 04:14:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4523442176. Throughput: 0: 41848.1. Samples: 791094360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:13,380][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 04:14:14,978][26599] Updated weights for policy 0, policy_version 276094 (0.0033) [2024-06-19 04:14:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 4523655168. Throughput: 0: 41710.9. Samples: 791218440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:18,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 04:14:19,043][26599] Updated weights for policy 0, policy_version 276104 (0.0038) [2024-06-19 04:14:22,671][26599] Updated weights for policy 0, policy_version 276114 (0.0030) [2024-06-19 04:14:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4523868160. Throughput: 0: 42054.2. Samples: 791473240. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:23,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 04:14:26,807][26599] Updated weights for policy 0, policy_version 276124 (0.0051) [2024-06-19 04:14:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 4524064768. Throughput: 0: 41964.0. Samples: 791727940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:28,381][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 04:14:30,413][26599] Updated weights for policy 0, policy_version 276134 (0.0026) [2024-06-19 04:14:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42327.9, 300 sec: 42043.5). Total num frames: 4524277760. Throughput: 0: 41931.2. Samples: 791850300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:33,381][26367] Avg episode reward: [(0, '0.308')] [2024-06-19 04:14:34,533][26599] Updated weights for policy 0, policy_version 276144 (0.0033) [2024-06-19 04:14:38,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4524490752. Throughput: 0: 42217.8. Samples: 792108840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:38,380][26367] Avg episode reward: [(0, '0.425')] [2024-06-19 04:14:38,460][26599] Updated weights for policy 0, policy_version 276154 (0.0034) [2024-06-19 04:14:42,096][26599] Updated weights for policy 0, policy_version 276164 (0.0028) [2024-06-19 04:14:43,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41503.6, 300 sec: 42098.0). Total num frames: 4524687360. Throughput: 0: 42027.6. Samples: 792359260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:43,385][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 04:14:46,175][26599] Updated weights for policy 0, policy_version 276174 (0.0031) [2024-06-19 04:14:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4524916736. Throughput: 0: 42018.6. Samples: 792482720. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:48,380][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 04:14:50,228][26599] Updated weights for policy 0, policy_version 276184 (0.0037) [2024-06-19 04:14:53,384][26367] Fps is (10 sec: 44236.9, 60 sec: 42049.7, 300 sec: 42153.6). Total num frames: 4525129728. Throughput: 0: 42270.9. Samples: 792740460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:53,384][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 04:14:53,900][26599] Updated weights for policy 0, policy_version 276194 (0.0025) [2024-06-19 04:14:57,724][26599] Updated weights for policy 0, policy_version 276204 (0.0037) [2024-06-19 04:14:58,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4525342720. Throughput: 0: 42135.4. Samples: 792990460. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:14:58,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 04:15:01,720][26599] Updated weights for policy 0, policy_version 276214 (0.0039) [2024-06-19 04:15:03,380][26367] Fps is (10 sec: 40975.1, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4525539328. Throughput: 0: 42058.3. Samples: 793111060. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:15:03,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 04:15:05,413][26599] Updated weights for policy 0, policy_version 276224 (0.0034) [2024-06-19 04:15:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 4525752320. Throughput: 0: 42153.0. Samples: 793370120. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 04:15:08,380][26367] Avg episode reward: [(0, '0.341')] [2024-06-19 04:15:09,525][26599] Updated weights for policy 0, policy_version 276234 (0.0044) [2024-06-19 04:15:13,095][26599] Updated weights for policy 0, policy_version 276244 (0.0034) [2024-06-19 04:15:13,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42154.6). 
Total num frames: 4525981696. Throughput: 0: 41976.4. Samples: 793616880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:13,381][26367] Avg episode reward: [(0, '0.356')] [2024-06-19 04:15:17,238][26599] Updated weights for policy 0, policy_version 276254 (0.0043) [2024-06-19 04:15:18,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42154.6). Total num frames: 4526178304. Throughput: 0: 42217.3. Samples: 793750080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:18,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 04:15:20,253][26579] Signal inference workers to stop experience collection... (11800 times) [2024-06-19 04:15:20,254][26579] Signal inference workers to resume experience collection... (11800 times) [2024-06-19 04:15:20,312][26599] InferenceWorker_p0-w0: stopping experience collection (11800 times) [2024-06-19 04:15:20,312][26599] InferenceWorker_p0-w0: resuming experience collection (11800 times) [2024-06-19 04:15:20,936][26599] Updated weights for policy 0, policy_version 276264 (0.0028) [2024-06-19 04:15:23,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4526407680. Throughput: 0: 42180.4. Samples: 794006960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:23,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 04:15:24,593][26599] Updated weights for policy 0, policy_version 276274 (0.0027) [2024-06-19 04:15:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 4526604288. Throughput: 0: 42346.9. Samples: 794264720. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:28,381][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 04:15:28,693][26599] Updated weights for policy 0, policy_version 276284 (0.0031) [2024-06-19 04:15:32,892][26599] Updated weights for policy 0, policy_version 276294 (0.0032) [2024-06-19 04:15:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4526817280. Throughput: 0: 42204.4. Samples: 794381920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:33,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 04:15:36,403][26599] Updated weights for policy 0, policy_version 276304 (0.0048) [2024-06-19 04:15:38,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.2, 300 sec: 42210.1). Total num frames: 4527046656. Throughput: 0: 42128.6. Samples: 794636100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:38,381][26367] Avg episode reward: [(0, '0.375')] [2024-06-19 04:15:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000276309_4527046656.pth... [2024-06-19 04:15:38,461][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000275692_4516937728.pth [2024-06-19 04:15:40,474][26599] Updated weights for policy 0, policy_version 276314 (0.0039) [2024-06-19 04:15:43,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42054.9, 300 sec: 41987.5). Total num frames: 4527210496. Throughput: 0: 42308.6. Samples: 794894340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:43,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 04:15:44,652][26599] Updated weights for policy 0, policy_version 276324 (0.0031) [2024-06-19 04:15:48,117][26599] Updated weights for policy 0, policy_version 276334 (0.0039) [2024-06-19 04:15:48,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4527456256. Throughput: 0: 42333.7. Samples: 795016080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:48,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 04:15:52,354][26599] Updated weights for policy 0, policy_version 276344 (0.0040) [2024-06-19 04:15:53,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42054.8, 300 sec: 42098.6). Total num frames: 4527652864. Throughput: 0: 42371.4. Samples: 795276840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:53,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 04:15:55,786][26599] Updated weights for policy 0, policy_version 276354 (0.0041) [2024-06-19 04:15:58,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41779.3, 300 sec: 42099.1). Total num frames: 4527849472. Throughput: 0: 42591.3. Samples: 795533480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:15:58,380][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 04:16:00,092][26599] Updated weights for policy 0, policy_version 276364 (0.0040) [2024-06-19 04:16:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4528095232. Throughput: 0: 42298.2. Samples: 795653500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:16:03,381][26367] Avg episode reward: [(0, '0.745')] [2024-06-19 04:16:03,473][26599] Updated weights for policy 0, policy_version 276374 (0.0036) [2024-06-19 04:16:07,698][26599] Updated weights for policy 0, policy_version 276384 (0.0032) [2024-06-19 04:16:08,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42154.6). Total num frames: 4528308224. Throughput: 0: 42354.1. Samples: 795912900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:16:08,381][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 04:16:11,385][26599] Updated weights for policy 0, policy_version 276394 (0.0044) [2024-06-19 04:16:13,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.4, 300 sec: 42098.6). Total num frames: 4528488448. Throughput: 0: 42281.5. Samples: 796167380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:16:13,380][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 04:16:15,389][26599] Updated weights for policy 0, policy_version 276404 (0.0040) [2024-06-19 04:16:18,384][26367] Fps is (10 sec: 42583.5, 60 sec: 42595.8, 300 sec: 42153.6). Total num frames: 4528734208. Throughput: 0: 42320.1. Samples: 796286480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:16:18,385][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 04:16:19,185][26599] Updated weights for policy 0, policy_version 276414 (0.0043) [2024-06-19 04:16:23,150][26599] Updated weights for policy 0, policy_version 276424 (0.0052) [2024-06-19 04:16:23,384][26367] Fps is (10 sec: 45858.3, 60 sec: 42322.7, 300 sec: 42209.1). Total num frames: 4528947200. Throughput: 0: 42444.7. Samples: 796546260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:16:23,384][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 04:16:26,903][26599] Updated weights for policy 0, policy_version 276434 (0.0041) [2024-06-19 04:16:28,380][26367] Fps is (10 sec: 37696.7, 60 sec: 41779.2, 300 sec: 41987.5). Total num frames: 4529111040. Throughput: 0: 42518.5. Samples: 796807680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:16:28,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 04:16:30,089][26579] Signal inference workers to stop experience collection... (11850 times) [2024-06-19 04:16:30,091][26579] Signal inference workers to resume experience collection... (11850 times) [2024-06-19 04:16:30,121][26599] InferenceWorker_p0-w0: stopping experience collection (11850 times) [2024-06-19 04:16:30,122][26599] InferenceWorker_p0-w0: resuming experience collection (11850 times) [2024-06-19 04:16:30,751][26599] Updated weights for policy 0, policy_version 276444 (0.0030) [2024-06-19 04:16:33,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4529373184. Throughput: 0: 42395.6. 
Samples: 796923880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:16:33,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 04:16:34,455][26599] Updated weights for policy 0, policy_version 276454 (0.0029) [2024-06-19 04:16:38,380][26367] Fps is (10 sec: 44237.4, 60 sec: 41779.4, 300 sec: 42209.6). Total num frames: 4529553408. Throughput: 0: 42397.0. Samples: 797184700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:16:38,380][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 04:16:38,820][26599] Updated weights for policy 0, policy_version 276464 (0.0037) [2024-06-19 04:16:42,151][26599] Updated weights for policy 0, policy_version 276474 (0.0027) [2024-06-19 04:16:43,380][26367] Fps is (10 sec: 37682.9, 60 sec: 42325.2, 300 sec: 42043.0). Total num frames: 4529750016. Throughput: 0: 42315.4. Samples: 797437680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:16:43,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 04:16:46,395][26599] Updated weights for policy 0, policy_version 276484 (0.0046) [2024-06-19 04:16:48,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4530012160. Throughput: 0: 42452.0. Samples: 797563840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:16:48,381][26367] Avg episode reward: [(0, '0.831')] [2024-06-19 04:16:50,416][26599] Updated weights for policy 0, policy_version 276494 (0.0023) [2024-06-19 04:16:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4530192384. Throughput: 0: 42449.4. Samples: 797823120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:16:53,381][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 04:16:54,110][26599] Updated weights for policy 0, policy_version 276504 (0.0043) [2024-06-19 04:16:58,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42098.6). Total num frames: 4530388992. Throughput: 0: 42424.9. 
Samples: 798076500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:16:58,380][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 04:16:58,422][26599] Updated weights for policy 0, policy_version 276514 (0.0043) [2024-06-19 04:17:01,810][26599] Updated weights for policy 0, policy_version 276524 (0.0039) [2024-06-19 04:17:03,380][26367] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 4530667520. Throughput: 0: 42553.7. Samples: 798201240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:17:03,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 04:17:06,127][26599] Updated weights for policy 0, policy_version 276534 (0.0030) [2024-06-19 04:17:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 4530814976. Throughput: 0: 42403.0. Samples: 798454240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:17:08,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 04:17:09,572][26599] Updated weights for policy 0, policy_version 276544 (0.0040) [2024-06-19 04:17:13,380][26367] Fps is (10 sec: 36044.9, 60 sec: 42325.4, 300 sec: 42099.1). Total num frames: 4531027968. Throughput: 0: 42164.6. Samples: 798705080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:17:13,380][26367] Avg episode reward: [(0, '0.399')] [2024-06-19 04:17:13,844][26599] Updated weights for policy 0, policy_version 276554 (0.0033) [2024-06-19 04:17:17,282][26599] Updated weights for policy 0, policy_version 276564 (0.0049) [2024-06-19 04:17:18,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42601.0, 300 sec: 42209.6). Total num frames: 4531290112. Throughput: 0: 42488.4. Samples: 798835860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:17:18,381][26367] Avg episode reward: [(0, '0.188')] [2024-06-19 04:17:21,499][26599] Updated weights for policy 0, policy_version 276574 (0.0033) [2024-06-19 04:17:23,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41781.7, 300 sec: 42154.1). Total num frames: 4531453952. Throughput: 0: 42327.9. Samples: 799089460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:17:23,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 04:17:24,900][26599] Updated weights for policy 0, policy_version 276584 (0.0052) [2024-06-19 04:17:28,384][26367] Fps is (10 sec: 39307.1, 60 sec: 42868.9, 300 sec: 42153.6). Total num frames: 4531683328. Throughput: 0: 42073.5. Samples: 799331140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:17:28,385][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 04:17:29,714][26599] Updated weights for policy 0, policy_version 276594 (0.0035) [2024-06-19 04:17:32,868][26599] Updated weights for policy 0, policy_version 276604 (0.0037) [2024-06-19 04:17:33,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4531896320. Throughput: 0: 42177.4. Samples: 799461820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 24.0) [2024-06-19 04:17:33,380][26367] Avg episode reward: [(0, '0.392')] [2024-06-19 04:17:37,496][26599] Updated weights for policy 0, policy_version 276614 (0.0028) [2024-06-19 04:17:38,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42325.2, 300 sec: 42098.5). Total num frames: 4532092928. Throughput: 0: 42077.7. Samples: 799716620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:17:38,381][26367] Avg episode reward: [(0, '0.429')] [2024-06-19 04:17:38,403][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000276617_4532092928.pth... 
[2024-06-19 04:17:38,458][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000276000_4521984000.pth [2024-06-19 04:17:40,218][26579] Signal inference workers to stop experience collection... (11900 times) [2024-06-19 04:17:40,240][26599] InferenceWorker_p0-w0: stopping experience collection (11900 times) [2024-06-19 04:17:40,277][26579] Signal inference workers to resume experience collection... (11900 times) [2024-06-19 04:17:40,277][26599] InferenceWorker_p0-w0: resuming experience collection (11900 times) [2024-06-19 04:17:40,424][26599] Updated weights for policy 0, policy_version 276624 (0.0030) [2024-06-19 04:17:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43144.6, 300 sec: 42154.1). Total num frames: 4532338688. Throughput: 0: 41848.4. Samples: 799959680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:17:43,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 04:17:45,106][26599] Updated weights for policy 0, policy_version 276634 (0.0038) [2024-06-19 04:17:47,969][26599] Updated weights for policy 0, policy_version 276644 (0.0048) [2024-06-19 04:17:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4532535296. Throughput: 0: 42221.6. Samples: 800101220. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:17:48,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 04:17:52,777][26599] Updated weights for policy 0, policy_version 276654 (0.0040) [2024-06-19 04:17:53,380][26367] Fps is (10 sec: 37682.9, 60 sec: 42052.2, 300 sec: 42098.6). Total num frames: 4532715520. Throughput: 0: 42179.4. Samples: 800352320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:17:53,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 04:17:55,713][26599] Updated weights for policy 0, policy_version 276664 (0.0043) [2024-06-19 04:17:58,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42154.1). Total num frames: 4532961280. 
Throughput: 0: 42101.3. Samples: 800599640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:17:58,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 04:18:00,583][26599] Updated weights for policy 0, policy_version 276674 (0.0045) [2024-06-19 04:18:03,380][26367] Fps is (10 sec: 44237.7, 60 sec: 41506.2, 300 sec: 42209.7). Total num frames: 4533157888. Throughput: 0: 42262.7. Samples: 800737680. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:18:03,380][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 04:18:03,944][26599] Updated weights for policy 0, policy_version 276684 (0.0044) [2024-06-19 04:18:08,239][26599] Updated weights for policy 0, policy_version 276694 (0.0030) [2024-06-19 04:18:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4533354496. Throughput: 0: 42040.9. Samples: 800981300. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:18:08,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 04:18:12,044][26599] Updated weights for policy 0, policy_version 276704 (0.0038) [2024-06-19 04:18:13,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 4533600256. Throughput: 0: 42147.1. Samples: 801227600. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:18:13,380][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 04:18:15,832][26599] Updated weights for policy 0, policy_version 276714 (0.0040) [2024-06-19 04:18:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41233.1, 300 sec: 42098.6). Total num frames: 4533764096. Throughput: 0: 42167.5. Samples: 801359360. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:18:18,380][26367] Avg episode reward: [(0, '0.406')] [2024-06-19 04:18:19,651][26599] Updated weights for policy 0, policy_version 276724 (0.0039) [2024-06-19 04:18:23,364][26599] Updated weights for policy 0, policy_version 276734 (0.0049) [2024-06-19 04:18:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4534009856. Throughput: 0: 41954.8. Samples: 801604580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:18:23,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 04:18:27,532][26599] Updated weights for policy 0, policy_version 276744 (0.0040) [2024-06-19 04:18:28,380][26367] Fps is (10 sec: 45874.3, 60 sec: 42327.8, 300 sec: 42321.2). Total num frames: 4534222848. Throughput: 0: 42316.8. Samples: 801863940. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:18:28,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 04:18:31,510][26599] Updated weights for policy 0, policy_version 276754 (0.0037) [2024-06-19 04:18:33,383][26367] Fps is (10 sec: 40947.4, 60 sec: 42050.1, 300 sec: 42209.2). Total num frames: 4534419456. Throughput: 0: 42004.3. Samples: 801991540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:18:33,384][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 04:18:35,321][26599] Updated weights for policy 0, policy_version 276764 (0.0033) [2024-06-19 04:18:38,384][26367] Fps is (10 sec: 40945.4, 60 sec: 42322.8, 300 sec: 42153.6). Total num frames: 4534632448. Throughput: 0: 41902.4. Samples: 802238080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:18:38,385][26367] Avg episode reward: [(0, '0.371')] [2024-06-19 04:18:39,141][26599] Updated weights for policy 0, policy_version 276774 (0.0036) [2024-06-19 04:18:43,380][26367] Fps is (10 sec: 39333.7, 60 sec: 41233.1, 300 sec: 42154.1). Total num frames: 4534812672. Throughput: 0: 42107.1. Samples: 802494460. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 04:18:43,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 04:18:43,390][26599] Updated weights for policy 0, policy_version 276784 (0.0039) [2024-06-19 04:18:46,909][26599] Updated weights for policy 0, policy_version 276794 (0.0039) [2024-06-19 04:18:48,380][26367] Fps is (10 sec: 40975.6, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 4535042048. Throughput: 0: 41754.2. Samples: 802616620. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:18:48,380][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 04:18:51,071][26599] Updated weights for policy 0, policy_version 276804 (0.0026) [2024-06-19 04:18:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4535255040. Throughput: 0: 41971.1. Samples: 802870000. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:18:53,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 04:18:54,671][26599] Updated weights for policy 0, policy_version 276814 (0.0027) [2024-06-19 04:18:55,743][26579] Signal inference workers to stop experience collection... (11950 times) [2024-06-19 04:18:55,744][26579] Signal inference workers to resume experience collection... (11950 times) [2024-06-19 04:18:55,796][26599] InferenceWorker_p0-w0: stopping experience collection (11950 times) [2024-06-19 04:18:55,796][26599] InferenceWorker_p0-w0: resuming experience collection (11950 times) [2024-06-19 04:18:58,380][26367] Fps is (10 sec: 42597.6, 60 sec: 41779.1, 300 sec: 42265.1). Total num frames: 4535468032. Throughput: 0: 42236.7. Samples: 803128260. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:18:58,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 04:18:58,773][26599] Updated weights for policy 0, policy_version 276824 (0.0034) [2024-06-19 04:19:02,112][26599] Updated weights for policy 0, policy_version 276834 (0.0023) [2024-06-19 04:19:03,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4535697408. Throughput: 0: 42141.8. Samples: 803255740. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:03,380][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 04:19:06,528][26599] Updated weights for policy 0, policy_version 276844 (0.0036) [2024-06-19 04:19:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4535894016. Throughput: 0: 42277.8. Samples: 803507080. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:08,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 04:19:09,974][26599] Updated weights for policy 0, policy_version 276854 (0.0047) [2024-06-19 04:19:13,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 4536090624. Throughput: 0: 42150.4. Samples: 803760700. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:13,380][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 04:19:14,214][26599] Updated weights for policy 0, policy_version 276864 (0.0026) [2024-06-19 04:19:17,451][26599] Updated weights for policy 0, policy_version 276874 (0.0034) [2024-06-19 04:19:18,384][26367] Fps is (10 sec: 44221.3, 60 sec: 42868.9, 300 sec: 42264.7). Total num frames: 4536336384. Throughput: 0: 42193.4. Samples: 803890260. 
Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:18,384][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 04:19:21,947][26599] Updated weights for policy 0, policy_version 276884 (0.0027) [2024-06-19 04:19:23,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42209.7). Total num frames: 4536516608. Throughput: 0: 42348.1. Samples: 804143580. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:23,380][26367] Avg episode reward: [(0, '0.839')] [2024-06-19 04:19:25,032][26599] Updated weights for policy 0, policy_version 276894 (0.0037) [2024-06-19 04:19:28,384][26367] Fps is (10 sec: 39321.2, 60 sec: 41776.8, 300 sec: 42209.1). Total num frames: 4536729600. Throughput: 0: 42284.2. Samples: 804397400. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:28,384][26367] Avg episode reward: [(0, '0.839')] [2024-06-19 04:19:29,748][26599] Updated weights for policy 0, policy_version 276904 (0.0048) [2024-06-19 04:19:32,850][26599] Updated weights for policy 0, policy_version 276914 (0.0041) [2024-06-19 04:19:33,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42600.6, 300 sec: 42320.7). Total num frames: 4536975360. Throughput: 0: 42444.8. Samples: 804526640. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:33,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 04:19:37,507][26599] Updated weights for policy 0, policy_version 276924 (0.0028) [2024-06-19 04:19:38,380][26367] Fps is (10 sec: 40974.9, 60 sec: 41781.8, 300 sec: 42210.2). Total num frames: 4537139200. Throughput: 0: 42361.0. Samples: 804776240. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:38,380][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 04:19:38,421][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000276926_4537155584.pth... 
[2024-06-19 04:19:38,493][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000276309_4527046656.pth [2024-06-19 04:19:40,767][26599] Updated weights for policy 0, policy_version 276934 (0.0030) [2024-06-19 04:19:43,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4537368576. Throughput: 0: 42143.5. Samples: 805024720. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:43,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 04:19:45,319][26599] Updated weights for policy 0, policy_version 276944 (0.0031) [2024-06-19 04:19:48,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42265.7). Total num frames: 4537597952. Throughput: 0: 42280.0. Samples: 805158340. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:48,380][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 04:19:48,514][26599] Updated weights for policy 0, policy_version 276954 (0.0032) [2024-06-19 04:19:53,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4537778176. Throughput: 0: 42358.7. Samples: 805413220. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:53,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 04:19:53,384][26599] Updated weights for policy 0, policy_version 276964 (0.0046) [2024-06-19 04:19:56,091][26599] Updated weights for policy 0, policy_version 276974 (0.0038) [2024-06-19 04:19:58,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4537991168. Throughput: 0: 42321.7. Samples: 805665180. Policy #0 lag: (min: 0.0, avg: 12.6, max: 27.0) [2024-06-19 04:19:58,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 04:20:00,859][26599] Updated weights for policy 0, policy_version 276984 (0.0044) [2024-06-19 04:20:03,381][26367] Fps is (10 sec: 45873.9, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4538236928. Throughput: 0: 42340.0. Samples: 805795420. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:03,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 04:20:03,586][26599] Updated weights for policy 0, policy_version 276994 (0.0027) [2024-06-19 04:20:08,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4538417152. Throughput: 0: 42376.2. Samples: 806050520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:08,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 04:20:08,641][26599] Updated weights for policy 0, policy_version 277004 (0.0038) [2024-06-19 04:20:11,306][26599] Updated weights for policy 0, policy_version 277014 (0.0032) [2024-06-19 04:20:13,380][26367] Fps is (10 sec: 40961.2, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4538646528. Throughput: 0: 42247.4. Samples: 806298380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:13,380][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 04:20:16,168][26599] Updated weights for policy 0, policy_version 277024 (0.0038) [2024-06-19 04:20:18,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42054.7, 300 sec: 42209.6). Total num frames: 4538859520. Throughput: 0: 42311.1. Samples: 806430640. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:18,381][26367] Avg episode reward: [(0, '0.194')] [2024-06-19 04:20:18,982][26599] Updated weights for policy 0, policy_version 277034 (0.0041) [2024-06-19 04:20:22,870][26579] Signal inference workers to stop experience collection... (12000 times) [2024-06-19 04:20:22,876][26579] Signal inference workers to resume experience collection... (12000 times) [2024-06-19 04:20:22,913][26599] InferenceWorker_p0-w0: stopping experience collection (12000 times) [2024-06-19 04:20:22,914][26599] InferenceWorker_p0-w0: resuming experience collection (12000 times) [2024-06-19 04:20:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4539056128. Throughput: 0: 42274.6. 
Samples: 806678600. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:23,383][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 04:20:24,193][26599] Updated weights for policy 0, policy_version 277044 (0.0029) [2024-06-19 04:20:26,937][26599] Updated weights for policy 0, policy_version 277054 (0.0030) [2024-06-19 04:20:28,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42327.8, 300 sec: 42209.6). Total num frames: 4539269120. Throughput: 0: 42303.1. Samples: 806928360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:28,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 04:20:31,730][26599] Updated weights for policy 0, policy_version 277064 (0.0030) [2024-06-19 04:20:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4539498496. Throughput: 0: 42340.3. Samples: 807063660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:33,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 04:20:34,814][26599] Updated weights for policy 0, policy_version 277074 (0.0033) [2024-06-19 04:20:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 4539678720. Throughput: 0: 42250.0. Samples: 807314480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:38,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 04:20:39,497][26599] Updated weights for policy 0, policy_version 277084 (0.0042) [2024-06-19 04:20:42,316][26599] Updated weights for policy 0, policy_version 277094 (0.0034) [2024-06-19 04:20:43,384][26367] Fps is (10 sec: 42583.6, 60 sec: 42595.9, 300 sec: 42264.7). Total num frames: 4539924480. Throughput: 0: 42065.1. Samples: 807558260. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:43,384][26367] Avg episode reward: [(0, '0.785')] [2024-06-19 04:20:47,481][26599] Updated weights for policy 0, policy_version 277104 (0.0042) [2024-06-19 04:20:48,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4540137472. Throughput: 0: 42327.3. Samples: 807700140. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:48,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 04:20:50,004][26599] Updated weights for policy 0, policy_version 277114 (0.0032) [2024-06-19 04:20:53,380][26367] Fps is (10 sec: 39335.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4540317696. Throughput: 0: 42252.1. Samples: 807951860. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:53,381][26367] Avg episode reward: [(0, '0.261')] [2024-06-19 04:20:55,009][26599] Updated weights for policy 0, policy_version 277124 (0.0034) [2024-06-19 04:20:58,058][26599] Updated weights for policy 0, policy_version 277134 (0.0041) [2024-06-19 04:20:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 4540563456. Throughput: 0: 42298.6. Samples: 808201820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:20:58,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 04:21:02,711][26599] Updated weights for policy 0, policy_version 277144 (0.0033) [2024-06-19 04:21:03,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4540760064. Throughput: 0: 42295.9. Samples: 808333960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:21:03,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 04:21:05,804][26599] Updated weights for policy 0, policy_version 277154 (0.0036) [2024-06-19 04:21:08,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4540956672. Throughput: 0: 42270.7. Samples: 808580780. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 25.0) [2024-06-19 04:21:08,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 04:21:10,462][26599] Updated weights for policy 0, policy_version 277164 (0.0035) [2024-06-19 04:21:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42265.7). Total num frames: 4541202432. Throughput: 0: 42345.8. Samples: 808833920. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:13,381][26367] Avg episode reward: [(0, '0.369')] [2024-06-19 04:21:13,605][26599] Updated weights for policy 0, policy_version 277174 (0.0034) [2024-06-19 04:21:18,087][26599] Updated weights for policy 0, policy_version 277184 (0.0048) [2024-06-19 04:21:18,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42154.6). Total num frames: 4541382656. Throughput: 0: 42322.8. Samples: 808968180. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:18,381][26367] Avg episode reward: [(0, '0.411')] [2024-06-19 04:21:21,215][26599] Updated weights for policy 0, policy_version 277194 (0.0034) [2024-06-19 04:21:23,380][26367] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4541579264. Throughput: 0: 42308.1. Samples: 809218340. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:23,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 04:21:25,714][26599] Updated weights for policy 0, policy_version 277204 (0.0028) [2024-06-19 04:21:28,380][26367] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 42320.7). Total num frames: 4541857792. Throughput: 0: 42635.8. Samples: 809476720. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:28,381][26367] Avg episode reward: [(0, '0.488')] [2024-06-19 04:21:28,802][26599] Updated weights for policy 0, policy_version 277214 (0.0041) [2024-06-19 04:21:32,998][26579] Signal inference workers to stop experience collection... 
(12050 times) [2024-06-19 04:21:32,998][26579] Signal inference workers to resume experience collection... (12050 times) [2024-06-19 04:21:33,028][26599] InferenceWorker_p0-w0: stopping experience collection (12050 times) [2024-06-19 04:21:33,028][26599] InferenceWorker_p0-w0: resuming experience collection (12050 times) [2024-06-19 04:21:33,353][26599] Updated weights for policy 0, policy_version 277224 (0.0027) [2024-06-19 04:21:33,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4542038016. Throughput: 0: 42480.0. Samples: 809611740. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:33,381][26367] Avg episode reward: [(0, '0.279')] [2024-06-19 04:21:36,489][26599] Updated weights for policy 0, policy_version 277234 (0.0046) [2024-06-19 04:21:38,380][26367] Fps is (10 sec: 36044.5, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4542218240. Throughput: 0: 42335.4. Samples: 809856960. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:38,385][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 04:21:38,458][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000277236_4542234624.pth... [2024-06-19 04:21:38,509][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000276617_4532092928.pth [2024-06-19 04:21:40,970][26599] Updated weights for policy 0, policy_version 277244 (0.0033) [2024-06-19 04:21:43,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42874.1, 300 sec: 42320.7). Total num frames: 4542496768. Throughput: 0: 42466.3. Samples: 810112800. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:43,380][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 04:21:44,143][26599] Updated weights for policy 0, policy_version 277254 (0.0036) [2024-06-19 04:21:48,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4542676992. Throughput: 0: 42592.6. Samples: 810250620. 
Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:48,380][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 04:21:48,479][26599] Updated weights for policy 0, policy_version 277264 (0.0034) [2024-06-19 04:21:51,712][26599] Updated weights for policy 0, policy_version 277274 (0.0036) [2024-06-19 04:21:53,380][26367] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4542873600. Throughput: 0: 42543.6. Samples: 810495240. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:53,381][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 04:21:56,418][26599] Updated weights for policy 0, policy_version 277284 (0.0035) [2024-06-19 04:21:58,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4543119360. Throughput: 0: 42615.2. Samples: 810751600. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:21:58,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 04:21:59,380][26599] Updated weights for policy 0, policy_version 277294 (0.0038) [2024-06-19 04:22:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4543299584. Throughput: 0: 42698.1. Samples: 810889600. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:22:03,381][26367] Avg episode reward: [(0, '0.380')] [2024-06-19 04:22:04,257][26599] Updated weights for policy 0, policy_version 277304 (0.0038) [2024-06-19 04:22:07,032][26599] Updated weights for policy 0, policy_version 277314 (0.0039) [2024-06-19 04:22:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4543528960. Throughput: 0: 42568.4. Samples: 811133920. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:22:08,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 04:22:11,991][26599] Updated weights for policy 0, policy_version 277324 (0.0033) [2024-06-19 04:22:13,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42265.1). 
Total num frames: 4543758336. Throughput: 0: 42481.7. Samples: 811388400. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:22:13,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 04:22:14,797][26599] Updated weights for policy 0, policy_version 277334 (0.0039) [2024-06-19 04:22:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4543922176. Throughput: 0: 42402.3. Samples: 811519840. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:22:18,380][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 04:22:19,634][26599] Updated weights for policy 0, policy_version 277344 (0.0044) [2024-06-19 04:22:22,392][26599] Updated weights for policy 0, policy_version 277354 (0.0028) [2024-06-19 04:22:23,384][26367] Fps is (10 sec: 40945.6, 60 sec: 43141.9, 300 sec: 42320.7). Total num frames: 4544167936. Throughput: 0: 42424.2. Samples: 811766200. Policy #0 lag: (min: 2.0, avg: 9.8, max: 25.0) [2024-06-19 04:22:23,384][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 04:22:27,682][26599] Updated weights for policy 0, policy_version 277364 (0.0037) [2024-06-19 04:22:28,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4544380928. Throughput: 0: 42635.1. Samples: 812031380. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:22:28,380][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 04:22:30,124][26599] Updated weights for policy 0, policy_version 277374 (0.0032) [2024-06-19 04:22:33,380][26367] Fps is (10 sec: 39336.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4544561152. Throughput: 0: 42297.3. Samples: 812154000. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:22:33,380][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 04:22:35,176][26599] Updated weights for policy 0, policy_version 277384 (0.0040) [2024-06-19 04:22:38,238][26599] Updated weights for policy 0, policy_version 277394 (0.0034) [2024-06-19 04:22:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43417.7, 300 sec: 42320.7). Total num frames: 4544823296. Throughput: 0: 42396.4. Samples: 812403080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:22:38,384][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 04:22:42,871][26599] Updated weights for policy 0, policy_version 277404 (0.0027) [2024-06-19 04:22:43,380][26367] Fps is (10 sec: 44236.1, 60 sec: 41779.1, 300 sec: 42265.2). Total num frames: 4545003520. Throughput: 0: 42542.1. Samples: 812666000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:22:43,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 04:22:45,073][26579] Signal inference workers to stop experience collection... (12100 times) [2024-06-19 04:22:45,073][26579] Signal inference workers to resume experience collection... (12100 times) [2024-06-19 04:22:45,107][26599] InferenceWorker_p0-w0: stopping experience collection (12100 times) [2024-06-19 04:22:45,108][26599] InferenceWorker_p0-w0: resuming experience collection (12100 times) [2024-06-19 04:22:45,817][26599] Updated weights for policy 0, policy_version 277414 (0.0046) [2024-06-19 04:22:48,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4545200128. Throughput: 0: 42108.1. Samples: 812784460. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:22:48,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 04:22:50,518][26599] Updated weights for policy 0, policy_version 277424 (0.0034) [2024-06-19 04:22:53,380][26367] Fps is (10 sec: 45875.2, 60 sec: 43144.5, 300 sec: 42376.2). Total num frames: 4545462272. Throughput: 0: 42380.4. 
Samples: 813041040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:22:53,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 04:22:53,833][26599] Updated weights for policy 0, policy_version 277434 (0.0040) [2024-06-19 04:22:58,132][26599] Updated weights for policy 0, policy_version 277444 (0.0024) [2024-06-19 04:22:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4545642496. Throughput: 0: 42514.4. Samples: 813301540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:22:58,381][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 04:23:01,243][26599] Updated weights for policy 0, policy_version 277454 (0.0030) [2024-06-19 04:23:03,380][26367] Fps is (10 sec: 37683.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4545839104. Throughput: 0: 42185.8. Samples: 813418200. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:23:03,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 04:23:06,089][26599] Updated weights for policy 0, policy_version 277464 (0.0045) [2024-06-19 04:23:08,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4546084864. Throughput: 0: 42551.3. Samples: 813680860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:23:08,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 04:23:08,990][26599] Updated weights for policy 0, policy_version 277474 (0.0029) [2024-06-19 04:23:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.3, 300 sec: 42376.2). Total num frames: 4546265088. Throughput: 0: 42332.8. Samples: 813936360. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:23:13,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 04:23:13,893][26599] Updated weights for policy 0, policy_version 277484 (0.0038) [2024-06-19 04:23:16,838][26599] Updated weights for policy 0, policy_version 277494 (0.0044) [2024-06-19 04:23:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 4546478080. Throughput: 0: 42344.3. Samples: 814059500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:23:18,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 04:23:21,387][26599] Updated weights for policy 0, policy_version 277504 (0.0038) [2024-06-19 04:23:23,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42327.9, 300 sec: 42320.7). Total num frames: 4546707456. Throughput: 0: 42504.0. Samples: 814315760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:23:23,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 04:23:24,640][26599] Updated weights for policy 0, policy_version 277514 (0.0032) [2024-06-19 04:23:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42321.1). Total num frames: 4546904064. Throughput: 0: 42407.2. Samples: 814574320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:23:28,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 04:23:28,991][26599] Updated weights for policy 0, policy_version 277524 (0.0032) [2024-06-19 04:23:32,399][26599] Updated weights for policy 0, policy_version 277534 (0.0046) [2024-06-19 04:23:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42376.8). Total num frames: 4547133440. Throughput: 0: 42539.5. Samples: 814698740. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 04:23:33,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 04:23:37,069][26599] Updated weights for policy 0, policy_version 277544 (0.0040) [2024-06-19 04:23:38,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 4547330048. Throughput: 0: 42532.9. Samples: 814955020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:23:38,381][26367] Avg episode reward: [(0, '0.758')] [2024-06-19 04:23:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000277547_4547330048.pth... [2024-06-19 04:23:38,478][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000276926_4537155584.pth [2024-06-19 04:23:40,152][26599] Updated weights for policy 0, policy_version 277554 (0.0043) [2024-06-19 04:23:43,385][26367] Fps is (10 sec: 40940.8, 60 sec: 42322.1, 300 sec: 42375.5). Total num frames: 4547543040. Throughput: 0: 42396.8. Samples: 815209600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:23:43,386][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 04:23:44,795][26599] Updated weights for policy 0, policy_version 277564 (0.0044) [2024-06-19 04:23:47,971][26599] Updated weights for policy 0, policy_version 277574 (0.0029) [2024-06-19 04:23:48,380][26367] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 4547788800. Throughput: 0: 42787.0. Samples: 815343620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:23:48,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 04:23:52,290][26599] Updated weights for policy 0, policy_version 277584 (0.0040) [2024-06-19 04:23:53,380][26367] Fps is (10 sec: 42618.7, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 4547969024. Throughput: 0: 42527.3. Samples: 815594580. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:23:53,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 04:23:55,705][26599] Updated weights for policy 0, policy_version 277594 (0.0033) [2024-06-19 04:23:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4548198400. Throughput: 0: 42565.3. Samples: 815851800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:23:58,381][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 04:23:59,865][26599] Updated weights for policy 0, policy_version 277604 (0.0028) [2024-06-19 04:24:03,266][26599] Updated weights for policy 0, policy_version 277614 (0.0025) [2024-06-19 04:24:03,380][26367] Fps is (10 sec: 45874.6, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 4548427776. Throughput: 0: 42644.9. Samples: 815978520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:24:03,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 04:24:07,614][26599] Updated weights for policy 0, policy_version 277624 (0.0045) [2024-06-19 04:24:08,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 42376.2). Total num frames: 4548591616. Throughput: 0: 42553.8. Samples: 816230680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:24:08,380][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 04:24:10,797][26599] Updated weights for policy 0, policy_version 277634 (0.0031) [2024-06-19 04:24:13,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42321.2). Total num frames: 4548820992. Throughput: 0: 42538.2. Samples: 816488540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:24:13,381][26367] Avg episode reward: [(0, '0.278')] [2024-06-19 04:24:15,022][26579] Signal inference workers to stop experience collection... (12150 times) [2024-06-19 04:24:15,026][26579] Signal inference workers to resume experience collection... 
(12150 times) [2024-06-19 04:24:15,051][26599] InferenceWorker_p0-w0: stopping experience collection (12150 times) [2024-06-19 04:24:15,051][26599] InferenceWorker_p0-w0: resuming experience collection (12150 times) [2024-06-19 04:24:15,174][26599] Updated weights for policy 0, policy_version 277644 (0.0036) [2024-06-19 04:24:18,380][26367] Fps is (10 sec: 47513.6, 60 sec: 43144.7, 300 sec: 42542.8). Total num frames: 4549066752. Throughput: 0: 42696.5. Samples: 816620080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:24:18,381][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 04:24:18,434][26599] Updated weights for policy 0, policy_version 277654 (0.0033) [2024-06-19 04:24:22,819][26599] Updated weights for policy 0, policy_version 277664 (0.0030) [2024-06-19 04:24:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42432.3). Total num frames: 4549246976. Throughput: 0: 42424.5. Samples: 816864120. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:24:23,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 04:24:26,982][26599] Updated weights for policy 0, policy_version 277674 (0.0027) [2024-06-19 04:24:28,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4549459968. Throughput: 0: 42538.8. Samples: 817123640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:24:28,380][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 04:24:30,509][26599] Updated weights for policy 0, policy_version 277684 (0.0039) [2024-06-19 04:24:33,384][26367] Fps is (10 sec: 44220.9, 60 sec: 42595.9, 300 sec: 42542.3). Total num frames: 4549689344. Throughput: 0: 42315.7. Samples: 817247980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:24:33,385][26367] Avg episode reward: [(0, '0.429')] [2024-06-19 04:24:34,921][26599] Updated weights for policy 0, policy_version 277694 (0.0052) [2024-06-19 04:24:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4549885952. Throughput: 0: 42319.6. Samples: 817498960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:24:38,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 04:24:38,410][26599] Updated weights for policy 0, policy_version 277704 (0.0031) [2024-06-19 04:24:42,855][26599] Updated weights for policy 0, policy_version 277714 (0.0036) [2024-06-19 04:24:43,380][26367] Fps is (10 sec: 39335.7, 60 sec: 42328.6, 300 sec: 42320.7). Total num frames: 4550082560. Throughput: 0: 42179.1. Samples: 817749860. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 04:24:43,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 04:24:46,226][26599] Updated weights for policy 0, policy_version 277724 (0.0041) [2024-06-19 04:24:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4550311936. Throughput: 0: 42052.0. Samples: 817870860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:24:48,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 04:24:50,539][26599] Updated weights for policy 0, policy_version 277734 (0.0028) [2024-06-19 04:24:53,384][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4550524928. Throughput: 0: 42126.6. Samples: 818126380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:24:53,384][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 04:24:53,835][26599] Updated weights for policy 0, policy_version 277744 (0.0037) [2024-06-19 04:24:58,347][26599] Updated weights for policy 0, policy_version 277754 (0.0040) [2024-06-19 04:24:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4550721536. Throughput: 0: 42062.2. Samples: 818381340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:24:58,384][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 04:25:01,474][26599] Updated weights for policy 0, policy_version 277764 (0.0037) [2024-06-19 04:25:03,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.4, 300 sec: 42487.4). Total num frames: 4550950912. Throughput: 0: 41924.9. Samples: 818506700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:03,380][26367] Avg episode reward: [(0, '0.374')] [2024-06-19 04:25:06,401][26599] Updated weights for policy 0, policy_version 277774 (0.0036) [2024-06-19 04:25:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4551147520. Throughput: 0: 42180.0. Samples: 818762220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:08,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 04:25:09,642][26599] Updated weights for policy 0, policy_version 277784 (0.0037) [2024-06-19 04:25:13,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 4551344128. Throughput: 0: 41923.6. Samples: 819010200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:13,380][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 04:25:13,979][26599] Updated weights for policy 0, policy_version 277794 (0.0024) [2024-06-19 04:25:17,348][26599] Updated weights for policy 0, policy_version 277804 (0.0041) [2024-06-19 04:25:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 4551573504. Throughput: 0: 41979.0. Samples: 819136880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:18,381][26367] Avg episode reward: [(0, '0.324')] [2024-06-19 04:25:21,595][26599] Updated weights for policy 0, policy_version 277814 (0.0030) [2024-06-19 04:25:23,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 4551786496. Throughput: 0: 42158.3. Samples: 819396080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:23,380][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 04:25:25,352][26599] Updated weights for policy 0, policy_version 277824 (0.0034) [2024-06-19 04:25:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4551983104. Throughput: 0: 42124.1. Samples: 819645440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:28,380][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 04:25:29,189][26599] Updated weights for policy 0, policy_version 277834 (0.0045) [2024-06-19 04:25:30,628][26579] Signal inference workers to stop experience collection... (12200 times) [2024-06-19 04:25:30,629][26579] Signal inference workers to resume experience collection... 
(12200 times) [2024-06-19 04:25:30,672][26599] InferenceWorker_p0-w0: stopping experience collection (12200 times) [2024-06-19 04:25:30,672][26599] InferenceWorker_p0-w0: resuming experience collection (12200 times) [2024-06-19 04:25:32,989][26599] Updated weights for policy 0, policy_version 277844 (0.0032) [2024-06-19 04:25:33,384][26367] Fps is (10 sec: 42582.4, 60 sec: 42052.3, 300 sec: 42486.8). Total num frames: 4552212480. Throughput: 0: 42184.7. Samples: 819769320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:33,384][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 04:25:36,800][26599] Updated weights for policy 0, policy_version 277854 (0.0031) [2024-06-19 04:25:38,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42325.2, 300 sec: 42376.7). Total num frames: 4552425472. Throughput: 0: 42206.1. Samples: 820025660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:38,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 04:25:38,530][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000277859_4552441856.pth... [2024-06-19 04:25:38,580][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000277236_4542234624.pth [2024-06-19 04:25:40,684][26599] Updated weights for policy 0, policy_version 277864 (0.0038) [2024-06-19 04:25:43,380][26367] Fps is (10 sec: 40974.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4552622080. Throughput: 0: 42122.2. Samples: 820276840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:43,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 04:25:45,148][26599] Updated weights for policy 0, policy_version 277874 (0.0051) [2024-06-19 04:25:48,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 4552835072. Throughput: 0: 42081.8. Samples: 820400380. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:48,380][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 04:25:48,525][26599] Updated weights for policy 0, policy_version 277884 (0.0035) [2024-06-19 04:25:52,679][26599] Updated weights for policy 0, policy_version 277894 (0.0033) [2024-06-19 04:25:53,382][26367] Fps is (10 sec: 42591.4, 60 sec: 42051.1, 300 sec: 42320.5). Total num frames: 4553048064. Throughput: 0: 42081.1. Samples: 820655940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:53,382][26367] Avg episode reward: [(0, '0.853')] [2024-06-19 04:25:56,286][26599] Updated weights for policy 0, policy_version 277904 (0.0029) [2024-06-19 04:25:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4553261056. Throughput: 0: 42254.1. Samples: 820911640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:25:58,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 04:26:00,186][26599] Updated weights for policy 0, policy_version 277914 (0.0040) [2024-06-19 04:26:03,380][26367] Fps is (10 sec: 42605.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 4553474048. Throughput: 0: 42192.9. Samples: 821035560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:03,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 04:26:03,963][26599] Updated weights for policy 0, policy_version 277924 (0.0046) [2024-06-19 04:26:08,195][26599] Updated weights for policy 0, policy_version 277934 (0.0038) [2024-06-19 04:26:08,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4553670656. Throughput: 0: 42086.9. Samples: 821290000. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:08,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 04:26:11,736][26599] Updated weights for policy 0, policy_version 277944 (0.0044) [2024-06-19 04:26:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4553883648. Throughput: 0: 42030.2. Samples: 821536800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:13,381][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 04:26:16,309][26599] Updated weights for policy 0, policy_version 277954 (0.0037) [2024-06-19 04:26:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4554096640. Throughput: 0: 42246.1. Samples: 821670240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:18,380][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 04:26:19,468][26599] Updated weights for policy 0, policy_version 277964 (0.0039) [2024-06-19 04:26:23,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 4554276864. Throughput: 0: 42070.4. Samples: 821918820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:23,380][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 04:26:23,882][26599] Updated weights for policy 0, policy_version 277974 (0.0041) [2024-06-19 04:26:27,042][26599] Updated weights for policy 0, policy_version 277984 (0.0027) [2024-06-19 04:26:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4554522624. Throughput: 0: 42059.3. Samples: 822169500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:28,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 04:26:31,512][26599] Updated weights for policy 0, policy_version 277994 (0.0030) [2024-06-19 04:26:33,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42054.8, 300 sec: 42431.8). Total num frames: 4554735616. Throughput: 0: 42376.7. Samples: 822307340. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:33,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 04:26:35,166][26599] Updated weights for policy 0, policy_version 278004 (0.0051) [2024-06-19 04:26:38,380][26367] Fps is (10 sec: 39320.6, 60 sec: 41506.1, 300 sec: 42098.5). Total num frames: 4554915840. Throughput: 0: 42093.4. Samples: 822550080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:38,381][26367] Avg episode reward: [(0, '0.798')] [2024-06-19 04:26:39,259][26599] Updated weights for policy 0, policy_version 278014 (0.0034) [2024-06-19 04:26:43,005][26599] Updated weights for policy 0, policy_version 278024 (0.0029) [2024-06-19 04:26:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4555161600. Throughput: 0: 41961.3. Samples: 822799900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:43,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 04:26:47,349][26599] Updated weights for policy 0, policy_version 278034 (0.0037) [2024-06-19 04:26:48,388][26367] Fps is (10 sec: 44204.6, 60 sec: 42047.0, 300 sec: 42319.6). Total num frames: 4555358208. Throughput: 0: 42132.1. Samples: 822931820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:48,388][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 04:26:50,615][26599] Updated weights for policy 0, policy_version 278044 (0.0037) [2024-06-19 04:26:52,329][26579] Signal inference workers to stop experience collection... (12250 times) [2024-06-19 04:26:52,358][26599] InferenceWorker_p0-w0: stopping experience collection (12250 times) [2024-06-19 04:26:52,377][26579] Signal inference workers to resume experience collection... (12250 times) [2024-06-19 04:26:52,381][26599] InferenceWorker_p0-w0: resuming experience collection (12250 times) [2024-06-19 04:26:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41780.4, 300 sec: 42154.1). Total num frames: 4555554816. Throughput: 0: 42110.4. 
Samples: 823184960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:53,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 04:26:55,057][26599] Updated weights for policy 0, policy_version 278054 (0.0026) [2024-06-19 04:26:58,380][26367] Fps is (10 sec: 42629.8, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4555784192. Throughput: 0: 42261.2. Samples: 823438560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:26:58,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 04:26:58,448][26599] Updated weights for policy 0, policy_version 278064 (0.0027) [2024-06-19 04:27:02,548][26599] Updated weights for policy 0, policy_version 278074 (0.0028) [2024-06-19 04:27:03,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4555997184. Throughput: 0: 42158.2. Samples: 823567360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:27:03,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 04:27:06,055][26599] Updated weights for policy 0, policy_version 278084 (0.0028) [2024-06-19 04:27:08,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4556210176. Throughput: 0: 42317.7. Samples: 823823120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 04:27:08,384][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 04:27:10,084][26599] Updated weights for policy 0, policy_version 278094 (0.0040) [2024-06-19 04:27:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 4556423168. Throughput: 0: 42263.4. Samples: 824071360. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:13,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 04:27:13,897][26599] Updated weights for policy 0, policy_version 278104 (0.0033) [2024-06-19 04:27:18,008][26599] Updated weights for policy 0, policy_version 278114 (0.0027) [2024-06-19 04:27:18,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42322.7, 300 sec: 42265.2). Total num frames: 4556636160. Throughput: 0: 42169.5. Samples: 824205120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:18,385][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 04:27:21,795][26599] Updated weights for policy 0, policy_version 278124 (0.0035) [2024-06-19 04:27:23,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4556832768. Throughput: 0: 42363.8. Samples: 824456440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:23,380][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 04:27:25,626][26599] Updated weights for policy 0, policy_version 278134 (0.0031) [2024-06-19 04:27:28,383][26367] Fps is (10 sec: 44240.7, 60 sec: 42596.4, 300 sec: 42431.4). Total num frames: 4557078528. Throughput: 0: 42346.7. Samples: 824705620. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:28,384][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 04:27:29,697][26599] Updated weights for policy 0, policy_version 278144 (0.0030) [2024-06-19 04:27:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4557258752. Throughput: 0: 42596.8. Samples: 824848360. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:33,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 04:27:33,412][26599] Updated weights for policy 0, policy_version 278154 (0.0041) [2024-06-19 04:27:37,440][26599] Updated weights for policy 0, policy_version 278164 (0.0035) [2024-06-19 04:27:38,380][26367] Fps is (10 sec: 37692.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4557455360. Throughput: 0: 42298.4. Samples: 825088400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:38,381][26367] Avg episode reward: [(0, '0.732')] [2024-06-19 04:27:38,399][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000278165_4557455360.pth... [2024-06-19 04:27:38,462][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000277547_4547330048.pth [2024-06-19 04:27:41,302][26599] Updated weights for policy 0, policy_version 278174 (0.0040) [2024-06-19 04:27:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4557701120. Throughput: 0: 42291.6. Samples: 825341680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:43,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 04:27:44,970][26599] Updated weights for policy 0, policy_version 278184 (0.0043) [2024-06-19 04:27:48,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42330.5, 300 sec: 42154.1). Total num frames: 4557897728. Throughput: 0: 42473.7. Samples: 825478680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:48,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 04:27:48,947][26599] Updated weights for policy 0, policy_version 278194 (0.0021) [2024-06-19 04:27:52,752][26599] Updated weights for policy 0, policy_version 278204 (0.0038) [2024-06-19 04:27:53,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4558094336. Throughput: 0: 42223.0. Samples: 825723160. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:53,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 04:27:56,652][26599] Updated weights for policy 0, policy_version 278214 (0.0035) [2024-06-19 04:27:58,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4558340096. Throughput: 0: 42342.2. Samples: 825976760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:27:58,388][26367] Avg episode reward: [(0, '0.402')] [2024-06-19 04:28:00,327][26599] Updated weights for policy 0, policy_version 278224 (0.0033) [2024-06-19 04:28:03,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42209.7). Total num frames: 4558536704. Throughput: 0: 42469.7. Samples: 826116100. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:28:03,380][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 04:28:04,147][26599] Updated weights for policy 0, policy_version 278234 (0.0037) [2024-06-19 04:28:08,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4558733312. Throughput: 0: 42334.9. Samples: 826361520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:28:08,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 04:28:08,766][26599] Updated weights for policy 0, policy_version 278244 (0.0034) [2024-06-19 04:28:11,807][26599] Updated weights for policy 0, policy_version 278254 (0.0033) [2024-06-19 04:28:13,337][26579] Signal inference workers to stop experience collection... (12300 times) [2024-06-19 04:28:13,359][26599] InferenceWorker_p0-w0: stopping experience collection (12300 times) [2024-06-19 04:28:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4558962688. Throughput: 0: 42468.9. Samples: 826616600. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:28:13,380][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 04:28:13,447][26579] Signal inference workers to resume experience collection... (12300 times) [2024-06-19 04:28:13,447][26599] InferenceWorker_p0-w0: resuming experience collection (12300 times) [2024-06-19 04:28:16,384][26599] Updated weights for policy 0, policy_version 278264 (0.0032) [2024-06-19 04:28:18,381][26367] Fps is (10 sec: 44232.2, 60 sec: 42327.1, 300 sec: 42265.0). Total num frames: 4559175680. Throughput: 0: 42274.1. Samples: 826750740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:28:18,382][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 04:28:19,336][26599] Updated weights for policy 0, policy_version 278274 (0.0030) [2024-06-19 04:28:23,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4559388672. Throughput: 0: 42506.8. Samples: 827001200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 21.0) [2024-06-19 04:28:23,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 04:28:24,318][26599] Updated weights for policy 0, policy_version 278284 (0.0035) [2024-06-19 04:28:26,910][26599] Updated weights for policy 0, policy_version 278294 (0.0027) [2024-06-19 04:28:28,384][26367] Fps is (10 sec: 44225.7, 60 sec: 42324.7, 300 sec: 42320.2). Total num frames: 4559618048. Throughput: 0: 42373.0. Samples: 827248620. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:28:28,385][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 04:28:32,091][26599] Updated weights for policy 0, policy_version 278304 (0.0030) [2024-06-19 04:28:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4559814656. Throughput: 0: 42270.3. Samples: 827380840. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:28:33,384][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 04:28:34,930][26599] Updated weights for policy 0, policy_version 278314 (0.0036) [2024-06-19 04:28:38,380][26367] Fps is (10 sec: 39336.1, 60 sec: 42598.6, 300 sec: 42265.9). Total num frames: 4560011264. Throughput: 0: 42364.1. Samples: 827629540. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:28:38,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 04:28:39,609][26599] Updated weights for policy 0, policy_version 278324 (0.0033) [2024-06-19 04:28:42,608][26599] Updated weights for policy 0, policy_version 278334 (0.0026) [2024-06-19 04:28:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4560240640. Throughput: 0: 42220.9. Samples: 827876700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:28:43,383][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 04:28:47,392][26599] Updated weights for policy 0, policy_version 278344 (0.0045) [2024-06-19 04:28:48,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 4560437248. Throughput: 0: 42129.2. Samples: 828011920. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:28:48,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 04:28:50,305][26599] Updated weights for policy 0, policy_version 278354 (0.0024) [2024-06-19 04:28:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4560633856. Throughput: 0: 42199.6. Samples: 828260500. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:28:53,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 04:28:55,345][26599] Updated weights for policy 0, policy_version 278364 (0.0034) [2024-06-19 04:28:57,814][26599] Updated weights for policy 0, policy_version 278374 (0.0036) [2024-06-19 04:28:58,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42265.2). 
Total num frames: 4560896000. Throughput: 0: 42009.7. Samples: 828507040. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:28:58,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 04:29:03,091][26599] Updated weights for policy 0, policy_version 278384 (0.0045) [2024-06-19 04:29:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4561059840. Throughput: 0: 42034.9. Samples: 828642260. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:29:03,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 04:29:05,834][26599] Updated weights for policy 0, policy_version 278394 (0.0033) [2024-06-19 04:29:08,380][26367] Fps is (10 sec: 37683.7, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 4561272832. Throughput: 0: 41989.1. Samples: 828890700. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:29:08,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 04:29:10,803][26599] Updated weights for policy 0, policy_version 278404 (0.0031) [2024-06-19 04:29:13,384][26367] Fps is (10 sec: 45857.9, 60 sec: 42595.7, 300 sec: 42209.1). Total num frames: 4561518592. Throughput: 0: 42159.0. Samples: 829145780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:29:13,385][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 04:29:13,549][26599] Updated weights for policy 0, policy_version 278414 (0.0041) [2024-06-19 04:29:18,384][26367] Fps is (10 sec: 40944.9, 60 sec: 41777.5, 300 sec: 42153.6). Total num frames: 4561682432. Throughput: 0: 42215.3. Samples: 829280680. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:29:18,384][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 04:29:18,528][26599] Updated weights for policy 0, policy_version 278424 (0.0034) [2024-06-19 04:29:21,084][26599] Updated weights for policy 0, policy_version 278434 (0.0035) [2024-06-19 04:29:23,380][26367] Fps is (10 sec: 39335.7, 60 sec: 42052.2, 300 sec: 42209.6). 
Total num frames: 4561911808. Throughput: 0: 42189.2. Samples: 829528060. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:29:23,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 04:29:26,220][26599] Updated weights for policy 0, policy_version 278444 (0.0037) [2024-06-19 04:29:28,380][26367] Fps is (10 sec: 47530.5, 60 sec: 42327.9, 300 sec: 42265.7). Total num frames: 4562157568. Throughput: 0: 42364.4. Samples: 829783100. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:29:28,381][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 04:29:28,933][26599] Updated weights for policy 0, policy_version 278454 (0.0031) [2024-06-19 04:29:33,380][26367] Fps is (10 sec: 39322.5, 60 sec: 41506.2, 300 sec: 42098.6). Total num frames: 4562305024. Throughput: 0: 42243.7. Samples: 829912880. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 04:29:33,380][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 04:29:33,780][26599] Updated weights for policy 0, policy_version 278464 (0.0046) [2024-06-19 04:29:36,797][26599] Updated weights for policy 0, policy_version 278474 (0.0033) [2024-06-19 04:29:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4562567168. Throughput: 0: 42196.3. Samples: 830159340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:29:38,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 04:29:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000278477_4562567168.pth... [2024-06-19 04:29:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000277859_4552441856.pth [2024-06-19 04:29:41,657][26599] Updated weights for policy 0, policy_version 278484 (0.0030) [2024-06-19 04:29:42,928][26579] Signal inference workers to stop experience collection... (12350 times) [2024-06-19 04:29:42,928][26579] Signal inference workers to resume experience collection... 
(12350 times) [2024-06-19 04:29:42,948][26599] InferenceWorker_p0-w0: stopping experience collection (12350 times) [2024-06-19 04:29:42,948][26599] InferenceWorker_p0-w0: resuming experience collection (12350 times) [2024-06-19 04:29:43,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4562780160. Throughput: 0: 42421.0. Samples: 830415980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:29:43,381][26367] Avg episode reward: [(0, '0.338')] [2024-06-19 04:29:44,838][26599] Updated weights for policy 0, policy_version 278494 (0.0037) [2024-06-19 04:29:48,380][26367] Fps is (10 sec: 37683.3, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 4562944000. Throughput: 0: 42224.3. Samples: 830542360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:29:48,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 04:29:49,181][26599] Updated weights for policy 0, policy_version 278504 (0.0034) [2024-06-19 04:29:52,458][26599] Updated weights for policy 0, policy_version 278514 (0.0025) [2024-06-19 04:29:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4563206144. Throughput: 0: 42391.9. Samples: 830798340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:29:53,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 04:29:56,774][26599] Updated weights for policy 0, policy_version 278524 (0.0037) [2024-06-19 04:29:58,380][26367] Fps is (10 sec: 47514.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4563419136. Throughput: 0: 42457.3. Samples: 831056200. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:29:58,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 04:30:00,273][26599] Updated weights for policy 0, policy_version 278534 (0.0049) [2024-06-19 04:30:03,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4563599360. Throughput: 0: 42174.5. Samples: 831178380. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:03,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 04:30:04,399][26599] Updated weights for policy 0, policy_version 278544 (0.0030) [2024-06-19 04:30:07,932][26599] Updated weights for policy 0, policy_version 278554 (0.0044) [2024-06-19 04:30:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 4563845120. Throughput: 0: 42351.2. Samples: 831433860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:08,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 04:30:12,197][26599] Updated weights for policy 0, policy_version 278564 (0.0035) [2024-06-19 04:30:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41508.6, 300 sec: 42154.1). Total num frames: 4564008960. Throughput: 0: 42408.8. Samples: 831691500. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:13,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 04:30:15,893][26599] Updated weights for policy 0, policy_version 278574 (0.0032) [2024-06-19 04:30:18,380][26367] Fps is (10 sec: 37683.8, 60 sec: 42328.0, 300 sec: 42154.1). Total num frames: 4564221952. Throughput: 0: 42140.9. Samples: 831809220. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:18,380][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 04:30:20,330][26599] Updated weights for policy 0, policy_version 278584 (0.0030) [2024-06-19 04:30:23,384][26367] Fps is (10 sec: 45859.3, 60 sec: 42595.9, 300 sec: 42320.2). Total num frames: 4564467712. Throughput: 0: 42235.9. Samples: 832060100. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:23,384][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 04:30:23,642][26599] Updated weights for policy 0, policy_version 278594 (0.0038) [2024-06-19 04:30:27,967][26599] Updated weights for policy 0, policy_version 278604 (0.0045) [2024-06-19 04:30:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41506.2, 300 sec: 42154.6). Total num frames: 4564647936. Throughput: 0: 42303.5. Samples: 832319640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:28,380][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 04:30:31,359][26599] Updated weights for policy 0, policy_version 278614 (0.0027) [2024-06-19 04:30:33,380][26367] Fps is (10 sec: 40974.4, 60 sec: 42871.3, 300 sec: 42209.6). Total num frames: 4564877312. Throughput: 0: 42192.9. Samples: 832441040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:33,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 04:30:35,679][26599] Updated weights for policy 0, policy_version 278624 (0.0050) [2024-06-19 04:30:38,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4565106688. Throughput: 0: 42256.5. Samples: 832699880. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:38,380][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 04:30:39,032][26599] Updated weights for policy 0, policy_version 278634 (0.0040) [2024-06-19 04:30:43,338][26599] Updated weights for policy 0, policy_version 278644 (0.0035) [2024-06-19 04:30:43,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4565303296. Throughput: 0: 42277.8. Samples: 832958700. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:43,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 04:30:46,938][26599] Updated weights for policy 0, policy_version 278654 (0.0035) [2024-06-19 04:30:48,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42325.5, 300 sec: 42154.3). Total num frames: 4565483520. Throughput: 0: 42236.1. Samples: 833079000. Policy #0 lag: (min: 0.0, avg: 12.3, max: 24.0) [2024-06-19 04:30:48,380][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 04:30:51,043][26599] Updated weights for policy 0, policy_version 278664 (0.0035) [2024-06-19 04:30:53,383][26367] Fps is (10 sec: 42587.6, 60 sec: 42050.5, 300 sec: 42264.8). Total num frames: 4565729280. Throughput: 0: 42128.3. Samples: 833329740. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:30:53,383][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 04:30:54,615][26599] Updated weights for policy 0, policy_version 278674 (0.0024) [2024-06-19 04:30:58,380][26367] Fps is (10 sec: 44235.5, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4565925888. Throughput: 0: 42338.2. Samples: 833596720. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:30:58,381][26367] Avg episode reward: [(0, '0.798')] [2024-06-19 04:30:58,655][26599] Updated weights for policy 0, policy_version 278684 (0.0032) [2024-06-19 04:30:59,420][26579] Signal inference workers to stop experience collection... (12400 times) [2024-06-19 04:30:59,460][26599] InferenceWorker_p0-w0: stopping experience collection (12400 times) [2024-06-19 04:30:59,492][26579] Signal inference workers to resume experience collection... (12400 times) [2024-06-19 04:30:59,493][26599] InferenceWorker_p0-w0: resuming experience collection (12400 times) [2024-06-19 04:31:02,915][26599] Updated weights for policy 0, policy_version 278694 (0.0029) [2024-06-19 04:31:03,380][26367] Fps is (10 sec: 39331.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4566122496. Throughput: 0: 42335.8. 
Samples: 833714340. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:03,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 04:31:06,502][26599] Updated weights for policy 0, policy_version 278704 (0.0036) [2024-06-19 04:31:08,380][26367] Fps is (10 sec: 44238.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4566368256. Throughput: 0: 42374.1. Samples: 833966780. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:08,380][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 04:31:10,931][26599] Updated weights for policy 0, policy_version 278714 (0.0033) [2024-06-19 04:31:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 4566564864. Throughput: 0: 42355.0. Samples: 834225620. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:13,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 04:31:14,043][26599] Updated weights for policy 0, policy_version 278724 (0.0030) [2024-06-19 04:31:18,380][26367] Fps is (10 sec: 37682.6, 60 sec: 42052.1, 300 sec: 42265.1). Total num frames: 4566745088. Throughput: 0: 42381.8. Samples: 834348220. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:18,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 04:31:18,555][26599] Updated weights for policy 0, policy_version 278734 (0.0030) [2024-06-19 04:31:21,727][26599] Updated weights for policy 0, policy_version 278744 (0.0031) [2024-06-19 04:31:23,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42600.9, 300 sec: 42376.2). Total num frames: 4567023616. Throughput: 0: 42267.9. Samples: 834601940. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:23,382][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 04:31:26,227][26599] Updated weights for policy 0, policy_version 278754 (0.0026) [2024-06-19 04:31:28,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4567203840. Throughput: 0: 42308.0. Samples: 834862560. 
Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:28,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 04:31:29,388][26599] Updated weights for policy 0, policy_version 278764 (0.0041) [2024-06-19 04:31:33,380][26367] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4567400448. Throughput: 0: 42273.6. Samples: 834981320. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:33,390][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 04:31:33,839][26599] Updated weights for policy 0, policy_version 278774 (0.0034) [2024-06-19 04:31:37,331][26599] Updated weights for policy 0, policy_version 278784 (0.0040) [2024-06-19 04:31:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4567646208. Throughput: 0: 42356.1. Samples: 835235660. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:38,381][26367] Avg episode reward: [(0, '0.889')] [2024-06-19 04:31:38,407][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000278787_4567646208.pth... [2024-06-19 04:31:38,479][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000278165_4557455360.pth [2024-06-19 04:31:41,721][26599] Updated weights for policy 0, policy_version 278794 (0.0032) [2024-06-19 04:31:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42266.2). Total num frames: 4567826432. Throughput: 0: 42124.2. Samples: 835492300. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:43,381][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 04:31:45,017][26599] Updated weights for policy 0, policy_version 278804 (0.0034) [2024-06-19 04:31:48,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4568023040. Throughput: 0: 42116.1. Samples: 835609560. 
Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:48,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 04:31:49,529][26599] Updated weights for policy 0, policy_version 278814 (0.0031) [2024-06-19 04:31:52,972][26599] Updated weights for policy 0, policy_version 278824 (0.0033) [2024-06-19 04:31:53,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42327.1, 300 sec: 42320.7). Total num frames: 4568268800. Throughput: 0: 42234.6. Samples: 835867340. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:53,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 04:31:57,497][26599] Updated weights for policy 0, policy_version 278834 (0.0037) [2024-06-19 04:31:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 4568465408. Throughput: 0: 42071.1. Samples: 836118820. Policy #0 lag: (min: 2.0, avg: 9.2, max: 21.0) [2024-06-19 04:31:58,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 04:32:00,657][26599] Updated weights for policy 0, policy_version 278844 (0.0032) [2024-06-19 04:32:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 4568678400. Throughput: 0: 42000.5. Samples: 836238240. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:03,381][26367] Avg episode reward: [(0, '0.745')] [2024-06-19 04:32:05,270][26599] Updated weights for policy 0, policy_version 278854 (0.0030) [2024-06-19 04:32:08,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4568875008. Throughput: 0: 41975.1. Samples: 836490820. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:08,383][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 04:32:08,610][26599] Updated weights for policy 0, policy_version 278864 (0.0030) [2024-06-19 04:32:13,001][26599] Updated weights for policy 0, policy_version 278874 (0.0039) [2024-06-19 04:32:13,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.3, 300 sec: 42154.6). Total num frames: 4569071616. Throughput: 0: 41753.4. Samples: 836741460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:13,380][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 04:32:15,759][26579] Signal inference workers to stop experience collection... (12450 times) [2024-06-19 04:32:15,796][26599] InferenceWorker_p0-w0: stopping experience collection (12450 times) [2024-06-19 04:32:15,815][26579] Signal inference workers to resume experience collection... (12450 times) [2024-06-19 04:32:15,815][26599] InferenceWorker_p0-w0: resuming experience collection (12450 times) [2024-06-19 04:32:16,527][26599] Updated weights for policy 0, policy_version 278884 (0.0029) [2024-06-19 04:32:18,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 4569300992. Throughput: 0: 41893.5. Samples: 836866520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:18,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 04:32:20,948][26599] Updated weights for policy 0, policy_version 278894 (0.0038) [2024-06-19 04:32:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41233.2, 300 sec: 42099.0). Total num frames: 4569497600. Throughput: 0: 41845.5. Samples: 837118700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:23,380][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 04:32:24,511][26599] Updated weights for policy 0, policy_version 278904 (0.0034) [2024-06-19 04:32:28,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 4569694208. Throughput: 0: 41736.9. 
Samples: 837370460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:28,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 04:32:28,663][26599] Updated weights for policy 0, policy_version 278914 (0.0042) [2024-06-19 04:32:32,207][26599] Updated weights for policy 0, policy_version 278924 (0.0032) [2024-06-19 04:32:33,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4569939968. Throughput: 0: 41928.0. Samples: 837496320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:33,381][26367] Avg episode reward: [(0, '0.781')] [2024-06-19 04:32:36,434][26599] Updated weights for policy 0, policy_version 278934 (0.0037) [2024-06-19 04:32:38,380][26367] Fps is (10 sec: 44237.3, 60 sec: 41506.3, 300 sec: 42154.1). Total num frames: 4570136576. Throughput: 0: 41893.0. Samples: 837752520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:38,380][26367] Avg episode reward: [(0, '0.365')] [2024-06-19 04:32:39,833][26599] Updated weights for policy 0, policy_version 278944 (0.0027) [2024-06-19 04:32:43,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4570333184. Throughput: 0: 42024.5. Samples: 838009920. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:43,380][26367] Avg episode reward: [(0, '0.365')] [2024-06-19 04:32:44,015][26599] Updated weights for policy 0, policy_version 278954 (0.0039) [2024-06-19 04:32:47,412][26599] Updated weights for policy 0, policy_version 278964 (0.0034) [2024-06-19 04:32:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4570578944. Throughput: 0: 42087.2. Samples: 838132160. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:48,380][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 04:32:51,590][26599] Updated weights for policy 0, policy_version 278974 (0.0044) [2024-06-19 04:32:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41506.1, 300 sec: 42098.6). Total num frames: 4570759168. Throughput: 0: 42044.5. Samples: 838382820. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:53,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 04:32:55,086][26599] Updated weights for policy 0, policy_version 278984 (0.0029) [2024-06-19 04:32:58,380][26367] Fps is (10 sec: 39320.8, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 4570972160. Throughput: 0: 42157.2. Samples: 838638540. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:32:58,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 04:32:59,409][26599] Updated weights for policy 0, policy_version 278994 (0.0034) [2024-06-19 04:33:02,937][26599] Updated weights for policy 0, policy_version 279004 (0.0035) [2024-06-19 04:33:03,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4571201536. Throughput: 0: 42218.6. Samples: 838766360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:33:03,380][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 04:33:07,097][26599] Updated weights for policy 0, policy_version 279014 (0.0042) [2024-06-19 04:33:08,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4571414528. Throughput: 0: 42178.9. Samples: 839016760. Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:33:08,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 04:33:10,849][26599] Updated weights for policy 0, policy_version 279024 (0.0037) [2024-06-19 04:33:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42154.2). Total num frames: 4571611136. Throughput: 0: 42122.2. Samples: 839265960. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 28.0) [2024-06-19 04:33:13,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 04:33:14,994][26599] Updated weights for policy 0, policy_version 279034 (0.0031) [2024-06-19 04:33:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4571824128. Throughput: 0: 42215.5. Samples: 839396020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:33:18,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 04:33:18,596][26599] Updated weights for policy 0, policy_version 279044 (0.0033) [2024-06-19 04:33:23,153][26599] Updated weights for policy 0, policy_version 279054 (0.0048) [2024-06-19 04:33:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42043.5). Total num frames: 4572020736. Throughput: 0: 42123.1. Samples: 839648060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:33:23,381][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 04:33:26,291][26599] Updated weights for policy 0, policy_version 279064 (0.0038) [2024-06-19 04:33:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4572250112. Throughput: 0: 42022.6. Samples: 839900940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:33:28,381][26367] Avg episode reward: [(0, '0.305')] [2024-06-19 04:33:30,698][26599] Updated weights for policy 0, policy_version 279074 (0.0024) [2024-06-19 04:33:33,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 4572479488. Throughput: 0: 42186.1. Samples: 840030540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:33:33,381][26367] Avg episode reward: [(0, '0.390')] [2024-06-19 04:33:34,198][26599] Updated weights for policy 0, policy_version 279084 (0.0050) [2024-06-19 04:33:38,286][26599] Updated weights for policy 0, policy_version 279094 (0.0029) [2024-06-19 04:33:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42154.1). 
Total num frames: 4572676096. Throughput: 0: 42238.7. Samples: 840283560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:33:38,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 04:33:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000279094_4572676096.pth... [2024-06-19 04:33:38,445][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000278477_4562567168.pth [2024-06-19 04:33:42,120][26599] Updated weights for policy 0, policy_version 279104 (0.0028) [2024-06-19 04:33:43,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4572872704. Throughput: 0: 42141.5. Samples: 840534900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:33:43,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 04:33:45,911][26599] Updated weights for policy 0, policy_version 279114 (0.0033) [2024-06-19 04:33:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4573102080. Throughput: 0: 41958.6. Samples: 840654500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:33:48,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 04:33:49,712][26599] Updated weights for policy 0, policy_version 279124 (0.0036) [2024-06-19 04:33:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42043.0). Total num frames: 4573298688. Throughput: 0: 42340.7. Samples: 840922080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:33:53,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 04:33:53,710][26599] Updated weights for policy 0, policy_version 279134 (0.0046) [2024-06-19 04:33:55,259][26579] Signal inference workers to stop experience collection... (12500 times) [2024-06-19 04:33:55,293][26599] InferenceWorker_p0-w0: stopping experience collection (12500 times) [2024-06-19 04:33:55,320][26579] Signal inference workers to resume experience collection... 
(12500 times) [2024-06-19 04:33:55,328][26599] InferenceWorker_p0-w0: resuming experience collection (12500 times) [2024-06-19 04:33:57,634][26599] Updated weights for policy 0, policy_version 279144 (0.0029) [2024-06-19 04:33:58,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4573511680. Throughput: 0: 42191.6. Samples: 841164580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:33:58,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 04:34:01,655][26599] Updated weights for policy 0, policy_version 279154 (0.0028) [2024-06-19 04:34:03,384][26367] Fps is (10 sec: 44220.4, 60 sec: 42322.7, 300 sec: 42264.6). Total num frames: 4573741056. Throughput: 0: 42139.7. Samples: 841292460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:34:03,384][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 04:34:05,285][26599] Updated weights for policy 0, policy_version 279164 (0.0042) [2024-06-19 04:34:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42043.5). Total num frames: 4573921280. Throughput: 0: 42190.1. Samples: 841546620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:34:08,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 04:34:09,571][26599] Updated weights for policy 0, policy_version 279174 (0.0041) [2024-06-19 04:34:13,160][26599] Updated weights for policy 0, policy_version 279184 (0.0034) [2024-06-19 04:34:13,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42325.3, 300 sec: 42265.7). Total num frames: 4574150656. Throughput: 0: 42034.2. Samples: 841792480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:34:13,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 04:34:17,450][26599] Updated weights for policy 0, policy_version 279194 (0.0028) [2024-06-19 04:34:18,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4574363648. Throughput: 0: 41996.0. Samples: 841920360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:34:18,384][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 04:34:20,812][26599] Updated weights for policy 0, policy_version 279204 (0.0035) [2024-06-19 04:34:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42043.0). Total num frames: 4574560256. Throughput: 0: 42134.7. Samples: 842179620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 04:34:23,380][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 04:34:25,459][26599] Updated weights for policy 0, policy_version 279214 (0.0039) [2024-06-19 04:34:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4574789632. Throughput: 0: 42120.8. Samples: 842430340. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:34:28,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 04:34:28,509][26599] Updated weights for policy 0, policy_version 279224 (0.0036) [2024-06-19 04:34:32,923][26599] Updated weights for policy 0, policy_version 279234 (0.0039) [2024-06-19 04:34:33,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4574986240. Throughput: 0: 42340.4. Samples: 842559820. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:34:33,381][26367] Avg episode reward: [(0, '0.468')] [2024-06-19 04:34:36,797][26599] Updated weights for policy 0, policy_version 279244 (0.0032) [2024-06-19 04:34:38,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.1, 300 sec: 42043.0). Total num frames: 4575182848. Throughput: 0: 41922.9. Samples: 842808620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:34:38,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 04:34:40,488][26599] Updated weights for policy 0, policy_version 279254 (0.0031) [2024-06-19 04:34:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4575428608. Throughput: 0: 42065.6. Samples: 843057540. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:34:43,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 04:34:44,631][26599] Updated weights for policy 0, policy_version 279264 (0.0044) [2024-06-19 04:34:48,299][26599] Updated weights for policy 0, policy_version 279274 (0.0024) [2024-06-19 04:34:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4575625216. Throughput: 0: 42109.5. Samples: 843187240. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:34:48,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 04:34:52,349][26599] Updated weights for policy 0, policy_version 279284 (0.0041) [2024-06-19 04:34:53,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.1, 300 sec: 42043.0). Total num frames: 4575821824. Throughput: 0: 42133.2. Samples: 843442620. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:34:53,381][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 04:34:56,267][26599] Updated weights for policy 0, policy_version 279294 (0.0031) [2024-06-19 04:34:58,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4576051200. Throughput: 0: 42120.5. Samples: 843687900. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:34:58,389][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 04:35:00,217][26599] Updated weights for policy 0, policy_version 279304 (0.0036) [2024-06-19 04:35:03,384][26367] Fps is (10 sec: 42583.8, 60 sec: 41779.2, 300 sec: 42042.5). Total num frames: 4576247808. Throughput: 0: 42238.9. Samples: 843821260. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:35:03,384][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 04:35:04,055][26599] Updated weights for policy 0, policy_version 279314 (0.0034) [2024-06-19 04:35:07,955][26599] Updated weights for policy 0, policy_version 279324 (0.0035) [2024-06-19 04:35:08,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4576444416. Throughput: 0: 41994.1. Samples: 844069360. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:35:08,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 04:35:11,662][26599] Updated weights for policy 0, policy_version 279334 (0.0031) [2024-06-19 04:35:13,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4576673792. Throughput: 0: 42147.7. Samples: 844326980. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:35:13,380][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 04:35:14,136][26579] Signal inference workers to stop experience collection... (12550 times) [2024-06-19 04:35:14,137][26579] Signal inference workers to resume experience collection... (12550 times) [2024-06-19 04:35:14,176][26599] InferenceWorker_p0-w0: stopping experience collection (12550 times) [2024-06-19 04:35:14,177][26599] InferenceWorker_p0-w0: resuming experience collection (12550 times) [2024-06-19 04:35:16,005][26599] Updated weights for policy 0, policy_version 279344 (0.0031) [2024-06-19 04:35:18,380][26367] Fps is (10 sec: 42599.2, 60 sec: 41779.3, 300 sec: 42043.5). Total num frames: 4576870400. Throughput: 0: 42086.9. Samples: 844453720. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:35:18,380][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 04:35:19,144][26599] Updated weights for policy 0, policy_version 279354 (0.0037) [2024-06-19 04:35:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4577083392. Throughput: 0: 41993.5. 
Samples: 844698320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:35:23,380][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 04:35:23,462][26599] Updated weights for policy 0, policy_version 279364 (0.0035) [2024-06-19 04:35:26,726][26599] Updated weights for policy 0, policy_version 279374 (0.0036) [2024-06-19 04:35:28,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4577312768. Throughput: 0: 42267.2. Samples: 844959560. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:35:28,381][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 04:35:31,802][26599] Updated weights for policy 0, policy_version 279384 (0.0046) [2024-06-19 04:35:33,383][26367] Fps is (10 sec: 42584.5, 60 sec: 42050.1, 300 sec: 42042.5). Total num frames: 4577509376. Throughput: 0: 42135.3. Samples: 845083460. Policy #0 lag: (min: 1.0, avg: 10.8, max: 22.0) [2024-06-19 04:35:33,384][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 04:35:34,624][26599] Updated weights for policy 0, policy_version 279394 (0.0034) [2024-06-19 04:35:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4577738752. Throughput: 0: 42100.9. Samples: 845337160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:35:38,381][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 04:35:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000279403_4577738752.pth... [2024-06-19 04:35:38,444][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000278787_4567646208.pth [2024-06-19 04:35:39,285][26599] Updated weights for policy 0, policy_version 279404 (0.0037) [2024-06-19 04:35:42,391][26599] Updated weights for policy 0, policy_version 279414 (0.0038) [2024-06-19 04:35:43,380][26367] Fps is (10 sec: 44250.9, 60 sec: 42052.4, 300 sec: 42265.1). Total num frames: 4577951744. Throughput: 0: 42400.4. Samples: 845595920. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:35:43,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 04:35:46,835][26599] Updated weights for policy 0, policy_version 279424 (0.0035) [2024-06-19 04:35:48,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 42043.4). Total num frames: 4578131968. Throughput: 0: 42285.1. Samples: 845723940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:35:48,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 04:35:50,173][26599] Updated weights for policy 0, policy_version 279434 (0.0039) [2024-06-19 04:35:53,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 4578377728. Throughput: 0: 42271.1. Samples: 845971560. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:35:53,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 04:35:54,494][26599] Updated weights for policy 0, policy_version 279444 (0.0027) [2024-06-19 04:35:57,989][26599] Updated weights for policy 0, policy_version 279454 (0.0028) [2024-06-19 04:35:58,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4578590720. Throughput: 0: 42213.2. Samples: 846226580. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:35:58,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 04:36:02,002][26599] Updated weights for policy 0, policy_version 279464 (0.0033) [2024-06-19 04:36:03,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42054.8, 300 sec: 42043.0). Total num frames: 4578770944. Throughput: 0: 42359.0. Samples: 846359880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:03,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 04:36:05,777][26599] Updated weights for policy 0, policy_version 279474 (0.0038) [2024-06-19 04:36:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4579000320. Throughput: 0: 42387.4. Samples: 846605760. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:08,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 04:36:10,175][26599] Updated weights for policy 0, policy_version 279484 (0.0031) [2024-06-19 04:36:13,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 4579213312. Throughput: 0: 42351.5. Samples: 846865380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:13,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 04:36:13,410][26599] Updated weights for policy 0, policy_version 279494 (0.0031) [2024-06-19 04:36:17,824][26599] Updated weights for policy 0, policy_version 279504 (0.0042) [2024-06-19 04:36:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 41931.9). Total num frames: 4579393536. Throughput: 0: 42513.2. Samples: 846996420. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:18,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 04:36:21,366][26599] Updated weights for policy 0, policy_version 279514 (0.0038) [2024-06-19 04:36:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 4579639296. Throughput: 0: 42266.8. Samples: 847239160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:23,381][26367] Avg episode reward: [(0, '0.341')] [2024-06-19 04:36:25,373][26599] Updated weights for policy 0, policy_version 279524 (0.0041) [2024-06-19 04:36:26,360][26579] Signal inference workers to stop experience collection... (12600 times) [2024-06-19 04:36:26,360][26579] Signal inference workers to resume experience collection... (12600 times) [2024-06-19 04:36:26,377][26599] InferenceWorker_p0-w0: stopping experience collection (12600 times) [2024-06-19 04:36:26,377][26599] InferenceWorker_p0-w0: resuming experience collection (12600 times) [2024-06-19 04:36:28,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42049.7, 300 sec: 42153.6). Total num frames: 4579835904. Throughput: 0: 42263.3. 
Samples: 847497920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:28,384][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 04:36:29,133][26599] Updated weights for policy 0, policy_version 279534 (0.0046) [2024-06-19 04:36:33,355][26599] Updated weights for policy 0, policy_version 279544 (0.0032) [2024-06-19 04:36:33,381][26367] Fps is (10 sec: 40959.1, 60 sec: 42327.4, 300 sec: 42043.0). Total num frames: 4580048896. Throughput: 0: 42051.3. Samples: 847616260. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:33,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 04:36:36,763][26599] Updated weights for policy 0, policy_version 279554 (0.0028) [2024-06-19 04:36:38,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4580261888. Throughput: 0: 42261.0. Samples: 847873300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:38,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 04:36:41,115][26599] Updated weights for policy 0, policy_version 279564 (0.0037) [2024-06-19 04:36:43,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4580458496. Throughput: 0: 42258.7. Samples: 848128220. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:43,381][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 04:36:44,799][26599] Updated weights for policy 0, policy_version 279574 (0.0040) [2024-06-19 04:36:48,384][26367] Fps is (10 sec: 39307.4, 60 sec: 42049.7, 300 sec: 41987.0). Total num frames: 4580655104. Throughput: 0: 41989.5. Samples: 848249560. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:36:48,384][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 04:36:48,848][26599] Updated weights for policy 0, policy_version 279584 (0.0033) [2024-06-19 04:36:52,423][26599] Updated weights for policy 0, policy_version 279594 (0.0031) [2024-06-19 04:36:53,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4580900864. Throughput: 0: 42246.0. Samples: 848506820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:36:53,380][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 04:36:56,794][26599] Updated weights for policy 0, policy_version 279604 (0.0038) [2024-06-19 04:36:58,380][26367] Fps is (10 sec: 42613.6, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4581081088. Throughput: 0: 42167.6. Samples: 848762920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:36:58,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 04:37:00,104][26599] Updated weights for policy 0, policy_version 279614 (0.0039) [2024-06-19 04:37:03,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4581310464. Throughput: 0: 41968.0. Samples: 848884980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:03,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 04:37:04,651][26599] Updated weights for policy 0, policy_version 279624 (0.0026) [2024-06-19 04:37:07,968][26599] Updated weights for policy 0, policy_version 279634 (0.0034) [2024-06-19 04:37:08,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4581539840. Throughput: 0: 42269.3. Samples: 849141280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:08,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 04:37:12,299][26599] Updated weights for policy 0, policy_version 279644 (0.0040) [2024-06-19 04:37:13,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41506.2, 300 sec: 42043.0). 
Total num frames: 4581703680. Throughput: 0: 42184.3. Samples: 849396060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:13,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 04:37:15,772][26599] Updated weights for policy 0, policy_version 279654 (0.0034) [2024-06-19 04:37:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4581949440. Throughput: 0: 42177.9. Samples: 849514260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:18,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 04:37:19,854][26599] Updated weights for policy 0, policy_version 279664 (0.0046) [2024-06-19 04:37:23,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4582162432. Throughput: 0: 42118.2. Samples: 849768620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:23,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 04:37:23,583][26599] Updated weights for policy 0, policy_version 279674 (0.0028) [2024-06-19 04:37:28,314][26599] Updated weights for policy 0, policy_version 279684 (0.0050) [2024-06-19 04:37:28,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41781.6, 300 sec: 42043.0). Total num frames: 4582342656. Throughput: 0: 42097.2. Samples: 850022600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:28,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 04:37:31,545][26599] Updated weights for policy 0, policy_version 279694 (0.0043) [2024-06-19 04:37:33,384][26367] Fps is (10 sec: 42583.1, 60 sec: 42322.9, 300 sec: 42209.1). Total num frames: 4582588416. Throughput: 0: 42134.2. Samples: 850145600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:33,384][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 04:37:35,920][26599] Updated weights for policy 0, policy_version 279704 (0.0038) [2024-06-19 04:37:38,380][26367] Fps is (10 sec: 42599.3, 60 sec: 41779.3, 300 sec: 42154.1). 
Total num frames: 4582768640. Throughput: 0: 42173.3. Samples: 850404620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:38,380][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 04:37:38,426][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000279711_4582785024.pth... [2024-06-19 04:37:38,485][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000279094_4572676096.pth [2024-06-19 04:37:39,410][26599] Updated weights for policy 0, policy_version 279714 (0.0041) [2024-06-19 04:37:43,380][26367] Fps is (10 sec: 39336.5, 60 sec: 42052.4, 300 sec: 42043.0). Total num frames: 4582981632. Throughput: 0: 42016.2. Samples: 850653640. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:43,380][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 04:37:43,510][26599] Updated weights for policy 0, policy_version 279724 (0.0028) [2024-06-19 04:37:47,082][26599] Updated weights for policy 0, policy_version 279734 (0.0037) [2024-06-19 04:37:47,406][26579] Signal inference workers to stop experience collection... (12650 times) [2024-06-19 04:37:47,449][26599] InferenceWorker_p0-w0: stopping experience collection (12650 times) [2024-06-19 04:37:47,532][26579] Signal inference workers to resume experience collection... (12650 times) [2024-06-19 04:37:47,533][26599] InferenceWorker_p0-w0: resuming experience collection (12650 times) [2024-06-19 04:37:48,380][26367] Fps is (10 sec: 47513.3, 60 sec: 43147.2, 300 sec: 42320.7). Total num frames: 4583243776. Throughput: 0: 42119.6. Samples: 850780360. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:48,392][26367] Avg episode reward: [(0, '0.402')] [2024-06-19 04:37:51,291][26599] Updated weights for policy 0, policy_version 279744 (0.0035) [2024-06-19 04:37:53,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42052.2, 300 sec: 42209.7). Total num frames: 4583424000. Throughput: 0: 42151.6. Samples: 851038100. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:53,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 04:37:54,916][26599] Updated weights for policy 0, policy_version 279754 (0.0028) [2024-06-19 04:37:58,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42098.5). Total num frames: 4583620608. Throughput: 0: 42073.8. Samples: 851289380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 04:37:58,380][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 04:37:58,735][26599] Updated weights for policy 0, policy_version 279764 (0.0036) [2024-06-19 04:38:02,507][26599] Updated weights for policy 0, policy_version 279774 (0.0029) [2024-06-19 04:38:03,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42209.7). Total num frames: 4583866368. Throughput: 0: 42191.7. Samples: 851412880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:03,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 04:38:06,571][26599] Updated weights for policy 0, policy_version 279784 (0.0036) [2024-06-19 04:38:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 4584046592. Throughput: 0: 42299.2. Samples: 851672080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:08,380][26367] Avg episode reward: [(0, '0.816')] [2024-06-19 04:38:10,105][26599] Updated weights for policy 0, policy_version 279794 (0.0032) [2024-06-19 04:38:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4584259584. Throughput: 0: 42206.3. Samples: 851921880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:13,381][26367] Avg episode reward: [(0, '0.816')] [2024-06-19 04:38:14,214][26599] Updated weights for policy 0, policy_version 279804 (0.0035) [2024-06-19 04:38:17,834][26599] Updated weights for policy 0, policy_version 279814 (0.0029) [2024-06-19 04:38:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42265.2). 
Total num frames: 4584488960. Throughput: 0: 42434.6. Samples: 852055000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:18,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 04:38:22,258][26599] Updated weights for policy 0, policy_version 279824 (0.0042) [2024-06-19 04:38:23,384][26367] Fps is (10 sec: 42583.1, 60 sec: 42049.8, 300 sec: 42153.6). Total num frames: 4584685568. Throughput: 0: 42215.7. Samples: 852304480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:23,384][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 04:38:25,615][26599] Updated weights for policy 0, policy_version 279834 (0.0031) [2024-06-19 04:38:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42098.6). Total num frames: 4584898560. Throughput: 0: 42130.1. Samples: 852549500. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:28,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 04:38:29,889][26599] Updated weights for policy 0, policy_version 279844 (0.0046) [2024-06-19 04:38:33,321][26599] Updated weights for policy 0, policy_version 279854 (0.0033) [2024-06-19 04:38:33,386][26367] Fps is (10 sec: 44226.5, 60 sec: 42323.7, 300 sec: 42208.8). Total num frames: 4585127936. Throughput: 0: 42269.1. Samples: 852682720. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:33,387][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 04:38:37,433][26599] Updated weights for policy 0, policy_version 279864 (0.0043) [2024-06-19 04:38:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4585308160. Throughput: 0: 42112.4. Samples: 852933160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:38,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 04:38:40,992][26599] Updated weights for policy 0, policy_version 279874 (0.0036) [2024-06-19 04:38:43,380][26367] Fps is (10 sec: 40984.0, 60 sec: 42598.2, 300 sec: 42154.1). 
Total num frames: 4585537536. Throughput: 0: 42046.6. Samples: 853181480. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:43,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 04:38:45,365][26599] Updated weights for policy 0, policy_version 279884 (0.0029) [2024-06-19 04:38:48,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 4585734144. Throughput: 0: 42149.0. Samples: 853309580. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:48,380][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 04:38:48,754][26599] Updated weights for policy 0, policy_version 279894 (0.0031) [2024-06-19 04:38:53,138][26599] Updated weights for policy 0, policy_version 279904 (0.0033) [2024-06-19 04:38:53,383][26367] Fps is (10 sec: 40947.2, 60 sec: 42050.0, 300 sec: 42153.6). Total num frames: 4585947136. Throughput: 0: 42051.2. Samples: 853564520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:53,384][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 04:38:56,669][26599] Updated weights for policy 0, policy_version 279914 (0.0026) [2024-06-19 04:38:58,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42154.6). Total num frames: 4586176512. Throughput: 0: 42140.4. Samples: 853818200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:38:58,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 04:39:00,898][26599] Updated weights for policy 0, policy_version 279924 (0.0029) [2024-06-19 04:39:03,380][26367] Fps is (10 sec: 42611.7, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4586373120. Throughput: 0: 42067.9. Samples: 853948060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:39:03,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 04:39:04,161][26599] Updated weights for policy 0, policy_version 279934 (0.0038) [2024-06-19 04:39:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42098.6). 
Total num frames: 4586569728. Throughput: 0: 42209.1. Samples: 854203740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:39:08,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 04:39:08,635][26599] Updated weights for policy 0, policy_version 279944 (0.0050) [2024-06-19 04:39:11,850][26599] Updated weights for policy 0, policy_version 279954 (0.0035) [2024-06-19 04:39:13,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42209.7). Total num frames: 4586815488. Throughput: 0: 42423.2. Samples: 854458540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 04:39:13,380][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 04:39:16,433][26599] Updated weights for policy 0, policy_version 279964 (0.0040) [2024-06-19 04:39:18,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4587012096. Throughput: 0: 42425.1. Samples: 854591600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:39:18,381][26367] Avg episode reward: [(0, '0.409')] [2024-06-19 04:39:19,763][26599] Updated weights for policy 0, policy_version 279974 (0.0034) [2024-06-19 04:39:23,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42327.8, 300 sec: 42154.1). Total num frames: 4587225088. Throughput: 0: 42445.3. Samples: 854843200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:39:23,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 04:39:24,190][26599] Updated weights for policy 0, policy_version 279984 (0.0037) [2024-06-19 04:39:25,882][26579] Signal inference workers to stop experience collection... (12700 times) [2024-06-19 04:39:25,882][26579] Signal inference workers to resume experience collection... 
(12700 times) [2024-06-19 04:39:25,897][26599] InferenceWorker_p0-w0: stopping experience collection (12700 times) [2024-06-19 04:39:25,898][26599] InferenceWorker_p0-w0: resuming experience collection (12700 times) [2024-06-19 04:39:27,297][26599] Updated weights for policy 0, policy_version 279994 (0.0034) [2024-06-19 04:39:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4587438080. Throughput: 0: 42568.4. Samples: 855097060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:39:28,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 04:39:31,699][26599] Updated weights for policy 0, policy_version 280004 (0.0051) [2024-06-19 04:39:33,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41783.4, 300 sec: 42209.7). Total num frames: 4587634688. Throughput: 0: 42600.5. Samples: 855226600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:39:33,380][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 04:39:35,162][26599] Updated weights for policy 0, policy_version 280014 (0.0039) [2024-06-19 04:39:38,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42868.8, 300 sec: 42209.1). Total num frames: 4587880448. Throughput: 0: 42545.8. Samples: 855479100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:39:38,385][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 04:39:38,403][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000280022_4587880448.pth... [2024-06-19 04:39:38,468][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000279403_4577738752.pth [2024-06-19 04:39:39,289][26599] Updated weights for policy 0, policy_version 280024 (0.0045) [2024-06-19 04:39:42,886][26599] Updated weights for policy 0, policy_version 280034 (0.0034) [2024-06-19 04:39:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4588077056. Throughput: 0: 42446.7. Samples: 855728300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:39:43,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 04:39:47,052][26599] Updated weights for policy 0, policy_version 280044 (0.0032) [2024-06-19 04:39:48,380][26367] Fps is (10 sec: 39336.3, 60 sec: 42325.3, 300 sec: 42209.7). Total num frames: 4588273664. Throughput: 0: 42509.0. Samples: 855860960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:39:48,380][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 04:39:50,639][26599] Updated weights for policy 0, policy_version 280054 (0.0030) [2024-06-19 04:39:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42600.6, 300 sec: 42209.6). Total num frames: 4588503040. Throughput: 0: 42463.5. Samples: 856114600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:39:53,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 04:39:54,859][26599] Updated weights for policy 0, policy_version 280064 (0.0025) [2024-06-19 04:39:58,341][26599] Updated weights for policy 0, policy_version 280074 (0.0035) [2024-06-19 04:39:58,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42321.2). Total num frames: 4588732416. Throughput: 0: 42434.1. Samples: 856368080. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:39:58,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 04:40:02,662][26599] Updated weights for policy 0, policy_version 280084 (0.0029) [2024-06-19 04:40:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4588929024. Throughput: 0: 42410.2. Samples: 856500060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:40:03,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 04:40:05,838][26599] Updated weights for policy 0, policy_version 280094 (0.0029) [2024-06-19 04:40:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42265.2). Total num frames: 4589142016. Throughput: 0: 42426.7. Samples: 856752400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:40:08,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 04:40:10,354][26599] Updated weights for policy 0, policy_version 280104 (0.0033) [2024-06-19 04:40:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4589355008. Throughput: 0: 42376.5. Samples: 857004000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:40:13,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 04:40:13,557][26599] Updated weights for policy 0, policy_version 280114 (0.0027) [2024-06-19 04:40:18,042][26599] Updated weights for policy 0, policy_version 280124 (0.0032) [2024-06-19 04:40:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4589568000. Throughput: 0: 42362.5. Samples: 857132920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:40:18,381][26367] Avg episode reward: [(0, '0.805')] [2024-06-19 04:40:21,147][26599] Updated weights for policy 0, policy_version 280134 (0.0034) [2024-06-19 04:40:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4589764608. Throughput: 0: 42356.3. Samples: 857384980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 04:40:23,381][26367] Avg episode reward: [(0, '0.822')] [2024-06-19 04:40:25,791][26599] Updated weights for policy 0, policy_version 280144 (0.0029) [2024-06-19 04:40:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42265.6). Total num frames: 4589977600. Throughput: 0: 42584.0. Samples: 857644580. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:40:28,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 04:40:28,906][26599] Updated weights for policy 0, policy_version 280154 (0.0035) [2024-06-19 04:40:33,312][26599] Updated weights for policy 0, policy_version 280164 (0.0043) [2024-06-19 04:40:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 4590206976. Throughput: 0: 42417.3. Samples: 857769740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:40:33,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 04:40:36,936][26599] Updated weights for policy 0, policy_version 280174 (0.0043) [2024-06-19 04:40:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42327.9, 300 sec: 42265.2). Total num frames: 4590419968. Throughput: 0: 42319.1. Samples: 858018960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:40:38,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 04:40:41,071][26599] Updated weights for policy 0, policy_version 280184 (0.0035) [2024-06-19 04:40:43,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4590616576. Throughput: 0: 42384.8. Samples: 858275400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:40:43,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 04:40:44,572][26599] Updated weights for policy 0, policy_version 280194 (0.0032) [2024-06-19 04:40:48,383][26367] Fps is (10 sec: 42585.7, 60 sec: 42869.2, 300 sec: 42264.7). Total num frames: 4590845952. Throughput: 0: 42154.9. Samples: 858397160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:40:48,384][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 04:40:48,690][26599] Updated weights for policy 0, policy_version 280204 (0.0030) [2024-06-19 04:40:50,512][26579] Signal inference workers to stop experience collection... 
(12750 times) [2024-06-19 04:40:50,512][26579] Signal inference workers to resume experience collection... (12750 times) [2024-06-19 04:40:50,553][26599] InferenceWorker_p0-w0: stopping experience collection (12750 times) [2024-06-19 04:40:50,553][26599] InferenceWorker_p0-w0: resuming experience collection (12750 times) [2024-06-19 04:40:52,663][26599] Updated weights for policy 0, policy_version 280214 (0.0027) [2024-06-19 04:40:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4591058944. Throughput: 0: 42117.7. Samples: 858647700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:40:53,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 04:40:56,367][26599] Updated weights for policy 0, policy_version 280224 (0.0030) [2024-06-19 04:40:58,380][26367] Fps is (10 sec: 40972.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4591255552. Throughput: 0: 42288.8. Samples: 858907000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:40:58,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 04:41:00,510][26599] Updated weights for policy 0, policy_version 280234 (0.0037) [2024-06-19 04:41:03,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4591452160. Throughput: 0: 42044.9. Samples: 859024940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:41:03,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 04:41:04,422][26599] Updated weights for policy 0, policy_version 280244 (0.0030) [2024-06-19 04:41:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4591665152. Throughput: 0: 42225.8. Samples: 859285140. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:41:08,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-19 04:41:08,417][26599] Updated weights for policy 0, policy_version 280254 (0.0032) [2024-06-19 04:41:12,334][26599] Updated weights for policy 0, policy_version 280264 (0.0039) [2024-06-19 04:41:13,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42049.7, 300 sec: 42320.2). Total num frames: 4591878144. Throughput: 0: 41994.0. Samples: 859534460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:41:13,384][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 04:41:16,220][26599] Updated weights for policy 0, policy_version 280274 (0.0027) [2024-06-19 04:41:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4592091136. Throughput: 0: 42100.4. Samples: 859664260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:41:18,381][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 04:41:20,184][26599] Updated weights for policy 0, policy_version 280284 (0.0031) [2024-06-19 04:41:23,380][26367] Fps is (10 sec: 42613.5, 60 sec: 42325.3, 300 sec: 42265.7). Total num frames: 4592304128. Throughput: 0: 42228.0. Samples: 859919220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:41:23,381][26367] Avg episode reward: [(0, '0.352')] [2024-06-19 04:41:23,722][26599] Updated weights for policy 0, policy_version 280294 (0.0032) [2024-06-19 04:41:27,648][26599] Updated weights for policy 0, policy_version 280304 (0.0043) [2024-06-19 04:41:28,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42322.7, 300 sec: 42264.7). Total num frames: 4592517120. Throughput: 0: 42176.2. Samples: 860173480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:41:28,385][26367] Avg episode reward: [(0, '0.312')] [2024-06-19 04:41:31,646][26599] Updated weights for policy 0, policy_version 280314 (0.0033) [2024-06-19 04:41:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42320.7). 
Total num frames: 4592746496. Throughput: 0: 42316.2. Samples: 860301260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:41:33,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 04:41:35,421][26599] Updated weights for policy 0, policy_version 280324 (0.0030) [2024-06-19 04:41:38,380][26367] Fps is (10 sec: 40974.6, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 4592926720. Throughput: 0: 42323.5. Samples: 860552260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 04:41:38,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 04:41:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000280330_4592926720.pth... [2024-06-19 04:41:38,486][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000279711_4582785024.pth [2024-06-19 04:41:39,454][26599] Updated weights for policy 0, policy_version 280334 (0.0041) [2024-06-19 04:41:43,188][26599] Updated weights for policy 0, policy_version 280344 (0.0045) [2024-06-19 04:41:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42376.8). Total num frames: 4593156096. Throughput: 0: 42147.6. Samples: 860803640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:41:43,381][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 04:41:47,209][26599] Updated weights for policy 0, policy_version 280354 (0.0037) [2024-06-19 04:41:48,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42054.4, 300 sec: 42265.1). Total num frames: 4593369088. Throughput: 0: 42393.7. Samples: 860932660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:41:48,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 04:41:50,850][26599] Updated weights for policy 0, policy_version 280364 (0.0032) [2024-06-19 04:41:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 4593582080. Throughput: 0: 42371.1. Samples: 861191840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:41:53,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 04:41:55,052][26599] Updated weights for policy 0, policy_version 280374 (0.0043) [2024-06-19 04:41:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4593795072. Throughput: 0: 42261.1. Samples: 861436060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:41:58,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 04:41:58,539][26599] Updated weights for policy 0, policy_version 280384 (0.0043) [2024-06-19 04:42:02,597][26599] Updated weights for policy 0, policy_version 280394 (0.0050) [2024-06-19 04:42:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4594024448. Throughput: 0: 42284.5. Samples: 861567060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:03,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 04:42:06,245][26599] Updated weights for policy 0, policy_version 280404 (0.0030) [2024-06-19 04:42:08,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4594188288. Throughput: 0: 42338.4. Samples: 861824440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:08,380][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 04:42:10,231][26599] Updated weights for policy 0, policy_version 280414 (0.0032) [2024-06-19 04:42:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42874.0, 300 sec: 42376.2). Total num frames: 4594450432. Throughput: 0: 42153.6. Samples: 862070240. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:13,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 04:42:13,942][26599] Updated weights for policy 0, policy_version 280424 (0.0024) [2024-06-19 04:42:16,760][26579] Signal inference workers to stop experience collection... 
(12800 times) [2024-06-19 04:42:16,761][26579] Signal inference workers to resume experience collection... (12800 times) [2024-06-19 04:42:16,794][26599] InferenceWorker_p0-w0: stopping experience collection (12800 times) [2024-06-19 04:42:16,794][26599] InferenceWorker_p0-w0: resuming experience collection (12800 times) [2024-06-19 04:42:17,962][26599] Updated weights for policy 0, policy_version 280434 (0.0030) [2024-06-19 04:42:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4594630656. Throughput: 0: 42388.1. Samples: 862208720. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:18,380][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 04:42:21,620][26599] Updated weights for policy 0, policy_version 280444 (0.0035) [2024-06-19 04:42:23,380][26367] Fps is (10 sec: 36045.4, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 4594810880. Throughput: 0: 42303.3. Samples: 862455900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:23,381][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 04:42:25,646][26599] Updated weights for policy 0, policy_version 280454 (0.0041) [2024-06-19 04:42:28,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42327.9, 300 sec: 42265.7). Total num frames: 4595056640. Throughput: 0: 42345.3. Samples: 862709180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:28,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 04:42:29,254][26599] Updated weights for policy 0, policy_version 280464 (0.0037) [2024-06-19 04:42:33,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4595269632. Throughput: 0: 42505.4. Samples: 862845400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:33,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 04:42:33,534][26599] Updated weights for policy 0, policy_version 280474 (0.0039) [2024-06-19 04:42:36,932][26599] Updated weights for policy 0, policy_version 280484 (0.0036) [2024-06-19 04:42:38,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4595466240. Throughput: 0: 42209.8. Samples: 863091280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:38,380][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 04:42:41,165][26599] Updated weights for policy 0, policy_version 280494 (0.0037) [2024-06-19 04:42:43,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 4595712000. Throughput: 0: 42557.4. Samples: 863351140. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:43,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 04:42:44,635][26599] Updated weights for policy 0, policy_version 280504 (0.0036) [2024-06-19 04:42:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4595892224. Throughput: 0: 42600.0. Samples: 863484060. Policy #0 lag: (min: 0.0, avg: 9.8, max: 19.0) [2024-06-19 04:42:48,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 04:42:49,097][26599] Updated weights for policy 0, policy_version 280514 (0.0036) [2024-06-19 04:42:52,612][26599] Updated weights for policy 0, policy_version 280524 (0.0047) [2024-06-19 04:42:53,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4596121600. Throughput: 0: 42340.7. Samples: 863729780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:42:53,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 04:42:56,901][26599] Updated weights for policy 0, policy_version 280534 (0.0044) [2024-06-19 04:42:58,380][26367] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42376.2). 
Total num frames: 4596367360. Throughput: 0: 42568.5. Samples: 863985820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:42:58,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 04:43:00,407][26599] Updated weights for policy 0, policy_version 280544 (0.0039) [2024-06-19 04:43:03,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.1, 300 sec: 42265.1). Total num frames: 4596514816. Throughput: 0: 42369.7. Samples: 864115360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:03,381][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 04:43:04,618][26599] Updated weights for policy 0, policy_version 280554 (0.0041) [2024-06-19 04:43:07,883][26599] Updated weights for policy 0, policy_version 280564 (0.0023) [2024-06-19 04:43:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 43144.4, 300 sec: 42431.8). Total num frames: 4596776960. Throughput: 0: 42436.3. Samples: 864365540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:08,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 04:43:12,281][26599] Updated weights for policy 0, policy_version 280574 (0.0033) [2024-06-19 04:43:13,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4596989952. Throughput: 0: 42560.9. Samples: 864624420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:13,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 04:43:15,683][26599] Updated weights for policy 0, policy_version 280584 (0.0031) [2024-06-19 04:43:18,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42052.2, 300 sec: 42265.7). Total num frames: 4597153792. Throughput: 0: 42270.2. Samples: 864747560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:18,382][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 04:43:19,899][26599] Updated weights for policy 0, policy_version 280594 (0.0033) [2024-06-19 04:43:23,106][26599] Updated weights for policy 0, policy_version 280604 (0.0041) [2024-06-19 04:43:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42431.8). Total num frames: 4597415936. Throughput: 0: 42446.5. Samples: 865001380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:23,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 04:43:27,773][26599] Updated weights for policy 0, policy_version 280614 (0.0032) [2024-06-19 04:43:28,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42377.1). Total num frames: 4597628928. Throughput: 0: 42524.3. Samples: 865264740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:28,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 04:43:31,047][26599] Updated weights for policy 0, policy_version 280624 (0.0030) [2024-06-19 04:43:33,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4597809152. Throughput: 0: 42336.9. Samples: 865389220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:33,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 04:43:35,412][26599] Updated weights for policy 0, policy_version 280634 (0.0034) [2024-06-19 04:43:38,384][26367] Fps is (10 sec: 42583.3, 60 sec: 43141.9, 300 sec: 42431.3). Total num frames: 4598054912. Throughput: 0: 42501.0. Samples: 865642480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:38,384][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 04:43:38,399][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000280643_4598054912.pth... 
[2024-06-19 04:43:38,463][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000280022_4587880448.pth [2024-06-19 04:43:38,705][26599] Updated weights for policy 0, policy_version 280644 (0.0033) [2024-06-19 04:43:40,873][26579] Signal inference workers to stop experience collection... (12850 times) [2024-06-19 04:43:40,909][26599] InferenceWorker_p0-w0: stopping experience collection (12850 times) [2024-06-19 04:43:40,942][26579] Signal inference workers to resume experience collection... (12850 times) [2024-06-19 04:43:40,942][26599] InferenceWorker_p0-w0: resuming experience collection (12850 times) [2024-06-19 04:43:43,300][26599] Updated weights for policy 0, policy_version 280654 (0.0048) [2024-06-19 04:43:43,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4598235136. Throughput: 0: 42589.5. Samples: 865902340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:43,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 04:43:46,477][26599] Updated weights for policy 0, policy_version 280664 (0.0034) [2024-06-19 04:43:48,380][26367] Fps is (10 sec: 39336.3, 60 sec: 42598.5, 300 sec: 42376.7). Total num frames: 4598448128. Throughput: 0: 42309.5. Samples: 866019280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:48,381][26367] Avg episode reward: [(0, '0.320')] [2024-06-19 04:43:51,123][26599] Updated weights for policy 0, policy_version 280674 (0.0038) [2024-06-19 04:43:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 4598677504. Throughput: 0: 42461.1. Samples: 866276280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:53,380][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 04:43:54,230][26599] Updated weights for policy 0, policy_version 280684 (0.0042) [2024-06-19 04:43:58,380][26367] Fps is (10 sec: 40959.0, 60 sec: 41506.1, 300 sec: 42320.7). Total num frames: 4598857728. 
Throughput: 0: 42366.6. Samples: 866530920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:43:58,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 04:43:58,879][26599] Updated weights for policy 0, policy_version 280694 (0.0030) [2024-06-19 04:44:02,110][26599] Updated weights for policy 0, policy_version 280704 (0.0033) [2024-06-19 04:44:03,380][26367] Fps is (10 sec: 39320.7, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4599070720. Throughput: 0: 42279.0. Samples: 866650120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 04:44:03,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 04:44:06,520][26599] Updated weights for policy 0, policy_version 280714 (0.0043) [2024-06-19 04:44:08,383][26367] Fps is (10 sec: 44223.7, 60 sec: 42050.1, 300 sec: 42320.2). Total num frames: 4599300096. Throughput: 0: 42343.4. Samples: 866906960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:08,384][26367] Avg episode reward: [(0, '0.832')] [2024-06-19 04:44:09,837][26599] Updated weights for policy 0, policy_version 280724 (0.0050) [2024-06-19 04:44:13,380][26367] Fps is (10 sec: 42599.2, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 4599496704. Throughput: 0: 42190.4. Samples: 867163300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:13,380][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 04:44:14,378][26599] Updated weights for policy 0, policy_version 280734 (0.0033) [2024-06-19 04:44:17,540][26599] Updated weights for policy 0, policy_version 280744 (0.0039) [2024-06-19 04:44:18,380][26367] Fps is (10 sec: 40972.6, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4599709696. Throughput: 0: 42173.7. Samples: 867287040. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:18,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 04:44:22,187][26599] Updated weights for policy 0, policy_version 280754 (0.0037) [2024-06-19 04:44:23,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4599922688. Throughput: 0: 42262.9. Samples: 867544160. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:23,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 04:44:25,358][26599] Updated weights for policy 0, policy_version 280764 (0.0039) [2024-06-19 04:44:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 4600135680. Throughput: 0: 42051.7. Samples: 867794680. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:28,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 04:44:29,851][26599] Updated weights for policy 0, policy_version 280774 (0.0035) [2024-06-19 04:44:33,256][26599] Updated weights for policy 0, policy_version 280784 (0.0033) [2024-06-19 04:44:33,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42321.2). Total num frames: 4600365056. Throughput: 0: 42324.8. Samples: 867923900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:33,383][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 04:44:37,458][26599] Updated weights for policy 0, policy_version 280794 (0.0033) [2024-06-19 04:44:38,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41508.6, 300 sec: 42265.2). Total num frames: 4600545280. Throughput: 0: 42259.4. Samples: 868177960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:38,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 04:44:41,073][26599] Updated weights for policy 0, policy_version 280804 (0.0022) [2024-06-19 04:44:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4600774656. Throughput: 0: 42201.1. Samples: 868429960. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:43,380][26367] Avg episode reward: [(0, '0.737')] [2024-06-19 04:44:45,133][26599] Updated weights for policy 0, policy_version 280814 (0.0031) [2024-06-19 04:44:48,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4601004032. Throughput: 0: 42494.9. Samples: 868562380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:48,380][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 04:44:48,485][26599] Updated weights for policy 0, policy_version 280824 (0.0032) [2024-06-19 04:44:53,146][26599] Updated weights for policy 0, policy_version 280834 (0.0024) [2024-06-19 04:44:53,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.0, 300 sec: 42209.6). Total num frames: 4601184256. Throughput: 0: 42285.1. Samples: 868809660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:53,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 04:44:55,794][26579] Signal inference workers to stop experience collection... (12900 times) [2024-06-19 04:44:55,801][26579] Signal inference workers to resume experience collection... (12900 times) [2024-06-19 04:44:55,812][26599] InferenceWorker_p0-w0: stopping experience collection (12900 times) [2024-06-19 04:44:55,847][26599] InferenceWorker_p0-w0: resuming experience collection (12900 times) [2024-06-19 04:44:56,151][26599] Updated weights for policy 0, policy_version 280844 (0.0044) [2024-06-19 04:44:58,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4601413632. Throughput: 0: 42282.5. Samples: 869066020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:44:58,384][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 04:45:00,800][26599] Updated weights for policy 0, policy_version 280854 (0.0042) [2024-06-19 04:45:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4601626624. Throughput: 0: 42377.8. 
Samples: 869194040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:45:03,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 04:45:03,963][26599] Updated weights for policy 0, policy_version 280864 (0.0041) [2024-06-19 04:45:08,339][26599] Updated weights for policy 0, policy_version 280874 (0.0032) [2024-06-19 04:45:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42327.4, 300 sec: 42320.7). Total num frames: 4601839616. Throughput: 0: 42285.7. Samples: 869447020. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:45:08,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 04:45:11,987][26599] Updated weights for policy 0, policy_version 280884 (0.0040) [2024-06-19 04:45:13,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4602036224. Throughput: 0: 42377.1. Samples: 869701640. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 04:45:13,380][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 04:45:16,130][26599] Updated weights for policy 0, policy_version 280894 (0.0037) [2024-06-19 04:45:18,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4602249216. Throughput: 0: 42332.9. Samples: 869828880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:45:18,381][26367] Avg episode reward: [(0, '0.859')] [2024-06-19 04:45:19,738][26599] Updated weights for policy 0, policy_version 280904 (0.0029) [2024-06-19 04:45:23,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4602462208. Throughput: 0: 42236.4. Samples: 870078600. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:45:23,389][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 04:45:23,994][26599] Updated weights for policy 0, policy_version 280914 (0.0032) [2024-06-19 04:45:27,693][26599] Updated weights for policy 0, policy_version 280924 (0.0043) [2024-06-19 04:45:28,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42265.1). Total num frames: 4602675200. Throughput: 0: 42329.1. Samples: 870334780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:45:28,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 04:45:31,700][26599] Updated weights for policy 0, policy_version 280934 (0.0040) [2024-06-19 04:45:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 4602871808. Throughput: 0: 42152.8. Samples: 870459260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:45:33,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 04:45:35,751][26599] Updated weights for policy 0, policy_version 280944 (0.0026) [2024-06-19 04:45:38,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4603101184. Throughput: 0: 42033.5. Samples: 870701160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:45:38,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 04:45:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000280951_4603101184.pth... [2024-06-19 04:45:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000280330_4592926720.pth [2024-06-19 04:45:39,900][26599] Updated weights for policy 0, policy_version 280954 (0.0039) [2024-06-19 04:45:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42154.5). Total num frames: 4603281408. Throughput: 0: 42093.9. Samples: 870960240. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:45:43,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 04:45:43,618][26599] Updated weights for policy 0, policy_version 280964 (0.0039) [2024-06-19 04:45:47,629][26599] Updated weights for policy 0, policy_version 280974 (0.0027) [2024-06-19 04:45:48,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4603510784. Throughput: 0: 42039.6. Samples: 871085820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:45:48,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 04:45:51,351][26599] Updated weights for policy 0, policy_version 280984 (0.0045) [2024-06-19 04:45:53,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4603740160. Throughput: 0: 42047.7. Samples: 871339160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:45:53,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 04:45:55,565][26599] Updated weights for policy 0, policy_version 280994 (0.0040) [2024-06-19 04:45:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 4603920384. Throughput: 0: 42136.4. Samples: 871597780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:45:58,381][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 04:45:59,129][26599] Updated weights for policy 0, policy_version 281004 (0.0029) [2024-06-19 04:46:03,204][26599] Updated weights for policy 0, policy_version 281014 (0.0039) [2024-06-19 04:46:03,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4604149760. Throughput: 0: 41983.5. Samples: 871718140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:46:03,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 04:46:06,793][26599] Updated weights for policy 0, policy_version 281024 (0.0025) [2024-06-19 04:46:07,030][26579] Signal inference workers to stop experience collection... 
(12950 times) [2024-06-19 04:46:07,031][26579] Signal inference workers to resume experience collection... (12950 times) [2024-06-19 04:46:07,043][26599] InferenceWorker_p0-w0: stopping experience collection (12950 times) [2024-06-19 04:46:07,043][26599] InferenceWorker_p0-w0: resuming experience collection (12950 times) [2024-06-19 04:46:08,380][26367] Fps is (10 sec: 47513.6, 60 sec: 42598.5, 300 sec: 42432.3). Total num frames: 4604395520. Throughput: 0: 42174.7. Samples: 871976460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:46:08,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 04:46:10,740][26599] Updated weights for policy 0, policy_version 281034 (0.0035) [2024-06-19 04:46:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42265.2). Total num frames: 4604559360. Throughput: 0: 42168.0. Samples: 872232340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:46:13,390][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 04:46:14,362][26599] Updated weights for policy 0, policy_version 281044 (0.0030) [2024-06-19 04:46:18,380][26367] Fps is (10 sec: 37683.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4604772352. Throughput: 0: 42077.7. Samples: 872352760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:46:18,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 04:46:18,551][26599] Updated weights for policy 0, policy_version 281054 (0.0035) [2024-06-19 04:46:22,227][26599] Updated weights for policy 0, policy_version 281064 (0.0043) [2024-06-19 04:46:23,380][26367] Fps is (10 sec: 45876.2, 60 sec: 42598.5, 300 sec: 42376.8). Total num frames: 4605018112. Throughput: 0: 42603.6. Samples: 872618320. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:46:23,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 04:46:26,076][26599] Updated weights for policy 0, policy_version 281074 (0.0029) [2024-06-19 04:46:28,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 4605198336. Throughput: 0: 42533.0. Samples: 872874220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 22.0) [2024-06-19 04:46:28,380][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 04:46:29,790][26599] Updated weights for policy 0, policy_version 281084 (0.0035) [2024-06-19 04:46:33,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4605411328. Throughput: 0: 42385.0. Samples: 872993140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:46:33,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 04:46:33,625][26599] Updated weights for policy 0, policy_version 281094 (0.0029) [2024-06-19 04:46:37,185][26599] Updated weights for policy 0, policy_version 281104 (0.0028) [2024-06-19 04:46:38,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4605657088. Throughput: 0: 42572.8. Samples: 873254940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:46:38,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 04:46:41,345][26599] Updated weights for policy 0, policy_version 281114 (0.0027) [2024-06-19 04:46:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4605853696. Throughput: 0: 42560.4. Samples: 873513000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:46:43,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 04:46:44,669][26599] Updated weights for policy 0, policy_version 281124 (0.0035) [2024-06-19 04:46:48,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4606050304. Throughput: 0: 42658.7. Samples: 873637780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:46:48,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 04:46:48,834][26599] Updated weights for policy 0, policy_version 281134 (0.0037) [2024-06-19 04:46:52,256][26599] Updated weights for policy 0, policy_version 281144 (0.0035) [2024-06-19 04:46:53,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4606279680. Throughput: 0: 42611.0. Samples: 873893960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:46:53,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 04:46:56,507][26599] Updated weights for policy 0, policy_version 281154 (0.0029) [2024-06-19 04:46:58,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42265.1). Total num frames: 4606492672. Throughput: 0: 42678.7. Samples: 874152880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:46:58,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 04:47:00,377][26599] Updated weights for policy 0, policy_version 281164 (0.0048) [2024-06-19 04:47:03,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4606689280. Throughput: 0: 42758.3. Samples: 874276880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:47:03,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 04:47:04,162][26599] Updated weights for policy 0, policy_version 281174 (0.0035) [2024-06-19 04:47:07,987][26599] Updated weights for policy 0, policy_version 281184 (0.0032) [2024-06-19 04:47:08,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4606918656. Throughput: 0: 42568.3. Samples: 874533900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:47:08,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 04:47:12,104][26599] Updated weights for policy 0, policy_version 281194 (0.0034) [2024-06-19 04:47:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42320.7). 
Total num frames: 4607115264. Throughput: 0: 42479.5. Samples: 874785800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:47:13,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 04:47:15,550][26599] Updated weights for policy 0, policy_version 281204 (0.0035) [2024-06-19 04:47:17,383][26579] Signal inference workers to stop experience collection... (13000 times) [2024-06-19 04:47:17,383][26579] Signal inference workers to resume experience collection... (13000 times) [2024-06-19 04:47:17,412][26599] InferenceWorker_p0-w0: stopping experience collection (13000 times) [2024-06-19 04:47:17,413][26599] InferenceWorker_p0-w0: resuming experience collection (13000 times) [2024-06-19 04:47:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4607328256. Throughput: 0: 42571.4. Samples: 874908860. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:47:18,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 04:47:20,233][26599] Updated weights for policy 0, policy_version 281214 (0.0035) [2024-06-19 04:47:23,097][26599] Updated weights for policy 0, policy_version 281224 (0.0039) [2024-06-19 04:47:23,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4607574016. Throughput: 0: 42436.4. Samples: 875164580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:47:23,381][26367] Avg episode reward: [(0, '0.320')] [2024-06-19 04:47:27,900][26599] Updated weights for policy 0, policy_version 281234 (0.0035) [2024-06-19 04:47:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4607754240. Throughput: 0: 42384.0. Samples: 875420280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:47:28,381][26367] Avg episode reward: [(0, '0.301')] [2024-06-19 04:47:30,955][26599] Updated weights for policy 0, policy_version 281244 (0.0034) [2024-06-19 04:47:33,380][26367] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4607950848. Throughput: 0: 42232.1. Samples: 875538220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:47:33,380][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 04:47:35,471][26599] Updated weights for policy 0, policy_version 281254 (0.0034) [2024-06-19 04:47:38,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4608196608. Throughput: 0: 42379.8. Samples: 875801040. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 04:47:38,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 04:47:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000281263_4608212992.pth... [2024-06-19 04:47:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000280643_4598054912.pth [2024-06-19 04:47:38,681][26599] Updated weights for policy 0, policy_version 281264 (0.0039) [2024-06-19 04:47:43,229][26599] Updated weights for policy 0, policy_version 281274 (0.0037) [2024-06-19 04:47:43,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4608393216. Throughput: 0: 42384.9. Samples: 876060200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:47:43,381][26367] Avg episode reward: [(0, '0.888')] [2024-06-19 04:47:46,439][26599] Updated weights for policy 0, policy_version 281284 (0.0039) [2024-06-19 04:47:48,380][26367] Fps is (10 sec: 39320.7, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 4608589824. Throughput: 0: 42404.7. Samples: 876185100. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:47:48,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 04:47:50,834][26599] Updated weights for policy 0, policy_version 281294 (0.0035) [2024-06-19 04:47:53,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 4608819200. Throughput: 0: 42323.6. Samples: 876438460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:47:53,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 04:47:54,442][26599] Updated weights for policy 0, policy_version 281304 (0.0028) [2024-06-19 04:47:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4609015808. Throughput: 0: 42224.8. Samples: 876685920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:47:58,381][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 04:47:58,802][26599] Updated weights for policy 0, policy_version 281314 (0.0028) [2024-06-19 04:48:02,217][26599] Updated weights for policy 0, policy_version 281324 (0.0038) [2024-06-19 04:48:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4609228800. Throughput: 0: 42341.0. Samples: 876814200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:03,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 04:48:06,657][26599] Updated weights for policy 0, policy_version 281334 (0.0033) [2024-06-19 04:48:08,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4609458176. Throughput: 0: 42352.9. Samples: 877070460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:08,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 04:48:09,936][26599] Updated weights for policy 0, policy_version 281344 (0.0032) [2024-06-19 04:48:13,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4609671168. Throughput: 0: 42162.2. Samples: 877317580. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:13,389][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 04:48:14,356][26599] Updated weights for policy 0, policy_version 281354 (0.0040) [2024-06-19 04:48:17,614][26599] Updated weights for policy 0, policy_version 281364 (0.0038) [2024-06-19 04:48:18,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42322.8, 300 sec: 42209.1). Total num frames: 4609867776. Throughput: 0: 42224.5. Samples: 877438480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:18,393][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 04:48:21,970][26599] Updated weights for policy 0, policy_version 281374 (0.0036) [2024-06-19 04:48:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 4610080768. Throughput: 0: 42107.0. Samples: 877695860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:23,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 04:48:25,330][26599] Updated weights for policy 0, policy_version 281384 (0.0021) [2024-06-19 04:48:26,506][26579] Signal inference workers to stop experience collection... (13050 times) [2024-06-19 04:48:26,507][26579] Signal inference workers to resume experience collection... (13050 times) [2024-06-19 04:48:26,521][26599] InferenceWorker_p0-w0: stopping experience collection (13050 times) [2024-06-19 04:48:26,521][26599] InferenceWorker_p0-w0: resuming experience collection (13050 times) [2024-06-19 04:48:28,384][26367] Fps is (10 sec: 42598.3, 60 sec: 42322.8, 300 sec: 42320.2). Total num frames: 4610293760. Throughput: 0: 42122.9. Samples: 877955880. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:28,385][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 04:48:29,673][26599] Updated weights for policy 0, policy_version 281394 (0.0029) [2024-06-19 04:48:33,083][26599] Updated weights for policy 0, policy_version 281404 (0.0034) [2024-06-19 04:48:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42265.7). Total num frames: 4610523136. Throughput: 0: 42141.0. Samples: 878081440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:33,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 04:48:37,189][26599] Updated weights for policy 0, policy_version 281414 (0.0039) [2024-06-19 04:48:38,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4610719744. Throughput: 0: 42210.2. Samples: 878337920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:38,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 04:48:40,918][26599] Updated weights for policy 0, policy_version 281424 (0.0024) [2024-06-19 04:48:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 4610932736. Throughput: 0: 42380.7. Samples: 878593040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:43,380][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 04:48:45,446][26599] Updated weights for policy 0, policy_version 281434 (0.0029) [2024-06-19 04:48:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 4611145728. Throughput: 0: 42228.0. Samples: 878714460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:48,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 04:48:48,749][26599] Updated weights for policy 0, policy_version 281444 (0.0034) [2024-06-19 04:48:53,190][26599] Updated weights for policy 0, policy_version 281454 (0.0028) [2024-06-19 04:48:53,380][26367] Fps is (10 sec: 40959.0, 60 sec: 42052.2, 300 sec: 42320.7). 
Total num frames: 4611342336. Throughput: 0: 42286.6. Samples: 878973360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 20.0) [2024-06-19 04:48:53,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 04:48:56,523][26599] Updated weights for policy 0, policy_version 281464 (0.0034) [2024-06-19 04:48:58,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4611571712. Throughput: 0: 42308.9. Samples: 879221480. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:48:58,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 04:49:01,080][26599] Updated weights for policy 0, policy_version 281474 (0.0043) [2024-06-19 04:49:03,384][26367] Fps is (10 sec: 45859.2, 60 sec: 42868.9, 300 sec: 42376.2). Total num frames: 4611801088. Throughput: 0: 42516.0. Samples: 879351700. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:03,384][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 04:49:04,521][26599] Updated weights for policy 0, policy_version 281484 (0.0035) [2024-06-19 04:49:08,380][26367] Fps is (10 sec: 39322.2, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 4611964928. Throughput: 0: 42495.2. Samples: 879608140. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:08,380][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 04:49:08,799][26599] Updated weights for policy 0, policy_version 281494 (0.0037) [2024-06-19 04:49:12,195][26599] Updated weights for policy 0, policy_version 281504 (0.0031) [2024-06-19 04:49:13,384][26367] Fps is (10 sec: 40959.8, 60 sec: 42322.8, 300 sec: 42375.7). Total num frames: 4612210688. Throughput: 0: 42150.2. Samples: 879852640. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:13,385][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 04:49:16,498][26599] Updated weights for policy 0, policy_version 281514 (0.0039) [2024-06-19 04:49:18,382][26367] Fps is (10 sec: 49143.0, 60 sec: 43145.9, 300 sec: 42487.1). 
Total num frames: 4612456448. Throughput: 0: 42427.2. Samples: 879990740. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:18,383][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 04:49:19,957][26599] Updated weights for policy 0, policy_version 281524 (0.0030) [2024-06-19 04:49:23,380][26367] Fps is (10 sec: 37696.6, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4612587520. Throughput: 0: 42215.4. Samples: 880237620. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:23,381][26367] Avg episode reward: [(0, '0.800')] [2024-06-19 04:49:24,118][26599] Updated weights for policy 0, policy_version 281534 (0.0032) [2024-06-19 04:49:27,459][26579] Signal inference workers to stop experience collection... (13100 times) [2024-06-19 04:49:27,459][26579] Signal inference workers to resume experience collection... (13100 times) [2024-06-19 04:49:27,494][26599] InferenceWorker_p0-w0: stopping experience collection (13100 times) [2024-06-19 04:49:27,494][26599] InferenceWorker_p0-w0: resuming experience collection (13100 times) [2024-06-19 04:49:27,609][26599] Updated weights for policy 0, policy_version 281544 (0.0040) [2024-06-19 04:49:28,380][26367] Fps is (10 sec: 37689.4, 60 sec: 42327.9, 300 sec: 42265.2). Total num frames: 4612833280. Throughput: 0: 41925.6. Samples: 880479700. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:28,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 04:49:32,223][26599] Updated weights for policy 0, policy_version 281554 (0.0042) [2024-06-19 04:49:33,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4613062656. Throughput: 0: 42324.3. Samples: 880619060. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:33,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 04:49:35,520][26599] Updated weights for policy 0, policy_version 281564 (0.0031) [2024-06-19 04:49:38,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41779.0, 300 sec: 42209.6). Total num frames: 4613226496. Throughput: 0: 41988.0. Samples: 880862820. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:38,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 04:49:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000281569_4613226496.pth... [2024-06-19 04:49:38,458][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000280951_4603101184.pth [2024-06-19 04:49:39,898][26599] Updated weights for policy 0, policy_version 281574 (0.0036) [2024-06-19 04:49:43,192][26599] Updated weights for policy 0, policy_version 281584 (0.0038) [2024-06-19 04:49:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4613472256. Throughput: 0: 41970.7. Samples: 881110160. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:43,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 04:49:47,631][26599] Updated weights for policy 0, policy_version 281594 (0.0039) [2024-06-19 04:49:48,380][26367] Fps is (10 sec: 45876.3, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4613685248. Throughput: 0: 42182.1. Samples: 881249740. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:48,380][26367] Avg episode reward: [(0, '0.829')] [2024-06-19 04:49:50,821][26599] Updated weights for policy 0, policy_version 281604 (0.0034) [2024-06-19 04:49:53,380][26367] Fps is (10 sec: 37682.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4613849088. Throughput: 0: 41963.8. Samples: 881496520. 
Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:53,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 04:49:55,266][26599] Updated weights for policy 0, policy_version 281614 (0.0032) [2024-06-19 04:49:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4614111232. Throughput: 0: 42120.4. Samples: 881747900. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:49:58,380][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 04:49:58,384][26599] Updated weights for policy 0, policy_version 281624 (0.0034) [2024-06-19 04:50:02,963][26599] Updated weights for policy 0, policy_version 281634 (0.0040) [2024-06-19 04:50:03,380][26367] Fps is (10 sec: 45875.0, 60 sec: 41781.6, 300 sec: 42265.2). Total num frames: 4614307840. Throughput: 0: 42155.3. Samples: 881887660. Policy #0 lag: (min: 1.0, avg: 12.3, max: 25.0) [2024-06-19 04:50:03,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 04:50:05,943][26599] Updated weights for policy 0, policy_version 281644 (0.0033) [2024-06-19 04:50:08,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4614488064. Throughput: 0: 42069.0. Samples: 882130720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:08,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 04:50:10,747][26599] Updated weights for policy 0, policy_version 281654 (0.0038) [2024-06-19 04:50:13,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42328.0, 300 sec: 42376.3). Total num frames: 4614750208. Throughput: 0: 42183.7. Samples: 882377960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:13,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 04:50:14,189][26599] Updated weights for policy 0, policy_version 281664 (0.0045) [2024-06-19 04:50:18,344][26599] Updated weights for policy 0, policy_version 281674 (0.0032) [2024-06-19 04:50:18,383][26367] Fps is (10 sec: 45862.7, 60 sec: 41505.4, 300 sec: 42320.3). Total num frames: 4614946816. Throughput: 0: 42237.0. Samples: 882519840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:18,383][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 04:50:22,110][26599] Updated weights for policy 0, policy_version 281684 (0.0041) [2024-06-19 04:50:23,380][26367] Fps is (10 sec: 37682.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4615127040. Throughput: 0: 42270.8. Samples: 882765000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:23,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 04:50:25,870][26599] Updated weights for policy 0, policy_version 281694 (0.0038) [2024-06-19 04:50:28,380][26367] Fps is (10 sec: 44249.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4615389184. Throughput: 0: 42229.4. Samples: 883010480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:28,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 04:50:29,668][26599] Updated weights for policy 0, policy_version 281704 (0.0035) [2024-06-19 04:50:31,780][26579] Signal inference workers to stop experience collection... (13150 times) [2024-06-19 04:50:31,811][26599] InferenceWorker_p0-w0: stopping experience collection (13150 times) [2024-06-19 04:50:31,896][26579] Signal inference workers to resume experience collection... (13150 times) [2024-06-19 04:50:31,896][26599] InferenceWorker_p0-w0: resuming experience collection (13150 times) [2024-06-19 04:50:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 4615553024. Throughput: 0: 42177.3. 
Samples: 883147720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:33,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 04:50:33,882][26599] Updated weights for policy 0, policy_version 281714 (0.0038) [2024-06-19 04:50:37,452][26599] Updated weights for policy 0, policy_version 281724 (0.0036) [2024-06-19 04:50:38,380][26367] Fps is (10 sec: 37682.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4615766016. Throughput: 0: 42178.2. Samples: 883394540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:38,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 04:50:41,746][26599] Updated weights for policy 0, policy_version 281734 (0.0043) [2024-06-19 04:50:43,380][26367] Fps is (10 sec: 47514.1, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4616028160. Throughput: 0: 42011.6. Samples: 883638420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:43,380][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 04:50:45,343][26599] Updated weights for policy 0, policy_version 281744 (0.0034) [2024-06-19 04:50:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 4616192000. Throughput: 0: 41932.2. Samples: 883774600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:48,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 04:50:49,412][26599] Updated weights for policy 0, policy_version 281754 (0.0047) [2024-06-19 04:50:53,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42598.6, 300 sec: 42320.7). Total num frames: 4616404992. Throughput: 0: 42061.9. Samples: 884023500. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:53,380][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 04:50:53,656][26599] Updated weights for policy 0, policy_version 281764 (0.0044) [2024-06-19 04:50:56,975][26599] Updated weights for policy 0, policy_version 281774 (0.0031) [2024-06-19 04:50:58,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4616650752. Throughput: 0: 42201.8. Samples: 884277040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:50:58,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 04:51:01,407][26599] Updated weights for policy 0, policy_version 281784 (0.0032) [2024-06-19 04:51:03,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 4616814592. Throughput: 0: 42076.8. Samples: 884413180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:51:03,382][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 04:51:04,613][26599] Updated weights for policy 0, policy_version 281794 (0.0034) [2024-06-19 04:51:08,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4617043968. Throughput: 0: 42066.2. Samples: 884657980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:51:08,381][26367] Avg episode reward: [(0, '0.732')] [2024-06-19 04:51:09,121][26599] Updated weights for policy 0, policy_version 281804 (0.0032) [2024-06-19 04:51:12,495][26599] Updated weights for policy 0, policy_version 281814 (0.0036) [2024-06-19 04:51:13,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4617289728. Throughput: 0: 42204.0. Samples: 884909660. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:51:13,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 04:51:16,858][26599] Updated weights for policy 0, policy_version 281824 (0.0049) [2024-06-19 04:51:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41508.0, 300 sec: 42098.5). Total num frames: 4617437184. Throughput: 0: 42082.2. Samples: 885041420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 04:51:18,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 04:51:20,093][26599] Updated weights for policy 0, policy_version 281834 (0.0040) [2024-06-19 04:51:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4617699328. Throughput: 0: 42108.1. Samples: 885289400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:51:23,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 04:51:24,536][26599] Updated weights for policy 0, policy_version 281844 (0.0028) [2024-06-19 04:51:27,939][26599] Updated weights for policy 0, policy_version 281854 (0.0038) [2024-06-19 04:51:28,380][26367] Fps is (10 sec: 47513.6, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4617912320. Throughput: 0: 42361.2. Samples: 885544680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:51:28,381][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 04:51:32,143][26599] Updated weights for policy 0, policy_version 281864 (0.0023) [2024-06-19 04:51:33,380][26367] Fps is (10 sec: 37682.9, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4618076160. Throughput: 0: 42192.8. Samples: 885673280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:51:33,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 04:51:36,013][26599] Updated weights for policy 0, policy_version 281874 (0.0036) [2024-06-19 04:51:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4618338304. Throughput: 0: 42217.1. Samples: 885923280. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:51:38,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 04:51:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000281881_4618338304.pth... [2024-06-19 04:51:38,466][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000281263_4608212992.pth [2024-06-19 04:51:40,629][26599] Updated weights for policy 0, policy_version 281884 (0.0054) [2024-06-19 04:51:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41506.0, 300 sec: 42265.1). Total num frames: 4618518528. Throughput: 0: 42239.4. Samples: 886177820. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:51:43,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 04:51:43,913][26599] Updated weights for policy 0, policy_version 281894 (0.0036) [2024-06-19 04:51:48,210][26599] Updated weights for policy 0, policy_version 281904 (0.0030) [2024-06-19 04:51:48,380][26367] Fps is (10 sec: 37684.0, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4618715136. Throughput: 0: 41905.0. Samples: 886298900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:51:48,380][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 04:51:51,296][26579] Signal inference workers to stop experience collection... (13200 times) [2024-06-19 04:51:51,296][26579] Signal inference workers to resume experience collection... (13200 times) [2024-06-19 04:51:51,314][26599] InferenceWorker_p0-w0: stopping experience collection (13200 times) [2024-06-19 04:51:51,314][26599] InferenceWorker_p0-w0: resuming experience collection (13200 times) [2024-06-19 04:51:51,642][26599] Updated weights for policy 0, policy_version 281914 (0.0032) [2024-06-19 04:51:53,384][26367] Fps is (10 sec: 45859.4, 60 sec: 42868.8, 300 sec: 42320.2). Total num frames: 4618977280. Throughput: 0: 42046.9. Samples: 886550240. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:51:53,384][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 04:51:55,822][26599] Updated weights for policy 0, policy_version 281924 (0.0033) [2024-06-19 04:51:58,382][26367] Fps is (10 sec: 42589.9, 60 sec: 41504.8, 300 sec: 42209.3). Total num frames: 4619141120. Throughput: 0: 42232.9. Samples: 886810220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:51:58,383][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 04:51:59,351][26599] Updated weights for policy 0, policy_version 281934 (0.0035) [2024-06-19 04:52:03,380][26367] Fps is (10 sec: 36058.1, 60 sec: 42052.4, 300 sec: 42098.6). Total num frames: 4619337728. Throughput: 0: 41941.5. Samples: 886928780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:52:03,380][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 04:52:03,920][26599] Updated weights for policy 0, policy_version 281944 (0.0031) [2024-06-19 04:52:07,029][26599] Updated weights for policy 0, policy_version 281954 (0.0037) [2024-06-19 04:52:08,380][26367] Fps is (10 sec: 45884.6, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4619599872. Throughput: 0: 42097.5. Samples: 887183780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:52:08,380][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 04:52:11,663][26599] Updated weights for policy 0, policy_version 281964 (0.0037) [2024-06-19 04:52:13,384][26367] Fps is (10 sec: 40944.5, 60 sec: 40957.5, 300 sec: 42098.0). Total num frames: 4619747328. Throughput: 0: 42082.4. Samples: 887438540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:52:13,385][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 04:52:14,971][26599] Updated weights for policy 0, policy_version 281974 (0.0027) [2024-06-19 04:52:18,380][26367] Fps is (10 sec: 39320.5, 60 sec: 42598.3, 300 sec: 42098.5). Total num frames: 4619993088. Throughput: 0: 41867.5. Samples: 887557320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:52:18,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 04:52:19,263][26599] Updated weights for policy 0, policy_version 281984 (0.0034) [2024-06-19 04:52:22,539][26599] Updated weights for policy 0, policy_version 281994 (0.0023) [2024-06-19 04:52:23,380][26367] Fps is (10 sec: 45892.1, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 4620206080. Throughput: 0: 41929.9. Samples: 887810120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:52:23,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 04:52:26,860][26599] Updated weights for policy 0, policy_version 282004 (0.0032) [2024-06-19 04:52:28,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41233.1, 300 sec: 42154.1). Total num frames: 4620386304. Throughput: 0: 42155.2. Samples: 888074800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 04:52:28,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 04:52:30,169][26599] Updated weights for policy 0, policy_version 282014 (0.0035) [2024-06-19 04:52:33,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42595.9, 300 sec: 42153.5). Total num frames: 4620632064. Throughput: 0: 42098.3. Samples: 888193480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:52:33,385][26367] Avg episode reward: [(0, '0.843')] [2024-06-19 04:52:34,885][26599] Updated weights for policy 0, policy_version 282024 (0.0044) [2024-06-19 04:52:38,024][26599] Updated weights for policy 0, policy_version 282034 (0.0034) [2024-06-19 04:52:38,380][26367] Fps is (10 sec: 45875.7, 60 sec: 41779.3, 300 sec: 42209.7). Total num frames: 4620845056. Throughput: 0: 42269.7. Samples: 888452220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:52:38,380][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 04:52:42,367][26599] Updated weights for policy 0, policy_version 282044 (0.0033) [2024-06-19 04:52:43,380][26367] Fps is (10 sec: 39336.2, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 4621025280. Throughput: 0: 42353.8. Samples: 888716060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:52:43,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 04:52:45,439][26599] Updated weights for policy 0, policy_version 282054 (0.0038) [2024-06-19 04:52:48,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 4621287424. Throughput: 0: 42416.7. Samples: 888837540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:52:48,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 04:52:49,734][26599] Updated weights for policy 0, policy_version 282064 (0.0039) [2024-06-19 04:52:53,022][26599] Updated weights for policy 0, policy_version 282074 (0.0034) [2024-06-19 04:52:53,380][26367] Fps is (10 sec: 47512.9, 60 sec: 42054.7, 300 sec: 42320.7). Total num frames: 4621500416. Throughput: 0: 42507.8. Samples: 889096640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:52:53,381][26367] Avg episode reward: [(0, '0.825')] [2024-06-19 04:52:57,461][26599] Updated weights for policy 0, policy_version 282084 (0.0034) [2024-06-19 04:52:58,380][26367] Fps is (10 sec: 37683.7, 60 sec: 42053.7, 300 sec: 42154.1). Total num frames: 4621664256. Throughput: 0: 42576.0. Samples: 889354300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:52:58,380][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 04:53:00,588][26579] Signal inference workers to stop experience collection... (13250 times) [2024-06-19 04:53:00,592][26579] Signal inference workers to resume experience collection... 
(13250 times) [2024-06-19 04:53:00,634][26599] InferenceWorker_p0-w0: stopping experience collection (13250 times) [2024-06-19 04:53:00,634][26599] InferenceWorker_p0-w0: resuming experience collection (13250 times) [2024-06-19 04:53:00,737][26599] Updated weights for policy 0, policy_version 282094 (0.0050) [2024-06-19 04:53:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 43144.4, 300 sec: 42265.2). Total num frames: 4621926400. Throughput: 0: 42618.8. Samples: 889475160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:53:03,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 04:53:05,040][26599] Updated weights for policy 0, policy_version 282104 (0.0032) [2024-06-19 04:53:08,380][26367] Fps is (10 sec: 47513.2, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 4622139392. Throughput: 0: 42840.0. Samples: 889737920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:53:08,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 04:53:08,467][26599] Updated weights for policy 0, policy_version 282114 (0.0032) [2024-06-19 04:53:12,650][26599] Updated weights for policy 0, policy_version 282124 (0.0038) [2024-06-19 04:53:13,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42874.1, 300 sec: 42210.1). Total num frames: 4622319616. Throughput: 0: 42370.7. Samples: 889981480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:53:13,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 04:53:16,585][26599] Updated weights for policy 0, policy_version 282134 (0.0035) [2024-06-19 04:53:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4622532608. Throughput: 0: 42549.2. Samples: 890108040. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:53:18,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 04:53:20,871][26599] Updated weights for policy 0, policy_version 282144 (0.0045) [2024-06-19 04:53:23,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42265.7). Total num frames: 4622761984. Throughput: 0: 42528.8. Samples: 890366020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:53:23,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 04:53:24,366][26599] Updated weights for policy 0, policy_version 282154 (0.0033) [2024-06-19 04:53:28,313][26599] Updated weights for policy 0, policy_version 282164 (0.0042) [2024-06-19 04:53:28,380][26367] Fps is (10 sec: 44237.6, 60 sec: 43144.6, 300 sec: 42209.6). Total num frames: 4622974976. Throughput: 0: 42338.7. Samples: 890621300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:53:28,380][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 04:53:31,841][26599] Updated weights for policy 0, policy_version 282174 (0.0036) [2024-06-19 04:53:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42601.0, 300 sec: 42265.2). Total num frames: 4623187968. Throughput: 0: 42468.1. Samples: 890748600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:53:33,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 04:53:35,793][26599] Updated weights for policy 0, policy_version 282184 (0.0041) [2024-06-19 04:53:38,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42322.7, 300 sec: 42209.1). Total num frames: 4623384576. Throughput: 0: 42421.1. Samples: 891005740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:53:38,384][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 04:53:38,439][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000282190_4623400960.pth... 
[2024-06-19 04:53:38,484][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000281569_4613226496.pth [2024-06-19 04:53:39,401][26599] Updated weights for policy 0, policy_version 282194 (0.0028) [2024-06-19 04:53:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42265.1). Total num frames: 4623613952. Throughput: 0: 42121.6. Samples: 891249780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 04:53:43,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 04:53:43,947][26599] Updated weights for policy 0, policy_version 282204 (0.0038) [2024-06-19 04:53:47,602][26599] Updated weights for policy 0, policy_version 282214 (0.0032) [2024-06-19 04:53:48,380][26367] Fps is (10 sec: 44252.1, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4623826944. Throughput: 0: 42352.3. Samples: 891381020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:53:48,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 04:53:51,585][26599] Updated weights for policy 0, policy_version 282224 (0.0048) [2024-06-19 04:53:53,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4624023552. Throughput: 0: 42143.0. Samples: 891634360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:53:53,381][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 04:53:55,506][26599] Updated weights for policy 0, policy_version 282234 (0.0031) [2024-06-19 04:53:58,380][26367] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42099.1). Total num frames: 4624220160. Throughput: 0: 42190.3. Samples: 891880040. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:53:58,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 04:53:59,448][26599] Updated weights for policy 0, policy_version 282244 (0.0029) [2024-06-19 04:54:03,102][26599] Updated weights for policy 0, policy_version 282254 (0.0038) [2024-06-19 04:54:03,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4624449536. Throughput: 0: 42292.6. Samples: 892011200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:03,380][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 04:54:07,358][26599] Updated weights for policy 0, policy_version 282264 (0.0031) [2024-06-19 04:54:08,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41779.1, 300 sec: 42154.6). Total num frames: 4624646144. Throughput: 0: 42175.9. Samples: 892263940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:08,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 04:54:10,852][26599] Updated weights for policy 0, policy_version 282274 (0.0037) [2024-06-19 04:54:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42043.3). Total num frames: 4624859136. Throughput: 0: 42220.0. Samples: 892521200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:13,380][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 04:54:14,876][26599] Updated weights for policy 0, policy_version 282284 (0.0051) [2024-06-19 04:54:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4625072128. Throughput: 0: 42116.3. Samples: 892643840. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:18,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 04:54:18,577][26599] Updated weights for policy 0, policy_version 282294 (0.0040) [2024-06-19 04:54:22,582][26599] Updated weights for policy 0, policy_version 282304 (0.0031) [2024-06-19 04:54:23,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4625301504. Throughput: 0: 42121.6. Samples: 892901060. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:23,381][26367] Avg episode reward: [(0, '0.387')] [2024-06-19 04:54:25,777][26579] Signal inference workers to stop experience collection... (13300 times) [2024-06-19 04:54:25,817][26599] InferenceWorker_p0-w0: stopping experience collection (13300 times) [2024-06-19 04:54:25,827][26579] Signal inference workers to resume experience collection... (13300 times) [2024-06-19 04:54:25,832][26599] InferenceWorker_p0-w0: resuming experience collection (13300 times) [2024-06-19 04:54:26,357][26599] Updated weights for policy 0, policy_version 282314 (0.0036) [2024-06-19 04:54:28,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4625514496. Throughput: 0: 42344.8. Samples: 893155300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:28,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 04:54:30,375][26599] Updated weights for policy 0, policy_version 282324 (0.0030) [2024-06-19 04:54:33,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 4625711104. Throughput: 0: 42271.1. Samples: 893283220. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:33,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 04:54:34,051][26599] Updated weights for policy 0, policy_version 282334 (0.0028) [2024-06-19 04:54:37,979][26599] Updated weights for policy 0, policy_version 282344 (0.0035) [2024-06-19 04:54:38,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42327.9, 300 sec: 42209.6). Total num frames: 4625924096. Throughput: 0: 42374.4. Samples: 893541200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:38,381][26367] Avg episode reward: [(0, '0.318')] [2024-06-19 04:54:41,852][26599] Updated weights for policy 0, policy_version 282354 (0.0026) [2024-06-19 04:54:43,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 4626153472. Throughput: 0: 42504.7. Samples: 893792760. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:43,381][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 04:54:45,825][26599] Updated weights for policy 0, policy_version 282364 (0.0045) [2024-06-19 04:54:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 4626350080. Throughput: 0: 42482.1. Samples: 893922900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:48,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 04:54:49,571][26599] Updated weights for policy 0, policy_version 282374 (0.0036) [2024-06-19 04:54:53,323][26599] Updated weights for policy 0, policy_version 282384 (0.0028) [2024-06-19 04:54:53,382][26367] Fps is (10 sec: 42593.2, 60 sec: 42597.6, 300 sec: 42265.0). Total num frames: 4626579456. Throughput: 0: 42544.2. Samples: 894178480. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 04:54:53,382][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 04:54:57,079][26599] Updated weights for policy 0, policy_version 282394 (0.0038) [2024-06-19 04:54:58,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42868.8, 300 sec: 42320.2). Total num frames: 4626792448. Throughput: 0: 42405.8. Samples: 894429620. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:54:58,385][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 04:55:01,202][26599] Updated weights for policy 0, policy_version 282404 (0.0028) [2024-06-19 04:55:03,380][26367] Fps is (10 sec: 40965.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4626989056. Throughput: 0: 42610.3. Samples: 894561300. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:03,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 04:55:04,872][26599] Updated weights for policy 0, policy_version 282414 (0.0028) [2024-06-19 04:55:08,380][26367] Fps is (10 sec: 37697.0, 60 sec: 42052.3, 300 sec: 42098.5). Total num frames: 4627169280. Throughput: 0: 42431.5. Samples: 894810480. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:08,384][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 04:55:09,026][26599] Updated weights for policy 0, policy_version 282424 (0.0039) [2024-06-19 04:55:12,802][26599] Updated weights for policy 0, policy_version 282434 (0.0029) [2024-06-19 04:55:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42265.6). Total num frames: 4627415040. Throughput: 0: 42423.3. Samples: 895064340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:13,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 04:55:17,117][26599] Updated weights for policy 0, policy_version 282444 (0.0036) [2024-06-19 04:55:18,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42598.6, 300 sec: 42376.3). Total num frames: 4627628032. Throughput: 0: 42548.3. Samples: 895197880. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:18,380][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 04:55:20,450][26599] Updated weights for policy 0, policy_version 282454 (0.0031) [2024-06-19 04:55:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4627824640. Throughput: 0: 42305.3. Samples: 895444940. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:23,381][26367] Avg episode reward: [(0, '0.349')] [2024-06-19 04:55:25,009][26599] Updated weights for policy 0, policy_version 282464 (0.0040) [2024-06-19 04:55:28,138][26599] Updated weights for policy 0, policy_version 282474 (0.0037) [2024-06-19 04:55:28,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4628054016. Throughput: 0: 42393.4. Samples: 895700460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:28,381][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 04:55:32,720][26599] Updated weights for policy 0, policy_version 282484 (0.0037) [2024-06-19 04:55:33,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 4628250624. Throughput: 0: 42447.7. Samples: 895833040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:33,380][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 04:55:35,905][26599] Updated weights for policy 0, policy_version 282494 (0.0043) [2024-06-19 04:55:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4628447232. Throughput: 0: 42203.4. Samples: 896077580. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:38,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 04:55:38,388][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000282498_4628447232.pth... 
[2024-06-19 04:55:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000281881_4618338304.pth [2024-06-19 04:55:40,293][26599] Updated weights for policy 0, policy_version 282504 (0.0041) [2024-06-19 04:55:41,021][26579] Signal inference workers to stop experience collection... (13350 times) [2024-06-19 04:55:41,028][26579] Signal inference workers to resume experience collection... (13350 times) [2024-06-19 04:55:41,038][26599] InferenceWorker_p0-w0: stopping experience collection (13350 times) [2024-06-19 04:55:41,062][26599] InferenceWorker_p0-w0: resuming experience collection (13350 times) [2024-06-19 04:55:43,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4628676608. Throughput: 0: 42376.3. Samples: 896336400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:43,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 04:55:43,538][26599] Updated weights for policy 0, policy_version 282514 (0.0051) [2024-06-19 04:55:48,017][26599] Updated weights for policy 0, policy_version 282524 (0.0028) [2024-06-19 04:55:48,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4628889600. Throughput: 0: 42359.0. Samples: 896467460. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:48,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 04:55:51,395][26599] Updated weights for policy 0, policy_version 282534 (0.0039) [2024-06-19 04:55:53,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42053.2, 300 sec: 42209.6). Total num frames: 4629102592. Throughput: 0: 42397.0. Samples: 896718340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:53,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 04:55:55,780][26599] Updated weights for policy 0, policy_version 282544 (0.0033) [2024-06-19 04:55:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42054.9, 300 sec: 42376.3). Total num frames: 4629315584. 
Throughput: 0: 42363.1. Samples: 896970680. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:55:58,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 04:55:59,251][26599] Updated weights for policy 0, policy_version 282554 (0.0036) [2024-06-19 04:56:03,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4629512192. Throughput: 0: 42136.9. Samples: 897094040. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:56:03,380][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 04:56:03,387][26599] Updated weights for policy 0, policy_version 282564 (0.0032) [2024-06-19 04:56:07,114][26599] Updated weights for policy 0, policy_version 282574 (0.0031) [2024-06-19 04:56:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 4629741568. Throughput: 0: 42275.5. Samples: 897347340. Policy #0 lag: (min: 1.0, avg: 10.0, max: 21.0) [2024-06-19 04:56:08,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 04:56:11,219][26599] Updated weights for policy 0, policy_version 282584 (0.0041) [2024-06-19 04:56:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 4629938176. Throughput: 0: 42220.6. Samples: 897600380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:13,380][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 04:56:14,734][26599] Updated weights for policy 0, policy_version 282594 (0.0029) [2024-06-19 04:56:18,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4630151168. Throughput: 0: 42113.8. Samples: 897728160. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:18,380][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 04:56:18,668][26599] Updated weights for policy 0, policy_version 282604 (0.0043) [2024-06-19 04:56:22,354][26599] Updated weights for policy 0, policy_version 282614 (0.0037) [2024-06-19 04:56:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4630380544. Throughput: 0: 42483.7. Samples: 897989340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:23,380][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 04:56:26,459][26599] Updated weights for policy 0, policy_version 282624 (0.0024) [2024-06-19 04:56:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4630593536. Throughput: 0: 42436.2. Samples: 898246020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:28,380][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 04:56:30,212][26599] Updated weights for policy 0, policy_version 282634 (0.0034) [2024-06-19 04:56:33,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 4630806528. Throughput: 0: 42343.5. Samples: 898372920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:33,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 04:56:34,277][26599] Updated weights for policy 0, policy_version 282644 (0.0049) [2024-06-19 04:56:38,018][26599] Updated weights for policy 0, policy_version 282654 (0.0034) [2024-06-19 04:56:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42376.3). Total num frames: 4631019520. Throughput: 0: 42468.9. Samples: 898629440. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:38,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 04:56:42,104][26599] Updated weights for policy 0, policy_version 282664 (0.0038) [2024-06-19 04:56:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42431.7). Total num frames: 4631232512. Throughput: 0: 42437.2. Samples: 898880360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:43,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 04:56:45,794][26599] Updated weights for policy 0, policy_version 282674 (0.0034) [2024-06-19 04:56:48,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42210.1). Total num frames: 4631429120. Throughput: 0: 42572.2. Samples: 899009800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:48,381][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 04:56:49,814][26599] Updated weights for policy 0, policy_version 282684 (0.0031) [2024-06-19 04:56:53,384][26367] Fps is (10 sec: 40945.6, 60 sec: 42322.7, 300 sec: 42376.0). Total num frames: 4631642112. Throughput: 0: 42577.5. Samples: 899263480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:53,384][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 04:56:53,427][26599] Updated weights for policy 0, policy_version 282694 (0.0034) [2024-06-19 04:56:57,778][26599] Updated weights for policy 0, policy_version 282704 (0.0034) [2024-06-19 04:56:58,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4631838720. Throughput: 0: 42668.9. Samples: 899520480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:56:58,380][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 04:57:01,226][26599] Updated weights for policy 0, policy_version 282714 (0.0024) [2024-06-19 04:57:03,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 4632068096. Throughput: 0: 42534.2. Samples: 899642200. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:57:03,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 04:57:05,464][26599] Updated weights for policy 0, policy_version 282724 (0.0031) [2024-06-19 04:57:08,380][26367] Fps is (10 sec: 45874.2, 60 sec: 42598.3, 300 sec: 42543.4). Total num frames: 4632297472. Throughput: 0: 42430.5. Samples: 899898720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:57:08,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 04:57:08,796][26599] Updated weights for policy 0, policy_version 282734 (0.0039) [2024-06-19 04:57:13,073][26599] Updated weights for policy 0, policy_version 282744 (0.0044) [2024-06-19 04:57:13,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4632477696. Throughput: 0: 42418.0. Samples: 900154840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:57:13,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 04:57:14,394][26579] Signal inference workers to stop experience collection... (13400 times) [2024-06-19 04:57:14,394][26579] Signal inference workers to resume experience collection... (13400 times) [2024-06-19 04:57:14,417][26599] InferenceWorker_p0-w0: stopping experience collection (13400 times) [2024-06-19 04:57:14,417][26599] InferenceWorker_p0-w0: resuming experience collection (13400 times) [2024-06-19 04:57:16,623][26599] Updated weights for policy 0, policy_version 282754 (0.0038) [2024-06-19 04:57:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4632690688. Throughput: 0: 42246.2. Samples: 900274000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 04:57:18,381][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 04:57:20,857][26599] Updated weights for policy 0, policy_version 282764 (0.0048) [2024-06-19 04:57:23,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4632920064. Throughput: 0: 42174.2. 
Samples: 900527280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:57:23,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 04:57:24,546][26599] Updated weights for policy 0, policy_version 282774 (0.0042) [2024-06-19 04:57:28,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.2, 300 sec: 42321.2). Total num frames: 4633116672. Throughput: 0: 42147.7. Samples: 900777000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:57:28,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 04:57:28,433][26599] Updated weights for policy 0, policy_version 282784 (0.0036) [2024-06-19 04:57:32,176][26599] Updated weights for policy 0, policy_version 282794 (0.0041) [2024-06-19 04:57:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4633346048. Throughput: 0: 42058.3. Samples: 900902420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:57:33,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 04:57:35,933][26599] Updated weights for policy 0, policy_version 282804 (0.0038) [2024-06-19 04:57:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 4633542656. Throughput: 0: 42175.9. Samples: 901161240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:57:38,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 04:57:38,534][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000282810_4633559040.pth... [2024-06-19 04:57:38,607][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000282190_4623400960.pth [2024-06-19 04:57:39,856][26599] Updated weights for policy 0, policy_version 282814 (0.0041) [2024-06-19 04:57:43,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 4633772032. Throughput: 0: 41915.1. Samples: 901406660. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:57:43,380][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 04:57:43,559][26599] Updated weights for policy 0, policy_version 282824 (0.0038) [2024-06-19 04:57:48,067][26599] Updated weights for policy 0, policy_version 282834 (0.0042) [2024-06-19 04:57:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4633968640. Throughput: 0: 42134.6. Samples: 901538260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:57:48,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 04:57:51,838][26599] Updated weights for policy 0, policy_version 282844 (0.0032) [2024-06-19 04:57:53,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42054.8, 300 sec: 42376.2). Total num frames: 4634165248. Throughput: 0: 42068.1. Samples: 901791780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:57:53,384][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 04:57:55,820][26599] Updated weights for policy 0, policy_version 282854 (0.0046) [2024-06-19 04:57:58,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4634411008. Throughput: 0: 42017.9. Samples: 902045640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:57:58,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 04:57:59,359][26599] Updated weights for policy 0, policy_version 282864 (0.0032) [2024-06-19 04:58:03,346][26599] Updated weights for policy 0, policy_version 282874 (0.0035) [2024-06-19 04:58:03,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42322.7, 300 sec: 42264.6). Total num frames: 4634607616. Throughput: 0: 42271.8. Samples: 902176380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:58:03,385][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 04:58:06,834][26599] Updated weights for policy 0, policy_version 282884 (0.0032) [2024-06-19 04:58:08,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 42320.7). 
Total num frames: 4634804224. Throughput: 0: 42200.4. Samples: 902426300. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:58:08,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 04:58:11,392][26599] Updated weights for policy 0, policy_version 282894 (0.0036) [2024-06-19 04:58:13,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 4635033600. Throughput: 0: 42247.6. Samples: 902678140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:58:13,380][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 04:58:14,743][26599] Updated weights for policy 0, policy_version 282904 (0.0039) [2024-06-19 04:58:18,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42322.8, 300 sec: 42264.6). Total num frames: 4635230208. Throughput: 0: 42365.9. Samples: 902809040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:58:18,384][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 04:58:18,918][26599] Updated weights for policy 0, policy_version 282914 (0.0021) [2024-06-19 04:58:22,317][26599] Updated weights for policy 0, policy_version 282924 (0.0031) [2024-06-19 04:58:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4635443200. Throughput: 0: 42216.9. Samples: 903061000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:58:23,380][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 04:58:26,408][26599] Updated weights for policy 0, policy_version 282934 (0.0028) [2024-06-19 04:58:28,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4635656192. Throughput: 0: 42430.1. Samples: 903316020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:58:28,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 04:58:30,202][26599] Updated weights for policy 0, policy_version 282944 (0.0048) [2024-06-19 04:58:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42321.2). 
Total num frames: 4635869184. Throughput: 0: 42401.8. Samples: 903446340. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 04:58:33,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 04:58:34,078][26599] Updated weights for policy 0, policy_version 282954 (0.0025) [2024-06-19 04:58:37,744][26579] Signal inference workers to stop experience collection... (13450 times) [2024-06-19 04:58:37,784][26599] InferenceWorker_p0-w0: stopping experience collection (13450 times) [2024-06-19 04:58:37,790][26579] Signal inference workers to resume experience collection... (13450 times) [2024-06-19 04:58:37,796][26599] InferenceWorker_p0-w0: resuming experience collection (13450 times) [2024-06-19 04:58:37,933][26599] Updated weights for policy 0, policy_version 282964 (0.0039) [2024-06-19 04:58:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 4636082176. Throughput: 0: 42294.1. Samples: 903695020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:58:38,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 04:58:41,800][26599] Updated weights for policy 0, policy_version 282974 (0.0032) [2024-06-19 04:58:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4636278784. Throughput: 0: 42339.4. Samples: 903950920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:58:43,383][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 04:58:45,665][26599] Updated weights for policy 0, policy_version 282984 (0.0030) [2024-06-19 04:58:48,380][26367] Fps is (10 sec: 40961.0, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4636491776. Throughput: 0: 42195.5. Samples: 904075020. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:58:48,380][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 04:58:49,392][26599] Updated weights for policy 0, policy_version 282994 (0.0040) [2024-06-19 04:58:53,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4636721152. Throughput: 0: 42269.9. Samples: 904328440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:58:53,380][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 04:58:53,483][26599] Updated weights for policy 0, policy_version 283004 (0.0034) [2024-06-19 04:58:57,294][26599] Updated weights for policy 0, policy_version 283014 (0.0038) [2024-06-19 04:58:58,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 4636917760. Throughput: 0: 42400.9. Samples: 904586180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:58:58,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 04:59:01,346][26599] Updated weights for policy 0, policy_version 283024 (0.0034) [2024-06-19 04:59:03,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42327.9, 300 sec: 42376.3). Total num frames: 4637147136. Throughput: 0: 42333.6. Samples: 904713900. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:59:03,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 04:59:05,648][26599] Updated weights for policy 0, policy_version 283034 (0.0042) [2024-06-19 04:59:08,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4637360128. Throughput: 0: 42275.5. Samples: 904963400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:59:08,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 04:59:09,316][26599] Updated weights for policy 0, policy_version 283044 (0.0038) [2024-06-19 04:59:13,364][26599] Updated weights for policy 0, policy_version 283054 (0.0045) [2024-06-19 04:59:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42320.7). 
Total num frames: 4637556736. Throughput: 0: 42186.7. Samples: 905214420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:59:13,383][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 04:59:17,149][26599] Updated weights for policy 0, policy_version 283064 (0.0040) [2024-06-19 04:59:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42874.1, 300 sec: 42376.2). Total num frames: 4637802496. Throughput: 0: 42091.1. Samples: 905340440. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:59:18,384][26367] Avg episode reward: [(0, '0.833')] [2024-06-19 04:59:21,122][26599] Updated weights for policy 0, policy_version 283074 (0.0027) [2024-06-19 04:59:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42209.7). Total num frames: 4637966336. Throughput: 0: 42077.1. Samples: 905588480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:59:23,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 04:59:25,012][26599] Updated weights for policy 0, policy_version 283084 (0.0032) [2024-06-19 04:59:28,380][26367] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4638179328. Throughput: 0: 42148.6. Samples: 905847600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:59:28,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 04:59:28,834][26599] Updated weights for policy 0, policy_version 283094 (0.0029) [2024-06-19 04:59:32,616][26599] Updated weights for policy 0, policy_version 283104 (0.0033) [2024-06-19 04:59:33,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42322.8, 300 sec: 42320.2). Total num frames: 4638408704. Throughput: 0: 42093.4. Samples: 905969380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:59:33,384][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 04:59:36,503][26599] Updated weights for policy 0, policy_version 283114 (0.0035) [2024-06-19 04:59:38,381][26367] Fps is (10 sec: 42597.1, 60 sec: 42052.2, 300 sec: 42209.6). 
Total num frames: 4638605312. Throughput: 0: 42081.0. Samples: 906222100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:59:38,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 04:59:38,493][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000283119_4638621696.pth... [2024-06-19 04:59:38,553][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000282498_4628447232.pth [2024-06-19 04:59:40,238][26599] Updated weights for policy 0, policy_version 283124 (0.0024) [2024-06-19 04:59:43,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4638818304. Throughput: 0: 42007.5. Samples: 906476520. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 04:59:43,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 04:59:44,222][26599] Updated weights for policy 0, policy_version 283134 (0.0039) [2024-06-19 04:59:47,954][26599] Updated weights for policy 0, policy_version 283144 (0.0029) [2024-06-19 04:59:48,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.3, 300 sec: 42265.3). Total num frames: 4639047680. Throughput: 0: 41956.4. Samples: 906601940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 04:59:48,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 04:59:51,858][26599] Updated weights for policy 0, policy_version 283154 (0.0031) [2024-06-19 04:59:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42210.2). Total num frames: 4639244288. Throughput: 0: 42056.0. Samples: 906855920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 04:59:53,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 04:59:55,554][26599] Updated weights for policy 0, policy_version 283164 (0.0027) [2024-06-19 04:59:57,376][26579] Signal inference workers to stop experience collection... (13500 times) [2024-06-19 04:59:57,377][26579] Signal inference workers to resume experience collection... 
(13500 times) [2024-06-19 04:59:57,408][26599] InferenceWorker_p0-w0: stopping experience collection (13500 times) [2024-06-19 04:59:57,408][26599] InferenceWorker_p0-w0: resuming experience collection (13500 times) [2024-06-19 04:59:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4639457280. Throughput: 0: 42122.3. Samples: 907109920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 04:59:58,381][26367] Avg episode reward: [(0, '0.387')] [2024-06-19 04:59:59,435][26599] Updated weights for policy 0, policy_version 283174 (0.0031) [2024-06-19 05:00:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 4639670272. Throughput: 0: 42230.8. Samples: 907240820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:03,380][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 05:00:03,740][26599] Updated weights for policy 0, policy_version 283184 (0.0031) [2024-06-19 05:00:07,220][26599] Updated weights for policy 0, policy_version 283194 (0.0046) [2024-06-19 05:00:08,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41776.7, 300 sec: 42209.1). Total num frames: 4639866880. Throughput: 0: 42281.9. Samples: 907491320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:08,384][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 05:00:11,704][26599] Updated weights for policy 0, policy_version 283204 (0.0042) [2024-06-19 05:00:13,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 4640096256. Throughput: 0: 42074.5. Samples: 907740960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:13,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 05:00:15,048][26599] Updated weights for policy 0, policy_version 283214 (0.0028) [2024-06-19 05:00:18,380][26367] Fps is (10 sec: 42613.7, 60 sec: 41506.2, 300 sec: 42265.2). Total num frames: 4640292864. Throughput: 0: 42378.1. Samples: 907876240. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:18,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 05:00:18,969][26599] Updated weights for policy 0, policy_version 283224 (0.0032) [2024-06-19 05:00:22,646][26599] Updated weights for policy 0, policy_version 283234 (0.0031) [2024-06-19 05:00:23,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 4640522240. Throughput: 0: 42335.3. Samples: 908127180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:23,381][26367] Avg episode reward: [(0, '0.366')] [2024-06-19 05:00:26,713][26599] Updated weights for policy 0, policy_version 283244 (0.0025) [2024-06-19 05:00:28,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4640735232. Throughput: 0: 42343.4. Samples: 908381980. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:28,381][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 05:00:30,266][26599] Updated weights for policy 0, policy_version 283254 (0.0034) [2024-06-19 05:00:33,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41781.8, 300 sec: 42265.2). Total num frames: 4640915456. Throughput: 0: 42425.1. Samples: 908511060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:33,380][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 05:00:34,485][26599] Updated weights for policy 0, policy_version 283264 (0.0038) [2024-06-19 05:00:38,185][26599] Updated weights for policy 0, policy_version 283274 (0.0037) [2024-06-19 05:00:38,384][26367] Fps is (10 sec: 42583.6, 60 sec: 42596.0, 300 sec: 42320.2). Total num frames: 4641161216. Throughput: 0: 42422.8. Samples: 908765100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:38,384][26367] Avg episode reward: [(0, '0.317')] [2024-06-19 05:00:42,423][26599] Updated weights for policy 0, policy_version 283284 (0.0037) [2024-06-19 05:00:43,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4641374208. Throughput: 0: 42292.1. Samples: 909013060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:43,380][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 05:00:45,923][26599] Updated weights for policy 0, policy_version 283294 (0.0026) [2024-06-19 05:00:48,380][26367] Fps is (10 sec: 40975.1, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4641570816. Throughput: 0: 42116.0. Samples: 909136040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:48,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 05:00:50,299][26599] Updated weights for policy 0, policy_version 283304 (0.0037) [2024-06-19 05:00:53,357][26599] Updated weights for policy 0, policy_version 283314 (0.0041) [2024-06-19 05:00:53,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 4641816576. Throughput: 0: 42341.1. Samples: 909396520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:53,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 05:00:57,998][26599] Updated weights for policy 0, policy_version 283324 (0.0023) [2024-06-19 05:00:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4641996800. Throughput: 0: 42465.0. Samples: 909651880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:00:58,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 05:01:01,418][26599] Updated weights for policy 0, policy_version 283334 (0.0035) [2024-06-19 05:01:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4642209792. Throughput: 0: 42183.6. Samples: 909774500. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:03,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 05:01:05,890][26599] Updated weights for policy 0, policy_version 283344 (0.0037) [2024-06-19 05:01:08,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42601.0, 300 sec: 42320.7). Total num frames: 4642422784. Throughput: 0: 42296.6. Samples: 910030520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:08,380][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 05:01:09,333][26599] Updated weights for policy 0, policy_version 283354 (0.0028) [2024-06-19 05:01:13,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 4642603008. Throughput: 0: 42249.0. Samples: 910283180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:13,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 05:01:13,527][26599] Updated weights for policy 0, policy_version 283364 (0.0039) [2024-06-19 05:01:17,281][26599] Updated weights for policy 0, policy_version 283374 (0.0034) [2024-06-19 05:01:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4642848768. Throughput: 0: 42139.0. Samples: 910407320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:18,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 05:01:21,277][26599] Updated weights for policy 0, policy_version 283384 (0.0039) [2024-06-19 05:01:23,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4643045376. Throughput: 0: 42123.5. Samples: 910660500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:23,380][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 05:01:25,220][26599] Updated weights for policy 0, policy_version 283394 (0.0032) [2024-06-19 05:01:28,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.4, 300 sec: 42154.1). Total num frames: 4643241984. Throughput: 0: 42264.9. Samples: 910914980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:28,381][26367] Avg episode reward: [(0, '0.761')] [2024-06-19 05:01:28,921][26579] Signal inference workers to stop experience collection... (13550 times) [2024-06-19 05:01:28,921][26579] Signal inference workers to resume experience collection... (13550 times) [2024-06-19 05:01:28,961][26599] InferenceWorker_p0-w0: stopping experience collection (13550 times) [2024-06-19 05:01:28,961][26599] InferenceWorker_p0-w0: resuming experience collection (13550 times) [2024-06-19 05:01:29,050][26599] Updated weights for policy 0, policy_version 283404 (0.0043) [2024-06-19 05:01:32,850][26599] Updated weights for policy 0, policy_version 283414 (0.0027) [2024-06-19 05:01:33,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4643471360. Throughput: 0: 42229.2. Samples: 911036360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:33,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 05:01:36,718][26599] Updated weights for policy 0, policy_version 283424 (0.0043) [2024-06-19 05:01:38,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42327.8, 300 sec: 42265.2). Total num frames: 4643700736. Throughput: 0: 42175.5. Samples: 911294420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:38,381][26367] Avg episode reward: [(0, '0.799')] [2024-06-19 05:01:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000283429_4643700736.pth... [2024-06-19 05:01:38,442][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000282810_4633559040.pth [2024-06-19 05:01:40,527][26599] Updated weights for policy 0, policy_version 283434 (0.0040) [2024-06-19 05:01:43,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4643880960. Throughput: 0: 42092.5. Samples: 911546040. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:43,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 05:01:44,582][26599] Updated weights for policy 0, policy_version 283444 (0.0029) [2024-06-19 05:01:48,144][26599] Updated weights for policy 0, policy_version 283454 (0.0052) [2024-06-19 05:01:48,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.1, 300 sec: 42265.7). Total num frames: 4644110336. Throughput: 0: 42068.2. Samples: 911667580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:48,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 05:01:52,417][26599] Updated weights for policy 0, policy_version 283464 (0.0043) [2024-06-19 05:01:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4644323328. Throughput: 0: 42197.7. Samples: 911929420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:53,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 05:01:55,774][26599] Updated weights for policy 0, policy_version 283474 (0.0034) [2024-06-19 05:01:58,380][26367] Fps is (10 sec: 39322.4, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4644503552. Throughput: 0: 42198.7. Samples: 912182120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:01:58,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 05:02:00,090][26599] Updated weights for policy 0, policy_version 283484 (0.0028) [2024-06-19 05:02:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4644732928. Throughput: 0: 42074.3. Samples: 912300660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:02:03,381][26367] Avg episode reward: [(0, '0.806')] [2024-06-19 05:02:04,054][26599] Updated weights for policy 0, policy_version 283494 (0.0034) [2024-06-19 05:02:07,807][26599] Updated weights for policy 0, policy_version 283504 (0.0031) [2024-06-19 05:02:08,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4644945920. Throughput: 0: 42218.3. Samples: 912560320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:02:08,380][26367] Avg episode reward: [(0, '0.797')] [2024-06-19 05:02:11,938][26599] Updated weights for policy 0, policy_version 283514 (0.0032) [2024-06-19 05:02:13,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4645142528. Throughput: 0: 41954.9. Samples: 912802960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:13,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 05:02:15,889][26599] Updated weights for policy 0, policy_version 283524 (0.0034) [2024-06-19 05:02:18,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4645371904. Throughput: 0: 42143.2. Samples: 912932800. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:18,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 05:02:19,622][26599] Updated weights for policy 0, policy_version 283534 (0.0028) [2024-06-19 05:02:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 4645552128. Throughput: 0: 42027.1. Samples: 913185640. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:23,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 05:02:23,585][26599] Updated weights for policy 0, policy_version 283544 (0.0048) [2024-06-19 05:02:27,225][26599] Updated weights for policy 0, policy_version 283554 (0.0032) [2024-06-19 05:02:28,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4645765120. Throughput: 0: 42068.4. Samples: 913439120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:28,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 05:02:31,265][26599] Updated weights for policy 0, policy_version 283564 (0.0039) [2024-06-19 05:02:33,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42325.5, 300 sec: 42265.2). Total num frames: 4646010880. Throughput: 0: 42296.7. Samples: 913570920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:33,380][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 05:02:34,793][26599] Updated weights for policy 0, policy_version 283574 (0.0035) [2024-06-19 05:02:38,380][26367] Fps is (10 sec: 44237.0, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 4646207488. Throughput: 0: 42061.8. Samples: 913822200. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:38,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 05:02:38,956][26599] Updated weights for policy 0, policy_version 283584 (0.0037) [2024-06-19 05:02:42,403][26599] Updated weights for policy 0, policy_version 283594 (0.0034) [2024-06-19 05:02:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4646420480. Throughput: 0: 42066.7. Samples: 914075120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:43,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 05:02:46,642][26599] Updated weights for policy 0, policy_version 283604 (0.0036) [2024-06-19 05:02:48,384][26367] Fps is (10 sec: 42582.8, 60 sec: 42049.9, 300 sec: 42264.6). 
Total num frames: 4646633472. Throughput: 0: 42293.0. Samples: 914204000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:48,384][26367] Avg episode reward: [(0, '0.828')] [2024-06-19 05:02:50,563][26599] Updated weights for policy 0, policy_version 283614 (0.0034) [2024-06-19 05:02:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 4646830080. Throughput: 0: 42123.9. Samples: 914455900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:53,381][26367] Avg episode reward: [(0, '0.775')] [2024-06-19 05:02:54,290][26599] Updated weights for policy 0, policy_version 283624 (0.0032) [2024-06-19 05:02:58,166][26599] Updated weights for policy 0, policy_version 283634 (0.0048) [2024-06-19 05:02:58,380][26367] Fps is (10 sec: 42613.5, 60 sec: 42598.3, 300 sec: 42210.1). Total num frames: 4647059456. Throughput: 0: 42385.8. Samples: 914710320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:02:58,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 05:03:01,920][26599] Updated weights for policy 0, policy_version 283644 (0.0040) [2024-06-19 05:03:03,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4647272448. Throughput: 0: 42394.8. Samples: 914840560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:03:03,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 05:03:04,090][26579] Signal inference workers to stop experience collection... (13600 times) [2024-06-19 05:03:04,140][26599] InferenceWorker_p0-w0: stopping experience collection (13600 times) [2024-06-19 05:03:04,144][26579] Signal inference workers to resume experience collection... 
(13600 times) [2024-06-19 05:03:04,165][26599] InferenceWorker_p0-w0: resuming experience collection (13600 times) [2024-06-19 05:03:05,942][26599] Updated weights for policy 0, policy_version 283654 (0.0041) [2024-06-19 05:03:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 4647469056. Throughput: 0: 42382.1. Samples: 915092840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:03:08,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 05:03:09,640][26599] Updated weights for policy 0, policy_version 283664 (0.0027) [2024-06-19 05:03:13,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42210.1). Total num frames: 4647682048. Throughput: 0: 42275.0. Samples: 915341500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:03:13,381][26367] Avg episode reward: [(0, '0.785')] [2024-06-19 05:03:13,763][26599] Updated weights for policy 0, policy_version 283674 (0.0040) [2024-06-19 05:03:17,498][26599] Updated weights for policy 0, policy_version 283684 (0.0031) [2024-06-19 05:03:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 4647911424. Throughput: 0: 42208.2. Samples: 915470300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:03:18,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 05:03:21,702][26599] Updated weights for policy 0, policy_version 283694 (0.0028) [2024-06-19 05:03:23,381][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 4648075264. Throughput: 0: 42156.2. Samples: 915719240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 05:03:23,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-19 05:03:25,173][26599] Updated weights for policy 0, policy_version 283704 (0.0037) [2024-06-19 05:03:28,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42871.6, 300 sec: 42265.2). Total num frames: 4648337408. Throughput: 0: 42117.4. Samples: 915970400. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:03:28,380][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 05:03:29,622][26599] Updated weights for policy 0, policy_version 283714 (0.0029) [2024-06-19 05:03:33,223][26599] Updated weights for policy 0, policy_version 283724 (0.0039) [2024-06-19 05:03:33,380][26367] Fps is (10 sec: 47515.2, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4648550400. Throughput: 0: 42182.2. Samples: 916102040. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:03:33,380][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 05:03:37,720][26599] Updated weights for policy 0, policy_version 283734 (0.0032) [2024-06-19 05:03:38,380][26367] Fps is (10 sec: 37682.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4648714240. Throughput: 0: 42128.5. Samples: 916351680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:03:38,381][26367] Avg episode reward: [(0, '0.302')] [2024-06-19 05:03:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000283735_4648714240.pth... [2024-06-19 05:03:38,464][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000283119_4638621696.pth [2024-06-19 05:03:40,822][26599] Updated weights for policy 0, policy_version 283744 (0.0041) [2024-06-19 05:03:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4648943616. Throughput: 0: 42045.9. Samples: 916602380. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:03:43,380][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 05:03:45,385][26599] Updated weights for policy 0, policy_version 283754 (0.0035) [2024-06-19 05:03:48,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42327.8, 300 sec: 42209.6). Total num frames: 4649172992. Throughput: 0: 42048.3. Samples: 916732740. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:03:48,383][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 05:03:48,458][26599] Updated weights for policy 0, policy_version 283764 (0.0033) [2024-06-19 05:03:53,089][26599] Updated weights for policy 0, policy_version 283774 (0.0037) [2024-06-19 05:03:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4649369600. Throughput: 0: 41904.6. Samples: 916978540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:03:53,380][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 05:03:56,221][26599] Updated weights for policy 0, policy_version 283784 (0.0031) [2024-06-19 05:03:58,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 42098.6). Total num frames: 4649566208. Throughput: 0: 42038.8. Samples: 917233240. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:03:58,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 05:04:00,708][26599] Updated weights for policy 0, policy_version 283794 (0.0031) [2024-06-19 05:04:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4649779200. Throughput: 0: 42077.6. Samples: 917363780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:04:03,380][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 05:04:04,251][26599] Updated weights for policy 0, policy_version 283804 (0.0037) [2024-06-19 05:04:04,713][26579] Signal inference workers to stop experience collection... (13650 times) [2024-06-19 05:04:04,751][26599] InferenceWorker_p0-w0: stopping experience collection (13650 times) [2024-06-19 05:04:04,780][26579] Signal inference workers to resume experience collection... (13650 times) [2024-06-19 05:04:04,781][26599] InferenceWorker_p0-w0: resuming experience collection (13650 times) [2024-06-19 05:04:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4649992192. Throughput: 0: 42123.4. 
Samples: 917614780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:04:08,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 05:04:08,491][26599] Updated weights for policy 0, policy_version 283814 (0.0038) [2024-06-19 05:04:12,063][26599] Updated weights for policy 0, policy_version 283824 (0.0037) [2024-06-19 05:04:13,383][26367] Fps is (10 sec: 44222.3, 60 sec: 42323.2, 300 sec: 42098.1). Total num frames: 4650221568. Throughput: 0: 41980.9. Samples: 917859680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:04:13,384][26367] Avg episode reward: [(0, '0.773')] [2024-06-19 05:04:16,741][26599] Updated weights for policy 0, policy_version 283834 (0.0027) [2024-06-19 05:04:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4650434560. Throughput: 0: 42175.9. Samples: 917999960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:04:18,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 05:04:19,619][26599] Updated weights for policy 0, policy_version 283844 (0.0040) [2024-06-19 05:04:23,380][26367] Fps is (10 sec: 39334.4, 60 sec: 42325.5, 300 sec: 42154.1). Total num frames: 4650614784. Throughput: 0: 42063.6. Samples: 918244540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:04:23,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 05:04:24,480][26599] Updated weights for policy 0, policy_version 283854 (0.0032) [2024-06-19 05:04:27,439][26599] Updated weights for policy 0, policy_version 283864 (0.0037) [2024-06-19 05:04:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42210.1). Total num frames: 4650860544. Throughput: 0: 42030.6. Samples: 918493760. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:04:28,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 05:04:32,280][26599] Updated weights for policy 0, policy_version 283874 (0.0042) [2024-06-19 05:04:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 4651040768. Throughput: 0: 42180.2. Samples: 918630840. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 05:04:33,380][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 05:04:35,118][26599] Updated weights for policy 0, policy_version 283884 (0.0041) [2024-06-19 05:04:38,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4651253760. Throughput: 0: 42083.1. Samples: 918872280. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:04:38,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 05:04:39,747][26599] Updated weights for policy 0, policy_version 283894 (0.0030) [2024-06-19 05:04:42,628][26599] Updated weights for policy 0, policy_version 283904 (0.0037) [2024-06-19 05:04:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4651483136. Throughput: 0: 42096.0. Samples: 919127560. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:04:43,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 05:04:47,614][26599] Updated weights for policy 0, policy_version 283914 (0.0034) [2024-06-19 05:04:48,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4651679744. Throughput: 0: 42137.6. Samples: 919259980. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:04:48,381][26367] Avg episode reward: [(0, '0.081')] [2024-06-19 05:04:50,703][26599] Updated weights for policy 0, policy_version 283924 (0.0029) [2024-06-19 05:04:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4651909120. Throughput: 0: 41996.9. Samples: 919504640. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:04:53,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 05:04:55,689][26599] Updated weights for policy 0, policy_version 283934 (0.0033) [2024-06-19 05:04:58,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4652122112. Throughput: 0: 42166.9. Samples: 919757060. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:04:58,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 05:04:58,772][26599] Updated weights for policy 0, policy_version 283944 (0.0030) [2024-06-19 05:05:03,309][26599] Updated weights for policy 0, policy_version 283954 (0.0044) [2024-06-19 05:05:03,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42154.6). Total num frames: 4652302336. Throughput: 0: 41866.7. Samples: 919883960. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:03,380][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 05:05:06,641][26599] Updated weights for policy 0, policy_version 283964 (0.0033) [2024-06-19 05:05:08,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42265.2). Total num frames: 4652564480. Throughput: 0: 42147.0. Samples: 920141160. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:08,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 05:05:11,082][26599] Updated weights for policy 0, policy_version 283974 (0.0039) [2024-06-19 05:05:11,700][26579] Signal inference workers to stop experience collection... (13700 times) [2024-06-19 05:05:11,726][26599] InferenceWorker_p0-w0: stopping experience collection (13700 times) [2024-06-19 05:05:11,760][26579] Signal inference workers to resume experience collection... (13700 times) [2024-06-19 05:05:11,761][26599] InferenceWorker_p0-w0: resuming experience collection (13700 times) [2024-06-19 05:05:13,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42327.5, 300 sec: 42265.2). Total num frames: 4652761088. Throughput: 0: 42260.4. 
Samples: 920395480. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:13,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 05:05:14,190][26599] Updated weights for policy 0, policy_version 283984 (0.0036) [2024-06-19 05:05:18,380][26367] Fps is (10 sec: 36045.0, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4652924928. Throughput: 0: 41985.3. Samples: 920520180. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:18,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 05:05:18,796][26599] Updated weights for policy 0, policy_version 283994 (0.0033) [2024-06-19 05:05:21,869][26599] Updated weights for policy 0, policy_version 284004 (0.0033) [2024-06-19 05:05:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42154.1). Total num frames: 4653170688. Throughput: 0: 42323.5. Samples: 920776840. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:23,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 05:05:26,399][26599] Updated weights for policy 0, policy_version 284014 (0.0037) [2024-06-19 05:05:28,381][26367] Fps is (10 sec: 45873.9, 60 sec: 42052.1, 300 sec: 42265.1). Total num frames: 4653383680. Throughput: 0: 42255.8. Samples: 921029080. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:28,381][26367] Avg episode reward: [(0, '0.267')] [2024-06-19 05:05:29,515][26599] Updated weights for policy 0, policy_version 284024 (0.0044) [2024-06-19 05:05:33,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42099.1). Total num frames: 4653580288. Throughput: 0: 42341.9. Samples: 921165360. 
Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:33,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 05:05:34,027][26599] Updated weights for policy 0, policy_version 284034 (0.0033) [2024-06-19 05:05:37,132][26599] Updated weights for policy 0, policy_version 284044 (0.0038) [2024-06-19 05:05:38,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4653793280. Throughput: 0: 42530.1. Samples: 921418500. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:38,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 05:05:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000284045_4653793280.pth... [2024-06-19 05:05:38,435][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000283429_4643700736.pth [2024-06-19 05:05:41,831][26599] Updated weights for policy 0, policy_version 284054 (0.0025) [2024-06-19 05:05:43,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4654022656. Throughput: 0: 42568.6. Samples: 921672640. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:43,380][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 05:05:44,658][26599] Updated weights for policy 0, policy_version 284064 (0.0038) [2024-06-19 05:05:48,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42098.5). Total num frames: 4654235648. Throughput: 0: 42752.8. Samples: 921807840. Policy #0 lag: (min: 1.0, avg: 12.5, max: 23.0) [2024-06-19 05:05:48,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 05:05:49,475][26599] Updated weights for policy 0, policy_version 284074 (0.0022) [2024-06-19 05:05:52,486][26599] Updated weights for policy 0, policy_version 284084 (0.0039) [2024-06-19 05:05:53,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4654448640. Throughput: 0: 42548.9. Samples: 922055860. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:05:53,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 05:05:56,932][26599] Updated weights for policy 0, policy_version 284094 (0.0039) [2024-06-19 05:05:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4654661632. Throughput: 0: 42613.8. Samples: 922313100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:05:58,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 05:06:00,326][26599] Updated weights for policy 0, policy_version 284104 (0.0039) [2024-06-19 05:06:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 4654858240. Throughput: 0: 42643.4. Samples: 922439140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:03,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 05:06:04,650][26599] Updated weights for policy 0, policy_version 284114 (0.0036) [2024-06-19 05:06:08,006][26599] Updated weights for policy 0, policy_version 284124 (0.0031) [2024-06-19 05:06:08,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 4655087616. Throughput: 0: 42573.9. Samples: 922692660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:08,380][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 05:06:12,159][26599] Updated weights for policy 0, policy_version 284134 (0.0029) [2024-06-19 05:06:13,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4655300608. Throughput: 0: 42525.5. Samples: 922942720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:13,381][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 05:06:16,119][26599] Updated weights for policy 0, policy_version 284144 (0.0042) [2024-06-19 05:06:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 4655497216. Throughput: 0: 42531.7. Samples: 923079280. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:18,380][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 05:06:19,690][26599] Updated weights for policy 0, policy_version 284154 (0.0031) [2024-06-19 05:06:20,387][26579] Signal inference workers to stop experience collection... (13750 times) [2024-06-19 05:06:20,388][26579] Signal inference workers to resume experience collection... (13750 times) [2024-06-19 05:06:20,430][26599] InferenceWorker_p0-w0: stopping experience collection (13750 times) [2024-06-19 05:06:20,430][26599] InferenceWorker_p0-w0: resuming experience collection (13750 times) [2024-06-19 05:06:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 4655710208. Throughput: 0: 42492.4. Samples: 923330660. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:23,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 05:06:24,091][26599] Updated weights for policy 0, policy_version 284164 (0.0032) [2024-06-19 05:06:27,375][26599] Updated weights for policy 0, policy_version 284174 (0.0033) [2024-06-19 05:06:28,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 4655939584. Throughput: 0: 42367.9. Samples: 923579200. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:28,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 05:06:31,825][26599] Updated weights for policy 0, policy_version 284184 (0.0035) [2024-06-19 05:06:33,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42154.1). Total num frames: 4656136192. Throughput: 0: 42221.5. Samples: 923707800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:33,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 05:06:35,217][26599] Updated weights for policy 0, policy_version 284194 (0.0038) [2024-06-19 05:06:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4656365568. Throughput: 0: 42248.0. 
Samples: 923957020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:38,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 05:06:39,779][26599] Updated weights for policy 0, policy_version 284204 (0.0040) [2024-06-19 05:06:42,866][26599] Updated weights for policy 0, policy_version 284214 (0.0039) [2024-06-19 05:06:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4656578560. Throughput: 0: 42157.3. Samples: 924210180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:43,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 05:06:47,719][26599] Updated weights for policy 0, policy_version 284224 (0.0047) [2024-06-19 05:06:48,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4656758784. Throughput: 0: 42220.1. Samples: 924339040. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:48,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 05:06:50,614][26599] Updated weights for policy 0, policy_version 284234 (0.0038) [2024-06-19 05:06:53,384][26367] Fps is (10 sec: 39307.3, 60 sec: 42049.8, 300 sec: 42264.6). Total num frames: 4656971776. Throughput: 0: 42131.2. Samples: 924588720. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:53,384][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 05:06:55,627][26599] Updated weights for policy 0, policy_version 284244 (0.0028) [2024-06-19 05:06:58,330][26599] Updated weights for policy 0, policy_version 284254 (0.0029) [2024-06-19 05:06:58,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4657217536. Throughput: 0: 42231.0. Samples: 924843120. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 05:06:58,381][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 05:07:03,380][26367] Fps is (10 sec: 39335.7, 60 sec: 41779.2, 300 sec: 42098.5). Total num frames: 4657364992. Throughput: 0: 42091.4. Samples: 924973400. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:03,381][26367] Avg episode reward: [(0, '0.230')] [2024-06-19 05:07:03,428][26599] Updated weights for policy 0, policy_version 284264 (0.0042) [2024-06-19 05:07:06,077][26599] Updated weights for policy 0, policy_version 284274 (0.0044) [2024-06-19 05:07:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4657627136. Throughput: 0: 42006.2. Samples: 925220940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:08,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 05:07:11,111][26599] Updated weights for policy 0, policy_version 284284 (0.0026) [2024-06-19 05:07:13,380][26367] Fps is (10 sec: 47513.9, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4657840128. Throughput: 0: 42058.3. Samples: 925471820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:13,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 05:07:13,760][26599] Updated weights for policy 0, policy_version 284294 (0.0027) [2024-06-19 05:07:18,380][26367] Fps is (10 sec: 37683.8, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 4658003968. Throughput: 0: 42107.5. Samples: 925602640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:18,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 05:07:18,886][26599] Updated weights for policy 0, policy_version 284304 (0.0043) [2024-06-19 05:07:21,277][26599] Updated weights for policy 0, policy_version 284314 (0.0041) [2024-06-19 05:07:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4658266112. Throughput: 0: 42188.9. Samples: 925855520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:23,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 05:07:26,496][26599] Updated weights for policy 0, policy_version 284324 (0.0038) [2024-06-19 05:07:28,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 42209.6). 
Total num frames: 4658462720. Throughput: 0: 42331.4. Samples: 926115100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:28,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 05:07:28,690][26579] Signal inference workers to stop experience collection... (13800 times) [2024-06-19 05:07:28,694][26579] Signal inference workers to resume experience collection... (13800 times) [2024-06-19 05:07:28,710][26599] InferenceWorker_p0-w0: stopping experience collection (13800 times) [2024-06-19 05:07:28,710][26599] InferenceWorker_p0-w0: resuming experience collection (13800 times) [2024-06-19 05:07:28,994][26599] Updated weights for policy 0, policy_version 284334 (0.0033) [2024-06-19 05:07:33,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 4658659328. Throughput: 0: 42193.3. Samples: 926237740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:33,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 05:07:34,219][26599] Updated weights for policy 0, policy_version 284344 (0.0033) [2024-06-19 05:07:36,778][26599] Updated weights for policy 0, policy_version 284354 (0.0028) [2024-06-19 05:07:38,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4658905088. Throughput: 0: 42273.9. Samples: 926490900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:38,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 05:07:38,407][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000284357_4658905088.pth... [2024-06-19 05:07:38,461][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000283735_4648714240.pth [2024-06-19 05:07:41,956][26599] Updated weights for policy 0, policy_version 284364 (0.0028) [2024-06-19 05:07:43,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 42265.7). Total num frames: 4659101696. Throughput: 0: 42431.6. Samples: 926752540. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:43,381][26367] Avg episode reward: [(0, '0.488')] [2024-06-19 05:07:44,879][26599] Updated weights for policy 0, policy_version 284374 (0.0030) [2024-06-19 05:07:48,380][26367] Fps is (10 sec: 37684.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4659281920. Throughput: 0: 42147.6. Samples: 926870040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:48,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 05:07:49,747][26599] Updated weights for policy 0, policy_version 284384 (0.0042) [2024-06-19 05:07:52,923][26599] Updated weights for policy 0, policy_version 284394 (0.0033) [2024-06-19 05:07:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42874.0, 300 sec: 42320.7). Total num frames: 4659544064. Throughput: 0: 42331.6. Samples: 927125860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:53,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 05:07:57,583][26599] Updated weights for policy 0, policy_version 284404 (0.0040) [2024-06-19 05:07:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 4659707904. Throughput: 0: 42391.5. Samples: 927379440. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:07:58,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 05:08:00,483][26599] Updated weights for policy 0, policy_version 284414 (0.0035) [2024-06-19 05:08:03,380][26367] Fps is (10 sec: 37683.8, 60 sec: 42598.5, 300 sec: 42209.7). Total num frames: 4659920896. Throughput: 0: 42139.6. Samples: 927498920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:08:03,380][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 05:08:05,115][26599] Updated weights for policy 0, policy_version 284424 (0.0028) [2024-06-19 05:08:08,191][26599] Updated weights for policy 0, policy_version 284434 (0.0036) [2024-06-19 05:08:08,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42325.5, 300 sec: 42320.7). 
Total num frames: 4660166656. Throughput: 0: 42237.9. Samples: 927756220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:08:08,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 05:08:12,730][26599] Updated weights for policy 0, policy_version 284444 (0.0035) [2024-06-19 05:08:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4660346880. Throughput: 0: 42077.0. Samples: 928008560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 05:08:13,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 05:08:15,882][26599] Updated weights for policy 0, policy_version 284454 (0.0038) [2024-06-19 05:08:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 4660576256. Throughput: 0: 42003.2. Samples: 928127880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:08:18,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 05:08:20,295][26599] Updated weights for policy 0, policy_version 284464 (0.0031) [2024-06-19 05:08:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4660789248. Throughput: 0: 42245.9. Samples: 928391960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:08:23,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 05:08:23,626][26599] Updated weights for policy 0, policy_version 284474 (0.0046) [2024-06-19 05:08:28,187][26599] Updated weights for policy 0, policy_version 284484 (0.0033) [2024-06-19 05:08:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4660985856. Throughput: 0: 41999.2. Samples: 928642500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:08:28,380][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 05:08:31,421][26599] Updated weights for policy 0, policy_version 284494 (0.0029) [2024-06-19 05:08:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42320.7). 
Total num frames: 4661198848. Throughput: 0: 42158.1. Samples: 928767160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:08:33,384][26367] Avg episode reward: [(0, '0.220')] [2024-06-19 05:08:35,966][26599] Updated weights for policy 0, policy_version 284504 (0.0035) [2024-06-19 05:08:38,380][26367] Fps is (10 sec: 42597.6, 60 sec: 41779.3, 300 sec: 42265.1). Total num frames: 4661411840. Throughput: 0: 42208.9. Samples: 929025260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:08:38,381][26367] Avg episode reward: [(0, '0.394')] [2024-06-19 05:08:39,308][26599] Updated weights for policy 0, policy_version 284514 (0.0040) [2024-06-19 05:08:42,749][26579] Signal inference workers to stop experience collection... (13850 times) [2024-06-19 05:08:42,753][26579] Signal inference workers to resume experience collection... (13850 times) [2024-06-19 05:08:42,791][26599] InferenceWorker_p0-w0: stopping experience collection (13850 times) [2024-06-19 05:08:42,791][26599] InferenceWorker_p0-w0: resuming experience collection (13850 times) [2024-06-19 05:08:43,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4661624832. Throughput: 0: 42217.4. Samples: 929279220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:08:43,381][26367] Avg episode reward: [(0, '0.787')] [2024-06-19 05:08:43,458][26599] Updated weights for policy 0, policy_version 284524 (0.0040) [2024-06-19 05:08:47,149][26599] Updated weights for policy 0, policy_version 284534 (0.0044) [2024-06-19 05:08:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 4661837824. Throughput: 0: 42469.1. Samples: 929410040. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:08:48,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 05:08:51,115][26599] Updated weights for policy 0, policy_version 284544 (0.0046) [2024-06-19 05:08:53,381][26367] Fps is (10 sec: 42594.3, 60 sec: 41778.6, 300 sec: 42320.6). Total num frames: 4662050816. Throughput: 0: 42443.5. Samples: 929666220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:08:53,382][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 05:08:55,049][26599] Updated weights for policy 0, policy_version 284554 (0.0047) [2024-06-19 05:08:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4662263808. Throughput: 0: 42463.9. Samples: 929919440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:08:58,381][26367] Avg episode reward: [(0, '0.227')] [2024-06-19 05:08:58,977][26599] Updated weights for policy 0, policy_version 284564 (0.0034) [2024-06-19 05:09:02,901][26599] Updated weights for policy 0, policy_version 284574 (0.0032) [2024-06-19 05:09:03,380][26367] Fps is (10 sec: 44240.7, 60 sec: 42871.3, 300 sec: 42376.2). Total num frames: 4662493184. Throughput: 0: 42519.0. Samples: 930041240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:09:03,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 05:09:06,619][26599] Updated weights for policy 0, policy_version 284584 (0.0036) [2024-06-19 05:09:08,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42325.3, 300 sec: 42321.2). Total num frames: 4662706176. Throughput: 0: 42363.2. Samples: 930298300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:09:08,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 05:09:10,583][26599] Updated weights for policy 0, policy_version 284594 (0.0034) [2024-06-19 05:09:13,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4662886400. Throughput: 0: 42519.1. Samples: 930555860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:09:13,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 05:09:14,299][26599] Updated weights for policy 0, policy_version 284604 (0.0028) [2024-06-19 05:09:18,338][26599] Updated weights for policy 0, policy_version 284614 (0.0042) [2024-06-19 05:09:18,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4663115776. Throughput: 0: 42420.2. Samples: 930676060. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:09:18,380][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 05:09:21,993][26599] Updated weights for policy 0, policy_version 284624 (0.0042) [2024-06-19 05:09:23,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42598.6, 300 sec: 42320.7). Total num frames: 4663345152. Throughput: 0: 42500.6. Samples: 930937780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:09:23,380][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 05:09:26,010][26599] Updated weights for policy 0, policy_version 284634 (0.0034) [2024-06-19 05:09:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4663541760. Throughput: 0: 42654.7. Samples: 931198680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:09:28,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 05:09:29,803][26599] Updated weights for policy 0, policy_version 284644 (0.0033) [2024-06-19 05:09:33,380][26367] Fps is (10 sec: 39320.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4663738368. Throughput: 0: 42518.7. Samples: 931323380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:09:33,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 05:09:33,743][26599] Updated weights for policy 0, policy_version 284654 (0.0034) [2024-06-19 05:09:37,380][26599] Updated weights for policy 0, policy_version 284664 (0.0036) [2024-06-19 05:09:38,380][26367] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 4664000512. Throughput: 0: 42656.0. Samples: 931585700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:09:38,380][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 05:09:38,501][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000284669_4664016896.pth... [2024-06-19 05:09:38,557][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000284045_4653793280.pth [2024-06-19 05:09:41,591][26599] Updated weights for policy 0, policy_version 284674 (0.0040) [2024-06-19 05:09:43,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4664180736. Throughput: 0: 42627.2. Samples: 931837660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:09:43,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 05:09:44,764][26599] Updated weights for policy 0, policy_version 284684 (0.0044) [2024-06-19 05:09:48,380][26367] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 4664377344. Throughput: 0: 42713.3. Samples: 931963340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:09:48,384][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 05:09:49,265][26599] Updated weights for policy 0, policy_version 284694 (0.0033) [2024-06-19 05:09:52,574][26599] Updated weights for policy 0, policy_version 284704 (0.0024) [2024-06-19 05:09:53,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42872.2, 300 sec: 42376.3). Total num frames: 4664623104. Throughput: 0: 42731.2. Samples: 932221200. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:09:53,380][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 05:09:56,960][26599] Updated weights for policy 0, policy_version 284714 (0.0033) [2024-06-19 05:09:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4664819712. Throughput: 0: 42666.5. Samples: 932475860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:09:58,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 05:10:00,271][26599] Updated weights for policy 0, policy_version 284724 (0.0041) [2024-06-19 05:10:03,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4665016320. Throughput: 0: 42602.5. Samples: 932593180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:10:03,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 05:10:04,647][26599] Updated weights for policy 0, policy_version 284734 (0.0031) [2024-06-19 05:10:07,778][26599] Updated weights for policy 0, policy_version 284744 (0.0032) [2024-06-19 05:10:08,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4665278464. Throughput: 0: 42635.8. Samples: 932856400. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:10:08,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 05:10:12,242][26579] Signal inference workers to stop experience collection... (13900 times) [2024-06-19 05:10:12,243][26579] Signal inference workers to resume experience collection... (13900 times) [2024-06-19 05:10:12,285][26599] InferenceWorker_p0-w0: stopping experience collection (13900 times) [2024-06-19 05:10:12,285][26599] InferenceWorker_p0-w0: resuming experience collection (13900 times) [2024-06-19 05:10:12,387][26599] Updated weights for policy 0, policy_version 284754 (0.0029) [2024-06-19 05:10:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4665442304. Throughput: 0: 42464.4. 
Samples: 933109580. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:10:13,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 05:10:15,659][26599] Updated weights for policy 0, policy_version 284764 (0.0029) [2024-06-19 05:10:18,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4665655296. Throughput: 0: 42342.4. Samples: 933228780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:10:18,381][26367] Avg episode reward: [(0, '0.393')] [2024-06-19 05:10:20,097][26599] Updated weights for policy 0, policy_version 284774 (0.0030) [2024-06-19 05:10:23,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4665901056. Throughput: 0: 42369.3. Samples: 933492320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:10:23,380][26367] Avg episode reward: [(0, '0.343')] [2024-06-19 05:10:23,387][26599] Updated weights for policy 0, policy_version 284784 (0.0024) [2024-06-19 05:10:27,668][26599] Updated weights for policy 0, policy_version 284794 (0.0047) [2024-06-19 05:10:28,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42322.7, 300 sec: 42375.7). Total num frames: 4666081280. Throughput: 0: 42405.0. Samples: 933746040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:10:28,385][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 05:10:31,009][26599] Updated weights for policy 0, policy_version 284804 (0.0034) [2024-06-19 05:10:33,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4666294272. Throughput: 0: 42319.6. Samples: 933867720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:10:33,381][26367] Avg episode reward: [(0, '0.803')] [2024-06-19 05:10:35,452][26599] Updated weights for policy 0, policy_version 284814 (0.0034) [2024-06-19 05:10:38,380][26367] Fps is (10 sec: 44253.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4666523648. Throughput: 0: 42374.2. 
Samples: 934128040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 05:10:38,380][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 05:10:38,648][26599] Updated weights for policy 0, policy_version 284824 (0.0050) [2024-06-19 05:10:43,073][26599] Updated weights for policy 0, policy_version 284834 (0.0043) [2024-06-19 05:10:43,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4666720256. Throughput: 0: 42282.5. Samples: 934378560. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:10:43,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 05:10:46,496][26599] Updated weights for policy 0, policy_version 284844 (0.0028) [2024-06-19 05:10:48,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4666949632. Throughput: 0: 42492.4. Samples: 934505340. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:10:48,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 05:10:51,296][26599] Updated weights for policy 0, policy_version 284854 (0.0031) [2024-06-19 05:10:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4667146240. Throughput: 0: 42340.5. Samples: 934761720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:10:53,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 05:10:54,133][26599] Updated weights for policy 0, policy_version 284864 (0.0027) [2024-06-19 05:10:58,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 4667359232. Throughput: 0: 42475.6. Samples: 935020980. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:10:58,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 05:10:58,783][26599] Updated weights for policy 0, policy_version 284874 (0.0033) [2024-06-19 05:11:01,944][26599] Updated weights for policy 0, policy_version 284884 (0.0030) [2024-06-19 05:11:03,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 4667588608. Throughput: 0: 42562.9. Samples: 935144120. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:03,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 05:11:06,240][26599] Updated weights for policy 0, policy_version 284894 (0.0031) [2024-06-19 05:11:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 4667785216. Throughput: 0: 42378.7. Samples: 935399360. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:08,380][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 05:11:09,694][26599] Updated weights for policy 0, policy_version 284904 (0.0038) [2024-06-19 05:11:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4667998208. Throughput: 0: 42424.3. Samples: 935654980. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:13,390][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 05:11:14,020][26599] Updated weights for policy 0, policy_version 284914 (0.0032) [2024-06-19 05:11:17,456][26599] Updated weights for policy 0, policy_version 284924 (0.0034) [2024-06-19 05:11:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4668211200. Throughput: 0: 42490.8. Samples: 935779800. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:18,380][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 05:11:21,692][26599] Updated weights for policy 0, policy_version 284934 (0.0027) [2024-06-19 05:11:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4668424192. Throughput: 0: 42436.3. Samples: 936037680. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:23,381][26367] Avg episode reward: [(0, '0.392')] [2024-06-19 05:11:25,217][26599] Updated weights for policy 0, policy_version 284944 (0.0033) [2024-06-19 05:11:27,808][26579] Signal inference workers to stop experience collection... (13950 times) [2024-06-19 05:11:27,810][26579] Signal inference workers to resume experience collection... (13950 times) [2024-06-19 05:11:27,845][26599] InferenceWorker_p0-w0: stopping experience collection (13950 times) [2024-06-19 05:11:27,845][26599] InferenceWorker_p0-w0: resuming experience collection (13950 times) [2024-06-19 05:11:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42601.0, 300 sec: 42376.2). Total num frames: 4668637184. Throughput: 0: 42531.0. Samples: 936292460. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:28,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 05:11:29,203][26599] Updated weights for policy 0, policy_version 284954 (0.0042) [2024-06-19 05:11:33,282][26599] Updated weights for policy 0, policy_version 284964 (0.0034) [2024-06-19 05:11:33,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4668850176. Throughput: 0: 42438.4. Samples: 936415060. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:33,381][26367] Avg episode reward: [(0, '0.399')] [2024-06-19 05:11:37,161][26599] Updated weights for policy 0, policy_version 284974 (0.0024) [2024-06-19 05:11:38,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42322.7, 300 sec: 42320.2). Total num frames: 4669063168. Throughput: 0: 42510.3. 
Samples: 936674840. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:38,385][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 05:11:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000284977_4669063168.pth... [2024-06-19 05:11:38,460][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000284357_4658905088.pth [2024-06-19 05:11:40,987][26599] Updated weights for policy 0, policy_version 284984 (0.0026) [2024-06-19 05:11:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4669276160. Throughput: 0: 42486.3. Samples: 936932860. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:43,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 05:11:44,937][26599] Updated weights for policy 0, policy_version 284994 (0.0039) [2024-06-19 05:11:48,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42052.3, 300 sec: 42376.8). Total num frames: 4669472768. Throughput: 0: 42393.0. Samples: 937051800. Policy #0 lag: (min: 0.0, avg: 12.4, max: 23.0) [2024-06-19 05:11:48,382][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 05:11:48,828][26599] Updated weights for policy 0, policy_version 285004 (0.0034) [2024-06-19 05:11:52,694][26599] Updated weights for policy 0, policy_version 285014 (0.0030) [2024-06-19 05:11:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4669702144. Throughput: 0: 42471.6. Samples: 937310580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:11:53,380][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 05:11:56,560][26599] Updated weights for policy 0, policy_version 285024 (0.0042) [2024-06-19 05:11:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4669898752. Throughput: 0: 42344.1. Samples: 937560460. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:11:58,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 05:12:00,719][26599] Updated weights for policy 0, policy_version 285034 (0.0042) [2024-06-19 05:12:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 4670128128. Throughput: 0: 42348.9. Samples: 937685500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:03,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 05:12:04,258][26599] Updated weights for policy 0, policy_version 285044 (0.0035) [2024-06-19 05:12:08,235][26599] Updated weights for policy 0, policy_version 285054 (0.0039) [2024-06-19 05:12:08,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4670341120. Throughput: 0: 42329.0. Samples: 937942480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:08,380][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 05:12:12,075][26599] Updated weights for policy 0, policy_version 285064 (0.0035) [2024-06-19 05:12:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 4670554112. Throughput: 0: 42377.3. Samples: 938199440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:13,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 05:12:15,826][26599] Updated weights for policy 0, policy_version 285074 (0.0034) [2024-06-19 05:12:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4670750720. Throughput: 0: 42314.7. Samples: 938319220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:18,381][26367] Avg episode reward: [(0, '0.284')] [2024-06-19 05:12:20,229][26599] Updated weights for policy 0, policy_version 285084 (0.0032) [2024-06-19 05:12:23,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4670963712. Throughput: 0: 42236.4. Samples: 938575320. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:23,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 05:12:23,694][26599] Updated weights for policy 0, policy_version 285094 (0.0037) [2024-06-19 05:12:27,989][26599] Updated weights for policy 0, policy_version 285104 (0.0040) [2024-06-19 05:12:28,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4671143936. Throughput: 0: 42255.0. Samples: 938834340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:28,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 05:12:31,272][26599] Updated weights for policy 0, policy_version 285114 (0.0035) [2024-06-19 05:12:33,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4671389696. Throughput: 0: 42320.0. Samples: 938956200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:33,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 05:12:35,687][26599] Updated weights for policy 0, policy_version 285124 (0.0027) [2024-06-19 05:12:38,384][26367] Fps is (10 sec: 45860.0, 60 sec: 42325.6, 300 sec: 42375.8). Total num frames: 4671602688. Throughput: 0: 42213.3. Samples: 939210320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:38,384][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 05:12:38,760][26599] Updated weights for policy 0, policy_version 285134 (0.0027) [2024-06-19 05:12:40,062][26579] Signal inference workers to stop experience collection... (14000 times) [2024-06-19 05:12:40,088][26599] InferenceWorker_p0-w0: stopping experience collection (14000 times) [2024-06-19 05:12:40,125][26579] Signal inference workers to resume experience collection... (14000 times) [2024-06-19 05:12:40,125][26599] InferenceWorker_p0-w0: resuming experience collection (14000 times) [2024-06-19 05:12:43,384][26367] Fps is (10 sec: 39307.8, 60 sec: 41776.6, 300 sec: 42375.7). Total num frames: 4671782912. Throughput: 0: 42569.9. 
Samples: 939476260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:43,393][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 05:12:43,400][26599] Updated weights for policy 0, policy_version 285144 (0.0039) [2024-06-19 05:12:46,307][26599] Updated weights for policy 0, policy_version 285154 (0.0030) [2024-06-19 05:12:48,380][26367] Fps is (10 sec: 42612.5, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4672028672. Throughput: 0: 42468.0. Samples: 939596560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:48,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 05:12:50,933][26599] Updated weights for policy 0, policy_version 285164 (0.0032) [2024-06-19 05:12:53,380][26367] Fps is (10 sec: 47530.1, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 4672258048. Throughput: 0: 42528.2. Samples: 939856260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:53,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 05:12:53,986][26599] Updated weights for policy 0, policy_version 285174 (0.0037) [2024-06-19 05:12:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4672438272. Throughput: 0: 42406.7. Samples: 940107740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:12:58,381][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 05:12:58,444][26599] Updated weights for policy 0, policy_version 285184 (0.0034) [2024-06-19 05:13:01,583][26599] Updated weights for policy 0, policy_version 285194 (0.0028) [2024-06-19 05:13:03,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4672684032. Throughput: 0: 42514.1. Samples: 940232360. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 05:13:03,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 05:13:05,971][26599] Updated weights for policy 0, policy_version 285204 (0.0028) [2024-06-19 05:13:08,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42598.2, 300 sec: 42542.8). Total num frames: 4672897024. Throughput: 0: 42716.7. Samples: 940497580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:08,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 05:13:09,486][26599] Updated weights for policy 0, policy_version 285214 (0.0032) [2024-06-19 05:13:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4673093632. Throughput: 0: 42467.1. Samples: 940745360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:13,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 05:13:13,735][26599] Updated weights for policy 0, policy_version 285224 (0.0037) [2024-06-19 05:13:17,398][26599] Updated weights for policy 0, policy_version 285234 (0.0034) [2024-06-19 05:13:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.3, 300 sec: 42487.3). Total num frames: 4673323008. Throughput: 0: 42548.9. Samples: 940870900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:18,381][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 05:13:21,713][26599] Updated weights for policy 0, policy_version 285244 (0.0039) [2024-06-19 05:13:23,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4673503232. Throughput: 0: 42583.7. Samples: 941126440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:23,380][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 05:13:25,387][26599] Updated weights for policy 0, policy_version 285254 (0.0032) [2024-06-19 05:13:28,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4673716224. Throughput: 0: 42311.4. Samples: 941380120. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:28,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 05:13:29,262][26599] Updated weights for policy 0, policy_version 285264 (0.0033) [2024-06-19 05:13:33,067][26599] Updated weights for policy 0, policy_version 285274 (0.0039) [2024-06-19 05:13:33,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4673945600. Throughput: 0: 42380.9. Samples: 941503700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:33,381][26367] Avg episode reward: [(0, '0.359')] [2024-06-19 05:13:36,815][26599] Updated weights for policy 0, policy_version 285284 (0.0032) [2024-06-19 05:13:38,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42600.6, 300 sec: 42487.3). Total num frames: 4674158592. Throughput: 0: 42208.4. Samples: 941755640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:38,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 05:13:38,403][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000285288_4674158592.pth... [2024-06-19 05:13:38,466][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000284669_4664016896.pth [2024-06-19 05:13:40,592][26599] Updated weights for policy 0, policy_version 285294 (0.0031) [2024-06-19 05:13:43,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42874.0, 300 sec: 42431.8). Total num frames: 4674355200. Throughput: 0: 42280.4. Samples: 942010360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:43,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 05:13:44,566][26599] Updated weights for policy 0, policy_version 285304 (0.0039) [2024-06-19 05:13:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42431.9). Total num frames: 4674568192. Throughput: 0: 42329.7. Samples: 942137200. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:48,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 05:13:48,558][26599] Updated weights for policy 0, policy_version 285314 (0.0043) [2024-06-19 05:13:52,650][26599] Updated weights for policy 0, policy_version 285324 (0.0045) [2024-06-19 05:13:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 4674764800. Throughput: 0: 41997.0. Samples: 942387440. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:53,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 05:13:56,138][26599] Updated weights for policy 0, policy_version 285334 (0.0041) [2024-06-19 05:13:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4674977792. Throughput: 0: 42087.6. Samples: 942639300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:13:58,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 05:14:00,284][26599] Updated weights for policy 0, policy_version 285344 (0.0032) [2024-06-19 05:14:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41506.2, 300 sec: 42265.2). Total num frames: 4675174400. Throughput: 0: 42152.6. Samples: 942767760. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:03,381][26367] Avg episode reward: [(0, '0.324')] [2024-06-19 05:14:03,419][26579] Signal inference workers to stop experience collection... (14050 times) [2024-06-19 05:14:03,420][26579] Signal inference workers to resume experience collection... 
(14050 times) [2024-06-19 05:14:03,434][26599] InferenceWorker_p0-w0: stopping experience collection (14050 times) [2024-06-19 05:14:03,434][26599] InferenceWorker_p0-w0: resuming experience collection (14050 times) [2024-06-19 05:14:04,054][26599] Updated weights for policy 0, policy_version 285354 (0.0024) [2024-06-19 05:14:08,193][26599] Updated weights for policy 0, policy_version 285364 (0.0027) [2024-06-19 05:14:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 4675403776. Throughput: 0: 41910.5. Samples: 943012420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:08,381][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 05:14:11,886][26599] Updated weights for policy 0, policy_version 285374 (0.0031) [2024-06-19 05:14:13,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4675616768. Throughput: 0: 41932.4. Samples: 943267080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:13,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 05:14:16,198][26599] Updated weights for policy 0, policy_version 285384 (0.0022) [2024-06-19 05:14:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41506.2, 300 sec: 42265.2). Total num frames: 4675813376. Throughput: 0: 42035.6. Samples: 943395300. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:18,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 05:14:19,967][26599] Updated weights for policy 0, policy_version 285394 (0.0045) [2024-06-19 05:14:23,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.1, 300 sec: 42265.2). Total num frames: 4676009984. Throughput: 0: 41896.1. Samples: 943640960. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:23,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 05:14:23,989][26599] Updated weights for policy 0, policy_version 285404 (0.0043) [2024-06-19 05:14:27,568][26599] Updated weights for policy 0, policy_version 285414 (0.0051) [2024-06-19 05:14:28,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42052.1, 300 sec: 42376.2). Total num frames: 4676239360. Throughput: 0: 41916.4. Samples: 943896600. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:28,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 05:14:31,698][26599] Updated weights for policy 0, policy_version 285424 (0.0026) [2024-06-19 05:14:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41506.1, 300 sec: 42154.1). Total num frames: 4676435968. Throughput: 0: 41924.2. Samples: 944023780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:33,381][26367] Avg episode reward: [(0, '0.380')] [2024-06-19 05:14:35,239][26599] Updated weights for policy 0, policy_version 285434 (0.0039) [2024-06-19 05:14:38,381][26367] Fps is (10 sec: 42595.9, 60 sec: 41778.8, 300 sec: 42320.6). Total num frames: 4676665344. Throughput: 0: 41916.7. Samples: 944273720. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:38,382][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 05:14:39,284][26599] Updated weights for policy 0, policy_version 285444 (0.0034) [2024-06-19 05:14:43,106][26599] Updated weights for policy 0, policy_version 285454 (0.0033) [2024-06-19 05:14:43,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 4676878336. Throughput: 0: 41969.8. Samples: 944527940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:43,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 05:14:46,889][26599] Updated weights for policy 0, policy_version 285464 (0.0034) [2024-06-19 05:14:48,380][26367] Fps is (10 sec: 40962.9, 60 sec: 41779.3, 300 sec: 42209.6). 
Total num frames: 4677074944. Throughput: 0: 41964.0. Samples: 944656140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:48,384][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 05:14:51,288][26599] Updated weights for policy 0, policy_version 285474 (0.0036) [2024-06-19 05:14:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4677304320. Throughput: 0: 42189.7. Samples: 944910960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:53,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 05:14:54,588][26599] Updated weights for policy 0, policy_version 285484 (0.0029) [2024-06-19 05:14:58,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4677500928. Throughput: 0: 42068.3. Samples: 945160160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:14:58,381][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 05:14:58,859][26599] Updated weights for policy 0, policy_version 285494 (0.0043) [2024-06-19 05:15:02,577][26599] Updated weights for policy 0, policy_version 285504 (0.0034) [2024-06-19 05:15:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4677713920. Throughput: 0: 41946.6. Samples: 945282900. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:15:03,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 05:15:07,135][26599] Updated weights for policy 0, policy_version 285514 (0.0034) [2024-06-19 05:15:08,380][26367] Fps is (10 sec: 44237.8, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4677943296. Throughput: 0: 42185.0. Samples: 945539280. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:15:08,380][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 05:15:10,341][26599] Updated weights for policy 0, policy_version 285524 (0.0043) [2024-06-19 05:15:13,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41506.1, 300 sec: 42209.6). 
Total num frames: 4678107136. Throughput: 0: 42117.4. Samples: 945791880. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:15:13,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 05:15:14,731][26599] Updated weights for policy 0, policy_version 285534 (0.0026) [2024-06-19 05:15:17,863][26599] Updated weights for policy 0, policy_version 285544 (0.0037) [2024-06-19 05:15:18,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4678352896. Throughput: 0: 41958.6. Samples: 945911920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:15:18,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 05:15:22,303][26599] Updated weights for policy 0, policy_version 285554 (0.0045) [2024-06-19 05:15:23,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42265.7). Total num frames: 4678549504. Throughput: 0: 42209.6. Samples: 946173120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:15:23,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 05:15:25,873][26599] Updated weights for policy 0, policy_version 285564 (0.0042) [2024-06-19 05:15:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4678762496. Throughput: 0: 42210.2. Samples: 946427400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:15:28,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 05:15:29,468][26579] Signal inference workers to stop experience collection... (14100 times) [2024-06-19 05:15:29,470][26579] Signal inference workers to resume experience collection... 
(14100 times) [2024-06-19 05:15:29,484][26599] InferenceWorker_p0-w0: stopping experience collection (14100 times) [2024-06-19 05:15:29,512][26599] InferenceWorker_p0-w0: resuming experience collection (14100 times) [2024-06-19 05:15:29,903][26599] Updated weights for policy 0, policy_version 285574 (0.0031) [2024-06-19 05:15:33,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 4678991872. Throughput: 0: 42236.4. Samples: 946556780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:15:33,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 05:15:33,621][26599] Updated weights for policy 0, policy_version 285584 (0.0047) [2024-06-19 05:15:37,849][26599] Updated weights for policy 0, policy_version 285594 (0.0034) [2024-06-19 05:15:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.8, 300 sec: 42265.2). Total num frames: 4679188480. Throughput: 0: 42164.5. Samples: 946808360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:15:38,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 05:15:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000285595_4679188480.pth... [2024-06-19 05:15:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000284977_4669063168.pth [2024-06-19 05:15:41,261][26599] Updated weights for policy 0, policy_version 285604 (0.0037) [2024-06-19 05:15:43,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4679385088. Throughput: 0: 42285.5. Samples: 947063000. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:15:43,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 05:15:45,495][26599] Updated weights for policy 0, policy_version 285614 (0.0030) [2024-06-19 05:15:48,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4679630848. Throughput: 0: 42406.6. Samples: 947191200. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:15:48,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 05:15:48,882][26599] Updated weights for policy 0, policy_version 285624 (0.0030) [2024-06-19 05:15:53,290][26599] Updated weights for policy 0, policy_version 285634 (0.0026) [2024-06-19 05:15:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4679827456. Throughput: 0: 42285.6. Samples: 947442140. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:15:53,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 05:15:56,457][26599] Updated weights for policy 0, policy_version 285644 (0.0034) [2024-06-19 05:15:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4680040448. Throughput: 0: 42338.3. Samples: 947697100. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:15:58,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 05:16:01,157][26599] Updated weights for policy 0, policy_version 285654 (0.0033) [2024-06-19 05:16:03,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4680269824. Throughput: 0: 42605.4. Samples: 947829160. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:16:03,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 05:16:03,934][26599] Updated weights for policy 0, policy_version 285664 (0.0041) [2024-06-19 05:16:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4680466432. Throughput: 0: 42404.0. Samples: 948081300. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:16:08,381][26367] Avg episode reward: [(0, '0.394')] [2024-06-19 05:16:08,651][26599] Updated weights for policy 0, policy_version 285674 (0.0047) [2024-06-19 05:16:12,201][26599] Updated weights for policy 0, policy_version 285684 (0.0036) [2024-06-19 05:16:13,380][26367] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42320.7). 
Total num frames: 4680695808. Throughput: 0: 42402.2. Samples: 948335500. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:16:13,381][26367] Avg episode reward: [(0, '0.350')] [2024-06-19 05:16:16,108][26599] Updated weights for policy 0, policy_version 285694 (0.0044) [2024-06-19 05:16:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4680892416. Throughput: 0: 42382.2. Samples: 948463980. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:16:18,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 05:16:19,755][26599] Updated weights for policy 0, policy_version 285704 (0.0032) [2024-06-19 05:16:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 4681105408. Throughput: 0: 42392.3. Samples: 948716020. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:16:23,381][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 05:16:23,947][26599] Updated weights for policy 0, policy_version 285714 (0.0037) [2024-06-19 05:16:27,282][26599] Updated weights for policy 0, policy_version 285724 (0.0032) [2024-06-19 05:16:28,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4681334784. Throughput: 0: 42423.0. Samples: 948972040. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:16:28,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 05:16:31,663][26599] Updated weights for policy 0, policy_version 285734 (0.0028) [2024-06-19 05:16:33,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42210.2). Total num frames: 4681515008. Throughput: 0: 42448.5. Samples: 949101380. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:16:33,380][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 05:16:35,178][26599] Updated weights for policy 0, policy_version 285744 (0.0046) [2024-06-19 05:16:38,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42265.2). 
Total num frames: 4681744384. Throughput: 0: 42576.5. Samples: 949358080. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:16:38,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 05:16:39,171][26599] Updated weights for policy 0, policy_version 285754 (0.0039) [2024-06-19 05:16:43,142][26599] Updated weights for policy 0, policy_version 285764 (0.0038) [2024-06-19 05:16:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4681957376. Throughput: 0: 42551.1. Samples: 949611900. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 05:16:43,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 05:16:47,190][26599] Updated weights for policy 0, policy_version 285774 (0.0035) [2024-06-19 05:16:48,383][26367] Fps is (10 sec: 42587.2, 60 sec: 42323.5, 300 sec: 42264.8). Total num frames: 4682170368. Throughput: 0: 42467.7. Samples: 949740320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:16:48,383][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 05:16:51,062][26599] Updated weights for policy 0, policy_version 285784 (0.0030) [2024-06-19 05:16:53,384][26367] Fps is (10 sec: 42582.8, 60 sec: 42595.9, 300 sec: 42320.2). Total num frames: 4682383360. Throughput: 0: 42476.6. Samples: 949992900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:16:53,384][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 05:16:54,817][26599] Updated weights for policy 0, policy_version 285794 (0.0041) [2024-06-19 05:16:58,380][26367] Fps is (10 sec: 40970.2, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4682579968. Throughput: 0: 42635.1. Samples: 950254080. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:16:58,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 05:16:58,733][26599] Updated weights for policy 0, policy_version 285804 (0.0039) [2024-06-19 05:17:02,500][26599] Updated weights for policy 0, policy_version 285814 (0.0034) [2024-06-19 05:17:03,380][26367] Fps is (10 sec: 40975.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4682792960. Throughput: 0: 42572.5. Samples: 950379740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:03,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 05:17:06,254][26599] Updated weights for policy 0, policy_version 285824 (0.0036) [2024-06-19 05:17:07,403][26579] Signal inference workers to stop experience collection... (14150 times) [2024-06-19 05:17:07,408][26579] Signal inference workers to resume experience collection... (14150 times) [2024-06-19 05:17:07,421][26599] InferenceWorker_p0-w0: stopping experience collection (14150 times) [2024-06-19 05:17:07,448][26599] InferenceWorker_p0-w0: resuming experience collection (14150 times) [2024-06-19 05:17:08,380][26367] Fps is (10 sec: 45876.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4683038720. Throughput: 0: 42658.0. Samples: 950635620. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:08,380][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 05:17:10,294][26599] Updated weights for policy 0, policy_version 285834 (0.0039) [2024-06-19 05:17:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 4683235328. Throughput: 0: 42646.5. Samples: 950891120. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:13,380][26367] Avg episode reward: [(0, '0.411')] [2024-06-19 05:17:14,025][26599] Updated weights for policy 0, policy_version 285844 (0.0033) [2024-06-19 05:17:18,128][26599] Updated weights for policy 0, policy_version 285854 (0.0033) [2024-06-19 05:17:18,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4683431936. Throughput: 0: 42419.9. Samples: 951010280. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:18,381][26367] Avg episode reward: [(0, '0.833')] [2024-06-19 05:17:21,761][26599] Updated weights for policy 0, policy_version 285864 (0.0041) [2024-06-19 05:17:23,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 4683677696. Throughput: 0: 42384.5. Samples: 951265380. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:23,381][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 05:17:25,987][26599] Updated weights for policy 0, policy_version 285874 (0.0037) [2024-06-19 05:17:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4683857920. Throughput: 0: 42453.2. Samples: 951522300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:28,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 05:17:29,375][26599] Updated weights for policy 0, policy_version 285884 (0.0035) [2024-06-19 05:17:33,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42265.6). Total num frames: 4684070912. Throughput: 0: 42293.6. Samples: 951643420. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:33,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 05:17:33,506][26599] Updated weights for policy 0, policy_version 285894 (0.0043) [2024-06-19 05:17:37,189][26599] Updated weights for policy 0, policy_version 285904 (0.0042) [2024-06-19 05:17:38,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42376.8). Total num frames: 4684283904. Throughput: 0: 42377.7. Samples: 951899740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:38,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 05:17:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000285906_4684283904.pth... [2024-06-19 05:17:38,462][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000285288_4674158592.pth [2024-06-19 05:17:41,228][26599] Updated weights for policy 0, policy_version 285914 (0.0029) [2024-06-19 05:17:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4684480512. Throughput: 0: 42348.1. Samples: 952159740. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:43,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 05:17:45,077][26599] Updated weights for policy 0, policy_version 285924 (0.0022) [2024-06-19 05:17:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42327.2, 300 sec: 42209.6). Total num frames: 4684709888. Throughput: 0: 42401.2. Samples: 952287800. Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:48,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 05:17:48,828][26599] Updated weights for policy 0, policy_version 285934 (0.0040) [2024-06-19 05:17:52,784][26599] Updated weights for policy 0, policy_version 285944 (0.0027) [2024-06-19 05:17:53,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42327.9, 300 sec: 42320.7). Total num frames: 4684922880. Throughput: 0: 42298.6. Samples: 952539060. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 23.0) [2024-06-19 05:17:53,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 05:17:56,431][26599] Updated weights for policy 0, policy_version 285954 (0.0037) [2024-06-19 05:17:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 4685135872. Throughput: 0: 42312.8. Samples: 952795200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:17:58,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 05:18:00,429][26599] Updated weights for policy 0, policy_version 285964 (0.0041) [2024-06-19 05:18:03,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42209.7). Total num frames: 4685348864. Throughput: 0: 42540.1. Samples: 952924580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:03,380][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 05:18:03,973][26599] Updated weights for policy 0, policy_version 285974 (0.0023) [2024-06-19 05:18:08,110][26599] Updated weights for policy 0, policy_version 285984 (0.0035) [2024-06-19 05:18:08,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 42265.2). Total num frames: 4685561856. Throughput: 0: 42470.1. Samples: 953176540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:08,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 05:18:11,805][26599] Updated weights for policy 0, policy_version 285994 (0.0044) [2024-06-19 05:18:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42209.7). Total num frames: 4685774848. Throughput: 0: 42321.5. Samples: 953426760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:13,380][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 05:18:15,844][26599] Updated weights for policy 0, policy_version 286004 (0.0046) [2024-06-19 05:18:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4685987840. Throughput: 0: 42507.9. Samples: 953556280. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:18,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 05:18:19,462][26599] Updated weights for policy 0, policy_version 286014 (0.0041) [2024-06-19 05:18:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4686200832. Throughput: 0: 42583.2. Samples: 953815980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:23,380][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 05:18:23,531][26599] Updated weights for policy 0, policy_version 286024 (0.0031) [2024-06-19 05:18:27,297][26599] Updated weights for policy 0, policy_version 286034 (0.0029) [2024-06-19 05:18:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 4686413824. Throughput: 0: 42404.5. Samples: 954067940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:28,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 05:18:30,854][26579] Signal inference workers to stop experience collection... (14200 times) [2024-06-19 05:18:30,857][26579] Signal inference workers to resume experience collection... (14200 times) [2024-06-19 05:18:30,893][26599] InferenceWorker_p0-w0: stopping experience collection (14200 times) [2024-06-19 05:18:30,894][26599] InferenceWorker_p0-w0: resuming experience collection (14200 times) [2024-06-19 05:18:31,351][26599] Updated weights for policy 0, policy_version 286044 (0.0036) [2024-06-19 05:18:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4686626816. Throughput: 0: 42497.4. Samples: 954200180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:33,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 05:18:34,895][26599] Updated weights for policy 0, policy_version 286054 (0.0033) [2024-06-19 05:18:38,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42209.7). Total num frames: 4686807040. Throughput: 0: 42344.5. 
Samples: 954444560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:38,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 05:18:39,044][26599] Updated weights for policy 0, policy_version 286064 (0.0049) [2024-06-19 05:18:42,635][26599] Updated weights for policy 0, policy_version 286074 (0.0030) [2024-06-19 05:18:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4687052800. Throughput: 0: 42332.0. Samples: 954700140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:43,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 05:18:46,810][26599] Updated weights for policy 0, policy_version 286084 (0.0031) [2024-06-19 05:18:48,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4687249408. Throughput: 0: 42476.5. Samples: 954836020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:48,380][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 05:18:50,401][26599] Updated weights for policy 0, policy_version 286094 (0.0034) [2024-06-19 05:18:53,382][26367] Fps is (10 sec: 40953.3, 60 sec: 42324.2, 300 sec: 42320.5). Total num frames: 4687462400. Throughput: 0: 42337.7. Samples: 955081800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:53,382][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 05:18:54,596][26599] Updated weights for policy 0, policy_version 286104 (0.0026) [2024-06-19 05:18:58,291][26599] Updated weights for policy 0, policy_version 286114 (0.0029) [2024-06-19 05:18:58,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4687691776. Throughput: 0: 42530.2. Samples: 955340620. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:18:58,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 05:19:02,275][26599] Updated weights for policy 0, policy_version 286124 (0.0030) [2024-06-19 05:19:03,380][26367] Fps is (10 sec: 44243.9, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4687904768. Throughput: 0: 42468.5. Samples: 955467360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:19:03,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 05:19:06,041][26599] Updated weights for policy 0, policy_version 286134 (0.0051) [2024-06-19 05:19:08,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4688101376. Throughput: 0: 42234.9. Samples: 955716560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 05:19:08,381][26367] Avg episode reward: [(0, '0.303')] [2024-06-19 05:19:09,887][26599] Updated weights for policy 0, policy_version 286144 (0.0043) [2024-06-19 05:19:13,381][26367] Fps is (10 sec: 40956.5, 60 sec: 42324.7, 300 sec: 42376.1). Total num frames: 4688314368. Throughput: 0: 42448.5. Samples: 955978160. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:13,382][26367] Avg episode reward: [(0, '0.753')] [2024-06-19 05:19:13,630][26599] Updated weights for policy 0, policy_version 286154 (0.0046) [2024-06-19 05:19:17,544][26599] Updated weights for policy 0, policy_version 286164 (0.0035) [2024-06-19 05:19:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4688527360. Throughput: 0: 42294.6. Samples: 956103440. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:18,381][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 05:19:21,267][26599] Updated weights for policy 0, policy_version 286174 (0.0032) [2024-06-19 05:19:23,380][26367] Fps is (10 sec: 44240.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4688756736. Throughput: 0: 42423.0. Samples: 956353600. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:23,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 05:19:25,175][26599] Updated weights for policy 0, policy_version 286184 (0.0034) [2024-06-19 05:19:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4688953344. Throughput: 0: 42553.6. Samples: 956615060. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:28,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 05:19:28,947][26599] Updated weights for policy 0, policy_version 286194 (0.0035) [2024-06-19 05:19:29,993][26579] Signal inference workers to stop experience collection... (14250 times) [2024-06-19 05:19:29,997][26579] Signal inference workers to resume experience collection... (14250 times) [2024-06-19 05:19:30,008][26599] InferenceWorker_p0-w0: stopping experience collection (14250 times) [2024-06-19 05:19:30,027][26599] InferenceWorker_p0-w0: resuming experience collection (14250 times) [2024-06-19 05:19:32,840][26599] Updated weights for policy 0, policy_version 286204 (0.0032) [2024-06-19 05:19:33,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4689166336. Throughput: 0: 42299.0. Samples: 956739480. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:33,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 05:19:36,730][26599] Updated weights for policy 0, policy_version 286214 (0.0038) [2024-06-19 05:19:38,380][26367] Fps is (10 sec: 44237.3, 60 sec: 43144.4, 300 sec: 42431.8). Total num frames: 4689395712. Throughput: 0: 42473.5. Samples: 956993040. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:38,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 05:19:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000286218_4689395712.pth... 
[2024-06-19 05:19:38,446][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000285595_4679188480.pth [2024-06-19 05:19:40,581][26599] Updated weights for policy 0, policy_version 286224 (0.0037) [2024-06-19 05:19:43,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 4689608704. Throughput: 0: 42358.3. Samples: 957246740. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:43,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 05:19:44,749][26599] Updated weights for policy 0, policy_version 286234 (0.0041) [2024-06-19 05:19:48,329][26599] Updated weights for policy 0, policy_version 286244 (0.0035) [2024-06-19 05:19:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 4689821696. Throughput: 0: 42308.9. Samples: 957371260. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:48,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 05:19:52,585][26599] Updated weights for policy 0, policy_version 286254 (0.0043) [2024-06-19 05:19:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42599.6, 300 sec: 42431.8). Total num frames: 4690018304. Throughput: 0: 42492.6. Samples: 957628720. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:53,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 05:19:56,273][26599] Updated weights for policy 0, policy_version 286264 (0.0031) [2024-06-19 05:19:58,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4690214912. Throughput: 0: 42344.7. Samples: 957883640. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:19:58,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 05:20:00,344][26599] Updated weights for policy 0, policy_version 286274 (0.0037) [2024-06-19 05:20:03,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4690444288. Throughput: 0: 42272.0. Samples: 958005680. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:20:03,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 05:20:03,973][26599] Updated weights for policy 0, policy_version 286284 (0.0037) [2024-06-19 05:20:07,875][26599] Updated weights for policy 0, policy_version 286294 (0.0035) [2024-06-19 05:20:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4690640896. Throughput: 0: 42359.9. Samples: 958259800. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:20:08,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 05:20:11,892][26599] Updated weights for policy 0, policy_version 286304 (0.0034) [2024-06-19 05:20:13,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42326.0, 300 sec: 42376.3). Total num frames: 4690853888. Throughput: 0: 42233.5. Samples: 958515560. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:20:13,381][26367] Avg episode reward: [(0, '0.819')] [2024-06-19 05:20:15,952][26599] Updated weights for policy 0, policy_version 286314 (0.0038) [2024-06-19 05:20:18,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42322.8, 300 sec: 42431.3). Total num frames: 4691066880. Throughput: 0: 42281.5. Samples: 958642300. Policy #0 lag: (min: 1.0, avg: 8.8, max: 21.0) [2024-06-19 05:20:18,385][26367] Avg episode reward: [(0, '0.761')] [2024-06-19 05:20:19,501][26599] Updated weights for policy 0, policy_version 286324 (0.0036) [2024-06-19 05:20:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 4691263488. Throughput: 0: 42332.1. Samples: 958897980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:20:23,381][26367] Avg episode reward: [(0, '0.809')] [2024-06-19 05:20:23,634][26599] Updated weights for policy 0, policy_version 286334 (0.0032) [2024-06-19 05:20:27,410][26599] Updated weights for policy 0, policy_version 286344 (0.0025) [2024-06-19 05:20:28,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42325.4, 300 sec: 42376.2). 
Total num frames: 4691492864. Throughput: 0: 42199.0. Samples: 959145700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:20:28,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 05:20:31,541][26599] Updated weights for policy 0, policy_version 286354 (0.0029) [2024-06-19 05:20:33,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4691705856. Throughput: 0: 42286.4. Samples: 959274140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:20:33,380][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 05:20:35,151][26599] Updated weights for policy 0, policy_version 286364 (0.0038) [2024-06-19 05:20:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 4691902464. Throughput: 0: 42076.8. Samples: 959522180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:20:38,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 05:20:39,472][26599] Updated weights for policy 0, policy_version 286374 (0.0042) [2024-06-19 05:20:43,232][26599] Updated weights for policy 0, policy_version 286384 (0.0040) [2024-06-19 05:20:43,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4692131840. Throughput: 0: 42019.6. Samples: 959774520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:20:43,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 05:20:47,184][26599] Updated weights for policy 0, policy_version 286394 (0.0029) [2024-06-19 05:20:48,380][26367] Fps is (10 sec: 42598.8, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 4692328448. Throughput: 0: 42133.5. Samples: 959901680. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:20:48,380][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 05:20:48,508][26579] Signal inference workers to stop experience collection... (14300 times) [2024-06-19 05:20:48,510][26579] Signal inference workers to resume experience collection... 
(14300 times) [2024-06-19 05:20:48,552][26599] InferenceWorker_p0-w0: stopping experience collection (14300 times) [2024-06-19 05:20:48,553][26599] InferenceWorker_p0-w0: resuming experience collection (14300 times) [2024-06-19 05:20:50,909][26599] Updated weights for policy 0, policy_version 286404 (0.0036) [2024-06-19 05:20:53,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4692525056. Throughput: 0: 41985.0. Samples: 960149120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:20:53,380][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 05:20:54,911][26599] Updated weights for policy 0, policy_version 286414 (0.0031) [2024-06-19 05:20:58,381][26367] Fps is (10 sec: 42596.5, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4692754432. Throughput: 0: 42085.9. Samples: 960409440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:20:58,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 05:20:58,465][26599] Updated weights for policy 0, policy_version 286424 (0.0029) [2024-06-19 05:21:02,558][26599] Updated weights for policy 0, policy_version 286434 (0.0036) [2024-06-19 05:21:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 4692967424. Throughput: 0: 42116.9. Samples: 960537400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:21:03,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 05:21:06,163][26599] Updated weights for policy 0, policy_version 286444 (0.0028) [2024-06-19 05:21:08,380][26367] Fps is (10 sec: 42599.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4693180416. Throughput: 0: 41990.1. Samples: 960787540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:21:08,384][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 05:21:10,536][26599] Updated weights for policy 0, policy_version 286454 (0.0023) [2024-06-19 05:21:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4693377024. Throughput: 0: 42196.5. Samples: 961044540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:21:13,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 05:21:13,830][26599] Updated weights for policy 0, policy_version 286464 (0.0033) [2024-06-19 05:21:18,367][26599] Updated weights for policy 0, policy_version 286474 (0.0043) [2024-06-19 05:21:18,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42054.9, 300 sec: 42320.7). Total num frames: 4693590016. Throughput: 0: 42069.7. Samples: 961167280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:21:18,380][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 05:21:21,921][26599] Updated weights for policy 0, policy_version 286484 (0.0032) [2024-06-19 05:21:23,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 4693835776. Throughput: 0: 42152.3. Samples: 961419040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:21:23,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 05:21:26,007][26599] Updated weights for policy 0, policy_version 286494 (0.0024) [2024-06-19 05:21:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4693999616. Throughput: 0: 42152.9. Samples: 961671400. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:21:28,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 05:21:29,676][26599] Updated weights for policy 0, policy_version 286504 (0.0048) [2024-06-19 05:21:33,380][26367] Fps is (10 sec: 37683.7, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 4694212608. Throughput: 0: 42035.0. Samples: 961793260. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 05:21:33,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 05:21:33,597][26599] Updated weights for policy 0, policy_version 286514 (0.0043) [2024-06-19 05:21:37,445][26599] Updated weights for policy 0, policy_version 286524 (0.0035) [2024-06-19 05:21:38,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4694458368. Throughput: 0: 42265.2. Samples: 962051060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:21:38,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 05:21:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000286527_4694458368.pth... [2024-06-19 05:21:38,473][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000285906_4684283904.pth [2024-06-19 05:21:41,838][26599] Updated weights for policy 0, policy_version 286534 (0.0035) [2024-06-19 05:21:43,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41506.1, 300 sec: 42210.0). Total num frames: 4694622208. Throughput: 0: 42120.7. Samples: 962304860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:21:43,383][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 05:21:45,119][26599] Updated weights for policy 0, policy_version 286544 (0.0037) [2024-06-19 05:21:48,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42321.2). Total num frames: 4694867968. Throughput: 0: 42040.0. Samples: 962429200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:21:48,380][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 05:21:49,651][26599] Updated weights for policy 0, policy_version 286554 (0.0037) [2024-06-19 05:21:52,745][26599] Updated weights for policy 0, policy_version 286564 (0.0040) [2024-06-19 05:21:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4695064576. Throughput: 0: 42246.7. Samples: 962688640. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:21:53,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 05:21:57,235][26599] Updated weights for policy 0, policy_version 286574 (0.0038) [2024-06-19 05:21:58,380][26367] Fps is (10 sec: 39321.2, 60 sec: 41779.4, 300 sec: 42265.2). Total num frames: 4695261184. Throughput: 0: 42131.1. Samples: 962940440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:21:58,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 05:22:00,561][26599] Updated weights for policy 0, policy_version 286584 (0.0040) [2024-06-19 05:22:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4695490560. Throughput: 0: 42098.3. Samples: 963061700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:22:03,380][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 05:22:05,206][26599] Updated weights for policy 0, policy_version 286594 (0.0025) [2024-06-19 05:22:08,125][26599] Updated weights for policy 0, policy_version 286604 (0.0035) [2024-06-19 05:22:08,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4695719936. Throughput: 0: 42212.1. Samples: 963318580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:22:08,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 05:22:12,837][26599] Updated weights for policy 0, policy_version 286614 (0.0037) [2024-06-19 05:22:13,383][26367] Fps is (10 sec: 40950.3, 60 sec: 42050.7, 300 sec: 42264.8). Total num frames: 4695900160. Throughput: 0: 42285.0. Samples: 963574320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:22:13,383][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 05:22:16,082][26599] Updated weights for policy 0, policy_version 286624 (0.0043) [2024-06-19 05:22:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4696129536. Throughput: 0: 42269.7. Samples: 963695400. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:22:18,381][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 05:22:20,483][26599] Updated weights for policy 0, policy_version 286634 (0.0040) [2024-06-19 05:22:23,380][26367] Fps is (10 sec: 45885.7, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 4696358912. Throughput: 0: 42174.3. Samples: 963948900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:22:23,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 05:22:23,874][26599] Updated weights for policy 0, policy_version 286644 (0.0045) [2024-06-19 05:22:25,024][26579] Signal inference workers to stop experience collection... (14350 times) [2024-06-19 05:22:25,028][26579] Signal inference workers to resume experience collection... (14350 times) [2024-06-19 05:22:25,052][26599] InferenceWorker_p0-w0: stopping experience collection (14350 times) [2024-06-19 05:22:25,052][26599] InferenceWorker_p0-w0: resuming experience collection (14350 times) [2024-06-19 05:22:28,122][26599] Updated weights for policy 0, policy_version 286654 (0.0042) [2024-06-19 05:22:28,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4696539136. Throughput: 0: 42215.6. Samples: 964204560. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:22:28,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 05:22:31,542][26599] Updated weights for policy 0, policy_version 286664 (0.0036) [2024-06-19 05:22:33,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4696752128. Throughput: 0: 42168.9. Samples: 964326800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:22:33,384][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 05:22:35,796][26599] Updated weights for policy 0, policy_version 286674 (0.0032) [2024-06-19 05:22:38,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 4696981504. Throughput: 0: 42172.5. 
Samples: 964586400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:22:38,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 05:22:39,204][26599] Updated weights for policy 0, policy_version 286684 (0.0036) [2024-06-19 05:22:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4697161728. Throughput: 0: 42049.8. Samples: 964832680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 05:22:43,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 05:22:44,035][26599] Updated weights for policy 0, policy_version 286694 (0.0031) [2024-06-19 05:22:47,278][26599] Updated weights for policy 0, policy_version 286704 (0.0037) [2024-06-19 05:22:48,382][26367] Fps is (10 sec: 42591.2, 60 sec: 42324.1, 300 sec: 42320.5). Total num frames: 4697407488. Throughput: 0: 42089.5. Samples: 964955800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:22:48,382][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 05:22:51,492][26599] Updated weights for policy 0, policy_version 286714 (0.0034) [2024-06-19 05:22:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4697604096. Throughput: 0: 42028.0. Samples: 965209840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:22:53,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 05:22:55,442][26599] Updated weights for policy 0, policy_version 286724 (0.0035) [2024-06-19 05:22:58,380][26367] Fps is (10 sec: 39328.1, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4697800704. Throughput: 0: 41959.9. Samples: 965462420. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:22:58,381][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 05:22:59,197][26599] Updated weights for policy 0, policy_version 286734 (0.0033) [2024-06-19 05:23:03,331][26599] Updated weights for policy 0, policy_version 286744 (0.0038) [2024-06-19 05:23:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42209.7). Total num frames: 4698013696. Throughput: 0: 41994.7. Samples: 965585160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:03,380][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 05:23:06,819][26599] Updated weights for policy 0, policy_version 286754 (0.0036) [2024-06-19 05:23:08,384][26367] Fps is (10 sec: 42583.0, 60 sec: 41776.7, 300 sec: 42209.1). Total num frames: 4698226688. Throughput: 0: 42024.2. Samples: 965840140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:08,385][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 05:23:11,018][26599] Updated weights for policy 0, policy_version 286764 (0.0031) [2024-06-19 05:23:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42326.9, 300 sec: 42209.6). Total num frames: 4698439680. Throughput: 0: 41998.2. Samples: 966094480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:13,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 05:23:14,554][26599] Updated weights for policy 0, policy_version 286774 (0.0028) [2024-06-19 05:23:18,380][26367] Fps is (10 sec: 40974.8, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4698636288. Throughput: 0: 42173.7. Samples: 966224620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:18,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 05:23:19,001][26599] Updated weights for policy 0, policy_version 286784 (0.0044) [2024-06-19 05:23:22,253][26599] Updated weights for policy 0, policy_version 286794 (0.0040) [2024-06-19 05:23:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4698882048. Throughput: 0: 42082.2. Samples: 966480100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:23,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 05:23:26,652][26599] Updated weights for policy 0, policy_version 286804 (0.0034) [2024-06-19 05:23:28,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4699095040. Throughput: 0: 42119.5. Samples: 966728060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:28,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 05:23:30,000][26599] Updated weights for policy 0, policy_version 286814 (0.0044) [2024-06-19 05:23:33,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4699275264. Throughput: 0: 42331.8. Samples: 966860660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:33,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 05:23:34,303][26599] Updated weights for policy 0, policy_version 286824 (0.0034) [2024-06-19 05:23:37,534][26599] Updated weights for policy 0, policy_version 286834 (0.0024) [2024-06-19 05:23:38,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4699504640. Throughput: 0: 42247.0. Samples: 967110960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:38,381][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 05:23:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000286835_4699504640.pth... 
[2024-06-19 05:23:38,441][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000286218_4689395712.pth [2024-06-19 05:23:41,963][26599] Updated weights for policy 0, policy_version 286844 (0.0036) [2024-06-19 05:23:43,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4699734016. Throughput: 0: 42346.3. Samples: 967368000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:43,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 05:23:45,461][26599] Updated weights for policy 0, policy_version 286854 (0.0052) [2024-06-19 05:23:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41780.4, 300 sec: 42209.9). Total num frames: 4699914240. Throughput: 0: 42445.7. Samples: 967495220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:48,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 05:23:49,885][26599] Updated weights for policy 0, policy_version 286864 (0.0041) [2024-06-19 05:23:53,380][26367] Fps is (10 sec: 39320.7, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 4700127232. Throughput: 0: 42305.9. Samples: 967743760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:53,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 05:23:53,535][26599] Updated weights for policy 0, policy_version 286874 (0.0025) [2024-06-19 05:23:57,333][26599] Updated weights for policy 0, policy_version 286884 (0.0027) [2024-06-19 05:23:58,140][26579] Signal inference workers to stop experience collection... (14400 times) [2024-06-19 05:23:58,168][26599] InferenceWorker_p0-w0: stopping experience collection (14400 times) [2024-06-19 05:23:58,250][26579] Signal inference workers to resume experience collection... (14400 times) [2024-06-19 05:23:58,251][26599] InferenceWorker_p0-w0: resuming experience collection (14400 times) [2024-06-19 05:23:58,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4700356608. 
Throughput: 0: 42616.1. Samples: 968012200. Policy #0 lag: (min: 0.0, avg: 10.9, max: 22.0) [2024-06-19 05:23:58,381][26367] Avg episode reward: [(0, '0.415')] [2024-06-19 05:24:01,106][26599] Updated weights for policy 0, policy_version 286894 (0.0036) [2024-06-19 05:24:03,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4700553216. Throughput: 0: 42531.1. Samples: 968138520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:03,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 05:24:05,205][26599] Updated weights for policy 0, policy_version 286904 (0.0034) [2024-06-19 05:24:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42601.0, 300 sec: 42265.3). Total num frames: 4700782592. Throughput: 0: 42246.7. Samples: 968381200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:08,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 05:24:08,644][26599] Updated weights for policy 0, policy_version 286914 (0.0040) [2024-06-19 05:24:12,964][26599] Updated weights for policy 0, policy_version 286924 (0.0034) [2024-06-19 05:24:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4700979200. Throughput: 0: 42425.4. Samples: 968637200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:13,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 05:24:16,202][26599] Updated weights for policy 0, policy_version 286934 (0.0028) [2024-06-19 05:24:18,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42098.5). Total num frames: 4701175808. Throughput: 0: 42251.5. Samples: 968761980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:18,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 05:24:20,569][26599] Updated weights for policy 0, policy_version 286944 (0.0043) [2024-06-19 05:24:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42209.7). Total num frames: 4701405184. 
Throughput: 0: 42386.8. Samples: 969018360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:23,380][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 05:24:23,764][26599] Updated weights for policy 0, policy_version 286954 (0.0026) [2024-06-19 05:24:28,293][26599] Updated weights for policy 0, policy_version 286964 (0.0038) [2024-06-19 05:24:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4701618176. Throughput: 0: 42497.2. Samples: 969280380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:28,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 05:24:31,422][26599] Updated weights for policy 0, policy_version 286974 (0.0042) [2024-06-19 05:24:33,380][26367] Fps is (10 sec: 42597.3, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 4701831168. Throughput: 0: 42316.3. Samples: 969399460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:33,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 05:24:36,431][26599] Updated weights for policy 0, policy_version 286984 (0.0040) [2024-06-19 05:24:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4702044160. Throughput: 0: 42453.9. Samples: 969654180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:38,380][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 05:24:39,260][26599] Updated weights for policy 0, policy_version 286994 (0.0038) [2024-06-19 05:24:43,380][26367] Fps is (10 sec: 39322.4, 60 sec: 41506.1, 300 sec: 42043.0). Total num frames: 4702224384. Throughput: 0: 42143.6. Samples: 969908660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:43,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 05:24:44,074][26599] Updated weights for policy 0, policy_version 287004 (0.0034) [2024-06-19 05:24:47,161][26599] Updated weights for policy 0, policy_version 287014 (0.0028) [2024-06-19 05:24:48,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4702470144. Throughput: 0: 42049.2. Samples: 970030740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:48,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 05:24:51,943][26599] Updated weights for policy 0, policy_version 287024 (0.0031) [2024-06-19 05:24:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.5, 300 sec: 42209.7). Total num frames: 4702666752. Throughput: 0: 42335.6. Samples: 970286300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:53,380][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 05:24:54,886][26599] Updated weights for policy 0, policy_version 287034 (0.0033) [2024-06-19 05:24:58,380][26367] Fps is (10 sec: 39322.0, 60 sec: 41779.2, 300 sec: 42098.6). Total num frames: 4702863360. Throughput: 0: 42227.1. Samples: 970537420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:24:58,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 05:24:59,610][26599] Updated weights for policy 0, policy_version 287044 (0.0031) [2024-06-19 05:25:02,602][26599] Updated weights for policy 0, policy_version 287054 (0.0033) [2024-06-19 05:25:03,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4703109120. Throughput: 0: 42252.0. Samples: 970663320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:25:03,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 05:25:03,741][26579] Signal inference workers to stop experience collection... 
(14450 times) [2024-06-19 05:25:03,742][26579] Signal inference workers to resume experience collection... (14450 times) [2024-06-19 05:25:03,771][26599] InferenceWorker_p0-w0: stopping experience collection (14450 times) [2024-06-19 05:25:03,776][26599] InferenceWorker_p0-w0: resuming experience collection (14450 times) [2024-06-19 05:25:07,040][26599] Updated weights for policy 0, policy_version 287064 (0.0037) [2024-06-19 05:25:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4703289344. Throughput: 0: 42312.8. Samples: 970922440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:25:08,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 05:25:10,341][26599] Updated weights for policy 0, policy_version 287074 (0.0032) [2024-06-19 05:25:13,381][26367] Fps is (10 sec: 40959.1, 60 sec: 42325.2, 300 sec: 42210.1). Total num frames: 4703518720. Throughput: 0: 42184.3. Samples: 971178680. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:25:13,381][26367] Avg episode reward: [(0, '0.781')] [2024-06-19 05:25:14,739][26599] Updated weights for policy 0, policy_version 287084 (0.0026) [2024-06-19 05:25:18,118][26599] Updated weights for policy 0, policy_version 287094 (0.0035) [2024-06-19 05:25:18,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4703748096. Throughput: 0: 42498.7. Samples: 971311900. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:25:18,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 05:25:22,352][26599] Updated weights for policy 0, policy_version 287104 (0.0030) [2024-06-19 05:25:23,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4703944704. Throughput: 0: 42448.3. Samples: 971564360. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:25:23,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 05:25:26,034][26599] Updated weights for policy 0, policy_version 287114 (0.0040) [2024-06-19 05:25:28,380][26367] Fps is (10 sec: 40961.1, 60 sec: 42325.5, 300 sec: 42209.6). Total num frames: 4704157696. Throughput: 0: 42346.3. Samples: 971814240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:25:28,380][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 05:25:29,883][26599] Updated weights for policy 0, policy_version 287124 (0.0025) [2024-06-19 05:25:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4704370688. Throughput: 0: 42596.5. Samples: 971947580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:25:33,381][26367] Avg episode reward: [(0, '0.352')] [2024-06-19 05:25:33,866][26599] Updated weights for policy 0, policy_version 287134 (0.0033) [2024-06-19 05:25:37,559][26599] Updated weights for policy 0, policy_version 287144 (0.0043) [2024-06-19 05:25:38,380][26367] Fps is (10 sec: 40958.9, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 4704567296. Throughput: 0: 42478.5. Samples: 972197840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:25:38,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 05:25:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000287144_4704567296.pth... [2024-06-19 05:25:38,469][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000286527_4694458368.pth [2024-06-19 05:25:41,671][26599] Updated weights for policy 0, policy_version 287154 (0.0028) [2024-06-19 05:25:43,380][26367] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42320.7). Total num frames: 4704813056. Throughput: 0: 42519.7. Samples: 972450800. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:25:43,380][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 05:25:45,266][26599] Updated weights for policy 0, policy_version 287164 (0.0033) [2024-06-19 05:25:48,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4704993280. Throughput: 0: 42637.9. Samples: 972582020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:25:48,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 05:25:49,319][26599] Updated weights for policy 0, policy_version 287174 (0.0035) [2024-06-19 05:25:53,380][26367] Fps is (10 sec: 39320.6, 60 sec: 42325.2, 300 sec: 42209.7). Total num frames: 4705206272. Throughput: 0: 42422.5. Samples: 972831460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:25:53,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 05:25:53,823][26599] Updated weights for policy 0, policy_version 287184 (0.0024) [2024-06-19 05:25:56,893][26599] Updated weights for policy 0, policy_version 287194 (0.0028) [2024-06-19 05:25:58,382][26367] Fps is (10 sec: 47504.9, 60 sec: 43416.3, 300 sec: 42376.0). Total num frames: 4705468416. Throughput: 0: 42416.8. Samples: 973087500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:25:58,383][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 05:26:01,484][26599] Updated weights for policy 0, policy_version 287204 (0.0041) [2024-06-19 05:26:03,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 4705632256. Throughput: 0: 42466.4. Samples: 973222880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:26:03,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 05:26:04,601][26599] Updated weights for policy 0, policy_version 287214 (0.0031) [2024-06-19 05:26:08,380][26367] Fps is (10 sec: 39328.1, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4705861632. Throughput: 0: 42429.8. Samples: 973473700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:26:08,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 05:26:09,118][26599] Updated weights for policy 0, policy_version 287224 (0.0041) [2024-06-19 05:26:12,248][26599] Updated weights for policy 0, policy_version 287234 (0.0022) [2024-06-19 05:26:13,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42871.7, 300 sec: 42376.2). Total num frames: 4706091008. Throughput: 0: 42439.4. Samples: 973724020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:26:13,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 05:26:16,685][26599] Updated weights for policy 0, policy_version 287244 (0.0040) [2024-06-19 05:26:18,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42209.7). Total num frames: 4706287616. Throughput: 0: 42377.9. Samples: 973854580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:26:18,380][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 05:26:19,078][26579] Signal inference workers to stop experience collection... (14500 times) [2024-06-19 05:26:19,089][26599] InferenceWorker_p0-w0: stopping experience collection (14500 times) [2024-06-19 05:26:19,136][26579] Signal inference workers to resume experience collection... (14500 times) [2024-06-19 05:26:19,136][26599] InferenceWorker_p0-w0: resuming experience collection (14500 times) [2024-06-19 05:26:19,763][26599] Updated weights for policy 0, policy_version 287254 (0.0038) [2024-06-19 05:26:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4706500608. Throughput: 0: 42490.7. Samples: 974109920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 05:26:23,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 05:26:24,050][26599] Updated weights for policy 0, policy_version 287264 (0.0048) [2024-06-19 05:26:27,543][26599] Updated weights for policy 0, policy_version 287274 (0.0028) [2024-06-19 05:26:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4706713600. Throughput: 0: 42673.7. Samples: 974371120. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:26:28,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 05:26:31,578][26599] Updated weights for policy 0, policy_version 287284 (0.0024) [2024-06-19 05:26:33,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4706942976. Throughput: 0: 42649.7. Samples: 974501260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:26:33,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 05:26:35,024][26599] Updated weights for policy 0, policy_version 287294 (0.0045) [2024-06-19 05:26:38,380][26367] Fps is (10 sec: 44237.3, 60 sec: 43144.7, 300 sec: 42487.3). Total num frames: 4707155968. Throughput: 0: 42750.0. Samples: 974755200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:26:38,380][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 05:26:39,143][26599] Updated weights for policy 0, policy_version 287304 (0.0036) [2024-06-19 05:26:42,840][26599] Updated weights for policy 0, policy_version 287314 (0.0049) [2024-06-19 05:26:43,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4707352576. Throughput: 0: 42690.4. Samples: 975008500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:26:43,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 05:26:46,806][26599] Updated weights for policy 0, policy_version 287324 (0.0027) [2024-06-19 05:26:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42376.2). 
Total num frames: 4707565568. Throughput: 0: 42653.7. Samples: 975142300. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:26:48,380][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 05:26:50,247][26599] Updated weights for policy 0, policy_version 287334 (0.0041) [2024-06-19 05:26:53,384][26367] Fps is (10 sec: 44222.8, 60 sec: 43142.3, 300 sec: 42486.8). Total num frames: 4707794944. Throughput: 0: 42713.8. Samples: 975395960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:26:53,384][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 05:26:54,763][26599] Updated weights for policy 0, policy_version 287344 (0.0040) [2024-06-19 05:26:58,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42050.9, 300 sec: 42375.7). Total num frames: 4707991552. Throughput: 0: 42755.7. Samples: 975648180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:26:58,385][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 05:26:58,456][26599] Updated weights for policy 0, policy_version 287354 (0.0042) [2024-06-19 05:27:02,392][26599] Updated weights for policy 0, policy_version 287364 (0.0037) [2024-06-19 05:27:03,380][26367] Fps is (10 sec: 42612.1, 60 sec: 43144.4, 300 sec: 42376.2). Total num frames: 4708220928. Throughput: 0: 42768.7. Samples: 975779180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:27:03,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 05:27:06,064][26599] Updated weights for policy 0, policy_version 287374 (0.0038) [2024-06-19 05:27:08,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42871.5, 300 sec: 42487.6). Total num frames: 4708433920. Throughput: 0: 42708.0. Samples: 976031780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:27:08,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 05:27:10,038][26599] Updated weights for policy 0, policy_version 287384 (0.0039) [2024-06-19 05:27:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42376.2). 
Total num frames: 4708630528. Throughput: 0: 42825.3. Samples: 976298260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:27:13,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 05:27:13,529][26599] Updated weights for policy 0, policy_version 287394 (0.0029) [2024-06-19 05:27:17,571][26599] Updated weights for policy 0, policy_version 287404 (0.0040) [2024-06-19 05:27:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42376.2). Total num frames: 4708859904. Throughput: 0: 42722.6. Samples: 976423780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:27:18,381][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 05:27:21,026][26599] Updated weights for policy 0, policy_version 287414 (0.0031) [2024-06-19 05:27:23,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4709072896. Throughput: 0: 42792.3. Samples: 976680860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:27:23,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 05:27:25,236][26599] Updated weights for policy 0, policy_version 287424 (0.0046) [2024-06-19 05:27:28,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4709285888. Throughput: 0: 42792.6. Samples: 976934160. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:27:28,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 05:27:29,300][26599] Updated weights for policy 0, policy_version 287434 (0.0046) [2024-06-19 05:27:32,983][26599] Updated weights for policy 0, policy_version 287444 (0.0040) [2024-06-19 05:27:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4709498880. Throughput: 0: 42690.2. Samples: 977063360. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:27:33,381][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 05:27:37,017][26599] Updated weights for policy 0, policy_version 287454 (0.0036) [2024-06-19 05:27:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4709711872. Throughput: 0: 42769.0. Samples: 977320420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:27:38,381][26367] Avg episode reward: [(0, '0.823')] [2024-06-19 05:27:38,417][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000287458_4709711872.pth... [2024-06-19 05:27:38,472][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000286835_4699504640.pth [2024-06-19 05:27:40,458][26599] Updated weights for policy 0, policy_version 287464 (0.0023) [2024-06-19 05:27:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42432.0). Total num frames: 4709924864. Throughput: 0: 42775.9. Samples: 977572940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:27:43,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 05:27:44,613][26599] Updated weights for policy 0, policy_version 287474 (0.0032) [2024-06-19 05:27:47,719][26579] Signal inference workers to stop experience collection... (14550 times) [2024-06-19 05:27:47,725][26579] Signal inference workers to resume experience collection... (14550 times) [2024-06-19 05:27:47,749][26599] InferenceWorker_p0-w0: stopping experience collection (14550 times) [2024-06-19 05:27:47,749][26599] InferenceWorker_p0-w0: resuming experience collection (14550 times) [2024-06-19 05:27:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4710121472. Throughput: 0: 42661.9. Samples: 977698960. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:27:48,381][26367] Avg episode reward: [(0, '0.854')] [2024-06-19 05:27:48,465][26599] Updated weights for policy 0, policy_version 287484 (0.0039) [2024-06-19 05:27:52,221][26599] Updated weights for policy 0, policy_version 287494 (0.0024) [2024-06-19 05:27:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42327.7, 300 sec: 42487.3). Total num frames: 4710334464. Throughput: 0: 42636.0. Samples: 977950400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:27:53,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 05:27:56,391][26599] Updated weights for policy 0, policy_version 287504 (0.0036) [2024-06-19 05:27:58,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42600.9, 300 sec: 42487.3). Total num frames: 4710547456. Throughput: 0: 42286.5. Samples: 978201160. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:27:58,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 05:27:59,762][26599] Updated weights for policy 0, policy_version 287514 (0.0024) [2024-06-19 05:28:03,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42487.8). Total num frames: 4710760448. Throughput: 0: 42504.5. Samples: 978336480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:03,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 05:28:04,228][26599] Updated weights for policy 0, policy_version 287524 (0.0035) [2024-06-19 05:28:07,408][26599] Updated weights for policy 0, policy_version 287534 (0.0045) [2024-06-19 05:28:08,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4710989824. Throughput: 0: 42423.1. Samples: 978589900. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:08,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 05:28:11,731][26599] Updated weights for policy 0, policy_version 287544 (0.0035) [2024-06-19 05:28:13,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4711202816. Throughput: 0: 42470.7. Samples: 978845340. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:13,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 05:28:14,991][26599] Updated weights for policy 0, policy_version 287554 (0.0036) [2024-06-19 05:28:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4711399424. Throughput: 0: 42433.8. Samples: 978972880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:18,381][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 05:28:19,321][26599] Updated weights for policy 0, policy_version 287564 (0.0034) [2024-06-19 05:28:22,638][26599] Updated weights for policy 0, policy_version 287574 (0.0028) [2024-06-19 05:28:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4711628800. Throughput: 0: 42329.4. Samples: 979225240. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:23,380][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 05:28:26,992][26599] Updated weights for policy 0, policy_version 287584 (0.0035) [2024-06-19 05:28:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4711825408. Throughput: 0: 42415.1. Samples: 979481620. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:28,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 05:28:30,342][26599] Updated weights for policy 0, policy_version 287594 (0.0044) [2024-06-19 05:28:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4712038400. Throughput: 0: 42447.6. Samples: 979609100. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:33,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 05:28:34,715][26599] Updated weights for policy 0, policy_version 287604 (0.0031) [2024-06-19 05:28:38,044][26599] Updated weights for policy 0, policy_version 287614 (0.0042) [2024-06-19 05:28:38,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4712267776. Throughput: 0: 42526.2. Samples: 979864080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:38,384][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 05:28:42,569][26599] Updated weights for policy 0, policy_version 287624 (0.0033) [2024-06-19 05:28:43,384][26367] Fps is (10 sec: 44220.4, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 4712480768. Throughput: 0: 42570.1. Samples: 980116960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:43,385][26367] Avg episode reward: [(0, '0.367')] [2024-06-19 05:28:45,826][26599] Updated weights for policy 0, policy_version 287634 (0.0031) [2024-06-19 05:28:48,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 4712660992. Throughput: 0: 42212.3. Samples: 980236040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:28:48,381][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 05:28:50,411][26599] Updated weights for policy 0, policy_version 287644 (0.0029) [2024-06-19 05:28:53,380][26367] Fps is (10 sec: 40975.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 4712890368. Throughput: 0: 42313.9. Samples: 980494020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:28:53,380][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 05:28:53,574][26599] Updated weights for policy 0, policy_version 287654 (0.0026) [2024-06-19 05:28:58,178][26599] Updated weights for policy 0, policy_version 287664 (0.0028) [2024-06-19 05:28:58,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 4713086976. Throughput: 0: 42234.7. Samples: 980745900. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:28:58,380][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 05:29:01,484][26579] Signal inference workers to stop experience collection... (14600 times) [2024-06-19 05:29:01,484][26579] Signal inference workers to resume experience collection... (14600 times) [2024-06-19 05:29:01,505][26599] InferenceWorker_p0-w0: stopping experience collection (14600 times) [2024-06-19 05:29:01,506][26599] InferenceWorker_p0-w0: resuming experience collection (14600 times) [2024-06-19 05:29:01,636][26599] Updated weights for policy 0, policy_version 287674 (0.0042) [2024-06-19 05:29:03,380][26367] Fps is (10 sec: 39320.8, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4713283584. Throughput: 0: 42097.7. Samples: 980867280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:03,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 05:29:05,765][26599] Updated weights for policy 0, policy_version 287684 (0.0040) [2024-06-19 05:29:08,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4713529344. Throughput: 0: 42186.1. Samples: 981123620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:08,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 05:29:09,191][26599] Updated weights for policy 0, policy_version 287694 (0.0030) [2024-06-19 05:29:13,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 4713725952. Throughput: 0: 42304.6. 
Samples: 981385320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:13,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 05:29:13,494][26599] Updated weights for policy 0, policy_version 287704 (0.0036) [2024-06-19 05:29:17,028][26599] Updated weights for policy 0, policy_version 287714 (0.0039) [2024-06-19 05:29:18,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4713922560. Throughput: 0: 42150.2. Samples: 981505860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:18,380][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 05:29:21,328][26599] Updated weights for policy 0, policy_version 287724 (0.0041) [2024-06-19 05:29:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4714151936. Throughput: 0: 42207.6. Samples: 981763420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:23,381][26367] Avg episode reward: [(0, '0.285')] [2024-06-19 05:29:24,632][26599] Updated weights for policy 0, policy_version 287734 (0.0022) [2024-06-19 05:29:28,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 4714364928. Throughput: 0: 42232.6. Samples: 982017280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:28,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 05:29:29,216][26599] Updated weights for policy 0, policy_version 287744 (0.0038) [2024-06-19 05:29:32,814][26599] Updated weights for policy 0, policy_version 287754 (0.0040) [2024-06-19 05:29:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4714577920. Throughput: 0: 42355.3. Samples: 982142020. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:33,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 05:29:36,843][26599] Updated weights for policy 0, policy_version 287764 (0.0043) [2024-06-19 05:29:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.2, 300 sec: 42542.8). Total num frames: 4714774528. Throughput: 0: 42406.9. Samples: 982402340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:38,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 05:29:38,425][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000287768_4714790912.pth... [2024-06-19 05:29:38,488][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000287144_4704567296.pth [2024-06-19 05:29:40,442][26599] Updated weights for policy 0, policy_version 287774 (0.0033) [2024-06-19 05:29:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41781.8, 300 sec: 42431.8). Total num frames: 4714987520. Throughput: 0: 42438.6. Samples: 982655640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:43,381][26367] Avg episode reward: [(0, '0.783')] [2024-06-19 05:29:44,593][26599] Updated weights for policy 0, policy_version 287784 (0.0042) [2024-06-19 05:29:48,141][26599] Updated weights for policy 0, policy_version 287794 (0.0035) [2024-06-19 05:29:48,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 4715216896. Throughput: 0: 42579.1. Samples: 982783340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:48,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 05:29:52,022][26599] Updated weights for policy 0, policy_version 287804 (0.0031) [2024-06-19 05:29:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 4715413504. Throughput: 0: 42480.5. Samples: 983035240. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:53,381][26367] Avg episode reward: [(0, '0.861')] [2024-06-19 05:29:55,829][26599] Updated weights for policy 0, policy_version 287814 (0.0050) [2024-06-19 05:29:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4715626496. Throughput: 0: 42379.4. Samples: 983292400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:29:58,381][26367] Avg episode reward: [(0, '0.808')] [2024-06-19 05:29:59,597][26599] Updated weights for policy 0, policy_version 287824 (0.0045) [2024-06-19 05:30:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4715855872. Throughput: 0: 42524.8. Samples: 983419480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 05:30:03,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 05:30:03,570][26599] Updated weights for policy 0, policy_version 287834 (0.0044) [2024-06-19 05:30:07,611][26599] Updated weights for policy 0, policy_version 287844 (0.0032) [2024-06-19 05:30:08,380][26367] Fps is (10 sec: 40959.6, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 4716036096. Throughput: 0: 42341.2. Samples: 983668780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:08,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 05:30:11,320][26599] Updated weights for policy 0, policy_version 287854 (0.0041) [2024-06-19 05:30:13,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.2, 300 sec: 42376.3). Total num frames: 4716249088. Throughput: 0: 42255.8. Samples: 983918780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:13,380][26367] Avg episode reward: [(0, '0.329')] [2024-06-19 05:30:15,204][26579] Signal inference workers to stop experience collection... (14650 times) [2024-06-19 05:30:15,256][26579] Signal inference workers to resume experience collection... 
(14650 times) [2024-06-19 05:30:15,256][26599] InferenceWorker_p0-w0: stopping experience collection (14650 times) [2024-06-19 05:30:15,276][26599] InferenceWorker_p0-w0: resuming experience collection (14650 times) [2024-06-19 05:30:15,402][26599] Updated weights for policy 0, policy_version 287864 (0.0040) [2024-06-19 05:30:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4716462080. Throughput: 0: 42343.1. Samples: 984047460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:18,380][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 05:30:19,021][26599] Updated weights for policy 0, policy_version 287874 (0.0040) [2024-06-19 05:30:23,270][26599] Updated weights for policy 0, policy_version 287884 (0.0027) [2024-06-19 05:30:23,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4716691456. Throughput: 0: 42058.4. Samples: 984294960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:23,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 05:30:26,813][26599] Updated weights for policy 0, policy_version 287894 (0.0055) [2024-06-19 05:30:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4716904448. Throughput: 0: 42091.9. Samples: 984549780. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:28,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 05:30:31,451][26599] Updated weights for policy 0, policy_version 287904 (0.0032) [2024-06-19 05:30:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 4717084672. Throughput: 0: 42123.7. Samples: 984678900. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:33,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 05:30:35,049][26599] Updated weights for policy 0, policy_version 287914 (0.0037) [2024-06-19 05:30:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4717314048. Throughput: 0: 41947.1. Samples: 984922860. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:38,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 05:30:39,051][26599] Updated weights for policy 0, policy_version 287924 (0.0036) [2024-06-19 05:30:43,074][26599] Updated weights for policy 0, policy_version 287934 (0.0030) [2024-06-19 05:30:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4717527040. Throughput: 0: 41898.7. Samples: 985177840. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:43,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 05:30:46,786][26599] Updated weights for policy 0, policy_version 287944 (0.0029) [2024-06-19 05:30:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 4717723648. Throughput: 0: 41839.1. Samples: 985302240. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:48,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 05:30:50,587][26599] Updated weights for policy 0, policy_version 287954 (0.0030) [2024-06-19 05:30:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42376.5). Total num frames: 4717969408. Throughput: 0: 42029.9. Samples: 985560120. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:53,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 05:30:54,494][26599] Updated weights for policy 0, policy_version 287964 (0.0029) [2024-06-19 05:30:58,235][26599] Updated weights for policy 0, policy_version 287974 (0.0029) [2024-06-19 05:30:58,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42322.8, 300 sec: 42486.8). Total num frames: 4718166016. Throughput: 0: 42078.8. Samples: 985812480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:30:58,384][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 05:31:02,347][26599] Updated weights for policy 0, policy_version 287984 (0.0031) [2024-06-19 05:31:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 4718362624. Throughput: 0: 41961.8. Samples: 985935740. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:31:03,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 05:31:06,363][26599] Updated weights for policy 0, policy_version 287994 (0.0037) [2024-06-19 05:31:08,380][26367] Fps is (10 sec: 42613.8, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4718592000. Throughput: 0: 42166.6. Samples: 986192460. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:31:08,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 05:31:09,798][26599] Updated weights for policy 0, policy_version 288004 (0.0039) [2024-06-19 05:31:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4718772224. Throughput: 0: 42088.1. Samples: 986443740. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:31:13,380][26367] Avg episode reward: [(0, '0.327')] [2024-06-19 05:31:14,017][26599] Updated weights for policy 0, policy_version 288014 (0.0047) [2024-06-19 05:31:17,484][26599] Updated weights for policy 0, policy_version 288024 (0.0034) [2024-06-19 05:31:18,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4718985216. Throughput: 0: 41982.6. Samples: 986568120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 22.0) [2024-06-19 05:31:18,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 05:31:22,001][26599] Updated weights for policy 0, policy_version 288034 (0.0030) [2024-06-19 05:31:23,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4719230976. Throughput: 0: 42403.1. Samples: 986831000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:31:23,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 05:31:25,328][26599] Updated weights for policy 0, policy_version 288044 (0.0026) [2024-06-19 05:31:28,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 4719427584. Throughput: 0: 42409.9. Samples: 987086280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:31:28,380][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 05:31:29,688][26599] Updated weights for policy 0, policy_version 288054 (0.0043) [2024-06-19 05:31:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4719624192. Throughput: 0: 42257.8. Samples: 987203840. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:31:33,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 05:31:33,596][26599] Updated weights for policy 0, policy_version 288064 (0.0037) [2024-06-19 05:31:37,450][26599] Updated weights for policy 0, policy_version 288074 (0.0044) [2024-06-19 05:31:38,384][26367] Fps is (10 sec: 42582.3, 60 sec: 42322.8, 300 sec: 42375.7). Total num frames: 4719853568. Throughput: 0: 42332.1. Samples: 987465220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:31:38,385][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 05:31:38,520][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000288078_4719869952.pth... [2024-06-19 05:31:38,567][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000287458_4709711872.pth [2024-06-19 05:31:39,234][26579] Signal inference workers to stop experience collection... (14700 times) [2024-06-19 05:31:39,292][26599] InferenceWorker_p0-w0: stopping experience collection (14700 times) [2024-06-19 05:31:39,347][26579] Signal inference workers to resume experience collection... (14700 times) [2024-06-19 05:31:39,348][26599] InferenceWorker_p0-w0: resuming experience collection (14700 times) [2024-06-19 05:31:41,269][26599] Updated weights for policy 0, policy_version 288084 (0.0022) [2024-06-19 05:31:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4720050176. Throughput: 0: 42280.3. Samples: 987714940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:31:43,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 05:31:44,900][26599] Updated weights for policy 0, policy_version 288094 (0.0026) [2024-06-19 05:31:48,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42325.3, 300 sec: 42265.6). Total num frames: 4720263168. Throughput: 0: 42334.1. Samples: 987840780. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:31:48,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 05:31:49,061][26599] Updated weights for policy 0, policy_version 288104 (0.0030) [2024-06-19 05:31:52,662][26599] Updated weights for policy 0, policy_version 288114 (0.0038) [2024-06-19 05:31:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42321.2). Total num frames: 4720476160. Throughput: 0: 42216.5. Samples: 988092200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:31:53,380][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 05:31:56,791][26599] Updated weights for policy 0, policy_version 288124 (0.0033) [2024-06-19 05:31:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42054.8, 300 sec: 42265.2). Total num frames: 4720689152. Throughput: 0: 42279.9. Samples: 988346340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:31:58,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 05:32:00,453][26599] Updated weights for policy 0, policy_version 288134 (0.0031) [2024-06-19 05:32:03,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4720918528. Throughput: 0: 42394.6. Samples: 988475880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:32:03,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 05:32:04,353][26599] Updated weights for policy 0, policy_version 288144 (0.0033) [2024-06-19 05:32:08,153][26599] Updated weights for policy 0, policy_version 288154 (0.0034) [2024-06-19 05:32:08,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4721131520. Throughput: 0: 42325.0. Samples: 988735620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:32:08,380][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 05:32:11,964][26599] Updated weights for policy 0, policy_version 288164 (0.0033) [2024-06-19 05:32:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42265.2). 
Total num frames: 4721328128. Throughput: 0: 42109.3. Samples: 988981200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:32:13,380][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 05:32:15,695][26599] Updated weights for policy 0, policy_version 288174 (0.0040) [2024-06-19 05:32:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4721541120. Throughput: 0: 42439.1. Samples: 989113600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:32:18,380][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 05:32:19,589][26599] Updated weights for policy 0, policy_version 288184 (0.0036) [2024-06-19 05:32:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4721754112. Throughput: 0: 42407.0. Samples: 989373380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:32:23,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 05:32:23,541][26599] Updated weights for policy 0, policy_version 288194 (0.0030) [2024-06-19 05:32:27,157][26599] Updated weights for policy 0, policy_version 288204 (0.0041) [2024-06-19 05:32:28,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42598.2, 300 sec: 42320.7). Total num frames: 4721983488. Throughput: 0: 42515.0. Samples: 989628120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 05:32:28,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 05:32:31,251][26599] Updated weights for policy 0, policy_version 288214 (0.0034) [2024-06-19 05:32:33,384][26367] Fps is (10 sec: 44221.1, 60 sec: 42868.9, 300 sec: 42320.2). Total num frames: 4722196480. Throughput: 0: 42678.0. Samples: 989761440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:32:33,384][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 05:32:35,066][26599] Updated weights for policy 0, policy_version 288224 (0.0038) [2024-06-19 05:32:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42601.0, 300 sec: 42320.7). 
Total num frames: 4722409472. Throughput: 0: 42629.2. Samples: 990010520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:32:38,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 05:32:38,752][26599] Updated weights for policy 0, policy_version 288234 (0.0029) [2024-06-19 05:32:42,701][26599] Updated weights for policy 0, policy_version 288244 (0.0025) [2024-06-19 05:32:43,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4722606080. Throughput: 0: 42728.9. Samples: 990269140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:32:43,381][26367] Avg episode reward: [(0, '0.778')] [2024-06-19 05:32:46,278][26599] Updated weights for policy 0, policy_version 288254 (0.0034) [2024-06-19 05:32:48,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4722835456. Throughput: 0: 42684.4. Samples: 990396680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:32:48,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 05:32:50,294][26599] Updated weights for policy 0, policy_version 288264 (0.0027) [2024-06-19 05:32:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42376.3). Total num frames: 4723048448. Throughput: 0: 42547.4. Samples: 990650260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:32:53,381][26367] Avg episode reward: [(0, '0.853')] [2024-06-19 05:32:54,432][26599] Updated weights for policy 0, policy_version 288274 (0.0029) [2024-06-19 05:32:57,775][26579] Signal inference workers to stop experience collection... (14750 times) [2024-06-19 05:32:57,825][26599] InferenceWorker_p0-w0: stopping experience collection (14750 times) [2024-06-19 05:32:57,896][26579] Signal inference workers to resume experience collection... 
(14750 times) [2024-06-19 05:32:57,896][26599] InferenceWorker_p0-w0: resuming experience collection (14750 times) [2024-06-19 05:32:58,030][26599] Updated weights for policy 0, policy_version 288284 (0.0021) [2024-06-19 05:32:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4723245056. Throughput: 0: 42759.6. Samples: 990905380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:32:58,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 05:33:01,992][26599] Updated weights for policy 0, policy_version 288294 (0.0032) [2024-06-19 05:33:03,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4723458048. Throughput: 0: 42696.4. Samples: 991034940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:33:03,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 05:33:05,731][26599] Updated weights for policy 0, policy_version 288304 (0.0033) [2024-06-19 05:33:08,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4723671040. Throughput: 0: 42524.0. Samples: 991286960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:33:08,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 05:33:09,621][26599] Updated weights for policy 0, policy_version 288314 (0.0037) [2024-06-19 05:33:13,384][26367] Fps is (10 sec: 40945.0, 60 sec: 42322.8, 300 sec: 42264.6). Total num frames: 4723867648. Throughput: 0: 42493.6. Samples: 991540480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:33:13,384][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 05:33:13,692][26599] Updated weights for policy 0, policy_version 288324 (0.0025) [2024-06-19 05:33:17,490][26599] Updated weights for policy 0, policy_version 288334 (0.0034) [2024-06-19 05:33:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4724097024. Throughput: 0: 42270.9. Samples: 991663480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:33:18,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 05:33:21,594][26599] Updated weights for policy 0, policy_version 288344 (0.0038) [2024-06-19 05:33:23,380][26367] Fps is (10 sec: 45891.5, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 4724326400. Throughput: 0: 42383.6. Samples: 991917780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:33:23,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 05:33:24,916][26599] Updated weights for policy 0, policy_version 288354 (0.0043) [2024-06-19 05:33:28,380][26367] Fps is (10 sec: 39320.9, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4724490240. Throughput: 0: 42411.4. Samples: 992177660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:33:28,381][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 05:33:29,296][26599] Updated weights for policy 0, policy_version 288364 (0.0036) [2024-06-19 05:33:32,382][26599] Updated weights for policy 0, policy_version 288374 (0.0041) [2024-06-19 05:33:33,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42327.9, 300 sec: 42265.2). Total num frames: 4724736000. Throughput: 0: 42318.4. Samples: 992301000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:33:33,380][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 05:33:36,915][26599] Updated weights for policy 0, policy_version 288384 (0.0031) [2024-06-19 05:33:38,380][26367] Fps is (10 sec: 47514.4, 60 sec: 42598.4, 300 sec: 42321.2). Total num frames: 4724965376. Throughput: 0: 42369.8. Samples: 992556900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:33:38,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 05:33:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000288389_4724965376.pth... 
[2024-06-19 05:33:38,444][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000287768_4714790912.pth [2024-06-19 05:33:40,492][26599] Updated weights for policy 0, policy_version 288394 (0.0036) [2024-06-19 05:33:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4725145600. Throughput: 0: 42366.3. Samples: 992811860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 05:33:43,380][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 05:33:44,573][26599] Updated weights for policy 0, policy_version 288404 (0.0038) [2024-06-19 05:33:48,364][26599] Updated weights for policy 0, policy_version 288414 (0.0042) [2024-06-19 05:33:48,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42322.8, 300 sec: 42320.2). Total num frames: 4725374976. Throughput: 0: 42189.0. Samples: 992933600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:33:48,384][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 05:33:52,340][26599] Updated weights for policy 0, policy_version 288424 (0.0039) [2024-06-19 05:33:53,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4725587968. Throughput: 0: 42295.5. Samples: 993190260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:33:53,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 05:33:55,916][26599] Updated weights for policy 0, policy_version 288434 (0.0044) [2024-06-19 05:33:58,380][26367] Fps is (10 sec: 40974.4, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 4725784576. Throughput: 0: 42335.3. Samples: 993445420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:33:58,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 05:34:00,127][26599] Updated weights for policy 0, policy_version 288444 (0.0030) [2024-06-19 05:34:03,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4726013952. Throughput: 0: 42382.2. Samples: 993570680. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:03,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 05:34:03,526][26599] Updated weights for policy 0, policy_version 288454 (0.0034) [2024-06-19 05:34:07,804][26599] Updated weights for policy 0, policy_version 288464 (0.0043) [2024-06-19 05:34:08,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4726210560. Throughput: 0: 42296.1. Samples: 993821100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:08,381][26367] Avg episode reward: [(0, '0.297')] [2024-06-19 05:34:11,364][26599] Updated weights for policy 0, policy_version 288474 (0.0043) [2024-06-19 05:34:13,384][26367] Fps is (10 sec: 39307.7, 60 sec: 42325.3, 300 sec: 42320.2). Total num frames: 4726407168. Throughput: 0: 42159.5. Samples: 994074980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:13,384][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 05:34:15,422][26599] Updated weights for policy 0, policy_version 288484 (0.0034) [2024-06-19 05:34:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4726636544. Throughput: 0: 42144.8. Samples: 994197520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:18,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 05:34:19,231][26599] Updated weights for policy 0, policy_version 288494 (0.0043) [2024-06-19 05:34:23,380][26367] Fps is (10 sec: 42613.8, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 4726833152. Throughput: 0: 42126.2. Samples: 994452580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:23,381][26367] Avg episode reward: [(0, '0.468')] [2024-06-19 05:34:23,536][26599] Updated weights for policy 0, policy_version 288504 (0.0051) [2024-06-19 05:34:26,938][26599] Updated weights for policy 0, policy_version 288514 (0.0033) [2024-06-19 05:34:28,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.6, 300 sec: 42265.2). 
Total num frames: 4727046144. Throughput: 0: 42140.9. Samples: 994708200. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:28,380][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 05:34:31,189][26599] Updated weights for policy 0, policy_version 288524 (0.0039) [2024-06-19 05:34:33,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42322.7, 300 sec: 42375.7). Total num frames: 4727275520. Throughput: 0: 42327.6. Samples: 994838340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:33,384][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 05:34:34,482][26599] Updated weights for policy 0, policy_version 288534 (0.0028) [2024-06-19 05:34:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41506.2, 300 sec: 42265.2). Total num frames: 4727455744. Throughput: 0: 42243.7. Samples: 995091220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:38,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 05:34:38,809][26599] Updated weights for policy 0, policy_version 288544 (0.0037) [2024-06-19 05:34:39,597][26579] Signal inference workers to stop experience collection... (14800 times) [2024-06-19 05:34:39,622][26599] InferenceWorker_p0-w0: stopping experience collection (14800 times) [2024-06-19 05:34:39,709][26579] Signal inference workers to resume experience collection... (14800 times) [2024-06-19 05:34:39,709][26599] InferenceWorker_p0-w0: resuming experience collection (14800 times) [2024-06-19 05:34:42,091][26599] Updated weights for policy 0, policy_version 288554 (0.0031) [2024-06-19 05:34:43,380][26367] Fps is (10 sec: 40974.9, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4727685120. Throughput: 0: 42265.0. Samples: 995347340. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:43,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 05:34:46,367][26599] Updated weights for policy 0, policy_version 288564 (0.0031) [2024-06-19 05:34:48,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42054.7, 300 sec: 42320.7). Total num frames: 4727898112. Throughput: 0: 42276.0. Samples: 995473100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:48,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 05:34:50,209][26599] Updated weights for policy 0, policy_version 288574 (0.0036) [2024-06-19 05:34:53,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 4728127488. Throughput: 0: 42370.8. Samples: 995727780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 05:34:53,380][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 05:34:54,041][26599] Updated weights for policy 0, policy_version 288584 (0.0037) [2024-06-19 05:34:57,757][26599] Updated weights for policy 0, policy_version 288594 (0.0037) [2024-06-19 05:34:58,381][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 4728324096. Throughput: 0: 42386.7. Samples: 995982240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:34:58,381][26367] Avg episode reward: [(0, '0.790')] [2024-06-19 05:35:01,587][26599] Updated weights for policy 0, policy_version 288604 (0.0035) [2024-06-19 05:35:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 4728537088. Throughput: 0: 42458.8. Samples: 996108160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:03,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 05:35:05,567][26599] Updated weights for policy 0, policy_version 288614 (0.0027) [2024-06-19 05:35:08,380][26367] Fps is (10 sec: 44238.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4728766464. Throughput: 0: 42713.8. Samples: 996374700. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:08,380][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 05:35:09,083][26599] Updated weights for policy 0, policy_version 288624 (0.0039) [2024-06-19 05:35:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42601.0, 300 sec: 42376.3). Total num frames: 4728963072. Throughput: 0: 42540.9. Samples: 996622540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:13,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 05:35:13,469][26599] Updated weights for policy 0, policy_version 288634 (0.0038) [2024-06-19 05:35:16,635][26599] Updated weights for policy 0, policy_version 288644 (0.0028) [2024-06-19 05:35:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 4729192448. Throughput: 0: 42445.3. Samples: 996748220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:18,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 05:35:21,481][26599] Updated weights for policy 0, policy_version 288654 (0.0029) [2024-06-19 05:35:23,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4729405440. Throughput: 0: 42711.1. Samples: 997013220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:23,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 05:35:24,139][26599] Updated weights for policy 0, policy_version 288664 (0.0027) [2024-06-19 05:35:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4729602048. Throughput: 0: 42504.0. Samples: 997260020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:28,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 05:35:29,226][26599] Updated weights for policy 0, policy_version 288674 (0.0041) [2024-06-19 05:35:32,279][26599] Updated weights for policy 0, policy_version 288684 (0.0041) [2024-06-19 05:35:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42600.9, 300 sec: 42431.8). Total num frames: 4729831424. Throughput: 0: 42550.3. Samples: 997387860. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:33,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 05:35:36,864][26599] Updated weights for policy 0, policy_version 288694 (0.0034) [2024-06-19 05:35:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 4730044416. Throughput: 0: 42651.4. Samples: 997647100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:38,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 05:35:38,395][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000288699_4730044416.pth... [2024-06-19 05:35:38,454][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000288078_4719869952.pth [2024-06-19 05:35:39,986][26599] Updated weights for policy 0, policy_version 288704 (0.0033) [2024-06-19 05:35:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4730224640. Throughput: 0: 42542.9. Samples: 997896660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:43,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 05:35:44,564][26599] Updated weights for policy 0, policy_version 288714 (0.0035) [2024-06-19 05:35:47,618][26599] Updated weights for policy 0, policy_version 288724 (0.0029) [2024-06-19 05:35:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4730470400. Throughput: 0: 42520.3. Samples: 998021580. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:48,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 05:35:52,328][26599] Updated weights for policy 0, policy_version 288734 (0.0038) [2024-06-19 05:35:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42376.8). Total num frames: 4730667008. Throughput: 0: 42481.8. Samples: 998286380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:53,380][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 05:35:55,294][26599] Updated weights for policy 0, policy_version 288744 (0.0034) [2024-06-19 05:35:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4730880000. Throughput: 0: 42431.4. Samples: 998531960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:35:58,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 05:35:59,974][26599] Updated weights for policy 0, policy_version 288754 (0.0027) [2024-06-19 05:36:03,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4731092992. Throughput: 0: 42462.5. Samples: 998659040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:36:03,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 05:36:03,473][26599] Updated weights for policy 0, policy_version 288764 (0.0030) [2024-06-19 05:36:05,122][26579] Signal inference workers to stop experience collection... (14850 times) [2024-06-19 05:36:05,122][26579] Signal inference workers to resume experience collection... (14850 times) [2024-06-19 05:36:05,154][26599] InferenceWorker_p0-w0: stopping experience collection (14850 times) [2024-06-19 05:36:05,154][26599] InferenceWorker_p0-w0: resuming experience collection (14850 times) [2024-06-19 05:36:07,785][26599] Updated weights for policy 0, policy_version 288774 (0.0028) [2024-06-19 05:36:08,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4731289600. Throughput: 0: 42390.4. 
Samples: 998920780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:36:08,380][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 05:36:11,098][26599] Updated weights for policy 0, policy_version 288784 (0.0030) [2024-06-19 05:36:13,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4731486208. Throughput: 0: 42422.3. Samples: 999169020. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:13,380][26367] Avg episode reward: [(0, '0.371')] [2024-06-19 05:36:15,728][26599] Updated weights for policy 0, policy_version 288794 (0.0031) [2024-06-19 05:36:18,380][26367] Fps is (10 sec: 45874.0, 60 sec: 42598.2, 300 sec: 42431.8). Total num frames: 4731748352. Throughput: 0: 42302.6. Samples: 999291480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:18,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 05:36:18,673][26599] Updated weights for policy 0, policy_version 288804 (0.0025) [2024-06-19 05:36:23,177][26599] Updated weights for policy 0, policy_version 288814 (0.0035) [2024-06-19 05:36:23,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4731944960. Throughput: 0: 42313.0. Samples: 999551180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:23,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 05:36:26,316][26599] Updated weights for policy 0, policy_version 288824 (0.0034) [2024-06-19 05:36:28,380][26367] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4732141568. Throughput: 0: 42515.7. Samples: 999809860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:28,380][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 05:36:31,054][26599] Updated weights for policy 0, policy_version 288834 (0.0030) [2024-06-19 05:36:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42487.8). Total num frames: 4732387328. Throughput: 0: 42550.3. 
Samples: 999936340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:33,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 05:36:33,950][26599] Updated weights for policy 0, policy_version 288844 (0.0036) [2024-06-19 05:36:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 4732567552. Throughput: 0: 42257.4. Samples: 1000187960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:38,380][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 05:36:38,530][26599] Updated weights for policy 0, policy_version 288854 (0.0026) [2024-06-19 05:36:41,853][26599] Updated weights for policy 0, policy_version 288864 (0.0040) [2024-06-19 05:36:43,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4732764160. Throughput: 0: 42375.2. Samples: 1000438840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:43,380][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 05:36:46,709][26599] Updated weights for policy 0, policy_version 288874 (0.0025) [2024-06-19 05:36:48,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4733009920. Throughput: 0: 42430.8. Samples: 1000568420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:48,380][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 05:36:49,800][26599] Updated weights for policy 0, policy_version 288884 (0.0036) [2024-06-19 05:36:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4733206528. Throughput: 0: 42244.0. Samples: 1000821760. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:53,380][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 05:36:54,252][26599] Updated weights for policy 0, policy_version 288894 (0.0028) [2024-06-19 05:36:57,494][26599] Updated weights for policy 0, policy_version 288904 (0.0036) [2024-06-19 05:36:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4733419520. Throughput: 0: 42383.0. Samples: 1001076260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:36:58,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 05:37:01,874][26599] Updated weights for policy 0, policy_version 288914 (0.0031) [2024-06-19 05:37:03,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4733632512. Throughput: 0: 42588.5. Samples: 1001207960. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:37:03,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 05:37:05,047][26599] Updated weights for policy 0, policy_version 288924 (0.0023) [2024-06-19 05:37:08,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4733845504. Throughput: 0: 42473.4. Samples: 1001462480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:37:08,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 05:37:09,539][26599] Updated weights for policy 0, policy_version 288934 (0.0040) [2024-06-19 05:37:12,526][26599] Updated weights for policy 0, policy_version 288944 (0.0033) [2024-06-19 05:37:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 4734058496. Throughput: 0: 42230.6. Samples: 1001710240. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:37:13,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 05:37:17,618][26599] Updated weights for policy 0, policy_version 288954 (0.0030) [2024-06-19 05:37:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 4734271488. Throughput: 0: 42337.8. Samples: 1001841540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 20.0) [2024-06-19 05:37:18,384][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 05:37:20,311][26599] Updated weights for policy 0, policy_version 288964 (0.0035) [2024-06-19 05:37:23,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4734484480. Throughput: 0: 42460.9. Samples: 1002098700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:37:23,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 05:37:25,275][26599] Updated weights for policy 0, policy_version 288974 (0.0035) [2024-06-19 05:37:27,833][26599] Updated weights for policy 0, policy_version 288984 (0.0030) [2024-06-19 05:37:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42432.3). Total num frames: 4734713856. Throughput: 0: 42369.2. Samples: 1002345460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:37:28,384][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 05:37:33,022][26599] Updated weights for policy 0, policy_version 288994 (0.0042) [2024-06-19 05:37:33,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41506.2, 300 sec: 42265.2). Total num frames: 4734877696. Throughput: 0: 42437.8. Samples: 1002478120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:37:33,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 05:37:34,713][26579] Signal inference workers to stop experience collection... (14900 times) [2024-06-19 05:37:34,767][26579] Signal inference workers to resume experience collection... 
(14900 times) [2024-06-19 05:37:34,768][26599] InferenceWorker_p0-w0: stopping experience collection (14900 times) [2024-06-19 05:37:34,796][26599] InferenceWorker_p0-w0: resuming experience collection (14900 times) [2024-06-19 05:37:35,508][26599] Updated weights for policy 0, policy_version 289004 (0.0030) [2024-06-19 05:37:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 4735107072. Throughput: 0: 42477.2. Samples: 1002733240. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:37:38,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 05:37:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000289008_4735107072.pth... [2024-06-19 05:37:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000288389_4724965376.pth [2024-06-19 05:37:40,739][26599] Updated weights for policy 0, policy_version 289014 (0.0031) [2024-06-19 05:37:43,380][26367] Fps is (10 sec: 47513.0, 60 sec: 43144.4, 300 sec: 42431.8). Total num frames: 4735352832. Throughput: 0: 42354.7. Samples: 1002982220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:37:43,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 05:37:43,550][26599] Updated weights for policy 0, policy_version 289024 (0.0036) [2024-06-19 05:37:48,296][26599] Updated weights for policy 0, policy_version 289034 (0.0042) [2024-06-19 05:37:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4735533056. Throughput: 0: 42497.3. Samples: 1003120340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:37:48,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 05:37:51,268][26599] Updated weights for policy 0, policy_version 289044 (0.0036) [2024-06-19 05:37:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 4735746048. Throughput: 0: 42304.3. Samples: 1003366180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:37:53,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 05:37:55,962][26599] Updated weights for policy 0, policy_version 289054 (0.0026) [2024-06-19 05:37:58,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4735991808. Throughput: 0: 42432.9. Samples: 1003619720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:37:58,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 05:37:58,918][26599] Updated weights for policy 0, policy_version 289064 (0.0037) [2024-06-19 05:38:03,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4736172032. Throughput: 0: 42510.3. Samples: 1003754500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:38:03,380][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 05:38:03,459][26599] Updated weights for policy 0, policy_version 289074 (0.0036) [2024-06-19 05:38:06,553][26599] Updated weights for policy 0, policy_version 289084 (0.0044) [2024-06-19 05:38:08,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42598.2, 300 sec: 42487.8). Total num frames: 4736401408. Throughput: 0: 42367.7. Samples: 1004005260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:38:08,381][26367] Avg episode reward: [(0, '0.737')] [2024-06-19 05:38:11,143][26599] Updated weights for policy 0, policy_version 289094 (0.0035) [2024-06-19 05:38:13,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4736598016. Throughput: 0: 42704.5. Samples: 1004267160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:38:13,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 05:38:14,220][26599] Updated weights for policy 0, policy_version 289104 (0.0023) [2024-06-19 05:38:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4736811008. Throughput: 0: 42566.5. Samples: 1004393620. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:38:18,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 05:38:18,711][26599] Updated weights for policy 0, policy_version 289114 (0.0035) [2024-06-19 05:38:22,032][26599] Updated weights for policy 0, policy_version 289124 (0.0030) [2024-06-19 05:38:23,384][26367] Fps is (10 sec: 45858.2, 60 sec: 42868.8, 300 sec: 42597.9). Total num frames: 4737056768. Throughput: 0: 42440.6. Samples: 1004643220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:38:23,385][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 05:38:26,360][26599] Updated weights for policy 0, policy_version 289134 (0.0037) [2024-06-19 05:38:28,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4737253376. Throughput: 0: 42749.0. Samples: 1004905920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:38:28,380][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 05:38:30,052][26599] Updated weights for policy 0, policy_version 289144 (0.0040) [2024-06-19 05:38:33,380][26367] Fps is (10 sec: 39335.7, 60 sec: 42871.3, 300 sec: 42320.7). Total num frames: 4737449984. Throughput: 0: 42509.7. Samples: 1005033280. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 05:38:33,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 05:38:34,046][26599] Updated weights for policy 0, policy_version 289154 (0.0036) [2024-06-19 05:38:37,608][26599] Updated weights for policy 0, policy_version 289164 (0.0039) [2024-06-19 05:38:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 4737679360. Throughput: 0: 42674.9. Samples: 1005286540. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:38:38,380][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 05:38:41,777][26599] Updated weights for policy 0, policy_version 289174 (0.0039) [2024-06-19 05:38:43,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42432.3). Total num frames: 4737892352. Throughput: 0: 42665.8. Samples: 1005539680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:38:43,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 05:38:45,603][26599] Updated weights for policy 0, policy_version 289184 (0.0038) [2024-06-19 05:38:48,380][26367] Fps is (10 sec: 39320.9, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4738072576. Throughput: 0: 42473.1. Samples: 1005665800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:38:48,381][26367] Avg episode reward: [(0, '0.361')] [2024-06-19 05:38:49,591][26599] Updated weights for policy 0, policy_version 289194 (0.0045) [2024-06-19 05:38:53,199][26599] Updated weights for policy 0, policy_version 289204 (0.0038) [2024-06-19 05:38:53,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4738318336. Throughput: 0: 42591.2. Samples: 1005921860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:38:53,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 05:38:57,328][26599] Updated weights for policy 0, policy_version 289214 (0.0034) [2024-06-19 05:38:58,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4738531328. Throughput: 0: 42347.6. Samples: 1006172800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:38:58,381][26367] Avg episode reward: [(0, '0.833')] [2024-06-19 05:38:58,527][26579] Signal inference workers to stop experience collection... 
(14950 times) [2024-06-19 05:38:58,559][26599] InferenceWorker_p0-w0: stopping experience collection (14950 times) [2024-06-19 05:38:58,593][26579] Signal inference workers to resume experience collection... (14950 times) [2024-06-19 05:38:58,596][26599] InferenceWorker_p0-w0: resuming experience collection (14950 times) [2024-06-19 05:39:00,941][26599] Updated weights for policy 0, policy_version 289224 (0.0032) [2024-06-19 05:39:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 4738711552. Throughput: 0: 42350.3. Samples: 1006299380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:03,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 05:39:05,104][26599] Updated weights for policy 0, policy_version 289234 (0.0038) [2024-06-19 05:39:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.5, 300 sec: 42487.8). Total num frames: 4738940928. Throughput: 0: 42429.3. Samples: 1006552380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:08,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 05:39:08,570][26599] Updated weights for policy 0, policy_version 289244 (0.0028) [2024-06-19 05:39:12,971][26599] Updated weights for policy 0, policy_version 289254 (0.0035) [2024-06-19 05:39:13,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4739153920. Throughput: 0: 42246.7. Samples: 1006807020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:13,380][26367] Avg episode reward: [(0, '0.851')] [2024-06-19 05:39:16,337][26599] Updated weights for policy 0, policy_version 289264 (0.0050) [2024-06-19 05:39:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4739350528. Throughput: 0: 42196.1. Samples: 1006932100. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:18,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 05:39:20,530][26599] Updated weights for policy 0, policy_version 289274 (0.0040) [2024-06-19 05:39:23,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42054.8, 300 sec: 42487.3). Total num frames: 4739579904. Throughput: 0: 42176.8. Samples: 1007184500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:23,381][26367] Avg episode reward: [(0, '0.319')] [2024-06-19 05:39:24,021][26599] Updated weights for policy 0, policy_version 289284 (0.0031) [2024-06-19 05:39:28,189][26599] Updated weights for policy 0, policy_version 289294 (0.0039) [2024-06-19 05:39:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42432.3). Total num frames: 4739792896. Throughput: 0: 42289.8. Samples: 1007442720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:28,381][26367] Avg episode reward: [(0, '0.488')] [2024-06-19 05:39:31,722][26599] Updated weights for policy 0, policy_version 289304 (0.0028) [2024-06-19 05:39:33,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4739973120. Throughput: 0: 42216.5. Samples: 1007565540. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:33,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 05:39:35,970][26599] Updated weights for policy 0, policy_version 289314 (0.0026) [2024-06-19 05:39:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 4740218880. Throughput: 0: 42191.1. Samples: 1007820460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:38,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 05:39:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000289320_4740218880.pth... 
[2024-06-19 05:39:38,443][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000288699_4730044416.pth [2024-06-19 05:39:39,886][26599] Updated weights for policy 0, policy_version 289324 (0.0033) [2024-06-19 05:39:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 4740415488. Throughput: 0: 42311.5. Samples: 1008076820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:43,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 05:39:43,782][26599] Updated weights for policy 0, policy_version 289334 (0.0038) [2024-06-19 05:39:47,590][26599] Updated weights for policy 0, policy_version 289344 (0.0031) [2024-06-19 05:39:48,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4740612096. Throughput: 0: 42176.0. Samples: 1008197300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 05:39:48,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 05:39:51,374][26599] Updated weights for policy 0, policy_version 289354 (0.0028) [2024-06-19 05:39:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4740857856. Throughput: 0: 42254.6. Samples: 1008453840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:39:53,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 05:39:55,164][26599] Updated weights for policy 0, policy_version 289364 (0.0042) [2024-06-19 05:39:58,380][26367] Fps is (10 sec: 42599.3, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 4741038080. Throughput: 0: 42329.8. Samples: 1008711860. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:39:58,380][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 05:39:59,182][26599] Updated weights for policy 0, policy_version 289374 (0.0029) [2024-06-19 05:40:02,846][26599] Updated weights for policy 0, policy_version 289384 (0.0040) [2024-06-19 05:40:03,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 4741267456. Throughput: 0: 42269.9. Samples: 1008834240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:03,380][26367] Avg episode reward: [(0, '0.365')] [2024-06-19 05:40:07,122][26599] Updated weights for policy 0, policy_version 289394 (0.0042) [2024-06-19 05:40:08,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4741496832. Throughput: 0: 42376.9. Samples: 1009091460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:08,381][26367] Avg episode reward: [(0, '0.307')] [2024-06-19 05:40:11,283][26599] Updated weights for policy 0, policy_version 289404 (0.0036) [2024-06-19 05:40:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4741677056. Throughput: 0: 42254.7. Samples: 1009344180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:13,380][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 05:40:14,873][26599] Updated weights for policy 0, policy_version 289414 (0.0030) [2024-06-19 05:40:18,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4741890048. Throughput: 0: 42242.6. Samples: 1009466460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:18,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 05:40:18,921][26599] Updated weights for policy 0, policy_version 289424 (0.0034) [2024-06-19 05:40:22,633][26599] Updated weights for policy 0, policy_version 289434 (0.0043) [2024-06-19 05:40:23,380][26367] Fps is (10 sec: 42597.1, 60 sec: 42052.1, 300 sec: 42376.2). Total num frames: 4742103040. Throughput: 0: 42318.9. Samples: 1009724820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:23,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 05:40:23,447][26579] Signal inference workers to stop experience collection... (15000 times) [2024-06-19 05:40:23,447][26579] Signal inference workers to resume experience collection... (15000 times) [2024-06-19 05:40:23,494][26599] InferenceWorker_p0-w0: stopping experience collection (15000 times) [2024-06-19 05:40:23,495][26599] InferenceWorker_p0-w0: resuming experience collection (15000 times) [2024-06-19 05:40:26,517][26599] Updated weights for policy 0, policy_version 289444 (0.0034) [2024-06-19 05:40:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4742316032. Throughput: 0: 42406.2. Samples: 1009985100. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:28,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 05:40:30,508][26599] Updated weights for policy 0, policy_version 289454 (0.0029) [2024-06-19 05:40:33,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4742545408. Throughput: 0: 42411.6. Samples: 1010105820. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:33,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 05:40:34,186][26599] Updated weights for policy 0, policy_version 289464 (0.0037) [2024-06-19 05:40:38,269][26599] Updated weights for policy 0, policy_version 289474 (0.0038) [2024-06-19 05:40:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 4742742016. Throughput: 0: 42384.0. Samples: 1010361120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:38,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 05:40:42,108][26599] Updated weights for policy 0, policy_version 289484 (0.0046) [2024-06-19 05:40:43,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4742955008. Throughput: 0: 42239.0. Samples: 1010612620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:43,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 05:40:46,043][26599] Updated weights for policy 0, policy_version 289494 (0.0037) [2024-06-19 05:40:48,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 4743184384. Throughput: 0: 42415.3. Samples: 1010742940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:48,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 05:40:50,061][26599] Updated weights for policy 0, policy_version 289504 (0.0037) [2024-06-19 05:40:53,382][26367] Fps is (10 sec: 42590.7, 60 sec: 42051.1, 300 sec: 42376.0). Total num frames: 4743380992. Throughput: 0: 42172.6. Samples: 1010989300. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:53,383][26367] Avg episode reward: [(0, '0.240')] [2024-06-19 05:40:53,817][26599] Updated weights for policy 0, policy_version 289514 (0.0046) [2024-06-19 05:40:57,710][26599] Updated weights for policy 0, policy_version 289524 (0.0037) [2024-06-19 05:40:58,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4743577600. Throughput: 0: 42255.8. Samples: 1011245700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 05:40:58,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 05:41:01,215][26599] Updated weights for policy 0, policy_version 289534 (0.0046) [2024-06-19 05:41:03,380][26367] Fps is (10 sec: 42605.8, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4743806976. Throughput: 0: 42319.1. Samples: 1011370820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:03,382][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 05:41:05,305][26599] Updated weights for policy 0, policy_version 289544 (0.0043) [2024-06-19 05:41:08,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4744019968. Throughput: 0: 42325.6. Samples: 1011629460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:08,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 05:41:08,711][26599] Updated weights for policy 0, policy_version 289554 (0.0045) [2024-06-19 05:41:13,001][26599] Updated weights for policy 0, policy_version 289564 (0.0041) [2024-06-19 05:41:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4744232960. Throughput: 0: 42168.8. Samples: 1011882700. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:13,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 05:41:16,405][26599] Updated weights for policy 0, policy_version 289574 (0.0041) [2024-06-19 05:41:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4744445952. Throughput: 0: 42304.1. Samples: 1012009500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:18,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 05:41:20,643][26599] Updated weights for policy 0, policy_version 289584 (0.0038) [2024-06-19 05:41:23,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 4744675328. Throughput: 0: 42304.9. Samples: 1012264840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:23,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 05:41:24,230][26599] Updated weights for policy 0, policy_version 289594 (0.0053) [2024-06-19 05:41:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4744855552. Throughput: 0: 42354.8. Samples: 1012518580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:28,380][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 05:41:28,486][26599] Updated weights for policy 0, policy_version 289604 (0.0029) [2024-06-19 05:41:32,063][26599] Updated weights for policy 0, policy_version 289614 (0.0024) [2024-06-19 05:41:33,384][26367] Fps is (10 sec: 39307.5, 60 sec: 42049.7, 300 sec: 42375.7). Total num frames: 4745068544. Throughput: 0: 42186.9. Samples: 1012641500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:33,384][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 05:41:36,170][26599] Updated weights for policy 0, policy_version 289624 (0.0040) [2024-06-19 05:41:38,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4745297920. Throughput: 0: 42495.8. Samples: 1012901540. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:38,381][26367] Avg episode reward: [(0, '0.750')] [2024-06-19 05:41:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000289630_4745297920.pth... [2024-06-19 05:41:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000289008_4735107072.pth [2024-06-19 05:41:39,645][26599] Updated weights for policy 0, policy_version 289634 (0.0039) [2024-06-19 05:41:43,380][26367] Fps is (10 sec: 42613.4, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4745494528. Throughput: 0: 42420.4. Samples: 1013154620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:43,381][26367] Avg episode reward: [(0, '0.761')] [2024-06-19 05:41:44,115][26599] Updated weights for policy 0, policy_version 289644 (0.0028) [2024-06-19 05:41:47,690][26599] Updated weights for policy 0, policy_version 289654 (0.0028) [2024-06-19 05:41:48,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 4745723904. Throughput: 0: 42413.0. Samples: 1013279400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:48,380][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 05:41:51,884][26599] Updated weights for policy 0, policy_version 289664 (0.0032) [2024-06-19 05:41:53,380][26367] Fps is (10 sec: 44237.8, 60 sec: 42599.8, 300 sec: 42431.8). Total num frames: 4745936896. Throughput: 0: 42357.4. Samples: 1013535540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:53,380][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 05:41:55,235][26599] Updated weights for policy 0, policy_version 289674 (0.0044) [2024-06-19 05:41:58,380][26367] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4746100736. Throughput: 0: 42594.8. Samples: 1013799460. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:41:58,381][26367] Avg episode reward: [(0, '0.787')] [2024-06-19 05:41:59,528][26599] Updated weights for policy 0, policy_version 289684 (0.0045) [2024-06-19 05:41:59,536][26579] Signal inference workers to stop experience collection... (15050 times) [2024-06-19 05:41:59,536][26579] Signal inference workers to resume experience collection... (15050 times) [2024-06-19 05:41:59,559][26599] InferenceWorker_p0-w0: stopping experience collection (15050 times) [2024-06-19 05:41:59,560][26599] InferenceWorker_p0-w0: resuming experience collection (15050 times) [2024-06-19 05:42:02,735][26599] Updated weights for policy 0, policy_version 289694 (0.0029) [2024-06-19 05:42:03,384][26367] Fps is (10 sec: 42582.4, 60 sec: 42595.9, 300 sec: 42431.2). Total num frames: 4746362880. Throughput: 0: 42354.8. Samples: 1013915620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:42:03,384][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 05:42:07,215][26599] Updated weights for policy 0, policy_version 289704 (0.0033) [2024-06-19 05:42:08,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4746559488. Throughput: 0: 42475.2. Samples: 1014176220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:42:08,381][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 05:42:10,522][26599] Updated weights for policy 0, policy_version 289714 (0.0031) [2024-06-19 05:42:13,384][26367] Fps is (10 sec: 37683.1, 60 sec: 41776.7, 300 sec: 42264.6). Total num frames: 4746739712. Throughput: 0: 42468.4. Samples: 1014429820. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 05:42:13,385][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 05:42:15,100][26599] Updated weights for policy 0, policy_version 289724 (0.0041) [2024-06-19 05:42:18,306][26599] Updated weights for policy 0, policy_version 289734 (0.0032) [2024-06-19 05:42:18,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4747001856. Throughput: 0: 42476.7. Samples: 1014552800. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:42:18,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 05:42:22,804][26599] Updated weights for policy 0, policy_version 289744 (0.0031) [2024-06-19 05:42:23,380][26367] Fps is (10 sec: 45892.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4747198464. Throughput: 0: 42548.1. Samples: 1014816200. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:42:23,381][26367] Avg episode reward: [(0, '0.787')] [2024-06-19 05:42:25,789][26599] Updated weights for policy 0, policy_version 289754 (0.0035) [2024-06-19 05:42:28,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4747395072. Throughput: 0: 42428.1. Samples: 1015063880. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:42:28,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 05:42:30,535][26599] Updated weights for policy 0, policy_version 289764 (0.0027) [2024-06-19 05:42:33,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42871.5, 300 sec: 42486.8). Total num frames: 4747640832. Throughput: 0: 42359.2. Samples: 1015185720. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:42:33,384][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 05:42:33,908][26599] Updated weights for policy 0, policy_version 289774 (0.0027) [2024-06-19 05:42:38,264][26599] Updated weights for policy 0, policy_version 289784 (0.0039) [2024-06-19 05:42:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4747821056. Throughput: 0: 42506.5. Samples: 1015448340. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:42:38,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 05:42:41,357][26599] Updated weights for policy 0, policy_version 289794 (0.0045) [2024-06-19 05:42:43,380][26367] Fps is (10 sec: 39335.3, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4748034048. Throughput: 0: 42155.0. Samples: 1015696440. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:42:43,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 05:42:46,159][26599] Updated weights for policy 0, policy_version 289804 (0.0041) [2024-06-19 05:42:48,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 4748279808. Throughput: 0: 42510.4. Samples: 1015828440. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:42:48,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 05:42:49,396][26599] Updated weights for policy 0, policy_version 289814 (0.0047) [2024-06-19 05:42:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41779.0, 300 sec: 42209.6). Total num frames: 4748443648. Throughput: 0: 42210.1. Samples: 1016075680. 
Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:42:53,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 05:42:53,760][26599] Updated weights for policy 0, policy_version 289824 (0.0042) [2024-06-19 05:42:57,270][26599] Updated weights for policy 0, policy_version 289834 (0.0033) [2024-06-19 05:42:58,380][26367] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4748656640. Throughput: 0: 42151.0. Samples: 1016326460. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:42:58,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 05:43:01,369][26599] Updated weights for policy 0, policy_version 289844 (0.0032) [2024-06-19 05:43:03,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42327.9, 300 sec: 42376.3). Total num frames: 4748902400. Throughput: 0: 42391.7. Samples: 1016460420. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:43:03,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 05:43:05,023][26599] Updated weights for policy 0, policy_version 289854 (0.0039) [2024-06-19 05:43:08,382][26367] Fps is (10 sec: 42590.0, 60 sec: 42050.9, 300 sec: 42320.4). Total num frames: 4749082624. Throughput: 0: 42000.8. Samples: 1016706320. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:43:08,383][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 05:43:09,283][26599] Updated weights for policy 0, policy_version 289864 (0.0039) [2024-06-19 05:43:13,063][26599] Updated weights for policy 0, policy_version 289874 (0.0052) [2024-06-19 05:43:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42601.0, 300 sec: 42320.7). Total num frames: 4749295616. Throughput: 0: 42170.2. Samples: 1016961540. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:43:13,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 05:43:16,889][26579] Signal inference workers to stop experience collection... 
(15100 times) [2024-06-19 05:43:16,917][26599] InferenceWorker_p0-w0: stopping experience collection (15100 times) [2024-06-19 05:43:16,958][26579] Signal inference workers to resume experience collection... (15100 times) [2024-06-19 05:43:16,958][26599] InferenceWorker_p0-w0: resuming experience collection (15100 times) [2024-06-19 05:43:17,093][26599] Updated weights for policy 0, policy_version 289884 (0.0026) [2024-06-19 05:43:18,380][26367] Fps is (10 sec: 45884.4, 60 sec: 42325.5, 300 sec: 42321.2). Total num frames: 4749541376. Throughput: 0: 42248.8. Samples: 1017086760. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:43:18,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 05:43:20,755][26599] Updated weights for policy 0, policy_version 289894 (0.0034) [2024-06-19 05:43:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42265.1). Total num frames: 4749721600. Throughput: 0: 42091.6. Samples: 1017342460. Policy #0 lag: (min: 2.0, avg: 8.8, max: 21.0) [2024-06-19 05:43:23,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 05:43:24,779][26599] Updated weights for policy 0, policy_version 289904 (0.0040) [2024-06-19 05:43:28,305][26599] Updated weights for policy 0, policy_version 289914 (0.0038) [2024-06-19 05:43:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4749950976. Throughput: 0: 42051.2. Samples: 1017588740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:43:28,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 05:43:32,425][26599] Updated weights for policy 0, policy_version 289924 (0.0027) [2024-06-19 05:43:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41781.7, 300 sec: 42265.2). Total num frames: 4750147584. Throughput: 0: 42026.8. Samples: 1017719640. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:43:33,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 05:43:35,933][26599] Updated weights for policy 0, policy_version 289934 (0.0039) [2024-06-19 05:43:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4750344192. Throughput: 0: 42238.2. Samples: 1017976400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:43:38,381][26367] Avg episode reward: [(0, '0.821')] [2024-06-19 05:43:38,474][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000289939_4750360576.pth... [2024-06-19 05:43:38,533][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000289320_4740218880.pth [2024-06-19 05:43:39,983][26599] Updated weights for policy 0, policy_version 289944 (0.0050) [2024-06-19 05:43:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4750573568. Throughput: 0: 42176.4. Samples: 1018224400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:43:43,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 05:43:43,851][26599] Updated weights for policy 0, policy_version 289954 (0.0032) [2024-06-19 05:43:47,888][26599] Updated weights for policy 0, policy_version 289964 (0.0033) [2024-06-19 05:43:48,380][26367] Fps is (10 sec: 44237.7, 60 sec: 41779.4, 300 sec: 42265.2). Total num frames: 4750786560. Throughput: 0: 42098.7. Samples: 1018354860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:43:48,380][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 05:43:51,853][26599] Updated weights for policy 0, policy_version 289974 (0.0030) [2024-06-19 05:43:53,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42154.1). Total num frames: 4750966784. Throughput: 0: 42210.8. Samples: 1018605720. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:43:53,380][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 05:43:55,615][26599] Updated weights for policy 0, policy_version 289984 (0.0030) [2024-06-19 05:43:58,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4751212544. Throughput: 0: 42101.8. Samples: 1018856120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:43:58,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 05:43:59,486][26599] Updated weights for policy 0, policy_version 289994 (0.0046) [2024-06-19 05:44:03,293][26599] Updated weights for policy 0, policy_version 290004 (0.0032) [2024-06-19 05:44:03,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4751425536. Throughput: 0: 42308.1. Samples: 1018990620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:44:03,380][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 05:44:07,292][26599] Updated weights for policy 0, policy_version 290014 (0.0049) [2024-06-19 05:44:08,384][26367] Fps is (10 sec: 39306.9, 60 sec: 42051.1, 300 sec: 42209.1). Total num frames: 4751605760. Throughput: 0: 42185.0. Samples: 1019240940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:44:08,385][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 05:44:11,145][26599] Updated weights for policy 0, policy_version 290024 (0.0031) [2024-06-19 05:44:13,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4751851520. Throughput: 0: 42173.3. Samples: 1019486540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:44:13,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 05:44:14,934][26599] Updated weights for policy 0, policy_version 290034 (0.0040) [2024-06-19 05:44:18,380][26367] Fps is (10 sec: 44252.8, 60 sec: 41779.1, 300 sec: 42265.2). Total num frames: 4752048128. Throughput: 0: 42267.0. Samples: 1019621660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:44:18,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 05:44:18,782][26599] Updated weights for policy 0, policy_version 290044 (0.0043) [2024-06-19 05:44:22,587][26599] Updated weights for policy 0, policy_version 290054 (0.0041) [2024-06-19 05:44:23,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4752244736. Throughput: 0: 42122.2. Samples: 1019871900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:44:23,381][26367] Avg episode reward: [(0, '0.830')] [2024-06-19 05:44:26,523][26599] Updated weights for policy 0, policy_version 290064 (0.0033) [2024-06-19 05:44:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4752490496. Throughput: 0: 42165.7. Samples: 1020121860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:44:28,381][26367] Avg episode reward: [(0, '0.808')] [2024-06-19 05:44:30,267][26599] Updated weights for policy 0, policy_version 290074 (0.0027) [2024-06-19 05:44:33,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4752670720. Throughput: 0: 42196.0. Samples: 1020253680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:44:33,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 05:44:34,533][26599] Updated weights for policy 0, policy_version 290084 (0.0042) [2024-06-19 05:44:38,226][26599] Updated weights for policy 0, policy_version 290094 (0.0025) [2024-06-19 05:44:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4752900096. Throughput: 0: 42292.8. Samples: 1020508900. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 05:44:38,384][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 05:44:41,981][26599] Updated weights for policy 0, policy_version 290104 (0.0038) [2024-06-19 05:44:42,678][26579] Signal inference workers to stop experience collection... (15150 times) [2024-06-19 05:44:42,679][26579] Signal inference workers to resume experience collection... (15150 times) [2024-06-19 05:44:42,727][26599] InferenceWorker_p0-w0: stopping experience collection (15150 times) [2024-06-19 05:44:42,727][26599] InferenceWorker_p0-w0: resuming experience collection (15150 times) [2024-06-19 05:44:43,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4753129472. Throughput: 0: 42318.1. Samples: 1020760440. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:44:43,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 05:44:46,050][26599] Updated weights for policy 0, policy_version 290114 (0.0036) [2024-06-19 05:44:48,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.2, 300 sec: 42209.7). Total num frames: 4753309696. Throughput: 0: 42295.5. Samples: 1020893920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:44:48,380][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 05:44:49,614][26599] Updated weights for policy 0, policy_version 290124 (0.0028) [2024-06-19 05:44:53,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4753522688. Throughput: 0: 42355.9. Samples: 1021146800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:44:53,381][26367] Avg episode reward: [(0, '0.753')] [2024-06-19 05:44:54,024][26599] Updated weights for policy 0, policy_version 290134 (0.0035) [2024-06-19 05:44:57,382][26599] Updated weights for policy 0, policy_version 290144 (0.0024) [2024-06-19 05:44:58,380][26367] Fps is (10 sec: 45874.2, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4753768448. Throughput: 0: 42463.5. 
Samples: 1021397400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:44:58,384][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 05:45:01,933][26599] Updated weights for policy 0, policy_version 290154 (0.0038) [2024-06-19 05:45:03,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4753965056. Throughput: 0: 42318.4. Samples: 1021525980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:03,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 05:45:05,185][26599] Updated weights for policy 0, policy_version 290164 (0.0022) [2024-06-19 05:45:08,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42874.1, 300 sec: 42376.2). Total num frames: 4754178048. Throughput: 0: 42301.9. Samples: 1021775480. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:08,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 05:45:09,576][26599] Updated weights for policy 0, policy_version 290174 (0.0042) [2024-06-19 05:45:12,907][26599] Updated weights for policy 0, policy_version 290184 (0.0037) [2024-06-19 05:45:13,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4754374656. Throughput: 0: 42357.0. Samples: 1022027920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:13,381][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 05:45:17,278][26599] Updated weights for policy 0, policy_version 290194 (0.0044) [2024-06-19 05:45:18,380][26367] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 42209.7). Total num frames: 4754554880. Throughput: 0: 42303.6. Samples: 1022157340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:18,380][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 05:45:20,527][26599] Updated weights for policy 0, policy_version 290204 (0.0036) [2024-06-19 05:45:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4754800640. Throughput: 0: 42273.7. 
Samples: 1022411220. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:23,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 05:45:25,188][26599] Updated weights for policy 0, policy_version 290214 (0.0052) [2024-06-19 05:45:28,007][26599] Updated weights for policy 0, policy_version 290224 (0.0039) [2024-06-19 05:45:28,380][26367] Fps is (10 sec: 47512.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4755030016. Throughput: 0: 42221.8. Samples: 1022660420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:28,381][26367] Avg episode reward: [(0, '0.841')] [2024-06-19 05:45:32,982][26599] Updated weights for policy 0, policy_version 290234 (0.0036) [2024-06-19 05:45:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4755193856. Throughput: 0: 42175.9. Samples: 1022791840. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:33,382][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 05:45:35,758][26599] Updated weights for policy 0, policy_version 290244 (0.0033) [2024-06-19 05:45:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4755439616. Throughput: 0: 42191.4. Samples: 1023045420. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:38,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 05:45:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000290249_4755439616.pth... [2024-06-19 05:45:38,473][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000289630_4745297920.pth [2024-06-19 05:45:40,783][26599] Updated weights for policy 0, policy_version 290254 (0.0027) [2024-06-19 05:45:43,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4755652608. Throughput: 0: 42099.7. Samples: 1023291880. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:43,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 05:45:43,712][26599] Updated weights for policy 0, policy_version 290264 (0.0038) [2024-06-19 05:45:48,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.1, 300 sec: 42209.9). Total num frames: 4755832832. Throughput: 0: 42084.6. Samples: 1023419800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:48,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 05:45:48,458][26599] Updated weights for policy 0, policy_version 290274 (0.0032) [2024-06-19 05:45:51,583][26599] Updated weights for policy 0, policy_version 290284 (0.0027) [2024-06-19 05:45:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4756062208. Throughput: 0: 42134.2. Samples: 1023671520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 19.0) [2024-06-19 05:45:53,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 05:45:55,969][26599] Updated weights for policy 0, policy_version 290294 (0.0033) [2024-06-19 05:45:58,380][26367] Fps is (10 sec: 42599.4, 60 sec: 41506.3, 300 sec: 42209.7). Total num frames: 4756258816. Throughput: 0: 42192.6. Samples: 1023926580. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:45:58,380][26367] Avg episode reward: [(0, '0.849')] [2024-06-19 05:45:59,502][26599] Updated weights for policy 0, policy_version 290304 (0.0034) [2024-06-19 05:46:03,384][26367] Fps is (10 sec: 40944.8, 60 sec: 41776.5, 300 sec: 42209.1). Total num frames: 4756471808. Throughput: 0: 42060.9. Samples: 1024050240. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:03,385][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 05:46:03,736][26599] Updated weights for policy 0, policy_version 290314 (0.0034) [2024-06-19 05:46:07,320][26599] Updated weights for policy 0, policy_version 290324 (0.0043) [2024-06-19 05:46:08,384][26367] Fps is (10 sec: 44220.1, 60 sec: 42049.7, 300 sec: 42264.7). Total num frames: 4756701184. Throughput: 0: 42046.5. Samples: 1024303460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:08,384][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 05:46:11,925][26599] Updated weights for policy 0, policy_version 290334 (0.0055) [2024-06-19 05:46:13,380][26367] Fps is (10 sec: 42614.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4756897792. Throughput: 0: 42157.9. Samples: 1024557520. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:13,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 05:46:15,159][26599] Updated weights for policy 0, policy_version 290344 (0.0040) [2024-06-19 05:46:18,380][26367] Fps is (10 sec: 40974.5, 60 sec: 42598.3, 300 sec: 42154.1). Total num frames: 4757110784. Throughput: 0: 41975.5. Samples: 1024680740. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:18,381][26367] Avg episode reward: [(0, '0.349')] [2024-06-19 05:46:19,634][26599] Updated weights for policy 0, policy_version 290354 (0.0035) [2024-06-19 05:46:23,267][26599] Updated weights for policy 0, policy_version 290364 (0.0033) [2024-06-19 05:46:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42265.1). Total num frames: 4757323776. Throughput: 0: 41957.4. Samples: 1024933500. 
Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:23,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 05:46:27,334][26599] Updated weights for policy 0, policy_version 290374 (0.0033) [2024-06-19 05:46:28,380][26367] Fps is (10 sec: 42598.7, 60 sec: 41779.3, 300 sec: 42265.7). Total num frames: 4757536768. Throughput: 0: 42091.5. Samples: 1025186000. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:28,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 05:46:31,158][26599] Updated weights for policy 0, policy_version 290384 (0.0033) [2024-06-19 05:46:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4757749760. Throughput: 0: 42153.4. Samples: 1025316700. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:33,381][26367] Avg episode reward: [(0, '0.758')] [2024-06-19 05:46:34,939][26599] Updated weights for policy 0, policy_version 290394 (0.0037) [2024-06-19 05:46:38,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 4757929984. Throughput: 0: 42112.9. Samples: 1025566600. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:38,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 05:46:38,430][26579] Signal inference workers to stop experience collection... (15200 times) [2024-06-19 05:46:38,487][26579] Signal inference workers to resume experience collection... (15200 times) [2024-06-19 05:46:38,488][26599] InferenceWorker_p0-w0: stopping experience collection (15200 times) [2024-06-19 05:46:38,498][26599] InferenceWorker_p0-w0: resuming experience collection (15200 times) [2024-06-19 05:46:39,037][26599] Updated weights for policy 0, policy_version 290404 (0.0038) [2024-06-19 05:46:42,736][26599] Updated weights for policy 0, policy_version 290414 (0.0042) [2024-06-19 05:46:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4758159360. 
Throughput: 0: 42054.1. Samples: 1025819020. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:43,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 05:46:46,849][26599] Updated weights for policy 0, policy_version 290424 (0.0031) [2024-06-19 05:46:48,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4758372352. Throughput: 0: 42252.8. Samples: 1025951460. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:48,381][26367] Avg episode reward: [(0, '0.359')] [2024-06-19 05:46:50,515][26599] Updated weights for policy 0, policy_version 290434 (0.0027) [2024-06-19 05:46:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 41779.1, 300 sec: 42265.1). Total num frames: 4758568960. Throughput: 0: 42133.9. Samples: 1026199340. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:53,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 05:46:54,703][26599] Updated weights for policy 0, policy_version 290444 (0.0033) [2024-06-19 05:46:58,084][26599] Updated weights for policy 0, policy_version 290454 (0.0042) [2024-06-19 05:46:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42154.6). Total num frames: 4758798336. Throughput: 0: 42098.7. Samples: 1026451960. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:46:58,381][26367] Avg episode reward: [(0, '0.390')] [2024-06-19 05:47:02,653][26599] Updated weights for policy 0, policy_version 290464 (0.0040) [2024-06-19 05:47:03,380][26367] Fps is (10 sec: 40960.9, 60 sec: 41781.8, 300 sec: 42098.6). Total num frames: 4758978560. Throughput: 0: 42281.5. Samples: 1026583400. Policy #0 lag: (min: 1.0, avg: 11.0, max: 23.0) [2024-06-19 05:47:03,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 05:47:05,642][26599] Updated weights for policy 0, policy_version 290474 (0.0032) [2024-06-19 05:47:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42054.8, 300 sec: 42321.2). Total num frames: 4759224320. 
Throughput: 0: 42174.3. Samples: 1026831340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:08,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 05:47:10,643][26599] Updated weights for policy 0, policy_version 290484 (0.0027) [2024-06-19 05:47:13,259][26599] Updated weights for policy 0, policy_version 290494 (0.0031) [2024-06-19 05:47:13,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42598.5, 300 sec: 42209.7). Total num frames: 4759453696. Throughput: 0: 42245.4. Samples: 1027087040. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:13,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 05:47:18,285][26599] Updated weights for policy 0, policy_version 290504 (0.0038) [2024-06-19 05:47:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 42098.5). Total num frames: 4759617536. Throughput: 0: 42193.3. Samples: 1027215400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:18,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 05:47:20,821][26599] Updated weights for policy 0, policy_version 290514 (0.0037) [2024-06-19 05:47:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4759863296. Throughput: 0: 42407.6. Samples: 1027474940. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:23,380][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 05:47:25,845][26599] Updated weights for policy 0, policy_version 290524 (0.0029) [2024-06-19 05:47:28,323][26599] Updated weights for policy 0, policy_version 290534 (0.0041) [2024-06-19 05:47:28,380][26367] Fps is (10 sec: 49152.3, 60 sec: 42871.5, 300 sec: 42265.7). Total num frames: 4760109056. Throughput: 0: 42445.4. Samples: 1027729060. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:28,384][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 05:47:33,380][26367] Fps is (10 sec: 39320.8, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 4760256512. 
Throughput: 0: 42342.1. Samples: 1027856860. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:33,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 05:47:33,543][26599] Updated weights for policy 0, policy_version 290544 (0.0028) [2024-06-19 05:47:35,905][26599] Updated weights for policy 0, policy_version 290554 (0.0029) [2024-06-19 05:47:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 43144.6, 300 sec: 42320.7). Total num frames: 4760518656. Throughput: 0: 42446.0. Samples: 1028109400. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:38,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 05:47:38,437][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000290560_4760535040.pth... [2024-06-19 05:47:38,488][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000289939_4750360576.pth [2024-06-19 05:47:41,266][26599] Updated weights for policy 0, policy_version 290564 (0.0032) [2024-06-19 05:47:43,380][26367] Fps is (10 sec: 47514.7, 60 sec: 42871.6, 300 sec: 42209.7). Total num frames: 4760731648. Throughput: 0: 42594.7. Samples: 1028368720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:43,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 05:47:43,763][26599] Updated weights for policy 0, policy_version 290574 (0.0033) [2024-06-19 05:47:48,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42052.3, 300 sec: 42209.7). Total num frames: 4760895488. Throughput: 0: 42477.3. Samples: 1028494880. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:48,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 05:47:48,837][26599] Updated weights for policy 0, policy_version 290584 (0.0035) [2024-06-19 05:47:51,284][26579] Signal inference workers to stop experience collection... 
(15250 times) [2024-06-19 05:47:51,302][26599] InferenceWorker_p0-w0: stopping experience collection (15250 times) [2024-06-19 05:47:51,344][26579] Signal inference workers to resume experience collection... (15250 times) [2024-06-19 05:47:51,344][26599] InferenceWorker_p0-w0: resuming experience collection (15250 times) [2024-06-19 05:47:51,497][26599] Updated weights for policy 0, policy_version 290594 (0.0027) [2024-06-19 05:47:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 4761141248. Throughput: 0: 42586.3. Samples: 1028747720. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:53,380][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 05:47:56,428][26599] Updated weights for policy 0, policy_version 290604 (0.0036) [2024-06-19 05:47:58,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4761354240. Throughput: 0: 42784.7. Samples: 1029012360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:47:58,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 05:47:59,225][26599] Updated weights for policy 0, policy_version 290614 (0.0035) [2024-06-19 05:48:03,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42265.4). Total num frames: 4761550848. Throughput: 0: 42708.8. Samples: 1029137300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:48:03,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 05:48:04,136][26599] Updated weights for policy 0, policy_version 290624 (0.0046) [2024-06-19 05:48:06,920][26599] Updated weights for policy 0, policy_version 290634 (0.0030) [2024-06-19 05:48:08,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4761780224. Throughput: 0: 42381.7. Samples: 1029382120. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:48:08,381][26367] Avg episode reward: [(0, '0.388')] [2024-06-19 05:48:12,253][26599] Updated weights for policy 0, policy_version 290644 (0.0029) [2024-06-19 05:48:13,384][26367] Fps is (10 sec: 44221.5, 60 sec: 42322.7, 300 sec: 42209.1). Total num frames: 4761993216. Throughput: 0: 42553.9. Samples: 1029644140. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:48:13,384][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 05:48:14,716][26599] Updated weights for policy 0, policy_version 290654 (0.0038) [2024-06-19 05:48:18,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42868.9, 300 sec: 42264.7). Total num frames: 4762189824. Throughput: 0: 42484.7. Samples: 1029768820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 24.0) [2024-06-19 05:48:18,384][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 05:48:19,715][26599] Updated weights for policy 0, policy_version 290664 (0.0042) [2024-06-19 05:48:22,278][26599] Updated weights for policy 0, policy_version 290674 (0.0033) [2024-06-19 05:48:23,380][26367] Fps is (10 sec: 42613.4, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 4762419200. Throughput: 0: 42467.9. Samples: 1030020460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:48:23,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 05:48:27,090][26599] Updated weights for policy 0, policy_version 290684 (0.0033) [2024-06-19 05:48:28,380][26367] Fps is (10 sec: 42614.0, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 4762615808. Throughput: 0: 42587.5. Samples: 1030285160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:48:28,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 05:48:30,145][26599] Updated weights for policy 0, policy_version 290694 (0.0042) [2024-06-19 05:48:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 4762828800. Throughput: 0: 42422.2. Samples: 1030403880. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:48:33,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 05:48:35,050][26599] Updated weights for policy 0, policy_version 290704 (0.0042) [2024-06-19 05:48:38,283][26599] Updated weights for policy 0, policy_version 290714 (0.0033) [2024-06-19 05:48:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4763058176. Throughput: 0: 42459.9. Samples: 1030658420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:48:38,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 05:48:42,644][26599] Updated weights for policy 0, policy_version 290724 (0.0034) [2024-06-19 05:48:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42265.1). Total num frames: 4763254784. Throughput: 0: 42370.3. Samples: 1030919020. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:48:43,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 05:48:45,977][26599] Updated weights for policy 0, policy_version 290734 (0.0036) [2024-06-19 05:48:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42431.8). Total num frames: 4763484160. Throughput: 0: 42303.6. Samples: 1031040960. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:48:48,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 05:48:50,243][26599] Updated weights for policy 0, policy_version 290744 (0.0043) [2024-06-19 05:48:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4763697152. Throughput: 0: 42564.0. Samples: 1031297500. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:48:53,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 05:48:53,633][26599] Updated weights for policy 0, policy_version 290754 (0.0023) [2024-06-19 05:48:57,929][26599] Updated weights for policy 0, policy_version 290764 (0.0040) [2024-06-19 05:48:58,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 4763877376. Throughput: 0: 42468.8. Samples: 1031555080. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:48:58,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 05:49:01,274][26599] Updated weights for policy 0, policy_version 290774 (0.0029) [2024-06-19 05:49:03,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42869.0, 300 sec: 42431.8). Total num frames: 4764123136. Throughput: 0: 42468.0. Samples: 1031679880. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:49:03,384][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 05:49:05,631][26599] Updated weights for policy 0, policy_version 290784 (0.0040) [2024-06-19 05:49:08,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 4764352512. Throughput: 0: 42673.0. Samples: 1031940740. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:49:08,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 05:49:08,943][26599] Updated weights for policy 0, policy_version 290794 (0.0023) [2024-06-19 05:49:13,382][26367] Fps is (10 sec: 37690.9, 60 sec: 41780.6, 300 sec: 42209.4). Total num frames: 4764499968. Throughput: 0: 42459.4. Samples: 1032195900. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:49:13,382][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 05:49:13,747][26599] Updated weights for policy 0, policy_version 290804 (0.0040) [2024-06-19 05:49:16,550][26599] Updated weights for policy 0, policy_version 290814 (0.0029) [2024-06-19 05:49:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42601.0, 300 sec: 42376.3). Total num frames: 4764745728. Throughput: 0: 42504.5. Samples: 1032316580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:49:18,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 05:49:21,334][26599] Updated weights for policy 0, policy_version 290824 (0.0027) [2024-06-19 05:49:21,790][26579] Signal inference workers to stop experience collection... (15300 times) [2024-06-19 05:49:21,840][26599] InferenceWorker_p0-w0: stopping experience collection (15300 times) [2024-06-19 05:49:21,908][26579] Signal inference workers to resume experience collection... (15300 times) [2024-06-19 05:49:21,908][26599] InferenceWorker_p0-w0: resuming experience collection (15300 times) [2024-06-19 05:49:23,380][26367] Fps is (10 sec: 45882.7, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4764958720. Throughput: 0: 42472.5. Samples: 1032569680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:49:23,380][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 05:49:24,318][26599] Updated weights for policy 0, policy_version 290834 (0.0024) [2024-06-19 05:49:28,383][26367] Fps is (10 sec: 40946.8, 60 sec: 42323.0, 300 sec: 42320.2). Total num frames: 4765155328. Throughput: 0: 42390.8. Samples: 1032826740. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 21.0) [2024-06-19 05:49:28,384][26367] Avg episode reward: [(0, '0.433')] [2024-06-19 05:49:29,098][26599] Updated weights for policy 0, policy_version 290844 (0.0033) [2024-06-19 05:49:32,003][26599] Updated weights for policy 0, policy_version 290854 (0.0037) [2024-06-19 05:49:33,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4765368320. Throughput: 0: 42466.8. Samples: 1032951960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:49:33,380][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 05:49:36,518][26599] Updated weights for policy 0, policy_version 290864 (0.0033) [2024-06-19 05:49:38,380][26367] Fps is (10 sec: 44251.3, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4765597696. Throughput: 0: 42629.8. Samples: 1033215840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:49:38,380][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 05:49:38,412][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000290870_4765614080.pth... [2024-06-19 05:49:38,464][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000290249_4755439616.pth [2024-06-19 05:49:39,734][26599] Updated weights for policy 0, policy_version 290874 (0.0033) [2024-06-19 05:49:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4765794304. Throughput: 0: 42358.5. Samples: 1033461220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:49:43,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 05:49:44,146][26599] Updated weights for policy 0, policy_version 290884 (0.0027) [2024-06-19 05:49:47,705][26599] Updated weights for policy 0, policy_version 290894 (0.0039) [2024-06-19 05:49:48,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4766023680. Throughput: 0: 42389.1. Samples: 1033587240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:49:48,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 05:49:51,715][26599] Updated weights for policy 0, policy_version 290904 (0.0035) [2024-06-19 05:49:53,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4766253056. Throughput: 0: 42327.5. Samples: 1033845480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:49:53,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 05:49:55,598][26599] Updated weights for policy 0, policy_version 290914 (0.0045) [2024-06-19 05:49:58,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 4766433280. Throughput: 0: 42265.0. Samples: 1034097760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:49:58,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 05:49:59,638][26599] Updated weights for policy 0, policy_version 290924 (0.0026) [2024-06-19 05:50:03,327][26599] Updated weights for policy 0, policy_version 290934 (0.0024) [2024-06-19 05:50:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42327.9, 300 sec: 42320.7). Total num frames: 4766662656. Throughput: 0: 42385.3. Samples: 1034223920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:50:03,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 05:50:07,110][26599] Updated weights for policy 0, policy_version 290944 (0.0037) [2024-06-19 05:50:08,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4766875648. Throughput: 0: 42541.3. Samples: 1034484040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:50:08,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 05:50:11,125][26599] Updated weights for policy 0, policy_version 290954 (0.0025) [2024-06-19 05:50:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43145.7, 300 sec: 42487.3). Total num frames: 4767088640. Throughput: 0: 42482.1. Samples: 1034738300. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:50:13,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 05:50:14,886][26599] Updated weights for policy 0, policy_version 290964 (0.0034) [2024-06-19 05:50:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4767285248. Throughput: 0: 42473.7. Samples: 1034863280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:50:18,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 05:50:18,850][26599] Updated weights for policy 0, policy_version 290974 (0.0029) [2024-06-19 05:50:22,590][26599] Updated weights for policy 0, policy_version 290984 (0.0036) [2024-06-19 05:50:23,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42376.3). Total num frames: 4767531008. Throughput: 0: 42336.8. Samples: 1035121000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:50:23,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 05:50:26,640][26599] Updated weights for policy 0, policy_version 290994 (0.0049) [2024-06-19 05:50:28,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42600.6, 300 sec: 42431.8). Total num frames: 4767711232. Throughput: 0: 42445.7. Samples: 1035371280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:50:28,381][26367] Avg episode reward: [(0, '0.320')] [2024-06-19 05:50:30,129][26599] Updated weights for policy 0, policy_version 291004 (0.0044) [2024-06-19 05:50:33,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4767924224. Throughput: 0: 42417.0. Samples: 1035496000. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:50:33,381][26367] Avg episode reward: [(0, '0.325')] [2024-06-19 05:50:34,327][26599] Updated weights for policy 0, policy_version 291014 (0.0028) [2024-06-19 05:50:37,837][26599] Updated weights for policy 0, policy_version 291024 (0.0044) [2024-06-19 05:50:38,340][26579] Signal inference workers to stop experience collection... (15350 times) [2024-06-19 05:50:38,367][26599] InferenceWorker_p0-w0: stopping experience collection (15350 times) [2024-06-19 05:50:38,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4768153600. Throughput: 0: 42466.3. Samples: 1035756460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:50:38,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 05:50:38,394][26579] Signal inference workers to resume experience collection... (15350 times) [2024-06-19 05:50:38,394][26599] InferenceWorker_p0-w0: resuming experience collection (15350 times) [2024-06-19 05:50:42,203][26599] Updated weights for policy 0, policy_version 291034 (0.0034) [2024-06-19 05:50:43,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4768350208. Throughput: 0: 42470.1. Samples: 1036008920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 05:50:43,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 05:50:45,615][26599] Updated weights for policy 0, policy_version 291044 (0.0031) [2024-06-19 05:50:48,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4768563200. Throughput: 0: 42373.7. Samples: 1036130740. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:50:48,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 05:50:49,891][26599] Updated weights for policy 0, policy_version 291054 (0.0026) [2024-06-19 05:50:53,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 4768759808. 
Throughput: 0: 42365.8. Samples: 1036390500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:50:53,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 05:50:53,562][26599] Updated weights for policy 0, policy_version 291064 (0.0030) [2024-06-19 05:50:57,790][26599] Updated weights for policy 0, policy_version 291074 (0.0040) [2024-06-19 05:50:58,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42432.3). Total num frames: 4768989184. Throughput: 0: 42306.7. Samples: 1036642100. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:50:58,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 05:51:01,267][26599] Updated weights for policy 0, policy_version 291084 (0.0030) [2024-06-19 05:51:03,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42322.8, 300 sec: 42376.2). Total num frames: 4769202176. Throughput: 0: 42421.0. Samples: 1036772380. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:03,384][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 05:51:05,411][26599] Updated weights for policy 0, policy_version 291094 (0.0029) [2024-06-19 05:51:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 4769398784. Throughput: 0: 42247.2. Samples: 1037022120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:08,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 05:51:08,949][26599] Updated weights for policy 0, policy_version 291104 (0.0041) [2024-06-19 05:51:13,054][26599] Updated weights for policy 0, policy_version 291114 (0.0031) [2024-06-19 05:51:13,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42052.2, 300 sec: 42376.3). Total num frames: 4769611776. Throughput: 0: 42328.5. Samples: 1037276060. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:13,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 05:51:16,541][26599] Updated weights for policy 0, policy_version 291124 (0.0031) [2024-06-19 05:51:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4769824768. Throughput: 0: 42422.6. Samples: 1037405020. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:18,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 05:51:20,640][26599] Updated weights for policy 0, policy_version 291134 (0.0035) [2024-06-19 05:51:23,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 4770037760. Throughput: 0: 42318.7. Samples: 1037660800. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:23,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 05:51:24,603][26599] Updated weights for policy 0, policy_version 291144 (0.0033) [2024-06-19 05:51:28,296][26599] Updated weights for policy 0, policy_version 291154 (0.0034) [2024-06-19 05:51:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4770267136. Throughput: 0: 42338.3. Samples: 1037914140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:28,381][26367] Avg episode reward: [(0, '0.838')] [2024-06-19 05:51:32,158][26599] Updated weights for policy 0, policy_version 291164 (0.0039) [2024-06-19 05:51:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4770463744. Throughput: 0: 42483.6. Samples: 1038042500. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:33,381][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 05:51:35,977][26599] Updated weights for policy 0, policy_version 291174 (0.0041) [2024-06-19 05:51:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 42431.8). Total num frames: 4770676736. Throughput: 0: 42359.9. Samples: 1038296700. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:38,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 05:51:38,505][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000291180_4770693120.pth... [2024-06-19 05:51:38,547][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000290560_4760535040.pth [2024-06-19 05:51:39,830][26599] Updated weights for policy 0, policy_version 291184 (0.0040) [2024-06-19 05:51:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4770889728. Throughput: 0: 42459.5. Samples: 1038552780. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:43,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 05:51:43,629][26599] Updated weights for policy 0, policy_version 291194 (0.0038) [2024-06-19 05:51:47,378][26599] Updated weights for policy 0, policy_version 291204 (0.0033) [2024-06-19 05:51:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4771102720. Throughput: 0: 42317.5. Samples: 1038676520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:48,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 05:51:51,177][26599] Updated weights for policy 0, policy_version 291214 (0.0037) [2024-06-19 05:51:53,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4771299328. Throughput: 0: 42467.0. Samples: 1038933140. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:53,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 05:51:54,905][26599] Updated weights for policy 0, policy_version 291224 (0.0040) [2024-06-19 05:51:58,159][26579] Signal inference workers to stop experience collection... (15400 times) [2024-06-19 05:51:58,179][26599] InferenceWorker_p0-w0: stopping experience collection (15400 times) [2024-06-19 05:51:58,218][26579] Signal inference workers to resume experience collection... 
(15400 times) [2024-06-19 05:51:58,219][26599] InferenceWorker_p0-w0: resuming experience collection (15400 times) [2024-06-19 05:51:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4771545088. Throughput: 0: 42508.9. Samples: 1039188960. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 05:51:58,384][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 05:51:59,216][26599] Updated weights for policy 0, policy_version 291234 (0.0024) [2024-06-19 05:52:02,682][26599] Updated weights for policy 0, policy_version 291244 (0.0022) [2024-06-19 05:52:03,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42601.0, 300 sec: 42487.3). Total num frames: 4771758080. Throughput: 0: 42538.7. Samples: 1039319260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:03,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 05:52:07,033][26599] Updated weights for policy 0, policy_version 291254 (0.0028) [2024-06-19 05:52:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4771954688. Throughput: 0: 42418.6. Samples: 1039569640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:08,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 05:52:10,449][26599] Updated weights for policy 0, policy_version 291264 (0.0040) [2024-06-19 05:52:13,384][26367] Fps is (10 sec: 40945.0, 60 sec: 42595.9, 300 sec: 42542.3). Total num frames: 4772167680. Throughput: 0: 42487.8. Samples: 1039826240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:13,384][26367] Avg episode reward: [(0, '0.732')] [2024-06-19 05:52:14,611][26599] Updated weights for policy 0, policy_version 291274 (0.0039) [2024-06-19 05:52:18,024][26599] Updated weights for policy 0, policy_version 291284 (0.0037) [2024-06-19 05:52:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 4772397056. Throughput: 0: 42411.0. Samples: 1039951000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:18,381][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 05:52:22,490][26599] Updated weights for policy 0, policy_version 291294 (0.0036) [2024-06-19 05:52:23,383][26367] Fps is (10 sec: 40962.6, 60 sec: 42323.2, 300 sec: 42264.7). Total num frames: 4772577280. Throughput: 0: 42277.3. Samples: 1040199300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:23,384][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 05:52:26,095][26599] Updated weights for policy 0, policy_version 291304 (0.0037) [2024-06-19 05:52:28,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4772790272. Throughput: 0: 42241.7. Samples: 1040453660. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:28,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 05:52:30,126][26599] Updated weights for policy 0, policy_version 291314 (0.0035) [2024-06-19 05:52:33,380][26367] Fps is (10 sec: 42610.7, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4773003264. Throughput: 0: 42289.8. Samples: 1040579560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:33,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 05:52:33,823][26599] Updated weights for policy 0, policy_version 291324 (0.0045) [2024-06-19 05:52:37,931][26599] Updated weights for policy 0, policy_version 291334 (0.0035) [2024-06-19 05:52:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4773216256. Throughput: 0: 42338.3. Samples: 1040838360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:38,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 05:52:41,562][26599] Updated weights for policy 0, policy_version 291344 (0.0034) [2024-06-19 05:52:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 4773412864. Throughput: 0: 42257.8. Samples: 1041090560. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:43,383][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 05:52:45,462][26599] Updated weights for policy 0, policy_version 291354 (0.0032) [2024-06-19 05:52:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4773642240. Throughput: 0: 42179.5. Samples: 1041217340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:48,385][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 05:52:49,202][26599] Updated weights for policy 0, policy_version 291364 (0.0045) [2024-06-19 05:52:53,173][26599] Updated weights for policy 0, policy_version 291374 (0.0037) [2024-06-19 05:52:53,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4773871616. Throughput: 0: 42280.5. Samples: 1041472260. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:53,384][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 05:52:57,175][26599] Updated weights for policy 0, policy_version 291384 (0.0035) [2024-06-19 05:52:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4774068224. Throughput: 0: 42223.0. Samples: 1041726120. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:52:58,380][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 05:53:00,863][26599] Updated weights for policy 0, policy_version 291394 (0.0038) [2024-06-19 05:53:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4774281216. Throughput: 0: 42154.3. Samples: 1041847940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:53:03,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 05:53:05,012][26599] Updated weights for policy 0, policy_version 291404 (0.0039) [2024-06-19 05:53:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42376.8). Total num frames: 4774494208. Throughput: 0: 42362.9. Samples: 1042105500. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 05:53:08,380][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 05:53:08,714][26599] Updated weights for policy 0, policy_version 291414 (0.0047) [2024-06-19 05:53:12,563][26599] Updated weights for policy 0, policy_version 291424 (0.0025) [2024-06-19 05:53:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42054.8, 300 sec: 42376.8). Total num frames: 4774690816. Throughput: 0: 42319.7. Samples: 1042358040. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:13,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 05:53:17,026][26599] Updated weights for policy 0, policy_version 291434 (0.0032) [2024-06-19 05:53:18,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4774920192. Throughput: 0: 42278.7. Samples: 1042482100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:18,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 05:53:20,654][26599] Updated weights for policy 0, policy_version 291444 (0.0039) [2024-06-19 05:53:23,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42600.6, 300 sec: 42431.8). Total num frames: 4775133184. Throughput: 0: 42136.1. Samples: 1042734480. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:23,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 05:53:24,747][26599] Updated weights for policy 0, policy_version 291454 (0.0038) [2024-06-19 05:53:25,551][26579] Signal inference workers to stop experience collection... (15450 times) [2024-06-19 05:53:25,554][26579] Signal inference workers to resume experience collection... (15450 times) [2024-06-19 05:53:25,574][26599] InferenceWorker_p0-w0: stopping experience collection (15450 times) [2024-06-19 05:53:25,574][26599] InferenceWorker_p0-w0: resuming experience collection (15450 times) [2024-06-19 05:53:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4775329792. 
Throughput: 0: 42165.7. Samples: 1042988020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:28,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 05:53:28,655][26599] Updated weights for policy 0, policy_version 291464 (0.0039) [2024-06-19 05:53:32,625][26599] Updated weights for policy 0, policy_version 291474 (0.0037) [2024-06-19 05:53:33,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4775542784. Throughput: 0: 42201.7. Samples: 1043116420. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:33,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 05:53:36,180][26599] Updated weights for policy 0, policy_version 291484 (0.0047) [2024-06-19 05:53:38,380][26367] Fps is (10 sec: 42599.5, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4775755776. Throughput: 0: 42142.3. Samples: 1043368660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:38,380][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 05:53:38,484][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000291490_4775772160.pth... [2024-06-19 05:53:38,536][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000290870_4765614080.pth [2024-06-19 05:53:40,411][26599] Updated weights for policy 0, policy_version 291494 (0.0042) [2024-06-19 05:53:43,384][26367] Fps is (10 sec: 42583.5, 60 sec: 42595.9, 300 sec: 42320.2). Total num frames: 4775968768. Throughput: 0: 41915.2. Samples: 1043612460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:43,384][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 05:53:44,279][26599] Updated weights for policy 0, policy_version 291504 (0.0051) [2024-06-19 05:53:48,191][26599] Updated weights for policy 0, policy_version 291514 (0.0032) [2024-06-19 05:53:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4776165376. Throughput: 0: 42031.1. 
Samples: 1043739340. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:48,383][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 05:53:51,762][26599] Updated weights for policy 0, policy_version 291524 (0.0025) [2024-06-19 05:53:53,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4776411136. Throughput: 0: 42139.9. Samples: 1044001800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:53,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 05:53:55,752][26599] Updated weights for policy 0, policy_version 291534 (0.0043) [2024-06-19 05:53:58,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42321.2). Total num frames: 4776607744. Throughput: 0: 42083.5. Samples: 1044251800. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:53:58,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 05:53:59,413][26599] Updated weights for policy 0, policy_version 291544 (0.0049) [2024-06-19 05:54:03,384][26367] Fps is (10 sec: 39307.3, 60 sec: 42049.8, 300 sec: 42209.1). Total num frames: 4776804352. Throughput: 0: 42049.1. Samples: 1044374460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:54:03,384][26367] Avg episode reward: [(0, '0.288')] [2024-06-19 05:54:03,416][26599] Updated weights for policy 0, policy_version 291554 (0.0053) [2024-06-19 05:54:07,245][26599] Updated weights for policy 0, policy_version 291564 (0.0040) [2024-06-19 05:54:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.5). Total num frames: 4777033728. Throughput: 0: 42091.1. Samples: 1044628580. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:54:08,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 05:54:11,059][26599] Updated weights for policy 0, policy_version 291574 (0.0036) [2024-06-19 05:54:13,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4777213952. Throughput: 0: 42277.9. 
Samples: 1044890520. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:54:13,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 05:54:14,826][26599] Updated weights for policy 0, policy_version 291584 (0.0031) [2024-06-19 05:54:18,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4777443328. Throughput: 0: 42156.4. Samples: 1045013460. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:54:18,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 05:54:18,666][26599] Updated weights for policy 0, policy_version 291594 (0.0038) [2024-06-19 05:54:22,341][26599] Updated weights for policy 0, policy_version 291604 (0.0036) [2024-06-19 05:54:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.2, 300 sec: 42376.7). Total num frames: 4777656320. Throughput: 0: 42219.8. Samples: 1045268560. Policy #0 lag: (min: 1.0, avg: 10.7, max: 21.0) [2024-06-19 05:54:23,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 05:54:26,362][26599] Updated weights for policy 0, policy_version 291614 (0.0037) [2024-06-19 05:54:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4777852928. Throughput: 0: 42547.4. Samples: 1045526940. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:54:28,381][26367] Avg episode reward: [(0, '0.310')] [2024-06-19 05:54:30,036][26599] Updated weights for policy 0, policy_version 291624 (0.0044) [2024-06-19 05:54:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4778082304. Throughput: 0: 42513.7. Samples: 1045652460. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:54:33,381][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 05:54:34,063][26599] Updated weights for policy 0, policy_version 291634 (0.0044) [2024-06-19 05:54:38,141][26599] Updated weights for policy 0, policy_version 291644 (0.0034) [2024-06-19 05:54:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 4778295296. Throughput: 0: 42232.8. Samples: 1045902280. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:54:38,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 05:54:42,106][26599] Updated weights for policy 0, policy_version 291654 (0.0035) [2024-06-19 05:54:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42054.8, 300 sec: 42265.2). Total num frames: 4778491904. Throughput: 0: 42431.6. Samples: 1046161220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:54:43,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 05:54:45,908][26599] Updated weights for policy 0, policy_version 291664 (0.0041) [2024-06-19 05:54:48,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4778704896. Throughput: 0: 42468.4. Samples: 1046285380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:54:48,380][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 05:54:49,653][26599] Updated weights for policy 0, policy_version 291674 (0.0041) [2024-06-19 05:54:53,384][26367] Fps is (10 sec: 42582.8, 60 sec: 41776.6, 300 sec: 42320.2). Total num frames: 4778917888. Throughput: 0: 42494.3. Samples: 1046540980. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:54:53,385][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 05:54:53,722][26599] Updated weights for policy 0, policy_version 291684 (0.0045) [2024-06-19 05:54:57,155][26579] Signal inference workers to stop experience collection... 
(15500 times) [2024-06-19 05:54:57,155][26579] Signal inference workers to resume experience collection... (15500 times) [2024-06-19 05:54:57,192][26599] InferenceWorker_p0-w0: stopping experience collection (15500 times) [2024-06-19 05:54:57,193][26599] InferenceWorker_p0-w0: resuming experience collection (15500 times) [2024-06-19 05:54:57,459][26599] Updated weights for policy 0, policy_version 291694 (0.0039) [2024-06-19 05:54:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4779130880. Throughput: 0: 42266.4. Samples: 1046792500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:54:58,380][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 05:55:01,491][26599] Updated weights for policy 0, policy_version 291704 (0.0042) [2024-06-19 05:55:03,380][26367] Fps is (10 sec: 44253.0, 60 sec: 42600.9, 300 sec: 42320.7). Total num frames: 4779360256. Throughput: 0: 42342.3. Samples: 1046918860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:55:03,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 05:55:05,202][26599] Updated weights for policy 0, policy_version 291714 (0.0042) [2024-06-19 05:55:08,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4779556864. Throughput: 0: 42405.0. Samples: 1047176780. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:55:08,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 05:55:09,133][26599] Updated weights for policy 0, policy_version 291724 (0.0040) [2024-06-19 05:55:13,281][26599] Updated weights for policy 0, policy_version 291734 (0.0034) [2024-06-19 05:55:13,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4779769856. Throughput: 0: 42308.2. Samples: 1047430800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:55:13,380][26367] Avg episode reward: [(0, '0.745')] [2024-06-19 05:55:16,749][26599] Updated weights for policy 0, policy_version 291744 (0.0040) [2024-06-19 05:55:18,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.6, 300 sec: 42265.2). Total num frames: 4779999232. Throughput: 0: 42344.7. Samples: 1047557960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:55:18,380][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 05:55:21,068][26599] Updated weights for policy 0, policy_version 291754 (0.0042) [2024-06-19 05:55:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 4780195840. Throughput: 0: 42345.9. Samples: 1047807840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:55:23,380][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 05:55:24,259][26599] Updated weights for policy 0, policy_version 291764 (0.0043) [2024-06-19 05:55:28,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4780392448. Throughput: 0: 42330.7. Samples: 1048066100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:55:28,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 05:55:28,590][26599] Updated weights for policy 0, policy_version 291774 (0.0034) [2024-06-19 05:55:32,064][26599] Updated weights for policy 0, policy_version 291784 (0.0036) [2024-06-19 05:55:33,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.4, 300 sec: 42265.1). Total num frames: 4780621824. Throughput: 0: 42379.4. Samples: 1048192460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 20.0) [2024-06-19 05:55:33,381][26367] Avg episode reward: [(0, '0.798')] [2024-06-19 05:55:36,215][26599] Updated weights for policy 0, policy_version 291794 (0.0041) [2024-06-19 05:55:38,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 4780851200. Throughput: 0: 42365.7. Samples: 1048447280. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:55:38,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 05:55:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000291800_4780851200.pth... [2024-06-19 05:55:38,441][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000291180_4770693120.pth [2024-06-19 05:55:39,809][26599] Updated weights for policy 0, policy_version 291804 (0.0031) [2024-06-19 05:55:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4781047808. Throughput: 0: 42492.2. Samples: 1048704660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:55:43,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 05:55:44,087][26599] Updated weights for policy 0, policy_version 291814 (0.0026) [2024-06-19 05:55:47,743][26599] Updated weights for policy 0, policy_version 291824 (0.0038) [2024-06-19 05:55:48,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4781260800. Throughput: 0: 42388.0. Samples: 1048826320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:55:48,384][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 05:55:51,966][26599] Updated weights for policy 0, policy_version 291834 (0.0031) [2024-06-19 05:55:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42601.0, 300 sec: 42320.7). Total num frames: 4781473792. Throughput: 0: 42403.1. Samples: 1049084920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:55:53,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 05:55:55,563][26599] Updated weights for policy 0, policy_version 291844 (0.0049) [2024-06-19 05:55:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42265.7). Total num frames: 4781670400. Throughput: 0: 42376.3. Samples: 1049337740. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:55:58,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 05:55:59,722][26599] Updated weights for policy 0, policy_version 291854 (0.0035) [2024-06-19 05:56:03,119][26599] Updated weights for policy 0, policy_version 291864 (0.0035) [2024-06-19 05:56:03,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4781916160. Throughput: 0: 42346.2. Samples: 1049463540. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:03,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 05:56:07,495][26599] Updated weights for policy 0, policy_version 291874 (0.0027) [2024-06-19 05:56:08,384][26367] Fps is (10 sec: 44221.1, 60 sec: 42595.8, 300 sec: 42375.7). Total num frames: 4782112768. Throughput: 0: 42537.8. Samples: 1049722200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:08,384][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 05:56:10,791][26599] Updated weights for policy 0, policy_version 291884 (0.0042) [2024-06-19 05:56:13,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4782309376. Throughput: 0: 42315.1. Samples: 1049970280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:13,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 05:56:15,029][26579] Signal inference workers to stop experience collection... (15550 times) [2024-06-19 05:56:15,029][26579] Signal inference workers to resume experience collection... (15550 times) [2024-06-19 05:56:15,056][26599] InferenceWorker_p0-w0: stopping experience collection (15550 times) [2024-06-19 05:56:15,056][26599] InferenceWorker_p0-w0: resuming experience collection (15550 times) [2024-06-19 05:56:15,168][26599] Updated weights for policy 0, policy_version 291894 (0.0032) [2024-06-19 05:56:18,380][26367] Fps is (10 sec: 42613.3, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 4782538752. 
Throughput: 0: 42410.6. Samples: 1050100940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:18,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 05:56:18,565][26599] Updated weights for policy 0, policy_version 291904 (0.0030) [2024-06-19 05:56:22,974][26599] Updated weights for policy 0, policy_version 291914 (0.0033) [2024-06-19 05:56:23,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4782751744. Throughput: 0: 42445.0. Samples: 1050357300. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:23,380][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 05:56:26,565][26599] Updated weights for policy 0, policy_version 291924 (0.0030) [2024-06-19 05:56:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4782948352. Throughput: 0: 42251.2. Samples: 1050605960. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:28,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 05:56:30,485][26599] Updated weights for policy 0, policy_version 291934 (0.0035) [2024-06-19 05:56:33,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4783161344. Throughput: 0: 42481.4. Samples: 1050737980. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:33,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 05:56:34,112][26599] Updated weights for policy 0, policy_version 291944 (0.0046) [2024-06-19 05:56:38,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 4783357952. Throughput: 0: 42417.9. Samples: 1050993720. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:38,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 05:56:38,518][26599] Updated weights for policy 0, policy_version 291954 (0.0032) [2024-06-19 05:56:42,224][26599] Updated weights for policy 0, policy_version 291964 (0.0035) [2024-06-19 05:56:43,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42322.9, 300 sec: 42320.2). Total num frames: 4783587328. Throughput: 0: 42289.1. Samples: 1051240900. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:43,384][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 05:56:46,127][26599] Updated weights for policy 0, policy_version 291974 (0.0041) [2024-06-19 05:56:48,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4783816704. Throughput: 0: 42377.1. Samples: 1051370520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 05:56:48,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 05:56:49,790][26599] Updated weights for policy 0, policy_version 291984 (0.0038) [2024-06-19 05:56:53,380][26367] Fps is (10 sec: 39335.4, 60 sec: 41779.1, 300 sec: 42154.1). Total num frames: 4783980544. Throughput: 0: 42188.2. Samples: 1051620520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:56:53,381][26367] Avg episode reward: [(0, '0.429')] [2024-06-19 05:56:53,818][26599] Updated weights for policy 0, policy_version 291994 (0.0046) [2024-06-19 05:56:57,356][26599] Updated weights for policy 0, policy_version 292004 (0.0048) [2024-06-19 05:56:58,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4784209920. Throughput: 0: 42298.3. Samples: 1051873700. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:56:58,380][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 05:57:01,437][26599] Updated weights for policy 0, policy_version 292014 (0.0053) [2024-06-19 05:57:03,380][26367] Fps is (10 sec: 47514.2, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4784455680. Throughput: 0: 42303.7. Samples: 1052004600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:03,381][26367] Avg episode reward: [(0, '0.335')] [2024-06-19 05:57:05,010][26599] Updated weights for policy 0, policy_version 292024 (0.0027) [2024-06-19 05:57:08,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41781.7, 300 sec: 42210.1). Total num frames: 4784619520. Throughput: 0: 42159.4. Samples: 1052254480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:08,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 05:57:09,410][26599] Updated weights for policy 0, policy_version 292034 (0.0033) [2024-06-19 05:57:12,595][26599] Updated weights for policy 0, policy_version 292044 (0.0044) [2024-06-19 05:57:13,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4784848896. Throughput: 0: 42118.6. Samples: 1052501300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:13,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 05:57:17,153][26599] Updated weights for policy 0, policy_version 292054 (0.0037) [2024-06-19 05:57:18,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42049.8, 300 sec: 42320.6). Total num frames: 4785061888. Throughput: 0: 42110.3. Samples: 1052633100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:18,384][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 05:57:20,935][26599] Updated weights for policy 0, policy_version 292064 (0.0024) [2024-06-19 05:57:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.1, 300 sec: 42265.2). Total num frames: 4785258496. Throughput: 0: 41967.5. Samples: 1052882260. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:23,381][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 05:57:24,691][26599] Updated weights for policy 0, policy_version 292074 (0.0038) [2024-06-19 05:57:28,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4785471488. Throughput: 0: 42175.7. Samples: 1053138660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:28,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 05:57:28,638][26599] Updated weights for policy 0, policy_version 292084 (0.0032) [2024-06-19 05:57:32,164][26599] Updated weights for policy 0, policy_version 292094 (0.0034) [2024-06-19 05:57:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4785684480. Throughput: 0: 42178.7. Samples: 1053268560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:33,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 05:57:36,369][26599] Updated weights for policy 0, policy_version 292104 (0.0033) [2024-06-19 05:57:38,382][26367] Fps is (10 sec: 44230.7, 60 sec: 42597.3, 300 sec: 42376.0). Total num frames: 4785913856. Throughput: 0: 42241.4. Samples: 1053521440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:38,382][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 05:57:38,407][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000292109_4785913856.pth... [2024-06-19 05:57:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000291490_4775772160.pth [2024-06-19 05:57:39,764][26599] Updated weights for policy 0, policy_version 292114 (0.0028) [2024-06-19 05:57:41,655][26579] Signal inference workers to stop experience collection... (15600 times) [2024-06-19 05:57:41,708][26599] InferenceWorker_p0-w0: stopping experience collection (15600 times) [2024-06-19 05:57:41,714][26579] Signal inference workers to resume experience collection... 
(15600 times) [2024-06-19 05:57:41,725][26599] InferenceWorker_p0-w0: resuming experience collection (15600 times) [2024-06-19 05:57:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42327.8, 300 sec: 42320.7). Total num frames: 4786126848. Throughput: 0: 42304.7. Samples: 1053777420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:43,381][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 05:57:43,911][26599] Updated weights for policy 0, policy_version 292124 (0.0034) [2024-06-19 05:57:47,540][26599] Updated weights for policy 0, policy_version 292134 (0.0038) [2024-06-19 05:57:48,380][26367] Fps is (10 sec: 42604.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4786339840. Throughput: 0: 42333.3. Samples: 1053909600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:48,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 05:57:51,654][26599] Updated weights for policy 0, policy_version 292144 (0.0040) [2024-06-19 05:57:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42265.1). Total num frames: 4786536448. Throughput: 0: 42285.7. Samples: 1054157340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:53,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 05:57:55,378][26599] Updated weights for policy 0, policy_version 292154 (0.0030) [2024-06-19 05:57:58,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4786749440. Throughput: 0: 42534.0. Samples: 1054415320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:57:58,380][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 05:57:59,283][26599] Updated weights for policy 0, policy_version 292164 (0.0047) [2024-06-19 05:58:02,986][26599] Updated weights for policy 0, policy_version 292174 (0.0034) [2024-06-19 05:58:03,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 4786978816. Throughput: 0: 42458.0. Samples: 1054543560. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 05:58:03,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 05:58:07,062][26599] Updated weights for policy 0, policy_version 292184 (0.0036) [2024-06-19 05:58:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4787175424. Throughput: 0: 42486.3. Samples: 1054794140. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:08,380][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 05:58:10,854][26599] Updated weights for policy 0, policy_version 292194 (0.0023) [2024-06-19 05:58:13,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4787388416. Throughput: 0: 42503.7. Samples: 1055051320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:13,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 05:58:14,968][26599] Updated weights for policy 0, policy_version 292204 (0.0048) [2024-06-19 05:58:18,384][26367] Fps is (10 sec: 42584.0, 60 sec: 42325.6, 300 sec: 42264.7). Total num frames: 4787601408. Throughput: 0: 42470.7. Samples: 1055179880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:18,384][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 05:58:18,911][26599] Updated weights for policy 0, policy_version 292214 (0.0031) [2024-06-19 05:58:22,539][26599] Updated weights for policy 0, policy_version 292224 (0.0034) [2024-06-19 05:58:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4787814400. Throughput: 0: 42447.2. Samples: 1055431500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:23,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 05:58:26,535][26599] Updated weights for policy 0, policy_version 292234 (0.0029) [2024-06-19 05:58:28,384][26367] Fps is (10 sec: 40958.5, 60 sec: 42322.8, 300 sec: 42264.7). Total num frames: 4788011008. Throughput: 0: 42393.5. Samples: 1055685280. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:28,385][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 05:58:30,835][26599] Updated weights for policy 0, policy_version 292244 (0.0048) [2024-06-19 05:58:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4788256768. Throughput: 0: 42234.7. Samples: 1055810160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:33,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 05:58:34,347][26599] Updated weights for policy 0, policy_version 292254 (0.0030) [2024-06-19 05:58:38,344][26599] Updated weights for policy 0, policy_version 292264 (0.0034) [2024-06-19 05:58:38,380][26367] Fps is (10 sec: 44252.9, 60 sec: 42326.4, 300 sec: 42321.2). Total num frames: 4788453376. Throughput: 0: 42486.3. Samples: 1056069220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:38,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 05:58:42,018][26599] Updated weights for policy 0, policy_version 292274 (0.0040) [2024-06-19 05:58:43,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4788666368. Throughput: 0: 42355.8. Samples: 1056321340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:43,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 05:58:45,984][26599] Updated weights for policy 0, policy_version 292284 (0.0043) [2024-06-19 05:58:48,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4788895744. Throughput: 0: 42421.0. Samples: 1056452500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:48,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 05:58:49,669][26599] Updated weights for policy 0, policy_version 292294 (0.0043) [2024-06-19 05:58:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4789075968. Throughput: 0: 42500.7. Samples: 1056706680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:53,381][26367] Avg episode reward: [(0, '0.274')] [2024-06-19 05:58:53,743][26599] Updated weights for policy 0, policy_version 292304 (0.0036) [2024-06-19 05:58:57,307][26599] Updated weights for policy 0, policy_version 292314 (0.0033) [2024-06-19 05:58:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42376.7). Total num frames: 4789305344. Throughput: 0: 42360.3. Samples: 1056957540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:58:58,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 05:59:01,336][26599] Updated weights for policy 0, policy_version 292324 (0.0028) [2024-06-19 05:59:03,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4789534720. Throughput: 0: 42456.4. Samples: 1057090280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:59:03,381][26367] Avg episode reward: [(0, '0.218')] [2024-06-19 05:59:04,831][26599] Updated weights for policy 0, policy_version 292334 (0.0031) [2024-06-19 05:59:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 4789698560. Throughput: 0: 42389.2. Samples: 1057339020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:59:08,381][26367] Avg episode reward: [(0, '0.394')] [2024-06-19 05:59:09,195][26599] Updated weights for policy 0, policy_version 292344 (0.0027) [2024-06-19 05:59:11,559][26579] Signal inference workers to stop experience collection... (15650 times) [2024-06-19 05:59:11,602][26599] InferenceWorker_p0-w0: stopping experience collection (15650 times) [2024-06-19 05:59:11,610][26579] Signal inference workers to resume experience collection... 
(15650 times) [2024-06-19 05:59:11,616][26599] InferenceWorker_p0-w0: resuming experience collection (15650 times) [2024-06-19 05:59:12,523][26599] Updated weights for policy 0, policy_version 292354 (0.0036) [2024-06-19 05:59:13,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42595.7, 300 sec: 42375.7). Total num frames: 4789944320. Throughput: 0: 42466.2. Samples: 1057596260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 05:59:13,385][26367] Avg episode reward: [(0, '0.342')] [2024-06-19 05:59:16,885][26599] Updated weights for policy 0, policy_version 292364 (0.0041) [2024-06-19 05:59:18,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42600.8, 300 sec: 42376.3). Total num frames: 4790157312. Throughput: 0: 42746.3. Samples: 1057733740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 05:59:18,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 05:59:20,227][26599] Updated weights for policy 0, policy_version 292374 (0.0038) [2024-06-19 05:59:23,380][26367] Fps is (10 sec: 39336.0, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4790337536. Throughput: 0: 42445.8. Samples: 1057979280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 05:59:23,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 05:59:24,623][26599] Updated weights for policy 0, policy_version 292384 (0.0043) [2024-06-19 05:59:28,192][26599] Updated weights for policy 0, policy_version 292394 (0.0030) [2024-06-19 05:59:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42874.1, 300 sec: 42376.3). Total num frames: 4790583296. Throughput: 0: 42387.2. Samples: 1058228760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 05:59:28,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 05:59:32,272][26599] Updated weights for policy 0, policy_version 292404 (0.0027) [2024-06-19 05:59:33,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4790796288. Throughput: 0: 42525.3. Samples: 1058366140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 05:59:33,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 05:59:35,687][26599] Updated weights for policy 0, policy_version 292414 (0.0037) [2024-06-19 05:59:38,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4790976512. Throughput: 0: 42389.3. Samples: 1058614200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 05:59:38,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 05:59:38,518][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000292419_4790992896.pth... [2024-06-19 05:59:38,577][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000291800_4780851200.pth [2024-06-19 05:59:39,923][26599] Updated weights for policy 0, policy_version 292424 (0.0044) [2024-06-19 05:59:43,226][26599] Updated weights for policy 0, policy_version 292434 (0.0033) [2024-06-19 05:59:43,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4791238656. Throughput: 0: 42254.2. Samples: 1058858980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 05:59:43,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 05:59:48,137][26599] Updated weights for policy 0, policy_version 292444 (0.0037) [2024-06-19 05:59:48,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 42376.8). Total num frames: 4791418880. Throughput: 0: 42445.5. Samples: 1059000320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 05:59:48,380][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 05:59:50,984][26599] Updated weights for policy 0, policy_version 292454 (0.0032) [2024-06-19 05:59:53,380][26367] Fps is (10 sec: 37683.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4791615488. Throughput: 0: 42431.7. Samples: 1059248440. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 05:59:53,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 05:59:55,629][26599] Updated weights for policy 0, policy_version 292464 (0.0031) [2024-06-19 05:59:58,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4791861248. Throughput: 0: 42346.6. Samples: 1059501700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 05:59:58,381][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 05:59:58,779][26599] Updated weights for policy 0, policy_version 292474 (0.0040) [2024-06-19 06:00:03,235][26599] Updated weights for policy 0, policy_version 292484 (0.0034) [2024-06-19 06:00:03,383][26367] Fps is (10 sec: 44226.6, 60 sec: 42050.7, 300 sec: 42375.9). Total num frames: 4792057856. Throughput: 0: 42275.1. Samples: 1059636220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:00:03,383][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 06:00:06,687][26599] Updated weights for policy 0, policy_version 292494 (0.0037) [2024-06-19 06:00:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4792254464. Throughput: 0: 42234.2. Samples: 1059879820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:00:08,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 06:00:11,011][26599] Updated weights for policy 0, policy_version 292504 (0.0038) [2024-06-19 06:00:13,380][26367] Fps is (10 sec: 42608.5, 60 sec: 42328.0, 300 sec: 42320.7). Total num frames: 4792483840. Throughput: 0: 42337.0. Samples: 1060133920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:00:13,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 06:00:14,284][26599] Updated weights for policy 0, policy_version 292514 (0.0038) [2024-06-19 06:00:18,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4792696832. Throughput: 0: 42350.3. Samples: 1060271900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:00:18,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 06:00:18,515][26599] Updated weights for policy 0, policy_version 292524 (0.0033) [2024-06-19 06:00:22,022][26599] Updated weights for policy 0, policy_version 292534 (0.0030) [2024-06-19 06:00:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4792893440. Throughput: 0: 42284.5. Samples: 1060517000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:00:23,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 06:00:26,256][26599] Updated weights for policy 0, policy_version 292544 (0.0031) [2024-06-19 06:00:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4793139200. Throughput: 0: 42547.1. Samples: 1060773600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:00:28,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 06:00:29,589][26599] Updated weights for policy 0, policy_version 292554 (0.0029) [2024-06-19 06:00:33,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 4793303040. Throughput: 0: 42375.1. Samples: 1060907200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:00:33,380][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 06:00:33,908][26599] Updated weights for policy 0, policy_version 292564 (0.0032) [2024-06-19 06:00:36,808][26579] Signal inference workers to stop experience collection... (15700 times) [2024-06-19 06:00:36,808][26579] Signal inference workers to resume experience collection... 
(15700 times) [2024-06-19 06:00:36,856][26599] InferenceWorker_p0-w0: stopping experience collection (15700 times) [2024-06-19 06:00:36,856][26599] InferenceWorker_p0-w0: resuming experience collection (15700 times) [2024-06-19 06:00:37,215][26599] Updated weights for policy 0, policy_version 292574 (0.0028) [2024-06-19 06:00:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 4793548800. Throughput: 0: 42410.1. Samples: 1061156900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:00:38,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 06:00:41,822][26599] Updated weights for policy 0, policy_version 292584 (0.0029) [2024-06-19 06:00:43,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 4793761792. Throughput: 0: 42534.8. Samples: 1061415760. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:00:43,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 06:00:45,052][26599] Updated weights for policy 0, policy_version 292594 (0.0027) [2024-06-19 06:00:48,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.1, 300 sec: 42265.2). Total num frames: 4793942016. Throughput: 0: 42321.2. Samples: 1061540580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:00:48,381][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 06:00:49,772][26599] Updated weights for policy 0, policy_version 292604 (0.0044) [2024-06-19 06:00:52,715][26599] Updated weights for policy 0, policy_version 292614 (0.0029) [2024-06-19 06:00:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 4794204160. Throughput: 0: 42463.6. Samples: 1061790680. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:00:53,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 06:00:57,474][26599] Updated weights for policy 0, policy_version 292624 (0.0032) [2024-06-19 06:00:58,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4794384384. Throughput: 0: 42602.3. Samples: 1062051020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:00:58,380][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 06:01:00,227][26599] Updated weights for policy 0, policy_version 292634 (0.0039) [2024-06-19 06:01:03,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42053.9, 300 sec: 42265.7). Total num frames: 4794580992. Throughput: 0: 42221.8. Samples: 1062171880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:01:03,380][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 06:01:05,308][26599] Updated weights for policy 0, policy_version 292644 (0.0043) [2024-06-19 06:01:07,795][26599] Updated weights for policy 0, policy_version 292654 (0.0024) [2024-06-19 06:01:08,380][26367] Fps is (10 sec: 45874.8, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 4794843136. Throughput: 0: 42545.8. Samples: 1062431560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:01:08,380][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 06:01:12,929][26599] Updated weights for policy 0, policy_version 292664 (0.0038) [2024-06-19 06:01:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4795006976. Throughput: 0: 42521.8. Samples: 1062687080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:01:13,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 06:01:15,973][26599] Updated weights for policy 0, policy_version 292674 (0.0026) [2024-06-19 06:01:18,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4795236352. Throughput: 0: 42257.7. Samples: 1062808800. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:01:18,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 06:01:20,412][26599] Updated weights for policy 0, policy_version 292684 (0.0036) [2024-06-19 06:01:23,380][26367] Fps is (10 sec: 47512.8, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 4795482112. Throughput: 0: 42539.9. Samples: 1063071200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:01:23,381][26367] Avg episode reward: [(0, '0.370')] [2024-06-19 06:01:23,639][26599] Updated weights for policy 0, policy_version 292694 (0.0039) [2024-06-19 06:01:27,966][26599] Updated weights for policy 0, policy_version 292704 (0.0028) [2024-06-19 06:01:28,382][26367] Fps is (10 sec: 42592.5, 60 sec: 42051.3, 300 sec: 42376.0). Total num frames: 4795662336. Throughput: 0: 42443.9. Samples: 1063325800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:01:28,382][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 06:01:31,239][26599] Updated weights for policy 0, policy_version 292714 (0.0037) [2024-06-19 06:01:33,380][26367] Fps is (10 sec: 40960.8, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 4795891712. Throughput: 0: 42451.2. Samples: 1063450880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:01:33,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 06:01:35,595][26599] Updated weights for policy 0, policy_version 292724 (0.0036) [2024-06-19 06:01:38,380][26367] Fps is (10 sec: 44242.5, 60 sec: 42598.4, 300 sec: 42432.3). Total num frames: 4796104704. Throughput: 0: 42699.0. Samples: 1063712140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:01:38,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 06:01:38,387][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000292731_4796104704.pth... 
[2024-06-19 06:01:38,448][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000292109_4785913856.pth [2024-06-19 06:01:39,200][26599] Updated weights for policy 0, policy_version 292734 (0.0034) [2024-06-19 06:01:43,348][26599] Updated weights for policy 0, policy_version 292744 (0.0030) [2024-06-19 06:01:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42376.3). Total num frames: 4796317696. Throughput: 0: 42569.7. Samples: 1063966660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 06:01:43,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 06:01:46,962][26599] Updated weights for policy 0, policy_version 292754 (0.0034) [2024-06-19 06:01:48,380][26367] Fps is (10 sec: 44237.2, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 4796547072. Throughput: 0: 42571.5. Samples: 1064087600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:01:48,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 06:01:50,938][26579] Signal inference workers to stop experience collection... (15750 times) [2024-06-19 06:01:50,938][26579] Signal inference workers to resume experience collection... (15750 times) [2024-06-19 06:01:50,973][26599] InferenceWorker_p0-w0: stopping experience collection (15750 times) [2024-06-19 06:01:50,973][26599] InferenceWorker_p0-w0: resuming experience collection (15750 times) [2024-06-19 06:01:51,085][26599] Updated weights for policy 0, policy_version 292764 (0.0037) [2024-06-19 06:01:53,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42049.7, 300 sec: 42431.3). Total num frames: 4796727296. Throughput: 0: 42636.1. Samples: 1064350340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:01:53,385][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 06:01:54,582][26599] Updated weights for policy 0, policy_version 292774 (0.0027) [2024-06-19 06:01:58,380][26367] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4796923904. 
Throughput: 0: 42572.1. Samples: 1064602820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:01:58,380][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 06:01:58,780][26599] Updated weights for policy 0, policy_version 292784 (0.0044) [2024-06-19 06:02:02,455][26599] Updated weights for policy 0, policy_version 292794 (0.0027) [2024-06-19 06:02:03,380][26367] Fps is (10 sec: 44253.4, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 4797169664. Throughput: 0: 42602.8. Samples: 1064725920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:03,380][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 06:02:06,473][26599] Updated weights for policy 0, policy_version 292804 (0.0039) [2024-06-19 06:02:08,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4797366272. Throughput: 0: 42428.6. Samples: 1064980480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:08,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 06:02:10,203][26599] Updated weights for policy 0, policy_version 292814 (0.0039) [2024-06-19 06:02:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42432.3). Total num frames: 4797579264. Throughput: 0: 42515.1. Samples: 1065238920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:13,380][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 06:02:14,180][26599] Updated weights for policy 0, policy_version 292824 (0.0032) [2024-06-19 06:02:18,047][26599] Updated weights for policy 0, policy_version 292834 (0.0040) [2024-06-19 06:02:18,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4797792256. Throughput: 0: 42625.3. Samples: 1065369020. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:18,381][26367] Avg episode reward: [(0, '0.330')] [2024-06-19 06:02:21,786][26599] Updated weights for policy 0, policy_version 292844 (0.0034) [2024-06-19 06:02:23,380][26367] Fps is (10 sec: 39321.7, 60 sec: 41506.3, 300 sec: 42376.3). Total num frames: 4797972480. Throughput: 0: 42281.5. Samples: 1065614800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:23,380][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 06:02:25,923][26599] Updated weights for policy 0, policy_version 292854 (0.0037) [2024-06-19 06:02:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42599.3, 300 sec: 42487.3). Total num frames: 4798218240. Throughput: 0: 42200.8. Samples: 1065865700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:28,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 06:02:29,435][26599] Updated weights for policy 0, policy_version 292864 (0.0031) [2024-06-19 06:02:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42320.9). Total num frames: 4798398464. Throughput: 0: 42447.6. Samples: 1065997740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:33,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 06:02:33,767][26599] Updated weights for policy 0, policy_version 292874 (0.0041) [2024-06-19 06:02:37,129][26599] Updated weights for policy 0, policy_version 292884 (0.0029) [2024-06-19 06:02:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4798627840. Throughput: 0: 42054.0. Samples: 1066242620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:38,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 06:02:41,631][26599] Updated weights for policy 0, policy_version 292894 (0.0027) [2024-06-19 06:02:43,384][26367] Fps is (10 sec: 45858.3, 60 sec: 42322.8, 300 sec: 42431.3). Total num frames: 4798857216. Throughput: 0: 42190.3. Samples: 1066501540. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:43,384][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 06:02:44,777][26599] Updated weights for policy 0, policy_version 292904 (0.0047) [2024-06-19 06:02:48,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41506.2, 300 sec: 42376.3). Total num frames: 4799037440. Throughput: 0: 42316.4. Samples: 1066630160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:48,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 06:02:49,232][26599] Updated weights for policy 0, policy_version 292914 (0.0038) [2024-06-19 06:02:52,304][26599] Updated weights for policy 0, policy_version 292924 (0.0043) [2024-06-19 06:02:53,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42601.0, 300 sec: 42487.3). Total num frames: 4799283200. Throughput: 0: 42197.0. Samples: 1066879340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 06:02:53,380][26367] Avg episode reward: [(0, '0.881')] [2024-06-19 06:02:57,831][26599] Updated weights for policy 0, policy_version 292934 (0.0026) [2024-06-19 06:02:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4799463424. Throughput: 0: 42260.0. Samples: 1067140620. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:02:58,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 06:03:00,087][26599] Updated weights for policy 0, policy_version 292944 (0.0029) [2024-06-19 06:03:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 4799676416. Throughput: 0: 42033.9. Samples: 1067260540. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:03,380][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 06:03:05,450][26599] Updated weights for policy 0, policy_version 292954 (0.0031) [2024-06-19 06:03:08,012][26599] Updated weights for policy 0, policy_version 292964 (0.0046) [2024-06-19 06:03:08,381][26367] Fps is (10 sec: 45873.3, 60 sec: 42598.1, 300 sec: 42487.3). Total num frames: 4799922176. Throughput: 0: 42166.7. Samples: 1067512320. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:08,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 06:03:13,131][26599] Updated weights for policy 0, policy_version 292974 (0.0033) [2024-06-19 06:03:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42321.2). Total num frames: 4800086016. Throughput: 0: 42408.1. Samples: 1067774060. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:13,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 06:03:15,778][26599] Updated weights for policy 0, policy_version 292984 (0.0027) [2024-06-19 06:03:18,380][26367] Fps is (10 sec: 39323.1, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4800315392. Throughput: 0: 42109.7. Samples: 1067892680. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:18,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 06:03:20,793][26599] Updated weights for policy 0, policy_version 292994 (0.0040) [2024-06-19 06:03:21,256][26579] Signal inference workers to stop experience collection... (15800 times) [2024-06-19 06:03:21,303][26599] InferenceWorker_p0-w0: stopping experience collection (15800 times) [2024-06-19 06:03:21,312][26579] Signal inference workers to resume experience collection... (15800 times) [2024-06-19 06:03:21,318][26599] InferenceWorker_p0-w0: resuming experience collection (15800 times) [2024-06-19 06:03:23,380][26367] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42543.4). Total num frames: 4800561152. Throughput: 0: 42525.4. 
Samples: 1068156260. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:23,384][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 06:03:23,763][26599] Updated weights for policy 0, policy_version 293004 (0.0030) [2024-06-19 06:03:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 4800724992. Throughput: 0: 42524.8. Samples: 1068415000. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:28,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 06:03:28,450][26599] Updated weights for policy 0, policy_version 293014 (0.0023) [2024-06-19 06:03:31,380][26599] Updated weights for policy 0, policy_version 293024 (0.0027) [2024-06-19 06:03:33,382][26367] Fps is (10 sec: 39314.7, 60 sec: 42597.1, 300 sec: 42376.0). Total num frames: 4800954368. Throughput: 0: 42316.1. Samples: 1068534460. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:33,382][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 06:03:36,238][26599] Updated weights for policy 0, policy_version 293034 (0.0040) [2024-06-19 06:03:38,380][26367] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 4801200128. Throughput: 0: 42438.5. Samples: 1068789080. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:38,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 06:03:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000293042_4801200128.pth... [2024-06-19 06:03:38,460][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000292419_4790992896.pth [2024-06-19 06:03:39,166][26599] Updated weights for policy 0, policy_version 293044 (0.0030) [2024-06-19 06:03:43,384][26367] Fps is (10 sec: 40952.1, 60 sec: 41779.2, 300 sec: 42264.6). Total num frames: 4801363968. Throughput: 0: 42275.6. Samples: 1069043180. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:43,385][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 06:03:43,981][26599] Updated weights for policy 0, policy_version 293054 (0.0046) [2024-06-19 06:03:46,996][26599] Updated weights for policy 0, policy_version 293064 (0.0045) [2024-06-19 06:03:48,380][26367] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4801593344. Throughput: 0: 42350.2. Samples: 1069166300. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:48,380][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 06:03:51,540][26599] Updated weights for policy 0, policy_version 293074 (0.0035) [2024-06-19 06:03:53,384][26367] Fps is (10 sec: 44237.1, 60 sec: 42049.7, 300 sec: 42375.7). Total num frames: 4801806336. Throughput: 0: 42527.2. Samples: 1069426180. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:53,384][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 06:03:54,566][26599] Updated weights for policy 0, policy_version 293084 (0.0034) [2024-06-19 06:03:58,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4802019328. Throughput: 0: 42353.6. Samples: 1069679980. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:03:58,381][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 06:03:59,132][26599] Updated weights for policy 0, policy_version 293094 (0.0033) [2024-06-19 06:04:02,564][26599] Updated weights for policy 0, policy_version 293104 (0.0042) [2024-06-19 06:04:03,380][26367] Fps is (10 sec: 42613.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4802232320. Throughput: 0: 42461.8. Samples: 1069803460. 
Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:04:03,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 06:04:06,710][26599] Updated weights for policy 0, policy_version 293114 (0.0030) [2024-06-19 06:04:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.5, 300 sec: 42376.8). Total num frames: 4802445312. Throughput: 0: 42248.3. Samples: 1070057440. Policy #0 lag: (min: 0.0, avg: 8.8, max: 21.0) [2024-06-19 06:04:08,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 06:04:10,241][26599] Updated weights for policy 0, policy_version 293124 (0.0047) [2024-06-19 06:04:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4802641920. Throughput: 0: 42100.3. Samples: 1070309520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:13,389][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 06:04:14,749][26599] Updated weights for policy 0, policy_version 293134 (0.0033) [2024-06-19 06:04:18,027][26599] Updated weights for policy 0, policy_version 293144 (0.0044) [2024-06-19 06:04:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4802871296. Throughput: 0: 42360.4. Samples: 1070440600. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:18,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 06:04:22,240][26599] Updated weights for policy 0, policy_version 293154 (0.0027) [2024-06-19 06:04:23,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4803084288. Throughput: 0: 42310.7. Samples: 1070693060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:23,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 06:04:25,809][26599] Updated weights for policy 0, policy_version 293164 (0.0038) [2024-06-19 06:04:28,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4803280896. Throughput: 0: 42200.2. Samples: 1070942040. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:28,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 06:04:29,781][26599] Updated weights for policy 0, policy_version 293174 (0.0030) [2024-06-19 06:04:33,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42599.7, 300 sec: 42487.3). Total num frames: 4803510272. Throughput: 0: 42305.8. Samples: 1071070060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:33,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 06:04:33,596][26599] Updated weights for policy 0, policy_version 293184 (0.0040) [2024-06-19 06:04:37,403][26599] Updated weights for policy 0, policy_version 293194 (0.0031) [2024-06-19 06:04:38,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4803739648. Throughput: 0: 42267.4. Samples: 1071328060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:38,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 06:04:41,473][26579] Signal inference workers to stop experience collection... (15850 times) [2024-06-19 06:04:41,478][26599] Updated weights for policy 0, policy_version 293204 (0.0054) [2024-06-19 06:04:41,500][26599] InferenceWorker_p0-w0: stopping experience collection (15850 times) [2024-06-19 06:04:41,476][26579] Signal inference workers to resume experience collection... (15850 times) [2024-06-19 06:04:41,519][26599] InferenceWorker_p0-w0: resuming experience collection (15850 times) [2024-06-19 06:04:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42874.1, 300 sec: 42431.8). Total num frames: 4803936256. Throughput: 0: 42267.7. Samples: 1071582020. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:43,381][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 06:04:44,974][26599] Updated weights for policy 0, policy_version 293214 (0.0024) [2024-06-19 06:04:48,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 4804116480. 
Throughput: 0: 42256.5. Samples: 1071705000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:48,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 06:04:49,328][26599] Updated weights for policy 0, policy_version 293224 (0.0042) [2024-06-19 06:04:53,025][26599] Updated weights for policy 0, policy_version 293234 (0.0040) [2024-06-19 06:04:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42601.0, 300 sec: 42376.2). Total num frames: 4804362240. Throughput: 0: 42296.6. Samples: 1071960780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:53,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 06:04:57,140][26599] Updated weights for policy 0, policy_version 293244 (0.0034) [2024-06-19 06:04:58,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42376.6). Total num frames: 4804558848. Throughput: 0: 42140.4. Samples: 1072205840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:04:58,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 06:05:00,891][26599] Updated weights for policy 0, policy_version 293254 (0.0031) [2024-06-19 06:05:03,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4804755456. Throughput: 0: 42089.6. Samples: 1072334640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:05:03,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 06:05:05,062][26599] Updated weights for policy 0, policy_version 293264 (0.0028) [2024-06-19 06:05:08,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.5, 300 sec: 42376.2). Total num frames: 4804984832. Throughput: 0: 42323.3. Samples: 1072597600. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:05:08,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 06:05:08,419][26599] Updated weights for policy 0, policy_version 293274 (0.0023) [2024-06-19 06:05:12,530][26599] Updated weights for policy 0, policy_version 293284 (0.0030) [2024-06-19 06:05:13,382][26367] Fps is (10 sec: 45865.6, 60 sec: 42869.9, 300 sec: 42431.5). Total num frames: 4805214208. Throughput: 0: 42349.2. Samples: 1072847840. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:05:13,383][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 06:05:16,374][26599] Updated weights for policy 0, policy_version 293294 (0.0030) [2024-06-19 06:05:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4805410816. Throughput: 0: 42415.9. Samples: 1072978780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:05:18,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 06:05:20,083][26599] Updated weights for policy 0, policy_version 293304 (0.0021) [2024-06-19 06:05:23,380][26367] Fps is (10 sec: 39330.2, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4805607424. Throughput: 0: 42446.2. Samples: 1073238140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 06:05:23,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 06:05:23,922][26599] Updated weights for policy 0, policy_version 293314 (0.0026) [2024-06-19 06:05:27,841][26599] Updated weights for policy 0, policy_version 293324 (0.0046) [2024-06-19 06:05:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4805836800. Throughput: 0: 42187.0. Samples: 1073480440. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:05:28,381][26367] Avg episode reward: [(0, '0.414')] [2024-06-19 06:05:31,808][26599] Updated weights for policy 0, policy_version 293334 (0.0044) [2024-06-19 06:05:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4806049792. Throughput: 0: 42281.3. Samples: 1073607660. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:05:33,381][26367] Avg episode reward: [(0, '0.756')] [2024-06-19 06:05:36,025][26599] Updated weights for policy 0, policy_version 293344 (0.0046) [2024-06-19 06:05:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4806246400. Throughput: 0: 42096.9. Samples: 1073855140. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:05:38,381][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 06:05:38,450][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000293351_4806262784.pth... [2024-06-19 06:05:38,497][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000292731_4796104704.pth [2024-06-19 06:05:39,845][26599] Updated weights for policy 0, policy_version 293354 (0.0030) [2024-06-19 06:05:43,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42376.3). Total num frames: 4806443008. Throughput: 0: 42347.7. Samples: 1074111480. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:05:43,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 06:05:43,539][26599] Updated weights for policy 0, policy_version 293364 (0.0038) [2024-06-19 06:05:47,456][26579] Signal inference workers to stop experience collection... (15900 times) [2024-06-19 06:05:47,456][26579] Signal inference workers to resume experience collection... 
(15900 times) [2024-06-19 06:05:47,465][26599] Updated weights for policy 0, policy_version 293374 (0.0037) [2024-06-19 06:05:47,480][26599] InferenceWorker_p0-w0: stopping experience collection (15900 times) [2024-06-19 06:05:47,480][26599] InferenceWorker_p0-w0: resuming experience collection (15900 times) [2024-06-19 06:05:48,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4806672384. Throughput: 0: 42158.0. Samples: 1074231740. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:05:48,380][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 06:05:51,504][26599] Updated weights for policy 0, policy_version 293384 (0.0036) [2024-06-19 06:05:53,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4806885376. Throughput: 0: 42005.2. Samples: 1074487840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:05:53,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 06:05:55,062][26599] Updated weights for policy 0, policy_version 293394 (0.0031) [2024-06-19 06:05:58,380][26367] Fps is (10 sec: 39321.1, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4807065600. Throughput: 0: 42122.0. Samples: 1074743240. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:05:58,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 06:05:59,228][26599] Updated weights for policy 0, policy_version 293404 (0.0033) [2024-06-19 06:06:02,582][26599] Updated weights for policy 0, policy_version 293414 (0.0038) [2024-06-19 06:06:03,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4807311360. Throughput: 0: 41986.7. Samples: 1074868180. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:06:03,384][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 06:06:06,996][26599] Updated weights for policy 0, policy_version 293424 (0.0041) [2024-06-19 06:06:08,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42052.1, 300 sec: 42376.2). Total num frames: 4807507968. Throughput: 0: 41882.5. Samples: 1075122860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:06:08,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 06:06:10,479][26599] Updated weights for policy 0, policy_version 293434 (0.0037) [2024-06-19 06:06:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41507.6, 300 sec: 42265.2). Total num frames: 4807704576. Throughput: 0: 42120.5. Samples: 1075375860. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:06:13,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 06:06:14,803][26599] Updated weights for policy 0, policy_version 293444 (0.0038) [2024-06-19 06:06:18,106][26599] Updated weights for policy 0, policy_version 293454 (0.0037) [2024-06-19 06:06:18,380][26367] Fps is (10 sec: 44238.0, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4807950336. Throughput: 0: 42180.1. Samples: 1075505760. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:06:18,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 06:06:22,740][26599] Updated weights for policy 0, policy_version 293464 (0.0047) [2024-06-19 06:06:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42265.4). Total num frames: 4808130560. Throughput: 0: 42325.7. Samples: 1075759800. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:06:23,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 06:06:26,041][26599] Updated weights for policy 0, policy_version 293474 (0.0032) [2024-06-19 06:06:28,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 4808343552. Throughput: 0: 42160.5. Samples: 1076008700. 
Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:06:28,380][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 06:06:30,382][26599] Updated weights for policy 0, policy_version 293484 (0.0041) [2024-06-19 06:06:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4808572928. Throughput: 0: 42335.4. Samples: 1076136840. Policy #0 lag: (min: 1.0, avg: 10.3, max: 22.0) [2024-06-19 06:06:33,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 06:06:33,749][26599] Updated weights for policy 0, policy_version 293494 (0.0035) [2024-06-19 06:06:38,377][26599] Updated weights for policy 0, policy_version 293504 (0.0038) [2024-06-19 06:06:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4808769536. Throughput: 0: 42250.4. Samples: 1076389100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:06:38,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 06:06:41,609][26599] Updated weights for policy 0, policy_version 293514 (0.0032) [2024-06-19 06:06:43,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42595.8, 300 sec: 42209.1). Total num frames: 4808998912. Throughput: 0: 42116.2. Samples: 1076638620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:06:43,384][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 06:06:46,183][26599] Updated weights for policy 0, policy_version 293524 (0.0040) [2024-06-19 06:06:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42321.2). Total num frames: 4809211904. Throughput: 0: 42367.7. Samples: 1076774720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:06:48,380][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 06:06:49,270][26599] Updated weights for policy 0, policy_version 293534 (0.0032) [2024-06-19 06:06:53,384][26367] Fps is (10 sec: 39321.7, 60 sec: 41776.8, 300 sec: 42264.6). Total num frames: 4809392128. Throughput: 0: 42226.6. Samples: 1077023200. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:06:53,384][26367] Avg episode reward: [(0, '0.837')] [2024-06-19 06:06:53,923][26599] Updated weights for policy 0, policy_version 293544 (0.0038) [2024-06-19 06:06:57,074][26599] Updated weights for policy 0, policy_version 293554 (0.0033) [2024-06-19 06:06:58,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42265.1). Total num frames: 4809637888. Throughput: 0: 42124.9. Samples: 1077271480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:06:58,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 06:07:01,549][26599] Updated weights for policy 0, policy_version 293564 (0.0043) [2024-06-19 06:07:03,380][26367] Fps is (10 sec: 44253.1, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4809834496. Throughput: 0: 42388.4. Samples: 1077413240. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:03,380][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 06:07:04,814][26599] Updated weights for policy 0, policy_version 293574 (0.0039) [2024-06-19 06:07:08,386][26367] Fps is (10 sec: 39301.3, 60 sec: 42048.7, 300 sec: 42208.9). Total num frames: 4810031104. Throughput: 0: 42237.3. Samples: 1077660700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:08,386][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 06:07:09,583][26599] Updated weights for policy 0, policy_version 293584 (0.0038) [2024-06-19 06:07:12,381][26599] Updated weights for policy 0, policy_version 293594 (0.0037) [2024-06-19 06:07:13,380][26367] Fps is (10 sec: 45874.7, 60 sec: 43144.6, 300 sec: 42376.2). Total num frames: 4810293248. Throughput: 0: 42239.4. Samples: 1077909480. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:13,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 06:07:17,229][26599] Updated weights for policy 0, policy_version 293604 (0.0034) [2024-06-19 06:07:18,380][26367] Fps is (10 sec: 42621.1, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4810457088. Throughput: 0: 42514.3. Samples: 1078049980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:18,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 06:07:20,189][26599] Updated weights for policy 0, policy_version 293614 (0.0028) [2024-06-19 06:07:21,981][26579] Signal inference workers to stop experience collection... (15950 times) [2024-06-19 06:07:22,024][26599] InferenceWorker_p0-w0: stopping experience collection (15950 times) [2024-06-19 06:07:22,033][26579] Signal inference workers to resume experience collection... (15950 times) [2024-06-19 06:07:22,044][26599] InferenceWorker_p0-w0: resuming experience collection (15950 times) [2024-06-19 06:07:23,380][26367] Fps is (10 sec: 37683.1, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4810670080. Throughput: 0: 42349.2. Samples: 1078294820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:23,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 06:07:24,849][26599] Updated weights for policy 0, policy_version 293624 (0.0043) [2024-06-19 06:07:28,144][26599] Updated weights for policy 0, policy_version 293634 (0.0045) [2024-06-19 06:07:28,384][26367] Fps is (10 sec: 45858.2, 60 sec: 42868.8, 300 sec: 42431.2). Total num frames: 4810915840. Throughput: 0: 42351.1. Samples: 1078544420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:28,385][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 06:07:32,549][26599] Updated weights for policy 0, policy_version 293644 (0.0029) [2024-06-19 06:07:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4811096064. Throughput: 0: 42273.7. 
Samples: 1078677040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:33,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 06:07:35,982][26599] Updated weights for policy 0, policy_version 293654 (0.0041) [2024-06-19 06:07:38,380][26367] Fps is (10 sec: 39336.2, 60 sec: 42325.3, 300 sec: 42210.2). Total num frames: 4811309056. Throughput: 0: 42220.8. Samples: 1078922980. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:38,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 06:07:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000293659_4811309056.pth... [2024-06-19 06:07:38,461][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000293042_4801200128.pth [2024-06-19 06:07:40,094][26599] Updated weights for policy 0, policy_version 293664 (0.0038) [2024-06-19 06:07:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42327.9, 300 sec: 42376.2). Total num frames: 4811538432. Throughput: 0: 42243.2. Samples: 1079172420. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:43,381][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 06:07:43,669][26599] Updated weights for policy 0, policy_version 293674 (0.0036) [2024-06-19 06:07:48,137][26599] Updated weights for policy 0, policy_version 293684 (0.0038) [2024-06-19 06:07:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42154.1). Total num frames: 4811718656. Throughput: 0: 42018.7. Samples: 1079304080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 06:07:48,380][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 06:07:51,360][26599] Updated weights for policy 0, policy_version 293694 (0.0030) [2024-06-19 06:07:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42874.0, 300 sec: 42376.2). Total num frames: 4811964416. Throughput: 0: 42098.2. Samples: 1079554900. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:07:53,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 06:07:55,819][26599] Updated weights for policy 0, policy_version 293704 (0.0032) [2024-06-19 06:07:58,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4812161024. Throughput: 0: 42262.6. Samples: 1079811300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:07:58,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 06:07:59,314][26599] Updated weights for policy 0, policy_version 293714 (0.0028) [2024-06-19 06:08:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4812357632. Throughput: 0: 41877.7. Samples: 1079934480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:03,381][26367] Avg episode reward: [(0, '0.287')] [2024-06-19 06:08:03,467][26599] Updated weights for policy 0, policy_version 293724 (0.0043) [2024-06-19 06:08:07,014][26599] Updated weights for policy 0, policy_version 293734 (0.0046) [2024-06-19 06:08:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42602.1, 300 sec: 42376.2). Total num frames: 4812587008. Throughput: 0: 42141.3. Samples: 1080191180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:08,381][26367] Avg episode reward: [(0, '0.308')] [2024-06-19 06:08:11,154][26599] Updated weights for policy 0, policy_version 293744 (0.0033) [2024-06-19 06:08:13,384][26367] Fps is (10 sec: 44221.8, 60 sec: 41776.9, 300 sec: 42320.2). Total num frames: 4812800000. Throughput: 0: 42357.1. Samples: 1080450480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:13,384][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 06:08:14,868][26599] Updated weights for policy 0, policy_version 293754 (0.0029) [2024-06-19 06:08:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42154.1). Total num frames: 4812996608. Throughput: 0: 42152.8. Samples: 1080573920. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:18,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 06:08:18,894][26599] Updated weights for policy 0, policy_version 293764 (0.0024) [2024-06-19 06:08:22,564][26599] Updated weights for policy 0, policy_version 293774 (0.0041) [2024-06-19 06:08:23,380][26367] Fps is (10 sec: 42613.0, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4813225984. Throughput: 0: 42409.3. Samples: 1080831400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:23,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 06:08:26,722][26599] Updated weights for policy 0, policy_version 293784 (0.0039) [2024-06-19 06:08:28,380][26367] Fps is (10 sec: 42599.3, 60 sec: 41781.8, 300 sec: 42265.4). Total num frames: 4813422592. Throughput: 0: 42650.3. Samples: 1081091680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:28,380][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 06:08:30,031][26599] Updated weights for policy 0, policy_version 293794 (0.0027) [2024-06-19 06:08:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4813651968. Throughput: 0: 42530.6. Samples: 1081217960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:33,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 06:08:34,286][26599] Updated weights for policy 0, policy_version 293804 (0.0035) [2024-06-19 06:08:38,003][26599] Updated weights for policy 0, policy_version 293814 (0.0033) [2024-06-19 06:08:38,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42376.8). Total num frames: 4813864960. Throughput: 0: 42679.7. Samples: 1081475480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:38,381][26367] Avg episode reward: [(0, '0.388')] [2024-06-19 06:08:41,951][26599] Updated weights for policy 0, policy_version 293824 (0.0034) [2024-06-19 06:08:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4814061568. Throughput: 0: 42594.0. Samples: 1081728020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:43,380][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 06:08:45,665][26599] Updated weights for policy 0, policy_version 293834 (0.0040) [2024-06-19 06:08:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42265.7). Total num frames: 4814274560. Throughput: 0: 42641.0. Samples: 1081853320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:48,380][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 06:08:49,473][26599] Updated weights for policy 0, policy_version 293844 (0.0021) [2024-06-19 06:08:53,364][26599] Updated weights for policy 0, policy_version 293854 (0.0039) [2024-06-19 06:08:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4814503936. Throughput: 0: 42693.1. Samples: 1082112360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:53,380][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 06:08:57,318][26599] Updated weights for policy 0, policy_version 293864 (0.0040) [2024-06-19 06:08:58,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4814716928. Throughput: 0: 42482.3. Samples: 1082362040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:08:58,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 06:09:00,987][26599] Updated weights for policy 0, policy_version 293874 (0.0038) [2024-06-19 06:09:03,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4814929920. Throughput: 0: 42569.4. Samples: 1082489540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:09:03,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 06:09:05,293][26599] Updated weights for policy 0, policy_version 293884 (0.0039) [2024-06-19 06:09:08,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4815110144. Throughput: 0: 42490.2. Samples: 1082743460. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:08,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 06:09:08,746][26599] Updated weights for policy 0, policy_version 293894 (0.0031) [2024-06-19 06:09:12,937][26599] Updated weights for policy 0, policy_version 293904 (0.0032) [2024-06-19 06:09:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42327.7, 300 sec: 42265.2). Total num frames: 4815339520. Throughput: 0: 42266.5. Samples: 1082993680. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:13,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 06:09:16,789][26599] Updated weights for policy 0, policy_version 293914 (0.0035) [2024-06-19 06:09:18,220][26579] Signal inference workers to stop experience collection... (16000 times) [2024-06-19 06:09:18,220][26579] Signal inference workers to resume experience collection... (16000 times) [2024-06-19 06:09:18,251][26599] InferenceWorker_p0-w0: stopping experience collection (16000 times) [2024-06-19 06:09:18,251][26599] InferenceWorker_p0-w0: resuming experience collection (16000 times) [2024-06-19 06:09:18,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42871.6, 300 sec: 42320.7). Total num frames: 4815568896. Throughput: 0: 42287.1. Samples: 1083120880. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:18,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 06:09:20,606][26599] Updated weights for policy 0, policy_version 293924 (0.0034) [2024-06-19 06:09:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4815765504. 
Throughput: 0: 42232.3. Samples: 1083375940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:23,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 06:09:24,544][26599] Updated weights for policy 0, policy_version 293934 (0.0041) [2024-06-19 06:09:28,235][26599] Updated weights for policy 0, policy_version 293944 (0.0026) [2024-06-19 06:09:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 4815978496. Throughput: 0: 42297.2. Samples: 1083631400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:28,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 06:09:32,293][26599] Updated weights for policy 0, policy_version 293954 (0.0044) [2024-06-19 06:09:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4816191488. Throughput: 0: 42257.2. Samples: 1083754900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:33,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 06:09:35,813][26599] Updated weights for policy 0, policy_version 293964 (0.0038) [2024-06-19 06:09:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4816388096. Throughput: 0: 42200.8. Samples: 1084011400. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:38,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 06:09:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000293969_4816388096.pth... [2024-06-19 06:09:38,488][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000293351_4806262784.pth [2024-06-19 06:09:39,961][26599] Updated weights for policy 0, policy_version 293974 (0.0042) [2024-06-19 06:09:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4816617472. Throughput: 0: 42288.0. Samples: 1084265000. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:43,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 06:09:43,408][26599] Updated weights for policy 0, policy_version 293984 (0.0042) [2024-06-19 06:09:47,635][26599] Updated weights for policy 0, policy_version 293994 (0.0029) [2024-06-19 06:09:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4816814080. Throughput: 0: 42248.1. Samples: 1084390700. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:48,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 06:09:51,161][26599] Updated weights for policy 0, policy_version 294004 (0.0025) [2024-06-19 06:09:53,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42049.7, 300 sec: 42264.7). Total num frames: 4817027072. Throughput: 0: 42215.7. Samples: 1084643320. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:53,384][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 06:09:55,434][26599] Updated weights for policy 0, policy_version 294014 (0.0040) [2024-06-19 06:09:58,382][26367] Fps is (10 sec: 40954.8, 60 sec: 41778.3, 300 sec: 42265.0). Total num frames: 4817223680. Throughput: 0: 42155.3. Samples: 1084890720. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:09:58,382][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 06:09:59,209][26599] Updated weights for policy 0, policy_version 294024 (0.0031) [2024-06-19 06:10:03,284][26599] Updated weights for policy 0, policy_version 294034 (0.0039) [2024-06-19 06:10:03,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4817453056. Throughput: 0: 42105.8. Samples: 1085015640. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:10:03,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 06:10:07,350][26599] Updated weights for policy 0, policy_version 294044 (0.0037) [2024-06-19 06:10:08,380][26367] Fps is (10 sec: 44242.3, 60 sec: 42598.4, 300 sec: 42209.9). Total num frames: 4817666048. Throughput: 0: 42044.4. Samples: 1085267940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:10:08,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 06:10:11,196][26599] Updated weights for policy 0, policy_version 294054 (0.0039) [2024-06-19 06:10:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4817862656. Throughput: 0: 41968.8. Samples: 1085520000. Policy #0 lag: (min: 0.0, avg: 11.1, max: 21.0) [2024-06-19 06:10:13,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 06:10:15,046][26599] Updated weights for policy 0, policy_version 294064 (0.0035) [2024-06-19 06:10:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 4818075648. Throughput: 0: 41913.9. Samples: 1085641020. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:10:18,380][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 06:10:18,840][26599] Updated weights for policy 0, policy_version 294074 (0.0035) [2024-06-19 06:10:22,593][26599] Updated weights for policy 0, policy_version 294084 (0.0054) [2024-06-19 06:10:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4818305024. Throughput: 0: 41893.7. Samples: 1085896620. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:10:23,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 06:10:26,538][26599] Updated weights for policy 0, policy_version 294094 (0.0030) [2024-06-19 06:10:28,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 4818485248. Throughput: 0: 41927.2. Samples: 1086151720. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:10:28,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 06:10:30,245][26599] Updated weights for policy 0, policy_version 294104 (0.0030) [2024-06-19 06:10:33,380][26367] Fps is (10 sec: 39322.6, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 4818698240. Throughput: 0: 41835.2. Samples: 1086273280. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:10:33,380][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 06:10:34,484][26599] Updated weights for policy 0, policy_version 294114 (0.0035) [2024-06-19 06:10:38,055][26599] Updated weights for policy 0, policy_version 294124 (0.0032) [2024-06-19 06:10:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4818927616. Throughput: 0: 42007.8. Samples: 1086533520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:10:38,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 06:10:42,138][26599] Updated weights for policy 0, policy_version 294134 (0.0035) [2024-06-19 06:10:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 4819124224. Throughput: 0: 42092.0. Samples: 1086784800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:10:43,380][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 06:10:45,638][26599] Updated weights for policy 0, policy_version 294144 (0.0027) [2024-06-19 06:10:48,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42049.7, 300 sec: 42209.1). Total num frames: 4819337216. Throughput: 0: 42066.8. Samples: 1086908800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:10:48,385][26367] Avg episode reward: [(0, '0.753')] [2024-06-19 06:10:49,711][26599] Updated weights for policy 0, policy_version 294154 (0.0031) [2024-06-19 06:10:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42327.9, 300 sec: 42376.2). Total num frames: 4819566592. Throughput: 0: 42182.7. Samples: 1087166160. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:10:53,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 06:10:53,528][26599] Updated weights for policy 0, policy_version 294164 (0.0032) [2024-06-19 06:10:57,363][26599] Updated weights for policy 0, policy_version 294174 (0.0033) [2024-06-19 06:10:58,380][26367] Fps is (10 sec: 42613.8, 60 sec: 42326.2, 300 sec: 42209.6). Total num frames: 4819763200. Throughput: 0: 42163.2. Samples: 1087417340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:10:58,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 06:11:00,321][26579] Signal inference workers to stop experience collection... (16050 times) [2024-06-19 06:11:00,322][26579] Signal inference workers to resume experience collection... (16050 times) [2024-06-19 06:11:00,338][26599] InferenceWorker_p0-w0: stopping experience collection (16050 times) [2024-06-19 06:11:00,365][26599] InferenceWorker_p0-w0: resuming experience collection (16050 times) [2024-06-19 06:11:01,545][26599] Updated weights for policy 0, policy_version 294184 (0.0035) [2024-06-19 06:11:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4819976192. Throughput: 0: 42240.3. Samples: 1087541840. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:11:03,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 06:11:05,462][26599] Updated weights for policy 0, policy_version 294194 (0.0040) [2024-06-19 06:11:08,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42322.8, 300 sec: 42375.7). Total num frames: 4820205568. Throughput: 0: 42383.8. Samples: 1087804040. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:11:08,385][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 06:11:09,374][26599] Updated weights for policy 0, policy_version 294204 (0.0027) [2024-06-19 06:11:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4820385792. Throughput: 0: 42244.4. 
Samples: 1088052720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:11:13,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 06:11:13,417][26599] Updated weights for policy 0, policy_version 294214 (0.0040) [2024-06-19 06:11:16,997][26599] Updated weights for policy 0, policy_version 294224 (0.0030) [2024-06-19 06:11:18,384][26367] Fps is (10 sec: 40960.2, 60 sec: 42322.7, 300 sec: 42320.2). Total num frames: 4820615168. Throughput: 0: 42214.3. Samples: 1088173080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:11:18,384][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 06:11:21,032][26599] Updated weights for policy 0, policy_version 294234 (0.0037) [2024-06-19 06:11:23,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42049.8, 300 sec: 42320.2). Total num frames: 4820828160. Throughput: 0: 42228.6. Samples: 1088433960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:11:23,384][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 06:11:24,698][26599] Updated weights for policy 0, policy_version 294244 (0.0033) [2024-06-19 06:11:28,380][26367] Fps is (10 sec: 40975.0, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4821024768. Throughput: 0: 42215.1. Samples: 1088684480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 06:11:28,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 06:11:28,566][26599] Updated weights for policy 0, policy_version 294254 (0.0028) [2024-06-19 06:11:32,666][26599] Updated weights for policy 0, policy_version 294264 (0.0044) [2024-06-19 06:11:33,384][26367] Fps is (10 sec: 40959.9, 60 sec: 42322.7, 300 sec: 42264.6). Total num frames: 4821237760. Throughput: 0: 42299.5. Samples: 1088812280. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:11:33,385][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 06:11:36,109][26599] Updated weights for policy 0, policy_version 294274 (0.0027) [2024-06-19 06:11:38,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42265.7). Total num frames: 4821467136. Throughput: 0: 42367.5. Samples: 1089072700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:11:38,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 06:11:38,400][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000294279_4821467136.pth... [2024-06-19 06:11:38,460][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000293659_4811309056.pth [2024-06-19 06:11:40,549][26599] Updated weights for policy 0, policy_version 294284 (0.0052) [2024-06-19 06:11:43,380][26367] Fps is (10 sec: 44253.2, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4821680128. Throughput: 0: 42348.5. Samples: 1089323020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:11:43,381][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 06:11:43,707][26599] Updated weights for policy 0, policy_version 294294 (0.0038) [2024-06-19 06:11:48,086][26599] Updated weights for policy 0, policy_version 294304 (0.0038) [2024-06-19 06:11:48,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4821876736. Throughput: 0: 42484.6. Samples: 1089453800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:11:48,384][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 06:11:51,909][26599] Updated weights for policy 0, policy_version 294314 (0.0038) [2024-06-19 06:11:53,381][26367] Fps is (10 sec: 40956.3, 60 sec: 42051.7, 300 sec: 42209.5). Total num frames: 4822089728. Throughput: 0: 42183.5. Samples: 1089702180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:11:53,382][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 06:11:55,683][26599] Updated weights for policy 0, policy_version 294324 (0.0030) [2024-06-19 06:11:58,380][26367] Fps is (10 sec: 42613.6, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4822302720. Throughput: 0: 42354.2. Samples: 1089958660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:11:58,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 06:11:59,749][26599] Updated weights for policy 0, policy_version 294334 (0.0033) [2024-06-19 06:12:03,221][26599] Updated weights for policy 0, policy_version 294344 (0.0042) [2024-06-19 06:12:03,380][26367] Fps is (10 sec: 44240.1, 60 sec: 42598.4, 300 sec: 42377.0). Total num frames: 4822532096. Throughput: 0: 42568.6. Samples: 1090088520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:12:03,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 06:12:07,364][26599] Updated weights for policy 0, policy_version 294354 (0.0040) [2024-06-19 06:12:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41781.8, 300 sec: 42098.6). Total num frames: 4822712320. Throughput: 0: 42415.9. Samples: 1090342520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:12:08,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 06:12:10,771][26599] Updated weights for policy 0, policy_version 294364 (0.0028) [2024-06-19 06:12:13,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4822941696. Throughput: 0: 42434.6. Samples: 1090594040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:12:13,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 06:12:15,168][26599] Updated weights for policy 0, policy_version 294374 (0.0028) [2024-06-19 06:12:18,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42601.0, 300 sec: 42376.3). Total num frames: 4823171072. Throughput: 0: 42495.0. Samples: 1090724400. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:12:18,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 06:12:19,072][26599] Updated weights for policy 0, policy_version 294384 (0.0029) [2024-06-19 06:12:20,871][26579] Signal inference workers to stop experience collection... (16100 times) [2024-06-19 06:12:20,916][26579] Signal inference workers to resume experience collection... (16100 times) [2024-06-19 06:12:20,924][26599] InferenceWorker_p0-w0: stopping experience collection (16100 times) [2024-06-19 06:12:20,952][26599] InferenceWorker_p0-w0: resuming experience collection (16100 times) [2024-06-19 06:12:22,787][26599] Updated weights for policy 0, policy_version 294394 (0.0030) [2024-06-19 06:12:23,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42054.9, 300 sec: 42154.6). Total num frames: 4823351296. Throughput: 0: 42201.5. Samples: 1090971760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:12:23,380][26367] Avg episode reward: [(0, '0.816')] [2024-06-19 06:12:26,604][26599] Updated weights for policy 0, policy_version 294404 (0.0034) [2024-06-19 06:12:28,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4823564288. Throughput: 0: 42401.3. Samples: 1091231080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:12:28,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 06:12:30,393][26599] Updated weights for policy 0, policy_version 294414 (0.0039) [2024-06-19 06:12:33,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42601.0, 300 sec: 42320.7). Total num frames: 4823793664. Throughput: 0: 42465.6. Samples: 1091364600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:12:33,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 06:12:34,625][26599] Updated weights for policy 0, policy_version 294424 (0.0032) [2024-06-19 06:12:38,114][26599] Updated weights for policy 0, policy_version 294434 (0.0043) [2024-06-19 06:12:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4824006656. Throughput: 0: 42372.8. Samples: 1091608920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:12:38,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 06:12:42,092][26599] Updated weights for policy 0, policy_version 294444 (0.0036) [2024-06-19 06:12:43,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4824203264. Throughput: 0: 42454.2. Samples: 1091869100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:12:43,381][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 06:12:46,144][26599] Updated weights for policy 0, policy_version 294454 (0.0037) [2024-06-19 06:12:48,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42600.9, 300 sec: 42265.2). Total num frames: 4824432640. Throughput: 0: 42390.2. Samples: 1091996080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:12:48,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 06:12:49,601][26599] Updated weights for policy 0, policy_version 294464 (0.0033) [2024-06-19 06:12:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42599.0, 300 sec: 42320.7). Total num frames: 4824645632. Throughput: 0: 42299.1. Samples: 1092245980. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:12:53,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 06:12:53,731][26599] Updated weights for policy 0, policy_version 294474 (0.0042) [2024-06-19 06:12:57,671][26599] Updated weights for policy 0, policy_version 294484 (0.0030) [2024-06-19 06:12:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4824842240. Throughput: 0: 42410.1. Samples: 1092502500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:12:58,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 06:13:01,568][26599] Updated weights for policy 0, policy_version 294494 (0.0036) [2024-06-19 06:13:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4825055232. Throughput: 0: 42297.3. Samples: 1092627780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:03,381][26367] Avg episode reward: [(0, '0.278')] [2024-06-19 06:13:05,238][26599] Updated weights for policy 0, policy_version 294504 (0.0043) [2024-06-19 06:13:08,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42265.7). Total num frames: 4825268224. Throughput: 0: 42348.9. Samples: 1092877460. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:08,380][26367] Avg episode reward: [(0, '0.335')] [2024-06-19 06:13:09,668][26599] Updated weights for policy 0, policy_version 294514 (0.0026) [2024-06-19 06:13:13,207][26599] Updated weights for policy 0, policy_version 294524 (0.0034) [2024-06-19 06:13:13,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4825481216. Throughput: 0: 42298.3. Samples: 1093134500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:13,380][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 06:13:17,266][26599] Updated weights for policy 0, policy_version 294534 (0.0043) [2024-06-19 06:13:18,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 42209.6). Total num frames: 4825677824. Throughput: 0: 42154.2. Samples: 1093261540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:18,381][26367] Avg episode reward: [(0, '0.401')] [2024-06-19 06:13:20,944][26599] Updated weights for policy 0, policy_version 294544 (0.0038) [2024-06-19 06:13:23,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 4825923584. Throughput: 0: 42279.5. Samples: 1093511500. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:23,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 06:13:25,111][26599] Updated weights for policy 0, policy_version 294554 (0.0043) [2024-06-19 06:13:28,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42322.7, 300 sec: 42209.1). Total num frames: 4826103808. Throughput: 0: 42153.9. Samples: 1093766180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:28,384][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 06:13:28,874][26599] Updated weights for policy 0, policy_version 294564 (0.0036) [2024-06-19 06:13:32,708][26599] Updated weights for policy 0, policy_version 294574 (0.0046) [2024-06-19 06:13:33,380][26367] Fps is (10 sec: 37683.6, 60 sec: 41779.3, 300 sec: 42154.1). Total num frames: 4826300416. Throughput: 0: 42082.0. Samples: 1093889760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:33,380][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 06:13:36,373][26599] Updated weights for policy 0, policy_version 294584 (0.0035) [2024-06-19 06:13:38,380][26367] Fps is (10 sec: 45891.7, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4826562560. Throughput: 0: 42161.3. Samples: 1094143240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:38,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 06:13:38,386][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000294590_4826562560.pth... [2024-06-19 06:13:38,437][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000293969_4816388096.pth [2024-06-19 06:13:40,437][26599] Updated weights for policy 0, policy_version 294594 (0.0041) [2024-06-19 06:13:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.4, 300 sec: 42209.6). Total num frames: 4826726400. Throughput: 0: 42208.2. Samples: 1094401860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:43,380][26367] Avg episode reward: [(0, '0.861')] [2024-06-19 06:13:44,269][26599] Updated weights for policy 0, policy_version 294604 (0.0033) [2024-06-19 06:13:48,245][26599] Updated weights for policy 0, policy_version 294614 (0.0031) [2024-06-19 06:13:48,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4826955776. Throughput: 0: 42085.7. Samples: 1094521640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:48,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 06:13:51,815][26599] Updated weights for policy 0, policy_version 294624 (0.0042) [2024-06-19 06:13:53,382][26367] Fps is (10 sec: 45864.6, 60 sec: 42323.8, 300 sec: 42264.9). Total num frames: 4827185152. Throughput: 0: 42208.5. Samples: 1094776940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 06:13:53,383][26367] Avg episode reward: [(0, '0.303')] [2024-06-19 06:13:55,994][26599] Updated weights for policy 0, policy_version 294634 (0.0042) [2024-06-19 06:13:58,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4827365376. Throughput: 0: 42291.0. Samples: 1095037600. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:13:58,381][26367] Avg episode reward: [(0, '0.344')] [2024-06-19 06:13:59,814][26599] Updated weights for policy 0, policy_version 294644 (0.0031) [2024-06-19 06:14:03,380][26367] Fps is (10 sec: 39330.0, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4827578368. Throughput: 0: 42111.1. Samples: 1095156540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:03,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 06:14:03,843][26599] Updated weights for policy 0, policy_version 294654 (0.0040) [2024-06-19 06:14:07,304][26579] Signal inference workers to stop experience collection... (16150 times) [2024-06-19 06:14:07,305][26579] Signal inference workers to resume experience collection... (16150 times) [2024-06-19 06:14:07,324][26599] InferenceWorker_p0-w0: stopping experience collection (16150 times) [2024-06-19 06:14:07,324][26599] InferenceWorker_p0-w0: resuming experience collection (16150 times) [2024-06-19 06:14:07,462][26599] Updated weights for policy 0, policy_version 294664 (0.0033) [2024-06-19 06:14:08,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42598.2, 300 sec: 42320.7). Total num frames: 4827824128. Throughput: 0: 42197.2. Samples: 1095410380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:08,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 06:14:11,672][26599] Updated weights for policy 0, policy_version 294674 (0.0024) [2024-06-19 06:14:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.1, 300 sec: 42098.5). Total num frames: 4827987968. Throughput: 0: 42191.0. Samples: 1095664620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:13,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 06:14:15,302][26599] Updated weights for policy 0, policy_version 294684 (0.0033) [2024-06-19 06:14:18,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42052.2, 300 sec: 42154.1). Total num frames: 4828200960. Throughput: 0: 42092.3. 
Samples: 1095783920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:18,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 06:14:19,497][26599] Updated weights for policy 0, policy_version 294694 (0.0025) [2024-06-19 06:14:23,152][26599] Updated weights for policy 0, policy_version 294704 (0.0046) [2024-06-19 06:14:23,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4828446720. Throughput: 0: 42164.8. Samples: 1096040660. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:23,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 06:14:27,239][26599] Updated weights for policy 0, policy_version 294714 (0.0036) [2024-06-19 06:14:28,381][26367] Fps is (10 sec: 44234.0, 60 sec: 42327.4, 300 sec: 42209.5). Total num frames: 4828643328. Throughput: 0: 42030.8. Samples: 1096293280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:28,381][26367] Avg episode reward: [(0, '0.758')] [2024-06-19 06:14:30,836][26599] Updated weights for policy 0, policy_version 294724 (0.0037) [2024-06-19 06:14:33,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4828856320. Throughput: 0: 42135.8. Samples: 1096417740. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:33,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 06:14:34,698][26599] Updated weights for policy 0, policy_version 294734 (0.0030) [2024-06-19 06:14:38,380][26367] Fps is (10 sec: 40962.8, 60 sec: 41506.2, 300 sec: 42154.1). Total num frames: 4829052928. Throughput: 0: 42267.8. Samples: 1096678900. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:38,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 06:14:38,612][26599] Updated weights for policy 0, policy_version 294744 (0.0037) [2024-06-19 06:14:42,453][26599] Updated weights for policy 0, policy_version 294754 (0.0031) [2024-06-19 06:14:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4829282304. Throughput: 0: 42061.9. Samples: 1096930380. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:43,380][26367] Avg episode reward: [(0, '0.803')] [2024-06-19 06:14:46,273][26599] Updated weights for policy 0, policy_version 294764 (0.0036) [2024-06-19 06:14:48,383][26367] Fps is (10 sec: 44224.5, 60 sec: 42323.4, 300 sec: 42265.3). Total num frames: 4829495296. Throughput: 0: 42262.7. Samples: 1097058480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:48,384][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 06:14:50,494][26599] Updated weights for policy 0, policy_version 294774 (0.0032) [2024-06-19 06:14:53,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41507.7, 300 sec: 42209.8). Total num frames: 4829675520. Throughput: 0: 42193.6. Samples: 1097309080. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:53,380][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 06:14:54,001][26599] Updated weights for policy 0, policy_version 294784 (0.0033) [2024-06-19 06:14:58,180][26599] Updated weights for policy 0, policy_version 294794 (0.0030) [2024-06-19 06:14:58,384][26367] Fps is (10 sec: 40956.6, 60 sec: 42322.8, 300 sec: 42209.1). Total num frames: 4829904896. Throughput: 0: 42180.2. Samples: 1097562880. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:14:58,384][26367] Avg episode reward: [(0, '0.374')] [2024-06-19 06:15:01,685][26599] Updated weights for policy 0, policy_version 294804 (0.0044) [2024-06-19 06:15:03,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4830134272. Throughput: 0: 42410.7. Samples: 1097692400. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:15:03,381][26367] Avg episode reward: [(0, '0.338')] [2024-06-19 06:15:05,767][26599] Updated weights for policy 0, policy_version 294814 (0.0038) [2024-06-19 06:15:08,380][26367] Fps is (10 sec: 40975.4, 60 sec: 41506.3, 300 sec: 42209.7). Total num frames: 4830314496. Throughput: 0: 42288.2. Samples: 1097943620. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 06:15:08,380][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 06:15:09,543][26599] Updated weights for policy 0, policy_version 294824 (0.0035) [2024-06-19 06:15:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42265.1). Total num frames: 4830543872. Throughput: 0: 42309.5. Samples: 1098197180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:13,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 06:15:13,548][26599] Updated weights for policy 0, policy_version 294834 (0.0033) [2024-06-19 06:15:17,079][26599] Updated weights for policy 0, policy_version 294844 (0.0031) [2024-06-19 06:15:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4830740480. Throughput: 0: 42434.1. Samples: 1098327280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:18,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 06:15:20,972][26599] Updated weights for policy 0, policy_version 294854 (0.0042) [2024-06-19 06:15:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 4830953472. Throughput: 0: 42357.8. Samples: 1098585000. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:23,381][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 06:15:24,610][26599] Updated weights for policy 0, policy_version 294864 (0.0043) [2024-06-19 06:15:26,617][26579] Signal inference workers to stop experience collection... (16200 times) [2024-06-19 06:15:26,656][26599] InferenceWorker_p0-w0: stopping experience collection (16200 times) [2024-06-19 06:15:26,740][26579] Signal inference workers to resume experience collection... (16200 times) [2024-06-19 06:15:26,740][26599] InferenceWorker_p0-w0: resuming experience collection (16200 times) [2024-06-19 06:15:28,384][26367] Fps is (10 sec: 45858.7, 60 sec: 42596.4, 300 sec: 42375.7). Total num frames: 4831199232. Throughput: 0: 42334.7. Samples: 1098835600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:28,384][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 06:15:28,654][26599] Updated weights for policy 0, policy_version 294874 (0.0033) [2024-06-19 06:15:32,377][26599] Updated weights for policy 0, policy_version 294884 (0.0046) [2024-06-19 06:15:33,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4831412224. Throughput: 0: 42495.6. Samples: 1098970660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:33,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 06:15:36,766][26599] Updated weights for policy 0, policy_version 294894 (0.0037) [2024-06-19 06:15:38,380][26367] Fps is (10 sec: 37696.4, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4831576064. Throughput: 0: 42421.1. Samples: 1099218040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:38,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 06:15:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000294896_4831576064.pth... 
[2024-06-19 06:15:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000294279_4821467136.pth [2024-06-19 06:15:40,090][26599] Updated weights for policy 0, policy_version 294904 (0.0029) [2024-06-19 06:15:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42321.2). Total num frames: 4831821824. Throughput: 0: 42404.3. Samples: 1099470920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:43,381][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 06:15:44,339][26599] Updated weights for policy 0, policy_version 294914 (0.0034) [2024-06-19 06:15:47,674][26599] Updated weights for policy 0, policy_version 294924 (0.0050) [2024-06-19 06:15:48,380][26367] Fps is (10 sec: 47514.1, 60 sec: 42600.4, 300 sec: 42320.7). Total num frames: 4832051200. Throughput: 0: 42584.0. Samples: 1099608680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:48,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 06:15:51,963][26599] Updated weights for policy 0, policy_version 294934 (0.0028) [2024-06-19 06:15:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4832215040. Throughput: 0: 42474.1. Samples: 1099854960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:53,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 06:15:55,375][26599] Updated weights for policy 0, policy_version 294944 (0.0033) [2024-06-19 06:15:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42600.9, 300 sec: 42320.7). Total num frames: 4832460800. Throughput: 0: 42490.2. Samples: 1100109240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:15:58,381][26367] Avg episode reward: [(0, '0.701')] [2024-06-19 06:16:00,137][26599] Updated weights for policy 0, policy_version 294954 (0.0036) [2024-06-19 06:16:03,254][26599] Updated weights for policy 0, policy_version 294964 (0.0038) [2024-06-19 06:16:03,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42598.4, 300 sec: 42321.2). Total num frames: 4832690176. Throughput: 0: 42677.7. Samples: 1100247780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:16:03,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 06:16:08,214][26599] Updated weights for policy 0, policy_version 294974 (0.0041) [2024-06-19 06:16:08,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 4832854016. Throughput: 0: 42420.0. Samples: 1100493900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:16:08,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 06:16:10,907][26599] Updated weights for policy 0, policy_version 294984 (0.0041) [2024-06-19 06:16:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42376.7). Total num frames: 4833116160. Throughput: 0: 42462.0. Samples: 1100746240. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:16:13,381][26367] Avg episode reward: [(0, '0.781')] [2024-06-19 06:16:15,918][26599] Updated weights for policy 0, policy_version 294994 (0.0032) [2024-06-19 06:16:18,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42321.2). Total num frames: 4833312768. Throughput: 0: 42480.4. Samples: 1100882280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:16:18,381][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 06:16:18,794][26599] Updated weights for policy 0, policy_version 295004 (0.0027) [2024-06-19 06:16:23,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 4833492992. Throughput: 0: 42496.0. Samples: 1101130360. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:16:23,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 06:16:23,745][26599] Updated weights for policy 0, policy_version 295014 (0.0038) [2024-06-19 06:16:26,518][26599] Updated weights for policy 0, policy_version 295024 (0.0024) [2024-06-19 06:16:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42600.9, 300 sec: 42432.3). Total num frames: 4833755136. Throughput: 0: 42333.3. Samples: 1101375920. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:16:28,381][26367] Avg episode reward: [(0, '0.378')] [2024-06-19 06:16:31,738][26599] Updated weights for policy 0, policy_version 295034 (0.0034) [2024-06-19 06:16:33,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4833935360. Throughput: 0: 42333.8. Samples: 1101513700. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:16:33,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 06:16:34,106][26599] Updated weights for policy 0, policy_version 295044 (0.0046) [2024-06-19 06:16:35,158][26579] Signal inference workers to stop experience collection... (16250 times) [2024-06-19 06:16:35,215][26599] InferenceWorker_p0-w0: stopping experience collection (16250 times) [2024-06-19 06:16:35,269][26579] Signal inference workers to resume experience collection... (16250 times) [2024-06-19 06:16:35,270][26599] InferenceWorker_p0-w0: resuming experience collection (16250 times) [2024-06-19 06:16:38,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42598.5, 300 sec: 42209.6). Total num frames: 4834131968. Throughput: 0: 42408.0. Samples: 1101763320. 
Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:16:38,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 06:16:39,295][26599] Updated weights for policy 0, policy_version 295054 (0.0030) [2024-06-19 06:16:42,039][26599] Updated weights for policy 0, policy_version 295064 (0.0040) [2024-06-19 06:16:43,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42432.3). Total num frames: 4834394112. Throughput: 0: 42312.3. Samples: 1102013280. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:16:43,380][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 06:16:46,906][26599] Updated weights for policy 0, policy_version 295074 (0.0037) [2024-06-19 06:16:48,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.3, 300 sec: 42320.8). Total num frames: 4834574336. Throughput: 0: 42229.0. Samples: 1102148080. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:16:48,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 06:16:49,705][26599] Updated weights for policy 0, policy_version 295084 (0.0041) [2024-06-19 06:16:53,380][26367] Fps is (10 sec: 39320.6, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4834787328. Throughput: 0: 42257.3. Samples: 1102395480. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:16:53,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 06:16:54,503][26599] Updated weights for policy 0, policy_version 295094 (0.0044) [2024-06-19 06:16:57,532][26599] Updated weights for policy 0, policy_version 295104 (0.0050) [2024-06-19 06:16:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4835000320. Throughput: 0: 42358.4. Samples: 1102652360. 
Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:16:58,380][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 06:17:02,001][26599] Updated weights for policy 0, policy_version 295114 (0.0052) [2024-06-19 06:17:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4835196928. Throughput: 0: 42232.3. Samples: 1102782740. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:17:03,381][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 06:17:05,327][26599] Updated weights for policy 0, policy_version 295124 (0.0041) [2024-06-19 06:17:08,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4835409920. Throughput: 0: 42274.7. Samples: 1103032720. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:17:08,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 06:17:09,544][26599] Updated weights for policy 0, policy_version 295134 (0.0043) [2024-06-19 06:17:12,940][26599] Updated weights for policy 0, policy_version 295144 (0.0033) [2024-06-19 06:17:13,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 4835655680. Throughput: 0: 42502.3. Samples: 1103288520. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:17:13,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 06:17:17,248][26599] Updated weights for policy 0, policy_version 295154 (0.0027) [2024-06-19 06:17:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4835835904. Throughput: 0: 42381.4. Samples: 1103420860. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:17:18,380][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 06:17:20,755][26599] Updated weights for policy 0, policy_version 295164 (0.0033) [2024-06-19 06:17:23,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4836065280. Throughput: 0: 42458.1. Samples: 1103673940. 
Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:17:23,381][26367] Avg episode reward: [(0, '0.386')] [2024-06-19 06:17:24,733][26599] Updated weights for policy 0, policy_version 295174 (0.0046) [2024-06-19 06:17:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 41779.2, 300 sec: 42265.2). Total num frames: 4836261888. Throughput: 0: 42590.5. Samples: 1103929860. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:17:28,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 06:17:28,646][26599] Updated weights for policy 0, policy_version 295184 (0.0032) [2024-06-19 06:17:32,778][26599] Updated weights for policy 0, policy_version 295194 (0.0026) [2024-06-19 06:17:33,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4836491264. Throughput: 0: 42220.0. Samples: 1104047980. Policy #0 lag: (min: 0.0, avg: 6.7, max: 20.0) [2024-06-19 06:17:33,381][26367] Avg episode reward: [(0, '0.425')] [2024-06-19 06:17:36,678][26599] Updated weights for policy 0, policy_version 295204 (0.0030) [2024-06-19 06:17:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4836704256. Throughput: 0: 42557.8. Samples: 1104310580. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:17:38,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 06:17:38,529][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000295210_4836720640.pth... [2024-06-19 06:17:38,583][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000294590_4826562560.pth [2024-06-19 06:17:40,434][26599] Updated weights for policy 0, policy_version 295214 (0.0028) [2024-06-19 06:17:41,202][26579] Signal inference workers to stop experience collection... (16300 times) [2024-06-19 06:17:41,203][26579] Signal inference workers to resume experience collection... 
(16300 times) [2024-06-19 06:17:41,214][26599] InferenceWorker_p0-w0: stopping experience collection (16300 times) [2024-06-19 06:17:41,237][26599] InferenceWorker_p0-w0: resuming experience collection (16300 times) [2024-06-19 06:17:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4836917248. Throughput: 0: 42448.5. Samples: 1104562540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:17:43,380][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 06:17:44,330][26599] Updated weights for policy 0, policy_version 295224 (0.0038) [2024-06-19 06:17:47,973][26599] Updated weights for policy 0, policy_version 295234 (0.0030) [2024-06-19 06:17:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4837130240. Throughput: 0: 42355.2. Samples: 1104688720. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:17:48,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 06:17:51,856][26599] Updated weights for policy 0, policy_version 295244 (0.0036) [2024-06-19 06:17:53,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4837326848. Throughput: 0: 42558.3. Samples: 1104947840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:17:53,381][26367] Avg episode reward: [(0, '0.352')] [2024-06-19 06:17:55,671][26599] Updated weights for policy 0, policy_version 295254 (0.0045) [2024-06-19 06:17:58,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42595.8, 300 sec: 42375.7). Total num frames: 4837556224. Throughput: 0: 42489.4. Samples: 1105200700. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:17:58,384][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 06:17:59,375][26599] Updated weights for policy 0, policy_version 295264 (0.0032) [2024-06-19 06:18:03,376][26599] Updated weights for policy 0, policy_version 295274 (0.0050) [2024-06-19 06:18:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42376.2). Total num frames: 4837769216. Throughput: 0: 42504.0. Samples: 1105333540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:03,380][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 06:18:06,984][26599] Updated weights for policy 0, policy_version 295284 (0.0037) [2024-06-19 06:18:08,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4837965824. Throughput: 0: 42449.9. Samples: 1105584180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:08,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 06:18:10,901][26599] Updated weights for policy 0, policy_version 295294 (0.0045) [2024-06-19 06:18:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4838195200. Throughput: 0: 42450.7. Samples: 1105840140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:13,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 06:18:14,561][26599] Updated weights for policy 0, policy_version 295304 (0.0047) [2024-06-19 06:18:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4838375424. Throughput: 0: 42828.0. Samples: 1105975240. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:18,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 06:18:18,852][26599] Updated weights for policy 0, policy_version 295314 (0.0037) [2024-06-19 06:18:22,176][26599] Updated weights for policy 0, policy_version 295324 (0.0033) [2024-06-19 06:18:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42376.8). Total num frames: 4838604800. Throughput: 0: 42388.5. Samples: 1106218060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:23,382][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 06:18:26,437][26599] Updated weights for policy 0, policy_version 295334 (0.0028) [2024-06-19 06:18:28,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4838834176. Throughput: 0: 42655.0. Samples: 1106482020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:28,381][26367] Avg episode reward: [(0, '0.389')] [2024-06-19 06:18:29,599][26599] Updated weights for policy 0, policy_version 295344 (0.0043) [2024-06-19 06:18:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4839030784. Throughput: 0: 42710.7. Samples: 1106610700. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:33,381][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 06:18:34,267][26599] Updated weights for policy 0, policy_version 295354 (0.0027) [2024-06-19 06:18:37,262][26599] Updated weights for policy 0, policy_version 295364 (0.0039) [2024-06-19 06:18:38,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4839260160. Throughput: 0: 42534.0. Samples: 1106861880. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:38,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 06:18:41,871][26599] Updated weights for policy 0, policy_version 295374 (0.0031) [2024-06-19 06:18:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4839473152. Throughput: 0: 42619.4. Samples: 1107118420. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:43,384][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 06:18:45,367][26599] Updated weights for policy 0, policy_version 295384 (0.0037) [2024-06-19 06:18:48,380][26367] Fps is (10 sec: 40961.2, 60 sec: 42325.4, 300 sec: 42321.0). Total num frames: 4839669760. Throughput: 0: 42538.2. Samples: 1107247760. Policy #0 lag: (min: 0.0, avg: 11.7, max: 26.0) [2024-06-19 06:18:48,381][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 06:18:49,323][26599] Updated weights for policy 0, policy_version 295394 (0.0029) [2024-06-19 06:18:53,031][26599] Updated weights for policy 0, policy_version 295404 (0.0040) [2024-06-19 06:18:53,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4839899136. Throughput: 0: 42615.6. Samples: 1107501880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:18:53,380][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 06:18:57,154][26599] Updated weights for policy 0, policy_version 295414 (0.0032) [2024-06-19 06:18:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42327.9, 300 sec: 42431.8). Total num frames: 4840095744. Throughput: 0: 42716.5. Samples: 1107762380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:18:58,380][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 06:19:00,784][26599] Updated weights for policy 0, policy_version 295424 (0.0029) [2024-06-19 06:19:03,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42322.7, 300 sec: 42320.2). Total num frames: 4840308736. Throughput: 0: 42405.9. Samples: 1107883660. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:03,385][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 06:19:05,078][26599] Updated weights for policy 0, policy_version 295434 (0.0039) [2024-06-19 06:19:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4840538112. Throughput: 0: 42651.1. Samples: 1108137360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:08,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 06:19:08,513][26579] Signal inference workers to stop experience collection... (16350 times) [2024-06-19 06:19:08,515][26579] Signal inference workers to resume experience collection... (16350 times) [2024-06-19 06:19:08,523][26599] InferenceWorker_p0-w0: stopping experience collection (16350 times) [2024-06-19 06:19:08,526][26599] Updated weights for policy 0, policy_version 295444 (0.0026) [2024-06-19 06:19:08,549][26599] InferenceWorker_p0-w0: resuming experience collection (16350 times) [2024-06-19 06:19:12,771][26599] Updated weights for policy 0, policy_version 295454 (0.0031) [2024-06-19 06:19:13,384][26367] Fps is (10 sec: 40960.2, 60 sec: 42049.7, 300 sec: 42431.3). Total num frames: 4840718336. Throughput: 0: 42453.9. Samples: 1108392600. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:13,384][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 06:19:16,015][26599] Updated weights for policy 0, policy_version 295464 (0.0036) [2024-06-19 06:19:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42376.3). Total num frames: 4840947712. Throughput: 0: 42324.0. Samples: 1108515280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:18,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 06:19:20,431][26599] Updated weights for policy 0, policy_version 295474 (0.0032) [2024-06-19 06:19:23,380][26367] Fps is (10 sec: 45891.7, 60 sec: 42871.4, 300 sec: 42487.4). Total num frames: 4841177088. 
Throughput: 0: 42596.2. Samples: 1108778700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:23,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 06:19:23,842][26599] Updated weights for policy 0, policy_version 295484 (0.0039) [2024-06-19 06:19:28,033][26599] Updated weights for policy 0, policy_version 295494 (0.0033) [2024-06-19 06:19:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4841373696. Throughput: 0: 42654.7. Samples: 1109037880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:28,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 06:19:31,593][26599] Updated weights for policy 0, policy_version 295504 (0.0052) [2024-06-19 06:19:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4841603072. Throughput: 0: 42555.5. Samples: 1109162760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:33,380][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 06:19:35,921][26599] Updated weights for policy 0, policy_version 295514 (0.0032) [2024-06-19 06:19:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 4841799680. Throughput: 0: 42702.6. Samples: 1109423500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:38,381][26367] Avg episode reward: [(0, '0.310')] [2024-06-19 06:19:38,455][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000295521_4841816064.pth... [2024-06-19 06:19:38,509][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000294896_4831576064.pth [2024-06-19 06:19:39,198][26599] Updated weights for policy 0, policy_version 295524 (0.0032) [2024-06-19 06:19:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42432.2). Total num frames: 4842012672. Throughput: 0: 42517.7. Samples: 1109675680. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:43,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 06:19:43,500][26599] Updated weights for policy 0, policy_version 295534 (0.0039) [2024-06-19 06:19:47,142][26599] Updated weights for policy 0, policy_version 295544 (0.0039) [2024-06-19 06:19:48,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 4842242048. Throughput: 0: 42651.7. Samples: 1109802840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:48,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 06:19:51,208][26599] Updated weights for policy 0, policy_version 295554 (0.0044) [2024-06-19 06:19:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42487.8). Total num frames: 4842438656. Throughput: 0: 42670.2. Samples: 1110057520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:53,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 06:19:54,884][26599] Updated weights for policy 0, policy_version 295564 (0.0023) [2024-06-19 06:19:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4842651648. Throughput: 0: 42645.2. Samples: 1110311480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:19:58,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 06:19:59,235][26599] Updated weights for policy 0, policy_version 295574 (0.0027) [2024-06-19 06:20:02,762][26599] Updated weights for policy 0, policy_version 295584 (0.0024) [2024-06-19 06:20:03,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42600.9, 300 sec: 42542.8). Total num frames: 4842864640. Throughput: 0: 42787.0. Samples: 1110440700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 06:20:03,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 06:20:06,978][26599] Updated weights for policy 0, policy_version 295594 (0.0031) [2024-06-19 06:20:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4843077632. Throughput: 0: 42589.3. Samples: 1110695220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:08,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 06:20:10,578][26599] Updated weights for policy 0, policy_version 295604 (0.0041) [2024-06-19 06:20:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43147.1, 300 sec: 42598.4). Total num frames: 4843307008. Throughput: 0: 42274.6. Samples: 1110940240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:13,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 06:20:14,593][26599] Updated weights for policy 0, policy_version 295614 (0.0035) [2024-06-19 06:20:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4843470848. Throughput: 0: 42375.1. Samples: 1111069640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:18,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 06:20:18,780][26599] Updated weights for policy 0, policy_version 295624 (0.0046) [2024-06-19 06:20:22,237][26599] Updated weights for policy 0, policy_version 295634 (0.0023) [2024-06-19 06:20:23,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42432.3). Total num frames: 4843716608. Throughput: 0: 42286.7. Samples: 1111326400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:23,380][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 06:20:26,354][26599] Updated weights for policy 0, policy_version 295644 (0.0032) [2024-06-19 06:20:28,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4843929600. Throughput: 0: 42253.2. Samples: 1111577080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:28,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 06:20:29,903][26599] Updated weights for policy 0, policy_version 295654 (0.0044) [2024-06-19 06:20:33,380][26367] Fps is (10 sec: 39320.8, 60 sec: 41779.1, 300 sec: 42487.3). Total num frames: 4844109824. Throughput: 0: 42331.6. Samples: 1111707760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:33,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 06:20:33,779][26579] Signal inference workers to stop experience collection... (16400 times) [2024-06-19 06:20:33,780][26579] Signal inference workers to resume experience collection... (16400 times) [2024-06-19 06:20:33,791][26599] InferenceWorker_p0-w0: stopping experience collection (16400 times) [2024-06-19 06:20:33,805][26599] InferenceWorker_p0-w0: resuming experience collection (16400 times) [2024-06-19 06:20:33,945][26599] Updated weights for policy 0, policy_version 295664 (0.0041) [2024-06-19 06:20:37,477][26599] Updated weights for policy 0, policy_version 295674 (0.0025) [2024-06-19 06:20:38,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4844322816. Throughput: 0: 42308.8. Samples: 1111961420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:38,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 06:20:41,633][26599] Updated weights for policy 0, policy_version 295684 (0.0038) [2024-06-19 06:20:43,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 4844584960. Throughput: 0: 42236.9. Samples: 1112212140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:43,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 06:20:45,025][26599] Updated weights for policy 0, policy_version 295694 (0.0029) [2024-06-19 06:20:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.4, 300 sec: 42487.3). Total num frames: 4844748800. 
Throughput: 0: 42298.4. Samples: 1112344120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:48,380][26367] Avg episode reward: [(0, '0.835')] [2024-06-19 06:20:49,299][26599] Updated weights for policy 0, policy_version 295704 (0.0046) [2024-06-19 06:20:52,594][26599] Updated weights for policy 0, policy_version 295714 (0.0042) [2024-06-19 06:20:53,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4844978176. Throughput: 0: 42326.2. Samples: 1112599900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:53,381][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 06:20:57,109][26599] Updated weights for policy 0, policy_version 295724 (0.0034) [2024-06-19 06:20:58,384][26367] Fps is (10 sec: 47495.9, 60 sec: 42868.9, 300 sec: 42486.8). Total num frames: 4845223936. Throughput: 0: 42427.8. Samples: 1112849640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:20:58,384][26367] Avg episode reward: [(0, '0.827')] [2024-06-19 06:21:00,456][26599] Updated weights for policy 0, policy_version 295734 (0.0045) [2024-06-19 06:21:03,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 4845387776. Throughput: 0: 42428.0. Samples: 1112978900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:21:03,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 06:21:04,683][26599] Updated weights for policy 0, policy_version 295744 (0.0049) [2024-06-19 06:21:08,380][26367] Fps is (10 sec: 39336.1, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4845617152. Throughput: 0: 42312.8. Samples: 1113230480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:21:08,380][26367] Avg episode reward: [(0, '0.350')] [2024-06-19 06:21:08,506][26599] Updated weights for policy 0, policy_version 295754 (0.0029) [2024-06-19 06:21:12,443][26599] Updated weights for policy 0, policy_version 295764 (0.0038) [2024-06-19 06:21:13,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4845862912. Throughput: 0: 42406.9. Samples: 1113485380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 19.0) [2024-06-19 06:21:13,381][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 06:21:16,069][26599] Updated weights for policy 0, policy_version 295774 (0.0022) [2024-06-19 06:21:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4846026752. Throughput: 0: 42412.6. Samples: 1113616320. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:21:18,380][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 06:21:20,070][26599] Updated weights for policy 0, policy_version 295784 (0.0032) [2024-06-19 06:21:23,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4846256128. Throughput: 0: 42478.3. Samples: 1113872940. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:21:23,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 06:21:23,696][26599] Updated weights for policy 0, policy_version 295794 (0.0044) [2024-06-19 06:21:27,712][26599] Updated weights for policy 0, policy_version 295804 (0.0027) [2024-06-19 06:21:28,380][26367] Fps is (10 sec: 45874.1, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 4846485504. Throughput: 0: 42486.1. Samples: 1114124020. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:21:28,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 06:21:31,373][26599] Updated weights for policy 0, policy_version 295814 (0.0038) [2024-06-19 06:21:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 4846649344. Throughput: 0: 42422.2. Samples: 1114253120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:21:33,380][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 06:21:35,340][26599] Updated weights for policy 0, policy_version 295824 (0.0028) [2024-06-19 06:21:38,380][26367] Fps is (10 sec: 40961.2, 60 sec: 42871.6, 300 sec: 42376.2). Total num frames: 4846895104. Throughput: 0: 42387.7. Samples: 1114507340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:21:38,380][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 06:21:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000295832_4846911488.pth... [2024-06-19 06:21:38,454][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000295210_4836720640.pth [2024-06-19 06:21:38,993][26599] Updated weights for policy 0, policy_version 295834 (0.0040) [2024-06-19 06:21:43,100][26599] Updated weights for policy 0, policy_version 295844 (0.0036) [2024-06-19 06:21:43,380][26367] Fps is (10 sec: 47512.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 4847124480. Throughput: 0: 42594.5. Samples: 1114766240. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:21:43,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 06:21:46,693][26599] Updated weights for policy 0, policy_version 295854 (0.0024) [2024-06-19 06:21:48,383][26367] Fps is (10 sec: 40949.4, 60 sec: 42596.6, 300 sec: 42431.4). Total num frames: 4847304704. Throughput: 0: 42554.5. Samples: 1114893960. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:21:48,383][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 06:21:48,967][26579] Signal inference workers to stop experience collection... (16450 times) [2024-06-19 06:21:48,968][26579] Signal inference workers to resume experience collection... (16450 times) [2024-06-19 06:21:48,982][26599] InferenceWorker_p0-w0: stopping experience collection (16450 times) [2024-06-19 06:21:48,983][26599] InferenceWorker_p0-w0: resuming experience collection (16450 times) [2024-06-19 06:21:50,678][26599] Updated weights for policy 0, policy_version 295864 (0.0051) [2024-06-19 06:21:53,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4847534080. Throughput: 0: 42556.8. Samples: 1115145540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:21:53,381][26367] Avg episode reward: [(0, '0.377')] [2024-06-19 06:21:54,397][26599] Updated weights for policy 0, policy_version 295874 (0.0051) [2024-06-19 06:21:58,380][26367] Fps is (10 sec: 44247.6, 60 sec: 42054.8, 300 sec: 42542.9). Total num frames: 4847747072. Throughput: 0: 42670.1. Samples: 1115405540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:21:58,381][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 06:21:58,452][26599] Updated weights for policy 0, policy_version 295884 (0.0030) [2024-06-19 06:22:02,086][26599] Updated weights for policy 0, policy_version 295894 (0.0037) [2024-06-19 06:22:03,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4847927296. Throughput: 0: 42690.2. Samples: 1115537380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:22:03,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 06:22:05,907][26599] Updated weights for policy 0, policy_version 295904 (0.0031) [2024-06-19 06:22:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4848173056. 
Throughput: 0: 42482.2. Samples: 1115784640. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:22:08,380][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 06:22:09,986][26599] Updated weights for policy 0, policy_version 295914 (0.0041) [2024-06-19 06:22:13,380][26367] Fps is (10 sec: 47513.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 4848402432. Throughput: 0: 42585.1. Samples: 1116040340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:22:13,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 06:22:13,682][26599] Updated weights for policy 0, policy_version 295924 (0.0032) [2024-06-19 06:22:17,779][26599] Updated weights for policy 0, policy_version 295934 (0.0029) [2024-06-19 06:22:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 4848599040. Throughput: 0: 42576.3. Samples: 1116169060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:22:18,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 06:22:21,347][26599] Updated weights for policy 0, policy_version 295944 (0.0029) [2024-06-19 06:22:23,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4848812032. Throughput: 0: 42526.6. Samples: 1116421040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:22:23,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 06:22:25,567][26599] Updated weights for policy 0, policy_version 295954 (0.0033) [2024-06-19 06:22:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 4849041408. Throughput: 0: 42497.4. Samples: 1116678620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 22.0) [2024-06-19 06:22:28,384][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 06:22:29,303][26599] Updated weights for policy 0, policy_version 295964 (0.0037) [2024-06-19 06:22:33,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4849221632. 
Throughput: 0: 42420.2. Samples: 1116802760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:22:33,380][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 06:22:33,615][26599] Updated weights for policy 0, policy_version 295974 (0.0035) [2024-06-19 06:22:37,002][26599] Updated weights for policy 0, policy_version 295984 (0.0037) [2024-06-19 06:22:38,384][26367] Fps is (10 sec: 40945.4, 60 sec: 42595.8, 300 sec: 42486.8). Total num frames: 4849451008. Throughput: 0: 42500.6. Samples: 1117058220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:22:38,385][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 06:22:41,549][26599] Updated weights for policy 0, policy_version 295994 (0.0036) [2024-06-19 06:22:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 4849664000. Throughput: 0: 42485.0. Samples: 1117317360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:22:43,380][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 06:22:44,821][26599] Updated weights for policy 0, policy_version 296004 (0.0036) [2024-06-19 06:22:48,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42873.3, 300 sec: 42542.9). Total num frames: 4849876992. Throughput: 0: 42192.4. Samples: 1117436040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:22:48,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-19 06:22:49,068][26599] Updated weights for policy 0, policy_version 296014 (0.0038) [2024-06-19 06:22:52,534][26599] Updated weights for policy 0, policy_version 296024 (0.0036) [2024-06-19 06:22:53,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42432.3). Total num frames: 4850073600. Throughput: 0: 42378.6. Samples: 1117691680. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:22:53,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 06:22:57,167][26599] Updated weights for policy 0, policy_version 296034 (0.0026) [2024-06-19 06:22:58,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4850270208. Throughput: 0: 42381.3. Samples: 1117947500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:22:58,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 06:22:59,298][26579] Signal inference workers to stop experience collection... (16500 times) [2024-06-19 06:22:59,298][26579] Signal inference workers to resume experience collection... (16500 times) [2024-06-19 06:22:59,338][26599] InferenceWorker_p0-w0: stopping experience collection (16500 times) [2024-06-19 06:22:59,338][26599] InferenceWorker_p0-w0: resuming experience collection (16500 times) [2024-06-19 06:23:00,397][26599] Updated weights for policy 0, policy_version 296044 (0.0042) [2024-06-19 06:23:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 4850515968. Throughput: 0: 42306.2. Samples: 1118072840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:23:03,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 06:23:04,593][26599] Updated weights for policy 0, policy_version 296054 (0.0039) [2024-06-19 06:23:08,064][26599] Updated weights for policy 0, policy_version 296064 (0.0030) [2024-06-19 06:23:08,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4850712576. Throughput: 0: 42348.3. Samples: 1118326720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:23:08,381][26367] Avg episode reward: [(0, '0.847')] [2024-06-19 06:23:12,594][26599] Updated weights for policy 0, policy_version 296074 (0.0034) [2024-06-19 06:23:13,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 4850909184. 
Throughput: 0: 42295.2. Samples: 1118581900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:23:13,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 06:23:15,882][26599] Updated weights for policy 0, policy_version 296084 (0.0032) [2024-06-19 06:23:18,384][26367] Fps is (10 sec: 42583.4, 60 sec: 42322.7, 300 sec: 42486.8). Total num frames: 4851138560. Throughput: 0: 42163.6. Samples: 1118700280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:23:18,385][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 06:23:20,046][26599] Updated weights for policy 0, policy_version 296094 (0.0039) [2024-06-19 06:23:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 4851335168. Throughput: 0: 42249.7. Samples: 1118959300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:23:23,380][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 06:23:23,647][26599] Updated weights for policy 0, policy_version 296104 (0.0037) [2024-06-19 06:23:27,773][26599] Updated weights for policy 0, policy_version 296114 (0.0038) [2024-06-19 06:23:28,380][26367] Fps is (10 sec: 40975.2, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 4851548160. Throughput: 0: 42143.9. Samples: 1119213840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:23:28,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 06:23:31,317][26599] Updated weights for policy 0, policy_version 296124 (0.0036) [2024-06-19 06:23:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4851777536. Throughput: 0: 42351.7. Samples: 1119341860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:23:33,380][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 06:23:35,363][26599] Updated weights for policy 0, policy_version 296134 (0.0037) [2024-06-19 06:23:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41781.7, 300 sec: 42320.7). Total num frames: 4851957760. 
Throughput: 0: 42281.8. Samples: 1119594360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:23:38,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 06:23:38,458][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000296141_4851974144.pth... [2024-06-19 06:23:38,527][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000295521_4841816064.pth [2024-06-19 06:23:39,098][26599] Updated weights for policy 0, policy_version 296144 (0.0041) [2024-06-19 06:23:42,927][26599] Updated weights for policy 0, policy_version 296154 (0.0030) [2024-06-19 06:23:43,384][26367] Fps is (10 sec: 42582.4, 60 sec: 42322.7, 300 sec: 42486.8). Total num frames: 4852203520. Throughput: 0: 42089.0. Samples: 1119841660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:23:43,385][26367] Avg episode reward: [(0, '0.845')] [2024-06-19 06:23:46,714][26599] Updated weights for policy 0, policy_version 296164 (0.0036) [2024-06-19 06:23:48,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4852400128. Throughput: 0: 42309.0. Samples: 1119976740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:23:48,381][26367] Avg episode reward: [(0, '0.868')] [2024-06-19 06:23:50,725][26599] Updated weights for policy 0, policy_version 296174 (0.0034) [2024-06-19 06:23:53,380][26367] Fps is (10 sec: 37697.5, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 4852580352. Throughput: 0: 42088.7. Samples: 1120220700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:23:53,380][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 06:23:54,926][26599] Updated weights for policy 0, policy_version 296184 (0.0042) [2024-06-19 06:23:58,180][26599] Updated weights for policy 0, policy_version 296194 (0.0044) [2024-06-19 06:23:58,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42487.8). Total num frames: 4852842496. Throughput: 0: 42181.3. Samples: 1120480060. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:23:58,384][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 06:24:02,583][26599] Updated weights for policy 0, policy_version 296204 (0.0031) [2024-06-19 06:24:03,384][26367] Fps is (10 sec: 44220.1, 60 sec: 41776.7, 300 sec: 42320.2). Total num frames: 4853022720. Throughput: 0: 42538.7. Samples: 1120614520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:03,384][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 06:24:05,725][26599] Updated weights for policy 0, policy_version 296214 (0.0035) [2024-06-19 06:24:08,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42432.3). Total num frames: 4853235712. Throughput: 0: 42387.9. Samples: 1120866760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:08,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 06:24:10,171][26599] Updated weights for policy 0, policy_version 296224 (0.0034) [2024-06-19 06:24:13,384][26367] Fps is (10 sec: 45875.3, 60 sec: 42868.9, 300 sec: 42486.8). Total num frames: 4853481472. Throughput: 0: 42318.3. Samples: 1121118320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:13,385][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 06:24:13,659][26599] Updated weights for policy 0, policy_version 296234 (0.0032) [2024-06-19 06:24:17,055][26579] Signal inference workers to stop experience collection... (16550 times) [2024-06-19 06:24:17,060][26579] Signal inference workers to resume experience collection... (16550 times) [2024-06-19 06:24:17,073][26599] InferenceWorker_p0-w0: stopping experience collection (16550 times) [2024-06-19 06:24:17,108][26599] InferenceWorker_p0-w0: resuming experience collection (16550 times) [2024-06-19 06:24:17,769][26599] Updated weights for policy 0, policy_version 296244 (0.0027) [2024-06-19 06:24:18,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42054.8, 300 sec: 42320.7). Total num frames: 4853661696. Throughput: 0: 42464.7. 
Samples: 1121252780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:18,381][26367] Avg episode reward: [(0, '0.750')] [2024-06-19 06:24:21,174][26599] Updated weights for policy 0, policy_version 296254 (0.0034) [2024-06-19 06:24:23,380][26367] Fps is (10 sec: 39336.0, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4853874688. Throughput: 0: 42518.7. Samples: 1121507700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:23,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 06:24:25,314][26599] Updated weights for policy 0, policy_version 296264 (0.0057) [2024-06-19 06:24:28,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4854120448. Throughput: 0: 42603.5. Samples: 1121758660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:28,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 06:24:28,661][26599] Updated weights for policy 0, policy_version 296274 (0.0043) [2024-06-19 06:24:32,949][26599] Updated weights for policy 0, policy_version 296284 (0.0039) [2024-06-19 06:24:33,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4854333440. Throughput: 0: 42596.7. Samples: 1121893600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:33,381][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 06:24:36,733][26599] Updated weights for policy 0, policy_version 296294 (0.0028) [2024-06-19 06:24:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4854530048. Throughput: 0: 42716.3. Samples: 1122142940. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:38,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 06:24:40,881][26599] Updated weights for policy 0, policy_version 296304 (0.0034) [2024-06-19 06:24:43,383][26367] Fps is (10 sec: 40950.7, 60 sec: 42326.3, 300 sec: 42375.9). Total num frames: 4854743040. Throughput: 0: 42548.0. 
Samples: 1122394820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:43,383][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 06:24:44,416][26599] Updated weights for policy 0, policy_version 296314 (0.0037) [2024-06-19 06:24:48,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4854956032. Throughput: 0: 42459.9. Samples: 1122525060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:48,381][26367] Avg episode reward: [(0, '0.810')] [2024-06-19 06:24:48,484][26599] Updated weights for policy 0, policy_version 296324 (0.0042) [2024-06-19 06:24:52,077][26599] Updated weights for policy 0, policy_version 296334 (0.0051) [2024-06-19 06:24:53,380][26367] Fps is (10 sec: 42607.7, 60 sec: 43144.3, 300 sec: 42431.8). Total num frames: 4855169024. Throughput: 0: 42319.8. Samples: 1122771160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 06:24:53,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 06:24:56,419][26599] Updated weights for policy 0, policy_version 296344 (0.0038) [2024-06-19 06:24:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4855382016. Throughput: 0: 42369.6. Samples: 1123024800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:24:58,381][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 06:25:00,017][26599] Updated weights for policy 0, policy_version 296354 (0.0037) [2024-06-19 06:25:03,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42600.9, 300 sec: 42376.2). Total num frames: 4855578624. Throughput: 0: 42219.9. Samples: 1123152680. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:03,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 06:25:04,289][26599] Updated weights for policy 0, policy_version 296364 (0.0029) [2024-06-19 06:25:08,079][26599] Updated weights for policy 0, policy_version 296374 (0.0038) [2024-06-19 06:25:08,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4855791616. Throughput: 0: 42079.2. Samples: 1123401260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:08,380][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 06:25:11,980][26599] Updated weights for policy 0, policy_version 296384 (0.0037) [2024-06-19 06:25:13,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42327.9, 300 sec: 42542.9). Total num frames: 4856020992. Throughput: 0: 42259.9. Samples: 1123660360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:13,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 06:25:15,841][26599] Updated weights for policy 0, policy_version 296394 (0.0034) [2024-06-19 06:25:18,380][26367] Fps is (10 sec: 40959.0, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4856201216. Throughput: 0: 42066.2. Samples: 1123786580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:18,381][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 06:25:19,654][26599] Updated weights for policy 0, policy_version 296404 (0.0030) [2024-06-19 06:25:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4856430592. Throughput: 0: 42070.7. Samples: 1124036120. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:23,381][26367] Avg episode reward: [(0, '0.025')] [2024-06-19 06:25:23,430][26599] Updated weights for policy 0, policy_version 296414 (0.0040) [2024-06-19 06:25:27,483][26599] Updated weights for policy 0, policy_version 296424 (0.0033) [2024-06-19 06:25:28,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4856643584. Throughput: 0: 42058.7. Samples: 1124287360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:28,380][26367] Avg episode reward: [(0, '0.286')] [2024-06-19 06:25:31,389][26599] Updated weights for policy 0, policy_version 296434 (0.0043) [2024-06-19 06:25:33,380][26367] Fps is (10 sec: 39321.1, 60 sec: 41506.1, 300 sec: 42376.2). Total num frames: 4856823808. Throughput: 0: 42111.8. Samples: 1124420100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:33,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 06:25:34,228][26579] Signal inference workers to stop experience collection... (16600 times) [2024-06-19 06:25:34,229][26579] Signal inference workers to resume experience collection... (16600 times) [2024-06-19 06:25:34,272][26599] InferenceWorker_p0-w0: stopping experience collection (16600 times) [2024-06-19 06:25:34,272][26599] InferenceWorker_p0-w0: resuming experience collection (16600 times) [2024-06-19 06:25:35,036][26599] Updated weights for policy 0, policy_version 296444 (0.0039) [2024-06-19 06:25:38,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4857069568. Throughput: 0: 42220.4. Samples: 1124671080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:38,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 06:25:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000296452_4857069568.pth... 
[2024-06-19 06:25:38,458][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000295832_4846911488.pth [2024-06-19 06:25:39,064][26599] Updated weights for policy 0, policy_version 296454 (0.0042) [2024-06-19 06:25:42,637][26599] Updated weights for policy 0, policy_version 296464 (0.0037) [2024-06-19 06:25:43,380][26367] Fps is (10 sec: 45876.2, 60 sec: 42327.0, 300 sec: 42487.3). Total num frames: 4857282560. Throughput: 0: 42148.6. Samples: 1124921480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:43,381][26367] Avg episode reward: [(0, '0.701')] [2024-06-19 06:25:47,025][26599] Updated weights for policy 0, policy_version 296474 (0.0038) [2024-06-19 06:25:48,384][26367] Fps is (10 sec: 40945.6, 60 sec: 42049.7, 300 sec: 42375.7). Total num frames: 4857479168. Throughput: 0: 42234.4. Samples: 1125053380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:48,385][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 06:25:50,443][26599] Updated weights for policy 0, policy_version 296484 (0.0030) [2024-06-19 06:25:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42265.7). Total num frames: 4857692160. Throughput: 0: 42160.4. Samples: 1125298480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:53,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 06:25:55,143][26599] Updated weights for policy 0, policy_version 296494 (0.0034) [2024-06-19 06:25:58,111][26599] Updated weights for policy 0, policy_version 296504 (0.0038) [2024-06-19 06:25:58,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4857921536. Throughput: 0: 42155.5. Samples: 1125557360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:25:58,384][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 06:26:02,813][26599] Updated weights for policy 0, policy_version 296514 (0.0038) [2024-06-19 06:26:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 4858101760. Throughput: 0: 42228.2. Samples: 1125686840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:26:03,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 06:26:06,216][26599] Updated weights for policy 0, policy_version 296524 (0.0036) [2024-06-19 06:26:08,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4858331136. Throughput: 0: 42217.4. Samples: 1125935900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 06:26:08,380][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 06:26:10,456][26599] Updated weights for policy 0, policy_version 296534 (0.0028) [2024-06-19 06:26:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4858544128. Throughput: 0: 42502.3. Samples: 1126199960. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:13,380][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 06:26:14,087][26599] Updated weights for policy 0, policy_version 296544 (0.0055) [2024-06-19 06:26:17,941][26599] Updated weights for policy 0, policy_version 296554 (0.0029) [2024-06-19 06:26:18,380][26367] Fps is (10 sec: 40959.0, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4858740736. Throughput: 0: 42069.3. Samples: 1126313220. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:18,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 06:26:21,881][26599] Updated weights for policy 0, policy_version 296564 (0.0031) [2024-06-19 06:26:23,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42376.3). Total num frames: 4858986496. Throughput: 0: 42293.4. Samples: 1126574280. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:23,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 06:26:25,658][26599] Updated weights for policy 0, policy_version 296574 (0.0026) [2024-06-19 06:26:28,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 4859166720. Throughput: 0: 42663.1. Samples: 1126841320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:28,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 06:26:29,349][26599] Updated weights for policy 0, policy_version 296584 (0.0037) [2024-06-19 06:26:33,325][26599] Updated weights for policy 0, policy_version 296594 (0.0035) [2024-06-19 06:26:33,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42871.6, 300 sec: 42376.2). Total num frames: 4859396096. Throughput: 0: 42347.5. Samples: 1126958860. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:33,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 06:26:36,929][26599] Updated weights for policy 0, policy_version 296604 (0.0043) [2024-06-19 06:26:37,374][26579] Signal inference workers to stop experience collection... (16650 times) [2024-06-19 06:26:37,423][26599] InferenceWorker_p0-w0: stopping experience collection (16650 times) [2024-06-19 06:26:37,433][26579] Signal inference workers to resume experience collection... (16650 times) [2024-06-19 06:26:37,440][26599] InferenceWorker_p0-w0: resuming experience collection (16650 times) [2024-06-19 06:26:38,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42871.6, 300 sec: 42431.8). Total num frames: 4859641856. Throughput: 0: 42663.5. Samples: 1127218340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:38,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 06:26:41,010][26599] Updated weights for policy 0, policy_version 296614 (0.0032) [2024-06-19 06:26:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42432.1). Total num frames: 4859822080. Throughput: 0: 42680.9. 
Samples: 1127478000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:43,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 06:26:44,609][26599] Updated weights for policy 0, policy_version 296624 (0.0034) [2024-06-19 06:26:48,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42601.1, 300 sec: 42376.3). Total num frames: 4860035072. Throughput: 0: 42585.8. Samples: 1127603200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:48,380][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 06:26:48,775][26599] Updated weights for policy 0, policy_version 296634 (0.0038) [2024-06-19 06:26:52,483][26599] Updated weights for policy 0, policy_version 296644 (0.0029) [2024-06-19 06:26:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 4860264448. Throughput: 0: 42711.4. Samples: 1127857920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:53,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 06:26:56,650][26599] Updated weights for policy 0, policy_version 296654 (0.0044) [2024-06-19 06:26:58,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4860444672. Throughput: 0: 42552.3. Samples: 1128114820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:26:58,381][26367] Avg episode reward: [(0, '0.793')] [2024-06-19 06:27:00,029][26599] Updated weights for policy 0, policy_version 296664 (0.0046) [2024-06-19 06:27:03,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4860657664. Throughput: 0: 42762.9. Samples: 1128237540. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:27:03,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 06:27:04,415][26599] Updated weights for policy 0, policy_version 296674 (0.0035) [2024-06-19 06:27:07,755][26599] Updated weights for policy 0, policy_version 296684 (0.0048) [2024-06-19 06:27:08,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4860887040. Throughput: 0: 42756.6. Samples: 1128498320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:27:08,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 06:27:12,103][26599] Updated weights for policy 0, policy_version 296694 (0.0044) [2024-06-19 06:27:13,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4861083648. Throughput: 0: 42179.9. Samples: 1128739420. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:27:13,383][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 06:27:15,433][26599] Updated weights for policy 0, policy_version 296704 (0.0029) [2024-06-19 06:27:18,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4861296640. Throughput: 0: 42427.1. Samples: 1128868080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:27:18,381][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 06:27:19,838][26599] Updated weights for policy 0, policy_version 296714 (0.0029) [2024-06-19 06:27:23,013][26599] Updated weights for policy 0, policy_version 296724 (0.0037) [2024-06-19 06:27:23,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 4861542400. Throughput: 0: 42532.9. Samples: 1129132320. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 06:27:23,381][26367] Avg episode reward: [(0, '0.814')] [2024-06-19 06:27:27,805][26599] Updated weights for policy 0, policy_version 296734 (0.0026) [2024-06-19 06:27:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4861706240. Throughput: 0: 42307.7. Samples: 1129381840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:27:28,380][26367] Avg episode reward: [(0, '0.351')] [2024-06-19 06:27:30,876][26599] Updated weights for policy 0, policy_version 296744 (0.0028) [2024-06-19 06:27:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42376.8). Total num frames: 4861952000. Throughput: 0: 42199.5. Samples: 1129502180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:27:33,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 06:27:35,477][26599] Updated weights for policy 0, policy_version 296754 (0.0042) [2024-06-19 06:27:38,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4862164992. Throughput: 0: 42289.4. Samples: 1129760940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:27:38,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 06:27:38,410][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000296764_4862181376.pth... [2024-06-19 06:27:38,411][26599] Updated weights for policy 0, policy_version 296764 (0.0027) [2024-06-19 06:27:38,462][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000296141_4851974144.pth [2024-06-19 06:27:43,146][26599] Updated weights for policy 0, policy_version 296774 (0.0037) [2024-06-19 06:27:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4862345216. Throughput: 0: 42395.2. Samples: 1130022600. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:27:43,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 06:27:46,160][26599] Updated weights for policy 0, policy_version 296784 (0.0038) [2024-06-19 06:27:48,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42431.8). Total num frames: 4862590976. Throughput: 0: 42360.6. Samples: 1130143780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:27:48,381][26367] Avg episode reward: [(0, '0.389')] [2024-06-19 06:27:50,745][26599] Updated weights for policy 0, policy_version 296794 (0.0032) [2024-06-19 06:27:53,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4862803968. Throughput: 0: 42485.4. Samples: 1130410160. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:27:53,380][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 06:27:53,631][26599] Updated weights for policy 0, policy_version 296804 (0.0033) [2024-06-19 06:27:58,380][26367] Fps is (10 sec: 39322.5, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4862984192. Throughput: 0: 42797.0. Samples: 1130665280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:27:58,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 06:27:58,586][26599] Updated weights for policy 0, policy_version 296814 (0.0029) [2024-06-19 06:28:00,089][26579] Signal inference workers to stop experience collection... (16700 times) [2024-06-19 06:28:00,089][26579] Signal inference workers to resume experience collection... (16700 times) [2024-06-19 06:28:00,103][26599] InferenceWorker_p0-w0: stopping experience collection (16700 times) [2024-06-19 06:28:00,103][26599] InferenceWorker_p0-w0: resuming experience collection (16700 times) [2024-06-19 06:28:01,184][26599] Updated weights for policy 0, policy_version 296824 (0.0032) [2024-06-19 06:28:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4863229952. 
Throughput: 0: 42602.3. Samples: 1130785180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:28:03,380][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 06:28:05,999][26599] Updated weights for policy 0, policy_version 296834 (0.0032) [2024-06-19 06:28:08,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4863459328. Throughput: 0: 42645.4. Samples: 1131051360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:28:08,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 06:28:08,769][26599] Updated weights for policy 0, policy_version 296844 (0.0040) [2024-06-19 06:28:13,384][26367] Fps is (10 sec: 39306.9, 60 sec: 42322.8, 300 sec: 42320.7). Total num frames: 4863623168. Throughput: 0: 42658.2. Samples: 1131301620. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:28:13,384][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 06:28:13,885][26599] Updated weights for policy 0, policy_version 296854 (0.0042) [2024-06-19 06:28:16,932][26599] Updated weights for policy 0, policy_version 296864 (0.0035) [2024-06-19 06:28:18,384][26367] Fps is (10 sec: 40944.6, 60 sec: 42868.8, 300 sec: 42486.8). Total num frames: 4863868928. Throughput: 0: 42805.8. Samples: 1131428600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:28:18,385][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 06:28:21,769][26599] Updated weights for policy 0, policy_version 296874 (0.0029) [2024-06-19 06:28:23,380][26367] Fps is (10 sec: 44253.4, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4864065536. Throughput: 0: 42801.5. Samples: 1131687000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:28:23,380][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 06:28:24,598][26599] Updated weights for policy 0, policy_version 296884 (0.0034) [2024-06-19 06:28:28,380][26367] Fps is (10 sec: 39336.0, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4864262144. 
Throughput: 0: 42543.9. Samples: 1131937080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:28:28,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 06:28:29,320][26599] Updated weights for policy 0, policy_version 296894 (0.0037) [2024-06-19 06:28:32,294][26599] Updated weights for policy 0, policy_version 296904 (0.0026) [2024-06-19 06:28:33,380][26367] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4864524288. Throughput: 0: 42716.1. Samples: 1132066000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:28:33,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 06:28:37,225][26599] Updated weights for policy 0, policy_version 296914 (0.0035) [2024-06-19 06:28:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42321.2). Total num frames: 4864688128. Throughput: 0: 42420.4. Samples: 1132319080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 23.0) [2024-06-19 06:28:38,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 06:28:40,178][26599] Updated weights for policy 0, policy_version 296924 (0.0024) [2024-06-19 06:28:43,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4864901120. Throughput: 0: 42155.5. Samples: 1132562280. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:28:43,381][26367] Avg episode reward: [(0, '0.358')] [2024-06-19 06:28:44,827][26599] Updated weights for policy 0, policy_version 296934 (0.0025) [2024-06-19 06:28:48,225][26599] Updated weights for policy 0, policy_version 296944 (0.0036) [2024-06-19 06:28:48,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 4865130496. Throughput: 0: 42336.9. Samples: 1132690340. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:28:48,380][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 06:28:52,398][26599] Updated weights for policy 0, policy_version 296954 (0.0034) [2024-06-19 06:28:53,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4865343488. Throughput: 0: 42192.9. Samples: 1132950040. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:28:53,380][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 06:28:56,237][26599] Updated weights for policy 0, policy_version 296964 (0.0031) [2024-06-19 06:28:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42487.9). Total num frames: 4865556480. Throughput: 0: 42133.3. Samples: 1133197460. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:28:58,380][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 06:28:59,964][26599] Updated weights for policy 0, policy_version 296974 (0.0037) [2024-06-19 06:29:03,384][26367] Fps is (10 sec: 40944.9, 60 sec: 42049.7, 300 sec: 42431.3). Total num frames: 4865753088. Throughput: 0: 42164.6. Samples: 1133326000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:03,384][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 06:29:03,766][26599] Updated weights for policy 0, policy_version 296984 (0.0027) [2024-06-19 06:29:07,472][26599] Updated weights for policy 0, policy_version 296994 (0.0046) [2024-06-19 06:29:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 41779.2, 300 sec: 42321.2). Total num frames: 4865966080. Throughput: 0: 42165.8. Samples: 1133584460. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:08,380][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 06:29:11,843][26599] Updated weights for policy 0, policy_version 297004 (0.0037) [2024-06-19 06:29:13,380][26367] Fps is (10 sec: 44252.5, 60 sec: 42874.1, 300 sec: 42487.3). Total num frames: 4866195456. Throughput: 0: 42048.9. Samples: 1133829280. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:13,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 06:29:15,319][26599] Updated weights for policy 0, policy_version 297014 (0.0037) [2024-06-19 06:29:18,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41781.8, 300 sec: 42376.2). Total num frames: 4866375680. Throughput: 0: 42089.0. Samples: 1133960000. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:18,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 06:29:19,437][26599] Updated weights for policy 0, policy_version 297024 (0.0041) [2024-06-19 06:29:22,888][26599] Updated weights for policy 0, policy_version 297034 (0.0032) [2024-06-19 06:29:23,380][26367] Fps is (10 sec: 40959.0, 60 sec: 42325.1, 300 sec: 42320.7). Total num frames: 4866605056. Throughput: 0: 42199.3. Samples: 1134218060. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:23,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 06:29:27,053][26599] Updated weights for policy 0, policy_version 297044 (0.0038) [2024-06-19 06:29:28,020][26579] Signal inference workers to stop experience collection... (16750 times) [2024-06-19 06:29:28,067][26599] InferenceWorker_p0-w0: stopping experience collection (16750 times) [2024-06-19 06:29:28,076][26579] Signal inference workers to resume experience collection... (16750 times) [2024-06-19 06:29:28,084][26599] InferenceWorker_p0-w0: resuming experience collection (16750 times) [2024-06-19 06:29:28,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42376.3). Total num frames: 4866834432. Throughput: 0: 42465.4. Samples: 1134473220. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:28,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 06:29:31,100][26599] Updated weights for policy 0, policy_version 297054 (0.0038) [2024-06-19 06:29:33,380][26367] Fps is (10 sec: 42599.6, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 4867031040. 
Throughput: 0: 42377.7. Samples: 1134597340. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:33,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 06:29:34,734][26599] Updated weights for policy 0, policy_version 297064 (0.0037) [2024-06-19 06:29:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42321.0). Total num frames: 4867227648. Throughput: 0: 42187.9. Samples: 1134848500. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:38,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 06:29:38,528][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000297073_4867244032.pth... [2024-06-19 06:29:38,570][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000296452_4857069568.pth [2024-06-19 06:29:39,039][26599] Updated weights for policy 0, policy_version 297074 (0.0039) [2024-06-19 06:29:42,749][26599] Updated weights for policy 0, policy_version 297084 (0.0027) [2024-06-19 06:29:43,384][26367] Fps is (10 sec: 40945.0, 60 sec: 42322.8, 300 sec: 42320.2). Total num frames: 4867440640. Throughput: 0: 42403.6. Samples: 1135105780. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:43,384][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 06:29:46,643][26599] Updated weights for policy 0, policy_version 297094 (0.0023) [2024-06-19 06:29:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4867653632. Throughput: 0: 42295.9. Samples: 1135229160. Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:48,380][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 06:29:50,435][26599] Updated weights for policy 0, policy_version 297104 (0.0028) [2024-06-19 06:29:53,380][26367] Fps is (10 sec: 44253.5, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4867883008. Throughput: 0: 42150.7. Samples: 1135481240. 
Policy #0 lag: (min: 1.0, avg: 11.6, max: 24.0) [2024-06-19 06:29:53,380][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 06:29:54,376][26599] Updated weights for policy 0, policy_version 297114 (0.0043) [2024-06-19 06:29:58,155][26599] Updated weights for policy 0, policy_version 297124 (0.0028) [2024-06-19 06:29:58,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42052.1, 300 sec: 42376.2). Total num frames: 4868079616. Throughput: 0: 42460.8. Samples: 1135740020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:29:58,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 06:30:01,884][26599] Updated weights for policy 0, policy_version 297134 (0.0032) [2024-06-19 06:30:03,380][26367] Fps is (10 sec: 39320.7, 60 sec: 42054.7, 300 sec: 42320.7). Total num frames: 4868276224. Throughput: 0: 42217.7. Samples: 1135859800. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:03,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 06:30:06,024][26599] Updated weights for policy 0, policy_version 297144 (0.0047) [2024-06-19 06:30:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4868505600. Throughput: 0: 42225.5. Samples: 1136118200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:08,384][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 06:30:09,460][26599] Updated weights for policy 0, policy_version 297154 (0.0043) [2024-06-19 06:30:13,383][26367] Fps is (10 sec: 42587.3, 60 sec: 41777.3, 300 sec: 42375.9). Total num frames: 4868702208. Throughput: 0: 42256.1. Samples: 1136374860. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:13,383][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 06:30:13,654][26599] Updated weights for policy 0, policy_version 297164 (0.0034) [2024-06-19 06:30:17,550][26599] Updated weights for policy 0, policy_version 297174 (0.0040) [2024-06-19 06:30:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4868915200. Throughput: 0: 42173.2. Samples: 1136495140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:18,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 06:30:21,329][26599] Updated weights for policy 0, policy_version 297184 (0.0043) [2024-06-19 06:30:23,380][26367] Fps is (10 sec: 44248.3, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4869144576. Throughput: 0: 42251.5. Samples: 1136749820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:23,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 06:30:24,996][26599] Updated weights for policy 0, policy_version 297194 (0.0042) [2024-06-19 06:30:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 4869341184. Throughput: 0: 42258.8. Samples: 1137007280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:28,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 06:30:29,428][26599] Updated weights for policy 0, policy_version 297204 (0.0039) [2024-06-19 06:30:32,730][26599] Updated weights for policy 0, policy_version 297214 (0.0030) [2024-06-19 06:30:33,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4869570560. Throughput: 0: 42169.4. Samples: 1137126780. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:33,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 06:30:36,942][26599] Updated weights for policy 0, policy_version 297224 (0.0047) [2024-06-19 06:30:38,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 4869799936. Throughput: 0: 42210.9. Samples: 1137380740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:38,384][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 06:30:40,620][26599] Updated weights for policy 0, policy_version 297234 (0.0034) [2024-06-19 06:30:43,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42054.8, 300 sec: 42321.2). Total num frames: 4869963776. Throughput: 0: 42213.4. Samples: 1137639620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:43,381][26367] Avg episode reward: [(0, '0.805')] [2024-06-19 06:30:44,474][26599] Updated weights for policy 0, policy_version 297244 (0.0047) [2024-06-19 06:30:48,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4870193152. Throughput: 0: 42254.4. Samples: 1137761240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:48,380][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 06:30:48,515][26599] Updated weights for policy 0, policy_version 297254 (0.0035) [2024-06-19 06:30:52,399][26599] Updated weights for policy 0, policy_version 297264 (0.0037) [2024-06-19 06:30:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4870406144. Throughput: 0: 42226.4. Samples: 1138018380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:53,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 06:30:56,220][26599] Updated weights for policy 0, policy_version 297274 (0.0036) [2024-06-19 06:30:58,242][26579] Signal inference workers to stop experience collection... 
(16800 times) [2024-06-19 06:30:58,291][26599] InferenceWorker_p0-w0: stopping experience collection (16800 times) [2024-06-19 06:30:58,366][26579] Signal inference workers to resume experience collection... (16800 times) [2024-06-19 06:30:58,366][26599] InferenceWorker_p0-w0: resuming experience collection (16800 times) [2024-06-19 06:30:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42376.2). Total num frames: 4870602752. Throughput: 0: 42293.7. Samples: 1138277960. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:30:58,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 06:30:59,987][26599] Updated weights for policy 0, policy_version 297284 (0.0036) [2024-06-19 06:31:03,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4870832128. Throughput: 0: 42318.7. Samples: 1138399480. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 06:31:03,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 06:31:03,899][26599] Updated weights for policy 0, policy_version 297294 (0.0027) [2024-06-19 06:31:07,644][26599] Updated weights for policy 0, policy_version 297304 (0.0038) [2024-06-19 06:31:08,384][26367] Fps is (10 sec: 42582.6, 60 sec: 42049.8, 300 sec: 42320.2). Total num frames: 4871028736. Throughput: 0: 42238.5. Samples: 1138650700. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:08,384][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 06:31:11,946][26599] Updated weights for policy 0, policy_version 297314 (0.0043) [2024-06-19 06:31:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42327.2, 300 sec: 42376.3). Total num frames: 4871241728. Throughput: 0: 41993.4. Samples: 1138896980. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:13,381][26367] Avg episode reward: [(0, '0.761')] [2024-06-19 06:31:15,758][26599] Updated weights for policy 0, policy_version 297324 (0.0032) [2024-06-19 06:31:18,380][26367] Fps is (10 sec: 44253.1, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4871471104. Throughput: 0: 42234.2. Samples: 1139027320. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:18,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 06:31:19,978][26599] Updated weights for policy 0, policy_version 297334 (0.0032) [2024-06-19 06:31:23,362][26599] Updated weights for policy 0, policy_version 297344 (0.0035) [2024-06-19 06:31:23,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4871684096. Throughput: 0: 42250.3. Samples: 1139282000. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:23,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 06:31:27,516][26599] Updated weights for policy 0, policy_version 297354 (0.0038) [2024-06-19 06:31:28,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4871864320. Throughput: 0: 42107.1. Samples: 1139534440. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:28,389][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 06:31:31,176][26599] Updated weights for policy 0, policy_version 297364 (0.0035) [2024-06-19 06:31:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4872093696. Throughput: 0: 42116.0. Samples: 1139656460. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:33,380][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 06:31:35,173][26599] Updated weights for policy 0, policy_version 297374 (0.0042) [2024-06-19 06:31:38,380][26367] Fps is (10 sec: 44236.5, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4872306688. Throughput: 0: 42203.9. Samples: 1139917560. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:38,383][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 06:31:38,400][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000297382_4872306688.pth... [2024-06-19 06:31:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000296764_4862181376.pth [2024-06-19 06:31:38,763][26599] Updated weights for policy 0, policy_version 297384 (0.0024) [2024-06-19 06:31:42,962][26599] Updated weights for policy 0, policy_version 297394 (0.0030) [2024-06-19 06:31:43,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 4872503296. Throughput: 0: 41904.8. Samples: 1140163680. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:43,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 06:31:46,609][26599] Updated weights for policy 0, policy_version 297404 (0.0043) [2024-06-19 06:31:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4872716288. Throughput: 0: 41941.0. Samples: 1140286820. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:48,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 06:31:50,864][26599] Updated weights for policy 0, policy_version 297414 (0.0040) [2024-06-19 06:31:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4872929280. Throughput: 0: 42071.8. Samples: 1140543780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:53,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 06:31:54,367][26599] Updated weights for policy 0, policy_version 297424 (0.0031) [2024-06-19 06:31:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4873125888. Throughput: 0: 42093.0. Samples: 1140791160. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:31:58,381][26367] Avg episode reward: [(0, '0.782')] [2024-06-19 06:31:58,847][26599] Updated weights for policy 0, policy_version 297434 (0.0045) [2024-06-19 06:32:02,200][26599] Updated weights for policy 0, policy_version 297444 (0.0044) [2024-06-19 06:32:03,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4873355264. Throughput: 0: 41995.6. Samples: 1140917120. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:32:03,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 06:32:06,759][26599] Updated weights for policy 0, policy_version 297454 (0.0036) [2024-06-19 06:32:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41781.8, 300 sec: 42209.6). Total num frames: 4873535488. Throughput: 0: 42017.4. Samples: 1141172780. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:32:08,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 06:32:10,097][26599] Updated weights for policy 0, policy_version 297464 (0.0035) [2024-06-19 06:32:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4873764864. Throughput: 0: 41968.0. Samples: 1141423000. Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:32:13,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 06:32:14,296][26599] Updated weights for policy 0, policy_version 297474 (0.0023) [2024-06-19 06:32:17,978][26599] Updated weights for policy 0, policy_version 297484 (0.0041) [2024-06-19 06:32:18,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4873994240. Throughput: 0: 42062.2. Samples: 1141549260. 
Policy #0 lag: (min: 1.0, avg: 10.8, max: 21.0) [2024-06-19 06:32:18,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 06:32:22,142][26599] Updated weights for policy 0, policy_version 297494 (0.0044) [2024-06-19 06:32:23,091][26579] Signal inference workers to stop experience collection... (16850 times) [2024-06-19 06:32:23,121][26599] InferenceWorker_p0-w0: stopping experience collection (16850 times) [2024-06-19 06:32:23,144][26579] Signal inference workers to resume experience collection... (16850 times) [2024-06-19 06:32:23,144][26599] InferenceWorker_p0-w0: resuming experience collection (16850 times) [2024-06-19 06:32:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4874190848. Throughput: 0: 41949.4. Samples: 1141805280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:32:23,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 06:32:25,615][26599] Updated weights for policy 0, policy_version 297504 (0.0023) [2024-06-19 06:32:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4874403840. Throughput: 0: 42114.7. Samples: 1142058840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:32:28,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 06:32:30,189][26599] Updated weights for policy 0, policy_version 297514 (0.0040) [2024-06-19 06:32:33,379][26599] Updated weights for policy 0, policy_version 297524 (0.0035) [2024-06-19 06:32:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.2, 300 sec: 42265.2). Total num frames: 4874633216. Throughput: 0: 42228.8. Samples: 1142187120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:32:33,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 06:32:37,928][26599] Updated weights for policy 0, policy_version 297534 (0.0038) [2024-06-19 06:32:38,380][26367] Fps is (10 sec: 39321.8, 60 sec: 41506.2, 300 sec: 42209.6). Total num frames: 4874797056. 
Throughput: 0: 42100.1. Samples: 1142438280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:32:38,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 06:32:41,042][26599] Updated weights for policy 0, policy_version 297544 (0.0029) [2024-06-19 06:32:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4875042816. Throughput: 0: 42331.8. Samples: 1142696100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:32:43,384][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 06:32:45,714][26599] Updated weights for policy 0, policy_version 297554 (0.0031) [2024-06-19 06:32:48,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4875239424. Throughput: 0: 42270.7. Samples: 1142819300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:32:48,380][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 06:32:48,787][26599] Updated weights for policy 0, policy_version 297564 (0.0036) [2024-06-19 06:32:53,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41779.2, 300 sec: 42209.6). Total num frames: 4875436032. Throughput: 0: 42131.9. Samples: 1143068720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:32:53,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 06:32:53,586][26599] Updated weights for policy 0, policy_version 297574 (0.0038) [2024-06-19 06:32:56,590][26599] Updated weights for policy 0, policy_version 297584 (0.0022) [2024-06-19 06:32:58,384][26367] Fps is (10 sec: 42582.2, 60 sec: 42322.7, 300 sec: 42153.5). Total num frames: 4875665408. Throughput: 0: 42194.8. Samples: 1143321920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:32:58,385][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 06:33:01,042][26599] Updated weights for policy 0, policy_version 297594 (0.0033) [2024-06-19 06:33:03,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.2, 300 sec: 42098.5). Total num frames: 4875878400. 
Throughput: 0: 42327.1. Samples: 1143453980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:33:03,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 06:33:04,221][26599] Updated weights for policy 0, policy_version 297604 (0.0044) [2024-06-19 06:33:08,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42598.3, 300 sec: 42265.7). Total num frames: 4876091392. Throughput: 0: 42193.3. Samples: 1143703980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:33:08,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 06:33:09,049][26599] Updated weights for policy 0, policy_version 297614 (0.0039) [2024-06-19 06:33:12,108][26599] Updated weights for policy 0, policy_version 297624 (0.0045) [2024-06-19 06:33:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42154.6). Total num frames: 4876304384. Throughput: 0: 42196.1. Samples: 1143957660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:33:13,380][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 06:33:16,458][26599] Updated weights for policy 0, policy_version 297634 (0.0034) [2024-06-19 06:33:18,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41776.7, 300 sec: 42153.6). Total num frames: 4876500992. Throughput: 0: 42153.1. Samples: 1144084160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:33:18,384][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 06:33:19,873][26599] Updated weights for policy 0, policy_version 297644 (0.0039) [2024-06-19 06:33:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4876730368. Throughput: 0: 42252.5. Samples: 1144339640. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:33:23,380][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 06:33:23,923][26599] Updated weights for policy 0, policy_version 297654 (0.0041) [2024-06-19 06:33:27,456][26599] Updated weights for policy 0, policy_version 297664 (0.0028) [2024-06-19 06:33:28,380][26367] Fps is (10 sec: 44253.0, 60 sec: 42325.4, 300 sec: 42098.6). Total num frames: 4876943360. Throughput: 0: 42307.3. Samples: 1144599920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:33:28,380][26367] Avg episode reward: [(0, '0.324')] [2024-06-19 06:33:31,351][26599] Updated weights for policy 0, policy_version 297674 (0.0036) [2024-06-19 06:33:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.3, 300 sec: 42209.6). Total num frames: 4877139968. Throughput: 0: 42364.9. Samples: 1144725720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:33:33,380][26367] Avg episode reward: [(0, '0.284')] [2024-06-19 06:33:35,382][26599] Updated weights for policy 0, policy_version 297684 (0.0036) [2024-06-19 06:33:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42320.7). Total num frames: 4877385728. Throughput: 0: 42487.2. Samples: 1144980640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:33:38,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 06:33:38,415][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000297692_4877385728.pth... [2024-06-19 06:33:38,482][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000297073_4867244032.pth [2024-06-19 06:33:38,908][26599] Updated weights for policy 0, policy_version 297694 (0.0032) [2024-06-19 06:33:43,097][26599] Updated weights for policy 0, policy_version 297704 (0.0046) [2024-06-19 06:33:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4877582336. Throughput: 0: 42463.5. Samples: 1145232620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:33:43,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 06:33:46,031][26579] Signal inference workers to stop experience collection... (16900 times) [2024-06-19 06:33:46,032][26579] Signal inference workers to resume experience collection... (16900 times) [2024-06-19 06:33:46,046][26599] InferenceWorker_p0-w0: stopping experience collection (16900 times) [2024-06-19 06:33:46,047][26599] InferenceWorker_p0-w0: resuming experience collection (16900 times) [2024-06-19 06:33:47,158][26599] Updated weights for policy 0, policy_version 297714 (0.0039) [2024-06-19 06:33:48,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42052.3, 300 sec: 42098.6). Total num frames: 4877762560. Throughput: 0: 42349.9. Samples: 1145359720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:33:48,380][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 06:33:50,829][26599] Updated weights for policy 0, policy_version 297724 (0.0029) [2024-06-19 06:33:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 4878008320. Throughput: 0: 42426.7. Samples: 1145613180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:33:53,380][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 06:33:54,957][26599] Updated weights for policy 0, policy_version 297734 (0.0042) [2024-06-19 06:33:58,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42601.1, 300 sec: 42265.7). Total num frames: 4878221312. Throughput: 0: 42482.6. Samples: 1145869380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:33:58,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 06:33:58,578][26599] Updated weights for policy 0, policy_version 297744 (0.0029) [2024-06-19 06:34:02,761][26599] Updated weights for policy 0, policy_version 297754 (0.0037) [2024-06-19 06:34:03,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4878417920. Throughput: 0: 42500.7. 
Samples: 1145996540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:34:03,381][26367] Avg episode reward: [(0, '0.810')] [2024-06-19 06:34:06,481][26599] Updated weights for policy 0, policy_version 297764 (0.0047) [2024-06-19 06:34:08,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42209.6). Total num frames: 4878647296. Throughput: 0: 42441.6. Samples: 1146249520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:34:08,382][26367] Avg episode reward: [(0, '0.838')] [2024-06-19 06:34:10,390][26599] Updated weights for policy 0, policy_version 297774 (0.0024) [2024-06-19 06:34:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4878843904. Throughput: 0: 42335.6. Samples: 1146505020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:34:13,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 06:34:14,191][26599] Updated weights for policy 0, policy_version 297784 (0.0034) [2024-06-19 06:34:18,027][26599] Updated weights for policy 0, policy_version 297794 (0.0033) [2024-06-19 06:34:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42874.0, 300 sec: 42265.2). Total num frames: 4879073280. Throughput: 0: 42406.5. Samples: 1146634020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:34:18,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 06:34:22,102][26599] Updated weights for policy 0, policy_version 297804 (0.0040) [2024-06-19 06:34:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42098.5). Total num frames: 4879253504. Throughput: 0: 42349.6. Samples: 1146886380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:34:23,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 06:34:25,595][26599] Updated weights for policy 0, policy_version 297814 (0.0032) [2024-06-19 06:34:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4879482880. Throughput: 0: 42434.7. 
Samples: 1147142180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:34:28,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 06:34:29,733][26599] Updated weights for policy 0, policy_version 297824 (0.0036) [2024-06-19 06:34:33,062][26599] Updated weights for policy 0, policy_version 297834 (0.0028) [2024-06-19 06:34:33,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4879712256. Throughput: 0: 42507.9. Samples: 1147272580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:34:33,384][26367] Avg episode reward: [(0, '0.793')] [2024-06-19 06:34:37,414][26599] Updated weights for policy 0, policy_version 297844 (0.0036) [2024-06-19 06:34:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42265.7). Total num frames: 4879908864. Throughput: 0: 42439.9. Samples: 1147522980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:34:38,381][26367] Avg episode reward: [(0, '0.798')] [2024-06-19 06:34:40,931][26599] Updated weights for policy 0, policy_version 297854 (0.0038) [2024-06-19 06:34:43,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42265.1). Total num frames: 4880121856. Throughput: 0: 42320.3. Samples: 1147773800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 06:34:43,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 06:34:45,133][26599] Updated weights for policy 0, policy_version 297864 (0.0038) [2024-06-19 06:34:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42209.6). Total num frames: 4880334848. Throughput: 0: 42407.1. Samples: 1147904860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:34:48,381][26367] Avg episode reward: [(0, '0.372')] [2024-06-19 06:34:49,161][26599] Updated weights for policy 0, policy_version 297874 (0.0045) [2024-06-19 06:34:53,027][26599] Updated weights for policy 0, policy_version 297884 (0.0037) [2024-06-19 06:34:53,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.1, 300 sec: 42209.6). Total num frames: 4880531456. Throughput: 0: 42383.5. Samples: 1148156780. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:34:53,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 06:34:56,630][26599] Updated weights for policy 0, policy_version 297894 (0.0028) [2024-06-19 06:34:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4880777216. Throughput: 0: 42301.2. Samples: 1148408580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:34:58,381][26367] Avg episode reward: [(0, '0.362')] [2024-06-19 06:35:00,580][26599] Updated weights for policy 0, policy_version 297904 (0.0037) [2024-06-19 06:35:03,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42154.1). Total num frames: 4880941056. Throughput: 0: 42462.8. Samples: 1148544840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:03,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 06:35:04,387][26599] Updated weights for policy 0, policy_version 297914 (0.0051) [2024-06-19 06:35:08,279][26599] Updated weights for policy 0, policy_version 297924 (0.0036) [2024-06-19 06:35:08,383][26367] Fps is (10 sec: 40951.0, 60 sec: 42323.8, 300 sec: 42320.8). Total num frames: 4881186816. Throughput: 0: 42396.1. Samples: 1148794300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:08,383][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 06:35:12,367][26579] Signal inference workers to stop experience collection... 
(16950 times) [2024-06-19 06:35:12,408][26599] InferenceWorker_p0-w0: stopping experience collection (16950 times) [2024-06-19 06:35:12,428][26579] Signal inference workers to resume experience collection... (16950 times) [2024-06-19 06:35:12,428][26599] InferenceWorker_p0-w0: resuming experience collection (16950 times) [2024-06-19 06:35:12,431][26599] Updated weights for policy 0, policy_version 297934 (0.0028) [2024-06-19 06:35:13,380][26367] Fps is (10 sec: 47513.2, 60 sec: 42871.4, 300 sec: 42376.3). Total num frames: 4881416192. Throughput: 0: 42222.1. Samples: 1149042180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:13,381][26367] Avg episode reward: [(0, '0.849')] [2024-06-19 06:35:15,989][26599] Updated weights for policy 0, policy_version 297944 (0.0030) [2024-06-19 06:35:18,380][26367] Fps is (10 sec: 40969.2, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4881596416. Throughput: 0: 42292.4. Samples: 1149175740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:18,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 06:35:19,991][26599] Updated weights for policy 0, policy_version 297954 (0.0043) [2024-06-19 06:35:23,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4881809408. Throughput: 0: 42300.4. Samples: 1149426500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:23,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 06:35:23,604][26599] Updated weights for policy 0, policy_version 297964 (0.0037) [2024-06-19 06:35:27,624][26599] Updated weights for policy 0, policy_version 297974 (0.0035) [2024-06-19 06:35:28,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4882038784. Throughput: 0: 42406.4. Samples: 1149682080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:28,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 06:35:31,454][26599] Updated weights for policy 0, policy_version 297984 (0.0032) [2024-06-19 06:35:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 42154.1). Total num frames: 4882235392. Throughput: 0: 42479.9. Samples: 1149816460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:33,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 06:35:35,383][26599] Updated weights for policy 0, policy_version 297994 (0.0038) [2024-06-19 06:35:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4882448384. Throughput: 0: 42465.5. Samples: 1150067720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:38,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 06:35:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000298001_4882448384.pth... [2024-06-19 06:35:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000297382_4872306688.pth [2024-06-19 06:35:39,044][26599] Updated weights for policy 0, policy_version 298004 (0.0028) [2024-06-19 06:35:43,148][26599] Updated weights for policy 0, policy_version 298014 (0.0045) [2024-06-19 06:35:43,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4882677760. Throughput: 0: 42607.2. Samples: 1150325900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:43,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 06:35:46,870][26599] Updated weights for policy 0, policy_version 298024 (0.0030) [2024-06-19 06:35:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42265.1). Total num frames: 4882874368. Throughput: 0: 42395.1. Samples: 1150452620. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:48,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 06:35:50,741][26599] Updated weights for policy 0, policy_version 298034 (0.0045) [2024-06-19 06:35:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42376.2). Total num frames: 4883103744. Throughput: 0: 42509.3. Samples: 1150707120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:53,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 06:35:54,437][26599] Updated weights for policy 0, policy_version 298044 (0.0040) [2024-06-19 06:35:58,358][26599] Updated weights for policy 0, policy_version 298054 (0.0035) [2024-06-19 06:35:58,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42325.5, 300 sec: 42320.7). Total num frames: 4883316736. Throughput: 0: 42799.3. Samples: 1150968140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 23.0) [2024-06-19 06:35:58,380][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 06:36:02,149][26599] Updated weights for policy 0, policy_version 298064 (0.0043) [2024-06-19 06:36:03,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42321.2). Total num frames: 4883513344. Throughput: 0: 42613.2. Samples: 1151093340. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:03,381][26367] Avg episode reward: [(0, '0.488')] [2024-06-19 06:36:06,034][26599] Updated weights for policy 0, policy_version 298074 (0.0043) [2024-06-19 06:36:08,384][26367] Fps is (10 sec: 42582.3, 60 sec: 42597.4, 300 sec: 42375.7). Total num frames: 4883742720. Throughput: 0: 42593.9. Samples: 1151343380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:08,385][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 06:36:09,657][26599] Updated weights for policy 0, policy_version 298084 (0.0036) [2024-06-19 06:36:13,380][26367] Fps is (10 sec: 42599.6, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4883939328. Throughput: 0: 42747.1. Samples: 1151605700. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:13,380][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 06:36:13,768][26599] Updated weights for policy 0, policy_version 298094 (0.0041) [2024-06-19 06:36:17,206][26599] Updated weights for policy 0, policy_version 298104 (0.0034) [2024-06-19 06:36:18,380][26367] Fps is (10 sec: 40974.9, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4884152320. Throughput: 0: 42447.7. Samples: 1151726600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:18,381][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 06:36:21,509][26599] Updated weights for policy 0, policy_version 298114 (0.0025) [2024-06-19 06:36:23,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 4884381696. Throughput: 0: 42483.4. Samples: 1151979480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:23,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 06:36:24,900][26599] Updated weights for policy 0, policy_version 298124 (0.0035) [2024-06-19 06:36:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4884561920. Throughput: 0: 42514.7. Samples: 1152239060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:28,380][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 06:36:29,466][26599] Updated weights for policy 0, policy_version 298134 (0.0034) [2024-06-19 06:36:29,902][26579] Signal inference workers to stop experience collection... (17000 times) [2024-06-19 06:36:29,943][26599] InferenceWorker_p0-w0: stopping experience collection (17000 times) [2024-06-19 06:36:30,023][26579] Signal inference workers to resume experience collection... 
(17000 times) [2024-06-19 06:36:30,024][26599] InferenceWorker_p0-w0: resuming experience collection (17000 times) [2024-06-19 06:36:32,565][26599] Updated weights for policy 0, policy_version 298144 (0.0033) [2024-06-19 06:36:33,384][26367] Fps is (10 sec: 40945.6, 60 sec: 42595.9, 300 sec: 42320.2). Total num frames: 4884791296. Throughput: 0: 42391.7. Samples: 1152360400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:33,385][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 06:36:37,273][26599] Updated weights for policy 0, policy_version 298154 (0.0034) [2024-06-19 06:36:38,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4885004288. Throughput: 0: 42457.2. Samples: 1152617700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:38,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 06:36:40,245][26599] Updated weights for policy 0, policy_version 298164 (0.0041) [2024-06-19 06:36:43,380][26367] Fps is (10 sec: 40974.4, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4885200896. Throughput: 0: 42433.6. Samples: 1152877660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:43,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 06:36:44,843][26599] Updated weights for policy 0, policy_version 298174 (0.0029) [2024-06-19 06:36:48,033][26599] Updated weights for policy 0, policy_version 298184 (0.0038) [2024-06-19 06:36:48,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4885446656. Throughput: 0: 42297.1. Samples: 1152996700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:48,383][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 06:36:52,574][26599] Updated weights for policy 0, policy_version 298194 (0.0038) [2024-06-19 06:36:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4885643264. Throughput: 0: 42520.7. Samples: 1153256660. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:53,383][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 06:36:55,614][26599] Updated weights for policy 0, policy_version 298204 (0.0038) [2024-06-19 06:36:58,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4885839872. Throughput: 0: 42597.4. Samples: 1153522580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:36:58,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 06:37:00,135][26599] Updated weights for policy 0, policy_version 298214 (0.0028) [2024-06-19 06:37:03,185][26599] Updated weights for policy 0, policy_version 298224 (0.0044) [2024-06-19 06:37:03,380][26367] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 4886102016. Throughput: 0: 42479.1. Samples: 1153638160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:37:03,381][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 06:37:07,675][26599] Updated weights for policy 0, policy_version 298234 (0.0039) [2024-06-19 06:37:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42054.9, 300 sec: 42376.3). Total num frames: 4886265856. Throughput: 0: 42611.3. Samples: 1153896980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:37:08,380][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 06:37:10,888][26599] Updated weights for policy 0, policy_version 298244 (0.0035) [2024-06-19 06:37:13,380][26367] Fps is (10 sec: 37683.1, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4886478848. Throughput: 0: 42558.1. Samples: 1154154180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 06:37:13,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 06:37:15,935][26599] Updated weights for policy 0, policy_version 298254 (0.0034) [2024-06-19 06:37:18,381][26367] Fps is (10 sec: 45870.2, 60 sec: 42870.8, 300 sec: 42487.2). Total num frames: 4886724608. Throughput: 0: 42529.2. Samples: 1154274100. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:37:18,382][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 06:37:19,182][26599] Updated weights for policy 0, policy_version 298264 (0.0028) [2024-06-19 06:37:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.4, 300 sec: 42376.2). Total num frames: 4886904832. Throughput: 0: 42467.2. Samples: 1154528720. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:37:23,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 06:37:23,404][26599] Updated weights for policy 0, policy_version 298274 (0.0040) [2024-06-19 06:37:26,860][26599] Updated weights for policy 0, policy_version 298284 (0.0027) [2024-06-19 06:37:28,380][26367] Fps is (10 sec: 37686.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4887101440. Throughput: 0: 42227.6. Samples: 1154777900. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:37:28,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 06:37:31,019][26599] Updated weights for policy 0, policy_version 298294 (0.0036) [2024-06-19 06:37:33,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42328.0, 300 sec: 42487.3). Total num frames: 4887330816. Throughput: 0: 42385.0. Samples: 1154904020. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:37:33,380][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 06:37:34,946][26599] Updated weights for policy 0, policy_version 298304 (0.0025) [2024-06-19 06:37:38,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 4887543808. Throughput: 0: 42375.3. Samples: 1155163540. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:37:38,380][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 06:37:38,488][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000298313_4887560192.pth... 
[2024-06-19 06:37:38,539][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000297692_4877385728.pth [2024-06-19 06:37:38,685][26599] Updated weights for policy 0, policy_version 298314 (0.0043) [2024-06-19 06:37:42,721][26599] Updated weights for policy 0, policy_version 298324 (0.0041) [2024-06-19 06:37:43,384][26367] Fps is (10 sec: 42582.1, 60 sec: 42595.9, 300 sec: 42431.2). Total num frames: 4887756800. Throughput: 0: 41947.1. Samples: 1155410360. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:37:43,385][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 06:37:46,524][26599] Updated weights for policy 0, policy_version 298334 (0.0041) [2024-06-19 06:37:47,041][26579] Signal inference workers to stop experience collection... (17050 times) [2024-06-19 06:37:47,041][26579] Signal inference workers to resume experience collection... (17050 times) [2024-06-19 06:37:47,067][26599] InferenceWorker_p0-w0: stopping experience collection (17050 times) [2024-06-19 06:37:47,067][26599] InferenceWorker_p0-w0: resuming experience collection (17050 times) [2024-06-19 06:37:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 4887953408. Throughput: 0: 42173.4. Samples: 1155535960. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:37:48,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 06:37:50,435][26599] Updated weights for policy 0, policy_version 298344 (0.0028) [2024-06-19 06:37:53,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42052.3, 300 sec: 42376.8). Total num frames: 4888166400. Throughput: 0: 42178.1. Samples: 1155795000. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:37:53,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 06:37:54,487][26599] Updated weights for policy 0, policy_version 298354 (0.0042) [2024-06-19 06:37:58,095][26599] Updated weights for policy 0, policy_version 298364 (0.0032) [2024-06-19 06:37:58,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4888395776. Throughput: 0: 41836.0. Samples: 1156036800. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:37:58,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 06:38:02,186][26599] Updated weights for policy 0, policy_version 298374 (0.0055) [2024-06-19 06:38:03,380][26367] Fps is (10 sec: 44237.1, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 4888608768. Throughput: 0: 42078.8. Samples: 1156167600. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:38:03,381][26367] Avg episode reward: [(0, '0.832')] [2024-06-19 06:38:05,685][26599] Updated weights for policy 0, policy_version 298384 (0.0027) [2024-06-19 06:38:08,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4888788992. Throughput: 0: 42140.9. Samples: 1156425060. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:38:08,384][26367] Avg episode reward: [(0, '0.386')] [2024-06-19 06:38:09,792][26599] Updated weights for policy 0, policy_version 298394 (0.0031) [2024-06-19 06:38:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42432.3). Total num frames: 4889018368. Throughput: 0: 42109.5. Samples: 1156672820. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:38:13,380][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 06:38:13,743][26599] Updated weights for policy 0, policy_version 298404 (0.0041) [2024-06-19 06:38:17,527][26599] Updated weights for policy 0, policy_version 298414 (0.0034) [2024-06-19 06:38:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 41779.9, 300 sec: 42376.2). Total num frames: 4889231360. Throughput: 0: 42218.1. Samples: 1156803840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:38:18,389][26367] Avg episode reward: [(0, '0.791')] [2024-06-19 06:38:21,273][26599] Updated weights for policy 0, policy_version 298424 (0.0030) [2024-06-19 06:38:23,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4889427968. Throughput: 0: 42083.0. Samples: 1157057280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 24.0) [2024-06-19 06:38:23,381][26367] Avg episode reward: [(0, '0.791')] [2024-06-19 06:38:25,150][26599] Updated weights for policy 0, policy_version 298434 (0.0036) [2024-06-19 06:38:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4889657344. Throughput: 0: 42383.4. Samples: 1157317460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:38:28,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 06:38:29,151][26599] Updated weights for policy 0, policy_version 298444 (0.0031) [2024-06-19 06:38:32,646][26599] Updated weights for policy 0, policy_version 298454 (0.0037) [2024-06-19 06:38:33,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4889886720. Throughput: 0: 42457.6. Samples: 1157446560. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:38:33,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 06:38:36,919][26599] Updated weights for policy 0, policy_version 298464 (0.0033) [2024-06-19 06:38:38,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41779.0, 300 sec: 42265.1). Total num frames: 4890050560. Throughput: 0: 42343.0. Samples: 1157700440. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:38:38,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 06:38:40,551][26599] Updated weights for policy 0, policy_version 298474 (0.0032) [2024-06-19 06:38:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42327.9, 300 sec: 42487.3). Total num frames: 4890296320. Throughput: 0: 42610.6. Samples: 1157954280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:38:43,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 06:38:44,613][26599] Updated weights for policy 0, policy_version 298484 (0.0036) [2024-06-19 06:38:48,378][26599] Updated weights for policy 0, policy_version 298494 (0.0028) [2024-06-19 06:38:48,380][26367] Fps is (10 sec: 47514.2, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 4890525696. Throughput: 0: 42680.4. Samples: 1158088220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:38:48,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 06:38:52,072][26579] Signal inference workers to stop experience collection... (17100 times) [2024-06-19 06:38:52,072][26579] Signal inference workers to resume experience collection... (17100 times) [2024-06-19 06:38:52,096][26599] InferenceWorker_p0-w0: stopping experience collection (17100 times) [2024-06-19 06:38:52,097][26599] InferenceWorker_p0-w0: resuming experience collection (17100 times) [2024-06-19 06:38:52,230][26599] Updated weights for policy 0, policy_version 298504 (0.0031) [2024-06-19 06:38:53,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4890705920. Throughput: 0: 42531.6. 
Samples: 1158338980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:38:53,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 06:38:56,245][26599] Updated weights for policy 0, policy_version 298514 (0.0041) [2024-06-19 06:38:58,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4890935296. Throughput: 0: 42638.1. Samples: 1158591540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:38:58,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 06:38:59,846][26599] Updated weights for policy 0, policy_version 298524 (0.0034) [2024-06-19 06:39:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4891131904. Throughput: 0: 42625.0. Samples: 1158721960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:39:03,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 06:39:03,935][26599] Updated weights for policy 0, policy_version 298534 (0.0028) [2024-06-19 06:39:07,732][26599] Updated weights for policy 0, policy_version 298544 (0.0032) [2024-06-19 06:39:08,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4891361280. Throughput: 0: 42684.1. Samples: 1158978060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:39:08,380][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 06:39:11,518][26599] Updated weights for policy 0, policy_version 298554 (0.0043) [2024-06-19 06:39:13,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4891590656. Throughput: 0: 42571.2. Samples: 1159233160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:39:13,380][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 06:39:15,619][26599] Updated weights for policy 0, policy_version 298564 (0.0030) [2024-06-19 06:39:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42487.4). Total num frames: 4891787264. Throughput: 0: 42526.8. 
Samples: 1159360260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:39:18,380][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 06:39:19,205][26599] Updated weights for policy 0, policy_version 298574 (0.0033) [2024-06-19 06:39:23,380][26367] Fps is (10 sec: 39320.7, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4891983872. Throughput: 0: 42503.5. Samples: 1159613100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:39:23,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 06:39:23,690][26599] Updated weights for policy 0, policy_version 298584 (0.0034) [2024-06-19 06:39:26,846][26599] Updated weights for policy 0, policy_version 298594 (0.0050) [2024-06-19 06:39:28,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4892213248. Throughput: 0: 42555.5. Samples: 1159869280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:39:28,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 06:39:31,262][26599] Updated weights for policy 0, policy_version 298604 (0.0031) [2024-06-19 06:39:33,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4892426240. Throughput: 0: 42463.1. Samples: 1159999060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:39:33,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 06:39:34,475][26599] Updated weights for policy 0, policy_version 298614 (0.0035) [2024-06-19 06:39:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 4892639232. Throughput: 0: 42557.2. Samples: 1160254060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:39:38,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 06:39:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000298623_4892639232.pth... 
[2024-06-19 06:39:38,466][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000298001_4882448384.pth [2024-06-19 06:39:38,750][26599] Updated weights for policy 0, policy_version 298624 (0.0036) [2024-06-19 06:39:42,093][26599] Updated weights for policy 0, policy_version 298634 (0.0031) [2024-06-19 06:39:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4892852224. Throughput: 0: 42580.9. Samples: 1160507680. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:39:43,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 06:39:46,391][26599] Updated weights for policy 0, policy_version 298644 (0.0034) [2024-06-19 06:39:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42487.4). Total num frames: 4893065216. Throughput: 0: 42524.5. Samples: 1160635560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:39:48,380][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 06:39:49,757][26599] Updated weights for policy 0, policy_version 298654 (0.0023) [2024-06-19 06:39:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42376.3). Total num frames: 4893278208. Throughput: 0: 42455.4. Samples: 1160888560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:39:53,385][26367] Avg episode reward: [(0, '0.411')] [2024-06-19 06:39:53,991][26599] Updated weights for policy 0, policy_version 298664 (0.0029) [2024-06-19 06:39:57,798][26599] Updated weights for policy 0, policy_version 298674 (0.0042) [2024-06-19 06:39:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4893507584. Throughput: 0: 42448.9. Samples: 1161143360. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:39:58,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 06:40:01,654][26599] Updated weights for policy 0, policy_version 298684 (0.0040) [2024-06-19 06:40:03,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42376.6). Total num frames: 4893687808. Throughput: 0: 42491.3. Samples: 1161272380. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:03,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 06:40:05,407][26599] Updated weights for policy 0, policy_version 298694 (0.0030) [2024-06-19 06:40:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4893917184. Throughput: 0: 42470.4. Samples: 1161524260. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:08,380][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 06:40:09,698][26599] Updated weights for policy 0, policy_version 298704 (0.0039) [2024-06-19 06:40:13,055][26599] Updated weights for policy 0, policy_version 298714 (0.0036) [2024-06-19 06:40:13,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4894146560. Throughput: 0: 42436.6. Samples: 1161778920. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:13,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 06:40:17,141][26599] Updated weights for policy 0, policy_version 298724 (0.0028) [2024-06-19 06:40:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4894326784. Throughput: 0: 42417.7. Samples: 1161907860. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:18,384][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 06:40:20,507][26599] Updated weights for policy 0, policy_version 298734 (0.0039) [2024-06-19 06:40:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42431.8). Total num frames: 4894556160. Throughput: 0: 42466.7. Samples: 1162165060. 
Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:23,381][26367] Avg episode reward: [(0, '0.345')] [2024-06-19 06:40:24,367][26579] Signal inference workers to stop experience collection... (17150 times) [2024-06-19 06:40:24,369][26579] Signal inference workers to resume experience collection... (17150 times) [2024-06-19 06:40:24,387][26599] InferenceWorker_p0-w0: stopping experience collection (17150 times) [2024-06-19 06:40:24,417][26599] InferenceWorker_p0-w0: resuming experience collection (17150 times) [2024-06-19 06:40:24,511][26599] Updated weights for policy 0, policy_version 298744 (0.0042) [2024-06-19 06:40:28,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.6, 300 sec: 42487.4). Total num frames: 4894769152. Throughput: 0: 42611.3. Samples: 1162425180. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:28,380][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 06:40:28,494][26599] Updated weights for policy 0, policy_version 298754 (0.0037) [2024-06-19 06:40:32,557][26599] Updated weights for policy 0, policy_version 298764 (0.0041) [2024-06-19 06:40:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4894965760. Throughput: 0: 42533.3. Samples: 1162549560. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:33,380][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 06:40:36,197][26599] Updated weights for policy 0, policy_version 298774 (0.0039) [2024-06-19 06:40:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4895195136. Throughput: 0: 42604.6. Samples: 1162805760. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:38,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 06:40:40,129][26599] Updated weights for policy 0, policy_version 298784 (0.0027) [2024-06-19 06:40:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4895391744. Throughput: 0: 42650.2. 
Samples: 1163062620. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:43,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 06:40:43,914][26599] Updated weights for policy 0, policy_version 298794 (0.0030) [2024-06-19 06:40:47,531][26599] Updated weights for policy 0, policy_version 298804 (0.0031) [2024-06-19 06:40:48,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 4895604736. Throughput: 0: 42485.8. Samples: 1163184240. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:48,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 06:40:51,615][26599] Updated weights for policy 0, policy_version 298814 (0.0043) [2024-06-19 06:40:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4895834112. Throughput: 0: 42638.1. Samples: 1163442980. Policy #0 lag: (min: 1.0, avg: 9.5, max: 21.0) [2024-06-19 06:40:53,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 06:40:55,584][26599] Updated weights for policy 0, policy_version 298824 (0.0042) [2024-06-19 06:40:58,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4896030720. Throughput: 0: 42673.0. Samples: 1163699200. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:40:58,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 06:40:59,234][26599] Updated weights for policy 0, policy_version 298834 (0.0026) [2024-06-19 06:41:03,084][26599] Updated weights for policy 0, policy_version 298844 (0.0038) [2024-06-19 06:41:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42432.3). Total num frames: 4896260096. Throughput: 0: 42671.6. Samples: 1163828080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:03,381][26367] Avg episode reward: [(0, '0.344')] [2024-06-19 06:41:06,925][26599] Updated weights for policy 0, policy_version 298854 (0.0036) [2024-06-19 06:41:08,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 4896473088. Throughput: 0: 42630.7. Samples: 1164083440. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:08,380][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 06:41:10,486][26599] Updated weights for policy 0, policy_version 298864 (0.0032) [2024-06-19 06:41:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4896669696. Throughput: 0: 42622.2. Samples: 1164343180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:13,380][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 06:41:14,626][26599] Updated weights for policy 0, policy_version 298874 (0.0037) [2024-06-19 06:41:18,045][26599] Updated weights for policy 0, policy_version 298884 (0.0036) [2024-06-19 06:41:18,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 4896915456. Throughput: 0: 42693.7. Samples: 1164470780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:18,380][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 06:41:22,394][26599] Updated weights for policy 0, policy_version 298894 (0.0022) [2024-06-19 06:41:23,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4897095680. Throughput: 0: 42726.6. Samples: 1164728460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:23,381][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 06:41:25,679][26599] Updated weights for policy 0, policy_version 298904 (0.0049) [2024-06-19 06:41:28,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42432.3). Total num frames: 4897308672. Throughput: 0: 42582.2. Samples: 1164978820. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:28,380][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 06:41:30,543][26599] Updated weights for policy 0, policy_version 298914 (0.0035) [2024-06-19 06:41:33,224][26599] Updated weights for policy 0, policy_version 298924 (0.0023) [2024-06-19 06:41:33,380][26367] Fps is (10 sec: 47513.5, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 4897570816. Throughput: 0: 42726.7. Samples: 1165106940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:33,383][26367] Avg episode reward: [(0, '0.278')] [2024-06-19 06:41:38,279][26599] Updated weights for policy 0, policy_version 298934 (0.0025) [2024-06-19 06:41:38,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 4897734656. Throughput: 0: 42624.4. Samples: 1165361080. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:38,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 06:41:38,528][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000298935_4897751040.pth... [2024-06-19 06:41:38,578][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000298313_4887560192.pth [2024-06-19 06:41:41,381][26599] Updated weights for policy 0, policy_version 298944 (0.0041) [2024-06-19 06:41:43,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4897964032. Throughput: 0: 42472.9. Samples: 1165610480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:43,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 06:41:45,938][26599] Updated weights for policy 0, policy_version 298954 (0.0040) [2024-06-19 06:41:47,430][26579] Signal inference workers to stop experience collection... (17200 times) [2024-06-19 06:41:47,433][26579] Signal inference workers to resume experience collection... 
(17200 times) [2024-06-19 06:41:47,446][26599] InferenceWorker_p0-w0: stopping experience collection (17200 times) [2024-06-19 06:41:47,480][26599] InferenceWorker_p0-w0: resuming experience collection (17200 times) [2024-06-19 06:41:48,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4898177024. Throughput: 0: 42359.9. Samples: 1165734280. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:48,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 06:41:49,090][26599] Updated weights for policy 0, policy_version 298964 (0.0041) [2024-06-19 06:41:53,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 4898357248. Throughput: 0: 42367.9. Samples: 1165990000. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:53,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 06:41:53,562][26599] Updated weights for policy 0, policy_version 298974 (0.0029) [2024-06-19 06:41:57,072][26599] Updated weights for policy 0, policy_version 298984 (0.0034) [2024-06-19 06:41:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4898586624. Throughput: 0: 41987.0. Samples: 1166232600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:41:58,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 06:42:01,343][26599] Updated weights for policy 0, policy_version 298994 (0.0036) [2024-06-19 06:42:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4898783232. Throughput: 0: 41994.6. Samples: 1166360540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:42:03,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 06:42:04,942][26599] Updated weights for policy 0, policy_version 299004 (0.0029) [2024-06-19 06:42:08,384][26367] Fps is (10 sec: 39307.5, 60 sec: 41776.6, 300 sec: 42375.7). Total num frames: 4898979840. Throughput: 0: 41931.8. Samples: 1166615540. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 06:42:08,385][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 06:42:09,219][26599] Updated weights for policy 0, policy_version 299014 (0.0031) [2024-06-19 06:42:12,703][26599] Updated weights for policy 0, policy_version 299024 (0.0031) [2024-06-19 06:42:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42376.4). Total num frames: 4899225600. Throughput: 0: 41933.2. Samples: 1166865820. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:13,381][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 06:42:16,835][26599] Updated weights for policy 0, policy_version 299034 (0.0032) [2024-06-19 06:42:18,381][26367] Fps is (10 sec: 44249.8, 60 sec: 41778.7, 300 sec: 42431.7). Total num frames: 4899422208. Throughput: 0: 42082.0. Samples: 1167000660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:18,382][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 06:42:20,373][26599] Updated weights for policy 0, policy_version 299044 (0.0033) [2024-06-19 06:42:23,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4899635200. Throughput: 0: 41953.1. Samples: 1167248960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:23,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 06:42:24,579][26599] Updated weights for policy 0, policy_version 299054 (0.0027) [2024-06-19 06:42:28,380][26367] Fps is (10 sec: 42601.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4899848192. Throughput: 0: 42068.8. Samples: 1167503580. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:28,381][26367] Avg episode reward: [(0, '0.335')] [2024-06-19 06:42:28,719][26599] Updated weights for policy 0, policy_version 299064 (0.0033) [2024-06-19 06:42:32,654][26599] Updated weights for policy 0, policy_version 299074 (0.0036) [2024-06-19 06:42:33,380][26367] Fps is (10 sec: 42597.7, 60 sec: 41506.1, 300 sec: 42431.8). Total num frames: 4900061184. Throughput: 0: 42136.0. Samples: 1167630400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:33,384][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 06:42:36,723][26599] Updated weights for policy 0, policy_version 299084 (0.0038) [2024-06-19 06:42:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42432.3). Total num frames: 4900274176. Throughput: 0: 42080.8. Samples: 1167883640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:38,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 06:42:40,161][26599] Updated weights for policy 0, policy_version 299094 (0.0042) [2024-06-19 06:42:43,382][26367] Fps is (10 sec: 42592.9, 60 sec: 42051.3, 300 sec: 42487.1). Total num frames: 4900487168. Throughput: 0: 42258.3. Samples: 1168134280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:43,382][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 06:42:44,374][26599] Updated weights for policy 0, policy_version 299104 (0.0033) [2024-06-19 06:42:47,789][26599] Updated weights for policy 0, policy_version 299114 (0.0032) [2024-06-19 06:42:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4900700160. Throughput: 0: 42270.6. Samples: 1168262720. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:48,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 06:42:51,963][26599] Updated weights for policy 0, policy_version 299124 (0.0029) [2024-06-19 06:42:53,380][26367] Fps is (10 sec: 42604.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4900913152. Throughput: 0: 42424.0. Samples: 1168524460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:53,380][26367] Avg episode reward: [(0, '0.359')] [2024-06-19 06:42:55,513][26599] Updated weights for policy 0, policy_version 299134 (0.0041) [2024-06-19 06:42:58,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4901142528. Throughput: 0: 42287.0. Samples: 1168768740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:42:58,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 06:42:59,791][26599] Updated weights for policy 0, policy_version 299144 (0.0029) [2024-06-19 06:43:03,073][26599] Updated weights for policy 0, policy_version 299154 (0.0033) [2024-06-19 06:43:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4901339136. Throughput: 0: 42278.9. Samples: 1168903180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:43:03,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 06:43:07,445][26599] Updated weights for policy 0, policy_version 299164 (0.0033) [2024-06-19 06:43:08,380][26367] Fps is (10 sec: 37683.9, 60 sec: 42327.9, 300 sec: 42376.2). Total num frames: 4901519360. Throughput: 0: 42404.0. Samples: 1169157140. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:43:08,381][26367] Avg episode reward: [(0, '0.330')] [2024-06-19 06:43:08,551][26579] Signal inference workers to stop experience collection... (17250 times) [2024-06-19 06:43:08,551][26579] Signal inference workers to resume experience collection... 
(17250 times) [2024-06-19 06:43:08,565][26599] InferenceWorker_p0-w0: stopping experience collection (17250 times) [2024-06-19 06:43:08,565][26599] InferenceWorker_p0-w0: resuming experience collection (17250 times) [2024-06-19 06:43:10,803][26599] Updated weights for policy 0, policy_version 299174 (0.0021) [2024-06-19 06:43:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4901781504. Throughput: 0: 42370.7. Samples: 1169410260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:43:13,381][26367] Avg episode reward: [(0, '0.370')] [2024-06-19 06:43:15,019][26599] Updated weights for policy 0, policy_version 299184 (0.0027) [2024-06-19 06:43:18,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.8, 300 sec: 42487.3). Total num frames: 4901961728. Throughput: 0: 42528.0. Samples: 1169544160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 06:43:18,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 06:43:18,943][26599] Updated weights for policy 0, policy_version 299194 (0.0037) [2024-06-19 06:43:22,458][26599] Updated weights for policy 0, policy_version 299204 (0.0024) [2024-06-19 06:43:23,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4902174720. Throughput: 0: 42555.2. Samples: 1169798620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:43:23,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 06:43:26,550][26599] Updated weights for policy 0, policy_version 299214 (0.0032) [2024-06-19 06:43:28,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4902420480. Throughput: 0: 42535.0. Samples: 1170048300. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:43:28,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 06:43:30,158][26599] Updated weights for policy 0, policy_version 299224 (0.0022) [2024-06-19 06:43:33,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4902617088. Throughput: 0: 42646.1. Samples: 1170181800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:43:33,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 06:43:34,055][26599] Updated weights for policy 0, policy_version 299234 (0.0036) [2024-06-19 06:43:37,751][26599] Updated weights for policy 0, policy_version 299244 (0.0031) [2024-06-19 06:43:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4902830080. Throughput: 0: 42405.6. Samples: 1170432720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:43:38,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 06:43:38,400][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000299245_4902830080.pth... [2024-06-19 06:43:38,443][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000298623_4892639232.pth [2024-06-19 06:43:41,775][26599] Updated weights for policy 0, policy_version 299254 (0.0033) [2024-06-19 06:43:43,380][26367] Fps is (10 sec: 44237.9, 60 sec: 42872.5, 300 sec: 42487.3). Total num frames: 4903059456. Throughput: 0: 42593.1. Samples: 1170685420. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:43:43,380][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 06:43:45,558][26599] Updated weights for policy 0, policy_version 299264 (0.0041) [2024-06-19 06:43:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4903239680. Throughput: 0: 42373.3. Samples: 1170809980. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:43:48,381][26367] Avg episode reward: [(0, '0.800')] [2024-06-19 06:43:49,725][26599] Updated weights for policy 0, policy_version 299274 (0.0030) [2024-06-19 06:43:53,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4903452672. Throughput: 0: 42283.1. Samples: 1171059880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:43:53,381][26367] Avg episode reward: [(0, '0.826')] [2024-06-19 06:43:53,806][26599] Updated weights for policy 0, policy_version 299284 (0.0038) [2024-06-19 06:43:57,608][26599] Updated weights for policy 0, policy_version 299294 (0.0035) [2024-06-19 06:43:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 4903682048. Throughput: 0: 42391.6. Samples: 1171317880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:43:58,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 06:44:01,450][26599] Updated weights for policy 0, policy_version 299304 (0.0040) [2024-06-19 06:44:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4903878656. Throughput: 0: 42228.5. Samples: 1171444440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:44:03,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 06:44:05,344][26599] Updated weights for policy 0, policy_version 299314 (0.0043) [2024-06-19 06:44:08,257][26579] Signal inference workers to stop experience collection... (17300 times) [2024-06-19 06:44:08,258][26579] Signal inference workers to resume experience collection... (17300 times) [2024-06-19 06:44:08,298][26599] InferenceWorker_p0-w0: stopping experience collection (17300 times) [2024-06-19 06:44:08,298][26599] InferenceWorker_p0-w0: resuming experience collection (17300 times) [2024-06-19 06:44:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4904091648. 
Throughput: 0: 42192.5. Samples: 1171697280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:44:08,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 06:44:09,065][26599] Updated weights for policy 0, policy_version 299324 (0.0033) [2024-06-19 06:44:12,916][26599] Updated weights for policy 0, policy_version 299334 (0.0028) [2024-06-19 06:44:13,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4904321024. Throughput: 0: 42417.9. Samples: 1171957100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:44:13,381][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 06:44:16,610][26599] Updated weights for policy 0, policy_version 299344 (0.0036) [2024-06-19 06:44:18,383][26367] Fps is (10 sec: 40950.5, 60 sec: 42323.8, 300 sec: 42431.5). Total num frames: 4904501248. Throughput: 0: 42242.4. Samples: 1172082800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:44:18,383][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 06:44:20,859][26599] Updated weights for policy 0, policy_version 299354 (0.0037) [2024-06-19 06:44:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4904730624. Throughput: 0: 42238.7. Samples: 1172333460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:44:23,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 06:44:24,124][26599] Updated weights for policy 0, policy_version 299364 (0.0036) [2024-06-19 06:44:28,380][26367] Fps is (10 sec: 42608.3, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 4904927232. Throughput: 0: 42510.2. Samples: 1172598380. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:44:28,380][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 06:44:28,432][26599] Updated weights for policy 0, policy_version 299374 (0.0036) [2024-06-19 06:44:31,665][26599] Updated weights for policy 0, policy_version 299384 (0.0027) [2024-06-19 06:44:33,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.4, 300 sec: 42376.3). Total num frames: 4905140224. Throughput: 0: 42560.1. Samples: 1172725180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 06:44:33,380][26367] Avg episode reward: [(0, '0.788')] [2024-06-19 06:44:35,966][26599] Updated weights for policy 0, policy_version 299394 (0.0036) [2024-06-19 06:44:38,384][26367] Fps is (10 sec: 44220.4, 60 sec: 42322.8, 300 sec: 42431.3). Total num frames: 4905369600. Throughput: 0: 42588.6. Samples: 1172976520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:44:38,385][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 06:44:39,318][26599] Updated weights for policy 0, policy_version 299404 (0.0040) [2024-06-19 06:44:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 4905566208. Throughput: 0: 42716.9. Samples: 1173240140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:44:43,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 06:44:43,534][26599] Updated weights for policy 0, policy_version 299414 (0.0039) [2024-06-19 06:44:46,989][26599] Updated weights for policy 0, policy_version 299424 (0.0052) [2024-06-19 06:44:48,380][26367] Fps is (10 sec: 40974.9, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 4905779200. Throughput: 0: 42632.9. Samples: 1173362920. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:44:48,384][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 06:44:51,186][26599] Updated weights for policy 0, policy_version 299434 (0.0031) [2024-06-19 06:44:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 4906008576. Throughput: 0: 42612.3. Samples: 1173614840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:44:53,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 06:44:54,574][26599] Updated weights for policy 0, policy_version 299444 (0.0027) [2024-06-19 06:44:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4906221568. Throughput: 0: 42567.0. Samples: 1173872620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:44:58,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 06:44:58,865][26599] Updated weights for policy 0, policy_version 299454 (0.0037) [2024-06-19 06:45:02,538][26599] Updated weights for policy 0, policy_version 299464 (0.0034) [2024-06-19 06:45:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4906418176. Throughput: 0: 42442.6. Samples: 1173992620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:03,381][26367] Avg episode reward: [(0, '0.750')] [2024-06-19 06:45:06,536][26599] Updated weights for policy 0, policy_version 299474 (0.0032) [2024-06-19 06:45:08,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4906647552. Throughput: 0: 42543.2. Samples: 1174247900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:08,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 06:45:10,055][26599] Updated weights for policy 0, policy_version 299484 (0.0025) [2024-06-19 06:45:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 4906844160. Throughput: 0: 42457.7. Samples: 1174508980. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:13,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 06:45:14,449][26599] Updated weights for policy 0, policy_version 299494 (0.0032) [2024-06-19 06:45:18,159][26599] Updated weights for policy 0, policy_version 299504 (0.0025) [2024-06-19 06:45:18,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42873.0, 300 sec: 42431.8). Total num frames: 4907073536. Throughput: 0: 42346.9. Samples: 1174630800. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:18,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 06:45:22,199][26599] Updated weights for policy 0, policy_version 299514 (0.0024) [2024-06-19 06:45:23,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4907286528. Throughput: 0: 42556.4. Samples: 1174891400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:23,380][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 06:45:25,639][26599] Updated weights for policy 0, policy_version 299524 (0.0033) [2024-06-19 06:45:28,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4907466752. Throughput: 0: 42374.2. Samples: 1175146980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:28,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 06:45:28,936][26579] Signal inference workers to stop experience collection... (17350 times) [2024-06-19 06:45:28,937][26579] Signal inference workers to resume experience collection... 
(17350 times) [2024-06-19 06:45:28,964][26599] InferenceWorker_p0-w0: stopping experience collection (17350 times) [2024-06-19 06:45:28,964][26599] InferenceWorker_p0-w0: resuming experience collection (17350 times) [2024-06-19 06:45:29,822][26599] Updated weights for policy 0, policy_version 299534 (0.0035) [2024-06-19 06:45:33,241][26599] Updated weights for policy 0, policy_version 299544 (0.0029) [2024-06-19 06:45:33,380][26367] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 4907728896. Throughput: 0: 42353.8. Samples: 1175268840. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:33,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 06:45:37,369][26599] Updated weights for policy 0, policy_version 299554 (0.0041) [2024-06-19 06:45:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42327.9, 300 sec: 42431.8). Total num frames: 4907909120. Throughput: 0: 42526.3. Samples: 1175528520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:38,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 06:45:38,437][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000299556_4907925504.pth... [2024-06-19 06:45:38,491][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000298935_4897751040.pth [2024-06-19 06:45:40,989][26599] Updated weights for policy 0, policy_version 299564 (0.0029) [2024-06-19 06:45:43,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4908105728. Throughput: 0: 42542.3. Samples: 1175787020. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:43,381][26367] Avg episode reward: [(0, '0.341')] [2024-06-19 06:45:45,177][26599] Updated weights for policy 0, policy_version 299574 (0.0046) [2024-06-19 06:45:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4908351488. Throughput: 0: 42543.2. Samples: 1175907060. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 06:45:48,381][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 06:45:49,330][26599] Updated weights for policy 0, policy_version 299584 (0.0033) [2024-06-19 06:45:52,736][26599] Updated weights for policy 0, policy_version 299594 (0.0032) [2024-06-19 06:45:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4908548096. Throughput: 0: 42524.0. Samples: 1176161480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:45:53,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 06:45:56,965][26599] Updated weights for policy 0, policy_version 299604 (0.0043) [2024-06-19 06:45:58,384][26367] Fps is (10 sec: 39307.1, 60 sec: 42049.7, 300 sec: 42320.2). Total num frames: 4908744704. Throughput: 0: 42620.1. Samples: 1176427040. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:45:58,384][26367] Avg episode reward: [(0, '0.750')] [2024-06-19 06:46:00,773][26599] Updated weights for policy 0, policy_version 299614 (0.0033) [2024-06-19 06:46:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4908990464. Throughput: 0: 42602.9. Samples: 1176547920. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:03,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 06:46:04,657][26599] Updated weights for policy 0, policy_version 299624 (0.0024) [2024-06-19 06:46:08,288][26599] Updated weights for policy 0, policy_version 299634 (0.0034) [2024-06-19 06:46:08,380][26367] Fps is (10 sec: 45892.0, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4909203456. Throughput: 0: 42508.3. Samples: 1176804280. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:08,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 06:46:12,426][26599] Updated weights for policy 0, policy_version 299644 (0.0029) [2024-06-19 06:46:13,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4909367296. Throughput: 0: 42456.4. Samples: 1177057520. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:13,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 06:46:16,156][26599] Updated weights for policy 0, policy_version 299654 (0.0027) [2024-06-19 06:46:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 4909629440. Throughput: 0: 42470.2. Samples: 1177180000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:18,381][26367] Avg episode reward: [(0, '0.753')] [2024-06-19 06:46:20,910][26599] Updated weights for policy 0, policy_version 299664 (0.0049) [2024-06-19 06:46:23,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4909826048. Throughput: 0: 42388.4. Samples: 1177436000. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:23,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 06:46:23,583][26579] Signal inference workers to stop experience collection... (17400 times) [2024-06-19 06:46:23,635][26599] InferenceWorker_p0-w0: stopping experience collection (17400 times) [2024-06-19 06:46:23,642][26579] Signal inference workers to resume experience collection... (17400 times) [2024-06-19 06:46:23,646][26599] InferenceWorker_p0-w0: resuming experience collection (17400 times) [2024-06-19 06:46:23,792][26599] Updated weights for policy 0, policy_version 299674 (0.0039) [2024-06-19 06:46:28,359][26599] Updated weights for policy 0, policy_version 299684 (0.0032) [2024-06-19 06:46:28,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4910022656. 
Throughput: 0: 42346.6. Samples: 1177692620. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:28,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 06:46:31,391][26599] Updated weights for policy 0, policy_version 299694 (0.0029) [2024-06-19 06:46:33,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4910268416. Throughput: 0: 42477.8. Samples: 1177818560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:33,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 06:46:35,906][26599] Updated weights for policy 0, policy_version 299704 (0.0028) [2024-06-19 06:46:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4910448640. Throughput: 0: 42591.1. Samples: 1178078080. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:38,380][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 06:46:39,043][26599] Updated weights for policy 0, policy_version 299714 (0.0037) [2024-06-19 06:46:43,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4910661632. Throughput: 0: 42273.3. Samples: 1178329180. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:43,380][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 06:46:43,467][26599] Updated weights for policy 0, policy_version 299724 (0.0038) [2024-06-19 06:46:47,038][26599] Updated weights for policy 0, policy_version 299734 (0.0041) [2024-06-19 06:46:48,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4910891008. Throughput: 0: 42442.6. Samples: 1178457840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:48,381][26367] Avg episode reward: [(0, '0.324')] [2024-06-19 06:46:51,476][26599] Updated weights for policy 0, policy_version 299744 (0.0029) [2024-06-19 06:46:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4911087616. 
Throughput: 0: 42491.6. Samples: 1178716400. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:53,381][26367] Avg episode reward: [(0, '0.380')] [2024-06-19 06:46:54,661][26599] Updated weights for policy 0, policy_version 299754 (0.0026) [2024-06-19 06:46:58,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42601.0, 300 sec: 42431.8). Total num frames: 4911300608. Throughput: 0: 42334.6. Samples: 1178962580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:46:58,381][26367] Avg episode reward: [(0, '0.243')] [2024-06-19 06:46:59,064][26599] Updated weights for policy 0, policy_version 299764 (0.0035) [2024-06-19 06:47:02,253][26599] Updated weights for policy 0, policy_version 299774 (0.0039) [2024-06-19 06:47:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42543.4). Total num frames: 4911529984. Throughput: 0: 42612.1. Samples: 1179097540. Policy #0 lag: (min: 0.0, avg: 10.8, max: 22.0) [2024-06-19 06:47:03,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 06:47:06,723][26599] Updated weights for policy 0, policy_version 299784 (0.0035) [2024-06-19 06:47:08,380][26367] Fps is (10 sec: 40959.4, 60 sec: 41779.1, 300 sec: 42320.7). Total num frames: 4911710208. Throughput: 0: 42608.7. Samples: 1179353400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:08,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 06:47:09,836][26599] Updated weights for policy 0, policy_version 299794 (0.0027) [2024-06-19 06:47:13,380][26367] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42487.4). Total num frames: 4911955968. Throughput: 0: 42520.4. Samples: 1179606040. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:13,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 06:47:14,522][26599] Updated weights for policy 0, policy_version 299804 (0.0041) [2024-06-19 06:47:17,463][26599] Updated weights for policy 0, policy_version 299814 (0.0035) [2024-06-19 06:47:18,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4912168960. Throughput: 0: 42660.4. Samples: 1179738280. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:18,381][26367] Avg episode reward: [(0, '0.319')] [2024-06-19 06:47:22,149][26599] Updated weights for policy 0, policy_version 299824 (0.0038) [2024-06-19 06:47:23,380][26367] Fps is (10 sec: 39322.5, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 4912349184. Throughput: 0: 42432.0. Samples: 1179987520. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:23,380][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 06:47:25,410][26599] Updated weights for policy 0, policy_version 299834 (0.0046) [2024-06-19 06:47:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 4912594944. Throughput: 0: 42294.2. Samples: 1180232420. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:28,380][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 06:47:29,784][26599] Updated weights for policy 0, policy_version 299844 (0.0042) [2024-06-19 06:47:33,335][26599] Updated weights for policy 0, policy_version 299854 (0.0031) [2024-06-19 06:47:33,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4912807936. Throughput: 0: 42449.4. Samples: 1180368060. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:33,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 06:47:37,444][26599] Updated weights for policy 0, policy_version 299864 (0.0041) [2024-06-19 06:47:38,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42376.4). Total num frames: 4912988160. Throughput: 0: 42200.0. Samples: 1180615400. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:38,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 06:47:38,411][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000299865_4912988160.pth... [2024-06-19 06:47:38,480][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000299245_4902830080.pth [2024-06-19 06:47:41,274][26599] Updated weights for policy 0, policy_version 299874 (0.0039) [2024-06-19 06:47:43,384][26367] Fps is (10 sec: 40945.0, 60 sec: 42595.8, 300 sec: 42431.3). Total num frames: 4913217536. Throughput: 0: 42246.4. Samples: 1180863820. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:43,384][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 06:47:45,230][26599] Updated weights for policy 0, policy_version 299884 (0.0033) [2024-06-19 06:47:48,380][26367] Fps is (10 sec: 40959.3, 60 sec: 41779.1, 300 sec: 42320.7). Total num frames: 4913397760. Throughput: 0: 42181.6. Samples: 1180995720. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:48,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 06:47:48,409][26579] Signal inference workers to stop experience collection... (17450 times) [2024-06-19 06:47:48,409][26579] Signal inference workers to resume experience collection... 
(17450 times) [2024-06-19 06:47:48,430][26599] InferenceWorker_p0-w0: stopping experience collection (17450 times) [2024-06-19 06:47:48,431][26599] InferenceWorker_p0-w0: resuming experience collection (17450 times) [2024-06-19 06:47:49,031][26599] Updated weights for policy 0, policy_version 299894 (0.0031) [2024-06-19 06:47:52,871][26599] Updated weights for policy 0, policy_version 299904 (0.0037) [2024-06-19 06:47:53,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4913627136. Throughput: 0: 42234.4. Samples: 1181253940. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:53,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 06:47:56,567][26599] Updated weights for policy 0, policy_version 299914 (0.0038) [2024-06-19 06:47:58,380][26367] Fps is (10 sec: 47514.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4913872896. Throughput: 0: 42081.5. Samples: 1181499700. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:47:58,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 06:48:00,766][26599] Updated weights for policy 0, policy_version 299924 (0.0037) [2024-06-19 06:48:03,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 4914036736. Throughput: 0: 42096.6. Samples: 1181632620. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:48:03,380][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 06:48:04,371][26599] Updated weights for policy 0, policy_version 299934 (0.0030) [2024-06-19 06:48:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4914266112. Throughput: 0: 42205.7. Samples: 1181886780. 
Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:48:08,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 06:48:08,503][26599] Updated weights for policy 0, policy_version 299944 (0.0026) [2024-06-19 06:48:12,163][26599] Updated weights for policy 0, policy_version 299954 (0.0031) [2024-06-19 06:48:13,380][26367] Fps is (10 sec: 49151.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4914528256. Throughput: 0: 42219.9. Samples: 1182132320. Policy #0 lag: (min: 0.0, avg: 12.4, max: 21.0) [2024-06-19 06:48:13,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 06:48:16,698][26599] Updated weights for policy 0, policy_version 299964 (0.0031) [2024-06-19 06:48:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 4914675712. Throughput: 0: 42072.5. Samples: 1182261320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:48:18,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 06:48:20,049][26599] Updated weights for policy 0, policy_version 299974 (0.0037) [2024-06-19 06:48:23,380][26367] Fps is (10 sec: 36045.2, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4914888704. Throughput: 0: 42062.7. Samples: 1182508220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:48:23,380][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 06:48:24,391][26599] Updated weights for policy 0, policy_version 299984 (0.0035) [2024-06-19 06:48:27,945][26599] Updated weights for policy 0, policy_version 299994 (0.0051) [2024-06-19 06:48:28,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4915134464. Throughput: 0: 42180.8. Samples: 1182761800. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:48:28,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 06:48:31,918][26599] Updated weights for policy 0, policy_version 300004 (0.0027) [2024-06-19 06:48:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.1, 300 sec: 42265.2). Total num frames: 4915298304. Throughput: 0: 42101.9. Samples: 1182890300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:48:33,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 06:48:35,810][26599] Updated weights for policy 0, policy_version 300014 (0.0030) [2024-06-19 06:48:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4915527680. Throughput: 0: 41851.2. Samples: 1183137240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:48:38,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 06:48:39,646][26599] Updated weights for policy 0, policy_version 300024 (0.0056) [2024-06-19 06:48:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42054.8, 300 sec: 42376.2). Total num frames: 4915740672. Throughput: 0: 42166.1. Samples: 1183397180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:48:43,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 06:48:43,428][26599] Updated weights for policy 0, policy_version 300034 (0.0036) [2024-06-19 06:48:47,514][26599] Updated weights for policy 0, policy_version 300044 (0.0030) [2024-06-19 06:48:48,383][26367] Fps is (10 sec: 40950.5, 60 sec: 42323.8, 300 sec: 42320.4). Total num frames: 4915937280. Throughput: 0: 41916.4. Samples: 1183518960. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:48:48,383][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 06:48:51,261][26599] Updated weights for policy 0, policy_version 300054 (0.0052) [2024-06-19 06:48:53,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 4916183040. Throughput: 0: 41793.4. Samples: 1183767480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:48:53,380][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 06:48:55,385][26599] Updated weights for policy 0, policy_version 300064 (0.0045) [2024-06-19 06:48:58,380][26367] Fps is (10 sec: 42608.1, 60 sec: 41506.1, 300 sec: 42320.7). Total num frames: 4916363264. Throughput: 0: 42170.2. Samples: 1184029980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:48:58,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 06:48:59,061][26599] Updated weights for policy 0, policy_version 300074 (0.0038) [2024-06-19 06:49:02,941][26599] Updated weights for policy 0, policy_version 300084 (0.0038) [2024-06-19 06:49:03,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4916576256. Throughput: 0: 41910.1. Samples: 1184147280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:49:03,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 06:49:06,765][26599] Updated weights for policy 0, policy_version 300094 (0.0039) [2024-06-19 06:49:08,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4916822016. Throughput: 0: 42117.7. Samples: 1184403520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:49:08,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 06:49:10,731][26599] Updated weights for policy 0, policy_version 300104 (0.0031) [2024-06-19 06:49:13,136][26579] Signal inference workers to stop experience collection... (17500 times) [2024-06-19 06:49:13,136][26579] Signal inference workers to resume experience collection... (17500 times) [2024-06-19 06:49:13,195][26599] InferenceWorker_p0-w0: stopping experience collection (17500 times) [2024-06-19 06:49:13,195][26599] InferenceWorker_p0-w0: resuming experience collection (17500 times) [2024-06-19 06:49:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 40959.9, 300 sec: 42321.0). Total num frames: 4916985856. 
Throughput: 0: 42297.6. Samples: 1184665200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:49:13,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 06:49:14,609][26599] Updated weights for policy 0, policy_version 300114 (0.0046) [2024-06-19 06:49:18,371][26599] Updated weights for policy 0, policy_version 300124 (0.0034) [2024-06-19 06:49:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42376.3). Total num frames: 4917231616. Throughput: 0: 42068.5. Samples: 1184783380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:49:18,380][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 06:49:22,350][26599] Updated weights for policy 0, policy_version 300134 (0.0038) [2024-06-19 06:49:23,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4917444608. Throughput: 0: 42289.3. Samples: 1185040260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:49:23,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 06:49:26,046][26599] Updated weights for policy 0, policy_version 300144 (0.0035) [2024-06-19 06:49:28,380][26367] Fps is (10 sec: 37682.6, 60 sec: 41233.0, 300 sec: 42265.1). Total num frames: 4917608448. Throughput: 0: 42184.5. Samples: 1185295480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 06:49:28,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 06:49:30,241][26599] Updated weights for policy 0, policy_version 300154 (0.0038) [2024-06-19 06:49:33,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42265.7). Total num frames: 4917837824. Throughput: 0: 42105.7. Samples: 1185413620. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:49:33,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 06:49:33,722][26599] Updated weights for policy 0, policy_version 300164 (0.0039) [2024-06-19 06:49:38,044][26599] Updated weights for policy 0, policy_version 300174 (0.0033) [2024-06-19 06:49:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4918050816. Throughput: 0: 42366.1. Samples: 1185673960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:49:38,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 06:49:38,451][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000300175_4918067200.pth... [2024-06-19 06:49:38,505][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000299556_4907925504.pth [2024-06-19 06:49:41,599][26599] Updated weights for policy 0, policy_version 300184 (0.0048) [2024-06-19 06:49:43,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41776.7, 300 sec: 42264.6). Total num frames: 4918247424. Throughput: 0: 42038.4. Samples: 1185921860. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:49:43,384][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 06:49:45,667][26599] Updated weights for policy 0, policy_version 300194 (0.0034) [2024-06-19 06:49:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42327.0, 300 sec: 42265.2). Total num frames: 4918476800. Throughput: 0: 42235.6. Samples: 1186047880. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:49:48,384][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 06:49:49,531][26599] Updated weights for policy 0, policy_version 300204 (0.0035) [2024-06-19 06:49:53,380][26367] Fps is (10 sec: 44252.9, 60 sec: 41779.1, 300 sec: 42265.2). Total num frames: 4918689792. Throughput: 0: 42213.7. Samples: 1186303140. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:49:53,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 06:49:53,399][26599] Updated weights for policy 0, policy_version 300214 (0.0040) [2024-06-19 06:49:57,254][26599] Updated weights for policy 0, policy_version 300224 (0.0037) [2024-06-19 06:49:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4918886400. Throughput: 0: 41999.2. Samples: 1186555160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:49:58,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 06:50:01,189][26599] Updated weights for policy 0, policy_version 300234 (0.0028) [2024-06-19 06:50:03,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4919132160. Throughput: 0: 42206.4. Samples: 1186682680. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:50:03,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 06:50:05,029][26599] Updated weights for policy 0, policy_version 300244 (0.0046) [2024-06-19 06:50:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41506.1, 300 sec: 42265.2). Total num frames: 4919312384. Throughput: 0: 42141.4. Samples: 1186936620. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:50:08,380][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 06:50:08,846][26599] Updated weights for policy 0, policy_version 300254 (0.0036) [2024-06-19 06:50:12,966][26599] Updated weights for policy 0, policy_version 300264 (0.0049) [2024-06-19 06:50:13,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42265.2). Total num frames: 4919541760. Throughput: 0: 41985.8. Samples: 1187184840. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:50:13,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 06:50:16,750][26599] Updated weights for policy 0, policy_version 300274 (0.0043) [2024-06-19 06:50:18,380][26367] Fps is (10 sec: 42597.4, 60 sec: 41779.0, 300 sec: 42209.6). Total num frames: 4919738368. Throughput: 0: 42273.2. Samples: 1187315920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:50:18,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 06:50:20,548][26599] Updated weights for policy 0, policy_version 300284 (0.0033) [2024-06-19 06:50:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42320.7). Total num frames: 4919951360. Throughput: 0: 42117.0. Samples: 1187569220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:50:23,380][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 06:50:24,382][26599] Updated weights for policy 0, policy_version 300294 (0.0033) [2024-06-19 06:50:28,367][26599] Updated weights for policy 0, policy_version 300304 (0.0028) [2024-06-19 06:50:28,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42209.6). Total num frames: 4920180736. Throughput: 0: 42271.4. Samples: 1187823920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:50:28,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 06:50:32,270][26599] Updated weights for policy 0, policy_version 300314 (0.0031) [2024-06-19 06:50:33,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4920377344. Throughput: 0: 42341.2. Samples: 1187953240. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:50:33,381][26367] Avg episode reward: [(0, '0.791')] [2024-06-19 06:50:36,057][26599] Updated weights for policy 0, policy_version 300324 (0.0030) [2024-06-19 06:50:38,344][26579] Signal inference workers to stop experience collection... 
(17550 times) [2024-06-19 06:50:38,348][26579] Signal inference workers to resume experience collection... (17550 times) [2024-06-19 06:50:38,380][26599] InferenceWorker_p0-w0: stopping experience collection (17550 times) [2024-06-19 06:50:38,380][26599] InferenceWorker_p0-w0: resuming experience collection (17550 times) [2024-06-19 06:50:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42320.7). Total num frames: 4920590336. Throughput: 0: 42441.7. Samples: 1188213020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:50:38,381][26367] Avg episode reward: [(0, '0.791')] [2024-06-19 06:50:39,841][26599] Updated weights for policy 0, policy_version 300334 (0.0027) [2024-06-19 06:50:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42874.0, 300 sec: 42265.1). Total num frames: 4920819712. Throughput: 0: 42464.8. Samples: 1188466080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 23.0) [2024-06-19 06:50:43,381][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 06:50:43,853][26599] Updated weights for policy 0, policy_version 300344 (0.0041) [2024-06-19 06:50:47,558][26599] Updated weights for policy 0, policy_version 300354 (0.0044) [2024-06-19 06:50:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42265.1). Total num frames: 4921016320. Throughput: 0: 42373.3. Samples: 1188589480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:50:48,381][26367] Avg episode reward: [(0, '0.399')] [2024-06-19 06:50:51,655][26599] Updated weights for policy 0, policy_version 300364 (0.0023) [2024-06-19 06:50:53,380][26367] Fps is (10 sec: 39322.5, 60 sec: 42052.3, 300 sec: 42265.7). Total num frames: 4921212928. Throughput: 0: 42255.6. Samples: 1188838120. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:50:53,380][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 06:50:55,471][26599] Updated weights for policy 0, policy_version 300374 (0.0032) [2024-06-19 06:50:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42154.1). Total num frames: 4921425920. Throughput: 0: 42183.9. Samples: 1189083120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:50:58,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 06:50:59,944][26599] Updated weights for policy 0, policy_version 300384 (0.0031) [2024-06-19 06:51:03,266][26599] Updated weights for policy 0, policy_version 300394 (0.0036) [2024-06-19 06:51:03,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4921655296. Throughput: 0: 42213.0. Samples: 1189215500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:03,384][26367] Avg episode reward: [(0, '0.377')] [2024-06-19 06:51:07,499][26599] Updated weights for policy 0, policy_version 300404 (0.0039) [2024-06-19 06:51:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4921835520. Throughput: 0: 42333.2. Samples: 1189474220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:08,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 06:51:10,743][26599] Updated weights for policy 0, policy_version 300414 (0.0047) [2024-06-19 06:51:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42209.6). Total num frames: 4922081280. Throughput: 0: 42122.2. Samples: 1189719420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:13,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 06:51:15,275][26599] Updated weights for policy 0, policy_version 300424 (0.0030) [2024-06-19 06:51:18,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42598.6, 300 sec: 42265.2). Total num frames: 4922294272. Throughput: 0: 42288.1. Samples: 1189856200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:18,380][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 06:51:18,402][26599] Updated weights for policy 0, policy_version 300434 (0.0036) [2024-06-19 06:51:22,900][26599] Updated weights for policy 0, policy_version 300444 (0.0040) [2024-06-19 06:51:23,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4922490880. Throughput: 0: 42239.3. Samples: 1190113780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:23,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 06:51:26,007][26599] Updated weights for policy 0, policy_version 300454 (0.0024) [2024-06-19 06:51:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42209.6). Total num frames: 4922720256. Throughput: 0: 42106.4. Samples: 1190360860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:28,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 06:51:30,615][26599] Updated weights for policy 0, policy_version 300464 (0.0047) [2024-06-19 06:51:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4922933248. Throughput: 0: 42379.3. Samples: 1190496540. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:33,380][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 06:51:33,761][26599] Updated weights for policy 0, policy_version 300474 (0.0031) [2024-06-19 06:51:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4923113472. Throughput: 0: 42578.2. Samples: 1190754140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:38,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 06:51:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000300484_4923129856.pth... 
[2024-06-19 06:51:38,415][26599] Updated weights for policy 0, policy_version 300484 (0.0039) [2024-06-19 06:51:38,454][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000299865_4912988160.pth [2024-06-19 06:51:41,534][26599] Updated weights for policy 0, policy_version 300494 (0.0038) [2024-06-19 06:51:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42265.2). Total num frames: 4923359232. Throughput: 0: 42543.6. Samples: 1190997580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:43,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 06:51:46,303][26599] Updated weights for policy 0, policy_version 300504 (0.0031) [2024-06-19 06:51:48,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4923572224. Throughput: 0: 42700.0. Samples: 1191137000. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:48,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 06:51:49,213][26599] Updated weights for policy 0, policy_version 300514 (0.0033) [2024-06-19 06:51:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.2, 300 sec: 42209.6). Total num frames: 4923752448. Throughput: 0: 42542.2. Samples: 1191388620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:53,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 06:51:54,053][26599] Updated weights for policy 0, policy_version 300524 (0.0031) [2024-06-19 06:51:56,727][26599] Updated weights for policy 0, policy_version 300534 (0.0039) [2024-06-19 06:51:58,380][26367] Fps is (10 sec: 44237.2, 60 sec: 43144.7, 300 sec: 42320.7). Total num frames: 4924014592. Throughput: 0: 42648.1. Samples: 1191638580. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 06:51:58,380][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 06:52:01,770][26599] Updated weights for policy 0, policy_version 300544 (0.0037) [2024-06-19 06:52:03,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4924194816. Throughput: 0: 42679.6. Samples: 1191776780. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:03,380][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 06:52:04,844][26599] Updated weights for policy 0, policy_version 300554 (0.0030) [2024-06-19 06:52:08,380][26367] Fps is (10 sec: 39320.7, 60 sec: 42871.4, 300 sec: 42209.6). Total num frames: 4924407808. Throughput: 0: 42496.7. Samples: 1192026140. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:08,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 06:52:09,354][26599] Updated weights for policy 0, policy_version 300564 (0.0043) [2024-06-19 06:52:12,691][26599] Updated weights for policy 0, policy_version 300574 (0.0029) [2024-06-19 06:52:13,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42320.7). Total num frames: 4924653568. Throughput: 0: 42645.7. Samples: 1192279920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:13,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 06:52:13,697][26579] Signal inference workers to stop experience collection... (17600 times) [2024-06-19 06:52:13,698][26579] Signal inference workers to resume experience collection... (17600 times) [2024-06-19 06:52:13,717][26599] InferenceWorker_p0-w0: stopping experience collection (17600 times) [2024-06-19 06:52:13,718][26599] InferenceWorker_p0-w0: resuming experience collection (17600 times) [2024-06-19 06:52:17,457][26599] Updated weights for policy 0, policy_version 300584 (0.0038) [2024-06-19 06:52:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4924817408. 
Throughput: 0: 42630.6. Samples: 1192414920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:18,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 06:52:20,164][26599] Updated weights for policy 0, policy_version 300594 (0.0045) [2024-06-19 06:52:23,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42209.6). Total num frames: 4925046784. Throughput: 0: 42285.7. Samples: 1192657000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:23,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 06:52:25,030][26599] Updated weights for policy 0, policy_version 300604 (0.0045) [2024-06-19 06:52:27,749][26599] Updated weights for policy 0, policy_version 300614 (0.0042) [2024-06-19 06:52:28,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4925276160. Throughput: 0: 42577.0. Samples: 1192913540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:28,380][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 06:52:32,587][26599] Updated weights for policy 0, policy_version 300624 (0.0032) [2024-06-19 06:52:33,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4925456384. Throughput: 0: 42434.6. Samples: 1193046560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:33,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 06:52:35,545][26599] Updated weights for policy 0, policy_version 300634 (0.0034) [2024-06-19 06:52:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42321.2). Total num frames: 4925702144. Throughput: 0: 42389.8. Samples: 1193296160. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:38,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 06:52:40,422][26599] Updated weights for policy 0, policy_version 300644 (0.0045) [2024-06-19 06:52:43,046][26599] Updated weights for policy 0, policy_version 300654 (0.0040) [2024-06-19 06:52:43,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4925915136. Throughput: 0: 42494.5. Samples: 1193550840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:43,381][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 06:52:47,967][26599] Updated weights for policy 0, policy_version 300664 (0.0035) [2024-06-19 06:52:48,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42265.2). Total num frames: 4926095360. Throughput: 0: 42382.6. Samples: 1193684000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:48,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 06:52:50,703][26599] Updated weights for policy 0, policy_version 300674 (0.0025) [2024-06-19 06:52:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42265.2). Total num frames: 4926341120. Throughput: 0: 42446.3. Samples: 1193936220. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:53,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 06:52:55,664][26599] Updated weights for policy 0, policy_version 300684 (0.0036) [2024-06-19 06:52:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4926537728. Throughput: 0: 42568.6. Samples: 1194195500. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:52:58,380][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 06:52:58,561][26599] Updated weights for policy 0, policy_version 300694 (0.0036) [2024-06-19 06:53:03,113][26599] Updated weights for policy 0, policy_version 300704 (0.0038) [2024-06-19 06:53:03,381][26367] Fps is (10 sec: 40959.4, 60 sec: 42598.2, 300 sec: 42320.7). Total num frames: 4926750720. Throughput: 0: 42455.8. Samples: 1194325440. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:53:03,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 06:53:06,065][26599] Updated weights for policy 0, policy_version 300714 (0.0031) [2024-06-19 06:53:08,384][26367] Fps is (10 sec: 44219.9, 60 sec: 42868.9, 300 sec: 42209.1). Total num frames: 4926980096. Throughput: 0: 42748.0. Samples: 1194580820. Policy #0 lag: (min: 0.0, avg: 11.7, max: 21.0) [2024-06-19 06:53:08,385][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 06:53:10,734][26599] Updated weights for policy 0, policy_version 300724 (0.0038) [2024-06-19 06:53:13,380][26367] Fps is (10 sec: 42599.5, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4927176704. Throughput: 0: 42827.1. Samples: 1194840760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:13,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 06:53:13,744][26599] Updated weights for policy 0, policy_version 300734 (0.0039) [2024-06-19 06:53:17,138][26579] Signal inference workers to stop experience collection... (17650 times) [2024-06-19 06:53:17,176][26599] InferenceWorker_p0-w0: stopping experience collection (17650 times) [2024-06-19 06:53:17,195][26579] Signal inference workers to resume experience collection... 
(17650 times) [2024-06-19 06:53:17,195][26599] InferenceWorker_p0-w0: resuming experience collection (17650 times) [2024-06-19 06:53:18,154][26599] Updated weights for policy 0, policy_version 300744 (0.0026) [2024-06-19 06:53:18,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 4927389696. Throughput: 0: 42671.1. Samples: 1194966760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:18,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 06:53:21,359][26599] Updated weights for policy 0, policy_version 300754 (0.0032) [2024-06-19 06:53:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4927619072. Throughput: 0: 42793.3. Samples: 1195221860. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:23,384][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 06:53:25,909][26599] Updated weights for policy 0, policy_version 300764 (0.0035) [2024-06-19 06:53:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4927815680. Throughput: 0: 42900.0. Samples: 1195481340. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:28,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 06:53:29,064][26599] Updated weights for policy 0, policy_version 300774 (0.0032) [2024-06-19 06:53:33,325][26599] Updated weights for policy 0, policy_version 300784 (0.0039) [2024-06-19 06:53:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42431.8). Total num frames: 4928045056. Throughput: 0: 42679.5. Samples: 1195604580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:33,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 06:53:36,621][26599] Updated weights for policy 0, policy_version 300794 (0.0029) [2024-06-19 06:53:38,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4928258048. Throughput: 0: 42713.1. Samples: 1195858300. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:38,380][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 06:53:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000300797_4928258048.pth... [2024-06-19 06:53:38,466][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000300175_4918067200.pth [2024-06-19 06:53:41,046][26599] Updated weights for policy 0, policy_version 300804 (0.0034) [2024-06-19 06:53:43,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42432.1). Total num frames: 4928454656. Throughput: 0: 42753.7. Samples: 1196119420. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:43,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 06:53:44,378][26599] Updated weights for policy 0, policy_version 300814 (0.0042) [2024-06-19 06:53:48,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4928667648. Throughput: 0: 42595.6. Samples: 1196242240. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:48,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 06:53:48,832][26599] Updated weights for policy 0, policy_version 300824 (0.0034) [2024-06-19 06:53:52,075][26599] Updated weights for policy 0, policy_version 300834 (0.0030) [2024-06-19 06:53:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 4928897024. Throughput: 0: 42566.7. Samples: 1196496160. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:53,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 06:53:56,542][26599] Updated weights for policy 0, policy_version 300844 (0.0039) [2024-06-19 06:53:58,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4929093632. Throughput: 0: 42635.1. Samples: 1196759340. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:53:58,380][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 06:53:59,653][26599] Updated weights for policy 0, policy_version 300854 (0.0039) [2024-06-19 06:54:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.6, 300 sec: 42265.2). Total num frames: 4929290240. Throughput: 0: 42540.2. Samples: 1196881060. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:54:03,380][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 06:54:04,398][26599] Updated weights for policy 0, policy_version 300864 (0.0040) [2024-06-19 06:54:07,814][26599] Updated weights for policy 0, policy_version 300874 (0.0037) [2024-06-19 06:54:08,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42601.0, 300 sec: 42542.9). Total num frames: 4929536000. Throughput: 0: 42471.5. Samples: 1197133080. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:54:08,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 06:54:12,094][26599] Updated weights for policy 0, policy_version 300884 (0.0033) [2024-06-19 06:54:13,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4929732608. Throughput: 0: 42457.3. Samples: 1197391920. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:54:13,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 06:54:15,724][26599] Updated weights for policy 0, policy_version 300894 (0.0041) [2024-06-19 06:54:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4929945600. Throughput: 0: 42380.4. Samples: 1197511700. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:54:18,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 06:54:19,824][26599] Updated weights for policy 0, policy_version 300904 (0.0040) [2024-06-19 06:54:23,305][26599] Updated weights for policy 0, policy_version 300914 (0.0028) [2024-06-19 06:54:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4930174976. Throughput: 0: 42558.6. Samples: 1197773440. Policy #0 lag: (min: 1.0, avg: 11.3, max: 24.0) [2024-06-19 06:54:23,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 06:54:27,669][26599] Updated weights for policy 0, policy_version 300924 (0.0026) [2024-06-19 06:54:28,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 4930371584. Throughput: 0: 42381.0. Samples: 1198026560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:54:28,380][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 06:54:31,180][26599] Updated weights for policy 0, policy_version 300934 (0.0035) [2024-06-19 06:54:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4930584576. Throughput: 0: 42454.8. Samples: 1198152700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:54:33,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 06:54:35,182][26599] Updated weights for policy 0, policy_version 300944 (0.0042) [2024-06-19 06:54:38,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42598.9). Total num frames: 4930813952. Throughput: 0: 42481.2. Samples: 1198407820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:54:38,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 06:54:38,828][26599] Updated weights for policy 0, policy_version 300954 (0.0036) [2024-06-19 06:54:42,708][26599] Updated weights for policy 0, policy_version 300964 (0.0036) [2024-06-19 06:54:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4931026944. Throughput: 0: 42314.2. Samples: 1198663480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:54:43,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 06:54:46,474][26599] Updated weights for policy 0, policy_version 300974 (0.0032) [2024-06-19 06:54:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4931223552. Throughput: 0: 42482.0. Samples: 1198792760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:54:48,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 06:54:50,293][26599] Updated weights for policy 0, policy_version 300984 (0.0050) [2024-06-19 06:54:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4931436544. Throughput: 0: 42454.8. Samples: 1199043540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:54:53,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 06:54:54,186][26599] Updated weights for policy 0, policy_version 300994 (0.0030) [2024-06-19 06:54:57,200][26579] Signal inference workers to stop experience collection... (17700 times) [2024-06-19 06:54:57,234][26599] InferenceWorker_p0-w0: stopping experience collection (17700 times) [2024-06-19 06:54:57,258][26579] Signal inference workers to resume experience collection... 
(17700 times) [2024-06-19 06:54:57,259][26599] InferenceWorker_p0-w0: resuming experience collection (17700 times) [2024-06-19 06:54:57,891][26599] Updated weights for policy 0, policy_version 301004 (0.0036) [2024-06-19 06:54:58,380][26367] Fps is (10 sec: 42599.5, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4931649536. Throughput: 0: 42196.6. Samples: 1199290760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:54:58,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 06:55:02,310][26599] Updated weights for policy 0, policy_version 301014 (0.0032) [2024-06-19 06:55:03,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4931846144. Throughput: 0: 42533.8. Samples: 1199425720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:55:03,381][26367] Avg episode reward: [(0, '0.802')] [2024-06-19 06:55:05,605][26599] Updated weights for policy 0, policy_version 301024 (0.0038) [2024-06-19 06:55:08,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4932075520. Throughput: 0: 42427.6. Samples: 1199682680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:55:08,381][26367] Avg episode reward: [(0, '0.802')] [2024-06-19 06:55:09,871][26599] Updated weights for policy 0, policy_version 301034 (0.0034) [2024-06-19 06:55:13,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4932288512. Throughput: 0: 42411.5. Samples: 1199935080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:55:13,380][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 06:55:13,418][26599] Updated weights for policy 0, policy_version 301044 (0.0030) [2024-06-19 06:55:17,764][26599] Updated weights for policy 0, policy_version 301054 (0.0034) [2024-06-19 06:55:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4932485120. Throughput: 0: 42385.3. Samples: 1200060040. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:55:18,384][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 06:55:21,313][26599] Updated weights for policy 0, policy_version 301064 (0.0043) [2024-06-19 06:55:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4932714496. Throughput: 0: 42416.1. Samples: 1200316540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:55:23,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 06:55:25,373][26599] Updated weights for policy 0, policy_version 301074 (0.0033) [2024-06-19 06:55:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 4932927488. Throughput: 0: 42457.3. Samples: 1200574060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:55:28,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 06:55:28,963][26599] Updated weights for policy 0, policy_version 301084 (0.0039) [2024-06-19 06:55:32,982][26599] Updated weights for policy 0, policy_version 301094 (0.0030) [2024-06-19 06:55:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4933140480. Throughput: 0: 42406.8. Samples: 1200701060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:55:33,381][26367] Avg episode reward: [(0, '0.839')] [2024-06-19 06:55:36,877][26599] Updated weights for policy 0, policy_version 301104 (0.0030) [2024-06-19 06:55:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4933353472. Throughput: 0: 42491.5. Samples: 1200955660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 06:55:38,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 06:55:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000301108_4933353472.pth... 
[2024-06-19 06:55:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000300484_4923129856.pth [2024-06-19 06:55:40,742][26599] Updated weights for policy 0, policy_version 301114 (0.0033) [2024-06-19 06:55:43,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 4933550080. Throughput: 0: 42500.3. Samples: 1201203280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:55:43,381][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 06:55:44,699][26599] Updated weights for policy 0, policy_version 301124 (0.0041) [2024-06-19 06:55:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 4933763072. Throughput: 0: 42385.9. Samples: 1201333080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:55:48,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 06:55:48,601][26599] Updated weights for policy 0, policy_version 301134 (0.0032) [2024-06-19 06:55:52,976][26599] Updated weights for policy 0, policy_version 301144 (0.0037) [2024-06-19 06:55:53,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4933959680. Throughput: 0: 42411.2. Samples: 1201591180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:55:53,380][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 06:55:56,346][26599] Updated weights for policy 0, policy_version 301154 (0.0032) [2024-06-19 06:55:58,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 4934205440. Throughput: 0: 42313.7. Samples: 1201839200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:55:58,381][26367] Avg episode reward: [(0, '0.788')] [2024-06-19 06:56:00,559][26599] Updated weights for policy 0, policy_version 301164 (0.0040) [2024-06-19 06:56:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4934385664. Throughput: 0: 42440.0. Samples: 1201969840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:03,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 06:56:04,042][26599] Updated weights for policy 0, policy_version 301174 (0.0036) [2024-06-19 06:56:08,191][26599] Updated weights for policy 0, policy_version 301184 (0.0037) [2024-06-19 06:56:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 4934598656. Throughput: 0: 42305.7. Samples: 1202220300. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:08,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 06:56:11,767][26599] Updated weights for policy 0, policy_version 301194 (0.0042) [2024-06-19 06:56:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4934828032. Throughput: 0: 42293.8. Samples: 1202477280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:13,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 06:56:15,810][26599] Updated weights for policy 0, policy_version 301204 (0.0035) [2024-06-19 06:56:18,384][26367] Fps is (10 sec: 44221.1, 60 sec: 42595.8, 300 sec: 42542.3). Total num frames: 4935041024. Throughput: 0: 42360.1. Samples: 1202607420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:18,384][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 06:56:19,433][26599] Updated weights for policy 0, policy_version 301214 (0.0045) [2024-06-19 06:56:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4935237632. Throughput: 0: 42290.3. Samples: 1202858720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:23,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 06:56:23,506][26599] Updated weights for policy 0, policy_version 301224 (0.0032) [2024-06-19 06:56:27,318][26579] Signal inference workers to stop experience collection... 
(17750 times) [2024-06-19 06:56:27,319][26579] Signal inference workers to resume experience collection... (17750 times) [2024-06-19 06:56:27,365][26599] InferenceWorker_p0-w0: stopping experience collection (17750 times) [2024-06-19 06:56:27,366][26599] InferenceWorker_p0-w0: resuming experience collection (17750 times) [2024-06-19 06:56:27,459][26599] Updated weights for policy 0, policy_version 301234 (0.0032) [2024-06-19 06:56:28,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4935467008. Throughput: 0: 42438.3. Samples: 1203113000. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:28,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 06:56:31,148][26599] Updated weights for policy 0, policy_version 301244 (0.0026) [2024-06-19 06:56:33,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 4935680000. Throughput: 0: 42390.5. Samples: 1203240660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:33,381][26367] Avg episode reward: [(0, '0.349')] [2024-06-19 06:56:35,025][26599] Updated weights for policy 0, policy_version 301254 (0.0033) [2024-06-19 06:56:38,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42049.7, 300 sec: 42431.3). Total num frames: 4935876608. Throughput: 0: 42241.8. Samples: 1203492220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:38,385][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 06:56:38,742][26599] Updated weights for policy 0, policy_version 301264 (0.0034) [2024-06-19 06:56:42,662][26599] Updated weights for policy 0, policy_version 301274 (0.0033) [2024-06-19 06:56:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4936105984. Throughput: 0: 42458.6. Samples: 1203749840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:43,381][26367] Avg episode reward: [(0, '0.791')] [2024-06-19 06:56:46,387][26599] Updated weights for policy 0, policy_version 301284 (0.0043) [2024-06-19 06:56:48,380][26367] Fps is (10 sec: 44252.4, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 4936318976. Throughput: 0: 42449.6. Samples: 1203880080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:48,381][26367] Avg episode reward: [(0, '0.792')] [2024-06-19 06:56:50,236][26599] Updated weights for policy 0, policy_version 301294 (0.0032) [2024-06-19 06:56:53,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4936515584. Throughput: 0: 42572.1. Samples: 1204136040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 22.0) [2024-06-19 06:56:53,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 06:56:54,058][26599] Updated weights for policy 0, policy_version 301304 (0.0029) [2024-06-19 06:56:57,814][26599] Updated weights for policy 0, policy_version 301314 (0.0033) [2024-06-19 06:56:58,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4936728576. Throughput: 0: 42376.9. Samples: 1204384240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:56:58,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 06:57:01,999][26599] Updated weights for policy 0, policy_version 301324 (0.0041) [2024-06-19 06:57:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4936941568. Throughput: 0: 42391.9. Samples: 1204514900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:03,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 06:57:05,790][26599] Updated weights for policy 0, policy_version 301334 (0.0041) [2024-06-19 06:57:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4937154560. Throughput: 0: 42431.0. Samples: 1204768120. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:08,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 06:57:09,666][26599] Updated weights for policy 0, policy_version 301344 (0.0039) [2024-06-19 06:57:13,344][26599] Updated weights for policy 0, policy_version 301354 (0.0053) [2024-06-19 06:57:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4937383936. Throughput: 0: 42576.4. Samples: 1205028940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:13,381][26367] Avg episode reward: [(0, '0.787')] [2024-06-19 06:57:17,290][26599] Updated weights for policy 0, policy_version 301364 (0.0037) [2024-06-19 06:57:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42327.9, 300 sec: 42487.3). Total num frames: 4937580544. Throughput: 0: 42619.7. Samples: 1205158540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:18,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 06:57:21,131][26599] Updated weights for policy 0, policy_version 301374 (0.0039) [2024-06-19 06:57:23,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4937793536. Throughput: 0: 42546.3. Samples: 1205406640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:23,380][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 06:57:24,958][26599] Updated weights for policy 0, policy_version 301384 (0.0029) [2024-06-19 06:57:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4938022912. Throughput: 0: 42667.7. Samples: 1205669880. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:28,380][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 06:57:28,438][26599] Updated weights for policy 0, policy_version 301394 (0.0028) [2024-06-19 06:57:32,418][26599] Updated weights for policy 0, policy_version 301404 (0.0043) [2024-06-19 06:57:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 4938219520. Throughput: 0: 42679.8. Samples: 1205800660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:33,381][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 06:57:35,835][26599] Updated weights for policy 0, policy_version 301414 (0.0040) [2024-06-19 06:57:38,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42874.0, 300 sec: 42487.3). Total num frames: 4938448896. Throughput: 0: 42560.2. Samples: 1206051260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:38,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 06:57:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000301419_4938448896.pth... [2024-06-19 06:57:38,455][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000300797_4928258048.pth [2024-06-19 06:57:40,394][26599] Updated weights for policy 0, policy_version 301424 (0.0041) [2024-06-19 06:57:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4938661888. Throughput: 0: 42784.4. Samples: 1206309540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:43,381][26367] Avg episode reward: [(0, '0.847')] [2024-06-19 06:57:43,807][26599] Updated weights for policy 0, policy_version 301434 (0.0034) [2024-06-19 06:57:47,964][26599] Updated weights for policy 0, policy_version 301444 (0.0038) [2024-06-19 06:57:48,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4938858496. Throughput: 0: 42664.3. Samples: 1206434800. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:48,381][26367] Avg episode reward: [(0, '0.859')] [2024-06-19 06:57:51,499][26599] Updated weights for policy 0, policy_version 301454 (0.0029) [2024-06-19 06:57:53,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4939087872. Throughput: 0: 42697.0. Samples: 1206689480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:53,380][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 06:57:54,897][26579] Signal inference workers to stop experience collection... (17800 times) [2024-06-19 06:57:54,923][26599] InferenceWorker_p0-w0: stopping experience collection (17800 times) [2024-06-19 06:57:54,958][26579] Signal inference workers to resume experience collection... (17800 times) [2024-06-19 06:57:54,958][26599] InferenceWorker_p0-w0: resuming experience collection (17800 times) [2024-06-19 06:57:55,561][26599] Updated weights for policy 0, policy_version 301464 (0.0034) [2024-06-19 06:57:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 4939300864. Throughput: 0: 42714.7. Samples: 1206951100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:57:58,381][26367] Avg episode reward: [(0, '0.297')] [2024-06-19 06:57:59,054][26599] Updated weights for policy 0, policy_version 301474 (0.0035) [2024-06-19 06:58:03,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42376.8). Total num frames: 4939481088. Throughput: 0: 42644.0. Samples: 1207077520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 06:58:03,380][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 06:58:03,533][26599] Updated weights for policy 0, policy_version 301484 (0.0028) [2024-06-19 06:58:06,754][26599] Updated weights for policy 0, policy_version 301494 (0.0031) [2024-06-19 06:58:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4939726848. 
Throughput: 0: 42793.2. Samples: 1207332340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:08,381][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 06:58:11,201][26599] Updated weights for policy 0, policy_version 301504 (0.0035) [2024-06-19 06:58:13,380][26367] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4939956224. Throughput: 0: 42741.7. Samples: 1207593260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:13,381][26367] Avg episode reward: [(0, '0.864')] [2024-06-19 06:58:14,461][26599] Updated weights for policy 0, policy_version 301514 (0.0028) [2024-06-19 06:58:18,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42595.8, 300 sec: 42431.3). Total num frames: 4940136448. Throughput: 0: 42701.8. Samples: 1207722400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:18,385][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 06:58:19,082][26599] Updated weights for policy 0, policy_version 301524 (0.0035) [2024-06-19 06:58:22,144][26599] Updated weights for policy 0, policy_version 301534 (0.0041) [2024-06-19 06:58:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 4940365824. Throughput: 0: 42775.6. Samples: 1207976160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:23,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 06:58:26,633][26599] Updated weights for policy 0, policy_version 301544 (0.0031) [2024-06-19 06:58:28,380][26367] Fps is (10 sec: 47530.6, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 4940611584. Throughput: 0: 42666.6. Samples: 1208229540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:28,389][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 06:58:29,638][26599] Updated weights for policy 0, policy_version 301554 (0.0038) [2024-06-19 06:58:33,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4940775424. 
Throughput: 0: 42910.9. Samples: 1208365780. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:33,380][26367] Avg episode reward: [(0, '0.787')] [2024-06-19 06:58:33,991][26599] Updated weights for policy 0, policy_version 301564 (0.0036) [2024-06-19 06:58:37,513][26599] Updated weights for policy 0, policy_version 301574 (0.0035) [2024-06-19 06:58:38,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 4941004800. Throughput: 0: 42742.6. Samples: 1208612900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:38,390][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 06:58:42,085][26599] Updated weights for policy 0, policy_version 301584 (0.0034) [2024-06-19 06:58:43,381][26367] Fps is (10 sec: 47509.6, 60 sec: 43144.1, 300 sec: 42653.9). Total num frames: 4941250560. Throughput: 0: 42581.6. Samples: 1208867300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:43,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 06:58:45,179][26599] Updated weights for policy 0, policy_version 301594 (0.0033) [2024-06-19 06:58:48,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 4941398016. Throughput: 0: 42651.5. Samples: 1208996840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:48,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 06:58:49,684][26599] Updated weights for policy 0, policy_version 301604 (0.0041) [2024-06-19 06:58:52,584][26599] Updated weights for policy 0, policy_version 301614 (0.0039) [2024-06-19 06:58:53,380][26367] Fps is (10 sec: 40963.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4941660160. Throughput: 0: 42569.4. Samples: 1209247960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:53,380][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 06:58:57,494][26599] Updated weights for policy 0, policy_version 301624 (0.0036) [2024-06-19 06:58:58,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4941873152. Throughput: 0: 42593.0. Samples: 1209509940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:58:58,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 06:58:59,766][26579] Signal inference workers to stop experience collection... (17850 times) [2024-06-19 06:58:59,766][26579] Signal inference workers to resume experience collection... (17850 times) [2024-06-19 06:58:59,782][26599] InferenceWorker_p0-w0: stopping experience collection (17850 times) [2024-06-19 06:58:59,782][26599] InferenceWorker_p0-w0: resuming experience collection (17850 times) [2024-06-19 06:59:00,074][26599] Updated weights for policy 0, policy_version 301634 (0.0042) [2024-06-19 06:59:03,380][26367] Fps is (10 sec: 37682.7, 60 sec: 42598.3, 300 sec: 42376.3). Total num frames: 4942036992. Throughput: 0: 42409.7. Samples: 1209630680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:59:03,381][26367] Avg episode reward: [(0, '0.323')] [2024-06-19 06:59:05,253][26599] Updated weights for policy 0, policy_version 301644 (0.0034) [2024-06-19 06:59:07,955][26599] Updated weights for policy 0, policy_version 301654 (0.0046) [2024-06-19 06:59:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4942299136. Throughput: 0: 42400.1. Samples: 1209884160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:59:08,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 06:59:12,845][26599] Updated weights for policy 0, policy_version 301664 (0.0032) [2024-06-19 06:59:13,384][26367] Fps is (10 sec: 44221.5, 60 sec: 42049.9, 300 sec: 42486.8). Total num frames: 4942479360. 
Throughput: 0: 42639.0. Samples: 1210148440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:59:13,384][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 06:59:15,690][26599] Updated weights for policy 0, policy_version 301674 (0.0040) [2024-06-19 06:59:18,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42601.0, 300 sec: 42431.8). Total num frames: 4942692352. Throughput: 0: 42195.8. Samples: 1210264600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 06:59:18,381][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 06:59:20,352][26599] Updated weights for policy 0, policy_version 301684 (0.0033) [2024-06-19 06:59:23,380][26367] Fps is (10 sec: 47530.6, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4942954496. Throughput: 0: 42505.9. Samples: 1210525660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 06:59:23,380][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 06:59:23,386][26599] Updated weights for policy 0, policy_version 301694 (0.0034) [2024-06-19 06:59:28,161][26599] Updated weights for policy 0, policy_version 301704 (0.0031) [2024-06-19 06:59:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 4943118336. Throughput: 0: 42610.4. Samples: 1210784740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 06:59:28,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 06:59:31,174][26599] Updated weights for policy 0, policy_version 301714 (0.0044) [2024-06-19 06:59:33,380][26367] Fps is (10 sec: 37682.5, 60 sec: 42598.2, 300 sec: 42431.8). Total num frames: 4943331328. Throughput: 0: 42401.7. Samples: 1210904920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 06:59:33,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 06:59:35,845][26599] Updated weights for policy 0, policy_version 301724 (0.0033) [2024-06-19 06:59:38,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4943577088. 
Throughput: 0: 42631.5. Samples: 1211166380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 06:59:38,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 06:59:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000301732_4943577088.pth... [2024-06-19 06:59:38,431][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000301108_4933353472.pth [2024-06-19 06:59:38,959][26599] Updated weights for policy 0, policy_version 301734 (0.0032) [2024-06-19 06:59:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41506.5, 300 sec: 42431.8). Total num frames: 4943740928. Throughput: 0: 42551.3. Samples: 1211424760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 06:59:43,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 06:59:43,669][26599] Updated weights for policy 0, policy_version 301744 (0.0035) [2024-06-19 06:59:46,678][26599] Updated weights for policy 0, policy_version 301754 (0.0037) [2024-06-19 06:59:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 4943986688. Throughput: 0: 42385.3. Samples: 1211538020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 06:59:48,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 06:59:51,202][26599] Updated weights for policy 0, policy_version 301764 (0.0034) [2024-06-19 06:59:53,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 4944216064. Throughput: 0: 42722.5. Samples: 1211806680. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 06:59:53,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 06:59:54,502][26599] Updated weights for policy 0, policy_version 301774 (0.0037) [2024-06-19 06:59:58,335][26579] Signal inference workers to stop experience collection... (17900 times) [2024-06-19 06:59:58,380][26367] Fps is (10 sec: 37683.8, 60 sec: 41506.2, 300 sec: 42431.8). Total num frames: 4944363520. Throughput: 0: 42523.8. 
Samples: 1212061860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 06:59:58,380][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 06:59:58,388][26599] InferenceWorker_p0-w0: stopping experience collection (17900 times) [2024-06-19 06:59:58,391][26579] Signal inference workers to resume experience collection... (17900 times) [2024-06-19 06:59:58,397][26599] InferenceWorker_p0-w0: resuming experience collection (17900 times) [2024-06-19 06:59:58,842][26599] Updated weights for policy 0, policy_version 301784 (0.0030) [2024-06-19 07:00:02,422][26599] Updated weights for policy 0, policy_version 301794 (0.0041) [2024-06-19 07:00:03,380][26367] Fps is (10 sec: 42598.9, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 4944642048. Throughput: 0: 42477.4. Samples: 1212176080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:00:03,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 07:00:07,140][26599] Updated weights for policy 0, policy_version 301804 (0.0033) [2024-06-19 07:00:08,380][26367] Fps is (10 sec: 47512.9, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 4944838656. Throughput: 0: 42617.2. Samples: 1212443440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:00:08,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 07:00:10,042][26599] Updated weights for policy 0, policy_version 301814 (0.0036) [2024-06-19 07:00:13,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42327.8, 300 sec: 42487.3). Total num frames: 4945018880. Throughput: 0: 42460.1. Samples: 1212695440. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:00:13,380][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 07:00:14,829][26599] Updated weights for policy 0, policy_version 301824 (0.0037) [2024-06-19 07:00:17,723][26599] Updated weights for policy 0, policy_version 301834 (0.0030) [2024-06-19 07:00:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 4945281024. 
Throughput: 0: 42473.4. Samples: 1212816220. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:00:18,381][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 07:00:22,506][26599] Updated weights for policy 0, policy_version 301844 (0.0030) [2024-06-19 07:00:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 41506.1, 300 sec: 42431.8). Total num frames: 4945444864. Throughput: 0: 42446.7. Samples: 1213076480. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:00:23,380][26367] Avg episode reward: [(0, '0.787')] [2024-06-19 07:00:25,419][26599] Updated weights for policy 0, policy_version 301854 (0.0034) [2024-06-19 07:00:28,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4945657856. Throughput: 0: 42384.5. Samples: 1213332060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:00:28,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 07:00:30,147][26599] Updated weights for policy 0, policy_version 301864 (0.0028) [2024-06-19 07:00:33,142][26599] Updated weights for policy 0, policy_version 301874 (0.0033) [2024-06-19 07:00:33,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4945903616. Throughput: 0: 42622.6. Samples: 1213456040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:00:33,384][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 07:00:37,845][26599] Updated weights for policy 0, policy_version 301884 (0.0029) [2024-06-19 07:00:38,380][26367] Fps is (10 sec: 42599.4, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 4946083840. Throughput: 0: 42523.8. Samples: 1213720240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:00:38,380][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 07:00:40,784][26599] Updated weights for policy 0, policy_version 301894 (0.0033) [2024-06-19 07:00:43,384][26367] Fps is (10 sec: 39307.6, 60 sec: 42595.9, 300 sec: 42486.8). Total num frames: 4946296832. 
Throughput: 0: 42328.9. Samples: 1213966820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:00:43,384][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 07:00:45,516][26599] Updated weights for policy 0, policy_version 301904 (0.0033) [2024-06-19 07:00:48,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4946542592. Throughput: 0: 42608.8. Samples: 1214093480. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:00:48,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 07:00:48,751][26599] Updated weights for policy 0, policy_version 301914 (0.0029) [2024-06-19 07:00:53,264][26599] Updated weights for policy 0, policy_version 301924 (0.0027) [2024-06-19 07:00:53,380][26367] Fps is (10 sec: 42613.7, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 4946722816. Throughput: 0: 42292.5. Samples: 1214346600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:00:53,384][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 07:00:56,412][26599] Updated weights for policy 0, policy_version 301934 (0.0028) [2024-06-19 07:00:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 43144.3, 300 sec: 42598.4). Total num frames: 4946952192. Throughput: 0: 42240.2. Samples: 1214596260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:00:58,381][26367] Avg episode reward: [(0, '0.391')] [2024-06-19 07:01:01,110][26599] Updated weights for policy 0, policy_version 301944 (0.0038) [2024-06-19 07:01:03,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4947181568. Throughput: 0: 42561.8. Samples: 1214731500. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:03,381][26367] Avg episode reward: [(0, '0.800')] [2024-06-19 07:01:04,356][26599] Updated weights for policy 0, policy_version 301954 (0.0042) [2024-06-19 07:01:05,204][26579] Signal inference workers to stop experience collection... 
(17950 times) [2024-06-19 07:01:05,237][26599] InferenceWorker_p0-w0: stopping experience collection (17950 times) [2024-06-19 07:01:05,261][26579] Signal inference workers to resume experience collection... (17950 times) [2024-06-19 07:01:05,261][26599] InferenceWorker_p0-w0: resuming experience collection (17950 times) [2024-06-19 07:01:08,380][26367] Fps is (10 sec: 39322.5, 60 sec: 41779.3, 300 sec: 42431.8). Total num frames: 4947345408. Throughput: 0: 42297.8. Samples: 1214979880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:08,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 07:01:08,702][26599] Updated weights for policy 0, policy_version 301964 (0.0046) [2024-06-19 07:01:12,060][26599] Updated weights for policy 0, policy_version 301974 (0.0032) [2024-06-19 07:01:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42543.4). Total num frames: 4947591168. Throughput: 0: 42274.6. Samples: 1215234420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:13,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 07:01:16,297][26599] Updated weights for policy 0, policy_version 301984 (0.0037) [2024-06-19 07:01:18,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 4947804160. Throughput: 0: 42508.6. Samples: 1215368920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:18,380][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 07:01:19,865][26599] Updated weights for policy 0, policy_version 301994 (0.0040) [2024-06-19 07:01:23,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 4947984384. Throughput: 0: 42211.4. Samples: 1215619760. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:23,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 07:01:24,053][26599] Updated weights for policy 0, policy_version 302004 (0.0034) [2024-06-19 07:01:27,477][26599] Updated weights for policy 0, policy_version 302014 (0.0027) [2024-06-19 07:01:28,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4948230144. Throughput: 0: 42365.5. Samples: 1215873120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:28,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 07:01:31,503][26599] Updated weights for policy 0, policy_version 302024 (0.0041) [2024-06-19 07:01:33,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42325.5, 300 sec: 42598.9). Total num frames: 4948443136. Throughput: 0: 42506.8. Samples: 1216006280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:33,380][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 07:01:35,171][26599] Updated weights for policy 0, policy_version 302034 (0.0029) [2024-06-19 07:01:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4948639744. Throughput: 0: 42520.4. Samples: 1216260020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:38,381][26367] Avg episode reward: [(0, '0.860')] [2024-06-19 07:01:38,525][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000302042_4948656128.pth... [2024-06-19 07:01:38,575][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000301419_4938448896.pth [2024-06-19 07:01:39,166][26599] Updated weights for policy 0, policy_version 302044 (0.0033) [2024-06-19 07:01:42,723][26599] Updated weights for policy 0, policy_version 302054 (0.0041) [2024-06-19 07:01:43,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42600.9, 300 sec: 42487.3). Total num frames: 4948852736. Throughput: 0: 42632.9. Samples: 1216514740. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:43,381][26367] Avg episode reward: [(0, '0.867')] [2024-06-19 07:01:46,856][26599] Updated weights for policy 0, policy_version 302064 (0.0031) [2024-06-19 07:01:48,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 4949082112. Throughput: 0: 42481.0. Samples: 1216643140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 07:01:48,381][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 07:01:50,331][26599] Updated weights for policy 0, policy_version 302074 (0.0043) [2024-06-19 07:01:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4949278720. Throughput: 0: 42617.2. Samples: 1216897660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:01:53,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 07:01:54,987][26599] Updated weights for policy 0, policy_version 302084 (0.0025) [2024-06-19 07:01:58,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 4949491712. Throughput: 0: 42586.8. Samples: 1217150820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:01:58,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 07:01:58,627][26599] Updated weights for policy 0, policy_version 302094 (0.0035) [2024-06-19 07:02:02,656][26599] Updated weights for policy 0, policy_version 302104 (0.0041) [2024-06-19 07:02:03,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 4949704704. Throughput: 0: 42429.8. Samples: 1217278260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:03,380][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 07:02:06,178][26599] Updated weights for policy 0, policy_version 302114 (0.0032) [2024-06-19 07:02:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 4949917696. Throughput: 0: 42667.1. Samples: 1217539780. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:08,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 07:02:08,816][26579] Signal inference workers to stop experience collection... (18000 times) [2024-06-19 07:02:08,817][26579] Signal inference workers to resume experience collection... (18000 times) [2024-06-19 07:02:08,863][26599] InferenceWorker_p0-w0: stopping experience collection (18000 times) [2024-06-19 07:02:08,863][26599] InferenceWorker_p0-w0: resuming experience collection (18000 times) [2024-06-19 07:02:10,168][26599] Updated weights for policy 0, policy_version 302124 (0.0046) [2024-06-19 07:02:13,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 4950130688. Throughput: 0: 42583.2. Samples: 1217789360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:13,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 07:02:14,030][26599] Updated weights for policy 0, policy_version 302134 (0.0030) [2024-06-19 07:02:17,764][26599] Updated weights for policy 0, policy_version 302144 (0.0043) [2024-06-19 07:02:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4950343680. Throughput: 0: 42574.2. Samples: 1217922120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:18,380][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 07:02:21,590][26599] Updated weights for policy 0, policy_version 302154 (0.0029) [2024-06-19 07:02:23,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 4950556672. Throughput: 0: 42635.2. Samples: 1218178600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:23,380][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 07:02:25,255][26599] Updated weights for policy 0, policy_version 302164 (0.0043) [2024-06-19 07:02:28,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4950786048. 
Throughput: 0: 42521.8. Samples: 1218428220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:28,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 07:02:29,224][26599] Updated weights for policy 0, policy_version 302174 (0.0033) [2024-06-19 07:02:32,849][26599] Updated weights for policy 0, policy_version 302184 (0.0041) [2024-06-19 07:02:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4950982656. Throughput: 0: 42531.9. Samples: 1218557080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:33,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 07:02:36,853][26599] Updated weights for policy 0, policy_version 302194 (0.0032) [2024-06-19 07:02:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 4951212032. Throughput: 0: 42563.9. Samples: 1218813040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:38,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 07:02:40,763][26599] Updated weights for policy 0, policy_version 302204 (0.0038) [2024-06-19 07:02:43,380][26367] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 4951441408. Throughput: 0: 42535.6. Samples: 1219064920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:43,381][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 07:02:44,762][26599] Updated weights for policy 0, policy_version 302214 (0.0032) [2024-06-19 07:02:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 4951621632. Throughput: 0: 42503.9. Samples: 1219190940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:48,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 07:02:48,709][26599] Updated weights for policy 0, policy_version 302224 (0.0031) [2024-06-19 07:02:52,252][26599] Updated weights for policy 0, policy_version 302234 (0.0038) [2024-06-19 07:02:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 4951851008. Throughput: 0: 42571.3. Samples: 1219455480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:53,380][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 07:02:56,282][26599] Updated weights for policy 0, policy_version 302244 (0.0031) [2024-06-19 07:02:58,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4952047616. Throughput: 0: 42592.9. Samples: 1219706040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:02:58,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 07:02:59,668][26599] Updated weights for policy 0, policy_version 302254 (0.0039) [2024-06-19 07:03:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4952260608. Throughput: 0: 42411.5. Samples: 1219830640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:03,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 07:03:03,940][26599] Updated weights for policy 0, policy_version 302264 (0.0026) [2024-06-19 07:03:07,517][26599] Updated weights for policy 0, policy_version 302274 (0.0038) [2024-06-19 07:03:08,380][26367] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 4952506368. Throughput: 0: 42574.2. Samples: 1220094440. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:08,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 07:03:11,344][26599] Updated weights for policy 0, policy_version 302284 (0.0036) [2024-06-19 07:03:13,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.9). Total num frames: 4952702976. Throughput: 0: 42796.9. Samples: 1220354080. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:13,382][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 07:03:15,035][26599] Updated weights for policy 0, policy_version 302294 (0.0026) [2024-06-19 07:03:18,380][26367] Fps is (10 sec: 39320.9, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 4952899584. Throughput: 0: 42713.7. Samples: 1220479200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:18,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 07:03:19,125][26599] Updated weights for policy 0, policy_version 302304 (0.0042) [2024-06-19 07:03:22,508][26599] Updated weights for policy 0, policy_version 302314 (0.0032) [2024-06-19 07:03:23,384][26367] Fps is (10 sec: 42583.1, 60 sec: 42868.8, 300 sec: 42431.3). Total num frames: 4953128960. Throughput: 0: 42781.5. Samples: 1220738360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:23,385][26367] Avg episode reward: [(0, '0.381')] [2024-06-19 07:03:26,815][26599] Updated weights for policy 0, policy_version 302324 (0.0039) [2024-06-19 07:03:28,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 4953325568. Throughput: 0: 43009.8. Samples: 1221000360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:28,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 07:03:30,063][26599] Updated weights for policy 0, policy_version 302334 (0.0039) [2024-06-19 07:03:33,380][26367] Fps is (10 sec: 44253.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 4953571328. Throughput: 0: 43031.2. Samples: 1221127340. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:33,380][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 07:03:34,534][26599] Updated weights for policy 0, policy_version 302344 (0.0033) [2024-06-19 07:03:37,509][26599] Updated weights for policy 0, policy_version 302354 (0.0037) [2024-06-19 07:03:38,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42487.4). Total num frames: 4953784320. Throughput: 0: 42873.6. Samples: 1221384800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:38,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 07:03:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000302355_4953784320.pth... [2024-06-19 07:03:38,451][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000301732_4943577088.pth [2024-06-19 07:03:39,324][26579] Signal inference workers to stop experience collection... (18050 times) [2024-06-19 07:03:39,324][26579] Signal inference workers to resume experience collection... (18050 times) [2024-06-19 07:03:39,351][26599] InferenceWorker_p0-w0: stopping experience collection (18050 times) [2024-06-19 07:03:39,351][26599] InferenceWorker_p0-w0: resuming experience collection (18050 times) [2024-06-19 07:03:42,306][26599] Updated weights for policy 0, policy_version 302364 (0.0035) [2024-06-19 07:03:43,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 4953964544. Throughput: 0: 43074.7. Samples: 1221644400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:43,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 07:03:45,183][26599] Updated weights for policy 0, policy_version 302374 (0.0037) [2024-06-19 07:03:48,380][26367] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 4954210304. Throughput: 0: 42995.5. Samples: 1221765440. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:48,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 07:03:49,882][26599] Updated weights for policy 0, policy_version 302384 (0.0030) [2024-06-19 07:03:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 4954406912. Throughput: 0: 42893.3. Samples: 1222024640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:53,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 07:03:53,415][26599] Updated weights for policy 0, policy_version 302394 (0.0033) [2024-06-19 07:03:57,402][26599] Updated weights for policy 0, policy_version 302404 (0.0042) [2024-06-19 07:03:58,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 4954603520. Throughput: 0: 42913.3. Samples: 1222285180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:03:58,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 07:04:00,993][26599] Updated weights for policy 0, policy_version 302414 (0.0037) [2024-06-19 07:04:03,380][26367] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 4954849280. Throughput: 0: 42828.7. Samples: 1222406480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:04:03,380][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 07:04:04,871][26599] Updated weights for policy 0, policy_version 302424 (0.0032) [2024-06-19 07:04:08,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42654.4). Total num frames: 4955062272. Throughput: 0: 42818.5. Samples: 1222665040. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:04:08,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 07:04:08,573][26599] Updated weights for policy 0, policy_version 302434 (0.0039) [2024-06-19 07:04:12,675][26599] Updated weights for policy 0, policy_version 302444 (0.0036) [2024-06-19 07:04:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4955242496. Throughput: 0: 42581.8. Samples: 1222916540. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 07:04:13,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 07:04:16,321][26599] Updated weights for policy 0, policy_version 302454 (0.0034) [2024-06-19 07:04:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 4955488256. Throughput: 0: 42666.6. Samples: 1223047340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:04:18,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 07:04:20,555][26599] Updated weights for policy 0, policy_version 302464 (0.0034) [2024-06-19 07:04:23,383][26367] Fps is (10 sec: 45864.5, 60 sec: 42872.4, 300 sec: 42653.6). Total num frames: 4955701248. Throughput: 0: 42802.7. Samples: 1223311020. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:04:23,383][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 07:04:23,775][26599] Updated weights for policy 0, policy_version 302474 (0.0023) [2024-06-19 07:04:28,193][26599] Updated weights for policy 0, policy_version 302484 (0.0042) [2024-06-19 07:04:28,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4955897856. Throughput: 0: 42920.4. Samples: 1223575820. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:04:28,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 07:04:31,742][26599] Updated weights for policy 0, policy_version 302494 (0.0051) [2024-06-19 07:04:33,384][26367] Fps is (10 sec: 44230.9, 60 sec: 42868.8, 300 sec: 42597.9). Total num frames: 4956143616. Throughput: 0: 42841.4. Samples: 1223693460. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:04:33,385][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 07:04:35,682][26599] Updated weights for policy 0, policy_version 302504 (0.0031) [2024-06-19 07:04:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 4956323840. Throughput: 0: 42867.8. Samples: 1223953700. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:04:38,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 07:04:39,239][26599] Updated weights for policy 0, policy_version 302514 (0.0029) [2024-06-19 07:04:43,320][26599] Updated weights for policy 0, policy_version 302524 (0.0038) [2024-06-19 07:04:43,380][26367] Fps is (10 sec: 40975.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 4956553216. Throughput: 0: 42709.9. Samples: 1224207120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:04:43,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 07:04:46,754][26599] Updated weights for policy 0, policy_version 302534 (0.0030) [2024-06-19 07:04:48,380][26367] Fps is (10 sec: 44237.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4956766208. Throughput: 0: 43003.9. Samples: 1224341660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:04:48,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 07:04:50,810][26599] Updated weights for policy 0, policy_version 302544 (0.0040) [2024-06-19 07:04:53,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4956979200. Throughput: 0: 42962.1. Samples: 1224598340. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:04:53,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 07:04:54,255][26599] Updated weights for policy 0, policy_version 302554 (0.0029) [2024-06-19 07:04:58,242][26599] Updated weights for policy 0, policy_version 302564 (0.0030) [2024-06-19 07:04:58,380][26367] Fps is (10 sec: 44236.3, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 4957208576. Throughput: 0: 43144.4. Samples: 1224858040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:04:58,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 07:05:01,906][26599] Updated weights for policy 0, policy_version 302574 (0.0044) [2024-06-19 07:05:03,380][26367] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 4957421568. Throughput: 0: 43026.8. Samples: 1224983540. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:05:03,380][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 07:05:05,679][26599] Updated weights for policy 0, policy_version 302584 (0.0033) [2024-06-19 07:05:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 4957618176. Throughput: 0: 42800.8. Samples: 1225236960. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:05:08,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 07:05:08,746][26579] Signal inference workers to stop experience collection... (18100 times) [2024-06-19 07:05:08,746][26579] Signal inference workers to resume experience collection... (18100 times) [2024-06-19 07:05:08,787][26599] InferenceWorker_p0-w0: stopping experience collection (18100 times) [2024-06-19 07:05:08,788][26599] InferenceWorker_p0-w0: resuming experience collection (18100 times) [2024-06-19 07:05:09,689][26599] Updated weights for policy 0, policy_version 302594 (0.0031) [2024-06-19 07:05:13,380][26367] Fps is (10 sec: 40959.2, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 4957831168. 
Throughput: 0: 42542.3. Samples: 1225490220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:05:13,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 07:05:13,983][26599] Updated weights for policy 0, policy_version 302604 (0.0027) [2024-06-19 07:05:17,523][26599] Updated weights for policy 0, policy_version 302614 (0.0028) [2024-06-19 07:05:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4958044160. Throughput: 0: 42843.9. Samples: 1225621280. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:05:18,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 07:05:21,455][26599] Updated weights for policy 0, policy_version 302624 (0.0037) [2024-06-19 07:05:23,384][26367] Fps is (10 sec: 42583.4, 60 sec: 42597.5, 300 sec: 42709.0). Total num frames: 4958257152. Throughput: 0: 42667.4. Samples: 1225873880. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:05:23,384][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 07:05:25,093][26599] Updated weights for policy 0, policy_version 302634 (0.0039) [2024-06-19 07:05:28,380][26367] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 4958486528. Throughput: 0: 42789.7. Samples: 1226132660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 26.0) [2024-06-19 07:05:28,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 07:05:28,978][26599] Updated weights for policy 0, policy_version 302644 (0.0034) [2024-06-19 07:05:32,623][26599] Updated weights for policy 0, policy_version 302654 (0.0025) [2024-06-19 07:05:33,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42601.0, 300 sec: 42765.0). Total num frames: 4958699520. Throughput: 0: 42826.2. Samples: 1226268840. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:05:33,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 07:05:36,375][26599] Updated weights for policy 0, policy_version 302664 (0.0039) [2024-06-19 07:05:38,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.7, 300 sec: 42710.0). Total num frames: 4958896128. Throughput: 0: 42814.9. Samples: 1226525000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:05:38,380][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 07:05:38,609][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000302669_4958928896.pth... [2024-06-19 07:05:38,655][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000302042_4948656128.pth [2024-06-19 07:05:40,055][26599] Updated weights for policy 0, policy_version 302674 (0.0027) [2024-06-19 07:05:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4959125504. Throughput: 0: 42628.0. Samples: 1226776300. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:05:43,381][26367] Avg episode reward: [(0, '0.381')] [2024-06-19 07:05:44,507][26599] Updated weights for policy 0, policy_version 302684 (0.0045) [2024-06-19 07:05:47,871][26599] Updated weights for policy 0, policy_version 302694 (0.0028) [2024-06-19 07:05:48,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 4959338496. Throughput: 0: 42690.3. Samples: 1226904760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:05:48,384][26367] Avg episode reward: [(0, '0.344')] [2024-06-19 07:05:52,053][26599] Updated weights for policy 0, policy_version 302704 (0.0038) [2024-06-19 07:05:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 4959535104. Throughput: 0: 42756.1. Samples: 1227160980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:05:53,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 07:05:55,509][26599] Updated weights for policy 0, policy_version 302714 (0.0034) [2024-06-19 07:05:58,380][26367] Fps is (10 sec: 42613.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 4959764480. Throughput: 0: 42837.8. Samples: 1227417920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:05:58,383][26367] Avg episode reward: [(0, '0.869')] [2024-06-19 07:05:59,752][26599] Updated weights for policy 0, policy_version 302724 (0.0038) [2024-06-19 07:06:03,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 4959977472. Throughput: 0: 42781.7. Samples: 1227546460. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:06:03,381][26367] Avg episode reward: [(0, '0.870')] [2024-06-19 07:06:03,479][26599] Updated weights for policy 0, policy_version 302734 (0.0041) [2024-06-19 07:06:07,312][26599] Updated weights for policy 0, policy_version 302744 (0.0031) [2024-06-19 07:06:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 4960174080. Throughput: 0: 42738.5. Samples: 1227796960. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:06:08,381][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 07:06:11,088][26599] Updated weights for policy 0, policy_version 302754 (0.0032) [2024-06-19 07:06:13,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 4960403456. Throughput: 0: 42758.3. Samples: 1228056780. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:06:13,380][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 07:06:15,342][26599] Updated weights for policy 0, policy_version 302764 (0.0029) [2024-06-19 07:06:18,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4960616448. Throughput: 0: 42679.1. Samples: 1228189400. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:06:18,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 07:06:18,749][26599] Updated weights for policy 0, policy_version 302774 (0.0034) [2024-06-19 07:06:22,926][26599] Updated weights for policy 0, policy_version 302784 (0.0027) [2024-06-19 07:06:23,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42601.0, 300 sec: 42654.0). Total num frames: 4960813056. Throughput: 0: 42555.5. Samples: 1228440000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:06:23,381][26367] Avg episode reward: [(0, '0.431')] [2024-06-19 07:06:26,334][26599] Updated weights for policy 0, policy_version 302794 (0.0034) [2024-06-19 07:06:28,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4961058816. Throughput: 0: 42689.8. Samples: 1228697340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:06:28,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 07:06:30,383][26599] Updated weights for policy 0, policy_version 302804 (0.0044) [2024-06-19 07:06:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 4961255424. Throughput: 0: 42838.0. Samples: 1228832320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:06:33,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 07:06:34,011][26599] Updated weights for policy 0, policy_version 302814 (0.0029) [2024-06-19 07:06:38,265][26579] Signal inference workers to stop experience collection... (18150 times) [2024-06-19 07:06:38,316][26599] InferenceWorker_p0-w0: stopping experience collection (18150 times) [2024-06-19 07:06:38,323][26579] Signal inference workers to resume experience collection... (18150 times) [2024-06-19 07:06:38,339][26599] InferenceWorker_p0-w0: resuming experience collection (18150 times) [2024-06-19 07:06:38,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 4961452032. Throughput: 0: 42656.0. 
Samples: 1229080500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:06:38,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 07:06:38,462][26599] Updated weights for policy 0, policy_version 302824 (0.0041) [2024-06-19 07:06:41,661][26599] Updated weights for policy 0, policy_version 302834 (0.0043) [2024-06-19 07:06:43,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4961697792. Throughput: 0: 42735.3. Samples: 1229341000. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 07:06:43,381][26367] Avg episode reward: [(0, '0.415')] [2024-06-19 07:06:45,899][26599] Updated weights for policy 0, policy_version 302844 (0.0046) [2024-06-19 07:06:48,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42601.1, 300 sec: 42765.0). Total num frames: 4961894400. Throughput: 0: 42741.5. Samples: 1229469820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:06:48,380][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 07:06:49,776][26599] Updated weights for policy 0, policy_version 302854 (0.0031) [2024-06-19 07:06:53,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4962107392. Throughput: 0: 42811.7. Samples: 1229723480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:06:53,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 07:06:53,449][26599] Updated weights for policy 0, policy_version 302864 (0.0040) [2024-06-19 07:06:57,329][26599] Updated weights for policy 0, policy_version 302874 (0.0034) [2024-06-19 07:06:58,384][26367] Fps is (10 sec: 44220.2, 60 sec: 42868.9, 300 sec: 42820.0). Total num frames: 4962336768. Throughput: 0: 42664.4. Samples: 1229976840. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:06:58,385][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 07:07:01,841][26599] Updated weights for policy 0, policy_version 302884 (0.0038) [2024-06-19 07:07:03,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 4962533376. Throughput: 0: 42615.2. Samples: 1230107080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:03,380][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 07:07:04,963][26599] Updated weights for policy 0, policy_version 302894 (0.0038) [2024-06-19 07:07:08,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 4962746368. Throughput: 0: 42647.5. Samples: 1230359140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:08,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 07:07:09,290][26599] Updated weights for policy 0, policy_version 302904 (0.0035) [2024-06-19 07:07:12,358][26599] Updated weights for policy 0, policy_version 302914 (0.0035) [2024-06-19 07:07:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 4962975744. Throughput: 0: 42614.3. Samples: 1230614980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:13,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 07:07:16,791][26599] Updated weights for policy 0, policy_version 302924 (0.0033) [2024-06-19 07:07:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42709.4). Total num frames: 4963155968. Throughput: 0: 42599.5. Samples: 1230749300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:18,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 07:07:19,830][26599] Updated weights for policy 0, policy_version 302934 (0.0031) [2024-06-19 07:07:23,380][26367] Fps is (10 sec: 39320.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 4963368960. Throughput: 0: 42546.1. Samples: 1230995080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:23,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 07:07:24,277][26599] Updated weights for policy 0, policy_version 302944 (0.0032) [2024-06-19 07:07:27,431][26599] Updated weights for policy 0, policy_version 302954 (0.0029) [2024-06-19 07:07:28,380][26367] Fps is (10 sec: 45876.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 4963614720. Throughput: 0: 42480.8. Samples: 1231252640. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:28,381][26367] Avg episode reward: [(0, '0.335')] [2024-06-19 07:07:32,599][26599] Updated weights for policy 0, policy_version 302964 (0.0032) [2024-06-19 07:07:33,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 4963778560. Throughput: 0: 42636.8. Samples: 1231388480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:33,381][26367] Avg episode reward: [(0, '0.488')] [2024-06-19 07:07:35,051][26599] Updated weights for policy 0, policy_version 302974 (0.0036) [2024-06-19 07:07:38,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 4964024320. Throughput: 0: 42541.1. Samples: 1231637840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:38,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 07:07:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000302980_4964024320.pth... [2024-06-19 07:07:38,462][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000302355_4953784320.pth [2024-06-19 07:07:40,013][26599] Updated weights for policy 0, policy_version 302984 (0.0035) [2024-06-19 07:07:43,079][26599] Updated weights for policy 0, policy_version 302994 (0.0030) [2024-06-19 07:07:43,380][26367] Fps is (10 sec: 47512.8, 60 sec: 42598.2, 300 sec: 42820.5). Total num frames: 4964253696. Throughput: 0: 42561.5. Samples: 1231891960. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:43,381][26367] Avg episode reward: [(0, '0.225')] [2024-06-19 07:07:47,536][26599] Updated weights for policy 0, policy_version 303004 (0.0027) [2024-06-19 07:07:48,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 4964417536. Throughput: 0: 42562.6. Samples: 1232022400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:48,381][26367] Avg episode reward: [(0, '0.204')] [2024-06-19 07:07:50,704][26579] Signal inference workers to stop experience collection... (18200 times) [2024-06-19 07:07:50,704][26579] Signal inference workers to resume experience collection... (18200 times) [2024-06-19 07:07:50,736][26599] InferenceWorker_p0-w0: stopping experience collection (18200 times) [2024-06-19 07:07:50,736][26599] InferenceWorker_p0-w0: resuming experience collection (18200 times) [2024-06-19 07:07:50,836][26599] Updated weights for policy 0, policy_version 303014 (0.0034) [2024-06-19 07:07:53,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 4964679680. Throughput: 0: 42535.2. Samples: 1232273220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:53,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 07:07:55,147][26599] Updated weights for policy 0, policy_version 303024 (0.0047) [2024-06-19 07:07:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42054.8, 300 sec: 42709.5). Total num frames: 4964859904. Throughput: 0: 42573.8. Samples: 1232530800. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 07:07:58,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 07:07:58,783][26599] Updated weights for policy 0, policy_version 303034 (0.0033) [2024-06-19 07:08:03,139][26599] Updated weights for policy 0, policy_version 303044 (0.0028) [2024-06-19 07:08:03,384][26367] Fps is (10 sec: 39307.3, 60 sec: 42322.7, 300 sec: 42597.9). Total num frames: 4965072896. 
Throughput: 0: 42313.6. Samples: 1232653560. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:03,384][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 07:08:06,779][26599] Updated weights for policy 0, policy_version 303054 (0.0036) [2024-06-19 07:08:08,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 4965318656. Throughput: 0: 42540.0. Samples: 1232909380. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:08,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 07:08:10,706][26599] Updated weights for policy 0, policy_version 303064 (0.0041) [2024-06-19 07:08:13,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42052.3, 300 sec: 42709.5). Total num frames: 4965498880. Throughput: 0: 42607.1. Samples: 1233169960. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:13,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 07:08:14,475][26599] Updated weights for policy 0, policy_version 303074 (0.0040) [2024-06-19 07:08:18,348][26599] Updated weights for policy 0, policy_version 303084 (0.0046) [2024-06-19 07:08:18,383][26367] Fps is (10 sec: 40950.3, 60 sec: 42869.8, 300 sec: 42709.6). Total num frames: 4965728256. Throughput: 0: 42155.4. Samples: 1233285580. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:18,383][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 07:08:22,333][26599] Updated weights for policy 0, policy_version 303094 (0.0044) [2024-06-19 07:08:23,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 4965941248. Throughput: 0: 42450.9. Samples: 1233548120. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:23,380][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 07:08:25,914][26599] Updated weights for policy 0, policy_version 303104 (0.0039) [2024-06-19 07:08:28,380][26367] Fps is (10 sec: 39332.0, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 4966121472. 
Throughput: 0: 42478.0. Samples: 1233803460. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:28,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 07:08:29,936][26599] Updated weights for policy 0, policy_version 303114 (0.0036) [2024-06-19 07:08:33,380][26367] Fps is (10 sec: 42597.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 4966367232. Throughput: 0: 42211.9. Samples: 1233921940. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:33,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 07:08:33,527][26599] Updated weights for policy 0, policy_version 303124 (0.0039) [2024-06-19 07:08:38,008][26599] Updated weights for policy 0, policy_version 303134 (0.0033) [2024-06-19 07:08:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.5, 300 sec: 42654.0). Total num frames: 4966547456. Throughput: 0: 42469.4. Samples: 1234184340. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:38,380][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 07:08:41,209][26599] Updated weights for policy 0, policy_version 303144 (0.0041) [2024-06-19 07:08:43,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 4966760448. Throughput: 0: 42093.7. Samples: 1234425020. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:43,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 07:08:45,723][26599] Updated weights for policy 0, policy_version 303154 (0.0034) [2024-06-19 07:08:48,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4966989824. Throughput: 0: 42251.4. Samples: 1234554720. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:48,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 07:08:49,050][26599] Updated weights for policy 0, policy_version 303164 (0.0044) [2024-06-19 07:08:53,384][26367] Fps is (10 sec: 40945.3, 60 sec: 41503.6, 300 sec: 42597.9). Total num frames: 4967170048. 
Throughput: 0: 42310.5. Samples: 1234813500. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:53,393][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 07:08:53,632][26599] Updated weights for policy 0, policy_version 303174 (0.0040) [2024-06-19 07:08:56,894][26599] Updated weights for policy 0, policy_version 303184 (0.0039) [2024-06-19 07:08:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4967399424. Throughput: 0: 41977.0. Samples: 1235058920. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:08:58,380][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 07:09:01,558][26599] Updated weights for policy 0, policy_version 303194 (0.0038) [2024-06-19 07:09:03,380][26367] Fps is (10 sec: 45892.4, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 4967628800. Throughput: 0: 42500.3. Samples: 1235197980. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:09:03,380][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 07:09:04,945][26599] Updated weights for policy 0, policy_version 303204 (0.0041) [2024-06-19 07:09:08,380][26367] Fps is (10 sec: 39320.7, 60 sec: 41233.1, 300 sec: 42542.8). Total num frames: 4967792640. Throughput: 0: 42049.6. Samples: 1235440360. Policy #0 lag: (min: 1.0, avg: 10.9, max: 20.0) [2024-06-19 07:09:08,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 07:09:08,616][26579] Signal inference workers to stop experience collection... (18250 times) [2024-06-19 07:09:08,664][26599] InferenceWorker_p0-w0: stopping experience collection (18250 times) [2024-06-19 07:09:08,671][26579] Signal inference workers to resume experience collection... 
(18250 times) [2024-06-19 07:09:08,678][26599] InferenceWorker_p0-w0: resuming experience collection (18250 times) [2024-06-19 07:09:09,353][26599] Updated weights for policy 0, policy_version 303214 (0.0033) [2024-06-19 07:09:12,631][26599] Updated weights for policy 0, policy_version 303224 (0.0028) [2024-06-19 07:09:13,382][26367] Fps is (10 sec: 42590.8, 60 sec: 42597.2, 300 sec: 42598.2). Total num frames: 4968054784. Throughput: 0: 41940.6. Samples: 1235690860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:13,383][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 07:09:17,021][26599] Updated weights for policy 0, policy_version 303234 (0.0042) [2024-06-19 07:09:18,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42054.0, 300 sec: 42543.2). Total num frames: 4968251392. Throughput: 0: 42248.6. Samples: 1235823120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:18,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 07:09:20,200][26599] Updated weights for policy 0, policy_version 303244 (0.0035) [2024-06-19 07:09:23,380][26367] Fps is (10 sec: 37689.3, 60 sec: 41506.0, 300 sec: 42487.3). Total num frames: 4968431616. Throughput: 0: 41863.9. Samples: 1236068220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:23,384][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 07:09:24,668][26599] Updated weights for policy 0, policy_version 303254 (0.0032) [2024-06-19 07:09:27,850][26599] Updated weights for policy 0, policy_version 303264 (0.0038) [2024-06-19 07:09:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42543.4). Total num frames: 4968693760. Throughput: 0: 42101.3. Samples: 1236319580. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:28,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 07:09:32,421][26599] Updated weights for policy 0, policy_version 303274 (0.0044) [2024-06-19 07:09:33,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42052.5, 300 sec: 42598.4). Total num frames: 4968890368. Throughput: 0: 42210.8. Samples: 1236454200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:33,381][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 07:09:35,642][26599] Updated weights for policy 0, policy_version 303284 (0.0031) [2024-06-19 07:09:38,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4969086976. Throughput: 0: 41999.8. Samples: 1236703340. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:38,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 07:09:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000303289_4969086976.pth... [2024-06-19 07:09:38,460][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000302669_4958928896.pth [2024-06-19 07:09:39,997][26599] Updated weights for policy 0, policy_version 303294 (0.0039) [2024-06-19 07:09:43,330][26599] Updated weights for policy 0, policy_version 303304 (0.0022) [2024-06-19 07:09:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4969332736. Throughput: 0: 42051.5. Samples: 1236951240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:43,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 07:09:47,668][26599] Updated weights for policy 0, policy_version 303314 (0.0037) [2024-06-19 07:09:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 4969512960. Throughput: 0: 41903.0. Samples: 1237083620. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:48,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 07:09:51,246][26599] Updated weights for policy 0, policy_version 303324 (0.0033) [2024-06-19 07:09:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42601.0, 300 sec: 42431.8). Total num frames: 4969725952. Throughput: 0: 42081.9. Samples: 1237334040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:53,380][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 07:09:55,363][26599] Updated weights for policy 0, policy_version 303334 (0.0034) [2024-06-19 07:09:58,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 4969955328. Throughput: 0: 42178.3. Samples: 1237588820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:09:58,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 07:09:59,272][26599] Updated weights for policy 0, policy_version 303344 (0.0043) [2024-06-19 07:10:03,313][26599] Updated weights for policy 0, policy_version 303354 (0.0028) [2024-06-19 07:10:03,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 4970151936. Throughput: 0: 42017.3. Samples: 1237713900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:10:03,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 07:10:07,183][26599] Updated weights for policy 0, policy_version 303364 (0.0031) [2024-06-19 07:10:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 4970364928. Throughput: 0: 42307.1. Samples: 1237972040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:10:08,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 07:10:11,017][26599] Updated weights for policy 0, policy_version 303374 (0.0041) [2024-06-19 07:10:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 41780.3, 300 sec: 42431.8). Total num frames: 4970561536. Throughput: 0: 42299.2. Samples: 1238223040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:10:13,381][26367] Avg episode reward: [(0, '0.809')] [2024-06-19 07:10:14,851][26599] Updated weights for policy 0, policy_version 303384 (0.0037) [2024-06-19 07:10:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42432.3). Total num frames: 4970774528. Throughput: 0: 42120.8. Samples: 1238349640. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:10:18,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 07:10:18,560][26599] Updated weights for policy 0, policy_version 303394 (0.0039) [2024-06-19 07:10:22,519][26599] Updated weights for policy 0, policy_version 303404 (0.0031) [2024-06-19 07:10:23,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42376.3). Total num frames: 4970987520. Throughput: 0: 42307.2. Samples: 1238607160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:10:23,380][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 07:10:26,140][26599] Updated weights for policy 0, policy_version 303414 (0.0039) [2024-06-19 07:10:28,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 4971216896. Throughput: 0: 42453.4. Samples: 1238861640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:10:28,380][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 07:10:30,182][26599] Updated weights for policy 0, policy_version 303424 (0.0029) [2024-06-19 07:10:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 4971413504. Throughput: 0: 42423.2. Samples: 1238992660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:10:33,380][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 07:10:33,967][26579] Signal inference workers to stop experience collection... (18300 times) [2024-06-19 07:10:33,967][26579] Signal inference workers to resume experience collection... 
(18300 times) [2024-06-19 07:10:34,008][26599] InferenceWorker_p0-w0: stopping experience collection (18300 times) [2024-06-19 07:10:34,009][26599] InferenceWorker_p0-w0: resuming experience collection (18300 times) [2024-06-19 07:10:34,113][26599] Updated weights for policy 0, policy_version 303434 (0.0047) [2024-06-19 07:10:38,309][26599] Updated weights for policy 0, policy_version 303444 (0.0036) [2024-06-19 07:10:38,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4971626496. Throughput: 0: 42435.4. Samples: 1239243640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:10:38,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 07:10:41,772][26599] Updated weights for policy 0, policy_version 303454 (0.0051) [2024-06-19 07:10:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42052.3, 300 sec: 42432.3). Total num frames: 4971855872. Throughput: 0: 42373.9. Samples: 1239495640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:10:43,381][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 07:10:46,059][26599] Updated weights for policy 0, policy_version 303464 (0.0037) [2024-06-19 07:10:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 4972036096. Throughput: 0: 42581.7. Samples: 1239630080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:10:48,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 07:10:49,538][26599] Updated weights for policy 0, policy_version 303474 (0.0037) [2024-06-19 07:10:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 4972265472. Throughput: 0: 42371.6. Samples: 1239878760. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:10:53,381][26367] Avg episode reward: [(0, '0.324')] [2024-06-19 07:10:53,513][26599] Updated weights for policy 0, policy_version 303484 (0.0036) [2024-06-19 07:10:57,253][26599] Updated weights for policy 0, policy_version 303494 (0.0037) [2024-06-19 07:10:58,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 4972494848. Throughput: 0: 42530.7. Samples: 1240136920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:10:58,381][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 07:11:00,898][26599] Updated weights for policy 0, policy_version 303504 (0.0028) [2024-06-19 07:11:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4972675072. Throughput: 0: 42588.4. Samples: 1240266120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:11:03,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 07:11:04,902][26599] Updated weights for policy 0, policy_version 303514 (0.0032) [2024-06-19 07:11:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4972920832. Throughput: 0: 42547.6. Samples: 1240521800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:11:08,380][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 07:11:08,580][26599] Updated weights for policy 0, policy_version 303524 (0.0031) [2024-06-19 07:11:12,943][26599] Updated weights for policy 0, policy_version 303534 (0.0043) [2024-06-19 07:11:13,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 4973133824. Throughput: 0: 42316.9. Samples: 1240765900. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:11:13,380][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 07:11:16,393][26599] Updated weights for policy 0, policy_version 303544 (0.0029) [2024-06-19 07:11:18,380][26367] Fps is (10 sec: 37682.6, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4973297664. Throughput: 0: 42233.2. Samples: 1240893160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:11:18,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 07:11:20,589][26599] Updated weights for policy 0, policy_version 303554 (0.0036) [2024-06-19 07:11:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4973543424. Throughput: 0: 42316.2. Samples: 1241147860. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:11:23,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 07:11:24,124][26599] Updated weights for policy 0, policy_version 303564 (0.0033) [2024-06-19 07:11:28,208][26599] Updated weights for policy 0, policy_version 303574 (0.0031) [2024-06-19 07:11:28,380][26367] Fps is (10 sec: 47514.0, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4973772800. Throughput: 0: 42504.9. Samples: 1241408360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:11:28,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 07:11:32,182][26599] Updated weights for policy 0, policy_version 303584 (0.0028) [2024-06-19 07:11:33,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42052.1, 300 sec: 42320.7). Total num frames: 4973936640. Throughput: 0: 42256.9. Samples: 1241531640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:11:33,384][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 07:11:35,737][26599] Updated weights for policy 0, policy_version 303594 (0.0023) [2024-06-19 07:11:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42376.2). Total num frames: 4974198784. Throughput: 0: 42328.0. Samples: 1241783520. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 07:11:38,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 07:11:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000303601_4974198784.pth... [2024-06-19 07:11:38,473][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000302980_4964024320.pth [2024-06-19 07:11:40,245][26599] Updated weights for policy 0, policy_version 303604 (0.0027) [2024-06-19 07:11:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4974379008. Throughput: 0: 42407.5. Samples: 1242045260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:11:43,381][26367] Avg episode reward: [(0, '0.433')] [2024-06-19 07:11:43,802][26599] Updated weights for policy 0, policy_version 303614 (0.0027) [2024-06-19 07:11:47,998][26599] Updated weights for policy 0, policy_version 303624 (0.0036) [2024-06-19 07:11:48,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42320.7). Total num frames: 4974592000. Throughput: 0: 42220.1. Samples: 1242166020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:11:48,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 07:11:51,463][26599] Updated weights for policy 0, policy_version 303634 (0.0038) [2024-06-19 07:11:53,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42871.5, 300 sec: 42376.8). Total num frames: 4974837760. Throughput: 0: 42251.0. Samples: 1242423100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:11:53,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 07:11:55,561][26599] Updated weights for policy 0, policy_version 303644 (0.0031) [2024-06-19 07:11:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4975017984. Throughput: 0: 42587.0. Samples: 1242682320. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:11:58,381][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 07:11:59,346][26599] Updated weights for policy 0, policy_version 303654 (0.0040) [2024-06-19 07:12:03,242][26599] Updated weights for policy 0, policy_version 303664 (0.0022) [2024-06-19 07:12:03,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42320.7). Total num frames: 4975230976. Throughput: 0: 42352.5. Samples: 1242799020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:03,381][26367] Avg episode reward: [(0, '0.785')] [2024-06-19 07:12:06,013][26579] Signal inference workers to stop experience collection... (18350 times) [2024-06-19 07:12:06,014][26579] Signal inference workers to resume experience collection... (18350 times) [2024-06-19 07:12:06,044][26599] InferenceWorker_p0-w0: stopping experience collection (18350 times) [2024-06-19 07:12:06,045][26599] InferenceWorker_p0-w0: resuming experience collection (18350 times) [2024-06-19 07:12:07,189][26599] Updated weights for policy 0, policy_version 303674 (0.0037) [2024-06-19 07:12:08,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4975476736. Throughput: 0: 42503.6. Samples: 1243060520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:08,380][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 07:12:10,955][26599] Updated weights for policy 0, policy_version 303684 (0.0033) [2024-06-19 07:12:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.1, 300 sec: 42320.7). Total num frames: 4975640576. Throughput: 0: 42293.6. Samples: 1243311580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:13,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 07:12:14,906][26599] Updated weights for policy 0, policy_version 303694 (0.0037) [2024-06-19 07:12:18,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42871.4, 300 sec: 42376.3). Total num frames: 4975869952. Throughput: 0: 42162.6. 
Samples: 1243428960. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:18,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 07:12:18,591][26599] Updated weights for policy 0, policy_version 303704 (0.0038) [2024-06-19 07:12:22,721][26599] Updated weights for policy 0, policy_version 303714 (0.0031) [2024-06-19 07:12:23,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.2, 300 sec: 42209.6). Total num frames: 4976066560. Throughput: 0: 42314.3. Samples: 1243687660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:23,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 07:12:26,889][26599] Updated weights for policy 0, policy_version 303724 (0.0029) [2024-06-19 07:12:28,380][26367] Fps is (10 sec: 39321.9, 60 sec: 41506.1, 300 sec: 42320.7). Total num frames: 4976263168. Throughput: 0: 42060.5. Samples: 1243937980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:28,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 07:12:30,631][26599] Updated weights for policy 0, policy_version 303734 (0.0046) [2024-06-19 07:12:33,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42320.7). Total num frames: 4976508928. Throughput: 0: 42184.3. Samples: 1244064320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:33,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 07:12:34,627][26599] Updated weights for policy 0, policy_version 303744 (0.0032) [2024-06-19 07:12:38,117][26599] Updated weights for policy 0, policy_version 303754 (0.0031) [2024-06-19 07:12:38,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42052.2, 300 sec: 42265.2). Total num frames: 4976721920. Throughput: 0: 42155.0. Samples: 1244320080. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:38,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 07:12:42,362][26599] Updated weights for policy 0, policy_version 303764 (0.0041) [2024-06-19 07:12:43,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4976902144. Throughput: 0: 42083.0. Samples: 1244576060. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:43,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 07:12:45,638][26599] Updated weights for policy 0, policy_version 303774 (0.0038) [2024-06-19 07:12:48,384][26367] Fps is (10 sec: 40945.8, 60 sec: 42322.8, 300 sec: 42209.1). Total num frames: 4977131520. Throughput: 0: 42217.1. Samples: 1244698940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:48,384][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 07:12:49,902][26599] Updated weights for policy 0, policy_version 303784 (0.0038) [2024-06-19 07:12:53,279][26599] Updated weights for policy 0, policy_version 303794 (0.0036) [2024-06-19 07:12:53,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 4977360896. Throughput: 0: 42160.8. Samples: 1244957760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 07:12:53,381][26367] Avg episode reward: [(0, '0.348')] [2024-06-19 07:12:57,478][26599] Updated weights for policy 0, policy_version 303804 (0.0044) [2024-06-19 07:12:58,380][26367] Fps is (10 sec: 39335.6, 60 sec: 41779.2, 300 sec: 42210.1). Total num frames: 4977524736. Throughput: 0: 42218.3. Samples: 1245211400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:12:58,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 07:13:00,939][26599] Updated weights for policy 0, policy_version 303814 (0.0033) [2024-06-19 07:13:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42265.2). Total num frames: 4977786880. Throughput: 0: 42216.9. Samples: 1245328720. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:03,381][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 07:13:05,370][26599] Updated weights for policy 0, policy_version 303824 (0.0037) [2024-06-19 07:13:08,384][26367] Fps is (10 sec: 47496.4, 60 sec: 42049.7, 300 sec: 42375.7). Total num frames: 4977999872. Throughput: 0: 42494.3. Samples: 1245600060. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:08,385][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 07:13:08,698][26599] Updated weights for policy 0, policy_version 303834 (0.0030) [2024-06-19 07:13:12,882][26599] Updated weights for policy 0, policy_version 303844 (0.0035) [2024-06-19 07:13:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42210.0). Total num frames: 4978180096. Throughput: 0: 42503.5. Samples: 1245850640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:13,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 07:13:16,215][26599] Updated weights for policy 0, policy_version 303854 (0.0032) [2024-06-19 07:13:18,380][26367] Fps is (10 sec: 44253.4, 60 sec: 42871.6, 300 sec: 42376.2). Total num frames: 4978442240. Throughput: 0: 42527.3. Samples: 1245978040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:18,381][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 07:13:19,937][26579] Signal inference workers to stop experience collection... (18400 times) [2024-06-19 07:13:19,937][26579] Signal inference workers to resume experience collection... (18400 times) [2024-06-19 07:13:19,977][26599] InferenceWorker_p0-w0: stopping experience collection (18400 times) [2024-06-19 07:13:19,977][26599] InferenceWorker_p0-w0: resuming experience collection (18400 times) [2024-06-19 07:13:20,976][26599] Updated weights for policy 0, policy_version 303864 (0.0043) [2024-06-19 07:13:23,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42376.2). Total num frames: 4978622464. 
Throughput: 0: 42556.2. Samples: 1246235100. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:23,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 07:13:24,195][26599] Updated weights for policy 0, policy_version 303874 (0.0041) [2024-06-19 07:13:28,380][26367] Fps is (10 sec: 36044.7, 60 sec: 42325.4, 300 sec: 42154.1). Total num frames: 4978802688. Throughput: 0: 42403.7. Samples: 1246484220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:28,381][26367] Avg episode reward: [(0, '0.420')] [2024-06-19 07:13:28,747][26599] Updated weights for policy 0, policy_version 303884 (0.0038) [2024-06-19 07:13:32,015][26599] Updated weights for policy 0, policy_version 303894 (0.0046) [2024-06-19 07:13:33,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 4979064832. Throughput: 0: 42424.7. Samples: 1246607900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:33,381][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 07:13:36,582][26599] Updated weights for policy 0, policy_version 303904 (0.0042) [2024-06-19 07:13:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.4, 300 sec: 42320.7). Total num frames: 4979245056. Throughput: 0: 42464.6. Samples: 1246868660. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:38,380][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 07:13:38,560][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000303910_4979261440.pth... [2024-06-19 07:13:38,625][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000303289_4969086976.pth [2024-06-19 07:13:39,577][26599] Updated weights for policy 0, policy_version 303914 (0.0031) [2024-06-19 07:13:43,384][26367] Fps is (10 sec: 39307.3, 60 sec: 42595.9, 300 sec: 42264.6). Total num frames: 4979458048. Throughput: 0: 42386.4. Samples: 1247118940. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:43,384][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 07:13:44,609][26599] Updated weights for policy 0, policy_version 303924 (0.0054) [2024-06-19 07:13:47,146][26599] Updated weights for policy 0, policy_version 303934 (0.0031) [2024-06-19 07:13:48,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42600.9, 300 sec: 42432.3). Total num frames: 4979687424. Throughput: 0: 42605.7. Samples: 1247245980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:48,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 07:13:52,061][26599] Updated weights for policy 0, policy_version 303944 (0.0049) [2024-06-19 07:13:53,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 4979884032. Throughput: 0: 42467.0. Samples: 1247510920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:53,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 07:13:54,924][26599] Updated weights for policy 0, policy_version 303954 (0.0034) [2024-06-19 07:13:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42320.7). Total num frames: 4980113408. Throughput: 0: 42439.1. Samples: 1247760400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:13:58,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 07:13:59,599][26599] Updated weights for policy 0, policy_version 303964 (0.0030) [2024-06-19 07:14:02,570][26599] Updated weights for policy 0, policy_version 303974 (0.0027) [2024-06-19 07:14:03,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4980326400. Throughput: 0: 42519.9. Samples: 1247891440. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:14:03,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 07:14:07,234][26599] Updated weights for policy 0, policy_version 303984 (0.0042) [2024-06-19 07:14:08,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42054.9, 300 sec: 42265.4). Total num frames: 4980523008. Throughput: 0: 42556.0. Samples: 1248150120. Policy #0 lag: (min: 0.0, avg: 12.1, max: 24.0) [2024-06-19 07:14:08,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 07:14:10,534][26599] Updated weights for policy 0, policy_version 303994 (0.0032) [2024-06-19 07:14:13,384][26367] Fps is (10 sec: 42583.3, 60 sec: 42868.9, 300 sec: 42375.7). Total num frames: 4980752384. Throughput: 0: 42615.6. Samples: 1248402080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:13,384][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 07:14:14,835][26599] Updated weights for policy 0, policy_version 304004 (0.0037) [2024-06-19 07:14:18,158][26599] Updated weights for policy 0, policy_version 304014 (0.0041) [2024-06-19 07:14:18,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 4980965376. Throughput: 0: 42691.9. Samples: 1248529040. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:18,381][26367] Avg episode reward: [(0, '0.763')] [2024-06-19 07:14:22,628][26599] Updated weights for policy 0, policy_version 304024 (0.0031) [2024-06-19 07:14:23,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42325.3, 300 sec: 42265.2). Total num frames: 4981161984. Throughput: 0: 42558.5. Samples: 1248783800. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:23,381][26367] Avg episode reward: [(0, '0.309')] [2024-06-19 07:14:25,922][26599] Updated weights for policy 0, policy_version 304034 (0.0042) [2024-06-19 07:14:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.3, 300 sec: 42320.7). Total num frames: 4981374976. Throughput: 0: 42606.9. Samples: 1249036100. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:28,381][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 07:14:30,180][26599] Updated weights for policy 0, policy_version 304044 (0.0040) [2024-06-19 07:14:32,010][26579] Signal inference workers to stop experience collection... (18450 times) [2024-06-19 07:14:32,051][26599] InferenceWorker_p0-w0: stopping experience collection (18450 times) [2024-06-19 07:14:32,064][26579] Signal inference workers to resume experience collection... (18450 times) [2024-06-19 07:14:32,073][26599] InferenceWorker_p0-w0: resuming experience collection (18450 times) [2024-06-19 07:14:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4981604352. Throughput: 0: 42669.0. Samples: 1249166080. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:33,381][26367] Avg episode reward: [(0, '0.369')] [2024-06-19 07:14:33,507][26599] Updated weights for policy 0, policy_version 304054 (0.0043) [2024-06-19 07:14:37,648][26599] Updated weights for policy 0, policy_version 304064 (0.0036) [2024-06-19 07:14:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42265.2). Total num frames: 4981800960. Throughput: 0: 42533.3. Samples: 1249424920. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:38,381][26367] Avg episode reward: [(0, '0.287')] [2024-06-19 07:14:41,069][26599] Updated weights for policy 0, policy_version 304074 (0.0045) [2024-06-19 07:14:43,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42601.0, 300 sec: 42376.3). Total num frames: 4982013952. Throughput: 0: 42535.7. Samples: 1249674500. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:43,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 07:14:45,529][26599] Updated weights for policy 0, policy_version 304084 (0.0037) [2024-06-19 07:14:48,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4982243328. 
Throughput: 0: 42449.4. Samples: 1249801660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:48,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 07:14:49,142][26599] Updated weights for policy 0, policy_version 304094 (0.0040) [2024-06-19 07:14:53,140][26599] Updated weights for policy 0, policy_version 304104 (0.0039) [2024-06-19 07:14:53,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42320.7). Total num frames: 4982439936. Throughput: 0: 42470.0. Samples: 1250061280. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:53,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 07:14:56,815][26599] Updated weights for policy 0, policy_version 304114 (0.0033) [2024-06-19 07:14:58,380][26367] Fps is (10 sec: 37683.5, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 4982620160. Throughput: 0: 42472.8. Samples: 1250313200. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:14:58,380][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 07:15:00,710][26599] Updated weights for policy 0, policy_version 304124 (0.0037) [2024-06-19 07:15:03,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4982882304. Throughput: 0: 42363.3. Samples: 1250435380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:15:03,380][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 07:15:04,751][26599] Updated weights for policy 0, policy_version 304134 (0.0036) [2024-06-19 07:15:08,366][26599] Updated weights for policy 0, policy_version 304144 (0.0024) [2024-06-19 07:15:08,380][26367] Fps is (10 sec: 47512.9, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 4983095296. Throughput: 0: 42433.3. Samples: 1250693300. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:15:08,381][26367] Avg episode reward: [(0, '0.830')] [2024-06-19 07:15:12,398][26599] Updated weights for policy 0, policy_version 304154 (0.0045) [2024-06-19 07:15:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42054.8, 300 sec: 42376.3). Total num frames: 4983275520. Throughput: 0: 42465.0. Samples: 1250947020. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:15:13,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 07:15:15,946][26599] Updated weights for policy 0, policy_version 304164 (0.0033) [2024-06-19 07:15:18,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42598.6, 300 sec: 42487.3). Total num frames: 4983521280. Throughput: 0: 42251.7. Samples: 1251067400. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:15:18,380][26367] Avg episode reward: [(0, '0.310')] [2024-06-19 07:15:20,258][26599] Updated weights for policy 0, policy_version 304174 (0.0035) [2024-06-19 07:15:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42320.7). Total num frames: 4983701504. Throughput: 0: 42365.8. Samples: 1251331380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 21.0) [2024-06-19 07:15:23,381][26367] Avg episode reward: [(0, '0.433')] [2024-06-19 07:15:24,215][26599] Updated weights for policy 0, policy_version 304184 (0.0025) [2024-06-19 07:15:28,098][26599] Updated weights for policy 0, policy_version 304194 (0.0037) [2024-06-19 07:15:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 4983930880. Throughput: 0: 42396.4. Samples: 1251582340. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:15:28,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 07:15:31,792][26599] Updated weights for policy 0, policy_version 304204 (0.0043) [2024-06-19 07:15:33,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4984160256. Throughput: 0: 42442.7. Samples: 1251711580. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:15:33,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 07:15:35,682][26599] Updated weights for policy 0, policy_version 304214 (0.0049) [2024-06-19 07:15:38,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42325.2, 300 sec: 42320.7). Total num frames: 4984340480. Throughput: 0: 42389.8. Samples: 1251968820. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:15:38,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 07:15:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000304220_4984340480.pth... [2024-06-19 07:15:38,462][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000303601_4974198784.pth [2024-06-19 07:15:39,503][26599] Updated weights for policy 0, policy_version 304224 (0.0039) [2024-06-19 07:15:43,251][26599] Updated weights for policy 0, policy_version 304234 (0.0038) [2024-06-19 07:15:43,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 4984569856. Throughput: 0: 42348.5. Samples: 1252218880. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:15:43,380][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 07:15:47,288][26599] Updated weights for policy 0, policy_version 304244 (0.0039) [2024-06-19 07:15:48,380][26367] Fps is (10 sec: 45876.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 4984799232. Throughput: 0: 42358.7. Samples: 1252341520. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:15:48,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 07:15:51,349][26599] Updated weights for policy 0, policy_version 304254 (0.0030) [2024-06-19 07:15:52,853][26579] Signal inference workers to stop experience collection... (18500 times) [2024-06-19 07:15:52,856][26579] Signal inference workers to resume experience collection... 
(18500 times) [2024-06-19 07:15:52,879][26599] InferenceWorker_p0-w0: stopping experience collection (18500 times) [2024-06-19 07:15:52,879][26599] InferenceWorker_p0-w0: resuming experience collection (18500 times) [2024-06-19 07:15:53,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.4, 300 sec: 42265.2). Total num frames: 4984963072. Throughput: 0: 42321.4. Samples: 1252597760. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:15:53,381][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 07:15:55,078][26599] Updated weights for policy 0, policy_version 304264 (0.0032) [2024-06-19 07:15:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 4985208832. Throughput: 0: 42211.6. Samples: 1252846540. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:15:58,380][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 07:15:59,381][26599] Updated weights for policy 0, policy_version 304274 (0.0034) [2024-06-19 07:16:02,697][26599] Updated weights for policy 0, policy_version 304284 (0.0044) [2024-06-19 07:16:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.2, 300 sec: 42320.7). Total num frames: 4985405440. Throughput: 0: 42406.6. Samples: 1252975700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:16:03,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 07:16:07,423][26599] Updated weights for policy 0, policy_version 304294 (0.0024) [2024-06-19 07:16:08,380][26367] Fps is (10 sec: 39321.4, 60 sec: 41779.3, 300 sec: 42265.2). Total num frames: 4985602048. Throughput: 0: 42362.6. Samples: 1253237700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:16:08,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 07:16:10,341][26599] Updated weights for policy 0, policy_version 304304 (0.0024) [2024-06-19 07:16:13,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 4985847808. Throughput: 0: 42347.5. Samples: 1253487980. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:16:13,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 07:16:15,222][26599] Updated weights for policy 0, policy_version 304314 (0.0039) [2024-06-19 07:16:17,831][26599] Updated weights for policy 0, policy_version 304324 (0.0029) [2024-06-19 07:16:18,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 4986060800. Throughput: 0: 42458.2. Samples: 1253622200. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:16:18,381][26367] Avg episode reward: [(0, '0.847')] [2024-06-19 07:16:22,709][26599] Updated weights for policy 0, policy_version 304334 (0.0042) [2024-06-19 07:16:23,380][26367] Fps is (10 sec: 37683.6, 60 sec: 42052.3, 300 sec: 42209.6). Total num frames: 4986224640. Throughput: 0: 42402.0. Samples: 1253876900. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:16:23,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 07:16:25,468][26599] Updated weights for policy 0, policy_version 304344 (0.0043) [2024-06-19 07:16:28,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 4986486784. Throughput: 0: 42418.0. Samples: 1254127700. Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:16:28,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 07:16:30,041][26599] Updated weights for policy 0, policy_version 304354 (0.0040) [2024-06-19 07:16:33,156][26599] Updated weights for policy 0, policy_version 304364 (0.0038) [2024-06-19 07:16:33,384][26367] Fps is (10 sec: 49133.8, 60 sec: 42595.8, 300 sec: 42431.3). Total num frames: 4986716160. Throughput: 0: 42820.1. Samples: 1254268580. 
Policy #0 lag: (min: 1.0, avg: 10.5, max: 20.0) [2024-06-19 07:16:33,384][26367] Avg episode reward: [(0, '0.396')] [2024-06-19 07:16:37,449][26599] Updated weights for policy 0, policy_version 304374 (0.0032) [2024-06-19 07:16:38,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 4986880000. Throughput: 0: 42726.3. Samples: 1254520440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:16:38,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 07:16:40,823][26599] Updated weights for policy 0, policy_version 304384 (0.0047) [2024-06-19 07:16:43,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 4987142144. Throughput: 0: 42799.0. Samples: 1254772500. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:16:43,381][26367] Avg episode reward: [(0, '0.355')] [2024-06-19 07:16:44,942][26599] Updated weights for policy 0, policy_version 304394 (0.0038) [2024-06-19 07:16:48,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 4987338752. Throughput: 0: 42904.4. Samples: 1254906400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:16:48,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 07:16:48,521][26599] Updated weights for policy 0, policy_version 304404 (0.0029) [2024-06-19 07:16:52,890][26599] Updated weights for policy 0, policy_version 304414 (0.0030) [2024-06-19 07:16:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 4987535360. Throughput: 0: 42570.6. Samples: 1255153380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:16:53,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 07:16:53,796][26579] Signal inference workers to stop experience collection... (18550 times) [2024-06-19 07:16:53,796][26579] Signal inference workers to resume experience collection... 
(18550 times) [2024-06-19 07:16:53,841][26599] InferenceWorker_p0-w0: stopping experience collection (18550 times) [2024-06-19 07:16:53,841][26599] InferenceWorker_p0-w0: resuming experience collection (18550 times) [2024-06-19 07:16:56,342][26599] Updated weights for policy 0, policy_version 304424 (0.0028) [2024-06-19 07:16:58,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42868.8, 300 sec: 42542.3). Total num frames: 4987781120. Throughput: 0: 42659.7. Samples: 1255407820. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:16:58,385][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 07:17:00,453][26599] Updated weights for policy 0, policy_version 304434 (0.0039) [2024-06-19 07:17:03,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.3, 300 sec: 42376.2). Total num frames: 4987977728. Throughput: 0: 42588.3. Samples: 1255538680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:03,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 07:17:04,152][26599] Updated weights for policy 0, policy_version 304444 (0.0044) [2024-06-19 07:17:08,035][26599] Updated weights for policy 0, policy_version 304454 (0.0036) [2024-06-19 07:17:08,384][26367] Fps is (10 sec: 39321.8, 60 sec: 42868.9, 300 sec: 42486.8). Total num frames: 4988174336. Throughput: 0: 42369.9. Samples: 1255783700. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:08,384][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 07:17:11,807][26599] Updated weights for policy 0, policy_version 304464 (0.0028) [2024-06-19 07:17:13,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 4988387328. Throughput: 0: 42623.7. Samples: 1256045760. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:13,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 07:17:15,893][26599] Updated weights for policy 0, policy_version 304474 (0.0047) [2024-06-19 07:17:18,380][26367] Fps is (10 sec: 42614.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 4988600320. Throughput: 0: 42176.4. Samples: 1256166360. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:18,380][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 07:17:19,585][26599] Updated weights for policy 0, policy_version 304484 (0.0030) [2024-06-19 07:17:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 4988813312. Throughput: 0: 42086.6. Samples: 1256414340. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:23,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 07:17:23,901][26599] Updated weights for policy 0, policy_version 304494 (0.0040) [2024-06-19 07:17:27,167][26599] Updated weights for policy 0, policy_version 304504 (0.0034) [2024-06-19 07:17:28,380][26367] Fps is (10 sec: 39320.6, 60 sec: 41779.2, 300 sec: 42320.7). Total num frames: 4988993536. Throughput: 0: 42348.8. Samples: 1256678200. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:28,390][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 07:17:31,439][26599] Updated weights for policy 0, policy_version 304514 (0.0038) [2024-06-19 07:17:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42054.8, 300 sec: 42431.8). Total num frames: 4989239296. Throughput: 0: 42144.5. Samples: 1256802900. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:33,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 07:17:35,162][26599] Updated weights for policy 0, policy_version 304524 (0.0035) [2024-06-19 07:17:38,384][26367] Fps is (10 sec: 44221.1, 60 sec: 42595.8, 300 sec: 42486.8). Total num frames: 4989435904. Throughput: 0: 42278.4. Samples: 1257056060. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:38,385][26367] Avg episode reward: [(0, '0.845')] [2024-06-19 07:17:38,415][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000304532_4989452288.pth... [2024-06-19 07:17:38,474][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000303910_4979261440.pth [2024-06-19 07:17:39,041][26599] Updated weights for policy 0, policy_version 304534 (0.0041) [2024-06-19 07:17:42,959][26599] Updated weights for policy 0, policy_version 304544 (0.0025) [2024-06-19 07:17:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42432.3). Total num frames: 4989648896. Throughput: 0: 42327.4. Samples: 1257312400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:43,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 07:17:46,709][26599] Updated weights for policy 0, policy_version 304554 (0.0034) [2024-06-19 07:17:48,380][26367] Fps is (10 sec: 45892.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 4989894656. Throughput: 0: 42329.9. Samples: 1257443520. Policy #0 lag: (min: 0.0, avg: 8.2, max: 20.0) [2024-06-19 07:17:48,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 07:17:50,557][26599] Updated weights for policy 0, policy_version 304564 (0.0039) [2024-06-19 07:17:53,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 4990074880. Throughput: 0: 42774.7. Samples: 1257708400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:17:53,380][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 07:17:54,452][26599] Updated weights for policy 0, policy_version 304574 (0.0031) [2024-06-19 07:17:58,201][26599] Updated weights for policy 0, policy_version 304584 (0.0032) [2024-06-19 07:17:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42054.9, 300 sec: 42431.8). Total num frames: 4990304256. Throughput: 0: 42486.3. Samples: 1257957640. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:17:58,380][26367] Avg episode reward: [(0, '0.768')] [2024-06-19 07:18:02,178][26599] Updated weights for policy 0, policy_version 304594 (0.0035) [2024-06-19 07:18:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.5, 300 sec: 42432.3). Total num frames: 4990517248. Throughput: 0: 42623.1. Samples: 1258084400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:03,380][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 07:18:05,881][26599] Updated weights for policy 0, policy_version 304604 (0.0035) [2024-06-19 07:18:08,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42600.9, 300 sec: 42542.9). Total num frames: 4990730240. Throughput: 0: 42837.3. Samples: 1258342020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:08,381][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 07:18:09,736][26599] Updated weights for policy 0, policy_version 304614 (0.0035) [2024-06-19 07:18:13,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42376.2). Total num frames: 4990943232. Throughput: 0: 42568.6. Samples: 1258593780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:13,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 07:18:13,503][26599] Updated weights for policy 0, policy_version 304624 (0.0036) [2024-06-19 07:18:17,055][26579] Signal inference workers to stop experience collection... (18600 times) [2024-06-19 07:18:17,087][26599] InferenceWorker_p0-w0: stopping experience collection (18600 times) [2024-06-19 07:18:17,120][26579] Signal inference workers to resume experience collection... (18600 times) [2024-06-19 07:18:17,123][26599] InferenceWorker_p0-w0: resuming experience collection (18600 times) [2024-06-19 07:18:17,410][26599] Updated weights for policy 0, policy_version 304634 (0.0028) [2024-06-19 07:18:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 4991172608. Throughput: 0: 42769.3. 
Samples: 1258727520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:18,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 07:18:21,107][26599] Updated weights for policy 0, policy_version 304644 (0.0030) [2024-06-19 07:18:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 4991352832. Throughput: 0: 42703.9. Samples: 1258977580. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:23,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 07:18:25,148][26599] Updated weights for policy 0, policy_version 304654 (0.0042) [2024-06-19 07:18:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 4991582208. Throughput: 0: 42647.6. Samples: 1259231540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:28,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 07:18:28,978][26599] Updated weights for policy 0, policy_version 304664 (0.0034) [2024-06-19 07:18:32,963][26599] Updated weights for policy 0, policy_version 304674 (0.0042) [2024-06-19 07:18:33,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 4991795200. Throughput: 0: 42745.9. Samples: 1259367080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:33,380][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 07:18:36,413][26599] Updated weights for policy 0, policy_version 304684 (0.0035) [2024-06-19 07:18:38,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42328.0, 300 sec: 42432.3). Total num frames: 4991975424. Throughput: 0: 42495.1. Samples: 1259620680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:38,380][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 07:18:40,370][26599] Updated weights for policy 0, policy_version 304694 (0.0031) [2024-06-19 07:18:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 4992237568. Throughput: 0: 42452.3. 
Samples: 1259868000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:43,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 07:18:44,294][26599] Updated weights for policy 0, policy_version 304704 (0.0046) [2024-06-19 07:18:47,933][26599] Updated weights for policy 0, policy_version 304714 (0.0033) [2024-06-19 07:18:48,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4992434176. Throughput: 0: 42758.6. Samples: 1260008540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:48,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 07:18:52,050][26599] Updated weights for policy 0, policy_version 304724 (0.0024) [2024-06-19 07:18:53,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 4992630784. Throughput: 0: 42768.4. Samples: 1260266600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:53,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 07:18:55,581][26599] Updated weights for policy 0, policy_version 304734 (0.0028) [2024-06-19 07:18:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 4992876544. Throughput: 0: 42760.0. Samples: 1260517980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:18:58,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 07:18:59,810][26599] Updated weights for policy 0, policy_version 304744 (0.0039) [2024-06-19 07:19:03,284][26599] Updated weights for policy 0, policy_version 304754 (0.0042) [2024-06-19 07:19:03,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4993089536. Throughput: 0: 42824.5. Samples: 1260654620. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 07:19:03,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 07:19:07,377][26599] Updated weights for policy 0, policy_version 304764 (0.0040) [2024-06-19 07:19:08,381][26367] Fps is (10 sec: 39320.7, 60 sec: 42325.2, 300 sec: 42432.3). Total num frames: 4993269760. Throughput: 0: 42938.9. Samples: 1260909840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:08,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 07:19:11,003][26599] Updated weights for policy 0, policy_version 304774 (0.0034) [2024-06-19 07:19:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4993515520. Throughput: 0: 42752.5. Samples: 1261155400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:13,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 07:19:15,328][26599] Updated weights for policy 0, policy_version 304784 (0.0035) [2024-06-19 07:19:18,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4993712128. Throughput: 0: 42707.9. Samples: 1261288940. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:18,384][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 07:19:18,884][26599] Updated weights for policy 0, policy_version 304794 (0.0031) [2024-06-19 07:19:22,905][26599] Updated weights for policy 0, policy_version 304804 (0.0040) [2024-06-19 07:19:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 4993925120. Throughput: 0: 42658.1. Samples: 1261540300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:23,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 07:19:26,480][26599] Updated weights for policy 0, policy_version 304814 (0.0040) [2024-06-19 07:19:28,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 4994154496. Throughput: 0: 42800.9. Samples: 1261794040. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:28,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 07:19:30,487][26599] Updated weights for policy 0, policy_version 304824 (0.0024) [2024-06-19 07:19:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 4994351104. Throughput: 0: 42672.4. Samples: 1261928800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:33,381][26367] Avg episode reward: [(0, '0.845')] [2024-06-19 07:19:34,100][26599] Updated weights for policy 0, policy_version 304834 (0.0031) [2024-06-19 07:19:38,277][26599] Updated weights for policy 0, policy_version 304844 (0.0045) [2024-06-19 07:19:38,380][26367] Fps is (10 sec: 40959.6, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 4994564096. Throughput: 0: 42471.5. Samples: 1262177820. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:38,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 07:19:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000304844_4994564096.pth... [2024-06-19 07:19:38,466][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000304220_4984340480.pth [2024-06-19 07:19:39,530][26579] Signal inference workers to stop experience collection... (18650 times) [2024-06-19 07:19:39,536][26579] Signal inference workers to resume experience collection... (18650 times) [2024-06-19 07:19:39,551][26599] InferenceWorker_p0-w0: stopping experience collection (18650 times) [2024-06-19 07:19:39,551][26599] InferenceWorker_p0-w0: resuming experience collection (18650 times) [2024-06-19 07:19:41,844][26599] Updated weights for policy 0, policy_version 304854 (0.0036) [2024-06-19 07:19:43,384][26367] Fps is (10 sec: 45858.8, 60 sec: 42868.9, 300 sec: 42597.9). Total num frames: 4994809856. Throughput: 0: 42518.8. Samples: 1262431480. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:43,384][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 07:19:45,892][26599] Updated weights for policy 0, policy_version 304864 (0.0037) [2024-06-19 07:19:48,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4994973696. Throughput: 0: 42410.7. Samples: 1262563100. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:48,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 07:19:49,560][26599] Updated weights for policy 0, policy_version 304874 (0.0047) [2024-06-19 07:19:53,384][26367] Fps is (10 sec: 39321.6, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 4995203072. Throughput: 0: 42266.1. Samples: 1262811960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:53,384][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 07:19:53,426][26599] Updated weights for policy 0, policy_version 304884 (0.0040) [2024-06-19 07:19:57,302][26599] Updated weights for policy 0, policy_version 304894 (0.0025) [2024-06-19 07:19:58,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 4995448832. Throughput: 0: 42589.4. Samples: 1263071920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:19:58,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 07:20:01,121][26599] Updated weights for policy 0, policy_version 304904 (0.0037) [2024-06-19 07:20:03,380][26367] Fps is (10 sec: 42613.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 4995629056. Throughput: 0: 42605.8. Samples: 1263206200. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:20:03,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 07:20:05,063][26599] Updated weights for policy 0, policy_version 304914 (0.0022) [2024-06-19 07:20:08,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 4995842048. Throughput: 0: 42558.8. Samples: 1263455440. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:20:08,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 07:20:08,818][26599] Updated weights for policy 0, policy_version 304924 (0.0036) [2024-06-19 07:20:12,572][26599] Updated weights for policy 0, policy_version 304934 (0.0034) [2024-06-19 07:20:13,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 4996087808. Throughput: 0: 42655.1. Samples: 1263713520. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:20:13,381][26367] Avg episode reward: [(0, '0.384')] [2024-06-19 07:20:16,322][26599] Updated weights for policy 0, policy_version 304944 (0.0035) [2024-06-19 07:20:18,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4996251648. Throughput: 0: 42631.7. Samples: 1263847220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 07:20:18,380][26367] Avg episode reward: [(0, '0.384')] [2024-06-19 07:20:20,101][26599] Updated weights for policy 0, policy_version 304954 (0.0036) [2024-06-19 07:20:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4996497408. Throughput: 0: 42710.4. Samples: 1264099780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:20:23,380][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 07:20:23,835][26599] Updated weights for policy 0, policy_version 304964 (0.0041) [2024-06-19 07:20:27,851][26599] Updated weights for policy 0, policy_version 304974 (0.0036) [2024-06-19 07:20:28,384][26367] Fps is (10 sec: 45857.9, 60 sec: 42595.8, 300 sec: 42542.3). Total num frames: 4996710400. Throughput: 0: 42779.5. Samples: 1264356560. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:20:28,385][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 07:20:31,465][26599] Updated weights for policy 0, policy_version 304984 (0.0039) [2024-06-19 07:20:33,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 4996890624. Throughput: 0: 42660.1. Samples: 1264482800. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:20:33,380][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 07:20:35,471][26599] Updated weights for policy 0, policy_version 304994 (0.0048) [2024-06-19 07:20:38,380][26367] Fps is (10 sec: 42614.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 4997136384. Throughput: 0: 42767.9. Samples: 1264736360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:20:38,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 07:20:39,566][26599] Updated weights for policy 0, policy_version 305004 (0.0029) [2024-06-19 07:20:43,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42052.3, 300 sec: 42486.8). Total num frames: 4997332992. Throughput: 0: 42647.7. Samples: 1264991220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:20:43,384][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 07:20:43,585][26599] Updated weights for policy 0, policy_version 305014 (0.0041) [2024-06-19 07:20:47,411][26599] Updated weights for policy 0, policy_version 305024 (0.0024) [2024-06-19 07:20:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 4997545984. Throughput: 0: 42538.7. Samples: 1265120440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:20:48,380][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 07:20:51,125][26599] Updated weights for policy 0, policy_version 305034 (0.0028) [2024-06-19 07:20:53,380][26367] Fps is (10 sec: 42613.4, 60 sec: 42600.9, 300 sec: 42542.8). Total num frames: 4997758976. Throughput: 0: 42659.0. Samples: 1265375100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:20:53,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 07:20:54,975][26599] Updated weights for policy 0, policy_version 305044 (0.0024) [2024-06-19 07:20:58,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 4997971968. Throughput: 0: 42760.4. Samples: 1265637740. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:20:58,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 07:20:58,725][26599] Updated weights for policy 0, policy_version 305054 (0.0037) [2024-06-19 07:20:59,546][26579] Signal inference workers to stop experience collection... (18700 times) [2024-06-19 07:20:59,546][26579] Signal inference workers to resume experience collection... (18700 times) [2024-06-19 07:20:59,578][26599] InferenceWorker_p0-w0: stopping experience collection (18700 times) [2024-06-19 07:20:59,578][26599] InferenceWorker_p0-w0: resuming experience collection (18700 times) [2024-06-19 07:21:02,669][26599] Updated weights for policy 0, policy_version 305064 (0.0036) [2024-06-19 07:21:03,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 4998201344. Throughput: 0: 42572.4. Samples: 1265762980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:21:03,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 07:21:06,427][26599] Updated weights for policy 0, policy_version 305074 (0.0033) [2024-06-19 07:21:08,384][26367] Fps is (10 sec: 44221.0, 60 sec: 42868.8, 300 sec: 42597.9). Total num frames: 4998414336. Throughput: 0: 42556.5. Samples: 1266014980. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:21:08,384][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 07:21:10,556][26599] Updated weights for policy 0, policy_version 305084 (0.0037) [2024-06-19 07:21:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 4998594560. Throughput: 0: 42660.0. 
Samples: 1266276100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:21:13,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 07:21:14,223][26599] Updated weights for policy 0, policy_version 305094 (0.0043) [2024-06-19 07:21:17,991][26599] Updated weights for policy 0, policy_version 305104 (0.0038) [2024-06-19 07:21:18,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 4998823936. Throughput: 0: 42614.1. Samples: 1266400440. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:21:18,381][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 07:21:21,812][26599] Updated weights for policy 0, policy_version 305114 (0.0027) [2024-06-19 07:21:23,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 4999053312. Throughput: 0: 42655.9. Samples: 1266655880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:21:23,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 07:21:25,670][26599] Updated weights for policy 0, policy_version 305124 (0.0045) [2024-06-19 07:21:28,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42328.0, 300 sec: 42487.9). Total num frames: 4999249920. Throughput: 0: 42658.2. Samples: 1266910680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:21:28,380][26367] Avg episode reward: [(0, '0.851')] [2024-06-19 07:21:29,531][26599] Updated weights for policy 0, policy_version 305134 (0.0038) [2024-06-19 07:21:33,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42868.8, 300 sec: 42653.4). Total num frames: 4999462912. Throughput: 0: 42633.4. Samples: 1267039100. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 22.0) [2024-06-19 07:21:33,385][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 07:21:33,479][26599] Updated weights for policy 0, policy_version 305144 (0.0041) [2024-06-19 07:21:37,248][26599] Updated weights for policy 0, policy_version 305154 (0.0032) [2024-06-19 07:21:38,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 4999692288. Throughput: 0: 42680.0. Samples: 1267295700. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:21:38,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 07:21:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000305157_4999692288.pth... [2024-06-19 07:21:38,472][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000304532_4989452288.pth [2024-06-19 07:21:41,273][26599] Updated weights for policy 0, policy_version 305164 (0.0041) [2024-06-19 07:21:43,380][26367] Fps is (10 sec: 44253.5, 60 sec: 42874.1, 300 sec: 42598.4). Total num frames: 4999905280. Throughput: 0: 42451.7. Samples: 1267548060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:21:43,380][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 07:21:44,923][26599] Updated weights for policy 0, policy_version 305174 (0.0044) [2024-06-19 07:21:48,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5000085504. Throughput: 0: 42379.5. Samples: 1267670060. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:21:48,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 07:21:49,136][26599] Updated weights for policy 0, policy_version 305184 (0.0027) [2024-06-19 07:21:52,584][26599] Updated weights for policy 0, policy_version 305194 (0.0041) [2024-06-19 07:21:53,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42598.3, 300 sec: 42487.8). Total num frames: 5000314880. Throughput: 0: 42571.8. Samples: 1267930560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:21:53,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 07:21:56,839][26599] Updated weights for policy 0, policy_version 305204 (0.0034) [2024-06-19 07:21:58,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5000544256. Throughput: 0: 42417.3. Samples: 1268184880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:21:58,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 07:22:00,096][26599] Updated weights for policy 0, policy_version 305214 (0.0024) [2024-06-19 07:22:03,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.2, 300 sec: 42543.4). Total num frames: 5000724480. Throughput: 0: 42621.4. Samples: 1268318400. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:22:03,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 07:22:04,355][26599] Updated weights for policy 0, policy_version 305224 (0.0038) [2024-06-19 07:22:08,074][26599] Updated weights for policy 0, policy_version 305234 (0.0025) [2024-06-19 07:22:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42601.0, 300 sec: 42653.9). Total num frames: 5000970240. Throughput: 0: 42539.6. Samples: 1268570160. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:22:08,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 07:22:12,018][26599] Updated weights for policy 0, policy_version 305244 (0.0026) [2024-06-19 07:22:13,380][26367] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5001199616. Throughput: 0: 42617.7. Samples: 1268828480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:22:13,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 07:22:15,788][26599] Updated weights for policy 0, policy_version 305254 (0.0032) [2024-06-19 07:22:16,498][26579] Signal inference workers to stop experience collection... 
(18750 times) [2024-06-19 07:22:16,502][26579] Signal inference workers to resume experience collection... (18750 times) [2024-06-19 07:22:16,539][26599] InferenceWorker_p0-w0: stopping experience collection (18750 times) [2024-06-19 07:22:16,539][26599] InferenceWorker_p0-w0: resuming experience collection (18750 times) [2024-06-19 07:22:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5001379840. Throughput: 0: 42785.2. Samples: 1268964280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:22:18,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 07:22:19,478][26599] Updated weights for policy 0, policy_version 305264 (0.0035) [2024-06-19 07:22:23,311][26599] Updated weights for policy 0, policy_version 305274 (0.0037) [2024-06-19 07:22:23,382][26367] Fps is (10 sec: 40953.1, 60 sec: 42597.3, 300 sec: 42764.8). Total num frames: 5001609216. Throughput: 0: 42681.6. Samples: 1269216440. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:22:23,382][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 07:22:27,079][26599] Updated weights for policy 0, policy_version 305284 (0.0027) [2024-06-19 07:22:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5001822208. Throughput: 0: 42840.3. Samples: 1269475880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:22:28,384][26367] Avg episode reward: [(0, '0.406')] [2024-06-19 07:22:31,104][26599] Updated weights for policy 0, policy_version 305294 (0.0042) [2024-06-19 07:22:33,380][26367] Fps is (10 sec: 40966.6, 60 sec: 42601.0, 300 sec: 42654.5). Total num frames: 5002018816. Throughput: 0: 43100.0. Samples: 1269609560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:22:33,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 07:22:34,861][26599] Updated weights for policy 0, policy_version 305304 (0.0034) [2024-06-19 07:22:38,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5002231808. Throughput: 0: 42623.5. Samples: 1269848620. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:22:38,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 07:22:38,899][26599] Updated weights for policy 0, policy_version 305314 (0.0040) [2024-06-19 07:22:42,640][26599] Updated weights for policy 0, policy_version 305324 (0.0045) [2024-06-19 07:22:43,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5002461184. Throughput: 0: 42732.9. Samples: 1270107860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 07:22:43,381][26367] Avg episode reward: [(0, '0.321')] [2024-06-19 07:22:46,421][26599] Updated weights for policy 0, policy_version 305334 (0.0035) [2024-06-19 07:22:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5002657792. Throughput: 0: 42671.0. Samples: 1270238600. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:22:48,381][26367] Avg episode reward: [(0, '0.767')] [2024-06-19 07:22:50,134][26599] Updated weights for policy 0, policy_version 305344 (0.0036) [2024-06-19 07:22:53,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 5002887168. Throughput: 0: 42605.4. Samples: 1270487560. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:22:53,385][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 07:22:54,038][26599] Updated weights for policy 0, policy_version 305354 (0.0030) [2024-06-19 07:22:57,894][26599] Updated weights for policy 0, policy_version 305364 (0.0051) [2024-06-19 07:22:58,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5003100160. Throughput: 0: 42617.7. Samples: 1270746280. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:22:58,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 07:23:01,557][26599] Updated weights for policy 0, policy_version 305374 (0.0040) [2024-06-19 07:23:03,380][26367] Fps is (10 sec: 39336.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5003280384. Throughput: 0: 42351.2. Samples: 1270870080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:03,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 07:23:05,402][26599] Updated weights for policy 0, policy_version 305384 (0.0024) [2024-06-19 07:23:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5003526144. Throughput: 0: 42506.8. Samples: 1271129180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:08,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 07:23:09,221][26599] Updated weights for policy 0, policy_version 305394 (0.0035) [2024-06-19 07:23:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5003722752. Throughput: 0: 42468.1. Samples: 1271386940. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:13,381][26367] Avg episode reward: [(0, '0.783')] [2024-06-19 07:23:13,397][26599] Updated weights for policy 0, policy_version 305404 (0.0032) [2024-06-19 07:23:16,951][26599] Updated weights for policy 0, policy_version 305414 (0.0032) [2024-06-19 07:23:18,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5003935744. Throughput: 0: 42311.3. Samples: 1271513560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:18,380][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 07:23:21,098][26599] Updated weights for policy 0, policy_version 305424 (0.0038) [2024-06-19 07:23:23,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42599.6, 300 sec: 42653.9). Total num frames: 5004165120. Throughput: 0: 42700.2. Samples: 1271770120. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:23,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 07:23:24,627][26599] Updated weights for policy 0, policy_version 305434 (0.0038) [2024-06-19 07:23:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5004361728. Throughput: 0: 42652.1. Samples: 1272027200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:28,380][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 07:23:28,728][26599] Updated weights for policy 0, policy_version 305444 (0.0037) [2024-06-19 07:23:32,511][26599] Updated weights for policy 0, policy_version 305454 (0.0040) [2024-06-19 07:23:33,384][26367] Fps is (10 sec: 42582.8, 60 sec: 42868.9, 300 sec: 42764.5). Total num frames: 5004591104. Throughput: 0: 42565.1. Samples: 1272154180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:33,384][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 07:23:36,291][26599] Updated weights for policy 0, policy_version 305464 (0.0036) [2024-06-19 07:23:38,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5004804096. Throughput: 0: 42779.8. Samples: 1272412500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:38,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 07:23:38,400][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000305469_5004804096.pth... [2024-06-19 07:23:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000304844_4994564096.pth [2024-06-19 07:23:40,001][26599] Updated weights for policy 0, policy_version 305474 (0.0051) [2024-06-19 07:23:43,380][26367] Fps is (10 sec: 42613.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5005017088. Throughput: 0: 42697.8. Samples: 1272667680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:43,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 07:23:43,985][26599] Updated weights for policy 0, policy_version 305484 (0.0043) [2024-06-19 07:23:47,926][26599] Updated weights for policy 0, policy_version 305494 (0.0032) [2024-06-19 07:23:48,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5005230080. Throughput: 0: 42684.5. Samples: 1272790880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:48,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 07:23:51,181][26579] Signal inference workers to stop experience collection... (18800 times) [2024-06-19 07:23:51,230][26599] InferenceWorker_p0-w0: stopping experience collection (18800 times) [2024-06-19 07:23:51,296][26579] Signal inference workers to resume experience collection... 
(18800 times) [2024-06-19 07:23:51,296][26599] InferenceWorker_p0-w0: resuming experience collection (18800 times) [2024-06-19 07:23:51,622][26599] Updated weights for policy 0, policy_version 305504 (0.0039) [2024-06-19 07:23:53,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42327.9, 300 sec: 42542.9). Total num frames: 5005426688. Throughput: 0: 42633.0. Samples: 1273047660. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:53,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 07:23:55,370][26599] Updated weights for policy 0, policy_version 305514 (0.0033) [2024-06-19 07:23:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5005639680. Throughput: 0: 42721.8. Samples: 1273309420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 23.0) [2024-06-19 07:23:58,381][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 07:23:59,676][26599] Updated weights for policy 0, policy_version 305524 (0.0036) [2024-06-19 07:24:03,122][26599] Updated weights for policy 0, policy_version 305534 (0.0040) [2024-06-19 07:24:03,380][26367] Fps is (10 sec: 45875.6, 60 sec: 43417.7, 300 sec: 42765.1). Total num frames: 5005885440. Throughput: 0: 42748.9. Samples: 1273437260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:03,380][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 07:24:07,158][26599] Updated weights for policy 0, policy_version 305544 (0.0040) [2024-06-19 07:24:08,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5006082048. Throughput: 0: 42733.4. Samples: 1273693120. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:08,380][26367] Avg episode reward: [(0, '0.849')] [2024-06-19 07:24:10,633][26599] Updated weights for policy 0, policy_version 305554 (0.0037) [2024-06-19 07:24:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5006295040. Throughput: 0: 42750.5. Samples: 1273950980. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:13,389][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 07:24:14,969][26599] Updated weights for policy 0, policy_version 305564 (0.0034) [2024-06-19 07:24:18,205][26599] Updated weights for policy 0, policy_version 305574 (0.0027) [2024-06-19 07:24:18,380][26367] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5006524416. Throughput: 0: 42705.2. Samples: 1274075760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:18,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 07:24:22,657][26599] Updated weights for policy 0, policy_version 305584 (0.0031) [2024-06-19 07:24:23,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5006737408. Throughput: 0: 42799.2. Samples: 1274338460. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:23,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 07:24:25,821][26599] Updated weights for policy 0, policy_version 305594 (0.0031) [2024-06-19 07:24:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5006934016. Throughput: 0: 42640.2. Samples: 1274586480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:28,380][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 07:24:30,311][26599] Updated weights for policy 0, policy_version 305604 (0.0031) [2024-06-19 07:24:33,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42600.9, 300 sec: 42653.9). Total num frames: 5007147008. Throughput: 0: 42857.6. Samples: 1274719480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:33,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 07:24:33,697][26599] Updated weights for policy 0, policy_version 305614 (0.0031) [2024-06-19 07:24:38,036][26599] Updated weights for policy 0, policy_version 305624 (0.0041) [2024-06-19 07:24:38,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42543.4). Total num frames: 5007360000. Throughput: 0: 42713.8. Samples: 1274969780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:38,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 07:24:41,373][26599] Updated weights for policy 0, policy_version 305634 (0.0028) [2024-06-19 07:24:43,384][26367] Fps is (10 sec: 40945.5, 60 sec: 42322.8, 300 sec: 42653.4). Total num frames: 5007556608. Throughput: 0: 42581.8. Samples: 1275225760. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:43,384][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 07:24:45,818][26599] Updated weights for policy 0, policy_version 305644 (0.0041) [2024-06-19 07:24:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 5007785984. Throughput: 0: 42692.0. Samples: 1275358400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:48,380][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 07:24:48,965][26599] Updated weights for policy 0, policy_version 305654 (0.0034) [2024-06-19 07:24:53,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5007966208. Throughput: 0: 42605.2. Samples: 1275610360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:53,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 07:24:53,572][26599] Updated weights for policy 0, policy_version 305664 (0.0037) [2024-06-19 07:24:56,685][26599] Updated weights for policy 0, policy_version 305674 (0.0038) [2024-06-19 07:24:58,384][26367] Fps is (10 sec: 40945.0, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 5008195584. Throughput: 0: 42626.8. Samples: 1275869340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:24:58,384][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 07:25:01,195][26599] Updated weights for policy 0, policy_version 305684 (0.0034) [2024-06-19 07:25:03,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5008424960. Throughput: 0: 42662.8. Samples: 1275995580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:25:03,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 07:25:04,952][26599] Updated weights for policy 0, policy_version 305694 (0.0028) [2024-06-19 07:25:08,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5008621568. Throughput: 0: 42465.4. Samples: 1276249400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:25:08,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 07:25:08,869][26599] Updated weights for policy 0, policy_version 305704 (0.0031) [2024-06-19 07:25:12,499][26599] Updated weights for policy 0, policy_version 305714 (0.0032) [2024-06-19 07:25:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5008850944. Throughput: 0: 42613.6. Samples: 1276504100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 20.0) [2024-06-19 07:25:13,384][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 07:25:16,510][26599] Updated weights for policy 0, policy_version 305724 (0.0036) [2024-06-19 07:25:17,004][26579] Signal inference workers to stop experience collection... (18850 times) [2024-06-19 07:25:17,053][26579] Signal inference workers to resume experience collection... (18850 times) [2024-06-19 07:25:17,056][26599] InferenceWorker_p0-w0: stopping experience collection (18850 times) [2024-06-19 07:25:17,068][26599] InferenceWorker_p0-w0: resuming experience collection (18850 times) [2024-06-19 07:25:18,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5009063936. Throughput: 0: 42596.4. Samples: 1276636320. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:25:18,381][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 07:25:19,824][26599] Updated weights for policy 0, policy_version 305734 (0.0038) [2024-06-19 07:25:23,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42325.4, 300 sec: 42598.9). Total num frames: 5009276928. Throughput: 0: 42625.5. Samples: 1276887920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:25:23,380][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 07:25:24,267][26599] Updated weights for policy 0, policy_version 305744 (0.0032) [2024-06-19 07:25:27,298][26599] Updated weights for policy 0, policy_version 305754 (0.0032) [2024-06-19 07:25:28,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5009506304. Throughput: 0: 42655.0. Samples: 1277145080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:25:28,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 07:25:31,875][26599] Updated weights for policy 0, policy_version 305764 (0.0029) [2024-06-19 07:25:33,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5009702912. Throughput: 0: 42667.8. 
Samples: 1277278460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:25:33,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 07:25:35,229][26599] Updated weights for policy 0, policy_version 305774 (0.0038) [2024-06-19 07:25:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 5009932288. Throughput: 0: 42562.3. Samples: 1277525660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:25:38,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 07:25:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000305782_5009932288.pth... [2024-06-19 07:25:38,455][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000305157_4999692288.pth [2024-06-19 07:25:39,681][26599] Updated weights for policy 0, policy_version 305784 (0.0042) [2024-06-19 07:25:42,830][26599] Updated weights for policy 0, policy_version 305794 (0.0028) [2024-06-19 07:25:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 43147.1, 300 sec: 42709.5). Total num frames: 5010145280. Throughput: 0: 42518.4. Samples: 1277782520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:25:43,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 07:25:47,422][26599] Updated weights for policy 0, policy_version 305804 (0.0024) [2024-06-19 07:25:48,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5010358272. Throughput: 0: 42676.7. Samples: 1277916040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:25:48,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 07:25:50,374][26599] Updated weights for policy 0, policy_version 305814 (0.0023) [2024-06-19 07:25:53,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5010538496. Throughput: 0: 42662.6. Samples: 1278169220. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:25:53,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 07:25:55,005][26599] Updated weights for policy 0, policy_version 305824 (0.0028) [2024-06-19 07:25:57,925][26599] Updated weights for policy 0, policy_version 305834 (0.0029) [2024-06-19 07:25:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 43420.2, 300 sec: 42709.5). Total num frames: 5010800640. Throughput: 0: 42767.6. Samples: 1278428640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:25:58,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 07:26:02,402][26599] Updated weights for policy 0, policy_version 305844 (0.0031) [2024-06-19 07:26:03,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 5010997248. Throughput: 0: 42857.3. Samples: 1278564900. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:26:03,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 07:26:05,955][26599] Updated weights for policy 0, policy_version 305854 (0.0031) [2024-06-19 07:26:08,384][26367] Fps is (10 sec: 39307.6, 60 sec: 42868.9, 300 sec: 42708.9). Total num frames: 5011193856. Throughput: 0: 42789.4. Samples: 1278813600. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:26:08,384][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 07:26:10,017][26599] Updated weights for policy 0, policy_version 305864 (0.0028) [2024-06-19 07:26:13,382][26367] Fps is (10 sec: 42593.2, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 5011423232. Throughput: 0: 42813.0. Samples: 1279071720. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:26:13,382][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 07:26:13,602][26599] Updated weights for policy 0, policy_version 305874 (0.0029) [2024-06-19 07:26:17,537][26599] Updated weights for policy 0, policy_version 305884 (0.0032) [2024-06-19 07:26:18,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5011636224. Throughput: 0: 42816.1. Samples: 1279205180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:26:18,381][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 07:26:21,196][26599] Updated weights for policy 0, policy_version 305894 (0.0043) [2024-06-19 07:26:23,380][26367] Fps is (10 sec: 42603.5, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 5011849216. Throughput: 0: 43043.8. Samples: 1279462640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:26:23,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 07:26:25,665][26599] Updated weights for policy 0, policy_version 305904 (0.0029) [2024-06-19 07:26:26,455][26579] Signal inference workers to stop experience collection... (18900 times) [2024-06-19 07:26:26,456][26579] Signal inference workers to resume experience collection... (18900 times) [2024-06-19 07:26:26,482][26599] InferenceWorker_p0-w0: stopping experience collection (18900 times) [2024-06-19 07:26:26,482][26599] InferenceWorker_p0-w0: resuming experience collection (18900 times) [2024-06-19 07:26:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.6). Total num frames: 5012078592. Throughput: 0: 42892.6. Samples: 1279712680. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 07:26:28,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 07:26:28,781][26599] Updated weights for policy 0, policy_version 305914 (0.0047) [2024-06-19 07:26:33,035][26599] Updated weights for policy 0, policy_version 305924 (0.0039) [2024-06-19 07:26:33,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5012275200. Throughput: 0: 42895.7. Samples: 1279846340. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:26:33,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 07:26:36,292][26599] Updated weights for policy 0, policy_version 305934 (0.0028) [2024-06-19 07:26:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5012488192. Throughput: 0: 42937.4. Samples: 1280101400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:26:38,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 07:26:40,530][26599] Updated weights for policy 0, policy_version 305944 (0.0032) [2024-06-19 07:26:43,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5012717568. Throughput: 0: 42829.0. Samples: 1280355940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:26:43,380][26367] Avg episode reward: [(0, '0.361')] [2024-06-19 07:26:44,255][26599] Updated weights for policy 0, policy_version 305954 (0.0023) [2024-06-19 07:26:48,183][26599] Updated weights for policy 0, policy_version 305964 (0.0042) [2024-06-19 07:26:48,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5012914176. Throughput: 0: 42565.8. Samples: 1280480360. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:26:48,383][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 07:26:51,835][26599] Updated weights for policy 0, policy_version 305974 (0.0029) [2024-06-19 07:26:53,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5013143552. Throughput: 0: 42742.6. Samples: 1280736860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:26:53,380][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 07:26:55,869][26599] Updated weights for policy 0, policy_version 305984 (0.0038) [2024-06-19 07:26:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5013340160. Throughput: 0: 42788.8. Samples: 1280997160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:26:58,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 07:26:59,382][26599] Updated weights for policy 0, policy_version 305994 (0.0029) [2024-06-19 07:27:03,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5013553152. Throughput: 0: 42587.0. Samples: 1281121600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:27:03,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 07:27:03,531][26599] Updated weights for policy 0, policy_version 306004 (0.0038) [2024-06-19 07:27:06,894][26599] Updated weights for policy 0, policy_version 306014 (0.0049) [2024-06-19 07:27:08,380][26367] Fps is (10 sec: 44237.1, 60 sec: 43147.1, 300 sec: 42653.9). Total num frames: 5013782528. Throughput: 0: 42481.9. Samples: 1281374320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:27:08,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 07:27:11,125][26599] Updated weights for policy 0, policy_version 306024 (0.0037) [2024-06-19 07:27:13,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42599.4, 300 sec: 42709.5). Total num frames: 5013979136. Throughput: 0: 42838.7. Samples: 1281640420. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:27:13,380][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 07:27:14,609][26599] Updated weights for policy 0, policy_version 306034 (0.0028) [2024-06-19 07:27:18,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42595.8, 300 sec: 42653.7). Total num frames: 5014192128. Throughput: 0: 42668.5. Samples: 1281766580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:27:18,385][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 07:27:18,719][26599] Updated weights for policy 0, policy_version 306044 (0.0032) [2024-06-19 07:27:21,968][26599] Updated weights for policy 0, policy_version 306054 (0.0030) [2024-06-19 07:27:23,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5014421504. Throughput: 0: 42676.4. Samples: 1282021840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:27:23,381][26367] Avg episode reward: [(0, '0.701')] [2024-06-19 07:27:26,125][26599] Updated weights for policy 0, policy_version 306064 (0.0034) [2024-06-19 07:27:28,384][26367] Fps is (10 sec: 45875.1, 60 sec: 42868.9, 300 sec: 42820.0). Total num frames: 5014650880. Throughput: 0: 42986.6. Samples: 1282290500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:27:28,384][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 07:27:29,630][26599] Updated weights for policy 0, policy_version 306074 (0.0028) [2024-06-19 07:27:33,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5014847488. Throughput: 0: 43094.2. Samples: 1282419600. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:27:33,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 07:27:34,197][26599] Updated weights for policy 0, policy_version 306084 (0.0029) [2024-06-19 07:27:37,135][26599] Updated weights for policy 0, policy_version 306094 (0.0026) [2024-06-19 07:27:38,380][26367] Fps is (10 sec: 42613.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5015076864. Throughput: 0: 42982.2. Samples: 1282671060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:27:38,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 07:27:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000306096_5015076864.pth... [2024-06-19 07:27:38,462][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000305469_5004804096.pth [2024-06-19 07:27:41,729][26599] Updated weights for policy 0, policy_version 306104 (0.0035) [2024-06-19 07:27:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5015273472. Throughput: 0: 43141.3. Samples: 1282938520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 07:27:43,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 07:27:44,697][26599] Updated weights for policy 0, policy_version 306114 (0.0040) [2024-06-19 07:27:48,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 5015486464. Throughput: 0: 43092.4. Samples: 1283060760. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:27:48,381][26367] Avg episode reward: [(0, '0.791')] [2024-06-19 07:27:49,191][26599] Updated weights for policy 0, policy_version 306124 (0.0048) [2024-06-19 07:27:52,499][26599] Updated weights for policy 0, policy_version 306134 (0.0025) [2024-06-19 07:27:53,380][26367] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5015732224. Throughput: 0: 43223.5. Samples: 1283319380. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:27:53,382][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 07:27:57,287][26599] Updated weights for policy 0, policy_version 306144 (0.0034) [2024-06-19 07:27:57,941][26579] Signal inference workers to stop experience collection... (18950 times) [2024-06-19 07:27:58,004][26599] InferenceWorker_p0-w0: stopping experience collection (18950 times) [2024-06-19 07:27:58,004][26579] Signal inference workers to resume experience collection... (18950 times) [2024-06-19 07:27:58,022][26599] InferenceWorker_p0-w0: resuming experience collection (18950 times) [2024-06-19 07:27:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5015928832. Throughput: 0: 43156.4. Samples: 1283582460. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:27:58,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 07:27:59,894][26599] Updated weights for policy 0, policy_version 306154 (0.0035) [2024-06-19 07:28:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5016125440. Throughput: 0: 43010.6. Samples: 1283701900. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:03,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 07:28:04,850][26599] Updated weights for policy 0, policy_version 306164 (0.0039) [2024-06-19 07:28:07,637][26599] Updated weights for policy 0, policy_version 306174 (0.0032) [2024-06-19 07:28:08,380][26367] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5016371200. Throughput: 0: 43145.8. Samples: 1283963400. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:08,380][26367] Avg episode reward: [(0, '0.332')] [2024-06-19 07:28:12,720][26599] Updated weights for policy 0, policy_version 306184 (0.0047) [2024-06-19 07:28:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5016535040. 
Throughput: 0: 42966.6. Samples: 1284223840. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:13,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 07:28:15,430][26599] Updated weights for policy 0, policy_version 306194 (0.0045) [2024-06-19 07:28:18,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42874.1, 300 sec: 42709.5). Total num frames: 5016764416. Throughput: 0: 42626.8. Samples: 1284337800. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:18,380][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 07:28:20,470][26599] Updated weights for policy 0, policy_version 306204 (0.0023) [2024-06-19 07:28:23,057][26599] Updated weights for policy 0, policy_version 306214 (0.0028) [2024-06-19 07:28:23,380][26367] Fps is (10 sec: 49151.1, 60 sec: 43417.4, 300 sec: 42931.6). Total num frames: 5017026560. Throughput: 0: 42827.8. Samples: 1284598320. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:23,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 07:28:28,179][26599] Updated weights for policy 0, policy_version 306224 (0.0028) [2024-06-19 07:28:28,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42054.8, 300 sec: 42654.5). Total num frames: 5017174016. Throughput: 0: 42899.1. Samples: 1284868980. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:28,381][26367] Avg episode reward: [(0, '0.763')] [2024-06-19 07:28:30,867][26599] Updated weights for policy 0, policy_version 306234 (0.0027) [2024-06-19 07:28:33,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5017419776. Throughput: 0: 42686.3. Samples: 1284981640. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:33,381][26367] Avg episode reward: [(0, '0.753')] [2024-06-19 07:28:35,966][26599] Updated weights for policy 0, policy_version 306244 (0.0038) [2024-06-19 07:28:38,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5017632768. 
Throughput: 0: 42600.0. Samples: 1285236380. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:38,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 07:28:38,835][26599] Updated weights for policy 0, policy_version 306254 (0.0034) [2024-06-19 07:28:43,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5017796608. Throughput: 0: 42456.4. Samples: 1285493000. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:43,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 07:28:43,785][26599] Updated weights for policy 0, policy_version 306264 (0.0044) [2024-06-19 07:28:46,637][26599] Updated weights for policy 0, policy_version 306274 (0.0037) [2024-06-19 07:28:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5018058752. Throughput: 0: 42410.2. Samples: 1285610360. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:48,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 07:28:51,598][26599] Updated weights for policy 0, policy_version 306284 (0.0039) [2024-06-19 07:28:53,380][26367] Fps is (10 sec: 49152.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5018288128. Throughput: 0: 42374.1. Samples: 1285870240. Policy #0 lag: (min: 1.0, avg: 11.7, max: 24.0) [2024-06-19 07:28:53,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 07:28:54,627][26599] Updated weights for policy 0, policy_version 306294 (0.0033) [2024-06-19 07:28:58,380][26367] Fps is (10 sec: 37683.4, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 5018435584. Throughput: 0: 42351.2. Samples: 1286129640. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:28:58,389][26367] Avg episode reward: [(0, '0.790')] [2024-06-19 07:28:59,254][26599] Updated weights for policy 0, policy_version 306304 (0.0031) [2024-06-19 07:28:59,286][26579] Signal inference workers to stop experience collection... 
(19000 times) [2024-06-19 07:28:59,287][26579] Signal inference workers to resume experience collection... (19000 times) [2024-06-19 07:28:59,314][26599] InferenceWorker_p0-w0: stopping experience collection (19000 times) [2024-06-19 07:28:59,314][26599] InferenceWorker_p0-w0: resuming experience collection (19000 times) [2024-06-19 07:29:02,335][26599] Updated weights for policy 0, policy_version 306314 (0.0023) [2024-06-19 07:29:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5018697728. Throughput: 0: 42312.8. Samples: 1286241880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:03,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 07:29:06,883][26599] Updated weights for policy 0, policy_version 306324 (0.0034) [2024-06-19 07:29:08,380][26367] Fps is (10 sec: 47513.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5018910720. Throughput: 0: 42423.8. Samples: 1286507380. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:08,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 07:29:10,088][26599] Updated weights for policy 0, policy_version 306334 (0.0039) [2024-06-19 07:29:13,384][26367] Fps is (10 sec: 37669.9, 60 sec: 42322.8, 300 sec: 42542.3). Total num frames: 5019074560. Throughput: 0: 41993.1. Samples: 1286758820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:13,384][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 07:29:14,632][26599] Updated weights for policy 0, policy_version 306344 (0.0037) [2024-06-19 07:29:17,754][26599] Updated weights for policy 0, policy_version 306354 (0.0030) [2024-06-19 07:29:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5019336704. Throughput: 0: 42155.1. Samples: 1286878620. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:18,381][26367] Avg episode reward: [(0, '0.433')] [2024-06-19 07:29:22,292][26599] Updated weights for policy 0, policy_version 306364 (0.0027) [2024-06-19 07:29:23,380][26367] Fps is (10 sec: 44252.9, 60 sec: 41506.3, 300 sec: 42653.9). Total num frames: 5019516928. Throughput: 0: 42281.0. Samples: 1287139020. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:23,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 07:29:25,336][26599] Updated weights for policy 0, policy_version 306374 (0.0024) [2024-06-19 07:29:28,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5019713536. Throughput: 0: 42215.6. Samples: 1287392700. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:28,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 07:29:30,016][26599] Updated weights for policy 0, policy_version 306384 (0.0021) [2024-06-19 07:29:32,806][26599] Updated weights for policy 0, policy_version 306394 (0.0040) [2024-06-19 07:29:33,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5019975680. Throughput: 0: 42326.6. Samples: 1287515060. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:33,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 07:29:37,840][26599] Updated weights for policy 0, policy_version 306404 (0.0041) [2024-06-19 07:29:38,380][26367] Fps is (10 sec: 42598.0, 60 sec: 41779.2, 300 sec: 42654.4). Total num frames: 5020139520. Throughput: 0: 42411.6. Samples: 1287778760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:38,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 07:29:38,481][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000306406_5020155904.pth... 
[2024-06-19 07:29:38,537][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000305782_5009932288.pth [2024-06-19 07:29:40,390][26599] Updated weights for policy 0, policy_version 306414 (0.0031) [2024-06-19 07:29:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5020368896. Throughput: 0: 42192.8. Samples: 1288028320. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:43,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 07:29:45,430][26599] Updated weights for policy 0, policy_version 306424 (0.0044) [2024-06-19 07:29:48,187][26599] Updated weights for policy 0, policy_version 306434 (0.0039) [2024-06-19 07:29:48,380][26367] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5020614656. Throughput: 0: 42724.9. Samples: 1288164500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:48,381][26367] Avg episode reward: [(0, '0.732')] [2024-06-19 07:29:52,962][26599] Updated weights for policy 0, policy_version 306444 (0.0039) [2024-06-19 07:29:53,384][26367] Fps is (10 sec: 40945.2, 60 sec: 41503.7, 300 sec: 42653.9). Total num frames: 5020778496. Throughput: 0: 42529.4. Samples: 1288421360. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:53,384][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 07:29:56,028][26599] Updated weights for policy 0, policy_version 306454 (0.0034) [2024-06-19 07:29:58,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5021007872. Throughput: 0: 42486.6. Samples: 1288670560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:29:58,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 07:30:00,745][26599] Updated weights for policy 0, policy_version 306464 (0.0035) [2024-06-19 07:30:03,380][26367] Fps is (10 sec: 45892.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5021237248. Throughput: 0: 42757.0. Samples: 1288802680. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:30:03,380][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 07:30:03,839][26599] Updated weights for policy 0, policy_version 306474 (0.0040) [2024-06-19 07:30:08,307][26599] Updated weights for policy 0, policy_version 306484 (0.0030) [2024-06-19 07:30:08,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5021433856. Throughput: 0: 42556.8. Samples: 1289054080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 07:30:08,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 07:30:11,879][26599] Updated weights for policy 0, policy_version 306494 (0.0047) [2024-06-19 07:30:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43147.1, 300 sec: 42709.5). Total num frames: 5021663232. Throughput: 0: 42273.8. Samples: 1289295020. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:13,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 07:30:16,417][26599] Updated weights for policy 0, policy_version 306504 (0.0031) [2024-06-19 07:30:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5021859840. Throughput: 0: 42533.3. Samples: 1289429060. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:18,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 07:30:19,551][26599] Updated weights for policy 0, policy_version 306514 (0.0032) [2024-06-19 07:30:23,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5022056448. Throughput: 0: 42311.2. Samples: 1289682760. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:23,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 07:30:24,185][26599] Updated weights for policy 0, policy_version 306524 (0.0039) [2024-06-19 07:30:27,080][26599] Updated weights for policy 0, policy_version 306534 (0.0042) [2024-06-19 07:30:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5022302208. Throughput: 0: 42307.1. Samples: 1289932140. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:28,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 07:30:31,712][26599] Updated weights for policy 0, policy_version 306544 (0.0022) [2024-06-19 07:30:33,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5022498816. Throughput: 0: 42341.0. Samples: 1290069840. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:33,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 07:30:34,774][26599] Updated weights for policy 0, policy_version 306554 (0.0027) [2024-06-19 07:30:35,499][26579] Signal inference workers to stop experience collection... (19050 times) [2024-06-19 07:30:35,526][26599] InferenceWorker_p0-w0: stopping experience collection (19050 times) [2024-06-19 07:30:35,566][26579] Signal inference workers to resume experience collection... (19050 times) [2024-06-19 07:30:35,566][26599] InferenceWorker_p0-w0: resuming experience collection (19050 times) [2024-06-19 07:30:38,380][26367] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5022679040. Throughput: 0: 42248.7. Samples: 1290322400. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:38,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 07:30:39,262][26599] Updated weights for policy 0, policy_version 306564 (0.0033) [2024-06-19 07:30:42,473][26599] Updated weights for policy 0, policy_version 306574 (0.0045) [2024-06-19 07:30:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5022941184. Throughput: 0: 42229.3. Samples: 1290570880. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:43,380][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 07:30:47,148][26599] Updated weights for policy 0, policy_version 306584 (0.0043) [2024-06-19 07:30:48,380][26367] Fps is (10 sec: 45876.1, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 5023137792. Throughput: 0: 42339.2. Samples: 1290707940. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:48,380][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 07:30:49,998][26599] Updated weights for policy 0, policy_version 306594 (0.0034) [2024-06-19 07:30:53,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42327.9, 300 sec: 42431.8). Total num frames: 5023318016. Throughput: 0: 42274.3. Samples: 1290956420. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:53,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 07:30:54,920][26599] Updated weights for policy 0, policy_version 306604 (0.0033) [2024-06-19 07:30:57,571][26599] Updated weights for policy 0, policy_version 306614 (0.0046) [2024-06-19 07:30:58,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5023580160. Throughput: 0: 42510.5. Samples: 1291208000. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:30:58,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 07:31:02,491][26599] Updated weights for policy 0, policy_version 306624 (0.0028) [2024-06-19 07:31:03,384][26367] Fps is (10 sec: 45858.8, 60 sec: 42322.8, 300 sec: 42653.9). Total num frames: 5023776768. Throughput: 0: 42588.6. Samples: 1291345700. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:31:03,384][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 07:31:05,315][26599] Updated weights for policy 0, policy_version 306634 (0.0040) [2024-06-19 07:31:08,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42543.0). Total num frames: 5023973376. Throughput: 0: 42390.7. Samples: 1291590340. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:31:08,381][26367] Avg episode reward: [(0, '0.348')] [2024-06-19 07:31:10,170][26599] Updated weights for policy 0, policy_version 306644 (0.0022) [2024-06-19 07:31:13,160][26599] Updated weights for policy 0, policy_version 306654 (0.0031) [2024-06-19 07:31:13,380][26367] Fps is (10 sec: 44252.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5024219136. Throughput: 0: 42541.3. Samples: 1291846500. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:31:13,381][26367] Avg episode reward: [(0, '0.271')] [2024-06-19 07:31:18,141][26599] Updated weights for policy 0, policy_version 306664 (0.0035) [2024-06-19 07:31:18,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5024382976. Throughput: 0: 42435.8. Samples: 1291979460. Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:31:18,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 07:31:20,994][26599] Updated weights for policy 0, policy_version 306674 (0.0042) [2024-06-19 07:31:23,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5024612352. Throughput: 0: 42339.2. Samples: 1292227660. 
Policy #0 lag: (min: 0.0, avg: 11.9, max: 22.0) [2024-06-19 07:31:23,381][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 07:31:26,109][26599] Updated weights for policy 0, policy_version 306684 (0.0038) [2024-06-19 07:31:28,380][26367] Fps is (10 sec: 47514.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5024858112. Throughput: 0: 42503.1. Samples: 1292483520. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:31:28,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 07:31:28,451][26599] Updated weights for policy 0, policy_version 306694 (0.0030) [2024-06-19 07:31:33,380][26367] Fps is (10 sec: 39321.3, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 5025005568. Throughput: 0: 42480.3. Samples: 1292619560. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:31:33,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 07:31:33,976][26599] Updated weights for policy 0, policy_version 306704 (0.0036) [2024-06-19 07:31:36,007][26599] Updated weights for policy 0, policy_version 306714 (0.0024) [2024-06-19 07:31:38,380][26367] Fps is (10 sec: 40959.2, 60 sec: 43144.5, 300 sec: 42542.8). Total num frames: 5025267712. Throughput: 0: 42443.9. Samples: 1292866400. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:31:38,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 07:31:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000306718_5025267712.pth... [2024-06-19 07:31:38,464][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000306096_5015076864.pth [2024-06-19 07:31:41,669][26599] Updated weights for policy 0, policy_version 306724 (0.0032) [2024-06-19 07:31:42,015][26579] Signal inference workers to stop experience collection... (19100 times) [2024-06-19 07:31:42,066][26579] Signal inference workers to resume experience collection... 
(19100 times) [2024-06-19 07:31:42,068][26599] InferenceWorker_p0-w0: stopping experience collection (19100 times) [2024-06-19 07:31:42,103][26599] InferenceWorker_p0-w0: resuming experience collection (19100 times) [2024-06-19 07:31:43,380][26367] Fps is (10 sec: 49152.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5025497088. Throughput: 0: 42652.1. Samples: 1293127340. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:31:43,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 07:31:43,607][26599] Updated weights for policy 0, policy_version 306734 (0.0027) [2024-06-19 07:31:48,380][26367] Fps is (10 sec: 39322.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 5025660928. Throughput: 0: 42435.0. Samples: 1293255120. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:31:48,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 07:31:49,214][26599] Updated weights for policy 0, policy_version 306744 (0.0039) [2024-06-19 07:31:51,347][26599] Updated weights for policy 0, policy_version 306754 (0.0037) [2024-06-19 07:31:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5025906688. Throughput: 0: 42594.7. Samples: 1293507100. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:31:53,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 07:31:56,789][26599] Updated weights for policy 0, policy_version 306764 (0.0023) [2024-06-19 07:31:58,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5026136064. Throughput: 0: 42759.2. Samples: 1293770660. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:31:58,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 07:31:59,081][26599] Updated weights for policy 0, policy_version 306774 (0.0040) [2024-06-19 07:32:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42327.8, 300 sec: 42487.3). Total num frames: 5026316288. Throughput: 0: 42726.3. Samples: 1293902140. 
Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:32:03,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 07:32:04,172][26599] Updated weights for policy 0, policy_version 306784 (0.0040) [2024-06-19 07:32:06,662][26599] Updated weights for policy 0, policy_version 306794 (0.0032) [2024-06-19 07:32:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5026545664. Throughput: 0: 42813.0. Samples: 1294154240. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:32:08,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 07:32:11,757][26599] Updated weights for policy 0, policy_version 306804 (0.0030) [2024-06-19 07:32:13,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42654.4). Total num frames: 5026775040. Throughput: 0: 42880.7. Samples: 1294413160. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:32:13,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 07:32:14,578][26599] Updated weights for policy 0, policy_version 306814 (0.0035) [2024-06-19 07:32:18,387][26367] Fps is (10 sec: 42569.3, 60 sec: 43139.8, 300 sec: 42541.9). Total num frames: 5026971648. Throughput: 0: 42762.5. Samples: 1294544160. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:32:18,388][26367] Avg episode reward: [(0, '0.335')] [2024-06-19 07:32:19,325][26599] Updated weights for policy 0, policy_version 306824 (0.0033) [2024-06-19 07:32:22,082][26599] Updated weights for policy 0, policy_version 306834 (0.0037) [2024-06-19 07:32:23,384][26367] Fps is (10 sec: 42583.5, 60 sec: 43141.9, 300 sec: 42542.9). Total num frames: 5027201024. Throughput: 0: 42867.8. Samples: 1294795600. 
Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:32:23,385][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 07:32:26,760][26599] Updated weights for policy 0, policy_version 306844 (0.0042) [2024-06-19 07:32:28,387][26367] Fps is (10 sec: 45876.7, 60 sec: 42866.9, 300 sec: 42653.0). Total num frames: 5027430400. Throughput: 0: 42827.3. Samples: 1295054840. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:32:28,387][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 07:32:30,133][26599] Updated weights for policy 0, policy_version 306854 (0.0039) [2024-06-19 07:32:33,380][26367] Fps is (10 sec: 39335.8, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 5027594240. Throughput: 0: 42828.8. Samples: 1295182420. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:32:33,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 07:32:34,522][26599] Updated weights for policy 0, policy_version 306864 (0.0034) [2024-06-19 07:32:37,854][26599] Updated weights for policy 0, policy_version 306874 (0.0040) [2024-06-19 07:32:38,380][26367] Fps is (10 sec: 39346.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5027823616. Throughput: 0: 42971.1. Samples: 1295440800. Policy #0 lag: (min: 1.0, avg: 7.4, max: 19.0) [2024-06-19 07:32:38,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 07:32:42,039][26599] Updated weights for policy 0, policy_version 306884 (0.0047) [2024-06-19 07:32:42,998][26579] Signal inference workers to stop experience collection... (19150 times) [2024-06-19 07:32:43,053][26599] InferenceWorker_p0-w0: stopping experience collection (19150 times) [2024-06-19 07:32:43,053][26579] Signal inference workers to resume experience collection... (19150 times) [2024-06-19 07:32:43,074][26599] InferenceWorker_p0-w0: resuming experience collection (19150 times) [2024-06-19 07:32:43,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5028052992. Throughput: 0: 42674.6. 
Samples: 1295691020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:32:43,381][26367] Avg episode reward: [(0, '0.372')] [2024-06-19 07:32:45,710][26599] Updated weights for policy 0, policy_version 306894 (0.0036) [2024-06-19 07:32:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42376.2). Total num frames: 5028233216. Throughput: 0: 42421.0. Samples: 1295811080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:32:48,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 07:32:50,389][26599] Updated weights for policy 0, policy_version 306904 (0.0042) [2024-06-19 07:32:53,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5028446208. Throughput: 0: 42482.2. Samples: 1296065940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:32:53,380][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 07:32:53,557][26599] Updated weights for policy 0, policy_version 306914 (0.0034) [2024-06-19 07:32:57,846][26599] Updated weights for policy 0, policy_version 306924 (0.0045) [2024-06-19 07:32:58,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5028675584. Throughput: 0: 42434.4. Samples: 1296322700. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:32:58,380][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 07:33:01,347][26599] Updated weights for policy 0, policy_version 306934 (0.0038) [2024-06-19 07:33:03,380][26367] Fps is (10 sec: 44235.7, 60 sec: 42871.4, 300 sec: 42431.7). Total num frames: 5028888576. Throughput: 0: 42347.5. Samples: 1296449520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:03,381][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 07:33:05,461][26599] Updated weights for policy 0, policy_version 306944 (0.0041) [2024-06-19 07:33:08,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 5029085184. Throughput: 0: 42406.9. 
Samples: 1296703760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:08,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 07:33:09,053][26599] Updated weights for policy 0, policy_version 306954 (0.0040) [2024-06-19 07:33:13,088][26599] Updated weights for policy 0, policy_version 306964 (0.0034) [2024-06-19 07:33:13,380][26367] Fps is (10 sec: 40961.0, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 5029298176. Throughput: 0: 42329.2. Samples: 1296959380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:13,380][26367] Avg episode reward: [(0, '0.455')] [2024-06-19 07:33:17,055][26599] Updated weights for policy 0, policy_version 306974 (0.0038) [2024-06-19 07:33:18,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42876.2, 300 sec: 42431.8). Total num frames: 5029543936. Throughput: 0: 42267.1. Samples: 1297084440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:18,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 07:33:20,654][26599] Updated weights for policy 0, policy_version 306984 (0.0029) [2024-06-19 07:33:23,382][26367] Fps is (10 sec: 42591.2, 60 sec: 42053.7, 300 sec: 42542.6). Total num frames: 5029724160. Throughput: 0: 42177.6. Samples: 1297338860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:23,382][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 07:33:24,619][26599] Updated weights for policy 0, policy_version 306994 (0.0031) [2024-06-19 07:33:28,274][26599] Updated weights for policy 0, policy_version 307004 (0.0036) [2024-06-19 07:33:28,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42056.8, 300 sec: 42487.3). Total num frames: 5029953536. Throughput: 0: 42383.3. Samples: 1297598260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:28,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 07:33:32,251][26599] Updated weights for policy 0, policy_version 307014 (0.0029) [2024-06-19 07:33:33,380][26367] Fps is (10 sec: 44243.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5030166528. Throughput: 0: 42580.9. Samples: 1297727220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:33,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 07:33:35,827][26599] Updated weights for policy 0, policy_version 307024 (0.0025) [2024-06-19 07:33:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5030346752. Throughput: 0: 42563.5. Samples: 1297981300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:38,381][26367] Avg episode reward: [(0, '0.411')] [2024-06-19 07:33:38,503][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000307029_5030363136.pth... [2024-06-19 07:33:38,566][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000306406_5020155904.pth [2024-06-19 07:33:40,080][26599] Updated weights for policy 0, policy_version 307034 (0.0044) [2024-06-19 07:33:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5030592512. Throughput: 0: 42435.5. Samples: 1298232300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:43,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 07:33:43,625][26599] Updated weights for policy 0, policy_version 307044 (0.0036) [2024-06-19 07:33:47,800][26599] Updated weights for policy 0, policy_version 307054 (0.0043) [2024-06-19 07:33:48,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 5030805504. Throughput: 0: 42633.1. Samples: 1298368000. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:48,381][26367] Avg episode reward: [(0, '0.825')] [2024-06-19 07:33:51,268][26599] Updated weights for policy 0, policy_version 307064 (0.0033) [2024-06-19 07:33:53,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5030985728. Throughput: 0: 42457.8. Samples: 1298614360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 07:33:53,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 07:33:55,610][26599] Updated weights for policy 0, policy_version 307074 (0.0029) [2024-06-19 07:33:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5031231488. Throughput: 0: 42386.6. Samples: 1298866780. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:33:58,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 07:33:58,955][26599] Updated weights for policy 0, policy_version 307084 (0.0027) [2024-06-19 07:34:01,371][26579] Signal inference workers to stop experience collection... (19200 times) [2024-06-19 07:34:01,371][26579] Signal inference workers to resume experience collection... (19200 times) [2024-06-19 07:34:01,388][26599] InferenceWorker_p0-w0: stopping experience collection (19200 times) [2024-06-19 07:34:01,388][26599] InferenceWorker_p0-w0: resuming experience collection (19200 times) [2024-06-19 07:34:03,368][26599] Updated weights for policy 0, policy_version 307094 (0.0035) [2024-06-19 07:34:03,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5031428096. Throughput: 0: 42596.4. Samples: 1299001280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:03,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 07:34:06,471][26599] Updated weights for policy 0, policy_version 307104 (0.0040) [2024-06-19 07:34:08,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42325.3, 300 sec: 42543.4). Total num frames: 5031624704. 
Throughput: 0: 42381.9. Samples: 1299245980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:08,381][26367] Avg episode reward: [(0, '0.297')] [2024-06-19 07:34:10,986][26599] Updated weights for policy 0, policy_version 307114 (0.0038) [2024-06-19 07:34:13,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 5031854080. Throughput: 0: 42290.7. Samples: 1299501340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:13,381][26367] Avg episode reward: [(0, '0.399')] [2024-06-19 07:34:14,223][26599] Updated weights for policy 0, policy_version 307124 (0.0035) [2024-06-19 07:34:18,380][26367] Fps is (10 sec: 42598.9, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 5032050688. Throughput: 0: 42461.4. Samples: 1299637980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:18,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 07:34:19,091][26599] Updated weights for policy 0, policy_version 307134 (0.0038) [2024-06-19 07:34:22,148][26599] Updated weights for policy 0, policy_version 307144 (0.0044) [2024-06-19 07:34:23,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42599.4, 300 sec: 42598.4). Total num frames: 5032280064. Throughput: 0: 42386.5. Samples: 1299888700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:23,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 07:34:26,524][26599] Updated weights for policy 0, policy_version 307154 (0.0038) [2024-06-19 07:34:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5032493056. Throughput: 0: 42482.3. Samples: 1300144000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:28,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 07:34:29,779][26599] Updated weights for policy 0, policy_version 307164 (0.0023) [2024-06-19 07:34:33,384][26367] Fps is (10 sec: 40946.0, 60 sec: 42049.7, 300 sec: 42542.4). Total num frames: 5032689664. 
Throughput: 0: 42369.9. Samples: 1300274800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:33,384][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 07:34:34,078][26599] Updated weights for policy 0, policy_version 307174 (0.0042) [2024-06-19 07:34:37,763][26599] Updated weights for policy 0, policy_version 307184 (0.0034) [2024-06-19 07:34:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5032919040. Throughput: 0: 42590.3. Samples: 1300530920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:38,380][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 07:34:41,887][26599] Updated weights for policy 0, policy_version 307194 (0.0032) [2024-06-19 07:34:43,380][26367] Fps is (10 sec: 45891.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5033148416. Throughput: 0: 42564.4. Samples: 1300782180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:43,381][26367] Avg episode reward: [(0, '0.758')] [2024-06-19 07:34:45,368][26599] Updated weights for policy 0, policy_version 307204 (0.0038) [2024-06-19 07:34:48,384][26367] Fps is (10 sec: 40944.7, 60 sec: 42049.7, 300 sec: 42542.9). Total num frames: 5033328640. Throughput: 0: 42492.2. Samples: 1300913580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:48,384][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 07:34:49,514][26599] Updated weights for policy 0, policy_version 307214 (0.0053) [2024-06-19 07:34:52,955][26599] Updated weights for policy 0, policy_version 307224 (0.0029) [2024-06-19 07:34:53,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42868.9, 300 sec: 42542.3). Total num frames: 5033558016. Throughput: 0: 42779.8. Samples: 1301171220. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:53,384][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 07:34:57,260][26599] Updated weights for policy 0, policy_version 307234 (0.0041) [2024-06-19 07:34:58,380][26367] Fps is (10 sec: 47530.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5033803776. Throughput: 0: 42608.7. Samples: 1301418740. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:34:58,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 07:35:00,778][26599] Updated weights for policy 0, policy_version 307244 (0.0024) [2024-06-19 07:35:03,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5033967616. Throughput: 0: 42515.5. Samples: 1301551180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:35:03,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 07:35:04,951][26599] Updated weights for policy 0, policy_version 307254 (0.0034) [2024-06-19 07:35:08,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5034196992. Throughput: 0: 42575.2. Samples: 1301804580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 21.0) [2024-06-19 07:35:08,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 07:35:08,471][26599] Updated weights for policy 0, policy_version 307264 (0.0034) [2024-06-19 07:35:12,512][26599] Updated weights for policy 0, policy_version 307274 (0.0038) [2024-06-19 07:35:13,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5034426368. Throughput: 0: 42587.9. Samples: 1302060460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:13,386][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 07:35:16,223][26599] Updated weights for policy 0, policy_version 307284 (0.0036) [2024-06-19 07:35:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5034622976. Throughput: 0: 42579.4. Samples: 1302190720. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:18,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 07:35:20,115][26599] Updated weights for policy 0, policy_version 307294 (0.0035) [2024-06-19 07:35:20,661][26579] Signal inference workers to stop experience collection... (19250 times) [2024-06-19 07:35:20,662][26579] Signal inference workers to resume experience collection... (19250 times) [2024-06-19 07:35:20,704][26599] InferenceWorker_p0-w0: stopping experience collection (19250 times) [2024-06-19 07:35:20,704][26599] InferenceWorker_p0-w0: resuming experience collection (19250 times) [2024-06-19 07:35:23,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42595.9, 300 sec: 42486.8). Total num frames: 5034835968. Throughput: 0: 42514.2. Samples: 1302444220. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:23,385][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 07:35:23,874][26599] Updated weights for policy 0, policy_version 307304 (0.0041) [2024-06-19 07:35:27,573][26599] Updated weights for policy 0, policy_version 307314 (0.0033) [2024-06-19 07:35:28,380][26367] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5035081728. Throughput: 0: 42739.1. Samples: 1302705440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:28,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 07:35:31,471][26599] Updated weights for policy 0, policy_version 307324 (0.0022) [2024-06-19 07:35:33,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42874.0, 300 sec: 42654.0). Total num frames: 5035261952. Throughput: 0: 42703.5. Samples: 1302835080. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:33,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 07:35:35,298][26599] Updated weights for policy 0, policy_version 307334 (0.0039) [2024-06-19 07:35:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 5035491328. 
Throughput: 0: 42665.2. Samples: 1303091000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:38,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 07:35:38,387][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000307342_5035491328.pth... [2024-06-19 07:35:38,450][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000306718_5025267712.pth [2024-06-19 07:35:39,373][26599] Updated weights for policy 0, policy_version 307344 (0.0033) [2024-06-19 07:35:42,866][26599] Updated weights for policy 0, policy_version 307354 (0.0041) [2024-06-19 07:35:43,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5035720704. Throughput: 0: 42848.7. Samples: 1303346920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:43,380][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 07:35:47,034][26599] Updated weights for policy 0, policy_version 307364 (0.0041) [2024-06-19 07:35:48,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5035884544. Throughput: 0: 42896.5. Samples: 1303481520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:48,381][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 07:35:50,349][26599] Updated weights for policy 0, policy_version 307374 (0.0027) [2024-06-19 07:35:53,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42874.0, 300 sec: 42542.9). Total num frames: 5036130304. Throughput: 0: 42861.4. Samples: 1303733340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:53,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 07:35:54,752][26599] Updated weights for policy 0, policy_version 307384 (0.0034) [2024-06-19 07:35:57,950][26599] Updated weights for policy 0, policy_version 307394 (0.0033) [2024-06-19 07:35:58,380][26367] Fps is (10 sec: 47514.0, 60 sec: 42598.5, 300 sec: 42654.5). Total num frames: 5036359680. Throughput: 0: 42778.3. 
Samples: 1303985480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:35:58,380][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 07:36:02,582][26599] Updated weights for policy 0, policy_version 307404 (0.0042) [2024-06-19 07:36:03,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5036523520. Throughput: 0: 42821.0. Samples: 1304117660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:36:03,380][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 07:36:05,490][26599] Updated weights for policy 0, policy_version 307414 (0.0044) [2024-06-19 07:36:08,380][26367] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5036785664. Throughput: 0: 42865.3. Samples: 1304373000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:36:08,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 07:36:10,792][26599] Updated weights for policy 0, policy_version 307424 (0.0040) [2024-06-19 07:36:13,288][26599] Updated weights for policy 0, policy_version 307434 (0.0031) [2024-06-19 07:36:13,380][26367] Fps is (10 sec: 47512.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5036998656. Throughput: 0: 42660.8. Samples: 1304625180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:36:13,381][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 07:36:18,272][26599] Updated weights for policy 0, policy_version 307444 (0.0029) [2024-06-19 07:36:18,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 5037162496. Throughput: 0: 42672.4. Samples: 1304755340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 22.0) [2024-06-19 07:36:18,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 07:36:21,075][26599] Updated weights for policy 0, policy_version 307454 (0.0034) [2024-06-19 07:36:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43147.1, 300 sec: 42598.4). Total num frames: 5037424640. Throughput: 0: 42464.4. 
Samples: 1305001900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:36:23,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 07:36:26,092][26599] Updated weights for policy 0, policy_version 307464 (0.0031) [2024-06-19 07:36:28,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5037621248. Throughput: 0: 42743.1. Samples: 1305270360. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:36:28,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 07:36:28,645][26599] Updated weights for policy 0, policy_version 307474 (0.0039) [2024-06-19 07:36:33,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5037801472. Throughput: 0: 42483.1. Samples: 1305393260. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:36:33,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 07:36:33,623][26599] Updated weights for policy 0, policy_version 307484 (0.0038) [2024-06-19 07:36:36,617][26599] Updated weights for policy 0, policy_version 307494 (0.0048) [2024-06-19 07:36:38,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5038063616. Throughput: 0: 42471.5. Samples: 1305644560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:36:38,384][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 07:36:41,260][26599] Updated weights for policy 0, policy_version 307504 (0.0037) [2024-06-19 07:36:42,006][26579] Signal inference workers to stop experience collection... (19300 times) [2024-06-19 07:36:42,056][26599] InferenceWorker_p0-w0: stopping experience collection (19300 times) [2024-06-19 07:36:42,062][26579] Signal inference workers to resume experience collection... (19300 times) [2024-06-19 07:36:42,070][26599] InferenceWorker_p0-w0: resuming experience collection (19300 times) [2024-06-19 07:36:43,380][26367] Fps is (10 sec: 42599.1, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 5038227456. 
Throughput: 0: 42915.1. Samples: 1305916660. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:36:43,380][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 07:36:44,123][26599] Updated weights for policy 0, policy_version 307514 (0.0031) [2024-06-19 07:36:48,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5038456832. Throughput: 0: 42520.3. Samples: 1306031080. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:36:48,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 07:36:48,711][26599] Updated weights for policy 0, policy_version 307524 (0.0027) [2024-06-19 07:36:51,668][26599] Updated weights for policy 0, policy_version 307534 (0.0028) [2024-06-19 07:36:53,380][26367] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5038702592. Throughput: 0: 42676.5. Samples: 1306293440. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:36:53,381][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 07:36:56,297][26599] Updated weights for policy 0, policy_version 307544 (0.0050) [2024-06-19 07:36:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.1, 300 sec: 42542.9). Total num frames: 5038866432. Throughput: 0: 42969.3. Samples: 1306558800. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:36:58,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 07:36:59,330][26599] Updated weights for policy 0, policy_version 307554 (0.0038) [2024-06-19 07:37:03,380][26367] Fps is (10 sec: 40959.4, 60 sec: 43144.3, 300 sec: 42598.4). Total num frames: 5039112192. Throughput: 0: 42804.8. Samples: 1306681560. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:37:03,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 07:37:03,778][26599] Updated weights for policy 0, policy_version 307564 (0.0029) [2024-06-19 07:37:06,757][26599] Updated weights for policy 0, policy_version 307574 (0.0034) [2024-06-19 07:37:08,380][26367] Fps is (10 sec: 47514.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5039341568. Throughput: 0: 42994.8. Samples: 1306936660. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:37:08,380][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 07:37:11,269][26599] Updated weights for policy 0, policy_version 307584 (0.0034) [2024-06-19 07:37:13,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42052.3, 300 sec: 42543.8). Total num frames: 5039521792. Throughput: 0: 42889.3. Samples: 1307200380. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:37:13,380][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 07:37:14,690][26599] Updated weights for policy 0, policy_version 307594 (0.0035) [2024-06-19 07:37:18,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42598.9). Total num frames: 5039767552. Throughput: 0: 42865.8. Samples: 1307322220. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:37:18,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 07:37:18,740][26599] Updated weights for policy 0, policy_version 307604 (0.0039) [2024-06-19 07:37:22,394][26599] Updated weights for policy 0, policy_version 307614 (0.0032) [2024-06-19 07:37:23,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.6, 300 sec: 42543.8). Total num frames: 5039980544. Throughput: 0: 43131.7. Samples: 1307585480. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:37:23,380][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 07:37:26,186][26599] Updated weights for policy 0, policy_version 307624 (0.0033) [2024-06-19 07:37:28,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5040160768. Throughput: 0: 42960.7. Samples: 1307849900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:37:28,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 07:37:30,075][26599] Updated weights for policy 0, policy_version 307634 (0.0030) [2024-06-19 07:37:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 5040406528. Throughput: 0: 43122.7. Samples: 1307971600. Policy #0 lag: (min: 0.0, avg: 12.5, max: 21.0) [2024-06-19 07:37:33,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 07:37:33,688][26599] Updated weights for policy 0, policy_version 307644 (0.0043) [2024-06-19 07:37:37,660][26599] Updated weights for policy 0, policy_version 307654 (0.0034) [2024-06-19 07:37:38,387][26367] Fps is (10 sec: 45846.5, 60 sec: 42594.0, 300 sec: 42597.5). Total num frames: 5040619520. Throughput: 0: 42940.6. Samples: 1308226040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:37:38,387][26367] Avg episode reward: [(0, '0.810')] [2024-06-19 07:37:38,509][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000307656_5040635904.pth... [2024-06-19 07:37:38,566][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000307029_5030363136.pth [2024-06-19 07:37:41,326][26599] Updated weights for policy 0, policy_version 307664 (0.0033) [2024-06-19 07:37:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5040816128. Throughput: 0: 42870.6. Samples: 1308487980. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:37:43,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 07:37:45,306][26599] Updated weights for policy 0, policy_version 307674 (0.0028) [2024-06-19 07:37:48,380][26367] Fps is (10 sec: 40986.6, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5041029120. Throughput: 0: 42868.7. Samples: 1308610640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:37:48,380][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 07:37:49,152][26599] Updated weights for policy 0, policy_version 307684 (0.0024) [2024-06-19 07:37:51,730][26579] Signal inference workers to stop experience collection... (19350 times) [2024-06-19 07:37:51,780][26599] InferenceWorker_p0-w0: stopping experience collection (19350 times) [2024-06-19 07:37:51,854][26579] Signal inference workers to resume experience collection... (19350 times) [2024-06-19 07:37:51,854][26599] InferenceWorker_p0-w0: resuming experience collection (19350 times) [2024-06-19 07:37:52,867][26599] Updated weights for policy 0, policy_version 307694 (0.0027) [2024-06-19 07:37:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5041258496. Throughput: 0: 42891.9. Samples: 1308866800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:37:53,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 07:37:57,470][26599] Updated weights for policy 0, policy_version 307704 (0.0035) [2024-06-19 07:37:58,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42654.0). Total num frames: 5041471488. Throughput: 0: 42953.8. Samples: 1309133300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:37:58,381][26367] Avg episode reward: [(0, '0.767')] [2024-06-19 07:38:00,593][26599] Updated weights for policy 0, policy_version 307714 (0.0038) [2024-06-19 07:38:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5041684480. 
Throughput: 0: 42931.6. Samples: 1309254140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:03,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 07:38:04,906][26599] Updated weights for policy 0, policy_version 307724 (0.0028) [2024-06-19 07:38:08,266][26599] Updated weights for policy 0, policy_version 307734 (0.0031) [2024-06-19 07:38:08,384][26367] Fps is (10 sec: 44221.5, 60 sec: 42869.0, 300 sec: 42764.5). Total num frames: 5041913856. Throughput: 0: 42786.4. Samples: 1309511020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:08,384][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 07:38:12,369][26599] Updated weights for policy 0, policy_version 307744 (0.0034) [2024-06-19 07:38:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5042110464. Throughput: 0: 42644.6. Samples: 1309768900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:13,380][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 07:38:15,927][26599] Updated weights for policy 0, policy_version 307754 (0.0029) [2024-06-19 07:38:18,380][26367] Fps is (10 sec: 40974.2, 60 sec: 42598.5, 300 sec: 42709.7). Total num frames: 5042323456. Throughput: 0: 42585.8. Samples: 1309887960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:18,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 07:38:20,239][26599] Updated weights for policy 0, policy_version 307764 (0.0024) [2024-06-19 07:38:23,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5042552832. Throughput: 0: 42655.8. Samples: 1310145280. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:23,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 07:38:23,645][26599] Updated weights for policy 0, policy_version 307774 (0.0037) [2024-06-19 07:38:27,768][26599] Updated weights for policy 0, policy_version 307784 (0.0047) [2024-06-19 07:38:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5042749440. Throughput: 0: 42537.8. Samples: 1310402180. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:28,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 07:38:31,211][26599] Updated weights for policy 0, policy_version 307794 (0.0024) [2024-06-19 07:38:33,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5042946048. Throughput: 0: 42578.5. Samples: 1310526680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:33,381][26367] Avg episode reward: [(0, '0.809')] [2024-06-19 07:38:35,908][26599] Updated weights for policy 0, policy_version 307804 (0.0047) [2024-06-19 07:38:38,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42876.0, 300 sec: 42709.5). Total num frames: 5043191808. Throughput: 0: 42635.6. Samples: 1310785400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:38,381][26367] Avg episode reward: [(0, '0.737')] [2024-06-19 07:38:38,806][26599] Updated weights for policy 0, policy_version 307814 (0.0028) [2024-06-19 07:38:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5043372032. Throughput: 0: 42486.6. Samples: 1311045200. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:43,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 07:38:43,402][26599] Updated weights for policy 0, policy_version 307824 (0.0041) [2024-06-19 07:38:46,607][26599] Updated weights for policy 0, policy_version 307834 (0.0027) [2024-06-19 07:38:48,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5043585024. Throughput: 0: 42506.6. Samples: 1311166940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 07:38:48,381][26367] Avg episode reward: [(0, '0.788')] [2024-06-19 07:38:50,803][26599] Updated weights for policy 0, policy_version 307844 (0.0044) [2024-06-19 07:38:53,380][26367] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5043847168. Throughput: 0: 42473.4. Samples: 1311422180. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:38:53,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 07:38:54,626][26599] Updated weights for policy 0, policy_version 307854 (0.0031) [2024-06-19 07:38:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5044011008. Throughput: 0: 42478.9. Samples: 1311680460. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:38:58,381][26367] Avg episode reward: [(0, '0.390')] [2024-06-19 07:38:58,821][26599] Updated weights for policy 0, policy_version 307864 (0.0038) [2024-06-19 07:39:02,462][26599] Updated weights for policy 0, policy_version 307874 (0.0033) [2024-06-19 07:39:03,380][26367] Fps is (10 sec: 37682.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5044224000. Throughput: 0: 42389.6. Samples: 1311795500. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:03,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 07:39:06,255][26599] Updated weights for policy 0, policy_version 307884 (0.0043) [2024-06-19 07:39:08,384][26367] Fps is (10 sec: 45858.7, 60 sec: 42598.2, 300 sec: 42764.5). Total num frames: 5044469760. Throughput: 0: 42612.5. Samples: 1312063000. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:08,385][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 07:39:10,079][26599] Updated weights for policy 0, policy_version 307894 (0.0031) [2024-06-19 07:39:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5044649984. Throughput: 0: 42637.3. Samples: 1312320860. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:13,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 07:39:14,010][26599] Updated weights for policy 0, policy_version 307904 (0.0035) [2024-06-19 07:39:17,884][26599] Updated weights for policy 0, policy_version 307914 (0.0041) [2024-06-19 07:39:18,380][26367] Fps is (10 sec: 40975.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5044879360. Throughput: 0: 42507.6. Samples: 1312439520. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:18,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 07:39:21,218][26579] Signal inference workers to stop experience collection... (19400 times) [2024-06-19 07:39:21,221][26579] Signal inference workers to resume experience collection... (19400 times) [2024-06-19 07:39:21,276][26599] InferenceWorker_p0-w0: stopping experience collection (19400 times) [2024-06-19 07:39:21,276][26599] InferenceWorker_p0-w0: resuming experience collection (19400 times) [2024-06-19 07:39:21,827][26599] Updated weights for policy 0, policy_version 307924 (0.0036) [2024-06-19 07:39:23,384][26367] Fps is (10 sec: 45858.4, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 5045108736. 
Throughput: 0: 42513.9. Samples: 1312698680. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:23,385][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 07:39:25,495][26599] Updated weights for policy 0, policy_version 307934 (0.0033) [2024-06-19 07:39:28,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42710.0). Total num frames: 5045288960. Throughput: 0: 42524.0. Samples: 1312958780. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:28,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 07:39:29,592][26599] Updated weights for policy 0, policy_version 307944 (0.0030) [2024-06-19 07:39:33,121][26599] Updated weights for policy 0, policy_version 307954 (0.0040) [2024-06-19 07:39:33,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5045518336. Throughput: 0: 42434.7. Samples: 1313076500. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:33,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 07:39:37,278][26599] Updated weights for policy 0, policy_version 307964 (0.0044) [2024-06-19 07:39:38,383][26367] Fps is (10 sec: 45863.0, 60 sec: 42596.5, 300 sec: 42709.1). Total num frames: 5045747712. Throughput: 0: 42664.6. Samples: 1313342200. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:38,384][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 07:39:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000307968_5045747712.pth... [2024-06-19 07:39:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000307342_5035491328.pth [2024-06-19 07:39:40,791][26599] Updated weights for policy 0, policy_version 307974 (0.0027) [2024-06-19 07:39:43,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42654.5). Total num frames: 5045911552. Throughput: 0: 42746.2. Samples: 1313604040. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:43,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 07:39:44,828][26599] Updated weights for policy 0, policy_version 307984 (0.0032) [2024-06-19 07:39:48,380][26367] Fps is (10 sec: 40970.6, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 5046157312. Throughput: 0: 42849.8. Samples: 1313723740. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:48,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 07:39:48,420][26599] Updated weights for policy 0, policy_version 307994 (0.0039) [2024-06-19 07:39:52,335][26599] Updated weights for policy 0, policy_version 308004 (0.0029) [2024-06-19 07:39:53,380][26367] Fps is (10 sec: 45876.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5046370304. Throughput: 0: 42669.4. Samples: 1313982960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:53,380][26367] Avg episode reward: [(0, '0.358')] [2024-06-19 07:39:55,990][26599] Updated weights for policy 0, policy_version 308014 (0.0042) [2024-06-19 07:39:58,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5046566912. Throughput: 0: 42609.8. Samples: 1314238300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:39:58,380][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 07:39:59,976][26599] Updated weights for policy 0, policy_version 308024 (0.0035) [2024-06-19 07:40:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5046796288. Throughput: 0: 42727.2. Samples: 1314362240. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 21.0) [2024-06-19 07:40:03,380][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 07:40:03,651][26599] Updated weights for policy 0, policy_version 308034 (0.0035) [2024-06-19 07:40:07,560][26599] Updated weights for policy 0, policy_version 308044 (0.0032) [2024-06-19 07:40:08,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5047025664. Throughput: 0: 42741.2. Samples: 1314621880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:08,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 07:40:11,804][26599] Updated weights for policy 0, policy_version 308054 (0.0041) [2024-06-19 07:40:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5047222272. Throughput: 0: 42676.9. Samples: 1314879240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:13,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 07:40:15,620][26599] Updated weights for policy 0, policy_version 308064 (0.0039) [2024-06-19 07:40:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42710.0). Total num frames: 5047435264. Throughput: 0: 42701.7. Samples: 1314998080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:18,381][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 07:40:19,350][26599] Updated weights for policy 0, policy_version 308074 (0.0039) [2024-06-19 07:40:23,234][26599] Updated weights for policy 0, policy_version 308084 (0.0039) [2024-06-19 07:40:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42327.8, 300 sec: 42598.4). Total num frames: 5047648256. Throughput: 0: 42509.1. Samples: 1315255000. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:23,381][26367] Avg episode reward: [(0, '0.868')] [2024-06-19 07:40:26,956][26599] Updated weights for policy 0, policy_version 308094 (0.0028) [2024-06-19 07:40:28,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5047828480. Throughput: 0: 42468.1. Samples: 1315515100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:28,381][26367] Avg episode reward: [(0, '0.701')] [2024-06-19 07:40:31,061][26599] Updated weights for policy 0, policy_version 308104 (0.0033) [2024-06-19 07:40:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5048074240. Throughput: 0: 42534.7. Samples: 1315637800. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:33,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 07:40:34,987][26599] Updated weights for policy 0, policy_version 308114 (0.0030) [2024-06-19 07:40:38,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42054.2, 300 sec: 42542.9). Total num frames: 5048270848. Throughput: 0: 42469.8. Samples: 1315894100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:38,380][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 07:40:38,558][26599] Updated weights for policy 0, policy_version 308124 (0.0037) [2024-06-19 07:40:42,671][26599] Updated weights for policy 0, policy_version 308134 (0.0040) [2024-06-19 07:40:43,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5048483840. Throughput: 0: 42356.5. Samples: 1316144340. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:43,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 07:40:46,293][26599] Updated weights for policy 0, policy_version 308144 (0.0039) [2024-06-19 07:40:48,384][26367] Fps is (10 sec: 44220.2, 60 sec: 42595.9, 300 sec: 42653.4). Total num frames: 5048713216. Throughput: 0: 42413.4. Samples: 1316271000. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:48,384][26367] Avg episode reward: [(0, '0.767')] [2024-06-19 07:40:50,306][26599] Updated weights for policy 0, policy_version 308154 (0.0030) [2024-06-19 07:40:53,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5048926208. Throughput: 0: 42416.6. Samples: 1316530620. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:53,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 07:40:54,068][26599] Updated weights for policy 0, policy_version 308164 (0.0033) [2024-06-19 07:40:57,876][26579] Signal inference workers to stop experience collection... (19450 times) [2024-06-19 07:40:57,924][26599] InferenceWorker_p0-w0: stopping experience collection (19450 times) [2024-06-19 07:40:57,927][26579] Signal inference workers to resume experience collection... (19450 times) [2024-06-19 07:40:57,933][26599] InferenceWorker_p0-w0: resuming experience collection (19450 times) [2024-06-19 07:40:58,074][26599] Updated weights for policy 0, policy_version 308174 (0.0057) [2024-06-19 07:40:58,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5049122816. Throughput: 0: 42283.2. Samples: 1316781980. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:40:58,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 07:41:01,962][26599] Updated weights for policy 0, policy_version 308184 (0.0030) [2024-06-19 07:41:03,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 5049352192. Throughput: 0: 42470.7. Samples: 1316909260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:41:03,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 07:41:05,852][26599] Updated weights for policy 0, policy_version 308194 (0.0045) [2024-06-19 07:41:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5049548800. Throughput: 0: 42572.5. 
Samples: 1317170760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:41:08,381][26367] Avg episode reward: [(0, '0.384')] [2024-06-19 07:41:09,847][26599] Updated weights for policy 0, policy_version 308204 (0.0036) [2024-06-19 07:41:13,304][26599] Updated weights for policy 0, policy_version 308214 (0.0037) [2024-06-19 07:41:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5049778176. Throughput: 0: 42255.1. Samples: 1317416580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:41:13,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 07:41:17,858][26599] Updated weights for policy 0, policy_version 308224 (0.0034) [2024-06-19 07:41:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5049958400. Throughput: 0: 42395.1. Samples: 1317545580. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 07:41:18,384][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 07:41:20,802][26599] Updated weights for policy 0, policy_version 308234 (0.0036) [2024-06-19 07:41:23,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 5050171392. Throughput: 0: 42366.2. Samples: 1317800580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:41:23,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 07:41:25,461][26599] Updated weights for policy 0, policy_version 308244 (0.0033) [2024-06-19 07:41:28,375][26599] Updated weights for policy 0, policy_version 308254 (0.0028) [2024-06-19 07:41:28,380][26367] Fps is (10 sec: 47514.1, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5050433536. Throughput: 0: 42430.1. Samples: 1318053700. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:41:28,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 07:41:32,914][26599] Updated weights for policy 0, policy_version 308264 (0.0030) [2024-06-19 07:41:33,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5050597376. Throughput: 0: 42526.5. Samples: 1318184540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:41:33,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 07:41:36,089][26599] Updated weights for policy 0, policy_version 308274 (0.0030) [2024-06-19 07:41:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5050843136. Throughput: 0: 42587.1. Samples: 1318447040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:41:38,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 07:41:38,520][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000308280_5050859520.pth... [2024-06-19 07:41:38,576][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000307656_5040635904.pth [2024-06-19 07:41:40,408][26599] Updated weights for policy 0, policy_version 308284 (0.0025) [2024-06-19 07:41:43,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5051056128. Throughput: 0: 42668.5. Samples: 1318702060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:41:43,381][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 07:41:43,570][26599] Updated weights for policy 0, policy_version 308294 (0.0026) [2024-06-19 07:41:48,067][26599] Updated weights for policy 0, policy_version 308304 (0.0030) [2024-06-19 07:41:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42328.0, 300 sec: 42542.9). Total num frames: 5051252736. Throughput: 0: 42685.1. Samples: 1318830080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:41:48,380][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 07:41:51,329][26599] Updated weights for policy 0, policy_version 308314 (0.0033) [2024-06-19 07:41:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5051482112. Throughput: 0: 42564.5. Samples: 1319086160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:41:53,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 07:41:56,250][26599] Updated weights for policy 0, policy_version 308324 (0.0039) [2024-06-19 07:41:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5051678720. Throughput: 0: 42822.2. Samples: 1319343580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:41:58,381][26367] Avg episode reward: [(0, '0.851')] [2024-06-19 07:41:58,979][26599] Updated weights for policy 0, policy_version 308334 (0.0030) [2024-06-19 07:42:03,384][26367] Fps is (10 sec: 40944.9, 60 sec: 42322.8, 300 sec: 42542.3). Total num frames: 5051891712. Throughput: 0: 42804.2. Samples: 1319471920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:42:03,385][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 07:42:03,773][26599] Updated weights for policy 0, policy_version 308344 (0.0032) [2024-06-19 07:42:06,936][26599] Updated weights for policy 0, policy_version 308354 (0.0032) [2024-06-19 07:42:08,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5052121088. Throughput: 0: 42695.4. Samples: 1319721880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:42:08,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 07:42:11,590][26599] Updated weights for policy 0, policy_version 308364 (0.0037) [2024-06-19 07:42:12,707][26579] Signal inference workers to stop experience collection... 
(19500 times) [2024-06-19 07:42:12,707][26579] Signal inference workers to resume experience collection... (19500 times) [2024-06-19 07:42:12,758][26599] InferenceWorker_p0-w0: stopping experience collection (19500 times) [2024-06-19 07:42:12,758][26599] InferenceWorker_p0-w0: resuming experience collection (19500 times) [2024-06-19 07:42:13,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5052317696. Throughput: 0: 42915.5. Samples: 1319984900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:42:13,381][26367] Avg episode reward: [(0, '0.415')] [2024-06-19 07:42:14,567][26599] Updated weights for policy 0, policy_version 308374 (0.0032) [2024-06-19 07:42:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42542.8). Total num frames: 5052530688. Throughput: 0: 42603.1. Samples: 1320101680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:42:18,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 07:42:19,127][26599] Updated weights for policy 0, policy_version 308384 (0.0041) [2024-06-19 07:42:22,486][26599] Updated weights for policy 0, policy_version 308394 (0.0030) [2024-06-19 07:42:23,380][26367] Fps is (10 sec: 44235.9, 60 sec: 43144.3, 300 sec: 42709.5). Total num frames: 5052760064. Throughput: 0: 42502.4. Samples: 1320359660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:42:23,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 07:42:27,046][26599] Updated weights for policy 0, policy_version 308404 (0.0034) [2024-06-19 07:42:28,380][26367] Fps is (10 sec: 40960.8, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 5052940288. Throughput: 0: 42398.7. Samples: 1320610000. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:42:28,380][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 07:42:30,227][26599] Updated weights for policy 0, policy_version 308414 (0.0039) [2024-06-19 07:42:33,380][26367] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 42599.3). Total num frames: 5053186048. Throughput: 0: 42379.1. Samples: 1320737140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 07:42:33,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 07:42:35,086][26599] Updated weights for policy 0, policy_version 308424 (0.0035) [2024-06-19 07:42:37,898][26599] Updated weights for policy 0, policy_version 308434 (0.0032) [2024-06-19 07:42:38,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5053382656. Throughput: 0: 42420.4. Samples: 1320995080. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:42:38,384][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 07:42:42,754][26599] Updated weights for policy 0, policy_version 308444 (0.0036) [2024-06-19 07:42:43,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 5053579264. Throughput: 0: 42296.0. Samples: 1321246900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:42:43,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 07:42:45,979][26599] Updated weights for policy 0, policy_version 308454 (0.0036) [2024-06-19 07:42:48,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5053825024. Throughput: 0: 42173.6. Samples: 1321369580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:42:48,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 07:42:50,194][26599] Updated weights for policy 0, policy_version 308464 (0.0039) [2024-06-19 07:42:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5054021632. Throughput: 0: 42507.6. Samples: 1321634720. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:42:53,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 07:42:53,518][26599] Updated weights for policy 0, policy_version 308474 (0.0030) [2024-06-19 07:42:57,694][26599] Updated weights for policy 0, policy_version 308484 (0.0045) [2024-06-19 07:42:58,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5054218240. Throughput: 0: 42279.0. Samples: 1321887460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:42:58,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 07:43:01,120][26599] Updated weights for policy 0, policy_version 308494 (0.0040) [2024-06-19 07:43:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42874.1, 300 sec: 42543.4). Total num frames: 5054464000. Throughput: 0: 42460.5. Samples: 1322012400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:03,381][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 07:43:05,201][26599] Updated weights for policy 0, policy_version 308504 (0.0032) [2024-06-19 07:43:08,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5054660608. Throughput: 0: 42613.5. Samples: 1322277260. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:08,381][26367] Avg episode reward: [(0, '0.414')] [2024-06-19 07:43:09,004][26599] Updated weights for policy 0, policy_version 308514 (0.0029) [2024-06-19 07:43:12,841][26599] Updated weights for policy 0, policy_version 308524 (0.0029) [2024-06-19 07:43:13,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5054857216. Throughput: 0: 42627.5. Samples: 1322528240. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:13,380][26367] Avg episode reward: [(0, '0.414')] [2024-06-19 07:43:16,534][26599] Updated weights for policy 0, policy_version 308534 (0.0028) [2024-06-19 07:43:18,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5055102976. Throughput: 0: 42616.8. Samples: 1322654900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:18,384][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 07:43:20,751][26599] Updated weights for policy 0, policy_version 308544 (0.0027) [2024-06-19 07:43:22,288][26579] Signal inference workers to stop experience collection... (19550 times) [2024-06-19 07:43:22,289][26579] Signal inference workers to resume experience collection... (19550 times) [2024-06-19 07:43:22,331][26599] InferenceWorker_p0-w0: stopping experience collection (19550 times) [2024-06-19 07:43:22,332][26599] InferenceWorker_p0-w0: resuming experience collection (19550 times) [2024-06-19 07:43:23,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5055315968. Throughput: 0: 42792.9. Samples: 1322920760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:23,381][26367] Avg episode reward: [(0, '0.905')] [2024-06-19 07:43:24,131][26599] Updated weights for policy 0, policy_version 308554 (0.0029) [2024-06-19 07:43:28,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5055496192. Throughput: 0: 42892.6. Samples: 1323177060. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:28,380][26367] Avg episode reward: [(0, '0.875')] [2024-06-19 07:43:28,485][26599] Updated weights for policy 0, policy_version 308564 (0.0038) [2024-06-19 07:43:31,701][26599] Updated weights for policy 0, policy_version 308574 (0.0027) [2024-06-19 07:43:33,384][26367] Fps is (10 sec: 44220.6, 60 sec: 42868.8, 300 sec: 42597.9). Total num frames: 5055758336. Throughput: 0: 42915.6. 
Samples: 1323300940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:33,385][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 07:43:36,201][26599] Updated weights for policy 0, policy_version 308584 (0.0033) [2024-06-19 07:43:38,380][26367] Fps is (10 sec: 44235.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5055938560. Throughput: 0: 42781.6. Samples: 1323559900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:38,381][26367] Avg episode reward: [(0, '0.808')] [2024-06-19 07:43:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000308590_5055938560.pth... [2024-06-19 07:43:38,465][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000307968_5045747712.pth [2024-06-19 07:43:39,399][26599] Updated weights for policy 0, policy_version 308594 (0.0039) [2024-06-19 07:43:43,380][26367] Fps is (10 sec: 37696.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5056135168. Throughput: 0: 42712.5. Samples: 1323809520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:43,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 07:43:43,752][26599] Updated weights for policy 0, policy_version 308604 (0.0034) [2024-06-19 07:43:47,067][26599] Updated weights for policy 0, policy_version 308614 (0.0039) [2024-06-19 07:43:48,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 5056348160. Throughput: 0: 42666.6. Samples: 1323932400. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:43:48,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 07:43:51,716][26599] Updated weights for policy 0, policy_version 308624 (0.0039) [2024-06-19 07:43:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5056561152. Throughput: 0: 42432.5. Samples: 1324186720. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:43:53,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 07:43:54,878][26599] Updated weights for policy 0, policy_version 308634 (0.0041) [2024-06-19 07:43:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5056790528. Throughput: 0: 42493.8. Samples: 1324440460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:43:58,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 07:43:59,531][26599] Updated weights for policy 0, policy_version 308644 (0.0033) [2024-06-19 07:44:02,519][26599] Updated weights for policy 0, policy_version 308654 (0.0034) [2024-06-19 07:44:03,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.3, 300 sec: 42487.8). Total num frames: 5057003520. Throughput: 0: 42444.4. Samples: 1324564900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:03,381][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 07:44:07,346][26599] Updated weights for policy 0, policy_version 308664 (0.0035) [2024-06-19 07:44:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5057200128. Throughput: 0: 42262.8. Samples: 1324822580. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:08,380][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 07:44:10,463][26599] Updated weights for policy 0, policy_version 308674 (0.0049) [2024-06-19 07:44:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5057429504. Throughput: 0: 42232.3. Samples: 1325077520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:13,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 07:44:15,065][26599] Updated weights for policy 0, policy_version 308684 (0.0035) [2024-06-19 07:44:18,194][26599] Updated weights for policy 0, policy_version 308694 (0.0051) [2024-06-19 07:44:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42487.8). Total num frames: 5057642496. Throughput: 0: 42282.1. Samples: 1325203480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:18,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 07:44:22,610][26599] Updated weights for policy 0, policy_version 308704 (0.0033) [2024-06-19 07:44:23,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 5057839104. Throughput: 0: 42324.3. Samples: 1325464480. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:23,380][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 07:44:25,915][26599] Updated weights for policy 0, policy_version 308714 (0.0039) [2024-06-19 07:44:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5058052096. Throughput: 0: 42185.7. Samples: 1325707880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:28,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 07:44:30,526][26599] Updated weights for policy 0, policy_version 308724 (0.0031) [2024-06-19 07:44:33,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42054.8, 300 sec: 42487.7). Total num frames: 5058281472. Throughput: 0: 42399.0. Samples: 1325840360. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:33,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 07:44:33,540][26599] Updated weights for policy 0, policy_version 308734 (0.0034) [2024-06-19 07:44:38,219][26599] Updated weights for policy 0, policy_version 308744 (0.0042) [2024-06-19 07:44:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 5058478080. Throughput: 0: 42248.8. Samples: 1326087920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:38,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 07:44:41,266][26599] Updated weights for policy 0, policy_version 308754 (0.0028) [2024-06-19 07:44:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5058691072. Throughput: 0: 42407.4. Samples: 1326348800. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:43,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 07:44:45,855][26599] Updated weights for policy 0, policy_version 308764 (0.0047) [2024-06-19 07:44:48,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 5058920448. Throughput: 0: 42491.0. Samples: 1326477000. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:48,382][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 07:44:48,521][26579] Signal inference workers to stop experience collection... (19600 times) [2024-06-19 07:44:48,575][26599] InferenceWorker_p0-w0: stopping experience collection (19600 times) [2024-06-19 07:44:48,636][26579] Signal inference workers to resume experience collection... (19600 times) [2024-06-19 07:44:48,636][26599] InferenceWorker_p0-w0: resuming experience collection (19600 times) [2024-06-19 07:44:48,778][26599] Updated weights for policy 0, policy_version 308774 (0.0050) [2024-06-19 07:44:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5059100672. 
Throughput: 0: 42392.3. Samples: 1326730240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:53,384][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 07:44:53,645][26599] Updated weights for policy 0, policy_version 308784 (0.0046) [2024-06-19 07:44:56,431][26599] Updated weights for policy 0, policy_version 308794 (0.0036) [2024-06-19 07:44:58,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5059330048. Throughput: 0: 42336.4. Samples: 1326982660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 22.0) [2024-06-19 07:44:58,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 07:45:01,546][26599] Updated weights for policy 0, policy_version 308804 (0.0046) [2024-06-19 07:45:03,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5059543040. Throughput: 0: 42507.5. Samples: 1327116320. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:03,381][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 07:45:04,378][26599] Updated weights for policy 0, policy_version 308814 (0.0033) [2024-06-19 07:45:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5059739648. Throughput: 0: 42211.0. Samples: 1327363980. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:08,383][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 07:45:09,213][26599] Updated weights for policy 0, policy_version 308824 (0.0030) [2024-06-19 07:45:12,035][26599] Updated weights for policy 0, policy_version 308834 (0.0034) [2024-06-19 07:45:13,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5059969024. Throughput: 0: 42285.0. Samples: 1327610700. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:13,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 07:45:17,155][26599] Updated weights for policy 0, policy_version 308844 (0.0031) [2024-06-19 07:45:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5060182016. Throughput: 0: 42311.6. Samples: 1327744380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:18,388][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 07:45:19,941][26599] Updated weights for policy 0, policy_version 308854 (0.0054) [2024-06-19 07:45:23,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5060362240. Throughput: 0: 42465.8. Samples: 1327998880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:23,381][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 07:45:24,883][26599] Updated weights for policy 0, policy_version 308864 (0.0033) [2024-06-19 07:45:27,668][26599] Updated weights for policy 0, policy_version 308874 (0.0031) [2024-06-19 07:45:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5060608000. Throughput: 0: 42175.1. Samples: 1328246680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:28,381][26367] Avg episode reward: [(0, '0.822')] [2024-06-19 07:45:32,514][26599] Updated weights for policy 0, policy_version 308884 (0.0029) [2024-06-19 07:45:33,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5060804608. Throughput: 0: 42440.1. Samples: 1328386800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:33,383][26367] Avg episode reward: [(0, '0.756')] [2024-06-19 07:45:35,262][26599] Updated weights for policy 0, policy_version 308894 (0.0041) [2024-06-19 07:45:38,384][26367] Fps is (10 sec: 39307.5, 60 sec: 42049.7, 300 sec: 42431.2). Total num frames: 5061001216. Throughput: 0: 42315.7. Samples: 1328634600. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:38,385][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 07:45:38,399][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000308899_5061001216.pth... [2024-06-19 07:45:38,465][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000308280_5050859520.pth [2024-06-19 07:45:40,099][26599] Updated weights for policy 0, policy_version 308904 (0.0044) [2024-06-19 07:45:42,907][26599] Updated weights for policy 0, policy_version 308914 (0.0027) [2024-06-19 07:45:43,384][26367] Fps is (10 sec: 45858.8, 60 sec: 42868.9, 300 sec: 42542.9). Total num frames: 5061263360. Throughput: 0: 42272.6. Samples: 1328885080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:43,385][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 07:45:47,775][26599] Updated weights for policy 0, policy_version 308924 (0.0041) [2024-06-19 07:45:48,380][26367] Fps is (10 sec: 42613.4, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 5061427200. Throughput: 0: 42398.6. Samples: 1329024260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:48,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 07:45:50,442][26599] Updated weights for policy 0, policy_version 308934 (0.0025) [2024-06-19 07:45:53,380][26367] Fps is (10 sec: 39335.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5061656576. Throughput: 0: 42416.0. Samples: 1329272700. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:53,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 07:45:55,402][26599] Updated weights for policy 0, policy_version 308944 (0.0030) [2024-06-19 07:45:58,135][26599] Updated weights for policy 0, policy_version 308954 (0.0037) [2024-06-19 07:45:58,380][26367] Fps is (10 sec: 49152.5, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5061918720. Throughput: 0: 42562.2. Samples: 1329526000. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:45:58,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 07:46:02,852][26579] Signal inference workers to stop experience collection... (19650 times) [2024-06-19 07:46:02,901][26599] InferenceWorker_p0-w0: stopping experience collection (19650 times) [2024-06-19 07:46:02,908][26579] Signal inference workers to resume experience collection... (19650 times) [2024-06-19 07:46:02,916][26599] InferenceWorker_p0-w0: resuming experience collection (19650 times) [2024-06-19 07:46:03,046][26599] Updated weights for policy 0, policy_version 308964 (0.0028) [2024-06-19 07:46:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5062082560. Throughput: 0: 42640.5. Samples: 1329663200. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:46:03,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 07:46:05,748][26599] Updated weights for policy 0, policy_version 308974 (0.0037) [2024-06-19 07:46:08,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5062311936. Throughput: 0: 42542.7. Samples: 1329913300. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:46:08,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 07:46:10,809][26599] Updated weights for policy 0, policy_version 308984 (0.0027) [2024-06-19 07:46:13,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 5062524928. Throughput: 0: 42661.1. Samples: 1330166580. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 07:46:13,385][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 07:46:13,641][26599] Updated weights for policy 0, policy_version 308994 (0.0045) [2024-06-19 07:46:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 5062705152. Throughput: 0: 42352.2. Samples: 1330292640. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:46:18,380][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 07:46:18,497][26599] Updated weights for policy 0, policy_version 309004 (0.0044) [2024-06-19 07:46:21,442][26599] Updated weights for policy 0, policy_version 309014 (0.0030) [2024-06-19 07:46:23,380][26367] Fps is (10 sec: 42614.0, 60 sec: 43144.6, 300 sec: 42431.8). Total num frames: 5062950912. Throughput: 0: 42494.6. Samples: 1330546700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:46:23,380][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 07:46:26,137][26599] Updated weights for policy 0, policy_version 309024 (0.0033) [2024-06-19 07:46:28,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5063163904. Throughput: 0: 42689.3. Samples: 1330805940. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:46:28,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 07:46:29,097][26599] Updated weights for policy 0, policy_version 309034 (0.0036) [2024-06-19 07:46:33,380][26367] Fps is (10 sec: 37682.9, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 5063327744. Throughput: 0: 42339.7. Samples: 1330929540. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:46:33,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 07:46:33,904][26599] Updated weights for policy 0, policy_version 309044 (0.0046) [2024-06-19 07:46:36,703][26599] Updated weights for policy 0, policy_version 309054 (0.0037) [2024-06-19 07:46:38,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43147.1, 300 sec: 42487.3). Total num frames: 5063589888. Throughput: 0: 42499.1. Samples: 1331185160. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:46:38,384][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 07:46:41,523][26599] Updated weights for policy 0, policy_version 309064 (0.0042) [2024-06-19 07:46:43,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42054.8, 300 sec: 42487.3). Total num frames: 5063786496. Throughput: 0: 42652.9. Samples: 1331445380. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:46:43,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 07:46:44,382][26599] Updated weights for policy 0, policy_version 309074 (0.0034) [2024-06-19 07:46:48,384][26367] Fps is (10 sec: 39307.6, 60 sec: 42595.9, 300 sec: 42375.7). Total num frames: 5063983104. Throughput: 0: 42403.7. Samples: 1331571520. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:46:48,384][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 07:46:49,336][26599] Updated weights for policy 0, policy_version 309084 (0.0037) [2024-06-19 07:46:52,043][26599] Updated weights for policy 0, policy_version 309094 (0.0035) [2024-06-19 07:46:53,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5064228864. Throughput: 0: 42525.7. Samples: 1331826960. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:46:53,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 07:46:57,091][26599] Updated weights for policy 0, policy_version 309104 (0.0037) [2024-06-19 07:46:58,380][26367] Fps is (10 sec: 44253.1, 60 sec: 41779.3, 300 sec: 42487.9). Total num frames: 5064425472. Throughput: 0: 42499.0. Samples: 1332078880. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:46:58,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 07:47:00,590][26599] Updated weights for policy 0, policy_version 309114 (0.0040) [2024-06-19 07:47:03,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 5064622080. Throughput: 0: 42546.9. Samples: 1332207260. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:47:03,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 07:47:04,704][26599] Updated weights for policy 0, policy_version 309124 (0.0046) [2024-06-19 07:47:08,137][26599] Updated weights for policy 0, policy_version 309134 (0.0030) [2024-06-19 07:47:08,380][26367] Fps is (10 sec: 42597.3, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5064851456. Throughput: 0: 42557.1. Samples: 1332461780. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:47:08,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 07:47:12,651][26599] Updated weights for policy 0, policy_version 309144 (0.0027) [2024-06-19 07:47:13,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42327.9, 300 sec: 42487.3). Total num frames: 5065064448. Throughput: 0: 42478.2. Samples: 1332717460. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:47:13,380][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 07:47:15,765][26599] Updated weights for policy 0, policy_version 309154 (0.0033) [2024-06-19 07:47:18,384][26367] Fps is (10 sec: 42583.7, 60 sec: 42868.8, 300 sec: 42431.3). Total num frames: 5065277440. Throughput: 0: 42559.3. Samples: 1332844860. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:47:18,385][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 07:47:20,297][26599] Updated weights for policy 0, policy_version 309164 (0.0034) [2024-06-19 07:47:20,329][26579] Signal inference workers to stop experience collection... (19700 times) [2024-06-19 07:47:20,330][26579] Signal inference workers to resume experience collection... (19700 times) [2024-06-19 07:47:20,353][26599] InferenceWorker_p0-w0: stopping experience collection (19700 times) [2024-06-19 07:47:20,353][26599] InferenceWorker_p0-w0: resuming experience collection (19700 times) [2024-06-19 07:47:23,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5065474048. Throughput: 0: 42464.1. 
Samples: 1333096040. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:47:23,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 07:47:23,777][26599] Updated weights for policy 0, policy_version 309174 (0.0044) [2024-06-19 07:47:27,985][26599] Updated weights for policy 0, policy_version 309184 (0.0043) [2024-06-19 07:47:28,380][26367] Fps is (10 sec: 40975.3, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 5065687040. Throughput: 0: 42462.3. Samples: 1333356180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 07:47:28,380][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 07:47:31,406][26599] Updated weights for policy 0, policy_version 309194 (0.0038) [2024-06-19 07:47:33,380][26367] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 5065916416. Throughput: 0: 42449.7. Samples: 1333481600. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:47:33,380][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 07:47:35,688][26599] Updated weights for policy 0, policy_version 309204 (0.0037) [2024-06-19 07:47:38,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5066129408. Throughput: 0: 42327.1. Samples: 1333731680. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:47:38,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 07:47:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000309212_5066129408.pth... [2024-06-19 07:47:38,461][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000308590_5055938560.pth [2024-06-19 07:47:38,962][26599] Updated weights for policy 0, policy_version 309214 (0.0043) [2024-06-19 07:47:43,232][26599] Updated weights for policy 0, policy_version 309224 (0.0033) [2024-06-19 07:47:43,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42376.3). Total num frames: 5066326016. Throughput: 0: 42649.8. Samples: 1333998120. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:47:43,380][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 07:47:46,575][26599] Updated weights for policy 0, policy_version 309234 (0.0027) [2024-06-19 07:47:48,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42874.1, 300 sec: 42487.3). Total num frames: 5066555392. Throughput: 0: 42526.8. Samples: 1334120960. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:47:48,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 07:47:50,973][26599] Updated weights for policy 0, policy_version 309244 (0.0045) [2024-06-19 07:47:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5066768384. Throughput: 0: 42474.5. Samples: 1334373120. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:47:53,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 07:47:54,160][26599] Updated weights for policy 0, policy_version 309254 (0.0039) [2024-06-19 07:47:58,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 5066964992. Throughput: 0: 42731.8. Samples: 1334640400. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:47:58,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 07:47:58,421][26599] Updated weights for policy 0, policy_version 309264 (0.0042) [2024-06-19 07:48:01,843][26599] Updated weights for policy 0, policy_version 309274 (0.0029) [2024-06-19 07:48:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 5067177984. Throughput: 0: 42635.5. Samples: 1334763300. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:48:03,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 07:48:05,978][26599] Updated weights for policy 0, policy_version 309284 (0.0048) [2024-06-19 07:48:08,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5067423744. Throughput: 0: 42619.8. Samples: 1335013940. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:48:08,381][26367] Avg episode reward: [(0, '0.797')] [2024-06-19 07:48:09,885][26599] Updated weights for policy 0, policy_version 309294 (0.0036) [2024-06-19 07:48:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42376.3). Total num frames: 5067603968. Throughput: 0: 42574.6. Samples: 1335272040. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:48:13,380][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 07:48:13,556][26599] Updated weights for policy 0, policy_version 309304 (0.0042) [2024-06-19 07:48:17,641][26599] Updated weights for policy 0, policy_version 309314 (0.0049) [2024-06-19 07:48:18,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42327.8, 300 sec: 42376.2). Total num frames: 5067816960. Throughput: 0: 42458.5. Samples: 1335392240. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:48:18,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 07:48:21,303][26599] Updated weights for policy 0, policy_version 309324 (0.0033) [2024-06-19 07:48:23,380][26367] Fps is (10 sec: 45875.3, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5068062720. Throughput: 0: 42726.4. Samples: 1335654360. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:48:23,380][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 07:48:25,189][26599] Updated weights for policy 0, policy_version 309334 (0.0039) [2024-06-19 07:48:28,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42376.8). Total num frames: 5068259328. Throughput: 0: 42423.5. Samples: 1335907180. 
Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:48:28,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 07:48:29,190][26599] Updated weights for policy 0, policy_version 309344 (0.0038) [2024-06-19 07:48:33,121][26599] Updated weights for policy 0, policy_version 309354 (0.0030) [2024-06-19 07:48:33,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5068455936. Throughput: 0: 42437.4. Samples: 1336030640. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:48:33,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 07:48:35,308][26579] Signal inference workers to stop experience collection... (19750 times) [2024-06-19 07:48:35,312][26579] Signal inference workers to resume experience collection... (19750 times) [2024-06-19 07:48:35,322][26599] InferenceWorker_p0-w0: stopping experience collection (19750 times) [2024-06-19 07:48:35,360][26599] InferenceWorker_p0-w0: resuming experience collection (19750 times) [2024-06-19 07:48:37,014][26599] Updated weights for policy 0, policy_version 309364 (0.0043) [2024-06-19 07:48:38,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5068701696. Throughput: 0: 42680.9. Samples: 1336293760. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:48:38,380][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 07:48:40,756][26599] Updated weights for policy 0, policy_version 309374 (0.0054) [2024-06-19 07:48:43,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42542.9). Total num frames: 5068898304. Throughput: 0: 42323.6. Samples: 1336544960. Policy #0 lag: (min: 2.0, avg: 10.8, max: 21.0) [2024-06-19 07:48:43,381][26367] Avg episode reward: [(0, '0.406')] [2024-06-19 07:48:44,666][26599] Updated weights for policy 0, policy_version 309384 (0.0028) [2024-06-19 07:48:48,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5069094912. 
Throughput: 0: 42395.1. Samples: 1336671080. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:48:48,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 07:48:48,583][26599] Updated weights for policy 0, policy_version 309394 (0.0026) [2024-06-19 07:48:52,271][26599] Updated weights for policy 0, policy_version 309404 (0.0028) [2024-06-19 07:48:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5069324288. Throughput: 0: 42635.7. Samples: 1336932540. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:48:53,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 07:48:56,100][26599] Updated weights for policy 0, policy_version 309414 (0.0037) [2024-06-19 07:48:58,384][26367] Fps is (10 sec: 42582.6, 60 sec: 42595.9, 300 sec: 42431.3). Total num frames: 5069520896. Throughput: 0: 42620.0. Samples: 1337190100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:48:58,385][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 07:48:59,953][26599] Updated weights for policy 0, policy_version 309424 (0.0038) [2024-06-19 07:49:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5069733888. Throughput: 0: 42758.2. Samples: 1337316360. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:03,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 07:49:03,728][26599] Updated weights for policy 0, policy_version 309434 (0.0034) [2024-06-19 07:49:07,486][26599] Updated weights for policy 0, policy_version 309444 (0.0023) [2024-06-19 07:49:08,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 5069946880. Throughput: 0: 42651.9. Samples: 1337573700. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:08,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 07:49:11,501][26599] Updated weights for policy 0, policy_version 309454 (0.0036) [2024-06-19 07:49:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 5070159872. Throughput: 0: 42656.8. Samples: 1337826740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:13,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 07:49:15,318][26599] Updated weights for policy 0, policy_version 309464 (0.0041) [2024-06-19 07:49:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5070372864. Throughput: 0: 42671.9. Samples: 1337950880. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:18,381][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 07:49:19,175][26599] Updated weights for policy 0, policy_version 309474 (0.0032) [2024-06-19 07:49:23,137][26599] Updated weights for policy 0, policy_version 309484 (0.0039) [2024-06-19 07:49:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5070585856. Throughput: 0: 42520.8. Samples: 1338207200. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:23,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 07:49:26,873][26599] Updated weights for policy 0, policy_version 309494 (0.0036) [2024-06-19 07:49:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5070815232. Throughput: 0: 42330.2. Samples: 1338449820. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:28,381][26367] Avg episode reward: [(0, '0.773')] [2024-06-19 07:49:31,341][26599] Updated weights for policy 0, policy_version 309504 (0.0039) [2024-06-19 07:49:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 5071011840. Throughput: 0: 42560.7. Samples: 1338586320. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:33,381][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 07:49:34,617][26599] Updated weights for policy 0, policy_version 309514 (0.0034) [2024-06-19 07:49:38,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 5071208448. Throughput: 0: 42353.5. Samples: 1338838440. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:38,380][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 07:49:38,412][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000309522_5071208448.pth... [2024-06-19 07:49:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000308899_5061001216.pth [2024-06-19 07:49:39,005][26599] Updated weights for policy 0, policy_version 309524 (0.0037) [2024-06-19 07:49:42,526][26599] Updated weights for policy 0, policy_version 309534 (0.0028) [2024-06-19 07:49:43,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5071454208. Throughput: 0: 42207.9. Samples: 1339089300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:43,381][26367] Avg episode reward: [(0, '0.832')] [2024-06-19 07:49:46,568][26599] Updated weights for policy 0, policy_version 309544 (0.0027) [2024-06-19 07:49:48,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5071667200. Throughput: 0: 42361.4. Samples: 1339222620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:48,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 07:49:50,138][26599] Updated weights for policy 0, policy_version 309554 (0.0048) [2024-06-19 07:49:53,380][26367] Fps is (10 sec: 37682.5, 60 sec: 41779.1, 300 sec: 42376.2). Total num frames: 5071831040. Throughput: 0: 42116.7. Samples: 1339468960. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:53,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 07:49:54,190][26599] Updated weights for policy 0, policy_version 309564 (0.0034) [2024-06-19 07:49:55,448][26579] Signal inference workers to stop experience collection... (19800 times) [2024-06-19 07:49:55,448][26579] Signal inference workers to resume experience collection... (19800 times) [2024-06-19 07:49:55,482][26599] InferenceWorker_p0-w0: stopping experience collection (19800 times) [2024-06-19 07:49:55,482][26599] InferenceWorker_p0-w0: resuming experience collection (19800 times) [2024-06-19 07:49:57,650][26599] Updated weights for policy 0, policy_version 309574 (0.0036) [2024-06-19 07:49:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42874.2, 300 sec: 42542.9). Total num frames: 5072093184. Throughput: 0: 42174.8. Samples: 1339724600. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 07:49:58,380][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 07:50:02,140][26599] Updated weights for policy 0, policy_version 309584 (0.0035) [2024-06-19 07:50:03,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5072273408. Throughput: 0: 42403.2. Samples: 1339859020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:03,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 07:50:05,421][26599] Updated weights for policy 0, policy_version 309594 (0.0022) [2024-06-19 07:50:08,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5072486400. Throughput: 0: 42192.0. Samples: 1340105840. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:08,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 07:50:09,854][26599] Updated weights for policy 0, policy_version 309604 (0.0037) [2024-06-19 07:50:12,964][26599] Updated weights for policy 0, policy_version 309614 (0.0044) [2024-06-19 07:50:13,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5072732160. Throughput: 0: 42510.8. Samples: 1340362800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:13,380][26367] Avg episode reward: [(0, '0.425')] [2024-06-19 07:50:17,582][26599] Updated weights for policy 0, policy_version 309624 (0.0033) [2024-06-19 07:50:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5072928768. Throughput: 0: 42408.9. Samples: 1340494720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:18,381][26367] Avg episode reward: [(0, '0.349')] [2024-06-19 07:50:20,535][26599] Updated weights for policy 0, policy_version 309634 (0.0041) [2024-06-19 07:50:23,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5073125376. Throughput: 0: 42311.5. Samples: 1340742460. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:23,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 07:50:25,131][26599] Updated weights for policy 0, policy_version 309644 (0.0024) [2024-06-19 07:50:28,289][26599] Updated weights for policy 0, policy_version 309654 (0.0041) [2024-06-19 07:50:28,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5073371136. Throughput: 0: 42491.6. Samples: 1341001420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:28,381][26367] Avg episode reward: [(0, '0.811')] [2024-06-19 07:50:32,719][26599] Updated weights for policy 0, policy_version 309664 (0.0031) [2024-06-19 07:50:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42543.4). Total num frames: 5073551360. Throughput: 0: 42540.5. Samples: 1341136940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:33,381][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 07:50:35,629][26599] Updated weights for policy 0, policy_version 309674 (0.0036) [2024-06-19 07:50:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42432.3). Total num frames: 5073780736. Throughput: 0: 42585.2. Samples: 1341385280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:38,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 07:50:40,268][26599] Updated weights for policy 0, policy_version 309684 (0.0046) [2024-06-19 07:50:43,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5074010112. Throughput: 0: 42580.8. Samples: 1341640740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:43,381][26367] Avg episode reward: [(0, '0.785')] [2024-06-19 07:50:43,844][26599] Updated weights for policy 0, policy_version 309694 (0.0045) [2024-06-19 07:50:47,936][26599] Updated weights for policy 0, policy_version 309704 (0.0040) [2024-06-19 07:50:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5074206720. Throughput: 0: 42439.5. Samples: 1341768800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:48,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 07:50:51,387][26599] Updated weights for policy 0, policy_version 309714 (0.0040) [2024-06-19 07:50:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 43417.8, 300 sec: 42431.8). Total num frames: 5074436096. Throughput: 0: 42561.0. Samples: 1342021080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:53,380][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 07:50:55,495][26599] Updated weights for policy 0, policy_version 309724 (0.0032) [2024-06-19 07:50:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 5074632704. Throughput: 0: 42806.6. Samples: 1342289100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:50:58,381][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 07:50:59,026][26599] Updated weights for policy 0, policy_version 309734 (0.0039) [2024-06-19 07:51:02,507][26579] Signal inference workers to stop experience collection... (19850 times) [2024-06-19 07:51:02,507][26579] Signal inference workers to resume experience collection... (19850 times) [2024-06-19 07:51:02,519][26599] InferenceWorker_p0-w0: stopping experience collection (19850 times) [2024-06-19 07:51:02,540][26599] InferenceWorker_p0-w0: resuming experience collection (19850 times) [2024-06-19 07:51:03,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 5074829312. Throughput: 0: 42486.4. Samples: 1342406600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:51:03,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 07:51:03,601][26599] Updated weights for policy 0, policy_version 309744 (0.0034) [2024-06-19 07:51:06,611][26599] Updated weights for policy 0, policy_version 309754 (0.0027) [2024-06-19 07:51:08,380][26367] Fps is (10 sec: 45875.0, 60 sec: 43417.6, 300 sec: 42598.9). Total num frames: 5075091456. Throughput: 0: 42645.2. Samples: 1342661500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:51:08,381][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 07:51:11,143][26599] Updated weights for policy 0, policy_version 309764 (0.0030) [2024-06-19 07:51:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5075255296. 
Throughput: 0: 42894.3. Samples: 1342931660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 07:51:13,380][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 07:51:14,323][26599] Updated weights for policy 0, policy_version 309774 (0.0033) [2024-06-19 07:51:18,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5075468288. Throughput: 0: 42379.0. Samples: 1343044000. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:51:18,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 07:51:18,744][26599] Updated weights for policy 0, policy_version 309784 (0.0047) [2024-06-19 07:51:21,996][26599] Updated weights for policy 0, policy_version 309794 (0.0027) [2024-06-19 07:51:23,380][26367] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 5075714048. Throughput: 0: 42631.6. Samples: 1343303700. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:51:23,380][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 07:51:26,230][26599] Updated weights for policy 0, policy_version 309804 (0.0039) [2024-06-19 07:51:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 5075877888. Throughput: 0: 42955.1. Samples: 1343573720. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:51:28,381][26367] Avg episode reward: [(0, '0.860')] [2024-06-19 07:51:29,777][26599] Updated weights for policy 0, policy_version 309814 (0.0032) [2024-06-19 07:51:33,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 5076107264. Throughput: 0: 42570.7. Samples: 1343684480. 
Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:51:33,384][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 07:51:34,490][26599] Updated weights for policy 0, policy_version 309824 (0.0042) [2024-06-19 07:51:37,481][26599] Updated weights for policy 0, policy_version 309834 (0.0043) [2024-06-19 07:51:38,384][26367] Fps is (10 sec: 49134.0, 60 sec: 43141.8, 300 sec: 42653.4). Total num frames: 5076369408. Throughput: 0: 42730.7. Samples: 1343944120. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:51:38,384][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 07:51:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000309837_5076369408.pth... [2024-06-19 07:51:38,463][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000309212_5066129408.pth [2024-06-19 07:51:42,157][26599] Updated weights for policy 0, policy_version 309844 (0.0044) [2024-06-19 07:51:43,381][26367] Fps is (10 sec: 40958.6, 60 sec: 41779.0, 300 sec: 42487.8). Total num frames: 5076516864. Throughput: 0: 42472.7. Samples: 1344200380. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:51:43,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 07:51:45,226][26599] Updated weights for policy 0, policy_version 309854 (0.0028) [2024-06-19 07:51:48,380][26367] Fps is (10 sec: 37697.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5076746240. Throughput: 0: 42418.7. Samples: 1344315440. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:51:48,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 07:51:49,720][26599] Updated weights for policy 0, policy_version 309864 (0.0042) [2024-06-19 07:51:53,139][26599] Updated weights for policy 0, policy_version 309874 (0.0044) [2024-06-19 07:51:53,380][26367] Fps is (10 sec: 47514.9, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5076992000. Throughput: 0: 42585.8. Samples: 1344577860. 
Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:51:53,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 07:51:57,186][26599] Updated weights for policy 0, policy_version 309884 (0.0035) [2024-06-19 07:51:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.4, 300 sec: 42487.4). Total num frames: 5077155840. Throughput: 0: 42488.0. Samples: 1344843620. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:51:58,380][26367] Avg episode reward: [(0, '0.352')] [2024-06-19 07:52:00,594][26599] Updated weights for policy 0, policy_version 309894 (0.0042) [2024-06-19 07:52:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 5077385216. Throughput: 0: 42614.3. Samples: 1344961640. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:52:03,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 07:52:05,046][26599] Updated weights for policy 0, policy_version 309904 (0.0029) [2024-06-19 07:52:07,028][26579] Signal inference workers to stop experience collection... (19900 times) [2024-06-19 07:52:07,030][26579] Signal inference workers to resume experience collection... (19900 times) [2024-06-19 07:52:07,056][26599] InferenceWorker_p0-w0: stopping experience collection (19900 times) [2024-06-19 07:52:07,084][26599] InferenceWorker_p0-w0: resuming experience collection (19900 times) [2024-06-19 07:52:08,137][26599] Updated weights for policy 0, policy_version 309914 (0.0042) [2024-06-19 07:52:08,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5077630976. Throughput: 0: 42570.1. Samples: 1345219360. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:52:08,381][26367] Avg episode reward: [(0, '0.812')] [2024-06-19 07:52:12,626][26599] Updated weights for policy 0, policy_version 309924 (0.0044) [2024-06-19 07:52:13,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42432.3). Total num frames: 5077794816. 
Throughput: 0: 42265.4. Samples: 1345475660. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:52:13,380][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 07:52:16,217][26599] Updated weights for policy 0, policy_version 309934 (0.0037) [2024-06-19 07:52:18,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5078024192. Throughput: 0: 42603.1. Samples: 1345601620. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:52:18,380][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 07:52:20,225][26599] Updated weights for policy 0, policy_version 309944 (0.0029) [2024-06-19 07:52:23,384][26367] Fps is (10 sec: 47495.8, 60 sec: 42595.7, 300 sec: 42653.4). Total num frames: 5078269952. Throughput: 0: 42528.0. Samples: 1345857880. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:52:23,384][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 07:52:23,727][26599] Updated weights for policy 0, policy_version 309954 (0.0028) [2024-06-19 07:52:27,711][26599] Updated weights for policy 0, policy_version 309964 (0.0041) [2024-06-19 07:52:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5078450176. Throughput: 0: 42537.2. Samples: 1346114540. Policy #0 lag: (min: 3.0, avg: 11.7, max: 23.0) [2024-06-19 07:52:28,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 07:52:31,239][26599] Updated weights for policy 0, policy_version 309974 (0.0031) [2024-06-19 07:52:33,380][26367] Fps is (10 sec: 39336.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5078663168. Throughput: 0: 42766.7. Samples: 1346239940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:52:33,380][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 07:52:35,674][26599] Updated weights for policy 0, policy_version 309984 (0.0028) [2024-06-19 07:52:38,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42327.8, 300 sec: 42653.9). Total num frames: 5078908928. 
Throughput: 0: 42558.1. Samples: 1346492980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:52:38,381][26367] Avg episode reward: [(0, '0.804')] [2024-06-19 07:52:38,847][26599] Updated weights for policy 0, policy_version 309994 (0.0027) [2024-06-19 07:52:43,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.7, 300 sec: 42487.3). Total num frames: 5079089152. Throughput: 0: 42399.9. Samples: 1346751620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:52:43,381][26367] Avg episode reward: [(0, '0.801')] [2024-06-19 07:52:43,533][26599] Updated weights for policy 0, policy_version 310004 (0.0041) [2024-06-19 07:52:46,445][26599] Updated weights for policy 0, policy_version 310014 (0.0037) [2024-06-19 07:52:48,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5079302144. Throughput: 0: 42483.1. Samples: 1346873380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:52:48,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 07:52:51,174][26599] Updated weights for policy 0, policy_version 310024 (0.0046) [2024-06-19 07:52:53,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5079547904. Throughput: 0: 42594.3. Samples: 1347136100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:52:53,380][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 07:52:54,063][26599] Updated weights for policy 0, policy_version 310034 (0.0029) [2024-06-19 07:52:58,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5079711744. Throughput: 0: 42629.3. Samples: 1347393980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:52:58,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 07:52:59,090][26599] Updated weights for policy 0, policy_version 310044 (0.0029) [2024-06-19 07:53:01,790][26599] Updated weights for policy 0, policy_version 310054 (0.0056) [2024-06-19 07:53:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42487.4). Total num frames: 5079957504. Throughput: 0: 42490.2. Samples: 1347513680. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:53:03,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 07:53:06,683][26599] Updated weights for policy 0, policy_version 310064 (0.0037) [2024-06-19 07:53:08,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5080170496. Throughput: 0: 42643.0. Samples: 1347776660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:53:08,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 07:53:09,459][26599] Updated weights for policy 0, policy_version 310074 (0.0037) [2024-06-19 07:53:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5080350720. Throughput: 0: 42719.9. Samples: 1348036940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:53:13,384][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 07:53:14,450][26599] Updated weights for policy 0, policy_version 310084 (0.0034) [2024-06-19 07:53:17,080][26599] Updated weights for policy 0, policy_version 310094 (0.0037) [2024-06-19 07:53:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43144.4, 300 sec: 42542.8). Total num frames: 5080612864. Throughput: 0: 42534.0. Samples: 1348153980. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:53:18,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 07:53:22,138][26599] Updated weights for policy 0, policy_version 310104 (0.0046) [2024-06-19 07:53:23,380][26367] Fps is (10 sec: 47514.1, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5080825856. Throughput: 0: 42733.1. Samples: 1348415960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:53:23,380][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 07:53:24,970][26599] Updated weights for policy 0, policy_version 310114 (0.0047) [2024-06-19 07:53:28,380][26367] Fps is (10 sec: 36044.8, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 5080973312. Throughput: 0: 42743.5. Samples: 1348675080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:53:28,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 07:53:29,725][26599] Updated weights for policy 0, policy_version 310124 (0.0032) [2024-06-19 07:53:32,731][26599] Updated weights for policy 0, policy_version 310134 (0.0037) [2024-06-19 07:53:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 5081235456. Throughput: 0: 42664.1. Samples: 1348793260. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:53:33,380][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 07:53:37,371][26599] Updated weights for policy 0, policy_version 310144 (0.0039) [2024-06-19 07:53:38,384][26367] Fps is (10 sec: 45858.9, 60 sec: 42049.8, 300 sec: 42486.8). Total num frames: 5081432064. Throughput: 0: 42537.0. Samples: 1349050420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 07:53:38,384][26367] Avg episode reward: [(0, '0.390')] [2024-06-19 07:53:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000310146_5081432064.pth... 
[2024-06-19 07:53:38,468][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000309522_5071208448.pth [2024-06-19 07:53:38,721][26579] Signal inference workers to stop experience collection... (19950 times) [2024-06-19 07:53:38,771][26599] InferenceWorker_p0-w0: stopping experience collection (19950 times) [2024-06-19 07:53:38,778][26579] Signal inference workers to resume experience collection... (19950 times) [2024-06-19 07:53:38,785][26599] InferenceWorker_p0-w0: resuming experience collection (19950 times) [2024-06-19 07:53:40,406][26599] Updated weights for policy 0, policy_version 310154 (0.0040) [2024-06-19 07:53:43,380][26367] Fps is (10 sec: 37682.5, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 5081612288. Throughput: 0: 42494.1. Samples: 1349306220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:53:43,381][26367] Avg episode reward: [(0, '0.389')] [2024-06-19 07:53:45,176][26599] Updated weights for policy 0, policy_version 310164 (0.0045) [2024-06-19 07:53:48,152][26599] Updated weights for policy 0, policy_version 310174 (0.0038) [2024-06-19 07:53:48,380][26367] Fps is (10 sec: 45891.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5081890816. Throughput: 0: 42521.7. Samples: 1349427160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:53:48,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 07:53:53,357][26599] Updated weights for policy 0, policy_version 310184 (0.0038) [2024-06-19 07:53:53,384][26367] Fps is (10 sec: 44221.2, 60 sec: 41776.6, 300 sec: 42487.3). Total num frames: 5082054656. Throughput: 0: 42451.7. Samples: 1349687140. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:53:53,384][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 07:53:56,039][26599] Updated weights for policy 0, policy_version 310194 (0.0034) [2024-06-19 07:53:58,380][26367] Fps is (10 sec: 36045.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5082251264. 
Throughput: 0: 42204.9. Samples: 1349936160. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:53:58,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 07:54:01,054][26599] Updated weights for policy 0, policy_version 310204 (0.0034) [2024-06-19 07:54:03,380][26367] Fps is (10 sec: 45891.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5082513408. Throughput: 0: 42519.2. Samples: 1350067340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:03,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 07:54:03,772][26599] Updated weights for policy 0, policy_version 310214 (0.0037) [2024-06-19 07:54:08,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5082693632. Throughput: 0: 42460.3. Samples: 1350326680. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:08,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 07:54:08,723][26599] Updated weights for policy 0, policy_version 310224 (0.0041) [2024-06-19 07:54:11,590][26599] Updated weights for policy 0, policy_version 310234 (0.0023) [2024-06-19 07:54:13,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5082906624. Throughput: 0: 42188.6. Samples: 1350573560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:13,380][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 07:54:16,335][26599] Updated weights for policy 0, policy_version 310244 (0.0038) [2024-06-19 07:54:18,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 5083136000. Throughput: 0: 42437.6. Samples: 1350702960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:18,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 07:54:19,467][26599] Updated weights for policy 0, policy_version 310254 (0.0045) [2024-06-19 07:54:23,380][26367] Fps is (10 sec: 42597.5, 60 sec: 41779.1, 300 sec: 42431.8). Total num frames: 5083332608. 
Throughput: 0: 42427.8. Samples: 1350959520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:23,381][26367] Avg episode reward: [(0, '0.406')] [2024-06-19 07:54:23,918][26599] Updated weights for policy 0, policy_version 310264 (0.0034) [2024-06-19 07:54:27,215][26599] Updated weights for policy 0, policy_version 310274 (0.0041) [2024-06-19 07:54:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 5083561984. Throughput: 0: 42292.6. Samples: 1351209380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:28,381][26367] Avg episode reward: [(0, '0.331')] [2024-06-19 07:54:31,860][26599] Updated weights for policy 0, policy_version 310284 (0.0037) [2024-06-19 07:54:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5083774976. Throughput: 0: 42574.2. Samples: 1351343000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:33,384][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 07:54:34,718][26599] Updated weights for policy 0, policy_version 310294 (0.0036) [2024-06-19 07:54:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42054.8, 300 sec: 42376.2). Total num frames: 5083955200. Throughput: 0: 42324.7. Samples: 1351591600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:38,381][26367] Avg episode reward: [(0, '0.808')] [2024-06-19 07:54:39,678][26599] Updated weights for policy 0, policy_version 310304 (0.0035) [2024-06-19 07:54:42,782][26599] Updated weights for policy 0, policy_version 310314 (0.0031) [2024-06-19 07:54:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 5084200960. Throughput: 0: 42319.1. Samples: 1351840520. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:43,390][26367] Avg episode reward: [(0, '0.839')] [2024-06-19 07:54:47,232][26599] Updated weights for policy 0, policy_version 310324 (0.0041) [2024-06-19 07:54:48,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 5084413952. Throughput: 0: 42461.4. Samples: 1351978100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:48,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 07:54:50,501][26599] Updated weights for policy 0, policy_version 310334 (0.0029) [2024-06-19 07:54:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42601.0, 300 sec: 42431.8). Total num frames: 5084610560. Throughput: 0: 42382.4. Samples: 1352233880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 07:54:53,380][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 07:54:54,927][26599] Updated weights for policy 0, policy_version 310344 (0.0035) [2024-06-19 07:54:58,289][26599] Updated weights for policy 0, policy_version 310354 (0.0036) [2024-06-19 07:54:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5084839936. Throughput: 0: 42477.2. Samples: 1352485040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:54:58,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 07:55:01,392][26579] Signal inference workers to stop experience collection... (20000 times) [2024-06-19 07:55:01,447][26599] InferenceWorker_p0-w0: stopping experience collection (20000 times) [2024-06-19 07:55:01,450][26579] Signal inference workers to resume experience collection... (20000 times) [2024-06-19 07:55:01,459][26599] InferenceWorker_p0-w0: resuming experience collection (20000 times) [2024-06-19 07:55:02,803][26599] Updated weights for policy 0, policy_version 310364 (0.0033) [2024-06-19 07:55:03,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5085052928. 
Throughput: 0: 42487.6. Samples: 1352614900. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:03,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 07:55:05,901][26599] Updated weights for policy 0, policy_version 310374 (0.0036) [2024-06-19 07:55:08,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42376.2). Total num frames: 5085233152. Throughput: 0: 42357.0. Samples: 1352865580. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:08,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 07:55:10,253][26599] Updated weights for policy 0, policy_version 310384 (0.0047) [2024-06-19 07:55:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5085462528. Throughput: 0: 42349.3. Samples: 1353115100. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:13,383][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 07:55:13,826][26599] Updated weights for policy 0, policy_version 310394 (0.0037) [2024-06-19 07:55:17,924][26599] Updated weights for policy 0, policy_version 310404 (0.0033) [2024-06-19 07:55:18,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5085659136. Throughput: 0: 42372.0. Samples: 1353249740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:18,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 07:55:21,410][26599] Updated weights for policy 0, policy_version 310414 (0.0035) [2024-06-19 07:55:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42376.3). Total num frames: 5085872128. Throughput: 0: 42508.6. Samples: 1353504480. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:23,380][26367] Avg episode reward: [(0, '0.775')] [2024-06-19 07:55:25,672][26599] Updated weights for policy 0, policy_version 310424 (0.0030) [2024-06-19 07:55:28,384][26367] Fps is (10 sec: 47496.6, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 5086134272. 
Throughput: 0: 42565.9. Samples: 1353756140. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:28,384][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 07:55:29,067][26599] Updated weights for policy 0, policy_version 310434 (0.0025) [2024-06-19 07:55:33,384][26367] Fps is (10 sec: 42582.3, 60 sec: 42049.7, 300 sec: 42431.2). Total num frames: 5086298112. Throughput: 0: 42568.1. Samples: 1353893820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:33,385][26367] Avg episode reward: [(0, '0.838')] [2024-06-19 07:55:33,593][26599] Updated weights for policy 0, policy_version 310444 (0.0033) [2024-06-19 07:55:36,634][26599] Updated weights for policy 0, policy_version 310454 (0.0031) [2024-06-19 07:55:38,380][26367] Fps is (10 sec: 37696.3, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 5086511104. Throughput: 0: 42429.6. Samples: 1354143220. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:38,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 07:55:38,530][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000310457_5086527488.pth... [2024-06-19 07:55:38,591][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000309837_5076369408.pth [2024-06-19 07:55:41,160][26599] Updated weights for policy 0, policy_version 310464 (0.0041) [2024-06-19 07:55:43,380][26367] Fps is (10 sec: 47531.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5086773248. Throughput: 0: 42373.4. Samples: 1354391840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:43,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 07:55:44,327][26599] Updated weights for policy 0, policy_version 310474 (0.0031) [2024-06-19 07:55:48,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42376.2). Total num frames: 5086937088. Throughput: 0: 42581.0. Samples: 1354531040. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:48,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 07:55:48,788][26599] Updated weights for policy 0, policy_version 310484 (0.0031) [2024-06-19 07:55:51,930][26599] Updated weights for policy 0, policy_version 310494 (0.0034) [2024-06-19 07:55:53,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5087150080. Throughput: 0: 42481.4. Samples: 1354777240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:53,380][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 07:55:56,165][26599] Updated weights for policy 0, policy_version 310504 (0.0032) [2024-06-19 07:55:58,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5087412224. Throughput: 0: 42738.7. Samples: 1355038340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:55:58,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 07:55:59,519][26599] Updated weights for policy 0, policy_version 310514 (0.0030) [2024-06-19 07:56:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.3, 300 sec: 42320.7). Total num frames: 5087576064. Throughput: 0: 42738.7. Samples: 1355172980. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:56:03,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 07:56:03,879][26599] Updated weights for policy 0, policy_version 310524 (0.0036) [2024-06-19 07:56:07,192][26599] Updated weights for policy 0, policy_version 310534 (0.0025) [2024-06-19 07:56:08,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5087805440. Throughput: 0: 42557.3. Samples: 1355419560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 07:56:08,380][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 07:56:09,400][26579] Signal inference workers to stop experience collection... 
(20050 times) [2024-06-19 07:56:09,400][26579] Signal inference workers to resume experience collection... (20050 times) [2024-06-19 07:56:09,419][26599] InferenceWorker_p0-w0: stopping experience collection (20050 times) [2024-06-19 07:56:09,419][26599] InferenceWorker_p0-w0: resuming experience collection (20050 times) [2024-06-19 07:56:11,694][26599] Updated weights for policy 0, policy_version 310544 (0.0041) [2024-06-19 07:56:13,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5088034816. Throughput: 0: 42650.7. Samples: 1355675260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:13,380][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 07:56:14,907][26599] Updated weights for policy 0, policy_version 310554 (0.0049) [2024-06-19 07:56:18,384][26367] Fps is (10 sec: 40944.5, 60 sec: 42595.8, 300 sec: 42375.7). Total num frames: 5088215040. Throughput: 0: 42625.8. Samples: 1355811980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:18,385][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 07:56:19,301][26599] Updated weights for policy 0, policy_version 310564 (0.0027) [2024-06-19 07:56:22,479][26599] Updated weights for policy 0, policy_version 310574 (0.0029) [2024-06-19 07:56:23,380][26367] Fps is (10 sec: 40958.8, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5088444416. Throughput: 0: 42518.2. Samples: 1356056540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:23,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 07:56:27,195][26599] Updated weights for policy 0, policy_version 310584 (0.0033) [2024-06-19 07:56:28,380][26367] Fps is (10 sec: 47530.4, 60 sec: 42600.9, 300 sec: 42653.9). Total num frames: 5088690176. Throughput: 0: 42815.0. Samples: 1356318520. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:28,381][26367] Avg episode reward: [(0, '0.783')] [2024-06-19 07:56:30,651][26599] Updated weights for policy 0, policy_version 310594 (0.0031) [2024-06-19 07:56:33,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42874.0, 300 sec: 42376.8). Total num frames: 5088870400. Throughput: 0: 42652.4. Samples: 1356450400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:33,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 07:56:34,888][26599] Updated weights for policy 0, policy_version 310604 (0.0030) [2024-06-19 07:56:38,164][26599] Updated weights for policy 0, policy_version 310614 (0.0028) [2024-06-19 07:56:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5089099776. Throughput: 0: 42673.2. Samples: 1356697540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:38,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 07:56:42,558][26599] Updated weights for policy 0, policy_version 310624 (0.0029) [2024-06-19 07:56:43,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5089296384. Throughput: 0: 42574.2. Samples: 1356954180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:43,381][26367] Avg episode reward: [(0, '0.778')] [2024-06-19 07:56:45,737][26599] Updated weights for policy 0, policy_version 310634 (0.0041) [2024-06-19 07:56:48,384][26367] Fps is (10 sec: 40945.5, 60 sec: 42868.9, 300 sec: 42431.3). Total num frames: 5089509376. Throughput: 0: 42355.3. Samples: 1357079120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:48,384][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 07:56:50,023][26599] Updated weights for policy 0, policy_version 310644 (0.0046) [2024-06-19 07:56:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5089738752. Throughput: 0: 42665.2. Samples: 1357339500. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:53,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 07:56:53,796][26599] Updated weights for policy 0, policy_version 310654 (0.0039) [2024-06-19 07:56:57,542][26599] Updated weights for policy 0, policy_version 310664 (0.0030) [2024-06-19 07:56:58,380][26367] Fps is (10 sec: 42613.8, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5089935360. Throughput: 0: 42809.2. Samples: 1357601680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:56:58,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 07:57:01,362][26599] Updated weights for policy 0, policy_version 310674 (0.0036) [2024-06-19 07:57:03,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42431.8). Total num frames: 5090148352. Throughput: 0: 42612.0. Samples: 1357729360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:57:03,380][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 07:57:05,089][26599] Updated weights for policy 0, policy_version 310684 (0.0033) [2024-06-19 07:57:08,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5090361344. Throughput: 0: 42758.9. Samples: 1357980680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:57:08,380][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 07:57:09,189][26599] Updated weights for policy 0, policy_version 310694 (0.0041) [2024-06-19 07:57:13,152][26599] Updated weights for policy 0, policy_version 310704 (0.0038) [2024-06-19 07:57:13,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5090574336. Throughput: 0: 42761.8. Samples: 1358242800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:57:13,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 07:57:16,714][26599] Updated weights for policy 0, policy_version 310714 (0.0034) [2024-06-19 07:57:18,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43147.2, 300 sec: 42487.8). Total num frames: 5090803712. Throughput: 0: 42633.0. Samples: 1358368880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:57:18,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 07:57:20,789][26599] Updated weights for policy 0, policy_version 310724 (0.0040) [2024-06-19 07:57:23,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 5091000320. Throughput: 0: 42773.3. Samples: 1358622340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 07:57:23,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 07:57:23,793][26579] Signal inference workers to stop experience collection... (20100 times) [2024-06-19 07:57:23,840][26599] InferenceWorker_p0-w0: stopping experience collection (20100 times) [2024-06-19 07:57:23,849][26579] Signal inference workers to resume experience collection... (20100 times) [2024-06-19 07:57:23,856][26599] InferenceWorker_p0-w0: resuming experience collection (20100 times) [2024-06-19 07:57:24,735][26599] Updated weights for policy 0, policy_version 310734 (0.0045) [2024-06-19 07:57:28,339][26599] Updated weights for policy 0, policy_version 310744 (0.0025) [2024-06-19 07:57:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5091229696. Throughput: 0: 42622.6. Samples: 1358872200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:57:28,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 07:57:32,352][26599] Updated weights for policy 0, policy_version 310754 (0.0032) [2024-06-19 07:57:33,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 5091442688. 
Throughput: 0: 42819.0. Samples: 1359005820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:57:33,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 07:57:35,729][26599] Updated weights for policy 0, policy_version 310764 (0.0040) [2024-06-19 07:57:38,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 5091655680. Throughput: 0: 42624.5. Samples: 1359257760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:57:38,385][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 07:57:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000310770_5091655680.pth... [2024-06-19 07:57:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000310146_5081432064.pth [2024-06-19 07:57:40,217][26599] Updated weights for policy 0, policy_version 310774 (0.0054) [2024-06-19 07:57:43,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5091868672. Throughput: 0: 42355.0. Samples: 1359507660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:57:43,381][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 07:57:43,808][26599] Updated weights for policy 0, policy_version 310784 (0.0031) [2024-06-19 07:57:47,708][26599] Updated weights for policy 0, policy_version 310794 (0.0041) [2024-06-19 07:57:48,380][26367] Fps is (10 sec: 40975.0, 60 sec: 42600.9, 300 sec: 42431.8). Total num frames: 5092065280. Throughput: 0: 42452.3. Samples: 1359639720. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:57:48,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 07:57:51,179][26599] Updated weights for policy 0, policy_version 310804 (0.0043) [2024-06-19 07:57:53,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5092294656. Throughput: 0: 42566.7. Samples: 1359896180. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:57:53,380][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 07:57:55,573][26599] Updated weights for policy 0, policy_version 310814 (0.0037) [2024-06-19 07:57:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5092507648. Throughput: 0: 42510.4. Samples: 1360155760. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:57:58,381][26367] Avg episode reward: [(0, '0.834')] [2024-06-19 07:57:58,788][26599] Updated weights for policy 0, policy_version 310824 (0.0046) [2024-06-19 07:58:03,158][26599] Updated weights for policy 0, policy_version 310834 (0.0033) [2024-06-19 07:58:03,381][26367] Fps is (10 sec: 40956.0, 60 sec: 42597.7, 300 sec: 42487.2). Total num frames: 5092704256. Throughput: 0: 42480.0. Samples: 1360280520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:58:03,382][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 07:58:06,800][26599] Updated weights for policy 0, policy_version 310844 (0.0029) [2024-06-19 07:58:08,380][26367] Fps is (10 sec: 44236.2, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5092950016. Throughput: 0: 42578.2. Samples: 1360538360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:58:08,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 07:58:11,004][26599] Updated weights for policy 0, policy_version 310854 (0.0029) [2024-06-19 07:58:13,380][26367] Fps is (10 sec: 42602.6, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 5093130240. Throughput: 0: 42646.8. Samples: 1360791300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:58:13,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 07:58:14,395][26599] Updated weights for policy 0, policy_version 310864 (0.0023) [2024-06-19 07:58:18,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42431.7). Total num frames: 5093343232. Throughput: 0: 42453.5. Samples: 1360916240. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:58:18,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 07:58:18,567][26599] Updated weights for policy 0, policy_version 310874 (0.0027) [2024-06-19 07:58:22,093][26599] Updated weights for policy 0, policy_version 310884 (0.0043) [2024-06-19 07:58:23,380][26367] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5093588992. Throughput: 0: 42720.9. Samples: 1361180040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:58:23,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 07:58:26,453][26599] Updated weights for policy 0, policy_version 310894 (0.0037) [2024-06-19 07:58:28,380][26367] Fps is (10 sec: 44237.9, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5093785600. Throughput: 0: 42825.9. Samples: 1361434820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:58:28,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 07:58:29,644][26599] Updated weights for policy 0, policy_version 310904 (0.0042) [2024-06-19 07:58:33,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42543.4). Total num frames: 5093982208. Throughput: 0: 42720.1. Samples: 1361562120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:58:33,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 07:58:34,153][26599] Updated weights for policy 0, policy_version 310914 (0.0043) [2024-06-19 07:58:37,217][26599] Updated weights for policy 0, policy_version 310924 (0.0034) [2024-06-19 07:58:38,038][26579] Signal inference workers to stop experience collection... (20150 times) [2024-06-19 07:58:38,087][26599] InferenceWorker_p0-w0: stopping experience collection (20150 times) [2024-06-19 07:58:38,094][26579] Signal inference workers to resume experience collection... 
(20150 times) [2024-06-19 07:58:38,103][26599] InferenceWorker_p0-w0: resuming experience collection (20150 times) [2024-06-19 07:58:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42874.1, 300 sec: 42765.0). Total num frames: 5094227968. Throughput: 0: 42812.4. Samples: 1361822740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 22.0) [2024-06-19 07:58:38,380][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 07:58:41,684][26599] Updated weights for policy 0, policy_version 310934 (0.0039) [2024-06-19 07:58:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5094408192. Throughput: 0: 42832.4. Samples: 1362083220. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:58:43,382][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 07:58:44,834][26599] Updated weights for policy 0, policy_version 310944 (0.0038) [2024-06-19 07:58:48,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 5094621184. Throughput: 0: 42740.0. Samples: 1362203780. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:58:48,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 07:58:49,102][26599] Updated weights for policy 0, policy_version 310954 (0.0047) [2024-06-19 07:58:52,930][26599] Updated weights for policy 0, policy_version 310964 (0.0024) [2024-06-19 07:58:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5094850560. Throughput: 0: 42889.4. Samples: 1362468380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:58:53,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 07:58:56,641][26599] Updated weights for policy 0, policy_version 310974 (0.0044) [2024-06-19 07:58:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5095063552. Throughput: 0: 42880.4. Samples: 1362720920. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:58:58,381][26367] Avg episode reward: [(0, '0.857')] [2024-06-19 07:59:00,718][26599] Updated weights for policy 0, policy_version 310984 (0.0025) [2024-06-19 07:59:03,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42872.2, 300 sec: 42654.0). Total num frames: 5095276544. Throughput: 0: 42886.5. Samples: 1362846120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:03,380][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 07:59:04,078][26599] Updated weights for policy 0, policy_version 310994 (0.0048) [2024-06-19 07:59:08,370][26599] Updated weights for policy 0, policy_version 311004 (0.0035) [2024-06-19 07:59:08,383][26367] Fps is (10 sec: 42584.8, 60 sec: 42323.1, 300 sec: 42653.5). Total num frames: 5095489536. Throughput: 0: 42795.1. Samples: 1363105960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:08,384][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 07:59:11,678][26599] Updated weights for policy 0, policy_version 311014 (0.0042) [2024-06-19 07:59:13,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5095702528. Throughput: 0: 42730.2. Samples: 1363357680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:13,381][26367] Avg episode reward: [(0, '0.813')] [2024-06-19 07:59:16,037][26599] Updated weights for policy 0, policy_version 311024 (0.0037) [2024-06-19 07:59:18,380][26367] Fps is (10 sec: 44251.5, 60 sec: 43144.8, 300 sec: 42709.5). Total num frames: 5095931904. Throughput: 0: 42856.9. Samples: 1363490680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:18,380][26367] Avg episode reward: [(0, '0.860')] [2024-06-19 07:59:19,377][26599] Updated weights for policy 0, policy_version 311034 (0.0024) [2024-06-19 07:59:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5096128512. Throughput: 0: 42880.9. Samples: 1363752380. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:23,381][26367] Avg episode reward: [(0, '0.880')] [2024-06-19 07:59:23,768][26599] Updated weights for policy 0, policy_version 311044 (0.0032) [2024-06-19 07:59:26,960][26599] Updated weights for policy 0, policy_version 311054 (0.0051) [2024-06-19 07:59:28,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5096357888. Throughput: 0: 42648.4. Samples: 1364002400. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:28,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 07:59:31,525][26599] Updated weights for policy 0, policy_version 311064 (0.0021) [2024-06-19 07:59:33,381][26367] Fps is (10 sec: 44231.4, 60 sec: 43143.7, 300 sec: 42764.9). Total num frames: 5096570880. Throughput: 0: 42982.9. Samples: 1364138060. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:33,382][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 07:59:34,577][26599] Updated weights for policy 0, policy_version 311074 (0.0042) [2024-06-19 07:59:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5096767488. Throughput: 0: 42698.2. Samples: 1364389800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:38,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 07:59:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000311082_5096767488.pth... [2024-06-19 07:59:38,456][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000310457_5086527488.pth [2024-06-19 07:59:39,282][26599] Updated weights for policy 0, policy_version 311084 (0.0035) [2024-06-19 07:59:42,491][26599] Updated weights for policy 0, policy_version 311094 (0.0036) [2024-06-19 07:59:43,380][26367] Fps is (10 sec: 42603.8, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5096996864. Throughput: 0: 42702.8. Samples: 1364642540. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:43,380][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 07:59:46,840][26599] Updated weights for policy 0, policy_version 311104 (0.0035) [2024-06-19 07:59:48,380][26367] Fps is (10 sec: 45876.0, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 5097226240. Throughput: 0: 42947.1. Samples: 1364778740. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:48,380][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 07:59:49,916][26599] Updated weights for policy 0, policy_version 311114 (0.0033) [2024-06-19 07:59:53,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5097406464. Throughput: 0: 42817.2. Samples: 1365032600. Policy #0 lag: (min: 0.0, avg: 9.0, max: 21.0) [2024-06-19 07:59:53,383][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 07:59:54,371][26599] Updated weights for policy 0, policy_version 311124 (0.0035) [2024-06-19 07:59:57,498][26599] Updated weights for policy 0, policy_version 311134 (0.0032) [2024-06-19 07:59:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5097635840. Throughput: 0: 42767.5. Samples: 1365282220. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 07:59:58,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 08:00:01,966][26599] Updated weights for policy 0, policy_version 311144 (0.0023) [2024-06-19 08:00:03,380][26367] Fps is (10 sec: 45876.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5097865216. Throughput: 0: 42840.5. Samples: 1365418500. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:03,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 08:00:04,963][26579] Signal inference workers to stop experience collection... (20200 times) [2024-06-19 08:00:04,965][26579] Signal inference workers to resume experience collection... 
(20200 times) [2024-06-19 08:00:04,983][26599] InferenceWorker_p0-w0: stopping experience collection (20200 times) [2024-06-19 08:00:04,983][26599] InferenceWorker_p0-w0: resuming experience collection (20200 times) [2024-06-19 08:00:05,120][26599] Updated weights for policy 0, policy_version 311154 (0.0049) [2024-06-19 08:00:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42600.7, 300 sec: 42653.9). Total num frames: 5098045440. Throughput: 0: 42648.8. Samples: 1365671580. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:08,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 08:00:09,437][26599] Updated weights for policy 0, policy_version 311164 (0.0033) [2024-06-19 08:00:12,725][26599] Updated weights for policy 0, policy_version 311174 (0.0041) [2024-06-19 08:00:13,384][26367] Fps is (10 sec: 42582.7, 60 sec: 43142.0, 300 sec: 42820.0). Total num frames: 5098291200. Throughput: 0: 42737.1. Samples: 1365925720. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:13,384][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 08:00:17,174][26599] Updated weights for policy 0, policy_version 311184 (0.0029) [2024-06-19 08:00:18,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5098487808. Throughput: 0: 42732.8. Samples: 1366060980. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:18,380][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 08:00:20,598][26599] Updated weights for policy 0, policy_version 311194 (0.0034) [2024-06-19 08:00:23,380][26367] Fps is (10 sec: 40974.4, 60 sec: 42871.4, 300 sec: 42598.9). Total num frames: 5098700800. Throughput: 0: 42644.0. Samples: 1366308780. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:23,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 08:00:25,036][26599] Updated weights for policy 0, policy_version 311204 (0.0032) [2024-06-19 08:00:28,222][26599] Updated weights for policy 0, policy_version 311214 (0.0039) [2024-06-19 08:00:28,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42821.1). Total num frames: 5098930176. Throughput: 0: 42881.6. Samples: 1366572220. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:28,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 08:00:32,459][26599] Updated weights for policy 0, policy_version 311224 (0.0032) [2024-06-19 08:00:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42326.2, 300 sec: 42709.5). Total num frames: 5099110400. Throughput: 0: 42780.0. Samples: 1366703840. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:33,381][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 08:00:35,906][26599] Updated weights for policy 0, policy_version 311234 (0.0023) [2024-06-19 08:00:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5099356160. Throughput: 0: 42715.5. Samples: 1366954800. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:38,390][26367] Avg episode reward: [(0, '0.334')] [2024-06-19 08:00:40,414][26599] Updated weights for policy 0, policy_version 311244 (0.0037) [2024-06-19 08:00:43,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5099569152. Throughput: 0: 42883.6. Samples: 1367211980. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:43,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 08:00:43,450][26599] Updated weights for policy 0, policy_version 311254 (0.0034) [2024-06-19 08:00:47,954][26599] Updated weights for policy 0, policy_version 311264 (0.0028) [2024-06-19 08:00:48,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5099749376. Throughput: 0: 42711.4. Samples: 1367340520. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:48,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 08:00:51,177][26599] Updated weights for policy 0, policy_version 311274 (0.0033) [2024-06-19 08:00:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5100011520. Throughput: 0: 42765.8. Samples: 1367596040. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:53,381][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 08:00:55,836][26599] Updated weights for policy 0, policy_version 311284 (0.0034) [2024-06-19 08:00:58,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5100208128. Throughput: 0: 43092.0. Samples: 1367864700. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:00:58,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 08:00:58,793][26599] Updated weights for policy 0, policy_version 311294 (0.0036) [2024-06-19 08:01:03,380][26367] Fps is (10 sec: 36044.6, 60 sec: 41779.1, 300 sec: 42598.4). Total num frames: 5100371968. Throughput: 0: 42897.1. Samples: 1367991360. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:01:03,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 08:01:03,618][26599] Updated weights for policy 0, policy_version 311304 (0.0043) [2024-06-19 08:01:06,355][26599] Updated weights for policy 0, policy_version 311314 (0.0037) [2024-06-19 08:01:08,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5100650496. Throughput: 0: 42981.4. Samples: 1368242940. Policy #0 lag: (min: 1.0, avg: 11.3, max: 20.0) [2024-06-19 08:01:08,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 08:01:11,235][26599] Updated weights for policy 0, policy_version 311324 (0.0022) [2024-06-19 08:01:13,380][26367] Fps is (10 sec: 47514.5, 60 sec: 42601.0, 300 sec: 42821.1). Total num frames: 5100847104. Throughput: 0: 42945.5. Samples: 1368504760. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:13,380][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 08:01:14,015][26599] Updated weights for policy 0, policy_version 311334 (0.0033) [2024-06-19 08:01:18,380][26367] Fps is (10 sec: 37683.1, 60 sec: 42325.2, 300 sec: 42654.0). Total num frames: 5101027328. Throughput: 0: 42851.1. Samples: 1368632140. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:18,381][26367] Avg episode reward: [(0, '0.384')] [2024-06-19 08:01:18,781][26599] Updated weights for policy 0, policy_version 311344 (0.0041) [2024-06-19 08:01:21,782][26599] Updated weights for policy 0, policy_version 311354 (0.0028) [2024-06-19 08:01:22,019][26579] Signal inference workers to stop experience collection... (20250 times) [2024-06-19 08:01:22,053][26599] InferenceWorker_p0-w0: stopping experience collection (20250 times) [2024-06-19 08:01:22,088][26579] Signal inference workers to resume experience collection... 
(20250 times) [2024-06-19 08:01:22,089][26599] InferenceWorker_p0-w0: resuming experience collection (20250 times) [2024-06-19 08:01:23,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5101289472. Throughput: 0: 42937.5. Samples: 1368886980. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:23,380][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 08:01:26,237][26599] Updated weights for policy 0, policy_version 311364 (0.0023) [2024-06-19 08:01:28,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5101502464. Throughput: 0: 42938.2. Samples: 1369144200. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:28,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 08:01:29,415][26599] Updated weights for policy 0, policy_version 311374 (0.0029) [2024-06-19 08:01:33,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5101682688. Throughput: 0: 42832.5. Samples: 1369267980. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:33,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 08:01:33,722][26599] Updated weights for policy 0, policy_version 311384 (0.0033) [2024-06-19 08:01:36,994][26599] Updated weights for policy 0, policy_version 311394 (0.0036) [2024-06-19 08:01:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5101944832. Throughput: 0: 42959.1. Samples: 1369529200. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:38,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 08:01:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000311398_5101944832.pth... 
[2024-06-19 08:01:38,450][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000310770_5091655680.pth [2024-06-19 08:01:41,072][26599] Updated weights for policy 0, policy_version 311404 (0.0039) [2024-06-19 08:01:43,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42821.1). Total num frames: 5102141440. Throughput: 0: 42762.6. Samples: 1369789020. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:43,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 08:01:44,456][26599] Updated weights for policy 0, policy_version 311414 (0.0029) [2024-06-19 08:01:48,380][26367] Fps is (10 sec: 39321.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5102338048. Throughput: 0: 42654.3. Samples: 1369910800. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:48,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 08:01:49,095][26599] Updated weights for policy 0, policy_version 311424 (0.0037) [2024-06-19 08:01:52,102][26599] Updated weights for policy 0, policy_version 311434 (0.0036) [2024-06-19 08:01:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5102583808. Throughput: 0: 42860.0. Samples: 1370171640. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:53,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 08:01:56,563][26599] Updated weights for policy 0, policy_version 311444 (0.0041) [2024-06-19 08:01:58,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5102780416. Throughput: 0: 42835.6. Samples: 1370432360. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:01:58,380][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 08:01:59,685][26599] Updated weights for policy 0, policy_version 311454 (0.0042) [2024-06-19 08:02:03,380][26367] Fps is (10 sec: 40959.3, 60 sec: 43690.7, 300 sec: 42820.5). Total num frames: 5102993408. Throughput: 0: 42764.4. Samples: 1370556540. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:02:03,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 08:02:04,067][26599] Updated weights for policy 0, policy_version 311464 (0.0031) [2024-06-19 08:02:07,216][26599] Updated weights for policy 0, policy_version 311474 (0.0037) [2024-06-19 08:02:08,382][26367] Fps is (10 sec: 44229.8, 60 sec: 42870.4, 300 sec: 42875.9). Total num frames: 5103222784. Throughput: 0: 42853.2. Samples: 1370815440. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:02:08,382][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 08:02:12,058][26599] Updated weights for policy 0, policy_version 311484 (0.0042) [2024-06-19 08:02:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5103403008. Throughput: 0: 42767.5. Samples: 1371068740. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:02:13,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 08:02:14,835][26599] Updated weights for policy 0, policy_version 311494 (0.0033) [2024-06-19 08:02:18,380][26367] Fps is (10 sec: 39327.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5103616000. Throughput: 0: 42812.9. Samples: 1371194560. Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:02:18,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 08:02:19,812][26599] Updated weights for policy 0, policy_version 311504 (0.0028) [2024-06-19 08:02:22,419][26599] Updated weights for policy 0, policy_version 311514 (0.0028) [2024-06-19 08:02:23,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5103861760. Throughput: 0: 42700.9. Samples: 1371450740. 
Policy #0 lag: (min: 0.0, avg: 7.2, max: 21.0) [2024-06-19 08:02:23,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 08:02:27,230][26599] Updated weights for policy 0, policy_version 311524 (0.0035) [2024-06-19 08:02:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5104058368. Throughput: 0: 42669.2. Samples: 1371709140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:02:28,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 08:02:29,703][26579] Signal inference workers to stop experience collection... (20300 times) [2024-06-19 08:02:29,704][26579] Signal inference workers to resume experience collection... (20300 times) [2024-06-19 08:02:29,733][26599] InferenceWorker_p0-w0: stopping experience collection (20300 times) [2024-06-19 08:02:29,734][26599] InferenceWorker_p0-w0: resuming experience collection (20300 times) [2024-06-19 08:02:30,216][26599] Updated weights for policy 0, policy_version 311534 (0.0033) [2024-06-19 08:02:33,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 5104254976. Throughput: 0: 42797.4. Samples: 1371836680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:02:33,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 08:02:34,841][26599] Updated weights for policy 0, policy_version 311544 (0.0045) [2024-06-19 08:02:38,214][26599] Updated weights for policy 0, policy_version 311554 (0.0025) [2024-06-19 08:02:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5104500736. Throughput: 0: 42733.1. Samples: 1372094640. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:02:38,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 08:02:42,444][26599] Updated weights for policy 0, policy_version 311564 (0.0038) [2024-06-19 08:02:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5104697344. Throughput: 0: 42706.2. 
Samples: 1372354140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:02:43,380][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 08:02:45,942][26599] Updated weights for policy 0, policy_version 311574 (0.0038) [2024-06-19 08:02:48,380][26367] Fps is (10 sec: 37683.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5104877568. Throughput: 0: 42732.5. Samples: 1372479500. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:02:48,381][26367] Avg episode reward: [(0, '0.369')] [2024-06-19 08:02:50,066][26599] Updated weights for policy 0, policy_version 311584 (0.0022) [2024-06-19 08:02:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5105139712. Throughput: 0: 42663.2. Samples: 1372735220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:02:53,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 08:02:53,589][26599] Updated weights for policy 0, policy_version 311594 (0.0033) [2024-06-19 08:02:57,651][26599] Updated weights for policy 0, policy_version 311604 (0.0036) [2024-06-19 08:02:58,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42820.7). Total num frames: 5105336320. Throughput: 0: 42783.2. Samples: 1372993980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:02:58,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 08:03:01,123][26599] Updated weights for policy 0, policy_version 311614 (0.0033) [2024-06-19 08:03:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5105532928. Throughput: 0: 42743.6. Samples: 1373118020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:03:03,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 08:03:05,457][26599] Updated weights for policy 0, policy_version 311624 (0.0032) [2024-06-19 08:03:08,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42872.5, 300 sec: 42931.6). Total num frames: 5105795072. Throughput: 0: 42809.3. 
Samples: 1373377160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:03:08,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 08:03:08,738][26599] Updated weights for policy 0, policy_version 311634 (0.0037) [2024-06-19 08:03:13,169][26599] Updated weights for policy 0, policy_version 311644 (0.0033) [2024-06-19 08:03:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5105975296. Throughput: 0: 42746.7. Samples: 1373632740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:03:13,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 08:03:16,905][26599] Updated weights for policy 0, policy_version 311654 (0.0047) [2024-06-19 08:03:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5106188288. Throughput: 0: 42632.9. Samples: 1373755160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:03:18,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 08:03:20,832][26599] Updated weights for policy 0, policy_version 311664 (0.0033) [2024-06-19 08:03:23,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5106434048. Throughput: 0: 42683.2. Samples: 1374015380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:03:23,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 08:03:24,395][26599] Updated weights for policy 0, policy_version 311674 (0.0034) [2024-06-19 08:03:28,322][26599] Updated weights for policy 0, policy_version 311684 (0.0038) [2024-06-19 08:03:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5106630656. Throughput: 0: 42771.9. Samples: 1374278880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:03:28,381][26367] Avg episode reward: [(0, '0.781')] [2024-06-19 08:03:31,924][26599] Updated weights for policy 0, policy_version 311694 (0.0031) [2024-06-19 08:03:33,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5106827264. Throughput: 0: 42716.5. Samples: 1374401740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:03:33,381][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 08:03:36,000][26599] Updated weights for policy 0, policy_version 311704 (0.0046) [2024-06-19 08:03:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5107073024. Throughput: 0: 42811.9. Samples: 1374661760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 08:03:38,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 08:03:38,492][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000311712_5107089408.pth... [2024-06-19 08:03:38,538][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000311082_5096767488.pth [2024-06-19 08:03:39,403][26599] Updated weights for policy 0, policy_version 311714 (0.0029) [2024-06-19 08:03:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5107253248. Throughput: 0: 42970.3. Samples: 1374927640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:03:43,380][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 08:03:43,636][26599] Updated weights for policy 0, policy_version 311724 (0.0038) [2024-06-19 08:03:47,438][26599] Updated weights for policy 0, policy_version 311734 (0.0036) [2024-06-19 08:03:48,380][26367] Fps is (10 sec: 39322.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5107466240. Throughput: 0: 42786.3. Samples: 1375043400. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:03:48,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 08:03:51,154][26599] Updated weights for policy 0, policy_version 311744 (0.0032) [2024-06-19 08:03:52,634][26579] Signal inference workers to stop experience collection... (20350 times) [2024-06-19 08:03:52,668][26599] InferenceWorker_p0-w0: stopping experience collection (20350 times) [2024-06-19 08:03:52,697][26579] Signal inference workers to resume experience collection... (20350 times) [2024-06-19 08:03:52,698][26599] InferenceWorker_p0-w0: resuming experience collection (20350 times) [2024-06-19 08:03:53,384][26367] Fps is (10 sec: 45858.3, 60 sec: 42868.9, 300 sec: 42875.6). Total num frames: 5107712000. Throughput: 0: 42876.6. Samples: 1375306760. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:03:53,384][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 08:03:55,191][26599] Updated weights for policy 0, policy_version 311754 (0.0032) [2024-06-19 08:03:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5107892224. Throughput: 0: 42977.0. Samples: 1375566700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:03:58,380][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 08:03:59,041][26599] Updated weights for policy 0, policy_version 311764 (0.0040) [2024-06-19 08:04:02,614][26599] Updated weights for policy 0, policy_version 311774 (0.0041) [2024-06-19 08:04:03,380][26367] Fps is (10 sec: 40974.6, 60 sec: 43144.5, 300 sec: 42821.0). Total num frames: 5108121600. Throughput: 0: 42942.1. Samples: 1375687560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:03,384][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 08:04:06,768][26599] Updated weights for policy 0, policy_version 311784 (0.0029) [2024-06-19 08:04:08,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5108367360. 
Throughput: 0: 43090.3. Samples: 1375954440. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:08,380][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 08:04:09,994][26599] Updated weights for policy 0, policy_version 311794 (0.0032) [2024-06-19 08:04:13,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5108547584. Throughput: 0: 43010.3. Samples: 1376214340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:13,380][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 08:04:14,540][26599] Updated weights for policy 0, policy_version 311804 (0.0028) [2024-06-19 08:04:17,505][26599] Updated weights for policy 0, policy_version 311814 (0.0051) [2024-06-19 08:04:18,384][26367] Fps is (10 sec: 40944.9, 60 sec: 43141.9, 300 sec: 42875.6). Total num frames: 5108776960. Throughput: 0: 42925.4. Samples: 1376333540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:18,385][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 08:04:22,148][26599] Updated weights for policy 0, policy_version 311824 (0.0040) [2024-06-19 08:04:23,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5108989952. Throughput: 0: 43018.4. Samples: 1376597580. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:23,380][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 08:04:25,120][26599] Updated weights for policy 0, policy_version 311834 (0.0022) [2024-06-19 08:04:28,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42598.4, 300 sec: 42765.2). Total num frames: 5109186560. Throughput: 0: 42714.6. Samples: 1376849800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:28,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 08:04:29,796][26599] Updated weights for policy 0, policy_version 311844 (0.0039) [2024-06-19 08:04:33,172][26599] Updated weights for policy 0, policy_version 311854 (0.0033) [2024-06-19 08:04:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5109415936. Throughput: 0: 42952.4. Samples: 1376976260. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:33,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 08:04:37,640][26599] Updated weights for policy 0, policy_version 311864 (0.0037) [2024-06-19 08:04:38,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 5109628928. Throughput: 0: 42779.9. Samples: 1377231700. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:38,381][26367] Avg episode reward: [(0, '0.822')] [2024-06-19 08:04:41,015][26599] Updated weights for policy 0, policy_version 311874 (0.0039) [2024-06-19 08:04:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5109841920. Throughput: 0: 42715.6. Samples: 1377488900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:43,381][26367] Avg episode reward: [(0, '0.766')] [2024-06-19 08:04:45,176][26599] Updated weights for policy 0, policy_version 311884 (0.0038) [2024-06-19 08:04:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5110054912. Throughput: 0: 42808.4. Samples: 1377613940. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:48,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 08:04:48,702][26599] Updated weights for policy 0, policy_version 311894 (0.0038) [2024-06-19 08:04:52,835][26599] Updated weights for policy 0, policy_version 311904 (0.0030) [2024-06-19 08:04:53,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42327.9, 300 sec: 42765.0). Total num frames: 5110251520. Throughput: 0: 42514.6. Samples: 1377867600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:04:53,382][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 08:04:56,465][26599] Updated weights for policy 0, policy_version 311914 (0.0043) [2024-06-19 08:04:58,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5110448128. Throughput: 0: 42560.0. Samples: 1378129540. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:04:58,380][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 08:05:00,454][26599] Updated weights for policy 0, policy_version 311924 (0.0049) [2024-06-19 08:05:03,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5110693888. Throughput: 0: 42684.4. Samples: 1378254180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:03,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 08:05:04,189][26599] Updated weights for policy 0, policy_version 311934 (0.0037) [2024-06-19 08:05:08,347][26599] Updated weights for policy 0, policy_version 311944 (0.0054) [2024-06-19 08:05:08,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42049.7, 300 sec: 42709.5). Total num frames: 5110890496. Throughput: 0: 42441.0. Samples: 1378507580. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:08,384][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 08:05:11,751][26599] Updated weights for policy 0, policy_version 311954 (0.0035) [2024-06-19 08:05:13,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5111103488. Throughput: 0: 42577.8. Samples: 1378765800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:13,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 08:05:15,960][26599] Updated weights for policy 0, policy_version 311964 (0.0039) [2024-06-19 08:05:18,380][26367] Fps is (10 sec: 42613.6, 60 sec: 42327.8, 300 sec: 42765.0). Total num frames: 5111316480. Throughput: 0: 42525.7. Samples: 1378889920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:18,381][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 08:05:19,385][26599] Updated weights for policy 0, policy_version 311974 (0.0036) [2024-06-19 08:05:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 5111513088. Throughput: 0: 42447.6. Samples: 1379141840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:23,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 08:05:23,656][26599] Updated weights for policy 0, policy_version 311984 (0.0043) [2024-06-19 08:05:25,025][26579] Signal inference workers to stop experience collection... (20400 times) [2024-06-19 08:05:25,026][26579] Signal inference workers to resume experience collection... (20400 times) [2024-06-19 08:05:25,054][26599] InferenceWorker_p0-w0: stopping experience collection (20400 times) [2024-06-19 08:05:25,054][26599] InferenceWorker_p0-w0: resuming experience collection (20400 times) [2024-06-19 08:05:27,322][26599] Updated weights for policy 0, policy_version 311994 (0.0038) [2024-06-19 08:05:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5111758848. 
Throughput: 0: 42442.1. Samples: 1379398800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:28,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 08:05:31,339][26599] Updated weights for policy 0, policy_version 312004 (0.0034) [2024-06-19 08:05:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5111955456. Throughput: 0: 42548.6. Samples: 1379528620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:33,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 08:05:35,241][26599] Updated weights for policy 0, policy_version 312014 (0.0057) [2024-06-19 08:05:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5112168448. Throughput: 0: 42500.0. Samples: 1379780100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:38,381][26367] Avg episode reward: [(0, '0.802')] [2024-06-19 08:05:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000312022_5112168448.pth... [2024-06-19 08:05:38,447][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000311398_5101944832.pth [2024-06-19 08:05:39,070][26599] Updated weights for policy 0, policy_version 312024 (0.0037) [2024-06-19 08:05:42,869][26599] Updated weights for policy 0, policy_version 312034 (0.0034) [2024-06-19 08:05:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5112381440. Throughput: 0: 42171.6. Samples: 1380027260. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:43,381][26367] Avg episode reward: [(0, '0.845')] [2024-06-19 08:05:46,769][26599] Updated weights for policy 0, policy_version 312044 (0.0041) [2024-06-19 08:05:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5112594432. Throughput: 0: 42271.5. Samples: 1380156400. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:48,381][26367] Avg episode reward: [(0, '0.302')] [2024-06-19 08:05:50,939][26599] Updated weights for policy 0, policy_version 312054 (0.0036) [2024-06-19 08:05:53,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5112791040. Throughput: 0: 42344.3. Samples: 1380412920. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:53,384][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 08:05:54,383][26599] Updated weights for policy 0, policy_version 312064 (0.0033) [2024-06-19 08:05:58,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5112987648. Throughput: 0: 42164.0. Samples: 1380663180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:05:58,384][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 08:05:58,591][26599] Updated weights for policy 0, policy_version 312074 (0.0030) [2024-06-19 08:06:02,203][26599] Updated weights for policy 0, policy_version 312084 (0.0028) [2024-06-19 08:06:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 5113217024. Throughput: 0: 42130.6. Samples: 1380785800. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:06:03,381][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 08:06:06,367][26599] Updated weights for policy 0, policy_version 312094 (0.0032) [2024-06-19 08:06:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42327.9, 300 sec: 42653.9). Total num frames: 5113430016. Throughput: 0: 42142.2. Samples: 1381038240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 24.0) [2024-06-19 08:06:08,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 08:06:09,850][26599] Updated weights for policy 0, policy_version 312104 (0.0028) [2024-06-19 08:06:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5113626624. Throughput: 0: 42148.9. Samples: 1381295500. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:13,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 08:06:13,983][26599] Updated weights for policy 0, policy_version 312114 (0.0039) [2024-06-19 08:06:17,785][26599] Updated weights for policy 0, policy_version 312124 (0.0033) [2024-06-19 08:06:18,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5113856000. Throughput: 0: 42056.0. Samples: 1381421140. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:18,380][26367] Avg episode reward: [(0, '0.834')] [2024-06-19 08:06:21,705][26599] Updated weights for policy 0, policy_version 312134 (0.0026) [2024-06-19 08:06:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5114052608. Throughput: 0: 42140.0. Samples: 1381676400. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:23,381][26367] Avg episode reward: [(0, '0.767')] [2024-06-19 08:06:25,487][26599] Updated weights for policy 0, policy_version 312144 (0.0036) [2024-06-19 08:06:28,381][26367] Fps is (10 sec: 40958.9, 60 sec: 41779.1, 300 sec: 42653.9). Total num frames: 5114265600. Throughput: 0: 42382.8. Samples: 1381934500. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:28,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 08:06:29,501][26599] Updated weights for policy 0, policy_version 312154 (0.0029) [2024-06-19 08:06:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5114478592. Throughput: 0: 42289.8. Samples: 1382059440. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:33,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 08:06:33,471][26599] Updated weights for policy 0, policy_version 312164 (0.0042) [2024-06-19 08:06:37,053][26599] Updated weights for policy 0, policy_version 312174 (0.0029) [2024-06-19 08:06:38,380][26367] Fps is (10 sec: 44238.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5114707968. Throughput: 0: 42170.8. Samples: 1382310600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:38,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 08:06:41,025][26599] Updated weights for policy 0, policy_version 312184 (0.0031) [2024-06-19 08:06:43,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5114904576. Throughput: 0: 42365.4. Samples: 1382569620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:43,380][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 08:06:44,747][26599] Updated weights for policy 0, policy_version 312194 (0.0034) [2024-06-19 08:06:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42487.3). Total num frames: 5115117568. Throughput: 0: 42537.1. Samples: 1382699960. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:48,380][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 08:06:48,622][26599] Updated weights for policy 0, policy_version 312204 (0.0033) [2024-06-19 08:06:52,577][26599] Updated weights for policy 0, policy_version 312214 (0.0038) [2024-06-19 08:06:53,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 5115330560. Throughput: 0: 42541.7. Samples: 1382952620. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:53,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 08:06:56,540][26599] Updated weights for policy 0, policy_version 312224 (0.0038) [2024-06-19 08:06:58,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5115559936. Throughput: 0: 42466.6. Samples: 1383206500. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:06:58,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 08:07:00,214][26599] Updated weights for policy 0, policy_version 312234 (0.0038) [2024-06-19 08:07:03,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42543.1). Total num frames: 5115772928. Throughput: 0: 42604.5. Samples: 1383338340. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:07:03,380][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 08:07:04,101][26599] Updated weights for policy 0, policy_version 312244 (0.0038) [2024-06-19 08:07:08,069][26599] Updated weights for policy 0, policy_version 312254 (0.0035) [2024-06-19 08:07:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5115985920. Throughput: 0: 42655.9. Samples: 1383595920. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:07:08,388][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 08:07:11,746][26579] Signal inference workers to stop experience collection... (20450 times) [2024-06-19 08:07:11,785][26599] InferenceWorker_p0-w0: stopping experience collection (20450 times) [2024-06-19 08:07:11,818][26579] Signal inference workers to resume experience collection... (20450 times) [2024-06-19 08:07:11,819][26599] InferenceWorker_p0-w0: resuming experience collection (20450 times) [2024-06-19 08:07:11,823][26599] Updated weights for policy 0, policy_version 312264 (0.0039) [2024-06-19 08:07:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5116198912. 
Throughput: 0: 42475.7. Samples: 1383845900. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:07:13,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 08:07:15,777][26599] Updated weights for policy 0, policy_version 312274 (0.0031) [2024-06-19 08:07:18,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5116395520. Throughput: 0: 42556.5. Samples: 1383974480. Policy #0 lag: (min: 1.0, avg: 10.6, max: 22.0) [2024-06-19 08:07:18,380][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 08:07:19,305][26599] Updated weights for policy 0, policy_version 312284 (0.0033) [2024-06-19 08:07:23,343][26599] Updated weights for policy 0, policy_version 312294 (0.0024) [2024-06-19 08:07:23,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5116624896. Throughput: 0: 42751.6. Samples: 1384234420. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:07:23,380][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 08:07:26,837][26599] Updated weights for policy 0, policy_version 312304 (0.0038) [2024-06-19 08:07:28,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5116837888. Throughput: 0: 42598.4. Samples: 1384486560. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:07:28,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 08:07:30,986][26599] Updated weights for policy 0, policy_version 312314 (0.0035) [2024-06-19 08:07:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5117050880. Throughput: 0: 42594.6. Samples: 1384616720. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:07:33,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 08:07:34,290][26599] Updated weights for policy 0, policy_version 312324 (0.0038) [2024-06-19 08:07:38,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5117247488. 
Throughput: 0: 42738.7. Samples: 1384875860. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:07:38,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 08:07:38,460][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000312333_5117263872.pth... [2024-06-19 08:07:38,526][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000311712_5107089408.pth [2024-06-19 08:07:38,654][26599] Updated weights for policy 0, policy_version 312334 (0.0025) [2024-06-19 08:07:42,271][26599] Updated weights for policy 0, policy_version 312344 (0.0036) [2024-06-19 08:07:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5117476864. Throughput: 0: 42602.8. Samples: 1385123620. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:07:43,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 08:07:46,332][26599] Updated weights for policy 0, policy_version 312354 (0.0031) [2024-06-19 08:07:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5117689856. Throughput: 0: 42615.5. Samples: 1385256040. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:07:48,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 08:07:50,009][26599] Updated weights for policy 0, policy_version 312364 (0.0031) [2024-06-19 08:07:53,384][26367] Fps is (10 sec: 42582.6, 60 sec: 42868.9, 300 sec: 42597.9). Total num frames: 5117902848. Throughput: 0: 42506.0. Samples: 1385508840. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:07:53,385][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 08:07:53,954][26599] Updated weights for policy 0, policy_version 312374 (0.0038) [2024-06-19 08:07:57,684][26599] Updated weights for policy 0, policy_version 312384 (0.0033) [2024-06-19 08:07:58,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5118115840. Throughput: 0: 42703.6. 
Samples: 1385767560. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:07:58,381][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 08:08:01,752][26599] Updated weights for policy 0, policy_version 312394 (0.0045) [2024-06-19 08:08:03,380][26367] Fps is (10 sec: 40975.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5118312448. Throughput: 0: 42684.4. Samples: 1385895280. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:08:03,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 08:08:05,429][26599] Updated weights for policy 0, policy_version 312404 (0.0035) [2024-06-19 08:08:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 5118525440. Throughput: 0: 42391.9. Samples: 1386142060. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:08:08,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 08:08:09,457][26599] Updated weights for policy 0, policy_version 312414 (0.0030) [2024-06-19 08:08:13,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 5118754816. Throughput: 0: 42666.4. Samples: 1386406700. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:08:13,385][26367] Avg episode reward: [(0, '0.356')] [2024-06-19 08:08:13,386][26599] Updated weights for policy 0, policy_version 312424 (0.0043) [2024-06-19 08:08:17,373][26599] Updated weights for policy 0, policy_version 312434 (0.0033) [2024-06-19 08:08:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 5118951424. Throughput: 0: 42719.5. Samples: 1386539100. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:08:18,381][26367] Avg episode reward: [(0, '0.369')] [2024-06-19 08:08:20,896][26599] Updated weights for policy 0, policy_version 312444 (0.0042) [2024-06-19 08:08:23,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5119180800. Throughput: 0: 42477.8. 
Samples: 1386787360. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:08:23,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 08:08:25,069][26599] Updated weights for policy 0, policy_version 312454 (0.0041) [2024-06-19 08:08:28,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5119393792. Throughput: 0: 42720.9. Samples: 1387046060. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:08:28,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 08:08:28,492][26599] Updated weights for policy 0, policy_version 312464 (0.0045) [2024-06-19 08:08:32,683][26599] Updated weights for policy 0, policy_version 312474 (0.0035) [2024-06-19 08:08:33,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5119590400. Throughput: 0: 42650.6. Samples: 1387175320. Policy #0 lag: (min: 2.0, avg: 10.0, max: 21.0) [2024-06-19 08:08:33,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 08:08:35,316][26579] Signal inference workers to stop experience collection... (20500 times) [2024-06-19 08:08:35,317][26579] Signal inference workers to resume experience collection... (20500 times) [2024-06-19 08:08:35,356][26599] InferenceWorker_p0-w0: stopping experience collection (20500 times) [2024-06-19 08:08:35,360][26599] InferenceWorker_p0-w0: resuming experience collection (20500 times) [2024-06-19 08:08:36,150][26599] Updated weights for policy 0, policy_version 312484 (0.0028) [2024-06-19 08:08:38,380][26367] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5119836160. Throughput: 0: 42658.0. Samples: 1387428300. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:08:38,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 08:08:40,814][26599] Updated weights for policy 0, policy_version 312494 (0.0029) [2024-06-19 08:08:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5120032768. 
Throughput: 0: 42631.9. Samples: 1387686000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:08:43,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 08:08:43,698][26599] Updated weights for policy 0, policy_version 312504 (0.0035) [2024-06-19 08:08:48,379][26599] Updated weights for policy 0, policy_version 312514 (0.0040) [2024-06-19 08:08:48,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42432.3). Total num frames: 5120229376. Throughput: 0: 42615.4. Samples: 1387812980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:08:48,381][26367] Avg episode reward: [(0, '0.427')] [2024-06-19 08:08:51,338][26599] Updated weights for policy 0, policy_version 312524 (0.0036) [2024-06-19 08:08:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42874.0, 300 sec: 42653.9). Total num frames: 5120475136. Throughput: 0: 42827.9. Samples: 1388069320. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:08:53,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 08:08:55,937][26599] Updated weights for policy 0, policy_version 312534 (0.0038) [2024-06-19 08:08:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5120671744. Throughput: 0: 42632.7. Samples: 1388325020. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:08:58,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 08:08:58,997][26599] Updated weights for policy 0, policy_version 312544 (0.0043) [2024-06-19 08:09:03,382][26367] Fps is (10 sec: 39315.7, 60 sec: 42597.2, 300 sec: 42376.0). Total num frames: 5120868352. Throughput: 0: 42445.2. Samples: 1388449200. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:03,382][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 08:09:03,941][26599] Updated weights for policy 0, policy_version 312554 (0.0042) [2024-06-19 08:09:06,678][26599] Updated weights for policy 0, policy_version 312564 (0.0048) [2024-06-19 08:09:08,380][26367] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 5121130496. Throughput: 0: 42715.1. Samples: 1388709540. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:08,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 08:09:11,557][26599] Updated weights for policy 0, policy_version 312574 (0.0041) [2024-06-19 08:09:13,380][26367] Fps is (10 sec: 44244.4, 60 sec: 42601.1, 300 sec: 42487.9). Total num frames: 5121310720. Throughput: 0: 42735.6. Samples: 1388969160. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:13,380][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 08:09:14,223][26599] Updated weights for policy 0, policy_version 312584 (0.0038) [2024-06-19 08:09:18,384][26367] Fps is (10 sec: 37669.3, 60 sec: 42595.8, 300 sec: 42431.3). Total num frames: 5121507328. Throughput: 0: 42470.8. Samples: 1389086660. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:18,385][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 08:09:19,266][26599] Updated weights for policy 0, policy_version 312594 (0.0036) [2024-06-19 08:09:21,766][26599] Updated weights for policy 0, policy_version 312604 (0.0032) [2024-06-19 08:09:23,380][26367] Fps is (10 sec: 45874.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5121769472. Throughput: 0: 42662.4. Samples: 1389348100. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:23,381][26367] Avg episode reward: [(0, '0.373')] [2024-06-19 08:09:27,282][26599] Updated weights for policy 0, policy_version 312614 (0.0036) [2024-06-19 08:09:28,385][26367] Fps is (10 sec: 42595.6, 60 sec: 42322.3, 300 sec: 42431.2). Total num frames: 5121933312. Throughput: 0: 42781.4. Samples: 1389611340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:28,385][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 08:09:29,535][26599] Updated weights for policy 0, policy_version 312624 (0.0039) [2024-06-19 08:09:33,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5122162688. Throughput: 0: 42476.9. Samples: 1389724440. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:33,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 08:09:34,958][26599] Updated weights for policy 0, policy_version 312634 (0.0043) [2024-06-19 08:09:37,175][26599] Updated weights for policy 0, policy_version 312644 (0.0029) [2024-06-19 08:09:38,380][26367] Fps is (10 sec: 47534.4, 60 sec: 42871.7, 300 sec: 42598.4). Total num frames: 5122408448. Throughput: 0: 42641.5. Samples: 1389988180. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:38,380][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 08:09:38,501][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000312648_5122424832.pth... [2024-06-19 08:09:38,564][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000312022_5112168448.pth [2024-06-19 08:09:40,443][26579] Signal inference workers to stop experience collection... (20550 times) [2024-06-19 08:09:40,447][26579] Signal inference workers to resume experience collection... 
(20550 times) [2024-06-19 08:09:40,465][26599] InferenceWorker_p0-w0: stopping experience collection (20550 times) [2024-06-19 08:09:40,465][26599] InferenceWorker_p0-w0: resuming experience collection (20550 times) [2024-06-19 08:09:42,587][26599] Updated weights for policy 0, policy_version 312654 (0.0045) [2024-06-19 08:09:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 5122572288. Throughput: 0: 42840.1. Samples: 1390252820. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:43,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 08:09:45,191][26599] Updated weights for policy 0, policy_version 312664 (0.0038) [2024-06-19 08:09:48,380][26367] Fps is (10 sec: 40959.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5122818048. Throughput: 0: 42612.0. Samples: 1390366680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 20.0) [2024-06-19 08:09:48,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 08:09:50,254][26599] Updated weights for policy 0, policy_version 312674 (0.0030) [2024-06-19 08:09:52,786][26599] Updated weights for policy 0, policy_version 312684 (0.0037) [2024-06-19 08:09:53,380][26367] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5123047424. Throughput: 0: 42622.1. Samples: 1390627540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:09:53,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 08:09:57,855][26599] Updated weights for policy 0, policy_version 312694 (0.0038) [2024-06-19 08:09:58,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 5123194880. Throughput: 0: 42776.2. Samples: 1390894100. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:09:58,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 08:10:00,235][26599] Updated weights for policy 0, policy_version 312704 (0.0034) [2024-06-19 08:10:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 43145.7, 300 sec: 42598.9). Total num frames: 5123457024. Throughput: 0: 42721.2. Samples: 1391008960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:03,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 08:10:05,478][26599] Updated weights for policy 0, policy_version 312714 (0.0039) [2024-06-19 08:10:07,858][26599] Updated weights for policy 0, policy_version 312724 (0.0033) [2024-06-19 08:10:08,380][26367] Fps is (10 sec: 49153.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5123686400. Throughput: 0: 42709.4. Samples: 1391270020. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:08,380][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 08:10:13,103][26599] Updated weights for policy 0, policy_version 312734 (0.0028) [2024-06-19 08:10:13,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5123850240. Throughput: 0: 42726.2. Samples: 1391533840. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:13,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 08:10:15,494][26599] Updated weights for policy 0, policy_version 312744 (0.0044) [2024-06-19 08:10:18,380][26367] Fps is (10 sec: 40959.3, 60 sec: 43147.1, 300 sec: 42653.9). Total num frames: 5124096000. Throughput: 0: 42745.8. Samples: 1391648000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:18,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 08:10:20,738][26599] Updated weights for policy 0, policy_version 312754 (0.0031) [2024-06-19 08:10:23,275][26599] Updated weights for policy 0, policy_version 312764 (0.0028) [2024-06-19 08:10:23,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5124325376. Throughput: 0: 42639.9. Samples: 1391906980. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:23,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 08:10:28,368][26599] Updated weights for policy 0, policy_version 312774 (0.0035) [2024-06-19 08:10:28,384][26367] Fps is (10 sec: 39307.7, 60 sec: 42598.9, 300 sec: 42486.8). Total num frames: 5124489216. Throughput: 0: 42617.0. Samples: 1392170740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:28,384][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 08:10:30,954][26599] Updated weights for policy 0, policy_version 312784 (0.0032) [2024-06-19 08:10:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5124751360. Throughput: 0: 42632.1. Samples: 1392285120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:33,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 08:10:35,790][26599] Updated weights for policy 0, policy_version 312794 (0.0033) [2024-06-19 08:10:38,380][26367] Fps is (10 sec: 47531.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5124964352. Throughput: 0: 42822.8. Samples: 1392554560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:38,380][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 08:10:39,026][26599] Updated weights for policy 0, policy_version 312804 (0.0029) [2024-06-19 08:10:43,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5125128192. Throughput: 0: 42565.0. Samples: 1392809520. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:43,381][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 08:10:43,406][26599] Updated weights for policy 0, policy_version 312814 (0.0030) [2024-06-19 08:10:46,608][26599] Updated weights for policy 0, policy_version 312824 (0.0043) [2024-06-19 08:10:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5125390336. Throughput: 0: 42655.7. Samples: 1392928460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:48,380][26367] Avg episode reward: [(0, '0.850')] [2024-06-19 08:10:51,433][26599] Updated weights for policy 0, policy_version 312834 (0.0038) [2024-06-19 08:10:51,450][26579] Signal inference workers to stop experience collection... (20600 times) [2024-06-19 08:10:51,450][26579] Signal inference workers to resume experience collection... (20600 times) [2024-06-19 08:10:51,466][26599] InferenceWorker_p0-w0: stopping experience collection (20600 times) [2024-06-19 08:10:51,466][26599] InferenceWorker_p0-w0: resuming experience collection (20600 times) [2024-06-19 08:10:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5125570560. Throughput: 0: 42715.5. Samples: 1393192220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:53,380][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 08:10:54,131][26599] Updated weights for policy 0, policy_version 312844 (0.0040) [2024-06-19 08:10:58,380][26367] Fps is (10 sec: 37682.6, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5125767168. Throughput: 0: 42379.5. Samples: 1393440920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:10:58,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 08:10:59,008][26599] Updated weights for policy 0, policy_version 312854 (0.0034) [2024-06-19 08:11:02,029][26599] Updated weights for policy 0, policy_version 312864 (0.0033) [2024-06-19 08:11:03,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5126029312. Throughput: 0: 42697.0. Samples: 1393569360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 08:11:03,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 08:11:06,568][26599] Updated weights for policy 0, policy_version 312874 (0.0032) [2024-06-19 08:11:08,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 5126209536. Throughput: 0: 42819.7. Samples: 1393833860. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:08,380][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 08:11:09,853][26599] Updated weights for policy 0, policy_version 312884 (0.0029) [2024-06-19 08:11:13,380][26367] Fps is (10 sec: 37682.9, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5126406144. Throughput: 0: 42594.0. Samples: 1394087320. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:13,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 08:11:14,470][26599] Updated weights for policy 0, policy_version 312894 (0.0036) [2024-06-19 08:11:17,286][26599] Updated weights for policy 0, policy_version 312904 (0.0028) [2024-06-19 08:11:18,380][26367] Fps is (10 sec: 45874.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5126668288. Throughput: 0: 42863.5. Samples: 1394213980. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:18,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 08:11:22,076][26599] Updated weights for policy 0, policy_version 312914 (0.0035) [2024-06-19 08:11:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 5126832128. Throughput: 0: 42763.6. Samples: 1394478920. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:23,380][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 08:11:24,835][26599] Updated weights for policy 0, policy_version 312924 (0.0028) [2024-06-19 08:11:28,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42874.0, 300 sec: 42653.9). Total num frames: 5127061504. Throughput: 0: 42667.6. Samples: 1394729560. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:28,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 08:11:29,904][26599] Updated weights for policy 0, policy_version 312934 (0.0034) [2024-06-19 08:11:32,619][26599] Updated weights for policy 0, policy_version 312944 (0.0032) [2024-06-19 08:11:33,380][26367] Fps is (10 sec: 49151.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5127323648. Throughput: 0: 42822.5. Samples: 1394855480. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:33,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 08:11:37,562][26599] Updated weights for policy 0, policy_version 312954 (0.0033) [2024-06-19 08:11:38,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42049.7, 300 sec: 42653.4). Total num frames: 5127487488. Throughput: 0: 42625.0. Samples: 1395110500. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:38,384][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 08:11:38,399][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000312957_5127487488.pth... 
[2024-06-19 08:11:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000312333_5117263872.pth [2024-06-19 08:11:40,496][26599] Updated weights for policy 0, policy_version 312964 (0.0029) [2024-06-19 08:11:43,380][26367] Fps is (10 sec: 39321.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5127716864. Throughput: 0: 42576.0. Samples: 1395356840. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:43,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 08:11:45,143][26599] Updated weights for policy 0, policy_version 312974 (0.0025) [2024-06-19 08:11:48,148][26599] Updated weights for policy 0, policy_version 312984 (0.0026) [2024-06-19 08:11:48,380][26367] Fps is (10 sec: 44253.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5127929856. Throughput: 0: 42720.9. Samples: 1395491800. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:48,380][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 08:11:52,943][26599] Updated weights for policy 0, policy_version 312994 (0.0033) [2024-06-19 08:11:53,384][26367] Fps is (10 sec: 39307.6, 60 sec: 42322.7, 300 sec: 42542.4). Total num frames: 5128110080. Throughput: 0: 42438.2. Samples: 1395743740. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:53,385][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 08:11:55,949][26599] Updated weights for policy 0, policy_version 313004 (0.0044) [2024-06-19 08:11:56,926][26579] Signal inference workers to stop experience collection... (20650 times) [2024-06-19 08:11:56,986][26599] InferenceWorker_p0-w0: stopping experience collection (20650 times) [2024-06-19 08:11:56,988][26579] Signal inference workers to resume experience collection... (20650 times) [2024-06-19 08:11:57,004][26599] InferenceWorker_p0-w0: resuming experience collection (20650 times) [2024-06-19 08:11:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5128355840. 
Throughput: 0: 42395.1. Samples: 1395995100. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:11:58,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 08:12:00,720][26599] Updated weights for policy 0, policy_version 313014 (0.0038) [2024-06-19 08:12:03,380][26367] Fps is (10 sec: 45892.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5128568832. Throughput: 0: 42619.8. Samples: 1396131860. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:12:03,380][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 08:12:03,577][26599] Updated weights for policy 0, policy_version 313024 (0.0024) [2024-06-19 08:12:08,199][26599] Updated weights for policy 0, policy_version 313034 (0.0036) [2024-06-19 08:12:08,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5128749056. Throughput: 0: 42360.8. Samples: 1396385160. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:12:08,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 08:12:11,180][26599] Updated weights for policy 0, policy_version 313044 (0.0037) [2024-06-19 08:12:13,380][26367] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 5129011200. Throughput: 0: 42412.1. Samples: 1396638100. Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:12:13,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 08:12:15,732][26599] Updated weights for policy 0, policy_version 313054 (0.0034) [2024-06-19 08:12:18,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5129207808. Throughput: 0: 42619.1. Samples: 1396773340. 
Policy #0 lag: (min: 1.0, avg: 8.5, max: 21.0) [2024-06-19 08:12:18,381][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 08:12:18,839][26599] Updated weights for policy 0, policy_version 313064 (0.0044) [2024-06-19 08:12:23,167][26599] Updated weights for policy 0, policy_version 313074 (0.0038) [2024-06-19 08:12:23,384][26367] Fps is (10 sec: 39307.0, 60 sec: 42868.8, 300 sec: 42597.9). Total num frames: 5129404416. Throughput: 0: 42472.0. Samples: 1397021740. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:12:23,384][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 08:12:26,821][26599] Updated weights for policy 0, policy_version 313084 (0.0035) [2024-06-19 08:12:28,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5129633792. Throughput: 0: 42624.2. Samples: 1397274920. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:12:28,380][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 08:12:30,629][26599] Updated weights for policy 0, policy_version 313094 (0.0036) [2024-06-19 08:12:33,380][26367] Fps is (10 sec: 40975.1, 60 sec: 41506.2, 300 sec: 42598.4). Total num frames: 5129814016. Throughput: 0: 42518.6. Samples: 1397405140. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:12:33,380][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 08:12:34,431][26599] Updated weights for policy 0, policy_version 313104 (0.0039) [2024-06-19 08:12:38,317][26599] Updated weights for policy 0, policy_version 313114 (0.0040) [2024-06-19 08:12:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42874.1, 300 sec: 42653.9). Total num frames: 5130059776. Throughput: 0: 42468.9. Samples: 1397654680. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:12:38,380][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 08:12:42,325][26599] Updated weights for policy 0, policy_version 313124 (0.0043) [2024-06-19 08:12:43,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5130256384. Throughput: 0: 42646.3. Samples: 1397914180. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:12:43,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 08:12:45,838][26599] Updated weights for policy 0, policy_version 313134 (0.0027) [2024-06-19 08:12:48,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42543.4). Total num frames: 5130452992. Throughput: 0: 42424.4. Samples: 1398040960. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:12:48,380][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 08:12:50,057][26599] Updated weights for policy 0, policy_version 313144 (0.0037) [2024-06-19 08:12:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43147.1, 300 sec: 42653.9). Total num frames: 5130698752. Throughput: 0: 42431.5. Samples: 1398294580. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:12:53,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 08:12:53,762][26599] Updated weights for policy 0, policy_version 313154 (0.0031) [2024-06-19 08:12:57,602][26599] Updated weights for policy 0, policy_version 313164 (0.0032) [2024-06-19 08:12:58,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5130895360. Throughput: 0: 42610.2. Samples: 1398555560. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:12:58,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 08:13:01,314][26599] Updated weights for policy 0, policy_version 313174 (0.0040) [2024-06-19 08:13:03,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5131091968. Throughput: 0: 42385.3. Samples: 1398680680. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:13:03,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 08:13:05,440][26599] Updated weights for policy 0, policy_version 313184 (0.0033) [2024-06-19 08:13:08,384][26367] Fps is (10 sec: 44220.3, 60 sec: 43141.9, 300 sec: 42653.9). Total num frames: 5131337728. Throughput: 0: 42478.6. Samples: 1398933280. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:13:08,385][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 08:13:08,837][26599] Updated weights for policy 0, policy_version 313194 (0.0040) [2024-06-19 08:13:13,082][26599] Updated weights for policy 0, policy_version 313204 (0.0051) [2024-06-19 08:13:13,384][26367] Fps is (10 sec: 44221.0, 60 sec: 42049.7, 300 sec: 42653.4). Total num frames: 5131534336. Throughput: 0: 42612.0. Samples: 1399192620. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:13:13,384][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 08:13:16,528][26599] Updated weights for policy 0, policy_version 313214 (0.0028) [2024-06-19 08:13:18,380][26367] Fps is (10 sec: 40975.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5131747328. Throughput: 0: 42526.7. Samples: 1399318840. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:13:18,381][26367] Avg episode reward: [(0, '0.376')] [2024-06-19 08:13:20,671][26599] Updated weights for policy 0, policy_version 313224 (0.0042) [2024-06-19 08:13:22,072][26579] Signal inference workers to stop experience collection... (20700 times) [2024-06-19 08:13:22,108][26599] InferenceWorker_p0-w0: stopping experience collection (20700 times) [2024-06-19 08:13:22,184][26579] Signal inference workers to resume experience collection... (20700 times) [2024-06-19 08:13:22,185][26599] InferenceWorker_p0-w0: resuming experience collection (20700 times) [2024-06-19 08:13:23,380][26367] Fps is (10 sec: 44252.7, 60 sec: 42874.0, 300 sec: 42653.9). Total num frames: 5131976704. 
Throughput: 0: 42815.9. Samples: 1399581400. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:13:23,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 08:13:24,151][26599] Updated weights for policy 0, policy_version 313234 (0.0030) [2024-06-19 08:13:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5132173312. Throughput: 0: 42663.5. Samples: 1399834040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:13:28,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 08:13:28,387][26599] Updated weights for policy 0, policy_version 313244 (0.0034) [2024-06-19 08:13:31,699][26599] Updated weights for policy 0, policy_version 313254 (0.0028) [2024-06-19 08:13:33,383][26367] Fps is (10 sec: 39311.1, 60 sec: 42596.4, 300 sec: 42487.0). Total num frames: 5132369920. Throughput: 0: 42592.0. Samples: 1399957720. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 08:13:33,383][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 08:13:36,064][26599] Updated weights for policy 0, policy_version 313264 (0.0039) [2024-06-19 08:13:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 5132615680. Throughput: 0: 42800.0. Samples: 1400220580. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:13:38,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 08:13:38,540][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000313271_5132632064.pth... [2024-06-19 08:13:38,585][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000312648_5122424832.pth [2024-06-19 08:13:39,411][26599] Updated weights for policy 0, policy_version 313274 (0.0041) [2024-06-19 08:13:43,380][26367] Fps is (10 sec: 44248.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5132812288. Throughput: 0: 42670.2. Samples: 1400475720. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:13:43,381][26367] Avg episode reward: [(0, '0.381')] [2024-06-19 08:13:43,750][26599] Updated weights for policy 0, policy_version 313284 (0.0043) [2024-06-19 08:13:46,887][26599] Updated weights for policy 0, policy_version 313294 (0.0042) [2024-06-19 08:13:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5133025280. Throughput: 0: 42740.0. Samples: 1400603980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:13:48,384][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 08:13:51,526][26599] Updated weights for policy 0, policy_version 313304 (0.0031) [2024-06-19 08:13:53,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5133254656. Throughput: 0: 42842.9. Samples: 1400861060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:13:53,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 08:13:54,768][26599] Updated weights for policy 0, policy_version 313314 (0.0042) [2024-06-19 08:13:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42598.6). Total num frames: 5133434880. Throughput: 0: 42626.0. Samples: 1401110640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:13:58,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 08:13:59,127][26599] Updated weights for policy 0, policy_version 313324 (0.0041) [2024-06-19 08:14:02,649][26599] Updated weights for policy 0, policy_version 313334 (0.0036) [2024-06-19 08:14:03,380][26367] Fps is (10 sec: 42599.2, 60 sec: 43144.6, 300 sec: 42542.9). Total num frames: 5133680640. Throughput: 0: 42632.9. Samples: 1401237320. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:03,381][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 08:14:07,129][26599] Updated weights for policy 0, policy_version 313344 (0.0040) [2024-06-19 08:14:08,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42328.0, 300 sec: 42598.4). Total num frames: 5133877248. Throughput: 0: 42550.7. Samples: 1401496180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:08,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 08:14:10,661][26599] Updated weights for policy 0, policy_version 313354 (0.0047) [2024-06-19 08:14:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42601.0, 300 sec: 42654.5). Total num frames: 5134090240. Throughput: 0: 42581.4. Samples: 1401750200. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:13,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 08:14:14,766][26599] Updated weights for policy 0, policy_version 313364 (0.0028) [2024-06-19 08:14:18,115][26599] Updated weights for policy 0, policy_version 313374 (0.0032) [2024-06-19 08:14:18,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42868.8, 300 sec: 42542.3). Total num frames: 5134319616. Throughput: 0: 42689.3. Samples: 1401878780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:18,384][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 08:14:22,147][26599] Updated weights for policy 0, policy_version 313384 (0.0027) [2024-06-19 08:14:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42654.6). Total num frames: 5134516224. Throughput: 0: 42580.5. Samples: 1402136700. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:23,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 08:14:25,995][26599] Updated weights for policy 0, policy_version 313394 (0.0031) [2024-06-19 08:14:28,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5134729216. Throughput: 0: 42590.7. Samples: 1402392300. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:28,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 08:14:29,733][26599] Updated weights for policy 0, policy_version 313404 (0.0037) [2024-06-19 08:14:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42873.4, 300 sec: 42487.3). Total num frames: 5134942208. Throughput: 0: 42721.0. Samples: 1402526420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:33,380][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 08:14:33,887][26599] Updated weights for policy 0, policy_version 313414 (0.0028) [2024-06-19 08:14:37,438][26599] Updated weights for policy 0, policy_version 313424 (0.0043) [2024-06-19 08:14:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5135155200. Throughput: 0: 42610.0. Samples: 1402778500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:38,381][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 08:14:41,359][26599] Updated weights for policy 0, policy_version 313434 (0.0029) [2024-06-19 08:14:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5135384576. Throughput: 0: 42780.1. Samples: 1403035740. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:43,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 08:14:45,107][26599] Updated weights for policy 0, policy_version 313444 (0.0036) [2024-06-19 08:14:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5135581184. Throughput: 0: 42829.4. Samples: 1403164640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 08:14:48,380][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 08:14:49,417][26599] Updated weights for policy 0, policy_version 313454 (0.0028) [2024-06-19 08:14:50,476][26579] Signal inference workers to stop experience collection... 
(20750 times) [2024-06-19 08:14:50,501][26599] InferenceWorker_p0-w0: stopping experience collection (20750 times) [2024-06-19 08:14:50,538][26579] Signal inference workers to resume experience collection... (20750 times) [2024-06-19 08:14:50,538][26599] InferenceWorker_p0-w0: resuming experience collection (20750 times) [2024-06-19 08:14:52,619][26599] Updated weights for policy 0, policy_version 313464 (0.0035) [2024-06-19 08:14:53,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5135810560. Throughput: 0: 42668.3. Samples: 1403416260. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:14:53,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 08:14:57,230][26599] Updated weights for policy 0, policy_version 313474 (0.0038) [2024-06-19 08:14:58,380][26367] Fps is (10 sec: 44236.1, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5136023552. Throughput: 0: 42739.4. Samples: 1403673480. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:14:58,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 08:15:00,386][26599] Updated weights for policy 0, policy_version 313484 (0.0030) [2024-06-19 08:15:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5136236544. Throughput: 0: 42695.0. Samples: 1403799900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:03,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 08:15:04,816][26599] Updated weights for policy 0, policy_version 313494 (0.0030) [2024-06-19 08:15:08,045][26599] Updated weights for policy 0, policy_version 313504 (0.0031) [2024-06-19 08:15:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5136449536. Throughput: 0: 42709.7. Samples: 1404058640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:08,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 08:15:12,406][26599] Updated weights for policy 0, policy_version 313514 (0.0033) [2024-06-19 08:15:13,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5136629760. Throughput: 0: 42715.6. Samples: 1404314500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:13,380][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 08:15:15,827][26599] Updated weights for policy 0, policy_version 313524 (0.0038) [2024-06-19 08:15:18,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42600.9, 300 sec: 42542.9). Total num frames: 5136875520. Throughput: 0: 42412.7. Samples: 1404435000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:18,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 08:15:20,378][26599] Updated weights for policy 0, policy_version 313534 (0.0039) [2024-06-19 08:15:23,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 5137088512. Throughput: 0: 42672.9. Samples: 1404698780. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:23,381][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 08:15:23,427][26599] Updated weights for policy 0, policy_version 313544 (0.0042) [2024-06-19 08:15:27,812][26599] Updated weights for policy 0, policy_version 313554 (0.0035) [2024-06-19 08:15:28,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5137268736. Throughput: 0: 42741.4. Samples: 1404959100. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:28,380][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 08:15:30,862][26599] Updated weights for policy 0, policy_version 313564 (0.0033) [2024-06-19 08:15:33,380][26367] Fps is (10 sec: 44236.1, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 5137530880. Throughput: 0: 42561.6. Samples: 1405079920. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:33,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 08:15:35,355][26599] Updated weights for policy 0, policy_version 313574 (0.0030) [2024-06-19 08:15:38,380][26367] Fps is (10 sec: 47512.7, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5137743872. Throughput: 0: 42953.3. Samples: 1405349160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:38,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 08:15:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000313583_5137743872.pth... [2024-06-19 08:15:38,463][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000312957_5127487488.pth [2024-06-19 08:15:38,617][26599] Updated weights for policy 0, policy_version 313584 (0.0040) [2024-06-19 08:15:42,915][26599] Updated weights for policy 0, policy_version 313594 (0.0021) [2024-06-19 08:15:43,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5137924096. Throughput: 0: 42867.6. Samples: 1405602520. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:43,381][26367] Avg episode reward: [(0, '0.782')] [2024-06-19 08:15:46,458][26599] Updated weights for policy 0, policy_version 313604 (0.0037) [2024-06-19 08:15:48,380][26367] Fps is (10 sec: 44237.5, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5138186240. Throughput: 0: 42845.9. Samples: 1405727960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:48,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 08:15:51,067][26599] Updated weights for policy 0, policy_version 313614 (0.0039) [2024-06-19 08:15:53,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5138382848. Throughput: 0: 42778.3. Samples: 1405983660. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:53,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 08:15:54,151][26599] Updated weights for policy 0, policy_version 313624 (0.0039) [2024-06-19 08:15:58,380][26367] Fps is (10 sec: 36044.3, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 5138546688. Throughput: 0: 42705.2. Samples: 1406236240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:15:58,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 08:15:58,659][26599] Updated weights for policy 0, policy_version 313634 (0.0038) [2024-06-19 08:16:02,072][26599] Updated weights for policy 0, policy_version 313644 (0.0039) [2024-06-19 08:16:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5138808832. Throughput: 0: 42750.4. Samples: 1406358760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:16:03,380][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 08:16:06,274][26599] Updated weights for policy 0, policy_version 313654 (0.0031) [2024-06-19 08:16:08,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5139005440. Throughput: 0: 42669.2. Samples: 1406618900. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:08,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 08:16:09,664][26599] Updated weights for policy 0, policy_version 313664 (0.0026) [2024-06-19 08:16:13,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42487.4). Total num frames: 5139202048. Throughput: 0: 42444.9. Samples: 1406869120. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:13,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 08:16:13,857][26599] Updated weights for policy 0, policy_version 313674 (0.0032) [2024-06-19 08:16:17,231][26599] Updated weights for policy 0, policy_version 313684 (0.0041) [2024-06-19 08:16:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.4). Total num frames: 5139431424. Throughput: 0: 42655.5. Samples: 1406999420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:18,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 08:16:21,385][26599] Updated weights for policy 0, policy_version 313694 (0.0039) [2024-06-19 08:16:22,658][26579] Signal inference workers to stop experience collection... (20800 times) [2024-06-19 08:16:22,659][26579] Signal inference workers to resume experience collection... (20800 times) [2024-06-19 08:16:22,708][26599] InferenceWorker_p0-w0: stopping experience collection (20800 times) [2024-06-19 08:16:22,708][26599] InferenceWorker_p0-w0: resuming experience collection (20800 times) [2024-06-19 08:16:23,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5139644416. Throughput: 0: 42382.9. Samples: 1407256540. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:23,384][26367] Avg episode reward: [(0, '0.317')] [2024-06-19 08:16:25,000][26599] Updated weights for policy 0, policy_version 313704 (0.0034) [2024-06-19 08:16:28,380][26367] Fps is (10 sec: 42599.2, 60 sec: 43144.5, 300 sec: 42487.3). Total num frames: 5139857408. Throughput: 0: 42286.7. Samples: 1407505420. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:28,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 08:16:29,442][26599] Updated weights for policy 0, policy_version 313714 (0.0037) [2024-06-19 08:16:32,861][26599] Updated weights for policy 0, policy_version 313724 (0.0031) [2024-06-19 08:16:33,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42325.4, 300 sec: 42654.5). Total num frames: 5140070400. Throughput: 0: 42377.8. Samples: 1407634960. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:33,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 08:16:37,410][26599] Updated weights for policy 0, policy_version 313734 (0.0042) [2024-06-19 08:16:38,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5140267008. Throughput: 0: 42484.8. Samples: 1407895480. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:38,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 08:16:40,371][26599] Updated weights for policy 0, policy_version 313744 (0.0047) [2024-06-19 08:16:43,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5140512768. Throughput: 0: 42390.3. Samples: 1408143800. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:43,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 08:16:45,079][26599] Updated weights for policy 0, policy_version 313754 (0.0038) [2024-06-19 08:16:48,239][26599] Updated weights for policy 0, policy_version 313764 (0.0043) [2024-06-19 08:16:48,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42052.2, 300 sec: 42710.0). Total num frames: 5140709376. Throughput: 0: 42539.0. Samples: 1408273020. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:48,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 08:16:52,703][26599] Updated weights for policy 0, policy_version 313774 (0.0038) [2024-06-19 08:16:53,380][26367] Fps is (10 sec: 37682.5, 60 sec: 41779.0, 300 sec: 42487.3). Total num frames: 5140889600. Throughput: 0: 42355.9. Samples: 1408524920. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:53,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 08:16:55,950][26599] Updated weights for policy 0, policy_version 313784 (0.0022) [2024-06-19 08:16:58,380][26367] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5141135360. Throughput: 0: 42295.4. Samples: 1408772420. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:16:58,387][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 08:17:00,306][26599] Updated weights for policy 0, policy_version 313794 (0.0042) [2024-06-19 08:17:03,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5141348352. Throughput: 0: 42467.6. Samples: 1408910460. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:17:03,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 08:17:03,610][26599] Updated weights for policy 0, policy_version 313804 (0.0041) [2024-06-19 08:17:07,751][26599] Updated weights for policy 0, policy_version 313814 (0.0043) [2024-06-19 08:17:08,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5141544960. Throughput: 0: 42350.5. Samples: 1409162160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:17:08,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 08:17:11,308][26599] Updated weights for policy 0, policy_version 313824 (0.0031) [2024-06-19 08:17:13,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5141757952. Throughput: 0: 42440.0. Samples: 1409415220. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:17:13,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 08:17:15,529][26599] Updated weights for policy 0, policy_version 313834 (0.0037) [2024-06-19 08:17:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.9). Total num frames: 5141970944. Throughput: 0: 42451.4. Samples: 1409545280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 08:17:18,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 08:17:18,862][26599] Updated weights for policy 0, policy_version 313844 (0.0032) [2024-06-19 08:17:23,230][26599] Updated weights for policy 0, policy_version 313854 (0.0034) [2024-06-19 08:17:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42327.9, 300 sec: 42542.9). Total num frames: 5142183936. Throughput: 0: 42473.0. Samples: 1409806760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:17:23,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 08:17:26,816][26599] Updated weights for policy 0, policy_version 313864 (0.0046) [2024-06-19 08:17:28,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5142396928. Throughput: 0: 42545.4. Samples: 1410058340. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:17:28,380][26367] Avg episode reward: [(0, '0.791')] [2024-06-19 08:17:31,021][26599] Updated weights for policy 0, policy_version 313874 (0.0027) [2024-06-19 08:17:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 5142609920. Throughput: 0: 42676.8. Samples: 1410193480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:17:33,381][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 08:17:34,260][26599] Updated weights for policy 0, policy_version 313884 (0.0032) [2024-06-19 08:17:38,384][26367] Fps is (10 sec: 40944.9, 60 sec: 42322.8, 300 sec: 42542.3). Total num frames: 5142806528. Throughput: 0: 42690.5. Samples: 1410446140. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:17:38,384][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 08:17:38,519][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000313893_5142822912.pth... [2024-06-19 08:17:38,570][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000313271_5132632064.pth [2024-06-19 08:17:38,715][26599] Updated weights for policy 0, policy_version 313894 (0.0038) [2024-06-19 08:17:41,818][26599] Updated weights for policy 0, policy_version 313904 (0.0031) [2024-06-19 08:17:43,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5143052288. Throughput: 0: 42790.3. Samples: 1410697980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:17:43,381][26367] Avg episode reward: [(0, '0.781')] [2024-06-19 08:17:46,291][26599] Updated weights for policy 0, policy_version 313914 (0.0031) [2024-06-19 08:17:48,381][26367] Fps is (10 sec: 45890.7, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 5143265280. Throughput: 0: 42787.4. Samples: 1410835900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:17:48,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 08:17:49,776][26599] Updated weights for policy 0, policy_version 313924 (0.0040) [2024-06-19 08:17:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 5143445504. Throughput: 0: 42718.8. Samples: 1411084500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:17:53,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 08:17:53,966][26599] Updated weights for policy 0, policy_version 313934 (0.0034) [2024-06-19 08:17:55,164][26579] Signal inference workers to stop experience collection... (20850 times) [2024-06-19 08:17:55,166][26579] Signal inference workers to resume experience collection... 
(20850 times) [2024-06-19 08:17:55,182][26599] InferenceWorker_p0-w0: stopping experience collection (20850 times) [2024-06-19 08:17:55,182][26599] InferenceWorker_p0-w0: resuming experience collection (20850 times) [2024-06-19 08:17:57,419][26599] Updated weights for policy 0, policy_version 313944 (0.0036) [2024-06-19 08:17:58,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5143707648. Throughput: 0: 42748.4. Samples: 1411338900. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:17:58,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 08:18:01,566][26599] Updated weights for policy 0, policy_version 313954 (0.0042) [2024-06-19 08:18:03,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 5143904256. Throughput: 0: 42832.5. Samples: 1411472740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:18:03,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 08:18:05,052][26599] Updated weights for policy 0, policy_version 313964 (0.0034) [2024-06-19 08:18:08,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 5144100864. Throughput: 0: 42588.0. Samples: 1411723220. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:18:08,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 08:18:09,379][26599] Updated weights for policy 0, policy_version 313974 (0.0029) [2024-06-19 08:18:12,630][26599] Updated weights for policy 0, policy_version 313984 (0.0029) [2024-06-19 08:18:13,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 5144346624. Throughput: 0: 42684.7. Samples: 1411979160. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:18:13,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 08:18:16,961][26599] Updated weights for policy 0, policy_version 313994 (0.0039) [2024-06-19 08:18:18,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 5144526848. Throughput: 0: 42666.4. Samples: 1412113460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:18:18,380][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 08:18:20,199][26599] Updated weights for policy 0, policy_version 314004 (0.0036) [2024-06-19 08:18:23,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5144739840. Throughput: 0: 42660.2. Samples: 1412365700. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:18:23,381][26367] Avg episode reward: [(0, '0.805')] [2024-06-19 08:18:24,939][26599] Updated weights for policy 0, policy_version 314014 (0.0036) [2024-06-19 08:18:27,887][26599] Updated weights for policy 0, policy_version 314024 (0.0034) [2024-06-19 08:18:28,380][26367] Fps is (10 sec: 45874.0, 60 sec: 43144.4, 300 sec: 42765.4). Total num frames: 5144985600. Throughput: 0: 42569.2. Samples: 1412613600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:18:28,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 08:18:32,532][26599] Updated weights for policy 0, policy_version 314034 (0.0036) [2024-06-19 08:18:33,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5145165824. Throughput: 0: 42591.4. Samples: 1412752500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 24.0) [2024-06-19 08:18:33,380][26367] Avg episode reward: [(0, '0.811')] [2024-06-19 08:18:35,356][26599] Updated weights for policy 0, policy_version 314044 (0.0039) [2024-06-19 08:18:38,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42874.0, 300 sec: 42598.4). Total num frames: 5145378816. Throughput: 0: 42665.6. Samples: 1413004460. 
Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:18:38,381][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 08:18:40,156][26599] Updated weights for policy 0, policy_version 314054 (0.0039) [2024-06-19 08:18:43,087][26599] Updated weights for policy 0, policy_version 314064 (0.0032) [2024-06-19 08:18:43,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5145624576. Throughput: 0: 42642.7. Samples: 1413257820. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:18:43,381][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 08:18:47,738][26599] Updated weights for policy 0, policy_version 314074 (0.0032) [2024-06-19 08:18:48,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 5145821184. Throughput: 0: 42705.0. Samples: 1413394460. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:18:48,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 08:18:50,618][26599] Updated weights for policy 0, policy_version 314084 (0.0034) [2024-06-19 08:18:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5146034176. Throughput: 0: 42696.9. Samples: 1413644580. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:18:53,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 08:18:55,352][26599] Updated weights for policy 0, policy_version 314094 (0.0049) [2024-06-19 08:18:58,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5146263552. Throughput: 0: 42680.0. Samples: 1413899760. 
Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:18:58,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 08:18:58,538][26599] Updated weights for policy 0, policy_version 314104 (0.0050) [2024-06-19 08:19:03,374][26599] Updated weights for policy 0, policy_version 314114 (0.0040) [2024-06-19 08:19:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5146443776. Throughput: 0: 42720.4. Samples: 1414035880. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:03,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 08:19:03,910][26579] Signal inference workers to stop experience collection... (20900 times) [2024-06-19 08:19:03,914][26579] Signal inference workers to resume experience collection... (20900 times) [2024-06-19 08:19:03,928][26599] InferenceWorker_p0-w0: stopping experience collection (20900 times) [2024-06-19 08:19:03,928][26599] InferenceWorker_p0-w0: resuming experience collection (20900 times) [2024-06-19 08:19:06,338][26599] Updated weights for policy 0, policy_version 314124 (0.0036) [2024-06-19 08:19:08,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5146656768. Throughput: 0: 42582.3. Samples: 1414281900. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:08,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 08:19:10,982][26599] Updated weights for policy 0, policy_version 314134 (0.0032) [2024-06-19 08:19:13,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42654.5). Total num frames: 5146902528. Throughput: 0: 42734.4. Samples: 1414536640. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:13,381][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 08:19:13,945][26599] Updated weights for policy 0, policy_version 314144 (0.0045) [2024-06-19 08:19:18,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5147082752. 
Throughput: 0: 42702.7. Samples: 1414674120. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:18,380][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 08:19:18,552][26599] Updated weights for policy 0, policy_version 314154 (0.0035) [2024-06-19 08:19:21,510][26599] Updated weights for policy 0, policy_version 314164 (0.0024) [2024-06-19 08:19:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5147312128. Throughput: 0: 42723.6. Samples: 1414927020. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:23,382][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 08:19:26,223][26599] Updated weights for policy 0, policy_version 314174 (0.0034) [2024-06-19 08:19:28,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 5147541504. Throughput: 0: 42619.3. Samples: 1415175680. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:28,380][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 08:19:29,216][26599] Updated weights for policy 0, policy_version 314184 (0.0041) [2024-06-19 08:19:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5147738112. Throughput: 0: 42653.2. Samples: 1415313860. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:33,381][26367] Avg episode reward: [(0, '0.341')] [2024-06-19 08:19:34,123][26599] Updated weights for policy 0, policy_version 314194 (0.0032) [2024-06-19 08:19:36,718][26599] Updated weights for policy 0, policy_version 314204 (0.0046) [2024-06-19 08:19:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.7, 300 sec: 42654.0). Total num frames: 5147967488. Throughput: 0: 42613.8. Samples: 1415562200. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:38,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 08:19:38,411][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000314207_5147967488.pth... 
[2024-06-19 08:19:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000313583_5137743872.pth [2024-06-19 08:19:41,735][26599] Updated weights for policy 0, policy_version 314214 (0.0030) [2024-06-19 08:19:43,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5148180480. Throughput: 0: 42706.4. Samples: 1415821540. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:43,381][26367] Avg episode reward: [(0, '0.806')] [2024-06-19 08:19:44,634][26599] Updated weights for policy 0, policy_version 314224 (0.0025) [2024-06-19 08:19:48,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5148377088. Throughput: 0: 42496.8. Samples: 1415948240. Policy #0 lag: (min: 2.0, avg: 11.6, max: 21.0) [2024-06-19 08:19:48,381][26367] Avg episode reward: [(0, '0.826')] [2024-06-19 08:19:49,244][26599] Updated weights for policy 0, policy_version 314234 (0.0031) [2024-06-19 08:19:52,267][26599] Updated weights for policy 0, policy_version 314244 (0.0032) [2024-06-19 08:19:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5148606464. Throughput: 0: 42808.4. Samples: 1416208280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:19:53,381][26367] Avg episode reward: [(0, '0.380')] [2024-06-19 08:19:56,817][26599] Updated weights for policy 0, policy_version 314254 (0.0047) [2024-06-19 08:19:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5148819456. Throughput: 0: 42820.9. Samples: 1416463580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:19:58,381][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 08:19:59,827][26599] Updated weights for policy 0, policy_version 314264 (0.0038) [2024-06-19 08:20:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5149032448. Throughput: 0: 42711.9. Samples: 1416596160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:03,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 08:20:04,212][26599] Updated weights for policy 0, policy_version 314274 (0.0038) [2024-06-19 08:20:07,539][26599] Updated weights for policy 0, policy_version 314284 (0.0040) [2024-06-19 08:20:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5149245440. Throughput: 0: 42821.9. Samples: 1416854000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:08,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 08:20:11,649][26579] Signal inference workers to stop experience collection... (20950 times) [2024-06-19 08:20:11,658][26579] Signal inference workers to resume experience collection... (20950 times) [2024-06-19 08:20:11,703][26599] InferenceWorker_p0-w0: stopping experience collection (20950 times) [2024-06-19 08:20:11,703][26599] InferenceWorker_p0-w0: resuming experience collection (20950 times) [2024-06-19 08:20:11,797][26599] Updated weights for policy 0, policy_version 314294 (0.0035) [2024-06-19 08:20:13,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5149474816. Throughput: 0: 42976.2. Samples: 1417109620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:13,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 08:20:15,342][26599] Updated weights for policy 0, policy_version 314304 (0.0036) [2024-06-19 08:20:18,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5149655040. Throughput: 0: 42849.4. Samples: 1417242080. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:18,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 08:20:19,277][26599] Updated weights for policy 0, policy_version 314314 (0.0038) [2024-06-19 08:20:22,962][26599] Updated weights for policy 0, policy_version 314324 (0.0028) [2024-06-19 08:20:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5149900800. Throughput: 0: 42890.5. Samples: 1417492280. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:23,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 08:20:27,255][26599] Updated weights for policy 0, policy_version 314334 (0.0052) [2024-06-19 08:20:28,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5150113792. Throughput: 0: 42930.1. Samples: 1417753400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:28,382][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 08:20:30,832][26599] Updated weights for policy 0, policy_version 314344 (0.0043) [2024-06-19 08:20:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5150294016. Throughput: 0: 42996.5. Samples: 1417883080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:33,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 08:20:34,875][26599] Updated weights for policy 0, policy_version 314354 (0.0036) [2024-06-19 08:20:38,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5150523392. Throughput: 0: 42718.4. Samples: 1418130600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:38,380][26367] Avg episode reward: [(0, '0.792')] [2024-06-19 08:20:38,502][26599] Updated weights for policy 0, policy_version 314364 (0.0029) [2024-06-19 08:20:42,522][26599] Updated weights for policy 0, policy_version 314374 (0.0037) [2024-06-19 08:20:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5150736384. Throughput: 0: 42870.2. Samples: 1418392740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:43,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 08:20:46,119][26599] Updated weights for policy 0, policy_version 314384 (0.0032) [2024-06-19 08:20:48,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5150949376. Throughput: 0: 42754.7. Samples: 1418520120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:48,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 08:20:50,115][26599] Updated weights for policy 0, policy_version 314394 (0.0029) [2024-06-19 08:20:53,384][26367] Fps is (10 sec: 42584.7, 60 sec: 42596.1, 300 sec: 42764.6). Total num frames: 5151162368. Throughput: 0: 42648.0. Samples: 1418773300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:53,384][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 08:20:54,160][26599] Updated weights for policy 0, policy_version 314404 (0.0032) [2024-06-19 08:20:57,755][26599] Updated weights for policy 0, policy_version 314414 (0.0043) [2024-06-19 08:20:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5151375360. Throughput: 0: 42645.1. Samples: 1419028640. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:20:58,380][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 08:21:01,732][26599] Updated weights for policy 0, policy_version 314424 (0.0036) [2024-06-19 08:21:03,380][26367] Fps is (10 sec: 42612.5, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5151588352. Throughput: 0: 42637.9. Samples: 1419160780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 24.0) [2024-06-19 08:21:03,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 08:21:05,292][26599] Updated weights for policy 0, policy_version 314434 (0.0053) [2024-06-19 08:21:08,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5151801344. Throughput: 0: 42763.1. Samples: 1419416620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:08,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 08:21:09,321][26599] Updated weights for policy 0, policy_version 314444 (0.0041) [2024-06-19 08:21:12,771][26599] Updated weights for policy 0, policy_version 314454 (0.0033) [2024-06-19 08:21:13,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5152030720. Throughput: 0: 42678.3. Samples: 1419673920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:13,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 08:21:16,884][26599] Updated weights for policy 0, policy_version 314464 (0.0048) [2024-06-19 08:21:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42654.5). Total num frames: 5152227328. Throughput: 0: 42728.1. Samples: 1419805840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:18,380][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 08:21:20,539][26599] Updated weights for policy 0, policy_version 314474 (0.0030) [2024-06-19 08:21:23,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5152456704. Throughput: 0: 42836.2. Samples: 1420058240. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:23,381][26367] Avg episode reward: [(0, '0.377')] [2024-06-19 08:21:24,750][26599] Updated weights for policy 0, policy_version 314484 (0.0024) [2024-06-19 08:21:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5152653312. Throughput: 0: 42827.1. Samples: 1420319960. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:28,381][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 08:21:28,415][26599] Updated weights for policy 0, policy_version 314494 (0.0044) [2024-06-19 08:21:32,402][26599] Updated weights for policy 0, policy_version 314504 (0.0033) [2024-06-19 08:21:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5152866304. Throughput: 0: 42804.9. Samples: 1420446340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:33,384][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 08:21:35,906][26599] Updated weights for policy 0, policy_version 314514 (0.0029) [2024-06-19 08:21:38,380][26367] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5153112064. Throughput: 0: 42834.6. Samples: 1420700720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:38,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 08:21:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000314521_5153112064.pth... [2024-06-19 08:21:38,447][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000313893_5142822912.pth [2024-06-19 08:21:39,971][26599] Updated weights for policy 0, policy_version 314524 (0.0029) [2024-06-19 08:21:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5153308672. Throughput: 0: 43076.8. Samples: 1420967100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:43,381][26367] Avg episode reward: [(0, '0.285')] [2024-06-19 08:21:43,419][26599] Updated weights for policy 0, policy_version 314534 (0.0034) [2024-06-19 08:21:47,552][26599] Updated weights for policy 0, policy_version 314544 (0.0029) [2024-06-19 08:21:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5153521664. Throughput: 0: 42845.7. Samples: 1421088840. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:48,381][26367] Avg episode reward: [(0, '0.321')] [2024-06-19 08:21:51,207][26599] Updated weights for policy 0, policy_version 314554 (0.0029) [2024-06-19 08:21:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43146.9, 300 sec: 42765.0). Total num frames: 5153751040. Throughput: 0: 42942.7. Samples: 1421349040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:53,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 08:21:55,461][26599] Updated weights for policy 0, policy_version 314564 (0.0046) [2024-06-19 08:21:55,969][26579] Signal inference workers to stop experience collection... (21000 times) [2024-06-19 08:21:55,992][26599] InferenceWorker_p0-w0: stopping experience collection (21000 times) [2024-06-19 08:21:56,081][26579] Signal inference workers to resume experience collection... (21000 times) [2024-06-19 08:21:56,081][26599] InferenceWorker_p0-w0: resuming experience collection (21000 times) [2024-06-19 08:21:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5153931264. Throughput: 0: 42809.7. Samples: 1421600360. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:21:58,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 08:21:59,058][26599] Updated weights for policy 0, policy_version 314574 (0.0035) [2024-06-19 08:22:03,314][26599] Updated weights for policy 0, policy_version 314584 (0.0033) [2024-06-19 08:22:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5154144256. Throughput: 0: 42493.7. Samples: 1421718060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:22:03,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 08:22:06,723][26599] Updated weights for policy 0, policy_version 314594 (0.0041) [2024-06-19 08:22:08,380][26367] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5154390016. Throughput: 0: 42748.5. Samples: 1421981920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:22:08,381][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 08:22:10,808][26599] Updated weights for policy 0, policy_version 314604 (0.0053) [2024-06-19 08:22:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5154570240. Throughput: 0: 42679.5. Samples: 1422240540. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:22:13,382][26367] Avg episode reward: [(0, '0.696')] [2024-06-19 08:22:14,601][26599] Updated weights for policy 0, policy_version 314614 (0.0030) [2024-06-19 08:22:18,314][26599] Updated weights for policy 0, policy_version 314624 (0.0029) [2024-06-19 08:22:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5154799616. Throughput: 0: 42678.3. Samples: 1422366860. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:22:18,384][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 08:22:22,501][26599] Updated weights for policy 0, policy_version 314634 (0.0035) [2024-06-19 08:22:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5155012608. Throughput: 0: 42780.9. Samples: 1422625860. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:22:23,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 08:22:25,817][26599] Updated weights for policy 0, policy_version 314644 (0.0034) [2024-06-19 08:22:28,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5155225600. Throughput: 0: 42465.9. Samples: 1422878060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:22:28,381][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 08:22:30,114][26599] Updated weights for policy 0, policy_version 314654 (0.0037) [2024-06-19 08:22:33,247][26599] Updated weights for policy 0, policy_version 314664 (0.0032) [2024-06-19 08:22:33,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.6). Total num frames: 5155454976. Throughput: 0: 42617.7. Samples: 1423006640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:22:33,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 08:22:37,777][26599] Updated weights for policy 0, policy_version 314674 (0.0047) [2024-06-19 08:22:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5155651584. Throughput: 0: 42583.6. Samples: 1423265300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:22:38,381][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 08:22:40,813][26599] Updated weights for policy 0, policy_version 314684 (0.0026) [2024-06-19 08:22:43,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5155864576. Throughput: 0: 42604.0. Samples: 1423517540. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:22:43,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 08:22:45,407][26599] Updated weights for policy 0, policy_version 314694 (0.0043) [2024-06-19 08:22:48,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5156093952. Throughput: 0: 42899.6. Samples: 1423648540. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:22:48,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 08:22:48,733][26599] Updated weights for policy 0, policy_version 314704 (0.0034) [2024-06-19 08:22:53,038][26599] Updated weights for policy 0, policy_version 314714 (0.0032) [2024-06-19 08:22:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5156290560. Throughput: 0: 42765.3. Samples: 1423906360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:22:53,381][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 08:22:56,402][26599] Updated weights for policy 0, policy_version 314724 (0.0033) [2024-06-19 08:22:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5156519936. Throughput: 0: 42725.4. Samples: 1424163180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:22:58,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 08:23:00,659][26599] Updated weights for policy 0, policy_version 314734 (0.0036) [2024-06-19 08:23:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5156732928. Throughput: 0: 42724.9. Samples: 1424289480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:23:03,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 08:23:04,169][26599] Updated weights for policy 0, policy_version 314744 (0.0029) [2024-06-19 08:23:08,268][26579] Signal inference workers to stop experience collection... 
(21050 times) [2024-06-19 08:23:08,273][26579] Signal inference workers to resume experience collection... (21050 times) [2024-06-19 08:23:08,313][26599] InferenceWorker_p0-w0: stopping experience collection (21050 times) [2024-06-19 08:23:08,313][26599] InferenceWorker_p0-w0: resuming experience collection (21050 times) [2024-06-19 08:23:08,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5156913152. Throughput: 0: 42612.5. Samples: 1424543420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:23:08,381][26367] Avg episode reward: [(0, '0.761')] [2024-06-19 08:23:08,410][26599] Updated weights for policy 0, policy_version 314754 (0.0034) [2024-06-19 08:23:11,761][26599] Updated weights for policy 0, policy_version 314764 (0.0035) [2024-06-19 08:23:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5157142528. Throughput: 0: 42606.2. Samples: 1424795340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:23:13,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 08:23:16,186][26599] Updated weights for policy 0, policy_version 314774 (0.0046) [2024-06-19 08:23:18,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5157355520. Throughput: 0: 42724.6. Samples: 1424929240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:23:18,380][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 08:23:19,347][26599] Updated weights for policy 0, policy_version 314784 (0.0034) [2024-06-19 08:23:23,382][26367] Fps is (10 sec: 40952.2, 60 sec: 42324.0, 300 sec: 42598.1). Total num frames: 5157552128. Throughput: 0: 42744.0. Samples: 1425188860. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:23:23,383][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 08:23:23,667][26599] Updated weights for policy 0, policy_version 314794 (0.0032) [2024-06-19 08:23:27,030][26599] Updated weights for policy 0, policy_version 314804 (0.0025) [2024-06-19 08:23:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5157797888. Throughput: 0: 42674.7. Samples: 1425437900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 23.0) [2024-06-19 08:23:28,381][26367] Avg episode reward: [(0, '0.290')] [2024-06-19 08:23:31,254][26599] Updated weights for policy 0, policy_version 314814 (0.0045) [2024-06-19 08:23:33,380][26367] Fps is (10 sec: 45884.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5158010880. Throughput: 0: 42812.4. Samples: 1425575100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:23:33,381][26367] Avg episode reward: [(0, '0.161')] [2024-06-19 08:23:34,504][26599] Updated weights for policy 0, policy_version 314824 (0.0031) [2024-06-19 08:23:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5158207488. Throughput: 0: 42680.0. Samples: 1425826960. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:23:38,383][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 08:23:38,406][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000314832_5158207488.pth... [2024-06-19 08:23:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000314207_5147967488.pth [2024-06-19 08:23:39,163][26599] Updated weights for policy 0, policy_version 314834 (0.0033) [2024-06-19 08:23:42,371][26599] Updated weights for policy 0, policy_version 314844 (0.0042) [2024-06-19 08:23:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5158436864. Throughput: 0: 42673.2. Samples: 1426083480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:23:43,381][26367] Avg episode reward: [(0, '0.753')] [2024-06-19 08:23:46,543][26599] Updated weights for policy 0, policy_version 314854 (0.0031) [2024-06-19 08:23:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5158633472. Throughput: 0: 42739.7. Samples: 1426212760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:23:48,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 08:23:50,317][26599] Updated weights for policy 0, policy_version 314864 (0.0029) [2024-06-19 08:23:53,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5158830080. Throughput: 0: 42582.7. Samples: 1426459640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:23:53,380][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 08:23:54,172][26599] Updated weights for policy 0, policy_version 314874 (0.0044) [2024-06-19 08:23:57,821][26599] Updated weights for policy 0, policy_version 314884 (0.0032) [2024-06-19 08:23:58,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5159075840. Throughput: 0: 42735.1. Samples: 1426718420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:23:58,381][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 08:24:01,634][26599] Updated weights for policy 0, policy_version 314894 (0.0032) [2024-06-19 08:24:03,384][26367] Fps is (10 sec: 44220.6, 60 sec: 42322.8, 300 sec: 42764.5). Total num frames: 5159272448. Throughput: 0: 42845.4. Samples: 1426857440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:24:03,384][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 08:24:05,250][26599] Updated weights for policy 0, policy_version 314904 (0.0026) [2024-06-19 08:24:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5159485440. Throughput: 0: 42784.4. Samples: 1427114080. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:24:08,383][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 08:24:09,094][26599] Updated weights for policy 0, policy_version 314914 (0.0041) [2024-06-19 08:24:12,906][26599] Updated weights for policy 0, policy_version 314924 (0.0037) [2024-06-19 08:24:13,380][26367] Fps is (10 sec: 45892.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5159731200. Throughput: 0: 42853.9. Samples: 1427366320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:24:13,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 08:24:16,686][26599] Updated weights for policy 0, policy_version 314934 (0.0033) [2024-06-19 08:24:18,384][26367] Fps is (10 sec: 44221.0, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5159927808. Throughput: 0: 42950.8. Samples: 1427508040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:24:18,384][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 08:24:20,521][26599] Updated weights for policy 0, policy_version 314944 (0.0041) [2024-06-19 08:24:23,380][26367] Fps is (10 sec: 40959.2, 60 sec: 43145.8, 300 sec: 42709.4). Total num frames: 5160140800. Throughput: 0: 42965.7. Samples: 1427760420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:24:23,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 08:24:24,212][26599] Updated weights for policy 0, policy_version 314954 (0.0036) [2024-06-19 08:24:28,160][26599] Updated weights for policy 0, policy_version 314964 (0.0028) [2024-06-19 08:24:28,380][26367] Fps is (10 sec: 45891.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5160386560. Throughput: 0: 42910.7. Samples: 1428014460. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:24:28,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 08:24:32,225][26599] Updated weights for policy 0, policy_version 314974 (0.0044) [2024-06-19 08:24:32,239][26579] Signal inference workers to stop experience collection... (21100 times) [2024-06-19 08:24:32,243][26579] Signal inference workers to resume experience collection... (21100 times) [2024-06-19 08:24:32,274][26599] InferenceWorker_p0-w0: stopping experience collection (21100 times) [2024-06-19 08:24:32,274][26599] InferenceWorker_p0-w0: resuming experience collection (21100 times) [2024-06-19 08:24:33,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5160583168. Throughput: 0: 43120.9. Samples: 1428153200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:24:33,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 08:24:35,779][26599] Updated weights for policy 0, policy_version 314984 (0.0039) [2024-06-19 08:24:38,380][26367] Fps is (10 sec: 39322.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5160779776. Throughput: 0: 43205.8. Samples: 1428403900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:24:38,380][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 08:24:39,718][26599] Updated weights for policy 0, policy_version 314994 (0.0030) [2024-06-19 08:24:43,327][26599] Updated weights for policy 0, policy_version 315004 (0.0037) [2024-06-19 08:24:43,380][26367] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 5161025536. Throughput: 0: 43292.2. Samples: 1428666560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 08:24:43,380][26367] Avg episode reward: [(0, '0.406')] [2024-06-19 08:24:47,317][26599] Updated weights for policy 0, policy_version 315014 (0.0024) [2024-06-19 08:24:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5161222144. 
Throughput: 0: 43092.5. Samples: 1428796440. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:24:48,380][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 08:24:50,931][26599] Updated weights for policy 0, policy_version 315024 (0.0043) [2024-06-19 08:24:53,380][26367] Fps is (10 sec: 40959.1, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5161435136. Throughput: 0: 42979.5. Samples: 1429048160. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:24:53,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 08:24:55,182][26599] Updated weights for policy 0, policy_version 315034 (0.0044) [2024-06-19 08:24:58,380][26367] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5161664512. Throughput: 0: 43266.6. Samples: 1429313320. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:24:58,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 08:24:58,574][26599] Updated weights for policy 0, policy_version 315044 (0.0033) [2024-06-19 08:25:02,753][26599] Updated weights for policy 0, policy_version 315054 (0.0041) [2024-06-19 08:25:03,380][26367] Fps is (10 sec: 42599.1, 60 sec: 43147.2, 300 sec: 42765.0). Total num frames: 5161861120. Throughput: 0: 42927.5. Samples: 1429439620. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:03,381][26367] Avg episode reward: [(0, '0.822')] [2024-06-19 08:25:06,188][26599] Updated weights for policy 0, policy_version 315064 (0.0033) [2024-06-19 08:25:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5162074112. Throughput: 0: 42929.0. Samples: 1429692220. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:08,381][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 08:25:10,183][26599] Updated weights for policy 0, policy_version 315074 (0.0029) [2024-06-19 08:25:13,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5162303488. 
Throughput: 0: 43040.9. Samples: 1429951300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:13,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 08:25:13,904][26599] Updated weights for policy 0, policy_version 315084 (0.0024) [2024-06-19 08:25:18,102][26599] Updated weights for policy 0, policy_version 315094 (0.0034) [2024-06-19 08:25:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42874.0, 300 sec: 42709.5). Total num frames: 5162500096. Throughput: 0: 42846.5. Samples: 1430081300. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:18,381][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 08:25:21,457][26599] Updated weights for policy 0, policy_version 315104 (0.0045) [2024-06-19 08:25:23,380][26367] Fps is (10 sec: 42599.2, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 5162729472. Throughput: 0: 42860.0. Samples: 1430332600. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:23,381][26367] Avg episode reward: [(0, '0.333')] [2024-06-19 08:25:26,018][26599] Updated weights for policy 0, policy_version 315114 (0.0040) [2024-06-19 08:25:28,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5162942464. Throughput: 0: 42714.5. Samples: 1430588720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:28,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 08:25:29,074][26599] Updated weights for policy 0, policy_version 315124 (0.0040) [2024-06-19 08:25:33,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5163139072. Throughput: 0: 42601.6. Samples: 1430713520. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:33,389][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 08:25:33,572][26599] Updated weights for policy 0, policy_version 315134 (0.0033) [2024-06-19 08:25:36,857][26599] Updated weights for policy 0, policy_version 315144 (0.0034) [2024-06-19 08:25:38,384][26367] Fps is (10 sec: 42583.2, 60 sec: 43141.8, 300 sec: 42820.0). Total num frames: 5163368448. Throughput: 0: 42631.3. Samples: 1430966720. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:38,384][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 08:25:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000315147_5163368448.pth... [2024-06-19 08:25:38,459][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000314521_5153112064.pth [2024-06-19 08:25:41,151][26599] Updated weights for policy 0, policy_version 315154 (0.0032) [2024-06-19 08:25:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 5163581440. Throughput: 0: 42492.9. Samples: 1431225500. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:43,383][26367] Avg episode reward: [(0, '0.327')] [2024-06-19 08:25:44,734][26599] Updated weights for policy 0, policy_version 315164 (0.0031) [2024-06-19 08:25:48,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42598.4, 300 sec: 42765.5). Total num frames: 5163778048. Throughput: 0: 42452.4. Samples: 1431349980. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:48,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 08:25:48,923][26599] Updated weights for policy 0, policy_version 315174 (0.0038) [2024-06-19 08:25:52,250][26599] Updated weights for policy 0, policy_version 315184 (0.0032) [2024-06-19 08:25:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5164023808. Throughput: 0: 42610.3. Samples: 1431609680. 
Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:53,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 08:25:56,804][26579] Signal inference workers to stop experience collection... (21150 times) [2024-06-19 08:25:56,804][26579] Signal inference workers to resume experience collection... (21150 times) [2024-06-19 08:25:56,849][26599] InferenceWorker_p0-w0: stopping experience collection (21150 times) [2024-06-19 08:25:56,850][26599] InferenceWorker_p0-w0: resuming experience collection (21150 times) [2024-06-19 08:25:56,946][26599] Updated weights for policy 0, policy_version 315194 (0.0034) [2024-06-19 08:25:58,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5164236800. Throughput: 0: 42543.7. Samples: 1431865760. Policy #0 lag: (min: 1.0, avg: 10.6, max: 24.0) [2024-06-19 08:25:58,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 08:25:59,881][26599] Updated weights for policy 0, policy_version 315204 (0.0037) [2024-06-19 08:26:03,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5164400640. Throughput: 0: 42433.9. Samples: 1431990820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:03,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 08:26:04,553][26599] Updated weights for policy 0, policy_version 315214 (0.0020) [2024-06-19 08:26:07,485][26599] Updated weights for policy 0, policy_version 315224 (0.0040) [2024-06-19 08:26:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5164646400. Throughput: 0: 42592.8. Samples: 1432249280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:08,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 08:26:12,091][26599] Updated weights for policy 0, policy_version 315234 (0.0032) [2024-06-19 08:26:13,380][26367] Fps is (10 sec: 47513.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5164875776. 
Throughput: 0: 42618.7. Samples: 1432506560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:13,381][26367] Avg episode reward: [(0, '0.701')] [2024-06-19 08:26:15,060][26599] Updated weights for policy 0, policy_version 315244 (0.0027) [2024-06-19 08:26:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5165056000. Throughput: 0: 42880.9. Samples: 1432643160. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:18,389][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 08:26:19,782][26599] Updated weights for policy 0, policy_version 315254 (0.0028) [2024-06-19 08:26:23,038][26599] Updated weights for policy 0, policy_version 315264 (0.0030) [2024-06-19 08:26:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5165285376. Throughput: 0: 42735.1. Samples: 1432889640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:23,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 08:26:27,391][26599] Updated weights for policy 0, policy_version 315274 (0.0037) [2024-06-19 08:26:28,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5165498368. Throughput: 0: 42806.7. Samples: 1433151800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:28,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 08:26:30,979][26599] Updated weights for policy 0, policy_version 315284 (0.0027) [2024-06-19 08:26:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5165694976. Throughput: 0: 42814.2. Samples: 1433276620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:33,380][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 08:26:35,135][26599] Updated weights for policy 0, policy_version 315294 (0.0040) [2024-06-19 08:26:38,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42325.3, 300 sec: 42709.0). Total num frames: 5165907968. 
Throughput: 0: 42599.6. Samples: 1433526820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:38,385][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 08:26:38,745][26599] Updated weights for policy 0, policy_version 315304 (0.0042) [2024-06-19 08:26:42,799][26599] Updated weights for policy 0, policy_version 315314 (0.0039) [2024-06-19 08:26:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5166137344. Throughput: 0: 42920.5. Samples: 1433797180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:43,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 08:26:46,274][26599] Updated weights for policy 0, policy_version 315324 (0.0023) [2024-06-19 08:26:48,380][26367] Fps is (10 sec: 45891.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5166366720. Throughput: 0: 42996.0. Samples: 1433925640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:48,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 08:26:50,552][26599] Updated weights for policy 0, policy_version 315334 (0.0041) [2024-06-19 08:26:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5166563328. Throughput: 0: 42813.3. Samples: 1434175880. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:53,388][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 08:26:53,735][26599] Updated weights for policy 0, policy_version 315344 (0.0023) [2024-06-19 08:26:58,207][26599] Updated weights for policy 0, policy_version 315354 (0.0038) [2024-06-19 08:26:58,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 5166759936. Throughput: 0: 43056.1. Samples: 1434444080. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:26:58,380][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 08:27:01,235][26599] Updated weights for policy 0, policy_version 315364 (0.0021) [2024-06-19 08:27:03,380][26367] Fps is (10 sec: 44237.4, 60 sec: 43417.7, 300 sec: 42765.0). Total num frames: 5167005696. Throughput: 0: 42712.6. Samples: 1434565220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:27:03,380][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 08:27:05,923][26599] Updated weights for policy 0, policy_version 315374 (0.0036) [2024-06-19 08:27:08,380][26367] Fps is (10 sec: 45874.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5167218688. Throughput: 0: 42877.6. Samples: 1434819140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:27:08,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 08:27:09,021][26599] Updated weights for policy 0, policy_version 315384 (0.0039) [2024-06-19 08:27:13,384][26367] Fps is (10 sec: 39307.0, 60 sec: 42049.7, 300 sec: 42709.0). Total num frames: 5167398912. Throughput: 0: 42906.8. Samples: 1435082760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:27:13,384][26367] Avg episode reward: [(0, '0.282')] [2024-06-19 08:27:13,536][26599] Updated weights for policy 0, policy_version 315394 (0.0028) [2024-06-19 08:27:14,400][26579] Signal inference workers to stop experience collection... (21200 times) [2024-06-19 08:27:14,435][26599] InferenceWorker_p0-w0: stopping experience collection (21200 times) [2024-06-19 08:27:14,444][26579] Signal inference workers to resume experience collection... (21200 times) [2024-06-19 08:27:14,457][26599] InferenceWorker_p0-w0: resuming experience collection (21200 times) [2024-06-19 08:27:16,813][26599] Updated weights for policy 0, policy_version 315404 (0.0031) [2024-06-19 08:27:18,384][26367] Fps is (10 sec: 42583.4, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 5167644672. 
Throughput: 0: 42863.6. Samples: 1435205640. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:27:18,384][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 08:27:20,985][26599] Updated weights for policy 0, policy_version 315414 (0.0034) [2024-06-19 08:27:23,380][26367] Fps is (10 sec: 45892.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5167857664. Throughput: 0: 43089.3. Samples: 1435465680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:27:23,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 08:27:24,426][26599] Updated weights for policy 0, policy_version 315424 (0.0033) [2024-06-19 08:27:28,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5168054272. Throughput: 0: 42913.7. Samples: 1435728300. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:27:28,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 08:27:28,522][26599] Updated weights for policy 0, policy_version 315434 (0.0038) [2024-06-19 08:27:32,014][26599] Updated weights for policy 0, policy_version 315444 (0.0031) [2024-06-19 08:27:33,384][26367] Fps is (10 sec: 42582.7, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 5168283648. Throughput: 0: 42845.9. Samples: 1435853860. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:27:33,385][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 08:27:36,250][26599] Updated weights for policy 0, policy_version 315454 (0.0041) [2024-06-19 08:27:38,380][26367] Fps is (10 sec: 44237.3, 60 sec: 43147.2, 300 sec: 42820.6). Total num frames: 5168496640. Throughput: 0: 42961.9. Samples: 1436109160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:27:38,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 08:27:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000315460_5168496640.pth... 
[2024-06-19 08:27:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000314832_5158207488.pth [2024-06-19 08:27:39,715][26599] Updated weights for policy 0, policy_version 315464 (0.0030) [2024-06-19 08:27:43,380][26367] Fps is (10 sec: 40974.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5168693248. Throughput: 0: 42730.9. Samples: 1436366980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:27:43,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 08:27:43,686][26599] Updated weights for policy 0, policy_version 315474 (0.0034) [2024-06-19 08:27:47,371][26599] Updated weights for policy 0, policy_version 315484 (0.0028) [2024-06-19 08:27:48,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5168939008. Throughput: 0: 42875.4. Samples: 1436494620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:27:48,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 08:27:51,169][26599] Updated weights for policy 0, policy_version 315494 (0.0048) [2024-06-19 08:27:53,380][26367] Fps is (10 sec: 45875.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5169152000. Throughput: 0: 43032.6. Samples: 1436755600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:27:53,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 08:27:55,075][26599] Updated weights for policy 0, policy_version 315504 (0.0030) [2024-06-19 08:27:58,380][26367] Fps is (10 sec: 40960.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5169348608. Throughput: 0: 42946.2. Samples: 1437015180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:27:58,380][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 08:27:58,761][26599] Updated weights for policy 0, policy_version 315514 (0.0043) [2024-06-19 08:28:02,684][26599] Updated weights for policy 0, policy_version 315524 (0.0037) [2024-06-19 08:28:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5169594368. Throughput: 0: 42945.8. Samples: 1437138040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:28:03,381][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 08:28:06,374][26599] Updated weights for policy 0, policy_version 315534 (0.0033) [2024-06-19 08:28:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5169790976. Throughput: 0: 42887.6. Samples: 1437395620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:28:08,380][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 08:28:10,437][26599] Updated weights for policy 0, policy_version 315544 (0.0037) [2024-06-19 08:28:13,380][26367] Fps is (10 sec: 39320.7, 60 sec: 43147.0, 300 sec: 42820.5). Total num frames: 5169987584. Throughput: 0: 42743.5. Samples: 1437651760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:28:13,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 08:28:14,038][26599] Updated weights for policy 0, policy_version 315554 (0.0036) [2024-06-19 08:28:18,040][26599] Updated weights for policy 0, policy_version 315564 (0.0041) [2024-06-19 08:28:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42601.1, 300 sec: 42876.4). Total num frames: 5170200576. Throughput: 0: 42720.0. Samples: 1437776100. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:28:18,380][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 08:28:21,528][26599] Updated weights for policy 0, policy_version 315574 (0.0034) [2024-06-19 08:28:23,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5170429952. Throughput: 0: 42793.7. Samples: 1438034880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:28:23,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 08:28:25,810][26599] Updated weights for policy 0, policy_version 315584 (0.0034) [2024-06-19 08:28:28,380][26367] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5170642944. Throughput: 0: 42736.1. Samples: 1438290100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 08:28:28,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 08:28:29,449][26599] Updated weights for policy 0, policy_version 315594 (0.0031) [2024-06-19 08:28:33,384][26367] Fps is (10 sec: 40945.4, 60 sec: 42598.4, 300 sec: 42820.0). Total num frames: 5170839552. Throughput: 0: 42727.7. Samples: 1438417520. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:28:33,384][26367] Avg episode reward: [(0, '0.185')] [2024-06-19 08:28:33,635][26599] Updated weights for policy 0, policy_version 315604 (0.0031) [2024-06-19 08:28:37,018][26599] Updated weights for policy 0, policy_version 315614 (0.0023) [2024-06-19 08:28:38,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5171052544. Throughput: 0: 42681.4. Samples: 1438676260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:28:38,380][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 08:28:41,162][26599] Updated weights for policy 0, policy_version 315624 (0.0022) [2024-06-19 08:28:43,380][26367] Fps is (10 sec: 44253.0, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5171281920. Throughput: 0: 42620.8. Samples: 1438933120. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:28:43,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 08:28:44,491][26599] Updated weights for policy 0, policy_version 315634 (0.0048) [2024-06-19 08:28:48,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5171494912. Throughput: 0: 42849.2. Samples: 1439066260. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:28:48,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 08:28:48,657][26599] Updated weights for policy 0, policy_version 315644 (0.0040) [2024-06-19 08:28:51,977][26599] Updated weights for policy 0, policy_version 315654 (0.0034) [2024-06-19 08:28:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5171707904. Throughput: 0: 42678.2. Samples: 1439316140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:28:53,380][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 08:28:56,146][26599] Updated weights for policy 0, policy_version 315664 (0.0033) [2024-06-19 08:28:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.2, 300 sec: 42821.1). Total num frames: 5171904512. Throughput: 0: 42715.1. Samples: 1439573940. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:28:58,381][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 08:28:59,910][26599] Updated weights for policy 0, policy_version 315674 (0.0042) [2024-06-19 08:29:03,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5172133888. Throughput: 0: 42793.3. Samples: 1439701800. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:29:03,381][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 08:29:04,146][26599] Updated weights for policy 0, policy_version 315684 (0.0026) [2024-06-19 08:29:07,143][26579] Signal inference workers to stop experience collection... 
(21250 times) [2024-06-19 08:29:07,148][26579] Signal inference workers to resume experience collection... (21250 times) [2024-06-19 08:29:07,160][26599] InferenceWorker_p0-w0: stopping experience collection (21250 times) [2024-06-19 08:29:07,187][26599] InferenceWorker_p0-w0: resuming experience collection (21250 times) [2024-06-19 08:29:07,600][26599] Updated weights for policy 0, policy_version 315694 (0.0029) [2024-06-19 08:29:08,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5172346880. Throughput: 0: 42870.3. Samples: 1439964040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:29:08,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 08:29:11,911][26599] Updated weights for policy 0, policy_version 315704 (0.0044) [2024-06-19 08:29:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42821.1). Total num frames: 5172559872. Throughput: 0: 42700.0. Samples: 1440211600. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:29:13,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 08:29:15,263][26599] Updated weights for policy 0, policy_version 315714 (0.0040) [2024-06-19 08:29:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5172756480. Throughput: 0: 42712.3. Samples: 1440339420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:29:18,381][26367] Avg episode reward: [(0, '0.393')] [2024-06-19 08:29:19,546][26599] Updated weights for policy 0, policy_version 315724 (0.0039) [2024-06-19 08:29:23,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5172969472. Throughput: 0: 42611.1. Samples: 1440593760. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:29:23,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 08:29:23,408][26599] Updated weights for policy 0, policy_version 315734 (0.0046) [2024-06-19 08:29:27,095][26599] Updated weights for policy 0, policy_version 315744 (0.0031) [2024-06-19 08:29:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5173198848. Throughput: 0: 42557.7. Samples: 1440848220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:29:28,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 08:29:30,879][26599] Updated weights for policy 0, policy_version 315754 (0.0023) [2024-06-19 08:29:33,384][26367] Fps is (10 sec: 42582.6, 60 sec: 42598.4, 300 sec: 42764.5). Total num frames: 5173395456. Throughput: 0: 42466.8. Samples: 1440977420. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:29:33,385][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 08:29:34,705][26599] Updated weights for policy 0, policy_version 315764 (0.0023) [2024-06-19 08:29:38,370][26599] Updated weights for policy 0, policy_version 315774 (0.0033) [2024-06-19 08:29:38,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5173641216. Throughput: 0: 42658.5. Samples: 1441235780. Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:29:38,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 08:29:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000315774_5173641216.pth... [2024-06-19 08:29:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000315147_5163368448.pth [2024-06-19 08:29:42,281][26599] Updated weights for policy 0, policy_version 315784 (0.0033) [2024-06-19 08:29:43,380][26367] Fps is (10 sec: 44253.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5173837824. Throughput: 0: 42633.1. Samples: 1441492420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 22.0) [2024-06-19 08:29:43,380][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 08:29:46,114][26599] Updated weights for policy 0, policy_version 315794 (0.0031) [2024-06-19 08:29:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5174050816. Throughput: 0: 42701.7. Samples: 1441623380. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:29:48,389][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 08:29:49,790][26599] Updated weights for policy 0, policy_version 315804 (0.0050) [2024-06-19 08:29:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5174263808. Throughput: 0: 42572.1. Samples: 1441879780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:29:53,380][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 08:29:53,887][26599] Updated weights for policy 0, policy_version 315814 (0.0034) [2024-06-19 08:29:57,810][26599] Updated weights for policy 0, policy_version 315824 (0.0040) [2024-06-19 08:29:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5174476800. Throughput: 0: 42790.2. Samples: 1442137160. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:29:58,381][26367] Avg episode reward: [(0, '0.761')] [2024-06-19 08:30:01,506][26599] Updated weights for policy 0, policy_version 315834 (0.0056) [2024-06-19 08:30:03,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5174689792. Throughput: 0: 42750.6. Samples: 1442263200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:03,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 08:30:05,525][26599] Updated weights for policy 0, policy_version 315844 (0.0040) [2024-06-19 08:30:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5174902784. Throughput: 0: 42689.7. Samples: 1442514800. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:08,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 08:30:09,022][26599] Updated weights for policy 0, policy_version 315854 (0.0027) [2024-06-19 08:30:13,151][26599] Updated weights for policy 0, policy_version 315864 (0.0025) [2024-06-19 08:30:13,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5175115776. Throughput: 0: 42935.7. Samples: 1442780320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:13,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 08:30:16,623][26599] Updated weights for policy 0, policy_version 315874 (0.0039) [2024-06-19 08:30:18,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5175328768. Throughput: 0: 42912.7. Samples: 1442908340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:18,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 08:30:20,807][26599] Updated weights for policy 0, policy_version 315884 (0.0036) [2024-06-19 08:30:23,380][26367] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5175558144. Throughput: 0: 42819.1. Samples: 1443162640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:23,383][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 08:30:24,353][26599] Updated weights for policy 0, policy_version 315894 (0.0029) [2024-06-19 08:30:28,316][26599] Updated weights for policy 0, policy_version 315904 (0.0041) [2024-06-19 08:30:28,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5175771136. Throughput: 0: 43030.2. Samples: 1443428780. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:28,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 08:30:31,979][26599] Updated weights for policy 0, policy_version 315914 (0.0036) [2024-06-19 08:30:33,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42874.1, 300 sec: 42710.0). Total num frames: 5175967744. Throughput: 0: 42975.6. Samples: 1443557280. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:33,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 08:30:34,295][26579] Signal inference workers to stop experience collection... (21300 times) [2024-06-19 08:30:34,298][26579] Signal inference workers to resume experience collection... (21300 times) [2024-06-19 08:30:34,340][26599] InferenceWorker_p0-w0: stopping experience collection (21300 times) [2024-06-19 08:30:34,340][26599] InferenceWorker_p0-w0: resuming experience collection (21300 times) [2024-06-19 08:30:35,704][26599] Updated weights for policy 0, policy_version 315924 (0.0030) [2024-06-19 08:30:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5176197120. Throughput: 0: 42933.3. Samples: 1443811780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:38,380][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 08:30:39,521][26599] Updated weights for policy 0, policy_version 315934 (0.0038) [2024-06-19 08:30:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5176410112. Throughput: 0: 42997.7. Samples: 1444072060. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:43,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 08:30:43,418][26599] Updated weights for policy 0, policy_version 315944 (0.0029) [2024-06-19 08:30:47,082][26599] Updated weights for policy 0, policy_version 315954 (0.0039) [2024-06-19 08:30:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5176606720. 
Throughput: 0: 42962.4. Samples: 1444196500. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:48,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 08:30:51,127][26599] Updated weights for policy 0, policy_version 315964 (0.0036) [2024-06-19 08:30:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5176836096. Throughput: 0: 42880.4. Samples: 1444444420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:53,381][26367] Avg episode reward: [(0, '0.371')] [2024-06-19 08:30:54,523][26599] Updated weights for policy 0, policy_version 315974 (0.0025) [2024-06-19 08:30:58,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5177049088. Throughput: 0: 42955.9. Samples: 1444713340. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 08:30:58,381][26367] Avg episode reward: [(0, '0.808')] [2024-06-19 08:30:58,850][26599] Updated weights for policy 0, policy_version 315984 (0.0044) [2024-06-19 08:31:02,065][26599] Updated weights for policy 0, policy_version 315994 (0.0041) [2024-06-19 08:31:03,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5177262080. Throughput: 0: 42879.6. Samples: 1444837920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:03,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 08:31:06,464][26599] Updated weights for policy 0, policy_version 316004 (0.0039) [2024-06-19 08:31:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5177491456. Throughput: 0: 42935.2. Samples: 1445094720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:08,389][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 08:31:09,955][26599] Updated weights for policy 0, policy_version 316014 (0.0035) [2024-06-19 08:31:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5177671680. 
Throughput: 0: 42800.8. Samples: 1445354820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:13,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 08:31:14,234][26599] Updated weights for policy 0, policy_version 316024 (0.0045) [2024-06-19 08:31:17,762][26599] Updated weights for policy 0, policy_version 316034 (0.0029) [2024-06-19 08:31:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5177901056. Throughput: 0: 42658.2. Samples: 1445476900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:18,383][26367] Avg episode reward: [(0, '0.362')] [2024-06-19 08:31:21,888][26599] Updated weights for policy 0, policy_version 316044 (0.0037) [2024-06-19 08:31:23,380][26367] Fps is (10 sec: 47514.3, 60 sec: 43144.7, 300 sec: 42876.1). Total num frames: 5178146816. Throughput: 0: 42729.7. Samples: 1445734620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:23,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 08:31:25,802][26599] Updated weights for policy 0, policy_version 316054 (0.0034) [2024-06-19 08:31:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5178327040. Throughput: 0: 42682.3. Samples: 1445992760. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:28,381][26367] Avg episode reward: [(0, '0.332')] [2024-06-19 08:31:29,550][26599] Updated weights for policy 0, policy_version 316064 (0.0038) [2024-06-19 08:31:33,181][26599] Updated weights for policy 0, policy_version 316074 (0.0033) [2024-06-19 08:31:33,380][26367] Fps is (10 sec: 40959.7, 60 sec: 43144.5, 300 sec: 42876.6). Total num frames: 5178556416. Throughput: 0: 42717.3. Samples: 1446118780. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:33,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 08:31:36,951][26599] Updated weights for policy 0, policy_version 316084 (0.0033) [2024-06-19 08:31:38,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5178769408. Throughput: 0: 43041.2. Samples: 1446381280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:38,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 08:31:38,449][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000316088_5178785792.pth... [2024-06-19 08:31:38,519][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000315460_5168496640.pth [2024-06-19 08:31:40,566][26599] Updated weights for policy 0, policy_version 316094 (0.0041) [2024-06-19 08:31:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5178966016. Throughput: 0: 42822.2. Samples: 1446640340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:43,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 08:31:44,594][26599] Updated weights for policy 0, policy_version 316104 (0.0035) [2024-06-19 08:31:48,002][26599] Updated weights for policy 0, policy_version 316114 (0.0035) [2024-06-19 08:31:48,380][26367] Fps is (10 sec: 44237.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5179211776. Throughput: 0: 42976.1. Samples: 1446771840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:48,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 08:31:51,809][26579] Signal inference workers to stop experience collection... (21350 times) [2024-06-19 08:31:51,862][26599] InferenceWorker_p0-w0: stopping experience collection (21350 times) [2024-06-19 08:31:51,926][26579] Signal inference workers to resume experience collection... 
(21350 times) [2024-06-19 08:31:51,926][26599] InferenceWorker_p0-w0: resuming experience collection (21350 times) [2024-06-19 08:31:52,470][26599] Updated weights for policy 0, policy_version 316124 (0.0031) [2024-06-19 08:31:53,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5179408384. Throughput: 0: 42940.6. Samples: 1447027040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:53,380][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 08:31:55,586][26599] Updated weights for policy 0, policy_version 316134 (0.0029) [2024-06-19 08:31:58,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5179621376. Throughput: 0: 42901.8. Samples: 1447285400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:31:58,392][26367] Avg episode reward: [(0, '0.420')] [2024-06-19 08:32:00,041][26599] Updated weights for policy 0, policy_version 316144 (0.0034) [2024-06-19 08:32:03,210][26599] Updated weights for policy 0, policy_version 316154 (0.0041) [2024-06-19 08:32:03,380][26367] Fps is (10 sec: 45874.5, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5179867136. Throughput: 0: 42961.4. Samples: 1447410160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:32:03,381][26367] Avg episode reward: [(0, '0.390')] [2024-06-19 08:32:07,694][26599] Updated weights for policy 0, policy_version 316164 (0.0030) [2024-06-19 08:32:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.5, 300 sec: 42876.6). Total num frames: 5180047360. Throughput: 0: 42996.0. Samples: 1447669440. Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:32:08,380][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 08:32:10,770][26599] Updated weights for policy 0, policy_version 316174 (0.0033) [2024-06-19 08:32:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 43417.7, 300 sec: 42821.1). Total num frames: 5180276736. Throughput: 0: 42946.8. Samples: 1447925360. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 20.0) [2024-06-19 08:32:13,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 08:32:15,391][26599] Updated weights for policy 0, policy_version 316184 (0.0037) [2024-06-19 08:32:18,349][26599] Updated weights for policy 0, policy_version 316194 (0.0048) [2024-06-19 08:32:18,380][26367] Fps is (10 sec: 47513.2, 60 sec: 43690.7, 300 sec: 42931.6). Total num frames: 5180522496. Throughput: 0: 43007.1. Samples: 1448054100. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:32:18,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 08:32:22,801][26599] Updated weights for policy 0, policy_version 316204 (0.0040) [2024-06-19 08:32:23,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5180702720. Throughput: 0: 43176.5. Samples: 1448324220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:32:23,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 08:32:25,775][26599] Updated weights for policy 0, policy_version 316214 (0.0041) [2024-06-19 08:32:28,380][26367] Fps is (10 sec: 39321.7, 60 sec: 43144.6, 300 sec: 42821.1). Total num frames: 5180915712. Throughput: 0: 42979.6. Samples: 1448574420. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:32:28,381][26367] Avg episode reward: [(0, '0.375')] [2024-06-19 08:32:30,759][26599] Updated weights for policy 0, policy_version 316224 (0.0038) [2024-06-19 08:32:33,285][26599] Updated weights for policy 0, policy_version 316234 (0.0028) [2024-06-19 08:32:33,380][26367] Fps is (10 sec: 47513.2, 60 sec: 43690.5, 300 sec: 42987.1). Total num frames: 5181177856. Throughput: 0: 43027.8. Samples: 1448708100. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:32:33,381][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 08:32:38,253][26599] Updated weights for policy 0, policy_version 316244 (0.0037) [2024-06-19 08:32:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5181341696. Throughput: 0: 43094.2. Samples: 1448966280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:32:38,380][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 08:32:41,249][26599] Updated weights for policy 0, policy_version 316254 (0.0033) [2024-06-19 08:32:43,380][26367] Fps is (10 sec: 39321.5, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5181571072. Throughput: 0: 42845.2. Samples: 1449213440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:32:43,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 08:32:45,959][26599] Updated weights for policy 0, policy_version 316264 (0.0044) [2024-06-19 08:32:48,380][26367] Fps is (10 sec: 45875.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5181800448. Throughput: 0: 43017.0. Samples: 1449345920. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:32:48,380][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 08:32:48,910][26599] Updated weights for policy 0, policy_version 316274 (0.0029) [2024-06-19 08:32:53,380][26367] Fps is (10 sec: 37684.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5181947904. Throughput: 0: 43049.3. Samples: 1449606660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:32:53,380][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 08:32:53,645][26579] Signal inference workers to stop experience collection... (21400 times) [2024-06-19 08:32:53,698][26599] InferenceWorker_p0-w0: stopping experience collection (21400 times) [2024-06-19 08:32:53,706][26579] Signal inference workers to resume experience collection... 
(21400 times) [2024-06-19 08:32:53,722][26599] InferenceWorker_p0-w0: resuming experience collection (21400 times) [2024-06-19 08:32:53,859][26599] Updated weights for policy 0, policy_version 316284 (0.0039) [2024-06-19 08:32:56,416][26599] Updated weights for policy 0, policy_version 316294 (0.0035) [2024-06-19 08:32:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 5182210048. Throughput: 0: 42935.2. Samples: 1449857440. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:32:58,380][26367] Avg episode reward: [(0, '0.376')] [2024-06-19 08:33:01,358][26599] Updated weights for policy 0, policy_version 316304 (0.0028) [2024-06-19 08:33:03,384][26367] Fps is (10 sec: 50771.6, 60 sec: 43141.9, 300 sec: 42931.1). Total num frames: 5182455808. Throughput: 0: 43205.0. Samples: 1449998480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:33:03,385][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 08:33:03,933][26599] Updated weights for policy 0, policy_version 316314 (0.0034) [2024-06-19 08:33:08,380][26367] Fps is (10 sec: 39320.6, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5182603264. Throughput: 0: 42835.5. Samples: 1450251820. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:33:08,381][26367] Avg episode reward: [(0, '0.341')] [2024-06-19 08:33:08,963][26599] Updated weights for policy 0, policy_version 316324 (0.0034) [2024-06-19 08:33:11,882][26599] Updated weights for policy 0, policy_version 316334 (0.0038) [2024-06-19 08:33:13,380][26367] Fps is (10 sec: 40974.8, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5182865408. Throughput: 0: 42737.3. Samples: 1450497600. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:33:13,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 08:33:16,978][26599] Updated weights for policy 0, policy_version 316344 (0.0044) [2024-06-19 08:33:18,380][26367] Fps is (10 sec: 47514.5, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5183078400. Throughput: 0: 42921.5. Samples: 1450639560. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:33:18,380][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 08:33:19,505][26599] Updated weights for policy 0, policy_version 316354 (0.0038) [2024-06-19 08:33:23,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5183242240. Throughput: 0: 42795.5. Samples: 1450892080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:33:23,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 08:33:24,675][26599] Updated weights for policy 0, policy_version 316364 (0.0051) [2024-06-19 08:33:27,139][26599] Updated weights for policy 0, policy_version 316374 (0.0042) [2024-06-19 08:33:28,380][26367] Fps is (10 sec: 42597.8, 60 sec: 43144.5, 300 sec: 42932.1). Total num frames: 5183504384. Throughput: 0: 42781.9. Samples: 1451138620. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 08:33:28,381][26367] Avg episode reward: [(0, '0.812')] [2024-06-19 08:33:32,143][26599] Updated weights for policy 0, policy_version 316384 (0.0040) [2024-06-19 08:33:33,380][26367] Fps is (10 sec: 47514.1, 60 sec: 42325.5, 300 sec: 42931.6). Total num frames: 5183717376. Throughput: 0: 42994.7. Samples: 1451280680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:33:33,380][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 08:33:34,758][26599] Updated weights for policy 0, policy_version 316394 (0.0049) [2024-06-19 08:33:38,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5183881216. Throughput: 0: 42823.5. Samples: 1451533720. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:33:38,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 08:33:38,456][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000316400_5183897600.pth... [2024-06-19 08:33:38,513][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000315774_5173641216.pth [2024-06-19 08:33:39,958][26599] Updated weights for policy 0, policy_version 316404 (0.0035) [2024-06-19 08:33:42,246][26599] Updated weights for policy 0, policy_version 316414 (0.0044) [2024-06-19 08:33:43,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5184143360. Throughput: 0: 42683.4. Samples: 1451778200. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:33:43,381][26367] Avg episode reward: [(0, '0.304')] [2024-06-19 08:33:47,499][26599] Updated weights for policy 0, policy_version 316424 (0.0047) [2024-06-19 08:33:48,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 5184339968. Throughput: 0: 42623.9. Samples: 1451916400. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:33:48,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 08:33:48,545][26579] Signal inference workers to stop experience collection... (21450 times) [2024-06-19 08:33:48,576][26599] InferenceWorker_p0-w0: stopping experience collection (21450 times) [2024-06-19 08:33:48,601][26579] Signal inference workers to resume experience collection... (21450 times) [2024-06-19 08:33:48,617][26599] InferenceWorker_p0-w0: resuming experience collection (21450 times) [2024-06-19 08:33:50,170][26599] Updated weights for policy 0, policy_version 316434 (0.0035) [2024-06-19 08:33:53,383][26367] Fps is (10 sec: 39311.5, 60 sec: 43142.6, 300 sec: 42820.2). Total num frames: 5184536576. Throughput: 0: 42624.3. Samples: 1452170020. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:33:53,383][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 08:33:55,057][26599] Updated weights for policy 0, policy_version 316444 (0.0034) [2024-06-19 08:33:57,691][26599] Updated weights for policy 0, policy_version 316454 (0.0035) [2024-06-19 08:33:58,380][26367] Fps is (10 sec: 45875.0, 60 sec: 43144.4, 300 sec: 42931.6). Total num frames: 5184798720. Throughput: 0: 42620.0. Samples: 1452415500. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:33:58,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 08:34:02,585][26599] Updated weights for policy 0, policy_version 316464 (0.0029) [2024-06-19 08:34:03,380][26367] Fps is (10 sec: 42609.9, 60 sec: 41781.8, 300 sec: 42765.0). Total num frames: 5184962560. Throughput: 0: 42617.8. Samples: 1452557360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:34:03,380][26367] Avg episode reward: [(0, '0.799')] [2024-06-19 08:34:05,268][26599] Updated weights for policy 0, policy_version 316474 (0.0042) [2024-06-19 08:34:08,380][26367] Fps is (10 sec: 39321.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5185191936. Throughput: 0: 42563.5. Samples: 1452807440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:34:08,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 08:34:10,289][26599] Updated weights for policy 0, policy_version 316484 (0.0027) [2024-06-19 08:34:12,981][26599] Updated weights for policy 0, policy_version 316494 (0.0032) [2024-06-19 08:34:13,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5185437696. Throughput: 0: 42597.9. Samples: 1453055520. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:34:13,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 08:34:17,882][26599] Updated weights for policy 0, policy_version 316504 (0.0042) [2024-06-19 08:34:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42876.1). Total num frames: 5185617920. Throughput: 0: 42575.0. Samples: 1453196560. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:34:18,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 08:34:20,857][26599] Updated weights for policy 0, policy_version 316514 (0.0036) [2024-06-19 08:34:23,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5185814528. Throughput: 0: 42490.7. Samples: 1453445800. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:34:23,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 08:34:25,423][26599] Updated weights for policy 0, policy_version 316524 (0.0031) [2024-06-19 08:34:28,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42987.7). Total num frames: 5186076672. Throughput: 0: 42698.2. Samples: 1453699620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:34:28,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 08:34:28,544][26599] Updated weights for policy 0, policy_version 316534 (0.0041) [2024-06-19 08:34:33,202][26599] Updated weights for policy 0, policy_version 316544 (0.0033) [2024-06-19 08:34:33,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5186256896. Throughput: 0: 42742.7. Samples: 1453839820. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:34:33,380][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 08:34:36,359][26599] Updated weights for policy 0, policy_version 316554 (0.0041) [2024-06-19 08:34:38,384][26367] Fps is (10 sec: 39307.4, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 5186469888. Throughput: 0: 42670.6. Samples: 1454090240. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:34:38,385][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 08:34:40,760][26599] Updated weights for policy 0, policy_version 316564 (0.0027) [2024-06-19 08:34:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5186699264. Throughput: 0: 42876.1. Samples: 1454344920. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 08:34:43,380][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 08:34:43,927][26599] Updated weights for policy 0, policy_version 316574 (0.0027) [2024-06-19 08:34:48,268][26599] Updated weights for policy 0, policy_version 316584 (0.0028) [2024-06-19 08:34:48,380][26367] Fps is (10 sec: 44253.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5186912256. Throughput: 0: 42733.7. Samples: 1454480380. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:34:48,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 08:34:51,487][26599] Updated weights for policy 0, policy_version 316594 (0.0037) [2024-06-19 08:34:53,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43146.4, 300 sec: 42876.1). Total num frames: 5187125248. Throughput: 0: 42834.2. Samples: 1454734980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:34:53,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 08:34:56,090][26599] Updated weights for policy 0, policy_version 316604 (0.0032) [2024-06-19 08:34:58,383][26367] Fps is (10 sec: 44225.8, 60 sec: 42596.7, 300 sec: 42931.3). Total num frames: 5187354624. Throughput: 0: 43052.3. Samples: 1454992980. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:34:58,383][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 08:34:59,011][26599] Updated weights for policy 0, policy_version 316614 (0.0033) [2024-06-19 08:35:02,868][26579] Signal inference workers to stop experience collection... 
(21500 times) [2024-06-19 08:35:02,869][26579] Signal inference workers to resume experience collection... (21500 times) [2024-06-19 08:35:02,918][26599] InferenceWorker_p0-w0: stopping experience collection (21500 times) [2024-06-19 08:35:02,918][26599] InferenceWorker_p0-w0: resuming experience collection (21500 times) [2024-06-19 08:35:03,380][26367] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5187551232. Throughput: 0: 42850.3. Samples: 1455124820. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:03,380][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 08:35:03,585][26599] Updated weights for policy 0, policy_version 316624 (0.0034) [2024-06-19 08:35:06,733][26599] Updated weights for policy 0, policy_version 316634 (0.0034) [2024-06-19 08:35:08,380][26367] Fps is (10 sec: 42609.2, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5187780608. Throughput: 0: 42869.9. Samples: 1455374940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:08,380][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 08:35:11,196][26599] Updated weights for policy 0, policy_version 316644 (0.0051) [2024-06-19 08:35:13,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5187993600. Throughput: 0: 42997.3. Samples: 1455634500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:13,381][26367] Avg episode reward: [(0, '0.773')] [2024-06-19 08:35:14,483][26599] Updated weights for policy 0, policy_version 316654 (0.0023) [2024-06-19 08:35:18,384][26367] Fps is (10 sec: 40944.9, 60 sec: 42868.9, 300 sec: 42820.1). Total num frames: 5188190208. Throughput: 0: 42704.5. Samples: 1455761680. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:18,384][26367] Avg episode reward: [(0, '0.830')] [2024-06-19 08:35:19,008][26599] Updated weights for policy 0, policy_version 316664 (0.0040) [2024-06-19 08:35:22,087][26599] Updated weights for policy 0, policy_version 316674 (0.0032) [2024-06-19 08:35:23,384][26367] Fps is (10 sec: 44221.2, 60 sec: 43688.1, 300 sec: 42931.1). Total num frames: 5188435968. Throughput: 0: 42784.5. Samples: 1456015540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:23,384][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 08:35:26,693][26599] Updated weights for policy 0, policy_version 316684 (0.0030) [2024-06-19 08:35:28,380][26367] Fps is (10 sec: 44252.2, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5188632576. Throughput: 0: 42998.9. Samples: 1456279880. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:28,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 08:35:29,814][26599] Updated weights for policy 0, policy_version 316694 (0.0025) [2024-06-19 08:35:33,380][26367] Fps is (10 sec: 37696.8, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5188812800. Throughput: 0: 42849.3. Samples: 1456408600. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:33,381][26367] Avg episode reward: [(0, '0.781')] [2024-06-19 08:35:34,342][26599] Updated weights for policy 0, policy_version 316704 (0.0040) [2024-06-19 08:35:37,324][26599] Updated weights for policy 0, policy_version 316714 (0.0031) [2024-06-19 08:35:38,380][26367] Fps is (10 sec: 44237.1, 60 sec: 43420.2, 300 sec: 42931.6). Total num frames: 5189074944. Throughput: 0: 42796.8. Samples: 1456660840. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:38,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 08:35:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000316716_5189074944.pth... 
[2024-06-19 08:35:38,443][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000316088_5178785792.pth [2024-06-19 08:35:42,059][26599] Updated weights for policy 0, policy_version 316724 (0.0029) [2024-06-19 08:35:43,380][26367] Fps is (10 sec: 47513.1, 60 sec: 43144.4, 300 sec: 42987.2). Total num frames: 5189287936. Throughput: 0: 42859.1. Samples: 1456921540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:43,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 08:35:44,797][26599] Updated weights for policy 0, policy_version 316734 (0.0040) [2024-06-19 08:35:48,380][26367] Fps is (10 sec: 37682.8, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 5189451776. Throughput: 0: 42748.2. Samples: 1457048500. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:48,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 08:35:49,503][26599] Updated weights for policy 0, policy_version 316744 (0.0033) [2024-06-19 08:35:52,667][26599] Updated weights for policy 0, policy_version 316754 (0.0032) [2024-06-19 08:35:53,380][26367] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 42931.7). Total num frames: 5189713920. Throughput: 0: 42848.4. Samples: 1457303120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:53,380][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 08:35:57,041][26599] Updated weights for policy 0, policy_version 316764 (0.0032) [2024-06-19 08:35:58,380][26367] Fps is (10 sec: 47514.3, 60 sec: 42873.2, 300 sec: 42931.6). Total num frames: 5189926912. Throughput: 0: 42827.6. Samples: 1457561740. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 08:35:58,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 08:36:00,299][26599] Updated weights for policy 0, policy_version 316774 (0.0031) [2024-06-19 08:36:03,380][26367] Fps is (10 sec: 37682.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5190090752. Throughput: 0: 42851.9. Samples: 1457689860. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:03,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 08:36:04,965][26599] Updated weights for policy 0, policy_version 316784 (0.0034) [2024-06-19 08:36:07,883][26599] Updated weights for policy 0, policy_version 316794 (0.0036) [2024-06-19 08:36:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5190352896. Throughput: 0: 42832.9. Samples: 1457942860. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:08,380][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 08:36:12,638][26599] Updated weights for policy 0, policy_version 316804 (0.0033) [2024-06-19 08:36:13,380][26367] Fps is (10 sec: 47513.6, 60 sec: 42871.5, 300 sec: 42931.6). Total num frames: 5190565888. Throughput: 0: 42730.8. Samples: 1458202760. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:13,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 08:36:15,877][26599] Updated weights for policy 0, policy_version 316814 (0.0032) [2024-06-19 08:36:18,384][26367] Fps is (10 sec: 39307.0, 60 sec: 42598.4, 300 sec: 42708.9). Total num frames: 5190746112. Throughput: 0: 42614.8. Samples: 1458326420. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:18,384][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 08:36:19,526][26579] Signal inference workers to stop experience collection... (21550 times) [2024-06-19 08:36:19,580][26599] InferenceWorker_p0-w0: stopping experience collection (21550 times) [2024-06-19 08:36:19,585][26579] Signal inference workers to resume experience collection... (21550 times) [2024-06-19 08:36:19,589][26599] InferenceWorker_p0-w0: resuming experience collection (21550 times) [2024-06-19 08:36:20,327][26599] Updated weights for policy 0, policy_version 316824 (0.0028) [2024-06-19 08:36:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42601.0, 300 sec: 42931.6). Total num frames: 5190991872. 
Throughput: 0: 42878.3. Samples: 1458590360. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:23,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 08:36:23,390][26599] Updated weights for policy 0, policy_version 316834 (0.0039) [2024-06-19 08:36:27,960][26599] Updated weights for policy 0, policy_version 316844 (0.0032) [2024-06-19 08:36:28,380][26367] Fps is (10 sec: 44252.3, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5191188480. Throughput: 0: 42764.0. Samples: 1458845920. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:28,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 08:36:31,202][26599] Updated weights for policy 0, policy_version 316854 (0.0030) [2024-06-19 08:36:33,380][26367] Fps is (10 sec: 39320.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5191385088. Throughput: 0: 42684.9. Samples: 1458969320. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:33,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 08:36:35,451][26599] Updated weights for policy 0, policy_version 316864 (0.0037) [2024-06-19 08:36:38,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5191630848. Throughput: 0: 42877.2. Samples: 1459232600. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:38,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 08:36:38,839][26599] Updated weights for policy 0, policy_version 316874 (0.0028) [2024-06-19 08:36:42,951][26599] Updated weights for policy 0, policy_version 316884 (0.0028) [2024-06-19 08:36:43,380][26367] Fps is (10 sec: 45876.5, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 5191843840. Throughput: 0: 42922.4. Samples: 1459493240. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:43,380][26367] Avg episode reward: [(0, '0.766')] [2024-06-19 08:36:46,363][26599] Updated weights for policy 0, policy_version 316894 (0.0035) [2024-06-19 08:36:48,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5192024064. Throughput: 0: 42916.8. Samples: 1459621120. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:48,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 08:36:50,464][26599] Updated weights for policy 0, policy_version 316904 (0.0024) [2024-06-19 08:36:53,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5192269824. Throughput: 0: 43025.2. Samples: 1459879000. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:53,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 08:36:54,264][26599] Updated weights for policy 0, policy_version 316914 (0.0032) [2024-06-19 08:36:57,974][26599] Updated weights for policy 0, policy_version 316924 (0.0026) [2024-06-19 08:36:58,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5192482816. Throughput: 0: 42931.0. Samples: 1460134660. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:36:58,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 08:37:01,879][26599] Updated weights for policy 0, policy_version 316934 (0.0032) [2024-06-19 08:37:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5192695808. Throughput: 0: 43077.7. Samples: 1460264760. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:37:03,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 08:37:05,533][26599] Updated weights for policy 0, policy_version 316944 (0.0039) [2024-06-19 08:37:08,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5192908800. Throughput: 0: 42957.8. Samples: 1460523460. 
Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:37:08,381][26367] Avg episode reward: [(0, '0.768')] [2024-06-19 08:37:09,666][26599] Updated weights for policy 0, policy_version 316954 (0.0024) [2024-06-19 08:37:13,181][26599] Updated weights for policy 0, policy_version 316964 (0.0032) [2024-06-19 08:37:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5193138176. Throughput: 0: 43028.2. Samples: 1460782180. Policy #0 lag: (min: 0.0, avg: 13.1, max: 25.0) [2024-06-19 08:37:13,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 08:37:17,524][26599] Updated weights for policy 0, policy_version 316974 (0.0037) [2024-06-19 08:37:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42874.0, 300 sec: 42765.0). Total num frames: 5193318400. Throughput: 0: 43155.7. Samples: 1460911320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:37:18,384][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 08:37:20,684][26599] Updated weights for policy 0, policy_version 316984 (0.0046) [2024-06-19 08:37:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5193547776. Throughput: 0: 42973.4. Samples: 1461166400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:37:23,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 08:37:25,075][26599] Updated weights for policy 0, policy_version 316994 (0.0028) [2024-06-19 08:37:28,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5193760768. Throughput: 0: 42807.9. Samples: 1461419600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:37:28,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 08:37:28,782][26599] Updated weights for policy 0, policy_version 317004 (0.0029) [2024-06-19 08:37:32,571][26599] Updated weights for policy 0, policy_version 317014 (0.0039) [2024-06-19 08:37:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5193957376. Throughput: 0: 42860.1. Samples: 1461549820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:37:33,380][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 08:37:36,349][26599] Updated weights for policy 0, policy_version 317024 (0.0031) [2024-06-19 08:37:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5194203136. Throughput: 0: 42900.5. Samples: 1461809520. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:37:38,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 08:37:38,415][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000317029_5194203136.pth... [2024-06-19 08:37:38,488][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000316400_5183897600.pth [2024-06-19 08:37:40,077][26599] Updated weights for policy 0, policy_version 317034 (0.0038) [2024-06-19 08:37:41,273][26579] Signal inference workers to stop experience collection... (21600 times) [2024-06-19 08:37:41,319][26599] InferenceWorker_p0-w0: stopping experience collection (21600 times) [2024-06-19 08:37:41,328][26579] Signal inference workers to resume experience collection... (21600 times) [2024-06-19 08:37:41,341][26599] InferenceWorker_p0-w0: resuming experience collection (21600 times) [2024-06-19 08:37:43,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5194416128. Throughput: 0: 42883.6. Samples: 1462064420. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:37:43,382][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 08:37:44,054][26599] Updated weights for policy 0, policy_version 317044 (0.0044) [2024-06-19 08:37:47,822][26599] Updated weights for policy 0, policy_version 317054 (0.0032) [2024-06-19 08:37:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5194612736. Throughput: 0: 42722.9. Samples: 1462187300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:37:48,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 08:37:51,849][26599] Updated weights for policy 0, policy_version 317064 (0.0031) [2024-06-19 08:37:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5194825728. Throughput: 0: 42614.2. Samples: 1462441100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:37:53,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 08:37:55,848][26599] Updated weights for policy 0, policy_version 317074 (0.0052) [2024-06-19 08:37:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 5195055104. Throughput: 0: 42445.6. Samples: 1462692240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:37:58,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 08:37:59,332][26599] Updated weights for policy 0, policy_version 317084 (0.0028) [2024-06-19 08:38:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5195251712. Throughput: 0: 42465.0. Samples: 1462822240. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:38:03,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 08:38:03,502][26599] Updated weights for policy 0, policy_version 317094 (0.0040) [2024-06-19 08:38:07,267][26599] Updated weights for policy 0, policy_version 317104 (0.0033) [2024-06-19 08:38:08,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5195481088. Throughput: 0: 42517.4. Samples: 1463079680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:38:08,380][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 08:38:11,062][26599] Updated weights for policy 0, policy_version 317114 (0.0032) [2024-06-19 08:38:13,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5195694080. Throughput: 0: 42544.0. Samples: 1463334080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:38:13,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 08:38:14,863][26599] Updated weights for policy 0, policy_version 317124 (0.0033) [2024-06-19 08:38:18,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5195874304. Throughput: 0: 42469.3. Samples: 1463460940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:38:18,380][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 08:38:19,332][26599] Updated weights for policy 0, policy_version 317134 (0.0033) [2024-06-19 08:38:22,506][26599] Updated weights for policy 0, policy_version 317144 (0.0033) [2024-06-19 08:38:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5196120064. Throughput: 0: 42369.5. Samples: 1463716140. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:38:23,380][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 08:38:26,883][26599] Updated weights for policy 0, policy_version 317154 (0.0028) [2024-06-19 08:38:28,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5196333056. Throughput: 0: 42521.7. Samples: 1463977900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 20.0) [2024-06-19 08:38:28,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 08:38:30,045][26599] Updated weights for policy 0, policy_version 317164 (0.0030) [2024-06-19 08:38:33,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 5196513280. Throughput: 0: 42450.3. Samples: 1464097560. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:38:33,381][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 08:38:34,559][26599] Updated weights for policy 0, policy_version 317174 (0.0042) [2024-06-19 08:38:37,785][26599] Updated weights for policy 0, policy_version 317184 (0.0045) [2024-06-19 08:38:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5196742656. Throughput: 0: 42536.7. Samples: 1464355260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:38:38,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 08:38:42,357][26599] Updated weights for policy 0, policy_version 317194 (0.0031) [2024-06-19 08:38:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5196955648. Throughput: 0: 42613.4. Samples: 1464609840. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:38:43,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 08:38:45,523][26599] Updated weights for policy 0, policy_version 317204 (0.0027) [2024-06-19 08:38:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42765.4). Total num frames: 5197152256. Throughput: 0: 42480.7. Samples: 1464733880. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:38:48,381][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 08:38:50,252][26599] Updated weights for policy 0, policy_version 317214 (0.0032) [2024-06-19 08:38:53,078][26599] Updated weights for policy 0, policy_version 317224 (0.0029) [2024-06-19 08:38:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5197398016. Throughput: 0: 42478.0. Samples: 1464991200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:38:53,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 08:38:54,466][26579] Signal inference workers to stop experience collection... (21650 times) [2024-06-19 08:38:54,500][26599] InferenceWorker_p0-w0: stopping experience collection (21650 times) [2024-06-19 08:38:54,537][26579] Signal inference workers to resume experience collection... (21650 times) [2024-06-19 08:38:54,537][26599] InferenceWorker_p0-w0: resuming experience collection (21650 times) [2024-06-19 08:38:58,032][26599] Updated weights for policy 0, policy_version 317234 (0.0040) [2024-06-19 08:38:58,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5197578240. Throughput: 0: 42601.8. Samples: 1465251160. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:38:58,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 08:39:00,747][26599] Updated weights for policy 0, policy_version 317244 (0.0038) [2024-06-19 08:39:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5197807616. Throughput: 0: 42361.2. Samples: 1465367200. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:39:03,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 08:39:05,702][26599] Updated weights for policy 0, policy_version 317254 (0.0039) [2024-06-19 08:39:08,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5198036992. 
Throughput: 0: 42426.6. Samples: 1465625340. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:39:08,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 08:39:08,482][26599] Updated weights for policy 0, policy_version 317264 (0.0034) [2024-06-19 08:39:13,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 5198200832. Throughput: 0: 42305.9. Samples: 1465881660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:39:13,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 08:39:13,399][26599] Updated weights for policy 0, policy_version 317274 (0.0039) [2024-06-19 08:39:16,579][26599] Updated weights for policy 0, policy_version 317284 (0.0027) [2024-06-19 08:39:18,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5198430208. Throughput: 0: 42294.6. Samples: 1466000820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:39:18,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 08:39:21,015][26599] Updated weights for policy 0, policy_version 317294 (0.0033) [2024-06-19 08:39:23,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5198659584. Throughput: 0: 42316.6. Samples: 1466259500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:39:23,381][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 08:39:24,341][26599] Updated weights for policy 0, policy_version 317304 (0.0035) [2024-06-19 08:39:28,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 5198856192. Throughput: 0: 42512.5. Samples: 1466522900. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:39:28,380][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 08:39:28,858][26599] Updated weights for policy 0, policy_version 317314 (0.0026) [2024-06-19 08:39:32,059][26599] Updated weights for policy 0, policy_version 317324 (0.0037) [2024-06-19 08:39:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42765.6). Total num frames: 5199085568. Throughput: 0: 42490.4. Samples: 1466645940. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:39:33,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 08:39:36,429][26599] Updated weights for policy 0, policy_version 317334 (0.0024) [2024-06-19 08:39:38,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5199298560. Throughput: 0: 42529.8. Samples: 1466905040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:39:38,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 08:39:38,397][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000317340_5199298560.pth... [2024-06-19 08:39:38,451][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000316716_5189074944.pth [2024-06-19 08:39:39,710][26599] Updated weights for policy 0, policy_version 317344 (0.0037) [2024-06-19 08:39:43,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5199495168. Throughput: 0: 42454.1. Samples: 1467161600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 08:39:43,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 08:39:44,137][26599] Updated weights for policy 0, policy_version 317354 (0.0044) [2024-06-19 08:39:47,303][26599] Updated weights for policy 0, policy_version 317364 (0.0030) [2024-06-19 08:39:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5199740928. Throughput: 0: 42668.5. Samples: 1467287280. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:39:48,381][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 08:39:51,604][26599] Updated weights for policy 0, policy_version 317374 (0.0030) [2024-06-19 08:39:53,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42709.8). Total num frames: 5199953920. Throughput: 0: 42887.1. Samples: 1467555260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:39:53,380][26367] Avg episode reward: [(0, '0.420')] [2024-06-19 08:39:54,967][26599] Updated weights for policy 0, policy_version 317384 (0.0034) [2024-06-19 08:39:58,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5200150528. Throughput: 0: 42816.0. Samples: 1467808380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:39:58,381][26367] Avg episode reward: [(0, '0.390')] [2024-06-19 08:39:59,133][26599] Updated weights for policy 0, policy_version 317394 (0.0029) [2024-06-19 08:40:02,531][26599] Updated weights for policy 0, policy_version 317404 (0.0034) [2024-06-19 08:40:02,825][26579] Signal inference workers to stop experience collection... (21700 times) [2024-06-19 08:40:02,832][26579] Signal inference workers to resume experience collection... (21700 times) [2024-06-19 08:40:02,871][26599] InferenceWorker_p0-w0: stopping experience collection (21700 times) [2024-06-19 08:40:02,871][26599] InferenceWorker_p0-w0: resuming experience collection (21700 times) [2024-06-19 08:40:03,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5200396288. Throughput: 0: 42971.7. Samples: 1467934540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:03,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 08:40:06,731][26599] Updated weights for policy 0, policy_version 317414 (0.0037) [2024-06-19 08:40:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5200576512. 
Throughput: 0: 43003.2. Samples: 1468194640. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:08,380][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 08:40:10,070][26599] Updated weights for policy 0, policy_version 317424 (0.0031) [2024-06-19 08:40:13,380][26367] Fps is (10 sec: 39321.6, 60 sec: 43144.5, 300 sec: 42710.0). Total num frames: 5200789504. Throughput: 0: 42766.2. Samples: 1468447380. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:13,384][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 08:40:14,236][26599] Updated weights for policy 0, policy_version 317434 (0.0034) [2024-06-19 08:40:17,793][26599] Updated weights for policy 0, policy_version 317444 (0.0041) [2024-06-19 08:40:18,380][26367] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42654.5). Total num frames: 5201018880. Throughput: 0: 42949.7. Samples: 1468578680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:18,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 08:40:21,963][26599] Updated weights for policy 0, policy_version 317454 (0.0033) [2024-06-19 08:40:23,382][26367] Fps is (10 sec: 42589.9, 60 sec: 42597.0, 300 sec: 42653.7). Total num frames: 5201215488. Throughput: 0: 42759.9. Samples: 1468829320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:23,383][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 08:40:25,539][26599] Updated weights for policy 0, policy_version 317464 (0.0033) [2024-06-19 08:40:28,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5201428480. Throughput: 0: 42701.9. Samples: 1469083340. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:28,385][26367] Avg episode reward: [(0, '0.783')] [2024-06-19 08:40:29,647][26599] Updated weights for policy 0, policy_version 317474 (0.0022) [2024-06-19 08:40:33,099][26599] Updated weights for policy 0, policy_version 317484 (0.0033) [2024-06-19 08:40:33,380][26367] Fps is (10 sec: 44245.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5201657856. Throughput: 0: 42885.0. Samples: 1469217100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:33,381][26367] Avg episode reward: [(0, '0.935')] [2024-06-19 08:40:33,429][26579] Saving new best policy, reward=0.935! [2024-06-19 08:40:37,323][26599] Updated weights for policy 0, policy_version 317494 (0.0032) [2024-06-19 08:40:38,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5201838080. Throughput: 0: 42571.1. Samples: 1469470960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:38,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 08:40:40,803][26599] Updated weights for policy 0, policy_version 317504 (0.0026) [2024-06-19 08:40:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 5202083840. Throughput: 0: 42670.7. Samples: 1469728560. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:43,380][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 08:40:45,129][26599] Updated weights for policy 0, policy_version 317514 (0.0035) [2024-06-19 08:40:48,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5202296832. Throughput: 0: 42747.6. Samples: 1469858180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:48,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 08:40:48,395][26599] Updated weights for policy 0, policy_version 317524 (0.0045) [2024-06-19 08:40:52,687][26599] Updated weights for policy 0, policy_version 317534 (0.0028) [2024-06-19 08:40:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5202493440. Throughput: 0: 42621.8. Samples: 1470112620. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:53,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 08:40:55,929][26599] Updated weights for policy 0, policy_version 317544 (0.0034) [2024-06-19 08:40:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5202706432. Throughput: 0: 42733.0. Samples: 1470370360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 08:40:58,381][26367] Avg episode reward: [(0, '0.891')] [2024-06-19 08:41:00,238][26599] Updated weights for policy 0, policy_version 317554 (0.0038) [2024-06-19 08:41:03,374][26599] Updated weights for policy 0, policy_version 317564 (0.0040) [2024-06-19 08:41:03,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5202968576. Throughput: 0: 42631.7. Samples: 1470497100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:03,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 08:41:08,268][26599] Updated weights for policy 0, policy_version 317574 (0.0038) [2024-06-19 08:41:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5203132416. Throughput: 0: 42755.2. Samples: 1470753220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:08,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 08:41:10,355][26579] Signal inference workers to stop experience collection... 
(21750 times) [2024-06-19 08:41:10,380][26599] InferenceWorker_p0-w0: stopping experience collection (21750 times) [2024-06-19 08:41:10,470][26579] Signal inference workers to resume experience collection... (21750 times) [2024-06-19 08:41:10,470][26599] InferenceWorker_p0-w0: resuming experience collection (21750 times) [2024-06-19 08:41:11,109][26599] Updated weights for policy 0, policy_version 317584 (0.0042) [2024-06-19 08:41:13,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42598.5, 300 sec: 42710.0). Total num frames: 5203345408. Throughput: 0: 42969.4. Samples: 1471016800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:13,380][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 08:41:15,727][26599] Updated weights for policy 0, policy_version 317594 (0.0025) [2024-06-19 08:41:18,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5203591168. Throughput: 0: 42728.8. Samples: 1471139900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:18,381][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 08:41:18,991][26599] Updated weights for policy 0, policy_version 317604 (0.0033) [2024-06-19 08:41:23,258][26599] Updated weights for policy 0, policy_version 317614 (0.0030) [2024-06-19 08:41:23,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42872.9, 300 sec: 42709.5). Total num frames: 5203787776. Throughput: 0: 42853.3. Samples: 1471399360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:23,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 08:41:26,521][26599] Updated weights for policy 0, policy_version 317624 (0.0052) [2024-06-19 08:41:28,380][26367] Fps is (10 sec: 37683.8, 60 sec: 42328.0, 300 sec: 42654.0). Total num frames: 5203968000. Throughput: 0: 42814.7. Samples: 1471655220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:28,380][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 08:41:31,043][26599] Updated weights for policy 0, policy_version 317634 (0.0042) [2024-06-19 08:41:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5204230144. Throughput: 0: 42759.5. Samples: 1471782360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:33,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 08:41:34,130][26599] Updated weights for policy 0, policy_version 317644 (0.0043) [2024-06-19 08:41:38,388][26367] Fps is (10 sec: 44203.9, 60 sec: 42866.2, 300 sec: 42597.3). Total num frames: 5204410368. Throughput: 0: 42775.6. Samples: 1472037840. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:38,388][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 08:41:38,515][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000317653_5204426752.pth... [2024-06-19 08:41:38,567][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000317029_5194203136.pth [2024-06-19 08:41:38,769][26599] Updated weights for policy 0, policy_version 317654 (0.0043) [2024-06-19 08:41:42,311][26599] Updated weights for policy 0, policy_version 317664 (0.0030) [2024-06-19 08:41:43,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5204623360. Throughput: 0: 42723.5. Samples: 1472292920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:43,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 08:41:46,384][26599] Updated weights for policy 0, policy_version 317674 (0.0026) [2024-06-19 08:41:48,384][26367] Fps is (10 sec: 44253.1, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5204852736. Throughput: 0: 42743.6. Samples: 1472420720. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:48,385][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 08:41:49,822][26599] Updated weights for policy 0, policy_version 317684 (0.0035) [2024-06-19 08:41:53,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5205065728. Throughput: 0: 42706.5. Samples: 1472675020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:53,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 08:41:54,040][26599] Updated weights for policy 0, policy_version 317694 (0.0033) [2024-06-19 08:41:57,954][26599] Updated weights for policy 0, policy_version 317704 (0.0034) [2024-06-19 08:41:58,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5205278720. Throughput: 0: 42608.8. Samples: 1472934200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:41:58,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 08:42:01,734][26599] Updated weights for policy 0, policy_version 317714 (0.0035) [2024-06-19 08:42:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5205508096. Throughput: 0: 42866.1. Samples: 1473068880. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:42:03,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 08:42:05,567][26599] Updated weights for policy 0, policy_version 317724 (0.0041) [2024-06-19 08:42:08,383][26367] Fps is (10 sec: 40949.1, 60 sec: 42596.5, 300 sec: 42542.5). Total num frames: 5205688320. Throughput: 0: 42592.6. Samples: 1473316140. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:42:08,384][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 08:42:09,336][26599] Updated weights for policy 0, policy_version 317734 (0.0034) [2024-06-19 08:42:13,163][26599] Updated weights for policy 0, policy_version 317744 (0.0029) [2024-06-19 08:42:13,384][26367] Fps is (10 sec: 42583.4, 60 sec: 43141.9, 300 sec: 42764.5). Total num frames: 5205934080. Throughput: 0: 42697.3. Samples: 1473576760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 08:42:13,384][26367] Avg episode reward: [(0, '0.790')] [2024-06-19 08:42:16,826][26599] Updated weights for policy 0, policy_version 317754 (0.0028) [2024-06-19 08:42:18,380][26367] Fps is (10 sec: 45887.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5206147072. Throughput: 0: 42907.6. Samples: 1473713200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:42:18,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 08:42:20,622][26599] Updated weights for policy 0, policy_version 317764 (0.0026) [2024-06-19 08:42:23,384][26367] Fps is (10 sec: 40960.0, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5206343680. Throughput: 0: 42851.5. Samples: 1473966000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:42:23,385][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 08:42:24,302][26599] Updated weights for policy 0, policy_version 317774 (0.0035) [2024-06-19 08:42:28,083][26599] Updated weights for policy 0, policy_version 317784 (0.0034) [2024-06-19 08:42:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5206573056. Throughput: 0: 42921.8. Samples: 1474224400. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:42:28,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 08:42:32,188][26599] Updated weights for policy 0, policy_version 317794 (0.0044) [2024-06-19 08:42:33,380][26367] Fps is (10 sec: 44253.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5206786048. Throughput: 0: 43007.1. Samples: 1474355880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:42:33,380][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 08:42:35,613][26599] Updated weights for policy 0, policy_version 317804 (0.0041) [2024-06-19 08:42:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43149.7, 300 sec: 42653.9). Total num frames: 5206999040. Throughput: 0: 42953.4. Samples: 1474607920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:42:38,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 08:42:39,814][26599] Updated weights for policy 0, policy_version 317814 (0.0032) [2024-06-19 08:42:43,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5207195648. Throughput: 0: 43002.8. Samples: 1474869320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:42:43,380][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 08:42:43,736][26599] Updated weights for policy 0, policy_version 317824 (0.0037) [2024-06-19 08:42:46,647][26579] Signal inference workers to stop experience collection... (21800 times) [2024-06-19 08:42:46,650][26579] Signal inference workers to resume experience collection... (21800 times) [2024-06-19 08:42:46,670][26599] InferenceWorker_p0-w0: stopping experience collection (21800 times) [2024-06-19 08:42:46,670][26599] InferenceWorker_p0-w0: resuming experience collection (21800 times) [2024-06-19 08:42:47,494][26599] Updated weights for policy 0, policy_version 317834 (0.0038) [2024-06-19 08:42:48,380][26367] Fps is (10 sec: 44237.6, 60 sec: 43147.2, 300 sec: 42765.0). Total num frames: 5207441408. Throughput: 0: 42847.8. 
Samples: 1474997020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:42:48,380][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 08:42:51,394][26599] Updated weights for policy 0, policy_version 317844 (0.0029) [2024-06-19 08:42:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5207638016. Throughput: 0: 43093.7. Samples: 1475255240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:42:53,380][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 08:42:55,125][26599] Updated weights for policy 0, policy_version 317854 (0.0039) [2024-06-19 08:42:58,381][26367] Fps is (10 sec: 40956.5, 60 sec: 42870.9, 300 sec: 42709.4). Total num frames: 5207851008. Throughput: 0: 43027.2. Samples: 1475512860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:42:58,382][26367] Avg episode reward: [(0, '0.812')] [2024-06-19 08:42:59,033][26599] Updated weights for policy 0, policy_version 317864 (0.0039) [2024-06-19 08:43:02,720][26599] Updated weights for policy 0, policy_version 317874 (0.0039) [2024-06-19 08:43:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5208080384. Throughput: 0: 42804.0. Samples: 1475639380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:43:03,380][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 08:43:06,534][26599] Updated weights for policy 0, policy_version 317884 (0.0030) [2024-06-19 08:43:08,380][26367] Fps is (10 sec: 44240.1, 60 sec: 43419.5, 300 sec: 42709.5). Total num frames: 5208293376. Throughput: 0: 42873.2. Samples: 1475895140. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:43:08,381][26367] Avg episode reward: [(0, '0.838')] [2024-06-19 08:43:10,158][26599] Updated weights for policy 0, policy_version 317894 (0.0023) [2024-06-19 08:43:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42327.9, 300 sec: 42709.5). Total num frames: 5208473600. Throughput: 0: 42859.1. 
Samples: 1476153060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:43:13,381][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 08:43:14,302][26599] Updated weights for policy 0, policy_version 317904 (0.0042) [2024-06-19 08:43:17,971][26599] Updated weights for policy 0, policy_version 317914 (0.0033) [2024-06-19 08:43:18,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42868.9, 300 sec: 42708.9). Total num frames: 5208719360. Throughput: 0: 42674.2. Samples: 1476276380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:43:18,385][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 08:43:22,322][26599] Updated weights for policy 0, policy_version 317924 (0.0030) [2024-06-19 08:43:23,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42873.9, 300 sec: 42653.9). Total num frames: 5208915968. Throughput: 0: 42904.3. Samples: 1476538620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:43:23,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-19 08:43:25,559][26599] Updated weights for policy 0, policy_version 317934 (0.0036) [2024-06-19 08:43:28,380][26367] Fps is (10 sec: 40975.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5209128960. Throughput: 0: 42584.3. Samples: 1476785620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 08:43:28,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 08:43:29,874][26599] Updated weights for policy 0, policy_version 317944 (0.0040) [2024-06-19 08:43:33,171][26599] Updated weights for policy 0, policy_version 317954 (0.0037) [2024-06-19 08:43:33,380][26367] Fps is (10 sec: 45876.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5209374720. Throughput: 0: 42636.0. Samples: 1476915640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:43:33,380][26367] Avg episode reward: [(0, '0.319')] [2024-06-19 08:43:37,326][26599] Updated weights for policy 0, policy_version 317964 (0.0036) [2024-06-19 08:43:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5209554944. Throughput: 0: 42727.5. Samples: 1477177980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:43:38,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 08:43:38,399][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000317966_5209554944.pth... [2024-06-19 08:43:38,471][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000317340_5199298560.pth [2024-06-19 08:43:40,717][26599] Updated weights for policy 0, policy_version 317974 (0.0035) [2024-06-19 08:43:43,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5209767936. Throughput: 0: 42581.7. Samples: 1477429000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:43:43,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 08:43:44,831][26599] Updated weights for policy 0, policy_version 317984 (0.0042) [2024-06-19 08:43:48,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5209980928. Throughput: 0: 42654.9. Samples: 1477558860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:43:48,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 08:43:48,757][26599] Updated weights for policy 0, policy_version 317994 (0.0040) [2024-06-19 08:43:52,377][26599] Updated weights for policy 0, policy_version 318004 (0.0033) [2024-06-19 08:43:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5210193920. Throughput: 0: 42766.8. Samples: 1477819640. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:43:53,380][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 08:43:56,452][26599] Updated weights for policy 0, policy_version 318014 (0.0036) [2024-06-19 08:43:58,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42599.0, 300 sec: 42709.5). Total num frames: 5210406912. Throughput: 0: 42625.0. Samples: 1478071180. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:43:58,381][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 08:43:59,984][26599] Updated weights for policy 0, policy_version 318024 (0.0032) [2024-06-19 08:44:03,381][26367] Fps is (10 sec: 44232.6, 60 sec: 42597.8, 300 sec: 42709.4). Total num frames: 5210636288. Throughput: 0: 42783.6. Samples: 1478201520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:44:03,382][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 08:44:04,022][26599] Updated weights for policy 0, policy_version 318034 (0.0034) [2024-06-19 08:44:07,733][26599] Updated weights for policy 0, policy_version 318044 (0.0030) [2024-06-19 08:44:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5210832896. Throughput: 0: 42715.0. Samples: 1478460780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:44:08,380][26367] Avg episode reward: [(0, '0.882')] [2024-06-19 08:44:11,615][26599] Updated weights for policy 0, policy_version 318054 (0.0022) [2024-06-19 08:44:13,380][26367] Fps is (10 sec: 42601.8, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5211062272. Throughput: 0: 42833.3. Samples: 1478713120. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:44:13,381][26367] Avg episode reward: [(0, '0.786')] [2024-06-19 08:44:15,103][26579] Signal inference workers to stop experience collection... 
(21850 times) [2024-06-19 08:44:15,151][26599] InferenceWorker_p0-w0: stopping experience collection (21850 times) [2024-06-19 08:44:15,217][26579] Signal inference workers to resume experience collection... (21850 times) [2024-06-19 08:44:15,218][26599] InferenceWorker_p0-w0: resuming experience collection (21850 times) [2024-06-19 08:44:15,355][26599] Updated weights for policy 0, policy_version 318064 (0.0034) [2024-06-19 08:44:18,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42874.1, 300 sec: 42820.6). Total num frames: 5211291648. Throughput: 0: 42815.0. Samples: 1478842320. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:44:18,381][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 08:44:19,114][26599] Updated weights for policy 0, policy_version 318074 (0.0030) [2024-06-19 08:44:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5211471872. Throughput: 0: 42682.1. Samples: 1479098680. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:44:23,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 08:44:23,541][26599] Updated weights for policy 0, policy_version 318084 (0.0040) [2024-06-19 08:44:27,085][26599] Updated weights for policy 0, policy_version 318094 (0.0027) [2024-06-19 08:44:28,383][26367] Fps is (10 sec: 42587.8, 60 sec: 43142.7, 300 sec: 42820.2). Total num frames: 5211717632. Throughput: 0: 42558.0. Samples: 1479344220. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:44:28,383][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 08:44:31,041][26599] Updated weights for policy 0, policy_version 318104 (0.0033) [2024-06-19 08:44:33,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5211914240. Throughput: 0: 42704.6. Samples: 1479480560. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:44:33,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 08:44:34,631][26599] Updated weights for policy 0, policy_version 318114 (0.0033) [2024-06-19 08:44:38,380][26367] Fps is (10 sec: 39330.6, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 5212110848. Throughput: 0: 42645.5. Samples: 1479738700. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:44:38,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 08:44:38,557][26599] Updated weights for policy 0, policy_version 318124 (0.0038) [2024-06-19 08:44:42,202][26599] Updated weights for policy 0, policy_version 318134 (0.0037) [2024-06-19 08:44:43,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5212356608. Throughput: 0: 42591.2. Samples: 1479987780. Policy #0 lag: (min: 0.0, avg: 9.7, max: 23.0) [2024-06-19 08:44:43,380][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 08:44:46,235][26599] Updated weights for policy 0, policy_version 318144 (0.0031) [2024-06-19 08:44:48,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5212553216. Throughput: 0: 42590.9. Samples: 1480118080. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:44:48,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 08:44:49,833][26599] Updated weights for policy 0, policy_version 318154 (0.0038) [2024-06-19 08:44:53,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5212749824. Throughput: 0: 42632.7. Samples: 1480379260. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:44:53,381][26367] Avg episode reward: [(0, '0.366')] [2024-06-19 08:44:54,154][26599] Updated weights for policy 0, policy_version 318164 (0.0033) [2024-06-19 08:44:57,525][26599] Updated weights for policy 0, policy_version 318174 (0.0032) [2024-06-19 08:44:58,380][26367] Fps is (10 sec: 44237.5, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5212995584. Throughput: 0: 42529.9. Samples: 1480626960. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:44:58,380][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 08:45:02,048][26599] Updated weights for policy 0, policy_version 318184 (0.0037) [2024-06-19 08:45:03,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42599.0, 300 sec: 42765.0). Total num frames: 5213192192. Throughput: 0: 42612.5. Samples: 1480759880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:03,380][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 08:45:05,006][26599] Updated weights for policy 0, policy_version 318194 (0.0035) [2024-06-19 08:45:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5213405184. Throughput: 0: 42686.4. Samples: 1481019560. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:08,381][26367] Avg episode reward: [(0, '0.840')] [2024-06-19 08:45:09,868][26599] Updated weights for policy 0, policy_version 318204 (0.0044) [2024-06-19 08:45:12,892][26599] Updated weights for policy 0, policy_version 318214 (0.0031) [2024-06-19 08:45:13,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5213634560. Throughput: 0: 42745.0. Samples: 1481267640. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:13,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 08:45:17,482][26599] Updated weights for policy 0, policy_version 318224 (0.0036) [2024-06-19 08:45:18,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42765.3). Total num frames: 5213831168. Throughput: 0: 42737.6. Samples: 1481403760. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:18,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 08:45:20,411][26599] Updated weights for policy 0, policy_version 318234 (0.0035) [2024-06-19 08:45:23,384][26367] Fps is (10 sec: 40945.4, 60 sec: 42869.0, 300 sec: 42765.0). Total num frames: 5214044160. Throughput: 0: 42608.3. Samples: 1481656220. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:23,384][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 08:45:25,109][26599] Updated weights for policy 0, policy_version 318244 (0.0038) [2024-06-19 08:45:28,048][26599] Updated weights for policy 0, policy_version 318254 (0.0053) [2024-06-19 08:45:28,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42600.2, 300 sec: 42765.0). Total num frames: 5214273536. Throughput: 0: 42705.6. Samples: 1481909540. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:28,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 08:45:32,568][26599] Updated weights for policy 0, policy_version 318264 (0.0032) [2024-06-19 08:45:33,380][26367] Fps is (10 sec: 42613.8, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 5214470144. Throughput: 0: 42700.9. Samples: 1482039620. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:33,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 08:45:35,959][26599] Updated weights for policy 0, policy_version 318274 (0.0036) [2024-06-19 08:45:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5214683136. Throughput: 0: 42536.5. Samples: 1482293400. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:38,381][26367] Avg episode reward: [(0, '0.432')] [2024-06-19 08:45:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000318280_5214699520.pth... [2024-06-19 08:45:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000317653_5204426752.pth [2024-06-19 08:45:40,228][26599] Updated weights for policy 0, policy_version 318284 (0.0045) [2024-06-19 08:45:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5214912512. Throughput: 0: 42665.2. Samples: 1482546900. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:43,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 08:45:43,597][26599] Updated weights for policy 0, policy_version 318294 (0.0035) [2024-06-19 08:45:47,807][26599] Updated weights for policy 0, policy_version 318304 (0.0031) [2024-06-19 08:45:48,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 5215092736. Throughput: 0: 42625.0. Samples: 1482678000. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:48,381][26367] Avg episode reward: [(0, '0.750')] [2024-06-19 08:45:51,184][26599] Updated weights for policy 0, policy_version 318314 (0.0038) [2024-06-19 08:45:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5215322112. Throughput: 0: 42480.8. Samples: 1482931200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:53,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 08:45:55,706][26599] Updated weights for policy 0, policy_version 318324 (0.0026) [2024-06-19 08:45:58,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5215551488. Throughput: 0: 42742.2. Samples: 1483191040. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 21.0) [2024-06-19 08:45:58,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 08:45:58,993][26599] Updated weights for policy 0, policy_version 318334 (0.0038) [2024-06-19 08:46:03,183][26599] Updated weights for policy 0, policy_version 318344 (0.0038) [2024-06-19 08:46:03,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5215748096. Throughput: 0: 42641.6. Samples: 1483322620. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:03,380][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 08:46:04,103][26579] Signal inference workers to stop experience collection... (21900 times) [2024-06-19 08:46:04,162][26599] InferenceWorker_p0-w0: stopping experience collection (21900 times) [2024-06-19 08:46:04,163][26579] Signal inference workers to resume experience collection... (21900 times) [2024-06-19 08:46:04,174][26599] InferenceWorker_p0-w0: resuming experience collection (21900 times) [2024-06-19 08:46:06,798][26599] Updated weights for policy 0, policy_version 318354 (0.0034) [2024-06-19 08:46:08,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5215977472. Throughput: 0: 42711.5. Samples: 1483578080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:08,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 08:46:11,176][26599] Updated weights for policy 0, policy_version 318364 (0.0035) [2024-06-19 08:46:13,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5216190464. Throughput: 0: 42790.6. Samples: 1483835120. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:13,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 08:46:14,364][26599] Updated weights for policy 0, policy_version 318374 (0.0038) [2024-06-19 08:46:18,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.5, 300 sec: 42654.0). Total num frames: 5216370688. 
Throughput: 0: 42791.2. Samples: 1483965220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:18,380][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 08:46:18,699][26599] Updated weights for policy 0, policy_version 318384 (0.0033) [2024-06-19 08:46:21,920][26599] Updated weights for policy 0, policy_version 318394 (0.0036) [2024-06-19 08:46:23,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42874.1, 300 sec: 42876.1). Total num frames: 5216616448. Throughput: 0: 42750.8. Samples: 1484217180. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:23,380][26367] Avg episode reward: [(0, '0.351')] [2024-06-19 08:46:26,436][26599] Updated weights for policy 0, policy_version 318404 (0.0038) [2024-06-19 08:46:28,383][26367] Fps is (10 sec: 47499.6, 60 sec: 42869.4, 300 sec: 42764.6). Total num frames: 5216845824. Throughput: 0: 42824.0. Samples: 1484474100. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:28,392][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 08:46:29,423][26599] Updated weights for policy 0, policy_version 318414 (0.0030) [2024-06-19 08:46:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42766.1). Total num frames: 5217026048. Throughput: 0: 42861.3. Samples: 1484606760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:33,389][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 08:46:34,050][26599] Updated weights for policy 0, policy_version 318424 (0.0035) [2024-06-19 08:46:37,077][26599] Updated weights for policy 0, policy_version 318434 (0.0042) [2024-06-19 08:46:38,380][26367] Fps is (10 sec: 39333.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5217239040. Throughput: 0: 42747.2. Samples: 1484854820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:38,380][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 08:46:41,726][26599] Updated weights for policy 0, policy_version 318444 (0.0033) [2024-06-19 08:46:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42765.6). Total num frames: 5217468416. Throughput: 0: 42811.2. Samples: 1485117540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:43,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 08:46:44,651][26599] Updated weights for policy 0, policy_version 318454 (0.0028) [2024-06-19 08:46:48,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5217665024. Throughput: 0: 42683.0. Samples: 1485243360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:48,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 08:46:49,379][26599] Updated weights for policy 0, policy_version 318464 (0.0047) [2024-06-19 08:46:53,038][26599] Updated weights for policy 0, policy_version 318474 (0.0038) [2024-06-19 08:46:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5217878016. Throughput: 0: 42496.9. Samples: 1485490440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:53,381][26367] Avg episode reward: [(0, '0.778')] [2024-06-19 08:46:57,012][26599] Updated weights for policy 0, policy_version 318484 (0.0036) [2024-06-19 08:46:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5218107392. Throughput: 0: 42725.1. Samples: 1485757740. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:46:58,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 08:47:00,634][26599] Updated weights for policy 0, policy_version 318494 (0.0059) [2024-06-19 08:47:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42765.4). Total num frames: 5218304000. Throughput: 0: 42599.9. Samples: 1485882220. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:47:03,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 08:47:04,732][26599] Updated weights for policy 0, policy_version 318504 (0.0038) [2024-06-19 08:47:08,363][26599] Updated weights for policy 0, policy_version 318514 (0.0044) [2024-06-19 08:47:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42710.0). Total num frames: 5218533376. Throughput: 0: 42452.9. Samples: 1486127560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:47:08,380][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 08:47:12,628][26599] Updated weights for policy 0, policy_version 318524 (0.0040) [2024-06-19 08:47:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5218729984. Throughput: 0: 42487.6. Samples: 1486385920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 08:47:13,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 08:47:16,002][26599] Updated weights for policy 0, policy_version 318534 (0.0034) [2024-06-19 08:47:18,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42654.5). Total num frames: 5218926592. Throughput: 0: 42326.5. Samples: 1486511460. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:47:18,381][26367] Avg episode reward: [(0, '0.805')] [2024-06-19 08:47:20,269][26599] Updated weights for policy 0, policy_version 318544 (0.0041) [2024-06-19 08:47:23,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5219155968. Throughput: 0: 42363.4. Samples: 1486761180. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:47:23,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 08:47:23,775][26599] Updated weights for policy 0, policy_version 318554 (0.0031) [2024-06-19 08:47:28,037][26599] Updated weights for policy 0, policy_version 318564 (0.0037) [2024-06-19 08:47:28,380][26367] Fps is (10 sec: 42597.8, 60 sec: 41781.1, 300 sec: 42598.4). Total num frames: 5219352576. Throughput: 0: 42358.4. Samples: 1487023680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:47:28,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 08:47:29,609][26579] Signal inference workers to stop experience collection... (21950 times) [2024-06-19 08:47:29,652][26599] InferenceWorker_p0-w0: stopping experience collection (21950 times) [2024-06-19 08:47:29,663][26579] Signal inference workers to resume experience collection... (21950 times) [2024-06-19 08:47:29,668][26599] InferenceWorker_p0-w0: resuming experience collection (21950 times) [2024-06-19 08:47:31,389][26599] Updated weights for policy 0, policy_version 318574 (0.0035) [2024-06-19 08:47:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5219581952. Throughput: 0: 42328.0. Samples: 1487148120. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:47:33,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 08:47:35,717][26599] Updated weights for policy 0, policy_version 318584 (0.0035) [2024-06-19 08:47:38,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.2, 300 sec: 42709.4). Total num frames: 5219794944. Throughput: 0: 42546.9. Samples: 1487405060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:47:38,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 08:47:38,386][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000318591_5219794944.pth... 
[2024-06-19 08:47:38,446][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000317966_5209554944.pth [2024-06-19 08:47:39,051][26599] Updated weights for policy 0, policy_version 318594 (0.0029) [2024-06-19 08:47:43,362][26599] Updated weights for policy 0, policy_version 318604 (0.0033) [2024-06-19 08:47:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5220007936. Throughput: 0: 42366.6. Samples: 1487664240. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:47:43,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 08:47:46,917][26599] Updated weights for policy 0, policy_version 318614 (0.0043) [2024-06-19 08:47:48,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5220220928. Throughput: 0: 42440.0. Samples: 1487792020. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:47:48,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 08:47:51,100][26599] Updated weights for policy 0, policy_version 318624 (0.0037) [2024-06-19 08:47:53,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 5220433920. Throughput: 0: 42631.1. Samples: 1488045960. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:47:53,380][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 08:47:54,529][26599] Updated weights for policy 0, policy_version 318634 (0.0030) [2024-06-19 08:47:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5220630528. Throughput: 0: 42574.2. Samples: 1488301760. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:47:58,381][26367] Avg episode reward: [(0, '0.823')] [2024-06-19 08:47:58,684][26599] Updated weights for policy 0, policy_version 318644 (0.0030) [2024-06-19 08:48:02,139][26599] Updated weights for policy 0, policy_version 318654 (0.0040) [2024-06-19 08:48:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5220876288. Throughput: 0: 42608.6. Samples: 1488428840. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:48:03,380][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 08:48:06,182][26599] Updated weights for policy 0, policy_version 318664 (0.0035) [2024-06-19 08:48:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 5221056512. Throughput: 0: 42844.2. Samples: 1488689160. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:48:08,380][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 08:48:10,055][26599] Updated weights for policy 0, policy_version 318674 (0.0035) [2024-06-19 08:48:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 5221285888. Throughput: 0: 42464.2. Samples: 1488934560. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:48:13,380][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 08:48:13,777][26599] Updated weights for policy 0, policy_version 318684 (0.0040) [2024-06-19 08:48:17,681][26599] Updated weights for policy 0, policy_version 318694 (0.0032) [2024-06-19 08:48:18,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5221498880. Throughput: 0: 42644.8. Samples: 1489067140. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:48:18,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 08:48:21,420][26599] Updated weights for policy 0, policy_version 318704 (0.0035) [2024-06-19 08:48:23,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.3, 300 sec: 42542.8). Total num frames: 5221679104. Throughput: 0: 42600.0. Samples: 1489322060. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:48:23,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 08:48:25,443][26599] Updated weights for policy 0, policy_version 318714 (0.0038) [2024-06-19 08:48:28,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42542.8). Total num frames: 5221924864. Throughput: 0: 42543.0. Samples: 1489578680. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 08:48:28,381][26367] Avg episode reward: [(0, '0.820')] [2024-06-19 08:48:29,122][26599] Updated weights for policy 0, policy_version 318724 (0.0034) [2024-06-19 08:48:32,840][26599] Updated weights for policy 0, policy_version 318734 (0.0029) [2024-06-19 08:48:33,380][26367] Fps is (10 sec: 47514.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5222154240. Throughput: 0: 42700.0. Samples: 1489713520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:48:33,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 08:48:36,611][26599] Updated weights for policy 0, policy_version 318744 (0.0038) [2024-06-19 08:48:38,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 5222318080. Throughput: 0: 42656.8. Samples: 1489965520. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:48:38,380][26367] Avg episode reward: [(0, '0.348')] [2024-06-19 08:48:40,361][26599] Updated weights for policy 0, policy_version 318754 (0.0041) [2024-06-19 08:48:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5222580224. Throughput: 0: 42800.9. Samples: 1490227800. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:48:43,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 08:48:44,122][26599] Updated weights for policy 0, policy_version 318764 (0.0037) [2024-06-19 08:48:48,327][26599] Updated weights for policy 0, policy_version 318774 (0.0034) [2024-06-19 08:48:48,380][26367] Fps is (10 sec: 47513.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5222793216. Throughput: 0: 42866.6. Samples: 1490357840. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:48:48,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 08:48:49,274][26579] Signal inference workers to stop experience collection... (22000 times) [2024-06-19 08:48:49,323][26599] InferenceWorker_p0-w0: stopping experience collection (22000 times) [2024-06-19 08:48:49,323][26579] Signal inference workers to resume experience collection... (22000 times) [2024-06-19 08:48:49,340][26599] InferenceWorker_p0-w0: resuming experience collection (22000 times) [2024-06-19 08:48:51,961][26599] Updated weights for policy 0, policy_version 318784 (0.0031) [2024-06-19 08:48:53,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5222973440. Throughput: 0: 42649.3. Samples: 1490608380. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:48:53,380][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 08:48:55,988][26599] Updated weights for policy 0, policy_version 318794 (0.0021) [2024-06-19 08:48:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42654.1). Total num frames: 5223219200. Throughput: 0: 42913.4. Samples: 1490865660. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:48:58,380][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 08:48:59,629][26599] Updated weights for policy 0, policy_version 318804 (0.0038) [2024-06-19 08:49:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5223415808. Throughput: 0: 42979.7. 
Samples: 1491001220. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:49:03,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 08:49:03,610][26599] Updated weights for policy 0, policy_version 318814 (0.0027) [2024-06-19 08:49:08,079][26599] Updated weights for policy 0, policy_version 318824 (0.0029) [2024-06-19 08:49:08,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5223612416. Throughput: 0: 42871.7. Samples: 1491251280. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:49:08,380][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 08:49:11,257][26599] Updated weights for policy 0, policy_version 318834 (0.0036) [2024-06-19 08:49:13,380][26367] Fps is (10 sec: 45875.1, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5223874560. Throughput: 0: 42846.7. Samples: 1491506780. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:49:13,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 08:49:15,651][26599] Updated weights for policy 0, policy_version 318844 (0.0041) [2024-06-19 08:49:18,384][26367] Fps is (10 sec: 45858.2, 60 sec: 42869.0, 300 sec: 42709.0). Total num frames: 5224071168. Throughput: 0: 42878.8. Samples: 1491643220. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:49:18,384][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 08:49:18,999][26599] Updated weights for policy 0, policy_version 318854 (0.0035) [2024-06-19 08:49:23,203][26599] Updated weights for policy 0, policy_version 318864 (0.0030) [2024-06-19 08:49:23,384][26367] Fps is (10 sec: 39307.2, 60 sec: 43142.0, 300 sec: 42542.7). Total num frames: 5224267776. Throughput: 0: 42824.5. Samples: 1491892780. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:49:23,384][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 08:49:26,525][26599] Updated weights for policy 0, policy_version 318874 (0.0035) [2024-06-19 08:49:28,380][26367] Fps is (10 sec: 44253.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5224513536. Throughput: 0: 42681.4. Samples: 1492148460. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:49:28,380][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 08:49:30,703][26599] Updated weights for policy 0, policy_version 318884 (0.0032) [2024-06-19 08:49:33,380][26367] Fps is (10 sec: 44252.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5224710144. Throughput: 0: 42815.9. Samples: 1492284560. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:49:33,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 08:49:34,024][26599] Updated weights for policy 0, policy_version 318894 (0.0027) [2024-06-19 08:49:38,169][26599] Updated weights for policy 0, policy_version 318904 (0.0034) [2024-06-19 08:49:38,380][26367] Fps is (10 sec: 40959.1, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 5224923136. Throughput: 0: 42879.8. Samples: 1492537980. Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:49:38,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 08:49:38,400][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000318904_5224923136.pth... [2024-06-19 08:49:38,458][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000318280_5214699520.pth [2024-06-19 08:49:41,825][26599] Updated weights for policy 0, policy_version 318914 (0.0036) [2024-06-19 08:49:43,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5225152512. Throughput: 0: 42810.2. Samples: 1492792120. 
Policy #0 lag: (min: 0.0, avg: 7.9, max: 21.0) [2024-06-19 08:49:43,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 08:49:45,865][26599] Updated weights for policy 0, policy_version 318924 (0.0049) [2024-06-19 08:49:48,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5225349120. Throughput: 0: 42719.5. Samples: 1492923600. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:49:48,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 08:49:49,546][26599] Updated weights for policy 0, policy_version 318934 (0.0047) [2024-06-19 08:49:53,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 5225545728. Throughput: 0: 42755.4. Samples: 1493175280. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:49:53,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 08:49:53,758][26599] Updated weights for policy 0, policy_version 318944 (0.0027) [2024-06-19 08:49:57,131][26599] Updated weights for policy 0, policy_version 318954 (0.0031) [2024-06-19 08:49:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5225791488. Throughput: 0: 42758.1. Samples: 1493430900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:49:58,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 08:50:01,623][26599] Updated weights for policy 0, policy_version 318964 (0.0035) [2024-06-19 08:50:03,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5225988096. Throughput: 0: 42619.1. Samples: 1493560920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:03,380][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 08:50:05,054][26599] Updated weights for policy 0, policy_version 318974 (0.0030) [2024-06-19 08:50:08,384][26367] Fps is (10 sec: 40945.4, 60 sec: 43141.8, 300 sec: 42597.9). Total num frames: 5226201088. Throughput: 0: 42714.6. Samples: 1493814940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:08,384][26367] Avg episode reward: [(0, '0.816')] [2024-06-19 08:50:09,242][26599] Updated weights for policy 0, policy_version 318984 (0.0029) [2024-06-19 08:50:12,537][26599] Updated weights for policy 0, policy_version 318994 (0.0037) [2024-06-19 08:50:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5226430464. Throughput: 0: 42762.2. Samples: 1494072760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:13,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 08:50:16,909][26599] Updated weights for policy 0, policy_version 319004 (0.0037) [2024-06-19 08:50:18,301][26579] Signal inference workers to stop experience collection... (22050 times) [2024-06-19 08:50:18,344][26599] InferenceWorker_p0-w0: stopping experience collection (22050 times) [2024-06-19 08:50:18,348][26579] Signal inference workers to resume experience collection... (22050 times) [2024-06-19 08:50:18,357][26599] InferenceWorker_p0-w0: resuming experience collection (22050 times) [2024-06-19 08:50:18,380][26367] Fps is (10 sec: 42614.5, 60 sec: 42601.0, 300 sec: 42654.5). Total num frames: 5226627072. Throughput: 0: 42648.6. Samples: 1494203740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:18,380][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 08:50:20,135][26599] Updated weights for policy 0, policy_version 319014 (0.0033) [2024-06-19 08:50:23,380][26367] Fps is (10 sec: 42597.8, 60 sec: 43147.1, 300 sec: 42653.9). Total num frames: 5226856448. Throughput: 0: 42575.6. Samples: 1494453880. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:23,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 08:50:24,701][26599] Updated weights for policy 0, policy_version 319024 (0.0034) [2024-06-19 08:50:27,638][26599] Updated weights for policy 0, policy_version 319034 (0.0032) [2024-06-19 08:50:28,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5227085824. Throughput: 0: 42736.0. Samples: 1494715240. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:28,380][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 08:50:32,339][26599] Updated weights for policy 0, policy_version 319044 (0.0034) [2024-06-19 08:50:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5227266048. Throughput: 0: 42792.3. Samples: 1494849260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:33,381][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 08:50:35,223][26599] Updated weights for policy 0, policy_version 319054 (0.0030) [2024-06-19 08:50:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5227495424. Throughput: 0: 42877.8. Samples: 1495104780. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:38,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 08:50:40,025][26599] Updated weights for policy 0, policy_version 319064 (0.0022) [2024-06-19 08:50:42,824][26599] Updated weights for policy 0, policy_version 319074 (0.0026) [2024-06-19 08:50:43,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5227724800. Throughput: 0: 42689.8. Samples: 1495351940. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:43,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 08:50:47,526][26599] Updated weights for policy 0, policy_version 319084 (0.0042) [2024-06-19 08:50:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5227905024. Throughput: 0: 42800.7. Samples: 1495486960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:48,388][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 08:50:50,397][26599] Updated weights for policy 0, policy_version 319094 (0.0040) [2024-06-19 08:50:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5228134400. Throughput: 0: 42857.3. Samples: 1495743360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:53,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 08:50:55,127][26599] Updated weights for policy 0, policy_version 319104 (0.0037) [2024-06-19 08:50:58,023][26599] Updated weights for policy 0, policy_version 319114 (0.0029) [2024-06-19 08:50:58,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5228363776. Throughput: 0: 42675.1. Samples: 1495993140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 08:50:58,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 08:51:03,031][26599] Updated weights for policy 0, policy_version 319124 (0.0024) [2024-06-19 08:51:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5228544000. Throughput: 0: 42690.6. Samples: 1496124820. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:03,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 08:51:05,736][26599] Updated weights for policy 0, policy_version 319134 (0.0033) [2024-06-19 08:51:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42874.1, 300 sec: 42654.0). Total num frames: 5228773376. Throughput: 0: 42689.9. Samples: 1496374920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:08,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 08:51:10,521][26599] Updated weights for policy 0, policy_version 319144 (0.0038) [2024-06-19 08:51:13,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5229002752. Throughput: 0: 42608.8. Samples: 1496632640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:13,392][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 08:51:13,651][26599] Updated weights for policy 0, policy_version 319154 (0.0042) [2024-06-19 08:51:18,128][26599] Updated weights for policy 0, policy_version 319164 (0.0036) [2024-06-19 08:51:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5229182976. Throughput: 0: 42541.8. Samples: 1496763640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:18,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 08:51:21,375][26599] Updated weights for policy 0, policy_version 319174 (0.0050) [2024-06-19 08:51:23,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42543.3). Total num frames: 5229395968. Throughput: 0: 42430.7. Samples: 1497014160. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:23,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 08:51:25,808][26599] Updated weights for policy 0, policy_version 319184 (0.0041) [2024-06-19 08:51:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5229608960. Throughput: 0: 42524.4. Samples: 1497265540. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:28,384][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 08:51:29,441][26599] Updated weights for policy 0, policy_version 319194 (0.0034) [2024-06-19 08:51:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5229821952. Throughput: 0: 42377.9. Samples: 1497393960. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:33,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 08:51:33,724][26599] Updated weights for policy 0, policy_version 319204 (0.0039) [2024-06-19 08:51:37,216][26599] Updated weights for policy 0, policy_version 319214 (0.0040) [2024-06-19 08:51:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5230034944. Throughput: 0: 42271.1. Samples: 1497645560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:38,381][26367] Avg episode reward: [(0, '0.797')] [2024-06-19 08:51:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000319216_5230034944.pth... [2024-06-19 08:51:38,449][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000318591_5219794944.pth [2024-06-19 08:51:41,817][26599] Updated weights for policy 0, policy_version 319224 (0.0034) [2024-06-19 08:51:43,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5230247936. Throughput: 0: 42400.8. Samples: 1497901180. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:43,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 08:51:44,921][26599] Updated weights for policy 0, policy_version 319234 (0.0057) [2024-06-19 08:51:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5230460928. Throughput: 0: 42139.9. Samples: 1498021120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:48,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 08:51:49,445][26579] Signal inference workers to stop experience collection... (22100 times) [2024-06-19 08:51:49,445][26579] Signal inference workers to resume experience collection... 
(22100 times) [2024-06-19 08:51:49,449][26599] Updated weights for policy 0, policy_version 319244 (0.0034) [2024-06-19 08:51:49,469][26599] InferenceWorker_p0-w0: stopping experience collection (22100 times) [2024-06-19 08:51:49,469][26599] InferenceWorker_p0-w0: resuming experience collection (22100 times) [2024-06-19 08:51:52,761][26599] Updated weights for policy 0, policy_version 319254 (0.0038) [2024-06-19 08:51:53,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5230673920. Throughput: 0: 42371.7. Samples: 1498281640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:53,380][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 08:51:57,031][26599] Updated weights for policy 0, policy_version 319264 (0.0039) [2024-06-19 08:51:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 5230870528. Throughput: 0: 42259.2. Samples: 1498534300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:51:58,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 08:52:00,480][26599] Updated weights for policy 0, policy_version 319274 (0.0036) [2024-06-19 08:52:03,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5231099904. Throughput: 0: 42161.1. Samples: 1498660880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:52:03,380][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 08:52:04,496][26599] Updated weights for policy 0, policy_version 319284 (0.0033) [2024-06-19 08:52:08,298][26599] Updated weights for policy 0, policy_version 319294 (0.0037) [2024-06-19 08:52:08,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5231312896. Throughput: 0: 42347.2. Samples: 1498919780. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:52:08,380][26367] Avg episode reward: [(0, '0.756')] [2024-06-19 08:52:12,141][26599] Updated weights for policy 0, policy_version 319304 (0.0035) [2024-06-19 08:52:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 5231509504. Throughput: 0: 42349.4. Samples: 1499171260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 21.0) [2024-06-19 08:52:13,380][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 08:52:16,341][26599] Updated weights for policy 0, policy_version 319314 (0.0036) [2024-06-19 08:52:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5231738880. Throughput: 0: 42359.9. Samples: 1499300160. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:52:18,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 08:52:20,119][26599] Updated weights for policy 0, policy_version 319324 (0.0034) [2024-06-19 08:52:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5231935488. Throughput: 0: 42357.4. Samples: 1499551640. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:52:23,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 08:52:23,913][26599] Updated weights for policy 0, policy_version 319334 (0.0031) [2024-06-19 08:52:27,729][26599] Updated weights for policy 0, policy_version 319344 (0.0037) [2024-06-19 08:52:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5232148480. Throughput: 0: 42377.9. Samples: 1499808180. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:52:28,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 08:52:31,560][26599] Updated weights for policy 0, policy_version 319354 (0.0028) [2024-06-19 08:52:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5232377856. Throughput: 0: 42632.1. Samples: 1499939560. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:52:33,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 08:52:35,242][26599] Updated weights for policy 0, policy_version 319364 (0.0027) [2024-06-19 08:52:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5232574464. Throughput: 0: 42495.5. Samples: 1500193940. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:52:38,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 08:52:39,134][26599] Updated weights for policy 0, policy_version 319374 (0.0036) [2024-06-19 08:52:42,734][26599] Updated weights for policy 0, policy_version 319384 (0.0040) [2024-06-19 08:52:43,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5232803840. Throughput: 0: 42599.9. Samples: 1500451300. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:52:43,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 08:52:46,803][26599] Updated weights for policy 0, policy_version 319394 (0.0041) [2024-06-19 08:52:48,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5233000448. Throughput: 0: 42657.2. Samples: 1500580460. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:52:48,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 08:52:50,266][26599] Updated weights for policy 0, policy_version 319404 (0.0027) [2024-06-19 08:52:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5233213440. Throughput: 0: 42460.8. Samples: 1500830520. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:52:53,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 08:52:54,630][26599] Updated weights for policy 0, policy_version 319414 (0.0050) [2024-06-19 08:52:57,853][26599] Updated weights for policy 0, policy_version 319424 (0.0029) [2024-06-19 08:52:58,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5233442816. Throughput: 0: 42403.2. Samples: 1501079400. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:52:58,380][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 08:53:02,332][26599] Updated weights for policy 0, policy_version 319434 (0.0049) [2024-06-19 08:53:03,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42052.1, 300 sec: 42598.4). Total num frames: 5233623040. Throughput: 0: 42501.7. Samples: 1501212740. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:53:03,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 08:53:05,649][26599] Updated weights for policy 0, policy_version 319444 (0.0028) [2024-06-19 08:53:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5233868800. Throughput: 0: 42489.3. Samples: 1501463660. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:53:08,380][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 08:53:09,736][26579] Signal inference workers to stop experience collection... (22150 times) [2024-06-19 08:53:09,737][26579] Signal inference workers to resume experience collection... 
(22150 times) [2024-06-19 08:53:09,788][26599] InferenceWorker_p0-w0: stopping experience collection (22150 times) [2024-06-19 08:53:09,788][26599] InferenceWorker_p0-w0: resuming experience collection (22150 times) [2024-06-19 08:53:09,878][26599] Updated weights for policy 0, policy_version 319454 (0.0027) [2024-06-19 08:53:13,378][26599] Updated weights for policy 0, policy_version 319464 (0.0037) [2024-06-19 08:53:13,380][26367] Fps is (10 sec: 47514.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5234098176. Throughput: 0: 42498.8. Samples: 1501720620. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:53:13,380][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 08:53:17,977][26599] Updated weights for policy 0, policy_version 319474 (0.0033) [2024-06-19 08:53:18,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 5234262016. Throughput: 0: 42348.5. Samples: 1501845240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:53:18,380][26367] Avg episode reward: [(0, '0.745')] [2024-06-19 08:53:21,295][26599] Updated weights for policy 0, policy_version 319484 (0.0032) [2024-06-19 08:53:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5234507776. Throughput: 0: 42380.1. Samples: 1502101040. Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:53:23,380][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 08:53:25,623][26599] Updated weights for policy 0, policy_version 319494 (0.0030) [2024-06-19 08:53:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5234704384. Throughput: 0: 42624.1. Samples: 1502369380. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 20.0) [2024-06-19 08:53:28,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 08:53:28,805][26599] Updated weights for policy 0, policy_version 319504 (0.0027) [2024-06-19 08:53:33,352][26599] Updated weights for policy 0, policy_version 319514 (0.0041) [2024-06-19 08:53:33,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5234917376. Throughput: 0: 42336.5. Samples: 1502485600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:53:33,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 08:53:36,447][26599] Updated weights for policy 0, policy_version 319524 (0.0039) [2024-06-19 08:53:38,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5235146752. Throughput: 0: 42510.3. Samples: 1502743480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:53:38,380][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 08:53:38,458][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000319529_5235163136.pth... [2024-06-19 08:53:38,504][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000318904_5224923136.pth [2024-06-19 08:53:40,847][26599] Updated weights for policy 0, policy_version 319534 (0.0029) [2024-06-19 08:53:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 5235326976. Throughput: 0: 42894.1. Samples: 1503009640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:53:43,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 08:53:44,086][26599] Updated weights for policy 0, policy_version 319544 (0.0048) [2024-06-19 08:53:48,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5235556352. Throughput: 0: 42660.9. Samples: 1503132480. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:53:48,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 08:53:48,787][26599] Updated weights for policy 0, policy_version 319554 (0.0023) [2024-06-19 08:53:51,870][26599] Updated weights for policy 0, policy_version 319564 (0.0045) [2024-06-19 08:53:53,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5235785728. Throughput: 0: 42718.6. Samples: 1503386000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:53:53,382][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 08:53:56,266][26599] Updated weights for policy 0, policy_version 319574 (0.0050) [2024-06-19 08:53:58,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5235965952. Throughput: 0: 42924.4. Samples: 1503652220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:53:58,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 08:53:59,393][26599] Updated weights for policy 0, policy_version 319584 (0.0033) [2024-06-19 08:54:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5236195328. Throughput: 0: 42960.8. Samples: 1503778480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:54:03,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 08:54:03,658][26599] Updated weights for policy 0, policy_version 319594 (0.0026) [2024-06-19 08:54:07,276][26599] Updated weights for policy 0, policy_version 319604 (0.0043) [2024-06-19 08:54:08,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5236441088. Throughput: 0: 42990.1. Samples: 1504035600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:54:08,381][26367] Avg episode reward: [(0, '0.374')] [2024-06-19 08:54:11,235][26599] Updated weights for policy 0, policy_version 319614 (0.0029) [2024-06-19 08:54:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42052.2, 300 sec: 42543.4). Total num frames: 5236621312. Throughput: 0: 42816.4. Samples: 1504296120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:54:13,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 08:54:15,160][26599] Updated weights for policy 0, policy_version 319624 (0.0039) [2024-06-19 08:54:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42654.5). Total num frames: 5236850688. Throughput: 0: 42893.4. Samples: 1504415800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:54:18,381][26367] Avg episode reward: [(0, '0.817')] [2024-06-19 08:54:18,974][26599] Updated weights for policy 0, policy_version 319634 (0.0043) [2024-06-19 08:54:21,784][26579] Signal inference workers to stop experience collection... (22200 times) [2024-06-19 08:54:21,785][26579] Signal inference workers to resume experience collection... (22200 times) [2024-06-19 08:54:21,832][26599] InferenceWorker_p0-w0: stopping experience collection (22200 times) [2024-06-19 08:54:21,832][26599] InferenceWorker_p0-w0: resuming experience collection (22200 times) [2024-06-19 08:54:22,715][26599] Updated weights for policy 0, policy_version 319644 (0.0035) [2024-06-19 08:54:23,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5237063680. Throughput: 0: 43028.8. Samples: 1504679780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:54:23,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 08:54:26,718][26599] Updated weights for policy 0, policy_version 319654 (0.0037) [2024-06-19 08:54:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5237260288. 
Throughput: 0: 42896.0. Samples: 1504939960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:54:28,381][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 08:54:30,302][26599] Updated weights for policy 0, policy_version 319664 (0.0030) [2024-06-19 08:54:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5237506048. Throughput: 0: 42980.5. Samples: 1505066600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:54:33,381][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 08:54:34,180][26599] Updated weights for policy 0, policy_version 319674 (0.0038) [2024-06-19 08:54:37,923][26599] Updated weights for policy 0, policy_version 319684 (0.0029) [2024-06-19 08:54:38,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5237719040. Throughput: 0: 43228.1. Samples: 1505331260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:54:38,381][26367] Avg episode reward: [(0, '0.784')] [2024-06-19 08:54:41,730][26599] Updated weights for policy 0, policy_version 319694 (0.0038) [2024-06-19 08:54:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5237915648. Throughput: 0: 42806.2. Samples: 1505578500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 08:54:43,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 08:54:45,750][26599] Updated weights for policy 0, policy_version 319704 (0.0042) [2024-06-19 08:54:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5238145024. Throughput: 0: 42918.3. Samples: 1505709800. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:54:48,380][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 08:54:49,276][26599] Updated weights for policy 0, policy_version 319714 (0.0040) [2024-06-19 08:54:53,312][26599] Updated weights for policy 0, policy_version 319724 (0.0031) [2024-06-19 08:54:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5238358016. Throughput: 0: 42999.5. Samples: 1505970580. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:54:53,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 08:54:57,215][26599] Updated weights for policy 0, policy_version 319734 (0.0027) [2024-06-19 08:54:58,380][26367] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 42653.9). Total num frames: 5238571008. Throughput: 0: 42893.7. Samples: 1506226340. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:54:58,385][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 08:55:00,773][26599] Updated weights for policy 0, policy_version 319744 (0.0037) [2024-06-19 08:55:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42654.5). Total num frames: 5238784000. Throughput: 0: 43008.8. Samples: 1506351200. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:03,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 08:55:04,735][26599] Updated weights for policy 0, policy_version 319754 (0.0032) [2024-06-19 08:55:08,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5238996992. Throughput: 0: 42986.3. Samples: 1506614160. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:08,380][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 08:55:08,595][26599] Updated weights for policy 0, policy_version 319764 (0.0039) [2024-06-19 08:55:12,330][26599] Updated weights for policy 0, policy_version 319774 (0.0028) [2024-06-19 08:55:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5239209984. Throughput: 0: 42910.3. Samples: 1506870920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:13,380][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 08:55:16,229][26599] Updated weights for policy 0, policy_version 319784 (0.0037) [2024-06-19 08:55:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5239422976. Throughput: 0: 42997.3. Samples: 1507001480. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:18,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 08:55:19,895][26599] Updated weights for policy 0, policy_version 319794 (0.0028) [2024-06-19 08:55:23,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5239603200. Throughput: 0: 42683.5. Samples: 1507252020. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:23,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 08:55:23,883][26599] Updated weights for policy 0, policy_version 319804 (0.0025) [2024-06-19 08:55:27,926][26599] Updated weights for policy 0, policy_version 319814 (0.0034) [2024-06-19 08:55:28,380][26367] Fps is (10 sec: 44237.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5239865344. Throughput: 0: 42825.4. Samples: 1507505640. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:28,381][26367] Avg episode reward: [(0, '0.840')] [2024-06-19 08:55:31,894][26599] Updated weights for policy 0, policy_version 319824 (0.0033) [2024-06-19 08:55:33,384][26367] Fps is (10 sec: 45858.6, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 5240061952. Throughput: 0: 42841.0. Samples: 1507637800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:33,384][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 08:55:35,525][26599] Updated weights for policy 0, policy_version 319834 (0.0040) [2024-06-19 08:55:38,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5240258560. Throughput: 0: 42533.8. Samples: 1507884600. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:38,381][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 08:55:38,515][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000319841_5240274944.pth... [2024-06-19 08:55:38,574][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000319216_5230034944.pth [2024-06-19 08:55:39,499][26599] Updated weights for policy 0, policy_version 319844 (0.0038) [2024-06-19 08:55:43,186][26599] Updated weights for policy 0, policy_version 319854 (0.0043) [2024-06-19 08:55:43,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5240487936. Throughput: 0: 42592.2. Samples: 1508142980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:43,380][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 08:55:47,082][26599] Updated weights for policy 0, policy_version 319864 (0.0040) [2024-06-19 08:55:47,104][26579] Signal inference workers to stop experience collection... (22250 times) [2024-06-19 08:55:47,105][26579] Signal inference workers to resume experience collection... 
(22250 times) [2024-06-19 08:55:47,129][26599] InferenceWorker_p0-w0: stopping experience collection (22250 times) [2024-06-19 08:55:47,130][26599] InferenceWorker_p0-w0: resuming experience collection (22250 times) [2024-06-19 08:55:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5240700928. Throughput: 0: 42799.6. Samples: 1508277180. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:48,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 08:55:50,816][26599] Updated weights for policy 0, policy_version 319874 (0.0044) [2024-06-19 08:55:53,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5240913920. Throughput: 0: 42540.7. Samples: 1508528500. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:53,384][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 08:55:54,716][26599] Updated weights for policy 0, policy_version 319884 (0.0033) [2024-06-19 08:55:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5241110528. Throughput: 0: 42529.7. Samples: 1508784760. Policy #0 lag: (min: 0.0, avg: 9.8, max: 22.0) [2024-06-19 08:55:58,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 08:55:58,547][26599] Updated weights for policy 0, policy_version 319894 (0.0030) [2024-06-19 08:56:02,454][26599] Updated weights for policy 0, policy_version 319904 (0.0039) [2024-06-19 08:56:03,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 5241323520. Throughput: 0: 42511.9. Samples: 1508914520. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:03,381][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 08:56:06,084][26599] Updated weights for policy 0, policy_version 319914 (0.0034) [2024-06-19 08:56:08,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5241569280. Throughput: 0: 42693.9. Samples: 1509173240. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:08,380][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 08:56:10,283][26599] Updated weights for policy 0, policy_version 319924 (0.0032) [2024-06-19 08:56:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5241765888. Throughput: 0: 42600.3. Samples: 1509422660. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:13,390][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 08:56:13,735][26599] Updated weights for policy 0, policy_version 319934 (0.0038) [2024-06-19 08:56:17,920][26599] Updated weights for policy 0, policy_version 319944 (0.0039) [2024-06-19 08:56:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5241978880. Throughput: 0: 42436.3. Samples: 1509547280. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:18,380][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 08:56:21,305][26599] Updated weights for policy 0, policy_version 319954 (0.0029) [2024-06-19 08:56:23,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5242208256. Throughput: 0: 42703.6. Samples: 1509806260. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:23,381][26367] Avg episode reward: [(0, '0.792')] [2024-06-19 08:56:25,605][26599] Updated weights for policy 0, policy_version 319964 (0.0033) [2024-06-19 08:56:28,381][26367] Fps is (10 sec: 44233.3, 60 sec: 42597.8, 300 sec: 42709.4). Total num frames: 5242421248. Throughput: 0: 42586.8. Samples: 1510059420. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:28,382][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 08:56:28,888][26599] Updated weights for policy 0, policy_version 319974 (0.0024) [2024-06-19 08:56:33,182][26599] Updated weights for policy 0, policy_version 319984 (0.0028) [2024-06-19 08:56:33,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42601.0, 300 sec: 42654.0). Total num frames: 5242617856. Throughput: 0: 42437.9. Samples: 1510186880. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:33,380][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 08:56:36,830][26599] Updated weights for policy 0, policy_version 319994 (0.0041) [2024-06-19 08:56:38,380][26367] Fps is (10 sec: 42601.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5242847232. Throughput: 0: 42549.5. Samples: 1510443220. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:38,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 08:56:40,873][26599] Updated weights for policy 0, policy_version 320004 (0.0034) [2024-06-19 08:56:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5243043840. Throughput: 0: 42623.7. Samples: 1510702820. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:43,380][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 08:56:44,505][26599] Updated weights for policy 0, policy_version 320014 (0.0038) [2024-06-19 08:56:48,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5243240448. Throughput: 0: 42516.6. Samples: 1510827760. 
Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:48,380][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 08:56:48,642][26599] Updated weights for policy 0, policy_version 320024 (0.0039) [2024-06-19 08:56:52,229][26599] Updated weights for policy 0, policy_version 320034 (0.0031) [2024-06-19 08:56:53,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5243469824. Throughput: 0: 42398.5. Samples: 1511081180. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:53,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 08:56:56,371][26599] Updated weights for policy 0, policy_version 320044 (0.0029) [2024-06-19 08:56:58,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5243682816. Throughput: 0: 42611.1. Samples: 1511340160. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:56:58,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 08:56:59,978][26599] Updated weights for policy 0, policy_version 320054 (0.0027) [2024-06-19 08:57:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5243895808. Throughput: 0: 42573.7. Samples: 1511463100. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:57:03,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 08:57:04,015][26599] Updated weights for policy 0, policy_version 320064 (0.0033) [2024-06-19 08:57:06,216][26579] Signal inference workers to stop experience collection... (22300 times) [2024-06-19 08:57:06,216][26579] Signal inference workers to resume experience collection... 
(22300 times) [2024-06-19 08:57:06,234][26599] InferenceWorker_p0-w0: stopping experience collection (22300 times) [2024-06-19 08:57:06,245][26599] InferenceWorker_p0-w0: resuming experience collection (22300 times) [2024-06-19 08:57:07,777][26599] Updated weights for policy 0, policy_version 320074 (0.0037) [2024-06-19 08:57:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5244108800. Throughput: 0: 42266.6. Samples: 1511708260. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:57:08,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 08:57:11,788][26599] Updated weights for policy 0, policy_version 320084 (0.0032) [2024-06-19 08:57:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5244305408. Throughput: 0: 42557.5. Samples: 1511974480. Policy #0 lag: (min: 2.0, avg: 12.1, max: 21.0) [2024-06-19 08:57:13,381][26367] Avg episode reward: [(0, '0.750')] [2024-06-19 08:57:15,319][26599] Updated weights for policy 0, policy_version 320094 (0.0026) [2024-06-19 08:57:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5244518400. Throughput: 0: 42447.8. Samples: 1512097040. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:57:18,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 08:57:19,708][26599] Updated weights for policy 0, policy_version 320104 (0.0041) [2024-06-19 08:57:23,113][26599] Updated weights for policy 0, policy_version 320114 (0.0027) [2024-06-19 08:57:23,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5244747776. Throughput: 0: 42293.7. Samples: 1512346440. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:57:23,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 08:57:27,384][26599] Updated weights for policy 0, policy_version 320124 (0.0036) [2024-06-19 08:57:28,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42325.8, 300 sec: 42653.9). Total num frames: 5244960768. Throughput: 0: 42416.8. Samples: 1512611580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:57:28,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 08:57:30,669][26599] Updated weights for policy 0, policy_version 320134 (0.0027) [2024-06-19 08:57:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5245157376. Throughput: 0: 42418.2. Samples: 1512736580. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:57:33,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 08:57:34,984][26599] Updated weights for policy 0, policy_version 320144 (0.0032) [2024-06-19 08:57:38,099][26599] Updated weights for policy 0, policy_version 320154 (0.0042) [2024-06-19 08:57:38,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5245403136. Throughput: 0: 42556.9. Samples: 1512996240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:57:38,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 08:57:38,388][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000320154_5245403136.pth... [2024-06-19 08:57:38,441][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000319529_5235163136.pth [2024-06-19 08:57:42,838][26599] Updated weights for policy 0, policy_version 320164 (0.0038) [2024-06-19 08:57:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5245583360. Throughput: 0: 42457.3. Samples: 1513250740. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:57:43,381][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 08:57:46,435][26599] Updated weights for policy 0, policy_version 320174 (0.0030) [2024-06-19 08:57:48,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5245812736. Throughput: 0: 42482.7. Samples: 1513374820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:57:48,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 08:57:50,685][26599] Updated weights for policy 0, policy_version 320184 (0.0034) [2024-06-19 08:57:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5246025728. Throughput: 0: 42628.5. Samples: 1513626540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:57:53,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 08:57:54,035][26599] Updated weights for policy 0, policy_version 320194 (0.0039) [2024-06-19 08:57:58,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41779.3, 300 sec: 42598.4). Total num frames: 5246189568. Throughput: 0: 42471.7. Samples: 1513885700. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:57:58,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 08:57:58,520][26599] Updated weights for policy 0, policy_version 320204 (0.0041) [2024-06-19 08:58:01,658][26599] Updated weights for policy 0, policy_version 320214 (0.0033) [2024-06-19 08:58:03,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5246435328. Throughput: 0: 42375.3. Samples: 1514003920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:58:03,380][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 08:58:06,111][26599] Updated weights for policy 0, policy_version 320224 (0.0023) [2024-06-19 08:58:08,380][26367] Fps is (10 sec: 47513.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5246664704. Throughput: 0: 42637.0. Samples: 1514265100. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:58:08,380][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 08:58:09,167][26599] Updated weights for policy 0, policy_version 320234 (0.0023) [2024-06-19 08:58:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5246844928. Throughput: 0: 42591.1. Samples: 1514528180. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:58:13,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 08:58:13,643][26599] Updated weights for policy 0, policy_version 320244 (0.0027) [2024-06-19 08:58:16,875][26599] Updated weights for policy 0, policy_version 320254 (0.0038) [2024-06-19 08:58:18,384][26367] Fps is (10 sec: 40944.5, 60 sec: 42595.9, 300 sec: 42597.9). Total num frames: 5247074304. Throughput: 0: 42348.6. Samples: 1514642420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:58:18,385][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 08:58:21,258][26599] Updated weights for policy 0, policy_version 320264 (0.0040) [2024-06-19 08:58:23,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5247303680. Throughput: 0: 42300.4. Samples: 1514899760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:58:23,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 08:58:24,618][26599] Updated weights for policy 0, policy_version 320274 (0.0033) [2024-06-19 08:58:28,380][26367] Fps is (10 sec: 39336.1, 60 sec: 41779.2, 300 sec: 42542.9). Total num frames: 5247467520. Throughput: 0: 42357.4. Samples: 1515156820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 08:58:28,381][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 08:58:28,618][26579] Signal inference workers to stop experience collection... (22350 times) [2024-06-19 08:58:28,625][26579] Signal inference workers to resume experience collection... 
(22350 times) [2024-06-19 08:58:28,628][26599] InferenceWorker_p0-w0: stopping experience collection (22350 times) [2024-06-19 08:58:28,661][26599] InferenceWorker_p0-w0: resuming experience collection (22350 times) [2024-06-19 08:58:29,345][26599] Updated weights for policy 0, policy_version 320284 (0.0036) [2024-06-19 08:58:32,297][26599] Updated weights for policy 0, policy_version 320294 (0.0032) [2024-06-19 08:58:33,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5247713280. Throughput: 0: 42247.2. Samples: 1515275940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:58:33,380][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 08:58:36,960][26599] Updated weights for policy 0, policy_version 320304 (0.0034) [2024-06-19 08:58:38,380][26367] Fps is (10 sec: 47512.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5247942656. Throughput: 0: 42422.1. Samples: 1515535540. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:58:38,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 08:58:39,924][26599] Updated weights for policy 0, policy_version 320314 (0.0027) [2024-06-19 08:58:43,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5248122880. Throughput: 0: 42269.8. Samples: 1515787840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:58:43,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 08:58:44,705][26599] Updated weights for policy 0, policy_version 320324 (0.0031) [2024-06-19 08:58:47,995][26599] Updated weights for policy 0, policy_version 320334 (0.0037) [2024-06-19 08:58:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5248368640. Throughput: 0: 42447.9. Samples: 1515914080. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:58:48,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 08:58:52,291][26599] Updated weights for policy 0, policy_version 320344 (0.0038) [2024-06-19 08:58:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5248548864. Throughput: 0: 42370.1. Samples: 1516171760. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:58:53,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 08:58:55,726][26599] Updated weights for policy 0, policy_version 320354 (0.0029) [2024-06-19 08:58:58,384][26367] Fps is (10 sec: 39307.8, 60 sec: 42868.9, 300 sec: 42597.9). Total num frames: 5248761856. Throughput: 0: 42283.3. Samples: 1516431080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:58:58,384][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 08:58:59,944][26599] Updated weights for policy 0, policy_version 320364 (0.0031) [2024-06-19 08:59:03,298][26599] Updated weights for policy 0, policy_version 320374 (0.0031) [2024-06-19 08:59:03,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5249007616. Throughput: 0: 42469.2. Samples: 1516553380. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:59:03,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 08:59:07,475][26599] Updated weights for policy 0, policy_version 320384 (0.0041) [2024-06-19 08:59:08,380][26367] Fps is (10 sec: 44253.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5249204224. Throughput: 0: 42656.6. Samples: 1516819300. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:59:08,380][26367] Avg episode reward: [(0, '0.391')] [2024-06-19 08:59:10,735][26599] Updated weights for policy 0, policy_version 320394 (0.0037) [2024-06-19 08:59:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5249417216. Throughput: 0: 42591.9. Samples: 1517073460. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:59:13,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 08:59:15,132][26599] Updated weights for policy 0, policy_version 320404 (0.0038) [2024-06-19 08:59:18,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42600.9, 300 sec: 42598.4). Total num frames: 5249630208. Throughput: 0: 42690.0. Samples: 1517197000. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:59:18,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 08:59:18,879][26599] Updated weights for policy 0, policy_version 320414 (0.0030) [2024-06-19 08:59:23,022][26599] Updated weights for policy 0, policy_version 320424 (0.0033) [2024-06-19 08:59:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5249843200. Throughput: 0: 42685.8. Samples: 1517456400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:59:23,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 08:59:26,417][26599] Updated weights for policy 0, policy_version 320434 (0.0044) [2024-06-19 08:59:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 5250039808. Throughput: 0: 42644.4. Samples: 1517706840. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:59:28,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 08:59:30,661][26599] Updated weights for policy 0, policy_version 320444 (0.0037) [2024-06-19 08:59:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5250252800. Throughput: 0: 42716.9. Samples: 1517836340. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:59:33,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 08:59:34,208][26599] Updated weights for policy 0, policy_version 320454 (0.0036) [2024-06-19 08:59:38,319][26599] Updated weights for policy 0, policy_version 320464 (0.0027) [2024-06-19 08:59:38,384][26367] Fps is (10 sec: 44220.9, 60 sec: 42322.8, 300 sec: 42597.9). Total num frames: 5250482176. Throughput: 0: 42694.4. Samples: 1518093160. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:59:38,384][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 08:59:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000320464_5250482176.pth... [2024-06-19 08:59:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000319841_5240274944.pth [2024-06-19 08:59:42,014][26599] Updated weights for policy 0, policy_version 320474 (0.0030) [2024-06-19 08:59:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42431.8). Total num frames: 5250662400. Throughput: 0: 42490.8. Samples: 1518343020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 08:59:43,381][26367] Avg episode reward: [(0, '0.345')] [2024-06-19 08:59:45,884][26599] Updated weights for policy 0, policy_version 320484 (0.0033) [2024-06-19 08:59:48,380][26367] Fps is (10 sec: 42614.4, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5250908160. Throughput: 0: 42518.3. Samples: 1518466700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 08:59:48,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 08:59:49,576][26599] Updated weights for policy 0, policy_version 320494 (0.0040) [2024-06-19 08:59:51,799][26579] Signal inference workers to stop experience collection... (22400 times) [2024-06-19 08:59:51,799][26579] Signal inference workers to resume experience collection... 
(22400 times) [2024-06-19 08:59:51,843][26599] InferenceWorker_p0-w0: stopping experience collection (22400 times) [2024-06-19 08:59:51,843][26599] InferenceWorker_p0-w0: resuming experience collection (22400 times) [2024-06-19 08:59:53,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5251104768. Throughput: 0: 42395.5. Samples: 1518727100. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 08:59:53,381][26367] Avg episode reward: [(0, '0.808')] [2024-06-19 08:59:53,530][26599] Updated weights for policy 0, policy_version 320504 (0.0039) [2024-06-19 08:59:57,115][26599] Updated weights for policy 0, policy_version 320514 (0.0046) [2024-06-19 08:59:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42600.9, 300 sec: 42487.3). Total num frames: 5251317760. Throughput: 0: 42421.8. Samples: 1518982440. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 08:59:58,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 09:00:01,146][26599] Updated weights for policy 0, policy_version 320524 (0.0043) [2024-06-19 09:00:03,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42322.8, 300 sec: 42542.3). Total num frames: 5251547136. Throughput: 0: 42513.2. Samples: 1519110240. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:03,384][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 09:00:05,238][26599] Updated weights for policy 0, policy_version 320534 (0.0027) [2024-06-19 09:00:08,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5251776512. Throughput: 0: 42540.2. Samples: 1519370700. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:08,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 09:00:08,716][26599] Updated weights for policy 0, policy_version 320544 (0.0031) [2024-06-19 09:00:12,961][26599] Updated weights for policy 0, policy_version 320554 (0.0041) [2024-06-19 09:00:13,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5251956736. Throughput: 0: 42582.7. Samples: 1519623060. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:13,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 09:00:16,941][26599] Updated weights for policy 0, policy_version 320564 (0.0043) [2024-06-19 09:00:18,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5252186112. Throughput: 0: 42537.3. Samples: 1519750520. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:18,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 09:00:20,445][26599] Updated weights for policy 0, policy_version 320574 (0.0033) [2024-06-19 09:00:23,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 5252415488. Throughput: 0: 42600.0. Samples: 1520010000. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:23,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 09:00:24,518][26599] Updated weights for policy 0, policy_version 320584 (0.0037) [2024-06-19 09:00:28,038][26599] Updated weights for policy 0, policy_version 320594 (0.0046) [2024-06-19 09:00:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42543.4). Total num frames: 5252612096. Throughput: 0: 42738.1. Samples: 1520266240. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:28,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 09:00:32,047][26599] Updated weights for policy 0, policy_version 320604 (0.0030) [2024-06-19 09:00:33,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5252825088. Throughput: 0: 42755.4. Samples: 1520390700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:33,381][26367] Avg episode reward: [(0, '0.778')] [2024-06-19 09:00:36,018][26599] Updated weights for policy 0, policy_version 320614 (0.0040) [2024-06-19 09:00:38,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42601.0, 300 sec: 42542.9). Total num frames: 5253038080. Throughput: 0: 42859.1. Samples: 1520655760. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:38,380][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 09:00:39,672][26599] Updated weights for policy 0, policy_version 320624 (0.0035) [2024-06-19 09:00:43,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42487.3). Total num frames: 5253234688. Throughput: 0: 42861.9. Samples: 1520911220. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:43,380][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 09:00:43,643][26599] Updated weights for policy 0, policy_version 320634 (0.0038) [2024-06-19 09:00:47,130][26599] Updated weights for policy 0, policy_version 320644 (0.0036) [2024-06-19 09:00:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5253480448. Throughput: 0: 42902.6. Samples: 1521040700. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:48,380][26367] Avg episode reward: [(0, '0.849')] [2024-06-19 09:00:51,241][26599] Updated weights for policy 0, policy_version 320654 (0.0032) [2024-06-19 09:00:53,380][26367] Fps is (10 sec: 45875.4, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5253693440. Throughput: 0: 42916.0. Samples: 1521301920. 
Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:53,380][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 09:00:54,613][26599] Updated weights for policy 0, policy_version 320664 (0.0031) [2024-06-19 09:00:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5253890048. Throughput: 0: 43020.4. Samples: 1521558980. Policy #0 lag: (min: 1.0, avg: 9.8, max: 21.0) [2024-06-19 09:00:58,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 09:00:58,639][26599] Updated weights for policy 0, policy_version 320674 (0.0031) [2024-06-19 09:01:02,580][26599] Updated weights for policy 0, policy_version 320684 (0.0043) [2024-06-19 09:01:03,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42874.0, 300 sec: 42542.8). Total num frames: 5254119424. Throughput: 0: 43082.7. Samples: 1521689240. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:03,382][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 09:01:06,341][26599] Updated weights for policy 0, policy_version 320694 (0.0051) [2024-06-19 09:01:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5254332416. Throughput: 0: 42993.2. Samples: 1521944700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:08,389][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 09:01:10,372][26599] Updated weights for policy 0, policy_version 320704 (0.0032) [2024-06-19 09:01:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5254529024. Throughput: 0: 42877.0. Samples: 1522195700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:13,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 09:01:13,649][26579] Signal inference workers to stop experience collection... (22450 times) [2024-06-19 09:01:13,649][26579] Signal inference workers to resume experience collection... 
(22450 times) [2024-06-19 09:01:13,671][26599] InferenceWorker_p0-w0: stopping experience collection (22450 times) [2024-06-19 09:01:13,671][26599] InferenceWorker_p0-w0: resuming experience collection (22450 times) [2024-06-19 09:01:14,045][26599] Updated weights for policy 0, policy_version 320714 (0.0030) [2024-06-19 09:01:17,727][26599] Updated weights for policy 0, policy_version 320724 (0.0027) [2024-06-19 09:01:18,381][26367] Fps is (10 sec: 42597.0, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 5254758400. Throughput: 0: 43020.2. Samples: 1522326620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:18,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 09:01:22,102][26599] Updated weights for policy 0, policy_version 320734 (0.0039) [2024-06-19 09:01:23,382][26367] Fps is (10 sec: 44229.9, 60 sec: 42597.2, 300 sec: 42542.7). Total num frames: 5254971392. Throughput: 0: 42893.1. Samples: 1522586020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:23,382][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 09:01:25,402][26599] Updated weights for policy 0, policy_version 320744 (0.0027) [2024-06-19 09:01:28,380][26367] Fps is (10 sec: 40961.7, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5255168000. Throughput: 0: 42842.7. Samples: 1522839140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:28,380][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 09:01:29,681][26599] Updated weights for policy 0, policy_version 320754 (0.0043) [2024-06-19 09:01:32,988][26599] Updated weights for policy 0, policy_version 320764 (0.0040) [2024-06-19 09:01:33,380][26367] Fps is (10 sec: 44243.4, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5255413760. Throughput: 0: 42902.1. Samples: 1522971300. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:33,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 09:01:37,129][26599] Updated weights for policy 0, policy_version 320774 (0.0032) [2024-06-19 09:01:38,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5255610368. Throughput: 0: 42780.7. Samples: 1523227060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:38,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 09:01:38,531][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000320778_5255626752.pth... [2024-06-19 09:01:38,590][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000320154_5245403136.pth [2024-06-19 09:01:40,636][26599] Updated weights for policy 0, policy_version 320784 (0.0033) [2024-06-19 09:01:43,384][26367] Fps is (10 sec: 40945.4, 60 sec: 43141.9, 300 sec: 42653.4). Total num frames: 5255823360. Throughput: 0: 42551.3. Samples: 1523473940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:43,384][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 09:01:44,852][26599] Updated weights for policy 0, policy_version 320794 (0.0040) [2024-06-19 09:01:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 5256019968. Throughput: 0: 42534.6. Samples: 1523603300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:48,381][26367] Avg episode reward: [(0, '0.755')] [2024-06-19 09:01:48,724][26599] Updated weights for policy 0, policy_version 320804 (0.0038) [2024-06-19 09:01:52,706][26599] Updated weights for policy 0, policy_version 320814 (0.0022) [2024-06-19 09:01:53,380][26367] Fps is (10 sec: 44252.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5256265728. Throughput: 0: 42581.8. Samples: 1523860880. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:53,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 09:01:56,212][26599] Updated weights for policy 0, policy_version 320824 (0.0038) [2024-06-19 09:01:58,380][26367] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5256478720. Throughput: 0: 42659.0. Samples: 1524115360. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:01:58,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 09:02:00,423][26599] Updated weights for policy 0, policy_version 320834 (0.0041) [2024-06-19 09:02:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5256675328. Throughput: 0: 42506.1. Samples: 1524239380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:02:03,381][26367] Avg episode reward: [(0, '0.341')] [2024-06-19 09:02:04,099][26599] Updated weights for policy 0, policy_version 320844 (0.0040) [2024-06-19 09:02:07,923][26599] Updated weights for policy 0, policy_version 320854 (0.0028) [2024-06-19 09:02:08,384][26367] Fps is (10 sec: 40945.6, 60 sec: 42595.9, 300 sec: 42653.4). Total num frames: 5256888320. Throughput: 0: 42482.1. Samples: 1524497800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:02:08,384][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 09:02:11,687][26599] Updated weights for policy 0, policy_version 320864 (0.0044) [2024-06-19 09:02:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5257084928. Throughput: 0: 42569.3. Samples: 1524754760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:02:13,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 09:02:15,474][26599] Updated weights for policy 0, policy_version 320874 (0.0029) [2024-06-19 09:02:18,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42598.7, 300 sec: 42598.4). Total num frames: 5257314304. Throughput: 0: 42445.5. Samples: 1524881340. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:02:18,380][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 09:02:19,268][26599] Updated weights for policy 0, policy_version 320884 (0.0038) [2024-06-19 09:02:23,021][26599] Updated weights for policy 0, policy_version 320894 (0.0029) [2024-06-19 09:02:23,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42872.6, 300 sec: 42653.9). Total num frames: 5257543680. Throughput: 0: 42563.2. Samples: 1525142400. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:02:23,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 09:02:27,086][26599] Updated weights for policy 0, policy_version 320904 (0.0045) [2024-06-19 09:02:28,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42868.8, 300 sec: 42653.4). Total num frames: 5257740288. Throughput: 0: 42737.3. Samples: 1525397120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:02:28,385][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 09:02:29,015][26579] Signal inference workers to stop experience collection... (22500 times) [2024-06-19 09:02:29,015][26579] Signal inference workers to resume experience collection... (22500 times) [2024-06-19 09:02:29,026][26599] InferenceWorker_p0-w0: stopping experience collection (22500 times) [2024-06-19 09:02:29,026][26599] InferenceWorker_p0-w0: resuming experience collection (22500 times) [2024-06-19 09:02:30,589][26599] Updated weights for policy 0, policy_version 320914 (0.0043) [2024-06-19 09:02:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5257969664. Throughput: 0: 42681.5. Samples: 1525523960. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:02:33,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 09:02:34,781][26599] Updated weights for policy 0, policy_version 320924 (0.0033) [2024-06-19 09:02:38,254][26599] Updated weights for policy 0, policy_version 320934 (0.0039) [2024-06-19 09:02:38,380][26367] Fps is (10 sec: 44253.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5258182656. Throughput: 0: 42696.1. Samples: 1525782200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:02:38,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 09:02:42,562][26599] Updated weights for policy 0, policy_version 320944 (0.0028) [2024-06-19 09:02:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42600.9, 300 sec: 42598.4). Total num frames: 5258379264. Throughput: 0: 42652.9. Samples: 1526034740. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:02:43,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 09:02:45,845][26599] Updated weights for policy 0, policy_version 320954 (0.0035) [2024-06-19 09:02:48,384][26367] Fps is (10 sec: 42582.5, 60 sec: 43142.0, 300 sec: 42653.4). Total num frames: 5258608640. Throughput: 0: 42659.6. Samples: 1526159220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:02:48,385][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 09:02:50,178][26599] Updated weights for policy 0, policy_version 320964 (0.0037) [2024-06-19 09:02:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5258821632. Throughput: 0: 42814.0. Samples: 1526424280. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:02:53,381][26367] Avg episode reward: [(0, '0.448')] [2024-06-19 09:02:53,443][26599] Updated weights for policy 0, policy_version 320974 (0.0037) [2024-06-19 09:02:57,931][26599] Updated weights for policy 0, policy_version 320984 (0.0037) [2024-06-19 09:02:58,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5259018240. Throughput: 0: 42646.7. Samples: 1526673860. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:02:58,380][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 09:03:01,223][26599] Updated weights for policy 0, policy_version 320994 (0.0030) [2024-06-19 09:03:03,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5259247616. Throughput: 0: 42575.0. Samples: 1526797220. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:03:03,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 09:03:05,564][26599] Updated weights for policy 0, policy_version 321004 (0.0035) [2024-06-19 09:03:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5259444224. Throughput: 0: 42635.9. Samples: 1527061020. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:03:08,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 09:03:08,984][26599] Updated weights for policy 0, policy_version 321014 (0.0044) [2024-06-19 09:03:13,294][26599] Updated weights for policy 0, policy_version 321024 (0.0040) [2024-06-19 09:03:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.5). Total num frames: 5259657216. Throughput: 0: 42663.0. Samples: 1527316800. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:03:13,380][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 09:03:16,774][26599] Updated weights for policy 0, policy_version 321034 (0.0030) [2024-06-19 09:03:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5259870208. Throughput: 0: 42657.8. Samples: 1527443560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:03:18,388][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 09:03:20,894][26599] Updated weights for policy 0, policy_version 321044 (0.0032) [2024-06-19 09:03:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5260066816. Throughput: 0: 42668.8. Samples: 1527702300. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:03:23,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 09:03:24,488][26599] Updated weights for policy 0, policy_version 321054 (0.0033) [2024-06-19 09:03:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42601.0, 300 sec: 42653.9). Total num frames: 5260296192. Throughput: 0: 42605.4. Samples: 1527951980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 09:03:28,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 09:03:28,503][26599] Updated weights for policy 0, policy_version 321064 (0.0023) [2024-06-19 09:03:32,245][26599] Updated weights for policy 0, policy_version 321074 (0.0041) [2024-06-19 09:03:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5260509184. Throughput: 0: 42783.5. Samples: 1528084320. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:03:33,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 09:03:36,394][26599] Updated weights for policy 0, policy_version 321084 (0.0041) [2024-06-19 09:03:38,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.1, 300 sec: 42653.9). Total num frames: 5260705792. Throughput: 0: 42526.2. Samples: 1528337960. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:03:38,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 09:03:38,495][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000321089_5260722176.pth... [2024-06-19 09:03:38,569][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000320464_5250482176.pth [2024-06-19 09:03:40,386][26599] Updated weights for policy 0, policy_version 321094 (0.0031) [2024-06-19 09:03:43,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42595.9, 300 sec: 42597.9). Total num frames: 5260935168. Throughput: 0: 42521.8. Samples: 1528587500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:03:43,384][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 09:03:44,005][26599] Updated weights for policy 0, policy_version 321104 (0.0036) [2024-06-19 09:03:47,961][26599] Updated weights for policy 0, policy_version 321114 (0.0043) [2024-06-19 09:03:48,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42327.9, 300 sec: 42709.5). Total num frames: 5261148160. Throughput: 0: 42888.1. Samples: 1528727180. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:03:48,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 09:03:49,129][26579] Signal inference workers to stop experience collection... (22550 times) [2024-06-19 09:03:49,179][26599] InferenceWorker_p0-w0: stopping experience collection (22550 times) [2024-06-19 09:03:49,242][26579] Signal inference workers to resume experience collection... (22550 times) [2024-06-19 09:03:49,242][26599] InferenceWorker_p0-w0: resuming experience collection (22550 times) [2024-06-19 09:03:52,111][26599] Updated weights for policy 0, policy_version 321124 (0.0035) [2024-06-19 09:03:53,380][26367] Fps is (10 sec: 40974.9, 60 sec: 42052.3, 300 sec: 42654.5). Total num frames: 5261344768. Throughput: 0: 42696.0. Samples: 1528982340. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:03:53,381][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 09:03:55,459][26599] Updated weights for policy 0, policy_version 321134 (0.0034) [2024-06-19 09:03:58,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5261590528. Throughput: 0: 42482.2. Samples: 1529228500. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:03:58,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 09:03:59,625][26599] Updated weights for policy 0, policy_version 321144 (0.0028) [2024-06-19 09:04:03,070][26599] Updated weights for policy 0, policy_version 321154 (0.0034) [2024-06-19 09:04:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5261787136. Throughput: 0: 42657.8. Samples: 1529363160. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:04:03,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 09:04:07,109][26599] Updated weights for policy 0, policy_version 321164 (0.0030) [2024-06-19 09:04:08,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5262000128. Throughput: 0: 42602.8. Samples: 1529619420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:04:08,380][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 09:04:10,705][26599] Updated weights for policy 0, policy_version 321174 (0.0039) [2024-06-19 09:04:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5262213120. Throughput: 0: 42642.2. Samples: 1529870880. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:04:13,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 09:04:14,701][26599] Updated weights for policy 0, policy_version 321184 (0.0028) [2024-06-19 09:04:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5262426112. Throughput: 0: 42656.9. Samples: 1530003880. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:04:18,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 09:04:18,485][26599] Updated weights for policy 0, policy_version 321194 (0.0034) [2024-06-19 09:04:22,174][26599] Updated weights for policy 0, policy_version 321204 (0.0028) [2024-06-19 09:04:23,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5262655488. Throughput: 0: 42777.5. Samples: 1530262940. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:04:23,381][26367] Avg episode reward: [(0, '0.778')] [2024-06-19 09:04:26,225][26599] Updated weights for policy 0, policy_version 321214 (0.0035) [2024-06-19 09:04:28,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5262868480. Throughput: 0: 42883.0. Samples: 1530517080. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:04:28,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 09:04:29,716][26599] Updated weights for policy 0, policy_version 321224 (0.0046) [2024-06-19 09:04:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42654.5). Total num frames: 5263065088. Throughput: 0: 42672.1. Samples: 1530647420. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:04:33,380][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 09:04:33,829][26599] Updated weights for policy 0, policy_version 321234 (0.0037) [2024-06-19 09:04:37,392][26599] Updated weights for policy 0, policy_version 321244 (0.0039) [2024-06-19 09:04:38,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5263294464. Throughput: 0: 42700.4. Samples: 1530903860. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:04:38,384][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 09:04:41,608][26599] Updated weights for policy 0, policy_version 321254 (0.0037) [2024-06-19 09:04:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42874.2, 300 sec: 42709.5). Total num frames: 5263507456. Throughput: 0: 42688.6. Samples: 1531149480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 21.0) [2024-06-19 09:04:43,380][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 09:04:45,124][26599] Updated weights for policy 0, policy_version 321264 (0.0032) [2024-06-19 09:04:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5263704064. Throughput: 0: 42643.5. Samples: 1531282120. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:04:48,381][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 09:04:49,433][26599] Updated weights for policy 0, policy_version 321274 (0.0036) [2024-06-19 09:04:52,863][26599] Updated weights for policy 0, policy_version 321284 (0.0034) [2024-06-19 09:04:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5263917056. Throughput: 0: 42674.1. Samples: 1531539760. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:04:53,381][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 09:04:57,066][26599] Updated weights for policy 0, policy_version 321294 (0.0030) [2024-06-19 09:04:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42654.5). Total num frames: 5264130048. Throughput: 0: 42655.1. Samples: 1531790360. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:04:58,381][26367] Avg episode reward: [(0, '0.701')] [2024-06-19 09:05:00,398][26599] Updated weights for policy 0, policy_version 321304 (0.0028) [2024-06-19 09:05:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5264343040. Throughput: 0: 42487.1. Samples: 1531915800. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:03,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 09:05:04,670][26599] Updated weights for policy 0, policy_version 321314 (0.0031) [2024-06-19 09:05:08,335][26599] Updated weights for policy 0, policy_version 321324 (0.0029) [2024-06-19 09:05:08,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5264572416. Throughput: 0: 42532.8. Samples: 1532176920. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:08,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 09:05:12,424][26599] Updated weights for policy 0, policy_version 321334 (0.0025) [2024-06-19 09:05:13,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5264769024. Throughput: 0: 42478.7. Samples: 1532428620. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:13,380][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 09:05:15,842][26599] Updated weights for policy 0, policy_version 321344 (0.0032) [2024-06-19 09:05:18,384][26367] Fps is (10 sec: 42583.3, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 5264998400. Throughput: 0: 42370.7. Samples: 1532554260. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:18,384][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 09:05:20,007][26599] Updated weights for policy 0, policy_version 321354 (0.0047) [2024-06-19 09:05:20,026][26579] Signal inference workers to stop experience collection... (22600 times) [2024-06-19 09:05:20,027][26579] Signal inference workers to resume experience collection... (22600 times) [2024-06-19 09:05:20,051][26599] InferenceWorker_p0-w0: stopping experience collection (22600 times) [2024-06-19 09:05:20,052][26599] InferenceWorker_p0-w0: resuming experience collection (22600 times) [2024-06-19 09:05:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5265195008. Throughput: 0: 42517.4. 
Samples: 1532817140. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:23,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 09:05:23,796][26599] Updated weights for policy 0, policy_version 321364 (0.0045) [2024-06-19 09:05:27,647][26599] Updated weights for policy 0, policy_version 321374 (0.0036) [2024-06-19 09:05:28,380][26367] Fps is (10 sec: 42613.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5265424384. Throughput: 0: 42630.0. Samples: 1533067840. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:28,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 09:05:31,342][26599] Updated weights for policy 0, policy_version 321384 (0.0037) [2024-06-19 09:05:33,384][26367] Fps is (10 sec: 44220.9, 60 sec: 42868.8, 300 sec: 42708.9). Total num frames: 5265637376. Throughput: 0: 42578.8. Samples: 1533198320. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:33,385][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 09:05:35,289][26599] Updated weights for policy 0, policy_version 321394 (0.0036) [2024-06-19 09:05:38,384][26367] Fps is (10 sec: 40945.4, 60 sec: 42322.8, 300 sec: 42708.9). Total num frames: 5265833984. Throughput: 0: 42511.7. Samples: 1533452940. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:38,384][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 09:05:38,522][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000321402_5265850368.pth... [2024-06-19 09:05:38,571][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000320778_5255626752.pth [2024-06-19 09:05:38,914][26599] Updated weights for policy 0, policy_version 321404 (0.0042) [2024-06-19 09:05:42,829][26599] Updated weights for policy 0, policy_version 321414 (0.0034) [2024-06-19 09:05:43,380][26367] Fps is (10 sec: 42614.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5266063360. Throughput: 0: 42584.1. Samples: 1533706640. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:43,380][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 09:05:46,657][26599] Updated weights for policy 0, policy_version 321424 (0.0042) [2024-06-19 09:05:48,385][26367] Fps is (10 sec: 44233.1, 60 sec: 42868.3, 300 sec: 42653.3). Total num frames: 5266276352. Throughput: 0: 42769.1. Samples: 1533840600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:48,385][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 09:05:50,449][26599] Updated weights for policy 0, policy_version 321434 (0.0033) [2024-06-19 09:05:53,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5266456576. Throughput: 0: 42569.5. Samples: 1534092540. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:53,380][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 09:05:54,309][26599] Updated weights for policy 0, policy_version 321444 (0.0036) [2024-06-19 09:05:57,943][26599] Updated weights for policy 0, policy_version 321454 (0.0038) [2024-06-19 09:05:58,380][26367] Fps is (10 sec: 42617.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5266702336. Throughput: 0: 42719.9. Samples: 1534351020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 21.0) [2024-06-19 09:05:58,381][26367] Avg episode reward: [(0, '0.424')] [2024-06-19 09:06:01,798][26599] Updated weights for policy 0, policy_version 321464 (0.0032) [2024-06-19 09:06:03,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42595.9, 300 sec: 42597.9). Total num frames: 5266898944. Throughput: 0: 42894.2. Samples: 1534484500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:03,384][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 09:06:06,089][26599] Updated weights for policy 0, policy_version 321474 (0.0029) [2024-06-19 09:06:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5267111936. Throughput: 0: 42682.3. Samples: 1534737840. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:08,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 09:06:09,429][26599] Updated weights for policy 0, policy_version 321484 (0.0034) [2024-06-19 09:06:13,380][26367] Fps is (10 sec: 44252.6, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5267341312. Throughput: 0: 42769.3. Samples: 1534992460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:13,381][26367] Avg episode reward: [(0, '0.216')] [2024-06-19 09:06:13,705][26599] Updated weights for policy 0, policy_version 321494 (0.0032) [2024-06-19 09:06:17,361][26599] Updated weights for policy 0, policy_version 321504 (0.0037) [2024-06-19 09:06:18,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42600.9, 300 sec: 42654.2). Total num frames: 5267554304. Throughput: 0: 42723.4. Samples: 1535120720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:18,383][26367] Avg episode reward: [(0, '0.081')] [2024-06-19 09:06:21,232][26599] Updated weights for policy 0, policy_version 321514 (0.0036) [2024-06-19 09:06:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5267750912. Throughput: 0: 42791.8. Samples: 1535378420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:23,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 09:06:24,906][26599] Updated weights for policy 0, policy_version 321524 (0.0034) [2024-06-19 09:06:28,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5267980288. Throughput: 0: 42685.3. Samples: 1535627480. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:28,381][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 09:06:28,690][26599] Updated weights for policy 0, policy_version 321534 (0.0027) [2024-06-19 09:06:32,400][26599] Updated weights for policy 0, policy_version 321544 (0.0037) [2024-06-19 09:06:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42601.0, 300 sec: 42654.0). Total num frames: 5268193280. Throughput: 0: 42780.7. Samples: 1535765540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:33,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 09:06:36,314][26599] Updated weights for policy 0, policy_version 321554 (0.0026) [2024-06-19 09:06:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42600.9, 300 sec: 42598.9). Total num frames: 5268389888. Throughput: 0: 42786.0. Samples: 1536017920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:38,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 09:06:40,030][26599] Updated weights for policy 0, policy_version 321564 (0.0030) [2024-06-19 09:06:43,231][26579] Signal inference workers to stop experience collection... (22650 times) [2024-06-19 09:06:43,236][26579] Signal inference workers to resume experience collection... (22650 times) [2024-06-19 09:06:43,272][26599] InferenceWorker_p0-w0: stopping experience collection (22650 times) [2024-06-19 09:06:43,272][26599] InferenceWorker_p0-w0: resuming experience collection (22650 times) [2024-06-19 09:06:43,384][26367] Fps is (10 sec: 44220.4, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5268635648. Throughput: 0: 42792.1. Samples: 1536276820. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:43,385][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 09:06:43,982][26599] Updated weights for policy 0, policy_version 321574 (0.0031) [2024-06-19 09:06:47,746][26599] Updated weights for policy 0, policy_version 321584 (0.0033) [2024-06-19 09:06:48,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42874.6, 300 sec: 42653.9). Total num frames: 5268848640. Throughput: 0: 42758.4. Samples: 1536408480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:48,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 09:06:51,650][26599] Updated weights for policy 0, policy_version 321594 (0.0035) [2024-06-19 09:06:53,382][26367] Fps is (10 sec: 40966.2, 60 sec: 43142.9, 300 sec: 42598.1). Total num frames: 5269045248. Throughput: 0: 42753.0. Samples: 1536661820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:53,383][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 09:06:55,833][26599] Updated weights for policy 0, policy_version 321604 (0.0041) [2024-06-19 09:06:58,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5269258240. Throughput: 0: 42799.2. Samples: 1536918420. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:06:58,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 09:06:59,405][26599] Updated weights for policy 0, policy_version 321614 (0.0044) [2024-06-19 09:07:03,359][26599] Updated weights for policy 0, policy_version 321624 (0.0033) [2024-06-19 09:07:03,380][26367] Fps is (10 sec: 44246.5, 60 sec: 43147.1, 300 sec: 42710.0). Total num frames: 5269487616. Throughput: 0: 42772.1. Samples: 1537045460. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:07:03,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 09:07:06,923][26599] Updated weights for policy 0, policy_version 321634 (0.0027) [2024-06-19 09:07:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5269684224. Throughput: 0: 42566.3. Samples: 1537293900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:07:08,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 09:07:11,362][26599] Updated weights for policy 0, policy_version 321644 (0.0035) [2024-06-19 09:07:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5269913600. Throughput: 0: 42840.0. Samples: 1537555280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 09:07:13,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 09:07:14,697][26599] Updated weights for policy 0, policy_version 321654 (0.0036) [2024-06-19 09:07:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5270110208. Throughput: 0: 42726.2. Samples: 1537688220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:07:18,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 09:07:18,881][26599] Updated weights for policy 0, policy_version 321664 (0.0025) [2024-06-19 09:07:22,306][26599] Updated weights for policy 0, policy_version 321674 (0.0031) [2024-06-19 09:07:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42654.5). Total num frames: 5270323200. Throughput: 0: 42720.1. Samples: 1537940320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:07:23,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 09:07:26,281][26599] Updated weights for policy 0, policy_version 321684 (0.0032) [2024-06-19 09:07:28,380][26367] Fps is (10 sec: 45874.7, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5270568960. Throughput: 0: 42758.9. Samples: 1538200820. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:07:28,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 09:07:29,898][26599] Updated weights for policy 0, policy_version 321694 (0.0034) [2024-06-19 09:07:33,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5270749184. Throughput: 0: 42803.1. Samples: 1538334620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:07:33,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 09:07:34,177][26599] Updated weights for policy 0, policy_version 321704 (0.0051) [2024-06-19 09:07:37,613][26599] Updated weights for policy 0, policy_version 321714 (0.0037) [2024-06-19 09:07:38,380][26367] Fps is (10 sec: 40960.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5270978560. Throughput: 0: 42766.4. Samples: 1538586220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:07:38,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 09:07:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000321715_5270978560.pth... [2024-06-19 09:07:38,458][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000321089_5260722176.pth [2024-06-19 09:07:41,808][26599] Updated weights for policy 0, policy_version 321724 (0.0034) [2024-06-19 09:07:43,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42874.1, 300 sec: 42710.0). Total num frames: 5271207936. Throughput: 0: 42816.9. Samples: 1538845180. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:07:43,381][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 09:07:45,619][26599] Updated weights for policy 0, policy_version 321734 (0.0039) [2024-06-19 09:07:48,383][26367] Fps is (10 sec: 40947.8, 60 sec: 42323.2, 300 sec: 42598.0). Total num frames: 5271388160. Throughput: 0: 42873.9. Samples: 1538974920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:07:48,384][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 09:07:49,350][26599] Updated weights for policy 0, policy_version 321744 (0.0045) [2024-06-19 09:07:53,317][26599] Updated weights for policy 0, policy_version 321754 (0.0033) [2024-06-19 09:07:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42873.0, 300 sec: 42709.5). Total num frames: 5271617536. Throughput: 0: 42899.0. Samples: 1539224360. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:07:53,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 09:07:56,926][26599] Updated weights for policy 0, policy_version 321764 (0.0033) [2024-06-19 09:07:58,380][26367] Fps is (10 sec: 44250.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5271830528. Throughput: 0: 42992.0. Samples: 1539489920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:07:58,381][26367] Avg episode reward: [(0, '0.737')] [2024-06-19 09:07:58,607][26579] Signal inference workers to stop experience collection... (22700 times) [2024-06-19 09:07:58,639][26599] InferenceWorker_p0-w0: stopping experience collection (22700 times) [2024-06-19 09:07:58,719][26579] Signal inference workers to resume experience collection... (22700 times) [2024-06-19 09:07:58,719][26599] InferenceWorker_p0-w0: resuming experience collection (22700 times) [2024-06-19 09:08:00,929][26599] Updated weights for policy 0, policy_version 321774 (0.0032) [2024-06-19 09:08:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5272027136. Throughput: 0: 42876.0. Samples: 1539617640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:08:03,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 09:08:05,025][26599] Updated weights for policy 0, policy_version 321784 (0.0036) [2024-06-19 09:08:08,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5272240128. 
Throughput: 0: 42792.7. Samples: 1539866000. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:08:08,381][26367] Avg episode reward: [(0, '0.509')] [2024-06-19 09:08:08,633][26599] Updated weights for policy 0, policy_version 321794 (0.0026) [2024-06-19 09:08:12,775][26599] Updated weights for policy 0, policy_version 321804 (0.0026) [2024-06-19 09:08:13,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5272469504. Throughput: 0: 42821.0. Samples: 1540127760. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:08:13,381][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 09:08:16,172][26599] Updated weights for policy 0, policy_version 321814 (0.0049) [2024-06-19 09:08:18,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5272682496. Throughput: 0: 42766.3. Samples: 1540259100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:08:18,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 09:08:20,220][26599] Updated weights for policy 0, policy_version 321824 (0.0047) [2024-06-19 09:08:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5272895488. Throughput: 0: 42814.8. Samples: 1540512880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:08:23,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 09:08:23,948][26599] Updated weights for policy 0, policy_version 321834 (0.0034) [2024-06-19 09:08:27,728][26599] Updated weights for policy 0, policy_version 321844 (0.0041) [2024-06-19 09:08:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5273108480. Throughput: 0: 42809.3. Samples: 1540771600. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 23.0) [2024-06-19 09:08:28,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 09:08:31,585][26599] Updated weights for policy 0, policy_version 321854 (0.0030) [2024-06-19 09:08:33,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5273337856. Throughput: 0: 42750.0. Samples: 1540898540. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:08:33,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 09:08:35,749][26599] Updated weights for policy 0, policy_version 321864 (0.0046) [2024-06-19 09:08:38,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42765.5). Total num frames: 5273550848. Throughput: 0: 42777.6. Samples: 1541149360. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:08:38,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 09:08:39,251][26599] Updated weights for policy 0, policy_version 321874 (0.0033) [2024-06-19 09:08:43,251][26599] Updated weights for policy 0, policy_version 321884 (0.0034) [2024-06-19 09:08:43,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5273747456. Throughput: 0: 42765.8. Samples: 1541414380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:08:43,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 09:08:46,844][26599] Updated weights for policy 0, policy_version 321894 (0.0026) [2024-06-19 09:08:48,380][26367] Fps is (10 sec: 42599.5, 60 sec: 43146.8, 300 sec: 42820.6). Total num frames: 5273976832. Throughput: 0: 42689.4. Samples: 1541538660. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:08:48,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 09:08:50,890][26599] Updated weights for policy 0, policy_version 321904 (0.0026) [2024-06-19 09:08:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5274189824. Throughput: 0: 42846.4. Samples: 1541794080. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:08:53,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 09:08:54,340][26599] Updated weights for policy 0, policy_version 321914 (0.0043) [2024-06-19 09:08:58,369][26599] Updated weights for policy 0, policy_version 321924 (0.0049) [2024-06-19 09:08:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5274402816. Throughput: 0: 42941.0. Samples: 1542060100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:08:58,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 09:09:01,935][26599] Updated weights for policy 0, policy_version 321934 (0.0030) [2024-06-19 09:09:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5274615808. Throughput: 0: 42872.9. Samples: 1542188380. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:09:03,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 09:09:05,873][26599] Updated weights for policy 0, policy_version 321944 (0.0037) [2024-06-19 09:09:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 5274828800. Throughput: 0: 42773.4. Samples: 1542437680. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:09:08,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 09:09:09,781][26599] Updated weights for policy 0, policy_version 321954 (0.0032) [2024-06-19 09:09:13,116][26579] Signal inference workers to stop experience collection... (22750 times) [2024-06-19 09:09:13,169][26599] InferenceWorker_p0-w0: stopping experience collection (22750 times) [2024-06-19 09:09:13,169][26579] Signal inference workers to resume experience collection... 
(22750 times) [2024-06-19 09:09:13,182][26599] InferenceWorker_p0-w0: resuming experience collection (22750 times) [2024-06-19 09:09:13,310][26599] Updated weights for policy 0, policy_version 321964 (0.0036) [2024-06-19 09:09:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5275058176. Throughput: 0: 42902.3. Samples: 1542702200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:09:13,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 09:09:17,588][26599] Updated weights for policy 0, policy_version 321974 (0.0035) [2024-06-19 09:09:18,380][26367] Fps is (10 sec: 44236.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5275271168. Throughput: 0: 42877.3. Samples: 1542828020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:09:18,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 09:09:20,889][26599] Updated weights for policy 0, policy_version 321984 (0.0030) [2024-06-19 09:09:23,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5275467776. Throughput: 0: 42796.1. Samples: 1543075180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:09:23,383][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 09:09:25,318][26599] Updated weights for policy 0, policy_version 321994 (0.0031) [2024-06-19 09:09:28,330][26599] Updated weights for policy 0, policy_version 322004 (0.0029) [2024-06-19 09:09:28,380][26367] Fps is (10 sec: 44237.5, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 5275713536. Throughput: 0: 42837.0. Samples: 1543342040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:09:28,380][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 09:09:33,036][26599] Updated weights for policy 0, policy_version 322014 (0.0044) [2024-06-19 09:09:33,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5275893760. Throughput: 0: 42964.5. Samples: 1543472060. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:09:33,381][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 09:09:36,006][26599] Updated weights for policy 0, policy_version 322024 (0.0030) [2024-06-19 09:09:38,384][26367] Fps is (10 sec: 39306.9, 60 sec: 42596.0, 300 sec: 42708.9). Total num frames: 5276106752. Throughput: 0: 42772.9. Samples: 1543719020. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:09:38,385][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 09:09:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000322028_5276106752.pth... [2024-06-19 09:09:38,492][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000321402_5265850368.pth [2024-06-19 09:09:40,721][26599] Updated weights for policy 0, policy_version 322034 (0.0035) [2024-06-19 09:09:43,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5276319744. Throughput: 0: 42846.2. Samples: 1543988180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 23.0) [2024-06-19 09:09:43,382][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 09:09:43,844][26599] Updated weights for policy 0, policy_version 322044 (0.0033) [2024-06-19 09:09:48,333][26599] Updated weights for policy 0, policy_version 322054 (0.0037) [2024-06-19 09:09:48,380][26367] Fps is (10 sec: 42613.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5276532736. Throughput: 0: 42796.8. Samples: 1544114240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:09:48,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 09:09:51,475][26599] Updated weights for policy 0, policy_version 322064 (0.0035) [2024-06-19 09:09:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5276762112. Throughput: 0: 42691.9. Samples: 1544358820. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:09:53,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 09:09:55,895][26599] Updated weights for policy 0, policy_version 322074 (0.0034) [2024-06-19 09:09:58,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5276958720. Throughput: 0: 42761.0. Samples: 1544626440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:09:58,380][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 09:09:59,073][26599] Updated weights for policy 0, policy_version 322084 (0.0055) [2024-06-19 09:10:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5277171712. Throughput: 0: 42849.1. Samples: 1544756220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:03,380][26367] Avg episode reward: [(0, '0.337')] [2024-06-19 09:10:03,461][26599] Updated weights for policy 0, policy_version 322094 (0.0038) [2024-06-19 09:10:06,724][26599] Updated weights for policy 0, policy_version 322104 (0.0034) [2024-06-19 09:10:08,380][26367] Fps is (10 sec: 45874.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5277417472. Throughput: 0: 42985.9. Samples: 1545009540. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:08,381][26367] Avg episode reward: [(0, '0.357')] [2024-06-19 09:10:11,107][26599] Updated weights for policy 0, policy_version 322114 (0.0029) [2024-06-19 09:10:13,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42765.5). Total num frames: 5277614080. Throughput: 0: 42747.0. Samples: 1545265660. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:13,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 09:10:14,391][26599] Updated weights for policy 0, policy_version 322124 (0.0037) [2024-06-19 09:10:18,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5277810688. Throughput: 0: 42696.4. Samples: 1545393400. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:18,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 09:10:18,597][26599] Updated weights for policy 0, policy_version 322134 (0.0033) [2024-06-19 09:10:22,008][26599] Updated weights for policy 0, policy_version 322144 (0.0038) [2024-06-19 09:10:22,014][26579] Signal inference workers to stop experience collection... (22800 times) [2024-06-19 09:10:22,014][26579] Signal inference workers to resume experience collection... (22800 times) [2024-06-19 09:10:22,058][26599] InferenceWorker_p0-w0: stopping experience collection (22800 times) [2024-06-19 09:10:22,058][26599] InferenceWorker_p0-w0: resuming experience collection (22800 times) [2024-06-19 09:10:23,380][26367] Fps is (10 sec: 44237.1, 60 sec: 43144.7, 300 sec: 42820.6). Total num frames: 5278056448. Throughput: 0: 42908.9. Samples: 1545649760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:23,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 09:10:26,602][26599] Updated weights for policy 0, policy_version 322154 (0.0039) [2024-06-19 09:10:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42765.6). Total num frames: 5278253056. Throughput: 0: 42638.3. Samples: 1545906900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:28,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 09:10:29,666][26599] Updated weights for policy 0, policy_version 322164 (0.0034) [2024-06-19 09:10:33,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42710.0). Total num frames: 5278433280. Throughput: 0: 42660.2. Samples: 1546033940. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:33,380][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 09:10:34,233][26599] Updated weights for policy 0, policy_version 322174 (0.0038) [2024-06-19 09:10:37,509][26599] Updated weights for policy 0, policy_version 322184 (0.0041) [2024-06-19 09:10:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43147.1, 300 sec: 42820.5). Total num frames: 5278695424. Throughput: 0: 42960.0. Samples: 1546292020. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:38,384][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 09:10:41,752][26599] Updated weights for policy 0, policy_version 322194 (0.0034) [2024-06-19 09:10:43,380][26367] Fps is (10 sec: 47513.5, 60 sec: 43144.6, 300 sec: 42821.2). Total num frames: 5278908416. Throughput: 0: 42773.7. Samples: 1546551260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:43,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 09:10:45,174][26599] Updated weights for policy 0, policy_version 322204 (0.0041) [2024-06-19 09:10:48,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5279088640. Throughput: 0: 42748.7. Samples: 1546679920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:48,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 09:10:49,458][26599] Updated weights for policy 0, policy_version 322214 (0.0056) [2024-06-19 09:10:52,710][26599] Updated weights for policy 0, policy_version 322224 (0.0044) [2024-06-19 09:10:53,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5279334400. Throughput: 0: 42903.5. Samples: 1546940200. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:53,381][26367] Avg episode reward: [(0, '0.302')] [2024-06-19 09:10:57,054][26599] Updated weights for policy 0, policy_version 322234 (0.0027) [2024-06-19 09:10:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.3, 300 sec: 42821.1). Total num frames: 5279531008. Throughput: 0: 42801.2. Samples: 1547191720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:10:58,381][26367] Avg episode reward: [(0, '0.334')] [2024-06-19 09:11:00,449][26599] Updated weights for policy 0, policy_version 322244 (0.0040) [2024-06-19 09:11:03,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5279727616. Throughput: 0: 42718.3. Samples: 1547315720. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:03,380][26367] Avg episode reward: [(0, '0.274')] [2024-06-19 09:11:04,910][26599] Updated weights for policy 0, policy_version 322254 (0.0028) [2024-06-19 09:11:08,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5279956992. Throughput: 0: 42717.7. Samples: 1547572060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:08,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 09:11:08,390][26599] Updated weights for policy 0, policy_version 322264 (0.0029) [2024-06-19 09:11:12,535][26599] Updated weights for policy 0, policy_version 322274 (0.0041) [2024-06-19 09:11:13,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 5280169984. Throughput: 0: 42786.7. Samples: 1547832460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:13,384][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 09:11:15,954][26599] Updated weights for policy 0, policy_version 322284 (0.0043) [2024-06-19 09:11:18,382][26367] Fps is (10 sec: 42589.6, 60 sec: 42869.9, 300 sec: 42820.3). Total num frames: 5280382976. Throughput: 0: 42712.1. Samples: 1547956080. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:18,383][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 09:11:20,013][26599] Updated weights for policy 0, policy_version 322294 (0.0031) [2024-06-19 09:11:23,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5280595968. Throughput: 0: 42687.6. Samples: 1548212960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:23,380][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 09:11:23,904][26599] Updated weights for policy 0, policy_version 322304 (0.0033) [2024-06-19 09:11:28,030][26599] Updated weights for policy 0, policy_version 322314 (0.0040) [2024-06-19 09:11:28,380][26367] Fps is (10 sec: 42607.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5280808960. Throughput: 0: 42604.4. Samples: 1548468460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:28,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 09:11:31,550][26599] Updated weights for policy 0, policy_version 322324 (0.0041) [2024-06-19 09:11:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5281038336. Throughput: 0: 42598.0. Samples: 1548596820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:33,380][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 09:11:35,763][26599] Updated weights for policy 0, policy_version 322334 (0.0022) [2024-06-19 09:11:35,928][26579] Signal inference workers to stop experience collection... (22850 times) [2024-06-19 09:11:35,928][26579] Signal inference workers to resume experience collection... (22850 times) [2024-06-19 09:11:35,964][26599] InferenceWorker_p0-w0: stopping experience collection (22850 times) [2024-06-19 09:11:35,964][26599] InferenceWorker_p0-w0: resuming experience collection (22850 times) [2024-06-19 09:11:38,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42765.6). Total num frames: 5281251328. 
Throughput: 0: 42440.1. Samples: 1548850000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:38,380][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 09:11:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000322342_5281251328.pth... [2024-06-19 09:11:38,458][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000321715_5270978560.pth [2024-06-19 09:11:39,133][26599] Updated weights for policy 0, policy_version 322344 (0.0046) [2024-06-19 09:11:43,178][26599] Updated weights for policy 0, policy_version 322354 (0.0038) [2024-06-19 09:11:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5281464320. Throughput: 0: 42805.6. Samples: 1549117960. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:43,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 09:11:46,603][26599] Updated weights for policy 0, policy_version 322364 (0.0036) [2024-06-19 09:11:48,390][26367] Fps is (10 sec: 40920.2, 60 sec: 42864.7, 300 sec: 42763.9). Total num frames: 5281660928. Throughput: 0: 42767.6. Samples: 1549240680. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:48,391][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 09:11:50,727][26599] Updated weights for policy 0, policy_version 322374 (0.0034) [2024-06-19 09:11:53,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5281873920. Throughput: 0: 42712.8. Samples: 1549494140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:53,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 09:11:54,482][26599] Updated weights for policy 0, policy_version 322384 (0.0027) [2024-06-19 09:11:58,292][26599] Updated weights for policy 0, policy_version 322394 (0.0033) [2024-06-19 09:11:58,380][26367] Fps is (10 sec: 44278.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5282103296. Throughput: 0: 42765.5. 
Samples: 1549756760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:11:58,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 09:12:02,143][26599] Updated weights for policy 0, policy_version 322404 (0.0027) [2024-06-19 09:12:03,380][26367] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5282316288. Throughput: 0: 42875.4. Samples: 1549885380. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:12:03,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 09:12:05,941][26599] Updated weights for policy 0, policy_version 322414 (0.0033) [2024-06-19 09:12:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5282512896. Throughput: 0: 42801.2. Samples: 1550139020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:12:08,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 09:12:09,809][26599] Updated weights for policy 0, policy_version 322424 (0.0037) [2024-06-19 09:12:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42601.0, 300 sec: 42765.0). Total num frames: 5282725888. Throughput: 0: 42885.0. Samples: 1550398280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 21.0) [2024-06-19 09:12:13,380][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 09:12:13,673][26599] Updated weights for policy 0, policy_version 322434 (0.0042) [2024-06-19 09:12:17,350][26599] Updated weights for policy 0, policy_version 322444 (0.0034) [2024-06-19 09:12:18,380][26367] Fps is (10 sec: 45875.2, 60 sec: 43146.0, 300 sec: 42876.1). Total num frames: 5282971648. Throughput: 0: 42870.5. Samples: 1550526000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:12:18,389][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 09:12:21,177][26599] Updated weights for policy 0, policy_version 322454 (0.0052) [2024-06-19 09:12:23,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5283168256. Throughput: 0: 42961.7. 
Samples: 1550783280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:12:23,380][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 09:12:24,882][26599] Updated weights for policy 0, policy_version 322464 (0.0033) [2024-06-19 09:12:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5283381248. Throughput: 0: 42846.6. Samples: 1551046060. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:12:28,380][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 09:12:28,648][26599] Updated weights for policy 0, policy_version 322474 (0.0032) [2024-06-19 09:12:32,390][26599] Updated weights for policy 0, policy_version 322484 (0.0033) [2024-06-19 09:12:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5283594240. Throughput: 0: 42971.4. Samples: 1551173980. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:12:33,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 09:12:36,298][26599] Updated weights for policy 0, policy_version 322494 (0.0028) [2024-06-19 09:12:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5283823616. Throughput: 0: 43024.5. Samples: 1551430240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:12:38,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 09:12:40,035][26599] Updated weights for policy 0, policy_version 322504 (0.0034) [2024-06-19 09:12:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42821.0). Total num frames: 5284020224. Throughput: 0: 42914.0. Samples: 1551687880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:12:43,380][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 09:12:44,110][26599] Updated weights for policy 0, policy_version 322514 (0.0038) [2024-06-19 09:12:47,941][26599] Updated weights for policy 0, policy_version 322524 (0.0039) [2024-06-19 09:12:48,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42878.4, 300 sec: 42765.0). Total num frames: 5284233216. Throughput: 0: 42889.4. Samples: 1551815400. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:12:48,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 09:12:51,630][26599] Updated weights for policy 0, policy_version 322534 (0.0037) [2024-06-19 09:12:53,380][26367] Fps is (10 sec: 44236.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5284462592. Throughput: 0: 42908.5. Samples: 1552069900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:12:53,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 09:12:55,666][26599] Updated weights for policy 0, policy_version 322544 (0.0030) [2024-06-19 09:12:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.6, 300 sec: 42820.6). Total num frames: 5284659200. Throughput: 0: 42746.3. Samples: 1552321860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:12:58,380][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 09:12:59,147][26599] Updated weights for policy 0, policy_version 322554 (0.0025) [2024-06-19 09:13:03,267][26599] Updated weights for policy 0, policy_version 322564 (0.0035) [2024-06-19 09:13:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5284888576. Throughput: 0: 42736.4. Samples: 1552449140. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:13:03,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 09:13:07,053][26599] Updated weights for policy 0, policy_version 322574 (0.0033) [2024-06-19 09:13:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5285085184. Throughput: 0: 42774.2. Samples: 1552708120. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:13:08,380][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 09:13:11,140][26599] Updated weights for policy 0, policy_version 322584 (0.0042) [2024-06-19 09:13:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5285314560. Throughput: 0: 42570.6. Samples: 1552961740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:13:13,381][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 09:13:13,819][26579] Signal inference workers to stop experience collection... (22900 times) [2024-06-19 09:13:13,850][26599] InferenceWorker_p0-w0: stopping experience collection (22900 times) [2024-06-19 09:13:13,865][26579] Signal inference workers to resume experience collection... (22900 times) [2024-06-19 09:13:13,867][26599] InferenceWorker_p0-w0: resuming experience collection (22900 times) [2024-06-19 09:13:14,920][26599] Updated weights for policy 0, policy_version 322594 (0.0054) [2024-06-19 09:13:18,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5285511168. Throughput: 0: 42706.2. Samples: 1553095760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:13:18,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 09:13:18,663][26599] Updated weights for policy 0, policy_version 322604 (0.0040) [2024-06-19 09:13:22,745][26599] Updated weights for policy 0, policy_version 322614 (0.0041) [2024-06-19 09:13:23,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5285724160. Throughput: 0: 42666.2. 
Samples: 1553350220. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:13:23,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 09:13:26,102][26599] Updated weights for policy 0, policy_version 322624 (0.0029) [2024-06-19 09:13:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5285953536. Throughput: 0: 42642.5. Samples: 1553606800. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 09:13:28,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 09:13:30,402][26599] Updated weights for policy 0, policy_version 322634 (0.0030) [2024-06-19 09:13:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5286166528. Throughput: 0: 42791.0. Samples: 1553741000. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:13:33,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 09:13:33,613][26599] Updated weights for policy 0, policy_version 322644 (0.0036) [2024-06-19 09:13:38,077][26599] Updated weights for policy 0, policy_version 322654 (0.0035) [2024-06-19 09:13:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5286363136. Throughput: 0: 42735.2. Samples: 1553992980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:13:38,381][26367] Avg episode reward: [(0, '0.335')] [2024-06-19 09:13:38,458][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000322655_5286379520.pth... [2024-06-19 09:13:38,516][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000322028_5276106752.pth [2024-06-19 09:13:41,749][26599] Updated weights for policy 0, policy_version 322664 (0.0041) [2024-06-19 09:13:43,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5286608896. Throughput: 0: 42587.8. Samples: 1554238320. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:13:43,381][26367] Avg episode reward: [(0, '0.406')] [2024-06-19 09:13:45,902][26599] Updated weights for policy 0, policy_version 322674 (0.0048) [2024-06-19 09:13:48,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5286805504. Throughput: 0: 42622.1. Samples: 1554367140. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:13:48,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 09:13:49,209][26599] Updated weights for policy 0, policy_version 322684 (0.0039) [2024-06-19 09:13:53,380][26367] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5287002112. Throughput: 0: 42484.0. Samples: 1554619900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:13:53,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 09:13:53,591][26599] Updated weights for policy 0, policy_version 322694 (0.0032) [2024-06-19 09:13:57,115][26599] Updated weights for policy 0, policy_version 322704 (0.0028) [2024-06-19 09:13:58,380][26367] Fps is (10 sec: 42599.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5287231488. Throughput: 0: 42517.5. Samples: 1554875020. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:13:58,380][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 09:14:01,118][26599] Updated weights for policy 0, policy_version 322714 (0.0038) [2024-06-19 09:14:03,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5287460864. Throughput: 0: 42486.6. Samples: 1555007660. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:14:03,384][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 09:14:04,542][26599] Updated weights for policy 0, policy_version 322724 (0.0041) [2024-06-19 09:14:08,384][26367] Fps is (10 sec: 40944.6, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5287641088. Throughput: 0: 42367.8. Samples: 1555256920. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:14:08,384][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 09:14:09,174][26599] Updated weights for policy 0, policy_version 322734 (0.0035) [2024-06-19 09:14:12,264][26599] Updated weights for policy 0, policy_version 322744 (0.0040) [2024-06-19 09:14:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5287870464. Throughput: 0: 42361.3. Samples: 1555513060. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:14:13,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 09:14:16,733][26599] Updated weights for policy 0, policy_version 322754 (0.0037) [2024-06-19 09:14:18,380][26367] Fps is (10 sec: 44253.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5288083456. Throughput: 0: 42295.6. Samples: 1555644300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:14:18,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 09:14:19,904][26599] Updated weights for policy 0, policy_version 322764 (0.0034) [2024-06-19 09:14:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5288296448. Throughput: 0: 42429.6. Samples: 1555902320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:14:23,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 09:14:24,177][26599] Updated weights for policy 0, policy_version 322774 (0.0027) [2024-06-19 09:14:27,546][26599] Updated weights for policy 0, policy_version 322784 (0.0035) [2024-06-19 09:14:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5288509440. Throughput: 0: 42741.9. Samples: 1556161700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:14:28,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 09:14:31,929][26599] Updated weights for policy 0, policy_version 322794 (0.0041) [2024-06-19 09:14:33,382][26367] Fps is (10 sec: 40953.5, 60 sec: 42324.1, 300 sec: 42709.7). Total num frames: 5288706048. Throughput: 0: 42780.2. Samples: 1556292320. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:14:33,383][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 09:14:35,166][26599] Updated weights for policy 0, policy_version 322804 (0.0036) [2024-06-19 09:14:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5288951808. Throughput: 0: 42905.3. Samples: 1556550640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:14:38,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 09:14:39,515][26599] Updated weights for policy 0, policy_version 322814 (0.0034) [2024-06-19 09:14:42,799][26599] Updated weights for policy 0, policy_version 322824 (0.0045) [2024-06-19 09:14:43,384][26367] Fps is (10 sec: 45866.6, 60 sec: 42595.9, 300 sec: 42820.0). Total num frames: 5289164800. Throughput: 0: 42670.7. Samples: 1556795360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:14:43,385][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 09:14:44,737][26579] Signal inference workers to stop experience collection... (22950 times) [2024-06-19 09:14:44,782][26599] InferenceWorker_p0-w0: stopping experience collection (22950 times) [2024-06-19 09:14:44,789][26579] Signal inference workers to resume experience collection... (22950 times) [2024-06-19 09:14:44,799][26599] InferenceWorker_p0-w0: resuming experience collection (22950 times) [2024-06-19 09:14:47,118][26599] Updated weights for policy 0, policy_version 322834 (0.0041) [2024-06-19 09:14:48,384][26367] Fps is (10 sec: 40945.0, 60 sec: 42595.9, 300 sec: 42709.0). Total num frames: 5289361408. 
Throughput: 0: 42687.2. Samples: 1556928740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:14:48,385][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 09:14:50,460][26599] Updated weights for policy 0, policy_version 322844 (0.0038) [2024-06-19 09:14:53,380][26367] Fps is (10 sec: 40974.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5289574400. Throughput: 0: 42835.0. Samples: 1557184340. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:14:53,383][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 09:14:54,735][26599] Updated weights for policy 0, policy_version 322854 (0.0038) [2024-06-19 09:14:58,072][26599] Updated weights for policy 0, policy_version 322864 (0.0038) [2024-06-19 09:14:58,380][26367] Fps is (10 sec: 44252.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5289803776. Throughput: 0: 42731.6. Samples: 1557435980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:14:58,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 09:15:02,398][26599] Updated weights for policy 0, policy_version 322874 (0.0034) [2024-06-19 09:15:03,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5290000384. Throughput: 0: 42809.3. Samples: 1557570720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:03,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 09:15:05,895][26599] Updated weights for policy 0, policy_version 322884 (0.0040) [2024-06-19 09:15:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42874.1, 300 sec: 42709.5). Total num frames: 5290213376. Throughput: 0: 42665.9. Samples: 1557822280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:08,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 09:15:10,228][26599] Updated weights for policy 0, policy_version 322894 (0.0034) [2024-06-19 09:15:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5290426368. 
Throughput: 0: 42432.5. Samples: 1558071160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:13,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 09:15:13,671][26599] Updated weights for policy 0, policy_version 322904 (0.0040) [2024-06-19 09:15:17,905][26599] Updated weights for policy 0, policy_version 322914 (0.0033) [2024-06-19 09:15:18,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5290622976. Throughput: 0: 42509.8. Samples: 1558205180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:18,380][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 09:15:21,423][26599] Updated weights for policy 0, policy_version 322924 (0.0027) [2024-06-19 09:15:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5290835968. Throughput: 0: 42400.4. Samples: 1558458660. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:23,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 09:15:25,578][26599] Updated weights for policy 0, policy_version 322934 (0.0040) [2024-06-19 09:15:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5291065344. Throughput: 0: 42610.6. Samples: 1558712680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:28,380][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 09:15:29,322][26599] Updated weights for policy 0, policy_version 322944 (0.0031) [2024-06-19 09:15:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42599.6, 300 sec: 42598.4). Total num frames: 5291261952. Throughput: 0: 42541.7. Samples: 1558842960. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:33,381][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 09:15:33,540][26599] Updated weights for policy 0, policy_version 322954 (0.0031) [2024-06-19 09:15:37,132][26599] Updated weights for policy 0, policy_version 322964 (0.0045) [2024-06-19 09:15:38,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5291491328. Throughput: 0: 42547.5. Samples: 1559098980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:38,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 09:15:38,414][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000322967_5291491328.pth... [2024-06-19 09:15:38,466][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000322342_5281251328.pth [2024-06-19 09:15:41,605][26599] Updated weights for policy 0, policy_version 322974 (0.0041) [2024-06-19 09:15:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42327.8, 300 sec: 42765.0). Total num frames: 5291704320. Throughput: 0: 42506.2. Samples: 1559348760. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:43,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 09:15:44,669][26599] Updated weights for policy 0, policy_version 322984 (0.0031) [2024-06-19 09:15:48,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42328.0, 300 sec: 42598.4). Total num frames: 5291900928. Throughput: 0: 42371.2. Samples: 1559477420. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:48,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 09:15:49,419][26599] Updated weights for policy 0, policy_version 322994 (0.0036) [2024-06-19 09:15:52,370][26599] Updated weights for policy 0, policy_version 323004 (0.0022) [2024-06-19 09:15:53,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5292113920. Throughput: 0: 42488.0. Samples: 1559734240. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:53,382][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 09:15:57,029][26599] Updated weights for policy 0, policy_version 323014 (0.0035) [2024-06-19 09:15:58,072][26579] Signal inference workers to stop experience collection... (23000 times) [2024-06-19 09:15:58,074][26579] Signal inference workers to resume experience collection... (23000 times) [2024-06-19 09:15:58,088][26599] InferenceWorker_p0-w0: stopping experience collection (23000 times) [2024-06-19 09:15:58,124][26599] InferenceWorker_p0-w0: resuming experience collection (23000 times) [2024-06-19 09:15:58,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 5292359680. Throughput: 0: 42587.0. Samples: 1559987580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:15:58,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 09:16:00,084][26599] Updated weights for policy 0, policy_version 323024 (0.0036) [2024-06-19 09:16:03,384][26367] Fps is (10 sec: 42584.0, 60 sec: 42322.9, 300 sec: 42653.5). Total num frames: 5292539904. Throughput: 0: 42605.1. Samples: 1560122560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:03,384][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 09:16:04,622][26599] Updated weights for policy 0, policy_version 323034 (0.0032) [2024-06-19 09:16:07,631][26599] Updated weights for policy 0, policy_version 323044 (0.0041) [2024-06-19 09:16:08,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42710.0). Total num frames: 5292769280. Throughput: 0: 42494.6. Samples: 1560370920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:08,381][26367] Avg episode reward: [(0, '0.837')] [2024-06-19 09:16:12,180][26599] Updated weights for policy 0, policy_version 323054 (0.0038) [2024-06-19 09:16:13,381][26367] Fps is (10 sec: 45888.9, 60 sec: 42871.1, 300 sec: 42765.3). Total num frames: 5292998656. 
Throughput: 0: 42750.6. Samples: 1560636480. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:13,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 09:16:15,235][26599] Updated weights for policy 0, policy_version 323064 (0.0042) [2024-06-19 09:16:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42709.4). Total num frames: 5293195264. Throughput: 0: 42653.7. Samples: 1560762380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:18,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 09:16:20,051][26599] Updated weights for policy 0, policy_version 323074 (0.0040) [2024-06-19 09:16:22,927][26599] Updated weights for policy 0, policy_version 323084 (0.0042) [2024-06-19 09:16:23,384][26367] Fps is (10 sec: 40946.9, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 5293408256. Throughput: 0: 42584.2. Samples: 1561015420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:23,384][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 09:16:27,620][26599] Updated weights for policy 0, policy_version 323094 (0.0043) [2024-06-19 09:16:28,380][26367] Fps is (10 sec: 44237.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5293637632. Throughput: 0: 42869.6. Samples: 1561277880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:28,380][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 09:16:30,636][26599] Updated weights for policy 0, policy_version 323104 (0.0024) [2024-06-19 09:16:33,384][26367] Fps is (10 sec: 42598.3, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 5293834240. Throughput: 0: 42850.7. Samples: 1561405860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:33,385][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 09:16:35,173][26599] Updated weights for policy 0, policy_version 323114 (0.0040) [2024-06-19 09:16:38,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5294047232. 
Throughput: 0: 42827.1. Samples: 1561661460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:38,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 09:16:38,543][26599] Updated weights for policy 0, policy_version 323124 (0.0030) [2024-06-19 09:16:42,655][26599] Updated weights for policy 0, policy_version 323134 (0.0037) [2024-06-19 09:16:43,380][26367] Fps is (10 sec: 44252.9, 60 sec: 42871.6, 300 sec: 42766.4). Total num frames: 5294276608. Throughput: 0: 42944.1. Samples: 1561920060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:43,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 09:16:46,339][26599] Updated weights for policy 0, policy_version 323144 (0.0033) [2024-06-19 09:16:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5294473216. Throughput: 0: 42904.2. Samples: 1562053100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:48,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 09:16:50,161][26599] Updated weights for policy 0, policy_version 323154 (0.0035) [2024-06-19 09:16:53,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 5294686208. Throughput: 0: 42978.8. Samples: 1562305120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:53,384][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 09:16:53,867][26599] Updated weights for policy 0, policy_version 323164 (0.0047) [2024-06-19 09:16:57,742][26599] Updated weights for policy 0, policy_version 323174 (0.0031) [2024-06-19 09:16:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5294899200. Throughput: 0: 42714.3. Samples: 1562558600. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:16:58,380][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 09:17:01,826][26599] Updated weights for policy 0, policy_version 323184 (0.0033) [2024-06-19 09:17:03,380][26367] Fps is (10 sec: 42613.9, 60 sec: 42873.9, 300 sec: 42709.5). Total num frames: 5295112192. Throughput: 0: 42789.9. Samples: 1562687920. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:17:03,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 09:17:05,397][26599] Updated weights for policy 0, policy_version 323194 (0.0037) [2024-06-19 09:17:06,785][26579] Signal inference workers to stop experience collection... (23050 times) [2024-06-19 09:17:06,835][26599] InferenceWorker_p0-w0: stopping experience collection (23050 times) [2024-06-19 09:17:06,840][26579] Signal inference workers to resume experience collection... (23050 times) [2024-06-19 09:17:06,860][26599] InferenceWorker_p0-w0: resuming experience collection (23050 times) [2024-06-19 09:17:08,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5295341568. Throughput: 0: 42873.7. Samples: 1562944580. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:17:08,380][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 09:17:09,271][26599] Updated weights for policy 0, policy_version 323204 (0.0034) [2024-06-19 09:17:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.6, 300 sec: 42542.9). Total num frames: 5295521792. Throughput: 0: 42831.5. Samples: 1563205300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 09:17:13,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 09:17:13,419][26599] Updated weights for policy 0, policy_version 323214 (0.0034) [2024-06-19 09:17:16,984][26599] Updated weights for policy 0, policy_version 323224 (0.0035) [2024-06-19 09:17:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5295751168. 
Throughput: 0: 42582.1. Samples: 1563321900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:17:18,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 09:17:21,001][26599] Updated weights for policy 0, policy_version 323234 (0.0037) [2024-06-19 09:17:23,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42874.1, 300 sec: 42709.5). Total num frames: 5295980544. Throughput: 0: 42594.7. Samples: 1563578220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:17:23,381][26367] Avg episode reward: [(0, '0.429')] [2024-06-19 09:17:24,487][26599] Updated weights for policy 0, policy_version 323244 (0.0041) [2024-06-19 09:17:28,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5296177152. Throughput: 0: 42655.6. Samples: 1563839560. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:17:28,380][26367] Avg episode reward: [(0, '0.803')] [2024-06-19 09:17:28,463][26599] Updated weights for policy 0, policy_version 323254 (0.0027) [2024-06-19 09:17:32,145][26599] Updated weights for policy 0, policy_version 323264 (0.0038) [2024-06-19 09:17:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42601.1, 300 sec: 42598.4). Total num frames: 5296390144. Throughput: 0: 42457.8. Samples: 1563963700. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:17:33,380][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 09:17:36,162][26599] Updated weights for policy 0, policy_version 323274 (0.0028) [2024-06-19 09:17:38,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5296603136. Throughput: 0: 42510.9. Samples: 1564217960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:17:38,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 09:17:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000323279_5296603136.pth... 
[2024-06-19 09:17:38,454][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000322655_5286379520.pth [2024-06-19 09:17:39,801][26599] Updated weights for policy 0, policy_version 323284 (0.0036) [2024-06-19 09:17:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5296816128. Throughput: 0: 42781.8. Samples: 1564483780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:17:43,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 09:17:43,769][26599] Updated weights for policy 0, policy_version 323294 (0.0034) [2024-06-19 09:17:47,560][26599] Updated weights for policy 0, policy_version 323304 (0.0036) [2024-06-19 09:17:48,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5297029120. Throughput: 0: 42691.5. Samples: 1564609040. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:17:48,381][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 09:17:51,417][26599] Updated weights for policy 0, policy_version 323314 (0.0040) [2024-06-19 09:17:53,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42601.0, 300 sec: 42653.9). Total num frames: 5297242112. Throughput: 0: 42705.3. Samples: 1564866320. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:17:53,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 09:17:55,078][26599] Updated weights for policy 0, policy_version 323324 (0.0042) [2024-06-19 09:17:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5297471488. Throughput: 0: 42587.1. Samples: 1565121720. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:17:58,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 09:17:59,035][26599] Updated weights for policy 0, policy_version 323334 (0.0028) [2024-06-19 09:18:03,188][26599] Updated weights for policy 0, policy_version 323344 (0.0050) [2024-06-19 09:18:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5297684480. Throughput: 0: 42863.5. Samples: 1565250760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:18:03,381][26367] Avg episode reward: [(0, '0.324')] [2024-06-19 09:18:06,567][26599] Updated weights for policy 0, policy_version 323354 (0.0036) [2024-06-19 09:18:08,384][26367] Fps is (10 sec: 42582.8, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5297897472. Throughput: 0: 42885.4. Samples: 1565508220. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:18:08,384][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 09:18:10,601][26599] Updated weights for policy 0, policy_version 323364 (0.0030) [2024-06-19 09:18:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5298110464. Throughput: 0: 42839.4. Samples: 1565767340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:18:13,384][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 09:18:14,064][26599] Updated weights for policy 0, policy_version 323374 (0.0034) [2024-06-19 09:18:18,104][26599] Updated weights for policy 0, policy_version 323384 (0.0032) [2024-06-19 09:18:18,380][26367] Fps is (10 sec: 42613.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5298323456. Throughput: 0: 42765.5. Samples: 1565888160. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:18:18,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 09:18:21,686][26599] Updated weights for policy 0, policy_version 323394 (0.0039) [2024-06-19 09:18:23,383][26367] Fps is (10 sec: 42585.4, 60 sec: 42596.2, 300 sec: 42653.5). Total num frames: 5298536448. Throughput: 0: 42713.7. Samples: 1566140200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:18:23,384][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 09:18:25,660][26599] Updated weights for policy 0, policy_version 323404 (0.0030) [2024-06-19 09:18:28,380][26367] Fps is (10 sec: 40961.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5298733056. Throughput: 0: 42613.8. Samples: 1566401400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:18:28,380][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 09:18:29,361][26599] Updated weights for policy 0, policy_version 323414 (0.0034) [2024-06-19 09:18:33,175][26579] Signal inference workers to stop experience collection... (23100 times) [2024-06-19 09:18:33,176][26579] Signal inference workers to resume experience collection... (23100 times) [2024-06-19 09:18:33,189][26599] InferenceWorker_p0-w0: stopping experience collection (23100 times) [2024-06-19 09:18:33,189][26599] InferenceWorker_p0-w0: resuming experience collection (23100 times) [2024-06-19 09:18:33,326][26599] Updated weights for policy 0, policy_version 323424 (0.0024) [2024-06-19 09:18:33,380][26367] Fps is (10 sec: 44249.9, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5298978816. Throughput: 0: 42626.2. Samples: 1566527220. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:18:33,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 09:18:37,110][26599] Updated weights for policy 0, policy_version 323434 (0.0029) [2024-06-19 09:18:38,380][26367] Fps is (10 sec: 44235.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5299175424. 
Throughput: 0: 42526.6. Samples: 1566780020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:18:38,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 09:18:41,145][26599] Updated weights for policy 0, policy_version 323444 (0.0023) [2024-06-19 09:18:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5299388416. Throughput: 0: 42616.4. Samples: 1567039460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:18:43,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 09:18:44,851][26599] Updated weights for policy 0, policy_version 323454 (0.0033) [2024-06-19 09:18:48,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5299601408. Throughput: 0: 42536.6. Samples: 1567164900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:18:48,380][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 09:18:48,803][26599] Updated weights for policy 0, policy_version 323464 (0.0026) [2024-06-19 09:18:52,621][26599] Updated weights for policy 0, policy_version 323474 (0.0028) [2024-06-19 09:18:53,384][26367] Fps is (10 sec: 42583.1, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 5299814400. Throughput: 0: 42468.9. Samples: 1567419320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:18:53,384][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 09:18:57,079][26599] Updated weights for policy 0, policy_version 323484 (0.0040) [2024-06-19 09:18:58,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5300027392. Throughput: 0: 42458.6. Samples: 1567677980. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:18:58,381][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 09:19:00,493][26599] Updated weights for policy 0, policy_version 323494 (0.0029) [2024-06-19 09:19:03,380][26367] Fps is (10 sec: 42613.2, 60 sec: 42598.3, 300 sec: 42710.0). Total num frames: 5300240384. 
Throughput: 0: 42623.1. Samples: 1567806200. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:19:03,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 09:19:04,543][26599] Updated weights for policy 0, policy_version 323504 (0.0035) [2024-06-19 09:19:08,262][26599] Updated weights for policy 0, policy_version 323514 (0.0031) [2024-06-19 09:19:08,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42601.0, 300 sec: 42654.0). Total num frames: 5300453376. Throughput: 0: 42823.0. Samples: 1568067100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:19:08,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 09:19:12,140][26599] Updated weights for policy 0, policy_version 323524 (0.0031) [2024-06-19 09:19:13,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5300666368. Throughput: 0: 42560.8. Samples: 1568316640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:19:13,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 09:19:16,105][26599] Updated weights for policy 0, policy_version 323534 (0.0046) [2024-06-19 09:19:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5300879360. Throughput: 0: 42783.1. Samples: 1568452460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:19:18,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 09:19:19,874][26599] Updated weights for policy 0, policy_version 323544 (0.0032) [2024-06-19 09:19:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42327.5, 300 sec: 42598.4). Total num frames: 5301075968. Throughput: 0: 42878.3. Samples: 1568709540. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:19:23,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 09:19:23,746][26599] Updated weights for policy 0, policy_version 323554 (0.0022) [2024-06-19 09:19:27,505][26599] Updated weights for policy 0, policy_version 323564 (0.0044) [2024-06-19 09:19:28,382][26367] Fps is (10 sec: 44227.9, 60 sec: 43142.9, 300 sec: 42765.0). Total num frames: 5301321728. Throughput: 0: 42647.4. Samples: 1568958680. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:19:28,383][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 09:19:31,585][26599] Updated weights for policy 0, policy_version 323574 (0.0055) [2024-06-19 09:19:33,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5301534720. Throughput: 0: 42858.1. Samples: 1569093520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:19:33,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 09:19:35,164][26599] Updated weights for policy 0, policy_version 323584 (0.0032) [2024-06-19 09:19:38,380][26367] Fps is (10 sec: 37691.7, 60 sec: 42052.5, 300 sec: 42487.9). Total num frames: 5301698560. Throughput: 0: 42754.7. Samples: 1569343120. Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:19:38,380][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 09:19:38,477][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000323591_5301714944.pth... [2024-06-19 09:19:38,520][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000322967_5291491328.pth [2024-06-19 09:19:39,215][26599] Updated weights for policy 0, policy_version 323594 (0.0039) [2024-06-19 09:19:42,757][26599] Updated weights for policy 0, policy_version 323604 (0.0021) [2024-06-19 09:19:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42654.5). Total num frames: 5301944320. Throughput: 0: 42677.3. Samples: 1569598460. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 25.0) [2024-06-19 09:19:43,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 09:19:46,815][26599] Updated weights for policy 0, policy_version 323614 (0.0048) [2024-06-19 09:19:47,468][26579] Signal inference workers to stop experience collection... (23150 times) [2024-06-19 09:19:47,469][26579] Signal inference workers to resume experience collection... (23150 times) [2024-06-19 09:19:47,509][26599] InferenceWorker_p0-w0: stopping experience collection (23150 times) [2024-06-19 09:19:47,509][26599] InferenceWorker_p0-w0: resuming experience collection (23150 times) [2024-06-19 09:19:48,380][26367] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5302173696. Throughput: 0: 42825.1. Samples: 1569733320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:19:48,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 09:19:50,568][26599] Updated weights for policy 0, policy_version 323624 (0.0042) [2024-06-19 09:19:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42327.8, 300 sec: 42542.9). Total num frames: 5302353920. Throughput: 0: 42463.9. Samples: 1569977980. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:19:53,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 09:19:54,429][26599] Updated weights for policy 0, policy_version 323634 (0.0032) [2024-06-19 09:19:58,128][26599] Updated weights for policy 0, policy_version 323644 (0.0038) [2024-06-19 09:19:58,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5302583296. Throughput: 0: 42599.9. Samples: 1570233640. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:19:58,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 09:20:02,037][26599] Updated weights for policy 0, policy_version 323654 (0.0025) [2024-06-19 09:20:03,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5302796288. 
Throughput: 0: 42479.3. Samples: 1570364020. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:03,380][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 09:20:05,809][26599] Updated weights for policy 0, policy_version 323664 (0.0032) [2024-06-19 09:20:08,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5303009280. Throughput: 0: 42230.2. Samples: 1570609900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:08,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 09:20:09,994][26599] Updated weights for policy 0, policy_version 323674 (0.0023) [2024-06-19 09:20:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5303222272. Throughput: 0: 42339.8. Samples: 1570863880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:13,380][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 09:20:13,400][26599] Updated weights for policy 0, policy_version 323684 (0.0037) [2024-06-19 09:20:17,815][26599] Updated weights for policy 0, policy_version 323694 (0.0040) [2024-06-19 09:20:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5303418880. Throughput: 0: 42172.9. Samples: 1570991300. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:18,381][26367] Avg episode reward: [(0, '0.758')] [2024-06-19 09:20:21,630][26599] Updated weights for policy 0, policy_version 323704 (0.0030) [2024-06-19 09:20:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5303648256. Throughput: 0: 42311.5. Samples: 1571247140. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:23,381][26367] Avg episode reward: [(0, '0.758')] [2024-06-19 09:20:25,481][26599] Updated weights for policy 0, policy_version 323714 (0.0038) [2024-06-19 09:20:28,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42053.8, 300 sec: 42654.0). Total num frames: 5303844864. 
Throughput: 0: 42359.3. Samples: 1571504620. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:28,380][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 09:20:29,198][26599] Updated weights for policy 0, policy_version 323724 (0.0048) [2024-06-19 09:20:33,099][26599] Updated weights for policy 0, policy_version 323734 (0.0026) [2024-06-19 09:20:33,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5304057856. Throughput: 0: 42211.1. Samples: 1571632820. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:33,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 09:20:36,638][26599] Updated weights for policy 0, policy_version 323744 (0.0041) [2024-06-19 09:20:38,384][26367] Fps is (10 sec: 44220.0, 60 sec: 43141.8, 300 sec: 42653.4). Total num frames: 5304287232. Throughput: 0: 42398.8. Samples: 1571886080. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:38,385][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 09:20:40,817][26599] Updated weights for policy 0, policy_version 323754 (0.0028) [2024-06-19 09:20:43,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5304467456. Throughput: 0: 42449.8. Samples: 1572143880. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:43,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 09:20:44,325][26599] Updated weights for policy 0, policy_version 323764 (0.0042) [2024-06-19 09:20:48,382][26367] Fps is (10 sec: 39330.8, 60 sec: 41778.2, 300 sec: 42598.2). Total num frames: 5304680448. Throughput: 0: 42134.7. Samples: 1572260140. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:48,382][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 09:20:48,984][26599] Updated weights for policy 0, policy_version 323774 (0.0033) [2024-06-19 09:20:52,021][26599] Updated weights for policy 0, policy_version 323784 (0.0044) [2024-06-19 09:20:53,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5304926208. Throughput: 0: 42354.2. Samples: 1572515840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:53,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 09:20:56,463][26599] Updated weights for policy 0, policy_version 323794 (0.0027) [2024-06-19 09:20:58,380][26367] Fps is (10 sec: 42604.4, 60 sec: 42052.4, 300 sec: 42598.9). Total num frames: 5305106432. Throughput: 0: 42580.4. Samples: 1572780000. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 09:20:58,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 09:20:59,681][26599] Updated weights for policy 0, policy_version 323804 (0.0023) [2024-06-19 09:21:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5305335808. Throughput: 0: 42442.3. Samples: 1572901200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:03,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 09:21:03,991][26599] Updated weights for policy 0, policy_version 323814 (0.0043) [2024-06-19 09:21:07,286][26599] Updated weights for policy 0, policy_version 323824 (0.0027) [2024-06-19 09:21:08,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42598.5). Total num frames: 5305565184. Throughput: 0: 42430.7. Samples: 1573156520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:08,380][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 09:21:09,497][26579] Signal inference workers to stop experience collection... 
(23200 times) [2024-06-19 09:21:09,548][26599] InferenceWorker_p0-w0: stopping experience collection (23200 times) [2024-06-19 09:21:09,550][26579] Signal inference workers to resume experience collection... (23200 times) [2024-06-19 09:21:09,559][26599] InferenceWorker_p0-w0: resuming experience collection (23200 times) [2024-06-19 09:21:11,802][26599] Updated weights for policy 0, policy_version 323834 (0.0027) [2024-06-19 09:21:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5305761792. Throughput: 0: 42657.3. Samples: 1573424200. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:13,380][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 09:21:14,991][26599] Updated weights for policy 0, policy_version 323844 (0.0033) [2024-06-19 09:21:18,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42543.4). Total num frames: 5305958400. Throughput: 0: 42512.0. Samples: 1573545860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:18,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 09:21:19,431][26599] Updated weights for policy 0, policy_version 323854 (0.0037) [2024-06-19 09:21:22,790][26599] Updated weights for policy 0, policy_version 323864 (0.0028) [2024-06-19 09:21:23,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5306204160. Throughput: 0: 42503.9. Samples: 1573798600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:23,381][26367] Avg episode reward: [(0, '0.320')] [2024-06-19 09:21:27,159][26599] Updated weights for policy 0, policy_version 323874 (0.0037) [2024-06-19 09:21:28,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 5306417152. Throughput: 0: 42746.3. Samples: 1574067460. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:28,381][26367] Avg episode reward: [(0, '0.341')] [2024-06-19 09:21:30,271][26599] Updated weights for policy 0, policy_version 323884 (0.0045) [2024-06-19 09:21:33,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5306597376. Throughput: 0: 42850.0. Samples: 1574188340. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:33,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 09:21:35,068][26599] Updated weights for policy 0, policy_version 323894 (0.0047) [2024-06-19 09:21:37,826][26599] Updated weights for policy 0, policy_version 323904 (0.0035) [2024-06-19 09:21:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5306843136. Throughput: 0: 42798.7. Samples: 1574441780. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:38,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 09:21:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000323904_5306843136.pth... [2024-06-19 09:21:38,441][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000323279_5296603136.pth [2024-06-19 09:21:42,581][26599] Updated weights for policy 0, policy_version 323914 (0.0029) [2024-06-19 09:21:43,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5307023360. Throughput: 0: 42581.3. Samples: 1574696160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:43,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 09:21:45,815][26599] Updated weights for policy 0, policy_version 323924 (0.0034) [2024-06-19 09:21:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42872.4, 300 sec: 42598.9). Total num frames: 5307252736. Throughput: 0: 42602.1. Samples: 1574818300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:48,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 09:21:50,221][26599] Updated weights for policy 0, policy_version 323934 (0.0035) [2024-06-19 09:21:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5307465728. Throughput: 0: 42696.5. Samples: 1575077860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:53,380][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 09:21:53,523][26599] Updated weights for policy 0, policy_version 323944 (0.0027) [2024-06-19 09:21:57,859][26599] Updated weights for policy 0, policy_version 323954 (0.0032) [2024-06-19 09:21:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5307678720. Throughput: 0: 42455.4. Samples: 1575334700. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:21:58,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 09:22:01,572][26599] Updated weights for policy 0, policy_version 323964 (0.0053) [2024-06-19 09:22:03,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5307891712. Throughput: 0: 42578.1. Samples: 1575461880. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:22:03,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 09:22:05,483][26599] Updated weights for policy 0, policy_version 323974 (0.0034) [2024-06-19 09:22:08,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42322.7, 300 sec: 42653.4). Total num frames: 5308104704. Throughput: 0: 42553.5. Samples: 1575713660. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:22:08,384][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 09:22:09,151][26599] Updated weights for policy 0, policy_version 323984 (0.0031) [2024-06-19 09:22:13,173][26599] Updated weights for policy 0, policy_version 323994 (0.0038) [2024-06-19 09:22:13,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5308317696. Throughput: 0: 42190.3. Samples: 1575966020. Policy #0 lag: (min: 0.0, avg: 9.5, max: 22.0) [2024-06-19 09:22:13,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 09:22:17,466][26599] Updated weights for policy 0, policy_version 324004 (0.0031) [2024-06-19 09:22:18,380][26367] Fps is (10 sec: 42613.6, 60 sec: 42871.4, 300 sec: 42542.8). Total num frames: 5308530688. Throughput: 0: 42367.6. Samples: 1576094880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:22:18,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 09:22:20,716][26599] Updated weights for policy 0, policy_version 324014 (0.0031) [2024-06-19 09:22:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5308743680. Throughput: 0: 42397.8. Samples: 1576349680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:22:23,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 09:22:25,257][26599] Updated weights for policy 0, policy_version 324024 (0.0043) [2024-06-19 09:22:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5308956672. Throughput: 0: 42312.7. Samples: 1576600240. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:22:28,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 09:22:28,491][26599] Updated weights for policy 0, policy_version 324034 (0.0030) [2024-06-19 09:22:32,991][26599] Updated weights for policy 0, policy_version 324044 (0.0036) [2024-06-19 09:22:33,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.5, 300 sec: 42487.4). Total num frames: 5309136896. Throughput: 0: 42494.4. Samples: 1576730540. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:22:33,380][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 09:22:36,190][26599] Updated weights for policy 0, policy_version 324054 (0.0032) [2024-06-19 09:22:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 5309366272. Throughput: 0: 42382.4. Samples: 1576985080. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:22:38,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 09:22:40,832][26599] Updated weights for policy 0, policy_version 324064 (0.0034) [2024-06-19 09:22:41,873][26579] Signal inference workers to stop experience collection... (23250 times) [2024-06-19 09:22:41,874][26579] Signal inference workers to resume experience collection... (23250 times) [2024-06-19 09:22:41,924][26599] InferenceWorker_p0-w0: stopping experience collection (23250 times) [2024-06-19 09:22:41,924][26599] InferenceWorker_p0-w0: resuming experience collection (23250 times) [2024-06-19 09:22:43,380][26367] Fps is (10 sec: 45874.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5309595648. Throughput: 0: 42313.3. Samples: 1577238800. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:22:43,381][26367] Avg episode reward: [(0, '0.768')] [2024-06-19 09:22:43,801][26599] Updated weights for policy 0, policy_version 324074 (0.0038) [2024-06-19 09:22:48,381][26367] Fps is (10 sec: 39321.1, 60 sec: 41779.0, 300 sec: 42431.7). Total num frames: 5309759488. 
Throughput: 0: 42322.0. Samples: 1577366380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:22:48,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 09:22:48,535][26599] Updated weights for policy 0, policy_version 324084 (0.0036) [2024-06-19 09:22:51,479][26599] Updated weights for policy 0, policy_version 324094 (0.0046) [2024-06-19 09:22:53,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5310005248. Throughput: 0: 42285.2. Samples: 1577616340. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:22:53,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 09:22:56,390][26599] Updated weights for policy 0, policy_version 324104 (0.0038) [2024-06-19 09:22:58,384][26367] Fps is (10 sec: 47497.4, 60 sec: 42595.8, 300 sec: 42542.3). Total num frames: 5310234624. Throughput: 0: 42391.2. Samples: 1577873780. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:22:58,385][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 09:22:59,106][26599] Updated weights for policy 0, policy_version 324114 (0.0034) [2024-06-19 09:23:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42432.3). Total num frames: 5310414848. Throughput: 0: 42275.6. Samples: 1577997280. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:23:03,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 09:23:04,078][26599] Updated weights for policy 0, policy_version 324124 (0.0058) [2024-06-19 09:23:06,917][26599] Updated weights for policy 0, policy_version 324134 (0.0037) [2024-06-19 09:23:08,380][26367] Fps is (10 sec: 40975.6, 60 sec: 42328.0, 300 sec: 42487.3). Total num frames: 5310644224. Throughput: 0: 42171.2. Samples: 1578247380. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:23:08,380][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 09:23:11,892][26599] Updated weights for policy 0, policy_version 324144 (0.0047) [2024-06-19 09:23:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5310857216. Throughput: 0: 42328.2. Samples: 1578505000. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:23:13,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 09:23:14,543][26599] Updated weights for policy 0, policy_version 324154 (0.0048) [2024-06-19 09:23:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.3, 300 sec: 42376.7). Total num frames: 5311037440. Throughput: 0: 42304.9. Samples: 1578634260. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:23:18,380][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 09:23:19,622][26599] Updated weights for policy 0, policy_version 324164 (0.0038) [2024-06-19 09:23:22,150][26599] Updated weights for policy 0, policy_version 324174 (0.0033) [2024-06-19 09:23:23,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 5311283200. Throughput: 0: 42201.5. Samples: 1578884140. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:23:23,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 09:23:27,097][26599] Updated weights for policy 0, policy_version 324184 (0.0040) [2024-06-19 09:23:28,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 5311496192. Throughput: 0: 42519.7. Samples: 1579152180. Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:23:28,380][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 09:23:29,795][26599] Updated weights for policy 0, policy_version 324194 (0.0035) [2024-06-19 09:23:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42431.8). Total num frames: 5311692800. Throughput: 0: 42585.7. Samples: 1579282720. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 22.0) [2024-06-19 09:23:33,380][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 09:23:34,598][26599] Updated weights for policy 0, policy_version 324204 (0.0041) [2024-06-19 09:23:37,866][26599] Updated weights for policy 0, policy_version 324214 (0.0039) [2024-06-19 09:23:38,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 5311938560. Throughput: 0: 42675.0. Samples: 1579536720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:23:38,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 09:23:38,559][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000324216_5311954944.pth... [2024-06-19 09:23:38,624][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000323591_5301714944.pth [2024-06-19 09:23:42,128][26599] Updated weights for policy 0, policy_version 324224 (0.0031) [2024-06-19 09:23:43,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5312151552. Throughput: 0: 42820.5. Samples: 1579800540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:23:43,380][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 09:23:45,376][26599] Updated weights for policy 0, policy_version 324234 (0.0038) [2024-06-19 09:23:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 43144.7, 300 sec: 42487.8). Total num frames: 5312348160. Throughput: 0: 42889.3. Samples: 1579927300. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:23:48,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 09:23:49,904][26599] Updated weights for policy 0, policy_version 324244 (0.0023) [2024-06-19 09:23:52,858][26599] Updated weights for policy 0, policy_version 324254 (0.0046) [2024-06-19 09:23:53,384][26367] Fps is (10 sec: 42582.6, 60 sec: 42868.8, 300 sec: 42542.3). Total num frames: 5312577536. Throughput: 0: 43009.7. Samples: 1580182980. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:23:53,385][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 09:23:57,579][26599] Updated weights for policy 0, policy_version 324264 (0.0028) [2024-06-19 09:23:58,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42601.0, 300 sec: 42542.9). Total num frames: 5312790528. Throughput: 0: 43094.6. Samples: 1580444260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:23:58,380][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 09:24:00,883][26599] Updated weights for policy 0, policy_version 324274 (0.0034) [2024-06-19 09:24:03,380][26367] Fps is (10 sec: 40975.1, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5312987136. Throughput: 0: 43022.2. Samples: 1580570260. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:03,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 09:24:05,210][26579] Signal inference workers to stop experience collection... (23300 times) [2024-06-19 09:24:05,247][26599] InferenceWorker_p0-w0: stopping experience collection (23300 times) [2024-06-19 09:24:05,266][26579] Signal inference workers to resume experience collection... (23300 times) [2024-06-19 09:24:05,266][26599] InferenceWorker_p0-w0: resuming experience collection (23300 times) [2024-06-19 09:24:05,270][26599] Updated weights for policy 0, policy_version 324284 (0.0038) [2024-06-19 09:24:08,247][26599] Updated weights for policy 0, policy_version 324294 (0.0030) [2024-06-19 09:24:08,384][26367] Fps is (10 sec: 44220.7, 60 sec: 43141.9, 300 sec: 42597.9). Total num frames: 5313232896. Throughput: 0: 43145.4. Samples: 1580825840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:08,384][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 09:24:12,806][26599] Updated weights for policy 0, policy_version 324304 (0.0041) [2024-06-19 09:24:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 5313413120. Throughput: 0: 43081.3. 
Samples: 1581090840. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:13,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 09:24:15,781][26599] Updated weights for policy 0, policy_version 324314 (0.0037) [2024-06-19 09:24:18,384][26367] Fps is (10 sec: 40959.7, 60 sec: 43414.9, 300 sec: 42597.9). Total num frames: 5313642496. Throughput: 0: 42742.2. Samples: 1581206280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:18,385][26367] Avg episode reward: [(0, '0.221')] [2024-06-19 09:24:20,586][26599] Updated weights for policy 0, policy_version 324324 (0.0037) [2024-06-19 09:24:23,380][26367] Fps is (10 sec: 45875.0, 60 sec: 43144.6, 300 sec: 42543.2). Total num frames: 5313871872. Throughput: 0: 42866.8. Samples: 1581465720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:23,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 09:24:23,399][26599] Updated weights for policy 0, policy_version 324334 (0.0033) [2024-06-19 09:24:28,049][26599] Updated weights for policy 0, policy_version 324344 (0.0027) [2024-06-19 09:24:28,380][26367] Fps is (10 sec: 42614.5, 60 sec: 42871.5, 300 sec: 42487.4). Total num frames: 5314068480. Throughput: 0: 42908.9. Samples: 1581731440. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:28,380][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 09:24:31,155][26599] Updated weights for policy 0, policy_version 324354 (0.0033) [2024-06-19 09:24:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5314265088. Throughput: 0: 42751.2. Samples: 1581851100. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:33,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 09:24:35,889][26599] Updated weights for policy 0, policy_version 324364 (0.0034) [2024-06-19 09:24:38,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5314510848. Throughput: 0: 42674.5. 
Samples: 1582103180. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:38,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 09:24:38,887][26599] Updated weights for policy 0, policy_version 324374 (0.0033) [2024-06-19 09:24:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5314691072. Throughput: 0: 42817.4. Samples: 1582371040. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:43,380][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 09:24:43,531][26599] Updated weights for policy 0, policy_version 324384 (0.0029) [2024-06-19 09:24:46,394][26599] Updated weights for policy 0, policy_version 324394 (0.0035) [2024-06-19 09:24:48,381][26367] Fps is (10 sec: 40956.3, 60 sec: 42870.8, 300 sec: 42598.3). Total num frames: 5314920448. Throughput: 0: 42635.5. Samples: 1582488900. Policy #0 lag: (min: 0.0, avg: 8.7, max: 23.0) [2024-06-19 09:24:48,382][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 09:24:51,145][26599] Updated weights for policy 0, policy_version 324404 (0.0038) [2024-06-19 09:24:53,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42600.9, 300 sec: 42542.9). Total num frames: 5315133440. Throughput: 0: 42644.2. Samples: 1582744680. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:24:53,381][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 09:24:54,229][26599] Updated weights for policy 0, policy_version 324414 (0.0034) [2024-06-19 09:24:58,384][26367] Fps is (10 sec: 40949.3, 60 sec: 42322.8, 300 sec: 42486.8). Total num frames: 5315330048. Throughput: 0: 42710.7. Samples: 1583012980. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:24:58,384][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 09:24:58,762][26599] Updated weights for policy 0, policy_version 324424 (0.0033) [2024-06-19 09:25:01,922][26599] Updated weights for policy 0, policy_version 324434 (0.0044) [2024-06-19 09:25:03,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5315575808. Throughput: 0: 42683.0. Samples: 1583126860. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:03,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 09:25:05,160][26579] Signal inference workers to stop experience collection... (23350 times) [2024-06-19 09:25:05,168][26579] Signal inference workers to resume experience collection... (23350 times) [2024-06-19 09:25:05,202][26599] InferenceWorker_p0-w0: stopping experience collection (23350 times) [2024-06-19 09:25:05,202][26599] InferenceWorker_p0-w0: resuming experience collection (23350 times) [2024-06-19 09:25:06,479][26599] Updated weights for policy 0, policy_version 324444 (0.0028) [2024-06-19 09:25:08,380][26367] Fps is (10 sec: 45891.5, 60 sec: 42600.9, 300 sec: 42598.4). Total num frames: 5315788800. Throughput: 0: 42715.4. Samples: 1583387920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:08,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 09:25:09,486][26599] Updated weights for policy 0, policy_version 324454 (0.0034) [2024-06-19 09:25:13,380][26367] Fps is (10 sec: 37683.6, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5315952640. Throughput: 0: 42594.6. Samples: 1583648200. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:13,380][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 09:25:13,982][26599] Updated weights for policy 0, policy_version 324464 (0.0027) [2024-06-19 09:25:17,150][26599] Updated weights for policy 0, policy_version 324474 (0.0048) [2024-06-19 09:25:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42874.1, 300 sec: 42598.4). Total num frames: 5316214784. Throughput: 0: 42558.1. Samples: 1583766220. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:18,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 09:25:21,976][26599] Updated weights for policy 0, policy_version 324484 (0.0039) [2024-06-19 09:25:23,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5316411392. Throughput: 0: 42719.3. Samples: 1584025540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:23,380][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 09:25:25,200][26599] Updated weights for policy 0, policy_version 324494 (0.0028) [2024-06-19 09:25:28,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5316608000. Throughput: 0: 42498.5. Samples: 1584283480. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:28,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 09:25:29,903][26599] Updated weights for policy 0, policy_version 324504 (0.0031) [2024-06-19 09:25:32,901][26599] Updated weights for policy 0, policy_version 324514 (0.0046) [2024-06-19 09:25:33,383][26367] Fps is (10 sec: 44225.3, 60 sec: 43142.6, 300 sec: 42598.6). Total num frames: 5316853760. Throughput: 0: 42542.5. Samples: 1584403380. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:33,383][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 09:25:37,577][26599] Updated weights for policy 0, policy_version 324524 (0.0029) [2024-06-19 09:25:38,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 5317033984. Throughput: 0: 42603.7. Samples: 1584661840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:38,380][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 09:25:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000324527_5317050368.pth... [2024-06-19 09:25:38,446][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000323904_5306843136.pth [2024-06-19 09:25:40,448][26599] Updated weights for policy 0, policy_version 324534 (0.0034) [2024-06-19 09:25:43,380][26367] Fps is (10 sec: 37692.7, 60 sec: 42325.2, 300 sec: 42543.1). Total num frames: 5317230592. Throughput: 0: 42233.6. Samples: 1584913340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:43,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 09:25:45,298][26599] Updated weights for policy 0, policy_version 324544 (0.0035) [2024-06-19 09:25:48,353][26599] Updated weights for policy 0, policy_version 324554 (0.0022) [2024-06-19 09:25:48,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42872.2, 300 sec: 42598.4). Total num frames: 5317492736. Throughput: 0: 42433.0. Samples: 1585036340. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:48,380][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 09:25:53,011][26599] Updated weights for policy 0, policy_version 324564 (0.0030) [2024-06-19 09:25:53,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 5317656576. Throughput: 0: 42320.1. Samples: 1585292320. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:53,380][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 09:25:55,819][26599] Updated weights for policy 0, policy_version 324574 (0.0035) [2024-06-19 09:25:58,380][26367] Fps is (10 sec: 37682.6, 60 sec: 42327.8, 300 sec: 42487.3). Total num frames: 5317869568. Throughput: 0: 42251.4. Samples: 1585549520. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:25:58,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 09:26:00,734][26599] Updated weights for policy 0, policy_version 324584 (0.0036) [2024-06-19 09:26:03,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42325.4, 300 sec: 42542.8). Total num frames: 5318115328. Throughput: 0: 42328.9. Samples: 1585671020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 09:26:03,381][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 09:26:03,639][26599] Updated weights for policy 0, policy_version 324594 (0.0037) [2024-06-19 09:26:08,380][26367] Fps is (10 sec: 42599.3, 60 sec: 41779.3, 300 sec: 42487.3). Total num frames: 5318295552. Throughput: 0: 42360.5. Samples: 1585931760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:08,380][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 09:26:08,411][26579] Signal inference workers to stop experience collection... (23400 times) [2024-06-19 09:26:08,413][26579] Signal inference workers to resume experience collection... (23400 times) [2024-06-19 09:26:08,421][26599] InferenceWorker_p0-w0: stopping experience collection (23400 times) [2024-06-19 09:26:08,424][26599] Updated weights for policy 0, policy_version 324604 (0.0028) [2024-06-19 09:26:08,436][26599] InferenceWorker_p0-w0: resuming experience collection (23400 times) [2024-06-19 09:26:11,405][26599] Updated weights for policy 0, policy_version 324614 (0.0042) [2024-06-19 09:26:13,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5318524928. Throughput: 0: 42222.0. 
Samples: 1586183460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:13,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 09:26:15,983][26599] Updated weights for policy 0, policy_version 324624 (0.0031) [2024-06-19 09:26:18,384][26367] Fps is (10 sec: 47495.8, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 5318770688. Throughput: 0: 42331.0. Samples: 1586308320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:18,385][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 09:26:19,133][26599] Updated weights for policy 0, policy_version 324634 (0.0043) [2024-06-19 09:26:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5318950912. Throughput: 0: 42392.4. Samples: 1586569500. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:23,380][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 09:26:23,469][26599] Updated weights for policy 0, policy_version 324644 (0.0042) [2024-06-19 09:26:26,867][26599] Updated weights for policy 0, policy_version 324654 (0.0028) [2024-06-19 09:26:28,380][26367] Fps is (10 sec: 39336.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5319163904. Throughput: 0: 42222.7. Samples: 1586813360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:28,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 09:26:31,417][26599] Updated weights for policy 0, policy_version 324664 (0.0029) [2024-06-19 09:26:33,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42327.1, 300 sec: 42542.9). Total num frames: 5319393280. Throughput: 0: 42408.9. Samples: 1586944740. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:33,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 09:26:34,807][26599] Updated weights for policy 0, policy_version 324674 (0.0023) [2024-06-19 09:26:38,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5319573504. Throughput: 0: 42475.4. 
Samples: 1587203720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:38,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 09:26:39,160][26599] Updated weights for policy 0, policy_version 324684 (0.0033) [2024-06-19 09:26:42,501][26599] Updated weights for policy 0, policy_version 324694 (0.0037) [2024-06-19 09:26:43,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5319802880. Throughput: 0: 42236.2. Samples: 1587450140. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:43,380][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 09:26:46,736][26599] Updated weights for policy 0, policy_version 324704 (0.0032) [2024-06-19 09:26:48,384][26367] Fps is (10 sec: 45858.7, 60 sec: 42322.7, 300 sec: 42597.9). Total num frames: 5320032256. Throughput: 0: 42603.2. Samples: 1587588320. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:48,393][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 09:26:50,025][26599] Updated weights for policy 0, policy_version 324714 (0.0044) [2024-06-19 09:26:53,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5320212480. Throughput: 0: 42504.3. Samples: 1587844460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:53,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 09:26:54,343][26599] Updated weights for policy 0, policy_version 324724 (0.0029) [2024-06-19 09:26:57,716][26599] Updated weights for policy 0, policy_version 324734 (0.0040) [2024-06-19 09:26:58,380][26367] Fps is (10 sec: 42614.5, 60 sec: 43144.7, 300 sec: 42598.4). Total num frames: 5320458240. Throughput: 0: 42540.9. Samples: 1588097800. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:26:58,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 09:27:02,053][26599] Updated weights for policy 0, policy_version 324744 (0.0051) [2024-06-19 09:27:03,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.4, 300 sec: 42543.4). Total num frames: 5320654848. Throughput: 0: 42748.4. Samples: 1588231840. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:27:03,381][26367] Avg episode reward: [(0, '0.826')] [2024-06-19 09:27:05,355][26599] Updated weights for policy 0, policy_version 324754 (0.0023) [2024-06-19 09:27:08,380][26367] Fps is (10 sec: 39320.8, 60 sec: 42598.2, 300 sec: 42487.3). Total num frames: 5320851456. Throughput: 0: 42584.3. Samples: 1588485800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:27:08,381][26367] Avg episode reward: [(0, '0.827')] [2024-06-19 09:27:09,641][26599] Updated weights for policy 0, policy_version 324764 (0.0040) [2024-06-19 09:27:13,150][26599] Updated weights for policy 0, policy_version 324774 (0.0025) [2024-06-19 09:27:13,386][26367] Fps is (10 sec: 44211.3, 60 sec: 42867.3, 300 sec: 42597.6). Total num frames: 5321097216. Throughput: 0: 42791.4. Samples: 1588739220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:27:13,386][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 09:27:17,465][26599] Updated weights for policy 0, policy_version 324784 (0.0031) [2024-06-19 09:27:18,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42327.7, 300 sec: 42598.4). Total num frames: 5321310208. Throughput: 0: 42814.4. Samples: 1588871400. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:27:18,381][26367] Avg episode reward: [(0, '0.775')] [2024-06-19 09:27:21,235][26599] Updated weights for policy 0, policy_version 324794 (0.0040) [2024-06-19 09:27:23,380][26367] Fps is (10 sec: 40983.3, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5321506816. Throughput: 0: 42736.5. Samples: 1589126860. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:27:23,381][26367] Avg episode reward: [(0, '0.817')] [2024-06-19 09:27:25,618][26599] Updated weights for policy 0, policy_version 324804 (0.0031) [2024-06-19 09:27:28,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5321736192. Throughput: 0: 42781.3. Samples: 1589375300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:27:28,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 09:27:28,838][26599] Updated weights for policy 0, policy_version 324814 (0.0035) [2024-06-19 09:27:30,594][26579] Signal inference workers to stop experience collection... (23450 times) [2024-06-19 09:27:30,648][26599] InferenceWorker_p0-w0: stopping experience collection (23450 times) [2024-06-19 09:27:30,704][26579] Signal inference workers to resume experience collection... (23450 times) [2024-06-19 09:27:30,704][26599] InferenceWorker_p0-w0: resuming experience collection (23450 times) [2024-06-19 09:27:33,034][26599] Updated weights for policy 0, policy_version 324824 (0.0043) [2024-06-19 09:27:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5321932800. Throughput: 0: 42644.3. Samples: 1589507160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:27:33,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 09:27:36,416][26599] Updated weights for policy 0, policy_version 324834 (0.0034) [2024-06-19 09:27:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5322145792. Throughput: 0: 42673.8. Samples: 1589764780. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:27:38,380][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 09:27:38,388][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000324839_5322162176.pth... 
[2024-06-19 09:27:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000324216_5311954944.pth [2024-06-19 09:27:40,557][26599] Updated weights for policy 0, policy_version 324844 (0.0033) [2024-06-19 09:27:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5322375168. Throughput: 0: 42709.2. Samples: 1590019720. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:27:43,381][26367] Avg episode reward: [(0, '0.365')] [2024-06-19 09:27:44,862][26599] Updated weights for policy 0, policy_version 324854 (0.0043) [2024-06-19 09:27:48,028][26599] Updated weights for policy 0, policy_version 324864 (0.0033) [2024-06-19 09:27:48,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42328.0, 300 sec: 42598.4). Total num frames: 5322571776. Throughput: 0: 42474.7. Samples: 1590143200. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:27:48,380][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 09:27:52,414][26599] Updated weights for policy 0, policy_version 324874 (0.0029) [2024-06-19 09:27:53,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42871.5, 300 sec: 42543.4). Total num frames: 5322784768. Throughput: 0: 42676.6. Samples: 1590406240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:27:53,380][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 09:27:55,901][26599] Updated weights for policy 0, policy_version 324884 (0.0026) [2024-06-19 09:27:58,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5323030528. Throughput: 0: 42598.7. Samples: 1590655920. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:27:58,381][26367] Avg episode reward: [(0, '0.454')] [2024-06-19 09:28:00,148][26599] Updated weights for policy 0, policy_version 324894 (0.0024) [2024-06-19 09:28:03,264][26599] Updated weights for policy 0, policy_version 324904 (0.0032) [2024-06-19 09:28:03,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5323227136. Throughput: 0: 42609.5. Samples: 1590788820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:28:03,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 09:28:07,706][26599] Updated weights for policy 0, policy_version 324914 (0.0029) [2024-06-19 09:28:08,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5323423744. Throughput: 0: 42739.6. Samples: 1591050140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:28:08,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 09:28:10,727][26599] Updated weights for policy 0, policy_version 324924 (0.0042) [2024-06-19 09:28:13,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42875.6, 300 sec: 42820.6). Total num frames: 5323669504. Throughput: 0: 42837.8. Samples: 1591303000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:28:13,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 09:28:15,346][26599] Updated weights for policy 0, policy_version 324934 (0.0037) [2024-06-19 09:28:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 5323866112. Throughput: 0: 42833.9. Samples: 1591434680. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:28:18,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 09:28:18,485][26599] Updated weights for policy 0, policy_version 324944 (0.0034) [2024-06-19 09:28:23,319][26599] Updated weights for policy 0, policy_version 324954 (0.0033) [2024-06-19 09:28:23,384][26367] Fps is (10 sec: 37669.2, 60 sec: 42322.8, 300 sec: 42542.3). Total num frames: 5324046336. Throughput: 0: 42705.4. Samples: 1591686680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:28:23,384][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 09:28:26,295][26599] Updated weights for policy 0, policy_version 324964 (0.0043) [2024-06-19 09:28:28,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5324308480. Throughput: 0: 42681.7. Samples: 1591940400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:28:28,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 09:28:30,752][26599] Updated weights for policy 0, policy_version 324974 (0.0045) [2024-06-19 09:28:33,380][26367] Fps is (10 sec: 45891.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5324505088. Throughput: 0: 42994.5. Samples: 1592077960. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:28:33,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 09:28:34,034][26599] Updated weights for policy 0, policy_version 324984 (0.0033) [2024-06-19 09:28:38,223][26599] Updated weights for policy 0, policy_version 324994 (0.0030) [2024-06-19 09:28:38,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5324701696. Throughput: 0: 42623.4. Samples: 1592324300. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:28:38,381][26367] Avg episode reward: [(0, '0.360')] [2024-06-19 09:28:41,655][26599] Updated weights for policy 0, policy_version 325004 (0.0036) [2024-06-19 09:28:43,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5324947456. Throughput: 0: 42745.9. Samples: 1592579480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:28:43,381][26367] Avg episode reward: [(0, '0.360')] [2024-06-19 09:28:45,756][26599] Updated weights for policy 0, policy_version 325014 (0.0041) [2024-06-19 09:28:48,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42543.4). Total num frames: 5325127680. Throughput: 0: 42849.5. Samples: 1592717040. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:28:48,380][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 09:28:49,675][26599] Updated weights for policy 0, policy_version 325024 (0.0037) [2024-06-19 09:28:53,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5325324288. Throughput: 0: 42501.4. Samples: 1592962700. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:28:53,380][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 09:28:53,746][26599] Updated weights for policy 0, policy_version 325034 (0.0045) [2024-06-19 09:28:57,169][26599] Updated weights for policy 0, policy_version 325044 (0.0038) [2024-06-19 09:28:58,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5325586432. Throughput: 0: 42583.1. Samples: 1593219240. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:28:58,380][26367] Avg episode reward: [(0, '0.365')] [2024-06-19 09:29:01,179][26599] Updated weights for policy 0, policy_version 325054 (0.0025) [2024-06-19 09:29:03,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42598.5, 300 sec: 42543.4). Total num frames: 5325783040. Throughput: 0: 42740.9. Samples: 1593358020. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:03,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 09:29:04,672][26599] Updated weights for policy 0, policy_version 325064 (0.0041) [2024-06-19 09:29:05,172][26579] Signal inference workers to stop experience collection... (23500 times) [2024-06-19 09:29:05,174][26579] Signal inference workers to resume experience collection... (23500 times) [2024-06-19 09:29:05,216][26599] InferenceWorker_p0-w0: stopping experience collection (23500 times) [2024-06-19 09:29:05,216][26599] InferenceWorker_p0-w0: resuming experience collection (23500 times) [2024-06-19 09:29:08,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5325979648. Throughput: 0: 42708.4. Samples: 1593608400. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:08,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 09:29:08,578][26599] Updated weights for policy 0, policy_version 325074 (0.0041) [2024-06-19 09:29:12,451][26599] Updated weights for policy 0, policy_version 325084 (0.0045) [2024-06-19 09:29:13,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42654.5). Total num frames: 5326225408. Throughput: 0: 42705.8. Samples: 1593862160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:13,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 09:29:16,045][26599] Updated weights for policy 0, policy_version 325094 (0.0033) [2024-06-19 09:29:18,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5326422016. Throughput: 0: 42607.6. Samples: 1593995300. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:18,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 09:29:19,906][26599] Updated weights for policy 0, policy_version 325104 (0.0028) [2024-06-19 09:29:23,380][26367] Fps is (10 sec: 42599.1, 60 sec: 43420.3, 300 sec: 42653.9). Total num frames: 5326651392. 
Throughput: 0: 42858.8. Samples: 1594252940. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:23,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 09:29:23,509][26599] Updated weights for policy 0, policy_version 325114 (0.0041) [2024-06-19 09:29:27,389][26599] Updated weights for policy 0, policy_version 325124 (0.0033) [2024-06-19 09:29:28,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5326864384. Throughput: 0: 43064.0. Samples: 1594517360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:28,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 09:29:31,217][26599] Updated weights for policy 0, policy_version 325134 (0.0033) [2024-06-19 09:29:33,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5327060992. Throughput: 0: 42862.5. Samples: 1594645860. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:33,384][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 09:29:34,867][26599] Updated weights for policy 0, policy_version 325144 (0.0046) [2024-06-19 09:29:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5327290368. Throughput: 0: 42998.1. Samples: 1594897620. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:38,384][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 09:29:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000325152_5327290368.pth... [2024-06-19 09:29:38,451][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000324527_5317050368.pth [2024-06-19 09:29:39,053][26599] Updated weights for policy 0, policy_version 325154 (0.0039) [2024-06-19 09:29:42,535][26599] Updated weights for policy 0, policy_version 325164 (0.0032) [2024-06-19 09:29:43,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42654.1). Total num frames: 5327503360. Throughput: 0: 43069.3. 
Samples: 1595157360. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:43,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 09:29:46,856][26599] Updated weights for policy 0, policy_version 325174 (0.0039) [2024-06-19 09:29:48,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42868.8, 300 sec: 42597.9). Total num frames: 5327699968. Throughput: 0: 42828.9. Samples: 1595285480. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 09:29:48,385][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 09:29:50,306][26599] Updated weights for policy 0, policy_version 325184 (0.0024) [2024-06-19 09:29:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43690.6, 300 sec: 42765.5). Total num frames: 5327945728. Throughput: 0: 42880.0. Samples: 1595538000. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:29:53,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 09:29:54,519][26599] Updated weights for policy 0, policy_version 325194 (0.0039) [2024-06-19 09:29:57,960][26599] Updated weights for policy 0, policy_version 325204 (0.0031) [2024-06-19 09:29:58,381][26367] Fps is (10 sec: 45890.8, 60 sec: 42871.2, 300 sec: 42653.9). Total num frames: 5328158720. Throughput: 0: 42987.4. Samples: 1595796600. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:29:58,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 09:30:02,056][26599] Updated weights for policy 0, policy_version 325214 (0.0045) [2024-06-19 09:30:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5328355328. Throughput: 0: 42968.6. Samples: 1595928880. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:03,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 09:30:05,549][26599] Updated weights for policy 0, policy_version 325224 (0.0030) [2024-06-19 09:30:08,380][26367] Fps is (10 sec: 40961.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5328568320. Throughput: 0: 42781.8. 
Samples: 1596178120. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:08,380][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 09:30:09,791][26599] Updated weights for policy 0, policy_version 325234 (0.0029) [2024-06-19 09:30:13,191][26599] Updated weights for policy 0, policy_version 325244 (0.0047) [2024-06-19 09:30:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5328797696. Throughput: 0: 42742.2. Samples: 1596440760. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:13,380][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 09:30:14,417][26579] Signal inference workers to stop experience collection... (23550 times) [2024-06-19 09:30:14,465][26599] InferenceWorker_p0-w0: stopping experience collection (23550 times) [2024-06-19 09:30:14,472][26579] Signal inference workers to resume experience collection... (23550 times) [2024-06-19 09:30:14,485][26599] InferenceWorker_p0-w0: resuming experience collection (23550 times) [2024-06-19 09:30:17,446][26599] Updated weights for policy 0, policy_version 325254 (0.0032) [2024-06-19 09:30:18,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5328977920. Throughput: 0: 42800.9. Samples: 1596571900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:18,381][26367] Avg episode reward: [(0, '0.808')] [2024-06-19 09:30:20,942][26599] Updated weights for policy 0, policy_version 325264 (0.0029) [2024-06-19 09:30:23,384][26367] Fps is (10 sec: 42582.5, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5329223680. Throughput: 0: 42758.3. Samples: 1596821900. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:23,385][26367] Avg episode reward: [(0, '0.808')] [2024-06-19 09:30:25,074][26599] Updated weights for policy 0, policy_version 325274 (0.0035) [2024-06-19 09:30:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42598.8). Total num frames: 5329420288. 
Throughput: 0: 42711.4. Samples: 1597079380. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:28,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 09:30:28,543][26599] Updated weights for policy 0, policy_version 325284 (0.0041) [2024-06-19 09:30:32,555][26599] Updated weights for policy 0, policy_version 325294 (0.0029) [2024-06-19 09:30:33,380][26367] Fps is (10 sec: 40975.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5329633280. Throughput: 0: 42775.1. Samples: 1597210200. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:33,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 09:30:36,030][26599] Updated weights for policy 0, policy_version 325304 (0.0033) [2024-06-19 09:30:38,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5329862656. Throughput: 0: 42926.2. Samples: 1597469680. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:38,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 09:30:40,464][26599] Updated weights for policy 0, policy_version 325314 (0.0036) [2024-06-19 09:30:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5330075648. Throughput: 0: 42731.8. Samples: 1597719520. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:43,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 09:30:43,975][26599] Updated weights for policy 0, policy_version 325324 (0.0031) [2024-06-19 09:30:48,069][26599] Updated weights for policy 0, policy_version 325334 (0.0038) [2024-06-19 09:30:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42874.0, 300 sec: 42765.0). Total num frames: 5330272256. Throughput: 0: 42624.3. Samples: 1597846980. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:48,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 09:30:51,604][26599] Updated weights for policy 0, policy_version 325344 (0.0043) [2024-06-19 09:30:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5330485248. Throughput: 0: 42561.7. Samples: 1598093400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:53,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 09:30:56,048][26599] Updated weights for policy 0, policy_version 325354 (0.0023) [2024-06-19 09:30:58,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5330731008. Throughput: 0: 42559.0. Samples: 1598355920. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:30:58,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 09:30:59,180][26599] Updated weights for policy 0, policy_version 325364 (0.0045) [2024-06-19 09:31:03,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5330894848. Throughput: 0: 42500.5. Samples: 1598484420. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:31:03,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 09:31:04,103][26599] Updated weights for policy 0, policy_version 325374 (0.0027) [2024-06-19 09:31:07,082][26599] Updated weights for policy 0, policy_version 325384 (0.0033) [2024-06-19 09:31:08,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5331140608. Throughput: 0: 42614.6. Samples: 1598739400. Policy #0 lag: (min: 0.0, avg: 11.5, max: 20.0) [2024-06-19 09:31:08,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 09:31:11,775][26599] Updated weights for policy 0, policy_version 325394 (0.0029) [2024-06-19 09:31:13,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.3, 300 sec: 42654.5). Total num frames: 5331353600. Throughput: 0: 42590.3. Samples: 1598995940. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:13,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 09:31:14,806][26599] Updated weights for policy 0, policy_version 325404 (0.0029) [2024-06-19 09:31:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5331550208. Throughput: 0: 42424.0. Samples: 1599119280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:18,381][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 09:31:19,416][26599] Updated weights for policy 0, policy_version 325414 (0.0036) [2024-06-19 09:31:22,511][26599] Updated weights for policy 0, policy_version 325424 (0.0029) [2024-06-19 09:31:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42328.0, 300 sec: 42709.5). Total num frames: 5331763200. Throughput: 0: 42373.8. Samples: 1599376500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:23,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 09:31:23,693][26579] Signal inference workers to stop experience collection... (23600 times) [2024-06-19 09:31:23,694][26579] Signal inference workers to resume experience collection... (23600 times) [2024-06-19 09:31:23,718][26599] InferenceWorker_p0-w0: stopping experience collection (23600 times) [2024-06-19 09:31:23,718][26599] InferenceWorker_p0-w0: resuming experience collection (23600 times) [2024-06-19 09:31:26,996][26599] Updated weights for policy 0, policy_version 325434 (0.0036) [2024-06-19 09:31:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5331992576. Throughput: 0: 42550.7. Samples: 1599634300. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:28,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 09:31:30,045][26599] Updated weights for policy 0, policy_version 325444 (0.0033) [2024-06-19 09:31:33,386][26367] Fps is (10 sec: 40936.9, 60 sec: 42321.3, 300 sec: 42708.7). Total num frames: 5332172800. 
Throughput: 0: 42599.7. Samples: 1599764200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:33,386][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 09:31:34,543][26599] Updated weights for policy 0, policy_version 325454 (0.0030) [2024-06-19 09:31:37,759][26599] Updated weights for policy 0, policy_version 325464 (0.0039) [2024-06-19 09:31:38,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5332402176. Throughput: 0: 42813.8. Samples: 1600020020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:38,380][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 09:31:38,471][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000325465_5332418560.pth... [2024-06-19 09:31:38,524][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000324839_5322162176.pth [2024-06-19 09:31:42,463][26599] Updated weights for policy 0, policy_version 325474 (0.0028) [2024-06-19 09:31:43,380][26367] Fps is (10 sec: 44262.0, 60 sec: 42325.4, 300 sec: 42654.5). Total num frames: 5332615168. Throughput: 0: 42567.2. Samples: 1600271440. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:43,381][26367] Avg episode reward: [(0, '0.403')] [2024-06-19 09:31:45,378][26599] Updated weights for policy 0, policy_version 325484 (0.0036) [2024-06-19 09:31:48,381][26367] Fps is (10 sec: 42596.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5332828160. Throughput: 0: 42596.6. Samples: 1600401280. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:48,381][26367] Avg episode reward: [(0, '0.348')] [2024-06-19 09:31:49,909][26599] Updated weights for policy 0, policy_version 325494 (0.0050) [2024-06-19 09:31:53,134][26599] Updated weights for policy 0, policy_version 325504 (0.0023) [2024-06-19 09:31:53,380][26367] Fps is (10 sec: 44235.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 5333057536. Throughput: 0: 42727.4. 
Samples: 1600662140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:53,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 09:31:57,353][26599] Updated weights for policy 0, policy_version 325514 (0.0034) [2024-06-19 09:31:58,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5333254144. Throughput: 0: 42765.7. Samples: 1600920400. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:31:58,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 09:32:00,720][26599] Updated weights for policy 0, policy_version 325524 (0.0033) [2024-06-19 09:32:03,380][26367] Fps is (10 sec: 42599.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5333483520. Throughput: 0: 42717.8. Samples: 1601041580. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:32:03,380][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 09:32:04,890][26599] Updated weights for policy 0, policy_version 325534 (0.0030) [2024-06-19 09:32:08,359][26599] Updated weights for policy 0, policy_version 325544 (0.0036) [2024-06-19 09:32:08,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.9). Total num frames: 5333712896. Throughput: 0: 42900.4. Samples: 1601307020. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:32:08,381][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 09:32:12,498][26599] Updated weights for policy 0, policy_version 325554 (0.0033) [2024-06-19 09:32:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5333909504. Throughput: 0: 42911.6. Samples: 1601565320. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:32:13,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 09:32:15,971][26599] Updated weights for policy 0, policy_version 325564 (0.0042) [2024-06-19 09:32:18,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5334122496. Throughput: 0: 42853.7. 
Samples: 1601692380. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:32:18,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 09:32:20,340][26599] Updated weights for policy 0, policy_version 325574 (0.0025) [2024-06-19 09:32:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5334351872. Throughput: 0: 43018.2. Samples: 1601955840. Policy #0 lag: (min: 0.0, avg: 12.1, max: 22.0) [2024-06-19 09:32:23,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 09:32:23,482][26599] Updated weights for policy 0, policy_version 325584 (0.0024) [2024-06-19 09:32:27,915][26599] Updated weights for policy 0, policy_version 325594 (0.0044) [2024-06-19 09:32:28,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5334548480. Throughput: 0: 43080.4. Samples: 1602210060. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:32:28,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 09:32:31,216][26599] Updated weights for policy 0, policy_version 325604 (0.0033) [2024-06-19 09:32:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 43421.7, 300 sec: 42820.6). Total num frames: 5334777856. Throughput: 0: 42936.8. Samples: 1602333420. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:32:33,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 09:32:35,704][26599] Updated weights for policy 0, policy_version 325614 (0.0033) [2024-06-19 09:32:36,509][26579] Signal inference workers to stop experience collection... (23650 times) [2024-06-19 09:32:36,512][26579] Signal inference workers to resume experience collection... (23650 times) [2024-06-19 09:32:36,540][26599] InferenceWorker_p0-w0: stopping experience collection (23650 times) [2024-06-19 09:32:36,540][26599] InferenceWorker_p0-w0: resuming experience collection (23650 times) [2024-06-19 09:32:38,380][26367] Fps is (10 sec: 45874.9, 60 sec: 43417.6, 300 sec: 42820.6). Total num frames: 5335007232. 
Throughput: 0: 42905.0. Samples: 1602592860. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:32:38,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 09:32:38,979][26599] Updated weights for policy 0, policy_version 325624 (0.0036) [2024-06-19 09:32:43,250][26599] Updated weights for policy 0, policy_version 325634 (0.0026) [2024-06-19 09:32:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5335187456. Throughput: 0: 42907.2. Samples: 1602851220. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:32:43,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 09:32:46,709][26599] Updated weights for policy 0, policy_version 325644 (0.0039) [2024-06-19 09:32:48,380][26367] Fps is (10 sec: 40959.4, 60 sec: 43144.7, 300 sec: 42820.5). Total num frames: 5335416832. Throughput: 0: 42773.6. Samples: 1602966400. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:32:48,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 09:32:50,890][26599] Updated weights for policy 0, policy_version 325654 (0.0033) [2024-06-19 09:32:53,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 5335613440. Throughput: 0: 42723.2. Samples: 1603229560. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:32:53,380][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 09:32:54,613][26599] Updated weights for policy 0, policy_version 325664 (0.0023) [2024-06-19 09:32:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5335826432. Throughput: 0: 42579.9. Samples: 1603481420. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:32:58,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 09:32:58,477][26599] Updated weights for policy 0, policy_version 325674 (0.0040) [2024-06-19 09:33:02,722][26599] Updated weights for policy 0, policy_version 325684 (0.0028) [2024-06-19 09:33:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5336039424. Throughput: 0: 42535.2. Samples: 1603606460. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:33:03,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 09:33:05,959][26599] Updated weights for policy 0, policy_version 325694 (0.0033) [2024-06-19 09:33:08,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5336268800. Throughput: 0: 42554.2. Samples: 1603870780. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:33:08,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 09:33:10,423][26599] Updated weights for policy 0, policy_version 325704 (0.0042) [2024-06-19 09:33:13,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5336481792. Throughput: 0: 42403.9. Samples: 1604118240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:33:13,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 09:33:13,474][26599] Updated weights for policy 0, policy_version 325714 (0.0035) [2024-06-19 09:33:18,087][26599] Updated weights for policy 0, policy_version 325724 (0.0022) [2024-06-19 09:33:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42821.1). Total num frames: 5336678400. Throughput: 0: 42506.2. Samples: 1604246200. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:33:18,380][26367] Avg episode reward: [(0, '0.176')] [2024-06-19 09:33:21,251][26599] Updated weights for policy 0, policy_version 325734 (0.0036) [2024-06-19 09:33:23,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5336907776. Throughput: 0: 42431.4. Samples: 1604502280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:33:23,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 09:33:25,661][26599] Updated weights for policy 0, policy_version 325744 (0.0035) [2024-06-19 09:33:28,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 5337104384. Throughput: 0: 42508.8. Samples: 1604764120. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:33:28,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 09:33:28,805][26599] Updated weights for policy 0, policy_version 325754 (0.0042) [2024-06-19 09:33:33,157][26599] Updated weights for policy 0, policy_version 325764 (0.0036) [2024-06-19 09:33:33,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5337317376. Throughput: 0: 42653.4. Samples: 1604885800. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:33:33,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 09:33:36,727][26599] Updated weights for policy 0, policy_version 325774 (0.0031) [2024-06-19 09:33:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5337546752. Throughput: 0: 42548.2. Samples: 1605144240. Policy #0 lag: (min: 0.0, avg: 11.2, max: 20.0) [2024-06-19 09:33:38,381][26367] Avg episode reward: [(0, '0.859')] [2024-06-19 09:33:38,520][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000325779_5337563136.pth... 
[2024-06-19 09:33:38,570][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000325152_5327290368.pth [2024-06-19 09:33:40,814][26599] Updated weights for policy 0, policy_version 325784 (0.0032) [2024-06-19 09:33:43,380][26367] Fps is (10 sec: 45875.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5337776128. Throughput: 0: 42761.5. Samples: 1605405680. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:33:43,380][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 09:33:44,090][26599] Updated weights for policy 0, policy_version 325794 (0.0043) [2024-06-19 09:33:48,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 5337939968. Throughput: 0: 42801.3. Samples: 1605532520. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:33:48,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 09:33:48,524][26599] Updated weights for policy 0, policy_version 325804 (0.0033) [2024-06-19 09:33:49,246][26579] Signal inference workers to stop experience collection... (23700 times) [2024-06-19 09:33:49,246][26579] Signal inference workers to resume experience collection... (23700 times) [2024-06-19 09:33:49,270][26599] InferenceWorker_p0-w0: stopping experience collection (23700 times) [2024-06-19 09:33:49,271][26599] InferenceWorker_p0-w0: resuming experience collection (23700 times) [2024-06-19 09:33:51,666][26599] Updated weights for policy 0, policy_version 325814 (0.0034) [2024-06-19 09:33:53,380][26367] Fps is (10 sec: 42597.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5338202112. Throughput: 0: 42686.1. Samples: 1605791660. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:33:53,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 09:33:56,574][26599] Updated weights for policy 0, policy_version 325824 (0.0037) [2024-06-19 09:33:58,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5338398720. 
Throughput: 0: 42927.0. Samples: 1606049960. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:33:58,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 09:33:59,132][26599] Updated weights for policy 0, policy_version 325834 (0.0036) [2024-06-19 09:34:03,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5338595328. Throughput: 0: 42969.2. Samples: 1606179820. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:03,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 09:34:04,026][26599] Updated weights for policy 0, policy_version 325844 (0.0037) [2024-06-19 09:34:06,931][26599] Updated weights for policy 0, policy_version 325854 (0.0026) [2024-06-19 09:34:08,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5338841088. Throughput: 0: 42871.3. Samples: 1606431480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:08,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 09:34:11,757][26599] Updated weights for policy 0, policy_version 325864 (0.0036) [2024-06-19 09:34:13,384][26367] Fps is (10 sec: 45859.0, 60 sec: 42868.9, 300 sec: 42820.0). Total num frames: 5339054080. Throughput: 0: 42907.3. Samples: 1606695100. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:13,385][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 09:34:14,390][26599] Updated weights for policy 0, policy_version 325874 (0.0039) [2024-06-19 09:34:18,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5339217920. Throughput: 0: 42867.6. Samples: 1606814840. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:18,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 09:34:19,316][26599] Updated weights for policy 0, policy_version 325884 (0.0040) [2024-06-19 09:34:22,286][26599] Updated weights for policy 0, policy_version 325894 (0.0041) [2024-06-19 09:34:23,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5339480064. Throughput: 0: 42908.6. Samples: 1607075120. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:23,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 09:34:27,187][26599] Updated weights for policy 0, policy_version 325904 (0.0042) [2024-06-19 09:34:28,380][26367] Fps is (10 sec: 47512.7, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5339693056. Throughput: 0: 42859.8. Samples: 1607334380. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:28,381][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 09:34:29,994][26599] Updated weights for policy 0, policy_version 325914 (0.0036) [2024-06-19 09:34:33,382][26367] Fps is (10 sec: 40954.8, 60 sec: 42870.6, 300 sec: 42709.3). Total num frames: 5339889664. Throughput: 0: 42802.8. Samples: 1607458700. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:33,382][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 09:34:34,895][26599] Updated weights for policy 0, policy_version 325924 (0.0038) [2024-06-19 09:34:37,765][26599] Updated weights for policy 0, policy_version 325934 (0.0034) [2024-06-19 09:34:38,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5340119040. Throughput: 0: 42751.6. Samples: 1607715480. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:38,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 09:34:42,571][26599] Updated weights for policy 0, policy_version 325944 (0.0032) [2024-06-19 09:34:43,380][26367] Fps is (10 sec: 40965.5, 60 sec: 42052.3, 300 sec: 42710.0). Total num frames: 5340299264. Throughput: 0: 42745.5. Samples: 1607973500. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:43,380][26367] Avg episode reward: [(0, '0.806')] [2024-06-19 09:34:45,449][26599] Updated weights for policy 0, policy_version 325954 (0.0036) [2024-06-19 09:34:48,380][26367] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5340528640. Throughput: 0: 42598.0. Samples: 1608096720. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:48,380][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 09:34:50,119][26599] Updated weights for policy 0, policy_version 325964 (0.0025) [2024-06-19 09:34:53,110][26599] Updated weights for policy 0, policy_version 325974 (0.0030) [2024-06-19 09:34:53,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5340758016. Throughput: 0: 42742.2. Samples: 1608354880. Policy #0 lag: (min: 0.0, avg: 9.0, max: 20.0) [2024-06-19 09:34:53,381][26367] Avg episode reward: [(0, '0.313')] [2024-06-19 09:34:57,857][26599] Updated weights for policy 0, policy_version 325984 (0.0025) [2024-06-19 09:34:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5340938240. Throughput: 0: 42783.1. Samples: 1608620180. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:34:58,380][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 09:35:00,791][26599] Updated weights for policy 0, policy_version 325994 (0.0030) [2024-06-19 09:35:03,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5341167616. Throughput: 0: 42807.6. Samples: 1608741180. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:03,381][26367] Avg episode reward: [(0, '0.393')] [2024-06-19 09:35:03,796][26579] Signal inference workers to stop experience collection... (23750 times) [2024-06-19 09:35:03,848][26599] InferenceWorker_p0-w0: stopping experience collection (23750 times) [2024-06-19 09:35:03,855][26579] Signal inference workers to resume experience collection... (23750 times) [2024-06-19 09:35:03,863][26599] InferenceWorker_p0-w0: resuming experience collection (23750 times) [2024-06-19 09:35:05,374][26599] Updated weights for policy 0, policy_version 326004 (0.0028) [2024-06-19 09:35:08,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5341396992. Throughput: 0: 42897.4. Samples: 1609005500. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:08,380][26367] Avg episode reward: [(0, '0.257')] [2024-06-19 09:35:08,535][26599] Updated weights for policy 0, policy_version 326014 (0.0026) [2024-06-19 09:35:12,890][26599] Updated weights for policy 0, policy_version 326024 (0.0045) [2024-06-19 09:35:13,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42601.0, 300 sec: 42820.6). Total num frames: 5341609984. Throughput: 0: 42876.6. Samples: 1609263820. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:13,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 09:35:16,035][26599] Updated weights for policy 0, policy_version 326034 (0.0027) [2024-06-19 09:35:18,380][26367] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 42710.0). Total num frames: 5341822976. Throughput: 0: 42889.5. Samples: 1609388680. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:18,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 09:35:20,363][26599] Updated weights for policy 0, policy_version 326044 (0.0037) [2024-06-19 09:35:23,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5342052352. Throughput: 0: 42972.6. 
Samples: 1609649240. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:23,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 09:35:23,514][26599] Updated weights for policy 0, policy_version 326054 (0.0034) [2024-06-19 09:35:27,935][26599] Updated weights for policy 0, policy_version 326064 (0.0031) [2024-06-19 09:35:28,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 5342232576. Throughput: 0: 42887.1. Samples: 1609903420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:28,380][26367] Avg episode reward: [(0, '0.866')] [2024-06-19 09:35:31,244][26599] Updated weights for policy 0, policy_version 326074 (0.0043) [2024-06-19 09:35:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43145.4, 300 sec: 42765.0). Total num frames: 5342478336. Throughput: 0: 42969.2. Samples: 1610030340. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:33,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 09:35:35,378][26599] Updated weights for policy 0, policy_version 326084 (0.0045) [2024-06-19 09:35:38,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5342658560. Throughput: 0: 43029.3. Samples: 1610291200. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:38,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 09:35:38,478][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000326091_5342674944.pth... [2024-06-19 09:35:38,535][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000325465_5332418560.pth [2024-06-19 09:35:39,022][26599] Updated weights for policy 0, policy_version 326094 (0.0034) [2024-06-19 09:35:43,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5342871552. Throughput: 0: 42771.1. Samples: 1610544880. 
Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:43,380][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 09:35:43,467][26599] Updated weights for policy 0, policy_version 326104 (0.0042) [2024-06-19 09:35:46,567][26599] Updated weights for policy 0, policy_version 326114 (0.0039) [2024-06-19 09:35:48,380][26367] Fps is (10 sec: 47513.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5343133696. Throughput: 0: 43005.2. Samples: 1610676420. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:48,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 09:35:51,244][26599] Updated weights for policy 0, policy_version 326124 (0.0035) [2024-06-19 09:35:53,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5343297536. Throughput: 0: 42858.9. Samples: 1610934160. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:53,381][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 09:35:54,218][26599] Updated weights for policy 0, policy_version 326134 (0.0032) [2024-06-19 09:35:58,380][26367] Fps is (10 sec: 37683.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5343510528. Throughput: 0: 42683.7. Samples: 1611184580. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:35:58,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 09:35:58,805][26599] Updated weights for policy 0, policy_version 326144 (0.0032) [2024-06-19 09:36:01,486][26579] Signal inference workers to stop experience collection... (23800 times) [2024-06-19 09:36:01,544][26599] InferenceWorker_p0-w0: stopping experience collection (23800 times) [2024-06-19 09:36:01,546][26579] Signal inference workers to resume experience collection... 
(23800 times) [2024-06-19 09:36:01,556][26599] InferenceWorker_p0-w0: resuming experience collection (23800 times) [2024-06-19 09:36:01,846][26599] Updated weights for policy 0, policy_version 326154 (0.0043) [2024-06-19 09:36:03,380][26367] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5343756288. Throughput: 0: 42667.7. Samples: 1611308720. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:36:03,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 09:36:06,451][26599] Updated weights for policy 0, policy_version 326164 (0.0029) [2024-06-19 09:36:08,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5343952896. Throughput: 0: 42569.6. Samples: 1611564880. Policy #0 lag: (min: 0.0, avg: 9.9, max: 20.0) [2024-06-19 09:36:08,386][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 09:36:09,729][26599] Updated weights for policy 0, policy_version 326174 (0.0031) [2024-06-19 09:36:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5344149504. Throughput: 0: 42530.1. Samples: 1611817280. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:13,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 09:36:14,386][26599] Updated weights for policy 0, policy_version 326184 (0.0044) [2024-06-19 09:36:17,294][26599] Updated weights for policy 0, policy_version 326194 (0.0029) [2024-06-19 09:36:18,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5344395264. Throughput: 0: 42504.4. Samples: 1611943040. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:18,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 09:36:21,907][26599] Updated weights for policy 0, policy_version 326204 (0.0032) [2024-06-19 09:36:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5344575488. Throughput: 0: 42405.7. Samples: 1612199460. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:23,381][26367] Avg episode reward: [(0, '0.361')] [2024-06-19 09:36:25,273][26599] Updated weights for policy 0, policy_version 326214 (0.0033) [2024-06-19 09:36:28,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42765.8). Total num frames: 5344788480. Throughput: 0: 42410.2. Samples: 1612453340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:28,381][26367] Avg episode reward: [(0, '0.196')] [2024-06-19 09:36:29,666][26599] Updated weights for policy 0, policy_version 326224 (0.0025) [2024-06-19 09:36:32,892][26599] Updated weights for policy 0, policy_version 326234 (0.0031) [2024-06-19 09:36:33,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5345017856. Throughput: 0: 42335.1. Samples: 1612581500. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:33,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 09:36:37,102][26599] Updated weights for policy 0, policy_version 326244 (0.0047) [2024-06-19 09:36:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5345214464. Throughput: 0: 42390.4. Samples: 1612841720. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:38,380][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 09:36:40,645][26599] Updated weights for policy 0, policy_version 326254 (0.0035) [2024-06-19 09:36:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5345427456. Throughput: 0: 42575.1. Samples: 1613100460. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:43,380][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 09:36:44,971][26599] Updated weights for policy 0, policy_version 326264 (0.0032) [2024-06-19 09:36:48,299][26599] Updated weights for policy 0, policy_version 326274 (0.0040) [2024-06-19 09:36:48,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5345673216. Throughput: 0: 42579.6. Samples: 1613224800. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:48,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 09:36:52,550][26599] Updated weights for policy 0, policy_version 326284 (0.0034) [2024-06-19 09:36:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5345853440. Throughput: 0: 42735.2. Samples: 1613487960. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:53,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 09:36:55,822][26599] Updated weights for policy 0, policy_version 326294 (0.0038) [2024-06-19 09:36:58,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5346066432. Throughput: 0: 42756.8. Samples: 1613741340. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:36:58,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 09:37:00,091][26599] Updated weights for policy 0, policy_version 326304 (0.0030) [2024-06-19 09:37:03,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5346312192. Throughput: 0: 42753.4. Samples: 1613866940. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:37:03,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 09:37:03,511][26599] Updated weights for policy 0, policy_version 326314 (0.0032) [2024-06-19 09:37:08,239][26599] Updated weights for policy 0, policy_version 326324 (0.0037) [2024-06-19 09:37:08,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5346492416. Throughput: 0: 42702.6. Samples: 1614121080. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:37:08,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 09:37:11,260][26599] Updated weights for policy 0, policy_version 326334 (0.0039) [2024-06-19 09:37:13,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5346705408. Throughput: 0: 42758.5. Samples: 1614377480. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:37:13,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 09:37:16,005][26599] Updated weights for policy 0, policy_version 326344 (0.0040) [2024-06-19 09:37:18,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5346934784. Throughput: 0: 42776.5. Samples: 1614506440. Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:37:18,380][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 09:37:18,898][26599] Updated weights for policy 0, policy_version 326354 (0.0032) [2024-06-19 09:37:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5347115008. Throughput: 0: 42638.5. Samples: 1614760460. 
Policy #0 lag: (min: 0.0, avg: 11.5, max: 23.0) [2024-06-19 09:37:23,389][26367] Avg episode reward: [(0, '0.801')] [2024-06-19 09:37:23,853][26599] Updated weights for policy 0, policy_version 326364 (0.0043) [2024-06-19 09:37:26,470][26599] Updated weights for policy 0, policy_version 326374 (0.0047) [2024-06-19 09:37:28,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5347344384. Throughput: 0: 42489.3. Samples: 1615012480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:37:28,381][26367] Avg episode reward: [(0, '0.858')] [2024-06-19 09:37:31,622][26599] Updated weights for policy 0, policy_version 326384 (0.0034) [2024-06-19 09:37:32,267][26579] Signal inference workers to stop experience collection... (23850 times) [2024-06-19 09:37:32,274][26579] Signal inference workers to resume experience collection... (23850 times) [2024-06-19 09:37:32,303][26599] InferenceWorker_p0-w0: stopping experience collection (23850 times) [2024-06-19 09:37:32,303][26599] InferenceWorker_p0-w0: resuming experience collection (23850 times) [2024-06-19 09:37:33,380][26367] Fps is (10 sec: 47514.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5347590144. Throughput: 0: 42749.3. Samples: 1615148520. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:37:33,381][26367] Avg episode reward: [(0, '0.799')] [2024-06-19 09:37:33,961][26599] Updated weights for policy 0, policy_version 326394 (0.0038) [2024-06-19 09:37:38,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5347770368. Throughput: 0: 42478.2. Samples: 1615399480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:37:38,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 09:37:38,403][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000326402_5347770368.pth... 
[2024-06-19 09:37:38,471][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000325779_5337563136.pth [2024-06-19 09:37:39,258][26599] Updated weights for policy 0, policy_version 326404 (0.0038) [2024-06-19 09:37:42,151][26599] Updated weights for policy 0, policy_version 326414 (0.0037) [2024-06-19 09:37:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5347999744. Throughput: 0: 42427.6. Samples: 1615650580. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:37:43,384][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 09:37:46,891][26599] Updated weights for policy 0, policy_version 326424 (0.0030) [2024-06-19 09:37:48,384][26367] Fps is (10 sec: 45858.9, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 5348229120. Throughput: 0: 42506.8. Samples: 1615779900. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:37:48,384][26367] Avg episode reward: [(0, '0.755')] [2024-06-19 09:37:49,648][26599] Updated weights for policy 0, policy_version 326434 (0.0037) [2024-06-19 09:37:53,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5348409344. Throughput: 0: 42572.0. Samples: 1616036820. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:37:53,381][26367] Avg episode reward: [(0, '0.869')] [2024-06-19 09:37:54,566][26599] Updated weights for policy 0, policy_version 326444 (0.0032) [2024-06-19 09:37:57,239][26599] Updated weights for policy 0, policy_version 326454 (0.0034) [2024-06-19 09:37:58,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5348638720. Throughput: 0: 42382.3. Samples: 1616284680. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:37:58,381][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 09:38:02,275][26599] Updated weights for policy 0, policy_version 326464 (0.0028) [2024-06-19 09:38:03,381][26367] Fps is (10 sec: 45871.9, 60 sec: 42597.8, 300 sec: 42709.4). Total num frames: 5348868096. Throughput: 0: 42529.4. Samples: 1616420300. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:38:03,382][26367] Avg episode reward: [(0, '0.238')] [2024-06-19 09:38:04,841][26599] Updated weights for policy 0, policy_version 326474 (0.0031) [2024-06-19 09:38:08,382][26367] Fps is (10 sec: 40952.4, 60 sec: 42597.2, 300 sec: 42598.1). Total num frames: 5349048320. Throughput: 0: 42487.6. Samples: 1616672480. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:38:08,383][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 09:38:09,784][26599] Updated weights for policy 0, policy_version 326484 (0.0033) [2024-06-19 09:38:12,314][26599] Updated weights for policy 0, policy_version 326494 (0.0027) [2024-06-19 09:38:13,383][26367] Fps is (10 sec: 42590.6, 60 sec: 43142.7, 300 sec: 42764.6). Total num frames: 5349294080. Throughput: 0: 42492.5. Samples: 1616924760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:38:13,383][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 09:38:17,547][26599] Updated weights for policy 0, policy_version 326504 (0.0036) [2024-06-19 09:38:18,380][26367] Fps is (10 sec: 45883.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5349507072. Throughput: 0: 42567.1. Samples: 1617064040. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:38:18,381][26367] Avg episode reward: [(0, '0.420')] [2024-06-19 09:38:20,171][26599] Updated weights for policy 0, policy_version 326514 (0.0033) [2024-06-19 09:38:23,380][26367] Fps is (10 sec: 39332.2, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5349687296. Throughput: 0: 42675.2. Samples: 1617319860. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:38:23,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 09:38:25,058][26599] Updated weights for policy 0, policy_version 326524 (0.0035) [2024-06-19 09:38:27,768][26599] Updated weights for policy 0, policy_version 326534 (0.0030) [2024-06-19 09:38:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5349933056. Throughput: 0: 42705.3. Samples: 1617572320. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:38:28,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 09:38:29,424][26579] Signal inference workers to stop experience collection... (23900 times) [2024-06-19 09:38:29,449][26599] InferenceWorker_p0-w0: stopping experience collection (23900 times) [2024-06-19 09:38:29,540][26579] Signal inference workers to resume experience collection... (23900 times) [2024-06-19 09:38:29,541][26599] InferenceWorker_p0-w0: resuming experience collection (23900 times) [2024-06-19 09:38:32,596][26599] Updated weights for policy 0, policy_version 326544 (0.0043) [2024-06-19 09:38:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5350129664. Throughput: 0: 42844.8. Samples: 1617707760. Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:38:33,380][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 09:38:35,413][26599] Updated weights for policy 0, policy_version 326554 (0.0030) [2024-06-19 09:38:38,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5350326272. Throughput: 0: 42849.1. Samples: 1617965020. 
Policy #0 lag: (min: 0.0, avg: 12.5, max: 22.0) [2024-06-19 09:38:38,380][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 09:38:40,113][26599] Updated weights for policy 0, policy_version 326564 (0.0041) [2024-06-19 09:38:42,977][26599] Updated weights for policy 0, policy_version 326574 (0.0045) [2024-06-19 09:38:43,380][26367] Fps is (10 sec: 45874.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5350588416. Throughput: 0: 42775.9. Samples: 1618209600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:38:43,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 09:38:47,874][26599] Updated weights for policy 0, policy_version 326584 (0.0035) [2024-06-19 09:38:48,381][26367] Fps is (10 sec: 44235.0, 60 sec: 42327.7, 300 sec: 42598.4). Total num frames: 5350768640. Throughput: 0: 42818.7. Samples: 1618347120. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:38:48,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 09:38:50,654][26599] Updated weights for policy 0, policy_version 326594 (0.0034) [2024-06-19 09:38:53,380][26367] Fps is (10 sec: 37683.9, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5350965248. Throughput: 0: 42843.7. Samples: 1618600360. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:38:53,380][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 09:38:55,660][26599] Updated weights for policy 0, policy_version 326604 (0.0037) [2024-06-19 09:38:58,380][26367] Fps is (10 sec: 45876.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5351227392. Throughput: 0: 42759.0. Samples: 1618848800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:38:58,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 09:38:58,416][26599] Updated weights for policy 0, policy_version 326614 (0.0044) [2024-06-19 09:39:03,384][26367] Fps is (10 sec: 42582.6, 60 sec: 42050.3, 300 sec: 42542.3). Total num frames: 5351391232. Throughput: 0: 42678.8. Samples: 1618984740. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:03,384][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 09:39:03,396][26599] Updated weights for policy 0, policy_version 326624 (0.0035) [2024-06-19 09:39:06,512][26599] Updated weights for policy 0, policy_version 326634 (0.0041) [2024-06-19 09:39:08,384][26367] Fps is (10 sec: 39307.4, 60 sec: 42870.3, 300 sec: 42598.4). Total num frames: 5351620608. Throughput: 0: 42617.9. Samples: 1619237820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:08,384][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 09:39:10,932][26599] Updated weights for policy 0, policy_version 326644 (0.0046) [2024-06-19 09:39:13,380][26367] Fps is (10 sec: 45892.0, 60 sec: 42600.3, 300 sec: 42820.6). Total num frames: 5351849984. Throughput: 0: 42737.0. Samples: 1619495480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:13,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 09:39:14,214][26599] Updated weights for policy 0, policy_version 326654 (0.0037) [2024-06-19 09:39:18,380][26367] Fps is (10 sec: 40974.3, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 5352030208. Throughput: 0: 42718.1. Samples: 1619630080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:18,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 09:39:18,636][26599] Updated weights for policy 0, policy_version 326664 (0.0032) [2024-06-19 09:39:21,859][26599] Updated weights for policy 0, policy_version 326674 (0.0035) [2024-06-19 09:39:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 5352275968. Throughput: 0: 42576.8. Samples: 1619880980. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:23,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 09:39:26,423][26599] Updated weights for policy 0, policy_version 326684 (0.0044) [2024-06-19 09:39:28,383][26367] Fps is (10 sec: 45864.1, 60 sec: 42596.6, 300 sec: 42709.3). Total num frames: 5352488960. Throughput: 0: 42848.8. Samples: 1620137900. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:28,383][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 09:39:29,404][26599] Updated weights for policy 0, policy_version 326694 (0.0043) [2024-06-19 09:39:33,383][26367] Fps is (10 sec: 40949.0, 60 sec: 42596.5, 300 sec: 42598.0). Total num frames: 5352685568. Throughput: 0: 42752.9. Samples: 1620271100. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:33,384][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 09:39:33,897][26599] Updated weights for policy 0, policy_version 326704 (0.0032) [2024-06-19 09:39:36,841][26599] Updated weights for policy 0, policy_version 326714 (0.0029) [2024-06-19 09:39:38,380][26367] Fps is (10 sec: 42609.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5352914944. Throughput: 0: 42575.6. Samples: 1620516260. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:38,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 09:39:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000326716_5352914944.pth... [2024-06-19 09:39:38,450][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000326091_5342674944.pth [2024-06-19 09:39:41,595][26599] Updated weights for policy 0, policy_version 326724 (0.0038) [2024-06-19 09:39:43,380][26367] Fps is (10 sec: 44248.1, 60 sec: 42325.3, 300 sec: 42709.4). Total num frames: 5353127936. Throughput: 0: 42852.7. Samples: 1620777180. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:43,381][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 09:39:44,594][26599] Updated weights for policy 0, policy_version 326734 (0.0034) [2024-06-19 09:39:48,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 5353324544. Throughput: 0: 42780.7. Samples: 1620909720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:48,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 09:39:49,060][26579] Signal inference workers to stop experience collection... (23950 times) [2024-06-19 09:39:49,060][26579] Signal inference workers to resume experience collection... (23950 times) [2024-06-19 09:39:49,091][26599] InferenceWorker_p0-w0: stopping experience collection (23950 times) [2024-06-19 09:39:49,091][26599] InferenceWorker_p0-w0: resuming experience collection (23950 times) [2024-06-19 09:39:49,206][26599] Updated weights for policy 0, policy_version 326744 (0.0032) [2024-06-19 09:39:52,522][26599] Updated weights for policy 0, policy_version 326754 (0.0028) [2024-06-19 09:39:53,380][26367] Fps is (10 sec: 42598.9, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5353553920. Throughput: 0: 42764.3. Samples: 1621162060. Policy #0 lag: (min: 0.0, avg: 11.6, max: 21.0) [2024-06-19 09:39:53,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 09:39:56,849][26599] Updated weights for policy 0, policy_version 326764 (0.0027) [2024-06-19 09:39:58,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5353766912. Throughput: 0: 42634.2. Samples: 1621414020. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:39:58,381][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 09:40:00,329][26599] Updated weights for policy 0, policy_version 326774 (0.0033) [2024-06-19 09:40:03,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43147.1, 300 sec: 42653.9). Total num frames: 5353979904. 
Throughput: 0: 42580.1. Samples: 1621546180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:03,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 09:40:04,410][26599] Updated weights for policy 0, policy_version 326784 (0.0031) [2024-06-19 09:40:07,935][26599] Updated weights for policy 0, policy_version 326794 (0.0032) [2024-06-19 09:40:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42874.1, 300 sec: 42654.0). Total num frames: 5354192896. Throughput: 0: 42666.7. Samples: 1621800980. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:08,380][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 09:40:12,073][26599] Updated weights for policy 0, policy_version 326804 (0.0034) [2024-06-19 09:40:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5354405888. Throughput: 0: 42624.2. Samples: 1622055880. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:13,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 09:40:15,469][26599] Updated weights for policy 0, policy_version 326814 (0.0042) [2024-06-19 09:40:18,380][26367] Fps is (10 sec: 42597.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5354618880. Throughput: 0: 42515.3. Samples: 1622184180. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:18,381][26367] Avg episode reward: [(0, '0.481')] [2024-06-19 09:40:19,804][26599] Updated weights for policy 0, policy_version 326824 (0.0029) [2024-06-19 09:40:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5354831872. Throughput: 0: 42773.2. Samples: 1622441060. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:23,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 09:40:23,497][26599] Updated weights for policy 0, policy_version 326834 (0.0050) [2024-06-19 09:40:27,429][26599] Updated weights for policy 0, policy_version 326844 (0.0042) [2024-06-19 09:40:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42600.2, 300 sec: 42598.4). Total num frames: 5355044864. Throughput: 0: 42747.2. Samples: 1622700800. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:28,381][26367] Avg episode reward: [(0, '0.750')] [2024-06-19 09:40:31,047][26599] Updated weights for policy 0, policy_version 326854 (0.0029) [2024-06-19 09:40:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42873.4, 300 sec: 42709.5). Total num frames: 5355257856. Throughput: 0: 42603.2. Samples: 1622826860. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:33,384][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 09:40:35,003][26599] Updated weights for policy 0, policy_version 326864 (0.0040) [2024-06-19 09:40:38,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5355487232. Throughput: 0: 42714.3. Samples: 1623084200. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:38,380][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 09:40:38,724][26599] Updated weights for policy 0, policy_version 326874 (0.0041) [2024-06-19 09:40:42,715][26599] Updated weights for policy 0, policy_version 326884 (0.0038) [2024-06-19 09:40:43,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42542.8). Total num frames: 5355683840. Throughput: 0: 42831.4. Samples: 1623341440. 
Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:43,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 09:40:46,559][26599] Updated weights for policy 0, policy_version 326894 (0.0029) [2024-06-19 09:40:48,380][26367] Fps is (10 sec: 40958.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5355896832. Throughput: 0: 42688.3. Samples: 1623467160. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:48,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 09:40:50,881][26599] Updated weights for policy 0, policy_version 326904 (0.0036) [2024-06-19 09:40:53,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5356109824. Throughput: 0: 42629.3. Samples: 1623719300. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:53,380][26367] Avg episode reward: [(0, '0.401')] [2024-06-19 09:40:54,318][26599] Updated weights for policy 0, policy_version 326914 (0.0035) [2024-06-19 09:40:58,348][26599] Updated weights for policy 0, policy_version 326924 (0.0038) [2024-06-19 09:40:58,380][26367] Fps is (10 sec: 42599.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5356322816. Throughput: 0: 42873.9. Samples: 1623985200. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:40:58,380][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 09:41:01,888][26599] Updated weights for policy 0, policy_version 326934 (0.0035) [2024-06-19 09:41:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5356552192. Throughput: 0: 42841.5. Samples: 1624112040. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:41:03,380][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 09:41:05,875][26599] Updated weights for policy 0, policy_version 326944 (0.0033) [2024-06-19 09:41:05,887][26579] Signal inference workers to stop experience collection... 
(24000 times) [2024-06-19 09:41:05,888][26579] Signal inference workers to resume experience collection... (24000 times) [2024-06-19 09:41:05,926][26599] InferenceWorker_p0-w0: stopping experience collection (24000 times) [2024-06-19 09:41:05,926][26599] InferenceWorker_p0-w0: resuming experience collection (24000 times) [2024-06-19 09:41:08,383][26367] Fps is (10 sec: 44222.4, 60 sec: 42869.2, 300 sec: 42764.6). Total num frames: 5356765184. Throughput: 0: 42875.3. Samples: 1624370580. Policy #0 lag: (min: 1.0, avg: 11.7, max: 22.0) [2024-06-19 09:41:08,384][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 09:41:09,494][26599] Updated weights for policy 0, policy_version 326954 (0.0028) [2024-06-19 09:41:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5356961792. Throughput: 0: 42808.1. Samples: 1624627160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:13,381][26367] Avg episode reward: [(0, '0.814')] [2024-06-19 09:41:13,454][26599] Updated weights for policy 0, policy_version 326964 (0.0028) [2024-06-19 09:41:17,035][26599] Updated weights for policy 0, policy_version 326974 (0.0046) [2024-06-19 09:41:18,380][26367] Fps is (10 sec: 42611.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5357191168. Throughput: 0: 42813.3. Samples: 1624753460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:18,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 09:41:20,954][26599] Updated weights for policy 0, policy_version 326984 (0.0028) [2024-06-19 09:41:23,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5357404160. Throughput: 0: 42903.1. Samples: 1625014840. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:23,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 09:41:24,836][26599] Updated weights for policy 0, policy_version 326994 (0.0030) [2024-06-19 09:41:28,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5357617152. Throughput: 0: 42895.8. Samples: 1625271740. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:28,380][26367] Avg episode reward: [(0, '0.332')] [2024-06-19 09:41:28,506][26599] Updated weights for policy 0, policy_version 327004 (0.0025) [2024-06-19 09:41:32,589][26599] Updated weights for policy 0, policy_version 327014 (0.0035) [2024-06-19 09:41:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5357813760. Throughput: 0: 42705.1. Samples: 1625388880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:33,381][26367] Avg episode reward: [(0, '0.329')] [2024-06-19 09:41:36,667][26599] Updated weights for policy 0, policy_version 327024 (0.0034) [2024-06-19 09:41:38,380][26367] Fps is (10 sec: 42597.3, 60 sec: 42598.2, 300 sec: 42765.0). Total num frames: 5358043136. Throughput: 0: 42871.4. Samples: 1625648520. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:38,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 09:41:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000327029_5358043136.pth... [2024-06-19 09:41:38,477][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000326402_5347770368.pth [2024-06-19 09:41:40,317][26599] Updated weights for policy 0, policy_version 327034 (0.0037) [2024-06-19 09:41:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5358256128. Throughput: 0: 42744.3. Samples: 1625908700. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:43,381][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 09:41:44,142][26599] Updated weights for policy 0, policy_version 327044 (0.0030) [2024-06-19 09:41:47,896][26599] Updated weights for policy 0, policy_version 327054 (0.0037) [2024-06-19 09:41:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5358469120. Throughput: 0: 42707.4. Samples: 1626033880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:48,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 09:41:51,617][26599] Updated weights for policy 0, policy_version 327064 (0.0027) [2024-06-19 09:41:53,380][26367] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5358698496. Throughput: 0: 42778.2. Samples: 1626295460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:53,380][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 09:41:55,636][26599] Updated weights for policy 0, policy_version 327074 (0.0037) [2024-06-19 09:41:58,384][26367] Fps is (10 sec: 42583.2, 60 sec: 42868.8, 300 sec: 42653.4). Total num frames: 5358895104. Throughput: 0: 42704.5. Samples: 1626549020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:41:58,385][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 09:41:59,098][26599] Updated weights for policy 0, policy_version 327084 (0.0033) [2024-06-19 09:42:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5359108096. Throughput: 0: 42797.9. Samples: 1626679360. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:42:03,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 09:42:03,394][26599] Updated weights for policy 0, policy_version 327094 (0.0032) [2024-06-19 09:42:06,566][26599] Updated weights for policy 0, policy_version 327104 (0.0029) [2024-06-19 09:42:08,380][26367] Fps is (10 sec: 45892.2, 60 sec: 43146.9, 300 sec: 42876.1). Total num frames: 5359353856. Throughput: 0: 42790.7. Samples: 1626940420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:42:08,380][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 09:42:10,947][26599] Updated weights for policy 0, policy_version 327114 (0.0037) [2024-06-19 09:42:11,720][26579] Signal inference workers to stop experience collection... (24050 times) [2024-06-19 09:42:11,777][26599] InferenceWorker_p0-w0: stopping experience collection (24050 times) [2024-06-19 09:42:11,840][26579] Signal inference workers to resume experience collection... (24050 times) [2024-06-19 09:42:11,840][26599] InferenceWorker_p0-w0: resuming experience collection (24050 times) [2024-06-19 09:42:13,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5359534080. Throughput: 0: 42569.2. Samples: 1627187360. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:42:13,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 09:42:14,574][26599] Updated weights for policy 0, policy_version 327124 (0.0042) [2024-06-19 09:42:18,380][26367] Fps is (10 sec: 37682.7, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5359730688. Throughput: 0: 42828.8. Samples: 1627316180. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:42:18,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 09:42:18,696][26599] Updated weights for policy 0, policy_version 327134 (0.0031) [2024-06-19 09:42:22,696][26599] Updated weights for policy 0, policy_version 327144 (0.0036) [2024-06-19 09:42:23,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5359960064. Throughput: 0: 42650.4. Samples: 1627567780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:42:23,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 09:42:26,401][26599] Updated weights for policy 0, policy_version 327154 (0.0041) [2024-06-19 09:42:28,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5360189440. Throughput: 0: 42548.3. Samples: 1627823380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 09:42:28,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 09:42:30,301][26599] Updated weights for policy 0, policy_version 327164 (0.0032) [2024-06-19 09:42:33,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5360369664. Throughput: 0: 42806.7. Samples: 1627960180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:42:33,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 09:42:33,981][26599] Updated weights for policy 0, policy_version 327174 (0.0035) [2024-06-19 09:42:38,027][26599] Updated weights for policy 0, policy_version 327184 (0.0038) [2024-06-19 09:42:38,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5360599040. Throughput: 0: 42583.9. Samples: 1628211740. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:42:38,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 09:42:41,647][26599] Updated weights for policy 0, policy_version 327194 (0.0041) [2024-06-19 09:42:43,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 5360828416. Throughput: 0: 42691.4. Samples: 1628469980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:42:43,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 09:42:45,460][26599] Updated weights for policy 0, policy_version 327204 (0.0032) [2024-06-19 09:42:48,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5361025024. Throughput: 0: 42684.4. Samples: 1628600160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:42:48,380][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 09:42:49,189][26599] Updated weights for policy 0, policy_version 327214 (0.0043) [2024-06-19 09:42:53,005][26599] Updated weights for policy 0, policy_version 327224 (0.0035) [2024-06-19 09:42:53,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5361238016. Throughput: 0: 42636.0. Samples: 1628859040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:42:53,380][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 09:42:56,913][26599] Updated weights for policy 0, policy_version 327234 (0.0051) [2024-06-19 09:42:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42601.0, 300 sec: 42654.1). Total num frames: 5361451008. Throughput: 0: 42821.9. Samples: 1629114340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:42:58,380][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 09:43:00,699][26599] Updated weights for policy 0, policy_version 327244 (0.0030) [2024-06-19 09:43:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.3). Total num frames: 5361664000. Throughput: 0: 42751.7. Samples: 1629240000. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:43:03,380][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 09:43:04,456][26599] Updated weights for policy 0, policy_version 327254 (0.0036) [2024-06-19 09:43:08,318][26599] Updated weights for policy 0, policy_version 327264 (0.0037) [2024-06-19 09:43:08,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.3, 300 sec: 42709.9). Total num frames: 5361893376. Throughput: 0: 43010.2. Samples: 1629503240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:43:08,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 09:43:12,022][26599] Updated weights for policy 0, policy_version 327274 (0.0023) [2024-06-19 09:43:13,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5362106368. Throughput: 0: 42910.3. Samples: 1629754340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:43:13,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 09:43:16,418][26599] Updated weights for policy 0, policy_version 327284 (0.0050) [2024-06-19 09:43:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5362319360. Throughput: 0: 42876.9. Samples: 1629889640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:43:18,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 09:43:19,573][26599] Updated weights for policy 0, policy_version 327294 (0.0032) [2024-06-19 09:43:23,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5362515968. Throughput: 0: 42892.6. Samples: 1630142060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:43:23,384][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 09:43:23,937][26599] Updated weights for policy 0, policy_version 327304 (0.0028) [2024-06-19 09:43:27,199][26599] Updated weights for policy 0, policy_version 327314 (0.0033) [2024-06-19 09:43:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5362745344. Throughput: 0: 42691.7. Samples: 1630391100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:43:28,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 09:43:31,417][26599] Updated weights for policy 0, policy_version 327324 (0.0035) [2024-06-19 09:43:33,380][26367] Fps is (10 sec: 44252.7, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5362958336. Throughput: 0: 42829.7. Samples: 1630527500. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:43:33,381][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 09:43:34,699][26599] Updated weights for policy 0, policy_version 327334 (0.0030) [2024-06-19 09:43:38,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5363138560. Throughput: 0: 42755.1. Samples: 1630783020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:43:38,380][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 09:43:38,400][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000327341_5363154944.pth... [2024-06-19 09:43:38,466][26579] Signal inference workers to stop experience collection... (24100 times) [2024-06-19 09:43:38,466][26579] Signal inference workers to resume experience collection... 
(24100 times) [2024-06-19 09:43:38,478][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000326716_5352914944.pth [2024-06-19 09:43:38,491][26599] InferenceWorker_p0-w0: stopping experience collection (24100 times) [2024-06-19 09:43:38,491][26599] InferenceWorker_p0-w0: resuming experience collection (24100 times) [2024-06-19 09:43:38,931][26599] Updated weights for policy 0, policy_version 327344 (0.0036) [2024-06-19 09:43:42,442][26599] Updated weights for policy 0, policy_version 327354 (0.0028) [2024-06-19 09:43:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 5363384320. Throughput: 0: 42788.9. Samples: 1631039840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:43:43,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 09:43:46,860][26599] Updated weights for policy 0, policy_version 327364 (0.0037) [2024-06-19 09:43:48,380][26367] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5363613696. Throughput: 0: 42845.2. Samples: 1631168040. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:43:48,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 09:43:50,081][26599] Updated weights for policy 0, policy_version 327374 (0.0037) [2024-06-19 09:43:53,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5363793920. Throughput: 0: 42667.0. Samples: 1631423260. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:43:53,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 09:43:54,443][26599] Updated weights for policy 0, policy_version 327384 (0.0026) [2024-06-19 09:43:58,097][26599] Updated weights for policy 0, policy_version 327394 (0.0042) [2024-06-19 09:43:58,380][26367] Fps is (10 sec: 42598.8, 60 sec: 43144.5, 300 sec: 42876.6). Total num frames: 5364039680. Throughput: 0: 42812.1. Samples: 1631680880. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:43:58,380][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 09:44:02,082][26599] Updated weights for policy 0, policy_version 327404 (0.0032) [2024-06-19 09:44:03,380][26367] Fps is (10 sec: 45875.9, 60 sec: 43144.5, 300 sec: 42821.1). Total num frames: 5364252672. Throughput: 0: 42675.2. Samples: 1631810020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:03,380][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 09:44:05,739][26599] Updated weights for policy 0, policy_version 327414 (0.0035) [2024-06-19 09:44:08,387][26367] Fps is (10 sec: 39293.2, 60 sec: 42320.3, 300 sec: 42652.9). Total num frames: 5364432896. Throughput: 0: 42744.2. Samples: 1632065700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:08,388][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 09:44:09,723][26599] Updated weights for policy 0, policy_version 327424 (0.0033) [2024-06-19 09:44:13,211][26599] Updated weights for policy 0, policy_version 327434 (0.0027) [2024-06-19 09:44:13,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5364678656. Throughput: 0: 42914.0. Samples: 1632322240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:13,381][26367] Avg episode reward: [(0, '0.337')] [2024-06-19 09:44:17,363][26599] Updated weights for policy 0, policy_version 327444 (0.0025) [2024-06-19 09:44:18,380][26367] Fps is (10 sec: 47548.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5364908032. Throughput: 0: 42829.8. Samples: 1632454840. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:18,381][26367] Avg episode reward: [(0, '0.433')] [2024-06-19 09:44:20,788][26599] Updated weights for policy 0, policy_version 327454 (0.0030) [2024-06-19 09:44:23,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42600.9, 300 sec: 42654.3). Total num frames: 5365071872. Throughput: 0: 42770.5. Samples: 1632707700. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:23,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 09:44:25,088][26599] Updated weights for policy 0, policy_version 327464 (0.0034) [2024-06-19 09:44:28,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42598.4, 300 sec: 42765.4). Total num frames: 5365301248. Throughput: 0: 42718.2. Samples: 1632962160. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:28,380][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 09:44:28,731][26599] Updated weights for policy 0, policy_version 327474 (0.0042) [2024-06-19 09:44:32,708][26599] Updated weights for policy 0, policy_version 327484 (0.0036) [2024-06-19 09:44:33,380][26367] Fps is (10 sec: 47513.8, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5365547008. Throughput: 0: 42836.4. Samples: 1633095680. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:33,381][26367] Avg episode reward: [(0, '0.821')] [2024-06-19 09:44:36,621][26599] Updated weights for policy 0, policy_version 327494 (0.0025) [2024-06-19 09:44:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5365710848. Throughput: 0: 42766.4. Samples: 1633347740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:38,380][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 09:44:40,296][26599] Updated weights for policy 0, policy_version 327504 (0.0036) [2024-06-19 09:44:43,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5365956608. Throughput: 0: 42623.0. Samples: 1633598920. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:43,381][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 09:44:44,114][26599] Updated weights for policy 0, policy_version 327514 (0.0036) [2024-06-19 09:44:47,851][26599] Updated weights for policy 0, policy_version 327524 (0.0046) [2024-06-19 09:44:48,384][26367] Fps is (10 sec: 45858.2, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 5366169600. Throughput: 0: 42842.2. Samples: 1633738080. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:48,385][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 09:44:51,503][26579] Signal inference workers to stop experience collection... (24150 times) [2024-06-19 09:44:51,504][26579] Signal inference workers to resume experience collection... (24150 times) [2024-06-19 09:44:51,550][26599] InferenceWorker_p0-w0: stopping experience collection (24150 times) [2024-06-19 09:44:51,551][26599] InferenceWorker_p0-w0: resuming experience collection (24150 times) [2024-06-19 09:44:51,633][26599] Updated weights for policy 0, policy_version 327534 (0.0035) [2024-06-19 09:44:53,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5366349824. Throughput: 0: 42691.7. Samples: 1633986520. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:53,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 09:44:55,455][26599] Updated weights for policy 0, policy_version 327544 (0.0043) [2024-06-19 09:44:58,380][26367] Fps is (10 sec: 42614.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5366595584. Throughput: 0: 42771.3. Samples: 1634246940. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 09:44:58,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 09:44:59,344][26599] Updated weights for policy 0, policy_version 327554 (0.0029) [2024-06-19 09:45:02,879][26599] Updated weights for policy 0, policy_version 327564 (0.0032) [2024-06-19 09:45:03,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5366824960. Throughput: 0: 42817.7. Samples: 1634381640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:03,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 09:45:06,749][26599] Updated weights for policy 0, policy_version 327574 (0.0034) [2024-06-19 09:45:08,380][26367] Fps is (10 sec: 42597.8, 60 sec: 43149.6, 300 sec: 42765.0). Total num frames: 5367021568. Throughput: 0: 42911.1. Samples: 1634638700. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:08,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 09:45:10,490][26599] Updated weights for policy 0, policy_version 327584 (0.0034) [2024-06-19 09:45:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5367234560. Throughput: 0: 42943.0. Samples: 1634894600. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:13,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 09:45:14,607][26599] Updated weights for policy 0, policy_version 327594 (0.0038) [2024-06-19 09:45:18,106][26599] Updated weights for policy 0, policy_version 327604 (0.0033) [2024-06-19 09:45:18,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5367480320. Throughput: 0: 42913.8. Samples: 1635026800. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:18,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 09:45:22,253][26599] Updated weights for policy 0, policy_version 327614 (0.0039) [2024-06-19 09:45:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5367660544. Throughput: 0: 43056.4. Samples: 1635285280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:23,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 09:45:25,675][26599] Updated weights for policy 0, policy_version 327624 (0.0031) [2024-06-19 09:45:28,384][26367] Fps is (10 sec: 40945.2, 60 sec: 43141.8, 300 sec: 42820.0). Total num frames: 5367889920. Throughput: 0: 43099.6. Samples: 1635538560. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:28,385][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 09:45:30,083][26599] Updated weights for policy 0, policy_version 327634 (0.0033) [2024-06-19 09:45:33,358][26599] Updated weights for policy 0, policy_version 327644 (0.0033) [2024-06-19 09:45:33,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5368119296. Throughput: 0: 42893.3. Samples: 1635668120. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:33,380][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 09:45:37,959][26599] Updated weights for policy 0, policy_version 327654 (0.0025) [2024-06-19 09:45:38,380][26367] Fps is (10 sec: 40975.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5368299520. Throughput: 0: 43111.1. Samples: 1635926520. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:38,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 09:45:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000327655_5368299520.pth... 
[2024-06-19 09:45:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000327029_5358043136.pth [2024-06-19 09:45:41,057][26599] Updated weights for policy 0, policy_version 327664 (0.0045) [2024-06-19 09:45:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5368528896. Throughput: 0: 42754.7. Samples: 1636170900. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:43,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 09:45:45,739][26599] Updated weights for policy 0, policy_version 327674 (0.0034) [2024-06-19 09:45:48,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42601.0, 300 sec: 42765.0). Total num frames: 5368725504. Throughput: 0: 42761.0. Samples: 1636305880. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:48,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 09:45:48,973][26599] Updated weights for policy 0, policy_version 327684 (0.0034) [2024-06-19 09:45:53,384][26367] Fps is (10 sec: 39307.2, 60 sec: 42868.9, 300 sec: 42708.9). Total num frames: 5368922112. Throughput: 0: 42642.0. Samples: 1636557740. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:53,384][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 09:45:53,395][26599] Updated weights for policy 0, policy_version 327694 (0.0033) [2024-06-19 09:45:56,579][26599] Updated weights for policy 0, policy_version 327704 (0.0033) [2024-06-19 09:45:58,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5369167872. Throughput: 0: 42516.4. Samples: 1636807840. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:45:58,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 09:46:00,897][26599] Updated weights for policy 0, policy_version 327714 (0.0042) [2024-06-19 09:46:03,380][26367] Fps is (10 sec: 44253.1, 60 sec: 42325.4, 300 sec: 42709.9). Total num frames: 5369364480. Throughput: 0: 42509.0. Samples: 1636939700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:46:03,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 09:46:04,361][26599] Updated weights for policy 0, policy_version 327724 (0.0046) [2024-06-19 09:46:08,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5369561088. Throughput: 0: 42464.9. Samples: 1637196200. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:46:08,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 09:46:08,761][26599] Updated weights for policy 0, policy_version 327734 (0.0032) [2024-06-19 09:46:09,850][26579] Signal inference workers to stop experience collection... (24200 times) [2024-06-19 09:46:09,884][26599] InferenceWorker_p0-w0: stopping experience collection (24200 times) [2024-06-19 09:46:09,907][26579] Signal inference workers to resume experience collection... (24200 times) [2024-06-19 09:46:09,908][26599] InferenceWorker_p0-w0: resuming experience collection (24200 times) [2024-06-19 09:46:11,910][26599] Updated weights for policy 0, policy_version 327744 (0.0038) [2024-06-19 09:46:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5369806848. Throughput: 0: 42486.3. Samples: 1637450280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 23.0) [2024-06-19 09:46:13,380][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 09:46:16,425][26599] Updated weights for policy 0, policy_version 327754 (0.0032) [2024-06-19 09:46:18,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 5370003456. Throughput: 0: 42514.7. Samples: 1637581280. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:46:18,380][26367] Avg episode reward: [(0, '0.457')] [2024-06-19 09:46:19,405][26599] Updated weights for policy 0, policy_version 327764 (0.0042) [2024-06-19 09:46:23,381][26367] Fps is (10 sec: 40955.1, 60 sec: 42597.6, 300 sec: 42709.3). Total num frames: 5370216448. 
Throughput: 0: 42392.7. Samples: 1637834240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:46:23,382][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 09:46:24,165][26599] Updated weights for policy 0, policy_version 327774 (0.0033) [2024-06-19 09:46:27,104][26599] Updated weights for policy 0, policy_version 327784 (0.0031) [2024-06-19 09:46:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42601.0, 300 sec: 42820.6). Total num frames: 5370445824. Throughput: 0: 42558.2. Samples: 1638086020. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:46:28,381][26367] Avg episode reward: [(0, '0.444')] [2024-06-19 09:46:31,995][26599] Updated weights for policy 0, policy_version 327794 (0.0033) [2024-06-19 09:46:33,384][26367] Fps is (10 sec: 42587.5, 60 sec: 42049.7, 300 sec: 42709.0). Total num frames: 5370642432. Throughput: 0: 42509.0. Samples: 1638218940. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:46:33,384][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 09:46:35,099][26599] Updated weights for policy 0, policy_version 327804 (0.0057) [2024-06-19 09:46:38,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5370839040. Throughput: 0: 42392.2. Samples: 1638465240. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:46:38,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 09:46:39,543][26599] Updated weights for policy 0, policy_version 327814 (0.0034) [2024-06-19 09:46:42,684][26599] Updated weights for policy 0, policy_version 327824 (0.0032) [2024-06-19 09:46:43,380][26367] Fps is (10 sec: 44252.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5371084800. Throughput: 0: 42496.4. Samples: 1638720180. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:46:43,381][26367] Avg episode reward: [(0, '0.831')] [2024-06-19 09:46:47,276][26599] Updated weights for policy 0, policy_version 327834 (0.0030) [2024-06-19 09:46:48,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5371281408. Throughput: 0: 42688.4. Samples: 1638860680. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:46:48,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 09:46:50,034][26599] Updated weights for policy 0, policy_version 327844 (0.0029) [2024-06-19 09:46:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42874.0, 300 sec: 42710.0). Total num frames: 5371494400. Throughput: 0: 42611.0. Samples: 1639113700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:46:53,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 09:46:54,910][26599] Updated weights for policy 0, policy_version 327854 (0.0047) [2024-06-19 09:46:57,638][26599] Updated weights for policy 0, policy_version 327864 (0.0044) [2024-06-19 09:46:58,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5371740160. Throughput: 0: 42556.7. Samples: 1639365340. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:46:58,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 09:47:02,632][26599] Updated weights for policy 0, policy_version 327874 (0.0030) [2024-06-19 09:47:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5371920384. Throughput: 0: 42720.8. Samples: 1639503720. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:47:03,381][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 09:47:05,563][26599] Updated weights for policy 0, policy_version 327884 (0.0028) [2024-06-19 09:47:08,384][26367] Fps is (10 sec: 39307.6, 60 sec: 42868.8, 300 sec: 42709.0). Total num frames: 5372133376. Throughput: 0: 42630.9. Samples: 1639752740. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:47:08,384][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 09:47:10,100][26599] Updated weights for policy 0, policy_version 327894 (0.0032) [2024-06-19 09:47:13,225][26599] Updated weights for policy 0, policy_version 327904 (0.0034) [2024-06-19 09:47:13,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5372379136. Throughput: 0: 42718.6. Samples: 1640008360. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:47:13,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 09:47:17,685][26599] Updated weights for policy 0, policy_version 327914 (0.0036) [2024-06-19 09:47:18,380][26367] Fps is (10 sec: 42614.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5372559360. Throughput: 0: 42786.2. Samples: 1640144160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:47:18,380][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 09:47:20,934][26599] Updated weights for policy 0, policy_version 327924 (0.0046) [2024-06-19 09:47:23,384][26367] Fps is (10 sec: 40945.5, 60 sec: 42869.6, 300 sec: 42709.0). Total num frames: 5372788736. Throughput: 0: 42981.0. Samples: 1640399540. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:47:23,385][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 09:47:25,374][26599] Updated weights for policy 0, policy_version 327934 (0.0033) [2024-06-19 09:47:27,982][26579] Signal inference workers to stop experience collection... (24250 times) [2024-06-19 09:47:28,008][26599] InferenceWorker_p0-w0: stopping experience collection (24250 times) [2024-06-19 09:47:28,045][26579] Signal inference workers to resume experience collection... 
(24250 times) [2024-06-19 09:47:28,046][26599] InferenceWorker_p0-w0: resuming experience collection (24250 times) [2024-06-19 09:47:28,362][26599] Updated weights for policy 0, policy_version 327944 (0.0039) [2024-06-19 09:47:28,384][26367] Fps is (10 sec: 47495.9, 60 sec: 43141.9, 300 sec: 42931.1). Total num frames: 5373034496. Throughput: 0: 43073.1. Samples: 1640658620. Policy #0 lag: (min: 0.0, avg: 8.7, max: 22.0) [2024-06-19 09:47:28,384][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 09:47:33,023][26599] Updated weights for policy 0, policy_version 327954 (0.0025) [2024-06-19 09:47:33,380][26367] Fps is (10 sec: 40974.5, 60 sec: 42600.9, 300 sec: 42709.5). Total num frames: 5373198336. Throughput: 0: 42816.8. Samples: 1640787440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:47:33,381][26367] Avg episode reward: [(0, '0.797')] [2024-06-19 09:47:36,094][26599] Updated weights for policy 0, policy_version 327964 (0.0030) [2024-06-19 09:47:38,380][26367] Fps is (10 sec: 40974.6, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5373444096. Throughput: 0: 42757.8. Samples: 1641037800. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:47:38,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 09:47:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000327969_5373444096.pth... [2024-06-19 09:47:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000327341_5363154944.pth [2024-06-19 09:47:41,165][26599] Updated weights for policy 0, policy_version 327974 (0.0042) [2024-06-19 09:47:43,380][26367] Fps is (10 sec: 45875.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5373657088. Throughput: 0: 42935.2. Samples: 1641297420. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:47:43,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 09:47:43,957][26599] Updated weights for policy 0, policy_version 327984 (0.0037) [2024-06-19 09:47:48,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5373837312. Throughput: 0: 42613.7. Samples: 1641421340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:47:48,381][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 09:47:48,749][26599] Updated weights for policy 0, policy_version 327994 (0.0043) [2024-06-19 09:47:51,644][26599] Updated weights for policy 0, policy_version 328004 (0.0033) [2024-06-19 09:47:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5374083072. Throughput: 0: 42835.5. Samples: 1641680180. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:47:53,381][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 09:47:56,241][26599] Updated weights for policy 0, policy_version 328014 (0.0041) [2024-06-19 09:47:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5374279680. Throughput: 0: 42834.2. Samples: 1641935900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:47:58,381][26367] Avg episode reward: [(0, '0.824')] [2024-06-19 09:47:59,624][26599] Updated weights for policy 0, policy_version 328024 (0.0031) [2024-06-19 09:48:03,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5374476288. Throughput: 0: 42528.0. Samples: 1642057920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:48:03,380][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 09:48:03,842][26599] Updated weights for policy 0, policy_version 328034 (0.0038) [2024-06-19 09:48:07,222][26599] Updated weights for policy 0, policy_version 328044 (0.0036) [2024-06-19 09:48:08,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42874.1, 300 sec: 42709.5). Total num frames: 5374705664. Throughput: 0: 42609.7. Samples: 1642316820. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:48:08,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 09:48:11,346][26599] Updated weights for policy 0, policy_version 328054 (0.0036) [2024-06-19 09:48:13,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42049.9, 300 sec: 42653.4). Total num frames: 5374902272. Throughput: 0: 42608.1. Samples: 1642575980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:48:13,384][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 09:48:14,844][26599] Updated weights for policy 0, policy_version 328064 (0.0024) [2024-06-19 09:48:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42710.0). Total num frames: 5375115264. Throughput: 0: 42584.5. Samples: 1642703740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:48:18,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 09:48:18,871][26599] Updated weights for policy 0, policy_version 328074 (0.0022) [2024-06-19 09:48:22,507][26599] Updated weights for policy 0, policy_version 328084 (0.0035) [2024-06-19 09:48:23,380][26367] Fps is (10 sec: 44252.6, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5375344640. Throughput: 0: 42720.5. Samples: 1642960220. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:48:23,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 09:48:26,403][26599] Updated weights for policy 0, policy_version 328094 (0.0027) [2024-06-19 09:48:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42054.8, 300 sec: 42709.5). Total num frames: 5375557632. Throughput: 0: 42679.9. Samples: 1643218020. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:48:28,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 09:48:30,300][26599] Updated weights for policy 0, policy_version 328104 (0.0036) [2024-06-19 09:48:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 5375770624. Throughput: 0: 42697.4. Samples: 1643342720. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:48:33,381][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 09:48:34,342][26599] Updated weights for policy 0, policy_version 328114 (0.0039) [2024-06-19 09:48:37,907][26599] Updated weights for policy 0, policy_version 328124 (0.0024) [2024-06-19 09:48:38,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5375983616. Throughput: 0: 42609.3. Samples: 1643597600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:48:38,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 09:48:41,924][26599] Updated weights for policy 0, policy_version 328134 (0.0022) [2024-06-19 09:48:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5376212992. Throughput: 0: 42757.9. Samples: 1643860000. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 09:48:43,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 09:48:45,441][26599] Updated weights for policy 0, policy_version 328144 (0.0042) [2024-06-19 09:48:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5376409600. Throughput: 0: 42793.7. Samples: 1643983640. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:48:48,380][26367] Avg episode reward: [(0, '0.198')] [2024-06-19 09:48:48,554][26579] Signal inference workers to stop experience collection... (24300 times) [2024-06-19 09:48:48,554][26579] Signal inference workers to resume experience collection... (24300 times) [2024-06-19 09:48:48,608][26599] InferenceWorker_p0-w0: stopping experience collection (24300 times) [2024-06-19 09:48:48,608][26599] InferenceWorker_p0-w0: resuming experience collection (24300 times) [2024-06-19 09:48:49,467][26599] Updated weights for policy 0, policy_version 328154 (0.0049) [2024-06-19 09:48:53,223][26599] Updated weights for policy 0, policy_version 328164 (0.0030) [2024-06-19 09:48:53,384][26367] Fps is (10 sec: 42582.8, 60 sec: 42595.8, 300 sec: 42708.9). Total num frames: 5376638976. Throughput: 0: 42780.5. Samples: 1644242100. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:48:53,385][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 09:48:57,025][26599] Updated weights for policy 0, policy_version 328174 (0.0042) [2024-06-19 09:48:58,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5376851968. Throughput: 0: 42870.5. Samples: 1644505000. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:48:58,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 09:49:00,934][26599] Updated weights for policy 0, policy_version 328184 (0.0036) [2024-06-19 09:49:03,384][26367] Fps is (10 sec: 42598.4, 60 sec: 43141.9, 300 sec: 42821.1). Total num frames: 5377064960. Throughput: 0: 42798.8. Samples: 1644629840. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:03,384][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 09:49:04,491][26599] Updated weights for policy 0, policy_version 328194 (0.0033) [2024-06-19 09:49:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5377277952. 
Throughput: 0: 42843.9. Samples: 1644888200. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:08,384][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 09:49:08,547][26599] Updated weights for policy 0, policy_version 328204 (0.0042) [2024-06-19 09:49:12,018][26599] Updated weights for policy 0, policy_version 328214 (0.0044) [2024-06-19 09:49:13,380][26367] Fps is (10 sec: 42613.4, 60 sec: 43147.0, 300 sec: 42653.9). Total num frames: 5377490944. Throughput: 0: 42761.3. Samples: 1645142280. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:13,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 09:49:16,286][26599] Updated weights for policy 0, policy_version 328224 (0.0033) [2024-06-19 09:49:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5377703936. Throughput: 0: 42876.4. Samples: 1645272160. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:18,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 09:49:19,535][26599] Updated weights for policy 0, policy_version 328234 (0.0034) [2024-06-19 09:49:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5377916928. Throughput: 0: 43025.3. Samples: 1645533740. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:23,381][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 09:49:23,855][26599] Updated weights for policy 0, policy_version 328244 (0.0029) [2024-06-19 09:49:27,594][26599] Updated weights for policy 0, policy_version 328254 (0.0027) [2024-06-19 09:49:28,380][26367] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5378146304. Throughput: 0: 42770.0. Samples: 1645784660. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:28,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 09:49:31,478][26599] Updated weights for policy 0, policy_version 328264 (0.0040) [2024-06-19 09:49:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5378342912. Throughput: 0: 42955.4. Samples: 1645916640. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:33,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 09:49:35,024][26599] Updated weights for policy 0, policy_version 328274 (0.0040) [2024-06-19 09:49:38,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5378555904. Throughput: 0: 42995.8. Samples: 1646176760. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:38,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 09:49:38,545][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000328282_5378572288.pth... [2024-06-19 09:49:38,602][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000327655_5368299520.pth [2024-06-19 09:49:39,095][26599] Updated weights for policy 0, policy_version 328284 (0.0034) [2024-06-19 09:49:42,587][26599] Updated weights for policy 0, policy_version 328294 (0.0031) [2024-06-19 09:49:43,380][26367] Fps is (10 sec: 45875.9, 60 sec: 43144.6, 300 sec: 42821.1). Total num frames: 5378801664. Throughput: 0: 42831.2. Samples: 1646432400. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:43,380][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 09:49:46,592][26599] Updated weights for policy 0, policy_version 328304 (0.0044) [2024-06-19 09:49:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5378981888. Throughput: 0: 43185.1. Samples: 1646573020. 
Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:48,381][26367] Avg episode reward: [(0, '0.894')] [2024-06-19 09:49:50,223][26599] Updated weights for policy 0, policy_version 328314 (0.0029) [2024-06-19 09:49:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5379194880. Throughput: 0: 43000.6. Samples: 1646823220. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:53,380][26367] Avg episode reward: [(0, '0.767')] [2024-06-19 09:49:54,230][26599] Updated weights for policy 0, policy_version 328324 (0.0038) [2024-06-19 09:49:57,983][26599] Updated weights for policy 0, policy_version 328334 (0.0039) [2024-06-19 09:49:58,380][26367] Fps is (10 sec: 45876.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5379440640. Throughput: 0: 42968.6. Samples: 1647075860. Policy #0 lag: (min: 1.0, avg: 10.0, max: 23.0) [2024-06-19 09:49:58,381][26367] Avg episode reward: [(0, '0.404')] [2024-06-19 09:50:02,032][26599] Updated weights for policy 0, policy_version 328344 (0.0028) [2024-06-19 09:50:03,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42874.0, 300 sec: 42765.0). Total num frames: 5379637248. Throughput: 0: 43059.5. Samples: 1647209840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:03,384][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 09:50:05,844][26599] Updated weights for policy 0, policy_version 328354 (0.0025) [2024-06-19 09:50:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5379850240. Throughput: 0: 42780.5. Samples: 1647458860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:08,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 09:50:09,733][26599] Updated weights for policy 0, policy_version 328364 (0.0044) [2024-06-19 09:50:13,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5380063232. Throughput: 0: 42974.9. Samples: 1647718520. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:13,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 09:50:13,508][26599] Updated weights for policy 0, policy_version 328374 (0.0028) [2024-06-19 09:50:17,279][26599] Updated weights for policy 0, policy_version 328384 (0.0037) [2024-06-19 09:50:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5380276224. Throughput: 0: 42954.3. Samples: 1647849580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:18,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 09:50:21,105][26599] Updated weights for policy 0, policy_version 328394 (0.0039) [2024-06-19 09:50:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 5380489216. Throughput: 0: 42773.0. Samples: 1648101540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:23,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 09:50:24,148][26579] Signal inference workers to stop experience collection... (24350 times) [2024-06-19 09:50:24,198][26579] Signal inference workers to resume experience collection... (24350 times) [2024-06-19 09:50:24,198][26599] InferenceWorker_p0-w0: stopping experience collection (24350 times) [2024-06-19 09:50:24,210][26599] InferenceWorker_p0-w0: resuming experience collection (24350 times) [2024-06-19 09:50:25,347][26599] Updated weights for policy 0, policy_version 328404 (0.0039) [2024-06-19 09:50:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 5380702208. Throughput: 0: 42946.7. Samples: 1648365000. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:28,380][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 09:50:28,679][26599] Updated weights for policy 0, policy_version 328414 (0.0043) [2024-06-19 09:50:32,904][26599] Updated weights for policy 0, policy_version 328424 (0.0045) [2024-06-19 09:50:33,381][26367] Fps is (10 sec: 44232.9, 60 sec: 43144.0, 300 sec: 42820.4). Total num frames: 5380931584. Throughput: 0: 42670.0. Samples: 1648493200. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:33,382][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 09:50:36,253][26599] Updated weights for policy 0, policy_version 328434 (0.0046) [2024-06-19 09:50:38,380][26367] Fps is (10 sec: 44235.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5381144576. Throughput: 0: 42649.1. Samples: 1648742440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:38,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 09:50:40,355][26599] Updated weights for policy 0, policy_version 328444 (0.0029) [2024-06-19 09:50:43,380][26367] Fps is (10 sec: 42602.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5381357568. Throughput: 0: 42946.2. Samples: 1649008440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:43,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 09:50:43,790][26599] Updated weights for policy 0, policy_version 328454 (0.0021) [2024-06-19 09:50:47,876][26599] Updated weights for policy 0, policy_version 328464 (0.0035) [2024-06-19 09:50:48,380][26367] Fps is (10 sec: 42599.3, 60 sec: 43144.6, 300 sec: 42876.6). Total num frames: 5381570560. Throughput: 0: 42814.3. Samples: 1649136480. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:48,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 09:50:51,525][26599] Updated weights for policy 0, policy_version 328474 (0.0031) [2024-06-19 09:50:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5381783552. Throughput: 0: 42959.2. Samples: 1649392020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:53,380][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 09:50:55,733][26599] Updated weights for policy 0, policy_version 328484 (0.0032) [2024-06-19 09:50:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5381980160. Throughput: 0: 42756.0. Samples: 1649642540. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:50:58,381][26367] Avg episode reward: [(0, '0.810')] [2024-06-19 09:50:59,138][26599] Updated weights for policy 0, policy_version 328494 (0.0032) [2024-06-19 09:51:03,083][26599] Updated weights for policy 0, policy_version 328504 (0.0029) [2024-06-19 09:51:03,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5382209536. Throughput: 0: 42777.7. Samples: 1649774580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:51:03,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 09:51:07,191][26599] Updated weights for policy 0, policy_version 328514 (0.0037) [2024-06-19 09:51:08,380][26367] Fps is (10 sec: 45874.3, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5382438912. Throughput: 0: 43003.4. Samples: 1650036700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:51:08,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 09:51:10,588][26599] Updated weights for policy 0, policy_version 328524 (0.0029) [2024-06-19 09:51:13,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5382635520. Throughput: 0: 42916.2. Samples: 1650296240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 09:51:13,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 09:51:14,725][26599] Updated weights for policy 0, policy_version 328534 (0.0034) [2024-06-19 09:51:18,268][26599] Updated weights for policy 0, policy_version 328544 (0.0027) [2024-06-19 09:51:18,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43144.4, 300 sec: 42876.2). Total num frames: 5382864896. Throughput: 0: 42744.7. Samples: 1650416680. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:51:18,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 09:51:22,415][26599] Updated weights for policy 0, policy_version 328554 (0.0040) [2024-06-19 09:51:23,380][26367] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5383077888. Throughput: 0: 43032.1. Samples: 1650678880. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:51:23,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 09:51:25,916][26599] Updated weights for policy 0, policy_version 328564 (0.0036) [2024-06-19 09:51:28,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42871.4, 300 sec: 42821.1). Total num frames: 5383274496. Throughput: 0: 42912.0. Samples: 1650939480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:51:28,380][26367] Avg episode reward: [(0, '0.746')] [2024-06-19 09:51:29,967][26599] Updated weights for policy 0, policy_version 328574 (0.0033) [2024-06-19 09:51:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42872.1, 300 sec: 42931.7). Total num frames: 5383503872. Throughput: 0: 42848.0. Samples: 1651064640. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:51:33,381][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 09:51:33,641][26599] Updated weights for policy 0, policy_version 328584 (0.0033) [2024-06-19 09:51:37,666][26599] Updated weights for policy 0, policy_version 328594 (0.0042) [2024-06-19 09:51:38,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5383716864. Throughput: 0: 42909.1. Samples: 1651322940. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:51:38,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 09:51:38,388][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000328596_5383716864.pth... [2024-06-19 09:51:38,440][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000327969_5373444096.pth [2024-06-19 09:51:39,825][26579] Signal inference workers to stop experience collection... (24400 times) [2024-06-19 09:51:39,826][26579] Signal inference workers to resume experience collection... (24400 times) [2024-06-19 09:51:39,836][26599] InferenceWorker_p0-w0: stopping experience collection (24400 times) [2024-06-19 09:51:39,836][26599] InferenceWorker_p0-w0: resuming experience collection (24400 times) [2024-06-19 09:51:41,503][26599] Updated weights for policy 0, policy_version 328604 (0.0037) [2024-06-19 09:51:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5383929856. Throughput: 0: 42969.8. Samples: 1651576180. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:51:43,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 09:51:45,441][26599] Updated weights for policy 0, policy_version 328614 (0.0030) [2024-06-19 09:51:48,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5384142848. Throughput: 0: 42936.0. Samples: 1651706700. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:51:48,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 09:51:49,003][26599] Updated weights for policy 0, policy_version 328624 (0.0037) [2024-06-19 09:51:53,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5384323072. Throughput: 0: 42882.8. Samples: 1651966420. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:51:53,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 09:51:53,422][26599] Updated weights for policy 0, policy_version 328634 (0.0040) [2024-06-19 09:51:56,527][26599] Updated weights for policy 0, policy_version 328644 (0.0036) [2024-06-19 09:51:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5384568832. Throughput: 0: 42729.5. Samples: 1652219060. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:51:58,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 09:52:01,001][26599] Updated weights for policy 0, policy_version 328654 (0.0040) [2024-06-19 09:52:03,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42821.1). Total num frames: 5384765440. Throughput: 0: 43031.8. Samples: 1652353100. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:52:03,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 09:52:04,155][26599] Updated weights for policy 0, policy_version 328664 (0.0032) [2024-06-19 09:52:08,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 5384978432. Throughput: 0: 42793.0. Samples: 1652604560. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:52:08,380][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 09:52:08,849][26599] Updated weights for policy 0, policy_version 328674 (0.0029) [2024-06-19 09:52:11,859][26599] Updated weights for policy 0, policy_version 328684 (0.0037) [2024-06-19 09:52:13,380][26367] Fps is (10 sec: 45874.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5385224192. Throughput: 0: 42537.1. Samples: 1652853660. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:52:13,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 09:52:16,535][26599] Updated weights for policy 0, policy_version 328694 (0.0036) [2024-06-19 09:52:18,384][26367] Fps is (10 sec: 44221.9, 60 sec: 42596.2, 300 sec: 42820.6). Total num frames: 5385420800. Throughput: 0: 42762.6. Samples: 1652989100. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:52:18,384][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 09:52:19,458][26599] Updated weights for policy 0, policy_version 328704 (0.0040) [2024-06-19 09:52:23,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42654.5). Total num frames: 5385617408. Throughput: 0: 42564.5. Samples: 1653238340. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:52:23,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 09:52:24,083][26599] Updated weights for policy 0, policy_version 328714 (0.0027) [2024-06-19 09:52:27,006][26599] Updated weights for policy 0, policy_version 328724 (0.0026) [2024-06-19 09:52:28,380][26367] Fps is (10 sec: 45890.6, 60 sec: 43417.6, 300 sec: 42987.2). Total num frames: 5385879552. Throughput: 0: 42646.2. Samples: 1653495260. 
Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:52:28,380][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 09:52:31,753][26599] Updated weights for policy 0, policy_version 328734 (0.0038) [2024-06-19 09:52:33,384][26367] Fps is (10 sec: 42583.2, 60 sec: 42322.7, 300 sec: 42709.0). Total num frames: 5386043392. Throughput: 0: 42858.3. Samples: 1653635480. Policy #0 lag: (min: 1.0, avg: 11.2, max: 24.0) [2024-06-19 09:52:33,385][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 09:52:34,747][26599] Updated weights for policy 0, policy_version 328744 (0.0034) [2024-06-19 09:52:38,380][26367] Fps is (10 sec: 36044.5, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5386240000. Throughput: 0: 42532.4. Samples: 1653880380. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:52:38,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 09:52:39,283][26599] Updated weights for policy 0, policy_version 328754 (0.0027) [2024-06-19 09:52:42,553][26599] Updated weights for policy 0, policy_version 328764 (0.0034) [2024-06-19 09:52:43,380][26367] Fps is (10 sec: 47530.7, 60 sec: 43144.5, 300 sec: 42987.2). Total num frames: 5386518528. Throughput: 0: 42633.3. Samples: 1654137560. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:52:43,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 09:52:46,977][26599] Updated weights for policy 0, policy_version 328774 (0.0049) [2024-06-19 09:52:48,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5386698752. Throughput: 0: 42764.4. Samples: 1654277500. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:52:48,381][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 09:52:49,996][26599] Updated weights for policy 0, policy_version 328784 (0.0041) [2024-06-19 09:52:53,380][26367] Fps is (10 sec: 37683.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5386895360. Throughput: 0: 42741.7. Samples: 1654527940. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:52:53,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 09:52:54,782][26599] Updated weights for policy 0, policy_version 328794 (0.0032) [2024-06-19 09:52:55,809][26579] Signal inference workers to stop experience collection... (24450 times) [2024-06-19 09:52:55,810][26579] Signal inference workers to resume experience collection... (24450 times) [2024-06-19 09:52:55,834][26599] InferenceWorker_p0-w0: stopping experience collection (24450 times) [2024-06-19 09:52:55,834][26599] InferenceWorker_p0-w0: resuming experience collection (24450 times) [2024-06-19 09:52:57,619][26599] Updated weights for policy 0, policy_version 328804 (0.0027) [2024-06-19 09:52:58,384][26367] Fps is (10 sec: 47496.5, 60 sec: 43415.0, 300 sec: 43042.2). Total num frames: 5387173888. Throughput: 0: 42911.8. Samples: 1654784840. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:52:58,385][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 09:53:02,349][26599] Updated weights for policy 0, policy_version 328814 (0.0032) [2024-06-19 09:53:03,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5387337728. Throughput: 0: 42992.5. Samples: 1654923620. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:03,380][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 09:53:05,046][26599] Updated weights for policy 0, policy_version 328824 (0.0039) [2024-06-19 09:53:08,384][26367] Fps is (10 sec: 37683.1, 60 sec: 42868.8, 300 sec: 42876.1). Total num frames: 5387550720. Throughput: 0: 42906.8. Samples: 1655169300. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:08,384][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 09:53:10,378][26599] Updated weights for policy 0, policy_version 328834 (0.0034) [2024-06-19 09:53:12,764][26599] Updated weights for policy 0, policy_version 328844 (0.0027) [2024-06-19 09:53:13,384][26367] Fps is (10 sec: 47496.0, 60 sec: 43142.0, 300 sec: 43042.2). Total num frames: 5387812864. Throughput: 0: 42831.1. Samples: 1655422820. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:13,385][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 09:53:17,886][26599] Updated weights for policy 0, policy_version 328854 (0.0032) [2024-06-19 09:53:18,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42327.7, 300 sec: 42765.0). Total num frames: 5387960320. Throughput: 0: 42760.8. Samples: 1655559560. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:18,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 09:53:20,591][26599] Updated weights for policy 0, policy_version 328864 (0.0037) [2024-06-19 09:53:23,380][26367] Fps is (10 sec: 39336.3, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5388206080. Throughput: 0: 42771.2. Samples: 1655805080. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:23,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 09:53:25,592][26599] Updated weights for policy 0, policy_version 328874 (0.0036) [2024-06-19 09:53:28,182][26599] Updated weights for policy 0, policy_version 328884 (0.0028) [2024-06-19 09:53:28,380][26367] Fps is (10 sec: 49152.1, 60 sec: 42871.5, 300 sec: 42987.2). Total num frames: 5388451840. Throughput: 0: 42850.8. Samples: 1656065840. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:28,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 09:53:33,167][26599] Updated weights for policy 0, policy_version 328894 (0.0031) [2024-06-19 09:53:33,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42601.0, 300 sec: 42765.0). Total num frames: 5388599296. Throughput: 0: 42674.7. Samples: 1656197860. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:33,381][26367] Avg episode reward: [(0, '0.838')] [2024-06-19 09:53:35,838][26599] Updated weights for policy 0, policy_version 328904 (0.0033) [2024-06-19 09:53:38,380][26367] Fps is (10 sec: 39321.1, 60 sec: 43417.6, 300 sec: 42820.5). Total num frames: 5388845056. Throughput: 0: 42605.3. Samples: 1656445180. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:38,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 09:53:38,535][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000328910_5388861440.pth... [2024-06-19 09:53:38,600][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000328282_5378572288.pth [2024-06-19 09:53:40,746][26599] Updated weights for policy 0, policy_version 328914 (0.0029) [2024-06-19 09:53:43,380][26367] Fps is (10 sec: 47513.9, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5389074432. Throughput: 0: 42681.2. Samples: 1656705340. Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:43,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 09:53:43,418][26599] Updated weights for policy 0, policy_version 328924 (0.0033) [2024-06-19 09:53:48,333][26599] Updated weights for policy 0, policy_version 328934 (0.0038) [2024-06-19 09:53:48,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.5, 300 sec: 42765.6). Total num frames: 5389254656. Throughput: 0: 42492.5. Samples: 1656835780. 
Policy #0 lag: (min: 2.0, avg: 10.9, max: 22.0) [2024-06-19 09:53:48,380][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 09:53:51,074][26599] Updated weights for policy 0, policy_version 328944 (0.0044) [2024-06-19 09:53:53,380][26367] Fps is (10 sec: 42597.7, 60 sec: 43417.5, 300 sec: 42876.1). Total num frames: 5389500416. Throughput: 0: 42722.5. Samples: 1657091660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:53:53,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 09:53:55,749][26599] Updated weights for policy 0, policy_version 328954 (0.0027) [2024-06-19 09:53:58,380][26367] Fps is (10 sec: 45874.5, 60 sec: 42327.9, 300 sec: 42876.6). Total num frames: 5389713408. Throughput: 0: 42923.9. Samples: 1657354240. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:53:58,381][26367] Avg episode reward: [(0, '0.780')] [2024-06-19 09:53:58,834][26599] Updated weights for policy 0, policy_version 328964 (0.0026) [2024-06-19 09:54:03,323][26599] Updated weights for policy 0, policy_version 328974 (0.0032) [2024-06-19 09:54:03,380][26367] Fps is (10 sec: 40961.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5389910016. Throughput: 0: 42696.1. Samples: 1657480880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:03,381][26367] Avg episode reward: [(0, '0.821')] [2024-06-19 09:54:06,251][26579] Signal inference workers to stop experience collection... (24500 times) [2024-06-19 09:54:06,301][26599] InferenceWorker_p0-w0: stopping experience collection (24500 times) [2024-06-19 09:54:06,311][26579] Signal inference workers to resume experience collection... (24500 times) [2024-06-19 09:54:06,321][26599] InferenceWorker_p0-w0: resuming experience collection (24500 times) [2024-06-19 09:54:06,459][26599] Updated weights for policy 0, policy_version 328984 (0.0040) [2024-06-19 09:54:08,380][26367] Fps is (10 sec: 44237.5, 60 sec: 43420.3, 300 sec: 42931.7). Total num frames: 5390155776. Throughput: 0: 42956.9. 
Samples: 1657738140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:08,380][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 09:54:10,842][26599] Updated weights for policy 0, policy_version 328994 (0.0036) [2024-06-19 09:54:13,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42328.0, 300 sec: 42876.1). Total num frames: 5390352384. Throughput: 0: 43081.3. Samples: 1658004500. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:13,380][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 09:54:14,113][26599] Updated weights for policy 0, policy_version 329004 (0.0028) [2024-06-19 09:54:18,380][26367] Fps is (10 sec: 37683.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5390532608. Throughput: 0: 42815.6. Samples: 1658124560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:18,380][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 09:54:18,797][26599] Updated weights for policy 0, policy_version 329014 (0.0029) [2024-06-19 09:54:21,676][26599] Updated weights for policy 0, policy_version 329024 (0.0045) [2024-06-19 09:54:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5390778368. Throughput: 0: 42946.8. Samples: 1658377780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:23,380][26367] Avg episode reward: [(0, '0.824')] [2024-06-19 09:54:26,301][26599] Updated weights for policy 0, policy_version 329034 (0.0039) [2024-06-19 09:54:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42052.2, 300 sec: 42820.6). Total num frames: 5390974976. Throughput: 0: 43101.3. Samples: 1658644900. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:28,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 09:54:29,534][26599] Updated weights for policy 0, policy_version 329044 (0.0032) [2024-06-19 09:54:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5391187968. Throughput: 0: 42840.4. 
Samples: 1658763600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:33,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 09:54:33,841][26599] Updated weights for policy 0, policy_version 329054 (0.0031) [2024-06-19 09:54:37,071][26599] Updated weights for policy 0, policy_version 329064 (0.0031) [2024-06-19 09:54:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5391417344. Throughput: 0: 42830.3. Samples: 1659019020. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:38,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 09:54:41,374][26599] Updated weights for policy 0, policy_version 329074 (0.0029) [2024-06-19 09:54:43,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5391630336. Throughput: 0: 42792.0. Samples: 1659279880. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:43,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 09:54:44,833][26599] Updated weights for policy 0, policy_version 329084 (0.0047) [2024-06-19 09:54:48,384][26367] Fps is (10 sec: 39307.4, 60 sec: 42595.7, 300 sec: 42764.5). Total num frames: 5391810560. Throughput: 0: 42750.6. Samples: 1659404820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:48,385][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 09:54:49,269][26599] Updated weights for policy 0, policy_version 329094 (0.0035) [2024-06-19 09:54:52,420][26599] Updated weights for policy 0, policy_version 329104 (0.0037) [2024-06-19 09:54:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5392056320. Throughput: 0: 42856.3. Samples: 1659666680. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:53,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 09:54:56,955][26599] Updated weights for policy 0, policy_version 329114 (0.0036) [2024-06-19 09:54:58,381][26367] Fps is (10 sec: 45887.4, 60 sec: 42597.7, 300 sec: 42820.4). Total num frames: 5392269312. Throughput: 0: 42575.9. Samples: 1659920460. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:54:58,382][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 09:55:00,044][26599] Updated weights for policy 0, policy_version 329124 (0.0027) [2024-06-19 09:55:03,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5392465920. Throughput: 0: 42777.8. Samples: 1660049560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 09:55:03,380][26367] Avg episode reward: [(0, '0.419')] [2024-06-19 09:55:04,548][26599] Updated weights for policy 0, policy_version 329134 (0.0041) [2024-06-19 09:55:07,842][26599] Updated weights for policy 0, policy_version 329144 (0.0027) [2024-06-19 09:55:08,380][26367] Fps is (10 sec: 44241.0, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5392711680. Throughput: 0: 42956.8. Samples: 1660310840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:08,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 09:55:12,167][26599] Updated weights for policy 0, policy_version 329154 (0.0042) [2024-06-19 09:55:13,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5392908288. Throughput: 0: 42572.1. Samples: 1660560640. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:13,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 09:55:15,535][26599] Updated weights for policy 0, policy_version 329164 (0.0040) [2024-06-19 09:55:18,380][26367] Fps is (10 sec: 40960.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5393121280. Throughput: 0: 42788.3. Samples: 1660689080. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:18,381][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 09:55:19,884][26599] Updated weights for policy 0, policy_version 329174 (0.0033) [2024-06-19 09:55:21,290][26579] Signal inference workers to stop experience collection... (24550 times) [2024-06-19 09:55:21,292][26579] Signal inference workers to resume experience collection... (24550 times) [2024-06-19 09:55:21,323][26599] InferenceWorker_p0-w0: stopping experience collection (24550 times) [2024-06-19 09:55:21,323][26599] InferenceWorker_p0-w0: resuming experience collection (24550 times) [2024-06-19 09:55:23,315][26599] Updated weights for policy 0, policy_version 329184 (0.0052) [2024-06-19 09:55:23,384][26367] Fps is (10 sec: 44220.4, 60 sec: 42868.8, 300 sec: 42875.6). Total num frames: 5393350656. Throughput: 0: 42816.2. Samples: 1660945900. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:23,385][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 09:55:27,672][26599] Updated weights for policy 0, policy_version 329194 (0.0036) [2024-06-19 09:55:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.1). Total num frames: 5393547264. Throughput: 0: 42772.1. Samples: 1661204620. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:28,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 09:55:31,042][26599] Updated weights for policy 0, policy_version 329204 (0.0043) [2024-06-19 09:55:33,384][26367] Fps is (10 sec: 40960.0, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5393760256. Throughput: 0: 42803.6. Samples: 1661330980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:33,384][26367] Avg episode reward: [(0, '0.809')] [2024-06-19 09:55:35,324][26599] Updated weights for policy 0, policy_version 329214 (0.0035) [2024-06-19 09:55:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5393973248. 
Throughput: 0: 42655.6. Samples: 1661586180. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:38,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 09:55:38,425][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000329223_5393989632.pth... [2024-06-19 09:55:38,498][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000328596_5383716864.pth [2024-06-19 09:55:38,646][26599] Updated weights for policy 0, policy_version 329224 (0.0031) [2024-06-19 09:55:42,914][26599] Updated weights for policy 0, policy_version 329234 (0.0029) [2024-06-19 09:55:43,380][26367] Fps is (10 sec: 40974.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5394169856. Throughput: 0: 42720.5. Samples: 1661842840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:43,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 09:55:46,139][26599] Updated weights for policy 0, policy_version 329244 (0.0035) [2024-06-19 09:55:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43147.2, 300 sec: 42765.0). Total num frames: 5394399232. Throughput: 0: 42620.3. Samples: 1661967480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:48,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 09:55:50,656][26599] Updated weights for policy 0, policy_version 329254 (0.0028) [2024-06-19 09:55:53,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5394612224. Throughput: 0: 42561.9. Samples: 1662226120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:53,380][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 09:55:54,243][26599] Updated weights for policy 0, policy_version 329264 (0.0052) [2024-06-19 09:55:58,204][26599] Updated weights for policy 0, policy_version 329274 (0.0036) [2024-06-19 09:55:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42599.1, 300 sec: 42765.0). Total num frames: 5394825216. Throughput: 0: 42807.9. 
Samples: 1662487000. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:55:58,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 09:56:01,891][26599] Updated weights for policy 0, policy_version 329284 (0.0043) [2024-06-19 09:56:03,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5395038208. Throughput: 0: 42876.5. Samples: 1662618520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:56:03,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 09:56:05,655][26599] Updated weights for policy 0, policy_version 329294 (0.0039) [2024-06-19 09:56:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.1). Total num frames: 5395251200. Throughput: 0: 42719.6. Samples: 1662868120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:56:08,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 09:56:09,363][26599] Updated weights for policy 0, policy_version 329304 (0.0033) [2024-06-19 09:56:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5395464192. Throughput: 0: 42685.7. Samples: 1663125480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:56:13,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 09:56:13,445][26599] Updated weights for policy 0, policy_version 329314 (0.0031) [2024-06-19 09:56:17,253][26599] Updated weights for policy 0, policy_version 329324 (0.0033) [2024-06-19 09:56:18,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5395693568. Throughput: 0: 42740.4. Samples: 1663254140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 24.0) [2024-06-19 09:56:18,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 09:56:21,298][26599] Updated weights for policy 0, policy_version 329334 (0.0029) [2024-06-19 09:56:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42328.0, 300 sec: 42765.0). Total num frames: 5395890176. Throughput: 0: 42618.3. 
Samples: 1663504000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:56:23,380][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 09:56:25,181][26599] Updated weights for policy 0, policy_version 329344 (0.0036) [2024-06-19 09:56:28,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5396103168. Throughput: 0: 42707.5. Samples: 1663764680. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:56:28,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 09:56:28,866][26599] Updated weights for policy 0, policy_version 329354 (0.0041) [2024-06-19 09:56:32,721][26599] Updated weights for policy 0, policy_version 329364 (0.0030) [2024-06-19 09:56:33,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42874.1, 300 sec: 42765.0). Total num frames: 5396332544. Throughput: 0: 42763.2. Samples: 1663891820. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:56:33,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 09:56:36,374][26599] Updated weights for policy 0, policy_version 329374 (0.0025) [2024-06-19 09:56:38,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5396545536. Throughput: 0: 42797.7. Samples: 1664152020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:56:38,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 09:56:40,469][26599] Updated weights for policy 0, policy_version 329384 (0.0030) [2024-06-19 09:56:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5396758528. Throughput: 0: 42660.9. Samples: 1664406740. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:56:43,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 09:56:43,904][26599] Updated weights for policy 0, policy_version 329394 (0.0048) [2024-06-19 09:56:47,981][26599] Updated weights for policy 0, policy_version 329404 (0.0042) [2024-06-19 09:56:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5396955136. Throughput: 0: 42687.1. Samples: 1664539440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:56:48,381][26367] Avg episode reward: [(0, '0.329')] [2024-06-19 09:56:51,489][26599] Updated weights for policy 0, policy_version 329414 (0.0030) [2024-06-19 09:56:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5397168128. Throughput: 0: 42780.4. Samples: 1664793240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:56:53,381][26367] Avg episode reward: [(0, '0.494')] [2024-06-19 09:56:55,598][26599] Updated weights for policy 0, policy_version 329424 (0.0022) [2024-06-19 09:56:58,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5397381120. Throughput: 0: 42746.7. Samples: 1665049080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:56:58,380][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 09:56:59,530][26599] Updated weights for policy 0, policy_version 329434 (0.0025) [2024-06-19 09:57:03,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 5397594112. Throughput: 0: 42782.7. Samples: 1665179520. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:57:03,384][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 09:57:03,442][26599] Updated weights for policy 0, policy_version 329444 (0.0041) [2024-06-19 09:57:07,072][26599] Updated weights for policy 0, policy_version 329454 (0.0030) [2024-06-19 09:57:08,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5397823488. Throughput: 0: 42871.5. Samples: 1665433220. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:57:08,380][26367] Avg episode reward: [(0, '0.847')] [2024-06-19 09:57:10,917][26579] Signal inference workers to stop experience collection... (24600 times) [2024-06-19 09:57:10,917][26579] Signal inference workers to resume experience collection... (24600 times) [2024-06-19 09:57:10,932][26599] InferenceWorker_p0-w0: stopping experience collection (24600 times) [2024-06-19 09:57:10,932][26599] InferenceWorker_p0-w0: resuming experience collection (24600 times) [2024-06-19 09:57:11,074][26599] Updated weights for policy 0, policy_version 329464 (0.0043) [2024-06-19 09:57:13,380][26367] Fps is (10 sec: 44253.0, 60 sec: 42871.5, 300 sec: 42765.5). Total num frames: 5398036480. Throughput: 0: 42693.0. Samples: 1665685860. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:57:13,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 09:57:15,047][26599] Updated weights for policy 0, policy_version 329474 (0.0040) [2024-06-19 09:57:18,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 5398249472. Throughput: 0: 42712.8. Samples: 1665813900. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:57:18,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 09:57:18,561][26599] Updated weights for policy 0, policy_version 329484 (0.0028) [2024-06-19 09:57:22,524][26599] Updated weights for policy 0, policy_version 329494 (0.0030) [2024-06-19 09:57:23,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5398478848. Throughput: 0: 42597.7. Samples: 1666068920. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:57:23,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 09:57:26,342][26599] Updated weights for policy 0, policy_version 329504 (0.0024) [2024-06-19 09:57:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42821.1). Total num frames: 5398675456. Throughput: 0: 42783.9. Samples: 1666332020. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:57:28,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 09:57:30,008][26599] Updated weights for policy 0, policy_version 329514 (0.0036) [2024-06-19 09:57:33,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5398872064. Throughput: 0: 42512.0. Samples: 1666452480. Policy #0 lag: (min: 0.0, avg: 11.6, max: 22.0) [2024-06-19 09:57:33,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 09:57:34,180][26599] Updated weights for policy 0, policy_version 329524 (0.0048) [2024-06-19 09:57:37,887][26599] Updated weights for policy 0, policy_version 329534 (0.0028) [2024-06-19 09:57:38,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5399117824. Throughput: 0: 42569.4. Samples: 1666708860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:57:38,380][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 09:57:38,531][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000329537_5399134208.pth... 
[2024-06-19 09:57:38,583][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000328910_5388861440.pth [2024-06-19 09:57:41,718][26599] Updated weights for policy 0, policy_version 329544 (0.0036) [2024-06-19 09:57:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5399298048. Throughput: 0: 42615.1. Samples: 1666966760. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:57:43,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 09:57:45,433][26599] Updated weights for policy 0, policy_version 329554 (0.0038) [2024-06-19 09:57:48,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5399527424. Throughput: 0: 42544.4. Samples: 1667093860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:57:48,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 09:57:49,833][26599] Updated weights for policy 0, policy_version 329564 (0.0037) [2024-06-19 09:57:53,124][26599] Updated weights for policy 0, policy_version 329574 (0.0028) [2024-06-19 09:57:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42598.9). Total num frames: 5399740416. Throughput: 0: 42643.0. Samples: 1667352160. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:57:53,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 09:57:57,505][26599] Updated weights for policy 0, policy_version 329584 (0.0038) [2024-06-19 09:57:58,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5399953408. Throughput: 0: 42705.2. Samples: 1667607600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:57:58,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 09:58:00,770][26599] Updated weights for policy 0, policy_version 329594 (0.0040) [2024-06-19 09:58:03,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42874.1, 300 sec: 42765.6). Total num frames: 5400166400. Throughput: 0: 42638.4. Samples: 1667732620. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:03,380][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 09:58:05,034][26599] Updated weights for policy 0, policy_version 329604 (0.0033) [2024-06-19 09:58:08,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 5400395776. Throughput: 0: 42801.4. Samples: 1667994980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:08,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 09:58:08,385][26599] Updated weights for policy 0, policy_version 329614 (0.0036) [2024-06-19 09:58:12,652][26599] Updated weights for policy 0, policy_version 329624 (0.0025) [2024-06-19 09:58:13,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5400608768. Throughput: 0: 42723.5. Samples: 1668254580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:13,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 09:58:16,184][26599] Updated weights for policy 0, policy_version 329634 (0.0033) [2024-06-19 09:58:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5400821760. Throughput: 0: 42797.4. Samples: 1668378360. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:18,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 09:58:20,208][26599] Updated weights for policy 0, policy_version 329644 (0.0037) [2024-06-19 09:58:23,384][26367] Fps is (10 sec: 42583.3, 60 sec: 42595.9, 300 sec: 42653.4). Total num frames: 5401034752. Throughput: 0: 42906.2. Samples: 1668639800. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:23,384][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 09:58:23,912][26599] Updated weights for policy 0, policy_version 329654 (0.0046) [2024-06-19 09:58:27,863][26599] Updated weights for policy 0, policy_version 329664 (0.0044) [2024-06-19 09:58:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5401231360. Throughput: 0: 42872.0. Samples: 1668896000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:28,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 09:58:31,541][26599] Updated weights for policy 0, policy_version 329674 (0.0035) [2024-06-19 09:58:33,381][26367] Fps is (10 sec: 42613.0, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5401460736. Throughput: 0: 42872.6. Samples: 1669023140. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:33,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 09:58:35,399][26599] Updated weights for policy 0, policy_version 329684 (0.0032) [2024-06-19 09:58:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5401657344. Throughput: 0: 42767.1. Samples: 1669276680. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:38,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 09:58:39,175][26599] Updated weights for policy 0, policy_version 329694 (0.0030) [2024-06-19 09:58:40,461][26579] Signal inference workers to stop experience collection... (24650 times) [2024-06-19 09:58:40,461][26579] Signal inference workers to resume experience collection... 
(24650 times) [2024-06-19 09:58:40,504][26599] InferenceWorker_p0-w0: stopping experience collection (24650 times) [2024-06-19 09:58:40,504][26599] InferenceWorker_p0-w0: resuming experience collection (24650 times) [2024-06-19 09:58:42,924][26599] Updated weights for policy 0, policy_version 329704 (0.0039) [2024-06-19 09:58:43,384][26367] Fps is (10 sec: 42583.7, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 5401886720. Throughput: 0: 42729.5. Samples: 1669530580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:43,384][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 09:58:46,911][26599] Updated weights for policy 0, policy_version 329714 (0.0034) [2024-06-19 09:58:48,380][26367] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5402116096. Throughput: 0: 42843.5. Samples: 1669660580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 21.0) [2024-06-19 09:58:48,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 09:58:50,554][26599] Updated weights for policy 0, policy_version 329724 (0.0038) [2024-06-19 09:58:53,380][26367] Fps is (10 sec: 40974.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5402296320. Throughput: 0: 42655.5. Samples: 1669914480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:58:53,381][26367] Avg episode reward: [(0, '0.533')] [2024-06-19 09:58:54,563][26599] Updated weights for policy 0, policy_version 329734 (0.0038) [2024-06-19 09:58:58,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5402509312. Throughput: 0: 42590.3. Samples: 1670171140. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:58:58,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 09:58:58,425][26599] Updated weights for policy 0, policy_version 329744 (0.0037) [2024-06-19 09:59:02,325][26599] Updated weights for policy 0, policy_version 329754 (0.0035) [2024-06-19 09:59:03,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5402738688. Throughput: 0: 42686.6. Samples: 1670299260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:03,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 09:59:05,953][26599] Updated weights for policy 0, policy_version 329764 (0.0028) [2024-06-19 09:59:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5402935296. Throughput: 0: 42506.1. Samples: 1670552420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:08,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 09:59:09,869][26599] Updated weights for policy 0, policy_version 329774 (0.0029) [2024-06-19 09:59:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5403148288. Throughput: 0: 42596.9. Samples: 1670812860. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:13,381][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 09:59:13,552][26599] Updated weights for policy 0, policy_version 329784 (0.0032) [2024-06-19 09:59:17,369][26599] Updated weights for policy 0, policy_version 329794 (0.0032) [2024-06-19 09:59:18,384][26367] Fps is (10 sec: 45858.3, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5403394048. Throughput: 0: 42559.0. Samples: 1670938440. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:18,384][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 09:59:21,182][26599] Updated weights for policy 0, policy_version 329804 (0.0035) [2024-06-19 09:59:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42601.0, 300 sec: 42765.0). Total num frames: 5403590656. Throughput: 0: 42597.8. Samples: 1671193580. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:23,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 09:59:24,934][26599] Updated weights for policy 0, policy_version 329814 (0.0038) [2024-06-19 09:59:28,380][26367] Fps is (10 sec: 40975.4, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5403803648. Throughput: 0: 42742.7. Samples: 1671453840. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:28,380][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 09:59:29,059][26599] Updated weights for policy 0, policy_version 329824 (0.0034) [2024-06-19 09:59:32,735][26599] Updated weights for policy 0, policy_version 329834 (0.0030) [2024-06-19 09:59:33,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.7, 300 sec: 42765.0). Total num frames: 5404033024. Throughput: 0: 42557.0. Samples: 1671575640. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:33,380][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 09:59:36,798][26599] Updated weights for policy 0, policy_version 329844 (0.0042) [2024-06-19 09:59:38,384][26367] Fps is (10 sec: 40944.6, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5404213248. Throughput: 0: 42489.5. Samples: 1671826660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:38,385][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 09:59:38,529][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000329848_5404229632.pth... 
[2024-06-19 09:59:38,600][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000329223_5393989632.pth [2024-06-19 09:59:40,715][26599] Updated weights for policy 0, policy_version 329854 (0.0031) [2024-06-19 09:59:43,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42327.9, 300 sec: 42765.5). Total num frames: 5404426240. Throughput: 0: 42548.0. Samples: 1672085800. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:43,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 09:59:44,450][26599] Updated weights for policy 0, policy_version 329864 (0.0037) [2024-06-19 09:59:48,277][26599] Updated weights for policy 0, policy_version 329874 (0.0042) [2024-06-19 09:59:48,380][26367] Fps is (10 sec: 44253.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5404655616. Throughput: 0: 42486.7. Samples: 1672211160. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:48,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 09:59:51,944][26599] Updated weights for policy 0, policy_version 329884 (0.0030) [2024-06-19 09:59:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42654.1). Total num frames: 5404852224. Throughput: 0: 42468.4. Samples: 1672463500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:53,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 09:59:56,017][26599] Updated weights for policy 0, policy_version 329894 (0.0041) [2024-06-19 09:59:56,101][26579] Signal inference workers to stop experience collection... (24700 times) [2024-06-19 09:59:56,152][26599] InferenceWorker_p0-w0: stopping experience collection (24700 times) [2024-06-19 09:59:56,159][26579] Signal inference workers to resume experience collection... (24700 times) [2024-06-19 09:59:56,161][26599] InferenceWorker_p0-w0: resuming experience collection (24700 times) [2024-06-19 09:59:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5405065216. 
Throughput: 0: 42515.5. Samples: 1672726060. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 09:59:58,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 09:59:59,402][26599] Updated weights for policy 0, policy_version 329904 (0.0027) [2024-06-19 10:00:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5405294592. Throughput: 0: 42667.9. Samples: 1672858340. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 10:00:03,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 10:00:03,428][26599] Updated weights for policy 0, policy_version 329914 (0.0044) [2024-06-19 10:00:06,862][26599] Updated weights for policy 0, policy_version 329924 (0.0048) [2024-06-19 10:00:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5405491200. Throughput: 0: 42582.6. Samples: 1673109800. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:08,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 10:00:11,069][26599] Updated weights for policy 0, policy_version 329934 (0.0036) [2024-06-19 10:00:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5405720576. Throughput: 0: 42650.2. Samples: 1673373100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:13,380][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 10:00:14,685][26599] Updated weights for policy 0, policy_version 329944 (0.0040) [2024-06-19 10:00:18,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42327.8, 300 sec: 42654.4). Total num frames: 5405933568. Throughput: 0: 42905.6. Samples: 1673506400. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:18,389][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 10:00:18,623][26599] Updated weights for policy 0, policy_version 329954 (0.0033) [2024-06-19 10:00:22,612][26599] Updated weights for policy 0, policy_version 329964 (0.0054) [2024-06-19 10:00:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5406146560. Throughput: 0: 42930.6. Samples: 1673758380. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:23,381][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 10:00:26,308][26599] Updated weights for policy 0, policy_version 329974 (0.0037) [2024-06-19 10:00:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42598.3, 300 sec: 42710.0). Total num frames: 5406359552. Throughput: 0: 42888.5. Samples: 1674015780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:28,381][26367] Avg episode reward: [(0, '0.407')] [2024-06-19 10:00:30,205][26599] Updated weights for policy 0, policy_version 329984 (0.0040) [2024-06-19 10:00:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5406556160. Throughput: 0: 42957.7. Samples: 1674144260. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:33,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 10:00:34,005][26599] Updated weights for policy 0, policy_version 329994 (0.0034) [2024-06-19 10:00:37,802][26599] Updated weights for policy 0, policy_version 330004 (0.0029) [2024-06-19 10:00:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42874.0, 300 sec: 42765.0). Total num frames: 5406785536. Throughput: 0: 43004.4. Samples: 1674398700. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:38,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 10:00:41,529][26599] Updated weights for policy 0, policy_version 330014 (0.0032) [2024-06-19 10:00:43,384][26367] Fps is (10 sec: 44221.0, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 5406998528. Throughput: 0: 42964.6. Samples: 1674659620. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:43,385][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 10:00:45,233][26599] Updated weights for policy 0, policy_version 330024 (0.0035) [2024-06-19 10:00:48,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5407211520. Throughput: 0: 42976.9. Samples: 1674792300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:48,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 10:00:49,034][26599] Updated weights for policy 0, policy_version 330034 (0.0038) [2024-06-19 10:00:52,790][26599] Updated weights for policy 0, policy_version 330044 (0.0035) [2024-06-19 10:00:53,380][26367] Fps is (10 sec: 44252.7, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5407440896. Throughput: 0: 43174.2. Samples: 1675052640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:53,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 10:00:56,823][26599] Updated weights for policy 0, policy_version 330054 (0.0023) [2024-06-19 10:00:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5407653888. Throughput: 0: 42926.2. Samples: 1675304780. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:00:58,384][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 10:01:00,418][26599] Updated weights for policy 0, policy_version 330064 (0.0032) [2024-06-19 10:01:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5407850496. Throughput: 0: 42926.4. Samples: 1675438080. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:01:03,380][26367] Avg episode reward: [(0, '0.797')] [2024-06-19 10:01:04,312][26599] Updated weights for policy 0, policy_version 330074 (0.0035) [2024-06-19 10:01:08,226][26599] Updated weights for policy 0, policy_version 330084 (0.0025) [2024-06-19 10:01:08,380][26367] Fps is (10 sec: 44236.3, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5408096256. Throughput: 0: 43131.0. Samples: 1675699280. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:01:08,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 10:01:11,587][26579] Signal inference workers to stop experience collection... (24750 times) [2024-06-19 10:01:11,588][26579] Signal inference workers to resume experience collection... (24750 times) [2024-06-19 10:01:11,599][26599] InferenceWorker_p0-w0: stopping experience collection (24750 times) [2024-06-19 10:01:11,600][26599] InferenceWorker_p0-w0: resuming experience collection (24750 times) [2024-06-19 10:01:11,896][26599] Updated weights for policy 0, policy_version 330094 (0.0034) [2024-06-19 10:01:13,380][26367] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5408309248. Throughput: 0: 43174.6. Samples: 1675958640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:01:13,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 10:01:15,963][26599] Updated weights for policy 0, policy_version 330104 (0.0038) [2024-06-19 10:01:18,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5408505856. Throughput: 0: 43135.6. Samples: 1676085360. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:01:18,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 10:01:19,351][26599] Updated weights for policy 0, policy_version 330114 (0.0027) [2024-06-19 10:01:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5408735232. 
Throughput: 0: 43138.6. Samples: 1676339940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 20.0) [2024-06-19 10:01:23,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 10:01:23,550][26599] Updated weights for policy 0, policy_version 330124 (0.0032) [2024-06-19 10:01:27,359][26599] Updated weights for policy 0, policy_version 330134 (0.0034) [2024-06-19 10:01:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5408931840. Throughput: 0: 42919.0. Samples: 1676590820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:01:28,382][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 10:01:31,207][26599] Updated weights for policy 0, policy_version 330144 (0.0030) [2024-06-19 10:01:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5409144832. Throughput: 0: 42857.3. Samples: 1676720880. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:01:33,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 10:01:34,927][26599] Updated weights for policy 0, policy_version 330154 (0.0034) [2024-06-19 10:01:38,380][26367] Fps is (10 sec: 44237.3, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5409374208. Throughput: 0: 42903.2. Samples: 1676983280. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:01:38,380][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 10:01:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000330162_5409374208.pth... [2024-06-19 10:01:38,460][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000329537_5399134208.pth [2024-06-19 10:01:39,090][26599] Updated weights for policy 0, policy_version 330164 (0.0033) [2024-06-19 10:01:42,283][26599] Updated weights for policy 0, policy_version 330174 (0.0034) [2024-06-19 10:01:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42874.1, 300 sec: 42765.0). Total num frames: 5409570816. Throughput: 0: 42920.5. 
Samples: 1677236200. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:01:43,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 10:01:46,642][26599] Updated weights for policy 0, policy_version 330184 (0.0041) [2024-06-19 10:01:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5409800192. Throughput: 0: 42907.6. Samples: 1677368920. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:01:48,380][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 10:01:50,006][26599] Updated weights for policy 0, policy_version 330194 (0.0029) [2024-06-19 10:01:53,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5409996800. Throughput: 0: 42812.0. Samples: 1677625820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:01:53,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 10:01:54,204][26599] Updated weights for policy 0, policy_version 330204 (0.0030) [2024-06-19 10:01:57,864][26599] Updated weights for policy 0, policy_version 330214 (0.0045) [2024-06-19 10:01:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42821.1). Total num frames: 5410226176. Throughput: 0: 42610.4. Samples: 1677876100. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:01:58,380][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 10:02:01,752][26599] Updated weights for policy 0, policy_version 330224 (0.0041) [2024-06-19 10:02:03,380][26367] Fps is (10 sec: 44237.2, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5410439168. Throughput: 0: 42761.3. Samples: 1678009620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:02:03,381][26367] Avg episode reward: [(0, '0.801')] [2024-06-19 10:02:05,612][26599] Updated weights for policy 0, policy_version 330234 (0.0034) [2024-06-19 10:02:08,384][26367] Fps is (10 sec: 44221.1, 60 sec: 42869.1, 300 sec: 42820.0). Total num frames: 5410668544. Throughput: 0: 42905.2. 
Samples: 1678270820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:02:08,392][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 10:02:09,304][26599] Updated weights for policy 0, policy_version 330244 (0.0040) [2024-06-19 10:02:13,003][26599] Updated weights for policy 0, policy_version 330254 (0.0034) [2024-06-19 10:02:13,380][26367] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5410897920. Throughput: 0: 43065.8. Samples: 1678528780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:02:13,381][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 10:02:17,327][26599] Updated weights for policy 0, policy_version 330264 (0.0029) [2024-06-19 10:02:18,382][26367] Fps is (10 sec: 40966.1, 60 sec: 42870.0, 300 sec: 42709.2). Total num frames: 5411078144. Throughput: 0: 43053.7. Samples: 1678658380. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:02:18,383][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 10:02:20,416][26599] Updated weights for policy 0, policy_version 330274 (0.0037) [2024-06-19 10:02:23,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5411307520. Throughput: 0: 43011.9. Samples: 1678918820. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:02:23,381][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 10:02:25,054][26599] Updated weights for policy 0, policy_version 330284 (0.0027) [2024-06-19 10:02:28,380][26367] Fps is (10 sec: 44245.1, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5411520512. Throughput: 0: 43064.3. Samples: 1679174100. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:02:28,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 10:02:28,542][26599] Updated weights for policy 0, policy_version 330294 (0.0040) [2024-06-19 10:02:32,802][26599] Updated weights for policy 0, policy_version 330304 (0.0039) [2024-06-19 10:02:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42709.4). Total num frames: 5411717120. Throughput: 0: 43011.4. Samples: 1679304440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:02:33,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 10:02:34,499][26579] Signal inference workers to stop experience collection... (24800 times) [2024-06-19 10:02:34,500][26579] Signal inference workers to resume experience collection... (24800 times) [2024-06-19 10:02:34,548][26599] InferenceWorker_p0-w0: stopping experience collection (24800 times) [2024-06-19 10:02:34,548][26599] InferenceWorker_p0-w0: resuming experience collection (24800 times) [2024-06-19 10:02:35,981][26599] Updated weights for policy 0, policy_version 330314 (0.0028) [2024-06-19 10:02:38,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5411946496. Throughput: 0: 42897.5. Samples: 1679556200. Policy #0 lag: (min: 1.0, avg: 10.7, max: 20.0) [2024-06-19 10:02:38,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 10:02:40,435][26599] Updated weights for policy 0, policy_version 330324 (0.0047) [2024-06-19 10:02:43,380][26367] Fps is (10 sec: 45875.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5412175872. Throughput: 0: 42937.3. Samples: 1679808280. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:02:43,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 10:02:43,650][26599] Updated weights for policy 0, policy_version 330334 (0.0025) [2024-06-19 10:02:48,098][26599] Updated weights for policy 0, policy_version 330344 (0.0028) [2024-06-19 10:02:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5412356096. Throughput: 0: 42866.2. Samples: 1679938600. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:02:48,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 10:02:51,307][26599] Updated weights for policy 0, policy_version 330354 (0.0037) [2024-06-19 10:02:53,384][26367] Fps is (10 sec: 40944.9, 60 sec: 43142.0, 300 sec: 42820.0). Total num frames: 5412585472. Throughput: 0: 42738.5. Samples: 1680194060. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:02:53,385][26367] Avg episode reward: [(0, '0.799')] [2024-06-19 10:02:55,816][26599] Updated weights for policy 0, policy_version 330364 (0.0022) [2024-06-19 10:02:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5412798464. Throughput: 0: 42583.2. Samples: 1680445020. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:02:58,380][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 10:02:59,030][26599] Updated weights for policy 0, policy_version 330374 (0.0032) [2024-06-19 10:03:03,380][26367] Fps is (10 sec: 39335.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5412978688. Throughput: 0: 42590.2. Samples: 1680574860. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:03,381][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 10:03:03,659][26599] Updated weights for policy 0, policy_version 330384 (0.0036) [2024-06-19 10:03:07,227][26599] Updated weights for policy 0, policy_version 330394 (0.0031) [2024-06-19 10:03:08,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42600.8, 300 sec: 42765.0). Total num frames: 5413224448. Throughput: 0: 42437.3. Samples: 1680828500. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:08,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 10:03:11,263][26599] Updated weights for policy 0, policy_version 330404 (0.0038) [2024-06-19 10:03:13,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5413437440. Throughput: 0: 42378.3. Samples: 1681081120. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:13,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 10:03:15,289][26599] Updated weights for policy 0, policy_version 330414 (0.0028) [2024-06-19 10:03:18,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42599.9, 300 sec: 42710.0). Total num frames: 5413634048. Throughput: 0: 42445.1. Samples: 1681214460. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:18,380][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 10:03:19,069][26599] Updated weights for policy 0, policy_version 330424 (0.0046) [2024-06-19 10:03:23,037][26599] Updated weights for policy 0, policy_version 330434 (0.0029) [2024-06-19 10:03:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5413847040. Throughput: 0: 42440.0. Samples: 1681466000. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:23,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 10:03:26,991][26599] Updated weights for policy 0, policy_version 330444 (0.0035) [2024-06-19 10:03:28,380][26367] Fps is (10 sec: 45874.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5414092800. Throughput: 0: 42500.3. Samples: 1681720800. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:28,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 10:03:30,511][26599] Updated weights for policy 0, policy_version 330454 (0.0036) [2024-06-19 10:03:33,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42869.0, 300 sec: 42820.0). Total num frames: 5414289408. Throughput: 0: 42673.1. Samples: 1681859040. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:33,384][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 10:03:34,527][26599] Updated weights for policy 0, policy_version 330464 (0.0044) [2024-06-19 10:03:37,999][26599] Updated weights for policy 0, policy_version 330474 (0.0037) [2024-06-19 10:03:38,384][26367] Fps is (10 sec: 39307.5, 60 sec: 42322.7, 300 sec: 42709.5). Total num frames: 5414486016. Throughput: 0: 42487.5. Samples: 1682106000. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:38,385][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 10:03:38,413][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000330474_5414486016.pth... [2024-06-19 10:03:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000329848_5404229632.pth [2024-06-19 10:03:42,084][26599] Updated weights for policy 0, policy_version 330484 (0.0040) [2024-06-19 10:03:43,380][26367] Fps is (10 sec: 44253.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5414731776. Throughput: 0: 42651.6. Samples: 1682364340. 
Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:43,380][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 10:03:45,582][26599] Updated weights for policy 0, policy_version 330494 (0.0040) [2024-06-19 10:03:48,380][26367] Fps is (10 sec: 44252.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5414928384. Throughput: 0: 42680.1. Samples: 1682495460. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:48,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 10:03:49,580][26599] Updated weights for policy 0, policy_version 330504 (0.0037) [2024-06-19 10:03:53,170][26599] Updated weights for policy 0, policy_version 330514 (0.0029) [2024-06-19 10:03:53,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42601.0, 300 sec: 42820.6). Total num frames: 5415141376. Throughput: 0: 42676.9. Samples: 1682748960. Policy #0 lag: (min: 1.0, avg: 9.7, max: 20.0) [2024-06-19 10:03:53,383][26367] Avg episode reward: [(0, '0.433')] [2024-06-19 10:03:55,740][26579] Signal inference workers to stop experience collection... (24850 times) [2024-06-19 10:03:55,741][26579] Signal inference workers to resume experience collection... (24850 times) [2024-06-19 10:03:55,761][26599] InferenceWorker_p0-w0: stopping experience collection (24850 times) [2024-06-19 10:03:55,791][26599] InferenceWorker_p0-w0: resuming experience collection (24850 times) [2024-06-19 10:03:57,419][26599] Updated weights for policy 0, policy_version 330524 (0.0044) [2024-06-19 10:03:58,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5415370752. Throughput: 0: 42845.2. Samples: 1683009160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:03:58,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 10:04:00,912][26599] Updated weights for policy 0, policy_version 330534 (0.0027) [2024-06-19 10:04:03,380][26367] Fps is (10 sec: 45875.6, 60 sec: 43690.8, 300 sec: 42931.6). Total num frames: 5415600128. Throughput: 0: 42743.9. 
Samples: 1683137940. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:03,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 10:04:04,965][26599] Updated weights for policy 0, policy_version 330544 (0.0041) [2024-06-19 10:04:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5415780352. Throughput: 0: 42769.3. Samples: 1683390620. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:08,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 10:04:08,413][26599] Updated weights for policy 0, policy_version 330554 (0.0040) [2024-06-19 10:04:12,506][26599] Updated weights for policy 0, policy_version 330564 (0.0039) [2024-06-19 10:04:13,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.5). Total num frames: 5416009728. Throughput: 0: 42827.5. Samples: 1683648040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:13,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 10:04:15,998][26599] Updated weights for policy 0, policy_version 330574 (0.0030) [2024-06-19 10:04:18,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5416206336. Throughput: 0: 42620.0. Samples: 1683776780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:18,380][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 10:04:20,165][26599] Updated weights for policy 0, policy_version 330584 (0.0037) [2024-06-19 10:04:23,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5416419328. Throughput: 0: 42746.1. Samples: 1684029420. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:23,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 10:04:23,870][26599] Updated weights for policy 0, policy_version 330594 (0.0041) [2024-06-19 10:04:27,664][26599] Updated weights for policy 0, policy_version 330604 (0.0030) [2024-06-19 10:04:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5416632320. Throughput: 0: 42682.5. Samples: 1684285060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:28,381][26367] Avg episode reward: [(0, '0.843')] [2024-06-19 10:04:31,520][26599] Updated weights for policy 0, policy_version 330614 (0.0049) [2024-06-19 10:04:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42600.9, 300 sec: 42821.1). Total num frames: 5416845312. Throughput: 0: 42703.5. Samples: 1684417120. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:33,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 10:04:35,375][26599] Updated weights for policy 0, policy_version 330624 (0.0036) [2024-06-19 10:04:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43147.2, 300 sec: 42876.1). Total num frames: 5417074688. Throughput: 0: 42804.5. Samples: 1684675160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:38,384][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 10:04:39,076][26599] Updated weights for policy 0, policy_version 330634 (0.0049) [2024-06-19 10:04:43,115][26599] Updated weights for policy 0, policy_version 330644 (0.0031) [2024-06-19 10:04:43,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5417287680. Throughput: 0: 42720.1. Samples: 1684931560. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:43,381][26367] Avg episode reward: [(0, '0.445')] [2024-06-19 10:04:46,954][26599] Updated weights for policy 0, policy_version 330654 (0.0035) [2024-06-19 10:04:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5417484288. Throughput: 0: 42604.4. Samples: 1685055140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:48,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 10:04:50,523][26599] Updated weights for policy 0, policy_version 330664 (0.0038) [2024-06-19 10:04:53,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5417713664. Throughput: 0: 42646.4. Samples: 1685309700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:53,380][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 10:04:54,780][26599] Updated weights for policy 0, policy_version 330674 (0.0032) [2024-06-19 10:04:57,970][26599] Updated weights for policy 0, policy_version 330684 (0.0025) [2024-06-19 10:04:58,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42595.9, 300 sec: 42820.0). Total num frames: 5417926656. Throughput: 0: 42559.8. Samples: 1685563380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:04:58,384][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 10:05:02,282][26599] Updated weights for policy 0, policy_version 330694 (0.0040) [2024-06-19 10:05:03,380][26367] Fps is (10 sec: 39320.6, 60 sec: 41779.1, 300 sec: 42765.0). Total num frames: 5418106880. Throughput: 0: 42636.3. Samples: 1685695420. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:05:03,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 10:05:05,738][26599] Updated weights for policy 0, policy_version 330704 (0.0030) [2024-06-19 10:05:08,384][26367] Fps is (10 sec: 42598.4, 60 sec: 42868.9, 300 sec: 42820.0). Total num frames: 5418352640. Throughput: 0: 42826.3. Samples: 1685956760. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:05:08,384][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 10:05:09,879][26599] Updated weights for policy 0, policy_version 330714 (0.0037) [2024-06-19 10:05:13,294][26599] Updated weights for policy 0, policy_version 330724 (0.0030) [2024-06-19 10:05:13,380][26367] Fps is (10 sec: 47514.4, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5418582016. Throughput: 0: 42789.4. Samples: 1686210580. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:13,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 10:05:17,413][26599] Updated weights for policy 0, policy_version 330734 (0.0040) [2024-06-19 10:05:18,380][26367] Fps is (10 sec: 40974.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5418762240. Throughput: 0: 42819.6. Samples: 1686344000. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:18,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 10:05:20,903][26599] Updated weights for policy 0, policy_version 330744 (0.0029) [2024-06-19 10:05:23,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5418975232. Throughput: 0: 42744.5. Samples: 1686598660. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:23,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 10:05:25,767][26599] Updated weights for policy 0, policy_version 330754 (0.0035) [2024-06-19 10:05:28,384][26367] Fps is (10 sec: 45859.1, 60 sec: 43141.9, 300 sec: 42931.1). Total num frames: 5419220992. Throughput: 0: 42822.7. Samples: 1686858740. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:28,384][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 10:05:28,518][26599] Updated weights for policy 0, policy_version 330764 (0.0034) [2024-06-19 10:05:33,245][26599] Updated weights for policy 0, policy_version 330774 (0.0038) [2024-06-19 10:05:33,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 5419401216. Throughput: 0: 42990.3. Samples: 1686989700. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:33,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 10:05:36,422][26599] Updated weights for policy 0, policy_version 330784 (0.0029) [2024-06-19 10:05:38,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42598.4, 300 sec: 42821.1). Total num frames: 5419630592. Throughput: 0: 42941.6. Samples: 1687242080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:38,384][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 10:05:38,521][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000330789_5419646976.pth... [2024-06-19 10:05:38,573][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000330162_5409374208.pth [2024-06-19 10:05:40,697][26599] Updated weights for policy 0, policy_version 330794 (0.0033) [2024-06-19 10:05:41,749][26579] Signal inference workers to stop experience collection... (24900 times) [2024-06-19 10:05:41,777][26599] InferenceWorker_p0-w0: stopping experience collection (24900 times) [2024-06-19 10:05:41,807][26579] Signal inference workers to resume experience collection... (24900 times) [2024-06-19 10:05:41,812][26599] InferenceWorker_p0-w0: resuming experience collection (24900 times) [2024-06-19 10:05:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5419843584. Throughput: 0: 43124.9. Samples: 1687503840. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:43,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 10:05:44,002][26599] Updated weights for policy 0, policy_version 330804 (0.0041) [2024-06-19 10:05:48,265][26599] Updated weights for policy 0, policy_version 330814 (0.0038) [2024-06-19 10:05:48,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5420056576. Throughput: 0: 43005.1. Samples: 1687630640. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:48,380][26367] Avg episode reward: [(0, '0.834')] [2024-06-19 10:05:51,498][26599] Updated weights for policy 0, policy_version 330824 (0.0032) [2024-06-19 10:05:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5420269568. Throughput: 0: 42944.0. Samples: 1687889080. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:53,380][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 10:05:55,773][26599] Updated weights for policy 0, policy_version 330834 (0.0039) [2024-06-19 10:05:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42874.1, 300 sec: 42876.1). Total num frames: 5420498944. Throughput: 0: 42972.0. Samples: 1688144320. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:05:58,380][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 10:05:59,267][26599] Updated weights for policy 0, policy_version 330844 (0.0034) [2024-06-19 10:06:03,380][26367] Fps is (10 sec: 42597.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5420695552. Throughput: 0: 42933.4. Samples: 1688276000. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:06:03,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 10:06:03,583][26599] Updated weights for policy 0, policy_version 330854 (0.0040) [2024-06-19 10:06:06,893][26599] Updated weights for policy 0, policy_version 330864 (0.0034) [2024-06-19 10:06:08,384][26367] Fps is (10 sec: 42582.5, 60 sec: 42871.4, 300 sec: 42764.5). Total num frames: 5420924928. Throughput: 0: 42918.7. Samples: 1688530160. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:06:08,385][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 10:06:11,102][26599] Updated weights for policy 0, policy_version 330874 (0.0034) [2024-06-19 10:06:13,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5421137920. Throughput: 0: 42942.1. Samples: 1688790980. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:06:13,381][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 10:06:14,561][26599] Updated weights for policy 0, policy_version 330884 (0.0038) [2024-06-19 10:06:18,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5421334528. Throughput: 0: 42883.8. Samples: 1688919480. Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:06:18,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 10:06:18,667][26599] Updated weights for policy 0, policy_version 330894 (0.0027) [2024-06-19 10:06:22,353][26599] Updated weights for policy 0, policy_version 330904 (0.0035) [2024-06-19 10:06:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5421563904. Throughput: 0: 43031.2. Samples: 1689178480. 
Policy #0 lag: (min: 0.0, avg: 8.7, max: 20.0) [2024-06-19 10:06:23,380][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 10:06:26,185][26599] Updated weights for policy 0, policy_version 330914 (0.0036) [2024-06-19 10:06:28,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42874.1, 300 sec: 42876.1). Total num frames: 5421793280. Throughput: 0: 42887.1. Samples: 1689433760. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:06:28,381][26367] Avg episode reward: [(0, '0.322')] [2024-06-19 10:06:29,939][26599] Updated weights for policy 0, policy_version 330924 (0.0034) [2024-06-19 10:06:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5421989888. Throughput: 0: 42923.5. Samples: 1689562200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:06:33,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 10:06:34,000][26599] Updated weights for policy 0, policy_version 330934 (0.0035) [2024-06-19 10:06:37,742][26599] Updated weights for policy 0, policy_version 330944 (0.0042) [2024-06-19 10:06:38,380][26367] Fps is (10 sec: 42597.5, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5422219264. Throughput: 0: 43090.9. Samples: 1689828180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:06:38,381][26367] Avg episode reward: [(0, '0.414')] [2024-06-19 10:06:41,608][26599] Updated weights for policy 0, policy_version 330954 (0.0032) [2024-06-19 10:06:43,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5422432256. Throughput: 0: 42972.4. Samples: 1690078080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:06:43,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 10:06:45,450][26599] Updated weights for policy 0, policy_version 330964 (0.0032) [2024-06-19 10:06:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5422645248. Throughput: 0: 42744.0. Samples: 1690199480. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:06:48,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 10:06:48,911][26579] Signal inference workers to stop experience collection... (24950 times) [2024-06-19 10:06:48,942][26599] InferenceWorker_p0-w0: stopping experience collection (24950 times) [2024-06-19 10:06:49,030][26579] Signal inference workers to resume experience collection... (24950 times) [2024-06-19 10:06:49,030][26599] InferenceWorker_p0-w0: resuming experience collection (24950 times) [2024-06-19 10:06:49,160][26599] Updated weights for policy 0, policy_version 330974 (0.0029) [2024-06-19 10:06:53,025][26599] Updated weights for policy 0, policy_version 330984 (0.0043) [2024-06-19 10:06:53,383][26367] Fps is (10 sec: 42586.2, 60 sec: 43142.4, 300 sec: 42820.1). Total num frames: 5422858240. Throughput: 0: 43025.7. Samples: 1690466280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:06:53,384][26367] Avg episode reward: [(0, '0.826')] [2024-06-19 10:06:57,391][26599] Updated weights for policy 0, policy_version 330994 (0.0030) [2024-06-19 10:06:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5423087616. Throughput: 0: 42844.4. Samples: 1690718980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:06:58,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 10:07:00,604][26599] Updated weights for policy 0, policy_version 331004 (0.0033) [2024-06-19 10:07:03,380][26367] Fps is (10 sec: 40971.7, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 5423267840. Throughput: 0: 42836.1. Samples: 1690847100. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:07:03,381][26367] Avg episode reward: [(0, '0.812')] [2024-06-19 10:07:04,893][26599] Updated weights for policy 0, policy_version 331014 (0.0037) [2024-06-19 10:07:08,085][26599] Updated weights for policy 0, policy_version 331024 (0.0036) [2024-06-19 10:07:08,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42874.2, 300 sec: 42709.5). Total num frames: 5423497216. Throughput: 0: 42790.2. Samples: 1691104040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:07:08,380][26367] Avg episode reward: [(0, '0.823')] [2024-06-19 10:07:12,509][26599] Updated weights for policy 0, policy_version 331034 (0.0039) [2024-06-19 10:07:13,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.8). Total num frames: 5423710208. Throughput: 0: 42799.0. Samples: 1691359720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:07:13,381][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 10:07:15,821][26599] Updated weights for policy 0, policy_version 331044 (0.0047) [2024-06-19 10:07:18,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5423906816. Throughput: 0: 42667.9. Samples: 1691482260. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:07:18,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 10:07:20,077][26599] Updated weights for policy 0, policy_version 331054 (0.0037) [2024-06-19 10:07:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5424136192. Throughput: 0: 42477.8. Samples: 1691739680. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:07:23,381][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 10:07:23,646][26599] Updated weights for policy 0, policy_version 331064 (0.0039) [2024-06-19 10:07:28,162][26599] Updated weights for policy 0, policy_version 331074 (0.0043) [2024-06-19 10:07:28,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5424332800. Throughput: 0: 42653.4. Samples: 1691997480. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:07:28,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 10:07:31,346][26599] Updated weights for policy 0, policy_version 331084 (0.0041) [2024-06-19 10:07:33,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5424529408. Throughput: 0: 42698.8. Samples: 1692120920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:07:33,380][26367] Avg episode reward: [(0, '0.754')] [2024-06-19 10:07:35,585][26599] Updated weights for policy 0, policy_version 331094 (0.0044) [2024-06-19 10:07:38,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5424775168. Throughput: 0: 42480.8. Samples: 1692377800. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:07:38,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 10:07:38,403][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000331102_5424775168.pth... [2024-06-19 10:07:38,461][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000330474_5414486016.pth [2024-06-19 10:07:38,991][26599] Updated weights for policy 0, policy_version 331104 (0.0034) [2024-06-19 10:07:43,170][26599] Updated weights for policy 0, policy_version 331114 (0.0026) [2024-06-19 10:07:43,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5424971776. Throughput: 0: 42538.7. Samples: 1692633220. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:07:43,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 10:07:46,674][26599] Updated weights for policy 0, policy_version 331124 (0.0037) [2024-06-19 10:07:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42710.0). Total num frames: 5425184768. Throughput: 0: 42689.2. Samples: 1692768120. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:07:48,381][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 10:07:50,632][26599] Updated weights for policy 0, policy_version 331134 (0.0034) [2024-06-19 10:07:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42327.3, 300 sec: 42709.5). Total num frames: 5425397760. Throughput: 0: 42541.6. Samples: 1693018420. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:07:53,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 10:07:54,520][26599] Updated weights for policy 0, policy_version 331144 (0.0032) [2024-06-19 10:07:58,258][26599] Updated weights for policy 0, policy_version 331154 (0.0026) [2024-06-19 10:07:58,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5425627136. Throughput: 0: 42636.5. Samples: 1693278360. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:07:58,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 10:08:02,044][26599] Updated weights for policy 0, policy_version 331164 (0.0035) [2024-06-19 10:08:03,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5425840128. Throughput: 0: 42778.6. Samples: 1693407300. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:03,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 10:08:04,166][26579] Signal inference workers to stop experience collection... (25000 times) [2024-06-19 10:08:04,167][26579] Signal inference workers to resume experience collection... 
(25000 times) [2024-06-19 10:08:04,207][26599] InferenceWorker_p0-w0: stopping experience collection (25000 times) [2024-06-19 10:08:04,208][26599] InferenceWorker_p0-w0: resuming experience collection (25000 times) [2024-06-19 10:08:05,908][26599] Updated weights for policy 0, policy_version 331174 (0.0047) [2024-06-19 10:08:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5426053120. Throughput: 0: 42614.3. Samples: 1693657320. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:08,380][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 10:08:09,744][26599] Updated weights for policy 0, policy_version 331184 (0.0032) [2024-06-19 10:08:13,384][26367] Fps is (10 sec: 40945.3, 60 sec: 42322.8, 300 sec: 42764.5). Total num frames: 5426249728. Throughput: 0: 42707.6. Samples: 1693919480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:13,385][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 10:08:13,989][26599] Updated weights for policy 0, policy_version 331194 (0.0034) [2024-06-19 10:08:17,475][26599] Updated weights for policy 0, policy_version 331204 (0.0033) [2024-06-19 10:08:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5426462720. Throughput: 0: 42647.1. Samples: 1694040040. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:18,380][26367] Avg episode reward: [(0, '0.701')] [2024-06-19 10:08:21,520][26599] Updated weights for policy 0, policy_version 331214 (0.0036) [2024-06-19 10:08:23,380][26367] Fps is (10 sec: 44252.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5426692096. Throughput: 0: 42589.7. Samples: 1694294340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:23,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 10:08:24,877][26599] Updated weights for policy 0, policy_version 331224 (0.0031) [2024-06-19 10:08:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42710.0). Total num frames: 5426888704. Throughput: 0: 42814.8. Samples: 1694559880. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:28,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 10:08:29,185][26599] Updated weights for policy 0, policy_version 331234 (0.0036) [2024-06-19 10:08:32,511][26599] Updated weights for policy 0, policy_version 331244 (0.0043) [2024-06-19 10:08:33,383][26367] Fps is (10 sec: 42585.5, 60 sec: 43142.2, 300 sec: 42820.6). Total num frames: 5427118080. Throughput: 0: 42560.2. Samples: 1694683460. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:33,384][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 10:08:36,799][26599] Updated weights for policy 0, policy_version 331254 (0.0035) [2024-06-19 10:08:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 5427314688. Throughput: 0: 42690.4. Samples: 1694939480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:38,381][26367] Avg episode reward: [(0, '0.859')] [2024-06-19 10:08:40,071][26599] Updated weights for policy 0, policy_version 331264 (0.0029) [2024-06-19 10:08:43,380][26367] Fps is (10 sec: 40973.2, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5427527680. Throughput: 0: 42644.5. Samples: 1695197360. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:43,381][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 10:08:44,490][26599] Updated weights for policy 0, policy_version 331274 (0.0038) [2024-06-19 10:08:47,885][26599] Updated weights for policy 0, policy_version 331284 (0.0045) [2024-06-19 10:08:48,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5427757056. Throughput: 0: 42547.2. Samples: 1695321920. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:48,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 10:08:52,433][26599] Updated weights for policy 0, policy_version 331294 (0.0025) [2024-06-19 10:08:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5427970048. Throughput: 0: 42714.6. Samples: 1695579480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:53,381][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 10:08:55,347][26599] Updated weights for policy 0, policy_version 331304 (0.0034) [2024-06-19 10:08:58,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 5428150272. Throughput: 0: 42706.9. Samples: 1695841140. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:08:58,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 10:08:59,837][26599] Updated weights for policy 0, policy_version 331314 (0.0031) [2024-06-19 10:09:03,277][26599] Updated weights for policy 0, policy_version 331324 (0.0036) [2024-06-19 10:09:03,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5428412416. Throughput: 0: 42638.9. Samples: 1695958800. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:03,381][26367] Avg episode reward: [(0, '0.786')] [2024-06-19 10:09:07,171][26579] Signal inference workers to stop experience collection... 
(25050 times) [2024-06-19 10:09:07,176][26579] Signal inference workers to resume experience collection... (25050 times) [2024-06-19 10:09:07,212][26599] InferenceWorker_p0-w0: stopping experience collection (25050 times) [2024-06-19 10:09:07,212][26599] InferenceWorker_p0-w0: resuming experience collection (25050 times) [2024-06-19 10:09:07,314][26599] Updated weights for policy 0, policy_version 331334 (0.0035) [2024-06-19 10:09:08,380][26367] Fps is (10 sec: 47514.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5428625408. Throughput: 0: 42834.4. Samples: 1696221880. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:08,381][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 10:09:10,799][26599] Updated weights for policy 0, policy_version 331344 (0.0035) [2024-06-19 10:09:13,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42600.9, 300 sec: 42709.5). Total num frames: 5428805632. Throughput: 0: 42684.3. Samples: 1696480680. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:13,381][26367] Avg episode reward: [(0, '0.825')] [2024-06-19 10:09:14,826][26599] Updated weights for policy 0, policy_version 331354 (0.0041) [2024-06-19 10:09:18,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5429051392. Throughput: 0: 42623.9. Samples: 1696601400. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:18,380][26367] Avg episode reward: [(0, '0.790')] [2024-06-19 10:09:18,498][26599] Updated weights for policy 0, policy_version 331364 (0.0026) [2024-06-19 10:09:22,350][26599] Updated weights for policy 0, policy_version 331374 (0.0053) [2024-06-19 10:09:23,380][26367] Fps is (10 sec: 47513.7, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5429280768. Throughput: 0: 42875.0. Samples: 1696868860. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:23,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 10:09:26,259][26599] Updated weights for policy 0, policy_version 331384 (0.0032) [2024-06-19 10:09:28,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5429444608. Throughput: 0: 42928.7. Samples: 1697129160. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:28,381][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 10:09:29,948][26599] Updated weights for policy 0, policy_version 331394 (0.0034) [2024-06-19 10:09:33,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42873.7, 300 sec: 42765.0). Total num frames: 5429690368. Throughput: 0: 42970.7. Samples: 1697255600. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:33,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 10:09:33,646][26599] Updated weights for policy 0, policy_version 331404 (0.0036) [2024-06-19 10:09:37,376][26599] Updated weights for policy 0, policy_version 331414 (0.0041) [2024-06-19 10:09:38,380][26367] Fps is (10 sec: 47514.0, 60 sec: 43417.5, 300 sec: 42820.6). Total num frames: 5429919744. Throughput: 0: 42999.2. Samples: 1697514440. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:38,381][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 10:09:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000331416_5429919744.pth... [2024-06-19 10:09:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000330789_5419646976.pth [2024-06-19 10:09:41,207][26599] Updated weights for policy 0, policy_version 331424 (0.0041) [2024-06-19 10:09:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5430099968. Throughput: 0: 43023.3. Samples: 1697777180. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:43,380][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 10:09:45,165][26599] Updated weights for policy 0, policy_version 331434 (0.0035) [2024-06-19 10:09:48,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5430329344. Throughput: 0: 43190.9. Samples: 1697902380. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:48,380][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 10:09:48,995][26599] Updated weights for policy 0, policy_version 331444 (0.0029) [2024-06-19 10:09:52,617][26599] Updated weights for policy 0, policy_version 331454 (0.0034) [2024-06-19 10:09:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.5). Total num frames: 5430542336. Throughput: 0: 43072.0. Samples: 1698160120. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:53,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 10:09:56,644][26599] Updated weights for policy 0, policy_version 331464 (0.0047) [2024-06-19 10:09:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43417.8, 300 sec: 42876.1). Total num frames: 5430755328. Throughput: 0: 42928.6. Samples: 1698412460. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:09:58,380][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 10:10:00,569][26599] Updated weights for policy 0, policy_version 331474 (0.0043) [2024-06-19 10:10:03,384][26367] Fps is (10 sec: 42583.1, 60 sec: 42595.9, 300 sec: 42765.0). Total num frames: 5430968320. Throughput: 0: 43057.4. Samples: 1698539140. 
Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:10:03,384][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 10:10:04,613][26599] Updated weights for policy 0, policy_version 331484 (0.0036) [2024-06-19 10:10:08,376][26599] Updated weights for policy 0, policy_version 331494 (0.0030) [2024-06-19 10:10:08,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5431197696. Throughput: 0: 42885.9. Samples: 1698798720. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:10:08,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 10:10:12,090][26599] Updated weights for policy 0, policy_version 331504 (0.0046) [2024-06-19 10:10:13,380][26367] Fps is (10 sec: 42612.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5431394304. Throughput: 0: 42764.3. Samples: 1699053560. Policy #0 lag: (min: 0.0, avg: 8.2, max: 21.0) [2024-06-19 10:10:13,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 10:10:16,171][26599] Updated weights for policy 0, policy_version 331514 (0.0035) [2024-06-19 10:10:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5431607296. Throughput: 0: 42887.5. Samples: 1699185540. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:10:18,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 10:10:19,715][26599] Updated weights for policy 0, policy_version 331524 (0.0045) [2024-06-19 10:10:23,380][26367] Fps is (10 sec: 44238.2, 60 sec: 42598.5, 300 sec: 42765.6). Total num frames: 5431836672. Throughput: 0: 42828.1. Samples: 1699441700. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:10:23,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 10:10:23,669][26599] Updated weights for policy 0, policy_version 331534 (0.0032) [2024-06-19 10:10:27,522][26599] Updated weights for policy 0, policy_version 331544 (0.0042) [2024-06-19 10:10:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 43417.7, 300 sec: 42876.1). Total num frames: 5432049664. Throughput: 0: 42553.3. Samples: 1699692080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:10:28,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 10:10:31,189][26599] Updated weights for policy 0, policy_version 331554 (0.0044) [2024-06-19 10:10:33,108][26579] Signal inference workers to stop experience collection... (25100 times) [2024-06-19 10:10:33,108][26579] Signal inference workers to resume experience collection... (25100 times) [2024-06-19 10:10:33,144][26599] InferenceWorker_p0-w0: stopping experience collection (25100 times) [2024-06-19 10:10:33,144][26599] InferenceWorker_p0-w0: resuming experience collection (25100 times) [2024-06-19 10:10:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5432262656. Throughput: 0: 42750.1. Samples: 1699826140. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:10:33,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 10:10:34,967][26599] Updated weights for policy 0, policy_version 331564 (0.0038) [2024-06-19 10:10:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5432459264. Throughput: 0: 42757.3. Samples: 1700084200. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:10:38,381][26367] Avg episode reward: [(0, '0.880')] [2024-06-19 10:10:38,902][26599] Updated weights for policy 0, policy_version 331574 (0.0024) [2024-06-19 10:10:42,909][26599] Updated weights for policy 0, policy_version 331584 (0.0052) [2024-06-19 10:10:43,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5432672256. Throughput: 0: 42579.8. Samples: 1700328560. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:10:43,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 10:10:46,725][26599] Updated weights for policy 0, policy_version 331594 (0.0039) [2024-06-19 10:10:48,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5432901632. Throughput: 0: 42694.9. Samples: 1700460260. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:10:48,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 10:10:50,666][26599] Updated weights for policy 0, policy_version 331604 (0.0033) [2024-06-19 10:10:53,381][26367] Fps is (10 sec: 44235.9, 60 sec: 42871.2, 300 sec: 42765.0). Total num frames: 5433114624. Throughput: 0: 42563.2. Samples: 1700714080. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:10:53,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 10:10:54,458][26599] Updated weights for policy 0, policy_version 331614 (0.0041) [2024-06-19 10:10:58,348][26599] Updated weights for policy 0, policy_version 331624 (0.0049) [2024-06-19 10:10:58,384][26367] Fps is (10 sec: 42583.3, 60 sec: 42868.8, 300 sec: 42820.0). Total num frames: 5433327616. Throughput: 0: 42746.1. Samples: 1700977280. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:10:58,384][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 10:11:02,263][26599] Updated weights for policy 0, policy_version 331634 (0.0041) [2024-06-19 10:11:03,380][26367] Fps is (10 sec: 40961.7, 60 sec: 42601.0, 300 sec: 42710.0). Total num frames: 5433524224. Throughput: 0: 42595.2. Samples: 1701102320. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:11:03,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 10:11:05,934][26599] Updated weights for policy 0, policy_version 331644 (0.0044) [2024-06-19 10:11:08,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5433753600. Throughput: 0: 42582.6. Samples: 1701357920. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:11:08,381][26367] Avg episode reward: [(0, '0.793')] [2024-06-19 10:11:09,983][26599] Updated weights for policy 0, policy_version 331654 (0.0047) [2024-06-19 10:11:13,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5433950208. Throughput: 0: 42611.1. Samples: 1701609580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:11:13,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 10:11:13,602][26599] Updated weights for policy 0, policy_version 331664 (0.0034) [2024-06-19 10:11:17,646][26599] Updated weights for policy 0, policy_version 331674 (0.0039) [2024-06-19 10:11:18,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5434146816. Throughput: 0: 42525.3. Samples: 1701739780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:11:18,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 10:11:21,169][26599] Updated weights for policy 0, policy_version 331684 (0.0053) [2024-06-19 10:11:23,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5434392576. Throughput: 0: 42403.1. Samples: 1701992340. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:11:23,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 10:11:25,412][26599] Updated weights for policy 0, policy_version 331694 (0.0040) [2024-06-19 10:11:28,383][26367] Fps is (10 sec: 44223.9, 60 sec: 42323.3, 300 sec: 42709.0). Total num frames: 5434589184. Throughput: 0: 42775.5. Samples: 1702253580. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 10:11:28,384][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 10:11:29,158][26599] Updated weights for policy 0, policy_version 331704 (0.0023) [2024-06-19 10:11:32,981][26599] Updated weights for policy 0, policy_version 331714 (0.0032) [2024-06-19 10:11:33,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5434802176. Throughput: 0: 42733.4. Samples: 1702383260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:11:33,381][26367] Avg episode reward: [(0, '0.819')] [2024-06-19 10:11:36,767][26599] Updated weights for policy 0, policy_version 331724 (0.0023) [2024-06-19 10:11:38,384][26367] Fps is (10 sec: 44233.5, 60 sec: 42868.9, 300 sec: 42708.9). Total num frames: 5435031552. Throughput: 0: 42759.9. Samples: 1702638420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:11:38,393][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 10:11:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000331728_5435031552.pth... [2024-06-19 10:11:38,462][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000331102_5424775168.pth [2024-06-19 10:11:40,543][26599] Updated weights for policy 0, policy_version 331734 (0.0047) [2024-06-19 10:11:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5435228160. Throughput: 0: 42775.0. Samples: 1702902000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:11:43,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 10:11:44,478][26599] Updated weights for policy 0, policy_version 331744 (0.0039) [2024-06-19 10:11:48,380][26367] Fps is (10 sec: 40975.0, 60 sec: 42325.3, 300 sec: 42654.3). Total num frames: 5435441152. Throughput: 0: 42660.3. Samples: 1703022040. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:11:48,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 10:11:48,597][26599] Updated weights for policy 0, policy_version 331754 (0.0067) [2024-06-19 10:11:52,061][26599] Updated weights for policy 0, policy_version 331764 (0.0036) [2024-06-19 10:11:53,384][26367] Fps is (10 sec: 45858.3, 60 sec: 42869.1, 300 sec: 42709.0). Total num frames: 5435686912. Throughput: 0: 42718.7. Samples: 1703280420. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:11:53,384][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 10:11:56,180][26599] Updated weights for policy 0, policy_version 331774 (0.0038) [2024-06-19 10:11:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42327.8, 300 sec: 42709.5). Total num frames: 5435867136. Throughput: 0: 42817.3. Samples: 1703536360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:11:58,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 10:11:59,708][26599] Updated weights for policy 0, policy_version 331784 (0.0038) [2024-06-19 10:12:03,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5436096512. Throughput: 0: 42556.0. Samples: 1703654800. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:12:03,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 10:12:03,747][26599] Updated weights for policy 0, policy_version 331794 (0.0033) [2024-06-19 10:12:07,387][26599] Updated weights for policy 0, policy_version 331804 (0.0035) [2024-06-19 10:12:08,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5436325888. Throughput: 0: 42786.6. Samples: 1703917740. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:12:08,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 10:12:11,199][26599] Updated weights for policy 0, policy_version 331814 (0.0046) [2024-06-19 10:12:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5436506112. Throughput: 0: 42684.5. Samples: 1704174260. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:12:13,384][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 10:12:14,476][26579] Signal inference workers to stop experience collection... (25150 times) [2024-06-19 10:12:14,477][26579] Signal inference workers to resume experience collection... (25150 times) [2024-06-19 10:12:14,501][26599] InferenceWorker_p0-w0: stopping experience collection (25150 times) [2024-06-19 10:12:14,501][26599] InferenceWorker_p0-w0: resuming experience collection (25150 times) [2024-06-19 10:12:15,268][26599] Updated weights for policy 0, policy_version 331824 (0.0028) [2024-06-19 10:12:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5436735488. Throughput: 0: 42527.6. Samples: 1704297000. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:12:18,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 10:12:18,720][26599] Updated weights for policy 0, policy_version 331834 (0.0039) [2024-06-19 10:12:22,661][26599] Updated weights for policy 0, policy_version 331844 (0.0037) [2024-06-19 10:12:23,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5436964864. Throughput: 0: 42775.9. Samples: 1704563180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:12:23,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 10:12:26,639][26599] Updated weights for policy 0, policy_version 331854 (0.0039) [2024-06-19 10:12:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42600.6, 300 sec: 42765.0). Total num frames: 5437145088. Throughput: 0: 42519.1. Samples: 1704815360. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:12:28,380][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 10:12:30,242][26599] Updated weights for policy 0, policy_version 331864 (0.0037) [2024-06-19 10:12:33,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5437374464. Throughput: 0: 42579.9. Samples: 1704938140. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:12:33,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 10:12:34,334][26599] Updated weights for policy 0, policy_version 331874 (0.0045) [2024-06-19 10:12:38,059][26599] Updated weights for policy 0, policy_version 331884 (0.0031) [2024-06-19 10:12:38,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42874.1, 300 sec: 42820.6). Total num frames: 5437603840. Throughput: 0: 42620.3. Samples: 1705198180. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:12:38,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 10:12:41,839][26599] Updated weights for policy 0, policy_version 331894 (0.0029) [2024-06-19 10:12:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5437784064. Throughput: 0: 42612.4. Samples: 1705453920. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 10:12:43,381][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 10:12:45,855][26599] Updated weights for policy 0, policy_version 331904 (0.0026) [2024-06-19 10:12:48,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5438013440. Throughput: 0: 42732.1. Samples: 1705577740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:12:48,380][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 10:12:49,337][26599] Updated weights for policy 0, policy_version 331914 (0.0031) [2024-06-19 10:12:53,380][26367] Fps is (10 sec: 44237.9, 60 sec: 42328.0, 300 sec: 42709.5). Total num frames: 5438226432. Throughput: 0: 42766.4. Samples: 1705842220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:12:53,380][26367] Avg episode reward: [(0, '0.497')] [2024-06-19 10:12:53,459][26599] Updated weights for policy 0, policy_version 331924 (0.0025) [2024-06-19 10:12:57,652][26599] Updated weights for policy 0, policy_version 331934 (0.0042) [2024-06-19 10:12:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5438439424. Throughput: 0: 42677.8. Samples: 1706094760. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:12:58,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 10:13:01,130][26599] Updated weights for policy 0, policy_version 331944 (0.0034) [2024-06-19 10:13:03,381][26367] Fps is (10 sec: 44234.3, 60 sec: 42871.2, 300 sec: 42764.9). Total num frames: 5438668800. Throughput: 0: 42762.2. Samples: 1706221320. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:03,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 10:13:05,230][26599] Updated weights for policy 0, policy_version 331954 (0.0026) [2024-06-19 10:13:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.5). Total num frames: 5438865408. Throughput: 0: 42654.7. Samples: 1706482640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:08,381][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 10:13:08,734][26599] Updated weights for policy 0, policy_version 331964 (0.0030) [2024-06-19 10:13:12,838][26599] Updated weights for policy 0, policy_version 331974 (0.0030) [2024-06-19 10:13:13,384][26367] Fps is (10 sec: 40946.8, 60 sec: 42868.9, 300 sec: 42764.5). Total num frames: 5439078400. Throughput: 0: 42627.1. Samples: 1706733740. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:13,384][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 10:13:16,445][26599] Updated weights for policy 0, policy_version 331984 (0.0038) [2024-06-19 10:13:18,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5439307776. Throughput: 0: 42757.1. Samples: 1706862200. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:18,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 10:13:20,322][26599] Updated weights for policy 0, policy_version 331994 (0.0038) [2024-06-19 10:13:23,380][26367] Fps is (10 sec: 44252.9, 60 sec: 42598.5, 300 sec: 42820.5). Total num frames: 5439520768. Throughput: 0: 42777.8. Samples: 1707123180. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:23,381][26367] Avg episode reward: [(0, '0.449')] [2024-06-19 10:13:24,163][26599] Updated weights for policy 0, policy_version 332004 (0.0031) [2024-06-19 10:13:27,782][26599] Updated weights for policy 0, policy_version 332014 (0.0040) [2024-06-19 10:13:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.9). Total num frames: 5439717376. Throughput: 0: 42825.9. Samples: 1707381080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:28,381][26367] Avg episode reward: [(0, '0.824')] [2024-06-19 10:13:31,707][26599] Updated weights for policy 0, policy_version 332024 (0.0038) [2024-06-19 10:13:33,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5439946752. Throughput: 0: 43116.9. Samples: 1707518000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:33,380][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 10:13:35,332][26599] Updated weights for policy 0, policy_version 332034 (0.0043) [2024-06-19 10:13:38,384][26367] Fps is (10 sec: 42582.8, 60 sec: 42322.8, 300 sec: 42764.5). Total num frames: 5440143360. Throughput: 0: 42902.6. Samples: 1707773000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:38,385][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 10:13:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000332040_5440143360.pth... [2024-06-19 10:13:38,452][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000331416_5429919744.pth [2024-06-19 10:13:39,288][26599] Updated weights for policy 0, policy_version 332044 (0.0028) [2024-06-19 10:13:43,039][26599] Updated weights for policy 0, policy_version 332054 (0.0036) [2024-06-19 10:13:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 5440372736. Throughput: 0: 42967.2. Samples: 1708028280. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:43,380][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 10:13:46,592][26579] Signal inference workers to stop experience collection... (25200 times) [2024-06-19 10:13:46,594][26579] Signal inference workers to resume experience collection... (25200 times) [2024-06-19 10:13:46,614][26599] InferenceWorker_p0-w0: stopping experience collection (25200 times) [2024-06-19 10:13:46,615][26599] InferenceWorker_p0-w0: resuming experience collection (25200 times) [2024-06-19 10:13:46,741][26599] Updated weights for policy 0, policy_version 332064 (0.0050) [2024-06-19 10:13:48,380][26367] Fps is (10 sec: 44253.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5440585728. Throughput: 0: 43167.1. Samples: 1708163820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:48,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 10:13:50,592][26599] Updated weights for policy 0, policy_version 332074 (0.0034) [2024-06-19 10:13:53,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5440798720. Throughput: 0: 43016.4. Samples: 1708418380. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:53,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 10:13:54,609][26599] Updated weights for policy 0, policy_version 332084 (0.0036) [2024-06-19 10:13:58,381][26367] Fps is (10 sec: 42596.5, 60 sec: 42871.2, 300 sec: 42709.4). Total num frames: 5441011712. Throughput: 0: 43047.1. Samples: 1708670720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 10:13:58,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 10:13:58,508][26599] Updated weights for policy 0, policy_version 332094 (0.0042) [2024-06-19 10:14:02,228][26599] Updated weights for policy 0, policy_version 332104 (0.0036) [2024-06-19 10:14:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.8, 300 sec: 42765.0). Total num frames: 5441241088. Throughput: 0: 43040.8. 
Samples: 1708799040. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:03,381][26367] Avg episode reward: [(0, '0.477')] [2024-06-19 10:14:06,131][26599] Updated weights for policy 0, policy_version 332114 (0.0043) [2024-06-19 10:14:08,380][26367] Fps is (10 sec: 42600.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5441437696. Throughput: 0: 43004.4. Samples: 1709058380. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:08,381][26367] Avg episode reward: [(0, '0.461')] [2024-06-19 10:14:09,796][26599] Updated weights for policy 0, policy_version 332124 (0.0031) [2024-06-19 10:14:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42874.0, 300 sec: 42709.4). Total num frames: 5441650688. Throughput: 0: 42761.2. Samples: 1709305340. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:13,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 10:14:14,131][26599] Updated weights for policy 0, policy_version 332134 (0.0034) [2024-06-19 10:14:17,633][26599] Updated weights for policy 0, policy_version 332144 (0.0056) [2024-06-19 10:14:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5441880064. Throughput: 0: 42679.4. Samples: 1709438580. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:18,381][26367] Avg episode reward: [(0, '0.547')] [2024-06-19 10:14:21,857][26599] Updated weights for policy 0, policy_version 332154 (0.0042) [2024-06-19 10:14:23,380][26367] Fps is (10 sec: 40961.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5442060288. Throughput: 0: 42752.9. Samples: 1709696720. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:23,380][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 10:14:25,403][26599] Updated weights for policy 0, policy_version 332164 (0.0030) [2024-06-19 10:14:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5442306048. Throughput: 0: 42582.6. 
Samples: 1709944500. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:28,381][26367] Avg episode reward: [(0, '0.853')] [2024-06-19 10:14:29,602][26599] Updated weights for policy 0, policy_version 332174 (0.0034) [2024-06-19 10:14:32,865][26599] Updated weights for policy 0, policy_version 332184 (0.0030) [2024-06-19 10:14:33,380][26367] Fps is (10 sec: 47513.2, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5442535424. Throughput: 0: 42715.5. Samples: 1710086020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:33,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 10:14:37,407][26599] Updated weights for policy 0, policy_version 332194 (0.0024) [2024-06-19 10:14:38,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5442699264. Throughput: 0: 42717.4. Samples: 1710340660. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:38,380][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 10:14:40,398][26599] Updated weights for policy 0, policy_version 332204 (0.0030) [2024-06-19 10:14:43,380][26367] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5442961408. Throughput: 0: 42701.6. Samples: 1710592280. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:43,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 10:14:44,969][26599] Updated weights for policy 0, policy_version 332214 (0.0030) [2024-06-19 10:14:47,931][26599] Updated weights for policy 0, policy_version 332224 (0.0035) [2024-06-19 10:14:48,380][26367] Fps is (10 sec: 47513.0, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5443174400. Throughput: 0: 42755.5. Samples: 1710723040. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:48,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 10:14:52,485][26599] Updated weights for policy 0, policy_version 332234 (0.0033) [2024-06-19 10:14:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5443354624. Throughput: 0: 42703.1. Samples: 1710980020. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:53,383][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 10:14:55,654][26599] Updated weights for policy 0, policy_version 332244 (0.0030) [2024-06-19 10:14:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.7, 300 sec: 42765.5). Total num frames: 5443584000. Throughput: 0: 42834.8. Samples: 1711232900. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:14:58,383][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 10:14:59,972][26599] Updated weights for policy 0, policy_version 332254 (0.0037) [2024-06-19 10:15:03,246][26599] Updated weights for policy 0, policy_version 332264 (0.0037) [2024-06-19 10:15:03,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5443813376. Throughput: 0: 42800.5. Samples: 1711364600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:15:03,381][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 10:15:07,928][26599] Updated weights for policy 0, policy_version 332274 (0.0036) [2024-06-19 10:15:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5443993600. Throughput: 0: 42692.3. Samples: 1711617880. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:15:08,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 10:15:10,928][26599] Updated weights for policy 0, policy_version 332284 (0.0031) [2024-06-19 10:15:13,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5444222976. Throughput: 0: 42866.2. Samples: 1711873480. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:15:13,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 10:15:15,471][26599] Updated weights for policy 0, policy_version 332294 (0.0036) [2024-06-19 10:15:18,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5444452352. Throughput: 0: 42638.7. Samples: 1712004760. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:15:18,380][26367] Avg episode reward: [(0, '0.356')] [2024-06-19 10:15:18,512][26599] Updated weights for policy 0, policy_version 332304 (0.0043) [2024-06-19 10:15:23,093][26599] Updated weights for policy 0, policy_version 332314 (0.0058) [2024-06-19 10:15:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5444632576. Throughput: 0: 42555.1. Samples: 1712255640. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:15:23,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 10:15:26,464][26599] Updated weights for policy 0, policy_version 332324 (0.0032) [2024-06-19 10:15:28,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5444845568. Throughput: 0: 42602.7. Samples: 1712509400. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:15:28,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 10:15:31,096][26599] Updated weights for policy 0, policy_version 332334 (0.0040) [2024-06-19 10:15:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5445074944. Throughput: 0: 42670.8. Samples: 1712643220. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:15:33,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 10:15:34,046][26599] Updated weights for policy 0, policy_version 332344 (0.0040) [2024-06-19 10:15:38,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5445271552. Throughput: 0: 42599.6. Samples: 1712897000. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:15:38,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 10:15:38,396][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000332353_5445271552.pth... [2024-06-19 10:15:38,478][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000331728_5435031552.pth [2024-06-19 10:15:38,646][26599] Updated weights for policy 0, policy_version 332354 (0.0028) [2024-06-19 10:15:41,794][26599] Updated weights for policy 0, policy_version 332364 (0.0028) [2024-06-19 10:15:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5445500928. Throughput: 0: 42560.0. Samples: 1713148100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:15:43,381][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 10:15:46,038][26599] Updated weights for policy 0, policy_version 332374 (0.0028) [2024-06-19 10:15:47,218][26579] Signal inference workers to stop experience collection... (25250 times) [2024-06-19 10:15:47,233][26599] InferenceWorker_p0-w0: stopping experience collection (25250 times) [2024-06-19 10:15:47,275][26579] Signal inference workers to resume experience collection... (25250 times) [2024-06-19 10:15:47,276][26599] InferenceWorker_p0-w0: resuming experience collection (25250 times) [2024-06-19 10:15:48,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42598.5, 300 sec: 42765.1). Total num frames: 5445730304. Throughput: 0: 42700.0. Samples: 1713286100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:15:48,380][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 10:15:49,271][26599] Updated weights for policy 0, policy_version 332384 (0.0039) [2024-06-19 10:15:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 5445926912. Throughput: 0: 42850.3. Samples: 1713546140. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:15:53,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 10:15:53,486][26599] Updated weights for policy 0, policy_version 332394 (0.0032) [2024-06-19 10:15:56,882][26599] Updated weights for policy 0, policy_version 332404 (0.0024) [2024-06-19 10:15:58,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5446156288. Throughput: 0: 42829.7. Samples: 1713800820. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:15:58,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 10:16:01,287][26599] Updated weights for policy 0, policy_version 332414 (0.0030) [2024-06-19 10:16:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5446369280. Throughput: 0: 42813.8. Samples: 1713931380. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:16:03,381][26367] Avg episode reward: [(0, '0.617')] [2024-06-19 10:16:04,452][26599] Updated weights for policy 0, policy_version 332424 (0.0044) [2024-06-19 10:16:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5446565888. Throughput: 0: 42923.5. Samples: 1714187200. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:16:08,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 10:16:08,955][26599] Updated weights for policy 0, policy_version 332434 (0.0046) [2024-06-19 10:16:12,089][26599] Updated weights for policy 0, policy_version 332444 (0.0041) [2024-06-19 10:16:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5446795264. Throughput: 0: 43021.8. Samples: 1714445380. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:16:13,381][26367] Avg episode reward: [(0, '0.353')] [2024-06-19 10:16:16,494][26599] Updated weights for policy 0, policy_version 332454 (0.0045) [2024-06-19 10:16:18,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5447008256. Throughput: 0: 42954.2. Samples: 1714576160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:16:18,392][26367] Avg episode reward: [(0, '0.434')] [2024-06-19 10:16:19,679][26599] Updated weights for policy 0, policy_version 332464 (0.0043) [2024-06-19 10:16:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42821.0). Total num frames: 5447221248. Throughput: 0: 43000.1. Samples: 1714832000. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:16:23,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 10:16:23,915][26599] Updated weights for policy 0, policy_version 332474 (0.0043) [2024-06-19 10:16:27,264][26599] Updated weights for policy 0, policy_version 332484 (0.0046) [2024-06-19 10:16:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5447434240. Throughput: 0: 42968.1. Samples: 1715081660. Policy #0 lag: (min: 0.0, avg: 9.2, max: 22.0) [2024-06-19 10:16:28,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 10:16:31,448][26599] Updated weights for policy 0, policy_version 332494 (0.0032) [2024-06-19 10:16:33,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42710.0). Total num frames: 5447630848. Throughput: 0: 42915.9. Samples: 1715217320. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:16:33,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 10:16:34,984][26599] Updated weights for policy 0, policy_version 332504 (0.0034) [2024-06-19 10:16:38,384][26367] Fps is (10 sec: 42582.9, 60 sec: 43142.0, 300 sec: 42820.0). Total num frames: 5447860224. Throughput: 0: 42811.7. Samples: 1715472820. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:16:38,384][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 10:16:39,380][26599] Updated weights for policy 0, policy_version 332514 (0.0037) [2024-06-19 10:16:42,835][26599] Updated weights for policy 0, policy_version 332524 (0.0053) [2024-06-19 10:16:43,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5448073216. Throughput: 0: 42727.7. Samples: 1715723560. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:16:43,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 10:16:46,866][26599] Updated weights for policy 0, policy_version 332534 (0.0045) [2024-06-19 10:16:48,384][26367] Fps is (10 sec: 42598.3, 60 sec: 42595.8, 300 sec: 42709.5). Total num frames: 5448286208. Throughput: 0: 42871.6. Samples: 1715860760. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:16:48,384][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 10:16:50,370][26599] Updated weights for policy 0, policy_version 332544 (0.0030) [2024-06-19 10:16:53,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5448499200. Throughput: 0: 43037.8. Samples: 1716123900. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:16:53,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 10:16:54,386][26599] Updated weights for policy 0, policy_version 332554 (0.0037) [2024-06-19 10:16:58,016][26579] Signal inference workers to stop experience collection... (25300 times) [2024-06-19 10:16:58,017][26579] Signal inference workers to resume experience collection... 
(25300 times) [2024-06-19 10:16:58,060][26599] InferenceWorker_p0-w0: stopping experience collection (25300 times) [2024-06-19 10:16:58,060][26599] InferenceWorker_p0-w0: resuming experience collection (25300 times) [2024-06-19 10:16:58,164][26599] Updated weights for policy 0, policy_version 332564 (0.0026) [2024-06-19 10:16:58,380][26367] Fps is (10 sec: 44252.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5448728576. Throughput: 0: 42791.0. Samples: 1716370980. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:16:58,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 10:17:02,062][26599] Updated weights for policy 0, policy_version 332574 (0.0027) [2024-06-19 10:17:03,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5448941568. Throughput: 0: 42742.8. Samples: 1716499580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:03,380][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 10:17:05,714][26599] Updated weights for policy 0, policy_version 332584 (0.0030) [2024-06-19 10:17:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5449154560. Throughput: 0: 42948.8. Samples: 1716764700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:08,381][26367] Avg episode reward: [(0, '0.484')] [2024-06-19 10:17:09,982][26599] Updated weights for policy 0, policy_version 332594 (0.0039) [2024-06-19 10:17:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5449367552. Throughput: 0: 43064.0. Samples: 1717019540. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:13,380][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 10:17:13,474][26599] Updated weights for policy 0, policy_version 332604 (0.0024) [2024-06-19 10:17:17,430][26599] Updated weights for policy 0, policy_version 332614 (0.0033) [2024-06-19 10:17:18,384][26367] Fps is (10 sec: 42583.1, 60 sec: 42868.9, 300 sec: 42764.5). Total num frames: 5449580544. Throughput: 0: 42696.2. Samples: 1717138800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:18,384][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 10:17:21,464][26599] Updated weights for policy 0, policy_version 332624 (0.0035) [2024-06-19 10:17:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5449793536. Throughput: 0: 42796.2. Samples: 1717398500. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:23,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 10:17:25,001][26599] Updated weights for policy 0, policy_version 332634 (0.0045) [2024-06-19 10:17:28,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5449990144. Throughput: 0: 43023.0. Samples: 1717659600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:28,389][26367] Avg episode reward: [(0, '0.801')] [2024-06-19 10:17:29,047][26599] Updated weights for policy 0, policy_version 332644 (0.0048) [2024-06-19 10:17:32,594][26599] Updated weights for policy 0, policy_version 332654 (0.0029) [2024-06-19 10:17:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5450219520. Throughput: 0: 42693.6. Samples: 1717781820. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:33,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 10:17:37,046][26599] Updated weights for policy 0, policy_version 332664 (0.0054) [2024-06-19 10:17:38,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42873.9, 300 sec: 42876.1). Total num frames: 5450432512. Throughput: 0: 42615.4. Samples: 1718041600. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:38,389][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 10:17:38,409][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000332668_5450432512.pth... [2024-06-19 10:17:38,488][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000332040_5440143360.pth [2024-06-19 10:17:40,227][26599] Updated weights for policy 0, policy_version 332674 (0.0040) [2024-06-19 10:17:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5450629120. Throughput: 0: 42719.2. Samples: 1718293340. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:43,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 10:17:44,689][26599] Updated weights for policy 0, policy_version 332684 (0.0026) [2024-06-19 10:17:47,873][26599] Updated weights for policy 0, policy_version 332694 (0.0045) [2024-06-19 10:17:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42874.0, 300 sec: 42820.5). Total num frames: 5450858496. Throughput: 0: 42605.2. Samples: 1718416820. Policy #0 lag: (min: 0.0, avg: 8.9, max: 21.0) [2024-06-19 10:17:48,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 10:17:52,586][26599] Updated weights for policy 0, policy_version 332704 (0.0042) [2024-06-19 10:17:53,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5451071488. Throughput: 0: 42577.0. Samples: 1718680660. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:17:53,380][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 10:17:55,559][26599] Updated weights for policy 0, policy_version 332714 (0.0033) [2024-06-19 10:17:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42765.1). Total num frames: 5451284480. Throughput: 0: 42450.5. Samples: 1718929820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:17:58,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 10:18:00,266][26599] Updated weights for policy 0, policy_version 332724 (0.0040) [2024-06-19 10:18:03,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5451497472. Throughput: 0: 42590.0. Samples: 1719055200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:03,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 10:18:03,590][26599] Updated weights for policy 0, policy_version 332734 (0.0033) [2024-06-19 10:18:07,954][26599] Updated weights for policy 0, policy_version 332744 (0.0036) [2024-06-19 10:18:08,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42765.6). Total num frames: 5451694080. Throughput: 0: 42718.3. Samples: 1719320820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:08,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 10:18:11,170][26599] Updated weights for policy 0, policy_version 332754 (0.0028) [2024-06-19 10:18:13,381][26367] Fps is (10 sec: 42596.7, 60 sec: 42598.0, 300 sec: 42764.9). Total num frames: 5451923456. Throughput: 0: 42494.6. Samples: 1719571880. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:13,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 10:18:15,512][26599] Updated weights for policy 0, policy_version 332764 (0.0044) [2024-06-19 10:18:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42054.9, 300 sec: 42654.0). Total num frames: 5452103680. Throughput: 0: 42614.8. Samples: 1719699480. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:18,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 10:18:18,904][26599] Updated weights for policy 0, policy_version 332774 (0.0033) [2024-06-19 10:18:23,046][26599] Updated weights for policy 0, policy_version 332784 (0.0030) [2024-06-19 10:18:23,380][26367] Fps is (10 sec: 40962.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5452333056. Throughput: 0: 42577.0. Samples: 1719957560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:23,381][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 10:18:25,749][26579] Signal inference workers to stop experience collection... (25350 times) [2024-06-19 10:18:25,792][26599] InferenceWorker_p0-w0: stopping experience collection (25350 times) [2024-06-19 10:18:25,813][26579] Signal inference workers to resume experience collection... (25350 times) [2024-06-19 10:18:25,814][26599] InferenceWorker_p0-w0: resuming experience collection (25350 times) [2024-06-19 10:18:26,376][26599] Updated weights for policy 0, policy_version 332794 (0.0032) [2024-06-19 10:18:28,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5452562432. Throughput: 0: 42528.4. Samples: 1720207120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:28,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 10:18:31,123][26599] Updated weights for policy 0, policy_version 332804 (0.0026) [2024-06-19 10:18:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42765.6). Total num frames: 5452759040. Throughput: 0: 42728.1. Samples: 1720339580. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:33,380][26367] Avg episode reward: [(0, '0.430')] [2024-06-19 10:18:34,131][26599] Updated weights for policy 0, policy_version 332814 (0.0051) [2024-06-19 10:18:38,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5452955648. 
Throughput: 0: 42591.4. Samples: 1720597280. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:38,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 10:18:38,720][26599] Updated weights for policy 0, policy_version 332824 (0.0028) [2024-06-19 10:18:41,817][26599] Updated weights for policy 0, policy_version 332834 (0.0022) [2024-06-19 10:18:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5453201408. Throughput: 0: 42589.4. Samples: 1720846340. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:43,381][26367] Avg episode reward: [(0, '0.719')] [2024-06-19 10:18:46,422][26599] Updated weights for policy 0, policy_version 332844 (0.0026) [2024-06-19 10:18:48,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5453414400. Throughput: 0: 42907.7. Samples: 1720986040. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:48,380][26367] Avg episode reward: [(0, '0.798')] [2024-06-19 10:18:49,569][26599] Updated weights for policy 0, policy_version 332854 (0.0031) [2024-06-19 10:18:53,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42052.2, 300 sec: 42654.0). Total num frames: 5453594624. Throughput: 0: 42578.2. Samples: 1721236840. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:53,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 10:18:54,154][26599] Updated weights for policy 0, policy_version 332864 (0.0034) [2024-06-19 10:18:57,176][26599] Updated weights for policy 0, policy_version 332874 (0.0030) [2024-06-19 10:18:58,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5453840384. Throughput: 0: 42555.5. Samples: 1721486860. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:18:58,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 10:19:01,854][26599] Updated weights for policy 0, policy_version 332884 (0.0038) [2024-06-19 10:19:03,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5454053376. Throughput: 0: 42761.2. Samples: 1721623740. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:19:03,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 10:19:04,801][26599] Updated weights for policy 0, policy_version 332894 (0.0025) [2024-06-19 10:19:08,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5454233600. Throughput: 0: 42487.9. Samples: 1721869520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:08,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 10:19:09,699][26599] Updated weights for policy 0, policy_version 332904 (0.0025) [2024-06-19 10:19:12,562][26599] Updated weights for policy 0, policy_version 332914 (0.0031) [2024-06-19 10:19:13,382][26367] Fps is (10 sec: 44228.4, 60 sec: 42870.5, 300 sec: 42764.7). Total num frames: 5454495744. Throughput: 0: 42562.2. Samples: 1722122500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:13,383][26367] Avg episode reward: [(0, '0.830')] [2024-06-19 10:19:17,451][26599] Updated weights for policy 0, policy_version 332924 (0.0036) [2024-06-19 10:19:18,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5454675968. Throughput: 0: 42731.6. Samples: 1722262500. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:18,380][26367] Avg episode reward: [(0, '0.790')] [2024-06-19 10:19:20,226][26599] Updated weights for policy 0, policy_version 332934 (0.0028) [2024-06-19 10:19:23,380][26367] Fps is (10 sec: 37690.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5454872576. Throughput: 0: 42493.8. Samples: 1722509500. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:23,382][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 10:19:25,139][26599] Updated weights for policy 0, policy_version 332944 (0.0043) [2024-06-19 10:19:28,000][26599] Updated weights for policy 0, policy_version 332954 (0.0036) [2024-06-19 10:19:28,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5455118336. Throughput: 0: 42529.3. Samples: 1722760160. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:28,381][26367] Avg episode reward: [(0, '0.366')] [2024-06-19 10:19:32,711][26599] Updated weights for policy 0, policy_version 332964 (0.0025) [2024-06-19 10:19:33,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5455314944. Throughput: 0: 42484.4. Samples: 1722897840. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:33,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 10:19:35,648][26599] Updated weights for policy 0, policy_version 332974 (0.0038) [2024-06-19 10:19:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5455527936. Throughput: 0: 42370.5. Samples: 1723143520. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:38,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 10:19:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000332979_5455527936.pth... [2024-06-19 10:19:38,440][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000332353_5445271552.pth [2024-06-19 10:19:39,869][26579] Signal inference workers to stop experience collection... (25400 times) [2024-06-19 10:19:39,925][26599] InferenceWorker_p0-w0: stopping experience collection (25400 times) [2024-06-19 10:19:39,929][26579] Signal inference workers to resume experience collection... 
(25400 times) [2024-06-19 10:19:39,946][26599] InferenceWorker_p0-w0: resuming experience collection (25400 times) [2024-06-19 10:19:40,227][26599] Updated weights for policy 0, policy_version 332984 (0.0029) [2024-06-19 10:19:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5455757312. Throughput: 0: 42645.4. Samples: 1723405900. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:43,381][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 10:19:43,419][26599] Updated weights for policy 0, policy_version 332994 (0.0040) [2024-06-19 10:19:47,845][26599] Updated weights for policy 0, policy_version 333004 (0.0034) [2024-06-19 10:19:48,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5455970304. Throughput: 0: 42546.2. Samples: 1723538320. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:48,390][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 10:19:50,970][26599] Updated weights for policy 0, policy_version 333014 (0.0025) [2024-06-19 10:19:53,384][26367] Fps is (10 sec: 42583.1, 60 sec: 43141.9, 300 sec: 42709.0). Total num frames: 5456183296. Throughput: 0: 42735.3. Samples: 1723792760. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:53,384][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 10:19:55,623][26599] Updated weights for policy 0, policy_version 333024 (0.0028) [2024-06-19 10:19:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5456412672. Throughput: 0: 42724.4. Samples: 1724045020. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:19:58,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 10:19:58,856][26599] Updated weights for policy 0, policy_version 333034 (0.0032) [2024-06-19 10:20:03,231][26599] Updated weights for policy 0, policy_version 333044 (0.0032) [2024-06-19 10:20:03,380][26367] Fps is (10 sec: 42613.1, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5456609280. Throughput: 0: 42513.1. Samples: 1724175600. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:20:03,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 10:20:06,411][26599] Updated weights for policy 0, policy_version 333054 (0.0037) [2024-06-19 10:20:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5456822272. Throughput: 0: 42592.9. Samples: 1724426180. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:20:08,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 10:20:10,880][26599] Updated weights for policy 0, policy_version 333064 (0.0038) [2024-06-19 10:20:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42326.6, 300 sec: 42653.9). Total num frames: 5457035264. Throughput: 0: 42824.9. Samples: 1724687280. Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:20:13,381][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 10:20:14,120][26599] Updated weights for policy 0, policy_version 333074 (0.0029) [2024-06-19 10:20:18,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5457231872. Throughput: 0: 42618.7. Samples: 1724815680. 
Policy #0 lag: (min: 0.0, avg: 11.2, max: 22.0) [2024-06-19 10:20:18,381][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 10:20:18,464][26599] Updated weights for policy 0, policy_version 333084 (0.0041) [2024-06-19 10:20:21,588][26599] Updated weights for policy 0, policy_version 333094 (0.0030) [2024-06-19 10:20:23,384][26367] Fps is (10 sec: 44221.2, 60 sec: 43415.0, 300 sec: 42820.0). Total num frames: 5457477632. Throughput: 0: 42805.1. Samples: 1725069900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:20:23,385][26367] Avg episode reward: [(0, '0.813')] [2024-06-19 10:20:26,078][26599] Updated weights for policy 0, policy_version 333104 (0.0032) [2024-06-19 10:20:28,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5457674240. Throughput: 0: 42815.6. Samples: 1725332600. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:20:28,380][26367] Avg episode reward: [(0, '0.773')] [2024-06-19 10:20:29,484][26599] Updated weights for policy 0, policy_version 333114 (0.0036) [2024-06-19 10:20:33,380][26367] Fps is (10 sec: 39335.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5457870848. Throughput: 0: 42620.0. Samples: 1725456220. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:20:33,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 10:20:33,791][26599] Updated weights for policy 0, policy_version 333124 (0.0032) [2024-06-19 10:20:36,957][26599] Updated weights for policy 0, policy_version 333134 (0.0035) [2024-06-19 10:20:38,380][26367] Fps is (10 sec: 45874.7, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 5458132992. Throughput: 0: 42738.0. Samples: 1725715820. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:20:38,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 10:20:41,405][26599] Updated weights for policy 0, policy_version 333144 (0.0045) [2024-06-19 10:20:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5458313216. Throughput: 0: 42828.9. Samples: 1725972320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:20:43,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 10:20:44,642][26599] Updated weights for policy 0, policy_version 333154 (0.0030) [2024-06-19 10:20:48,380][26367] Fps is (10 sec: 37683.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5458509824. Throughput: 0: 42660.7. Samples: 1726095320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:20:48,380][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 10:20:49,105][26599] Updated weights for policy 0, policy_version 333164 (0.0024) [2024-06-19 10:20:52,223][26599] Updated weights for policy 0, policy_version 333174 (0.0037) [2024-06-19 10:20:53,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42601.0, 300 sec: 42654.0). Total num frames: 5458739200. Throughput: 0: 42820.5. Samples: 1726353100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:20:53,380][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 10:20:56,677][26599] Updated weights for policy 0, policy_version 333184 (0.0036) [2024-06-19 10:20:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5458952192. Throughput: 0: 42828.6. Samples: 1726614560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:20:58,380][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 10:20:59,871][26599] Updated weights for policy 0, policy_version 333194 (0.0029) [2024-06-19 10:21:03,383][26367] Fps is (10 sec: 40947.7, 60 sec: 42323.4, 300 sec: 42653.5). Total num frames: 5459148800. Throughput: 0: 42771.9. Samples: 1726740540. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:21:03,384][26367] Avg episode reward: [(0, '0.855')] [2024-06-19 10:21:04,369][26599] Updated weights for policy 0, policy_version 333204 (0.0027) [2024-06-19 10:21:05,919][26579] Signal inference workers to stop experience collection... (25450 times) [2024-06-19 10:21:05,966][26599] InferenceWorker_p0-w0: stopping experience collection (25450 times) [2024-06-19 10:21:05,974][26579] Signal inference workers to resume experience collection... (25450 times) [2024-06-19 10:21:05,980][26599] InferenceWorker_p0-w0: resuming experience collection (25450 times) [2024-06-19 10:21:07,677][26599] Updated weights for policy 0, policy_version 333214 (0.0031) [2024-06-19 10:21:08,382][26367] Fps is (10 sec: 42590.9, 60 sec: 42597.2, 300 sec: 42653.7). Total num frames: 5459378176. Throughput: 0: 42717.4. Samples: 1726992100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:21:08,383][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 10:21:12,069][26599] Updated weights for policy 0, policy_version 333224 (0.0031) [2024-06-19 10:21:13,380][26367] Fps is (10 sec: 45888.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5459607552. Throughput: 0: 42839.0. Samples: 1727260360. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:21:13,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 10:21:15,209][26599] Updated weights for policy 0, policy_version 333234 (0.0028) [2024-06-19 10:21:18,380][26367] Fps is (10 sec: 42605.8, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5459804160. Throughput: 0: 42798.7. Samples: 1727382160. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:21:18,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 10:21:19,708][26599] Updated weights for policy 0, policy_version 333244 (0.0033) [2024-06-19 10:21:22,730][26599] Updated weights for policy 0, policy_version 333254 (0.0042) [2024-06-19 10:21:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5460033536. Throughput: 0: 42652.6. Samples: 1727635180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:21:23,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 10:21:27,248][26599] Updated weights for policy 0, policy_version 333264 (0.0036) [2024-06-19 10:21:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5460230144. Throughput: 0: 42912.5. Samples: 1727903380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:21:28,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 10:21:30,763][26599] Updated weights for policy 0, policy_version 333274 (0.0033) [2024-06-19 10:21:33,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42871.4, 300 sec: 42654.4). Total num frames: 5460443136. Throughput: 0: 42778.0. Samples: 1728020340. Policy #0 lag: (min: 0.0, avg: 10.1, max: 20.0) [2024-06-19 10:21:33,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 10:21:35,450][26599] Updated weights for policy 0, policy_version 333284 (0.0039) [2024-06-19 10:21:38,302][26599] Updated weights for policy 0, policy_version 333294 (0.0027) [2024-06-19 10:21:38,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5460688896. Throughput: 0: 42653.7. Samples: 1728272520. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:21:38,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 10:21:38,390][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000333294_5460688896.pth... 
[2024-06-19 10:21:38,474][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000332668_5450432512.pth [2024-06-19 10:21:43,035][26599] Updated weights for policy 0, policy_version 333304 (0.0035) [2024-06-19 10:21:43,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42325.5, 300 sec: 42598.9). Total num frames: 5460852736. Throughput: 0: 42758.7. Samples: 1728538700. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:21:43,380][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 10:21:45,806][26599] Updated weights for policy 0, policy_version 333314 (0.0037) [2024-06-19 10:21:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5461098496. Throughput: 0: 42589.4. Samples: 1728656940. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:21:48,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 10:21:50,514][26599] Updated weights for policy 0, policy_version 333324 (0.0028) [2024-06-19 10:21:53,305][26599] Updated weights for policy 0, policy_version 333334 (0.0037) [2024-06-19 10:21:53,380][26367] Fps is (10 sec: 49151.4, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5461344256. Throughput: 0: 42866.5. Samples: 1728921020. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:21:53,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 10:21:58,067][26599] Updated weights for policy 0, policy_version 333344 (0.0040) [2024-06-19 10:21:58,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5461508096. Throughput: 0: 42762.8. Samples: 1729184680. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:21:58,380][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 10:22:01,012][26599] Updated weights for policy 0, policy_version 333354 (0.0033) [2024-06-19 10:22:03,380][26367] Fps is (10 sec: 37682.8, 60 sec: 42873.4, 300 sec: 42598.4). Total num frames: 5461721088. Throughput: 0: 42737.6. Samples: 1729305360. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:03,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 10:22:05,820][26599] Updated weights for policy 0, policy_version 333364 (0.0036) [2024-06-19 10:22:06,523][26579] Signal inference workers to stop experience collection... (25500 times) [2024-06-19 10:22:06,560][26599] InferenceWorker_p0-w0: stopping experience collection (25500 times) [2024-06-19 10:22:06,583][26579] Signal inference workers to resume experience collection... (25500 times) [2024-06-19 10:22:06,583][26599] InferenceWorker_p0-w0: resuming experience collection (25500 times) [2024-06-19 10:22:08,380][26367] Fps is (10 sec: 45875.0, 60 sec: 43145.8, 300 sec: 42709.5). Total num frames: 5461966848. Throughput: 0: 42855.1. Samples: 1729563660. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:08,380][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 10:22:08,606][26599] Updated weights for policy 0, policy_version 333374 (0.0054) [2024-06-19 10:22:13,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42598.9). Total num frames: 5462147072. Throughput: 0: 42951.7. Samples: 1729836200. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:13,381][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 10:22:13,442][26599] Updated weights for policy 0, policy_version 333384 (0.0028) [2024-06-19 10:22:16,094][26599] Updated weights for policy 0, policy_version 333394 (0.0046) [2024-06-19 10:22:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5462376448. Throughput: 0: 42894.3. Samples: 1729950580. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:18,381][26367] Avg episode reward: [(0, '0.342')] [2024-06-19 10:22:20,973][26599] Updated weights for policy 0, policy_version 333404 (0.0029) [2024-06-19 10:22:23,380][26367] Fps is (10 sec: 49151.9, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5462638592. Throughput: 0: 43086.2. 
Samples: 1730211400. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:23,389][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 10:22:24,067][26599] Updated weights for policy 0, policy_version 333414 (0.0034) [2024-06-19 10:22:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5462802432. Throughput: 0: 43158.6. Samples: 1730480840. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:28,391][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 10:22:28,448][26599] Updated weights for policy 0, policy_version 333424 (0.0045) [2024-06-19 10:22:31,693][26599] Updated weights for policy 0, policy_version 333434 (0.0033) [2024-06-19 10:22:33,380][26367] Fps is (10 sec: 39321.7, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 5463031808. Throughput: 0: 43148.5. Samples: 1730598620. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:33,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 10:22:36,106][26599] Updated weights for policy 0, policy_version 333444 (0.0048) [2024-06-19 10:22:38,380][26367] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5463277568. Throughput: 0: 42994.6. Samples: 1730855780. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:38,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 10:22:39,396][26599] Updated weights for policy 0, policy_version 333454 (0.0027) [2024-06-19 10:22:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5463441408. Throughput: 0: 42992.8. Samples: 1731119360. 
Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:43,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 10:22:43,947][26599] Updated weights for policy 0, policy_version 333464 (0.0028) [2024-06-19 10:22:47,297][26599] Updated weights for policy 0, policy_version 333474 (0.0026) [2024-06-19 10:22:48,380][26367] Fps is (10 sec: 40960.5, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5463687168. Throughput: 0: 43000.2. Samples: 1731240360. Policy #0 lag: (min: 1.0, avg: 9.4, max: 21.0) [2024-06-19 10:22:48,381][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 10:22:51,512][26599] Updated weights for policy 0, policy_version 333484 (0.0046) [2024-06-19 10:22:53,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5463916544. Throughput: 0: 43128.5. Samples: 1731504440. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:22:53,381][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 10:22:54,748][26599] Updated weights for policy 0, policy_version 333494 (0.0038) [2024-06-19 10:22:58,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5464080384. Throughput: 0: 42840.0. Samples: 1731764000. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:22:58,380][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 10:22:59,078][26599] Updated weights for policy 0, policy_version 333504 (0.0025) [2024-06-19 10:23:02,238][26599] Updated weights for policy 0, policy_version 333514 (0.0043) [2024-06-19 10:23:03,380][26367] Fps is (10 sec: 39321.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5464309760. Throughput: 0: 42879.9. Samples: 1731880180. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:03,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 10:23:06,754][26599] Updated weights for policy 0, policy_version 333524 (0.0043) [2024-06-19 10:23:08,384][26367] Fps is (10 sec: 49133.8, 60 sec: 43415.0, 300 sec: 42875.6). Total num frames: 5464571904. Throughput: 0: 43033.9. Samples: 1732148080. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:08,384][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 10:23:09,706][26599] Updated weights for policy 0, policy_version 333534 (0.0041) [2024-06-19 10:23:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5464719360. Throughput: 0: 42862.2. Samples: 1732409640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:13,381][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 10:23:14,385][26599] Updated weights for policy 0, policy_version 333544 (0.0042) [2024-06-19 10:23:17,401][26599] Updated weights for policy 0, policy_version 333554 (0.0036) [2024-06-19 10:23:18,380][26367] Fps is (10 sec: 39336.2, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5464965120. Throughput: 0: 42827.6. Samples: 1732525860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:18,380][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 10:23:21,632][26579] Signal inference workers to stop experience collection... (25550 times) [2024-06-19 10:23:21,688][26599] InferenceWorker_p0-w0: stopping experience collection (25550 times) [2024-06-19 10:23:21,750][26579] Signal inference workers to resume experience collection... (25550 times) [2024-06-19 10:23:21,750][26599] InferenceWorker_p0-w0: resuming experience collection (25550 times) [2024-06-19 10:23:21,895][26599] Updated weights for policy 0, policy_version 333564 (0.0023) [2024-06-19 10:23:23,380][26367] Fps is (10 sec: 49152.8, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5465210880. 
Throughput: 0: 43108.1. Samples: 1732795640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:23,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 10:23:24,901][26599] Updated weights for policy 0, policy_version 333574 (0.0035) [2024-06-19 10:23:28,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5465374720. Throughput: 0: 43003.5. Samples: 1733054520. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:28,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 10:23:29,473][26599] Updated weights for policy 0, policy_version 333584 (0.0032) [2024-06-19 10:23:33,058][26599] Updated weights for policy 0, policy_version 333594 (0.0033) [2024-06-19 10:23:33,384][26367] Fps is (10 sec: 39307.1, 60 sec: 42868.8, 300 sec: 42875.6). Total num frames: 5465604096. Throughput: 0: 42889.8. Samples: 1733170560. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:33,384][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 10:23:36,919][26599] Updated weights for policy 0, policy_version 333604 (0.0034) [2024-06-19 10:23:38,380][26367] Fps is (10 sec: 47513.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5465849856. Throughput: 0: 42900.3. Samples: 1733434960. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:38,381][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 10:23:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000333609_5465849856.pth... [2024-06-19 10:23:38,461][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000332979_5455527936.pth [2024-06-19 10:23:40,706][26599] Updated weights for policy 0, policy_version 333614 (0.0037) [2024-06-19 10:23:43,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5466013696. Throughput: 0: 42867.1. Samples: 1733693020. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:43,380][26367] Avg episode reward: [(0, '0.750')] [2024-06-19 10:23:45,002][26599] Updated weights for policy 0, policy_version 333624 (0.0029) [2024-06-19 10:23:48,258][26599] Updated weights for policy 0, policy_version 333634 (0.0031) [2024-06-19 10:23:48,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5466259456. Throughput: 0: 42931.6. Samples: 1733812100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:48,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 10:23:52,594][26599] Updated weights for policy 0, policy_version 333644 (0.0033) [2024-06-19 10:23:53,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5466472448. Throughput: 0: 42808.0. Samples: 1734074280. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:53,380][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 10:23:55,820][26599] Updated weights for policy 0, policy_version 333654 (0.0039) [2024-06-19 10:23:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5466669056. Throughput: 0: 42803.7. Samples: 1734335800. Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:23:58,381][26367] Avg episode reward: [(0, '0.793')] [2024-06-19 10:24:00,216][26599] Updated weights for policy 0, policy_version 333664 (0.0033) [2024-06-19 10:24:03,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43144.7, 300 sec: 42931.7). Total num frames: 5466898432. Throughput: 0: 42872.0. Samples: 1734455100. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 23.0) [2024-06-19 10:24:03,380][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 10:24:03,426][26599] Updated weights for policy 0, policy_version 333674 (0.0022) [2024-06-19 10:24:07,690][26599] Updated weights for policy 0, policy_version 333684 (0.0033) [2024-06-19 10:24:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42054.8, 300 sec: 42709.8). Total num frames: 5467095040. Throughput: 0: 42665.3. Samples: 1734715580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:08,380][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 10:24:11,143][26599] Updated weights for policy 0, policy_version 333694 (0.0030) [2024-06-19 10:24:13,380][26367] Fps is (10 sec: 40959.8, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5467308032. Throughput: 0: 42645.4. Samples: 1734973560. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:13,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 10:24:15,239][26599] Updated weights for policy 0, policy_version 333704 (0.0029) [2024-06-19 10:24:18,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5467537408. Throughput: 0: 42862.6. Samples: 1735099220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:18,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 10:24:18,738][26599] Updated weights for policy 0, policy_version 333714 (0.0039) [2024-06-19 10:24:22,774][26599] Updated weights for policy 0, policy_version 333724 (0.0032) [2024-06-19 10:24:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42820.6). Total num frames: 5467750400. Throughput: 0: 42794.4. Samples: 1735360700. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:23,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 10:24:26,329][26599] Updated weights for policy 0, policy_version 333734 (0.0039) [2024-06-19 10:24:28,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5467963392. Throughput: 0: 42731.9. Samples: 1735615960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:28,381][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 10:24:30,601][26599] Updated weights for policy 0, policy_version 333744 (0.0027) [2024-06-19 10:24:33,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42598.4, 300 sec: 42820.1). Total num frames: 5468160000. Throughput: 0: 42774.3. Samples: 1735737100. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:33,385][26367] Avg episode reward: [(0, '0.817')] [2024-06-19 10:24:33,924][26599] Updated weights for policy 0, policy_version 333754 (0.0036) [2024-06-19 10:24:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5468372992. Throughput: 0: 42662.1. Samples: 1735994080. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:38,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 10:24:38,390][26599] Updated weights for policy 0, policy_version 333764 (0.0034) [2024-06-19 10:24:41,565][26579] Signal inference workers to stop experience collection... (25600 times) [2024-06-19 10:24:41,572][26579] Signal inference workers to resume experience collection... (25600 times) [2024-06-19 10:24:41,609][26599] InferenceWorker_p0-w0: stopping experience collection (25600 times) [2024-06-19 10:24:41,609][26599] InferenceWorker_p0-w0: resuming experience collection (25600 times) [2024-06-19 10:24:41,707][26599] Updated weights for policy 0, policy_version 333774 (0.0043) [2024-06-19 10:24:43,384][26367] Fps is (10 sec: 42598.2, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5468585984. Throughput: 0: 42658.3. 
Samples: 1736255580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:43,385][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 10:24:45,946][26599] Updated weights for policy 0, policy_version 333784 (0.0033) [2024-06-19 10:24:48,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.5). Total num frames: 5468798976. Throughput: 0: 42742.6. Samples: 1736378520. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:48,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 10:24:49,457][26599] Updated weights for policy 0, policy_version 333794 (0.0036) [2024-06-19 10:24:53,380][26367] Fps is (10 sec: 44253.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5469028352. Throughput: 0: 42669.8. Samples: 1736635720. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:53,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 10:24:53,470][26599] Updated weights for policy 0, policy_version 333804 (0.0035) [2024-06-19 10:24:57,712][26599] Updated weights for policy 0, policy_version 333814 (0.0038) [2024-06-19 10:24:58,384][26367] Fps is (10 sec: 42583.1, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 5469224960. Throughput: 0: 42617.0. Samples: 1736891480. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:24:58,384][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 10:25:01,305][26599] Updated weights for policy 0, policy_version 333824 (0.0031) [2024-06-19 10:25:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 5469454336. Throughput: 0: 42580.4. Samples: 1737015340. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:25:03,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 10:25:05,375][26599] Updated weights for policy 0, policy_version 333834 (0.0032) [2024-06-19 10:25:08,380][26367] Fps is (10 sec: 45892.2, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5469683712. Throughput: 0: 42544.1. 
Samples: 1737275180. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:25:08,380][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 10:25:08,799][26599] Updated weights for policy 0, policy_version 333844 (0.0024) [2024-06-19 10:25:13,061][26599] Updated weights for policy 0, policy_version 333854 (0.0042) [2024-06-19 10:25:13,384][26367] Fps is (10 sec: 42583.0, 60 sec: 42868.9, 300 sec: 42875.6). Total num frames: 5469880320. Throughput: 0: 42593.9. Samples: 1737532840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:25:13,385][26367] Avg episode reward: [(0, '0.334')] [2024-06-19 10:25:16,512][26599] Updated weights for policy 0, policy_version 333864 (0.0040) [2024-06-19 10:25:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42821.1). Total num frames: 5470109696. Throughput: 0: 42643.8. Samples: 1737655920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:25:18,381][26367] Avg episode reward: [(0, '0.400')] [2024-06-19 10:25:20,630][26599] Updated weights for policy 0, policy_version 333874 (0.0030) [2024-06-19 10:25:23,380][26367] Fps is (10 sec: 44252.6, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5470322688. Throughput: 0: 42736.4. Samples: 1737917220. Policy #0 lag: (min: 0.0, avg: 9.4, max: 19.0) [2024-06-19 10:25:23,381][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 10:25:24,214][26599] Updated weights for policy 0, policy_version 333884 (0.0029) [2024-06-19 10:25:28,195][26599] Updated weights for policy 0, policy_version 333894 (0.0036) [2024-06-19 10:25:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5470519296. Throughput: 0: 42511.1. Samples: 1738168420. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:25:28,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 10:25:31,942][26599] Updated weights for policy 0, policy_version 333904 (0.0033) [2024-06-19 10:25:33,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42874.0, 300 sec: 42709.5). Total num frames: 5470732288. Throughput: 0: 42513.3. Samples: 1738291620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:25:33,381][26367] Avg episode reward: [(0, '0.401')] [2024-06-19 10:25:36,278][26599] Updated weights for policy 0, policy_version 333914 (0.0047) [2024-06-19 10:25:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5470961664. Throughput: 0: 42648.9. Samples: 1738554920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:25:38,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 10:25:38,518][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000333922_5470978048.pth... [2024-06-19 10:25:38,561][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000333294_5460688896.pth [2024-06-19 10:25:39,587][26599] Updated weights for policy 0, policy_version 333924 (0.0034) [2024-06-19 10:25:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42601.0, 300 sec: 42820.5). Total num frames: 5471141888. Throughput: 0: 42608.3. Samples: 1738808700. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:25:43,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 10:25:43,889][26599] Updated weights for policy 0, policy_version 333934 (0.0041) [2024-06-19 10:25:47,070][26599] Updated weights for policy 0, policy_version 333944 (0.0038) [2024-06-19 10:25:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5471387648. Throughput: 0: 42619.6. Samples: 1738933220. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:25:48,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 10:25:51,803][26599] Updated weights for policy 0, policy_version 333954 (0.0039) [2024-06-19 10:25:53,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5471584256. Throughput: 0: 42641.8. Samples: 1739194060. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:25:53,380][26367] Avg episode reward: [(0, '0.782')] [2024-06-19 10:25:53,757][26579] Signal inference workers to stop experience collection... (25650 times) [2024-06-19 10:25:53,758][26579] Signal inference workers to resume experience collection... (25650 times) [2024-06-19 10:25:53,805][26599] InferenceWorker_p0-w0: stopping experience collection (25650 times) [2024-06-19 10:25:53,805][26599] InferenceWorker_p0-w0: resuming experience collection (25650 times) [2024-06-19 10:25:55,180][26599] Updated weights for policy 0, policy_version 333964 (0.0042) [2024-06-19 10:25:58,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42601.0, 300 sec: 42821.0). Total num frames: 5471780864. Throughput: 0: 42727.1. Samples: 1739455400. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:25:58,380][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 10:25:59,375][26599] Updated weights for policy 0, policy_version 333974 (0.0027) [2024-06-19 10:26:02,623][26599] Updated weights for policy 0, policy_version 333984 (0.0031) [2024-06-19 10:26:03,380][26367] Fps is (10 sec: 45874.3, 60 sec: 43144.5, 300 sec: 42931.9). Total num frames: 5472043008. Throughput: 0: 42808.8. Samples: 1739582320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:26:03,381][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 10:26:07,015][26599] Updated weights for policy 0, policy_version 333994 (0.0050) [2024-06-19 10:26:08,384][26367] Fps is (10 sec: 45858.2, 60 sec: 42595.8, 300 sec: 42820.0). Total num frames: 5472239616. 
Throughput: 0: 42749.5. Samples: 1739841100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:26:08,384][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 10:26:10,080][26599] Updated weights for policy 0, policy_version 334004 (0.0029) [2024-06-19 10:26:13,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42327.9, 300 sec: 42765.0). Total num frames: 5472419840. Throughput: 0: 42908.4. Samples: 1740099300. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:26:13,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 10:26:14,546][26599] Updated weights for policy 0, policy_version 334014 (0.0037) [2024-06-19 10:26:17,699][26599] Updated weights for policy 0, policy_version 334024 (0.0044) [2024-06-19 10:26:18,380][26367] Fps is (10 sec: 44252.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5472681984. Throughput: 0: 42866.7. Samples: 1740220620. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:26:18,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 10:26:22,626][26599] Updated weights for policy 0, policy_version 334034 (0.0035) [2024-06-19 10:26:23,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5472878592. Throughput: 0: 42959.2. Samples: 1740488080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:26:23,380][26367] Avg episode reward: [(0, '0.763')] [2024-06-19 10:26:25,203][26599] Updated weights for policy 0, policy_version 334044 (0.0038) [2024-06-19 10:26:28,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5473075200. Throughput: 0: 42908.9. Samples: 1740739600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:26:28,380][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 10:26:30,127][26599] Updated weights for policy 0, policy_version 334054 (0.0036) [2024-06-19 10:26:33,089][26599] Updated weights for policy 0, policy_version 334064 (0.0046) [2024-06-19 10:26:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5473320960. Throughput: 0: 42852.0. Samples: 1740861560. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:26:33,381][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 10:26:37,739][26599] Updated weights for policy 0, policy_version 334074 (0.0043) [2024-06-19 10:26:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.4, 300 sec: 42931.6). Total num frames: 5473517568. Throughput: 0: 43098.1. Samples: 1741133480. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:26:38,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 10:26:40,695][26599] Updated weights for policy 0, policy_version 334084 (0.0042) [2024-06-19 10:26:43,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5473714176. Throughput: 0: 42696.4. Samples: 1741376740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:26:43,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 10:26:45,353][26599] Updated weights for policy 0, policy_version 334094 (0.0023) [2024-06-19 10:26:48,276][26599] Updated weights for policy 0, policy_version 334104 (0.0038) [2024-06-19 10:26:48,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5473959936. Throughput: 0: 42785.9. Samples: 1741507680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:26:48,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 10:26:53,006][26599] Updated weights for policy 0, policy_version 334114 (0.0032) [2024-06-19 10:26:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5474140160. Throughput: 0: 42886.1. Samples: 1741770820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:26:53,381][26367] Avg episode reward: [(0, '0.843')] [2024-06-19 10:26:55,778][26599] Updated weights for policy 0, policy_version 334124 (0.0027) [2024-06-19 10:26:56,731][26579] Signal inference workers to stop experience collection... (25700 times) [2024-06-19 10:26:56,788][26599] InferenceWorker_p0-w0: stopping experience collection (25700 times) [2024-06-19 10:26:56,794][26579] Signal inference workers to resume experience collection... (25700 times) [2024-06-19 10:26:56,805][26599] InferenceWorker_p0-w0: resuming experience collection (25700 times) [2024-06-19 10:26:58,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42871.3, 300 sec: 42820.6). Total num frames: 5474353152. Throughput: 0: 42735.0. Samples: 1742022380. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:26:58,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 10:27:00,647][26599] Updated weights for policy 0, policy_version 334134 (0.0031) [2024-06-19 10:27:03,306][26599] Updated weights for policy 0, policy_version 334144 (0.0040) [2024-06-19 10:27:03,380][26367] Fps is (10 sec: 47513.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5474615296. Throughput: 0: 43015.1. Samples: 1742156300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:03,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 10:27:08,272][26599] Updated weights for policy 0, policy_version 334154 (0.0044) [2024-06-19 10:27:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42327.8, 300 sec: 42820.5). Total num frames: 5474779136. 
Throughput: 0: 42605.5. Samples: 1742405340. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:08,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 10:27:11,618][26599] Updated weights for policy 0, policy_version 334164 (0.0032) [2024-06-19 10:27:13,380][26367] Fps is (10 sec: 37683.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5474992128. Throughput: 0: 42547.6. Samples: 1742654240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:13,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 10:27:15,848][26599] Updated weights for policy 0, policy_version 334174 (0.0050) [2024-06-19 10:27:18,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5475237888. Throughput: 0: 42757.8. Samples: 1742785660. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:18,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 10:27:19,077][26599] Updated weights for policy 0, policy_version 334184 (0.0030) [2024-06-19 10:27:23,367][26599] Updated weights for policy 0, policy_version 334194 (0.0034) [2024-06-19 10:27:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 5475434496. Throughput: 0: 42540.4. Samples: 1743047800. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:23,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 10:27:26,666][26599] Updated weights for policy 0, policy_version 334204 (0.0033) [2024-06-19 10:27:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5475647488. Throughput: 0: 42563.5. Samples: 1743292100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:28,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 10:27:31,096][26599] Updated weights for policy 0, policy_version 334214 (0.0040) [2024-06-19 10:27:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5475860480. 
Throughput: 0: 42547.2. Samples: 1743422300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:33,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 10:27:34,817][26599] Updated weights for policy 0, policy_version 334224 (0.0032) [2024-06-19 10:27:38,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5476057088. Throughput: 0: 42415.2. Samples: 1743679500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:38,380][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 10:27:38,478][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000334233_5476073472.pth... [2024-06-19 10:27:38,556][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000333609_5465849856.pth [2024-06-19 10:27:38,707][26599] Updated weights for policy 0, policy_version 334234 (0.0029) [2024-06-19 10:27:42,242][26599] Updated weights for policy 0, policy_version 334244 (0.0039) [2024-06-19 10:27:43,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5476270080. Throughput: 0: 42414.0. Samples: 1743931000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:43,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 10:27:46,708][26599] Updated weights for policy 0, policy_version 334254 (0.0037) [2024-06-19 10:27:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5476499456. Throughput: 0: 42388.6. Samples: 1744063780. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:48,380][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 10:27:49,757][26599] Updated weights for policy 0, policy_version 334264 (0.0041) [2024-06-19 10:27:53,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5476696064. Throughput: 0: 42425.4. Samples: 1744314480. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:27:53,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 10:27:54,291][26599] Updated weights for policy 0, policy_version 334274 (0.0035) [2024-06-19 10:27:57,722][26599] Updated weights for policy 0, policy_version 334284 (0.0040) [2024-06-19 10:27:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5476925440. Throughput: 0: 42464.3. Samples: 1744565140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:27:58,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 10:28:01,824][26599] Updated weights for policy 0, policy_version 334294 (0.0029) [2024-06-19 10:28:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 41779.2, 300 sec: 42543.4). Total num frames: 5477122048. Throughput: 0: 42426.6. Samples: 1744694860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:03,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 10:28:05,827][26599] Updated weights for policy 0, policy_version 334304 (0.0042) [2024-06-19 10:28:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5477335040. Throughput: 0: 42227.6. Samples: 1744948040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:08,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 10:28:09,357][26599] Updated weights for policy 0, policy_version 334314 (0.0032) [2024-06-19 10:28:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5477548032. Throughput: 0: 42463.6. Samples: 1745202960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:13,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 10:28:13,473][26599] Updated weights for policy 0, policy_version 334324 (0.0053) [2024-06-19 10:28:16,977][26599] Updated weights for policy 0, policy_version 334334 (0.0048) [2024-06-19 10:28:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 5477744640. Throughput: 0: 42483.0. Samples: 1745334040. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:18,381][26367] Avg episode reward: [(0, '0.645')] [2024-06-19 10:28:21,382][26599] Updated weights for policy 0, policy_version 334344 (0.0027) [2024-06-19 10:28:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 5477957632. Throughput: 0: 42429.8. Samples: 1745588840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:23,380][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 10:28:24,675][26599] Updated weights for policy 0, policy_version 334354 (0.0044) [2024-06-19 10:28:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42654.5). Total num frames: 5478187008. Throughput: 0: 42475.9. Samples: 1745842420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:28,381][26367] Avg episode reward: [(0, '0.389')] [2024-06-19 10:28:28,998][26599] Updated weights for policy 0, policy_version 334364 (0.0049) [2024-06-19 10:28:32,254][26599] Updated weights for policy 0, policy_version 334374 (0.0039) [2024-06-19 10:28:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5478400000. Throughput: 0: 42436.4. Samples: 1745973420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:33,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 10:28:34,120][26579] Signal inference workers to stop experience collection... 
(25750 times) [2024-06-19 10:28:34,120][26579] Signal inference workers to resume experience collection... (25750 times) [2024-06-19 10:28:34,142][26599] InferenceWorker_p0-w0: stopping experience collection (25750 times) [2024-06-19 10:28:34,142][26599] InferenceWorker_p0-w0: resuming experience collection (25750 times) [2024-06-19 10:28:36,590][26599] Updated weights for policy 0, policy_version 334384 (0.0040) [2024-06-19 10:28:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5478612992. Throughput: 0: 42576.6. Samples: 1746230420. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:38,381][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 10:28:39,966][26599] Updated weights for policy 0, policy_version 334394 (0.0023) [2024-06-19 10:28:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5478825984. Throughput: 0: 42597.4. Samples: 1746482020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:43,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 10:28:44,163][26599] Updated weights for policy 0, policy_version 334404 (0.0043) [2024-06-19 10:28:47,793][26599] Updated weights for policy 0, policy_version 334414 (0.0051) [2024-06-19 10:28:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5479038976. Throughput: 0: 42574.3. Samples: 1746610700. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:48,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 10:28:52,159][26599] Updated weights for policy 0, policy_version 334424 (0.0046) [2024-06-19 10:28:53,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5479251968. Throughput: 0: 42553.3. Samples: 1746862940. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:53,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 10:28:55,452][26599] Updated weights for policy 0, policy_version 334434 (0.0042) [2024-06-19 10:28:58,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5479448576. Throughput: 0: 42620.4. Samples: 1747120880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:28:58,380][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 10:28:59,631][26599] Updated weights for policy 0, policy_version 334444 (0.0038) [2024-06-19 10:29:03,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5479677952. Throughput: 0: 42577.4. Samples: 1747250020. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:29:03,380][26367] Avg episode reward: [(0, '0.520')] [2024-06-19 10:29:03,496][26599] Updated weights for policy 0, policy_version 334454 (0.0039) [2024-06-19 10:29:07,196][26599] Updated weights for policy 0, policy_version 334464 (0.0046) [2024-06-19 10:29:08,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5479890944. Throughput: 0: 42577.7. Samples: 1747504840. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 10:29:08,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 10:29:11,009][26599] Updated weights for policy 0, policy_version 334474 (0.0039) [2024-06-19 10:29:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5480087552. Throughput: 0: 42795.2. Samples: 1747768200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:13,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 10:29:14,823][26599] Updated weights for policy 0, policy_version 334484 (0.0033) [2024-06-19 10:29:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5480316928. Throughput: 0: 42590.6. Samples: 1747890000. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:18,381][26367] Avg episode reward: [(0, '0.571')] [2024-06-19 10:29:18,874][26599] Updated weights for policy 0, policy_version 334494 (0.0035) [2024-06-19 10:29:22,426][26599] Updated weights for policy 0, policy_version 334504 (0.0032) [2024-06-19 10:29:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5480513536. Throughput: 0: 42517.8. Samples: 1748143720. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:23,380][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 10:29:26,369][26599] Updated weights for policy 0, policy_version 334514 (0.0030) [2024-06-19 10:29:28,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42322.8, 300 sec: 42598.4). Total num frames: 5480726528. Throughput: 0: 42720.5. Samples: 1748404600. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:28,385][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 10:29:30,317][26599] Updated weights for policy 0, policy_version 334524 (0.0038) [2024-06-19 10:29:33,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5480972288. Throughput: 0: 42824.0. Samples: 1748537780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:33,392][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 10:29:33,927][26599] Updated weights for policy 0, policy_version 334534 (0.0033) [2024-06-19 10:29:38,043][26599] Updated weights for policy 0, policy_version 334544 (0.0036) [2024-06-19 10:29:38,380][26367] Fps is (10 sec: 44253.5, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 5481168896. Throughput: 0: 42834.4. Samples: 1748790480. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:38,380][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 10:29:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000334544_5481168896.pth... 
[2024-06-19 10:29:38,461][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000333922_5470978048.pth [2024-06-19 10:29:41,591][26599] Updated weights for policy 0, policy_version 334554 (0.0037) [2024-06-19 10:29:43,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5481381888. Throughput: 0: 42575.4. Samples: 1749036780. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:43,381][26367] Avg episode reward: [(0, '0.708')] [2024-06-19 10:29:46,122][26599] Updated weights for policy 0, policy_version 334564 (0.0036) [2024-06-19 10:29:48,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5481611264. Throughput: 0: 42611.0. Samples: 1749167520. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:48,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 10:29:49,733][26599] Updated weights for policy 0, policy_version 334574 (0.0044) [2024-06-19 10:29:53,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42325.4, 300 sec: 42598.9). Total num frames: 5481791488. Throughput: 0: 42549.4. Samples: 1749419560. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:53,380][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 10:29:53,931][26599] Updated weights for policy 0, policy_version 334584 (0.0027) [2024-06-19 10:29:57,319][26599] Updated weights for policy 0, policy_version 334594 (0.0029) [2024-06-19 10:29:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5482037248. Throughput: 0: 42369.7. Samples: 1749674840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:29:58,381][26367] Avg episode reward: [(0, '0.866')] [2024-06-19 10:29:59,219][26579] Signal inference workers to stop experience collection... 
(25800 times) [2024-06-19 10:29:59,251][26599] InferenceWorker_p0-w0: stopping experience collection (25800 times) [2024-06-19 10:29:59,290][26579] Signal inference workers to resume experience collection... (25800 times) [2024-06-19 10:29:59,290][26599] InferenceWorker_p0-w0: resuming experience collection (25800 times) [2024-06-19 10:30:01,524][26599] Updated weights for policy 0, policy_version 334604 (0.0043) [2024-06-19 10:30:03,384][26367] Fps is (10 sec: 45856.1, 60 sec: 42868.5, 300 sec: 42597.8). Total num frames: 5482250240. Throughput: 0: 42600.7. Samples: 1749807200. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:30:03,385][26367] Avg episode reward: [(0, '0.487')] [2024-06-19 10:30:04,979][26599] Updated weights for policy 0, policy_version 334614 (0.0039) [2024-06-19 10:30:08,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42543.4). Total num frames: 5482430464. Throughput: 0: 42620.9. Samples: 1750061660. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:30:08,380][26367] Avg episode reward: [(0, '0.398')] [2024-06-19 10:30:09,116][26599] Updated weights for policy 0, policy_version 334624 (0.0033) [2024-06-19 10:30:12,448][26599] Updated weights for policy 0, policy_version 334634 (0.0034) [2024-06-19 10:30:13,380][26367] Fps is (10 sec: 42615.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5482676224. Throughput: 0: 42564.4. Samples: 1750319840. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:30:13,392][26367] Avg episode reward: [(0, '0.349')] [2024-06-19 10:30:16,622][26599] Updated weights for policy 0, policy_version 334644 (0.0028) [2024-06-19 10:30:18,384][26367] Fps is (10 sec: 45858.2, 60 sec: 42868.9, 300 sec: 42597.9). Total num frames: 5482889216. Throughput: 0: 42631.3. Samples: 1750456340. 
Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:30:18,393][26367] Avg episode reward: [(0, '0.420')] [2024-06-19 10:30:19,962][26599] Updated weights for policy 0, policy_version 334654 (0.0037) [2024-06-19 10:30:23,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5483085824. Throughput: 0: 42661.3. Samples: 1750710240. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:30:23,381][26367] Avg episode reward: [(0, '0.286')] [2024-06-19 10:30:24,262][26599] Updated weights for policy 0, policy_version 334664 (0.0028) [2024-06-19 10:30:27,568][26599] Updated weights for policy 0, policy_version 334674 (0.0028) [2024-06-19 10:30:28,384][26367] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42653.4). Total num frames: 5483315200. Throughput: 0: 42801.0. Samples: 1750962980. Policy #0 lag: (min: 0.0, avg: 10.5, max: 21.0) [2024-06-19 10:30:28,385][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 10:30:31,914][26599] Updated weights for policy 0, policy_version 334684 (0.0028) [2024-06-19 10:30:33,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5483544576. Throughput: 0: 42851.2. Samples: 1751095820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:30:33,380][26367] Avg episode reward: [(0, '0.747')] [2024-06-19 10:30:35,588][26599] Updated weights for policy 0, policy_version 334694 (0.0043) [2024-06-19 10:30:38,380][26367] Fps is (10 sec: 42614.0, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5483741184. Throughput: 0: 42800.8. Samples: 1751345600. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:30:38,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 10:30:39,805][26599] Updated weights for policy 0, policy_version 334704 (0.0040) [2024-06-19 10:30:43,278][26599] Updated weights for policy 0, policy_version 334714 (0.0031) [2024-06-19 10:30:43,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42869.0, 300 sec: 42597.9). Total num frames: 5483954176. Throughput: 0: 42976.2. Samples: 1751608920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:30:43,385][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 10:30:47,546][26599] Updated weights for policy 0, policy_version 334724 (0.0034) [2024-06-19 10:30:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5484167168. Throughput: 0: 42870.0. Samples: 1751736180. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:30:48,384][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 10:30:50,882][26599] Updated weights for policy 0, policy_version 334734 (0.0036) [2024-06-19 10:30:53,380][26367] Fps is (10 sec: 39336.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5484347392. Throughput: 0: 42718.2. Samples: 1751983980. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:30:53,380][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 10:30:55,174][26599] Updated weights for policy 0, policy_version 334744 (0.0035) [2024-06-19 10:30:58,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5484593152. Throughput: 0: 42634.2. Samples: 1752238380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:30:58,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 10:30:58,462][26599] Updated weights for policy 0, policy_version 334754 (0.0034) [2024-06-19 10:31:02,832][26599] Updated weights for policy 0, policy_version 334764 (0.0034) [2024-06-19 10:31:03,380][26367] Fps is (10 sec: 45874.3, 60 sec: 42601.2, 300 sec: 42598.9). Total num frames: 5484806144. Throughput: 0: 42600.6. Samples: 1752373220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:31:03,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 10:31:05,915][26599] Updated weights for policy 0, policy_version 334774 (0.0034) [2024-06-19 10:31:08,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5485002752. Throughput: 0: 42510.2. Samples: 1752623200. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:31:08,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 10:31:10,483][26599] Updated weights for policy 0, policy_version 334784 (0.0040) [2024-06-19 10:31:13,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5485248512. Throughput: 0: 42496.4. Samples: 1752875160. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:31:13,381][26367] Avg episode reward: [(0, '0.807')] [2024-06-19 10:31:14,105][26599] Updated weights for policy 0, policy_version 334794 (0.0042) [2024-06-19 10:31:18,087][26599] Updated weights for policy 0, policy_version 334804 (0.0034) [2024-06-19 10:31:18,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5485445120. Throughput: 0: 42545.7. Samples: 1753010380. 
Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:31:18,389][26367] Avg episode reward: [(0, '0.838')] [2024-06-19 10:31:21,547][26599] Updated weights for policy 0, policy_version 334814 (0.0052) [2024-06-19 10:31:23,380][26367] Fps is (10 sec: 37683.5, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5485625344. Throughput: 0: 42682.8. Samples: 1753266320. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:31:23,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 10:31:25,536][26599] Updated weights for policy 0, policy_version 334824 (0.0034) [2024-06-19 10:31:28,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42871.4, 300 sec: 42597.9). Total num frames: 5485887488. Throughput: 0: 42537.2. Samples: 1753523100. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:31:28,385][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 10:31:28,982][26599] Updated weights for policy 0, policy_version 334834 (0.0032) [2024-06-19 10:31:33,086][26599] Updated weights for policy 0, policy_version 334844 (0.0031) [2024-06-19 10:31:33,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5486084096. Throughput: 0: 42779.3. Samples: 1753661240. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:31:33,380][26367] Avg episode reward: [(0, '0.598')] [2024-06-19 10:31:36,473][26599] Updated weights for policy 0, policy_version 334854 (0.0037) [2024-06-19 10:31:38,380][26367] Fps is (10 sec: 39336.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5486280704. Throughput: 0: 42887.0. Samples: 1753913900. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:31:38,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 10:31:38,388][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000334856_5486280704.pth... 
[2024-06-19 10:31:38,438][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000334233_5476073472.pth [2024-06-19 10:31:39,522][26579] Signal inference workers to stop experience collection... (25850 times) [2024-06-19 10:31:39,522][26579] Signal inference workers to resume experience collection... (25850 times) [2024-06-19 10:31:39,572][26599] InferenceWorker_p0-w0: stopping experience collection (25850 times) [2024-06-19 10:31:39,573][26599] InferenceWorker_p0-w0: resuming experience collection (25850 times) [2024-06-19 10:31:40,722][26599] Updated weights for policy 0, policy_version 334864 (0.0037) [2024-06-19 10:31:43,380][26367] Fps is (10 sec: 45874.9, 60 sec: 43147.2, 300 sec: 42653.9). Total num frames: 5486542848. Throughput: 0: 42926.7. Samples: 1754170080. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 10:31:43,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 10:31:44,062][26599] Updated weights for policy 0, policy_version 334874 (0.0034) [2024-06-19 10:31:48,286][26599] Updated weights for policy 0, policy_version 334884 (0.0037) [2024-06-19 10:31:48,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5486739456. Throughput: 0: 42977.0. Samples: 1754307180. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:31:48,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 10:31:51,538][26599] Updated weights for policy 0, policy_version 334894 (0.0053) [2024-06-19 10:31:53,380][26367] Fps is (10 sec: 37682.7, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5486919680. Throughput: 0: 42934.1. Samples: 1754555240. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:31:53,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 10:31:55,968][26599] Updated weights for policy 0, policy_version 334904 (0.0034) [2024-06-19 10:31:58,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5487165440. 
Throughput: 0: 43061.3. Samples: 1754812920. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:31:58,381][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 10:31:59,452][26599] Updated weights for policy 0, policy_version 334914 (0.0031) [2024-06-19 10:32:03,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5487362048. Throughput: 0: 43048.0. Samples: 1754947540. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:03,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 10:32:03,739][26599] Updated weights for policy 0, policy_version 334924 (0.0039) [2024-06-19 10:32:07,175][26599] Updated weights for policy 0, policy_version 334934 (0.0032) [2024-06-19 10:32:08,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5487575040. Throughput: 0: 42724.7. Samples: 1755188940. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:08,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 10:32:11,828][26599] Updated weights for policy 0, policy_version 334944 (0.0044) [2024-06-19 10:32:13,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5487820800. Throughput: 0: 42626.2. Samples: 1755441120. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:13,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 10:32:15,171][26599] Updated weights for policy 0, policy_version 334954 (0.0029) [2024-06-19 10:32:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5487984640. Throughput: 0: 42537.8. Samples: 1755575440. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:18,380][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 10:32:19,305][26599] Updated weights for policy 0, policy_version 334964 (0.0035) [2024-06-19 10:32:22,637][26599] Updated weights for policy 0, policy_version 334974 (0.0040) [2024-06-19 10:32:23,384][26367] Fps is (10 sec: 40943.3, 60 sec: 43414.6, 300 sec: 42653.4). Total num frames: 5488230400. Throughput: 0: 42595.3. Samples: 1755830860. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:23,385][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 10:32:27,092][26599] Updated weights for policy 0, policy_version 334984 (0.0035) [2024-06-19 10:32:28,380][26367] Fps is (10 sec: 47512.8, 60 sec: 42874.1, 300 sec: 42709.5). Total num frames: 5488459776. Throughput: 0: 42507.0. Samples: 1756082900. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:28,381][26367] Avg episode reward: [(0, '0.785')] [2024-06-19 10:32:30,711][26599] Updated weights for policy 0, policy_version 334994 (0.0032) [2024-06-19 10:32:33,380][26367] Fps is (10 sec: 40976.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5488640000. Throughput: 0: 42453.4. Samples: 1756217580. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:33,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 10:32:34,705][26599] Updated weights for policy 0, policy_version 335004 (0.0031) [2024-06-19 10:32:38,219][26599] Updated weights for policy 0, policy_version 335014 (0.0037) [2024-06-19 10:32:38,384][26367] Fps is (10 sec: 40944.9, 60 sec: 43141.9, 300 sec: 42708.9). Total num frames: 5488869376. Throughput: 0: 42625.4. Samples: 1756473540. 
Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:38,385][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 10:32:42,292][26599] Updated weights for policy 0, policy_version 335024 (0.0039) [2024-06-19 10:32:43,384][26367] Fps is (10 sec: 47496.3, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5489115136. Throughput: 0: 42532.6. Samples: 1756727040. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:43,385][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 10:32:45,808][26599] Updated weights for policy 0, policy_version 335034 (0.0050) [2024-06-19 10:32:48,380][26367] Fps is (10 sec: 42614.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5489295360. Throughput: 0: 42528.9. Samples: 1756861340. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:48,380][26367] Avg episode reward: [(0, '0.405')] [2024-06-19 10:32:49,891][26599] Updated weights for policy 0, policy_version 335044 (0.0043) [2024-06-19 10:32:53,380][26367] Fps is (10 sec: 37697.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5489491968. Throughput: 0: 42966.3. Samples: 1757122420. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:53,380][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 10:32:53,639][26599] Updated weights for policy 0, policy_version 335054 (0.0028) [2024-06-19 10:32:54,082][26579] Signal inference workers to stop experience collection... (25900 times) [2024-06-19 10:32:54,083][26579] Signal inference workers to resume experience collection... (25900 times) [2024-06-19 10:32:54,120][26599] InferenceWorker_p0-w0: stopping experience collection (25900 times) [2024-06-19 10:32:54,120][26599] InferenceWorker_p0-w0: resuming experience collection (25900 times) [2024-06-19 10:32:57,471][26599] Updated weights for policy 0, policy_version 335064 (0.0035) [2024-06-19 10:32:58,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5489737728. 
Throughput: 0: 42897.7. Samples: 1757371520. Policy #0 lag: (min: 0.0, avg: 11.1, max: 20.0) [2024-06-19 10:32:58,384][26367] Avg episode reward: [(0, '0.679')] [2024-06-19 10:33:01,551][26599] Updated weights for policy 0, policy_version 335074 (0.0036) [2024-06-19 10:33:03,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42868.8, 300 sec: 42708.9). Total num frames: 5489934336. Throughput: 0: 42887.1. Samples: 1757505520. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:03,385][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 10:33:05,075][26599] Updated weights for policy 0, policy_version 335084 (0.0024) [2024-06-19 10:33:08,384][26367] Fps is (10 sec: 40945.5, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 5490147328. Throughput: 0: 42890.2. Samples: 1757760900. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:08,384][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 10:33:09,090][26599] Updated weights for policy 0, policy_version 335094 (0.0033) [2024-06-19 10:33:12,819][26599] Updated weights for policy 0, policy_version 335104 (0.0039) [2024-06-19 10:33:13,380][26367] Fps is (10 sec: 45892.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5490393088. Throughput: 0: 43001.0. Samples: 1758017940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:13,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 10:33:16,831][26599] Updated weights for policy 0, policy_version 335114 (0.0040) [2024-06-19 10:33:18,380][26367] Fps is (10 sec: 44252.5, 60 sec: 43417.5, 300 sec: 42820.5). Total num frames: 5490589696. Throughput: 0: 42884.0. Samples: 1758147360. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:18,381][26367] Avg episode reward: [(0, '0.745')] [2024-06-19 10:33:20,461][26599] Updated weights for policy 0, policy_version 335124 (0.0039) [2024-06-19 10:33:23,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42601.3, 300 sec: 42709.5). Total num frames: 5490786304. 
Throughput: 0: 42975.6. Samples: 1758407280. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:23,381][26367] Avg episode reward: [(0, '0.741')] [2024-06-19 10:33:24,462][26599] Updated weights for policy 0, policy_version 335134 (0.0033) [2024-06-19 10:33:28,151][26599] Updated weights for policy 0, policy_version 335144 (0.0038) [2024-06-19 10:33:28,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5491015680. Throughput: 0: 42905.2. Samples: 1758657620. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:28,381][26367] Avg episode reward: [(0, '0.767')] [2024-06-19 10:33:32,275][26599] Updated weights for policy 0, policy_version 335154 (0.0031) [2024-06-19 10:33:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5491212288. Throughput: 0: 42742.7. Samples: 1758784760. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:33,380][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 10:33:35,794][26599] Updated weights for policy 0, policy_version 335164 (0.0028) [2024-06-19 10:33:38,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42601.1, 300 sec: 42709.5). Total num frames: 5491425280. Throughput: 0: 42383.6. Samples: 1759029680. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:38,381][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 10:33:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000335170_5491425280.pth... [2024-06-19 10:33:38,467][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000334544_5481168896.pth [2024-06-19 10:33:39,960][26599] Updated weights for policy 0, policy_version 335174 (0.0037) [2024-06-19 10:33:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41781.8, 300 sec: 42653.9). Total num frames: 5491621888. Throughput: 0: 42637.9. Samples: 1759290220. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:43,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 10:33:43,601][26599] Updated weights for policy 0, policy_version 335184 (0.0046) [2024-06-19 10:33:47,656][26599] Updated weights for policy 0, policy_version 335194 (0.0039) [2024-06-19 10:33:48,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42322.7, 300 sec: 42653.4). Total num frames: 5491834880. Throughput: 0: 42526.7. Samples: 1759419220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:48,384][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 10:33:51,173][26599] Updated weights for policy 0, policy_version 335204 (0.0032) [2024-06-19 10:33:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5492064256. Throughput: 0: 42284.2. Samples: 1759663540. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:53,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 10:33:55,196][26579] Signal inference workers to stop experience collection... (25950 times) [2024-06-19 10:33:55,196][26579] Signal inference workers to resume experience collection... (25950 times) [2024-06-19 10:33:55,235][26599] InferenceWorker_p0-w0: stopping experience collection (25950 times) [2024-06-19 10:33:55,235][26599] InferenceWorker_p0-w0: resuming experience collection (25950 times) [2024-06-19 10:33:55,346][26599] Updated weights for policy 0, policy_version 335214 (0.0032) [2024-06-19 10:33:58,380][26367] Fps is (10 sec: 42613.0, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5492260864. Throughput: 0: 42421.6. Samples: 1759926920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:33:58,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 10:33:58,863][26599] Updated weights for policy 0, policy_version 335224 (0.0030) [2024-06-19 10:34:03,244][26599] Updated weights for policy 0, policy_version 335234 (0.0024) [2024-06-19 10:34:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42328.0, 300 sec: 42653.9). Total num frames: 5492473856. Throughput: 0: 42368.5. Samples: 1760053940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:34:03,380][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 10:34:06,624][26599] Updated weights for policy 0, policy_version 335244 (0.0033) [2024-06-19 10:34:08,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42874.0, 300 sec: 42820.5). Total num frames: 5492719616. Throughput: 0: 42281.7. Samples: 1760309960. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:34:08,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 10:34:10,752][26599] Updated weights for policy 0, policy_version 335254 (0.0043) [2024-06-19 10:34:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 5492899840. Throughput: 0: 42430.4. Samples: 1760566980. Policy #0 lag: (min: 0.0, avg: 9.1, max: 22.0) [2024-06-19 10:34:13,380][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 10:34:14,205][26599] Updated weights for policy 0, policy_version 335264 (0.0039) [2024-06-19 10:34:18,380][26367] Fps is (10 sec: 37683.2, 60 sec: 41779.2, 300 sec: 42653.9). Total num frames: 5493096448. Throughput: 0: 42461.2. Samples: 1760695520. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:34:18,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 10:34:18,638][26599] Updated weights for policy 0, policy_version 335274 (0.0042) [2024-06-19 10:34:21,950][26599] Updated weights for policy 0, policy_version 335284 (0.0022) [2024-06-19 10:34:23,384][26367] Fps is (10 sec: 44220.2, 60 sec: 42595.8, 300 sec: 42765.0). Total num frames: 5493342208. Throughput: 0: 42676.0. Samples: 1760950260. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:34:23,384][26367] Avg episode reward: [(0, '0.250')] [2024-06-19 10:34:26,182][26599] Updated weights for policy 0, policy_version 335294 (0.0033) [2024-06-19 10:34:28,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5493555200. Throughput: 0: 42619.6. Samples: 1761208100. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:34:28,381][26367] Avg episode reward: [(0, '0.213')] [2024-06-19 10:34:29,600][26599] Updated weights for policy 0, policy_version 335304 (0.0033) [2024-06-19 10:34:33,380][26367] Fps is (10 sec: 39336.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5493735424. Throughput: 0: 42652.9. Samples: 1761338440. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:34:33,380][26367] Avg episode reward: [(0, '0.294')] [2024-06-19 10:34:33,757][26599] Updated weights for policy 0, policy_version 335314 (0.0039) [2024-06-19 10:34:37,348][26599] Updated weights for policy 0, policy_version 335324 (0.0040) [2024-06-19 10:34:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5493981184. Throughput: 0: 42878.2. Samples: 1761593060. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:34:38,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 10:34:41,712][26599] Updated weights for policy 0, policy_version 335334 (0.0034) [2024-06-19 10:34:43,380][26367] Fps is (10 sec: 47512.4, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5494210560. Throughput: 0: 42517.8. Samples: 1761840220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:34:43,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 10:34:44,975][26599] Updated weights for policy 0, policy_version 335344 (0.0036) [2024-06-19 10:34:48,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42327.9, 300 sec: 42653.9). Total num frames: 5494374400. Throughput: 0: 42687.1. Samples: 1761974860. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:34:48,381][26367] Avg episode reward: [(0, '0.502')] [2024-06-19 10:34:49,129][26599] Updated weights for policy 0, policy_version 335354 (0.0042) [2024-06-19 10:34:51,721][26579] Signal inference workers to stop experience collection... (26000 times) [2024-06-19 10:34:51,721][26579] Signal inference workers to resume experience collection... (26000 times) [2024-06-19 10:34:51,741][26599] InferenceWorker_p0-w0: stopping experience collection (26000 times) [2024-06-19 10:34:51,742][26599] InferenceWorker_p0-w0: resuming experience collection (26000 times) [2024-06-19 10:34:52,481][26599] Updated weights for policy 0, policy_version 335364 (0.0040) [2024-06-19 10:34:53,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5494603776. Throughput: 0: 42732.9. Samples: 1762232940. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:34:53,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 10:34:56,660][26599] Updated weights for policy 0, policy_version 335374 (0.0040) [2024-06-19 10:34:58,380][26367] Fps is (10 sec: 47513.2, 60 sec: 43144.6, 300 sec: 42710.1). Total num frames: 5494849536. Throughput: 0: 42832.8. 
Samples: 1762494460. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:34:58,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 10:35:00,182][26599] Updated weights for policy 0, policy_version 335384 (0.0035) [2024-06-19 10:35:03,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5495029760. Throughput: 0: 42850.4. Samples: 1762623780. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:35:03,380][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 10:35:04,222][26599] Updated weights for policy 0, policy_version 335394 (0.0033) [2024-06-19 10:35:08,050][26599] Updated weights for policy 0, policy_version 335404 (0.0026) [2024-06-19 10:35:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5495259136. Throughput: 0: 42950.6. Samples: 1762882880. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:35:08,380][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 10:35:11,874][26599] Updated weights for policy 0, policy_version 335414 (0.0034) [2024-06-19 10:35:13,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 5495472128. Throughput: 0: 42738.6. Samples: 1763131340. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:35:13,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 10:35:15,745][26599] Updated weights for policy 0, policy_version 335424 (0.0034) [2024-06-19 10:35:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5495685120. Throughput: 0: 42766.2. Samples: 1763262920. 
Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:35:18,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 10:35:19,526][26599] Updated weights for policy 0, policy_version 335434 (0.0045) [2024-06-19 10:35:23,111][26599] Updated weights for policy 0, policy_version 335444 (0.0040) [2024-06-19 10:35:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42874.0, 300 sec: 42710.0). Total num frames: 5495914496. Throughput: 0: 42948.4. Samples: 1763525740. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:35:23,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 10:35:27,038][26599] Updated weights for policy 0, policy_version 335454 (0.0026) [2024-06-19 10:35:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5496127488. Throughput: 0: 43164.2. Samples: 1763782600. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:35:28,381][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 10:35:30,920][26599] Updated weights for policy 0, policy_version 335464 (0.0036) [2024-06-19 10:35:33,380][26367] Fps is (10 sec: 42599.2, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5496340480. Throughput: 0: 42985.9. Samples: 1763909220. Policy #0 lag: (min: 0.0, avg: 9.1, max: 21.0) [2024-06-19 10:35:33,380][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 10:35:34,635][26599] Updated weights for policy 0, policy_version 335474 (0.0041) [2024-06-19 10:35:38,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 5496553472. Throughput: 0: 43035.0. Samples: 1764169520. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:35:38,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 10:35:38,414][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000335484_5496569856.pth... 
[2024-06-19 10:35:38,420][26599] Updated weights for policy 0, policy_version 335484 (0.0037) [2024-06-19 10:35:38,486][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000334856_5486280704.pth [2024-06-19 10:35:42,292][26599] Updated weights for policy 0, policy_version 335494 (0.0046) [2024-06-19 10:35:43,381][26367] Fps is (10 sec: 40958.6, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5496750080. Throughput: 0: 42902.1. Samples: 1764425060. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:35:43,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 10:35:46,142][26599] Updated weights for policy 0, policy_version 335504 (0.0039) [2024-06-19 10:35:48,380][26367] Fps is (10 sec: 40961.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5496963072. Throughput: 0: 42754.6. Samples: 1764547740. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:35:48,380][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 10:35:49,762][26599] Updated weights for policy 0, policy_version 335514 (0.0027) [2024-06-19 10:35:53,380][26367] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5497192448. Throughput: 0: 42729.2. Samples: 1764805700. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:35:53,381][26367] Avg episode reward: [(0, '0.441')] [2024-06-19 10:35:54,058][26599] Updated weights for policy 0, policy_version 335524 (0.0040) [2024-06-19 10:35:57,231][26599] Updated weights for policy 0, policy_version 335534 (0.0024) [2024-06-19 10:35:58,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5497405440. Throughput: 0: 42804.9. Samples: 1765057560. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:35:58,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 10:36:01,695][26599] Updated weights for policy 0, policy_version 335544 (0.0034) [2024-06-19 10:36:03,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5497618432. Throughput: 0: 42828.9. Samples: 1765190220. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:03,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 10:36:04,826][26599] Updated weights for policy 0, policy_version 335554 (0.0032) [2024-06-19 10:36:08,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5497815040. Throughput: 0: 42703.7. Samples: 1765447400. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:08,380][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 10:36:09,198][26599] Updated weights for policy 0, policy_version 335564 (0.0040) [2024-06-19 10:36:12,661][26599] Updated weights for policy 0, policy_version 335574 (0.0044) [2024-06-19 10:36:13,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5498060800. Throughput: 0: 42564.4. Samples: 1765698000. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:13,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 10:36:16,757][26599] Updated weights for policy 0, policy_version 335584 (0.0040) [2024-06-19 10:36:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5498257408. Throughput: 0: 42790.1. Samples: 1765834780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:18,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 10:36:20,278][26599] Updated weights for policy 0, policy_version 335594 (0.0029) [2024-06-19 10:36:21,965][26579] Signal inference workers to stop experience collection... 
(26050 times) [2024-06-19 10:36:21,965][26579] Signal inference workers to resume experience collection... (26050 times) [2024-06-19 10:36:22,008][26599] InferenceWorker_p0-w0: stopping experience collection (26050 times) [2024-06-19 10:36:22,008][26599] InferenceWorker_p0-w0: resuming experience collection (26050 times) [2024-06-19 10:36:23,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 5498470400. Throughput: 0: 42668.6. Samples: 1766089600. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:23,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 10:36:24,619][26599] Updated weights for policy 0, policy_version 335604 (0.0031) [2024-06-19 10:36:27,887][26599] Updated weights for policy 0, policy_version 335614 (0.0040) [2024-06-19 10:36:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5498699776. Throughput: 0: 42422.8. Samples: 1766334080. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:28,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 10:36:32,270][26599] Updated weights for policy 0, policy_version 335624 (0.0036) [2024-06-19 10:36:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5498896384. Throughput: 0: 42822.9. Samples: 1766474780. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:33,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 10:36:35,325][26599] Updated weights for policy 0, policy_version 335634 (0.0035) [2024-06-19 10:36:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5499125760. Throughput: 0: 42868.4. Samples: 1766734780. 
Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:38,390][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 10:36:39,965][26599] Updated weights for policy 0, policy_version 335644 (0.0036) [2024-06-19 10:36:43,178][26599] Updated weights for policy 0, policy_version 335654 (0.0029) [2024-06-19 10:36:43,380][26367] Fps is (10 sec: 45875.6, 60 sec: 43417.8, 300 sec: 42765.0). Total num frames: 5499355136. Throughput: 0: 42726.7. Samples: 1766980260. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:43,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 10:36:47,917][26599] Updated weights for policy 0, policy_version 335664 (0.0033) [2024-06-19 10:36:48,380][26367] Fps is (10 sec: 39322.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5499518976. Throughput: 0: 42694.8. Samples: 1767111480. Policy #0 lag: (min: 0.0, avg: 10.0, max: 21.0) [2024-06-19 10:36:48,380][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 10:36:51,467][26599] Updated weights for policy 0, policy_version 335674 (0.0044) [2024-06-19 10:36:53,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5499748352. Throughput: 0: 42652.3. Samples: 1767366760. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:36:53,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 10:36:55,483][26599] Updated weights for policy 0, policy_version 335684 (0.0028) [2024-06-19 10:36:58,380][26367] Fps is (10 sec: 47513.0, 60 sec: 43144.5, 300 sec: 42820.5). Total num frames: 5499994112. Throughput: 0: 42627.1. Samples: 1767616220. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:36:58,381][26367] Avg episode reward: [(0, '0.531')] [2024-06-19 10:36:59,122][26599] Updated weights for policy 0, policy_version 335694 (0.0043) [2024-06-19 10:37:03,037][26599] Updated weights for policy 0, policy_version 335704 (0.0031) [2024-06-19 10:37:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5500174336. Throughput: 0: 42574.7. Samples: 1767750640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:03,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 10:37:06,741][26599] Updated weights for policy 0, policy_version 335714 (0.0041) [2024-06-19 10:37:08,380][26367] Fps is (10 sec: 40959.1, 60 sec: 43144.3, 300 sec: 42653.9). Total num frames: 5500403712. Throughput: 0: 42622.9. Samples: 1768007640. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:08,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 10:37:10,647][26599] Updated weights for policy 0, policy_version 335724 (0.0048) [2024-06-19 10:37:13,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5500633088. Throughput: 0: 42669.5. Samples: 1768254200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:13,380][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 10:37:14,445][26599] Updated weights for policy 0, policy_version 335734 (0.0036) [2024-06-19 10:37:18,380][26367] Fps is (10 sec: 40961.3, 60 sec: 42598.5, 300 sec: 42654.5). Total num frames: 5500813312. Throughput: 0: 42506.8. Samples: 1768387580. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:18,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 10:37:18,538][26599] Updated weights for policy 0, policy_version 335744 (0.0032) [2024-06-19 10:37:22,170][26599] Updated weights for policy 0, policy_version 335754 (0.0029) [2024-06-19 10:37:23,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5501026304. Throughput: 0: 42363.3. Samples: 1768641120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:23,380][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 10:37:26,131][26599] Updated weights for policy 0, policy_version 335764 (0.0030) [2024-06-19 10:37:28,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5501272064. Throughput: 0: 42559.6. Samples: 1768895440. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:28,381][26367] Avg episode reward: [(0, '0.762')] [2024-06-19 10:37:29,831][26599] Updated weights for policy 0, policy_version 335774 (0.0036) [2024-06-19 10:37:33,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 5501452288. Throughput: 0: 42429.6. Samples: 1769020820. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:33,381][26367] Avg episode reward: [(0, '0.517')] [2024-06-19 10:37:33,960][26599] Updated weights for policy 0, policy_version 335784 (0.0044) [2024-06-19 10:37:37,448][26599] Updated weights for policy 0, policy_version 335794 (0.0047) [2024-06-19 10:37:38,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42543.4). Total num frames: 5501665280. Throughput: 0: 42392.9. Samples: 1769274440. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:38,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 10:37:38,497][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000335796_5501681664.pth... 
[2024-06-19 10:37:38,546][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000335170_5491425280.pth [2024-06-19 10:37:41,710][26579] Signal inference workers to stop experience collection... (26100 times) [2024-06-19 10:37:41,765][26599] InferenceWorker_p0-w0: stopping experience collection (26100 times) [2024-06-19 10:37:41,773][26579] Signal inference workers to resume experience collection... (26100 times) [2024-06-19 10:37:41,783][26599] InferenceWorker_p0-w0: resuming experience collection (26100 times) [2024-06-19 10:37:41,911][26599] Updated weights for policy 0, policy_version 335804 (0.0049) [2024-06-19 10:37:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5501878272. Throughput: 0: 42509.7. Samples: 1769529160. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:43,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 10:37:45,271][26599] Updated weights for policy 0, policy_version 335814 (0.0031) [2024-06-19 10:37:48,384][26367] Fps is (10 sec: 40945.6, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5502074880. Throughput: 0: 42355.7. Samples: 1769656800. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:48,384][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 10:37:49,716][26599] Updated weights for policy 0, policy_version 335824 (0.0036) [2024-06-19 10:37:53,260][26599] Updated weights for policy 0, policy_version 335834 (0.0033) [2024-06-19 10:37:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5502304256. Throughput: 0: 42247.3. Samples: 1769908760. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:53,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 10:37:57,341][26599] Updated weights for policy 0, policy_version 335844 (0.0032) [2024-06-19 10:37:58,380][26367] Fps is (10 sec: 44252.7, 60 sec: 42052.3, 300 sec: 42654.5). Total num frames: 5502517248. 
Throughput: 0: 42483.9. Samples: 1770165980. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:37:58,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 10:38:00,800][26599] Updated weights for policy 0, policy_version 335854 (0.0041) [2024-06-19 10:38:03,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42598.9). Total num frames: 5502713856. Throughput: 0: 42406.6. Samples: 1770295880. Policy #0 lag: (min: 1.0, avg: 9.9, max: 21.0) [2024-06-19 10:38:03,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 10:38:05,020][26599] Updated weights for policy 0, policy_version 335864 (0.0038) [2024-06-19 10:38:08,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5502959616. Throughput: 0: 42443.4. Samples: 1770551080. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:08,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 10:38:08,383][26599] Updated weights for policy 0, policy_version 335874 (0.0046) [2024-06-19 10:38:12,713][26599] Updated weights for policy 0, policy_version 335884 (0.0032) [2024-06-19 10:38:13,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5503156224. Throughput: 0: 42421.2. Samples: 1770804400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:13,381][26367] Avg episode reward: [(0, '0.838')] [2024-06-19 10:38:16,032][26599] Updated weights for policy 0, policy_version 335894 (0.0038) [2024-06-19 10:38:18,380][26367] Fps is (10 sec: 39322.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5503352832. Throughput: 0: 42422.4. Samples: 1770929820. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:18,380][26367] Avg episode reward: [(0, '0.383')] [2024-06-19 10:38:20,463][26599] Updated weights for policy 0, policy_version 335904 (0.0037) [2024-06-19 10:38:23,384][26367] Fps is (10 sec: 42583.4, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 5503582208. 
Throughput: 0: 42504.7. Samples: 1771187300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:23,384][26367] Avg episode reward: [(0, '0.387')] [2024-06-19 10:38:23,929][26599] Updated weights for policy 0, policy_version 335914 (0.0029) [2024-06-19 10:38:27,973][26599] Updated weights for policy 0, policy_version 335924 (0.0029) [2024-06-19 10:38:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5503795200. Throughput: 0: 42571.7. Samples: 1771444880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:28,380][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 10:38:31,455][26599] Updated weights for policy 0, policy_version 335934 (0.0038) [2024-06-19 10:38:33,384][26367] Fps is (10 sec: 42598.3, 60 sec: 42595.9, 300 sec: 42653.4). Total num frames: 5504008192. Throughput: 0: 42544.4. Samples: 1771571300. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:33,385][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 10:38:35,771][26599] Updated weights for policy 0, policy_version 335944 (0.0031) [2024-06-19 10:38:38,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5504221184. Throughput: 0: 42738.2. Samples: 1771831980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:38,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 10:38:38,906][26599] Updated weights for policy 0, policy_version 335954 (0.0039) [2024-06-19 10:38:43,380][26367] Fps is (10 sec: 39335.8, 60 sec: 42052.3, 300 sec: 42598.9). Total num frames: 5504401408. Throughput: 0: 42620.4. Samples: 1772083900. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:43,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 10:38:43,690][26599] Updated weights for policy 0, policy_version 335964 (0.0035) [2024-06-19 10:38:46,616][26599] Updated weights for policy 0, policy_version 335974 (0.0036) [2024-06-19 10:38:48,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42873.9, 300 sec: 42653.9). Total num frames: 5504647168. Throughput: 0: 42558.1. Samples: 1772211000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:48,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 10:38:51,172][26599] Updated weights for policy 0, policy_version 335984 (0.0028) [2024-06-19 10:38:53,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5504860160. Throughput: 0: 42709.5. Samples: 1772473000. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:53,380][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 10:38:54,266][26599] Updated weights for policy 0, policy_version 335994 (0.0023) [2024-06-19 10:38:58,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5505056768. Throughput: 0: 42820.9. Samples: 1772731340. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:38:58,381][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 10:38:58,676][26599] Updated weights for policy 0, policy_version 336004 (0.0029) [2024-06-19 10:39:02,211][26599] Updated weights for policy 0, policy_version 336014 (0.0027) [2024-06-19 10:39:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5505286144. Throughput: 0: 42772.8. Samples: 1772854600. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:39:03,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 10:39:06,226][26599] Updated weights for policy 0, policy_version 336024 (0.0037) [2024-06-19 10:39:08,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5505515520. Throughput: 0: 42738.9. Samples: 1773110400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:39:08,388][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 10:39:09,966][26599] Updated weights for policy 0, policy_version 336034 (0.0039) [2024-06-19 10:39:13,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42709.5). Total num frames: 5505695744. Throughput: 0: 42666.2. Samples: 1773364860. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:39:13,380][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 10:39:13,839][26599] Updated weights for policy 0, policy_version 336044 (0.0039) [2024-06-19 10:39:16,658][26579] Signal inference workers to stop experience collection... (26150 times) [2024-06-19 10:39:16,659][26579] Signal inference workers to resume experience collection... (26150 times) [2024-06-19 10:39:16,704][26599] InferenceWorker_p0-w0: stopping experience collection (26150 times) [2024-06-19 10:39:16,704][26599] InferenceWorker_p0-w0: resuming experience collection (26150 times) [2024-06-19 10:39:17,637][26599] Updated weights for policy 0, policy_version 336054 (0.0036) [2024-06-19 10:39:18,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 5505925120. Throughput: 0: 42602.6. Samples: 1773488260. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 10:39:18,381][26367] Avg episode reward: [(0, '0.467')] [2024-06-19 10:39:21,638][26599] Updated weights for policy 0, policy_version 336064 (0.0038) [2024-06-19 10:39:23,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42327.9, 300 sec: 42598.4). Total num frames: 5506121728. 
Throughput: 0: 42575.6. Samples: 1773747880. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:39:23,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 10:39:25,230][26599] Updated weights for policy 0, policy_version 336074 (0.0037) [2024-06-19 10:39:28,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5506334720. Throughput: 0: 42639.1. Samples: 1774002660. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:39:28,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 10:39:29,287][26599] Updated weights for policy 0, policy_version 336084 (0.0029) [2024-06-19 10:39:33,068][26599] Updated weights for policy 0, policy_version 336094 (0.0039) [2024-06-19 10:39:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42601.0, 300 sec: 42653.9). Total num frames: 5506564096. Throughput: 0: 42586.3. Samples: 1774127380. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:39:33,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 10:39:37,342][26599] Updated weights for policy 0, policy_version 336104 (0.0032) [2024-06-19 10:39:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5506760704. Throughput: 0: 42353.3. Samples: 1774378900. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:39:38,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 10:39:38,417][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000336107_5506777088.pth... [2024-06-19 10:39:38,485][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000335484_5496569856.pth [2024-06-19 10:39:41,166][26599] Updated weights for policy 0, policy_version 336114 (0.0031) [2024-06-19 10:39:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5506973696. Throughput: 0: 42233.0. Samples: 1774631820. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:39:43,380][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 10:39:44,845][26599] Updated weights for policy 0, policy_version 336124 (0.0031) [2024-06-19 10:39:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5507186688. Throughput: 0: 42344.4. Samples: 1774760100. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:39:48,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 10:39:48,762][26599] Updated weights for policy 0, policy_version 336134 (0.0039) [2024-06-19 10:39:52,509][26599] Updated weights for policy 0, policy_version 336144 (0.0037) [2024-06-19 10:39:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5507399680. Throughput: 0: 42319.3. Samples: 1775014760. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:39:53,380][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 10:39:56,376][26599] Updated weights for policy 0, policy_version 336154 (0.0037) [2024-06-19 10:39:58,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5507612672. Throughput: 0: 42407.7. Samples: 1775273220. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:39:58,381][26367] Avg episode reward: [(0, '0.332')] [2024-06-19 10:40:00,157][26599] Updated weights for policy 0, policy_version 336164 (0.0037) [2024-06-19 10:40:03,380][26367] Fps is (10 sec: 44235.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5507842048. Throughput: 0: 42475.4. Samples: 1775399660. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:40:03,381][26367] Avg episode reward: [(0, '0.332')] [2024-06-19 10:40:04,282][26599] Updated weights for policy 0, policy_version 336174 (0.0031) [2024-06-19 10:40:07,768][26599] Updated weights for policy 0, policy_version 336184 (0.0038) [2024-06-19 10:40:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5508038656. Throughput: 0: 42328.8. Samples: 1775652680. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:40:08,381][26367] Avg episode reward: [(0, '0.348')] [2024-06-19 10:40:12,040][26599] Updated weights for policy 0, policy_version 336194 (0.0030) [2024-06-19 10:40:13,380][26367] Fps is (10 sec: 37684.0, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5508218880. Throughput: 0: 42418.8. Samples: 1775911500. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:40:13,380][26367] Avg episode reward: [(0, '0.377')] [2024-06-19 10:40:15,707][26599] Updated weights for policy 0, policy_version 336204 (0.0040) [2024-06-19 10:40:18,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5508464640. Throughput: 0: 42272.4. Samples: 1776029640. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:40:18,381][26367] Avg episode reward: [(0, '0.386')] [2024-06-19 10:40:20,427][26599] Updated weights for policy 0, policy_version 336214 (0.0035) [2024-06-19 10:40:23,285][26599] Updated weights for policy 0, policy_version 336224 (0.0036) [2024-06-19 10:40:23,386][26367] Fps is (10 sec: 47486.7, 60 sec: 42867.4, 300 sec: 42597.6). Total num frames: 5508694016. Throughput: 0: 42414.7. Samples: 1776287800. 
Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:40:23,386][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 10:40:28,179][26599] Updated weights for policy 0, policy_version 336234 (0.0031) [2024-06-19 10:40:28,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 5508857856. Throughput: 0: 42579.9. Samples: 1776547920. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:40:28,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 10:40:29,645][26579] Signal inference workers to stop experience collection... (26200 times) [2024-06-19 10:40:29,645][26579] Signal inference workers to resume experience collection... (26200 times) [2024-06-19 10:40:29,662][26599] InferenceWorker_p0-w0: stopping experience collection (26200 times) [2024-06-19 10:40:29,671][26599] InferenceWorker_p0-w0: resuming experience collection (26200 times) [2024-06-19 10:40:31,174][26599] Updated weights for policy 0, policy_version 336244 (0.0036) [2024-06-19 10:40:33,380][26367] Fps is (10 sec: 42622.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5509120000. Throughput: 0: 42398.7. Samples: 1776668040. Policy #0 lag: (min: 1.0, avg: 10.2, max: 21.0) [2024-06-19 10:40:33,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 10:40:35,748][26599] Updated weights for policy 0, policy_version 336254 (0.0039) [2024-06-19 10:40:38,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5509300224. Throughput: 0: 42555.1. Samples: 1776929740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:40:38,380][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 10:40:38,878][26599] Updated weights for policy 0, policy_version 336264 (0.0032) [2024-06-19 10:40:43,262][26599] Updated weights for policy 0, policy_version 336274 (0.0024) [2024-06-19 10:40:43,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5509513216. 
Throughput: 0: 42443.2. Samples: 1777183160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:40:43,381][26367] Avg episode reward: [(0, '0.766')] [2024-06-19 10:40:46,435][26599] Updated weights for policy 0, policy_version 336284 (0.0037) [2024-06-19 10:40:48,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5509758976. Throughput: 0: 42380.1. Samples: 1777306760. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:40:48,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 10:40:50,886][26599] Updated weights for policy 0, policy_version 336294 (0.0030) [2024-06-19 10:40:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5509955584. Throughput: 0: 42526.3. Samples: 1777566360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:40:53,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 10:40:54,274][26599] Updated weights for policy 0, policy_version 336304 (0.0041) [2024-06-19 10:40:58,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 5510152192. Throughput: 0: 42414.6. Samples: 1777820160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:40:58,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 10:40:58,592][26599] Updated weights for policy 0, policy_version 336314 (0.0033) [2024-06-19 10:41:01,966][26599] Updated weights for policy 0, policy_version 336324 (0.0042) [2024-06-19 10:41:03,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5510397952. Throughput: 0: 42541.4. Samples: 1777944000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:03,382][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 10:41:06,651][26599] Updated weights for policy 0, policy_version 336334 (0.0033) [2024-06-19 10:41:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5510594560. 
Throughput: 0: 42667.1. Samples: 1778207580. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:08,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 10:41:09,563][26599] Updated weights for policy 0, policy_version 336344 (0.0033) [2024-06-19 10:41:13,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42487.3). Total num frames: 5510791168. Throughput: 0: 42440.0. Samples: 1778457720. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:13,381][26367] Avg episode reward: [(0, '0.376')] [2024-06-19 10:41:14,116][26599] Updated weights for policy 0, policy_version 336354 (0.0037) [2024-06-19 10:41:17,264][26599] Updated weights for policy 0, policy_version 336364 (0.0035) [2024-06-19 10:41:18,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5511036928. Throughput: 0: 42564.5. Samples: 1778583440. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:18,380][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 10:41:21,682][26599] Updated weights for policy 0, policy_version 336374 (0.0037) [2024-06-19 10:41:23,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42329.2, 300 sec: 42487.3). Total num frames: 5511233536. Throughput: 0: 42581.2. Samples: 1778845900. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:23,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 10:41:25,070][26599] Updated weights for policy 0, policy_version 336384 (0.0032) [2024-06-19 10:41:28,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5511430144. Throughput: 0: 42519.7. Samples: 1779096540. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:28,380][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 10:41:29,222][26599] Updated weights for policy 0, policy_version 336394 (0.0036) [2024-06-19 10:41:32,605][26599] Updated weights for policy 0, policy_version 336404 (0.0036) [2024-06-19 10:41:33,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5511675904. Throughput: 0: 42710.7. Samples: 1779228740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:33,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 10:41:37,179][26599] Updated weights for policy 0, policy_version 336414 (0.0034) [2024-06-19 10:41:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42376.2). Total num frames: 5511856128. Throughput: 0: 42634.7. Samples: 1779484920. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:38,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 10:41:38,392][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000336417_5511856128.pth... [2024-06-19 10:41:38,485][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000335796_5501681664.pth [2024-06-19 10:41:40,406][26599] Updated weights for policy 0, policy_version 336424 (0.0031) [2024-06-19 10:41:43,384][26367] Fps is (10 sec: 39306.9, 60 sec: 42595.9, 300 sec: 42542.3). Total num frames: 5512069120. Throughput: 0: 42549.4. Samples: 1779735040. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:43,385][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 10:41:44,758][26599] Updated weights for policy 0, policy_version 336434 (0.0043) [2024-06-19 10:41:48,159][26599] Updated weights for policy 0, policy_version 336444 (0.0050) [2024-06-19 10:41:48,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5512298496. Throughput: 0: 42708.3. Samples: 1779865880. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:48,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 10:41:52,334][26599] Updated weights for policy 0, policy_version 336454 (0.0050) [2024-06-19 10:41:53,380][26367] Fps is (10 sec: 42613.7, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 5512495104. Throughput: 0: 42371.1. Samples: 1780114280. Policy #0 lag: (min: 0.0, avg: 9.5, max: 19.0) [2024-06-19 10:41:53,381][26367] Avg episode reward: [(0, '0.847')] [2024-06-19 10:41:55,715][26599] Updated weights for policy 0, policy_version 336464 (0.0035) [2024-06-19 10:41:58,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5512708096. Throughput: 0: 42524.9. Samples: 1780371340. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:41:58,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 10:41:59,802][26599] Updated weights for policy 0, policy_version 336474 (0.0037) [2024-06-19 10:42:03,284][26599] Updated weights for policy 0, policy_version 336484 (0.0032) [2024-06-19 10:42:03,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5512953856. Throughput: 0: 42700.7. Samples: 1780504980. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:03,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 10:42:07,803][26599] Updated weights for policy 0, policy_version 336494 (0.0041) [2024-06-19 10:42:08,384][26367] Fps is (10 sec: 42582.6, 60 sec: 42322.8, 300 sec: 42375.7). Total num frames: 5513134080. Throughput: 0: 42478.4. Samples: 1780757580. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:08,384][26367] Avg episode reward: [(0, '0.707')] [2024-06-19 10:42:08,787][26579] Signal inference workers to stop experience collection... (26250 times) [2024-06-19 10:42:08,787][26579] Signal inference workers to resume experience collection... 
(26250 times) [2024-06-19 10:42:08,799][26599] InferenceWorker_p0-w0: stopping experience collection (26250 times) [2024-06-19 10:42:08,799][26599] InferenceWorker_p0-w0: resuming experience collection (26250 times) [2024-06-19 10:42:11,180][26599] Updated weights for policy 0, policy_version 336504 (0.0043) [2024-06-19 10:42:13,384][26367] Fps is (10 sec: 40945.6, 60 sec: 42868.9, 300 sec: 42542.3). Total num frames: 5513363456. Throughput: 0: 42465.0. Samples: 1781007620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:13,384][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 10:42:15,279][26599] Updated weights for policy 0, policy_version 336514 (0.0023) [2024-06-19 10:42:18,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5513576448. Throughput: 0: 42540.3. Samples: 1781143060. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:18,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 10:42:19,101][26599] Updated weights for policy 0, policy_version 336524 (0.0026) [2024-06-19 10:42:22,969][26599] Updated weights for policy 0, policy_version 336534 (0.0042) [2024-06-19 10:42:23,380][26367] Fps is (10 sec: 40974.5, 60 sec: 42325.3, 300 sec: 42376.2). Total num frames: 5513773056. Throughput: 0: 42455.5. Samples: 1781395420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:23,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 10:42:26,611][26599] Updated weights for policy 0, policy_version 336544 (0.0023) [2024-06-19 10:42:28,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5514002432. Throughput: 0: 42546.2. Samples: 1781649460. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:28,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 10:42:30,969][26599] Updated weights for policy 0, policy_version 336554 (0.0039) [2024-06-19 10:42:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.1, 300 sec: 42487.3). Total num frames: 5514199040. Throughput: 0: 42493.4. Samples: 1781778080. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:33,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 10:42:34,353][26599] Updated weights for policy 0, policy_version 336564 (0.0034) [2024-06-19 10:42:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5514412032. Throughput: 0: 42652.1. Samples: 1782033620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:38,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 10:42:38,501][26599] Updated weights for policy 0, policy_version 336574 (0.0035) [2024-06-19 10:42:41,940][26599] Updated weights for policy 0, policy_version 336584 (0.0045) [2024-06-19 10:42:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42327.9, 300 sec: 42487.8). Total num frames: 5514608640. Throughput: 0: 42573.3. Samples: 1782287140. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:43,381][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 10:42:46,153][26599] Updated weights for policy 0, policy_version 336594 (0.0032) [2024-06-19 10:42:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5514838016. Throughput: 0: 42354.3. Samples: 1782410920. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:48,381][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 10:42:49,872][26599] Updated weights for policy 0, policy_version 336604 (0.0042) [2024-06-19 10:42:53,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5515051008. Throughput: 0: 42439.8. Samples: 1782667220. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:53,392][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 10:42:53,752][26599] Updated weights for policy 0, policy_version 336614 (0.0026) [2024-06-19 10:42:57,469][26599] Updated weights for policy 0, policy_version 336624 (0.0039) [2024-06-19 10:42:58,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5515264000. Throughput: 0: 42539.4. Samples: 1782921740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:42:58,389][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 10:43:01,337][26599] Updated weights for policy 0, policy_version 336634 (0.0038) [2024-06-19 10:43:03,380][26367] Fps is (10 sec: 40960.5, 60 sec: 41779.3, 300 sec: 42376.3). Total num frames: 5515460608. Throughput: 0: 42372.5. Samples: 1783049820. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:43:03,381][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 10:43:05,138][26599] Updated weights for policy 0, policy_version 336644 (0.0037) [2024-06-19 10:43:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42601.0, 300 sec: 42487.3). Total num frames: 5515689984. Throughput: 0: 42468.9. Samples: 1783306520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:43:08,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 10:43:08,970][26599] Updated weights for policy 0, policy_version 336654 (0.0038) [2024-06-19 10:43:12,734][26599] Updated weights for policy 0, policy_version 336664 (0.0036) [2024-06-19 10:43:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42327.9, 300 sec: 42542.9). Total num frames: 5515902976. Throughput: 0: 42470.3. Samples: 1783560620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:13,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 10:43:16,663][26599] Updated weights for policy 0, policy_version 336674 (0.0041) [2024-06-19 10:43:18,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.3, 300 sec: 42487.8). Total num frames: 5516115968. Throughput: 0: 42543.1. Samples: 1783692520. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:18,381][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 10:43:20,975][26599] Updated weights for policy 0, policy_version 336684 (0.0033) [2024-06-19 10:43:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5516328960. Throughput: 0: 42524.4. Samples: 1783947220. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:23,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 10:43:24,265][26599] Updated weights for policy 0, policy_version 336694 (0.0041) [2024-06-19 10:43:28,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42487.8). Total num frames: 5516541952. Throughput: 0: 42466.5. Samples: 1784198140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:28,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 10:43:28,582][26599] Updated weights for policy 0, policy_version 336704 (0.0031) [2024-06-19 10:43:31,550][26579] Signal inference workers to stop experience collection... (26300 times) [2024-06-19 10:43:31,604][26599] InferenceWorker_p0-w0: stopping experience collection (26300 times) [2024-06-19 10:43:31,665][26579] Signal inference workers to resume experience collection... (26300 times) [2024-06-19 10:43:31,666][26599] InferenceWorker_p0-w0: resuming experience collection (26300 times) [2024-06-19 10:43:31,812][26599] Updated weights for policy 0, policy_version 336714 (0.0032) [2024-06-19 10:43:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5516754944. 
Throughput: 0: 42643.5. Samples: 1784329880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:33,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 10:43:36,098][26599] Updated weights for policy 0, policy_version 336724 (0.0032) [2024-06-19 10:43:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5516967936. Throughput: 0: 42696.9. Samples: 1784588580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:38,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 10:43:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000336729_5516967936.pth... [2024-06-19 10:43:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000336107_5506777088.pth [2024-06-19 10:43:39,432][26599] Updated weights for policy 0, policy_version 336734 (0.0041) [2024-06-19 10:43:43,380][26367] Fps is (10 sec: 44237.2, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 5517197312. Throughput: 0: 42589.8. Samples: 1784838280. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:43,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 10:43:43,655][26599] Updated weights for policy 0, policy_version 336744 (0.0037) [2024-06-19 10:43:47,598][26599] Updated weights for policy 0, policy_version 336754 (0.0035) [2024-06-19 10:43:48,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5517393920. Throughput: 0: 42749.8. Samples: 1784973560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:48,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 10:43:51,594][26599] Updated weights for policy 0, policy_version 336764 (0.0033) [2024-06-19 10:43:53,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 5517590528. Throughput: 0: 42564.5. Samples: 1785221920. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:53,380][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 10:43:55,165][26599] Updated weights for policy 0, policy_version 336774 (0.0039) [2024-06-19 10:43:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5517836288. Throughput: 0: 42611.9. Samples: 1785478160. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:43:58,381][26367] Avg episode reward: [(0, '0.763')] [2024-06-19 10:43:59,080][26599] Updated weights for policy 0, policy_version 336784 (0.0037) [2024-06-19 10:44:02,597][26599] Updated weights for policy 0, policy_version 336794 (0.0052) [2024-06-19 10:44:03,380][26367] Fps is (10 sec: 45875.1, 60 sec: 43144.6, 300 sec: 42487.3). Total num frames: 5518049280. Throughput: 0: 42643.2. Samples: 1785611460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:44:03,381][26367] Avg episode reward: [(0, '0.806')] [2024-06-19 10:44:06,621][26599] Updated weights for policy 0, policy_version 336804 (0.0034) [2024-06-19 10:44:08,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5518229504. Throughput: 0: 42532.5. Samples: 1785861180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:44:08,381][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 10:44:10,337][26599] Updated weights for policy 0, policy_version 336814 (0.0026) [2024-06-19 10:44:13,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5518458880. Throughput: 0: 42589.9. Samples: 1786114680. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:44:13,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 10:44:14,955][26599] Updated weights for policy 0, policy_version 336824 (0.0030) [2024-06-19 10:44:18,162][26599] Updated weights for policy 0, policy_version 336834 (0.0047) [2024-06-19 10:44:18,380][26367] Fps is (10 sec: 47514.1, 60 sec: 43144.7, 300 sec: 42653.9). Total num frames: 5518704640. Throughput: 0: 42742.9. Samples: 1786253300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:44:18,380][26367] Avg episode reward: [(0, '0.442')] [2024-06-19 10:44:22,470][26599] Updated weights for policy 0, policy_version 336844 (0.0041) [2024-06-19 10:44:23,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5518868480. Throughput: 0: 42608.6. Samples: 1786505960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:44:23,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 10:44:25,701][26599] Updated weights for policy 0, policy_version 336854 (0.0025) [2024-06-19 10:44:28,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5519097856. Throughput: 0: 42677.8. Samples: 1786758780. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:44:28,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 10:44:30,054][26599] Updated weights for policy 0, policy_version 336864 (0.0034) [2024-06-19 10:44:33,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5519327232. Throughput: 0: 42546.7. Samples: 1786888160. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:44:33,381][26367] Avg episode reward: [(0, '0.472')] [2024-06-19 10:44:33,458][26599] Updated weights for policy 0, policy_version 336874 (0.0033) [2024-06-19 10:44:36,719][26579] Signal inference workers to stop experience collection... 
(26350 times) [2024-06-19 10:44:36,760][26599] InferenceWorker_p0-w0: stopping experience collection (26350 times) [2024-06-19 10:44:36,774][26579] Signal inference workers to resume experience collection... (26350 times) [2024-06-19 10:44:36,785][26599] InferenceWorker_p0-w0: resuming experience collection (26350 times) [2024-06-19 10:44:37,816][26599] Updated weights for policy 0, policy_version 336884 (0.0043) [2024-06-19 10:44:38,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 5519523840. Throughput: 0: 42525.3. Samples: 1787135560. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:44:38,381][26367] Avg episode reward: [(0, '0.756')] [2024-06-19 10:44:41,259][26599] Updated weights for policy 0, policy_version 336894 (0.0042) [2024-06-19 10:44:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5519753216. Throughput: 0: 42569.8. Samples: 1787393800. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:44:43,381][26367] Avg episode reward: [(0, '0.837')] [2024-06-19 10:44:45,355][26599] Updated weights for policy 0, policy_version 336904 (0.0035) [2024-06-19 10:44:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5519933440. Throughput: 0: 42397.3. Samples: 1787519340. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:44:48,381][26367] Avg episode reward: [(0, '0.859')] [2024-06-19 10:44:48,951][26599] Updated weights for policy 0, policy_version 336914 (0.0033) [2024-06-19 10:44:53,188][26599] Updated weights for policy 0, policy_version 336924 (0.0042) [2024-06-19 10:44:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5520162816. Throughput: 0: 42604.1. Samples: 1787778360. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:44:53,380][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 10:44:56,546][26599] Updated weights for policy 0, policy_version 336934 (0.0033) [2024-06-19 10:44:58,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5520392192. Throughput: 0: 42672.0. Samples: 1788034920. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:44:58,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 10:45:00,659][26599] Updated weights for policy 0, policy_version 336944 (0.0031) [2024-06-19 10:45:03,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5520588800. Throughput: 0: 42481.7. Samples: 1788164980. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:45:03,381][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 10:45:04,097][26599] Updated weights for policy 0, policy_version 336954 (0.0030) [2024-06-19 10:45:08,261][26599] Updated weights for policy 0, policy_version 336964 (0.0028) [2024-06-19 10:45:08,380][26367] Fps is (10 sec: 42597.9, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 5520818176. Throughput: 0: 42703.9. Samples: 1788427640. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:45:08,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 10:45:11,883][26599] Updated weights for policy 0, policy_version 336974 (0.0037) [2024-06-19 10:45:13,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5521031168. Throughput: 0: 42762.7. Samples: 1788683100. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:45:13,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 10:45:15,693][26599] Updated weights for policy 0, policy_version 336984 (0.0036) [2024-06-19 10:45:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.3, 300 sec: 42543.7). Total num frames: 5521244160. Throughput: 0: 42883.5. Samples: 1788817920. 
Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:45:18,381][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 10:45:19,369][26599] Updated weights for policy 0, policy_version 336994 (0.0042) [2024-06-19 10:45:23,195][26599] Updated weights for policy 0, policy_version 337004 (0.0044) [2024-06-19 10:45:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5521473536. Throughput: 0: 43066.2. Samples: 1789073540. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:45:23,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 10:45:27,008][26599] Updated weights for policy 0, policy_version 337014 (0.0039) [2024-06-19 10:45:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5521686528. Throughput: 0: 42942.1. Samples: 1789326200. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:45:28,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 10:45:30,855][26599] Updated weights for policy 0, policy_version 337024 (0.0029) [2024-06-19 10:45:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5521883136. Throughput: 0: 42948.5. Samples: 1789452020. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:45:33,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 10:45:34,105][26579] Signal inference workers to stop experience collection... (26400 times) [2024-06-19 10:45:34,140][26599] InferenceWorker_p0-w0: stopping experience collection (26400 times) [2024-06-19 10:45:34,164][26579] Signal inference workers to resume experience collection... 
(26400 times) [2024-06-19 10:45:34,168][26599] InferenceWorker_p0-w0: resuming experience collection (26400 times) [2024-06-19 10:45:34,654][26599] Updated weights for policy 0, policy_version 337034 (0.0029) [2024-06-19 10:45:38,317][26599] Updated weights for policy 0, policy_version 337044 (0.0037) [2024-06-19 10:45:38,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5522128896. Throughput: 0: 43099.4. Samples: 1789717840. Policy #0 lag: (min: 0.0, avg: 12.3, max: 23.0) [2024-06-19 10:45:38,381][26367] Avg episode reward: [(0, '0.530')] [2024-06-19 10:45:38,404][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000337044_5522128896.pth... [2024-06-19 10:45:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000336417_5511856128.pth [2024-06-19 10:45:42,477][26599] Updated weights for policy 0, policy_version 337054 (0.0036) [2024-06-19 10:45:43,384][26367] Fps is (10 sec: 44220.6, 60 sec: 42868.8, 300 sec: 42597.9). Total num frames: 5522325504. Throughput: 0: 43028.1. Samples: 1789971340. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:45:43,385][26367] Avg episode reward: [(0, '0.348')] [2024-06-19 10:45:45,968][26599] Updated weights for policy 0, policy_version 337064 (0.0040) [2024-06-19 10:45:48,384][26367] Fps is (10 sec: 37669.6, 60 sec: 42868.9, 300 sec: 42542.3). Total num frames: 5522505728. Throughput: 0: 42791.2. Samples: 1790090740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:45:48,385][26367] Avg episode reward: [(0, '0.402')] [2024-06-19 10:45:50,331][26599] Updated weights for policy 0, policy_version 337074 (0.0040) [2024-06-19 10:45:53,384][26367] Fps is (10 sec: 40960.2, 60 sec: 42868.8, 300 sec: 42653.4). Total num frames: 5522735104. Throughput: 0: 42719.8. Samples: 1790350180. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:45:53,384][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 10:45:54,119][26599] Updated weights for policy 0, policy_version 337084 (0.0041) [2024-06-19 10:45:57,885][26599] Updated weights for policy 0, policy_version 337094 (0.0030) [2024-06-19 10:45:58,380][26367] Fps is (10 sec: 44253.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5522948096. Throughput: 0: 42803.1. Samples: 1790609240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:45:58,380][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 10:46:01,594][26599] Updated weights for policy 0, policy_version 337104 (0.0031) [2024-06-19 10:46:03,380][26367] Fps is (10 sec: 42613.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5523161088. Throughput: 0: 42751.5. Samples: 1790741740. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:03,381][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 10:46:05,829][26599] Updated weights for policy 0, policy_version 337114 (0.0046) [2024-06-19 10:46:08,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.7, 300 sec: 42709.5). Total num frames: 5523390464. Throughput: 0: 42739.7. Samples: 1790996820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:08,380][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 10:46:09,027][26599] Updated weights for policy 0, policy_version 337124 (0.0027) [2024-06-19 10:46:13,365][26599] Updated weights for policy 0, policy_version 337134 (0.0043) [2024-06-19 10:46:13,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5523603456. Throughput: 0: 42865.1. Samples: 1791255120. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:13,381][26367] Avg episode reward: [(0, '0.205')] [2024-06-19 10:46:16,723][26599] Updated weights for policy 0, policy_version 337144 (0.0026) [2024-06-19 10:46:18,384][26367] Fps is (10 sec: 42582.5, 60 sec: 42868.9, 300 sec: 42653.4). Total num frames: 5523816448. Throughput: 0: 42953.9. Samples: 1791385100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:18,384][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 10:46:20,847][26599] Updated weights for policy 0, policy_version 337154 (0.0046) [2024-06-19 10:46:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5524029440. Throughput: 0: 42651.6. Samples: 1791637160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:23,381][26367] Avg episode reward: [(0, '0.522')] [2024-06-19 10:46:24,530][26599] Updated weights for policy 0, policy_version 337164 (0.0023) [2024-06-19 10:46:28,380][26367] Fps is (10 sec: 42614.3, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5524242432. Throughput: 0: 42842.3. Samples: 1791899080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:28,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 10:46:28,491][26599] Updated weights for policy 0, policy_version 337174 (0.0038) [2024-06-19 10:46:32,456][26599] Updated weights for policy 0, policy_version 337184 (0.0039) [2024-06-19 10:46:33,384][26367] Fps is (10 sec: 44220.8, 60 sec: 43142.0, 300 sec: 42764.5). Total num frames: 5524471808. Throughput: 0: 43104.5. Samples: 1792030440. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:33,384][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 10:46:36,056][26599] Updated weights for policy 0, policy_version 337194 (0.0036) [2024-06-19 10:46:38,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42325.3, 300 sec: 42710.0). Total num frames: 5524668416. Throughput: 0: 42902.0. Samples: 1792280620. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:38,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 10:46:39,911][26599] Updated weights for policy 0, policy_version 337204 (0.0030) [2024-06-19 10:46:43,380][26367] Fps is (10 sec: 39336.3, 60 sec: 42328.0, 300 sec: 42598.4). Total num frames: 5524865024. Throughput: 0: 42901.8. Samples: 1792539820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:43,380][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 10:46:43,925][26599] Updated weights for policy 0, policy_version 337214 (0.0035) [2024-06-19 10:46:47,563][26599] Updated weights for policy 0, policy_version 337224 (0.0036) [2024-06-19 10:46:48,380][26367] Fps is (10 sec: 45875.3, 60 sec: 43693.3, 300 sec: 42820.6). Total num frames: 5525127168. Throughput: 0: 42857.3. Samples: 1792670320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:48,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 10:46:51,744][26599] Updated weights for policy 0, policy_version 337234 (0.0035) [2024-06-19 10:46:53,380][26367] Fps is (10 sec: 45874.6, 60 sec: 43147.2, 300 sec: 42765.0). Total num frames: 5525323776. Throughput: 0: 42971.5. Samples: 1792930540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:53,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 10:46:55,340][26599] Updated weights for policy 0, policy_version 337244 (0.0043) [2024-06-19 10:46:58,380][26367] Fps is (10 sec: 37683.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5525504000. Throughput: 0: 42905.8. Samples: 1793185880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 20.0) [2024-06-19 10:46:58,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 10:46:59,406][26599] Updated weights for policy 0, policy_version 337254 (0.0030) [2024-06-19 10:47:02,888][26579] Signal inference workers to stop experience collection... 
(26450 times) [2024-06-19 10:47:02,889][26579] Signal inference workers to resume experience collection... (26450 times) [2024-06-19 10:47:02,909][26599] InferenceWorker_p0-w0: stopping experience collection (26450 times) [2024-06-19 10:47:02,909][26599] InferenceWorker_p0-w0: resuming experience collection (26450 times) [2024-06-19 10:47:03,071][26599] Updated weights for policy 0, policy_version 337264 (0.0026) [2024-06-19 10:47:03,380][26367] Fps is (10 sec: 42598.9, 60 sec: 43144.6, 300 sec: 42765.6). Total num frames: 5525749760. Throughput: 0: 42761.3. Samples: 1793309200. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:03,380][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 10:47:06,993][26599] Updated weights for policy 0, policy_version 337274 (0.0039) [2024-06-19 10:47:08,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 5525962752. Throughput: 0: 42906.2. Samples: 1793567940. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:08,382][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 10:47:10,593][26599] Updated weights for policy 0, policy_version 337284 (0.0035) [2024-06-19 10:47:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5526159360. Throughput: 0: 42701.3. Samples: 1793820640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:13,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 10:47:14,508][26599] Updated weights for policy 0, policy_version 337294 (0.0042) [2024-06-19 10:47:18,155][26599] Updated weights for policy 0, policy_version 337304 (0.0034) [2024-06-19 10:47:18,384][26367] Fps is (10 sec: 44220.8, 60 sec: 43144.5, 300 sec: 42820.0). Total num frames: 5526405120. Throughput: 0: 42718.7. Samples: 1793952780. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:18,384][26367] Avg episode reward: [(0, '0.810')] [2024-06-19 10:47:21,959][26599] Updated weights for policy 0, policy_version 337314 (0.0030) [2024-06-19 10:47:23,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5526601728. Throughput: 0: 42794.2. Samples: 1794206360. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:23,381][26367] Avg episode reward: [(0, '0.802')] [2024-06-19 10:47:25,737][26599] Updated weights for policy 0, policy_version 337324 (0.0045) [2024-06-19 10:47:28,380][26367] Fps is (10 sec: 40974.3, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5526814720. Throughput: 0: 42815.8. Samples: 1794466540. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:28,381][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 10:47:29,491][26599] Updated weights for policy 0, policy_version 337334 (0.0029) [2024-06-19 10:47:33,358][26599] Updated weights for policy 0, policy_version 337344 (0.0038) [2024-06-19 10:47:33,384][26367] Fps is (10 sec: 44221.2, 60 sec: 42871.5, 300 sec: 42820.0). Total num frames: 5527044096. Throughput: 0: 42719.8. Samples: 1794592860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:33,384][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 10:47:37,411][26599] Updated weights for policy 0, policy_version 337354 (0.0037) [2024-06-19 10:47:38,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5527224320. Throughput: 0: 42704.4. Samples: 1794852240. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:38,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 10:47:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000337356_5527240704.pth... 
[2024-06-19 10:47:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000336729_5516967936.pth [2024-06-19 10:47:40,977][26599] Updated weights for policy 0, policy_version 337364 (0.0032) [2024-06-19 10:47:43,380][26367] Fps is (10 sec: 40975.0, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5527453696. Throughput: 0: 42705.8. Samples: 1795107640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:43,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 10:47:45,390][26599] Updated weights for policy 0, policy_version 337374 (0.0036) [2024-06-19 10:47:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5527666688. Throughput: 0: 42836.3. Samples: 1795236840. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:48,381][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 10:47:48,689][26599] Updated weights for policy 0, policy_version 337384 (0.0035) [2024-06-19 10:47:52,916][26599] Updated weights for policy 0, policy_version 337394 (0.0025) [2024-06-19 10:47:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5527879680. Throughput: 0: 42833.0. Samples: 1795495420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:53,381][26367] Avg episode reward: [(0, '0.392')] [2024-06-19 10:47:56,330][26599] Updated weights for policy 0, policy_version 337404 (0.0042) [2024-06-19 10:47:58,380][26367] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5528092672. Throughput: 0: 42996.1. Samples: 1795755460. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:47:58,380][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 10:48:00,522][26599] Updated weights for policy 0, policy_version 337414 (0.0035) [2024-06-19 10:48:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5528322048. Throughput: 0: 42881.3. Samples: 1795882280. 
Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:48:03,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 10:48:04,066][26599] Updated weights for policy 0, policy_version 337424 (0.0026) [2024-06-19 10:48:07,968][26599] Updated weights for policy 0, policy_version 337434 (0.0042) [2024-06-19 10:48:08,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5528535040. Throughput: 0: 42995.7. Samples: 1796141160. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:48:08,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 10:48:11,513][26599] Updated weights for policy 0, policy_version 337444 (0.0034) [2024-06-19 10:48:13,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5528731648. Throughput: 0: 42988.6. Samples: 1796401020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 10:48:13,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 10:48:14,800][26579] Signal inference workers to stop experience collection... (26500 times) [2024-06-19 10:48:14,800][26579] Signal inference workers to resume experience collection... (26500 times) [2024-06-19 10:48:14,843][26599] InferenceWorker_p0-w0: stopping experience collection (26500 times) [2024-06-19 10:48:14,844][26599] InferenceWorker_p0-w0: resuming experience collection (26500 times) [2024-06-19 10:48:15,421][26599] Updated weights for policy 0, policy_version 337454 (0.0029) [2024-06-19 10:48:18,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42601.0, 300 sec: 42820.6). Total num frames: 5528961024. Throughput: 0: 42925.6. Samples: 1796524360. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:48:18,381][26367] Avg episode reward: [(0, '0.749')] [2024-06-19 10:48:19,070][26599] Updated weights for policy 0, policy_version 337464 (0.0037) [2024-06-19 10:48:23,371][26599] Updated weights for policy 0, policy_version 337474 (0.0032) [2024-06-19 10:48:23,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5529174016. Throughput: 0: 42870.3. Samples: 1796781400. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:48:23,381][26367] Avg episode reward: [(0, '0.773')] [2024-06-19 10:48:26,730][26599] Updated weights for policy 0, policy_version 337484 (0.0034) [2024-06-19 10:48:28,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5529387008. Throughput: 0: 42895.4. Samples: 1797037940. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:48:28,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 10:48:31,269][26599] Updated weights for policy 0, policy_version 337494 (0.0037) [2024-06-19 10:48:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42601.0, 300 sec: 42820.6). Total num frames: 5529600000. Throughput: 0: 42899.6. Samples: 1797167320. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:48:33,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 10:48:34,653][26599] Updated weights for policy 0, policy_version 337504 (0.0051) [2024-06-19 10:48:38,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5529780224. Throughput: 0: 42783.9. Samples: 1797420700. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:48:38,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 10:48:39,048][26599] Updated weights for policy 0, policy_version 337514 (0.0040) [2024-06-19 10:48:42,395][26599] Updated weights for policy 0, policy_version 337524 (0.0025) [2024-06-19 10:48:43,384][26367] Fps is (10 sec: 40945.0, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 5530009600. Throughput: 0: 42610.2. Samples: 1797673080. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:48:43,384][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 10:48:46,750][26599] Updated weights for policy 0, policy_version 337534 (0.0034) [2024-06-19 10:48:48,380][26367] Fps is (10 sec: 47513.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5530255360. Throughput: 0: 42768.8. Samples: 1797806880. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:48:48,381][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 10:48:49,928][26599] Updated weights for policy 0, policy_version 337544 (0.0037) [2024-06-19 10:48:53,384][26367] Fps is (10 sec: 42598.5, 60 sec: 42595.8, 300 sec: 42709.0). Total num frames: 5530435584. Throughput: 0: 42771.2. Samples: 1798066020. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:48:53,384][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 10:48:54,415][26599] Updated weights for policy 0, policy_version 337554 (0.0034) [2024-06-19 10:48:57,330][26599] Updated weights for policy 0, policy_version 337564 (0.0033) [2024-06-19 10:48:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5530664960. Throughput: 0: 42546.1. Samples: 1798315600. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:48:58,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 10:49:01,987][26599] Updated weights for policy 0, policy_version 337574 (0.0043) [2024-06-19 10:49:03,380][26367] Fps is (10 sec: 45891.4, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5530894336. Throughput: 0: 42731.1. Samples: 1798447260. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:49:03,381][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 10:49:05,197][26599] Updated weights for policy 0, policy_version 337584 (0.0035) [2024-06-19 10:49:08,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5531074560. Throughput: 0: 42725.7. Samples: 1798704060. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:49:08,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 10:49:09,466][26599] Updated weights for policy 0, policy_version 337594 (0.0025) [2024-06-19 10:49:12,715][26599] Updated weights for policy 0, policy_version 337604 (0.0039) [2024-06-19 10:49:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5531320320. Throughput: 0: 42591.7. Samples: 1798954560. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:49:13,380][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 10:49:17,069][26599] Updated weights for policy 0, policy_version 337614 (0.0039) [2024-06-19 10:49:18,383][26367] Fps is (10 sec: 47501.6, 60 sec: 43142.7, 300 sec: 42986.8). Total num frames: 5531549696. Throughput: 0: 42677.5. Samples: 1799087920. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:49:18,383][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 10:49:20,621][26599] Updated weights for policy 0, policy_version 337624 (0.0049) [2024-06-19 10:49:23,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5531713536. Throughput: 0: 42756.5. Samples: 1799344740. 
Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:49:23,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 10:49:25,173][26599] Updated weights for policy 0, policy_version 337634 (0.0039) [2024-06-19 10:49:28,380][26367] Fps is (10 sec: 39332.2, 60 sec: 42598.6, 300 sec: 42765.0). Total num frames: 5531942912. Throughput: 0: 42743.1. Samples: 1799596360. Policy #0 lag: (min: 1.0, avg: 10.4, max: 21.0) [2024-06-19 10:49:28,380][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 10:49:28,479][26599] Updated weights for policy 0, policy_version 337644 (0.0028) [2024-06-19 10:49:32,722][26599] Updated weights for policy 0, policy_version 337654 (0.0039) [2024-06-19 10:49:33,380][26367] Fps is (10 sec: 45874.2, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5532172288. Throughput: 0: 42722.6. Samples: 1799729400. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:49:33,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 10:49:36,198][26599] Updated weights for policy 0, policy_version 337664 (0.0032) [2024-06-19 10:49:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5532352512. Throughput: 0: 42591.0. Samples: 1799982460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:49:38,381][26367] Avg episode reward: [(0, '0.845')] [2024-06-19 10:49:38,415][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000337669_5532368896.pth... [2024-06-19 10:49:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000337044_5522128896.pth [2024-06-19 10:49:39,707][26579] Signal inference workers to stop experience collection... (26550 times) [2024-06-19 10:49:39,751][26599] InferenceWorker_p0-w0: stopping experience collection (26550 times) [2024-06-19 10:49:39,761][26579] Signal inference workers to resume experience collection... 
(26550 times) [2024-06-19 10:49:39,767][26599] InferenceWorker_p0-w0: resuming experience collection (26550 times) [2024-06-19 10:49:40,245][26599] Updated weights for policy 0, policy_version 337674 (0.0024) [2024-06-19 10:49:43,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43147.0, 300 sec: 42931.6). Total num frames: 5532598272. Throughput: 0: 42655.9. Samples: 1800235120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:49:43,385][26367] Avg episode reward: [(0, '0.830')] [2024-06-19 10:49:43,704][26599] Updated weights for policy 0, policy_version 337684 (0.0031) [2024-06-19 10:49:47,731][26599] Updated weights for policy 0, policy_version 337694 (0.0040) [2024-06-19 10:49:48,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42871.6, 300 sec: 42931.6). Total num frames: 5532827648. Throughput: 0: 42850.4. Samples: 1800375520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:49:48,380][26367] Avg episode reward: [(0, '0.815')] [2024-06-19 10:49:51,258][26599] Updated weights for policy 0, policy_version 337704 (0.0042) [2024-06-19 10:49:53,380][26367] Fps is (10 sec: 39322.8, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5532991488. Throughput: 0: 42729.9. Samples: 1800626900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:49:53,380][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 10:49:55,671][26599] Updated weights for policy 0, policy_version 337714 (0.0028) [2024-06-19 10:49:58,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5533237248. Throughput: 0: 42747.8. Samples: 1800878220. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:49:58,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 10:49:58,829][26599] Updated weights for policy 0, policy_version 337724 (0.0025) [2024-06-19 10:50:03,153][26599] Updated weights for policy 0, policy_version 337734 (0.0026) [2024-06-19 10:50:03,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5533433856. Throughput: 0: 42762.9. Samples: 1801012140. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:50:03,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 10:50:06,685][26599] Updated weights for policy 0, policy_version 337744 (0.0043) [2024-06-19 10:50:08,384][26367] Fps is (10 sec: 40945.6, 60 sec: 42868.9, 300 sec: 42764.5). Total num frames: 5533646848. Throughput: 0: 42625.4. Samples: 1801263040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:50:08,385][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 10:50:10,955][26599] Updated weights for policy 0, policy_version 337754 (0.0030) [2024-06-19 10:50:13,384][26367] Fps is (10 sec: 45858.8, 60 sec: 42868.9, 300 sec: 42875.6). Total num frames: 5533892608. Throughput: 0: 42581.9. Samples: 1801512700. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:50:13,384][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 10:50:14,502][26599] Updated weights for policy 0, policy_version 337764 (0.0030) [2024-06-19 10:50:18,380][26367] Fps is (10 sec: 42614.6, 60 sec: 42054.2, 300 sec: 42709.5). Total num frames: 5534072832. Throughput: 0: 42835.9. Samples: 1801657000. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:50:18,380][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 10:50:18,548][26599] Updated weights for policy 0, policy_version 337774 (0.0039) [2024-06-19 10:50:21,975][26599] Updated weights for policy 0, policy_version 337784 (0.0029) [2024-06-19 10:50:23,380][26367] Fps is (10 sec: 39335.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5534285824. Throughput: 0: 42800.9. Samples: 1801908500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:50:23,380][26367] Avg episode reward: [(0, '0.729')] [2024-06-19 10:50:26,134][26599] Updated weights for policy 0, policy_version 337794 (0.0035) [2024-06-19 10:50:28,380][26367] Fps is (10 sec: 47512.7, 60 sec: 43417.5, 300 sec: 42931.6). Total num frames: 5534547968. Throughput: 0: 42793.5. Samples: 1802160820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:50:28,381][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 10:50:29,574][26599] Updated weights for policy 0, policy_version 337804 (0.0034) [2024-06-19 10:50:33,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42052.4, 300 sec: 42598.4). Total num frames: 5534695424. Throughput: 0: 42665.2. Samples: 1802295460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:50:33,381][26367] Avg episode reward: [(0, '0.356')] [2024-06-19 10:50:33,869][26599] Updated weights for policy 0, policy_version 337814 (0.0037) [2024-06-19 10:50:37,207][26599] Updated weights for policy 0, policy_version 337824 (0.0035) [2024-06-19 10:50:38,380][26367] Fps is (10 sec: 39322.2, 60 sec: 43144.6, 300 sec: 42765.6). Total num frames: 5534941184. Throughput: 0: 42863.1. Samples: 1802555740. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:50:38,380][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 10:50:39,636][26579] Signal inference workers to stop experience collection... 
(26600 times) [2024-06-19 10:50:39,636][26579] Signal inference workers to resume experience collection... (26600 times) [2024-06-19 10:50:39,654][26599] InferenceWorker_p0-w0: stopping experience collection (26600 times) [2024-06-19 10:50:39,654][26599] InferenceWorker_p0-w0: resuming experience collection (26600 times) [2024-06-19 10:50:41,299][26599] Updated weights for policy 0, policy_version 337834 (0.0034) [2024-06-19 10:50:43,384][26367] Fps is (10 sec: 47496.6, 60 sec: 42869.0, 300 sec: 42931.6). Total num frames: 5535170560. Throughput: 0: 42968.7. Samples: 1802811960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 10:50:43,384][26367] Avg episode reward: [(0, '0.574')] [2024-06-19 10:50:44,567][26599] Updated weights for policy 0, policy_version 337844 (0.0034) [2024-06-19 10:50:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42821.1). Total num frames: 5535367168. Throughput: 0: 42979.2. Samples: 1802946200. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:50:48,380][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 10:50:48,908][26599] Updated weights for policy 0, policy_version 337854 (0.0033) [2024-06-19 10:50:51,986][26599] Updated weights for policy 0, policy_version 337864 (0.0039) [2024-06-19 10:50:53,380][26367] Fps is (10 sec: 40975.2, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5535580160. Throughput: 0: 43087.2. Samples: 1803201800. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:50:53,380][26367] Avg episode reward: [(0, '0.385')] [2024-06-19 10:50:56,503][26599] Updated weights for policy 0, policy_version 337874 (0.0030) [2024-06-19 10:50:58,380][26367] Fps is (10 sec: 45874.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5535825920. Throughput: 0: 43339.8. Samples: 1803462840. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:50:58,381][26367] Avg episode reward: [(0, '0.276')] [2024-06-19 10:50:59,465][26599] Updated weights for policy 0, policy_version 337884 (0.0034) [2024-06-19 10:51:03,384][26367] Fps is (10 sec: 42582.3, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5536006144. Throughput: 0: 43030.6. Samples: 1803593540. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:03,385][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 10:51:04,145][26599] Updated weights for policy 0, policy_version 337894 (0.0035) [2024-06-19 10:51:07,509][26599] Updated weights for policy 0, policy_version 337904 (0.0036) [2024-06-19 10:51:08,380][26367] Fps is (10 sec: 42598.8, 60 sec: 43420.2, 300 sec: 42876.1). Total num frames: 5536251904. Throughput: 0: 43196.4. Samples: 1803852340. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:08,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 10:51:11,797][26599] Updated weights for policy 0, policy_version 337914 (0.0042) [2024-06-19 10:51:13,380][26367] Fps is (10 sec: 45891.9, 60 sec: 42874.0, 300 sec: 42876.6). Total num frames: 5536464896. Throughput: 0: 43205.8. Samples: 1804105080. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:13,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 10:51:15,142][26599] Updated weights for policy 0, policy_version 337924 (0.0023) [2024-06-19 10:51:18,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5536645120. Throughput: 0: 43055.0. Samples: 1804232940. 
Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:18,381][26367] Avg episode reward: [(0, '0.526')] [2024-06-19 10:51:19,462][26599] Updated weights for policy 0, policy_version 337934 (0.0041) [2024-06-19 10:51:22,715][26599] Updated weights for policy 0, policy_version 337944 (0.0038) [2024-06-19 10:51:23,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43690.6, 300 sec: 42931.6). Total num frames: 5536907264. Throughput: 0: 43122.5. Samples: 1804496260. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:23,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 10:51:27,028][26599] Updated weights for policy 0, policy_version 337954 (0.0040) [2024-06-19 10:51:28,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42598.4, 300 sec: 42821.1). Total num frames: 5537103872. Throughput: 0: 43046.5. Samples: 1804748900. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:28,384][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 10:51:30,370][26599] Updated weights for policy 0, policy_version 337964 (0.0038) [2024-06-19 10:51:33,380][26367] Fps is (10 sec: 37683.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5537284096. Throughput: 0: 42998.4. Samples: 1804881140. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:33,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 10:51:34,475][26599] Updated weights for policy 0, policy_version 337974 (0.0033) [2024-06-19 10:51:37,899][26599] Updated weights for policy 0, policy_version 337984 (0.0036) [2024-06-19 10:51:38,380][26367] Fps is (10 sec: 42599.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5537529856. Throughput: 0: 43186.2. Samples: 1805145180. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:38,380][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 10:51:38,496][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000337985_5537546240.pth... 
[2024-06-19 10:51:38,547][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000337356_5527240704.pth [2024-06-19 10:51:41,816][26579] Signal inference workers to stop experience collection... (26650 times) [2024-06-19 10:51:41,841][26599] InferenceWorker_p0-w0: stopping experience collection (26650 times) [2024-06-19 10:51:41,876][26579] Signal inference workers to resume experience collection... (26650 times) [2024-06-19 10:51:41,877][26599] InferenceWorker_p0-w0: resuming experience collection (26650 times) [2024-06-19 10:51:42,041][26599] Updated weights for policy 0, policy_version 337994 (0.0037) [2024-06-19 10:51:43,380][26367] Fps is (10 sec: 47514.0, 60 sec: 43147.1, 300 sec: 42820.6). Total num frames: 5537759232. Throughput: 0: 43014.3. Samples: 1805398480. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:43,381][26367] Avg episode reward: [(0, '0.834')] [2024-06-19 10:51:45,541][26599] Updated weights for policy 0, policy_version 338004 (0.0034) [2024-06-19 10:51:48,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5537939456. Throughput: 0: 43030.6. Samples: 1805529760. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:48,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 10:51:49,439][26599] Updated weights for policy 0, policy_version 338014 (0.0048) [2024-06-19 10:51:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5538168832. Throughput: 0: 42946.3. Samples: 1805784920. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:53,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 10:51:53,799][26599] Updated weights for policy 0, policy_version 338024 (0.0045) [2024-06-19 10:51:57,286][26599] Updated weights for policy 0, policy_version 338034 (0.0041) [2024-06-19 10:51:58,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5538398208. 
Throughput: 0: 42830.6. Samples: 1806032460. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:51:58,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 10:52:01,424][26599] Updated weights for policy 0, policy_version 338044 (0.0047) [2024-06-19 10:52:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43147.1, 300 sec: 42820.6). Total num frames: 5538594816. Throughput: 0: 43005.0. Samples: 1806168160. Policy #0 lag: (min: 0.0, avg: 9.0, max: 22.0) [2024-06-19 10:52:03,381][26367] Avg episode reward: [(0, '0.412')] [2024-06-19 10:52:05,033][26599] Updated weights for policy 0, policy_version 338054 (0.0046) [2024-06-19 10:52:08,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5538807808. Throughput: 0: 42733.5. Samples: 1806419260. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:08,380][26367] Avg episode reward: [(0, '0.541')] [2024-06-19 10:52:09,130][26599] Updated weights for policy 0, policy_version 338064 (0.0028) [2024-06-19 10:52:12,735][26599] Updated weights for policy 0, policy_version 338074 (0.0035) [2024-06-19 10:52:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.5). Total num frames: 5539020800. Throughput: 0: 42802.7. Samples: 1806675020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:13,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 10:52:16,740][26599] Updated weights for policy 0, policy_version 338084 (0.0038) [2024-06-19 10:52:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5539233792. Throughput: 0: 42836.1. Samples: 1806808760. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:18,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 10:52:20,349][26599] Updated weights for policy 0, policy_version 338094 (0.0036) [2024-06-19 10:52:23,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5539430400. 
Throughput: 0: 42442.6. Samples: 1807055100. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:23,381][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 10:52:24,229][26599] Updated weights for policy 0, policy_version 338104 (0.0029) [2024-06-19 10:52:28,087][26599] Updated weights for policy 0, policy_version 338114 (0.0041) [2024-06-19 10:52:28,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42821.1). Total num frames: 5539676160. Throughput: 0: 42592.5. Samples: 1807315140. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:28,385][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 10:52:32,391][26599] Updated weights for policy 0, policy_version 338124 (0.0036) [2024-06-19 10:52:33,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5539872768. Throughput: 0: 42599.5. Samples: 1807446740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:33,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 10:52:35,669][26599] Updated weights for policy 0, policy_version 338134 (0.0031) [2024-06-19 10:52:38,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5540085760. Throughput: 0: 42599.9. Samples: 1807701920. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:38,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 10:52:39,977][26599] Updated weights for policy 0, policy_version 338144 (0.0024) [2024-06-19 10:52:43,237][26599] Updated weights for policy 0, policy_version 338154 (0.0035) [2024-06-19 10:52:43,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5540315136. Throughput: 0: 42918.3. Samples: 1807963780. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:43,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 10:52:47,554][26599] Updated weights for policy 0, policy_version 338164 (0.0027) [2024-06-19 10:52:48,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5540511744. Throughput: 0: 42757.4. Samples: 1808092240. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:48,380][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 10:52:50,948][26599] Updated weights for policy 0, policy_version 338174 (0.0034) [2024-06-19 10:52:53,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5540741120. Throughput: 0: 42723.9. Samples: 1808341840. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:53,384][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 10:52:55,064][26599] Updated weights for policy 0, policy_version 338184 (0.0022) [2024-06-19 10:52:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5540937728. Throughput: 0: 42988.0. Samples: 1808609480. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:52:58,380][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 10:52:58,696][26599] Updated weights for policy 0, policy_version 338194 (0.0036) [2024-06-19 10:53:02,534][26599] Updated weights for policy 0, policy_version 338204 (0.0037) [2024-06-19 10:53:03,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5541150720. Throughput: 0: 42684.3. Samples: 1808729560. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:53:03,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 10:53:06,563][26599] Updated weights for policy 0, policy_version 338214 (0.0031) [2024-06-19 10:53:08,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42871.3, 300 sec: 42876.1). Total num frames: 5541380096. Throughput: 0: 42921.1. Samples: 1808986560. 
Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:53:08,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 10:53:10,265][26599] Updated weights for policy 0, policy_version 338224 (0.0033) [2024-06-19 10:53:13,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5541593088. Throughput: 0: 42857.8. Samples: 1809243740. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:53:13,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 10:53:14,131][26599] Updated weights for policy 0, policy_version 338234 (0.0036) [2024-06-19 10:53:16,736][26579] Signal inference workers to stop experience collection... (26700 times) [2024-06-19 10:53:16,774][26599] InferenceWorker_p0-w0: stopping experience collection (26700 times) [2024-06-19 10:53:16,797][26579] Signal inference workers to resume experience collection... (26700 times) [2024-06-19 10:53:16,798][26599] InferenceWorker_p0-w0: resuming experience collection (26700 times) [2024-06-19 10:53:18,025][26599] Updated weights for policy 0, policy_version 338244 (0.0026) [2024-06-19 10:53:18,384][26367] Fps is (10 sec: 42583.6, 60 sec: 42868.9, 300 sec: 42820.0). Total num frames: 5541806080. Throughput: 0: 42580.6. Samples: 1809363020. Policy #0 lag: (min: 0.0, avg: 8.6, max: 20.0) [2024-06-19 10:53:18,385][26367] Avg episode reward: [(0, '0.837')] [2024-06-19 10:53:21,852][26599] Updated weights for policy 0, policy_version 338254 (0.0038) [2024-06-19 10:53:23,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5542002688. Throughput: 0: 42551.7. Samples: 1809616740. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:53:23,380][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 10:53:25,832][26599] Updated weights for policy 0, policy_version 338264 (0.0035) [2024-06-19 10:53:28,380][26367] Fps is (10 sec: 39335.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5542199296. Throughput: 0: 42496.0. 
Samples: 1809876100. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:53:28,384][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 10:53:29,560][26599] Updated weights for policy 0, policy_version 338274 (0.0032) [2024-06-19 10:53:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5542428672. Throughput: 0: 42328.4. Samples: 1809997020. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:53:33,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 10:53:33,667][26599] Updated weights for policy 0, policy_version 338284 (0.0038) [2024-06-19 10:53:37,431][26599] Updated weights for policy 0, policy_version 338294 (0.0030) [2024-06-19 10:53:38,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.4, 300 sec: 42876.6). Total num frames: 5542658048. Throughput: 0: 42593.3. Samples: 1810258540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:53:38,381][26367] Avg episode reward: [(0, '0.359')] [2024-06-19 10:53:38,434][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000338298_5542674432.pth... [2024-06-19 10:53:38,488][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000337669_5532368896.pth [2024-06-19 10:53:41,246][26599] Updated weights for policy 0, policy_version 338304 (0.0028) [2024-06-19 10:53:43,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5542854656. Throughput: 0: 42274.5. Samples: 1810511840. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:53:43,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 10:53:45,000][26599] Updated weights for policy 0, policy_version 338314 (0.0060) [2024-06-19 10:53:48,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.3, 300 sec: 42821.1). Total num frames: 5543067648. Throughput: 0: 42310.3. Samples: 1810633520. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:53:48,381][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 10:53:48,853][26599] Updated weights for policy 0, policy_version 338324 (0.0039) [2024-06-19 10:53:52,684][26599] Updated weights for policy 0, policy_version 338334 (0.0036) [2024-06-19 10:53:53,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 5543264256. Throughput: 0: 42289.1. Samples: 1810889560. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:53:53,381][26367] Avg episode reward: [(0, '0.801')] [2024-06-19 10:53:56,487][26599] Updated weights for policy 0, policy_version 338344 (0.0043) [2024-06-19 10:53:58,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5543477248. Throughput: 0: 42270.8. Samples: 1811145920. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:53:58,380][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 10:54:00,350][26599] Updated weights for policy 0, policy_version 338354 (0.0035) [2024-06-19 10:54:03,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.5, 300 sec: 42765.0). Total num frames: 5543690240. Throughput: 0: 42437.3. Samples: 1811272540. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:54:03,380][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 10:54:04,373][26599] Updated weights for policy 0, policy_version 338364 (0.0044) [2024-06-19 10:54:08,278][26599] Updated weights for policy 0, policy_version 338374 (0.0046) [2024-06-19 10:54:08,380][26367] Fps is (10 sec: 44235.7, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5543919616. Throughput: 0: 42402.9. Samples: 1811524880. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:54:08,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 10:54:12,147][26599] Updated weights for policy 0, policy_version 338384 (0.0030) [2024-06-19 10:54:13,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42654.3). Total num frames: 5544132608. Throughput: 0: 42348.1. Samples: 1811781760. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:54:13,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 10:54:16,220][26599] Updated weights for policy 0, policy_version 338394 (0.0029) [2024-06-19 10:54:18,380][26367] Fps is (10 sec: 42599.5, 60 sec: 42328.0, 300 sec: 42820.6). Total num frames: 5544345600. Throughput: 0: 42625.0. Samples: 1811915140. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:54:18,380][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 10:54:19,692][26599] Updated weights for policy 0, policy_version 338404 (0.0030) [2024-06-19 10:54:23,380][26367] Fps is (10 sec: 39321.9, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5544525824. Throughput: 0: 42218.4. Samples: 1812158360. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:54:23,380][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 10:54:23,987][26599] Updated weights for policy 0, policy_version 338414 (0.0037) [2024-06-19 10:54:27,499][26599] Updated weights for policy 0, policy_version 338424 (0.0030) [2024-06-19 10:54:28,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5544755200. Throughput: 0: 42266.4. Samples: 1812413820. Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:54:28,381][26367] Avg episode reward: [(0, '0.413')] [2024-06-19 10:54:31,665][26599] Updated weights for policy 0, policy_version 338434 (0.0032) [2024-06-19 10:54:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42052.4, 300 sec: 42709.5). Total num frames: 5544951808. Throughput: 0: 42507.7. Samples: 1812546360. 
Policy #0 lag: (min: 0.0, avg: 8.5, max: 21.0) [2024-06-19 10:54:33,380][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 10:54:35,141][26599] Updated weights for policy 0, policy_version 338444 (0.0038) [2024-06-19 10:54:38,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.3, 300 sec: 42654.0). Total num frames: 5545181184. Throughput: 0: 42376.4. Samples: 1812796500. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:54:38,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 10:54:39,323][26599] Updated weights for policy 0, policy_version 338454 (0.0031) [2024-06-19 10:54:42,015][26579] Signal inference workers to stop experience collection... (26750 times) [2024-06-19 10:54:42,015][26579] Signal inference workers to resume experience collection... (26750 times) [2024-06-19 10:54:42,027][26599] InferenceWorker_p0-w0: stopping experience collection (26750 times) [2024-06-19 10:54:42,027][26599] InferenceWorker_p0-w0: resuming experience collection (26750 times) [2024-06-19 10:54:42,753][26599] Updated weights for policy 0, policy_version 338464 (0.0032) [2024-06-19 10:54:43,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5545410560. Throughput: 0: 42518.2. Samples: 1813059240. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:54:43,381][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 10:54:46,933][26599] Updated weights for policy 0, policy_version 338474 (0.0036) [2024-06-19 10:54:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5545607168. Throughput: 0: 42584.0. Samples: 1813188820. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:54:48,380][26367] Avg episode reward: [(0, '0.834')] [2024-06-19 10:54:50,456][26599] Updated weights for policy 0, policy_version 338484 (0.0028) [2024-06-19 10:54:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5545836544. 
Throughput: 0: 42694.4. Samples: 1813446120. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:54:53,381][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 10:54:54,547][26599] Updated weights for policy 0, policy_version 338494 (0.0034) [2024-06-19 10:54:58,099][26599] Updated weights for policy 0, policy_version 338504 (0.0029) [2024-06-19 10:54:58,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5546049536. Throughput: 0: 42714.7. Samples: 1813703920. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:54:58,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 10:55:02,351][26599] Updated weights for policy 0, policy_version 338514 (0.0039) [2024-06-19 10:55:03,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.4, 300 sec: 42765.5). Total num frames: 5546262528. Throughput: 0: 42590.9. Samples: 1813831740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:03,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 10:55:05,814][26599] Updated weights for policy 0, policy_version 338524 (0.0035) [2024-06-19 10:55:08,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42598.4, 300 sec: 42654.4). Total num frames: 5546475520. Throughput: 0: 42805.1. Samples: 1814084600. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:08,386][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 10:55:09,723][26599] Updated weights for policy 0, policy_version 338534 (0.0040) [2024-06-19 10:55:13,289][26599] Updated weights for policy 0, policy_version 338544 (0.0033) [2024-06-19 10:55:13,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5546704896. Throughput: 0: 43010.5. Samples: 1814349300. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:13,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 10:55:17,346][26599] Updated weights for policy 0, policy_version 338554 (0.0039) [2024-06-19 10:55:18,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5546885120. Throughput: 0: 42939.1. Samples: 1814478620. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:18,380][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 10:55:21,262][26599] Updated weights for policy 0, policy_version 338564 (0.0037) [2024-06-19 10:55:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 43417.6, 300 sec: 42654.0). Total num frames: 5547130880. Throughput: 0: 42825.9. Samples: 1814723660. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:23,381][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 10:55:24,837][26599] Updated weights for policy 0, policy_version 338574 (0.0029) [2024-06-19 10:55:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5547327488. Throughput: 0: 42956.0. Samples: 1814992260. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:28,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 10:55:28,806][26599] Updated weights for policy 0, policy_version 338584 (0.0034) [2024-06-19 10:55:32,609][26599] Updated weights for policy 0, policy_version 338594 (0.0043) [2024-06-19 10:55:33,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5547524096. Throughput: 0: 42642.6. Samples: 1815107740. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:33,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 10:55:36,423][26599] Updated weights for policy 0, policy_version 338604 (0.0045) [2024-06-19 10:55:38,380][26367] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42710.0). Total num frames: 5547769856. Throughput: 0: 42658.9. Samples: 1815365780. 
Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:38,381][26367] Avg episode reward: [(0, '0.447')] [2024-06-19 10:55:38,395][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000338609_5547769856.pth... [2024-06-19 10:55:38,458][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000337985_5537546240.pth [2024-06-19 10:55:40,296][26599] Updated weights for policy 0, policy_version 338614 (0.0025) [2024-06-19 10:55:43,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5547966464. Throughput: 0: 42608.8. Samples: 1815621320. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:43,381][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 10:55:44,151][26599] Updated weights for policy 0, policy_version 338624 (0.0033) [2024-06-19 10:55:48,206][26599] Updated weights for policy 0, policy_version 338634 (0.0052) [2024-06-19 10:55:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5548179456. Throughput: 0: 42562.2. Samples: 1815747040. Policy #0 lag: (min: 0.0, avg: 11.8, max: 22.0) [2024-06-19 10:55:48,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 10:55:51,911][26599] Updated weights for policy 0, policy_version 338644 (0.0026) [2024-06-19 10:55:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5548408832. Throughput: 0: 42719.6. Samples: 1816006980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:55:53,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 10:55:55,711][26599] Updated weights for policy 0, policy_version 338654 (0.0031) [2024-06-19 10:55:58,384][26367] Fps is (10 sec: 42583.1, 60 sec: 42595.8, 300 sec: 42709.5). Total num frames: 5548605440. Throughput: 0: 42580.2. Samples: 1816265560. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:55:58,384][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 10:55:59,577][26599] Updated weights for policy 0, policy_version 338664 (0.0029) [2024-06-19 10:56:03,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42595.9, 300 sec: 42597.9). Total num frames: 5548818432. Throughput: 0: 42508.1. Samples: 1816391640. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:03,384][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 10:56:03,770][26599] Updated weights for policy 0, policy_version 338674 (0.0028) [2024-06-19 10:56:05,206][26579] Signal inference workers to stop experience collection... (26800 times) [2024-06-19 10:56:05,208][26579] Signal inference workers to resume experience collection... (26800 times) [2024-06-19 10:56:05,227][26599] InferenceWorker_p0-w0: stopping experience collection (26800 times) [2024-06-19 10:56:05,262][26599] InferenceWorker_p0-w0: resuming experience collection (26800 times) [2024-06-19 10:56:07,271][26599] Updated weights for policy 0, policy_version 338684 (0.0035) [2024-06-19 10:56:08,380][26367] Fps is (10 sec: 44253.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5549047808. Throughput: 0: 42787.1. Samples: 1816649080. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:08,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 10:56:11,277][26599] Updated weights for policy 0, policy_version 338694 (0.0056) [2024-06-19 10:56:13,380][26367] Fps is (10 sec: 42613.6, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5549244416. Throughput: 0: 42528.3. Samples: 1816906040. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:13,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 10:56:14,884][26599] Updated weights for policy 0, policy_version 338704 (0.0030) [2024-06-19 10:56:18,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5549457408. 
Throughput: 0: 42742.6. Samples: 1817031160. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:18,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 10:56:19,100][26599] Updated weights for policy 0, policy_version 338714 (0.0046) [2024-06-19 10:56:22,454][26599] Updated weights for policy 0, policy_version 338724 (0.0039) [2024-06-19 10:56:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5549686784. Throughput: 0: 42671.1. Samples: 1817285980. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:23,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 10:56:26,622][26599] Updated weights for policy 0, policy_version 338734 (0.0027) [2024-06-19 10:56:28,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5549867008. Throughput: 0: 42704.0. Samples: 1817543000. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:28,381][26367] Avg episode reward: [(0, '0.732')] [2024-06-19 10:56:30,118][26599] Updated weights for policy 0, policy_version 338744 (0.0038) [2024-06-19 10:56:33,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5550096384. Throughput: 0: 42648.1. Samples: 1817666200. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:33,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 10:56:34,291][26599] Updated weights for policy 0, policy_version 338754 (0.0033) [2024-06-19 10:56:37,752][26599] Updated weights for policy 0, policy_version 338764 (0.0045) [2024-06-19 10:56:38,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5550325760. Throughput: 0: 42554.6. Samples: 1817921940. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:38,381][26367] Avg episode reward: [(0, '0.511')] [2024-06-19 10:56:42,627][26599] Updated weights for policy 0, policy_version 338774 (0.0025) [2024-06-19 10:56:43,384][26367] Fps is (10 sec: 42582.7, 60 sec: 42595.9, 300 sec: 42653.4). Total num frames: 5550522368. Throughput: 0: 42487.6. Samples: 1818177500. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:43,384][26367] Avg episode reward: [(0, '0.433')] [2024-06-19 10:56:45,667][26599] Updated weights for policy 0, policy_version 338784 (0.0037) [2024-06-19 10:56:48,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5550735360. Throughput: 0: 42335.0. Samples: 1818296560. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:48,381][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 10:56:50,193][26599] Updated weights for policy 0, policy_version 338794 (0.0029) [2024-06-19 10:56:53,337][26599] Updated weights for policy 0, policy_version 338804 (0.0041) [2024-06-19 10:56:53,380][26367] Fps is (10 sec: 44253.1, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5550964736. Throughput: 0: 42403.1. Samples: 1818557220. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:53,380][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 10:56:57,715][26599] Updated weights for policy 0, policy_version 338814 (0.0026) [2024-06-19 10:56:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5551161344. Throughput: 0: 42468.9. Samples: 1818817140. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:56:58,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 10:57:00,790][26599] Updated weights for policy 0, policy_version 338824 (0.0024) [2024-06-19 10:57:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5551374336. Throughput: 0: 42424.5. Samples: 1818940260. 
Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:57:03,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 10:57:05,232][26599] Updated weights for policy 0, policy_version 338834 (0.0031) [2024-06-19 10:57:08,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5551603712. Throughput: 0: 42514.0. Samples: 1819199260. Policy #0 lag: (min: 0.0, avg: 12.1, max: 25.0) [2024-06-19 10:57:08,384][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 10:57:08,506][26599] Updated weights for policy 0, policy_version 338844 (0.0032) [2024-06-19 10:57:13,051][26599] Updated weights for policy 0, policy_version 338854 (0.0029) [2024-06-19 10:57:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5551783936. Throughput: 0: 42579.1. Samples: 1819459060. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:13,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 10:57:16,227][26599] Updated weights for policy 0, policy_version 338864 (0.0044) [2024-06-19 10:57:18,380][26367] Fps is (10 sec: 39336.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5551996928. Throughput: 0: 42462.2. Samples: 1819577000. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:18,380][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 10:57:20,955][26599] Updated weights for policy 0, policy_version 338874 (0.0038) [2024-06-19 10:57:23,380][26367] Fps is (10 sec: 45875.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5552242688. Throughput: 0: 42573.5. Samples: 1819837740. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:23,380][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 10:57:23,876][26599] Updated weights for policy 0, policy_version 338884 (0.0038) [2024-06-19 10:57:28,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5552422912. Throughput: 0: 42666.1. Samples: 1820097320. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:28,381][26367] Avg episode reward: [(0, '0.814')] [2024-06-19 10:57:28,468][26599] Updated weights for policy 0, policy_version 338894 (0.0035) [2024-06-19 10:57:29,325][26579] Signal inference workers to stop experience collection... (26850 times) [2024-06-19 10:57:29,325][26579] Signal inference workers to resume experience collection... (26850 times) [2024-06-19 10:57:29,367][26599] InferenceWorker_p0-w0: stopping experience collection (26850 times) [2024-06-19 10:57:29,367][26599] InferenceWorker_p0-w0: resuming experience collection (26850 times) [2024-06-19 10:57:31,933][26599] Updated weights for policy 0, policy_version 338904 (0.0033) [2024-06-19 10:57:33,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5552652288. Throughput: 0: 42700.4. Samples: 1820218080. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:33,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 10:57:36,087][26599] Updated weights for policy 0, policy_version 338914 (0.0044) [2024-06-19 10:57:38,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42325.3, 300 sec: 42542.8). Total num frames: 5552865280. Throughput: 0: 42609.1. Samples: 1820474640. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:38,381][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 10:57:38,391][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000338920_5552865280.pth... [2024-06-19 10:57:38,480][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000338298_5542674432.pth [2024-06-19 10:57:39,533][26599] Updated weights for policy 0, policy_version 338924 (0.0038) [2024-06-19 10:57:43,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42328.0, 300 sec: 42542.9). Total num frames: 5553061888. Throughput: 0: 42527.7. Samples: 1820730880. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:43,381][26367] Avg episode reward: [(0, '0.415')] [2024-06-19 10:57:43,736][26599] Updated weights for policy 0, policy_version 338934 (0.0042) [2024-06-19 10:57:47,119][26599] Updated weights for policy 0, policy_version 338944 (0.0029) [2024-06-19 10:57:48,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5553307648. Throughput: 0: 42609.7. Samples: 1820857700. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:48,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 10:57:51,321][26599] Updated weights for policy 0, policy_version 338954 (0.0043) [2024-06-19 10:57:53,380][26367] Fps is (10 sec: 44235.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5553504256. Throughput: 0: 42743.3. Samples: 1821122560. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:53,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 10:57:54,637][26599] Updated weights for policy 0, policy_version 338964 (0.0024) [2024-06-19 10:57:58,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5553717248. Throughput: 0: 42615.6. Samples: 1821376760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:57:58,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 10:57:59,244][26599] Updated weights for policy 0, policy_version 338974 (0.0051) [2024-06-19 10:58:02,482][26599] Updated weights for policy 0, policy_version 338984 (0.0039) [2024-06-19 10:58:03,380][26367] Fps is (10 sec: 44237.9, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5553946624. Throughput: 0: 42784.0. Samples: 1821502280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:58:03,380][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 10:58:06,895][26599] Updated weights for policy 0, policy_version 338994 (0.0037) [2024-06-19 10:58:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5554159616. Throughput: 0: 42754.1. Samples: 1821761680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:58:08,381][26367] Avg episode reward: [(0, '0.787')] [2024-06-19 10:58:10,255][26599] Updated weights for policy 0, policy_version 339004 (0.0037) [2024-06-19 10:58:13,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42868.9, 300 sec: 42542.9). Total num frames: 5554356224. Throughput: 0: 42628.6. Samples: 1822015760. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:58:13,384][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 10:58:14,359][26599] Updated weights for policy 0, policy_version 339014 (0.0034) [2024-06-19 10:58:17,736][26599] Updated weights for policy 0, policy_version 339024 (0.0040) [2024-06-19 10:58:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 43144.3, 300 sec: 42653.9). Total num frames: 5554585600. Throughput: 0: 42713.2. Samples: 1822140180. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:58:18,384][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 10:58:22,315][26599] Updated weights for policy 0, policy_version 339034 (0.0034) [2024-06-19 10:58:23,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5554782208. Throughput: 0: 42712.2. Samples: 1822396680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 21.0) [2024-06-19 10:58:23,380][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 10:58:25,456][26599] Updated weights for policy 0, policy_version 339044 (0.0035) [2024-06-19 10:58:28,380][26367] Fps is (10 sec: 39322.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5554978816. Throughput: 0: 42755.1. Samples: 1822654860. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:58:28,380][26367] Avg episode reward: [(0, '0.411')] [2024-06-19 10:58:29,926][26599] Updated weights for policy 0, policy_version 339054 (0.0042) [2024-06-19 10:58:32,925][26599] Updated weights for policy 0, policy_version 339064 (0.0033) [2024-06-19 10:58:33,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5555224576. Throughput: 0: 42808.4. Samples: 1822784080. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:58:33,386][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 10:58:37,457][26599] Updated weights for policy 0, policy_version 339074 (0.0035) [2024-06-19 10:58:38,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.6, 300 sec: 42598.4). Total num frames: 5555421184. Throughput: 0: 42809.1. Samples: 1823048960. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:58:38,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 10:58:40,467][26599] Updated weights for policy 0, policy_version 339084 (0.0039) [2024-06-19 10:58:43,314][26579] Signal inference workers to stop experience collection... (26900 times) [2024-06-19 10:58:43,363][26599] InferenceWorker_p0-w0: stopping experience collection (26900 times) [2024-06-19 10:58:43,371][26579] Signal inference workers to resume experience collection... (26900 times) [2024-06-19 10:58:43,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5555617792. Throughput: 0: 42655.6. Samples: 1823296260. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:58:43,380][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 10:58:43,383][26599] InferenceWorker_p0-w0: resuming experience collection (26900 times) [2024-06-19 10:58:45,267][26599] Updated weights for policy 0, policy_version 339094 (0.0034) [2024-06-19 10:58:48,118][26599] Updated weights for policy 0, policy_version 339104 (0.0038) [2024-06-19 10:58:48,380][26367] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5555879936. Throughput: 0: 42718.0. Samples: 1823424600. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:58:48,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 10:58:53,071][26599] Updated weights for policy 0, policy_version 339114 (0.0028) [2024-06-19 10:58:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.5, 300 sec: 42598.4). Total num frames: 5556043776. Throughput: 0: 42680.1. Samples: 1823682280. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:58:53,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 10:58:56,399][26599] Updated weights for policy 0, policy_version 339124 (0.0030) [2024-06-19 10:58:58,380][26367] Fps is (10 sec: 37683.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5556256768. Throughput: 0: 42625.7. Samples: 1823933760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:58:58,381][26367] Avg episode reward: [(0, '0.796')] [2024-06-19 10:59:00,744][26599] Updated weights for policy 0, policy_version 339134 (0.0027) [2024-06-19 10:59:03,380][26367] Fps is (10 sec: 47513.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5556518912. Throughput: 0: 42753.5. Samples: 1824064080. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:59:03,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 10:59:03,987][26599] Updated weights for policy 0, policy_version 339144 (0.0044) [2024-06-19 10:59:08,374][26599] Updated weights for policy 0, policy_version 339154 (0.0034) [2024-06-19 10:59:08,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5556699136. Throughput: 0: 42888.4. Samples: 1824326660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:59:08,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 10:59:11,647][26599] Updated weights for policy 0, policy_version 339164 (0.0028) [2024-06-19 10:59:13,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5556912128. Throughput: 0: 42656.9. Samples: 1824574420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:59:13,381][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 10:59:15,991][26599] Updated weights for policy 0, policy_version 339174 (0.0039) [2024-06-19 10:59:18,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42820.5). Total num frames: 5557157888. Throughput: 0: 42654.3. Samples: 1824703520. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:59:18,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 10:59:19,188][26599] Updated weights for policy 0, policy_version 339184 (0.0036) [2024-06-19 10:59:23,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5557338112. Throughput: 0: 42467.1. Samples: 1824959980. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:59:23,380][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 10:59:23,461][26599] Updated weights for policy 0, policy_version 339194 (0.0043) [2024-06-19 10:59:26,715][26599] Updated weights for policy 0, policy_version 339204 (0.0028) [2024-06-19 10:59:28,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5557551104. Throughput: 0: 42625.7. Samples: 1825214420. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:59:28,381][26367] Avg episode reward: [(0, '0.435')] [2024-06-19 10:59:31,159][26599] Updated weights for policy 0, policy_version 339214 (0.0033) [2024-06-19 10:59:33,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5557796864. Throughput: 0: 42598.8. Samples: 1825341540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:59:33,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 10:59:34,289][26599] Updated weights for policy 0, policy_version 339224 (0.0039) [2024-06-19 10:59:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5557977088. Throughput: 0: 42682.1. Samples: 1825602980. Policy #0 lag: (min: 0.0, avg: 11.6, max: 20.0) [2024-06-19 10:59:38,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 10:59:38,441][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000339233_5557993472.pth... [2024-06-19 10:59:38,501][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000338609_5547769856.pth [2024-06-19 10:59:38,722][26599] Updated weights for policy 0, policy_version 339234 (0.0044) [2024-06-19 10:59:42,461][26599] Updated weights for policy 0, policy_version 339244 (0.0040) [2024-06-19 10:59:43,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5558190080. Throughput: 0: 42656.4. Samples: 1825853300. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 10:59:43,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 10:59:46,253][26599] Updated weights for policy 0, policy_version 339254 (0.0043) [2024-06-19 10:59:46,877][26579] Signal inference workers to stop experience collection... (26950 times) [2024-06-19 10:59:46,927][26599] InferenceWorker_p0-w0: stopping experience collection (26950 times) [2024-06-19 10:59:46,936][26579] Signal inference workers to resume experience collection... (26950 times) [2024-06-19 10:59:46,944][26599] InferenceWorker_p0-w0: resuming experience collection (26950 times) [2024-06-19 10:59:48,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5558403072. Throughput: 0: 42518.1. Samples: 1825977400. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 10:59:48,381][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 10:59:49,938][26599] Updated weights for policy 0, policy_version 339264 (0.0028) [2024-06-19 10:59:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5558616064. Throughput: 0: 42526.7. Samples: 1826240360. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 10:59:53,380][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 10:59:54,268][26599] Updated weights for policy 0, policy_version 339274 (0.0029) [2024-06-19 10:59:57,339][26599] Updated weights for policy 0, policy_version 339284 (0.0028) [2024-06-19 10:59:58,380][26367] Fps is (10 sec: 44237.1, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5558845440. Throughput: 0: 42593.2. Samples: 1826491120. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 10:59:58,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 11:00:01,975][26599] Updated weights for policy 0, policy_version 339294 (0.0026) [2024-06-19 11:00:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5559042048. 
Throughput: 0: 42727.1. Samples: 1826626240. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:03,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 11:00:04,787][26599] Updated weights for policy 0, policy_version 339304 (0.0047) [2024-06-19 11:00:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5559255040. Throughput: 0: 42814.5. Samples: 1826886640. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:08,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 11:00:09,595][26599] Updated weights for policy 0, policy_version 339314 (0.0027) [2024-06-19 11:00:12,899][26599] Updated weights for policy 0, policy_version 339324 (0.0037) [2024-06-19 11:00:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5559484416. Throughput: 0: 42589.4. Samples: 1827130940. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:13,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 11:00:17,280][26599] Updated weights for policy 0, policy_version 339334 (0.0039) [2024-06-19 11:00:18,383][26367] Fps is (10 sec: 44224.0, 60 sec: 42323.2, 300 sec: 42598.0). Total num frames: 5559697408. Throughput: 0: 42811.4. Samples: 1827268180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:18,384][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 11:00:20,442][26599] Updated weights for policy 0, policy_version 339344 (0.0035) [2024-06-19 11:00:23,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5559894016. Throughput: 0: 42640.0. Samples: 1827521780. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:23,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 11:00:24,809][26599] Updated weights for policy 0, policy_version 339354 (0.0029) [2024-06-19 11:00:28,380][26367] Fps is (10 sec: 42610.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5560123392. 
Throughput: 0: 42671.9. Samples: 1827773540. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:28,381][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 11:00:28,541][26599] Updated weights for policy 0, policy_version 339364 (0.0042) [2024-06-19 11:00:32,501][26599] Updated weights for policy 0, policy_version 339374 (0.0022) [2024-06-19 11:00:33,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5560352768. Throughput: 0: 42779.7. Samples: 1827902480. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:33,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 11:00:36,418][26599] Updated weights for policy 0, policy_version 339384 (0.0042) [2024-06-19 11:00:38,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5560516608. Throughput: 0: 42491.1. Samples: 1828152460. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:38,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 11:00:40,198][26599] Updated weights for policy 0, policy_version 339394 (0.0032) [2024-06-19 11:00:43,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5560762368. Throughput: 0: 42547.6. Samples: 1828405760. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:43,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 11:00:44,088][26599] Updated weights for policy 0, policy_version 339404 (0.0044) [2024-06-19 11:00:47,994][26599] Updated weights for policy 0, policy_version 339414 (0.0029) [2024-06-19 11:00:48,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42598.5, 300 sec: 42542.8). Total num frames: 5560958976. Throughput: 0: 42485.2. Samples: 1828538080. 
Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:48,381][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 11:00:51,728][26599] Updated weights for policy 0, policy_version 339424 (0.0033) [2024-06-19 11:00:53,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 5561171968. Throughput: 0: 42298.4. Samples: 1828790060. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:53,380][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 11:00:55,718][26599] Updated weights for policy 0, policy_version 339434 (0.0025) [2024-06-19 11:00:58,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42598.9). Total num frames: 5561384960. Throughput: 0: 42649.7. Samples: 1829050180. Policy #0 lag: (min: 0.0, avg: 11.4, max: 21.0) [2024-06-19 11:00:58,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 11:00:59,529][26599] Updated weights for policy 0, policy_version 339444 (0.0034) [2024-06-19 11:01:03,301][26599] Updated weights for policy 0, policy_version 339454 (0.0031) [2024-06-19 11:01:03,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5561614336. Throughput: 0: 42409.9. Samples: 1829176500. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:03,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 11:01:07,041][26599] Updated weights for policy 0, policy_version 339464 (0.0032) [2024-06-19 11:01:08,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5561827328. Throughput: 0: 42536.6. Samples: 1829435920. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:08,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 11:01:10,818][26599] Updated weights for policy 0, policy_version 339474 (0.0042) [2024-06-19 11:01:13,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5562040320. Throughput: 0: 42615.5. Samples: 1829691240. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:13,381][26367] Avg episode reward: [(0, '0.694')] [2024-06-19 11:01:14,946][26599] Updated weights for policy 0, policy_version 339484 (0.0032) [2024-06-19 11:01:18,347][26599] Updated weights for policy 0, policy_version 339494 (0.0037) [2024-06-19 11:01:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42873.6, 300 sec: 42654.0). Total num frames: 5562269696. Throughput: 0: 42605.4. Samples: 1829819720. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:18,381][26367] Avg episode reward: [(0, '0.743')] [2024-06-19 11:01:22,291][26599] Updated weights for policy 0, policy_version 339504 (0.0024) [2024-06-19 11:01:23,384][26367] Fps is (10 sec: 42583.4, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 5562466304. Throughput: 0: 42904.5. Samples: 1830083320. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:23,384][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 11:01:25,863][26599] Updated weights for policy 0, policy_version 339514 (0.0031) [2024-06-19 11:01:28,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42595.9, 300 sec: 42653.4). Total num frames: 5562679296. Throughput: 0: 43041.0. Samples: 1830342760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:28,384][26367] Avg episode reward: [(0, '0.609')] [2024-06-19 11:01:29,632][26599] Updated weights for policy 0, policy_version 339524 (0.0032) [2024-06-19 11:01:33,277][26599] Updated weights for policy 0, policy_version 339534 (0.0043) [2024-06-19 11:01:33,384][26367] Fps is (10 sec: 45875.2, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 5562925056. Throughput: 0: 43150.4. Samples: 1830480000. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:33,385][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 11:01:37,579][26599] Updated weights for policy 0, policy_version 339544 (0.0038) [2024-06-19 11:01:38,382][26367] Fps is (10 sec: 42607.6, 60 sec: 43143.4, 300 sec: 42654.3). Total num frames: 5563105280. Throughput: 0: 43131.0. Samples: 1830731020. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:38,382][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 11:01:38,407][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000339545_5563105280.pth... [2024-06-19 11:01:38,485][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000338920_5552865280.pth [2024-06-19 11:01:41,423][26599] Updated weights for policy 0, policy_version 339554 (0.0022) [2024-06-19 11:01:43,380][26367] Fps is (10 sec: 40975.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5563334656. Throughput: 0: 43025.8. Samples: 1830986340. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:43,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 11:01:44,812][26579] Signal inference workers to stop experience collection... (27000 times) [2024-06-19 11:01:44,848][26599] InferenceWorker_p0-w0: stopping experience collection (27000 times) [2024-06-19 11:01:44,871][26579] Signal inference workers to resume experience collection... (27000 times) [2024-06-19 11:01:44,871][26599] InferenceWorker_p0-w0: resuming experience collection (27000 times) [2024-06-19 11:01:45,009][26599] Updated weights for policy 0, policy_version 339564 (0.0043) [2024-06-19 11:01:48,380][26367] Fps is (10 sec: 44243.1, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5563547648. Throughput: 0: 43205.3. Samples: 1831120740. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:48,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 11:01:48,998][26599] Updated weights for policy 0, policy_version 339574 (0.0042) [2024-06-19 11:01:53,064][26599] Updated weights for policy 0, policy_version 339584 (0.0030) [2024-06-19 11:01:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5563744256. Throughput: 0: 43082.6. Samples: 1831374640. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:53,381][26367] Avg episode reward: [(0, '0.818')] [2024-06-19 11:01:56,440][26599] Updated weights for policy 0, policy_version 339594 (0.0040) [2024-06-19 11:01:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5563973632. Throughput: 0: 43143.7. Samples: 1831632700. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:01:58,381][26367] Avg episode reward: [(0, '0.512')] [2024-06-19 11:02:00,535][26599] Updated weights for policy 0, policy_version 339604 (0.0023) [2024-06-19 11:02:03,380][26367] Fps is (10 sec: 45875.5, 60 sec: 43144.6, 300 sec: 42710.0). Total num frames: 5564203008. Throughput: 0: 43255.2. Samples: 1831766200. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:02:03,381][26367] Avg episode reward: [(0, '0.417')] [2024-06-19 11:02:03,987][26599] Updated weights for policy 0, policy_version 339614 (0.0037) [2024-06-19 11:02:08,076][26599] Updated weights for policy 0, policy_version 339624 (0.0040) [2024-06-19 11:02:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5564399616. Throughput: 0: 43114.6. Samples: 1832023320. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:02:08,380][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 11:02:11,810][26599] Updated weights for policy 0, policy_version 339634 (0.0036) [2024-06-19 11:02:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5564612608. Throughput: 0: 42936.8. Samples: 1832274760. Policy #0 lag: (min: 1.0, avg: 11.1, max: 22.0) [2024-06-19 11:02:13,381][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 11:02:15,670][26599] Updated weights for policy 0, policy_version 339644 (0.0042) [2024-06-19 11:02:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5564841984. Throughput: 0: 42778.5. Samples: 1832404880. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:02:18,381][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 11:02:19,398][26599] Updated weights for policy 0, policy_version 339654 (0.0030) [2024-06-19 11:02:23,252][26599] Updated weights for policy 0, policy_version 339664 (0.0029) [2024-06-19 11:02:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43147.1, 300 sec: 42820.5). Total num frames: 5565054976. Throughput: 0: 42871.1. Samples: 1832660160. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:02:23,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 11:02:27,029][26599] Updated weights for policy 0, policy_version 339674 (0.0041) [2024-06-19 11:02:28,380][26367] Fps is (10 sec: 42598.8, 60 sec: 43147.2, 300 sec: 42765.0). Total num frames: 5565267968. Throughput: 0: 42892.9. Samples: 1832916520. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:02:28,381][26367] Avg episode reward: [(0, '0.294')] [2024-06-19 11:02:31,110][26599] Updated weights for policy 0, policy_version 339684 (0.0028) [2024-06-19 11:02:33,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42327.8, 300 sec: 42709.5). Total num frames: 5565464576. Throughput: 0: 42741.7. Samples: 1833044120. 
Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:02:33,381][26367] Avg episode reward: [(0, '0.223')] [2024-06-19 11:02:34,919][26599] Updated weights for policy 0, policy_version 339694 (0.0040) [2024-06-19 11:02:38,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42872.4, 300 sec: 42765.0). Total num frames: 5565677568. Throughput: 0: 42772.8. Samples: 1833299420. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:02:38,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 11:02:38,846][26599] Updated weights for policy 0, policy_version 339704 (0.0043) [2024-06-19 11:02:42,532][26599] Updated weights for policy 0, policy_version 339714 (0.0035) [2024-06-19 11:02:43,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5565890560. Throughput: 0: 42877.4. Samples: 1833562180. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:02:43,381][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 11:02:46,333][26599] Updated weights for policy 0, policy_version 339724 (0.0037) [2024-06-19 11:02:48,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5566119936. Throughput: 0: 42807.5. Samples: 1833692540. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:02:48,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 11:02:50,076][26599] Updated weights for policy 0, policy_version 339734 (0.0027) [2024-06-19 11:02:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5566332928. Throughput: 0: 42663.5. Samples: 1833943180. 
Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:02:53,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 11:02:53,894][26599] Updated weights for policy 0, policy_version 339744 (0.0033) [2024-06-19 11:02:57,834][26599] Updated weights for policy 0, policy_version 339754 (0.0035) [2024-06-19 11:02:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5566529536. Throughput: 0: 42940.4. Samples: 1834207080. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:02:58,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 11:03:01,400][26599] Updated weights for policy 0, policy_version 339764 (0.0037) [2024-06-19 11:03:03,384][26367] Fps is (10 sec: 44220.9, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5566775296. Throughput: 0: 42867.7. Samples: 1834334080. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:03:03,384][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 11:03:05,669][26599] Updated weights for policy 0, policy_version 339774 (0.0044) [2024-06-19 11:03:08,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.3, 300 sec: 42765.5). Total num frames: 5566971904. Throughput: 0: 42859.5. Samples: 1834588840. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:03:08,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 11:03:09,214][26599] Updated weights for policy 0, policy_version 339784 (0.0031) [2024-06-19 11:03:13,380][26367] Fps is (10 sec: 39335.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5567168512. Throughput: 0: 42988.8. Samples: 1834851020. 
Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:03:13,381][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 11:03:13,768][26599] Updated weights for policy 0, policy_version 339794 (0.0029) [2024-06-19 11:03:16,763][26599] Updated weights for policy 0, policy_version 339804 (0.0044) [2024-06-19 11:03:18,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5567397888. Throughput: 0: 42861.9. Samples: 1834972900. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:03:18,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 11:03:21,266][26599] Updated weights for policy 0, policy_version 339814 (0.0036) [2024-06-19 11:03:23,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5567610880. Throughput: 0: 42857.3. Samples: 1835228000. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:03:23,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 11:03:24,809][26599] Updated weights for policy 0, policy_version 339824 (0.0026) [2024-06-19 11:03:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.3, 300 sec: 42654.0). Total num frames: 5567807488. Throughput: 0: 42700.5. Samples: 1835483700. Policy #0 lag: (min: 2.0, avg: 10.6, max: 21.0) [2024-06-19 11:03:28,380][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 11:03:28,661][26599] Updated weights for policy 0, policy_version 339834 (0.0039) [2024-06-19 11:03:30,296][26579] Signal inference workers to stop experience collection... (27050 times) [2024-06-19 11:03:30,298][26579] Signal inference workers to resume experience collection... 
(27050 times) [2024-06-19 11:03:30,315][26599] InferenceWorker_p0-w0: stopping experience collection (27050 times) [2024-06-19 11:03:30,345][26599] InferenceWorker_p0-w0: resuming experience collection (27050 times) [2024-06-19 11:03:32,277][26599] Updated weights for policy 0, policy_version 339844 (0.0035) [2024-06-19 11:03:33,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5568036864. Throughput: 0: 42639.2. Samples: 1835611300. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:03:33,380][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 11:03:36,167][26599] Updated weights for policy 0, policy_version 339854 (0.0049) [2024-06-19 11:03:38,384][26367] Fps is (10 sec: 42582.3, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 5568233472. Throughput: 0: 42720.5. Samples: 1835865760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:03:38,385][26367] Avg episode reward: [(0, '0.471')] [2024-06-19 11:03:38,417][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000339859_5568249856.pth... [2024-06-19 11:03:38,489][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000339233_5557993472.pth [2024-06-19 11:03:39,984][26599] Updated weights for policy 0, policy_version 339864 (0.0031) [2024-06-19 11:03:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5568462848. Throughput: 0: 42483.2. Samples: 1836118820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:03:43,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 11:03:43,625][26599] Updated weights for policy 0, policy_version 339874 (0.0035) [2024-06-19 11:03:47,581][26599] Updated weights for policy 0, policy_version 339884 (0.0039) [2024-06-19 11:03:48,382][26367] Fps is (10 sec: 44246.5, 60 sec: 42597.3, 300 sec: 42820.3). Total num frames: 5568675840. Throughput: 0: 42711.8. Samples: 1836256020. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:03:48,383][26367] Avg episode reward: [(0, '0.816')] [2024-06-19 11:03:51,159][26599] Updated weights for policy 0, policy_version 339894 (0.0041) [2024-06-19 11:03:53,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5568872448. Throughput: 0: 42659.6. Samples: 1836508520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:03:53,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 11:03:55,519][26599] Updated weights for policy 0, policy_version 339904 (0.0034) [2024-06-19 11:03:58,380][26367] Fps is (10 sec: 44243.2, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5569118208. Throughput: 0: 42284.9. Samples: 1836753840. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:03:58,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 11:03:59,713][26599] Updated weights for policy 0, policy_version 339914 (0.0042) [2024-06-19 11:04:03,055][26599] Updated weights for policy 0, policy_version 339924 (0.0038) [2024-06-19 11:04:03,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42327.9, 300 sec: 42765.0). Total num frames: 5569314816. Throughput: 0: 42677.3. Samples: 1836893380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:04:03,384][26367] Avg episode reward: [(0, '0.553')] [2024-06-19 11:04:07,486][26599] Updated weights for policy 0, policy_version 339934 (0.0038) [2024-06-19 11:04:08,380][26367] Fps is (10 sec: 37682.8, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5569495040. Throughput: 0: 42481.3. Samples: 1837139660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:04:08,381][26367] Avg episode reward: [(0, '0.676')] [2024-06-19 11:04:10,591][26599] Updated weights for policy 0, policy_version 339944 (0.0035) [2024-06-19 11:04:13,380][26367] Fps is (10 sec: 45875.3, 60 sec: 43417.6, 300 sec: 42765.0). Total num frames: 5569773568. Throughput: 0: 42345.7. Samples: 1837389260. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:04:13,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 11:04:15,069][26599] Updated weights for policy 0, policy_version 339954 (0.0041) [2024-06-19 11:04:18,352][26599] Updated weights for policy 0, policy_version 339964 (0.0036) [2024-06-19 11:04:18,384][26367] Fps is (10 sec: 47497.1, 60 sec: 42868.9, 300 sec: 42820.0). Total num frames: 5569970176. Throughput: 0: 42672.4. Samples: 1837531720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:04:18,385][26367] Avg episode reward: [(0, '0.311')] [2024-06-19 11:04:22,657][26599] Updated weights for policy 0, policy_version 339974 (0.0029) [2024-06-19 11:04:23,380][26367] Fps is (10 sec: 37682.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5570150400. Throughput: 0: 42559.4. Samples: 1837780780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:04:23,381][26367] Avg episode reward: [(0, '0.304')] [2024-06-19 11:04:26,010][26599] Updated weights for policy 0, policy_version 339984 (0.0038) [2024-06-19 11:04:28,380][26367] Fps is (10 sec: 42614.3, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5570396160. Throughput: 0: 42571.1. Samples: 1838034520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:04:28,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 11:04:30,265][26599] Updated weights for policy 0, policy_version 339994 (0.0033) [2024-06-19 11:04:33,380][26367] Fps is (10 sec: 45876.1, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5570609152. Throughput: 0: 42563.3. Samples: 1838171300. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:04:33,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 11:04:33,560][26599] Updated weights for policy 0, policy_version 340004 (0.0034) [2024-06-19 11:04:37,766][26599] Updated weights for policy 0, policy_version 340014 (0.0035) [2024-06-19 11:04:38,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5570789376. Throughput: 0: 42422.2. Samples: 1838417520. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:04:38,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 11:04:40,514][26579] Signal inference workers to stop experience collection... (27100 times) [2024-06-19 11:04:40,514][26579] Signal inference workers to resume experience collection... (27100 times) [2024-06-19 11:04:40,555][26599] InferenceWorker_p0-w0: stopping experience collection (27100 times) [2024-06-19 11:04:40,555][26599] InferenceWorker_p0-w0: resuming experience collection (27100 times) [2024-06-19 11:04:41,286][26599] Updated weights for policy 0, policy_version 340024 (0.0033) [2024-06-19 11:04:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5571035136. Throughput: 0: 42690.4. Samples: 1838674900. Policy #0 lag: (min: 0.0, avg: 10.7, max: 22.0) [2024-06-19 11:04:43,380][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 11:04:45,308][26599] Updated weights for policy 0, policy_version 340034 (0.0039) [2024-06-19 11:04:48,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42326.3, 300 sec: 42709.4). Total num frames: 5571215360. Throughput: 0: 42504.3. Samples: 1838806080. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:04:48,381][26367] Avg episode reward: [(0, '0.718')] [2024-06-19 11:04:49,418][26599] Updated weights for policy 0, policy_version 340044 (0.0038) [2024-06-19 11:04:52,915][26599] Updated weights for policy 0, policy_version 340054 (0.0030) [2024-06-19 11:04:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5571444736. Throughput: 0: 42430.9. Samples: 1839049040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:04:53,380][26367] Avg episode reward: [(0, '0.806')] [2024-06-19 11:04:57,170][26599] Updated weights for policy 0, policy_version 340064 (0.0024) [2024-06-19 11:04:58,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5571690496. Throughput: 0: 42732.3. Samples: 1839312220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:04:58,384][26367] Avg episode reward: [(0, '0.813')] [2024-06-19 11:05:00,518][26599] Updated weights for policy 0, policy_version 340074 (0.0031) [2024-06-19 11:05:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5571870720. Throughput: 0: 42443.1. Samples: 1839441500. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:03,380][26367] Avg episode reward: [(0, '0.690')] [2024-06-19 11:05:04,844][26599] Updated weights for policy 0, policy_version 340084 (0.0036) [2024-06-19 11:05:08,380][26367] Fps is (10 sec: 39322.2, 60 sec: 43144.7, 300 sec: 42709.5). Total num frames: 5572083712. Throughput: 0: 42548.2. Samples: 1839695440. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:08,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 11:05:08,485][26599] Updated weights for policy 0, policy_version 340094 (0.0040) [2024-06-19 11:05:12,488][26599] Updated weights for policy 0, policy_version 340104 (0.0043) [2024-06-19 11:05:13,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.5). Total num frames: 5572313088. Throughput: 0: 42595.1. Samples: 1839951300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:13,380][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 11:05:16,109][26599] Updated weights for policy 0, policy_version 340114 (0.0035) [2024-06-19 11:05:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42327.8, 300 sec: 42765.0). Total num frames: 5572509696. Throughput: 0: 42409.2. Samples: 1840079720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:18,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 11:05:19,966][26599] Updated weights for policy 0, policy_version 340124 (0.0032) [2024-06-19 11:05:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 5572739072. Throughput: 0: 42679.6. Samples: 1840338100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:23,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 11:05:23,588][26599] Updated weights for policy 0, policy_version 340134 (0.0030) [2024-06-19 11:05:27,689][26599] Updated weights for policy 0, policy_version 340144 (0.0043) [2024-06-19 11:05:28,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5572952064. Throughput: 0: 42695.4. Samples: 1840596200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:28,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 11:05:31,253][26599] Updated weights for policy 0, policy_version 340154 (0.0029) [2024-06-19 11:05:33,380][26367] Fps is (10 sec: 40959.2, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 5573148672. Throughput: 0: 42661.8. Samples: 1840725860. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:33,381][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 11:05:35,310][26599] Updated weights for policy 0, policy_version 340164 (0.0032) [2024-06-19 11:05:38,380][26367] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5573378048. Throughput: 0: 42856.0. Samples: 1840977560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:38,380][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 11:05:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000340172_5573378048.pth... [2024-06-19 11:05:38,446][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000339545_5563105280.pth [2024-06-19 11:05:38,903][26599] Updated weights for policy 0, policy_version 340174 (0.0034) [2024-06-19 11:05:43,018][26599] Updated weights for policy 0, policy_version 340184 (0.0048) [2024-06-19 11:05:43,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.2, 300 sec: 42765.0). Total num frames: 5573574656. Throughput: 0: 42748.5. Samples: 1841235900. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:43,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 11:05:46,549][26599] Updated weights for policy 0, policy_version 340194 (0.0024) [2024-06-19 11:05:48,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42869.0, 300 sec: 42764.5). Total num frames: 5573787648. Throughput: 0: 42723.1. Samples: 1841364200. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:48,385][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 11:05:50,471][26599] Updated weights for policy 0, policy_version 340204 (0.0027) [2024-06-19 11:05:52,076][26579] Signal inference workers to stop experience collection... (27150 times) [2024-06-19 11:05:52,077][26579] Signal inference workers to resume experience collection... (27150 times) [2024-06-19 11:05:52,090][26599] InferenceWorker_p0-w0: stopping experience collection (27150 times) [2024-06-19 11:05:52,090][26599] InferenceWorker_p0-w0: resuming experience collection (27150 times) [2024-06-19 11:05:53,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5574000640. Throughput: 0: 42836.5. Samples: 1841623080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:53,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 11:05:54,191][26599] Updated weights for policy 0, policy_version 340214 (0.0031) [2024-06-19 11:05:58,120][26599] Updated weights for policy 0, policy_version 340224 (0.0032) [2024-06-19 11:05:58,380][26367] Fps is (10 sec: 44252.5, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5574230016. Throughput: 0: 42955.0. Samples: 1841884280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:05:58,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 11:06:02,302][26599] Updated weights for policy 0, policy_version 340234 (0.0048) [2024-06-19 11:06:03,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42709.4). Total num frames: 5574426624. Throughput: 0: 43038.7. Samples: 1842016460. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:03,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 11:06:05,783][26599] Updated weights for policy 0, policy_version 340244 (0.0038) [2024-06-19 11:06:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5574656000. Throughput: 0: 42774.5. 
Samples: 1842262960. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:08,381][26367] Avg episode reward: [(0, '0.439')] [2024-06-19 11:06:10,030][26599] Updated weights for policy 0, policy_version 340254 (0.0040) [2024-06-19 11:06:13,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5574868992. Throughput: 0: 42705.0. Samples: 1842517920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:13,380][26367] Avg episode reward: [(0, '0.488')] [2024-06-19 11:06:13,489][26599] Updated weights for policy 0, policy_version 340264 (0.0037) [2024-06-19 11:06:17,586][26599] Updated weights for policy 0, policy_version 340274 (0.0035) [2024-06-19 11:06:18,384][26367] Fps is (10 sec: 40945.5, 60 sec: 42595.9, 300 sec: 42709.5). Total num frames: 5575065600. Throughput: 0: 42693.1. Samples: 1842647200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:18,384][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 11:06:21,243][26599] Updated weights for policy 0, policy_version 340284 (0.0034) [2024-06-19 11:06:23,382][26367] Fps is (10 sec: 44229.4, 60 sec: 42870.3, 300 sec: 42820.8). Total num frames: 5575311360. Throughput: 0: 42703.3. Samples: 1842899280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:23,382][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 11:06:25,143][26599] Updated weights for policy 0, policy_version 340294 (0.0031) [2024-06-19 11:06:28,380][26367] Fps is (10 sec: 44252.7, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 5575507968. Throughput: 0: 42801.8. Samples: 1843161980. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:28,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 11:06:28,813][26599] Updated weights for policy 0, policy_version 340304 (0.0040) [2024-06-19 11:06:32,735][26599] Updated weights for policy 0, policy_version 340314 (0.0035) [2024-06-19 11:06:33,380][26367] Fps is (10 sec: 39327.2, 60 sec: 42598.4, 300 sec: 42709.7). Total num frames: 5575704576. Throughput: 0: 42561.0. Samples: 1843279300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:33,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 11:06:36,484][26599] Updated weights for policy 0, policy_version 340324 (0.0034) [2024-06-19 11:06:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 5575933952. Throughput: 0: 42575.4. Samples: 1843538980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:38,381][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 11:06:40,291][26599] Updated weights for policy 0, policy_version 340334 (0.0035) [2024-06-19 11:06:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5576130560. Throughput: 0: 42440.0. Samples: 1843794080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:43,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 11:06:44,542][26599] Updated weights for policy 0, policy_version 340344 (0.0031) [2024-06-19 11:06:48,100][26599] Updated weights for policy 0, policy_version 340354 (0.0034) [2024-06-19 11:06:48,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42874.1, 300 sec: 42765.0). Total num frames: 5576359936. Throughput: 0: 42300.6. Samples: 1843919980. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:48,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 11:06:52,435][26599] Updated weights for policy 0, policy_version 340364 (0.0032) [2024-06-19 11:06:53,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5576572928. Throughput: 0: 42591.2. Samples: 1844179560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:53,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 11:06:55,842][26599] Updated weights for policy 0, policy_version 340374 (0.0030) [2024-06-19 11:06:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5576769536. Throughput: 0: 42541.8. Samples: 1844432300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:06:58,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 11:07:00,078][26599] Updated weights for policy 0, policy_version 340384 (0.0044) [2024-06-19 11:07:03,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5576998912. Throughput: 0: 42395.1. Samples: 1844554820. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:07:03,380][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 11:07:03,487][26599] Updated weights for policy 0, policy_version 340394 (0.0025) [2024-06-19 11:07:07,775][26599] Updated weights for policy 0, policy_version 340404 (0.0030) [2024-06-19 11:07:08,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5577195520. Throughput: 0: 42557.4. Samples: 1844814300. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:07:08,381][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 11:07:11,141][26599] Updated weights for policy 0, policy_version 340414 (0.0042) [2024-06-19 11:07:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5577424896. Throughput: 0: 42407.2. Samples: 1845070300. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:07:13,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 11:07:15,482][26599] Updated weights for policy 0, policy_version 340424 (0.0027) [2024-06-19 11:07:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42327.8, 300 sec: 42542.9). Total num frames: 5577605120. Throughput: 0: 42553.3. Samples: 1845194200. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 11:07:18,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 11:07:19,206][26599] Updated weights for policy 0, policy_version 340434 (0.0029) [2024-06-19 11:07:23,043][26599] Updated weights for policy 0, policy_version 340444 (0.0035) [2024-06-19 11:07:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42326.5, 300 sec: 42653.9). Total num frames: 5577850880. Throughput: 0: 42521.0. Samples: 1845452420. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:07:23,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 11:07:26,860][26599] Updated weights for policy 0, policy_version 340454 (0.0030) [2024-06-19 11:07:28,384][26367] Fps is (10 sec: 45859.3, 60 sec: 42595.9, 300 sec: 42709.0). Total num frames: 5578063872. Throughput: 0: 42552.7. Samples: 1845709100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:07:28,384][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 11:07:30,591][26599] Updated weights for policy 0, policy_version 340464 (0.0035) [2024-06-19 11:07:33,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.6, 300 sec: 42654.0). Total num frames: 5578260480. Throughput: 0: 42609.7. Samples: 1845837420. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:07:33,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 11:07:34,511][26599] Updated weights for policy 0, policy_version 340474 (0.0043) [2024-06-19 11:07:38,197][26599] Updated weights for policy 0, policy_version 340484 (0.0040) [2024-06-19 11:07:38,380][26367] Fps is (10 sec: 42613.7, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5578489856. Throughput: 0: 42610.2. Samples: 1846097020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:07:38,381][26367] Avg episode reward: [(0, '0.468')] [2024-06-19 11:07:38,475][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000340485_5578506240.pth... [2024-06-19 11:07:38,529][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000339859_5568249856.pth [2024-06-19 11:07:42,284][26599] Updated weights for policy 0, policy_version 340494 (0.0040) [2024-06-19 11:07:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5578686464. Throughput: 0: 42659.5. Samples: 1846351980. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:07:43,381][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 11:07:43,447][26579] Signal inference workers to stop experience collection... (27200 times) [2024-06-19 11:07:43,448][26579] Signal inference workers to resume experience collection... (27200 times) [2024-06-19 11:07:43,468][26599] InferenceWorker_p0-w0: stopping experience collection (27200 times) [2024-06-19 11:07:43,468][26599] InferenceWorker_p0-w0: resuming experience collection (27200 times) [2024-06-19 11:07:45,906][26599] Updated weights for policy 0, policy_version 340504 (0.0049) [2024-06-19 11:07:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5578915840. Throughput: 0: 42614.6. Samples: 1846472480. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:07:48,381][26367] Avg episode reward: [(0, '0.829')] [2024-06-19 11:07:49,909][26599] Updated weights for policy 0, policy_version 340514 (0.0035) [2024-06-19 11:07:53,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5579112448. Throughput: 0: 42598.3. Samples: 1846731220. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:07:53,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 11:07:53,725][26599] Updated weights for policy 0, policy_version 340524 (0.0028) [2024-06-19 11:07:57,958][26599] Updated weights for policy 0, policy_version 340534 (0.0027) [2024-06-19 11:07:58,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.9). Total num frames: 5579341824. Throughput: 0: 42716.9. Samples: 1846992560. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:07:58,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 11:08:01,363][26599] Updated weights for policy 0, policy_version 340544 (0.0029) [2024-06-19 11:08:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5579554816. Throughput: 0: 42760.1. Samples: 1847118400. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:08:03,381][26367] Avg episode reward: [(0, '0.727')] [2024-06-19 11:08:05,666][26599] Updated weights for policy 0, policy_version 340554 (0.0029) [2024-06-19 11:08:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5579767808. Throughput: 0: 42544.9. Samples: 1847366940. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:08:08,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 11:08:09,225][26599] Updated weights for policy 0, policy_version 340564 (0.0041) [2024-06-19 11:08:13,241][26599] Updated weights for policy 0, policy_version 340574 (0.0034) [2024-06-19 11:08:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5579964416. Throughput: 0: 42650.1. Samples: 1847628200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:08:13,381][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 11:08:17,230][26599] Updated weights for policy 0, policy_version 340584 (0.0028) [2024-06-19 11:08:18,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5580177408. Throughput: 0: 42567.1. Samples: 1847752940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:08:18,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 11:08:20,981][26599] Updated weights for policy 0, policy_version 340594 (0.0029) [2024-06-19 11:08:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5580390400. Throughput: 0: 42440.5. Samples: 1848006840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:08:23,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 11:08:25,129][26599] Updated weights for policy 0, policy_version 340604 (0.0034) [2024-06-19 11:08:28,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42052.3, 300 sec: 42542.3). Total num frames: 5580587008. Throughput: 0: 42501.5. Samples: 1848264700. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:08:28,384][26367] Avg episode reward: [(0, '0.415')] [2024-06-19 11:08:28,726][26599] Updated weights for policy 0, policy_version 340614 (0.0026) [2024-06-19 11:08:32,908][26599] Updated weights for policy 0, policy_version 340624 (0.0032) [2024-06-19 11:08:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42654.5). Total num frames: 5580816384. Throughput: 0: 42599.6. Samples: 1848389460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 24.0) [2024-06-19 11:08:33,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 11:08:36,627][26599] Updated weights for policy 0, policy_version 340634 (0.0038) [2024-06-19 11:08:38,380][26367] Fps is (10 sec: 45891.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5581045760. Throughput: 0: 42529.3. Samples: 1848645040. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:08:38,381][26367] Avg episode reward: [(0, '0.732')] [2024-06-19 11:08:40,603][26599] Updated weights for policy 0, policy_version 340644 (0.0041) [2024-06-19 11:08:43,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42598.6). Total num frames: 5581242368. Throughput: 0: 42292.5. Samples: 1848895720. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:08:43,380][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 11:08:44,145][26599] Updated weights for policy 0, policy_version 340654 (0.0045) [2024-06-19 11:08:48,014][26599] Updated weights for policy 0, policy_version 340664 (0.0040) [2024-06-19 11:08:48,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42052.2, 300 sec: 42598.4). Total num frames: 5581438976. Throughput: 0: 42452.0. Samples: 1849028740. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:08:48,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 11:08:51,663][26599] Updated weights for policy 0, policy_version 340674 (0.0042) [2024-06-19 11:08:53,384][26367] Fps is (10 sec: 44220.2, 60 sec: 42868.9, 300 sec: 42597.9). Total num frames: 5581684736. Throughput: 0: 42577.9. Samples: 1849283100. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:08:53,385][26367] Avg episode reward: [(0, '0.813')] [2024-06-19 11:08:55,971][26599] Updated weights for policy 0, policy_version 340684 (0.0034) [2024-06-19 11:08:58,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5581881344. Throughput: 0: 42475.6. Samples: 1849539600. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:08:58,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 11:08:59,315][26599] Updated weights for policy 0, policy_version 340694 (0.0038) [2024-06-19 11:09:03,384][26367] Fps is (10 sec: 39321.5, 60 sec: 42049.7, 300 sec: 42653.4). Total num frames: 5582077952. Throughput: 0: 42461.4. Samples: 1849663860. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:03,385][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 11:09:03,554][26599] Updated weights for policy 0, policy_version 340704 (0.0034) [2024-06-19 11:09:06,888][26599] Updated weights for policy 0, policy_version 340714 (0.0034) [2024-06-19 11:09:08,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5582307328. Throughput: 0: 42476.5. Samples: 1849918280. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:08,380][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 11:09:11,038][26599] Updated weights for policy 0, policy_version 340724 (0.0025) [2024-06-19 11:09:11,918][26579] Signal inference workers to stop experience collection... 
(27250 times) [2024-06-19 11:09:11,919][26579] Signal inference workers to resume experience collection... (27250 times) [2024-06-19 11:09:11,935][26599] InferenceWorker_p0-w0: stopping experience collection (27250 times) [2024-06-19 11:09:11,935][26599] InferenceWorker_p0-w0: resuming experience collection (27250 times) [2024-06-19 11:09:13,380][26367] Fps is (10 sec: 44253.1, 60 sec: 42598.4, 300 sec: 42543.4). Total num frames: 5582520320. Throughput: 0: 42425.6. Samples: 1850173700. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:13,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 11:09:14,737][26599] Updated weights for policy 0, policy_version 340734 (0.0040) [2024-06-19 11:09:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5582716928. Throughput: 0: 42475.1. Samples: 1850300840. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:18,381][26367] Avg episode reward: [(0, '0.662')] [2024-06-19 11:09:18,638][26599] Updated weights for policy 0, policy_version 340744 (0.0029) [2024-06-19 11:09:22,347][26599] Updated weights for policy 0, policy_version 340754 (0.0044) [2024-06-19 11:09:23,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5582946304. Throughput: 0: 42460.8. Samples: 1850555780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:23,381][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 11:09:26,287][26599] Updated weights for policy 0, policy_version 340764 (0.0033) [2024-06-19 11:09:28,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42600.9, 300 sec: 42487.3). Total num frames: 5583142912. Throughput: 0: 42537.6. Samples: 1850809920. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:28,381][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 11:09:30,365][26599] Updated weights for policy 0, policy_version 340774 (0.0023) [2024-06-19 11:09:33,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5583372288. Throughput: 0: 42340.4. Samples: 1850934060. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:33,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 11:09:33,985][26599] Updated weights for policy 0, policy_version 340784 (0.0025) [2024-06-19 11:09:38,019][26599] Updated weights for policy 0, policy_version 340794 (0.0035) [2024-06-19 11:09:38,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42052.4, 300 sec: 42487.3). Total num frames: 5583568896. Throughput: 0: 42307.5. Samples: 1851186780. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:38,380][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 11:09:38,393][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000340795_5583585280.pth... [2024-06-19 11:09:38,453][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000340172_5573378048.pth [2024-06-19 11:09:41,749][26599] Updated weights for policy 0, policy_version 340804 (0.0032) [2024-06-19 11:09:43,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5583765504. Throughput: 0: 42323.5. Samples: 1851444160. Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:43,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 11:09:45,753][26599] Updated weights for policy 0, policy_version 340814 (0.0028) [2024-06-19 11:09:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.5, 300 sec: 42542.9). Total num frames: 5583994880. Throughput: 0: 42324.4. Samples: 1851568300. 
Policy #0 lag: (min: 0.0, avg: 9.2, max: 21.0) [2024-06-19 11:09:48,380][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 11:09:49,252][26599] Updated weights for policy 0, policy_version 340824 (0.0038) [2024-06-19 11:09:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42054.8, 300 sec: 42431.8). Total num frames: 5584207872. Throughput: 0: 42292.4. Samples: 1851821440. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:09:53,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 11:09:53,422][26599] Updated weights for policy 0, policy_version 340834 (0.0024) [2024-06-19 11:09:56,850][26599] Updated weights for policy 0, policy_version 340844 (0.0037) [2024-06-19 11:09:58,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5584404480. Throughput: 0: 42252.8. Samples: 1852075080. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:09:58,381][26367] Avg episode reward: [(0, '0.722')] [2024-06-19 11:10:01,433][26599] Updated weights for policy 0, policy_version 340854 (0.0031) [2024-06-19 11:10:03,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42601.0, 300 sec: 42542.9). Total num frames: 5584633856. Throughput: 0: 42210.2. Samples: 1852200300. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:03,380][26367] Avg episode reward: [(0, '0.806')] [2024-06-19 11:10:05,147][26599] Updated weights for policy 0, policy_version 340864 (0.0035) [2024-06-19 11:10:08,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5584846848. Throughput: 0: 42323.3. Samples: 1852460320. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:08,380][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 11:10:09,014][26599] Updated weights for policy 0, policy_version 340874 (0.0045) [2024-06-19 11:10:12,766][26599] Updated weights for policy 0, policy_version 340884 (0.0033) [2024-06-19 11:10:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5585059840. Throughput: 0: 42207.6. Samples: 1852709260. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:13,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 11:10:16,567][26599] Updated weights for policy 0, policy_version 340894 (0.0041) [2024-06-19 11:10:18,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5585289216. Throughput: 0: 42354.8. Samples: 1852840020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:18,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 11:10:20,157][26599] Updated weights for policy 0, policy_version 340904 (0.0033) [2024-06-19 11:10:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 5585469440. Throughput: 0: 42507.5. Samples: 1853099620. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:23,381][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 11:10:24,224][26599] Updated weights for policy 0, policy_version 340914 (0.0038) [2024-06-19 11:10:28,380][26367] Fps is (10 sec: 39321.0, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5585682432. Throughput: 0: 42312.8. Samples: 1853348240. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:28,382][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 11:10:28,650][26599] Updated weights for policy 0, policy_version 340924 (0.0030) [2024-06-19 11:10:31,939][26599] Updated weights for policy 0, policy_version 340934 (0.0033) [2024-06-19 11:10:33,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42487.3). Total num frames: 5585911808. Throughput: 0: 42476.0. Samples: 1853479720. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:33,380][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 11:10:36,365][26599] Updated weights for policy 0, policy_version 340944 (0.0029) [2024-06-19 11:10:38,384][26367] Fps is (10 sec: 44221.3, 60 sec: 42595.8, 300 sec: 42542.4). Total num frames: 5586124800. Throughput: 0: 42604.2. Samples: 1853738780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:38,384][26367] Avg episode reward: [(0, '0.813')] [2024-06-19 11:10:39,586][26599] Updated weights for policy 0, policy_version 340954 (0.0038) [2024-06-19 11:10:43,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42871.4, 300 sec: 42543.4). Total num frames: 5586337792. Throughput: 0: 42420.8. Samples: 1853984020. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:43,381][26367] Avg episode reward: [(0, '0.732')] [2024-06-19 11:10:43,909][26599] Updated weights for policy 0, policy_version 340964 (0.0041) [2024-06-19 11:10:45,948][26579] Signal inference workers to stop experience collection... (27300 times) [2024-06-19 11:10:45,949][26579] Signal inference workers to resume experience collection... 
(27300 times) [2024-06-19 11:10:45,991][26599] InferenceWorker_p0-w0: stopping experience collection (27300 times) [2024-06-19 11:10:45,991][26599] InferenceWorker_p0-w0: resuming experience collection (27300 times) [2024-06-19 11:10:47,448][26599] Updated weights for policy 0, policy_version 340974 (0.0040) [2024-06-19 11:10:48,380][26367] Fps is (10 sec: 40974.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5586534400. Throughput: 0: 42545.6. Samples: 1854114860. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:48,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 11:10:51,490][26599] Updated weights for policy 0, policy_version 340984 (0.0029) [2024-06-19 11:10:53,380][26367] Fps is (10 sec: 40960.7, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5586747392. Throughput: 0: 42429.7. Samples: 1854369660. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:53,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 11:10:55,190][26599] Updated weights for policy 0, policy_version 340994 (0.0026) [2024-06-19 11:10:58,380][26367] Fps is (10 sec: 45875.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5586993152. Throughput: 0: 42433.7. Samples: 1854618780. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:10:58,381][26367] Avg episode reward: [(0, '0.691')] [2024-06-19 11:10:59,238][26599] Updated weights for policy 0, policy_version 341004 (0.0033) [2024-06-19 11:11:03,110][26599] Updated weights for policy 0, policy_version 341014 (0.0041) [2024-06-19 11:11:03,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 5587173376. Throughput: 0: 42493.1. Samples: 1854752220. 
Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:11:03,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 11:11:07,073][26599] Updated weights for policy 0, policy_version 341024 (0.0042) [2024-06-19 11:11:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42431.8). Total num frames: 5587386368. Throughput: 0: 42356.7. Samples: 1855005680. Policy #0 lag: (min: 1.0, avg: 10.7, max: 23.0) [2024-06-19 11:11:08,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 11:11:10,749][26599] Updated weights for policy 0, policy_version 341034 (0.0042) [2024-06-19 11:11:13,380][26367] Fps is (10 sec: 44237.9, 60 sec: 42598.5, 300 sec: 42543.4). Total num frames: 5587615744. Throughput: 0: 42430.4. Samples: 1855257600. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:13,380][26367] Avg episode reward: [(0, '0.440')] [2024-06-19 11:11:14,552][26599] Updated weights for policy 0, policy_version 341044 (0.0044) [2024-06-19 11:11:18,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42052.3, 300 sec: 42376.5). Total num frames: 5587812352. Throughput: 0: 42341.8. Samples: 1855385100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:18,381][26367] Avg episode reward: [(0, '0.387')] [2024-06-19 11:11:18,679][26599] Updated weights for policy 0, policy_version 341054 (0.0028) [2024-06-19 11:11:21,994][26599] Updated weights for policy 0, policy_version 341064 (0.0037) [2024-06-19 11:11:23,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 5588025344. Throughput: 0: 42143.3. Samples: 1855635080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:23,381][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 11:11:26,292][26599] Updated weights for policy 0, policy_version 341074 (0.0034) [2024-06-19 11:11:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.5, 300 sec: 42542.9). Total num frames: 5588254720. Throughput: 0: 42521.0. Samples: 1855897460. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:28,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 11:11:30,053][26599] Updated weights for policy 0, policy_version 341084 (0.0031) [2024-06-19 11:11:33,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42487.4). Total num frames: 5588467712. Throughput: 0: 42367.3. Samples: 1856021380. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:33,380][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 11:11:33,850][26599] Updated weights for policy 0, policy_version 341094 (0.0034) [2024-06-19 11:11:37,653][26599] Updated weights for policy 0, policy_version 341104 (0.0035) [2024-06-19 11:11:38,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42601.0, 300 sec: 42542.9). Total num frames: 5588680704. Throughput: 0: 42474.2. Samples: 1856281000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:38,380][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 11:11:38,442][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000341107_5588697088.pth... [2024-06-19 11:11:38,500][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000340485_5578506240.pth [2024-06-19 11:11:41,605][26599] Updated weights for policy 0, policy_version 341114 (0.0026) [2024-06-19 11:11:43,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.5, 300 sec: 42487.3). Total num frames: 5588893696. Throughput: 0: 42693.8. Samples: 1856540000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:43,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 11:11:45,245][26599] Updated weights for policy 0, policy_version 341124 (0.0033) [2024-06-19 11:11:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42431.8). Total num frames: 5589090304. Throughput: 0: 42385.9. Samples: 1856659580. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:48,381][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 11:11:49,174][26599] Updated weights for policy 0, policy_version 341134 (0.0029) [2024-06-19 11:11:53,156][26599] Updated weights for policy 0, policy_version 341144 (0.0028) [2024-06-19 11:11:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5589303296. Throughput: 0: 42451.6. Samples: 1856916000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:53,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 11:11:57,066][26599] Updated weights for policy 0, policy_version 341154 (0.0031) [2024-06-19 11:11:58,384][26367] Fps is (10 sec: 40944.8, 60 sec: 41776.7, 300 sec: 42375.7). Total num frames: 5589499904. Throughput: 0: 42532.4. Samples: 1857171720. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:11:58,385][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 11:12:00,738][26599] Updated weights for policy 0, policy_version 341164 (0.0039) [2024-06-19 11:12:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.5, 300 sec: 42431.8). Total num frames: 5589712896. Throughput: 0: 42421.8. Samples: 1857294080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:12:03,381][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 11:12:05,044][26599] Updated weights for policy 0, policy_version 341174 (0.0041) [2024-06-19 11:12:08,354][26599] Updated weights for policy 0, policy_version 341184 (0.0030) [2024-06-19 11:12:08,384][26367] Fps is (10 sec: 45875.3, 60 sec: 42868.9, 300 sec: 42486.8). Total num frames: 5589958656. Throughput: 0: 42513.9. Samples: 1857548360. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:12:08,384][26367] Avg episode reward: [(0, '0.656')] [2024-06-19 11:12:12,824][26599] Updated weights for policy 0, policy_version 341194 (0.0025) [2024-06-19 11:12:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5590138880. Throughput: 0: 42390.7. Samples: 1857805040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:12:13,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 11:12:15,896][26599] Updated weights for policy 0, policy_version 341204 (0.0033) [2024-06-19 11:12:18,380][26367] Fps is (10 sec: 40974.8, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 5590368256. Throughput: 0: 42308.8. Samples: 1857925280. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:12:18,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 11:12:20,593][26599] Updated weights for policy 0, policy_version 341214 (0.0045) [2024-06-19 11:12:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.4, 300 sec: 42432.3). Total num frames: 5590581248. Throughput: 0: 42355.4. Samples: 1858187000. Policy #0 lag: (min: 0.0, avg: 9.3, max: 23.0) [2024-06-19 11:12:23,381][26367] Avg episode reward: [(0, '0.421')] [2024-06-19 11:12:23,565][26599] Updated weights for policy 0, policy_version 341224 (0.0035) [2024-06-19 11:12:27,794][26579] Signal inference workers to stop experience collection... (27350 times) [2024-06-19 11:12:27,832][26599] InferenceWorker_p0-w0: stopping experience collection (27350 times) [2024-06-19 11:12:27,849][26579] Signal inference workers to resume experience collection... (27350 times) [2024-06-19 11:12:27,852][26599] InferenceWorker_p0-w0: resuming experience collection (27350 times) [2024-06-19 11:12:28,164][26599] Updated weights for policy 0, policy_version 341234 (0.0046) [2024-06-19 11:12:28,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 5590777856. Throughput: 0: 42366.3. 
Samples: 1858446480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:12:28,380][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 11:12:31,303][26599] Updated weights for policy 0, policy_version 341244 (0.0037) [2024-06-19 11:12:33,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42052.3, 300 sec: 42376.3). Total num frames: 5590990848. Throughput: 0: 42357.9. Samples: 1858565680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:12:33,380][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 11:12:36,119][26599] Updated weights for policy 0, policy_version 341254 (0.0038) [2024-06-19 11:12:38,380][26367] Fps is (10 sec: 45874.4, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5591236608. Throughput: 0: 42481.3. Samples: 1858827660. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:12:38,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 11:12:38,833][26599] Updated weights for policy 0, policy_version 341264 (0.0031) [2024-06-19 11:12:43,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42052.2, 300 sec: 42376.2). Total num frames: 5591416832. Throughput: 0: 42598.0. Samples: 1859088480. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:12:43,381][26367] Avg episode reward: [(0, '0.518')] [2024-06-19 11:12:43,685][26599] Updated weights for policy 0, policy_version 341274 (0.0036) [2024-06-19 11:12:46,383][26599] Updated weights for policy 0, policy_version 341284 (0.0035) [2024-06-19 11:12:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5591646208. Throughput: 0: 42502.1. Samples: 1859206680. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:12:48,381][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 11:12:51,588][26599] Updated weights for policy 0, policy_version 341294 (0.0044) [2024-06-19 11:12:53,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5591875584. Throughput: 0: 42673.6. 
Samples: 1859468520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:12:53,383][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 11:12:54,430][26599] Updated weights for policy 0, policy_version 341304 (0.0031) [2024-06-19 11:12:58,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42600.9, 300 sec: 42376.2). Total num frames: 5592055808. Throughput: 0: 42583.9. Samples: 1859721320. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:12:58,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 11:12:59,063][26599] Updated weights for policy 0, policy_version 341314 (0.0034) [2024-06-19 11:13:01,996][26599] Updated weights for policy 0, policy_version 341324 (0.0033) [2024-06-19 11:13:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42431.8). Total num frames: 5592285184. Throughput: 0: 42703.6. Samples: 1859846940. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:13:03,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 11:13:06,543][26599] Updated weights for policy 0, policy_version 341334 (0.0036) [2024-06-19 11:13:08,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42327.9, 300 sec: 42487.3). Total num frames: 5592498176. Throughput: 0: 42797.3. Samples: 1860112880. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:13:08,381][26367] Avg episode reward: [(0, '0.685')] [2024-06-19 11:13:09,528][26599] Updated weights for policy 0, policy_version 341344 (0.0028) [2024-06-19 11:13:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5592711168. Throughput: 0: 42733.7. Samples: 1860369500. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:13:13,381][26367] Avg episode reward: [(0, '0.869')] [2024-06-19 11:13:14,029][26599] Updated weights for policy 0, policy_version 341354 (0.0048) [2024-06-19 11:13:17,527][26599] Updated weights for policy 0, policy_version 341364 (0.0029) [2024-06-19 11:13:18,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42871.6, 300 sec: 42542.9). Total num frames: 5592940544. Throughput: 0: 42793.3. Samples: 1860491380. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:13:18,380][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 11:13:21,943][26599] Updated weights for policy 0, policy_version 341374 (0.0043) [2024-06-19 11:13:23,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.5, 300 sec: 42543.4). Total num frames: 5593137152. Throughput: 0: 42682.4. Samples: 1860748360. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:13:23,380][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 11:13:25,074][26599] Updated weights for policy 0, policy_version 341384 (0.0022) [2024-06-19 11:13:28,380][26367] Fps is (10 sec: 39321.3, 60 sec: 42598.3, 300 sec: 42431.8). Total num frames: 5593333760. Throughput: 0: 42607.7. Samples: 1861005820. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:13:28,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 11:13:29,977][26599] Updated weights for policy 0, policy_version 341394 (0.0039) [2024-06-19 11:13:32,631][26599] Updated weights for policy 0, policy_version 341404 (0.0023) [2024-06-19 11:13:33,380][26367] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42487.3). Total num frames: 5593579520. Throughput: 0: 42917.3. Samples: 1861137960. 
Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:13:33,381][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 11:13:37,542][26599] Updated weights for policy 0, policy_version 341414 (0.0034) [2024-06-19 11:13:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 41779.2, 300 sec: 42376.2). Total num frames: 5593743360. Throughput: 0: 42866.7. Samples: 1861397520. Policy #0 lag: (min: 0.0, avg: 11.3, max: 22.0) [2024-06-19 11:13:38,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 11:13:38,395][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000341415_5593743360.pth... [2024-06-19 11:13:38,457][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000340795_5583585280.pth [2024-06-19 11:13:38,918][26579] Signal inference workers to stop experience collection... (27400 times) [2024-06-19 11:13:38,971][26599] InferenceWorker_p0-w0: stopping experience collection (27400 times) [2024-06-19 11:13:38,978][26579] Signal inference workers to resume experience collection... (27400 times) [2024-06-19 11:13:38,985][26599] InferenceWorker_p0-w0: resuming experience collection (27400 times) [2024-06-19 11:13:40,286][26599] Updated weights for policy 0, policy_version 341424 (0.0044) [2024-06-19 11:13:43,384][26367] Fps is (10 sec: 40945.7, 60 sec: 42869.0, 300 sec: 42542.3). Total num frames: 5593989120. Throughput: 0: 42678.0. Samples: 1861641980. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:13:43,384][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 11:13:45,258][26599] Updated weights for policy 0, policy_version 341434 (0.0030) [2024-06-19 11:13:48,217][26599] Updated weights for policy 0, policy_version 341444 (0.0026) [2024-06-19 11:13:48,380][26367] Fps is (10 sec: 47513.4, 60 sec: 42871.4, 300 sec: 42487.8). Total num frames: 5594218496. Throughput: 0: 42971.1. Samples: 1861780640. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:13:48,384][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 11:13:52,920][26599] Updated weights for policy 0, policy_version 341454 (0.0030) [2024-06-19 11:13:53,384][26367] Fps is (10 sec: 39321.3, 60 sec: 41776.7, 300 sec: 42375.7). Total num frames: 5594382336. Throughput: 0: 42754.8. Samples: 1862037000. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:13:53,385][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 11:13:55,714][26599] Updated weights for policy 0, policy_version 341464 (0.0040) [2024-06-19 11:13:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 43144.5, 300 sec: 42598.9). Total num frames: 5594644480. Throughput: 0: 42446.1. Samples: 1862279580. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:13:58,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 11:14:00,579][26599] Updated weights for policy 0, policy_version 341474 (0.0040) [2024-06-19 11:14:03,330][26599] Updated weights for policy 0, policy_version 341484 (0.0030) [2024-06-19 11:14:03,380][26367] Fps is (10 sec: 49170.0, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5594873856. Throughput: 0: 42977.2. Samples: 1862425360. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:03,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 11:14:08,283][26599] Updated weights for policy 0, policy_version 341494 (0.0035) [2024-06-19 11:14:08,380][26367] Fps is (10 sec: 39322.4, 60 sec: 42325.4, 300 sec: 42431.8). Total num frames: 5595037696. Throughput: 0: 42824.4. Samples: 1862675460. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:08,381][26367] Avg episode reward: [(0, '0.820')] [2024-06-19 11:14:11,195][26599] Updated weights for policy 0, policy_version 341504 (0.0027) [2024-06-19 11:14:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5595299840. Throughput: 0: 42561.3. Samples: 1862921080. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:13,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 11:14:15,922][26599] Updated weights for policy 0, policy_version 341514 (0.0034) [2024-06-19 11:14:18,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5595480064. Throughput: 0: 42754.3. Samples: 1863061900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:18,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 11:14:18,858][26599] Updated weights for policy 0, policy_version 341524 (0.0030) [2024-06-19 11:14:23,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5595676672. Throughput: 0: 42603.6. Samples: 1863314680. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:23,380][26367] Avg episode reward: [(0, '0.410')] [2024-06-19 11:14:23,454][26599] Updated weights for policy 0, policy_version 341534 (0.0028) [2024-06-19 11:14:26,481][26599] Updated weights for policy 0, policy_version 341544 (0.0033) [2024-06-19 11:14:28,380][26367] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42598.4). Total num frames: 5595938816. Throughput: 0: 42722.4. Samples: 1863564340. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:28,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 11:14:31,083][26599] Updated weights for policy 0, policy_version 341554 (0.0039) [2024-06-19 11:14:33,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5596119040. Throughput: 0: 42825.4. Samples: 1863707780. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:33,381][26367] Avg episode reward: [(0, '0.728')] [2024-06-19 11:14:34,276][26599] Updated weights for policy 0, policy_version 341564 (0.0037) [2024-06-19 11:14:38,380][26367] Fps is (10 sec: 39322.1, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5596332032. Throughput: 0: 42689.2. Samples: 1863957860. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:38,381][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 11:14:38,594][26599] Updated weights for policy 0, policy_version 341574 (0.0030) [2024-06-19 11:14:40,440][26579] Signal inference workers to stop experience collection... (27450 times) [2024-06-19 11:14:40,441][26579] Signal inference workers to resume experience collection... (27450 times) [2024-06-19 11:14:40,460][26599] InferenceWorker_p0-w0: stopping experience collection (27450 times) [2024-06-19 11:14:40,460][26599] InferenceWorker_p0-w0: resuming experience collection (27450 times) [2024-06-19 11:14:41,879][26599] Updated weights for policy 0, policy_version 341584 (0.0034) [2024-06-19 11:14:43,380][26367] Fps is (10 sec: 47513.6, 60 sec: 43420.3, 300 sec: 42709.5). Total num frames: 5596594176. Throughput: 0: 42958.4. Samples: 1864212700. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:43,380][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 11:14:46,232][26599] Updated weights for policy 0, policy_version 341594 (0.0033) [2024-06-19 11:14:48,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42595.9, 300 sec: 42597.9). Total num frames: 5596774400. Throughput: 0: 42764.1. Samples: 1864349900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:48,385][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 11:14:49,569][26599] Updated weights for policy 0, policy_version 341604 (0.0039) [2024-06-19 11:14:53,384][26367] Fps is (10 sec: 39307.1, 60 sec: 43417.6, 300 sec: 42653.4). Total num frames: 5596987392. Throughput: 0: 42781.0. Samples: 1864600760. 
Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:53,385][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 11:14:53,785][26599] Updated weights for policy 0, policy_version 341614 (0.0044) [2024-06-19 11:14:57,209][26599] Updated weights for policy 0, policy_version 341624 (0.0040) [2024-06-19 11:14:58,380][26367] Fps is (10 sec: 44252.5, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5597216768. Throughput: 0: 43062.6. Samples: 1864858900. Policy #0 lag: (min: 0.0, avg: 12.2, max: 22.0) [2024-06-19 11:14:58,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 11:15:01,258][26599] Updated weights for policy 0, policy_version 341634 (0.0039) [2024-06-19 11:15:03,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5597429760. Throughput: 0: 42860.0. Samples: 1864990600. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:03,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 11:15:04,763][26599] Updated weights for policy 0, policy_version 341644 (0.0033) [2024-06-19 11:15:08,380][26367] Fps is (10 sec: 42598.7, 60 sec: 43417.6, 300 sec: 42653.9). Total num frames: 5597642752. Throughput: 0: 42941.3. Samples: 1865247040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:08,381][26367] Avg episode reward: [(0, '0.798')] [2024-06-19 11:15:08,709][26599] Updated weights for policy 0, policy_version 341654 (0.0037) [2024-06-19 11:15:12,384][26599] Updated weights for policy 0, policy_version 341664 (0.0044) [2024-06-19 11:15:13,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5597839360. Throughput: 0: 43067.3. Samples: 1865502360. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:13,381][26367] Avg episode reward: [(0, '0.856')] [2024-06-19 11:15:16,235][26599] Updated weights for policy 0, policy_version 341674 (0.0046) [2024-06-19 11:15:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5598052352. Throughput: 0: 42632.3. Samples: 1865626240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:18,381][26367] Avg episode reward: [(0, '0.737')] [2024-06-19 11:15:20,019][26599] Updated weights for policy 0, policy_version 341684 (0.0031) [2024-06-19 11:15:23,380][26367] Fps is (10 sec: 45875.1, 60 sec: 43690.6, 300 sec: 42765.0). Total num frames: 5598298112. Throughput: 0: 42780.4. Samples: 1865882980. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:23,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 11:15:24,043][26599] Updated weights for policy 0, policy_version 341694 (0.0042) [2024-06-19 11:15:27,568][26599] Updated weights for policy 0, policy_version 341704 (0.0037) [2024-06-19 11:15:28,380][26367] Fps is (10 sec: 44237.8, 60 sec: 42598.6, 300 sec: 42653.9). Total num frames: 5598494720. Throughput: 0: 42988.5. Samples: 1866147180. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:28,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 11:15:31,725][26599] Updated weights for policy 0, policy_version 341714 (0.0035) [2024-06-19 11:15:33,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42871.5, 300 sec: 42598.9). Total num frames: 5598691328. Throughput: 0: 42752.9. Samples: 1866273620. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:33,380][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 11:15:35,188][26599] Updated weights for policy 0, policy_version 341724 (0.0038) [2024-06-19 11:15:38,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 5598937088. Throughput: 0: 42952.8. Samples: 1866533480. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:38,381][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 11:15:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000341732_5598937088.pth... [2024-06-19 11:15:38,453][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000341107_5588697088.pth [2024-06-19 11:15:39,122][26599] Updated weights for policy 0, policy_version 341734 (0.0029) [2024-06-19 11:15:42,856][26599] Updated weights for policy 0, policy_version 341744 (0.0040) [2024-06-19 11:15:43,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5599133696. Throughput: 0: 42871.2. Samples: 1866788100. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:43,381][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 11:15:47,425][26599] Updated weights for policy 0, policy_version 341754 (0.0053) [2024-06-19 11:15:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42874.1, 300 sec: 42709.5). Total num frames: 5599346688. Throughput: 0: 42809.8. Samples: 1866917040. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:48,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 11:15:50,993][26599] Updated weights for policy 0, policy_version 341764 (0.0037) [2024-06-19 11:15:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 43147.2, 300 sec: 42654.0). Total num frames: 5599576064. Throughput: 0: 42826.7. Samples: 1867174240. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:53,380][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 11:15:55,081][26599] Updated weights for policy 0, policy_version 341774 (0.0034) [2024-06-19 11:15:58,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5599772672. Throughput: 0: 42955.0. Samples: 1867435340. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:15:58,381][26367] Avg episode reward: [(0, '0.767')] [2024-06-19 11:15:58,467][26599] Updated weights for policy 0, policy_version 341784 (0.0034) [2024-06-19 11:16:00,198][26579] Signal inference workers to stop experience collection... (27500 times) [2024-06-19 11:16:00,198][26579] Signal inference workers to resume experience collection... (27500 times) [2024-06-19 11:16:00,243][26599] InferenceWorker_p0-w0: stopping experience collection (27500 times) [2024-06-19 11:16:00,243][26599] InferenceWorker_p0-w0: resuming experience collection (27500 times) [2024-06-19 11:16:02,490][26599] Updated weights for policy 0, policy_version 341794 (0.0033) [2024-06-19 11:16:03,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5599969280. Throughput: 0: 42965.5. Samples: 1867559680. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:16:03,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 11:16:05,794][26599] Updated weights for policy 0, policy_version 341804 (0.0036) [2024-06-19 11:16:08,380][26367] Fps is (10 sec: 45875.5, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5600231424. Throughput: 0: 43130.7. Samples: 1867823860. Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:16:08,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 11:16:10,049][26599] Updated weights for policy 0, policy_version 341814 (0.0030) [2024-06-19 11:16:13,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5600411648. Throughput: 0: 42926.0. Samples: 1868078860. 
Policy #0 lag: (min: 1.0, avg: 11.1, max: 21.0) [2024-06-19 11:16:13,383][26367] Avg episode reward: [(0, '0.726')] [2024-06-19 11:16:13,890][26599] Updated weights for policy 0, policy_version 341824 (0.0033) [2024-06-19 11:16:17,554][26599] Updated weights for policy 0, policy_version 341834 (0.0032) [2024-06-19 11:16:18,384][26367] Fps is (10 sec: 39307.3, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 5600624640. Throughput: 0: 42914.7. Samples: 1868204940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:16:18,385][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 11:16:21,360][26599] Updated weights for policy 0, policy_version 341844 (0.0040) [2024-06-19 11:16:23,380][26367] Fps is (10 sec: 45875.5, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5600870400. Throughput: 0: 42847.5. Samples: 1868461620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:16:23,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 11:16:25,390][26599] Updated weights for policy 0, policy_version 341854 (0.0047) [2024-06-19 11:16:28,380][26367] Fps is (10 sec: 44253.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5601067008. Throughput: 0: 42887.1. Samples: 1868718020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:16:28,381][26367] Avg episode reward: [(0, '0.649')] [2024-06-19 11:16:28,846][26599] Updated weights for policy 0, policy_version 341864 (0.0031) [2024-06-19 11:16:33,001][26599] Updated weights for policy 0, policy_version 341874 (0.0044) [2024-06-19 11:16:33,384][26367] Fps is (10 sec: 40945.1, 60 sec: 43141.9, 300 sec: 42708.9). Total num frames: 5601280000. Throughput: 0: 42737.9. Samples: 1868840400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:16:33,384][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 11:16:36,440][26599] Updated weights for policy 0, policy_version 341884 (0.0035) [2024-06-19 11:16:38,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5601509376. Throughput: 0: 42900.0. Samples: 1869104740. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:16:38,381][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 11:16:40,518][26599] Updated weights for policy 0, policy_version 341894 (0.0038) [2024-06-19 11:16:43,380][26367] Fps is (10 sec: 42613.8, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5601705984. Throughput: 0: 42749.0. Samples: 1869359040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:16:43,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 11:16:44,360][26599] Updated weights for policy 0, policy_version 341904 (0.0028) [2024-06-19 11:16:48,013][26599] Updated weights for policy 0, policy_version 341914 (0.0027) [2024-06-19 11:16:48,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5601918976. Throughput: 0: 42647.4. Samples: 1869478820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:16:48,381][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 11:16:51,976][26599] Updated weights for policy 0, policy_version 341924 (0.0045) [2024-06-19 11:16:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42821.1). Total num frames: 5602131968. Throughput: 0: 42531.5. Samples: 1869737780. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:16:53,381][26367] Avg episode reward: [(0, '0.396')] [2024-06-19 11:16:55,660][26599] Updated weights for policy 0, policy_version 341934 (0.0047) [2024-06-19 11:16:58,380][26367] Fps is (10 sec: 42599.5, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5602344960. Throughput: 0: 42622.0. Samples: 1869996840. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:16:58,380][26367] Avg episode reward: [(0, '0.482')] [2024-06-19 11:16:59,637][26599] Updated weights for policy 0, policy_version 341944 (0.0036) [2024-06-19 11:17:02,804][26579] Signal inference workers to stop experience collection... (27550 times) [2024-06-19 11:17:02,816][26579] Signal inference workers to resume experience collection... (27550 times) [2024-06-19 11:17:02,819][26599] InferenceWorker_p0-w0: stopping experience collection (27550 times) [2024-06-19 11:17:02,832][26599] InferenceWorker_p0-w0: resuming experience collection (27550 times) [2024-06-19 11:17:03,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42710.0). Total num frames: 5602557952. Throughput: 0: 42598.6. Samples: 1870121720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:17:03,381][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 11:17:03,975][26599] Updated weights for policy 0, policy_version 341954 (0.0040) [2024-06-19 11:17:07,373][26599] Updated weights for policy 0, policy_version 341964 (0.0025) [2024-06-19 11:17:08,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5602787328. Throughput: 0: 42629.3. Samples: 1870379940. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:17:08,381][26367] Avg episode reward: [(0, '0.352')] [2024-06-19 11:17:11,677][26599] Updated weights for policy 0, policy_version 341974 (0.0030) [2024-06-19 11:17:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5602967552. Throughput: 0: 42719.5. Samples: 1870640400. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:17:13,381][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 11:17:14,973][26599] Updated weights for policy 0, policy_version 341984 (0.0038) [2024-06-19 11:17:18,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5603180544. 
Throughput: 0: 42690.5. Samples: 1870761320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:17:18,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 11:17:19,061][26599] Updated weights for policy 0, policy_version 341994 (0.0029) [2024-06-19 11:17:22,604][26599] Updated weights for policy 0, policy_version 342004 (0.0026) [2024-06-19 11:17:23,382][26367] Fps is (10 sec: 47506.3, 60 sec: 42870.4, 300 sec: 42931.4). Total num frames: 5603442688. Throughput: 0: 42705.6. Samples: 1871026560. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:17:23,382][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 11:17:26,889][26599] Updated weights for policy 0, policy_version 342014 (0.0049) [2024-06-19 11:17:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5603622912. Throughput: 0: 42623.5. Samples: 1871277100. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:17:28,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 11:17:30,229][26599] Updated weights for policy 0, policy_version 342024 (0.0057) [2024-06-19 11:17:33,382][26367] Fps is (10 sec: 37681.5, 60 sec: 42326.5, 300 sec: 42653.7). Total num frames: 5603819520. Throughput: 0: 42757.3. Samples: 1871402980. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:17:33,383][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 11:17:34,528][26599] Updated weights for policy 0, policy_version 342034 (0.0040) [2024-06-19 11:17:37,859][26599] Updated weights for policy 0, policy_version 342044 (0.0037) [2024-06-19 11:17:38,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5604065280. Throughput: 0: 42749.3. Samples: 1871661500. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:17:38,381][26367] Avg episode reward: [(0, '0.757')] [2024-06-19 11:17:38,395][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000342045_5604065280.pth... 
[2024-06-19 11:17:38,453][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000341415_5593743360.pth [2024-06-19 11:17:42,190][26599] Updated weights for policy 0, policy_version 342054 (0.0035) [2024-06-19 11:17:43,380][26367] Fps is (10 sec: 44245.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5604261888. Throughput: 0: 42706.5. Samples: 1871918640. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:17:43,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 11:17:45,540][26599] Updated weights for policy 0, policy_version 342064 (0.0032) [2024-06-19 11:17:48,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5604474880. Throughput: 0: 42739.4. Samples: 1872045000. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:17:48,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 11:17:49,848][26599] Updated weights for policy 0, policy_version 342074 (0.0036) [2024-06-19 11:17:53,103][26599] Updated weights for policy 0, policy_version 342084 (0.0025) [2024-06-19 11:17:53,380][26367] Fps is (10 sec: 44237.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5604704256. Throughput: 0: 42798.4. Samples: 1872305860. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:17:53,380][26367] Avg episode reward: [(0, '0.468')] [2024-06-19 11:17:57,427][26599] Updated weights for policy 0, policy_version 342094 (0.0031) [2024-06-19 11:17:58,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5604917248. Throughput: 0: 42906.3. Samples: 1872571180. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:17:58,380][26367] Avg episode reward: [(0, '0.437')] [2024-06-19 11:18:00,597][26599] Updated weights for policy 0, policy_version 342104 (0.0033) [2024-06-19 11:18:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5605130240. Throughput: 0: 42976.1. Samples: 1872695240. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:18:03,380][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 11:18:04,917][26599] Updated weights for policy 0, policy_version 342114 (0.0030) [2024-06-19 11:18:08,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5605343232. Throughput: 0: 42724.5. Samples: 1872949100. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:18:08,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 11:18:08,701][26599] Updated weights for policy 0, policy_version 342124 (0.0038) [2024-06-19 11:18:12,500][26599] Updated weights for policy 0, policy_version 342134 (0.0042) [2024-06-19 11:18:13,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5605556224. Throughput: 0: 42968.2. Samples: 1873210660. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:18:13,380][26367] Avg episode reward: [(0, '0.607')] [2024-06-19 11:18:16,214][26599] Updated weights for policy 0, policy_version 342144 (0.0045) [2024-06-19 11:18:18,380][26367] Fps is (10 sec: 44237.3, 60 sec: 43417.6, 300 sec: 42876.1). Total num frames: 5605785600. Throughput: 0: 43060.6. Samples: 1873340620. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:18:18,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 11:18:19,837][26579] Signal inference workers to stop experience collection... (27600 times) [2024-06-19 11:18:19,846][26599] InferenceWorker_p0-w0: stopping experience collection (27600 times) [2024-06-19 11:18:19,950][26579] Signal inference workers to resume experience collection... (27600 times) [2024-06-19 11:18:19,951][26599] InferenceWorker_p0-w0: resuming experience collection (27600 times) [2024-06-19 11:18:20,081][26599] Updated weights for policy 0, policy_version 342154 (0.0045) [2024-06-19 11:18:23,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42326.4, 300 sec: 42876.1). Total num frames: 5605982208. 
Throughput: 0: 42840.8. Samples: 1873589340. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:18:23,381][26367] Avg episode reward: [(0, '0.779')] [2024-06-19 11:18:24,105][26599] Updated weights for policy 0, policy_version 342164 (0.0044) [2024-06-19 11:18:27,666][26599] Updated weights for policy 0, policy_version 342174 (0.0025) [2024-06-19 11:18:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5606195200. Throughput: 0: 43080.9. Samples: 1873857280. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:18:28,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 11:18:31,879][26599] Updated weights for policy 0, policy_version 342184 (0.0032) [2024-06-19 11:18:33,380][26367] Fps is (10 sec: 44237.4, 60 sec: 43419.1, 300 sec: 42987.2). Total num frames: 5606424576. Throughput: 0: 43191.2. Samples: 1873988600. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:18:33,381][26367] Avg episode reward: [(0, '0.555')] [2024-06-19 11:18:35,241][26599] Updated weights for policy 0, policy_version 342194 (0.0042) [2024-06-19 11:18:38,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.5, 300 sec: 42876.6). Total num frames: 5606637568. Throughput: 0: 42985.2. Samples: 1874240200. Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:18:38,381][26367] Avg episode reward: [(0, '0.798')] [2024-06-19 11:18:39,491][26599] Updated weights for policy 0, policy_version 342204 (0.0039) [2024-06-19 11:18:42,850][26599] Updated weights for policy 0, policy_version 342214 (0.0028) [2024-06-19 11:18:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5606850560. Throughput: 0: 42831.9. Samples: 1874498620. 
Policy #0 lag: (min: 0.0, avg: 11.7, max: 22.0) [2024-06-19 11:18:43,394][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 11:18:47,169][26599] Updated weights for policy 0, policy_version 342224 (0.0037) [2024-06-19 11:18:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43144.6, 300 sec: 42987.7). Total num frames: 5607063552. Throughput: 0: 42821.2. Samples: 1874622200. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:18:48,381][26367] Avg episode reward: [(0, '0.542')] [2024-06-19 11:18:50,427][26599] Updated weights for policy 0, policy_version 342234 (0.0031) [2024-06-19 11:18:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5607292928. Throughput: 0: 43001.0. Samples: 1874884140. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:18:53,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 11:18:54,603][26599] Updated weights for policy 0, policy_version 342244 (0.0033) [2024-06-19 11:18:58,229][26599] Updated weights for policy 0, policy_version 342254 (0.0034) [2024-06-19 11:18:58,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5607489536. Throughput: 0: 42836.4. Samples: 1875138300. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:18:58,381][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 11:19:02,237][26599] Updated weights for policy 0, policy_version 342264 (0.0031) [2024-06-19 11:19:03,380][26367] Fps is (10 sec: 39320.8, 60 sec: 42598.2, 300 sec: 42876.1). Total num frames: 5607686144. Throughput: 0: 42671.4. Samples: 1875260840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:03,382][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 11:19:06,099][26599] Updated weights for policy 0, policy_version 342274 (0.0035) [2024-06-19 11:19:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5607915520. Throughput: 0: 42901.9. Samples: 1875519920. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:08,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 11:19:10,147][26599] Updated weights for policy 0, policy_version 342284 (0.0038) [2024-06-19 11:19:13,380][26367] Fps is (10 sec: 44238.0, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5608128512. Throughput: 0: 42583.7. Samples: 1875773540. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:13,381][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 11:19:13,682][26599] Updated weights for policy 0, policy_version 342294 (0.0038) [2024-06-19 11:19:17,718][26599] Updated weights for policy 0, policy_version 342304 (0.0035) [2024-06-19 11:19:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5608325120. Throughput: 0: 42424.9. Samples: 1875897720. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:18,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 11:19:21,480][26599] Updated weights for policy 0, policy_version 342314 (0.0027) [2024-06-19 11:19:23,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5608538112. Throughput: 0: 42510.3. Samples: 1876153160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:23,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 11:19:25,729][26599] Updated weights for policy 0, policy_version 342324 (0.0035) [2024-06-19 11:19:28,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5608767488. Throughput: 0: 42530.2. Samples: 1876412480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:28,381][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 11:19:29,024][26599] Updated weights for policy 0, policy_version 342334 (0.0036) [2024-06-19 11:19:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42052.3, 300 sec: 42765.0). Total num frames: 5608947712. Throughput: 0: 42592.6. Samples: 1876538860. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:33,380][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 11:19:33,555][26599] Updated weights for policy 0, policy_version 342344 (0.0049) [2024-06-19 11:19:36,329][26579] Signal inference workers to stop experience collection... (27650 times) [2024-06-19 11:19:36,378][26599] InferenceWorker_p0-w0: stopping experience collection (27650 times) [2024-06-19 11:19:36,445][26579] Signal inference workers to resume experience collection... (27650 times) [2024-06-19 11:19:36,445][26599] InferenceWorker_p0-w0: resuming experience collection (27650 times) [2024-06-19 11:19:36,576][26599] Updated weights for policy 0, policy_version 342354 (0.0022) [2024-06-19 11:19:38,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5609177088. Throughput: 0: 42337.3. Samples: 1876789320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:38,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 11:19:38,430][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000342358_5609193472.pth... [2024-06-19 11:19:38,481][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000341732_5598937088.pth [2024-06-19 11:19:41,258][26599] Updated weights for policy 0, policy_version 342364 (0.0034) [2024-06-19 11:19:43,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.4, 300 sec: 42765.6). Total num frames: 5609390080. Throughput: 0: 42346.2. Samples: 1877043880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:43,381][26367] Avg episode reward: [(0, '0.736')] [2024-06-19 11:19:44,293][26599] Updated weights for policy 0, policy_version 342374 (0.0034) [2024-06-19 11:19:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.3, 300 sec: 42710.0). Total num frames: 5609586688. Throughput: 0: 42479.2. Samples: 1877172400. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:48,381][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 11:19:48,846][26599] Updated weights for policy 0, policy_version 342384 (0.0044) [2024-06-19 11:19:51,874][26599] Updated weights for policy 0, policy_version 342394 (0.0050) [2024-06-19 11:19:53,380][26367] Fps is (10 sec: 40960.0, 60 sec: 41779.2, 300 sec: 42654.0). Total num frames: 5609799680. Throughput: 0: 42176.1. Samples: 1877417840. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:53,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 11:19:56,438][26599] Updated weights for policy 0, policy_version 342404 (0.0032) [2024-06-19 11:19:58,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5610029056. Throughput: 0: 42406.1. Samples: 1877681820. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:19:58,381][26367] Avg episode reward: [(0, '0.863')] [2024-06-19 11:19:59,415][26599] Updated weights for policy 0, policy_version 342414 (0.0033) [2024-06-19 11:20:03,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5610242048. Throughput: 0: 42439.0. Samples: 1877807480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:20:03,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 11:20:04,485][26599] Updated weights for policy 0, policy_version 342424 (0.0047) [2024-06-19 11:20:07,601][26599] Updated weights for policy 0, policy_version 342434 (0.0038) [2024-06-19 11:20:08,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5610455040. Throughput: 0: 42350.7. Samples: 1878058940. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:08,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 11:20:12,193][26599] Updated weights for policy 0, policy_version 342444 (0.0048) [2024-06-19 11:20:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5610668032. Throughput: 0: 42432.0. Samples: 1878321920. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:13,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 11:20:15,308][26599] Updated weights for policy 0, policy_version 342454 (0.0035) [2024-06-19 11:20:18,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5610881024. Throughput: 0: 42413.2. Samples: 1878447460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:18,381][26367] Avg episode reward: [(0, '0.765')] [2024-06-19 11:20:19,860][26599] Updated weights for policy 0, policy_version 342464 (0.0029) [2024-06-19 11:20:22,918][26599] Updated weights for policy 0, policy_version 342474 (0.0033) [2024-06-19 11:20:23,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5611110400. Throughput: 0: 42550.2. Samples: 1878704080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:23,384][26367] Avg episode reward: [(0, '0.737')] [2024-06-19 11:20:27,432][26599] Updated weights for policy 0, policy_version 342484 (0.0040) [2024-06-19 11:20:28,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5611307008. Throughput: 0: 42639.1. Samples: 1878962640. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:28,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 11:20:30,607][26599] Updated weights for policy 0, policy_version 342494 (0.0033) [2024-06-19 11:20:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5611520000. Throughput: 0: 42507.6. Samples: 1879085240. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:33,381][26367] Avg episode reward: [(0, '0.781')] [2024-06-19 11:20:34,921][26599] Updated weights for policy 0, policy_version 342504 (0.0047) [2024-06-19 11:20:37,983][26599] Updated weights for policy 0, policy_version 342514 (0.0030) [2024-06-19 11:20:38,384][26367] Fps is (10 sec: 45858.1, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 5611765760. Throughput: 0: 42882.7. Samples: 1879347720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:38,385][26367] Avg episode reward: [(0, '0.505')] [2024-06-19 11:20:42,500][26599] Updated weights for policy 0, policy_version 342524 (0.0033) [2024-06-19 11:20:43,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5611945984. Throughput: 0: 42711.9. Samples: 1879603860. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:43,381][26367] Avg episode reward: [(0, '0.548')] [2024-06-19 11:20:46,190][26599] Updated weights for policy 0, policy_version 342534 (0.0036) [2024-06-19 11:20:48,380][26367] Fps is (10 sec: 39336.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5612158976. Throughput: 0: 42697.8. Samples: 1879728880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:48,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 11:20:50,384][26599] Updated weights for policy 0, policy_version 342544 (0.0044) [2024-06-19 11:20:53,380][26367] Fps is (10 sec: 44237.6, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5612388352. Throughput: 0: 42753.3. Samples: 1879982840. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:53,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 11:20:53,787][26599] Updated weights for policy 0, policy_version 342554 (0.0047) [2024-06-19 11:20:57,998][26599] Updated weights for policy 0, policy_version 342564 (0.0036) [2024-06-19 11:20:58,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5612584960. Throughput: 0: 42783.2. Samples: 1880247160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:20:58,380][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 11:21:01,363][26599] Updated weights for policy 0, policy_version 342574 (0.0028) [2024-06-19 11:21:03,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5612781568. Throughput: 0: 42646.6. Samples: 1880366560. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:21:03,384][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 11:21:05,312][26579] Signal inference workers to stop experience collection... (27700 times) [2024-06-19 11:21:05,313][26579] Signal inference workers to resume experience collection... (27700 times) [2024-06-19 11:21:05,330][26599] InferenceWorker_p0-w0: stopping experience collection (27700 times) [2024-06-19 11:21:05,330][26599] InferenceWorker_p0-w0: resuming experience collection (27700 times) [2024-06-19 11:21:05,626][26599] Updated weights for policy 0, policy_version 342584 (0.0031) [2024-06-19 11:21:08,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5613027328. Throughput: 0: 42661.4. Samples: 1880623840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:21:08,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 11:21:08,866][26599] Updated weights for policy 0, policy_version 342594 (0.0050) [2024-06-19 11:21:13,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.4, 300 sec: 42654.5). Total num frames: 5613207552. 
Throughput: 0: 42684.4. Samples: 1880883440. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:21:13,380][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 11:21:13,403][26599] Updated weights for policy 0, policy_version 342604 (0.0025) [2024-06-19 11:21:16,677][26599] Updated weights for policy 0, policy_version 342614 (0.0038) [2024-06-19 11:21:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5613436928. Throughput: 0: 42594.2. Samples: 1881001980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:21:18,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 11:21:20,923][26599] Updated weights for policy 0, policy_version 342624 (0.0040) [2024-06-19 11:21:23,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5613666304. Throughput: 0: 42527.8. Samples: 1881261320. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:21:23,381][26367] Avg episode reward: [(0, '0.585')] [2024-06-19 11:21:24,390][26599] Updated weights for policy 0, policy_version 342634 (0.0026) [2024-06-19 11:21:28,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42325.2, 300 sec: 42598.9). Total num frames: 5613846528. Throughput: 0: 42630.3. Samples: 1881522220. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:21:28,384][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 11:21:28,697][26599] Updated weights for policy 0, policy_version 342644 (0.0043) [2024-06-19 11:21:32,147][26599] Updated weights for policy 0, policy_version 342654 (0.0027) [2024-06-19 11:21:33,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5614075904. Throughput: 0: 42562.8. Samples: 1881644200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:21:33,380][26367] Avg episode reward: [(0, '0.452')] [2024-06-19 11:21:36,190][26599] Updated weights for policy 0, policy_version 342664 (0.0035) [2024-06-19 11:21:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42054.8, 300 sec: 42653.9). Total num frames: 5614288896. Throughput: 0: 42576.4. Samples: 1881898780. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:21:38,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 11:21:38,406][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000342670_5614305280.pth... [2024-06-19 11:21:38,460][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000342045_5604065280.pth [2024-06-19 11:21:39,779][26599] Updated weights for policy 0, policy_version 342674 (0.0037) [2024-06-19 11:21:43,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 5614469120. Throughput: 0: 42542.6. Samples: 1882161580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:21:43,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 11:21:43,917][26599] Updated weights for policy 0, policy_version 342684 (0.0031) [2024-06-19 11:21:47,442][26599] Updated weights for policy 0, policy_version 342694 (0.0032) [2024-06-19 11:21:48,384][26367] Fps is (10 sec: 42583.2, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5614714880. Throughput: 0: 42538.0. Samples: 1882280920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:21:48,384][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 11:21:51,668][26599] Updated weights for policy 0, policy_version 342704 (0.0048) [2024-06-19 11:21:53,380][26367] Fps is (10 sec: 47514.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5614944256. Throughput: 0: 42453.8. Samples: 1882534260. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:21:53,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 11:21:55,199][26599] Updated weights for policy 0, policy_version 342714 (0.0029) [2024-06-19 11:21:58,380][26367] Fps is (10 sec: 40974.5, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5615124480. Throughput: 0: 42392.3. Samples: 1882791100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:21:58,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 11:21:59,261][26599] Updated weights for policy 0, policy_version 342724 (0.0024) [2024-06-19 11:22:02,918][26599] Updated weights for policy 0, policy_version 342734 (0.0039) [2024-06-19 11:22:03,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5615353856. Throughput: 0: 42550.2. Samples: 1882916740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:22:03,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 11:22:06,959][26599] Updated weights for policy 0, policy_version 342744 (0.0044) [2024-06-19 11:22:08,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42653.9). Total num frames: 5615550464. Throughput: 0: 42472.4. Samples: 1883172580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:22:08,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 11:22:10,674][26599] Updated weights for policy 0, policy_version 342754 (0.0037) [2024-06-19 11:22:13,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5615747072. Throughput: 0: 42426.2. Samples: 1883431400. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:22:13,383][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 11:22:14,897][26599] Updated weights for policy 0, policy_version 342764 (0.0037) [2024-06-19 11:22:18,312][26599] Updated weights for policy 0, policy_version 342774 (0.0033) [2024-06-19 11:22:18,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42871.4, 300 sec: 42598.6). Total num frames: 5616009216. Throughput: 0: 42611.4. Samples: 1883561720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:22:18,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 11:22:22,564][26599] Updated weights for policy 0, policy_version 342784 (0.0038) [2024-06-19 11:22:23,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5616189440. Throughput: 0: 42706.8. Samples: 1883820580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:22:23,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 11:22:24,782][26579] Signal inference workers to stop experience collection... (27750 times) [2024-06-19 11:22:24,783][26579] Signal inference workers to resume experience collection... (27750 times) [2024-06-19 11:22:24,795][26599] InferenceWorker_p0-w0: stopping experience collection (27750 times) [2024-06-19 11:22:24,795][26599] InferenceWorker_p0-w0: resuming experience collection (27750 times) [2024-06-19 11:22:26,073][26599] Updated weights for policy 0, policy_version 342794 (0.0043) [2024-06-19 11:22:28,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42598.5, 300 sec: 42654.2). Total num frames: 5616402432. Throughput: 0: 42518.8. Samples: 1884074920. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:22:28,381][26367] Avg episode reward: [(0, '0.641')] [2024-06-19 11:22:30,242][26599] Updated weights for policy 0, policy_version 342804 (0.0034) [2024-06-19 11:22:33,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5616648192. Throughput: 0: 42767.0. 
Samples: 1884205280. Policy #0 lag: (min: 0.0, avg: 9.6, max: 22.0) [2024-06-19 11:22:33,381][26367] Avg episode reward: [(0, '0.724')] [2024-06-19 11:22:33,662][26599] Updated weights for policy 0, policy_version 342814 (0.0031) [2024-06-19 11:22:37,855][26599] Updated weights for policy 0, policy_version 342824 (0.0045) [2024-06-19 11:22:38,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5616828416. Throughput: 0: 42814.2. Samples: 1884460900. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:22:38,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 11:22:41,033][26599] Updated weights for policy 0, policy_version 342834 (0.0029) [2024-06-19 11:22:43,380][26367] Fps is (10 sec: 40959.8, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5617057792. Throughput: 0: 42865.8. Samples: 1884720060. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:22:43,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 11:22:45,454][26599] Updated weights for policy 0, policy_version 342844 (0.0027) [2024-06-19 11:22:48,380][26367] Fps is (10 sec: 47514.0, 60 sec: 43147.2, 300 sec: 42709.5). Total num frames: 5617303552. Throughput: 0: 42982.3. Samples: 1884850940. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:22:48,380][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 11:22:48,457][26599] Updated weights for policy 0, policy_version 342854 (0.0022) [2024-06-19 11:22:53,118][26599] Updated weights for policy 0, policy_version 342864 (0.0029) [2024-06-19 11:22:53,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5617483776. Throughput: 0: 42892.5. Samples: 1885102740. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:22:53,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 11:22:56,664][26599] Updated weights for policy 0, policy_version 342874 (0.0039) [2024-06-19 11:22:58,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5617696768. Throughput: 0: 42893.0. Samples: 1885361580. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:22:58,381][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 11:23:00,750][26599] Updated weights for policy 0, policy_version 342884 (0.0033) [2024-06-19 11:23:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5617909760. Throughput: 0: 42794.7. Samples: 1885487480. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:03,381][26367] Avg episode reward: [(0, '0.786')] [2024-06-19 11:23:04,291][26599] Updated weights for policy 0, policy_version 342894 (0.0031) [2024-06-19 11:23:08,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5618122752. Throughput: 0: 42752.4. Samples: 1885744440. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:08,381][26367] Avg episode reward: [(0, '0.848')] [2024-06-19 11:23:08,402][26599] Updated weights for policy 0, policy_version 342904 (0.0036) [2024-06-19 11:23:12,036][26599] Updated weights for policy 0, policy_version 342914 (0.0045) [2024-06-19 11:23:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42598.4). Total num frames: 5618352128. Throughput: 0: 42820.3. Samples: 1886001840. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:13,381][26367] Avg episode reward: [(0, '0.775')] [2024-06-19 11:23:16,043][26599] Updated weights for policy 0, policy_version 342924 (0.0032) [2024-06-19 11:23:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5618548736. Throughput: 0: 42685.2. Samples: 1886126120. 
Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:18,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 11:23:19,746][26599] Updated weights for policy 0, policy_version 342934 (0.0034) [2024-06-19 11:23:23,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.3, 300 sec: 42598.4). Total num frames: 5618761728. Throughput: 0: 42578.0. Samples: 1886376920. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:23,381][26367] Avg episode reward: [(0, '0.382')] [2024-06-19 11:23:23,697][26599] Updated weights for policy 0, policy_version 342944 (0.0043) [2024-06-19 11:23:27,013][26579] Signal inference workers to stop experience collection... (27800 times) [2024-06-19 11:23:27,072][26599] InferenceWorker_p0-w0: stopping experience collection (27800 times) [2024-06-19 11:23:27,133][26579] Signal inference workers to resume experience collection... (27800 times) [2024-06-19 11:23:27,133][26599] InferenceWorker_p0-w0: resuming experience collection (27800 times) [2024-06-19 11:23:27,278][26599] Updated weights for policy 0, policy_version 342954 (0.0029) [2024-06-19 11:23:28,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43144.4, 300 sec: 42598.4). Total num frames: 5618991104. Throughput: 0: 42700.9. Samples: 1886641600. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:28,381][26367] Avg episode reward: [(0, '0.374')] [2024-06-19 11:23:31,831][26599] Updated weights for policy 0, policy_version 342964 (0.0036) [2024-06-19 11:23:33,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5619204096. Throughput: 0: 42565.2. Samples: 1886766380. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:33,381][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 11:23:35,029][26599] Updated weights for policy 0, policy_version 342974 (0.0024) [2024-06-19 11:23:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5619417088. 
Throughput: 0: 42525.8. Samples: 1887016400. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:38,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 11:23:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000342982_5619417088.pth... [2024-06-19 11:23:38,442][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000342358_5609193472.pth [2024-06-19 11:23:39,519][26599] Updated weights for policy 0, policy_version 342984 (0.0029) [2024-06-19 11:23:43,015][26599] Updated weights for policy 0, policy_version 342994 (0.0041) [2024-06-19 11:23:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5619630080. Throughput: 0: 42539.9. Samples: 1887275880. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:43,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 11:23:47,159][26599] Updated weights for policy 0, policy_version 343004 (0.0042) [2024-06-19 11:23:48,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42542.8). Total num frames: 5619843072. Throughput: 0: 42499.9. Samples: 1887399980. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:48,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 11:23:50,756][26599] Updated weights for policy 0, policy_version 343014 (0.0031) [2024-06-19 11:23:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5620072448. Throughput: 0: 42551.1. Samples: 1887659240. Policy #0 lag: (min: 0.0, avg: 10.2, max: 22.0) [2024-06-19 11:23:53,381][26367] Avg episode reward: [(0, '0.576')] [2024-06-19 11:23:54,671][26599] Updated weights for policy 0, policy_version 343024 (0.0026) [2024-06-19 11:23:58,338][26599] Updated weights for policy 0, policy_version 343034 (0.0038) [2024-06-19 11:23:58,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5620269056. Throughput: 0: 42578.7. 
Samples: 1887917880. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:23:58,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 11:24:02,227][26599] Updated weights for policy 0, policy_version 343044 (0.0040) [2024-06-19 11:24:03,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5620482048. Throughput: 0: 42610.8. Samples: 1888043600. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:03,380][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 11:24:05,921][26599] Updated weights for policy 0, policy_version 343054 (0.0044) [2024-06-19 11:24:08,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5620695040. Throughput: 0: 42753.6. Samples: 1888300820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:08,381][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 11:24:10,235][26599] Updated weights for policy 0, policy_version 343064 (0.0029) [2024-06-19 11:24:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5620891648. Throughput: 0: 42550.7. Samples: 1888556380. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:13,381][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 11:24:13,585][26599] Updated weights for policy 0, policy_version 343074 (0.0034) [2024-06-19 11:24:17,795][26599] Updated weights for policy 0, policy_version 343084 (0.0032) [2024-06-19 11:24:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5621104640. Throughput: 0: 42627.6. Samples: 1888684620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:18,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 11:24:21,377][26599] Updated weights for policy 0, policy_version 343094 (0.0050) [2024-06-19 11:24:23,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.6, 300 sec: 42542.9). Total num frames: 5621317632. Throughput: 0: 42590.8. 
Samples: 1888932980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:23,380][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 11:24:25,514][26599] Updated weights for policy 0, policy_version 343104 (0.0034) [2024-06-19 11:24:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5621547008. Throughput: 0: 42556.9. Samples: 1889190940. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:28,381][26367] Avg episode reward: [(0, '0.802')] [2024-06-19 11:24:29,245][26599] Updated weights for policy 0, policy_version 343114 (0.0028) [2024-06-19 11:24:33,338][26599] Updated weights for policy 0, policy_version 343124 (0.0036) [2024-06-19 11:24:33,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5621743616. Throughput: 0: 42605.9. Samples: 1889317240. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:33,381][26367] Avg episode reward: [(0, '0.489')] [2024-06-19 11:24:36,987][26599] Updated weights for policy 0, policy_version 343134 (0.0037) [2024-06-19 11:24:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5621972992. Throughput: 0: 42574.6. Samples: 1889575100. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:38,381][26367] Avg episode reward: [(0, '0.745')] [2024-06-19 11:24:40,790][26599] Updated weights for policy 0, policy_version 343144 (0.0035) [2024-06-19 11:24:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5622169600. Throughput: 0: 42449.9. Samples: 1889828120. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:43,380][26367] Avg episode reward: [(0, '0.769')] [2024-06-19 11:24:44,813][26599] Updated weights for policy 0, policy_version 343154 (0.0042) [2024-06-19 11:24:48,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5622382592. Throughput: 0: 42599.0. 
Samples: 1889960560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:48,381][26367] Avg episode reward: [(0, '0.436')] [2024-06-19 11:24:48,592][26599] Updated weights for policy 0, policy_version 343164 (0.0037) [2024-06-19 11:24:52,376][26599] Updated weights for policy 0, policy_version 343174 (0.0036) [2024-06-19 11:24:53,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5622595584. Throughput: 0: 42515.5. Samples: 1890214020. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:53,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 11:24:56,209][26599] Updated weights for policy 0, policy_version 343184 (0.0041) [2024-06-19 11:24:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5622824960. Throughput: 0: 42626.2. Samples: 1890474560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:24:58,381][26367] Avg episode reward: [(0, '0.591')] [2024-06-19 11:24:59,757][26599] Updated weights for policy 0, policy_version 343194 (0.0052) [2024-06-19 11:25:03,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42049.7, 300 sec: 42542.3). Total num frames: 5623005184. Throughput: 0: 42574.4. Samples: 1890600620. Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:25:03,384][26367] Avg episode reward: [(0, '0.675')] [2024-06-19 11:25:03,749][26599] Updated weights for policy 0, policy_version 343204 (0.0022) [2024-06-19 11:25:07,398][26599] Updated weights for policy 0, policy_version 343214 (0.0028) [2024-06-19 11:25:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5623234560. Throughput: 0: 42871.4. Samples: 1890862200. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 21.0) [2024-06-19 11:25:08,381][26367] Avg episode reward: [(0, '0.640')] [2024-06-19 11:25:11,216][26599] Updated weights for policy 0, policy_version 343224 (0.0051) [2024-06-19 11:25:13,380][26367] Fps is (10 sec: 47530.7, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5623480320. Throughput: 0: 42704.0. Samples: 1891112620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:13,381][26367] Avg episode reward: [(0, '0.510')] [2024-06-19 11:25:15,047][26599] Updated weights for policy 0, policy_version 343234 (0.0027) [2024-06-19 11:25:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5623660544. Throughput: 0: 42776.2. Samples: 1891242180. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:18,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 11:25:18,896][26579] Signal inference workers to stop experience collection... (27850 times) [2024-06-19 11:25:18,896][26579] Signal inference workers to resume experience collection... (27850 times) [2024-06-19 11:25:18,951][26599] InferenceWorker_p0-w0: stopping experience collection (27850 times) [2024-06-19 11:25:18,951][26599] InferenceWorker_p0-w0: resuming experience collection (27850 times) [2024-06-19 11:25:19,030][26599] Updated weights for policy 0, policy_version 343244 (0.0039) [2024-06-19 11:25:22,805][26599] Updated weights for policy 0, policy_version 343254 (0.0037) [2024-06-19 11:25:23,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5623873536. Throughput: 0: 42700.4. Samples: 1891496620. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:23,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 11:25:26,729][26599] Updated weights for policy 0, policy_version 343264 (0.0033) [2024-06-19 11:25:28,380][26367] Fps is (10 sec: 45876.5, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5624119296. Throughput: 0: 42644.0. 
Samples: 1891747100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:28,380][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 11:25:30,327][26599] Updated weights for policy 0, policy_version 343274 (0.0041) [2024-06-19 11:25:33,384][26367] Fps is (10 sec: 42583.4, 60 sec: 42595.8, 300 sec: 42487.3). Total num frames: 5624299520. Throughput: 0: 42647.7. Samples: 1891879860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:33,384][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 11:25:34,290][26599] Updated weights for policy 0, policy_version 343284 (0.0037) [2024-06-19 11:25:38,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5624512512. Throughput: 0: 42741.4. Samples: 1892137380. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:38,380][26367] Avg episode reward: [(0, '0.470')] [2024-06-19 11:25:38,465][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000343294_5624528896.pth... [2024-06-19 11:25:38,472][26599] Updated weights for policy 0, policy_version 343294 (0.0038) [2024-06-19 11:25:38,511][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000342670_5614305280.pth [2024-06-19 11:25:41,996][26599] Updated weights for policy 0, policy_version 343304 (0.0033) [2024-06-19 11:25:43,380][26367] Fps is (10 sec: 44252.0, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5624741888. Throughput: 0: 42526.1. Samples: 1892388240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:43,381][26367] Avg episode reward: [(0, '0.396')] [2024-06-19 11:25:46,117][26599] Updated weights for policy 0, policy_version 343314 (0.0035) [2024-06-19 11:25:48,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42598.3, 300 sec: 42542.8). Total num frames: 5624938496. Throughput: 0: 42629.1. Samples: 1892518780. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:48,383][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 11:25:49,700][26599] Updated weights for policy 0, policy_version 343324 (0.0040) [2024-06-19 11:25:53,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5625167872. Throughput: 0: 42373.3. Samples: 1892769000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:53,381][26367] Avg episode reward: [(0, '0.514')] [2024-06-19 11:25:53,704][26599] Updated weights for policy 0, policy_version 343334 (0.0042) [2024-06-19 11:25:57,789][26599] Updated weights for policy 0, policy_version 343344 (0.0038) [2024-06-19 11:25:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5625380864. Throughput: 0: 42516.4. Samples: 1893025860. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:25:58,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 11:26:01,154][26599] Updated weights for policy 0, policy_version 343354 (0.0043) [2024-06-19 11:26:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42873.9, 300 sec: 42542.8). Total num frames: 5625577472. Throughput: 0: 42495.1. Samples: 1893154460. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:26:03,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 11:26:05,376][26599] Updated weights for policy 0, policy_version 343364 (0.0038) [2024-06-19 11:26:08,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5625806848. Throughput: 0: 42449.8. Samples: 1893406860. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:26:08,381][26367] Avg episode reward: [(0, '0.532')] [2024-06-19 11:26:08,686][26599] Updated weights for policy 0, policy_version 343374 (0.0049) [2024-06-19 11:26:13,069][26599] Updated weights for policy 0, policy_version 343384 (0.0049) [2024-06-19 11:26:13,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5626019840. Throughput: 0: 42636.3. Samples: 1893665740. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:26:13,381][26367] Avg episode reward: [(0, '0.416')] [2024-06-19 11:26:16,202][26599] Updated weights for policy 0, policy_version 343394 (0.0031) [2024-06-19 11:26:18,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42052.4, 300 sec: 42431.8). Total num frames: 5626183680. Throughput: 0: 42302.5. Samples: 1893783320. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:26:18,381][26367] Avg episode reward: [(0, '0.284')] [2024-06-19 11:26:20,745][26599] Updated weights for policy 0, policy_version 343404 (0.0023) [2024-06-19 11:26:23,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5626462208. Throughput: 0: 42307.9. Samples: 1894041240. Policy #0 lag: (min: 0.0, avg: 9.5, max: 21.0) [2024-06-19 11:26:23,384][26367] Avg episode reward: [(0, '0.491')] [2024-06-19 11:26:23,812][26599] Updated weights for policy 0, policy_version 343414 (0.0036) [2024-06-19 11:26:27,752][26579] Signal inference workers to stop experience collection... (27900 times) [2024-06-19 11:26:27,752][26579] Signal inference workers to resume experience collection... (27900 times) [2024-06-19 11:26:27,779][26599] InferenceWorker_p0-w0: stopping experience collection (27900 times) [2024-06-19 11:26:27,779][26599] InferenceWorker_p0-w0: resuming experience collection (27900 times) [2024-06-19 11:26:28,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5626642432. Throughput: 0: 42470.5. 
Samples: 1894299400. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:26:28,380][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 11:26:28,486][26599] Updated weights for policy 0, policy_version 343424 (0.0039) [2024-06-19 11:26:31,475][26599] Updated weights for policy 0, policy_version 343434 (0.0045) [2024-06-19 11:26:33,380][26367] Fps is (10 sec: 36044.6, 60 sec: 42054.7, 300 sec: 42487.3). Total num frames: 5626822656. Throughput: 0: 42098.7. Samples: 1894413220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:26:33,381][26367] Avg episode reward: [(0, '0.395')] [2024-06-19 11:26:36,305][26599] Updated weights for policy 0, policy_version 343444 (0.0029) [2024-06-19 11:26:38,384][26367] Fps is (10 sec: 44220.3, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5627084800. Throughput: 0: 42267.4. Samples: 1894671180. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:26:38,385][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 11:26:39,057][26599] Updated weights for policy 0, policy_version 343454 (0.0045) [2024-06-19 11:26:43,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42052.4, 300 sec: 42543.4). Total num frames: 5627265024. Throughput: 0: 42522.3. Samples: 1894939360. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:26:43,380][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 11:26:44,014][26599] Updated weights for policy 0, policy_version 343464 (0.0032) [2024-06-19 11:26:46,673][26599] Updated weights for policy 0, policy_version 343474 (0.0038) [2024-06-19 11:26:48,380][26367] Fps is (10 sec: 39336.4, 60 sec: 42325.5, 300 sec: 42487.3). Total num frames: 5627478016. Throughput: 0: 42309.6. Samples: 1895058380. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:26:48,380][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 11:26:51,615][26599] Updated weights for policy 0, policy_version 343484 (0.0040) [2024-06-19 11:26:53,380][26367] Fps is (10 sec: 47513.0, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5627740160. Throughput: 0: 42554.6. Samples: 1895321820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:26:53,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 11:26:54,694][26599] Updated weights for policy 0, policy_version 343494 (0.0037) [2024-06-19 11:26:58,380][26367] Fps is (10 sec: 42597.6, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5627904000. Throughput: 0: 42531.1. Samples: 1895579640. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:26:58,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 11:26:59,329][26599] Updated weights for policy 0, policy_version 343504 (0.0041) [2024-06-19 11:27:02,442][26599] Updated weights for policy 0, policy_version 343514 (0.0027) [2024-06-19 11:27:03,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5628133376. Throughput: 0: 42477.6. Samples: 1895694820. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:27:03,381][26367] Avg episode reward: [(0, '0.463')] [2024-06-19 11:27:07,161][26599] Updated weights for policy 0, policy_version 343524 (0.0041) [2024-06-19 11:27:08,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5628346368. Throughput: 0: 42583.1. Samples: 1895957480. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:27:08,384][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 11:27:10,103][26599] Updated weights for policy 0, policy_version 343534 (0.0036) [2024-06-19 11:27:13,380][26367] Fps is (10 sec: 39322.1, 60 sec: 41779.2, 300 sec: 42431.8). Total num frames: 5628526592. Throughput: 0: 42536.3. Samples: 1896213540. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:27:13,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 11:27:15,090][26599] Updated weights for policy 0, policy_version 343544 (0.0042) [2024-06-19 11:27:17,811][26599] Updated weights for policy 0, policy_version 343554 (0.0035) [2024-06-19 11:27:18,380][26367] Fps is (10 sec: 44236.8, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5628788736. Throughput: 0: 42559.1. Samples: 1896328380. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:27:18,381][26367] Avg episode reward: [(0, '0.873')] [2024-06-19 11:27:22,613][26599] Updated weights for policy 0, policy_version 343564 (0.0049) [2024-06-19 11:27:23,380][26367] Fps is (10 sec: 45875.7, 60 sec: 42052.4, 300 sec: 42653.9). Total num frames: 5628985344. Throughput: 0: 42716.5. Samples: 1896593260. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:27:23,380][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 11:27:25,572][26599] Updated weights for policy 0, policy_version 343574 (0.0044) [2024-06-19 11:27:28,380][26367] Fps is (10 sec: 37683.4, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 5629165568. Throughput: 0: 42591.9. Samples: 1896856000. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:27:28,381][26367] Avg episode reward: [(0, '0.789')] [2024-06-19 11:27:30,034][26599] Updated weights for policy 0, policy_version 343584 (0.0040) [2024-06-19 11:27:33,353][26599] Updated weights for policy 0, policy_version 343594 (0.0038) [2024-06-19 11:27:33,380][26367] Fps is (10 sec: 45875.0, 60 sec: 43690.8, 300 sec: 42765.0). Total num frames: 5629444096. Throughput: 0: 42561.7. Samples: 1896973660. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:27:33,381][26367] Avg episode reward: [(0, '0.732')] [2024-06-19 11:27:37,811][26599] Updated weights for policy 0, policy_version 343604 (0.0028) [2024-06-19 11:27:38,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42601.0, 300 sec: 42653.9). Total num frames: 5629640704. Throughput: 0: 42661.4. Samples: 1897241580. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:27:38,381][26367] Avg episode reward: [(0, '0.701')] [2024-06-19 11:27:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000343606_5629640704.pth... [2024-06-19 11:27:38,470][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000342982_5619417088.pth [2024-06-19 11:27:41,178][26599] Updated weights for policy 0, policy_version 343614 (0.0040) [2024-06-19 11:27:43,380][26367] Fps is (10 sec: 36044.3, 60 sec: 42325.2, 300 sec: 42376.2). Total num frames: 5629804544. Throughput: 0: 42636.5. Samples: 1897498280. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:27:43,382][26367] Avg episode reward: [(0, '0.705')] [2024-06-19 11:27:45,541][26599] Updated weights for policy 0, policy_version 343624 (0.0032) [2024-06-19 11:27:47,559][26579] Signal inference workers to stop experience collection... (27950 times) [2024-06-19 11:27:47,559][26579] Signal inference workers to resume experience collection... (27950 times) [2024-06-19 11:27:47,605][26599] InferenceWorker_p0-w0: stopping experience collection (27950 times) [2024-06-19 11:27:47,605][26599] InferenceWorker_p0-w0: resuming experience collection (27950 times) [2024-06-19 11:27:48,380][26367] Fps is (10 sec: 42598.5, 60 sec: 43144.5, 300 sec: 42654.0). Total num frames: 5630066688. Throughput: 0: 42686.4. Samples: 1897615700. 
Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:27:48,381][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 11:27:49,663][26599] Updated weights for policy 0, policy_version 343634 (0.0044) [2024-06-19 11:27:53,380][26367] Fps is (10 sec: 44237.4, 60 sec: 41779.3, 300 sec: 42542.9). Total num frames: 5630246912. Throughput: 0: 42565.9. Samples: 1897872940. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:27:53,380][26367] Avg episode reward: [(0, '0.499')] [2024-06-19 11:27:53,432][26599] Updated weights for policy 0, policy_version 343644 (0.0049) [2024-06-19 11:27:57,270][26599] Updated weights for policy 0, policy_version 343654 (0.0035) [2024-06-19 11:27:58,380][26367] Fps is (10 sec: 37682.5, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5630443520. Throughput: 0: 42442.1. Samples: 1898123440. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:27:58,381][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 11:28:01,341][26599] Updated weights for policy 0, policy_version 343664 (0.0032) [2024-06-19 11:28:03,380][26367] Fps is (10 sec: 45875.1, 60 sec: 42871.6, 300 sec: 42653.9). Total num frames: 5630705664. Throughput: 0: 42808.1. Samples: 1898254740. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:03,381][26367] Avg episode reward: [(0, '0.792')] [2024-06-19 11:28:04,845][26599] Updated weights for policy 0, policy_version 343674 (0.0049) [2024-06-19 11:28:08,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42052.3, 300 sec: 42431.8). Total num frames: 5630869504. Throughput: 0: 42544.3. Samples: 1898507760. 
Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:08,381][26367] Avg episode reward: [(0, '0.612')] [2024-06-19 11:28:08,972][26599] Updated weights for policy 0, policy_version 343684 (0.0034) [2024-06-19 11:28:12,366][26599] Updated weights for policy 0, policy_version 343694 (0.0037) [2024-06-19 11:28:13,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42871.4, 300 sec: 42542.9). Total num frames: 5631098880. Throughput: 0: 42313.3. Samples: 1898760100. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:13,381][26367] Avg episode reward: [(0, '0.443')] [2024-06-19 11:28:16,695][26599] Updated weights for policy 0, policy_version 343704 (0.0040) [2024-06-19 11:28:18,384][26367] Fps is (10 sec: 47496.2, 60 sec: 42595.8, 300 sec: 42653.4). Total num frames: 5631344640. Throughput: 0: 42672.5. Samples: 1898894080. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:18,385][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 11:28:20,428][26599] Updated weights for policy 0, policy_version 343714 (0.0029) [2024-06-19 11:28:23,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42052.2, 300 sec: 42431.8). Total num frames: 5631508480. Throughput: 0: 42354.7. Samples: 1899147540. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:23,381][26367] Avg episode reward: [(0, '0.715')] [2024-06-19 11:28:24,346][26599] Updated weights for policy 0, policy_version 343724 (0.0027) [2024-06-19 11:28:27,905][26599] Updated weights for policy 0, policy_version 343734 (0.0038) [2024-06-19 11:28:28,381][26367] Fps is (10 sec: 39332.0, 60 sec: 42870.8, 300 sec: 42487.2). Total num frames: 5631737856. Throughput: 0: 42318.7. Samples: 1899402660. 
Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:28,382][26367] Avg episode reward: [(0, '0.655')] [2024-06-19 11:28:31,900][26599] Updated weights for policy 0, policy_version 343744 (0.0054) [2024-06-19 11:28:33,380][26367] Fps is (10 sec: 47513.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5631983616. Throughput: 0: 42689.3. Samples: 1899536720. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:33,380][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 11:28:35,512][26599] Updated weights for policy 0, policy_version 343754 (0.0045) [2024-06-19 11:28:38,380][26367] Fps is (10 sec: 42602.5, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5632163840. Throughput: 0: 42589.2. Samples: 1899789460. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:38,384][26367] Avg episode reward: [(0, '0.465')] [2024-06-19 11:28:39,786][26599] Updated weights for policy 0, policy_version 343764 (0.0042) [2024-06-19 11:28:43,081][26599] Updated weights for policy 0, policy_version 343774 (0.0028) [2024-06-19 11:28:43,381][26367] Fps is (10 sec: 40955.9, 60 sec: 43143.9, 300 sec: 42542.7). Total num frames: 5632393216. Throughput: 0: 42636.1. Samples: 1900042100. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:43,382][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 11:28:47,311][26599] Updated weights for policy 0, policy_version 343784 (0.0042) [2024-06-19 11:28:47,771][26579] Signal inference workers to stop experience collection... (28000 times) [2024-06-19 11:28:47,780][26579] Signal inference workers to resume experience collection... (28000 times) [2024-06-19 11:28:47,827][26599] InferenceWorker_p0-w0: stopping experience collection (28000 times) [2024-06-19 11:28:47,827][26599] InferenceWorker_p0-w0: resuming experience collection (28000 times) [2024-06-19 11:28:48,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5632622592. Throughput: 0: 42637.7. 
Samples: 1900173440. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:48,381][26367] Avg episode reward: [(0, '0.368')] [2024-06-19 11:28:50,653][26599] Updated weights for policy 0, policy_version 343794 (0.0031) [2024-06-19 11:28:53,380][26367] Fps is (10 sec: 40964.4, 60 sec: 42598.4, 300 sec: 42487.3). Total num frames: 5632802816. Throughput: 0: 42693.0. Samples: 1900428940. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:53,380][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 11:28:55,005][26599] Updated weights for policy 0, policy_version 343804 (0.0027) [2024-06-19 11:28:58,324][26599] Updated weights for policy 0, policy_version 343814 (0.0035) [2024-06-19 11:28:58,380][26367] Fps is (10 sec: 42598.2, 60 sec: 43417.7, 300 sec: 42598.4). Total num frames: 5633048576. Throughput: 0: 42668.9. Samples: 1900680200. Policy #0 lag: (min: 1.0, avg: 8.3, max: 21.0) [2024-06-19 11:28:58,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 11:29:02,573][26599] Updated weights for policy 0, policy_version 343824 (0.0031) [2024-06-19 11:29:03,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5633261568. Throughput: 0: 42690.7. Samples: 1900815000. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:03,380][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 11:29:06,038][26599] Updated weights for policy 0, policy_version 343834 (0.0035) [2024-06-19 11:29:08,380][26367] Fps is (10 sec: 39321.1, 60 sec: 42871.3, 300 sec: 42542.8). Total num frames: 5633441792. Throughput: 0: 42541.6. Samples: 1901061920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:08,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 11:29:10,287][26599] Updated weights for policy 0, policy_version 343844 (0.0031) [2024-06-19 11:29:13,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5633671168. Throughput: 0: 42498.8. 
Samples: 1901315060. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:13,380][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 11:29:14,070][26599] Updated weights for policy 0, policy_version 343854 (0.0041) [2024-06-19 11:29:17,830][26599] Updated weights for policy 0, policy_version 343864 (0.0035) [2024-06-19 11:29:18,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42327.8, 300 sec: 42598.4). Total num frames: 5633884160. Throughput: 0: 42513.2. Samples: 1901449820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:18,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 11:29:21,652][26599] Updated weights for policy 0, policy_version 343874 (0.0028) [2024-06-19 11:29:23,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42542.9). Total num frames: 5634097152. Throughput: 0: 42563.6. Samples: 1901704820. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:23,384][26367] Avg episode reward: [(0, '0.523')] [2024-06-19 11:29:25,654][26599] Updated weights for policy 0, policy_version 343884 (0.0027) [2024-06-19 11:29:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42599.1, 300 sec: 42542.8). Total num frames: 5634293760. Throughput: 0: 42552.4. Samples: 1901956920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:28,381][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 11:29:29,301][26599] Updated weights for policy 0, policy_version 343894 (0.0028) [2024-06-19 11:29:33,319][26599] Updated weights for policy 0, policy_version 343904 (0.0031) [2024-06-19 11:29:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5634523136. Throughput: 0: 42478.2. Samples: 1902084960. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:33,384][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 11:29:37,072][26599] Updated weights for policy 0, policy_version 343914 (0.0046) [2024-06-19 11:29:38,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5634736128. Throughput: 0: 42544.8. Samples: 1902343460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:38,381][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 11:29:38,439][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000343918_5634752512.pth... [2024-06-19 11:29:38,488][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000343294_5624528896.pth [2024-06-19 11:29:40,952][26599] Updated weights for policy 0, policy_version 343924 (0.0025) [2024-06-19 11:29:43,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42599.0, 300 sec: 42598.4). Total num frames: 5634949120. Throughput: 0: 42587.1. Samples: 1902596620. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:43,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 11:29:44,763][26599] Updated weights for policy 0, policy_version 343934 (0.0030) [2024-06-19 11:29:48,384][26367] Fps is (10 sec: 40944.8, 60 sec: 42049.7, 300 sec: 42542.3). Total num frames: 5635145728. Throughput: 0: 42406.7. Samples: 1902723460. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:48,385][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 11:29:48,733][26599] Updated weights for policy 0, policy_version 343944 (0.0028) [2024-06-19 11:29:52,336][26599] Updated weights for policy 0, policy_version 343954 (0.0034) [2024-06-19 11:29:53,380][26367] Fps is (10 sec: 44237.7, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5635391488. Throughput: 0: 42786.5. Samples: 1902987300. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:53,380][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 11:29:56,241][26599] Updated weights for policy 0, policy_version 343964 (0.0029) [2024-06-19 11:29:58,384][26367] Fps is (10 sec: 45875.1, 60 sec: 42595.8, 300 sec: 42709.5). Total num frames: 5635604480. Throughput: 0: 42831.6. Samples: 1903242640. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:29:58,385][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 11:30:00,163][26599] Updated weights for policy 0, policy_version 343974 (0.0040) [2024-06-19 11:30:03,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5635801088. Throughput: 0: 42512.1. Samples: 1903362860. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:30:03,384][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 11:30:03,916][26599] Updated weights for policy 0, policy_version 343984 (0.0051) [2024-06-19 11:30:07,763][26599] Updated weights for policy 0, policy_version 343994 (0.0036) [2024-06-19 11:30:07,787][26579] Signal inference workers to stop experience collection... (28050 times) [2024-06-19 11:30:07,787][26579] Signal inference workers to resume experience collection... (28050 times) [2024-06-19 11:30:07,831][26599] InferenceWorker_p0-w0: stopping experience collection (28050 times) [2024-06-19 11:30:07,836][26599] InferenceWorker_p0-w0: resuming experience collection (28050 times) [2024-06-19 11:30:08,384][26367] Fps is (10 sec: 44236.8, 60 sec: 43415.1, 300 sec: 42597.9). Total num frames: 5636046848. Throughput: 0: 42657.8. Samples: 1903624580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:30:08,385][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 11:30:11,997][26599] Updated weights for policy 0, policy_version 344004 (0.0028) [2024-06-19 11:30:13,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5636227072. Throughput: 0: 42689.8. 
Samples: 1903877960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 22.0) [2024-06-19 11:30:13,381][26367] Avg episode reward: [(0, '0.854')] [2024-06-19 11:30:15,394][26599] Updated weights for policy 0, policy_version 344014 (0.0033) [2024-06-19 11:30:18,384][26367] Fps is (10 sec: 37683.4, 60 sec: 42322.9, 300 sec: 42542.3). Total num frames: 5636423680. Throughput: 0: 42574.4. Samples: 1904000960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:30:18,384][26367] Avg episode reward: [(0, '0.753')] [2024-06-19 11:30:19,663][26599] Updated weights for policy 0, policy_version 344024 (0.0030) [2024-06-19 11:30:23,314][26599] Updated weights for policy 0, policy_version 344034 (0.0034) [2024-06-19 11:30:23,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42487.3). Total num frames: 5636653056. Throughput: 0: 42642.0. Samples: 1904262360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:30:23,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 11:30:27,378][26599] Updated weights for policy 0, policy_version 344044 (0.0027) [2024-06-19 11:30:28,380][26367] Fps is (10 sec: 44252.8, 60 sec: 42871.5, 300 sec: 42598.9). Total num frames: 5636866048. Throughput: 0: 42594.3. Samples: 1904513360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:30:28,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 11:30:30,871][26599] Updated weights for policy 0, policy_version 344054 (0.0039) [2024-06-19 11:30:33,380][26367] Fps is (10 sec: 42599.4, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5637079040. Throughput: 0: 42715.6. Samples: 1904645500. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:30:33,380][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 11:30:34,834][26599] Updated weights for policy 0, policy_version 344064 (0.0030) [2024-06-19 11:30:38,194][26599] Updated weights for policy 0, policy_version 344074 (0.0044) [2024-06-19 11:30:38,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5637308416. Throughput: 0: 42679.4. Samples: 1904907880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:30:38,381][26367] Avg episode reward: [(0, '0.569')] [2024-06-19 11:30:42,485][26599] Updated weights for policy 0, policy_version 344084 (0.0040) [2024-06-19 11:30:43,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5637505024. Throughput: 0: 42646.6. Samples: 1905161580. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:30:43,381][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 11:30:46,398][26599] Updated weights for policy 0, policy_version 344094 (0.0044) [2024-06-19 11:30:48,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42874.0, 300 sec: 42542.9). Total num frames: 5637718016. Throughput: 0: 42705.8. Samples: 1905284620. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:30:48,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 11:30:50,235][26599] Updated weights for policy 0, policy_version 344104 (0.0032) [2024-06-19 11:30:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5637931008. Throughput: 0: 42744.8. Samples: 1905547940. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:30:53,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 11:30:53,806][26599] Updated weights for policy 0, policy_version 344114 (0.0039) [2024-06-19 11:30:57,861][26599] Updated weights for policy 0, policy_version 344124 (0.0035) [2024-06-19 11:30:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42327.9, 300 sec: 42598.4). Total num frames: 5638144000. Throughput: 0: 42714.6. Samples: 1905800120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:30:58,381][26367] Avg episode reward: [(0, '0.704')] [2024-06-19 11:31:01,190][26599] Updated weights for policy 0, policy_version 344134 (0.0033) [2024-06-19 11:31:03,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42871.6, 300 sec: 42598.4). Total num frames: 5638373376. Throughput: 0: 42803.6. Samples: 1905926960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:31:03,380][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 11:31:05,626][26599] Updated weights for policy 0, policy_version 344144 (0.0039) [2024-06-19 11:31:08,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42327.9, 300 sec: 42598.4). Total num frames: 5638586368. Throughput: 0: 42701.5. Samples: 1906183920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:31:08,381][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 11:31:09,264][26599] Updated weights for policy 0, policy_version 344154 (0.0031) [2024-06-19 11:31:13,276][26599] Updated weights for policy 0, policy_version 344164 (0.0029) [2024-06-19 11:31:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5638782976. Throughput: 0: 42788.9. Samples: 1906438860. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:31:13,381][26367] Avg episode reward: [(0, '0.365')] [2024-06-19 11:31:16,867][26599] Updated weights for policy 0, policy_version 344174 (0.0029) [2024-06-19 11:31:18,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42874.0, 300 sec: 42487.3). Total num frames: 5638995968. Throughput: 0: 42563.4. Samples: 1906560860. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:31:18,382][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 11:31:21,245][26599] Updated weights for policy 0, policy_version 344184 (0.0034) [2024-06-19 11:31:23,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42325.5, 300 sec: 42542.9). Total num frames: 5639192576. Throughput: 0: 42435.2. Samples: 1906817460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:31:23,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 11:31:24,388][26599] Updated weights for policy 0, policy_version 344194 (0.0045) [2024-06-19 11:31:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5639405568. Throughput: 0: 42508.0. Samples: 1907074440. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:31:28,381][26367] Avg episode reward: [(0, '0.825')] [2024-06-19 11:31:28,759][26599] Updated weights for policy 0, policy_version 344204 (0.0025) [2024-06-19 11:31:31,902][26599] Updated weights for policy 0, policy_version 344214 (0.0031) [2024-06-19 11:31:33,380][26367] Fps is (10 sec: 45874.0, 60 sec: 42871.3, 300 sec: 42598.9). Total num frames: 5639651328. Throughput: 0: 42658.6. Samples: 1907204260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 25.0) [2024-06-19 11:31:33,381][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 11:31:36,205][26599] Updated weights for policy 0, policy_version 344224 (0.0032) [2024-06-19 11:31:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42052.3, 300 sec: 42598.4). Total num frames: 5639831552. Throughput: 0: 42520.1. Samples: 1907461340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:31:38,380][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 11:31:38,394][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000344229_5639847936.pth... [2024-06-19 11:31:38,442][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000343606_5629640704.pth [2024-06-19 11:31:39,976][26599] Updated weights for policy 0, policy_version 344234 (0.0037) [2024-06-19 11:31:43,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5640044544. Throughput: 0: 42513.4. Samples: 1907713220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:31:43,381][26367] Avg episode reward: [(0, '0.775')] [2024-06-19 11:31:43,452][26579] Signal inference workers to stop experience collection... (28100 times) [2024-06-19 11:31:43,504][26599] InferenceWorker_p0-w0: stopping experience collection (28100 times) [2024-06-19 11:31:43,504][26579] Signal inference workers to resume experience collection... (28100 times) [2024-06-19 11:31:43,517][26599] InferenceWorker_p0-w0: resuming experience collection (28100 times) [2024-06-19 11:31:43,960][26599] Updated weights for policy 0, policy_version 344244 (0.0029) [2024-06-19 11:31:47,823][26599] Updated weights for policy 0, policy_version 344254 (0.0034) [2024-06-19 11:31:48,384][26367] Fps is (10 sec: 45858.1, 60 sec: 42868.9, 300 sec: 42542.3). Total num frames: 5640290304. Throughput: 0: 42616.4. Samples: 1907844860. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:31:48,385][26367] Avg episode reward: [(0, '0.600')] [2024-06-19 11:31:51,485][26599] Updated weights for policy 0, policy_version 344264 (0.0034) [2024-06-19 11:31:53,380][26367] Fps is (10 sec: 39321.6, 60 sec: 41779.2, 300 sec: 42487.3). Total num frames: 5640437760. Throughput: 0: 42556.4. Samples: 1908098960. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:31:53,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 11:31:55,514][26599] Updated weights for policy 0, policy_version 344274 (0.0036) [2024-06-19 11:31:58,380][26367] Fps is (10 sec: 40975.0, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5640699904. Throughput: 0: 42522.7. Samples: 1908352380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:31:58,381][26367] Avg episode reward: [(0, '0.805')] [2024-06-19 11:31:59,054][26599] Updated weights for policy 0, policy_version 344284 (0.0039) [2024-06-19 11:32:03,169][26599] Updated weights for policy 0, policy_version 344294 (0.0042) [2024-06-19 11:32:03,380][26367] Fps is (10 sec: 49152.2, 60 sec: 42598.3, 300 sec: 42654.0). Total num frames: 5640929280. Throughput: 0: 42862.3. Samples: 1908489660. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:03,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 11:32:07,196][26599] Updated weights for policy 0, policy_version 344304 (0.0046) [2024-06-19 11:32:08,380][26367] Fps is (10 sec: 39321.5, 60 sec: 41779.2, 300 sec: 42598.4). Total num frames: 5641093120. Throughput: 0: 42572.3. Samples: 1908733220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:08,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 11:32:10,734][26599] Updated weights for policy 0, policy_version 344314 (0.0038) [2024-06-19 11:32:13,384][26367] Fps is (10 sec: 40945.1, 60 sec: 42595.8, 300 sec: 42542.3). Total num frames: 5641338880. Throughput: 0: 42444.6. Samples: 1908984600. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:13,384][26367] Avg episode reward: [(0, '0.835')] [2024-06-19 11:32:14,750][26599] Updated weights for policy 0, policy_version 344324 (0.0029) [2024-06-19 11:32:18,346][26599] Updated weights for policy 0, policy_version 344334 (0.0038) [2024-06-19 11:32:18,380][26367] Fps is (10 sec: 47513.7, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5641568256. Throughput: 0: 42699.2. Samples: 1909125720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:18,381][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 11:32:22,262][26599] Updated weights for policy 0, policy_version 344344 (0.0027) [2024-06-19 11:32:23,380][26367] Fps is (10 sec: 39335.8, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5641732096. Throughput: 0: 42382.1. Samples: 1909368540. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:23,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 11:32:26,045][26599] Updated weights for policy 0, policy_version 344354 (0.0045) [2024-06-19 11:32:28,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42487.3). Total num frames: 5641977856. Throughput: 0: 42438.7. Samples: 1909622960. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:28,381][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 11:32:29,819][26599] Updated weights for policy 0, policy_version 344364 (0.0039) [2024-06-19 11:32:33,384][26367] Fps is (10 sec: 45858.5, 60 sec: 42322.9, 300 sec: 42542.3). Total num frames: 5642190848. Throughput: 0: 42504.5. Samples: 1909757560. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:33,384][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 11:32:33,813][26599] Updated weights for policy 0, policy_version 344374 (0.0043) [2024-06-19 11:32:37,309][26599] Updated weights for policy 0, policy_version 344384 (0.0041) [2024-06-19 11:32:38,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5642387456. Throughput: 0: 42322.2. Samples: 1910003460. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:38,381][26367] Avg episode reward: [(0, '0.791')] [2024-06-19 11:32:41,608][26579] Signal inference workers to stop experience collection... (28150 times) [2024-06-19 11:32:41,608][26579] Signal inference workers to resume experience collection... (28150 times) [2024-06-19 11:32:41,634][26599] InferenceWorker_p0-w0: stopping experience collection (28150 times) [2024-06-19 11:32:41,634][26599] InferenceWorker_p0-w0: resuming experience collection (28150 times) [2024-06-19 11:32:41,774][26599] Updated weights for policy 0, policy_version 344394 (0.0032) [2024-06-19 11:32:43,380][26367] Fps is (10 sec: 44253.2, 60 sec: 43144.6, 300 sec: 42598.4). Total num frames: 5642633216. Throughput: 0: 42430.7. Samples: 1910261760. Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:43,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 11:32:45,238][26599] Updated weights for policy 0, policy_version 344404 (0.0034) [2024-06-19 11:32:48,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42054.9, 300 sec: 42598.4). Total num frames: 5642813440. Throughput: 0: 42386.7. Samples: 1910397060. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 21.0) [2024-06-19 11:32:48,380][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 11:32:49,391][26599] Updated weights for policy 0, policy_version 344414 (0.0035) [2024-06-19 11:32:53,365][26599] Updated weights for policy 0, policy_version 344424 (0.0041) [2024-06-19 11:32:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 43417.7, 300 sec: 42709.5). Total num frames: 5643042816. Throughput: 0: 42471.2. Samples: 1910644420. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:32:53,381][26367] Avg episode reward: [(0, '0.557')] [2024-06-19 11:32:56,849][26599] Updated weights for policy 0, policy_version 344434 (0.0028) [2024-06-19 11:32:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5643255808. Throughput: 0: 42683.9. Samples: 1910905220. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:32:58,380][26367] Avg episode reward: [(0, '0.723')] [2024-06-19 11:33:01,210][26599] Updated weights for policy 0, policy_version 344444 (0.0042) [2024-06-19 11:33:03,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5643452416. Throughput: 0: 42380.1. Samples: 1911032820. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:03,380][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 11:33:04,831][26599] Updated weights for policy 0, policy_version 344454 (0.0046) [2024-06-19 11:33:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5643681792. Throughput: 0: 42584.5. Samples: 1911284840. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:08,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 11:33:08,937][26599] Updated weights for policy 0, policy_version 344464 (0.0034) [2024-06-19 11:33:12,658][26599] Updated weights for policy 0, policy_version 344474 (0.0035) [2024-06-19 11:33:13,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42874.1, 300 sec: 42598.9). Total num frames: 5643911168. Throughput: 0: 42639.6. Samples: 1911541740. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:13,381][26367] Avg episode reward: [(0, '0.320')] [2024-06-19 11:33:16,610][26599] Updated weights for policy 0, policy_version 344484 (0.0043) [2024-06-19 11:33:18,382][26367] Fps is (10 sec: 40953.1, 60 sec: 42051.1, 300 sec: 42653.7). Total num frames: 5644091392. Throughput: 0: 42447.7. Samples: 1911667620. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:18,383][26367] Avg episode reward: [(0, '0.792')] [2024-06-19 11:33:20,224][26599] Updated weights for policy 0, policy_version 344494 (0.0031) [2024-06-19 11:33:23,382][26367] Fps is (10 sec: 42592.6, 60 sec: 43416.7, 300 sec: 42709.4). Total num frames: 5644337152. Throughput: 0: 42549.5. Samples: 1911918240. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:23,382][26367] Avg episode reward: [(0, '0.363')] [2024-06-19 11:33:24,139][26599] Updated weights for policy 0, policy_version 344504 (0.0044) [2024-06-19 11:33:28,170][26599] Updated weights for policy 0, policy_version 344514 (0.0034) [2024-06-19 11:33:28,384][26367] Fps is (10 sec: 42589.6, 60 sec: 42322.7, 300 sec: 42486.8). Total num frames: 5644517376. Throughput: 0: 42655.1. Samples: 1912181400. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:28,385][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 11:33:31,677][26599] Updated weights for policy 0, policy_version 344524 (0.0023) [2024-06-19 11:33:33,380][26367] Fps is (10 sec: 39326.6, 60 sec: 42327.9, 300 sec: 42598.4). Total num frames: 5644730368. Throughput: 0: 42299.5. Samples: 1912300540. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:33,381][26367] Avg episode reward: [(0, '0.238')] [2024-06-19 11:33:35,811][26599] Updated weights for policy 0, policy_version 344534 (0.0024) [2024-06-19 11:33:38,380][26367] Fps is (10 sec: 45891.5, 60 sec: 43144.4, 300 sec: 42654.1). Total num frames: 5644976128. Throughput: 0: 42597.1. Samples: 1912561300. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:38,381][26367] Avg episode reward: [(0, '0.768')] [2024-06-19 11:33:38,520][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000344543_5644992512.pth... [2024-06-19 11:33:38,567][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000343918_5634752512.pth [2024-06-19 11:33:39,620][26599] Updated weights for policy 0, policy_version 344544 (0.0035) [2024-06-19 11:33:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.2, 300 sec: 42487.3). Total num frames: 5645156352. Throughput: 0: 42428.0. Samples: 1912814480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:43,381][26367] Avg episode reward: [(0, '0.171')] [2024-06-19 11:33:43,410][26599] Updated weights for policy 0, policy_version 344554 (0.0028) [2024-06-19 11:33:47,306][26599] Updated weights for policy 0, policy_version 344564 (0.0043) [2024-06-19 11:33:48,384][26367] Fps is (10 sec: 37670.2, 60 sec: 42322.8, 300 sec: 42542.3). Total num frames: 5645352960. Throughput: 0: 42351.7. Samples: 1912938800. 
Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:48,384][26367] Avg episode reward: [(0, '0.171')] [2024-06-19 11:33:51,177][26599] Updated weights for policy 0, policy_version 344574 (0.0038) [2024-06-19 11:33:53,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42598.3, 300 sec: 42542.9). Total num frames: 5645598720. Throughput: 0: 42569.6. Samples: 1913200480. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:53,381][26367] Avg episode reward: [(0, '0.317')] [2024-06-19 11:33:55,165][26599] Updated weights for policy 0, policy_version 344584 (0.0034) [2024-06-19 11:33:56,905][26579] Signal inference workers to stop experience collection... (28200 times) [2024-06-19 11:33:56,905][26579] Signal inference workers to resume experience collection... (28200 times) [2024-06-19 11:33:56,955][26599] InferenceWorker_p0-w0: stopping experience collection (28200 times) [2024-06-19 11:33:56,955][26599] InferenceWorker_p0-w0: resuming experience collection (28200 times) [2024-06-19 11:33:58,380][26367] Fps is (10 sec: 44252.2, 60 sec: 42325.2, 300 sec: 42487.3). Total num frames: 5645795328. Throughput: 0: 42479.4. Samples: 1913453320. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:33:58,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 11:33:58,866][26599] Updated weights for policy 0, policy_version 344594 (0.0041) [2024-06-19 11:34:02,880][26599] Updated weights for policy 0, policy_version 344604 (0.0028) [2024-06-19 11:34:03,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5645991936. Throughput: 0: 42538.0. Samples: 1913581760. Policy #0 lag: (min: 1.0, avg: 11.3, max: 21.0) [2024-06-19 11:34:03,381][26367] Avg episode reward: [(0, '0.492')] [2024-06-19 11:34:06,459][26599] Updated weights for policy 0, policy_version 344614 (0.0033) [2024-06-19 11:34:08,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42598.3, 300 sec: 42598.4). Total num frames: 5646237696. 
Throughput: 0: 42687.4. Samples: 1913839120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:08,381][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 11:34:10,468][26599] Updated weights for policy 0, policy_version 344624 (0.0039) [2024-06-19 11:34:13,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42052.2, 300 sec: 42542.9). Total num frames: 5646434304. Throughput: 0: 42585.2. Samples: 1914097580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:13,381][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 11:34:14,571][26599] Updated weights for policy 0, policy_version 344634 (0.0036) [2024-06-19 11:34:18,265][26599] Updated weights for policy 0, policy_version 344644 (0.0030) [2024-06-19 11:34:18,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42599.5, 300 sec: 42542.8). Total num frames: 5646647296. Throughput: 0: 42646.1. Samples: 1914219620. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:18,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 11:34:22,303][26599] Updated weights for policy 0, policy_version 344654 (0.0031) [2024-06-19 11:34:23,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42326.3, 300 sec: 42654.0). Total num frames: 5646876672. Throughput: 0: 42827.3. Samples: 1914488520. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:23,380][26367] Avg episode reward: [(0, '0.824')] [2024-06-19 11:34:25,876][26599] Updated weights for policy 0, policy_version 344664 (0.0029) [2024-06-19 11:34:28,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42601.1, 300 sec: 42542.9). Total num frames: 5647073280. Throughput: 0: 42841.4. Samples: 1914742340. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:28,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 11:34:29,681][26599] Updated weights for policy 0, policy_version 344674 (0.0033) [2024-06-19 11:34:33,265][26599] Updated weights for policy 0, policy_version 344684 (0.0034) [2024-06-19 11:34:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5647302656. Throughput: 0: 42950.1. Samples: 1914871400. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:33,381][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 11:34:37,215][26599] Updated weights for policy 0, policy_version 344694 (0.0034) [2024-06-19 11:34:38,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5647515648. Throughput: 0: 42897.0. Samples: 1915130840. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:38,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 11:34:40,832][26599] Updated weights for policy 0, policy_version 344704 (0.0040) [2024-06-19 11:34:43,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.4, 300 sec: 42598.9). Total num frames: 5647712256. Throughput: 0: 42832.9. Samples: 1915380800. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:43,382][26367] Avg episode reward: [(0, '0.820')] [2024-06-19 11:34:44,829][26599] Updated weights for policy 0, policy_version 344714 (0.0034) [2024-06-19 11:34:48,380][26367] Fps is (10 sec: 44237.4, 60 sec: 43420.3, 300 sec: 42598.4). Total num frames: 5647958016. Throughput: 0: 42843.2. Samples: 1915509700. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:48,380][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 11:34:48,383][26599] Updated weights for policy 0, policy_version 344724 (0.0032) [2024-06-19 11:34:52,311][26599] Updated weights for policy 0, policy_version 344734 (0.0036) [2024-06-19 11:34:53,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.4, 300 sec: 42487.8). Total num frames: 5648138240. Throughput: 0: 42717.8. Samples: 1915761420. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:53,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 11:34:55,929][26599] Updated weights for policy 0, policy_version 344744 (0.0035) [2024-06-19 11:34:58,380][26367] Fps is (10 sec: 40959.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5648367616. Throughput: 0: 42768.4. Samples: 1916022160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:34:58,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 11:34:59,875][26599] Updated weights for policy 0, policy_version 344754 (0.0048) [2024-06-19 11:35:03,380][26367] Fps is (10 sec: 45875.8, 60 sec: 43417.7, 300 sec: 42543.4). Total num frames: 5648596992. Throughput: 0: 43036.2. Samples: 1916156240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:35:03,380][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 11:35:03,463][26599] Updated weights for policy 0, policy_version 344764 (0.0036) [2024-06-19 11:35:07,515][26599] Updated weights for policy 0, policy_version 344774 (0.0037) [2024-06-19 11:35:08,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5648793600. Throughput: 0: 42669.6. Samples: 1916408660. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:35:08,381][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 11:35:11,135][26599] Updated weights for policy 0, policy_version 344784 (0.0030) [2024-06-19 11:35:13,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.5, 300 sec: 42654.5). Total num frames: 5649006592. Throughput: 0: 42793.7. Samples: 1916668060. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:35:13,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 11:35:15,536][26599] Updated weights for policy 0, policy_version 344794 (0.0043) [2024-06-19 11:35:18,384][26367] Fps is (10 sec: 44220.9, 60 sec: 43142.0, 300 sec: 42653.4). Total num frames: 5649235968. Throughput: 0: 42765.8. Samples: 1916796020. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:35:18,385][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 11:35:18,791][26599] Updated weights for policy 0, policy_version 344804 (0.0035) [2024-06-19 11:35:20,282][26579] Signal inference workers to stop experience collection... (28250 times) [2024-06-19 11:35:20,320][26599] InferenceWorker_p0-w0: stopping experience collection (28250 times) [2024-06-19 11:35:20,342][26579] Signal inference workers to resume experience collection... (28250 times) [2024-06-19 11:35:20,342][26599] InferenceWorker_p0-w0: resuming experience collection (28250 times) [2024-06-19 11:35:23,116][26599] Updated weights for policy 0, policy_version 344814 (0.0036) [2024-06-19 11:35:23,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42598.2, 300 sec: 42598.4). Total num frames: 5649432576. Throughput: 0: 42753.6. Samples: 1917054760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:35:23,381][26367] Avg episode reward: [(0, '0.579')] [2024-06-19 11:35:26,300][26599] Updated weights for policy 0, policy_version 344824 (0.0041) [2024-06-19 11:35:28,380][26367] Fps is (10 sec: 42614.0, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5649661952. 
Throughput: 0: 42888.9. Samples: 1917310800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:35:28,381][26367] Avg episode reward: [(0, '0.558')] [2024-06-19 11:35:30,959][26599] Updated weights for policy 0, policy_version 344834 (0.0035) [2024-06-19 11:35:33,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5649874944. Throughput: 0: 42911.4. Samples: 1917440720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:35:33,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 11:35:34,327][26599] Updated weights for policy 0, policy_version 344844 (0.0042) [2024-06-19 11:35:38,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.3, 300 sec: 42542.9). Total num frames: 5650055168. Throughput: 0: 42970.6. Samples: 1917695100. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:35:38,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 11:35:38,501][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000344853_5650071552.pth... [2024-06-19 11:35:38,563][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000344229_5639847936.pth [2024-06-19 11:35:38,711][26599] Updated weights for policy 0, policy_version 344854 (0.0036) [2024-06-19 11:35:41,955][26599] Updated weights for policy 0, policy_version 344864 (0.0032) [2024-06-19 11:35:43,380][26367] Fps is (10 sec: 42599.1, 60 sec: 43144.6, 300 sec: 42654.0). Total num frames: 5650300928. Throughput: 0: 42770.0. Samples: 1917946800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:35:43,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 11:35:46,293][26599] Updated weights for policy 0, policy_version 344874 (0.0031) [2024-06-19 11:35:48,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5650513920. Throughput: 0: 42745.2. Samples: 1918079780. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:35:48,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 11:35:49,690][26599] Updated weights for policy 0, policy_version 344884 (0.0037) [2024-06-19 11:35:53,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5650710528. Throughput: 0: 42711.2. Samples: 1918330660. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:35:53,388][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 11:35:54,023][26599] Updated weights for policy 0, policy_version 344894 (0.0038) [2024-06-19 11:35:57,465][26599] Updated weights for policy 0, policy_version 344904 (0.0035) [2024-06-19 11:35:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42653.9). Total num frames: 5650956288. Throughput: 0: 42569.7. Samples: 1918583700. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:35:58,381][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 11:36:01,645][26599] Updated weights for policy 0, policy_version 344914 (0.0038) [2024-06-19 11:36:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42325.2, 300 sec: 42542.9). Total num frames: 5651136512. Throughput: 0: 42682.6. Samples: 1918716580. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:36:03,381][26367] Avg episode reward: [(0, '0.835')] [2024-06-19 11:36:05,241][26599] Updated weights for policy 0, policy_version 344924 (0.0041) [2024-06-19 11:36:08,380][26367] Fps is (10 sec: 39321.7, 60 sec: 42598.5, 300 sec: 42598.4). Total num frames: 5651349504. Throughput: 0: 42534.4. Samples: 1918968800. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:36:08,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 11:36:09,195][26599] Updated weights for policy 0, policy_version 344934 (0.0032) [2024-06-19 11:36:12,912][26599] Updated weights for policy 0, policy_version 344944 (0.0037) [2024-06-19 11:36:13,380][26367] Fps is (10 sec: 45875.6, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5651595264. Throughput: 0: 42511.2. Samples: 1919223800. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:36:13,381][26367] Avg episode reward: [(0, '0.483')] [2024-06-19 11:36:16,761][26599] Updated weights for policy 0, policy_version 344954 (0.0024) [2024-06-19 11:36:18,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42600.9, 300 sec: 42709.4). Total num frames: 5651791872. Throughput: 0: 42587.9. Samples: 1919357180. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:36:18,381][26367] Avg episode reward: [(0, '0.778')] [2024-06-19 11:36:20,645][26599] Updated weights for policy 0, policy_version 344964 (0.0027) [2024-06-19 11:36:23,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5651988480. Throughput: 0: 42502.7. Samples: 1919607720. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:36:23,381][26367] Avg episode reward: [(0, '0.795')] [2024-06-19 11:36:24,347][26599] Updated weights for policy 0, policy_version 344974 (0.0043) [2024-06-19 11:36:28,171][26599] Updated weights for policy 0, policy_version 344984 (0.0029) [2024-06-19 11:36:28,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5652234240. Throughput: 0: 42678.2. Samples: 1919867320. 
Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:36:28,381][26367] Avg episode reward: [(0, '0.623')] [2024-06-19 11:36:32,395][26599] Updated weights for policy 0, policy_version 344994 (0.0038) [2024-06-19 11:36:33,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42325.5, 300 sec: 42653.9). Total num frames: 5652414464. Throughput: 0: 42567.2. Samples: 1919995300. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:36:33,380][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 11:36:33,508][26579] Signal inference workers to stop experience collection... (28300 times) [2024-06-19 11:36:33,508][26579] Signal inference workers to resume experience collection... (28300 times) [2024-06-19 11:36:33,547][26599] InferenceWorker_p0-w0: stopping experience collection (28300 times) [2024-06-19 11:36:33,552][26599] InferenceWorker_p0-w0: resuming experience collection (28300 times) [2024-06-19 11:36:35,896][26599] Updated weights for policy 0, policy_version 345004 (0.0034) [2024-06-19 11:36:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5652643840. Throughput: 0: 42605.8. Samples: 1920247920. Policy #0 lag: (min: 0.0, avg: 8.9, max: 20.0) [2024-06-19 11:36:38,380][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 11:36:39,977][26599] Updated weights for policy 0, policy_version 345014 (0.0048) [2024-06-19 11:36:43,380][26367] Fps is (10 sec: 44236.0, 60 sec: 42598.3, 300 sec: 42598.9). Total num frames: 5652856832. Throughput: 0: 42842.2. Samples: 1920511600. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:36:43,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 11:36:43,543][26599] Updated weights for policy 0, policy_version 345024 (0.0033) [2024-06-19 11:36:47,761][26599] Updated weights for policy 0, policy_version 345034 (0.0047) [2024-06-19 11:36:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5653053440. Throughput: 0: 42549.4. 
Samples: 1920631300. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:36:48,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 11:36:51,366][26599] Updated weights for policy 0, policy_version 345044 (0.0045) [2024-06-19 11:36:53,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42871.3, 300 sec: 42653.9). Total num frames: 5653282816. Throughput: 0: 42570.1. Samples: 1920884460. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:36:53,381][26367] Avg episode reward: [(0, '0.766')] [2024-06-19 11:36:55,343][26599] Updated weights for policy 0, policy_version 345054 (0.0030) [2024-06-19 11:36:58,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42052.2, 300 sec: 42542.8). Total num frames: 5653479424. Throughput: 0: 42761.2. Samples: 1921148060. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:36:58,381][26367] Avg episode reward: [(0, '0.735')] [2024-06-19 11:36:59,065][26599] Updated weights for policy 0, policy_version 345064 (0.0035) [2024-06-19 11:37:02,964][26599] Updated weights for policy 0, policy_version 345074 (0.0037) [2024-06-19 11:37:03,381][26367] Fps is (10 sec: 40958.2, 60 sec: 42598.0, 300 sec: 42709.4). Total num frames: 5653692416. Throughput: 0: 42447.2. Samples: 1921267320. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:03,381][26367] Avg episode reward: [(0, '0.767')] [2024-06-19 11:37:06,709][26599] Updated weights for policy 0, policy_version 345084 (0.0039) [2024-06-19 11:37:08,380][26367] Fps is (10 sec: 47513.7, 60 sec: 43417.6, 300 sec: 42765.5). Total num frames: 5653954560. Throughput: 0: 42802.6. Samples: 1921533840. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:08,381][26367] Avg episode reward: [(0, '0.866')] [2024-06-19 11:37:10,494][26599] Updated weights for policy 0, policy_version 345094 (0.0023) [2024-06-19 11:37:13,380][26367] Fps is (10 sec: 44239.3, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5654134784. Throughput: 0: 42839.5. 
Samples: 1921795100. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:13,383][26367] Avg episode reward: [(0, '0.777')] [2024-06-19 11:37:14,475][26599] Updated weights for policy 0, policy_version 345104 (0.0032) [2024-06-19 11:37:18,030][26599] Updated weights for policy 0, policy_version 345114 (0.0046) [2024-06-19 11:37:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5654347776. Throughput: 0: 42686.5. Samples: 1921916200. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:18,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 11:37:22,158][26599] Updated weights for policy 0, policy_version 345124 (0.0035) [2024-06-19 11:37:23,380][26367] Fps is (10 sec: 45874.8, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5654593536. Throughput: 0: 42905.6. Samples: 1922178680. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:23,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 11:37:25,678][26599] Updated weights for policy 0, policy_version 345134 (0.0037) [2024-06-19 11:37:28,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42325.4, 300 sec: 42654.5). Total num frames: 5654773760. Throughput: 0: 42618.4. Samples: 1922429420. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:28,380][26367] Avg episode reward: [(0, '0.475')] [2024-06-19 11:37:29,727][26599] Updated weights for policy 0, policy_version 345144 (0.0027) [2024-06-19 11:37:33,384][26367] Fps is (10 sec: 39307.7, 60 sec: 42868.8, 300 sec: 42709.0). Total num frames: 5654986752. Throughput: 0: 42644.5. Samples: 1922550460. 
Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:33,384][26367] Avg episode reward: [(0, '0.495')] [2024-06-19 11:37:33,862][26599] Updated weights for policy 0, policy_version 345154 (0.0032) [2024-06-19 11:37:37,297][26599] Updated weights for policy 0, policy_version 345164 (0.0030) [2024-06-19 11:37:38,381][26367] Fps is (10 sec: 45870.6, 60 sec: 43143.8, 300 sec: 42709.3). Total num frames: 5655232512. Throughput: 0: 42952.1. Samples: 1922817340. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:38,382][26367] Avg episode reward: [(0, '0.674')] [2024-06-19 11:37:38,398][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000345168_5655232512.pth... [2024-06-19 11:37:38,451][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000344543_5644992512.pth [2024-06-19 11:37:41,298][26599] Updated weights for policy 0, policy_version 345174 (0.0030) [2024-06-19 11:37:43,382][26367] Fps is (10 sec: 42605.0, 60 sec: 42597.0, 300 sec: 42709.2). Total num frames: 5655412736. Throughput: 0: 42800.3. Samples: 1923074160. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:43,383][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 11:37:45,002][26599] Updated weights for policy 0, policy_version 345184 (0.0033) [2024-06-19 11:37:48,380][26367] Fps is (10 sec: 40963.2, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 5655642112. Throughput: 0: 42829.3. Samples: 1923194620. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:48,381][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 11:37:48,810][26599] Updated weights for policy 0, policy_version 345194 (0.0032) [2024-06-19 11:37:52,649][26599] Updated weights for policy 0, policy_version 345204 (0.0029) [2024-06-19 11:37:53,139][26579] Signal inference workers to stop experience collection... 
(28350 times) [2024-06-19 11:37:53,175][26599] InferenceWorker_p0-w0: stopping experience collection (28350 times) [2024-06-19 11:37:53,190][26579] Signal inference workers to resume experience collection... (28350 times) [2024-06-19 11:37:53,191][26599] InferenceWorker_p0-w0: resuming experience collection (28350 times) [2024-06-19 11:37:53,380][26367] Fps is (10 sec: 45885.2, 60 sec: 43144.7, 300 sec: 42765.0). Total num frames: 5655871488. Throughput: 0: 42586.4. Samples: 1923450220. Policy #0 lag: (min: 0.0, avg: 8.1, max: 20.0) [2024-06-19 11:37:53,380][26367] Avg episode reward: [(0, '0.515')] [2024-06-19 11:37:56,367][26599] Updated weights for policy 0, policy_version 345214 (0.0026) [2024-06-19 11:37:58,384][26367] Fps is (10 sec: 39307.9, 60 sec: 42595.9, 300 sec: 42653.4). Total num frames: 5656035328. Throughput: 0: 42572.6. Samples: 1923711020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:37:58,384][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 11:38:00,299][26599] Updated weights for policy 0, policy_version 345224 (0.0040) [2024-06-19 11:38:03,380][26367] Fps is (10 sec: 40959.8, 60 sec: 43145.0, 300 sec: 42709.5). Total num frames: 5656281088. Throughput: 0: 42536.1. Samples: 1923830320. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:03,381][26367] Avg episode reward: [(0, '0.738')] [2024-06-19 11:38:04,043][26599] Updated weights for policy 0, policy_version 345234 (0.0029) [2024-06-19 11:38:07,911][26599] Updated weights for policy 0, policy_version 345244 (0.0028) [2024-06-19 11:38:08,380][26367] Fps is (10 sec: 47530.3, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5656510464. Throughput: 0: 42593.3. Samples: 1924095380. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:08,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 11:38:12,017][26599] Updated weights for policy 0, policy_version 345254 (0.0028) [2024-06-19 11:38:13,380][26367] Fps is (10 sec: 39321.4, 60 sec: 42325.3, 300 sec: 42654.2). Total num frames: 5656674304. Throughput: 0: 42726.2. Samples: 1924352100. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:13,382][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 11:38:15,737][26599] Updated weights for policy 0, policy_version 345264 (0.0033) [2024-06-19 11:38:18,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.1). Total num frames: 5656920064. Throughput: 0: 42638.0. Samples: 1924469020. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:18,384][26367] Avg episode reward: [(0, '0.710')] [2024-06-19 11:38:19,704][26599] Updated weights for policy 0, policy_version 345274 (0.0039) [2024-06-19 11:38:23,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42052.2, 300 sec: 42710.0). Total num frames: 5657116672. Throughput: 0: 42524.7. Samples: 1924730920. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:23,381][26367] Avg episode reward: [(0, '0.748')] [2024-06-19 11:38:23,498][26599] Updated weights for policy 0, policy_version 345284 (0.0034) [2024-06-19 11:38:27,519][26599] Updated weights for policy 0, policy_version 345294 (0.0039) [2024-06-19 11:38:28,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5657313280. Throughput: 0: 42366.3. Samples: 1924980560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:28,381][26367] Avg episode reward: [(0, '0.678')] [2024-06-19 11:38:31,567][26599] Updated weights for policy 0, policy_version 345304 (0.0027) [2024-06-19 11:38:33,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5657542656. Throughput: 0: 42499.7. Samples: 1925107100. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:33,381][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 11:38:35,048][26599] Updated weights for policy 0, policy_version 345314 (0.0032) [2024-06-19 11:38:38,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42052.9, 300 sec: 42709.5). Total num frames: 5657755648. Throughput: 0: 42551.8. Samples: 1925365060. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:38,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 11:38:39,590][26599] Updated weights for policy 0, policy_version 345324 (0.0026) [2024-06-19 11:38:42,550][26599] Updated weights for policy 0, policy_version 345334 (0.0027) [2024-06-19 11:38:43,384][26367] Fps is (10 sec: 40944.9, 60 sec: 42324.2, 300 sec: 42709.5). Total num frames: 5657952256. Throughput: 0: 42382.2. Samples: 1925618220. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:43,385][26367] Avg episode reward: [(0, '0.697')] [2024-06-19 11:38:47,193][26599] Updated weights for policy 0, policy_version 345344 (0.0028) [2024-06-19 11:38:48,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5658198016. Throughput: 0: 42644.7. Samples: 1925749340. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:48,381][26367] Avg episode reward: [(0, '0.880')] [2024-06-19 11:38:50,022][26599] Updated weights for policy 0, policy_version 345354 (0.0028) [2024-06-19 11:38:53,380][26367] Fps is (10 sec: 44252.7, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5658394624. Throughput: 0: 42450.8. Samples: 1926005660. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:53,381][26367] Avg episode reward: [(0, '0.829')] [2024-06-19 11:38:54,750][26599] Updated weights for policy 0, policy_version 345364 (0.0034) [2024-06-19 11:38:58,259][26599] Updated weights for policy 0, policy_version 345374 (0.0040) [2024-06-19 11:38:58,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42874.0, 300 sec: 42765.0). Total num frames: 5658607616. Throughput: 0: 42329.7. Samples: 1926256940. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:38:58,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 11:39:02,321][26599] Updated weights for policy 0, policy_version 345384 (0.0038) [2024-06-19 11:39:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.2, 300 sec: 42653.9). Total num frames: 5658820608. Throughput: 0: 42643.5. Samples: 1926387980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:39:03,381][26367] Avg episode reward: [(0, '0.503')] [2024-06-19 11:39:05,872][26599] Updated weights for policy 0, policy_version 345394 (0.0041) [2024-06-19 11:39:08,380][26367] Fps is (10 sec: 40960.7, 60 sec: 41779.3, 300 sec: 42654.0). Total num frames: 5659017216. Throughput: 0: 42490.5. Samples: 1926642980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:39:08,380][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 11:39:09,907][26599] Updated weights for policy 0, policy_version 345404 (0.0033) [2024-06-19 11:39:13,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5659246592. Throughput: 0: 42436.2. Samples: 1926890180. Policy #0 lag: (min: 0.0, avg: 9.3, max: 20.0) [2024-06-19 11:39:13,380][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 11:39:13,430][26599] Updated weights for policy 0, policy_version 345414 (0.0028) [2024-06-19 11:39:16,342][26579] Signal inference workers to stop experience collection... 
(28400 times) [2024-06-19 11:39:16,344][26579] Signal inference workers to resume experience collection... (28400 times) [2024-06-19 11:39:16,379][26599] InferenceWorker_p0-w0: stopping experience collection (28400 times) [2024-06-19 11:39:16,379][26599] InferenceWorker_p0-w0: resuming experience collection (28400 times) [2024-06-19 11:39:17,471][26599] Updated weights for policy 0, policy_version 345424 (0.0030) [2024-06-19 11:39:18,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5659459584. Throughput: 0: 42648.0. Samples: 1927026260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:39:18,381][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 11:39:21,414][26599] Updated weights for policy 0, policy_version 345434 (0.0031) [2024-06-19 11:39:23,384][26367] Fps is (10 sec: 42582.4, 60 sec: 42595.9, 300 sec: 42708.9). Total num frames: 5659672576. Throughput: 0: 42606.4. Samples: 1927282500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:39:23,385][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 11:39:25,184][26599] Updated weights for policy 0, policy_version 345444 (0.0047) [2024-06-19 11:39:28,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5659885568. Throughput: 0: 42502.9. Samples: 1927530700. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:39:28,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 11:39:29,059][26599] Updated weights for policy 0, policy_version 345454 (0.0031) [2024-06-19 11:39:33,275][26599] Updated weights for policy 0, policy_version 345464 (0.0043) [2024-06-19 11:39:33,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42325.2, 300 sec: 42598.4). Total num frames: 5660082176. Throughput: 0: 42418.7. Samples: 1927658180. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:39:33,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 11:39:36,562][26599] Updated weights for policy 0, policy_version 345474 (0.0028) [2024-06-19 11:39:38,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42325.4, 300 sec: 42654.0). Total num frames: 5660295168. Throughput: 0: 42418.8. Samples: 1927914500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:39:38,380][26367] Avg episode reward: [(0, '0.572')] [2024-06-19 11:39:38,502][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000345478_5660311552.pth... [2024-06-19 11:39:38,564][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000344853_5650071552.pth [2024-06-19 11:39:40,901][26599] Updated weights for policy 0, policy_version 345484 (0.0044) [2024-06-19 11:39:43,384][26367] Fps is (10 sec: 44221.0, 60 sec: 42871.5, 300 sec: 42597.9). Total num frames: 5660524544. Throughput: 0: 42501.1. Samples: 1928169640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:39:43,385][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 11:39:44,189][26599] Updated weights for policy 0, policy_version 345494 (0.0030) [2024-06-19 11:39:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42052.4, 300 sec: 42654.0). Total num frames: 5660721152. Throughput: 0: 42517.5. Samples: 1928301260. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:39:48,380][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 11:39:48,532][26599] Updated weights for policy 0, policy_version 345504 (0.0039) [2024-06-19 11:39:51,969][26599] Updated weights for policy 0, policy_version 345514 (0.0039) [2024-06-19 11:39:53,380][26367] Fps is (10 sec: 40974.9, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5660934144. Throughput: 0: 42489.2. Samples: 1928555000. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:39:53,381][26367] Avg episode reward: [(0, '0.657')] [2024-06-19 11:39:56,090][26599] Updated weights for policy 0, policy_version 345524 (0.0038) [2024-06-19 11:39:58,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5661163520. Throughput: 0: 42584.8. Samples: 1928806500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:39:58,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 11:40:00,396][26599] Updated weights for policy 0, policy_version 345534 (0.0051) [2024-06-19 11:40:03,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5661360128. Throughput: 0: 42579.1. Samples: 1928942320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:40:03,381][26367] Avg episode reward: [(0, '0.580')] [2024-06-19 11:40:03,649][26599] Updated weights for policy 0, policy_version 345544 (0.0036) [2024-06-19 11:40:07,994][26599] Updated weights for policy 0, policy_version 345554 (0.0030) [2024-06-19 11:40:08,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42595.7, 300 sec: 42597.9). Total num frames: 5661573120. Throughput: 0: 42519.6. Samples: 1929195880. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:40:08,384][26367] Avg episode reward: [(0, '0.847')] [2024-06-19 11:40:11,203][26599] Updated weights for policy 0, policy_version 345564 (0.0041) [2024-06-19 11:40:13,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.3, 300 sec: 42543.4). Total num frames: 5661786112. Throughput: 0: 42727.2. Samples: 1929453420. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:40:13,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 11:40:15,622][26599] Updated weights for policy 0, policy_version 345574 (0.0045) [2024-06-19 11:40:18,380][26367] Fps is (10 sec: 44253.4, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5662015488. Throughput: 0: 42768.6. Samples: 1929582760. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:40:18,380][26367] Avg episode reward: [(0, '0.850')] [2024-06-19 11:40:18,796][26599] Updated weights for policy 0, policy_version 345584 (0.0033) [2024-06-19 11:40:23,211][26599] Updated weights for policy 0, policy_version 345594 (0.0030) [2024-06-19 11:40:23,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5662228480. Throughput: 0: 42739.1. Samples: 1929837760. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:40:23,380][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 11:40:26,828][26599] Updated weights for policy 0, policy_version 345604 (0.0042) [2024-06-19 11:40:28,383][26367] Fps is (10 sec: 42584.4, 60 sec: 42596.2, 300 sec: 42598.0). Total num frames: 5662441472. Throughput: 0: 42827.5. Samples: 1930096860. Policy #0 lag: (min: 0.0, avg: 10.7, max: 21.0) [2024-06-19 11:40:28,384][26367] Avg episode reward: [(0, '0.628')] [2024-06-19 11:40:30,815][26599] Updated weights for policy 0, policy_version 345614 (0.0025) [2024-06-19 11:40:33,380][26367] Fps is (10 sec: 44236.3, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5662670848. Throughput: 0: 42746.5. Samples: 1930224860. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:40:33,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 11:40:34,394][26599] Updated weights for policy 0, policy_version 345624 (0.0028) [2024-06-19 11:40:38,276][26599] Updated weights for policy 0, policy_version 345634 (0.0042) [2024-06-19 11:40:38,380][26367] Fps is (10 sec: 42611.8, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5662867456. Throughput: 0: 42767.1. Samples: 1930479520. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:40:38,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 11:40:42,055][26599] Updated weights for policy 0, policy_version 345644 (0.0039) [2024-06-19 11:40:43,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5663080448. Throughput: 0: 42798.3. Samples: 1930732420. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:40:43,380][26367] Avg episode reward: [(0, '0.692')] [2024-06-19 11:40:45,829][26599] Updated weights for policy 0, policy_version 345654 (0.0038) [2024-06-19 11:40:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42653.9). Total num frames: 5663293440. Throughput: 0: 42651.2. Samples: 1930861620. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:40:48,381][26367] Avg episode reward: [(0, '0.808')] [2024-06-19 11:40:49,754][26579] Signal inference workers to stop experience collection... (28450 times) [2024-06-19 11:40:49,754][26579] Signal inference workers to resume experience collection... (28450 times) [2024-06-19 11:40:49,783][26599] InferenceWorker_p0-w0: stopping experience collection (28450 times) [2024-06-19 11:40:49,784][26599] InferenceWorker_p0-w0: resuming experience collection (28450 times) [2024-06-19 11:40:49,895][26599] Updated weights for policy 0, policy_version 345664 (0.0035) [2024-06-19 11:40:53,339][26599] Updated weights for policy 0, policy_version 345674 (0.0038) [2024-06-19 11:40:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42598.4). Total num frames: 5663522816. Throughput: 0: 42731.0. Samples: 1931118620. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:40:53,381][26367] Avg episode reward: [(0, '0.415')] [2024-06-19 11:40:57,457][26599] Updated weights for policy 0, policy_version 345684 (0.0034) [2024-06-19 11:40:58,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 5663735808. Throughput: 0: 42724.6. 
Samples: 1931376180. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:40:58,384][26367] Avg episode reward: [(0, '0.397')] [2024-06-19 11:41:01,331][26599] Updated weights for policy 0, policy_version 345694 (0.0026) [2024-06-19 11:41:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5663948800. Throughput: 0: 42756.8. Samples: 1931506820. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:41:03,381][26367] Avg episode reward: [(0, '0.631')] [2024-06-19 11:41:04,959][26599] Updated weights for policy 0, policy_version 345704 (0.0039) [2024-06-19 11:41:08,380][26367] Fps is (10 sec: 40975.0, 60 sec: 42874.1, 300 sec: 42542.9). Total num frames: 5664145408. Throughput: 0: 42854.2. Samples: 1931766200. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:41:08,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 11:41:08,831][26599] Updated weights for policy 0, policy_version 345714 (0.0022) [2024-06-19 11:41:12,515][26599] Updated weights for policy 0, policy_version 345724 (0.0041) [2024-06-19 11:41:13,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.5, 300 sec: 42598.4). Total num frames: 5664358400. Throughput: 0: 42844.9. Samples: 1932024740. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:41:13,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 11:41:16,293][26599] Updated weights for policy 0, policy_version 345734 (0.0033) [2024-06-19 11:41:18,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5664587776. Throughput: 0: 42978.6. Samples: 1932158900. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:41:18,384][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 11:41:20,104][26599] Updated weights for policy 0, policy_version 345744 (0.0036) [2024-06-19 11:41:23,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5664784384. Throughput: 0: 42849.5. 
Samples: 1932407740. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:41:23,381][26367] Avg episode reward: [(0, '0.513')] [2024-06-19 11:41:24,318][26599] Updated weights for policy 0, policy_version 345754 (0.0035) [2024-06-19 11:41:28,003][26599] Updated weights for policy 0, policy_version 345764 (0.0030) [2024-06-19 11:41:28,384][26367] Fps is (10 sec: 42583.4, 60 sec: 42871.2, 300 sec: 42708.9). Total num frames: 5665013760. Throughput: 0: 42964.9. Samples: 1932666000. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:41:28,384][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 11:41:31,856][26599] Updated weights for policy 0, policy_version 345774 (0.0038) [2024-06-19 11:41:33,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5665210368. Throughput: 0: 43022.2. Samples: 1932797620. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:41:33,381][26367] Avg episode reward: [(0, '0.615')] [2024-06-19 11:41:35,567][26599] Updated weights for policy 0, policy_version 345784 (0.0037) [2024-06-19 11:41:38,382][26367] Fps is (10 sec: 42608.0, 60 sec: 42870.5, 300 sec: 42653.8). Total num frames: 5665439744. Throughput: 0: 42941.8. Samples: 1933051060. Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:41:38,382][26367] Avg episode reward: [(0, '0.597')] [2024-06-19 11:41:38,410][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000345791_5665439744.pth... [2024-06-19 11:41:38,462][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000345168_5655232512.pth [2024-06-19 11:41:39,501][26599] Updated weights for policy 0, policy_version 345794 (0.0035) [2024-06-19 11:41:43,111][26599] Updated weights for policy 0, policy_version 345804 (0.0028) [2024-06-19 11:41:43,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5665652736. Throughput: 0: 42886.0. Samples: 1933305900. 
Policy #0 lag: (min: 1.0, avg: 8.8, max: 20.0) [2024-06-19 11:41:43,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 11:41:47,123][26599] Updated weights for policy 0, policy_version 345814 (0.0035) [2024-06-19 11:41:48,380][26367] Fps is (10 sec: 42604.0, 60 sec: 42871.4, 300 sec: 42654.0). Total num frames: 5665865728. Throughput: 0: 42871.0. Samples: 1933436020. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:41:48,381][26367] Avg episode reward: [(0, '0.660')] [2024-06-19 11:41:50,793][26599] Updated weights for policy 0, policy_version 345824 (0.0037) [2024-06-19 11:41:53,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5666078720. Throughput: 0: 42807.1. Samples: 1933692520. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:41:53,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 11:41:54,681][26599] Updated weights for policy 0, policy_version 345834 (0.0026) [2024-06-19 11:41:58,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42600.9, 300 sec: 42709.6). Total num frames: 5666291712. Throughput: 0: 42505.2. Samples: 1933937480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:41:58,382][26367] Avg episode reward: [(0, '0.706')] [2024-06-19 11:41:58,747][26599] Updated weights for policy 0, policy_version 345844 (0.0038) [2024-06-19 11:42:02,642][26599] Updated weights for policy 0, policy_version 345854 (0.0034) [2024-06-19 11:42:03,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5666504704. Throughput: 0: 42501.5. Samples: 1934071460. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:03,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 11:42:06,385][26599] Updated weights for policy 0, policy_version 345864 (0.0046) [2024-06-19 11:42:08,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5666701312. Throughput: 0: 42654.6. Samples: 1934327200. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:08,381][26367] Avg episode reward: [(0, '0.508')] [2024-06-19 11:42:10,457][26599] Updated weights for policy 0, policy_version 345874 (0.0038) [2024-06-19 11:42:12,653][26579] Signal inference workers to stop experience collection... (28500 times) [2024-06-19 11:42:12,715][26599] InferenceWorker_p0-w0: stopping experience collection (28500 times) [2024-06-19 11:42:12,771][26579] Signal inference workers to resume experience collection... (28500 times) [2024-06-19 11:42:12,771][26599] InferenceWorker_p0-w0: resuming experience collection (28500 times) [2024-06-19 11:42:13,380][26367] Fps is (10 sec: 44236.0, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5666947072. Throughput: 0: 42439.3. Samples: 1934575620. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:13,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 11:42:13,974][26599] Updated weights for policy 0, policy_version 345884 (0.0033) [2024-06-19 11:42:18,071][26599] Updated weights for policy 0, policy_version 345894 (0.0029) [2024-06-19 11:42:18,380][26367] Fps is (10 sec: 44236.3, 60 sec: 42598.4, 300 sec: 42542.9). Total num frames: 5667143680. Throughput: 0: 42562.9. Samples: 1934712960. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:18,381][26367] Avg episode reward: [(0, '0.334')] [2024-06-19 11:42:21,526][26599] Updated weights for policy 0, policy_version 345904 (0.0031) [2024-06-19 11:42:23,380][26367] Fps is (10 sec: 39322.2, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5667340288. Throughput: 0: 42576.0. Samples: 1934966920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:23,381][26367] Avg episode reward: [(0, '0.362')] [2024-06-19 11:42:25,675][26599] Updated weights for policy 0, policy_version 345914 (0.0036) [2024-06-19 11:42:28,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42601.0, 300 sec: 42654.5). Total num frames: 5667569664. 
Throughput: 0: 42464.1. Samples: 1935216780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:28,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 11:42:29,143][26599] Updated weights for policy 0, policy_version 345924 (0.0042) [2024-06-19 11:42:33,306][26599] Updated weights for policy 0, policy_version 345934 (0.0038) [2024-06-19 11:42:33,380][26367] Fps is (10 sec: 44236.1, 60 sec: 42871.4, 300 sec: 42543.0). Total num frames: 5667782656. Throughput: 0: 42508.0. Samples: 1935348880. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:33,382][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 11:42:36,744][26599] Updated weights for policy 0, policy_version 345944 (0.0045) [2024-06-19 11:42:38,384][26367] Fps is (10 sec: 42582.5, 60 sec: 42596.7, 300 sec: 42653.7). Total num frames: 5667995648. Throughput: 0: 42336.0. Samples: 1935597800. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:38,385][26367] Avg episode reward: [(0, '0.782')] [2024-06-19 11:42:41,046][26599] Updated weights for policy 0, policy_version 345954 (0.0037) [2024-06-19 11:42:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5668208640. Throughput: 0: 42620.0. Samples: 1935855380. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:43,381][26367] Avg episode reward: [(0, '0.671')] [2024-06-19 11:42:44,319][26599] Updated weights for policy 0, policy_version 345964 (0.0031) [2024-06-19 11:42:48,383][26367] Fps is (10 sec: 40965.7, 60 sec: 42323.7, 300 sec: 42487.0). Total num frames: 5668405248. Throughput: 0: 42415.5. Samples: 1935980260. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:48,383][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 11:42:48,796][26599] Updated weights for policy 0, policy_version 345974 (0.0025) [2024-06-19 11:42:52,077][26599] Updated weights for policy 0, policy_version 345984 (0.0029) [2024-06-19 11:42:53,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.3, 300 sec: 42654.5). Total num frames: 5668618240. Throughput: 0: 42376.9. Samples: 1936234160. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:53,381][26367] Avg episode reward: [(0, '0.751')] [2024-06-19 11:42:56,397][26599] Updated weights for policy 0, policy_version 345994 (0.0037) [2024-06-19 11:42:58,380][26367] Fps is (10 sec: 42608.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5668831232. Throughput: 0: 42584.6. Samples: 1936491920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:42:58,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 11:42:59,870][26599] Updated weights for policy 0, policy_version 346004 (0.0047) [2024-06-19 11:43:03,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42325.3, 300 sec: 42487.3). Total num frames: 5669044224. Throughput: 0: 42412.2. Samples: 1936621500. Policy #0 lag: (min: 0.0, avg: 10.3, max: 22.0) [2024-06-19 11:43:03,380][26367] Avg episode reward: [(0, '0.651')] [2024-06-19 11:43:04,166][26599] Updated weights for policy 0, policy_version 346014 (0.0030) [2024-06-19 11:43:07,607][26599] Updated weights for policy 0, policy_version 346024 (0.0034) [2024-06-19 11:43:08,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5669257216. Throughput: 0: 42293.4. Samples: 1936870120. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:08,380][26367] Avg episode reward: [(0, '0.599')] [2024-06-19 11:43:12,249][26599] Updated weights for policy 0, policy_version 346034 (0.0031) [2024-06-19 11:43:13,380][26367] Fps is (10 sec: 44236.5, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5669486592. Throughput: 0: 42500.9. Samples: 1937129320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:13,381][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 11:43:15,674][26599] Updated weights for policy 0, policy_version 346044 (0.0042) [2024-06-19 11:43:18,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5669683200. Throughput: 0: 42398.3. Samples: 1937256800. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:18,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 11:43:19,859][26599] Updated weights for policy 0, policy_version 346054 (0.0033) [2024-06-19 11:43:23,211][26599] Updated weights for policy 0, policy_version 346064 (0.0022) [2024-06-19 11:43:23,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.3, 300 sec: 42709.5). Total num frames: 5669912576. Throughput: 0: 42522.5. Samples: 1937511160. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:23,381][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 11:43:27,463][26599] Updated weights for policy 0, policy_version 346074 (0.0042) [2024-06-19 11:43:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5670125568. Throughput: 0: 42570.1. Samples: 1937771040. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:28,381][26367] Avg episode reward: [(0, '0.595')] [2024-06-19 11:43:31,022][26599] Updated weights for policy 0, policy_version 346084 (0.0037) [2024-06-19 11:43:33,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5670322176. Throughput: 0: 42555.0. Samples: 1937895140. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:33,381][26367] Avg episode reward: [(0, '0.473')] [2024-06-19 11:43:35,251][26599] Updated weights for policy 0, policy_version 346094 (0.0037) [2024-06-19 11:43:38,307][26579] Signal inference workers to stop experience collection... (28550 times) [2024-06-19 11:43:38,341][26599] InferenceWorker_p0-w0: stopping experience collection (28550 times) [2024-06-19 11:43:38,380][26367] Fps is (10 sec: 40960.8, 60 sec: 42328.0, 300 sec: 42654.5). Total num frames: 5670535168. Throughput: 0: 42584.5. Samples: 1938150460. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:38,380][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 11:43:38,418][26579] Signal inference workers to resume experience collection... (28550 times) [2024-06-19 11:43:38,418][26599] InferenceWorker_p0-w0: resuming experience collection (28550 times) [2024-06-19 11:43:38,550][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000346104_5670567936.pth... [2024-06-19 11:43:38,554][26599] Updated weights for policy 0, policy_version 346104 (0.0027) [2024-06-19 11:43:38,605][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000345478_5660311552.pth [2024-06-19 11:43:42,811][26599] Updated weights for policy 0, policy_version 346114 (0.0029) [2024-06-19 11:43:43,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42325.4, 300 sec: 42542.9). Total num frames: 5670748160. Throughput: 0: 42581.0. Samples: 1938408060. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:43,380][26367] Avg episode reward: [(0, '0.733')] [2024-06-19 11:43:46,132][26599] Updated weights for policy 0, policy_version 346124 (0.0041) [2024-06-19 11:43:48,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42873.2, 300 sec: 42653.9). Total num frames: 5670977536. Throughput: 0: 42518.7. Samples: 1938534840. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:48,381][26367] Avg episode reward: [(0, '0.654')] [2024-06-19 11:43:50,362][26599] Updated weights for policy 0, policy_version 346134 (0.0031) [2024-06-19 11:43:53,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5671190528. Throughput: 0: 42767.0. Samples: 1938794640. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:53,384][26367] Avg episode reward: [(0, '0.716')] [2024-06-19 11:43:53,871][26599] Updated weights for policy 0, policy_version 346144 (0.0032) [2024-06-19 11:43:58,025][26599] Updated weights for policy 0, policy_version 346154 (0.0032) [2024-06-19 11:43:58,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5671387136. Throughput: 0: 42618.2. Samples: 1939047140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:43:58,381][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 11:44:01,506][26599] Updated weights for policy 0, policy_version 346164 (0.0036) [2024-06-19 11:44:03,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5671616512. Throughput: 0: 42596.9. Samples: 1939173660. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:44:03,381][26367] Avg episode reward: [(0, '0.752')] [2024-06-19 11:44:05,983][26599] Updated weights for policy 0, policy_version 346174 (0.0031) [2024-06-19 11:44:08,385][26367] Fps is (10 sec: 42579.6, 60 sec: 42595.2, 300 sec: 42597.8). Total num frames: 5671813120. Throughput: 0: 42584.0. Samples: 1939427620. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:44:08,385][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 11:44:09,333][26599] Updated weights for policy 0, policy_version 346184 (0.0049) [2024-06-19 11:44:13,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5672026112. Throughput: 0: 42475.3. Samples: 1939682420. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:44:13,381][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 11:44:13,498][26599] Updated weights for policy 0, policy_version 346194 (0.0041) [2024-06-19 11:44:17,155][26599] Updated weights for policy 0, policy_version 346204 (0.0037) [2024-06-19 11:44:18,382][26367] Fps is (10 sec: 44247.0, 60 sec: 42870.0, 300 sec: 42654.2). Total num frames: 5672255488. Throughput: 0: 42594.1. Samples: 1939811960. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 11:44:18,383][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 11:44:21,139][26599] Updated weights for policy 0, policy_version 346214 (0.0036) [2024-06-19 11:44:23,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42052.4, 300 sec: 42542.9). Total num frames: 5672435712. Throughput: 0: 42685.7. Samples: 1940071320. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:44:23,380][26367] Avg episode reward: [(0, '0.786')] [2024-06-19 11:44:24,954][26599] Updated weights for policy 0, policy_version 346224 (0.0047) [2024-06-19 11:44:28,384][26367] Fps is (10 sec: 42591.6, 60 sec: 42595.9, 300 sec: 42709.0). Total num frames: 5672681472. Throughput: 0: 42544.9. Samples: 1940322740. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:44:28,385][26367] Avg episode reward: [(0, '0.859')] [2024-06-19 11:44:28,707][26599] Updated weights for policy 0, policy_version 346234 (0.0025) [2024-06-19 11:44:32,547][26599] Updated weights for policy 0, policy_version 346244 (0.0046) [2024-06-19 11:44:33,380][26367] Fps is (10 sec: 45874.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5672894464. Throughput: 0: 42640.4. Samples: 1940453660. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:44:33,381][26367] Avg episode reward: [(0, '0.586')] [2024-06-19 11:44:36,342][26599] Updated weights for policy 0, policy_version 346254 (0.0033) [2024-06-19 11:44:38,380][26367] Fps is (10 sec: 39336.4, 60 sec: 42325.3, 300 sec: 42543.4). Total num frames: 5673074688. Throughput: 0: 42553.9. Samples: 1940709560. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:44:38,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 11:44:40,171][26599] Updated weights for policy 0, policy_version 346264 (0.0025) [2024-06-19 11:44:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5673320448. Throughput: 0: 42643.5. Samples: 1940966100. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:44:43,383][26367] Avg episode reward: [(0, '0.712')] [2024-06-19 11:44:43,909][26599] Updated weights for policy 0, policy_version 346274 (0.0042) [2024-06-19 11:44:47,768][26599] Updated weights for policy 0, policy_version 346284 (0.0036) [2024-06-19 11:44:48,380][26367] Fps is (10 sec: 47513.0, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5673549824. Throughput: 0: 42750.6. Samples: 1941097440. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:44:48,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 11:44:51,376][26599] Updated weights for policy 0, policy_version 346294 (0.0036) [2024-06-19 11:44:53,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42322.8, 300 sec: 42597.9). Total num frames: 5673730048. Throughput: 0: 42753.6. Samples: 1941351500. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:44:53,384][26367] Avg episode reward: [(0, '0.453')] [2024-06-19 11:44:55,454][26599] Updated weights for policy 0, policy_version 346304 (0.0040) [2024-06-19 11:44:58,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5673959424. Throughput: 0: 42815.4. Samples: 1941609120. 
Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:44:58,381][26367] Avg episode reward: [(0, '0.298')] [2024-06-19 11:44:59,079][26599] Updated weights for policy 0, policy_version 346314 (0.0032) [2024-06-19 11:45:03,160][26599] Updated weights for policy 0, policy_version 346324 (0.0037) [2024-06-19 11:45:03,384][26367] Fps is (10 sec: 44236.5, 60 sec: 42595.8, 300 sec: 42709.5). Total num frames: 5674172416. Throughput: 0: 42791.8. Samples: 1941737660. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:45:03,385][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 11:45:05,310][26579] Signal inference workers to stop experience collection... (28600 times) [2024-06-19 11:45:05,310][26579] Signal inference workers to resume experience collection... (28600 times) [2024-06-19 11:45:05,344][26599] InferenceWorker_p0-w0: stopping experience collection (28600 times) [2024-06-19 11:45:05,344][26599] InferenceWorker_p0-w0: resuming experience collection (28600 times) [2024-06-19 11:45:06,576][26599] Updated weights for policy 0, policy_version 346334 (0.0030) [2024-06-19 11:45:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42601.6, 300 sec: 42654.0). Total num frames: 5674369024. Throughput: 0: 42509.8. Samples: 1941984260. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:45:08,380][26367] Avg episode reward: [(0, '0.507')] [2024-06-19 11:45:10,994][26599] Updated weights for policy 0, policy_version 346344 (0.0037) [2024-06-19 11:45:13,380][26367] Fps is (10 sec: 40975.4, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5674582016. Throughput: 0: 42686.6. Samples: 1942243480. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:45:13,380][26367] Avg episode reward: [(0, '0.464')] [2024-06-19 11:45:14,343][26599] Updated weights for policy 0, policy_version 346354 (0.0034) [2024-06-19 11:45:18,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42326.8, 300 sec: 42598.4). Total num frames: 5674795008. 
Throughput: 0: 42513.3. Samples: 1942366760. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:45:18,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 11:45:18,803][26599] Updated weights for policy 0, policy_version 346364 (0.0033) [2024-06-19 11:45:22,318][26599] Updated weights for policy 0, policy_version 346374 (0.0027) [2024-06-19 11:45:23,380][26367] Fps is (10 sec: 44237.1, 60 sec: 43144.6, 300 sec: 42654.4). Total num frames: 5675024384. Throughput: 0: 42412.9. Samples: 1942618140. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:45:23,380][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 11:45:26,494][26599] Updated weights for policy 0, policy_version 346384 (0.0047) [2024-06-19 11:45:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42601.0, 300 sec: 42598.4). Total num frames: 5675237376. Throughput: 0: 42651.5. Samples: 1942885420. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:45:28,388][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 11:45:29,899][26599] Updated weights for policy 0, policy_version 346394 (0.0040) [2024-06-19 11:45:33,380][26367] Fps is (10 sec: 40959.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5675433984. Throughput: 0: 42521.3. Samples: 1943010900. Policy #0 lag: (min: 0.0, avg: 10.8, max: 23.0) [2024-06-19 11:45:33,381][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 11:45:34,077][26599] Updated weights for policy 0, policy_version 346404 (0.0023) [2024-06-19 11:45:37,871][26599] Updated weights for policy 0, policy_version 346414 (0.0033) [2024-06-19 11:45:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43144.4, 300 sec: 42653.9). Total num frames: 5675663360. Throughput: 0: 42497.1. Samples: 1943263720. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:45:38,381][26367] Avg episode reward: [(0, '0.462')] [2024-06-19 11:45:38,405][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000346415_5675663360.pth... 
[2024-06-19 11:45:38,463][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000345791_5665439744.pth [2024-06-19 11:45:41,709][26599] Updated weights for policy 0, policy_version 346424 (0.0029) [2024-06-19 11:45:43,380][26367] Fps is (10 sec: 44237.5, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5675876352. Throughput: 0: 42617.4. Samples: 1943526900. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:45:43,381][26367] Avg episode reward: [(0, '0.498')] [2024-06-19 11:45:45,548][26599] Updated weights for policy 0, policy_version 346434 (0.0043) [2024-06-19 11:45:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42052.3, 300 sec: 42542.9). Total num frames: 5676072960. Throughput: 0: 42613.7. Samples: 1943655120. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:45:48,381][26367] Avg episode reward: [(0, '0.587')] [2024-06-19 11:45:49,229][26599] Updated weights for policy 0, policy_version 346444 (0.0032) [2024-06-19 11:45:53,018][26599] Updated weights for policy 0, policy_version 346454 (0.0034) [2024-06-19 11:45:53,380][26367] Fps is (10 sec: 44236.3, 60 sec: 43147.1, 300 sec: 42654.4). Total num frames: 5676318720. Throughput: 0: 42976.7. Samples: 1943918220. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:45:53,381][26367] Avg episode reward: [(0, '0.624')] [2024-06-19 11:45:57,267][26599] Updated weights for policy 0, policy_version 346464 (0.0035) [2024-06-19 11:45:58,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42595.8, 300 sec: 42597.9). Total num frames: 5676515328. Throughput: 0: 42783.2. Samples: 1944168880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:45:58,384][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 11:46:00,617][26599] Updated weights for policy 0, policy_version 346474 (0.0048) [2024-06-19 11:46:03,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42601.0, 300 sec: 42653.9). Total num frames: 5676728320. Throughput: 0: 42871.6. Samples: 1944295980. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:03,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 11:46:04,961][26599] Updated weights for policy 0, policy_version 346484 (0.0038) [2024-06-19 11:46:08,107][26599] Updated weights for policy 0, policy_version 346494 (0.0053) [2024-06-19 11:46:08,380][26367] Fps is (10 sec: 45891.3, 60 sec: 43417.5, 300 sec: 42765.0). Total num frames: 5676974080. Throughput: 0: 43052.2. Samples: 1944555500. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:08,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 11:46:12,634][26599] Updated weights for policy 0, policy_version 346504 (0.0029) [2024-06-19 11:46:13,384][26367] Fps is (10 sec: 40945.2, 60 sec: 42595.7, 300 sec: 42542.3). Total num frames: 5677137920. Throughput: 0: 42949.4. Samples: 1944818300. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:13,385][26367] Avg episode reward: [(0, '0.594')] [2024-06-19 11:46:15,729][26599] Updated weights for policy 0, policy_version 346514 (0.0043) [2024-06-19 11:46:18,380][26367] Fps is (10 sec: 39321.6, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5677367296. Throughput: 0: 42799.6. Samples: 1944936880. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:18,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 11:46:20,414][26599] Updated weights for policy 0, policy_version 346524 (0.0037) [2024-06-19 11:46:23,380][26367] Fps is (10 sec: 45891.5, 60 sec: 42871.3, 300 sec: 42654.4). Total num frames: 5677596672. Throughput: 0: 42937.7. Samples: 1945195920. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:23,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 11:46:23,472][26599] Updated weights for policy 0, policy_version 346534 (0.0032) [2024-06-19 11:46:27,869][26599] Updated weights for policy 0, policy_version 346544 (0.0037) [2024-06-19 11:46:28,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42325.3, 300 sec: 42598.4). Total num frames: 5677776896. Throughput: 0: 42892.8. Samples: 1945457080. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:28,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 11:46:31,164][26599] Updated weights for policy 0, policy_version 346554 (0.0041) [2024-06-19 11:46:33,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.5, 300 sec: 42598.6). Total num frames: 5678006272. Throughput: 0: 42848.9. Samples: 1945583320. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:33,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 11:46:33,572][26579] Signal inference workers to stop experience collection... (28650 times) [2024-06-19 11:46:33,572][26579] Signal inference workers to resume experience collection... (28650 times) [2024-06-19 11:46:33,619][26599] InferenceWorker_p0-w0: stopping experience collection (28650 times) [2024-06-19 11:46:33,619][26599] InferenceWorker_p0-w0: resuming experience collection (28650 times) [2024-06-19 11:46:35,469][26599] Updated weights for policy 0, policy_version 346564 (0.0049) [2024-06-19 11:46:38,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5678235648. Throughput: 0: 42669.5. Samples: 1945838340. 
Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:38,380][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 11:46:38,683][26599] Updated weights for policy 0, policy_version 346574 (0.0052) [2024-06-19 11:46:43,114][26599] Updated weights for policy 0, policy_version 346584 (0.0032) [2024-06-19 11:46:43,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5678432256. Throughput: 0: 42929.7. Samples: 1946100560. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:43,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 11:46:46,428][26599] Updated weights for policy 0, policy_version 346594 (0.0027) [2024-06-19 11:46:48,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42598.4). Total num frames: 5678645248. Throughput: 0: 42893.4. Samples: 1946226180. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:48,381][26367] Avg episode reward: [(0, '0.773')] [2024-06-19 11:46:50,457][26599] Updated weights for policy 0, policy_version 346604 (0.0051) [2024-06-19 11:46:53,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5678874624. Throughput: 0: 42864.1. Samples: 1946484380. Policy #0 lag: (min: 0.0, avg: 11.0, max: 23.0) [2024-06-19 11:46:53,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 11:46:53,954][26599] Updated weights for policy 0, policy_version 346614 (0.0029) [2024-06-19 11:46:57,894][26599] Updated weights for policy 0, policy_version 346624 (0.0035) [2024-06-19 11:46:58,380][26367] Fps is (10 sec: 44237.3, 60 sec: 42874.1, 300 sec: 42653.9). Total num frames: 5679087616. Throughput: 0: 42825.4. Samples: 1946745280. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:46:58,380][26367] Avg episode reward: [(0, '0.658')] [2024-06-19 11:47:01,583][26599] Updated weights for policy 0, policy_version 346634 (0.0034) [2024-06-19 11:47:03,384][26367] Fps is (10 sec: 42582.8, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 5679300608. Throughput: 0: 43117.1. Samples: 1946877300. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:03,385][26367] Avg episode reward: [(0, '0.663')] [2024-06-19 11:47:05,876][26599] Updated weights for policy 0, policy_version 346644 (0.0037) [2024-06-19 11:47:08,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.5, 300 sec: 42654.0). Total num frames: 5679529984. Throughput: 0: 43009.9. Samples: 1947131360. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:08,381][26367] Avg episode reward: [(0, '0.529')] [2024-06-19 11:47:09,335][26599] Updated weights for policy 0, policy_version 346654 (0.0035) [2024-06-19 11:47:13,373][26599] Updated weights for policy 0, policy_version 346664 (0.0033) [2024-06-19 11:47:13,380][26367] Fps is (10 sec: 44253.2, 60 sec: 43420.3, 300 sec: 42709.5). Total num frames: 5679742976. Throughput: 0: 42969.1. Samples: 1947390680. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:13,381][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 11:47:17,359][26599] Updated weights for policy 0, policy_version 346674 (0.0032) [2024-06-19 11:47:18,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5679939584. Throughput: 0: 42909.8. Samples: 1947514260. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:18,380][26367] Avg episode reward: [(0, '0.760')] [2024-06-19 11:47:20,997][26599] Updated weights for policy 0, policy_version 346684 (0.0035) [2024-06-19 11:47:23,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5680168960. Throughput: 0: 43048.8. Samples: 1947775540. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:23,384][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 11:47:24,997][26599] Updated weights for policy 0, policy_version 346694 (0.0035) [2024-06-19 11:47:28,380][26367] Fps is (10 sec: 44236.5, 60 sec: 43417.6, 300 sec: 42709.5). Total num frames: 5680381952. Throughput: 0: 42807.1. Samples: 1948026880. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:28,381][26367] Avg episode reward: [(0, '0.684')] [2024-06-19 11:47:28,532][26599] Updated weights for policy 0, policy_version 346704 (0.0034) [2024-06-19 11:47:32,873][26599] Updated weights for policy 0, policy_version 346714 (0.0037) [2024-06-19 11:47:33,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42654.5). Total num frames: 5680578560. Throughput: 0: 42828.9. Samples: 1948153480. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:33,383][26367] Avg episode reward: [(0, '0.642')] [2024-06-19 11:47:36,106][26599] Updated weights for policy 0, policy_version 346724 (0.0030) [2024-06-19 11:47:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5680807936. Throughput: 0: 42928.0. Samples: 1948416140. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:38,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 11:47:38,460][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000346730_5680824320.pth... [2024-06-19 11:47:38,515][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000346104_5670567936.pth [2024-06-19 11:47:40,447][26599] Updated weights for policy 0, policy_version 346734 (0.0040) [2024-06-19 11:47:43,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42765.3). Total num frames: 5681020928. Throughput: 0: 42779.8. Samples: 1948670380. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:43,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 11:47:44,175][26599] Updated weights for policy 0, policy_version 346744 (0.0025) [2024-06-19 11:47:47,913][26599] Updated weights for policy 0, policy_version 346754 (0.0036) [2024-06-19 11:47:48,380][26367] Fps is (10 sec: 40960.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5681217536. Throughput: 0: 42612.8. Samples: 1948794720. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:48,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 11:47:50,573][26579] Signal inference workers to stop experience collection... (28700 times) [2024-06-19 11:47:50,608][26599] InferenceWorker_p0-w0: stopping experience collection (28700 times) [2024-06-19 11:47:50,621][26579] Signal inference workers to resume experience collection... (28700 times) [2024-06-19 11:47:50,627][26599] InferenceWorker_p0-w0: resuming experience collection (28700 times) [2024-06-19 11:47:51,637][26599] Updated weights for policy 0, policy_version 346764 (0.0037) [2024-06-19 11:47:53,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5681430528. Throughput: 0: 42699.1. Samples: 1949052820. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:53,381][26367] Avg episode reward: [(0, '0.485')] [2024-06-19 11:47:55,907][26599] Updated weights for policy 0, policy_version 346774 (0.0029) [2024-06-19 11:47:58,380][26367] Fps is (10 sec: 45874.8, 60 sec: 43144.4, 300 sec: 42820.5). Total num frames: 5681676288. Throughput: 0: 42672.8. Samples: 1949310960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:47:58,381][26367] Avg episode reward: [(0, '0.418')] [2024-06-19 11:47:59,351][26599] Updated weights for policy 0, policy_version 346784 (0.0032) [2024-06-19 11:48:03,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42600.9, 300 sec: 42709.5). Total num frames: 5681856512. 
Throughput: 0: 42774.6. Samples: 1949439120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:48:03,381][26367] Avg episode reward: [(0, '0.326')] [2024-06-19 11:48:03,473][26599] Updated weights for policy 0, policy_version 346794 (0.0040) [2024-06-19 11:48:07,178][26599] Updated weights for policy 0, policy_version 346804 (0.0032) [2024-06-19 11:48:08,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42709.5). Total num frames: 5682085888. Throughput: 0: 42696.9. Samples: 1949696900. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 11:48:08,381][26367] Avg episode reward: [(0, '0.446')] [2024-06-19 11:48:10,913][26599] Updated weights for policy 0, policy_version 346814 (0.0032) [2024-06-19 11:48:13,384][26367] Fps is (10 sec: 44221.1, 60 sec: 42595.8, 300 sec: 42764.5). Total num frames: 5682298880. Throughput: 0: 42748.6. Samples: 1949950720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:13,385][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 11:48:14,742][26599] Updated weights for policy 0, policy_version 346824 (0.0035) [2024-06-19 11:48:18,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5682495488. Throughput: 0: 42815.6. Samples: 1950080180. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:18,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 11:48:18,675][26599] Updated weights for policy 0, policy_version 346834 (0.0031) [2024-06-19 11:48:22,386][26599] Updated weights for policy 0, policy_version 346844 (0.0029) [2024-06-19 11:48:23,380][26367] Fps is (10 sec: 42613.4, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5682724864. Throughput: 0: 42716.7. Samples: 1950338400. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:23,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 11:48:26,114][26599] Updated weights for policy 0, policy_version 346854 (0.0022) [2024-06-19 11:48:28,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5682954240. Throughput: 0: 42804.5. Samples: 1950596580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:28,381][26367] Avg episode reward: [(0, '0.592')] [2024-06-19 11:48:29,912][26599] Updated weights for policy 0, policy_version 346864 (0.0051) [2024-06-19 11:48:33,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42871.6, 300 sec: 42765.0). Total num frames: 5683150848. Throughput: 0: 42933.8. Samples: 1950726740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:33,380][26367] Avg episode reward: [(0, '0.606')] [2024-06-19 11:48:33,658][26599] Updated weights for policy 0, policy_version 346874 (0.0035) [2024-06-19 11:48:37,438][26599] Updated weights for policy 0, policy_version 346884 (0.0032) [2024-06-19 11:48:38,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5683363840. Throughput: 0: 42822.1. Samples: 1950979820. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:38,381][26367] Avg episode reward: [(0, '0.646')] [2024-06-19 11:48:41,112][26599] Updated weights for policy 0, policy_version 346894 (0.0023) [2024-06-19 11:48:43,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5683576832. Throughput: 0: 42822.8. Samples: 1951237980. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:43,380][26367] Avg episode reward: [(0, '0.731')] [2024-06-19 11:48:45,148][26599] Updated weights for policy 0, policy_version 346904 (0.0037) [2024-06-19 11:48:48,384][26367] Fps is (10 sec: 42583.1, 60 sec: 42868.8, 300 sec: 42709.0). Total num frames: 5683789824. Throughput: 0: 42827.7. Samples: 1951366520. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:48,385][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 11:48:48,788][26599] Updated weights for policy 0, policy_version 346914 (0.0026) [2024-06-19 11:48:52,800][26599] Updated weights for policy 0, policy_version 346924 (0.0034) [2024-06-19 11:48:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5684019200. Throughput: 0: 42790.3. Samples: 1951622460. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:53,381][26367] Avg episode reward: [(0, '0.700')] [2024-06-19 11:48:56,513][26599] Updated weights for policy 0, policy_version 346934 (0.0045) [2024-06-19 11:48:58,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42052.3, 300 sec: 42653.9). Total num frames: 5684199424. Throughput: 0: 42977.7. Samples: 1951884560. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:48:58,380][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 11:49:00,492][26599] Updated weights for policy 0, policy_version 346944 (0.0036) [2024-06-19 11:49:03,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42765.6). Total num frames: 5684428800. Throughput: 0: 42767.0. Samples: 1952004700. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:49:03,381][26367] Avg episode reward: [(0, '0.687')] [2024-06-19 11:49:04,675][26599] Updated weights for policy 0, policy_version 346954 (0.0027) [2024-06-19 11:49:08,224][26599] Updated weights for policy 0, policy_version 346964 (0.0036) [2024-06-19 11:49:08,380][26367] Fps is (10 sec: 45874.3, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5684658176. Throughput: 0: 42817.3. Samples: 1952265180. 
Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:49:08,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 11:49:12,368][26599] Updated weights for policy 0, policy_version 346974 (0.0038) [2024-06-19 11:49:13,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42601.0, 300 sec: 42709.8). Total num frames: 5684854784. Throughput: 0: 42711.1. Samples: 1952518580. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:49:13,381][26367] Avg episode reward: [(0, '0.516')] [2024-06-19 11:49:15,951][26599] Updated weights for policy 0, policy_version 346984 (0.0046) [2024-06-19 11:49:18,384][26367] Fps is (10 sec: 40945.5, 60 sec: 42868.8, 300 sec: 42820.0). Total num frames: 5685067776. Throughput: 0: 42484.9. Samples: 1952638720. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:49:18,385][26367] Avg episode reward: [(0, '0.689')] [2024-06-19 11:49:20,023][26599] Updated weights for policy 0, policy_version 346994 (0.0033) [2024-06-19 11:49:22,692][26579] Signal inference workers to stop experience collection... (28750 times) [2024-06-19 11:49:22,734][26599] InferenceWorker_p0-w0: stopping experience collection (28750 times) [2024-06-19 11:49:22,743][26579] Signal inference workers to resume experience collection... (28750 times) [2024-06-19 11:49:22,751][26599] InferenceWorker_p0-w0: resuming experience collection (28750 times) [2024-06-19 11:49:23,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42871.5, 300 sec: 42765.6). Total num frames: 5685297152. Throughput: 0: 42762.8. Samples: 1952904140. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:49:23,381][26367] Avg episode reward: [(0, '0.534')] [2024-06-19 11:49:23,390][26599] Updated weights for policy 0, policy_version 347004 (0.0035) [2024-06-19 11:49:27,430][26599] Updated weights for policy 0, policy_version 347014 (0.0031) [2024-06-19 11:49:28,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42325.3, 300 sec: 42709.5). Total num frames: 5685493760. Throughput: 0: 42705.7. 
Samples: 1953159740. Policy #0 lag: (min: 0.0, avg: 9.6, max: 20.0) [2024-06-19 11:49:28,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 11:49:31,183][26599] Updated weights for policy 0, policy_version 347024 (0.0049) [2024-06-19 11:49:33,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5685706752. Throughput: 0: 42590.6. Samples: 1953282940. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:49:33,380][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 11:49:34,974][26599] Updated weights for policy 0, policy_version 347034 (0.0031) [2024-06-19 11:49:38,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5685919744. Throughput: 0: 42650.2. Samples: 1953541720. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:49:38,381][26367] Avg episode reward: [(0, '0.426')] [2024-06-19 11:49:38,455][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000347042_5685936128.pth... [2024-06-19 11:49:38,511][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000346415_5675663360.pth [2024-06-19 11:49:39,140][26599] Updated weights for policy 0, policy_version 347044 (0.0032) [2024-06-19 11:49:42,822][26599] Updated weights for policy 0, policy_version 347054 (0.0025) [2024-06-19 11:49:43,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5686149120. Throughput: 0: 42380.8. Samples: 1953791700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:49:43,381][26367] Avg episode reward: [(0, '0.714')] [2024-06-19 11:49:46,658][26599] Updated weights for policy 0, policy_version 347064 (0.0032) [2024-06-19 11:49:48,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42601.1, 300 sec: 42765.6). Total num frames: 5686345728. Throughput: 0: 42686.0. Samples: 1953925560. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:49:48,380][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 11:49:50,749][26599] Updated weights for policy 0, policy_version 347074 (0.0038) [2024-06-19 11:49:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5686575104. Throughput: 0: 42544.2. Samples: 1954179660. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:49:53,381][26367] Avg episode reward: [(0, '0.480')] [2024-06-19 11:49:54,611][26599] Updated weights for policy 0, policy_version 347084 (0.0044) [2024-06-19 11:49:58,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 5686771712. Throughput: 0: 42660.9. Samples: 1954438320. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:49:58,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 11:49:58,386][26599] Updated weights for policy 0, policy_version 347094 (0.0039) [2024-06-19 11:50:02,010][26599] Updated weights for policy 0, policy_version 347104 (0.0031) [2024-06-19 11:50:03,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5687001088. Throughput: 0: 42825.3. Samples: 1954565700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:50:03,381][26367] Avg episode reward: [(0, '0.423')] [2024-06-19 11:50:06,035][26599] Updated weights for policy 0, policy_version 347114 (0.0034) [2024-06-19 11:50:08,384][26367] Fps is (10 sec: 44220.8, 60 sec: 42595.9, 300 sec: 42820.0). Total num frames: 5687214080. Throughput: 0: 42599.7. Samples: 1954821280. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:50:08,384][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 11:50:09,619][26599] Updated weights for policy 0, policy_version 347124 (0.0025) [2024-06-19 11:50:13,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5687410688. Throughput: 0: 42558.3. Samples: 1955074860. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:50:13,381][26367] Avg episode reward: [(0, '0.563')] [2024-06-19 11:50:13,787][26599] Updated weights for policy 0, policy_version 347134 (0.0051) [2024-06-19 11:50:17,584][26599] Updated weights for policy 0, policy_version 347144 (0.0034) [2024-06-19 11:50:18,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42874.2, 300 sec: 42765.0). Total num frames: 5687640064. Throughput: 0: 42669.8. Samples: 1955203080. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:50:18,381][26367] Avg episode reward: [(0, '0.681')] [2024-06-19 11:50:21,528][26599] Updated weights for policy 0, policy_version 347154 (0.0036) [2024-06-19 11:50:23,384][26367] Fps is (10 sec: 45858.2, 60 sec: 42868.9, 300 sec: 42820.0). Total num frames: 5687869440. Throughput: 0: 42781.5. Samples: 1955467040. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:50:23,384][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 11:50:25,193][26599] Updated weights for policy 0, policy_version 347164 (0.0046) [2024-06-19 11:50:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5688049664. Throughput: 0: 42809.5. Samples: 1955718120. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:50:28,380][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 11:50:29,087][26599] Updated weights for policy 0, policy_version 347174 (0.0034) [2024-06-19 11:50:32,703][26599] Updated weights for policy 0, policy_version 347184 (0.0037) [2024-06-19 11:50:33,380][26367] Fps is (10 sec: 40975.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5688279040. Throughput: 0: 42669.3. Samples: 1955845680. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:50:33,380][26367] Avg episode reward: [(0, '0.540')] [2024-06-19 11:50:37,172][26599] Updated weights for policy 0, policy_version 347194 (0.0044) [2024-06-19 11:50:38,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5688492032. Throughput: 0: 42877.8. Samples: 1956109160. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:50:38,381][26367] Avg episode reward: [(0, '0.619')] [2024-06-19 11:50:40,233][26599] Updated weights for policy 0, policy_version 347204 (0.0040) [2024-06-19 11:50:43,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5688688640. Throughput: 0: 42625.3. Samples: 1956356460. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:50:43,381][26367] Avg episode reward: [(0, '0.680')] [2024-06-19 11:50:44,881][26599] Updated weights for policy 0, policy_version 347214 (0.0030) [2024-06-19 11:50:47,700][26579] Signal inference workers to stop experience collection... (28800 times) [2024-06-19 11:50:47,701][26579] Signal inference workers to resume experience collection... (28800 times) [2024-06-19 11:50:47,716][26599] InferenceWorker_p0-w0: stopping experience collection (28800 times) [2024-06-19 11:50:47,716][26599] InferenceWorker_p0-w0: resuming experience collection (28800 times) [2024-06-19 11:50:47,836][26599] Updated weights for policy 0, policy_version 347224 (0.0038) [2024-06-19 11:50:48,380][26367] Fps is (10 sec: 44236.3, 60 sec: 43144.4, 300 sec: 42765.0). Total num frames: 5688934400. Throughput: 0: 42651.1. Samples: 1956485000. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:50:48,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 11:50:52,526][26599] Updated weights for policy 0, policy_version 347234 (0.0030) [2024-06-19 11:50:53,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.3, 300 sec: 42765.5). Total num frames: 5689131008. 
Throughput: 0: 42924.7. Samples: 1956752740. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:50:53,381][26367] Avg episode reward: [(0, '0.460')] [2024-06-19 11:50:55,289][26599] Updated weights for policy 0, policy_version 347244 (0.0035) [2024-06-19 11:50:58,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5689344000. Throughput: 0: 42897.6. Samples: 1957005260. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:50:58,381][26367] Avg episode reward: [(0, '0.422')] [2024-06-19 11:51:00,261][26599] Updated weights for policy 0, policy_version 347254 (0.0035) [2024-06-19 11:51:03,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42654.0). Total num frames: 5689556992. Throughput: 0: 42819.5. Samples: 1957129960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:03,381][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 11:51:03,412][26599] Updated weights for policy 0, policy_version 347264 (0.0044) [2024-06-19 11:51:08,018][26599] Updated weights for policy 0, policy_version 347274 (0.0035) [2024-06-19 11:51:08,380][26367] Fps is (10 sec: 40961.2, 60 sec: 42328.0, 300 sec: 42765.6). Total num frames: 5689753600. Throughput: 0: 42747.6. Samples: 1957390520. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:08,380][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 11:51:10,877][26599] Updated weights for policy 0, policy_version 347284 (0.0042) [2024-06-19 11:51:13,382][26367] Fps is (10 sec: 42590.9, 60 sec: 42870.2, 300 sec: 42764.8). Total num frames: 5689982976. Throughput: 0: 42840.4. Samples: 1957646020. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:13,383][26367] Avg episode reward: [(0, '0.776')] [2024-06-19 11:51:15,665][26599] Updated weights for policy 0, policy_version 347294 (0.0040) [2024-06-19 11:51:18,380][26367] Fps is (10 sec: 45874.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5690212352. 
Throughput: 0: 42887.4. Samples: 1957775620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:18,381][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 11:51:18,645][26599] Updated weights for policy 0, policy_version 347304 (0.0032) [2024-06-19 11:51:23,326][26599] Updated weights for policy 0, policy_version 347314 (0.0023) [2024-06-19 11:51:23,380][26367] Fps is (10 sec: 40967.6, 60 sec: 42054.9, 300 sec: 42765.0). Total num frames: 5690392576. Throughput: 0: 42662.2. Samples: 1958028960. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:23,381][26367] Avg episode reward: [(0, '0.673')] [2024-06-19 11:51:26,271][26599] Updated weights for policy 0, policy_version 347324 (0.0042) [2024-06-19 11:51:28,380][26367] Fps is (10 sec: 42598.7, 60 sec: 43144.4, 300 sec: 42820.6). Total num frames: 5690638336. Throughput: 0: 42843.1. Samples: 1958284400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:28,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 11:51:30,751][26599] Updated weights for policy 0, policy_version 347334 (0.0042) [2024-06-19 11:51:33,384][26367] Fps is (10 sec: 47496.2, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 5690867712. Throughput: 0: 42949.1. Samples: 1958417860. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:33,384][26367] Avg episode reward: [(0, '0.805')] [2024-06-19 11:51:34,206][26599] Updated weights for policy 0, policy_version 347344 (0.0048) [2024-06-19 11:51:38,357][26599] Updated weights for policy 0, policy_version 347354 (0.0042) [2024-06-19 11:51:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5691047936. Throughput: 0: 42575.6. Samples: 1958668640. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:38,381][26367] Avg episode reward: [(0, '0.528')] [2024-06-19 11:51:38,401][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000347354_5691047936.pth... 
[2024-06-19 11:51:38,451][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000346730_5680824320.pth [2024-06-19 11:51:41,794][26599] Updated weights for policy 0, policy_version 347364 (0.0038) [2024-06-19 11:51:43,384][26367] Fps is (10 sec: 39321.5, 60 sec: 42868.9, 300 sec: 42764.5). Total num frames: 5691260928. Throughput: 0: 42555.8. Samples: 1958920420. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:43,384][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 11:51:46,007][26599] Updated weights for policy 0, policy_version 347374 (0.0036) [2024-06-19 11:51:48,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5691506688. Throughput: 0: 42735.2. Samples: 1959053040. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:48,380][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 11:51:49,368][26599] Updated weights for policy 0, policy_version 347384 (0.0044) [2024-06-19 11:51:53,380][26367] Fps is (10 sec: 42613.6, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5691686912. Throughput: 0: 42624.3. Samples: 1959308620. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:53,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 11:51:53,605][26599] Updated weights for policy 0, policy_version 347394 (0.0030) [2024-06-19 11:51:56,809][26599] Updated weights for policy 0, policy_version 347404 (0.0035) [2024-06-19 11:51:58,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42871.6, 300 sec: 42765.5). Total num frames: 5691916288. Throughput: 0: 42587.9. Samples: 1959562400. Policy #0 lag: (min: 0.0, avg: 9.7, max: 21.0) [2024-06-19 11:51:58,381][26367] Avg episode reward: [(0, '0.593')] [2024-06-19 11:52:01,564][26599] Updated weights for policy 0, policy_version 347414 (0.0033) [2024-06-19 11:52:03,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42325.4, 300 sec: 42598.4). Total num frames: 5692096512. Throughput: 0: 42555.7. Samples: 1959690620. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:03,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 11:52:04,519][26599] Updated weights for policy 0, policy_version 347424 (0.0044) [2024-06-19 11:52:08,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42871.4, 300 sec: 42653.9). Total num frames: 5692325888. Throughput: 0: 42460.4. Samples: 1959939680. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:08,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 11:52:09,218][26599] Updated weights for policy 0, policy_version 347434 (0.0037) [2024-06-19 11:52:12,522][26599] Updated weights for policy 0, policy_version 347444 (0.0049) [2024-06-19 11:52:13,380][26367] Fps is (10 sec: 45875.0, 60 sec: 42872.7, 300 sec: 42765.0). Total num frames: 5692555264. Throughput: 0: 42523.6. Samples: 1960197960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:13,381][26367] Avg episode reward: [(0, '0.550')] [2024-06-19 11:52:17,061][26599] Updated weights for policy 0, policy_version 347454 (0.0053) [2024-06-19 11:52:18,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42325.4, 300 sec: 42653.9). Total num frames: 5692751872. Throughput: 0: 42431.8. Samples: 1960327140. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:18,381][26367] Avg episode reward: [(0, '0.661')] [2024-06-19 11:52:20,335][26599] Updated weights for policy 0, policy_version 347464 (0.0042) [2024-06-19 11:52:23,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5692964864. Throughput: 0: 42496.0. Samples: 1960580960. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:23,380][26367] Avg episode reward: [(0, '0.625')] [2024-06-19 11:52:24,499][26599] Updated weights for policy 0, policy_version 347474 (0.0036) [2024-06-19 11:52:26,053][26579] Signal inference workers to stop experience collection... 
(28850 times) [2024-06-19 11:52:26,103][26599] InferenceWorker_p0-w0: stopping experience collection (28850 times) [2024-06-19 11:52:26,110][26579] Signal inference workers to resume experience collection... (28850 times) [2024-06-19 11:52:26,120][26599] InferenceWorker_p0-w0: resuming experience collection (28850 times) [2024-06-19 11:52:28,021][26599] Updated weights for policy 0, policy_version 347484 (0.0033) [2024-06-19 11:52:28,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5693194240. Throughput: 0: 42680.4. Samples: 1960840880. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:28,380][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 11:52:32,406][26599] Updated weights for policy 0, policy_version 347494 (0.0025) [2024-06-19 11:52:33,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42054.8, 300 sec: 42653.9). Total num frames: 5693390848. Throughput: 0: 42565.6. Samples: 1960968500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:33,381][26367] Avg episode reward: [(0, '0.506')] [2024-06-19 11:52:35,559][26599] Updated weights for policy 0, policy_version 347504 (0.0039) [2024-06-19 11:52:38,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5693620224. Throughput: 0: 42547.7. Samples: 1961223260. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:38,381][26367] Avg episode reward: [(0, '0.493')] [2024-06-19 11:52:39,838][26599] Updated weights for policy 0, policy_version 347514 (0.0050) [2024-06-19 11:52:43,263][26599] Updated weights for policy 0, policy_version 347524 (0.0027) [2024-06-19 11:52:43,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42874.1, 300 sec: 42765.0). Total num frames: 5693833216. Throughput: 0: 42644.9. Samples: 1961481420. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:43,381][26367] Avg episode reward: [(0, '0.659')] [2024-06-19 11:52:47,346][26599] Updated weights for policy 0, policy_version 347534 (0.0035) [2024-06-19 11:52:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42052.2, 300 sec: 42709.5). Total num frames: 5694029824. Throughput: 0: 42641.8. Samples: 1961609500. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:48,381][26367] Avg episode reward: [(0, '0.629')] [2024-06-19 11:52:50,675][26599] Updated weights for policy 0, policy_version 347544 (0.0031) [2024-06-19 11:52:53,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.6, 300 sec: 42654.0). Total num frames: 5694259200. Throughput: 0: 42849.8. Samples: 1961867920. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:53,380][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 11:52:54,597][26599] Updated weights for policy 0, policy_version 347554 (0.0043) [2024-06-19 11:52:58,015][26599] Updated weights for policy 0, policy_version 347564 (0.0029) [2024-06-19 11:52:58,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5694488576. Throughput: 0: 43053.3. Samples: 1962135360. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:52:58,381][26367] Avg episode reward: [(0, '0.583')] [2024-06-19 11:53:02,047][26599] Updated weights for policy 0, policy_version 347574 (0.0040) [2024-06-19 11:53:03,380][26367] Fps is (10 sec: 42597.9, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5694685184. Throughput: 0: 43143.1. Samples: 1962268580. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:53:03,381][26367] Avg episode reward: [(0, '0.634')] [2024-06-19 11:53:05,561][26599] Updated weights for policy 0, policy_version 347584 (0.0045) [2024-06-19 11:53:08,381][26367] Fps is (10 sec: 42593.3, 60 sec: 43143.6, 300 sec: 42765.4). Total num frames: 5694914560. Throughput: 0: 43128.6. Samples: 1962521800. 
Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:53:08,382][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 11:53:09,552][26599] Updated weights for policy 0, policy_version 347594 (0.0039) [2024-06-19 11:53:13,327][26599] Updated weights for policy 0, policy_version 347604 (0.0035) [2024-06-19 11:53:13,380][26367] Fps is (10 sec: 45875.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5695143936. Throughput: 0: 43150.5. Samples: 1962782660. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:53:13,384][26367] Avg episode reward: [(0, '0.744')] [2024-06-19 11:53:17,124][26599] Updated weights for policy 0, policy_version 347614 (0.0030) [2024-06-19 11:53:18,380][26367] Fps is (10 sec: 42603.8, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5695340544. Throughput: 0: 43267.6. Samples: 1962915540. Policy #0 lag: (min: 0.0, avg: 9.4, max: 21.0) [2024-06-19 11:53:18,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 11:53:21,131][26599] Updated weights for policy 0, policy_version 347624 (0.0045) [2024-06-19 11:53:23,380][26367] Fps is (10 sec: 40960.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5695553536. Throughput: 0: 43191.2. Samples: 1963166860. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:53:23,380][26367] Avg episode reward: [(0, '0.620')] [2024-06-19 11:53:24,841][26599] Updated weights for policy 0, policy_version 347634 (0.0037) [2024-06-19 11:53:28,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5695766528. Throughput: 0: 43268.0. Samples: 1963428480. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:53:28,381][26367] Avg episode reward: [(0, '0.451')] [2024-06-19 11:53:29,009][26599] Updated weights for policy 0, policy_version 347644 (0.0039) [2024-06-19 11:53:32,495][26599] Updated weights for policy 0, policy_version 347654 (0.0030) [2024-06-19 11:53:33,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5695979520. Throughput: 0: 43231.6. Samples: 1963554920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:53:33,380][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 11:53:36,543][26599] Updated weights for policy 0, policy_version 347664 (0.0034) [2024-06-19 11:53:38,380][26367] Fps is (10 sec: 42597.7, 60 sec: 42871.3, 300 sec: 42765.0). Total num frames: 5696192512. Throughput: 0: 43152.2. Samples: 1963809780. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:53:38,381][26367] Avg episode reward: [(0, '0.577')] [2024-06-19 11:53:38,563][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000347669_5696208896.pth... [2024-06-19 11:53:38,616][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000347042_5685936128.pth [2024-06-19 11:53:40,001][26599] Updated weights for policy 0, policy_version 347674 (0.0033) [2024-06-19 11:53:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.4, 300 sec: 42710.0). Total num frames: 5696389120. Throughput: 0: 42971.6. Samples: 1964069080. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:53:43,382][26367] Avg episode reward: [(0, '0.653')] [2024-06-19 11:53:44,202][26599] Updated weights for policy 0, policy_version 347684 (0.0047) [2024-06-19 11:53:47,627][26599] Updated weights for policy 0, policy_version 347694 (0.0038) [2024-06-19 11:53:48,384][26367] Fps is (10 sec: 42583.6, 60 sec: 43141.9, 300 sec: 42708.9). Total num frames: 5696618496. Throughput: 0: 42845.0. Samples: 1964196760. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:53:48,384][26367] Avg episode reward: [(0, '0.559')] [2024-06-19 11:53:51,726][26599] Updated weights for policy 0, policy_version 347704 (0.0027) [2024-06-19 11:53:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5696831488. Throughput: 0: 42892.7. Samples: 1964451920. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:53:53,381][26367] Avg episode reward: [(0, '0.552')] [2024-06-19 11:53:55,513][26599] Updated weights for policy 0, policy_version 347714 (0.0031) [2024-06-19 11:53:58,380][26367] Fps is (10 sec: 42614.4, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5697044480. Throughput: 0: 42844.2. Samples: 1964710640. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:53:58,380][26367] Avg episode reward: [(0, '0.551')] [2024-06-19 11:53:59,420][26599] Updated weights for policy 0, policy_version 347724 (0.0035) [2024-06-19 11:54:01,912][26579] Signal inference workers to stop experience collection... (28900 times) [2024-06-19 11:54:01,963][26599] InferenceWorker_p0-w0: stopping experience collection (28900 times) [2024-06-19 11:54:01,969][26579] Signal inference workers to resume experience collection... (28900 times) [2024-06-19 11:54:01,975][26599] InferenceWorker_p0-w0: resuming experience collection (28900 times) [2024-06-19 11:54:02,968][26599] Updated weights for policy 0, policy_version 347734 (0.0036) [2024-06-19 11:54:03,380][26367] Fps is (10 sec: 45876.0, 60 sec: 43417.7, 300 sec: 42820.6). Total num frames: 5697290240. Throughput: 0: 42793.4. Samples: 1964841240. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:54:03,380][26367] Avg episode reward: [(0, '0.370')] [2024-06-19 11:54:07,333][26599] Updated weights for policy 0, policy_version 347744 (0.0040) [2024-06-19 11:54:08,380][26367] Fps is (10 sec: 44235.9, 60 sec: 42872.3, 300 sec: 42820.5). Total num frames: 5697486848. 
Throughput: 0: 43010.0. Samples: 1965102320. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:54:08,381][26367] Avg episode reward: [(0, '0.527')] [2024-06-19 11:54:10,461][26599] Updated weights for policy 0, policy_version 347754 (0.0032) [2024-06-19 11:54:13,382][26367] Fps is (10 sec: 39312.3, 60 sec: 42323.8, 300 sec: 42765.2). Total num frames: 5697683456. Throughput: 0: 42830.3. Samples: 1965355940. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:54:13,383][26367] Avg episode reward: [(0, '0.588')] [2024-06-19 11:54:14,900][26599] Updated weights for policy 0, policy_version 347764 (0.0035) [2024-06-19 11:54:18,187][26599] Updated weights for policy 0, policy_version 347774 (0.0031) [2024-06-19 11:54:18,380][26367] Fps is (10 sec: 44237.5, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5697929216. Throughput: 0: 42830.2. Samples: 1965482280. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:54:18,381][26367] Avg episode reward: [(0, '0.524')] [2024-06-19 11:54:22,374][26599] Updated weights for policy 0, policy_version 347784 (0.0026) [2024-06-19 11:54:23,380][26367] Fps is (10 sec: 44246.4, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5698125824. Throughput: 0: 42949.0. Samples: 1965742480. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:54:23,381][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 11:54:25,758][26599] Updated weights for policy 0, policy_version 347794 (0.0022) [2024-06-19 11:54:28,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5698338816. Throughput: 0: 42948.8. Samples: 1966001780. 
Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:54:28,388][26367] Avg episode reward: [(0, '0.428')] [2024-06-19 11:54:29,970][26599] Updated weights for policy 0, policy_version 347804 (0.0028) [2024-06-19 11:54:33,242][26599] Updated weights for policy 0, policy_version 347814 (0.0040) [2024-06-19 11:54:33,380][26367] Fps is (10 sec: 45875.7, 60 sec: 43417.6, 300 sec: 42931.6). Total num frames: 5698584576. Throughput: 0: 42811.1. Samples: 1966123100. Policy #0 lag: (min: 0.0, avg: 10.3, max: 21.0) [2024-06-19 11:54:33,380][26367] Avg episode reward: [(0, '0.519')] [2024-06-19 11:54:37,749][26599] Updated weights for policy 0, policy_version 347824 (0.0037) [2024-06-19 11:54:38,384][26367] Fps is (10 sec: 42583.3, 60 sec: 42869.0, 300 sec: 42764.5). Total num frames: 5698764800. Throughput: 0: 42785.5. Samples: 1966377420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:54:38,384][26367] Avg episode reward: [(0, '0.650')] [2024-06-19 11:54:41,244][26599] Updated weights for policy 0, policy_version 347834 (0.0036) [2024-06-19 11:54:43,380][26367] Fps is (10 sec: 37683.3, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5698961408. Throughput: 0: 42919.1. Samples: 1966642000. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:54:43,380][26367] Avg episode reward: [(0, '0.610')] [2024-06-19 11:54:45,173][26599] Updated weights for policy 0, policy_version 347844 (0.0029) [2024-06-19 11:54:48,380][26367] Fps is (10 sec: 44252.2, 60 sec: 43147.0, 300 sec: 42820.5). Total num frames: 5699207168. Throughput: 0: 42931.7. Samples: 1966773180. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:54:48,381][26367] Avg episode reward: [(0, '0.614')] [2024-06-19 11:54:48,631][26599] Updated weights for policy 0, policy_version 347854 (0.0034) [2024-06-19 11:54:52,849][26599] Updated weights for policy 0, policy_version 347864 (0.0025) [2024-06-19 11:54:53,380][26367] Fps is (10 sec: 45874.2, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5699420160. Throughput: 0: 42877.8. Samples: 1967031820. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:54:53,381][26367] Avg episode reward: [(0, '0.525')] [2024-06-19 11:54:56,224][26599] Updated weights for policy 0, policy_version 347874 (0.0027) [2024-06-19 11:54:58,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5699616768. Throughput: 0: 42892.4. Samples: 1967286000. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:54:58,380][26367] Avg episode reward: [(0, '0.361')] [2024-06-19 11:55:00,395][26599] Updated weights for policy 0, policy_version 347884 (0.0043) [2024-06-19 11:55:03,380][26367] Fps is (10 sec: 42599.2, 60 sec: 42598.4, 300 sec: 42821.1). Total num frames: 5699846144. Throughput: 0: 42907.1. Samples: 1967413100. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:03,381][26367] Avg episode reward: [(0, '0.721')] [2024-06-19 11:55:04,235][26599] Updated weights for policy 0, policy_version 347894 (0.0038) [2024-06-19 11:55:08,095][26599] Updated weights for policy 0, policy_version 347904 (0.0026) [2024-06-19 11:55:08,384][26367] Fps is (10 sec: 45858.2, 60 sec: 43142.0, 300 sec: 42931.1). Total num frames: 5700075520. Throughput: 0: 42994.4. Samples: 1967677380. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:08,384][26367] Avg episode reward: [(0, '0.764')] [2024-06-19 11:55:11,847][26599] Updated weights for policy 0, policy_version 347914 (0.0040) [2024-06-19 11:55:13,380][26367] Fps is (10 sec: 42598.3, 60 sec: 43146.2, 300 sec: 42820.5). Total num frames: 5700272128. Throughput: 0: 42658.8. Samples: 1967921420. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:13,381][26367] Avg episode reward: [(0, '0.544')] [2024-06-19 11:55:14,104][26579] Signal inference workers to stop experience collection... (28950 times) [2024-06-19 11:55:14,105][26579] Signal inference workers to resume experience collection... (28950 times) [2024-06-19 11:55:14,137][26599] InferenceWorker_p0-w0: stopping experience collection (28950 times) [2024-06-19 11:55:14,137][26599] InferenceWorker_p0-w0: resuming experience collection (28950 times) [2024-06-19 11:55:15,802][26599] Updated weights for policy 0, policy_version 347924 (0.0039) [2024-06-19 11:55:18,380][26367] Fps is (10 sec: 42613.7, 60 sec: 42871.4, 300 sec: 42821.1). Total num frames: 5700501504. Throughput: 0: 42898.6. Samples: 1968053540. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:18,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 11:55:19,414][26599] Updated weights for policy 0, policy_version 347934 (0.0030) [2024-06-19 11:55:23,365][26599] Updated weights for policy 0, policy_version 347944 (0.0041) [2024-06-19 11:55:23,380][26367] Fps is (10 sec: 44236.4, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5700714496. Throughput: 0: 43102.5. Samples: 1968316880. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:23,381][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 11:55:27,134][26599] Updated weights for policy 0, policy_version 347954 (0.0043) [2024-06-19 11:55:28,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.5, 300 sec: 42820.5). Total num frames: 5700911104. Throughput: 0: 42915.1. 
Samples: 1968573180. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:28,381][26367] Avg episode reward: [(0, '0.456')] [2024-06-19 11:55:31,181][26599] Updated weights for policy 0, policy_version 347964 (0.0030) [2024-06-19 11:55:33,380][26367] Fps is (10 sec: 42598.7, 60 sec: 42598.4, 300 sec: 42876.1). Total num frames: 5701140480. Throughput: 0: 42825.5. Samples: 1968700320. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:33,381][26367] Avg episode reward: [(0, '0.570')] [2024-06-19 11:55:35,409][26599] Updated weights for policy 0, policy_version 347974 (0.0041) [2024-06-19 11:55:38,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42600.9, 300 sec: 42820.5). Total num frames: 5701320704. Throughput: 0: 42644.9. Samples: 1968950840. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:38,381][26367] Avg episode reward: [(0, '0.635')] [2024-06-19 11:55:38,389][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000347982_5701337088.pth... [2024-06-19 11:55:38,436][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000347354_5691047936.pth [2024-06-19 11:55:38,946][26599] Updated weights for policy 0, policy_version 347984 (0.0038) [2024-06-19 11:55:42,922][26599] Updated weights for policy 0, policy_version 347994 (0.0036) [2024-06-19 11:55:43,380][26367] Fps is (10 sec: 39321.5, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5701533696. Throughput: 0: 42763.5. Samples: 1969210360. Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:43,382][26367] Avg episode reward: [(0, '0.539')] [2024-06-19 11:55:46,602][26599] Updated weights for policy 0, policy_version 348004 (0.0040) [2024-06-19 11:55:48,380][26367] Fps is (10 sec: 45875.6, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5701779456. Throughput: 0: 42765.3. Samples: 1969337540. 
Policy #0 lag: (min: 1.0, avg: 8.9, max: 20.0) [2024-06-19 11:55:48,384][26367] Avg episode reward: [(0, '0.393')] [2024-06-19 11:55:50,520][26599] Updated weights for policy 0, policy_version 348014 (0.0030) [2024-06-19 11:55:53,380][26367] Fps is (10 sec: 44236.9, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5701976064. Throughput: 0: 42589.7. Samples: 1969593760. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:55:53,381][26367] Avg episode reward: [(0, '0.404')] [2024-06-19 11:55:54,209][26599] Updated weights for policy 0, policy_version 348024 (0.0030) [2024-06-19 11:55:58,169][26599] Updated weights for policy 0, policy_version 348034 (0.0040) [2024-06-19 11:55:58,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5702189056. Throughput: 0: 42851.4. Samples: 1969849740. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:55:58,381][26367] Avg episode reward: [(0, '0.469')] [2024-06-19 11:56:01,723][26599] Updated weights for policy 0, policy_version 348044 (0.0043) [2024-06-19 11:56:03,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5702402048. Throughput: 0: 42818.7. Samples: 1969980380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:03,381][26367] Avg episode reward: [(0, '0.603')] [2024-06-19 11:56:05,684][26599] Updated weights for policy 0, policy_version 348054 (0.0039) [2024-06-19 11:56:08,384][26367] Fps is (10 sec: 42583.5, 60 sec: 42325.3, 300 sec: 42820.3). Total num frames: 5702615040. Throughput: 0: 42618.8. Samples: 1970234880. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:08,385][26367] Avg episode reward: [(0, '0.562')] [2024-06-19 11:56:09,367][26599] Updated weights for policy 0, policy_version 348064 (0.0040) [2024-06-19 11:56:13,380][26367] Fps is (10 sec: 42598.5, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5702828032. Throughput: 0: 42517.3. Samples: 1970486460. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:13,381][26367] Avg episode reward: [(0, '0.554')] [2024-06-19 11:56:13,616][26599] Updated weights for policy 0, policy_version 348074 (0.0034) [2024-06-19 11:56:17,212][26599] Updated weights for policy 0, policy_version 348084 (0.0047) [2024-06-19 11:56:18,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42325.4, 300 sec: 42876.1). Total num frames: 5703041024. Throughput: 0: 42661.8. Samples: 1970620100. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:18,381][26367] Avg episode reward: [(0, '0.622')] [2024-06-19 11:56:21,110][26599] Updated weights for policy 0, policy_version 348094 (0.0043) [2024-06-19 11:56:23,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5703254016. Throughput: 0: 42690.3. Samples: 1970871900. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:23,381][26367] Avg episode reward: [(0, '0.639')] [2024-06-19 11:56:25,288][26599] Updated weights for policy 0, policy_version 348104 (0.0036) [2024-06-19 11:56:28,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42868.8, 300 sec: 42765.0). Total num frames: 5703483392. Throughput: 0: 42488.1. Samples: 1971122480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:28,385][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 11:56:28,586][26599] Updated weights for policy 0, policy_version 348114 (0.0036) [2024-06-19 11:56:32,923][26599] Updated weights for policy 0, policy_version 348124 (0.0037) [2024-06-19 11:56:33,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42322.8, 300 sec: 42820.0). Total num frames: 5703680000. Throughput: 0: 42674.8. Samples: 1971258060. 
Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:33,385][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 11:56:36,591][26599] Updated weights for policy 0, policy_version 348134 (0.0037) [2024-06-19 11:56:36,615][26579] Signal inference workers to stop experience collection... (29000 times) [2024-06-19 11:56:36,615][26579] Signal inference workers to resume experience collection... (29000 times) [2024-06-19 11:56:36,637][26599] InferenceWorker_p0-w0: stopping experience collection (29000 times) [2024-06-19 11:56:36,637][26599] InferenceWorker_p0-w0: resuming experience collection (29000 times) [2024-06-19 11:56:38,380][26367] Fps is (10 sec: 40975.2, 60 sec: 42871.6, 300 sec: 42821.1). Total num frames: 5703892992. Throughput: 0: 42604.9. Samples: 1971510980. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:38,380][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 11:56:40,500][26599] Updated weights for policy 0, policy_version 348144 (0.0025) [2024-06-19 11:56:43,380][26367] Fps is (10 sec: 45892.2, 60 sec: 43417.7, 300 sec: 42820.5). Total num frames: 5704138752. Throughput: 0: 42663.3. Samples: 1971769580. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:43,381][26367] Avg episode reward: [(0, '0.575')] [2024-06-19 11:56:44,033][26599] Updated weights for policy 0, policy_version 348154 (0.0035) [2024-06-19 11:56:48,050][26599] Updated weights for policy 0, policy_version 348164 (0.0033) [2024-06-19 11:56:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5704335360. Throughput: 0: 42730.3. Samples: 1971903240. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:48,380][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 11:56:51,478][26599] Updated weights for policy 0, policy_version 348174 (0.0040) [2024-06-19 11:56:53,380][26367] Fps is (10 sec: 39320.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5704531968. 
Throughput: 0: 42721.1. Samples: 1972157180. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:53,381][26367] Avg episode reward: [(0, '0.688')] [2024-06-19 11:56:55,607][26599] Updated weights for policy 0, policy_version 348184 (0.0038) [2024-06-19 11:56:58,381][26367] Fps is (10 sec: 44233.6, 60 sec: 43144.2, 300 sec: 42987.1). Total num frames: 5704777728. Throughput: 0: 42888.7. Samples: 1972416480. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:56:58,381][26367] Avg episode reward: [(0, '0.711')] [2024-06-19 11:56:58,873][26599] Updated weights for policy 0, policy_version 348194 (0.0023) [2024-06-19 11:57:03,071][26599] Updated weights for policy 0, policy_version 348204 (0.0040) [2024-06-19 11:57:03,380][26367] Fps is (10 sec: 44237.7, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5704974336. Throughput: 0: 42895.2. Samples: 1972550380. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:57:03,380][26367] Avg episode reward: [(0, '0.835')] [2024-06-19 11:57:06,563][26599] Updated weights for policy 0, policy_version 348214 (0.0040) [2024-06-19 11:57:08,380][26367] Fps is (10 sec: 40962.5, 60 sec: 42874.0, 300 sec: 42820.5). Total num frames: 5705187328. Throughput: 0: 42862.2. Samples: 1972800700. Policy #0 lag: (min: 0.0, avg: 10.1, max: 21.0) [2024-06-19 11:57:08,381][26367] Avg episode reward: [(0, '0.880')] [2024-06-19 11:57:10,555][26599] Updated weights for policy 0, policy_version 348224 (0.0035) [2024-06-19 11:57:13,380][26367] Fps is (10 sec: 44236.1, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5705416704. Throughput: 0: 43199.5. Samples: 1973066300. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:13,381][26367] Avg episode reward: [(0, '0.630')] [2024-06-19 11:57:14,110][26599] Updated weights for policy 0, policy_version 348234 (0.0039) [2024-06-19 11:57:18,270][26599] Updated weights for policy 0, policy_version 348244 (0.0040) [2024-06-19 11:57:18,380][26367] Fps is (10 sec: 44237.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5705629696. Throughput: 0: 43099.0. Samples: 1973197360. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:18,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 11:57:21,575][26599] Updated weights for policy 0, policy_version 348254 (0.0037) [2024-06-19 11:57:23,380][26367] Fps is (10 sec: 42598.8, 60 sec: 43144.6, 300 sec: 42876.1). Total num frames: 5705842688. Throughput: 0: 43089.3. Samples: 1973450000. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:23,384][26367] Avg episode reward: [(0, '0.308')] [2024-06-19 11:57:25,863][26599] Updated weights for policy 0, policy_version 348264 (0.0030) [2024-06-19 11:57:28,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43147.1, 300 sec: 42987.2). Total num frames: 5706072064. Throughput: 0: 43219.9. Samples: 1973714480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:28,381][26367] Avg episode reward: [(0, '0.468')] [2024-06-19 11:57:29,086][26599] Updated weights for policy 0, policy_version 348274 (0.0039) [2024-06-19 11:57:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 43147.2, 300 sec: 42876.1). Total num frames: 5706268672. Throughput: 0: 43127.5. Samples: 1973843980. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:33,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 11:57:33,516][26599] Updated weights for policy 0, policy_version 348284 (0.0033) [2024-06-19 11:57:36,747][26599] Updated weights for policy 0, policy_version 348294 (0.0041) [2024-06-19 11:57:38,384][26367] Fps is (10 sec: 40945.1, 60 sec: 43141.9, 300 sec: 42875.6). Total num frames: 5706481664. Throughput: 0: 43026.4. Samples: 1974093520. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:38,385][26367] Avg episode reward: [(0, '0.770')] [2024-06-19 11:57:38,440][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000348297_5706498048.pth... [2024-06-19 11:57:38,480][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000347669_5696208896.pth [2024-06-19 11:57:41,077][26599] Updated weights for policy 0, policy_version 348304 (0.0046) [2024-06-19 11:57:43,380][26367] Fps is (10 sec: 45875.6, 60 sec: 43144.6, 300 sec: 43042.7). Total num frames: 5706727424. Throughput: 0: 43218.1. Samples: 1974361260. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:43,380][26367] Avg episode reward: [(0, '0.768')] [2024-06-19 11:57:44,265][26599] Updated weights for policy 0, policy_version 348314 (0.0034) [2024-06-19 11:57:48,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5706907648. Throughput: 0: 43061.7. Samples: 1974488160. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:48,381][26367] Avg episode reward: [(0, '0.699')] [2024-06-19 11:57:48,722][26599] Updated weights for policy 0, policy_version 348324 (0.0037) [2024-06-19 11:57:52,023][26599] Updated weights for policy 0, policy_version 348334 (0.0030) [2024-06-19 11:57:53,380][26367] Fps is (10 sec: 39321.1, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5707120640. Throughput: 0: 43056.9. Samples: 1974738260. 
Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:53,381][26367] Avg episode reward: [(0, '0.703')] [2024-06-19 11:57:56,347][26599] Updated weights for policy 0, policy_version 348344 (0.0034) [2024-06-19 11:57:58,380][26367] Fps is (10 sec: 45875.2, 60 sec: 43145.0, 300 sec: 42987.2). Total num frames: 5707366400. Throughput: 0: 42940.1. Samples: 1974998600. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:57:58,381][26367] Avg episode reward: [(0, '0.774')] [2024-06-19 11:57:59,705][26599] Updated weights for policy 0, policy_version 348354 (0.0037) [2024-06-19 11:58:03,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43144.4, 300 sec: 42876.3). Total num frames: 5707563008. Throughput: 0: 43002.6. Samples: 1975132480. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:58:03,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 11:58:03,972][26579] Signal inference workers to stop experience collection... (29050 times) [2024-06-19 11:58:04,032][26599] InferenceWorker_p0-w0: stopping experience collection (29050 times) [2024-06-19 11:58:04,033][26579] Signal inference workers to resume experience collection... (29050 times) [2024-06-19 11:58:04,042][26599] InferenceWorker_p0-w0: resuming experience collection (29050 times) [2024-06-19 11:58:04,049][26599] Updated weights for policy 0, policy_version 348364 (0.0030) [2024-06-19 11:58:07,434][26599] Updated weights for policy 0, policy_version 348374 (0.0031) [2024-06-19 11:58:08,380][26367] Fps is (10 sec: 40959.6, 60 sec: 43144.5, 300 sec: 42820.6). Total num frames: 5707776000. Throughput: 0: 42996.4. Samples: 1975384840. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:58:08,381][26367] Avg episode reward: [(0, '0.670')] [2024-06-19 11:58:11,755][26599] Updated weights for policy 0, policy_version 348384 (0.0043) [2024-06-19 11:58:13,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5707988992. Throughput: 0: 42849.0. 
Samples: 1975642680. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:58:13,380][26367] Avg episode reward: [(0, '0.814')] [2024-06-19 11:58:14,993][26599] Updated weights for policy 0, policy_version 348394 (0.0035) [2024-06-19 11:58:18,380][26367] Fps is (10 sec: 40959.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5708185600. Throughput: 0: 42958.1. Samples: 1975777100. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:58:18,381][26367] Avg episode reward: [(0, '0.698')] [2024-06-19 11:58:19,453][26599] Updated weights for policy 0, policy_version 348404 (0.0032) [2024-06-19 11:58:22,471][26599] Updated weights for policy 0, policy_version 348414 (0.0031) [2024-06-19 11:58:23,380][26367] Fps is (10 sec: 42597.5, 60 sec: 42871.4, 300 sec: 42876.1). Total num frames: 5708414976. Throughput: 0: 42902.1. Samples: 1976023960. Policy #0 lag: (min: 0.0, avg: 9.5, max: 20.0) [2024-06-19 11:58:23,381][26367] Avg episode reward: [(0, '0.621')] [2024-06-19 11:58:26,953][26599] Updated weights for policy 0, policy_version 348424 (0.0037) [2024-06-19 11:58:28,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42871.4, 300 sec: 42931.6). Total num frames: 5708644352. Throughput: 0: 42840.2. Samples: 1976289080. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:58:28,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 11:58:29,979][26599] Updated weights for policy 0, policy_version 348434 (0.0033) [2024-06-19 11:58:33,384][26367] Fps is (10 sec: 42583.3, 60 sec: 42868.8, 300 sec: 42875.6). Total num frames: 5708840960. Throughput: 0: 42944.5. Samples: 1976420820. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:58:33,385][26367] Avg episode reward: [(0, '0.677')] [2024-06-19 11:58:34,435][26599] Updated weights for policy 0, policy_version 348444 (0.0025) [2024-06-19 11:58:37,574][26599] Updated weights for policy 0, policy_version 348454 (0.0032) [2024-06-19 11:58:38,380][26367] Fps is (10 sec: 42599.2, 60 sec: 43147.2, 300 sec: 42987.2). Total num frames: 5709070336. Throughput: 0: 43019.2. Samples: 1976674120. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:58:38,381][26367] Avg episode reward: [(0, '0.720')] [2024-06-19 11:58:41,889][26599] Updated weights for policy 0, policy_version 348464 (0.0042) [2024-06-19 11:58:43,380][26367] Fps is (10 sec: 42614.3, 60 sec: 42325.3, 300 sec: 42876.6). Total num frames: 5709266944. Throughput: 0: 43024.9. Samples: 1976934720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:58:43,380][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 11:58:45,218][26599] Updated weights for policy 0, policy_version 348474 (0.0046) [2024-06-19 11:58:48,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5709479936. Throughput: 0: 42865.9. Samples: 1977061440. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:58:48,381][26367] Avg episode reward: [(0, '0.560')] [2024-06-19 11:58:49,457][26599] Updated weights for policy 0, policy_version 348484 (0.0030) [2024-06-19 11:58:53,380][26367] Fps is (10 sec: 44236.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5709709312. Throughput: 0: 42960.5. Samples: 1977318060. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:58:53,380][26367] Avg episode reward: [(0, '0.589')] [2024-06-19 11:58:53,547][26599] Updated weights for policy 0, policy_version 348494 (0.0043) [2024-06-19 11:58:57,114][26599] Updated weights for policy 0, policy_version 348504 (0.0025) [2024-06-19 11:58:58,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5709922304. Throughput: 0: 43019.0. Samples: 1977578540. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:58:58,381][26367] Avg episode reward: [(0, '0.725')] [2024-06-19 11:59:01,078][26599] Updated weights for policy 0, policy_version 348514 (0.0034) [2024-06-19 11:59:03,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5710135296. Throughput: 0: 42946.3. Samples: 1977709680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:59:03,381][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 11:59:04,691][26599] Updated weights for policy 0, policy_version 348524 (0.0022) [2024-06-19 11:59:08,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42871.6, 300 sec: 42932.0). Total num frames: 5710348288. Throughput: 0: 43161.1. Samples: 1977966200. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:59:08,380][26367] Avg episode reward: [(0, '0.486')] [2024-06-19 11:59:08,588][26599] Updated weights for policy 0, policy_version 348534 (0.0043) [2024-06-19 11:59:12,092][26579] Signal inference workers to stop experience collection... (29100 times) [2024-06-19 11:59:12,092][26579] Signal inference workers to resume experience collection... 
(29100 times) [2024-06-19 11:59:12,112][26599] InferenceWorker_p0-w0: stopping experience collection (29100 times) [2024-06-19 11:59:12,112][26599] InferenceWorker_p0-w0: resuming experience collection (29100 times) [2024-06-19 11:59:12,239][26599] Updated weights for policy 0, policy_version 348544 (0.0030) [2024-06-19 11:59:13,384][26367] Fps is (10 sec: 42583.3, 60 sec: 42868.8, 300 sec: 42820.0). Total num frames: 5710561280. Throughput: 0: 42966.5. Samples: 1978222720. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:59:13,384][26367] Avg episode reward: [(0, '0.833')] [2024-06-19 11:59:16,310][26599] Updated weights for policy 0, policy_version 348554 (0.0043) [2024-06-19 11:59:18,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5710757888. Throughput: 0: 42844.8. Samples: 1978348680. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:59:18,381][26367] Avg episode reward: [(0, '0.742')] [2024-06-19 11:59:20,093][26599] Updated weights for policy 0, policy_version 348564 (0.0038) [2024-06-19 11:59:23,380][26367] Fps is (10 sec: 42614.1, 60 sec: 42871.6, 300 sec: 42876.1). Total num frames: 5710987264. Throughput: 0: 42764.1. Samples: 1978598500. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:59:23,380][26367] Avg episode reward: [(0, '0.584')] [2024-06-19 11:59:23,842][26599] Updated weights for policy 0, policy_version 348574 (0.0041) [2024-06-19 11:59:28,012][26599] Updated weights for policy 0, policy_version 348584 (0.0038) [2024-06-19 11:59:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5711200256. Throughput: 0: 42639.8. Samples: 1978853520. 
Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:59:28,381][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 11:59:31,627][26599] Updated weights for policy 0, policy_version 348594 (0.0040) [2024-06-19 11:59:33,380][26367] Fps is (10 sec: 42597.9, 60 sec: 42874.1, 300 sec: 42876.6). Total num frames: 5711413248. Throughput: 0: 42643.9. Samples: 1978980420. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:59:33,381][26367] Avg episode reward: [(0, '0.549')] [2024-06-19 11:59:35,705][26599] Updated weights for policy 0, policy_version 348604 (0.0035) [2024-06-19 11:59:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42598.3, 300 sec: 42931.6). Total num frames: 5711626240. Throughput: 0: 42611.1. Samples: 1979235560. Policy #0 lag: (min: 1.0, avg: 9.9, max: 24.0) [2024-06-19 11:59:38,381][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 11:59:38,422][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000348611_5711642624.pth... [2024-06-19 11:59:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000347982_5701337088.pth [2024-06-19 11:59:39,757][26599] Updated weights for policy 0, policy_version 348614 (0.0044) [2024-06-19 11:59:43,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42871.4, 300 sec: 42820.6). Total num frames: 5711839232. Throughput: 0: 42567.5. Samples: 1979494080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 11:59:43,381][26367] Avg episode reward: [(0, '0.730')] [2024-06-19 11:59:43,569][26599] Updated weights for policy 0, policy_version 348624 (0.0029) [2024-06-19 11:59:47,170][26599] Updated weights for policy 0, policy_version 348634 (0.0034) [2024-06-19 11:59:48,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5712035840. Throughput: 0: 42544.0. Samples: 1979624160. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 11:59:48,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 11:59:51,253][26599] Updated weights for policy 0, policy_version 348644 (0.0026) [2024-06-19 11:59:53,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.3, 300 sec: 42876.1). Total num frames: 5712265216. Throughput: 0: 42460.7. Samples: 1979876940. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 11:59:53,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 11:59:55,009][26599] Updated weights for policy 0, policy_version 348654 (0.0038) [2024-06-19 11:59:58,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42820.5). Total num frames: 5712478208. Throughput: 0: 42586.9. Samples: 1980138980. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 11:59:58,381][26367] Avg episode reward: [(0, '0.590')] [2024-06-19 11:59:58,771][26599] Updated weights for policy 0, policy_version 348664 (0.0031) [2024-06-19 12:00:02,515][26599] Updated weights for policy 0, policy_version 348674 (0.0032) [2024-06-19 12:00:03,384][26367] Fps is (10 sec: 42582.9, 60 sec: 42595.8, 300 sec: 42765.0). Total num frames: 5712691200. Throughput: 0: 42649.0. Samples: 1980268040. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:03,393][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 12:00:06,305][26599] Updated weights for policy 0, policy_version 348684 (0.0047) [2024-06-19 12:00:08,380][26367] Fps is (10 sec: 42598.1, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5712904192. Throughput: 0: 42746.0. Samples: 1980522080. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:08,381][26367] Avg episode reward: [(0, '0.604')] [2024-06-19 12:00:10,163][26599] Updated weights for policy 0, policy_version 348694 (0.0031) [2024-06-19 12:00:13,380][26367] Fps is (10 sec: 42614.7, 60 sec: 42601.0, 300 sec: 42765.0). Total num frames: 5713117184. Throughput: 0: 42766.9. Samples: 1980778020. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:13,380][26367] Avg episode reward: [(0, '0.241')] [2024-06-19 12:00:13,964][26599] Updated weights for policy 0, policy_version 348704 (0.0041) [2024-06-19 12:00:18,009][26599] Updated weights for policy 0, policy_version 348714 (0.0034) [2024-06-19 12:00:18,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5713330176. Throughput: 0: 42668.5. Samples: 1980900500. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:18,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 12:00:21,721][26599] Updated weights for policy 0, policy_version 348724 (0.0052) [2024-06-19 12:00:23,380][26367] Fps is (10 sec: 42598.2, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5713543168. Throughput: 0: 42719.2. Samples: 1981157920. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:23,381][26367] Avg episode reward: [(0, '0.794')] [2024-06-19 12:00:25,539][26599] Updated weights for policy 0, policy_version 348734 (0.0031) [2024-06-19 12:00:28,380][26367] Fps is (10 sec: 44236.8, 60 sec: 42871.6, 300 sec: 42820.6). Total num frames: 5713772544. Throughput: 0: 42850.3. Samples: 1981422340. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:28,381][26367] Avg episode reward: [(0, '0.668')] [2024-06-19 12:00:29,441][26599] Updated weights for policy 0, policy_version 348744 (0.0035) [2024-06-19 12:00:33,333][26599] Updated weights for policy 0, policy_version 348754 (0.0036) [2024-06-19 12:00:33,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42871.6, 300 sec: 42931.7). Total num frames: 5713985536. Throughput: 0: 42675.3. Samples: 1981544540. 
Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:33,380][26367] Avg episode reward: [(0, '0.618')] [2024-06-19 12:00:36,937][26599] Updated weights for policy 0, policy_version 348764 (0.0030) [2024-06-19 12:00:38,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42598.5, 300 sec: 42876.1). Total num frames: 5714182144. Throughput: 0: 42822.8. Samples: 1981803960. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:38,380][26367] Avg episode reward: [(0, '0.474')] [2024-06-19 12:00:40,958][26599] Updated weights for policy 0, policy_version 348774 (0.0033) [2024-06-19 12:00:43,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42871.5, 300 sec: 42820.6). Total num frames: 5714411520. Throughput: 0: 42703.5. Samples: 1982060640. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:43,381][26367] Avg episode reward: [(0, '0.632')] [2024-06-19 12:00:44,599][26599] Updated weights for policy 0, policy_version 348784 (0.0039) [2024-06-19 12:00:48,380][26367] Fps is (10 sec: 44236.0, 60 sec: 43144.5, 300 sec: 42876.1). Total num frames: 5714624512. Throughput: 0: 42671.4. Samples: 1982188100. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:48,384][26367] Avg episode reward: [(0, '0.772')] [2024-06-19 12:00:48,834][26599] Updated weights for policy 0, policy_version 348794 (0.0049) [2024-06-19 12:00:52,234][26599] Updated weights for policy 0, policy_version 348804 (0.0037) [2024-06-19 12:00:53,380][26367] Fps is (10 sec: 40959.6, 60 sec: 42598.3, 300 sec: 42820.6). Total num frames: 5714821120. Throughput: 0: 42671.5. Samples: 1982442300. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:53,381][26367] Avg episode reward: [(0, '0.713')] [2024-06-19 12:00:53,530][26579] Signal inference workers to stop experience collection... (29150 times) [2024-06-19 12:00:53,530][26579] Signal inference workers to resume experience collection... 
(29150 times) [2024-06-19 12:00:53,556][26599] InferenceWorker_p0-w0: stopping experience collection (29150 times) [2024-06-19 12:00:53,587][26599] InferenceWorker_p0-w0: resuming experience collection (29150 times) [2024-06-19 12:00:56,467][26599] Updated weights for policy 0, policy_version 348814 (0.0040) [2024-06-19 12:00:58,380][26367] Fps is (10 sec: 40960.6, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5715034112. Throughput: 0: 42631.9. Samples: 1982696460. Policy #0 lag: (min: 0.0, avg: 10.4, max: 25.0) [2024-06-19 12:00:58,381][26367] Avg episode reward: [(0, '0.458')] [2024-06-19 12:00:59,983][26599] Updated weights for policy 0, policy_version 348824 (0.0040) [2024-06-19 12:01:03,380][26367] Fps is (10 sec: 42599.3, 60 sec: 42601.1, 300 sec: 42821.1). Total num frames: 5715247104. Throughput: 0: 42773.4. Samples: 1982825300. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:03,381][26367] Avg episode reward: [(0, '0.564')] [2024-06-19 12:01:04,358][26599] Updated weights for policy 0, policy_version 348834 (0.0039) [2024-06-19 12:01:07,699][26599] Updated weights for policy 0, policy_version 348844 (0.0038) [2024-06-19 12:01:08,380][26367] Fps is (10 sec: 44236.2, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5715476480. Throughput: 0: 42655.9. Samples: 1983077440. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:08,381][26367] Avg episode reward: [(0, '0.771')] [2024-06-19 12:01:12,052][26599] Updated weights for policy 0, policy_version 348854 (0.0035) [2024-06-19 12:01:13,380][26367] Fps is (10 sec: 42597.8, 60 sec: 42598.3, 300 sec: 42820.5). Total num frames: 5715673088. Throughput: 0: 42426.1. Samples: 1983331520. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:13,381][26367] Avg episode reward: [(0, '0.695')] [2024-06-19 12:01:15,248][26599] Updated weights for policy 0, policy_version 348864 (0.0035) [2024-06-19 12:01:18,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42820.6). Total num frames: 5715886080. Throughput: 0: 42513.7. Samples: 1983457660. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:18,381][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 12:01:19,585][26599] Updated weights for policy 0, policy_version 348874 (0.0033) [2024-06-19 12:01:22,789][26599] Updated weights for policy 0, policy_version 348884 (0.0028) [2024-06-19 12:01:23,380][26367] Fps is (10 sec: 45874.9, 60 sec: 43144.4, 300 sec: 42876.6). Total num frames: 5716131840. Throughput: 0: 42596.2. Samples: 1983720800. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:23,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 12:01:27,143][26599] Updated weights for policy 0, policy_version 348894 (0.0034) [2024-06-19 12:01:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42598.3, 300 sec: 42876.6). Total num frames: 5716328448. Throughput: 0: 42531.1. Samples: 1983974540. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:28,381][26367] Avg episode reward: [(0, '0.682')] [2024-06-19 12:01:30,499][26599] Updated weights for policy 0, policy_version 348904 (0.0042) [2024-06-19 12:01:33,380][26367] Fps is (10 sec: 39322.1, 60 sec: 42325.2, 300 sec: 42820.5). Total num frames: 5716525056. Throughput: 0: 42403.2. Samples: 1984096240. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:33,381][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 12:01:34,771][26599] Updated weights for policy 0, policy_version 348914 (0.0036) [2024-06-19 12:01:38,126][26599] Updated weights for policy 0, policy_version 348924 (0.0035) [2024-06-19 12:01:38,384][26367] Fps is (10 sec: 44221.1, 60 sec: 43141.9, 300 sec: 42820.0). Total num frames: 5716770816. Throughput: 0: 42590.0. Samples: 1984359000. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:38,384][26367] Avg episode reward: [(0, '0.536')] [2024-06-19 12:01:38,408][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000348924_5716770816.pth... [2024-06-19 12:01:38,464][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000348297_5706498048.pth [2024-06-19 12:01:42,879][26599] Updated weights for policy 0, policy_version 348934 (0.0040) [2024-06-19 12:01:43,380][26367] Fps is (10 sec: 45875.3, 60 sec: 42871.5, 300 sec: 42876.1). Total num frames: 5716983808. Throughput: 0: 42605.8. Samples: 1984613720. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:43,381][26367] Avg episode reward: [(0, '0.581')] [2024-06-19 12:01:45,967][26599] Updated weights for policy 0, policy_version 348944 (0.0033) [2024-06-19 12:01:48,380][26367] Fps is (10 sec: 37697.0, 60 sec: 42052.4, 300 sec: 42765.0). Total num frames: 5717147648. Throughput: 0: 42432.4. Samples: 1984734760. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:48,381][26367] Avg episode reward: [(0, '0.602')] [2024-06-19 12:01:50,505][26599] Updated weights for policy 0, policy_version 348954 (0.0030) [2024-06-19 12:01:53,380][26367] Fps is (10 sec: 40960.3, 60 sec: 42871.6, 300 sec: 42765.1). Total num frames: 5717393408. Throughput: 0: 42567.3. Samples: 1984992960. 
Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:53,380][26367] Avg episode reward: [(0, '0.626')] [2024-06-19 12:01:53,603][26599] Updated weights for policy 0, policy_version 348964 (0.0036) [2024-06-19 12:01:57,920][26599] Updated weights for policy 0, policy_version 348974 (0.0023) [2024-06-19 12:01:58,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5717606400. Throughput: 0: 42665.3. Samples: 1985251460. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:01:58,381][26367] Avg episode reward: [(0, '0.759')] [2024-06-19 12:02:01,458][26599] Updated weights for policy 0, policy_version 348984 (0.0042) [2024-06-19 12:02:03,380][26367] Fps is (10 sec: 39320.8, 60 sec: 42325.2, 300 sec: 42709.5). Total num frames: 5717786624. Throughput: 0: 42612.8. Samples: 1985375240. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:02:03,381][26367] Avg episode reward: [(0, '0.739')] [2024-06-19 12:02:05,616][26599] Updated weights for policy 0, policy_version 348994 (0.0027) [2024-06-19 12:02:08,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5718032384. Throughput: 0: 42425.5. Samples: 1985629940. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:02:08,380][26367] Avg episode reward: [(0, '0.717')] [2024-06-19 12:02:09,163][26599] Updated weights for policy 0, policy_version 349004 (0.0038) [2024-06-19 12:02:12,788][26579] Signal inference workers to stop experience collection... (29200 times) [2024-06-19 12:02:12,838][26579] Signal inference workers to resume experience collection... (29200 times) [2024-06-19 12:02:12,847][26599] InferenceWorker_p0-w0: stopping experience collection (29200 times) [2024-06-19 12:02:12,878][26599] InferenceWorker_p0-w0: resuming experience collection (29200 times) [2024-06-19 12:02:13,380][26367] Fps is (10 sec: 44237.2, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5718228992. 
Throughput: 0: 42570.3. Samples: 1985890200. Policy #0 lag: (min: 0.0, avg: 11.6, max: 23.0) [2024-06-19 12:02:13,381][26367] Avg episode reward: [(0, '0.566')] [2024-06-19 12:02:13,452][26599] Updated weights for policy 0, policy_version 349014 (0.0034) [2024-06-19 12:02:16,887][26599] Updated weights for policy 0, policy_version 349024 (0.0039) [2024-06-19 12:02:18,380][26367] Fps is (10 sec: 39321.2, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5718425600. Throughput: 0: 42595.6. Samples: 1986013040. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:02:18,381][26367] Avg episode reward: [(0, '0.546')] [2024-06-19 12:02:21,395][26599] Updated weights for policy 0, policy_version 349034 (0.0037) [2024-06-19 12:02:23,380][26367] Fps is (10 sec: 44236.6, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5718671360. Throughput: 0: 42301.1. Samples: 1986262400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:02:23,381][26367] Avg episode reward: [(0, '0.379')] [2024-06-19 12:02:24,578][26599] Updated weights for policy 0, policy_version 349044 (0.0046) [2024-06-19 12:02:28,380][26367] Fps is (10 sec: 44237.1, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5718867968. Throughput: 0: 42661.8. Samples: 1986533500. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:02:28,381][26367] Avg episode reward: [(0, '0.450')] [2024-06-19 12:02:28,883][26599] Updated weights for policy 0, policy_version 349054 (0.0025) [2024-06-19 12:02:32,005][26599] Updated weights for policy 0, policy_version 349064 (0.0034) [2024-06-19 12:02:33,380][26367] Fps is (10 sec: 39322.0, 60 sec: 42325.4, 300 sec: 42654.5). Total num frames: 5719064576. Throughput: 0: 42644.9. Samples: 1986653780. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:02:33,381][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 12:02:36,568][26599] Updated weights for policy 0, policy_version 349074 (0.0023) [2024-06-19 12:02:38,384][26367] Fps is (10 sec: 44220.5, 60 sec: 42325.3, 300 sec: 42653.4). Total num frames: 5719310336. Throughput: 0: 42496.5. Samples: 1986905460. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:02:38,384][26367] Avg episode reward: [(0, '0.782')] [2024-06-19 12:02:40,189][26599] Updated weights for policy 0, policy_version 349084 (0.0033) [2024-06-19 12:02:43,384][26367] Fps is (10 sec: 45858.4, 60 sec: 42322.8, 300 sec: 42764.5). Total num frames: 5719523328. Throughput: 0: 42638.0. Samples: 1987170320. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:02:43,385][26367] Avg episode reward: [(0, '0.666')] [2024-06-19 12:02:44,053][26599] Updated weights for policy 0, policy_version 349094 (0.0036) [2024-06-19 12:02:47,961][26599] Updated weights for policy 0, policy_version 349104 (0.0024) [2024-06-19 12:02:48,380][26367] Fps is (10 sec: 40974.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5719719936. Throughput: 0: 42644.5. Samples: 1987294240. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:02:48,381][26367] Avg episode reward: [(0, '0.664')] [2024-06-19 12:02:51,948][26599] Updated weights for policy 0, policy_version 349114 (0.0033) [2024-06-19 12:02:53,380][26367] Fps is (10 sec: 42613.8, 60 sec: 42598.3, 300 sec: 42653.9). Total num frames: 5719949312. Throughput: 0: 42810.1. Samples: 1987556400. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:02:53,381][26367] Avg episode reward: [(0, '0.568')] [2024-06-19 12:02:55,616][26599] Updated weights for policy 0, policy_version 349124 (0.0041) [2024-06-19 12:02:58,380][26367] Fps is (10 sec: 45874.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5720178688. Throughput: 0: 42708.8. Samples: 1987812100. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:02:58,381][26367] Avg episode reward: [(0, '0.636')] [2024-06-19 12:02:59,446][26599] Updated weights for policy 0, policy_version 349134 (0.0043) [2024-06-19 12:03:03,278][26599] Updated weights for policy 0, policy_version 349144 (0.0033) [2024-06-19 12:03:03,380][26367] Fps is (10 sec: 42598.1, 60 sec: 43144.5, 300 sec: 42709.5). Total num frames: 5720375296. Throughput: 0: 42846.1. Samples: 1987941120. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:03:03,381][26367] Avg episode reward: [(0, '0.501')] [2024-06-19 12:03:07,013][26599] Updated weights for policy 0, policy_version 349154 (0.0046) [2024-06-19 12:03:08,380][26367] Fps is (10 sec: 40960.9, 60 sec: 42598.4, 300 sec: 42709.5). Total num frames: 5720588288. Throughput: 0: 43056.6. Samples: 1988199940. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:03:08,380][26367] Avg episode reward: [(0, '0.537')] [2024-06-19 12:03:11,045][26599] Updated weights for policy 0, policy_version 349164 (0.0029) [2024-06-19 12:03:13,380][26367] Fps is (10 sec: 42598.9, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5720801280. Throughput: 0: 42679.9. Samples: 1988454100. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:03:13,381][26367] Avg episode reward: [(0, '0.616')] [2024-06-19 12:03:14,767][26599] Updated weights for policy 0, policy_version 349174 (0.0023) [2024-06-19 12:03:18,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.5, 300 sec: 42654.0). Total num frames: 5720997888. Throughput: 0: 42885.4. Samples: 1988583620. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:03:18,380][26367] Avg episode reward: [(0, '0.582')] [2024-06-19 12:03:18,652][26599] Updated weights for policy 0, policy_version 349184 (0.0034) [2024-06-19 12:03:22,303][26599] Updated weights for policy 0, policy_version 349194 (0.0038) [2024-06-19 12:03:23,384][26367] Fps is (10 sec: 44220.7, 60 sec: 42868.9, 300 sec: 42709.0). Total num frames: 5721243648. Throughput: 0: 42944.4. Samples: 1988837960. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:03:23,385][26367] Avg episode reward: [(0, '0.466')] [2024-06-19 12:03:25,452][26579] Signal inference workers to stop experience collection... (29250 times) [2024-06-19 12:03:25,452][26579] Signal inference workers to resume experience collection... (29250 times) [2024-06-19 12:03:25,504][26599] InferenceWorker_p0-w0: stopping experience collection (29250 times) [2024-06-19 12:03:25,504][26599] InferenceWorker_p0-w0: resuming experience collection (29250 times) [2024-06-19 12:03:26,201][26599] Updated weights for policy 0, policy_version 349204 (0.0036) [2024-06-19 12:03:28,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42710.0). Total num frames: 5721440256. Throughput: 0: 42816.8. Samples: 1989096920. Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:03:28,381][26367] Avg episode reward: [(0, '0.521')] [2024-06-19 12:03:29,702][26599] Updated weights for policy 0, policy_version 349214 (0.0043) [2024-06-19 12:03:33,380][26367] Fps is (10 sec: 40974.9, 60 sec: 43144.5, 300 sec: 42653.9). Total num frames: 5721653248. Throughput: 0: 42905.8. Samples: 1989225000. 
Policy #0 lag: (min: 0.0, avg: 10.9, max: 20.0) [2024-06-19 12:03:33,381][26367] Avg episode reward: [(0, '0.669')] [2024-06-19 12:03:33,738][26599] Updated weights for policy 0, policy_version 349224 (0.0040) [2024-06-19 12:03:37,409][26599] Updated weights for policy 0, policy_version 349234 (0.0045) [2024-06-19 12:03:38,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42601.0, 300 sec: 42709.5). Total num frames: 5721866240. Throughput: 0: 42809.4. Samples: 1989482820. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:03:38,381][26367] Avg episode reward: [(0, '0.686')] [2024-06-19 12:03:38,410][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000349236_5721882624.pth... [2024-06-19 12:03:38,465][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000348611_5711642624.pth [2024-06-19 12:03:41,412][26599] Updated weights for policy 0, policy_version 349244 (0.0037) [2024-06-19 12:03:43,380][26367] Fps is (10 sec: 40960.5, 60 sec: 42328.0, 300 sec: 42653.9). Total num frames: 5722062848. Throughput: 0: 42845.5. Samples: 1989740140. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:03:43,380][26367] Avg episode reward: [(0, '0.535')] [2024-06-19 12:03:44,847][26599] Updated weights for policy 0, policy_version 349254 (0.0033) [2024-06-19 12:03:48,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.4, 300 sec: 42598.4). Total num frames: 5722275840. Throughput: 0: 42791.1. Samples: 1989866720. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:03:48,381][26367] Avg episode reward: [(0, '0.652')] [2024-06-19 12:03:49,161][26599] Updated weights for policy 0, policy_version 349264 (0.0039) [2024-06-19 12:03:52,565][26599] Updated weights for policy 0, policy_version 349274 (0.0030) [2024-06-19 12:03:53,380][26367] Fps is (10 sec: 45875.4, 60 sec: 42871.6, 300 sec: 42709.5). Total num frames: 5722521600. Throughput: 0: 42756.0. Samples: 1990123960. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:03:53,380][26367] Avg episode reward: [(0, '0.740')] [2024-06-19 12:03:56,953][26599] Updated weights for policy 0, policy_version 349284 (0.0042) [2024-06-19 12:03:58,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42325.3, 300 sec: 42653.9). Total num frames: 5722718208. Throughput: 0: 42766.6. Samples: 1990378600. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:03:58,381][26367] Avg episode reward: [(0, '0.643')] [2024-06-19 12:04:00,302][26599] Updated weights for policy 0, policy_version 349294 (0.0033) [2024-06-19 12:04:03,380][26367] Fps is (10 sec: 40959.4, 60 sec: 42598.5, 300 sec: 42653.9). Total num frames: 5722931200. Throughput: 0: 42764.3. Samples: 1990508020. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:03,381][26367] Avg episode reward: [(0, '0.613')] [2024-06-19 12:04:04,674][26599] Updated weights for policy 0, policy_version 349304 (0.0043) [2024-06-19 12:04:07,913][26599] Updated weights for policy 0, policy_version 349314 (0.0027) [2024-06-19 12:04:08,384][26367] Fps is (10 sec: 45858.9, 60 sec: 43141.8, 300 sec: 42765.0). Total num frames: 5723176960. Throughput: 0: 42856.4. Samples: 1990766500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:08,385][26367] Avg episode reward: [(0, '0.556')] [2024-06-19 12:04:12,305][26599] Updated weights for policy 0, policy_version 349324 (0.0033) [2024-06-19 12:04:13,380][26367] Fps is (10 sec: 44236.7, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5723373568. Throughput: 0: 42875.5. Samples: 1991026320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:13,381][26367] Avg episode reward: [(0, '0.504')] [2024-06-19 12:04:15,493][26599] Updated weights for policy 0, policy_version 349334 (0.0028) [2024-06-19 12:04:18,380][26367] Fps is (10 sec: 40974.4, 60 sec: 43144.4, 300 sec: 42709.4). Total num frames: 5723586560. Throughput: 0: 42697.7. Samples: 1991146400. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:18,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 12:04:20,148][26599] Updated weights for policy 0, policy_version 349344 (0.0037) [2024-06-19 12:04:23,153][26599] Updated weights for policy 0, policy_version 349354 (0.0034) [2024-06-19 12:04:23,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42874.2, 300 sec: 42765.1). Total num frames: 5723815936. Throughput: 0: 42811.2. Samples: 1991409320. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:23,380][26367] Avg episode reward: [(0, '0.648')] [2024-06-19 12:04:27,774][26599] Updated weights for policy 0, policy_version 349364 (0.0025) [2024-06-19 12:04:28,380][26367] Fps is (10 sec: 40960.4, 60 sec: 42598.4, 300 sec: 42653.9). Total num frames: 5723996160. Throughput: 0: 42919.0. Samples: 1991671500. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:28,384][26367] Avg episode reward: [(0, '0.637')] [2024-06-19 12:04:30,848][26599] Updated weights for policy 0, policy_version 349374 (0.0032) [2024-06-19 12:04:33,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5724225536. Throughput: 0: 42828.5. Samples: 1991794000. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:33,383][26367] Avg episode reward: [(0, '0.561')] [2024-06-19 12:04:35,482][26599] Updated weights for policy 0, policy_version 349384 (0.0029) [2024-06-19 12:04:38,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5724438528. Throughput: 0: 42757.8. Samples: 1992048060. 
Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:38,380][26367] Avg episode reward: [(0, '0.565')] [2024-06-19 12:04:38,550][26599] Updated weights for policy 0, policy_version 349394 (0.0029) [2024-06-19 12:04:43,072][26599] Updated weights for policy 0, policy_version 349404 (0.0044) [2024-06-19 12:04:43,380][26367] Fps is (10 sec: 42599.1, 60 sec: 43144.5, 300 sec: 42765.0). Total num frames: 5724651520. Throughput: 0: 42973.1. Samples: 1992312380. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:43,380][26367] Avg episode reward: [(0, '0.496')] [2024-06-19 12:04:45,127][26579] Signal inference workers to stop experience collection... (29300 times) [2024-06-19 12:04:45,176][26599] InferenceWorker_p0-w0: stopping experience collection (29300 times) [2024-06-19 12:04:45,176][26579] Signal inference workers to resume experience collection... (29300 times) [2024-06-19 12:04:45,193][26599] InferenceWorker_p0-w0: resuming experience collection (29300 times) [2024-06-19 12:04:46,632][26599] Updated weights for policy 0, policy_version 349414 (0.0032) [2024-06-19 12:04:48,380][26367] Fps is (10 sec: 42597.8, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5724864512. Throughput: 0: 42811.5. Samples: 1992434540. Policy #0 lag: (min: 0.0, avg: 10.7, max: 23.0) [2024-06-19 12:04:48,381][26367] Avg episode reward: [(0, '0.633')] [2024-06-19 12:04:50,615][26599] Updated weights for policy 0, policy_version 349424 (0.0050) [2024-06-19 12:04:53,380][26367] Fps is (10 sec: 44236.4, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5725093888. Throughput: 0: 42814.6. Samples: 1992693000. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:04:53,381][26367] Avg episode reward: [(0, '0.605')] [2024-06-19 12:04:54,136][26599] Updated weights for policy 0, policy_version 349434 (0.0032) [2024-06-19 12:04:58,154][26599] Updated weights for policy 0, policy_version 349444 (0.0029) [2024-06-19 12:04:58,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.5, 300 sec: 42710.0). Total num frames: 5725290496. Throughput: 0: 42760.9. Samples: 1992950560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:04:58,381][26367] Avg episode reward: [(0, '0.476')] [2024-06-19 12:05:01,650][26599] Updated weights for policy 0, policy_version 349454 (0.0031) [2024-06-19 12:05:03,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5725503488. Throughput: 0: 42781.0. Samples: 1993071540. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:03,389][26367] Avg episode reward: [(0, '0.644')] [2024-06-19 12:05:05,639][26599] Updated weights for policy 0, policy_version 349464 (0.0022) [2024-06-19 12:05:08,380][26367] Fps is (10 sec: 45875.2, 60 sec: 42874.1, 300 sec: 42820.5). Total num frames: 5725749248. Throughput: 0: 42919.0. Samples: 1993340680. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:08,381][26367] Avg episode reward: [(0, '0.849')] [2024-06-19 12:05:09,323][26599] Updated weights for policy 0, policy_version 349474 (0.0031) [2024-06-19 12:05:13,194][26599] Updated weights for policy 0, policy_version 349484 (0.0033) [2024-06-19 12:05:13,380][26367] Fps is (10 sec: 44237.4, 60 sec: 42871.5, 300 sec: 42765.0). Total num frames: 5725945856. Throughput: 0: 42545.9. Samples: 1993586060. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:13,381][26367] Avg episode reward: [(0, '0.490')] [2024-06-19 12:05:16,965][26599] Updated weights for policy 0, policy_version 349494 (0.0033) [2024-06-19 12:05:18,380][26367] Fps is (10 sec: 39321.8, 60 sec: 42598.5, 300 sec: 42709.5). Total num frames: 5726142464. Throughput: 0: 42636.1. Samples: 1993712620. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:18,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 12:05:20,817][26599] Updated weights for policy 0, policy_version 349504 (0.0037) [2024-06-19 12:05:23,380][26367] Fps is (10 sec: 42597.4, 60 sec: 42598.2, 300 sec: 42709.5). Total num frames: 5726371840. Throughput: 0: 43010.0. Samples: 1993983520. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:23,381][26367] Avg episode reward: [(0, '0.709')] [2024-06-19 12:05:24,385][26599] Updated weights for policy 0, policy_version 349514 (0.0032) [2024-06-19 12:05:28,380][26367] Fps is (10 sec: 44236.6, 60 sec: 43144.6, 300 sec: 42709.5). Total num frames: 5726584832. Throughput: 0: 42771.0. Samples: 1994237080. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:28,381][26367] Avg episode reward: [(0, '0.479')] [2024-06-19 12:05:28,802][26599] Updated weights for policy 0, policy_version 349524 (0.0029) [2024-06-19 12:05:31,896][26599] Updated weights for policy 0, policy_version 349534 (0.0031) [2024-06-19 12:05:33,380][26367] Fps is (10 sec: 44237.4, 60 sec: 43144.6, 300 sec: 42820.5). Total num frames: 5726814208. Throughput: 0: 42833.3. Samples: 1994362040. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:33,381][26367] Avg episode reward: [(0, '0.543')] [2024-06-19 12:05:36,550][26599] Updated weights for policy 0, policy_version 349544 (0.0032) [2024-06-19 12:05:38,380][26367] Fps is (10 sec: 42598.8, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5727010816. Throughput: 0: 42912.5. Samples: 1994624060. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:38,380][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 12:05:38,402][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000349549_5727010816.pth... [2024-06-19 12:05:38,475][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000348924_5716770816.pth [2024-06-19 12:05:39,408][26599] Updated weights for policy 0, policy_version 349554 (0.0043) [2024-06-19 12:05:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42871.4, 300 sec: 42709.5). Total num frames: 5727223808. Throughput: 0: 42960.4. Samples: 1994883780. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:43,381][26367] Avg episode reward: [(0, '0.611')] [2024-06-19 12:05:44,011][26599] Updated weights for policy 0, policy_version 349564 (0.0029) [2024-06-19 12:05:47,059][26599] Updated weights for policy 0, policy_version 349574 (0.0033) [2024-06-19 12:05:48,380][26367] Fps is (10 sec: 44236.8, 60 sec: 43144.6, 300 sec: 42820.6). Total num frames: 5727453184. Throughput: 0: 43054.4. Samples: 1995008980. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:48,380][26367] Avg episode reward: [(0, '0.459')] [2024-06-19 12:05:51,412][26599] Updated weights for policy 0, policy_version 349584 (0.0049) [2024-06-19 12:05:53,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42598.4, 300 sec: 42765.0). Total num frames: 5727649792. Throughput: 0: 42930.7. Samples: 1995272560. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:53,381][26367] Avg episode reward: [(0, '0.478')] [2024-06-19 12:05:55,011][26599] Updated weights for policy 0, policy_version 349594 (0.0052) [2024-06-19 12:05:58,380][26367] Fps is (10 sec: 40959.3, 60 sec: 42871.4, 300 sec: 42765.0). Total num frames: 5727862784. Throughput: 0: 43108.8. Samples: 1995525960. 
Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:05:58,381][26367] Avg episode reward: [(0, '0.601')] [2024-06-19 12:05:58,947][26599] Updated weights for policy 0, policy_version 349604 (0.0047) [2024-06-19 12:06:02,588][26599] Updated weights for policy 0, policy_version 349614 (0.0035) [2024-06-19 12:06:03,380][26367] Fps is (10 sec: 44237.2, 60 sec: 43144.6, 300 sec: 42765.0). Total num frames: 5728092160. Throughput: 0: 43211.1. Samples: 1995657120. Policy #0 lag: (min: 0.0, avg: 9.3, max: 21.0) [2024-06-19 12:06:03,381][26367] Avg episode reward: [(0, '0.538')] [2024-06-19 12:06:04,816][26579] Signal inference workers to stop experience collection... (29350 times) [2024-06-19 12:06:04,849][26599] InferenceWorker_p0-w0: stopping experience collection (29350 times) [2024-06-19 12:06:04,874][26579] Signal inference workers to resume experience collection... (29350 times) [2024-06-19 12:06:04,880][26599] InferenceWorker_p0-w0: resuming experience collection (29350 times) [2024-06-19 12:06:06,426][26599] Updated weights for policy 0, policy_version 349624 (0.0037) [2024-06-19 12:06:08,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42325.4, 300 sec: 42765.0). Total num frames: 5728288768. Throughput: 0: 42806.9. Samples: 1995909820. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:08,380][26367] Avg episode reward: [(0, '0.545')] [2024-06-19 12:06:10,573][26599] Updated weights for policy 0, policy_version 349634 (0.0040) [2024-06-19 12:06:13,380][26367] Fps is (10 sec: 40959.5, 60 sec: 42598.3, 300 sec: 42765.0). Total num frames: 5728501760. Throughput: 0: 42925.3. Samples: 1996168720. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:13,381][26367] Avg episode reward: [(0, '0.355')] [2024-06-19 12:06:14,279][26599] Updated weights for policy 0, policy_version 349644 (0.0038) [2024-06-19 12:06:18,067][26599] Updated weights for policy 0, policy_version 349654 (0.0033) [2024-06-19 12:06:18,380][26367] Fps is (10 sec: 44235.8, 60 sec: 43144.4, 300 sec: 42709.5). Total num frames: 5728731136. Throughput: 0: 43052.8. Samples: 1996299420. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:18,381][26367] Avg episode reward: [(0, '0.374')] [2024-06-19 12:06:22,039][26599] Updated weights for policy 0, policy_version 349664 (0.0040) [2024-06-19 12:06:23,380][26367] Fps is (10 sec: 42599.1, 60 sec: 42598.6, 300 sec: 42709.5). Total num frames: 5728927744. Throughput: 0: 42858.3. Samples: 1996552680. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:23,380][26367] Avg episode reward: [(0, '0.429')] [2024-06-19 12:06:25,665][26599] Updated weights for policy 0, policy_version 349674 (0.0026) [2024-06-19 12:06:28,380][26367] Fps is (10 sec: 42598.6, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5729157120. Throughput: 0: 42893.3. Samples: 1996813980. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:28,381][26367] Avg episode reward: [(0, '0.596')] [2024-06-19 12:06:29,500][26599] Updated weights for policy 0, policy_version 349684 (0.0035) [2024-06-19 12:06:33,293][26599] Updated weights for policy 0, policy_version 349694 (0.0030) [2024-06-19 12:06:33,380][26367] Fps is (10 sec: 45874.7, 60 sec: 42871.5, 300 sec: 42765.5). Total num frames: 5729386496. Throughput: 0: 42967.5. Samples: 1996942520. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:33,381][26367] Avg episode reward: [(0, '0.667')] [2024-06-19 12:06:37,279][26599] Updated weights for policy 0, policy_version 349704 (0.0048) [2024-06-19 12:06:38,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42871.5, 300 sec: 42709.5). Total num frames: 5729583104. Throughput: 0: 42763.6. Samples: 1997196920. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:38,381][26367] Avg episode reward: [(0, '0.608')] [2024-06-19 12:06:40,914][26599] Updated weights for policy 0, policy_version 349714 (0.0033) [2024-06-19 12:06:43,380][26367] Fps is (10 sec: 42598.7, 60 sec: 43144.6, 300 sec: 42931.6). Total num frames: 5729812480. Throughput: 0: 42859.7. Samples: 1997454640. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:43,380][26367] Avg episode reward: [(0, '0.500')] [2024-06-19 12:06:45,006][26599] Updated weights for policy 0, policy_version 349724 (0.0046) [2024-06-19 12:06:48,384][26367] Fps is (10 sec: 42582.2, 60 sec: 42595.7, 300 sec: 42764.5). Total num frames: 5730009088. Throughput: 0: 42889.7. Samples: 1997587320. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:48,385][26367] Avg episode reward: [(0, '0.573')] [2024-06-19 12:06:48,682][26599] Updated weights for policy 0, policy_version 349734 (0.0042) [2024-06-19 12:06:53,087][26599] Updated weights for policy 0, policy_version 349744 (0.0043) [2024-06-19 12:06:53,384][26367] Fps is (10 sec: 40944.4, 60 sec: 42868.8, 300 sec: 42764.5). Total num frames: 5730222080. Throughput: 0: 42939.9. Samples: 1997842280. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:53,385][26367] Avg episode reward: [(0, '0.438')] [2024-06-19 12:06:56,468][26599] Updated weights for policy 0, policy_version 349754 (0.0041) [2024-06-19 12:06:58,380][26367] Fps is (10 sec: 44253.0, 60 sec: 43144.5, 300 sec: 42931.6). Total num frames: 5730451456. Throughput: 0: 42785.3. Samples: 1998094060. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:06:58,382][26367] Avg episode reward: [(0, '0.638')] [2024-06-19 12:07:00,721][26599] Updated weights for policy 0, policy_version 349764 (0.0035) [2024-06-19 12:07:03,380][26367] Fps is (10 sec: 44252.9, 60 sec: 42871.4, 300 sec: 42820.5). Total num frames: 5730664448. Throughput: 0: 42647.2. Samples: 1998218540. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:07:03,381][26367] Avg episode reward: [(0, '0.672')] [2024-06-19 12:07:04,091][26599] Updated weights for policy 0, policy_version 349774 (0.0037) [2024-06-19 12:07:08,249][26599] Updated weights for policy 0, policy_version 349784 (0.0039) [2024-06-19 12:07:08,380][26367] Fps is (10 sec: 40959.7, 60 sec: 42871.3, 300 sec: 42820.5). Total num frames: 5730861056. Throughput: 0: 42624.6. Samples: 1998470800. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:07:08,381][26367] Avg episode reward: [(0, '0.567')] [2024-06-19 12:07:11,918][26599] Updated weights for policy 0, policy_version 349794 (0.0030) [2024-06-19 12:07:13,380][26367] Fps is (10 sec: 39322.4, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5731057664. Throughput: 0: 42536.2. Samples: 1998728100. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:07:13,380][26367] Avg episode reward: [(0, '0.683')] [2024-06-19 12:07:15,849][26599] Updated weights for policy 0, policy_version 349804 (0.0042) [2024-06-19 12:07:18,380][26367] Fps is (10 sec: 42599.0, 60 sec: 42598.5, 300 sec: 42765.0). Total num frames: 5731287040. Throughput: 0: 42343.5. Samples: 1998847980. 
Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:07:18,381][26367] Avg episode reward: [(0, '0.627')] [2024-06-19 12:07:19,746][26599] Updated weights for policy 0, policy_version 349814 (0.0028) [2024-06-19 12:07:23,332][26599] Updated weights for policy 0, policy_version 349824 (0.0033) [2024-06-19 12:07:23,380][26367] Fps is (10 sec: 45874.5, 60 sec: 43144.4, 300 sec: 42876.1). Total num frames: 5731516416. Throughput: 0: 42424.8. Samples: 1999106040. Policy #0 lag: (min: 0.0, avg: 9.8, max: 21.0) [2024-06-19 12:07:23,381][26367] Avg episode reward: [(0, '0.750')] [2024-06-19 12:07:24,704][26579] Signal inference workers to stop experience collection... (29400 times) [2024-06-19 12:07:24,704][26579] Signal inference workers to resume experience collection... (29400 times) [2024-06-19 12:07:24,754][26599] InferenceWorker_p0-w0: stopping experience collection (29400 times) [2024-06-19 12:07:24,755][26599] InferenceWorker_p0-w0: resuming experience collection (29400 times) [2024-06-19 12:07:27,333][26599] Updated weights for policy 0, policy_version 349834 (0.0033) [2024-06-19 12:07:28,380][26367] Fps is (10 sec: 40960.1, 60 sec: 42325.4, 300 sec: 42820.6). Total num frames: 5731696640. Throughput: 0: 42507.9. Samples: 1999367500. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 12:07:28,381][26367] Avg episode reward: [(0, '0.860')] [2024-06-19 12:07:31,050][26599] Updated weights for policy 0, policy_version 349844 (0.0023) [2024-06-19 12:07:33,380][26367] Fps is (10 sec: 42598.4, 60 sec: 42598.4, 300 sec: 42821.1). Total num frames: 5731942400. Throughput: 0: 42412.4. Samples: 1999495720. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 12:07:33,381][26367] Avg episode reward: [(0, '0.702')] [2024-06-19 12:07:34,910][26599] Updated weights for policy 0, policy_version 349854 (0.0033) [2024-06-19 12:07:38,380][26367] Fps is (10 sec: 44237.0, 60 sec: 42598.4, 300 sec: 42765.6). Total num frames: 5732139008. 
Throughput: 0: 42367.1. Samples: 1999748640. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 12:07:38,380][26367] Avg episode reward: [(0, '0.408')] [2024-06-19 12:07:38,474][26579] Saving /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000349863_5732155392.pth... [2024-06-19 12:07:38,549][26579] Removing /workspace/metta/train_dir/p2.dr4/checkpoint_p0/checkpoint_000349236_5721882624.pth [2024-06-19 12:07:38,949][26599] Updated weights for policy 0, policy_version 349864 (0.0037) [2024-06-19 12:07:42,522][26599] Updated weights for policy 0, policy_version 349874 (0.0032) [2024-06-19 12:07:43,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42325.2, 300 sec: 42820.6). Total num frames: 5732352000. Throughput: 0: 42534.7. Samples: 2000008120. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 12:07:43,381][26367] Avg episode reward: [(0, '0.578')] [2024-06-19 12:07:46,469][26599] Updated weights for policy 0, policy_version 349884 (0.0043) [2024-06-19 12:07:48,380][26367] Fps is (10 sec: 42598.3, 60 sec: 42601.1, 300 sec: 42765.0). Total num frames: 5732564992. Throughput: 0: 42569.0. Samples: 2000134140. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 12:07:48,381][26367] Avg episode reward: [(0, '0.816')] [2024-06-19 12:07:50,527][26599] Updated weights for policy 0, policy_version 349894 (0.0043) [2024-06-19 12:07:53,380][26367] Fps is (10 sec: 40959.9, 60 sec: 42327.9, 300 sec: 42653.9). Total num frames: 5732761600. Throughput: 0: 42720.5. Samples: 2000393220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 12:07:53,381][26367] Avg episode reward: [(0, '0.734')] [2024-06-19 12:07:54,228][26599] Updated weights for policy 0, policy_version 349904 (0.0036) [2024-06-19 12:07:58,038][26599] Updated weights for policy 0, policy_version 349914 (0.0030) [2024-06-19 12:07:58,380][26367] Fps is (10 sec: 42598.0, 60 sec: 42325.3, 300 sec: 42765.0). Total num frames: 5732990976. Throughput: 0: 42640.3. 
Samples: 2000646920. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 12:07:58,381][26367] Avg episode reward: [(0, '0.665')] [2024-06-19 12:08:01,846][26599] Updated weights for policy 0, policy_version 349924 (0.0025) [2024-06-19 12:08:03,380][26367] Fps is (10 sec: 45876.0, 60 sec: 42598.5, 300 sec: 42820.6). Total num frames: 5733220352. Throughput: 0: 42961.0. Samples: 2000781220. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 12:08:03,380][26367] Avg episode reward: [(0, '0.647')] [2024-06-19 12:08:05,792][26599] Updated weights for policy 0, policy_version 349934 (0.0037) [2024-06-19 12:08:08,380][26367] Fps is (10 sec: 40960.2, 60 sec: 42325.4, 300 sec: 42709.5). Total num frames: 5733400576. Throughput: 0: 42750.7. Samples: 2001029820. Policy #0 lag: (min: 0.0, avg: 10.6, max: 21.0) [2024-06-19 12:08:08,381][26367] Avg episode reward: [(0, '0.693')] [2024-06-19 12:08:09,373][26599] Updated weights for policy 0, policy_version 349944 (0.0036)